### Description
QNN cannot run MatMul on v68 if both inputs are dynamic and quantized to uint16. Make it run by inserting a Convert op to convert one input to int8.
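Conceptually, an inserted Convert op requantizes one input's data into the other quantized type's space. A minimal Python sketch of that math, with hypothetical scale/zero-point values (not taken from this change):

```python
def requantize(q, scale_in, zp_in, scale_out, zp_out, qmin, qmax):
    """Dequantize a value, then requantize into the target quant space.

    Roughly what a Convert op does when bridging a uint16-quantized
    tensor to an int8-quantized one. All parameters are illustrative.
    """
    real = (q - zp_in) * scale_in           # back to real-valued space
    q_new = round(real / scale_out) + zp_out
    return max(qmin, min(qmax, q_new))      # clamp to the target range

# Hypothetical uint16 -> int8 example
print(requantize(40000, 0.001, 32768, 0.06, 0, -128, 127))  # -> 121
```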
### Description
<!-- Describe your changes. -->
Update the usability checker and related infrastructure to support checking models > 2GB.
- Add the ability to set a flag to keep initializers as external data
  - we optimize the model as part of the checking, so we need to write out a new copy
- Handle an issue with ONNX shape inferencing silently failing
  - use an API that supports large models but requires writing the model to a new file
  - automate cleanup of that copy of the model
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Allow analysis of LLMs to determine gaps for mobile usage.
---------
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
### Description
Allow empty shapes and do not validate them for inputs/outputs at the
InferenceSession::ValidateInputsOutputs().
### Motivation and Context
https://github.com/microsoft/onnxruntime/pull/17301 disallowed empty
shapes.
However, many models depend on them as a way to pass shapes of different
ranks.
### Description
Support uniforms in Slice op
### Motivation and Context
Improve performance.
In LoRA code, conv1d is used to do the projection for QKV, while the conv1d calculation is mathematically equivalent to matmul, and matmul is much faster than conv1d.
The substitution performed by the graph optimizer is: 1 conv1d >> 2 split + 1 squeeze + group_num matmul + 1 concat.
With this optimizer, we see a 10%+ improvement in one 1P model.
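The equivalence is easy to check: a conv1d with kernel size 1 applies the same `[out_ch, in_ch]` weight at every time step, which is exactly a matmul. A small pure-Python sketch (shapes and values are illustrative):

```python
def conv1d_k1(weight, x):
    """conv1d with kernel size 1. weight: [out_ch][in_ch][1], x: [in_ch][seq]."""
    out_ch, in_ch, seq = len(weight), len(x), len(x[0])
    y = [[0.0] * seq for _ in range(out_ch)]
    for o in range(out_ch):
        for i in range(in_ch):
            w = weight[o][i][0]           # the single kernel tap
            for t in range(seq):
                y[o][t] += w * x[i][t]
    return y

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

w = [[[2.0], [1.0]], [[0.5], [-1.0]]]     # [2][2][1]
x = [[1.0, 2.0], [3.0, 4.0]]              # [2][2]
w2d = [[w[o][i][0] for i in range(2)] for o in range(2)]  # squeeze kernel dim
assert conv1d_k1(w, x) == matmul(w2d, x)  # same result, cheaper op
```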
TRT builder instantiation is slow (see
[here](https://github.com/microsoft/onnxruntime/issues/18071)).
In the current TRT EP, we instantiate a builder object every time we need it. Multiple places need the TRT builder, so this causes significant performance overhead.
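The fix amounts to constructing the builder once and reusing it. The actual change is in the C++ TRT EP; this Python sketch with hypothetical names only illustrates the caching pattern:

```python
import functools

class Builder:
    """Stand-in for the expensive TensorRT builder object."""
    constructions = 0

    def __init__(self):
        Builder.constructions += 1  # count how often the setup cost is paid

@functools.lru_cache(maxsize=1)
def get_builder():
    # Construct once; every later call site reuses the same instance.
    return Builder()

b1, b2 = get_builder(), get_builder()
assert b1 is b2 and Builder.constructions == 1
```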
### Description
Our function inliner converts call nodes to a proto. The `Node::ToProto()` function recreates optional NodeArgs in a `NodeProto`. While handling missing input parameters, our inliner simply renames them to empty strings.
`Graph::InlineFunctionProto()` recreates missing NodeArgs even though the original call node did not have them.
This results in the issue mentioned below. The inlined model has the following entries; notice the second argument is present but has no value in the `ReduceSum` call (from a Dynamo-exported model).
> InsertedPrecisionFreeCast__inlfunc__aten_linalg_vector_norm_no_dim_onnx_result_12
> = ReduceSum <keepdims: int = 0, noop_with_empty_axes: int = 0>
> (InsertedPrecisionFreeCast__inlfunc_ReduceL1_data_abs, )
We now allow the second input to ReduceSum to be nullptr and ignore it, as it is optional.
### Motivation and Context
This seeks to address
https://github.com/microsoft/onnxruntime/issues/18338
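In ONNX, a missing optional input is serialized as an empty-string name in the node's input list. A toy sketch of the skip logic, with plain lists standing in for the actual proto types:

```python
# Mirrors the ReduceSum call above: the optional `axes` input slot is
# present in the list but has an empty name.
node_inputs = [
    "InsertedPrecisionFreeCast__inlfunc_ReduceL1_data_abs",
    "",  # missing optional input, serialized as an empty string
]

def supplied_inputs(inputs):
    # Ignore optional inputs that were not supplied.
    return [name for name in inputs if name != ""]

assert supplied_inputs(node_inputs) == [node_inputs[0]]
```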
- Implement MLAS function for quantized 4-bit int Gemm (Gemm with float A and quantized 4-bit int B) for ARM NEON. This is an initial implementation. Only the M=1 path (with M being number of rows of A and C) has any optimization attempted so far. More optimization to come in future PRs.
- Connect MatMulNBits contrib op to MLAS function.
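A rough sketch of the float-A times quantized-4-bit-B Gemm for the M=1 path. The nibble packing order, scale, and zero point here are illustrative assumptions, not the actual MLAS layout:

```python
def unpack_uint4(packed):
    """Two unsigned 4-bit values per byte, low nibble first (assumed layout)."""
    vals = []
    for byte in packed:
        vals.append(byte & 0x0F)
        vals.append((byte >> 4) & 0x0F)
    return vals

def gemm_m1_q4(a_row, packed_b_col, scale, zero_point):
    """C = A x dequant(B) for M=1: one float row of A, one packed column of B."""
    b_col = [(q - zero_point) * scale for q in unpack_uint4(packed_b_col)]
    return sum(a * b for a, b in zip(a_row, b_col))

# 0x21 packs the values 1 (low nibble) and 2 (high nibble)
print(gemm_m1_q4([2.0, 1.0], [0x21], 0.5, 8))  # -> -10.0
```

In the real kernel the dequantization is fused into the dot product and vectorized with NEON; the sketch only shows the arithmetic being computed.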
### Description
- set tsconfig "noUnusedParameters" to `true` and fix a few bugs
discovered by TypeScript.
How unused parameters are fixed:
- for most code (webgl), add an underscore prefix, which is the standard
ignore pattern for the TypeScript check.
- remove unused parameters from functions and modify the corresponding
function calls (jsep)
- fix a bug in ArgMinMax: these two operators do not have more than one
input, so `createArgMinMaxAttributesFromInputs()` is removed.
- add proxy main.ts into the TypeScript check and fix a bug in parameter
passing
- fix the `run()` function call and add a typecheck fix (hack)
Support graph input and initializer for GatherToSplit fusion. Previously the fusion required the Gather nodes to consume some other node, which could not be a graph input or initializer.
This helps some model training cases with such a pattern so that we will not have GatherGrad in the final graph. GatherGrad has a very inefficient kernel implementation.
### Description
A few refinements:
(1) Use fixed optimized shape for dynamic engine of TRT.
(2) Use same seed in base and refiner.
(3) Save metadata to png file so that it is easy to reproduce.
(4) Disable the EulerA scheduler for XL since it has an issue in the refiner with 1.16.2.
(5) Limit height and width to be divisible by 64.
(6) Update the document to add a link for downloading the optimized model.
---------
Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
This also sets the Path variable for the downloaded libraries.
### Description
1. Introduce the MoE CUDA op to ORT based on the FT implementation.
2. Upgrade cutlass to 3.1.0 to avoid some build failures on Windows. Remove the patch file for cutlass 3.0.0.
3. The sharded MoE implementation will come in another PR.
Limitation: `__CUDA_ARCH__` >= 700
### Description
Add CI steps to log info for investigating test failures.
Currently the Web CI is marked as 'optional'. This change adds scripts to dump debug info for investigating random test failures.
### Description
Renames `OnnxInputInfo` struct to `TensorInfo` because this struct can
be used for both input and output tensors.
### Motivation and Context
Clean up TODO item
Fixes issues found in yolov3, tiny-yolov3, etc., which have control flow ops.
Two modifications:
1. In GetCapability/GetSupportedList, the newly built graph needs to handle outer scope values before calling graph.Resolve() only if it has a control flow op and also has a parent node.
2. Two graphs/subgraphs may end up with the same graph->Name(). Add a function to get a unique graph name.
### Description
Make all build_wasm tasks (NPM packaging and post merge) run on Linux.
Also enable the WebGPU test in the NPM package pipeline.
### Motivation and Context
Even on Windows, build_wasm runs in Cygwin, so running it on Linux saves a lot of time.
### Description
Set the DML package name correctly so the build doesn't try to include mobile targets.
### Motivation and Context
Fix packaging pipeline.
### Description
Fix bad delegates.
Add a script to detect mismatches, and run it in CI and when creating the nuget package.
Ignore whitespace when looking at the diff to the .cs file, as clang-format ran on it.
### Motivation and Context
#18363
### Description
When an `If` condition proves to be a constant value, inline the corresponding subgraph, yielding more constant folding and optimization.
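A toy sketch of the decision. The real inlining operates on Graph/NodeProto structures; the names and list-of-ops representation here are illustrative only:

```python
def try_fold_if(cond_is_constant, cond_value, then_branch, else_branch):
    """Return the subgraph to inline, or None if the condition is dynamic."""
    if not cond_is_constant:
        return None  # leave the If node in place
    # A known condition means only one branch can ever execute,
    # so the If node collapses to that branch's ops.
    return then_branch if cond_value else else_branch

assert try_fold_if(True, True, ["Add", "Relu"], ["Mul"]) == ["Add", "Relu"]
assert try_fold_if(False, None, ["Add"], ["Mul"]) is None
```

Once a branch is inlined, its nodes become visible to the regular constant-folding passes, which is where the cascading size and runtime wins below come from.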
### Motivation and Context
Newly converted models feature lots of nested `If` nodes that can be
inlined and collapsed.
In particular, for the sample models we are gaining on TorchScript-exported models.
For `HF Mobile Bert Dynamo`, runtime went down from 0.069 to 0.046. In total, AOT inlining + `If` constant folding yields an improvement of about 50% (0.102 -> 0.046), bringing us very close to TorchScript-exported models.
`HF Bart Dynamo` further improves from 0.668 to 0.45; AOT + `If` constant folding improves 0.98 -> 0.45.
Earlier, HF Mobile Bert was **161 MB+**; now it is **98 MB**.
The HF Bart Dynamo pre-optimized model was about **1.2 GB**; it is now **710 MB**.

### Description
Only one of "--cuda_version" and "--cuda_home" is needed. If both are specified, the first takes precedence. Since we now download CUDA SDKs on the fly, the machines will not need a preinstalled CUDA SDK and therefore will not have the VS-CUDA integration extension. Therefore the "--cuda_version" flag will not work. This PR deletes such usages.
Related PR: #15915
### Description
Updates QNN EP to force Split operators to use the same quant params for
all input/outputs (only if they were already nearly equal). This can be
necessary for the sequence Sigmoid -> Split because QNN requires Sigmoid
ops to override output quant params to specific values.
Also did the same for the following operators that do not change input
data:
- Expand
- Gather
- MaxPool
- Reshape/Flatten/Squeeze/Unsqueeze
- Resize
- Split
- Tile
### Motivation and Context
The QNN HTP backend employs certain optimizations when all the
quantization parameters for the Split operator are equal. We need to
ensure they are equal to get better inference latency performance.
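A sketch of the unification step in Python. The tolerance values, the `(scale, zero_point)` tuple representation, and the function name are illustrative assumptions, not the EP's actual API:

```python
import math

def unify_quant_params(io_params, rel_tol=1e-4, abs_tol=1e-5):
    """io_params: [(scale, zero_point), ...] for an op's inputs/outputs.

    If all entries are already nearly equal, snap them to the first
    entry's exact values so QNN sees identical params and can apply its
    optimization; otherwise leave them untouched.
    """
    scale0, zp0 = io_params[0]
    if all(math.isclose(s, scale0, rel_tol=rel_tol, abs_tol=abs_tol) and z == zp0
           for s, z in io_params):
        return [(scale0, zp0)] * len(io_params)
    return io_params  # genuinely different params are left alone

# Nearly-equal scales get unified; clearly different ones do not.
print(unify_quant_params([(0.1, 0), (0.100001, 0)]))  # -> [(0.1, 0), (0.1, 0)]
```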
---------
Signed-off-by: adrianlizarraga <adlizarraga@microsoft.com>
Update SDXL demo to test more configurations (including every scheduler).
Update documents to add instructions for running demo in docker.
Update package version in requirements.
Enable custom fp16 VAE in TensorRT for fair comparison.
### Description
Use different march flag to workaround what appears to be a clang issue.
See https://github.com/tensorflow/tensorflow/issues/59970 for links to
various relevant pieces of info/discussions.
### Motivation and Context
1. Fix the dist setting bug for LLaMA2-70b distributed conversion and benchmarking.
2. Add instructions in the README for how to benchmark LLaMA2-70b distributed inference.
### Description
This PR fixes the TypeScript type check.
Previously, when I used esbuild to replace webpack (#17745), the TypeScript type check was disabled. This caused a few TypeScript type errors to be checked in to the code base. This PR fixes the following:
- Use "Node16" as the default "module" value in tsconfig.json, because in TypeScript v5, `(module == "ES2015" && moduleResolution == "Node16")` is an invalid combination.
- Set `noUnusedParameters` to true by default. In web, override it to false because multiple pieces of code need to be updated (a follow-up PR will do this).
- set correct project file for 'web/lib/**/*.ts' for ESLint (otherwise
WebGPU types are not populated correctly)
- fix type error in file js/web/lib/wasm/jsep/webgpu/program-manager.ts
- upgrade "@webgpu/types" to latest to fix type error in file
js/web/lib/wasm/jsep/backend-webgpu.ts
- add package script "prebuild" for web to run tsc type check
- add type check in CI yml file
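A minimal tsconfig fragment illustrating the valid module/moduleResolution pairing described above (a sketch only; the repo's actual tsconfig contains many more options):

```json
{
  "compilerOptions": {
    "module": "Node16",
    "moduleResolution": "Node16",
    "noUnusedParameters": true
  }
}
```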
### Description
Add a 32-bit patch binary and infra to fall back to it. The Azure DevOps Windows CIs are missing patch.exe from their git install for some reason, so the default `find_package(Patch)` fails, as that is where it expects to find it.
Remove the Eigen patch. The underlying issue was fixed in the source 3 years ago by c6c84ed961, and the patch command is invalid (the args are for git apply, not patch).
### Motivation and Context
Make usage of patch consistent across all CIs
Fix https://github.com/microsoft/onnxruntime/issues/15248