onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-12 17:57:38 +00:00

Author	SHA1	Message	Date
Hector Li	a6b515fad5	[QNN EP] Update Where Op UT to include the issue related to data layout (#18426 ) ### Description [QNN EP] Update Where Op UT to include the issue related to data layout	2023-11-14 08:15:27 -08:00
Adrian Lizarraga	c9d5345c46	[QNN EP] Clean-up todo for OnnxInputInfo (#18416 ) ### Description Renames `OnnxInputInfo` struct to `TensorInfo` because this struct can be used for both input and output tensors. ### Motivation and Context Clean up TODO item	2023-11-14 08:14:40 -08:00
dependabot[bot]	5aeed62630	Bump axios from 1.3.4 to 1.6.1 in /js/node (#18400 ) Bumps [axios](https://github.com/axios/axios) from 1.3.4 to 1.6.1. The upstream release notes include the fix for CSRF vulnerability CVE-2023-45857 in v1.6.0 (axios #6028). Additional commits are viewable in the [compare view](https://github.com/axios/axios/compare/v1.3.4...v1.6.1). Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-11-14 00:38:00 -08:00
Chi Lo	3e1cf71067	[TensorRT EP] Fix bug for handling outer scope values in GetCapability (#18342 ) The issue was found in models that have control flow ops, such as yolov3 and tiny-yolov3. Two modifications: 1. In GetCapability/GetSupportedList, outer scope values need to be handled before calling graph.Resolve() only if the newly built graph has a control flow op and also has a parent node. 2. Two graphs/subgraphs can have the same graph->Name(), so add a function to get a unique graph name.	2023-11-14 00:26:06 -08:00
Changming Sun	a09099f2dd	Remove XNNPack from web pipelines (#18419 ) ### Description Remove XNNPack from web pipelines for now	2023-11-13 22:43:53 -08:00
Yi Zhang	0b16185223	build wasm with linux (#18106 ) ### Description Make all build_wasm tasks (NPM packaging and post merge) run on Linux. Enable the web gpu test in the npm package pipeline too. ### Motivation and Context Even on Windows, build_wasm runs in cygwin, so running it on Linux could save a lot of time.	2023-11-14 14:42:11 +08:00
Scott McKay	897c1c1f05	Set DML package name correctly in CI (#18405 ) ### Description Set DML package name correctly so the build doesn't try and include mobile targets. ### Motivation and Context Fix packaging pipeline.	2023-11-14 14:01:59 +10:00
Scott McKay	8ff41aea09	Fix 4 more bad delegates missing the attribute that cause iOS AOT errors at runtime (#18390 ) ### Description Fix bad delegates. Add script to detect mismatch, and run in CI and when creating nuget package. Ignore whitespace when looking at the diff to the .cs file as clang-format ran. ### Motivation and Context #18363	2023-11-14 14:00:21 +10:00
PeixuanZuo	37d8bed53d	[ROCm] add migraphx into onnxruntime-training-rocm package (#18339 )	2023-11-14 11:54:22 +08:00
Dmitri Smirnov	f19c673595	If Branch Constant Folding (#18105 ) ### Description When an `If` condition proves to be a constant value, inline the corresponding subgraph, which enables more constant folding and optimization. ### Motivation and Context Newly converted models feature lots of nested `If` nodes that can be inlined and collapsed. In particular, for the sample models we are gaining on TorchScript exported models. For `HF Mobile Bert Dynamo` runtime went down from 0.069 -> 0.046. In total, AOT inlining + `If` constant folding yields an improvement of about 50%, 0.102 -> 0.046, bringing us very close to TorchScript exported models. `HF Bart Dynamo` further improves 0.668 -> 0.45. AOT + `If` constant folding improves 0.98 -> 0.45. Earlier the size of HF Mobile Bert was 161MB+; it is now 98MB. The HF Bart Dynamo pre-optimized model was about 1.2GB. It is now 710MB ![image](https://github.com/microsoft/onnxruntime/assets/11303988/1491a247-d371-4e66-85a3-2aeb702e8ca0)	2023-11-13 17:33:30 -08:00
PeixuanZuo	a62a500ae1	[ROCm] Update CK version (#17628 ) update ck version	2023-11-13 15:43:38 -08:00
Changming Sun	c3b5479056	Remove extra CUDA version flag (#18397 ) ### Description Only one of "--cuda_version" and "--cuda_home" is needed. If they were both specified, the first one will take precedence. Since we download cuda SDKs on-the-fly now, the machines will not need to have a preinstalled CUDA SDK therefore will not have VS-CUDA integration extension. Therefore the "--cuda_version" flag will not work. This PR deletes such usages. Related PR: #15915	2023-11-13 15:11:42 -08:00
Adrian Lizarraga	4f2bd3862d	[QNN EP] Ensure QDQ Split input/output quant params are equal (#18332 ) ### Description Updates QNN EP to force Split operators to use the same quant params for all input/outputs (only if they were already nearly equal). This can be necessary for the sequence Sigmoid -> Split because QNN requires Sigmoid ops to override output quant params to specific values. Also did the same for the following operators that do not change input data: - Expand - Gather - MaxPool - Reshape/Flatten/Squeeze/Unsqueeze - Resize - Split - Tile ### Motivation and Context The QNN HTP backend employs certain optimizations when all the quantization parameters for the Split operator are equal. We need to ensure they are equal to get better inference latency performance. --------- Signed-off-by: adrianlizarraga <adlizarraga@microsoft.com>	2023-11-13 13:14:28 -08:00
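The "only if they were already nearly equal" rule in the QDQ Split commit above can be sketched roughly as follows. This is a minimal illustration, not the QNN EP implementation: the function name `unify_quant_params`, the relative tolerance, and the choice of the first tensor's parameters as the reference are all assumptions made for the example.

```python
import math


def unify_quant_params(params, rel_tol=1e-4):
    """Force all (scale, zero_point) pairs to one value if they are nearly equal.

    `params` lists the quantization parameters of an operator's inputs and
    outputs. The tolerance and the use of the first entry as the reference
    are assumptions for this sketch.
    """
    ref_scale, ref_zero_point = params[0]
    nearly_equal = all(
        zero_point == ref_zero_point
        and math.isclose(scale, ref_scale, rel_tol=rel_tol)
        for scale, zero_point in params
    )
    if nearly_equal:
        # Every tensor now shares exactly the same parameters.
        return [(ref_scale, ref_zero_point)] * len(params)
    # Parameters differ too much: leave them untouched.
    return list(params)
```

Parameters that differ materially are returned unchanged, which mirrors the commit's statement that equality is only forced when the values were already close.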
Tianlei Wu	0d22d64420	Update SDXL demo and documents (#18395 ) Update SDXL demo to test more configurations (including every scheduler). Update documents to add instructions for running demo in docker. Update package version in requirements. Enable custom fp16 VAE in TensorRT for fair comparison.	2023-11-13 12:45:25 -08:00
Xu Xing	949ac4b7ce	[js/webgpu] Support uniforms for gather (#18312 )	2023-11-13 11:24:34 -08:00
Vincent Wang	4a82030339	[ORTModule] Symbolic Shape Support for Triton Codegen (#18317 ) Add symbolic shape support for Triton codegen for ORTModule.	2023-11-13 12:16:27 +08:00
Wanming Lin	73ed34ac4b	[WebNN EP] Support numThreads option for WebNN CPU device (#18054 )	2023-11-12 16:45:10 -08:00
Wanming Lin	cbf0cf06db	[WebNN EP] Disable clamp fusion for WebNN GPU (#18386 ) Clamp fusion is not yet supported in the WebNN DirectML backend.	2023-11-12 08:56:39 -08:00
Scott McKay	8d298f6f78	Fix xnnpack compile error on arm32 (#18291 ) ### Description Use a different march flag to work around what appears to be a clang issue. See https://github.com/tensorflow/tensorflow/issues/59970 for links to various relevant pieces of info/discussions.	2023-11-12 08:59:20 +10:00
Frank Dong	a46c79d211	fix llama2-70b bug, add document (#18398 ) 1. Fix dist setting bug for LLaMA2-70b distributed convert and benchmark 2. Add instructions in README for how to benchmark LLaMA2-70b distributed inference	2023-11-10 21:59:23 -08:00
RandySheriffH	646f77a94b	Align context virtuals (#18396 ) Deprecate ROCM context virtual function, to align with CUDA. Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-11-11 12:41:37 +10:00
Justin Chu	d87d480857	Remove deprecated vscode settings (#18349 ) ### Description Remove deprecated vscode settings. The python settings are deprecated and will cause vscode to pop up with a warning.	2023-11-10 18:00:35 -08:00
Xu Xing	0c8c0014f6	[js/webgpu] Use builtin num_workgroups to fix shader key conflict (#18387 ) This fixes conformance failure of tinyyolov2-8 and potential shader key conflict issues.	2023-11-10 17:37:45 -08:00
Yulong Wang	6b0c97b43f	[js/web] fix typescript type check (#18343 ) ### Description This PR fixes the TypeScript type check. Previously, when esbuild replaced webpack (#17745), the TypeScript type check was disabled. This caused a few TypeScript type errors to be checked into the code base. This PR makes the following fixes: - Use "Node16" as default "module" value in tsconfig.json, because in TypeScript v5, `(module == "ES2015" && moduleResolution == "Node16")` is an invalid combination. - Set `noUnusedParameters` to true as default. The web package overrides it to false because multiple files need to be updated (a follow-up PR will do this) - set correct project file for 'web/lib/*/.ts' for ESLint (otherwise WebGPU types are not populated correctly) - fix type error in file js/web/lib/wasm/jsep/webgpu/program-manager.ts - upgrade "@webgpu/types" to latest to fix type error in file js/web/lib/wasm/jsep/backend-webgpu.ts - add package script "prebuild" for web to run tsc type check - add type check in CI yml file	2023-11-10 16:03:38 -08:00
Xu Xing	8dba6efd61	[js/webgpu] Add uniforms support to concat op (#18238 )	2023-11-10 13:46:03 -08:00
Scott McKay	64c91d790b	Fix ability to use patch on Windows CI machines (#18356 ) ### Description Add 32-bit patch binary and infra to fallback to it. The Azure devops Windows CIs are missing patch.exe from their git install for some reason so the default `find_package(Patch)` fails as that is where it expects to find it. Remove Eigen patch. Underlying issue was fixed in source 3 years ago by `c6c84ed961` and the patch command is invalid (args are for git apply not patch). ### Motivation and Context Make usage of patch consistent across all CIs Fix https://github.com/microsoft/onnxruntime/issues/15248	2023-11-11 07:32:14 +10:00
Jiajia Qin	28c23aed04	[js/webgpu] Fix conv2d with activation (#18388 ) ### Description Fix #18297 With PR #17766, conv2d activation in mobilenetv2-12 will not be empty. However, activation is not supported yet in [biasActivationSnippet](https://github.com/microsoft/onnxruntime/blob/main/js/web/lib/wasm/jsep/webgpu/ops/3rd-party/activation_util.ts#L48C14-L48C36). This PR makes all places unify to use [getActivationSnippet](https://github.com/microsoft/onnxruntime/blob/main/js/web/lib/wasm/jsep/webgpu/ops/fuse-utils.ts#L13) to fix this issue.	2023-11-10 12:54:35 -08:00
Changming Sun	2d23b4e117	Update min macos version (#18251 )	2023-11-10 11:08:17 -08:00
Bart Verhagen	87744e55fa	fix reference to Microsoft.GSL::GSL in CMake build scripts when enabling cuda (#17843 ) ### Description Some CMake scripts reference Microsoft.GSL::GSL. Most of the time, the GSL package that is found on the system is used. However, when cuda is enabled, it is downloaded and patched. Most CMake scripts rely on the first case and forget about the second. This patch makes the second case behave like the first case. ### Motivation and Context This is an issue that occurs 'in the wild'. For example, I had to patch this to be able to enable the CUDA provider for the onnxruntime conan package (see https://github.com/conan-io/conan-center-index/pull/20392).	2023-11-10 10:46:45 -08:00
Xu Xing	dd1bb760eb	[js/webgpu] Fix scalar uniform (#18318 )	2023-11-10 10:12:22 -08:00
sophies927	d955885791	Update stale.yml to fix start-date bug (#18376 )	2023-11-09 16:04:31 -08:00
RandySheriffH	59262dfc63	Add cuda context headers to zip (#18330 ) Expose cuda context headers for cuda custom ops. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-11-09 14:53:58 -08:00
dependabot[bot]	1ff894898a	Bump actions/stale from 4.1.1 to 8.0.0 (#18149 )	2023-11-09 11:31:04 -08:00
Xu Xing	829d802337	[js/webgpu] Support uniform for softmax (#18345 )	2023-11-09 11:19:23 -08:00
Adrian Lizarraga	f237b0b1f8	[QNN EP/Quantization] Add MinimumRealRange extra option to quantization script (#18278 ) ### Description Adds the extra option `MinimumRealRange` to the quantization script: ```python3 """ MinimumRealRange = float|None : Default is None. If set to a floating-point value, the calculation of the quantization parameters (i.e., scale and zero point) will enforce a minimum range between rmin and rmax. If (rmax - rmin) is less than the specified minimum range, rmax will be set to rmin + MinimumRealRange. This is necessary for EPs like QNN that require a minimum floating-point range when determining quantization parameters. """ ``` ### Motivation and Context QNN requires a minimum floating-point range of 0.0001. --------- Signed-off-by: adrianlizarraga <adlizarraga@microsoft.com>	2023-11-09 10:55:09 -08:00
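The range adjustment described in the `MinimumRealRange` docstring above can be sketched as below. This is an illustrative reading of the docstring, not the quantization script's code: the function names, the uint8 default limits, and the asymmetric scale/zero-point formula are assumptions made for the example.

```python
def enforce_minimum_real_range(rmin, rmax, min_real_range=None):
    """Widen [rmin, rmax] so that rmax - rmin is at least min_real_range."""
    if min_real_range is not None and (rmax - rmin) < min_real_range:
        rmax = rmin + min_real_range
    return rmin, rmax


def compute_scale_zero_point(rmin, rmax, qmin=0, qmax=255, min_real_range=None):
    """Asymmetric quantization parameters for a real range (assumed formula)."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)
    rmin, rmax = enforce_minimum_real_range(rmin, rmax, min_real_range)
    if rmax == rmin:
        # Degenerate range with no minimum enforced: fall back to a unit scale.
        return 1.0, qmin
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale)
    return scale, zero_point
```

With the 0.0001 minimum that the commit cites for QNN, a tensor whose observed range is only 0.00001 wide would be quantized as if its range were 0.0001 wide, so the scale cannot collapse toward zero.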
Guenther Schmuelling	25fbc2b0ab	fix fused relu activation (#18303 )	2023-11-09 08:18:21 -08:00
David Justice	2c22b49876	Fix rust compile issues and add GH action to run build validations and tests (#18346 ) ### Description This PR gets the onnxruntime Rust bindings to a foundation where they can be extended and validated as the onnxruntime progresses. Specifically, the PR does the following. - fixes some of the existing compilation issues due to missing some enums output tensor data types. - introduces a `just vendor` task that will vendor the source code from the onnxruntime to enable a common base directory within the crate directory rather than using a relative parent path. This enables `crate package` to be able to archive the onnxruntime native code, which will enable consumers of the onnxruntime-sys crate to be able to compile on their target. - introduces a GH action to lint the Rust code (rustfmt, clippy), build the library, validate through tests, and validate crate can package correctly. TODOs: - [x] This PR is based on #18200 and will need to be rebased once that PR is merged. ### Motivation and Context This is the first step to getting new onnxruntime Rust crates published through this project, which will unblock community Rust projects which would like to take a dependency on onnxruntime Rust. Follow up work to enable publication of onnxruntime Rust crates: - change name of the crates to be published (onnxruntime-rs and onnxruntime-sys are already taken and we'll need new names) - update authors / license to reflect contributions from previous maintainer(s) and new maintainers - introduce a crate publish GH action or ADO pipeline --------- Signed-off-by: David Justice <david@devigned.com>	2023-11-09 04:26:02 -08:00
Ted Themistokleous	8d50313816	[Migraphx EP] Static int8 QDQ support (#17931 ) ### Description Adding static int8 quantization support for MIGraphX Execution Provider - Allows for parsing in calibration tables generated by Onnxruntime or TensorRT's toolsets - Add proper environment variables into the MIGraphX EP - Update python API to include updating execution provider flags -> was missing on python side - Hook into MIGraphX's int8 quantization and optimization of models ### Motivation and Context Required so that we can get onnxruntime to pass in models while leveraging the existing tooling for int8 static QDQ quantization. First step in a series of PRs which will add further static quantization on the operator level as MIGraphX releases further support. These changes draw heavily from the TensorRT EP and should allow for similar functionality for GPU based (versus CPU) quantization of models before an inference is performed. --------- Co-authored-by: Ted Themistokleous <tthemist@amd.com> Co-authored-by: Ted Themistokleous <tedthemistokleous@amd.com>	2023-11-09 17:46:49 +08:00
Hector Li	55c19d6ab5	[QNN EP] Enable option to set QNN context priority (#18315 ) ### Description Enable option qnn_context_priority to set QNN context priority, options: "low", "normal", "normal_high", "high". This feature prioritizes inference for the model whose context has the higher priority. Tested with the onnxruntime_perf_test tool using the same model. 1. Run the model on the NPU with a single instance: the latency is 300ms. 2. Run the same model on the NPU with 2 instances at the same time. Case 1: both with the same priority (high): latency is 600ms. Case 2: one with low priority: latency is 30,000ms; one with high priority: latency is 300ms. Case 3: one with normal priority: latency is 15,000ms; one with high priority: latency is 300ms.	2023-11-08 20:56:36 -08:00
Prathik Rao	7a3da4526f	add bfloat16 support for CUDA Neg kernel (#18306 ) ### Description Registers BFloat16 datatype as valid input type for CUDA Neg Kernel. ### Motivation and Context Enabling `meta-llama/Llama-2-70b` to be finetuned with ONNX Runtime training. --------- Co-authored-by: Prathik Rao <prathikrao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2023-11-08 18:32:12 -08:00
guyang3532	4dc63692f8	Add FlattenAndUnpad Op (#17845 ) ### Description Add an op named `FlattenAndUnpad`. This op does two things: 1. Flattens the first two dims of the input tensor. 2. Gathers the valid values from the input tensor using an index tensor. ### Motivation and Context The grad op of `PadAndUnflatten` was `GatherGrad`, which performs poorly. `FlattenAndUnpad` is implemented to replace `GatherGrad` as the grad of `PadAndUnflatten`. With this op, we can also simplify the "Reshape + ShrunkenGather" pattern to `PadAndUnflatten` in the padding elimination optimizer, which will also improve performance.	2023-11-09 09:52:48 +08:00
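The two steps listed for `FlattenAndUnpad` above can be pictured with a small pure-Python sketch. It only mirrors the behavior described in the commit message on nested lists; the function name, argument names, and return value are assumptions, and the real op's signature and outputs may differ.

```python
def flatten_and_unpad(tensor, indices):
    """Flatten the first two dims of `tensor`, then keep only the rows at `indices`.

    `tensor` is a nested list shaped [batch][sequence][...], and `indices`
    holds positions in the flattened batch * sequence dimension that contain
    valid (non-padding) values.
    """
    # Step 1: merge the batch and sequence dims into one leading dim.
    flat = [row for batch in tensor for row in batch]
    # Step 2: gather the valid rows by index.
    return [flat[i] for i in indices]


# Two sequences padded to length 2; the second row of the first one is padding.
padded = [[[1, 2], [0, 0]], [[3, 4], [5, 6]]]
valid = flatten_and_unpad(padded, [0, 2, 3])
```

Here `valid` holds the three non-padding rows in their flattened order.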
Scott McKay	885bf3561d	Add tool to fix lines > 120 chars. (#18293 ) ### Description Helper to run clang-format on lines that are > 120 chars. We disable clang-format enforcing 120 chars by default because its formatting can negatively impact readability. If a developer has not manually kept a line within the 120 char limit this tool will fix it. It will leave all other lines alone to honor the formatting the developer chose. ### Motivation and Context Help developers fix lint errors. Preferred is to use a vertical ruler/guideline in your editor when actually writing the code.	2023-11-09 10:12:57 +10:00
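The detection half of such a helper might look like the sketch below, which only reports which lines exceed the limit. It is not the tool added by the commit: the function name is hypothetical, and the step that hands the offending lines to clang-format is left out.

```python
def find_long_lines(text, limit=120):
    """Return the 1-based numbers of lines longer than `limit` characters."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]


sample = "short line\n" + "x" * 121 + "\n" + "y" * 120
long_lines = find_long_lines(sample)
```

Restricting any reformatting to the reported line numbers is what would leave the rest of a file in the layout its author chose.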
Justin Chu	c250540722	Bump linter versions (#18341 ) Bump linter versions and run format.	2023-11-08 13:04:40 -08:00
Changming Sun	812532592e	Add a build validation for Linux ARM64 cross-compile (#18200 ) ### Description 1. Add a build validation for Linux ARM64/ARM32 cross-compile to catch issues listed in #18195 . 2. Revert eigen's commit id back to what we had before. ### Motivation and Context To catch cross-compile issues. Added a TODO item for fixing the compile warnings in Linux ARM32 build: AB#21639	2023-11-08 13:03:18 -08:00
sophies927	68fab24c22	Update stale.yml (#18304 ) Exempt all issues w/ assignees from stale bot, increase days before issue close, + add start date to address issue w/ GH API rate limiting	2023-11-08 11:56:35 -08:00
Dmitri Smirnov	a37e6a503b	Update Abseil raw_flat_hash visualization (#18329 ) ### Description Fix the broken pieces due to the latest Abseil update. ### Motivation and Context Make the debugging bearable.	2023-11-08 11:19:45 -08:00
Adrian Lizarraga	a0eeeafa80	[QNN EP] Session option for graph optimization (#18262 ) ### Description Adds the QNN session option `htp_graph_finalization_optimization_mode` to enable QNN graph optimizations at the expense of longer preparation time. ### Motivation and Context Allow enabling QNN graph optimizations per app/model.	2023-11-08 10:06:15 -08:00
kunal-vaishnavi	c8def0cc51	Add LLaMA GQA ragged batching (#18337 ) This PR updates the replacement of MHA with GQA and updates the LLaMA scripts for the modified GQA op. It is related to the changes in [this PR](https://github.com/microsoft/onnxruntime/pull/18283). ### Motivation and Context This PR allows us to run LLaMA with the GQA op end-to-end using ragged batching (i.e. batched inputs of different lengths).	2023-11-08 09:36:28 -08:00
Prathik Rao	34f77eaa24	bfloat16 support for quickgelugrad (#18336 ) ### Description Registers BFloat16 datatype as valid input type for CUDA QuickGeluGrad Kernel. ### Motivation and Context Enabling `meta-llama/Llama-2-70b` to be finetuned with ONNX Runtime training. --------- Co-authored-by: Prathik Rao <prathikrao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2023-11-08 08:40:02 -08:00
pengwa	2151c79bf1	Tune ORTModule logging experience a bit (#18298 ) ### Tune logging experience a bit After the last update to the ORTModule logging experience, we found a few issues: 1. The `INFO` level outputs too much, including PyTorch exporter verbose logs (tracing graphs) on every rank. At this level, we only want to - Output a little more information to users than the `WARNING` level, for example the memory recomputation recommendations or other not-fully-ready features. - Output a little more information for a quick diagnostic, collected on rank-0 only. 2. The ONNX Runtime logging filter during graph build and session init sometimes hides issues (for example a segmentation fault), leaving no useful information in `WARNING`/`INFO` for users to report to us. This is not good! 3. Some of our devs like using `pdb` to debug Python code, but adding `import pdb; pdb.set_trace()` to a model's code might hang when they use `INFO` or `WARNING`, where the export happens and all output gets redirected due to log filtering. The only workaround is to switch to VERBOSE, which outputs far too many logs. The corresponding changes proposed here are: 1. For `INFO` logging, - We only log on rank-0. - We restrict the ORT backend logging level to WARNING in this case, because ORT backend code outputs many logs that should be under verbose, and we cannot guarantee they get cleaned up immediately once they are added. - We output the PyTorch exporter verbose log (including the tracing graph), which is useful for a quick diagnostic when an issue happens. 2. Remove all logging filtering on the ORT backend, so the segmentation fault details will not be hidden if it happens again. 3. Introduce a `DEVINFO` logging level, which - Logs on all ranks - Sets the ORT backend logging level to INFO - Turns OFF all PyTorch exporter logging filtering (to unblock pdb debugging). 4. Currently, using the Memory Optimizer requires DEVINFO (which will output the ORT backend INFO log). 
So update the memory optimizer document to reflect this. https://github.com/microsoft/onnxruntime/pull/17481 will update the requirement back to INFO for showing memory optimization info. You can check https://github.com/microsoft/onnxruntime/blob/pengwa/devinfo_level/docs/ORTModule_Training_Guidelines.md#log-level-explanations for a better view of the different log levels. This PR also extracts some changes from a bigger one, https://github.com/microsoft/onnxruntime/pull/17481, to reduce its complexity for review. --------- Co-authored-by: mindest <30493312+mindest@users.noreply.github.com>	2023-11-08 17:42:50 +08:00

9974 commits