Bumps [cross-spawn](https://github.com/moxystudio/node-cross-spawn) from
7.0.3 to 7.0.6.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/moxystudio/node-cross-spawn/blob/master/CHANGELOG.md">cross-spawn's
changelog</a>.</em></p>
<blockquote>
<h3><a
href="https://github.com/moxystudio/node-cross-spawn/compare/v7.0.5...v7.0.6">7.0.6</a>
(2024-11-18)</h3>
<h3>Bug Fixes</h3>
<ul>
<li>update cross-spawn version to 7.0.5 in package-lock.json (<a
href="f700743918">f700743</a>)</li>
</ul>
<h3><a
href="https://github.com/moxystudio/node-cross-spawn/compare/v7.0.4...v7.0.5">7.0.5</a>
(2024-11-07)</h3>
<h3>Bug Fixes</h3>
<ul>
<li>fix escaping bug introduced by backtracking (<a
href="640d391fde">640d391</a>)</li>
</ul>
<h3><a
href="https://github.com/moxystudio/node-cross-spawn/compare/v7.0.3...v7.0.4">7.0.4</a>
(2024-11-07)</h3>
<h3>Bug Fixes</h3>
<ul>
<li>disable regexp backtracking (<a
href="https://redirect.github.com/moxystudio/node-cross-spawn/issues/160">#160</a>)
(<a
href="5ff3a07d9a">5ff3a07</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="77cd97f3ca"><code>77cd97f</code></a>
chore(release): 7.0.6</li>
<li><a
href="6717de49ff"><code>6717de4</code></a>
chore: upgrade standard-version</li>
<li><a
href="f700743918"><code>f700743</code></a>
fix: update cross-spawn version to 7.0.5 in package-lock.json</li>
<li><a
href="9a7e3b2165"><code>9a7e3b2</code></a>
chore: fix build status badge</li>
<li><a
href="085268352d"><code>0852683</code></a>
chore(release): 7.0.5</li>
<li><a
href="640d391fde"><code>640d391</code></a>
fix: fix escaping bug introduced by backtracking</li>
<li><a
href="bff0c87c8b"><code>bff0c87</code></a>
chore: remove codecov</li>
<li><a
href="a7c6abc6fe"><code>a7c6abc</code></a>
chore: replace travis with github workflows</li>
<li><a
href="9b9246e096"><code>9b9246e</code></a>
chore(release): 7.0.4</li>
<li><a
href="5ff3a07d9a"><code>5ff3a07</code></a>
fix: disable regexp backtracking (<a
href="https://redirect.github.com/moxystudio/node-cross-spawn/issues/160">#160</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/moxystudio/node-cross-spawn/compare/v7.0.3...v7.0.6">compare
view</a></li>
</ul>
</details>
<br />
[Dependabot compatibility score](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
Dependabot will merge this PR once CI passes on it, as requested by
@fs-eire.
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/microsoft/onnxruntime/network/alerts).
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
### Description
This PR updates the installation script to fix it for CUDA v12. Automating the steps for CUDA v11 may be difficult since they are quite complicated, so a few lines of instructions were added instead.
Fixes #22877
…ime/java (#22771)"
This reverts commit 632a36a233.
### Motivation and Context
Running E2E tests using BrowserStack failed due to this PR.
### Description
This change updates the Gradle version within the Java project to 8.7 and upgrades Java to 17. The Gradle version for react-native was also updated to 7.5 to make it compatible with the changes from the Java directory. However, the target Java version remains the same; the Java versions for these will be upgraded in a separate PR.
This is split from #22206.
### Motivation and Context
This is the first step in upgrading the react-native version.
WebNN doesn't provide a dedicated op for LRN, so the WebNN EP emulates it with a chain of WebNN ops:
pow -> transpose -> pad -> averagePool -> transpose -> mul -> add -> pow
-> div
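A minimal sketch of that chain using `MLGraphBuilder` ops, assuming an NCHW input and illustrative handling of the LRN attributes (`alpha`, `beta`, `bias`, `size`); the actual WebNN EP implementation may differ in its padding, pooling, and descriptor details:
```js
// Hypothetical sketch only, not the exact WebNN EP code.
function buildLrn(builder, x /* NCHW */, { alpha, beta, bias, size }) {
  const scalar = (v) =>
    builder.constant({ dataType: 'float32', shape: [] }, new Float32Array([v]));
  let y = builder.pow(x, scalar(2));                                 // x^2
  y = builder.transpose(y, { permutation: [0, 2, 3, 1] });           // NCHW -> NHWC
  const half = Math.floor((size - 1) / 2);
  y = builder.pad(y, [0, 0, 0, half], [0, 0, 0, size - 1 - half]);   // pad the channel dim
  y = builder.averagePool2d(y, { windowDimensions: [1, size] });     // sliding mean over the window
  y = builder.transpose(y, { permutation: [0, 3, 1, 2] });           // back to NCHW
  y = builder.add(builder.mul(y, scalar(alpha)), scalar(bias));      // bias + alpha * mean
  y = builder.pow(y, scalar(beta));
  return builder.div(x, y);                                          // x / (bias + alpha * mean)^beta
}
```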
@Honry @fdwr PTAL, thanks!
`Module.jsepRegisterMLConstant` will be shortened by the Closure Compiler in the official release, which would cause an undefined error.
Fix it by using `Module['jsepRegisterMLConstant']`.
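For illustration, the difference between the two access styles under Closure-style property renaming (`impl` is a placeholder):
```js
// Dotted access can be renamed by the Closure Compiler, so the exported name is lost:
Module.jsepRegisterMLConstant = impl;    // may be minified to something like Module.a = impl
// Bracket access with a string literal is never renamed and survives minification:
Module['jsepRegisterMLConstant'] = impl; // property name preserved
```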
This PR makes the MatMul shaders independent of the inputs' broadcasting pattern; they depend only on the input ranks, with the shapes provided in uniforms. This fixes an issue where the shader code differed across broadcasting patterns yet had an identical cache key, resulting in wrong cache hits.
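To illustrate the fix: the cache key must capture everything that affects the generated WGSL. Once the shader reads shapes from uniforms, keying on the ranks alone is enough; a simplified, hypothetical key:
```js
// Shapes live in uniforms, so only the ranks affect the generated shader code.
const cacheKey = `MatMul;rankA=${aShape.length};rankB=${bShape.length};outRank=${outShape.length}`;
```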
This CL makes the WebGPU backend support subgroup features, allowing subgroup optimizations in the future.
### Description
With this CL, the WebGPU backend creates devices with the subgroups and subgroups-f16 features (both under origin trial in Chrome) or the chromium-experimental-subgroups feature enabled whenever available.
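A minimal sketch of such feature detection at device creation, using the feature names mentioned above (illustrative, not the backend's exact code):
```js
const adapter = await navigator.gpu.requestAdapter();
// Request each subgroup-related feature only if the adapter reports support for it.
const requiredFeatures = ['subgroups', 'subgroups-f16', 'chromium-experimental-subgroups']
  .filter((f) => adapter.features.has(f));
const device = await adapter.requestDevice({ requiredFeatures });
```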
### Motivation and Context
This CL allows WebGPU operator shaders to use subgroup optimizations in the future, which might yield significant speedups.
This PR fixes a bug that occurs when searching for a compatible `MLTensor` in the cache. We were not checking the number of dimensions in the shape, which meant a cached buffer of shape `[1]` could match `[1, 1, 256, 256]`.
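The corrected compatibility check has to compare the rank as well as each dimension; a minimal sketch (names are illustrative):
```js
// Both the rank and every dimension must match for a cache hit.
const shapesMatch = (a, b) => a.length === b.length && a.every((d, i) => d === b[i]);
shapesMatch([1], [1, 1, 256, 256]); // false (previously this could wrongly hit the cache)
```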
This PR also adds better handling when attempting to force an `MLTensor`
to a different shape.
In the current implementation, all the staging buffers for weight uploading are destroyed only after the first batch of kernel executions. This requires a lot of memory, since none of the staging buffers can be reused, and it also hurts startup time (weight uploading only happens during session creation) because the uploads are delayed until very late.
This PR submits the queue and destroys staging buffers very aggressively, so that the related GPU memory can be reused as much as possible, though the real behavior depends on the WebGPU and driver implementation. The aggressive queue submission also moves GPU operations to a much earlier time, which helps startup time.
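A hedged sketch of the per-upload pattern: copy from a fresh staging buffer, submit immediately, then destroy the staging buffer so its memory can be reclaimed (illustrative; the backend's actual batching policy may differ):
```js
function uploadWeight(device, dstBuffer, data /* Uint8Array */) {
  const staging = device.createBuffer({
    size: Math.ceil(data.byteLength / 4) * 4, // copy sizes must be 4-byte aligned
    usage: GPUBufferUsage.COPY_SRC | GPUBufferUsage.MAP_WRITE,
    mappedAtCreation: true,
  });
  new Uint8Array(staging.getMappedRange()).set(data);
  staging.unmap();
  const encoder = device.createCommandEncoder();
  encoder.copyBufferToBuffer(staging, 0, dstBuffer, 0, staging.size); // assumes dstBuffer is large enough
  device.queue.submit([encoder.finish()]); // eager submit: the copy starts right away
  staging.destroy(); // safe after submit; lets the memory be reused
}
```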
Buffer-uploading benchmarks comparing multiple solutions, with respect to memory and time consumption, can be found at
https://github.com/webatintel/webbench/blob/master/webgpu/buffer-upload.html,
with detailed test data at
https://docs.google.com/document/d/1KgygOkb9ZNzkgzQ_tWOGlEI9ScmMBHDjDojjPFLmVXU/edit.
I also tested phi3.5 on two machines; first-inference time improved from 5141 ms to 3579 ms and from 4327 ms to 2947 ms, respectively.
#22031
For reduce-related ops, we should increase the workgroup size to improve parallelism when only one workgroup is dispatched.
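An illustrative sizing heuristic (hypothetical names, not the actual kernel code):
```js
// If the whole reduction would land in a single workgroup, enlarge that
// workgroup so it still has many parallel invocations.
function pickWorkgroupSize(numWorkgroups, defaultSize = 64, maxSize = 256) {
  return numWorkgroups === 1 ? maxSize : defaultSize;
}
```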
The total ReduceMean time drops from 77.79 ms to 8.98 ms on my iGPUs.
BUG #22031
The total Gemm time in the demucs model drops from over 1000 ms to 181.14 ms on my iGPUs.
### Description
This change enhances the Node.js binding with the following features (a usage sketch follows the list):
- support for the WebGPU EP
- lazy initialization of `OrtEnv`
- the ability to initialize ORT with the default log level from `ort.env.logLevel`
- session options:
  - `enableProfiling` and `profileFilePrefix`: support profiling.
  - `externalData`: explicit external data (optional in the Node.js binding)
  - `optimizedModelFilePath`: allow dumping the optimized model for diagnostic purposes
  - `preferredOutputLocation`: support IO binding.
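A hedged usage sketch of the options listed above; the model path and option values are illustrative:
```js
const ort = require('onnxruntime-node');

async function main() {
  ort.env.logLevel = 'verbose'; // picked up when OrtEnv is lazily initialized
  const session = await ort.InferenceSession.create('model.onnx', {
    executionProviders: ['webgpu'],
    enableProfiling: true,
    profileFilePrefix: 'ort-profile',
    optimizedModelFilePath: 'model.optimized.onnx', // dump the optimized model for diagnosis
  });
}

main();
```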
======================================================
`Tensor.download()` is not implemented in this PR.
Build pipeline update is not included in this PR.
WebNN doesn't provide a dedicated op for SimplifiedLayerNormalization, so the WebNN EP emulates it with a chain of WebNN ops:
X --> Pow --> ReduceMean --> Add --> Sqrt --> Div --> Mul
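A minimal sketch of that decomposition using `MLGraphBuilder` ops, with illustrative epsilon/scale handling; not the exact WebNN EP code:
```js
// Hypothetical sketch: x and scale are MLOperands; epsilon/axis come from the node attributes.
function buildSimplifiedLayerNorm(builder, x, scale, { epsilon, axis }) {
  const scalar = (v) =>
    builder.constant({ dataType: 'float32', shape: [] }, new Float32Array([v]));
  const meanOfSquares = builder.reduceMean(
    builder.pow(x, scalar(2)),                 // X -> Pow
    { axes: [axis], keepDimensions: true });   // -> ReduceMean
  const rms = builder.sqrt(builder.add(meanOfSquares, scalar(epsilon))); // -> Add -> Sqrt
  return builder.mul(builder.div(x, rms), scale);                        // -> Div -> Mul
}
```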
### Description
BUG #22031
In the demucs model, there are many MatMul ops with shapes like this:
`input[0]: [3448,1,512] | float32, input[1]: [512,1536] | float32, output[0]: [3448,1,1536] | float32`
For this kind of shape, the batch size is large but M = 1. Our current algorithm partitions tiles based on [M, N], which is inefficient for such shapes. This PR reshapes the inputs to improve MatMul performance.
Before: [3448,1,512] x [512,1536] = [3448,1,1536]
After: [1, 3448, 512] x [512, 1536] = [1, 3448, 1536], then the output can be reshaped to [3448, 1, 1536]
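A sketch of that reshape logic (hypothetical helper, simplified to the 3-D x 2-D case):
```js
// When M === 1, fold the batch dimension into M so tiling over [M, N] has real work to partition.
function reshapeForMatMul(aShape /* [batch, M, K] */, bShape /* [K, N] */) {
  const [batch, M, K] = aShape;
  const N = bShape[1];
  if (M === 1) {
    // compute [1, batch, K] x [K, N] = [1, batch, N], then view the output as [batch, 1, N]
    return { a: [1, batch, K], out: [batch, 1, N] };
  }
  return { a: aShape, out: [batch, M, N] };
}
```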
The overall MatMul time in the demucs model drops from 4418.17 ms to 1778.45 ms on my iGPUs.
---------
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
### Description
This change adds a cache of `MLContext`s keyed by their options to the `WebNNBackend`, so that multiple `InferenceSession`s created with the same options share the same context.
### Motivation and Context
Since `MLTensor`s are tied to `MLContext`s, developers can't easily share tensors between `InferenceSession`s (outside of manually creating an `MLContext` and specifying it via the `context` option). This leads to strange behaviors such as:
```js
const sessionA = await ort.InferenceSession.create(urlA, {
  executionProviders: ["webnn"],
  preferredOutputLocation: "ml-buffer",
});
const sessionB = await ort.InferenceSession.create(urlB, {
  executionProviders: ["webnn"],
});
const temp = await sessionA.run({/* arguments */});
const result = await sessionB.run({"input": temp["output"]}); // ERROR: Failed to execute 'dispatch' on 'MLContext': Invalid inputs: The context of MLGraph doesn't match the context of the MLTensor with name "input".
```
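A hedged sketch of the caching idea: key `MLContext`s by their serialized creation options so sessions created with identical options share one context (illustrative, not the backend's exact code):
```js
const contextCache = new Map(); // serialized options -> Promise<MLContext>

function getMLContext(options = {}) {
  const key = JSON.stringify(options);
  if (!contextCache.has(key)) {
    contextCache.set(key, navigator.ml.createContext(options));
  }
  return contextCache.get(key);
}
```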
We encountered this behavior when updating the transformers.js version
in the developer preview demos. microsoft/webnn-developer-preview#46
BUG #22031
Optimize the two situations below:
1. Increase the workgroup size if only one workgroup is dispatched.
2. Avoid transposes when they are not necessary.
The overall time of the demucs model drops from 154.60 ms to 106.36 ms on my dGPUs with this PR and PR #22577.
### Description
A test case was failing sometimes and passing other times.
### Motivation and Context
Prevents unnecessary CI build failures that require manually rerunning tests.
This PR adds a dependency on the global-agent package and uses it in the JSEP scripts that download files from the network (i.e. `js/scripts/utils.ts` and `js/web/script/pull-prebuilt-wasm-artifacts.ts`), so that users can make these scripts use a network proxy by setting the environment variable `GLOBAL_AGENT_HTTPS_PROXY`.
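For reference, global-agent is typically bootstrapped near the top of a script; the actual wiring in these scripts may differ:
```js
// Routes subsequent http/https requests through the proxy named by
// GLOBAL_AGENT_HTTPS_PROXY (or GLOBAL_AGENT_HTTP_PROXY).
const { bootstrap } = require('global-agent');
bootstrap();
```
Run, for example, with `GLOBAL_AGENT_HTTPS_PROXY=http://127.0.0.1:8888 node script.js` (the proxy URL is illustrative).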
### Description
This PR introduces support for registering external data inside the WebNN EP.
### Motivation and Context
- The WebNN EP needs to register the initializers at the graph compilation stage. For initializers backed by external data, it can't leverage the general external-data loader framework, because the WebNN EP's graph compilation runs before the external data loader is called.
- Exposes `utils::GetExternalDataInfo`, which is useful for the WebNN EP to read an external tensor's information.
- Defines a new `registerMLConstant` in JSEP that creates WebNN constants from external data in the WebNN backend, taking the tensor's info as parameters along with `Module.MountedFiles`, which holds all preloaded external files (a rough sketch follows the list).
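A rough sketch of the idea behind `registerMLConstant`: read the tensor's bytes from the preloaded external file and hand them to `MLGraphBuilder.constant()`. Parameter names are illustrative, `Module.MountedFiles` is treated as a map from path to bytes, and a float32 tensor is assumed; this is not the exact JSEP signature:
```js
function registerMLConstant(builder, desc, externalFile, offset, byteLength, mountedFiles) {
  const fileBytes = mountedFiles.get(externalFile); // Uint8Array of the preloaded file
  const bytes = fileBytes.subarray(offset, offset + byteLength);
  // Assumes float32 data and 4-byte alignment; real code would pick the view by dataType.
  const view = new Float32Array(bytes.buffer, bytes.byteOffset, byteLength / 4);
  return builder.constant(desc, view);
}
```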
### Description
This change enables caching `MLTensor`s between inference runs by keeping a reference to `MLTensor`s alive after they have been released. `MLTensor`s are only destroyed once the session goes out of scope.
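A hedged sketch of that lifetime management: released tensors are parked for reuse instead of being destroyed, and everything is freed at session teardown (illustrative names, not the actual implementation):
```js
class MLTensorCache {
  constructor() { this.free = new Map(); } // "dataType:shape" -> MLTensor[]
  key(desc) { return `${desc.dataType}:${desc.shape.join('x')}`; }
  async acquire(context, desc) {
    const list = this.free.get(this.key(desc));
    if (list && list.length > 0) return list.pop(); // reuse a previously released tensor
    return context.createTensor(desc);              // otherwise create a new one
  }
  release(desc, tensor) { // keep the tensor alive instead of destroying it
    const k = this.key(desc);
    if (!this.free.has(k)) this.free.set(k, []);
    this.free.get(k).push(tensor);
  }
  dispose() { // real destruction only happens when the session goes away
    for (const list of this.free.values()) list.forEach((t) => t.destroy());
    this.free.clear();
  }
}
```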
### Motivation and Context
Creating and destroying `MLTensor`s on every run has a non-trivial performance penalty. This penalty materializes when using `ort.Tensor`s with `location=cpu` for inputs/outputs, or when using the CPU EP as a fallback for unsupported operators. The former can be mitigated by developers using `ort.Tensor`s with `location=ml-tensor`; the latter cannot be mitigated by developers.