onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-14 20:48:00 +00:00

Author	SHA1	Message	Date
Yulong Wang	c53c9caf17	[js] update mocha to v11.0.1 (#23254 ) ### Description Update `mocha` to v11.0.1 and `fs-extra` to v11.2.0 ``` # npm audit report nanoid <3.3.8 Severity: moderate Predictable results in nanoid generation when given non-integer values - https://github.com/advisories/GHSA-mwcw-c2x4-8c55 fix available via `npm audit fix` node_modules/nanoid mocha 8.2.0 - 10.2.0 Depends on vulnerable versions of nanoid node_modules/mocha 2 moderate severity vulnerabilities ```	2025-01-05 22:29:02 -08:00
Wu, Junze	2a16ad0215	[js/node] add proxy agent support for onnxruntime-node install script (#23232 ) ### Description Add proxy agent to fetch request ### Motivation and Context Fixes #23231 --------- Signed-off-by: Junze Wu <junze.wu@intel.com> Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>	2025-01-04 20:27:55 -08:00
Yulong Wang	5c2e60c5af	[js/node] update install script to allow use proxy (#23242 ) ### Description Use `https.get` instead of `fetch` in ORT Nodejs binding package install script. ### Motivation and Context According to discussions in #23232, the package `global-agent` cannot work with `fetch` API. To make it work with the proxy agent, this PR replaces the `fetch` API with `https.get` in the install script.	2025-01-03 14:27:15 -08:00
Changming Sun	5d692b0136	Merge web machine pools (#23243 ) ### Description The Web CI pipeline uses three different Windows machine pools: 1. onnxruntime-Win2022-webgpu-A10 2. onnxruntime-Win2022-VS2022-webgpu-A10 3. onnxruntime-Win-CPU-2022-web This PR merges them together to reduce ongoing maintenance cost.	2025-01-03 13:53:17 -08:00
xhcao	a3833a5e79	[js/webgpu] validate transpose perm if specified (#23197 )	2025-01-01 15:58:54 -08:00
Wanming Lin	2d05c4bcd9	[WebNN] Support SkipSimplifiedLayerNormalization op (#23151 ) The algorithm of `SkipSimplifiedLayerNormalization` is quite similar to that of `SimplifiedLayerNormalization`; the only difference is that `SkipSimplifiedLayerNormalization` provides an additional output used for calculating the sum of the input, skip and bias (if it exists). Also fixes a bug in `SimplifiedLayerNormalization` by adding the bias if it exists.	2024-12-24 12:44:14 -08:00
liqun Fu	a9a881cc98	Integrate onnx 1.17.0 (#21897 ) ### Description For the ORT 1.21.0 release. The following issues were created to track tests skipped due to updated ONNX operators in the ONNX 1.17.0 release: https://github.com/microsoft/onnxruntime/issues/23162 https://github.com/microsoft/onnxruntime/issues/23164 https://github.com/microsoft/onnxruntime/issues/23163 https://github.com/microsoft/onnxruntime/issues/23161 --------- Signed-off-by: Liqun Fu <liqfu@microsoft.com> Signed-off-by: Liqun Fu <liqun.fu@microsoft.com> Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com> Co-authored-by: Yifan Li <109183385+yf711@users.noreply.github.com> Co-authored-by: yf711 <yifanl@microsoft.com>	2024-12-24 09:02:02 -08:00
Yulong Wang	8680244ebc	Fix delay load for WebGPU EP and DML EP (#23111 ) ### Description This change fixes the DLL delay load problem for the WebGPU EP and DirectML EP. See detailed explanation below. ### Problem When onnxruntime.dll uses delay loading for its dependencies, the dependencies are loaded using `LoadLibraryEx()`, which searches the directory of the process (.exe) instead of the directory of this library (onnxruntime.dll). This is a problem for the Node.js binding and the Python binding, because Windows will try to find the dependencies in the directory of node.exe or python.exe, which is not the directory of onnxruntime.dll. There was a previous attempt to fix this by loading DirectML.dll during initialization of the onnxruntime Node.js binding, which works for the DML EP but is not a good solution because it does not really "delay" the load. For WebGPU, the situation became worse because webgpu_dawn.dll depends on dxil.dll and dxcompiler.dll, which are explicitly dynamically loaded in the code using `LoadLibraryA()`. This has the same DLL search problem. ### Solutions For onnxruntime.dll loading its direct dependencies, the problem can be resolved by setting the [`__pfnDliNotifyHook2` hook](https://learn.microsoft.com/en-us/cpp/build/reference/understanding-the-helper-function?view=msvc-170#structure-and-constant-definitions) to load from an absolute path that is constructed from the onnxruntime.dll folder and the DLL name. For webgpu_dawn.dll loading dxil.dll and dxcompiler.dll, since they are explicitly loaded in the code, the hook does not work. Instead, the problem can be resolved by ~~using WIN32 API `SetDllDirectory()` to add the onnxruntime.dll folder to the search path.~~ preloading the 2 DLLs from the onnxruntime.dll folder.	2024-12-19 10:23:48 -08:00
Yulong Wang	780735098d	[nodejs binding] Fix building in latest clang (#23146 ) ### Description This change fixes the build break for Node.js binding on latest AppleClang: ``` ...tensor_helper.cc:65:5 error: integer value -1 is outside of the valid range of values [0,15] for the enumeration type 'napi_typedarray_type' [-Wenum-constexpr-conversion] ``` Use the underlying type of enum `napi_typedarray_type` for `DATA_TYPE_TYPEDARRAY_MAP` to solve this issue. Because the underlying type is implementation defined (it's `int` for MSVC and `unsigned int` for Clang), we use `std::underlying_type_t` to get the correct type.	2024-12-19 10:23:27 -08:00
Yulong Wang	ae6dcc839e	Revert "[js/webgpu] disable failed tests temporarily (#23127 )" (#23130 ) ### Description This reverts commit `9115682d69`. ### Motivation and Context	2024-12-18 18:07:50 -08:00
Wanming Lin	a5b60ec03f	[WebNN] Add limit to QDQ ops (#23076 ) WebNN requires the `scale_shape` to be a subsample of the `input_shape`.	2024-12-17 12:52:08 -08:00
Enrico Galli	54edb43e77	[WebNN] Fixes MLTensor caching across different contexts (#23100 ) We weren't checking that MLTensors were from the same context before reusing them. Found while debugging microsoft/webnn-developer-preview#69	2024-12-17 12:51:16 -08:00
Yulong Wang	9115682d69	[js/webgpu] disable failed tests temporarily (#23127 ) ### Description Those test cases start to fail for unknown reasons. To unblock the CI, I disabled those tests temporarily to earn time to investigate the root cause.	2024-12-16 15:35:47 -08:00
Yulong Wang	01539ee7ab	[js/webgpu] fix Conv2DMatMul shader's out-of-bound read (#23085 ) ### Description Fix a bug caused by potential out-of-bound reads of `W` in the Conv2DMatMul shader. ### Motivation and Context Fixes #22983	2024-12-12 11:33:53 -08:00
Yulong Wang	e605870783	[js/web] Update API for `ort.env.webgpu` (#23026 ) ### Description This PR is a replacement of #21671. It offers a new way for accessing the following: - `ort.env.webgpu.adapter`: - deprecating. There is no point to get the value of it. Once `GPUDevice.adapterInfo` is supported, there is no point to set the value too. - `ort.env.webgpu.device`: - set value of `GPUDevice` if user created it. Use at user's own risk. - get value of `Promise<GPUDevice>`. if not exist, create a new one. if exist return it. - `ort.env.webgpu.powerPreference`: - deprecating. encouraging users to set `ort.env.webgpu.device` if necessary. - `ort.env.webgpu.forceFallbackAdapter`: - deprecating. encouraging users to set `ort.env.webgpu.device` if necessary.	2024-12-11 10:24:14 -08:00
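The get/set behaviour described for `ort.env.webgpu.device` (setting stores a user-created device, getting returns a promise and creates a device only if none exists) can be modelled in isolation. The following is a minimal sketch, not the onnxruntime-web implementation: `DeviceLike`, `LazyDeviceHolder`, `setDevice`, `getDevice` and the factory argument are all hypothetical names introduced for illustration, and explicit methods are used in place of property accessors.

```typescript
// Hypothetical stand-in for GPUDevice so the sketch needs no WebGPU typings.
interface DeviceLike {
  label: string;
}

class LazyDeviceHolder {
  private devicePromise?: Promise<DeviceLike>;

  // `createDevice` is an assumed factory; it is only called if no device was set.
  constructor(private readonly createDevice: () => Promise<DeviceLike>) {}

  // Models "set": the user supplies a device they created themselves.
  setDevice(device: DeviceLike): void {
    this.devicePromise = Promise.resolve(device);
  }

  // Models "get": return the existing device, or create one on first access.
  getDevice(): Promise<DeviceLike> {
    if (this.devicePromise === undefined) {
      this.devicePromise = this.createDevice();
    }
    return this.devicePromise;
  }
}

// Usage: two reads share one lazily created device.
let created = 0;
const holder = new LazyDeviceHolder(() => {
  created++;
  return Promise.resolve({ label: 'auto-created' });
});
const first = holder.getDevice();
const second = holder.getDevice();
```

Caching the promise rather than the resolved device means concurrent readers never trigger a second creation, which is one plausible reason for exposing the getter as `Promise<GPUDevice>`.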
Jian Chen	5f7b9d0245	Upgrade gradle to 8.7 (#23016 ) ### Description This PR only upgrades the gradle version and the `com.android.tools.build:gradle` version in build.gradle. It only updates the react-native library gradle version, not the e2e test.	2024-12-10 10:49:03 -08:00
dependabot[bot]	d27fecd3d3	Bump cross-spawn from 6.0.5 to 6.0.6 in /js/web (#23019 ) Bumps [cross-spawn](https://github.com/moxystudio/node-cross-spawn) from 6.0.5 to 6.0.6. Changelog (sourced from [cross-spawn's changelog](https://github.com/moxystudio/node-cross-spawn/blob/v6.0.6/CHANGELOG.md)): [6.0.6](https://github.com/moxystudio/node-cross-spawn/compare/v6.0.5...v6.0.6) (2024-11-18), Bug Fixes: disable regexp backtracking ([#160](https://redirect.github.com/moxystudio/node-cross-spawn/issues/160)) ([ba5aaef](https://github.com/moxystudio/node-cross-spawn/commit/ba5aaef)); core: support worker threads ([#127](https://redirect.github.com/moxystudio/node-cross-spawn/issues/127)) ([f4af31c](https://github.com/moxystudio/node-cross-spawn/commit/f4af31c)). Commits: `d35c865` chore(release): 6.0.6; `5a37e19` chore: update package.json and package.lock; `ba5aaef` fix: disable regexp backtracking (#160); `f4af31c` fix(core): support worker threads (#127). See full diff in [compare view](https://github.com/moxystudio/node-cross-spawn/compare/v6.0.5...v6.0.6). Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-12-05 10:07:08 -08:00
Yulong Wang	1c79a4c9dd	[js/common] use TS type inference to eliminate `unknown` (#23012 ) ### Description This change uses a TypeScript trick to infer global types in onnxruntime-common. Thanks to TypeScript's strong type system, we can refer to types that may not be available in the context. This lets onnxruntime-common avoid dependencies such as "@webgpu/types" while still using those types in its declarations. See the comments on `TryGetGlobalType` in `type-helper.ts`.	2024-12-04 19:01:26 -08:00
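The idea of referring to a global type only if it happens to be declared can be sketched with a conditional type over `typeof globalThis`. This is an illustrative sketch of the technique under that assumption; the actual definition of `TryGetGlobalType` in `type-helper.ts` may differ, and `SomeTypeThatIsNotDeclared` is a made-up name.

```typescript
// If the global scope declares `Name` with a `prototype`, infer its instance type;
// otherwise use the fallback (`unknown` by default).
type TryGetGlobalType<Name extends string, Fallback = unknown> = typeof globalThis extends {
  [k in Name]: { prototype: infer T };
}
  ? T
  : Fallback;

// `Date` is always declared, so this resolves to the real `Date` type.
type DateType = TryGetGlobalType<'Date'>;
// A made-up global name is not declared, so the fallback (`null` here) is used.
type MissingType = TryGetGlobalType<'SomeTypeThatIsNotDeclared', null>;

const now: DateType = new Date(0);
const missing: MissingType = null;
// This line only compiles because DateType is `Date` rather than `unknown`.
const epochYear: number = now.getUTCFullYear();
```

With a type such as `GPUDevice`, the same pattern would resolve to the real type when the consumer has WebGPU typings installed and to the fallback otherwise, so the declaring package needs no dependency on those typings.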
Yulong Wang	3234487385	[js] remove more unused training types (#22753 ) ### Description remove more unused training types	2024-12-04 16:44:09 -08:00
dependabot[bot]	3975e79303	Bump axios from 1.6.1 to 1.7.9 in /js/node (#23009 )	2024-12-04 23:52:24 +00:00
Yulong Wang	fdf5ffe2cf	[js/node] fix TypeScript declaration in onnxruntime-node (#23000 ) ### Description fix TypeScript declaration in onnxruntime-node ### Motivation and Context Fixes #22978	2024-12-04 11:29:27 -08:00
Xu Xing	c19617a24a	[js/webgpu] Add GatherND (#22847 )	2024-12-04 09:57:32 -08:00
Yulong Wang	50b38ca9d5	[js/web] update default export to include webgpu (#22754 ) ### Description This PR changes the following exports: - `onnxruntime-web` is now the same as `onnxruntime-web/webgpu`. - `onnxruntime-web/webgpu` is being deprecated. ### Migration instructions: - use `onnxruntime-web` instead of `onnxruntime-web/webgpu`. - use `onnxruntime-web/wasm` to use onnxruntime-web without webgpu/webnn. ### Export table

| file name | export entry | includes WASM | includes JSEP (WebGPU & WebNN) | includes WebGL |
| ------------- | ------------- | ----- | ----- | ----- |
| ort.all.min.js, ort.all.js, ort.all.min.mjs, ort.all.mjs | `onnxruntime-web/all` | ✔️ | ✔️ | ✔️ |
| ort.min.js, ort.js, ort.min.mjs, ort.mjs | `onnxruntime-web` | ✔️ | ❌ --> ✔️ | ✔️ --> ❌ |
| ort.webgpu.min.js, ort.webgpu.js, ort.webgpu.min.mjs, ort.webgpu.mjs | `onnxruntime-web/webgpu` | ✔️ | ✔️ | ❌ |
| ort.wasm.min.js, ort.wasm.js, ort.wasm.min.mjs, ort.wasm.mjs | `onnxruntime-web/wasm` | ✔️ | ❌ | ❌ |

	2024-12-04 09:46:45 -08:00
dependabot[bot]	bd701e4f33	Bump cross-spawn from 7.0.3 to 7.0.6 in /js (#23003 )	2024-12-04 05:07:21 +00:00
Yulong Wang	06526af346	[js/webgpu] fix a bug in transpose shader (#22997 ) ### Description Fix a bug in transpose shader, when input/output rank is 1. ### Motivation and Context Fixes #22994	2024-12-03 20:21:08 -08:00
dependabot[bot]	4497c97d54	Bump cross-spawn from 7.0.3 to 7.0.6 in /js/node (#22998 ) Bumps [cross-spawn](https://github.com/moxystudio/node-cross-spawn) from 7.0.3 to 7.0.6. Changelog (sourced from [cross-spawn's changelog](https://github.com/moxystudio/node-cross-spawn/blob/master/CHANGELOG.md)): [7.0.6](https://github.com/moxystudio/node-cross-spawn/compare/v7.0.5...v7.0.6) (2024-11-18), Bug Fixes: update cross-spawn version to 7.0.5 in package-lock.json (`f700743`). [7.0.5](https://github.com/moxystudio/node-cross-spawn/compare/v7.0.4...v7.0.5) (2024-11-07), Bug Fixes: fix escaping bug introduced by backtracking (`640d391`). [7.0.4](https://github.com/moxystudio/node-cross-spawn/compare/v7.0.3...v7.0.4) (2024-11-07), Bug Fixes: disable regexp backtracking ([#160](https://redirect.github.com/moxystudio/node-cross-spawn/issues/160)) (`5ff3a07`). Commits: `77cd97f` chore(release): 7.0.6; `6717de4` chore: upgrade standard-version; `f700743` fix: update cross-spawn version to 7.0.5 in package-lock.json; `9a7e3b2` chore: fix build status badge; `0852683` chore(release): 7.0.5; `640d391` fix: fix escaping bug introduced by backtracking; `bff0c87` chore: remove codecov; `a7c6abc` chore: replace travis with github workflows; `9b9246e` chore(release): 7.0.4; `5ff3a07` fix: disable regexp backtracking (#160). Additional commits viewable in [compare view](https://github.com/moxystudio/node-cross-spawn/compare/v7.0.3...v7.0.6). Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-12-03 18:48:22 -08:00
Yulong Wang	d3bc3180d8	[js/node] fix CUDA artifact installation script for Linux/x64 (#22984 ) ### Description This PR updates installation script to fix it for CUDA v12. However, it may be difficult for CUDA v11 since the steps are quite complicated to automate. Added a few lines of instructions instead. fixes #22877	2024-12-03 16:07:43 -08:00
Jian Chen	9ed0c7fe26	Redo "Update Gradle version 8.7 and java version 17 within onnxruntime/java" (#22923 )	2024-12-02 18:34:25 -08:00
Wanming Lin	fe749a88a5	[WebNN EP] Fixed bug in usage of Array.reduce() (#22944 ) In JS, calling reduce on an empty array with no initial value throws an error. Fix it by checking the array length first.	2024-11-26 19:03:44 -08:00
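The failure mode and the length-check fix can be shown in a few lines. This is a generic sketch rather than the code changed in the PR; `product` and `throwsOnEmptyReduce` are hypothetical helpers, and returning 1 for an empty array is an assumed convention.

```typescript
// Multiplies all dimensions together, guarding the empty case first.
function product(dims: number[]): number {
  // reduce() on an empty array without an initial value throws a TypeError.
  if (dims.length === 0) {
    return 1; // assumed convention for this sketch: the empty product is 1
  }
  return dims.reduce((a, b) => a * b);
}

// Demonstrates the error that the length check avoids.
function throwsOnEmptyReduce(): boolean {
  try {
    ([] as number[]).reduce((a, b) => a * b);
    return false;
  } catch (e) {
    return e instanceof TypeError;
  }
}
```

Passing an initial value, as in `dims.reduce((a, b) => a * b, 1)`, is an alternative that avoids the exception without a separate branch.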
shiyi	afbb53937c	[WebNN] Support negative steps for slice (#22871 ) Slice with negative steps can be emulated by reverse+slice.	2024-11-25 23:06:23 -08:00
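The reverse+slice emulation can be illustrated on a 1-D array. This is a minimal sketch under stated assumptions, not the WebNN EP code: `sliceForward` and `sliceNegativeStep` are hypothetical helpers, and the indices are assumed to be already normalized (0 <= start <= n-1, -1 <= end <= n-1, step < 0).

```typescript
// Forward slice with a positive step over the half-open range [start, end).
function sliceForward(data: number[], start: number, end: number, step: number): number[] {
  const out: number[] = [];
  for (let i = start; i < end && i < data.length; i += step) {
    out.push(data[i]);
  }
  return out;
}

// Emulates a negative-step slice as reverse followed by a forward slice.
function sliceNegativeStep(data: number[], start: number, end: number, step: number): number[] {
  const n = data.length;
  const reversed = data.slice().reverse();
  // Index i in `data` corresponds to index n - 1 - i in `reversed`,
  // so both bounds are mirrored and the step changes sign.
  return sliceForward(reversed, n - 1 - start, n - 1 - end, -step);
}

// Usage: take indices 4 and 2 (index 0 is the exclusive end).
const everySecondBackwards = sliceNegativeStep([0, 1, 2, 3, 4, 5], 4, 0, -2);
```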
Bin Miao	558ae8621c	[WebNN EP] Fix an issue of CumSum operator (#22936 ) This PR limits the axis of the CumSum operator to be a constant when using WebNN EP. @Honry @fdwr PTAL.	2024-11-25 21:05:53 -08:00
Yi Zhang	a28246a994	Revert "Update Gradle version 8.7 and java version 17 within onnxruntime/java (#22771)" (#22914 ) This reverts commit `632a36a233`. ### Motivation and Context E2E tests run with Browserstack failed because of the reverted change (#22771).	2024-11-21 18:12:28 +08:00
Wanming Lin	8a06f13301	[WebNN] Remove wasm.currentContext check (#22886 ) If a WebNN session throws early, this check for `wasm.currentContext` breaks all following WebNN sessions; this often happens in npm tests.	2024-11-19 12:22:02 -06:00
Jiajia Qin	e597eaed4a	[js/webgpu] Optimize transpose as reshape when suitable (#22870 ) BUG #22031	2024-11-18 12:52:48 -08:00
Peishen Yan	5928009553	[WebNN EP] Support Einsum op (#19558 ) Adds support for einsum via WebNN matmul, transpose, reshape, reducesum, identity and element-wise binary ops.	2024-11-15 17:58:35 -08:00
Jian Chen	632a36a233	Update Gradle version 8.7 and java version 17 within onnxruntime/java (#22771 ) ### Description This change updates the Gradle version within the java project to 8.7 and upgrades Java to 17. The Gradle version for react-native was also updated to 7.5 to make it compatible with the changes in the Java directory. However, the target Java version remains the same; it will be upgraded in a separate PR. This is split from #22206 ### Motivation and Context This is the first step to upgrade the react native version.	2024-11-14 17:10:44 -08:00
Wanming Lin	82681205e4	[WebNN] Fix MLTensorUsage is undefined issue (#22831 ) `MLTensorUsage` has been removed from Chromium: https://chromium-review.googlesource.com/c/chromium/src/+/6015318, but we still need to make it compatible with old Chrome versions, so just make it `undefined` for latest Chrome version.	2024-11-13 20:22:22 -08:00
Xu Xing	ff57ac4f3d	[js/webgpu] Add scatterND (#22755 )	2024-11-13 09:13:00 -08:00
Jiajia Qin	7e0dd9d433	[js/webgpu] Optimize Expand (#22752 ) Use components = 4 if possible. llama3.2-1B becomes 20 tokens/s from 18 tokens/s on my iGPUs.	2024-11-12 12:37:19 -08:00
Jiajia Qin	05c8dc9d1c	[js/webgpu] Optimize ConvTranspose (#22774 ) BUG #22031 The overall time of ConvTranspose in Demucs model becomes 517.41 ms from 1415.65 ms on my iGPUs.	2024-11-12 12:37:07 -08:00
Bin Miao	67f5be0da2	[WebNN EP] Support LRN operator (#22775 ) WebNN doesn't provide a dedicated op for LRN, so the WebNN EP emulates it with a sequence of WebNN ops: pow -> transpose -> pad -> averagePool -> transpose -> mul -> add -> pow -> div @Honry @fdwr PTAL, thanks!	2024-11-12 11:53:52 -08:00
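Why that op chain reproduces LRN can be checked numerically on a single channel vector. The sketch below is not the WebNN EP code: it compares the ONNX LRN definition with the same steps written out by hand, omits the two transposes because it already works along the channel axis, and assumes the average pool counts the zero padding toward its divisor. Function names are hypothetical.

```typescript
// Reference LRN over a 1-D channel vector, following the ONNX operator definition.
function lrnReference(x: number[], size: number, alpha: number, beta: number, bias: number): number[] {
  const lead = Math.floor((size - 1) / 2);
  const trail = Math.ceil((size - 1) / 2);
  return x.map((v, c) => {
    let squareSum = 0;
    for (let i = Math.max(0, c - lead); i <= Math.min(x.length - 1, c + trail); i++) {
      squareSum += x[i] * x[i];
    }
    return v / Math.pow(bias + (alpha / size) * squareSum, beta);
  });
}

// The same result from elementary steps: pow -> pad -> averagePool -> mul -> add -> pow -> div.
function lrnEmulated(x: number[], size: number, alpha: number, beta: number, bias: number): number[] {
  const lead = Math.floor((size - 1) / 2);
  const trail = Math.ceil((size - 1) / 2);
  const squared = x.map((v) => v * v); // pow(x, 2)
  const padded: number[] = []; // pad the channel axis with zeros
  for (let i = 0; i < lead; i++) padded.push(0);
  for (const v of squared) padded.push(v);
  for (let i = 0; i < trail; i++) padded.push(0);
  const pooled = x.map((_, c) => {
    // averagePool with window `size`; padded zeros count toward the divisor
    let sum = 0;
    for (let i = 0; i < size; i++) sum += padded[c + i];
    return sum / size;
  });
  // mul by alpha, add bias, pow by beta, then divide the input by the result
  return x.map((v, c) => v / Math.pow(pooled[c] * alpha + bias, beta));
}
```

The key observation is that an average over a zero-padded window of `size` elements equals `square_sum / size`, which is exactly the term that LRN scales by `alpha`.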
Wanming Lin	cdc8db9984	[WebNN] Fixed WebNN Module undefined issue (#22795 ) `Module.jsepRegisterMLConstant` will be shortened by Closure Compiler in the official release, which causes an undefined error. Fix it by using `Module['jsepRegisterMLConstant']`.	2024-11-11 21:31:24 -08:00
shiyi	63cb53257b	[WebNN] Support steps >= 1 for slice operator (#22708 ) Co-authored-by: Wanming Lin <wanming.lin@intel.com>	2024-11-09 18:20:52 -08:00
xhcao	b5ee4ac760	[js/webgpu] support GridSample operator (#22652 )	2024-11-08 11:02:36 -08:00
jzm-intel	d9b91682f1	WebGPU JSEP: Make shader code not depend on input broadcasting patterns (#22536 ) This PR makes MatMul shaders depend only on the input ranks and the shapes provided in uniforms, not on the inputs' broadcasting pattern. This fixes the issue that shader code currently differs for different broadcasting patterns but has an identical cache key, which results in wrong cache hits.	2024-11-08 11:00:51 -08:00
jzm-intel	6a295eb75b	[JS/WebGPU] Creating devices with subgroup features enabled if possible (#21833 ) This CL makes the WebGPU backend support subgroup features and thus allows using subgroup optimizations in the future. ### Description With this CL, WebGPU backends will create devices with the subgroups and subgroups-f16 features (both are under origin trial in Chrome) or the chromium-experimental-subgroups feature enabled whenever available. ### Motivation and Context This CL would allow WebGPU operator shaders to use subgroup optimizations in the future, which might bring significant speedups.	2024-11-07 02:13:40 -08:00
Enrico Galli	1cb5ceedf3	[WebNN EP] Fix issues with MLTensor caching (#22701 ) This PR fixes a bug that occurs when searching for compatible `MLTensor` in the cache. We were missing checking the number of dimensions in the shape. This would mean that a cached buffer of shape `[1]` could match for `[1, 1, 256, 256]`. This PR also adds better handling when attempting to force an `MLTensor` to a different shape.	2024-11-06 09:17:11 -08:00
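One way a rank-blind comparison can produce the `[1]` versus `[1, 1, 256, 256]` match is sketched below. This is a hypothetical reconstruction for illustration, not the code from the PR; both function names are made up.

```typescript
// Rank-blind comparison: only walks the dims of `cached`, so a lower-rank
// shape can match the leading dims of a higher-rank one.
function matchesWithoutRankCheck(cached: number[], wanted: number[]): boolean {
  return cached.every((dim, i) => dim === wanted[i]);
}

// Comparison that also requires the number of dimensions to agree.
function shapesMatch(cached: number[], wanted: number[]): boolean {
  return cached.length === wanted.length && cached.every((dim, i) => dim === wanted[i]);
}

// The case from the commit message: [1] must not match [1, 1, 256, 256].
const falsePositive = matchesWithoutRankCheck([1], [1, 1, 256, 256]);
const fixedResult = shapesMatch([1], [1, 1, 256, 256]);
```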
Yang Gu	811231e418	[js/webgpu] Destroy staging buffers aggressively during weights uploading (#22726 ) In the current implementation, all the staging buffers for weights uploading are destroyed after the first batch of kernel execution. This requires a lot of memory, as none of the staging buffers can be reused. It also hurts the startup time (weights uploading only happens in session creation), as weights uploading is delayed to a very late time. This PR uses a very aggressive way to submit the queue and destroy staging buffers, so that the related GPU memory can be reused as much as possible, though the real situation depends on the WebGPU and driver implementation. The aggressive queue submission also moves GPU operations to a very early time, which helps the startup time. Some buffer uploading benchmarks were written to compare multiple solutions with regard to memory and time consumption. Benchmarks can be found at https://github.com/webatintel/webbench/blob/master/webgpu/buffer-upload.html, while detailed test data can be found at https://docs.google.com/document/d/1KgygOkb9ZNzkgzQ_tWOGlEI9ScmMBHDjDojjPFLmVXU/edit. I also tested phi3.5 on 2 machines; first inference time improved from 5141ms to 3579ms and from 4327ms to 2947ms respectively.	2024-11-06 08:55:15 -08:00
Jiajia Qin	d5b2730ff8	[js/webgpu] Increase workgroupSize if only one workgroup is dispatched (#22709 ) #22031 For reduce related ops, we should increase workgroupSize to improve parallelism if only one workgroup is dispatched. The total ReduceMean time becomes 8.98 ms from 77.79 ms on my iGPUs.	2024-11-05 13:13:52 -08:00
Jiajia Qin	64d8e25b4c	[js/webgpu] Optimize Gemm (#22706 ) BUG #22031 The total Gemm time in demucs model becomes 181.14 ms from over 1000 ms on my iGPUs.	2024-11-04 15:05:21 -08:00


794 commits