onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-09 17:28:58 +00:00

Author	SHA1	Message	Date
Yulong Wang	5ff27ef02a	[js/webgpu] support customop FastGelu (#19392 ) ### Description Support WebGPU custom operator FastGelu.	2024-02-06 09:07:31 -08:00
Jiajia Qin	ccbe264a39	[js/webgpu] Add LeakyRelu activation for fusedConv (#19369 ) ### Description This PR 1) adds LeakyRelu activation for fusedConv; 2) makes `vec4<f16>` values work with `float32` uniform attributes. For example, `clamp(value, vec4<f16>(uniforms.clip_min), vec4<f16>(uniforms.clip_max))` throws compilation errors because `uniforms.clip_min` and `uniforms.clip_max` are `f32`, not `f16`. So we need to change it to `clamp(value, vec4<f16>(f16(uniforms.clip_min)), vec4<f16>(f16(uniforms.clip_max)))`. This problem was introduced when activation attributes were changed from constants to uniforms. BTW, after adding LeakyRelu, the `realesrgan-t256` model passes.	2024-02-02 09:06:38 -08:00
Yulong Wang	50806a7dd5	[js/web] support external data in npm test (#19377 ) ### Description Support external data in npm test. This allows the test runner to detect whether external data is available in the test folder and, if it is, to load it as external data automatically. This feature does not parse every model to figure out whether the model has external data. The following code comments explain how the runner decides whether to parse the model file. ```js // for performance consideration, we do not parse every model. when we think it's likely to have external // data, we will parse it. We think it's "likely" when one of the following conditions is met: // 1. any file in the same folder has the similar file name as the model file // (e.g., model file is "model_abc.onnx", and there is a file "model_abc.pb" or "model_abc.onnx.data") // 2. the file size is larger than 1GB ```	2024-02-02 09:05:57 -08:00
Jiajia Qin	efc17e79de	[js/webgpu] Fix the undefined push error (#19366 ) ### Description This PR fixes the error below, which occurs when WebGPU profiling is enabled: ``` TypeError: Cannot read properties of undefined (reading 'push') ```	2024-02-02 02:04:06 -08:00
Xu Xing	3a2ab1963a	[js/webgpu] Refactor createTensorShapeVariables (#18883 )	2024-02-01 17:59:00 -08:00
Yulong Wang	dd1f6ccc45	[js/webgpu] resolve codescan alert (#19343 ) ### Description resolve codescan alert: https://github.com/microsoft/onnxruntime/security/code-scanning/17687	2024-01-30 21:06:21 -08:00
Xu Xing	d73131cf0f	[js/webgpu] Use DataType as uniform cpu type (#19281 ) This saves turning data type to string by tensorDataTypeEnumToString.	2024-01-30 21:05:08 -08:00
Jiajia Qin	85cef0af8c	[js/webgpu] Support capture and replay for jsep (#18989 ) ### Description This PR extends the graph capture capability to the JS EP, similar to #16081. The JS EP does not use CUDA Graph; instead, it records all GPU commands and replays them, which removes most of the CPU overhead and avoids the situation where the GPU waits for the CPU. mobilenetv2-12 drops from 6ms to 3.7ms on NV 3090 and from 4.58ms to 3.38ms on Intel A770. The limitations are similar to those of the CUDA EP: 1. Models with control-flow ops (i.e. If, Loop and Scan ops) are not supported. 2. Usage of graph capture is limited to models where all ops can be partitioned to the JS EP or CPU EP with no memory copy between them. 3. Shapes of inputs/outputs cannot change across inference calls. 4. IOBinding is required. Usage is shown below. Method 1: specify output buffers explicitly.
```
const sessionOptions = {
  executionProviders: [
    {
      name: "webgpu",
    },
  ],
  enableGraphCapture: true,
};
const session = await ort.InferenceSession.create('./models/mobilenetv2-12.onnx', sessionOptions);

// prepare the inputBuffer/outputBuffer
// ... ...

const feeds = {
  'input': ort.Tensor.fromGpuBuffer(inputBuffer, { dataType: 'float32', dims })
};
const fetches = {
  'output': ort.Tensor.fromGpuBuffer(outputBuffer, { dataType: 'float32', dims: [1, 1000] })
};

let results = await session.run(feeds, fetches); // The first run will begin to capture the graph.

// update inputBuffer content
// ... ...
results = await session.run(feeds, fetches); // The 2nd run and after will directly call replay to execute the graph.

// ... ...
session.release();
```
Method 2: don't specify output buffers explicitly. Internally, when graph capture is enabled, all output locations are set to 'gpu-buffer'.
```
const sessionOptions = {
  executionProviders: [
    {
      name: "webgpu",
    },
  ],
  enableGraphCapture: true,
};
const session = await ort.InferenceSession.create('./models/mobilenetv2-12.onnx', sessionOptions);

// prepare the inputBuffer
// ... ...

const feeds = {
  'input': ort.Tensor.fromGpuBuffer(inputBuffer, { dataType: 'float32', dims })
};

let results = await session.run(feeds); // The first run will begin to capture the graph.

// update inputBuffer content
// ... ...
results = await session.run(feeds); // The 2nd run and after will directly call replay to execute the graph.

// ... ...
session.release();
```	2024-01-30 18:28:03 -08:00
Jiajia Qin	90883a366a	[js/webgpu] Add hardSigmoid activation for fusedConv (#19233 ) ### Description Add hardSigmoid activation for fusedConv. It will be used by mobilenetv3-small-100 model.	2024-01-30 16:28:53 -08:00
Xu Xing	624b4e2063	[js/webgpu] Remove enableShapesUniforms (#19279 )	2024-01-29 17:49:06 -08:00
Guenther Schmuelling	9e69606360	fix f16 for attention, enable slice and flatten for more types (#19262 )	2024-01-29 10:13:46 -08:00
Xu Xing	a3f0e2422b	[js/webgpu] Support f16 uniform (#19098 )	2024-01-25 16:58:22 -08:00
Xu Xing	656ca66186	[js/webgpu] Support uniforms for conv, conv transpose, conv grouped (#18753 )	2024-01-25 15:37:05 -08:00
Jiajie Hu	5b06505073	[js/webgpu] Fix Tanh explosion (#19201 ) ### Description ```math \tanh(x)=\frac{e^x-e^{-x}}{e^x+e^{-x}}= \left\{ \begin{array}{cc} -\frac{1-e^{-2\cdot(-x)}}{1+e^{-2\cdot(-x)}}, & x<0 \\ 0, & x=0 \\ \frac{1-e^{-2x}}{1+e^{-2x}}, & x>0 \end{array} \right. ``` ### Motivation and Context On some platforms, $$\tanh(1000)=\frac{e^{1000}-e^{-1000}}{e^{1000}+e^{-1000}}$$ would produce NaN instead of 0.999... or 1 (imagine $e^{1000}=\infty$ and $\frac{\infty}{\infty}$ explodes).	2024-01-25 08:25:35 -08:00
Wanming Lin	7252c6e747	[WebNN EP] Support WebNN async API with Asyncify (#19145 )	2024-01-24 15:37:35 -08:00
Yang Gu	591f90c0b9	[js/webgpu] Fix issue of timestamp query (#19258 ) When we enable webgpu profiling mode between session.create and session.run, the current implementation has a problem creating the querySet (and also the queryResolveBuffer) if we share the commandEncoder with the inputs upload. This PR fixes this by moving the querySet creation to the place where we set queryType.	2024-01-24 14:49:37 -08:00
satyajandhyala	a33b5bd1fa	[JS/WebGPU] Added Uniforms to SkipLayerNorm. (#18788 ) ### Description Added Uniforms to SkipLayerNorm ### Motivation and Context Improve performance --------- Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>	2024-01-25 01:12:21 +05:30
Xu Xing	61610ff986	[js/webgpu] Add FusedConv clip test case (#18900 ) Bug: https://github.com/microsoft/onnxruntime/issues/18899	2024-01-23 08:25:05 -08:00
Jiajia Qin	d226e40856	[js/webgpu] set query type in onRunStart (#19202 ) ### Description `env.webgpu.profiling` is a global flag. It may change before each session.run. So the best place is to update it in `onRunStart` event. After this, we can directly check `this.queryType`'s value. Without this pr, we need to make sure that `getCommandEncoder()` is called before checking `this.queryType`. Otherwise, it may happen that `pendingKernels`'s length is not equal to `pendingDispatchNumber`'s length. See the two ugly workarounds [1)](`e630dbf528 (diff-006fc84d3997f96a29b8033bd2075d6a0a9509211bd5812a6b934fc74fedfd9dR267-R268)`) and [2)](`e630dbf528 (diff-618fe297fbe7a1da586380163b8fd2627311ccc217640a3c5cdc9c17a33472c1R73-R80)`) if we don't introduce `onRunStart`. Or we need to call `setQueryType` in each kernel run.	2024-01-22 16:08:55 -08:00
Jiajia Qin	2e0a388c36	[js/webgpu] Add HardSigmoid support (#19215 ) ### Description This op is required in mobilenetv3-small-100. With this PR, mobilenetv3-small-100 model becomes less than 10 ms from over 100 ms on ADL.	2024-01-22 15:53:26 -08:00
Yulong Wang	d69b622ef4	[js/web] upgrade dependency packages version (#19193 ) ### Description upgrade packages version. ``` # npm audit report electron 23.0.0-alpha.1 - 23.3.13 Severity: moderate ASAR Integrity bypass via filetype confusion in electron - https://github.com/advisories/GHSA-7m48-wc93-9g85 fix available via `npm audit fix --force` Will install electron@28.1.4, which is a breaking change node_modules/electron get-func-name <2.0.1 Severity: high Chaijs/get-func-name vulnerable to ReDoS - https://github.com/advisories/GHSA-4q6p-r6v2-jvc5 fix available via `npm audit fix` node_modules/get-func-name semver <=5.7.1 \|\| 6.0.0 - 6.3.0 \|\| 7.0.0 - 7.5.1 Severity: moderate semver vulnerable to Regular Expression Denial of Service - https://github.com/advisories/GHSA-c2qf-rxjj-qqgw semver vulnerable to Regular Expression Denial of Service - https://github.com/advisories/GHSA-c2qf-rxjj-qqgw semver vulnerable to Regular Expression Denial of Service - https://github.com/advisories/GHSA-c2qf-rxjj-qqgw fix available via `npm audit fix` node_modules/cross-spawn/node_modules/semver node_modules/global-agent/node_modules/semver node_modules/semver ```	2024-01-18 13:45:42 -08:00
Yulong Wang	f87e69801f	[js/web] show warning when numThreads is set but threads is not supported (#19179 ) ### Description show warning when numThreads is set but threads is not supported. Resolves #19148, #18933 for web: when crossOriginIsolated is false. for node: always disable.	2024-01-17 15:04:22 -08:00
Yulong Wang	146ebaf91e	[js/web] allow proxy to load model with 1GB <= size < 2GB (#19178 ) ### Description allow proxy to load model with 1GB <= size < 2GB resolves #19157.	2024-01-17 15:03:43 -08:00
Rachel Guo	bd9d8fb2a5	[ORT 1.17.0 release] Bump up version to 1.18.0 (#19170 ) ### Description Bump up version to 1.18.0 since the release branch has been cut. Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2024-01-17 11:18:32 -08:00
Guenther Schmuelling	9dee543bed	fix gemm beta for fp16 (#19153 ) per onnx spec beta is always fp32 so we need to cast it	2024-01-15 18:40:38 -08:00
Yulong Wang	f917dde717	[web] remove xnnpack from web backends (#19116 ) ### Description XNNPACK is already disabled in web assembly build. This change removes the xnnpack backend registration in JS.	2024-01-13 23:04:02 -08:00
Yang Gu	e803f8eb0f	[js/webgpu] Refactor timestamp-query and introduce timestamp-query-inside-passes (#18894 ) We submit kernels in batches (a fixed size of 16, except for the last batch) for better performance. However, timestamp query is supported at the pass level, so the previous implementation disabled batch execution in profiling mode. In fact, a batch can contain multiple passes, so batch execution does not have to be disabled; this is the first enhancement of this PR. Furthermore, WebGPU has an extension that supports timestamp queries inside passes, which is not supported on all platforms (e.g., Windows supports it, while macOS doesn't). This is expected to cost less than the multiple-passes solution, so this PR also introduces this support when available. This PR also refactors some of the implementation related to kernelInfo and tries to unify the related kernel names.	2024-01-13 00:23:17 -08:00
Yulong Wang	07cfc56538	[js] enable external data loading for ort-web (#19087 ) ### Description Enable external data loading for ort-web. ### Why The ORT external data design depends heavily on the file system, especially synchronous file I/O APIs. Those are not available on web platforms, so extra code is needed to make external data work on the web. ### How Since there is no file system on the web, external data is supported through pre-loaded data. Assuming the model file a.onnx includes initializers that link to ./b.bin, users must pass a full list of data files when creating the session. The user code will look like:
```js
const mySess = await ort.InferenceSession.create('./path/model/a.onnx', {
  // session options
  externalData: [
    {
      // relative or absolute path/URL of the file,
      // or a pre-loaded Uint8Array containing the data of the external data file
      data: './path/data/b.bin',
      // the relative path of the external data. Should match initializers' "location" value defined in the model file
      path: './b.bin'
    },
    // { } if multiple external data file
  ]
});
```
Currently, this feature only works with JSEP build enabled.	2024-01-12 19:24:24 -08:00
Guenther Schmuelling	a756017e9f	[js/webgpu] more fixes for access above 2GB (#19065 ) when jsep calls javascript with an index to HEAP8 or HEAP32 the index is negative when the heap is above 2GB, even if we pass it as uint32_t it remains negative. So in javascript use >>> 0 to make it unsigned.	2024-01-12 17:47:37 -08:00
Guenther Schmuelling	4a5f13b681	fix resize for fp16 (#19110 ) Resize for fp16 has 2 issues: scales are always f32, and roi can be f32 or f16. scales: this is fixed. roi: this is fixed for the case where roi is not passed as an optional input with f16. Fixing the remaining case requires a much larger change, and I did not want to risk that shortly before a release. For all practical purposes, passing roi as an input with f16 should be rare, and we can fix it in the near future.	2024-01-12 13:44:28 -08:00
Caroline Zhu	4dbaa73738	[js/web/training] added end-to-end tests (#18700 ) ## Summary * following inference's [set-up for end-to-end tests](https://github.com/microsoft/onnxruntime/tree/main/js/web/test/e2e), created an end-to-end test runner for training * this test runner copies testdata from the [trainingapi folder](https://github.com/microsoft/onnxruntime/tree/main/onnxruntime/test/testdata/training_api) * then runs two tests (training session with evalModel & optimizer model, and training session with the minimum options), and tests if the ORT-web training package encompasses inference * these tests check * createTrainingSession * runTrainStep * runOptimizerStep if applicable * the parameters methods (getParametersSize, loadParametersBuffer, and getContiguousParameters) ## TL;DR * [`js/web/test/training/e2e/run.js`](https://github.com/microsoft/onnxruntime/compare/main...carzh:onnxruntime:carzh/training-e2e-runner?expand=1#diff-c1359c4d401f9ba69e937814219cefe5fd11b151a6ffd084c641af3c82e8216c) is responsible for setting up and running the end to end tests * [`js/web/test/training/e2e/common.js`](https://github.com/microsoft/onnxruntime/compare/main...carzh:onnxruntime:carzh/training-e2e-runner?expand=1#diff-ee5452491b7b2563d175d13d81d10f2323b12b18589aa4c5798962a8b904a4a8) contains the test function definitions (`testInferenceFunction`, `testTrainingFunctionMin`, `testTrainingFunctionAll`) ## Flow * entrypoint: user runs the following command in the terminal: `npm run test:training:e2e` * [`js/web/package.json`](https://github.com/microsoft/onnxruntime/compare/main...carzh:onnxruntime:carzh/training-e2e-runner?expand=1#diff-79275844e75c3c410bb3a71c7f59b2b633e5a3e975c804ffc47220025084da28) was modified to include an npm script that will run `run.js` which will run the end to end tests * 
[`js/web/test/training/e2e/run.js`](https://github.com/microsoft/onnxruntime/compare/main...carzh:onnxruntime:carzh/training-e2e-runner?expand=1#diff-c1359c4d401f9ba69e937814219cefe5fd11b151a6ffd084c641af3c82e8216c) is responsible for * detecting and installing local tarball packages of ORT-web * copying training data to the `js/web/training/e2e/data` folder * starting two Karma processes. Karma is a test runner framework that simulates testing in the browser. * In this case, the tests happen in Chrome. We can configure the tests to run in Edge and other browsers in the future. * one of these karma processes is self-hosted, meaning it pulls the ORT-web package from local * the other karma process is not self-hosted, meaning it pulls the ORT-web package from another source. In this case, we start an http server that serves the ORT-web binaries. * [`js/web/test/training/e2e/simple-http-server.js`](https://github.com/microsoft/onnxruntime/compare/main...carzh:onnxruntime:carzh/training-e2e-runner?expand=1#diff-f798ab485f3ec26c299fe5b2923574c9e4b090200ba20d490bbf6c183286993c) is responsible for starting the HTTP server and serving the ORT binary files. This code is almost identical to the corresponding code in the inference E2E tests. * [`js/web/test/training/e2e/karma.conf.js`](https://github.com/microsoft/onnxruntime/compare/main...carzh:onnxruntime:carzh/training-e2e-runner?expand=1#diff-436cfe8f670c768a04895bd4a1874a5e033f85e0e2d84941c62ff1f7c30a9f28) Karma configuration file that specifies what happens when a karma process is started. The config specifies Mocha as the testing framework, which will go through all the loaded files and run any tests that exist * [`js/web/test/training/e2e/browser-test-wasm.js`](https://github.com/microsoft/onnxruntime/compare/main...carzh:onnxruntime:carzh/training-e2e-runner?expand=1#diff-13b6155e106dddc7b531ef671186e69b2aadb8a0f4b2f3001db0991567d78221) File that contains the tests that Mocha will pick up on and run. 
* The test functions (such as testInference and testTrainingFunctionAll) are defined in [`js/web/test/training/e2e/common.js`](https://github.com/microsoft/onnxruntime/compare/main...carzh:onnxruntime:carzh/training-e2e-runner?expand=1#diff-ee5452491b7b2563d175d13d81d10f2323b12b18589aa4c5798962a8b904a4a8). ## Notes * I followed the [tests for training core](`b023de0bfc/orttraining/orttraining/test/training_api/core/training_api_tests.cc`) where they randomly generated input for the training session * E2E tests are triggered by running `npm run test:training:e2e` -- suggestions for alternative script names are appreciated!!! ## Motivation and Context - adding training bindings for web	2024-01-12 13:33:33 -08:00
zesongw	3eec1592bd	[WebNN EP] Update WebNN unit test list (#19103 ) Update WebNN test list in suite-test-list.jsonc so all test cases are passed behind WebNN CPU backend on Chrome Stable (Although some cases may fall back to CPU EP). Enable int64 support for WebNN in unit tests.	2024-01-12 10:22:38 -08:00
Jiajie Hu	acba63c36a	[js/webgpu] Change A/sqrt(B) to A*inverseSqrt(B) in normalization ops (#19101 ) ### Description Change `A / sqrt(B)` to `A * inverseSqrt(B)` in BatchNormalization, InstanceNormalization, LayerNormalization and SkipLayerNormalization. ### Motivation and Context For the same reason as the existence of the `inverseSqrt` built-in in WebGPU spec.	2024-01-12 00:08:16 -08:00
dependabot[bot]	5373c0c730	Bump follow-redirects from 1.15.2 to 1.15.4 in /js/web (#19068 ) Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.2 to 1.15.4. <details> <summary>Commits</summary> <ul> <li><a href="`65858205e5`"><code>6585820</code></a> Release version 1.15.4 of the npm package.</li> <li><a href="`7a6567e16d`"><code>7a6567e</code></a> Disallow bracketed hostnames.</li> <li><a href="`05629af696`"><code>05629af</code></a> Prefer native URL instead of deprecated url.parse.</li> <li><a href="`1cba8e85fa`"><code>1cba8e8</code></a> Prefer native URL instead of legacy url.resolve.</li> <li><a href="`72bc2a4229`"><code>72bc2a4</code></a> Simplify _processResponse error handling.</li> <li><a href="`3d42aecdca`"><code>3d42aec</code></a> Add bracket tests.</li> <li><a href="`bcbb096b32`"><code>bcbb096</code></a> Do not directly set Error properties.</li> <li><a href="`192dbe7ce6`"><code>192dbe7</code></a> Release version 1.15.3 of the npm package.</li> <li><a href="`bd8c81e4f3`"><code>bd8c81e</code></a> Fix resource leak on destroy.</li> <li><a href="`9c728c314b`"><code>9c728c3</code></a> Split linting and testing.</li> <li>Additional commits viewable in <a href="https://github.com/follow-redirects/follow-redirects/compare/v1.15.2...v1.15.4">compare view</a></li> </ul> </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-01-11 22:25:50 -08:00
Guenther Schmuelling	d0bac8216d	[js/webgpu] fix bcast in where (#19009 )	2024-01-11 12:13:24 -08:00
Jiajia Qin	a89db01fce	[js/webgpu] disable GroupedConvVectorize path (#19090 ) Disable the createGroupedConvVectorizeProgramInfo path due to bot failures in the two cases below: [webgpu]Conv - conv - vectorize group - B [webgpu]Conv - conv - vectorize group - D	2024-01-11 08:13:14 -08:00
Jiajia Qin	fd6bab4250	[js/webgpu] Provide a vectorized algorithm for GroupedConv (#18884 ) ### Description This PR provides a vectorized algorithm for NHWC GroupedConv to improve performance. The aggregate time of GroupedConv in mobilenetv2-12 becomes ~1ms from ~4ms on Intel Alder Lake machine. About 20% improvement for the whole model.	2024-01-10 16:12:43 -08:00
Xu Xing	ed0f26d3d4	[js/webgpu] Revert parse norm attributes (#19074 ) This resolves the below build errors: ``` lib/wasm/jsep/webgpu/op-resolve-rules.ts:19:23 - error TS2724: '"./ops/instance-norm"' has no exported member named 'parseInstanceNormAttributes'. Did you mean 'InstanceNormAttributes'? 19 import {instanceNorm, parseInstanceNormAttributes} from './ops/instance-norm'; ~~~~~~~~~~~~~~~~~~~~~~~~~~~ lib/wasm/jsep/webgpu/op-resolve-rules.ts:19:23 - error TS6133: 'parseInstanceNormAttributes' is declared but its value is never read. 19 import {instanceNorm, parseInstanceNormAttributes} from './ops/instance-norm'; ~~~~~~~~~~~~~~~~~~~~~~~~~~~ lib/wasm/jsep/webgpu/op-resolve-rules.ts:20:20 - error TS2305: Module '"./ops/layer-norm"' has no exported member 'parseLayerNormAttributes'. 20 import {layerNorm, parseLayerNormAttributes} from './ops/layer-norm'; ~~~~~~~~~~~~~~~~~~~~~~~~ lib/wasm/jsep/webgpu/op-resolve-rules.ts:20:20 - error TS6133: 'parseLayerNormAttributes' is declared but its value is never read. 20 import {layerNorm, parseLayerNormAttributes} from './ops/layer-norm'; ```	2024-01-09 20:58:50 -08:00
Xu Xing	76dfe5347c	[js/webgpu] Support uniforms for instance-norm (#18929 ) Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>	2024-01-09 14:56:00 -08:00
Changming Sun	a2afd92093	Format TS code (#19066 ) ### Description Format code	2024-01-09 13:41:10 -08:00
zesongw	ad6dd0a597	[WebNN] Enable npm unit tests (#18486 ) ### Description - Support more test cases for WebNN EP in suite-test-list.jsonc - Add DISABLE_WEBNN flag in build.ts as preparing for WebNN EP release - Add test option: '--webnn-device-type' in test-runner-args-cli.ts to support running WebNN 'gpu' deviceType - Use Chrome Stable as default browser for WebNN testing to unblock the CI limitation.	2024-01-09 10:10:57 -08:00
Xu Xing	557ac74c05	[js/webgpu] Support gemm uniforms (#19056 )	2024-01-09 09:57:06 -08:00
Xu Xing	42ba2aed54	[js/webgpu] Support pad uniforms (#19057 )	2024-01-09 09:34:56 -08:00
Xu Xing	eb92681bfb	[js/webgpu] Support range uniforms (#19055 )	2024-01-09 09:33:57 -08:00
Xu Xing	dee6a5b371	[js/webgpu] Support uniforms for attention and multihead attention (#18903 )	2024-01-09 07:46:30 -08:00
Xu Xing	8f024b7394	[js/webgpu] Support uniforms for layer-norm (#18755 )	2024-01-08 18:16:25 -08:00
Jiajie Hu	447a3a7c70	[js/webgpu] Fix Expand/Gather when input type is bool (#18999 ) ### Description Also update the op test suite. ### Motivation and Context Previously the total size in case `Expand - last dim is not divisible by 4` was a multiple of 4, even though the last dimension was not, so the bug has never been caught.	2024-01-05 08:16:15 -08:00
Yulong Wang	b18abaaa2c	[js/web] wait for threadpool initialization (#18952 ) ### Description a replacement of #18683. try to resolve #18689. By specifying "-s PTHREAD_POOL_SIZE" flag in emscripten, it forces the threadpool to initialize before the webassembly instance is available.	2024-01-04 08:06:55 -08:00
xhcao	867b9d8f04	[js/webgpu] Fix f16 errors for ConvTranspose2D (#18986 )	2024-01-04 08:06:01 -08:00
Jiajie Hu	3b8b9147fa	[js/webgpu] Mitigate floating point accuracy issue in Resize (#18956 ) ### Description The patch fixes a floating point accuracy issue in Resize by preferring integer indices and integer arithmetic where possible. ### Motivation and Context Model test `test_resize_upsample_sizes_nearest_floor_align_corners` was observed to be failing on certain platforms. The root cause is the inaccurate floating point evaluation of 21 / 7 (2.999... vs 3), which results in the wrong input element to be indexed (floor(2.999...) vs floor(3)).	2024-01-03 14:15:26 -08:00
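
Two of the numeric fixes in the log above (the Tanh overflow in #19201 and the `>>> 0` conversion for heap indices above 2GB in #19065) can be illustrated with a minimal JavaScript sketch. This is not the code from those commits: the function names `stableTanh`, `naiveTanh` and `toUnsigned32` are hypothetical and exist only for illustration, and the actual Tanh fix lives in shader code rather than in JavaScript.

```javascript
// Hypothetical helper: tanh computed with the piecewise form quoted in the
// Tanh commit message. For x > 0 it uses (1 - e^{-2x}) / (1 + e^{-2x}) and
// mirrors the sign for x < 0, so the exponent is never positive.
function stableTanh(x) {
  if (x === 0) return 0;
  const e = Math.exp(-2 * Math.abs(x)); // in [0, 1], cannot overflow
  return Math.sign(x) * (1 - e) / (1 + e);
}

// Hypothetical helper: the textbook form, for comparison only. For large x,
// e^x overflows to Infinity and Infinity / Infinity evaluates to NaN.
function naiveTanh(x) {
  const a = Math.exp(x);
  const b = Math.exp(-x);
  return (a - b) / (a + b);
}

// Hypothetical helper: reinterpret a signed 32-bit value as unsigned, which
// is what `>>> 0` does. An index at or above 2GB arrives as a negative
// number when treated as a signed 32-bit integer.
function toUnsigned32(index) {
  return index >>> 0;
}

console.log(naiveTanh(1000));   // NaN in JavaScript double arithmetic
console.log(stableTanh(1000));  // 1
console.log(toUnsigned32(-2147483648)); // 2147483648, i.e. exactly 2GB
```

The same idea applies to any expression of the form (a - b) / (a + b) where one term can overflow: dividing through by the larger term keeps every intermediate value bounded.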


417 commits