onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-29 23:06:41 +00:00

Author	SHA1	Message	Date
Satya Kumar Jandhyala	1fb2e71ddc	[JS/WebGPU] Avoid producing presentKey/presentValue outputs if pastKey/pastValue … (#21782 ) Avoid producing presentKey/presentValue outputs if pastKey/pastValue don't exist.	2024-08-19 18:02:19 -07:00
Wanming Lin	7ae0b4ce64	[WebNN EP] Support Erf and Trilu for CPU backend (#21768 )	2024-08-19 07:56:16 -07:00
xhcao	417aa00406	[js/webgpu] fix conv1d error (#21585 )	2024-08-18 15:45:13 -07:00
Jiajia Qin	c4ade796d6	[js/webgpu] Fix attention shader recompilation issue (#21770 ) ### Description This PR fixes the `AttentionProbsSoftmax` recompilation issue when executing the phi3 model. The fix further improves phi3 performance.	2024-08-17 17:15:15 -07:00
Yang Gu	49fc168eed	[js/webgpu] Handle negative axis in op Split (#21771 ) This is to fix issue #21703, where the axis is a negative value in the model. According to the spec (https://onnx.ai/onnx/operators/onnx__Split.html), negative axis means counting dimensions from the back.	2024-08-17 16:41:23 -07:00
Tianlei Wu	d79e3c5791	Extend Attention Bias Broadcast Support (#21710 ) ### Description Previously, MultiHeadAttention supported relative position bias of shape [1, N, S, T] or [B, N, S, T], and DecoderMaskedMultiHeadAttention supported [1, N, S, T]. This PR extends the support to allow [1, N, S, T], [B, N, S, T], [B, 1, S, T] and [1, 1, S, T] for CUDA and CPU EPs. - [x] Rename the input of "relative position bias" to "attention bias" because it can also be used for other types of bias, like ALiBi (Attention with Linear Biases) or attention mask. - [x] Update unfused kernel to support broadcasting 2nd dimension of attention bias. - [x] Update efficient attention to support broadcasting 2nd dimension of attention bias. - [x] Update operators (MultiHeadAttention, DecoderMaskedMultiHeadAttention, Attention, PackedAttention, PackedMultiHeadAttention) to support broadcast attention bias on CUDA and CPU EPs. - [x] Update ROCm, DML and WebGPU naming to be consistent. (Note that those EPs do not support broadcasting attention_bias for now). - [x] Add attention bias tests for MultiHeadAttention. - [x] Update operator documents - [x] Update benchmark script Other changes: * Fix some checks in multihead-attention.ts * Add helper functions to dump tensors given dimensions.	2024-08-16 15:40:04 -07:00
Yulong Wang	ef2ccc477b	[js/web] Add support for int4/uint4 tensor (#21720 ) ### Description Add support for int4/uint4 tensor.	2024-08-15 21:32:10 -07:00
Yulong Wang	d4d0bea1fb	[js] update docs for new code formatter (#21743 ) ### Description Update README.md for code formatter change (#21728)	2024-08-15 20:17:08 -07:00
Yang Gu	f8efc086ce	[js/webgpu] Support Chrome Canary in unit tests (#21750 ) Chrome Canary is helpful for testing new features. With this PR, we can enable Chrome Canary in unit tests with a command like "npm test -- op abs.jsonc -b=webgpu -e=chromecanary".	2024-08-15 19:27:54 -07:00
Yulong Wang	abdc31de40	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728 ) ### Description See `454996d496` for manual changes (excluding auto-generated formatting changes). ### Why Because the toolset for the old clang-format is out of date, which reduces development efficiency. - The NPM package `clang-format` is already in maintenance mode and has not been updated for 2 years. - The VSCode extension for clang-format has not been maintained for a while, and a recent Node.js security update made it stop working entirely on Windows. No one in the community seems interested in fixing those. Prettier was chosen because it is the most popular TS/JS formatter. ### How to merge It's easy to break the build: - Be careful of any new commits on main not included in this PR. - Be careful that after this PR is merged, other PRs that already passed CI can still merge. So, make sure there are no new commits before merging this one, and invalidate JS PRs that already passed CI to force them to merge with the latest changes.	2024-08-14 16:51:22 -07:00
Guenther Schmuelling	d82f15d0e3	add Gelu opset-20 to webgpu (#21725 ) https://github.com/microsoft/onnxruntime/issues/21618	2024-08-14 09:45:05 -07:00
Xu Xing	7172aff1cf	[js/webgpu] Fix max pool shape end with 0 (#21698 ) Bug: https://github.com/microsoft/onnxruntime/issues/21386	2024-08-13 20:59:24 -07:00
Scott McKay	6af5394bd7	Replace usage of jcenter in React Native build.gradle files (#21714 ) ### Description Replace jcenter. It's deprecated and not responding. ### Motivation and Context Fix CIs	2024-08-13 11:10:51 -07:00
xhcao	9c6ee89fa7	[js/webgpu] fix two errors of attention operator (#21687 ) Fix two issues: (1) scale shall be fp32 instead of f16 (2) Softmax program does not handle the normalized dispatch group values, so if the sequence length is over 65535, the result is not correct for this program.	2024-08-13 09:42:34 -07:00
Satya Kumar Jandhyala	51b2044120	[JS/WebGPU] Add Dequantizelinear operator (#21642 ) ### Description Added DequantizeLinear operator for JSEP.	2024-08-09 14:44:19 -07:00
Yulong Wang	e6e4047a77	[js/web] update the build script for webgpu to enable model dump by default (#19707 ) ### Description Update the build script for webgpu to enable model dump by default. Now, when using build_jsep.bat to build debug, model dump is enabled. Setting [`optimizedModelFilePath`](https://onnxruntime.ai/docs/api/js/interfaces/InferenceSession.SessionOptions.html#optimizedModelFilePath) in the session options dumps the optimized model in the browser. ### Motivation and Context Helps to debug or rule out problems that may be related to the model optimizer.	2024-08-09 05:55:34 -07:00
Yulong Wang	5e66fcc703	[js/web] allow op test to use f16 type for inputs/outputs (#21664 ) ### Description allow op test to use f16 type for inputs/outputs. This PR introduces "@petamoriken/float16" as Float16Array polyfill but restricts it to be only used for test runner.	2024-08-08 09:56:37 -07:00
Prathik Rao	134f47743e	bumps up version in main from 1.19 -> 1.20 (#21588 ) Bump up version in main from 1.19.0 to 1.20.0 since the release branch has been cut.	2024-08-05 15:46:04 -07:00
Wanming Lin	8c641d7182	[WebNN EP] Support Dropout op (#21586 ) ### Description WebNN only supports test mode, so other inputs and attributes related to training mode are ignored; the Dropout op is implemented directly with WebNN's identity op.	2024-08-02 16:25:04 -07:00
Wanming Lin	1d4b161145	[WebNN EP] Support ConvTranspose for TFLite backend (#21291 ) ### Description Chromium supports ConvTranspose for TFLite as of https://chromium-review.googlesource.com/c/chromium/src/+/5635194 with the constraint that only default dilations and groups are supported. --------- Co-authored-by: Dwayne Robinson <fdwr@hotmail.com>	2024-07-30 17:46:08 -07:00
Yulong Wang	b03c9496aa	[js/web] allow load WebAssembly binary from buffer (#21534 ) ### Description This PR adds a new option `ort.env.wasm.wasmBinary`, which allows users to set a buffer containing preloaded .wasm file content. This PR should resolve the problem from the latest discussion in #20876.	2024-07-29 13:39:38 -07:00
Xu Xing	0d7cf301a1	[js/webgpu] Add activation Tanh (#21540 ) Bug: https://github.com/microsoft/onnxruntime/issues/21467	2024-07-29 11:05:34 -07:00
Xu Xing	5bc12bf209	[js/webgpu] Add activation for conv3d naive (#21466 )	2024-07-29 08:47:41 -07:00
Yulong Wang	dbff0cd098	[js/node] enable float16 support for Node.js binding (#20581 ) ### Description enable float16 support for Node.js binding. data of float16 tensor uses `Uint16Array`.	2024-07-28 13:03:17 -07:00
Wanming Lin	b6b29309a5	[WebNN EP] Update argMax/argMin to adapt to latest spec (#21452 ) The WebNN spec recently changed the definition of argMax/argMin: - Remove selectLastIndex option, let backends decide to select the last index or not. - Move axes option to axis input	2024-07-25 17:07:01 -07:00
Changming Sun	7af39c6955	Update nodejs's cmake file to fix a file copy issue (#21390 ) Commit `e5f18ba2c1` caused some nightly pipelines to fail. This PR fixes it. The cause is a recent change to our Linux library's SONAME. At runtime, onnxruntime_binding depends on libonnxruntime.so.1 instead of libonnxruntime.so.1.19.0 (with the full version number). Therefore we need to keep the libonnxruntime.so.1 symlink. The packaging script tools/ci_build/github/js/pack-npm-packages.ps1 still needs to be updated. I will address it in another PR.	2024-07-23 11:03:55 -07:00
mindest	5b9369e93c	Fix typos according to reviewdog report. (#21335 ) ### Description Fix typos based on reviewdog report but with some exceptions/corrections.	2024-07-22 13:37:32 -07:00
Yulong Wang	01df8c787d	[js/web] fix vulnerable version of dependencies (#21412 ) ### Description ``` # npm audit report socket.io 3.0.0 - 4.6.2 Severity: high socket.io has an unhandled 'error' event - https://github.com/advisories/GHSA-25hc-qcg6-38wj Depends on vulnerable versions of engine.io fix available via `npm audit fix` node_modules/socket.io ws 8.0.0 - 8.17.0 Severity: high ws affected by a DoS when handling a request with many HTTP headers - https://github.com/advisories/GHSA-3h5v-q93c-6h6q fix available via `npm audit fix` node_modules/ws engine.io 0.7.8 - 0.7.9 || 6.0.0 - 6.5.4 Depends on vulnerable versions of ws node_modules/engine.io socket.io-adapter 2.5.2 - 2.5.4 Depends on vulnerable versions of ws node_modules/socket.io-adapter 4 high severity vulnerabilities ```	2024-07-19 11:11:30 -07:00
Xu Xing	92a8407b39	[js/webgpu] Remove unnecessary initialization of var (#21312 ) This var is already initialized to 0 in tint, so there is no need for an extra loop to do it again: ``` float tint_symbol_52[1][4] = (float[1][4])0; { for(int tint_symbol_53 = 0; (tint_symbol_53 < 1); tint_symbol_53 = (tint_symbol_53 + 1)) { { for(int tint_symbol_54 = 0; (tint_symbol_54 < 4); tint_symbol_54 = (tint_symbol_54 + 1)) { tint_symbol_52[min(uint(tint_symbol_53), 0u)][min(uint(tint_symbol_54), 3u)] = 0.0f; } } } } ```	2024-07-12 12:34:34 -07:00
pengwa	88336ffa92	Fix typos - 1st Wave (#21278 ) ### Description The reviewdog [Optional Lint] actions report many typos (example: https://github.com/microsoft/onnxruntime/actions/runs/9864564489/job/27239732367); this PR fixes some of them. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2024-07-11 13:35:08 +08:00
Pavan Goyal	1b82d835d8	[Fix] InterOpNumThreads Session Option for ONNX ReactNative Package (#21263 ) ### Description This PR resolves a bug related to setting the interOpNumThreads session option when creating an ORTSession. Currently, when the interOpNumThreads option is passed from React Native, the native module incorrectly sets intraOpNumThreads instead of interOpNumThreads. ### Motivation and Context Because of this bug, users of the Onnx React Native package may believe that they are setting interOpNumThreads correctly when they are not, so this change is required. Refer to the screenshot for details: https://github.com/microsoft/onnxruntime/assets/88655321/70a8f216-553a-4f4c-9481-e6871f0e37e6	2024-07-10 07:00:18 -07:00
Enrico Galli	4c3c809bdb	[js/webnn] Enable user-supplied MLContext (#20600 ) ### Description This PR enables the API added in #20816 as well as moving context creation to JS. ### Motivation and Context In order to enable I/O Binding with the upcoming [MLBuffer](https://github.com/webmachinelearning/webnn/issues/542) API in the WebNN specification, we need to share the same `MLContext` across multiple sessions. This is because `MLBuffer`s are restricted to the `MLContext` where they were created. This PR enables developers to use the same `MLContext` across multiple sessions.	2024-07-08 10:19:39 -07:00
Wanming Lin	cd516a1677	[WebNN EP] Remove constraint for conv ops on CPU backend (#21237 ) Currently, the WebNN TFLite backend allows the filter of conv2d/convTranspose2d to be an input. Remove the constraint and apply the necessary transpose/reshape operations to the filter input.	2024-07-08 10:14:43 -07:00
Guenther Schmuelling	9eb1c2a7a3	support for layernorm in webgpu pre opset-17 (#21121 ) handled the same way cpu does	2024-06-27 10:20:48 -07:00
Wanming Lin	41ad83fb00	[WebNN EP] Support rest Reduction ops for TFLite backend (#21135 ) - reduceLogSum, reduceLogSumExp and reduceSumSquare landed in https://chromium-review.googlesource.com/c/chromium/src/+/5575815 - reduceL1 and reduceL2 landed in https://chromium-review.googlesource.com/c/chromium/src/+/5606091	2024-06-25 18:30:55 -07:00
Wanming Lin	4743803944	[WebNN EP] Support more Normalization ops for TFLite backend (#21151 ) The following normalization ops are now supported in Chromium for the TFLite backend: - batchNormalization: https://chromium-review.googlesource.com/c/chromium/src/+/5532745 - layerNormalization: https://chromium-review.googlesource.com/c/chromium/src/+/5573326 - instanceNormalization: https://chromium-review.googlesource.com/c/chromium/src/+/5532750	2024-06-24 19:04:23 -07:00
Wanming Lin	3a917e49fb	[WebNN EP] Support 4 more ops for TFLite backend (#21134 ) The WebNN TFLite backend recently added support for gelu, expand, softsign, and reciprocal.	2024-06-24 09:52:12 -07:00
Wanming Lin	0c80cd2157	[WebNN EP] Update Prelu restriction for CPU backend (#20878 )	2024-06-20 11:04:01 -07:00
Xu Xing	c3076721f3	[js/webgpu] Support conv3d naive (#20706 )	2024-06-19 10:13:50 -07:00
Wanming Lin	40879a2623	[WebNN EP] Enable Cast op for WebNN CPU backend (#20864 ) WebNN TFLite backend supports `cast` op but doesn't support casting to `uint64` data type.	2024-06-19 01:51:19 -07:00
Wanming Lin	35c430a95a	[WebNN EP] Enable several ops for WebNN CPU backend (#20847 ) The WebNN CPU implementation has been migrated from XNNPack to TFLite, which supports more ops. As a first step, turn on the `cpu`-supported ops that only need a change from `false` to `true`.	2024-06-19 01:45:31 -07:00
Yulong Wang	5e81fa8aec	[js] fix vulnerability CVE-2024-4068: upgrade `braces` to 3.0.3 (#21078 ) ### Description Upgrade `braces` to 3.0.3 [CVE-2024-4068](https://github.com/advisories/GHSA-grv7-fg5c-xmjg) ``` # npm audit report braces <3.0.3 Severity: high Uncontrolled resource consumption in braces - https://github.com/advisories/GHSA-grv7-fg5c-xmjg fix available via `npm audit fix` node_modules/braces 1 high severity vulnerability ```	2024-06-18 16:02:08 -07:00
Yulong Wang	631a2c16be	[js/web] skip default locateFile() when dynamic import is disabled (#21073 ) ### Description skip default `locateFile()` when dynamic import is disabled. This allows the file to work with bundlers to load WebAssembly file correctly if `env.wasm.wasmPaths` is not set.	2024-06-18 12:21:45 -07:00
Yang Gu	1473d66a00	[js/webgpu] Prefer adapter.info to adapter.requestAdapterInfo (#21065 ) WebGPU is deprecating async adapter.requestAdapterInfo, and replacing it with sync adapter.info. Spec change: https://github.com/gpuweb/gpuweb/pull/4662	2024-06-18 12:02:38 -07:00
Jian Chen	4e18b0b7ce	Upgrade braces from 3.0.2 to 3.0.3 to fix the vulnerability (#21022 )	2024-06-12 18:02:52 -07:00
Yulong Wang	dd805ff77d	[js/web] ESM: use the bundled target as default export (#20991 ) ### Description ESM: use the bundled target as default export In this change, the default import of the following entries: ``` import from 'onnxruntime-web'; import from 'onnxruntime-web/all'; import from 'onnxruntime-web/webgpu'; ``` will use the "bundled" version, which has no dynamic import. This change should only apply to ESM on web.	2024-06-11 11:14:55 -07:00
Wanming Lin	043ef5c95f	[WebNN EP] Support latest WebNN softmax op (#20827 ) Latest WebNN softmax supports N-D input and axis parameter.	2024-06-11 08:27:14 -07:00
Edward Chen	981893c318	Remove deprecated "mobile" packages (#20941 ) # Description This PR removes the building of the ORT "mobile" packages and much of the associated infrastructure which is no longer needed. Not removed yet - tools/ci_build/github/android/mobile_package.required_operators.config and the helper scripts that depend on it. # Motivation and Context The mobile packages were deprecated in 1.18. Users should use the full packages (Android - onnxruntime-android, iOS - onnxruntime-c/onnxruntime-objc) instead or do a custom build.	2024-06-07 16:20:32 -05:00
Wanming Lin	52874f628a	[WebNN EP] Remove some constraints for CPU backend (#20900 ) Following constraints have been supported by WebNN TFLite backend: - Concat: supports up to 4 inputs - Matmul: supports broadcasting - Resize: supports nearest mode - Split: supports up to 4 outputs	2024-06-06 08:22:41 -07:00
Wanming Lin	da1f8f9274	[WebNN EP] TFLite backend only supports limit ranges for Clip (#20863 )	2024-06-06 08:22:18 -07:00

685 commits
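
Commit 49fc168eed above notes that, per the ONNX Split spec, a negative axis counts dimensions from the back. A minimal sketch of that rule follows. It is not the code from that commit: `normalizeAxis` and `splitShape` are hypothetical helper names introduced here for illustration, and the sketch only computes output shapes, assuming explicit split sizes are given.

```typescript
// Hypothetical helper (not taken from onnxruntime): map an ONNX-style axis,
// which may be negative, to a non-negative index in [0, rank).
function normalizeAxis(axis: number, rank: number): number {
  if (!Number.isInteger(axis) || axis < -rank || axis >= rank) {
    throw new RangeError(`axis ${axis} is out of range for rank ${rank}`);
  }
  // A negative axis counts from the back: -1 is the last dimension.
  return axis < 0 ? axis + rank : axis;
}

// Hypothetical helper: compute the output shapes of a split along `axis`
// for explicitly given split sizes (assumed to sum to the axis length).
function splitShape(
  shape: readonly number[],
  axis: number,
  sizes: readonly number[],
): number[][] {
  const a = normalizeAxis(axis, shape.length);
  const total = sizes.reduce((sum, s) => sum + s, 0);
  if (total !== shape[a]) {
    throw new RangeError(`split sizes sum to ${total}, expected ${shape[a]}`);
  }
  // Each output keeps every dimension except the split axis.
  return sizes.map((size) => shape.map((dim, i) => (i === a ? size : dim)));
}

// Splitting shape [2, 3, 6] along axis -1 (the last dimension) into sizes
// 2 and 4 yields the shapes [2, 3, 2] and [2, 3, 4].
console.log(splitShape([2, 3, 6], -1, [2, 4]));
```

Normalizing the axis once, before any indexing, keeps the rest of the shape logic free of sign checks.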