onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-18 21:21:17 +00:00

Author	SHA1	Message	Date
Yi Zhang	8e8840f1de	Enable Web CI on Linux (#16419 ) ### Description 1. Enable Web CI on Linux ### Motivation and Context 1. Speed up Web CI: the duration can be reduced from 160 minutes to 130 minutes, a time saving of about 20%. The total computation time is 455 minutes now; moved to Linux, it could be reduced to 336 minutes. 2. It's the first step to enable compilation cache for emscripten 3. per Yulong's request, build_web stages are still using windows pool ![image](https://github.com/microsoft/onnxruntime/assets/16190118/c9496408-74bd-45ea-b4ae-a4dd2a574d17) https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1038382&view=results	2023-06-22 15:42:58 +08:00
Yulong Wang	de476c8075	[js/web] update webgl context creating (#16436 ) ### Description Modify the creation of the WebGL context. Previous behavior: STEP.1 - create canvas (document.createElement), if failed, goto step.2 else step.3 STEP.2 - create offscreenCanvas, if failed abort STEP.3 - use the canvas created in step.1 or 2 to create webgl context. if successful return context else abort New behavior: STEP.1 create offscreenCanvas, if failed goto step.3 STEP.2 use it to create webgl context. if successful, return context STEP.3 create canvas (document.createElement). if failed, abort STEP.4 use it to create webgl context. if successful, return context else abort Motivation: we found that in some environments, normalCanvas.getContext() returns null but offscreenCanvas.getContext() returns the context object. When offscreenCanvas is available, it is a good idea to always prefer it.	2023-06-21 17:10:26 -07:00
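The new fallback order in #16436 can be sketched roughly as below. This is a minimal illustration, not the actual onnxruntime-web code: the helper name `createWebGLContextWithFallback`, the factory-list parameter, and the `'webgl2'` context id are all assumptions made for the example.

```js
// Hypothetical helper, not the actual onnxruntime-web implementation.
// Tries each canvas factory in order and returns the first WebGL context obtained.
function createWebGLContextWithFallback(canvasFactories, contextId = 'webgl2') {
  for (const createCanvas of canvasFactories) {
    let canvas;
    try {
      canvas = createCanvas();
    } catch {
      continue; // this canvas type is unavailable, try the next one
    }
    const context = canvas ? canvas.getContext(contextId) : null;
    if (context) {
      return context;
    }
  }
  // Every candidate failed: abort.
  throw new Error('failed to create a WebGL context');
}

// Browser usage: prefer OffscreenCanvas, then fall back to a DOM canvas.
// The factories are only invoked inside the helper, so defining them is safe anywhere.
const browserCanvasFactories = [
  () => new OffscreenCanvas(1, 1),
  () => document.createElement('canvas'),
];
```

Passing the factories as a list keeps the preference order in one place and lets the helper be exercised with stub canvases outside a browser.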
Yulong Wang	da532f3f5a	[js/webgpu] fix GPU to GPU memcpy (#16393 ) ### Description Fixes a GPU to GPU memory copy bug which causes #16267	2023-06-21 15:50:08 -07:00
Yulong Wang	b8917ad84f	[js/web] fix nodejs detection (#16400 ) ### Description We used to use `typeof fetch === 'undefined'` as the condition to detect whether the environment is Node.js. Before Node.js v18, this worked. However, Node.js v18 introduced the `fetch` function, so this check does not work any more. This PR changes the condition to check whether `process`, `process.versions` and `process.versions.node` exist. Checking whether `process` exists is not enough, because in some configurations webpack may polyfill Node.js's `process`.	2023-06-20 00:20:58 -07:00
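The detection condition described in #16400 can be sketched as follows. The helper name `looksLikeNode` and its `proc` parameter are hypothetical, introduced only so the check can be shown (and tried) against polyfilled objects; the real code may be structured differently.

```js
// Hypothetical helper: `proc` is whatever the global `process` resolves to,
// which may be the real Node.js object, a webpack polyfill, or undefined.
function looksLikeNode(proc) {
  // A polyfilled `process` typically lacks `versions.node`, so all three must exist.
  return Boolean(proc && proc.versions && proc.versions.node);
}

// `typeof` avoids a ReferenceError in environments where `process` is not defined.
const isNode = looksLikeNode(typeof process !== 'undefined' ? process : undefined);
```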
Guenther Schmuelling	5c0d5768e7	make package.json more robust (#16366 ) "default" should be the last element of `exports`. This fixes "Module not found: Error: Default condition should be last one" when importing the onnxruntime-web package under some conditions.	2023-06-15 14:17:37 -07:00
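The ordering rule from #16366 can be illustrated with a small check. The `exportsEntry` object and its file paths are placeholders, not the real onnxruntime-web package layout, and `defaultConditionIsLast` is a hypothetical helper.

```js
// Hypothetical "exports" entry; the paths are placeholders.
const exportsEntry = {
  types: './types.d.ts',
  import: './dist/index.mjs',
  require: './dist/index.cjs',
  default: './dist/index.js', // must stay last
};

// Returns true when "default" is absent or is the final condition key.
// Object.keys preserves insertion order for string keys, which mirrors the order in package.json.
function defaultConditionIsLast(conditions) {
  const keys = Object.keys(conditions);
  const index = keys.indexOf('default');
  return index === -1 || index === keys.length - 1;
}
```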
Yulong Wang	4f7900b553	[js/web] enable ONNX Runtime Web error messages in JS (#16335 ) ### Description Enable passing error messages from C++ to JavaScript so that when an ORT Web API call fails it generates more verbose errors.	2023-06-15 09:45:41 -07:00
Yulong Wang	e3e4926d00	[js/common] allow import onnxruntime-common as ESM and CJS (#15772 ) ### Description allow import onnxruntime-common as ESM and CJS.	2023-06-12 12:05:11 -07:00
satyajandhyala	889f80082f	[js/web] Added Reduce operators support (#16122 ) ### Description Added support for ReduceL1, ReduceL2, ReduceMean, ReduceMin, ReduceMax, ReduceSum, ReduceLogSum, ReduceLogSumExp, ReduceProd and ReduceSquareSum. --------- Co-authored-by: Satya Jandhyala <sajandhy@microsoft.com> Co-authored-by: guschmue <guschmue@microsoft.com>	2023-06-12 07:46:27 -07:00
Yulong Wang	59f42cccb8	[js/common] refactor tensor type in onnxruntime-common (#15843 ) ### Description refactor tensor type in onnxruntime-common. ### Motivation and Context The major motivation is that I am doing a local change to address the API part of #15312, and I am refactoring onnxruntime-common anyway (#15772). The `tensor.ts` and `tensor-impl.ts` files are too large, so I split their contents into multiple files to make the type declarations clearer. The original target of this change was the API only (i.e. do not refactor any implementation). However, there were a few type/implementation inconsistencies, so I also made minimal changes to fix them. ### Changes - extract `TensorUtils` for non-template interfaces - extract `TensorFactory` for all overloads of `Tensor.fromImage()` - refactor the options type used for `Tensor.fromImage()` - fix JSDoc comments to make option descriptions consistent with actual type declarations - fix an inconsistency between `options.format` and `options.bitmapFormat`; change all `bitmapFormat` to `format` - extract `ConversionUtils` for `tensor.toDataURL()` and `tensor.toImageData()` - split implementations from `tensor-impl.ts` into multiple files - fix a bug that causes a unit test to fail; add comments for a future fix.	2023-06-09 16:19:29 -07:00
Yulong Wang	f274bbb0c8	[js] add API that allows to get package version (#16207 ) ### Description Add an API for users to get version of current package. example usage: ```js import { env } from 'onnxruntime-node'; console.log(env.versions.node); // output "1.16.0" ``` ```js import { env } from 'onnxruntime-web'; console.log(env.versions.web); // output "1.16.0" console.log(env.versions.common); // output "1.16.0" console.log(env.versions.node); // output "undefined" ``` #16156	2023-06-09 16:18:53 -07:00
Artur	dc1312cfb1	[web] fix: Provide typings for exports (#16249 ) ### Description Adds typings to be compatible with `moduleResolution: bundler` ### Motivation and Context Fixes #16242	2023-06-07 14:52:36 -07:00
Wanming Lin	a8c2f24ae0	[WebNN EP] Merge support for segment anything into main branch (#16208 ) We implemented a number of new ops and data types to support running the segment anything model on the Chromium WebNN DML backend (POC) in a forked branch https://github.com/honry/onnxruntime/tree/stable-diffusion In this PR, we migrate the changes in the forked branch to the main branch, including: - 22 new ops - New tensor data types: bool, int32, uint32, uint64, int64, float16 (as JavaScript hasn't shipped Float16Array, we use Uint16Array as a workaround) - Handle empty input tensors and duplicated outputs - Fixed some nits	2023-06-07 09:56:37 -07:00
Yulong Wang	319a0dc6aa	[js/doc] allow deduplicate opset version (#16182 ) ### Description allow deduplicate opset version in generated document webgpu-operators.md	2023-06-01 17:28:08 -07:00
Alexander Visheratin	e6c6184fee	[JS/WebGPU] Unsqueeze operator implementation (#16138 ) ### Description This PR adds an implementation of the `Unsqueeze` operator to WebGPU JSEP. The implementation follows the [operator schema](https://github.com/onnx/onnx/blob/main/docs/Operators.md#Unsqueeze). To implement the `Unsqueeze` operator in the same fashion as the `Squeeze`, I added the `ComputeOutputShape()` method to the `UnsqueezeBase` class and made some slight modifications. Please let me know if it is a bad idea and if I should move this method to the JS implementation. I also uncommented test case lines in the `suite-test-list.jsonc` file for both Squeeze and Unsqueeze operators following @hariharans29's [comment](https://github.com/microsoft/onnxruntime/pull/16024#issuecomment-1565113633). ### How was it tested 1. I created a model with only one operator: ```Python import onnx.helper node = onnx.helper.make_node( "Unsqueeze", inputs=["T", "axes"], outputs=["y"], ) graph = onnx.helper.make_graph([node], "test", [onnx.helper.make_tensor_value_info("T", 1, [3, 4, 5]), onnx.helper.make_tensor_value_info("axes", 7, [2])], [onnx.helper.make_tensor_value_info("y", 1, [3, 1, 4, 5, 1])]) onnx.save(onnx.helper.make_model(graph), "unsqueeze.onnx") ``` 2. I compiled the runtime using @fs-eire's [instructions](https://gist.github.com/fs-eire/a55b2c7e10a6864b9602c279b8b75dce). 3. I ran the test models in the browser using this minimal setup: ```HTML <html> <script src=".\dist\ort.webgpu.min.js"></script> <script> async function run() { const session = await ort.InferenceSession.create('unsqueeze.onnx', {executionProviders: ['webgpu']}); console.log(session); const input = new ort.Tensor('float32', new Float32Array(60), [3, 4, 5]); const dim = new ort.Tensor('int64', [1n, 4n], [2]); const output = await session.run({ "T": input, "axes": dim }); console.log(output); } run(); </script> </html> ``` ### Motivation and Context Improve operator coverage for WebGPU JSEP.	2023-06-01 12:23:02 -07:00
Yulong Wang	f67f7c0f0b	[js/web] disable node fallback in webpack (#16166 ) ### Description Disable webpack's polyfill for node's `global`, `__filename` and `__dirname` in the web build. These polyfills confuse the environment detection generated by emscripten. see https://webpack.js.org/configuration/node/	2023-05-31 16:47:00 -07:00
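A minimal sketch of what disabling these polyfills looks like in a webpack 5 configuration, using the `node` option documented at the URL above. The variable name `webBuildConfig` and the `target` value are assumptions for illustration; the actual onnxruntime-web config contains many more fields.

```js
// Sketch of the relevant part of a webpack 5 configuration (other fields omitted).
// In a real project this object would be exported from webpack.config.js.
const webBuildConfig = {
  target: 'web', // assumed for the example
  node: {
    global: false,     // do not polyfill Node's `global`
    __filename: false, // leave `__filename` untouched
    __dirname: false,  // leave `__dirname` untouched
  },
};
```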
dependabot[bot]	03216e2313	Bump socket.io-parser from 4.2.2 to 4.2.3 in /js/web (#16068 )	2023-05-31 02:15:23 +00:00
Yulong Wang	ebe715a817	[js/webgpu] fix RangeError in buffer download (#16165 ) ### Description This is a follow-up fix for #15990, which should resolve the RangeError issue.	2023-05-30 15:04:50 -07:00
Xavier Dupré	e726151b5c	Introduce float 8 types (#14731 ) ### Description The PR implements FloatE4M3FN, FloatE5M2, FloatE4M3FNUZ, FloatE5M2FNUZ as described in PR https://github.com/onnx/onnx/pull/4805. It uses the CUDA API to cast float/half to float8 if CUDA>=11.8, and a custom implementation if CUDA<11.8. * It implements Cast, QuantizeLinear, DequantizeLinear for all types on CPU, and only for types FloatE4M3FN, FloatE5M2 on CUDA. * It extends the supported types for control flow operators, Shape, Reshape, Identity, If, Loop, Scan * It implements Equal(19). * Cast, QuantizeLinear, DequantizeLinear operators now support a parameter `saturate` that is only valid for float 8 types. It is true by default. In that case, any value out of range is converted into the maximum float 8 value. If false, it is infinite. * QuantizeLinear, DequantizeLinear now support multiple scales on CUDA (and ROCm by extension), scale = 1D tensor with one scale per channel ### Motivation and Context Supports latest onnx version. Fixes [AB#15395](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15395) --------- Co-authored-by: Xavier Dupre <xadupre@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Randy Shuai <rashuai@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>	2023-05-30 13:25:58 -07:00
Alexander Visheratin	415c26e46e	[JS/WebGPU] Squeeze operator implementation (#16024 ) ### Description This PR adds an implementation of the `Squeeze` operator to WebGPU JSEP. The implementation follows the [operator schema](https://github.com/onnx/onnx/blob/main/docs/Operators.md#Squeeze) and allows one or two inputs. ### How was it tested 1. I created two models. Without `axes`: ```Python import onnx.helper node = onnx.helper.make_node( "Squeeze", inputs=["T"], outputs=["y"], ) graph = onnx.helper.make_graph([node], "test", [onnx.helper.make_tensor_value_info("T", 1, [3, 1, 4, 5])], [onnx.helper.make_tensor_value_info("y", 1, [3, 4, 5])]) onnx.save(onnx.helper.make_model(graph), "squeeze.onnx") ``` And with `axes`: ```Python import onnx.helper node = onnx.helper.make_node( "Squeeze", inputs=["T", "axes"], outputs=["y"], ) graph = onnx.helper.make_graph([node], "test", [onnx.helper.make_tensor_value_info("T", 1, [3, 1, 4, 5]), onnx.helper.make_tensor_value_info("axes", 7, [1])], [onnx.helper.make_tensor_value_info("y", 1, [3, 4, 5])]) onnx.save(onnx.helper.make_model(graph), "squeeze-dim.onnx") ``` 2. I compiled the runtime using @fs-eire's [instructions](https://gist.github.com/fs-eire/a55b2c7e10a6864b9602c279b8b75dce). 3. I ran the test models in the browser using this minimal setup: ```HTML <html> <script src=".\dist\ort.webgpu.min.js"></script> <script> async function run() { const session = await ort.InferenceSession.create('squeeze-dim.onnx', {executionProviders: ['webgpu']}); console.log(session); const input = new ort.Tensor('float32', new Float32Array(60), [3, 1, 4, 5]); const dim = new ort.Tensor('int64', [-3n], [1]); const output = await session.run({ "T": input, "axes": dim }); console.log(output); } run(); </script> </html> ``` ### Motivation and Context Improve operator coverage for WebGPU JSEP.	2023-05-26 15:53:05 -07:00
Yulong Wang	e9e6bedf37	[js/webgpu] generate operator table for webgpu (#15954 ) ### Description [js/webgpu] generate operator table for webgpu	2023-05-20 12:20:41 -07:00
Yulong Wang	18f17c555d	[js/webgpu] fix buffer size when download (#15990 ) ### Description fix buffer size when download. buffer size should always be padded to multiple of 4. resolved issue described in #15796 > ![Image](https://user-images.githubusercontent.com/26504141/239093785-9417dffc-6f00-47b2-956d-402b43bdb0a9.png)	2023-05-20 00:21:18 -07:00
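The padding rule in #15990 (WebGPU buffer copies work in multiples of 4 bytes) can be sketched as below. Both helper names are hypothetical and the code is an illustration of the arithmetic only, not the actual download path.

```js
// Hypothetical helper: rounds a byte size up to the next multiple of 4.
function alignTo4(sizeInBytes) {
  return Math.ceil(sizeInBytes / 4) * 4;
}

// Hypothetical helper: after downloading a padded buffer, keep only the bytes
// that belong to the tensor and drop the padding.
function trimPadding(paddedArrayBuffer, originalSizeInBytes) {
  return paddedArrayBuffer.slice(0, originalSizeInBytes);
}
```

For a 5-byte tensor, the staging buffer would be allocated with `alignTo4(5)` bytes and the result trimmed back to 5 bytes after the copy.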
Yulong Wang	04ea561fc8	[js/webgpu] throw error when WebGPU=ON and SIMD=OFF (#15924 ) ### Description throw error when WebGPU=ON and SIMD=OFF	2023-05-16 11:05:56 -07:00
Yulong Wang	22a9a1a630	[js/webgpu] only register webgpu backend when it's available (#15922 ) ### Description only register webgpu backend when it's available	2023-05-15 18:09:31 -07:00
Yulong Wang	204111a79e	[js/webgpu] support proxy for webgpu (#15851 ) ### Description [js/webgpu] support proxy for webgpu. fixes #15832	2023-05-15 16:23:13 -07:00
Yulong Wang	f3b8130d1a	[js/web] support npm run pull:wasm [buildID] (#15877 ) ### Description support `npm run pull:wasm [buildID]` remove `npm run pull:wasm:debug` as it can be simply replaced with `npm run pull:wasm debug`.	2023-05-15 16:19:34 -07:00
Yulong Wang	9328a0f955	[js/webgpu] run test on chrome instead of chrome canary for webgpu (#15902 ) ### Description WebGPU is released in Chrome v113, so the test CLI no longer needs to use Chrome Canary.	2023-05-12 15:47:59 -07:00
liqun Fu	ac9ae9f7c5	update onnx release 1.14 for docker files (#15680 ) ### Description this is for ort 1.15 release to work with onnx 1.14 It shall be merged after onnx 1.14 release and before ort 1.15 release. ### Motivation and Context --------- Signed-off-by: Liqun Fu <liqfu@microsoft.com>	2023-05-10 13:15:56 -07:00
Yulong Wang	02d94bcc8e	[js/web] fix terser reserved symbols for worker (#15864 ) ### Description Due to the change from `3935cdcc57`, our minimizer needs to be updated to add "startWorker" to the reserved symbols.	2023-05-09 11:11:26 -07:00
Yulong Wang	357e6289be	[wasm] allow pull debug artifacts from script (#15859 ) ### Description allow pull debug artifacts from script `npm run pull:wasm` - to pull release artifacts `npm run pull:wasm:debug` - to pull debug artifacts	2023-05-09 11:00:08 -07:00
Wanming Lin	00b1e79e04	Support WebNN EP (#15698 ) Description: This PR intends to enable WebNN EP in ONNX Runtime Web. It translates the ONNX nodes by [WebNN API](https://webmachinelearning.github.io/webnn/), which is implemented in C++ and uses Emscripten [Embind API](https://emscripten.org/docs/porting/connecting_cpp_and_javascript/embind.html#). It temporarily uses the preferred layout NHWC for WebNN graph partitions because of a restriction in the WebNN XNNPack backend implementation and the ongoing [discussion](https://github.com/webmachinelearning/webnn/issues/324) in the WebNN spec about whether WebNN should support both 'NHWC' and 'NCHW' layouts. No WebNN native EP, only for Web. Motivation and Context: Allow ONNXRuntime Web developers to access WebNN API to benefit from hardware acceleration. WebNN API Implementation Status in Chromium: - Tracked in Chromium issue: [#1273291](https://bugs.chromium.org/p/chromium/issues/detail?id=1273291) - CPU device: based on the XNNPack backend, and has been available on Chrome Canary M112 behind the "#enable-experimental-web-platform-features" flag for Windows and Linux platforms. Further implementation for more ops is ongoing. - GPU device: based on DML, implementation is ongoing. Open: - GitHub CI: WebNN is currently only available on Chrome Canary/Dev with the XNNPack backend for Linux and Windows. This is an open question for reviewers: please help identify which GitHub CI should involve the WebNN EP and guide me to enable it. Thanks!	2023-05-08 21:25:10 -07:00
dependabot[bot]	58ee076750	Bump engine.io from 6.4.1 to 6.4.2 in /js/web (#15799 ) Bumps [engine.io](https://github.com/socketio/engine.io) from 6.4.1 to 6.4.2. Release notes (sourced from [engine.io's releases](https://github.com/socketio/engine.io/releases) and [changelog](https://github.com/socketio/engine.io/blob/main/CHANGELOG.md)): 6.4.2 (2023-05-02) ⚠️ This release contains an important security fix ⚠️ A malicious client could send a specially crafted HTTP request, triggering an uncaught exception and killing the Node.js process: `TypeError: Cannot read properties of undefined (reading 'handlesUpgrades') at Server.onWebSocket (build/server.js:515:67)` Please upgrade as soon as possible. Bug Fixes: - include error handling for Express middlewares ([#674](https://redirect.github.com/socketio/engine.io/issues/674)) (`9395782`) - prevent crash when provided with an invalid query param (`fc480b4`) - typings: make clientsCount public ([#675](https://redirect.github.com/socketio/engine.io/issues/675)) (`bd6d471`) - uws: prevent crash when using with middlewares (`8b22162`) Credits: Huge thanks to [@tyilo](https://github.com/tyilo) and [@cieldeville](https://github.com/cieldeville) for helping! Links: - Diff: https://github.com/socketio/engine.io/compare/6.4.1...6.4.2 - Client release: - - ws version: [~8.11.0](https://github.com/websockets/ws/releases/tag/8.11.0) (no change) Commits: - `95e2153` chore(release): 6.4.2 - `fc480b4` fix: prevent crash when provided with an invalid query param - `0141951` refactor(types): ensure compatibility with Express middlewares - `8b22162` fix(uws): prevent crash when using with middlewares - `9395782` fix: include error handling for Express middlewares ([#674](https://redirect.github.com/socketio/engine.io/issues/674)) - `911d0e3` refactor: return HTTP 400 upon invalid request overlap - `bd6d471` fix(typings): make clientsCount public ([#675](https://redirect.github.com/socketio/engine.io/issues/675)) - See full diff in [compare view](https://github.com/socketio/engine.io/compare/6.4.1...6.4.2) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-05-04 10:06:01 -07:00
Yulong Wang	4712009f8a	[js/web] add target ort.webgpu.min.js (#15780 ) ### Description add target ort.webgpu.min.js WebGPU is an experimental feature, so I don't want to put webgpu into the ort.min.js file. This change adds 2 ways for users to access ort-web with webgpu: - using a script tag: by URL `https://cdn.jsdelivr.net/npm/onnxruntime-web@1.15.0/dist/ort.webgpu.min.js` ( this URL is not ready yet ) - using `import`: use `import { Tensor, InferenceSession } from 'onnxruntime-web/webgpu';`, that is, import from 'onnxruntime-web/webgpu' instead of 'onnxruntime-web'	2023-05-04 10:05:39 -07:00
Changming Sun	1fb2f2605b	Update VERSION_NUMBER (#15773 ) ### Description 1. Update VERSION_NUMBER for preparing the upcoming release. This PR's commit will not be included in the 1.15 release branch 2. Delete package/rpm/onnxruntime.spec since it was not used in past years. ### Motivation and Context Preparing the release. Fixed [AB#15311](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15311)	2023-05-03 15:07:34 -07:00
shalvamist	c10a6a9d17	Tensor <--> image - Adding per channel compute for Norm mean & Bias (#14705 ) ### Description Enabled the use of per channel Bias and Mean normalization when converting an image <--> tensor. Added a few bug fixes and updates to the relevant E2E tests. --------- Co-authored-by: shalvamist <shalva.mist@microsoft.com>	2023-05-01 09:37:50 -07:00
Yulong Wang	94c9a31f83	[js/webgpu] fix download failure due to buffer change (#15723 ) ### Description fix download failure due to buffer change. WebAssembly buffer may change (growth triggered by memory allocation) during an async function call.	2023-04-28 00:16:31 -07:00
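The pitfall behind #15723 can be reproduced with plain `WebAssembly.Memory`: growing a (non-shared) memory detaches the old `ArrayBuffer`, so any typed-array view created before the growth becomes unusable. The sketch below is an illustration only; `copyAfterAwait` is a hypothetical helper and not the actual fix.

```js
// One WebAssembly page is 64 KiB.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });

// A view captured before an async call...
const staleView = new Uint8Array(memory.buffer);

// ...becomes detached if the memory grows in the meantime (for example,
// an allocation inside the wasm module while a promise is pending).
memory.grow(1);

// Re-reading `memory.buffer` after the growth yields the new, larger buffer.
const freshView = new Uint8Array(memory.buffer);

// Hypothetical helper: await first, then create the view from the current buffer.
async function copyAfterAwait(wasmMemory, offset, dataPromise) {
  const data = await dataPromise; // memory may grow while this is pending
  new Uint8Array(wasmMemory.buffer, offset, data.byteLength).set(data);
}
```

The design point is simply to avoid holding a view across an `await`; the buffer reference is fetched again once the asynchronous work has finished.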
Yulong Wang	d471432e10	[js/webgpu] fix attribute cache key for 2 operators (#15710 ) ### Description fix attribute cache key for LeakyRelu and ThresholdedRelu	2023-04-27 15:04:33 -07:00
Yulong Wang	c0116af619	[js/webgpu] operator Exp (#15713 ) ### Description operator Exp	2023-04-27 15:04:09 -07:00
Yulong Wang	a02c885f86	[js/webgpu] add implementation of Relu, LeakyRelu and ThresholdedRelu (#15668 ) ### Description add implementation of Relu, LeakyRelu and ThresholdedRelu	2023-04-26 15:11:01 -07:00
Yulong Wang	b98317b907	[js/webgpu] following up for JSEP/WebGPU code cleanup (#15666 ) ### Description This PR resolves a part of non-critical comments from code review comments in #14579. - use `USE_JSEP` instead of `USE_JS` in build definition to make it less ambiguous - remove unused util functions from util.ts - fix transpose.h - other misc fixes	2023-04-25 21:20:03 -07:00
Yulong Wang	d30831d829	[js/webgpu] make `RunFunction` return `void` (#15669 ) ### Description make `RunFunction` return `void`. the return value is meaningless in the OpResolveRule context. Allows any JavaScript error to be caught and returns non-zero return value from `computeKernel()`	2023-04-25 14:14:26 -07:00
Yulong Wang	14cc02c65c	[js/web] WebGPU backend via JSEP (#14579 ) ### Description This change introduces the following new components into ONNX Runtime Web: - JavaScript Execution Provider (JSEP) - Asynchronous inferencing execution powered by Emscripten's Asyncify - WebGPU backend implemented in TypeScript - initial implementation of kernels: - elementwise operators (22) - binary operators (5) - tensor: Shape, Reshape, Transpose, Gemm - nn: Conv, {Global}Maxpool, {Global}AveragePool The code needs to be polished; still working on it. ## Q&A What is JSEP? > JSEP, aka JavaScript Execution Provider, is a new ONNXRuntime execution provider that specifically works in the Web environment (browsers). JSEP allows JavaScript code to kick in from various places when ONNX Runtime inferences a model. Why JSEP? > JSEP is a hybrid mode EP that contains both C/C++ and TypeScript/JavaScript implementation. There are 2 strong reasons why we introduce JSEP: > 1. the C/C++ part helps JSEP to leverage ONNX Runtime's capabilities as much as possible, including graph transformers, optimizers and also the capability to fall back to the CPU EP. TypeScript/JavaScript makes it much easier to develop and debug the kernel implementation in the browser. > 2. the requirement of asynchronous execution from the JavaScript API (eg. `buffer.mapAsync()`) makes it impossible to run `OrtRun()` in a synchronous context (see "async problem" section below). This is resolved by using Emscripten's Asyncify. What is WebGPU? > WebGPU is the new GPU API that is available in the browser. It's one of the only 2 APIs currently available to access the GPU from the browser (the other is WebGL). > WebGPU is designed with more advanced and stronger features compared to WebGL and is potentially the solution that offers the best GPU performance for model inferencing currently available. What is the async problem and why do we have the problem? 
> The "async problem" is that you cannot call an async function in a synchronous context. Think about the following C code: > ```c > // C-style declarations (API) > typedef void (ON_COMPLETE)(PVOID state, DATA data); > void read_data_from_file(FILEHANDLE file, ON_COMPLETE on_complete); > > // implementation > DATA * my_impl_read_data_from_file_sync(FILEHANDLE file) { > // how to implement? > } > ``` > The answer is, it's impossible to implement this function. Usually we try to find a sync version of the API, or launch a thread to call the async function and sync-wait on the main thread. Unfortunately, in the browser environment, neither is possible. > > WebGPU does not offer any synchronous API for data downloading (GPU to CPU). This is the only operation that MUST be async. As `OrtRun()` will eventually call into DataTransfer to copy data from GPU to CPU, and `OrtRun()` is a synchronous function, this cannot be done in the normal way. What is Emscripten? How does the Asyncify feature resolve the problem? > Emscripten is the C/C++ compiler for WebAssembly. It's what we use to compile ORT and generate the WebAssembly artifacts which run in browsers. > > Asyncify is a [compiler feature](https://emscripten.org/docs/porting/asyncify.html) that allows calling async functions from a synchronous context. In short, it generates code to unwind and rewind the call stack to emulate async execution. With this feature, we are able to call the async function inside the `OrtRun()` call. ## Design Overview Inter-op JSEP does pretty much the same thing as any other EP. 
It exposes an interface for inter-op with JavaScript, which is defined in onnxruntime/wasm/js_internal_api.js: ```js // init JSEP Module["jsepInit"] = function (backend, alloc, free, copy, copyAsync, createKernel, releaseKernel, run) { Module.jsepBackend = backend; Module.jsepAlloc = alloc; Module.jsepFree = free; Module.jsepCopy = copy; Module.jsepCopyAsync = copyAsync; Module.jsepCreateKernel = createKernel; Module.jsepReleaseKernel = releaseKernel; Module.jsepRun = run; }; ``` This simple JavaScript snippet defines all the language-barrier-level functions that JSEP requires to implement kernels and data transfers using JavaScript inside ONNX Runtime: - `jsepBackend`: assign the singleton object to the webassembly module - `jsepAlloc` and `jsepFree`: implementation of data transfer's Alloc() and Free() - `jsepCopy`: synchronous copy ( GPU to GPU, CPU to GPU) - `jsepCopyAsync`: asynchronous copy ( GPU to CPU) - `jsepCreateKernel` and `jsepReleaseKernel`: a corresponding object that is maintained in JS to match the lifecycle of the Kernel in ORT - `jsepRun`: OpKernel::Compute() should call into this The abstraction above keeps the connections and dependencies between C/C++ and TypeScript/JavaScript as few as possible. Resource Management The lifecycle of tensor data and kernels is managed by ORT (C/C++) but the implementation is left to JavaScript. JavaScript code is responsible for implementing the callbacks correctly. For WebGPU, the GPU data is managed by JavaScript using a singleton map (tensor_data_id => GPUBuffer). The GPU pipeline is managed as a singleton. Shaders are managed using a singleton map (shader_key => gpu_program), where shader_key is generated from cache_key (OP specific, including attributes) and input shapes. about data transfer `js::DataTransfer::CopyTensor` is implemented to call either the synchronous or the asynchronous copy callback, depending on whether the destination is the GPU or not. 
Emscripten's macro `EM_ASYNC_JS` is used to wrap the async function to be called in the synchronous context. run kernel in JS The Kernel class constructor calls `jsepCreateKernel()` once, with an optional per-kernel specific serialization to pass attributes into JavaScript. `Compute()` is implemented in a way that a metadata serialization is performed in a base class and JavaScript code can access the data using the Emscripten specific builtin macro `EM_ASM_`. disabled features memory pattern is force disabled, because the WebGPU data is not presented by a general memory model (in which a buffer can be represented by offset + size). concurrent run support is disabled. WebGPU is stateful and it also has async function calls. Supporting concurrent run would significantly increase the complexity and we don't get any real benefit from it. prefer channels last JSEP prefers channels last and returns `DataLayout::NHWC` in method `GetPreferredLayout()`. This lets the graph transformers preprocess the graph into a channels last form so that a more optimized WebGPU shader can be used. Testing code It's impossible to test JSEP directly because JSEP itself does not contain any kernel implementation. However, it has the kernel registration, which needs to work together with the corresponding JavaScript code. There are unit tests that run onnx models from the JavaScript API. --------- Co-authored-by: Scott McKay <skottmckay@gmail.com>	2023-04-24 15:21:18 -07:00
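The shader bookkeeping described in #14579 (a singleton map from shader_key to gpu_program, with shader_key derived from cache_key and the input shapes) can be sketched as follows. The function name, the key format and the `buildProgram` callback are assumptions made for this illustration; the actual TypeScript backend may build its keys differently.

```js
// Hypothetical singleton map: shader_key => gpu_program.
const programCache = new Map();

// Hypothetical helper: returns a cached program, building it only on the first request.
// `cacheKey` is operator specific (including attributes); `inputShapes` is an array of dims arrays.
function getOrCreateProgram(cacheKey, inputShapes, buildProgram) {
  // Assumed key format: cache key followed by the shapes, e.g. "Relu|3,4,5;3,4,5".
  const shaderKey = `${cacheKey}|${inputShapes.map((dims) => dims.join(',')).join(';')}`;
  let program = programCache.get(shaderKey);
  if (program === undefined) {
    program = buildProgram(); // e.g. generate the shader and create the pipeline
    programCache.set(shaderKey, program);
  }
  return program;
}
```

With this scheme, running the same operator twice on inputs of identical shapes reuses one program, while a new input shape produces a separate entry.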
Yulong Wang	cb83d2b1a9	[js/web] allow script to use partial success build (#15547 ) ### Description allow script `npm run pull:wasm` to use partial success build.	2023-04-18 17:41:47 -07:00
Yulong Wang	0205b63756	[wasm] optimize default session options parsing (#15428 ) ### Description optimize default session options parsing. - do minimal property assignment to the passed in `options` object. - modify default value of `enableCpuMemArena` and `enableMemPattern` to `false`. We don't get benefits from enabling these 2 flags in web assembly	2023-04-10 11:09:09 -07:00
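One possible reading of "minimal property assignment" in the commit above is that defaults are written only for properties the caller left unset. The sketch below illustrates that reading; `SessionOptionsSketch` and `applyDefaultSessionOptions` are hypothetical names, and only the two flags named in the commit message are modeled.

```typescript
// Hypothetical subset of the session options; only the two flags named above are modeled.
interface SessionOptionsSketch {
  enableCpuMemArena?: boolean;
  enableMemPattern?: boolean;
}

// Assigns a default only when the caller did not provide a value,
// so explicitly set flags are never overwritten.
function applyDefaultSessionOptions(options: SessionOptionsSketch): SessionOptionsSketch {
  if (options.enableCpuMemArena === undefined) {
    options.enableCpuMemArena = false;
  }
  if (options.enableMemPattern === undefined) {
    options.enableMemPattern = false;
  }
  return options;
}
```

With this approach a caller who passes `{ enableMemPattern: true }` keeps that value, while the unset flag falls back to `false`.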
shalvamist	fff75a301c	ORT_Web - JS graph parsing update (#15185 ) ### Description Simplified the JS graph parsing logic - addressing GitHub issue #15006 bug fix	2023-03-31 09:26:55 -07:00
Guenther Schmuelling	4645726d74	fix for webgl lrn (#15236 ) fix issue that resulted in wrong results for lrn on webgpu	2023-03-30 16:16:57 -07:00
Yulong Wang	f972d21e81	[js] upgrade dependencies and enable strict mode (#14930 ) ### Description This PR includes the following changes: - upgrade js dependencies - enable STRICT mode for web assembly build. - corresponding fix for cmake-js upgrade - corresponding fix for linter upgrade - upgrade default typescript compile option of: - `moduleResolution`: from `node` to `node16` - `target`: from `es2017` to `es2020` - fix ESM module import in commonJS source file ## change explanation ### changes to onnxruntime_webassembly.cmake `-s WASM=1` and `-s LLD_REPORT_UNDEFINED` are the default in the latest version and are deprecated. ### changes to onnxruntime_node.cmake The npm package `cmake-js` changed the way it finds the file `node.lib`. Previously it downloaded this file from the Node.js public release channel; now it generates it from a definition file. The Node.js release channel does not contain a windows/arm64 version, so previously cmake-js would fail to download `node.lib` for that platform. This is why we added special handling to download the unofficial binary for the build. This is no longer needed, so we removed it from the cmake file. ### changes to tsconfig.json `node16` module resolution supports async import and `es2020` as target supports top level await.	2023-03-22 15:05:04 -07:00
Christian Veenhuis	59dfcfdce7	Fix typos in sources: operater, tranform, neccessary, trainig (#14907 ) ### Description While browsing the sources I found several typos here and there. I collected them to a single PR and fixed them. Namely these typos are: operater, tranform, neccessary, trainig. After fixing none of them was found anymore: $ git grep "operater" $ git grep "tranform" $ git grep "neccessary" $ git grep "trainig" $ ### Motivation and Context Since some of the typos are in example notebooks and markdown files, users can see them.	2023-03-13 22:45:04 -07:00
Yulong Wang	8844474083	[js] remove 'npm bin' (#14943 ) ### Description 'npm bin' is deprecated in latest version. use 'npx' instead. This PR resolves #14934	2023-03-08 15:03:27 -08:00
Yulong Wang	2d079c6333	[js/web] disable multi-thread test on Node.js in E2E test (#14844 ) ### Description disable multi-thread test on Node.js in E2E test. The multi-thread test on Node.js in the E2E test never worked; however, the CI does not pick up the error every time. So this became a flaky test case which sometimes causes a build break. Disable this test for now; it should be re-enabled once it gets fixed.	2023-02-27 16:01:51 -08:00
Yulong Wang	a631ed77c0	[js/web] support flag 'optimizedModelFilePath' in session options (#14355 ) ### Description * Support flag 'optimizedModelFilePath' in session options. In Node.js, the model will be saved into filesystem just like its behaviour on native platforms. In browser, the new model is not saved to filesystem. the file path is ignored. Instead, a new pop-up window will be launched in browser and user can 'save' the file as onnx model. * Add corresponding commandline args for the following session option flags: - optimizedModelFilePath - graphOptimizationLevel	2023-02-24 15:50:15 -08:00


186 commits