onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-18 21:21:17 +00:00

Author	SHA1	Message	Date
Yulong Wang	fb51faea64	[js/webgpu] fix 2 build breaks introduced in merge (#17273 ) ### Description fix 2 build breaks introduced in merge. Fixes web build	2023-08-23 18:09:50 -07:00
Yulong Wang	8b18d48c7c	[js/webgpu] make IndicesHelper implementation implicit (#17193 ) ### Description This change makes it no longer required to call indicesHelper.impl() in shader code.	2023-08-23 14:41:35 -07:00
Guenther Schmuelling	d3d3dde844	fix webgpu split (#17258 ) fix webgpu split for the case of split_sizes coming from input[1]	2023-08-22 16:49:22 -07:00
Yulong Wang	6fc3fd9ece	[js/webgpu] support Cast operator (#16489 ) ### Description support `Cast` operator for webgpu backend. Cast operator for webgpu backend currently only supports f32, u32, i32 and bool.	2023-08-18 23:51:03 -07:00
xhcao	dd3b2cefd6	[js/webgpu] Support int32 type for binary (#16901 ) ### Description Enable typed binary and support int32 type for binary. Co-authored-by: Xing Xu <xing.xu@intel.com> --------- Co-authored-by: Xing Xu <xing.xu@intel.com>	2023-08-18 12:19:01 -07:00
Hariharan Seshadri	a476dbf430	[JS/WebGPU] Support Tile operator (#17123 ) ### Description As title ### Motivation and Context Improve WebGPU op coverage	2023-08-18 10:07:21 -07:00
satyajandhyala	7d1a5635a0	[JS/Web] Added SkipLayerNormalization operator. (#17102) ### Description Add SkipLayerNormalization operator to JSEP.	2023-08-18 09:59:03 -07:00
Hariharan Seshadri	66df11769c	[JS/WebGPU] Expand operator fixes (#17137 )	2023-08-16 11:24:26 -07:00
satyajandhyala	89b682e3f3	[JS/Web] The bias input is optional, not required, for LayerNormalization operator (#17143) ### Description Fix a typo. LayerNormalization takes 2 or 3 inputs. The third input, bias, is optional.	2023-08-16 10:41:20 -07:00
Yulong Wang	133af1385c	[js/webgpu] update shader cache key to include input tensor datatype (#17176 ) ### Description update shader cache key to include input tensor datatype. and make the key a little bit easier to read	2023-08-16 09:14:19 -07:00
Guenther Schmuelling	8289e8b6ef	[js/webgpu] fix a few shader errors (#17171 ) Fix for segment anything decoder, reduceMax with rank1 and concat.	2023-08-15 21:14:20 -07:00
Arthur Islamov	ccf14e891e	[js/web] JSEP node assignment optimization (#17128) ### Description Since WebGPU supports only float32 and int32, having Gather, Reshape, Shape, Squeeze and Unsqueeze ops with other data types creates additional MemCpy ops and slows down the overall execution, as all other ops with other tensor types will be done on CPU. Before this patch SD Unet had these numbers: Node(s) placed on [CPUExecutionProvider]. Number of nodes: 1141 Node(s) placed on [JsExecutionProvider]. Number of nodes: 4025 memcpy tokens: 2001 After patch: Node(s) placed on [CPUExecutionProvider]. Number of nodes: 1735 Node(s) placed on [JsExecutionProvider]. Number of nodes: 2243 memcpy tokens: 813 It also gives more than 5X performance benefit: from 12sec for one Unet step to 2.2sec on RTX 3090 Ti, so we are almost getting to native performance. UPD: with latest changes from main branch and multi-threading it went down to 1.6sec. Will try re-exporting my model to onnx with maximum optimizations, like using MultiHeadAttention to decrease node count. Maybe after implementing that it can go in less than 1 sec	2023-08-15 18:58:05 -07:00
xhcao	24e0bd37b4	[JS/WebGPU] Support Log operator (#17045)	2023-08-14 18:04:12 -07:00
Yulong Wang	14a8315f10	[js/web] [webgpu] new indices helper (#16957) ### Description This PR introduces the new indices helper. IndicesHelper is a helper class for generating WGSL code for manipulating indices and data for a shader's input or output. This class is designed to offer a unified way to generate WGSL code for manipulating indices and data for a shader's input or output. The following is a list of terminologies used in this class: - `offset`: a uint32 value representing the offset of an element in the data buffer. - `indices`: an abstraction of a multi-dimensional array's indices representing the data's index on each dimension. - `value`: a value of a data element. Users are expected to create an instance of this class for each shader's input or output, and use the instance to generate WGSL code for manipulating indices and data. The following 2 exported functions are for users to call to create an instance of an indices helper: - `inputVariable()`: create an indices helper instance for an input. - `outputVariable()`: create an indices helper instance for an output. An indices helper instance contains helper functions for the following operations: - access readonly basic information, including: `name`(the name of the input or output), `usage`(whether it's an input or an output) and `shape`(the passed in shape). - `type`: access readonly type information, including: `indices`(the type of indices), `value`(the type of value at runtime), `storage`(the type of value at storage) and `tensor`(the tensor type as represented in TensorView). - generate WGSL code for getting indices from offset. Use `offsetToIndices()` for WGSL code snippet to calculate indices from offset, and use `indicesToOffset()` for WGSL code snippet to calculate offset from indices. - to manipulate an instance of indices, use `setIndices()` and `getIndices()` to set and get the indices on an indices variable. - to manipulate data, use `set()`/`get()` to access data at the given indices from parameter list, use `setByIndices()`/`getByIndices()` to access data at the given indices from an indices variable, and use `setByOffset()`/`getByOffset()` to access data at the given offset. - `impl`: get WGSL code of function implementation for the util functions mentioned above. This change applies the new IndicesHelper through the code, though not necessarily in all code.	2023-08-11 11:36:59 -07:00
Zimon Tai	a3e02e8e2a	Fix Resize op input check (#16594) ### Description onnxjs contains a `Resize` op input check which is outdated since opset 9. Currently `Resize` supports up to 4 inputs. This PR loosens the input check. ### Motivation and Context Fixes #15636	2023-08-09 15:42:30 -07:00
Arthur Islamov	c3f04251c7	[js/web] JSEP LayerNormalization and InstanceNormalizations kernels (#16830 ) ### Description Added two kernels for Layer and Instance norm Also added maximum limits for `maxBufferSize` when requesting GPU device as by default it's limited to 256mb and it fails allocating 600mb buffer while running fp32 StableDiffusion weights. ### Motivation and Context These two are used in StableDiffusion and many other networks	2023-08-08 09:09:37 -07:00
Jiajia Qin	9ea0a3129b	[js/webgpu] Make sure only storage buffers are reused (#16893) ### Description This PR makes sure that only storage buffers are reused. Previously, the query buffer might also be taken from the freeBuffers list if it contained a buffer of matching size. The two kinds of buffer have different usages, which results in errors.	2023-08-04 13:40:52 -07:00
satyajandhyala	7ad43d9564	[JS/Web] Fixed ArgMin and ArgMax and refactored (#17002) Fixed ArgMin and ArgMax and refactored using functionality from Reduce operator code. ### Description Removed code/functionality duplication and fixed some issues.	2023-08-04 12:59:36 -07:00
satyajandhyala	cc4b64f646	[JS/Web] Modify Reduce, Expand and Slice to pass op and node tests. (#16979) ### Description Make the CacheHint mechanism, which is designed to avoid running the same test multiple times by saving the result mapped against a key, work by adding input dims.	2023-08-03 15:48:47 -07:00
Arthur Islamov	ea55700e1c	[js/web] JSEP Gather OP (#16855 ) ### Description Added Gather op that works with both i32 and i64 indices, assuming that values fall into i32 limit. The assumption is safe because it's not possible to allocate more than 2gb buffer for inputs. It treats all data from input tensor as u32, copying 1 or 2 elements for i64, u64 and double. --------- Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>	2023-08-03 14:09:37 -07:00
Arthur Islamov	acb9e56164	[js/web] JSEP Expand fix for inputs with rank < 2 (#16829) ### Description If an Expand input has rank < 2, `inputIndicesHelper` and `outputIndicesHelper` create indices as u32 instead of array<u32>, and `calculateInputIndex` throws an error ### Motivation and Context I've encountered this error while making StableDiffusion work with JSEP	2023-08-03 11:38:04 -07:00
Guenther Schmuelling	0df2e14038	js/webgpu: argmax,argmin,softmax support (#16882 ) argmax and argmin are similar to reduce. Eventually we need to add optimized flavors of the shader. softmax is optimized but only works on the last axis for now which should be the common use case. todo: enable more ut for argmax/argmin	2023-08-02 18:16:19 -07:00
Hariharan Seshadri	506ddb3d5d	[js/WebGPU] Support int32 Transpose in WebGPU (#16952 )	2023-08-02 16:27:24 -07:00
Jiajia Qin	fa8487ea3a	[js/webgpu] Check profilingMode in each run (#16897) ### Description This PR moves checking profilingMode to each run instead of the initialization stage. In this way, users can start/stop profiling at any time. Otherwise, profiling only takes effect at the very beginning and can't be stopped.	2023-07-31 17:37:24 -07:00
satyajandhyala	77b2b618b2	[JS/WebGPU] Add Resize operator (#16680) ### Description Implemented Resize operator support in JSEP	2023-07-31 09:35:06 -07:00
satyajandhyala	dd24d52737	[JS/Web] Added Gelu contrib operator support to JSEP (#16909) ### Description Added Gelu operator to JSEP	2023-07-31 09:18:58 -07:00
Yulong Wang	1743e9a615	[js] enable formatter for more file types (#16888 ) ### Description enable formatter for .js/.json/.jsonc/.md files	2023-07-28 15:46:58 -07:00
satyajandhyala	03ce0a5693	[Web/JS] Added Slice operator in JSEP. (#16811) ### Description Added Slice operator support to JSEP.	2023-07-25 14:19:20 -07:00
Jiajia Qin	193415a162	[js/webgpu] reuse buffer for GpuDataManager (#16746) ### Description Allocating a new GPUBuffer in every session.run is not efficient. We should make it only happen in the first run. In the following runs, we should try to reuse those buffers. ### Motivation and Context - This PR is for performance. mobilenetv2 goes from 12.9 ms to 9.58 ms.	2023-07-21 13:13:01 -07:00
Yulong Wang	7dcb805ab8	[js/web] upgrade onnx-proto version (#16722 ) ### Description This change upgrades a lot of dependencies. There are 2 motivations of doing this change: - fix the security issue reported by dependabot (protobufjs Prototype Pollution vulnerability - https://github.com/advisories/GHSA-h755-8qp9-cq85) - resolve the requirement of using ONNX IR_VERSION 9 (#16638) This requires: - upgrade protobufjs to v7.2.4 - upgrade library 'onnx-proto' to consume latest ONNX release (v1.14.0). Problems: - protobufjs v7.2.4 depends on long.js v5, which does not work well with typescript (commonjs). - onnx-proto depends on this fix with a new release of long.js - long.js is in maintenance and it takes longer than expected to put in new changes Solutions: - use a patch script in `preprepare` to copy type declarations to make long.js work with typescript (commonjs) - generate onnx protobuf JS/TS files and put them under js/web/lib/onnxjs/ort-schema/protobuf folder - remove 'onnx-proto' from dependency. - apply fixes to generated onnx.d.ts	2023-07-18 16:36:39 -07:00
Yulong Wang	d1d65978f6	[js/web] fix file size trim for wasm only .min.js (#16681 ) ### Description fix file size trim for wasm only .min.js minimal build `ort.wasm.min.js` and `ort.wasm-core.min.js` should exclude JSEP related source code.	2023-07-13 14:20:51 -07:00
satyajandhyala	d41bbac7b9	[Web/JS] Added Expand operator support. (#16577) ### Description Added Expand operator support.	2023-07-11 09:38:16 -07:00
satyajandhyala	00e8f2a2a9	[Web/JS] Add ConvTranspose support (#16433) ### Description Add ConvTranspose support for WebGPU	2023-07-08 11:10:50 -07:00
Yulong Wang	5c6613875c	[js/web] [JSEP] allow passing data in kernel compute (#16621 ) ### Description allow passing data in OpKernel::Compute() from C++ to JS.	2023-07-07 14:27:30 -07:00
satyajandhyala	e55a20ece8	[Web/JS] Added Split operator support. (#16567) ### Description Added WebGPU/JSEP Split operator support.	2023-07-07 12:16:10 -07:00
satyajandhyala	a7c892106d	[Web/JS] Support WebGPU Concat operator (#16543) ### Description Add Concat operator	2023-07-05 11:59:45 -07:00
Yulong Wang	708dec5d95	[js/webgpu] allow 0 sized tensor for tensor view (#16540 ) ### Description allow 0 sized tensor for tensor view	2023-06-30 12:05:04 -07:00
satyajandhyala	3be6eb53c8	[JS/Web] Fixed the output indexing in the shader code when the output is 1-dim. (#16508) ### Description Modified indexing into outputIndices in the shader code. When the output is 1-dim, outputIndices is not a vector and indexing it results in an error. ### Motivation and Context Fix the problem in the Reduce Ops implementation in WebGPU.	2023-06-30 09:42:38 -07:00
Yulong Wang	de476c8075	[js/web] update webgl context creating (#16436) ### Description Modify the creation of the webgl context. Previous behavior: STEP.1 - create canvas (document.createElement), if failed, goto step.2 else step.3 STEP.2 - create offscreenCanvas, if failed abort STEP.3 - use the canvas created in step.1 or 2 to create webgl context. if successful return context else abort New behavior: STEP.1 create offscreenCanvas, if failed goto step.3 STEP.2 use it to create webgl context. if successful, return context STEP.3 create canvas (document.createElement). if failed, abort STEP.4 use it to create webgl context. if successful, return context else abort Motivation: we found that in some environments, normalCanvas.getContext() returns null but offscreenCanvas.getContext() returns the context object. When offscreenCanvas is available, it is a good idea to always prefer it.	2023-06-21 17:10:26 -07:00
Yulong Wang	da532f3f5a	[js/webgpu] fix GPU to GPU memcpy (#16393 ) ### Description Fixes a GPU to GPU memory copy bug which causes #16267	2023-06-21 15:50:08 -07:00
Yulong Wang	b8917ad84f	[js/web] fix nodejs detection (#16400) ### Description We used to use `typeof fetch === 'undefined'` as the condition to detect whether the environment is Node.js. Before Node.js v18, this worked. However, Node.js v18 introduced the `fetch` function, so this check does not work any more. This PR changes the condition to check whether `process`, `process.versions` and `process.versions.node` exist. Checking whether `process` exists is not enough, because in some configurations webpack may polyfill Node.js's process.	2023-06-20 00:20:58 -07:00
Yulong Wang	4f7900b553	[js/web] enable ONNX Runtime Web error messages in JS (#16335 ) ### Description enabling passing error messages from C++ to JavaScript so that when ORT Web API fails it generates more verbose errors.	2023-06-15 09:45:41 -07:00
satyajandhyala	889f80082f	[js/web] Added Reduce operators support (#16122) ### Description Added support for ReduceL1, ReduceL2, ReduceMean, ReduceMin, ReduceMax, ReduceSum, ReduceLogSum, ReduceLogSumExp, ReduceProd and ReduceSquareSum. --------- Co-authored-by: Satya Jandhyala <sajandhy@microsoft.com> Co-authored-by: guschmue <guschmue@microsoft.com>	2023-06-12 07:46:27 -07:00
Yulong Wang	f274bbb0c8	[js] add API that allows to get package version (#16207 ) ### Description Add an API for users to get version of current package. example usage: ```js import { env } from 'onnxruntime-node'; console.log(env.versions.node); // output "1.16.0" ``` ```js import { env } from 'onnxruntime-web'; console.log(env.versions.web); // output "1.16.0" console.log(env.versions.common); // output "1.16.0" console.log(env.versions.node); // output "undefined" ``` #16156	2023-06-09 16:18:53 -07:00
Wanming Lin	a8c2f24ae0	[WebNN EP] Merge support for segment anything into main branch (#16208) We implemented a number of new ops and data types to support running the segment anything model on the Chromium WebNN DML backend (POC) in a forked branch https://github.com/honry/onnxruntime/tree/stable-diffusion In this PR, we migrate the changes in the forked branch to the main branch, including: - 22 new ops - New tensor data types: bool, int32, uint32, uint64, int64, float16 (As JavaScript hasn't shipped Float16Array, we use Uint16Array as a workaround) - Handle empty input tensors and duplicated outputs - Fixed some nits	2023-06-07 09:56:37 -07:00
Yulong Wang	ebe715a817	[js/webgpu] fix RangeError in buffer download (#16165 ) ### Description this is a following up fix for #15990, which should resolve the RangeError issue.	2023-05-30 15:04:50 -07:00
Xavier Dupré	e726151b5c	Introduce float 8 types (#14731) ### Description The PR implements FloatE4M3FN, FloatE5M2, FloatE4M3FNUZ, FloatE5M2FNUZ as described in PR https://github.com/onnx/onnx/pull/4805. It uses CUDA API to cast float/half to float8 if CUDA>=11.8, a custom implementation if CUDA<11.8. * It implements Cast, QuantizeLinear, DequantizeLinear for all types on CPU, only for types FloatE4M3FN, FloatE5M2 on CUDA. * It extends the supported types for control flow operator, Shape, Reshape, Identity, If, Loop, Scan, Reshape * It implements Equal(19). * Cast, QuantizeLinear, DequantizeLinear operators now support a parameter `saturate` only valid for float 8 types. It is true by default. In that case, any value out of range is converted into the maximum float 8 value. If false, it is infinite. * QuantizeLinear, DequantizeLinear now supports multiple scales on CUDA (and ROCm by extension), scale = 1D tensor with one scale per channel ### Motivation and Context Supports latest onnx version. Fixes [AB#15395](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15395) --------- Co-authored-by: Xavier Dupre <xadupre@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Randy Shuai <rashuai@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>	2023-05-30 13:25:58 -07:00
Yulong Wang	18f17c555d	[js/webgpu] fix buffer size when download (#15990) ### Description fix buffer size when download. buffer size should always be padded to multiple of 4. resolved issue described in #15796	2023-05-20 00:21:18 -07:00
Yulong Wang	04ea561fc8	[js/webgpu] throw error when WebGPU=ON and SIMD=OFF (#15924 ) ### Description throw error when WebGPU=ON and SIMD=OFF	2023-05-16 11:05:56 -07:00
Yulong Wang	22a9a1a630	[js/webgpu] only register webgpu backend when it's available (#15922 ) ### Description only register webgpu backend when it's available	2023-05-15 18:09:31 -07:00
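
The Node.js detection change described in b8917ad84f (#16400) can be illustrated with a minimal sketch. This is not the code from that commit: the function name `isNodeJs` and the injectable `scope` parameter are hypothetical, added only so the check can be exercised against fake globals, and the string-type test on `versions.node` is a slightly stricter assumption than the "exists" check the commit message describes.

```js
// Hypothetical helper (not taken from the commit): reports whether the given
// global scope looks like Node.js. `scope` defaults to the real global object.
function isNodeJs(scope = globalThis) {
  const proc = scope.process;
  // Checking `process` alone is not enough: per the commit message, webpack
  // may polyfill it in browser builds, so `versions.node` is checked as well.
  return typeof proc === 'object' && proc !== null &&
      typeof proc.versions === 'object' && proc.versions !== null &&
      typeof proc.versions.node === 'string';
}

// Usage against fake scopes:
console.log(isNodeJs({}));                                        // no process at all
console.log(isNodeJs({ process: {} }));                           // polyfilled process
console.log(isNodeJs({ process: { versions: { node: '18.0.0' } } }));
```

Unlike the earlier `typeof fetch === 'undefined'` condition, a check of this shape does not depend on which web APIs a given Node.js version happens to ship.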


122 commits
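
The rule stated in 18f17c555d (#15990), that a download buffer size should always be padded to a multiple of 4, comes down to a small rounding step. The sketch below only illustrates that arithmetic; the helper name `padToMultipleOf4` is hypothetical and the actual implementation in the repository may differ.

```js
// Hypothetical helper (not taken from the commit): rounds a byte length up to
// the next multiple of 4, leaving lengths that are already aligned unchanged.
function padToMultipleOf4(byteLength) {
  return Math.ceil(byteLength / 4) * 4;
}

// Usage: a 10-byte tensor would need a 12-byte buffer under this rule.
console.log(padToMultipleOf4(10));
```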