onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-22 22:01:08 +00:00

Author	SHA1	Message	Date
Wanming Lin	00b1e79e04	Support WebNN EP (#15698 ) Description: This PR intends to enable WebNN EP in ONNX Runtime Web. It translates the ONNX nodes using the [WebNN API](https://webmachinelearning.github.io/webnn/); the translation is implemented in C++ and uses the Emscripten [Embind API](https://emscripten.org/docs/porting/connecting_cpp_and_javascript/embind.html#). It temporarily uses the preferred layout NHWC for WebNN graph partitions because of a restriction in the WebNN XNNPack backend implementation and the ongoing [discussion](https://github.com/webmachinelearning/webnn/issues/324) in the WebNN spec on whether WebNN should support both 'NHWC' and 'NCHW' layouts. No WebNN native EP, only for Web. Motivation and Context: Allow ONNXRuntime Web developers to access the WebNN API to benefit from hardware acceleration. WebNN API Implementation Status in Chromium: - Tracked in Chromium issue: [#1273291](https://bugs.chromium.org/p/chromium/issues/detail?id=1273291) - CPU device: based on the XNNPack backend, and has been available on Chrome Canary M112 behind the "#enable-experimental-web-platform-features" flag for Windows and Linux platforms. Further implementation for more ops is ongoing. - GPU device: based on DML, implementation is ongoing. Open: - GitHub CI: WebNN is currently only available on Chrome Canary/Dev with the XNNPack backend for Linux and Windows. This is an open question for reviewers: please help identify which GitHub CI should involve the WebNN EP and guide me in enabling it. Thanks!	2023-05-08 21:25:10 -07:00
Yulong Wang	4712009f8a	[js/web] add target ort.webgpu.min.js (#15780 ) ### Description add target ort.webgpu.min.js WebGPU is experimental feature, so I don't want to put webgpu into the ort.min.js file. This change adds 2 ways for users to access ort-web with webgpu: - using script tag: by URL `https://cdn.jsdelivr.net/npm/onnxruntime-web@1.15.0/dist/ort.webgpu.min.js` ( this URL is not ready yet ) - using `import()`: use `import { Tensor, InferenceSession } from 'onnxruntime-web/webgpu';` - 'onnxruntime-web/webgpu' instead of 'onnxruntime-web'	2023-05-04 10:05:39 -07:00
Yulong Wang	94c9a31f83	[js/webgpu] fix download failure due to buffer change (#15723 ) ### Description fix download failure due to buffer change. WebAssembly buffer may change (growth triggered by memory allocation) during an async function call.	2023-04-28 00:16:31 -07:00
Yulong Wang	d471432e10	[js/webgpu] fix attribute cache key for 2 operators (#15710 ) ### Description fix attribute cache key for LeakyRelu and ThresholdedRelu	2023-04-27 15:04:33 -07:00
Yulong Wang	c0116af619	[js/webgpu] operator Exp (#15713 ) ### Description operator Exp	2023-04-27 15:04:09 -07:00
Yulong Wang	a02c885f86	[js/webgpu] add implementation of Relu, LeakyRelu and ThresholdedRelu (#15668 ) ### Description add implementation of Relu, LeakyRelu and ThresholdedRelu	2023-04-26 15:11:01 -07:00
Yulong Wang	b98317b907	[js/webgpu] following up for JSEP/WebGPU code cleanup (#15666 ) ### Description This PR resolves a part of non-critical comments from code review comments in #14579. - use `USE_JSEP` instead of `USE_JS` in build definition to make it less ambiguous - remove unused util functions from util.ts - fix transpose.h - other misc fixes	2023-04-25 21:20:03 -07:00
Yulong Wang	d30831d829	[js/webgpu] make `RunFunction` return `void` (#15669 ) ### Description make `RunFunction` return `void`. the return value is meaningless in the OpResolveRule context. Allows any JavaScript error to be caught and returns non-zero return value from `computeKernel()`	2023-04-25 14:14:26 -07:00
Yulong Wang	14cc02c65c	[js/web] WebGPU backend via JSEP (#14579 ) ### Description This change introduces the following new components into ONNX Runtime Web: - JavaScript Execution Provider (JSEP) - Asynchronous inference execution powered by Emscripten's Asyncify - WebGPU backend implemented in TypeScript - initial implementation of kernels: - elementwise operators (22) - binary operators (5) - tensor: Shape, Reshape, Transpose, Gemm - nn: Conv, {Global}Maxpool, {Global}AveragePool The code needs to be polished; still working on it. ## Q&A What is JSEP? > JSEP, aka JavaScript Execution Provider, is a new ONNXRuntime execution provider that works specifically in the Web environment (browsers). JSEP allows JavaScript code to kick in from various places when ONNX Runtime runs inference on a model. Why JSEP? > JSEP is a hybrid mode EP that contains both C/C++ and TypeScript/JavaScript implementations. There are 2 strong reasons why we introduce JSEP: > 1. the C/C++ part helps JSEP leverage ONNX Runtime's capabilities as much as possible, including graph transformers, optimizers and the ability to fall back to the CPU EP. TypeScript/JavaScript makes the kernel implementation much easier to develop and debug in the browser. > 2. the requirement of asynchronous execution from JavaScript APIs (e.g. `buffer.mapAsync()`) makes it impossible to run `OrtRun()` in a synchronous context (see the "async problem" section below). This is solved by using Emscripten's Asyncify. What is WebGPU? > WebGPU is the new GPU API available in browsers. It is one of only 2 APIs currently available to access the GPU from the browser (the other is WebGL). > WebGPU is designed with more advanced and stronger features than WebGL and is potentially the solution that offers the best GPU performance for model inference currently available. What is the async problem and why do we have it? 
> The "async problem" is that you cannot call an async function in a synchronous context. Think about the following C/C++ code: > ```c > // C-style declarations (API) > typedef void (ON_COMPLETE)(PVOID state, DATA data); > void read_data_from_file(FILEHANDLE file, ON_COMPLETE on_complete); > > // implementation > DATA * my_impl_read_data_from_file_sync(FILEHANDLE file) { > // how to implement? > } > ``` > The answer is, it's impossible to implement this function. Usually we try to find a sync version of the API, or launch a thread to call the async function and sync-wait on the main thread. Unfortunately, in the browser environment, neither is possible. > > WebGPU does not offer any synchronous API for data downloading (GPU to CPU). This is the only operation that MUST be async. As `OrtRun()` will eventually call into DataTransfer to copy data from GPU to CPU, and `OrtRun()` is a synchronous function, this cannot be done in the normal way. What is Emscripten? How does the Asyncify feature resolve the problem? > Emscripten is the C/C++ compiler for WebAssembly. It's what we use to compile ORT and generate the WebAssembly artifacts which run in browsers. > > Asyncify is a [compiler feature](https://emscripten.org/docs/porting/asyncify.html) that allows calling async functions from a synchronous context. In short, it generates code to unwind and rewind the call stack to emulate async execution. With this feature, we are able to call the async function inside the `OrtRun()` call. ## Design Overview Inter-op JSEP does pretty much the same thing as any other EP. 
It exposes an interface for inter-op with JavaScript, which is defined in onnxruntime/wasm/js_internal_api.js: ```js // init JSEP Module["jsepInit"] = function (backend, alloc, free, copy, copyAsync, createKernel, releaseKernel, run) { Module.jsepBackend = backend; Module.jsepAlloc = alloc; Module.jsepFree = free; Module.jsepCopy = copy; Module.jsepCopyAsync = copyAsync; Module.jsepCreateKernel = createKernel; Module.jsepReleaseKernel = releaseKernel; Module.jsepRun = run; }; ``` This simple JavaScript snippet defines all the language-barrier-level functions that JSEP requires to implement kernels and data transfers using JavaScript inside ONNX Runtime: - `jsepBackend`: assigns the singleton object to the WebAssembly module - `jsepAlloc` and `jsepFree`: implementation of data transfer's Alloc() and Free() - `jsepCopy`: synchronous copy (GPU to GPU, CPU to GPU) - `jsepCopyAsync`: asynchronous copy (GPU to CPU) - `jsepCreateKernel` and `jsepReleaseKernel`: a corresponding object maintained in JS to match the lifecycle of the Kernel in ORT - `jsepRun`: OpKernel::Compute() should call into this The abstraction above keeps the connections and dependencies between C/C++ and TypeScript/JavaScript to a minimum. Resource Management The lifecycle of tensor data and kernels is managed by ORT (C/C++) but the implementation is left to JavaScript. JavaScript code is responsible for implementing the callbacks correctly. For WebGPU, the GPU data is managed by JavaScript using a singleton map (tensor_data_id => GPUBuffer). The GPU pipeline is managed as a singleton. Shaders are managed using a singleton map (shader_key => gpu_program), where shader_key is generated from cache_key (OP specific, including attributes) and input shapes. about data transfer `js::DataTransfer::CopyTensor` is implemented to call either the synchronous or the asynchronous copy callback, depending on whether the destination is GPU or not. 
Emscripten's macro `EM_ASYNC_JS` is used to wrap the async function to be called in the synchronous context. run kernel in JS The Kernel class constructor calls `jsepCreateKernel()` once, with an optional per-kernel serialization to pass attributes into JavaScript. `Compute()` is implemented so that metadata serialization is performed in a base class and JavaScript code can access the data using the Emscripten-specific builtin macro `EM_ASM_`. disabled features memory pattern is force-disabled, because WebGPU data is not represented by a general memory model (in which a buffer can be represented by offset + size). concurrent run support is disabled. WebGPU is stateful and also has async function calls. Supporting concurrent runs would significantly increase the complexity and we don't get any real benefit from it. prefer channels last JSEP prefers channels last and returns `DataLayout::NHWC` in method `GetPreferredLayout()`. This lets the graph transformers preprocess the graph into a channels last form so that a more optimized WebGPU shader can be used. Testing code It's impossible to test JSEP directly because JSEP itself does not contain any kernel implementation. However, it has the kernel registration, which needs to work together with the corresponding JavaScript code. There are unit tests that run ONNX models from the JavaScript API. --------- Co-authored-by: Scott McKay <skottmckay@gmail.com>	2023-04-24 15:21:18 -07:00
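The shader cache described in the Resource Management notes of 14cc02c65c (shader_key => gpu_program, with shader_key derived from cache_key and input shapes) might look roughly like the sketch below. This is an illustration only: `makeShaderKey`, `ProgramCache`, `GpuProgram` and the key format are hypothetical names and choices, not taken from the ONNX Runtime Web sources.

```typescript
// Hypothetical sketch; names and key format are assumptions, not the ORT Web code.
type Shape = readonly number[];

interface GpuProgram {
  readonly key: string;
  readonly source: string;
}

// Combine the operator-specific cache key (which already encodes attributes)
// with the input shapes, so that either changing produces a different key.
function makeShaderKey(cacheKey: string, inputShapes: readonly Shape[]): string {
  const shapes = inputShapes.map((dims) => dims.join(',')).join(';');
  return `${cacheKey}|${shapes}`;
}

class ProgramCache {
  private readonly programs = new Map<string, GpuProgram>();
  // Number of times a program was actually built (cache misses).
  buildCount = 0;

  // Return the cached program for this key, building it only on a miss.
  getOrBuild(cacheKey: string, inputShapes: readonly Shape[], build: () => string): GpuProgram {
    const key = makeShaderKey(cacheKey, inputShapes);
    let program = this.programs.get(key);
    if (program === undefined) {
      program = { key, source: build() };
      this.programs.set(key, program);
      this.buildCount++;
    }
    return program;
  }
}
```

With a key built this way, two kernels that differ only in an attribute value (for example the alpha of LeakyRelu) do not share a program, which is the kind of collision the later cache-key fix d471432e10 concerns.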
Yulong Wang	0205b63756	[wasm] optimize default session options parsing (#15428 ) ### Description optimize default session options parsing. - do minimal property assignment to the passed in `options` object. - modify default value of `enableCpuMemArena` and `enableMemPattern` to `false`. We don't get benefits from enabling these 2 flags in web assembly	2023-04-10 11:09:09 -07:00
shalvamist	fff75a301c	ORT_Web - JS graph parsing update (#15185 ) ### Description Simplified the JS graph parsing logic - addressing gitHub issue #15006 bug fix	2023-03-31 09:26:55 -07:00
Guenther Schmuelling	4645726d74	fix for webgl lrn (#15236 ) fix issue that resulted in wrong results for lrn on webgpu	2023-03-30 16:16:57 -07:00
Yulong Wang	f972d21e81	[js] upgrade dependencies and enable strict mode (#14930 ) ### Description This PR includes the following changes: - upgrade js dependencies - enable STRICT mode for web assembly build. - corresponding fix for cmake-js upgrade - corresponding fix for linter upgrade - upgrade default typescript compile option of: - `moduleResolution`: from `node` to `node16` - `target`: from `es2017` to `es2020` - fix ESM module import in commonJS source file ## change explanation ### changes to onnxruntime_webassembly.cmake `-s WASM=1` and `-s LLD_REPORT_UNDEFINED` are on by default in the latest version and deprecated. ### changes to onnxruntime_node.cmake The npm package `cmake-js` updated its way of finding the file `node.lib`. Previously it downloaded this file from the Node.js public release channel, and now it generates it from a definition file. The Node.js release channel does not contain a windows/arm64 version, so previously cmake-js would fail to download `node.lib` for that platform. This is why we added special handling to download the unofficial binary for the build. This is no longer needed, so we removed it from the cmake file. ### changes to tsconfig.json `node16` module resolution supports async import and `es2020` as target supports top level await.	2023-03-22 15:05:04 -07:00
Christian Veenhuis	59dfcfdce7	Fix typos in sources: operater, tranform, neccessary, trainig (#14907 ) ### Description While browsing the sources I found several typos here and there. I collected them to a single PR and fixed them. Namely these typos are: operater, tranform, neccessary, trainig. After fixing none of them was found anymore: $ git grep "operater" $ git grep "tranform" $ git grep "neccessary" $ git grep "trainig" $ ### Motivation and Context Since some of the typos are in example notebooks and markdown files, users can see them.	2023-03-13 22:45:04 -07:00
Yulong Wang	a631ed77c0	[js/web] support flag 'optimizedModelFilePath' in session options (#14355 ) ### Description * Support flag 'optimizedModelFilePath' in session options. In Node.js, the model will be saved into filesystem just like its behaviour on native platforms. In browser, the new model is not saved to filesystem. the file path is ignored. Instead, a new pop-up window will be launched in browser and user can 'save' the file as onnx model. * Add corresponding commandline args for the following session option flags: - optimizedModelFilePath - graphOptimizationLevel	2023-02-24 15:50:15 -08:00
Yulong Wang	b1a17188a6	[js/web] add LRN unpacked kernel for webgl backend (#14459 ) ### Description add LRN unpacked kernel for webgl backend	2023-02-01 11:51:10 -08:00
shalvamist	5c16e0befb	[web] utility functions for tensor<->image conversion in ORT web (#13603 ) ### Description Adds data processing capabilities to ORT Web. This PR will focus on augmenting raw data to and from Tensors. ### Motivation and Context Enabling different app building use cases to leverage ORT in a more natural form. Currently, the user needs to process the data and call Tensor constructors - these util functions will provide a direct path to generating ORT tensors. Co-authored-by: shalvamist <shalva.mist@microsoft.com>	2023-01-12 09:05:18 -08:00
Seungwon Jeong	307ad1413a	[js/web] support 'pytorch_half_pixel' mode for WebGL kernel 'Resize' (#11208 ) Description: 1. add pytorch_half_pixel interpolation mode in resize-packed.ts Changes: add the following case in createPackedResizeProgramInfo function: ``` case 'pytorch_half_pixel': getSourceFracIndex = ` vec4 getSourceFracIndex(ivec4 coords) { vec4 fcoords = vec4(coords); return vec4( ${outputWidth}.0 > 1.0 ? (fcoords.x + 0.5) / scaleWHWH.x - 0.5 : 0.0, ${outputHeight}.0 > 1.0 ? (fcoords.y + 0.5) / scaleWHWH.y - 0.5 : 0.0, ${outputWidth}.0 > 1.0 ? (fcoords.z + 0.5) / scaleWHWH.z - 0.5 : 0.0, ${outputHeight}.0 > 1.0 ? (fcoords.w + 0.5) / scaleWHWH.w - 0.5 : 0.0 ); } `; break; ``` 2. fix "unrecognized input '' for node: Resize_$num" error when inputs like [input_tensor, None, scale_factor] (roiInput not given) are fed into the resize layer. Changes: change in input handling logic in upsample.ts & node scanning logic in graph.ts Motivation and Context Before this fix, we aren't able to use webGL backend when the neural network contains pytorch resize layers. This fix adds 'pytorch_half_pixel' interpolation mode support and makes it possible to use webGL backend for more kind of computer vision networks. This commit solves: #10430 Co-authored-by: neo <neo@icode-lab.com> Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>	2022-11-21 12:03:48 -08:00
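The coordinate mapping in the 'pytorch_half_pixel' shader of 307ad1413a can be written out per axis as a plain function. A minimal sketch, assuming `scale` is output length divided by input length as in the ONNX Resize operator; the function name `pytorchHalfPixel` is made up for illustration and does not appear in the commit.

```typescript
// Hypothetical helper mirroring one component of getSourceFracIndex in the shader.
// `scale` is assumed to be outputLength / inputLength.
function pytorchHalfPixel(outputCoord: number, scale: number, outputLength: number): number {
  // When the output has a single element along this axis, the source index is pinned to 0.
  return outputLength > 1 ? (outputCoord + 0.5) / scale - 0.5 : 0;
}

// Upscaling an axis of length 2 to length 4 (scale = 2):
const first = pytorchHalfPixel(0, 2, 4); // (0 + 0.5) / 2 - 0.5 = -0.25
const last = pytorchHalfPixel(3, 2, 4);  // (3 + 0.5) / 2 - 0.5 = 1.25
```

Results outside the input range, such as -0.25 or 1.25 here, are fractional source positions; how they are clamped or interpolated is handled elsewhere in the kernel and is not shown.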
Guenther Schmuelling	6f6560a7b9	fix to reduce peak memory usage in ort-web (#13323 ) fix to reduce peak memory usage in ort-web	2022-11-14 12:18:02 -08:00
Yulong Wang	82786baed1	[js/web] add 'xnnpack' to EP list (#12723 ) Description: This PR adds support for "XNNPACK EP" in ORTWeb and changes how ORTWeb deals with "backends", or "EPs", in the API. Background: The term "backend" was introduced in ONNX.js to represent a TypeScript type which implements a "backend" interface, which is a similar but different concept to ORT's EP (execution provider). There were 3 backends in ONNX.js: "cpu", "wasm" and "webgl". When ORT Web was launched, the concept was carried over to help users integrate smoothly. Technically, when the "wasm" backend is used, users also need to specify the "EP" in the session options. Considering that it may get complicated and confusing for users to figure out the difference between "backend" and "EP", the JS API hides the "backend" concept and makes a mapping between names, backends and EPs: "webgl" (Name) <==> "onnxjsBackend" (Backend) "wasm" (Name) <==> "wasmBackend" (Backend) <==> "CPU" (EP) Details: The following changes are applied in this PR: 1. allow multi-registration for backends using the same name. This is for scenarios where both "onnxruntime-node" and "onnxruntime-web" are consumed in a Node.js App (so "cpu" will be registered twice in this scenario). 2. re-assign priority values to backends. I give 100 as base to "cpu" for node and react_native, and 10 as base to "cpu" in web. 3. add "cpu", "xnnpack" as new names of backends. 4. update onnxruntime wasm exported functions to support EP registration. 5. update implementations in ort web to handle execution providers in session options. 6. add '--use_xnnpack' as default build flag for ort-web	2022-10-03 10:38:45 -07:00
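Items 1 and 2 of 82786baed1 (multi-registration under one name, with priority values) can be pictured with a small registry. This is a sketch under assumptions: the class `BackendRegistry`, its methods, and the rule that the higher priority wins on a repeated name are illustrative guesses, not the actual onnxruntime-common API.

```typescript
// Hypothetical sketch; not the onnxruntime-common implementation.
interface Backend {
  readonly label: string;
}

interface Entry {
  backend: Backend;
  priority: number;
}

class BackendRegistry {
  private readonly entries = new Map<string, Entry>();

  // Registering the same name twice is allowed. Assumption: the entry with
  // the higher priority is kept, and a lower-priority one is ignored.
  register(name: string, backend: Backend, priority: number): void {
    const current = this.entries.get(name);
    if (current === undefined || priority > current.priority) {
      this.entries.set(name, { backend, priority });
    }
  }

  // Look up the backend currently registered under a name.
  resolve(name: string): Backend | undefined {
    return this.entries.get(name)?.backend;
  }

  // Names ordered from highest to lowest priority.
  namesByPriority(): string[] {
    return Array.from(this.entries)
      .sort((a, b) => b[1].priority - a[1].priority)
      .map(([name]) => name);
  }
}

// "cpu" registered by both a web and a node package in one Node.js app,
// using the base priorities quoted in the commit (10 for web, 100 for node).
const registry = new BackendRegistry();
registry.register('cpu', { label: 'web cpu' }, 10);
registry.register('cpu', { label: 'node cpu' }, 100);
```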
shalvamist	851b0ce936	[js/web][Fix] - updating the C API to catch non-tensor data (#12811 ) Added a check for tensor validation on the input - this change fixes the quiet abort WASM takes when processing non-tensor data in "OrtGetTensorData" Motivation and Context Currently, when we try to process non-tensor data through OrtGetTensorData, an exception is thrown, which results in a quiet abort from WASM (assuming WASM was built without exception handling). I added a check in the C API to catch this case and output a meaningful message to the user [example_error_github_12622.zip](https://github.com/microsoft/onnxruntime/files/9464328/example_error_github_12622.zip)	2022-09-21 13:59:17 -07:00
Yulong Wang	1a402a3f25	replace 'master' branch ref to 'main' for onnx repo (#12678 )	2022-08-30 13:41:42 -07:00
Yulong Wang	f40e90c33f	[js/web] fix incorrect shader for 'Resize' (#12588 )	2022-08-21 21:47:28 -07:00
101arrowz	148b1efe5e	[js/web] add ConvTranspose2D to WebGL backend (#11990 ) * Add ConvTranspose * Update docs + tests * fix lint * fix output shape calculations * Revert "fix output shape calculations" This reverts commit 8014fa9b33115f1d6a677fe2270a6da1b510ff67. * fix format * remove broken output_shape test	2022-07-27 13:57:12 -07:00
101arrowz	c72bb8aaa9	[js/web] add OffscreenCanvas support to WebGL backend (#12159 ) * Add OffscreenCanvas support to WebGL backend * fix format * fix lint	2022-07-20 14:06:03 -07:00
Yosshi999	0702364d7a	[js/web][bugfix] fix negative axes for unsqueeze (#11944 ) [js/web] fix negative axes for unsqueeze	2022-06-28 11:28:35 -07:00
Yulong Wang	af21a04977	[js] upgrade async@3.2.3 /js/ (#11421 ) * [js] upgrade async@3.2.3 /js/ * format code	2022-05-03 23:41:36 -07:00
Yulong Wang	6c7090a829	[js/web] fix output type mapping (#11049 )	2022-03-30 16:26:04 -07:00
Yulong Wang	893ee65e54	[js/web] fix lint error when run without ort-web TS types (#10429 ) * [js/web] fix lint error when run without ort-web TS types * update CI to run linter before 'npm ci' in /js/web	2022-02-17 22:34:38 -08:00
Yulong Wang	b9909f985e	[js/web] rename build-def.ts to build-def.d.ts (#9954 )	2021-12-09 14:17:42 -08:00
Yulong Wang	a3ebc5e082	[js/web] do not use nodejs type 'Buffer' in web (#9839 ) * [js/web] do not use nodejs type 'Buffer' in web * resolve comments and validate tests * remove 'Buffer' in test	2021-11-24 14:14:42 -08:00
Yulong Wang	74ca417c0e	[js/web] optimize bundle file size (#9817 ) * es2017 by default for ort-common * add visualizer and define plugin * es2017 for ort-web. also add build target for es5 * add multiple reduced size build for ort-web * resolve comments, add e2e tests and add docs	2021-11-22 13:56:55 -08:00
Sunghoon	e65f284476	[js/web] Support WebGL for ort format models in benchmarks (#9661 ) * add p50 in test * Support FusedConv in WebGL * resolve comments * add a comment for longToNumber change Co-authored-by: Yulong Wang <yulongw@microsoft.com>	2021-11-09 11:58:47 -08:00
Sunghoon	c79307e7b4	[js/web] support opset-13 of softmax (#9493 ) * add p50 in test * support opset-13 of softmax * update a operators.md * resolve comments * fix lint and format Co-authored-by: Yulong Wang <yulongw@microsoft.com>	2021-10-26 23:58:50 -07:00
Yulong Wang	901c7de918	[js/web] remove webgl from default fallback list (#9374 )	2021-10-14 21:46:22 -07:00
Sunghoon	74eaaad768	[js/web] Support opset-13 for squeeze, unsqueeze, maxpool, pad, cast and clip (#9249 ) * Support opset-13 for squeeze, unsqueeze, maxpool, pad, cast, clip * merge master and update a operators.md * resolve comment. revise pool and cast kernel implementation. * skip fusion when clip min and max is not in initializer	2021-10-14 16:29:37 -07:00
Yulong Wang	634bb5ede0	fix CodeQL warning 'Remote property injection' (#9224 )	2021-09-30 13:45:22 -07:00
Yulong Wang	8c57d51928	support WebAssembly SIMD for qgemm (#9191 ) * support WebAssembly SIMD for qgemm * remove '--experimental-wasm-bulk-memory' for test	2021-09-30 12:40:56 -07:00
Yulong Wang	750e2e0481	[js/web] check session ID in releaseSession() (#9105 )	2021-09-20 17:49:53 -07:00
Yulong Wang	be80698698	[js/web] a bugfix and add tests for wasm proxy worker (#9048 ) * [js/web] add tests for wasm proxy worker * fix script src override	2021-09-14 10:38:58 -07:00
Du Li	57b7ab56cd	Adding async fetching for webgl backend (#8951 ) * Adding async fetching for webgl backend * fix PR comments and CI failure. * fixing a bug * adding a flag	2021-09-09 22:17:42 -07:00
Sunghoon	450524359e	[js/web] WebAssembly profiling (#8932 ) * add p50 in test * Preallocate WebAssembly worker threads to minimize worker creation time * WebAssembly profiling * merge master * merge with proxy changes * disable profiling tests from WebAssembly build * fix e2e test failure Co-authored-by: Yulong Wang <yulongw@microsoft.com>	2021-09-07 17:18:08 -07:00
Yulong Wang	206537936f	[js/web] enable proxy worker for wasm backend (#8862 )	2021-08-31 10:23:42 -07:00
Yulong Wang	cb67fca738	[js/web] enable 'use_ort_model_bytes_directly' by default (#8734 )	2021-08-18 11:18:58 -07:00
Yulong Wang	4ceedbe933	[js/web] add SharedArrayBuffer check for wasm multi-thread (#8749 )	2021-08-16 23:17:54 -07:00
Yulong Wang	3e8cabbc3e	[js/web] WebGL backend refactor (#8586 )	2021-08-12 12:30:49 -07:00
Yulong Wang	f3a1aebb33	[js/web] support override wasm file path (#8610 )	2021-08-05 18:01:03 -07:00
Du Li	fa722d208b	[js/web] adding webgl pointwise conv kernel (#8418 )	2021-08-04 20:46:08 -07:00
Tixxx	db88f3059c	[js] fixing broadcast issues in pack mode (#8090 ) * fixing broadcast issues in pack mode * improved bcast logic for matmul * removed TODO * rebased from master	2021-06-23 09:55:19 -07:00
Du Li	352d560fd5	Adding Conv+Clip fusion (#8102 )	2021-06-21 16:30:12 -07:00

71 commits