onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-16 21:00:14 +00:00

Author	SHA1	Message	Date
Yulong Wang	56bced0581	[js/web] enable webgpu in browser unit test (#16310 ) ### Description enable webgpu in browser unit test. The CI pipeline uses Edge v113+ which enables WebGPU. === UPDATE on 08/07/2023: - add flags to Edge browser launch commandline so that Edge on CI agents can initialize WebGPU correctly. - ONLY enable webgpu on web release build. Other pipelines are using flag `-b=wasm,webgl,xnnpack` to specify the other 3 backends explicitly. - disable "Resize" related test failures. Once they are fixed the tests can be re-enabled. --------- Co-authored-by: Satya Jandhyala <satya.k.jandhyala@gmail.com>	2023-08-08 11:45:04 -07:00
Yulong Wang	f274bbb0c8	[js] add API that allows to get package version (#16207 ) ### Description Add an API for users to get version of current package. example usage: ```js import { env } from 'onnxruntime-node'; console.log(env.versions.node); // output "1.16.0" ``` ```js import { env } from 'onnxruntime-web'; console.log(env.versions.web); // output "1.16.0" console.log(env.versions.common); // output "1.16.0" console.log(env.versions.node); // output "undefined" ``` #16156	2023-06-09 16:18:53 -07:00
Yulong Wang	9328a0f955	[js/webgpu] run test on chrome instead of chrome canary for webgpu (#15902 ) ### Description webgpu is released in chrome v113. No longer to use chrome canary in test cli	2023-05-12 15:47:59 -07:00
Wanming Lin	00b1e79e04	Support WebNN EP (#15698 ) Description: This PR intends to enable WebNN EP in ONNX Runtime Web. It translates the ONNX nodes by [WebNN API](https://webmachinelearning.github.io/webnn/), which is implemented in C++ and uses Emscripten [Embind API](https://emscripten.org/docs/porting/connecting_cpp_and_javascript/embind.html#). Temporarily using preferred layout NHWC for WebNN graph partitions since the restriction in WebNN XNNPack backend implementation and the ongoing [discussion](https://github.com/webmachinelearning/webnn/issues/324) in WebNN spec that whether WebNN should support both 'NHWC' and 'NCHW' layouts. No WebNN native EP, only for Web. Motivation and Context: Allow ONNXRuntime Web developers to access WebNN API to benefit from hardware acceleration. WebNN API Implementation Status in Chromium: - Tracked in Chromium issue: [#1273291](https://bugs.chromium.org/p/chromium/issues/detail?id=1273291) - CPU device: based on XNNPack backend, and had been available on Chrome Canary M112 behind "#enable-experimental-web-platform-features" flag for Windows and Linux platforms. Further implementation for more ops is ongoing. - GPU device: based on DML, implementation is ongoing. Open: - GitHub CI: WebNN currently is only available on Chrome Canary/Dev with XNNPack backend for Linux and Windows. This is an open to reviewers to help identify which GitHub CI should involved the WebNN EP and guide me to enable it. Thanks!	2023-05-08 21:25:10 -07:00
Yulong Wang	14cc02c65c	[js/web] WebGPU backend via JSEP (#14579 ) ### Description This change introduced the following new components into ONNX Runtime Web: - JavaScript Execution Provider (JSEP) - Asynchronized inferencing execution powered by Emscripten's Asyncify - WebGPU backend implemented in TypeScript - initial implementation of kernels: - elementwise operators (22) - binary operators (5) - tensor: Shape, Reshape, Transpose, Gemm - nn: Conv, {Global}Maxpool, {Global}AveragePool Code need to be polished. still working on it. ## Q&A What is JSEP? > JSEP, aka JavaScript Execution Provider, is a new ONNXRuntime execution provider that specifically works on Web environment (browsers). JSEP allows JavaScript code to kick in from various places when ONNX Runtime inferences a model. Why JSEP? > JSEP is a hybrid mode EP that contains both C/C++ and TypeScript/JavaScript implementation. There are 2 strong reasons why we introduces JSEP: > 1. the C/C++ part helps JSEP to leverage ONNX Runtime's capabilities as much as possible including graph transformer, optimizers and also the capabilities to fallback to CPU EP. TypeScript/JavaScript helps JSEP to develop and debug much easier in the browser for the kernel implementation. > 2. the requirement of asynchronized execution from JavaScript API (eg. `buffer.mapAsync()`) makes it impossible to run `OrtRun()` in a synchronized context (see "async problem" section below). This is done by using Emscripten's Asyncify. What is WebGPU? > WebGPU is the new GPU API that available in browser. It's one of the only 2 APIs that currently available to access the GPU from browser (the other is WebGL). > WebGPU is designed with more advanced and stronger features comparing to WebGL and is potentially solution that offer the best GPU performance for model inferencing that currently available. What is the async problem and why we have the problem? > The "async problem" is a problem that you cannot call an async function in a synchronous context. Think about the following C++ code: > ```c > // C-style declarations (API) > typedef void (ON_COMPLETE)(PVOID state, DATA data); > void read_data_from_file(FILEHANDLE file, ON_COMPLETE on_complete); > > // implementation > DATA * my_impl_read_data_from_file_sync(FILEHANDLE file) { > // how to implement? > } > ``` > The answer is, it's impossible to implement this function. Usually we try to find a sync version API, or launch a thread to call the async function and sync-wait on the main thread. Unfortunately, in browser environment, neither is possible. > > WebGPU does not offer any synchronized API for data downloading (GPU to CPU). This is the only operation that MUST be async. As `OrtRun()` will eventually call into DataTransfer for copy data from GPU to CPU, and `OrtRun()` is a synchronized function, this cannot be done in normal way. What is Emscripten? How is the Asyncify feature resolved the problem? > Emscripten is the C/C++ compiler for WebAssembly. It's what we use to compile ORT and generates the WebAssembly artifacts which runs on browsers. > > Asyncify is a [compiler feature](https://emscripten.org/docs/porting/asyncify.html) that allows calling async functions from a synchronized context. In short, it generates code to unwind and rewind call stack to emulate async execution. With this feature, we are able to call the async function inside `OrtRun()` call. ## Design Overview Inter-op JSEP is doing pretty much same thing to just another EP. It exposes an interface for inter-op with JavaScript, which is defined in onnxruntime/wasm/js_internal_api.js: ```js // init JSEP Module["jsepInit"] = function (backend, alloc, free, copy, copyAsync, createKernel, releaseKernel, run) { Module.jsepBackend = backend; Module.jsepAlloc = alloc; Module.jsepFree = free; Module.jsepCopy = copy; Module.jsepCopyAsync = copyAsync; Module.jsepCreateKernel = createKernel; Module.jsepReleaseKernel = releaseKernel; Module.jsepRun = run; }; ``` This simple JavaScript snippet defines all language barrier level functions that requires by JSEP to achieve implementing kernels and data transfers using JavaScript inside ONNX Runtime: - `jsepBackend`: assign the singleton object to webassembly module - `jsepAlloc` and `jsepFree`: implementation of data transfer's Alloc() and Free() - `jsepCopy`: synchronized copy ( GPU to GPU, CPU to GPU) - `jsepCopyAsync`: asynchronized copy ( GPU to CPU) - `jsepCreateKernel` and `jsepReleaseKernel`: a corresponding object that maintained in JS to match lifecycle of Kernel in ORT - `jsepRun`: OpKernel::Compute() should call into this The abstraction above allows to tie as little as possible connections and dependencies between C/C++ and TypeScript/JavaScript. Resource Management Lifecycle of tensor data and kernels are managed by ORT(C/C++) but the implementation are left to JavaScript. JavaScript code are responsible to implement the callbacks correctly. For WebGPU, the GPU data is managed by JavaScript using a singleton map (tensot_data_id => GPUBuffer). GPU pipeline is managed as singleton. Shaders are managed using a singletonmap (shader_key => gpu_program), while shader_key is generated by cache_key (OP specific, including attributes) and input shapes. about data transfer `js::DataTransfer::CopyTensor` implemented to call either synchronized or asynchronized copy callback, depending on the destination is GPU or not. Emscripten's macro `EM_ASYNC_JS` is used to wrap the async function to be called in the synchronized context. run kernel in JS Kernel class constructor calls once `jsepCreateKernel()` with an optional per-kernel specific serialization to pass attributes into JavaScript. `Compute()` are implemented in a way that a metadata serialization is performed in a base class and JavaScript code can access the data using the Emscripten specific builtin macro `EM_ASM_`. disabled features* memory pattern is force disabled, because the WebGPU data is not presented by a general memory model (a buffer can be represented by offset + size). concurrent run support is disabled. WebGPU is stateful and it also has async function call. To support concurrent run will significantly increase the complexity and we don't get any real benefit from it. prefer channels last JSEP prefers channels last and returns `DataLayout::NHWC` in method `GetPreferredLayout()`. This will let the graph transformers to preprocess the graph into a channels last form so that a more optimized WebGPU shader can be used. Testing code It's impossible to test JSEP directly because JSEP itself does not contain any kernel implementation. However, it has the kernel registration which need to work together with the corresponding JavaScript code. There are unit tests that run onnx models from JavaScript API. --------- Co-authored-by: Scott McKay <skottmckay@gmail.com>	2023-04-24 15:21:18 -07:00
Yulong Wang	a631ed77c0	[js/web] support flag 'optimizedModelFilePath' in session options (#14355 ) ### Description * Support flag 'optimizedModelFilePath' in session options. In Node.js, the model will be saved into filesystem just like its behaviour on native platforms. In browser, the new model is not saved to filesystem. the file path is ignored. Instead, a new pop-up window will be launched in browser and user can 'save' the file as onnx model. * Add corresponding commandline args for the following session option flags: - optimizedModelFilePath - graphOptimizationLevel	2023-02-24 15:50:15 -08:00
Yulong Wang	82786baed1	[js/web] add 'xnnpack' to EP list (#12723 ) Description: This PR adds support for "XNNPACK EP" in ORTWeb and changes the behavior of how ORTWeb deals with "backends", or "EPs" in API. Background: Term "backend" is introduced in ONNX.js to representing a TypeScript type which implements a "backend" interface, which is a similar but different concept to ORT's EP (execution provider). There was 3 backends in ONNX.js: "cpu", "wasm" and "webgl". When ORT Web is launched, the concept is derived to help users to integrate smoothly. Technically, when "wasm" backend is used, users need to also specify "EP" in the session options. Considering it may get complicated and confused for users to figure out the difference between "backend" and "EP", the JS API hide the "backend" concept and made a mapping between names, backends and EPs: "webgl" (Name) <==> "onnxjsBackend" (Backend) "wasm" (Name) <==> "wasmBackend" (Backend) <==> "CPU" (EP) Details: The following changes are applied in this PR: 1. allow multi-registration for backends using the same name. This is for use scenarios where both "onnxruntime-node" and "onnxruntime-web" are consumed in a Node.js App ( so "cpu" will be registered twice in this scenario. ) 2. re-assign priority values to backends. I give 100 as base to "cpu" for node and react_native, and 10 as base to "cpu" in web. 3. add "cpu", "xnnpack" as new names of backends. 4. update onnxruntime wasm exported functions to support EP registration. 5. update implementations in ort web to handle execution providers in session options. 6. add '--use_xnnpack' as default build flag for ort-web	2022-10-03 10:38:45 -07:00
Du Li	57b7ab56cd	Adding async fetching for webgl backend (#8951 ) * Adding async fetching for webgl backend * fix PR comments and CI failure. * fixing a bug * adding a flag	2021-09-09 22:17:42 -07:00
Yulong Wang	206537936f	[js/web] enable proxy worker for wasm backend (#8862 )	2021-08-31 10:23:42 -07:00
Sunghoon	a16c681103	[js/web] Prepare to integrate ONNX Runtime Web CI with BrowserStack (#8843 ) * Integrate BrowserStack with ONNX Runtime Web CI pipeline * Change to Linux command for BrowserStack CI * Set preferTriggeringPipeline as true * Fix a commit fetching script * Remove wasm binary download from the latest build * Use release build of WebAssembly * Disable check-out of commit for testing * Use commit of WebAssembly build CI pipeline * Need to issue two PRs to prevent build failure	2021-08-26 11:57:31 -07:00
Yulong Wang	e66846da4a	revise terms according to guideline	2021-07-23 13:26:15 -07:00
Gao, Chun	0f01de3b0b	[js/web] Add wasm SIMD backend to onnxruntime-web (#7896 ) * [js/web] Add wasm SIMD backend to onnxruntime-web * Import SIMD wasm artifacts enabled by PR #7839 * Detect SIMD capability of web engine * Use SIMD wasm backend in both single-thread and multi-thread cases * update optimized SIMD loading from ort web * code lint and format * fix WasmFileName in CI * replace deprecated wasm SIMD functions * fix unittest for simd * optimize CI pipeline to merge build matrix * make clean build for each config * fix simd wasm to enable it. * update script/pull-prebuilt-wasm-artifacts.ts Co-authored-by: Yulong Wang <yulongw@microsoft.com> Co-authored-by: Lei Zhang <zhang.huanning@hotmail.com>	2021-06-07 23:24:27 -07:00
Tixxx	6d9f541442	[JS]moved logging level flag to global env (#7700 ) * moved logging level flag to global env * added setter and getter for loggingLevel in Env * moved implementation of env to a separate file	2021-05-17 14:16:59 -07:00
Yulong Wang	97d9bcd644	[js/web] fix bundle for multi-thread, add e2e test and support nodejs (#7688 ) * fix bundle for multi-thread, add e2e test and support nodejs * add copyright banner * resolve comments * add comments for isMultiThreadSupported()	2021-05-14 18:15:38 -07:00
Yulong Wang	bdefc6c4d8	[js/web] support multi-thread for wasm backend (#7601 )	2021-05-07 12:12:37 -07:00
Yulong Wang	4ebc9c3b5e	[JS] onnxruntime-web (#7394 ) * add web * add script and test * fix lint * add test/data/ops * add test/data/node/ to gitignore * modify scripts * add onnxjs * fix tests * fix test-runner * fix sourcemap * fix onnxjs profiling * update test list * update README * resolve comments * set wasm as default backend * rename package * update copyright header * do not use class "Buffer" in browser context * revise readme	2021-04-27 00:04:25 -07:00

16 commits