onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-17 21:10:43 +00:00

Author	SHA1	Message	Date
Wanming Lin	82681205e4	[WebNN] Fix MLTensorUsage is undefined issue (#22831) `MLTensorUsage` has been removed from Chromium: https://chromium-review.googlesource.com/c/chromium/src/+/6015318, but we still need to remain compatible with old Chrome versions, so just make it `undefined` for the latest Chrome version.	2024-11-13 20:22:22 -08:00
Wanming Lin	e6e94e6252	[WebNN EP] Use boolean flags instead of MLTensorUsage (#22497) Fixed #22495. We will keep MLTensorUsage until it is removed from Chromium. --------- Co-authored-by: Dwayne Robinson <fdwr@hotmail.com>	2024-10-22 17:20:36 -07:00
Wanming Lin	52b77762bd	[WebNN EP] Remove the numThreads option (#22464) Chromium has removed this option via https://chromium-review.googlesource.com/c/chromium/src/+/5905656.	2024-10-17 07:45:39 -07:00
Yang Gu	9e5153b688	[js/webgpu] Manage model download with a specific unittest option (#22214) Currently, in debug mode, the unit tests always download models to the local file system, which is a bit annoying. This PR fixes this by adding a specific option to enable model download.	2024-09-30 18:27:43 -07:00
Enrico Galli	52a8c1cae8	[WebNN EP] Enable IO Bindings with MLTensor (#21301) ### Description Enables using the MLTensor to pass data between models. ### Motivation and Context Using MLTensor instead of ArrayBuffers reduces the number of copies between the CPU and devices as well as the renderer and GPU process in Chromium.	2024-09-27 17:24:21 -07:00
Yulong Wang	ef2ccc477b	[js/web] Add support for int4/uint4 tensor (#21720) ### Description Add support for int4/uint4 tensor.	2024-08-15 21:32:10 -07:00
Yulong Wang	abdc31de40	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) ### Description See `454996d496` for manual changes (excluded auto-generated formatting changes) ### Why Because the toolsets for the old clang-format are out of date, which reduces development efficiency. - The NPM package `clang-format` is already in maintenance mode and has not been updated for 2 years. - The VSCode extension for clang-format has not been maintained for a while, and a recent Node.js security update made it stop working entirely on Windows. No one in the community seems interested in fixing those. Prettier was chosen because it is the most popular TS/JS formatter. ### How to merge It's easy to break the build: - Be careful of any new commits on main not included in this PR. - Be careful that after this PR is merged, other PRs that already passed CI can merge. So, make sure there are no new commits before merging this one, and invalidate js PRs that already passed CI, forcing them to merge with latest.	2024-08-14 16:51:22 -07:00
Yulong Wang	5e66fcc703	[js/web] allow op test to use f16 type for inputs/outputs (#21664) ### Description allow op test to use f16 type for inputs/outputs. This PR introduces "@petamoriken/float16" as a Float16Array polyfill but restricts it to the test runner only.	2024-08-08 09:56:37 -07:00
Yulong Wang	4385602386	[js/web] fix test runner with optional input/output (#20399) ### Description fix test runner with optional input/output. This change fixes the OP test runner (.jsonc format test) with optional input(s) and/or output(s). This fix reveals a problem in dealing with optional outputs: > Take SkipSimplifiedLayerNorm as an example: > > if in the ONNX model, the node's outputs are: [ 'output_0', '' ] instead of [ 'output_0' ], the current implementation will fail. The difference is, in the first case, context.outputCount == 2, and then the TypeScript implementation will try to create a tensor for output[1]. It will eventually call into the C++ function (OpKernelContext::Output), and the output.DataRaw() will be nullptr. The WebGPU backend will fail because it cannot deal with a TensorView with data == 0. > This problem may need to be fixed or worked around in a separate PR. This PR does not fix this problem. Failed test cases are modified to work - please note this PR does not break those test cases, as they never worked.	2024-04-22 12:53:10 -07:00
Yulong Wang	28907d8c59	[js/web] workaround NPM test fetch failure (#20020) ### Description Sometimes `npm test` fails with the error "TypeError: Failed to fetch". I checked the callback entry of the localhost server started by karma. When the "Failed to fetch" happens, no request is reflected on the server side. The root cause is still not identified. However, as this issue only happens sometimes when the browser has just been launched by the karma runner, retrying works around this issue most of the time.	2024-03-26 21:35:49 -07:00
Yulong Wang	aec2389ad0	[js/webgpu] allows a ProgramInfo's RunData to use zero sized output (#19614) ### Description This PR allows zero-sized output. To make the implementation simple, it does not support partial zero-sized tensor. Which means, either all outputs are zero-sized, or an error will be reported. added 2 tests: - op test of `Add` with input T[2,0] T[2,1], and - test_split_zero_size_splits	2024-02-23 12:52:47 -08:00
satyajandhyala	ae3d73c981	[JS/WebGPU] Fix Split and Where to handle corner cases. (#19613) ### Description 1. Fix Where operator to handle Boolean input less than 4 bytes. 2. Fix JSEP test harness to use tensor names consistently.	2024-02-23 00:21:15 -08:00
Yulong Wang	70567a4b3a	[js/web] use ApiTensor insteadof onnxjs Tensor in TensorResultValidator (#19358) ### Description Use ApiTensor instead of onnxjs Tensor in TensorResultValidator. This makes the test runner less dependent on onnxjs classes.	2024-02-20 17:33:21 -08:00
Yulong Wang	50806a7dd5	[js/web] support external data in npm test (#19377) ### Description support external data in npm test. This allows the test runner to detect whether external data is available in the test folder and, if it is, load it as external data automatically. This feature does not parse every model to figure out whether the model has external data. The following comments in the code explain how to determine whether the model file should be parsed. ```js // for performance consideration, we do not parse every model. when we think it's likely to have external // data, we will parse it. We think it's "likely" when one of the following conditions is met: // 1. any file in the same folder has the similar file name as the model file // (e.g., model file is "model_abc.onnx", and there is a file "model_abc.pb" or "model_abc.onnx.data") // 2. the file size is larger than 1GB ```	2024-02-02 09:05:57 -08:00
Yulong Wang	f917dde717	[web] remove xnnpack from web backends (#19116) ### Description XNNPACK is already disabled in the WebAssembly build. This change removes the xnnpack backend registration in JS.	2024-01-13 23:04:02 -08:00
zesongw	3eec1592bd	[WebNN EP] Update WebNN unit test list (#19103) Update the WebNN test list in suite-test-list.jsonc so that all test cases pass with the WebNN CPU backend on Chrome Stable (although some cases may fall back to the CPU EP). Enable int64 support for WebNN in unit tests.	2024-01-12 10:22:38 -08:00
zesongw	ad6dd0a597	[WebNN] Enable npm unit tests (#18486) ### Description - Support more test cases for WebNN EP in suite-test-list.jsonc - Add DISABLE_WEBNN flag in build.ts in preparation for the WebNN EP release - Add test option '--webnn-device-type' in test-runner-args-cli.ts to support running the WebNN 'gpu' deviceType - Use Chrome Stable as the default browser for WebNN testing to unblock the CI limitation.	2024-01-09 10:10:57 -08:00
Yulong Wang	9a61388f0a	[js/web] revise backend registration (#18715) ### Description This PR revises the backend registration. The following describes the expected behavior after this change: (bolded are changed behavior) - (ort.min.js - built without webgpu support) - loading: do not register 'webgpu' backend - creating session without EP list: use default EP list ['webnn', 'cpu', 'wasm'] - creating session with ['webgpu'] as EP list: should fail with backend not available - (ort.webgpu.min.js - built with webgpu support) - loading: always register 'webgpu' backend (previous behavior: only register 'webgpu' backend when `navigator.gpu` is available) - creating session without EP list: use default EP list ['webgpu', 'webnn', 'cpu', 'wasm'] - when WebGPU is available (win): use WebGPU backend - when WebGPU is unavailable (android): should fail backend init, and try to use next backend in the list, 'webnn' (previous behavior: does not fail backend init, but fail in JSEP init, which was too late to switch to next backend) - creating session with ['webgpu'] as EP list - when WebGPU is available (win): use WebGPU backend - when WebGPU is unavailable (android): **should fail backend init, and because no more EP listed, fail.** related PRs: #18190 #18144	2023-12-20 14:45:55 -08:00
Scott McKay	4f2096be38	Update XNNPACK to latest version (#18038) ### Description Update XNNPACK to latest version - adds fp16 kernels and various other improvements - requires pthreadpool update as well Most code updates in the XNNPACK EP are to adjust to the new XNNPACK API - 'setup' is split into 'reshape' and 'setup' - some ops use a workspace buffer - copied workspace allocation from XNNPACK unit test code - some suffixes changed Added wrapper for XNNPACK caches to base XNNPACK EP kernel - simplifies usage - XNNPACK split out the code and weights caches, but the code cache isn't currently usable via the public API - we could use the internal types if we think it's required for performance reasons. non-trivial though as we'd need to propagate ifdef values from the XNNPACK build up to the ORT build. - using XNNPACK internals would also mean we would not be able to support using a pre-built XNNPACK package - not an issue currently Fixed opset registration for internal NHWC domain - was not being tied to the ONNX version, so nodes inserted by layout transformation had the incorrect opset - a number of other places needed updating once this issue was fixed Remove support for NCHW Resize from XNNPACK EP so it's NHWC only - we only supported NCHW for fp32, - doing so adds complexity in multiple places (XNNPACK EP kernel implementation, layout transformation and transpose optimization) - unclear if that complexity provides any benefit. can add back if required by production scenario ### Motivation and Context We're looking at enabling fp16 support for CoreML and NNAPI. If we do that we need a good fallback story if the CPU EP will be used. The XNNPACK fp16 kernels will hopefully provide that. NOTE: This PR doesn't add fp16 support to the XNNPACK EP kernels.
That can be done as required in separate EPs and should be relatively simple to do.	2023-11-03 09:04:28 -07:00
Yulong Wang	561aca97cf	[js/webgpu] support IO binding (#17480 ) <del> This PR is based on a few prerequisites PRs. They are listed as below: - #17465 - #17469 - #17470 - #17472 - #17473 - #17484 Please review the current change by only looking at commit e2e6623e673ec6de55a5c1f8edcbd3a46b535a89 and later. </del> ### Description This PR introduces WebGPU IO binding. This new feature allows onnxruntime-web users to use tensors created from GPU as model input/output so that a model inferencing can be done without unnecessary data copy between CPU and GPU for model input/output. ### Examples An E2E demo/example is being worked on. Following is some simple demo with code snippet. Let's first check today how we do: ```js // STEP.1 - create an inference session: const mySession = await ort.InferenceSession.create('./my_model.onnx', { executionProviders: ['webgpu'] }); // STEP.2 - create model input: (supposing myImageCpuData is a Float32Array) const feeds = { 'input_image:0': new ort.Tensor('float32', myImageCpuData, [1, 224, 224, 3]) }; // STEP.3 - run model const myResults = await mySession.run(feeds); // STEP.4 - get output data const myData = myResults['output_image:0'].data; // Float32Array ``` #### for inputs (GPU tensor): Now, with IO binding, you can create a tensor from a GPU buffer, and feed it to the model: ```js // new STEP.2.A - create model input from a GPU buffer: (supposing myInputGpuBuffer is a `GPUBuffer` object with input data) const feeds = { 'input_image:0': ort.Tensor.fromGpuBuffer(myInputGpuBuffer, { dataType: 'float32', dims: [1, 224, 224, 3] }) }; ``` ### for outputs (pre-allocated GPU tensor) you can also do that for output, if you know the output shape: ```js // new STEP.2.B - create model output from a GPU buffer: (supposing myOutputGpuBuffer is a pre-allocated `GPUBuffer` object) const fetches = { 'output_image:0': ort.Tensor.fromGpuBuffer(myOutputGpuBuffer, { dataType: 'float32', dims: [1, 512, 512, 3] }) }; // new STEP.3 - run model with 
pre-allocated output (fetches) const myResults = await mySession.run(feeds, fetches); ``` ### for outputs (specify location) if you do not know the output shape, you can specify the output location when creating the session: ```js // new STEP.1 - create an inference session with an option "preferredOutputLocation": const mySession = await ort.InferenceSession.create('./my_model.onnx', { executionProviders: ['webgpu'], preferredOutputLocation: "gpu-buffer" }); ``` if the model has multiple outputs, you can specify them separately: ```js // new STEP.1 - create an inference session with an option "preferredOutputLocation": const mySession = await ort.InferenceSession.create('./my_model.onnx', { executionProviders: ['webgpu'], preferredOutputLocation: { "output_image:0": "gpu-buffer" } }); ``` now you don't need to prepare the `fetches` object and onnxruntime-web will prepare output data at the specified location. #### read data when you get the output tensor, you can: ```js // get the gpu buffer object: const gpuBuffer = myOutputTensor.gpuBuffer; // GPUBuffer // get the CPU data asynchronously const cpuData = await myOutputTensor.getData(); // get the CPU data asynchronously and release the underlying GPU resources const cpuData = await myOutputTensor.getData(true); // dispose the tensor (release the underlying GPU resources). This tensor object will be invalid after dispose() is called. myOutputTensor.dispose(); ``` #### resource management JavaScript has GC so you don't need to worry about managing JavaScript objects. But there are 2 types of resources that are not managed by GC: - GPU buffers that are used in tensors - Underlying ORT native resources To simplify, most of the unmanaged resources are handled inside ORT web. But there are a few resources that need users to manage: - All external GPU resources, including GPU buffers inside all tensors created by `Tensor.fromGpuBuffer()`, will not be managed by ORT. Users should manage those GPU buffers themselves.
- When a session is created with `preferredOutputLocation` == "gpu-buffer" specified in session options, and the corresponding output is not pre-allocated, users need to call the output tensor's `dispose()` or `getData(true)` to manually release the underlying GPU buffers. - ORT internal errors (including providing a pre-allocated output tensor with wrong type/dims) will invalidate the whole wasm memory and are not recoverable. An exception is thrown in this situation.	2023-09-29 11:24:42 -07:00
Yulong Wang	f923eec28b	[js/web] release session after use in npm test (#17470) ### Description release session after use in npm test. This is one of the prerequisites for supporting IO binding for WebGPU buffer in onnxruntime-web. list of prerequisite PRs: #17465 #17469 #17470 (this one)	2023-09-12 16:59:13 -07:00
satyajandhyala	7d1a5635a0	[JS/Web] Added SkipLayerNormalization operator. (#17102) ### Description Add SkipLayerNormalization operator to JSEP.	2023-08-18 09:59:03 -07:00
Yulong Wang	cbee84ddfb	[js/web] allow optional input/output in operator test (#17184) ### Description allow optional input/output in operator test	2023-08-16 11:50:11 -07:00
Yulong Wang	35363dd9a5	[js/web] a few optimizations for test runner (#17174) ### Description 1. allows passing session options to operator test (eg. graph optimization level) 2. add a short flag '-x' for '--wasm-number-threads' as it is frequently used.	2023-08-15 21:00:23 -07:00
Yulong Wang	641c3a4a37	[js/web] update op test schema (#16921) ### Description update op test schema. This change fixes several problems with operator tests for web: - `opsets` -> `opset`: an operator uses exactly one opset instead of multiple - `condition` -> `platformCondition`: make it less confusing - `inputShapeDefinitions`: allows testing ORT behavior when it gets no/partial/full shape info. Added a JSON schema file and also an example file	2023-08-03 14:20:20 -07:00
Guenther Schmuelling	0df2e14038	js/webgpu: argmax,argmin,softmax support (#16882) argmax and argmin are similar to reduce. Eventually we need to add optimized flavors of the shader. softmax is optimized but only works on the last axis for now which should be the common use case. todo: enable more ut for argmax/argmin	2023-08-02 18:16:19 -07:00
Yulong Wang	7dcb805ab8	[js/web] upgrade onnx-proto version (#16722) ### Description This change upgrades a lot of dependencies. There are 2 motivations for this change: - fix the security issue reported by dependabot (protobufjs Prototype Pollution vulnerability - https://github.com/advisories/GHSA-h755-8qp9-cq85) - resolve the requirement of using ONNX IR_VERSION 9 (#16638) This requires: - upgrade protobufjs to v7.2.4 - upgrade library 'onnx-proto' to consume latest ONNX release (v1.14.0). Problems: - protobufjs v7.2.4 depends on long.js v5, which does not work well with typescript (commonjs). - onnx-proto depends on this fix with a new release of long.js - long.js is in maintenance and it takes longer than expected to put in new changes Solutions: - use a patch script in `preprepare` to copy type declarations to make long.js work with typescript (commonjs) - generate onnx protobuf JS/TS files and put them under js/web/lib/onnxjs/ort-schema/protobuf folder - remove 'onnx-proto' from dependency. - apply fixes to generated onnx.d.ts	2023-07-18 16:36:39 -07:00
Yulong Wang	d13f3153d7	[js/webgpu] enable op test for webgpu (#16542) ### Description This change enables the JSON-format operator tests for webgpu. Usage: ``` npm test -- op abs.jsonc -b=webgpu ```	2023-07-06 08:35:19 -07:00
Yulong Wang	b8917ad84f	[js/web] fix nodejs detection (#16400) ### Description We used to use `typeof fetch === 'undefined'` as the condition to detect whether the environment is Node.js. Before Node.js v18, this worked. However, Node.js v18 introduced the `fetch` function, so this check no longer works. This PR changes the condition to check whether `process`, `process.versions` and `process.versions.node` exist. Checking whether `process` exists is not enough, because in some configurations webpack may polyfill Node.js's `process`.	2023-06-20 00:20:58 -07:00
Wanming Lin	00b1e79e04	Support WebNN EP (#15698) Description: This PR intends to enable the WebNN EP in ONNX Runtime Web. It translates ONNX nodes into [WebNN API](https://webmachinelearning.github.io/webnn/) calls; the EP is implemented in C++ and uses the Emscripten [Embind API](https://emscripten.org/docs/porting/connecting_cpp_and_javascript/embind.html#). It temporarily uses the preferred layout NHWC for WebNN graph partitions because of a restriction in the WebNN XNNPack backend implementation and the ongoing [discussion](https://github.com/webmachinelearning/webnn/issues/324) in the WebNN spec about whether WebNN should support both 'NHWC' and 'NCHW' layouts. No WebNN native EP, only for Web. Motivation and Context: Allow ONNXRuntime Web developers to access the WebNN API to benefit from hardware acceleration. WebNN API Implementation Status in Chromium: - Tracked in Chromium issue: [#1273291](https://bugs.chromium.org/p/chromium/issues/detail?id=1273291) - CPU device: based on the XNNPack backend, and has been available on Chrome Canary M112 behind the "#enable-experimental-web-platform-features" flag for Windows and Linux platforms. Further implementation for more ops is ongoing. - GPU device: based on DML, implementation is ongoing. Open: - GitHub CI: WebNN currently is only available on Chrome Canary/Dev with the XNNPack backend for Linux and Windows. This is an open question for reviewers: please help identify which GitHub CI should involve the WebNN EP and guide me in enabling it. Thanks!	2023-05-08 21:25:10 -07:00
Yulong Wang	14cc02c65c	[js/web] WebGPU backend via JSEP (#14579) ### Description This change introduces the following new components into ONNX Runtime Web: - JavaScript Execution Provider (JSEP) - Asynchronous inferencing execution powered by Emscripten's Asyncify - WebGPU backend implemented in TypeScript - initial implementation of kernels: - elementwise operators (22) - binary operators (5) - tensor: Shape, Reshape, Transpose, Gemm - nn: Conv, {Global}Maxpool, {Global}AveragePool The code needs to be polished; still working on it. ## Q&A What is JSEP? > JSEP, aka JavaScript Execution Provider, is a new ONNXRuntime execution provider that specifically works in Web environments (browsers). JSEP allows JavaScript code to kick in from various places when ONNX Runtime inferences a model. Why JSEP? > JSEP is a hybrid mode EP that contains both C/C++ and TypeScript/JavaScript implementations. There are 2 strong reasons why we introduce JSEP: > 1. the C/C++ part helps JSEP leverage ONNX Runtime's capabilities as much as possible, including graph transformers, optimizers and the capability to fall back to the CPU EP. TypeScript/JavaScript makes it much easier to develop and debug the kernel implementations in the browser. > 2. the requirement of asynchronous execution from JavaScript APIs (eg. `buffer.mapAsync()`) makes it impossible to run `OrtRun()` in a synchronous context (see "async problem" section below). This is solved by using Emscripten's Asyncify. What is WebGPU? > WebGPU is the new GPU API available in browsers. It's one of the only 2 APIs currently available to access the GPU from the browser (the other is WebGL). > WebGPU is designed with more advanced and stronger features compared to WebGL and is potentially the solution that offers the best GPU performance for model inferencing currently available. What is the async problem and why do we have it?
> The "async problem" is that you cannot call an async function in a synchronous context. Think about the following C++ code: > ```c > // C-style declarations (API) > typedef void (ON_COMPLETE)(PVOID state, DATA data); > void read_data_from_file(FILEHANDLE file, ON_COMPLETE on_complete); > > // implementation > DATA * my_impl_read_data_from_file_sync(FILEHANDLE file) { > // how to implement? > } > ``` > The answer is, it's impossible to implement this function. Usually we try to find a sync version of the API, or launch a thread to call the async function and sync-wait on the main thread. Unfortunately, in the browser environment, neither is possible. > > WebGPU does not offer any synchronous API for data downloading (GPU to CPU). This is the only operation that MUST be async. As `OrtRun()` will eventually call into DataTransfer to copy data from GPU to CPU, and `OrtRun()` is a synchronous function, this cannot be done in the normal way. What is Emscripten? How does the Asyncify feature resolve the problem? > Emscripten is the C/C++ compiler for WebAssembly. It's what we use to compile ORT and generate the WebAssembly artifacts which run in browsers. > > Asyncify is a [compiler feature](https://emscripten.org/docs/porting/asyncify.html) that allows calling async functions from a synchronous context. In short, it generates code to unwind and rewind the call stack to emulate async execution. With this feature, we are able to call the async function inside the `OrtRun()` call. ## Design Overview Inter-op JSEP does pretty much the same thing as any other EP.
It exposes an interface for inter-op with JavaScript, which is defined in onnxruntime/wasm/js_internal_api.js: ```js // init JSEP Module["jsepInit"] = function (backend, alloc, free, copy, copyAsync, createKernel, releaseKernel, run) { Module.jsepBackend = backend; Module.jsepAlloc = alloc; Module.jsepFree = free; Module.jsepCopy = copy; Module.jsepCopyAsync = copyAsync; Module.jsepCreateKernel = createKernel; Module.jsepReleaseKernel = releaseKernel; Module.jsepRun = run; }; ``` This simple JavaScript snippet defines all the language-barrier-level functions required by JSEP to implement kernels and data transfers using JavaScript inside ONNX Runtime: - `jsepBackend`: assign the singleton object to the WebAssembly module - `jsepAlloc` and `jsepFree`: implementation of data transfer's Alloc() and Free() - `jsepCopy`: synchronous copy (GPU to GPU, CPU to GPU) - `jsepCopyAsync`: asynchronous copy (GPU to CPU) - `jsepCreateKernel` and `jsepReleaseKernel`: a corresponding object that is maintained in JS to match the lifecycle of the Kernel in ORT - `jsepRun`: OpKernel::Compute() should call into this The abstraction above keeps the connections and dependencies between C/C++ and TypeScript/JavaScript as few as possible. Resource Management The lifecycle of tensor data and kernels is managed by ORT (C/C++) but the implementation is left to JavaScript. JavaScript code is responsible for implementing the callbacks correctly. For WebGPU, the GPU data is managed by JavaScript using a singleton map (tensor_data_id => GPUBuffer). The GPU pipeline is managed as a singleton. Shaders are managed using a singleton map (shader_key => gpu_program), where shader_key is generated from cache_key (OP specific, including attributes) and input shapes. about data transfer `js::DataTransfer::CopyTensor` is implemented to call either the synchronous or the asynchronous copy callback, depending on whether the destination is GPU or not.
Emscripten's macro `EM_ASYNC_JS` is used to wrap the async function to be called in the synchronous context. run kernel in JS The kernel class constructor calls `jsepCreateKernel()` once, with an optional per-kernel specific serialization to pass attributes into JavaScript. `Compute()` is implemented in a way that a metadata serialization is performed in a base class and JavaScript code can access the data using the Emscripten specific builtin macro `EM_ASM_`. disabled features memory pattern is force disabled, because the WebGPU data is not presented by a general memory model (a buffer can be represented by offset + size). concurrent run support is disabled. WebGPU is stateful and it also has async function calls. Supporting concurrent runs would significantly increase the complexity and we don't get any real benefit from it. prefer channels last JSEP prefers channels last and returns `DataLayout::NHWC` in method `GetPreferredLayout()`. This lets the graph transformers preprocess the graph into a channels last form so that a more optimized WebGPU shader can be used. Testing code It's impossible to test JSEP directly because JSEP itself does not contain any kernel implementation. However, it has the kernel registration, which needs to work together with the corresponding JavaScript code. There are unit tests that run onnx models from the JavaScript API. --------- Co-authored-by: Scott McKay <skottmckay@gmail.com>	2023-04-24 15:21:18 -07:00
Yulong Wang	a631ed77c0	[js/web] support flag 'optimizedModelFilePath' in session options (#14355) ### Description * Support flag 'optimizedModelFilePath' in session options. In Node.js, the model will be saved to the filesystem, just like its behaviour on native platforms. In the browser, the new model is not saved to the filesystem and the file path is ignored. Instead, a new pop-up window will be launched in the browser and the user can 'save' the file as an onnx model. * Add corresponding commandline args for the following session option flags: - optimizedModelFilePath - graphOptimizationLevel	2023-02-24 15:50:15 -08:00
Yulong Wang	82786baed1	[js/web] add 'xnnpack' to EP list (#12723) Description: This PR adds support for "XNNPACK EP" in ORTWeb and changes how ORTWeb deals with "backends", or "EPs", in the API. Background: The term "backend" was introduced in ONNX.js to represent a TypeScript type which implements a "backend" interface, which is a similar but different concept to ORT's EP (execution provider). There were 3 backends in ONNX.js: "cpu", "wasm" and "webgl". When ORT Web was launched, the concept was carried over to help users integrate smoothly. Technically, when the "wasm" backend is used, users also need to specify the "EP" in the session options. Considering it may get complicated and confusing for users to figure out the difference between "backend" and "EP", the JS API hides the "backend" concept and makes a mapping between names, backends and EPs: "webgl" (Name) <==> "onnxjsBackend" (Backend) "wasm" (Name) <==> "wasmBackend" (Backend) <==> "CPU" (EP) Details: The following changes are applied in this PR: 1. allow multi-registration for backends using the same name. This is for use scenarios where both "onnxruntime-node" and "onnxruntime-web" are consumed in a Node.js App (so "cpu" will be registered twice in this scenario). 2. re-assign priority values to backends. I give 100 as base to "cpu" for node and react_native, and 10 as base to "cpu" in web. 3. add "cpu", "xnnpack" as new names of backends. 4. update onnxruntime wasm exported functions to support EP registration. 5. update implementations in ort web to handle execution providers in session options. 6. add '--use_xnnpack' as default build flag for ort-web	2022-10-03 10:38:45 -07:00
Yulong Wang	af21a04977	[js] upgrade async@3.2.3 /js/ (#11421) * [js] upgrade async@3.2.3 /js/ * format code	2022-05-03 23:41:36 -07:00
Yulong Wang	a3ebc5e082	[js/web] do not use nodejs type 'Buffer' in web (#9839) * [js/web] do not use nodejs type 'Buffer' in web * resolve comments and validate tests * remove 'Buffer' in test	2021-11-24 14:14:42 -08:00
Sunghoon	450524359e	[js/web] WebAssembly profiling (#8932) * add p50 in test * Preallocate WebAssembly worker threads to minimize worker creation time * WebAssembly profiling * merge master * merge with proxy changes * disable profiling tests from WebAssembly build * fix e2e test failure Co-authored-by: Yulong Wang <yulongw@microsoft.com>	2021-09-07 17:18:08 -07:00
Yulong Wang	3e8cabbc3e	[js/web] WebGL backend refactor (#8586)	2021-08-12 12:30:49 -07:00
Du Li	6a9023f47d	[JS/Web]Adding support for WebGL v1 (#7906) * Adding support for WebGL v1 * enabling a few tests * Minor changes for README.md	2021-06-03 21:30:42 -07:00
Yulong Wang	896f32ec09	[js/web] support string tensor for wasm backend (#7891) * [js/web] support string tensor for wasm backend * disable v9/test_cast_STRING_to_FLOAT: test data is wrong * add non-string check * Update session-handler.ts * Update session-handler.ts	2021-06-03 00:44:50 -07:00
Yulong Wang	bdefc6c4d8	[js/web] support multi-thread for wasm backend (#7601)	2021-05-07 12:12:37 -07:00
Yulong Wang	3600c3e66e	[js/web] integrate latest changes from onnxjs (#7535) * [js/web] integrate latest changes from onnxjs * apply ESLint rules: filename-case and header * remove filename-case rule for wasm .d.ts	2021-05-03 15:03:25 -07:00
Yulong Wang	4ebc9c3b5e	[JS] onnxruntime-web (#7394) * add web * add script and test * fix lint * add test/data/ops * add test/data/node/ to gitignore * modify scripts * add onnxjs * fix tests * fix test-runner * fix sourcemap * fix onnxjs profiling * update test list * update README * resolve comments * set wasm as default backend * rename package * update copyright header * do not use class "Buffer" in browser context * revise readme	2021-04-27 00:04:25 -07:00

42 commits