mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-16 18:31:27 +00:00

History

Scott McKay 4f2096be38 Update XNNPACK to latest version (#18038 ) ### Description <!-- Describe your changes. --> Update XNNPACK to latest version - adds fp16 kernels and various other improvements - requires pthreadpool update as well Most code updates in the XNNPACK EP are to adjust to the new XNNPACK API - 'setup' is split into 'reshape' and 'setup' - some ops use a workspace buffer - copied workspace allocation from XNNPACK unit test code - some suffixes changed Added wrapper for XNNPACK caches to base XNNPACK EP kernel - simplifies usage - XNNPACK split out the code and weights caches, but the code cache isn't currently usable via the public API - we could use the internal types if we think it's required for performance reasons. non-trivial though as we'd need to propagate ifdef values from the XNNPACK build up to the ORT build. - using XNNPACK internals would also mean we would not be able to support using a pre-build XNNPACK package - not an issue currently Fixed opset registration for internal NHWC domain - was not being tied to the ONNX version, so nodes inserted by layout transformation had the incorrect opset - a number of other places needed updating once this issue was fixed Remove support for NCHW Resize from XNNPACK EP so it's NHWC only - we only supported NCHW for fp32, - doing so adds complexity in multiple places (XNNPACK EP kernel implementation, layout transformation and transpose optimization) - unclear if that complexity provides any benefit. can add back if required by production scenario ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> We're looking at enabling fp16 support for CoreML and NNAPI. If we do that we need a good fallback story if the CPU EP will be used. The XNNPACK fp16 kernels will hopefully provide that. NOTE: This PR doesn't add fp16 support to the XNNPACK EP kernels. That can be done as required in separate EPs and should be relatively simple to do.		2023-11-03 09:04:28 -07:00
..
docs	Update XNNPACK to latest version (#18038 )	2023-11-03 09:04:28 -07:00
lib	[js/web] optimize reduce related operators (#17957 )	2023-11-02 12:51:48 -07:00
script	[js/web] optimize tsc for web: split out "npm prepare" (#17955 )	2023-10-16 09:04:54 -07:00
test	Update XNNPACK to latest version (#18038 )	2023-11-03 09:04:28 -07:00
.gitignore	[js/web] add target ort.webgpu.min.js (#15780 )	2023-05-04 10:05:39 -07:00
.npmignore	[js/web] fix a few package consuming problems (#18109 )	2023-10-30 08:11:43 -07:00
karma.conf.js	[js/web] use esbuild to accelerate bundle build (#17745 )	2023-10-06 13:37:37 -07:00
package-lock.json	Bump Up Version to 1.17.0 (#17587 )	2023-09-20 11:02:58 +08:00
package.json	[js/web] fix a few package consuming problems (#18109 )	2023-10-30 08:11:43 -07:00
README.md	[js] enable formatter for more file types (#16888 )	2023-07-28 15:46:58 -07:00
tsconfig.json	[js/web] optimize tsc for web: split out "npm prepare" (#17955 )	2023-10-16 09:04:54 -07:00
types.d.ts	Add "glue" between training WASM artifacts and training web (#17474 )	2023-10-12 11:16:56 -07:00

README.md

ONNX Runtime Web

ONNX Runtime Web is a Javascript library for running ONNX models on browsers and on Node.js.

ONNX Runtime Web has adopted WebAssembly and WebGL technologies for providing an optimized ONNX model inference runtime for both CPUs and GPUs.

Why ONNX models

The Open Neural Network Exchange (ONNX) is an open standard for representing machine learning models. The biggest advantage of ONNX is that it allows interoperability across different open source AI frameworks, which itself offers more flexibility for AI frameworks adoption.

Why ONNX Runtime Web

With ONNX Runtime Web, web developers can score models directly on browsers with various benefits including reducing server-client communication and protecting user privacy, as well as offering install-free and cross-platform in-browser ML experience.

ONNX Runtime Web can run on both CPU and GPU. On CPU side, WebAssembly is adopted to execute the model at near-native speed. ONNX Runtime Web complies the native ONNX Runtime CPU engine into WebAssembly backend by using Emscripten, so it supports most functionalities native ONNX Runtime offers, including full ONNX operator coverage, multi-threading, ONNX Runtime Quantization as well as ONNX Runtime Mobile. For performance acceleration with GPUs, ONNX Runtime Web leverages WebGL, a popular standard for accessing GPU capabilities. We are keeping improving op coverage and optimizing performance in WebGL backend.

See Compatibility and Operators Supported for a list of platforms and operators ONNX Runtime Web currently supports.

Usage

Refer to ONNX Runtime JavaScript examples for samples and tutorials.

Documents

Developement

Refer to the following links for development information:

Compatibility

OS/Browser	Chrome	Edge	Safari	Electron	Node.js
Windows 10	wasm, webgl	wasm, webgl	-	wasm, webgl	wasm
macOS	wasm, webgl	wasm, webgl	wasm, webgl	wasm, webgl	wasm
Ubuntu LTS 18.04	wasm, webgl	wasm, webgl	-	wasm, webgl	wasm
iOS	wasm, webgl	wasm, webgl	wasm, webgl	-	-
Android	wasm, webgl	wasm, webgl	-	-	-

Operators

WebAssembly backend

ONNX Runtime Web currently support all operators in ai.onnx and ai.onnx.ml.

WebGL backend

ONNX Runtime Web currently supports a subset of operators in ai.onnx operator set. See webgl-operators.md for a complete, detailed list of which ONNX operators are supported by WebGL backend.

WebGPU backend

WebGPU backend is still an experimental feature. See webgpu-operators.md for a detailed list of which ONNX operators are supported by WebGPU backend.

License

License information can be found here.