ONNX Runtime Web
ONNX Runtime Web is a JavaScript library for running ONNX models in browsers and on Node.js.
ONNX Runtime Web uses WebAssembly and WebGL to provide an optimized ONNX model inference runtime for both CPUs and GPUs.
Why ONNX models
The Open Neural Network Exchange (ONNX) is an open standard for representing machine learning models. Its biggest advantage is interoperability across different open source AI frameworks, which in turn makes it easier to adopt the framework that best fits a project.
Why ONNX Runtime Web
With ONNX Runtime Web, web developers can score models directly in the browser. Benefits include less server-client communication, better protection of user privacy, and an install-free, cross-platform in-browser ML experience.
ONNX Runtime Web can run on both CPU and GPU. On the CPU side, WebAssembly is used to execute the model at near-native speed. ONNX Runtime Web compiles the native ONNX Runtime CPU engine into a WebAssembly backend using Emscripten, so it supports most of the functionality that native ONNX Runtime offers, including full ONNX operator coverage, multi-threading, ONNX Runtime Quantization, and ONNX Runtime Mobile. For GPU acceleration, ONNX Runtime Web uses WebGL, a popular standard for accessing GPU capabilities. We continue to improve operator coverage and optimize performance in the WebGL backend.
See Compatibility and Operators Supported for a list of platforms and operators ONNX Runtime Web currently supports.
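As a rough illustration of the CPU/GPU split described above, the sketch below builds a session options object that lists backends in priority order. The `SessionOptionsSketch` type and the `buildSessionOptions` helper are hypothetical names used only for this example, `'webgl'` is an assumed identifier for the WebGL backend, and `./model.onnx` is a placeholder path. It assumes the runtime tries the listed backends in order.

```typescript
// Backend identifiers: 'wasm' is named in this README; 'webgl' is an assumed
// identifier for the WebGL backend.
type Backend = 'wasm' | 'webgl';

// Hypothetical, minimal shape of the options used in this sketch.
interface SessionOptionsSketch {
  // Backends to try, in priority order.
  executionProviders: Backend[];
}

// Prefer the GPU backend when requested, and always keep the WebAssembly
// (CPU) backend as the last entry because it has full operator coverage.
function buildSessionOptions(preferGpu: boolean): SessionOptionsSketch {
  return { executionProviders: preferGpu ? ['webgl', 'wasm'] : ['wasm'] };
}

// With the onnxruntime-web package installed, the options could be used as:
//   import * as ort from 'onnxruntime-web';
//   const session = await ort.InferenceSession.create('./model.onnx', buildSessionOptions(true));
```

Keeping `wasm` last means a model with operators the WebGL backend does not cover still has a CPU path available.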
Usage
- See Get started as a landing page for ONNX Runtime Web documentation.
- Refer to ONNX Runtime JavaScript examples for samples and tutorials.
- See also ONNX Runtime Web API reference for detailed API documentation.
Documents
Development
Refer to the following links for development information:
Compatibility
| EPs/Browsers | Chrome/Edge (Windows) | Chrome/Edge (Android) | Chrome/Edge (MacOS) | Chrome/Edge (iOS) | Safari (MacOS) | Safari (iOS) | Firefox (Windows) | Node.js |
|---|---|---|---|---|---|---|---|---|
| WebAssembly (CPU) | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️[1] |
| WebGPU | ✔️[2] | ✔️[3] | ✔️ | ❌ | ❌ | ❌ | ❌ | ❌ |
| WebGL | ✔️[4] | ✔️[4] | ✔️[4] | ✔️[4] | ✔️[4] | ✔️[4] | ✔️[4] | ❌ |
| WebNN | ✔️[5] | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
- [1]: Node.js only supports the single-threaded `wasm` EP.
- [2]: WebGPU requires Chromium v113 or later on Windows. Float16 support requires Chrome v121 or later, or Edge v122 or later.
- [3]: WebGPU requires Chromium v121 or later on Android.
- [4]: WebGL support is in maintenance mode. WebGPU is recommended for better performance.
- [5]: Requires launching the browser with the command-line flag `--enable-features=WebMachineLearningNeuralNetwork`.
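The compatibility table can also be expressed as data, which may help when an application needs to decide which backends are worth trying on a given platform. The following is a minimal sketch: the `Platform` keys and the `candidateProviders` helper are hypothetical, the identifiers `'webgpu'`, `'webgl'` and `'webnn'` are assumed names (only `wasm` appears in the notes above), and the preference order is an assumption of this example rather than something the table prescribes. Footnote conditions such as minimum browser versions are not modeled.

```typescript
// Execution provider identifiers. 'wasm' is named in the notes above; the
// other three strings are assumed names for this sketch.
type ExecutionProvider = 'wasm' | 'webgpu' | 'webgl' | 'webnn';

// Hypothetical keys, one per column of the compatibility table.
type Platform =
  | 'chromium-windows'
  | 'chromium-android'
  | 'chromium-macos'
  | 'chromium-ios'
  | 'safari-macos'
  | 'safari-ios'
  | 'firefox-windows'
  | 'nodejs';

const ALL_PLATFORMS: readonly Platform[] = [
  'chromium-windows', 'chromium-android', 'chromium-macos', 'chromium-ios',
  'safari-macos', 'safari-ios', 'firefox-windows', 'nodejs',
];

// Rows of the table, transcribed by hand. Footnote conditions (minimum
// browser versions, the WebNN launch flag) are not modeled here.
const SUPPORT: Record<ExecutionProvider, readonly Platform[]> = {
  wasm: ALL_PLATFORMS,
  webgpu: ['chromium-windows', 'chromium-android', 'chromium-macos'],
  webgl: ALL_PLATFORMS.filter((p) => p !== 'nodejs'),
  webnn: ['chromium-windows'],
};

// Assumed preference order: WebGPU first, WebGL next (maintenance mode),
// WebAssembly last as the CPU fallback. WebNN is opt-in because it needs
// a browser launch flag.
function candidateProviders(platform: Platform, allowWebNN = false): ExecutionProvider[] {
  const order: ExecutionProvider[] = allowWebNN
    ? ['webnn', 'webgpu', 'webgl', 'wasm']
    : ['webgpu', 'webgl', 'wasm'];
  return order.filter((ep) => SUPPORT[ep].indexOf(platform) !== -1);
}
```

For instance, `candidateProviders('safari-ios')` returns WebGL followed by WebAssembly, while `candidateProviders('nodejs')` returns only WebAssembly.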
Operators
WebAssembly backend
ONNX Runtime Web currently supports all operators in ai.onnx and ai.onnx.ml.
WebGL backend
ONNX Runtime Web currently supports a subset of operators in the ai.onnx operator set. See webgl-operators.md for a complete, detailed list of the ONNX operators supported by the WebGL backend.
WebGPU backend
The WebGPU backend is still an experimental feature. See webgpu-operators.md for a detailed list of the ONNX operators supported by the WebGPU backend.
WebNN backend
The WebNN backend is still an experimental feature. See webnn-operators.md for a detailed list of the ONNX operators supported by the WebNN backend.
License
License information can be found here.