# [webgpu/js] Optimize resize webgpu op & fix precision issues (#23591)
### Description

This PR is a follow-up to
https://github.com/microsoft/onnxruntime/pull/23488 and partially
improves upon https://github.com/microsoft/onnxruntime/issues/23403. It
does the following:
- Prevents unnecessary shader recompilation (cache misses) for the 'nearest' resize operation.
- Fixes precision (off-by-one) errors in the 'asymmetric' coordinate transformation mode. When running the Kokoro TTS model, the values of `/decoder/decoder/generator/f0_upsamp/Resize_output_0` differ at the end bounds due to floating-point precision: dividing 21600 by 72 should yield exactly 300, but seemingly produces roughly 299.999, which gives an off-by-one index when floored (see the sketch after this list).
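
To illustrate the failure mode, here is a minimal TypeScript sketch (not the literal WGSL change in this PR): `Math.fround` mimics 32-bit GPU arithmetic, and `floorWithEpsilon` shows one common mitigation, snapping values that sit within a tiny epsilon of an integer before flooring. The function names and the epsilon value are illustrative assumptions.

```ts
// 'asymmetric' coordinate transform: sourceCoord = outCoord / scale,
// where scale = outputSize / inputSize (here 21600 / 72 = 300).
// In f64 the division is exact, but a chain of f32 shader operations
// can yield something like 299.99997 instead of 300.

// Naive floor: 299.99997 floors to 299 instead of 300 (off by one).
function naiveIndex(x: number): number {
  return Math.floor(Math.fround(x)); // Math.fround rounds to f32 precision
}

// One common mitigation (illustrative, not necessarily the exact fix in
// this PR): snap to the nearest integer when within a small epsilon,
// then floor.
function floorWithEpsilon(x: number, eps = 1e-4): number {
  const r = Math.round(x);
  return Math.abs(x - r) < eps ? r : Math.floor(x);
}

console.log(naiveIndex(299.99997));       // 299 (off by one)
console.log(floorWithEpsilon(299.99997)); // 300
```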

### Motivation and Context

I did a deep dive over the weekend to try to fix Kokoro TTS on WebGPU and found that the node above had a large maximum error. Thinking this was a major issue, I spent some time fixing it. It turns out the error only occurs for a small number of values, which inflates the maximum error even though most values are correct (as seen in the measurements below).

BEFORE:
```
[/decoder/decoder/generator/f0_upsamp/Resize_output_0] atol: 78.6640682220459 | rtol: 24.13991587587724 | avgDiff: 0.009967932171121087 | medianDiff: 0.000030517578125
```

AFTER:
```
[/decoder/decoder/generator/f0_upsamp/Resize_output_0] atol: 0.0011138916015625 | rtol: 0.0020059924232260704 | avgDiff: 0.00008570214675873825 | medianDiff: 0.000030517578125
```

So, although this bug has a very small impact on the final output (the waveform) here, it could manifest more severely in other models.

BEFORE:
```
[waveform] atol: 0.04784199967980385 | rtol: 1366.0462001093495 | avgDiff: 0.0009544936942737713 | medianDiff: 0.00015346752479672432
```

AFTER:
```
[waveform] atol: 0.04775865003466606 | rtol: 1354.7002460360852 | avgDiff: 0.000954830244055033 | medianDiff: 0.00015274062752723694
```
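
For reference, the atol / rtol / avgDiff / medianDiff figures quoted above could be computed along these lines (a hypothetical sketch; the actual comparison script is not included in this PR). Incidentally, the medianDiff of 0.000030517578125 is exactly 2^-15, the float32 spacing for magnitudes in [256, 512), consistent with ordinary rounding noise on values of that magnitude.

```ts
// Hypothetical helper mirroring the metrics quoted above:
// atol = max absolute difference, rtol = max relative difference,
// avgDiff = mean absolute difference, medianDiff = median absolute difference.
function diffStats(expected: Float32Array, actual: Float32Array) {
  const absDiffs = Array.from(expected, (e, i) => Math.abs(e - actual[i]));
  const relDiffs = absDiffs.map((d, i) => d / (Math.abs(expected[i]) + 1e-12));
  const sorted = [...absDiffs].sort((a, b) => a - b);
  const max = (xs: number[]) => xs.reduce((m, x) => Math.max(m, x), 0);
  return {
    atol: max(absDiffs),
    rtol: max(relDiffs),
    avgDiff: absDiffs.reduce((s, d) => s + d, 0) / absDiffs.length,
    medianDiff: sorted[Math.floor(sorted.length / 2)],
  };
}
```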

# ONNX Runtime Web

ONNX Runtime Web is a JavaScript library for running ONNX models in browsers and on Node.js.

ONNX Runtime Web uses WebAssembly and WebGL to provide an optimized ONNX model inference runtime for both CPUs and GPUs.

## Why ONNX models

The Open Neural Network Exchange (ONNX) is an open standard for representing machine learning models. The biggest advantage of ONNX is that it allows interoperability across different open source AI frameworks, which in turn offers more flexibility in framework adoption.

## Why ONNX Runtime Web

With ONNX Runtime Web, web developers can score models directly in browsers, with benefits including reduced server-client communication and protected user privacy, as well as an install-free, cross-platform in-browser ML experience.

ONNX Runtime Web can run on both CPU and GPU. On the CPU side, WebAssembly is used to execute the model at near-native speed. ONNX Runtime Web compiles the native ONNX Runtime CPU engine into a WebAssembly backend using Emscripten, so it supports most of the functionality native ONNX Runtime offers, including full ONNX operator coverage, multi-threading, ONNX Runtime Quantization, and ONNX Runtime Mobile. For performance acceleration with GPUs, ONNX Runtime Web leverages WebGL, a popular standard for accessing GPU capabilities. We keep improving operator coverage and optimizing performance in the WebGL backend.

See Compatibility and Operators Supported for a list of platforms and operators ONNX Runtime Web currently supports.

## Usage
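
A minimal example of loading and running a model with the WebAssembly (CPU) execution provider (the model path `./model.onnx` and the input name `input` below are placeholders; substitute your own):

```ts
import * as ort from 'onnxruntime-web';

async function main() {
  // Create an inference session using the WebAssembly (CPU) execution provider.
  const session = await ort.InferenceSession.create('./model.onnx', {
    executionProviders: ['wasm'],
  });

  // Prepare an input tensor; the feed key must match the model's input name.
  const input = new ort.Tensor('float32', new Float32Array([1, 2, 3, 4]), [1, 4]);

  // Run inference and read the results, keyed by output name.
  const results = await session.run({ input });
  console.log(results);
}

main();
```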

## Documents

## Development

Refer to the following links for development information:

## Compatibility

| EPs/Browsers      | Chrome/Edge (Windows) | Chrome/Edge (Android) | Chrome/Edge (MacOS) | Chrome/Edge (iOS) | Safari (MacOS) | Safari (iOS) | Firefox (Windows) | Node.js |
|-------------------|-----------------------|-----------------------|---------------------|-------------------|----------------|--------------|-------------------|---------|
| WebAssembly (CPU) | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️[1] |
| WebGPU            | ✔️[2] | ✔️[3] | ✔️ |  |  |  |  |  |
| WebGL             | ✔️[4] | ✔️[4] | ✔️[4] | ✔️[4] | ✔️[4] | ✔️[4] | ✔️[4] |  |
| WebNN             | ✔️[5] |  |  |  |  |  |  |  |

- [1]: Node.js only supports the single-threaded wasm EP.
- [2]: WebGPU requires Chromium v113 or later on Windows. Float16 support requires Chrome v121 or later, and Edge v122 or later.
- [3]: WebGPU requires Chromium v121 or later on Android.
- [4]: WebGL support is in maintenance mode. It is recommended to use WebGPU for better performance.
- [5]: Requires launching the browser with the command-line flag `--enable-features=WebMachineLearningNeuralNetwork`.

## Operators

### WebAssembly backend

ONNX Runtime Web currently supports all operators in the ai.onnx and ai.onnx.ml operator sets.

### WebGL backend

ONNX Runtime Web currently supports a subset of operators in the ai.onnx operator set. See webgl-operators.md for a complete, detailed list of the ONNX operators supported by the WebGL backend.

### WebGPU backend

The WebGPU backend is still an experimental feature. See webgpu-operators.md for a detailed list of the ONNX operators supported by the WebGPU backend.
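
To opt into the WebGPU backend, request the 'webgpu' execution provider when creating a session. A minimal sketch (the model path is a placeholder, and depending on the package version the WebGPU-enabled build may need to be imported via the `onnxruntime-web/webgpu` subpath):

```ts
import * as ort from 'onnxruntime-web/webgpu';

async function createWebGpuSession() {
  // Execution providers are tried in order: fall back to WebAssembly
  // if WebGPU is unavailable in this browser.
  return ort.InferenceSession.create('./model.onnx', {
    executionProviders: ['webgpu', 'wasm'],
  });
}
```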

### WebNN backend

The WebNN backend is still an experimental feature. See webnn-operators.md for a detailed list of the ONNX operators supported by the WebNN backend.

## License

License information can be found here.