### Description
This change introduced the following new components into ONNX Runtime
Web:
- JavaScript Execution Provider (JSEP)
- Asynchronous inference execution powered by Emscripten's Asyncify
- WebGPU backend implemented in TypeScript
- initial implementation of kernels:
  - elementwise operators (22)
  - binary operators (5)
  - tensor: Shape, Reshape, Transpose, Gemm
  - nn: Conv, {Global}Maxpool, {Global}AveragePool

The code still needs polishing; work on it is ongoing.
## Q&A
What is JSEP?
> JSEP, short for JavaScript Execution Provider, is a new ONNX Runtime
execution provider designed specifically for the web environment
(browsers). JSEP allows JavaScript code to be invoked at various points
while ONNX Runtime runs inference on a model.
Why JSEP?
> JSEP is a hybrid EP that contains both C/C++ and
TypeScript/JavaScript implementations. There are 2 strong reasons why we
introduced JSEP:
> 1. The C/C++ part lets JSEP leverage ONNX Runtime's capabilities
as much as possible, including graph transformers, optimizers and the
ability to fall back to the CPU EP. The TypeScript/JavaScript part makes
kernel implementations much easier to develop and debug in the
browser.
> 2. The asynchronous execution required by JavaScript APIs (e.g.
`buffer.mapAsync()`) makes it impossible to run `OrtRun()` in a
synchronous context (see the "async problem" section below). JSEP works
around this by using Emscripten's Asyncify.
What is WebGPU?
> WebGPU is the new GPU API available in browsers. It is one of
only 2 APIs currently available for accessing the GPU from a browser (the
other is WebGL).
> WebGPU is designed with more advanced and more powerful features than
WebGL, and it is potentially the solution that offers the best GPU
performance for model inference currently available.
What is the async problem and why do we have it?
> The "async problem" is that you cannot call an async
function in a synchronous context. Consider the following C code:
> ```c
> // C-style declarations (API)
> typedef void (*ON_COMPLETE)(PVOID state, DATA *data);
> void read_data_from_file(FILEHANDLE file, ON_COMPLETE on_complete);
>
> // implementation
> DATA * my_impl_read_data_from_file_sync(FILEHANDLE file) {
> // how to implement?
> }
> ```
> The answer is that it's impossible to implement this function. Usually we
would try to find a sync version of the API, or launch a thread to call the
async function and sync-wait on the main thread. Unfortunately, in the browser
environment, neither is possible.
>
> WebGPU does not offer any synchronous API for downloading data (GPU
to CPU). This is the only operation that MUST be async. Because `OrtRun()`
eventually calls into DataTransfer to copy data from GPU to CPU,
and `OrtRun()` is a synchronous function, this cannot be done in the normal
way.
What is Emscripten? How does the Asyncify feature resolve the problem?
> Emscripten is the C/C++ compiler for WebAssembly. It's what we use to
compile ORT and generate the WebAssembly artifacts that run in
browsers.
>
> Asyncify is a [compiler
feature](https://emscripten.org/docs/porting/asyncify.html) that allows
calling async functions from a synchronous context. In short, it
generates code to unwind and rewind the call stack to emulate async
execution. With this feature, we are able to call the async function
inside the `OrtRun()` call.
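
The async problem described above can also be seen from the JavaScript side. The following is a minimal sketch, not code from this change: `readDataAsync` is a hypothetical stand-in for an async-only API such as `buffer.mapAsync()`, and the other function names are made up for illustration.

```js
// Hypothetical stand-in for an async-only API such as GPUBuffer.mapAsync().
function readDataAsync() {
  return new Promise((resolve) => setTimeout(() => resolve([1, 2, 3]), 0));
}

// A synchronous caller can only obtain a pending Promise, never the data itself.
function readDataSyncAttempt() {
  const result = readDataAsync();
  return result; // still a Promise; the data is not available yet
}

// An async caller can await the data, but then every caller up the stack has to
// be async as well. Compiled C/C++ code cannot do that without Asyncify.
async function readData() {
  return await readDataAsync();
}

const attempt = readDataSyncAttempt();
console.log(attempt instanceof Promise); // true
readData().then((data) => console.log(data.length)); // 3
```

Asyncify addresses the same situation for the compiled C/C++ call stack, which has no `await` of its own.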
## Design Overview
**Inter-op**
JSEP works in much the same way as any other EP. It exposes an
interface for inter-op with JavaScript, which is defined in
onnxruntime/wasm/js_internal_api.js:
```js
// init JSEP
Module["jsepInit"] = function (backend, alloc, free, copy, copyAsync, createKernel, releaseKernel, run) {
  Module.jsepBackend = backend;
  Module.jsepAlloc = alloc;
  Module.jsepFree = free;
  Module.jsepCopy = copy;
  Module.jsepCopyAsync = copyAsync;
  Module.jsepCreateKernel = createKernel;
  Module.jsepReleaseKernel = releaseKernel;
  Module.jsepRun = run;
};
```
This simple JavaScript snippet defines all the language-boundary
functions that JSEP requires in order to implement kernels and data
transfers in JavaScript inside ONNX Runtime:
- `jsepBackend`: assigns the singleton backend object to the WebAssembly module
- `jsepAlloc` and `jsepFree`: implementation of the data transfer's Alloc()
and Free()
- `jsepCopy`: synchronous copy (GPU to GPU, CPU to GPU)
- `jsepCopyAsync`: asynchronous copy (GPU to CPU)
- `jsepCreateKernel` and `jsepReleaseKernel`: create and release a corresponding
object that is maintained in JS to match the lifecycle of the Kernel in ORT
- `jsepRun`: OpKernel::Compute() should call into this

The abstraction above keeps the connections and dependencies between
C/C++ and TypeScript/JavaScript to a minimum.
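
To make the callback roles listed above concrete, here is a minimal sketch that registers stub callbacks through `jsepInit`. It rests on several assumptions: `Module` is a plain object standing in for the Emscripten module, `StubBackend` is hypothetical, and the callback parameter lists and the "0 means success" return value are invented for illustration, not taken from the actual implementation.

```js
// Stand-in for the Emscripten module object (an assumption for this sketch).
const Module = {};

// Same registration function as in the snippet above.
Module["jsepInit"] = function (backend, alloc, free, copy, copyAsync, createKernel, releaseKernel, run) {
  Module.jsepBackend = backend;
  Module.jsepAlloc = alloc;
  Module.jsepFree = free;
  Module.jsepCopy = copy;
  Module.jsepCopyAsync = copyAsync;
  Module.jsepCreateKernel = createKernel;
  Module.jsepReleaseKernel = releaseKernel;
  Module.jsepRun = run;
};

// Hypothetical backend that keeps buffers and kernels in plain Maps.
class StubBackend {
  constructor() {
    this.buffers = new Map(); // data id => bytes
    this.kernels = new Map(); // kernel id => { name, attributes }
    this.nextId = 1;
  }
  alloc(size) {
    const id = this.nextId++;
    this.buffers.set(id, new Uint8Array(size));
    return id;
  }
  free(id) {
    return this.buffers.delete(id);
  }
  createKernel(name, kernelId, attributes) {
    this.kernels.set(kernelId, { name, attributes });
  }
  releaseKernel(kernelId) {
    this.kernels.delete(kernelId);
  }
  run(kernelId) {
    // Assumed convention for this sketch: 0 means success.
    return this.kernels.has(kernelId) ? 0 : 1;
  }
}

const backend = new StubBackend();
Module.jsepInit(
  backend,
  (size) => backend.alloc(size),
  (id) => backend.free(id),
  (src, dst, size) => {}, // synchronous copy, left empty in this stub
  async (src, dst, size) => {}, // asynchronous copy, left empty in this stub
  (name, kernelId, attributes) => backend.createKernel(name, kernelId, attributes),
  (kernelId) => backend.releaseKernel(kernelId),
  (kernelId) => backend.run(kernelId)
);

// The order in which ORT would be expected to drive the callbacks for one kernel.
const dataId = Module.jsepAlloc(16);
Module.jsepCreateKernel("Relu", 7, {});
const status = Module.jsepRun(7);
Module.jsepReleaseKernel(7);
Module.jsepFree(dataId);
console.log(dataId, status, backend.kernels.size, backend.buffers.size); // 1 0 0 0
```

The point of the sketch is that the C/C++ side only needs these eight entry points; everything else stays on the JavaScript side.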
**Resource Management**
The lifecycle of tensor data and kernels is managed by ORT (C/C++), but the
implementation is left to JavaScript. The JavaScript code is responsible
for implementing the callbacks correctly.
For WebGPU, the GPU data is managed by JavaScript using a singleton map
(tensor_data_id => GPUBuffer). The GPU pipeline is managed as a singleton.
Shaders are managed using a singleton map (shader_key => gpu_program),
where shader_key is generated from cache_key (OP specific, including
attributes) and the input shapes.
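
The two maps described above could look roughly like the sketch below. All names (`tensorData`, `programCache`, `makeShaderKey`, `getOrCreateProgram`) and the key format are assumptions made for illustration; an `ArrayBuffer` stands in for `GPUBuffer` so that the example runs outside a browser.

```js
// tensor_data_id => buffer (ArrayBuffer stands in for GPUBuffer in this sketch).
const tensorData = new Map();
// shader_key => gpu_program (a plain object stands in for a compiled program).
const programCache = new Map();
let nextDataId = 1;

function registerBuffer(byteLength) {
  const id = nextDataId++;
  tensorData.set(id, new ArrayBuffer(byteLength));
  return id;
}

function releaseBuffer(id) {
  return tensorData.delete(id);
}

// Combine the op-specific cache key with the input shapes (assumed key format).
function makeShaderKey(cacheKey, inputShapes) {
  return `${cacheKey};${inputShapes.map((dims) => dims.join(",")).join("|")}`;
}

// Build a program only when no program exists yet for this key.
function getOrCreateProgram(cacheKey, inputShapes, build) {
  const key = makeShaderKey(cacheKey, inputShapes);
  let program = programCache.get(key);
  if (program === undefined) {
    program = build();
    programCache.set(key, program);
  }
  return program;
}

let builds = 0;
const build = () => ({ id: ++builds });
const a = getOrCreateProgram("Conv;k=3", [[1, 8, 8, 3]], build);
const b = getOrCreateProgram("Conv;k=3", [[1, 8, 8, 3]], build); // cache hit
const c = getOrCreateProgram("Conv;k=3", [[1, 16, 16, 3]], build); // new shape, new program
console.log(a === b, a === c, builds); // true false 2
```

Including the input shapes in the key means that a shader specialized for one shape is never reused for another.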
**about data transfer**
`js::DataTransfer::CopyTensor` is implemented to call either the synchronous
or the asynchronous copy callback, depending on whether the destination is GPU
or not. Emscripten's macro `EM_ASYNC_JS` is used to wrap the async function
so that it can be called in the synchronous context.
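
The dispatch rule described above can be sketched in a few lines. This is an illustration in JavaScript, not the C++ implementation: `selectCopyCallback` and the `"cpu"`/`"gpu"` device labels are hypothetical.

```js
// Pick the callback by destination device, following the rule described above:
// a GPU destination uses the synchronous copy, anything else the asynchronous one.
function selectCopyCallback(dstDevice) {
  return dstDevice === "gpu" ? "jsepCopy" : "jsepCopyAsync";
}

const transfers = [
  { src: "cpu", dst: "gpu" }, // upload
  { src: "gpu", dst: "gpu" }, // device-to-device copy
  { src: "gpu", dst: "cpu" }, // download, the only copy that must be async
];

for (const t of transfers) {
  console.log(`${t.src} -> ${t.dst}: ${selectCopyCallback(t.dst)}`);
}
```

Only the download path goes through the asynchronous callback, which is why it is the one place where the Asyncify wrapping is needed.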
**run kernel in JS**
The kernel class constructor calls `jsepCreateKernel()` once, with an
optional kernel-specific serialization to pass attributes into
JavaScript.
`Compute()` is implemented so that metadata serialization is
performed in a base class, and JavaScript code can access the data using
the Emscripten-specific builtin macro `EM_ASM_*`.
**disabled features**
Memory pattern is forcibly disabled, because WebGPU data is not
represented by a general memory model (one in which a buffer can be
represented by offset + size).
Concurrent run support is disabled. WebGPU is stateful and it also has
async function calls. Supporting concurrent runs would significantly
increase the complexity, and we would not get any real benefit from it.
**prefer channels last**
JSEP prefers channels last and returns `DataLayout::NHWC` from the method
`GetPreferredLayout()`. This lets the graph transformers
preprocess the graph into a channels-last form so that a more optimized
WebGPU shader can be used.
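
For readers unfamiliar with the two layouts, the sketch below shows what moving from NCHW to NHWC (channels last) means for the element order of a tensor. It is only an illustration: `nchwToNhwc` is a hypothetical helper, and in ONNX Runtime the layout change is handled by the graph transformers rather than by code like this.

```js
// Reorder a flat NCHW tensor into NHWC (channels last).
function nchwToNhwc(data, [n, c, h, w]) {
  const out = new Float32Array(data.length);
  for (let b = 0; b < n; b++) {
    for (let ch = 0; ch < c; ch++) {
      for (let y = 0; y < h; y++) {
        for (let x = 0; x < w; x++) {
          const src = ((b * c + ch) * h + y) * w + x; // NCHW offset
          const dst = ((b * h + y) * w + x) * c + ch; // NHWC offset
          out[dst] = data[src];
        }
      }
    }
  }
  return out;
}

// Two channels of a 1x2 image: channel 0 = [0, 1], channel 1 = [2, 3].
const nhwc = nchwToNhwc(new Float32Array([0, 1, 2, 3]), [1, 2, 1, 2]);
console.log(Array.from(nhwc)); // [ 0, 2, 1, 3 ]
```

In the NHWC result, all channel values of one pixel sit next to each other in memory.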
**Testing code**
It's impossible to test JSEP directly because JSEP itself does not
contain any kernel implementation. However, it has the kernel
registrations, which need to work together with the corresponding
JavaScript code. There are unit tests that run ONNX models through the
JavaScript API.
---------
Co-authored-by: Scott McKay <skottmckay@gmail.com>
ONNX Runtime JavaScript API
This directory contains multiple NPM projects:
- onnxruntime-common
- onnxruntime-node
- onnxruntime-web
- onnxruntime-react-native
Development
This folder contains a .vscode folder with Visual Studio Code workspace configs. Opening this folder in VSCode
enables code-formatting and linting features for TypeScript and C/C++ source code inside this folder. The following files
are used by the code-formatting and linting features:
- .vscode/**
- package.json
- package-lock.json
- .eslintrc.js
- .clang-format
Please follow the steps described below to set up the development environment.
Prerequisites
- Node.js (16.0+): https://nodejs.org/
  - (Optional) Use nvm (Windows / Mac/Linux) to install Node.js
- Python (2.7 or 3.6+): https://www.python.org/downloads/
  - python should be added to the PATH environment variable
- Visual Studio Code: https://code.visualstudio.com/
  - required extension: ESLint
  - required extension: Clang-Format
  - required extension: Debugger for Chrome
- Chrome or Edge Browser
Setup TypeScript development environment
In <ORT_ROOT>/js, run:
npm ci
This will install Clang-format and ESLint for code-formatting and linting features. This is a one-time setup unless a git clean is performed or folder <ORT_ROOT>/js/node_modules is removed manually.
Using VSCode:
Use VSCode to open folder <ORT_ROOT>/js.
Make sure to open the correct folder to allow VSCode to load workspace configuration. Otherwise typescript and code formatter may not work as expected.
To populate typescript type declarations, in each project folder, run npm ci.
Run code formatter and linter manually
In <ORT_ROOT>/js, use `npm run lint` to run ESLint, and use `npm run format` to run clang-format.
onnxruntime-common
language: typescript
dependency:
folder: <ORT_ROOT>/js/common
This project is designed to include all "common" code, which is pure JavaScript that can run in both Node.js and browsers.
Requirements
Node.js v12+ (recommended v14+)
Build
Use the following command in folder <ORT_ROOT>/js/common to install NPM packages, build TypeScript files and generate bundles:
npm ci
Distribution
It should be consumable both by projects that use NPM packages (through a Node.js node_modules folder structure generated by `npm install onnxruntime-common`) and from a CDN service that serves a .min.js bundle file.
Features
The following features are included in onnxruntime-common:
- `InferenceSession` interfaces
- `Tensor`/`OnnxValue` interfaces, implementation and a set of utility functions
- `Backend` interfaces and a set of functions for backend registration
Generate API reference document
Use the following command in folder <ORT_ROOT>/js/common to generate the API reference document:
npx typedoc
Document will be generated in folder <ORT_ROOT>/js/common/docs.
onnxruntime-node
language: typescript/C++
dependency: onnxruntime-common, ONNXRuntime.dll
folder: <ORT_ROOT>/js/node
This project is designed to be used as an NPM package that enables Node.js users to consume ONNX Runtime via a Node.js binding, in Node.js or any Node.js compatible environment.
Requirements
Node.js v12+ (recommended v14+)
Build
Build ONNX Runtime and Node.js binding
Follow instructions for building ONNX Runtime Node.js binding
Build Node.js binding only
Use the following command in folder <ORT_ROOT>/js/node to install NPM packages and build TypeScript files:
npm ci
This will download the latest pre-built ONNX Runtime binaries for the current platform.
Distribution
It should be consumable by projects that use NPM packages (through a Node.js node_modules folder structure generated by `npm install onnxruntime-node`).
onnxruntime-web
language: typescript
dependency: onnxruntime-common, ONNXRuntime WebAssembly
folder: <ORT_ROOT>/js/web
This project is a library for running ONNX models on browsers. It is the successor of ONNX.js.
Build
onnxruntime-web build instructions
Test
We use the commands `npm test` (test runner) and `npm run test:e2e` (E2E test) for tests in ONNXRuntime Web.
test runner
In folder <ORT_ROOT>/js/web,
- Run `npm test -- --help` for the full CLI instructions.
- Run `npm test -- <your-args> --debug` to run one or more test cases.
There are multiple levels of tests for ONNXRuntime Web:
- unit test: tests for individual components written in TypeScript. Launch unit test by: `npm test -- unittest`
- model test: run a single model. The model folder should contain one .onnx model file and one or more folders for test cases; each folder contains several `input_*.pb` and `output_*.pb` files as test data. Launch model test by: `npm test -- model <model_folder>`
- op test: test a single operator. An op test is described in a `.jsonc` file which specifies the operator type, its attributes and one or more test case(s), each of which includes a list of expected input tensor(s) and output tensor(s). The `.jsonc` file is located at `<ORT_ROOT>/js/web/test/data/ops`. Launch op test by: `npm test -- op <file_name>`
- suite test: suite test includes unit test, a list of model tests and op tests. Launch suite test by: `npm test`
E2E test
E2E test is for testing end-to-end package consumption. In this test, NPM packages for onnxruntime-common and onnxruntime-web are generated and installed into a clean folder. Then a simple mocha test is performed to make sure the packages can be consumed correctly.
To launch E2E test:
npm run test:e2e
Debugging
Debugging TypeScript on Desktop/Chrome
To debug the code from test-runner on Chrome:
- Launch `npm test -- <your_args> --debug`. It opens an instance of Chrome browser.
- In the open Chrome browser, click the `DEBUG` button on the top-right of the page.
- In VSCode, click [side bar]->Run and Debug->select [Attach to Chrome]->click [Start Debugging] to attach.
- Put breakpoints in the source code, and refresh the page to reload.
Debugging TypeScript on iOS/Safari
To debug on an Apple iOS device, please refer to the following steps:
- install RemoteDebug iOS WebKit Adapter by following its instructions.
- launch the adapter in commandline: `remotedebug_ios_webkit_adapter --port=9000`.
- in VSCode, select debug configuration `Remote Browser via Webkit Adaptor`.
- follow the steps above to debug.
Debugging TypeScript on Android/Chrome
To debug on an Android device, please refer to the following steps:
- Install Android SDK Platform Tools and make sure `adb` is ready to use.
- Follow instructions in Remote Debugging on Android to launch `adb`. Make sure to use port 9000 so that the existing debug configuration works.
- in VSCode, select debug configuration `Remote Browser via Webkit Adaptor`.
- follow the steps above to debug.
Debugging C/C++ for ONNX Runtime WebAssembly
To debug C/C++ code for ONNX Runtime WebAssembly, you need to build ONNX Runtime with debug info (see Build).
Currently debugging C/C++ code in WebAssembly is not supported in VSCode yet. Please follow this instruction to debug in browser devtool using extension C/C++ DevTools Support (DWARF).
Generating Document
This section describes how to generate the latest document for ONNX Runtime Web.
The document contains information about operators WebGL backend supports. It should align with the operator resolve rules in code and spec definition from ONNX.
In folder <ORT_ROOT>/js/web, use command npm run build:doc to generate the latest documents.
Distribution
It should be consumable both by projects that use NPM packages (through a Node.js node_modules folder structure generated by `npm install onnxruntime-web`) and from a CDN service that serves an ort.min.js file and one or more .wasm file(s).
Reduced WebAssembly artifacts
By default, the WebAssembly artifacts from onnxruntime-web package allows use of both standard ONNX models (.onnx) and ORT format models (.ort). There is an option to use a minimal build of ONNX Runtime to reduce the binary size, which only supports ORT format models. See also ORT format model for more information.
Reduced JavaScript bundle file size
By default, the main bundle file ort.min.js of ONNX Runtime Web contains all features. However, its size is over 500kB and for some scenarios we want a smaller sized bundle file, if we don't use all the features. The following table lists all available bundles with their support status of features.
| bundle file name | file size | file size (gzipped) | WebGL | WASM-core | WASM-proxy | WASM-threads | ES5 backward compatibility |
|---|---|---|---|---|---|---|---|
| ort.es5.min.js | 594.15KB | 134.25KB | O | O | O | O | O |
| ort.min.js | 526.02KB | 125.07KB | O | O | O | O | X |
| ort.webgl.min.js | 385.25KB | 83.83KB | O | X | X | X | X |
| ort.wasm.min.js | 148.56KB | 44KB | X | O | O | O | X |
| ort.wasm-core.min.js | 40.56KB | 12.74KB | X | O | X | X | X |
Build ONNX Runtime as a WebAssembly static library
When --build_wasm_static_lib is given instead of --build_wasm, the build produces a WebAssembly static library of ONNX Runtime and creates a libonnxruntime_webassembly.a file in the build output directory. This build option is useful for developers who have their own C/C++ project and build it as WebAssembly together with ONNX Runtime. This static library is not published by a pipeline, so a manual build is required if necessary.
onnxruntime-react-native
language: typescript, java, objective-c
dependency: onnxruntime-common
folder: <ORT_ROOT>/js/react_native
This project provides an ONNX Runtime React Native JavaScript library to run ONNX models in React Native Android and iOS apps.
Requirements
- Yarn
- Android SDK and NDK, which can be installed via Android Studio or sdkmanager command line tool
- A Mac computer with the latest macOS
- Xcode
- CMake
- Python 3
Models with ORT format
Prior to ORT v1.13, the ONNX Runtime React Native package utilized the ONNX Runtime Mobile package, which required an ONNX model to be converted to ORT format. Follow these instructions to convert ONNX model to ORT format. Note that the ONNX Runtime Mobile package includes a reduced set of operators and types, so not all models are supported. See here for the list of supported operators and types.
From ORT v1.13 onwards the 'full' ONNX Runtime package is used. It supports both ONNX and ORT format models, and all operators and types.
Build
- Install NPM packages for ONNX Runtime common JavaScript library and required React Native JavaScript libraries
  - in `<ORT_ROOT>/js/`, run `npm ci`.
  - in `<ORT_ROOT>/js/common/`, run `npm ci`.
  - in `<ORT_ROOT>/js/react_native/`, run `yarn`.
- Acquire or build the Android ONNX Runtime package
  1. To use a published Android ONNX Runtime Mobile package from Maven, go to step 5.
  2. Set up an Android build environment using these instructions. Note that the dependencies are quite convoluted, so using the specified JDK and Gradle versions is important.
  3. In `<ORT_ROOT>`, run the below python script to build the ONNX Runtime Android archive file. On a Windows machine, this requires an admin account to build.

     You can build a 'full' package that supports all operators and types, or a reduced size 'mobile' package that supports a limited set of operators and types based on your model/s to minimize the binary size. See here for information about how the reduced build works, including creating the configuration file using your model/s.

     Full build: `python tools/ci_build/github/android/build_aar_package.py tools/ci_build/github/android/default_full_aar_build_settings.json --config Release --android_sdk_path <ANDROID_SDK_PATH> --android_ndk_path <ANDROID_NDK_PATH> --build_dir <BUILD_DIRECTORY>`

     Reduced size build with configuration file generated from your model/s. Note that either Release or MinSizeRel could be used as the config, depending on your priorities: `python tools/ci_build/github/android/build_aar_package.py tools/ci_build/github/android/default_mobile_aar_build_settings.json --config MinSizeRel --android_sdk_path <ANDROID_SDK_PATH> --android_ndk_path <ANDROID_NDK_PATH> --build_dir <BUILD_DIRECTORY> --include_ops_by_config <required_ops_and_types_for_your_models.config> --enable_reduced_operator_type_support`
  4. Move the generated ONNX Runtime Android archive file to `<ORT_ROOT>/js/react_native/android/libs/`.

     Full build: Copy `<BUILD_DIRECTORY>/aar_out/Release/com/microsoft/onnxruntime/onnxruntime-android/<version>/onnxruntime-android-<version>.aar` into the `<ORT_ROOT>/js/react_native/android/libs` directory.

     Reduced size build: Copy `<BUILD_DIRECTORY>/aar_out/MinSizeRel/com/microsoft/onnxruntime/onnxruntime-mobile/<version>/onnxruntime-mobile-<version>.aar` into the `<ORT_ROOT>/js/react_native/android/libs` directory and update the dependencies in js/react_native/android/build.gradle to use onnxruntime-mobile instead of onnxruntime-android.
  5. To verify, open the Android Emulator and run this command from `<ORT_ROOT>/js/react_native/android`: `./gradlew connectedDebugAndroidTest`
- Build iOS ONNX Runtime package
  1. To use the published C/C++ ONNX Runtime package from CocoaPods, skip all steps below.
  2. Set up iOS build environment using these instructions.
  3. Build a fat ONNX Runtime Mobile Framework for iOS and iOS simulator from `<ORT_ROOT>` using this command:

     Full build: `python tools/ci_build/github/apple/build_ios_framework.py tools/ci_build/github/apple/default_full_ios_framework_build_settings.json --config Release`

     Reduced size build: `python tools/ci_build/github/apple/build_ios_framework.py tools/ci_build/github/apple/default_mobile_ios_framework_build_settings.json --config MinSizeRel --include_ops_by_config <required_ops_and_types_for_your_models.config> --enable_reduced_operator_type_support`

     The build creates `Headers`, `LICENSE`, and `onnxruntime.xcframework` in the `build/iOS_framework/framework_out` directory. From the `framework_out` directory, create an archive file named `onnxruntime-c.zip` for a full build or `onnxruntime-mobile-c.zip` for a reduced size build and copy it to the `<ORT_ROOT>/js/react_native/local_pods` directory.

     Full build: `zip -r onnxruntime-c.zip .`

     Reduced size build: `zip -r onnxruntime-mobile-c.zip .`
  4. To verify, open the iOS Simulator and run the below commands from `<ORT_ROOT>/js/react_native/ios`. Change the destination argument as needed to specify a running iOS Simulator.

     If using the reduced size build it is necessary to first update some configuration to use the mobile ORT package:
     - replace `onnxruntime/onnxruntime.framework` with `onnxruntime-mobile/onnxruntime.framework` in /js/react_native/ios/OnnxruntimeModule.xcodeproj/project.pbxproj
     - replace `onnxruntime-c` with `onnxruntime-mobile-c` in /js/react_native/ios/Podfile
     - For reference, this PR shows the changes made to switch from using the 'mobile' ORT package to the 'full' package.

     `pod install`

     `xcodebuild test -workspace OnnxruntimeModule.xcworkspace -scheme OnnxruntimeModuleTest -destination 'platform=iOS Simulator,OS=latest,name=iPhone 13'`
- Test Android and iOS apps. In Windows, open Android Emulator first.

  `debug.keystore` must be generated ahead of time for the Android example: `keytool -genkey -v -keystore <ORT_ROOT>/js/react_native/e2e/android/debug.keystore -alias androiddebugkey -storepass android -keypass android -keyalg RSA -keysize 2048 -validity 999999 -dname "CN=Android Debug,O=Android,C=US"`

  From `<ORT_ROOT>/js/react_native`, run: `yarn bootstrap`

  When testing with a custom built ONNX Runtime Android package, copy `<BUILD_DIRECTORY>/aar_out/MinSizeRel/com/microsoft/onnxruntime/onnxruntime-{android|mobile}/<version>/onnxruntime-{android|mobile}-<version>.aar` into the `<ORT_ROOT>/js/react_native/e2e/android/app/libs` directory.

  When testing with a custom built ONNX Runtime iOS package, copy `onnxruntime-[mobile-]c.zip` into the `<ORT_ROOT>/js/react_native/local_pods` directory.

  If using the reduced size build it is necessary to update some configuration to use the mobile ORT package:
  - replace `com.microsoft.onnxruntime:onnxruntime-android` with `com.microsoft.onnxruntime:onnxruntime-mobile` in /js/react_native/e2e/android/app/build.gradle
  - replace `onnxruntime-c` with `onnxruntime-mobile-c` in /js/react_native/e2e/ios/Podfile
- Run E2E Testing with Detox framework

  When testing with the integrated Detox framework for Android and iOS e2e apps:
  - Detox prerequisites:

    Install detox command line tools: `yarn global add detox-cli`

    Install applesimutils, which is required by Detox to work with iOS simulators (requires a macOS device): `brew tap wix/brew`, then `brew install applesimutils`

    Main Detox project files:
    - `.detoxrc.js` - Detox config file;
    - `e2e/jest.config.js` - Jest configuration;
    - `e2e/OnnxruntimeModuleExample.test.js` - initial react native onnxruntimemodule e2e detox test.
  - Build the detox e2e testing app.

    From `<ORT_ROOT>/js/react_native/e2e`, run the command to build the e2e testing app. Before that, ensure you have an Android emulator or iOS simulator started locally.

    iOS (Debug): `detox build --configuration ios.sim.debug`

    Android (Debug): `detox build --configuration android.emu.debug`

    - Note: If the names of the local testing android/ios devices do not match the default setting in the `.detoxrc.js` file, modify the device name in the config files to match the local device name; otherwise the build will fail.
  - Run the detox e2e tests.

    In a debug configuration, you need to have the React Native packager running in parallel before you start Detox tests: `npm start` (which runs `react-native start`)

    From `<ORT_ROOT>/js/react_native/e2e`, run Detox tests using the following command:

    iOS (Debug): `detox test --configuration ios.sim.debug`

    Android (Debug): `detox test --configuration android.emu.debug`

    To record logs for testing results, add `--record-logs`. Output logs and test results will be produced in the `e2e/artifacts/` folder. See: Detox/logger#artifacts

    `yarn bootstrap` changes the `package.json` and `yarn.lock` files. Once testing is done, restore the changes to avoid an unwanted commit.
- Run Android and iOS apps.

  `yarn e2e android`

  `yarn e2e ios`
NPM Packaging
- Update the version using `npm version <version>` from the `<ORT_ROOT>/js/react_native` folder. If it's for a dev release, use `npm version <version>-dev.<subversion>`.
- Run `npm pack` and verify the NPM package contents.
- Run `npm publish <tgz> --dry-run` to see how it's going to be published.
- Run `npm publish <tgz>` to publish to npmjs. If it's for a dev release, add the flag `--tag dev`.
Distribution
It should be consumable by React Native projects that use Yarn packages, through `yarn add onnxruntime-react-native`.