Commit graph

99 commits

Author SHA1 Message Date
Jon Campbell
768c79317c
Enable QNN HTP support for Node (#20576)
### Description
Add support for using Onnx Runtime with Node

### Motivation and Context
Onnx Runtime supports the QNN HTP, but does not support it for Node.js.
This adds baseline support for the Onnx Runtime to be used with Node.

Note it does not update the node packages that are distributed
officially. This simply patches the onnxruntime.dll to allow 'qnn' to be
used as an execution provider.

Testing was done using the existing onnxruntime-node package. The
`onnxruntime.dll` and `onnxruntime_binding.node` were swapped into
`node_modules\onnxruntime-node\bin\napi-v3\win32\arm64` with the newly
built version, then the various QNN dlls and .so files were placed next
to the onnxruntime.dll. Testing was performed on a variety of models and
applications, but the easiest test is to modify the [node quickstart
example](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/js/quick-start_onnxruntime-node).
2024-05-09 13:11:07 -07:00
Yi-Hong Lyu
b2481e3602
Bump up version in main from 1.18.0 to 1.19.0 (#20489)
Bump up version in main from 1.18.0 to 1.19.0 since the release branch
has been cut.

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2024-04-29 20:21:41 -07:00
Wanming Lin
fe1c3a45c1
[WebNN EP] Support NPU deviceType (#20278) 2024-04-15 18:43:46 -07:00
Yulong Wang
01c7aaf6aa
[js/webgpu] allow setting env.webgpu.adapter (#19940)
### Description
Allow user to set `env.webgpu.adapter` before creating the first
inference session.

Feature request:
https://github.com/microsoft/onnxruntime/pull/19857#issuecomment-1999984753

@xenova
2024-03-19 12:55:00 -07:00
Yulong Wang
b29849a287
[js/common] fix typedoc warnings (#19933)
### Description
Fix a few warnings in typedoc (for generating JS API):
```
[warning] The signature TrainingSession.loadParametersBuffer has an @param with name "buffer", which was not used.
[warning] NonTensorType, defined in ./lib/onnx-value.ts, is referenced by OnnxValue but not included in the documentation.
[warning] TensorFactory, defined in ./lib/tensor-factory.ts, is referenced by Tensor but not included in the documentation.
[warning] ExternalDataFileType, defined in ./lib/onnx-model.ts, is referenced by InferenceSession.SessionOptions.externalData but not included in the documentation.
[warning] TensorToDataUrlOptions, defined in ./lib/tensor-conversion.ts, is referenced by Tensor.toDataURL.toDataURL.options but not included in the documentation.
[warning] TensorToImageDataOptions, defined in ./lib/tensor-conversion.ts, is referenced by Tensor.toImageData.toImageData.options but not included in the documentation.
[warning] Failed to resolve link to "GpuBufferType" in comment for Env.WebGpuFlags.adapter.
[warning] Failed to resolve link to "GpuBufferType" in comment for Env.WebGpuFlags.device.
```

Changes highlighted:
- Merge `CoreMlExecutionProviderOption` and
`CoreMLExecutionProviderOption`. They expose 2 set of different options
for React-native and ORT nodejs binding. This should be fixed in future.
- Fix a few inconsistency of names between JSDoc and parameters
- Fix broken type links
- Exclude trace functions
2024-03-15 19:01:50 -07:00
Belem Zhang
acb0df2280
Fix #19931 broken Get Started link of "ONNX Runtime JavaScript API" page (#19932)
### Description
Fix #19931 broken Get Started link

HTTP 404 for "Get Started" link in "ONNX Runtime JavaScript API" page

Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
2024-03-15 19:00:30 -07:00
Yulong Wang
79e50aeef3
[js/web] rewrite backend resolve to allow multiple EPs (#19735)
### Description

This PR rewrite the backend resolve logic to support specifying multiple
EPs.

#### Backend

The first version of ONNX Runtime Web actually carried some existing
code from [ONNX.js](https://github.com/microsoft/onnxjs), which includes
the "backend" concept. The original "backend" in ONNX.js is designed in
a way assuming there is only one backend from user's backend hint list
will be used. For example, in ONNX.js, if user specify a backend hint as
`['webgl', 'wasm']`, ONNX.js will first try to use WebGL backend - if it
loads successfully (the browser supports webgl), then "webgl" backend
will be used and "wasm" will be ignored; otherwise, "webgl" will be
ignored and try to load "wasm" backend.

In short: only one backend will be used when initializing a session.

#### Execution Provider

Execution Provider, or EP, in ONNX Runtime is a different concept. One
of the differences is that users are allow to specify multiple EPs, and
if one does not support a particular kernel, it can fallback to other
EP. This is a very common case when using a GPU EP in ONNX Runtime.

#### Current Status: Backend v.s. EP

Because of the history reasons mentioned above, the current status is
quite confusing. There are **real backend**s, which means it's different
implementation in code; and there are **backend hint**s, which are used
as string names for backend hint; and there are **EP**s of the ONNX
Runtime concepts.

currently there are only 2 **backend**s in our code base: The "onnxjs
backend", and the "wasm backend". The "onnxjs backend" currently only
powers backend hint "webgl", which go into the old onnx.js code path.
All other backend hints including "wasm", "cpu"(alias to wasm), "webgpu"
and "webnn" are all powered by "wasm backend".

And because ORT Web treat "backend" as an internal concept and want to
align with ONNX Runtime, so those names of backend hints are becoming EP
names.

The following table shows today's status:

| Execution Provider Name (public) / Backend Hint (internal) | Backend |
EP in ORT
| -------- | ------- | ------- |
| "wasm"/"cpu" | WasmBackend | CPU EP
| "webgl" | OnnxjsBackend | \* technically not an EP
| "webgpu" | WasmBackend | JSEP
| "webnn" | WasmBackend | WebNN EP

#### Problem

While the API allows to specify multiple EPs, the backend resolving only
allows one backend. This causes issues when user specify multiple EP
names in session options, the backend resolve behavior and EP
registration behavior is inconsistent. Specifically, in this issue:
https://github.com/microsoft/onnxruntime/issues/15796#issuecomment-1925363908:

EP list `['webgpu', 'wasm']` on a browser without WebGPU support
resolves to 'wasm' backend, but the full EP list is passed in session
options, so JSEP is still enabled, causing the runtime error.


#### Solution

Since we still need WebGL backend, we cannot totally remove the backend
register/resolve system. In this PR I made the following changes:
- initialize every backend from the EP list, instead of only do that for
the first successful one.
- for the first resolved backend, filter all EP using the exact same
backend. Remove all EPs not using this backend from session options
- for every explicitly specified EP, if it's removed, show a warning
message in console
2024-03-15 11:47:45 -07:00
Yulong Wang
4538d31a8b
[js/webgpu] expose a few properties in WebGPU API (#19857)
### Description
This change exposes a few properties in `ort.env.webgpu` to resolve
feature requirement mentioned in properties in
https://github.com/microsoft/onnxruntime/pull/14579#discussion_r1519612619.

- Add `powerPreference` and `forceFallbackAdapter` in `ort.env.webgpu`,
to allow users to set the value of the properties before the first
inference session is created.
- Add readonly property `adapter` in `ort.env.webgpu` to allow users to
get the adapter instance. Now users can access `ort.env.webgpu.device`
and `ort.env.webgpu.adapter`.

@xenova @beaufortfrancois
2024-03-12 19:50:51 -07:00
Yulong Wang
3cb81cdde2
[js/common] move 'env.wasm.trace' to 'env.trace' (#19617)
### Description

Try to move 'env.wasm.trace' to 'env.trace' to make it less confusing,
because it also works in webgpu. Marked 'env.wasm.trace' as deprecated.
2024-02-27 11:07:15 -08:00
Yulong Wang
58f4921686
[js] changes to allow Float16Array if any polyfill is available (#19305)
### Description

This change adds only necessary code to enable ort-web works with any
Float16Array polyfill. Unlike #19302, in this PR, ort-web does not
include any specific polyfill; instead, it's user's choice for how to
use a polyfill.

ORT-web uses Float16Array if it's available; otherwise, fallback to use
Uint16Array.

```js
// case 1: user does not use polyfill:
import * as ort from 'onnxruntime-web';

const myF16Data = new Uint16Array(...);  // need to use Uint16Array
const myF16tensor = new ort.Tensor('float16', myF16Data, dims);
```

```js
// case 2: user use polyfill:
import * as ort from 'onnxruntime-web';
import {
  Float16Array, isFloat16Array, isTypedArray,
  getFloat16, setFloat16,
  f16round,
} from "@petamoriken/float16";
globalThis.Float16Array = Float16Array;  // ort-web will pick the global Float16Array

const myF16Data = new Float16Array(...);  // Use the polyfilled Float16Array type
const myF16tensor = new ort.Tensor('float16', myF16Data, dims);
```
2024-02-21 00:31:06 -08:00
Yulong Wang
6e04e36e3f
[js/common] upgrade tsc in common from 4.9.5 to 5.2.2 (#19317)
### Description
upgrade tsc in common from 4.9.5 to 5.2.2
2024-02-20 17:33:37 -08:00
Yulong Wang
06269a3952
[js/webgpu] allow uint8 tensors for webgpu (#19545)
### Description
allow uint8 tensors for webgpu
2024-02-16 18:28:27 -08:00
Jiajia Qin
85cef0af8c
[js/webgpu] Support capture and replay for jsep (#18989)
### Description
This PR expands the graph capture capability to JS EP, which is similar
to #16081. But for JS EP, we don't use the CUDA Graph, instead, we
records all gpu commands and replay them, which removes most of the cpu
overhead to avoid the the situation that gpu waiting for cpu.

mobilenetv2-12 becomes 3.7ms from 6ms on NV 3090 and becomes 3.38ms from
4.58ms on Intel A770.

All limitations are similar with CUDA EP:
1. Models with control-flow ops (i.e. If, Loop and Scan ops) are not
supported.
2. Usage of graph capture is limited to models where-in all ops in the
model can be partitioned to the JS EP or CPU EP and no memory copy
between them.
3. Shapes of inputs/outputs cannot change across inference calls.
4. IObinding is required.

The usage is like below:
Method 1: specify outputs buffers explicitly.
```
    const sessionOptions = {
        executionProviders: [
          {
            name: "webgpu",
          },
        ],
        enableGraphCapture: true,
      };
    const session = await ort.InferenceSession.create('./models/mobilenetv2-12.onnx', sessionOptions);
   
    // prepare the inputBuffer/outputBuffer
    ... ...

   const feeds = {
       'input': ort.Tensor.fromGpuBuffer(inputBuffer, { dataType: 'float32', dims })
   };

   const fetches = {
       'output': ort.Tensor.fromGpuBuffer(outputBuffer, { dataType: 'float32', dims: [1, 1000] })
   };

   let results = await session.run(feeds, fetches);  // The first run will begin to capture the graph.

   // update inputBuffer content
  ... ...
   results = = await session.run(feeds, fetches);  // The 2ed run and after will directly call replay to execute the graph.

  ... ...
   session.release();
```
Method 2: Don't specify outputs buffers explicitly. Internally, when
graph capture is enabled, it will set all outputs location to
'gpu-buffer'.
```
    const sessionOptions = {
        executionProviders: [
          {
            name: "webgpu",
          },
        ],
        enableGraphCapture: true,
      };
    const session = await ort.InferenceSession.create('./models/mobilenetv2-12.onnx', sessionOptions);

    // prepare the inputBuffer
    ... ...

   const feeds = {
       'input': ort.Tensor.fromGpuBuffer(inputBuffer, { dataType: 'float32', dims })
   };

   let results = await session.run(feeds);  // The first run will begin to capture the graph.
   
   // update inputBuffer content
  ... ...
   results = = await session.run(feeds);  // The 2ed run and after will directly call replay to execute the graph.

  ... ...
   session.release();
2024-01-30 18:28:03 -08:00
Rachel Guo
bd9d8fb2a5
[ORT 1.17.0 release] Bump up version to 1.18.0 (#19170)
### Description
<!-- Describe your changes. -->

Bump up version to 1.18.0 since the release branch has been cut.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2024-01-17 11:18:32 -08:00
Yulong Wang
f917dde717
[web] remove xnnpack from web backends (#19116)
### Description
XNNPACK is already disabled in web assembly build. This change removes
the xnnpack backend registration in JS.
2024-01-13 23:04:02 -08:00
Yang Gu
e803f8eb0f
[js/webgpu] Refactor timestamp-query and introduce timestamp-query-inside-passes (#18894)
We submit kernels in a batch (a fixed number 16 is used except for the
last batch) for better performance. However, timestamp query support is
at pass level so we disable the batch execution in profiling mode in
previous implementation. Actually we can have multiple passes in a batch
so that we don't have to disable batch execution, which is the first
enhancement of this PR.
Furthermore, WebGPU has an extension to support timestamp query inside
passes, which isn't supported by all the platforms (e.g., Windows
supports it, while macOS doesn't). This is expected to have lower cost
compared with multiple passes solution. So this PR also introduce this
support when available.
This PR also refactors some implementation related to kernelInfo, and
try to unify the related kernel names.
2024-01-13 00:23:17 -08:00
Yulong Wang
07cfc56538
[js] enable external data loading for ort-web (#19087)
### Description
enable external data loading for ort-web.

### Why
The ORT external data design is highly depending on the file system,
especially synchronous file I/O APIs. Those are not available in web
platforms. We need to have extra code to make external data working on
web.

### How
Considering there is no file system in web, an implementation for web to
support external data is to use pre-loaded data. Assume model file
a.onnx includes initializers that linked to ./b.bin, we require users to
pass a full data file list when creating the session. The user code will
be look like:
```js
const mySess = await ort.InferenceSession.create('./path/model/a.onnx', {
  // session options
  externalData: [
    {
      // relative or absolute path/URL of the file,
      // or a pre-loaded Uint8Array containing the data of the external data file
      data: './path/data/b.bin', 

      // the relative path of the external data. Should match initializers' "location" value defined in the model file
      path: './b.bin'
    },
    // { } if multiple external data file
  ]
});
```

Currently, this feature only works with JSEP build enabled.
2024-01-12 19:24:24 -08:00
Jiangzhuo
a503561d0c
[js] using OffscreenCanvas when DOM is not available (#19033)
### Description
when DOM API is not avaiable, using OffscreenCanvas


### Motivation and Context
In some environment like service worker or web worker, the DOM API is
not avaiable, we can use OffscreenCanvas API to replace
`document.createElement('canvas')`.
Most of the APIs of OffscreenCanvas and HTMLCanvasElement are the same,
except that `toDataUrl` is missing.

It fix this issues #19032
2024-01-12 13:54:05 -08:00
Yang Gu
c5f3952b68
[js/webgpu] Introduce trace support (#18928)
This is to leverage console.timeStamp to add a single marker to
browsers' (only Chromium and Firefox support it) performance tool. With
this support, we can dump both CPU and GPU timestamps, and use
post-processing tool to clearly understand the calibrated timeline. A
demo tool can be found at https://github.com/webatintel/ort-test, and
more detailed info can be found at

https://docs.google.com/document/d/1TuVxjE8jnELBXdhI4QGFgMnUqQn6Q53QA9y4a_dH688/edit.
2024-01-03 10:13:17 -08:00
Yulong Wang
9a61388f0a
[js/web] revise backend registration (#18715)
### Description
This PR revises the backend registration.

The following describes the expected behavior after this change:
(**bolded are changed behavior**)

- (ort.min.js - built without webgpu support)
    - loading: do not register 'webgpu' backend
- creating session without EP list: use default EP list ['webnn', 'cpu',
'wasm']
- creating session with ['webgpu'] as EP list: should fail with backend
not available
- (ort.webgpu.min.js - built with webgpu support)
    - loading: **always register 'webgpu' backend**
( previous behavior: only register 'webgpu' backend when `navigator.gpu`
is available)
- creating session without EP list: use default EP list ['webgpu',
'webnn', 'cpu', 'wasm']
        - when WebGPU is available (win): use WebGPU backend
- when WebGPU is unavailable (android): **should fail backend init,**
and try to use next backend in the list, 'webnn'
(previous behavior: does not fail backend init, but fail in JSEP init,
which was too late to switch to next backend)
    - creating session with ['webgpu'] as EP list
        - when WebGPU is available (win): use WebGPU backend
- when WebGPU is unavailable (android): **should fail backend init, and
because no more EP listed, fail.


related PRs: #18190 #18144
2023-12-20 14:45:55 -08:00
Caroline Zhu
eb03032925
[js/web/training] lazyResetGrad implementation (#18711)
### Description
* implemented lazyResetGrad function

### Motivation and Context
* we are in the process of adding language bindings to enable training
on web
* lazyresetgrad ensures that the gradients are calculated correctly
after the first runTrainStep call

---------

Co-authored-by: Ashwini Khade <askhade@microsoft.com>
2023-12-11 17:36:54 -08:00
Yulong Wang
efbef5f611
[js/webgpu] allow to specify callback for profiling data (#18732)
### Description

**This PR is a replacement of #17820.**

allow to specify callback for profiling data

*Previous*:
```js
ort.env.webgpu.profilingMode = 'default';  // enable profiling

// profiling data will output to console.
```

*Now*:
```js
ort.env.webgpu.profiling = {
  mode: 'default';  // enable profiling
  ondata: (data) => {
    // .. process the profiling data
  }
};

//for each kernel, "ondata" will be called once. only output to console if ondata is not specified.
```
2023-12-07 14:10:28 -08:00
Caroline Zhu
c02a386145
[js/web/training] Implemented runEvalStep & runOptimizerStep (#18259)
### Description
* implemented runEvalStep and runOptimizerStep
* added hasEvalModel and hasOptimizerModel boolean fields in
TrainingSession representation
* added evalInputNames and evalOutputNames fields to
TrainingSessionHandler & TrainingSession
* removed the inputNamesEncoded and outputNamesEncoded fields from
TrainingSessionHandler -- since none of the training methods require the
input names and output names as parameters, there's no need to store
them.

### Motivation and Context
* part of the work for implementing web bindings for training
* previous PR: #18250

---------

Co-authored-by: Ashwini Khade <askhade@microsoft.com>
2023-12-04 13:37:14 -08:00
Caroline Zhu
dd355e39a0
[js/web/training] Added parameters methods (#18250)
### Description
* Implemented: `getParametersSize`, `getContiguousParameters`
(equivalent to copyParametersToBuffer), and `loadParametersBuffer`
(equivalent to copyParametersFromBuffer)
* as part of these changes, getParametersSize was added to the
TrainingSession interface so that users know what size buffer to create
for loadParametersBuffer
* The parameters methods in the interface were modified to take in a
Float32Array instead


### Motivation and Context
* part of the work for implementing web bindings for training
* enables federated learning in the web
* previous  PR: #18006

---------

Co-authored-by: Ashwini Khade <askhade@microsoft.com>
2023-11-27 10:30:13 -08:00
Wanming Lin
73ed34ac4b
[WebNN EP] Support numThreads option for WebNN CPU device (#18054) 2023-11-12 16:45:10 -08:00
Yulong Wang
6b0c97b43f
[js/web] fix typescript type check (#18343)
### Description

This PR fixes the TypeScript type check.

Previously, when I use esbuild to replace webpack (#17745), typescript
typecheck was disabled. This causes a few TypeScript type error checked
in into the code base. This PR fixes the followings:

- Use "Node16" as default "module" value in tsconfig.json, because in
TypeScript v5, `(module == "ES2015" && moduleResolution == "Node16")` is
an invalid combination.
- Set `noUnusedParameters` to true as default. in web override it to
false because multiple code need to be updated ( a following-up PR will
do this )
- set correct project file for 'web/lib/**/*.ts' for ESLint (otherwise
WebGPU types are not populated correctly)
- fix type error in file js/web/lib/wasm/jsep/webgpu/program-manager.ts
- upgrade "@webgpu/types" to latest to fix type error in file
js/web/lib/wasm/jsep/backend-webgpu.ts
- add package script "prebuild" for web to run tsc type check
- add type check in CI yml file
2023-11-10 16:03:38 -08:00
Caroline Zhu
e3b043ba17
[js/web/training] runTrainStep implementation (#18006)
### Description
* based on design document & following InferenceSession's run
implementation, implemented TrainingSession.runTrainStep

### Motivation and Context
* Adding web bindings for training

#### Related work
* #16521 allowed for training artifacts to be built
* #17333 added interfaces for training
* #17474 allowed for training package to be built + added training
backend to web package
* #17891 implementation for createTrainingSession on the TypeScript side
**[SHOULD BE MERGED IN BEFORE THIS PR]**

---------

Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
Co-authored-by: Ashwini Khade <askhade@microsoft.com>
2023-11-02 08:32:50 -07:00
Caroline Zhu
64de71c5e2
[js/web/training] Add CreateTrainingSession (#17891)
### Description
* Adds TrainingSession.create() functionality following the web bindings
for training design doc
* Added 2 new training APIs to wasm/api.h:
   * OrtTrainingGetInputOutputName
   * OrtTrainingGetInputOutputCount
* Moved isOrtEnvInitialized boolean to the wasm-core-impl and added a
method that references it

### Motivation and Context
* Adding web bindings for training

#### Related work
* #16521 allowed for training artifacts to be built
* #17333 added interfaces for training
* #17474 allows for training package to be built + adds training backend
to web package **[MUST BE MERGED IN BEFORE THIS ONE]**

---------

Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
Co-authored-by: Ashwini Khade <askhade@microsoft.com>
2023-10-26 09:22:10 -07:00
Yulong Wang
5228332c9f
[js] upgrade JS shared dev dependencies (#17831)
### Description
upgrade JS shared dev dependencies.

- webpack: removed
- eslint: upgrade to latest.
   - eslint config upgraded to compatible with latest version
- typescript upgrade to v5
   - update module "CommonJS" to "Node16" in tsconfig
- update deprecated config "importsNotUsedAsValues" to
"verbatimModuleSyntax"
- remove webpack bundles in onnxruntime-common
2023-10-10 17:44:39 -07:00
Yulong Wang
451c02543a
[js/webgpu] allow specify preferredLayout (#17756)
### Description
Allow WebGPU backend to specify `preferredLayout`. Default is NHWC.

```js
const options = {executionProviders: [{name:'webgpu', preferredLayout: 'NCHW'}]};
sess1 = await ort.InferenceSession.create('./mobilenetv2-12.onnx', options);
```

### Motivation and Context
- implement @qjia7's requirement for an easier way to do performance
comparison between NCHW vs NHWC.
- It's possible that NCHW does better on some models and NHWC on others.
So offer user the capability to switch.
2023-10-02 21:25:12 -07:00
xhcao
0d60604638
[JS/WebGPU] support Range operator (#17233)
The patch also introduces the method which copies
data from GPU to CPU synchronously.

### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2023-09-30 02:05:32 -07:00
Caroline Zhu
6a5f469d44
Add training interfaces to js/common (#17333)
### Description
Following the design document:
* Added CreateTrainingSessionHandler to the Backend interface
* All existing Backend implementations throw an error for the new method
createTrainingSessionHandler
* Created TrainingSession namespace, interface, and
TrainingSessionFactory interface
* Created TrainingSessionImpl class implementation 

As methods are implemented, the TrainingSession interface will be added
to or modified.

### Motivation and Context
Adding the public-facing interfaces to the onnxruntime-common package is
one of the first steps to support ORT training for web bindings.

---------

Co-authored-by: Caroline Zhu <carolinezhu@microsoft.com>
2023-09-29 19:05:10 -07:00
Vincent Wang
e6301eee6a
Bump Up Version to 1.17.0 (#17587)
Bump up version to 1.17.0 as the 1.16.0 release branch had been branched
out.
2023-09-20 11:02:58 +08:00
Yulong Wang
a2e75114cc
[js/web] add sessionOptions.freeDimensionOverrides (#17488)
### Description
Allows to specify fixed size for dynamic input of a model. resolves
#16707

Pending test
2023-09-13 09:17:34 -07:00
Yulong Wang
4d753b74a5
[js/common] prepare work for supporting webgpu IO binding implementation (#17465)
### Description
This PR contains a few changes in /js/common/ to support a coming PR for
a full implementation of webgpu IO binding.

- allows pass-through if value is already a Tensor instance in return
value of `handler.run()` called by `InferenceSession.run()`
(inference-session-impl.ts). Specifically, onnxruntime-node and
onnxruntime-react-native uses native bindings to generate a Tensor-like
object so we need to create a real Tensor instance here; for
onnxruntime-web the return value is already a Tensor instance.

- adds new types for GPU buffer supported types: `'float32'|'int32'` ->
`'float32'|'float16'|'int32'|'int64'|'uint32'|'bool'`

- exposes types `GpuBufferDataTypes` together with `CpuPinnedDataTypes`
and `TextureDataTypes` as exported
2023-09-08 13:49:24 -07:00
Yulong Wang
d88406a31b
[js/common] use Map instead of object for backends (#17352)
### Description
resolved
https://github.com/microsoft/onnxruntime/security/code-scanning/1140
2023-09-05 23:14:46 -07:00
Yulong Wang
2cb75420ac
[js/common] clean up JSDoc (#17408)
### Description
clean up JSDoc for onnxruntime-common:

- replace "@internal" to "@ignore" as JSDoc do not use "@internal".
Using "@ignore" will let the content not show on the generated doc.
2023-09-05 20:40:23 -07:00
Yulong Wang
e5ca3f3dcb
[js/api] introducing IO binding for tensor (#16452)
[//]: # (## Work In Progress. Feedbacks are welcome!)

### Description
This PR adds a few properties, methods and factories to Tensor type to
support IO-binding feature. This will allow user to create tensor from
GPU/CPU bound data without a force transferring of data between CPU and
GPU.

This change is a way to resolve #15312

### Change Summary
1. Add properties to `Tensor` type:
a. `location`: indicating where the data is sitting. valid values are
`cpu`, `cpu-pinned`, `texture`, `gpu-buffer`.
b. `texture`: sit side to `data`, a readonly property of `WebGLTexture`
type. available only when `location === 'texture'`
c. `gpuBuffer`: sit side to `data`, a readonly property of `GPUBuffer`
type. available only when `location === 'gpu-buffer'`

2. Add methods to `Tensor` type (usually dealing with inference
outputs):
- async function `getData()` allows user to download data from GPU to
CPU manually.
- function `dispose()` allows user to release GPU resources manually.

3. Add factories for creating `Tensor` instances:
    a. `fromTexture()` to create a WebGL texture bound tensor data
    b. `fromGpuBuffer()` to create a WebGPUBuffer bound tensor data
    c. `fromPinnedBuffer()` to create a tensor using a CPU pinned buffer

### Examples:

create tensors from texture and pass to inference session as inputs
```js
// when create session, specify we prefer 'image_output:0' to be stored on GPU as texture
const session = await InferenceSession.create('./my_model.onnx', {
  executionProviders: [ 'webgl' ],
  preferredOutputLocation: { 'image_output:0': 'texture' }
});

...

const myImageTexture = getTexture(); // user's function to get a texture
const myFeeds = { input0: Tensor.fromTexture(myImageTexture, { width: 224, height: 224 }) }; // shape [1, 224, 224, 4], RGBA format.
const results = await session.run(myFeeds);
const myOutputTexture = results['image_output:0'].texture;
```
2023-08-29 12:58:26 -07:00
Arthur Islamov
c262879214
Added DML and CUDA provider support in onnxruntime-node (#16050)
### Description
I've added changes to support CUDA and DML (only on Windows, on other
platforms it will throw an error)



### Motivation and Context
It fixes this feature request
https://github.com/microsoft/onnxruntime/issues/14127 which is tracked
here https://github.com/microsoft/onnxruntime/issues/14529

I was working on StableDiffusion implementation for node.js and it is
very slow on CPU, so GPU support is essential.

Here is a working demo with a patched and precompiled version
https://github.com/dakenf/stable-diffusion-nodejs

---------
2023-08-25 16:57:06 -07:00
Yulong Wang
969c95f73f
[js/common] a few fixes/revises to onnxruntime-common (#16853)
### Description
- enable unit test for js/common in CI
- add debug config in js/.vscode/launch.json
- enable source map for js/common/test for debugging purposes; add
source map files to ignore list
- ignore js/common/test folder for npm packaging
2023-08-01 11:17:39 -07:00
Artyom Stepanishchev
ba23e5b234
[JS/Common] Fix malformed result of Tensor.fromImage(ImageBitmap) (#16919)
### Description

Set `canvas` dimensions to the `ImageBitmap` dimensions, thus fixing a
malformed Tensor creation.

### Motivation and Context

According to the [HTMLCanvasElement.drawImage()
spec](https://html.spec.whatwg.org/multipage/canvas.html#drawing-images):
> When the destination rectangle is outside the destination image (the
output bitmap), the pixels that land outside the output bitmap are
discarded, as if the destination was an infinite canvas whose rendering
was clipped to the dimensions of the output bitmap.

meaning that `ImageBitmap` pixels exceeding the canvas dimensions will
be discarded. Since no canvas dimensions are set for
`Tensor.fromImage(ImageBitmap)` if-case, the default 300x150px canvas
dimensions are used leading to the creation of malformed Tensors where
all the exceeding pixels are discarded and equal to `0, 0, 0, 0` during
the subsequent `pixels2DContext.getImageData()` call.
2023-07-31 18:18:06 -07:00
Yulong Wang
1743e9a615
[js] enable formatter for more file types (#16888)
### Description
enable formatter for .js/.json/.jsonc/.md files
2023-07-28 15:46:58 -07:00
Yulong Wang
53c771f215
[js/common] add unit tests for onnxruntime-common (#16812)
### Description
"onnxruntime-common" starts to get more and more complicated, so it's a
good idea to add unit tests for it.

Includes the following changes:
- move `mocha` from each subfolder (js/web/, js/node/) to root (js/), so
that it will be installed once and all subfolder can use.
- add folder `test` in js/common/ as root folder for ort-common tests.
- add sub folder `type-tests`. this folder contains a few typescript
source code, which are excluded from the tsconfig.json. they are not
compiled by default. instead, file `type-tests.ts` calls typescript
compiler (tsc) to check for the files under this folder whether the
compilation result is as expected. If tsc compiles a file successfully
when a failure is expected, this is considered an failed test.
- add sub folder `unit-tests`. files under this folder will be compiled
by default. we use default mode of mocha (using `describe()` and `it()`)
to setup test groups and cases.
- update eslint rules accordingly.
2023-07-25 14:37:41 -07:00
Yulong Wang
ecca11340a
[js/common] allow creating (u)int64 tensors in 2 ways (#16541)
### Description
allow creating (u)int64 tensors from either a number array or a bigint
array.

before:

```js
// TypeScript think is good, but actually does not work
// runtime error: Uncaught TypeError: Cannot convert 1 to a BigInt
const myTensor1 = new Tensor('int64', [1, 2, 3, 4], [2, 2]);

// runtime good, but TypeScript thinks myTensor2 is a string tensor
const myTensor2 = new Tensor('int64', [1n, 2n, 3n, 4n], [2, 2]);
```

after:
```js
// both work at runtime and TypeScript populates the correct types
const myTensor1 = new Tensor('int64', [1, 2, 3, 4], [2, 2]);
const myTensor2 = new Tensor('int64', [1n, 2n, 3n, 4n], [2, 2]);
```
2023-07-11 21:07:36 -07:00
Jhen-Jie Hong
685816bb0a
[js/rn] Add executionProviders support (#16233)
### Description
<!-- Describe your changes. -->

This PR adds support for `executionProviders` option for react-native
package, support:

- Android: cpu / xnnpack / nnapi
- iOS: cpu / xnnpack /  coreml

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

In my case I want to enable Core ML / NNAPI EP for react-native project.
2023-06-16 19:38:41 +10:00
Yulong Wang
e3e4926d00
[js/common] allow import onnxruntime-common as ESM and CJS (#15772)
### Description
allow import onnxruntime-common as ESM and CJS.
2023-06-12 12:05:11 -07:00
Yulong Wang
59f42cccb8
[js/common] refactor tensor type in onnxruntime-common (#15843)
### Description
<!-- Describe your changes. -->

refactor tensor type in onnxruntime-common.

### Motivation and Context
There major motivation is that I am doing a local change to address the
API part of #15312. And I am doing a refactoring of onnxruntime-common
anyway (#15772).

The `tensor.ts` and `tensor-impl.ts` are too large, so I split contents
into multiple files to make the type declarations clearer.

The original target of this change is for API only ( ie. do not refactor
any implementation.). However, there are a few type/implementation
inconsistencies so I also made minimal changes to fix them.

### Changes
- extract `TensorUtils` for non-template interfaces
- extract `TensorFactory` for all overloads of `Tensor.fromImage()`
- refactor options type that used for `Tensor.fromImage()`
- fix JSDoc comments to make option descriptions consistent with actual
type declarations
- fix an inconsistency for `options.format` and `options.bitmapFormat`;
change all `bitmapFormat` to `format`
- extract `ConversionUtils` for `tensor.toDataURL()` and
`tensor.toImageData()`
- put implementations into multiple files from `tensor-impl.ts`
- fix a bug that cause unittest fail. put comments for future fix.
2023-06-09 16:19:29 -07:00
Yulong Wang
f274bbb0c8
[js] add API that allows to get package version (#16207)
### Description

Add an API for users to get version of current package. example usage:

```js
import { env } from 'onnxruntime-node';

console.log(env.versions.node);  // output "1.16.0"
```

```js
import { env } from 'onnxruntime-web';

console.log(env.versions.web);  // output "1.16.0"
console.log(env.versions.common);  // output "1.16.0"
console.log(env.versions.node);  // output "undefined"
```

#16156
2023-06-09 16:18:53 -07:00
Wanming Lin
a8c2f24ae0
[WebNN EP] Merge support for segment anything into main branch (#16208)
We implemented a number of new ops and data types to support running
segment anything model on Chromium WebNN DML backend (POC) in a forked
branch https://github.com/honry/onnxruntime/tree/stable-diffusion

In this PR, we migrate the changes in the forked branch to main branch,
includes:
 - 22 new ops
- New tensor data types: bool, int32, uint32, uint64, int64, float16 (As
JavaScript hasn't shipped Float16Array, we use Uint16Array as a
workaound)
 - Handle empty input tensors and duplicated outputs
 - Fixed some nits
2023-06-07 09:56:37 -07:00
Yulong Wang
ba5f5e3198
[js] allow manually release inference session (#16169)
### Description
This change adds a new instance function (method) to type
`InferenceSession` to allow users to manually release an inference
session instance.

#16131 depends on this change to work correctly.
2023-05-31 00:31:38 -07:00