Commit graph

669 commits

Author SHA1 Message Date
Yulong Wang
5e66fcc703
[js/web] allow op test to use f16 type for inputs/outputs (#21664)
### Description
allow op test to use f16 type for inputs/outputs.

This PR introduces "@petamoriken/float16" as Float16Array polyfill but
restricts it to be only used for test runner.
2024-08-08 09:56:37 -07:00
Prathik Rao
134f47743e
bumps up version in main from 1.19 -> 1.20 (#21588)
Bump up version in main from 1.19.0 to 1.20.0 since the release branch
has been cut.
2024-08-05 15:46:04 -07:00
Wanming Lin
8c641d7182
[WebNN EP] Support Dropout op (#21586)
### Description
WebNN only supports test mode, so we don't care about other inputs or
attributes about training mode, use WebNN's identity op to implement the
Dropout op directly.
2024-08-02 16:25:04 -07:00
Wanming Lin
1d4b161145
[WebNN EP] Support ConvTranspose for TFLite backend (#21291)
### Description
Chromium supports ConvTranspose for TFLite in
https://chromium-review.googlesource.com/c/chromium/src/+/5635194

With constraint that only default dilations and groups are supported.

---------

Co-authored-by: Dwayne Robinson <fdwr@hotmail.com>
2024-07-30 17:46:08 -07:00
Yulong Wang
b03c9496aa
[js/web] allow load WebAssembly binary from buffer (#21534)
### Description

This PR adds a new option `ort.env.wasm.wasmBinary`, which allows user
to set to a buffer containing preload .wasm file content.

This PR should resolve the problem from latest discussion in #20876.
2024-07-29 13:39:38 -07:00
Xu Xing
0d7cf301a1
[js/webgpu] Add activation Tanh (#21540)
Bug:https://github.com/microsoft/onnxruntime/issues/21467

### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-07-29 11:05:34 -07:00
Xu Xing
5bc12bf209
[js/webgpu] Add activation for conv3d naive (#21466)
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-07-29 08:47:41 -07:00
Yulong Wang
dbff0cd098
[js/node] enable float16 support for Node.js binding (#20581)
### Description
enable float16 support for Node.js binding.

data of float16 tensor uses `Uint16Array`.
2024-07-28 13:03:17 -07:00
Wanming Lin
b6b29309a5
[WebNN EP] Update argMax/argMin to adapt to latest spec (#21452)
WebNN spec recently changes the definition of argMax/argMin:
- Remove selectLastIndex option, let backends decide to select the last
index or not.
- Move axes option to axis input
2024-07-25 17:07:01 -07:00
Changming Sun
7af39c6955
Update nodejs's cmake file to fix a file copy issue (#21390)
This commit e5f18ba2c1 caused some nightly
pipelines to fail. This PR fixes it.
It is because recently I changed our Linux library's SONAME. At runtime
onnxruntime_binding depends on libonnxruntime.so.1 , instead of
libonnxruntime.so.1.19.0(with the full version number). Therefore we
need to keep the libonnxruntime.so.1 symlink.

The packaging tools/ci_build/github/js/pack-npm-packages.ps1 still needs
be updated. I will address it in another PR.
2024-07-23 11:03:55 -07:00
mindest
5b9369e93c
Fix typos according to reviewdog report. (#21335)
### Description
Fix typos based on reviewdog report but with some
exceptions/corrections.
2024-07-22 13:37:32 -07:00
Yulong Wang
01df8c787d
[js/web] fix vulnerable version of dependencies (#21412)
### Description
```
# npm audit report

socket.io  3.0.0 - 4.6.2
Severity: high
socket.io has an unhandled 'error' event - https://github.com/advisories/GHSA-25hc-qcg6-38wj
Depends on vulnerable versions of engine.io
fix available via `npm audit fix`
node_modules/socket.io

ws  8.0.0 - 8.17.0
Severity: high
ws affected by a DoS when handling a request with many HTTP headers - https://github.com/advisories/GHSA-3h5v-q93c-6h6q
fix available via `npm audit fix`
node_modules/ws
  engine.io  0.7.8 - 0.7.9 || 6.0.0 - 6.5.4
  Depends on vulnerable versions of ws
  node_modules/engine.io
  socket.io-adapter  2.5.2 - 2.5.4
  Depends on vulnerable versions of ws
  node_modules/socket.io-adapter

4 high severity vulnerabilities
```
2024-07-19 11:11:30 -07:00
Xu Xing
92a8407b39
[js/webgpu] Remove unnecessary initialization of var (#21312)
This var has been initialized to 0 in tint, so no need extra loop to do
it again:
```
  float tint_symbol_52[1][4] = (float[1][4])0;
  {
    for(int tint_symbol_53 = 0; (tint_symbol_53 < 1); tint_symbol_53 = (tint_symbol_53 + 1)) {
      {
        for(int tint_symbol_54 = 0; (tint_symbol_54 < 4); tint_symbol_54 = (tint_symbol_54 + 1)) {
          tint_symbol_52[min(uint(tint_symbol_53), 0u)][min(uint(tint_symbol_54), 3u)] = 0.0f;
        }
      }
    }
  }
```
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-07-12 12:34:34 -07:00
pengwa
88336ffa92
Fix typos - 1st Wave (#21278)
### Description

There are so many typos reported by the review dog, [Optional Lint]
actions (example:
https://github.com/microsoft/onnxruntime/actions/runs/9864564489/job/27239732367),
this PR is to fix some of them.



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2024-07-11 13:35:08 +08:00
Pavan Goyal
1b82d835d8
[Fix] InterOpNumThreads Session Option for ONNX ReactNative Package (#21263)
### Description
This PR resolves a bug related to setting the **interOpNumThreads**
session option when creating an **ORTSession**. Currently, when the
**interOpNumThreads** option is passed from React Native, the native
module incorrectly sets **intraOpNumThreads** instead of
**interOpNumThreads**.


### Motivation and Context
Since this is a bug, users of the Onnx React Native package may believe
that they are setting **interOpNumThreads** correctly, So this change is
required. Refer to the code snippet below for details
<img width="634" alt="Screenshot 2024-07-05 at 9 28 58 PM"
src="https://github.com/microsoft/onnxruntime/assets/88655321/70a8f216-553a-4f4c-9481-e6871f0e37e6">
2024-07-10 07:00:18 -07:00
Enrico Galli
4c3c809bdb
[js/webnn] Enable user-supplied MLContext (#20600)
### Description
This PR enables the API added in #20816 as well as moving context
creation to JS.

### Motivation and Context
In order to enable I/O Binding with the upcoming
[MLBuffer](https://github.com/webmachinelearning/webnn/issues/542) API
in the WebNN specification, we need to share the same `MLContext` across
multiple sessions. This is because `MLBuffer`s are restricted to the
`MLContext` where they were created. This PR enables developers to use
the same `MLContext` across multiple sessions.
2024-07-08 10:19:39 -07:00
Wanming Lin
cd516a1677
[WebNN EP] Remove constraint for conv ops on CPU backend (#21237)
Currently WebNN TFLite backend allows the filter of
conv2d/convTranspose2d be an input. Remove the constraint and operate
necessary transpose/reshape operations for the filter input.
2024-07-08 10:14:43 -07:00
Guenther Schmuelling
9eb1c2a7a3
support for layernorm in webgpu pre opset-17 (#21121)
handled the same way cpu does
2024-06-27 10:20:48 -07:00
Wanming Lin
41ad83fb00
[WebNN EP] Support rest Reduction ops for TFLite backend (#21135)
- reduceLogSum, reduceLogSumExp and reduceSumSquare have been landed in
https://chromium-review.googlesource.com/c/chromium/src/+/5575815
- reduceL1 and reduceL2 have been landed in
https://chromium-review.googlesource.com/c/chromium/src/+/5606091
2024-06-25 18:30:55 -07:00
Wanming Lin
4743803944
[WebNN EP] Support more Normalization ops for TFLite backend (#21151)
Following Normalization ops have been supported in Chromium for TFLite
backend:
- batchNormalization:
https://chromium-review.googlesource.com/c/chromium/src/+/5532745
- layerNormalization:
https://chromium-review.googlesource.com/c/chromium/src/+/5573326
- instanceNormalization:
https://chromium-review.googlesource.com/c/chromium/src/+/5532750
2024-06-24 19:04:23 -07:00
Wanming Lin
3a917e49fb
[WebNN EP] Support 4 more ops for TFLite backend (#21134)
Recently WebNN TFLite backend supports gelu, expand, softsign,
reciprocal.
2024-06-24 09:52:12 -07:00
Wanming Lin
0c80cd2157
[WebNN EP] Update Prelu restriction for CPU backend (#20878) 2024-06-20 11:04:01 -07:00
Xu Xing
c3076721f3
[js/webgpu] Support conv3d naive (#20706)
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-06-19 10:13:50 -07:00
Wanming Lin
40879a2623
[WebNN EP] Enable Cast op for WebNN CPU backend (#20864)
WebNN TFLite backend supports `cast` op but doesn't support casting to
`uint64` data type.
2024-06-19 01:51:19 -07:00
Wanming Lin
35c430a95a
[WebNN EP] Enable several ops for WebNN CPU backend (#20847)
WebNN CPU implementation has been migrated from XNNPack to TFLite which
supports more ops. Turn on partial `cpu` supported ops which just need
the change from `false` to `true` firstly.
2024-06-19 01:45:31 -07:00
Yulong Wang
5e81fa8aec
[js] fix vulnerability CVE-2024-4068: upgrade braces to 3.0.3 (#21078)
### Description

Upgrade `braces` to 3.0.3

[CVE-2024-4068](https://github.com/advisories/GHSA-grv7-fg5c-xmjg)

```
# npm audit report

braces  <3.0.3
Severity: high
Uncontrolled resource consumption in braces - https://github.com/advisories/GHSA-grv7-fg5c-xmjg
fix available via `npm audit fix`
node_modules/braces

1 high severity vulnerability
```
2024-06-18 16:02:08 -07:00
Yulong Wang
631a2c16be
[js/web] skip default locateFile() when dynamic import is disabled (#21073)
### Description
skip default `locateFile()` when dynamic import is disabled. This allows
the file to work with bundlers to load WebAssembly file correctly if
`env.wasm.wasmPaths` is not set.
2024-06-18 12:21:45 -07:00
Yang Gu
1473d66a00
[js/webgpu] Prefer adapter.info to adapter.requestAdapterInfo (#21065)
WebGPU is deprecating async adapter.requestAdapterInfo, and replacing it
with sync adapter.info.
Spec change: https://github.com/gpuweb/gpuweb/pull/4662
2024-06-18 12:02:38 -07:00
Jian Chen
4e18b0b7ce
Upgrade braces from 3.0.2 to 3.0.3 to fix the vulnerability (#21022) 2024-06-12 18:02:52 -07:00
Yulong Wang
dd805ff77d
[js/web] ESM: use the bundled target as default export (#20991)
### Description
ESM: use the bundled target as default export

In this change, the default import of the following entries:
```
import from 'onnxruntime-web';
import from 'onnxruntime-web/all';
import from 'onnxruntime-web/webgpu';
```
will use the "bundled" version, which has no dynamic import.

This change should only apply to ESM on web.
2024-06-11 11:14:55 -07:00
Wanming Lin
043ef5c95f
[WebNN EP] Support latest WebNN softmax op (#20827)
Latest WebNN softmax supports N-D input and axis parameter.
2024-06-11 08:27:14 -07:00
Edward Chen
981893c318
Remove deprecated "mobile" packages (#20941)
# Description

This PR removes the building of the ORT "mobile" packages and much of the associated infrastructure which is no longer needed.

Not removed yet - tools/ci_build/github/android/mobile_package.required_operators.config and the helper scripts that depend on it.

# Motivation and Context

The mobile packages were deprecated in 1.18. Users should use the full packages (Android - onnxruntime-android, iOS - onnxruntime-c/onnxruntime-objc) instead or do a custom build.
2024-06-07 16:20:32 -05:00
Wanming Lin
52874f628a
[WebNN EP] Remove some constraints for CPU backend (#20900)
Following constraints have been supported by WebNN TFLite backend:
- Concat: supports up to 4 inputs
- Matmul: supports broadcasting
- Resize: supports nearest mode
- Split: supports up to 4 outputs
2024-06-06 08:22:41 -07:00
Wanming Lin
da1f8f9274
[WebNN EP] TFLite backend only supports limit ranges for Clip (#20863) 2024-06-06 08:22:18 -07:00
Guenther Schmuelling
c749bd997a
webgpu quickgelu (#20939) 2024-06-06 08:21:33 -07:00
Changming Sun
3dd6fcc089
Upgrade min ios version to 13.0 (#20773)
To align with Office and other MS products.
Office's support policy is:
"Office for iPad and iPhone is supported on the two most recent versions
of iOS and iPadOS. When a new version of iOS or iPadOS is released, the
Office Operating System requirement becomes the two most recent
versions: the new version of iOS or iPadOS and the previous version."
(from https://products.office.com/office-system-requirements)

The latest iOS version is 17. So they support both 17 and 16. Here I set
our min iOS version to 13 so that it will be a superset of what Office
supports.

This change would allow us using C++17's std::filesystem feature in the
core framework. The modifications were generated by running
```bash
 find . -type f -exec sed -i "s/apple_deploy_target[ =]12.0/apple_deploy_target=13.0/g"  {} \;
```

Cannot use 15.0 because otherwise iOS packaging would fail with:

```
/Users/runner/work/1/b/apple_framework/intermediates/iphoneos_arm64/Release/_deps/coremltools-src/mlmodel/src/MILBlob/Util/Span.hpp:288:9: error: cannot use 'throw' with exceptions disabled
        MILVerifyIsTrue(index < Size(), std::range_error, "index out of bounds");
```

The Google OSS libraries we use only officially support iOS 15+.
2024-06-04 10:15:20 -07:00
Wanming Lin
9c6481fa2d
[WebNN EP] Enable ArgMax and ArgMin for CPU backend (#20865)
WebNN TFLite backend supports ArgMax and ArgMin, but only supports
'select_last_index' value is 0.
2024-06-03 14:12:11 -07:00
Wanming Lin
c128132dd8
[WebNN EP] TFLite backend only supports Elu with default alpha (#20862) 2024-06-03 14:10:22 -07:00
Yulong Wang
ab9f153746
[js/web] allow build target for non dynamic import (#20898)
### Description
<!-- Describe your changes. -->

This PR allows to build ORT web to `ort{.all|.webgpu}.bundle.min.mjs`,
which does not have any dynamic import. This makes it possible to use
ort web via static import in service worker.

Fixes #20876
2024-06-03 12:33:37 -07:00
Yulong Wang
35697d2421
[js/webnn] update API of session options for WebNN (#20816)
### Description

This PR is an API-only change to address the requirements being
discussed in #20729.

There are multiple ways that users may create an ORT session by
specifying the session options differently.

All the code snippet below will use the variable `webnnOptions` as this:
```js
const myWebnnSession = await ort.InferenceSession.create('./model.onnx', {
   executionProviders: [
     webnnOptions
   ]
});
```

### The old way (backward-compatibility)

```js
// all-default, name only
const webnnOptions_0 = 'webnn';

// all-default, properties omitted
const webnnOptions_1 = { name: 'webnn' };

// partial
const webnnOptions_2 = {
  name: 'webnn',
  deviceType: 'cpu'
};

// full
const webnnOptions_3 = {
  name: 'webnn',
  deviceType: 'gpu',
  numThreads: 1,
  powerPreference: 'high-performance'
};
```

### The new way (specify with MLContext)

```js
// options to create MLcontext
const options = {
  deviceType: 'gpu',
  powerPreference: 'high-performance'
};

const myMlContext = await navigator.ml.createContext(options);

// options for session options
const webnnOptions = {
  name: 'webnn',
  context: myMlContext,
  ...options
};
```

This should throw (because no deviceType is specified):
```js
const myMlContext = await navigator.ml.createContext({ ... });
const webnnOptions = {
  name: 'webnn',
  context: myMlContext
};
```

### Interop with WebGPU
```js
// get WebGPU device
const adaptor = await navigator.gpu.requestAdapter({ ... });
const device = await adaptor.requestDevice({ ... });

// set WebGPU adaptor and device
ort.env.webgpu.adaptor = adaptor;
ort.env.webgpu.device = device;

const myMlContext = await navigator.ml.createContext(device);
const webnnOptions = {
  name: 'webnn',
  context: myMlContext,
  gpuDevice: device
};
```

This should throw (because cannot specify both gpu device and MLContext
option at the same time):
```js
const webnnOptions = {
  name: 'webnn',
  context: myMlContext,
  gpuDevice: device,
  deviceType: 'gpu'
};
```
2024-05-31 03:25:14 -07:00
Xu Xing
25ac65375c
[js/webgpu] Fix mha name (#20860)
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-05-30 00:01:06 -07:00
Peishen Yan
cfe68e489e
[WebNN EP] Support Trilu op (#20730)
Adds support for Trilu via WebNN Triangular op
2024-05-24 10:46:54 -07:00
Guenther Schmuelling
33a68d221f
add missing file for pr20791 (#20811)
this file should have been in pr20791 to allow fp16 in the tile
implementation
2024-05-24 09:59:13 -07:00
Satya Kumar Jandhyala
bab5037eab
Eliminate explicit Concat operations in Attention (#20556)
### Description
Remove explicitly concatinating pastKey with Key and pastValue with
Value.



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-05-24 09:07:57 -07:00
Wanming Lin
2c39d0c502
[WebNN EP] Disable ConvTranspose for WebNN CPU (#20762)
WebNN CPU backend implementation has been migrated from XNNPack to
TFLite, currently TFLite has not supported WebNN's convTranspose2d yet,
just disable it for now.
2024-05-22 20:59:37 -07:00
Yulong Wang
e412bc1919
[doc] update file size table for ORT Web (#20755)
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-05-22 11:04:57 -07:00
Xu Xing
f1fef19b6e
[js/webgpu] Support shared memory for transpose 2d (#19267)
For 1024x1024, without shared memoey, 18.7ms. With shared memory 13.2ms.
2024-05-22 08:15:44 -07:00
Yulong Wang
068bb3d5ee
[js/webgpu] add missing space in build script (#20752) 2024-05-21 16:24:34 -07:00
Wanming Lin
87d49e3dda
[WebNN EP] Add WebNN operators doc to README.md (#20734) 2024-05-20 14:57:40 -07:00
Wanming Lin
0399d1b12d
[WebNN EP] Update chromium flag (#20732)
WebNN is currently enabled behind "Enables WebNN API" flag.
2024-05-20 14:57:30 -07:00