### Description
This change upgrades a lot of dependencies. There are 2 motivations of
doing this change:
- fix the security issue reported by dependabot (protobufjs Prototype
Pollution vulnerability -
https://github.com/advisories/GHSA-h755-8qp9-cq85)
- resolve the requirement of using ONNX IR_VERSION 9 (#16638)
This requires:
- upgrade protobufjs to v7.2.4
- upgrade library 'onnx-proto' to consume latest ONNX release (v1.14.0).
Problems:
- protobufjs v7.2.4 depends on long.js v5, which does not work well with
typescript (commonjs).
- onnx-proto depends on this fix with a new release of long.js
- long.js is in maintenance and it takes longer than expected to put in
new changes
Solutions:
- use a patch script in `preprepare` to copy type declarations to make
long.js work with typescript (commonjs)
- generate onnx protobuf JS/TS files and put them under
js/web/lib/onnxjs/ort-schema/protobuf folder - remove 'onnx-proto' from
dependency.
- apply fixes to generated onnx.d.ts
### Description
fix file size trim for wasm only .min.js
minimal build `ort.wasm.min.js` and `ort.wasm-core.min.js` should
exclude JSEP related source code.
### Description
<!-- Describe your changes. -->
As title.
Validation at JS call level in E2E app is not included. Can cover
together in a separate pr.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Test coverage.
---------
Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local>
### Description
allow creating (u)int64 tensors from either a number array or a bigint
array.
before:
```js
// TypeScript think is good, but actually does not work
// runtime error: Uncaught TypeError: Cannot convert 1 to a BigInt
const myTensor1 = new Tensor('int64', [1, 2, 3, 4], [2, 2]);
// runtime good, but TypeScript thinks myTensor2 is a string tensor
const myTensor2 = new Tensor('int64', [1n, 2n, 3n, 4n], [2, 2]);
```
after:
```js
// both work at runtime and TypeScript populates the correct types
const myTensor1 = new Tensor('int64', [1, 2, 3, 4], [2, 2]);
const myTensor2 = new Tensor('int64', [1n, 2n, 3n, 4n], [2, 2]);
```
### Description
This change reduces the number of calls to globby functions so that it
accelerates the initialization for 'npm test' with suite0/1 tests from
~14sec to <2sec.
### Description
Added Expand operator support.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
Add ConvTranspose support for WebGPU
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
Added WeGPU/JSEP Split operator support.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
Add missing L1Reduce and L2Reduce operator kernels.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
always use 'typescript' from /js/ folder. This allows all NPM packages
to use the same typescript version.
- remove 'typescript' from /js/react_native/package.json. use the one
from /js/package.json
- remove unused '@types/fs-extra'
### Description
Add Concat operator
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
Modified indexing into outputIndices in the shader code. When the output
is 1-dim the outputIndices is not a vector and indexing results in
error.
### Motivation and Context
Fix the problem in the Reduce Ops implementation in WebGPU.
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
As title.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Works with local onnxruntime-c pod in js/rn/e2e test.
A couple of places in onnxruntime used `float_t` data type alias as an
alternative to `float`. However, this is not entirely correct, since
`float_t` is an implementation-defined type alias, which may be `float`,
`double`, `long double` or some other implementation-defined data type,
depending on the state of the internal `FLT_EVAL_METHOD` macro:
https://en.cppreference.com/w/c/numeric/math/float_t
On most major platforms and compilers (clang, GCC, MSVC) this is only a
cosmetic change and will not lead to any changes. However, icpx compiler
(and legacy icc) tends to substitute `float_t` with `long double`,
resulting in a linker error (unresolved reference) to the base onnx
library, that only contains the `ParseData` function for `float` and
`double` as in
[here](9264e09367/onnx/defs/tensor_proto_util.cc (L133-L134)).
Overall, this PR cleans up the implementation-defined behaviour and
enables building onnxruntime with icpx.
### Description
Modify the creating of webgl context.
Previous behavior:
STEP.1 - create canvas (document.createElement), if failed, goto step.2
else step.3
STEP.2 - create offscreenCanvas, if failed abort
STEP.3 - use the canvas created in step.1 or 2 to create webgl context.
if successful return context else abort
Now bahavior:
STEP.1 create offscreenCanvas, if failed goto step.3
STEP.2 use it to create webgl context. if successful, return context
STEP.3 create canvas (document.createElement). if failed, abort
STEP.4 use it to create webgl context. if successful, return context
else abort
Motivation:
we found in some environment, normalCanvas.getContext() returns null but
offscreenCanvas.getContext() returns the context object. and when
offscreenCanvas is available it is good idea to always prefer to use it.
### Description
We used to use `typeof fetch === 'undefined'` as condition to detect the
environment is Node.js or not. Before Node.js v18, this works. However,
in Node.js v18, it introduced `fetch` function, so this check does not
work any more.
This PR changes the condition to check whether `process`,
`process.versions` and `process.versions.node` exists.
Checking whether `process` exists is not enough. This is because in some
configuration, webpack may polyfill nodejs's process.
### Description
<!-- Describe your changes. -->
This PR adds support for `executionProviders` option for react-native
package, support:
- Android: cpu / xnnpack / nnapi
- iOS: cpu / xnnpack / coreml
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
In my case I want to enable Core ML / NNAPI EP for react-native project.
### Description
<!-- Describe your changes. -->
- Create `OnnxruntimeJSIHelper` native module to provide two JSI
functions
- `jsiOnnxruntimeStoreArrayBuffer`: Store buffer in Blob Manager &
return blob object (iOS: RCTBlobManager, Android: BlobModule)
- `jsiOnnxruntimeResolveArrayBuffer`: Use blob object to get buffer
- The part of implementation is reference to
[react-native-blob-jsi-helper](https://github.com/mrousavy/react-native-blob-jsi-helper)
- Replace base64 encode/decode
- `loadModelFromBlob`: Rename from `loadModelFromBase64EncodedBuffer`
- `run`: Use blob object to replace input.data & results[].data
For [this
context](https://github.com/microsoft/onnxruntime/issues/16031#issuecomment-1556527812),
it saved a lot of time and avoid JS thread blocking in decode return
type, it is 3700ms -> 5~20ms for the case. (resolve function only takes
0.x ms)
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
It’s related to #16031, but not a full implementation for migrate to
JSI.
It just uses JSI through BlobManager to replace the slow part (base64
encode / decode).
Rewriting it entirely in JSI could be complicated, like type convertion
and threading. This PR might be considered a minor change.
/cc @skottmckay
"default" should be last element for exports.
This fixes "Module not found: Error: Default condition should be last
one" when importing the onnxruntime-web package in some conditions.
### Description
Added support for ReduceL1, ReduceL2, ReduceMean, ReduceMin, ReduceMax,
ReduceSum, ReduceLogSum, ReduceLogSumExp, ReduceProd and
ReduceSquareSum.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
---------
Co-authored-by: Satya Jandhyala <sajandhy@microsoft.com>
Co-authored-by: guschmue <guschmue@microsoft.com>
### Description
<!-- Describe your changes. -->
refactor tensor type in onnxruntime-common.
### Motivation and Context
There major motivation is that I am doing a local change to address the
API part of #15312. And I am doing a refactoring of onnxruntime-common
anyway (#15772).
The `tensor.ts` and `tensor-impl.ts` are too large, so I split contents
into multiple files to make the type declarations clearer.
The original target of this change is for API only ( ie. do not refactor
any implementation.). However, there are a few type/implementation
inconsistencies so I also made minimal changes to fix them.
### Changes
- extract `TensorUtils` for non-template interfaces
- extract `TensorFactory` for all overloads of `Tensor.fromImage()`
- refactor options type that used for `Tensor.fromImage()`
- fix JSDoc comments to make option descriptions consistent with actual
type declarations
- fix an inconsistency for `options.format` and `options.bitmapFormat`;
change all `bitmapFormat` to `format`
- extract `ConversionUtils` for `tensor.toDataURL()` and
`tensor.toImageData()`
- put implementations into multiple files from `tensor-impl.ts`
- fix a bug that cause unittest fail. put comments for future fix.
### Description
Add an API for users to get version of current package. example usage:
```js
import { env } from 'onnxruntime-node';
console.log(env.versions.node); // output "1.16.0"
```
```js
import { env } from 'onnxruntime-web';
console.log(env.versions.web); // output "1.16.0"
console.log(env.versions.common); // output "1.16.0"
console.log(env.versions.node); // output "undefined"
```
#16156
### Description
<!-- Describe your changes. -->
Implement `dispose` react native method.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Currently we are not able to release the memory used by model in JS
runtime if we don't want to use it anymore, we can do that only by
reload app on debug or restart app on release.
We implemented a number of new ops and data types to support running
segment anything model on Chromium WebNN DML backend (POC) in a forked
branch https://github.com/honry/onnxruntime/tree/stable-diffusion
In this PR, we migrate the changes in the forked branch to main branch,
includes:
- 22 new ops
- New tensor data types: bool, int32, uint32, uint64, int64, float16 (As
JavaScript hasn't shipped Float16Array, we use Uint16Array as a
workaound)
- Handle empty input tensors and duplicated outputs
- Fixed some nits
### Description
This PR adds an implementation of the Squeeze operator to WebGPU JSEP.
The implementation follows the [operator
schema](https://github.com/onnx/onnx/blob/main/docs/Operators.md#Unsqueeze).
To implement the `Unsqueeze` operator in the same fashion as the
`Squeeze`, I added the `ComputeOutputShape()` method to the
`UnsqueezeBase` class and made some slight modifications. Please let me
know if it is a bad idea and if I should move this method to the JS
implementation.
I also uncommented test case lines in the `suite-test-list.jsonc` file
for both Squeeze and Unsqueeze operators following @hariharans29's
[comment](https://github.com/microsoft/onnxruntime/pull/16024#issuecomment-1565113633).
### How was it tested
1. I created a model with only one operator:
```Python
import onnx.helper
node = onnx.helper.make_node(
"Unsqueeze",
inputs=["T", "axes"],
outputs=["y"],
)
graph = onnx.helper.make_graph([node], "test", [onnx.helper.make_tensor_value_info("T", 1, [3, 4, 5]), onnx.helper.make_tensor_value_info("axes", 7, [2])], [onnx.helper.make_tensor_value_info("y", 1, [3, 1, 4, 5, 1])])
onnx.save(onnx.helper.make_model(graph), "unsqueeze.onnx")
```
2. I compiled the runtime using @fs-eire's
[instructions](https://gist.github.com/fs-eire/a55b2c7e10a6864b9602c279b8b75dce).
3. I ran the test models in the browser using this minimal setup:
```HTML
<html>
<script src=".\dist\ort.webgpu.min.js"></script>
<script>
async function run() {
const session = await ort.InferenceSession.create('unsqueeze.onnx', {executionProviders: ['webgpu']});
console.log(session);
const input = new ort.Tensor('float32', new Float32Array(60), [3, 4, 5]);
const dim = new ort.Tensor('int64', [1n, 4n], [2]);
const output = await session.run({ "T": input, "axes": dim });
console.log(output);
}
run();
</script>
</html>
```
### Motivation and Context
Improve operator coverage for WebGPU JSEP.
### Description
disable webpack's polyfill for node's `global`, `__filename` and
`__dirname` in web build. This will confuse emscripten generated
environment detection.
see https://webpack.js.org/configuration/node/
### Description
This change adds a new instance function (method) to type
`InferenceSession` to allow users to manually release an inference
session instance.
#16131 depends on this change to work correctly.
### Description
The PR implements FloatE4M3FN, FloatE5M2, FloatE4MEFNUZ, FloatE5M2FNUZ
as described in PR https://github.com/onnx/onnx/pull/4805. It uses CUDA
API to cast float/half to float8 if CUDA>=11.8, a custom implementation
if CUDA<11.8.
* It implements, Cast, QuantizeLinear, DequantizeLinear for all types on
CPU, only for types FloatE4M3FN, FloatE5M2 on CUDA.
* It extends the supported types for control flow operator, Shape,
Reshape, Identity, If, Loop, Scan, Reshape
* It implements Equal(19).
* Cast, QuantizeLinear, DequantizeLinear operators now support a
parameter `saturate` only valid for float 8 types. It is true by
default. In that case, any value out of range is converted into the
maximum float 8 value. If false, it is infinite.
* QuantizeLinear, DequantizeLinear now supports multiple scales on CUDA
(and ROCm by extension), scale = 1D tensor with one scale per channel
### Motivation and Context
Supports latest onnx version.
Fixes
[AB#15395](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15395)
---------
Co-authored-by: Xavier Dupre <xadupre@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
### Description
This PR adds an implementation of the `Squeeze` operator to WebGPU JSEP.
The implementation follows the [operator
schema](https://github.com/onnx/onnx/blob/main/docs/Operators.md#Squeeze)
and allows one or two inputs.
### How was it tested
1. I created two models. Without `axes`:
```Python
import onnx.helper
node = onnx.helper.make_node(
"Squeeze",
inputs=["T"],
outputs=["y"],
)
graph = onnx.helper.make_graph([node], "test", [onnx.helper.make_tensor_value_info("T", 1, [3, 1, 4, 5])],
[onnx.helper.make_tensor_value_info("y", 1, [3, 4, 5])])
onnx.save(onnx.helper.make_model(graph), "squeeze.onnx")
```
And with `axes`:
```Python
import onnx.helper
node = onnx.helper.make_node(
"Squeeze",
inputs=["T", "axes"],
outputs=["y"],
)
graph = onnx.helper.make_graph([node], "test", [onnx.helper.make_tensor_value_info("T", 1, [3, 1, 4, 5]), onnx.helper.make_tensor_value_info("axes", 7, [1])], [onnx.helper.make_tensor_value_info("y", 1, [3, 4, 5])])
onnx.save(onnx.helper.make_model(graph), "squeeze-dim.onnx")
```
2. I compiled the runtime using @fs-eire's
[instructions](https://gist.github.com/fs-eire/a55b2c7e10a6864b9602c279b8b75dce).
3. I ran the test models in the browser using this minimal setup:
```HTML
<html>
<script src=".\dist\ort.webgpu.min.js"></script>
<script>
async function run() {
const session = await ort.InferenceSession.create('squeeze-dim.onnx', {executionProviders: ['webgpu']});
console.log(session);
const input = new ort.Tensor('float32', new Float32Array(60), [3, 1, 4, 5]);
const dim = new ort.Tensor('int64', [-3n], [1]);
const output = await session.run({ "T": input, "axes": dim });
console.log(output);
}
run();
</script>
</html>
```
### Motivation and Context
Improve operator coverage for WebGPU JSEP.