Commit graph

410 commits

Author SHA1 Message Date
Yulong Wang
a441a71e8e
[js/web] support different export format for ort-web (#17878)
### Description
support different export format for ort-web.
2023-10-11 09:38:51 -07:00
Yulong Wang
5228332c9f
[js] upgrade JS shared dev dependencies (#17831)
### Description
upgrade JS shared dev dependencies.

- webpack: removed
- eslint: upgrade to latest.
   - eslint config upgraded to be compatible with the latest version
- typescript: upgrade to v5
   - update module "CommonJS" to "Node16" in tsconfig
   - update deprecated config "importsNotUsedAsValues" to
"verbatimModuleSyntax"
- remove webpack bundles in onnxruntime-common
2023-10-10 17:44:39 -07:00
Yulong Wang
c6f1a1ce69
update build_jsep.bat to add release build flags (#17471)
### Description
flags `--enable_wasm_api_exception_catching --disable_rtti` are used in
release build, so fix the build_jsep.bat script to make it more
consistent with CI.
2023-10-10 17:38:35 -07:00
Yulong Wang
d9b9c5a537
[js/webgpu] support using uniform buffer (#17803)
### Description
support using uniform buffer.

This PR allows using a uniform buffer in a shader program, so that some
runtime information (e.g. input/output shape) no longer needs to be
hardcoded into shader code.

There are 2 commits in this PR:
- [667f31c](667f31c83d): framework changes to support uniform buffer,
as well as updates in program manager, gpu data manager and indices
helper.
- [09e1d2a](09e1d2ad1d): an example change for operator `Transpose` to
use only the input's rank instead of its dims as the shader key. With this
change, model mobilenetv2-12 shader compile times dropped from 71 to 52.
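
The effect of keying shaders by rank instead of dims can be illustrated with a small cache simulation. This is only a sketch under assumptions: `countCompiles`, the key strings, and the shapes are hypothetical and are not taken from the PR.

```js
// Hypothetical shader cache: counts how many distinct "compilations" a key scheme causes.
function countCompiles(shapes, keyOf) {
  const cache = new Map();
  let compiles = 0;
  for (const dims of shapes) {
    const key = keyOf(dims);
    if (!cache.has(key)) {
      cache.set(key, `shader for ${key}`); // stands in for an expensive shader compile
      compiles++;
    }
  }
  return compiles;
}

// Three Transpose inputs with made-up shapes; two of them have rank 4.
const shapes = [[1, 3, 224, 224], [1, 32, 112, 112], [1, 1280]];

// Keyed by full dims: the shape is baked into the shader source, so every new shape compiles again.
const byDims = countCompiles(shapes, (dims) => `transpose;dims=${dims.join(',')}`);

// Keyed by rank only: the actual shape would arrive at run time through a uniform buffer.
const byRank = countCompiles(shapes, (dims) => `transpose;rank=${dims.length}`);

console.log({ byDims, byRank });
```

In this toy setup the two rank-4 inputs share one cache entry when keyed by rank, which is the kind of reuse the change aims for.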
2023-10-10 00:31:12 -07:00
Chi Lo
569876fb16
[TensorRT EP] Refactor OrtTensorRTProviderOptions initialization and make it easy to add new field (#17617)
Two major modifications in this PR:

1. Refactor OrtTensorRTProviderOptions initialization and make it easy
to add new fields.
2. Make the Python API capable of using TensorRT plugins by adding a new
Python binding API, `register_tensorrt_plugins_as_custom_ops`. (The EP's
custom op domain needs to be registered before model load. The C++ API is
slightly different: calling
SessionOptionsAppendExecutionProvider_TensorRT_XX appends the custom op
domain to the session options, and ORT can later register the custom op
domain from the session options before model loading.)
2023-10-06 14:12:20 -07:00
Yulong Wang
6ea493571e
[js/web] use esbuild to accelerate bundle build (#17745)
### Description

Use esbuild to accelerate bundle build.

This change uses esbuild to replace webpack for onnxruntime-web. Bundle
build time dropped from ~20 sec to ~0.6 sec on my Windows dev box.

A few changes applied:
- import nodejs modules using "node:" prefix
- remove enum declaration inside namespace (EncoderUsage)
- use "fs/promises" to replace the old promisify from "util"
- separate ort-web and test-runner. Previously they were bundled
together, now they are built into 2 files.
- optimize karma runner launch time
    - remove unnecessary sourcemap preprocessor. sourcemaps are handled
inside esbuild
    - remove unnecessary proxies (because ort-web and test-runner are
separated now, the paths are correctly inferred)
    - remove file watcher for test data
- optimize special handling as esbuild plugins:
    - polyfill dummy imports for node.js modules when targeting browser.
    - load as content string for ort-wasm-*.worker.js
    - load as content string for ./proxy-worker/main.ts
- a source patch to ort-wasm*-threaded*.js (see details in comments in
code)
- updated debug configurations for sourcemap mapping to ensure
out-of-box good dev experience
2023-10-06 13:37:37 -07:00
Jiajia Qin
db3901ab97
[js/webgpu] Enable the NCHW ConvMatMul path (#17717)
1) Enable pointwise NCHW conv2d by MatMul.
2) Enable non-pointwise NCHW conv2d by convMatMul.
3) Fix bug when `sameSize` is true

---------

Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
2023-10-05 00:26:01 -07:00
Xu Xing
992f3e4609
[js/webgpu] Support where (#17544)
Supported types: float, int32_t, uint32_t, bool.
Case where_broadcast.jsonc is not enabled due to
https://github.com/microsoft/onnxruntime/issues/17405.

---------

Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
2023-10-03 14:28:21 -07:00
Guenther Schmuelling
f8a8452a6b
[js/webgpu] fix pad operator (#17775)
fix pad operator
2023-10-03 13:39:50 -07:00
Arthur Islamov
d0519a7603
[js/web] BiasSplitGelu and BiasAdd kernels (#17161)
### Description
Two contrib kernels that are supposed to speed up StableDiffusion according
to this doc
https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/python/tools/transformers/models/stable_diffusion/README.md

However, there is no noticeable effect on speed or memory consumption. So
I guess the only way to make it faster is to implement
MultiHeadAttention, but I'm not capable of doing that right now. So I'll
focus on existing PRs and on finding the JSEP kernel that produces
incorrect results. It should be one of the old ones (I suspect Conv or
ConvTranspose), as SD has not been generating images correctly on webgpu
since I started working on it. I hoped someone else would fix that by
the time I finished with kernels/optimizations 😅

---------

Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
2023-10-03 12:20:20 -07:00
Yulong Wang
451c02543a
[js/webgpu] allow specify preferredLayout (#17756)
### Description
Allow WebGPU backend to specify `preferredLayout`. Default is NHWC.

```js
const options = {executionProviders: [{name:'webgpu', preferredLayout: 'NCHW'}]};
sess1 = await ort.InferenceSession.create('./mobilenetv2-12.onnx', options);
```

### Motivation and Context
- implement @qjia7's requirement for an easier way to do performance
comparison between NCHW vs NHWC.
- It's possible that NCHW does better on some models and NHWC on others.
So offer user the capability to switch.
2023-10-02 21:25:12 -07:00
Scott McKay
ac4e726046
Add bytes model loading test to react native e2e (#17749)
### Description
Update E2E test to also check InferenceSession.create with bytes. 


### Motivation and Context
Add tests to validate #17739
2023-10-02 12:25:28 +10:00
xhcao
0d60604638
[JS/WebGPU] support Range operator (#17233)
The patch also introduces the method which copies
data from GPU to CPU synchronously.

2023-09-30 02:05:32 -07:00
Arthur Islamov
a941dd583e
[js/web] FP16 Conv, ConvTranspose and MatMul (#17514)
### Description
Another three ops for fp16

---------

Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
2023-09-30 00:00:23 -07:00
Caroline Zhu
6a5f469d44
Add training interfaces to js/common (#17333)
### Description
Following the design document:
* Added CreateTrainingSessionHandler to the Backend interface
* All existing Backend implementations throw an error for the new method
createTrainingSessionHandler
* Created TrainingSession namespace, interface, and
TrainingSessionFactory interface
* Created TrainingSessionImpl class implementation 

As methods are implemented, the TrainingSession interface will be added
to or modified.

### Motivation and Context
Adding the public-facing interfaces to the onnxruntime-common package is
one of the first steps to support ORT training for web bindings.

---------

Co-authored-by: Caroline Zhu <carolinezhu@microsoft.com>
2023-09-29 19:05:10 -07:00
Rachel Guo
e106b1eb8f
Fix react native load from Uint8Array buffer bug (#17739)
### Description

Use `.buffer` of Uint8Array to get ArrayBuffer.
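
A minimal sketch of the conversion in plain JavaScript follows. The helper name `toArrayBuffer` is hypothetical, and the slicing step is an addition that is not taken from this PR: `.buffer` returns the whole underlying `ArrayBuffer`, so a `Uint8Array` that is a view into a larger buffer needs its `byteOffset` and `byteLength` taken into account.

```js
// Hypothetical helper: returns an ArrayBuffer holding exactly the bytes of the view.
function toArrayBuffer(bytes) {
  const coversWholeBuffer =
    bytes.byteOffset === 0 && bytes.byteLength === bytes.buffer.byteLength;
  if (coversWholeBuffer) {
    return bytes.buffer; // no copy needed
  }
  // The view covers only part of the buffer: copy just that range.
  return bytes.buffer.slice(bytes.byteOffset, bytes.byteOffset + bytes.byteLength);
}

const whole = new Uint8Array([1, 2, 3, 4]);
const view = whole.subarray(1, 3); // shares memory with `whole`

const fullBuffer = toArrayBuffer(whole);
const viewBuffer = toArrayBuffer(view);

console.log(fullBuffer.byteLength, viewBuffer.byteLength);
```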

TODO: Add E2E React Native test case to cover JS level testing to avoid
future breakage.

### Motivation and Context

#17732

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2023-09-29 18:03:28 -07:00
Yulong Wang
561aca97cf
[js/webgpu] support IO binding (#17480)
<del>
**This PR is based on a few prerequisite PRs, listed
below:**
- #17465
- #17469
- #17470
- #17472
- #17473
- #17484

Please review the current change by only looking at commit
e2e6623e673ec6de55a5c1f8edcbd3a46b535a89 and later.


</del>

### Description

This PR introduces WebGPU IO binding. This new feature allows
onnxruntime-web users to use tensors created from GPU buffers as model
input/output, so that model inference can run without unnecessary
data copies between CPU and GPU for model input/output.

### Examples

An E2E demo/example is being worked on.

The following is a simple demo with code snippets.

Let's first look at how this is done today:
```js
// STEP.1 - create an inference session:
const mySession = await ort.InferenceSession.create('./my_model.onnx', { executionProviders: ['webgpu'] });

// STEP.2 - create model input: (supposing myImageCpuData is a Float32Array)
const feeds = {
  'input_image:0': new ort.Tensor('float32', myImageCpuData, [1, 224, 224, 3])
};

// STEP.3 - run model
const myResults = await mySession.run(feeds);

// STEP.4 - get output data
const myData = myResults['output_image:0'].data; // Float32Array

```

#### for inputs (GPU tensor):

Now, with IO binding, you can create a tensor from a GPU buffer, and
feed it to the model:
```js
// new STEP.2.A - create model input from a GPU buffer: (supposing myInputGpuBuffer is a `GPUBuffer` object with input data)
const feeds = {
  'input_image:0': ort.Tensor.fromGpuBuffer(myInputGpuBuffer, { dataType: 'float32', dims: [1, 224, 224, 3] })
};
```

#### for outputs (pre-allocated GPU tensor)

You can also do that for output, **if you know the output shape**:
```js
// new STEP.2.B - create model output from a GPU buffer: (supposing myOutputGpuBuffer is a pre-allocated `GPUBuffer` object)
const fetches = {
  'output_image:0': ort.Tensor.fromGpuBuffer(myOutputGpuBuffer, { dataType: 'float32', dims: [1, 512, 512, 3] })
};

// new STEP.3 - run model with pre-allocated output (fetches)
const myResults = await mySession.run(feeds, fetches);
```

#### for outputs (specify location)

If you do not know the output shape, you can specify the output location
when creating the session:

```js
// new STEP.1 - create an inference session with an option "preferredOutputLocation":
const mySession = await ort.InferenceSession.create('./my_model.onnx', {
    executionProviders: ['webgpu'],
    preferredOutputLocation: "gpu-buffer"
});
```

If the model has multiple outputs, you can specify them separately:
```js
// new STEP.1 - create an inference session with an option "preferredOutputLocation":
const mySession = await ort.InferenceSession.create('./my_model.onnx', {
    executionProviders: ['webgpu'],
    preferredOutputLocation: {
         "output_image:0": "gpu-buffer"
    }
});
```

Now you don't need to prepare the `fetches` object; onnxruntime-web
will prepare the output data at the specified location.

#### read data

When you get the output tensor, you can:
```js
// get the gpu buffer object:
const gpuBuffer = myOutputTensor.gpuBuffer; // GPUBuffer

// get the CPU data asynchronously
const cpuData = await myOutputTensor.getData();

// alternatively, get the CPU data asynchronously and release the underlying GPU resources
const cpuDataAndRelease = await myOutputTensor.getData(true);

// dispose the tensor (release the underlying GPU resources). This tensor object will be invalid after dispose() is called.
myOutputTensor.dispose();
```

#### resource management

JavaScript has GC, so you don't need to worry about managing JavaScript
objects. But there are 2 types of resources that are not managed by GC:
- GPU buffers that are used in tensors
- Underlying ORT native resources

To keep things simple, most of the unmanaged resources are handled inside ORT web.
But there are a few resources that users need to manage:
- All external GPU resources, including GPU buffers inside all tensors
created by `Tensor.fromGpuBuffer()`, will not be managed by ORT. Users
should manage those GPU buffers themselves.
- When a session is created with `preferredOutputLocation` ==
"gpu-buffer" specified in session options, and the corresponding output
is not pre-allocated, users need to call the output tensor's `dispose()`
or `getData(true)` to manually release the underlying GPU buffers.
- ORT internal errors (including providing a pre-allocated output tensor
with wrong type/dims) will invalidate the whole wasm memory and are not
recoverable. An exception is thrown in this situation.
2023-09-29 11:24:42 -07:00
satyajandhyala
b4fbc25b1f
[JS/Web] Add ConvTranspose implementation using MatMul (#17573)
### Description
Add ConvTranspose implementation using MatMul to increase perf.


2023-09-29 11:00:44 -07:00
Yulong Wang
b2b1408608
[js/web] update browser launch cmd flags (#17658)
### Description
update Chromium browser launch command line flags

Canary already uses dxc, so there is no need to specify
'--enable-dawn-features=use_dxc' for Canary.
2023-09-25 12:24:46 -07:00
Yulong Wang
df15a3a335
[js/web] configure 5GB memory space for webpack build (#17684)
### Description
The ort-web build step (webpack) consumes an amount of memory close to
Node.js (V8)'s default max-old-space-size, so increase the
memory limit to 5GB to avoid this issue.
2023-09-25 09:22:00 -07:00
Jiajia Qin
891fba3b9c
[js/webgpu] Optimize Gather op (#17625)
### Description
This PR optimizes the gather op, improving it by ~6 ms in the segment
anything model on ADL.
The problem in the original algorithm is that it includes a for loop that
processes one block of data. However, the block size may be very
large, like `65536`. In a GPU shader, we should avoid large loops
and instead use more threads to do the work in parallel.
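
The difference between the two strategies can be sketched on the CPU in plain JavaScript. This is an illustration under assumptions, not the shader from this PR: it assumes gathering along axis 0, the function names are hypothetical, and each iteration of the outer loop stands in for one GPU thread.

```js
// Original idea: one "thread" per index, each running a long serial loop over a whole block.
function gatherPerBlock(data, blockSize, indices) {
  const output = new Float32Array(indices.length * blockSize);
  for (let row = 0; row < indices.length; row++) {
    for (let offset = 0; offset < blockSize; offset++) {
      output[row * blockSize + offset] = data[indices[row] * blockSize + offset];
    }
  }
  return output;
}

// Optimized idea: one "thread" per output element, with no loop inside a thread.
function gatherPerElement(data, blockSize, indices) {
  const output = new Float32Array(indices.length * blockSize);
  for (let globalIdx = 0; globalIdx < output.length; globalIdx++) {
    const row = Math.floor(globalIdx / blockSize); // which entry of `indices`
    const offset = globalIdx % blockSize; // position inside the block
    output[globalIdx] = data[indices[row] * blockSize + offset];
  }
  return output;
}

// Small made-up input: 4 rows of 8 elements, gather row 2.
const blockSize = 8;
const data = Float32Array.from({ length: 4 * blockSize }, (_, i) => i);
const indices = [2];

const perBlock = gatherPerBlock(data, blockSize, indices);
const perElement = gatherPerElement(data, blockSize, indices);

console.log(Array.from(perElement));
```

Both functions produce the same output; the second only changes how the work is split across threads.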

Before:
```
[profiling] kernel "41771992|[Gather] 41771992" input[0]: [4,65536] | float32, input[1]: [1] | int64, output[0]: [1,65536] | float32, execution time: 6886207 ns
```
After:
```
[profiling] kernel "41771992|[Gather] 41771992" input[0]: [4,65536] | float32, input[1]: [1] | int64, output[0]: [1,65536] | float32, execution time: 11719 ns
```
2023-09-21 21:00:36 -07:00
Jiajia Qin
cd3fb377ea
[js/webgpu] Allow binary ops with scalar to use the vectorize path (#17589)
### Description
1. For binary ops, the component count is always 4. So the dispatchGroup
should be: `{x: Math.ceil(outputSize / 64 /* workgroup size */ / 4 /*
component size */)}` instead of `{x: Math.ceil(outputSize / 64 /*
workgroup size */ / (vectorize ? 4 : 1) /* vec size */)}`.

2. If either a or b has only one element, we can still use the vectorized
path since the same value will be broadcast.
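
The two points above can be sketched in plain JavaScript. This is a CPU-side illustration only: the workgroup size of 64 and the 4 components come from the description, while `dispatchGroupX` and `addVectorized` are hypothetical names and not functions from the code base.

```js
const WORKGROUP_SIZE = 64;
const COMPONENTS = 4; // each invocation handles one vec4

// Number of workgroups along x when every invocation processes 4 components.
function dispatchGroupX(outputSize) {
  return Math.ceil(outputSize / WORKGROUP_SIZE / COMPONENTS);
}

// Element-wise add in chunks of 4; a single-element operand is broadcast to every lane.
function addVectorized(a, b, outputSize) {
  const out = new Float32Array(outputSize);
  const vecCount = Math.ceil(outputSize / COMPONENTS);
  for (let v = 0; v < vecCount; v++) { // one iteration stands in for one invocation
    for (let c = 0; c < COMPONENTS; c++) {
      const i = v * COMPONENTS + c;
      if (i >= outputSize) break; // tail guard for sizes not divisible by 4
      const x = a.length === 1 ? a[0] : a[i];
      const y = b.length === 1 ? b[0] : b[i];
      out[i] = x + y;
    }
  }
  return out;
}

const groups = dispatchGroupX(1000);
const sum = addVectorized([1, 2, 3, 4, 5], [10], 5);

console.log(groups, Array.from(sum));
```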
2023-09-21 20:55:08 -07:00
Arthur Islamov
498b60d8a4
[js/web] fp16 Pool & Reduce (#17512)
### Description
Two more ops to support fp16
2023-09-21 14:52:13 -07:00
Vincent Wang
e6301eee6a
Bump Up Version to 1.17.0 (#17587)
Bump up version to 1.17.0 as the 1.16.0 release branch had been branched
out.
2023-09-20 11:02:58 +08:00
Hariharan Seshadri
460f17fbb8
[JS/WebGPU] Support If on WebGPU (#17478)
2023-09-19 12:20:18 -07:00
Arthur Islamov
0f406ca1d3
[js/web] FP16 binary and unary ops (#17515)
### Description
Binary and unary ops with fp16 support
2023-09-18 15:43:32 -07:00
Yulong Wang
efd416b71f
[js/web] update test to explicitly fail for webnn without proxy (#17554)
### Description

Update test to explicitly fail for webnn without proxy.

I am making this change because if I test webnn together with other
backends, the test runner silently enables proxy. I want the test runner to
reset fewer flags implicitly. If proxy is not enabled, the webnn test
should fail.

@Honry please let me know if other places (e.g. CI scripts) should also
change.
2023-09-15 14:40:22 -07:00
Yulong Wang
155887593d
[js/web] update npm test to load test cases only for required backends (#17555)
### Description
update npm test to load test cases for required backends.

No need to load test case list for the backends that we don't test.
2023-09-15 13:55:25 -07:00
Yulong Wang
9aafbe3feb
[js/web] revise TensorView (#17473)
### Description

This change:
- removes the unused `Tensor` types declared in
/js/web/lib/wasm/jsep/tensor.ts
- removes duplicated util functions in  /js/web/lib/wasm/jsep/tensor.ts
- renames /js/web/lib/wasm/jsep/**tensor.ts** to
/js/web/lib/wasm/jsep/**tensor-view.ts** and update corresponding
references. It was confusing to have multiple `Tensor`
types defined in different places as well as multiple `tensor.ts`
source files.

This is one of the prerequisites for supporting IO binding for WebGPU
buffer in onnxruntime-web.

List of prerequisite PRs:
https://github.com/microsoft/onnxruntime/pull/17465
https://github.com/microsoft/onnxruntime/pull/17469
https://github.com/microsoft/onnxruntime/pull/17470
https://github.com/microsoft/onnxruntime/pull/17472
https://github.com/microsoft/onnxruntime/pull/17473 (this one)
2023-09-14 21:14:44 -07:00
Jiajia Qin
41d2ff622c
[js/webgpu] Optimize InstanceNormalization (#17491)
### Description
In the previous implementation, there are two loops that iterate over H * W
elements to calculate the `mean` and `squaredNorm` values in one thread,
and the same thread also outputs H * W elements. As a result, it is
very slow when H * W is large, and H * W usually is large
in a model. For example, in the `candy-8` model, the shapes
of [H, W] are [224,224], [112,112], [56,56] for the `InstanceNormalization`
op. On my ADL, `[1,224,224,32]` takes 17 ms. See below:
```
[profiling] kernel "23848328|[InstanceNormalization] 23848328" input[0]: [1,224,224,32] | float32, input[1]: [32] | float32, input[2]: [32] | float32, output[0]: [1,224,224,32] | float32, execution time: 17007914 ns
```

This PR uses workgroup memory to optimize the original algorithm.
The advantage is that the 64 (workgroupSize) threads in one workgroup
can calculate the `mean` and `squaredNorm` values in parallel.
Meanwhile, each thread only writes `H * W / workgroupSize` outputs,
which greatly reduces the per-thread overhead. With this
optimization, `[1,224,224,32]` takes 3 ms, and the main overhead is the
two extra `transpose` ops. The `createInstanceNormProgramInfo` only needs
`0.64` ms. See below:
```
[profiling] kernel "23003600|[InstanceNormalization] 23003600" input[0]: [1,224,224,32] | float32, output[0]: [1,32,224,224] | float32, execution time: 1543792 ns
program-manager.ts:115 
[profiling] kernel "23003600|[InstanceNormalization] 23003600" input[0]: [1,32,224,224] | float32, input[1]: [32] | float32, input[2]: [32] | float32, output[0]: [1,32,224,224] | float32, execution time: 642652 ns
program-manager.ts:115 
[profiling] kernel "23003600|[InstanceNormalization] 23003600" input[0]: [1,32,224,224] | float32, output[0]: [1,224,224,32] | float32, execution time: 991608 ns
```
This PR currently only applies the new algorithm to the NCHW format. For
the NHWC format, one way is to transpose the input so that it can use the
new algorithm, but the disadvantage is that 2 extra transposes are added.
@dakenf also suggests another way to optimize NHWC. See
[here](d45a96616d/js/web/lib/wasm/jsep/webgpu/ops/instance-norm.ts) for details.
I checked @dakenf's method. Its perf is similar to transpose +
optimized NCHW, but which one is slightly better varies across
GPUs. So I prefer that this PR only does the NCHW part.
@dakenf can submit his optimization for NHWC.
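
The workgroup-based reduction described above can be sketched on the CPU as follows. This is a simulation under assumptions, not the shader from this PR: `meanAndSquaredNorm` is a hypothetical name, `squaredNorm` is assumed to mean the sum of squared deviations from the mean, and each outer loop iteration stands in for one thread of the workgroup.

```js
// Each of `workgroupSize` "threads" accumulates a strided slice of the H * W values,
// then the per-thread partial results are combined.
function meanAndSquaredNorm(values, workgroupSize = 64) {
  const partialSum = new Float64Array(workgroupSize);
  for (let t = 0; t < workgroupSize; t++) {
    // Thread t handles elements t, t + workgroupSize, t + 2 * workgroupSize, ...
    for (let i = t; i < values.length; i += workgroupSize) {
      partialSum[t] += values[i];
    }
  }
  let sum = 0;
  for (let t = 0; t < workgroupSize; t++) sum += partialSum[t];
  const mean = sum / values.length;

  const partialSquares = new Float64Array(workgroupSize);
  for (let t = 0; t < workgroupSize; t++) {
    for (let i = t; i < values.length; i += workgroupSize) {
      const d = values[i] - mean;
      partialSquares[t] += d * d;
    }
  }
  let squaredNorm = 0;
  for (let t = 0; t < workgroupSize; t++) squaredNorm += partialSquares[t];

  return { mean, squaredNorm };
}

// Tiny made-up input with a workgroup of 2 "threads".
const stats = meanAndSquaredNorm([1, 2, 3, 4], 2);
console.log(stats);
```

Each thread touches only about `values.length / workgroupSize` elements, which is the per-thread saving the description refers to.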
2023-09-14 17:03:18 -07:00
xhcao
198d468849
[WebGPU/JS] Added Pad operator support (#16928)
2023-09-14 13:14:11 -07:00
Yulong Wang
7af2f68ef3
[js/web] add a test flag to customize chromium flags (#17545)
### Description
add a test flag to customize chromium flags.

Usage:
npm test -- \<other flags> --chromium-flags=<...>
2023-09-14 10:05:31 -07:00
Hans
ad369a1fad
[js/rn] Support create boolean tensor (#17052)
### Description

Some use cases need to create a boolean tensor.

I've tested on [this
project](https://github.com/hans00/react-native-transformers-example)

### Motivation and Context

Add handling for `ONNX_TENSOR_ELEMENT_DATA_TYPE_BOOL`.

It also requires #15556 (which does not seem to be included in the latest
release, v1.15.1).
2023-09-14 15:02:27 +10:00
Arthur Islamov
03b56f7a73
[js/webgpu] FP16 extension registration (#17493)
### Description
First small change to support FP16

---------

Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
2023-09-13 13:11:17 -07:00
Yulong Wang
a2e75114cc
[js/web] add sessionOptions.freeDimensionOverrides (#17488)
### Description
Allows specifying fixed sizes for a model's dynamic input dimensions.
Resolves #16707
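
A minimal sketch of what the session options could look like is shown below. The dimension names `batch_size` and `sequence_length` and their values are hypothetical; the symbolic dimension names of the actual model should be used instead.

```js
// Map each symbolic (free) dimension name of the model to a fixed size.
// 'batch_size' and 'sequence_length' are hypothetical names for illustration.
const sessionOptions = {
  executionProviders: ['wasm'],
  freeDimensionOverrides: {
    batch_size: 1,
    sequence_length: 128,
  },
};

// With onnxruntime-web loaded as `ort`, the options would be passed at session creation:
// const session = await ort.InferenceSession.create('./my_model.onnx', sessionOptions);

console.log(JSON.stringify(sessionOptions.freeDimensionOverrides));
```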

Pending test
2023-09-13 09:17:34 -07:00
Yulong Wang
cdf3e9dba9
[js] update prepack script to use exact version (#17484)
### Description
update prepack script to use exact version.

the prepack script for onnxruntime-node, onnxruntime-web and
onnxruntime-react-native is used to update their referencing version of
dependency "onnxruntime-common".

Previously "~" (tilde symbol) was used. This may cause NPM to choose an
older version (if the old version matches the version requirement and
was previously installed, so it hits the cache). See also
https://semver.npmjs.com/. [This
build](https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1134671&view=results)
was affected by this issue.
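
The pinning step can be sketched as follows. This is not the actual prepack script: `pinCommonVersion` is a hypothetical helper and the version strings are made up for illustration.

```js
// Hypothetical helper: returns a copy of the package manifest whose
// "onnxruntime-common" dependency is set to an exact version (no "~" range).
function pinCommonVersion(pkg, commonVersion) {
  return {
    ...pkg,
    dependencies: {
      ...pkg.dependencies,
      'onnxruntime-common': commonVersion,
    },
  };
}

// Made-up manifest using a tilde range, which accepts any patch release >= 1.17.0 and < 1.18.0.
const before = {
  name: 'onnxruntime-web',
  dependencies: { 'onnxruntime-common': '~1.17.0' },
};

// After pinning, only the exact version satisfies the requirement.
const after = pinCommonVersion(before, '1.17.0');

console.log(after.dependencies['onnxruntime-common']);
```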
2023-09-13 00:07:16 -07:00
xhcao
ec94b07f0a
[JS/WebGPU] support Concat.int32 operator (#17003)
2023-09-13 00:05:00 -07:00
Yulong Wang
41584b2827
[js/web] ensure ORT initialization to run only once (#17529)
### Description
ensure ORT initialization to run only once
2023-09-12 23:52:08 -07:00
Yulong Wang
f923eec28b
[js/web] release session after use in npm test (#17470)
### Description
release session after use in npm test.

This is one of the prerequisites for supporting IO binding for WebGPU
buffer in onnxruntime-web.

List of prerequisite PRs:
#17465
#17469
#17470 (this one)
2023-09-12 16:59:13 -07:00
Arthur Islamov
65249f42e4
[js/web] FP16 Gemm, Softmax & Transpose (#17494)
### Description
First three OPs to support fp16. Will add more once this gets merged
since others depend on changes in js_data_types
2023-09-11 21:09:37 -07:00
satyajandhyala
bf6d6961cc
[JS/Web] Added Einsum operator support. (#17401)
### Description
Added Einsum operator support to JSEP.



2023-09-11 15:57:15 -07:00
Yulong Wang
89da5a0108
[js/webgpu] exclude WebGPU reduce_log_sum_exp_* float64 test cases (#17472)
### Description

As explained in the comments, tests "test_reduce_log_sum_exp_*" on
opset17/opset18 are excluded because they use float64.

They are passing now because they fall back to CPU. WebGPU does not
support f64.


This is one of the prerequisites for supporting IO binding for WebGPU
buffer in onnxruntime-web.

List of prerequisite PRs:
https://github.com/microsoft/onnxruntime/pull/17465
https://github.com/microsoft/onnxruntime/pull/17469
https://github.com/microsoft/onnxruntime/pull/17470
https://github.com/microsoft/onnxruntime/pull/17472 (this one)
2023-09-08 17:03:04 -07:00
Caroline Zhu
dcc93909b4
Add training WASM generation to Web CI pipeline (#17319)
### Description
[Successful pipeline
run](https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1123141&view=results)

Added flag to build the training artifacts & updated the
pull-wasm-artifacts script to pull the training artifacts as well.

Bundled into this PR are minor formatting fixes + naming fixes.

### Motivation and Context
[This PR](https://github.com/microsoft/onnxruntime/pull/16521) extended
the WASM API wrapper to build training WASM artifacts as well.
The ORT training WASM artifacts are required to support ORT training web
bindings.
2023-09-08 15:49:47 -07:00
Yulong Wang
4d753b74a5
[js/common] prepare work for supporting webgpu IO binding implementation (#17465)
### Description
This PR contains a few changes in /js/common/ to support a coming PR for
a full implementation of webgpu IO binding.

- allows pass-through if value is already a Tensor instance in return
value of `handler.run()` called by `InferenceSession.run()`
(inference-session-impl.ts). Specifically, onnxruntime-node and
onnxruntime-react-native use native bindings to generate a Tensor-like
object, so we need to create a real Tensor instance here; for
onnxruntime-web the return value is already a Tensor instance.

- adds new types for GPU buffer supported types: `'float32'|'int32'` ->
`'float32'|'float16'|'int32'|'int64'|'uint32'|'bool'`

- exposes types `GpuBufferDataTypes` together with `CpuPinnedDataTypes`
and `TextureDataTypes` as exported
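
The pass-through behavior in the first bullet can be sketched with a stand-in class. The `Tensor` class and the `toTensor` helper below are hypothetical simplifications for illustration and are not the onnxruntime-common implementation.

```js
// Stand-in for the real Tensor class (hypothetical, greatly simplified).
class Tensor {
  constructor(type, data, dims) {
    this.type = type;
    this.data = data;
    this.dims = dims;
  }
}

// Pass real Tensor instances through untouched; wrap Tensor-like objects
// (such as those produced by native bindings) in a real Tensor.
function toTensor(value) {
  return value instanceof Tensor
    ? value
    : new Tensor(value.type, value.data, value.dims);
}

const realTensor = new Tensor('float32', new Float32Array([1, 2]), [2]);
const tensorLike = { type: 'int32', data: new Int32Array([3]), dims: [1] };

const passedThrough = toTensor(realTensor);
const wrapped = toTensor(tensorLike);

console.log(passedThrough === realTensor, wrapped instanceof Tensor);
```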
2023-09-08 13:49:24 -07:00
xhcao
9017ea131b
[js/webgpu] support GreaterOrEqual and LessOrEqual operators (#17310)
2023-09-07 17:41:16 -07:00
dependabot[bot]
eaef485461
Bump electron from 23.1.2 to 23.3.13 in /js/web (#17436)
Bumps [electron](https://github.com/electron/electron) from 23.1.2 to
23.3.13.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/electron/electron/releases">electron's
releases</a>.</em></p>
<blockquote>
<h2>electron v23.3.13</h2>
<h1>Release Notes for v23.3.13</h1>
<h2>End of Support for 23.x.y</h2>
<p>Electron 23.x.y has reached end-of-support as per the project's <a
href="https://www.electronjs.org/docs/latest/tutorial/electron-timelines#version-support-policy">support
policy</a>. Developers and applications are encouraged to upgrade to a
newer version of Electron.</p>
<h2>electron v23.3.12</h2>
<h1>Release Notes for v23.3.12</h1>
<h2>Other Changes</h2>
<ul>
<li>Fixed a crash while screen sharing on Wayland with PipeWire. <a
href="https://redirect.github.com/electron/electron/pull/39274">#39274</a></li>
<li>Security: backported fix for CVE-2023-3732.
<ul>
<li>Security: backported fix for CVE-2023-3728.</li>
<li>Security: backported fix for CVE-2023-3730. <a
href="https://redirect.github.com/electron/electron/pull/39268">#39268</a></li>
</ul>
</li>
</ul>
<h2>electron v23.3.11</h2>
<h1>Release Notes for v23.3.11</h1>
<h2>Fixes</h2>
<ul>
<li>Fixed a crash when listing desktop capture sources on Wayland with
PipeWire. <a
href="https://redirect.github.com/electron/electron/pull/39116">#39116</a>
<!-- raw HTML omitted -->(Also in <a
href="https://redirect.github.com/electron/electron/pull/39050">24</a>,
<a
href="https://redirect.github.com/electron/electron/pull/39051">25</a>,
<a
href="https://redirect.github.com/electron/electron/pull/39049">26</a>)<!--
raw HTML omitted --></li>
</ul>
<h2>electron v23.3.10</h2>
<h1>Release Notes for v23.3.10</h1>
<h2>Other Changes</h2>
<ul>
<li>Security: backported fix for CVE-2023-3422.
<ul>
<li>Security: backported fix for CVE-2023-3421.</li>
<li>Security: backported fix for CVE-2023-3420.</li>
<li>Security: backported fix for 1454860. <a
href="https://redirect.github.com/electron/electron/pull/38948">#38948</a></li>
</ul>
</li>
</ul>
<h2>electron v23.3.9</h2>
<h1>Release Notes for v23.3.9</h1>
<h2>Fixes</h2>
<ul>
<li>Fixed <code>preload</code> script may not run in some child windows
opened by <code>window.open</code>. <a
href="https://redirect.github.com/electron/electron/pull/38933">#38933</a>
<!-- raw HTML omitted -->(Also in <a
href="https://redirect.github.com/electron/electron/pull/38932">24</a>,
<a
href="https://redirect.github.com/electron/electron/pull/38931">25</a>,
<a
href="https://redirect.github.com/electron/electron/pull/38930">26</a>)<!--
raw HTML omitted --></li>
<li>Fixed minimize button to be visible when all buttons reenabled. <a
href="https://redirect.github.com/electron/electron/pull/38880">#38880</a>
<!-- raw HTML omitted -->(Also in <a
href="https://redirect.github.com/electron/electron/pull/38881">24</a>,
<a
href="https://redirect.github.com/electron/electron/pull/38879">25</a>)<!--
raw HTML omitted --></li>
</ul>
<h2>electron v23.3.8</h2>
<h1>Release Notes for v23.3.8</h1>
<h2>Other Changes</h2>
<ul>
<li>Security: backported fix for CVE-2023-3215.
<ul>
<li>Security: backported fix for CVE-2023-3216.</li>
<li>Security: backported fix for 1450536. <a
href="https://redirect.github.com/electron/electron/pull/38788">#38788</a></li>
</ul>
</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="4b782e259b"><code>4b782e2</code></a>
fix: avoid package.json check on built-in modules (<a
href="https://redirect.github.com/electron/electron/issues/39426">#39426</a>)</li>
<li><a
href="b2047d710c"><code>b2047d7</code></a>
ci: fix hang when validating AppVeyor artifacts (<a
href="https://redirect.github.com/electron/electron/issues/39401">#39401</a>)</li>
<li><a
href="10b2baea43"><code>10b2bae</code></a>
docs: clean up removed systemPreferences methods (<a
href="https://redirect.github.com/electron/electron/issues/39349">#39349</a>)</li>
<li><a
href="454990a201"><code>454990a</code></a>
chore: cherry-pick 4 changes from Release-0-M115 (<a
href="https://redirect.github.com/electron/electron/issues/39268">#39268</a>)</li>
<li><a
href="10b49ffa12"><code>10b49ff</code></a>
chore: cherry-pick 2 changes from webrtc (<a
href="https://redirect.github.com/electron/electron/issues/39274">#39274</a>)</li>
<li><a
href="dc0fc78fac"><code>dc0fc78</code></a>
fix: do not resolve electron entrypoints on disk (<a
href="https://redirect.github.com/electron/electron/issues/39249">#39249</a>)</li>
<li><a
href="1aafc2ae38"><code>1aafc2a</code></a>
ci: fail appveyor build if artifacts are missing (<a
href="https://redirect.github.com/electron/electron/issues/39219">#39219</a>)</li>
<li><a
href="595e25a270"><code>595e25a</code></a>
fix: use StartUpdating method for PipeWire capturer (<a
href="https://redirect.github.com/electron/electron/issues/39116">#39116</a>)</li>
<li><a
href="7fe5925c94"><code>7fe5925</code></a>
build: disable unneeded depot_tools update on Windows CI (<a
href="https://redirect.github.com/electron/electron/issues/39016">#39016</a>)</li>
<li><a
href="c4b0ff4994"><code>c4b0ff4</code></a>
chore: cherry-pick 4 changes from Release-3-M114 (<a
href="https://redirect.github.com/electron/electron/issues/38948">#38948</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/electron/electron/compare/v23.1.2...v23.3.13">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=electron&package-manager=npm_and_yarn&previous-version=23.1.2&new-version=23.3.13)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-07 17:39:49 -07:00
Jian Chen
8914fe687b
[js/webgpu] Include Support for neg.int32 (#17374)
### Description
Include Support for neg.int32



2023-09-06 12:00:16 -07:00
Yulong Wang
fa868ca9cd
[js/node] release sessions after use in npm test (#17353)
### Description
release sessions after use in NPM test.
2023-09-05 23:42:32 -07:00
Yulong Wang
d88406a31b
[js/common] use Map instead of object for backends (#17352)
### Description
resolved
https://github.com/microsoft/onnxruntime/security/code-scanning/1140
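
The details of the code-scanning alert are not given here, so the following is only a sketch of one common reason to prefer a `Map` over a plain object when the keys are caller-supplied names. The variable names are hypothetical.

```js
// A plain object inherits properties from Object.prototype, so some names
// appear to be "registered" even though nothing was ever stored.
const backendsAsObject = {};
const backendsAsMap = new Map();

const requestedName = 'constructor'; // a name that could come from a caller

const foundInObject = backendsAsObject[requestedName] !== undefined;
const foundInMap = backendsAsMap.get(requestedName) !== undefined;

console.log({ foundInObject, foundInMap });
```

A `Map` only returns entries that were explicitly set, so lookups by arbitrary names cannot reach inherited properties.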
2023-09-05 23:14:46 -07:00
Yulong Wang
75710f0006
[js/webgpu] add matmul broadcast tests (#17335)
### Description

Commit fffefb1c22 (#16969) optimized
matmul and also fixed broadcasting, so #17191 is no longer needed.
However, the newly added operator test file from the PR by @dakenf is
helpful, so pick it up and add it to enhance the tests.
2023-09-05 20:41:46 -07:00