Commit graph

137 commits

Author SHA1 Message Date
Arthur Islamov
65249f42e4
[js/web] FP16 Gemm, Softmax & Transpose (#17494)
### Description
These are the first three ops to support fp16. More will follow once this
is merged, since the others depend on the changes in js_data_types.
2023-09-11 21:09:37 -07:00
satyajandhyala
bf6d6961cc
[JS/Web] Added Einsum operator support. (#17401)
### Description
Added Einsum operator support to JSEP.



2023-09-11 15:57:15 -07:00
xhcao
9017ea131b
[js/webgpu] support GreaterOrEqual and LessOrEqual operators (#17310)
2023-09-07 17:41:16 -07:00
Jiajia Qin
5e747071be
[js/webgpu] Fix bug in conv2dByMatMul path (#17369)
### Description
For the conv2dByMatMul path, the simulated matmul output shape is a
reshape of the original conv2d output shape. This information should be
passed to `createMatmulProgramInfo` so that it can process the output correctly.
2023-09-02 00:16:28 -07:00
Jiajia Qin
352b745deb
[js/webgpu] Add input/output shapes information to profiling (#17342)
### Description
This PR enhances the profiling information.
With this PR, the profiling output looks like this:
```
[profiling] kernel "[Split] 51288384" input[0]: 1,256,64,64, output[0]: 1,256,64,64, execution time: 37135 ns
program-manager.ts:114 
[profiling] kernel "[Concat] 52361040" input[0]: 1,256,64,64, output[0]: 1,256,64,64, execution time: 50833 ns
program-manager.ts:114 
[profiling] kernel "[Transpose] 52375264" input[0]: 1,256,64,64, output[0]: 1,64,64,256, execution time: 99791 ns
program-manager.ts:114 
[profiling] kernel "[Sub] 51098472" input[0]: , input[1]: 1, output[0]: 1, execution time: 7448 ns
program-manager.ts:114 
[profiling] kernel "[Mul] 51344440" input[0]: 1, input[1]: 1,256,1,1, output[0]: 1,256,1,1, execution time: 8334 ns
```
Without this PR, the profiling output looks like this:
```
[profiling] kernel "52097928|[Split] 52097928" execution time: 37760 ns
program-manager.ts:105 
[profiling] kernel "41898328|[Concat] 41898328" execution time: 51666 ns
program-manager.ts:105 
[profiling] kernel "41915648|[Transpose] 41915648" execution time: 95416 ns
program-manager.ts:105 
[profiling] kernel "49757856|[Sub] 49757856" execution time: 7969 ns
program-manager.ts:105 
[profiling] kernel "51680504|[Mul] 51680504" execution time: 8906 ns
```
With the new information, we can easily see which ops and shapes
perform poorly. It also helps us check whether ops with very small
shapes are running on the GPU.
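
As a rough illustration of the new message format, the sketch below rebuilds one profiling line from a kernel name, the input/output dims and the execution time. The helper name `formatProfilingLine` and its parameters are hypothetical; this is not the code in `program-manager.ts`, only a reconstruction of the output shown above.

```typescript
// Hypothetical helper: formats one profiling record in the style shown above.
// The function and parameter names are illustrative, not taken from program-manager.ts.
function formatProfilingLine(
    kernelName: string,
    inputShapes: readonly (readonly number[])[],
    outputShapes: readonly (readonly number[])[],
    executionTimeNs: number): string {
  // A scalar has an empty dims array, which joins to an empty string (as in the "Sub" line).
  const inputs = inputShapes.map((dims, i) => `input[${i}]: ${dims.join(',')}`);
  const outputs = outputShapes.map((dims, i) => `output[${i}]: ${dims.join(',')}`);
  const shapes = inputs.concat(outputs).join(', ');
  return `[profiling] kernel "${kernelName}" ${shapes}, execution time: ${executionTimeNs} ns`;
}

const transposeLine =
    formatProfilingLine('[Transpose] 52375264', [[1, 256, 64, 64]], [[1, 64, 64, 256]], 99791);
```

Keeping the shapes in the same line as the timing makes it possible to filter the log by op type and shape with plain text tools.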
2023-08-31 08:12:28 -07:00
Yulong Wang
e5ca3f3dcb
[js/api] introducing IO binding for tensor (#16452)

### Description
This PR adds a few properties, methods and factories to the `Tensor` type to
support the IO-binding feature. This allows users to create a tensor from
GPU/CPU-bound data without forcing a data transfer between CPU and
GPU.

This change is a way to resolve #15312

### Change Summary
1. Add properties to `Tensor` type:
a. `location`: indicates where the data resides. Valid values are
`cpu`, `cpu-pinned`, `texture`, `gpu-buffer`.
b. `texture`: sits alongside `data`; a readonly property of type
`WebGLTexture`, available only when `location === 'texture'`.
c. `gpuBuffer`: sits alongside `data`; a readonly property of type
`GPUBuffer`, available only when `location === 'gpu-buffer'`.

2. Add methods to `Tensor` type (usually dealing with inference
outputs):
- async function `getData()` allows the user to download data from GPU to
CPU manually.
- function `dispose()` allows the user to release GPU resources manually.

3. Add factories for creating `Tensor` instances:
    a. `fromTexture()` to create a tensor bound to a WebGL texture
    b. `fromGpuBuffer()` to create a tensor bound to a WebGPU buffer
    c. `fromPinnedBuffer()` to create a tensor using a CPU pinned buffer

### Examples:

Create tensors from a texture and pass them to the inference session as inputs:
```js
// when creating the session, specify that we prefer 'image_output:0' to be stored on the GPU as a texture
const session = await InferenceSession.create('./my_model.onnx', {
  executionProviders: [ 'webgl' ],
  preferredOutputLocation: { 'image_output:0': 'texture' }
});

...

const myImageTexture = getTexture(); // user's function to get a texture
const myFeeds = { input0: Tensor.fromTexture(myImageTexture, { width: 224, height: 224 }) }; // shape [1, 224, 224, 4], RGBA format.
const results = await session.run(myFeeds);
const myOutputTexture = results['image_output:0'].texture;
```
2023-08-29 12:58:26 -07:00
Jiajia Qin
fffefb1c22
[js/webgpu] Optimize matmul (#16969)
### Description
Changes in this PR:
1) use the optimized version `makeMatMulPacked[Vec4]Source` to support
matmul.
2) enable the conv2dByMatMul path.
3) support broadcast
4) use IndicesHelper.

MatMul with M = 512, K = 512, N = 512 drops from 15 ms to 2 ms with
profilingMode enabled on my ADL.
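
To make the "support broadcast" item concrete, here is a minimal sketch of how the output shape of a batched MatMul can be computed when the leading (batch) dimensions broadcast. It assumes both inputs have rank >= 2 and follows the usual right-aligned broadcasting rule; the function name `matMulOutputShape` is hypothetical and this is not the shader code from the PR.

```typescript
// Illustrative only: output shape of a batched MatMul with broadcasting of the
// leading (batch) dimensions. Both inputs are assumed to have rank >= 2.
function matMulOutputShape(a: readonly number[], b: readonly number[]): number[] {
  if (a.length < 2 || b.length < 2) {
    throw new Error('this sketch only handles inputs of rank >= 2');
  }
  const [m, k] = a.slice(-2);
  const [kOfB, n] = b.slice(-2);
  if (k !== kOfB) {
    throw new Error('shared dimension K does not match');
  }
  const batchA = a.slice(0, -2);
  const batchB = b.slice(0, -2);
  const rank = Math.max(batchA.length, batchB.length);
  const batch: number[] = [];
  for (let i = 0; i < rank; i++) {
    // Align from the right; a missing leading dimension behaves like 1.
    const da = batchA[batchA.length - rank + i] ?? 1;
    const db = batchB[batchB.length - rank + i] ?? 1;
    if (da !== db && da !== 1 && db !== 1) {
      throw new Error(`batch dimensions ${da} and ${db} cannot be broadcast`);
    }
    batch.push(da === 1 ? db : da);
  }
  return batch.concat([m, n]);
}

const broadcastShape = matMulOutputShape([2, 1, 3, 4], [5, 4, 6]);  // [2, 5, 3, 6]
```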
2023-08-29 12:40:57 -07:00
Caroline
228db24317
Add training API functions to WASM API (#16521)
### Description
* Created `wasm/training_api` source and header files & modified
WebAssembly CMake to include training flags
* The `wasm/training_api` files use an `OrtTrainingManager` handle which
is a struct of an OrtCheckpointState and an OrtTrainingSession, rather
than creating a CheckpointState handle & a separate TrainingSession
handle.
* This is so that the TypeScript side only has to manage one handle that
will be passed between TrainingSession & CheckpointState
representations, rather than the TypeScript side managing separate
CheckpointStateHandle and TrainingSessionHandle.


### Motivation and Context
WASM API needs to be updated with ORT training API function calls so
that ORT training web bindings can be added for on-device training.

---------

Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
Co-authored-by: carzh <carolinezhu@microsoft.com>
Co-authored-by: Ashwini Khade <askhade@microsoft.com>
2023-08-28 11:05:02 -07:00
Hariharan Seshadri
cbd97515cd
[JS/WebGPU] Support GatherElements kernel (#17243)
### Description
As title


### Motivation and Context
Improve WebGPU kernel coverage
2023-08-28 09:55:25 -07:00
Yulong Wang
bb1871332f
[js/webgpu] add kernel Not and Equal (#17306)
### Description
This PR adds kernel implementations for the operators "Not" and "Equal". It
also removes the download cache in the gpu data manager.

**Why remove the download cache**
The following test case failed. ("Or" is on CPU, "Greater" and "Equal"
are on JSEP)

![image](https://github.com/microsoft/onnxruntime/assets/7679871/8d9798ad-2703-4fb9-907e-ff716c67d0b2)
After debugging, I found that both "Equal" and "Greater" use the
same output GPU Data ID. This is because when ORT executes the graph, it
first runs "Equal", allowing its shader to write into GPU Data ID 2; then
a Gpu2Cpu copy for it is issued (because "Or" is currently on the CPU EP).
At this point, ORT considers GPU Data ID 2 free to use, so it reuses it
as the output for "Greater". This means there is no allocation for the output
of the "Greater" kernel, and both kernels write to GPU Data ID 2.

For the gpu data manager, there will be 2 downloads from the same GPU
buffer. Previously I thought this was a waste of resources, so I cached the
data. But it now shows that we need to perform 2 downloads because the
GPU data is already different. The download data cache should be
removed.
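
The failure mode described above can be reproduced with a toy model. In the sketch below, GPU buffers are simulated with a `Map` from GPU data ID to contents, and the tensor values are made up for illustration; none of the names come from the actual gpu data manager.

```typescript
// Toy model of the scenario above: GPU buffers are simulated by a Map from GPU data ID
// to contents. All names and values here are hypothetical.
const gpuMemory = new Map<number, number[]>();
const downloadCache = new Map<number, number[]>();

// Download with a cache keyed by GPU data ID (the behavior that was removed).
function downloadCached(id: number): number[] {
  const cached = downloadCache.get(id);
  if (cached !== undefined) {
    return cached;  // stale if the same ID has been written again since the first download
  }
  const fresh = (gpuMemory.get(id) ?? []).slice();
  downloadCache.set(id, fresh);
  return fresh;
}

// Download without a cache: always reads the current contents.
function downloadUncached(id: number): number[] {
  return (gpuMemory.get(id) ?? []).slice();
}

gpuMemory.set(2, [1, 0, 1]);               // "Equal" writes its output to GPU data ID 2
const equalOutput = downloadCached(2);     // Gpu2Cpu copy for the CPU "Or" node
gpuMemory.set(2, [0, 0, 1]);               // ID 2 is reused as the output of "Greater"
const greaterCached = downloadCached(2);   // still the "Equal" data
const greaterFresh = downloadUncached(2);  // the actual "Greater" data
```

Because the ID alone does not identify the contents once ORT reuses it, a cache keyed only by ID cannot be correct here, which is why every download has to read the buffer again.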


2023-08-27 19:50:17 -07:00
Yulong Wang
ddcd46174e
[js/webgpu] fix jsepOnRunEnd (#17300)
### Description
Fix jsepOnRunEnd: `jsepOnRunEnd()` needs to run after runPromise is
resolved.
2023-08-26 00:30:28 -07:00
Jiajia Qin
873ef8b8f0
[js/webgpu] add label for some webgpu APIs (#17291)
### Description
With the label, it is easier to identify which op causes the error.

Without the label, the error message looks like this:
```
Tint WGSL reader failure: :12:5 error: return statement type must match its function return type, returned 'vec4<f32>', expected 'f32'
    return W[i2o_W(indices)];
    ^^^^^^

 - While validating [ShaderModuleDescriptor]
 - While calling [Device].CreateShaderModule([ShaderModuleDescriptor]).
```
With the label, the error message looks like this:
```
Tint WGSL reader failure: :12:5 error: return statement type must match its function return type, returned 'vec4<f32>', expected 'f32'
    return W[i2o_W(indices)];
    ^^^^^^

 - While validating [ShaderModuleDescriptor "ConvTranspose2D"]
 - While calling [Device].CreateShaderModule([ShaderModuleDescriptor "ConvTranspose2D"]).
```
### Motivation and Context
This change is mainly for debugging. With it, we can easily tell from the
message above that the `ConvTranspose2D` shader has a problem.
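
A minimal sketch of the idea: in WebGPU, descriptor objects such as the one passed to `createShaderModule()` accept an optional `label`, and that label is echoed in validation messages. The interface and helper below are hypothetical stand-ins so the snippet runs without a GPU; they are not the code from this PR.

```typescript
// Minimal shape of the descriptor fields used here. In a browser this corresponds to
// GPUShaderModuleDescriptor, whose `label` and `code` members are part of the WebGPU API.
interface LabeledShaderModuleDescriptor {
  label: string;
  code: string;
}

// Hypothetical helper: uses the program (op) name as the label of the shader module.
function describeShaderModule(programName: string, wgslSource: string): LabeledShaderModuleDescriptor {
  return {label: programName, code: wgslSource};
}

const convTransposeDescriptor = describeShaderModule('ConvTranspose2D', '/* WGSL source goes here */');
// In a WebGPU-enabled environment: device.createShaderModule(convTransposeDescriptor)
```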
2023-08-25 12:12:56 -07:00
xhcao
5e8d94cec8
[js/webgpu] support Greater and Less operators (#17296)
2023-08-25 12:11:25 -07:00
Yulong Wang
79c4ed9a45
[js/webgpu] support error pop and kernel name (#17260)
### Description
This PR contains changes to support error pop and kernel name.

- Add a function `JsepGetNodeName` to allow reading kernel name from JS
to C++
- When in debug mode ( `env.debug = true;` ) or in profiling mode (
`env.webgpu.profilingMode = 'default';` ), kernel name will be read from
ORT; otherwise use the kernel pointer ( a number ) as kernel name to
save calls from JS to C++.
- When in debug mode, WebGPU validation errors will be recorded, and if
any error occurs, `inferenceSession.run()` will fail (the Promise is
rejected). Behavior outside debug mode is unchanged. This is
because recording errors is not zero-overhead, and GPU validation
errors should occur consistently whether or not debug mode is on.
- Add `jsepOnRunStart()` and `jsepOnRunEnd()` hooks to:
   - allow implementation of the features mentioned above.
   - pass session ID to backend.
2023-08-25 08:08:15 -07:00
satyajandhyala
da180b20fa
[JS/Web] Fix ConvTranspose shader code compilation errors. (#17232)
### Description
Fix JSEP ConvTranspose shader code errors.



2023-08-25 06:25:54 -07:00
Yulong Wang
fb51faea64
[js/webgpu] fix 2 build breaks introduced in merge (#17273)
### Description
fix 2 build breaks introduced in merge. Fixes web build
2023-08-23 18:09:50 -07:00
Yulong Wang
8b18d48c7c
[js/webgpu] make IndicesHelper implementation implicit (#17193)
### Description
This change makes it no longer required to call indicesHelper.impl() in
shader code.
2023-08-23 14:41:35 -07:00
Guenther Schmuelling
d3d3dde844
fix webgpu split (#17258)
fix webgpu split for the case of split_sizes coming from input[1]
2023-08-22 16:49:22 -07:00
Yulong Wang
6fc3fd9ece
[js/webgpu] support Cast operator (#16489)
### Description
support `Cast` operator for webgpu backend.

Cast operator for webgpu backend currently only supports f32, u32, i32
and bool.
2023-08-18 23:51:03 -07:00
xhcao
dd3b2cefd6
[js/webgpu] Support int32 type for binary (#16901)
### Description
Enable typed binary and support int32 type for binary.

---------

Co-authored-by: Xing Xu <xing.xu@intel.com>
2023-08-18 12:19:01 -07:00
Hariharan Seshadri
a476dbf430
[JS/WebGPU] Support Tile operator (#17123)
### Description
As title

### Motivation and Context
Improve WebGPU op coverage
2023-08-18 10:07:21 -07:00
satyajandhyala
7d1a5635a0
[JS/Web] Added SkipLayerNormalization operator. (#17102)
### Description
Add SkipLayerNormalization operator to JSEP.



2023-08-18 09:59:03 -07:00
Hariharan Seshadri
66df11769c
[JS/WebGPU] Expand operator fixes (#17137) 2023-08-16 11:24:26 -07:00
satyajandhyala
89b682e3f3
[JS/Web] The bias input is optional, not required, for LayerNormalization operator (#17143)
### Description
Fix a typo. LayerNormalization takes 2 or 3 inputs. The third input,
bias, is optional.



2023-08-16 10:41:20 -07:00
Yulong Wang
133af1385c
[js/webgpu] update shader cache key to include input tensor datatype (#17176)
### Description
update shader cache key to include input tensor datatype.

and make the key a little bit easier to read
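
As a sketch of why the data type belongs in the key: two programs with the same name and shapes but different input types need different shaders, so they must not share a cache entry. The key layout below (program name, then `type[dims]` per input) is an assumption for illustration and not the actual key format used by the backend.

```typescript
// Hypothetical sketch: builds a readable cache key from the program name and, for each
// input, its data type and dims. The key layout is an assumption, not the real format.
interface SketchInputInfo {
  dataType: string;  // e.g. 'float32' or 'int32'
  dims: readonly number[];
}

function shaderCacheKey(programName: string, inputInfos: readonly SketchInputInfo[]): string {
  const inputsPart = inputInfos.map((info) => `${info.dataType}[${info.dims.join(',')}]`).join(';');
  return `${programName}|${inputsPart}`;
}

const floatKey = shaderCacheKey('Add', [{dataType: 'float32', dims: [2, 3]}, {dataType: 'float32', dims: [2, 3]}]);
const intKey = shaderCacheKey('Add', [{dataType: 'int32', dims: [2, 3]}, {dataType: 'int32', dims: [2, 3]}]);
// floatKey and intKey differ, so the f32 and i32 shaders get separate cache entries.
```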
2023-08-16 09:14:19 -07:00
Guenther Schmuelling
8289e8b6ef
[js/webgpu] fix a few shader errors (#17171)
Fix for segment anything decoder, reduceMax with rank1 and concat.
2023-08-15 21:14:20 -07:00
Arthur Islamov
ccf14e891e
[js/web] JSEP node assignment optimization (#17128)
### Description
Since WebGPU supports only float32 and int32, having Gather, Reshape,
Shape, Squeeze and Unsqueeze ops with other data types creates additional
MemCpy ops and slows down overall execution, because all other ops with
those tensor types run on the CPU.

Before this patch SD Unet had these numbers:
Node(s) placed on [CPUExecutionProvider]. Number of nodes: 1141
Node(s) placed on [JsExecutionProvider]. Number of nodes: 4025
memcpy tokens: 2001

After patch:
Node(s) placed on [CPUExecutionProvider]. Number of nodes: 1735
Node(s) placed on [JsExecutionProvider]. Number of nodes: 2243
memcpy tokens: 813

It also gives more than a 5X performance benefit: from 12 sec for one Unet
step to 2.2 sec on an RTX 3090 Ti, so we are almost at native
performance.

UPD: with the latest changes from the main branch and multi-threading, it
went down to 1.6 sec. I will try re-exporting my model to ONNX with maximum
optimizations, such as using MultiHeadAttention to decrease the node count.
After implementing that, it may go below 1 sec.
2023-08-15 18:58:05 -07:00
xhcao
24e0bd37b4
[JS/WebGPU] Support Log operator (#17045)
2023-08-14 18:04:12 -07:00
Yulong Wang
14a8315f10
[js/web] [webgpu] new incides helper (#16957)
### Description
This PR introduces the new indices helper.

IndicesHelper is a helper class that offers a unified way to generate
WGSL code for manipulating indices and data for a shader's input or
output. The following terms are used in this class:
- `offset`: a uint32 value representing the offset of an element in the
data buffer.
- `indices`: an abstraction of a multi-dimensional array's indices
representing the data's index on each dimension.
- `value`: a value of a data element.

Users are expected to create an instance of this class for each shader's
input or output, and use the instance to generate WGSL code for
manipulating indices and data. The following 2 exported functions are
for users to call to create an instance of an indices helper:
 - `inputVariable()`: create an indices helper instance for an input.
 - `outputVariable()`: create an indices helper instance for an output.


An indices helper instance contains helper functions for the following
operations:
- access readonly basic information, including: `name`(the name of the
input or output), `usage`(whether it's an input or an output) and
`shape`(the passed in shape).
- `type`: access readonly type information, including: `indices`(the
type of indices), `value`(the type of value at runtime), `storage`(the
type of value at storage) and `tensor`(the tensor type as represented in
TensorView).
- generate WGSL code for converting between offsets and indices. Use
`offsetToIndices()` for a WGSL code snippet that calculates indices from
an offset, and `indicesToOffset()` for a WGSL code snippet that calculates
an offset from indices.
- to manipulate an instance of indices, use `setIndices()` and
`getIndices()` to set and get the indices on an indices variable.
- to manipulate data, use `set()`/`get()` to access data at the given
indices from parameter list, use `setByIndices()`/`getByIndices()` to
access data at the given indices from an indices variable, and use
`setByOffset()`/`getByOffset()` to access data at the given offset.
- `impl`: get WGSL code of function implementation for the util
functions mentioned above.

This change applies the new IndicesHelper throughout the code, though
not necessarily to all of it.
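
The relationship between `offset` and `indices` can be sketched on the CPU side. The code below is not the WGSL that IndicesHelper generates; it is a plain TypeScript version of the same arithmetic, assuming a row-major layout, with `computeStrides` as a hypothetical helper name.

```typescript
// Row-major strides for a shape, e.g. [2, 3, 4] -> [12, 4, 1].
function computeStrides(shape: readonly number[]): number[] {
  const strides: number[] = [];
  let stride = 1;
  for (const dim of shape.slice().reverse()) {
    strides.unshift(stride);
    stride *= dim;
  }
  return strides;
}

// Converts a flat element offset into per-dimension indices.
function offsetToIndices(offset: number, shape: readonly number[]): number[] {
  let rest = offset;
  return computeStrides(shape).map((stride) => {
    const index = Math.floor(rest / stride);
    rest -= index * stride;
    return index;
  });
}

// Converts per-dimension indices back into a flat element offset.
function indicesToOffset(indices: readonly number[], shape: readonly number[]): number {
  return computeStrides(shape).reduce((sum, stride, i) => sum + stride * (indices[i] ?? 0), 0);
}

const indicesOf17 = offsetToIndices(17, [2, 3, 4]);         // [1, 1, 1]
const offsetBack = indicesToOffset(indicesOf17, [2, 3, 4]);  // 17
```

The two functions are inverses of each other for any offset inside the tensor, which is the property the generated WGSL helpers rely on.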
2023-08-11 11:36:59 -07:00
Zimon Tai
a3e02e8e2a
Fix Resize op input check (#16594)
### Description
onnxjs contains a `Resize` op input check which has been outdated since
opset 9. Currently `Resize` supports up to 4 inputs. This PR loosens the
input check.



### Motivation and Context

Fixes #15636
2023-08-09 15:42:30 -07:00
Arthur Islamov
c3f04251c7
[js/web] JSEP LayerNormalization and InstanceNormalizations kernels (#16830)
### Description
Added two kernels for Layer and Instance norm.

Also raised `maxBufferSize` to the maximum limit when requesting the GPU
device, as by default it is limited to 256 MB, which fails when allocating
a 600 MB buffer while running fp32 StableDiffusion weights.
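
A minimal sketch of requesting a larger buffer limit, assuming the adapter reports its supported maximum in `limits.maxBufferSize` as in the WebGPU API. The `AdapterLike` interface, the helper name and the 2 GiB value are stand-ins so the snippet runs without a GPU; this is not the code from the PR.

```typescript
// Only the field used by this sketch; in a browser, GPUAdapter.limits exposes maxBufferSize.
interface AdapterLike {
  limits: {maxBufferSize: number};
}

// Hypothetical helper: builds a device descriptor that requests the adapter's maximum
// buffer size instead of the default limit.
function deviceDescriptorWithMaxBuffer(adapter: AdapterLike): {requiredLimits: {maxBufferSize: number}} {
  return {requiredLimits: {maxBufferSize: adapter.limits.maxBufferSize}};
}

// Stand-in adapter reporting a 2 GiB limit (an illustrative value).
const fakeAdapter: AdapterLike = {limits: {maxBufferSize: 2 * 1024 * 1024 * 1024}};
const deviceDescriptor = deviceDescriptorWithMaxBuffer(fakeAdapter);
// In a WebGPU-enabled environment: const device = await adapter.requestDevice(deviceDescriptor);
```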


### Motivation and Context
These two are used in StableDiffusion and many other networks
2023-08-08 09:09:37 -07:00
Jiajia Qin
9ea0a3129b
[js/webgpu] Make sure only storage buffers are reused (#16893)
### Description
This PR makes sure that only storage buffers are reused. Previously, the
query buffer might also be taken from the freeBuffers list if it contained
a buffer of matching size. But the two have different usages, which results in errors.
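
The rule can be sketched with a toy buffer pool. `FakeBuffer` stands in for `GPUBuffer`, and `usage` is a plain string rather than the `GPUBufferUsage` bit flags of the real API; the class and method names are hypothetical.

```typescript
// Toy buffer pool. `FakeBuffer` stands in for GPUBuffer, and `usage` is a plain string
// here rather than the GPUBufferUsage bit flags of the real WebGPU API.
type SketchUsage = 'storage' | 'query-resolve';

interface FakeBuffer {
  size: number;
  usage: SketchUsage;
}

class StorageBufferPool {
  private readonly freeBuffers = new Map<number, FakeBuffer[]>();

  // Reuses a released buffer only when a storage buffer is requested.
  acquire(size: number, usage: SketchUsage): FakeBuffer {
    if (usage === 'storage') {
      const reused = this.freeBuffers.get(size)?.pop();
      if (reused !== undefined) {
        return reused;
      }
    }
    return {size, usage};
  }

  // Only storage buffers go back on the free list.
  release(buffer: FakeBuffer): void {
    if (buffer.usage !== 'storage') {
      return;
    }
    const sameSize = this.freeBuffers.get(buffer.size) ?? [];
    sameSize.push(buffer);
    this.freeBuffers.set(buffer.size, sameSize);
  }
}

const bufferPool = new StorageBufferPool();
const firstStorage = bufferPool.acquire(1024, 'storage');
bufferPool.release(firstStorage);
const queryBuffer = bufferPool.acquire(1024, 'query-resolve');  // a new buffer, not the released one
const secondStorage = bufferPool.acquire(1024, 'storage');      // the released buffer is reused
```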
2023-08-04 13:40:52 -07:00
satyajandhyala
7ad43d9564
[JS/Web] Fixed ArgMin and ArgMax and refactored (#17002)
Fixed ArgMin and ArgMax and refactored using functionality from Reduce
operator code.

### Description
Removed code/functionality duplication and fixed some issues.



2023-08-04 12:59:36 -07:00
satyajandhyala
cc4b64f646
[JS/Web] Modify Reduce, Expand and Slice to pass op and node tests. (#16979)
### Description
Make the CacheHint mechanism work by adding input dims. The mechanism is
designed to avoid running the same test multiple times by saving the
result mapped against a key.



2023-08-03 15:48:47 -07:00
Arthur Islamov
ea55700e1c
[js/web] JSEP Gather OP (#16855)
### Description
Added Gather op that works with both i32 and i64 indices, assuming that
values fall into i32 limit. The assumption is safe because it's not
possible to allocate more than 2gb buffer for inputs.

It treats all data from input tensor as u32, copying 1 or 2 elements for
i64, u64 and double.
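
The "treat everything as u32" approach can be sketched on the CPU. The snippet below gathers elements from a 1-D tensor by copying one or two 32-bit words per element; it handles only non-negative indices on axis 0, and the function name `gatherAsU32` is hypothetical rather than taken from the kernel.

```typescript
// Gathers elements along axis 0 of a 1-D tensor, treating the data as raw u32 words.
// `wordsPerElement` is 1 for 32-bit types and 2 for 64-bit types (i64, u64, double).
// Negative indices are not handled in this sketch.
function gatherAsU32(data: Uint32Array, wordsPerElement: 1 | 2, indices: readonly number[]): Uint32Array {
  const output = new Uint32Array(indices.length * wordsPerElement);
  indices.forEach((index, i) => {
    const source = index * wordsPerElement;
    // Copy the element word by word without interpreting its bits.
    output.set(data.subarray(source, source + wordsPerElement), i * wordsPerElement);
  });
  return output;
}

const doubles = new Float64Array([1.5, 2.5, 3.5]);
const words = new Uint32Array(doubles.buffer);  // the same bytes viewed as u32 words
const gathered = new Float64Array(gatherAsU32(words, 2, [2, 0]).buffer);  // [3.5, 1.5]
```

Since the element bits are copied unchanged, the same routine works for any 32-bit or 64-bit element type without type-specific shader code.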

---------

Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>
2023-08-03 14:09:37 -07:00
Arthur Islamov
acb9e56164
[js/web] JSEP Expand fix for inputs with rank < 2 (#16829)
### Description
If Expand inputs have rank < 2, `inputIndicesHelper` and
`outputIndicesHelper` create indices as u32 instead of array<u32>, and
`calculateInputIndex` throws an error.



### Motivation and Context
I've encountered this error while making StableDiffusion work with JSEP
2023-08-03 11:38:04 -07:00
Guenther Schmuelling
0df2e14038
js/webgpu: argmax,argmin,softmax support (#16882)
argmax and argmin are similar to reduce. Eventually we need to add
optimized flavors of the shader.

softmax is optimized but only works on the last axis for now which
should be the common use case.
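
For reference, softmax over the last axis can be written on the CPU as below. This is a plain TypeScript sketch of the math (with the usual max subtraction for numerical stability), useful for checking shader output; it is not the optimized WebGPU kernel, and `softmaxLastAxis` is a hypothetical name.

```typescript
// Reference (CPU) softmax over the last axis of a row-major tensor.
function softmaxLastAxis(data: readonly number[], shape: readonly number[]): number[] {
  const cols = shape.length > 0 ? (shape[shape.length - 1] ?? 1) : 1;
  if (cols <= 0) {
    return [];
  }
  const result: number[] = [];
  for (let start = 0; start < data.length; start += cols) {
    const row = data.slice(start, start + cols);
    // Subtract the row maximum before exponentiating to avoid overflow.
    const maxValue = row.reduce((m, v) => Math.max(m, v), -Infinity);
    const exps = row.map((v) => Math.exp(v - maxValue));
    const total = exps.reduce((sum, v) => sum + v, 0);
    exps.forEach((v) => result.push(v / total));
  }
  return result;
}

const uniformRows = softmaxLastAxis([1, 1, 1000, 1000], [2, 2]);  // each row becomes [0.5, 0.5]
```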

todo: enable more ut for argmax/argmin
2023-08-02 18:16:19 -07:00
Hariharan Seshadri
506ddb3d5d
[js/WebGPU] Support int32 Transpose in WebGPU (#16952) 2023-08-02 16:27:24 -07:00
Jiajia Qin
fa8487ea3a
[js/webgpu] Check profilingMode in each run (#16897)
### Description
This PR moves the profilingMode check from the initialization stage to
each run. This way, users can start or stop profiling at any
time. Otherwise, profiling only takes effect at the very beginning and
can't be stopped.
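
A toy model of the difference: the flag is read at the start of every run, so changing it between runs takes effect immediately. `sketchEnv` mimics a mutable settings object such as `env.webgpu`; the `'off'` value and the class below are assumptions for illustration.

```typescript
// Toy model: `sketchEnv` mimics a mutable settings object such as `env.webgpu`.
// The 'off' value is an assumption made for this sketch.
const sketchEnv: {profilingMode: 'off' | 'default'} = {profilingMode: 'off'};

class SketchBackend {
  readonly profiledRuns: number[] = [];
  private runCount = 0;

  run(): void {
    this.runCount++;
    // Read the flag at the start of every run, not once at initialization.
    const profiling = sketchEnv.profilingMode === 'default';
    if (profiling) {
      this.profiledRuns.push(this.runCount);
    }
  }
}

const backendSketch = new SketchBackend();
backendSketch.run();  // run 1: not profiled
sketchEnv.profilingMode = 'default';
backendSketch.run();  // run 2: profiled
sketchEnv.profilingMode = 'off';
backendSketch.run();  // run 3: not profiled
```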
2023-07-31 17:37:24 -07:00
satyajandhyala
77b2b618b2
[JS/WebGPU] Add Resize operator (#16680)
### Description
Implemented Resize operator support in JSEP



2023-07-31 09:35:06 -07:00
satyajandhyala
dd24d52737
[JS/Web] Added Gelu contrib operator support to JSEP (#16909)
### Description
Added Gelu operator to JSEP


2023-07-31 09:18:58 -07:00
Yulong Wang
1743e9a615
[js] enable formatter for more file types (#16888)
### Description
enable formatter for .js/.json/.jsonc/.md files
2023-07-28 15:46:58 -07:00
satyajandhyala
03ce0a5693
[Web/JS] Added Slice operator in JSEP. (#16811)
### Description
Added Slice operator support to JSEP.



2023-07-25 14:19:20 -07:00
Jiajia Qin
193415a162
[js/webgpu] reuse buffer for GpuDataManager (#16746)
### Description
Allocating a new GPUBuffer in every session.run is not efficient. It
should happen only in the first run; subsequent runs should try to
reuse those buffers.

### Motivation and Context
- This PR is for performance.
mobilenetv2 drops from 12.9 ms to 9.58 ms.
2023-07-21 13:13:01 -07:00
Yulong Wang
7dcb805ab8
[js/web] upgrade onnx-proto version (#16722)
### Description
This change upgrades a lot of dependencies. There are 2 motivations for
this change:
- fix the security issue reported by dependabot (protobufjs Prototype
Pollution vulnerability -
https://github.com/advisories/GHSA-h755-8qp9-cq85)
 - resolve the requirement of using ONNX IR_VERSION 9 (#16638)


This requires:
- upgrade protobufjs to v7.2.4
- upgrade library 'onnx-proto' to consume latest ONNX release (v1.14.0).

Problems:
- protobufjs v7.2.4 depends on long.js v5, which does not work well with
typescript (commonjs).
- onnx-proto depends on this fix with a new release of long.js
- long.js is in maintenance and it takes longer than expected to put in
new changes

Solutions:
- use a patch script in `preprepare` to copy type declarations to make
long.js work with typescript (commonjs)
- generate onnx protobuf JS/TS files and put them under
js/web/lib/onnxjs/ort-schema/protobuf folder - remove 'onnx-proto' from
dependency.
- apply fixes to generated onnx.d.ts
2023-07-18 16:36:39 -07:00
Yulong Wang
d1d65978f6
[js/web] fix file size trim for wasm only .min.js (#16681)
### Description
fix file size trim for wasm only .min.js

minimal build `ort.wasm.min.js` and `ort.wasm-core.min.js` should
exclude JSEP related source code.
2023-07-13 14:20:51 -07:00
satyajandhyala
d41bbac7b9
[Web/JS] Added Expand operator support. (#16577)
### Description
Added Expand operator support.



2023-07-11 09:38:16 -07:00
satyajandhyala
00e8f2a2a9
[Web/JS] Add ConvTranspose support (#16433)
### Description
Add ConvTranspose support for WebGPU


2023-07-08 11:10:50 -07:00
Yulong Wang
5c6613875c
[js/web] [JSEP] allow passing data in kernel compute (#16621)
### Description
allow passing data in OpKernel::Compute() from C++ to JS.
2023-07-07 14:27:30 -07:00
satyajandhyala
e55a20ece8
[Web/JS] Added Split operator support. (#16567)
### Description
Added WebGPU/JSEP Split operator support.



2023-07-07 12:16:10 -07:00