onnxruntime/js/web/docs/webgpu-operators.md
Yulong Wang bb1871332f
[js/webgpu] add kernel Not and Equal (#17306)
### Description
This PR adds kernel implementation for operator "Not" and "Equal". Also
removed download cache in gpu data manager.

**Why removing download cache**
The following test case failed. ("Or" is on CPU, "Greater" and "Equal"
are on JSEP)

![image](https://github.com/microsoft/onnxruntime/assets/7679871/8d9798ad-2703-4fb9-907e-ff716c67d0b2)
after debugging, I found that both "Equal" and "Greater" are using the
same output GPU Data ID. This is because when ORT executes the graph, it
first run "Equal", allowing its shader to write into GPU Data ID 2; then
a Gpu2Cpu copy for it is issued (because currently "Or" is on CPU EP);
at this point, ORT thinks GPU Data ID=2 is free to use; so it reuse it
as output for "Greater". This means there is no allocation for output of
"Greater" kernel, and both kernel writes to GPU Data ID=2.

For gpu data manager, there will be 2 downloads from the same GPU
buffer. Previously I think this is a waste of resource so I cached the
data. But now it shoes that we need to perform 2 downloads because the
GPU data is already different. The download data cache should be
removed.


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2023-08-27 19:50:17 -07:00

3.9 KiB

Operators Support Table

The following table shows ONNX operators and the supported opset domain/versions in WebGPU EP by ONNX Runtime Web. For example, 4-6, 8+ means ONNX Runtime Web currently support opset version 4 to 6, 8 and above.

This file is automatically generated from the def files via this script. Do not modify directly.

Operator Opset Comments
Abs ai.onnx(6-12,13+)
Acos ai.onnx(7+)
Acosh ai.onnx(9+)
Add ai.onnx(7-12,13,14+)
ArgMax ai.onnx(1-10,11-12,13+)
ArgMin ai.onnx(1-10,11-12,13+)
Asin ai.onnx(7+)
Asinh ai.onnx(9+)
Atan ai.onnx(7+)
Atanh ai.onnx(9+)
AveragePool ai.onnx(7-9,10,11+); com.ms.internal.nhwc(11+) need perf optimization; need implementing activation
Cast ai.onnx(6-8,9-12,13-18,19+)
Ceil ai.onnx(6-12,13+)
Clip ai.onnx(6-10,11,12,13+)
Concat ai.onnx(1-3,4-10,11-12,13+)
Conv ai.onnx(1-10,11+); com.ms.internal.nhwc(11+) need perf optimization; conv3d is not supported; need implementing activation
ConvTranspose ai.onnx(1-10,11+); com.ms.internal.nhwc(11+) need perf optimization; ConvTranspose3d is not supported; need implementing activation
Cos ai.onnx(7+)
Cosh ai.onnx(9+)
Div ai.onnx(7-12,13,14+)
Elu ai.onnx(6+)
Equal ai.onnx(7-10,11-12,13-18,19+)
Erf ai.onnx(9-12,13+)
Exp ai.onnx(6-12,13+)
Expand ai.onnx(8-12,13+)
Flatten ai.onnx(1-8,9-10,11-12,13+)
Floor ai.onnx(6-12,13+)
Gather ai.onnx(1-10,11-12,13+)
Gelu com.microsoft(1+)
Gemm ai.onnx(7-8,9-10,11-12,13+)
GlobalAveragePool ai.onnx(1+); com.ms.internal.nhwc(1+)
GlobalMaxPool ai.onnx(1+); com.ms.internal.nhwc(1+)
Greater ai.onnx(7-8,9-12,13+)
InstanceNormalization ai.onnx(6+); com.ms.internal.nhwc(6+)
LayerNormalization ai.onnx(17+)
LeakyRelu ai.onnx(6-15,16+)
Less ai.onnx(7-8,9-12,13+)
Log ai.onnx(6-12,13+)
MatMul ai.onnx(1-12,13+)
MaxPool ai.onnx(1-7,8-9,10,11,12+); com.ms.internal.nhwc(11,12+) need perf optimization; need implementing activation
MemcpyFromHost ai.onnx(1+)
MemcpyToHost ai.onnx(1+)
Mul ai.onnx(7-12,13,14+)
Neg ai.onnx(6-12,13+)
Not ai.onnx(1+)
Pow ai.onnx(7-11,12,13-14,15+)
Reciprocal ai.onnx(6-12,13+)
ReduceL1 ai.onnx(1-10,11-12,13-17,18+)
ReduceL2 ai.onnx(1-10,11-12,13-17,18+)
ReduceLogSum ai.onnx(1-10,11-12,13-17,18+)
ReduceLogSumExp ai.onnx(1-10,11-12,13-17,18+)
ReduceMax ai.onnx(1-10,11,12,13-17,18+)
ReduceMean ai.onnx(1-10,11-12,13-17,18+)
ReduceMin ai.onnx(1-10,11,12,13-17,18+)
ReduceProd ai.onnx(1-10,11-12,13-17,18+)
ReduceSum ai.onnx(1-10,11-12,13+)
ReduceSumSquare ai.onnx(1-10,11-12,13-17,18+)
Relu ai.onnx(6-12,13,14+)
Reshape ai.onnx(5-12,13,14+) no GPU kernel
Resize ai.onnx(10,11-12,13-17,18,19+); com.ms.internal.nhwc(11-12,13-17,18,19+) CoordinateTransformMode align_corners is not supported with downsampling
Shape ai.onnx(1-12,13-14,15+) no GPU kernel; an ORT warning is generated - need to fix
Sigmoid ai.onnx(6-12,13+)
Sin ai.onnx(7+)
Sinh ai.onnx(9+)
SkipLayerNormalization com.microsoft(1+)
Slice ai.onnx(1-9,10,11-12,13+)
Softmax ai.onnx(1-10,11-12,13+)
Split ai.onnx(1,2-10,11-12,13-17,18+)
Sqrt ai.onnx(6-12,13+)
Squeeze ai.onnx(1-10,11-12,13+)
Sub ai.onnx(7-12,13,14+)
Tan ai.onnx(7+)
Tanh ai.onnx(6-12,13+)
ThresholdedRelu ai.onnx(10+)
Tile ai.onnx(6-12,13+)
Transpose ai.onnx(1-12,13+) need perf optimization
Unsqueeze ai.onnx(1-10,11-12,13+)