onnxruntime/js/web/docs/webgpu-operators.md at fac3e33da510c27c7a2631cf44a79923ee14e09f


[js/web] JSEP Attention & MultiHeadAttention (#17742)

### Description
This is a narrow implementation of Attention/MultiHeadAttention as it
does not support:
a. inputs 5-7 for MHA
b. packed QKV/KV
c. past/present
d. attention mask

But it works well for StableDiffusion and can be extended later. It
reduces VRAM usage because it combines many ops into a few.
I've updated the demo at https://islamov.ai/stable-diffusion-webgpu/ and it
takes ~13 s for 1 image with 20 steps on an RTX3090Ti and about 25 s on an M1
Pro.
VRAM usage is about 8 GB if you don't use img2img.

Going to focus on SDXL now
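
For reference, the unmasked computation that such a narrow Attention path covers can be sketched on the CPU as plain scaled dot-product attention. This is a minimal illustration, not the WebGPU kernel from this change: the function name `attention`, the `Matrix` type, the single-head layout and the `1/sqrt(headSize)` scale are assumptions made for the example, and there is no mask and no past/present state.

```typescript
// Hypothetical CPU reference for one attention head; not the JSEP/WGSL implementation.
// q: [queryLength][headSize], k: [keyLength][headSize], v: [keyLength][valueSize]
type Matrix = number[][];

function attention(q: Matrix, k: Matrix, v: Matrix): Matrix {
  const headSize = q[0].length;
  // Assumed default scale: 1 / sqrt(headSize).
  const scale = 1 / Math.sqrt(headSize);
  return q.map((qRow) => {
    // Scaled dot product of this query row with every key row.
    const scores = k.map((kRow) => scale * qRow.reduce((sum, x, i) => sum + x * kRow[i], 0));
    // Numerically stable softmax over the key axis.
    const maxScore = Math.max(...scores);
    const exps = scores.map((s) => Math.exp(s - maxScore));
    const total = exps.reduce((a, b) => a + b, 0);
    const probs = exps.map((e) => e / total);
    // Weighted sum of the value rows.
    return v[0].map((_, col) => probs.reduce((sum, p, t) => sum + p * v[t][col], 0));
  });
}

// Two identical keys get equal weight, so the output is the mean of the value rows.
const out = attention([[1, 0]], [[1, 0], [1, 0]], [[1, 2], [3, 4]]);
console.log(out);
```

Written as separate ops, this materializes the score and probability matrices as intermediate tensors; a fused operator can avoid keeping them around as separate graph outputs, which is the kind of saving the description above refers to.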

---------

Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>

2023-11-17 12:23:52 -08:00


## Operators Support Table

The following table shows ONNX operators and the supported opset domains/versions in the WebGPU EP of ONNX Runtime Web. For example, `4-6, 8+` means ONNX Runtime Web currently supports opset versions 4 to 6, and 8 and above.
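
The version notation can be read mechanically. The sketch below is only an illustration of that reading: `parseOpsetSpec` and `isOpsetSupported` are hypothetical helper names, not part of ONNX Runtime Web, and the code assumes a spec is a comma-separated list of single versions (`10`), closed ranges (`4-6`) and open-ended ranges (`8+`).

```typescript
// Hypothetical helpers for the "4-6, 8+" notation; not an ONNX Runtime Web API.
type OpsetRange = { min: number; max: number }; // max is Infinity for "N+"

function parseOpsetSpec(spec: string): OpsetRange[] {
  return spec
    .split(',')
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => {
      if (part.endsWith('+')) {
        // "8+" means version 8 and above.
        return { min: Number(part.slice(0, -1)), max: Infinity };
      }
      // "4-6" is a closed range; a bare "10" is a single version.
      const [lo, hi] = part.split('-');
      const min = Number(lo);
      return { min, max: hi === undefined ? min : Number(hi) };
    });
}

function isOpsetSupported(spec: string, version: number): boolean {
  return parseOpsetSpec(spec).some((r) => version >= r.min && version <= r.max);
}

console.log(isOpsetSupported('4-6, 8+', 5)); // true
console.log(isOpsetSupported('4-6, 8+', 7)); // false
console.log(isOpsetSupported('4-6, 8+', 12)); // true
```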

This file is automatically generated from the def files via this script. Do not modify directly.

| Operator | Opset | Comments |
|---|---|---|
| Abs | ai.onnx(6-12,13+) | |
| Acos | ai.onnx(7+) | |
| Acosh | ai.onnx(9+) | |
| Add | ai.onnx(7-12,13,14+) | |
| ArgMax | ai.onnx(1-10,11-12,13+) | |
| ArgMin | ai.onnx(1-10,11-12,13+) | |
| Asin | ai.onnx(7+) | |
| Asinh | ai.onnx(9+) | |
| Atan | ai.onnx(7+) | |
| Atanh | ai.onnx(9+) | |
| Attention | com.microsoft(1+) | need implementing mask and past/present |
| AveragePool | ai.onnx(7-9,10,11+); com.ms.internal.nhwc(7-9,10,11+) | need perf optimization; need implementing activation |
| BiasAdd | com.microsoft(1+) | |
| BiasSplitGelu | com.microsoft(1+) | |
| Cast | ai.onnx(6-8,9-12,13-18,19+) | |
| Ceil | ai.onnx(6-12,13+) | |
| Clip | ai.onnx(6-10,11,12,13+) | |
| Concat | ai.onnx(1-3,4-10,11-12,13+) | |
| Conv | ai.onnx(1-10,11+); com.ms.internal.nhwc(1-10,11+) | need perf optimization; conv3d is not supported; need implementing activation |
| ConvTranspose | ai.onnx(1-10,11+); com.ms.internal.nhwc(1-10,11+) | need perf optimization; ConvTranspose3d is not supported; need implementing activation |
| Cos | ai.onnx(7+) | |
| Cosh | ai.onnx(9+) | |
| Div | ai.onnx(7-12,13,14+) | |
| Einsum | ai.onnx(12+) | |
| Elu | ai.onnx(6+) | |
| Equal | ai.onnx(7-10,11-12,13-18,19+) | |
| Erf | ai.onnx(9-12,13+) | |
| Exp | ai.onnx(6-12,13+) | |
| Expand | ai.onnx(8-12,13+) | |
| Flatten | ai.onnx(1-8,9-10,11-12,13+) | |
| Floor | ai.onnx(6-12,13+) | |
| FusedConv | com.microsoft(1+) | |
| Gather | ai.onnx(1-10,11-12,13+) | |
| GatherElements | ai.onnx(11-12,13+) | |
| Gelu | com.microsoft(1+) | |
| Gemm | ai.onnx(7-8,9-10,11-12,13+) | |
| GlobalAveragePool | ai.onnx(1+); com.ms.internal.nhwc(1+) | |
| GlobalMaxPool | ai.onnx(1+); com.ms.internal.nhwc(1+) | |
| Greater | ai.onnx(7-8,9-12,13+) | |
| GreaterOrEqual | ai.onnx(12-15,16+) | |
| If | ai.onnx(1-10,11-12,13-18,19+) | |
| InstanceNormalization | ai.onnx(6+); com.ms.internal.nhwc(6+) | |
| LayerNormalization | ai.onnx(17+) | |
| LeakyRelu | ai.onnx(6-15,16+) | |
| Less | ai.onnx(7-8,9-12,13+) | |
| LessOrEqual | ai.onnx(12-15,16+) | |
| Log | ai.onnx(6-12,13+) | |
| MatMul | ai.onnx(1-12,13+) | |
| MaxPool | ai.onnx(1-7,8-9,10,11,12+); com.ms.internal.nhwc(1-7,8-9,10,11,12+) | need perf optimization; need implementing activation |
| MemcpyFromHost | ai.onnx(1+) | |
| MemcpyToHost | ai.onnx(1+) | |
| Mul | ai.onnx(7-12,13,14+) | |
| MultiHeadAttention | com.microsoft(1+) | need implementing mask and past/present |
| Neg | ai.onnx(6-12,13+) | |
| Not | ai.onnx(1+) | |
| Pad | ai.onnx(2-10,11-12,13-17,18,19+) | |
| Pow | ai.onnx(7-11,12,13-14,15+) | |
| Range | ai.onnx(11+) | |
| Reciprocal | ai.onnx(6-12,13+) | |
| ReduceL1 | ai.onnx(1-10,11-12,13-17,18+) | |
| ReduceL2 | ai.onnx(1-10,11-12,13-17,18+) | |
| ReduceLogSum | ai.onnx(1-10,11-12,13-17,18+) | |
| ReduceLogSumExp | ai.onnx(1-10,11-12,13-17,18+) | |
| ReduceMax | ai.onnx(1-10,11,12,13-17,18+) | |
| ReduceMean | ai.onnx(1-10,11-12,13-17,18+) | |
| ReduceMin | ai.onnx(1-10,11,12,13-17,18+) | |
| ReduceProd | ai.onnx(1-10,11-12,13-17,18+) | |
| ReduceSum | ai.onnx(1-10,11-12,13+) | |
| ReduceSumSquare | ai.onnx(1-10,11-12,13-17,18+) | |
| Relu | ai.onnx(6-12,13,14+) | |
| Reshape | ai.onnx(5-12,13,14+) | no GPU kernel |
| Resize | ai.onnx(10,11-12,13-17,18,19+); com.ms.internal.nhwc(10,11-12,13-17,18,19+) | CoordinateTransformMode align_corners is not supported with downsampling |
| Shape | ai.onnx(1-12,13-14,15+) | no GPU kernel; an ORT warning is generated - need to fix |
| Sigmoid | ai.onnx(6-12,13+) | |
| Sin | ai.onnx(7+) | |
| Sinh | ai.onnx(9+) | |
| SkipLayerNormalization | com.microsoft(1+) | |
| Slice | ai.onnx(1-9,10,11-12,13+) | |
| Softmax | ai.onnx(1-10,11-12,13+) | |
| Split | ai.onnx(1,2-10,11-12,13-17,18+) | |
| Sqrt | ai.onnx(6-12,13+) | |
| Squeeze | ai.onnx(1-10,11-12,13+) | |
| Sub | ai.onnx(7-12,13,14+) | |
| Tan | ai.onnx(7+) | |
| Tanh | ai.onnx(6-12,13+) | |
| ThresholdedRelu | ai.onnx(10+) | |
| Tile | ai.onnx(6-12,13+) | |
| Transpose | ai.onnx(1-12,13+) | need perf optimization |
| Unsqueeze | ai.onnx(1-10,11-12,13+) | |
| Where | ai.onnx(9-15,16+) | |
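
Because each Opset cell may list several domains separated by `;`, a tool that consumes this table needs to split rows into per-domain entries first. The following is a rough sketch of one way to do that. `OperatorRow` and `parseRow` are hypothetical names introduced for the example, and the code assumes each row arrives as one string whose cells are separated by tabs or `|`.

```typescript
// Hypothetical record shape for one table row; not an ONNX Runtime Web type.
interface OperatorRow {
  operator: string;
  // Domain name -> version spec, e.g. "ai.onnx" -> "7-9,10,11+".
  domains: Map<string, string>;
  comments: string;
}

function parseRow(line: string): OperatorRow {
  // Drop optional outer pipes, then split cells on tabs or pipes.
  const cells = line
    .trim()
    .replace(/^\||\|$/g, '')
    .split(/\t|\|/)
    .map((c) => c.trim());
  const [operator = '', opset = '', comments = ''] = cells;
  const domains = new Map<string, string>();
  for (const entry of opset.split(';')) {
    // Each entry looks like "domain(versions)".
    const match = /^\s*([\w.]+)\((.*)\)\s*$/.exec(entry);
    if (match) {
      const [, domain = '', versions = ''] = match;
      domains.set(domain, versions);
    }
  }
  return { operator, domains, comments };
}

// A few rows copied from the table above.
const rows = [
  'Abs\tai.onnx(6-12,13+)',
  'Conv\tai.onnx(1-10,11+); com.ms.internal.nhwc(1-10,11+)\tneed perf optimization; conv3d is not supported; need implementing activation',
  '| Attention | com.microsoft(1+) | need implementing mask and past/present |',
].map(parseRow);

// Operators that also have an entry in the com.ms.internal.nhwc domain.
const nhwcOperators = rows
  .filter((r) => r.domains.has('com.ms.internal.nhwc'))
  .map((r) => r.operator);
console.log(nhwcOperators);
```

The resulting version specs are kept as plain strings so that they can be interpreted separately from the row parsing.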
