onnxruntime/js/web/lib/wasm/jsep/webgpu/ops
Arthur Islamov fac3e33da5
[js/web] JSEP Attention & MultiHeadAttention (#17742)
### Description
This is a narrow implementation of Attention/MultiHeadAttention as it
does not support:
a. inputs 5-7 for MHA
b. packed QKV/KV
c. past/present
d. attention mask

But it works well for StableDiffusion and can be extended later. It
reduces VRAM usage because it combines many ops into a few.
I've updated the demo at https://islamov.ai/stable-diffusion-webgpu/; it
takes ~13 s for 1 image with 20 steps on an RTX 3090 Ti and about 25 s on
an M1 Pro.
VRAM usage is about 8 GB if you don't use img2img.
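
For readers unfamiliar with what the fused op replaces, here is a minimal CPU reference sketch of single-head scaled dot-product attention, the MatMul, scale, Softmax, MatMul chain that Attention/MultiHeadAttention collapses into a few kernels. It is not the WebGPU code from `attention.ts`; the names `attention`, `softmaxRow` and `Matrix` are hypothetical and chosen only for illustration, and it assumes the same narrow case described above (no mask, no past/present, unpacked Q, K and V).

```typescript
// Illustrative CPU reference only; not the JSEP/WebGPU implementation.
// q: [seqLen][headSize], k: [kvLen][headSize], v: [kvLen][vSize]
type Matrix = number[][];

// Numerically stable softmax over one row of scores.
function softmaxRow(row: number[]): number[] {
  const max = Math.max(...row);
  const exps = row.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// softmax(q * k^T / sqrt(headSize)) * v for a single head, without a mask.
function attention(q: Matrix, k: Matrix, v: Matrix): Matrix {
  const headSize = q[0].length;
  const scale = 1 / Math.sqrt(headSize);
  return q.map((qRow) => {
    // Scaled dot product of this query row with every key row.
    const scores = k.map(
      (kRow) => scale * qRow.reduce((acc, qi, i) => acc + qi * kRow[i], 0),
    );
    const probs = softmaxRow(scores);
    // Weighted sum of the value rows.
    const out = new Array<number>(v[0].length).fill(0);
    probs.forEach((p, j) => {
      v[j].forEach((vj, c) => {
        out[c] += p * vj;
      });
    });
    return out;
  });
}
```

Run as separate ops, each step materializes an intermediate tensor (scores, probabilities); fusing the steps avoids keeping all of them around, which is consistent with the VRAM reduction reported above.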

Going to focus on SDXL now

---------

Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
2023-11-17 12:23:52 -08:00
3rd-party [js/webgpu] Fix conv2d with activation (#18388) 2023-11-10 12:54:35 -08:00
argminmax.ts [js/web] set noUnusedParameters to true and fix a few bugs (#18404) 2023-11-15 09:16:29 -08:00
attention.ts [js/web] JSEP Attention & MultiHeadAttention (#17742) 2023-11-17 12:23:52 -08:00
bias-add.ts [js/webgpu] revise uniform support (#17871) 2023-10-11 16:41:46 -07:00
bias-split-gelu.ts [js/webgpu] revise uniform support (#17871) 2023-10-11 16:41:46 -07:00
binary-op.ts [js/webgpu] Fix scalar uniform (#18318) 2023-11-10 10:12:22 -08:00
common.ts [JS/Web]Added uniforms support to Slice op. (#18422) 2023-11-16 09:44:13 -08:00
concat.ts [js/webgpu] Add uniforms support to concat op (#18238) 2023-11-10 13:46:03 -08:00
conv-grouped.ts [js/webgpu] Fix conv2d with activation (#18388) 2023-11-10 12:54:35 -08:00
conv-transpose.ts [js/webgpu] Fix the transpose error when dims > 4D (#18027) 2023-10-23 11:02:19 -07:00
conv.ts [js/webgpu] Fix the transpose error when dims > 4D (#18027) 2023-10-23 11:02:19 -07:00
einsum.ts [js/web] set noUnusedParameters to true and fix a few bugs (#18404) 2023-11-15 09:16:29 -08:00
expand.ts [js/webgpu] revise uniform support (#17871) 2023-10-11 16:41:46 -07:00
fuse-utils.ts [js/webgpu] Fix conv2d with activation (#18388) 2023-11-10 12:54:35 -08:00
gather-elements.ts [js/webgpu] revise uniform support (#17871) 2023-10-11 16:41:46 -07:00
gather.ts [js/webgpu] Support uniforms for gather (#18312) 2023-11-13 11:24:34 -08:00
gemm.ts [js/webgpu] revise uniform support (#17871) 2023-10-11 16:41:46 -07:00
instance-norm.ts [js/web] FP16 LayerNorm, InstanceNorm, SkipLayerNorm (#17630) 2023-10-18 10:47:41 -07:00
layer-norm.ts [js/web] FP16 LayerNorm, InstanceNorm, SkipLayerNorm (#17630) 2023-10-18 10:47:41 -07:00
matmul.ts [js/webgpu] support using uniform buffer (#17803) 2023-10-10 00:31:12 -07:00
multi-head-attentiion.ts [js/web] JSEP Attention & MultiHeadAttention (#17742) 2023-11-17 12:23:52 -08:00
pad.ts [js/web] set noUnusedParameters to true and fix a few bugs (#18404) 2023-11-15 09:16:29 -08:00
pool.ts [JS/Web] Enabled 1d spacial input to GlobalAveragePool (#17973) 2023-10-23 16:02:50 -07:00
range.ts [js/webgpu] revise uniform support (#17871) 2023-10-11 16:41:46 -07:00
reduce-shared.ts [js/web] set noUnusedParameters to true and fix a few bugs (#18404) 2023-11-15 09:16:29 -08:00
reduce.ts [js/web] optimize reduce related operators (#17957) 2023-11-02 12:51:48 -07:00
resize.ts [js/web] set noUnusedParameters to true and fix a few bugs (#18404) 2023-11-15 09:16:29 -08:00
skip-layer-norm.ts [js/web] FP16 LayerNorm, InstanceNorm, SkipLayerNorm (#17630) 2023-10-18 10:47:41 -07:00
slice.ts [JS/Web]Added uniforms support to Slice op. (#18422) 2023-11-16 09:44:13 -08:00
softmax.ts [js/webgpu] Support uniform for softmax (#18345) 2023-11-09 11:19:23 -08:00
split.ts [js/webgpu] revise uniform support (#17871) 2023-10-11 16:41:46 -07:00
tile.ts [js/webgpu] revise uniform support (#17871) 2023-10-11 16:41:46 -07:00
transpose.ts [js/webgpu] Fix the transpose error when dims > 4D (#18027) 2023-10-23 11:02:19 -07:00
unary-op.ts [JS/Web] Added Unifroms support to unary ops. (#18223) 2023-11-03 09:30:54 -07:00
where.ts [js/webgpu] revise uniform support (#17871) 2023-10-11 16:41:46 -07:00