onnxruntime/js/web/lib/wasm/jsep/webgpu/ops
Xu Xing 8c59cd4fce
[js/webgpu] Support GroupQueryAttention (#20237)
TODOs:
1. Handle H * params.kvNumHeads greater than work group size limit.
2. Support BNSH kv cache.
2024-05-13 09:43:37 -07:00
3rd-party [js/webgpu] perform uniform consistency check (#20019) 2024-03-26 17:14:43 -07:00
argminmax.ts
attention.ts [js/webgpu] Support GroupQueryAttention (#20237) 2024-05-13 09:43:37 -07:00
batch-norm.ts
bias-add.ts
bias-split-gelu.ts
binary-op.ts
common.ts [js/webgpu] fixes for fp16 attention (#20440) 2024-04-24 08:01:28 -07:00
concat.ts [JS/WebGPU] Multihead attention improvements (#20286) 2024-04-23 12:39:49 -07:00
conv-grouped.ts [js/webgpu] perform uniform consistency check (#20019) 2024-03-26 17:14:43 -07:00
conv-transpose.ts fix ConvTranspose 1D (#20194) 2024-04-05 10:05:32 -07:00
conv.ts [js/webgpu] Enable GroupedConvVectorize path (#19791) 2024-03-12 22:25:07 -07:00
cumsum.ts fix csum and enable ut (#20355) 2024-04-17 15:01:06 -07:00
depth-to-space.ts [js/webgpu] implement DepthToSpace operator in webgpu (#19948) 2024-04-10 12:13:46 -07:00
einsum.ts
expand.ts
fast-gelu.ts
fuse-utils.ts
gather-elements.ts
gather.ts [js/webgpu] minor fixes to make tinyllama work (#19564) 2024-02-23 15:45:30 -08:00
gemm.ts
group-query-attention.ts [js/webgpu] Support GroupQueryAttention (#20237) 2024-05-13 09:43:37 -07:00
instance-norm.ts [js/webgpu] Use global id in attention and instance-norm (#20008) 2024-04-02 01:42:39 -07:00
layer-norm.ts [js/web] support SimplifiedLayerNorm and SkipSimplifiedLayerNorm (#20277) 2024-04-11 14:08:50 -07:00
matmul.ts
matmulnbits.ts [JS/WebGPU] MatMulNBits remove unnecessary condition (#20396) 2024-04-29 14:27:21 -07:00
multihead-attentiion.ts [js/webgpu] Support GroupQueryAttention (#20237) 2024-05-13 09:43:37 -07:00
pad.ts [js/webgpu] perform uniform consistency check (#20019) 2024-03-26 17:14:43 -07:00
pool.ts [js/webgpu] fix maxpool / fp16 (#19981) 2024-03-19 16:15:49 -07:00
range.ts
reduce-shared.ts accumulate in fp32 for Reduce* (#19868) 2024-03-18 08:28:43 -07:00
reduce.ts
resize.ts
rotary-embedding.ts [js/webgpu] Implement com.microsoft.RotaryEmbedding (#20209) 2024-04-08 09:11:26 -07:00
skip-layer-norm.ts optimize skiplayernorm (#20551) 2024-05-08 08:40:03 -07:00
slice.ts
softmax.ts [js/webgpu] perform uniform consistency check (#20019) 2024-03-26 17:14:43 -07:00
split.ts [js/webgpu] Create Split indices helpers by rank, not by shape (#19554) 2024-02-20 09:24:34 -08:00
tile.ts [js/webgpu] Support GroupQueryAttention (#20237) 2024-05-13 09:43:37 -07:00
transpose.ts
unary-op.ts
where.ts [JS/WebGPU] Fix Split and Where to handle corner cases. (#19613) 2024-02-23 00:21:15 -08:00