onnxruntime/js/web/lib/wasm/jsep/webgpu/ops
Jiajia Qin 8159723ba7
[js/webgpu] Optimize matmulnbits (#22360)
### Description
This PR further optimizes matmulnbits, specifically for iGPUs. On iGPUs, the
phi3 demo improves from ~8 tokens/second to ~12 tokens/second.

Some TODOs:
1. Make the optimization more general and remove the blockSize = 32
limitation.
2. Tune parameters such as workgroupSize and the components size (currently
only components = 1 is supported) to see how performance changes.
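
For context on the blockSize = 32 limitation: in block-wise quantization, each block of weights shares one scale. The sketch below shows what dequantizing a single 32-element block of 4-bit weights could look like on the CPU. It is illustrative only and is not taken from matmulnbits.ts. The helper name `dequantizeBlock`, the packing order (two values per byte, low nibble first), and the default zero point of 8 are assumptions made for this example.

```typescript
// Hypothetical helper, not part of matmulnbits.ts.
// Assumptions: 4-bit weights, two values per byte (low nibble first),
// one scale per block, zero point 8 when none is supplied.
const BLOCK_SIZE = 32; // the block size the optimized path is limited to

function dequantizeBlock(packed: Uint8Array, scale: number, zeroPoint = 8): Float32Array {
  if (packed.length !== BLOCK_SIZE / 2) {
    throw new Error(`expected ${BLOCK_SIZE / 2} bytes, got ${packed.length}`);
  }
  const out = new Float32Array(BLOCK_SIZE);
  packed.forEach((byte, i) => {
    // Low nibble holds the even-indexed weight, high nibble the odd-indexed one.
    out[2 * i] = ((byte & 0x0f) - zeroPoint) * scale;
    out[2 * i + 1] = ((byte >> 4) - zeroPoint) * scale;
  });
  return out;
}
```

With a fixed block size of 32, one block always occupies 16 bytes, which is the kind of regularity a shader can exploit. Lifting the limitation would mean handling other byte counts per block.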
2024-10-14 15:49:29 -07:00
..
3rd-party [js/webgpu] Optimize conv1d by conv2d (#19388) 2024-08-22 22:56:07 -07:00
argminmax.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
attention.ts [JS/WebGPU] Avoid producing presentKey/presentValue outputs if pastKey/pastValue … (#21782) 2024-08-19 18:02:19 -07:00
batch-norm.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
bias-add.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
bias-split-gelu.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
binary-op.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
common.ts [js/webgpu] Optimize matmulnbits (#22360) 2024-10-14 15:49:29 -07:00
concat.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
conv-grouped.ts [js/webgpu] Optimize grouped conv (#21892) 2024-09-04 17:16:35 -07:00
conv-transpose.ts [js/webgpu] Fix issue to run model demucs (#22074) 2024-09-16 23:17:10 -07:00
conv.ts [js/webgpu] Fix issue to run model demucs (#22074) 2024-09-16 23:17:10 -07:00
cumsum.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
depth-to-space.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
einsum.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
expand.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
fast-gelu.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
fuse-utils.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
gather-block-quantized.ts [JS/WebGPU] Add GatherBlockQuantized op support (#21734) 2024-08-26 14:46:04 -07:00
gather-elements.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
gather.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
gemm.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
group-query-attention.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
instance-norm.ts [js/webgpu] Optimize InstanceNormalization (#21995) 2024-09-23 11:32:09 -07:00
layer-norm.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
matmul.ts [js/webgpu] Optimize conv1d by conv2d (#19388) 2024-08-22 22:56:07 -07:00
matmulnbits.ts [js/webgpu] Optimize matmulnbits (#22360) 2024-10-14 15:49:29 -07:00
multihead-attention.ts [js/webgpu] Optimize MultiHeadAttention|Transpose (#22420) 2024-10-14 15:43:14 -07:00
pad.ts [js/webgpu] Enable pad f16 uniform (#21691) 2024-08-26 07:58:48 -07:00
pool.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
quantize-linear.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
range.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
reduce-shared.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
reduce.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
resize.ts [JS/WebGPU] Fixed bugs in inputs validation of Resize (#21955) 2024-10-04 18:29:53 -07:00
rotary-embedding.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
skip-layer-norm.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
slice.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
softmax.ts [js/webgpu] Remove the limitation on axis in softmax (#22231) 2024-09-30 18:27:11 -07:00
split.ts [js/webgpu] Handle negative axis in op Split (#21771) 2024-08-17 16:41:23 -07:00
tile.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
transpose.ts [js/webgpu] Replace array with string in transpose perm (#21930) 2024-09-16 23:17:46 -07:00
unary-op.ts [js/webgpu] support float16 for Clip (#21584) 2024-08-28 13:19:20 -07:00
where.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00