onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-17 18:40:28 +00:00

Jiajia Qin 8fbbf2fd4f [js/webgpu] Optimize MatMul with M = 1 (#22577)	2024-11-01 08:04:42 -07:00

### Description

BUG #22031

In the demucs model, there are many MatMul ops with shapes like the following:

`input[0]: [3448,1,512] | float32, input[1]: [512,1536] | float32, output[0]: [3448,1,1536] | float32`

For this kind of shape, the batch size is large but M = 1. The current algorithm partitions tiles based on [M, N], which is inefficient for such shapes. This PR reshapes the inputs to improve MatMul performance.

Before: [3448,1,512] x [512,1536] = [3448,1,1536]
After: [1, 3448, 512] x [512, 1536] = [1, 3448, 1536], then the output can be reshaped to [3448, 1, 1536]

The overall MatMul time in the demucs model drops from 4418.17 ms to 1778.45 ms on my iGPUs.

---------

Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
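The shape rewrite described in the commit message above can be illustrated with a minimal sketch. This is not the code in matmul.ts: `foldBatchIntoM` and `batchedMatMul` are hypothetical names introduced here, the sketch assumes row-major tensors and a 2-D second input, and the naive CPU loop only stands in for the tiled WebGPU kernel to show that folding the batch dimension into M leaves the results unchanged.

```typescript
// Hypothetical illustration only; names and structure are assumptions,
// not the implementation in matmul.ts.
type Shape3 = readonly [number, number, number];

interface FoldedShapes {
  a: Shape3; // shape to use for input[0] when dispatching
  output: Shape3; // shape the kernel writes
  finalOutput: Shape3; // shape restored afterwards (no data movement)
}

// If A is [batch, 1, K] and B is [K, N], treat A as [1, batch, K] so that
// tiling over [M, N] sees M = batch instead of M = 1.
function foldBatchIntoM(a: readonly number[], b: readonly number[]): FoldedShapes | undefined {
  if (a.length !== 3 || b.length !== 2 || a[1] !== 1 || a[2] !== b[0]) {
    return undefined; // shape does not match the M = 1 case
  }
  const batch = a[0];
  const k = a[2];
  const n = b[1];
  return { a: [1, batch, k], output: [1, batch, n], finalOutput: [batch, 1, n] };
}

// Naive row-major batched MatMul with a shared 2-D B, used only as a reference.
function batchedMatMul(a: number[], aShape: Shape3, b: number[], n: number): number[] {
  const [batch, m, k] = aShape;
  const out = new Array<number>(batch * m * n).fill(0);
  for (let bi = 0; bi < batch; bi++) {
    for (let i = 0; i < m; i++) {
      for (let j = 0; j < n; j++) {
        let sum = 0;
        for (let p = 0; p < k; p++) {
          sum += a[(bi * m + i) * k + p] * b[p * n + j];
        }
        out[(bi * m + i) * n + j] = sum;
      }
    }
  }
  return out;
}

// Same flat data, two views of A: [3, 1, 2] and the folded [1, 3, 2].
const aData = [1, 2, 3, 4, 5, 6];
const bData = [1, 2, 3, 4]; // shape [2, 2]
const before = batchedMatMul(aData, [3, 1, 2], bData, 2);
const after = batchedMatMul(aData, [1, 3, 2], bData, 2);
console.log(before.join(",") === after.join(","));
```

Because the tensors are contiguous in row-major order, [batch, 1, K] and [1, batch, K] describe the same bytes, so the reshape in this sketch is only a change of metadata before dispatch and again on the output.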
3rd-party	[js/webgpu] Optimize conv1d by conv2d (#19388)	2024-08-22 22:56:07 -07:00
argminmax.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
attention.ts	[JS/WebGPU] GroupQueryAttention rewrite (#20946)	2024-10-23 10:14:09 -07:00
batch-norm.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
bias-add.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
bias-split-gelu.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
binary-op.ts	[JS/WebGPU] Support WASM64 (#21836)	2024-10-24 20:21:51 -07:00
common.ts	[JS/WebGPU] Support WASM64 (#21836)	2024-10-24 20:21:51 -07:00
concat.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
conv-grouped.ts	[js/webgpu] Optimize grouped conv (#21892)	2024-09-04 17:16:35 -07:00
conv-transpose.ts	[js/webgpu] Fix issue to run model demucs (#22074)	2024-09-16 23:17:10 -07:00
conv.ts	[js/webgpu] Fix issue to run model demucs (#22074)	2024-09-16 23:17:10 -07:00
cumsum.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
depth-to-space.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
einsum.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
expand.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
fast-gelu.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
fuse-utils.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
gather-block-quantized.ts	[JS/WebGPU] Add GatherBlockQuantized op support (#21734)	2024-08-26 14:46:04 -07:00
gather-elements.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
gather.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
gemm.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
group-query-attention.ts	[JS/WebGPU] GroupQueryAttention rewrite (#20946)	2024-10-23 10:14:09 -07:00
instance-norm.ts	[js/webgpu] Optimize InstanceNorm in some shapes (#22637)	2024-10-29 17:10:14 -07:00
layer-norm.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
matmul.ts	[js/webgpu] Optimize MatMul with M = 1 (#22577)	2024-11-01 08:04:42 -07:00
matmulnbits.ts	[js/webgpu] Optimize matmulnbits (#22360)	2024-10-14 15:49:29 -07:00
multihead-attention.ts	[JS/WebGPU] GroupQueryAttention rewrite (#20946)	2024-10-23 10:14:09 -07:00
pad.ts	[js/webgpu] Enable pad f16 uniform (#21691)	2024-08-26 07:58:48 -07:00
pool.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
quantize-linear.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
range.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
reduce-shared.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
reduce.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
resize.ts	[JS/WebGPU] Fixed bugs in inputs validation of Resize (#21955)	2024-10-04 18:29:53 -07:00
rotary-embedding.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
skip-layer-norm.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
slice.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
softmax.ts	[js/webgpu] Remove the limitation on axis in softmax (#22231)	2024-09-30 18:27:11 -07:00
split.ts	[JS/WebGPU] GroupQueryAttention rewrite (#20946)	2024-10-23 10:14:09 -07:00
tile.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00
transpose.ts	[js/webgpu] Replace array with string in transpose perm (#21930)	2024-09-16 23:17:46 -07:00
unary-op.ts	[js/webgpu] support float16 for Clip (#21584)	2024-08-28 13:19:20 -07:00
where.ts	[js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728)	2024-08-14 16:51:22 -07:00