onnxruntime/js/web/test
Jiajia Qin 8fbbf2fd4f
[js/webgpu] Optimize MatMul with M = 1 (#22577)
### Description
BUG #22031

In the demucs model there are many MatMul ops with shapes like the following:
`input[0]: [3448,1,512] | float32, input[1]: [512,1536] | float32, output[0]: [3448,1,1536] | float32`

For this kind of shape the batch size is large but M = 1. Our current algorithm partitions tiles over [M, N], which is inefficient for such shapes. This PR reshapes the inputs to improve MatMul performance.
Before: [3448,1,512] x [512,1536] = [3448,1,1536]
After: [1,3448,512] x [512,1536] = [1,3448,1536], and the output is then reshaped to [3448,1,1536].
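The reshape is valid because when M = 1 every batch of A contributes exactly one row, and all batches multiply the same B, so folding the batch dimension into M changes only the dimension labels, not the data layout. A minimal sketch of the idea (the function name and plain-array layout are illustrative, not the actual onnxruntime kernel):

```typescript
// Sketch of the M = 1 reshape trick. A has logical shape [batch, 1, K]
// (row-major) and B has shape [K, N]. Viewing A as [1, batch, K] (i.e.
// a single [batch, K] matrix) requires no data movement, because the
// underlying buffer is identical; only the dim interpretation changes.
function matmulM1(
  a: Float32Array, // logical shape [batch, 1, K]
  b: Float32Array, // logical shape [K, N]
  batch: number,
  k: number,
  n: number,
): Float32Array {
  // One [batch, K] x [K, N] matmul instead of `batch` [1, K] x [K, N]
  // matmuls, so tiles are partitioned over a large M rather than M = 1.
  const out = new Float32Array(batch * n); // reshaped back to [batch, 1, N]
  for (let m = 0; m < batch; ++m) {
    for (let j = 0; j < n; ++j) {
      let sum = 0;
      for (let i = 0; i < k; ++i) {
        sum += a[m * k + i] * b[i * n + j];
      }
      out[m * n + j] = sum;
    }
  }
  return out;
}
```

On the GPU the win comes from the tiling: with M = 1 each tile covers a single output row per batch, while after the fold a tile can cover many rows of the merged [batch, N] output.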

With this change, the overall MatMul time in the demucs model drops from 4418.17 ms to 1778.45 ms on my iGPU.

---------

Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
2024-11-01 08:04:42 -07:00
| Name | Last commit | Date |
| --- | --- | --- |
| data/ops | [js/webgpu] Optimize MatMul with M = 1 (#22577) | 2024-11-01 08:04:42 -07:00 |
| e2e | [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) | 2024-08-14 16:51:22 -07:00 |
| unittests | [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) | 2024-08-14 16:51:22 -07:00 |
| op-test-schema.json | [js/web] Add support for int4/uint4 tensor (#21720) | 2024-08-15 21:32:10 -07:00 |
| suite-test-list.jsonc | [WebNN] Support And, Or and Xor ops (#22598) | 2024-10-30 17:52:10 -07:00 |
| test-main.ts | [js/webgpu] Manage model download with a specific unittest option (#22214) | 2024-09-30 18:27:43 -07:00 |
| test-runner.ts | [WebNN EP] Use boolean flags instead of MLTensorUsage (#22497) | 2024-10-22 17:20:36 -07:00 |
| test-shared.ts | [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) | 2024-08-14 16:51:22 -07:00 |
| test-types.ts | [js/webgpu] Manage model download with a specific unittest option (#22214) | 2024-09-30 18:27:43 -07:00 |