onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-18 21:21:17 +00:00

Author	SHA1	Message	Date
jzm-intel	d9b91682f1	WebGPU JSEP: Make shader code not depend on input broadcasting patterns (#22536) This PR makes MatMul shaders depend only on the input ranks and the shapes provided in uniforms, not on the inputs' broadcasting pattern. This fixes an issue where the shader code differed between broadcasting patterns but shared an identical cache key, which resulted in wrong cache hits.	2024-11-08 11:00:51 -08:00
Jiajia Qin	8fbbf2fd4f	[js/webgpu] Optimize MatMul with M = 1 (#22577) BUG #22031 In the demucs model, there are many MatMul ops with shapes like the following: `input[0]: [3448,1,512] | float32, input[1]: [512,1536] | float32, output[0]: [3448,1,1536] | float32` For this kind of shape, the batch size is large but M = 1. The current algorithm partitions tiles based on [M, N], which is inefficient for such shapes. This PR reshapes the inputs to improve MatMul performance. Before: [3448,1,512] x [512,1536] = [3448,1,1536] After: [1,3448,512] x [512,1536] = [1,3448,1536], then the output is reshaped to [3448,1,1536]. The overall MatMul time in the demucs model drops from 4418.17 ms to 1778.45 ms on my iGPUs. --------- Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>	2024-11-01 08:04:42 -07:00
Jiajia Qin	fffefb1c22	[js/webgpu] Optimize matmul (#16969) Changes in this PR: 1) use the optimized version `makeMatMulPacked[Vec4]Source` to support matmul. 2) enable the conv2dByMatMul path. 3) support broadcast. 4) use IndicesHelper. MatMul with M = 512, K = 512, N = 512 drops from 15 ms to 2 ms with profilingMode enabled on my ADL.	2023-08-29 12:40:57 -07:00
Yulong Wang	1743e9a615	[js] enable formatter for more file types (#16888) Enable formatter for .js/.json/.jsonc/.md files.	2023-07-28 15:46:58 -07:00
Yulong Wang	3e8cabbc3e	[js/web] WebGL backend refactor (#8586)	2021-08-12 12:30:49 -07:00
Yulong Wang	4ebc9c3b5e	[JS] onnxruntime-web (#7394) * add web * add script and test * fix lint * add test/data/ops * add test/data/node/ to gitignore * modify scripts * add onnxjs * fix tests * fix test-runner * fix sourcemap * fix onnxjs profiling * update test list * update README * resolve comments * set wasm as default backend * rename package * update copyright header * do not use class "Buffer" in browser context * revise readme	2021-04-27 00:04:25 -07:00

6 commits
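
The reshape described in commit 8fbbf2fd4f (#22577) can be illustrated with a small shape-only sketch. This is not the code from that PR: the function name `foldBatchIntoM`, the `FoldedMatMulShapes` interface, and the exact eligibility conditions (A has at least one batch dimension, M = 1, and B is a plain 2-D matrix) are assumptions made for illustration. It only computes the shapes involved and runs no shader.

```typescript
// Hypothetical helper, not part of onnxruntime: computes the shapes used when
// the batch dimensions of A are folded into M for a MatMul with M = 1.
interface FoldedMatMulShapes {
  a: number[]; // reshaped A: [1, batch, K]
  b: number[]; // B unchanged: [K, N]
  folded: number[]; // MatMul result on the reshaped inputs: [1, batch, N]
  output: number[]; // final output shape after reshaping back
}

function foldBatchIntoM(aShape: number[], bShape: number[]): FoldedMatMulShapes | null {
  // Assumed conditions: A is batched, B is 2-D (so B is shared by every batch).
  if (aShape.length < 3 || bShape.length !== 2) {
    return null;
  }
  const m = aShape[aShape.length - 2];
  const k = aShape[aShape.length - 1];
  if (m !== 1 || bShape[0] !== k) {
    return null;
  }
  const n = bShape[1];
  // Product of all leading (batch) dimensions of A.
  const batch = aShape.slice(0, -2).reduce((acc, d) => acc * d, 1);
  return {
    a: [1, batch, k],
    b: [k, n],
    folded: [1, batch, n],
    output: [...aShape.slice(0, -1), n],
  };
}

// Shapes taken from the commit message above.
const shapes = foldBatchIntoM([3448, 1, 512], [512, 1536]);
console.log(shapes);
```

Because B has no batch dimensions, every batch row of A multiplies the same matrix, so moving the batch size into M leaves the result unchanged up to a reshape while giving an [M, N] tiling scheme far more rows to partition.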