mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-18 21:21:17 +00:00
### Description

Changes in this PR:

1) Use the optimized version `makeMatMulPacked[Vec4]Source` to support matmul.
2) Enable the conv2dByMatMul path.
3) Support broadcast.
4) Use IndicesHelper.

MatMul with M = 512, K = 512, N = 512 drops from 15 ms to 2 ms with profilingMode enabled on my ADL.

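
To make the "support broadcast" item concrete, here is a minimal sketch of how the output shape of a broadcast MatMul can be derived, assuming the NumPy-style semantics that ONNX MatMul follows (the leading batch dimensions broadcast, the last two dimensions are the matrices). The names `broadcastBatchDims` and `matMulOutputShape` are hypothetical helpers written for illustration; they are not the functions used in this PR.

```typescript
// Hypothetical helper (not from the PR): broadcast two batch-dimension lists,
// aligning them from the right as NumPy-style broadcasting does.
function broadcastBatchDims(a: readonly number[], b: readonly number[]): number[] {
  const rank = Math.max(a.length, b.length);
  const out: number[] = new Array<number>(rank);
  for (let i = 0; i < rank; i++) {
    // A missing leading dimension behaves like a dimension of size 1.
    const da = a[a.length - 1 - i] ?? 1;
    const db = b[b.length - 1 - i] ?? 1;
    if (da !== db && da !== 1 && db !== 1) {
      throw new Error(`Cannot broadcast batch dims ${da} and ${db}`);
    }
    out[rank - 1 - i] = da === 1 ? db : da;
  }
  return out;
}

// Hypothetical helper (not from the PR): output shape of A x B where
// A is [...batchA, M, K] and B is [...batchB, K, N].
// Assumes both inputs have rank >= 2; 1-D inputs are not handled here.
function matMulOutputShape(aShape: readonly number[], bShape: readonly number[]): number[] {
  if (aShape.length < 2 || bShape.length < 2) {
    throw new Error('This sketch only handles inputs of rank >= 2');
  }
  const m = aShape[aShape.length - 2];
  const kA = aShape[aShape.length - 1];
  const kB = bShape[bShape.length - 2];
  const n = bShape[bShape.length - 1];
  if (kA !== kB) {
    throw new Error(`Shared dimension mismatch: ${kA} vs ${kB}`);
  }
  const batch = broadcastBatchDims(aShape.slice(0, -2), bShape.slice(0, -2));
  return [...batch, m, n];
}
```

For example, under these assumptions `matMulOutputShape([2, 1, 512, 512], [3, 512, 512])` yields a shape with batch dimensions `[2, 3]` followed by M = 512 and N = 512.
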
| | | |
|---|---|---|
| .. | | |
| ops | | |