onnxruntime/js/web/lib
Jiajia Qin b30e721dc8
[js/webgpu] Provide a naive vectorized matmul algorithm (#18758)
### Description
This PR provides a vectorized matmul algorithm. In most situations, we
still use the workgroup-memory-optimized matmul. But in some
situations, such as when N and K are very small, the workgroup-optimized
matmul can't fully utilize the underlying hardware because of its 32x32 tile
size. So for very small N/K, we switch to the naive vectorized matmul
algorithm to improve hardware execution unit usage.

With this PR, matmul with input0: [1, 36864, 3], input1: [1, 3, 3],
input2: [3] drops from 4.34 ms to less than 1 ms on Intel Gen9 GPUs.
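
The dispatch idea described above can be sketched roughly as follows. This is an illustrative CPU-side sketch, not the code in this PR: `SMALL_DIM_THRESHOLD`, `chooseMatMulPath`, `tileUtilization`, and `naiveMatMul` are hypothetical names, and the threshold value is an assumption, since the description only says "very small N/K". The only figure taken from the description is the 32x32 tile size.

```typescript
// Assumed cutoff for "very small" N/K. The description gives no number.
const SMALL_DIM_THRESHOLD = 8;
// Tile size of the workgroup-optimized path, as stated in the description.
const TILE_SIZE = 32;

type MatMulPath = 'naive-vectorized' | 'workgroup-tiled';

// Hypothetical dispatch: use the naive path only when both N and K are tiny.
function chooseMatMulPath(n: number, k: number): MatMulPath {
  return n < SMALL_DIM_THRESHOLD && k < SMALL_DIM_THRESHOLD ? 'naive-vectorized' : 'workgroup-tiled';
}

// Fraction of the tile columns that do useful work along a dimension of size n.
// For n = 3 this is 3 / 32 = 0.09375, which is why the tiled path wastes hardware.
function tileUtilization(n: number): number {
  const tiles = Math.ceil(n / TILE_SIZE);
  return n / (tiles * TILE_SIZE);
}

// CPU reference for the math only: out[M, N] = a[M, K] * b[K, N] + bias[N].
// All matrices are row-major. Each output element is computed independently,
// which is the property a naive GPU kernel relies on.
function naiveMatMul(
    a: Float32Array, b: Float32Array, bias: Float32Array, m: number, k: number, n: number): Float32Array {
  const out = new Float32Array(m * n);
  for (let row = 0; row < m; row++) {
    for (let col = 0; col < n; col++) {
      let sum = bias[col];
      for (let i = 0; i < k; i++) {
        sum += a[row * k + i] * b[i * n + col];
      }
      out[row * n + col] = sum;
    }
  }
  return out;
}
```

For the shapes quoted above (N = 3, K = 3), `chooseMatMulPath(3, 3)` selects the naive path under the assumed threshold, and `tileUtilization(3)` shows that fewer than 10% of a 32-wide tile's columns would carry useful work.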
2023-12-13 09:03:23 -08:00
..
onnxjs [js] optimize eslint config (#18460) 2023-11-20 12:00:56 -08:00
wasm [js/webgpu] Provide a naive vectorized matmul algorithm (#18758) 2023-12-13 09:03:23 -08:00
backend-onnxjs.ts [js/web/training] runTrainStep implementation (#18006) 2023-11-02 08:32:50 -07:00
backend-wasm-inference.ts Add "glue" between training WASM artifacts and training web (#17474) 2023-10-12 11:16:56 -07:00
backend-wasm-training.ts [js/web/training] runTrainStep implementation (#18006) 2023-11-02 08:32:50 -07:00
backend-wasm.ts [js/web/training] runTrainStep implementation (#18006) 2023-11-02 08:32:50 -07:00
build-def.d.ts [js/web] fix typescript type check (#18343) 2023-11-10 16:03:38 -08:00
index.ts [js/web] fix a few package consuming problems (#18109) 2023-10-30 08:11:43 -07:00
version.ts Bump Up Version to 1.17.0 (#17587) 2023-09-20 11:02:58 +08:00