onnxruntime/js/web/lib
Jiajia Qin abdf8b7c3f
[js/webgpu] Optimize broadcast binary. (#18185)
### Description
Currently, the binary algorithms are divided into a vectorized path
(efficient) and a non-vectorized path (less efficient). The following
cases take the vectorized path:
1) A or B's shape length is 1.
2) The length of the shared dimensions of A and B is divisible by 4.
3) A and B have the same shape.

This PR adds one more case that takes the vectorized path:
4) A or B's last dimension is divisible by 4.
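
The four cases can be read as a single eligibility check. The following is a minimal sketch of that check, not the code from this PR: `canVectorize`, `sharedTrailingLength` and `elementCount` are hypothetical names. It assumes that "shape length is 1" means the tensor holds a single element, and that "shared dimensions" means the trailing dimensions that are identical in A and B.

```typescript
type Shape = readonly number[];

// Total number of elements described by a shape (1 for an empty shape).
const elementCount = (shape: Shape): number => shape.reduce((n, d) => n * d, 1);

// Product of the trailing dimensions that are identical in both shapes.
// Assumption: this is what "shared dimensions length" refers to.
function sharedTrailingLength(a: Shape, b: Shape): number {
  let shared = 1;
  for (let i = 1; i <= Math.min(a.length, b.length); i++) {
    const dimA = a[a.length - i];
    const dimB = b[b.length - i];
    if (dimA !== dimB) {
      break;
    }
    shared *= dimA;
  }
  return shared;
}

// Hypothetical predicate: true when any of the four listed cases applies.
function canVectorize(a: Shape, b: Shape): boolean {
  const sameShape = a.length === b.length && a.every((d, i) => d === b[i]);
  const lastDimDivisibleBy4 = (s: Shape): boolean =>
    s.length > 0 && s[s.length - 1] % 4 === 0;
  return (
    elementCount(a) === 1 || elementCount(b) === 1 || // case 1 (assumed reading)
    sharedTrailingLength(a, b) % 4 === 0 ||           // case 2
    sameShape ||                                      // case 3
    lastDimDivisibleBy4(a) || lastDimDivisibleBy4(b)  // case 4, added by this PR
  );
}
```

Under these assumptions, shapes such as `[3, 4]` and `[3, 1]` qualify only through case 4, while `[3, 5]` and `[5, 1]` still fall back to the non-vectorized path.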

With this change, the aggregate time of Add in sam-b-encoder drops from
409.12 ms to 309.65 ms on Intel ADL, a reduction of about 24%.
2023-11-20 16:52:17 -08:00
..
onnxjs [js] optimize eslint config (#18460) 2023-11-20 12:00:56 -08:00
wasm [js/webgpu] Optimize broadcast binary. (#18185) 2023-11-20 16:52:17 -08:00
backend-onnxjs.ts [js/web/training] runTrainStep implementation (#18006) 2023-11-02 08:32:50 -07:00
backend-wasm-inference.ts Add "glue" between training WASM artifacts and training web (#17474) 2023-10-12 11:16:56 -07:00
backend-wasm-training.ts [js/web/training] runTrainStep implementation (#18006) 2023-11-02 08:32:50 -07:00
backend-wasm.ts [js/web/training] runTrainStep implementation (#18006) 2023-11-02 08:32:50 -07:00
build-def.d.ts [js/web] fix typescript type check (#18343) 2023-11-10 16:03:38 -08:00
index.ts [js/web] fix a few package consuming problems (#18109) 2023-10-30 08:11:43 -07:00
version.ts