onnxruntime/js/web/test/data
Jiajia Qin fd6bab4250
[js/webgpu] Provide a vectorized algorithm for GroupedConv (#18884)
### Description
This PR provides a vectorized algorithm for NHWC GroupedConv to improve
performance.

With this change, the aggregate time of GroupedConv in mobilenetv2-12 drops
from ~4ms to ~1ms on an Intel Alder Lake machine, which is about a 20%
improvement for the whole model.
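
### Illustrative sketch

The description above does not show the shader, so the following is only a CPU-side sketch of why an NHWC layout lends itself to vectorization, not the PR's implementation. It assumes the depthwise case (one input channel per group), a batch of 1, stride 1, no padding, and a channel count divisible by 4. The names `ConvShape`, `depthwiseScalar`, and `depthwiseVec4` are hypothetical. Because channels are innermost in NHWC, four adjacent channels sit contiguously in memory, so one invocation can accumulate four outputs at once (on the GPU this would map naturally to a `vec4` value).

```typescript
// Hypothetical names for illustration; none of these come from the PR.
interface ConvShape { h: number; w: number; c: number; kh: number; kw: number; }

// Scalar reference: depthwise conv, stride 1, no padding, batch 1.
// x is laid out as [h, w, c] (NHWC with N = 1); wgt is laid out as [kh, kw, c].
function depthwiseScalar(x: Float32Array, wgt: Float32Array, s: ConvShape): Float32Array {
  const oh = s.h - s.kh + 1;
  const ow = s.w - s.kw + 1;
  const out = new Float32Array(oh * ow * s.c);
  for (let y = 0; y < oh; y++) {
    for (let xo = 0; xo < ow; xo++) {
      for (let ch = 0; ch < s.c; ch++) {
        let acc = 0;
        for (let ky = 0; ky < s.kh; ky++) {
          for (let kx = 0; kx < s.kw; kx++) {
            acc += x[((y + ky) * s.w + (xo + kx)) * s.c + ch] * wgt[(ky * s.kw + kx) * s.c + ch];
          }
        }
        out[(y * ow + xo) * s.c + ch] = acc;
      }
    }
  }
  return out;
}

// "Vectorized" variant: each inner step handles 4 contiguous channels,
// mimicking one vec4 load, multiply, and accumulate per kernel tap.
function depthwiseVec4(x: Float32Array, wgt: Float32Array, s: ConvShape): Float32Array {
  if (s.c % 4 !== 0) {
    throw new Error('this sketch assumes the channel count is a multiple of 4');
  }
  const oh = s.h - s.kh + 1;
  const ow = s.w - s.kw + 1;
  const out = new Float32Array(oh * ow * s.c);
  for (let y = 0; y < oh; y++) {
    for (let xo = 0; xo < ow; xo++) {
      for (let c4 = 0; c4 < s.c; c4 += 4) {
        const acc = [0, 0, 0, 0]; // stands in for a vec4 accumulator
        for (let ky = 0; ky < s.kh; ky++) {
          for (let kx = 0; kx < s.kw; kx++) {
            const xi = ((y + ky) * s.w + (xo + kx)) * s.c + c4;
            const wi = (ky * s.kw + kx) * s.c + c4;
            for (let lane = 0; lane < 4; lane++) {
              acc[lane] += x[xi + lane] * wgt[wi + lane];
            }
          }
        }
        for (let lane = 0; lane < 4; lane++) {
          out[(y * ow + xo) * s.c + c4 + lane] = acc[lane];
        }
      }
    }
  }
  return out;
}
```

Both functions accumulate each channel in the same order, so they should produce identical outputs; the second only regroups the work so that four channels share each kernel-tap address computation.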
2024-01-10 16:12:43 -08:00
..
ops [js/webgpu] Provide a vectorized algorithm for GroupedConv (#18884) 2024-01-10 16:12:43 -08:00