onnxruntime/js/web/lib
Jiajia Qin 25f427466e
[js/webgpu] Optimize ConvTranspose (Continue) (#23429)
BUG #23273

This PR makes the following optimizations:
1. When the number of output channels is one: 1) calculate the offset before
the input-channel loop to reduce indices-to-offset calculations; 2) split
`inputChannelsPerGroup` into `inputChannelsPerGroupInt` and
`inputChannelsRemainder` parts so that the `inputChannelsPerGroupInt` part
can always be accessed 4 elements at a time.
2. Use a precise initial value to avoid useless loop iterations. Thanks to
@jiangzhaoming for the suggestion.
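
The channel split in item 1 can be illustrated with a small CPU-side sketch. This is not the WGSL shader from the PR; it only shows the idea under stated assumptions: `inputChannelsPerGroupInt` is taken to be the largest multiple of 4 not exceeding `inputChannelsPerGroup`, the helper names `splitChannels` and `dotChannels` are hypothetical, and the base offsets are computed once by the caller rather than inside the loop.

```typescript
// Hypothetical helper: split a channel count into a multiple-of-4 part and a remainder.
// Assumption: this mirrors the intent of inputChannelsPerGroupInt / inputChannelsRemainder.
function splitChannels(inputChannelsPerGroup: number): {
  inputChannelsPerGroupInt: number;
  inputChannelsRemainder: number;
} {
  const inputChannelsRemainder = inputChannelsPerGroup % 4;
  const inputChannelsPerGroupInt = inputChannelsPerGroup - inputChannelsRemainder;
  return { inputChannelsPerGroupInt, inputChannelsRemainder };
}

// Hypothetical CPU analogue of the inner accumulation for a single output channel.
// xOffset and wOffset are computed once, before the channel loop.
function dotChannels(
  x: number[],
  w: number[],
  xOffset: number,
  wOffset: number,
  inputChannelsPerGroup: number,
): number {
  const { inputChannelsPerGroupInt, inputChannelsRemainder } = splitChannels(inputChannelsPerGroup);
  let sum = 0;
  let c = 0;
  // Main part: always consume 4 channels per iteration (a vec4 load in a shader).
  for (; c < inputChannelsPerGroupInt; c += 4) {
    sum +=
      x[xOffset + c] * w[wOffset + c] +
      x[xOffset + c + 1] * w[wOffset + c + 1] +
      x[xOffset + c + 2] * w[wOffset + c + 2] +
      x[xOffset + c + 3] * w[wOffset + c + 3];
  }
  // Tail: handle the 0 to 3 leftover channels one at a time.
  for (let r = 0; r < inputChannelsRemainder; r++) {
    sum += x[xOffset + c + r] * w[wOffset + c + r];
  }
  return sum;
}
```

With 6 input channels per group, for example, the main loop runs once over 4 channels and the tail loop handles the remaining 2.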

With this PR, ConvTranspose time drops from 8.4s to 3.7s on Intel Meteor Lake
and from 2.7s to 1.6s on NVIDIA RTX 2000 Ada.
2025-01-22 08:59:17 -08:00
..
onnxjs [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
wasm [js/webgpu] Optimize ConvTranspose (Continue) (#23429) 2025-01-22 08:59:17 -08:00
backend-onnxjs.ts [js] change default formatter for JavaScript/TypeScript from clang-format to Prettier (#21728) 2024-08-14 16:51:22 -07:00
backend-wasm.ts [js/web] fix package export for bundlers (#23257) 2025-01-09 11:01:00 -08:00
build-def.d.ts [js/web] fix package export for bundlers (#23257) 2025-01-09 11:01:00 -08:00
index.ts [js/web] remove training release (#22103) 2024-09-16 10:56:22 -07:00
version.ts bumps up version in main from 1.20 -> 1.21 (#22482) 2024-10-17 12:32:35 -07:00