onnxruntime/js/web/test/data/ops
Jiajia Qin 25f427466e
[js/webgpu] Optimize ConvTranspose (Continue) (#23429)
BUG #23273

This PR makes the following optimizations:
1. When the number of output channels is one, 1) calculate the offset
before the input-channel loop to reduce indices-to-offsets calculations,
2) split `inputChannelsPerGroup` into `inputChannelsPerGroupInt` and
`inputChannelsRemainder` parts so that we can always access 4 data
elements at a time for `inputChannelsPerGroupInt`.
2. Use a precise initial value to reduce useless loop iterations. Thanks
to @jiangzhaoming for the suggestion.

With this PR, ConvTranspose time drops from 8.4s to 3.7s on Intel Meteor Lake.
On NV RTX 2000 Ada, it drops from 2.7s to 1.6s.
2025-01-22 08:59:17 -08:00
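
The channel split described in the commit message above can be illustrated with a minimal CPU-side sketch. This is not the shader code from the PR: the helper names `splitChannels` and `dotOverChannels` are hypothetical, and the sketch assumes that `inputChannelsPerGroupInt` is the largest multiple of 4 not exceeding `inputChannelsPerGroup`, with `inputChannelsRemainder` holding the leftover channels.

```typescript
// Hypothetical illustration only. The names inputChannelsPerGroup,
// inputChannelsPerGroupInt and inputChannelsRemainder come from the commit
// message; everything else is an assumption made for this sketch.

interface ChannelSplit {
  inputChannelsPerGroupInt: number; // assumed: largest multiple of 4 <= inputChannelsPerGroup
  inputChannelsRemainder: number; // assumed: leftover channels (0..3)
}

function splitChannels(inputChannelsPerGroup: number): ChannelSplit {
  const inputChannelsRemainder = inputChannelsPerGroup % 4;
  const inputChannelsPerGroupInt = inputChannelsPerGroup - inputChannelsRemainder;
  return { inputChannelsPerGroupInt, inputChannelsRemainder };
}

// Accumulate sum(x[c] * w[c]) over the channels of one group.
function dotOverChannels(x: Float32Array, w: Float32Array, inputChannelsPerGroup: number): number {
  const { inputChannelsPerGroupInt, inputChannelsRemainder } = splitChannels(inputChannelsPerGroup);
  let sum = 0;

  // Main part: always consume 4 values per iteration (a vec4-style access).
  for (let c = 0; c < inputChannelsPerGroupInt; c += 4) {
    sum += x[c] * w[c] + x[c + 1] * w[c + 1] + x[c + 2] * w[c + 2] + x[c + 3] * w[c + 3];
  }

  // Tail part: handle the remaining 0..3 channels one at a time.
  for (let r = 0; r < inputChannelsRemainder; r++) {
    const c = inputChannelsPerGroupInt + r;
    sum += x[c] * w[c];
  }
  return sum;
}
```

Splitting the loop this way keeps the bounds check out of the 4-wide main loop, so only the short tail loop deals with channel counts that are not a multiple of 4.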
_example.jsonc
abs-int32.jsonc
abs.jsonc
absr.jsonc [JS] onnxruntime-web (#7394) 2021-04-27 00:04:25 -07:00
abss.jsonc
acos.jsonc [js] enable formatter for more file types (#16888) 2023-07-28 15:46:58 -07:00
add.jsonc
add_int32.jsonc
add_zero-sized.jsonc
and.jsonc
asin.jsonc
attention.jsonc
batch-norm.jsonc
bias-add.jsonc
bias-split-gelu.jsonc
cast.jsonc
ceil.jsonc
clip.jsonc
concat.jsonc [js] enable formatter for more file types (#16888) 2023-07-28 15:46:58 -07:00
concat_int32.jsonc
concat_zero-sized.jsonc
conv-transpose.jsonc
conv.jsonc
conv1d.jsonc
conv3dncdhw.jsonc
cos.jsonc
cumsum.jsonc
depth-to-space.jsonc
dequantize-linear-int4.jsonc
dequantizelinear.jsonc
div.jsonc
div_int32.jsonc
einsum.jsonc
equal.jsonc
exp.jsonc
expand.jsonc
fast-gelu.jsonc
floor.jsonc
fused-conv.jsonc
fused-conv3dncdhw.jsonc [js/webgpu] Add activation for conv3d naive (#21466) 2024-07-29 08:47:41 -07:00
gather-block-quantized.jsonc
gather-elements.jsonc [JS/WebGPU] Support GatherElements kernel (#17243) 2023-08-28 09:55:25 -07:00
gather-nd.jsonc
gather.jsonc
gelu.jsonc
gemm.jsonc
global-average-pool.jsonc
greater.jsonc
group-query-attention.jsonc
identity.jsonc
image-scaler.jsonc
instance-norm.jsonc
layer-norm.jsonc
leaky-relu.jsonc
less.jsonc
log.jsonc
matmul-broadcast.jsonc
matmul.jsonc
matmulnbits.jsonc
max-pool.jsonc
mul.jsonc
mul_int32.jsonc
multihead-attention.jsonc
neg-int32.jsonc
neg.jsonc
not.jsonc
or.jsonc
pad-big.jsonc [js/web] update op test schema (#16921) 2023-08-03 14:20:20 -07:00
pad.jsonc
pad_f16.jsonc
pow-big-number.jsonc
pow.jsonc
pow_int32.jsonc
quick-gelu.jsonc
reduce-min.jsonc
relu.jsonc
reshape-int32.jsonc
reshape-pack.jsonc
reshape.jsonc
resize-pack.jsonc
resize.jsonc
rotary-embedding.jsonc [js/webgpu] Implement com.microsoft.RotaryEmbedding (#20209) 2024-04-08 09:11:26 -07:00
scatternd.jsonc
shape.jsonc
simplified-layer-norm.jsonc
sin.jsonc
skip-layer-norm.jsonc
skip-simplified-layer-norm.jsonc [js/web] fix test runner with optional input/output (#20399) 2024-04-22 12:53:10 -07:00
slice.jsonc
softmax.jsonc [js/webgpu] Remove the limitation on axis in softmax (#22231) 2024-09-30 18:27:11 -07:00
split.jsonc
sqrt.jsonc
sub.jsonc
sub_int32.jsonc
tan.jsonc
tanh.jsonc [js/webgpu] Fix Tanh explosion (#19201) 2024-01-25 08:25:35 -08:00
tile.jsonc
transpose.jsonc
transpose_int32_uint32.jsonc
upsample.jsonc
where.jsonc
where_broadcast.jsonc
xor.jsonc