onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-23 02:38:28 +00:00

History

Joshua Lochner d981b153d3 [webgpu/js] Optimize resize webgpu op & fix precision issues (#23591)

### Description

This PR is a follow-up to https://github.com/microsoft/onnxruntime/pull/23488 and partially improves upon https://github.com/microsoft/onnxruntime/issues/23403. It does the following:

- Prevents unnecessary cache shader recompilation for 'nearest' resize operation.
- Fixes precision (offset-by-one) errors with asymmetric coordinate transform. When running the Kokoro TTS model, values for the `/decoder/decoder/generator/f0_upsamp/Resize_output_0` results in differences at the end bounds due to precision issues when dividing 21600 by 72 (should be 300, but seemingly results in 299.999, which causes issues when flooring)

### Motivation and Context

I did a deep dive over the weekend to try fix Kokoro TTS on WebGPU and found that the above node had a large difference. Thinking this was a major issue, I spent some time fixing it. Turns out, it only happens for a small number of values, leading to high maximum error, but most values are correct (as seen here).

BEFORE:
```
[/decoder/decoder/generator/f0_upsamp/Resize_output_0] atol: 78.6640682220459 | rtol: 24.13991587587724 | avgDiff: 0.009967932171121087 | medianDiff: 0.000030517578125
```

AFTER:
```
[/decoder/decoder/generator/f0_upsamp/Resize_output_0] atol: 0.0011138916015625 | rtol: 0.0020059924232260704 | avgDiff: 0.00008570214675873825 | medianDiff: 0.000030517578125
```

So, although it has a very small impact on the final output (waveform), this bug could appear with other models in a more severe way.

BEFORE:
```
[waveform] atol: 0.04784199967980385 | rtol: 1366.0462001093495 | avgDiff: 0.0009544936942737713 | medianDiff: 0.00015346752479672432
```

AFTER:
```
[waveform] atol: 0.04775865003466606 | rtol: 1354.7002460360852 | avgDiff: 0.000954830244055033 | medianDiff: 0.00015274062752723694
```

2025-02-06 10:26:25 -08:00
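The off-by-one described in the commit message above (a quotient that should be 300 arriving as roughly 299.999 and then being floored) can be illustrated with a minimal sketch. This is not the change made in `resize.ts`: the helper `floorWithSnap`, its tolerance, and the hard-coded value `299.99997` (standing in for an inexact GPU quotient) are all assumptions chosen for illustration.

```typescript
// Hypothetical helper, not taken from resize.ts: floor a coordinate, but
// first snap it to the nearest integer when it is within `eps` of one.
function floorWithSnap(x: number, eps: number = 1e-4): number {
  const nearest = Math.round(x);
  return Math.abs(x - nearest) < eps ? nearest : Math.floor(x);
}

// On the CPU, 21600 / 72 is exactly 300 in double precision.
const exactIndex = Math.floor(21600 / 72);

// Assumed stand-in for a quotient that came out slightly low on the GPU.
const inexactQuotient = 299.99997;

// A plain floor lands one index too low; the snapped floor does not.
const naiveIndex = Math.floor(inexactQuotient);
const snappedIndex = floorWithSnap(inexactQuotient);

console.log({ exactIndex, naiveIndex, snappedIndex });
```

Snapping with a tolerance is only one possible guard; computing the index with integer arithmetic where the scale is a whole number would avoid the floating-point division entirely.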
..
3rd-party	Fix ConvTranspose for certain attribute combinations (#23488)	2025-02-05 12:22:47 -08:00
argminmax.ts
attention.ts	[JS/WebGPU] GroupQueryAttention rewrite (#20946)	2024-10-23 10:14:09 -07:00
batch-norm.ts
bias-add.ts
bias-split-gelu.ts
binary-op.ts	[JS/WebGPU] Support WASM64 (#21836)	2024-10-24 20:21:51 -07:00
common.ts	[js/webgpu] Add scatterND (#22755)	2024-11-13 09:13:00 -08:00
concat.ts
conv-grouped.ts
conv-transpose.ts	Fix ConvTranspose for certain attribute combinations (#23488)	2025-02-05 12:22:47 -08:00
conv.ts	WebGPU JSEP: Make shader code not depend on input broadcasting patterns (#22536)	2024-11-08 11:00:51 -08:00
cumsum.ts
depth-to-space.ts
einsum.ts
expand.ts	[js/webgpu] Optimize Expand (#22752)	2024-11-12 12:37:19 -08:00
fast-gelu.ts
fuse-utils.ts
gather-block-quantized.ts
gather-elements.ts
gather-nd.ts	[js/webgpu] Add GatherND (#22847)	2024-12-04 09:57:32 -08:00
gather.ts
gemm.ts	[js/webgpu] Optimize Gemm (#22706)	2024-11-04 15:05:21 -08:00
grid-sample.ts	[js/webgpu] support GridSample operator (#22652)	2024-11-08 11:02:36 -08:00
group-query-attention.ts	[JSEP/WebGPU] Add a fatal error message for unsupported GQA do_rotary attribute. (#23287)	2025-01-09 08:52:17 -08:00
instance-norm.ts	[js/webgpu] Optimize InstanceNorm in some shapes (#22637)	2024-10-29 17:10:14 -07:00
layer-norm.ts
matmul-shaders.ts	WebGPU JSEP: Make shader code not depend on input broadcasting patterns (#22536)	2024-11-08 11:00:51 -08:00
matmul.ts	WebGPU JSEP: Make shader code not depend on input broadcasting patterns (#22536)	2024-11-08 11:00:51 -08:00
matmulnbits.ts	[js/webgpu] Optimize matmulnbits (#22360)	2024-10-14 15:49:29 -07:00
multihead-attention.ts	[JS/WebGPU] GroupQueryAttention rewrite (#20946)	2024-10-23 10:14:09 -07:00
pad.ts
pool.ts
quantize-linear.ts
range.ts
reduce-shared.ts	[js/webgpu] Increase workgroupSize if only one workgroup is dispached (#22709)	2024-11-05 13:13:52 -08:00
reduce.ts
resize.ts	[webgpu/js] Optimize resize webgpu op & fix precision issues (#23591)	2025-02-06 10:26:25 -08:00
rotary-embedding.ts
scatter-nd.ts	[js/webgpu] Add scatterND (#22755)	2024-11-13 09:13:00 -08:00
skip-layer-norm.ts
slice.ts
softmax.ts	[js/webgpu] Increase workgroupSize if only one workgroup is dispached (#22709)	2024-11-05 13:13:52 -08:00
split.ts	[JS/WebGPU] GroupQueryAttention rewrite (#20946)	2024-10-23 10:14:09 -07:00
tile.ts
transpose.ts	[js/webgpu] validate transpose perm if specified (#23197)	2025-01-01 15:58:54 -08:00
unary-op.ts
where.ts