onnxruntime/js/web/lib/wasm/jsep/webgpu/ops
Arthur Islamov ccf14e891e
[js/web] JSEP node assignment optimization (#17128)
### Description
Since WebGPU supports only float32 and int32, assigning Gather, Reshape,
Shape, Squeeze and Unsqueeze ops with other data types to the JS execution
provider creates additional MemCpy ops and slows down overall execution,
because all other ops on those tensor types run on the CPU.
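
The placement rule described above can be illustrated with a minimal TypeScript sketch. It is not the code from this patch: the types, the `assignProvider` and `countCrossings` helpers, and the `patched` flag are hypothetical, each node is reduced to a single data type, and a provider switch along a linear chain stands in for one MemCpy node.

```typescript
// Hypothetical types and helpers for illustration only; not the onnxruntime API.
type DataType = 'float32' | 'int32' | 'int64' | 'bool';
type Provider = 'JsExecutionProvider' | 'CPUExecutionProvider';

interface NodeInfo {
  opType: string;
  dataType: DataType; // simplification: one data type per node
}

// Data types the description says WebGPU supports.
const GPU_DATA_TYPES: ReadonlySet<DataType> = new Set<DataType>(['float32', 'int32']);

// Ops the description lists as causing extra MemCpy ops with other data types.
const SHAPE_OPS: ReadonlySet<string> = new Set<string>([
  'Gather', 'Reshape', 'Shape', 'Squeeze', 'Unsqueeze',
]);

// Assumed rule: before the patch the listed ops stay on the JS provider for
// any data type; after the patch they follow the data type like other ops.
function assignProvider(node: NodeInfo, patched: boolean): Provider {
  if (GPU_DATA_TYPES.has(node.dataType)) {
    return 'JsExecutionProvider';
  }
  if (SHAPE_OPS.has(node.opType) && !patched) {
    return 'JsExecutionProvider';
  }
  return 'CPUExecutionProvider';
}

// Count provider switches along a linear chain of nodes; each switch is used
// here as a stand-in for one MemCpy node.
function countCrossings(chain: NodeInfo[], patched: boolean): number {
  let crossings = 0;
  for (let i = 1; i < chain.length; i++) {
    if (assignProvider(chain[i - 1], patched) !== assignProvider(chain[i], patched)) {
      crossings++;
    }
  }
  return crossings;
}

// A made-up shape-computation chain ending in a float32 op.
const exampleChain: NodeInfo[] = [
  { opType: 'Shape', dataType: 'int64' },
  { opType: 'Gather', dataType: 'int64' },
  { opType: 'Mul', dataType: 'int64' },
  { opType: 'Unsqueeze', dataType: 'int64' },
  { opType: 'Concat', dataType: 'int64' },
  { opType: 'Reshape', dataType: 'float32' },
];

console.log(countCrossings(exampleChain, false)); // before-patch rule
console.log(countCrossings(exampleChain, true)); // after-patch rule
```

Under these assumptions the int64 shape computation stays on one provider after the change, so the chain switches providers only once, at the final float32 op.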

Before this patch, the SD Unet had these numbers:
Node(s) placed on [CPUExecutionProvider]. Number of nodes: 1141
Node(s) placed on [JsExecutionProvider]. Number of nodes: 4025
memcpy tokens: 2001

After this patch:
Node(s) placed on [CPUExecutionProvider]. Number of nodes: 1735
Node(s) placed on [JsExecutionProvider]. Number of nodes: 2243
memcpy tokens: 813

It also gives more than a 5x performance improvement: one Unet step went
from 12 sec to 2.2 sec on an RTX 3090 Ti, so we are getting close to native
performance.

UPD: with the latest changes from the main branch and multi-threading, it
went down to 1.6 sec. I will try re-exporting my model to ONNX with maximum
optimizations, such as using MultiHeadAttention to decrease the node count.
Once that is implemented, it may go below 1 sec.
2023-08-15 18:58:05 -07:00
3rd-party [js/web] [webgpu] new incides helper (#16957) 2023-08-11 11:36:59 -07:00
argminmax.ts [js/web] [webgpu] new incides helper (#16957) 2023-08-11 11:36:59 -07:00
binary-op.ts [js/web] [webgpu] new incides helper (#16957) 2023-08-11 11:36:59 -07:00
common.ts [js/web] [webgpu] new incides helper (#16957) 2023-08-11 11:36:59 -07:00
concat.ts [js/web] [webgpu] new incides helper (#16957) 2023-08-11 11:36:59 -07:00
conv-grouped.ts [js/web] [webgpu] new incides helper (#16957) 2023-08-11 11:36:59 -07:00
conv-transpose.ts [Web/JS] Add ConvTranspose support (#16433) 2023-07-08 11:10:50 -07:00
conv.ts [js/web] [JSEP] allow passing data in kernel compute (#16621) 2023-07-07 14:27:30 -07:00
conv2d-mm.ts
expand.ts [js/web] [webgpu] new incides helper (#16957) 2023-08-11 11:36:59 -07:00
fuse-utils.ts
gather.ts [js/web] JSEP node assignment optimization (#17128) 2023-08-15 18:58:05 -07:00
gemm.ts [js/webgpu] make RunFunction return void (#15669) 2023-04-25 14:14:26 -07:00
instance-norm.ts [js/web] JSEP LayerNormalization and InstanceNormalizations kernels (#16830) 2023-08-08 09:09:37 -07:00
layer-norm.ts [js/web] JSEP LayerNormalization and InstanceNormalizations kernels (#16830) 2023-08-08 09:09:37 -07:00
matmul.ts [js/webgpu] make RunFunction return void (#15669) 2023-04-25 14:14:26 -07:00
pool.ts [js/web] [webgpu] new incides helper (#16957) 2023-08-11 11:36:59 -07:00
reduce.ts [js/web] [webgpu] new incides helper (#16957) 2023-08-11 11:36:59 -07:00
resize.ts [js/web] [webgpu] new incides helper (#16957) 2023-08-11 11:36:59 -07:00
slice.ts [js/web] [webgpu] new incides helper (#16957) 2023-08-11 11:36:59 -07:00
softmax.ts js/webgpu: argmax,argmin,softmax support (#16882) 2023-08-02 18:16:19 -07:00
split.ts [js/web] [webgpu] new incides helper (#16957) 2023-08-11 11:36:59 -07:00
transpose.ts [js/web] [webgpu] new incides helper (#16957) 2023-08-11 11:36:59 -07:00
unary-op.ts [JS/WebGPU] Support Log operator (#17045) 2023-08-14 18:04:12 -07:00