onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-03 03:58:54 +00:00

History

Jiajia Qin 891fba3b9c [js/webgpu] Optimize Gather op (#17625)	2023-09-21 21:00:36 -07:00

### Description

This PR optimizes the Gather op, saving about 6 ms in the Segment Anything model on ADL.

The problem with the original algorithm is that it uses a for loop to process an entire block of data. However, the block size can be very large, for example `65536`. In a GPU shader, large loops should be avoided; the work should be spread across more threads and done in parallel.

Before:
```
[profiling] kernel "41771992|[Gather] 41771992" input[0]: [4,65536] | float32, input[1]: [1] | int64, output[0]: [1,65536] | float32, execution time: 6886207 ns
```

After:
```
[profiling] kernel "41771992|[Gather] 41771992" input[0]: [4,65536] | float32, input[1]: [1] | int64, output[0]: [1,65536] | float32, execution time: 11719 ns
```
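The change described in the commit message can be illustrated with a small CPU model. This is only a sketch under stated assumptions, not the WGSL shader from the PR: the function names `gatherLooped` and `gatherPerElement` are hypothetical, the gather is restricted to axis 0 of a 2-D `[rows, blockSize]` tensor, and each iteration of the outer loop stands in for one GPU thread.

```typescript
// Hypothetical CPU model of the two strategies (not the actual shader code).
// Gathers rows along axis 0 of a row-major [rows, blockSize] tensor.

// Normalize a possibly negative index, as ONNX Gather permits.
function normalizeIndex(index: number, rows: number): number {
  return index < 0 ? index + rows : index;
}

// "Before" (assumed shape): one thread per gathered index, and each
// thread walks the whole block in a loop of length blockSize.
function gatherLooped(
  data: Float32Array, indices: number[], rows: number, blockSize: number,
): Float32Array {
  const out = new Float32Array(indices.length * blockSize);
  for (let thread = 0; thread < indices.length; thread++) {
    const row = normalizeIndex(indices[thread], rows);
    for (let j = 0; j < blockSize; j++) {
      out[thread * blockSize + j] = data[row * blockSize + j];
    }
  }
  return out;
}

// "After" (assumed shape): one thread per output element, so no thread
// runs a long loop; each derives its source offset from its own id.
function gatherPerElement(
  data: Float32Array, indices: number[], rows: number, blockSize: number,
): Float32Array {
  const out = new Float32Array(indices.length * blockSize);
  for (let thread = 0; thread < out.length; thread++) {
    const which = Math.floor(thread / blockSize); // which gathered index
    const j = thread % blockSize;                 // offset inside the block
    const row = normalizeIndex(indices[which], rows);
    out[thread] = data[row * blockSize + j];
  }
  return out;
}
```

Both functions produce the same output. The difference is how the work is divided: for the profiled shapes (`[4,65536]` data, one index), the first form would give a single thread 65536 iterations, while the second gives 65536 threads one element each.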
..
onnxjs	[js/api] introducing IO binding for tensor (#16452)	2023-08-29 12:58:26 -07:00
wasm	[js/webgpu] Optimize Gather op (#17625)	2023-09-21 21:00:36 -07:00
backend-onnxjs.ts	[js] upgrade async@3.2.3 /js/ (#11421)	2022-05-03 23:41:36 -07:00
backend-wasm.ts	[js/webgpu] support proxy for webgpu (#15851)	2023-05-15 16:23:13 -07:00
build-def.d.ts	[js/web] WebGPU backend via JSEP (#14579)	2023-04-24 15:21:18 -07:00
index.ts	[js/api] introducing IO binding for tensor (#16452)	2023-08-29 12:58:26 -07:00
version.ts	Bump Up Version to 1.17.0 (#17587)	2023-09-20 11:02:58 +08:00