onnxruntime/js/web/lib/wasm/jsep
Xu Xing 8c59cd4fce
[js/webgpu] Support GroupQueryAttention (#20237)
TODOs:
1. Handle H * params.kvNumHeads greater than work group size limit.
2. Support BNSH kv cache.
2024-05-13 09:43:37 -07:00
webgpu [js/webgpu] Support GroupQueryAttention (#20237) 2024-05-13 09:43:37 -07:00
backend-webgpu.ts [JS/WebGPU] Multihead attention improvements (#20286) 2024-04-23 12:39:49 -07:00
init.ts [JS/WebGPU] Multihead attention improvements (#20286) 2024-04-23 12:39:49 -07:00
log.ts [js/webgpu] support proxy for webgpu (#15851) 2023-05-15 16:23:13 -07:00
tensor-view.ts [js/web] revise TensorView (#17473) 2023-09-14 21:14:44 -07:00
util.ts [js/webgpu] allows a ProgramInfo's RunData to use zero sized output (#19614) 2024-02-23 12:52:47 -08:00