onnxruntime/js/web/lib/wasm/jsep/webgpu
Xu Xing 8c59cd4fce
[js/webgpu] Support GroupQueryAttention (#20237)
TODOs:
1. Handle H * params.kvNumHeads greater than work group size limit.
2. Support BNSH kv cache.
2024-05-13 09:43:37 -07:00
ops [js/webgpu] Support GroupQueryAttention (#20237) 2024-05-13 09:43:37 -07:00
attribute-with-cache-key.ts
gpu-data-manager.ts more conservative gpu-buffer cache algo (#20312) 2024-04-23 09:07:04 -07:00
op-resolve-rules.ts [js/webgpu] Support GroupQueryAttention (#20237) 2024-05-13 09:43:37 -07:00
program-manager.ts
types.ts