onnxruntime/js/web/lib/wasm/jsep/webgpu
Xu Xing 8c59cd4fce
[js/webgpu] Support GroupQueryAttention (#20237)
TODOs:
1. Handle H * params.kvNumHeads greater than work group size limit.
2. Support BNSH kv cache.
2024-05-13 09:43:37 -07:00
ops [js/webgpu] Support GroupQueryAttention (#20237) 2024-05-13 09:43:37 -07:00
attribute-with-cache-key.ts [js] optimize eslint config (#18460) 2023-11-20 12:00:56 -08:00
gpu-data-manager.ts more conservative gpu-buffer cache algo (#20312) 2024-04-23 09:07:04 -07:00
op-resolve-rules.ts [js/webgpu] Support GroupQueryAttention (#20237) 2024-05-13 09:43:37 -07:00
program-manager.ts [js/webgpu] add validation to workgroup size (#20110) 2024-04-02 19:29:20 -07:00
types.ts [JS/WebGPU] Improve MatMulNBits perf (#19974) 2024-04-12 11:03:05 -07:00