mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-22 22:01:08 +00:00
TODOs: 1. Handle H * params.kvNumHeads greater than work group size limit. 2. Support BNSH kv cache.

| Name |
|---|
| ops |
| attribute-with-cache-key.ts |
| gpu-data-manager.ts |
| op-resolve-rules.ts |
| program-manager.ts |
| types.ts |