onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-04 23:59:56 +00:00

Satya Kumar Jandhyala e8bf46a70e [WebGPU EP] Support GroupQueryAttention (#22658) ### Description Support GroupQueryAttention operator for native webgpu ep. ### Motivation and Context This is required for inferencing some LLMs.		2024-12-02 12:40:03 -08:00
..
contrib_ops	[WebGPU EP] Support GroupQueryAttention (#22658)	2024-12-02 12:40:03 -08:00
core	[TensorRT EP] Fix wrong input order when generating IndexedSubGraph (#22857)	2024-12-02 01:45:29 -08:00
lora	Accommodate BE platforms. Make sure we always write flatbuffers LE (#22375)	2024-10-11 09:14:44 -07:00
python	[CoreML] Create EP by AppendExecutionProvider (#22675)	2024-11-27 09:26:31 +08:00
test	Quantize Bias for Conv/Gemm on Quantized Model (#22889)	2024-11-28 10:10:24 +08:00
tool/etw
wasm	[WebNN] Fixed WebNN Module undefined issue (#22795)	2024-11-11 21:31:24 -08:00
__init__.py	bumps up version in main from 1.20 -> 1.21 (#22482)	2024-10-17 12:32:35 -07:00
ReformatSource.ps1
ReformatSourcePython.bat
VSCodeCoverage.runsettings