onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-02 03:55:34 +00:00

zhijiang 16d7f55193 lora conv1d replacement (#16643): LoRA code uses conv1d to do the projection for qkv. This conv1d calculation is mathematically equivalent to matmul, and matmul is much faster than conv1d. The graph optimizer substitution is: 1 conv1d >> 2 split + 1 squeeze + group_num matmul + 1 concat. With this optimizer, we see a 10%+ improvement in one 1P model.		2023-11-16 17:08:06 +08:00
orttraining	lora conv1d replacement (#16643)	2023-11-16 17:08:06 +08:00
pytorch_frontend_examples	[Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789)	2023-07-21 12:53:41 -07:00
tools	[ROCm] Update ROCm and MIGraphX CI to ROCm5.7 (#17834)	2023-10-09 10:29:11 +08:00
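
The conv1d-to-matmul substitution described in the latest commit message can be illustrated with a small NumPy sketch. It is not the optimizer's actual code. It assumes a grouped conv1d with kernel size 1, no bias, and an input laid out as (batch, channels, length); the function names are hypothetical and the exact order of operations in the real graph rewrite may differ.

```python
import numpy as np


def conv1d_k1_reference(x, w, groups):
    """Grouped conv1d with kernel size 1, written as explicit loops.

    x: (batch, c_in, length), w: (c_out, c_in // groups, 1).
    """
    n, c_in, length = x.shape
    c_out = w.shape[0]
    cin_g, cout_g = c_in // groups, c_out // groups
    y = np.zeros((n, c_out, length), dtype=x.dtype)
    for o in range(c_out):
        g = o // cout_g  # group that output channel o belongs to
        for c in range(cin_g):
            y[:, o, :] += w[o, c, 0] * x[:, g * cin_g + c, :]
    return y


def conv1d_k1_as_matmul(x, w, groups):
    """Same result via 1 squeeze + 2 split + `groups` matmul + 1 concat."""
    w2d = np.squeeze(w, axis=2)              # drop the size-1 kernel axis
    x_parts = np.split(x, groups, axis=1)    # split input channels per group
    w_parts = np.split(w2d, groups, axis=0)  # split output channels per group
    # (cout_g, cin_g) @ (batch, cin_g, length) -> (batch, cout_g, length)
    outs = [np.matmul(wg, xg) for wg, xg in zip(w_parts, x_parts)]
    return np.concatenate(outs, axis=1)      # stitch groups back together


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 6, 5))
w = rng.standard_normal((9, 2, 1))
same = np.allclose(conv1d_k1_reference(x, w, 3), conv1d_k1_as_matmul(x, w, 3))
print("outputs match:", same)
```

With kernel size 1 there is no sliding window, so each output position is just a weighted sum over the input channels of its group, which is why one matmul per group reproduces the convolution.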