onnxruntime/orttraining
zhijiang 16d7f55193
lora conv1d replacement (#16643)
The LoRA code uses conv1d to do the qkv projection. This conv1d
calculation is mathematically equivalent to matmul, and matmul is
much faster than conv1d.
The substitution made by the graph optimizer is: 1 conv1d >> 2 split + 1
squeeze + group_num matmul + 1 concat
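
The equivalence behind this substitution can be checked numerically. The sketch below is not the optimizer's code: it uses NumPy rather than ONNX Runtime, assumes a conv1d with kernel size 1, stride 1, no padding and no bias, and the function names `conv1d_k1_reference` and `conv1d_k1_as_matmul` are made up for illustration. The order of the squeeze and the weight split is also an assumption.

```python
import numpy as np


def conv1d_k1_reference(x, w, groups):
    """Direct grouped conv1d for kernel size 1 (stride 1, no padding, no bias).

    x: (batch, c_in, length), w: (c_out, c_in // groups, 1).
    """
    batch, c_in, length = x.shape
    c_out = w.shape[0]
    cin_g = c_in // groups
    cout_g = c_out // groups
    out = np.zeros((batch, c_out, length), dtype=x.dtype)
    for o in range(c_out):
        g = o // cout_g  # group that output channel o belongs to
        for i in range(cin_g):
            out[:, o, :] += w[o, i, 0] * x[:, g * cin_g + i, :]
    return out


def conv1d_k1_as_matmul(x, w, groups):
    """Same result via 2 split + 1 squeeze + `groups` matmul + 1 concat."""
    x_parts = np.split(x, groups, axis=1)    # split 1: input channels
    w2d = np.squeeze(w, axis=2)              # squeeze: drop the kernel axis
    w_parts = np.split(w2d, groups, axis=0)  # split 2: output channels
    # One matmul per group: (cout_g, cin_g) @ (batch, cin_g, length)
    outs = [np.matmul(wg, xg) for wg, xg in zip(w_parts, x_parts)]
    return np.concatenate(outs, axis=1)      # concat along channels


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 6, 5))   # batch=2, c_in=6, length=5
w = rng.standard_normal((9, 2, 1))   # c_out=9, c_in/groups=2, kernel=1
ref = conv1d_k1_reference(x, w, groups=3)
alt = conv1d_k1_as_matmul(x, w, groups=3)
assert np.allclose(ref, alt)
```

With `groups=3` the decomposition runs three matmuls, matching the `group_num matmul` term in the substitution above.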

With this optimizer, we see a 10%+ gain on one 1P model.
2023-11-16 17:08:06 +08:00
..
orttraining lora conv1d replacement (#16643) 2023-11-16 17:08:06 +08:00
pytorch_frontend_examples [Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789) 2023-07-21 12:53:41 -07:00
tools [ROCm] Update ROCm and MIGraphX CI to ROCm5.7 (#17834) 2023-10-09 10:29:11 +08:00