onnxruntime/orttraining
Vincent Wang e6aa0fa174
Add Gelu Related Ops to Triton Codegen (#17713)
Add Gelu/QuickGelu/GeluGrad/QuickGeluGrad support to Triton Codegen so
that these ops can be fused with other connected, supported ops. For
example, in llama2, the op can be fused with Mul, which yields an extra
1-2% perf gain.
2023-09-27 19:57:39 +08:00
orttraining Add Gelu Related Ops to Triton Codegen (#17713) 2023-09-27 19:57:39 +08:00
pytorch_frontend_examples [Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789) 2023-07-21 12:53:41 -07:00
tools [ROCm] Update CI based on ubuntu 22.04 (#17076) 2023-08-10 09:51:29 -07:00