onnxruntime/orttraining/orttraining/python/training
Vincent Wang e6aa0fa174
Add Gelu Related Ops to Triton Codegen (#17713)
Add Gelu/QuickGelu/GeluGrad/QuickGeluGrad support to Triton Codegen so
that these ops can be fused with other connected, supported ops. For
example, in Llama 2, they can be fused with Mul, giving an extra 1-2%
perf gain.
2023-09-27 19:57:39 +08:00
amp [Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789) 2023-07-21 12:53:41 -07:00
api [On-Device Training] Expose Parameters through the Training API (#17364) 2023-09-25 20:03:24 -07:00
experimental Manage ORTModule configurations consistently (#16396) 2023-06-27 19:19:36 +08:00
onnxblock Build gradient graph starting at the loss alone (#17240) 2023-08-23 23:54:45 -07:00
optim [Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789) 2023-07-21 12:53:41 -07:00
ort_triton Add Gelu Related Ops to Triton Codegen (#17713) 2023-09-27 19:57:39 +08:00
ortmodule Model post process for zero stage3 training (#17187) 2023-09-22 08:54:25 +08:00
torchdynamo Fix orttraining_test_dort.py (#17034) 2023-08-08 08:11:48 -07:00
utils Model post process for zero stage3 training (#17187) 2023-09-22 08:54:25 +08:00
__init__.py Add mac and windows python packages for onnxruntime-training (#16993) 2023-08-07 20:32:55 -07:00
_checkpoint_storage.py
_utils.py [Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789) 2023-07-21 12:53:41 -07:00
artifacts.py add logits option to generate artifacts (#17276) 2023-08-29 16:55:31 -07:00
checkpoint.py Disable PERF* rules in ruff to allow better readability (#16834) 2023-07-25 15:38:22 -07:00
model_desc_validation.py
orttrainer.py Disable PERF* rules in ruff to allow better readability (#16834) 2023-07-25 15:38:22 -07:00
orttrainer_options.py [Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789) 2023-07-21 12:53:41 -07:00
postprocess.py Disable PERF* rules in ruff to allow better readability (#16834) 2023-07-25 15:38:22 -07:00