onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-16 01:33:39 +00:00

cloudhan 71a4e7eb97 Automatically enable tunable op usage for production models (#15156)	2023-04-06 13:52:47 +08:00

Split the `IsTunableOpEnable` semantics into two switches: one enables tunable ops for use, the other enables them for tuning. Both remain disabled by default for safety. However, if

- the session is created from an ONNX model with embedded tuning results, and
- the embedded tuning results are set on the EP without an error `Status`,

then usage is enabled automatically, while tuning remains disabled.

The planned options are:

- `tunable_op_enable`: The top-level switch of `TunableOp`. It indicates whether we run into `TunableOp`-related logic. NOTE: most of our implementations have a bottom implementation that acts as a fallback and is set as the default. In this case, we still call into the `TunableOp`, but no kernel selection, kernel tuning, or caching is involved. This spares us the maintenance burden of a duplicate code path.
- `tunable_op_tuning_enable`: The secondary switch of `TunableOp`. It indicates whether we run into the tuning-related logic of `TunableOp`.

Possible future options:

- `tunable_op_tuning_max_iteration`: to be defined
- `tunable_op_tuning_max_duration_ms`: to be defined
- `tunable_op_flash_attention_enable`: to be defined; for example only, we will not have this.

The developer-oriented environment variables exist for developers' convenience when inspecting the performance impact of tuning. So there are only `ORT_ROCM_TUNABLE_OP_ENABLE` and `ORT_ROCM_TUNABLE_OP_TUNING_ENABLE`, which give fine-grained control over the combinations.
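
The switch semantics described in this commit message can be sketched in a few lines of Python. This is a minimal illustration, not the ONNX Runtime implementation: `TunableOpFlags`, `resolve_tunable_op_flags`, and `_parse_flag` are hypothetical names, and three behaviours are assumptions the commit message does not state: that a flag is on only for the string `"1"`, that the developer environment variables override the options, and that tuning only takes effect when the top-level switch is on.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TunableOpFlags:
    """Hypothetical container for the two switches named in the commit."""
    enable: bool = False         # use TunableOp kernel selection
    tuning_enable: bool = False  # run the tuning logic


def _parse_flag(value: Optional[str]) -> Optional[bool]:
    """Return None when unset; otherwise True only for "1" (assumed format)."""
    if value is None:
        return None
    return value.strip() == "1"


def resolve_tunable_op_flags(
    options: Mapping[str, str],
    embedded_results_loaded_ok: bool,
    environ: Optional[Mapping[str, str]] = None,
) -> TunableOpFlags:
    """Resolve the effective switches.

    embedded_results_loaded_ok: True when the model carried embedded tuning
    results and they were set on the EP without an error Status.
    """
    env = os.environ if environ is None else environ

    enable = _parse_flag(options.get("tunable_op_enable"))
    tuning = _parse_flag(options.get("tunable_op_tuning_enable"))

    # Assumption: the developer-oriented env vars take precedence.
    env_enable = _parse_flag(env.get("ORT_ROCM_TUNABLE_OP_ENABLE"))
    env_tuning = _parse_flag(env.get("ORT_ROCM_TUNABLE_OP_TUNING_ENABLE"))
    if env_enable is not None:
        enable = env_enable
    if env_tuning is not None:
        tuning = env_tuning

    # Nothing set explicitly: usage is enabled automatically only when the
    # embedded tuning results were applied successfully.
    if enable is None:
        enable = embedded_results_loaded_ok
    # Tuning is never enabled automatically.
    if tuning is None:
        tuning = False

    # Assumption: the secondary switch is gated by the top-level switch.
    return TunableOpFlags(enable=enable, tuning_enable=enable and tuning)


if __name__ == "__main__":
    # A production model with embedded tuning results and no explicit options.
    print(resolve_tunable_op_flags({}, embedded_results_loaded_ok=True, environ={}))
```

Passing `environ` explicitly keeps the function independent of the process environment, which makes each combination of switches easy to check in isolation.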
..
custom_op_wrapper	Custom Op runtime wrapper (#13427)	2023-01-18 09:09:32 -08:00
kernel_explorer	Automatically enable tunable op usage for production models (#15156)	2023-04-06 13:52:47 +08:00
microbench	Adopt lintrunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
profile_explorer	Adopt lintrunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
quantization	Enable pylint and numpy rules (#15218)	2023-03-27 20:37:53 -07:00
tensorrt/perf	Enable pylint and numpy rules (#15218)	2023-03-27 20:37:53 -07:00
transformers	Add support for full ViT optimization (#15289)	2023-04-04 14:05:24 -07:00
__init__.py
offline_tuning.py	Adopt lintrunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
onnx_randomizer.py
onnxruntime_test.py	Fix: Add def main() in onnxruntime_test.py (#15208)	2023-04-05 12:31:39 -07:00
pytorch_export_contrib_ops.py	Adopt lintrunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
symbolic_shape_infer.py	Add tool to support packing mode for BERT model (#15283)	2023-03-31 08:46:47 -07:00