onnxruntime/onnxruntime/python/tools
cloudhan 71a4e7eb97
Automatically enable tunable op usage for production models (#15156)
Split the `IsTunableOpEnable` semantics into **enable tunable op for using**
and **enable tunable op for tuning**.

Both remain disabled by default for safety. However, if
- the session is created from an ONNX model with embedded tuning results, and
- the embedded tuning results are set on the EP without an error `Status`,

then usage is enabled automatically, while tuning remains disabled.
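
The rule above can be sketched as a small truth table. This is an illustrative model only, not ONNX Runtime code: the function name `resolve_tunable_op_flags` and its two boolean parameters are hypothetical and stand for the two conditions in the list.

```python
def resolve_tunable_op_flags(has_embedded_results: bool, results_applied_ok: bool):
    """Toy model of the auto-enable rule (hypothetical helper, not an ORT API).

    Returns a (use_enabled, tuning_enabled) pair.
    """
    # Both switches start out disabled for safety.
    use_enabled = False
    tuning_enabled = False

    # Usage is switched on only when the model carries tuning results
    # AND they were applied to the EP without an error Status.
    if has_embedded_results and results_applied_ok:
        use_enabled = True

    # Tuning is never switched on automatically.
    return use_enabled, tuning_enabled
```

In this model, only the combination `(True, True)` turns usage on; every other combination leaves both switches off.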

The planned options are:
- `tunable_op_enable`: The top-level switch of `TunableOp`; it indicates whether we run any `TunableOp`-related logic. **NOTE:** most of our implementations have a bottom implementation that acts as a fallback and is set as the default. In this case, we still call into the `TunableOp`, but no kernel selection, kernel tuning, or caching is involved. This reduces our maintenance burden by avoiding a duplicate code path.
- `tunable_op_tuning_enable`: The secondary switch of `TunableOp`; it indicates whether we run the tuning-related logic of `TunableOp`.
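
One possible reading of how the two switches interact is sketched below. It is a toy model under stated assumptions, not the actual `TunableOp` implementation: `select_kernel`, the `cache` dictionary, and the `costs` list are hypothetical, and a fixed cost list stands in for real kernel timing. Candidate index 0 plays the role of the default fallback.

```python
def select_kernel(key, cache, costs,
                  tunable_op_enable=False, tunable_op_tuning_enable=False):
    """Return the index of the candidate to run (hypothetical sketch).

    key   -- identifies the op/shape being dispatched
    cache -- dict mapping key -> previously chosen candidate index
    costs -- pretend cost per candidate; index 0 is the default fallback
    """
    default = 0

    # Top-level switch off: always take the fallback, no selection or caching.
    if not tunable_op_enable:
        return default

    # Usage on: reuse a known tuning result if one exists.
    if key in cache:
        return cache[key]

    # No result yet and tuning is off: stay on the fallback.
    if not tunable_op_tuning_enable:
        return default

    # Tuning on: pick the cheapest candidate and remember it.
    best = min(range(len(costs)), key=costs.__getitem__)
    cache[key] = best
    return best
```

With `costs = [3.0, 1.0, 2.0]` and an empty cache, enabling only `tunable_op_enable` keeps the fallback; enabling both selects and caches candidate 1, after which usage alone is enough to pick it again.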

Then for the possible future options:
- `tunable_op_tuning_max_iteration`: blahblah
- `tunable_op_tuning_max_duration_ms`: blahblah
- `tunable_op_flash_attention_enable`: blahblah, for example only, we will not have this.

The developer-oriented environment variables exist for developers' convenience when inspecting the performance impact of tuning. There are only two, `ORT_ROCM_TUNABLE_OP_ENABLE` and `ORT_ROCM_TUNABLE_OP_TUNING_ENABLE`, which give fine-grained control over the combinations.
2023-04-06 13:52:47 +08:00
..
custom_op_wrapper Custom Op runtime wrapper (#13427) 2023-01-18 09:09:32 -08:00
kernel_explorer Automatically enable tunable op usage for production models (#15156) 2023-04-06 13:52:47 +08:00
microbench Adopt linrtunner as the linting tool - take 2 (#15085) 2023-03-24 15:29:03 -07:00
profile_explorer Adopt linrtunner as the linting tool - take 2 (#15085) 2023-03-24 15:29:03 -07:00
quantization Enable pylint and numpy rules (#15218) 2023-03-27 20:37:53 -07:00
tensorrt/perf Enable pylint and numpy rules (#15218) 2023-03-27 20:37:53 -07:00
transformers Add support for full ViT optimization (#15289) 2023-04-04 14:05:24 -07:00
__init__.py
offline_tuning.py Adopt linrtunner as the linting tool - take 2 (#15085) 2023-03-24 15:29:03 -07:00
onnx_randomizer.py
onnxruntime_test.py Fix: Add def main() in onnxruntime_test.py (#15208) 2023-04-05 12:31:39 -07:00
pytorch_export_contrib_ops.py Adopt linrtunner as the linting tool - take 2 (#15085) 2023-03-24 15:29:03 -07:00
symbolic_shape_infer.py Add tool to support packing mode for BERT model (#15283) 2023-03-31 08:46:47 -07:00