mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-18 21:21:17 +00:00
### Description

- Introduce the option `trt_engine_hw_compatible` to support engine hardware compatibility for Ampere+ GPUs.
- This enables the `nvinfer1::HardwareCompatibilityLevel::kAMPERE_PLUS` flag when generating engines.
- This option has been validated on sm80/86 GPUs; an engine can be reused across different Ampere+ architectures.
  - The client side needs to enable this option as well to leverage existing sm80+ engines.
- If a user enables this option with TRT < 8.6 or sm < 80, a warning reports that the option is not supported.

Engine naming:

| GPU | `trt_engine_hw_compatible=false` | `trt_engine_hw_compatible=true` |
| -------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| A100 (sm80) | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm**80**.engine | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm**80+**.engine |
| RTX3080 (sm86) | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm**86**.engine | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm**80+**.engine |

### Motivation and Context

Reference: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#hardware-compat

---------

Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
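The engine naming rule from the description can be sketched in a few lines of Python. This is only an illustration, not the provider's actual code: `engine_suffix` is a hypothetical helper, and falling back to the per-architecture suffix when the option is unsupported (TRT < 8.6 or sm < 80) is an assumption, since the description only says that a warning is shown in that case. The `providers` list shows how the option might be passed through the ONNX Runtime Python API alongside engine caching.

```python
def engine_suffix(sm: int, hw_compatible: bool, trt_version: tuple = (8, 6)) -> str:
    """Hypothetical helper: return the engine file name suffix for a GPU.

    sm            -- compute capability as an integer, e.g. 80 or 86
    hw_compatible -- value of the trt_engine_hw_compatible option
    trt_version   -- (major, minor) TensorRT version
    """
    # Per the description, the option needs TRT >= 8.6 and sm >= 80.
    supported = sm >= 80 and trt_version >= (8, 6)
    if hw_compatible and supported:
        # One shared engine for all Ampere+ architectures.
        return "sm80+"
    # Assumption: otherwise the engine stays architecture-specific.
    return f"sm{sm}"


# Provider options as they might be handed to onnxruntime.InferenceSession.
trt_options = {
    "trt_engine_cache_enable": True,
    "trt_engine_hw_compatible": True,
}
providers = [("TensorrtExecutionProvider", trt_options)]

for sm in (80, 86):
    print(sm, engine_suffix(sm, False), engine_suffix(sm, True))
```

With the option enabled, both the A100 (sm80) and the RTX3080 (sm86) map to the same `sm80+` suffix, which is why an engine built on one can be picked up from the cache on the other.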