mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-22 22:01:08 +00:00
The CUDA EP already supports [CUDA graphs](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#cuda-graphs), and we observed that some models benefit from using CUDA graphs with `trtexec`. This PR therefore enables CUDA graph support for the TRT EP. The implementation is based on https://github.com/microsoft/onnxruntime/pull/9978 and has the same [constraints](https://github.com/microsoft/onnxruntime/pull/9978):

- Models with control-flow ops (i.e. If, Loop, and Scan) are not supported.
- CUDA graphs can only be used with models whose ops (graph nodes) can all be partitioned to the TRT EP.
- Model inputs and outputs must be tensors.
- Input and output shapes cannot change across inference calls.
- IOBinding is required.
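The constraints listed in the commit message can be read as a pre-flight checklist. The sketch below is illustrative only: `ModelSummary` and `cuda_graph_blockers` are hypothetical names, not part of ONNX Runtime, and the caller is assumed to have gathered the model facts by other means.

```python
from dataclasses import dataclass

# Control-flow ops named in the commit message as unsupported.
CONTROL_FLOW_OPS = frozenset({"If", "Loop", "Scan"})


@dataclass
class ModelSummary:
    """Hypothetical summary of a model; not an ONNX Runtime type."""
    op_types: list          # op type of every graph node
    ops_on_trt: list        # per node: True if assigned to the TRT EP
    io_all_tensors: bool    # every model input/output is a tensor
    io_shapes_static: bool  # shapes stay fixed across inference calls
    uses_io_binding: bool   # inference runs through IOBinding


def cuda_graph_blockers(model):
    """Return the constraints this model violates (empty list if none)."""
    blockers = []
    found = sorted(CONTROL_FLOW_OPS.intersection(model.op_types))
    if found:
        blockers.append("control-flow ops present: " + ", ".join(found))
    if not all(model.ops_on_trt):
        blockers.append("not every node is partitioned to the TRT EP")
    if not model.io_all_tensors:
        blockers.append("inputs/outputs must all be tensors")
    if not model.io_shapes_static:
        blockers.append("input/output shapes change across calls")
    if not model.uses_io_binding:
        blockers.append("IOBinding is required")
    return blockers
```

An empty result means none of the listed constraints is violated; it does not by itself guarantee that graph capture will succeed.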
| File |
|---|
| cuda_ops.cu |
| custom_op_utils.cc |
| custom_op_utils.h |
| fns_candy_style_transfer.c |
| onnx_protobuf.h |
| test_allocator.cc |
| test_fixture.h |
| test_inference.cc |
| test_io_types.cc |
| test_model_loading.cc |
| test_nontensor_types.cc |
| test_ort_format_models.cc |
| test_run_options.cc |
| test_session_options.cc |
| utils.cc |
| utils.h |