onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-18 21:21:17 +00:00

History

Chi Lo 4e3cff60fd CUDA graph support for TRT EP (#16081)	2023-06-21 09:36:45 -07:00

CUDA EP already supports [CUDA graph](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#cuda-graphs), and we observed that some models can benefit from using CUDA graph with `trtexec`. Therefore, this PR enables CUDA graph support for TRT EP. The implementation is based on https://github.com/microsoft/onnxruntime/pull/9978, with the same [constraints](https://github.com/microsoft/onnxruntime/pull/9978), listed below:

- Models with control-flow ops (i.e. If, Loop and Scan ops) are not supported.
- Usage of CUDA Graphs is limited to models in which all the model ops (graph nodes) can be partitioned to the TRT EP.
- The input/output types of models need to be tensors.
- Shapes of inputs/outputs cannot change across inference calls.
- IOBinding is required.
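
The constraints above can be read as a pre-flight checklist. The following is a minimal sketch of such a check in plain Python. It is not part of ONNX Runtime: `ModelSummary` and `cuda_graph_blockers` are hypothetical names introduced only for illustration, and the caller is assumed to have gathered the summary fields by other means.

```python
from dataclasses import dataclass
from typing import List

# Control-flow ops named in the commit message as unsupported.
CONTROL_FLOW_OPS = {"If", "Loop", "Scan"}


@dataclass
class ModelSummary:
    """Hypothetical description of a model and how it will be run."""
    op_types: List[str]        # op type of every graph node
    all_nodes_on_trt: bool     # every node can be partitioned to the TRT EP
    io_are_tensors: bool       # all model inputs/outputs are tensors
    static_io_shapes: bool     # I/O shapes stay fixed across inference calls
    uses_io_binding: bool      # inference goes through IOBinding


def cuda_graph_blockers(model: ModelSummary) -> List[str]:
    """Return the listed constraints that the model violates (empty if none)."""
    blockers = []
    found = sorted(CONTROL_FLOW_OPS.intersection(model.op_types))
    if found:
        blockers.append("control-flow ops present: " + ", ".join(found))
    if not model.all_nodes_on_trt:
        blockers.append("not all nodes can be partitioned to the TRT EP")
    if not model.io_are_tensors:
        blockers.append("inputs/outputs must be tensors")
    if not model.static_io_shapes:
        blockers.append("input/output shapes change across inference calls")
    if not model.uses_io_binding:
        blockers.append("IOBinding is required")
    return blockers
```

An empty result means none of the five listed constraints is violated; it does not by itself guarantee that CUDA graph capture will succeed at runtime.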
common	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
eager	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
framework	ExecutionProvider API refactor - move allocator from EP level to SessionState level and indexed by OrtDevice (#15833)	2023-06-19 17:44:45 -07:00
graph	Support WebNN EP (#15698)	2023-05-08 21:25:10 -07:00
optimizer	fix compilation error in no absl build (#15769)	2023-05-02 08:20:49 -07:00
platform	Implement mutex-free spin lock for task queue (#14834)	2023-05-19 10:12:10 -07:00
providers	CUDA graph support for TRT EP (#16081)	2023-06-21 09:36:45 -07:00
session	Move tests from core/providers/cuda/test/* to test/providers/cuda/ and refactor CUDA UT (#16161)	2023-06-20 14:54:55 -07:00