onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-04 04:07:22 +00:00


Maximilian Müller ad4db12699 TensorRT EP - timing cache (#14767)	2023-03-10 09:02:27 -08:00

### Description

This enables a user to use a TensorRT timing cache, based on #10297, to accelerate build times on a device with the same compute capability. The cache works across models because it simply stores kernel runtimes for specific configurations. These files are usually very small (only a few MB), which makes them easy to ship with an application to accelerate the build time on the user's end.

### Motivation and Context

Especially for workstation use cases, TRT build times can be a roadblock. With a few models from the ONNX model zoo, I evaluated the speedups when a timing cache is present.

`./build/onnxruntime_perf_test -e tensorrt -I -t 5 -i "trt_timing_cache_enable|true" <onnx_path>`

| Model | No cache | With cache |
| ------------- | ------------- | ------------- |
| efficientnet-lite4-11 | 34.6 s | 7.7 s |
| yolov4 | 108.62 s | 9.4 s |

To capture this, I had to modify onnxruntime_perf_test. The time is sometimes not captured within "Session creation time cost:", which is why I introduced "First inference time cost:".

---------

Co-authored-by: Chi Lo <Chi.Lo@microsoft.com>
..
common	Improve compatibility with certain STL's	2023-02-21 14:06:16 -08:00
eager
framework	Introduce RemovableAttributes (#14868)	2023-03-07 12:37:12 +01:00
graph	Introduce RemovableAttributes (#14868)	2023-03-07 12:37:12 +01:00
optimizer	Pass SessionOptions to XnnpackProviderFactoryCreator. (#13318)	2022-12-10 14:23:46 +08:00
platform	Improve thread pool creation failure handling. (#13313)	2022-10-15 17:57:19 -07:00
providers	TensorRT EP - timing cache (#14767)	2023-03-10 09:02:27 -08:00
session	Add GetVersionSting API for C++, C# and Python (#14873)	2023-03-02 17:11:07 -08:00