mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-18 21:21:17 +00:00
### Description

This enables a user to use a TensorRT timing cache, based on #10297, to accelerate build times on a device with the same compute capability. The cache works across models because it simply stores kernel runtimes for specific configurations. These files are usually very small (only a few MB), which makes them easy to ship with an application to accelerate build time on the user's end.

### Motivation and Context

Especially for workstation use cases, TRT build times can be a roadblock. With a few models from the ONNX model zoo, I evaluated the speedups when a timing cache is present.

`./build/onnxruntime_perf_test -e tensorrt -I -t 5 -i "trt_timing_cache_enable|true" <onnx_path>`

| Model | No cache | With cache |
| ------------- | ------------- | ------------- |
| efficientnet-lite4-11 | 34.6 s | 7.7 s |
| yolov4 | 108.62 s | 9.4 s |

To capture this, I had to modify onnxruntime_perf_test. The time is sometimes not captured within "Session creation time cost:", which is why I introduced "First inference time cost:".

---------

Co-authored-by: Chi Lo <Chi.Lo@microsoft.com>

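As a rough illustration of the measurement described above, the following Python sketch passes the same `trt_timing_cache_enable` option through the ONNX Runtime Python API and times session creation and the first inference separately. It is a sketch under assumptions, not the code from this change: the helper names `trt_provider_options`, `speedup`, and `time_session` are hypothetical, and `time_session` assumes an ONNX Runtime build with the TensorRT execution provider plus a model path and input feeds supplied by the caller.

```python
import time


def trt_provider_options(enable_timing_cache: bool = True) -> dict:
    """Build TensorRT provider options (hypothetical helper).

    The option name comes from the perf_test command above; the value is
    passed as a string, mirroring the "key|value" form used there.
    """
    return {"trt_timing_cache_enable": "true" if enable_timing_cache else "false"}


def speedup(no_cache_s: float, with_cache_s: float) -> float:
    """Ratio of build time without a cache to build time with a cache."""
    return no_cache_s / with_cache_s


def time_session(model_path: str, feeds: dict, enable_timing_cache: bool = True):
    """Return (session creation seconds, first inference seconds).

    Assumes onnxruntime was built with the TensorRT execution provider.
    The import is deferred so the helpers above work without it.
    """
    import onnxruntime as ort

    providers = [
        ("TensorrtExecutionProvider", trt_provider_options(enable_timing_cache))
    ]
    t0 = time.perf_counter()
    session = ort.InferenceSession(model_path, providers=providers)
    t1 = time.perf_counter()
    # The engine build may happen here rather than at session creation,
    # which is why the first inference is timed separately.
    session.run(None, feeds)
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1
```

Applied to the table above, `speedup(34.6, 7.7)` and `speedup(108.62, 9.4)` give roughly 4.5x and 11.6x.
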
||
|---|---|---|
| eigen@d10b27fe37 | ||
| emsdk@0ab19024f0 | ||
| libprotobuf-mutator@7a2ed51a6b | ||
| onnx@9b7bca2a72 | ||
| onnxruntime-extensions@81e7799c69 | ||
| protobuf@a20c65f2cd | ||
| abseil-cpp.cmake | ||
| abseil-cpp.natvis | ||
| composable_kernel.cmake | ||
| cutlass.cmake | ||
| dml.cmake | ||
| dnnl.cmake | ||
| eigen.cmake | ||
| extensions.cmake | ||
| find_snpe.cmake | ||
| FindNumPy.cmake | ||
| helper_functions.cmake | ||
| ipp-crypto.cmake | ||
| mimalloc.cmake | ||
| onnx_minimal.cmake | ||
| onnx_protobuf.natvis | ||
| onnxruntime_external_deps.cmake | ||
| protobuf_function.cmake | ||
| pybind11.cmake | ||
| pyxir.cmake | ||
| triton.cmake | ||
| tvm.cmake | ||
| wil.cmake | ||
| xnnpack.cmake | ||