onnxruntime/tools
Patrice Vignola 76434907fb
[DML EP] Add graph capture (#20257)
This adds a new "Graph Capture" option to the DML EP, similar to the
CUDA graph functionality. Here's how graph capture works:

- A user can enable graph capture in the session options by setting
`ep.dml.enable_graph_capture` to `true`
- When they want to capture a run, they set `gpu_graph_id` in their
`RunOptions` to a number greater than 0 (0 is reserved for internal use
according to the CUDA graph documentation).
- Then, when they start the inference, the graph will be captured and
stored in the DML EP for future use
- When they execute the run for a second time with the same id, the
`ReplayGraph` function in the DML EP will be called instead of executing
the kernels, resulting in very low overhead and avoiding kernel
recompilation.
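
The steps above can be sketched with the onnxruntime Python API. This is a
minimal sketch, not the code from this change: the two config keys come from
the description above, while the helper names, the `"1"` value used to enable
the option (the text says `true`; the exact accepted string is an assumption
to verify), and the model path are hypothetical. Graph capture generally also
expects inputs and outputs to be bound to device memory, which is omitted here.

```python
# Keys taken from the commit description; everything else is illustrative.
SESSION_KEY = "ep.dml.enable_graph_capture"
RUN_KEY = "gpu_graph_id"


def graph_capture_configs(graph_id, enabled_value="1"):
    """Return (session_config, run_config) as dicts of string entries.

    enabled_value is an assumption: config values are strings, and the
    literal the DML EP accepts ("1" vs. "true") should be checked.
    """
    if graph_id <= 0:
        # 0 is reserved for internal use, so reject it (and negatives).
        raise ValueError("gpu_graph_id must be greater than 0")
    return {SESSION_KEY: enabled_value}, {RUN_KEY: str(graph_id)}


def make_session_and_run_options(model_path, graph_id):
    """Apply the entries to onnxruntime objects (needs a DirectML build)."""
    import onnxruntime as ort  # lazy import: the helper above runs anywhere

    session_cfg, run_cfg = graph_capture_configs(graph_id)

    sess_options = ort.SessionOptions()
    for key, value in session_cfg.items():
        sess_options.add_session_config_entry(key, value)

    run_options = ort.RunOptions()
    for key, value in run_cfg.items():
        run_options.add_run_config_entry(key, value)

    session = ort.InferenceSession(
        model_path,
        sess_options=sess_options,
        providers=["DmlExecutionProvider"],
    )
    return session, run_options
```

Running the session twice with the same `run_options` would then correspond
to the capture run followed by the replay run described above.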

This feature can match or even exceed the performance of specifying
static dimensions at session creation time, while being much more
flexible.
2024-04-18 10:15:00 -07:00
..
android_custom_build Update NDK version to 26.1.10909125 (#18493) 2023-11-17 14:14:01 -08:00
ci_build [DML EP] Add graph capture (#20257) 2024-04-18 10:15:00 -07:00
doc Bump ruff to 0.3.2 and black to 24 (#19878) 2024-03-13 10:00:32 -07:00
nuget OneDNN/dnnl: Fix filepath after dnnl move (#20086) 2024-04-04 21:24:49 -07:00
perf_view fixed #16873 (#16932) 2023-09-26 09:57:01 -07:00
python Bump ruff to 0.3.2 and black to 24 (#19878) 2024-03-13 10:00:32 -07:00
scripts Fix a build issue: /MP was not enabled correctly (#19190) 2024-01-29 12:45:38 -08:00