mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-06-03 23:49:44 +00:00
This adds a new "Graph Capture" option to the DML EP, similar to the CUDA graph functionality. Here's how graph capture works:

- A user enables graph capture in the session options by setting `ep.dml.enable_graph_capture` to `true`.
- When they want to capture a run, they set `gpu_graph_id` in their `RunOptions` to a number greater than 0 (0 is reserved for internal use according to the CUDA graph documentation).
- When they start the inference, the graph is captured and stored in the DML EP for future use.
- When they execute the run a second time with the same id, the `ReplayGraph` function in the DML EP is called instead of executing the kernels, resulting in very low overhead and avoiding kernel recompilation.

This feature can give performance on par with, or even better than, specifying the static dimensions at session creation time, while being much more flexible.

| | | |
|---|---|---|
| github | ||
| __init__.py | ||
| amd_hipify.py | ||
| build.py | ||
| clean_docker_image_cache.py | ||
| compile_triton.py | ||
| coverage.py | ||
| gen_def.py | ||
| get_docker_image.py | ||
| logger.py | ||
| op_registration_utils.py | ||
| op_registration_validator.py | ||
| patch_manylinux.py | ||
| policheck_exclusions.xml | ||
| reduce_op_kernels.py | ||
| replace_urls_in_deps.py | ||
| requirements-transformers-test.txt | ||
| set-trigger-rules.py | ||
| update_tsaoptions.py | ||
| upload_python_package_to_azure_storage.py | ||
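
The capture-then-replay flow described in the commit message can be illustrated with a small, self-contained toy model. This is only a sketch of the control flow, not the DML EP's implementation: the class `ToyGraphCaptureEP`, its `run` method, and the `log` list are hypothetical names invented for illustration, and "kernels" are modelled as plain Python callables. The sketch also assumes, as a simplification, that any id not greater than 0 means ordinary execution.

```python
from typing import Callable, Dict, List

Kernel = Callable[[float], float]


class ToyGraphCaptureEP:
    """Hypothetical stand-in for an execution provider with graph capture."""

    def __init__(self, enable_graph_capture: bool) -> None:
        # Mirrors the idea of the `ep.dml.enable_graph_capture` session option.
        self.enable_graph_capture = enable_graph_capture
        # Captured "graphs", keyed by the id the caller supplied for the run.
        self._captured: Dict[int, List[Kernel]] = {}
        # Records which path each run took, for inspection.
        self.log: List[str] = []

    @staticmethod
    def _execute(kernels: List[Kernel], x: float) -> float:
        # Apply each kernel in order to the running value.
        for kernel in kernels:
            x = kernel(x)
        return x

    def run(self, kernels: List[Kernel], x: float, gpu_graph_id: int = 0) -> float:
        # Assumption of this sketch: ids <= 0 fall back to ordinary execution.
        if not self.enable_graph_capture or gpu_graph_id <= 0:
            self.log.append("execute")
            return self._execute(kernels, x)

        if gpu_graph_id in self._captured:
            # Second and later runs with the same id: replay the stored
            # sequence instead of taking the kernels passed by the caller.
            self.log.append(f"replay:{gpu_graph_id}")
            return self._execute(self._captured[gpu_graph_id], x)

        # First run with this id: store the sequence, then execute it normally.
        self.log.append(f"capture:{gpu_graph_id}")
        self._captured[gpu_graph_id] = list(kernels)
        return self._execute(kernels, x)


# Usage: the first run with id 1 captures, the second run replays.
ep = ToyGraphCaptureEP(enable_graph_capture=True)
kernels = [lambda v: v + 1.0, lambda v: v * 2.0]
first = ep.run(kernels, 1.0, gpu_graph_id=1)
second = ep.run(kernels, 3.0, gpu_graph_id=1)
```

In the real feature, the equivalent switches are the `ep.dml.enable_graph_capture` session option and the `gpu_graph_id` run option named in the commit message; the toy model only shows why a repeated id can skip the normal kernel execution path.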