onnxruntime/winml
Abhishek Udupa 83c59d2594
Session-aware and thread-safe CUDA profiler (#13706)
### Description
The existing CUDA profiler is neither session-aware, nor thread-safe.
This PR ensures both.

### Motivation and Context
[PR 13549](https://github.com/microsoft/onnxruntime/pull/13549) brought
thread-safety and session-awareness to the ROCm profiler. This PR brings
the same goodness to the CUDA profiler as well.

Sample outputs of a profiling run from the StableDiffusion model (this
model was chosen because it requires orchestration of multiple sessions,
and verifies that the profilers are now indeed session-aware) on both
CUDA and ROCm EPs are attached, along with a script that checks that the
trace files generated by the profile are well-formed.

Update 11/29: Updated the profile outputs. The older profile outputs
exhibited an issue where some timestamps were wildly out of range,
leading to problems visualizing the traces. The bug has been fixed and
the profile outputs have been updated, along with an update to the check
script to ensure that timestamps are monotonically increasing.


[sd_profile_outputs_cuda.tar.gz](https://github.com/microsoft/onnxruntime/files/10118088/sd_profile_outputs_cuda.tar.gz)

[sd_profile_outputs_rocm.tar.gz](https://github.com/microsoft/onnxruntime/files/10118089/sd_profile_outputs_rocm.tar.gz)

[check_profile_output_well_formedness.zip](https://github.com/microsoft/onnxruntime/files/10118090/check_profile_output_well_formedness.zip)

Co-authored-by: Abhishek Udupa <abhishek.udupa@microsoft.com>
2022-12-09 13:22:12 -08:00
..
adapter Session-aware and thread-safe CUDA profiler (#13706) 2022-12-09 13:22:12 -08:00
api Revert "add load from buffer (#10162)" (#10590) 2022-03-08 13:35:23 -08:00
dll Enable JoinModels API in WinML+RT Experimental API (#9746) 2021-11-12 16:56:31 -08:00
lib Fix C26436 prefast errors (#13774) 2022-12-01 09:07:44 -08:00
test [DML EP] Attention Kernel bug fix (#13879) 2022-12-07 15:24:27 -08:00