onnxruntime/onnxruntime/test/testdata/multi_stream_models
RandySheriffH 657ab2f43c
Sync between parent node and subgraph (#15757)
While investigating https://github.com/microsoft/onnxruntime/issues/14691, we
found that GPU memory is incorrectly reused between NonZero(GPU) and
Identity(GPU), where Identity is a node in a subgraph of If(CPU).
NonZero produces a GPU output that is consumed by Transpose(GPU), after which
that output is marked as free in BFCArena and is soon reused by
Identity(GPU) in the subgraph of If(CPU).
However, NonZero(GPU) and Identity(GPU) run on separate CUDA streams, and
there is no synchronization between them because the Identity node is in a
subgraph of If(CPU). As a result, Identity(GPU) can write to the memory while
Transpose(GPU) is still reading from it.
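
The hazard described above can be illustrated with a toy timeline model. This is only a sketch and not ONNX Runtime code: the Op class, the schedule and has_hazard helpers, the buffer name, and all durations and enqueue times are hypothetical values chosen for illustration. It assumes each stream runs its own ops in order and that a cross-stream wait delays an op until the op it waits on has finished.

```python
from dataclasses import dataclass


@dataclass
class Op:
    name: str
    stream: int       # hypothetical stream id
    buffer: str       # hypothetical name of the memory block it touches
    mode: str         # "read" or "write"
    enqueued_at: int  # time at which the host enqueues the op
    duration: int     # made-up device time units


def schedule(ops, waits=()):
    """Return {op name: (start, end)} for ops listed in enqueue order.

    Ops on the same stream run back to back. Each (later, earlier) pair in
    `waits` is a cross-stream dependency: `later` cannot start before
    `earlier` ends.
    """
    stream_free = {}
    times = {}
    for op in ops:
        start = max(stream_free.get(op.stream, 0), op.enqueued_at)
        for later, earlier in waits:
            if later == op.name:
                start = max(start, times[earlier][1])
        end = start + op.duration
        times[op.name] = (start, end)
        stream_free[op.stream] = end
    return times


def has_hazard(ops, times):
    """True if two ops on different streams overlap in time on the same
    buffer and at least one of them writes."""
    for i, a in enumerate(ops):
        for b in ops[i + 1:]:
            if a.buffer != b.buffer or a.stream == b.stream:
                continue
            if "write" not in (a.mode, b.mode):
                continue
            (s1, e1), (s2, e2) = times[a.name], times[b.name]
            if s1 < e2 and s2 < e1:
                return True
    return False


ops = [
    Op("NonZero", stream=0, buffer="buf", mode="write", enqueued_at=0, duration=2),
    Op("Transpose", stream=0, buffer="buf", mode="read", enqueued_at=0, duration=5),
    # The arena hands "buf" to Identity once the host considers it free,
    # while Transpose may still be running on the other stream.
    Op("Identity", stream=1, buffer="buf", mode="write", enqueued_at=3, duration=1),
]

unsynced = schedule(ops)
synced = schedule(ops, waits=[("Identity", "Transpose")])

print(has_hazard(ops, unsynced))  # Identity overlaps Transpose
print(has_hazard(ops, synced))    # Identity starts after Transpose ends
```

In this model the hazard disappears once the subgraph op waits on the last consumer of the reused memory, which is the kind of ordering that a sync between the parent node and the subgraph is meant to provide.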

---------

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-05-11 09:28:04 -07:00
..
3_gpu_streams.json remove unnecessary waitOnEPStep when current node and the consumer node are in the same stream (#14173) 2023-01-20 07:35:15 -08:00
conv_add_relu.onnx
conv_add_relu_single_stream.json
conv_add_relu_single_stream_mismatch_device.json
conv_add_relu_single_stream_missing_node.json
cpu_if.onnx Sync between parent node and subgraph (#15757) 2023-05-11 09:28:04 -07:00
memcpyToHost_same_stream_with_transpose.json Make MemcpyToHost to a separate stream for performance gain (#14487) 2023-02-23 14:52:01 -08:00
multi_stream_double_stream.json
multi_stream_single_stream.json
simplified_ssd.onnx
simplified_ssd_cpu.json