mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-06-04 23:59:56 +00:00
While investigating https://github.com/microsoft/onnxruntime/issues/14691, we found that GPU memory is incorrectly reused between NonZero(GPU) and Identity(GPU), where Identity is a node in a subgraph of If(CPU). NonZero produces a GPU output that is consumed by Transpose(GPU). After that, the output is marked as free in BFCArena and is soon reused by Identity(GPU) in the subgraph of If(CPU). However, NonZero(GPU) and Identity(GPU) run on separate CUDA streams, and there is no synchronization between them because the Identity node is in a subgraph of If(CPU). As a result, Identity(GPU) can write to the memory while Transpose(GPU) is still reading from it.

---------

Co-authored-by: Randy Shuai <rashuai@microsoft.com>

| File | | |
|---|---|---|
| 3_gpu_streams.json | | |
| conv_add_relu.onnx | | |
| conv_add_relu_single_stream.json | | |
| conv_add_relu_single_stream_mismatch_device.json | | |
| conv_add_relu_single_stream_missing_node.json | | |
| cpu_if.onnx | | |
| memcpyToHost_same_stream_with_transpose.json | | |
| multi_stream_double_stream.json | | |
| multi_stream_single_stream.json | | |
| simplified_ssd.onnx | | |
| simplified_ssd_cpu.json | | |
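
The hazard described in the commit message above can be illustrated with a small, pure-Python toy model. This is only a sketch under stated assumptions, not ONNX Runtime code: `Stream`, `Block`, `ToyArena`, and `run_scenario` are hypothetical names, the "streams" are plain counters rather than CUDA streams, Transpose is assumed to run on the same stream as NonZero, and the stream-aware check is just one possible guard, not necessarily the fix that was actually applied.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Stream:
    """Toy stand-in for a CUDA stream: kernels are enqueued now, finish later."""
    name: str
    issued: int = 0     # kernels enqueued so far
    completed: int = 0  # kernels that have actually finished

    def enqueue(self) -> int:
        self.issued += 1
        return self.issued  # ticket identifying this kernel

    def synchronize(self) -> None:
        self.completed = self.issued


@dataclass
class Block:
    """A chunk of device memory plus the last kernel that touched it."""
    block_id: int
    last_use: Optional[Tuple[Stream, int]] = None


class ToyArena:
    """Free-list allocator (hypothetical) with an optional cross-stream check."""

    def __init__(self, stream_aware: bool) -> None:
        self.stream_aware = stream_aware
        self.free_blocks: List[Block] = []
        self.next_id = 0
        self.waits_inserted = 0

    def alloc(self, stream: Stream) -> Block:
        if not self.free_blocks:
            self.next_id += 1
            return Block(self.next_id)
        block = self.free_blocks.pop()
        if block.last_use is not None and self.stream_aware:
            last_stream, ticket = block.last_use
            # Reuse on a different stream while the last kernel is still pending:
            # wait for the previous stream before handing the block out.
            if last_stream is not stream and last_stream.completed < ticket:
                last_stream.synchronize()
                self.waits_inserted += 1
        return block

    def free(self, block: Block) -> None:
        # The host marks the block free immediately; kernels may still be running.
        self.free_blocks.append(block)


def run_scenario(stream_aware: bool) -> bool:
    """Return True if Identity could write while Transpose is still reading."""
    arena = ToyArena(stream_aware)
    stream_a = Stream("NonZero/Transpose")   # assumption: both share one stream
    stream_b = Stream("Identity in If subgraph")

    block = arena.alloc(stream_a)
    stream_a.enqueue()                        # NonZero writes its output
    read_ticket = stream_a.enqueue()          # Transpose reads that output
    block.last_use = (stream_a, read_ticket)
    arena.free(block)                         # marked free, read still pending

    reused = arena.alloc(stream_b)            # Identity asks for memory
    stream_b.enqueue()                        # Identity writes into it
    return reused is block and stream_a.completed < read_ticket


if __name__ == "__main__":
    print("race without stream check:", run_scenario(stream_aware=False))
    print("race with stream check:", run_scenario(stream_aware=True))
```

In this model the race appears only when the arena hands a freed block to a different stream without checking whether the previous stream's last kernel has finished, which mirrors the missing synchronization between the main graph and the If(CPU) subgraph.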