onnxruntime/onnxruntime
RandySheriffH 009cd4ea2e
Allow cuda custom ops allocate deferred cpu mem (#17893)
Expose a new allocator from cuda stream.
The allocator manages deferred CPU memory, which only gets recycled before
stream destruction.

---------

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-10-20 16:12:21 -07:00
contrib_ops Fix Packed MultiHead Attention (#17996) 2023-10-18 10:52:14 -07:00
core Allow cuda custom ops allocate deferred cpu mem (#17893) 2023-10-20 16:12:21 -07:00
python ResizeGrad CUDA/ROCM kernel implementation (#17772) 2023-10-20 11:39:57 -07:00
test Allow cuda custom ops allocate deferred cpu mem (#17893) 2023-10-20 16:12:21 -07:00
tool/etw
wasm [js/webgpu] support IO binding (#17480) 2023-09-29 11:24:42 -07:00
__init__.py Python API to check whether collective ops are available or not (#17730) 2023-09-29 14:11:05 -07:00
ReformatSource.ps1
ReformatSourcePython.bat
VSCodeCoverage.runsettings