onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-28 22:56:32 +00:00

zhijiang 8fadc6c913 Zhijxu/cleanup cached tensors when oom (#19306)

In PyTorch, when an OOM happens during backpropagation, the user can decrease the batch size and rerun without restarting the process. In ORT, the intermediate tensors are kept even after an OOM, so a rerun with a smaller batch size still fails.

Torch run: after the OOM failure, torch releases its tensors before the next step. ![image](https://github.com/microsoft/onnxruntime/assets/43435212/92b8a2e3-454b-448a-a223-17cb91d463c2)

ORT run: ORT does not release its tensors after the OOM failure. ![image](https://github.com/microsoft/onnxruntime/assets/43435212/bb6a3882-8e14-4f37-8079-e7f70fc2546b)

ORT with this PR: the memory is released. The remaining 4 GB is not owned by ORT and is released by torch at the end. ![image](https://github.com/microsoft/onnxruntime/assets/43435212/7f39d711-4e36-47d5-aecf-3805433a6d01)

2024-02-21 10:41:42 +08:00
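The recovery pattern this commit message describes (catch the OOM, shrink the batch, rerun in the same process) can be sketched as below. This is an illustration only, not code from the PR: `run_with_batch_backoff`, `is_oom_error`, and `fake_step` are hypothetical names, the OOM check is a simple message-matching heuristic, and the sketch assumes the framework has released its intermediate tensors by the time the retry runs, which is the behavior the PR aims for in ORT.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def is_oom_error(exc: BaseException) -> bool:
    # Heuristic (assumption): a MemoryError, or any error whose message
    # mentions "out of memory", is treated as an OOM.
    return isinstance(exc, MemoryError) or "out of memory" in str(exc).lower()


def run_with_batch_backoff(
    step: Callable[[int], T], batch_size: int, min_batch_size: int = 1
) -> Tuple[int, T]:
    """Call step(batch_size); on an OOM-like error, halve the batch size and retry.

    Returns the batch size that succeeded together with the step's result.
    Errors that do not look like an OOM are re-raised unchanged.
    """
    while True:
        try:
            return batch_size, step(batch_size)
        except (RuntimeError, MemoryError) as exc:
            if not is_oom_error(exc) or batch_size <= min_batch_size:
                raise
            batch_size = max(min_batch_size, batch_size // 2)


def fake_step(batch_size: int) -> str:
    # Stand-in for a real training step: pretends that anything above
    # 16 samples does not fit in memory.
    if batch_size > 16:
        raise RuntimeError("CUDA out of memory (simulated)")
    return f"ok with batch_size={batch_size}"


if __name__ == "__main__":
    print(run_with_batch_backoff(fake_step, 64))
```

In a real training loop, `step` would wrap the forward and backward pass. The retry only helps if the memory held by the failed attempt has actually been freed, which is why keeping cached tensors after an OOM makes the smaller batch fail as well.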
..
contrib_ops	[JS/WebGPU] Add MatMulNBits (#19446)	2024-02-17 09:19:17 -08:00
core	Zhijxu/cleanup cached tensors when oom (#19306)	2024-02-21 10:41:42 +08:00
python	add option DefaultTensorType to specify the default tensor type to quantize (#19455)	2024-02-20 08:22:44 -08:00
test	add option DefaultTensorType to specify the default tensor type to quantize (#19455)	2024-02-20 08:22:44 -08:00
tool/etw
wasm	[js/webgpu] Support capture and replay for jsep (#18989)	2024-01-30 18:28:03 -08:00
__init__.py	[ORT 1.17.0 release] Bump up version to 1.18.0 (#19170)	2024-01-17 11:18:32 -08:00
ReformatSource.ps1
ReformatSourcePython.bat
VSCodeCoverage.runsettings