pytorch/docs at 11ea09effcf2c77c2f05eb4fceac1e1e56fcfb82

saymrwulf/pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-15 21:00:47 +00:00

Jaewon Lee 11ea09effc [CUDACachingAlloc/GPUInference] Implement garbage collection without GPU sync (#74261)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74261

### Goal
Implement a cheap way to reclaim GPU memory (garbage collection) without incurring GPU sync.

### Why do we need this?
Currently, there are only two ways to reclaim GPU memory block already assigned to a particular stream.

- `release_available_cached_blocks(params)`: Free blocks exceeding the `CachingAllocatorConfig::max_split_size()` until we can satisfy the request.
  Issue: If the `max_split_size` is unset (default), this function is a no-op. Even if this is set, the reclamation is quite conservative (e.g., never frees blocks under max_split_size).
- `release_cached_blocks()`: Waits for all the in-flight events and then reclaim blocks.
  Issue: 'waiting for all event' is very expensive as it will likely stall all the GPU operations. Many GPU applications without a proper handling of potential GPU throttling would suffer/crash.

### Proposed idea
- If the garbage collection threshold is set, try to reclaim some memory blocks without synchronization. It should be safe to do so, as `release_available_cached_blocks` essentially does the same thing (but less aggressively).
- GC is triggered only when we fail to serve a `malloc` request from the block pool. No need to free blocks when the block pool is functioning just fine.
- Prioritize reclaiming blocks that weren't reused for long time. Reclamation stops once the used memory capacity < threshold.
- This code path is totally optional; by default it won't be invoked.

Test Plan:
- Unit tests
- Manually checked that the GPU memory usage stays as indicated by the garbage collector. If not the caching allocator at least tries to keep freeing the blocks.

Reviewed By: jianyuh
Differential Revision: D34482514
fbshipit-source-id: d5eae62ac60b94b0bca851f9d233a092d086e3c2
(cherry picked from commit 05780f1ed4b176f05e765b2411c9eaa2eaeb48b0)

2022-03-21 18:46:02 +00:00

caffe2	Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default	2021-08-12 11:45:01 -07:00
cpp	add VS extension in doc (#63944)	2021-11-11 18:02:08 -08:00
source	[CUDACachingAlloc/GPUInference] Implement garbage collection without GPU sync (#74261)	2022-03-21 18:46:02 +00:00
.gitignore
libtorch.rst
make.bat
Makefile
README.md
requirements.txt	Sphinx panel	2022-03-07 14:50:09 +00:00

README.md

Please see the Writing documentation section of CONTRIBUTING.md for details on both writing and building the docs.