pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-15 21:00:47 +00:00

Bert Maher 03342af3a3 Add env variable to bypass CUDACachingAllocator for debugging (#45294)	2020-09-28 11:40:04 -07:00

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45294

While tracking down a recent memory corruption bug we found that cuda-memcheck wasn't finding the bad accesses, and ngimel pointed out that it's because we use a caching allocator, so a lot of "out of bounds" accesses land in a valid slab.

This PR adds a runtime knob (`PYTORCH_NO_CUDA_MEMORY_CACHING`) that, when set, bypasses the caching allocator's caching logic so that allocations go straight to cudaMalloc. This way, cuda-memcheck will actually work.

Test Plan:
Insert some memory errors and run a test under cuda-memcheck; observe that cuda-memcheck flags an error where expected.

Specifically I removed the output-masking logic here:
https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/tensorexpr/cuda_codegen.cpp#L819-L826

And ran:
```
PYTORCH_NO_CUDA_MEMORY_CACHING=1 cuda-memcheck pytest -k test_superslomo test_jit_fuser_te.py
```

Reviewed By: ngimel
Differential Revision: D23964734
Pulled By: bertmaher
fbshipit-source-id: 04efd11e8aff037b9edde80c70585cb820ee6e39
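
The test-plan invocation above can also be scripted. The following is only a minimal sketch: the helper name `memcheck_command` and its parameters are hypothetical and not part of PyTorch. The environment variable name and the `cuda-memcheck pytest ...` command line are taken from the commit message. The sketch only builds the argument list and environment; it does not launch anything.

```python
import os
import shlex


def memcheck_command(test_args, base_env=None):
    """Build (argv, env) for running pytest under cuda-memcheck.

    Hypothetical helper for illustration only. It sets the
    PYTORCH_NO_CUDA_MEMORY_CACHING knob described in the commit message
    so that allocations bypass the caching logic.
    """
    # Copy the environment so the caller's mapping is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["PYTORCH_NO_CUDA_MEMORY_CACHING"] = "1"
    # Same command shape as the test plan: cuda-memcheck wraps pytest.
    argv = ["cuda-memcheck", "pytest", *test_args]
    return argv, env


# An empty base_env is used here only to keep the demonstration small.
argv, env = memcheck_command(
    ["-k", "test_superslomo", "test_jit_fuser_te.py"], base_env={}
)
print(shlex.join(argv))
```

On a machine with the CUDA toolkit installed, the pair could be passed to `subprocess.run(argv, env=env)`. In that case leave `base_env` at its default so that `PATH` and the other inherited variables are preserved.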
..
amp_examples.rst	Reference amp tutorial (recipe) from core amp docs (#44725)	2020-09-16 11:37:58 -07:00
autograd.rst
broadcasting.rst
cpu_threading_runtimes.svg
cpu_threading_torchscript_inference.rst
cpu_threading_torchscript_inference.svg
cuda.rst	Add env variable to bypass CUDACachingAllocator for debugging (#45294)	2020-09-28 11:40:04 -07:00
ddp.rst
extending.rst
faq.rst
large_scale_deployments.rst
multiprocessing.rst
randomness.rst	Update determinism documentation (#41692)	2020-08-31 21:06:24 -07:00
serialization.rst
windows.rst