pytorch/docs/source
Banit Agrawal a575ce0dc6 [PyTorch Pinned Allocator] Add support of background thread to process events (#135524)
Summary: Currently we process events in the regular allocation path and we call cudaEventQuery to check on the events and this path can take some locks in libcuda driver. Its not entirely needed to do process events in the allocation path, we could move this to a background thread and keep processing events regularly and put the freed block to the free list.

Differential Revision: D62396585

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135524
Approved by: https://github.com/zyan0
2024-09-17 21:08:10 +00:00
..
_static Clean up distributed/CONTRIBUTING.md (#128450) 2024-06-22 02:41:22 +00:00
_templates
community Add Alban and Piotr into Core Maintainers (#130903) 2024-07-20 16:02:42 +00:00
elastic DOC: add docstring to construct_and_record_rdzv_event() (#128189) 2024-06-10 22:17:33 +00:00
notes [PyTorch Pinned Allocator] Add support of background thread to process events (#135524) 2024-09-17 21:08:10 +00:00
rpc [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
scripts [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
amp.rst Update document for autocast on CPU (#135299) 2024-09-13 09:11:47 +00:00
autograd.rst
backends.rst [sparse] Add cuSPARSELt as a backend (#128534) 2024-08-21 22:06:07 +00:00
benchmark_utils.rst
bottleneck.rst
checkpoint.rst [checkpoint] Clean up selective activation checkpoint and make public (#125795) 2024-06-18 18:18:50 +00:00
complex_numbers.rst
cond.rst [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
conf.py Revert "[dtensor] move DTensor to public namespace (#133113)" 2024-08-19 05:00:19 +00:00
config_mod.rst
cpp_extension.rst
cpp_index.rst
cpu.rst
cuda._sanitizer.rst
cuda.rst Uses MemPoolContext to route allocations from CUDACachingAllocator (#134685) 2024-08-29 03:56:31 +00:00
cuda.tunable.rst [ROCm] TunableOp improvements (#124362) 2024-06-03 22:30:11 +00:00
cuda_environment_variables.rst
cudnn_persistent_rnn.rst
cudnn_rnn_determinism.rst
data.rst
ddp_comm_hooks.rst
debugging_environment_variables.rst
deploy.rst
deterministic.rst
distributed.algorithms.join.rst
distributed.checkpoint.rst [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
distributed.elastic.rst Reapply "distributed debug handlers (#126601)" (#127805) 2024-06-04 19:44:30 +00:00
distributed.optim.rst
distributed.pipelining.rst [PP] Add ZeroBubble schedule (#133467) 2024-08-22 13:32:15 +00:00
distributed.rst [reland][dtensor] move DTensor to public namespace (#134203) 2024-09-08 17:08:40 +00:00
distributed.tensor.parallel.rst Update link in distributed.tensor.parallel.rst (#136103) 2024-09-15 19:36:29 +00:00
distributed.tensor.rst [reland][dtensor] move DTensor to public namespace (#134203) 2024-09-08 17:08:40 +00:00
distributions.rst
dlpack.rst
docutils.conf
export.ir_spec.rst
export.rst Add some doc for export_for_training (#135918) 2024-09-15 17:08:12 +00:00
fft.rst
fsdp.rst
func.api.rst
func.batch_norm.rst
func.migrating.rst
func.rst
func.ux_limitations.rst
func.whirlwind_tour.rst
future_mod.rst
futures.rst
fx.experimental.rst Only thunkify proxies in some situations (#132421) 2024-08-08 12:03:06 +00:00
fx.rst Consolidate SymDispatchMode into ProxyTensorMode (#132674) 2024-08-08 12:02:54 +00:00
hub.rst
index.rst [reland][dtensor] move DTensor to public namespace (#134203) 2024-09-08 17:08:40 +00:00
jit.rst
jit_builtin_functions.rst
jit_language_reference.rst [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
jit_language_reference_v2.rst [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
jit_python_reference.rst
jit_unsupported.rst
jit_utils.rst
library.rst [custom ops] Add register_vmap for custom ops (#130589) 2024-07-23 17:48:38 +00:00
linalg.rst
logging.rst
masked.rst Add MaskedTensor passthrough: unfold, F.Unfold, F.Fold, stack (#125262) 2024-09-06 19:06:23 +00:00
math-quantizer-equation.png
meta.rst
miscellaneous_environment_variables.rst [RFC] Add support for device extension autoloading (#127074) 2024-07-09 06:14:13 +00:00
mobile_optimizer.rst Add ExecuTorch warning to mobile_optimizer (#134697) 2024-09-04 17:47:14 +00:00
model_zoo.rst
module_tracker.rst
monitor.rst
mps.rst Add support in Python API for the recommended max working set size. (#128289) 2024-06-12 16:03:57 +00:00
mps_environment_variables.rst [MPS] Add mps profiler env vars to docs (#129552) 2024-07-04 06:44:48 +00:00
mtia.rst [MTIA] Support torch.cuda.get_device_capability equivalent API on MTIA (#135889) 2024-09-17 17:42:56 +00:00
multiprocessing.rst
name_inference.rst
named_tensor.rst
nested.rst
nn.attention.bias.rst
nn.attention.flex_attention.rst [Inductor] Added and_masks and or_masks utilities & make fully masked out rows 0 instead of nan (#131552) 2024-07-25 21:29:46 +00:00
nn.attention.rst Make FlexAttention API public (#130755) 2024-07-16 16:21:25 +00:00
nn.functional.rst
nn.init.rst
nn.rst Make adding Buffers more like adding Parameters (#125971) 2024-07-31 10:32:40 +00:00
onnx.rst [ONNX] Improves documentation of ONNX exporter (#135372) 2024-09-09 15:09:01 +00:00
onnx_dynamo.rst [ONNX] Improves documentation of ONNX exporter (#135372) 2024-09-09 15:09:01 +00:00
onnx_dynamo_onnxruntime_backend.rst
onnx_torchscript.rst [ONNX] Remove logging apis from public (#133825) 2024-09-13 22:19:52 +00:00
onnx_torchscript_supported_aten_ops.rst
optim.rst Make optim.swa.util content accessible from the torch.optim doc (#133393) 2024-08-21 00:43:46 +00:00
package.rst
profiler.rst
quantization-accuracy-debugging.rst
quantization-backend-configuration.rst
quantization-support.rst Update pt2e numeric debugger to use node.meta["custom"] field (#134040) 2024-08-27 19:51:03 +00:00
quantization.rst
random.rst
rpc.rst
signal.rst
size.rst
sparse.rst SparseCsrCUDA: cuDSS backend for linalg.solve (#129856) 2024-08-22 07:57:30 +00:00
special.rst
storage.rst
tensor_attributes.rst Refine the logic of device construction when only device index is given (#129119) 2024-07-15 14:34:29 +00:00
tensor_view.rst
tensorboard.rst
tensors.rst add xpu to torch.tensors (#127280) 2024-06-11 18:13:01 +00:00
testing.rst
threading_environment_variables.rst
torch.ao.ns._numeric_suite.rst
torch.ao.ns._numeric_suite_fx.rst
torch.compiler.rst add xpu to torch.compile (#127279) 2024-06-13 21:15:09 +00:00
torch.compiler_aot_inductor.rst [AOTI] docs: add suggestion to turn on freezing on CPU (#128010) 2024-06-07 08:57:02 +00:00
torch.compiler_api.rst [RFC][dynamo] add decorator to register polyfill for unsupported C++ function to avoid graph break (#133712) 2024-08-21 06:36:41 +00:00
torch.compiler_best_practices_for_backends.rst
torch.compiler_cudagraph_trees.rst [CUDAGraph] add more docs for cudagraph trees (#127963) 2024-06-18 02:07:07 +00:00
torch.compiler_custom_backends.rst
torch.compiler_dynamic_shapes.rst
torch.compiler_dynamo_deepdive.rst Stop immediately specializing common constants 0/1 for plain int (#128327) 2024-07-03 16:41:51 +00:00
torch.compiler_dynamo_overview.rst
torch.compiler_fake_tensor.rst [BE] Reroute all uses of proxy_tensor.maybe_disable_fake_tensor_mode to fake_tensor.unset_fake_temporarily (#132770) 2024-08-08 23:07:23 +00:00
torch.compiler_faq.rst [dynamo] Retire CompileProfiler (#135133) 2024-09-05 01:08:40 +00:00
torch.compiler_fine_grain_apis.rst [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
torch.compiler_get_started.rst Revert "[inductor] More fixes on the keys of constants and signature dictionaries (#135406)" 2024-09-16 17:58:02 +00:00
torch.compiler_inductor_profiling.rst
torch.compiler_ir.rst
torch.compiler_nn_module.rst
torch.compiler_performance_dashboard.rst
torch.compiler_profiling_torch_compile.rst [EZ] Fix spelling typo (#136157) 2024-09-16 19:30:30 +00:00
torch.compiler_transformations.rst
torch.compiler_troubleshooting.rst [dynamo] Retire CompileProfiler (#135133) 2024-09-05 01:08:40 +00:00
torch.overrides.rst
torch.rst Autoselect default device in FSDP construction. (#127609) 2024-08-08 05:25:17 +00:00
torch_cuda_memory.rst
torch_environment_variables.rst [Docs][MPS] Add mps environment variable table (#129008) 2024-06-20 03:30:35 +00:00
torch_nccl_environment_variables.rst [c10d][doc] Add docs for ENV variables TORCH_NCCL_ASYNC_ERROR_HANDLING TORCH_NCCL_TRACE_CPP_STACK and TORCH_NCCL_COORD_CHECK_MILSEC (#132920) 2024-08-09 21:08:20 +00:00
type_info.rst
utils.rst
xpu.rst [Intel GPU] Add XPU memory-related APIs (#129919) 2024-09-07 11:15:17 +00:00