pytorch/torch
Jeff Daily ce5bca5502 ProcessGroupNCCL::alltoall_base needs to call recordStream (#46603)
Summary:
For reasons similar to those documented in the `[Sync Streams]` note.  For an existing example, `ProcessGroupNCCL::allgather` must also call `recordStream` and already does so.

The output tensor is created on the default stream (by the application).  NCCL/RCCL internally runs on a different stream (the internal `ncclStream`).  If we do not record the output tensor's usage on the `ncclStream`, the caching allocator may deallocate and reuse the tensor's memory while NCCL/RCCL is still reading or writing it.

The application is not aware of the `ncclStream`, since it is internal to `ProcessGroupNCCL`, so the application cannot record the output tensor on that stream itself.
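The hazard described above can be sketched with a toy model of a caching allocator.  This is NOT the real CUDA caching allocator or the PyTorch API; `CachingAllocator`, `record_stream`, and `stream_finished` are illustrative names showing why reuse of a freed block must be deferred while a side stream (such as the `ncclStream`) is still using it.

```python
# Toy model of the hazard this commit fixes: an allocator that recycles a
# block as soon as the last user-visible reference is freed, even though a
# side stream may still be using the memory. recordStream tells the
# allocator to hold the block back until that stream is done.
class CachingAllocator:
    def __init__(self):
        self.free_pool = []   # blocks available for immediate reuse
        self.pending = {}     # block id -> set of streams still using it

    def allocate(self):
        # Reuse a cached block if one is available, else "allocate" fresh.
        return self.free_pool.pop() if self.free_pool else object()

    def record_stream(self, block, stream):
        # Analogue of recordStream: mark the block as in use by `stream`.
        self.pending.setdefault(id(block), set()).add(stream)

    def free(self, block):
        if id(block) in self.pending:
            return                    # side stream still active: defer reuse
        self.free_pool.append(block)  # otherwise reuse immediately

    def stream_finished(self, stream, block):
        # Called when the side stream's work on `block` completes.
        users = self.pending.get(id(block), set())
        users.discard(stream)
        if not users:
            self.pending.pop(id(block), None)
            self.free_pool.append(block)

alloc = CachingAllocator()

# Without record_stream: the block is recycled while "ncclStream" may
# still be using it -- the next allocation hands out the same memory.
out = alloc.allocate()
alloc.free(out)                       # application drops its reference
assert alloc.allocate() is out        # hazard: immediate reuse

# With record_stream: reuse is deferred until the side stream finishes.
out2 = alloc.allocate()
alloc.record_stream(out2, "ncclStream")
alloc.free(out2)
assert alloc.allocate() is not out2   # block held back while in use
alloc.stream_finished("ncclStream", out2)
assert alloc.allocate() is out2       # reusable once the stream is done
```

In real code the application cannot make this call for the `ncclStream` (it never sees that stream), which is why `ProcessGroupNCCL::alltoall_base` must call `recordStream` on the output tensor itself.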

Patch originally developed by sarunyap.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46603

Reviewed By: srinivas212

Differential Revision: D24458530

fbshipit-source-id: b02e74d1c3a176ea1b9bbdd7dc671b221fcadaef
2020-10-22 15:53:19 -07:00
_C Namespace cleanup for 1.7 Part 2 (#46673) 2020-10-22 07:57:51 -07:00
autograd Allow Tensor-likes in torch.autograd.gradcheck (#45732) 2020-10-09 11:51:27 -07:00
backends
contrib
csrc [pt][static_runtime] Add option enable_out_variant (#46690) 2020-10-22 15:00:23 -07:00
cuda Replace map(lambda constructs (#46462) 2020-10-22 09:50:22 -07:00
distributed Pull in fairscale.nn.Pipe into PyTorch. (#44090) 2020-10-22 10:59:02 -07:00
distributions Expose script_if_tracing as public API (#46494) 2020-10-17 17:31:57 -07:00
fft torch.fft: Two dimensional FFT functions (#45164) 2020-10-17 16:23:06 -07:00
for_onnx
futures fix #45552 - adding add_done_callback(fn) to torch.futures.Future (#45675) 2020-10-13 07:47:36 -07:00
fx [FX] Make wrapped functions traceable (#46692) 2020-10-22 12:00:02 -07:00
jit Replace map(lambda constructs (#46462) 2020-10-22 09:50:22 -07:00
legacy
lib ProcessGroupNCCL::alltoall_base needs to call recordStream (#46603) 2020-10-22 15:53:19 -07:00
linalg
multiprocessing Add exception classification to torch.multiprocessing.spawn (#45174) 2020-10-09 12:59:41 -07:00
nn Replace map(lambda constructs (#46462) 2020-10-22 09:50:22 -07:00
onnx [ONNX] Export var, var_mean and std_mean ops (#45678) 2020-10-21 11:23:54 -07:00
optim Replace list(map(...)) constructs by list comprehensions (#46461) 2020-10-19 18:42:49 -07:00
package [packaging] simpler dependency plotting (#45686) 2020-10-06 23:40:00 -07:00
quantization [quant][graphmode][fx] Add support for additional_fuse_method_mapping (#46345) 2020-10-22 15:15:31 -07:00
sparse Revised sparse tensor documentation. (#45400) 2020-10-22 02:07:54 -07:00
testing Replace map(lambda constructs (#46462) 2020-10-22 09:50:22 -07:00
utils Namespace cleanup for 1.7 Part 2 (#46673) 2020-10-22 07:57:51 -07:00
__config__.py
__future__.py
__init__.py Avoid leaking has_torch_function and handle_torch_function in torch namespace (#46680) 2020-10-22 07:48:36 -07:00
_appdirs.py
_classes.py
_jit_internal.py [JIT] adding torch.jit.isinstance support (#46062) 2020-10-20 16:47:49 -07:00
_linalg_utils.py
_lobpcg.py
_lowrank.py
_namedtensor_internals.py
_ops.py
_six.py
_storage_docs.py
_tensor_docs.py Revised sparse tensor documentation. (#45400) 2020-10-22 02:07:54 -07:00
_tensor_str.py
_torch_docs.py Revised sparse tensor documentation. (#45400) 2020-10-22 02:07:54 -07:00
_utils.py
_utils_internal.py
_VF.py
_vmap_internals.py Allow vmap to accept nested python data structures as inputs (#46289) 2020-10-20 07:52:17 -07:00
abi-check.cpp
CMakeLists.txt make a way to disable callgrind (#46116) 2020-10-13 16:18:04 -07:00
custom_class.h
custom_class_detail.h
extension.h
functional.py Fix typing errors in the torch.distributions module (#45689) 2020-10-12 10:29:45 -07:00
hub.py
library.h Rationalize inlining of kernels into the unboxing wrapper (#42845) 2020-10-15 04:02:51 -07:00
overrides.py [py][vulkan][reland] Add is_vulkan to py api, add vulkan to device type parsing (#46655) 2020-10-22 09:35:50 -07:00
py.typed
quasirandom.py
random.py
README.txt
script.h
serialization.py Use storage.cpu() for moving storage to CPU in serialization. (#46028) 2020-10-13 12:51:10 -07:00
storage.py
tensor.py Allow consumer ops to sync on GraphRoot's gradient (#45787) 2020-10-07 08:53:53 -07:00
types.py

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.
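The note above describes an opaque-handle pattern: external clients go through public accessor functions and never touch the struct's fields directly, so the internals can be refactored freely.  A minimal sketch of that pattern, transposed to Python for brevity (`_TensorImpl`, `tensor_new`, `tensor_ndimension`, and `tensor_size` are invented names; the real code is the C functions in `THTensor.h` wrapping the struct in `THTensor.hpp`):

```python
class _TensorImpl:
    """Internal representation (analogue of the struct in the .hpp header).
    Its layout is an implementation detail and may change at any time."""
    def __init__(self, sizes):
        self._sizes = list(sizes)

# Public accessors (analogue of the C functions in THTensor.h):
# clients call these instead of reaching into _TensorImpl's fields,
# so the internals can be refactored without breaking callers.
def tensor_new(sizes):
    return _TensorImpl(sizes)

def tensor_ndimension(t):
    return len(t._sizes)

def tensor_size(t, dim):
    return t._sizes[dim]

t = tensor_new((3, 4))
assert tensor_ndimension(t) == 2
assert tensor_size(t, 1) == 4
```

The sites flagged by this note are places that bypass the accessor layer and poke at the struct directly, which is why they must be revisited whenever the internal layout changes.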