Currently, the pre- and post-division steps in `FullyShardedDataParallel._post_backward_hook` carry the following comment:

> Average grad by world_size for consistency with PyTorch DDP.

This does not match what actually happens: the pre-divide factor may or may not equal `world_size`. For example, for `world_size = 3`, `predivide_factor = 2`. This PR clarifies the pre- and post-division comments in the code.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80456
Approved by: https://github.com/rohan-varma
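For context, here is a minimal sketch of why the pre-divide factor can differ from `world_size`: the division by `world_size` is split into a pre-divide before the all-reduce and a post-divide after it. The helper name `get_gradient_predivide_factor` below is illustrative rather than a verbatim copy of the FSDP source, but the loop mirrors the factor logic the PR refers to and reproduces the `world_size = 3` example from the message.

```python
import torch

def get_gradient_predivide_factor(world_size: int) -> float:
    # Double the factor while it still divides world_size and the
    # remaining post-divide share is larger, roughly balancing the
    # two division steps.
    factor = 1
    while world_size % factor == 0 and world_size / factor > factor:
        factor *= 2
    return float(factor)

world_size = 3
predivide_factor = get_gradient_predivide_factor(world_size)  # 2.0, not 3.0
postdivide_factor = world_size / predivide_factor             # 1.5

grad = torch.ones(4)
grad.div_(predivide_factor)
# ... all-reduce of grad across ranks would happen here ...
grad.div_(postdivide_factor)
# Net effect: grad / world_size, i.e. DDP-style gradient averaging.
print(grad)  # tensor([0.3333, 0.3333, 0.3333, 0.3333])
```

Because the two factors always multiply to `world_size`, the final gradient still matches DDP's averaging; splitting the division merely keeps intermediate values in a safer numeric range for reduced-precision (e.g. fp16) all-reduce. Hence "average grad by world_size" describes the net effect, not either individual step.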
Directory contents:

- `_shard/`
- `_sharded_tensor/`
- `_sharding_spec/`
- `algorithms/`
- `autograd/`
- `benchmarks/`
- `elastic/`
- `fsdp/`
- `launcher/`
- `nn/`
- `optim/`
- `pipeline/`
- `rpc/`
- `__init__.py`
- `argparse_util.py`
- `constants.py`
- `CONTRIBUTING.md`
- `distributed_c10d.py`
- `launch.py`
- `remote_device.py`
- `rendezvous.py`
- `run.py`
- `utils.py`