pytorch/torch/distributed
Andrew Gu 0b0e65516d [FSDP] Fix param name prefixes for ignored modules (#79955)
For ignored modules' parameters, we should also clean the parameter names, since they carry the FSDP-specific prefixes.
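To make the cleaning concrete, here is a minimal sketch, not the actual FSDP implementation: it strips the wrapper prefixes from a prefixed parameter name. The prefix strings `_fsdp_wrapped_module.` and `_fpw_module.` are assumptions based on FSDP's wrapper-module attribute names around the time of this PR, not constants imported from torch.

```python
# Minimal sketch of cleaning FSDP-specific prefixes from parameter names.
# The prefix strings are assumptions (FSDP's wrapper-module attribute names
# circa this PR), not values taken from the torch source.
FSDP_PREFIXES = ("_fsdp_wrapped_module.", "_fpw_module.")

def clean_param_name(name: str) -> str:
    """Remove every occurrence of the assumed FSDP wrapper prefixes."""
    for prefix in FSDP_PREFIXES:
        name = name.replace(prefix, "")
    return name

# An ignored module's parameter should map back to its original name:
assert clean_param_name(
    "_fsdp_wrapped_module.layer1._fsdp_wrapped_module._fpw_module.weight"
) == "layer1.weight"
```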

This change only affects the prefixed parameter-name keys in `full_optim_state_dict()` (i.e., optimizer state dict saving). Omitting this change does not actually break the correctness of the optimizer state dict save-load flow, since that flow only requires that the keys be unique and internally consistent.

Either way, this PR now explicitly specifies that the parameter keys in the optimizer state dict should match the keys of the full model state dict. A sketch of the consistency check this specification implies follows.
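Here `check_optim_keys` is a hypothetical helper written for illustration, not part of the `torch.distributed.fsdp` API:

```python
# Hypothetical helper (not part of torch.distributed.fsdp) expressing the
# specification: every parameter key in the optimizer state dict's "state"
# mapping must also be a key of the full model state dict.
from typing import Any, Dict

def check_optim_keys(optim_sd: Dict[str, Any], model_sd: Dict[str, Any]) -> None:
    missing = set(optim_sd["state"]) - set(model_sd)
    if missing:
        raise KeyError(f"optim state keys missing from model state dict: {missing}")
```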
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79955
Approved by: https://github.com/rohan-varma
2022-06-21 22:10:33 +00:00
_shard [shard] make state_dict hook be consistent 2022-06-17 22:08:06 +00:00
_sharded_tensor
_sharding_spec
algorithms [CheckpointWrapper] Replace generic mod prefix (#79830) 2022-06-21 16:01:59 +00:00
autograd
benchmarks
elastic
fsdp [FSDP] Fix param name prefixes for ignored modules (#79955) 2022-06-21 22:10:33 +00:00
launcher
nn Ensure tensors are contiguous in functional all_gather. 2022-06-17 01:27:11 +00:00
optim [CUDA graphs] Allows Adam and AdamW to be capture-safe (#77862) 2022-06-13 01:56:47 +00:00
pipeline
rpc
__init__.py
argparse_util.py
constants.py
CONTRIBUTING.md Fix some links in torch/distributed/CONTRIBUTING.md (#79855) 2022-06-21 00:48:30 +00:00
distributed_c10d.py Revert "Revert "[distributed] Handle object collectives and NCCL. (#79034)"" 2022-06-15 10:04:37 -07:00
launch.py
remote_device.py
rendezvous.py
run.py
utils.py