pytorch/torch/distributed
Junjie Wang (PyTorch) b02c514764 [PT-D][Sharded Tensor] new init api for local tensor and sharding spec auto inference (#72733)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72733

To reduce the communication cost incurred when initializing a sharded tensor, this PR/diff makes two changes:

1. We add a new API, `_init_from_local_tensor`, so that when there is only one local tensor we can initialize a sharded tensor directly from it. (GH issue: https://github.com/pytorch/pytorch/issues/72092)
2. We add a new API that infers the sharding spec from the global metadata, so the sharding spec does not have to be set manually when it is not an `EnumerableShardingSpec`. (GH issue: https://github.com/pytorch/pytorch/issues/67244)
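
To illustrate the idea behind change 2, here is a minimal, pure-Python sketch of inferring a chunk-style sharding from per-shard metadata. It is not the code added in this PR: `ShardMeta`, `infer_chunk_dim`, and the placement strings are hypothetical stand-ins, and the sketch assumes the only non-enumerable layout of interest is contiguous chunking along a single dimension.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ShardMeta:
    """Hypothetical stand-in for per-shard metadata (not a torch class)."""
    offsets: List[int]   # start index of the shard in each dimension
    sizes: List[int]     # extent of the shard in each dimension
    placement: str       # illustrative placement string, e.g. "rank:0/cuda:0"


def infer_chunk_dim(
    shards: List[ShardMeta], global_size: List[int]
) -> Optional[Tuple[int, List[str]]]:
    """Return (dim, placements ordered by offset) if the shards tile exactly
    one dimension contiguously and span every other dimension in full.
    Return None otherwise, meaning an enumerable spec would be needed."""
    if not shards:
        return None
    ndim = len(global_size)
    for dim in range(ndim):
        # Every shard must cover the full extent of all other dimensions.
        spans_others = all(
            s.offsets[d] == 0 and s.sizes[d] == global_size[d]
            for s in shards
            for d in range(ndim)
            if d != dim
        )
        if not spans_others:
            continue
        # Along `dim`, shards must be back to back with no gap or overlap.
        ordered = sorted(shards, key=lambda s: s.offsets[dim])
        expected = 0
        contiguous = True
        for s in ordered:
            if s.offsets[dim] != expected:
                contiguous = False
                break
            expected += s.sizes[dim]
        if contiguous and expected == global_size[dim]:
            return dim, [s.placement for s in ordered]
    return None


# Two row-wise shards of a 10 x 4 tensor, listed out of order.
row_shards = [
    ShardMeta(offsets=[5, 0], sizes=[5, 4], placement="rank:1/cuda:1"),
    ShardMeta(offsets=[0, 0], sizes=[5, 4], placement="rank:0/cuda:0"),
]
row_result = infer_chunk_dim(row_shards, [10, 4])

# A 2 x 2 grid of shards cannot be described by chunking one dimension.
grid_shards = [
    ShardMeta(offsets=[r, c], sizes=[5, 2], placement=f"rank:{i}/cuda:{i}")
    for i, (r, c) in enumerate([(0, 0), (0, 2), (5, 0), (5, 2)])
]
grid_result = infer_chunk_dim(grid_shards, [10, 4])
```

In this sketch the row-wise layout is recognized as chunking along dimension 0, while the grid layout yields `None` and would still need an explicitly enumerated spec.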
ghstack-source-id: 149229259

Test Plan: CI

Reviewed By: wanchaol

Differential Revision: D34132739

fbshipit-source-id: 3a60135761bcc19d6020b6c45cb2979869645ce6
(cherry picked from commit af569325e2794309a4a86e51749642a062a25f6e)
2022-02-16 17:42:39 +00:00
..
_shard [PT-D][Sharded Tensor] new init api for local tensor and sharding spec auto inference (#72733) 2022-02-16 17:42:39 +00:00
_sharded_tensor [reland] Create torch.distributed._shard package. (#72141) 2022-02-02 06:58:20 +00:00
_sharding_spec [reland] Create torch.distributed._shard package. (#72141) 2022-02-02 06:58:20 +00:00
algorithms [Join][BE] Fix typo; remove obsolete method (#72886) 2022-02-16 15:03:09 +00:00
autograd
benchmarks
elastic #71946 Remove Python 3.6 references (#72211) 2022-02-08 03:46:20 +00:00
fsdp Revert D34111109: [FSDP] Implement apply() 2022-02-16 15:49:04 +00:00
launcher
nn Revert D33716716: [pytorch][PR] Added remove_duplicate parameter to nn.Module 2022-02-03 09:04:29 +00:00
optim Revert D34106940: [ZeRO] Add ctor support for multiple param groups 2022-02-16 03:45:15 +00:00
pipeline
rpc [distributed] Make rref_proxy._invoke_rpc truly async when needed. (#70206) 2022-01-19 23:37:15 +00:00
__init__.py
argparse_util.py
constants.py
CONTRIBUTING.md
distributed_c10d.py Stop writing logs to root logger (#72649) 2022-02-11 21:30:53 +00:00
launch.py
remote_device.py
rendezvous.py Update _create_c10d_store to check port value (#71863) 2022-01-26 22:29:33 +00:00
run.py