Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72733

To reduce the communication cost incurred when initializing a sharded tensor, this PR/diff makes two changes:

1. Add a new API, `_init_from_local_tensor`, so that a sharded tensor can be initialized directly from a single local tensor (GH issue: https://github.com/pytorch/pytorch/issues/72092). A usage sketch follows below.
2. Add a new API to infer the sharding spec from global metadata, so the sharding spec no longer has to be set manually when it is not an `EnumerableShardingSpec` (GH issue: https://github.com/pytorch/pytorch/issues/67244). See the second sketch below.

ghstack-source-id: 149229259

Test Plan: CI

Reviewed By: wanchaol

Differential Revision: D34132739

fbshipit-source-id: 3a60135761bcc19d6020b6c45cb2979869645ce6
(cherry picked from commit af569325e2794309a4a86e51749642a062a25f6e)
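Here is a minimal sketch of the first change. It assumes a two-rank process group launched via `torchrun`, CPU tensors with the `gloo` backend, and the classmethod signature `ShardedTensor._init_from_local_tensor(local_tensor, sharding_spec, *global_size)` found in later PyTorch releases; the PR text itself does not pin down these details.

```python
# Sketch: build a ShardedTensor directly from each rank's local shard,
# avoiding the extra collective communication of a scatter-based init.
# Run with: torchrun --nproc_per_node=2 this_script.py
import torch
import torch.distributed as dist
from torch.distributed._shard.sharded_tensor import ShardedTensor
from torch.distributed._shard.sharding_spec import ChunkShardingSpec

dist.init_process_group(backend="gloo")
rank = dist.get_rank()

# Each rank owns one 2x4 row-chunk of a 4x4 global tensor.
local_tensor = torch.full((2, 4), float(rank))

# Chunk the global tensor along dim 0, one shard per rank.
spec = ChunkShardingSpec(
    dim=0,
    placements=["rank:0/cpu", "rank:1/cpu"],
)

# No data movement: each rank contributes the shard it already holds.
st = ShardedTensor._init_from_local_tensor(local_tensor, spec, 4, 4)
print(f"rank {rank}: global size {st.size()}, "
      f"local shard {st.local_shards()[0].tensor.shape}")
```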
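And a sketch of the second change, the metadata-driven spec inference. The helper name `_infer_sharding_spec_from_shards_metadata` and its import path match later PyTorch releases; the PR text only promises "an API to infer the sharding spec from global meta data", so treat the exact symbol as an assumption.

```python
# Sketch: recover a sharding spec from per-shard metadata alone.
from torch.distributed._shard.metadata import ShardMetadata
from torch.distributed._shard.sharding_spec import (
    ChunkShardingSpec,
    _infer_sharding_spec_from_shards_metadata,
)

# Two shards that split a 4x4 tensor evenly along dim 0 -- a chunk pattern.
shards_metadata = [
    ShardMetadata(shard_offsets=[0, 0], shard_sizes=[2, 4], placement="rank:0/cpu"),
    ShardMetadata(shard_offsets=[2, 0], shard_sizes=[2, 4], placement="rank:1/cpu"),
]

spec = _infer_sharding_spec_from_shards_metadata(shards_metadata)

# An even split is recognized as a ChunkShardingSpec; irregular layouts
# would instead come back as an EnumerableShardingSpec.
assert isinstance(spec, ChunkShardingSpec)
print(spec.dim, spec.placements)
```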
Contents of `torch/distributed/`:

- `_shard/`
- `_sharded_tensor/`
- `_sharding_spec/`
- `algorithms/`
- `autograd/`
- `benchmarks/`
- `elastic/`
- `fsdp/`
- `launcher/`
- `nn/`
- `optim/`
- `pipeline/`
- `rpc/`
- `__init__.py`
- `argparse_util.py`
- `constants.py`
- `CONTRIBUTING.md`
- `distributed_c10d.py`
- `launch.py`
- `remote_device.py`
- `rendezvous.py`
- `run.py`