Mirror of https://github.com/saymrwulf/pytorch.git (synced 2026-05-15 21:00:47 +00:00)
See https://github.com/pytorch/pytorch/issues/82891 for full context.

When FSDP is initialized with device_id and CPU offload, it can crash if an outer FSDP unit does not manage any params. The outer unit would pick up a flat param belonging to a child FSDP module, check its device, see that it is on CPU, and throw an error. The fix is to skip this check when the param is a flat param.

Also fixes up the documentation of the function.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82892
Approved by: https://github.com/awgu

| Name | | |
|---|---|---|
| _shard | ||
| algorithms | ||
| bin | ||
| elastic | ||
| fsdp | ||
| launcher | ||
| nn/jit | ||
| optim | ||
| pipeline/sync | ||
| rpc | ||
| argparse_util_test.py | ||
| defs.bzl | ||
| test_c10d_common.py | ||
| test_c10d_gloo.py | ||
| test_c10d_nccl.py | ||
| test_c10d_object_collectives.py | ||
| test_c10d_pypg.py | ||
| test_c10d_spawn.py | ||
| test_c10d_spawn_gloo.py | ||
| test_c10d_spawn_nccl.py | ||
| test_data_parallel.py | ||
| test_distributed_spawn.py | ||
| test_launcher.py | ||
| test_nccl.py | ||
| test_pg_wrapper.py | ||
| test_store.py | ||
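
The device check described in the commit message above can be illustrated with a minimal, torch-free sketch. It is not the FSDP implementation: `Param`, `first_non_flat_param`, and `check_param_device` are hypothetical names introduced here, and the `is_flat` flag stands in for whatever test the real code uses to recognize a flat param owned by a child FSDP module.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Param:
    """Hypothetical stand-in for a parameter; not a PyTorch class."""
    device: str
    is_flat: bool = False  # True if a child wrapper already flattened it


def first_non_flat_param(params: Iterable[Param]) -> Optional[Param]:
    """Return the first param that is not a flat param, or None."""
    for p in params:
        if not p.is_flat:
            return p
    return None


def check_param_device(params: Iterable[Param], expected_device: str) -> None:
    """Raise if a param managed by this unit sits on an unexpected device.

    Flat params are skipped (assumption: they belong to a child unit and
    may legitimately live on CPU when CPU offload is enabled).
    """
    param = first_non_flat_param(params)
    if param is None:
        # This unit manages no params of its own, so there is nothing to check.
        return
    if param.device != expected_device:
        raise RuntimeError(
            f"expected param on {expected_device}, found {param.device}"
        )
```

In this sketch, an outer unit whose only visible param is a child's CPU-offloaded flat param passes the check, while a regular param on the wrong device still raises.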