Mirror of https://github.com/saymrwulf/pytorch.git, synced 2026-05-14 20:57:59 +00:00
The original `_all_gather_keys` call served as a safety check, but it can become costly at scale and it blocks the CPU. Instead, the documentation now makes clear that the `state_dict` passed to the `load` API must have the same set of keys across ranks; otherwise the API may hang. In addition, the check moves to a utility function, `utils.assert_same_keys`, which users who are unsure whether their state dicts share the same keys can optionally call. Resolves #145965 (as a workaround). Pull Request resolved: https://github.com/pytorch/pytorch/pull/145998 Approved by: https://github.com/mhorowitz, https://github.com/fegin |
||
|---|---|---|
| e2e | ||
| fsdp | ||
| test_checkpoint.py | ||
| test_compatibility.py | ||
| test_dedup_tensors.py | ||
| test_dtensor_checkpoint.py | ||
| test_dtensor_resharding.py | ||
| test_file_system_checkpoint.py | ||
| test_file_system_checkpoint_cpu.py | ||
| test_format_utils.py | ||
| test_fsdp_model_state.py | ||
| test_fsdp_optim_state.py | ||
| test_fsdp_tp_checkpoint_conversion.py | ||
| test_fsspec.py | ||
| test_hsdp_checkpoint.py | ||
| test_nested_dict.py | ||
| test_planner.py | ||
| test_save_load_api.py | ||
| test_state_dict.py | ||
| test_state_dict_utils.py | ||
| test_tp_checkpoint.py | ||
| test_traverse.py | ||
| test_utils.py | ||
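
The commit message above describes an optional check that every rank holds the same `state_dict` keys before calling `load`. The following is a minimal sketch of that idea in plain Python, not the code from the pull request: the helper names `find_key_mismatches` and `check_same_keys` are hypothetical, and the per-rank key lists are passed in directly rather than gathered from a process group.

```python
from typing import Dict, List, Sequence


def find_key_mismatches(
    keys_per_rank: Sequence[Sequence[str]],
) -> Dict[int, Dict[str, List[str]]]:
    """Compare every rank's keys against rank 0 and report the differences.

    Hypothetical helper: keys_per_rank[i] is the list of state_dict keys
    held by rank i. Returns an empty dict when all ranks agree.
    """
    if not keys_per_rank:
        return {}
    reference = set(keys_per_rank[0])
    mismatches: Dict[int, Dict[str, List[str]]] = {}
    for rank, keys in enumerate(keys_per_rank):
        current = set(keys)
        missing = sorted(reference - current)  # on rank 0 but not here
        extra = sorted(current - reference)    # here but not on rank 0
        if missing or extra:
            mismatches[rank] = {"missing": missing, "extra": extra}
    return mismatches


def check_same_keys(keys_per_rank: Sequence[Sequence[str]]) -> None:
    """Raise AssertionError if any rank's key set differs from rank 0's."""
    mismatches = find_key_mismatches(keys_per_rank)
    if mismatches:
        raise AssertionError(
            f"state_dict keys differ across ranks: {mismatches}"
        )
```

In a real distributed job, each rank would presumably first collect the other ranks' `list(state_dict.keys())` with a collective such as `torch.distributed.all_gather_object` and then run a comparison like this one. That gather is the blocking step whose cost the commit message cites as the reason to make the check opt-in.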