pytorch/test/distributed/checkpoint
Lucas Pasqualin 5c7b71dccf [DCP] Adds strict option to DefaultPlanner (#123869)
~Users may have custom use cases for the `strict` parameter in load. In my mind, if we automatically call `state_dict` and `load_state_dict` in save/load, we need to support the same functionality in `nn.Modules`.~

It turns out this is not related to nn.Module's `strict` param. Since `state_dict` is called inside `dcp.load`, it is impossible to create a model for which the following would raise an error:
```
state_dict = module.state_dict()
module.load_state_dict(state_dict, strict=True)
```

The issue arises only when `state_dict` contains elements that do not exist in the checkpoint. This PR adds the ability to configure this behavior through the DefaultLoadPlanner (see tests).

Concretely, if the module has extra attributes not present in the checkpoint, we raise an error only if `DefaultLoadPlanner.allow_partial_load == False`.
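
The rule above can be illustrated with a minimal, torch-free sketch. The helper `check_load_keys` and the sample key names are hypothetical and exist only for illustration. This is not the DCP implementation, just the key-matching behavior described here, assuming the check compares the keys of `state_dict` against the keys stored in the checkpoint:

```python
def check_load_keys(state_dict, checkpoint_keys, allow_partial_load=False):
    """Hypothetical helper: return the state_dict keys that can be loaded.

    Raises RuntimeError when state_dict has keys that the checkpoint lacks,
    unless allow_partial_load is True.
    """
    missing = sorted(k for k in state_dict if k not in checkpoint_keys)
    if missing and not allow_partial_load:
        raise RuntimeError(f"Missing keys in checkpoint: {missing}")
    # With partial loading allowed, only the keys present in the checkpoint load.
    return [k for k in state_dict if k in checkpoint_keys]


# The checkpoint was saved before `extra_param` was added to the module.
checkpoint_keys = {"weight", "bias"}
state_dict = {"weight": 0, "bias": 0, "extra_param": 0}

# Partial load allowed: the extra key is skipped.
loaded = check_load_keys(state_dict, checkpoint_keys, allow_partial_load=True)

# Partial load disallowed: the extra key triggers an error.
try:
    check_load_keys(state_dict, checkpoint_keys, allow_partial_load=False)
    strict_error = None
except RuntimeError as exc:
    strict_error = str(exc)
```

In DCP itself, the flag is set on the planner, i.e. `DefaultLoadPlanner.allow_partial_load` as named above. See `test_planner.py` for the actual behavior.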

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123869
Approved by: https://github.com/fegin
2024-05-02 22:50:32 +00:00
..
e2e [DCP] Provides default AsyncStager (#124939) 2024-05-02 19:48:54 +00:00
fsdp
test_checkpoint.py
test_compatibility.py
test_dedup_tensors.py
test_dtensor_checkpoint.py
test_dtensor_resharding.py
test_file_system_checkpoint.py
test_file_system_checkpoint_cpu.py
test_format_utils.py
test_fsdp_model_state.py
test_fsdp_optim_state.py
test_fsdp_tp_checkpoint_conversion.py
test_fsspec.py
test_hsdp_checkpoint.py
test_nested_dict.py
test_planner.py [DCP] Adds strict option to DefaultPlanner (#123869) 2024-05-02 22:50:32 +00:00
test_save_load_api.py [DCP] Fix broken validate checkpoint api test (#124786) 2024-04-25 14:50:58 +00:00
test_state_dict.py
test_state_dict_utils.py
test_tp_checkpoint.py
test_traverse.py
test_utils.py