pytorch/torch/distributed/_composable
Andrew Gu bc938184de [FSDP2] Added set_reduce_scatter_divide_factor (#129286)
This PR adds an API, `FSDPModule.set_reduce_scatter_divide_factor`, for setting a custom gradient divide factor for reduce-scatter. This is useful when composing FSDP with other parallelisms (e.g. expert parallelism), where gradients need to be divided by a custom factor (e.g. an extra `EP` factor).
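
A minimal usage sketch (not part of this PR): it assumes `torch.distributed` is already initialized (e.g. launched via `torchrun`), and `ep_size` is a hypothetical expert-parallel degree used purely for illustration. The factor is assumed here to act as the total divisor, replacing FSDP's default division by the data-parallel world size.

```python
import torch.distributed as dist
import torch.nn as nn
from torch.distributed._composable.fsdp import fully_shard

# Toy model; a real EP setup would shard the expert modules separately.
model = nn.Sequential(nn.Linear(1024, 1024), nn.Linear(1024, 1024))
fully_shard(model)  # `model` now exposes the FSDPModule API

ep_size = 4  # hypothetical expert-parallel degree, for illustration only
# Divide reduce-scattered gradients by (world size * ep_size) instead of
# the default world size.
model.set_reduce_scatter_divide_factor(dist.get_world_size() * ep_size)
```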

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129286
Approved by: https://github.com/weifengpy
2024-07-24 12:42:35 +00:00
fsdp [FSDP2] Added set_reduce_scatter_divide_factor (#129286) 2024-07-24 12:42:35 +00:00
__init__.py
checkpoint_activation.py [BE] mypy: disallow untyped decorators (#131428) 2024-07-23 21:50:55 +00:00
contract.py [Reland][PT-D] Relaxed contract to allow Sequence[nn.Module] (#127773) (#130947) 2024-07-17 22:40:13 +00:00
fully_shard.py [BE] mypy: disallow untyped decorators (#131428) 2024-07-23 21:50:55 +00:00
replicate.py [BE] mypy: disallow untyped decorators (#131428) 2024-07-23 21:50:55 +00:00