don't use no_sync when deepspeed doesn't support it for certain zero stages (#35157)

* don't use no_sync when deepspeed doesn't support it for certain zero stages
* chore: lint
* fix no_sync context for deepspeed across all zero types
* chore: lint
2026-05-14 20:58:08 +00:00 · 2024-12-13 13:23:00 -05:00 · 2024-12-13 13:23:00 -05:00 · add53e25ff
commit add53e25ff
parent 7237b3ecfc
1 changed file with 1 addition and 0 deletions
--- a/src/transformers/trainer.py
+++ b/src/transformers/trainer.py
@@ -2517,6 +2517,7 @@ class Trainer:
                    context = (
                        functools.partial(self.accelerator.no_sync, model=model)
                        if i != len(batch_samples) - 1
+                        and self.accelerator.distributed_type != DistributedType.DEEPSPEED
                        else contextlib.nullcontext
                    )
                    with context():
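
The effect of the added condition can be sketched in isolation. The snippet below is a minimal illustration, not the Trainer's actual code: `FakeAccelerator`, `pick_context`, and `run_accumulation` are hypothetical names, and the local `DistributedType` enum is a two-member stand-in for the one in accelerate. It assumes only that `no_sync` is a context manager taking a `model` keyword, as the diff shows.

```python
import contextlib
import functools
from enum import Enum


class DistributedType(Enum):
    # Stand-in for accelerate's DistributedType; only two members are modeled.
    MULTI_GPU = "MULTI_GPU"
    DEEPSPEED = "DEEPSPEED"


class FakeAccelerator:
    """Hypothetical stand-in for Accelerator that counts no_sync entries."""

    def __init__(self, distributed_type):
        self.distributed_type = distributed_type
        self.no_sync_calls = 0

    @contextlib.contextmanager
    def no_sync(self, model):
        # Incremented each time the context is entered.
        self.no_sync_calls += 1
        yield


def pick_context(accelerator, model, i, num_batches):
    # Same conditional as in the diff: use no_sync for every micro-batch
    # except the last one, and never when the backend is DeepSpeed.
    return (
        functools.partial(accelerator.no_sync, model=model)
        if i != num_batches - 1
        and accelerator.distributed_type != DistributedType.DEEPSPEED
        else contextlib.nullcontext
    )


def run_accumulation(accelerator, num_batches=4):
    model = object()  # placeholder; a real model is not needed for the sketch
    for i in range(num_batches):
        with pick_context(accelerator, model, i, num_batches)():
            pass  # forward/backward for one micro-batch would go here
    return accelerator.no_sync_calls
```

With four micro-batches, the non-DeepSpeed stand-in enters `no_sync` three times (all but the last step), while the DeepSpeed stand-in never enters it and always falls through to `contextlib.nullcontext`.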