pytorch/torch/distributed/_symmetric_memory
Luca Wehrstedt 3ee655e4d4 [async-TP] Fix scheduling in matmul+reduce-scatter for 2 ranks (#145846)
There's a sleep that is issued in order to "nudge" CUDA into making the right scheduling decision, but it is issued on iteration number 2. However, when the world size is 2, we never reach that iteration, which led to suboptimal scheduling.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145846
Approved by: https://github.com/yifuwang
2025-01-30 18:26:34 +00:00
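The bug can be illustrated with a minimal sketch. This is a hypothetical reconstruction of the loop structure described in the commit message, not PyTorch's actual fused matmul+reduce-scatter implementation; the names `run_pipeline` and `nudge_iteration` are assumptions for illustration only.

```python
import time

def run_pipeline(world_size, nudge_iteration=2):
    """Simulate the per-chunk work loop and report whether the
    scheduling 'nudge' (a short sleep) was ever issued."""
    nudged = False
    for it in range(world_size):  # one iteration per rank's chunk
        if it == nudge_iteration:
            # The "nudge": a tiny sleep meant to steer CUDA toward
            # the desired scheduling. With world_size == 2 the loop
            # only runs iterations 0 and 1, so this branch is never
            # reached, which is the bug the PR fixes.
            time.sleep(0.001)
            nudged = True
        # ... issue the matmul / communication work for this chunk ...
    return nudged
```

Calling `run_pipeline(2)` returns `False` (the nudge never fires), while `run_pipeline(4)` returns `True`, reproducing why only two-rank runs were scheduled suboptimally.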
__init__.py [async-TP] Fix scheduling in matmul+reduce-scatter for 2 ranks (#145846) 2025-01-30 18:26:34 +00:00