pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-14 20:57:59 +00:00

Will Constable 84416618a6 [Pipelining] Update schedules to use I, B actions. (#138886) Also, update tests to use I (BACKWARD_INPUT) vs B (FULL_BACKWARD) consistently. Previously, schedules would issue a 'B' operation and leave it ambiguous whether that operation should be BACKWARD_INPUT or FULL_BACKWARD; a separate flag (use_full_backward) passed to the schedule class determined which behavior was taken at runtime. Now, use_full_backward is removed and the schedule class is required to produce unambiguous IR. The logic for 'use_full_backward' is removed from the runtime. _validate_pipeline_order is replaced with _simulate_comms_compute. Both offer similar functionality: validating the correctness of a schedule IR. 'validate' operates on compute-only IR, while 'simulate' operates on compute + comm IR. To convert from using 'validate' to 'simulate', you first have to insert comm actions via '_add_send_recv'. 'simulate' was inefficiently written before this PR and needed to be optimized to run quickly for the extra-large schedules (>32 ranks and microbatches per rank) used in some unit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/138886 Approved by: https://github.com/H-Huang		2024-11-01 03:54:06 +00:00
..
_composable	[Device] Replace hardcoded devices with 'torch._C._get_accelerator()' (#139032)	2024-10-29 04:51:47 +00:00
_shard
_sharded_tensor
_sharding_spec
_symmetric_memory	get_symm_mem_workspace(): print helpful error during graph capture (#138028)	2024-10-30 18:11:09 +00:00
_tensor
_tools
algorithms	Make DDP Quantization hooks backend Agnostic (#138816)	2024-10-29 15:02:45 +00:00
autograd
benchmarks
checkpoint	[DCP] Unit Test to validate the stateful and non-stateful loads (#139251)	2024-10-31 01:12:51 +00:00
elastic
examples
fsdp
launcher
nn
optim
pipelining	[Pipelining] Update schedules to use I, B actions. (#138886)	2024-11-01 03:54:06 +00:00
rpc
tensor	[DTensor][Bug Fix]Fix 2D DTensor mm with mesh_shape (1, n) or (n, 1) (#139134)	2024-10-30 08:09:39 +00:00
__init__.py
_checkpointable.py
_composable_state.py
_functional_collectives.py	[c10d][Partial-Graph Overlap] Support calling .wait_tensor() on output tensor of eager `async_op=True` collective if under `allow_inflight_collective_as_graph_input_ctx()` context manager (#137763)	2024-10-29 03:31:19 +00:00
_functional_collectives_impl.py
_state_dict_utils.py
argparse_util.py
c10d_logger.py
collective_utils.py
constants.py
CONTRIBUTING.md
device_mesh.py	[DeviceMesh] fix sub mesh size calculation in create_sub_mesh() (#138945)	2024-10-29 17:56:56 +00:00
distributed_c10d.py	[c10d] allow sub group to be eagerly inited even if default one is not (#138665)	2024-10-24 23:51:28 +00:00
launch.py
logging_handlers.py
remote_device.py
rendezvous.py
run.py
utils.py