| Name | Last commit message | Last commit date |
|---|---|---|
| _composable | [Traceable FSDP2] Change from register_multi_grad_hook to per-tensor backward hook (#126350) | 2024-05-18 04:44:29 +00:00 |
| _shard | Revert "[FX] Update type hints in torch.fx._compatibility.py (#125469)" | 2024-05-06 18:36:43 +00:00 |
| _sharded_tensor | | |
| _sharding_spec | | |
| _spmd | [dtensor] refactor view ops to use OpStrategy (#126011) | 2024-05-17 05:39:21 +00:00 |
| _tensor | [DTensor] Turn on foreach implementation for clip_grad_norm_ for DTensor by default (#126423) | 2024-05-17 06:57:52 +00:00 |
| _tools | [BE]: Try TCH autofixes on torch/ (#125536) | 2024-05-05 23:13:59 +00:00 |
| algorithms | | |
| autograd | | |
| benchmarks | | |
| checkpoint | Fix strict default value in StateDictOptions (#125998) | 2024-05-16 21:42:53 +00:00 |
| elastic | [torch-distributed] Make log directory creation idempotent (#126496) | 2024-05-18 00:17:13 +00:00 |
| examples | | |
| fsdp | [BE][FSDP] Remove unnecessary warnings (#126365) | 2024-05-16 17:34:01 +00:00 |
| launcher | torchelastic: change monitor_interval default to 0.1 (#124692) | 2024-04-24 01:44:41 +00:00 |
| nn | | |
| optim | [optim] add fused_adagrad support for CPU device (#124905) | 2024-05-16 01:11:51 +00:00 |
| pipeline | [BE]: Update ruff to v0.4.4 (#125031) | 2024-05-12 20:02:37 +00:00 |
| pipelining | Revert "[pipelining] Add pipeline stage test (#126721)" | 2024-05-21 04:40:05 +00:00 |
| rpc | | |
| tensor | [DeviceMesh] Make _validate_tp_mesh_dim support 3D (#125763) | 2024-05-08 21:22:11 +00:00 |
| __init__.py | Revert "c10d: add Collectives abstraction (#125978)" | 2024-05-20 07:40:41 +00:00 |
| _composable_state.py | | |
| _functional_collectives.py | [Traceable FSDP2] Add all_gather_into_tensor out variant (#126334) | 2024-05-16 10:27:06 +00:00 |
| _functional_collectives_impl.py | Make c10d_functional ops call into _c10d_functional ops (#124979) | 2024-04-27 08:08:02 +00:00 |
| _state_dict_utils.py | [DSD] Implement broadcast_from_rank0 option for optim state_dict (#125339) | 2024-05-08 07:22:20 +00:00 |
| argparse_util.py | | |
| c10d_logger.py | [DCP] Adds better handling in logging of specific kwargs (#123658) | 2024-04-11 21:09:38 +00:00 |
| collective_utils.py | | |
| constants.py | | |
| CONTRIBUTING.md | | |
| device_mesh.py | [DeviceMesh] Supported N groups in from_group (#126258) | 2024-05-17 01:03:21 +00:00 |
| distributed_c10d.py | [C10D] make get_node_local_rank() accept fallback_rank (#126737) | 2024-05-21 03:38:02 +00:00 |
| launch.py | [Docs][Distributed] Add migration notes for --local-rank option style change for torchrun in PyTorch 2.0 (#109480) | 2024-04-16 05:51:57 +00:00 |
| logging_handlers.py | | |
| remote_device.py | | |
| rendezvous.py | Fix public binding to actually traverse modules (#126103) | 2024-05-15 19:36:03 +00:00 |
| run.py | [BE]: Improve exception typing. Remove NOQAs (#125535) | 2024-05-08 14:07:13 +00:00 |
| utils.py | | |