pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-14 20:57:59 +00:00

Alexander Zinoviev ee713f80ed Enable channels_last format for FSDP (#137382)	2024-10-11 03:47:16 +00:00

Enable FSDP to deal with channels_last memory formatted tensors. Preserving channels_last memory format makes FSDP compatible with the best kernels CUDNN offers.

Summary of changes:
1) Store strides information along with shapes
2) Replace calls to flatten() with as_strided(size=(param.numel(),), stride=(1,)) for flattening
3) Replace calls to view() with as_strided with the stored sizes and strides for unflattening

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137382
Approved by: https://github.com/awgu
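The flatten/unflatten change summarized in the commit message can be illustrated with a small standalone sketch. This is not FSDP's actual code: the helper names flatten_keep_layout and unflatten_keep_layout are hypothetical, and the sketch assumes the parameter is dense (its elements fill numel() consecutive storage slots), which holds for a channels_last tensor.

```python
import torch


def flatten_keep_layout(param: torch.Tensor):
    """Return a 1-D view over param's memory plus the metadata to undo it.

    Hypothetical helper, not part of FSDP. Assumes `param` is dense, so a
    stride-1 view of length numel() covers exactly its elements.
    """
    size, stride = param.size(), param.stride()
    # View the underlying storage in physical order; no copy is made.
    flat = param.as_strided(size=(param.numel(),), stride=(1,))
    return flat, size, stride


def unflatten_keep_layout(flat: torch.Tensor, size, stride) -> torch.Tensor:
    """Rebuild the original view from the flat buffer and stored metadata."""
    return flat.as_strided(size=size, stride=stride)


def demo():
    # Build an NCHW-shaped tensor whose memory is laid out as channels_last.
    x = torch.arange(2 * 3 * 4 * 5, dtype=torch.float32).reshape(2, 3, 4, 5)
    x = x.contiguous(memory_format=torch.channels_last)
    flat, size, stride = flatten_keep_layout(x)
    restored = unflatten_keep_layout(flat, size, stride)
    return x, flat, restored
```

In this sketch the restored tensor shares storage with the input and keeps its channels_last strides. By contrast, calling view() with only the stored shape on the flat buffer would reinterpret the same memory in the default contiguous layout and yield different logical values, which is presumably why the commit stores strides alongside shapes.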
test_checkpoint_wrapper.py
test_distributed_checkpoint.py
test_fsdp_apply.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_backward_prefetch.py
test_fsdp_checkpoint.py
test_fsdp_clip_grad_norm.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_comm.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_comm_hooks.py
test_fsdp_core.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_dtensor_state_dict.py
test_fsdp_exec_order.py
test_fsdp_fine_tune.py
test_fsdp_flatten_params.py	Enable channels_last format for FSDP (#137382)	2024-10-11 03:47:16 +00:00
test_fsdp_freezing_weights.py
test_fsdp_fx.py
test_fsdp_grad_acc.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_hybrid_shard.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_ignored_modules.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_input.py
test_fsdp_memory.py
test_fsdp_meta.py
test_fsdp_misc.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_mixed_precision.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_multiple_forward.py
test_fsdp_multiple_wrapping.py
test_fsdp_optim_state.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_overlap.py
test_fsdp_pure_fp16.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_sharded_grad_scaler.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_state_dict.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_tp_integration.py
test_fsdp_traversal.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_uneven.py
test_fsdp_unshard_params.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00
test_fsdp_use_orig_params.py	Add Triton CPU as an Inductor backend (#133408)	2024-09-30 20:24:52 +00:00
test_hsdp_dtensor_state_dict.py
test_shard_utils.py
test_utils.py
test_wrap.py	Generalization of FSDP common for non-cuda execution (#133209)	2024-09-27 00:38:10 +00:00