mirror of
https://github.com/saymrwulf/pytorch.git
synced 2026-05-14 20:57:59 +00:00
# Motivation
## for `torch.amp.GradScaler`,
- `torch.cpu.amp.GradScaler(args...)` is completely equivalent to `torch. amp.GradScaler("cpu", args...)`.
- `torch.cuda.amp.GradScaler(args...)` is completely equivalent to `torch.amp.GradScaler("cuda", args...)`.
So, we intend to depreate them and **strongly recommend** developer to use `torch.amp.GradScaler`.
## for `custom_fwd` and `custom_bwd`,
this is a good solution to make the custom function run with or without effect even in an autocast-enabled region and can be shared by other backends, like CPU and XPU.
So we generalize it to be device-agnostic and put them int `torch/amp/autocast_mode.py` and re-expose to `torch.amp.custom_fwd` and `torch.amp.custom_bwd`. Meanwhile, we deprecate `torch.cuda.amp.custom_fwd` and `torch.cuda.amp.custom_bwd`.
# Additional Context
Add UT to cover the deprecated warning.
No need for more UTs to cover the functionality of `torch.amp.custom_f/bwd`, the existing UTs that previously covered the functionality of `torch.cuda.amp.custom_f/bwd` can cover them.
To facilitate the review, we separate these code changes to two PRs. The first PR cover `torch.amp.GradScaler`. The follow-up covers `custom_fwd` and `custom_bwd`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126527
Approved by: https://github.com/jgong5, https://github.com/gujinghui, https://github.com/janeyx99, https://github.com/EikanWang
|
||
|---|---|---|
| .. | ||
| _composable | ||
| _shard | ||
| _spmd | ||
| _tensor | ||
| _tools | ||
| algorithms | ||
| bin | ||
| checkpoint | ||
| elastic | ||
| fsdp | ||
| launcher | ||
| nn/jit | ||
| optim | ||
| pipeline/sync | ||
| pipelining | ||
| rpc | ||
| tensor/parallel | ||
| argparse_util_test.py | ||
| test_c10d_common.py | ||
| test_c10d_functional_native.py | ||
| test_c10d_gloo.py | ||
| test_c10d_logger.py | ||
| test_c10d_nccl.py | ||
| test_c10d_object_collectives.py | ||
| test_c10d_ops_nccl.py | ||
| test_c10d_pypg.py | ||
| test_c10d_spawn.py | ||
| test_c10d_spawn_gloo.py | ||
| test_c10d_spawn_nccl.py | ||
| test_c10d_spawn_ucc.py | ||
| test_c10d_ucc.py | ||
| test_collective_utils.py | ||
| test_compute_comm_reordering.py | ||
| test_control_collectives.py | ||
| test_cuda_p2p.py | ||
| test_data_parallel.py | ||
| test_device_mesh.py | ||
| test_distributed_spawn.py | ||
| test_dynamo_distributed.py | ||
| test_fake_pg.py | ||
| test_functional_api.py | ||
| test_inductor_collectives.py | ||
| test_launcher.py | ||
| test_multi_threaded_pg.py | ||
| test_nccl.py | ||
| test_pg_wrapper.py | ||
| test_store.py | ||