NAdamW, which is simply NAdam with the AdamW weight decay term, has shown strong performance in optimizer comparisons such as:

1. https://arxiv.org/abs/2211.09760
2. https://arxiv.org/abs/2306.07179

[The VeLO paper](https://arxiv.org/abs/2211.09760) argues that its power lies in its ability to act as a superset of other popular optimizers.

This PR adds NAdamW by ~~copying and making very small adaptations to the NAdam implementation (just like AdamW and Adam). To see the small changes in better detail, you can `diff torch/optim/nadam.py torch/optim/nadamw.py`.~~ adding a boolean flag `decoupled_weight_decay` to NAdam that activates NAdamW behavior (`False` by default).

Interest in the optimizer has also been shown in the PyTorch forums: https://discuss.pytorch.org/t/nadamw-and-demon-optimizers/179778

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103881
Approved by: https://github.com/janeyx99
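To illustrate what the `decoupled_weight_decay` flag changes, here is a minimal plain-Python sketch (not the PyTorch implementation) contrasting classic L2 weight decay, which folds the decay into the gradient, with AdamW-style decoupled decay, which shrinks the parameter directly. With a plain SGD step the two coincide; the difference appears once the gradient is adaptively rescaled, so a simplified second-moment accumulator stands in for NAdam's:

```python
def coupled_step(param, grad, v, lr=0.1, wd=0.01, eps=1e-8):
    """Classic L2 decay (the NAdam default): the decay term is added to
    the gradient, so it is also divided by the adaptive denominator."""
    g = grad + wd * param
    v = v + g * g                      # simplified second-moment update
    return param - lr * g / (v ** 0.5 + eps), v

def decoupled_step(param, grad, v, lr=0.1, wd=0.01, eps=1e-8):
    """AdamW-style decay (decoupled_weight_decay=True): the parameter
    shrinks directly, outside the adaptive rescaling."""
    v = v + grad * grad
    param = param - lr * wd * param    # decay applied to the weight itself
    return param - lr * grad / (v ** 0.5 + eps), v

p_coupled, _ = coupled_step(1.0, 0.5, 0.0)
p_decoupled, _ = decoupled_step(1.0, 0.5, 0.0)
print(p_coupled, p_decoupled)  # the two updates differ
```

In actual use, per this PR the NAdamW behavior is requested on `NAdam` itself, e.g. `torch.optim.NAdam(model.parameters(), weight_decay=0.01, decoupled_weight_decay=True)` (other arguments shown are the usual optimizer parameters, not specific to this change).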