From d1fe68e70bb72a425667c63007a68cfb2ea6fc4b Mon Sep 17 00:00:00 2001
From: Ilqar Ramazanli
Date: Fri, 23 Apr 2021 09:33:22 -0700
Subject: [PATCH] To add single and chained learning schedulers to docs (#56705)

Summary:
In the optimizer documentation, many of the learning rate scheduler
[examples](https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate)
are provided according to a generic template. In this PR we provide a precise,
simple use-case example showing how to use a learning rate scheduler.
Moreover, a follow-up example shows how to chain two schedulers one after the other.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56705

Reviewed By: ezyang

Differential Revision: D27966704

Pulled By: iramazanli

fbshipit-source-id: f32b2d70d5cad7132335a9b13a2afa3ac3315a13
---
 docs/source/optim.rst | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/docs/source/optim.rst b/docs/source/optim.rst
index e64945fe462..b8ef01f8ec1 100644
--- a/docs/source/optim.rst
+++ b/docs/source/optim.rst
@@ -146,6 +146,45 @@ allows dynamic learning rate reducing based on some validation measurements.
 Learning rate scheduling should be applied after optimizer's update; e.g., you
 should write your code this way:
 
+Example::
+
+    model = [Parameter(torch.randn(2, 2, requires_grad=True))]
+    optimizer = SGD(model, 0.1)
+    scheduler = ExponentialLR(optimizer, gamma=0.9)
+
+    for epoch in range(20):
+        for input, target in dataset:
+            optimizer.zero_grad()
+            output = model(input)
+            loss = loss_fn(output, target)
+            loss.backward()
+            optimizer.step()
+        scheduler.step()
+
+Most learning rate schedulers can be called back-to-back (also referred to as
+chaining schedulers). The result is that each scheduler is applied one after the
+other on the learning rate obtained by the one preceding it.
+
+Example::
+
+    model = [Parameter(torch.randn(2, 2, requires_grad=True))]
+    optimizer = SGD(model, 0.1)
+    scheduler1 = ExponentialLR(optimizer, gamma=0.9)
+    scheduler2 = MultiStepLR(optimizer, milestones=[30,80], gamma=0.1)
+
+    for epoch in range(20):
+        for input, target in dataset:
+            optimizer.zero_grad()
+            output = model(input)
+            loss = loss_fn(output, target)
+            loss.backward()
+            optimizer.step()
+        scheduler1.step()
+        scheduler2.step()
+
+In many places in the documentation, we will use the following template to refer to schedulers
+algorithms.
+
 >>> scheduler = ...
 >>> for epoch in range(100):
 >>>     train(...)
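
The snippets added by this patch deliberately use placeholder names (`dataset`, `loss_fn`, a bare list of `Parameter`s), so they are not runnable as-is. Below is a minimal, self-contained sketch of the chained-scheduler pattern the patch documents; the `nn.Linear` toy model, synthetic dataset, and printed learning rates are illustrative assumptions, not part of the PR.

```python
# Minimal runnable sketch of chaining two LR schedulers (assumptions: toy model and data).
import torch
from torch import nn
from torch.optim import SGD
from torch.optim.lr_scheduler import ExponentialLR, MultiStepLR

model = nn.Linear(2, 2)          # stand-in for the docs' parameter list
loss_fn = nn.MSELoss()           # stand-in for the docs' loss_fn
optimizer = SGD(model.parameters(), lr=0.1)
scheduler1 = ExponentialLR(optimizer, gamma=0.9)
scheduler2 = MultiStepLR(optimizer, milestones=[30, 80], gamma=0.1)

# Synthetic dataset: a few (input, target) batches.
dataset = [(torch.randn(4, 2), torch.randn(4, 2)) for _ in range(5)]

for epoch in range(20):
    for input, target in dataset:
        optimizer.zero_grad()
        output = model(input)
        loss = loss_fn(output, target)
        loss.backward()
        optimizer.step()
    # Schedulers step once per epoch, after the optimizer's updates;
    # each one adjusts the learning rate left by the previous one.
    scheduler1.step()
    scheduler2.step()
    print(epoch, scheduler2.get_last_lr())
```

With `milestones=[30, 80]` and only 20 epochs, `MultiStepLR` never fires here, so the printed learning rate simply decays by a factor of 0.9 per epoch; extending the loop past epoch 30 would show both schedulers compounding on the same optimizer.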