What is an LR drop?

Alexandre Meirelles

Mar 11, 2019, 4:19:31 PM
to LCZero
What is an LR drop?

Lee Sailer

Mar 11, 2019, 4:31:06 PM
to LCZero

Gradient descent step size.

Alexander Lyashuk

Mar 11, 2019, 4:50:35 PM
to Lee Sailer, LCZero
More precisely, it's a reduction of the gradient descent step size.

Yaron Shoham

Mar 11, 2019, 4:50:49 PM
to LCZero
google learning rate

alvaro...@gmail.com

Mar 13, 2019, 5:06:39 PM
to LCZero
Qualitatively, with a lower learning rate the NN learns more slowly, but the learned weights contain less noise. You start training with a fairly high learning rate so you can make quick progress, and at some point you see that the NN is no longer learning anything new. By dropping the learning rate you allow the learning procedure to refine the values of the weights, and you usually see very quick progress immediately, which then slows down again.

I hope that helps you understand.
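The plateau-then-refine behavior described above shows up even in a toy problem. Here is a minimal sketch (the objective, noise level, and drop step are all made up for illustration) of noisy gradient descent on f(w) = w², where cutting the step size lowers the noise floor the loss settles at:

```python
import random

def noisy_grad(w, noise_std=1.0):
    # Gradient of f(w) = w**2, plus Gaussian noise standing in for
    # the sampling noise of minibatch / self-play training.
    return 2 * w + random.gauss(0.0, noise_std)

random.seed(0)
w, lr = 5.0, 0.1
losses = []
for step in range(300):
    if step == 150:    # the "LR drop": cut the step size by 10x
        lr *= 0.1
    w -= lr * noisy_grad(w)
    losses.append(w * w)

before = sum(losses[100:150]) / 50   # average loss at the high LR
after = sum(losses[250:300]) / 50    # lower noise floor after the drop
```

With the high learning rate the loss stops improving because each step overshoots by an amount proportional to the step size; after the drop it quickly settles roughly an order of magnitude lower, mirroring the "very quick progress immediately" described above.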

Jon Kochenbauer

Mar 15, 2019, 3:12:37 AM
to LCZero
Each iteration, the network weights are adjusted in the direction that makes the losing moves from that training batch less likely to be chosen.

The learning rate is a parameter which controls how much to adjust the weights by. At the beginning of training, the weights are random and relatively large adjustments are appropriate. As the weights converge towards a locally optimal solution, the adjustments need to be smaller or the weights will overshoot their optimal values.

An LR drop (learning rate drop) is when the parameter controlling the size of the weight adjustments is reduced.

When it is done at the right time, the adjustments stop overshooting, so progress improves for a while. By the time of the second LR drop the weights are close to the local optimum, so it is generally the last phase, squeezing out a few more Elo (about 30-40 in the case of T30).
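The schedule being described is piecewise-constant "step decay": train at one rate, then multiply it down at each drop. A minimal sketch, with made-up boundaries and multipliers (illustrative defaults, not the actual T30 training config):

```python
def lr_at(step, base_lr=0.1, boundaries=(100_000, 300_000), drop=0.1):
    """Return the learning rate at a given training step.

    The LR is multiplied by `drop` each time training passes one of
    the `boundaries` -- i.e. one "LR drop" per boundary. All numbers
    here are illustrative, not Leela's real schedule.
    """
    lr = base_lr
    for boundary in boundaries:
        if step >= boundary:
            lr *= drop
    return lr
```

So with these defaults, steps before the first boundary train at 0.1, the middle phase at 0.01, and the final phase, the one squeezing out the last bit of strength, at 0.001.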
