I start with a learning rate of 0.01 and reduce it to 0.001 when accuracy on the evaluation set stops improving. Which lr_policy should I choose? I am a beginner; thanks for your patience.
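To my knowledge, Caffe has no plateau-triggered policy, so the closest built-in option for a single 0.01 → 0.001 drop is "multistep" with a manually chosen drop iteration. A minimal solver.prototxt sketch (the stepvalue below is a placeholder, not a recommendation):

```
# Hypothetical solver.prototxt fragment: drop the lr by 10x once,
# at a hand-picked iteration, approximating a plateau-based schedule.
base_lr: 0.01
lr_policy: "multistep"
gamma: 0.1            # multiply the lr by 0.1 at each stepvalue
stepvalue: 10000      # placeholder iteration for the drop
```

In practice you would run once, note roughly where evaluation accuracy plateaus, and set stepvalue near that iteration.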
Dinesh
Mar 26, 2015, 10:36:55 AM
Optimizing learning-schedule hyperparameters for a bunch of baselines is cumbersome in practice. I find that starting with a learning rate around 1e-3 and using a solver like ADAGRAD or NESTEROV that adapts the learning rate by itself does reasonably well and spares me the headache of huge grid searches.
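As a sketch of what that looks like in a 2015-era Caffe solver file (the net path and solver choice below are placeholders for illustration):

```
# Hypothetical solver.prototxt fragment: fixed base lr with an adaptive solver,
# so no step/multistep schedule needs to be tuned.
net: "train_val.prototxt"   # placeholder net definition
base_lr: 0.001
lr_policy: "fixed"
solver_type: ADAGRAD        # or NESTEROV (with momentum: 0.9)
```

ADAGRAD scales each parameter's effective step by its accumulated gradient history, which is why a single base_lr tends to be enough.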