to Caffe Users
Hi everyone, I am trying to create a network based on an article, but I didn't understand this sentence: "I divide the learning rate by 10 when the error plateaus". I have two questions about it: 1) What does "the error plateaus" mean? And most importantly, 2) how can I tell my network to divide the learning rate when the error plateaus? Thanks for your help
Przemek D
Mar 14, 2017, 7:54:52 AM3/14/17
to Caffe Users
When someone says the error plateaus, they mean that the error function does not decrease anymore (the plot becomes flat). In Caffe you cannot detect this dynamically - you can only specify LR decrease points before training. When someone says they did something with the LR when the error plateaued, they most likely mean that they interrupted the training, changed some settings and relaunched training from a saved solver state.
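Specifying LR decrease points before training can be sketched in the solver configuration with the `multistep` policy. A minimal example - the iteration numbers and file names below are placeholders, not values from the article; pick the drop points from your own training curves:

```protobuf
# solver.prototxt (sketch - values are placeholders)
net: "train_val.prototxt"
base_lr: 0.01
lr_policy: "multistep"
gamma: 0.1               # multiply the LR by 0.1 at each stepvalue
stepvalue: 20000         # first drop, placed where you expect a plateau
stepvalue: 40000         # second drop
max_iter: 60000
snapshot: 5000           # save solver states so training can be resumed
snapshot_prefix: "snapshots/mynet"
```

To do it manually instead, as described above, you stop training and resume from a saved state with adjusted settings, e.g. `caffe train --solver=solver.prototxt --snapshot=snapshots/mynet_iter_20000.solverstate`.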
mahdi yed
Mar 14, 2017, 9:36:02 AM3/14/17
to Caffe Users
Thank you so much Przemek D, now things are clear - so they do this manually. Just one last question: based on your explanation I understand that we can say the error plateaus when it does not decrease anymore, but is there some fixed evaluation period? For example, do we divide the learning rate if the error doesn't change for 2-5 epochs?
Przemek D
Mar 14, 2017, 10:20:29 AM3/14/17
to Caffe Users
I don't think there is any absolute rule for that. I found that for many models it doesn't really matter whether you wait 2 epochs or 10 (in my experience classification tasks behave this way). However, with autoencoders for example I noticed the loss function fluctuating around some value for a long time before suddenly dropping to a lower value over the course of 1-2 epochs - there it was worth the wait. To a newcomer I'd say: don't bother with stopping and resuming training, just set the LR to decrease x0.1 every 10-20 epochs and see where it gets you. Unless your network's performance is critical, the manual approach is an unnecessary hassle.
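The fixed-schedule advice above maps directly to Caffe's `step` policy. A sketch - the epoch-to-iteration conversion depends on your dataset size and batch size, so the numbers here are assumptions for illustration only:

```protobuf
# solver.prototxt fragment (sketch)
# Assuming e.g. 50000 training images and batch_size 100,
# one epoch = 500 iterations, so 15 epochs = 7500 iterations.
base_lr: 0.01
lr_policy: "step"
gamma: 0.1        # divide the LR by 10 ...
stepsize: 7500    # ... every 7500 iterations (~15 epochs in this example)
max_iter: 30000
```

With `step` the drop repeats at a fixed interval, whereas `multistep` lets you place each drop individually.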