to Caffe Users
Hi everyone, I am trying to create a network based on an article, but I didn't understand this sentence: "I divide the learning rate by 10 when the error plateaus". I have two questions about it: 1) What does "the error plateaus" mean? And most importantly, 2) how can I tell my network to divide the learning rate when the error plateaus? Thanks for your help
Przemek D
Mar 14, 2017, 7:54:52 AM3/14/17
to Caffe Users
When someone says the error plateaus, they mean that the error function does not decrease anymore (the plot becomes flat). In Caffe you cannot detect this dynamically - you can only specify LR decrease points before training. When someone says they did something with the LR when the error plateaued, they most likely mean that they interrupted the training, changed some settings and relaunched training from a saved solver state.
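Specifying LR decrease points before training can be sketched in the solver configuration with the `multistep` policy. A minimal example - the iteration numbers and file names below are placeholders, not values from the article; pick the drop points from your own training curves:

```protobuf
# solver.prototxt (sketch - values are placeholders)
net: "train_val.prototxt"
base_lr: 0.01
lr_policy: "multistep"
gamma: 0.1               # multiply the LR by 0.1 at each stepvalue
stepvalue: 20000         # first drop, placed where you expect a plateau
stepvalue: 40000         # second drop
max_iter: 60000
snapshot: 5000           # save solver states so training can be resumed
snapshot_prefix: "snapshots/mynet"
```

To do it manually instead, as described above, you stop training and resume from a saved state with adjusted settings, e.g. `caffe train --solver=solver.prototxt --snapshot=snapshots/mynet_iter_20000.solverstate`.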
mahdi yed
Mar 14, 2017, 9:36:02 AM3/14/17
to Caffe Users
Thank you so much Przemek D, now things are clear - so they do this manually. Just one last question: based on your explanation I understand that we can say the error plateaus when it does not decrease anymore, but is there some fixed evaluation period? For example, do we divide the learning rate if the error doesn't change for 2-5 epochs?
Przemek D
Mar 14, 2017, 10:20:29 AM3/14/17
to Caffe Users
I don't think there is any absolute rule for that. I found that for many models it doesn't really matter whether you wait 2 epochs or 10 (in my experience classification tasks behave this way). However, with autoencoders for example I noticed the loss function fluctuating around some value for a long time before suddenly dropping to a lower value over the course of 1-2 epochs - there it was worth the wait. To a newcomer I'd say: don't bother with stopping and resuming training, just set the LR to decrease x0.1 every 10-20 epochs and see where it gets you. Unless your network's performance is critical, the manual approach is an unnecessary hassle.
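The fixed-schedule advice above maps directly to Caffe's `step` policy. A sketch - the epoch-to-iteration conversion depends on your dataset size and batch size, so the numbers here are assumptions for illustration only:

```protobuf
# solver.prototxt fragment (sketch)
# Assuming e.g. 50000 training images and batch_size 100,
# one epoch = 500 iterations, so 15 epochs = 7500 iterations.
base_lr: 0.01
lr_policy: "step"
gamma: 0.1        # divide the LR by 10 ...
stepsize: 7500    # ... every 7500 iterations (~15 epochs in this example)
max_iter: 30000
```

With `step` the drop repeats at a fixed interval, whereas `multistep` lets you place each drop individually.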