Fine-tuning: Learning Rates

Adarsh Chauhan

Feb 9, 2016, 9:16:52 AM

to torch7

Hey everyone,

I am curious about learning rates when training neural networks. I read somewhere that when fine-tuning conv-nets, the learning rate should be kept low.

How should one decide what the learning rate should be?
On what parameters does it theoretically depend?
Apart from theory, are there any practical recommendations for choosing a particular value or range for the learning rate to get better accuracy in training/testing?

Any good literature or relevant research on this would be appreciated.
Thanks in advance.

Regards,
Adarsh.

Sudipto Banerjee

Feb 9, 2016, 10:13:37 AM

to torch7

http://cs231n.github.io/transfer-learning/ might be a good resource to start from. Generally, the learning rates are kept low for the lower layers (because their weights are already learned), and the fully connected layers at the end get a slightly higher rate than the base rate. But then again, it depends on the application and the dataset that you are fine-tuning on. I suggest you follow the link and decide for yourself. Other than that, it's mostly experimental.
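To make the "lower rate for lower layers, higher rate for the classifier" idea concrete, here is a minimal sketch in plain Python rather than Torch. The group names, the 0.1x and 10x multipliers, and the parameter and gradient values are all made up for illustration; they are not recommendations. In Torch itself, I believe `optim.sgd` accepts a per-parameter `learningRates` tensor in its config for this purpose, but please check the optim documentation before relying on that.

```python
# Hypothetical base rate and multipliers, chosen only for illustration.
base_lr = 1e-3

# Each group carries its own learning rate. "conv_layers" stands in for the
# pre-trained lower layers, "classifier" for the newly added top layers.
param_groups = [
    {"name": "conv_layers", "lr": base_lr * 0.1, "params": [0.5, -0.3]},
    {"name": "classifier", "lr": base_lr * 10, "params": [0.2, 0.1]},
]


def sgd_step(groups, grads):
    """Apply one plain SGD update, using each group's own learning rate."""
    for group in groups:
        group_grads = grads[group["name"]]
        group["params"] = [
            p - group["lr"] * g for p, g in zip(group["params"], group_grads)
        ]


# Made-up gradients: identical for both groups, so any difference in the
# size of the update comes only from the per-group learning rate.
grads = {"conv_layers": [1.0, 1.0], "classifier": [1.0, 1.0]}
sgd_step(param_groups, grads)

for group in param_groups:
    print(group["name"], group["lr"], group["params"])
```

With the same gradient, the classifier weights move 100 times further per step than the conv weights, which is the whole point: the pre-trained features are only nudged while the new layers are allowed to learn quickly.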

Adarsh Chauhan

Feb 10, 2016, 2:12:25 PM

to torch7

Sudipto,

Thanks for the link, though I had already gone through it before posting this question.

Right now, I am trying to fine-tune a pre-trained network with a learning rate of 1e-3. What prompted this question was my uncertainty about whether this is a good learning rate for my purpose. I am new to the field of deep learning and was looking for some direction in this regard.
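A common way to check whether a value such as 1e-3 is reasonable is to compare a few candidates spaced on a log scale and keep the one that gives the lowest loss after a short run. The sketch below does this on a toy problem, gradient descent on f(w) = w**2, purely to show the shape of the experiment. The candidate list, the step count, and the toy loss are assumptions; on a real network the comparison would use validation loss or accuracy, and the best value here says nothing about the best value there.

```python
def final_loss(lr, steps=100):
    """Run gradient descent on the toy loss f(w) = w**2; return the final loss."""
    w = 1.0
    for _ in range(steps):
        grad = 2.0 * w  # derivative of w**2
        w -= lr * grad
    return w * w


# Candidate learning rates spaced by factors of 10 (illustrative values only).
candidates = [1e-4, 1e-3, 1e-2, 1e-1]

results = {lr: final_loss(lr) for lr in candidates}
best_lr = min(results, key=results.get)

for lr in candidates:
    print(f"lr={lr:g}  final loss={results[lr]:.3e}")
print("lowest final loss at lr =", best_lr)
```

On this toy loss the largest candidate wins because the problem is trivially well conditioned. On a real network, a rate that is too high typically makes the loss oscillate or diverge, so the sweep usually shows a sweet spot rather than "bigger is better".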

A related question is how to set different learning rates for different layers, since for fine-tuning it is generally recommended that the lower convolutional layers have lower learning rates while the classifier layers have relatively higher ones. Can someone please give me an example of how and where to adjust the learning rates?

Thanks in advance.

Regards,
Adarsh.
