Hello,
I have been fine-tuning FCN models on my own data with both Caffe and MatConvNet. In both cases the initial weights come from a PASCAL VOC pre-trained CNN. I have observed a huge difference in the optimum learning rate between the two frameworks. In Caffe the optimum learning rate is around 1e-12, and with anything above 1e-10 the net does not learn at all. In MatConvNet, on the other hand, the optimum learning rate is around 1e-4.
Any ideas about that difference? Isn't 1e-12 an extremely small learning rate?
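One possible factor I have been wondering about (this is just a guess on my part): if one framework sums the per-pixel softmax losses over the whole image while the other averages them, the gradient magnitudes differ by a factor equal to the number of pixels, so the learning rate would have to shrink by the same factor to get comparable updates. A toy NumPy sketch, with made-up sizes, illustrating that scale difference:

```python
import numpy as np

# Toy "segmentation" problem: 1 image, 2 classes, 500x500 pixels.
# (Hypothetical sizes, chosen only to illustrate the scale.)
rng = np.random.default_rng(0)
num_pixels = 500 * 500
logits = rng.normal(size=(num_pixels, 2))
labels = rng.integers(0, 2, size=num_pixels)

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Gradient of cross-entropy w.r.t. logits is (softmax - one_hot) per pixel.
p = softmax(logits)
grad = p.copy()
grad[np.arange(num_pixels), labels] -= 1.0

summed_norm = np.abs(grad).sum()              # loss summed over pixels (unnormalized)
mean_norm = np.abs(grad).sum() / num_pixels   # loss averaged per pixel

print(summed_norm / mean_norm)  # ratio equals num_pixels, i.e. 2.5e5 here
```

That factor alone would not fully explain 1e-12 vs 1e-4, but combined with batch size it could account for a large part of the gap. Does anyone know whether the two implementations actually differ in this normalization?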
The FCN models I'm using are:
- Caffe:
https://github.com/shelhamer/fcn.berkeleyvision.org
- MatConvNet:
https://github.com/vlfeat/matconvnet-fcn
Aside from my own experiments, I have noticed the same kind of spread in the learning rates reported in several text localization papers. The authors do not specify which framework they trained with, but the differences match the ones I have observed:
- Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network - Tong He (1e-10)
- Synthetic Data for Text Localisation in Natural Images - Ankush Gupta (1e-4)
Thank you