Hi-
When fine-tuning the GoogLeNet models, how do you handle the first two auxiliary loss classifiers? Do you reduce their learning rates, or delete them entirely?
I was fine-tuning for a 3-class classifier, and did the following:
in quick_solver.prototxt (snippet below):
- decrease base_lr by a factor of 100
- decrease max_iter to 10,000
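For concreteness, the relevant solver lines now read roughly as follows (assuming the stock quick_solver base_lr of 0.01; everything else left at its defaults):

# fine-tuning: base learning rate cut 100x, far fewer iterations
base_lr: 0.0001
max_iter: 10000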
in train_val.prototxt (layer sketch below):
- delete all loss1/* layers
- delete all loss2/* layers
- rename the loss3/classifier layer to loss3/classifier_mod
- increase loss3/classifier_mod's learning rate multipliers (lr_mult) by a factor of 10 (from 1, 2 to 10, 20)
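For reference, the renamed classifier layer now looks roughly like this; the bottom blob, decay_mult values, and fillers are copied from the stock loss3/classifier definition (adjust if yours differ), with num_output set to 3 for my classes:

layer {
  name: "loss3/classifier_mod"          # new name, so the pretrained 1000-way weights aren't copied in
  type: "InnerProduct"
  bottom: "pool5/7x7_s1"
  top: "loss3/classifier_mod"
  param { lr_mult: 10 decay_mult: 1 }   # weights: 10x the base multiplier
  param { lr_mult: 20 decay_mult: 0 }   # biases: 20x the base multiplier
  inner_product_param {
    num_output: 3                       # 3 classes instead of 1000
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}

(The downstream loss and accuracy layers' bottom fields are updated to loss3/classifier_mod accordingly.)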
Loss decreases and test top-1 accuracy hits ~97%, but I'm wondering if my approach is optimal/sensible.
cheers,
-Steven