Hi all,
I am trying to set a constant learning rate for the batch-norm parameters (gamma, beta). I couldn't find a way to do this in the prototxt, so I was trying to hardcode it in SGDSolver.
I am a bit confused by this comment in common_layers.hpp. As far as I can tell there are only two learnable params (gamma, beta), so why does it say lr_mult: 0? And if the multiplier is zero, what learning rate is actually used to train the batch-norm parameters?
* By default, during training time, the network is computing global mean/
* variance statistics via a running average, which is then used at test
* time to allow deterministic outputs for each input. You can manually
* toggle whether the network is accumulating or using the statistics via the
* use_global_stats option. IMPORTANT: for this feature to work, you MUST
* set the learning rate to zero for all three parameter blobs, i.e.,
* param {lr_mult: 0} three times in the layer definition.
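For reference, here is the kind of layer definition the quoted comment is describing (a sketch; the layer and blob names are placeholders). Note that in Caffe the BatchNorm layer's three parameter blobs hold the running mean, running variance, and moving-average factor, while the gamma/beta scaling is handled by a separate Scale layer:

```protobuf
layer {
  name: "bn1"          # placeholder name
  type: "BatchNorm"
  bottom: "conv1"      # placeholder bottom blob
  top: "conv1"
  # One param block per internal blob (mean, variance, moving-average
  # factor). lr_mult: 0 freezes each blob so the solver never updates
  # the accumulated statistics; they are maintained by the layer itself.
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
}
```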