Caffe Batch Normalization: lr_mult confusion

Siddharth Mohan

Dec 4, 2015, 10:37:52 PM
to Caffe Users
Hi all, 

I am trying to set a constant learning rate for the batch_norm parameters (gamma, beta). I couldn't find a way to do it in the prototxt, so I was trying to hardcode it in SGDSolver.
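As far as I can tell, a layer's param blocks only give a per-blob multiplier on the solver's base rate (effective rate = base_lr * lr_mult, still subject to the lr_policy schedule), not a fixed constant rate. A sketch of what I mean, with a hypothetical convolution layer:

layer {
  name: "conv1"          # hypothetical layer, for illustration only
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 }   # multiplier for the weight blob
  param { lr_mult: 2 }   # multiplier for the bias blob
  convolution_param { num_output: 64 kernel_size: 3 }
}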

I am a bit confused by this comment in common_layers.hpp. There are only two params (gamma, beta), so why does it say to set lr_mult: 0 for three blobs? And if the multiplier is zero, what learning rate is actually used to train the batch norm parameters?


 * By default, during training time, the network is computing global mean/
 * variance statistics via a running average, which is then used at test
 * time to allow deterministic outputs for each input.  You can manually
 * toggle whether the network is accumulating or using the statistics via the
 * use_global_stats option.  IMPORTANT: for this feature to work, you MUST
 * set the learning rate to zero for all three parameter blobs, i.e.,
 * param {lr_mult: 0} three times in the layer definition.
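For reference, the layer definition that comment describes would look something like this (just a sketch; the layer and blob names are placeholders):

layer {
  name: "bn1"            # placeholder name
  type: "BatchNorm"
  bottom: "conv1"        # placeholder input blob
  top: "bn1"
  # One param block per internal parameter blob; lr_mult: 0 tells the
  # solver never to update that blob.
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
}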