How do you implement batch normalization in Caffe?


Emre Can Kaya

Mar 24, 2016, 6:13:07 AM
to Caffe Users
Does anyone know how to implement batch normalization in Caffe? I know there is a "BatchNorm" layer type, and I tried it with several different configurations, but I couldn't obtain a reasonable training result.

Hossein Hasanpour

Mar 27, 2016, 5:23:47 AM
to Caffe Users
Did you also use a scaling layer after the batch normalization? As far as I know, and if I'm not mistaken, Caffe broke the Google batch normalization layer into two separate layers: batch normalization (called "BatchNorm") and a scaling layer (called "Scale").
I remember that when I used only the "BatchNorm" layer I didn't get very good results either, but when I applied a "Scale" layer after each "BatchNorm" layer, I got considerably better results.

Emre Can Kaya

Mar 27, 2016, 10:53:56 AM
to Caffe Users

How do you connect the layers? Can you give an example?
On Sunday, March 27, 2016 at 12:23:47 PM UTC+3, Hossein Hasanpour wrote:

Hossein Hasanpour

Mar 27, 2016, 12:23:23 PM
to Caffe Users
Sure,
this is how you could do it:
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
    bias_term: false
    weight_filler {
      type: "gaussian"
      std: 0.0001
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
}
layer {
  name: "scale1"
  type: "Scale"
  bottom: "bn1"
  top: "scale1"
  scale_param {
    bias_term: true
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "scale1"
  top: "relu1"
}

jakkala kalvik

Mar 27, 2016, 6:49:36 PM
to Caffe Users
Is the Scale layer available in the standard Caffe version? It's not mentioned in the layer catalog.

Emre Can Kaya

Mar 28, 2016, 4:10:44 AM
to Caffe Users

I saw somewhere else that you should set use_global_stats: false in order to be able to train it (see the sketch below). Also, in all the examples I saw, the ReLU's bottom and top were another layer (a conv). Why is there no documentation about these things? I mean, where do people learn how to do this?
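
For reference, a minimal sketch of the per-phase setup this refers to, reusing the bn1 layer from the example above. Note that recent BVLC Caffe defaults use_global_stats to false in the TRAIN phase and true in the TEST phase, so setting it explicitly is often unnecessary:

layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  batch_norm_param {
    use_global_stats: false  # TRAIN: normalize with the current mini-batch statistics
  }
  include { phase: TRAIN }
}
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  batch_norm_param {
    use_global_stats: true   # TEST: normalize with the accumulated running statistics
  }
  include { phase: TEST }
}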



On Sunday, March 27, 2016 at 7:23:23 PM UTC+3, Hossein Hasanpour wrote:

Emre Can Kaya

Mar 28, 2016, 4:28:05 AM
to Caffe Users
By the way, there is a Scale layer in the header and source files, but when we write scale_param, Caffe gives an error:

226:3 : Message type "caffe.LayerParameter" has no field named "scale_param".

Hossein Hasanpour

Mar 28, 2016, 6:30:06 AM
to Caffe Users
These are available in the latest version of Caffe (the official build).
As for the documentation, the layer catalog is not up to date; there is a lot of stuff that still needs to be addressed in the documentation.

Evgeny Nizhibitsky

Mar 28, 2016, 6:35:16 AM
to Caffe Users
You can try learning from https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto.

The Scale layer parameters, with some doc-like comments, are located right after "message ScaleParameter {" (see the sketch below).
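
As a sketch of what those ScaleParameter fields look like in a prototxt. This is just one reading of the fields in caffe.proto (filler initializes the learned multiplier, bias_filler the learned shift); the exact defaults are worth checking in the proto file itself:

layer {
  name: "scale1"
  type: "Scale"
  bottom: "bn1"
  top: "scale1"
  scale_param {
    bias_term: true           # also learn an additive shift (the beta of batch norm)
    filler { value: 1 }       # initialize the multiplier (gamma) to constant 1
    bias_filler { value: 0 }  # initialize the shift (beta) to constant 0
  }
}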

On Monday, March 28, 2016 at 11:10:44 AM UTC+3, Emre Can Kaya wrote:

Mladen Fernežir

Apr 3, 2016, 10:31:06 PM
to Caffe Users
Does anybody know what each of the three lr_mult: 0 parameters means? I know they multiply some global constants, but which three parameters are they, and in what order? Also, I believe it's now possible to use only the BatchNorm layer, with the new scale_filler and bias_filler parameters (I'm trying to figure that out at the moment).

Joshua Slocum

Apr 27, 2016, 9:15:09 PM
to Mladen Fernežir, Caffe Users
They prevent back-propagation from changing the BatchNorm parameters; those parameters are computed from the network's activations rather than learned by gradient descent (see the annotated example below).
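
To spell that out, here is the bn1 layer from earlier, annotated with the blob order used by BVLC Caffe's batch_norm layer (worth double-checking against your own version's source):

layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  param { lr_mult: 0 }  # blob 0: running mean (one value per channel)
  param { lr_mult: 0 }  # blob 1: running variance (one value per channel)
  param { lr_mult: 0 }  # blob 2: moving-average scale factor (a single scalar)
}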


T Nguyen

Sep 20, 2016, 11:15:23 PM
to Caffe Users
Is it necessary to implement BatchNorm in the test phase?


Alex

Nov 10, 2017, 2:53:05 PM
to Caffe Users
I suppose you have figured out which three they are by now?