Different learned parameters with/without group in Deconvolution layer


john1...@gmail.com

May 21, 2017, 4:55:55 AM
to Caffe Users
I found something interesting when computing the number of parameters. Take the following network with a Deconvolution layer as an example:

    layer {
      name: "data"
      type: "HDF5Data"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
      hdf5_data_param {
        source: "./list.txt"
        batch_size: 1
      }
    }
    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 8
        pad: 1
        kernel_size: 3
        stride: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
      }
    }
    layer {
      name: "deconv1"
      type: "Deconvolution"
      bottom: "conv1"
      top: "deconv1"
      convolution_param {
        num_output: 2
        bias_term: false
        pad: 1
        kernel_size: 4
        group: 2
        stride: 2
        weight_filler {
          type: "bilinear"
        }
      }
    }


Assume that my data size is 3x256x256. Then the number of parameters for the above network is

    3x3x3x8+4x4x8=200 learned parameters


However, if I remove `group: 2` from the deconvolution layer, the number of parameters becomes

    3x3x3x8+4x4x2x8=328 learned parameters


In Caffe, with `group: 2`, printing the parameter shapes gives

    Layer-wise parameters:
    [('conv1', (8, 1, 3, 3)), ('deconv1', (8, 1, 4, 4))]
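
For reference, the counts can be checked with a small, self-contained sketch. It assumes Caffe's weight-blob layout, where a Convolution layer stores `(num_output, in_channels / group, k, k)` and a Deconvolution layer stores `(in_channels, num_output / group, k, k)`. The function names are hypothetical helpers, not part of Caffe, and `IN_CHANNELS = 1` is an assumption inferred from the printed `conv1` shape `(8, 1, 3, 3)`. The totals quoted above (200 and 328) come out only for a single-channel input; a 3-channel input would give 3x3x3x8 = 216 weights for `conv1` alone.

```python
def conv_weight_shape(in_channels, num_output, kernel_size, group=1):
    """Convolution weight shape: (num_output, in_channels / group, k, k)."""
    if in_channels % group or num_output % group:
        raise ValueError("channel counts must be divisible by group")
    return (num_output, in_channels // group, kernel_size, kernel_size)


def deconv_weight_shape(in_channels, num_output, kernel_size, group=1):
    """Deconvolution swaps the roles: (in_channels, num_output / group, k, k)."""
    if in_channels % group or num_output % group:
        raise ValueError("channel counts must be divisible by group")
    return (in_channels, num_output // group, kernel_size, kernel_size)


def num_weights(shape):
    """Number of elements in a weight blob of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


# Assumption: single-channel input, inferred from the printed conv1 shape.
IN_CHANNELS = 1

# Weights only; conv1 also has a bias blob that is not counted here,
# and deconv1 sets bias_term: false.
conv1 = conv_weight_shape(IN_CHANNELS, num_output=8, kernel_size=3)
deconv_grouped = deconv_weight_shape(8, num_output=2, kernel_size=4, group=2)
deconv_plain = deconv_weight_shape(8, num_output=2, kernel_size=4, group=1)

print("conv1           ", conv1, num_weights(conv1))
print("deconv1, group=2", deconv_grouped, num_weights(deconv_grouped))
print("deconv1, group=1", deconv_plain, num_weights(deconv_plain))
print("total, group=2  ", num_weights(conv1) + num_weights(deconv_grouped))
print("total, group=1  ", num_weights(conv1) + num_weights(deconv_plain))
```

Under these assumptions the grouped layer holds 4x4x1x8 = 128 weights and the ungrouped one 4x4x2x8 = 256, so grouping halves the deconvolution weights because each input channel connects to only `num_output / group` output channels.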


It does not seem fair to compute the number of learned parameters this way for both cases (with and without group). My question is: which is the correct number of parameters for the above network architecture?