How to calculate layer output size?


Ilya Zhenin

unread,
Jun 15, 2015, 9:49:11 AM6/15/15
to caffe...@googlegroups.com
For example, I have an image of size 1x34x34. Then, in the training phase, there is a convolution on this image.

num_output: 128
kernel_size: 5
stride: 2

As I understand it, after the convolution there will be 128 feature maps, with 1 channel for a grayscale input and 3 for RGB. Is that right? Further, how do I compute the size of these maps? Would it be 15x15? And what if kernel_size = 4?

Also, suppose the next layer is a pooling layer

    pool: MAX
    kernel_size: 2
    stride: 2

that takes the previous convolution as input. How do I calculate the output size of the pooling layer?


Philip H

unread,
Jun 19, 2015, 11:19:49 AM6/19/15
to caffe...@googlegroups.com
You will get 128 activation maps for the 128 kernels you're training.

Each activation map will have width (width - kernel_size + 2*pad)/stride + 1, and likewise for the height, as documented here: http://caffe.berkeleyvision.org/tutorial/layers.html.
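That formula is easy to check by hand. Here is a minimal Python sketch of just the arithmetic (not Caffe code; the helper name is mine), applied to the original question's 34x34 input with stride 2 and no padding:

```python
def conv_out(size, kernel, stride=1, pad=0):
    # Caffe convolution output size: floor((size - kernel + 2*pad) / stride) + 1
    return (size - kernel + 2 * pad) // stride + 1

print(conv_out(34, 5, stride=2))  # kernel_size 5 -> 15, so 15x15 maps
print(conv_out(34, 4, stride=2))  # kernel_size 4 -> 16, so 16x16 maps
```

So the 15x15 guess for kernel_size 5 is right, and kernel_size 4 would give 16x16.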

The same thing happens in your pooling layer.

Cheers,
Phil

Jarvis Du

unread,
Aug 20, 2015, 3:18:30 PM8/20/15
to Caffe Users
Hi Phil,

Can you tell me whether the division (/) takes the floor or the ceiling? I would guess floor, but it seems to be ceil for max pooling in Caffe.

Jarvis Du

unread,
Aug 21, 2015, 10:02:00 AM8/21/15
to Caffe Users
A small example is the following prototxt file.

name: "net"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 231
input_dim: 231
state {
  phase: TEST
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 48
    pad: 4
    kernel_h: 9
    kernel_w: 9
    stride_h: 4
    stride_w: 4
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "conv1"
  top: "conv2"
  convolution_param {
    num_output: 64
    pad: 0
    kernel_h: 5
    kernel_w: 5
    stride_h: 1
    stride_w: 1
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    pad: 0
    kernel_h: 2
    kernel_w: 2
    stride_h: 2
    stride_w: 2
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "pool2"
  top: "pool2"
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  convolution_param {
    num_output: 64
    pad: 0
    kernel_h: 3
    kernel_w: 3
    stride_h: 1
    stride_w: 1
  }
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: MAX
    pad: 0
    kernel_h: 2
    kernel_w: 2
    stride_h: 2
    stride_w: 2
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "pool3"
  top: "pool3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "pool3"
  top: "conv4"
  convolution_param {
    num_output: 64
    pad: 0
    kernel_h: 3
    kernel_w: 3
    stride_h: 1
    stride_w: 1
  }
}
layer {
  name: "pool4"
  type: "Pooling"
  bottom: "conv4"
  top: "pool4"
  pooling_param {
    pool: MAX
    pad: 0
    kernel_h: 2
    kernel_w: 2
    stride_h: 2
    stride_w: 2
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "pool4"
  top: "pool4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "pool4"
  top: "conv5"
  convolution_param {
    num_output: 32
    pad: 0
    kernel_h: 3
    kernel_w: 3
    stride_h: 1
    stride_w: 1
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}

For this architecture, I expected the final output to be 32*3*3 = 288, but Caffe gives 32*4*4 = 512. Scrutinizing every layer, the discrepancy comes from the pooling layers.

For example, take a max pooling layer with a 25x25 input, kernel size 2, and stride 2. I would expect an output of size 12x12, but Caffe produces 13x13.
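The 12 vs. 13 discrepancy is exactly the floor-vs-ceil choice in the output-size formula. A small Python sketch of both conventions (the helper names are mine, not Caffe's) for this 25x25 case:

```python
import math

def pool_out_floor(size, kernel, stride, pad=0):
    # Round the division down before adding 1
    return (size + 2 * pad - kernel) // stride + 1

def pool_out_ceil(size, kernel, stride, pad=0):
    # Round the division up before adding 1 (Caffe's pooling behavior)
    return math.ceil((size + 2 * pad - kernel) / stride) + 1

print(pool_out_floor(25, 2, 2))  # -> 12
print(pool_out_ceil(25, 2, 2))   # -> 13
```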



Sergii Bondariev

unread,
Jan 11, 2017, 6:30:45 PM1/11/17
to Caffe Users
It is indeed rounded up in Caffe — see pooling_layer.cpp, note the "ceil":

  pooled_height_ = static_cast<int>(ceil(static_cast<float>(
      height_ + 2 * pad_h_ - kernel_h_) / stride_h_)) + 1;
  pooled_width_ = static_cast<int>(ceil(static_cast<float>(
      width_ + 2 * pad_w_ - kernel_w_) / stride_w_)) + 1;
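Tracing Jarvis's prototxt with these two rules — floor for convolutions, ceil for pooling — reproduces the 4x4 result Caffe reports. A Python sketch of that trace (helper names are mine, not Caffe's):

```python
import math

def conv(size, kernel, stride, pad=0):
    # Convolution output size rounds the division down
    return (size + 2 * pad - kernel) // stride + 1

def pool(size, kernel, stride, pad=0):
    # Caffe's pooling output size rounds the division up (pooling_layer.cpp)
    return math.ceil((size + 2 * pad - kernel) / stride) + 1

s = 231
s = conv(s, 9, 4, pad=4)  # conv1 -> 58
s = conv(s, 5, 1)         # conv2 -> 54
s = pool(s, 2, 2)         # pool2 -> 27
s = conv(s, 3, 1)         # conv3 -> 25
s = pool(s, 2, 2)         # pool3 -> 13 (floor would give 12)
s = conv(s, 3, 1)         # conv4 -> 11
s = pool(s, 2, 2)         # pool4 -> 6
s = conv(s, 3, 1)         # conv5 -> 4
print(s, 32 * s * s)      # -> 4 512
```

With floor pooling, pool3 would give 12 instead of 13 and the chain would end at 3x3, i.e. the expected 288.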
