Hi,
Theoretically, a max pooling layer (pool3 below) with input 64(channel)*25(height)*25(width), pad of 0, kernel_size of 2 and stride of 2 should give an output of 64*12*12, since (25 - 2) / 2 + 1 rounds down to 12.
When I create such a layer in Caffe, it outputs 64*13*13 instead.
Here is the prototxt file.
name: "net"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 231
input_dim: 231
state {
phase: TEST
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 48
pad: 4
kernel_h: 9
kernel_w: 9
stride_h: 4
stride_w: 4
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "conv2"
type: "Convolution"
bottom: "conv1"
top: "conv2"
convolution_param {
num_output: 64
pad: 0
kernel_h: 5
kernel_w: 5
stride_h: 1
stride_w: 1
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
pad: 0
kernel_h: 2
kernel_w: 2
stride_h: 2
stride_w: 2
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "pool2"
top: "pool2"
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
convolution_param {
num_output: 64
pad: 0
kernel_h: 3
kernel_w: 3
stride_h: 1
stride_w: 1
}
}
layer {
name: "pool3"
type: "Pooling"
bottom: "conv3"
top: "pool3"
pooling_param {
pool: MAX
pad: 0
kernel_h: 2
kernel_w: 2
stride_h: 2
stride_w: 2
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "pool3"
top: "pool3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "pool3"
top: "conv4"
convolution_param {
num_output: 64
pad: 0
kernel_h: 3
kernel_w: 3
stride_h: 1
stride_w: 1
}
}
layer {
name: "pool4"
type: "Pooling"
bottom: "conv4"
top: "pool4"
pooling_param {
pool: MAX
pad: 0
kernel_h: 2
kernel_w: 2
stride_h: 2
stride_w: 2
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "pool4"
top: "pool4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "pool4"
top: "conv5"
convolution_param {
num_output: 32
pad: 0
kernel_h: 3
kernel_w: 3
stride_h: 1
stride_w: 1
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
My guess is that the pooling layer takes the ceiling instead of the floor of the division when computing the output size:
h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1 (see here).
The convolutional layer, on the other hand, takes the floor (see here).
#1318 mentioned this problem, but no solution was given there.
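
To make the difference concrete, here is a rough Python sketch (not Caffe code; the helper names conv_out, pool_out and trace are made up for illustration) that traces the spatial size through the layers of the prototxt above, once rounding the pooling division up and once rounding it down. It assumes square inputs and does not model any special handling of padded pooling.

```python
import math

# (name, kind, kernel, stride, pad) copied from the prototxt above.
# ReLU layers are left out because they do not change the shape.
LAYERS = [
    ("conv1", "conv", 9, 4, 4),
    ("conv2", "conv", 5, 1, 0),
    ("pool2", "pool", 2, 2, 0),
    ("conv3", "conv", 3, 1, 0),
    ("pool3", "pool", 2, 2, 0),
    ("conv4", "conv", 3, 1, 0),
    ("pool4", "pool", 2, 2, 0),
    ("conv5", "conv", 3, 1, 0),
]


def conv_out(size, kernel, stride=1, pad=0):
    """Convolution output size: the division is rounded down (floor)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0, round_up=True):
    """Pooling output size, rounding the division up or down.

    round_up=True is the ceiling behaviour guessed at in this post;
    round_up=False is the floor behaviour I expected.
    """
    span = size + 2 * pad - kernel
    steps = math.ceil(span / stride) if round_up else span // stride
    return steps + 1


def trace(size, round_up):
    """Return {layer name: output height/width} for a square input."""
    sizes = {}
    for name, kind, kernel, stride, pad in LAYERS:
        if kind == "conv":
            size = conv_out(size, kernel, stride, pad)
        else:
            size = pool_out(size, kernel, stride, pad, round_up)
        sizes[name] = size
    return sizes


if __name__ == "__main__":
    for round_up in (True, False):
        label = "ceil" if round_up else "floor"
        print(label, trace(231, round_up))
```

With this sketch, pool3 receives a 25*25 input and comes out as 13*13 when the division is rounded up, but 12*12 when it is rounded down. pool2 is unaffected because (54 - 2) / 2 divides evenly.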