image scaling and pooling


Alex Orloff

Jan 29, 2016, 8:42:55 AM
to Caffe Users
Hi all, 
I've read the Caffe docs, but still have some unanswered questions.

1) If I pool a (2n+1)x(2n+1) blob with kernel size 2x2, what output do I get: nxn or (n+1)x(n+1)?
I.e., what happens with the formula h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1 when h_o is not an integer? Round down? Round up? Round to nearest?

2) Let's say we have a network trained on 8x8 images. What output do I get if I feed a 16x16 image to the network? Will it be an 8x8 (6x6?) blob or something else? Is it safe at all?
I mean the following network:
8x8 -> conv1 (kernel 3x3, padding=0, stride=1) -> 6x6 -> max pooling 2x2 -> 3x3 -> conv2 (kernel 3x3, padding=0, stride=1) -> output


3) Is there any way to feed different images to the network? I mean images with different height and width.

Thank you in advance

Hossein Hasanpour

Jan 29, 2016, 10:59:43 AM
to Caffe Users
I don't know about the last two questions, but for your first question, if I get you right: suppose you have a blob of size W x H x D,
and in your pooling layer you have F = 2 (your receptive field size, i.e. the kernel size) and a stride of S = 1. Then
your output blob would have this size:

     W' = (W - F)/S + 1
     H' = (H - F)/S + 1

We don't use padding in the pooling layer.
When the output of the formula is not an integer, it means the stride is not chosen correctly! You need to use another value for your stride.
This might help as well: http://cs231n.github.io/convolutional-networks/#pool
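
To make the formula above concrete, here is a minimal sketch that evaluates it along one axis and reports whether the window tiles the input exactly. The function name `pooled_size` is hypothetical, and the sketch assumes no padding, as in the post.

```python
def pooled_size(size, kernel, stride):
    """Evaluate (size - kernel) / stride + 1 along one axis, without padding.

    Returns (length, exact). `exact` is False when (size - kernel) is not a
    multiple of stride, i.e. when the formula does not give an integer and
    `length` is the rounded-down value.
    """
    span = size - kernel
    exact = span % stride == 0
    return span // stride + 1, exact


# F = 2, S = 1: a stride of 1 always divides evenly.
print(pooled_size(7, 2, 1))  # (6, True)

# F = 2, S = 2 on an odd input: the formula is not an integer.
print(pooled_size(7, 2, 2))  # (3, False)
```

How a framework handles the inexact case (round down, round up, or reject) is a separate question that this sketch does not answer.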

Alex Orloff

Jan 29, 2016, 3:30:41 PM
to Caffe Users
Thank you Hossein!

Actually I mean a pooling layer with F=2 and S=2, so what happens when either H or W is odd?
It's important because I want to apply a network trained on 64x64 images to images with different (bigger) dimensions.

Hope it's possible.
Thanks again.

On Friday, January 29, 2016 at 18:59:43 UTC+3, Hossein Hasanpour wrote:

Jan C Peters

Jan 30, 2016, 5:48:22 AM
to Caffe Users
ad 1)

If that formula is used that way in the code (I had a look at it one day, but don't remember right now), and all quantities to the left and right of the "/" are integers, C++ does integer division, i.e. the remainder is discarded. If only positive quantities are involved, this corresponds to rounding the result down to the nearest integer. Effectively, in your setting this would mean that the last row and/or column of pixels of the input to the pooling layer is never looked at by the pooling and is completely ignored. I have no idea whether that case is caught somewhere or whether Caffe silently ignores it.

Well, on the other hand, you can always create a very simple network and try to load it; the scaffolding output will tell you what you want to know.
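
The two possible rounding conventions can be compared with a short sketch. The helper `out_size` and its `round_up` flag are hypothetical names used only for illustration. Which convention applies is something to verify against your Caffe version: the pooling layer source appears to round up (ceil) while the convolution layer rounds down, so the (2n+1) case may well give n+1 rather than n for pooling.

```python
import math


def out_size(size, kernel, stride, pad=0, round_up=False):
    """Output length along one axis for a sliding window.

    round_up=False mimics C++ integer division (floor for positive values);
    round_up=True mimics a ceil-based rule, where the last window may hang
    over the edge of the input.
    """
    numerator = size + 2 * pad - kernel
    if round_up:
        return math.ceil(numerator / stride) + 1
    return numerator // stride + 1


# 2x2 kernel, stride 2, input side 2n+1
for n in (1, 2, 3):
    side = 2 * n + 1
    print(side, out_size(side, 2, 2), out_size(side, 2, 2, round_up=True))
# 3 1 2
# 5 2 3
# 7 3 4
```

With rounding down the result is n and the last row/column is dropped; with rounding up the result is n+1 and the last window covers only part of the input.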

ad 2)

Whether that makes sense depends on your network and the kind of images you feed it. Usually the blobs between the layers are automatically resized accordingly, but then your output will not be of the same size either (whether that makes sense depends on your application). Using the same conv layers/filters could make sense if the images are bigger but the objects shown in them are about the same size as in the smaller images. InnerProduct layers are always broken by a resize, because the number of weights changes.
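
As a rough illustration of how the blob sizes change, the sketch below traces one spatial axis through the small network from the original question (conv1 3x3, max pooling 2x2, conv2 3x3). The pooling stride of 2 is an assumption inferred from the 6x6 -> 3x3 step in the question, and the names `LAYERS`, `window_out` and `trace` are hypothetical.

```python
def window_out(size, kernel, stride=1, pad=0):
    """Output length along one axis, rounding down."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, kernel, stride) for the network in the question; pool stride assumed 2
LAYERS = [("conv1", 3, 1), ("pool", 2, 2), ("conv2", 3, 1)]


def trace(size):
    """Return the side length before the first layer and after each layer."""
    sizes = [size]
    for _name, kernel, stride in LAYERS:
        size = window_out(size, kernel, stride)
        sizes.append(size)
    return sizes


print(trace(8))   # [8, 6, 3, 1]
print(trace(16))  # [16, 14, 7, 5]
```

Under these assumptions a 16x16 input gives a 5x5 output per channel instead of 1x1. No intermediate size here is odd before the pooling step, so the rounding convention does not matter for these two inputs.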

ad 3)

As I said in 2), it is possible in some cases, but I am not sure how one could formulate more general rules on how and when to do that. I would be interested in that too.

Jan

Alex Orloff

Jan 30, 2016, 7:28:31 AM
to Caffe Users
Thank you Jan,

ad 1) OK, actually it's not so important for my task.
ad 2) Sure, I understand that I cannot use InnerProduct layers, but as I understand it, I can completely replace InnerProduct with conv(nxn->1x1)->conv(1x1->1x1) with the same functionality.
ad 3) I hope that if I train the network with 64x64 images -> 8x8 at the last conv layer -> 1x1 layers replacing InnerProduct, it will also be possible to feed the network any (64+8n)x(64+8k) image, with an (n x k x number_of_outputs) blob as the result.
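
The size arithmetic in ad 3) can be checked with a small sketch. It assumes the layers before the replaced InnerProduct reduce each side by an overall factor of 8 with no border loss (64 -> 8), and that the InnerProduct is replaced by an unpadded 8x8 convolution with stride 1 followed by 1x1 convolutions. The function name `heatmap_side` and its parameters are hypothetical.

```python
def heatmap_side(input_side, total_stride=8, fc_kernel=8):
    """Side length of the output map for a fully convolutional variant.

    Assumes the feature extractor divides the side by `total_stride` exactly
    and the former InnerProduct is an unpadded fc_kernel x fc_kernel
    convolution with stride 1. The 1x1 convolutions keep the size unchanged.
    """
    if input_side % total_stride != 0:
        raise ValueError("input side must be a multiple of the total stride")
    feature_side = input_side // total_stride
    return feature_side - fc_kernel + 1


for n in range(4):
    side = 64 + 8 * n
    print(side, heatmap_side(side))
# 64 1
# 72 2
# 80 3
# 88 4
```

Under these assumptions a (64+8n)x(64+8k) input yields an (n+1) x (k+1) map rather than n x k, since the 64x64 case itself (n = 0) already produces one output position.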

PS: To make clear what I'm trying to achieve, I can explain.
I want to build a network for cell detection:
image -> network -> heatmap
An example image to investigate is below:



On Saturday, January 30, 2016 at 13:48:22 UTC+3, Jan C Peters wrote: