about FCN-Alexnet


xdtl

Aug 31, 2016, 5:11:14 PM
to Caffe Users
Hi,

I am working on a foreground/background segmentation problem using FCN-Alexnet. I took the following steps:

1. train an AlexNet that takes a 256*256*3 image as input and outputs a single label indicating whether the center pixel of the patch is foreground (1) or background (0);
2. following "net surgery", convert that AlexNet to an FCN, so that it can take input of any size, for example (m*n*3) pixels, and output a heatmap of size (m/32)*(n/32);
3. the resulting heatmap from step 2 looks reasonable;
4. copy the parameters of that AlexNet to FCN-AlexNet: https://github.com/shelhamer/fcn.berkeleyvision.org/blob/master/voc-fcn-alexnet/train.prototxt (all layers are initialized here, including fc6, fc7 and fc8); the Deconvolution layer is initialized with a bilinear filter;
5. finetune FCN-AlexNet, so that the network can take input of any size and output a binary mask of the same size.
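The fc-to-conv conversion in step 2 can be sketched with numpy alone. This is only a shape illustration: the array here is a zero-filled placeholder with AlexNet's fc6 dimensions, whereas in real net surgery the weights would come from `net.params['fc6'][0].data`.

```python
import numpy as np

# A fully connected layer's flat weight matrix is reinterpreted as a
# convolution kernel of the same total size. AlexNet's fc6 has 4096
# outputs over a 256x6x6 input volume; the convolutionalized
# 'fc6-conv' layer expects weights of shape (4096, 256, 6, 6).
fc6_w = np.zeros((4096, 256 * 6 * 6), dtype=np.float32)  # placeholder

fc6_conv_w = fc6_w.reshape(4096, 256, 6, 6)  # same data, conv layout
# Biases are copied over unchanged.
```

The key point is that no values change; the flat inner-product weights are simply viewed as a sliding 6x6 filter, which is what lets the net accept arbitrary input sizes.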

I have the following questions:

1. At step 5, the loss did decrease, from about 0.3 to below 0.1. But during testing the output mask is all zeros. I checked the values of all layers and weights; they are not zero. The problem comes from the output of the 'score' layer, which has dimensions (1, 2, m, n): score(1, 0, m, n) is all positive and score(1, 1, m, n) is all negative, so when I take mask = solver.net.blobs['score'].data[0].argmax(axis=0), the mask is all zeros.
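The symptom can be reproduced with a toy numpy array (the values here are made up, not the actual network output):

```python
import numpy as np

# Toy 'score' blob (2 channels, HxW): channel 0 (background) all
# positive, channel 1 (foreground) all negative -- the sign pattern
# described above.
score = np.stack([np.full((4, 4), 0.5),
                  np.full((4, 4), -0.3)])

mask = score.argmax(axis=0)  # per-pixel class index
# Channel 0 wins at every pixel, so the mask is all zeros (all background).
```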

2. Since I hope the network can take input of any size for testing, I don't know how to set the 'offset' parameter of the crop layer. I understand it is related to the structure of the network, but I don't think it is independent of the size of the input image.
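One heuristic is a centered crop, splitting the size difference evenly. This is only a sketch; in the reference FCN prototxts the offset is a fixed constant determined by the network's padding, not computed per image.

```python
def center_offset(src, dst):
    """Offset that centers a dst-sized crop inside a src-sized map."""
    return (src - dst) // 2

# e.g. cropping a 287-wide upsampled map down to a 256-wide target:
print(center_offset(287, 256))  # 15
```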

3. The crop layer takes the first bottom and crops it according to the dimensions of the second bottom. For example, given ('data', (1, 3, 227, 227)), ('label', (1, 1, 256, 256)) and ('upscore', (1, 2, 287, 287)), I want ('score', (1, 2, 256, 256)). If I set 'data' as the second bottom and set the crop axis to 2, I always get an error complaining that 'data' has 3 channels, which is larger than the number of outputs (2). One option is to set 'label' as the second bottom, but I don't think that is reasonable during testing.
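For reference, in Caffe's Crop layer the dimensions before the crop axis are taken from the first bottom, so with axis: 2 only the spatial dimensions are cropped and the channel count of the second bottom should not matter. A sketch of such a layer (the offset value here is a placeholder, not the correct constant for FCN-AlexNet):

```protobuf
layer {
  name: "score"
  type: "Crop"
  bottom: "upscore"
  bottom: "data"
  top: "score"
  crop_param {
    axis: 2      # crop only H and W; channels come from 'upscore'
    offset: 18   # placeholder -- depends on the net's padding
  }
}
```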

4. The 'lr_mult' parameter of the Deconvolution layer is set to 0; does that mean the weights of that layer don't need to be tuned?
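For context, the upsampling layer in the linked train.prototxt looks roughly like the fragment below (paraphrased from memory, so treat the kernel/stride values as a sketch and check them against the actual file; num_output is set to 2 here for the foreground/background case):

```protobuf
layer {
  name: "upscore"
  type: "Deconvolution"
  bottom: "score_fr"
  top: "upscore"
  param { lr_mult: 0 }   # freeze the bilinear upsampling kernel
  convolution_param {
    num_output: 2        # class count (foreground/background here)
    bias_term: false
    kernel_size: 63      # verify against the reference prototxt
    stride: 32
  }
}
```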

5. Regarding the structure of FCN-AlexNet, I am not sure whether it can work by inserting a single Deconvolution layer into AlexNet, without any of the skip connections mentioned in the paper (I cannot use VGG for other reasons).

Maybe I misunderstood something; could someone please help me out?

Thanks in advance!

Ilya Zhenin

Sep 2, 2016, 9:59:38 AM
to Caffe Users
I've got a similar problem (zero output); I figured it was due to nonlinear unit saturation during training. Not sure if that's your case; look at the blob outputs to see whether the signal decays to zero through the layers.
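A quick way to run that check, sketched with a plain dict standing in for pycaffe's net.blobs after a forward pass (the layer names and shapes below are made up):

```python
import numpy as np

# Walk the blobs in order and report mean |activation|; a layer whose
# signal has collapsed to zero stands out immediately.
# 'blobs' stands in for net.blobs (name -> blob with a .data ndarray).
blobs = {
    "conv1": np.random.default_rng(0).normal(size=(1, 96, 55, 55)),
    "relu5": np.zeros((1, 256, 13, 13)),  # a saturated/dead layer
}

stats = {name: float(np.abs(data).mean()) for name, data in blobs.items()}
for name, m in stats.items():
    print(f"{name}: mean |activation| = {m:.4f}")
```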

Hmm, interesting: how do you perform the foreground/background classification, can you share? On which data?

On Thursday, September 1, 2016 at 0:11:14 UTC+3, xdtl wrote:

Simon Sun

Jan 11, 2018, 3:48:19 AM
to Caffe Users

Hello, I am new to this; could you please answer some questions for me? I want to ask how you did step 2: I have just trained an AlexNet model, but I can't figure out how to change the AlexNet into the FCN-AlexNet. Can you tell me? I use the Caffe model. Thanks.
On Thursday, September 1, 2016 at 5:11:14 AM UTC+8, xdtl wrote:

Przemek D

Jan 15, 2018, 9:42:01 AM
to Caffe Users
I know this is an old thread, but perhaps I can give some insight into some of the problems.
1. This might very well be caused by difficult data. If your training images contain a lot of background and only small objects, the network is likely to learn that everything is background. That is the easiest choice to make when the data is imbalanced.
2. There is a dependency on the size of the input image. Evan Shelhamer said something about this here (also see Mohamed's answer above).
3. Another option is to specify the axis parameter in the crop. For details, see this answer and the links in it.
4. Yes, setting lr_mult to 0 effectively blocks any update to that layer's parameters. It can still propagate gradients back, though.
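The imbalance in point 1 can be quantified (and countered with, e.g., a class-weighted loss); here is a numpy sketch on a made-up label mask:

```python
import numpy as np

# Toy label mask: mostly background (0) with a small foreground (1)
# patch -- the kind of imbalance that pushes a net toward predicting
# all-background.
labels = np.zeros((64, 64), dtype=np.int64)
labels[30:34, 30:34] = 1

freq = np.bincount(labels.ravel(), minlength=2) / labels.size
weights = 1.0 / np.maximum(freq, 1e-8)  # inverse-frequency class weights
weights /= weights.sum()                # normalize
# weights[1] (foreground) ends up much larger than weights[0]
```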