A couple questions about FCN training


Ilya Zhenin

unread,
Aug 18, 2016, 4:58:43 AM
to Caffe Users
As a reference I'm using Shelhamer Github https://github.com/shelhamer/fcn.berkeleyvision.org
1. Why is there padding of 100 at the first convolutional layer? If needed, we could simply have upscaled the input image to obtain better initialization.
2. Why is there no weight initialization specified for the convolutional and deconvolutional layers - are all weights zeros?
3. I'm trying to train FCN-8s on my own data. I've created two HDF5 files, one with image data and the other with masks (just two classes, zero and one); that is the only modification I've made. The net learns nothing - all weights stay zero. I then added Xavier weight initialization, and again the net learns nothing, though the output is no longer all zeros - just a mess. I've set a low learning rate, 1e-11, and the datasets are correct. Any clues about what might be wrong?
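Regarding question 3, one frequent cause of "the net learns nothing" with HDF5 inputs is a layout or label-encoding mismatch rather than the solver. A minimal numpy sanity check of the conventions Caffe's HDF5Data and SoftmaxWithLoss layers expect (the random arrays here are placeholders for the actual images and masks):

```python
import numpy as np

n, h, w = 4, 256, 256
# images: NCHW order, float32 (e.g. mean-subtracted BGR for VGG-style nets)
data = np.random.rand(n, 3, h, w).astype(np.float32)
# masks: one integer class index per pixel, shape (N, 1, H, W)
label = np.random.randint(0, 2, size=(n, 1, h, w)).astype(np.float32)

assert data.shape[0] == label.shape[0]       # same number of samples
assert data.shape[2:] == label.shape[2:]     # spatial sizes must match
assert set(np.unique(label)) <= {0.0, 1.0}   # class indices 0/1, not 0/255 masks
```

These arrays would then be written as HDF5 datasets whose names match the `top` names of the HDF5Data layer (conventionally `data` and `label`), e.g. with h5py's `File` and `create_dataset`.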

codeforever

unread,
Aug 19, 2016, 12:42:08 AM
to Caffe Users
I'm hitting the same issue as your third question. I'm trying the NYU dataset with color and depth, and none of the nets converge.

On Thursday, August 18, 2016 at 4:58:43 PM UTC+8, Ilya Zhenin wrote:

Evan Shelhamer

unread,
Sep 14, 2016, 2:56:49 PM
to Ilya Zhenin, Caffe Users
1. Why is there padding of 100 at the first convolutional layer? If needed, we could simply have upscaled the input image to obtain better initialization.

From the new fcn.berkeleyvision.org FAQ:

Why pad the input?: The 100 pixel input padding guarantees that the network output can be aligned to the input for any input size in the given datasets, for instance PASCAL VOC. The alignment is handled automatically by net specification and the crop layer. It is possible, though less convenient, to calculate the exact offsets necessary and do away with this amount of padding.
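The arithmetic behind that guarantee can be checked by tracing one spatial dimension through FCN-32s (a sketch using the layer shapes from the reference prototxt; Caffe's pooling rounds the output size up):

```python
def fcn32_output_size(h, pad=100):
    # conv1_1: 3x3 kernel with the large pad; the remaining 3x3 convs
    # use pad=1 and preserve the spatial size
    h = h + 2 * pad - 2
    # pool1..pool5: 2x2 max pooling, stride 2 (Caffe rounds up)
    for _ in range(5):
        h = (h + 1) // 2
    # fc6 convolutionalized as a 7x7 kernel with no padding
    h = h - 6
    # fc7 and the score layer are 1x1 and preserve size;
    # upscore: deconvolution with kernel 64, stride 32
    return 32 * (h - 1) + 64

# With pad=100 the padded output always covers the input,
# so the crop layer can align output to input for any size:
assert all(fcn32_output_size(h) >= h for h in range(1, 1001))
print(fcn32_output_size(500))  # 544
```

Without the padding, a small input does not even survive fc6: the feature map after pool5 would be smaller than the 7x7 kernel.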

2. Why is there no weight initialization specified for the convolutional and deconvolutional layers - are all weights zeros?

All the nets in the paper are fine-tuned from ILSVRC classifiers, so the weights are either initialized from those nets, initialized with bilinear kernels (the deconv. layers), or initialized with zero (the score layers).
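A numpy sketch of such a bilinear deconvolution filler (the reference repo implements this in `surgery.py`; the 21-class, kernel-64 example below matches FCN-32s on PASCAL VOC):

```python
import numpy as np

def bilinear_kernel(size):
    # 2-D bilinear interpolation kernel used to initialize a deconv layer
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return ((1 - abs(og[0] - center) / factor) *
            (1 - abs(og[1] - center) / factor))

def bilinear_fill(out_ch, in_ch, size):
    # Deconv weight blob (out_ch, in_ch, size, size): each channel
    # upsamples itself, so the kernel sits on the diagonal only
    weights = np.zeros((out_ch, in_ch, size, size), dtype=np.float32)
    k = bilinear_kernel(size)
    for c in range(min(out_ch, in_ch)):
        weights[c, c] = k
    return weights

w = bilinear_fill(21, 21, 64)  # e.g. FCN-32s upscore: 21 classes, kernel 64, stride 32
```

With this initialization the deconv layer starts out as plain bilinear upsampling, which fine-tuning can then adjust (or leave fixed, as in the reference nets where the deconv learning rate is zero).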

To train from scratch, you'll need to modify the VGG-16 weight initialization, since in the original paper it is trained in stages.


Evan Shelhamer

