training the fully convolutional networks (FCN) from scratch for RGB-D data


puren...@gmail.com

Mar 6, 2018, 10:23:53 AM
to Caffe Users
Hello,

I looked for an earlier answer to my question, but I couldn't find an exact one. Please point me to it if this has already been answered.

I'm new to neural networks and Caffe, so sorry if my questions sound trivial.
I'm trying to understand the FCN implementation alongside the paper, so my questions are mainly about the paper. I hope this is the right place to ask.

In the FCN paper, there is a line saying 
"Training from scratch is not feasible considering the time required to learn the base classification nets. (Note that the VGG is trained in stages, while we initialize from the full 16-layer version)." 
  1. Does "training from scratch" mean initializing all the weights in every layer randomly and running back-propagation and optimization until the weights converge? And does the alternative mean initializing all the weights from the trained VGG network, so that back-propagation only needs to run for the last layer and the optimization takes less time because the weights start from a good point? Am I correct?
  2. What does "trained in stages" mean? And where is this "training in stages" strategy implemented in the code?
  3. VGG is trained on RGB images, so are its weights still a good initialization for RGB-D data? For instance, to train the network on the RGB-D and HHA data of NYUDv2 in the paper, did they use the same trained VGG network's weights?

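(For anyone reading along, the distinction in question 1 can be sketched with a toy two-layer model. This is plain NumPy with made-up layer names, not the actual FCN code, and whether FCN really freezes the earlier layers is exactly what's being asked; the snippet just shows the mechanism of a warm start with a per-layer learning rate of zero, which Caffe expresses with `lr_mult`.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "network": two weight matrices standing in for conv/fc layers.
def init_from_scratch():
    # "Training from scratch": every layer starts from random weights.
    return {"W1": rng.normal(0, 0.1, (4, 3)), "W2": rng.normal(0, 0.1, (2, 4))}

def init_from_pretrained(pretrained):
    # Warm start: copy every layer from a previously trained model.
    return {k: v.copy() for k, v in pretrained.items()}

# Stand-in for the trained VGG weights (hypothetical values).
pretrained = {"W1": np.ones((4, 3)), "W2": np.ones((2, 4))}

net = init_from_pretrained(pretrained)

# One possible fine-tuning scheme: zero learning rate on the copied early
# layer, nonzero on the last layer only.
lr = {"W1": 0.0, "W2": 0.01}
grads = {"W1": rng.normal(size=(4, 3)), "W2": rng.normal(size=(2, 4))}
for k in net:
    net[k] -= lr[k] * grads[k]  # one SGD step

assert np.array_equal(net["W1"], pretrained["W1"])      # frozen layer untouched
assert not np.array_equal(net["W2"], pretrained["W2"])  # last layer updated
```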
Przemek D

Mar 20, 2018, 7:58:06 AM
to Caffe Users
I can answer two of your questions; I've never done any training on RGB-D, so I can't help with the third.
1. I'm not sure if they run backprop only for the last layer or more (all?) of them, but aside from that detail you are correct.
2. You probably will not find anything related to it in the FCN code as this is related to the pre-training of the VGG net. I will refer you to the original paper on it: Simonyan & Zisserman, Very Deep Convolutional Networks for Large-Scale Image Recognition (2014) - section 3.1 describes the strategy.
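To make the "trained in stages" idea a bit more concrete, here is a rough sketch of what that section describes: a shallow configuration is trained first, and the deeper configuration is then initialized from it, copying the shallow net's first and last layers and randomly initializing the new intermediate ones. Toy NumPy arrays stand in for real layers here, and the exact number of copied layers is simplified.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_layers(shapes):
    # Randomly initialize one weight array per layer shape.
    return [rng.normal(0, 0.1, s) for s in shapes]

# Stage 1: a shallow net, assumed already trained to convergence
# (these arrays just stand in for its final weights).
shallow = make_layers([(4, 3), (4, 4), (2, 4)])

# Stage 2: build a deeper net. Initialize its first and last layers by
# copying from the shallow net; the new middle layers stay random and
# are learned during the second training stage.
deep = make_layers([(4, 3), (4, 4), (4, 4), (4, 4), (2, 4)])
deep[0] = shallow[0].copy()    # first layer copied from stage 1
deep[-1] = shallow[-1].copy()  # last layer copied from stage 1
# deep[1:-1] remain randomly initialized
```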