Use a Pre-trained Network (1 branch) to initialize weights of layers in a Network with 2 branches


Jayant Agrawal

Jan 6, 2017, 8:22:06 AM
to Caffe Users
I have a multi-task network with two similar branches and a pre-trained network with only one branch (which is the same as each of them).
I want to initialize the weights of the layers in the two branches of my multi-task network with the weights of the corresponding layers in my pre-trained network.

Now, I can initialize one of the branches correctly by using the same layer names as in the pre-trained network.
But I have to keep the names of the layers in the other branch different, so those layers won't pick up the pre-trained weights.

Also, I don't want to share the weights between the two branches, so giving the weights in the corresponding layers of the two branches the same name won't work either.

Is there a nice way/hack to do this?

PS: I would want to avoid network surgery, but any comments explaining a nice way to do it are also welcome.
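
For context, Caffe copies pre-trained weights into a new network by matching layer names, so my situation looks roughly like this (the layer names here are just placeholders, not my actual net):

# pre-trained, single-branch net
layer { name: "fc6"   type: "InnerProduct" bottom: "pool5"   top: "fc6"   ... }

# my multi-task net
layer { name: "fc6"   type: "InnerProduct" bottom: "pool5_a" top: "fc6"   ... }   # same name -> gets the pre-trained weights
layer { name: "fc6_p" type: "InnerProduct" bottom: "pool5_b" top: "fc6_p" ... }   # different name -> random initialization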

Jayant Agrawal

Jan 6, 2017, 8:25:24 AM
to Caffe Users
Clarification: I just want to initialize the two branches with the same weights. They can learn different weights during training, since they are governed by different loss layers.

Przemek D

Jan 9, 2017, 6:55:09 AM
to Caffe Users
One hacky way I can think of:
* name the layers you want to initialize differently and enable weight sharing between them
* initialize the network (if you want to avoid surgery, you can train it for 1 iteration with a learning rate of 0)
* use the resulting caffemodel to initialize a modified model: layer names unchanged, but sharing disabled
I haven't tried that, so no guarantees (tell us if you try it, though).
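
Roughly, the shared-weights prototxt could look like this (all names here are placeholders, and I haven't run any of it):

# Branch A: layer name matches the pre-trained net, so it receives the pre-trained weights.
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5_a"
  top: "fc6"
  param { name: "fc6_shared_w" lr_mult: 1 }   # shared weight blob
  param { name: "fc6_shared_b" lr_mult: 2 }   # shared bias blob
  inner_product_param { num_output: 4096 }
}

# Branch B: different layer name, but the same param names, so it shares A's blobs.
layer {
  name: "fc6_p"
  type: "InnerProduct"
  bottom: "pool5_b"
  top: "fc6_p"
  param { name: "fc6_shared_w" lr_mult: 1 }
  param { name: "fc6_shared_b" lr_mult: 2 }
  inner_product_param { num_output: 4096 }
}

For the one dummy iteration, a solver with base_lr: 0, max_iter: 1 and snapshot: 1 should do. Alternatively, instead of training at all, you could probably just load and re-save in pycaffe, since the snapshot writes each layer's blobs under its own layer name:

import caffe

# Load the sharing prototxt together with the single-branch pre-trained weights,
# then save; both fc6 and fc6_p end up holding the copied weights.
net = caffe.Net('multitask_shared.prototxt', 'pretrained.caffemodel', caffe.TEST)
net.save('both_branches_init.caffemodel')

Then train the real model (same layer names, param name lines removed) starting from both_branches_init.caffemodel.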

Jayant Agrawal

Jan 10, 2017, 2:50:35 PM
to Caffe Users
Thanks!! The hack works fine. 

ngc...@gmail.com

Dec 3, 2018, 11:22:38 PM
to Caffe Users
Hi,
Is there a quick way to enable/disable weight sharing?
What I know is to name the weights identically (to share them) or differently (to disable sharing). But for a very deep network, that would entail a lot of work naming each layer's weights...!
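
The closest thing I can think of is to script it: the prototxt is just a protobuf message, so the param names can be added or stripped in bulk. A rough, untested sketch (it assumes only Convolution/InnerProduct layers matter and invents a '_p' suffix convention for the second branch):

from caffe.proto import caffe_pb2
from google.protobuf import text_format

SHARE = True  # True: emit a prototxt with shared weights; False: strip the sharing

net = caffe_pb2.NetParameter()
with open('train_val.prototxt') as f:
    text_format.Merge(f.read(), net)

for layer in net.layer:
    if layer.type not in ('Convolution', 'InnerProduct'):
        continue
    # NOTE: this throws away any lr_mult/decay_mult set in existing param blocks.
    del layer.param[:]
    if SHARE:
        # e.g. branch-B layers named with a '_p' suffix share branch-A's blobs
        base = layer.name[:-2] if layer.name.endswith('_p') else layer.name
        layer.param.add(name=base + '_w')   # weight blob
        layer.param.add(name=base + '_b')   # bias blob

out = 'train_val_shared.prototxt' if SHARE else 'train_val_unshared.prototxt'
with open(out, 'w') as f:
    f.write(text_format.MessageToString(net))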