Training a pre-trained network

Manohar

Jun 10, 2017, 4:30:07 AM
to Caffe Users
Is it possible to load pre-trained weights into certain layers of a network and add additional layers initialized with random weights? Basically, I have a model trained on ImageNet using Caffe, and I want to use its weights to train a new model with additional layers for the task of object detection.

Manohar

Jun 10, 2017, 8:19:56 PM
to Caffe Users
This seems like a pretty straightforward question; could someone please reply?

Jonathan R. Williford

Jun 11, 2017, 3:49:58 AM
to Manohar, Caffe Users
You can load both the existing model and a new model (without weights) using the PyCaffe interface. The new model will be given random weights according to the initialization rules in its prototxt. You can then copy the weights that you want from the old model into the new one.
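The copy step can be sketched as follows. This is a minimal sketch: the layer names and shapes below are hypothetical, and the function just mirrors PyCaffe's `net.params` layout (layer name → list of parameter blobs). With real nets you would build the dicts from the blobs' `.data` arrays of two `caffe.Net` objects and finish with `new_net.save(...)`.

```python
import numpy as np

def copy_matching_params(old_params, new_params):
    """Copy parameters for layers that exist in both nets.

    Both arguments mirror PyCaffe's net.params layout: a mapping from
    layer name to a list of parameter arrays (e.g. [weights, biases]).
    Layers present only in the new net keep their random initialization.
    Returns the names of the layers that were copied.
    """
    copied = []
    for name, blobs in old_params.items():
        if name not in new_params:
            continue  # layer was renamed or dropped in the new prototxt
        for old_blob, new_blob in zip(blobs, new_params[name]):
            new_blob[...] = old_blob  # in-place copy; shapes must match
        copied.append(name)
    return copied

# Hypothetical example: "conv1" is shared between the two models,
# "fc_new" is a new detection layer that keeps its random weights.
old_params = {"conv1": [np.ones((4, 3, 3, 3)), np.ones(4)]}
new_params = {"conv1": [np.zeros((4, 3, 3, 3)), np.zeros(4)],
              "fc_new": [np.zeros((10, 4))]}
copy_matching_params(old_params, new_params)
```

With real nets you would then save the result (e.g. `new_net.save('init.caffemodel')`) and use it as the starting point for training.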

Cheers,
Jonathan

Manohar

Jun 11, 2017, 3:53:16 AM
to Caffe Users
Could you point me to some resources on this?
Also, is this possible using the C++ interface?


Przemek D

Jun 12, 2017, 3:25:43 AM
to Caffe Users
It is very simple using the command-line interface and a prototxt modification. By default, Caffe attempts to load weights for named layers from the given caffemodel file. It's easiest to explain with an example: suppose you trained a network with layers "a", "b", and "c", and in the new network you want to load weights for "a" and "c", keep layer "b" in the architecture but discard its weights, and add a new layer "d". Write your new prototxt as you normally would, but rename "b" to, say, "b_", and add a definition for "d". When training the new model, supply the new prototxt together with the trained weights from the old model. Caffe will encounter layers "a" and "c", find layers with the same names in the caffemodel, and load their weights. For "b_" and "d" no counterparts will be found, so these layers will be initialized according to the specification in your prototxt.
However, this only works when the new model's layer "a" has exactly the same parameter shape (e.g. kernel size and count) as the corresponding layer in the model you are loading from; otherwise Caffe will complain and crash, and you will have to resort to net surgery to transfer those weights.
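Following that example, the relevant part of the new prototxt might look like this (layer types, shapes, and filler parameters are made up for illustration; layer "c" is omitted for brevity):

```protobuf
layer {
  name: "a"                 # same name and shape as in the old model:
  type: "Convolution"       # its weights are loaded from the caffemodel
  bottom: "data"
  top: "a"
  convolution_param { num_output: 64 kernel_size: 3 }
}
layer {
  name: "b_"                # renamed from "b": no match is found,
  type: "Convolution"       # so it is initialized from weight_filler
  bottom: "a"
  top: "b_"
  convolution_param {
    num_output: 128
    kernel_size: 3
    weight_filler { type: "xavier" }
  }
}
layer {
  name: "d"                 # brand-new layer, also randomly initialized
  type: "InnerProduct"
  bottom: "b_"
  top: "d"
  inner_product_param {
    num_output: 21
    weight_filler { type: "gaussian" std: 0.01 }
  }
}
```

Training is then started with the old weights supplied on the command line, e.g. `caffe train --solver solver.prototxt --weights old.caffemodel`.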

Manohar

Jun 12, 2017, 3:08:31 PM
to Caffe Users
That's very helpful. Thanks for the answer!