Can you train on top of the pre-trained .caffemodel file?


Carlo Alessi

Feb 12, 2017, 5:34:14 PM
to Caffe Users
Is it possible to train on top (not from scratch) of the bvlc_reference_caffenet.caffemodel with your own images? 

I don't see any .solverstate file, so I presume it is not possible, is that correct?

Avi Parshan

Feb 13, 2017, 3:43:39 AM
to Caffe Users

You can fine-tune a pretrained network; take a look at this: http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html

Carlo Alessi

Feb 14, 2017, 5:20:15 AM
to Caffe Users
Thank you for answering. I have a couple of questions:

1) Can I add one class and change the num_output parameter in the last fc layer from 1000 to 1001?

Then I would like to train with additional images, some from the existing classes and some from the added class. This means I will have a different image_mean.binaryproto.

2) Which image mean should I use during training and deployment, the original (imagenet_mean.binaryproto) or the new one?

Przemek D

Feb 14, 2017, 5:59:39 AM
to Caffe Users
1. If you just change num_output and attempt to load a pretrained caffemodel, Caffe will fail with a shape mismatch error. You will have to perform manual net surgery if you want to change a layer's shape but still load weights into it.
2. You should use a mean image of the dataset you're using to train (fine-tune).
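
To make point 2 concrete, here is a minimal NumPy sketch of what a mean image is and how it gets subtracted. It assumes the images are already loaded into one array shaped (N, C, H, W). The function name `mean_image` and the random stand-in data are illustrative only, and the sketch does not read or write the .binaryproto format.

```python
import numpy as np


def mean_image(images):
    """Per-pixel mean over a batch of images shaped (N, C, H, W)."""
    images = np.asarray(images, dtype=np.float64)
    if images.ndim != 4:
        raise ValueError("expected an array shaped (N, C, H, W)")
    # Average over the batch axis only, keeping channels and pixels.
    return images.mean(axis=0)


# Stand-in data set (hypothetical): 4 random images, 3 channels, 8x8 pixels.
rng = np.random.default_rng(0)
dataset = rng.integers(0, 256, size=(4, 3, 8, 8)).astype(np.float64)

mean = mean_image(dataset)   # shape (3, 8, 8)
centered = dataset - mean    # the mean broadcasts over the batch axis
```

Whichever mean is used for fine-tuning, the same one would then be subtracted from inputs at deployment so that the network sees consistently preprocessed data.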

Carlo Alessi

Feb 17, 2017, 3:55:16 PM
to Caffe Users
Hi, 

I understand that one would get a shape mismatch if one just changes the num_output parameter, but I don't get how that net surgery tutorial is useful in my case.

Here is an example of what I would like to do:
- classify Gorilla (label number 366 in the synset file)
- classify Chimpanzee (label number 367 in the synset file)
- classify background / jungle / grass (not present, so I want to add this class, number 1000)

Would the following work?

1) Add 'background, jungle, grass' to the file synset_words.txt
2) Create training set like this: 

chimpanzee_0.png 367
chimpanzee_1.png 367
gorilla_0.png 366
background_0.png 1000

3) Change num_output to 1001 (should I change the name of the layer as well?)
4) Resume training of the pretrained model
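
Step 2 above can be sketched in a few lines of Python. This is only an illustration of the "filename label" list format shown in the post. The helper name `make_list_lines` and its `num_classes` parameter are hypothetical. The range check assumes that with num_output set to 1001 the valid labels are 0 through 1000.

```python
def make_list_lines(samples, num_classes=1001):
    """Return 'filename label' lines, one per (filename, label) pair."""
    lines = []
    for filename, label in samples:
        # Labels must index an existing output: 0 .. num_classes - 1.
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} out of range for {filename}")
        lines.append(f"{filename} {label}")
    return lines


# The four example entries from the post above.
samples = [
    ("chimpanzee_0.png", 367),
    ("chimpanzee_1.png", 367),
    ("gorilla_0.png", 366),
    ("background_0.png", 1000),
]
lines = make_list_lines(samples)
listing = "\n".join(lines) + "\n"  # text that could be written to a list file
```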

Przemek D

Feb 27, 2017, 4:30:20 AM
to Caffe Users
If you want to have more classes, you need to have more room for weights. There are two ways to do this: with and without transfer.
Without transfer, you simply change num_output and rename the layer - weights get randomly initialized and you must retrain the classifier on the complete dataset. If you don't rename you get a load mismatch.
With transfer, i.e. if you want to train the network to recognize one additional class while still remembering the previously known classes, you need to perform net surgery. What you want to do is edit the pretrained caffemodel and change the last layer's weight blob shape from (1000, n) to (1001, n), and its bias from 1000 to 1001 entries. Then you can change num_output and load the weights (the layer name stays the same). This way you avoid the mismatch but still get the learned weights transferred.
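
A rough NumPy sketch of the array manipulation behind the "with transfer" route follows. It assumes the last layer's weights are a 2-D array shaped (num_output, n) with n = 4096, and a bias with num_output entries. The helper `expand_fc` and the choice of small Gaussian values for the new row are assumptions for illustration, not something prescribed by Caffe.

```python
import numpy as np


def expand_fc(weights, bias, extra=1, std=0.01, seed=0):
    """Append `extra` output rows to a fully connected layer's parameters.

    Existing rows are copied unchanged, so the old classes keep their
    learned weights. New rows get small random weights and zero bias.
    """
    weights = np.asarray(weights)
    bias = np.asarray(bias)
    if weights.ndim != 2 or bias.shape != (weights.shape[0],):
        raise ValueError("expected weights (out, in) and bias (out,)")
    rng = np.random.default_rng(seed)
    new_rows = rng.normal(0.0, std, size=(extra, weights.shape[1]))
    new_w = np.vstack([weights, new_rows.astype(weights.dtype)])
    new_b = np.concatenate([bias, np.zeros(extra, dtype=bias.dtype)])
    return new_w, new_b


# Stand-in arrays with the shapes of a 1000-class last layer. With pycaffe
# these would come from net.params['fc8'][0].data (weights) and
# net.params['fc8'][1].data (bias), as in the net surgery notebook.
old_w = np.random.default_rng(1).normal(size=(1000, 4096)).astype(np.float32)
old_b = np.zeros(1000, dtype=np.float32)

new_w, new_b = expand_fc(old_w, old_b)  # shapes (1001, 4096) and (1001,)
```

The expanded arrays would then be written into a net defined with num_output: 1001 and saved as a new caffemodel. How the two nets are loaded side by side is left out here, because it depends on the prototxt files in use.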

Carlo Alessi

Feb 27, 2017, 7:22:59 AM
to Caffe Users
Yes, I want to do it with transfer (remember the old classes and recognize one additional class).

How can I change the last layer weight blob shape? I am using the Reshape layer in this way but I get the error:

Cannot copy param 0 weights from layer 'fc8'; shape mismatch.  Source param shape is 1 1 1000 4096 (4096000); target param shape is 1001 4096 (4100096). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.

layer {
  type: "Reshape"
  bottom: "fc8"
  top: "fc8_reshaped"
  reshape_param {
    shape {
      dim: 16
      dim: 1001
    }
  }
}

Please find my train_val.prototxt attached.
train_val.prototxt

Przemek D

Feb 27, 2017, 8:17:54 AM
to Caffe Users
> How can I change the last layer weight blob shape?
I have already pointed you to the tutorial on net surgery: http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/net_surgery.ipynb
Read it carefully; it describes ways of accessing and modifying the weights of an existing caffemodel. With this and a basic knowledge of Python (and NumPy!), you can do anything with your weights.
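
As a side note on why a Reshape layer cannot help here: reshaping only rearranges the elements of an activation blob and never changes how many there are, and it does not touch the layer's weights. The NumPy sketch below illustrates this with the numbers from the error message. The batch size of 16 is taken from the `dim: 16` in the snippet above and is otherwise an assumption.

```python
import numpy as np

batch = 16

# Stand-in for the fc8 output of a 1000-class net: 16 * 1000 values.
scores = np.zeros((batch, 1000), dtype=np.float32)

# A reshape must preserve the element count, so asking for (16, 1001) fails.
try:
    scores.reshape(batch, 1001)
    reshape_ok = True
except ValueError:
    reshape_ok = False

# The same arithmetic explains the counts in the error message: the stored
# weights hold 1000 * 4096 values, while the edited layer expects 1001 * 4096.
old_count = 1000 * 4096
new_count = 1001 * 4096
missing = new_count - old_count  # one extra row of 4096 weights
```

The missing 4096 weights (plus one bias value) have to be created in the caffemodel itself, which is what editing the weight arrays does.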

Soumen Pramanik

Feb 27, 2017, 1:21:50 PM
to Caffe Users
Hello Carlo,
Sorry to ask this here, as it is not relevant to your question. I have been using TF for a long time, but I have had some trouble installing Caffe, which is very complicated, so I need some guidance. Could you please help me? Thank you very much for your help.

Carlo Alessi

Feb 27, 2017, 3:34:55 PM
to Caffe Users

Carlo Alessi

Feb 27, 2017, 3:41:12 PM
to Caffe Users
All I understand from that tutorial is that you can transform an FC layer into a convolutional one. Apart from this, I don't get whether I have to write some code or just change something in the prototxt file. Could you please explain a bit more?

Also, I understand Python, but I am using C++ for my project (sorry for not having mentioned that).