Can you train on top of the pre-trained .caffemodel file?

Carlo Alessi

Feb 12, 2017, 5:34:14 PM

to Caffe Users

Is it possible to train on top of bvlc_reference_caffenet.caffemodel (not from scratch) with your own images?

I don't see any .solverstate file, so I presume it is not possible. Is that correct?

Avi Parshan

Feb 13, 2017, 3:43:39 AM

to Caffe Users

You can fine-tune a pretrained network; take a look at this: http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html

Carlo Alessi

Feb 14, 2017, 5:20:15 AM

to Caffe Users

Thank you for answering. I have a couple of questions:

1) Can I add one class and change the num_output parameter in the last fc layer from 1000 to 1001?

Then I would like to train with additional images, some from existing classes and some from the added class. This means I will have another image_mean.binaryproto.

2) Which image mean should I use during training and deployment, the original (imagenet_mean.binaryproto) or the new one?

Przemek D

Feb 14, 2017, 5:59:39 AM

to Caffe Users

1. If you just change num_output and try to load a pretrained caffemodel, Caffe will throw a shape mismatch error. You will have to perform manual net surgery if you want to change a layer's shape but still load weights into it.
2. You should use the mean image of the dataset you're using to train (fine-tune).
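
For point 2, here is a minimal sketch of what a mean image is. It assumes the fine-tuning images are already loaded as a NumPy array of shape (N, C, H, W); `compute_mean_image` and `subtract_mean` are hypothetical helper names, not Caffe functions. Caffe also ships a `compute_image_mean` tool that produces the .binaryproto from the training database, so this is only meant to illustrate the computation.

```python
import numpy as np

def compute_mean_image(images):
    """Return the per-pixel mean over a batch of images.

    images: array of shape (N, C, H, W); the result has shape (C, H, W).
    """
    images = np.asarray(images, dtype=np.float64)
    if images.ndim != 4:
        raise ValueError("expected an array of shape (N, C, H, W)")
    return images.mean(axis=0)

def subtract_mean(image, mean_image):
    """Subtract the mean image from one (C, H, W) image."""
    return np.asarray(image, dtype=np.float64) - mean_image

# Toy data standing in for the fine-tuning set: 4 images, 3 channels, 2x2 pixels.
batch = np.arange(4 * 3 * 2 * 2, dtype=np.float64).reshape(4, 3, 2, 2)
mean_image = compute_mean_image(batch)
centered = subtract_mean(batch[0], mean_image)
```

The same mean has to be subtracted at training and at deployment time, so that the network sees inputs with the same statistics in both cases.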

Carlo Alessi

Feb 17, 2017, 3:55:16 PM

to Caffe Users

Hi,

I understand that one would get a shape mismatch if one just changes the num_output parameter, but I don't get how that net surgery tutorial is useful in my case.

Here is an example of what I would like to do:

- classify Gorilla (label number 366 in the synset file)

- classify Chimpanzee (label number 367 in the synset file)

- classify background / jungle / grass (not present, so I want to add this class, number 1000)

Would the following work?

1) Add 'background, jungle, grass' to the file synset_words.txt

2) Create training set like this:

chimpanzee_0.png 367

chimpanzee_1.png 367

gorilla_0.png 366

background_0.png 1000

3) Change num_output to 1001 (should I change the name of the layer as well?)

4) Resume training of the pretrained model
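
The list in step 2 can be generated and sanity-checked with a short script. This is only a sketch: `make_train_list`, `label_of` and `files_by_class` are hypothetical names, and it assumes the `filename label` format shown above with zero-based labels that must be smaller than num_output.

```python
def make_train_list(files_by_class, label_of, num_output):
    """Build 'filename label' lines for an image list file.

    files_by_class: dict mapping class name -> list of image file names.
    label_of: dict mapping class name -> integer label.
    num_output: size of the final classifier layer; labels must be < num_output.
    """
    lines = []
    for class_name, files in files_by_class.items():
        label = label_of[class_name]
        if not 0 <= label < num_output:
            raise ValueError(
                "label %d for %r does not fit num_output=%d"
                % (label, class_name, num_output))
        for name in files:
            lines.append("%s %d" % (name, label))
    return lines

label_of = {"gorilla": 366, "chimpanzee": 367, "background": 1000}
files_by_class = {
    "chimpanzee": ["chimpanzee_0.png", "chimpanzee_1.png"],
    "gorilla": ["gorilla_0.png"],
    "background": ["background_0.png"],
}
train_lines = make_train_list(files_by_class, label_of, num_output=1001)
print("\n".join(train_lines))
```

With num_output=1000 the same call raises a ValueError for the background class, which is why label 1000 only makes sense once the last layer has 1001 outputs.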

Przemek D

Feb 27, 2017, 4:30:20 AM

to Caffe Users

If you want to have more classes, you need more room for weights. There are two ways to do this: with and without transfer.
Without transfer, you simply change num_output and rename the layer; the weights get randomly initialized and you must retrain the classifier on the complete dataset. If you don't rename the layer, you get a shape mismatch on load.
With transfer, i.e. if you want to train the network to recognize one additional class while still remembering the previously known classes, you need to perform net surgery. What you want to do is edit the pretrained caffemodel and change the last layer's weight blob shape from (1000, n) to (1001, n), where n is the number of inputs to the layer (the bias blob grows from 1000 to 1001 entries accordingly). Then you can change num_output and load the weights (the layer name stays the same); this way you avoid the mismatch but still get the learned weights transferred.
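
A minimal sketch of the array manipulation behind the transfer approach, in Python with NumPy. It assumes the last layer stores one row of weights per output class, i.e. shape (1000, 4096) with a bias of shape (1000,), as in the reference CaffeNet fc8 layer. `expand_classifier` is a hypothetical helper, and the small Gaussian initialization of the new row is an assumption, not something prescribed in this thread. With pycaffe, the arrays would come from `net.params['fc8'][0].data` (weights) and `net.params['fc8'][1].data` (bias), and the expanded arrays would be written into a net built from the prototxt with num_output: 1001 before saving it with `net.save(...)`.

```python
import numpy as np

def expand_classifier(weights, bias, extra_classes=1, std=0.01, seed=0):
    """Grow a fully connected classifier by `extra_classes` outputs.

    weights: (num_classes, num_inputs) array, one row per class.
    bias:    (num_classes,) array.
    The learned rows are copied unchanged; new rows get small random
    weights and a zero bias (an assumed initialization).
    """
    weights = np.asarray(weights)
    bias = np.asarray(bias)
    if weights.ndim != 2 or bias.shape != (weights.shape[0],):
        raise ValueError("expected weights (K, D) and bias (K,)")
    rng = np.random.RandomState(seed)
    new_rows = rng.normal(0.0, std, size=(extra_classes, weights.shape[1]))
    new_weights = np.vstack([weights, new_rows.astype(weights.dtype)])
    new_bias = np.concatenate([bias, np.zeros(extra_classes, dtype=bias.dtype)])
    return new_weights, new_bias

# Stand-in for the pretrained fc8 parameters (random here, learned in practice).
old_w = np.random.RandomState(1).randn(1000, 4096).astype(np.float32)
old_b = np.zeros(1000, dtype=np.float32)
new_w, new_b = expand_classifier(old_w, old_b)
print(new_w.shape, new_b.shape)  # (1001, 4096) (1001,)
```

Because the first 1000 rows are untouched, the network initially scores the old classes exactly as before; only the extra row has to be learned from scratch during fine-tuning.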

Carlo Alessi

Feb 27, 2017, 7:22:59 AM

to Caffe Users

Yes, I want to do it with transfer (remember the old classes and recognize one additional class).

How can I change the last layer weight blob shape? I am using the Reshape layer shown below, but I get this error:

Cannot copy param 0 weights from layer 'fc8'; shape mismatch. Source param shape is 1 1 1000 4096 (4096000); target param shape is 1001 4096 (4100096). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.

layer {
  type: "Reshape"
  bottom: "fc8"
  top: "fc8_reshaped"
  reshape_param {
    shape {
      dim: 16
      dim: 1001
    }
  }
}

Please find my train_val.prototxt attached.

train_val.prototxt
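
The numbers in the error message above can be checked by hand. The sketch below assumes that the source shape `1 1 1000 4096` is the legacy four-dimensional form of a (1000, 4096) weight matrix; `count` and `squeeze_leading` are hypothetical helpers written for this illustration. The mismatch is in the parameters of fc8, whereas a Reshape layer only changes the shape of the data blob passed between layers, so adding one does not make the error go away.

```python
from functools import reduce
from operator import mul

def count(shape):
    """Number of elements in a blob of the given shape."""
    return reduce(mul, shape, 1)

def squeeze_leading(shape):
    """Drop leading dimensions of size 1 (hypothetical helper)."""
    dims = list(shape)
    while len(dims) > 1 and dims[0] == 1:
        dims.pop(0)
    return tuple(dims)

source = (1, 1, 1000, 4096)  # shape stored in the pretrained caffemodel
target = (1001, 4096)        # shape implied by num_output: 1001

print(count(source), count(target))   # 4096000 4100096
print(squeeze_leading(source))        # (1000, 4096)
print(count(target) - count(source))  # 4096 extra weights for the new class
```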

Przemek D

Feb 27, 2017, 8:17:54 AM

to Caffe Users

> How can I change the last layer weight blob shape?

I have already pointed you to the tutorial on net surgery: http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/net_surgery.ipynb
Read it carefully; it describes ways of accessing and modifying the weights of an existing caffemodel. With this plus basic knowledge of Python (and NumPy!), you can do anything with your weights.

Soumen Pramanik

Feb 27, 2017, 1:21:50 PM

to Caffe Users

Hello Carlo,
Sorry to ask you this, as it is not relevant to your question. I have been using TF for a long time, but I have had some trouble installing Caffe. The installation is very complicated, so I need some guidance. Could you please help me? Thank you very much for your help.

Carlo Alessi

Feb 27, 2017, 3:34:55 PM

to Caffe Users

Hi,

I followed this guide https://github.com/BVLC/caffe/wiki/Ubuntu-16.04-or-15.10-Installation-Guide. Give it a try!

Carlo Alessi

Feb 27, 2017, 3:41:12 PM

to Caffe Users

All I understand from that tutorial is that you can transform an fc layer into a convolutional one. Apart from this, I don't understand whether I have to write some code or just change something in the prototxt file. Could you please explain a bit more?

Also, I understand Python, but I am using C++ for my project (sorry for not mentioning that earlier).
