If the 20 classes are a subset of the 40, the simplest approach would
be to fine-tune the original 40-class network using the 20-class data,
assuming the class IDs match up (e.g. class 7 of the 20-class
dataset is also called class 7 in the 40-class dataset).
If the 20 classes are not a subset of the 40, then you would need to
create a new layer that outputs 20 values instead of 40. This can be
done in the *_train_val.prototxt file like so:
Section from old file:
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  inner_product_param {
    num_output: 40
  }
  ...
}
Section from new file:
layer {
  name: "fc8-new"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8-new"
  inner_product_param {
    num_output: 20
  }
  ...
}
Then use the following command to create and train a new network where
the first N-1 layers are initialized from the 40-class network (because
their layer names match) and the last layer is randomly initialized
(because there is no layer named fc8-new in the old net):
./caffe.bin train --weights=trained_40_class_weights_iter_xxxxx
--solver=solver.prototxt
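For reference, a minimal solver.prototxt for this kind of fine-tuning
might look like the sketch below. All of the values are illustrative
assumptions, not taken from the original post; train_val.prototxt is
whatever file contains the fc8-new layer:

  net: "train_val.prototxt"
  base_lr: 0.001
  lr_policy: "step"
  gamma: 0.1
  stepsize: 10000
  momentum: 0.9
  weight_decay: 0.0005
  max_iter: 20000
  snapshot: 5000
  snapshot_prefix: "finetune_20_class"
  solver_mode: GPU

A lower base_lr than was used for the original 40-class training is
typical when fine-tuning, so the pretrained weights are not disturbed
too aggressively.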
You can set the learning rates of the lower N-1 layers to 0 so that
only the new 20-output layer is trained (faster training, more likely
to underfit), or set them to a non-zero value so that all layers are
fine-tuned (slower, more likely to overfit).
Hope that helps!