Re-training when the number of classes grows

Andres Romero Mier y Teran

Jan 28, 2015, 9:10:09 AM
to caffe...@googlegroups.com
I've tested Caffe to train a CNN classifier for an image recognition engine application. 

I am wondering: what is the recommended procedure to follow if the number of classes in the dataset changes (suppose we add a new category to it)?

I imagine we must change the size of the INNER_PRODUCT layer that feeds the SOFTMAX output layer, but how can we use the previous network solver state? Or do we need to train the whole network from scratch?

Thank you very much for your help.

Andres Romero

Evan Shelhamer

Feb 5, 2015, 1:35:00 PM
to Andres Romero Mier y Teran, caffe...@googlegroups.com
There isn't really a standard approach to this yet as it's still somewhat of an open research question, but here's what I've done to good effect:

1. learn the original model with num_output = k
2. define the new model with num_output > k
3. do net surgery to transfer the old parameters into the new model's layer (here's a connect-the-dots example: https://gist.github.com/shelhamer/bee2a5b2b739fe6cee6f#file-expand_output-py); the params for the new classes can be randomly or zero initialized
4. add the data for new classes to your existing data
5. fine-tune

This still involves training on all the data, but could be faster than starting from scratch.
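
The parameter transfer in step 3 can be sketched with plain NumPy arrays. This is only an illustration and not the code from the linked gist: the function name `expand_output_params` and its arguments are hypothetical, and it assumes the output layer stores its weights with shape (num_output, input_dim) and its bias with shape (num_output,), which is the layout of a Caffe inner product layer.

```python
import numpy as np


def expand_output_params(old_weights, old_bias, new_num_output,
                         init_std=0.01, seed=0):
    """Copy a k-way output layer into a larger one (hypothetical helper).

    old_weights: array of shape (k, input_dim)
    old_bias:    array of shape (k,)
    Rows for the new classes are drawn from N(0, init_std^2);
    pass init_std=0.0 to zero-initialize them instead.
    """
    k, input_dim = old_weights.shape
    if new_num_output <= k:
        raise ValueError("new_num_output must be greater than the old k")

    rng = np.random.default_rng(seed)
    # Start from the initialization used for the new classes.
    new_weights = rng.normal(
        0.0, init_std, size=(new_num_output, input_dim)
    ).astype(old_weights.dtype)
    new_bias = np.zeros(new_num_output, dtype=old_bias.dtype)

    # Transfer the learned parameters of the original k classes.
    new_weights[:k] = old_weights
    new_bias[:k] = old_bias
    return new_weights, new_bias


# Toy example: grow a 2-class layer over 3 features to 4 classes.
w = np.arange(6, dtype=np.float32).reshape(2, 3)
b = np.array([1.0, 2.0], dtype=np.float32)
new_w, new_b = expand_output_params(w, b, new_num_output=4, init_std=0.0)
```

With pycaffe, the arrays would presumably come from `net.params[layer_name][0].data` (weights) and `net.params[layer_name][1].data` (bias) of the old net and be written into the same slots of the new net before fine-tuning; the layer name depends on your model definition.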


Evan Shelhamer

Andres Romero Mier y Teran

Feb 6, 2015, 7:48:53 AM
to caffe...@googlegroups.com, andres...@gmail.com
Thank you for your answer and for sharing your net surgery code!

Jennifer Wang

Mar 18, 2016, 2:58:07 AM
to Caffe Users
Thank you for the answer, Evan.

You mentioned that your method still involves training on all the data. Is there a way to train using just the data for the new class? Looking forward to your answer.

Jennifer