CNN + Cross Validation


Antonio Paes

Nov 11, 2015, 8:05:44 PM
to Caffe Users
Hi Guys,

Has anyone already done cross-validation in Caffe?

I'm doing it through manual splits, but I spend a lot of time on this.


Thanks!

Antonio Paes

Nov 15, 2015, 9:57:47 AM
to Caffe Users
I searched for how to do cross-validation in Caffe but didn't find anything, so I'm thinking of this:

I split my training set into 10 folds, and for each fold I create an LMDB file.

I train my CNN architecture for 20K iterations on folds 1 to 9 and use fold 10 for validation.

Then I fine-tune this trained network, just changing the LMDB files to alternate the train/validation splits.

Is this correct?
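Roughly, the splitting step looks like this (a minimal sketch; the labels.txt format, the file names, and the convert_imageset call are illustrative, not fixed):

# Sketch: split an image/label list into 10 folds and write per-fold
# train/val lists that Caffe's convert_imageset tool can turn into LMDBs.
# Assumes labels.txt has lines like "<image_path> <label>".
import random

K = 10
with open('labels.txt') as f:
    lines = f.readlines()
random.seed(0)          # fixed seed so the folds are reproducible
random.shuffle(lines)

folds = [lines[i::K] for i in range(K)]   # K roughly equal folds

for k in range(K):
    val = folds[k]
    train = [line for i in range(K) if i != k for line in folds[i]]
    with open('train_fold%d.txt' % k, 'w') as out:
        out.writelines(train)
    with open('val_fold%d.txt' % k, 'w') as out:
        out.writelines(val)
# Then, per fold (shell):
#   convert_imageset / train_fold<k>.txt train_lmdb_fold<k>
#   convert_imageset / val_fold<k>.txt   val_lmdb_fold<k>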

Thanks.

ath...@ualberta.ca

Nov 16, 2015, 12:45:14 PM
to Caffe Users
Just a general comment.

K-fold cross-validation is great, but it comes at a cost, since one must train and test k times. In traditional machine learning circles you will find cross-validation used almost everywhere, and most often with small datasets.

However, in deep learning circles you will generally find it used much less, since the gains to be had by training a single deep model for k times as long usually outweigh the statistical security that k-fold cross-validation provides. By "used much less" of course I mean never (you probably won't find a deep learning paper from Stanford or Berkeley that uses cross-validation). It is not uncommon for deep models to be trained for months (FaceNet took 5+ months), making cross-validation a moot point.

Also note that deep learning systems are now being trained on datasets with hundreds of billions of elements (see the Google TensorFlow white paper released a week ago). The benefits of cross-validation diminish when dealing with such large datasets, which is another reason it is not generally used in deep learning.

Antonio Paes

Nov 16, 2015, 1:03:42 PM
to Caffe Users
Hi athess, thanks for your answer.

But as you say, K-fold cross-validation is used on smaller datasets, which is my case: I'm using 6,000 images to train my network, which is why I'm trying to do cross-validation.

ath...@ualberta.ca

Nov 17, 2015, 2:10:42 PM
to Caffe Users
Smaller datasets, yes, but my point was that it's not usually done in deep learning. You can of course always do it as you mentioned in your original post and train 10 separate Caffe models (each on 9 folds), testing each on the remaining one.
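For example, something like this (a rough sketch; it assumes one solver_fold<k>.prototxt per fold, with data layers pointing at that fold's train/val LMDBs):

# Sketch: train K independent Caffe models, one per fold, via the
# caffe command-line tool. Assumes solver_fold<k>.prototxt points at
# train_lmdb_fold<k> / val_lmdb_fold<k> and snapshots to its own prefix.
import subprocess

K = 10
for k in range(K):
    subprocess.check_call(
        ['caffe', 'train', '--solver=solver_fold%d.prototxt' % k])
# Evaluate fold k's snapshot on its held-out fold, then average the
# K accuracies for the cross-validation estimate.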

Antonio Paes

Nov 17, 2015, 2:15:49 PM
to ath...@ualberta.ca, Caffe Users
OK, thanks athess, that clears up my doubts. And to choose which caffemodel to use, do I just pick the one with the best accuracy on my test subset?

Regards, Antonio Carlos Paes


Abhilash Panigrahi

Nov 20, 2015, 12:50:19 PM
to Caffe Users
Guys,

For 4-fold cross-validation, I have trained my network 4 times to get 4 different caffemodels. I am using each of these caffemodels on its left-out validation set, and I get a different accuracy in each case. How do I proceed from here? Should I choose the one with the highest accuracy?

Antonio Paes

Nov 20, 2015, 1:35:50 PM
to Caffe Users
Hi Abhilash, I'm using a K-fold cross-validation protocol.

I split my training subset into 10 folds, then I train on 9 and test on 1, so I have a trained caffemodel. Then I use the fine-tuning technique for the next split, and I do this until it has run on all splits. When I finish training on all splits, I have everything combined in a single caffemodel.

After this process I evaluate my trained caffemodel on a test subset.
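In code, the chain looks roughly like this (a sketch only; the solver and snapshot names are illustrative):

# Sketch of the chained fine-tuning described above. Each solver is
# assumed to snapshot to fold<k>_iter_20000.caffemodel. Note the
# data-leakage caveat raised later in this thread: the final model has
# seen every fold, so per-fold "test" scores are not valid CV estimates.
import subprocess

weights = None
for k in range(10):
    cmd = ['caffe', 'train', '--solver=solver_fold%d.prototxt' % k]
    if weights is not None:
        cmd.append('--weights=' + weights)  # fine-tune from previous split
    subprocess.check_call(cmd)
    weights = 'fold%d_iter_20000.caffemodel' % k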

Abhilash Panigrahi

Nov 20, 2015, 2:13:06 PM
to Caffe Users
Hi Antonio,
Doesn't your method result in over-fitting?

Antonio Paes

Nov 20, 2015, 7:48:38 PM
to Caffe Users
At first, yes, because I was using only 7K images for training, but now I'm using 245K. It is still training, but the validation behaves better now.

If you are seeing overfitting, try adding more images, or initialize the last layers from scratch instead of fine-tuning them.

ath...@ualberta.ca

Nov 22, 2015, 9:18:47 PM
to Caffe Users
Antonio, say you train on splits 1-9 and test on split 10 to get model A. Then the weights of A have been exposed to (have seen) the data in split 1.

Now you fine-tune A using, say, splits 2-10 and test on split 1. Herein lies the problem: you are testing model A on data that model A has already seen, which is peeking into the test (validation) set. Remember, the weights of model A have been influenced by split 1, so you cannot use split 1 for testing.

-

I recommend that you just randomly split your data into train, validation, and test sets. Train a single model on the training data and use the single validation set to check for overfitting. When all is completely done, evaluate on the test set and you are done.
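For example (a minimal sketch; the 80/10/10 ratios and file names are illustrative):

# Sketch: one random train/validation/test split (80/10/10 here)
# instead of K folds. Assumes labels.txt has "<image_path> <label>" lines.
import random

with open('labels.txt') as f:
    lines = f.readlines()
random.seed(0)
random.shuffle(lines)

n = len(lines)
n_train, n_val = int(0.8 * n), int(0.1 * n)
for name, subset in [('train.txt', lines[:n_train]),
                     ('val.txt', lines[n_train:n_train + n_val]),
                     ('test.txt', lines[n_train + n_val:])]:
    with open(name, 'w') as out:
        out.writelines(subset)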




Antonio Paes

Nov 23, 2015, 9:27:03 AM
to Caffe Users
I see, athess, that is indeed a problem. I understand that splitting into train, validation, and test is the best practice, but I really need cross-validation. Any ideas on how I can do this?