what about training/test/val set


Benedetta Savelli

Feb 26, 2016, 9:31:47 AM2/26/16
to Caffe Users
I'm a bit confused about the division between the training set, test set, and validation set. When creating the solver.prototxt file that defines the training parameters, I have to set the test_iter parameter (which seems to be the number of test iterations, i.e. how many batches of test images are sent through the network). Why do I have to define this parameter for training? Is the test set used during training as a validation set? Or does Caffe do training and testing at the same time?





Mohamed Ezz

Feb 26, 2016, 6:57:55 PM2/26/16
to Caffe Users
Your confusion probably comes from the naming in Caffe.

In general machine learning terms, a validation set is used to judge how good the hyperparameters used to train on the training set are. A validation set is also used for early stopping a neural network, when it seems to do very well on the training set but performance starts to degrade on the validation set (the number of iterations is a hyperparameter too, right?). After all model parameters and hyperparameters are found and fixed, you use a final test set just to report the true error of the model on completely unseen data.

In Caffe, it just happens that they call that validation set "a test set", that's all. So what you should typically do is:
- Build a training set (that's what the model weights are optimized on)
- Build a validation set (called the test set in Caffe; use it to find the best number of iterations, weight_decay, etc.; this is what the TEST-phase data layer points at, as sketched below)
- Build a test set (use this to find the true accuracy of your model, after Caffe training is all done)
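To make that concrete, here is a minimal sketch of how the data layers in network.prototxt might point at the different sets. The LMDB paths and batch sizes are just placeholders, not values from your setup:

# Hypothetical network.prototxt fragment: the TRAIN phase reads the training set,
# the TEST phase reads what Caffe calls the "test" set (really the validation set).
# Paths and batch sizes below are placeholders.
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  data_param {
    source: "examples/my_dataset/train_lmdb"   # training set
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TEST }
  data_param {
    source: "examples/my_dataset/val_lmdb"     # validation set ("test" in Caffe terms)
    batch_size: 4
    backend: LMDB
  }
}

The held-out test set is not referenced during training at all; you evaluate on it separately once training is finished.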

Your other question about test_iter in solver.prototxt:
This just specifies the number of test batches (actually "validation batches") used to calculate accuracy. So if, in your network.prototxt, the batch size of the test data layer is set to 4, for example, and test_iter = 100, then every test_interval training iterations, 400 test images will be predicted, and accuracy is calculated and reported over those 400 images/datapoints (see the solver sketch below).
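Here is a minimal sketch of the relevant solver.prototxt fields; the values are only illustrative, not a recommendation:

# Hypothetical solver.prototxt fragment (values are placeholders).
net: "examples/my_dataset/train_val.prototxt"
test_iter: 100        # run 100 forward passes over the TEST-phase data layer
test_interval: 500    # do this every 500 training iterations
# With a TEST-phase batch_size of 4, each evaluation covers 100 * 4 = 400 images.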

As far as I understand (someone correct me if I'm wrong), the following settings make absolutely no difference:
test_iter = 4,   test batch size = 100
test_iter = 20,  test batch size = 20
test_iter = 100, test batch size = 4
...you get the point: all of these result in 400 images being tested.