Your confusion probably comes from the naming in Caffe.
In general machine learning terms, a validation set is used to judge how good the hyperparameters used for training on the training set are. A validation set is also used for early stopping: you stop training a neural network when it keeps improving on the training set but its performance starts to degrade on the validation set (the number of iterations is a hyperparameter too, right?). Once all model parameters and hyperparameters are found and fixed, one uses a final test set only to report the true error of the model on completely unseen data.
In Caffe, it just so happens that the validation set is named "a test set", that's all. So what you should typically do is:
- Build a training set (that's what model weights are optimized on)
- Build a validation set (called the test set in Caffe; use it to find the best number of iterations, weight_decay, etc.)
- Build a test set (use this to find the true accuracy of your model, after Caffe training is all done)
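The three steps above can be sketched in plain Python. This is only an illustration, not part of Caffe: the function name `three_way_split`, the 10% fractions and the fixed seed are all assumptions made for the example. In practice you would then write each list out in whatever format your data layers read.

```python
import random


def three_way_split(items, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle items and return (train, val, test) lists.

    Hypothetical helper: the fractions and the seed are illustrative defaults.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed so the split is reproducible
    n_test = round(len(items) * test_fraction)
    n_val = round(len(items) * val_fraction)
    test = items[:n_test]                 # held out until training is finished
    val = items[n_test:n_test + n_val]    # what Caffe's TEST phase would read
    train = items[n_test + n_val:]        # what the weights are optimized on
    return train, val, test


train, val, test = three_way_split(range(1000))
print(len(train), len(val), len(test))
```

The point is that the final test list is never shown to Caffe during training, not even through the TEST phase.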
Your other question, about test_iter in solver.prototxt:
This just specifies the number of test batches (actually "validation batches") used to calculate accuracy. For example, if the test data layer in your network.prototxt has a batch size of 40 and test_iter = 100, then after every test_interval training iterations, 4000 test images (100 batches of 40) are run through the network, and the accuracy is calculated and reported over those 4000 images/datapoints.
As far as I understand (someone correct me if I'm wrong), the following settings make no difference to the reported accuracy:
test_iter=4, test batch size=100
test_iter=20, test batch size=20
test_iter=100, test batch size=4
..you get the point: every one of these settings results in 400 images being tested.
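The arithmetic behind these three settings can be checked with a few lines of Python. Nothing here calls Caffe; the `settings` list simply restates the (test_iter, batch size) pairs above.

```python
# (test_iter, test batch size) pairs taken from the list above
settings = [(4, 100), (20, 20), (100, 4)]

# Images seen in one test pass = number of batches * images per batch
totals = [test_iter * batch_size for test_iter, batch_size in settings]

for (test_iter, batch_size), total in zip(settings, totals):
    print(f"test_iter={test_iter}, batch size={batch_size}: {total} images")
```

What does change between these settings is how many images are pushed through the network at once, so a larger batch size needs more memory per forward pass.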