Hello everyone.
I wanted to provide a validation set for my dataset (CIFAR10) in Caffe.
To my knowledge, the validation set is a fraction of the test set, right? Say, 25% of the actual test set.
Now, something crossed my mind today:
is it right to split my solver configuration into two files and set them up as follows?
Suppose, as in CIFAR10, we have 10,000 test images, and the whole test set has been converted into an LMDB database.
Again, suppose my batch size for testing is 100, as below (and my training batch size is 50):
name: "CIFAR10_full"
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
    mirror: true
  }
  data_param {
    source: "examples/cifar10/cifar10_train_lmdb"
    batch_size: 50
    backend: LMDB
  }
}
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
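For reference, the relationship between test_iter, the test batch size, and the portion of the set covered can be sketched in plain Python (compute_test_iter is a hypothetical helper of my own, not part of Caffe):

```python
# Hypothetical helper (not part of Caffe): how many forward passes
# (test_iter) are needed to cover a given fraction of the test set.
def compute_test_iter(total_images, batch_size, fraction=1.0):
    """Each test iteration processes one batch of `batch_size` images."""
    images_to_cover = int(total_images * fraction)
    return images_to_cover // batch_size

# With the CIFAR10 numbers above:
print(compute_test_iter(10000, 100, 0.25))  # validation solver -> 25
print(compute_test_iter(10000, 100))        # test solver       -> 100
```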
So far so good. Now, in one solver file, which would act as my train/validation solver, I would write:
# test_iter specifies how many forward passes the test should carry out.
# In the case of CIFAR10, with a test batch size of 100 and 25 test iterations,
# we cover 2,500 of the 10,000 testing images.
test_iter: 25 # 25*100 = 2,500, which is one quarter (25%) of the 10,000 test images
# Carry out testing every 1000 training iterations.
test_interval: 1000
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.1
momentum: 0.9
etc...
And in the second solver file, acting as my test solver, I would write:
# The train/test net protocol buffer definition
net: "examples/cifar10/cifar10_full_relu_train_test_bn.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of CIFAR10, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100 # 100*100 = 10,000, the whole test set
# Carry out testing every 1000 training iterations.
test_interval: 1000
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.1
momentum: 0.9
So basically, what I'm doing here is limiting the number of test iterations in the validation solver file (25 instead of 100, which, multiplied by the batch size, yields 25*100 = 2,500, i.e. 25% of the actual test set),
and then, in the test solver file, using the usual number of test iterations, i.e. 100 (100*100 = 10,000), meaning the whole test dataset is used.
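As an aside, if one wanted the 25% validation subset drawn at random rather than simply taken as the first batches read from the LMDB, the index split could be sketched in plain Python (split_indices is a hypothetical helper of mine, not something Caffe provides):

```python
import random

# Hypothetical sketch: carve a random validation subset out of a test
# set by index (names are my own, not Caffe's).
def split_indices(n_total, val_fraction=0.25, seed=0):
    rng = random.Random(seed)  # fixed seed for a reproducible split
    idx = list(range(n_total))
    rng.shuffle(idx)
    n_val = int(n_total * val_fraction)
    return idx[:n_val], idx[n_val:]

val_idx, test_idx = split_indices(10000, 0.25)
print(len(val_idx), len(test_idx))  # 2500 7500
```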
Is this right or not? I'd be grateful to know.
Thanks in advance