I'm trying to prepare the LFW dataset for a CNN face recognition network. I've managed to create LMDB files for both the training and test data using Caffe's *convert_imageset* tool. The tool takes a text file as input, with the following format describing the images and their classes:
<picture name> <classID>
01_pictureOfClass1.jpg 1
02_pictureOfClass1.jpg 1
01_pictureOfClass2.jpg 2
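For context, here is a minimal sketch of how such a listing could be generated in Python. The `images_by_class` mapping and the `build_listing` helper are hypothetical names used only for illustration; they are not part of Caffe. The class IDs start at 1 only to match the example above.

```python
def build_listing(images_by_class, first_id=1):
    """Return lines of the form '<picture name> <classID>'.

    images_by_class: dict mapping a class name to a list of picture names
    (hypothetical input structure, assumed for this sketch).
    """
    lines = []
    # Assign consecutive integer IDs to classes in sorted order.
    for class_id, class_name in enumerate(sorted(images_by_class), start=first_id):
        for picture in sorted(images_by_class[class_name]):
            lines.append(f"{picture} {class_id}")
    return lines


example = {
    "Class1": ["01_pictureOfClass1.jpg", "02_pictureOfClass1.jpg"],
    "Class2": ["01_pictureOfClass2.jpg"],
}
listing = build_listing(example)
print("\n".join(listing))
```

The resulting lines can be written to the txt file that is passed to *convert_imageset*.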
The LFW website suggests a concrete split into matching and mismatching pairs for training and testing, so in this example it could look something like:
#Matching pairs
01_pictureOfClass1.jpg 02_pictureOfClass1.jpg
#Mismatching pairs
01_pictureOfClass1.jpg 01_pictureOfClass2.jpg
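To make the pair format concrete, here is a rough sketch of how matching and mismatching pairs could be derived from a labelled listing. Note that this simply enumerates every possible pair; it is not the official LFW split, and the `make_pairs` helper is a hypothetical name of my own.

```python
from itertools import combinations


def make_pairs(listing_lines):
    """Split all picture pairs into matching and mismatching ones.

    listing_lines: lines of the form '<picture name> <classID>'.
    """
    # Split on the last space so the picture name stays intact.
    labelled = [line.rsplit(" ", 1) for line in listing_lines]
    matching, mismatching = [], []
    for (pic_a, id_a), (pic_b, id_b) in combinations(labelled, 2):
        if id_a == id_b:
            matching.append((pic_a, pic_b))
        else:
            mismatching.append((pic_a, pic_b))
    return matching, mismatching


lines = [
    "01_pictureOfClass1.jpg 1",
    "02_pictureOfClass1.jpg 1",
    "01_pictureOfClass2.jpg 2",
]
matching, mismatching = make_pairs(lines)
```

A class with only one picture can never contribute a matching pair here, which is part of what the third question below is about.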
1. Is there any way to force Caffe to learn from specific pairs of matching and mismatching data?
2. If not, how should I split the full data set of 5549 classes into training and testing datasets?
3. There are multiple classes in LFW that consist of only one picture. How should I treat these single-picture classes? Can I even train a network to recognize such classes if they are only represented in the testing dataset?