Hi,
I am working on training a face recogniser. The final version will be trained on a few thousand faces.
As an easy starting point, I am trying to build a neural net that can distinguish between two faces: Princess Diana and David Beckham (faces it was easy to collect a lot of images for!)
I have preprocessed the images with OpenCV, so my training set consists of 227x227 images, each containing only the face. I have 300 images of each person. I know that isn't much, but this is more a practice run than anything.
My net is a modified CaffeNet file (attached), and I've also taken the CaffeNet solver.
I've been able to train it for 9000 iterations.
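My solver is essentially the stock CaffeNet one with the iteration count reduced; from memory it looks roughly like this (the net path is a placeholder, and the other values are my recollection of the BVLC defaults rather than a verbatim copy of my file):

```
net: "train_val.prototxt"   # placeholder path
test_iter: 100
test_interval: 500
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 100000
momentum: 0.9
weight_decay: 0.0005
max_iter: 9000              # reduced from the default 450000
snapshot: 5000
solver_mode: GPU
```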
What happens when I run the trained model is that I get the same output values every time, no matter what input image I give it:
David Beckham 0 0.432829
Princess Diana 1 0.567171
The numbers are exact every time to all visible decimal places.
This looks odd to me: I would not expect excellent performance at this stage, but I would at least have expected the outputs to change for different input images.
Has anyone seen this effect before? Is there something obvious that I'm missing?
I know CaffeNet isn't the best-suited architecture for this problem. I have also been looking at VGG Face (http://www.robots.ox.ac.uk/~vgg/software/vgg_face/), but they only supply the deploy.prototxt file, and I am not sure how to convert it to a train_test.prototxt.
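My current understanding of that conversion, which may well be wrong, is that the deploy input definition gets replaced by Data layers (one per phase) and the final Softmax gets replaced by SoftmaxWithLoss plus Accuracy, while the middle layers stay identical. Something like (the LMDB paths, batch sizes and the "fc8" bottom name are placeholders):

```
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  data_param { source: "train_lmdb" batch_size: 32 backend: LMDB }
}
# ... all the conv/fc layers from deploy.prototxt unchanged ...
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include { phase: TEST }
}
```

Is that the right general shape, or is there more to it?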
I would be grateful for any advice, especially if someone has tried making their own face recogniser from scratch as I am doing.
Thanks
Tom