Using the OpenFace LFW evaluation code on VGG Face


Brandon Amos

May 21, 2016, 4:06:11 PM
to CMU-OpenFace
I'm moving this discussion from our chatroom to the mailing list:
 
Varun Suresh @varun-suresh May 19 09:43
I'm curious if anyone here has experimented with VGG's Caffe model. I'm not able to recreate their results on LFW. I used the pre-trained model available in the Caffe model zoo.
 
elbamos @elbamos May 19 11:29
Everyone has played with it and the reproducibility of the results is pretty well established.
 
Varun Suresh @varun-suresh May 20 16:38
@elbamos - This is the procedure I followed:
1) I used dlib's face detection and landmark detection to align the faces, via OpenFace's code, and resized the images to 224x224x3.
2) For the cropped images, I calculate the VGG Face descriptor: I load the image, subtract the mean, reshape to (1, 3, 224, 224), and feed that as the input to the deep neural net. (A sketch follows this list.)
3) I generate labels.csv and reps.csv as in OpenFace and use the lfw.py script to evaluate. I modified lfw.py to normalize the 4096-dimensional vectors before calculating the L2 distance.
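A minimal pycaffe sketch of step 2, with illustrative file names, assumed BGR mean values, and fc7 assumed as the descriptor layer:

import caffe
import cv2
import numpy as np

# Assumed file names for the VGG Face deploy prototxt and weights.
net = caffe.Net('VGG_FACE_deploy.prototxt', 'VGG_FACE.caffemodel', caffe.TEST)

img = cv2.imread('aligned_face.png').astype(np.float32)  # BGR, 224x224x3
img -= np.array([93.5940, 104.7624, 129.1863])           # assumed per-channel BGR mean
img = img.transpose(2, 0, 1)[np.newaxis, ...]            # HWC -> (1, 3, 224, 224)

net.blobs['data'].reshape(*img.shape)
net.blobs['data'].data[...] = img
net.forward()
descriptor = net.blobs['fc7'].data[0].copy()             # 4096-d face descriptor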
Please let me know if you did something differently. Thanks!
 
elbamos @elbamos May 20 16:44
@varun-suresh I can't really evaluate your method. But I have certainly used VGG many times, and I'm positive that it works well. Let me say that differently: I'm positive the model detects useful features in many cases, and I'm confident that the original work is reproducible research.
 
Varun Suresh @varun-suresh May 20 17:07
I'm certain it is reproducible; I'm just trying to figure out what I'm missing. As a sanity check, I looked at this notebook (https://github.com/AKSHAYUBHAT/TensorFace/blob/master/vggface/torch_verify.ipynb) and verified that I was calculating the descriptor correctly.
Do you use the 4096-dimensional vector or the embedding? Thanks.
 
---

I could be wrong, but I don't think you should normalize the VGG classification model's embeddings for the similarity metric, since normalization potentially removes a lot of information. Can you try modifying the OpenFace LFW evaluation code to sweep distance thresholds between 0 and N (found empirically) on the unnormalized embeddings instead of between 0 and 2?
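Something like the following sketch of the threshold sweep is what I have in mind. The 0-to-2 range comes from unit vectors, where ||a - b||^2 = 2 - 2*(a . b), so unnormalized descriptors need a wider, empirically chosen range. The helper below is illustrative, not the exact lfw.py code:

import numpy as np

def best_threshold_accuracy(dists, labels, max_dist):
    # dists: L2 distances between unnormalized descriptors for each LFW pair.
    # labels: True for same-identity pairs, False for different-identity pairs.
    thresholds = np.linspace(0.0, max_dist, 400)
    accuracies = [np.mean((dists < t) == labels) for t in thresholds]
    best = int(np.argmax(accuracies))
    return thresholds[best], accuracies[best]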

-Brandon.

Varun S

May 23, 2016, 11:54:36 AM
to CMU-OpenFace
I empirically found N and modified the LFW code; the accuracy was much lower, coming out to 0.7516. I initially thought normalizing would result in a loss of information, but their demo suggested normalizing, which is why I did it.
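To be concrete, the normalization is just unit-scaling each 4096-d vector before computing the distance (a sketch; the random arrays stand in for two fc7 descriptors):

import numpy as np

def unit_normalize(desc):
    # Scale a descriptor to unit L2 norm so pair distances fall in [0, 2].
    return desc / np.linalg.norm(desc)

a, b = np.random.rand(4096), np.random.rand(4096)  # stand-ins for two descriptors
dist = np.linalg.norm(unit_normalize(a) - unit_normalize(b))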
In their paper, in the Experiments and Results section, they describe multi-scale testing: they scale the image, and the final descriptor is the average over all the combinations. I haven't done that, and I think that could be why I'm not able to recreate their results. I will try it and post the results here.

Thanks
-Varun

Varun S

May 24, 2016, 10:21:03 AM
to CMU-OpenFace
Multi-scale testing resulted in an accuracy of 0.7508. In the training phase, they scaled the image and used a random 224x224 patch; from my understanding, this is to make the network more position- and scale-invariant. That is why I thought averaging descriptors over 224x224 patches of the scaled face (following the procedure in the paper, sketched below) could give better numbers. Initially, I calculated the accuracy without normalizing; it came out to 0.6892.
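The averaging I tried looks roughly like this (the scale set and crop choices reflect my reading of the paper, and compute_descriptor stands in for the fc7 forward pass shown earlier):

import cv2
import numpy as np

SCALES = [256, 384, 512]  # assumed test-time scales

def multiscale_descriptor(img, compute_descriptor):
    # Average unit-normalized descriptors over 224x224 crops of the face
    # rescaled to several sizes; compute_descriptor maps a 224x224x3
    # image to a 4096-d vector.
    descs = []
    for s in SCALES:
        scaled = cv2.resize(img, (s, s))
        offs = [0, (s - 224) // 2, s - 224]  # corner and center offsets
        crops = [scaled[y:y + 224, x:x + 224] for y in offs for x in offs]
        for c in crops:
            d = compute_descriptor(c)
            descs.append(d / np.linalg.norm(d))
    return np.mean(descs, axis=0)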

Any thoughts on this?

Thanks
Varun

Brandon Amos

Jun 3, 2016, 4:41:37 PM
to CMU-OpenFace
Interesting. It seems like the accuracy should be better than what you're getting, even without multi-scale testing. Have you resolved this yet? Have you tried contacting the authors?

-Brandon.

Brandon Amos

Jun 13, 2016, 11:51:59 AM
to CMU-OpenFace, Dante Knowles
Hi Varun,

I was thinking a little more about this, and I think using OpenFace's alignment could be causing the accuracy issues, since it may canonicalize the faces differently from what VGG Face expects. Is there a way you can use the same alignment as in the VGG paper? Also, we've added a new alignment technique that should work better; I think you should try it with VGG Face. See the details in https://groups.google.com/forum/#!topic/cmu-openface/aQuYBLcVKAo. After checking out the latest code, you can use it with our `align-dlib` script via the `--version 2` command-line flag.
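For example, something along these lines (the directory paths and landmark choice here are placeholders):

./util/align-dlib.py lfw/raw align outerEyesAndNose lfw/aligned --size 224 --version 2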

-Brandon.

Varun S

Jun 13, 2016, 11:59:04 AM
to CMU-OpenFace, god...@gmail.com
Hey Brandon,

I came to the same conclusion. I experimented with their face detection model without alignment, and I'm starting to get the results the paper reports. I will try to implement the alignment used in their paper.
Thanks for the work on the new alignment technique. I'll run some experiments with it in the next couple of days!