Hi,
which database do you use?
I have only trained on FaceScrub (80k images) and the net was overfitting.
On WLFDB I get only 20%, but that is because of the weakly labeled data.
Now I am trying to get a bigger database; I hope this will help with the overfitting.
In Caffe "0" mean wrong, "1" right,
Caffe version: L(W,Y,X1,X2) = Y * (Ew^2) + (1-Y) * (Q -Ew^2)
I am right?
If yes, that is the first thing to develop.
The second thing is to make the learning process and architecture modelling easier. It should look like this:
- define one network
- the contrastive loss should sample image pairs from the data
- the softmax loss should work as normal
I need a better understanding of the sampling process for the contrastive loss, so I will ask the author.
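(To make the label convention and margin parameter above concrete, here is a minimal sketch of the contrastive loss layer, written in the style of Caffe's MNIST Siamese example; layer and blob names are placeholders, not from an actual CASIA/DeepID2 net.)

layer {
  name: "contrastive_loss"
  type: "ContrastiveLoss"
  bottom: "feat"      # features from the first branch
  bottom: "feat_p"    # features from the second branch
  bottom: "sim"       # 1 = same identity, 0 = different identity
  top: "contrastive_loss"
  contrastive_loss_param {
    margin: 1.0       # only dissimilar pairs closer than the margin contribute
  }
}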
@Ho
Are you reproducing the CASIA model? If yes, are you using two types of losses or only one?
1. Question: You mentioned a weighted cost function. As I understand it, the SoftMax loss is multiplied by alpha and the contrastive cost has weight = 1, right? Is the step schedule for alpha the same as for the learning rate?
Answer: My cost function is (softmax + alpha * contrastive). The learning rate and alpha are adjusted 3 times during training. The learning rates are 1e-2, 1e-3, 1e-4 and 1e-5. The alphas are 3.2e-4, 9e-4, 2e-3 and 6.4e-3. The final results are not very sensitive to the values of alpha, but you could refer to my setting.
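(In Caffe terms, a cost of (softmax + alpha * contrastive) would simply be two loss layers whose loss_weight values are 1 and alpha. A minimal sketch, assuming placeholder blob names and the first alpha value from the answer above:)

layer {
  name: "softmax_loss"
  type: "SoftmaxWithLoss"
  bottom: "fc_id"
  bottom: "label"
  top: "softmax_loss"
  loss_weight: 1        # identification term
}
layer {
  name: "contrastive_loss"
  type: "ContrastiveLoss"
  bottom: "feat"
  bottom: "feat_p"
  bottom: "sim"
  top: "contrastive_loss"
  loss_weight: 3.2e-4   # alpha, increased in later training stages
}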
2. Question: Could you explain a bit more the process of sampling face pairs from the training set? I understand it this way:
- take a batch of faces and compute the SoftMax loss
- sample pairs from the batch and compute the contrastive loss
The questions are:
- do you sample pairs after each batch, or take the next two samples from the training set?
- how many positive and negative pairs do you sample? The same quantity, or at random?
Answer: I just sample face pairs within each batch. Usually, 10,000 positive and 10,000 negative pairs are drawn from each batch. You could adjust this number for your convenience.
3. Question: The contrastive cost has a parameter "margin". What value did you set it to?
Answer: You could read CUHK's "DeepID2+" paper for the details of "margin" in contrastive loss.
By the way, the datasets I used do have a lot of errors. Do you think this will influence the final feature vectors, and could that be why the linear SVM doesn't work?
1. I did not use a mean file; how do you calculate your mean file?
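(For reference, not the poster's actual setup: the mean image is usually computed once from the training LMDB with Caffe's build/tools/compute_image_mean tool and then referenced from the data layer; the paths below are placeholders.)

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  transform_param {
    mean_file: "face_train_mean.binaryproto"  # output of compute_image_mean
  }
  data_param {
    source: "face_train_lmdb"
    batch_size: 128
    backend: LMDB
  }
}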
Hi Bartosz, thanks for your mail, please see my comments inline.
I have made some progress in the face recognition process. So far my best score on LFW is 90%. I was using the architecture from the CASIA paper and training on FaceScrub (it has 70k images). Now I have just started training on CASIA-WebFace; we will see what the result will be.
A 90% classification rate is quite remarkable performance; using the CUHK DeepID2 library we merely achieve 80% on LFW, which matches your DeepID2 test result. I think FaceScrub is more suitable than CASIA-WebFace for Caffe training due to its higher quality, so we plan to use FaceScrub to train our HD-LBPH.
More Technical stuff:
- I use a Siamese architecture, because every paper about face verification uses a "verification loss": you take two faces from the database with a same/not-same label and produce a loss. As I do not know how to do this in Caffe any other way, I use a Siamese architecture, where in each batch I have an identification and a verification loss (I compare images from 2 separate nets). Using the verification loss gives +10-15% on LFW.
Are you using the Siamese network which you posted on Feb 26th? We failed to train your Siamese architecture on CASIA-WebFace (on a Quadro K1100M GPU; we shall move to a K40 soon); both the contrastive and softmax losses stay high.
BTW, you mentioned in your posts that the CASIA-WebFace authors suggested adjusting the learning rate according to the softmax loss movement; how can you dynamically change it in the prototxt or solver?
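(For reference, one standard way, not necessarily what Bartosz did: learning-rate drops can be scheduled directly in the solver prototxt with the "multistep" policy; the values below are placeholders. Chaining several solvers from snapshots, as in the script later in this thread, is another option.)

base_lr: 0.01
lr_policy: "multistep"
gamma: 0.1            # multiply the learning rate by 0.1 at each stepvalue
stepvalue: 75000
stepvalue: 150000
stepvalue: 225000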
- I tested 2 architectures, DeepID2 (79% on LFW) and CASIA (90%)
Do you use FaceScrub to train your DeepID2 network, so the overfitting problem disappears?
- I could not find any implementation of Joint Bayesian, so I do not use it. Do you have an implementation? I use Chi^2 distance + SVM.
One of our researchers talked with the CUHK DeepID2 researchers; they told us that Joint Bayesian is the best among all methods for face verification, but you should first reduce the dimensionality from 4000 to 160/320. I think the missing Joint Bayesian is one of the critical factors why your DeepID2 cannot achieve good performance. We also cannot find any source code for Joint Bayesian; we are developing a C/C++ version of Joint Bayesian based on OpenCV now, and we can share our code with you as soon as we finish.
- I use only 2D Alignment
BTW, face pre-processing strongly affects recognition accuracy; you might notice we have to align and frontalize faces before we use them, and DeepID2 also needs to slice a face picture into 25 overlapping patches. I remember you use dlib for image pre-processing; do you slice your face images into patches? This might seriously affect your recognition accuracy.
We use CUHK's private DeepID2 library (face detection) + OpenCV/IntraFace (face alignment) for face image pre-processing, and we are developing face patch extraction using IntraFace.
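(For reference on the Siamese setup discussed above: in Caffe the two branches share weights by giving the corresponding parameters the same name, as in the MNIST Siamese example; the layers below are only a placeholder sketch, not Bartosz's actual net.)

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { name: "conv1_w" }   # shared weights
  param { name: "conv1_b" }   # shared bias
  convolution_param { num_output: 32 kernel_size: 5 }
}
layer {
  name: "conv1_p"
  type: "Convolution"
  bottom: "data_p"
  top: "conv1_p"
  param { name: "conv1_w" }   # same names -> same blobs as the first branch
  param { name: "conv1_b" }
  convolution_param { num_output: 32 kernel_size: 5 }
}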
Hi Bartosz:
I am writing network and solver prototxts according to CASIA-WebFace. I saw your mail thread with the CASIA member about some questions on the cost function:
Question 1:
The CASIA member proposes to use a cost function like softmax + alpha * contrastive.
Does this mean we only need to set loss_weight in the contrastive layer and leave it unset in the softmax layer?
Question 2:
You said you change the loss_weight of the contrastive loss (from 3.2e-3 to 6e-2) using several *.prototxt files. Does this mean you pipeline several solver prototxts to adjust the loss_weight parameter gradually?
BR
#!/usr/bin/env sh
TOOLS=/home/blcv/LIB/caffe_master/build/tools/

$TOOLS/caffe train \
  --solver solver.prototxt --gpu 1 2> log.txt

# turn on Contrastive loss
$TOOLS/caffe train \
  --solver solver30k.prototxt --gpu 1 \
  --snapshot=nets/_iter_75001.solverstate 2> log_75k.txt

# Contrastive loss * 5
$TOOLS/caffe train \
  --solver solver60k.prototxt --gpu 1 \
  --snapshot=nets/_iter_150000.solverstate 2> log_150k.txt

# Contrastive loss * 4
$TOOLS/caffe train \
  --solver solver120k.prototxt --gpu 1 \
  --snapshot=nets/_iter_225001.solverstate 2> log_225k.txt
I want to reproduce the DeepFace net architecture. As their database is not public, I use the biggest public face dataset, WLFDB. It has 0.7 million images of 6,025 subjects.

My net architecture is pretty much the same as in the paper (I check the feature dimensions at every step to be sure); for now I train only the classification net. But, as I do not have 1k images per subject, I remove the last two LOCAL layers (taken from here). As the frontalization code for DeepFace is not released, I try to use the raw faces and the aligned faces delivered with WLFDB.

My problem is that the net overfits (something like train: 85%, test: 15%). In the paper and presentation they claim that the net does not overfit at all (they provide a plot of the log-loss on train and test). Facebook uses dropout only in the last layer. I did not find any information about data augmentation (mirroring, color jitter, random crops); I understand that these techniques may not be appropriate for face classification, but I do not know why my net overfits so badly.

Question: has anybody tried to reproduce DeepFace? Or maybe tried to reproduce DeepID? (I use not exactly the same architecture, but it overfits too.)

I think the problem is not the net architecture but the data; maybe I am missing some data pre-processing before training. How can I reduce overfitting?
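(An aside on the augmentation point above, not from the original post: mirroring and random cropping can be enabled directly in a Caffe data layer via transform_param; whether they help for face classification is a separate question. Paths and sizes below are placeholders.)

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true       # random horizontal flips
    crop_size: 100     # random 100x100 crops at training time
  }
  data_param {
    source: "wlfdb_train_lmdb"
    batch_size: 128
    backend: LMDB
  }
}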