Hi Eduard,
> Any news on this?
The demo isn't ready yet, but it's probably not quite what you're
looking for if you want to train a face detector. All the demo does is
show activations from a pre-trained model; there's no extra training
going on.
> I'm trying to get my head around face recognition using caffe. So far, I
> can theoretically follow your statement of intercepting face-recognising
> neurons directly.
> Did you implement a sliding window that triggers only when those face
> neurons fire? Or did you go with r-CNN?
> (I can't follow on r-CNN, because I don't have matlab :'( )
If by "sliding window" you mean "convolution", and by "triggers" you
mean "the relu activation is greater than 0", then yes. But this is
just the operation of every conv/relu layer. The network used was
trained on ILSVRC12 classification only (no detection labels). You
could use the similar pre-trained model at models/bvlc_reference_caffenet.
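To make the conv/relu point concrete, here's a toy numpy sketch (not Caffe code, just an illustration) of what a single conv + ReLU unit does: the kernel slides over the image, and the unit "triggers" wherever the rectified response is positive.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 2D cross-correlation ("sliding window"), valid padding."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            # dot product of the kernel with the window under it
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

# toy input and toy "face template" filter
image = np.random.randn(8, 8)
kernel = np.random.randn(3, 3)

activation = np.maximum(conv2d_valid(image, kernel), 0.0)  # ReLU
triggered = activation > 0  # map of where the unit "fires"
```

So the "sliding window face detector" is exactly what every conv/relu layer already computes, just with learned kernels and many channels.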
> My initial thought, however, was to fine-tune a modified net where
> the ultimate fully-connected layer has only 2 outputs (face/no-face). What
> do you think about this?
Great idea! Just grab a dataset and stick an FC layer atop, say,
conv5. Given the locality of the representation (that is, the face or
no-face information seems to be contained in a few channels as opposed
to distributed across many), you might have good luck using L1 weight
decay instead of or in addition to L2.
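If it helps, here's a tiny numpy sketch (names are mine, not from any library) of why L1 vs. L2 decay matters: L2 shrinks all weights proportionally, while L1 subtracts a fixed magnitude each step, driving small weights to exactly zero — which is what you want if the face signal lives in a few channels.

```python
import numpy as np

def sgd_step(w, grad, lr=0.01, l1=0.0, l2=0.0):
    """One SGD step with optional L1 and/or L2 weight decay.

    L2 adds l2 * w to the gradient (proportional shrinkage);
    L1 adds l1 * sign(w) (constant shrinkage -> sparsity).
    """
    decay = l2 * w + l1 * np.sign(w)
    return w - lr * (grad + decay)
```

If I remember right, Caffe exposes the L1 option via regularization_type in the solver prototxt (L2 is the default), so you shouldn't need any custom code.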
jason
---------------------------
Jason Yosinski, Cornell Computer Science Ph.D. candidate
http://yosinski.com/ +1.719.440.1357