Custom Gesture Recognition using OpenNI and Unity3D


MichaelK

Jun 24, 2011, 10:57:09 AM
to OpenNI
Hi guys,

I have uploaded a video of my first implementation of custom
gestures. It works really well; just have a look here:
http://www.youtube.com/watch?v=t89TUfjFuGs
Right now it's a beta version, but I look forward to finishing the work
this month :)

Regards,
Michael

kenshen

Jun 24, 2011, 11:15:02 AM
to OpenNI
Awesome, can't wait :)

Kaushal Jain

Jun 24, 2011, 2:36:47 PM
to openn...@googlegroups.com
Some really interesting stuff!


Bill

Jun 25, 2011, 8:50:41 AM
to OpenNI
This looks excellent. Do you plan on releasing it to the community?

Bill

MichaelK

Jun 25, 2011, 9:39:09 AM
to OpenNI
Right now it's in the beta stage. I have to discuss this with the
company I am working for. Maybe I can release a DLL that can be
used ;)

hassan abu al-haija

Jun 25, 2011, 8:59:00 PM
to OpenNI
Really awesome work! :)
I just want to ask if you can give us some details about the method
you are using here.
Are you using a machine learning method to learn the pose and then
classify it? Or are you using a more classical approach of saving
angles and tolerances?
Did you use any libraries other than OpenNI? Did you use the C#
wrapper of NITE?

Sorry for so many questions, I am just very interested in it :)

Cheers, and congratulations on the great job.

tim obarr

Jun 25, 2011, 9:11:55 PM
to openn...@googlegroups.com
Yeah, seriously, I am working on a Kinect RPG and would love this; it would make creating animations so much easier.


MichaelK

Jun 26, 2011, 8:05:36 AM
to OpenNI
I am only using OpenNI's C# wrapper and some Unity3D functions. I am
not using machine learning; like you wrote, I use the joint
information together with a tolerance. I think that works better than
machine-learned approaches and is much faster :)
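
To sketch the idea in a language-neutral way (this is a minimal Python illustration, not the C#/Unity code of the tool; the joint names, template values, and tolerance are invented): each tracked joint is compared against a recorded pose, and the pose only counts as matched when every joint lies within a distance tolerance:

import math

# Hypothetical recorded pose: joint name -> (x, y, z) in millimetres,
# e.g. as reported by an OpenNI skeleton tracker. Values are invented.
RECORDED_POSE = {
    "left_hand":  (-300.0, 400.0, 1800.0),
    "right_hand": ( 300.0, 400.0, 1800.0),
    "head":       (   0.0, 600.0, 1900.0),
}

def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(3)))

def matches_pose(current_joints, recorded_pose, tolerance_mm=150.0):
    """Return True if every recorded joint is within tolerance_mm
    of the corresponding live joint position."""
    for joint, target in recorded_pose.items():
        live = current_joints.get(joint)
        if live is None or distance(live, target) > tolerance_mm:
            return False
    return True

# Example live frame (also invented):
frame = {"left_hand": (-280.0, 390.0, 1810.0),
         "right_hand": (310.0, 410.0, 1790.0),
         "head": (5.0, 595.0, 1905.0)}
print(matches_pose(frame, RECORDED_POSE))  # True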

Naëm Baron

Jun 27, 2011, 3:16:36 AM
to OpenNI
It's pretty neat!
The plugin seems well done, with lots of features...

So you did that at work, right? It would be cool to release it, both
to use custom gestures and because I am curious to see how a
"fully integrated" Unity plugin is coded.

NB

Sam Muscroft

Jun 27, 2011, 6:08:04 AM
to OpenNI
Looks like a nice tool for triggering events based on static poses or
for capturing mocap data. I would have to disagree with you about
machine learning though, Michael; you've made a sweeping generalisation
about a term which encompasses many different techniques and
technologies. I have successfully used an SVM for hand pose recognition,
and it can be trained in seconds and recognise in real time very
accurately.

Paulo Trigueiros

Jun 27, 2011, 7:43:57 AM
to openn...@googlegroups.com
Hi Sam,

I saw your message; I would like to use SVM for gesture recognition.
I'm using openFrameworks + Code::Blocks for development. Is there a
library for openFrameworks? How did you implement it? Can you share
code or experience?
Thanks in advance,
Paulo Trigueiros


Jeff Winters

Jun 27, 2011, 2:31:02 PM
to openn...@googlegroups.com
Wow! This is superb -- well done, MichaelK! Add me to the list of those clamouring for a source release, or failing that, a detailed description of how you used the joint information to do gesture tracking.

Sam Muscroft

Jun 28, 2011, 1:02:08 PM
to OpenNI
Hi Paulo,

I used pure OpenCV for the image processing and machine learning. I
am unable to share the source, but the basic principle was to track
the contours and then train the SVM with contour features. This has
its limitations, as it's based on 2D descriptors... I am interested in
looking at using point clouds with the Kinect to gain better 3D shape
descriptors; there is already some work done on this by ROS
(http://pointclouds.org/).
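
As a rough sketch of that kind of pipeline (a minimal Python/OpenCV illustration; the synthetic masks and the particular shape features are stand-ins I've chosen, not the descriptors or code I actually used): extract features from the largest contour of a binary mask and train an SVM on them:

import cv2
import numpy as np

def contour_features(mask):
    """Simple shape features of the largest contour in a binary (0/255) mask."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)[-2:]
    c = max(contours, key=cv2.contourArea)
    area = cv2.contourArea(c)
    perimeter = cv2.arcLength(c, True)
    x, y, w, h = cv2.boundingRect(c)
    circularity = 4 * np.pi * area / (perimeter ** 2)  # ~1.0 for a circle
    extent = area / (w * h)                            # area vs. bounding box
    return np.array([circularity, extent], dtype=np.float32)

def synthetic_mask(label, size):
    """Stand-in for a thresholded hand silhouette (0 = circle, 1 = square)."""
    img = np.zeros((160, 160), np.uint8)
    if label == 0:
        cv2.circle(img, (80, 80), size, 255, -1)
    else:
        cv2.rectangle(img, (80 - size, 80 - size), (80 + size, 80 + size), 255, -1)
    return img

# Toy training set: a handful of masks per class, varying in size.
samples, labels = [], []
for label in (0, 1):
    for size in range(30, 50, 2):
        samples.append(contour_features(synthetic_mask(label, size)))
        labels.append(label)
samples = np.array(samples, dtype=np.float32)
labels = np.array(labels, dtype=np.int32)

svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)
svm.setKernel(cv2.ml.SVM_RBF)
svm.train(samples, cv2.ml.ROW_SAMPLE, labels)

# Classify an unseen mask.
test = contour_features(synthetic_mask(1, 37)).reshape(1, -1)
_, prediction = svm.predict(test)
print("predicted class:", int(prediction[0][0]))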

If you're interested in machine learning with computer vision, I
recommend buying the OpenCV O'Reilly book
(http://www.amazon.com/Learning-OpenCV-Computer-Vision-Library/dp/0596516134);
the machine learning chapter is very well written and explains the core
concepts well at a high level.

An important thing to remember is that the features with which you
train your classifier must be unique, distinct, and repeatable...
garbage in = garbage out. With this in mind it is possible to classify
most types of data. I've found SVM to perform very well on smaller data
sets, and there is no lengthy training period as there is with some
decision tree models.

If you're at all interested in this, it is well worth reading around
the subject rather than learning by modifying code; you'll gain a
deeper understanding that way.

Sam.

Paulo Trigueiros

Jun 28, 2011, 2:55:18 PM
to openn...@googlegroups.com
Hi Sam,

Thanks for your answer. I'm going to try it and I will give some feedback.
I'm already using OpenCV inside openFrameworks for other things, so I
will look at the machine learning stuff.
Thanks again,
Paulo

Radu B. Rusu

Jun 29, 2011, 12:41:32 PM
to openn...@googlegroups.com, Sam Muscroft
Sam,

On 06/28/2011 10:02 AM, Sam Muscroft wrote:
> Hi Paulo,
>
> I used pure OpenCV for the image processing and machine learning . I
> am unable to share the source, but the basic principle was to track
> the contours and then train the svm with contour features. This has
> it's limitations as it's based on 2d descriptors...i am interested in
> looking at using point clouds with the kinect to gain better 3d shape
> descriptors, there is already some work done on this by ROS
> (http://pointclouds.org/).

Just as a clarification: the PCL project is standalone and has nothing to do with ROS. ;) Most of its developers are
geographically distributed around the world, with the OpenCV team working on PCL as well (see
www.pointclouds.org/about.html).

Cheers,
Radu.
--
Point Cloud Library (PCL) - http://pointclouds.org

Sam Muscroft

Jun 30, 2011, 4:54:07 AM
to Radu B. Rusu, openn...@googlegroups.com
Hi Radu,

Apologies, I thought there was some affiliation between ROS and pointclouds.org. There's a link to documentation about PCL on the ros.org website (http://www.ros.org/wiki/pcl), hence I added 2+2 and came up with 5. Again, sorry for any confusion caused.

Cheers,

Sam.

MichaelK

Jul 12, 2011, 12:24:16 PM
to OpenNI
Hi guys,

I have uploaded a second video of my custom gesture recognition:
http://www.youtube.com/watch?v=dQz78rWVBwA

Regards,
Michael

Sam Muscroft

Jul 12, 2011, 12:42:10 PM
to openn...@googlegroups.com
Nice work Michael... looks like a very useful tool.

A quick question: are the gestures always based on absolute co-ordinates? I mean, where you've recorded the swipe gesture with your arms above your head, do you always have to have your hands in the same start position in order to begin the recognition of that particular gesture?

Last time I was looking into motion-based gestures, I was using OpenCV and looking at motion history gradients, where a silhouette of the tracked object (a hand in my case) was captured over time and the change in movement angle was captured for each frame/silhouette. I suspect it would be possible to use this technique along with your approach, so that you could store arrays of the skeleton point co-ordinates over time and track the deltas of the positions rather than the positions themselves; that way you could detect movement in a relative fashion, making it more flexible. Just an idea...
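
As a minimal sketch of the delta idea (invented joint tracks, plain Python, not code from either project): reducing each frame to the displacement since the previous frame makes the same swipe produce the same signature wherever it starts:

def deltas(trajectory):
    """Frame-to-frame displacement vectors for one joint's (x, y, z) track."""
    return [tuple(b[i] - a[i] for i in range(3))
            for a, b in zip(trajectory, trajectory[1:])]

# Two hypothetical right-hand tracks: the same left-to-right swipe,
# recorded at different starting positions in the room.
swipe_high = [(100, 500, 2000), (200, 500, 2000), (300, 500, 2000)]
swipe_low  = [(-400, 100, 2500), (-300, 100, 2500), (-200, 100, 2500)]

print(deltas(swipe_high))  # [(100, 0, 0), (100, 0, 0)]
print(deltas(swipe_low))   # [(100, 0, 0), (100, 0, 0)] -> same motion signature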











Jman

Jul 12, 2011, 2:48:59 PM
to OpenNI
Hey Michael,

Your work on this is incredible. I'm working on a similar project
in .NET and you have given me a lot of inspiration to move ahead.
Keep the videos coming!

- Justin

kenshen

Jul 20, 2011, 10:00:44 AM
to OpenNI
So, will this be released any time soon?

MichaelK

Jul 20, 2011, 11:58:55 AM
to OpenNI
@Sam: I save the positions of every joint in four different ways:
Absolute
Absolute with orientation to the front
Relative
Relative with orientation to the front

The orientation means that the joints are rotated so that the skeleton
faces the Kinect. That way you can turn around without losing the
gesture.
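
As a rough Python sketch of what "relative with orientation to the front" can mean (not the actual C#/Unity implementation; the joint layout and the shoulder-based yaw estimate are my own assumptions): each joint is expressed relative to the torso and then rotated about the vertical axis so the shoulder line is parallel to the sensor:

import math

def normalize_pose(joints):
    """Express joints relative to the torso, then rotate about the vertical
    axis so the shoulder line is parallel to the sensor plane.

    joints maps joint names to (x, y, z) positions in millimetres;
    the joint names and the shoulder-based yaw estimate are assumptions.
    """
    tx, ty, tz = joints["torso"]
    rel = {name: (x - tx, y - ty, z - tz) for name, (x, y, z) in joints.items()}

    # Body yaw estimated from the left->right shoulder direction in the x-z plane.
    (lx, _, lz) = rel["left_shoulder"]
    (rx, _, rz) = rel["right_shoulder"]
    yaw = math.atan2(rz - lz, rx - lx)

    # Rotate every joint by the yaw so the shoulder depth difference becomes zero.
    c, s = math.cos(yaw), math.sin(yaw)
    return {name: (x * c + z * s, y, -x * s + z * c)
            for name, (x, y, z) in rel.items()}

# Invented example: the user has turned roughly 45 degrees away from the camera.
pose = {
    "torso":          (0.0, 0.0, 2000.0),
    "left_shoulder":  (-141.0, 300.0, 1859.0),
    "right_shoulder": (141.0, 300.0, 2141.0),
    "right_hand":     (300.0, 100.0, 2300.0),
}
print(normalize_pose(pose)["right_hand"])  # roughly (424, 100, 0): same pose, facing front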

@Jman: Thanks :) I think another video is coming in 2 weeks.

@kenshen: I hope that I can release it. I have to ask my boss :) But
that can take 3-4 weeks...


Sam Muscroft

Jul 20, 2011, 12:25:36 PM
to openn...@googlegroups.com
Thanks, Michael. When you say relative, what are the positions relative to? I think this tool could be extremely useful if the gestures were machine-learned based on point deltas, so that you could start from any position. It is then possible to work out the gradient from point to point; this way you can build up a chain of unique position features for each gesture to use in a supervised machine learning algorithm.

I imagine that this technique or similar is how some of the NITE gestures are trained.
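
A minimal Python sketch of the point-to-point gradient idea (the track and the 8-way quantisation are my own assumptions, not how NITE or Michael's tool does it): each step of a joint track is reduced to a coarse direction code, so a gesture becomes a short chain of symbols you could feed to a supervised classifier:

import math

def direction_chain(track, bins=8):
    """Quantise each frame-to-frame step of a 2D joint track into one of
    bins coarse direction codes (a chain-code-like gesture signature)."""
    chain = []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        angle = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        chain.append(int(angle / (2 * math.pi) * bins) % bins)
    return chain

# Invented hand positions tracing a rough clockwise circle.
circle = [(0, 1), (0.7, 0.7), (1, 0), (0.7, -0.7),
          (0, -1), (-0.7, -0.7), (-1, 0), (-0.7, 0.7), (0, 1)]
print(direction_chain(circle))  # eight direction codes describing the circle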






MichaelK

Jul 20, 2011, 12:44:55 PM
to OpenNI
It's relative to any joint you define, for example the torso or the
head... But you could of course also use the Kinect as the center.
I think the machine-learned approach is not so good, because it would
result in "too many" gestures. I mean that when I swipe with my hand,
it would automatically be recognized as a gesture, and that is not
what I wanted. With my system you can define for yourself where the
gesture should be recognized and where not. It is also possible to
edit gestures and to show users how a gesture is performed!


Sam Muscroft

Jul 20, 2011, 12:54:38 PM
to openn...@googlegroups.com
Yes, that is the tricky thing about vision-based gestures: there needs to be a start and an end state for a gesture. A steady state could be used to denote the starting point, and then, if the correct sequence of features is detected over a certain time period, this could yield a positive classification. You could then filter out classifications using a confidence measure and by tracking variance over a number of frames... I've used this technique for hand posture recognition with a webcam.

When I have some time I'll pick up motion-based gestures with the Kinect... I have hand posture recognition working with it using 2D features, but there's room for improvement.
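
A minimal Python sketch of that control flow (all thresholds and the classify stub are placeholders, not code from either project): wait for a steady start pose, collect a fixed window of positions, and accept a classification only above a confidence threshold:

from collections import deque

def is_steady(window, max_spread=20.0):
    """Steady state: the tracked point barely moves across the window."""
    xs = [p[0] for p in window]
    ys = [p[1] for p in window]
    return (max(xs) - min(xs)) < max_spread and (max(ys) - min(ys)) < max_spread

def classify(features):
    """Placeholder classifier: returns (label, confidence)."""
    return "swipe_right", 0.9   # stand-in for a real SVM / model

def run(frames, steady_len=10, gesture_len=30, min_confidence=0.8):
    """Consume a stream of (x, y) hand positions and emit gesture labels."""
    recent = deque(maxlen=steady_len)
    recording = []
    armed = False
    for point in frames:
        recent.append(point)
        if not armed:
            # Only arm recognition once the hand has been still for a while.
            armed = len(recent) == steady_len and is_steady(recent)
            continue
        recording.append(point)
        if len(recording) == gesture_len:
            label, confidence = classify(recording)
            if confidence >= min_confidence:
                print("recognised:", label)
            recording, armed = [], False

# Example: 10 still frames followed by a rightward swipe (invented data).
stream = [(0, 0)] * 10 + [(i * 15, 0) for i in range(30)]
run(stream)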

Best of luck with your project.

Sam.

MichaelK

Jul 20, 2011, 2:48:11 PM
to OpenNI
You mentioned another important point: time.

What if I want to do a very slow gesture, with pauses and so on? That
would not work if you set a time limit. And what if I draw a circle,
but halfway through I go back and then forward again... I don't think
such gestures could be created as easily as in my system...
