Hi,
Here is a new demo we made that I think might be of interest. It uses
the Kinect color image and depth map to achieve augmented reality with
occlusion and object manipulation. Without further ado, here is the
link to the demo (April 2012):
http://youtu.be/OTmHtAaQD_c
Now the description: we work on augmented reality (AR), where a virtual
object is added to a live video stream as if it were really part of the
scene. Last year we developed a Unity3D plugin that creates the AR
illusion. Basically, it grabs the webcam image and computes the camera
position with respect to a marker. Unity then simply shows the webcam
image superimposed with a render of the virtual object from the correct
viewpoint. This video shows what it looks like:
http://youtu.be/npuubLGpU1A
Then we wanted to let the user manipulate the state of the object,
rotating and resizing it with his hands. The Kinect is perfect for
that: its skeletal tracking measures where the user is relative to the
camera (hands, head, etc.). While doing so, we were confronted with the
occlusion problem. That is, if the hand is closer to the camera than
the virtual object, the hand must remain visible (we must not render
the virtual object over the hand). Again, the Kinect measures the
distance of everything it sees, the depth map, i.e. the z-distance.
Similarly, when a video card renders a virtual scene it computes
occlusion from the depth buffer, i.e. the z-distance of every object in
the virtual scene. The occlusion algorithm is then obvious: in the
shader, discard the virtual fragment wherever z_real < z_virtual (a
one-line clip()).
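The per-pixel test is easy to illustrate outside a shader. Here is a
minimal NumPy sketch of the same comparison (the array names and depth
values are made up for the example; in the actual plugin this test runs
in the fragment shader on the GPU):

```python
import numpy as np

# Toy stand-ins for the two depth sources, in millimetres (values are
# invented for illustration, not taken from a real Kinect frame).
real_depth = np.array([[800.0, 800.0],      # hand, in front of the object
                       [1500.0, 1500.0]])   # wall, behind the object
virtual_depth = np.full((2, 2), 1000.0)     # virtual object at 1 m

# Same decision the shader makes with clip(): wherever the real scene is
# closer than the virtual object, drop the virtual fragment and show the
# camera image instead.
show_real = real_depth < virtual_depth
print(show_real)
```

The top row (the hand) masks out the virtual object; the bottom row (the
wall) lets it show through.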
Thus, we used the Kinect (with the Microsoft SDK, but any other SDK
with skeletal tracking would do) to manage occlusion and to manipulate
the virtual objects interactively. Rendering is done in Unity3D Free;
occlusion is managed in a Unity shader (this is very fast since the
computation runs on the graphics card); manipulation is done in Unity3D
from the Kinect's skeletal information.
The biggest problem we faced was speed: transferring the depth map to
the shader was time-consuming. We had to resize the depth map to
512x512 to allow faster manipulation. While doing so, we were
confronted with a new problem: the occlusion clip was highlighting the
object's contours. Occlusion was correct, but when the hand was behind
the virtual object we could see through the virtual object at the
object boundary. The effect was kind of cool, but unwanted.
After some investigation we found out that OpenCV was using bilinear
interpolation to resize the depth map image. This was creating
intermediate depth values between the background and the object
boundaries.
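The effect is easy to reproduce without OpenCV. This tiny NumPy sketch
downsamples a one-row slice across a depth edge the way a bilinear
filter does (by averaging neighbours) versus nearest-neighbor; the
depth values are invented for the example:

```python
import numpy as np

# A 1-pixel slice across a depth edge: background at 2000 mm, hand at
# 600 mm (made-up values).
row = np.array([2000.0, 2000.0, 2000.0, 600.0, 600.0, 600.0])

# Downsampling by 2 with a bilinear-style filter blends neighbouring
# pixels, inventing a 1300 mm depth that exists nowhere in the scene:
# exactly the phantom contour we saw.
bilinear = row.reshape(-1, 2).mean(axis=1)   # -> 2000, 1300, 600

# Nearest-neighbor keeps only original depth values, so no phantom depth.
nearest = row[::2]                           # -> 2000, 2000, 600

print(bilinear)
print(nearest)
```

The 1300 mm pixel is closer than a virtual object at, say, 1500 mm, so
the occlusion clip punches a see-through rim around the hand.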
Forcing nearest-neighbor interpolation helped, but the contours were
still there: the shader itself was interpolating the values when we
sampled the depth from the texture. Texture coordinates are indexed
from 0 to 1 instead of 0 to width (or 0 to height). We had to snap the
fractional part of the coordinate to make sure it fell at the center of
a texture pixel, avoiding interpolation.
While doing so, we found out that Unity3D indexes pixel coordinates
from the pixel corner, not the center: an offset of 0.5 must be added
to the coordinate in the native image range.
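For the record, the coordinate fix can be written out in a few lines.
This is an illustrative Python version (the helper name is ours; in
practice the arithmetic lives inside our shader):

```python
def pixel_center_uv(x, y, width, height):
    """Snap an integer texel coordinate to its center in [0, 1] UV space.

    Sampling at (x / width, y / height) lands on a texel corner, where
    the hardware blends neighbouring texels. Adding 0.5 in native pixel
    units before normalizing lands on the texel center, where no
    interpolation happens.
    """
    u = (float(int(x)) + 0.5) / width
    v = (float(int(y)) + 0.5) / height
    return u, v

# Texel (3, 0) of our 512x512 depth map:
print(pixel_center_uv(3, 0, 512, 512))  # (3.5 / 512, 0.5 / 512)
```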
We are interested in comments from the community. While the technology
is cool, useful applications are difficult to find.