Mapping an RGB pixel coordinate to its nearest corresponding depth 3D coordinate.

xilconic

Jun 27, 2011, 6:31:32 AM
to OpenKinect
While browsing this group for posts about the RGB-depth correspondence, I
noticed I only read about the Depth -> RGB mapping. Now I understand
some camera calibration is in order (both monocular and stereo), which
yields the camera intrinsic parameters and the pose difference
between the two cameras.

Now I personally am interested in finding the 3D depth location
belonging to some RGB coordinate. I want to do this so I can use
feature tracking in the RGB images, find the estimated 3D locations of
these points, and use those to estimate a pose difference between data
frames. I intend to use a least-squares rigid pose estimate, with the
help of RANSAC for outlier removal (because it is possible to have a
mismatch in the feature <-> depth location correspondence).

So my question is: how do I derive a 3D location from a 2D RGB
coordinate? I was thinking about first mapping RGB coordinates to
depth coordinates (2D to 2D), then reading out the depth value that
belongs to that depth coordinate. Would that be a smart thing to do?
If so, how would I do such a thing, given that I have my Kinect
calibrated? If not, why not, and what would be a better alternative?

Looking forward to a reply,
Xilconic

Bilal

Jun 27, 2011, 8:43:45 AM
to OpenKinect
Hi Xilconic,

I think I have the same problem as you!
In my case I am working on augmented reality with the Kinect, so pose
estimation is the first processing step I want to get working.
The difference between us, I think, is that I have a CAD model (a
theoretical model of my object), which is not the case for you!
I think you want to use the depth map to extract 3D points of the object
and estimate a pose (location) of those 3D points in the RGB image. In that case
I think edges are the best information you can exploit, to:
1- Optimise compute time (CPU time) if you are aiming for a real-time
application.
2- Edges (like lines) often carry the boundary information of the object,
which corresponds to the depth (DFD: depth from defocus).
Hope that can help, and sorry for my bad English!
When I have results I will keep you informed,

Regards,
--
Bilal

xilconic

Jun 27, 2011, 1:35:56 PM
to OpenKinect
Hi Bilal,

I actually have an implementation of feature tracking, the sparse
iterative version of Lucas-Kanade optical flow in pyramids as built
into the MRPT library. So getting those points from an RGB image is not
a problem. What is the problem is getting the mapping from a generic
(x, y) coordinate in the RGB image to a 3D point with respect to the
Kinect itself. So what I need would be either:
a) a conversion from a 2D coordinate in RGB to the corresponding (nearest)
2D coordinate in the depth image, then reading out the depth value at
that coordinate in the depth map, then generating an (x', y', z') in
some reference frame. This would need two different mapping functions.
b) a conversion from a 2D coordinate in RGB to a 3D point in space with
respect to the Kinect. This gives a single mapping function.

I think option b would be some compound version of a, but I don't know
how to do either of them, not knowing how to derive these mapping
functions.
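
To make option (a) a bit more concrete: once an RGB pixel has been matched to a depth pixel and the depth value has been read, the second mapping function is just pinhole back-projection with the depth camera intrinsics. This is only a sketch, assuming hypothetical intrinsics fx_d, fy_d, cx_d, cy_d from calibration and a depth value already converted to metres; the part I am still missing is the first function, RGB (x, y) to depth (x, y):

#include <Eigen/Dense>

// Sketch: back-project a depth pixel (xd, yd) with depth z (in metres) to a 3D
// point in the depth camera frame, using assumed depth intrinsics from calibration.
Eigen::Vector3d depthPixelTo3D(int xd, int yd, double z,
                               double fx_d, double fy_d, double cx_d, double cy_d)
{
    // Standard pinhole model: undo the perspective projection at the measured depth.
    const double X = (xd - cx_d) * z / fx_d;
    const double Y = (yd - cy_d) * z / fy_d;
    return Eigen::Vector3d(X, Y, z);
}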

Cheers,
Xilconic

Bilal

Jun 28, 2011, 3:38:59 AM
to OpenKinect
Hi Xilconic,
Look at http://www.cs.washington.edu/homes/xren/publication/3d-mapping-iser-10-final.pdf
and I hope it can help.

Best regards,
---
Bilal

Paweł Królikowski

Jun 28, 2011, 3:57:11 AM
to openk...@googlegroups.com
And if it doesn't help, maybe http://nicolas.burrus.name/index.php/Research/KinectCalibration will. 

xilconic

Jun 28, 2011, 2:13:33 PM
to OpenKinect
I've already looked at Nicolas Burrus' work, and it is like I said in
my opening post: it's a Depth -> RGB mapping, and not the other way
around. Can I just use some kind of inverse of the "Depth -> RGB"
mapping to get an "RGB -> Depth" mapping, or doesn't it work that
way? My linear algebra is nowhere near what it used to be, to be honest.

@Bilal: My project is actually inspired by the guys at Intel Research
Seattle. They don't say anything about any kind of RGB coordinate to
depth 3D coordinate mapping, only about a mapping application using a
Kinect-style camera. They do use image features, just like I
intend to do as I try to replicate their work with some changes
(a different image feature extractor, for example). But it might be worth
sending them an email about how they generate a pose from the RGB
image features.

Cheers,
xilconic

Paweł Królikowski

Jun 28, 2011, 4:12:33 PM
to openk...@googlegroups.com
Well, I needed to know the depth of a point at a specific RGB coordinate too, so I created a simple workaround like this:

#include <cstdlib>  // for abs()

// Map an RGB x-coordinate onto the depth image: undo the ~46 px shift,
// then rescale for the narrower RGB field of view (values found by hand for 640x480).
int intoDepthX(int x) {
    return (double)abs(x - 46) / 586 * 640;
}

// Same idea for the y-coordinate, with a ~37 px shift.
int intoDepthY(int y) {
    return (double)abs(y - 37) / 436 * 480;
}

Obviously it's only for 640x480, and I derived it manually: I recorded a few sample movies with a box at a specific distance and then tried to display a point at the same position in both the RGB and depth images.
On my Kinect it works to within a 1-2 pixel difference, which was good enough for me :] It might not work near the edges, but again, I didn't need that.

The problem is that the RGB image is both shifted and has a smaller field of view than the depth image, so I had to shift it back and rescale.
Comments are welcome.

Ian Medeiros

Jun 29, 2011, 8:54:58 AM
to openk...@googlegroups.com
After calibration, doesn't the [i][j] index of the depth matrix correspond to the [i][j] index in the RGB matrix?

Paweł Królikowski

Jun 29, 2011, 9:44:37 AM
to openk...@googlegroups.com
In my code? Nah, I got 2 structures, one for rgb and one for depth.

rgb[i][j] corresponds to depth[intoDepthX(i)][intoDepthY(j)]

xilconic

Jun 29, 2011, 1:21:55 PM
to OpenKinect
@Ian Medeiros: Calibration only yields you some parameters (using the
Kinect calibration from Nicolas Burrus; mine were done with version
0.4 using a checkerboard pattern), which are:
- Camera intrinsics of the RGB camera. You can use these to correct for
distortion.
- Camera intrinsics of the depth camera. Same story for its usage.
- The stereo camera translation and rotation matrix, which tells you the 6
DOF pose of one camera w.r.t. the other camera.

So what calibration doesn't do is make (x_rgb, y_rgb) equal to
(x_depth, y_depth). If that were the case, my life would be easy.

@Pawel: If I understand correctly, what you have done is just
'overlap' RGB with depth, with some resizing, correct? That might
also be accurate enough. At least it will be more accurate than doing
(x_depth, y_depth) = (x_rgb, y_rgb). I'll keep it in mind, but I do
prefer a more accurate mapping if that is possible, because a more
accurate mapping will probably yield more inliers to be used in the
rigid pose estimation process I use.
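
To sketch how I imagine "inverting" the published Depth -> RGB mapping with exactly these three calibration results: back-project every depth pixel to 3D with the depth intrinsics, transform it with the stereo R and T, project it into the RGB image with the RGB intrinsics, and remember for each RGB pixel the nearest 3D point that lands on it. Only a sketch with placeholder parameter names, not my actual calibration values or working code:

#include <Eigen/Dense>
#include <cmath>
#include <limits>
#include <vector>

// Sketch: build an RGB -> 3D lookup by forward-projecting every depth pixel.
// All intrinsics and the stereo transform (R, T) are placeholders for whatever
// calibration produces; depth[y][x] is assumed to already be in metres.
struct Lookup {
    std::vector<Eigen::Vector3d> point;  // 3D point per RGB pixel (RGB camera frame)
    std::vector<bool> valid;             // whether any depth pixel mapped there
};

Lookup buildRgbToDepthLookup(const std::vector<std::vector<double>>& depth,
                             double fx_d, double fy_d, double cx_d, double cy_d,
                             double fx_rgb, double fy_rgb, double cx_rgb, double cy_rgb,
                             const Eigen::Matrix3d& R, const Eigen::Vector3d& T,
                             int widthRgb, int heightRgb)
{
    Lookup lut;
    lut.point.assign(widthRgb * heightRgb, Eigen::Vector3d::Zero());
    lut.valid.assign(widthRgb * heightRgb, false);
    std::vector<double> bestZ(widthRgb * heightRgb, std::numeric_limits<double>::max());

    for (int yd = 0; yd < (int)depth.size(); ++yd) {
        for (int xd = 0; xd < (int)depth[yd].size(); ++xd) {
            const double z = depth[yd][xd];
            if (z <= 0.0) continue;  // no depth measurement here

            // Back-project the depth pixel to 3D in the depth camera frame.
            Eigen::Vector3d pDepth((xd - cx_d) * z / fx_d, (yd - cy_d) * z / fy_d, z);

            // Transform into the RGB camera frame and project with the RGB intrinsics.
            Eigen::Vector3d pRgb = R * pDepth + T;
            if (pRgb.z() <= 0.0) continue;
            const int xr = (int)std::lround(pRgb.x() * fx_rgb / pRgb.z() + cx_rgb);
            const int yr = (int)std::lround(pRgb.y() * fy_rgb / pRgb.z() + cy_rgb);
            if (xr < 0 || xr >= widthRgb || yr < 0 || yr >= heightRgb) continue;

            // Keep the closest point when several depth pixels land on the same RGB pixel.
            const int idx = yr * widthRgb + xr;
            if (pRgb.z() < bestZ[idx]) {
                bestZ[idx] = pRgb.z();
                lut.point[idx] = pRgb;
                lut.valid[idx] = true;
            }
        }
    }
    return lut;
}

Holes would remain wherever no depth pixel projects onto an RGB pixel, so a lookup for a tracked feature would probably have to search a small neighbourhood around its (x_rgb, y_rgb).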

Cheers,
Xilconic

Paweł Królikowski

Jun 30, 2011, 8:40:39 AM
to openk...@googlegroups.com
Exactly, just shifting and resizing to get the same field of view.
I needed it for a distance of about 1 m, and for that it was good enough. Especially with my sucky RGB tracking: the noise was too big to track anything reliably, so a 1-2 pixel "loss" from the rgb->depth mapping was not a big deal.
