It seems to me like all this RGB<->Z calibration stuff is overly
complicated.
We should be able to come up with decent constants for FOV and Z depth
units (this may be adequate
http://www.ros.org/wiki/kinect_node?action=AttachFile&do=get&target=kinect_calibration.png
). After that, we should be able to come up with a lens distortion
model that is uniform across all Kinects. The only variable left then
is the cross-calibration between the RGB and Z sensors, which
apparently varies from one Kinect to another.
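
To make the "constants for FOV" part concrete, here is a rough sketch of
how a fixed FOV turns a depth pixel into a 3D point with a plain pinhole
model. The FOV numbers and resolution below are my assumptions (roughly
the commonly quoted figures, not measured values), the function names are
made up for illustration, and lens distortion and the raw-unit-to-metres
conversion are left out entirely.

```python
import math

# Assumed constants, not measured: approximate, commonly quoted values.
H_FOV_DEG = 57.0        # horizontal field of view (assumption)
V_FOV_DEG = 43.0        # vertical field of view (assumption)
WIDTH, HEIGHT = 640, 480


def focal_lengths(width, height, h_fov_deg, v_fov_deg):
    """Pinhole focal lengths in pixels derived from the field of view."""
    fx = (width / 2) / math.tan(math.radians(h_fov_deg) / 2)
    fy = (height / 2) / math.tan(math.radians(v_fov_deg) / 2)
    return fx, fy


def back_project(u, v, depth_m, width=WIDTH, height=HEIGHT,
                 h_fov_deg=H_FOV_DEG, v_fov_deg=V_FOV_DEG):
    """Map pixel (u, v) with depth in metres to camera-space (x, y, z).

    Assumes the optical centre sits at the image centre and ignores
    lens distortion.
    """
    fx, fy = focal_lengths(width, height, h_fov_deg, v_fov_deg)
    cx, cy = (width - 1) / 2, (height - 1) / 2
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return x, y, depth_m
```

If the constants are decent, the same two functions should work for every
Kinect; only the RGB/Z cross-calibration would remain per device.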
If you want to get rid of that per-device step as well, it seems like
we should be able to auto-calibrate. We can identify the Z background
relatively easily,
and subtract it from every frame. We can do the same with the RGB
image. Then, even if you just use bounding boxes, you should be able
to determine a matching alignment between the foreground elements of
both RGB and Z, and calculate an offset specific to that part of the
image. Repeat as necessary by moving something around the scene, and
accumulating offsets. Presto, auto-calibrated offsets.
No need for targets, OpenCV, or anything -- just basic image
similarity testing.
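
Here is a minimal sketch of what I mean, in plain Python with frames as
lists of rows. Everything in it is an assumption for illustration: the
function names are made up, the RGB frame is taken to be already reduced
to one intensity value per pixel, the backgrounds are assumed to be
captured beforehand, and the thresholds would need tuning. It only
handles one moving object per frame.

```python
def foreground_mask(frame, background, threshold):
    """True where a pixel differs from the background by more than threshold."""
    return [[abs(p - b) > threshold for p, b in zip(row, bg_row)]
            for row, bg_row in zip(frame, background)]


def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the True pixels, or None."""
    xs = [x for row in mask for x, on in enumerate(row) if on]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


def box_centre(box):
    """Centre (cx, cy) of a bounding box."""
    return (box[0] + box[2]) / 2, (box[1] + box[3]) / 2


def collect_offsets(frame_pairs, rgb_bg, z_bg, rgb_thresh, z_thresh):
    """Accumulate one offset sample per (rgb_frame, z_frame) pair.

    Each sample is ((z_cx, z_cy), (dx, dy)): where in the Z image the
    foreground was seen, and how far its box centre must shift to land
    on the RGB foreground box centre. Pairs with no foreground in
    either image are skipped.
    """
    samples = []
    for rgb, z in frame_pairs:
        rgb_box = bounding_box(foreground_mask(rgb, rgb_bg, rgb_thresh))
        z_box = bounding_box(foreground_mask(z, z_bg, z_thresh))
        if rgb_box is None or z_box is None:
            continue
        (rx, ry), (zx, zy) = box_centre(rgb_box), box_centre(z_box)
        samples.append(((zx, zy), (rx - zx, ry - zy)))
    return samples
```

Move something around the scene, feed the frame pairs in, and the samples
give an offset for each part of the image the object visited. Comparing
box centres is the crudest possible similarity test; a proper image
similarity measure could replace it without changing the overall loop.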
Is anyone else pursuing this?