Bundle Adjustment of Aruco Markers


Ce

Oct 15, 2022, 5:34:35 AM
to BoofCV
Dear BoofCV Community,

I like BoofCV very much. However, I have now run into a challenge that I could not solve using the code samples.

It is about determining the positions of ArUco markers. These are known from OpenCV (https://docs.opencv.org/4.x/d5/dae/tutorial_aruco_detection.html). I would like to stick or magnetically attach these markers to objects. With a camera, or maybe later an app, it should then be possible to capture images from different perspectives.

[Attached images: aruco1.jpeg, aruco2.jpeg, aruco3.jpeg]
The program should then generate a point cloud from the ArUco markers. Each marker consists of four corner points, so ideally the result is a point cloud of those corner points in which each point carries the ID of its marker.

The program should:
1. import the images,
2. detect the marker corners in each image (e.g. with OpenCV), returning for each image a list of ArUco markers (with their IDs) and their corners with the respective corner index, and
3. calculate the 3D coordinates of the corners from this list by bundle block adjustment.
For another application I have simulated how the situation could look later. The colored points are the marker corners; the white ones are surface points, which play no role here.

[Attached image: aruco marker 3D.jpeg]

I would be very happy to find a solution that creates a 3D point cloud from my collected 2D image points of the markers.

Optionally, I could also move away from ArUco markers if BoofCV offers other easy-to-detect coded markers.

I appreciate any tips and hints.
If my question is unclear, I would be happy to add more information. Have a nice day. 
Best regards
Ce


Peter A

Oct 16, 2022, 6:53:48 PM
to boo...@googlegroups.com
You want to know how to detect ArUco markers in BoofCV, right? In BoofCV, ArUco and other similar markers are called Hamming fiducials. That's because you match a decoded pattern to the dictionary by minimizing the Hamming distance.

Check out the example "ExampleFiducialHamming" for how to detect the markers. Make sure you select the correct dictionary.
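
Roughly, the detection part looks like this (an untested sketch based on that example; the dictionary choice and image path are assumptions you need to adjust to your setup):

    ConfigHammingMarker configMarker = ConfigHammingMarker.loadDictionary(HammingDictionary.ARUCO_ORIGINAL);
    FiducialDetector<GrayF32> detector = FactoryFiducial.squareHamming(configMarker, null, GrayF32.class);

    GrayF32 image = UtilImageIO.loadImage("markers.jpg", GrayF32.class); // hypothetical path
    detector.detect(image);

    for (int i = 0; i < detector.totalFound(); i++) {
        long id = detector.getId(i);                         // marker ID from the dictionary
        Polygon2D_F64 corners = detector.getBounds(i, null); // the four corners in pixels
        System.out.println("marker " + id + " corners " + corners);
    }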

I suspect you will have more questions relating to how you would feed this into bundle adjustment and do a reconstruction.

Cheers,
- Peter



--
"Now, now my good man, this is no time for making enemies."    — Voltaire (1694-1778), on his deathbed in response to a priest asking that he renounce Satan.

Ce

Oct 18, 2022, 9:14:58 AM
to BoofCV
Thank you Peter for your answer!

Yes, that's exactly what I was looking for. Now I can switch from OpenCV to BoofCV for marker recognition. I will implement it in my program in the next few days. Similar to OpenCV, this way I get the markers detected in each image and their 2D pixel coordinates.

As you mentioned, I am now facing the challenge of incorporating this into the bundle block adjustment. In my case I don't need feature detection or other surface points; I just use the corner points of the ArUco markers.

As input, with OpenCV or BoofCV, I thus have something like:

ArrayList<ArrayList<ArucoMarker>> foundMarkers = new ArrayList<>();
-> one list per captured image, each containing the markers found in that image


The class ArucoMarker:

public class ArucoMarker {

    // marker ID from the dictionary
    int id;

    // corner points in 2D pixel coordinates
    public ArrayList<Point2D_F64> cornersPoints = new ArrayList<Point2D_F64>();

}


Now I would like to convert this list of 2D information into a list of 3D ArucoMarker3D objects using a bundle block adjustment. I would be very glad about a solution.
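
The target structure I imagine would be something like this (a hypothetical class just for illustration, not an existing BoofCV type):

    public class ArucoMarker3D {

        //ID
        int id;

        //Corner Points in 3D, same order as the 2D corners
        public ArrayList<Point3D_F64> cornerPoints = new ArrayList<Point3D_F64>();

    }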

Best regards and again thank you
Ce

Peter A

Nov 2, 2022, 7:20:22 PM
to boo...@googlegroups.com
FYI I'm going to be a bit erratic at replying for the next month or so. Feel free to nudge me if you think it's a simple question.

What you want here is PnP (Perspective-n-Point). That will take the observations and convert them into a camera pose. There is a PnP example on the website. However, in this case you want to use FactoryMultiView.pnp_1(EnumPNP.EPNP, -1, 1). This will return an Estimate1ofPnP, which takes in normalized image coordinates of the corners and returns the 3D pose. It handles the multiple solutions for you by applying a positive depth constraint. Let me know if this helps.
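
Something along these lines (an untested sketch; the marker size, the detected pixel corners, and pixelToNorm — the pixel-to-normalized transform from your calibration — are assumptions):

    double s = 0.064; // marker edge length in meters (assumed)
    // marker corners in the marker's own frame, z = 0, same order as the detector reports them
    Point3D_F64[] model = { new Point3D_F64(0, 0, 0), new Point3D_F64(s, 0, 0),
                            new Point3D_F64(s, s, 0), new Point3D_F64(0, s, 0) };

    List<Point2D3D> points = new ArrayList<>();
    for (int i = 0; i < 4; i++) {
        Point2D3D p = new Point2D3D();
        p.location.setTo(model[i].x, model[i].y, model[i].z);
        // pixelCorners[i] = the marker's i-th detected corner in pixel coordinates (assumed)
        pixelToNorm.compute(pixelCorners[i].x, pixelCorners[i].y, p.observation);
        points.add(p);
    }

    Estimate1ofPnP pnp = FactoryMultiView.pnp_1(EnumPNP.EPNP, -1, 1);
    Se3_F64 markerToCamera = new Se3_F64();
    if (!pnp.process(points, markerToCamera))
        System.err.println("PnP failed for this marker");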

Ce

Nov 6, 2022, 1:53:00 PM
to BoofCV
Hello Peter,

thank you very much for your message. However, it raises several questions for me:

Question 1: If I understood correctly, I need the locations of the markers in 3D space to calculate the camera position for each image; then the following way would be correct.

        // 2D observations + the required 3D locations
        List<Point2D3D> imagePoints = new ArrayList<Point2D3D>();

        Estimate1ofPnP ePNP = FactoryMultiView.pnp_1(EnumPNP.EPNP, -1, 1);

        // final camera pose
        Se3_F64 cameraPosition = new Se3_F64();
        ePNP.process(imagePoints, cameraPosition);

Unfortunately, I do not know the 3D positions of the markers. Would it still be possible to calculate a kind of 3D point cloud of the markers in an unscaled 3D space? I could then multiply this point cloud by a scale factor, for example from a known distance between two markers that I stick on a ruler and place next to the measured object.
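
What I have in mind for the scaling step is something like this (a sketch; the reference points and distance are placeholders):

    // rescale the up-to-scale point cloud using one known distance between two
    // reconstructed reference corners, e.g. on the ruler
    static void rescale( List<Point3D_F64> cloud, Point3D_F64 refA, Point3D_F64 refB,
                         double knownDistance ) {
        double scale = knownDistance / refA.distance(refB);
        for (Point3D_F64 p : cloud) {
            p.x *= scale;
            p.y *= scale;
            p.z *= scale;
        }
    }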

Question 2: If this is to be done with an uncalibrated camera, how do I adjust the normalization? So far I have only worked with calibrated cameras, using the following code, which I copied from an example:
 
        // calibrated intrinsics (loaded from my camera calibration; shown empty here)
        CameraPinholeBrown calibCam = new CameraPinholeBrown();

        // normalized observation
        Point2D_F64 cornerNorm = new Point2D_F64();

        // transform from pixel coordinates to normalized image coordinates,
        // which also removes lens distortion
        Point2Transform2_F64 pixelToNorm = LensDistortionFactory.narrow(calibCam).normalized_F64();

        // convert to normalized image coordinates because that's what PnP needs;
        // it can't process pixel coordinates
        pixelToNorm.compute(imagePoint.x, imagePoint.y, cornerNorm);


Question 3: Once I have figured out the camera poses, how do I correctly triangulate the 3D positions of the marker points? One approach, which I could not test yet but pieced together from the documentation, would be the following code. Would that be correct? And how could I work around the problem that a marker point is not visible in every camera image?

         ConfigTriangulation ct = new ConfigTriangulation();
         ct.type = ConfigTriangulation.Type.DLT;
         TriangulateNViewsMetric tm = FactoryMultiView.triangulateNViewMetric(ct);
         tm.triangulate(observations, cameraPos, final3DPoint);
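
Spelled out with the inputs, what I would try looks like this (still untested; for the visibility problem I would simply add only the views in which the corner was actually detected):

         // only the views that actually saw this corner, in matching order
         List<Point2D_F64> observations = new ArrayList<>(); // normalized image coordinates
         List<Se3_F64> cameraPos = new ArrayList<>();        // world-to-view poses from PnP
         // ... fill both lists, skipping images where the corner was not detected ...

         TriangulateNViewsMetric tm = FactoryMultiView.triangulateNViewMetric(new ConfigTriangulation());
         Point3D_F64 final3DPoint = new Point3D_F64();
         if (!tm.triangulate(observations, cameraPos, final3DPoint))
             System.err.println("triangulation failed for this corner");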

I would be happy about an answer to any of these questions. Thanks again for the support so far.

Best regards
Ce

Peter A

Nov 7, 2022, 11:16:53 AM
to boo...@googlegroups.com
See reply below
 
Question 1: If I understood correctly, I need the locations of the markers in 3D space to calculate the camera position for each image; then the following way would be correct.

Yes, and you already know it! Define the coordinate system to be the marker itself. For example, one corner can be at (0,0,0) and another at (0,1,0). Just make sure z=0; at least one algorithm requires that.
 
Question 2: If this is to be done with an uncalibrated camera, how do I adjust the normalization? So far I have only worked with calibrated cameras, using the following code, which I copied from an example:

Get this to work first with a calibrated camera. The uncalibrated case is MUCH harder. The most common way to handle it is to guess the intrinsics, use liberal tolerances, and hope bundle adjustment can sort it out. You can also experiment with the self-calibration included with BoofCV, then use that calibration. Again, this will require some work.
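
For the guess, something like this is common (a sketch; the image size and the "focal length ≈ image width" rule of thumb are assumptions you will have to tune):

    int width = 1920, height = 1080; // your image size
    CameraPinholeBrown guess = new CameraPinholeBrown();
    // fx, fy, skew, cx, cy, width, height: focal length guessed as the image width,
    // principal point at the image center
    guess.fsetK(width, width, 0, width/2.0, height/2.0, width, height);
    guess.fsetRadial(0, 0); // zero distortion as a starting point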
 
Question 3: Once I have figured out the camera poses, how do I correctly triangulate the 3D positions of the marker points? One approach, which I could not test yet but pieced together from the documentation, would be the following code. Would that be correct? And how could I work around the problem that a marker point is not visible in every camera image?

This is really a problem of applying a ton of coordinate transforms. Let's assume you have a single image of multiple markers. You will have the transform from marker[i] to camera (or maybe the inverse?). You then need to invert these transforms and treat the camera as a common reference frame. Then when you get a new image, you assume the markers are static and find the new camera location in that frame. Then it's possible to find the relationship between the two frames. 
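
In code the chaining is just Se3_F64 operations, roughly like this (a sketch; the poses are placeholders for PnP results):

    // pose of the same marker estimated by PnP in two different images
    Se3_F64 markerToCameraA = /* PnP result from image A */ new Se3_F64();
    Se3_F64 markerToCameraB = /* PnP result from image B */ new Se3_F64();

    // invert one pose so the marker becomes the common reference frame
    Se3_F64 cameraAToMarker = markerToCameraA.invert(null);

    // camera A -> marker -> camera B yields the relative pose between the two views
    Se3_F64 cameraAToCameraB = cameraAToMarker.concat(markerToCameraB, new Se3_F64());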

Data will be noisy, so don't forget to run bundle adjustment to clean things up. To create a stable system you must also use some approach to reject bad marker pose estimates.
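
For the bundle adjustment step the setup looks roughly like this (a heavily abbreviated, untested sketch; see ExampleBundleAdjustment on the website for the real thing — numViews, numPoints, intrinsic, worldToView, and corners3D are placeholders from your earlier steps, and method signatures may differ slightly between versions):

    SceneStructureMetric structure = new SceneStructureMetric(false);
    structure.initialize(1, numViews, numPoints); // one camera model, N views, M corner points
    structure.setCamera(0, true, intrinsic);      // fixed, calibrated intrinsics (CameraPinhole)
    for (int v = 0; v < numViews; v++)
        structure.setView(v, 0, false, worldToView[v]); // initial poses from PnP
    for (int p = 0; p < numPoints; p++)
        structure.setPoint(p, corners3D[p].x, corners3D[p].y, corners3D[p].z);

    SceneObservations observations = new SceneObservations();
    observations.initialize(numViews);
    // for each view v and each corner p visible in it:
    //     observations.getView(v).add(p, pixelX, pixelY);

    BundleAdjustment<SceneStructureMetric> sba = FactoryMultiView.bundleSparseMetric(null);
    sba.setParameters(structure, observations);
    sba.optimize(structure); // refined poses and points are written back into structure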

Ce

Nov 10, 2022, 9:06:26 PM
to BoofCV
Hello Peter,

thank you for your message. I have now introduced a camera calibration, so I can now work with normalized pixel coordinates.

If I had a single ArUco marker visible in all images, I could put a local system in the marker with (0,0,0), (0,1,0), (1,0,0), and (1,1,0) as corner positions in my local system, as you already described.

Unfortunately, there is not always one marker that is present in all images, and I do not know the positions of the other markers in my local coordinate system. Additionally, in contrast to the corner points of a single marker, I cannot assume that all markers lie on one plane (z=0) relative to each other.

Nevertheless, I tried to define a local coordinate system with arbitrary x and y coordinates and to calculate the orientation. Unfortunately, my final point cloud was not the result I was looking for...
[Attached image: badpointcloud.png]

I would be very happy if you could go into more detail on defining a local coordinate system with several markers distributed in the room. That would help me a lot, thank you!
Ce

Ce

Dec 10, 2022, 8:58:23 PM
to BoofCV
I can now present a small update.

I was able to perform a bundle block adjustment and I am getting values out.

The following pictures show six markers and the solution of the sparse bundle block adjustment.
However, you can see that the points do not lie perfectly on a straight line but partially drift.
[Attached images: OnTable.JPG, BundleAdjustment.png]

According to the adjustment, a good result was generated, as the console output shows:

Steps  fx          change      |step|     f-test     g-test     tr-ratio  lambda
0      3.844E+08    0.000E+00  0.000E+00  0.000E+00  0.000E+00   0.00     1.00E-03
1      4.031E+07   -3.441E+08  1.384E+04  8.951E-01  4.442E+03   1.236    3.33E-04
...
34     3.631E-02   -3.067E-08  1.782E+05  8.448E-07  1.324E-03  -0.000    3.43E+05
Converged f-test
Error reduced by 1058759787023.6%



I thus suspect two sources of error:

Error source A:
Incorrect initial values for the camera parameters. Here I am not sure how the focal length and the radial distortion parameters are handled. This is how I export my approximate camera values for the import file:
[Attached image: Text.png]

Error source B:
Bad approximate values imported into the adjustment. Otherwise, the adjustment would not improve the parameters by such an immense percentage. I generated the approximate data as you described:
1. I pick one ArUco marker and collect all images in which it is visible.
2. I create a small local system with the corners of that marker as coordinates, as you described.
3. With FactoryMultiView.pnp_1(EnumPNP.EPNP, -1, 1) I calculate the poses of the images in which the picked marker can be seen.
4. With these localized images, I search for more markers, whose positions I then obtain by triangulation.
5. The newly obtained marker positions allow me to orient more camera images.
Through several loop passes, I thus obtain a network of 3D markers and as many oriented images as possible. This may not be the nicest or most stable solution, but after a lot of work it gives me approximate values that I can use.

[Attached image: Appro.png]
In this view you can see that the approximate marker positions look quite good. However, all camera positions sit quite centrally above the markers, although (as seen above) I also captured very flat images from the side at a small vertical angle. Are the camera positions correct? If not, why do I get such good marker coordinates after triangulation? Here the result and the visualization contradict each other.

[Attached image: CameraToAruco.png]
I picked out the image from above again and did a single test with this shot. You can clearly see that even though the capture angle is very flat to the markers, the camera is still displayed almost centrally above the marker in the 3D visualization. I use a calibrated SLR and normalized coordinates from it. As coordinates of the marker I used its real edge length (64 mm). Unfortunately, I can't think of any other source of error for this effect.

As you already wrote, this has required a lot of time and work, so I would be very happy to solve these problems. I am grateful for every little tip! I have tried to visualize the problems as clearly as possible, but if something is still unclear, I will gladly provide more information.

I would like to thank you again for the support, and greetings to the entire community!

Best regards
Ce

Priscilla Komlofske

Dec 17, 2022, 3:36:38 AM
to BoofCV
👍

Ce

Jan 9, 2023, 11:52:27 PM
to BoofCV
Thanks, Komlofske. But unfortunately it does not work properly yet... :(