Add Custom Track/Keypoint?

charl...@gmail.com

Sep 22, 2018, 10:45:18 AM
to Theia Vision Library
I've been spending a lot more time with the theia::Reconstruction class. I am curious about the AddTrack function.

For example, after using SIFT/AKAZE to generate the initial keypoints and set of matches, I perform an incremental reconstruction to yield a sparse point cloud from the detected keypoints.

If I wanted to then manually add a keypoint to the reconstruction, would I simply use the https://github.com/sweeneychris/TheiaSfM/blob/master/src/theia/sfm/reconstruction.cc#L239 method with a Track object?

I would like to be able to create keypoints manually and re-run the reconstruction with an emphasis (heavier weighting?) on the manually injected keypoints. I understand this could be unstable if the manual keypoints disagree with the overall scene; however, I am trying to understand how control points can be used to increase the accuracy of the sparse reconstruction inside Theia.
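
To make the question concrete, this is roughly what I am picturing. It is only a rough sketch, assuming the AddTrack overload that takes per-view features behaves the way I think it does, and that theia::Feature is just an (x, y) pixel location; the view ids and coordinates are placeholders:

#include <theia/theia.h>

#include <utility>
#include <vector>

// Sketch: inject one manually measured keypoint as a track. The three view
// ids are views already added to the reconstruction, and the pixel
// coordinates stand in for points I would click by hand.
theia::TrackId AddManualTrack(theia::Reconstruction* reconstruction,
                              theia::ViewId view_a,
                              theia::ViewId view_b,
                              theia::ViewId view_c) {
  std::vector<std::pair<theia::ViewId, theia::Feature>> observations = {
      {view_a, theia::Feature(812.0, 344.0)},   // same physical point in image A
      {view_b, theia::Feature(1021.5, 377.0)},  // ... in image B
      {view_c, theia::Feature(640.0, 512.0)}};  // ... in image C
  // AddTrack creates the track and registers one observation per view; the
  // 3D point itself would still need to be triangulated / bundle adjusted.
  return reconstruction->AddTrack(observations);
}

Is that roughly the intended use, or is AddTrack meant only for internal bookkeeping by the reconstruction builder?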

Thanks,
Charles

Aaron K

Sep 22, 2018, 12:06:04 PM
to charl...@gmail.com, Theia Vision Library
Accuracy of what? I could be wrong, but in general, wouldn't you want to use control points during the final estimation of the rigid transform (e.g. Umeyama) to your world coordinate system?

Aaron

Charles O

Sep 22, 2018, 1:57:31 PM
to aa...@limitedslip.net, Theia Vision Library
Hi Aaron, thank you for responding so quickly.

This is likely an oversimplification of the overall SfM process, so I apologize up front, but I will try to ask as best I can.

I have an implementation that uses the https://en.wikipedia.org/wiki/Kabsch_algorithm to obtain a rotation matrix, using all of the view GPS priors as the input. I then convert the GPS locations to UTM and use the UTM coordinate system to determine a translation and scale to perform the "rigid transform" (first time hearing that phrase), so I know 'where' the model is in my 'world coordinate system', UTM Zone 17. After reading a bit on Umeyama, I think this could be a better approach, so thank you for that recommendation. I am still working through both understanding the process of converting local coordinates to world coordinates in 3D space and implementing its solution.
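
For my own notes, Eigen's built-in umeyama() looks like the most direct way to get that similarity transform (rotation, translation and scale) from the reconstructed camera centers to the UTM positions. This is only a sketch of how I imagine wiring it up, assuming the camera centers and GPS priors are in one-to-one corresponding order:

#include <Eigen/Core>
#include <Eigen/Geometry>  // Eigen::umeyama
#include <vector>

// Sketch: estimate the similarity transform (scale * R, t) that maps the
// reconstructed camera centers into UTM coordinates.
Eigen::Matrix4d EstimateSimilarityToUtm(
    const std::vector<Eigen::Vector3d>& camera_centers,   // from the reconstruction
    const std::vector<Eigen::Vector3d>& utm_positions) {  // GPS priors converted to UTM
  Eigen::Matrix3Xd src(3, camera_centers.size());
  Eigen::Matrix3Xd dst(3, utm_positions.size());
  for (int i = 0; i < static_cast<int>(camera_centers.size()); ++i) {
    src.col(i) = camera_centers[i];
    dst.col(i) = utm_positions[i];
  }
  // Closed-form Umeyama solution; the returned 4x4 transform T satisfies
  // dst ~ T * src when scaling is enabled.
  return Eigen::umeyama(src, dst, /*with_scaling=*/true);
}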
I am currently generating the rotation, translation and scale after the reconstruction has been created, and plan to see if I can use them to align/fuse multiple reconstructions from the same area, processed in smaller sets. For example, some areas could be reconstructed with sparse features, like a football field, while others I would like to do with dense features, like complex man-made buildings and equipment around the football field. At the end, I hope to be able to merge everything together as a single scene. I believe I'll have to learn the ICP (iterative closest point) algorithm if I ever get there.

Regarding manual key points and accuracy, I may be using the wrong terminology.

My understanding is that the reconstruction builder uses the matches, views and intrinsic priors as input and then provides the sparse point cloud data as an output. My thoughts are about the 'agreement' across all the tracks, and that tracks may be discarded if they are determined to be 'outliers'. It made me wonder if I could either speed up or assist the algorithm by defining an initial set of tracks manually.

As a human I am able to see a unique feature (for example, the corner of a goal line on the football field) in the original imagery and define its x, y position (with, say, 3x3 pixel accuracy). If I were to repeat this for n images that cover the same keypoint from different perspectives, I was hoping I could provide a set of initial 'good' tracks (albeit fewer than SIFT/AKAZE can produce) to reduce the complexity of solving the initial camera positions. Perhaps this is not worth pursuing, as the hundreds of keypoints matched by the cascade hashing/brute force technique may far outweigh the few dozen manual keypoints I could make as a human. Maybe I can instead use this concept to re-run a denser feature extraction and matching strategy on the chosen images in the selected areas.
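
As a concrete version of that idea, I was imagining wrapping a handful of hand-clicked correspondences as an extra two-view match before building. This is only a sketch; I am assuming ReconstructionBuilder::AddTwoViewMatch and the ImagePairMatch / FeatureCorrespondence types work roughly the way I remember, and the image names and pixel coordinates are placeholders:

#include <theia/theia.h>

#include <string>

// Sketch: add a few manual correspondences between a nadir and an oblique
// image so they are considered alongside the SIFT/AKAZE matches.
void AddManualMatches(theia::ReconstructionBuilder* builder) {
  theia::ImagePairMatch manual_match;
  manual_match.image1 = "nadir_0012.jpg";    // placeholder image names
  manual_match.image2 = "oblique_0044.jpg";

  theia::FeatureCorrespondence corner_of_goal_line;
  corner_of_goal_line.feature1 = theia::Feature(812.0, 344.0);   // clicked in image1
  corner_of_goal_line.feature2 = theia::Feature(1533.0, 902.0);  // clicked in image2
  manual_match.correspondences.push_back(corner_of_goal_line);

  // The relative pose (manual_match.twoview_info) would presumably still need
  // to be estimated from these correspondences rather than left default.
  builder->AddTwoViewMatch(manual_match.image1, manual_match.image2, manual_match);
}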

I'm just thinking through some ideas for how to leverage the existing Theia framework with customized user input to enhance either the speed or the quality of the output; I apologize that I cannot be more specific.

Thanks again,

Charles


Aaron K

Sep 25, 2018, 12:28:10 AM
to Charles O, Aaron Klingaman, Theia Vision Library
To me, it feels unlikely that manually added tracks would help much. A well-solved and calibrated set of images should have a reprojection error of well under 1 pixel, unless you have a bunch of images from different cameras.

Aaron

charl...@gmail.com

Sep 25, 2018, 11:39:22 AM
to Theia Vision Library
Having different cameras would be something interesting to consider. I find that when perspectives from the same camera are quite different (oblique vs. nadir), obtaining convergence is still sometimes quite difficult with only one pass of initial matching.

For example, if I 'orbit' a point of interest in a circle at an oblique angle, and then perform an aerial grid collection at nadir, many matches are found within the orbit OR within the aerial grid, while there are fewer two-view matches between the nadir and oblique perspectives. The result is sometimes two disconnected models, even though the same camera is used.

I am currently attempting to create 'buckets' of matches using the GPS location for the nadir sets, and then connecting them with a second round of matching to tie them together. The oblique orbit is trickier because the lat/lon of the sensor does not correlate with any kind of heading or camera angle. I was hoping a manual keypoint process could be used to find additional matches between the nadir and oblique images. Perhaps I can simply use theia::FeatureMatcher's SetImagePairsToMatch function and generate a set that strictly compares each image in the smaller set (oblique) to its neighbors (via GPS) in the larger nadir set, as in the sketch below.
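
Roughly, I was picturing something like the following for the pair selection. It is only a sketch, assuming FeatureMatcher::SetImagePairsToMatch takes image-name pairs; the UTM lookup table is something I would build separately from the GPS priors:

#include <theia/theia.h>
#include <Eigen/Core>

#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Sketch: only match each oblique image against its closest nadir images,
// where "closest" is measured from the GPS priors converted to UTM.
void MatchObliqueToNadir(theia::FeatureMatcher* matcher,
                         const std::vector<std::string>& oblique_names,
                         const std::vector<std::string>& nadir_names,
                         const std::map<std::string, Eigen::Vector2d>& utm_position,
                         int neighbors_per_oblique) {
  std::vector<std::pair<std::string, std::string>> pairs_to_match;
  for (const std::string& oblique : oblique_names) {
    // Sort the nadir images by distance to this oblique image's GPS prior.
    std::vector<std::string> sorted_nadir = nadir_names;
    const Eigen::Vector2d oblique_pos = utm_position.at(oblique);
    std::sort(sorted_nadir.begin(), sorted_nadir.end(),
              [&](const std::string& a, const std::string& b) {
                return (utm_position.at(a) - oblique_pos).squaredNorm() <
                       (utm_position.at(b) - oblique_pos).squaredNorm();
              });
    // Keep only the closest few as candidate pairs for the second match pass.
    const int k = std::min(neighbors_per_oblique,
                           static_cast<int>(sorted_nadir.size()));
    for (int i = 0; i < k; ++i) {
      pairs_to_match.emplace_back(oblique, sorted_nadir[i]);
    }
  }
  matcher->SetImagePairsToMatch(pairs_to_match);
}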

Aaron K

Sep 25, 2018, 9:08:36 PM
to Charles O, Theia Vision Library
Can you share an oblique and nadir image pair that don't match well?

What descriptors are you using?

Aaron
