----------------------------------------
Ceres Solver Google Group
http://groups.google.com/group/ceres-solver?hl=en
What is most interesting to me is that the Essential Matrix (E), solved in isolation using only image pairs, can actually be used as a sort of final solution for the camera locations.
Consider an airborne trajectory that circles a target of interest with the camera staring towards the center. This might be typical for a 3D scene recreation of a building site. The images are taken in temporal order as a sequence, and the pairs will be processed that way. The result, using only pair-wise image processing, is a set of relative position vectors and relative attitudes (Direction Cosine Matrices, DCMs).
We could actually prepare the complete trajectory of the aircraft/camera using only this data -- up to an unknown initial condition.
If we assume the aircraft/camera starts at a known position and attitude, then each subsequent position along the circular trajectory comes from simply adding up the relative positions. Similarly, the attitude trajectory comes from multiplying together the DCMs.
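A minimal sketch of that chaining (assuming Eigen; the RelPose/AbsPose containers and the frame conventions are illustrative, not from the original post):

#include <vector>
#include <Eigen/Dense>

// Per-pair output of the pairwise E-matrix processing.
struct RelPose {
  Eigen::Vector3d p;  // relative position of frame i, expressed in frame i-1
  Eigen::Matrix3d R;  // DCM rotating frame i-1 coordinates into frame i
};

struct AbsPose {
  Eigen::Vector3d t;  // position in the world frame
  Eigen::Matrix3d C;  // DCM world -> body
};

// Integrate the relative measurements starting from a known initial pose:
// positions accumulate (after rotation into the world frame), DCMs multiply.
std::vector<AbsPose> DeadReckon(const AbsPose& start,
                                const std::vector<RelPose>& rel) {
  std::vector<AbsPose> traj{start};
  for (const RelPose& d : rel) {
    const AbsPose& prev = traj.back();
    AbsPose next;
    next.t = prev.t + prev.C.transpose() * d.p;  // rotate, then add
    next.C = d.R * prev.C;                       // chain the DCMs
    traj.push_back(next);
  }
  return traj;
}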
Moreover, consider the case where we have an onboard GPS and IMU.
All modern airborne imaging systems include this today -- and even handheld cameras will tomorrow. Using the IMU/GPS filter/smoother and the precise camera shutter timing, we can get the attitude and position (independent of the camera info) to 0.1 m and 0.1 deg, with reasonable covariance matrices.
The synergism of these independent datasets is explosive! And there are NO features in the problem other than those archived in the initial E(i,j) computations.
However, this may be more a Bayesian recursive estimation problem than an optimization problem. Does Ceres have any ability to deal with this type of mixed-sensor problem?
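For what it's worth, one way a mixed-sensor problem like this can be posed in Ceres is as a single nonlinear least-squares problem rather than a recursive filter: each frame's pose is a parameter block, and the GPS/IMU smoother output enters as a weighted prior residual alongside the image constraints. A sketch, with illustrative names (PosePrior, sqrt_info) and a small-angle treatment of the attitude part:

#include <ceres/ceres.h>
#include <Eigen/Dense>

// Weighted prior on a 6-DOF pose (3 angle-axis + 3 position), e.g. from
// the GPS/IMU filter/smoother. sqrt_info is the square root of the
// inverse covariance, so the residual is whitened. The plain subtraction
// on the attitude components is a small-angle approximation.
struct PosePrior {
  PosePrior(const Eigen::Matrix<double, 6, 1>& mean,
            const Eigen::Matrix<double, 6, 6>& sqrt_info)
      : mean_(mean), sqrt_info_(sqrt_info) {}

  template <typename T>
  bool operator()(const T* pose, T* residual) const {
    Eigen::Matrix<T, 6, 1> r;
    for (int i = 0; i < 6; ++i) r(i) = pose[i] - T(mean_(i));
    Eigen::Map<Eigen::Matrix<T, 6, 1>> res(residual);
    res = sqrt_info_.cast<T>() * r;
    return true;
  }

  const Eigen::Matrix<double, 6, 1> mean_;
  const Eigen::Matrix<double, 6, 6> sqrt_info_;
};

// Usage: one residual block per frame, alongside the image constraints.
// problem.AddResidualBlock(
//     new ceres::AutoDiffCostFunction<PosePrior, 6, 6>(
//         new PosePrior(nav_pose[i], sqrt_info[i])),
//     nullptr, pose[i]);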
On Monday, December 10, 2012 12:51:25 PM UTC-6, james kain wrote:
I am looking at use of the Essential Matrix as a way to reduce the order of the BA problem for aerial imagery. The Essential Matrix (E) provides a scalar constraint on unit vectors to a common feature in a pair of frames. The E constraint can be expressed in terms of the relative position vector between the two frames and the Direction Cosine Matrix (DCM) relating the frames: E = [pX] DCM, where [pX] is the cross-product matrix formed from the relative position. There are many methods for estimating E from feature matches between image frames. My thoughts are to use a local estimator for each overlapped frame pair to estimate its E matrix. If the relative position is available for the frame pair, then the DCM relative attitude is easily obtained using only three well-distributed features. This enables the feature position vectors to be completely removed from the BA problem and reduces the number of estimated parameters to 3 per overlapped image pair. Note that after the BA is solved in this manner, the archived feature unit vectors can be used to recover the feature positions. Does anyone have any thoughts on this? If this is practical, I would be interested in developing such a method for inclusion within Ceres.
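To make the quoted constraint E = [pX] DCM concrete, here is a sketch (not from the post) of that scalar constraint as a Ceres residual: for matched unit feature vectors x1, x2 in the two frames, the residual is x2' [pX] R x1, with the relative attitude R parameterized as an angle-axis 3-vector.

#include <ceres/ceres.h>
#include <ceres/rotation.h>

// Scalar epipolar residual x2' * [pX] * R * x1 for one matched feature
// pair: angle_axis (3) parameterizes the relative DCM R, and p (3) is
// the relative position. In practice p would be constrained to unit
// norm (the scale is unobservable from the pair alone). Hypothetical
// name; a sketch of one possible cost, not the post's implementation.
struct EpipolarResidual {
  EpipolarResidual(const double* x1, const double* x2) {
    for (int k = 0; k < 3; ++k) { x1_[k] = x1[k]; x2_[k] = x2[k]; }
  }

  template <typename T>
  bool operator()(const T* angle_axis, const T* p, T* residual) const {
    T x1[3] = {T(x1_[0]), T(x1_[1]), T(x1_[2])};
    T Rx1[3];
    ceres::AngleAxisRotatePoint(angle_axis, x1, Rx1);  // R * x1
    // [pX] * (R * x1), applied as an ordinary cross product p x (R x1).
    T c[3] = {p[1] * Rx1[2] - p[2] * Rx1[1],
              p[2] * Rx1[0] - p[0] * Rx1[2],
              p[0] * Rx1[1] - p[1] * Rx1[0]};
    residual[0] = T(x2_[0]) * c[0] + T(x2_[1]) * c[1] + T(x2_[2]) * c[2];
    return true;
  }

  double x1_[3], x2_[3];
};

// One residual block per matched feature in the overlapped pair:
// problem.AddResidualBlock(
//     new ceres::AutoDiffCostFunction<EpipolarResidual, 1, 3, 3>(
//         new EpipolarResidual(x1, x2)),
//     nullptr, rel_angle_axis, rel_position);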
--
My motivation for this discussion is the desire to process multiple aerial datasets, each containing many 1000s of images, very quickly, with little memory, using standard PC hardware, in the field, with unskilled staff -- and with 99.9% reliability. Our current practice is to run a very thorough (and well proven) navigation KF smoother that processes data from a low cost MEMS IMU and GPS and provides exceptional per-image pose and its covariance. This GPS/IMU KF is 100% reliable and only takes a minute or so. The massive BA problem that results is the bottleneck. Our standard pre-planned mission uses parallel image collection swaths with 60% downlap and 30% sidelap.
Returning to the circular trajectory as a simple (but practical) example: consider that you have preprocessed all the overlapped image pairs to extract their E matrices. Assume you have loop closure, so that the nth image overlaps the n-1th image and the E(n,n-1) matrix is computed. Can we form a cost function in the E matrices which, when optimized, yields a trajectory that is consistent with all the feature observations? If you draw a circular trajectory about a point, with image points on a regular polygon, then it looks like this is possible, ending up with a single scale uncertainty. I understand that the imagery alone will result in an overall scene-scale uncertainty.
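One way to probe whether such a cost function can exist (a sketch, assuming Eigen; the PairMeas container and the per-pair scales s are illustrative): chain the pairwise measurements around the loop and penalize the closure defect, leaving the overall scale free.

#include <vector>
#include <Eigen/Dense>

// Per-pair measurement recovered from E: unit direction of travel and DCM.
struct PairMeas {
  Eigen::Vector3d u;  // unit direction of travel, in frame i coordinates
  Eigen::Matrix3d R;  // DCM frame i -> frame i+1
};

// Loop-closure defect for candidate per-pair scales s[i] (the pairwise E
// matrices fix only the direction of each baseline, not its length).
// For a perfectly consistent loop the returned vector is zero, and the
// chained DCM C returns to (approximately) the identity.
Eigen::Vector3d LoopClosureDefect(const std::vector<PairMeas>& rel,
                                  const std::vector<double>& s) {
  Eigen::Matrix3d C = Eigen::Matrix3d::Identity();  // world -> current frame
  Eigen::Vector3d t = Eigen::Vector3d::Zero();
  for (size_t i = 0; i < rel.size(); ++i) {
    t += C.transpose() * (s[i] * rel[i].u);  // accumulate in the world frame
    C = rel[i].R * C;
  }
  return t;
}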
More to the point: do the pre-computed E matrices form an information set that summarizes all the information from the observed features we have used in our pre-processing? The "reduced measurement matrix" from the GEA paper (near Eq. 7) suggests this. And they describe numerous examples of actual imagery processing with seemingly good results. They don't get a "better" result (based on reprojection error) -- just a faster result.
Our current methods use a pretty standard BA solution to adjust the per-frame orientations based on the reprojection process; but the BA solution gets awkward when we have many thousands of images and many matched features per image.
We have used commercial stuff, but we have preferred our home-grown methods, mostly because we can capitalize on the exquisite pose from the IMU/GPS. By doing the matches in projected space, we don't need the scale/affine invariance of SIFT; instead we use a nearly 100%-successful match approach based on more traditional pyramided correlations over large feature templates. We have tried all the various feature selection/match/RANSAC methods, but even a single bad match destroys any attempt at automation.
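For reference, the coarse-to-fine correlation idea might look roughly like this (a sketch assuming OpenCV; border handling and subpixel refinement are omitted, and windows near the image edge would need extra care):

#include <algorithm>
#include <vector>
#include <opencv2/imgproc.hpp>

// Match a large template against an image by normalized cross-correlation:
// full search at the coarsest pyramid level, then local refinement at each
// finer level in a small window around the projected match.
cv::Point MatchPyramid(const cv::Mat& image, const cv::Mat& templ,
                       int levels) {
  std::vector<cv::Mat> imgPyr{image}, tplPyr{templ};
  for (int l = 1; l < levels; ++l) {
    cv::Mat i2, t2;
    cv::pyrDown(imgPyr.back(), i2);
    cv::pyrDown(tplPyr.back(), t2);
    imgPyr.push_back(i2);
    tplPyr.push_back(t2);
  }
  cv::Mat score;
  cv::matchTemplate(imgPyr.back(), tplPyr.back(), score, cv::TM_CCOEFF_NORMED);
  cv::Point best;
  cv::minMaxLoc(score, nullptr, nullptr, nullptr, &best);
  for (int l = levels - 2; l >= 0; --l) {
    best *= 2;  // project the coarse match to the next finer level
    cv::Rect win(std::max(best.x - 4, 0), std::max(best.y - 4, 0),
                 tplPyr[l].cols + 8, tplPyr[l].rows + 8);
    win &= cv::Rect(0, 0, imgPyr[l].cols, imgPyr[l].rows);
    cv::matchTemplate(imgPyr[l](win), tplPyr[l], score, cv::TM_CCOEFF_NORMED);
    cv::Point local;
    cv::minMaxLoc(score, nullptr, nullptr, nullptr, &local);
    best = cv::Point(win.x + local.x, win.y + local.y);
  }
  return best;  // top-left corner of the best match at full resolution
}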
Our post-match home-grown BA solutions have used recursive methods that have evolved over many years. Careful use of Krylov techniques allows replacing all HP computations (H = measurement matrix, P = covariance) with a single vector computation that automatically accounts for all the sparseness in H. We use a special formulation of the KF so that it is mechanized in fixed point (see Mohinder Grewal's work on the sigmaRho filter); this further allows removing all insignificant multiply-accumulate operations related to weakly correlated, non-contributing states. This technique bypasses the notion of "block partitioning" and secondary structure sorting used by SBA methods, replacing it with a more organic "trigger" for omitting non-contributing computations. Using this makes BA feature-match processing manageable for 1000s of images -- but it still exhibits the "curse of dimensionality".
Where we want to go is a whole new approach to the problem. All organized imaging platforms move along in time and snap images according to a pre-planned coverage approach. Tourist-camera "happy snaps" seem to have driven a lot of the one-size-fits-all BA technologies!

The Essential matrix actually provides real navigation information -- but its use within current BA is limited in the same way as the Schur decomposition, resulting in a curse of dimensionality (as you point out above). From these discussions and the excellent papers I have now read, I am now convinced that we can "roll up" the feature matching into a massively parallel pre-process resulting in the E matrices. The E matrix form can be linearized using a small-angle (cross-product) matrix. This enables a local (per-pair) estimator that is initialized with an a priori covariance of the relative attitude and iteratively produces an estimate of the relative-pose Euler angles. The relative-position part of this yields the unit vector direction of travel between frames (like a dead-reckoning navigator).

So the E matrix now produces direct navigation information -- almost like integrating a gyro and accelerometer between image shutters. What this means is that we can treat the E matrix components just like Kalman filter updates to the traditional navigation filter. This classic filter already has the attitude and position being propagated as states, and the relative information from the image data can simply be inserted as an update to this filter. Because each image-derived update includes 6 correlated quantities (3 relative positions and 3 relative attitudes, plus their covariance), the overall computation is really not increased much over the traditional navigation KF. Our KFs already run with sigmaRho, so the complete forward/backward filter runs in minutes.
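Structurally, the update described here is just a standard 6-vector Kalman measurement update. A generic sketch (assuming Eigen; the state layout and the measurement Jacobian H are application-specific and only hinted at, and the linear form z - H*x stands in for the nonlinear innovation z - h(x)):

#include <Eigen/Dense>

using Vec = Eigen::VectorXd;
using Mat = Eigen::MatrixXd;

// One EKF measurement update. Here z would be the 6 E-derived relative
// pose components (3 attitude, 3 position direction), R their 6x6
// covariance from the per-pair estimator, and H the Jacobian of the
// predicted measurement with respect to the navigation state.
void EkfUpdate(Vec& x, Mat& P, const Vec& z, const Mat& H, const Mat& R) {
  const Mat S = H * P * H.transpose() + R;   // innovation covariance
  const Mat Kt = S.ldlt().solve(H * P);      // S^-1 H P (S is symmetric)
  const Mat K = Kt.transpose();              // Kalman gain K = P H' S^-1
  x += K * (z - H * x);                      // state correction
  const Mat I = Mat::Identity(x.size(), x.size());
  P = (I - K * H) * P * (I - K * H).transpose()
      + K * R * K.transpose();               // Joseph-form covariance update
}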
[1] Horn, B. K. P., "Recovering Baseline and Orientation from 'Essential' Matrix," January 1990. people.csail.mit.edu/bkph/articles/Essential.pdf
--
GPU helps, multi-threading helps -- but you still get geometric growth in execution time with frame count.
Ceres is a generalized LSQ solver -- I am hoping to find an alternative cost function that makes use of both a priori knowledge of the collection plan and really good initial poses for all frames.
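A sketch of what such a problem could look like assembled in Ceres, reusing the hypothetical PosePrior and EpipolarResidual functors sketched earlier: every frame keeps its GPS/IMU-seeded pose and prior, every overlapped pair contributes epipolar residuals, and no feature points appear in the state.

#include <ceres/ceres.h>

// Hypothetical assembly; PosePrior and EpipolarResidual are the sketches
// given earlier in the thread, and the data containers are placeholders.
// (In a full formulation the per-pair relative parameters would either be
// rewritten as functions of the two absolute poses or tied to them with
// additional residuals.)
void SolveTrajectory(/* frames, pairs, matches, nav priors ... */) {
  ceres::Problem problem;

  // Per-frame GPS/IMU priors (strong, from the KF smoother):
  // problem.AddResidualBlock(
  //     new ceres::AutoDiffCostFunction<PosePrior, 6, 6>(
  //         new PosePrior(nav_pose[i], sqrt_info[i])),
  //     nullptr, pose[i]);

  // Per-pair epipolar residuals on the 3+3 relative-pose parameters,
  // robustified so a single bad match cannot destroy the solution:
  // problem.AddResidualBlock(
  //     new ceres::AutoDiffCostFunction<EpipolarResidual, 1, 3, 3>(
  //         new EpipolarResidual(x1, x2)),
  //     new ceres::HuberLoss(1e-3), rel_aa[k], rel_p[k]);

  ceres::Solver::Options options;
  options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;
  options.num_threads = 8;
  options.max_num_iterations = 50;

  ceres::Solver::Summary summary;
  ceres::Solve(options, &problem, &summary);
}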