Combining multiple odometry sources, with auto extrinsic calibration


Terry

Apr 4, 2025, 12:48:42 PM
to gtsam users
Hi,

I am investigating ways to combine multiple (6-DoF or 3-DoF) pose sources coming from sensors (e.g. VO and LO) rigidly mounted to a platform. Assuming they each produce measurements at different times and frequencies, how could we still constrain this system?

Attached is an (incomplete) graph of the system. I am trying to work out what the ?<?> factor should be.  If we know the extrinsic parameters, we could just transform C and L into the common frame of X and add a BetweenFactor<Pose3>.

Would a viable way be something like BetweenFactorWithExtrinsic(x1, c1, xc), where xc is a Pose3 variable representing the rigid transform?

I've also considered making ?<?> some constant-velocity factor, where X also contains velocity parameters (and a timestamp field).

Thanks for looking!

Terry
[Attachment: Screenshot 2025-04-04 at 12.39.56 PM.png]

Dellaert, Frank

Apr 4, 2025, 2:29:53 PM
to Terry, gtsam users
If you have an IMU, an IMU factor is the right answer. Without an IMU, an easy answer is to just add regular BetweenFactors that impose some smoothness constraint. If you set the variance proportional to the delta-time, I believe it will automatically interpolate x2 between x1 and x3, as if you had used `interpolate<Pose3>`.
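
A minimal sketch of this scheme (the function name, keys, and per-second sigma are placeholders, not from this thread); note that variance proportional to dt means sigma scales with sqrt(dt):

#include <gtsam/geometry/Pose3.h>
#include <gtsam/nonlinear/NonlinearFactorGraph.h>
#include <gtsam/slam/BetweenFactor.h>
#include <cmath>

using namespace gtsam;

// Zero-motion smoothness between consecutive poses: with variance
// proportional to the time gap dt, a pose with no direct measurement
// settles at the interpolation of its neighbours.
void addSmoothness(NonlinearFactorGraph& graph, Key xi, Key xj, double dt,
                   double sigmaPerSqrtSec) {
  auto noise = noiseModel::Isotropic::Sigma(6, sigmaPerSqrtSec * std::sqrt(dt));
  graph.emplace_shared<BetweenFactor<Pose3>>(xi, xj, Pose3(), noise);
}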

You can think of more complex schemes, but probably with not much added value, IMHO.

Frank


Terry

Apr 4, 2025, 3:31:05 PM
to gtsam users
Hi Frank,

Thanks for the fast reply! So if I understand you correctly, I could use a BetweenFactor<Pose3>(x1, x2, Pose3(), noise * dt) for all consecutive poses (x2->x3, etc.) as a smoothing constraint, even though only x1->x3 is observed? This does seem easier than velocity constraints.

For the extrinsic parameter, does it make sense to have a 3-way factor that works like a BetweenFactor, but instead of the measured pose it uses a symbol representing the extrinsic Pose3?

For the IMU, we don't have one, unfortunately, so I'm trying to find a general loosely coupled solution.

Thanks again!

Terry

Dellaert, Frank

Apr 4, 2025, 3:56:41 PM
to Terry, gtsam users
If you know the extrinsics you can do a similarity transform before adding the BetweenFactor to the graph. E.g., if you decide to use the LIDAR frame wT1...wTm and you have a between measurement 1C3 = inv(wC1)*wC3 from the cameras, you could transform it by:
1T3 = inv(wT1)*wT3 = inv(wC1*cEl)*(wC3*cEl) = inv(cEl)*1C3*cEl
since wT1 = wC1*cEl and wT3 = wC3*cEl
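
In GTSAM types, a sketch of that transform (function and variable names are assumed, not from this thread):

#include <gtsam/geometry/Pose3.h>
#include <gtsam/nonlinear/NonlinearFactorGraph.h>
#include <gtsam/slam/BetweenFactor.h>

using namespace gtsam;

// Conjugate a camera between-measurement 1C3 by the known extrinsic cEl
// (lidar pose in the camera frame, as above) and add it as a BetweenFactor
// on the lidar-frame pose keys x1, x3.
void addCameraBetween(NonlinearFactorGraph& graph, Key x1, Key x3,
                      const Pose3& oneC3, const Pose3& cEl,
                      const SharedNoiseModel& noise) {
  const Pose3 oneT3 = cEl.inverse() * oneC3 * cEl;
  graph.emplace_shared<BetweenFactor<Pose3>>(x1, x3, oneT3, noise);
}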

If you want to *estimate* the extrinsic cEl, you would indeed need to build a custom factor or use expressions. However, there are projection factors that already use the extrinsic as a variable to optimize (if I remember correctly).

Frank


Yotam Stern

Apr 6, 2025, 3:24:40 AM
to gtsam users
Hi, it's an interesting problem. I would like to propose a slightly different approach to Frank's, but it would require some custom factors to be added.

You can pick one sensor as the main coordinate system for your rig - let's say it's the camera, as in most typical systems. You can pick only the poses at the times observed by the camera as the variables you will estimate in the pose graph, and treat the LO measurements as pure constraints on those, like this:

Define a lidar-to-camera calibration symbol, let's call it C(1), and define a custom factor that handles the LO measurements as follows:
1. Transform the LO measurement to the camera CS with the C(1) value.
2. Calculate the interpolation parameter alpha_1 from the two camera poses nearest in time to the first LO pose relevant for the current factor. Denote t_prev_camera as the time of the camera pose prior to the first LO pose t_LO_first, and t_next_camera as the time of the camera pose after it, so alpha_1 = (t_LO_first - t_prev_camera) / (t_next_camera - t_prev_camera). Now interpolate the previous and next camera poses by the parameter alpha_1.
3. Do the same for the second LO pose and the corresponding camera poses before/after it.
4. Calculate a regular between-factor error on the two interpolated camera poses and the LO between measurement that was transformed to the camera CS with the calibration C(1) (sketched below).

That is basically your factor: it will both calibrate the lidar to the camera and enforce the LO measurements on the camera poses.

This assumes the camera pose frequency is enough for your needs. It will simplify the system and alleviate the need to tune covariances for smoothing between factors, as otherwise suggested. With this factor you need the measurement covariances from the LO and VO systems, plus some prior on the calibration, which is only a single global parameter to tune.
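
A sketch of the step-4 error (names hypothetical; the extrinsic convention follows Frank's cEl above, and a real factor would still need Jacobians, e.g. via expressions or numerical derivatives):

#include <gtsam/base/Lie.h>
#include <gtsam/geometry/Pose3.h>

using namespace gtsam;

// camPrev*/camNext* bracket each LO timestamp, alpha* are the step-2/3
// interpolation parameters, cEl is the calibration C(1), and measuredLO is
// the relative LO pose between the two LO timestamps.
Vector6 loBetweenError(const Pose3& camPrevA, const Pose3& camNextA, double alphaA,
                       const Pose3& camPrevB, const Pose3& camNextB, double alphaB,
                       const Pose3& cEl, const Pose3& measuredLO) {
  // Steps 2-3: camera poses interpolated to the two LO timestamps.
  const Pose3 camA = interpolate(camPrevA, camNextA, alphaA);
  const Pose3 camB = interpolate(camPrevB, camNextB, alphaB);
  // Step 1: express the LO relative pose in the camera frame.
  const Pose3 measuredInCam = cEl * measuredLO * cEl.inverse();
  // Step 4: regular between-style error between prediction and measurement.
  return Pose3::Logmap(camA.between(camB).between(measuredInCam));
}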

Maurice Fallon

Apr 6, 2025, 8:28:53 AM
to Yotam Stern, gtsam users
Hi Terry,

You can find useful discussion on this in our multi-sensor fusion papers here:
https://robots.ox.ac.uk/~mfallon/publications/2021RAL_wisth.pdf
This is 'tight fusion' at the lower, 10 Hz, level. But I think you might be operating at a higher level, with the outputs of independent VO and LO modules.

A key thing we did was to reinterpret the lidar scans at timestamps equivalent to camera timestamps (as their timestamp is somewhat arbitrary because of the continuous scanning motion). So the factor graph poses correspond to either camera frames or lidar frames - which is much easier to interpret and to develop in code. I think this is what Yotam said.

Like he said, you should be estimating a fixed frame on the vehicle/sensor rig by transforming the odometry estimates into a common coordinate frame.

I would say that sensors aren't created equal - lidar odometry typically doesn't suffer from calibration bias and has a lower drift rate, especially in open space. So putting duplicate edges for VO and LO into a factor graph and letting GTSAM resolve it isn't really productive and can become unstable. I'd recommend a looser fusion where you use LO when it's available and working well, and fuse VO when LO struggles - especially if you are running these odometry systems standalone and have any problems with clock synchronisation.

best wishes,
Maurice



Dellaert, Frank

Apr 6, 2025, 12:03:15 PM
to Maurice Fallon, Yotam Stern, gtsam users
Woohoo! The GTSAM user group in action! Love it :-)

Best!
Frank


Brett Downing

Apr 8, 2025, 9:26:45 AM
to gtsam users
Hi Terry,

I've been building a system of factors to implement and generalise Furgale's 2013 paper
https://doi.org/10.1109/IROS.2013.6696514

I'm working in an additive manufacturing context where I need to compute extrinsic calibration for asynchronous sensors.
The sensors only work while they're in motion, and the temporal uncertainty of their sampling time (propagated with the velocity) dominates any positional uncertainty in the rest of the system.
The initial clock offset is so poorly characterised that the order of events is unknown. To reach the required uncertainty, we solve a temporal calibration at the same time as the extrinsics.

Our trajectory model uses monotonic control points with an interpolation kernel applied via recursive calls to interpolate<>, forming a FIR filter that can exactly and efficiently replicate Nth-order cardinal B-splines.
A per-sensor clock model provides a global timestamp to the FIR-filter Expression for each measurement. (You need to jostle the system to make the temporal calibration non-degenerate.)
Building the FIR-filter Expression relies on interpolate<> having derivatives for the t parameter: https://github.com/borglab/gtsam/pull/2035
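
For intuition, a toy version of that recursive interpolation (a de Casteljau-style recursion over three control poses; Brett's actual kernel and cardinal B-spline weights would differ):

#include <gtsam/base/Lie.h>
#include <gtsam/geometry/Pose3.h>

using namespace gtsam;

// Two levels of interpolate<> over three control poses yield a smooth
// quadratic-like curve on SE(3); a cardinal B-spline uses different
// basis-function weights, but the recursion has the same shape.
Pose3 evalPose(const Pose3& p0, const Pose3& p1, const Pose3& p2, double u) {
  const Pose3 a = interpolate(p0, p1, u);
  const Pose3 b = interpolate(p1, p2, u);
  return interpolate(a, b, u);
}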
To carry the sensor extrinsics, we have a ternary version of the BetweenFactor, taking three keys representing the global chassis pose, the global sensor pose, and the sensor extrinsics relative to the chassis datum.

For each measurement time Tn, the graph computes:
* global time stamp from the sensor's clock model and the sensor's measurement timestamp (clock factor)
* chassis pose at Tn from the global timestamp and the trajectory model (FIR factor / trajectory pose factor)
* sensor pose at Tn from the chassis pose and the sensor extrinsic (ternary BetweenFactor; sketched below)
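
One possible expression-style sketch of that ternary between (key letters and noise are assumptions, not Brett's code):

#include <gtsam/geometry/Pose3.h>
#include <gtsam/inference/Symbol.h>
#include <gtsam/nonlinear/ExpressionFactorGraph.h>
#include <gtsam/slam/expressions.h>

using namespace gtsam;
using symbol_shorthand::E;  // sensor extrinsic (shared across time)
using symbol_shorthand::S;  // global sensor pose
using symbol_shorthand::X;  // global chassis pose

// The predicted sensor pose (chassis composed with extrinsic) should match
// the estimated global sensor pose; the 'measurement' is identity.
void addTernaryBetween(ExpressionFactorGraph& graph, size_t n,
                       const SharedNoiseModel& noise) {
  Pose3_ chassis(X(n)), sensor(S(n)), extrinsic(E(0));
  graph.addExpressionFactor(between(compose(chassis, extrinsic), sensor),
                            Pose3(), noise);
}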

Solving this system for a robot in motion explicitly recovers the extrinsics and the temporal calibration.
By generating the chassis pose from the timestamp and an interpolation, the solver gets to decide on the order of events.
The trajectory model has a user-defined FIR filter, so the smoothing effect can be explicitly defined rather than relying on whatever norms and loss functions are in the solver.

I haven't written the Factor-style boilerplate for these; we haven't needed it. To make the solver go faster, we've been aggressively pruning intermediate parameters, so we just have a pile of dense anonymous expression factors linking trajectory parameters, extrinsics parameters, and geometry parameters.
We're processing raw lidar data in the factor graph using signed distance fields and a CSG tree, all implemented in Expressions.  Our initialisation is tight enough that we only have to linearise two or three times.

I prefer Furgale's clock recovery approach because it exposes the full continuous-time dynamics of the system. It can trivially relate the trajectory model to stochastically sampled IMU data and GPS doppler data, and it can theoretically be extended to estimating excitation of vibrational modes in large structures.

Cheers,
Brett



Terry

Jun 23, 2025, 9:50:49 AM
to gtsam users
Thanks for the great responses! And sorry for the delayed reply - I've been meaning to circle back here.

I've done some work on this and have landed on something that is working quite nicely and has room for growth. It involves: 1) Expressions (they're great!), 2) Frank's suggestion of 'interpolation factors', 3) transforming to a common frame, and 4) using expressions to make the common-frame transform (the extrinsic cal) a variable!

What's also interesting is that it's quite insensitive to timing (mis)synchronization between the measurement sources. I'd love to get to estimating the timing offset for the sources too.

Here's a minimal example for two (rigidly attached) pose streams:


#include <gtsam/geometry/Pose2.h>
#include <gtsam/nonlinear/ExpressionFactorGraph.h>
#include <gtsam/nonlinear/expressions.h>

using namespace gtsam;
using Pose2_ = Expression<Pose2>;

// Expression: relative pose between two poses, each mapped into the common
// frame by its extrinsic
Pose2_ between_with_extrinsic(const Pose2_& a, const Pose2_& b, const Pose2_& extrinsic)
{
  return between(compose(a, extrinsic), compose(b, extrinsic));
}

ExpressionFactorGraph graph;

// Symbols
Pose2_ a1(1), b2(2), a3(3), b4(4), a5(5);
Pose2_ extrinsic_a_(6), extrinsic_b_(7);
// Pose2 extrinsic_a_, extrinsic_b_; // Value instead of symbol, if known transform

// Between constraints for stream A (pose_1_3 etc. are the measured relative poses)
noiseModel::Diagonal::shared_ptr model = noiseModel::Diagonal::Sigmas(Vector3(0.01, 0.01, 0.01));
graph.addExpressionFactor(between_with_extrinsic(a1, a3, extrinsic_a_), pose_1_3, model);
graph.addExpressionFactor(between_with_extrinsic(a3, a5, extrinsic_a_), pose_3_5, model);

// Between constraints for stream B
graph.addExpressionFactor(between_with_extrinsic(b2, b4, extrinsic_b_), pose_2_4, model);

// Smoothing terms (soft zero-motion constraints, covariance scaled by dt)
// Pretend 1.0 s between each pose measurement
double dt = 1.0;
noiseModel::Diagonal::shared_ptr smoothing = noiseModel::Diagonal::Sigmas(Vector3(0.1, 0.1, 0.1) * dt);
graph.addExpressionFactor(between(a1, b2), Pose2(0, 0, 0), smoothing);
graph.addExpressionFactor(between(b2, a3), Pose2(0, 0, 0), smoothing);
graph.addExpressionFactor(between(a3, b4), Pose2(0, 0, 0), smoothing);
graph.addExpressionFactor(between(b4, a5), Pose2(0, 0, 0), smoothing);

