TagSLAM latency


Torchtopher

Mar 26, 2025, 12:32:09 PM
to tagslam

Hi Bernd,


We (FRC 900) are using TagSLAM to localize against a known map. Our goal is to follow a preplanned path and align to poles with roughly +/- 1 inch tolerance. For inputs, we have 4 OV2311 cameras (using approximate sync) running 1600x1300 at 60 Hz detecting tags, as well as our wheel odometry at 250 Hz (configured in the TagSLAM config file with a positional error of 0.5 meters, which should be worst case, and an angular error of 0.017 radians). For path following, we use the approach you recommended: we update the transform from the map frame to the odom frame using TagSLAM's updates, and then run PID + velocity feedforward using the map to base_link transform as our pose estimate (rough sketch below).
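
In code, the map -> odom update amounts to roughly the following. This is only a simplified sketch of the idea, not our actual node; it assumes the frame names above (map, odom, base_link, frc_robot), TagSLAM's /tagslam/odom/body_frc_robot topic, and that frc_robot and base_link refer to the same physical frame.

```
#!/usr/bin/env python
# Simplified sketch only (not our actual node). Takes TagSLAM's map->frc_robot
# pose, looks up odom->base_link at the same stamp, and broadcasts
# map->odom = T_map_robot * inverse(T_odom_base), assuming frc_robot and
# base_link refer to the same physical frame.
import numpy as np
import rospy
import tf2_ros
from nav_msgs.msg import Odometry
from geometry_msgs.msg import TransformStamped
from tf.transformations import quaternion_matrix, quaternion_from_matrix

def to_matrix(t, q):
    m = quaternion_matrix([q.x, q.y, q.z, q.w])
    m[0:3, 3] = [t.x, t.y, t.z]
    return m

def tagslam_callback(msg):
    try:
        # odom -> base_link at the TagSLAM (image) time stamp
        tf_ob = tf_buffer.lookup_transform('odom', 'base_link',
                                           msg.header.stamp, rospy.Duration(0.05))
    except (tf2_ros.LookupException, tf2_ros.ConnectivityException,
            tf2_ros.ExtrapolationException):
        return
    T_map_robot = to_matrix(msg.pose.pose.position, msg.pose.pose.orientation)
    T_odom_base = to_matrix(tf_ob.transform.translation, tf_ob.transform.rotation)
    T_map_odom = T_map_robot.dot(np.linalg.inv(T_odom_base))

    out = TransformStamped()
    out.header.stamp = msg.header.stamp
    out.header.frame_id = 'map'
    out.child_frame_id = 'odom'
    out.transform.translation.x, out.transform.translation.y, \
        out.transform.translation.z = T_map_odom[0:3, 3]
    q = quaternion_from_matrix(T_map_odom)
    out.transform.rotation.x, out.transform.rotation.y, \
        out.transform.rotation.z, out.transform.rotation.w = q
    broadcaster.sendTransform(out)

rospy.init_node('map_to_odom_updater')
tf_buffer = tf2_ros.Buffer()
listener = tf2_ros.TransformListener(tf_buffer)
broadcaster = tf2_ros.TransformBroadcaster()
rospy.Subscriber('/tagslam/odom/body_frc_robot', Odometry, tagslam_callback)
rospy.spin()
```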


Our problem is that the positions output by TagSLAM seem to be very accurate, but they have considerable latency of around 500 ms, by which time our robot has already moved quite a lot. Then, when the TagSLAM output does catch up to base_link (i.e. frc_robot and base_link are on top of each other), we have already shifted by a fairly significant amount.


Any ideas? Even if this is expected, it would still be helpful to know; most likely we are doing something wrong.

Things we have hopefully already fixed:

  • We spent a while getting extrinsics for the cameras: we essentially just placed 20 tags around the robot and slowly moved around. This seems to have worked, as previously we got tons of high subgraph warning messages. However, we would be fine with using just one camera if that is better, given the lack of hardware sync.


Relevant files: 

  • Tagslam config
  • Cameras
  • Camera poses
  • map_to_odom
  • Bag with compressed camera data
  • Another bag with video of the robot
  • Video 1: Bag 1


We would be happy to provide more bags (we ran a few trials) or collect other data.


Thanks again for all your help over the past seasons.


Bernd Pfrommer

Mar 27, 2025, 2:53:44 PM
to tagslam
Chris,
Here are some comments:
- The camera images actually look pretty good, given the lighting, which is pretty dim. (Leave the lighting as is, since you don't know what you'll get during the competition.)
- What cameras are you using? Is there any way you could hardware sync them? It really helps when the robot rotates.
- Usually TagSLAM itself is not the source of delay; it's the tag detection and the sync. You have fairly high resolution, but I think you can run the sensor at lower resolution, which would give you better shutter times and reduce your compute load.
- Unfortunately the bags don't seem to have the original images, just the debug images, so I can't easily run the full pipeline or compare the time stamps between the image header.stamp and the tag detection header.stamp. That would tell me how much latency the tag detection produces.
- In the first bag, the cameras seem to run at different frequencies. Not sure why, but I would not do that (unless you have tasks other than tag detection that require different rates).
- What tag detector are you running? If your tag detector is keeping up with 1600x1300 @ 20Hz, you must almost certainly be running on a GPU.
- Have a hard look at the image software sync (approximate sync). If you only see the problem when going multi-cam, the sync is often the culprit:
   - When the approximate sync runs, it tries to coerce image frames into the same time slot. The slowest frequency will be the one that wins, meaning you should run all cams at the same frequency.
   - If the tag detector for cam 0 falls behind and drops a frame, then the approximate sync will not happen. If on the next frame the detector for cam 1 falls behind, then *that* frame pair will also be dropped, because a valid sync only happens when *all* cameras deliver a valid image for a time slot. So if the detectors alternately drop frames, you will *never* get a valid sync. Or finally you get lucky and both detections succeed. That could happen every 0.5s or so :). On top of that comes the sync against the odometry, but that one runs at 250Hz, so there will always be an odom frame close by.
  - This is why I've written sync_and_detect. There, the sync happens *before* the detection: a single node subscribes to the images, and only when it has images from all cameras for a given time slot (approx synced already!) does it run the detector on each image and then publish the tags for all images. That works a lot better when the detector falls behind.
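  In Python, that sync-before-detect pattern boils down to something like this (just a sketch; the topic names are placeholders and you would plug in your own detector):

```
#!/usr/bin/env python
# Sketch of the sync-before-detect idea (what sync_and_detect does in C++):
# one node gathers an approx-synced image set from *all* cameras first, and
# only then runs the detector on each image. Topic names are placeholders.
import message_filters
import rospy
from sensor_msgs.msg import Image

CAMERA_TOPICS = ['/cam0/image_raw', '/cam1/image_raw',
                 '/cam2/image_raw', '/cam3/image_raw']

def synced_callback(*images):
    # Called only when every camera contributed an image for this time slot.
    rospy.loginfo('got a synced set of %d images' % len(images))
    # run the tag detector on each image here, then publish all detections

rospy.init_node('sync_then_detect')
subs = [message_filters.Subscriber(t, Image) for t in CAMERA_TOPICS]
sync = message_filters.ApproximateTimeSynchronizer(subs, queue_size=10, slop=0.005)
sync.registerCallback(synced_callback)
rospy.spin()
```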
- On the other hand, from your bag, TagSLAM is outputting odometry poses at the same frequency as the images. If you were dropping frames as described above, then tagslam would not be outputting at the same rate. So maybe that is not the problem.
- If your camera allows external sync triggering, by all means do that. Soldering up a sync cable is not rocket science. On the other hand, this will not fix the problem of the alternating dropped frames described above.
- You can actually get a pretty good sense of tagslam latency by looking at the header stamp vs the recording stamp of the tagslam odom output, because the header.stamp is passed through all the way from the device driver (camera/IMU/wheel encoder) to tagslam. It is never altered (except by the approx sync, which will pick one of the header.stamps). TagSLAM itself also doesn't alter it, so when you look at the recorded tagslam odom output, the difference between the recording stamp and the header stamp gives you an upper bound on latency (because recording adds additional latency). The magic line is this:
rostopic echo -p -b ./_2025-03-25-18-12-19.bag /tagslam/odom/body_frc_robot | awk -F, '{print $1, $3}' > tagslam_output.txt
Then you subtract the columns from each other to get the latencies; see the histogram below. Looks like about 10 ms of latency, which is quite reasonable.

tagslam_latency.png
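
If you want to reproduce the plot, something like this works (just a sketch; it assumes the first line of tagslam_output.txt is the CSV header line that awk passes through, and that both columns are in nanoseconds, as rostopic echo -p prints them):

```
#!/usr/bin/env python
# Sketch: latency = recording time - header time from the two columns written
# by the rostopic echo / awk command above, plotted as a histogram.
import numpy as np
import matplotlib.pyplot as plt

# skiprows=1 drops the "%time field.header.stamp" header line
data = np.loadtxt('tagslam_output.txt', skiprows=1)   # both columns in ns
latency_ms = (data[:, 0] - data[:, 1]) * 1e-6         # recording - header, ms

plt.hist(latency_ms, bins=50)
plt.xlabel('latency [ms]')
plt.ylabel('count')
plt.title('TagSLAM odom latency (upper bound)')
plt.show()
```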



- Your odom noise of 0.5m seems way off. You are basically saying that your odom is mistracking by 0.5m *between two image frames*. This may get you the best results, but it means you are essentially ignoring the odom and just using it to interpolate a little bit. This will mask some underlying problems you have. My guess is that your transform between the camera frame and the body frame is off, because I see in tagslam.yaml that you have set the rig body to coincide with the odom frame (the TF is set to the identity). Now I wonder how you got the absolute pose of the camera with respect to your rig body.
- I suggest working out the transform between the odometry frame and *one* camera first:
   - Start out by measuring with a measuring tape and doing the TF on paper, then put camera 0 at that position in camera_poses.yaml, and leave the odom frame transform as the identity (as is).
   - Do a test run where you set the image noise in tagslam.yaml very low, so the path should follow the odometry. Take a screenshot of what you see in rviz.
   - Do another test run with the image noise at 1 pix and the odometry noise very large (rotational noise about pi, position error about 10m).
   - These two paths should look roughly the same. If they go off in different directions, then you have a problem with your camera transform. Fiddle around until the two paths are roughly the same.
   - Once you have that, record another data set where you drive around and do plenty of rotations and translations with the camera. Give a loose prior for the pose of camera 0 (set the R-matrix *SMALL* in camera_poses.yaml, because R is the inverse of the variance (noise)). Then let tagslam optimize the pose of the camera. That should come out close to your measured/guessed transform.
   - Once that is done, fix the pose of cam 0 by increasing the R matrix in camera_poses.yaml again, then bring the other cameras online and do extrinsic calibration of those cameras w.r.t. cam 0.
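
For picking the R entries: since R is the inverse of the variance, a loose prior means small numbers and a tight prior means large numbers. A tiny illustration (the sigmas are made up, and you should check the rotation-vs-position ordering of the six diagonal entries against the TagSLAM example camera_poses.yaml files):

```
# R = 1 / variance: loose prior -> small R entries, tight prior -> large ones.
# Example sigmas only; verify the ordering of the 6 diagonal entries against
# the TagSLAM example camera_poses.yaml files.
import numpy as np

sigma_rot = 0.3   # rad: generous rotation uncertainty for the initial guess
sigma_pos = 0.1   # m:   generous position uncertainty

R_diag = np.concatenate([np.full(3, 1.0 / sigma_rot ** 2),
                         np.full(3, 1.0 / sigma_pos ** 2)])
print(R_diag)     # ~[11.1 11.1 11.1 100. 100. 100.] -> diagonal of the 6x6 R
```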

Phew. That was almost too long a reply. Hope this helps. Let me know if you have other questions.

PS: rosbag info has a --freq flag
PPS: what Jetson and what cameras are you using? I need to upgrade my Jetson and am curious what model you have and how you like it.
P^3S: NVidia is coming out with a new Jetson in June (supposedly).




Torchtopher

Mar 27, 2025, 4:20:01 PM
to tagslam
Just a quick response on the hardware / tag detector from our software mentor.

"

We’re using a few different types of Jetsons on the robot. The main control system is running on an Orin NX, and we have two Xavier NX systems, each with two cameras. The cameras are these: https://www.arducam.com/product/arducam-2mp-ov2311-global-shutter-mono-visible-light-camera-modules-for-jetson/. They were a good intersection of 2MP, global shutter, high frame rate, and designed to work with a Jetson. They do expose the pins needed for hardware sync; we’re working on that with hopes of having it before the end of the season.


We’re running an optimized fork of a GPU-based AprilTag detector; the code is
https://github.com/FRC900/971-Robot-Code/tree/main/frc971/orin (branch optim_2) and https://github.com/FRC900/900RobotCode/tree/main/zebROS_ws/src/gpu_apriltag.


The Xavier NX has 6-7 msec of latency to decode a tag using this, so we’re able to keep up with 2 cameras at 60Hz. We’d use Orin NX (or Orin Nanos, both are fast enough) except those have kernel driver issues running 2 of the Arducam cameras that we haven’t had time to debug. The Orin Nano is a msec or two faster, but given we’re already keeping up with the cameras, that doesn’t seem to be the limiting factor. The biggest downside to the Xavier is code compilation speed; the Orin is way faster for local builds.


Random aside - we have at least one Orin Nano using the waveshare carrier board clone with no issues (https://www.amazon.com/Waveshare-Jetson-Development-Board-Module/dp/B0C5DMKFYW/134-1996527-5104617?pd_rd_w=vJe2k&content-id=amzn1.sym.4c8c52db-06f8-4e42-8e56-912796f2ea6c&pf_rd_p=4c8c52db-06f8-4e42-8e56-912796f2ea6c&pf_rd_r=0BEC7KV3KCMHRF4RQ30G&pd_rd_wg=XAt8b&pd_rd_r=a922c8d0-db89-4e91-896e-c22f647c9ce6&pd_rd_i=B0C5DMKFYW&psc=1, or cheaper from Aliexpress ). I can’t 100% confirm that an Orin NX would also work but it looks like a straight up clone of the NV devkit board so I’d expect it would be fine.  This is only really useful if you decide to pick up an OrinNX, since those seem to come as bare modules or in overpriced combos with a devkit carrier board. 


I’m sure Thor will be awesome, but probably out of our price range given the competition rules we’re working under (< $600 per component).  Maybe if there’s ever a Thor Nano…


We’ll begin trying to answer the rest of the questions; we appreciate the guidance on those.


In the meantime the students have implemented a simple hack: wait for base_link to converge with frc_robot as a “we’re confident in our position” check. Even with a bit of a latency hit from doing so, the results look promising: https://drive.google.com/file/d/1GX8MYjRaSUK18NxFrvk75X_uplVkc9QQ/view?usp=sharing. The robot is totally autonomous for that entire video, and both path following and game piece placement use tag detections to localize the robot on the field."
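
For reference, the convergence check amounts to something like this (a simplified sketch, not our exact code; the tolerances are placeholders):

```
#!/usr/bin/env python
# Sketch (not our exact code): declare the pose "trusted" once map->base_link
# and map->frc_robot agree to within a small tolerance. Thresholds are
# placeholders.
import math
import rospy
import tf2_ros

TOL_POS = 0.02   # m, roughly 1 inch
TOL_YAW = 0.02   # rad

def converged(buf):
    try:
        a = buf.lookup_transform('map', 'base_link', rospy.Time(0))
        b = buf.lookup_transform('map', 'frc_robot', rospy.Time(0))
    except (tf2_ros.LookupException, tf2_ros.ConnectivityException,
            tf2_ros.ExtrapolationException):
        return False
    dx = a.transform.translation.x - b.transform.translation.x
    dy = a.transform.translation.y - b.transform.translation.y
    # crude yaw comparison from the z/w quaternion components (planar robot)
    yaw_a = 2.0 * math.atan2(a.transform.rotation.z, a.transform.rotation.w)
    yaw_b = 2.0 * math.atan2(b.transform.rotation.z, b.transform.rotation.w)
    dyaw = math.atan2(math.sin(yaw_a - yaw_b), math.cos(yaw_a - yaw_b))
    return math.hypot(dx, dy) < TOL_POS and abs(dyaw) < TOL_YAW

rospy.init_node('pose_convergence_check')
buf = tf2_ros.Buffer()
listener = tf2_ros.TransformListener(buf)
rate = rospy.Rate(50)
while not rospy.is_shutdown():
    rospy.loginfo_throttle(1.0, 'pose trusted: %s' % converged(buf))
    rate.sleep()
```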


We are currently testing and think that both sets of 2 cameras are running at the same frequency, so maybe it was just a bag artifact. Here is rostopic hz from the live robot.

ubuntu@dino-laptop:~$ rostopic hz /apriltag_detection_ov2311_10_9_0_9_video1/tags
subscribed to [/apriltag_detection_ov2311_10_9_0_9_video1/tags]
average rate: 60.168
min: 0.010s max: 0.022s std dev: 0.00153s window: 60
average rate: 60.034
min: 0.010s max: 0.022s std dev: 0.00123s window: 120
average rate: 60.038
min: 0.010s max: 0.022s std dev: 0.00106s window: 180
average rate: 60.008
min: 0.010s max: 0.022s std dev: 0.00102s window: 240
average rate: 60.004

ubuntu@dino-laptop:~$ rostopic hz /apriltag_detection_ov2311_10_9_0_9_video0/tags
subscribed to [/apriltag_detection_ov2311_10_9_0_9_video0/tags]
average rate: 60.032
min: 0.014s max: 0.021s std dev: 0.00138s window: 59
average rate: 59.841
min: 0.014s max: 0.022s std dev: 0.00119s window: 119
average rate: 59.932
min: 0.013s max: 0.022s std dev: 0.00120s window: 179
^C^Caverage rate: 59.963
min: 0.013s max: 0.022s std dev: 0.00118s window: 228

ubuntu@dino-laptop:~$ rostopic hz /apriltag_detection_ov2311_10_9_0_10_video1/tags
subscribed to [/apriltag_detection_ov2311_10_9_0_10_video1/tags]
average rate: 60.020
min: 0.014s max: 0.019s std dev: 0.00100s window: 59
average rate: 59.982
min: 0.014s max: 0.021s std dev: 0.00116s window: 119
average rate: 60.042
min: 0.011s max: 0.023s std dev: 0.00145s window: 179
average rate: 60.083
min: 0.011s max: 0.024s std dev: 0.00170s window: 240
average rate: 60.057

ubuntu@dino-laptop:~$ rostopic hz /apriltag_detection_ov2311_10_9_0_10_video0/tags
subscribed to [/apriltag_detection_ov2311_10_9_0_10_video0/tags]
average rate: 59.996
min: 0.015s max: 0.019s std dev: 0.00062s window: 59
average rate: 59.978
min: 0.009s max: 0.032s std dev: 0.00169s window: 119
average rate: 59.637
min: 0.009s max: 0.048s std dev: 0.00282s window: 178
average rate: 59.748
min: 0.009s max: 0.048s std dev: 0.00248s window: 239
average rate: 59.796
min: 0.009s max: 0.048s std dev: 0.00235s window: 299
average rate: 59.833


Will track down your other points but wanted to update with that.

Torchtopher

Mar 27, 2025, 7:45:39 PM
to tagslam

Hi again,


We did a full few hours of practice today, so probably 2 hours of robot running. Our current method of waiting for base_link and frc_robot to overlap, while probably not optimal, works very well and we almost never miss. However, we did observe one concerning behavior: relatively rarely, the latency - calculated as you did with ```rostopic echo -p -b ./[bagfile].bag /tagslam/odom/body_frc_robot | awk -F, '{print $1, $3}' > tagslam_output.txt``` - increased to anywhere from 500 ms to 1 second. Here is the problematic bag. We then graph this over time and notice a few concerning spikes, corresponding to very old transforms. Graphs were made with this script (a rough equivalent is sketched below).
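
The over-time plot is essentially this (a sketch, not the actual script; same column assumptions as your histogram command):

```
#!/usr/bin/env python
# Sketch (not the actual script linked above): plot latency over time to spot
# the spikes, using the same two-column tagslam_output.txt as the histogram.
import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt('tagslam_output.txt', skiprows=1)   # recording ns, header ns
t = (data[:, 0] - data[0, 0]) * 1e-9                  # seconds since bag start
latency_ms = (data[:, 0] - data[:, 1]) * 1e-6

plt.plot(t, latency_ms)
plt.xlabel('time since start of bag [s]')
plt.ylabel('latency [ms]')
plt.title('/tagslam/odom/body_frc_robot latency over time')
plt.show()
```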

 



You can also see the same graph for a previous, fully working bag (the _2025-03-25-18-12-19.bag above). These times also correspond with how old the tagslam transform is: we see roughly 20 ms old map -> frc_robot transforms before the spike, but during it the most recent transforms are nearly 1 second old. The main thing is that we had not changed anything since we sent the previous bag, and we had never observed this behavior before. We also reproduced your histogram and made one for the broken bag, which you can see here.

Left is the bad bag; right is the same plot you made, just regenerated with our working version of the script.


We suspected hardware, but the cameras still appear to provide good images and are properly seated in their mounts. Later runs did not consistently have this problem (but occasionally did). Any ideas? We can’t find any obvious correlation with movement or with the tags seen.
