Faster ray casting methods for AMCL

393 views
Skip to first unread message

Corey Walsh

unread,
Apr 14, 2017, 4:18:56 AM4/14/17
to ros-sig-navigation
Hi all,

As part of my masters thesis, I have written a library for fast 2D ray casting called RangeLibc [1]. The readme is slightly outdated but general docs are at [2]. I implemented particle filter localization with a beam model using RangeLibc for acceleration [3], and gotten very good results. The Python implementation can support ~10K particles in real time (~30Hz) with 61 ray casts per particle.

AMCL uses Bresenham's Line (BL) algorithm for ray casting in the beam model [4]. That is a very slow algorithm, and I have implemented several alternatives which are an order of magnitude faster. I would like to integrate RangeLibc algorithms with AMCL to significantly accelerate the beam model computation. I wanted to run this by you since it would require some degree of architectural change.

Namely, unlike BL, the fast algorithms in RangeLibc rely on acceleration structures with varying degrees of computation/memory overhead to construct. Ray marching and CDDT (my own algorithm, paper pending publication) are fairly fast to construct (<1 second) and have small memory requirement. The "giant lookup table" method is slower to construct and has a large 3D table, but is very fast at runtime. I think at the very least ray marching should be integrated, since it is a straightforward algorithm which requires only a euclidean distance transform for acceleration. In my experience, an average 5x performance boost from ray marching is typical in comparison to BL. The others are even faster for ray casting, but are discretized and/or more complicated and/or more memory intensive. Since these algorithms require acceleration structures, they are not great if the map changes frequently. For this reason, I am proposing to implement them alongside BL so applications with static maps may benefit. I figure a rosparam could specify which algorithm to use when the laser_model_type parameter is "beam".

I am wondering where would be best to perform initialization logic, and where the code should go? Any other suggestions in terms of structure would be appreciated. I'm happy to provide more information on RangeLibc by request.

I also noticed that AMCL does not [5] precompute the sensor model as suggested by section 3.4.1 of [6]. This leads to a lot of redundant computation at runtime which could be replaced by a single table lookup with negligible loss of precision. I'm interested in "fixing" this as well, but it's a separate issue.

6. D. Fox, W. Burgard, and S. Thrun. “Markov localization for mobile robots in dynamic environments,” Journal of Artificial Intelligence Research, vol. 11, pp. 391427, 1999.

v4hn

unread,
Apr 14, 2017, 3:56:39 PM4/14/17
to ros-sig-n...@googlegroups.com
> The Python implementation can support ~10K particles in real time (~30Hz)
> with 61 ray casts per particle.

Wow, nice numbers. I suppose this is your "RMGPU" / CUDA implementation though?
The performance comparison in your docs speaks for itself either way.

+10

This sounds really great and it would be awesome to see these ideas incorporated
into a ready-to-use ROS package! Either as part of AMCL or separately.

For the specific issues you found in amcl, it would be best to file
a pull-request or at least a detailed issue with the github repository.


v4hn

--
Michael Görner, M.Sc. Cognitive Science, PhD Student
Universität Hamburg
Faculty of Mathematics, Informatics and Natural Sciences
Department of Informatics
Group Technical Aspects of Multimodal Systems

Vogt-Kölln-Straße 30
D-22527 Hamburg

Room: F-315
Phone: +49 40 42883-2432
signature.asc

Corey Walsh

unread,
Apr 14, 2017, 4:29:49 PM4/14/17
to ros-sig-navigation, m...@v4hn.de
Thanks for the response and suggestion. I'd like some pointers on how to best integrate with AMCL to avoid lots of rework if my initial integration does live up to AMCL conventions/best practices, so I will file an issue on the repository with details. Good to hear you are open to these changes!

You are correct, that 10K number was for the GPU method (it can actually do a bit more than 10K). However, I've gotten similar performance using the "GLT" method - maybe 8 or 9K. In the particle filter, the cache performance of GLT is excellent since table lookups are concentrated in a small region. Hence, it performs even better than those performance comparisons might suggest. IIRC, [P]CDDT can do around 4.5/5K, and RM on the CPU can do around 3.5/4K particles.

v4hn

unread,
Apr 15, 2017, 6:43:30 AM4/15/17
to ros-sig-n...@googlegroups.com
On Fri, Apr 14, 2017 at 01:29:49PM -0700, Corey Walsh wrote:
> I'd like some pointers on how to best integrate with AMCL to avoid lots
> of rework if my initial integration does live up to AMCL conventions/
> best practices, so I will file an issue on the repository with details.

Sounds like a good idea. Although a working implementation,
fitting the conventions or not, usually is a good idea too.
People can test it right away. Also in case you loose interest before
it is merged, someone else can continue on that work later on more easily.

> Good to hear you are open to these changes!

Sorry, I didn't mean to confuse! Apparently I have commit access
to the repository in question, but I'm not one of its core maintainers.
That would be David Lu!! and Michael Ferguson. They seem to be
a bit behind on reviewing/merging pull-requests, although things happen there.
The navigation stack is quite crucial to most ROS users, so they probably
do not want to merge prematurely and only after proper testing.

The same clearly applies to your ideas too.

But if there is significant performance gain and features are well-tested,
In my opinion they are welcome. :)
signature.asc
Reply all
Reply to author
Forward
0 new messages