RRB (Relaxed Radix Balanced) Trees for efficient immutable vector concat

499 views
Skip to first unread message

Andy Fingerhut

unread,
Sep 5, 2012, 2:40:25 PM9/5/12
to cloju...@googlegroups.com
Nicolas Oury asked about a paper describing Relaxed Radix Balanced Trees (RRB trees) on the Clojure group:


Here is a link to the paper, by Phil Bagwell and Tiark Rompf:


I've read it, and it looks quite interesting.  Basically it is a tweak on the 32-way branching tree used for Clojure vectors, where the branches can have 31- or 32-way branching (in general (n-1)- or n-way branching).  This extra "wiggle room" makes it possible to concatenate two such immutable vectors in O(log n) time, resulting in an immutable vector with the same efficient access to index i as the original immutable vectors, and the same ability to be efficiently concatenated again.

I haven't followed the recent reducers work at all, except to watch the talk by Guy Steele that Rich recommended.  From my understanding, it appears that RRB trees would be a good thing to add to Clojure to achieve its goals of efficient parallel operations that operate on, and produce as results, efficiently accessible immutable vectors.

Is something like this already part of the reducers work that has been going on?  If so, could someone point out which part?

Thanks,
Andy

David Nolen

unread,
Sep 5, 2012, 10:10:10 PM9/5/12
to cloju...@googlegroups.com
On Wed, Sep 5, 2012 at 2:40 PM, Andy Fingerhut <andy.fi...@gmail.com> wrote:
> Is something like this already part of the reducers work that has been going
> on? If so, could someone point out which part?
>
> Thanks,
> Andy

Not that I'm aware of. Would be nice! Could probably be hashed out as a contrib?

David

Mikera

unread,
Sep 8, 2012, 12:39:44 AM9/8/12
to cloju...@googlegroups.com
These are pretty cool data structures and something along these lines could well be worth adding - but only if provided as an alternative to normal vectors rather than a replacement.

From the benchmarks in the paper it appears that RRB trees have quite a bit of extra overhead in the common case (indexed access) which would not usually be worth trading off for fast performance in a relatively uncommon case (vector concatenation). So it's better to have two implementations and let the user decide.

I think this is relatively orthogonal to the reducers work - perhaps the fast concatenation might be helpful in some map/reduce type scenarios where you are accumulating a result set as a vector but it's not obvious to me that this would be much better than doing lazy concats and then just calling vec on the result.

Philip Potter

unread,
Sep 8, 2012, 2:28:35 AM9/8/12
to cloju...@googlegroups.com

The performance loss is only paid for in the relaxed nodes. You could easily start with a vector made only of ArrayNodes, and if you never concatenate, you never pay the price.

--
You received this message because you are subscribed to the Google Groups "Clojure Dev" group.
To view this discussion on the web visit https://groups.google.com/d/msg/clojure-dev/-/f9HNuJUnvZMJ.
To post to this group, send email to cloju...@googlegroups.com.
To unsubscribe from this group, send email to clojure-dev...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/clojure-dev?hl=en.

Rich Hickey

unread,
Sep 8, 2012, 8:25:12 AM9/8/12
to cloju...@googlegroups.com

On Sep 8, 2012, at 2:28 AM, Philip Potter wrote:

> The performance loss is only paid for in the relaxed nodes. You could easily start with a vector made only of ArrayNodes, and if you never concatenate, you never pay the price.

That was my understanding as well.

I'm definitely interested in these. concat, and also fast add to front I think.

This should be considered orthogonal to reducers, but obvious utility there.
Reply all
Reply to author
Forward
0 new messages