
[Caml-list] map implementation question


Brian Hurt

Aug 4, 2006, 4:11:34 PM
to caml...@yquem.inria.fr
I was just looking at the map.ml implementation, and noticed that the
logic for when to do a rotation was:

> if hl > hr + 2 then begin


Isn't this supposed to be:

if hl >= hr + 2 then begin

? The latter will cause more rotations, but keep the tree more
balanced. The worst-case access of the >= version is log base 3/2,
while the > is log base 4/3, which means that the >= will be about 41%
faster on worst-case lookups (log(3/2)/log(4/3) ~ 1.41). Both are
correct in that they return the right answer and still give O(log(N))
performance; it's a question of the cost of looking up an element in
the tree vs. the cost of inserting an element into the tree.
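
One way to sanity-check worst-case figures like these is to count the fewest nodes a tree of a given height can hold when sibling heights may differ by at most k. The sketch below is only an illustration and is not taken from map.ml; the names min_nodes and worst_height and the parameter k are made up for this example (k = 1 corresponds to the >= test, k = 2 to the > test).

```ocaml
(* Fewest nodes a tree of height [h] can contain when the heights of
   two siblings may differ by at most [k].  The sparsest such tree has
   one subtree of height h-1 and the other of height h-1-k, hence
   N(h) = 1 + N(h-1) + N(h-1-k), with N(h) = 0 for h <= 0. *)
let min_nodes k h =
  if h <= 0 then 0
  else begin
    let t = Array.make (h + 1) 0 in
    for i = 1 to h do
      let j = i - 1 - k in
      let shorter = if j > 0 then t.(j) else 0 in
      t.(i) <- 1 + t.(i - 1) + shorter
    done;
    t.(h)
  end

(* Largest height a tree with [n] nodes can reach under bound [k]. *)
let worst_height k n =
  let rec go h = if min_nodes k (h + 1) > n then h else go (h + 1) in
  go 0

(* Compare the two bounds for a million elements. *)
let () =
  Printf.printf "k=1: %d  k=2: %d\n"
    (worst_height 1 1_000_000) (worst_height 2 1_000_000)
```

The ratio of the two printed heights gives a rough idea of how much deeper the looser bound can get in the worst case.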

Was there a reason it was done this way, or is this a (minor) bug?

Brian

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Xavier Leroy

Aug 17, 2006, 7:40:31 AM
to Brian Hurt
Brian Hurt wrote:
> I was just looking at the map.ml implementation, and noticed that the
> logic for when to do a rotation was:
> if hl > hr + 2 then begin
> Isn't this supposed to be:
> if hl >= hr + 2 then begin

No, it was a conscious decision to use height-balanced binary trees
with a height imbalance of at most 2, rather than at most 1 as in
standard AVL trees. As you note, log(N) access times are still
guaranteed, and it's a tradeoff between lookup time vs. rebalancing time.
Light experimentation suggested that imbalance <= 2 is globally more
efficient than imbalance <= 1. Didn't try with larger imbalance bounds.
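
For readers who want to experiment with the bound, here is a minimal sketch of a height-balanced tree whose tolerated imbalance is a parameter. It follows the general single/double rotation scheme but is not the actual map.ml code: it stores keys only, and the names tree, create, bal, add and the ~slack argument are assumptions made for this illustration (~slack:2 mirrors the "hl > hr + 2" test, ~slack:1 a standard AVL tree).

```ocaml
type 'a tree = Empty | Node of 'a tree * 'a * 'a tree * int

let height = function Empty -> 0 | Node (_, _, _, h) -> h

(* Build a node, caching its height. *)
let create l v r = Node (l, v, r, 1 + max (height l) (height r))

(* Rebalance at one node; [slack] is the largest height difference
   tolerated between the two subtrees. *)
let bal ~slack l v r =
  let hl = height l and hr = height r in
  if hl > hr + slack then begin
    match l with
    | Empty -> invalid_arg "bal"
    | Node (ll, lv, lr, _) ->
      if height ll >= height lr then
        create ll lv (create lr v r)            (* single rotation *)
      else begin
        match lr with
        | Empty -> invalid_arg "bal"
        | Node (lrl, lrv, lrr, _) ->            (* double rotation *)
          create (create ll lv lrl) lrv (create lrr v r)
      end
  end else if hr > hl + slack then begin
    match r with
    | Empty -> invalid_arg "bal"
    | Node (rl, rv, rr, _) ->
      if height rr >= height rl then
        create (create l v rl) rv rr            (* single rotation *)
      else begin
        match rl with
        | Empty -> invalid_arg "bal"
        | Node (rll, rlv, rlr, _) ->            (* double rotation *)
          create (create l v rll) rlv (create rlr rv rr)
      end
  end else
    create l v r

(* Insert [x], rebalancing on the way back up. *)
let rec add ~slack x = function
  | Empty -> create Empty x Empty
  | Node (l, v, r, _) as t ->
    if x = v then t
    else if x < v then bal ~slack (add ~slack x l) v r
    else bal ~slack l v (add ~slack x r)
```

Building the same key sequence with ~slack:1 and ~slack:2 and timing lookups against insertions is one way to repeat the kind of light experiment described above.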

This said, red-black trees would probably work faster anyway, but I'll
let the algorithm experts on this list comment.

- Xavier Leroy

j h woodyatt

Aug 17, 2006, 2:54:45 PM
to The Caml Trade
On Aug 17, 2006, at 4:37 AM, Xavier Leroy wrote:
>
> This said, red-black trees would probably work faster anyway, but I'll
> let the algorithm experts on this list comment.

My experience trying to tweak the red-black trees in the Cf library
of OCaml NAE so they perform globally better than the height-balanced
trees in the standard library has been mixed. Some functions perform
marginally better, but others are worse, sometimes substantially
worse, and I don't think there's any way around it. (It doesn't help
that a lot of my exercises reveal that my binary set operations need
improvement, but there are other places where there's simply nothing
to do. I'll get around to fixing the binary set operators someday,
before my next release.)

By the way, Xavier is very correct: that "imbalance <= 2" thing is
utterly brilliant. I'm pretty sure my red-black trees would smoke
the standard library if it weren't for that.

The result is that I recommend using my red-black trees only when you
are either 1) using the other facilities in the Cf library that are
integrated well with them, e.g. Cf_seq and such, or 2) using them in
a [currently hypothetical] case where you have compared the
performance with the standard library and it makes a valuable
difference to get 15% more CPU (or one less field per tree node) out
of your tree algorithm.



j h woodyatt <j...@conjury.org>
san francisco, ca

Shawn

Aug 17, 2006, 2:56:34 PM
to caml...@yquem.inria.fr
Xavier Leroy wrote:

> This said, red-black trees would probably work faster anyway, but I'll
> let the algorithm experts on this list comment.
>


I don't suppose anyone's compared the ocaml splay tree library that's
floating around with Map and red-black trees?
