I was happily coding on my
little avl tree library.
type Node{K, V} <: Avl{K, V}
child :: Array{Avl{K, V}, 1}
key :: K
value :: V
balance :: Int8
end
Everything seemed natural and reasonable to me.
I keep the children in a length 2 array so that I can select one with a bool:
function insert{K, V}(node :: Node{K, V}, key :: K, value :: V)
side = (key > node.key) + 1
node.child[side] = insert(node.child[side], key, value)
… fix balance factors and return
end
This is for two reasons. The good reason is it makes the code much shorter. I need only half as much rotation and balance-factor fixing code. This means less debugging and maintainance. The other reason is to avoid one conditional branch each time a child is accessed.
However, I just recently discovered that code like this seems to be rather expensive. Inserting 100 values in a tree takes a noticable amount of time. 1000 elements is enough to choke up the terminal.
The question is WHY? Is this a bug? If not, what are these things you call arrays? How are they implemented?
The weird thing is that the slow code worked correctly. The objects in the arrays behaved just like references. E.g. one can insert, delete and rotate nodes. That wouldn't be possible if for instance the objects were pasted into the array as data (while copying stuff around to make space). And if they were, even that would hardly explain the awful performance.
If I change the code to this:
type Node{K, V} <: Avl{K, V}
left :: Avl,
right :: Avl
…
end
the performance becomes as expected (very acceptable). But then I have to rewrite and practically double the size of the core routines.
What I want is a way to store two pointers to trees in a type (like one would in c), so that I can select between them with a variable in constant time (preferably without conditional branching).
Is there a way to do that? Tuples can't do it, because you can't assign to them.
--
No problem, printing issues have caused confusion many times. Branches can matter in tight inner loops, but allocating memory is definitely worse. In some cases bounds checks have very little overhead.
--
Here is a pastbin link anyway.
http://pastebin.com/1qtjGhne