Hi All,
Nail found a bug the other day, and I dove into some parts of the code
where I want to do a pretty sophisticated optimisation and I was
wondering if anyone had opinions/tips on it.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
B+Tree/Trie Hybrid data structure
-------------------------------------------
One problem w/ B+trees is they allocate memory in chunks.
If you have a big table it makes a lot of sense to allocate chunks in
4K blocks.
Redisql's indexes are also B+trees and so are the nodes of those
indexes.
For space saving reasons the 3 B+trees do not allocate memory in equal
proportions
1.) Data B+tree 4K
2.) Index B+tree 1K
3.) Node B+tree 128B
The reason that the Node B+tree is small is to not allocate a lot of
memory if e.g. there is an index on lastname and only one person's
last name is "Sprats", so only 128B would be allocated.
The problem w/ this approach is for the "Smith" lastname, that has
100K entries. w/ 128B block sizes, a 100K element B+tree has to split
nodes many times ....
The Node B+trees also dont have to be B+trees, they dont do range
queries, they just need fast insert/delete and low memory usage (so a B
+tree is correct, but not at small sizes)
So I think the truly elegant (although more complicated) is to have a
hybrid data structure that is a Trie when there are less than ?32?
elements and a B+tree w/ 4K block-size afterwards. This means when the
structure gets its 33rd element, it will have to convert from Trie to B
+tree.
At a size of 32 elements, a Trie should be as fast as a B+tree for
range-queries, but a Trie takes up almost no additional memory, so for
the FK on lastname "Sprats" entry and the "Smith" entry, both high
speed and low memory would be the case.
Additionally, in the case where you want to make a table per user, say
100K tables, and each table only has 10 rows, then using this B+tree/
Hash for the Data B+tree, would be a big win.
The only con to this approach (besides code complexity) is when the 32
element Trie has to be converted to a 33 element B+tree, there will be
a performance hiccup.
Thoughts?
- Jak
p.s. redis has similar issues w/ ZSETs
http://groups.google.com/group/redis-db/msg/dd7cad10e21ecf65