Does index use btree or b+tree?

615 views
Skip to first unread message

Ars Roseregen

unread,
Oct 7, 2023, 5:16:06 AM10/7/23
to wiredtiger-users

For a long time I thought he was using btree until I saw this document link which says WiredTiger maintains a table’s data in memory using a data structure called a B-Tree (B+ Tree to be specific). But I found the source code of the btree link which doesn’t look like a b+tree(because I can’t find the next or pre pointer for leaf node). And this document says it is B-tree link. This make me confused. Can someone give me some pointers?



Keith Smith

unread,
Oct 9, 2023, 9:27:28 AM10/9/23
to wiredtig...@googlegroups.com
Hi Ars, thanks for the question!

Within the WiredTiger team, we tend to refer to our tables "B-Trees". I think this is mostly for simplicity, and not because it is the precisely correct term. But that does lead to some inconsistency (as you've found) in the documentation and comments.

The exact answer, however, is a bit complicated. WiredTiger includes a number of optimizations and design choices that don't adhere to the classic definitions of a B-Tree or a B+ Tree. I expect that if you look in detail at any other widely used database you will find similar variations.

There is not, to my knowledge, a precise definition of a B+ tree. Two features that are commonly cited are:
  1. All keys reside in the leaves.
  2. The leaves are linked for easy sequential access.
WiredTiger B-Trees do store all keys and values in the leaf pages. (An exception are overflow pages where we store large keys and values that might be inefficient to store in the leaf page. This is one of those optimizations I mentioned.) So in that regard they behave like B+ trees.

WiredTiger does *not* (as you've found) provide links directly from one leaf page to the next. This is because WiredTiger always writes an updated page to a new location in the file. So linking leaf pages in this manner would be impractical. When we update a leaf page, we write it to a new file location. That means we would need to update the pointer in the leaf pages pointing to it, and therefore we would write those pages to new locations, until we rewrite every leaf page.

WiredTiger moves from one leaf page to the next by going back through the parent page. This is pretty efficient because the WiredTiger cache will never evict an internal B-Tree page if any of its child pages are in the cache. Therefore any time we need to move from one leaf to the next, the parent is guaranteed to be in the cache.

FWIW, Douglas Comer's classic paper on B-Trees from 1979 says that in B+ trees "leaf nodes are *usually* linked together" (p. 129, emphasis mine). So it has never been a strict requirement that B+ trees have these links.

Keith

On Sat, Oct 7, 2023 at 10:16 AM Ars Roseregen <f8111...@gmail.com> wrote:

For a long time I thought he was using btree until I saw this document link which says WiredTiger maintains a table’s data in memory using a data structure called a B-Tree (B+ Tree to be specific). But I found the source code of the btree link which doesn’t look like a b+tree(because I can’t find the next or pre pointer for leaf node). And this document says it is B-tree link. This make me confused. Can someone give me some pointers?



--
You received this message because you are subscribed to the Google Groups "wiredtiger-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to wiredtiger-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/wiredtiger-users/ef64efdc-4851-483c-863e-cdb97b01ca1fn%40googlegroups.com.

Ars Roseregen

unread,
Oct 9, 2023, 11:46:18 PM10/9/23
to wiredtiger-users
Thank you very much for your detailed and patient answer.This helped me a lot in understanding WiredTiger storage structure
Reply all
Reply to author
Forward
0 new messages