B-Tree v Skip-List Insertion Benchmark

Jeffrey Carter

unread,

Sep 17, 2014, 9:15:18 PM9/17/14

to

I ran a quick comparison of the insertion times for Kazakov's Generic_B_Tree pkg
against PragmARC.Skip_List_Unbounded from the PragmAda Reusable Components.

Both data structures are used for similar purposes, allowing O(log N) look-up
times. Insertion and deletion are expensive operations on balanced trees due to
re-balancing, and one reason skip lists were invented was to have fast insertion
and deletion times compared to balanced trees. I had never actually compared
times, and Kazakov's recent post comparing DB times got me thinking about it.

A typical run inserting the same 1,000 items into both structures gives times of

1.484 ms for the B tree
0.709 ms for the skip list

(Divide by 1,000 for average per-insertion times.)

The trade off is similar to using heap sort or quick sort. Both are O(N log N),
with quick sort usually being faster. Heap sort is always O(N log N), but in
rare cases, however, quick sort has worst-case performance of O(N**2).

A skip list is probabilistically balanced, and has a worst-case search time of
O(N) in very rare circumstances, almost always for small N (< 256).

--
Jeff Carter
"Perfidious English mouse-dropping hoarders."
Monty Python & the Holy Grail
10

Dmitry A. Kazakov

unread,

Sep 18, 2014, 3:42:12 AM9/18/14

to

Thanks for pointing this.

Can skip list made fit into a series fixed-size blocks?

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Jeffrey Carter

unread,

Sep 18, 2014, 12:53:54 PM9/18/14

to

On 09/18/2014 12:41 AM, Dmitry A. Kazakov wrote:
>
> Can skip list made fit into a series fixed-size blocks?

I don't understand the question.

--
Jeff Carter
"From this day on, the official language of San Marcos will be Swedish."
Bananas
28

Jeffrey Carter

unread,

Sep 18, 2014, 1:39:16 PM9/18/14

to

One comment on Generic_B_Tree: O(log N)-searchable structures are commonly used
as maps, sets, and priority queues. The interface of Generic_B_Tree seems skewed
towards use as a map; for the other uses the client must supply a dummy Value
parameter. It might be better to have a more general B-tree pkg, and, if
desired, build a more specific map interface on top of that.

PragmARC.Skip_List_Unbounded takes the opposite approach, providing a general
interface and presuming that the client knows how to use it to implement a map.

Ada.Containers provides the specific interfaces (Ordered_Maps and such) but does
not provide the underlying data structure.

Dmitry A. Kazakov

unread,

Sep 18, 2014, 1:55:54 PM9/18/14

to

On Thu, 18 Sep 2014 09:53:50 -0700, Jeffrey Carter wrote:

> On 09/18/2014 12:41 AM, Dmitry A. Kazakov wrote:
>>
>> Can skip list made fit into a series fixed-size blocks?
>
> I don't understand the question.

The structure must be chunked into blocks in order to be stored. When you
update the container, the change should be localized to one block when
possible. This is a property B-trees have. You make its buckets of the file
block size and then insert or delete would update just one block, most of
the time.

If, say, a doubly-linked list element is inserted or deleted this would
involve, most likely, tree blocks (to fix the links), which would devaluate
the fact that these operations have O(1) complexity.

Jeffrey Carter

unread,

Sep 18, 2014, 3:33:19 PM9/18/14

to

On 09/18/2014 10:54 AM, Dmitry A. Kazakov wrote:
>
> The structure must be chunked into blocks in order to be stored. When you
> update the container, the change should be localized to one block when
> possible. This is a property B-trees have. You make its buckets of the file
> block size and then insert or delete would update just one block, most of
> the time.

So you store the tree structure in the file, as well as the data? A node in a
skip list has a variable number of links to other nodes, so as implemented in
PragmARC.Skip_List_Unbounded would probably not lend itself to that. It could be
implemented with a fixed number of links, of which only a certain number are
actually used, which might work better.

I was only interested in checking the claim that insertion in a skip list is
quicker than in a B tree.

> If, say, a doubly-linked list element is inserted or deleted this would
> involve, most likely, tree blocks (to fix the links), which would devaluate
> the fact that these operations have O(1) complexity.

Insertion and deletion in a skip list are O(log N), since you have to search for
where to insert the new node. The time savings comes from not having a
re-balancing step.

Dmitry A. Kazakov

unread,

Sep 18, 2014, 4:33:28 PM9/18/14

to

On Thu, 18 Sep 2014 12:33:16 -0700, Jeffrey Carter wrote:

> On 09/18/2014 10:54 AM, Dmitry A. Kazakov wrote:
>>
>> The structure must be chunked into blocks in order to be stored. When you
>> update the container, the change should be localized to one block when
>> possible. This is a property B-trees have. You make its buckets of the file
>> block size and then insert or delete would update just one block, most of
>> the time.
>
> So you store the tree structure in the file, as well as the data?

Yes. Everything is in the same file.

> A node in a
> skip list has a variable number of links to other nodes, so as implemented in
> PragmARC.Skip_List_Unbounded would probably not lend itself to that. It could be
> implemented with a fixed number of links, of which only a certain number are
> actually used, which might work better.

There is a persistent memory pool, so it would be no problem to allocate
variable-length nodes. The concern is the case when the nodes become too
large for one block. The memory pool supports a stream interface. So a node
can be read from a chain of blocks and reconstructed in the memory then
updated and written back (in other blocks), no matter how many blocks it
spans. But it would be a considerable performance hit, I guess.

> I was only interested in checking the claim that insertion in a skip list is
> quicker than in a B tree.

Yes, there are many structures more efficient than B-tree, but they might
be difficult to apply to the case of segmented (as blocks) storage rather
than flat storage. It is interesting to learn if skip lists could be used
this way. Which is why I asked.

>> If, say, a doubly-linked list element is inserted or deleted this would
>> involve, most likely, tree blocks (to fix the links), which would devaluate
>> the fact that these operations have O(1) complexity.
>
> Insertion and deletion in a skip list are O(log N), since you have to search for
> where to insert the new node. The time savings comes from not having a
> re-balancing step.

No, I consider search as another operation. Let the location be already
determined. Then maintaining the structure gets a hit when not localized to
few blocks.