Tarantool as a Leaderboard database

211 views
Skip to first unread message

Christian Brandoni

unread,
Nov 10, 2016, 9:56:57 AM11/10/16
to Tarantool discussion group (English)
Hi,
First of all congratulation for this awesome work. The architecture is a developer dream and the documentation is super easy to follow.
I am currently evaluating Tarantool to use for a leaderboard for a f2p game I am developing. I saw redis has a lots of features for leaderboards  but I can't find much for Tarantool in this regard.
Do you think out of the box it is suited for this scenario or I would have to implement several things my self? 
The leaderboard would have to handle this:
1-Top players with pagination.
2-current rank with adjacent rank shown.
3-possible match making with similar ranks

So I was thinking of storing time and top score for each user and count from the top of the leaderboard to get the rank, but how would this scale?
Would it be performant with million of players? 

And for point 2 I was thinking of getting the fetching the player score and with its key using GT and limits=3 to retrieve players above him and LT for player below him.
Or it is possible to retrieve everything in a single query like offset= -3 limit=7(3 before plus the player plus 3 after)?
I am not concerned about ram, just performance that should be as close to real time as possible.

Another thing, from the docs doesn't seem possible, on_replace  trigger doesn't  get the tuple as parameter? It's kind of useless if I can't know what's being done.

Last thing, the faq says that in master master replication the replication get disabled if the key is duplicated. Why it Isn't possible to handle the error on the script side?

Konstantin Osipov

unread,
Nov 11, 2016, 8:11:48 AM11/11/16
to tara...@googlegroups.com
* Christian Brandoni <christian...@gmail.com> [16/11/10 17:58]:
> First of all congratulation for this awesome work. The architecture is a
> developer dream and the documentation is super easy to follow.
> I am currently evaluating Tarantool to use for a leaderboard for a f2p game
> I am developing. I saw redis has a lots of features for leaderboards but I
> can't find much for Tarantool in this regard.
> Do you think out of the box it is suited for this scenario or I would have
> to implement several things my self?
> The leaderboard would have to handle this:
> 1-Top players with pagination.
> 2-current rank with adjacent rank shown.
> 3-possible match making with similar ranks

> So I was thinking of storing time and top score for each user and count
> from the top of the leaderboard to get the rank, but how would this scale?
> Would it be performant with million of players?

Yes, you can write efficient Lua code to support these use cases.
Take a look at pagination with Tarantool here:
https://tarantool.org/doc/book/box/box_index.html#example-showing-a-user-defined-iterator


> And for point 2 I was thinking of getting the fetching the player score and
> with its key using GT and limits=3 to retrieve players above him and LT for
> player below him.
> Or it is possible to retrieve everything in a single query like offset= -3
> limit=7(3 before plus the player plus 3 after)?
> I am not concerned about ram, just performance that should be as close to
> real time as possible.

Usually, to implement pagination in any direction, you begin with
offset 0, supply a necessary limit, and then continue from the
last key of the previous page. This is the way it works in the
example I pasted.

> Another thing, from the docs doesn't seem possible, on_replace trigger
> doesn't get the tuple as parameter? It's kind of useless if I can't know
> what's being done.

Yes, it does, it gets two arguments: old_tuple, new_tuple.

I filed a ticket against the manual:

https://github.com/tarantool/doc/issues/115

Check the tests here:

https://github.com/tarantool/tarantool/blob/1.7/test/box/on_replace.test.lua

But today you can't use triggers to insert into spaces :(, we're
working on this.

> Last thing, the faq says that in master master replication the replication
> get disabled if the key is duplicated. Why it Isn't possible to handle the
> error on the script side?

It's a good idea, but it is not done yet.

--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.org - www.twitter.com/kostja_osipov

Christian Brandoni

unread,
Nov 11, 2016, 9:56:59 AM11/11/16
to Tarantool discussion group (English)
Hi, 
Thanks for your help.
More questions below: 
Usually, to implement pagination in any direction, you begin with
offset 0, supply a necessary limit, and then continue from the
last key of the previous page. This is the way it works in the
example I pasted.
Point 2 was related to player rank, not pagination. 
I need to know the position of the player in the leader board.
Let's say I have the following with 
1- user a-10 points
2-user b 5 points
3-user c 4 points
4-user d d 3 points
5-user e 2 points

Let's say I am user c, 4 points in the leaderboard.
I need to show him he is rank 3. that there is user b above him and user e below him. I am just asking 1 rank below and above for simplicity, but in reality I would like to know 3 at least.
So I would then count from the beginning of the index to get his rank:3. 
Then I would select the one below and the one above with two index scan. Do you think this is the most efficient way or there are better way?


But today you can't use triggers to insert into spaces :(, we're
working on this.
Sorry I don't understand what does this mean. Isn't the trigger like any other tarantool lua function?
I can't call any box functions from there?
Or there is a bug just on space insert?

Another thing. Today talking with the node js connector dev he told me the protocol doesn't allow initiating the request from tarantool side.
Is this something you are working on? It would be useful for example with triggers or any server side event.Right now there doesn't seem to be a way to talk with clients other than implementing http or sockets server or polling the server for updates.

Konstantin Osipov

unread,
Nov 11, 2016, 10:59:25 AM11/11/16
to tara...@googlegroups.com
* Christian Brandoni <christian...@gmail.com> [16/11/11 18:01]:
> Hi,
> Thanks for your help.
> More questions below:
>
> > Usually, to implement pagination in any direction, you begin with
> > offset 0, supply a necessary limit, and then continue from the
> > last key of the previous page. This is the way it works in the
> > example I pasted.
> >
> Point 2 was related to player rank, not pagination.
> I need to know the position of the player in the leader board.
> Let's say I have the following with
> 1- user a-10 points
> 2-user b 5 points
> 3-user c 4 points
> 4-user d d 3 points
> 5-user e 2 points
>
> Let's say I am user c, 4 points in the leaderboard.
> I need to show him he is rank 3. that there is user b above him and user e
> below him. I am just asking 1 rank below and above for simplicity, but in
> reality I would like to know 3 at least.
> So I would then count from the beginning of the index to get his rank:3.
> Then I would select the one below and the one above with two index scan. Do
> you think this is the most efficient way or there are better way?

What if there is more than one user with 3 points? Do they have
the same rank?

Counting the number of users with a higher rank would take a lot
of time if there are many users.

> > But today you can't use triggers to insert into spaces :(, we're
> > working on this.
> >
> Sorry I don't understand what does this mean. Isn't the trigger like any
> other tarantool lua function?

Yes.

> I can't call any box functions from there?

insert/update/delete - no, you can't.

> Or there is a bug just on space insert?

Yes, you could call it a bug. Subtransactions are
note implemented. What you could do, potentially, is put the new
tuple into a Lua table for a background fiber to work on.

> Another thing. Today talking with the node js connector dev he told me the
> protocol doesn't allow initiating the request from tarantool side.
> Is this something you are working on? It would be useful for example with
> triggers or any server side event.Right now there doesn't seem to be a way
> to talk with clients other than implementing http or sockets server or
> polling the server for updates.

You could use 'socket' Lua rock which is on board. If you need to
communicate with another Tarantool instance, you could use
net.box.
There is no pub/sub in the protocol, yes.

Christian Brandoni

unread,
Nov 11, 2016, 11:22:27 AM11/11/16
to Tarantool discussion group (English)
 
What if there is more than one user with 3 points? Do they have
the same rank?

Counting the number of users with a higher rank would take a lot
of time if there are many users.
 
That was oversimplified. The actual leaderboard use also time as value. The first user to score 3 point would get the higher rank.
I would keep everything in memory, not using vinyl here.
Hopefully if the game perform well there will be millions of score at least, considering that there are also monthly challenges so even a single play in a month would account as a score in the leaderboard.
From what I read Redis has no issue doing this with a command called ZRANK which has a complexity of O(log(N)) so it doesn't grow linearly with the number of keys, but I would like to avoid to use it as Tarantool seem more future proof and cooler, could even evaluate to replace node totally in the future.



Message has been deleted

Christian Brandoni

unread,
Nov 11, 2016, 11:37:34 AM11/11/16
to tara...@googlegroups.com
Just found out how that zrank sorted sets works. It's using skip lists
and zip lists inside.
Anything like that in Tarantool (wrong link before) :
http://stackoverflow.com/questions/9625246/what-are-the-underlying-data-structures-used-for-redis

2016-11-11 17:27 GMT+01:00 Christian Brandoni <christian...@gmail.com>:
> Just found out how that zrank works. It's using skip lists and zip lists inside.
> Anything like that in Tarantool :
> http://redisplanet.com/redis/under-the-hood-of-redis-hash-part-2-and-list/
>> --
>> You received this message because you are subscribed to a topic in the
>> Google Groups "Tarantool discussion group (English)" group.
>> To unsubscribe from this topic, visit
>> https://groups.google.com/d/topic/tarantool/5NlBApklG28/unsubscribe.
>> To unsubscribe from this group and all its topics, send an email to
>> tarantool+...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.

Konstantin Osipov

unread,
Nov 11, 2016, 1:32:05 PM11/11/16
to tara...@googlegroups.com
* Christian Brandoni <christian...@gmail.com> [16/11/11 19:24]:

We don't have ZRANK. We use in-memory block storage for index
keys, so getting an ordinal key number is not straightforward:

- it's easy to get an approximation with cost O(logB N),
where B is our block size, effectively this would be constant
cost since our in-memory B-tree is rarely more than 3-4 levels
deep. It's an approximation, since not all blocks are full at
all times.
- it's harder to get an exact number quickly: one could iterate
over blocks and count the number of keys in each block, but this
is O(N/B).

So it sounds like ZRANK is either not exact or not straightforward
with our index structure.

An alternative would be to design a new index for this purpose -
indexes are pluggable - but this is a moderate engineering effort.

> That was oversimplified. The actual leaderboard use also time as value. The
> first user to score 3 point would get the higher rank.
> I would keep everything in memory, not using vinyl here.
> Hopefully if the game perform well there will be millions of score at
> least, considering that there are also monthly challenges so even a single
> play in a month would account as a score in the leaderboard.
> From what I read Redis has no issue doing this with a command called ZRANK
> which has a complexity of O(log(N)) so it doesn't grow linearly with the
> number of keys, but I would like to avoid to use it as Tarantool seem more
> future proof and cooler, could even evaluate to replace node totally in the
> future.

Thanks, but our TREE index is also a bit more advanced,
unfortunately making ZRANK a bit harder to do.

Christian Brandoni

unread,
Nov 11, 2016, 5:16:34 PM11/11/16
to tara...@googlegroups.com
Wouldn't be possible to add a counter to each block of the TREE index?
I see this is how other db engine solved it:
https://www.voltdb.com/blog/leaderboards-optimization-ranking-related-queries

Konstantin Osipov

unread,
Nov 12, 2016, 1:14:19 PM11/12/16
to tara...@googlegroups.com
* Christian Brandoni <christian...@gmail.com> [16/11/12 01:21]:
> Wouldn't be possible to add a counter to each block of the TREE index?
> I see this is how other db engine solved it:
> https://www.voltdb.com/blog/leaderboards-optimization-ranking-related-queries

We already have a counter, how would that help?

Christian Brandoni

unread,
Nov 14, 2016, 10:14:23 AM11/14/16
to tara...@googlegroups.com
My knowledge of tree index is close to zero so not sure I get it can
apply to you, but from my understanding let's say you need all keys
with value > 500, you check the first key of the next block and if
it's above 500 you can add the block size to the counter, if the value
is below 500 you start traversing the block and repeat the above and
so on. You should be able to do it since all block and keys should be
ordered, no need to check all keys one by one.
Maybe that doesn't apply to your index, don't know.

By the way, is Vinyl production ready?

Konstantin Osipov

unread,
Nov 14, 2016, 11:59:58 AM11/14/16
to tara...@googlegroups.com
* Christian Brandoni <christian...@gmail.com> [16/11/14 18:17]:
> My knowledge of tree index is close to zero so not sure I get it can
> apply to you, but from my understanding let's say you need all keys
> with value > 500, you check the first key of the next block and if
> it's above 500 you can add the block size to the counter, if the value
> is below 500 you start traversing the block and repeat the above and
> so on. You should be able to do it since all block and keys should be
> ordered, no need to check all keys one by one.
> Maybe that doesn't apply to your index, don't know.

Yes, that would require that you traverse blocks one by one and
add numbers. This is faster than traversing keys, but can still be
the cost could still be quite high if you have (tens of?) millions
of keys. Imagine this is the counting you need to make tens of
thousands times a second. A binary tree or skip list, where you
can calculate the ordinal position exactly, is more handy for such
case.

> By the way, is Vinyl production ready?

We have 3 instances running in production which we actively
support. We will make an official announce this year about it
general availability. Please wait for it.

For now we recommend to put it under heavy stress testing - it
should be quite good for write-heavy workloads. But there still
are issues with reads and secondary keys.

BTW, why don't you join http://telegram.me/tarantool?
Reply all
Reply to author
Forward
0 new messages