[erlang-questions] How would you do it?

68 views
Skip to first unread message

Max Bourinov

unread,
Jul 2, 2012, 6:29:32 AM7/2/12
to erlang-questions
Hi Erlangers,

I need to build a ranking system for online game.

I think I will have a ranking gen_server that will accept cast messages like this: {score, Player :: profile(), Score :: integer()}.

So, the question is what would be most appropriate data structure if:
  1. I have to rank about 50 000 - 100 000 different players.
  2. On each score message I have to re-sort whole ranking table.
  3. It must be very cheap to get:
    1. top 100 players
    2. player's rating +/- 10 players about current player
  4. I expect about 20-50 score messages per seconds
  5. Size of score message is about 4KB (profile takes most of the space).

Any ideas or suggestions are welcome!

Best regards,
Max


Francesco Mazzoli

unread,
Jul 2, 2012, 7:06:05 AM7/2/12
to erlang-questions
At Mon, 2 Jul 2012 14:29:32 +0400,
Max Bourinov wrote:
> 1. I have to rank about 50 000 - 100 000 different players.
> 2. On each score message I have to re-sort whole ranking table.
> 3. It must be very cheap to get:
> 1. top 100 players
> 2. player's rating +/- 10 players about current player
> 4. I expect about 20-50 score messages per seconds
> 5. Size of score message is about 4KB (profile takes most of the space).
>
> Any ideas or suggestions are welcome!

A priority search queue will fit your bill. You can to 2 in and 3.1 in
logarithmic time, and 3.1 in constant time.

I'm not aware of an Erlang implementation, here is an Haskell module
with link to a paper with implementation details:
http://hackage.haskell.org/packages/archive/PSQueue/1.1/doc/html/Data-PSQueue.html .

If you know some Haskell, this is an - untested - stub that will do
what you want:

-------------8<-----------------------------------

import Data.PSQueue as PSQ

type Ranking = PSQ Player Score

newRanking :: Ranking
newRanking = PSQ.empty

updateScore :: Player -> Score -> Ranking -> Ranking
updateScore player score rank =
case PSQ.lookup player rank of
Nothing -> PSQ.insert player score rank
Just score' -> let score'' = magic to get new score
in PSQ.insert player score'' rank

topNPlayers :: Int -> Ranking -> [Player]
topNPlayers n = take n . PSQ.toDescList

worsePlayers :: Player -> Int -> Ranking
-> Maybe [Binding Player Score]
worsePlayers player n rank =
case PSQ.lookup player rank of
Nothing -> Nothing
Just score -> Just (take n (PSQ.atMost score rank))

------------->8-----------------------------------

For some reason you don't have the dual of `atMost' which returns the
items with a priority *greater* then the one given. I'd be surprised
if you can't implement that.

--
Francesco * Often in error, never in doubt
_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions

Francesco Mazzoli

unread,
Jul 2, 2012, 7:44:37 AM7/2/12
to erlang-questions
At Mon, 02 Jul 2012 12:06:05 +0100,
Francesco Mazzoli wrote:
> I'm not aware of an Erlang implementation, here is an Haskell module
> with link to a paper with implementation details:
> http://hackage.haskell.org/packages/archive/PSQueue/1.1/doc/html/Data-PSQueue.html .

Mpf, I noticed that the paper linked there requires logging in
CiteSeer, thankfully a quick Googling lead to a direct source:
www.cs.ox.ac.uk/ralf.hinze/publications/ICFP01.pdf

dmitry kolesnikov

unread,
Jul 2, 2012, 8:43:49 AM7/2/12
to Max Bourinov, erlang-questions
Hello,

You have to split your data structure. The player profile has to go into hash table (e.g. Ets)  and ranking into separate index. It should give you better performance since your player profile is big. The first candidate for index is gb_trees. It should be good enough to cover 3.1 but 3.2 requires a range search, which is available in my Branch of gbt (http://github.com/fogfish/feta)
If you are not fine with gbt performance then you have to shard your scores into multiple trees. 

Best Regards,
Dmitry >-|-|-*>

Wojtek Narczyński

unread,
Jul 2, 2012, 3:41:28 PM7/2/12
to erlang-q...@erlang.org
On 2012-07-02 12:29, Max Bourinov wrote:
>
> I need to build a ranking system for online game.
>

If this is a 1 vs 1 game, you might be interested in Elo Rating system.
If it is team vs. team, it gets more complex.

I am not sure how gamer 4KB Profile fits into ranking. Well, okay, maybe
it is a good idea to keep gamer profiles and ranks in the same data
structure, but they don't have to be updated with the same message.

I would expect an interfrace along these lines:
{set_status, GamerID, offline / ready / playing} - info for match making
{get_opponents, GamerID} -> [{OpponentID, Rank}] - select some potential
opponents, ready to play
{update_rank, MatchID, WinnerID, LooserID} -> {NewWinnerRank,
NewLooserRank} - ranks after a match, Elo here

Pleasantries (you may not dispense with):
{register, GamerID, Profile}
{unregister, GamerID}
{set_profile, GamerID, Profile}
{get_profile, GamerID} -> Profile
{get_rank, GamerID} -> Rank
{get_top, N} -> [{GamerID, Rank}]

Sorting and searching is going to be the easy part, the performance can
easily be assessed by an automated test.

I have never done anything even remotely gaming related, so don't listen
to me.

--Regards,
Wojtek Narczynski

Erik Søe Sørensen

unread,
Jul 2, 2012, 5:52:39 PM7/2/12
to Erlang Questions

(forgot to Cc the list.)

---------- Videresendt besked ----------
Fra: "Erik Søe Sørensen" <eri...@gmail.com>
Dato: 02/07/2012 23.50
Emne: Re: [erlang-questions] How would you do it?
Til: "Max Bourinov" <bour...@gmail.com>

I think I'd do this...:
0) Not include the profile info if the score messages unless it really does change that often.
1) Keep two ets tables: one with player-id as key and current score as a column (and possibly other columns, for profile info etc., but for this I'd only fetch the score column).
2) And the other ets table would have {Score,PlayerID} as key. No other columns needed for the tasks you mention.
3) The first table would be a normal set, the second a sorted set. This means that player rows can be found and updated in constant time, and the score table row updated in logarithmic time (using the old score from the first table). Top 100, and neighbouring scores, can be found using ets iterators, in log time.

The performance of this should be quite adequate. There is at least one advantage of using ets tables over in-heap data structures for this, namely that it keeps the heap size down, so there's less work for the GC.

Happy hacking, whichever approach you settle on.

/Erik
(Reminds me, I ought to add mention of two-table approaches to AGttES soonish. It can be quite handy to have your data indexed in several ways.)

/Erik

Jesper Louis Andersen

unread,
Jul 3, 2012, 4:28:49 AM7/3/12
to Erik Søe Sørensen, Erlang Questions
On 7/2/12 11:52 PM, Erik Søe Sørensen wrote:
>
> I think I'd do this...:
> 0) Not include the profile info if the score messages unless it really
> does change that often.
> 1) Keep two ets tables: one with player-id as key and current score as
> a column (and possibly other columns, for profile info etc., but for
> this I'd only fetch the s...
>
This is my solution as well. The PSQueue data structure can be quite
heavy in memory operations, so while it has nice properties on paper, it
does have some hefty constant factors. I used a PSQueue in
Combinatorrent, but eTorrent uses the above solution for its piece
histogram.


--
Jesper Louis Andersen
Erlang Solutions Ltd., Copenhagen, DK

Jesper Louis Andersen

unread,
Jul 3, 2012, 4:34:24 AM7/3/12
to erlang-q...@erlang.org
On 7/2/12 9:41 PM, Wojtek Narczyński wrote:
> If this is a 1 vs 1 game, you might be interested in Elo Rating
> system. If it is team vs. team, it gets more complex.
In a modern world of 1v1 games, it is better to use a more complex
ranking system with better results. ELO is designed for
hand-calculation. My Glicko2 code, which is a bayesian ranking
algorithm, is on-line:

https://github.com/jlouis/erl-glicko2/blob/master/src/glicko2.erl

Do note that the Volatility root finding in step 5 is not numerically
stable in that code. If you need a stable variant, prod me and I'll push
a later version, which uses another variant of root-finding which has
proven to be stable enough in practice.

--
Jesper Louis Andersen
Erlang Solutions Ltd., Copenhagen, DK

_______________________________________________

Rudolph van Graan

unread,
Jul 6, 2012, 8:46:09 AM7/6/12
to erlang-questions Questions
  1. I have to rank about 50 000 - 100 000 different players.
  2. On each score message I have to re-sort whole ranking table.
  3. It must be very cheap to get:
    1. top 100 players
    2. player's rating +/- 10 players about current player
  4. I expect about 20-50 score messages per seconds
  5. Size of score message is about 4KB (profile takes most of the space).
Trying to be pragmatic, I would do this:

1. Every player has a unique key (say ID) and this is the key to a Mnesia table that stores the player records.
2. I would have a second table with the rankings, keyed on score.
3. Each slot contains the ID and score of the player at that position: 
- {score,30,{player,200},undefined,20}
- {score,20,{player,5},30,19}
- {score,19,{player,18},20,undefined}
...
So the top player is the one with rank 30, i.e. player 200, the 2nd one is player 5 and the third is player 18. You can get to the first one by reading the first entry or last entry in the table. Each score record contains a "pointer" to the one before and the one after.

The record will look like this:

{score, Score, PlayerID, Next, Previous}

4. The player table is keyed on the player id and contains the player's current score. So in the example above Player 200 will be rank 1 (score 30) and Player 5 will be rank 2 (score 20) etc.

So you can easily get the top N, simply read the records starting at the end (using dirty last) and following the next pointer. That will give you two database reads per entry. If you want to cache this, make a simple gen_server that keeps this list in memory and refreshes every 10 seconds.

4. Process your score messages concurrently, i.e. each operation does the folling
 a) Read the player record according to ID and get the score from the player ID. 
 b) Read the score record using the current score and retrieve the values of "previous" and "next"
 c) Delete the score record and modify "previous" to point to "next" and "next" to point to previous
 d) Insert a new #score{score=NewScore,previous=undefined, next=undefined}
 e) Call mnesia:next(Tab,{score,NewScore}) to get the logical Next and mnesia:prev/2 to get the logical previous
 f) Update the three records with the appropriate next/prev values

(all of this protected by a transaction of course)

This approach is heavy compared to the other approaches suggested, but it gives you distribution for free and no single process bottlenecks. It should be very fast if you use only memory tables and fast enough if you add persistent tables.

Obviously I haven't worked out all the details above - I am sure you can figure that out.

Hope it helps.

Rudolph van Graan


Please have a look at my blogs:

Random Views of London -- A photographer's interpretation of London


On 2 Jul 2012, at 11:29, Max Bourinov wrote:

Hi Erlangers,

I need to build a ranking system for online game.

I think I will have a ranking gen_server that will accept cast messages like this: {score, Player :: profile(), Score :: integer()}.

So, the question is what would be most appropriate data structure if:
  1. I have to rank about 50 000 - 100 000 different players.
  2. On each score message I have to re-sort whole ranking table.
  3. It must be very cheap to get:
    1. top 100 players
    2. player's rating +/- 10 players about current player
  4. I expect about 20-50 score messages per seconds
  5. Size of score message is about 4KB (profile takes most of the space).

Any ideas or suggestions are welcome!

Best regards,
Max


Reply all
Reply to author
Forward
0 new messages