The Backgammon rating system

Matti Rinta-Nikkola

Dec 1, 1998, 3:00:00 AM

I have been studying the Backgammon rating formula
used on many Backgammon servers. I am posting the
results of my thinking here; perhaps someone will
find them useful.

Best regards,
Matti Rinta-Nikkola

Backgammon rating formula
-------------------------
Many people have noticed that the FIBS rating formula
(also used on many other backgammon servers) does not
work correctly for different match lengths. This conclusion
was reached by studying match statistics collected from
FIBS (ref 1,2,3,4,5). I will explain here how the rating
formula could be modified to be more accurate for different
match lengths. This problem has also been studied by many
others (ref 6,7).

1. FIBS rating formula
----------------------
The FIBS formula has been described elsewhere in more
detail (ref 8). The main assumption of the rating formula
is that the rating distribution of the players follows
a Gaussian distribution. To derive the formula for
different match lengths it has further been presumed that
the game winner always gets one point (i.e. no gammons,
backgammons or doubling cube) (ref 8)! These assumptions
lead to the match winning probability formula:

    P(D) = 1 / (10**(-D*SQRT(Skill)/2000) + 1) ;

where D     is the Elo difference of the players
      P     is the winning probability
      Skill is the match length.
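
As a rough illustration, the formula above can be evaluated as follows. This is only a sketch: the function name `win_probability` and the sample rating difference of 100 points are my own choices for the example, not part of the FIBS definition.

```python
import math


def win_probability(d: float, skill: float) -> float:
    """Match winning probability for a rating difference d.

    d is the rating difference seen from the player whose chances we
    want (positive means that player is rated higher); skill is the
    value of the Skill function for the match being played.
    """
    return 1.0 / (10.0 ** (-d * math.sqrt(skill) / 2000.0) + 1.0)


# FIBS choice: Skill is simply the match length N.
for n in (1, 5, 11):
    print(n, round(win_probability(100.0, n), 3))
```

With equal ratings (D=0) the function returns exactly 0.5, and for a fixed positive D the favourite's chances grow with the Skill value.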

So what is wrong with the formula above? The formula itself is
correct, but the second assumption used to derive it is wrong!
The wrong assumption leads to an erroneous Skill function. In
Backgammon the Skill function is not simply equal to the match
length.

2. Backgammon Skill function
----------------------------
Who wins most often when you play Backgammon? If you mostly
lose your matches, then it is certainly the luckier player who
wins :-). But if you win more, you probably like to explain it
by the Backgammon playing skills you have. What might those
skills be? Obviously there are two skills: 1) checker play and
2) cube handling.

Let us try to construct the Skill function for the Backgammon
rating formula. As already noted above, in the FIBS formula the
Skill function is simply

Skill(N)= N ; where N is a match length

If we introduce the doubling cube to the game, the average
points per game increase and the checker play skill
becomes less important for a given match length. On the other
hand, the doubling cube brings a new skill to the play: cube
handling skill. Studying the match equity table for players of
different checker play skills (ref 7), we see that
the probability of winning a 1 point match is equal to that
of winning a 2 point match, and likewise the probabilities of
winning 3 and 4 point matches are equal. Because of this it is
easier to construct the Skill function separately for even
and odd point matches. Here I will consider only odd point
matches, i.e. N=1,3,5,7... . For odd point matches the Skill
function can be written as

    Skill(N) = 1 + (cp + ch)*(N-1)/2 ;   N=1,3,5,7...     (1)

where cp defines the "extra" checker play skill in matches
         with N>1
      ch defines the cube handling skill

o Note 1: If the cube handling skill ch=0, the Skill
function gives the expectation value of the minimum
number of games needed to win the match, i.e.

    Skill(N) = N/ppg(N) ; ch=0                            (2)

where ppg(N) is the average points per game. In longer
matches ppg(N) is near the value of the ppg in a money game.
The value of cp can be calculated using equations (1) and
(2).

o Note 2: The cube handling and checker play skill parameters
(ch, cp) are expressed in units of the one point match
checker play skill.

o Note 3: The total checker play and cube handling skills in
an N point match are 1+cp*(N-1)/2 and ch*(N-1)/2 respectively.
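
Equation (1) and the split described in Note 3 can be sketched in a few lines. The function names are mine, and the values cp=0.85, ch=0.35 in the test below are illustrative only (they sum to the cp+ch=1.2 suggested later in this post).

```python
def skill(n: int, cp: float, ch: float = 0.0) -> float:
    """Skill function of equation (1); defined for odd match lengths only."""
    if n < 1 or n % 2 == 0:
        raise ValueError("equation (1) is stated for odd match lengths only")
    return 1.0 + (cp + ch) * (n - 1) / 2


def skill_parts(n: int, cp: float, ch: float) -> tuple[float, float]:
    """Total checker play skill and total cube handling skill (Note 3)."""
    return 1.0 + cp * (n - 1) / 2, ch * (n - 1) / 2


# With cp=2 and ch=0 the function reduces to the FIBS choice Skill(N)=N.
print([skill(n, cp=2.0) for n in (1, 3, 5, 7)])
```

For every odd N the two parts returned by `skill_parts` add up to `skill(n, cp, ch)`.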

3. Defining the values for parameters cp and ch
-----------------------------------------------

- Checker play skill

As noted before, cp can be calculated if the cube
handling skill is equal to zero. From equations (1) and (2)
we get

cp=2*(N/ppg(N) - 1)/(N-1) = 2/ppg ; assuming N>>1 (3)

The cp value for shorter matches (N=3,5,7...) is a bit
smaller than the value obtained from equation (3).
A better estimate for cp is obtained if we use a smaller N, for
example N=21. If ppg=1, as assumed in the derivation
of the FIBS rating formula, we get cp=2.

A more realistic value can be obtained if we assume continuous
Backgammon and efficient doubling (the assumptions used to
derive the match equity table). In that case ppg=3.3 and we can
calculate cp from equation (3) (the cube handling skill is
zero because no cube handling errors are made). We obtain
cp=0.61. A more accurate value for cp can be obtained if the
Skill function is fitted to the match equity data, see
Table 1. I have used the match equity table calculated by Tom Keith
(ref 7).

We can make another estimate for cp if we assume that
JellyFish plays perfect Backgammon (at level 5 :)). For
JellyFish ppg=2.3 (ref 9), which gives cp = 0.87.

The rolls method introduced by Tom Keith (ref 7) gives cp=0.84,
see Table 1. The rolls method does not give any information
about the cube handling errors made. It is quite probable that
the cube handling errors are averaged out of the data used
(there are equal numbers of bad drops and bad takes).

We can also calculate the checker play skill from equation (3)
for the case when the match is played without the doubling cube.
Assuming a 25% gammon rate, the cubeless
ppg = 0.75+2*0.25 = 1.25, which gives cp=1.6.
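
The cp estimates quoted in this section all follow from equation (3). A small sketch (function names are mine) that reproduces them from the ppg values given in the text:

```python
def cp_from_ppg(ppg: float) -> float:
    """Long-match limit of equation (3): cp = 2/ppg."""
    return 2.0 / ppg


def cp_finite(n: int, ppg: float) -> float:
    """Equation (3) before taking the limit N >> 1, treating ppg as constant."""
    return 2.0 * (n / ppg - 1.0) / (n - 1)


# ppg values taken from the text above.
for label, ppg in (("FIBS assumption", 1.0),
                   ("match equity table", 3.3),
                   ("JellyFish", 2.3),
                   ("cubeless, 25% gammons", 0.75 + 2 * 0.25)):
    print(f"{label}: cp = {cp_from_ppg(ppg):.2f}")
```

`cp_finite(21, 2.3)` comes out a little below `cp_from_ppg(2.3)`, which matches the remark that the value for finite N is somewhat smaller than the limit.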

Table 1. Skill function for different methods (ch=0). In
         parentheses is the value calculated from
         equation (1) using the fitted cp value.

                   FIBS      MatEq         Rolls         JellyFish
Match length  1     1 (1)    1.00 (1.00)   1.00 (1.00)   -
Match length  3     3 (3)    1.54 (1.54)   1.77 (1.84)   -
Match length  5     5 (5)    2.07 (2.08)   2.66 (2.68)   -
Match length  7     7 (7)    2.62 (2.62)   3.57 (3.52)   -
Match length  9     9 (9)    3.13 (3.16)   4.67 (4.36)   -
Match length 11    11 (11)   3.69 (3.70)   5.48 (5.20)   -
fitted cp           2        0.54          0.84          -
calculated cp       2        0.61          -             0.87
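
The values in parentheses in Table 1 can be regenerated from equation (1) with ch=0 and the fitted cp of each column. A minimal sketch (the dictionary and function names are mine):

```python
MATCH_LENGTHS = (1, 3, 5, 7, 9, 11)
FITTED_CP = {"FIBS": 2.0, "MatEq": 0.54, "Rolls": 0.84}  # from Table 1


def skill_odd(n: int, cp: float) -> float:
    """Equation (1) with ch=0."""
    return 1.0 + cp * (n - 1) / 2


print("N  " + "  ".join(f"{name:>6}" for name in FITTED_CP))
for n in MATCH_LENGTHS:
    row = "  ".join(f"{skill_odd(n, cp):6.2f}" for cp in FITTED_CP.values())
    print(f"{n:<3}{row}")
```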

- Cube handling skill

The value of the ch parameter has to be calculated from
experimental data. There is no way to determine it theoretically
because its value depends on the cube handling and checker play
errors players make. ch depends on the checker play errors too
because it has to be expressed in the same units as cp.

However, we already know that the FIBS value cp+ch=2 is too
high, and we also have a very good estimate, cp=0.85. So we
must have ch < 1.15.

One more limit can be found if someone is able to
answer the following question: Let us assume that the top rated
player plays an 11 point match against an average rated player.
Who gets the advantage if they decide to play without the
doubling cube?
As calculated above, cp=1.6 in a match played without the
doubling cube. If the answer to the above question is "the top
rated player", which I think is the correct answer, we can write

    P_nocube > P_cube
<=>
    Skill_nocube > Skill_cube
<=>
    1.6 > 0.85 + ch
=>
    ch < 0.75
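
Both upper limits on ch are simple differences between a total slope (cp+ch) and the estimate cp=0.85. The sketch below spells out that arithmetic and checks it for the 11 point match of the question; the function names are mine.

```python
CP_ESTIMATE = 0.85   # checker play estimate used in the text
CP_NOCUBE = 1.6      # cubeless value from equation (3) with ppg=1.25
FIBS_TOTAL = 2.0     # cp+ch implied by Skill(N)=N


def ch_upper_bound(total_slope: float, cp: float = CP_ESTIMATE) -> float:
    """Largest ch for which cp + ch stays below the given total slope."""
    return total_slope - cp


def skill(n: int, slope: float) -> float:
    """Equation (1) with slope = cp + ch."""
    return 1.0 + slope * (n - 1) / 2


print(round(ch_upper_bound(FIBS_TOTAL), 2))   # limit from the FIBS value
print(round(ch_upper_bound(CP_NOCUBE), 2))    # limit from the no-cube argument
```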

4. Backgammon rating system
---------------------------
Finally, I will suggest how the rating system should be implemented
on a Backgammon server.

1) I think that the rating system should be simplified so that the
rating is calculated only for odd point matches (N=1,3,5,7...).
2) Pick a value for "cp+ch". It should be in the range from 0.8
to 2.0. On a new server I would probably start with the value
cp+ch=1.2. On an old server like FIBS I think you need to ask
the players what they think. Perhaps they don't want to
change the rating system at all :-).
3) Design the system so that it can easily indicate if the parameter
cp+ch has a wrong value.
4) Be conservative when changing the value of cp+ch and don't change it
too often.
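
One possible way to approach point 3 is to compare, per match length, the win rate predicted by the formula with the win rate actually observed. The sketch below is only an assumption about how such a check could look: the record layout `(rating_diff, match_length, won)` and the function names are hypothetical, and `b` stands for the chosen value of cp+ch.

```python
import math
from collections import defaultdict


def win_probability(d: float, skill: float) -> float:
    """Match winning probability for rating difference d."""
    return 1.0 / (10.0 ** (-d * math.sqrt(skill) / 2000.0) + 1.0)


def calibration_by_length(results, b: float) -> dict:
    """Return {match_length: (predicted win rate, observed win rate)}.

    results is an iterable of (rating_diff, match_length, won) records,
    seen from one player's side; match_length is assumed to be odd.
    """
    totals = defaultdict(lambda: [0.0, 0, 0])  # predicted, wins, matches
    for rating_diff, match_length, won in results:
        skill = 1.0 + b * (match_length - 1) / 2
        entry = totals[match_length]
        entry[0] += win_probability(rating_diff, skill)
        entry[1] += 1 if won else 0
        entry[2] += 1
    return {n: (pred / count, wins / count)
            for n, (pred, wins, count) in sorted(totals.items())}
```

If the predicted rate is systematically higher than the observed one for long matches and the records are taken from the favourite's side, that would hint that cp+ch is set too high, and vice versa.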

References
----------
1) FIBS--Rating Formula: Different length matches by Jim Williams
http://www.bkgm.com/rgb/rgb.cgi?view+603
2) FIBS--Rating Formula: Empirical analysis by Carry Wong
http://www.bkgm.com/rgb/rgb.cgi?view+601
3) FIBS--Rating Formula: One-point matches by David Montgomery
http://www.bkgm.com/rgb/rgb.cgi?view+44
4) FIBS--Rating Formula: Opponent's strength by William Hill
http://www.bkgm.com/rgb/rgb.cgi?view+524
5) Match Archives Big Brother--Statistics by Peter Fankhauser
http://www.bkgm.com/rgb/rgb.cgi?view+139
6) FIBS--Rating Formula: Possible adjustments by Christopher D. Yep
http://www.bkgm.com/rgb/rgb.cgi?view+597
7) FIBS--Rating Formula: Different length matches by Tom Keith
http://www.bkgm.com/rgb/rgb.cgi?view+523
8) ELO ranking
http://www.netgammon.com/us/facts/elo2.htm
9) Miscellaneous Distribution of points per game by Stig Eide
http://www.bkgm.com/rgb/rgb.cgi?view+513

Jim Williams

Dec 1, 1998, 3:00:00 AM

Matti Rinta-Nikkola wrote:
>
> 4. Backgammon rating system
> ---------------------------
> Finally I will suggest how the rating system should be implemented
> in the Backgammon server.
>
> 1) I think that the rating system should be simplified so that the
> rating is calculated only for odd point matches (N=1,3,5,7...)
> 2) Pick up a value for "cp+ch". It should be in the range from 0.8
> to 2.0. In a new server I would probably start with a value
> cp+ch=1.2. In a old server like FIBS I think you need to ask from
> the players what they think about. Perhaps they don't want to
> change the rating system at all :-).
> 3) Design the system which can indicate easily if the parameter cp+ch
> has a wrong value.
> 4) Be conservative when changing the value cp+ch and don't change it
> too often.
>

Some interesting ideas! The idea of using a match equity table
for players of unequal skill as the basis for the rating system
seems sound, but it begs the question of how the match equity
table was calculated. I did not see the generating equations
in the reference noted below.

One effect that I have noticed in my own practice matches against
jellyfish is that a disproportionate amount of the equity that
I give up occurs with bad checker play in tricky end game situations.
The presence of the cube allows a significant percentage of these
situations to be avoided because a double is declined. I find
that being cautious with the cube, declining marginal cubes, and
waiting in marginal doubling situations, actually improves my
winning percentage by avoiding many situations in which I would
be outplayed. Of course it may just be that I am a bad end game
player, but if this effect is common, it adds another dynamic
into the effect of match length and the doubling cube on skill.

As far as modifying the rating system, my vote would be to use
Skill(n) = K1 + K2*n where K1 and K2 are determined to best
match empirical data. Fixing K1 at 1 could result in a significant
change in rating spread of players with existing ratings.

Getting good empirical data is tricky though, as there are a number
of statistical biases which tend to creep in and are difficult to
filter out. Also it takes a surprisingly large amount of data to
reduce the random error to an acceptable level.
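
To show what such a fit of Skill(n) = K1 + K2*n might look like, here is an ordinary least-squares sketch. No real match statistics are available in this thread, so the MatEq column of Table 1 from the first post is used purely as a stand-in for empirical data; the function name is mine.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = k1 + k2*x; returns (k1, k2)."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k2 = sxy / sxx
    return mean_y - k2 * mean_x, k2


# Stand-in data: match lengths and the MatEq Skill values from Table 1.
lengths = [1, 3, 5, 7, 9, 11]
mateq_skill = [1.00, 1.54, 2.07, 2.62, 3.13, 3.69]

k1, k2 = fit_line(lengths, mateq_skill)
print(f"K1 = {k1:.3f}, K2 = {k2:.3f}")
```

Here n is taken to be the match length N itself. In terms of (N-1)/2 the fitted slope is 2*K2 and the constant is K1+K2, which for this stand-in data lands close to the fitted cp=0.54 and the constant 1 of Table 1.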

rintan...@my-dejanews.com

Dec 2, 1998, 3:00:00 AM

In article <366470...@giga-net.com>,
Jim Williams <ji...@giga-net.com> wrote:

>
> Some interesting ideas! The idea of using a match equity table
> for players of unequal skill as the basis for the rating system
> seems sound, but it begs the question of how the match equity
> table was calculated. I did not see the generating equations
> in the reference noted below.

In fact I wrote a program which calculates a match equity table for
players of unequal checker play skill, and the table I got is
consistent with the table of Tom Keith. I might explain later how you
can calculate that table, or perhaps someone else can do it. For now
I'm afraid that you just have to believe that the table sent by
Tom Keith is correct.
The match equity table is not used as a basis to construct the Skill
function and the rating system. It has been used to verify the
assumption that the Skill function has the following form:
Skill(N) = a + b*(N-1)/2, where N=1,3,5,... and the parameters a,b are
constant. In fact all the examples I gave to determine
the value of the checker play skill parameter cp can be understood as
tests of the Skill function form. If someone is skeptical about the
correctness of the Skill function form, he can generate a match equity
table for the case where no doubling cube is used and test the function
against the table data :-).

>
> One effect that I have noticed in my own practice matches against
> jellyfish is that a disproportionate amount of the equity that
> I give up occurs with bad checker play in tricky end game situations.
> The presence of the cube allows a significant percentage of these
> situations to be avoided because a double is declined. I find
> that being cautions with the cube, declining marginal cubes, and
> waiting in marginal doubling situations, actually improves my
> winning percentage by avoiding many situations in which I would
> be outplayed. Of course it may just be that I am a bad end game
> player, but if this effect is common, it adds another dynamic
> into the effect of match length and the doubling cube on skill.
>

I'm not sure if I understand you correctly here. Are you trying to
explain that there might be a way to play Backgammon which would
change the form of the Skill function written above? I don't believe
that it is possible. The parameter b in the above equation includes
every skill present in Backgammon. Note that the value of the b
parameter has to be extracted from empirical data. Of course it
isn't necessary to split the b parameter as b=cp+ch. So why did I want
to write it that way? I did so because the checker skill part
can be calculated easily, and I thought that there might be someone
who has an idea what value the ratio ch/cp might have. If you know
the value of the ratio, you have a good estimate for parameter b.
If you think that there is a skill present in the game which I did not
mention, you can call the parameter ch "all the other skills" and
everything is in line again :-).

> As far as modifying the rating system, my vote would be to use
> Skill(n) = K1 + K2*n where K1 and K2 are determined to best
> match empirical data. Fixing K1 at 1 could result in a significant
> change in rating spread of players with existing ratings.

That's a good point. I assume in your equation n=(N-1)/2; N=1,3,5....
Choosing a value other than one for K1 is equivalent
to changing the number 2000 in the probability equation.
I would change the number 2000, but it's a question of taste what
you want to fit.
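
The equivalence follows from SQRT(K1 + K2*n) = SQRT(K1)*SQRT(1 + (K2/K1)*n): the factor SQRT(K1) can be moved into the constant 2000. A small numeric check, with arbitrary illustrative values for K1, K2 and D:

```python
import math


def win_probability(d: float, skill: float, scale: float = 2000.0) -> float:
    """Winning probability with the constant 2000 exposed as a parameter."""
    return 1.0 / (10.0 ** (-d * math.sqrt(skill) / scale) + 1.0)


k1, k2, d = 0.8, 0.5, 150.0   # illustrative values only

for n in range(6):            # n = (N-1)/2 for N = 1, 3, ..., 11
    free_k1 = win_probability(d, k1 + k2 * n)
    fixed_k1 = win_probability(d, 1.0 + (k2 / k1) * n,
                               scale=2000.0 / math.sqrt(k1))
    assert math.isclose(free_k1, fixed_k1, rel_tol=1e-12)
print("K1 != 1 gives the same probabilities as a rescaled 2000")
```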

>
> Getting good empirical data is
> tricky though, as there are a number of statistical biases which
> tend to creep in and are difficult to filter out.

Sorry for my ignorance of those statistical biases. Can you mention
and explain some of them?

Matti Rinta-Nikkola


Jim Williams

Dec 2, 1998, 3:00:00 AM

rintan...@my-dejanews.com wrote:
> Now
> I'm afraid that you just have to believe that the table send by
> Tom Keith is correct.

I'm not sure what you mean by "correct". I'm certainly willing to
accept that it is a good and well thought out approximation. My
interest is more in what the assumptions and limits of the
approximation are. If player A can win 60% of 1 point matches
against player B, what percent of 3 point matches can he win?
I don't think the answer is directly calculable. It depends on
which phases of the game the relative strengths and weaknesses
of A and B lie in. You addressed that yourself with the cp and ch
parameters, and I suspect that a more detailed analysis may reveal
even more parameters.

> I'm not sure if I understand you here correctly. Are you trying to
> explain that there might be a way to play Backgammon which would
> change the form of the Skill function written above?

I would go even farther than that and state that everyone plays
differently with different strengths and weaknesses, and therefore
everyone's skill function would be a little different. I am willing
to concede that in general people may be close enough that a pretty
accurate rating system can be built which is a quite good approximation.
I'm not sure it can be done purely analytically though, and it may
require some empirical data.

> I don't believe
> that it is possible. In parameter b on above equation is included
> every skill present in Backgammon. Note that the value of the b
> parameter has to be extracted from the empirical data. Of course it
> isn't necessary to divide the b parameter b=cp+ch. So why I was wanting
> to write it in that way? I did so because the checker skill part
> can be calculated easily and I thought that there might be someone
> who have a idea what value the ratio ch/cp might have. If you know
> the value of the ratio you have a good estimation for parameter b.
> If you think that there is skill present in a game which I did not
> cited you can call the parameter ch as "all the other skills" and
> everything is in a line again :-).

Please forgive a contrived example, but how about a player who
is masterful in handling the cube as long as the number on top
is a 2? When the cube gets turned to 4 or 8, he chokes and makes
terrible decisions.

> > As far as modifying the rating system, my vote would be to use
> > Skill(n) = K1 + K2*n where K1 and K2 are determined to best
> > match empirical data. Fixing K1 at 1 could result in a significant
> > change in rating spread of players with existing ratings.
>
> That's a good point. I assume in your equation n=(N-1)/2; N=1,3,5....

My assumption was that n=N, but n=(N-1)/2 is algebraically
equivalent (just tweak the constants). My inclination is not
to restrict N to odd. Even if the even case is less accurate, it's
probably better than nothing. Most people play odd length matches
anyway.

> If you choose a different value than one for K1 it is equivalent
> to that if you change the number 2000 in the probability equation.

I agree with that.

> Sorry for my ignorance of those statistical biases. Can you mention
> and explain some of them?

Sure. These may or may not be real, and there probably are others
I haven't thought of.

1. Partition bias. The universe of FIBS players may partition itself
into groups with each group preferring a specific match length.
Within each group, the ratings of the players would tend
to distribute according to the rating formula for that match
length.

2. Diffusion bias. A player with a rating of 1800 is far more likely
to have a true strength of 1750 and be overrated than a
true strength of 1850 and be underrated. This is not
because any given player is more likely to be overrated than
underrated, but because there are a lot more players with
a true strength of 1750 than of 1850.

3. Diffusion magnification. Players may tend to play others of
similar rating. Therefore an 1800 rated player will tend to
play others who are overrated due to #2.

4. Match length effects. A player may tend to take 5 point matches
more seriously than one point matches and play much more
carefully (or vice versa).

5. Celebrity bias. A player who often plays while doing other things
at the same time may finally get a match with a 1950 rated
player, stop everything else, and concentrate fully on the
game.
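
The diffusion bias in item 2 can be made concrete with a simple model. The sketch below assumes, purely for illustration, that true strengths are normally distributed around a population mean and that a displayed rating equals the true strength plus independent normal noise; all three numeric parameters are hypothetical, not measured FIBS values. Under that model the expected true strength given a rating is pulled back toward the population mean.

```python
def expected_true_strength(rating: float,
                           pop_mean: float = 1500.0,
                           pop_sd: float = 150.0,
                           noise_sd: float = 60.0) -> float:
    """Posterior mean of true strength given an observed rating.

    Assumes a normal population of true strengths and normal rating
    noise; the default parameter values are hypothetical.
    """
    shrink = pop_sd ** 2 / (pop_sd ** 2 + noise_sd ** 2)
    return pop_mean + shrink * (rating - pop_mean)


print(round(expected_true_strength(1800.0), 1))
```

With these made-up parameters an 1800 rated player is, on average, somewhat weaker than 1800, which is the overrating effect described in item 2; item 3 then compounds it when such players mostly meet each other.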
>
> Matti Rinta-Nikkola

Robert-Jan Veldhuizen

Dec 2, 1998, 3:00:00 AM

On 02-dec-98 15:48:15, Jim Williams wrote:

JW> rintan...@my-dejanews.com wrote:

>> Now
>> I'm afraid that you just have to believe that the table send by
>> Tom Keith is correct.

JW> I'm not sure what you mean by "correct". I'm certainly willing to
JW> accept that it is a good and well thought out approximation.

If I am well informed, this match equity table only takes into account
the probability that one player will defeat the opponent in a 1pt match.
Not very realistic I think, especially on longer matches other factors
are much more important. An example could be the skill of a player to
create gammons and to avoid them himself. People that mostly play (very)
short matches are probably not too good at this, whereas it can be very
important in somewhat longer matches.

JW> My
JW> interest is more in what the assumptions and limits of the
JW> approximation are. If player A can win 60% of 1 point matches
JW> against player B, what percent of 3 point matches can he win?
JW> I don't think the answer is directly calculable. It depends on
JW> what phases of the game that the relative strengths and weaknesses
JW> of A and B are. You addressed that youself with the cp and ch
JW> parameters, and I suspect that a more detailed analysis may reveal
JW> even more parameters.

I think in fact there might be an almost infinite number of parameters!
You just can't design *one* rating-system that gives accurate results
for matches of all lengths and players with various skills.

Every player will have specific strong and weak points, in checker play
as well as in cube handling *and* psychologically. These will work out
differently at different match lengths, and also against different
opponents.

If you would really want to make the rating system more accurate, I
think you would have to make different ratings for various match
lengths, for instance three separate categories: 1 and 2 pointers, 3 to
8 pointers and 9 pointers and up.

It would be interesting to see how the present "one rating for
everything" would divide into three different ratings. I think lots of
players would get pretty different ratings for the above mentioned
categories, for instance.

Bottom line: You just can't combine the various aspects of backgammon at
different match lengths and between players with different levels of
skill at those different aspects, into one number without making
arbitrary decisions.

Another problem with the present rating system that I haven't seen
mentioned in this discussion, yet which seems much more important to me
*and* much easier to solve, is the following:

In bg, the luck factor plays a very important role. That means that
ratings will go up and down, sometimes rather wildly, because every
player will meet (un)lucky streaks. This is a difference from
a game like chess. Wouldn't it be a good idea to reflect this difference
in the nature of the game in the rating system? I don't know what would
be a good way of doing this, but averaging the rating over the last 50
results, for instance (probably with some form of bias towards the most
recent results), could make the ratings much more accurate, it seems to
me. Any ideas on this?
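
One way to read this suggestion is as a recency-weighted average of the rating after each of the last 50 matches. The sketch below uses exponentially decaying weights; the decay factor 0.95 and the function name are my own assumptions, and nothing like this is part of the existing FIBS formula.

```python
def smoothed_rating(history, window: int = 50, decay: float = 0.95) -> float:
    """Recency-weighted average of the last `window` ratings.

    history lists the rating after each match, oldest first. The most
    recent entry gets weight 1, the one before it `decay`, and so on.
    """
    recent = list(history)[-window:]
    if not recent:
        raise ValueError("history must contain at least one rating")
    weights = [decay ** age for age in range(len(recent) - 1, -1, -1)]
    return sum(w * r for w, r in zip(weights, recent)) / sum(weights)


# A lucky streak at the end moves the smoothed value less than the raw one.
ratings = [1600.0] * 45 + [1620.0, 1640.0, 1660.0, 1680.0, 1700.0]
print(round(smoothed_rating(ratings), 1), ratings[-1])
```

Setting `decay=1.0` gives a plain average over the window, while smaller values follow the latest results more closely.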

--
Zorba/Robert-Jan

Matti Rinta-Nikkola

Dec 3, 1998, 3:00:00 AM

Jim Williams wrote:

> rintan...@my-dejanews.com wrote:
> > Now
> > I'm afraid that you just have to believe that the table send by
> > Tom Keith is correct.
>

> I'm not sure what you mean by "correct". I'm certainly willing to
> accept that it is a good and well thought out approximation. My
> interest is more in what the assumptions and limits of the
> approximation are.

I think Tom Keith should answer this one. Anyway, I got a table like
Tom Keith's assuming a 25% gammon rate. The other assumptions
are explained in the article "How to Compute a Match Equity Table",
http://www.bkgm.com/articles/met.html. The limits of the approximation...
that would be far too long a story to start explaining now; I might do
it later. Note however that assuming a 25% gammon rate and efficient
doubles leads to ppg=3.4, while in real life Backgammon ppg is nearer
to two than three.

> If player A can win 60% of 1 point matches
> against player B, what percent of 3 point matches can he win?
> I don't think the answer is directly calculable. It depends on
> what phases of the game that the relative strengths and weaknesses
> of A and B are. You addressed that youself with the cp and ch
> parameters, and I suspect that a more detailed analysis may reveal
> even more parameters.

Yes, you are right. You can split the term as ch = a*cp*ch' + ch', where
ch' is the "real cube handling skill" and a is a constant. You
might find even more terms. What I think is that all those skill terms
(whatever they might be) in the Skill function are, to a good approximation,
directly proportional to (N-1)/2.

> > I'm not sure if I understand you here correctly. Are you trying to
> > explain that there might be a way to play Backgammon which would
> > change the form of the Skill function written above?
>
> I would go even farther than that and state that everyone plays
> differently with different strengths and weaknesses, and therefore
> everyone's skill function would be a little different.

A Skill function cannot be calculated for individuals. The Skill function
has to be calculated for the group of persons who play against each other.
Every backgammon server and group of persons can have different constants
in their Skill function, but the form of the function should be the same.

> I am willing
> to concede that in general people may be close enough that a pretty
> accurate rating system can be built which is a quite good approximation.
> I'm not sure it can be done purly analytically though, and may require
> some empirical data.

Surely empirical data is needed. There is no way to determine the
parameter of the Skill function theoretically because it depends on the
errors players make.

> > I don't believe
> > that it is possible. In parameter b on above equation is included
> > every skill present in Backgammon. Note that the value of the b
> > parameter has to be extracted from the empirical data. Of course it
> > isn't necessary to divide the b parameter b=cp+ch. So why I was
> > wanting
> > to write it in that way? I did so because the checker skill part
> > can be calculated easily and I thought that there might be someone
> > who have a idea what value the ratio ch/cp might have. If you know
> > the value of the ratio you have a good estimation for parameter b.
> > If you think that there is skill present in a game which I did not
> > cited you can call the parameter ch as "all the other skills" and
> > everything is in a line again :-).
>
> Please forgive a contrived example, but how about a player that
> is masterful in handling the cube as long as the number on top
> is a 2. When the cube gets turned to 4 or 8, he chokes amd makes
> terrible decisions.

Yes, you are right! There was an error in my Skill function. For example,
in a three point match no doubling cube errors occur with the cube
turned to 8. If you would like to take those errors into account too, you
have to write the Skill function as

Skill(N) = 1 + a*(N-1)/2 + b*int(N/4) + c*int(N/8) + ....; N=1,3,5...

But I'm quite sure ;-) that b*int(N/4)<<a*(N-1)/2 and
c*int(N/8)<<b*int(N/4). Why? Because
1) there is no pure checker play skill term in parameters b and c,
and I think that checker play skill is much bigger than cube
handling skill.
2) in every match length you make more cube handling errors
at low cube values because those decisions occur much more
frequently than the high cube value double decisions.
Anyway, if it is difficult to determine the value of parameter a, it is
certainly more difficult (or even impossible) to fix parameters b and c.
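
The extended Skill function can be sketched as below. The coefficient values in the example are made up for illustration (chosen so that the higher-cube terms are small, as argued above), and the function name is mine.

```python
def skill_with_cube_levels(n: int, a: float, higher=()) -> float:
    """Skill(N) = 1 + a*(N-1)/2 + b*int(N/4) + c*int(N/8) + ...

    `higher` holds the coefficients (b, c, ...) of the int(N/4),
    int(N/8), ... terms. Stated for odd match lengths N.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("stated for odd match lengths only")
    total = 1.0 + a * (n - 1) / 2
    for level, coefficient in enumerate(higher):
        total += coefficient * (n // (4 * 2 ** level))
    return total


# Hypothetical coefficients: a=1.2, b=0.1, c=0.05.
for n in (3, 9):
    print(n, skill_with_cube_levels(n, 1.2, (0.1, 0.05)))
```

For N=3 the higher terms vanish (int(3/4) = int(3/8) = 0), which is the point made about three point matches.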

>
>
> > > As far as modifying the rating system, my vote would be to use
> > > Skill(n) = K1 + K2*n where K1 and K2 are determined to best
> > > match empirical data. Fixing K1 at 1 could result in a significant
> > > change in rating spread of players with existing ratings.
> >
> > That's a good point. I assume in your equation n=(N-1)/2; N=1,3,5....
>
> My assumption was that n=N, but n=(N-1)/2 is algebraically
> equivalently (just tweek the constants). Ny inclination is not
> to restrict N to odd. Even if the even case is less accurate, its
> probably better than nothing. Most people play odd length matches
> anyway.

If you don't want to restrict N to odd, then I think that it is better to
use the following Skill function

Skill(N) = 1 + a*int(N/(2+e));

where 0<e<<1, because Skill(1) has to be equal to Skill(2), and probably
Skill(3) and Skill(4) are also quite near each other.
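
A quick sketch of this variant, with e=0.01 and a=1.2 as arbitrary illustrative values, shows how the small offset e pairs up the match lengths:

```python
def skill_any_length(n: int, a: float, e: float = 0.01) -> float:
    """Skill(N) = 1 + a*int(N/(2+e)) for odd and even match lengths."""
    return 1.0 + a * int(n / (2 + e))


for n in range(1, 9):
    print(n, skill_any_length(n, 1.2))
```

For short matches the function gives identical values for N=1 and 2, for N=3 and 4, and so on. The pairing only starts to slip for very long matches, where N*e/(2+e) grows to the order of one.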

>> Sorry for my ignorance of those statistical biases. Can you mention
>> and explain some of them?
>
> Sure. These may or may not be real, and there probably are others
> I haven't thought of.

......
Thanks for the discussion of the statistical biases.

Matti Rinta-Nikkola

Dec 4, 1998, 3:00:00 AM

Robert-Jan Veldhuizen wrote:

> If I am well informed, this match equity table only takes into account
> the probability that one player will defeat the opponent in a 1pt match.

That is correct.

> Not very realistic I think, especially on longer matches other factors
> are much more important. An example could be the skill of a player to
> create gammons and to avoid them himself. People that mostly play (very)
> short matches are probably not too good at this, whereas it can be very
> important in somewhat longer matches.

By the way, there is also a gammon factor in one point matches. It is
more productive to play back games in one point matches than in longer
matches, although you will lose more gammons. These factors can easily be
taken into account in the rating system (see my previous messages).

> I think in fact there might be an almost infinite number of parameters!
> You just can't design *one* rating-system that gives accurate results
> for matches of all lengths and players with various skills.

If the ELO assumption about the Gaussian distribution is good enough, there
is no problem in designing a good rating system for the case when the
doubling cube is not used (see my previous messages). The doubling cube,
by contrast, brings a lot of problems to the design of the rating system.
The question is how good a rating system we can get if we add one free
parameter to the Skill function. That parameter has to be extracted from
the game statistics. Getting a good answer to the above question is not
easy either... You have to try it and analyze the game statistics in
order to see how well it works.

> Every player will have specific strong and weak points, in checker play
> as well as in cube handling *and* psychologically. These will work out
> differently at different match lengths, and also against different
> opponents.

Yes, that is true. And that's why the players can be rated! You don't need
to know precisely all the skills affecting the game result in order to
create a good rating system.

> If you would really want to make the rating system more accurate, I
> think you would have to make different ratings for various match
> lengths, for instance three seperate categories: 1 and 2 pointers, 3 to
> 8 pointers and 9 pointers and up.

Yes, that's one solution. But it is more complicated to realize in practice
than what I'm suggesting.

> Bottom line: You just can't combine the various aspects of backgammon at
> different match lengths and between players with different levels of
> skill at those different aspects, into one number without making
> arbitrary decisions.

With this I (as well as Arpad Elo, for the case when no doubling cube is
used) disagree, as you know.

> Some other problem with the present rating system that I haven't seen
> mentioned in this discussion, yet seems much more important to me *and*
> much more easier to solve is the following:
>
> In bg, the luck factor plays a very important role. That means that
> ratings will go up and down, sometimes rather wildly, because every
> player will meet (un)lucky streaks. This is a difference with
> a game like chess. Wouldn't it be a good idea to reflect this difference
> in nature of the game in the ratingsystem? I don't know what would be a
> good way of doing this, but averaging the rating over the last 50
> results f.i. (with probably a form of bias for the most recent results)
> could make the ratings much more accurate it seems to me. Any ideas
> on this?

I think that a bad rating system can amplify the amplitude of the rating
fluctuations.

Matti Rinta-Nikkola

rintan...@my-dejanews.com

Dec 6, 1998, 3:00:00 AM

Hello everyone,

I will present here a detailed derivation of the Backgammon
Skill function.

Best regards,

Matti Rinta-Nikkola

Derivation of the Backgammon Skill function
-------------------------------------------

Assume that the skill distribution of the players follows
a Gaussian distribution and that the game winner always
gets one point. These assumptions lead to the match
winning probability formula:

    P(D) = 1 / (10**(-D*SQRT(Skill)/2000) + 1) ;   (1)

where D        is the Elo difference of the players
      P        is the winning probability
      Skill(N) = N
      N        is the match length
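
As a minimal sketch, formula (1) can be evaluated as follows. The function
name win_probability and the sample rating differences are only illustrative
assumptions; they are not taken from any server implementation.

```python
import math

def win_probability(d, skill):
    """Formula (1): winning probability of the player rated d points higher.

    d     -- Elo difference of the players (may be negative)
    skill -- value of the Skill function; Skill(N) = N in the FIBS formula
    """
    return 1.0 / (10.0 ** (-d * math.sqrt(skill) / 2000.0) + 1.0)

# Equal ratings give an even match regardless of the match length.
print(win_probability(0, 5))                # 0.5
# A 200 point favourite in a 1 point and in a 9 point match, Skill(N) = N.
print(round(win_probability(200, 1), 3))    # 0.557
print(round(win_probability(200, 9), 3))    # 0.666
```

The larger the Skill value, the more a given rating difference is worth,
which is why the exact form of the Skill function matters.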

If you always get two points (instead of one) from a victory,
we know without any calculation that the Skill function can
be written as

    Skill_2(N) = 1 + a'_2*int(N/2)

In the case of "n points" per victory the Skill function is

    Skill_n(N) = 1 + a'_n*int(N/n)   (2)

What is the Skill function of Backgammon when no doubling
cube is used? It is a combination of the functions
Skill_1, Skill_2 and Skill_3. We can write it in the
following form

    Skill_bg(N) = 1 + a_1*N + a_2*int(N/2) + a_3*int(N/3);   (3)

That is evidently too complicated for practical use, so further
approximations are needed. We can take a_3 = 0 by assuming a zero
backgammon rate. If we consider only odd point matches, the
second and the third term can be put together. After these
simplifications we can write the Backgammon Skill function as

    Skill_bg(N) = 1 + a'*int(N/2) ; N=1,3,5,...   (4)

The value a' can be solved if we know the gammon rate (see my
previous messages).

If we introduce the doubling cube to the game then, following the
procedure explained above, the Skill function can be written as

    Skill_bgc(N) = 1 + b'*int(N/2) + c'*int(N/4) + d'*int(N/8) + ...; N=1,3,5,...   (5)

Note that c',d',... are also functions of N! Now we are lost.
Or are we? Let us see what "playing skills" are included
in the terms b',c',d',...

term   skills
b'     checker play skill, doubling 1->2 and partly 2->4
c'     doubling 2->4 and partly 4->8
d'     doubling 4->8 and partly 8->16
...

Let's try to approximate how frequently different doubling
situations occur in a money game. In approximation 1 we
assume continuous backgammon and efficient doubling. In a
bit more realistic approximation (appr. 2) we can assume
that 60% of the doubles are taken and the rest are dropped.
Let's take the drop point to be p=30% in both cases. We
get the following table

doubling   appr. 1     appr. 2
           frequency   frequency
1->2       1.          1.
2->4       0.3         0.18
4->8       0.09        0.03
8->16      0.03        0.006
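
The table can be reproduced, up to rounding, with the short sketch below.
It rests on an assumption that is not spelled out in the text: each further
cube turn is taken to occur with relative frequency (drop point) * (fraction
of doubles taken) compared to the previous one. The function name
doubling_frequencies is hypothetical.

```python
def doubling_frequencies(drop_point, take_rate, levels=4):
    """Relative frequency of the cube turns 1->2, 2->4, 4->8, ...

    Assumption (not stated in the post): every further cube turn is
    drop_point * take_rate times as frequent as the previous one.
    """
    step = drop_point * take_rate
    return [step ** k for k in range(levels)]

appr1 = doubling_frequencies(0.3, 1.0)   # appr. 1: every double is taken
appr2 = doubling_frequencies(0.3, 0.6)   # appr. 2: 60% of the doubles are taken

for name, f1, f2 in zip(["1->2", "2->4", "4->8", "8->16"], appr1, appr2):
    print(f"{name:6s} {f1:8.4f} {f2:8.4f}")
```

Under this reading the unrounded values are 0.027 instead of 0.03 for 8->16
in appr. 1, and 0.0324 and 0.0058 for the last two rows of appr. 2.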

Assuming that N>>8, we can estimate the ratio

    d'*int(N/8) / (c'*int(N/4)) = d'/(2*c') = 0.5*0.17 = 0.08
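
The two steps of this estimate can be checked numerically. The sketch below
assumes that the factor 0.17 is the ratio of the 4->8 and 2->4 frequencies
of appr. 2 (0.03/0.18); that origin is a guess, and the helper name
term_ratio is hypothetical.

```python
def term_ratio(freq_d, freq_c):
    """Estimate d'*int(N/8) / (c'*int(N/4)) for N >> 8 as 0.5 * d'/c'."""
    return 0.5 * freq_d / freq_c

# Step 1: int(N/8)/int(N/4) approaches 1/2 for large odd N.
# For positive N, n // k gives the same result as int(n / k).
for n in (101, 1001, 10001):
    print(n, (n // 8) / (n // 4))

# Step 2: d'/c' taken as 0.03/0.18, an assumption about where 0.17 comes from.
print(round(term_ratio(0.03, 0.18), 2))   # 0.08
```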

The ratio c'*int(N/4) / (b'*int(N/2)) is even smaller because
the term b' also includes checker play skill. Because
of these facts the Skill function can be written to a good
approximation as

    Skill_bgc(N) = 1 + a*(N-1)/2 ; N=1,3,5,7,...   (6)
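
To see what formula (6) does in practice, here is a small sketch that plugs
it into formula (1) and compares the result with Skill(N) = N. The value
A = 1.0 is purely hypothetical, since the parameter a has to be extracted
from match statistics; the function names are illustrative as well.

```python
import math

def skill_bgc(n, a):
    """Formula (6), stated for odd match lengths n = 1, 3, 5, ..."""
    if n < 1 or n % 2 == 0:
        raise ValueError("formula (6) is stated for odd match lengths only")
    return 1.0 + a * (n - 1) / 2.0

def win_probability(d, skill):
    """Formula (1) with an arbitrary Skill value."""
    return 1.0 / (10.0 ** (-d * math.sqrt(skill) / 2000.0) + 1.0)

A = 1.0  # hypothetical value; the real one must come from match statistics

for n in (1, 3, 5, 7, 9):
    fibs = win_probability(200, n)                    # Skill(N) = N
    modified = win_probability(200, skill_bgc(n, A))  # formula (6)
    print(n, round(fibs, 3), round(modified, 3))
```

Note that a = 2 turns formula (6) back into Skill(N) = N, so any a below 2
makes a given rating difference count for less in long matches than the
present formula does.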

A better approximation can be derived if we consider only
matches N=1,5,9,13,17,...

I'm afraid that there is no other way to test the Skill equation
than to try it in practice and afterwards analyze the match
statistics.

Note
----
As Jim Williams and Robert-Jan Veldhuizen have noticed,
one pointers should have their own rating, because
in one point matches there is no doubling cube skill
present and the gammon factor is probably also different
than in matches with N>1.

rintan...@my-dejanews.com

Dec 6, 1998, 3:00:00 AM

I found a small error

> If you always get two points (instead of one) from a victory,
> we know without any calculation that the Skill function can
> be written as
>
> Skill_2(N) = 1 + a'_2*int(N/2)

should be

    Skill_2(N) = 1 + a'_2*int(N/(2+e)) ; 0<e<<1

> In a case of the "n points"/victory the Skill function is

    Skill_n(N) = 1 + a'_n*int(N/(n+e))

> What is the Skill function of the Backgammon when no doubling
> cube is used? Skill function is the combination of the functions
> Skill_1, Skill_2 and Skill_3. We can write the Skill in the
> following form

    Skill_bg(N) = 1 + a_1*N + a_2*int(N/(2+e)) + a_3*int(N/(3+e));   (3)
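
The effect of the small e can be seen with a short sketch. The function name
skill_n, the coefficient value 1 and E = 1e-6 are assumptions chosen only for
illustration.

```python
def skill_n(match_len, n, a, e=0.0):
    """Skill_n(N) = 1 + a*int(N/(n+e)); e = 0 gives the uncorrected form."""
    return 1 + a * int(match_len / (n + e))

E = 1e-6  # any small positive value behaves the same for realistic N

# Columns: N, uncorrected Skill_2(N), corrected Skill_2(N), with a'_2 = 1.
for match_len in range(1, 8):
    print(match_len, skill_n(match_len, 2, 1), skill_n(match_len, 2, 1, E))
```

With n = 2 the two forms agree for every odd N and differ only when N is a
multiple of 2. In particular a 2 point match, which is decided by a single
2 point victory, now gets the same Skill value as a 1 point match. Since
formulas (4) to (6) are stated for odd N only, they are not affected.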

Everything else I think is ok.

Matti Rinta-Nikkola
