Dec 1, 1998, 3:00:00 AM

I have been studying the Backgammon rating formula used on many
Backgammon servers. I am posting the results of my thinking here;
perhaps someone will find them useful.

Best regards,

Matti Rinta-Nikkola

Backgammon rating formula

-------------------------

Many people have noticed that the FIBS rating formula (also used
on many other backgammon servers) does not work correctly for
different match lengths. This conclusion comes from studying match
statistics collected from FIBS (ref 1,2,3,4,5). I will explain here
how the rating formula could be modified to be more accurate across
match lengths. The problem has also been studied by many others
(ref 6,7).

1. FIBS rating formula

----------------------

The FIBS formula has been described in more detail elsewhere
(ref 8). The main assumption of the rating formula is that the
ratings of the players follow a Gaussian distribution. To derive
the formula for different match lengths, it has further been
presumed that the game winner always gets one point (i.e. no
gammons, backgammons or doubling cube) (ref 8)! These assumptions
lead to the match winning probability formula:

    P(D) = 1 / (10**(-D*SQRT(Skill)/2000) + 1) ;

where D     is the Elo rating difference of the players
      P     is the winning probability
      Skill is the match length.

So what is wrong with the formula above? The formula itself is
correct, but the second assumption used to derive it is wrong!
The wrong assumption leads to an erroneous Skill function. In
Backgammon the Skill function is not simply equal to the match
length.
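As a minimal sketch, the FIBS winning probability above can be evaluated like this. The function name is made up for illustration, and taking D as "own rating minus opponent's rating" is an assumed sign convention.

```python
import math


def fibs_win_probability(rating_diff, match_length):
    """P(D) = 1 / (10**(-D*sqrt(Skill)/2000) + 1), with Skill = match length.

    rating_diff is assumed to be (own rating - opponent's rating).
    """
    skill = match_length  # the FIBS choice of Skill function
    return 1.0 / (10.0 ** (-rating_diff * math.sqrt(skill) / 2000.0) + 1.0)


# The same 200 point rating difference counts for more in longer matches.
for n in (1, 5, 11):
    print(n, round(fibs_win_probability(200.0, n), 3))
```

Equal ratings give 0.5 at every match length, and P(D) + P(-D) = 1.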

2. Backgammon Skill function

----------------------------

Who wins most when you play Backgammon? If you mostly lose your
matches, then it is certainly the luckier player who wins most :-).
But if you win more, you would probably like to credit your
Backgammon playing skills. What might those skills be? Obviously
there are two: 1) checker play and 2) cube handling.

Let us try to construct the Skill function for the Backgammon
rating formula. As already noted above, in the FIBS formula the
Skill function is simply

Skill(N)= N ; where N is a match length

If we introduce the doubling cube to the game, the average points
per game will increase and the checker play skill becomes less
important for a given match length. On the other hand, the doubling
cube brings a new skill to the play: cube handling skill. Studying
the match equity table for players of different checker play skills
(ref 7), we see that the probability of winning a 1 point match is
equal to that of a 2 point match, and likewise the probabilities of
winning 3 and 4 point matches are equal. Because of that, it is
easier to construct the Skill function separately for even and odd
point matches. Here I will consider only odd point matches, i.e.
N=1,3,5,7... . For odd point matches the Skill function can be
written as

    Skill(N) = 1 + (cp + ch)*(N-1)/2 ;   N=1,3,5,7...        (1)

where cp defines the "extra" checker play skill in matches (N>1)
      ch defines the cube handling skill

o Note 1: If the cube handling skill ch=0, the Skill function
  gives the expectation value of the minimum number of games
  needed to win the match, i.e.

    Skill(N) = N/ppg(N) ;   ch=0        (2)

  where ppg(N) is the average points per game. In longer matches
  ppg(N) is near the ppg of a money game. The value of cp can be
  calculated using equations (1) and (2).

o Note 2: The cube handling and checker play skill parameters
  (ch, cp) are expressed in units of the one point match checker
  play skill.

o Note 3: The total checker play and cube handling skills in an
  N point match are 1+cp*(N-1)/2 and ch*(N-1)/2, respectively.
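Equation (1) and Note 3 can be sketched as follows. The function names are illustrative only, and the parameter values in the example are just the candidate values discussed in this post, not measured ones.

```python
def skill(match_length, cp, ch=0.0):
    """Equation (1): Skill(N) = 1 + (cp + ch)*(N-1)/2, for odd N only."""
    if match_length < 1 or match_length % 2 == 0:
        raise ValueError("equation (1) is stated for odd match lengths only")
    return 1.0 + (cp + ch) * (match_length - 1) / 2.0


def skill_components(match_length, cp, ch):
    """Note 3: (total checker play skill, total cube handling skill)."""
    half = (match_length - 1) / 2.0
    return 1.0 + cp * half, ch * half


# cp + ch = 2 reproduces the FIBS choice Skill(N) = N.
print([skill(n, cp=2.0) for n in (1, 3, 5, 7)])
# An illustrative split of a smaller total into its two parts.
print(skill_components(7, cp=0.85, ch=0.35))
```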

3. Defining the values for parameters cp and ch

-----------------------------------------------

- Checker play skill

As noted before, cp can be calculated if the cube handling skill
is equal to zero. From equations (1) and (2) we get

    cp = 2*(N/ppg(N) - 1)/(N-1) ~= 2/ppg ;   assuming N>>1     (3)

The cp value for shorter matches (N=3,5,7...) is a bit smaller
than the large-N value 2/ppg from equation (3). A better estimate
for cp is obtained if we use a smaller N, for example N=21. If
ppg=1, as assumed in the derivation of the FIBS rating formula,
we get cp=2.

A more realistic value can be obtained if we assume continuous
Backgammon and efficient doubling (the assumptions used to derive
the match equity table). In that case ppg=3.3 and we can calculate
cp from equation (3) (the cube handling skill is zero because no
cube handling errors are made). We obtain cp=0.61. A more accurate
value for cp can be obtained if the Skill function is fitted to the
match equity data, see Table 1. I have used the match equity table
calculated by Tom Keith (ref 7).

We can make another estimate for cp if we assume that JellyFish
plays perfect Backgammon (at level 5 :)). For JellyFish ppg=2.3
(ref 9), which gives cp=0.87.

The rolls method introduced by Tom Keith (ref 7) gives cp=0.84,
see Table 1. The rolls method does not give any information about
the cube handling errors made. It is quite probable that the cube
handling errors average out in the data used (there are equal
numbers of bad drops and bad takes).

We can also calculate the checker play skill from equation (3)
for a match played without the doubling cube. Assuming a 25%
gammon rate, the cubeless ppg = 0.75+2*0.25 = 1.25, which gives
cp=1.6.
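The cp estimates above all come from equation (3), which can be sketched as below. The helper name is hypothetical, and using one ppg value for every N (rather than a true ppg(N)) is a simplifying assumption.

```python
def cp_from_ppg(ppg, match_length=None):
    """Equation (3) with ch = 0.

    match_length=None uses the N>>1 limit cp = 2/ppg; otherwise
    cp = 2*(N/ppg - 1)/(N - 1) for the given N > 1.
    """
    if match_length is None:
        return 2.0 / ppg
    if match_length <= 1:
        raise ValueError("equation (3) needs N > 1")
    return 2.0 * (match_length / ppg - 1.0) / (match_length - 1.0)


for label, ppg in [("efficient doubling", 3.3),
                   ("JellyFish", 2.3),
                   ("cubeless, 25% gammons", 0.75 + 2 * 0.25)]:
    print(label, round(cp_from_ppg(ppg), 2))

# Finite match length, as suggested above with N=21.
print(round(cp_from_ppg(3.3, 21), 2))
```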

Table 1. Skill function for different methods (ch=0). In
         parentheses is the value calculated from
         equation (1) using the fitted cp value.

                    FIBS       MatEq          Rolls          JellyFish
Match length  1      1  (1)    1.00 (1.00)    1.00 (1.00)
Match length  3      3  (3)    1.54 (1.54)    1.77 (1.84)
Match length  5      5  (5)    2.07 (2.08)    2.66 (2.68)
Match length  7      7  (7)    2.62 (2.62)    3.57 (3.52)
Match length  9      9  (9)    3.13 (3.16)    4.67 (4.36)
Match length 11     11 (11)    3.69 (3.70)    5.48 (5.20)

fitted cp            2         0.54           0.84
calculated cp        2         0.61            -             0.87
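One way to fit cp to a column of Table 1 is an unweighted least-squares fit of Skill(N) - 1 against (N-1)/2, forced through Skill(1) = 1. This is only an assumed procedure: the post does not say how the "fitted cp" row was obtained, so other columns need not reproduce it exactly.

```python
def fit_cp(skill_by_length):
    """Least-squares slope of (Skill(N) - 1) versus (N-1)/2, with ch = 0."""
    numerator = 0.0
    denominator = 0.0
    for n, s in skill_by_length.items():
        x = (n - 1) / 2.0
        numerator += x * (s - 1.0)
        denominator += x * x
    if denominator == 0.0:
        raise ValueError("need at least one match length greater than 1")
    return numerator / denominator


# MatEq column of Table 1.
mateq = {1: 1.00, 3: 1.54, 5: 2.07, 7: 2.62, 9: 3.13, 11: 3.69}
print(round(fit_cp(mateq), 2))
```

For the MatEq column this gives about 0.54, in line with the table.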

- Cube handling skill

The value of the ch parameter has to be calculated from
experimental data. There is no way to determine it theoretically,
because its value depends on the cube handling and checker play
errors players make. ch depends on the checker play errors too
because it has to be expressed in the same units as cp.

However, we already know that the FIBS value cp+ch=2 is too high,
and we also have a very good estimate cp=0.85. So we must have
ch < 1.15.

One more limit can be found if someone is able to answer the
following question: let us assume that a top rated player plays
an 11 point match against an average rated player. Who gains the
advantage if they decide to play without the doubling cube?

As calculated above, cp=1.6 in a match played without the doubling
cube. If the answer to the above question is "the top rated
player", which I think is the correct answer, we can write

     P(nocube) > P(cube)
<=>  Skill(nocube) > Skill(cube)
<=>  1.6 > cp + ch
 =>  ch < 0.75      (using cp=0.85)

4. Backgammon rating system

---------------------------

Finally, I will suggest how the rating system should be implemented
on a Backgammon server.

1) I think the rating system should be simplified so that the
   rating is calculated only for odd point matches (N=1,3,5,7...).

2) Pick a value for "cp+ch". It should be in the range from 0.8
   to 2.0. On a new server I would probably start with cp+ch=1.2.
   On an old server like FIBS I think you need to ask the players
   what they think. Perhaps they don't want to change the rating
   system at all :-).

3) Design the system so that it can easily indicate if the
   parameter cp+ch has a wrong value.

4) Be conservative when changing the value of cp+ch, and don't
   change it too often.
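For point 3, one possible check (a sketch, not a tested design) is to compare predicted and observed win rates per match length. The record layout (rating difference, match length, whether the first player won) and the function names are hypothetical; b stands for cp+ch.

```python
import math
from collections import defaultdict


def win_probability(rating_diff, match_length, b):
    """P(D) with Skill(N) = 1 + b*(N-1)/2, where b = cp + ch."""
    skill = 1.0 + b * (match_length - 1) / 2.0
    return 1.0 / (10.0 ** (-rating_diff * math.sqrt(skill) / 2000.0) + 1.0)


def calibration_by_length(matches, b):
    """Return {N: (mean predicted win rate, observed win rate, count)}.

    matches: iterable of (rating_diff, match_length, first_player_won).
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for rating_diff, n, won in matches:
        entry = totals[n]
        entry[0] += win_probability(rating_diff, n, b)
        entry[1] += 1.0 if won else 0.0
        entry[2] += 1
    return {n: (pred / count, obs / count, count)
            for n, (pred, obs, count) in totals.items()}


# Made-up records, only to show the call.
sample = [(200.0, 5, True), (200.0, 5, False), (-50.0, 1, False)]
print(calibration_by_length(sample, b=1.2))
```

If the predicted and observed rates drift apart in the same direction as N grows, that would hint that cp+ch is off. As noted elsewhere in this thread, a lot of data is needed before such a gap means anything.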

References

----------

1) FIBS--Rating Formula: Different length matches by Jim Williams

http://www.bkgm.com/rgb/rgb.cgi?view+603

2) FIBS--Rating Formula: Empirical analysis by Carry Wong

http://www.bkgm.com/rgb/rgb.cgi?view+601

3) FIBS--Rating Formula One-point matches by David Montgomery

http://www.bkgm.com/rgb/rgb.cgi?view+44

4) FIBS--Rating Formula Opponent's strength by William Hill

http://www.bkgm.com/rgb/rgb.cgi?view+524

5) Match Archives Big Brother--Statistics by Peter Fankhauser

http://www.bkgm.com/rgb/rgb.cgi?view+139

6) FIBS--Rating Formula Possible adjustments by Christopher D. Yep

http://www.bkgm.com/rgb/rgb.cgi?view+597

7) FIBS--Rating Formula Different length matches by Tom Keith

http://www.bkgm.com/rgb/rgb.cgi?view+523

8) ELO ranking

http://www.netgammon.com/us/facts/elo2.htm

9) Miscellaneous Distribution of points per game by Stig Eide

http://www.bkgm.com/rgb/rgb.cgi?view+513

Dec 1, 1998, 3:00:00 AM

Matti Rinta-Nikkola wrote:

>

> 4. Backgammon rating system

> ---------------------------

> Finally I will suggest how the rating system should be implemented

> in the Backgammon server.

>

> 1) I think that the rating system should be simplified so that the

> rating is calculated only for odd point matches (N=1,3,5,7...)

> 2) Pick up a value for "cp+ch". It should be in the range from 0.8

> to 2.0. In a new server I would probably start with a value

> cp+ch=1.2. In a old server like FIBS I think you need to ask from

> the players what they think about. Perhaps they don't want to

> change the rating system at all :-).

> 3) Design the system which can indicate easily if the parameter cp+ch

> has a wrong value.

> 4) Be conservative when changing the value cp+ch and don't change it

> too often.

>

Some interesting ideas! The idea of using a match equity table

for players of unequal skill as the basis for the rating system

seems sound, but it begs the question of how the match equity

table was calculated. I did not see the generating equations

in the reference noted below.

One effect that I have noticed in my own practice matches against

jellyfish is that a disproportionate amount of the equity that

I give up occurs with bad checker play in tricky end game situations.

The presence of the cube allows a significant percentage of these

situations to be avoided because a double is declined. I find

that being cautious with the cube, declining marginal cubes, and

waiting in marginal doubling situations, actually improves my

winning percentage by avoiding many situations in which I would

be outplayed. Of course it may just be that I am a bad end game

player, but if this effect is common, it adds another dynamic

into the effect of match length and the doubling cube on skill.

As far as modifying the rating system, my vote would be to use

Skill(n) = K1 + K2*n where K1 and K2 are determined to best

match empirical data. Fixing K1 at 1 could result in a significant

change in rating spread of players with existing ratings.

Getting good empirical data is tricky though, as there are a
number of statistical biases which tend to creep in and are
difficult to filter out. Also, it takes a surprisingly large amount
of data to reduce the random error to an acceptable level.

Dec 2, 1998, 3:00:00 AM

In article <366470...@giga-net.com>,

Jim Williams <ji...@giga-net.com> wrote:

>

> Some interesting ideas! The idea of using a match equity table

> for players of unequal skill as the basis for the rating system

> seems sound, but it begs the question of how the match equity

> table was calculated. I did not see the generating equations

> in the reference noted below.

In fact I wrote a program which calculates a match equity table
for players of unequal checker play skill, and the table I got is
consistent with Tom Keith's table. I might explain later how you
can calculate that table, or perhaps someone else can do it. For
now I'm afraid you just have to believe that the table sent by
Tom Keith is correct.

The match equity table is not used as the basis for constructing
the Skill function and the rating system. It has been used to
verify the assumption that the Skill function has the form

    Skill(N) = a + b*(N-1)/2 ;   N=1,3,5,...

where the parameters a and b are constants. In fact, all the
examples I gave for determining the value of the checker play skill
parameter cp can be understood as tests of the form of the Skill
function. If someone is skeptical about the correctness of that
form, he can generate a match equity table for the case where no
doubling cube is used and test the function against the table
data :-).

>

> One effect that I have noticed in my own practice matches against

> jellyfish is that a disproportionate amount of the equity that

> I give up occurs with bad checker play in tricky end game situations.

> The presence of the cube allows a significant percentage of these

> situations to be avoided because a double is declined. I find

> that being cautions with the cube, declining marginal cubes, and

> waiting in marginal doubling situations, actually improves my

> winning percentage by avoiding many situations in which I would

> be outplayed. Of course it may just be that I am a bad end game

> player, but if this effect is common, it adds another dynamic

> into the effect of match length and the doubling cube on skill.

>

I'm not sure I understand you correctly here. Are you trying to
explain that there might be a way to play Backgammon which would
change the form of the Skill function written above? I don't
believe that is possible. The parameter b in the above equation
includes every skill present in Backgammon. Note that the value of
the b parameter has to be extracted from empirical data. Of course
it isn't necessary to split the b parameter as b=cp+ch. So why did
I want to write it that way? I did so because the checker skill
part can be calculated easily, and I thought there might be someone
who has an idea of what value the ratio ch/cp might have. If you
know the value of the ratio, you have a good estimate for the
parameter b. If you think there is a skill present in the game
which I did not mention, you can call the parameter ch "all the
other skills" and everything is in line again :-).

> As far as modifying the rating system, my vote would be to use

> Skill(n) = K1 + K2*n where K1 and K2 are determined to best

> match empirical data. Fixing K1 at 1 could result in a significant

> change in rating spread of players with existing ratings.

That's a good point. I assume that in your equation n=(N-1)/2;
N=1,3,5.... Choosing a value other than one for K1 is equivalent
to changing the number 2000 in the probability equation. I would
change the number 2000, but it's a matter of taste which one you
want to fit.
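The equivalence can be checked numerically. Since sqrt(K1 + K2*n)/2000 = sqrt(1 + (K2/K1)*n) / (2000/sqrt(K1)), a K1 different from one only rescales the constant 2000. The numbers below are arbitrary illustrations, and the function name is made up.

```python
import math


def win_probability(rating_diff, skill, scale=2000.0):
    """P(D) = 1 / (10**(-D*sqrt(Skill)/scale) + 1)."""
    return 1.0 / (10.0 ** (-rating_diff * math.sqrt(skill) / scale) + 1.0)


k1, k2 = 1.4, 0.9      # arbitrary constants in Skill(n) = K1 + K2*n
n, d = 3, 150.0        # n = (N-1)/2 for N = 7, and a rating difference

# Form A: K1 free, constant 2000 kept.
p_a = win_probability(d, k1 + k2 * n)
# Form B: K1 fixed at 1, constant 2000 replaced by 2000/sqrt(K1).
p_b = win_probability(d, 1.0 + (k2 / k1) * n, scale=2000.0 / math.sqrt(k1))

print(math.isclose(p_a, p_b))
```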

>

> Getting good empirical data is

> tricky though, as there are a number of statistical biases which

> tend to creep in and are difficult to filter out.

Sorry for my ignorance of those statistical biases. Can you mention

and explain some of them?

Matti Rinta-Nikkola

Dec 2, 1998, 3:00:00 AM

rintan...@my-dejanews.com wrote:

> Now

> I'm afraid that you just have to believe that the table send by

> Tom Keith is correct.

I'm not sure what you mean by "correct". I'm certainly willing to
accept that it is a good and well thought out approximation. My
interest is more in what the assumptions and limits of the
approximation are. If player A can win 60% of 1 point matches
against player B, what percent of 3 point matches can he win?
I don't think the answer is directly calculable. It depends on
which phases of the game the relative strengths and weaknesses
of A and B lie in. You addressed that yourself with the cp and ch
parameters, and I suspect that a more detailed analysis may reveal
even more parameters.

> I'm not sure if I understand you here correctly. Are you trying to

> explain that there might be a way to play Backgammon which would

> change the form of the Skill function written above?

I would go even farther than that and state that everyone plays

differently with different strengths and weaknesses, and therefore

everyone's skill function would be a little different. I am willing

to concede that in general people may be close enough that a pretty

accurate rating system can be built which is a quite good approximation.

I'm not sure it can be done purely analytically though, and may require

some empirical data.

> I don't believe

> that it is possible. In parameter b on above equation is included

> every skill present in Backgammon. Note that the value of the b

> parameter has to be extracted from the empirical data. Of course it

> isn't necessary to divide the b parameter b=cp+ch. So why I was

> wanting

> to write it in that way? I did so because the checker skill part

> can be calculated easily and I thought that there might be someone

> who have a idea what value the ratio ch/cp might have. If you know

> the value of the ratio you have a good estimation for parameter b.

> If you think that there is skill present in a game which I did not

> cited you can call the parameter ch as "all the other skills" and

> everything is in a line again :-).

Please forgive a contrived example, but how about a player that

is masterful in handling the cube as long as the number on top

is a 2. When the cube gets turned to 4 or 8, he chokes and makes

terrible decisions.

> > As far as modifying the rating system, my vote would be to use

> > Skill(n) = K1 + K2*n where K1 and K2 are determined to best

> > match empirical data. Fixing K1 at 1 could result in a significant

> > change in rating spread of players with existing ratings.

>

> That's a good point. I assume in your equation n=(N-1)/2; N=1,3,5....

My assumption was that n=N, but n=(N-1)/2 is algebraically
equivalent (just tweak the constants). My inclination is not
to restrict N to odd. Even if the even case is less accurate, it's
probably better than nothing. Most people play odd length matches
anyway.

> If you choose a different value than one for K1 it is equivalent

> to that if you change the number 2000 in the probability equation.

I agree with that.

> Sorry for my ignorance of those statistical biases. Can you mention

> and explain some of them?

Sure. These may or may not be real, and there probably are others

I haven't thought of.

1. Partition bias. The universe of FIBS players may partition itself

into groups with each group preferring a specific match length.

Within each group, the ratings of the players would tend

to distribute according to the rating formula for that match

length.

2. Diffusion bias. A player with a rating of 1800 is far more likely

to have a true strength of 1750 and be overrated than a

true strength of 1850 and be underrated. This is not

because any given player is more likely to be overrated than

underrated, but because there are a lot more players with

a true strength of 1750 than of 1850.

3. Diffusion magnification. Players may tend to play others of

similar rating. Therefore an 1800 rated player will tend to

play others who are overrated due to #2.

4. Match length effects. A player may tend to take 5 point matches

more seriously than one point matches and play much more

carefully (or vice versa).

5. Celebrity bias. A player who often plays while doing other things

at the same time may finally get a match with a 1950 rated

player, stop everything else, and concentrate fully on the

game.
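The diffusion bias in item 2 can be illustrated with a simple normal model. It assumes, purely for illustration, that true strengths are normally distributed and that each rating equals true strength plus independent normal noise. All numbers and the function name are hypothetical.

```python
def expected_true_strength(observed, pop_mean, pop_sd, noise_sd):
    """Mean true strength of players showing a given rating.

    Standard result for a normal population with independent normal
    rating noise: the estimate shrinks toward the population mean.
    """
    shrink = pop_sd ** 2 / (pop_sd ** 2 + noise_sd ** 2)
    return pop_mean + shrink * (observed - pop_mean)


# Hypothetical population: mean 1500, spread 150, rating noise 75.
print(expected_true_strength(1800.0, 1500.0, 150.0, 75.0))
```

With these made-up numbers an 1800 rated player has an expected true strength of about 1740, so above-average ratings tend to overstate strength, and below-average ratings understate it.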

>

> Matti Rinta-Nikkola

Dec 2, 1998, 3:00:00 AM

On 02-dec-98 15:48:15, Jim Williams wrote:

JW> rintan...@my-dejanews.com wrote:

>> Now

>> I'm afraid that you just have to believe that the table send by

>> Tom Keith is correct.

JW> I'm not sure what you mean by "correct". I'm certainly willing to

JW> accept that it is a good and well thought out approximation.

If I am well informed, this match equity table only takes into account

the probability that one player will defeat the opponent in a 1pt match.

Not very realistic I think, especially on longer matches other factors

are much more important. An example could be the skill of a player to

create gammons and to avoid them himself. People that mostly play (very)

short matches are probably not too good at this, whereas it can be very

important in somewhat longer matches.

JW> My

JW> interest is more in what the assumptions and limits of the

JW> approximation are. If player A can win 60% of 1 point matches

JW> against player B, what percent of 3 point matches can he win?

JW> I don't think the answer is directly calculable. It depends on

JW> what phases of the game that the relative strengths and weaknesses

JW> of A and B are. You addressed that youself with the cp and ch

JW> parameters, and I suspect that a more detailed analysis may reveal

JW> even more parameters.

I think in fact there might be an almost infinite number of parameters!

You just can't design *one* rating-system that gives accurate results

for matches of all lengths and players with various skills.

Every player will have specific strong and weak points, in checker play

as well as in cube handling *and* psychologically. These will work out

differently at different match lengths, and also against different

opponents.

If you would really want to make the rating system more accurate, I

think you would have to make different ratings for various match

lengths, for instance three separate categories: 1 and 2 pointers, 3 to

8 pointers and 9 pointers and up.

It would be interesting to see how the present "one rating for

everything" would divide into three different ratings. I think lots of

players would get pretty different ratings for the above mentioned

categories, for instance.

Bottom line: You just can't combine the various aspects of backgammon at

different match lengths and between players with different levels of

skill at those different aspects, into one number without making

arbitrary decisions.

--

Another problem with the present rating system that I haven't seen
mentioned in this discussion, yet which seems much more important
to me *and* much easier to solve, is the following:

In bg, the luck factor plays a very important role. That means that
ratings will go up and down, sometimes rather wildly, because every
player will meet (un)lucky streaks. This is a difference from a
game like chess. Wouldn't it be a good idea to reflect this
difference in the nature of the game in the rating system? I don't
know what would be a good way of doing this, but averaging the
rating over the last 50 results, for instance (probably with some
form of bias towards the most recent results), could make the
ratings much more accurate, it seems to me. Any ideas on this?
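One possible reading of this idea, sketched under assumptions: keep the rating after each match and publish an exponentially weighted average of the last 50 values. The window, the decay factor and the function name are all illustrative choices, not part of any existing server.

```python
def smoothed_rating(history, window=50, decay=0.95):
    """Recency-weighted average of the last `window` ratings.

    history is ordered oldest to newest. The newest rating gets
    weight 1, and each step back in time multiplies it by `decay`.
    """
    recent = history[-window:]
    if not recent:
        raise ValueError("history must contain at least one rating")
    weights = [decay ** age for age in range(len(recent) - 1, -1, -1)]
    weighted_sum = sum(w * r for w, r in zip(weights, recent))
    return weighted_sum / sum(weights)


# A rating series with a lucky streak at the end (made-up values).
ratings = [1600.0] * 45 + [1620.0, 1640.0, 1655.0, 1670.0, 1690.0]
print(round(smoothed_rating(ratings), 1), "vs latest", ratings[-1])
```

A decay close to 1 approaches a plain average of the window, while a small decay follows the latest rating closely.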

--

Zorba/Robert-Jan

Dec 3, 1998, 3:00:00 AM

Jim Williams wrote:

> rintan...@my-dejanews.com wrote:

> > Now

> > I'm afraid that you just have to believe that the table send by

> > Tom Keith is correct.

>

> I'm not sure what you mean by "correct". I'm certainly willing to

> accept that it is a good and well thought out approximation. My

> interest is more in what the assumptions and limits of the

> approximation are.

I think Tom Keith should answer this one. Anyway, I got a table
like Tom Keith's assuming a 25% gammon rate. The other assumptions
are explained in the article "How to Compute a Match Equity Table",
http://www.bkgm.com/articles/met.html. Limits of the
approximation... that would be far too long a story to start
explaining now; I might do it later. Note however that assuming a
25% gammon rate and efficient doubles leads to ppg=3.4, while in
real life Backgammon the ppg is nearer to two than three.

> If player A can win 60% of 1 point matches

> against player B, what percent of 3 point matches can he win?

> I don't think the answer is directly calculable. It depends on

> what phases of the game that the relative strengths and weaknesses

> of A and B are. You addressed that youself with the cp and ch

> parameters, and I suspect that a more detailed analysis may reveal

> even more parameters.

Yes, you are right. You can split the term as
ch = a*cp*ch' + ch', where ch' is the "real cube handling skill"
and a is a constant. You might find even more terms. What I think
is that all those skill terms (whatever they might be) in the Skill
function are, to a good approximation, directly proportional to
(N-1)/2.

> > I'm not sure if I understand you here correctly. Are you trying to

> > explain that there might be a way to play Backgammon which would

> > change the form of the Skill function written above?

>

> I would go even farther than that and state that everyone plays

> differently with different strengths and weaknesses, and therefore

> everyone's skill function would be a little different.

A Skill function cannot be calculated for individuals. It has to
be calculated for a group of persons who play against each other.
Every backgammon server and group of persons can have different
constants in their Skill function, but the form of the function
should be the same.

> I am willing

> to concede that in general people may be close enough that a pretty

> accurate rating system can be built which is a quite good approximation.

> I'm not sure it can be done purly analytically though, and may require

> some empirical data.

Surely empirical data is needed. There is no way to determine the
parameter of the Skill function theoretically, because it depends
on the errors players make.

> > I don't believe

> > that it is possible. In parameter b on above equation is included

> > every skill present in Backgammon. Note that the value of the b

> > parameter has to be extracted from the empirical data. Of course it

> > isn't necessary to divide the b parameter b=cp+ch. So why I was

> > wanting

> > to write it in that way? I did so because the checker skill part

> > can be calculated easily and I thought that there might be someone

> > who have a idea what value the ratio ch/cp might have. If you know

> > the value of the ratio you have a good estimation for parameter b.

> > If you think that there is skill present in a game which I did not

> > cited you can call the parameter ch as "all the other skills" and

> > everything is in a line again :-).

>

> Please forgive a contrived example, but how about a player that

> is masterful in handling the cube as long as the number on top

> is a 2. When the cube gets turned to 4 or 8, he chokes amd makes

> terrible decisions.

Yes, you are right! There was an error in my Skill function. For
example, in a three point match no doubling cube errors occur with
the cube turned to 8. If you want to take those errors into account
as well, you have to write the Skill function as

    Skill(N) = 1 + a*(N-1)/2 + b*int(N/4) + c*int(N/8) + .... ;  N=1,3,5...

But I'm quite sure ;-) that b*int(N/4) << a*(N-1)/2 and
c*int(N/8) << b*int(N/4). Why? Because

1) there is no pure checker play skill term in the parameters b
   and c, and I think that checker play skill is much bigger than
   cube handling skill.

2) at every match length you make more cube handling errors at
   low cube values, because those decisions occur much more
   frequently than double decisions at high cube values.

Anyway, if it is difficult to determine the value of parameter a,
it is certainly more difficult (or even impossible) to fix
parameters b and c.
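The extended Skill function can be sketched as below, truncated after the c term. The function name and the sample parameter values are illustrative only.

```python
def skill_with_cube_levels(match_length, a, b=0.0, c=0.0):
    """Skill(N) = 1 + a*(N-1)/2 + b*int(N/4) + c*int(N/8), for odd N."""
    if match_length < 1 or match_length % 2 == 0:
        raise ValueError("stated for odd match lengths only")
    n = match_length
    return 1.0 + a * (n - 1) / 2.0 + b * (n // 4) + c * (n // 8)


# In a 3 point match the b and c terms vanish, whatever their values.
print(skill_with_cube_levels(3, a=1.0, b=0.5, c=0.25))
# In a 9 point match all three terms contribute.
print(skill_with_cube_levels(9, a=1.0, b=0.5, c=0.25))
```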

>

>

> > > As far as modifying the rating system, my vote would be to use

> > > Skill(n) = K1 + K2*n where K1 and K2 are determined to best

> > > match empirical data. Fixing K1 at 1 could result in a significant

> > > change in rating spread of players with existing ratings.

> >

> > That's a good point. I assume in your equation n=(N-1)/2; N=1,3,5....

>

> My assumption was that n=N, but n=(N-1)/2 is algebraically

> equivalently (just tweek the constants). Ny inclination is not

> to restrict N to odd. Even if the even case is less accurate, its

> probably better than nothing. Most people play odd length matches

> anyway.

If you don't want to restrict N to odd, then I think it is better
to use the following Skill function

    Skill(N) = 1 + a*int(N/(2+e));

where 0<e<<1, because Skill(1) has to be equal to Skill(2), and
probably Skill(3) and Skill(4) are also quite near each other.
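A quick sketch of this variant, with e=0.01 chosen arbitrarily and a hypothetical function name. For odd N it agrees with 1 + a*(N-1)/2 as long as e is small compared with 2/(N-1).

```python
def skill_any_length(match_length, a, e=0.01):
    """Skill(N) = 1 + a*int(N/(2+e)) for any match length N >= 1."""
    if match_length < 1:
        raise ValueError("match length must be at least 1")
    return 1.0 + a * int(match_length / (2.0 + e))


# Each even length gets the same Skill as the odd length just below it.
for n in range(1, 8):
    print(n, skill_any_length(n, a=1.2))
```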

>> Sorry for my ignorance of those statistical biases. Can you mention

>> and explain some of them?

>

> Sure. These may or may not be real, and there probably are others

> I haven't thought of.

......

Thanks for the discussion of the statistical biases.

Matti Rinta-Nikkola

Dec 4, 1998, 3:00:00 AM

Robert-Jan Veldhuizen wrote:

> If I am well informed, this match equity table only takes into account

> the probability that one player will defeat the opponent in a 1pt match.

That is correct.

> Not very realistic I think, especially on longer matches other factors

> are much more important. An example could be the skill of a player to

> create gammons and to avoid them himself. People that mostly play (very)

> short matches are probably not too good at this, whereas it can be very

> important in somewhat longer matches.

By the way, there is also a gammon factor in one point matches. It
is more productive to play back games in one point matches than in
longer matches, although you will lose more gammons. These factors
can easily be taken into account in the rating system (see my
previous messages).

> I think in fact there might be an almost infinite number of parameters!

> You just can't design *one* rating-system that gives accurate results

> for matches of all lengths and players with various skills.

If the ELO assumption about the Gaussian distribution is good
enough, there is no problem designing a good rating system for the
case where the doubling cube is not used (see my previous
messages). The doubling cube, however, brings a lot of problems to
the design of the rating system. The question is how good a rating
system we can get if we add one free parameter to the Skill
function. That parameter has to be extracted from the game
statistics. Getting a good answer to this question is not easy
either... You have to try it and analyze the game statistics in
order to see how well it works.

> Every player will have specific strong and weak points, in checker play

> as well as in cube handling *and* psychologically. These will work out

> differently at different match lengths, and also against different

> opponents.

Yes, that is true. And that's why the players can be rated! You
don't need to know precisely all the skills affecting the game
result in order to create a good rating system.

> If you would really want to make the rating system more accurate, I

> think you would have to make different ratings for various match

> lengths, for instance three seperate categories: 1 and 2 pointers, 3 to

> 8 pointers and 9 pointers and up.

Yes, that's one solution. But it is more complicated to realize in practice
than what I'm suggesting.

> Bottom line: You just can't combine the various aspects of backgammon at

> different match lengths and between players with different levels of

> skill at those different aspects, into one number without making

> arbitrary decisions.

With this I (as well as Arpad Elo, in the case where no doubling cube is used)
disagree, as you know.

> Some other problem with the present rating system that I haven't seen

> mentioned in this discussion, yet seems much more important to me *and*

> much more easier to solve is the following:

>

> In bg, the luck factor plays a very important role. That means that

> ratings will go up and down, sometimes rather wildly, because every

> player will meet (un)lucky streaks. This is a difference with

> a game like chess. Wouldn't it be a good idea to reflect this difference

> in nature of the game in the ratingsystem? I don't know what would be a

> good way of doing this, but averaging the rating over the last 50

> results f.i. (with probably a form of bias for the most recent results)

> could make the ratings much more accurate it seems to me. Any ideas

> on this?

I think that a bad rating system can amplify the amplitude of the rating
fluctuations.

Matti Rinta-Nikkola

Dec 6, 1998, 3:00:00 AM

Hello everyone,

I will present here a detailed derivation of the Backgammon

Skill function.

Best regards,

Matti Rinta-Nikkola

Derivation of the Backgammon Skill function

-------------------------------------------

Assume that the skill distribution of the players follows
a Gaussian distribution and that the game winner
always gets one point. These assumptions lead to the match
winning probability formula:

$$P(D) = \frac{1}{10^{-D\sqrt{\mathrm{Skill}(N)}/2000} + 1} \qquad (1)$$

where D is the Elo rating difference of the players,
      P is the winning probability,
      Skill(N) = N,
      N is the match length.
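Equation (1) can be tried out numerically with a short Python sketch. The function names below are made up for illustration and do not come from FIBS or any other server; the code only restates equation (1) with Skill(N) = N.

```python
import math


def skill_one_point_per_game(n: int) -> int:
    """Skill(N) = N, the case where every game is worth exactly one point."""
    return n


def win_probability(d: float, skill: float) -> float:
    """Equation (1): match winning probability for rating difference d."""
    return 1.0 / (10.0 ** (-d * math.sqrt(skill) / 2000.0) + 1.0)


if __name__ == "__main__":
    # Winning chances of the player rated 100 points higher.
    for n in (1, 5, 9):
        p = win_probability(100.0, skill_one_point_per_game(n))
        print(f"N={n}: P(D=100) = {p:.4f}")
```

With D = 0 the formula gives exactly 0.5 for every match length, and for a fixed D > 0 the favourite's chances grow with Skill.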

If you always get two points (instead of one) for a victory,
we know without any calculation that the Skill function can
be written as

$$\mathrm{Skill}_2(N) = 1 + a'_2 \operatorname{int}(N/2)$$

In the case of "n points" per victory the Skill function is

$$\mathrm{Skill}_n(N) = 1 + a'_n \operatorname{int}(N/n) \qquad (2)$$

What is the Skill function of Backgammon when no doubling
cube is used? It is a combination of the functions
$\mathrm{Skill}_1$, $\mathrm{Skill}_2$ and $\mathrm{Skill}_3$. We can write it in the
following form

$$\mathrm{Skill}_{bg}(N) = 1 + a_1 N + a_2 \operatorname{int}(N/2) + a_3 \operatorname{int}(N/3) \qquad (3)$$

That is evidently too complicated for practical use, so further
approximations are needed. We can take $a_3 = 0$ by assuming a zero
backgammon rate. If we consider only odd-point matches, the
second and the third terms can be put together. After these
simplifications we can write the Backgammon Skill function as

$$\mathrm{Skill}_{bg}(N) = 1 + a' \operatorname{int}(N/2); \quad N = 1, 3, 5, \ldots \qquad (4)$$

The value a' can be solved if we know the gammon rate (see my
previous messages).

If we introduce the doubling cube into the game then, following the
procedure explained above, the Skill function can be written as

$$\mathrm{Skill}_{bgc}(N) = 1 + b' \operatorname{int}(N/2) + c' \operatorname{int}(N/4) + d' \operatorname{int}(N/8) + \ldots; \quad N = 1, 3, 5, \ldots \qquad (5)$$

Note that c', d', ... are also functions of N! Now we are lost.
Or are we? Let us see what "playing skills" are included
in the terms b', c', d', ....

term   skills
b'     checker play skill, doubling 1->2 and partly 2->4
c'     doubling 2->4 and partly 4->8
d'     doubling 4->8 and partly 8->16
...

Let's try to approximate how frequently different doubling
situations occur in a money game. In approximation 1 we
assume continuous backgammon and efficient doubling. In a
somewhat more realistic approximation (appr. 2) we assume
that 60% of the doubles are taken and the rest are dropped.
Let's take the drop point to be p = 30% in both cases. We
get the following table:

doubling   appr. 1     appr. 2
           frequency   frequency
1->2       1.          1.
2->4       0.3         0.18
4->8       0.09        0.03
8->16      0.03        0.006
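The table entries can be reproduced with a few lines of Python. This is only a sketch of one reading of the two approximations: it assumes that each further cube turn is reached with probability (take rate) x (drop point) relative to the previous one, with a take rate of 1 in appr. 1 and 0.6 in appr. 2. That rule is not spelled out in the post, and the function name is invented here.

```python
from typing import List


def doubling_frequencies(drop_point: float, take_rate: float, levels: int) -> List[float]:
    """Relative frequency of the cube turns 1->2, 2->4, 4->8, ...

    Assumption (inferred from the table, not stated in the post): each
    further cube turn occurs take_rate * drop_point times as often as
    the previous one.
    """
    step = take_rate * drop_point
    return [step ** k for k in range(levels)]


if __name__ == "__main__":
    appr1 = doubling_frequencies(drop_point=0.30, take_rate=1.0, levels=4)
    appr2 = doubling_frequencies(drop_point=0.30, take_rate=0.6, levels=4)
    for label, f1, f2 in zip(("1->2", "2->4", "4->8", "8->16"), appr1, appr2):
        print(f"{label:6s} {f1:8.4f} {f2:8.4f}")
```

Under this assumption the unrounded values are 1, 0.3, 0.09, 0.027 for appr. 1 and 1, 0.18, 0.0324, 0.005832 for appr. 2, which agree with the table after rounding.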

Assuming that N >> 8, we can estimate the ratio

$$\frac{d' \operatorname{int}(N/8)}{c' \operatorname{int}(N/4)} = \frac{d'}{2c'} = 0.5 \cdot 0.17 \approx 0.08$$

The ratio $c' \operatorname{int}(N/4) / (b' \operatorname{int}(N/2))$ is even smaller because
the term b' also includes checker play skill. Because
of these facts the Skill function can be written, to a good
approximation, as

$$\mathrm{Skill}_{bgc}(N) = 1 + a \, \frac{N-1}{2}; \quad N = 1, 3, 5, 7, \ldots \qquad (6)$$
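To see how the free parameter a in equation (6) enters the rating formula, here is a minimal Python sketch that plugs equation (6) into equation (1). The function names and the value a = 1.0 are placeholders chosen for illustration; the real value of a would have to be extracted from match statistics.

```python
import math


def skill_bgc(n: int, a: float) -> float:
    """Equation (6): Skill_bgc(N) = 1 + a*(N-1)/2 for odd match lengths N."""
    if n < 1 or n % 2 == 0:
        raise ValueError("equation (6) is stated for odd match lengths only")
    return 1.0 + a * (n - 1) / 2.0


def win_probability(d: float, n: int, a: float) -> float:
    """Equation (1) with the Skill function of equation (6)."""
    return 1.0 / (10.0 ** (-d * math.sqrt(skill_bgc(n, a)) / 2000.0) + 1.0)


if __name__ == "__main__":
    a = 1.0  # hypothetical placeholder, not a fitted value
    for n in (1, 3, 5, 7, 9):
        print(n, skill_bgc(n, a), round(win_probability(100.0, n, a), 4))
```

Note that a = 2 gives 1 + (N-1) = N, so the original Skill(N) = N is the special case a = 2 of equation (6), and any a < 2 makes long matches count for less than they do in the present formula.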

A better approximation can be derived if we consider only
matches N = 1, 5, 9, 13, 17, ....

I'm afraid that there is no other way to test the Skill equation
than to try it in practice and afterwards analyze the match
statistics.

Note

----

As Jim Williams and Robert-Jan Veldhuizen have noticed,
one-pointers should get their own rating, because
in one-point matches no doubling cube skill is
present and the gammon factor is probably also different
from that in matches with N > 1.

Dec 6, 1998, 3:00:00 AM

I found a small error:

> If you get always two points (instead of one) from victory,
> we know without any calculation that the Skill function can
> be written as
>
> $$\mathrm{Skill}_2(N) = 1 + a'_2 \operatorname{int}(N/2)$$

should be

$$\mathrm{Skill}_2(N) = 1 + a'_2 \operatorname{int}(N/(2+e)); \quad 0 < e \ll 1$$

>
> In a case of the "n points"/victory the Skill function is
>

$$\mathrm{Skill}_n(N) = 1 + a'_n \operatorname{int}(N/(n+e))$$

> What is the Skill function of the Backgammon when no doubling
> cube is used? Skill function is the combination of the functions
> $\mathrm{Skill}_1$, $\mathrm{Skill}_2$ and $\mathrm{Skill}_3$. We can write the Skill in the
> following form
>

$$\mathrm{Skill}_{bg}(N) = 1 + a_1 N + a_2 \operatorname{int}(N/(2+e)) + a_3 \operatorname{int}(N/(3+e)) \qquad (3)$$
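The effect of the small e in the correction can be checked with a short Python sketch. The helper names are invented here. For positive integers N and n and a sufficiently small e, int(N/(n+e)) equals (N-1)//n in integer arithmetic, so the term only changes when N is an exact multiple of n.

```python
def int_term(n_match: int, points_per_win: int, e: float = 1e-9) -> int:
    """int(N/(n+e)) from the corrected formula, with a small positive e."""
    return int(n_match / (points_per_win + e))


def skill_n(n_match: int, points_per_win: int, a_prime: float, e: float = 1e-9) -> float:
    """Corrected Skill_n(N) = 1 + a'_n * int(N/(n+e))."""
    return 1.0 + a_prime * int_term(n_match, points_per_win, e)


if __name__ == "__main__":
    # Compare the uncorrected int(N/2) with the corrected int(N/(2+e)).
    for n_match in range(1, 8):
        print(n_match, int(n_match / 2), int_term(n_match, 2))
```

For example, with two points per victory a 2-point match is decided by a single game. The uncorrected term int(2/2) = 1 would add a' to the Skill, while the corrected term int(2/(2+e)) = 0 leaves Skill at 1, the same as for a 1-point match.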

Everything else I think is ok.

Matti Rinta-Nikkola
