
Do Bots Help Revisited


Andrew Bokelman

Nov 8, 1998
A while back I was thinking about buying JellyFish or Snowie and asked here
whether bots can help someone improve their rating. Here is my follow-up after
buying Snowie.

After I was on FIBS for a while I got back into practice and my average was
usually around 1665. A couple of times I got up to the low 1670s but only
stayed there a short time. Then my play seemed to get worse and for a while I
was usually around 1655.

Then I got Snowie. I immediately learned why my play had gotten worse.
There was a new habit I had developed that Snowie said was wrong. After I
corrected this my average started to rise. And as I studied more it went up
more. I had one bad day where I dropped 20 points but the next day I started
to climb again. And today I hit 1686 -- higher than I have ever
been before.

I don't know if these changes are statistically significant. And there
could be luck involved. But I really think I have improved as a backgammon
player. As for how good I can get with the help of Snowie, I don't know. My
concentration is not that good and I make mistakes that even I know are
wrong. But I am happy with the level of improvement I've made so far.

Gary Wong

Nov 8, 1998
Andrew Bokelman <73457...@CompuServe.COM> writes:
> After I was on FIBS for awhile I got back into practice and my average was
> usually around 1665. A couple times I got up to the low 1670s but only
> stayed there a short time. Then my play seemed to get worse and for awhile I
> was usually around 1655.
>
> Then I got Snowie. I immediately learned why my play had gotten worse.
> There was a new habit I had developed that Snowie said was wrong. After I
> corrected this my average started to rise. And as I studied more it went up
> more. I had one bad day where I dropped 20 points but the next day I started
> to climb again. And today I hit 1686 -- higher than I have ever
> been before.
>
> I don't know if these changes are statistically significant.

Personally I would say that they are not. Search on Deja News for
articles about FIBS ratings and you will read that fluctuations of
well over 100 points and back again are not unheard of. (As an
example, Abbott plays on FIBS with an estimated ability of around
1470. Its play hasn't changed at all for several months, but over
that time its rating has reached lower than 1300 and higher than 1600
through random noise alone.) I don't have any measurements of the
accuracy of FIBS ratings, but I would guess that the standard error in
a rating is of the order of 50 points. (Loosely speaking, this means
that all else being equal, if you take a large sample of FIBS players,
you'd expect about 2/3 of them to have a rating within 50 points of
their "true" ability.)

To judge whether you are improving based on your rating is very
difficult. Several months ago I posted an article here estimating how
long it takes for a rating to change, and I concluded that the "half
life" of a FIBS rating is of the order of 200 experience points. (One
way of interpreting this is that for any sufficiently experienced
player, the last 200 experience points contribute as much to your
rating as all previous matches put together). Therefore I would be
very reluctant to compare two measurements made within, say, 400
experience points of each other, because they won't be sufficiently
independent.

When you put all of this together, I would argue that you need samples
made more than 400 experience points apart to be independent, and more
than two standard deviations (ie. 100 points) to be significant. So,
if your current rating is over 100 points higher than it was 400
experience points ago, you can be reasonably confident that you are
improving; if it's 100 points lower, that suggests you're getting
worse!

(Strictly speaking, you can't ever prove a hypothesis is _true_ with
sampled data: you can only gather data that suggests some hypothesis
seems to be false. If, over 400 experience points, your rating
increases by 100, then that's pretty strong evidence against the
hypothesis that your ability remained the same or decreased. If it
went down by 100, that tends to reject the hypothesis that you stayed
the same or improved. If it changed by less than 100 points, then
your ability could well have changed during the sample period, but not
by enough to be detected by this fairly crude test.)
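
If you want to turn that rule of thumb into something you can apply
mechanically, here is a rough Python sketch (my own formulation of the
above, nothing official -- the function name and the example numbers are
just for illustration):

def improvement_verdict(old_rating, new_rating, experience_gap):
    # Rule of thumb from above: compare ratings taken at least 400
    # experience points apart; treat a change of more than 100 rating
    # points (about two assumed standard errors) as meaningful.
    if experience_gap < 400:
        return "samples too close together to be independent"
    change = new_rating - old_rating
    if change > 100:
        return "probably improving"
    if change < -100:
        return "probably getting worse"
    return "change too small for this crude test to detect"

# e.g. 1686 now versus 1655 more than 400 experience points ago:
print(improvement_verdict(1655, 1686, 450))
# -> "change too small for this crude test to detect"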


I think a better way of determining whether you are improving is to
trust your instincts. If you can identify concepts that are
significant in a particular position that you wouldn't have recognised
a month or a year ago, then that could well indicate improvement. Or
if you have since learned why a particular play you once made was
wrong, that probably constitutes improvement too. Or take a quiz (for
instance Robertie's _Reno 1986_, or Clay's _Backgammon: Winning
Strategies_, or the online positions at Backgammon By The Bay); wait
until you've "forgotten" the answers, and take the test again. If
your score has improved by, say, 10% of the total (my very rough
estimation -- long quizzes need smaller differences to be significant;
short quizzes require more) then that probably indicates significant
improvement.

Last of all, I've learned so much from reading rec.games.backgammon
that I find it very hard to believe that anybody here is not improving
at least a little bit. If you're so good that you can read several
months' r.g.b. and not learn a thing, then get off the computer and
go out and play for money; if you're not that good, then you're
improving -- congratulations! :-)

Cheers,
Gary "I suck less than I used to" Wong.
--
Gary Wong, Department of Computer Science, University of Arizona
ga...@cs.arizona.edu http://www.cs.arizona.edu/~gary/

David Montgomery

Nov 8, 1998
In article <wtaf213...@brigantine.CS.Arizona.EDU> Gary Wong <ga...@cs.arizona.edu> writes:
>Andrew Bokelman <73457...@CompuServe.COM> writes:
>> After I was on FIBS for awhile I got back into practice and my average was
>> usually around 1665. [...] Then I got Snowie. [...] And today I hit
>>1686 -- higher than I have ever been before. I don't know if these
>>changes are statistically significant.
>
>Personally I would say that they are not. [...]

>To judge whether you are improving based on your rating is very
>difficult. [...]

>I think a better way of determining whether you are improving is to
>trust your instincts. [...] Or take a quiz [...]

If you have Snowie, there is an even better way. Use Snowie to
create an account for yourself. Play on FIBS or GG, and import
and analyze all of your matches. On the statistics window,
associate the statistics with your account.

After you have played a few matches this way, go to the Account
Manager window and look at what it says in the "Overall", "Moves"
and "Cube" panels. Pretty quickly you'll see how you do on
average.

This way, every match you play becomes like a quiz that you can
use both to improve and to objectively evaluate how well you
are playing. Since Snowie looks at your choice for every single
move, sometimes hundreds of moves per match, you get an accurate
reading much more quickly than by watching your rating go up
and down.

David Montgomery
mo...@cs.umd.edu
monty on FIBS


Murat Kalinyaprak

Nov 11, 1998
In <wtaf213...@brigantine.CS.Arizona.EDU> Gary Wong wrote:

>.... and you will read that fluctuations of well over 100
>points and back again are not unheard of. (As an example,
>Abbott plays on FIBS with an estimated ability of around
>1470. Its play hasn't changed at all for several months,
>but over that time its rating has reached lower than 1300
>and higher than 1600 through random noise alone.)

Thanks for sharing such info with us here. Could
you be any more specific about what you mean by
"random noise"? 300 points is a huge swing... Do
you know what may be the average points won/lost
per 1-point game by Abbott? At 2 points per game
it would take 150 wins/losses (not consecutively
of course) for such a swing. It would be interesting
to know the ratio of this to the total number of
games played during a period when Abbott has gone
from 1300 to 1600 or from 1600 to 1300... Given
that Abbott is a robot without emotions, good or
bad days, etc. a 300 point fluctuation in its
rating may indicate something much worse and/or
difficult to explain...

>I don't have any measurements of the accuracy of FIBS
>ratings, but I would guess that the standard error in a
>rating is of the order of 50 points. (Loosely speaking,
>this means that all else being equal, if you take a large
>sample of FIBS players, you'd expect about 2/3 of them to
>have a rating within 50 points of their "true" ability.)

Such statements bother me a little. Where do we
get the "true ability" to compare FIBS ratings
with...?

We know for example that a "kilogram" equals the
weight of one cubic decimeter of water. So, if I
want to know whether my bath-scale measures my
"true weight" accurately enough, I can resort to
that fact as a reference outside of "me and my
bath-scale"...

How do we do that with FIBS ratings...? Saying
that one's FIBS rating is within N points of
their "true ability as measured by FIBS points"
is circular...

If we start with a brand new FIBS, have Jim and
Joe sign on, have them start playing matches,
and then try to begin awarding them points based
on some formula like the one used now, we can see
that there is "a little too much" of a circular
hocus-pocus in it... I hope it's not necessary to
play out the scenario step by step to illustrate
this.

It may be that there is no better choice and that
we have to make do with whatever we can... That's
fine. But it needs to be acknowledged that this is
how things stand, as far as the FIBS rating system goes...

Let me ask a question specifically to Gary: with no
obligation to adopt or promote any other system,
do you think that a "much simpler" rating system
could achieve an accuracy/inaccuracy similar to
FIBS' (i.e. on the order of 50 points)...?

FIBS rating formula may be "beautiful", but it's
not "real". Imagine some players could break off
from the pack and reach ratings of 2800, 3400, etc.
while others dip to 720, 290, etc... I would say
that the "real winning chances" of a 290 player
against a 3400 player may be "zero". I chose such
extreme numbers to start making the point, but if
we work backwards from those, we may be able to
say the same for players rated at 720 and 2800, or
1230 and 1920...

On FIBS, I regularly see players with 700+ points
difference in their ratings play for points. I'm
of the opinion that the stage where a player may
have practically zero chance of winning would occur
much earlier than a 700+ FIBS-point difference. And
I see this as a problem with the FIBS rating system. I
think that pretending a "rosy" hypothetical world
can exist where anyone can play against any other
player without regard to ratings (i.e. because they
win/lose proportionately based on "probabilities")
is unrealistic...

There were snide remarks made in the past about my
not believing in "probabilities". It's true that
when the term is used for some figures obtained
from "circular data", I don't believe that it has
anything to do with what it should mean...

Having mentioned again ratings like 2800, 3400,
etc. one thing I still haven't figured out (and
nobody else offered opinions on it either) is why
haven't robots like JF and SW reached ratings well
past 2000 or 2100's...? They play large volumes of
matches and against players of all skill levels
indiscriminately. Assuming that top players in the
world have better things to do than playing against
those robots on game servers in order to keep their
ratings in check, 90?+% of their opponents should be
easy prey for them to generate lots of surplus wins
and keep increasing their ratings indefinitely. Why
isn't it happening...?

Anyway, before the people who have me in their kill
file complain about my writing long articles again,
I better stop for now... :)

MK

Andrew Bokelman

Nov 11, 1998
Gary,

> I think a better way of determining whether you are improving is to
> trust your instincts. If you can identify concepts that are
> significant in a particular position that you wouldn't have recognized
> a month or a year ago, then that could well indicate improvement. Or
> if you have since learned why a particular play you once made was
> wrong, that probably constitutes improvement too.

This has happened too. For example, discovering that I had developed a bad
habit of moving my back checkers up too soon. Hitting and slotting in my home
board when I could just give up one point and make another while hitting.
Breaking and running too soon in a two-way holding position. Not being bold
enough in my doubling.

Which brings me to another good thing about having a bot tutor. After I
learn the new thing it is very easy to apply it in the wrong places. So by
reviewing later matches I can see whether Snowie thinks I'm applying it
correctly.

Patti Beadles

Nov 11, 1998
In article <72bhu5$c0p$1...@news.chatlink.com>,
Murat Kalinyaprak <mu...@cyberport.net> wrote:

>Having mentioned again ratings like 2800, 3400,
>etc. one thing I still haven't figured out (and
>nobody else offered opinions on it either) is why
>haven't robots like JF and SW reached ratings well
>past 2000 or 2100's...?

Maybe because the ratings system works a lot better than you
think, and the bots reach their "true rating" and then hover
there, plus or minus the hundred or so points that one would
expect for random swings.

-Patti
--
Patti Beadles |
pat...@netcom.com/pat...@gammon.com | You are sick. It's the kind of
http://www.gammon.com/ | sick that we all like, mind you,
or just yell, "Hey, Patti!" | but it is sick.

Murat Kalinyaprak

Nov 18, 1998
In <pattibF2...@netcom.com> Patti Beadles wrote:

>In <72bhu5$c0p$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>>.... why haven't robots like JF and SW reached ratings well
>>past 2000 or 2100's...?

>Maybe because the ratings system works a lot better than you
>think, and the bots reach their "true rating"......

I have a little problem with the term "true rating" as
used in relation to FIBS (and similar) rating systems...

"True rating" by what unit of measure...?

I think that the only way we could even come close to
using such a term in a rating system would be after a
process like the following:

We take let's say 100 players and have them play, let's
again say, 100 matches against a *common opponent* who/which
would preferably be impartial and static in strength.
Robots are ideal for that and we can use any robot of
any strength (like Gary's Abbott), because we just want
to use it as a *unit of measure*...

After this, we can rate/sort those players based on the
number of matches they won against that robot like:

John rated at 92 robot units
Joe rated at 87 robot units
Jim rated at 81 robot units
Jack rated at 77 robot units
Etc...

Then, we make all those players play 100 matches against
each other and from the results we can derive some
conclusions as to the *relative probabilities* of their
winning chances against each other (i.e. devise a formula
to reflect the discovered relativity).

The most sensible way to add a new player to this bunch
then would be to make him/her play 100 matches against
the same *measuring stick* robot and base his/her initial
rating on the result of those matches. But this may be
totally impractical in the long term. So alternatives may
be to insert a new player at the midpoint of the ratings
range, or better yet at the most common rating, etc.

I would consider a similar process a required *minimum*
in order to speak about a "true rating" of any sort...

Of course, if the rating formula will take into account
factors like single-point, multi-point, cubeless, cubeful
matches, etc. then the above process should include enough
samples of each of them.

My question is whether the FIBS rating formula is based on
such a foundation containing some amount of *concrete*
(pun intended:) or built out of wet beach sand...?

MK

Murat Kalinyaprak

Nov 18, 1998
In <pattibF2...@netcom.com> Patti Beadles wrote:

>In <72bhu5$c0p$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>>... why haven't robots like JF and SW reached ratings well
>>past 2000 or 2100's...?

>Maybe because the ratings system works a lot better than you
>think, and the bots reach their "true rating" and then hover
>there, plus or minus the hundred or so points that one would
>expect for random swings.

What I would like to know is whether we are trying to
observe a result or trying to artificially create a
result...?

Why do we expect that any/all players' ratings will
*reach* a "whatever rating" and hover around it forever
after...?

Some time ago I had argued that after a certain amount
of ratings difference, the lesser rated player's winning
chances would rely on dice alone and I had received (I
believe from you) a counter-argument that the FIBS formula
was calculating (reflecting) those probabilities based
on mistakes the higher rated player is expected to make.

I'll leave the subject of "what is a mistake" for another
time but here we are talking about JF and SW, who play
based on statistics/probabilities alone and don't make the
"mistakes" that humans make. In fact, so many people have
such a high esteem of them that they are often regarded
as the ultimate judge on what are right/wrong moves, etc...

It had been claimed that perhaps only as few as 10 people
in the world can beat those robots in the long term. Of
the tens of thousands of games those robots had played,
the ones they played against each other and/or against
those 10 people must be a very very small number.

For the practical scope/purpose of this argument, those
robots don't make "mistakes" (on which the FIBS rating
formula supposedly depends). And after that many
thousand games the luck factor should certainly no
longer be a factor. Yet, they have so far failed to produce
enough surplus wins against the "*losing masses*" to get
past 2000-2100 ratings...? How can that be...?

I don't care which way the reality goes but something
doesn't add up as far as I can see. What could be some
possibilities here...?

a) JF and SW are not as good as some people make it
sound. But then, only 3-4 people at the most have ever
openly claimed in this newsgroup that they beat those
robots. The rest said they lose (and pretty badly at it).

b) The crowd on FIBS is very different than the crowd
in this newsgroup. People in this newsgroup wrongly
think that there are only 10 or so people in the world
who can beat those robots but in reality there are tens
of thousands of people on FIBS that can beat JF and SW...

c) Those robots don't play differently based on cube
ownership and are beaten by people on FIBS who play
very differently based on cube ownership. And since
cube ownership is not a factor in FIBS formula, those
poor robots are inadvertently kept from reaching their
full potential (i.e. "true") ratings... :)

d) FIBS dice is rigged to maintain the ratings/ranges
in a way to artificially validate its own formula...

e) Any other ideas...?

MK

limill...@my-dejanews.com

Nov 18, 1998
In article <72u8m1$rm8$1...@news.chatlink.com>,
mu...@cyberport.net (Murat Kalinyaprak) wrote:

>
> Why do we expect that any/all players ratings will
> *reach* a "whatever rating" and hover around it forever
> after...?
>

Because the ratings system on Fibs works properly.

>
> For the practical scope/purpose of this argument, those
> robots don't make "mistakes" (on which the FIBS rating
> formula does supposedly depend on). And after that many
> thousand games the luck factor should certainly be no
> longer a factor. Yet, they have so far failed to produce
> enough surplus wins against the "*losing masses*" to get
> past 2000-2100 ratings...? How can that be...?

You seem to be saying that the mythical perfect player
will beat all inferior players 100% of the time. If that
were the case, then yes, the perfect player's rating would
increase with no upper bound.
However, the perfect player will always lose a significant
number of matches, due to the element of luck in the game.

Using made up numbers: Let's say the perfect player can beat
your average intermediate player, rated 1700, about 75% of
the time in 5 point matches. The perfect player would gain
+2.236 rating points for every win, and lose -6.708 for every
loss. Averaging 3 wins for every loss, the perfect player's
rating will in the long run remain unchanged, at approximately
2126.75.

If you log on to fibs and type "help formula" you can see
exactly how the ratings changes are calculated.
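
For what it's worth, here is a rough Python sketch of that calculation,
using the formula as I understand it from "help formula" (I've left out
the extra multiplier that applies while a player has fewer than 400
experience points, and the function names are just mine):

import math

def win_prob(rating, opp_rating, match_length):
    # Probability that the 'rating' player wins, per the FIBS formula.
    diff = opp_rating - rating
    return 1.0 / (10.0 ** (diff * math.sqrt(match_length) / 2000.0) + 1.0)

def rating_change(rating, opp_rating, match_length, won):
    # Rating points gained (positive) or lost (negative) by this player.
    p = win_prob(rating, opp_rating, match_length)
    k = 4.0 * math.sqrt(match_length)
    return k * (1.0 - p) if won else -k * p

# The made-up numbers above: a 2126.75 player vs. a 1700 player, 5-point matches.
print(rating_change(2126.75, 1700, 5, True))    # about +2.236
print(rating_change(2126.75, 1700, 5, False))   # about -6.708
print(win_prob(2126.75, 1700, 5))               # about 0.75, so 3 wins balance 1 loss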

L.Miller

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own

Patti Beadles

Nov 18, 1998
In article <72u8m1$rm8$1...@news.chatlink.com>,
Murat Kalinyaprak <mu...@cyberport.net> wrote:
>For the practical scope/purpose of this argument, those
>robots don't make "mistakes" (on which the FIBS rating
>formula does supposedly depend on). And after that many
>thousand games the luck factor should certainly be no
>longer a factor. Yet, they have so far failed to produce
>enough surplus wins against the "*losing masses*" to get
>past 2000-2100 ratings...? How can that be...?

Because the formula works.


Let's assume for the sake of argument that every player has a skill
level to which we can assign a number. To further simplify the
argument, let's say that the skill level for an average player is
exactly 1500.0.

Let's choose a player, and give him a skill level of 1800.0. What this
skill level means is that he has a 65% chance of beating an average player in
a 3-point match, and a 71% chance of beating an average player in a
7-point match.

Our 1800 player now goes off and plays a very large number (say 10000)
of 7-point matches against an average player. He'll win around 71%
of them, and lose 29%. All in all, his rating will stay close to
1800, and his opponent's rating will stay close to 1500.

Why is that? It's because the FIBS rating system calculates what it
thinks the probability of winning a match is, based on the skill
(rating) difference of the players, and assigns points accordingly.
For example, in our hypothetical 1800 vs 1500 7-point match:

If player #1 wins:
Changes for player#1 +3.029076, new rating 1803.03
Changes for player#2 -3.029076, new rating 1496.97
If player #2 wins:
Changes for player#1 -7.553929, new rating 1792.45
Changes for player#2 +7.553929, new rating 1507.55

The underlying assumptions of the FIBS rating system are:

(a) Every player has a skill level that can be assigned a
numeric value,
(b) based on those skill levels, the probability of one
player beating another in a match of a particular
length can be determined.

If we don't take (a) as true, then the whole thing falls apart.
(b) is the tricky one, but I believe the system is fairly good
if not perfect. It's been shown, for example, that the formula
overestimates the chances of a weaker player winning a very short
match. It seems to work well for longer matches.
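
To put a few more numbers on (b): here is a rough Python check of how the
formula's prediction changes with match length for a fixed 300-point gap
(standard formula as I understand it, low-experience multiplier ignored):

import math

def favorite_win_prob(rating_gap, match_length):
    # Predicted win probability for the higher-rated player.
    return 1.0 / (10.0 ** (-rating_gap * math.sqrt(match_length) / 2000.0) + 1.0)

for n in (1, 3, 7, 25):
    print(n, round(favorite_win_prob(300, n), 3))
# about 0.59 for 1-pointers, 0.65 for 3, 0.71 for 7, 0.85 for 25 --
# which is why short matches are the weaker player's best chance.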

Remember that luck is still a factor in backgammon. I'm only an
intermediate player, but I've beaten world-class players in short
(5, 7, 9-point) matches.

-Patti
--
Patti Beadles |
pat...@netcom.com/pat...@gammon.com |
http://www.gammon.com/ | The deep end isn't a place
or just yell, "Hey, Patti!" | for dipping a toe.

Michael J Zehr

Nov 18, 1998
In article <72u8m1$rm8$1...@news.chatlink.com>,
Murat Kalinyaprak <mu...@cyberport.net> wrote:
>For the practical scope/purpose of this argument, those
>robots don't make "mistakes" (on which the FIBS rating
>formula does supposedly depend on). And after that many
>thousand games the luck factor should certainly be no
>longer a factor. Yet, they have so far failed to produce
>enough surplus wins against the "*losing masses*" to get
>past 2000-2100 ratings...? How can that be...?

> [options snipped]

f) The ratings are accurate in the sense that the "average" FIBS player
has a rating in the 1500s, and a rating difference of 500 points
accurately predicts the ratio of games SW and JF win on average.

There seems to be a misunderstanding that a "perfect" player should have
their rating arbitrarily high. If perfect play lets you win 75% of the
time in a 9 point match against the "average" player on FIBS, then using
the rating system you can determine the rating that perfect player ought
to have. It might well be 2000-2100.

The expectation that a player's rating will move towards some point and
then hover there is based on statistical and empirical data. One can
run simulations to see how a rating system behaves. (Start two players
with ratings of 1500. Assume one of them wins 60% of the time in a 7
point match. Simulate matches by using a psuedo random sequence, or
some other method of generating a 60% chance. Adjust the ratings using
the FIBS formula (which you can get from the help on FIBS). See what
happens to the ratings over the long run.)
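
For anyone who wants to try that without doing it by hand, a quick Python
sketch of the experiment might look like this (the update is the standard
FIBS formula as I understand it, without the low-experience multiplier):

import math, random

def update(r_a, r_b, match_length, a_won):
    # Return new (r_a, r_b) after one match, per the FIBS formula.
    p_a = 1.0 / (10.0 ** ((r_b - r_a) * math.sqrt(match_length) / 2000.0) + 1.0)
    k = 4.0 * math.sqrt(match_length)
    delta = k * (1.0 - p_a) if a_won else -k * p_a
    return r_a + delta, r_b - delta

random.seed(1)
r_a, r_b = 1500.0, 1500.0            # both players start at 1500
for _ in range(10000):
    a_won = random.random() < 0.60   # player A wins 60% of the 7-point matches
    r_a, r_b = update(r_a, r_b, 7, a_won)
print(round(r_a), round(r_b))
# The ratings drift apart and then hover around the gap at which the formula
# itself predicts a 60% win rate for A (a gap of roughly 130-140 points).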

Remember that although SW and JF might have vastly winning
they get far fewer points when they win than when they lose, so they
need to win more often than they lose just to maintain their high
ratings.

Regarding the comments that the ratings differences don't reflect the
assertion that only 10 or so people in the world can maintain a winning
record against SW or JF, I would say that the ratings _do_ reflect
that. After all, how many people on FIBS consistently have a higher
rating than SW or JF? Isn't it just possible that robots have the
highest ratings on FIBS because they're the best players?

-Michael J. Zehr

Murat Kalinyaprak

Nov 22, 1998
In <pattibF2...@netcom.com> Patti Beadles wrote:

>In <72u8m1$rm8$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>>For the practical scope/purpose of this argument, those
>>robots don't make "mistakes" (on which the FIBS rating
>>formula does supposedly depend on). And after that many
>>thousand games the luck factor should certainly be no
>>longer a factor. Yet, they have so far failed to produce
>>enough surplus wins against the "*losing masses*" to get
>>past 2000-2100 ratings...? How can that be...?

>Because the formula works.

Well, maybe I'll become convinced later... :)

>Let's assume for the sake of argument that every player has
>a skill level to which we can assign a number. To further
>simplify the argument, let's say that the skill level for an
>average player is exactly 1500.0.

Ok, I'll use the same figure in my examples.

>Let's choose a player, and give him a skill level of 1800.0.
>What this skill level means is that he has a 65% of beating
>an average player in a 3-point match, and a 71% chance of
>beating an average player in a 7-point match.
>Our 1800 player now goes off and plays a very large number
>(say 10000) of 7-point matches against an average player.
>He'll win around 71% of them, and lose 29%. All in all, his
>rating will stay close to 1800, and his opponent's rating
>will stay close to 1500.

Fine. Let's look at this from a different angle also.
Let's have Snowie (rated at 2089?) play each opponent
a 1-point match and only once. With about 600 points
difference (2089-1500), its winning chances would be
around 65%? for 1-point matches. If Snowie does indeed
win 65% of those matches, then we can reword your above
statement as: "Snowie can beat 65% of its opponents"
(among players on FIBS with 1500 average rating)...

If the players on FIBS represent a good sampling of all
players in the world, then we can expand that statement
to say: "Snowie can beat 65% of all players on earth".
(When talking about matches of other lengths, the "65%"
can be replaced by the appropriate figure).

If we have Snowie play another set of matches against
the same players again, and again, and again... this
ratio will not change. Therefore, we can say that 35%
of the players will beat Snowie consistently (i.e. in
the long run), which sounds much less impressive than
claims made previously in this newsgroup...

>Why is that? It's because the FIBS rating system calculates
>what it thinks the probability of winning a match is, based
>on the skill (rating) difference of the players, and assigns
>points accordingly. For example, in our hypothetical 1800 vs
>1500 7-point match: ...............

> (a) Every player has a skill level that can be assigned a
> numeric value,
> (b) based on those skill levels, the probability of one
> player beating another in a match of a paritcular
> length can be determined.

I had already made an argument about two requirements
for this to work. Only "one particular form of skill"
can be measured (i.e. single-point, multi-point, etc.
matches), not a combination of many kinds at the same
time. This is not the case with the FIBS rating system, as
you and others acknowledge also. And at least an initial
sample of players needs to be measured against a common
"unit" before they can be used to measure each other or
players. I don't know if this was done in establishing
the FIBS formula or not. It would be good if we get an
answer on this from somebody who knows...

>If we don't take (a) as true, then the whole thing falls apart.
>(b) is the tricky one, but I believe the system is fairly good
>if not perfect. It's been shown, for example, that the formula
>overestimates the chances of a weaker player winning a very short
>match. It seems to work well for longer matches.

And my argument is that such inconsistencies can add up
and should even be expected to do so in the long term.
There could possibly be other elusive elements such as
"style/strategy", for example. It has been argued in
this newsgroup that robots play unlike humans and that's
why they do better and that's why humans aspire to play
like them. So one question could be whether a 2100 rated
human and a 2100 rated robot really have the same winning
chances against a 1500 rated player (human and/or robot)
in any/all variations of backgammon?

For the moment, let's stick with the match length and
assume that a robot can do slightly better in 1-point
matches against weaker opponents and its "true" winning
chance (referring to the previous examples used above)
is 66% instead of 65%. Let's also say that half of the
30000 experience Snowie has consists of 1-point matches
(which would have had to be played against almost all weaker
players from the beginning). At about 1.5? points earned
per match, that 1% would translate to 150x1.5=225 rating
points and would put Snowie at 2325. This is only with
a mere 1% inaccuracy in calculating the probability of
winning...

Of course, it's possible that a few human players may
end up not fitting the "standard mold" and cause such
irregularities/extremes also. I'm just using robots as
likely cases because they play large amounts of matches
and arguments have been made about their being superior
and/or different than at least most humans...

And with the seemingly arbitrary round number constants
in it, the FIBS formula looks just too crude to be able
to prevent such possible or even expected irregularities.
Yet, no such irregularities are observed...

Some people had argued that inflating ratings by taking
advantage of deficiencies in FIBS ratings was possible.
If it's possible on purpose, why couldn't it be possible
for it to happen inadvertently, which I believe would be
more likely than not to happen. I just can't see how
some "crude" formula with some round/arbitrary constants
can result in the apparent stability/smoothness/neatness
in FIBS ratings and their ranges, and can't help wondering
if some other mechanism(s) are used to ensure that...

MK

Murat Kalinyaprak

Nov 22, 1998
In <72vf4c$a...@senator-bedfellow.MIT.EDU> Michael J Zehr wrote:

>In <72u8m1$rm8$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>>longer a factor. Yet, they have so far failed to produce
>>enough surplus wins against the "*losing masses*" to get
>>past 2000-2100 ratings...? How can that be...?

>There seems to be a misunderstanding that a "perfect" player
>should have their rating arbitrarily high. If perfect play
>lets you win 75% of the time in a 9 point match against the
>"average" player on FIBS, then using the rating system you
>can determine the rating that perfect player ought to have.
>It might well be 2000-2100.

This is what I don't understand. How can "*perfect*"
(or close to it) play win only 75% of the time...?

We either have to reassess the strength of those robots,
redefine "perfect", argue that the FIBS rating measures not
skill but luck, or whatever else... We can't
have it all.

>The expectation that a player's rating will move towards some
>point and then hover there is based on statistical and empirical
>data.

What data...? Where did that data come from...?

>One can run simulations to see how a rating system behaves.
>(Start two players with ratings of 1500. Assume one of them
>wins 60% of the time in a 7 point match. Simulate matches by
>using a psuedo random sequence, or some other method of
>generating a 60% chance. Adjust the ratings using the FIBS
>formula (which you can get from the help on FIBS). See what
>happens to the ratings over the long run.)

>Remember that althought SW and JF might have vastly winning
>records, they get far fewer points when they win than when
>they lose, so they need to win more often than they lose just
>to maintain their high ratings.

I think the problem with your argument is that JF/SW in
this case are not playing against a single opponent but
against a crowd of supposedly tens of thousands (50000?)
of players with an average rating (at least at the very
start) of 1500. It would take an enormous number of wins
on their part to lower that average to such a low level
that the winning wouldn't earn them much... If they won
1 match of 1-point against each and every player on FIBS
(i.e. 50000 wins), that average rating of 1500 would go
down by only 1.5? (please correct me if I'm off on this
and use a more accurate number) points... Could someone
calculate what JF/SW's rating would be by the time they'd
pulled the average rating on FIBS down by just 1.5 points...?

>Regarding the comments that the ratings differences don't
>reflect the assertion that only 10 or so people in the world
>can maintain a winning record against SW or JF, I would say
>that the ratings _do_ reflect that. After all, how many
>people on FIBS consistenly have a higher rating than SW or
>JF? Isn't it just possible that robots have the highest
>ratings on FIBS because they're the best players?

While discussing rating systems, somebody had tried to
differentiate between "rankings" and "ratings", which
wasn't applicable in that context but is in this case.
The rating difference between #1 and #2 player can be
a million points while the difference between #2 and #3
players can be ten points and they still would rank as
#1, #2 and #3...

The issue here is the closeness of those robots' ratings
to those of a good number of other players. If SW earned 30000
experience by playing 3-point matches on average,
that would mean it played against a pool of 10000 people
with an average rating of 1500. If it could consistently
beat 9990 of them, top 10 players on FIBS couldn't even
come close to making a dent towards keeping its rating
"stabilized" at around 2000-2100 (even if they were the
same people as the top 10 players in the world)...

Something just doesn't add up...

MK

Murat Kalinyaprak

Nov 22, 1998
In <72vj16$tqc$1...@nnrp1.dejanews.com> limill...@my-dejanews.com wrote:

>In <72u8m1$rm8$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>> Why do we expect that any/all players ratings will
>> *reach* a "whatever rating" and hover around it forever
>> after...?

>Because the ratings system on Fibs works properly.

I don't believe the results we observe could simply be
achieved by the current formula used by FIBS. I don't
see how that formula could prevent some players from
straying far away from the pack...

>> longer a factor. Yet, they have so far failed to produce
>> enough surplus wins against the "*losing masses*" to get
>> past 2000-2100 ratings...? How can that be...?

>You seem to be saying that the mythical perfect player
>will beat all inferior players %100 of the time. If that
>were the case, then yes, the perfect player's rating would
>increase with no upper bound.

I am not the one turning certain robots into "mythical
perfect players". I'm only making multi-edged arguments,
without any intention to prove which way they cut. There
seems to be a case where we have to decide whether we want
to eat the cake or have it...

My argument is that even a less than "perfect" player can
produce *enough* of a surplus of wins against a large mass of
opponents with a much lower average rating. It doesn't have
to be a boundless process in order for it to produce huge
differences in ratings...

>However, the perfect player will always lose a significant
>number of matches, due to the element of luck in the game.

The luck factor is supposed to even out in the long run.
But I personally don't mind hearing this comment because
it leaves room for the possibility that a certain luck
factor may be artificially maintained by FIBS dice... :)

>Using made up numbers: Let's say the perfect player can beat
>your average intermediate player, rated 1700, about 75% of
>the time in 5 point matches. The perfect player would gain
>+2.236 rating points for every win, and lose -6.708 for every
>loss. Averaging 3 wins for every loss, the perfect player's
>rating will in the long run remain unchanged, at approximately
>2126.75.

With the luck factor eliminated, why would a "perfect"
player, or even a player close to that, win only
75% of the time, regardless of whether his opponent is
rated at 1000, 500, 200, 50 or 2 points below him...?

As a side comment, when talking about certain robots if
"perfect" had to mean "75%", I would have no problem
with it... :)

MK

David Montgomery

Nov 22, 1998
MK:

Having P% chances against players of level L
does not mean you have P% against any group
whose average level is L. Playing 10 matches
against 1700-level players is different from
5 matches against 2000 combined with 5 matches
against 1400.

Saying a player will win P% of its matches
against a group of players G is not the same
thing as saying that P% of the players in
G consistently beat the player. At my club
I consistently lose about 1/3 of my matches.
But no one beats me consistently.

Perfect play could win only 75% of the time
if the other player plays well enough to
win 25% of the time. If the other player
played better, the perfect player might win
only half the time.

The rating formula prevents players from
straying far from the pack by requiring that
players win a higher and higher percent of their
matches to go to a higher level. If you can't
win with that percent, you don't go higher.

The fact that luck evens out in the end does
not mean that the better player eventually wins
all of the matches. It means that the percentage
of matches won converges arbitrarily close to the
result you would get if you played an infinite
number of matches. The weaker player keeps
winning matches, even when the luck has evened out.

You cannot eliminate the luck factor in backgammon.
It is always there. If you play many, many matches
then both players will get approximately equal
amounts of luck, but that doesn't take the luck
away.

An analogy. Let's say you played perfect (but
honest, non-prescient) blackjack. What percent
of the hands would you win? You still win only
about 1/2 the hands, even though you are playing
perfectly. If you played 10,000,000,000,000,000,000
hands, to "eliminate" the luck factor, you would
still win only about 1/2 the hands.

Me Again

Nov 22, 1998
Murat Kalinyaprak wrote:

> This is what I don't understand. How can "*perfect*"
> (or close to it) play can only win 75% of the time...?

You roll 5 - 2 and play 13-8 24-22. Your opponent rolls 5-5 and points
on both your blots. You roll any of the 9 numbers (6-6, 6-3, 3-6, 3-3,
1-1, 6-1, 1-6, 3-1, 1-3) that fail to bring in either of your hit men.
Your opponent doubles, you drop.

Did you make any mistakes? (Depending on the match score, your opponent
may have made a mistake in doubling and should have played for a gammon,
but in a money game using the Jacoby rule it's a given that this is a
double/drop.)

This is not an isolated position; there are several other 2-roll and
3-roll scenarios (usually involving doubles, something that is rolled 1
out of every 6 rolls) that with "perfect play" will result in a
double/drop.

Thus, even with "perfect play" you can (and will) lose many games,
because of the luck of the dice. Any game that has a luck factor will
be a game in which it will always be impossible to win 100% of your
games, even with perfect play.

HTH

jc

Michael J Zehr

Nov 23, 1998
In article <73a32u$snp$1...@news.chatlink.com>,
Murat Kalinyaprak <mu...@cyberport.net> wrote:

>In <72vf4c$a...@senator-bedfellow.MIT.EDU> Michael J Zehr wrote:
>
>>In <72u8m1$rm8$1...@news.chatlink.com> Murat Kalinyaprak wrote:
>
>>>longer a factor. Yet, they have so far failed to produce
>>>enough surplus wins against the "*losing masses*" to get
>>>past 2000-2100 ratings...? How can that be...?
>
>>There seems to be a misunderstanding that a "perfect" player
>>should have their rating arbitrarily high. If perfect play
>>lets you win 75% of the time in a 9 point match against the
>>"average" player on FIBS, then using the rating system you
>>can determine the rating that perfect player ought to have.
>>It might well be 2000-2100.
>
>This is what I don't understand. How can "*perfect*"
>(or close to it) play can only win 75% of the time...?

Suppose someone did write a "perfect" backgammon program, by whatever
definition you chose for perfect... and then it played itself. What do
you think its percentage of wins would be?

50% right?

So what should its win percent be against an "almost perfect" opponent?
(Feel free to define "almost perfect" however you want.) What about a
"slightly less than almost perfect" opponent?


-Michael J. Zehr


Graham Price

Nov 23, 1998

Murat Kalinyaprak <mu...@cyberport.net> wrote in article
<73a4l3$snp$2...@news.chatlink.com>...


> In <72vj16$tqc$1...@nnrp1.dejanews.com> limill...@my-dejanews.com wrote:
> With the luck factor eliminated, why would a "perfect"
> player or even a player close to that would only win
> 75% of the time, regardless of whether his opponent is
> rated at 1000, 500, 200, 50 or 2 points below him...?
>
> As a side comment, when talking about certain robots if
> "perfect" had to mean "75%", I would have no problem
> with it... :)
>
> MK
>

I always thought that backgammon was a combination of skill + luck
so it's pretty difficult to eliminate the luck factor.
If the ratio is say 50% skill and 50% luck,
then wouldn't a perfect robot's winning rate be
calculated by adding the skill factor + the luck factor?
If it approached perfect then its winning rate would be
~ .5 for skill + (somevariable) * .5 for the luck factor.
If it got an even split on luck then somevariable would at that
time be .5 and it would win 75% of its matches.
Maybe skill difference might even be a better variable because
if the opponent were weaker then skill would be more telling and
luck would have less influence whereas if the opponent were
closer in strength then skill difference would decrease and
luck would therefore increase.
Anyway one of the enjoyable (sort of) paradoxes with backgammon
is that you can play a "perfect" game and still get punished by
being backgammoned, just the way it is.
Graham


EdmondT

Nov 23, 1998
>I always thought that backgammon was a combination of skill + luck so it's
>pretty difficult to eliminate the luck factor. If the ratio is say 50% skill
>and 50% luck [...]

I think luck is MUCH less than a 50% factor. If you spend some time playing
much stronger players than you, I think you'll find this out quickly.


Edm...@aol.com

limill...@my-dejanews.com

Nov 23, 1998
In article <739vq4$k67$1...@news.chatlink.com>,
mu...@cyberport.net (Murat Kalinyaprak) wrote:

>
> Fine. Let's look at this from a different angle also.
> Let's have Snowie (rated at 2089?) play each opponent
> a 1-point match and only once. With about 600 points
> difference (2089-1500), its winning chances would be
> around 65%? for 1-point matches. If Snowie does indeed
> win 65% of those matches, then we can reword your above
> statement as: "Snowie can beat 65% of its opponents"
> (among players on FIBS with 1500 average rating)...
>
> If the players on FIBS represent a good sampling of all
> players in the world, then we can expand that statement
> to say: "Snowie can beat 65% of all players on earth".
> (When talking about matches of other lengths, the "65%"
> can be replaced by the appropriate figure).
>
> If we have Snowie play another set of matches against
> the same players again, and again, and again... this
> ratio will not change. Therefore, we can say that 35%
> of the players will beat Snowie consistently (i.e. in
> the long run), which sounds much less impressive than
> claims made previously in this newgroup...

I honestly can't tell, are you simply trolling this newsgroup?

If not, I can tell you that the two statements
"Snowie beats a 1500 player 65% of the time" and
"Snowie can beat 65% of all players" are not equivalent.


> I had already made an argument about two requirements
> for this to work. Only "one particular form of skill"
> can be measured (i.e. single-point, multi-point, etc.
> matches), not a combination of many kinds at the same
> time. This is not the case with FIBS rating system, as
> you and others acknowledge also. And at least an initial
> sample of players need to be measured against a common
> "unit" before they can be used to measure each other or
> players. I don't know if this was done in establishing
> the FIBS formula or not. It would be good if we get an
> answer on this from somebody who knows...
>

The ELO formula is based entirely on basic probability
theory and not on empirical data. Empirical data is highly
error prone and impossible to generalize into a formula.

The derivation of the formula is loosely explained at
the netgammon site:

http://ibs.nordnet.fr/netgammon/elobis_usa.html

>
> For the moment, let's stick with the match length and
> assume that a robot can do slightly better in 1-point
> matches against weaker opponents and it's "true" winning
> chances (referring to the previous examples used above)
> is 66% instead of 65%. Let's also say that half of the
> 30000 experience Snowie has consists of 1-point matches
> (which would had to be played against almost all weaker
> players from the beginning). At about 1.5? points earned
> per match, that 1% would translate to 150x1.5=225 rating
> points and would put Snowie at 2325. This is only with
> a mere 1% inaccuracy in calculating the probability of
> winning...

Your math is wrong. A player who wins 65% of the time against
1500-rated opponents will have a rating of 2037; a player
who wins 66% of the time against the same opponents will have
a rating of 2076. Given the known error rate in the formula,
a 1% difference in skill is in practice difficult to observe.
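
Here is the arithmetic behind those two numbers, assuming 1-point matches
and the standard formula (a small Python check; the function name is mine):

import math

def equilibrium_rating(opp_rating, win_prob, match_length=1):
    # Rating at which the formula predicts exactly win_prob against
    # opponents rated opp_rating, so expected gains and losses balance.
    return opp_rating + 2000.0 * math.log10(win_prob / (1.0 - win_prob)) / math.sqrt(match_length)

print(equilibrium_rating(1500, 0.65))   # about 2037.7
print(equilibrium_rating(1500, 0.66))   # about 2076.1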

>
> Of course, it's possible that a few human players may
> end up not fitting the "standard mold" and cause such
> irregularities/extremes also. I'm just using robots as
> likely cases because they play large amounts of matches
> and arguments have been made about their being superior
> and/or different than at least most humans...
>
> And with the seemingly arbitrary round number constants
> in it, the FIBS formula looks just too crude to be able
> to prevent such possible or even expected irregularities.
> Yet, no such irregularities are observed...
>

You're right that the constants are arbitrary, since they
could be changed to anything and the formula would still
work. However, the range of the ratings and magnitude of
the changes would be different.

I don't know what you mean by "expected irregularities".

> Some people had argued that inflating ratings by taking
> advantage of deficiencies in FIBS ratings was possible.
> If it's possible on purpose, why couldn't it be possible
> for it to happen inadvertently, which I believe would be
> more likely than not to happen. I just can't see how a
> some "crude" formula with some round/arbitrary constants
> can result in the apparent stability/smoothness/neatness
> in FIBS ratings and their ranges and can't help wonder
> if some other mecahnism/s are used to ensure that...
>

Your argument translates to, "I don't understand Fibs'
ratings formula, therefore Fibs is rigged".

limill...@my-dejanews.com

Nov 24, 1998
In article <73a32u$snp$1...@news.chatlink.com>,
mu...@cyberport.net (Murat Kalinyaprak) wrote:

>
> This is what I don't understand. How can "*perfect*"
> (or close to it) play can only win 75% of the time...?
>

"Perfect" in backgammon, means, I believe, playing every
move and making every cube decision such that they maximize ones
chance of winning the match.
Perfection does not imply winning every match.

It seems perfectly reasonable to me that a perfect strategy would
beat intermediate opponents 75% of the time in 5 point matches.

> We either have to reassess the stregth of those robots,

Computer programs aren't perfect, I'll give you that if
it's what you really wanted to hear.


> I think the problem with your argumant is that JF/SW in
> this case are not playing against a single opponent but
> against a crowd of claimedly tens of thousands (50000?)
> of players with an average rating (at least at the very
> start) of 1500. It would take an enormous amount of wins
> on their part to lower that average to such a low level
> that the winning wouldn't earn them much... If they won
> 1 match of 1-point against each and every player on FIBS
> (i.e. 50000 wins), that average rating of 1500 would go
> down by only 1.5? (please correct me if I'm off on this
> and use a more accurate number) points... Could someone
> calculate what JF/SW rating would be by the time they'd
> pull the average rating on FIBS by just 1.5 points...?
>

I can't for the life of me see what your point is here.
You seem to enjoy deflecting an argument by re-phrasing it
in incomprehensible terms. (which is what leads me to
believe that you're very cleverly trolling the newsgroup)

>
> While discussing rating systems, somebody had tried to
> differentiate between "rankings" and "ratings", which
> wasn't applicable in that context but is in this case.
> The rating difference between #1 and #2 player can be
> a million points while the difference between #2 and #3
> players can be ten points and they still would rank as
> #1, #2 and #3...
>
> The issue here is the closeness of those robots ratings
> to a good number of other players. If SW earned 30000
> experience by playing 3 point matches on the average,
> that would mean it played against a pool of 10000 people
> with an average rating of 1500. If it could consistently
> beat 9990 of them, top 10 players on FIBS couldn't even
> come close to making a dent towards keeping its rating
> "stabilized" at around 2000-2100 (even if they were the
> same people as the top 10 players in the world)...

You seem to have forgotten that the formula on FIBS works
properly. Snowie's rating will fluctuate just like everyone
else's, irrespective of its experience (once its experience
is greater than 400). There is no conspiracy needed by the
masses to "make a dent" in its rating.

Since you prefer hand-waving arguments to those based on fact,
here comes mine. Don't look at things in the big picture; look at
them on the microscopic level. A good player does not win all of the
time, right? Why? Because of luck. When a good player beats
a bad player, he gets a modest reward. When a good player loses
to a bad player, he suffers a big loss. It all evens out
in the end.

>
> Something just doesn't add up...
>

You can say that again. And I'm sure you will.

limill...@my-dejanews.com

Nov 24, 1998
In article <73a4l3$snp$2...@news.chatlink.com>,
mu...@cyberport.net (Murat Kalinyaprak) wrote:

>
> With the luck factor eliminated, why would a "perfect"

Wow! You've eliminated luck from backgammon?!!?

> player or even a player close to that would only win
> 75% of the time, regardless of whether his opponent is
> rated at 1000, 500, 200, 50 or 2 points below him...?

Please tell me how often you would expect a perfect player
to beat a player rated 2 points below her.

>
> As a side comment, when talking about certain robots if
> "perfect" had to mean "75%", I would have no problem
> with it... :)
>

Me neither, 75% is damn good.

Murphy McKalin

Nov 25, 1998
In <73alip$8...@senator-bedfellow.MIT.EDU> Michael J Zehr wrote:

>In <73a32u$snp$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>>This is what I don't understand. How can "*perfect*"
>>(or close to it) play can only win 75% of the time...?

>Suppose someone did write a "perfect" backgammon program, by
>whatever definition you chose for perfect... and then it played
>itself. What do you think it's percentage of wins would be?
>50% right?

Yes, assuming there is no "luck factor"...

>So what should its win percent be against an "almost perfect"
>opponent? (Feel free to define "almost perfect" however you
>want.) What about an "slightly less than almost perfect"
>opponent?)

100%

BTW: notice that I wasn't the one who used the
term "perfect" first. I carried it over from
the article I was responding to and used it
within quotation marks ever since...

MK

Murphy McKalin

Nov 25, 1998
In <73ap54$g...@krackle.cs.umd.edu> David Montgomery wrote:

>MK:

>Having P% chances against players of level L
>does not mean you have P% against any group
>whose average level is L. Playing 10 matches
>against 1700-level players is different than
>5 matches aginst 2000 combined with 5 matches
>against 1400.

What you say could be possible only if the FIBS
formula was lop-sided (i.e. used the "ratings"
themselves in some fashion). But it only uses
the difference between two ratings...

Just to avoid any calculation errors I may make,
I just logged on to FIBS and was lucky enough to
spot 3 players with ratings of 1965, 1565 and
1766 (close enough) all at once. The on-screen
calculator showed that my winning chances against
them were 43.49%, 54.95% and 49.18% respectively.
If I adjust the last one for 1765, I get 49.21%
while the average of the first two is 49.22%...

>Saying a player will win P% of its matches
>against a group of players G is not the same
>thing as saying that P% of the players in
>G consistently beat the player. At my club
>I consistenly lose about 1/3 of my matches.
>But no one beats me consistently.

Ok, I'll give in on this one. What I said was true
for one round but not necessarily enough to say
"consistently" (unless the same players repeated
the same performance each and every match, which
is possible but very unlikely/unrealistic).

However, in order to say one won against another,
it's enough that one wins 51% of the time, which
is quite a bit lower than the 65% used in the examples.
So, with the figures that were used as examples,
the number of people who would win consistently
would be much less than 35% but much more than 10.
I'm not sure if this is something that can even be
truly calculated with the variables at hand...?

>Perfect play could win only 75% of the time
>if the other player plays well enough to
>win 25% of the time. If the other player
>played better, the perfect player might win
>only half the time.

If I have to accept this as true, then I would have
to argue that there is no such thing as "perfect"
or even anything close to it in bg. 75% is just too
far from it... Given this, the FIBS formula can't
be claimed to rate "skill" either...

>The rating formula prevents players from
>straying far from the pack by requiring that
>players win a higher and higher percent of their
>matches to go to a higher level. If you can't
>win with that percent, you don't go higher.

So...? Their ratings will go up in ever smaller
increments (i.e. slower) but what would prevent
them from going much higher? Imagine that a Martian
with a potential rating of 3000 lands on earth
and joins FIBS. Are you guys saying that even
after 20000, 50000, 100000 matches he will never
get past achieving a rating of 2000-2100...?

>You cannot eliminate the luck factor in backgammon.
>It is always there. If you play many, many matches
>then both players will get approximately equal
>amounts of luck, but that doesn't take the luck away.

I don't know about others but to clarify things
just speaking for myself, when I say "eliminating
the luck (factor)" I mean "*equal enough* luck"
for both/all players...

>An analogy. Let's say you played perfect (but
>honest, non-prescient) blackjack. What percent
>of the hands would you win? You still win only
>about 1/2 the hands, even though you are playing
>perfectly. If you played 10,000,000,000,000,000,000
>hands, to "eliminate" the luck factor, you would
>still win only about 1/2 the hands.

I barely know blackjack but I agree that what you
say would be true between equal players in bg. If
the players are unequal and the luck is equal, then
the better player can generate a surplus of wins
without limit. When awarding points as is done now
with the FIBS formula, the points earned may get
increasingly small but should never reach zero
and stop...

I understand the argument that within the FIBS
rating scheme any player will eventually settle
at around a certain rating. What I'm arguing is
that any player claimed to be one of the top 10
players in the world (human or robot) would break
away from the pack by a larger gap before reaching
their so-called "true rating"...

MK

Patti Beadles

unread,
Nov 25, 1998, 3:00:00 AM11/25/98
to
In article <73gedb$l6s$1...@news.chatlink.com>,

Murphy McKalin <mu...@cyberport.net> wrote:
>>So what should its win percent be against an "almost perfect"
>>opponent? (Feel free to define "almost perfect" however you
>>want.) What about a "slightly less than almost perfect"
>>opponent?

>100%

No way. There will always be some luck involved.

For example, Perfect Player opens with 51 and plays 13/8 24/23, the
commonly accepted best move.

Total Idiot rolls 55 and plays 8/3(2) 6/1(2)*. PP now dances, TI
rolls 64 and plays 8/2* 6/2. PP continues to dance while TI rolls
just the right numbers to close him out and bear off safely.

PP played his single move flawlessly, but TI got lucky.

-Patti
--
Patti Beadles | Not just your average purple-haired
pat...@netcom.com/pat...@gammon.com | degenerate gambling adrenaline
http://www.gammon.com/ | junkie software geek leatherbyke
or just yell, "Hey, Patti!" | nethead biker.

Gary Wong

unread,
Nov 25, 1998, 3:00:00 AM11/25/98
to
I had been trying to avoid writing any more about FIBS ratings, because I
don't think I have anything else to contribute. Here is one last post all
the same. Apologies to everybody who is sick of this stuff :-)

mu...@cyberport.net (Murphy McKalin) writes:
> In <73ap54$g...@krackle.cs.umd.edu> David Montgomery wrote:
> >Having P% chances against players of level L
> >does not mean you have P% against any group
> >whose average level is L. Playing 10 matches
> >against 1700-level players is different than
> >5 matches against 2000 combined with 5 matches
> >against 1400.
>

> Just to avoid any calculation errors I may make,
> I just logged on to FIBS and was lucky enough to
> spot 3 players with ratings of 1965, 1565 and
> 1766 (close enough) all at once. The on-screen
> calculator showed that my winning chances against
> them were 43.49%, 54.95% and 49.18% respectively.
> If I adjust the last one for 1765, I get 49.21%
> while the average of first two is 49.22%...

Unfortunately that's just a special case where the total probability
IS roughly the average of the three parts (because your rating is very
close to the median of a symmetric distribution). David is right: the
probability against the average rating is not necessarily the same as
the average probability against all ratings. A counterexample:

- suppose I am only rated at 1165, and play 9-point matches against the
three players you found (1565, 1765 and 1965);

- my probabilities of winning against the three are 20.1%, 11.2% and
6.0% respectively;

- my average probability of winning against the 1565 and 1965 players
is 13.1%, but against the 1765 player is only 11.2%. They are NOT
the same.
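
The same formula makes this easy to check; a minimal Python sketch
(assuming the usual FIBS probability formula and the 9-point matches
used above):

import math

def win_prob(my_rating, opp_rating, match_length):
    diff = opp_rating - my_rating
    return 1.0 / (10.0 ** (diff * math.sqrt(match_length) / 2000.0) + 1.0)

p1565 = win_prob(1165, 1565, 9)    # ~0.201
p1765 = win_prob(1165, 1765, 9)    # ~0.112
p1965 = win_prob(1165, 1965, 9)    # ~0.060
print((p1565 + p1965) / 2, p1765)  # ~0.131 vs ~0.112 -- not the same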

> >Perfect play could win only 75% of the time
> >if the other player plays well enough to
> >win 25% of the time. If the other player
> >played better, the perfect player might win
> >only half the time.
>
> If I have to accept this as true, then I would have
> to argue that there is no such thing as "perfect"
> or even anything close to it in bg. 75% is just too
> far from it...

There is such a thing as perfect -- perfection is never making any
mistakes. (A precise definition of a perfect strategy is one that
maximises your "security level" -- ie. a maximin strategy, one that
maximises your minimum expected gain across all possible opponents.
Since the rules in backgammon are symmetric (as opposed to games like
blackjack, where the dealer follows different rules than the players)
and backgammon is a zero-sum game, this maximum security level is
zero.)

Perhaps one issue that is causing confusion is that the idea of
perfection in backgammon is somewhat abstract. (This is because we
haven't reached perfection, and we don't always know what a mistake
is.) If a concrete example would clarify things, consider Hugh
Sconyer's programs which play all bearoff positions (though the
publicly available versions only play as many positions as will fit
on the CD-ROMs), and every position in Hyper-Backgammon (essentially
backgammon played with three chequers per player) perfectly for money.
These players are PERFECT. No mistakes. Maximum security level,
etc. etc. Yet they cannot win every game. The probability of them
winning depends on the position, and the opponent. Imagine Hugh was
somehow able to extend his exhaustive search to include every
backgammon position -- this would be the perfect player we're talking
about. And it could not win every match, either. Even against an
intermediate player like me it would probably only win about 2/3 of
the games; it could win 75% or 90% or even more of the matches, as long
as the matches were long enough.

> Given this, the FIBS formula can't be claimed to
> rate "skill" either...

Yes, it can. Skill is the ability to play without making mistakes.
The more mistakes you make, the fewer matches you expect to win. If
both players play without making any mistakes, then they each expect
to win 50% of the matches. If only one player makes mistakes, then he
expects to win less than 50%. The more (and costlier) mistakes he
makes, the fewer matches he expects to win. You can view FIBS ratings
as measuring skill, or the ability to play without making mistakes, or
the rate of matches won -- they are all equivalent.

> >The rating formula prevents players from
> >straying far from the pack by requiring that
> >players win a higher and higher percent of their
> >matches to go to a higher level. If you can't
> >win with that percent, you don't go higher.
>
> So...? Their ratings will go up in ever smaller
> increments (i.e. slower) but what would prevent
> them from going much higher? Imagine a Martian
> with a potential rating of 3000 lands on earth
> and joins FIBS. Are you guys saying that even
> after 20000, 50000, 100000 matches he will never
> get past achieving a rating of 2000-2100...?

For one thing, it is impossible to have a potential rating of 3000
(without cheating). My guess is that the best humans and computers in
the world today make mistakes which would cost them at most an
expected 0.4 points per game for money against a perfect player
(that's including chequer play and cube decisions). This is only
worth about 200 FIBS rating points. If we assume that the best
current players could consistently maintain a rating of 2100 without
cheating (which is very generous), then even a perfect player would
have difficulty remaining above 2300. In truth it's very likely to be
lower. The other players on FIBS simply do not make enough mistakes
for anybody to be consistently rated higher than that, no matter how
good they are.

In general I believe that 1000 matches is sufficient to "reach" a
rating, regardless of your previous rating and experience: by that I
mean that after 1000 matches, the bias from your old rating will be
insignificant compared to random fluctuations. (Part of the justification
is given in an old article at <http://www.bkgm.com/rgb/rgb.cgi?view+471>.)
Therefore I claim that if your perfect Martian really did deserve a
rating of 2300, I'm sure it could reach it within 1000 matches (in
fact since ratings change more quickly for new players, the number
would be significantly less).

> >An analogy. Let's say you played perfect (but
> >honest, non-prescient) blackjack. What percent
> >of the hands would you win? You still win only
> >about 1/2 the hands, even though you are playing
> >perfectly. If you played 10,000,000,000,000,000,000
> >hands, to "eliminate" the luck factor, you would
> >still win only about 1/2 the hands.
>
> I barely know blackjack but I agree that what you
> say would be true between equal players in bg. If
> the players are unequal and the luck is equal, then
> the better player can generate a surplus of wins
> without limit. When awarding points as is done now
> with the FIBS formula, the points earned may get
> increasingly small but should never reach zero
> and stop...

It is perfectly possible for the sum of an infinite series to remain
below some limit, even though the individual terms never "reach zero
and stop". Add 1/2 + 1/4 + 1/8 + 1/16 + ... for instance; you can
come arbitrarily close to 1, but never exceed it.
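
The same point in a few lines of Python: every term is positive, yet
the running total never passes 1.

total = 0.0
for k in range(1, 21):
    total += 0.5 ** k          # 1/2 + 1/4 + 1/8 + ...
print(total)                   # 0.99999..., arbitrarily close to 1, never above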

But you don't even need this mechanism to show that a FIBS rating will
not increase without bound. The points earned are only half the
story, you have to consider the points _lost_ as well! Suppose you
are much better than me, and you win 2/3 of the games (this is what is
expected to happen if you are rated at 1800 and I am rated at 1200,
for instance). If we played for money, then yes, you would expect to
generate a "surplus of wins" (money) without limit. But on FIBS, 2/3
of our games will result in a win to you (you gain 1.33 rating points,
and I lose 1.33); the other 1/3 will result in a win to me (I gain
2.67 points, and you lose 2.67). If we play 300 1-point matches, then
you expect to win 200 of them for a gain of 267 points; but you lose
the other 100 which also costs you 267 points. In the long run, you
don't expect to change at all! A "surplus of wins without limit"
(ie. winning more than 50% of the matches) does NOT imply a surplus of
RATING POINTS without limit. To maintain a rating over 1800, you
would have to consistently win more than 2/3 of the games against me.
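
Here is that bookkeeping as a small Python sketch, assuming the usual
FIBS update rule (the winner gains 4*sqrt(n) times the probability
that the winner was expected to lose):

import math

def win_prob(my_rating, opp_rating, n):
    return 1.0 / (10.0 ** ((opp_rating - my_rating) * math.sqrt(n) / 2000.0) + 1.0)

n = 1
p_fav = win_prob(1800, 1200, n)                  # ~2/3
gain_per_win  = 4 * math.sqrt(n) * (1 - p_fav)   # ~1.33
loss_per_loss = 4 * math.sqrt(n) * p_fav         # ~2.67
matches = 300
expected = matches * (p_fav * gain_per_win - (1 - p_fav) * loss_per_loss)
print(gain_per_win, loss_per_loss, expected)     # ~1.33, ~2.67, ~0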

> I understand the argument that within the FIBS
> rating scheme any player will eventually settle
> at around a certain rating. What I'm arguing is
> that any player claimed to be one of the top 10
> players in the world (human or robot) would break
> away from the pack by a larger gap before reaching
> their so-called "true rating"...

But right behind the top 10 players in the world are another 100 that
are very nearly as good as them. If there WERE a group of 10 players
on FIBS who were far better than anybody else then we would expect a
gap between their ratings and those of all other players; but there
are not. In any case, you're talking about the distribution of the
population, not the ratings mechanism. Just because you're one of the
top 10 players in the world doesn't mean a different set of rules
apply to you; the same reasoning in the example above (you can't
expect to rise above 1800 no matter how much you play me, if you only
win 2/3 of the games) applies to top 10 players, just like it does to
everybody else.

Cheers,
Gary.

Murat Kalinyaprak

unread,
Nov 26, 1998, 3:00:00 AM11/26/98
to
limill...@my-dejanews.com wrote:

><739vq4$k67$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>> Fine. Let's look at this from a different angle also.
>> Let's have Snowie (rated at 2089?) play each opponent
>> a 1-point match and only once. With about 600 points
>> difference (2089-1500), its winning chances would be
>> around 65%? for 1-point matches. If Snowie does indeed
>> win 65% of those matches, then we can reword your above
>> statement as: "Snowie can beat 65% of its opponents"
>> (among players on FIBS with 1500 average rating)...

>> If the players on FIBS represent a good sampling of all
>> players in the world, then we can expand that statement
>> to say: "Snowie can beat 65% of all players on earth".
>> (When talking about matches of other lengths, the "65%"
>> can be replaced by the appropriate figure).

>> If we have Snowie play another set of matches against
>> the same players again, and again, and again... this
>> ratio will not change. Therefore, we can say that 35%
>> of the players will beat Snowie consistently (i.e. in
>> the long run), which sounds much less impressive than
>> claims made previously in this newsgroup...

> I honestly can't tell, are you simply trolling this newsgroup?

No.

> If not, I can tell you that the two statements
> "Snowie beats a 1500 player 65% of the time" and
> "Snowie can beat 65% of all players" are not equivalent.

It could be so, but not necessarily. The wins/losses
can be distributed in a way that it could beat more than
65% or less than 65% of players. (Let's also keep in mind
that 51% is enough for winning).

>> I had already made an argument about two requirements
>> for this to work. Only "one particular form of skill"
>> can be measured (i.e. single-point, multi-point, etc.
>> matches), not a combination of many kinds at the same
>> time. This is not the case with FIBS rating system, as
>> you and others acknowledge also. And at least an initial
>> sample of players need to be measured against a common
>> "unit" before they can be used to measure each other or
>> players. I don't know if this was done in establishing
>> the FIBS formula or not. It would be good if we get an
>> answer on this from somebody who knows...

> The ELO formula is based entirely on basic probability
> theory and not on empirical data. Empirical data is highly
> error prone and impossible to generalize into a formula.

After questioning whether I'm trolling, are these empty
statements all you can offer...? We don't need data to
know the probability of a coin or a die landing on one
of its faces because we know a coin has two faces and
a die has six faces. If we want to know the probability
of getting snow on Christmas day however, we would need
statistical data. Error proneness is not an issue since
there are no other ways of predicting probabilities in
such cases. And yes, what is observed from data can be
generalized into a formula, although it may have to be
much more complicated than a crudely invented formula,
depending on how much accuracy would be desired. Such a
formula can also be further fine-tuned based on future
accumulations of data without being circular. Unless the
FIBS formula is based on some real statistical data, it
would be nothing more than a hocus-pocus invention. Why
is this so hard to accept for you guys? Are you a cult
or something...?

> The derivation of the formula is loosely explained at
> the netgammon site:
> http://ibs.nordnet.fr/netgammon/elobis_usa.html

I know what the formula is. What I would like to know is
whose bright idea it was to multiply by the square-root of
match length, divide by 2000, etc. and based on what...?
Is this too much to ask...?

>> For the moment, let's stick with the match length and
>> assume that a robot can do slightly better in 1-point
>> matches against weaker opponents and its "true" winning
>> chances (referring to the previous examples used above)
>> is 66% instead of 65%. Let's also say that half of the
>> 30000 experience Snowie has consists of 1-point matches
>> (which would had to be played against almost all weaker
>> players from the beginning). At about 1.5? points earned
>> per match, that 1% would translate to 150x1.5=225 rating
>> points and would put Snowie at 2325. This is only with
>> a mere 1% inaccuracy in calculating the probability of
>> winning...

> Your math is wrong. A player who wins 65% of the time over
> 1500 rated opponents will have a rating of 2037, a player
> who wins 66% of the time over the same opponents will have
> a rating of 2076. Given the known error rate in the formula,
> a 1% difference in skill is in practice difficult to observe.

Based on SW's rating and experience being 2100 and 30000
and the assumption that 15000 of those consist of 1-point
matches played against 1500 rated opponents, at the start
it would earn 1.34 points per win. By the time it would
reach a rating of 2325, its earnings per match would drop
to 1.12 points. But if I used a more accurate points-per-
match figure than the 1.5 I had used, then it wouldn't
reach a point where it would drop to 1.12. So let me just
approximate it to 1.15 and replace 1.5 with the average
of 1.34 and 1.15 = 1.24. In that case 150 extra wins would
result in 186 points that would raise its rating to about
2286...

BTW, I'm not talking about any error rate due to the FIBS
formula itself. Even if the FIBS formula were based on some
statistical data (which it seems not to be), a player who
would later deviate from the statistics by a mere 1% would
cause "visible" enough irregularities/extremes in ratings.

>> Of course, it's possible that a few human players may
>> end up not fitting the "standard mold" and cause such
>> irregularities/extremes also. I'm just using robots as
>> likely cases because they play large amounts of matches
>> and arguments have been made about their being superior
>> and/or different than at least most humans...

>> And with the seemingly arbitrary round number constants
>> in it, the FIBS formula looks just too crude to be able
>> to prevent such possible or even expected irregularities.
>> Yet, no such irregularities are observed...

> You're right that the constants are arbitrary, since they
> could be changed to anything and the formula would still
> work. However, the range of the ratings and magnitude of
> the changes would be different.

Yes, simply replacing them wouldn't accomplish any more
than what you described. The fact is, the FIBS formula just
doesn't have enough buttons and knobs to fine-tune it
against possible irregularities that may be caused by
external factors. Beyond that, any error rate coming
from the formula itself is apt to be compounded when it's
applied recursively and in a relative manner. If the FIBS
formula is too sacred to be touched, then maybe we could
redefine what "works/doesn't work" means...

>> Some people had argued that inflating ratings by taking
>> advantage of deficiencies in FIBS ratings was possible.
>> If it's possible on purpose, why couldn't it be possible
>> for it to happen inadvertently, which I believe would be
>> more likely than not to happen. I just can't see how
>> some "crude" formula with some round/arbitrary constants
>> can result in the apparent stability/smoothness/neatness
>> in FIBS ratings and their ranges and can't help wonder
>> if some other mechanism/s are used to ensure that...

> Your argument translates to, "I don't understand Fibs'
> ratings formula, therefore Fibs is rigged".

probability of the underdog winning the match =
1/(10^(rating-difference*SQRT(match-length)/2000)+1)

points gained by the winner (and lost by the loser) =
4*SQRT(match-length)*(winner's predicted probability of losing)

This is the formula in short and produces some results in
a recursively self-validating manner. The question is what
those results can mean in terms of "measuring", etc... I
say that this formula can't be said to "measure" anything,
unless we change again the definition of what "measuring"
means...
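
For concreteness, here is a direct transcription of those two formulas
in Python (a sketch only; the extra adjustment FIBS applies to players
with less than 400 experience, mentioned elsewhere in this thread, is
ignored here):

import math

def underdog_win_prob(rating_diff, match_length):
    # rating_diff = favorite's rating minus underdog's rating
    return 1.0 / (10.0 ** (rating_diff * math.sqrt(match_length) / 2000.0) + 1.0)

def points_exchanged(rating_diff, match_length, favorite_won):
    # The winner gains (and the loser drops) 4*sqrt(n) times the
    # probability that the winner was predicted to lose.
    p_upset = underdog_win_prob(rating_diff, match_length)
    return 4 * math.sqrt(match_length) * (p_upset if favorite_won else 1 - p_upset)

print(1 - underdog_win_prob(600, 1))    # favorite of a 600-point gap wins ~66.6%
print(points_exchanged(600, 1, True))   # such a win is worth ~1.33 points
print(points_exchanged(600, 1, False))  # an upset moves ~2.67 points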

MK


Murat Kalinyaprak

unread,
Nov 26, 1998, 3:00:00 AM11/26/98
to
limill...@my-dejanews.com wrote:

><73a32u$snp$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>> I think the problem with your argument is that JF/SW in
>> this case are not playing against a single opponent but
>> against a crowd of claimedly tens of thousands (50000?)
>> of players with an average rating (at least at the very
>> start) of 1500. It would take an enormous amount of wins
>> on their part to lower that average to such a low level
>> that the winning wouldn't earn them much... If they won
>> 1 match of 1-point against each and every player on FIBS
>> (i.e. 50000 wins), that average rating of 1500 would go
>> down by only 1.5? (please correct me if I'm off on this
>> and use a more accurate number) points... Could someone
>> calculate what JF/SW rating would be by the time they'd
>> pull the average rating on FIBS by just 1.5 points...?

> I can't for the life of me see what your point is here.
> You seem to enjoy deflecting an argument by re-phrasing it
> in incomprehensible terms. (which is what leads me to
> believe that you're very cleverly trolling the newsgroup)

All of the arguments made on this issue had been based on
two players playing against each other. In that case, for
each point the winner wins, the loser loses a point and
the gap between their ratings widens quickly, so that a 1%
surplus of wins would take an extremely long time to become
visible. In my argument, I'm making a certain player play
against the entire FIBS. Imagine a player plays 100 games
against 1000 different players and generates 1 extra win
against each one of them. After 100000 games, the average
rating of all its 1000 opponents would go down much more
slowly while its own rating skyrockets. I don't know why
this should be so difficult to understand.

>> The issue here is the closeness of those robots ratings
>> to a good number of other players. If SW earned 30000
>> experience by playing 3 point matches on the average,
>> that would mean it played against a pool of 10000 people
>> with an average rating of 1500. If it could consistently
>> beat 9990 of them, top 10 players on FIBS couldn't even
>> come close to making a dent towards keeping its rating
>> "stabilized" at around 2000-2100 (even if they were the
>> same people as the top 10 players in the world)...

> You seem to have forgotten that the formula on fibs works
> properly. Snowie's rating will fluctuate just like everyone
> else's, irrespective of its experience. (once its experience
> is greater than 400). There is no conspiracy needed by the
> masses to "make a dent" in its rating.

Maybe I'm not expressing myself clearly. By "winning"
I mean "not breaking even" (i.e. winning at least one
match more than predicted in a 100)...

> Since you prefer hand-waving arguments to those based on fact,
> here comes mine. Don't look at things in the big picture, look at
> it on the microscopic level. A good player does not win all of the
> time, right? Why, because of luck. When a good player beats
> a bad player, he gets a modest reward. When a good player loses
> to a bad player, he sufferes a big loss. It all evens out
> in the end.

A 2100 rated player's having a 65% chance of winning
against a 1500 rated player is an average. It doesn't
mean that each and every 2100 rated player will win
exactly 65% of the time against a 1500 rated player,
in every type of match. The average would still be
the same regardless of whether there would be "stray"
ratings or not. I'm making an issue of the fact that
with the current FIBS formula, I would expect to see
some strays, which I don't. I feel that everything is
intriguingly too neat despite the known/acknowledged
deficiencies in the formula...

>> Something just doesn't add up...

> You can say that again. And I'm sure you will.

I may say even more later. I'm working on it... :)

MK


Murat Kalinyaprak

unread,
Nov 26, 1998, 3:00:00 AM11/26/98
to
limill...@my-dejanews.com wrote:

><73a4l3$snp$2...@news.chatlink.com> Murat Kalinyaprak wrote:

>> With the luck factor eliminated, why would a "perfect"

> Wow! You've eliminated luck from backgammon?!!?

Not me, the notion of "perfect" necessitates the
assumption that "luck" is eliminated. "Luck" and
"perfect" don't mix...



>> player or even a player close to that would only win
>> 75% of the time, regardless of whether his opponent is
>> rated at 1000, 500, 200, 50 or 2 points below him...?

> Please tell me how often you would expect a perfect player
> to beat a player rated 2 points below her.

100% of the time. If that's not what we are observing,
then we must be "measuring" something different than
just skill...

At the bottom of all this is my past argument that when
there is a large enough gap between the skills of two
players, skill would stop being a factor in predicting
the winning chances of the players (i.e. the mistakes
the better player will make, etc.) I believe that past
a certain amount of skill difference between two players,
the better player just wouldn't make the type of errors
that the lesser player would be capable of exploiting to
his advantage (the lesser player wouldn't even be able
to recognize it as an error). At that point, it would
be the better player basically just playing against the
dice and possibly haphazard moves of the lesser player.

Let's not kid ourselves that we can pit a world-class
player against a beginner and claim that we can measure
the difference of skill between them, make predictions
and award winning points based on that. I propose that
the only way we could come close to accomplishing this
would be by making people play against their approximate
peers...

MK


David desJardins

unread,
Nov 27, 1998, 3:00:00 AM11/27/98
to
Dan <adz...@dartmouth.edu> writes:
> The FIBS rating formula is based on theoretical laws of statistics.

No, it isn't. There's no theoretical statistical reason why, if A beats
B x% of the time, and B beats C y% of the time, then A should beat C z%
of the time, for any particular choice of x,y,z. The formulas used by
FIBS (and all other Elo-style rating systems) are simply ad hoc choices
with no particular statistical justification except that they work
reasonably well.

> It's constructed so that if 2 people at their "true rating" play each
> other, the average number of fibs rating points each will win is zero.

There's no reason to believe that this is true, and lots of reasons to
believe that it's not true in general. It would be quite a miracle if
somehow the probability of winning a match exactly followed the FIBS
formula for all pairs of players and all match lengths.

Clearly you understand this stuff far better than Murat, and I don't
disagree especially strongly with your explanation of why it is the way
it is. But it overstates the case to describe it as perfect.

David desJardins

adz...@dartmouth.edu

unread,
Nov 28, 1998, 3:00:00 AM11/28/98
to

Allow me to jump out of lurk mode to post a little bit. Hope no one minds the
interruption. :-)

In article <365DE4...@cyberport.net>,


Murat Kalinyaprak <mu...@cyberport.net> wrote:
> Unless the
> FIBS formula is based on some real statistical data, it
> would be nothing more than a hocus-pocus invention. Why
> is this so hard to accept for you guys? Are you a cult
> or something...?

Oh, absolutely not! It's a fundamental tenet of mathematics and physics that
the world follows theoretical laws. Empirical laws are only approximations -
"as good as we can get right now" as opposed to "the way the world actually
is".

It's a difference people take for granted. It's because we have these
theoretical laws that we have a right to make mathematical deductions.
There's one last law in Newtonian mechanics which isn't known; for turbulent
fluid flow, resistance is roughly proportional to the 7/4th power of
velocity, I believe (I don't have an engineering reference handy). That's
an empirical law that people have developed, by observation. You can't do
all those neat mathematical things, like calculus, to the function for
resistance and still have much meaning.

The FIBS rating formula is based on theoretical laws of statistics. It's
constructed so that if 2 people at their "true rating" play each other, the
average number of fibs rating points each will win is zero. A player's true
rating can be expressed as a function of the chance of winning against a
baseline (say 1500) player.

In article <365DF4...@cyberport.net>,
Murat Kalinyaprak <mu...@cyberport.net> wrote:
> limill...@my-dejanews.com wrote:

> All of the arguments made on this issue had been based on
> two players playing against each other. In that case, for
> each point the winner wins, the loser loses a point and
> the gap between their ratings widens quickly, so that a 1%
> surplus of wins would take an extremely long time to become
> visible. In my argument, I'm making a certain player play
> against the entire FIBS.

Whether a fixed player's opponent is also fixed or instead varies over all
the players in FIBS doesn't make a difference. What's important is the
average number of FIBS ratings points won against each opponent. Here's why:

A player's true rating is a function of the chance he has of winning
against a baseline player. As a player's chance approaches 100%, his true
rating approaches infinity. No one wins 100% of the time, so each player's
true rating is finite. Therefore, in order for a player's rating to increase
without bound, his rating must increase over his true rating.

The average ratings points per game won against a player of skill 1500 is
lower when a player's rating is higher. This should be obvious from the
formula; as the difference between players increases, the number of points
that the favorite wins decreases. So, if a player's rating is going to
increase without bound, it can only help him if we reward him points as if he
were at his true rating when he's actually above it; we'll be rewarding him
more points. However, the average number of points a player wins at his true
rating is 0. If 0 is more points than the player will win on average, it's
clear that his rating on average will not go up. And you know that as the
number of games a player plays goes up, his winnings approach their average
value.

You asked why the ratings formula has that square root involved ... it's
that way in order to force the average number of points won between 2 players
at their true ratings to be 0. That's the theoretical law.

You may note that true ratings fluctuate. Well, unless a player's getting
so good that he's approaching winning 100% of the time, his rating will have
to go above his true rating if his rating is going to increase without bound,
and then the above argument will apply.

What if you play players rated 1500 who are all below their true rating? I
don't know. I know your rating will increase above your true rating, but I
don't know if it's without bound. Even if the average amount of points you
win per game is positive, and you play an infinite number of games, your
total winnings can still be bounded, as Gary pointed out in another post. I
would guess that so long as the difference between your opponents' ratings
and true ratings were bounded, your rating would also be bounded, but I
wouldn't make a strong claim for it without thinking about it more first.

I hope this helps your understanding. I'd guess also that your next
question is going to be why the average ratings increase tends to 0, but I
think that's been answered enough here - your losses count more against you
than your wins. All I can say is to do the math to convince yourself that the
losses count against you enough. Start with a percentage chance of winning
against a player rated 1500, and have him play different players rated 1500
ad infinitum, and see that the rating of your player stops going up.
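
That exercise is easy to run as a rough Python simulation (a sketch,
assuming 1-point matches, a fixed 65% "true" winning chance -- the
figure used earlier in the thread -- and the usual FIBS update):

import math, random

def win_prob(my_rating, opp_rating, n=1):
    return 1.0 / (10.0 ** ((opp_rating - my_rating) * math.sqrt(n) / 2000.0) + 1.0)

random.seed(0)
rating, true_win_rate = 1500.0, 0.65
for match in range(1, 200001):
    p = win_prob(rating, 1500.0)            # what the formula expects
    if random.random() < true_win_rate:     # what "actually" happens
        rating += 4 * (1 - p)
    else:
        rating -= 4 * p
    if match % 50000 == 0:
        print(match, round(rating))         # levels off near ~2037, plus noise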

Dan

Robert-Jan Veldhuizen

unread,
Nov 30, 1998, 3:00:00 AM11/30/98
to
On 27-nov-98 00:31:07, Murat Kalinyaprak wrote:

MK> limill...@my-dejanews.com wrote:

>><739vq4$k67$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>>> If we have Snowie play another set of matches against
>>> the same players again, and again, and again... this
>>> ratio will not change. Therefore, we can say that 35%
>>> of the players will beat Snowie consistently (i.e. in
>>> the long run), which sounds much less impressive than
>>> claims made previously in this newsgroup...

>> I honestly can't tell, are you simply trolling this newsgroup?

MK> No.

Then you are simply very ignorant but fail to recognize it.

>> If not, I can tell you that the two statements
>> "Snowie beats a 1500 player 65% of the time" and
>> "Snowie can beat 65% of all players" are not equivalent.

MK> It could be so, but not necessarily.

You don't understand what you're talking about (again). limiller is
absolutely right (no question at all) that the two statements are not
equivalent. If you don't understand, do some study of your own instead
of making all sorts of false claims and assumptions.

MK> The wins/losses
MK> can be distributed in a way that it could beat more than
MK> 65% or less than 65% of players. (Let's also keep in mind
MK> that 51% is enough for winning).

This is just plain nonsense and shows you don't understand at all what
you're talking about (again). Do some self-study, ask some questions if
you wish, but please stop making statements about things you don't
understand; I think it's tiresome.

[ELO-rating based on probability]

MK> After questioning whether I'm trolling, are these empty
MK> statements all you can offer...?

Just get a clue first maybe?

MK> I know what the formula is. What I would like to know is
MK> whose bright idea it was to multiply by the square-root of
MK> match length, divide by 2000, etc. and based on what...?
MK> Is this too much to ask...?

If you are so convinced of your own observations and conclusions, you'll
probably never understand the formula. Just do some experimenting with a
calculator or read some text about rating systems, then come back and we
can have a sensible discussion instead of repeatedly telling you that
you're wrong.

>> Your math is wrong. A player who wins 65% of the time over
>> 1500 rated opponents will have a rating of 2037, a player
>> who wins 66% of the time over the same opponents will have
>> a rating of 2076. Given the known error rate in the formula,
>> a 1% difference in skill is in practice difficult to observe.

MK> Based on SW's rating and experience being 2100 and 30000
MK> and the assumption that 15000 of those consist of 1-point
MK> matches played against 1500 rated opponents, at the start
MK> it would earn 1.34 points per win. By the time it would
MK> reach a rating of 2325, its earnings per match would drop
MK> to 1.12 points. But if I used a more accurate points-per-
MK> match figure than the 1.5 I had used, then it wouldn't
MK> reach a point where it would drop to 1.12. So let me just
MK> approximate it to 1.15 and replace 1.5 with the average
MK> of 1.34 and 1.15 = 1.24. In that case 150 extra wins would
MK> result in 186 points that would raise its rating to about
MK> 2286...

You're wrong.

[more nonsense snipped]

MK> This is the formula in short and produces some results in
MK> a recursively self-validating manner. The question is what
MK> those results can mean in terms of "measuring", etc... I
MK> say that this formula can't be said to "measure" anything,
MK> unless we change again the definition of what "measuring"
MK> means...

It is a relative measure. Any absolute measure of bg skill is impossible
(as of today and the near future at least) because we don't know what
perfect play in all possible situations is.

--
Zorba/Robert-Jan


Robert-Jan Veldhuizen

unread,
Nov 30, 1998, 3:00:00 AM11/30/98
to
On 27-nov-98 02:20:12, Murat Kalinyaprak wrote:

MK> limill...@my-dejanews.com wrote:

>><73a4l3$snp$2...@news.chatlink.com> Murat Kalinyaprak wrote:

>>> With the luck factor eliminated, why would a "perfect"

MK>

>> Wow! You've eliminated luck from backgammon?!!?

MK> Not me, the notion of "perfect" necessitates the
MK> assumption that "luck" is eliminated.

No. Perfect play doesn't mean winning every game/match.

MK> "Luck" and
MK> "perfect" don't mix...

Yes they do. Perfect in bg means *maximizing* winning chances, *not*
necessarily making them 100%. If you don't understand this, any further
discussion about the ratings system is absolutely useless.

>> Please tell me how often you would expect a perfect player
>> to beat a player rated 2 points below her.

MK> 100% of the time. If that's not what we are observing,
MK> then we must be "measuring" something different than
MK> just skill...

Or you "must" be completely ignorant and don't have any idea what
you're talking about.

This whole discussion resembles someone discussing differential
equations and how modern theory about them is wrong while he himself
still has trouble taking a square root.

--
Zorba/Robert-Jan


adz...@dartmouth.edu

unread,
Dec 1, 1998, 3:00:00 AM12/1/98
to
In article <vohg1b4...@pizza.berkeley.edu>,

David desJardins <da...@desjardins.org> wrote:
> No, it isn't. There's no theoretical statistical reason why, if A beats
> B x% of the time, and B beats C y% of the time, then A should beat C z%
> of the time, for any particular choice of x,y,z. The formulas used by
> FIBS (and all other Elo-style rating systems) are simply ad hoc choices
> with no particular statistical justification except that they work
> reasonably well.
>

You're correct. For some reason, I was thinking that the ratings formula
was derived from some probability distribution, but that's not true.

> > It's constructed so that if 2 people at their "true rating" play each
> > other, the average number of fibs rating points each will win is zero.
>

> There's no reason to believe that this is true, and lots of reasons to
> believe that it's not true in general. It would be quite a miracle if
> somehow the probability of winning a match exactly followed the FIBS
> formula for all pairs of players and all match lengths.
>

I didn't consider different match lengths. I'd agree that there's no way
to relate a player's rating to his chances of winning an n-length match. 2
different players could achieve the same rating because one is a good checker
player, and the other a good cube player.

So I looked at the ratings formula more closely. It's clear that for all
match lengths, if the underdog actually does win P(upset) of the time, then
the average ratings points increase is 0. It's also the case that for a
ratings difference of 4000 pts, fibs expects the better player to win 100/101
(about 99%) of 1 pt matches. I'd imagine everyone could agree that no player
can win 99% of the time against an opponent who's trying (maybe even against
an opponent who's not trying, too :-) ), so that expert will lose points on
average. That's enough to show that no player's rating can increase without
bound (which was what I was posting about originally).
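
A quick check of that figure, assuming the usual FIBS formula:

diff, match_length = 4000, 1
favorite = 1.0 / (10.0 ** (-diff * match_length ** 0.5 / 2000.0) + 1.0)
print(favorite)   # 0.990099... = 100/101, i.e. about 99%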

What I really wanted to do is point out that there's a "true ratings
difference" between 2 players (without considering skill improvement, of
course). If the 2 players play each other in fixed-length matches ad
infinitum, their average (expected) ratings after the ith game will converge,
since it'd be a monotonically increasing bounded sequence. Not too
interesting, in general, but it's enough to show that continually playing
someone who's just terrible won't keep increasing your rating. I believe
this also shows that ratings angle players, newbie predators, and the like
shouldn't expect their ratings to go up forever. Even if the inferior
player's rating remains constant, the expected rating of the better player
will still converge, since the fibs formula relies on ratings difference
only.


> Clearly you understand this stuff far better than Murat, and I don't
> disagree especially strongly with your explanation of why it is the way
> it is. But it overstates the case to describe it as perfect.
>
> David desJardins
>


Michael J Zehr

unread,
Dec 1, 1998, 3:00:00 AM12/1/98
to
In article <365DFE...@cyberport.net>,

Murat Kalinyaprak <mu...@cyberport.net> wrote:
>> Please tell me how often you would expect a perfect player
>> to beat a player rated 2 points below her.
>
>100% of the time. If that's not what we are observing,
>then we must be "measuring" something different than
>just skill...

Am I understanding correctly that what you want is a measure of skill
level such that a player always beats a less skillful player, always
loses against a more skillful player, and wins with 50% probability
against a player of equal skill?

-Michael J. zehr

Michael J Zehr

unread,
Dec 1, 1998, 3:00:00 AM12/1/98
to
In article <pattibF2...@netcom.com>,

Patti Beadles <pat...@netcom.com> wrote:
>For example, Perfect Player opens with 51 and plays 13/8 24/23, the
>commonly accepted best move.
>
>Total Idiot rolls 55 and plays 8/3(2) 6/1(2)*. PP now dances, TI
>rolls 64 and plays 8/2* 6/2. PP continues to dance while TI rolls
>just the right numbers to close him out and bear off safely.
>
>PP played his single move flawlessly, but TI got lucky.


I'm reminded of a quote which I'll attribute to Evan Diamond with an 80%
confidence level (15% Rick Barabino, 5% someone else at NEBC):

"A perfect player wouldn't have danced after 55."

:-)

-Michael J. Zehr

Murat Kalinyaprak

unread,
Dec 2, 1998, 3:00:00 AM12/2/98
to
In <pattibF2...@netcom.com> Patti Beadles wrote:

>In <73gedb$l6s$1...@news.chatlink.com> Murphy McKalin wrote:

>>>So what should its win percent be against an "almost perfect"
>>>opponent? (Feel free to define "almost perfect" however you
>>>want.) What about a "slightly less than almost perfect"
>>>opponent?

>>100%

>No way. There will always be some luck involved.

>For example, Perfect Player opens with 51 and plays 13/8 24/23,
>the commonly accepted best move.
>Total Idiot rolls 55 and plays 8/3(2) 6/1(2)*. PP now dances, TI
>rolls 64 and plays 8/2* 6/2. PP continues to dance while TI rolls
>just the right numbers to close him out and bear off safely.
>PP played his single move flawlessly, but TI got lucky.

Ok, so would you agree that when the gap between
ratings is wide enough to be between an "almost
perfect" player and a "total idiot", what we can
measure is no longer skill but just luck...?

Previously you had defended that any minuscule
probability of winning that a much underrated player
may have against a highly rated one was based on
the mistakes that the better player would make.

Given your above example, the "almost perfect"
player's chance of winning would depend rather
on the "total idiot"s somehow messing up those
rolls... :) I bet that there will be times the
moves will be forced and the "total idiot" won't
be able to mess up (even if he could)...

If the dice alone can beat an "almost perfect"
player, then a really fancy formula claiming to
measure skill would account for the probability
of that happening (i.e. the probabilities of
winning/losing can't go above/below a certain
percentage).

Of course, there is nothing wrong with measuring
the combination of skill and luck but the problem
is that when you get to the extremes there isn't
much of a combination to speak of anymore (i.e.
it becomes a matter of pure luck).

Dice can't hurt a "total idiot" any more than an
"almost perfect" player can hurt him by his skill
but dice can hurt an "almost perfect" player much
more than a "total idiot" could hurt him with his
total lack of skill. Allowing rated matches between
"almost perfect" players and "total idiots" is one
of the problems in the FIBS rating system...

MK

Murat Kalinyaprak

unread,
Dec 2, 1998, 3:00:00 AM12/2/98
to
In <36590C4F...@rahul.net> Me Again wrote:

>Murat Kalinyaprak wrote:

>> This is what I don't understand. How can "*perfect*"
>> (or close to it) play win only 75% of the time...?

>You roll 5 - 2 and play 13-8 24-22. Your opponent rolls 5-5
>and points on both your blots. You roll any of the 9 numbers
>(6-6, 6-3, 3-6, 3-3, 1-1, 6-1, 1-6, 3-1, 1-3) that fail to
>bring in either of your hit men. Your opponent doubles, you drop.

>..................


>Thus, even with "perfect play" you can (and will) lose many games,
>because of the luck of the dice. Any game that has a luck factor
>will be a game in which it will always be impossible to win 100%
>of your games, even with perfect play.

Your argument is valid. I'm not going to make any
more arguments in this article because I just
responded to one from Patti B. on the same subject
and I have nothing more to add.

Please notice that I'm not arguing that there is
such a thing as perfect play/player, what percent
of the time a perfect play/player would win, etc.
There are a lot of conflicting arguments on some
issues in this newsgroup (some even coming from
the same people). I would like to see those get
sorted out. I'm trying to carry discussions in a
way that what I would like to say will somehow
come out of somebody else's mouth. Since some
people in this newsgroup appear to have developed
a mental block against anything I say, I thought
this may be a more productive approach...

MK

Patti Beadles

unread,
Dec 2, 1998, 3:00:00 AM12/2/98
to
In article <742mvq$9kv$2...@news.chatlink.com>,
Murat Kalinyaprak <mu...@cyberport.net> wrote:

>Previously you had defended that any minuscule
>probability of winning that a much underrated player
>may have against a highly rated one was based on
>the mistakes that the better player would make.

I don't *think* so. I realize that luck is a part of backgammon,
especially in the short term.


>Given your above example, the "almost perfect"
>player's chance of winning would