
Nov 8, 1998, 3:00:00 AM

A while back I was thinking about buying JellyFish or Snowie and asked here

whether bots can help someone improve their rating. Here is my follow-up after

buying Snowie.

After I was on FIBS for a while I got back into practice and my average was

usually around 1665. A couple of times I got up to the low 1670s but only

stayed there a short time. Then my play seemed to get worse, and for a while I

was usually around 1655.

Then I got Snowie. I immediately learned why my play had gotten worse.

There was a new habit I had developed that Snowie said was wrong. After I

corrected this my average started to rise. And as I studied more it went up

more. I had one bad day where I dropped 20 points but the next day I started

to climb again. And today I hit 1686 -- higher than I have ever

been before.

I don't know if these changes are statistically significant. And there

could be luck involved. But I really think I have improved as a backgammon

player. As for how good I can get with the help of Snowie, I don't know. My

concentration is not that good and I make mistakes that even I know are

wrong. But I am happy with the level of improvement I've made so far.

Nov 8, 1998, 3:00:00 AM

Andrew Bokelman <73457...@CompuServe.COM> writes:

> After I was on FIBS for awhile I got back into practice and my average was

> usually around 1665. A couple times I got up to the low 1670s but only

> stayed there a short time. Then my play seemed to get worse and for awhile I

> was usually around 1655.

>

> Then I got Snowie. I immediately learned why my play had gotten worse.

> There was a new habit I had developed that Snowie said was wrong. After I

> corrected this my average started to rise. And as I studied more it went up

> more. I had one bad day where I dropped 20 points but the next day I started

> to climb again. And today I hit 1686 -- higher than I have ever

> been before.

>

> I don't know if these changes are statistically significant.

Personally I would say that they are not. Search on Deja News for

articles about FIBS ratings and you will read that fluctuations of

well over 100 points and back again are not unheard of. (As an

example, Abbott plays on FIBS with an estimated ability of around

1470. Its play hasn't changed at all for several months, but over

that time its rating has reached lower than 1300 and higher than 1600

through random noise alone.) I don't have any measurements of the

accuracy of FIBS ratings, but I would guess that the standard error in

a rating is of the order of 50 points. (Loosely speaking, this means

that all else being equal, if you take a large sample of FIBS players,

you'd expect about 2/3 of them to have a rating within 50 points of

their "true" ability.)

To judge whether you are improving based on your rating is very

difficult. Several months ago I posted an article here estimating how

long it takes for a rating to change, and I concluded that the "half

life" of a FIBS rating is of the order of 200 experience points. (One

way of interpreting this is that for any sufficiently experienced

player, the last 200 experience points contribute as much to your

rating as all previous matches put together). Therefore I would be

very reluctant to compare two measurements made within, say, 400

experience points of each other, because they won't be sufficiently

independent.

When you put all of this together, I would argue that you need samples

made more than 400 experience points apart to be independent, and more

than two standard deviations (i.e. 100 points) to be significant. So,

if your current rating is over 100 points higher than it was 400

experience points ago, you can be reasonably confident that you are

improving; if it's 100 points lower, that suggests you're getting

worse!

(Strictly speaking, you can't ever prove a hypothesis is _true_ with

sampled data: you can only gather data that suggests some hypothesis

seems to be false. If, over 400 experience points, your rating

increases by 100, then that's pretty strong evidence against the

hypothesis that your ability remained the same or decreased. If it

went down by 100, that tends to reject the hypothesis that you stayed

the same or improved. If it changed by less than 100 points, then

your ability could well have changed during the sample period, but not

by enough to be detected by this fairly crude test.)
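
Gary's numbers are easy to illustrate with a small simulation. The sketch below is not FIBS itself: it assumes the bare rating update from "help formula" (the winner gains 4*sqrt(N) times the loser's expected win probability, with no experience multiplier), two players of identical true skill, and nothing but 5-point matches. Even so, the rating performs exactly the kind of mean-reverting random walk described above:

```python
import math
import random
import statistics

def expected_win_prob(diff, length):
    # Per FIBS "help formula": a player rated `diff` points above the
    # opponent is expected to win with this probability.
    return 1.0 / (math.pow(10.0, -diff * math.sqrt(length) / 2000.0) + 1.0)

def simulate(matches=20000, length=5, seed=42):
    random.seed(seed)
    rating, opponent = 1500.0, 1500.0  # equal true skill: every match is 50/50
    history = []
    for _ in range(matches):
        p = expected_win_prob(rating - opponent, length)  # what FIBS expects
        if random.random() < 0.5:                         # what actually happens
            rating += 4.0 * math.sqrt(length) * (1.0 - p)
        else:
            rating -= 4.0 * math.sqrt(length) * p
        history.append(rating)
    return history

history = simulate()
print("final:", round(history[-1], 1),
      "std:", round(statistics.pstdev(history), 1),
      "range:", round(min(history), 1), "to", round(max(history), 1))
```

The walk is mean-reverting because the update uses the current (possibly inflated) rating: the further a rating drifts above true skill, the less each win pays and the more each loss costs. The swings that pure luck produces here are comfortably in the tens of points even though the player never changes at all.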

I think a better way of determining whether you are improving is to

trust your instincts. If you can identify concepts that are

significant in a particular position that you wouldn't have recognised

a month or a year ago, then that could well indicate improvement. Or

if you have since learned why a particular play you once made was

wrong, that probably constitutes improvement too. Or take a quiz (for

instance Robertie's _Reno 1986_, or Clay's _Backgammon: Winning

Strategies_, or the online positions at Backgammon By The Bay); wait

until you've "forgotten" the answers, and take the test again. If

your score has improved by, say, 10% of the total (my very rough

estimation -- long quizzes need smaller differences to be significant;

short quizzes require more) then that probably indicates significant

improvement.

Last of all, I've learned so much from reading rec.games.backgammon

that I find it very hard to believe that anybody here is not improving

at least a little bit. If you're so good that you can read several

months' r.g.b. and not learn a thing, then get off the computer and

go out and play for money; if you're not that good, then you're

improving -- congratulations! :-)

Cheers,

Gary "I suck less than I used to" Wong.

--

Gary Wong, Department of Computer Science, University of Arizona

ga...@cs.arizona.edu http://www.cs.arizona.edu/~gary/

Nov 8, 1998, 3:00:00 AM

In article <wtaf213...@brigantine.CS.Arizona.EDU> Gary Wong <ga...@cs.arizona.edu> writes:

>Andrew Bokelman <73457...@CompuServe.COM> writes:

>> After I was on FIBS for awhile I got back into practice and my average was

>> usually around 1665. [...] Then I got Snowie. [...] And today I hit

>> 1686 -- higher than I have ever been before. I don't know if these

>> changes are statistically significant.

>

>Personally I would say that they are not. [...]

>

>To judge whether you are improving based on your rating is very

>difficult. [...]

>

>I think a better way of determining whether you are improving is to

>trust your instincts. [...]

If you have Snowie, there is an even better way. Use Snowie to

create an account for yourself. Play on FIBS or GG, and import

and analyze all of your matches. On the statistics window,

associate the statistics with your account.

After you have played a few matches this way, go to the Account

Manager window and look at what it says in the "Overall", "Moves"

and "Cube" panels. Pretty quickly you'll see how you do on

average.

This way, every match you play becomes like a quiz that you can

use both to improve and to objectively evaluate how well you

are playing. Since Snowie looks at your choice for every single

move, sometimes hundreds of moves per match, you get an accurate

reading much more quickly than by watching your rating go up

and down.

David Montgomery

mo...@cs.umd.edu

monty on FIBS

Nov 11, 1998, 3:00:00 AM

In <wtaf213...@brigantine.CS.Arizona.EDU> Gary Wong wrote:

>.... and you will read that fluctuations of well over 100

>points and back again are not unheard of. (As an example,

>Abbott plays on FIBS with an estimated ability of around

>1470. Its play hasn't changed at all for several months,

>but over that time its rating has reached lower than 1300

>and higher than 1600 through random noise alone.)

Thanks for sharing such info with us here. Could

you be any more specific about what you mean by

"random noise"? 300 points is a huge swing... Do

you know what may be the average points won/lost

per 1-point game by Abbott? At 2 points per game

it would take 150 wins/losses (not consecutively

of course) for such a swing. Would be interesting

to know the ratio of this to the total number of

games played during a period when Abbott has gone

from 1300 to 1600 or from 1600 to 1300... Given

that Abbott is a robot without emotions, good or

bad days, etc. a 300 point fluctuation in its

rating may indicate something much worse and/or

difficult to explain...

>I don't have any measurements of the accuracy of FIBS

>ratings, but I would guess that the standard error in a

>rating is of the order of 50 points. (Loosely speaking,

>this means that all else being equal, if you take a large

>sample of FIBS players, you'd expect about 2/3 of them to

>have a rating within 50 points of their "true" ability.)

Such statements bother me a little. Where do we

get the "true ability" to compare FIBS ratings

with...?

We know for example that a "kilogram" equals the

weight of one cubic decimeter of water. So, if I

want to know whether my bath-scale measures my

"true weight" accurately enough, I can resort to

that fact as a reference outside of "me and my

bath-scale"...

How do we do that with FIBS ratings...? Saying

that one's FIBS rating is within N points of

their "true ability as measured by FIBS points"

is circular...

If we start with a brand new FIBS, have Jim and

Joe sign on, have them start playing matches,

and then try to begin awarding them points based

on some formula like the one used now, we can see

that there is "a little too much" of a circular

hocus-pocus in it... I hope it's not necessary to

play out the scenario step by step to illustrate

this.

It may be that there is no better choice and that

we have to make do with whatever we can... That's

fine. But it needs to be acknowledged that things

are such, as far as FIBS rating system goes...

Let me ask a question specifically to Gary: with no

obligation to adopt or promote any other system,

do you think that a "much simpler" rating system

could achieve an accuracy/inaccuracy similar to

FIBS' (i.e. in the order of 50 points)...?

FIBS rating formula may be "beautiful", but it's

not "real". Imagine some players could break off

from the pack and reach ratings of 2800, 3400, etc.

while others dip to 720, 290, etc... I would say

that the "real winning chances" of a 290 player

against a 3400 player may be "zero". I chose such

extreme numbers to start making the point, but if

we work backwards from those, we may be able to

say the same for players rated at 720 and 2800, or

1230 and 1920...

On FIBS, I regularly see players with 700+ points

difference in their ratings play for points. I'm

of the opinion that the stage where a player may

have practically zero chance of winning would occur

much earlier than 700+ FIBS-points difference. And

I see this as a problem with FIBS rating system. I

think that pretending a "rosy" hypothetical world

can exist where anyone can play against any other

player without regard to ratings (i.e. because they

win/lose proportionately based on "probabilities")

is unrealistic...

There were snide remarks made in the past about my

not believing in "probabilities". It's true that

when the term is used for some figures obtained

from "circular data", I don't believe that it has

anything to do with what it should mean...

Having mentioned again ratings like 2800, 3400,

etc. one thing I still haven't figured out (and

nobody else offered opinions on it either) is why

haven't robots like JF and SW reached ratings well

past 2000 or 2100's...? They play large volumes of

matches and against players of all skill levels

indiscriminately. Assuming that top players in the

world have better things to do than playing against

those robots on game servers in order to keep their

ratings in check, 90?+% of their opponents should be

easy prey for them to generate lots of surplus wins

and keep increasing their ratings indefinitely. Why

isn't it happening...?

Anyway, before the people who have me in their kill

file complain about my writing long articles again,

I better stop for now... :)

MK

Nov 11, 1998, 3:00:00 AM

Gary,

> I think a better way of determining whether you are improving is to

> trust your instincts. If you can identify concepts that are

> significant in a particular position that you wouldn't have recognized

> a month or a year ago, then that could well indicate improvement. Or

> if you have since learned why a particular play you once made was

> wrong, that probably constitutes improvement too.

This has happened too. For example, discovering that I had developed a bad

habit in moving my back checkers up too soon. Hitting and slotting in my home

board when I could just give up one point and make another while hitting.

Breaking and running too soon in a two-way holding position. Not being bold

enough in my doubling.

Which brings me to another good thing about having a bot tutor. After I

learn the new thing it is very easy to apply it in the wrong places. So by

reviewing later matches I can check whether Snowie thinks I'm applying it

correctly.

Nov 11, 1998, 3:00:00 AM

In article <72bhu5$c0p$1...@news.chatlink.com>,

Murat Kalinyaprak <mu...@cyberport.net> wrote:

>Having mentioned again ratings like 2800, 3400,

>etc. one thing I still haven't figured out (and

>nobody else offered opinions on it either) is why

>haven't robots like JF and SW reached ratings well

>past 2000 or 2100's...?

Maybe because the ratings system works a lot better than you

think, and the bots reach their "true rating" and then hover

there, plus or minus the hundred or so points that one would

expect for random swings.

-Patti

--

Patti Beadles |

pat...@netcom.com/pat...@gammon.com | You are sick. It's the kind of

http://www.gammon.com/ | sick that we all like, mind you,

or just yell, "Hey, Patti!" | but it is sick.

Nov 18, 1998, 3:00:00 AM

In <pattibF2...@netcom.com> Patti Beadles wrote:

>In <72bhu5$c0p$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>>.... why haven't robots like JF and SW reached ratings well

>>past 2000 or 2100's...?

>Maybe because the ratings system works a lot better than you

>think, and the bots reach their "true rating"......

I have a little problem with the term "true rating" as

used in relation to FIBS (and likes) rating systems...

"True rating" by what unit of measure...?

I think that the only way we could even come close to

using such a term in a rating system would be after a

process like the following:

We take, let's say, 100 players and have them each play, let's again

say, 100 matches against a *common opponent* who/which would preferably

be impartial and static in strength. Robots are ideal for that, and we

can use a robot of any strength (like Gary's Abbott), because we just

want to use it as a *unit of measure*...

After this, we can rate/sort those players based on the

number of matches they won against that robot like:

John rated at 92 robot units

Joe rated at 87 robot units

Jim rated at 81 robot units

Jack rated at 77 robot units

Etc...

Then, we make all those players play 100 matches against

each other and from the results we can derive some

conclusions as to the *relative probabilities* of their

winning chances against each other (i.e. devise a formula

to reflect the discovered relativity).

The most sensible way to add a new player to this bunch

then would be to make him/her play 100 matches against

the same *measuring stick* robot and base his/her initial

rating on the result of those matches. But this may be

totally impractical in the long term. So alternatives may

be to insert a new player at the midpoint of the ratings

range, or better yet at the most common rating, etc.
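
The first step of this scheme is just a tally and a sort. A minimal sketch (the names and win counts below are hypothetical, purely for illustration):

```python
# Hypothetical results: wins out of 100 matches against one fixed
# benchmark bot, as in the "robot units" scheme described above.
wins_vs_bot = {"John": 92, "Joe": 87, "Jim": 81, "Jack": 77}

# A player's rating in "robot units" is simply that win count;
# sorting by it gives the ranking.
ranking = sorted(wins_vs_bot.items(), key=lambda kv: -kv[1])

for name, units in ranking:
    print(f"{name} rated at {units} robot units")
```

The harder part, as the post itself notes, is the second step: turning head-to-head results among the players into relative win probabilities, which is where a formula would have to come in.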

I would consider a similar process as a required *minimal*

in order to speak about a "true rating" of any sort...

Of course, if the rating formula will take into account

factors like single-point, multi-point, cubeless, cubeful

matches, etc. then the above process should include enough

samples of each of them.

My question is whether the FIBS rating formula is based on

such a foundation containing some amount of *concrete*

(pun intended :) or built out of wet beach sand...?

MK

Nov 18, 1998, 3:00:00 AM

In <pattibF2...@netcom.com> Patti Beadles wrote:

>In <72bhu5$c0p$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>>... why haven't robots like JF and SW reached ratings well

>>past 2000 or 2100's...?

>Maybe because the ratings system works a lot better than you

>think, and the bots reach their "true rating" and then hover

>there, plus or minus the hundred or so points that one would

>expect for random swings.

What I would like to know is whether we are trying to

observe a result or trying to artificially create a

result...?

Why do we expect that any/all players' ratings will

*reach* a "whatever rating" and hover around it forever

after...?

Some time ago I had argued that after a certain amount

of ratings difference, the lesser rated player's winning

chances would rely on dice alone and I had received (I

believe from you) a counter-argument that FIBS formula

was calculating (reflecting) those probabilities based

on mistakes the higher rated player is expected to make.

I'll leave the subject of "what is a mistake" for another

time but here we are talking about JF and SW, who play

based on statistics/probabilities alone and don't make the

"mistakes" that humans make. In fact, so many people have

such a high esteem of them that they are often regarded

as the ultimate judge on what are right/wrong moves, etc...

It had been claimed that perhaps only as few as 10 people

in the world can beat those robots in the long term. Of

the tens of thousands of games those robots had played,

the ones they played against each other and/or against

those 10 people must be a very very small number.

For the practical scope/purpose of this argument, those

robots don't make "mistakes" (on which the FIBS rating

formula supposedly depends). And after that many

thousand games the luck factor should certainly be no

longer a factor. Yet, they have so far failed to produce

enough surplus wins against the "*losing masses*" to get

past 2000-2100 ratings...? How can that be...?

I don't care which way the reality goes but something

doesn't add up as far as I can see. What could be some

possibilities here...?

a) JF and SW are not as good as some people make it

sound. But then, only 3-4 people at the most have ever

openly claimed in this newsgroup that they beat those

robots. The rest said they lose (and pretty badly at it).

b) The crowd on FIBS is very different than the crowd

in this newsgroup. People in this newsgroup wrongly

think that there are only 10 or so people in the world

who can beat those robots but in reality there are tens

of thousands of people on FIBS that can beat JF and SW...

c) Those robots don't play differently based on cube

ownership and are beaten by people on FIBS who play

very differently based on cube ownership. And since

cube ownership is not a factor in FIBS formula, those

poor robots are inadvertently kept from reaching their

full potential (i.e. "true") ratings... :)

d) FIBS dice is rigged to maintain the ratings/ranges

in a way to artificially validate its own formula...

e) Any other ideas...?

MK

Nov 18, 1998, 3:00:00 AM

In article <72u8m1$rm8$1...@news.chatlink.com>,

mu...@cyberport.net (Murat Kalinyaprak) wrote:

>

> Why do we expect that any/all players ratings will

> *reach* a "whatever rating" and hover around it forever

> after...?

>

Because the ratings system on Fibs works properly.

>

> For the practical scope/purpose of this argument, those

> robots don't make "mistakes" (on which the FIBS rating

> formula does supposedly depend on). And after that many

> thousand games the luck factor should certainly be no

> longer a factor. Yet, they have so far failed to produce

> enough surplus wins against the "*losing masses*" to get

> past 2000-2100 ratings...? How can that be...?

You seem to be saying that the mythical perfect player

will beat all inferior players 100% of the time. If that

were the case, then yes, the perfect player's rating would

increase with no upper bound.

However, the perfect player will always lose a significant

number of matches, due to the element of luck in the game.

Using made up numbers: Let's say the perfect player can beat

your average intermediate player, rated 1700, about 75% of

the time in 5 point matches. The perfect player would gain

+2.236 rating points for every win, and lose -6.708 for every

loss. Averaging 3 wins for every loss, the perfect player's

rating will in the long run remain unchanged, at approximately

2126.75.

If you log on to fibs and type "help formula" you can see

exactly how the ratings changes are calculated.
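
These made-up numbers are in fact internally consistent with that formula. A sketch (assuming the standard update of 4*sqrt(N) times the expected losing probability, with no experience multiplier):

```python
import math

N = 5       # match length
p = 0.75    # assumed chance the "perfect" player beats a 1700 player

# Equilibrium rating gap: the difference D at which the formula expects
# exactly a 75% win rate, i.e. 10^(D*sqrt(N)/2000) = p/(1-p).
D = 2000.0 * math.log10(p / (1.0 - p)) / math.sqrt(N)

print("equilibrium rating:", round(1700.0 + D, 2))                # 2126.75
print("gain per win:", round(4.0 * math.sqrt(N) * (1.0 - p), 3))  # 2.236
print("loss per loss:", round(4.0 * math.sqrt(N) * p, 3))         # 6.708
```

At 2126.75 the expected gain (0.75 x 2.236) exactly balances the expected loss (0.25 x 6.708), which is why the rating stops rising.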

L.Miller

-----------== Posted via Deja News, The Discussion Network ==----------

http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own

Nov 18, 1998, 3:00:00 AM

In article <72u8m1$rm8$1...@news.chatlink.com>,

Murat Kalinyaprak <mu...@cyberport.net> wrote:

>For the practical scope/purpose of this argument, those

>robots don't make "mistakes" (on which the FIBS rating

>formula does supposedly depend on). And after that many

>thousand games the luck factor should certainly be no

>longer a factor. Yet, they have so far failed to produce

>enough surplus wins against the "*losing masses*" to get

>past 2000-2100 ratings...? How can that be...?


Because the formula works.

Let's assume for the sake of argument that every player has a skill

level to which we can assign a number. To further simplify the

argument, let's say that the skill level for an average player is

exactly 1500.0.

Let's choose a player, and give him a skill level of 1800.0. What this

skill level means is that he has a 65% chance of beating an average player in

a 3-point match, and a 71% chance of beating an average player in a

7-point match.

Our 1800 player now goes off and plays a very large number (say 10000)

of 7-point matches against an average player. He'll win around 71%

of them, and lose 29%. All in all, his rating will stay close to

1800, and his opponent's rating will stay close to 1500.

Why is that? It's because the FIBS rating system calculates what it

thinks the probability of winning a match is, based on the skill

(rating) difference of the players, and assigns points accordingly.

For example, in our hypothetical 1800 vs 1500 7-point match:

If player #1 wins:

Changes for player#1 +3.029076, new rating 1803.03

Changes for player#2 -3.029076, new rating 1496.97

If player #2 wins:

Changes for player#1 -7.553929, new rating 1792.45

Changes for player#2 +7.553929, new rating 1507.55
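
These point changes can be reproduced from the rating difference and match length alone. A sketch of the update rule as described in "help formula" (no experience multiplier; the tiny differences in the last decimal places suggest FIBS itself rounds slightly differently):

```python
import math

def match_deltas(winner_rating, loser_rating, length):
    """Rating changes when the player rated winner_rating beats the
    player rated loser_rating in a match of the given length."""
    diff = winner_rating - loser_rating
    # The winner's expected win probability before the match:
    p = 1.0 / (math.pow(10.0, -diff * math.sqrt(length) / 2000.0) + 1.0)
    delta = 4.0 * math.sqrt(length) * (1.0 - p)
    return +delta, -delta  # zero-sum: the loser gives up what the winner gains

print(match_deltas(1800, 1500, 7))  # favorite wins: about +3.029 / -3.029
print(match_deltas(1500, 1800, 7))  # underdog wins: about +7.554 / -7.554
```

Note how an upset pays the underdog far more than a routine win pays the favorite, which is what keeps a strong player's rating from climbing without limit.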

The underlying assumptions of the FIBS rating system are:

(a) Every player has a skill level that can be assigned a

numeric value,

(b) based on those skill levels, the probability of one

player beating another in a match of a particular

length can be determined.

If we don't take (a) as true, then the whole thing falls apart.

(b) is the tricky one, but I believe the system is fairly good

if not perfect. It's been shown, for example, that the formula

overestimates the chances of a weaker player winning a very short

match. It seems to work well for longer matches.

Remember that luck is still a factor in backgammon. I'm only an

intermediate player, but I've beaten world-class players in short

(5, 7, 9-point) matches.

-Patti

--

Patti Beadles |

pat...@netcom.com/pat...@gammon.com |

http://www.gammon.com/ | The deep end isn't a place

or just yell, "Hey, Patti!" | for dipping a toe.

Nov 18, 1998, 3:00:00 AM

In article <72u8m1$rm8$1...@news.chatlink.com>,

Murat Kalinyaprak <mu...@cyberport.net> wrote:

>For the practical scope/purpose of this argument, those

>robots don't make "mistakes" (on which the FIBS rating

>formula does supposedly depend on). And after that many

>thousand games the luck factor should certainly be no

>longer a factor. Yet, they have so far failed to produce

>enough surplus wins against the "*losing masses*" to get

>past 2000-2100 ratings...? How can that be...?


> [options snipped]

f) The ratings are accurate in the sense that the "average" FIBS player

has a rating in the 1500s, and a rating difference of 500 points

accurately predicts the ratio of games SW and JF win on average.

There seems to be a misunderstanding that a "perfect" player should have

their rating arbitrarily high. If perfect play lets you win 75% of the

time in a 9 point match against the "average" player on FIBS, then using

the rating system you can determine the rating that perfect player ought

to have. It might well be 2000-2100.

The expectation that a player's rating will move towards some point and

then hover there is based on statistical and empirical data. One can

run simulations to see how a rating system behaves. (Start two players

with ratings of 1500. Assume one of them wins 60% of the time in a 7

point match. Simulate matches by using a pseudo-random sequence, or

some other method of generating a 60% chance. Adjust the ratings using

the FIBS formula (which you can get from the help on FIBS). See what

happens to the ratings over the long run.)
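
The experiment described above takes only a few lines. A sketch under the same assumptions (a fixed 60% true win rate in 7-point matches, and the bare "help formula" update with no experience multiplier):

```python
import math
import random

def update(ra, rb, a_wins, length):
    # FIBS-style update: the winner takes 4*sqrt(N) times the
    # probability the formula assigned to the other player winning.
    p_a = 1.0 / (math.pow(10.0, -(ra - rb) * math.sqrt(length) / 2000.0) + 1.0)
    delta = 4.0 * math.sqrt(length) * ((1.0 - p_a) if a_wins else -p_a)
    return ra + delta, rb - delta

random.seed(1)
ra = rb = 1500.0
for _ in range(20000):               # player A truly wins 60% of the time
    ra, rb = update(ra, rb, random.random() < 0.6, 7)

# The gap should hover near the value the formula maps to a 60% chance:
target = 2000.0 * math.log10(0.6 / 0.4) / math.sqrt(7)
print("gap:", round(ra - rb, 1), "formula's 60% gap:", round(target, 1))
```

Rather than growing without bound, the gap wanders around the target (about 133 points) with swings of some tens of points either way — exactly the "reach a level and hover there" behaviour described earlier in the thread.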

Remember that although SW and JF might have vastly winning records,

they get far fewer points when they win than when they lose, so they

need to win more often than they lose just to maintain their high

ratings.

Regarding the comments that the ratings differences don't reflect the

assertion that only 10 or so people in the world can maintain a winning

record against SW or JF, I would say that the ratings _do_ reflect

that. After all, how many people on FIBS consistently have a higher

rating than SW or JF? Isn't it just possible that robots have the

highest ratings on FIBS because they're the best players?

-Michael J. Zehr

Nov 22, 1998, 3:00:00 AM

In <pattibF2...@netcom.com> Patti Beadles wrote:

>In <72u8m1$rm8$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>>For the practical scope/purpose of this argument, those

>>robots don't make "mistakes" (on which the FIBS rating

>>formula does supposedly depend on). And after that many

>>thousand games the luck factor should certainly be no

>>longer a factor. Yet, they have so far failed to produce

>>enough surplus wins against the "*losing masses*" to get

>>past 2000-2100 ratings...? How can that be...?

>Because the formula works.

Well, maybe I'll become convinced later... :)

>Let's assume for the sake of argument that every player has

>a skill level to which we can assign a number. To further

>simplify the argument, let's say that the skill level for an

>average player is exactly 1500.0.

Ok, I'll use the same figure in my examples.

>Let's choose a player, and give him a skill level of 1800.0.

>What this skill level means is that he has a 65% of beating

>an average player in a 3-point match, and a 71% chance of

>beating an average player in a 7-point match.

>Our 1800 player now goes off and plays a very large number

>(say 10000) of 7-point matches against an average player.

>He'll win around 71% of them, and lose 29%. All in all, his

>rating will stay close to 1800, and his opponent's rating

>will stay close to 1500.

Fine. Let's look at this from a different angle also.

Let's have Snowie (rated at 2089?) play each opponent

a 1-point match and only once. With about 600 points

difference (2089-1500), its winning chances would be

around 65%? for 1-point matches. If Snowie does indeed

win 65% of those matches, then we can reword your above

statement as: "Snowie can beat 65% of its opponents"

(among players on FIBS with 1500 average rating)...

If the players on FIBS represent a good sampling of all

players in the world, then we can expand that statement

to say: "Snowie can beat 65% of all players on earth".

(When talking about matches of other lengths, the "65%"

can be replaced by the appropriate figure).

If we have Snowie play another set of matches against

the same players again, and again, and again... this

ratio will not change. Therefore, we can say that 35%

of the players will beat Snowie consistently (i.e. in

the long run), which sounds much less impressive than

claims made previously in this newsgroup...

>Why is that? It's because the FIBS rating system calculates

>what it thinks the probability of winning a match is, based

>on the skill (rating) difference of the players, and assigns

>points accordingly. For example, in our hypothetical 1800 vs

>1500 7-point match: ...............

> (a) Every player has a skill level that can be assigned a

> numeric value,

> (b) based on those skill levels, the probability of one

> player beating another in a match of a paritcular

> length can be determined.

I had already made an argument about two requirements

for this to work. Only "one particular form of skill"

can be measured (i.e. single-point, multi-point, etc.

matches), not a combination of many kinds at the same

time. This is not the case with FIBS rating system, as

you and others acknowledge also. And at least an initial

sample of players needs to be measured against a common

"unit" before they can be used to measure each other or

players. I don't know if this was done in establishing

the FIBS formula or not. It would be good if we get an

answer on this from somebody who knows...

>If we don't take (a) as true, then the whole thing falls apart.

>(b) is the tricky one, but I believe the system is fairly good

>if not perfect. It's been shown, for example, that the formula

>overestimates the chances of a weaker player winning a very short

>match. It seems to work well for longer matches.

And my argument is that such inconsistencies can add up

and should even be expected to do so in the long term.

There could possibly be other elusive elements such as

"style/strategy", for example. It has been argued in

this newsgroup that robots play unlike humans and that's

why they do better and that's why humans aspire to play

like them. So one question could be whether a 2100 rated

human and a 2100 rated robot really have the same winning

chances against a 1500 rated player (human and/or robot)

in any/all variations of backgammon?

For the moment, let's stick with the match length and

assume that a robot can do slightly better in 1-point

matches against weaker opponents and it's "true" winning

chances (referring to the previous examples used above)

is 66% instead of 65%. Let's also say that half of the

30000 experience Snowie has consists of 1-point matches

(which would have had to be played against almost all weaker

players from the beginning). At about 1.5? points earned

per match, that 1% would translate to 150x1.5=225 rating

points and would put Snowie at 2325. This is only with

a mere 1% inaccuracy in calculating the probability of

winning...

Of course, it's possible that a few human players may

end up not fitting the "standard mold" and cause such

irregularities/extremes also. I'm just using robots as

likely cases because they play large amounts of matches

and arguments have been made about their being superior

and/or different than at least most humans...

And with the seemingly arbitrary round number constants

in it, the FIBS formula looks just too crude to be able

to prevent such possible or even expected irregularities.

Yet, no such irregularities are observed...

Some people had argued that inflating ratings by taking

advantage of deficiencies in FIBS ratings was possible.

If it's possible on purpose, why couldn't it be possible

for it to happen inadvertently, which I believe would be

more likely than not to happen. I just can't see how

some "crude" formula with some round/arbitrary constants

can result in the apparent stability/smoothness/neatness

in FIBS ratings and their ranges and can't help wonder

if some other mechanism(s) are used to ensure that...

MK

Nov 22, 1998, 3:00:00 AM

In <72vf4c$a...@senator-bedfellow.MIT.EDU> Michael J Zehr wrote:

>In <72u8m1$rm8$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>>longer a factor. Yet, they have so far failed to produce

>>enough surplus wins against the "*losing masses*" to get

>>past 2000-2100 ratings...? How can that be...?

>There seems to be a misunderstanding that a "perfect" player

>should have their rating arbitrarily high. If perfect play

>lets you win 75% of the time in a 9 point match against the

>"average" player on FIBS, then using the rating system you

>can determine the rating that perfect player ought to have.

>It might well be 2000-2100.

This is what I don't understand. How can "*perfect*"

(or close to it) play win only 75% of the time...?

We either have to reassess the strength of those robots,

redefine "perfect", argue that FIBS rating measures not

skill but luck, or something else whatever... We can't

have it all.

>The expectation that a player's rating will move towards some

>point and then hover there is based on statistical and empirical

>data.

What data...? Where did that data come from...?

>One can run simulations to see how a rating system behaves.

>(Start two players with ratings of 1500. Assume one of them

>wins 60% of the time in a 7 point match. Simulate matches by

>using a pseudo-random sequence, or some other method of

>generating a 60% chance. Adjust the ratings using the FIBS

>formula (which you can get from the help on FIBS). See what

>happens to the ratings over the long run.)
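That simulation takes only a few lines. Below is a minimal sketch, assuming
the formula described in the FIBS help: an underdog trailing by D points wins
an n-point match with probability 1/(1 + 10^(D*sqrt(n)/2000)), and
4*sqrt(n)*(1 - p) rating points change hands, where p is the winner's
predicted chance (the experience-based multiplier for new players is ignored
here):

```python
import math
import random

def win_prob(me, opp, n):
    """FIBS estimate of `me`'s chance of winning an n-point match vs `opp`."""
    return 1.0 / (1.0 + 10.0 ** ((opp - me) * math.sqrt(n) / 2000.0))

def update(winner, loser, n):
    """Return new (winner, loser) ratings: 4*sqrt(n)*(1 - p) points move."""
    p = win_prob(winner, loser, n)           # winner's predicted chance
    delta = 4.0 * math.sqrt(n) * (1.0 - p)   # bigger upset -> bigger transfer
    return winner + delta, loser - delta

random.seed(7)
a, b = 1500.0, 1500.0
gaps = []
for i in range(100_000):          # player a wins 60% of 7-point matches
    if random.random() < 0.60:
        a, b = update(a, b, 7)
    else:
        b, a = update(b, a, 7)
    if i >= 50_000:               # discard the burn-in period
        gaps.append(a - b)

# the gap settles near 2000*log10(.6/.4)/sqrt(7), about 133 points
print(round(sum(gaps) / len(gaps)))
```

With a fixed 60% win rate in 7-point matches, the rating gap levels off near
133 points and hovers there rather than growing without bound, which is
exactly the behaviour described above.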

>Remember that although SW and JF might have vastly winning

>records, they get far fewer points when they win than when

>they lose, so they need to win more often than they lose just

>to maintain their high ratings.

I think the problem with your argument is that JF/SW in

this case are not playing against a single opponent but

against a crowd claimed to number in the tens of thousands (50000?)

of players with an average rating (at least at the very

start) of 1500. It would take an enormous amount of wins

on their part to lower that average to such a low level

that the winning wouldn't earn them much... If they won

1 match of 1-point against each and every player on FIBS

(i.e. 50000 wins), that average rating of 1500 would go

down by only 1.5? (please correct me if I'm off on this

and use a more accurate number) points... Could someone

calculate what JF/SW rating would be by the time they'd

pull the average rating on FIBS by just 1.5 points...?

>Regarding the comments that the ratings differences don't

>reflect the assertion that only 10 or so people in the world

>can maintain a winning record against SW or JF, I would say

>that the ratings _do_ reflect that. After all, how many

>people on FIBS consistently have a higher rating than SW or

>JF? Isn't it just possible that robots have the highest

>ratings on FIBS because they're the best players?

While discussing rating systems, somebody had tried to

differentiate between "rankings" and "ratings", which

wasn't applicable in that context but is in this case.

The rating difference between #1 and #2 player can be

a million points while the difference between #2 and #3

players can be ten points and they still would rank as

#1, #2 and #3...

The issue here is the closeness of those robots' ratings

to a good number of other players. If SW earned 30000

experience by playing 3 point matches on the average,

that would mean it played against a pool of 10000 people

with an average rating of 1500. If it could consistently

beat 9990 of them, top 10 players on FIBS couldn't even

come close to making a dent towards keeping its rating

"stabilized" at around 2000-2100 (even if they were the

same people as the top 10 players in the world)...

Something just doesn't add up...

MK

Nov 22, 1998, 3:00:00 AM

In <72vj16$tqc$1...@nnrp1.dejanews.com> limill...@my-dejanews.com wrote:

>In <72u8m1$rm8$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>> Why do we expect that any/all players ratings will

>> *reach* a "whatever rating" and hover around it forever

>> after...?

>Because the ratings system on Fibs works properly.

I don't believe the results we observe could simply be

achieved by the current formula used by FIBS. I don't

see how that formula could prevent some players from

straying far away from the pack...

>> longer a factor. Yet, they have so far failed to produce

>> enough surplus wins against the "*losing masses*" to get

>> past 2000-2100 ratings...? How can that be...?

>You seem to be saying that the mythical perfect player

>will beat all inferior players 100% of the time. If that

>were the case, then yes, the perfect player's rating would

>increase with no upper bound.

I am not the one turning certain robots into "mythical

perfect players". I'm only making multi-edged arguments,

without any intention to prove which way they cut. There

seems to be a case where we have to decide whether we want

to eat the cake or have it...

My argument is that even a less than "perfect" player can

produce *enough* surplus of wins against a large mass of

opponents with a much lower average rating. It doesn't have

to be a boundless process in order for it to produce huge

differences in ratings...

>However, the perfect player will always lose a significant

>number of matches, due to the element of luck in the game.

The luck factor is supposed to even out in the long run.

But I personally don't mind hearing this comment because

it leaves room for the possibility that a certain luck

factor may be artificially maintained by FIBS dice... :)

>Using made up numbers: Let's say the perfect player can beat

>your average intermediate player, rated 1700, about 75% of

>the time in 5 point matches. The perfect player would gain

>+2.236 rating points for every win, and lose -6.708 for every

>loss. Averaging 3 wins for every loss, the perfect player's

>rating will in the long run remain unchanged, at approximately

>2126.75.
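The quoted arithmetic checks out against the standard FIBS formula (the
4*sqrt(n) stake per match and the 2000 divisor are taken from the FIBS help
text):

```python
import math

n = 5                        # match length
p = 0.75                     # the favourite's winning chance
stake = 4 * math.sqrt(n)     # total rating swing per match under FIBS

gain = stake * (1 - p)       # favourite's reward for an expected win
loss = stake * p             # favourite's penalty for an upset loss
# rating gap at which the formula predicts exactly p, above the 1700 opponent
d = 2000 * math.log10(p / (1 - p)) / math.sqrt(n)

print(round(gain, 3), round(loss, 3), round(1700 + d, 2))
# -> 2.236 6.708 2126.75
```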

With the luck factor eliminated, why would a "perfect"

player, or even a player close to that, win only

75% of the time, regardless of whether his opponent is

rated at 1000, 500, 200, 50 or 2 points below him...?

As a side comment, when talking about certain robots if

"perfect" had to mean "75%", I would have no problem

with it... :)

MK

Nov 22, 1998, 3:00:00 AM

MK:

Having P% chances against players of level L

does not mean you have P% against any group

whose average level is L. Playing 10 matches

against 1700-level players is different than

5 matches against 2000 combined with 5 matches

against 1400.

Saying a player will win P% of its matches

against a group of players G is not the same

thing as saying that P% of the players in

G consistently beat the player. At my club

I consistently lose about 1/3 of my matches.

But no one beats me consistently.

Perfect play could win only 75% of the time

if the other player plays well enough to

win 25% of the time. If the other player

played better, the perfect player might win

only half the time.

The rating formula prevents players from

straying far from the pack by requiring that

players win a higher and higher percent of their

matches to go to a higher level. If you can't

win with that percent, you don't go higher.

The fact that luck evens out in the end does

not mean that the better player eventually wins

all of the matches. It means that the percentage

of matches won converges arbitrarily close to the

the result you would get if you played an infinite

number of matches. The weaker player keeps

winning matches, even when the luck has evened out.

You cannot eliminate the luck factor in backgammon.

It is always there. If you play many, many matches

then both players will get approximately equal

amounts of luck, but that doesn't take the luck

away.

An analogy. Let's say you played perfect (but

honest, non-prescient) blackjack. What percent

of the hands would you win? You still win only

about 1/2 the hands, even though you are playing

perfectly. If you played 10,000,000,000,000,000,000

hands, to "eliminate" the luck factor, you would

still win only about 1/2 the hands.

Nov 22, 1998, 3:00:00 AM

Murat Kalinyaprak wrote:

> This is what I don't understand. How can "*perfect*"

> (or close to it) play win only 75% of the time...?

You roll 5 - 2 and play 13-8 24-22. Your opponent rolls 5-5 and points

on both your blots. You roll any of the 9 numbers (6-6, 6-3, 3-6, 3-3,

1-1, 6-1, 1-6, 3-1, 1-3) that fail to bring in either of your hit men.

Your opponent doubles, you drop.

Did you make any mistakes? (Depending on the match score, your opponent

may have made a mistake in doubling and should have played for a gammon,

but in a money game using the jacoby rule it's a given that this is a

double/drop.)

This is not an isolated position, there are several other 2 roll and 3

roll scenarios (usually involving doubles, something that is rolled 1

out of every 6 rolls) that with "perfect play" will result in a

double/drop.

Thus, even with "perfect play" you can (and will) lose many games,

because of the luck of the dice. Any game that has a luck factor will

be a game in which it will always be impossible to win 100% of your

games, even with perfect play.

HTH

jc

Nov 23, 1998, 3:00:00 AM

In article <73a32u$snp$1...@news.chatlink.com>,

Murat Kalinyaprak <mu...@cyberport.net> wrote:

>In <72vf4c$a...@senator-bedfellow.MIT.EDU> Michael J Zehr wrote:

>

>>In <72u8m1$rm8$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>

>>>longer a factor. Yet, they have so far failed to produce

>>>enough surplus wins against the "*losing masses*" to get

>>>past 2000-2100 ratings...? How can that be...?

>

>>There seems to be a misunderstanding that a "perfect" player

>>should have their rating arbitrarily high. If perfect play

>>lets you win 75% of the time in a 9 point match against the

>>"average" player on FIBS, then using the rating system you

>>can determine the rating that perfect player ought to have.

>>It might well be 2000-2100.


>

>This is what I don't understand. How can "*perfect*"

>(or close to it) play win only 75% of the time...?

Suppose someone did write a "perfect" backgammon program, by whatever

definition you chose for perfect... and then it played itself. What do

you think its percentage of wins would be?

50% right?

So what should its win percent be against an "almost perfect" opponent?

(Feel free to define "almost perfect" however you want.) What about a

"slightly less than almost perfect" opponent?

-Michael J. Zehr

Nov 23, 1998, 3:00:00 AM

Murat Kalinyaprak <mu...@cyberport.net> wrote in article

<73a4l3$snp$2...@news.chatlink.com>...

> In <72vj16$tqc$1...@nnrp1.dejanews.com> limill...@my-dejanews.com wrote:

> With the luck factor eliminated, why would a "perfect"

> player or even a player close to that would only win

> 75% of the time, regardless of whether his opponent is

> rated at 1000, 500, 200, 50 or 2 points below him...?

>

> As a side comment, when talking about certain robots if

> "perfect" had to mean "75%", I would have no problem

> with it... :)

>

> MK

>

I always thought that backgammon was a combination of skill + luck

so it's pretty difficult to eliminate the luck factor.

If the ratio is say 50% skill and 50% luck

then wouldn't a perfect robot's winning rate be

calculated by adding the skill factor + the luck factor?

If it approached perfect then it's winning rate would be

~ .5 for skill + (somevariable)* .5 for luck factor

If it got an even split on luck then somevariable would at that

time be .5 and it would win 75% of its matches.

Maybe skill difference might even be a better variable because

if the opponent were weaker then skill would be more telling and

luck would have less influence whereas if the opponent were

closer in strength then skill difference would decrease and

luck would therefore increase.

Anyway one of the enjoyable (sort of) paradoxes with backgammon

is that you can play a "perfect" game and still get punished by

being backgammoned; that's just the way it is.

Graham

Nov 23, 1998, 3:00:00 AM

>I always thought that backgammon was a combination of skill + luck so it's

>pretty difficult to eliminate the luck factor. If the ratio is say 50% skill

>and 50% luck

I think luck is MUCH less than a 50% factor. If you spend some time playing

much stronger players than you, I think you'll find this out quickly.

Nov 23, 1998, 3:00:00 AM

In article <739vq4$k67$1...@news.chatlink.com>,

mu...@cyberport.net (Murat Kalinyaprak) wrote:

>

> Fine. Let's look at this from a different angle also.

> Let's have Snowie (rated at 2089?) play each opponent

> a 1-point match and only once. With about 600 points

> difference (2089-1500), its winning chances would be

> around 65%? for 1-point matches. If Snowie does indeed

> win 65% of those matches, then we can reword your above

> statement as: "Snowie can beat 65% of its opponents"

> (among players on FIBS with 1500 average rating)...

>

> If the players on FIBS represent a good sampling of all

> players in the world, then we can expand that statement

> to say: "Snowie can beat 65% of all players on earth".

> (When talking about matches of other lengths, the "65%"

> can be replaced by the appropriate figure).

>

> If we have Snowie play another set of matches against

> the same players again, and again, and again... this

> ratio will not change. Therefore, we can say that 35%

> of the players will beat Snowie consistently (i.e. in

> the long run), which sounds much less impressive than

> claims made previously in this newgroup...

I honestly can't tell, are you simply trolling this newsgroup?

If not, I can tell you that that the two statements

"Snowie beats a 1500 player 65% of the time" and

"Snowie can beat 65% of all players" are not equivalent.

> I had already made an argument about two requirements

> for this to work. Only "one particular form of skill"

> can be measured (i.e. single-point, multi-point, etc.

> matches), not a combination of many kinds at the same

> time. This is not the case with FIBS rating system, as

> you and others acknowledge also. And at least an initial

> sample of players need to be measured against a common

> "unit" before they can be used to measure each other or

> players. I don't know if this was done in establishing

> the FIBS formula or not. It would be good if we get an

> answer on this from somebody who knows...

>

The ELO formula is based entirely on basic probability

theory and not on empirical data. Empirical data is highly

error prone and impossible to generalize into a formula.

The derivation of the formula is loosely explained at

the netgammon site:

http://ibs.nordnet.fr/netgammon/elobis_usa.html

>

> For the moment, let's stick with the match length and

> assume that a robot can do slightly better in 1-point

> matches against weaker opponents and it's "true" winning

> chances (referring to the previous examples used above)

> is 66% instead of 65%. Let's also say that half of the

> 30000 experience Snowie has consists of 1-point matches

> (which would had to be played against almost all weaker

> players from the beginning). At about 1.5? points earned

> per match, that 1% would translate to 150x1.5=225 rating

> points and would put Snowie at 2325. This is only with

> a mere 1% inaccuracy in calculating the probability of

> winning...

Your math is wrong. A player who wins 65% of the time over

1500 rated opponents will have a rating of 2037, a player

who wins 66% of the time over the same opponents will have

a rating of 2076. Given the known error rate in the formula,

a 1% difference in skill is in practice difficult to observe.
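Those two figures can be reproduced by inverting the formula: a rating is
stable exactly when the predicted winning chance equals the observed win
rate. A sketch, assuming 1-point matches (`equilibrium_rating` is just an
illustrative name, not a FIBS function):

```python
import math

def equilibrium_rating(win_rate, opp_rating, n=1):
    """Rating at which the FIBS formula predicts exactly `win_rate`
    against opponents rated `opp_rating` in n-point matches."""
    gap = 2000 * math.log10(win_rate / (1 - win_rate)) / math.sqrt(n)
    return opp_rating + gap

print(round(equilibrium_rating(0.65, 1500), 1))   # about 2037.7
print(round(equilibrium_rating(0.66, 1500), 1))   # about 2076.1
```

A one-point change in win percentage moves the stable rating by roughly 40
points here, not by the hundreds of points suggested earlier in the thread.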

>

> Of course, it's possible that a few human players may

> end up not fitting the "standard mold" and cause such

> irregularities/extremes also. I'm just using robots as

> likely cases because they play large amounts of matches

> and arguments have been made about their being superior

> and/or different than at least most humans...

>

> And with the seemingly arbitrary round number constants

> in it, the FIBS formula looks just too crude to be able

> to prevent such possible or even expected irregularities.

> Yet, no such irregularities are observed...

>

You're right that the constants are arbitrary, since they

could be changed to anything and the formula would still

work. However, the range of the ratings and magnitude of

the changes would be different.

I don't know what you mean by "expected irregularities".

> Some people had argued that inflating ratings by taking

> advantage of deficiencies in FIBS ratings was possible.

> If it's possible on purpose, why couldn't it be possible

> for it to happen inadvertently, which I believe would be

> more likely than not to happen. I just can't see how a

> some "crude" formula with some round/arbitrary constants

> can result in the apparent stability/smoothness/neatness

> in FIBS ratings and their ranges and can't help wonder

> if some other mecahnism/s are used to ensure that...

>

Your argument translates to, "I don't understand Fibs'

ratings formula, therefore Fibs is rigged".

Nov 24, 1998, 3:00:00 AM

In article <73a32u$snp$1...@news.chatlink.com>,

mu...@cyberport.net (Murat Kalinyaprak) wrote:

>

> This is what I don't understand. How can "*perfect*"

> (or close to it) play win only 75% of the time...?

>

"Perfect" in backgammon means, I believe, playing every

move and making every cube decision such that they maximize one's

chance of winning the match.

Perfection does not imply winning every match.

It seems perfectly reasonable to me that a perfect strategy would

beat intermediate opponents 75% of the time in 5 point matches.

> We either have to reassess the strength of those robots,

Computer programs aren't perfect, I'll give you that if

it's what you really wanted to hear.

> I think the problem with your argumant is that JF/SW in

> this case are not playing against a single opponent but

> against a crowd claimed to number in the tens of thousands (50000?)

> of players with an average rating (at least at the very

> start) of 1500. It would take an enormous amount of wins

> on their part to lower that average to such a low level

> that the winning wouldn't earn them much... If they won

> 1 match of 1-point against each and every player on FIBS

> (i.e. 50000 wins), that average rating of 1500 would go

> down by only 1.5? (please correct me if I'm off on this

> and use a more accurate number) points... Could someone

> calculate what JF/SW rating would be by the time they'd

> pull the average rating on FIBS by just 1.5 points...?

>

I can't for the life of me see what your point is here.

You seem to enjoy deflecting an argument by re-phrasing it

in incomprehensible terms. (which is what leads me to

believe that you're very cleverly trolling the newsgroup)

>

> While discussing rating systems, somebody had tried to

> differentiate between "rankings" and "ratings", which

> wasn't applicable in that context but is in this case.

> The rating difference between #1 and #2 player can be

> a million points while the difference between #2 and #3

> players can be ten points and they still would rank as

> #1, #2 and #3...

>

> The issue here is the closeness of those robots ratings

> to a good number of other players. If SW earned 30000

> experience by playing 3 point matches on the average,

> that would mean it played against a pool of 10000 people

> with an average rating of 1500. If it could consistently

> beat 9990 of them, top 10 players on FIBS couldn't even

> come close to making a dent towards keeping its rating

> "stabilized" at around 2000-2100 (even if they were the

> same people as the top 10 players in the world)...

You seem to have forgotten that the formula on fibs works

properly. Snowie's rating will fluctuate just like everyone

else's, irrespective of its experience. (once its experience

is greater than 400). There is no conspiracy needed by the

masses to "make a dent" in its rating.

Since you prefer hand-waving arguments to those based on fact,

here comes mine. Don't look at things in the big picture, look at

it on the microscopic level. A good player does not win all of the

time, right? Why, because of luck. When a good player beats

a bad player, he gets a modest reward. When a good player loses

to a bad player, he suffers a big loss. It all evens out

in the end.

>

> Something just doesn't add up...

>

You can say that again. And I'm sure you will.

Nov 24, 1998, 3:00:00 AM

In article <73a4l3$snp$2...@news.chatlink.com>,

mu...@cyberport.net (Murat Kalinyaprak) wrote:

>

> With the luck factor eliminated, why would a "perfect"

Wow! You've eliminated luck from backgammon?!!?

> player or even a player close to that would only win

> 75% of the time, regardless of whether his opponent is

> rated at 1000, 500, 200, 50 or 2 points below him...?

Please tell me how often you would expect a perfect player

to beat a player rated 2 points below her.

>

> As a side comment, when talking about certain robots if

> "perfect" had to mean "75%", I would have no problem

> with it... :)

>

Me neither, 75% is damn good.

Nov 25, 1998, 3:00:00 AM

In <73alip$8...@senator-bedfellow.MIT.EDU> Michael J Zehr wrote:

>In <73a32u$snp$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>>This is what I don't understand. How can "*perfect*"

>>(or close to it) play win only 75% of the time...?

>Suppose someone did write a "perfect" backgammon program, by

>whatever definition you chose for perfect... and then it played

>itself. What do you think it's percentage of wins would be?

>50% right?

Yes, assuming there is no "luck factor"...

>So what should its win percent be against an "almost perfect"

>opponent? (Feel free to define "almost perfect" however you

>want.) What about an "slightly less than almost perfect"

>opponent?)

100%

BTW: notice that I wasn't the one who used the

term "perfect" first. I carried it over from

the article I was responding to and used it

within quotation marks ever since...

MK

Nov 25, 1998, 3:00:00 AM

In <73ap54$g...@krackle.cs.umd.edu> David Montgomery wrote:

>MK:

>Having P% chances against players of level L

>does not mean you have P% against any group

>whose average level is L. Playing 10 matches

>against 1700-level players is different than

>5 matches against 2000 combined with 5 matches

>against 1400.

What you say could be possible only if the FIBS

formula was lop-sided (i.e. used the "ratings"

themselves in some fashion). But it only uses

the difference between two ratings...

Just to avoid any calculation errors I may make,

I just logged on to FIBS and was lucky enough to

spot 3 players with ratings of 1965, 1565 and

1766 (close enough) all at once. The on-screen

calculator showed that my winning chances against

them were 43.49%, 54.95% and 49.18% respectively.

If I adjust the last one for 1765, I get 49.21%

while the average of first two is 49.22%...
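For what it's worth, those on-screen figures are mutually consistent with the
published formula. Assuming 1-point matches and a rating of about 1737.5 for
the poster (a value back-solved here from the 49.18% reading, not stated in
the thread), all three numbers come out:

```python
import math

def win_prob(me, opp, n=1):
    """FIBS winning-probability estimate for an n-point match."""
    return 1.0 / (1.0 + 10.0 ** ((opp - me) * math.sqrt(n) / 2000.0))

me = 1737.5   # hypothetical rating, back-solved from the 49.18% reading
for opp in (1965, 1565, 1766):
    print(opp, round(100 * win_prob(me, opp), 2))
# -> 43.49, 54.95 and 49.18, matching the on-screen calculator
```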

>Saying a player will win P% of its matches

>against a group of players G is not the same

>thing as saying that P% of the players in

>G consistently beat the player. At my club

>I consistently lose about 1/3 of my matches.

>But no one beats me consistently.

Ok, I'll give in on this one. What I said was true

for one round but not necessarily so when saying

"consistently" (unless the same players repeated

the same performance each and every match, which

is possible but very unlikely/unrealistic).

However, in order to say one won against another,

it's enough that one wins 51% of the time, which

is quite lower than the 65% used in the examples.

So, with the figures that were used as examples,

the number of people who would win consistently

would be much less than 35% but much more than 10.

I'm not sure if this is something that can even be

truly calculated with the variables at hand...?

>Perfect play could win only 75% of the time

>if the other player plays well enough to

>win 25% of the time. If the other player

>played better, the perfect player might win

>only half the time.

If I have to accept this as true, then I would have

to argue that there is no such thing as "perfect"

or even anything close to it in bg. 75% is just too

far from it... Given this, the FIBS formula can't

be claimed to rate "skill" either...

>The rating formula prevents players from

>straying far from the pack by requiring that

>players win a higher and higher percent of their

>matches to go to a higher level. If you can't

>win with that percent, you don't go higher.

So...? Their ratings will go up in ever smaller

increments (i.e. slower) but what would prevent

them from going much higher? Imagine a Martian

with a potential rating of 3000 lands on earth

and joins FIBS. Are you guys saying that even

after 20000, 50000, 100000 matches he will never

get past achieving a rating of 2000-2100...?

>You cannot eliminate the luck factor in backgammon.

>It is always there. If you play many, many matches

>then both players will get approximately equal

>amounts of luck, but that doesn't take the luck away.

I don't know about others but to clarify things

just speaking for myself, when I say "eliminating

the luck (factor)" I mean "*equal enough* luck"

for both/all players...

>An analogy. Let's say you played perfect (but

>honest, non-prescient) blackjack. What percent

>of the hands would you win? You still win only

>about 1/2 the hands, even though you are playing

>perfectly. If you played 10,000,000,000,000,000,000

>hands, to "eliminate" the luck factor, you would

>still win only about 1/2 the hands.

I barely know blackjack but I agree that what you

say would be true between equal players in bg. If

the players are unequal and the luck is equal, then

the better player can generate a surplus of wins

without limit. When awarding points as is done now

with the FIBS formula, the points earned may get

increasingly small but should never reach zero

and stop...

I understand the argument that within the FIBS

rating scheme any player will eventually settle

at around a certain rating. What I'm arguing is

that any player claimed to be one of the top 10

players in the world (human or robot) would break

away from the pack by a larger gap before reaching

their so-called "true rating"...

MK

Nov 25, 1998, 3:00:00 AM

In article <73gedb$l6s$1...@news.chatlink.com>,

Murphy McKalin <mu...@cyberport.net> wrote:

>>So what should its win percent be against an "almost perfect"

>>opponent? (Feel free to define "almost perfect" however you

>>want.) What about an "slightly less than almost perfect"

>>opponent?)

>100%

No way. There will always be some luck involved.

For example, Perfect Player opens with 51 and plays 13/8 24/23, the

commonly accepted best move.

Total Idiot rolls 55 and plays 8/3(2) 6/1(2)*. PP now dances, TI

rolls 64 and plays 8/2* 6/2. PP continues to dance while TI rolls

just the right numbers to close him out and bear off safely.

PP played his single move flawlessly, but TI got lucky.

-Patti

--

Patti Beadles | Not just your average purple-haired

pat...@netcom.com/pat...@gammon.com | degenerate gambling adrenaline

http://www.gammon.com/ | junkie software geek leatherbyke

or just yell, "Hey, Patti!" | nethead biker.

Nov 25, 1998, 3:00:00 AM

I had been trying to avoid writing any more about FIBS ratings, because I

don't think I have anything else to contribute. Here is one last post all

the same. Apologies to everybody who is sick of this stuff :-)

mu...@cyberport.net (Murphy McKalin) writes:

> In <73ap54$g...@krackle.cs.umd.edu> David Montgomery wrote:

> >Having P% chances against players of level L

> >does not mean you have P% against any group

> >whose average level is L. Playing 10 matches

> >against 1700-level players is different than

> >5 matches against 2000 combined with 5 matches

> >against 1400.

>

> Just to avoid any calculation errors I may make,

> I just logged on to FIBS and was lucky enough to

> spot 3 players with ratings of 1965, 1565 and

> 1766 (close enough) all at once. The on-screen

> calculator showed that my winning chances against

> them were 43.49%, 54.95% and 49.18% respectively.

> If I adjust the last one for 1765, I get 49.21%

> while the average of first two is 49.22%...

Unfortunately that's just a special case where the total probability

IS roughly the average of the three parts (because your rating is very

close to the median of a symmetric distribution). David is right: the

probability against the average rating is not necessarily the same as

the average probability against all ratings. A counterexample:

- suppose I am only rated at 1165, and play 9-point matches against the

three players you found (1565, 1765 and 1965);

- my probabilities of winning against the three are 20.1%, 11.2% and

6.0% respectively;

- my average probability of winning against the 1565 and 1965 players

is 13.1%, but against the 1765 player is only 11.2%. They are NOT

the same.
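
This arithmetic can be checked with a short sketch of the published
FIBS rating formula, P = 1 / (1 + 10^(D*sqrt(n)/2000)), where D is the
rating difference and n the match length. (The exact figures differ
from the ones quoted above only in the final rounding.)

```python
import math

def fibs_win_prob(my_rating, opp_rating, match_length):
    # Published FIBS formula: probability that I win an n-point match,
    # where d is my opponent's rating minus mine.
    d = opp_rating - my_rating
    return 1.0 / (1.0 + 10.0 ** (d * math.sqrt(match_length) / 2000.0))

# A 1165-rated player in 9-point matches against 1565, 1765 and 1965:
probs = [fibs_win_prob(1165, r, 9) for r in (1565, 1765, 1965)]
print([round(100 * p, 1) for p in probs])          # roughly 20.1, 11.2, 5.9

# Probability against the average rating (1765) vs. the average of the
# probabilities against 1565 and 1965 -- these are NOT the same:
print(round(100 * fibs_win_prob(1165, 1765, 9), 1))   # about 11.2
print(round(100 * (probs[0] + probs[2]) / 2, 1))      # about 13.0
```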

> >Perfect play could win only 75% of the time

> >if the other player plays well enough to

> >win 25% of the time. If the other player

> >played better, the perfect player might win

> >only half the time.

>

> If I have to accept this as true, then I would have

> to argue that there is no such thing as "perfect"

> or even anything close to it in bg. 75% is just too

> far from it...

There is such a thing as perfect -- perfection is never making any

mistakes. (A precise definition of a perfect strategy is one that

maximises your "security level" -- ie. a maximin strategy, one that

maximises your minimum expected gain across all possible opponents.

Since the rules in backgammon are symmetric (as opposed to games like

blackjack, where the dealer follows different rules than the players)

and backgammon is a zero-sum game, this maximum security level is

zero.)

Perhaps one issue that is causing confusion is that the idea of

perfection in backgammon is somewhat abstract. (This is because we

haven't reached perfection, and we don't always know what a mistake

is.) If a concrete example would clarify things, consider Hugh

Sconyer's programs which play all bearoff positions (though the

publicly available versions only play as many positions as will fit

on the CD-ROMs), and every position in Hyper-Backgammon (essentially

backgammon played with three chequers per player) perfectly for money.

These players are PERFECT. No mistakes. Maximum security level,

etc. etc. Yet they cannot win every game. The probability of them

winning depends on the position, and the opponent. Imagine Hugh was

somehow able to extend his exhaustive search to include every

backgammon position -- this would be the perfect player we're talking

about. And it could not win every match, either. Even against an

intermediate player like me it would probably only win about 2/3 of

the games; it could win 75% or 90% or even more of the matches, as long

as the matches were long enough.

> Given this, the FIBS formula can't be claimed to

> rate "skill" either...

Yes, it can. Skill is the ability to play without making mistakes.

The more mistakes you make, the fewer matches you expect to win. If

both players play without making any mistakes, then they each expect

to win 50% of the matches. If only one player makes mistakes, then he

expects to win less than 50%. The more (and costlier) mistakes he

makes, the fewer matches he expects to win. You can view FIBS ratings

as measuring skill, or the ability to play without making mistakes, or

the rate of matches won -- they are all equivalent.

> >The rating formula prevents players from

> >straying far from the pack by requiring that

> >players win a higher and higher percent of their

> >matches to go to a higher level. If you can't

> >win with that percent, you don't go higher.

>

> So...? Their ratings will go up in ever smaller

> increments (i.e. slower) but what would prevent

> them from going much higher? Imagine a Martian

> with a potential rating of 3000 lands on earth

> and joins FIBS. Are you guys saying that even

> after 20000, 50000, 100000 matches he will never

> get past achieving a rating of 2000-2100...?

For one thing, it is impossible to have a potential rating of 3000

(without cheating). My guess is that the best humans and computers in

the world today make mistakes which would cost them at most an

expected 0.4 points per game for money against a perfect player

(that's including chequer play and cube decisions). This is only

worth about 200 FIBS rating points. If we assume that the best

current players could consistently maintain a rating of 2100 without

cheating (which is very generous), then even a perfect player would

have difficulty remaining above 2300. In truth it's very likely to be

lower. The other players on FIBS simply do not make enough mistakes

for anybody to be consistently rated higher than that, no matter how

good they are.

In general I believe that 1000 matches is sufficient to "reach" a

rating, regardless of your previous rating and experience: by that I

mean that after 1000 matches, the bias from your old rating will be

insignificant compared to random fluctuations. (Part of the justification

is given in an old article at <http://www.bkgm.com/rgb/rgb.cgi?view+471>.)

Therefore I claim that if your perfect Martian really did deserve a

rating of 2300, I'm sure it could reach it within 1000 matches (in

fact since ratings change more quickly for new players, the number

would be significantly less).

> >An analogy. Let's say you played perfect (but

> >honest, non-prescient) blackjack. What percent

> >of the hands would you win? You still win only

> >about 1/2 the hands, even though you are playing

> >perfectly. If you played 10,000,000,000,000,000,000

> >hands, to "eliminate" the luck factor, you would

> >still win only about 1/2 the hands.

>

> I barely know blackjack but I agree that what you

> say would be true between equal players in bg. If

> the players are unequal and the luck is equal, then

> the better player can generate a surplus of wins

> without limit. When awarding points as is done now

> with the FIBS formula, the points earned may get

> increasingly small but should never reach zero

> and stop...

It is perfectly possible for the sum of an infinite series to remain

below some limit, even though the individual terms never "reach zero

and stop". Add 1/2 + 1/4 + 1/8 + 1/16 + ... for instance; you can

come arbitrarily close to 1, but never exceed it.
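
The partial sums are easy to verify mechanically: they creep toward 1
but never touch it.

```python
# Partial sums of 1/2 + 1/4 + 1/8 + ... : every term is positive, so
# the sum keeps growing, yet after n terms it equals 1 - (1/2)**n < 1.
total = 0.0
for k in range(1, 51):
    total += 0.5 ** k
    assert total < 1.0    # never reaches 1, let alone exceeds it
print(total)              # very close to 1, but still below it
```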

But you don't even need this mechanism to show that a FIBS rating will

not increase without bound. The points earned are only half the

story, you have to consider the points _lost_ as well! Suppose you

are much better than me, and you win 2/3 of the games (this is what is

expected to happen if you are rated at 1800 and I am rated at 1200,

for instance). If we played for money, then yes, you would expect to

generate a "surplus of wins" (money) without limit. But on FIBS, 2/3

of our games will result in a win to you (you gain 1.33 rating points,

and I lose 1.33); the other 1/3 will result in a win to me (I gain

2.67 points, and you lose 2.67). If we play 300 1-point matches, then

you expect to win 200 of them for a gain of 267 points; but you lose

the other 100 which also costs you 267 points. In the long run, you

don't expect to change at all! A "surplus of wins without limit"

(ie. winning more than 50% of the matches) does NOT imply a surplus of

RATING POINTS without limit. To maintain a rating over 1800, you

would have to consistently win more than 2/3 of the games against me.
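
The break-even arithmetic above can be sketched as well. Under the
FIBS exchange rule, the winner of an n-point match gains 4*sqrt(n)*(1-P)
rating points and the loser loses the same amount, where P was the
winner's pre-match probability. (The extra multiplier FIBS applies to
low-experience players is ignored in this sketch.)

```python
import math

def win_prob(rating_a, rating_b, n=1):
    # FIBS formula: probability that player A wins an n-point match.
    d = rating_b - rating_a
    return 1.0 / (1.0 + 10.0 ** (d * math.sqrt(n) / 2000.0))

def expected_change(rating_a, rating_b, actual_win_rate, n=1, matches=300):
    # Net rating change A expects over `matches` n-point matches if A
    # actually wins `actual_win_rate` of them.
    p = win_prob(rating_a, rating_b, n)
    gain_per_win = 4.0 * math.sqrt(n) * (1.0 - p)   # ~1.33 for 1800 vs 1200
    loss_per_loss = 4.0 * math.sqrt(n) * p          # ~2.67
    return matches * (actual_win_rate * gain_per_win
                      - (1.0 - actual_win_rate) * loss_per_loss)

p = win_prob(1800, 1200)                  # about 2/3
# Winning exactly the predicted 2/3 of matches leaves the rating flat:
print(expected_change(1800, 1200, p))     # 0.0 -- the equilibrium point
# Winning 70% instead would keep pushing the rating upward:
print(round(expected_change(1800, 1200, 0.70), 1))   # positive, about +40
```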

> I understand the argument that within the FIBS

> rating scheme any player will eventually settle

> at around a certain rating. What I'm arguing is

> that any player claimed to be one of the top 10

> players in the world (human or robot) would break

> away from the pack by a larger gap before reaching

> their so-called "true rating"...

But right behind the top 10 players in the world are another 100 that

are very nearly as good as them. If there WERE a group of 10 players

on FIBS who were far better than anybody else then we would expect a

gap between their ratings and those of all other players; but there

are not. In any case, you're talking about the distribution of the

population, not the ratings mechanism. Just because you're one of the

top 10 players in the world doesn't mean a different set of rules

apply to you; the same reasoning in the example above (you can't

expect to rise above 1800 no matter how much you play me, if you only

win 2/3 of the games) applies to top 10 players, just like it does to

everybody else.

Cheers,

Gary.

Nov 26, 1998, 3:00:00 AM11/26/98

to

limill...@my-dejanews.com wrote:

><739vq4$k67$1...@news.chatlink.com> Murat Kalinyaprak wrote:

>> Fine. Let's look at this from a different angle also.

>> Let's have Snowie (rated at 2089?) play each opponent

>> a 1-point match and only once. With about 600 points

>> difference (2089-1500), its winning chances would be

>> around 65%? for 1-point matches. If Snowie does indeed

>> win 65% of those matches, then we can reword your above

>> statement as: "Snowie can beat 65% of its opponents"

>> (among players on FIBS with 1500 average rating)...

>> If the players on FIBS represent a good sampling of all

>> players in the world, then we can expand that statement

>> to say: "Snowie can beat 65% of all players on earth".

>> (When talking about matches of other lengths, the "65%"

>> can be replaced by the appropriate figure).

>> If we have Snowie play another set of matches against

>> the same players again, and again, and again... this

>> ratio will not change. Therefore, we can say