Jun 30, 2000, 3:00:00 AM

It seems commonsensical that the more players within a ratings system,

the larger the spread will tend to be from top to bottom.

For instance, oldtime FIBSters remember when 1800 really meant

something :)

Currently,

Rating       Currently        Highest    Lowest
System       Rated Players    (difference from start rating)
Norway:            126          +256      -300
BIBA:              194          +288      -396
Sweden:            369          +356      -192
Denmark:          1042          +264      -373
GamesGrid:        2402          +543      -550
FIBS:             6769          +768*    -1170**
Netgammon            ?             ?         ?

How can this effect be quantified?

*2nd highest is +580

**2nd lowest is -845

Jul 10, 2000, 3:00:00 AM

On Fri, 30 Jun 2000 06:22:17 GMT, rac...@best.com (Daniel

Murphy) wrote:

>It seems commonsensical that the more players within a ratings system,

>the larger the spread will tend to be from top to bottom.

However commonsensical it might be, it is not in line with the
workings of the rating system. This system is designed to
guarantee that a difference of 200 points is a difference of 200
points - no matter the number of players. The numbers you give
for the internet servers are not really interesting, since the
lowest ranked players are certainly (ill-programmed) bots. A more
interesting number would be the average deviation of the
rankings. Please note that an internet server, and probably also
a large national system like the Danish one, will tend to
attract more weak players.

>For instance, oldtime FIBSters remember when 1800 really meant

>something :)

This has to do with inflation IMHO (the average rating rising),

not with the rating differences.

Jul 10, 2000, 3:00:00 AM

Say Nis, how do you explain NIHILST's manipulation of the ratings on FIBS?

He plays 1 pointers against low rated players, and has pushed his ratings

to the top. Is there a way to create a ratings formula that won't let

this happen?

--

don

Jul 11, 2000, 3:00:00 AM

Nis Jørgensen <n...@dkik.dk> writes:

> On Fri, 30 Jun 2000 06:22:17 GMT, rac...@best.com (Daniel Murphy) wrote:

> >It seems commonsensical that the more players within a ratings system,
> >the larger the spread will tend to be from top to bottom.
>
> However commonsensical it might be, it is not in line with the
> workings of the rating system. This system is designed to
> guarantee that a difference of 200 points is a difference of 200
> points - no matter the number of players.


Well, I think you're both right here. Yes, the expected difference

between two players should (by and large) be equivalent regardless of

the number of players in the system, but it is perfectly normal to

find larger differences between the extremes with larger samples.

(After all, if you rate 10 randomly selected backgammon players,

chances are the best player of the 10 will be somewhat better than the

worst, but not by a huge amount. But if you rate every player in the

world, you'll be able to measure the difference between the world

champ and somebody who barely knows the rules, which we would expect

to be enormous.)

> The numbers you give for the internet servers are not really

> interesting - since the lowest ranked players are certainly

> (ill-programmed) bots. A more interesting number would be the average

> deviation of the rankings.

That's quite possibly true, although it depends what you mean by

"interesting" :-). A better metric for the "spread" of a distribution

would be the inter-quartile range (the difference between the 25th and

75th percentiles). If we are allowed to assume that the populations

we're sampling from are equivalent (e.g. that FIBS does not attract a

different type of player than those measured in the Norwegian

system), then the expected inter-quartile ranges between each rating

system ought to be the same. Of course this assumption is unlikely to

be reasonable in practice (FIBS attracts all kinds of players from the

casual to world-class professionals, whereas national ratings mostly

consist of regular tournament players; the tournament players are more

likely to be closely matched than the FIBS ones). But the important

thing about the inter-quartile range is that its expectation is

independent of sample size.
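The claim about sample size can be checked with a small simulation. The sketch below is only illustrative: it assumes normally distributed ratings with mean 1500 and standard deviation 150 (neither figure is established by the data in this thread), and the function name `spread_stats` is made up for the example. It compares the average inter-quartile range with the average top-to-bottom range for pools of different sizes.

```python
import random
import statistics

def spread_stats(n, mean=1500.0, sd=150.0, trials=100, seed=1):
    """Average IQR and average max-min range over `trials` pools of n ratings.

    mean and sd are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    iqrs, ranges = [], []
    for _ in range(trials):
        pool = [rng.gauss(mean, sd) for _ in range(n)]
        q1, _, q3 = statistics.quantiles(pool, n=4)  # the three quartile cut points
        iqrs.append(q3 - q1)
        ranges.append(max(pool) - min(pool))
    return statistics.mean(iqrs), statistics.mean(ranges)

if __name__ == "__main__":
    for n in (130, 1000, 6700):
        iqr, full = spread_stats(n)
        print(f"n={n:5d}  mean IQR={iqr:6.1f}  mean range={full:7.1f}")
```

Under these assumptions the inter-quartile range should stay close to the same value at every pool size, while the full range keeps widening as the pool grows.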

> >For instance, oldtime FIBSters remember when 1800 really meant

> >something :)

Well, I think it still means more or less the same thing, depending on

how you interpret it; a rating of 1800 today means you are in the top

6% (or so) of FIBS players. When there were only 300 players, that would

put you in the top 20; now that there are going on 7000, it's only enough

to make it into the top 400.

> This has to do with inflation IMHO (the average rating rising),

> not with the rating differences.

Actually I believe the effect of inflation is rather small compared to the

other factors. The median FIBS rating at the moment is only 1528, after

8 years of FIBS -- inflation of 3 or 4 points a year doesn't seem like much

to me!

To get back to the original question ("How can this effect be

quantified?"), there has surely been plenty of work on the expected

maxima of samples of various distributions. I'm at work at the moment

and don't have any references handy, so I cheated and made a quick

simulation which appears to show that the expected deviation of the

maximum of n samples from a normal distribution appears to grow

slightly less than proportionally with log(n). I plotted a graph of

this expectation and superimposed Daniel's data on it; I had to assume

that backgammon ratings are normally distributed with std. dev. 150

points. I have no idea whether this assumption is reasonable or not;

in practice the FIBS/GG standard deviations are likely to be higher

than the national ratings, because they include a wider variety of

players, as described above. (Daniel, do you still have your original

samples available? It might be interesting to compute the inter-quartile

ranges and standard deviations to see how much they vary between pools

of players.)
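A simulation along these lines might look like the sketch below. It is not Gary's original program; it simply assumes, as he did, normally distributed ratings with a standard deviation of 150 points, and the function name is hypothetical. For comparison it also prints the standard upper bound sd * sqrt(2 ln n) on the expected maximum of n normal samples, which is consistent with growth slower than log(n).

```python
import math
import random
import statistics

def expected_max_deviation(n, sd=150.0, trials=100, seed=1):
    """Monte Carlo estimate of the expected maximum of n draws from N(0, sd)."""
    rng = random.Random(seed)
    return statistics.mean(
        max(rng.gauss(0.0, sd) for _ in range(n)) for _ in range(trials)
    )

if __name__ == "__main__":
    # Pool sizes and highest observed deviations from Daniel's first table.
    observed = {126: 256, 194: 288, 369: 356, 1042: 264, 2402: 543, 6769: 768}
    for n, top in observed.items():
        est = expected_max_deviation(n)
        bound = 150.0 * math.sqrt(2.0 * math.log(n))
        print(f"n={n:5d}  simulated={est:6.1f}  bound={bound:6.1f}  observed=+{top}")
```

The sd of 150 is the same unverified assumption mentioned above, so any gap between the simulated and observed columns says as much about that assumption as about the rating systems.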

The graph is available in PostScript form at:

http://www.cs.arizona.edu/~gary/backgammon/spread.ps

for those interested.

Cheers,

Gary.

--

Gary Wong, Department of Computer Science, University of Arizona

ga...@cs.arizona.edu http://www.cs.arizona.edu/~gary/

Jul 11, 2000, 3:00:00 AM

On 11 Jul 2000 12:27:00 -0700, Gary Wong <ga...@cs.arizona.edu>

wrote:

>> This has to do with inflation IMHO (the average rating rising),

>> not with the rating differences.

>

>Actually I believe the effect of inflation is rather small compared to the

>other factors. The median FIBS rating at the moment is only 1528, after

>8 years of FIBS -- inflation of 3 or 4 points a year doesn't seem like much

>to me!

Hmmm - I saw that too. Perhaps the playing strength of players

signing up has decreased - that could give inflation in ratings,

without increasing the average. Or perhaps it is just as you say

- there is no real inflation - only 1800 doesn't make you

"someone".

Jul 11, 2000, 3:00:00 AM

>"Don Hanlen" <dha...@oneworld.owt.com> wrote in message

news:8kc0kc$423$1...@news.owt.com...

Snip

>

>Is there a way to create a ratings formula that won't let this happen?

>


Brings up a good point about the formulas used. Most of it makes sense to me

but there is one number in it that I wonder about. Maybe one of the old

timers will know.

Given:

The value of the match = n

The absolute value of the difference between the two players ratings = D

The probability of the lower rated player winning = U

The probability of the higher rated player winning = 1-U

The formulas for ratings change feed directly upon the probabilities

computed from above.

U is computed by this formula which then feeds to the ratings change

calculations

1/(10^(D*SQRT(n)/2000)+1)

Where did the "2000" come from?

When any other number is used, the ratings change for two equally rated
players is STILL the same: a match of 1 point = 2 ratings points. However, as
the difference between two players increases, the higher the number, the
closer the win/loss probabilities are to each other. This translates to a
lower winning probability for the higher rated player, and thus more points
for winning and fewer points deducted for losing. Conversely, as the number
is lowered, the "curve" flattens. The closer the rated players are, the less
of an effect there is.

For the sake of simplicity, below are some calcs between players separated
by 500 points. It's the same whether it is 1000 vs 500, or 2500 vs 2000.

With a value of 2000, if players are 1900 vs 1400:
Probability = 64.01%/35.99%
Ratings change for 1900 player (Win/Loss) = 1.440/-2.560

When using 3000:
Probability = 59.48%/40.52%
Ratings change for 1900 player (Win/Loss) = 1.621/-2.379

Likewise the lower the number the smaller the spread. Using 1000 instead of
2000 it calcs like so:
Probability = 75.97%/24.03%
Ratings change for 1900 player (Win/Loss) = 0.961/-3.039

As long as a player's real-life winning percentage exceeds the computed
probability, their rating WILL go up with time.
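The figures above can be reproduced with a few lines of code. The sketch below uses the formula for U exactly as quoted. The rating change rule (4 * sqrt(n) times the probability of the result that did not happen) is not stated in this post; it is an assumption inferred from the quoted numbers, and it ignores any experience-based multiplier. The function names are made up for the example.

```python
import math

def upset_probability(diff, match_length, scale=2000.0):
    """U = 1 / (10^(D*sqrt(n)/scale) + 1): chance that the lower-rated player wins."""
    return 1.0 / (10.0 ** (abs(diff) * math.sqrt(match_length) / scale) + 1.0)

def favourite_changes(diff, match_length, scale=2000.0):
    """Return (gain if the favourite wins, loss if the favourite loses).

    Assumption: the change is 4*sqrt(n) times the probability of the
    outcome that did not occur, as inferred from the figures in the post.
    """
    u = upset_probability(diff, match_length, scale)
    stake = 4.0 * math.sqrt(match_length)
    return stake * u, -stake * (1.0 - u)

if __name__ == "__main__":
    for scale in (2000.0, 3000.0, 1000.0):
        u = upset_probability(500, 1, scale)
        win, loss = favourite_changes(500, 1, scale)
        print(f"scale={scale:6.0f}  P={100 * (1 - u):.2f}%/{100 * u:.2f}%  "
              f"win/loss={win:.3f}/{loss:.3f}")
```

With a 500 point difference and a 1 point match this prints the same probabilities and win/loss changes as the three cases listed above.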

Even if all of this proves a change is needed, it does not necessarily mean
FIBS (and the rest of the BG community) should change anything. However:

Does a "flattened" probability scale more accurately reflect the real world?

If so, are there distributions and data that provide real world input to

what the number should be?

Also if so, what are the impressions and opinions of what it should be,

based upon experience?

Here is another way to look at it. How many one point games out of 1000

would a 1450 rated player beat Jellyfish? I doubt it would be 360. Maybe we

could set up two bots with lots of games (some of them have tens of

thousands of games) that are rated differently by 500 points and let them go

at it in an unrated 1,000 point money game?

Jul 11, 2000, 3:00:00 AM

On 11 Jul 2000 12:27:00 -0700, Gary Wong <ga...@cs.arizona.edu> wrote:

>Nis Jørgensen <n...@dkik.dk> writes:

>> The numbers you give for the internet servers are not really

>> interesting - since the lowest ranked players are certainly

>> (ill-programmed) bots. A more interesting number would be the average

>> deviation of the rankings.

Server rating extremes on either end are subject to uncompetitive
manipulation. The lowest rating on FIBS is 649.78, the 2nd lowest
701.50. The 3rd lowest (775.72) is most definitely a bona fide human.
The 4th lowest (also human) boasts a substantially more respectable
rating of 870.16. All the lowest rated GamesGrid players appear to be
bona fide human players.

See below, where the statistics given include exclusion of 1% of

players at each end of the ratings lists.

>Actually I believe the effect of inflation is rather small compared to the

>other factors. The median FIBS rating at the moment is only 1528, after

>8 years of FIBS -- inflation of 3 or 4 points a year doesn't seem like much

>to me!

Agreed, and an aside: it's been mentioned in other discussions that

average, not median rating, is a better indication of ratings

inflation. Danish median is 1502.86, the average 1516.6. Norway median

is 1526.50, average is 1523.13. Calculating averages for other systems

is beyond my endurance for tedium.

>To get back to the original question ("How can this effect be

>quantified?"), there has surely been plenty of work on the expected

>maxima of samples of various distributions. I'm at work at the moment

>and don't have any references handy, so I cheated and made a quick

>simulation which appears to show that the expected deviation of the

>maximum of n samples from a normal distribution appears to grow

>slightly less than proportionally with log(n). I plotted a graph of

>this expectation and superimposed Daniel's data on it; I had to assume

>that backgammon ratings are normally distributed with std. dev. 150

>points. I have no idea whether this assumption is reasonable or not;

>in practice the FIBS/GG standard deviations are likely to be higher

>than the national ratings, because they include a wider variety of

>players, as described above. (Daniel, do you still have your original

>samples available? It might be interesting to compute the inter-quartile

>ranges and standard deviations to see how much they vary between pools

>of players.)

Can you use these statistics, Gary?

Group     #    #1        75%-ile             median    25%-ile             lowest
FIBS   6683    2273.74   1640.40 (+111.76)   1528.64   1430.50 (-69.50)     701.50
GG     2418    2068.09   1711.68 (+132.64)   1579.04   1491.28 (-87.76)     957.58
DKk     992    1764.93   1555.22 (+ 52.36)   1502.86   1473.80 (-29.06)    1126.31
SEn     388    1879.00   1605.00 (+ 99.00)   1506.00   1428.00 (-78.00)    1173.00
BIBA    217    1781.00   1597.00 (+ 92.00)   1505.00   1432.00 (-73.00)    1102.00
NO      130    1760.00   1618.00 (+ 91.50)   1526.50   1435.00 (-91.50)    1200.00

Group     #    1%-ile    75%-ile             median    25%-ile             99%-ile
FIBS   6683    1911.42   1640.40 (+111.76)   1528.64   1430.50 (-69.50)    1136.30
GG     2418    1960.75   1711.68 (+132.64)   1579.04   1491.28 (-87.76)    1234.71
DBgF    992    1696.22   1555.22 (+ 52.36)   1502.86   1473.80 (-29.06)    1373.77
SBgF    388    1827.00   1605.00 (+ 99.00)   1506.00   1428.00 (-78.00)    1266.00
BIBA    217    1772.00   1597.00 (+ 92.00)   1505.00   1432.00 (-73.00)    1187.00
NBgF    130    1741.00   1618.00 (+ 91.50)   1526.50   1435.00 (-91.50)    1272.00

FIBS: excludes unknown # of players with less than 50 TMP, and lowest

ranked "player."

GG: excludes the non-player at bottom of list

DK: excludes 51 members with 0 TMP

Sweden: includes all rated players (qualifications unknown)

BIBA: includes all rated players (qualifications unknown)

Norway: includes all listed players (i.e., minimum 15 matches and at
least 1 match played in last year).

Danish system start point is 1000, not 1500; ratings adjusted by +500

for comparison.

Jul 11, 2000, 3:00:00 AM

rac...@best.com (Daniel Murphy) writes:

> On 11 Jul 2000 12:27:00 -0700, Gary Wong <ga...@cs.arizona.edu> wrote:

> >Actually I believe the effect of inflation is rather small compared to the

> >other factors. The median FIBS rating at the moment is only 1528, after

> >8 years of FIBS -- inflation of 3 or 4 points a year doesn't seem like much

> >to me!

>

> Agreed, and an aside: it's been mentioned in other discussions that

> average, not median rating, is a better indication of ratings

> inflation.


True -- I tried searching for the articles about inflation that had

been posted here in the past, but unfortunately now that we have only

a "precision buying service" instead of Deja News, things like that

aren't easy to find.

Luckily we still have Tom Keith's r.g.b. archive -- one relevant article

is:

http://www.bkgm.com/rgb/rgb.cgi?view+416

which does seem to indicate that a FIBS rating of 1800 has been reasonably

consistent at marking the 95th percentile in 1995, 1997 and 2000.

One other snippet -- Michael Klein's latest FIBS Ratings Report shows

the mean FIBS rating to be 1534, which is surprisingly close to the median.

Thanks for those data! (I believe that the "-69.50" figure in the FIBS

25%-ile should be "-98.14".)

A few random observations:

- The inter-quartile ranges of the online servers do seem to be

significantly higher than the national ratings (~210 vs. ~170), which

supports the hypothesis that the Internet servers attract a more

varied range of players than real-life tournaments.

The Danish range is much smaller than the others, though; I have no

idea why this would be the case (perhaps the results include a large

number of relatively new players? The other descriptions make it

sound as if they do or might exclude inexperienced players.)

- The Danish, Swedish and British medians show virtually no sign of

inflation. I suspect this may be because they "include all rated

players": the main cause of inflation is that weak players are more

likely to leave the system than strong players, and so weak ratings

are gradually deleted over time which effectively raises whatever is

left behind. The Norwegian ratings (which require at least 1 match

played in the last year) show comparable inflation to FIBS.

GamesGrid shows the most inflation of all. This might well be because

the financial cost increases the tendency of weak players to leave. I

understand that GG have added points to all players' ratings in the

past when a server crash lost the results of some games (I'm not sure

which is more disturbing -- that somebody thought this was a good idea,

or that users were apparently pacified by it!) which would certainly

add to this effect.

- The results show that the distributions tend to be skewed slightly to
the right (the upper quartile lies further from the median than the
lower quartile does). One explanation for this might be that weak
players tend to improve faster than strong players (hopefully nobody's
getting significantly worse!) which could shrink the left-hand tail
somewhat.
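The inter-quartile ranges and the asymmetry mentioned in these observations can be recomputed directly from the quartiles in Daniel's table. The sketch below only does arithmetic on those published figures; the dictionary and function names are made up for the example.

```python
# (upper quartile, median, lower quartile), copied from Daniel's table.
QUARTILES = {
    "FIBS": (1640.40, 1528.64, 1430.50),
    "GG":   (1711.68, 1579.04, 1491.28),
    "DBgF": (1555.22, 1502.86, 1473.80),
    "SBgF": (1605.00, 1506.00, 1428.00),
    "BIBA": (1597.00, 1505.00, 1432.00),
    "NBgF": (1618.00, 1526.50, 1435.00),
}
ONLINE = ("FIBS", "GG")

def iqr(group):
    """Inter-quartile range: upper quartile minus lower quartile."""
    upper, _, lower = QUARTILES[group]
    return upper - lower

def asymmetry(group):
    """(upper - median) minus (median - lower); positive means skewed right."""
    upper, median, lower = QUARTILES[group]
    return (upper - median) - (median - lower)

if __name__ == "__main__":
    for group in QUARTILES:
        kind = "online" if group in ONLINE else "national"
        print(f"{group:5s} {kind:8s} IQR={iqr(group):7.2f}  asymmetry={asymmetry(group):+7.2f}")
```

On these figures the two online servers come out at roughly 210 and 220, the Swedish, British and Norwegian lists at roughly 165 to 183, and the Danish list at about 81, which matches the first observation above.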

Jul 12, 2000, 3:00:00 AM

On Tue, 11 Jul 2000 16:28:24 -0500, "Pip_Panther"

<Pip_noPa...@my-deja.com> wrote:

[snip]

>1/(10^(D*SQRT(n)/2000)+1)

>

>Where did the "2000" come from?

It sets the scale of the rating system, i.e. how much a
certain difference means in playing strength. If you replace it
with 1000, two players who are 200 points apart in the old rating
system will be 100 points apart in the new one. You will see that
they then have exactly the same expected outcomes as in the old
system.

The number is not totally irrelevant, though, as the scaling must
be balanced against the number of points lost and gained in single
matches. If the scaling is too low, people will be elevatoring. If
the scaling is too high, it takes too long to reach your
true rating.
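The rescaling argument is easy to verify numerically. The sketch below uses the formula quoted earlier in the thread; the helper `gap_for_favourite` is a hypothetical name introduced here to invert that formula, and the 60% figure is just an example.

```python
import math

def upset_probability(diff, match_length, scale):
    """U = 1 / (10^(D*sqrt(n)/scale) + 1), as quoted earlier in the thread."""
    return 1.0 / (10.0 ** (abs(diff) * math.sqrt(match_length) / scale) + 1.0)

def gap_for_favourite(p_favourite, scale, match_length=1):
    """Rating gap at which the favourite wins with probability p_favourite."""
    odds = p_favourite / (1.0 - p_favourite)
    return scale * math.log10(odds) / math.sqrt(match_length)

if __name__ == "__main__":
    # 200 points under 2000 and 100 points under 1000 give the same expectation.
    print(upset_probability(200, 1, 2000.0), upset_probability(100, 1, 1000.0))
    # The rating gap that corresponds to a 60% favourite shrinks with the scale.
    for scale in (1000.0, 2000.0, 4000.0):
        print(f"scale={scale:6.0f}  gap for 60% favourite = {gap_for_favourite(0.6, scale):6.1f}")
```

If the points won and lost per match stay the same while the constant changes, a single result moves a player across a larger or smaller share of that gap, which is the balance described above.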

Jul 12, 2000, 3:00:00 AM

On 11 Jul 2000 18:17:48 -0700, Gary Wong <ga...@cs.arizona.edu>

wrote:

> - The Danish, Swedish and British medians show virtually no sign of

> inflation. I suspect this may be because they "include all rated

> players": the main cause of inflation is that weak players are more

> likely to leave the system than strong players, and so weak ratings

> are gradually deleted over time which effectively raises whatever is

> left behind. The Norwegian ratings (which require at least 1 match

> played in the last year) show comparable inflation to FIBS.

I am almost sure the Danish numbers are only of paying members

(correct me if I'm wrong, Daniel). I suspect the Norwegians, like

FIBS, use a gearing system. This way, when a strong player enters

at 1500, he injects rating points into the system.

Jul 12, 2000, 3:00:00 AM

The Danish rating list includes only current, paid-up members. Members

who neglect to renew their membership are dropped from the rankings.

Ditto for GamesGrid and NBgF and, I assume, for BIBA and SBgF. Not

only because seeing one's name in the ratings list is an incentive to

remain a member, but because (as is the case in Denmark) membership in

the national federation is mandatory for residents to participate in

Open or Intermediate flights of almost all tournaments.

But several factors do limit inflation in the national ratings. No one

can drop out and then rejoin under a different identity. No one ever

gets his rating "re-set" to par. The system never awards all players X

points. At least in Denmark, everyone new to the system starts out at

par regardless of real or estimated ability. And my impression is that

in Denmark, for example, there's a small but steady outflow of

higher-ranked players every year, as people move or give up real life

play for whatever reason -- I imagine this effect isn't so notable on

the online servers. Nis mentions another reason -- unlike all the

online systems, the Danish system has no accelerated ratings boost for

low-experience players. I believe he's correct that the Norwegian

system has adopted the exact FIBS formula, including the "boost" for

players with less than 400 TMP.
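To see how such a boost can add points to the pool, here is a rough sketch. It assumes a multiplier of max(1, 5 - experience/100), which is how the FIBS boost is usually described and which agrees with the 400 TMP cut-off mentioned above, but treat the exact form as an assumption. It also assumes both players' changes share the base amount 4 * sqrt(n) * (probability that the loser would have won). The function names are made up for the example.

```python
import math

def boost(experience):
    """Assumed multiplier: 5 for a newcomer, falling to 1 at 400 experience."""
    return max(1.0, 5.0 - experience / 100.0)

def net_points_created(winner_exp, loser_exp, match_length=1, p_winner=0.5):
    """Winner's gain minus loser's loss for one match.

    Zero means the match only moved points between the players; a positive
    value means points were added to the pool, a negative value removed.
    """
    base = 4.0 * math.sqrt(match_length) * (1.0 - p_winner)
    return (boost(winner_exp) - boost(loser_exp)) * base

if __name__ == "__main__":
    print(net_points_created(0, 1000))     # newcomer beats an established player
    print(net_points_created(1000, 0))     # newcomer loses to an established player
    print(net_points_created(1000, 1000))  # two established players
```

Under these assumptions a newcomer who wins more often than he loses during his first 400 experience points adds points to the pool, and a newcomer who mostly loses removes them.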

Jul 12, 2000, 3:00:00 AM

On 11 Jul 2000 18:17:48 -0700, Gary Wong <ga...@cs.arizona.edu> wrote:

>True -- I tried searching for the articles about inflation that had

>been posted here in the past, but unfortunately now that we have only

>a "precision buying service" instead of Deja News, things like that

>aren't easy to find.


The DejaNews newsgroup archive is still there, it's just not mentioned

on the deja.com front page. Who knows why ... I think if you

click on the link to their FAQs you can find your way to their

archive.

Jul 12, 2000, 3:00:00 AM

rac...@best.com (Daniel Murphy) writes:

> The DejaNews newsgroup archive is still there, it's just not mentioned

> on the deja.com front page. Who knows why ... I think if you

> click on the link to their FAQs you can find your way to their

> archive.


Well, it's still there, but old messages aren't. The earliest available

message from r.g.b. is Henrik Jensen's "Gammon?" from May 15th last year.

Bits and pieces of the archive have been coming and going for the last

two months. The last I heard was a notice at:

http://www.deja.com/=dnc/info/site_move.shtml

which reads:

Old Usenet messages - Between May 20 and May 26, messages posted 2

weeks to a year ago will not be available. Starting May 4, many

messages posted over two years ago will not be accessible on a

temporary basis, and after May 15, all messages posted over a year ago

will not be accessible on a temporary basis. We will be taking this

opportunity to reconfigure the service that provides messages posted

prior to May 1999. Therefore, these messages will not be accessible on

the site for some time, possibly a few months. Have no fear: We're

committed to bringing these messages back online as soon as possible.

Jul 12, 2000, 3:00:00 AM

rac...@best.com (Daniel Murphy) writes:

> On Wed, 12 Jul 2000 11:03:04 +0200, Nis Jørgensen <n...@dkik.dk> wrote:

> >I am almost sure the Danish numbers are only of paying members

> >(correct me if I'm wrong, Daniel). I suspect the Norwegians, like

> >FIBS, use a gearing system. This way, when a strong player enters

> >at 1500, he injects rating points into the system.

>

> The Danish rating list includes only current, paid-up members. Members

> who neglect to renew their membership are dropped from the rankings.

> Ditto for GamesGrid and NBgF and, I assume, for BIBA and SBgF.


Thanks! Those explanations make a lot of sense, and appear to correspond

well to the data Daniel compiled. If the Danish system makes equal rating

changes regardless of experience, that could well explain why the observed

range is smaller there than with the other systems.

Jul 13, 2000, 3:00:00 AM

Pip_Panther wrote:

> Snip

>

>

> Does a "flattened" probability scale more accurately reflect the real world?

What is the real world, LOL? But I think I know what you meant...

>

> If so, are there distributions and data that provide real world input to

> what the number should be?

> Also if so, what are the impressions and opinions of what it should be,

> based upon experience?

It depends on whether we are talking bot vs. bot, human vs. human, or human vs.

bot--and in this last group, it depends on who is higher-rated, the human or the

bot.

>

> Here is another way to look at it. How many one point games out of 1000

> would a 1450 rated player beat Jellyfish? I doubt it would be 360.

If by JellyFish, you mean 3.0 or 3.5, Level 7, the games sure would be ugly to

watch.

> Maybe we

> could set up two bots with lots of games (some of them have tens of

> thousands of games) that are rated differently by 500 points and let them go

> at it in an unrated 1,000 point money game?

Low-rated bots tend to play only "one-pointers", but you could still have them

play 1000 games. However, the score at the end of 1000 games would be different

when you have a 1900-rated human vs. a 1400-rated bot, as opposed to a

1900-rated bot vs. a 1400-rated bot. I believe the 1900-rated human would do

better than the 1900-rated bot, because he/she would notice what the bot's

weakest areas are, and intentionally steer the games in that direction. As yet,

there is no bot (that I know of, anyway) that changes its plays based on what it

sees its opponent doing wrong.

For example, Costello (a bot on FIBS, rating class 1500-1600) will run from a

defensive anchor, especially a 3-anchor, long before it should. Therefore, if it

would normally be a close call as to whether to hit loose to prevent his getting

the anchor, you might as well punt and let him make it. He won't know what to

do with it once he has it anyway.

Costello is also rather clueless in a backgame, as are even some of the

higher-rated bots. If you are playing one-pointers (which are all it plays

anyway), if you fall behind in the race early on, you might as well go all-out

and play from the back, even when against a human a bit more moderation would be

called for. It has been my experience that the bot will cheerfully help you

solve your timing problems and also dump checkers behind you instead of clearing

forward points ASAP, causing it to leave shots later that could have been

avoided.

The above are just two examples. Put the cube in the picture, and you have even

more angles to discover and exploit. These angles can be found against any

player, myself included. The difference is that a human will look for these

things, but a bot won't because it doesn't know how. This means that a

technically superior bot may be ranked below humans that go this extra mile.

One more thing: Even if you did decide to have the bots play each other, you

would need to establish the true rating for each bot based on some consistent

method. I have seen one FIBS bot, MonteCarlo, rated below 1800 and I've also

seen it in the higher half of the 1900s! I haven't watched the others as much,

but I am sure that they, too, cover a lot of ground as they rack up their

5-figure experience levels.

mamabear

Jul 13, 2000, 3:00:00 AM

"Mary Hickey" <mamab...@att.net> wrote in message

news:396D72BF...@att.net...

>

>

> What is the real world, LOL? But I think I know what you meant...

>

other for who gets the final say.

>

> > Here is another way to look at it. How many one point games out of 1000

> > would a 1450 rated player beat Jellyfish? I doubt it would be 360.

>

> If by JellyFish, you mean 3.0 or 3.5, Level 7, the games sure would be ugly to
> watch.

>

lvl5? I don't remember.

You are right about a test of bot vs bot or even human vs bot. The only way

to tell would be to gather bulk data from actual match statistics. For

example, how often do matches between players separated by x points adhere

to the computed probabilities? As in the example I gave, what is the

dispersion of win/loss between players separated by 500 points and does it

match the formula? 450? 300? I think it is evident there is "elevatoring"

going on. If the data showed that the greater the point spread between

players the greater the deviation from the formula then it would be proof

and the 2000 "seed" number could be adjusted up or down.
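A check of this kind could be sketched as follows: group match results by rating difference and compare the observed share of upsets with the share the formula predicts. The function names and the record layout are hypothetical, and the four results in the demo are made up purely to show the output format; real match records would be needed to draw any conclusion.

```python
import math
from collections import defaultdict

def upset_probability(diff, match_length=1, scale=2000.0):
    """U = 1 / (10^(D*sqrt(n)/scale) + 1), as quoted earlier in the thread."""
    return 1.0 / (10.0 ** (abs(diff) * math.sqrt(match_length) / scale) + 1.0)

def calibration(results, bucket=100):
    """Group (rating_diff, match_length, underdog_won) records by rating gap.

    Returns {bucket start: (matches, observed upset rate, predicted upset rate)}.
    """
    table = defaultdict(lambda: [0, 0, 0.0])  # matches, upsets, sum of predictions
    for diff, match_length, underdog_won in results:
        row = table[int(abs(diff) // bucket) * bucket]
        row[0] += 1
        row[1] += int(underdog_won)
        row[2] += upset_probability(diff, match_length)
    return {start: (m, u / m, e / m) for start, (m, u, e) in sorted(table.items())}

if __name__ == "__main__":
    # Made-up records, for illustration only.
    demo = [(500, 1, True), (500, 1, False), (520, 1, False), (50, 1, True)]
    for start, (matches, observed, predicted) in calibration(demo).items():
        print(f"{start:4d}+  matches={matches}  observed={observed:.3f}  predicted={predicted:.3f}")
```

If the observed column drifted further from the predicted column as the rating gap grew, that would be the kind of evidence described above.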

It doesn't mean that the whole thing makes it any less fun to play. Ratings
are fun to have, but they are just not that big of a deal to a lot of
people. Some don't even care at all. And for those that do care, adjusting
the formula that has been used so widely for so long would not be easily
accepted. The cure could be worse than the problem.

Jul 13, 2000, 3:00:00 AM

On Thu, 13 Jul 2000 12:31:08 -0500, "Pip_Panther"

<Pip_noPa...@my-deja.com> wrote:

>You are right about a test of bot vs bot or even human vs bot. The only way

>to tell would be to gather bulk data from actual match statistics. For

>example, how often do matches between players separated by x points adhere

>to the computed probabilities?

The question is not "how often do they match", but "how well do

they match".

> As in the example I gave, what is the

>dispersion of win/loss between players separated by 500 points and does it

>match the formula? 450? 300? I think it is evident there is "elevatoring"

>going on. If the data showed that the greater the point spread between

>players the greater the deviation from the formula then it would be proof

>and the 2000 "seed" number could be adjusted up or down.

As I explained, or tried to, the number 2000 is not the important
part here. We would have to change either the outer formula (the
1/(10^(something) + 1) part), or the formula for "something",
which basically involves (rating difference)/2000 and match length.

I think DBgF has data lying around which would prove
useful in this matter.
