
statistics


jonbrown1

Nov 14, 2002, 8:38:53 AM
I have a few questions related to sample size in statistics:


1. About how many experience points are necessary to get an accurate measure
of your rating? How does the answer change if you only play the same computer
opponent, which has a fixed rating and performs at the same level (e.g.
playing offline against Snowie), as opposed to playing opponents of varying
levels and ratings (e.g. playing on GamesGrid)?

A related question:

2. About how many matches are needed to determine with a reasonable level of
statistical certainty the stronger of two players who are only playing each
other in 1pt or only 7pt. or only 25 pt matches?


Anyone know of any good websites where I could go to learn more about how
statistics can solve these kinds of problems?


Lastly, thanks Greg and Jorn, your responses are always helpful and your
postings are insightful.


Peter Schneider

Nov 14, 2002, 12:09:49 PM
Hi,

> 1. About how many experience points are necessary
> to get an accurate measure
> of your rating

Well... what do you mean by accurate? I'd guess that more than 50% of the
time I'm more than 20 rating points away from the rating that corresponds
to my average skill. (Yes, also upwards! ;-) ) The interval spanned within
the last few months was 1740-1860. Kit Woolsey is said to have been seen below
1800 on FIBS once.

So I think that for me, after having played many thousands of games, there is
a 95% probability of being within the range 1760-1820.

The center of the oscillations gives a better number, but it takes so many
games to determine that I hopefully have improved in the meantime. (But true,
this could be detected by sophisticated statistical methods.)

I'd say that a rating snapshot at best gives an indication of the playing
class (like intermediate or advanced), with a significant probability to be
off by 1 class.

Last but not least, imho the online rating does not say much about play over
the board.

I hope I


Michael Sullivan

Nov 14, 2002, 3:46:49 PM
Peter Schneider <schneiderp...@gmx.net> wrote:

> Well... what do you mean with accurate? I'd guess that I'm more than 50% of
> the time more than 20 rating points away from the rating which corresponds
> to my average skill. (Yes, also upwards! ;-) ) The interval spanned within
> the last months was 1740-1860. Kit Woolsey is told to have been seen below
> 1800 on FIBS once.

> So I think for me, after having played many thousand games, I have a 95%
> probability to be within a range of 1760-1820.

> The center of the oscillations gives a better number, but it takes so many
> games to determine, that I hopefully have improved inbetween. (But true,
> this could be detected by sophisticated statistical methods.)

> I'd say that a rating snapshot at best gives an indication of the playing
> class (like intermediate or advanced), with a significant probability to be
> off by 1 class.

> Last not least, imho the online rating does not say much about the play over
> the board.

Here's a question -- what would you say the rating classes are? I've
been doing some searching and not finding any really good information
about this.

What percentage of active tournament players are in the various ranges?
What's the bottom of the "world class" range (1900?)?

Is it trying to model Elo, where a 200 point difference equates to an 80%
winning chance for the higher rated player? And if so, for what match
length? People enter at 1500 on FIBS automatically, so presumably
that's average or close to it for avid internet players. What range would you
call intermediate? 1400-1600?


Michael

--
Michael Sullivan
Business Card Express of CT Thermographers to the Trade
Cheshire, CT mic...@bcect.com

Douglas Zare

Nov 14, 2002, 4:52:55 PM
jonbrown1 wrote:

> I have a few questions related to sample size in statistics:
>
> 1. About how many experience points are necessary to get an accurate measure
> of your rating ((how does the answer change if you are only playing the same
> computer opponent, with a fixed rating who performs at the same level (ex.
> playing offline against snowie) as opposed to playing opponents of varying
> levels and ratings (ex. playing on gamesgrid)).

I agree with Peter Schneider's comments, but I'll reply directly instead of
quoting them.

You can't get a very accurate estimate from your current rating. If everyone
else is correctly rated and the FIBS formula is accurate, then in the long run,
your rating will follow a stable distribution with standard deviation roughly
42. The tails are slightly thicker than for a normal distribution, as they drop
off roughly exponentially rather than as exp(-c x^2). Now that I think about it
again, maybe the drop-off is more like exp(-c x ln x).
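
A minimal sketch of where a figure near 42 can come from. It assumes the commonly cited FIBS formula, which is not stated in this thread: in an N-point match between players whose ratings differ by D, the underdog's winning chance is taken to be 1/(10^(D*sqrt(N)/2000) + 1), and the winner gains 4*sqrt(N) times the chance that the rating system gave the loser. It also assumes every opponent is correctly rated at exactly the player's true skill, so each match is really a coin flip, and it ignores the ramp-up multiplier for low experience. The function names and parameters are illustrative only. Under those assumptions a small-gap approximation gives a standard deviation of sqrt(4000/ln 10), about 41.7, for any match length, and the simulation can be compared against it.

```python
import math
import random
import statistics


def upset_probability(rating_gap, match_length):
    """Assumed FIBS formula: chance that the lower-rated player wins."""
    return 1.0 / (10.0 ** (rating_gap * math.sqrt(match_length) / 2000.0) + 1.0)


def analytic_sd():
    """Small-gap approximation to the long-run standard deviation.

    Near the true rating each match moves the rating by about
    2*sqrt(N) either way (variance 4N), and the expected pull back
    toward the true rating is about N*ln(10)/2000 per point of gap,
    so the long-run variance is 4N / (2*N*ln(10)/2000) = 4000/ln(10).
    """
    return math.sqrt(4000.0 / math.log(10.0))


def simulated_sd(n_matches=400_000, match_length=1, burn_in=20_000, seed=1):
    """Simulate one player's rating against equally skilled opponents."""
    rng = random.Random(seed)
    true_rating = 1500.0  # also the rating of every (correctly rated) opponent
    rating = true_rating
    stake = 4.0 * math.sqrt(match_length)
    samples = []
    for i in range(n_matches):
        p_upset = upset_probability(abs(rating - true_rating), match_length)
        rated_favorite = rating >= true_rating
        if rng.random() < 0.5:  # true skill is equal, so a fair coin
            # A win pays the probability the ratings assigned to a loss.
            rating += stake * (p_upset if rated_favorite else (1.0 - p_upset))
        else:
            # A loss costs the probability the ratings assigned to a win.
            rating -= stake * ((1.0 - p_upset) if rated_favorite else p_upset)
        if i >= burn_in:
            samples.append(rating)
    return statistics.pstdev(samples)


if __name__ == "__main__":
    print("analytic approximation:", round(analytic_sd(), 1))
    print("simulated:", round(simulated_sd(), 1))
```

The simulated value depends on the seed and the number of matches, since successive ratings are strongly correlated; it should land in the same neighborhood as the approximation rather than match it exactly.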

There are better indicators than the current rating. For large experience levels
(over perhaps 5000), the maximum rating varies less than the current rating.
This assumes that one's opponents and behavior do not change when one is
overrated, which seems unlikely.

You can find discussions of these and other issues related to the rating system
in the article by me and Adam Stocks, "Ratings - A Mathematical Survey", on
GammonVillage.com, which requires a subscription. You can also find some
information in the rec.games.backgammon archive:
http://www.bkgm.com/rgb/rgb.cgi?menu

If your opponents online are often misrated, this widens the stable
distribution, and affects the maximum rating. This can seriously affect the
distribution if there are, for example, overrated players who only play you when
they see you are overrated. You need a lot of extra assumptions to build that
into a mathematical model.

> A related question:
>
> 2. About how many matches are needed to determine with a reasonable level of
> statistical certainty the stronger of two players who are only playing each
> other in 1pt or only 7pt. or only 25 pt matches?

It depends. If one player wins 98% of the matches, very few matches will be
needed to detect that the stronger player is really stronger. It would take many
matches to determine whether the right value is 98% rather than 97% or 99%, but
few to realize that it isn't 50%. If there is a 51-49 advantage, it will take
many matches.

If you are trying to distinguish nearly equal players who differ by an advantage
of x% (1% corresponding to a 51-49 edge), then after about 7000/x^2 matches the
stronger player has a 95% chance of being ahead. If someone is a 60-40 favorite,
then about 70 matches are enough to make it a significant surprise for the
weaker player to be ahead. A 51-49 edge would take about 7000 matches to reach
the same level of confidence.
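
The 7000/x^2 rule can be checked with a normal approximation. This is only a sketch: the function name and parameters are hypothetical, and it assumes independent matches with a fixed winning chance p = 0.5 + x/100. After n matches the stronger player is ahead with probability about Phi(sqrt(n) * (p - 0.5) / sqrt(p * (1 - p))), so solving for n at a given confidence gives the count below.

```python
from statistics import NormalDist


def matches_needed(edge_percent, confidence=0.95):
    """Approximate matches for the stronger player to be ahead with
    the given probability (normal approximation to the binomial)."""
    p = 0.5 + edge_percent / 100.0          # e.g. 1 -> a 51-49 edge
    z = NormalDist().inv_cdf(confidence)    # one-sided quantile, about 1.645
    return (z * (p * (1.0 - p)) ** 0.5 / (p - 0.5)) ** 2


if __name__ == "__main__":
    for edge in (1, 2, 5, 10):
        print(edge, round(matches_needed(edge)))
```

For a 51-49 edge this gives a little under 7000 matches, and for 60-40 roughly 65, in line with the rounded figures above. The approximation gets rougher as the edge grows and the match count shrinks.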

On the other hand, the score is not the most efficient way to determine who is
stronger. There are unbiased methods of variance reduction (by obviously fair
side bets on each roll) that can decrease the number of matches needed by at
least a factor of 10. See my article "Hedging Toward Skill" on
GammonVillage.com. A version of this is implemented by gnu, though it takes
some work to extract the unbiased skill estimates.

If the FIBS formula is correct (in estimating the relative advantage in 25 point
matches from the advantage in 7 point matches), then the total experience level
needed is roughly the same for 7 point and for 25 point matches.

Douglas Zare

moorg

Nov 20, 2002, 10:13:18 PM
Hi there,

There is some good ratings info at:

www.acepointclub.com (NY BG Club)

and

www.zroundtable.com (BG Ratings)

It seems like experience of 1500 or so eliminates much of the
"ramp-up effect", BUT due to randomness, etc., one's Elo/FIBS rating
still fluctuates +/- 100 pts...


"jonbrown1" <jonb...@hotmail.com> wrote in message
news:NHNA9.117888$c51.35...@twister.nyroc.rr.com...
