Testing for Biased Dice

Allan Longley

Nov 10, 1992, 12:37:09 AM

Good day (evening) fellow netters.

Some time ago I was talking with somebody -- now that's descriptive -- about
testing dice for bias. Well, here is a test to use. I haven't actually
tested this yet, but it should -- in theory -- work. And no, this is not
a copy out of the old Dragon magazine, but it is the same test -- it's a
pretty standard test. I will use simple terms -- so all you math/stat
people out there, don't correct the fine points, I know.

To test a die with X sides (i.e., X=4,6,8,10,12,20), roll the die 20*X times
(e.g., 80 times for a d4, 400 times for a d20). Record the number of times
each face occurs (e.g., a 3 occurred 7 times) in a form similar to the first
two columns of Table 1.

Table 1. Example Results for a 4-sided Die
----------------------------------------------
value   occurrences   subtract 20   square
----------------------------------------------
  1         16            -4          16
  2         24             4          16
  3         18            -2           4
  4         22             2           4
----------------------------------------------
                              Total =   40
                  Indicator = 40/20 = 2.00
----------------------------------------------

Next, subtract 20 from the number of occurrences recorded in the second column
(i.e., third column of Table 1). Square these values (i.e., square the value
in the third column and record it in the fourth column of Table 1). Sum
the values in the fourth column (i.e., Total=40) and divide this value
by 20 to yield the indicator value (i.e., Indicator=2.00). Compare your
indicator value to the values in Table 2 based on how many sides the die
you're testing has -- the larger the value of the Indicator the worse the
die probably is.
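
For anyone who would rather let a computer do the arithmetic, the steps above can be sketched in Python. The function name `indicator` and its argument are illustrative labels, not part of the original description; the expected count per face is taken to be the total number of rolls divided by the number of sides (20 when the die is rolled 20*X times).

```python
def indicator(counts):
    """Sum of squared deviations from the expected count per face,
    divided by that expected count (the "Indicator" described above)."""
    sides = len(counts)
    expected = sum(counts) / sides          # 20 when the die is rolled 20*X times
    total = sum((c - expected) ** 2 for c in counts)
    return total / expected


# Table 1: a d4 rolled 80 times
print(indicator([16, 24, 18, 22]))          # prints 2.0
```

Because the expected count is worked out from the data, the same function should also handle a different number of rolls per face.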

Table 2. Critical Values of the Indicator
--------------------------------------------------------------
Sides        Probably         Maybe         Probably
of Die       Fair                           Biased
--------------------------------------------------------------
   4           0.35            2.37            7.82
   6           1.15            4.35           11.07
   8           2.17            6.35           14.07
  10           3.33            8.34           16.92
  12           4.58           10.34           19.68
  20          10.12           18.34           30.14
--------------------------------------------------------------

If you get an Indicator value below the value in the "Probably Fair" column
then the die is probably an unbiased die. If the value of the Indicator is
above the value in the "Probably Biased" column then the die is probably a
biased die. Values between these two columns are a problem. The column
titled "Maybe" gives the Indicator values where there is a 50% chance
that the die is fair and a 50% chance that the die is biased. If you have
gotten an Indicator value between the "Fair" and "Biased" columns then do the
test again. If both the tests produce an Indicator value less than the
"Maybe" value then the die is probably fair -- but don't bet your life on it.
If both the tests yield an Indicator value greater than the "Maybe" column
then the die could be biased.

From Table 1, it appears that the d4 tested may be "Fair" but another test
should be done.

-----------------------------------------------------------------------------
Allan Longley, University of Waterloo, Department of Chemical Engineering
e-mail: alon...@cape.uwaterloo.ca
voice: (519) 885-1211 Don't make it a complex problem
home: (519) 746-1498
-----------------------------------------------------------------------------
"Ph-nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn"
-----------------------------------------------------------------------------

Allan Longley

Nov 10, 1992, 11:44:16 AM

dcgo...@eos.ncsu.edu writes:

> There are a few dice I have that need to be tested, I think...
> The only thing I don't like about this system is having to roll that one
> die 100's of times...

Well, you could roll the die fewer times, but your chance of
getting a meaningful result decreases. The test could be changed to use
half the number of rolls as follows:

To test a die with X sides (i.e., X=4,6,8,10,12,20), roll the die 10*X times
(e.g., 40 times for a d4, 200 times for a d20). Record the number of times
each face occurs (e.g., a 3 occurred 7 times) in a form similar to the first
two columns of Table 1.

Table 1. Example Results for a 4-sided Die
----------------------------------------------
value   occurrences   subtract 10   square
----------------------------------------------
  1          8            -2           4
  2         12             2           4
  3          9            -1           1
  4         11             1           1
----------------------------------------------
                              Total =   10
                  Indicator = 10/10 = 1.00
----------------------------------------------

Next, subtract 10 from the number of occurrences recorded in the second column
(i.e., third column of Table 1). Square these values (i.e., square the value
in the third column and record it in the fourth column of Table 1). Sum
the values in the fourth column (i.e., Total=10) and divide this value
by 10 to yield the indicator value (i.e., Indicator=1.00). Compare your

voice: (519) 885-1211 Stats can NEVER prove something
home: (519) 746-1498 is true!

d...@acpub.duke.edu

Nov 10, 1992, 1:06:50 PM

This is called a chi-square test, and an article with the
procedure and numbers for it appeared way back in Dragon issue
#74... Thank you, Mr. Longley, for reposting it (or did you come
up with it in isolation? =) ) for the benefit of those who don't
have the issue (probably most readers).
-Abdiel

David Alexandre Golden

Nov 12, 1992, 2:40:15 PM

In article <BxIEH...@watserv1.uwaterloo.ca> alon...@cape.UWaterloo.ca (Allan Longley) writes:
>
>dcgo...@eos.ncsu.edu writes:
>
>> There are a few dice I have that need to be tested, I think...
>> The only thing I don't like about this system is having to roll that one
>> die 100's of times...

I once did something along those lines to test whether my DM's die was
fair. (It would roll 20's a lot. Often on command.)

To test the die, I made a histogram (I believe is the term for it) like this:

x x x x x
x x x x x x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x x x x x x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

With an "x" being each time that number came up. A totally fair die would
have a straight line across, assuming it was rolled enough times. The
question is, what is enough times? I rolled a d20 about 380 times (not bad
if one person rolls and the other makes an x in the column) and while it
gave a fairly good idea that the die was biased, the biased numbers had only
occurred a couple times more than the average. (If I remember correctly).
So I wrote a computer program to do the same thing until the deviation between
the highest and lowest number of occurrences was less than about 15% of the
average number of occurrences. (i.e. a reasonably smooth profile). The
computer required SEVERAL THOUSAND ROLLS to do this. Of course, the computer
is not using a truly random generator. Still, the point is that I'm skeptical
that the "fairness" of a die can be determined in only a hundred or so rolls.
(d4 maybe... d20 no way!)
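
A rough Python sketch of the kind of simulation described above. The original program is not shown, so everything here is an assumption: the stopping rule is checked after every roll, Python's `random` module stands in for whatever generator was used, and the names `rolls_until_smooth`, `tolerance` and `max_rolls` are made up for illustration.

```python
import random


def rolls_until_smooth(sides=20, tolerance=0.15, max_rolls=200_000, seed=None):
    """Roll a simulated fair die until the spread between the most and least
    frequent faces is below `tolerance` times the average count per face.
    Returns the number of rolls needed, or None if `max_rolls` is reached."""
    rng = random.Random(seed)
    counts = [0] * sides
    for rolls in range(1, max_rolls + 1):
        counts[rng.randrange(sides)] += 1
        average = rolls / sides
        if max(counts) - min(counts) < tolerance * average:
            return rolls
    return None


print(rolls_until_smooth(seed=1992))
```

Running it with a few different seeds gives a feel for how much the required number of rolls varies from run to run.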

Of course, I'm not a statistics major. Any comments by those with knowledge
of the field?

Dave Golden
--
David A. Golden '95 (dago...@phoenix.princeton.edu)
Princeton University

Adam Dray

Nov 12, 1992, 6:10:19 PM

In article <1992Nov12....@Princeton.EDU>, dago...@phoenix.Princeton.EDU (David Alexandre Golden) writes:
>
> I once did something along those lines to test whether my DM's die was
> fair. (It would roll 20's a lot. Often on command.)
>
> To test the die, I made a histogram (I believe is the term for it) like this:

[etc]

No, no, no. I, too, was once accused of using a bad d20, so I did a
chi square test on it. The details of the process have long been
erased from my memory, but it goes something like this:

1. Roll your die a lot of times.
2. Tally the number of times each possibility occurs.
3. Do a lot of messy, but otherwise simple, arithmetic on the numbers
in order to produce a standard deviation.
4. As long as that number falls within certain bounds, your die is
"fair."

In other words, a histogram shows very little. Random doesn't mean
necessarily that you'll get an even distribution. It just means the
probability that you won't get an even distribution is proportional to
the number of sides on the die, and the number of times you roll it.

Notes about the fairness of dice:

Sharp-edged dice are better than smooth-edged dice. They're also more
expensive, however. Rounded dice are often inked by coating the
entire die with ink, then tossing the die in a "tumbler" (similar to
tumblers for smoothing rocks) until all the ink on the outside is
gone. Thus, the ink is left in the crevices where the numbers are.

Theoretically, the grooves for the numbers can make one side a more
likely outcome. Official casino dice don't have inset pips.

The standards for polyhedral dice are very low. In one GameScience
ad, several stacks of 10d20 (all of the same brand) don't stack up to
the same height. That means that the dice aren't necessarily of the
same uniform size, and even suggests that the dice aren't "true"
regular polyhedrons. GameScience claims their dice are "true."

GameScience did tests on other manufacturers' dice. They found
certain numbers to be more likely. I've heard that the real 100-sided
die tends to roll certain numbers more often.

Filing corners off your dice can make certain outcomes more probable.
Natural wear can do the same thing.

For most people, none of this matters one damn bit. =)

Adam.

Michael G. Wright

Nov 15, 1992, 9:13:01 PM

d...@acpub.duke.edu writes:

Actually, I use a program I made to roll dice for stats. Unfortunately, nobody
in the party wants to use it, because dice rolls invariably end up better. I
think this must be because of the pseudo-randomness of the program.
Anyone out there that knows better?
--
Michael Wright _-_|\ Peg- "Al, can I have a thousand bucks?"
LaTrobe University, / \ Al- "Why sure honey, have you got change
Melbourne, Australia. \_.-_*/ for a million?"
wri...@latcs1.lat.oz.au v Married with children. :-)

Allan Longley

Nov 16, 1992, 6:49:22 PM

In article <wright.7...@latcs1.lat.oz.au> wri...@latcs2.lat.oz.au (Michael G. Wright) writes:
>d...@acpub.duke.edu writes:
>
>> This is called a chi-square test, and an article with the
>>procedure and numbers for it appeared way back in Dragon issue
>>#74... Thank you, Mr. Longley, for reposting it (or did you come
>>up with it in isolation? =) ) for the benefit of those who don't
>>have the issue (probably most readers).

Yes, I've seen the issue. The chi-square test is a standard statistical test
for determining if a data set matches a particular distribution -- so, no, I
did not come up with the test in isolation, it's been around for a lot longer
than D&D. I don't actually have the issue, so I didn't copy it for the net.

I've been playing with the chi-square test and you know what I found out --
ALL DICE ARE BIASED!! Well, that's not true -- all except d4's and d6's are
biased. Of course, this really shouldn't be a surprise. So, I've been
looking at modifying the chi-square test for "real world" dice -- more on
this in a later post.

>Actually, I use a program I made to roll dice for stats. Unfortunately, nobody
>in the party wants to use it, because dice rolls invariably end up better. I
>think this must be because of the pseudo-randomness of the program.
>Anyone out there that knows better?

I wouldn't want to use a computer generated die-value while playing AD&D.
The thrill of the "rolling die" is part of the game. Also, with reference
to the above, most players will have a favourite die/dice due to the
inherent bias found in real dice. The trick is to find the dice that are
biased beyond reasonable playability.

-----------------------------------------------------------------------------
Allan Longley, University of Waterloo, Department of Chemical Engineering
e-mail: alon...@cape.uwaterloo.ca

voice: (519) 885-1211 Snow, no, No, NO, AAAaaahhhh!!!!
home: (519) 746-1498

Michael Sawyer

Nov 16, 1992, 10:25:50 PM

In article <wright.7...@latcs1.lat.oz.au> wri...@latcs2.lat.oz.au (Michael G. Wright) writes:
>

>Actually, I use a program I made to roll dice for stats. Unfortunately, nobody
>in the party wants to use it, because dice rolls invariably end up better. I
>think this must be because of the pseudo-randomness of the program.
>Anyone out there that knows better?

The dice rolling routine of the chat-like server I wrote took quite a
bit of time to get to a good level of randomness. Right now, running
that routine through the chi-squared test, I get good results, but
never really great ones; I always seem to end up on the border between
"good" and "maybe." It is really more of a condition of (1) what you
seed the random number generator with and (2) what style of random
number generator you use. Of the various generally available Unix
routines, I have found srand48() to be the only partially acceptable
one, although I know that better algorithms do exist if you look
through the literature on Monte Carlo techniques and other numerical
procedures requiring good pseudorandom numbers. Since you are using a
computer, you should be able to test the randomness fairly easily with
a method like the one described here not long ago.
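
As a sketch of what such a check might look like, here is the same chi-squared calculation applied to a simulated d20. This is not the server routine mentioned above (which is not shown); Python's `random` module is used as a stand-in generator, and the function name and parameters are illustrative only.

```python
import random


def chi_square_of_rng(sides=20, rolls_per_face=20, seed=None):
    """Roll a simulated die sides*rolls_per_face times and return the
    chi-squared statistic of the resulting counts."""
    rng = random.Random(seed)
    counts = [0] * sides
    for _ in range(sides * rolls_per_face):
        counts[rng.randint(1, sides) - 1] += 1
    expected = rolls_per_face
    return sum((c - expected) ** 2 for c in counts) / expected


# Compare the result with the d20 row of the table posted earlier in the thread.
print(chi_square_of_rng(seed=42))
```

A single run says little; repeating it with many seeds and looking at how the results spread out is more informative.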

--
Michael Sawyer msa...@mael.soest.hawaii.edu
University of Hawaii Physical Oceanography/Satellite Remote Sensing

Paul Kinsler

Nov 17, 1992, 6:09:19 PM

In article <Bxu26...@watserv1.uwaterloo.ca>, alon...@cape.UWaterloo.ca (Allan Longley) writes:
|> In article <wright.7...@latcs1.lat.oz.au> wri...@latcs2.lat.oz.au (Michael G. Wright) writes:
|> I've been playing with the chi-square test and you know what I found out --
|> ALL DICE ARE BIASED!! Well, that's not true -- all except d4's and d6's are
|> biased. Of course, this really shouldn't be a surprise. So, I've been
|> looking at modifiying the chi-square test for "real world" dice -- more on
|> this in a later post.

ALL DICE ARE BIASED?

Perhaps you'd care to explain further - you mean that it is in practice
impossible to manufacture d8+ dice that are unbiased? or that any (even
"ideal") dice are biased!!???!?!? or some bizarre property of the chi-sq
test indicates a bias even for dice with a perfectly even distribution
of numbers?!!#%??! or are you referring to the fact that for any finite sample
of rolls, the mean etc are never exactly the theoretical ones??, or what?

--
+soluble fish+
Masks beneath masks until suddenly the bare bloodless skull.
Salman Rushdie, "The Satanic Verses"

David John Hanes

Nov 19, 1992, 7:44:02 PM

On the subject of biased dice: I have been accused of using biased dice when
rolling characters. We use the 4d6 highest three method (Doesn't everyone?)
and I always used the same four dice. I admit that the rolls were high
( ~average 14 per stat), but I had in no way altered the dice, and they were
bought off the shelf as normal dice (two sets of 2 dice at separate times).

The funny thing is that I recently lost one of those dice and my rolls have
significantly reduced. The high rolls only seemed to work when the four
dice were used as a set. I have studied stats at uni but still I think there
is something more in the values given by dice (some people call it luck,
some people may even believe there is something mystical about it).

Whatever it is, nobody can predict the outcome of the rolls, even when
there seems to be a pattern (as exhibited by biased dice).

DAVE!
=====

University of Wollongong.

u903...@wraith.cs.uow.edu.au (just in case .sig doesn't work)

--
++++++++++++++++++++++++++++++++++++s
David Hanes..

E-Mail u903...@wraith.cs.uow.edu.au

Glen Barnett

Nov 20, 1992, 1:20:40 AM

In this post, I will begin by discussing various posts on the subject
of testing dice, then show how to do the test under discussion properly,
and then talk about more appropriate tests.

This is a very long post (about 500 lines if you include the header),
so you may want to print it out to read it.

---------------------------------------------------------------------

In article <BxHJL...@watserv1.uwaterloo.ca>,
alon...@cape.UWaterloo.ca (Allan Longley) writes:
[..stuff deleted..]

>testing dice for bias. Well, here is a test to use. I haven't actually
>tested this yet, but it should -- in theory -- work. And no, this is not
>a copy out of the old Dragon magazine, but it is the same test -- its a
>pretty standard test. I will use simple terms -- so all you math/stat
>people out there, don't correct the fine points, I know.

[description of chi-squared goodness-of-fit test deleted]

Since Allan asks for no correction of fine points, I will attempt to
limit myself to major problems. This is not intended as a flame on
Allan, but this is fairly important stuff, and should be explained
correctly. If at any stage I get less than pleasant, please accept
my apology in advance.

While the calculations that Allan describes give the correct value
of a chi-square goodness-of-fit statistic (which he calls "Indicator"),
you should be *very* wary of interpreting the results in the way
he describes, as I will explain:

Let us assume you have 40 dice that (unknown to you) are all perfectly
fair, and you wish to test all of them, to see if any are "biased".

The way Allan has set his test up, you'd expect 2 of them to give
results below "Probably Fair", which he says indicates the die is
probably unbiased. That is, you have 40 fair dice, and you will
expect to regard only *two* of them as probably O.K.! Similarly,
you will expect to consider two of your "purely fair" dice as
probably unfair. Of the remaining 36, you will expect 18 scores
between "Probably Fair" and "Maybe" and 18 more between "Maybe"
and "Probably biased". For these 36, you have to do the test again,
under Allan's scheme. If you get both results below "Maybe" (you expect
9 of these) you say "Probably Fair". Similarly you expect 9 above
"Maybe" on both trials.

So we have (after repeating the test for 90% of the dice):

Number of Fair Dice: 40
Expected number "probably fair": 11
Expected number "probably biased": 11
Expected number which we don't know about: 18.

So over a quarter of perfectly fair dice will be called "probably biased".
If we continue testing those remaining 18 we are still undecided about,
the problem gets worse.
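
The expected counts above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the variable names are illustrative, and it assumes (as the argument above does) that a fair die lands below "Probably Fair" 5% of the time, above "Probably Biased" 5% of the time, and on either side of "Maybe" half the time on a retest.

```python
from fractions import Fraction

p_tail = Fraction(1, 20)       # decided on the first test (each direction)
p_band = Fraction(9, 20)       # first result in one of the two middle bands
p_same_side = Fraction(1, 2)   # retest falls on the same side of "Maybe"

p_called_fair = p_tail + p_band * p_same_side
p_called_biased = p_tail + p_band * p_same_side
p_undecided = 1 - p_called_fair - p_called_biased

dice = 40
print(dice * p_called_fair, dice * p_called_biased, dice * p_undecided)
# prints 11 11 18
```

Exact fractions are used so the totals come out as whole dice rather than rounded floating-point values.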

Other problems:

Allan says:

"The column titled "Maybe" are the Indicator values where there is
a 50% chance that the die is fair and a 50% chance that the die is
biased." (A)

This is just plain *wrong*.

The column he refers to is the value that a test on a *fair* die will
exceed 50% of the time. This is very different, and probably explains
why Allan misunderstands the whole interpretation of the results. (B)

If any of you can't see why what I said (B), and what Allan said (A) are
totally different, don't despair. This stuff is not always obvious from
the start. If you can follow the rules of an average RPG, you are smart
enough to understand a few non-trivial statistical ideas. I'm quite happy
to provide further clarification to the net if the demand is there.

> From Table 1, it appears that the d4 tested may be "Fair" but another test
> should be done.

I'd say not. A reasonable interpretation of the result is "There is
no reason to doubt that the die is O.K.".

Incidentally, the test statistic of 2.00 obtained in the example is
only 2/3 of what you'd *expect* with a fair die. The value of 2.00 will
be exceeded almost 60% of the time by a test on a fair die.
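
The "almost 60%" figure can be reproduced without statistical tables. The sketch below uses a closed form for the upper tail of the chi-squared distribution that holds only for 3 degrees of freedom (the d4 case); the function name is illustrative.

```python
import math


def chi2_sf_3df(x):
    """P(chi-squared with 3 df > x). Closed form valid only for 3 df:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


print(round(chi2_sf_3df(2.00), 3))   # prints 0.572
```

For dice with more sides a general routine would be needed, for example the chi-squared survival function from a statistics library.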

---------------------------------------------------------------------

In article <Bxu26...@watserv1.uwaterloo.ca>,
alon...@cape.UWaterloo.ca (Allan Longley), in response
to Michael Wright, says:

|In article <wright.7...@latcs1.lat.oz.au> wri...@latcs2.lat.oz.au
| (Michael G. Wright) writes:
|>d...@acpub.duke.edu writes:
|>
|>> This is called a chi-square test, and an article with the
|>>procedure and numbers for it appeared way back in Dragon issue
|>>#74... Thank you, Mr. Longley, for reposting it (or did you come
|>>up with it in isolation? =) ) for the benefit of those who don't
|>>have the issue (probably most readers).
|
|Yes, I seen the issue. The chi-square test is a standard statistical test
|for determinig if a data set matches a particular distribution -- so, no, I
|did not come up with the test in isolation, its been around for a lot longer
|than D&D. I don't actually have the issue, so I didn't copy it for the net.
|
|I've been playing with the chi-square test and you know what I found out --
|ALL DICE ARE BIASED!! Well, that's not true -- all except d4's and d6's are
|biased. Of course, this really shouldn't be a surprise. So, I've been
|looking at modifiying the chi-square test for "real world" dice -- more on
|this in a later post.

Allan is correct that the test has been around a lot longer than D&D.
The test, due to Karl Pearson, is nearly 100 years old.

It's no surprise that Allan finds that all real dice are biased:
i) It's impossible to make a truly fair die (obviously). It's just
that most are close enough that we don't care too much. The
chance of getting "close to fair" will decrease with the number
of sides.

ii) Allan's testing method will call more than a quarter of fair
dice (assuming they existed) biased. Even the fairest dice you
could buy have a good chance of being called biased.

|>Actually, I use a program I made to roll dice for stats. Unfortunately, nobody
|>in the party wants to use it, because dice rolls invariably end up better. I
|>think this must be because of the pseudo-randomness of the program.
|>Anyone out there that knows better?
|
|I wouldn't want to use a computer generated die-value while playing AD&D.
|THe thrill of the "rolling die" is part of the game. Also, with reference
|to the above, most players will have a favourite die/dice due to the
|inherent bias found in real dice. The trick is to find the dice that are
|biased beyond reasonable playability.

In response to Michael G Wright:

Michael's discovery that hand-rolled dice often come out better
could occur for a couple of reasons:

i) Players tend to hang on to "favourite" dice that "roll well"
(i.e. come up with good results). So biased dice have some
chance of concentrating into the hands of players. As long as
this isn't too extreme, it probably doesn't matter too much.
(Allan quite correctly identifies this reason).

ii) Players don't really roll randomly. I had a fairly long email
discussion with Sea Wasp on this topic just recently. Even
unconsciously, you can pick up the dice in a "non-random" fashion,
so that a good roll will tend to be followed by another good
roll if you don't roll too vigorously. You may notice that
after a bad roll players tend to throw harder. This may be more
of a problem, as some players are *much* better at it than others.
If it becomes too noticeable, you may wish to invest in a dice
cup, or mock up a craps-table affair.

Allan's comment that "The trick is to find the dice that are biased
beyond reasonable playability" is spot on. Exactly correct.
Remember it, because I'll come back to it later.

I think the above discussion also answers Paul Kinsler's questions.

[in article <Bxvuz...@bunyip.cc.uq.oz.au>, kin...@physics.uq.oz.au
(Paul Kinsler) asked for clarification of Allan's comment that all
dice are biased].

---------------------------------------------------------------------

In article <1992Nov12....@Princeton.EDU>,
dago...@phoenix.Princeton.EDU (David Alexandre Golden) writes:

[stuff deleted]

>I once did something along those lines to test whether my DM's die was
>fair. (It would roll 20's a lot. Often on command.)
>
>To test the die, I made a histogram (I believe is the term for it) like this:
>
>x x x x x
>x x x x x x x x x x x x x x x x x x x x
>x x x x x x x x x x x x x x x x x x x x
>1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
>
>With an "x" being each time that number came up. A totally fair die would
>have a straight line across, assuming it was rolled enough times. The
>question is, what is enough times? I rolled a d20 about 380 times (not bad
>if one person rolls and the other makes an x in the column) and while it
>gave a fairly good idea that the die was biased, the biased numbers had only
>occured a couple times more than the average. (If I remember correctly).
>So I wrote a computer program to do the same thing until the deviation between
>the highest and lowest number of occurances was less than about 15% of the
>average number of occurances. (i.e a reasonably smooth profile). The
>computer required SEVERAL THOUSAND ROLLS to do this.

Your idea of making a histogram is a good one. In fact all the
chi-squared test does is the same as drawing a line across the
histogram where your expected "straight line across" would go,
looking at the deviations from that, squaring (to get all positives)
and adding the squared deviations up and dividing by that expected
number. This gives a single overall measure of deviation from uniformity.
The advantage of looking at the histogram is you see where
the differences are, but you can't tell how big they "ought" to be
for a fair die. The actual number in the cell will be approximately normally
distributed with mean equal to the expected number in each cell and standard
deviation approximately the square root of the mean. In the above example,
we'd expect Dave to get 19 in each cell, so the standard deviation is about
4.35. That is, we'd expect to get about 2/3 of the cells with counts in
the range 15 to 23, and about a 2/3 chance of all but 1 or 2 of the values
inside the range 11 to 27. Any values outside the range 6 to 32 would give
very serious cause for concern that the die was biased (why do I take
about 3s.d.'s, rather than 2? - because we are doing 20 tests at once
using the histogram).
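
The ranges quoted above follow from the square-root rule. Here is a minimal Python sketch, using the approximation s.d. = sqrt(expected count) from the paragraph above; the helper `suspicious_faces` and its parameter `k` are illustrative names, not an established procedure.

```python
import math


def suspicious_faces(counts, k=3):
    """Return the faces (1-based) whose counts lie more than k standard
    deviations from the expected count, taking s.d. ~ sqrt(expected)."""
    expected = sum(counts) / len(counts)
    sd = math.sqrt(expected)
    return [face for face, c in enumerate(counts, start=1)
            if abs(c - expected) > k * sd]


expected = 380 / 20                       # 19 per face for 380 rolls of a d20
sd = math.sqrt(expected)                  # about 4.36
for k in (1, 2, 3):
    print(f"{k} s.d.: {expected - k * sd:.1f} to {expected + k * sd:.1f}")
```

Passing a list of 20 tallies to `suspicious_faces` flags only the faces outside the 3 s.d. band, which is the cautious threshold suggested when looking at all 20 cells at once.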

> ... Still, the point is that I'm skeptical
>that the "fairness" of a die can be determined in only a hundred or so rolls.
>(d4 maybe... d20 no way!)

Well, in fact Allan's suggestion was to use 20 rolls per cell, so he'd use
400 rolls for a d20 and 80 rolls for a d4. But in any case you can never
decide that a die is actually fair. If you do a test and get a result
close to what you expected if the die was fair, you have a lack of evidence
against the hypothesis of fairness (which is the default assumption for
a statistical test of biasedness in a die - the "null hypothesis").

What you get is either a higher degree of evidence against the hypothesis
of fairness (by getting a result that is very unlikely with a fair die),
or a low degree of evidence against fairness. It's like in a court case,
(a criminal case), where the defendant is assumed innocent until proven
guilty (innocence is the null hypothesis), but evidence against the
defendant is presented by the prosecution. The jury then decides either
"guilty" if there is strong enough evidence, or "Not guilty" if there
is not. They don't declare innocence.

So we can't determine "fairness" anyway. The question we need to ask
is: If the die is biased, will a hundred rolls (or whatever number)
be enough for us to have a good chance to pick up that difference,
while at the same time, not "convicting the innocent" too often. Whether
it is enough depends on how big a difference you think it is
important to pick up.

-----------------------------------------------------------------------

In article <1992Nov12....@mcs.kent.edu>,
ad...@mcs.kent.edu (Adam Dray) writes (in response to Dave Golden):

>In other words, a histogram shows very little. Random doesn't mean
>necessarily that you'll get an even distribution. It just mean the
>probability that you won't get an even distribution is proportional to
>the number of sides on the die, and the number of times you roll it.

I disagree with the first sentence. The final sentence above is wrong.
The more you roll it, the more even the distribution will be, as long
as the die is fair. It doesn't really depend on the number of sides.
(except as far as the negative dependence between cell counts is
reduced for more sides).

>Notes about the fairness of dice:
>
>Sharp-edged dice are better than smooth-edged dice. They're also more
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Not always, but this may be true more often than not.

>expensive, however. Rounded dice are often inked by coating the
>entire die with ink, then tossing the die in a "tumbler" (similar to
>tumblers for smoothing rocks) until all the die on the outside is
>gone. Thus, the ink is left in the crevices where the numbers are.
>
>Theoretically, the grooves for the numbers can make one side a more
>likely outcome. Official casino dice don't have inset pips.

If you do it right any effect will be swamped by other manufacturing
defects anyway.

[some stuff deleted]

>
>GameScience did tests on other manufacturers' dice. They found
>certain numbers to be more likely. I've heard that the real 100-sided
>die tends to roll certain numbers more often.

It is impossible (both effectively and theoretically) to get a fair 100-
sided die. The practical problems are more important than the theoretical
ones.

>Filing corners off your dice can make certain outcomes more probable.
>Natural wear can do the same thing.
>
>For most people, none of this matters one damn bit. =)

In general, no, it doesn't matter. It's encouraging to see that so
many people (just about every poster on the topic) realise this.

---------------------------------------------------------------------------

Interpreting the test statistic (Allan's "Indicator")

Carry out the calculations as described by Allan, but use any number of
throws per cell (possible outcome) you like (I'd suggest 10 as a minimum,
because otherwise the tabled distribution is out a bit). The more rolls
you do, the better chance you have of picking up a difference of a given
size. The value of 20 that Allan suggested may well be a reasonable choice
in most circumstances. Allan gives the calculations for two different
numbers of throws (20 and 10 per cell, but in different posts), so you
ought to be able to generalise.

If the result is less than the final column of Allan's table 2 (which
are the tabulated values for a 5% significance level), you shouldn't
worry too much, there is not very strong evidence of bias - in fact 1
in 20 tests on a fair die will score worse than this. If the result is
much bigger than the value you have some cause for concern. A result
bigger than the 1% column below is quite unusual if the die is fair
(a result at least this big only occurring 1% of the time), so it gives
us good reason to suspect bias.

A small table of the chi-squared distribution:

        5%      1%     df
d4     7.81   11.34     3
d6    11.07   15.09     5
d8    14.07   18.48     7
d10   16.92   21.67     9
d12   19.68   24.72    11
d20   30.14   36.19    19

(these results came from a computer approximation to the chi-squared
distribution. They should be accurate to the figures given.)

If you want a more "cookbook" approach: if the result exceeds the 1%
value, it's probably biased. If it's between the 1% and 5% values, there
is a moderate degree of evidence that it's biased, but it still might be
OK. If it's less than the 5% value, you don't have any reason to think
it's biased on the basis of the test.
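
The cookbook rule can be written down directly. In this Python sketch the critical values are copied from the small table above; the dictionary name, function name and wording of the verdicts are illustrative.

```python
# Critical values from the table above: sides -> (5% value, 1% value)
CRITICAL = {
    4: (7.81, 11.34),
    6: (11.07, 15.09),
    8: (14.07, 18.48),
    10: (16.92, 21.67),
    12: (19.68, 24.72),
    20: (30.14, 36.19),
}


def verdict(statistic, sides):
    """Apply the cookbook rule to a chi-squared statistic for a die."""
    five_percent, one_percent = CRITICAL[sides]
    if statistic > one_percent:
        return "probably biased"
    if statistic > five_percent:
        return "moderate evidence of bias"
    return "no reason to think it is biased"


print(verdict(2.00, 4))   # prints: no reason to think it is biased
```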

You will find more extensive tables in most elementary statistics books.
You look up the df (degrees-of-freedom) that are one less than the
number of faces on the die (e.g. d4 -> 3 df).

A note on pronunciation: The Greek letter chi (the capital looks like
an X, and the lower-case has one of the two crossed lines a bit curly)
is pronounced with a hard "ch" like Charisma, and the word rhymes with
pie. Note that mathematical symbols come from *ancient* Greek, so no
arguments from any modern Greeks please.

This will provide a reasonable all-round test for bias in a die.

-------------------------------------------------------------------

Why you probably don't want to do the chi-squared test:
(at least for d8 and above)

The chi-squared test will pick up any kind of deviation from a purely even
distribution. However, we are much more worried about some kinds of deviation
than others. For example, I'd be more interested in knowing that "20" came
up too often on a d20 than knowing "10" came up too often. The first could
affect play substantially, the second probably only a little. We should use a
test with a better chance to pick up the kind of deviations from fairness
that are most important to us (which will trade off with less chance of
picking up deviations we are less concerned with).

Let us consider a more complete example:

Imagine we have two d20's we'd like to test, and that in fact
(but unknown to us) they have the following (percentage) probabilities:

Face:    1    2    3    4    5    6    7    8    9   10
Die 1:   5  4.5    6  3.5    7  2.5    8  1.5    7    5
Die 2: 1.5  1.5  2.5  2.5  3.5  3.5  4.5  4.5    5    5

Face:   11   12   13   14   15   16   17   18   19   20
Die 1:   5    7  1.5    8  2.5    7  3.5    6  4.5    5
Die 2:   5    5    6    6    7    7    7    7    8    8

A fair die would have 5% right across, of course. These 2 dice can be obtained
from each other by relabelling the faces. The first die will be reasonable in
play, because, of course, we don't try to roll 'exactly 8' or 'exactly 9',
but 'less than 8' or 'greater than 11'. The first die is never out by
more than one twentieth of the required probability (e.g. the probability of
a 2 or less is 9.5% instead of 10%) in either direction. It has the correct
average, and almost the correct standard deviation (the difference is tiny).

The second die would be very unbalancing in play: it has about a 2/3 chance
(66%) of rolling 11 or higher, and a 20 is more than 5 times as likely as
a 1 (8% against 1.5%). The mean is almost 13. The standard deviation is also
out, but that's relatively unimportant.

The chi-squared will rate them as equally bad!
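
If you want to check this for yourself, here is a small Python sketch
under my own assumptions. The probabilities are the two rows given above
(in percent); the function names are made up for the illustration. The
first function measures what the chi-squared test responds to (squared
deviations from 5%, face by face), the second measures the worst error
in any 'N or less' probability, which is what matters in play.

```python
from itertools import accumulate

# Face probabilities in percent for faces 1..20, copied from the rows above.
die_1 = [5, 4.5, 6, 3.5, 7, 2.5, 8, 1.5, 7, 5,
         5, 7, 1.5, 8, 2.5, 7, 3.5, 6, 4.5, 5]
die_2 = [1.5, 1.5, 2.5, 2.5, 3.5, 3.5, 4.5, 4.5, 5, 5,
         5, 5, 6, 6, 7, 7, 7, 7, 8, 8]


def squared_discrepancy(percent):
    """Sum of (p - fair)^2 / fair: the quantity chi-squared is sensitive to."""
    fair = 100 / len(percent)
    return sum((p - fair) ** 2 for p in percent) / fair


def worst_cumulative_gap(percent):
    """Largest gap, in percentage points, between P(roll <= k) and fair."""
    fair = 100 / len(percent)
    running = accumulate(percent)
    return max(abs(c - fair * k) for k, c in enumerate(running, start=1))


def mean_roll(percent):
    """Average roll implied by the face probabilities."""
    return sum(face * p for face, p in enumerate(percent, start=1)) / 100


for name, die in (("die 1", die_1), ("die 2", die_2)):
    print(name, squared_discrepancy(die), worst_cumulative_gap(die),
          mean_roll(die))
```

Both dice get the same squared discrepancy (they are relabellings of each
other), but the worst cumulative gap is 2 percentage points for the first
die and 16 for the second.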

So a good test should be likely to identify the second die, but we might
be prepared to sacrifice some of our ability to pick up the first, since
it will make little practical difference in play. (I said I'd come back to
this point!)

Note that almost any deviation on a d4 will be important (there are only 4
different values), and to a lesser extent a d6. I'd stick with the chi-
squared test on those.

There are many tests that will do what we want. I will present only one
such test. (This is not to say that a properly applied chi-squared is
not good, just that a test more closely tailored to our specific
question of interest will be even better.)

The Kolmogorov-Smirnov test:

Collect data as for the chi-squared test, up to the point where you
start doing calculations.

That is, lay out like this (you could run down rather than across):

Roll:          1    2    3    4    5  ......
Count:        17   19   22   27   24  ......
Expected:     20   20   20   20   20  ......

Now add up your counts and expected counts, writing the partial
totals as you go:

Roll:          1    2    3    4    5  ......
Count:        17   19   22   27   24  ......
Expected:     20   20   20   20   20  ......
Sum Count:    17   36   58   85  109  ......
Sum Exp:      20   40   60   80  100  ......

Now find the differences (without sign):

Sum Count:    17   36   58   85  109  ......
Sum Exp:      20   40   60   80  100  ......
Difference:    3    4    2    5    9  ......

The last difference will be zero, so you don't have to work out the
final column (I still would as a check).

Divide the largest difference (9 is the largest difference above, for
the calculations you can see) by the number of rolls you made altogether.

This is your test statistic. Let's call the value D.

You can look it up in most books on nonparametrics, which will
have tables. However, if you made more than about 50 rolls altogether,
you can multiply D by the square root of the number of rolls
(equivalently, divide the largest difference by the square root of
the number of rolls), and compare with:

  5%     1%
 1.36   1.63    (irrespective of the number of sides on the die)

and interpret as I suggested for the chi-squared test.

So, for our above example, assume there are no larger differences
than 9, and that we made 400 rolls on a d20 (hence the expected number
in each cell is 20, as above). Then D is 9/400 = 0.0225, which you would
look up if you can get tables. Since we made 400 rolls, we can use the
table above instead: the square root of 400 is 20, so D x 20 (= 9/20) = 0.45.
This is much less than the 5% value, so there is little evidence that
the die is unfair.
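
The same calculation as a Python sketch, for those who prefer it. The
function name is my own. Note that only the first five counts below come
from the example above; the other fifteen are invented purely so that the
total is 400 and no running difference exceeds 9.

```python
from itertools import accumulate
from math import sqrt


def ks_statistic(counts):
    """Return (largest running difference, D, D * sqrt(number of rolls))."""
    n = sum(counts)
    expected = n / len(counts)  # expected count per face on a fair die
    running = accumulate(counts)
    largest = max(abs(c - expected * k)
                  for k, c in enumerate(running, start=1))
    d = largest / n
    return largest, d, d * sqrt(n)


# First five counts are from the worked example; the rest are made up.
counts = [17, 19, 22, 27, 24,
          18, 19, 18, 20, 17, 21, 19, 22, 18, 20, 21, 17, 22, 19, 20]

largest, d, scaled = ks_statistic(counts)
print(largest, d, scaled)

# Compare the scaled value with the large-sample 5% and 1% values.
if scaled > 1.63:
    print("probably biased")
elif scaled > 1.36:
    print("moderate evidence of bias")
else:
    print("no reason to suspect bias")
```

With these counts the largest difference is 9, D is 0.0225 and the scaled
value is 0.45, matching the hand calculation.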

This test will be conservative (a fair die will actually be rejected slightly
less often than the nominal 5% and 1% for the above table), because
the distribution of values is discrete (a d20 generates only integers,
not anything in between). This will not matter in practice - it could be
important for a d4, but will make almost no difference for a d20 or d12.
Since I'd stick with the chi-squared test for a d4 and probably d6 as
well (for other reasons), there is no real problem.
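
If you are curious how conservative, a simulation gives a rough idea. This
is only a sketch under my own assumptions: the "fair die" is Python's
pseudo-random generator, and the number of trials and the seed are
arbitrary choices. It rolls a simulated fair d20 400 times, repeats that
many times over, and reports how often the scaled statistic exceeds the
5% value of 1.36.

```python
import random
from itertools import accumulate
from math import sqrt


def scaled_ks(counts):
    """Largest running difference divided by sqrt(number of rolls)."""
    n = sum(counts)
    expected = n / len(counts)
    running = accumulate(counts)
    largest = max(abs(c - expected * k)
                  for k, c in enumerate(running, start=1))
    return largest / sqrt(n)


def rejection_rate(sides=20, rolls=400, trials=1000, critical=1.36, seed=1992):
    """Fraction of simulated fair dice that the test flags at `critical`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejected = 0
    for _ in range(trials):
        counts = [0] * sides
        for _ in range(rolls):
            counts[rng.randrange(sides)] += 1
        if scaled_ks(counts) > critical:
            rejected += 1
    return rejected / trials


rate = rejection_rate()
print(rate)
```

The printed fraction should come out at or a little below 0.05, give or
take simulation noise; more trials will narrow it down.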

There are tests which are probably even more appropriate, but these
two (chi-squared and K-S) will be enough for you to get a good idea
of any suspect dice.

Glen

P.S. sorry if this is a bit technical, but better to be technical than
wrong.

P.P.S. If further explanation is required, or a recommendation of
further reading desired, please email or post.

P.P.P.S. If you suspect a die, and decide to test it, don't use the
rolls that made you suspect it in the test. Generate a new
set. e.g. if you are all recording your rolls as you play,
and one player's results look funny, don't then test those
recorded values - you have to generate a new set.

Allan Longley

Nov 19, 1992, 1:15:07 PM

In article <Bxvuz...@bunyip.cc.uq.oz.au> kin...@physics.uq.oz.au (Paul Kinsler) writes:
>In article <Bxu26...@watserv1.uwaterloo.ca>, alon...@cape.UWaterloo.ca (Allan Longley) writes:
>|> I've been playing with the chi-square test and you know what I found out --
>|> ALL DICE ARE BIASED!! Well, that's not true -- all except d4's and d6's are
>|> biased. Of course, this really shouldn't be a surprise. So, I've been
>|> looking at modifying the chi-square test for "real world" dice -- more on
>|> this in a later post.
>
>ALL DICE ARE BIASED?
>
>Perhaps you'd care to explain further - you mean that it is in practise
>impossible to manufacture d8+ dice that are unbiased? or that any (even
>"ideal") dice are biased!!???!?!? or some bizarre property of the chi-sq
>test indicates a bias even for dice with a perfectly even distribution
>of numbers?!!#%??! or are you referring to the fact that for any finite sample
>of roll, the mean etc are never exactly the theoretical ones??, or what?

I mean that all REAL dice are biased. The bias inherent in a d4 or a d6 is
very small (due to the very discrete nature of the sides), but for the more
rounded dice (ie., d8, d10, d12, d20, d30, d100) the amount of bias becomes
notable. Constructing truly unbiased dice would make them so
expensive that few would be sold. The amount of bias increases the more
sides the die has (eg., a d20 is probably more biased than a d8). I have
seen a d20 that was almost perfect -- it was a precisely machined rod of
brass that weighed about 5 pounds -- not very good for regular use.

-----------------------------------------------------------------------------
Allan Longley, University of Waterloo, Department of Chemical Engineering
e-mail: alon...@cape.uwaterloo.ca

voice: (519) 885-1211 And that's the difference between
home: (519) 746-1498 real life and theory.
-----------------------------------------------------------------------------

Allan Longley

Nov 20, 1992, 1:24:23 PM

IF ANY ONE WANTS TO FIND SOME NEAT RESULTS, BUY A D100 FROM
GAMESCIENCE AND THEY HAVE LITTLE PAMPHLETS WHICH SHOW SOME OF THEIR RESULTS.
LOVE,
BUNNY BEAR

<jonesc...@math.concord.wvnet.edu>

Paul Goodwin

Nov 20, 1992, 4:11:13 PM

In article <Bxz6p...@watserv1.uwaterloo.ca>, alon...@cape.UWaterloo.ca (Allan Longley) says:
>sides the die has (eg., a d20 is probably more biased than a d8). I have
>seen a d20 that was almost perfect -- it was a precisely machined rod of
>brass that weighed about 5 pounds -- not very good for regular use.

No, but it may help you build those bulging muscles while playing D&D! Think
of it...D&D that gives you exercise!

Paul :)