Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

What's Standard Deviation?

55 views
Skip to first unread message

Zhang Hongyu

unread,
06.11.1995, 03:00:00
to
Hi, Dear all,

Can you tell me which definition is correct for Standard Deviation,

    sqrt( (X-EX)^2 / N )    or    sqrt( (X-EX)^2 / (N-1) ) ?

where
EX means the expectation (average value) of X; X^2 means the square of X.

I've met both of these definitions in several cases, so I wonder
what the difference between them is.


Thanks for your attention!

Henry
--
----------------------------------------------------------------------
Henry Hongyu Zhang, Ph.D. student | email: z...@ipc.pku.edu.cn
Molecular Design Laboratory | z...@pschnetware.pku.edu.cn
Institute of Physical Chemistry | Tel: 8610-2501490
Peking University | Fax: 8610-2501725
Peking 100871 , China | URL: http://www.ipc.pku.edu.cn/moldsgn/
| zhy/hom.htm

-----------
Too hard, to be broken
Too soft, to be worthless
------------ Old Chinese Saying

Bob Silverman

unread,
06.11.1995, 03:00:00
to
In article <47kv2h$6...@sunrise.pku.edu.cn>, Zhang Hongyu <zhy@pchindig> wrote:
:Hi, Dear all,
:
:Can you tell me which definition is correct for Standard Deviation,
:
:    sqrt( (X-EX)^2 / N )    or    sqrt( (X-EX)^2 / (N-1) ) ?
:
:where
: EX means the expectation (average value) of X; X^2 means the square of X.

Both. It depends on whether you want the unbiased estimator or the
maximum likelihood estimator.
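
A concrete sketch of the two conventions in Python, assuming NumPy
(its ddof parameter selects the divisor N - ddof; the data below are
made up):

    import numpy as np

    x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # example sample

    # ddof=0 divides by N: the maximum likelihood form (under normality)
    print(np.std(x, ddof=0))   # 2.0

    # ddof=1 divides by N-1: square root of the unbiased variance estimator
    print(np.std(x, ddof=1))   # about 2.14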

--
Bob Silverman
The MathWorks Inc.
24 Prime Park Way
Natick, MA

David Seal

unread,
06.11.1995, 03:00:00
to
zhy@pchindig (Zhang Hongyu) writes:

>Can you tell me which definition is correct for Standard Deviation,
>
>    sqrt( (X-EX)^2 / N )    or    sqrt( (X-EX)^2 / (N-1) ) ?
>
>where
> EX means the expectation (average value) of X; X^2 means the square of X.
>

>I've met both of these definitions in several cases, so I wonder
>what the difference between them is.

They're both valid (apart from some typos), but in different
circumstances. Basically, the first is a formula in probability (where
you're dealing with a known distribution); the second is one in
statistics (where you're dealing with an unknown distribution).

In probability, given a known distribution for X, the variance is
E((X-E(X))^2). If there are a finite number N of equiprobable values
for X, this is the same as:

    SUM((X-E(X))^2)
    ---------------
           N

The standard deviation is the square root of this variance, giving a
formula akin to your first one above.
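
A direct transcription of this definition, as a sketch in Python (the
fair die here is just an example of N equiprobable values):

    import math

    outcomes = [1, 2, 3, 4, 5, 6]        # N equiprobable values of X
    N = len(outcomes)

    mean = sum(outcomes) / N                               # E(X)
    variance = sum((x - mean) ** 2 for x in outcomes) / N  # E((X-E(X))^2)
    sd = math.sqrt(variance)

    print(mean, variance, sd)            # 3.5, about 2.917, about 1.708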

In statistics, given a set of N samples from an unknown distribution,
an unbiased estimate for the mean of the unknown distribution is:

        SUM(X_i)
    M = --------
           N

and an unbiased estimate for its variance is:

        SUM((X_i-M)^2)
    V = --------------
             N-1

Why N-1 rather than N? Very roughly: if we could subtract the true
mean of the unknown distribution from the samples, square and sum the
results and then divide by N, we would get a good estimate of the
unknown variance. But we can't do this: we only know the mean M of the
samples we took, not the true mean of the distribution. Now, M tends
to follow the samples around a bit - e.g. if lots of the samples are
less than the true mean, our value for M will probably be below the
true mean as well. This effect tends to reduce the sum of the squared
differences, and if you do the mathematics, it turns out that the
factor by which it is expected to reduce it is (N-1)/N. So dividing by
N-1 instead of N compensates for the fact that we can only work with
M, not the true mean. (Except of course in the extreme case of N=1,
where M is always equal to the one and only sample, making the squared
difference equal to 0. This is in accordance with the reduction by a
factor of (N-1)/N = 0/1 = 0, but we can't compensate by dividing by 0
instead of 1: all we get is the undefined value 0/0. But even this
makes sense if you think about what is going on: seeing 1 sample from
an unknown distribution tells you *nothing* about how widely spread
that distribution is.)
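
A quick simulation makes that reduction visible; this is only a
sketch, assuming NumPy, with an arbitrary choice of distribution
(normal with variance 4) and sample size:

    import numpy as np

    rng = np.random.default_rng(0)
    N, trials, true_var = 5, 200_000, 4.0

    samples = rng.normal(0.0, 2.0, size=(trials, N))
    M = samples.mean(axis=1, keepdims=True)   # per-experiment sample mean
    ss = ((samples - M) ** 2).sum(axis=1)     # sum of squared differences

    print((ss / N).mean())        # about 3.2 = true_var * (N-1)/N, biased low
    print((ss / (N - 1)).mean())  # about 4.0 = true_var, unbiased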

To show a very simple example of what is going on: consider a known
distribution which produces -1 with probability 1/2 and +1 with
probability 1/2. We can calculate the variance of this probability
distribution by:

true mean = (-1 + +1)/2 = 0

true variance = ((-1 - 0)^2 + (+1 - 0)^2)/2 = 1

true standard deviation = SQR(1) = 1.

Now suppose that we're faced with this distribution as an unknown
distribution, and we do an experiment involving taking 2 samples.
There are four equally likely outcomes for the experiment:

  1st sample   2nd sample    M     Variance calculated by       V
                                   dividing by N:
                                   From M    From true mean = 0
  ---------------------------------------------------------------
      -1           -1        -1      0               1            0
      -1           +1         0      1               1            2
      +1           -1         0      1               1            2
      +1           +1        +1      0               1            0

The variance calculated using a division by N, with differences taken
from M, is too small in the cases where M is not equal to the true
mean. By dividing by N-1 instead of N, we get an estimate for the
variance which is 0 half the time and 2 the other half, making it an
unbiased estimate for the true variance of 1. (Obviously not a very
good estimate, of course - but we can't expect a good estimate from
just two samples!)

Finally, note that I have been careful to talk about V being an
unbiased estimate for the variance, not SQR(V) being an unbiased
estimate for the standard deviation. This is because SQR(V) is no such
thing: in the case above, for instance, SQR(V) is 0 half the time and
SQR(2) the other half: its expected value is therefore SQR(2)/2, not
the true standard deviation (i.e. 1).
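
Enumerating those four equally likely outcomes directly, as a sketch
in Python, confirms both points:

    import math
    from itertools import product

    samples = list(product([-1, +1], repeat=2))  # the four equally likely outcomes

    Vs = []
    for s in samples:
        M = sum(s) / len(s)
        Vs.append(sum((x - M) ** 2 for x in s) / (len(s) - 1))

    print(sum(Vs) / len(Vs))                        # 1.0: E(V) is the true variance
    print(sum(math.sqrt(v) for v in Vs) / len(Vs))  # about 0.707 = SQR(2)/2, not 1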

To summarise:

* The formula involving dividing by N is suitable for calculating the
variance of a known distribution having N equally probable outcomes.
(If the outcomes aren't equiprobable, go back to the E((X-E(X))^2)
formula.)

* The formula involving dividing by N-1 is suitable for estimating the
variance of an unknown distribution, given N samples from that
distribution.

David Seal
ds...@armltd.co.uk

John McGowan

unread,
06.11.1995, 03:00:00
to
Bob Silverman (bo...@mathworks.com) wrote:
> In article <47kv2h$6...@sunrise.pku.edu.cn>, Zhang Hongyu <zhy@pchindig> wrote:
> :Hi, Dear all,
> :
> :Can you tell me which definition is correct for Standard Deviation,
> :
> :    sqrt( (X-EX)^2 / N )    or    sqrt( (X-EX)^2 / (N-1) ) ?
> :
> :where
> : EX means the expectation (average value) of X; X^2 means the square of X.
>
> Both. It depends on whether you want the unbiased estimator or the
> maximum likelihood estimator.

Well... the first is the standard deviation... but if you have a SAMPLE
from a larger population, the standard deviation calculated for the
sample (first formula) is not the best estimator for the whole
population. E[(x-Ex)^2] (somehow there is a SUM or expectation or
average missing from the formulas above), i.e. SUM[(X-EX)^2]/N (the
average being the sum divided by the number of terms), is not an
unbiased estimator of the variance of the total population, while
SUM[(X-EX)^2]/(N-1) is an unbiased estimator of the variance of the
whole population. (Note that it is the variance that one estimates in
an unbiased way using the second formula, NOT the standard deviation,
for which taking the square root generally leaves a bias.)

So... if you want the standard deviation OF YOUR DATA, use the first
formula; but if your data is a SAMPLE from a larger population and you
want an unbiased estimate of the variance of the larger population (the
sample either being taken with replacement or being small compared to
the population size), use the second formula.

Regards,

--
John McGowan | jmcg...@inch.com [Internet Channel]
| jmcg...@mail.coin.missouri.edu [COIN]
--------------+-----------------------------------------------------

Eric Bohlman

unread,
06.11.1995, 03:00:00
to
Zhang Hongyu (zhy@pchindig) wrote:
: Hi, Dear all,

: Can you tell me which definition is correct for Standard Deviation,
:
:    sqrt( (X-EX)^2 / N )    or    sqrt( (X-EX)^2 / (N-1) ) ?

: where
: EX means the expectation (average value) of X; X^2 means the square of X.

: I've met both of these definitions in several cases, so I wonder
: what the difference between them is.

The first formula is the one for the standard deviation of a particular
set of data. However, if your set of data is a sample from a larger
population and you're trying to estimate the standard deviation of that
population, then you need to use the second formula, because the standard
deviation of a sample will usually be slightly smaller than the standard
deviation of the population itself. The reason for this is that when
trying to estimate the population SD from the sample, you also have to
estimate the population mean from the sample, and of course that estimate
will differ from the actual population mean by an amount that is
unknown but whose typical size is predictable. The first formula
doesn't take this variability of the mean into account, and therefore
gives an answer that's too small.

On the other hand, if you were in the (unlikely) situation where you were
trying to estimate the population SD from a sample, but you knew the
population mean exactly, then you'd use the first formula (with EX being
the population mean rather than the sample mean).
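
In that known-mean situation the divisor goes back to N, since no
information is spent estimating the mean; a sketch in Python, with
made-up data and an assumed known mean:

    import math

    known_mean = 10.0                       # population mean, known exactly
    sample = [9.2, 10.8, 11.5, 8.9, 10.1]   # sample from the population
    N = len(sample)

    # Differences are taken from the known population mean, so dividing
    # by N (not N-1) already gives an unbiased variance estimate.
    var_est = sum((x - known_mean) ** 2 for x in sample) / N
    print(math.sqrt(var_est))               # about 0.975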


AaCBrown

unread,
08.11.1995, 03:00:00
to
zhy@pchindig (Zhang Hongyu) in <47kv2h$6...@sunrise.pku.edu.cn> asks
whether the correct definition of Standard Deviation uses n or n-1. There
were four previous answers.

All the previous answers are correct as far as they go but I think fail to
make the crucial point (with the possible exception of John McGowan's).

The DEFINITION of standard deviation uses n. This is true for a sample or
a population, for a continuous or a discrete distribution (although for a
continuous distribution the notation is a little different).

For a complicated reason that some answers explained, it can make sense to
estimate the true standard deviation by using a statistic whose formula is
similar to the definition of standard deviation. It's the same as the
definition with n-1 replacing n. This is sometimes called the "sample
standard deviation".

It is often the case in statistics that we estimate a parameter using a
totally different statistic. For example you might estimate a population
mean by a sample mean, but in other cases you use a different statistic.
Beginning statistics students often make mistakes because they confuse
parameters with statistics. This is easy to do because the names are often
the same and the formulae are often similar.


Aaron C. Brown
New York, NY

Jan Willem Nienhuys

unread,
08.11.1995, 03:00:00
to
aacb...@aol.com (AaCBrown) writes:

#The DEFINITION of standard deviation uses n. This is true for a sample or
#a population, for a continuous or a discrete distribution (although for a
#continuous distribution the notation is a little different).

I think not everybody would agree. Expectation doesn't use n or n-1
or anything else related to the particular sample.

#It is often the case in statistics that we estimate a parameter using a
#totally different statistic. For example you might estimate a population
#mean by a sample mean, but in other cases you use a different statistic.

A beautiful example would be the case of a statistic that is known
to be uniformly distributed in an interval of unknown length.
The average of the largest and the smallest value (the midrange) is
in that case a better estimate of the mean than the sample mean.

Correct me if I'm wrong.
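
A quick check by simulation, as a sketch assuming NumPy, with samples
uniform on [0,1] so the true mean is 0.5:

    import numpy as np

    rng = np.random.default_rng(1)
    N, trials = 10, 100_000

    x = rng.uniform(0.0, 1.0, size=(trials, N))
    sample_mean = x.mean(axis=1)
    midrange = (x.min(axis=1) + x.max(axis=1)) / 2

    print(((sample_mean - 0.5) ** 2).mean())  # about 0.0083 = 1/(12N)
    print(((midrange - 0.5) ** 2).mean())     # about 0.0038, smaller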

JWN

Terry Moore

unread,
08.11.1995, 03:00:00
to
In article <47l58t$2...@puff.mathworks.com>, bo...@mathworks.com (Bob
Silverman) wrote:

>
> In article <47kv2h$6...@sunrise.pku.edu.cn>, Zhang Hongyu <zhy@pchindig> wrote:
> :Hi, Dear all,
> :
> :Can you tell me which definition is correct for Standard Deviation,
> :
> :    sqrt( (X-EX)^2 / N )    or    sqrt( (X-EX)^2 / (N-1) ) ?
> :
> :where
> : EX means the expectation (average value) of X; X^2 means the square of X.
>
> Both. It depends on whether you want the unbiased estimator or the
> maximum likelihood estimator.

Both are biased, but the square of the latter is unbiased for
the variance. In terms of mean square error, a better divisor
is N+1 when estimating the variance, but this is not clear when
estimating the standard deviation.

The former is the maximum likelihood estimator if the
distribution is normal (and observations are iid), but may
not be otherwise.
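
A simulation of the mean square error comparison, as a sketch assuming
NumPy and iid normal observations:

    import numpy as np

    rng = np.random.default_rng(2)
    N, trials, true_var = 5, 200_000, 1.0

    x = rng.normal(0.0, 1.0, size=(trials, N))
    ss = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

    for divisor in (N - 1, N, N + 1):
        mse = ((ss / divisor - true_var) ** 2).mean()
        print(divisor, mse)    # N+1 yields the smallest MSE for the variance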

Terry Moore, Statistics Department, Massey University, New Zealand.

Imagine a person with a gift of ridicule [He might say] First that a
negative quantity has no logarithm; secondly that a negative quantity has
no square root; thirdly that the first non-existent is to the second as the
circumference of a circle is to the diameter. Augustus de Morgan

David Kastrup

unread,
11.11.1995, 03:00:00
to
zhy@pchindig (Zhang Hongyu) writes:

>Hi, Dear all,

>Can you tell me which definition is correct for Standard Deviation,
>
>    sqrt( (X-EX)^2 / N )    or    sqrt( (X-EX)^2 / (N-1) ) ?
>
>where
> EX means the expectation (average value) of X; X^2 means the square of X.

>I've met both of these definitions in several cases, so I wonder
>what the difference between them is.

The variance is one of an infinite set of characteristics
of distributions called "cumulants". Those cumulants have the feature
that they add up when you add independent random variables. The first
cumulant is the mean, the second is the variance, the third
is related to the skewness, and so forth.

The standard deviation is the square root of the variance.
One formula for variance is:
V{X} = E{X^2} - E^2{X}
where E{X} is the expected value, or mean of the distribution X.
If X is entirely represented by N values, we get
V{X} = (sum_i x_i^2)/N - ((sum_i x_i)/N)^2
which leads to the left formula of yours. More commonly, however,
you don't know X exactly, but rather have a batch of samples.
In that case, you probably want an estimate whose expected value
is the true variance. Now, writing M for the sample mean and V_N
for the divide-by-N formula applied to the samples, it can be shown
that
    E{V_N} = V{X} - V{M}
That is, calculating the variance of a sample this way tends to
underestimate the whole variance by the variance of the mean
value of the sample.

Now, assuming independent samples, we have
    V{M} = V{(sum_i X_i)/N} = V{X}/N

so E{V_N} = V{X}(N-1)/N. That is all that is needed to derive the
second formula of yours above, which gives an unbiased variance
estimator: multiplying V_N by N/(N-1) replaces the divisor N by N-1.
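
A numerical check of that argument, as a sketch assuming NumPy:

    import numpy as np

    rng = np.random.default_rng(3)
    N, trials, true_var = 4, 300_000, 9.0

    x = rng.normal(0.0, 3.0, size=(trials, N))
    M = x.mean(axis=1, keepdims=True)
    V_N = ((x - M) ** 2).mean(axis=1)   # divide-by-N formula with sample mean

    print(V_N.mean())                   # about 6.75 = true_var - true_var/N
    print((V_N * N / (N - 1)).mean())   # about 9.0: the N-1 version is unbiased
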
--
David Kastrup, Goethestr. 20, D-52064 Aachen Tel: +49-241-72419
Email: d...@pool.informatik.rwth-aachen.de Fax: +49-241-79502

Bill Taylor

unread,
13.11.1995, 03:00:00
to
ds...@armltd.co.uk (David Seal) writes:

|> instead of 1: all we get is the undefined value 0/0. But even this
|> makes sense if you think about what is going on: seeing 1 sample from
|> an unknown distribution tells you *nothing* about how widely spread
|> that distribution is.)

Incredibly, this "obvious" statement is not, strictly speaking, true !!!!!


It *is* possible, believe it or not, to sometimes make statistical inferences
about the standard deviation of an unknown distribution from a SAMPLE OF SIZE 1 !
                                                               ~~~~~~~~~~~~~~~~
I don't think it is quite possible to do it if the unknown distribution
is continuous but otherwise completely arbitrary (though I'm not certain).

It is definitely possible to get an exact 90% (or whatever) confidence interval
for the *centre* of a unimodal symmetric (but otherwise arbitrary) continuous
distribution, from a sample of size 1.

It is also possible to get a conservative (coverage at least 90%, but
not exact) confidence region for the *JOINT* mean and stdevn of a
normal where both are unknown, *still* from a sample of size 1.

These are indeed insane results!
A little bit like the result that appears here with monotonous regularity:
that, when shown one of two otherwise arbitrary numbers chosen 50-50 at
random, you can do better than 50% at guessing whether it is the higher
or the lower.

Of course it should be noted:- the confidence regions mentioned above cannot
be made translation-invariant, (which is what most people would hope for). And
with that little clue, you may be able to work out roughly how it's all done.

-------------------------------------------------------------------------------
Bill Taylor w...@math.canterbury.ac.nz
-------------------------------------------------------------------------------
There are lies, damned lies, and sadistics.
-------------------------------------------------------------------------------

Michael Creel

unread,
14.11.1995, 03:00:00
to
It all depends on where you're from. In some places, picking your nose
while driving is a standard deviation. In other places, it's
scratching your crotch while speaking in public.


(Sorry, I've been following this thread for a few days and I couldn't
resist.)

A. Katherine Ricci

unread,
16.11.1995, 03:00:00
to
Michael Creel (MCR...@VOLCANO.UAB.ES) wrote:
|| It all depends on where you're from. In some places, picking your nose
|| while driving is a standard deviation. In other places, it's
                      ^^^^^^^^^^^^^^^^^^

|| scratching your crotch while speaking in public.

Sounds more like Septum deviation. :-}

Kate Belisle-Phinney/Ricci Belisl...@ccsu.ctstateu.edu

cybe...@pixie.co.za

unread,
19.11.1995, 03:00:00
to
bo...@mathworks.com (Bob Silverman) spewed forth:

>In article <47kv2h$6...@sunrise.pku.edu.cn>, Zhang Hongyu <zhy@pchindig> wrote:

>:Hi, Dear all,
>:
>:Can you tell me which definition is correct for Standard Deviation,
>:
>:    sqrt( (X-EX)^2 / N )    or    sqrt( (X-EX)^2 / (N-1) ) ?
>:
>:where
>: EX means the expectation (average value) of X; X^2 means the square of X.
>
>

>Both. It depends on whether you want the unbiased estimator or the
>maximum likelihood estimator.

>--

>Bob Silverman
>The MathWorks Inc.
>24 Prime Park Way
>Natick, MA

Practically, it would depend on whether you sample from a finite
population or from a population without a finite number of elements.

Ryan


0 new messages