
Testing if a distribution is normal


G. Anthony Reina

May 27, 1999
I have a regression that seems to fit the data well. The residuals don't
appear to have any trends so I think what's left is just random (normal
distribution) error.

Is there a way that I can quantitatively test the residuals to see if
they truly follow a normal distribution?

-Tony

William B. Ware

May 27, 1999

There are tests of normality that you could apply... SPSS uses the
Shapiro-Wilk test for small samples (n <= 50) and the Kolmogorov-Smirnov
(K-S) test for larger samples...
However, you need to be aware that the assumption of normality applies to
the conditional distributions, not the overall distribution of
residuals... Therefore, the normality of the residuals collectively is a
necessary, but not sufficient, condition for meeting the assumption...
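
As a rough sketch of what these two tests look like in practice: the example below uses Python with NumPy and SciPy, which is an assumption (the post mentions only SPSS), and the data are simulated purely for illustration. Note that when the mean and standard deviation are estimated from the same residuals, the plain K-S p-value is only approximate.

```python
import numpy as np
from scipy import stats

# Simulated data (hypothetical): a straight line plus normal noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 40)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=x.size)

# Ordinary least-squares fit and its residuals.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

# Shapiro-Wilk: null hypothesis is that the residuals are normal.
sw_stat, sw_p = stats.shapiro(residuals)

# Kolmogorov-Smirnov against a normal with the estimated mean and sd.
# Because the parameters are estimated from the data, this p-value is
# only approximate (it tends to be too large).
ks_stat, ks_p = stats.kstest(
    residuals, "norm", args=(residuals.mean(), residuals.std(ddof=1))
)

print(f"Shapiro-Wilk: W = {sw_stat:.3f}, p = {sw_p:.3f}")
print(f"K-S:          D = {ks_stat:.3f}, p = {ks_p:.3f}")
```

A small p-value is evidence against normality; a large one only means the sample gives no grounds to reject it.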

__________________________________________________________________________
William B. Ware, Professor and Chair Psychological Studies
CB# 3500 in Education
University of North Carolina PHONE (919)-962-7848
Chapel Hill, NC 27599-3500 FAX: (919)-962-1533
http://www.unc.edu/~wbware/ EMAIL: wbw...@unc.edu
__________________________________________________________________________

Jon Cryer

May 27, 1999
Randomness and normality are two different things.
Randomness asks if the errors are (approximately) independent
and all come from the same distribution (which might or might not
be a normal distribution). There are a lot of things to check
to see if you have randomness.

Jon Cryer


Tjen-Sien Lim

May 28, 1999
Just plot the residuals and obtain the normal probability plot. Don't be too
obsessed with statistical tests!
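
For anyone who wants the numbers behind a normal probability plot, here is a minimal sketch in Python using SciPy's `probplot` (the choice of language and library is an assumption, and the residuals are simulated stand-ins). It returns the theoretical quantiles, the ordered residuals, and the correlation of the fitted line; points lying close to a straight line are what one hopes to see.

```python
import numpy as np
from scipy import stats

# Stand-in residuals (hypothetical): replace with the residuals of your fit.
rng = np.random.default_rng(1)
residuals = rng.normal(size=100)

# probplot pairs each ordered residual with the normal quantile expected
# at that rank, and fits a least-squares line through the pairs.
(theoretical_q, ordered_resid), (slope, intercept, r) = stats.probplot(
    residuals, dist="norm"
)

print(f"Q-Q line: slope = {slope:.3f}, intercept = {intercept:.3f}, r = {r:.3f}")

# To draw the plot with matplotlib (optional):
#   import matplotlib.pyplot as plt
#   plt.plot(theoretical_q, ordered_resid, "o")
#   plt.plot(theoretical_q, intercept + slope * theoretical_q)
#   plt.show()
```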

--
Tjen-Sien Lim
ts...@recursive-partitioning.com
www.Recursive-Partitioning.com


Dr. Robert Nemeth

May 28, 1999
No, there is no way to show that a distribution is normal.
A test (e.g. Shapiro-Wilk or Kolmogorov-Smirnov) takes normality as
its null hypothesis, i.e. you can only reject normality, never confirm it.

A generally accepted method to give some evidence for
normality is the inspection of residual plots, Q-Q plots etc.
The following book may be useful for further hints and suggestions:
Hoaglin, Mosteller, Tukey (eds.): Exploring Data Tables, Trends and Shapes
(Wiley, 1985)

Regards
Robert

William B. Ware

May 28, 1999

OK, I'll bite... I fail to see the difference between the two processes.
In the first, formal testing, you assume a null of normality, and when you
fail to reject, you conclude that the sample is not sufficiently nonnormal
such that it contradicts the plausibility of a normal population...

In the second approach, examination of residual plots and Q-Q plots, you
look at the plots to "see" whether they are consistent with an assumption
of a normal population... In the Q-Q plot, the expectation (hope) that the
points fall along the diagonal is predicated on an assumption of
normality...

One process is "computational" and the other is "visual." Both assume
normality... However, Robert raises an important issue. In examining data
for the assumption of normality, one should probably use _both_
approaches...

WBW


Richard F Ulrich

May 28, 1999
- I have read 5 responses and no one has mentioned "independence."
Dependency is more subtle, and (maybe) more damning to a simple model.

Residuals should not be correlated with X or with the sequence of
sampling. Et cetera.
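
One way to check dependence on the sampling sequence is sketched below in Python with NumPy (the language, the helper names `durbin_watson` and `lag1_autocorr`, and the simulated series are all assumptions for illustration). The Durbin-Watson statistic is near 2 for serially uncorrelated residuals and falls toward 0 under positive autocorrelation. Keep in mind that least-squares residuals from a fit with an intercept have exactly zero sample correlation with X by construction, so the check against X is really a look for curvature or changing spread in a plot.

```python
import numpy as np

def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squares."""
    r = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(r) ** 2) / np.sum(r ** 2)

def lag1_autocorr(residuals):
    """Sample autocorrelation between each residual and the previous one."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    return np.sum(r[1:] * r[:-1]) / np.sum(r ** 2)

# Two simulated residual series (hypothetical), in sampling order.
rng = np.random.default_rng(2)
independent = rng.normal(size=500)

# AR(1) series with coefficient 0.8: each value carries over 80% of the last.
dependent = np.empty(500)
dependent[0] = independent[0]
for t in range(1, 500):
    dependent[t] = 0.8 * dependent[t - 1] + independent[t]

for name, series in [("independent", independent), ("dependent", dependent)]:
    print(f"{name}: DW = {durbin_watson(series):.2f}, "
          f"lag-1 r = {lag1_autocorr(series):.2f}")
```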


--
Rich Ulrich, biostatistician wpi...@pitt.edu
http://www.pitt.edu/~wpilib/index.html Univ. of Pittsburgh

Herman Rubin

May 28, 1999
There is no good reason why random variables should be normal. The
regression model works quite well with "reasonable" non-normal true
residuals (the Gauss-Markov Theorem). While there are ways to use
the non-normality to improve estimates, the sample sizes required
to gain much are immense.
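
A small simulation can make this concrete. The sketch below (Python with NumPy; the model, sample size, and error distribution are all assumptions chosen for illustration) fits a straight line many times with strongly skewed, mean-zero errors and shows that the least-squares slope estimates still centre on the true slope.

```python
import numpy as np

rng = np.random.default_rng(3)
true_intercept, true_slope = 1.0, 2.0   # hypothetical "true" regression
x = np.linspace(0, 1, 50)
X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept

n_sims = 2000
slopes = np.empty(n_sims)
for i in range(n_sims):
    # Exponential errors shifted to mean zero: skewed, clearly non-normal.
    errors = rng.exponential(scale=1.0, size=x.size) - 1.0
    y = true_intercept + true_slope * x + errors
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    slopes[i] = beta[1]

print(f"true slope = {true_slope}, mean estimate = {slopes.mean():.3f}, "
      f"sd of estimates = {slopes.std(ddof=1):.3f}")
```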
--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
hru...@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558

Herman Rubin

May 29, 1999
In article <7imc1r$rqh$1...@usenet01.srv.cis.pitt.edu>,
Richard F Ulrich <wpi...@pitt.edu> wrote:

> - I have read 5 responses and no one has mentioned "independence."
>Dependency is more subtle, and (maybe) more damning to a simple model.
>
>Residuals should not be correlated with X or with the sequence of
>sampling. Et cetera.

It cannot be overemphasized that normality is NOT necessary for the
validity of regression. What has just been said is what is most
important, that the disturbances (actual deviations from the "true"
regression expression) must be uncorrelated with the "explanatory"
variables.

The precise probabilities of various tests do depend on normality,
some more than others. But regression has rather good robustness,
by which I mean that the properties of the procedure do not depend
much on those assumptions which one does not wish (or need) to make.

Don Ramirez

May 30, 1999
Prof. Ulrich correctly points out that there are problems when we use the
usual diagnostic tests for normality on ordinary least-squares residuals,
when we know that they are correlated and singular.

There is a paper that will appear shortly in Metrika which addresses these
issues and shows how to modify Theil's Linear Unbiased Scalar estimators for
the residuals to compute Recovered Errors which are independent and
nonsingular. The usual diagnostic tests can be applied to these Recovered
Errors without violating the standard assumptions.
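
To see why ordinary least-squares residuals are correlated and singular, here is a short sketch in Python with NumPy (the language and the tiny design matrix are assumptions for illustration; this does not implement the Recovered Errors of the paper). With independent errors of variance sigma^2, the residuals have covariance sigma^2 (I - H), where H is the hat matrix; that matrix has nonzero off-diagonal entries and rank n - p rather than n.

```python
import numpy as np

# Hypothetical design: n = 8 observations, intercept plus one predictor (p = 2).
n = 8
x = np.arange(n, dtype=float)
X = np.column_stack([np.ones(n), x])

# Hat matrix H projects y onto the fitted values; M = I - H gives residuals.
H = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(n) - H

# Residual covariance is sigma^2 * M when the errors are i.i.d.
rank = np.linalg.matrix_rank(M)
off_diag = M - np.diag(np.diag(M))

print(f"n = {n}, p = {X.shape[1]}, rank of residual covariance = {rank}")
print(f"largest off-diagonal entry in magnitude = {np.abs(off_diag).max():.3f}")
```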

The URL for a preprint is

http://www.math.virginia.edu/~der/pdf/der65.pdf

A companion paper is

http://www.math.virginia.edu/~der/pdf/der68.pdf

Don Ramirez
