stepwise from Frank Harrell

Byron Davis

Feb 10, 1995, 1:32:24 PM
Frank,

Your post mentions my name as saying something, but the message you include
is from Stan Mulaik. Another one of those doggone automated flukes, I guess.

Your arguments regarding automated regression techniques are contrary to my
recent post, so I don't agree with your assertion that these techniques are
"inherently bad." I further don't agree that use of these techniques leads
to bad science. If you feel this way, then you are free not to use such
techniques. However, your personal biases needn't keep other researchers
from using such techniques in a responsible manner.

To be honest, I must admit that I make use of such techniques only rarely.
Perhaps in one project out of a hundred. However, your interpretation that
my use of such a technique amounts to doing bad science is a rather
offensive and poorly grounded comment. In my uses of such techniques, I
must say that I have never had occasion to "find things we wished we had never
learned," nor have I ever had occasion to come to conclusions which were not
confirmed by later data.

I totally agree with criticisms of misuses of any statistical tool; however,
I seldom agree with criticisms which ask us to throw out the baby
with the bath water. Let's put to rest condemnation of statistical tools which
may, in fact, be used responsibly; enlighten researchers to the inherent
weaknesses of some tools; and save our condemnations for the obvious
misuses of any and all techniques.


Byron L. Davis, Ph.D. by...@osiris.usi.utah.edu
Staff Consultant for Statistics and Research Methodology
Utah Supercomputing Institute University of Utah

Date: Fri, 10 Feb 1995 16:40:13 GMT
From: "Frank E Harrell Jr." <f...@BIOSTAT.MC.DUKE.EDU>
Subject: Re: Stepwise revisited

Byron Davis <by...@OSIRIS.USI.UTAH.EDU> wrote in article
<950209155...@osiris.usi.utah.edu> :
>
>I haven't kept up with these developments in regression theory. I
>would be curious to know how they figure the degrees of freedom for
>an all-possible regressions analysis. Effectively the technique
>sounds like estimating all of the possible parameters and adjusting
>those that may be zero. For purposes of evaluating such models I
>would say one lost as many degrees of freedom as parameters estimated.
>
>Stan Mulaik
>

I question whether an inherently bad technique such as stepwise regression
is even good as an exploratory tool. Why do we explore data using a
technique which is very likely to find things we wished we had never
learned? Things that will not be confirmed with future data?

Warren Sarle's recent postings about principal components and especially
about variable clustering point out some really useful exploratory
techniques that are liable to find interesting patterns in data that
are not just "noise". Incomplete principal components regression and
regression using a single score from each variable cluster have been
found to be extremely useful and reliable. These techniques reduce
a large number of potential predictors into a manageable number of
dimensions that are worthy of estimating regression coefficients for.
Once you reduce the number of regressors, you don't need to do any
deletion of insignificant effects (the literature has shown that
such deletion causes a WORSENING of predictive accuracy in most cases).
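
For concreteness, here is a minimal sketch of incomplete principal
components regression in modern Python (numpy and scikit-learn) rather than
the S-Plus or SAS of the day; the simulated data and the choice of five
retained components are purely illustrative, not from any analysis
discussed in this thread:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    n, p, k = 100, 30, 5          # 100 cases, 30 predictors, keep 5 components

    X = rng.normal(size=(n, p))   # stand-in for a real predictor matrix
    y = X[:, :3].sum(axis=1) + rng.normal(size=n)

    pca = PCA(n_components=k)     # "incomplete": keep only k of the p components
    Z = pca.fit_transform(X)      # component scores, one column per dimension

    model = LinearRegression().fit(Z, y)   # regression on k dimensions, not p
    print(model.score(Z, y))               # R-squared using only 5 coefficients

Because only k coefficients are estimated, no subsequent deletion of
insignificant effects is needed.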

I hope we can also put to rest the notion that all possible regressions
is a worthwhile endeavor. As the reference I posted the other day
concluded, it has some disadvantages relative to other stepwise techniques in
terms of finding the "right" variables. But the more important point
is that it contributes to bad science.
----------------------------------------------------------------------------
Frank E Harrell Jr                                  f...@biostat.mc.duke.edu
Associate Professor of Biostatistics
Division of Biometry, Duke University Medical Center
Box 3363, Durham NC 27710 USA

f...@biostat.mc.duke.edu

Feb 10, 1995, 2:33:17 PM
Byron Davis <by...@OSIRIS.USI.UTAH.EDU> wrote in article <950210183...@osiris.usi.utah.edu> :

>
>Frank,
>
>Your post mentions my name as saying something, but the message you include
>is from Stan Mulaik. Another one of those doggone automated flukes, I guess.

Sorry about that. My mistake.


>
>Your arguments regarding automated regression techniques are contrary to my
>recent post, so I don't agree with your assertion that these techniques are
>"inherently bad." I further don't agree that use of these techniques leads
>to bad science. If you feel this way, then you are free not to use such
>techniques. However, your personal biases needn't keep other researchers
>from using such techniques in a responsible manner.

I haven't seen your analyses, Byron, but I have seldom seen these
techniques used responsibly. 95% of what I've seen is really bad
science. The basic method is flawed (hence inherently bad) because
among other things:

1. It yields R-squared values that are badly biased high
2. The F and chi-squared tests quoted next to each variable on the
printout do not have the claimed distribution
3. The method yields confidence intervals for effects and predicted
values that are falsely narrow (see Altman and Andersen, Stat in Med)
4. It yields P-values that do not have the proper meaning and the
proper correction for them is a very difficult problem
5. It gives biased regression coefficients that need shrinkage
6. It has severe problems in the presence of collinearity
7. It is based on methods (e.g. F tests for nested models) that were
intended to be used to test pre-specified hypotheses.
8. It allows us to not think about the problem
9. It uses a lot of paper
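
A small simulation makes points 1 and 4 concrete: run forward selection on
pure noise and the selected model still reports a respectable R-squared
next to nominally significant p-values. This is a minimal sketch in modern
Python (numpy and scipy); the simple p-to-enter rule stands in for whatever
a particular package implements:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n, p = 50, 20
    X = rng.normal(size=(n, p))   # 20 candidate predictors: pure noise
    y = rng.normal(size=n)        # response: unrelated noise

    selected = []
    while True:
        best_p, best_j = 1.0, None
        for j in range(p):                       # try each unused candidate
            if j in selected:
                continue
            cols = np.column_stack([np.ones(n)] +
                                   [X[:, s] for s in selected + [j]])
            beta, _, _, _ = np.linalg.lstsq(cols, y, rcond=None)
            resid = y - cols @ beta
            dof = n - cols.shape[1]
            cov = (resid @ resid / dof) * np.linalg.inv(cols.T @ cols)
            t = beta[-1] / np.sqrt(cov[-1, -1])  # t-test on the new term
            pval = 2 * stats.t.sf(abs(t), dof)
            if pval < best_p:
                best_p, best_j = pval, j
        if best_j is None or best_p > 0.05:      # nominal p-to-enter of 0.05
            break
        selected.append(best_j)

    cols = np.column_stack([np.ones(n)] + [X[:, s] for s in selected])
    beta, _, _, _ = np.linalg.lstsq(cols, y, rcond=None)
    yhat = cols @ beta
    r2 = 1 - ((y - yhat) ** 2).sum() / ((y - y.mean()) ** 2).sum()
    print(len(selected), "noise variables entered; apparent R-squared =",
          round(r2, 2))

Each "best of 20" candidate is judged by a test that assumes it was the
only one tried, which is exactly why the nominal 0.05 threshold lets noise
in.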

It is true that I'm free not to use these techniques. But I wish I
didn't have to waste time reading bad applications of stepwise methods
in medical journals. I guess we just have to agree to disagree on this stuff.

>
>To be honest, I must admit that I make use of such techniques only rarely.
>Perhaps in one project out of a hundred. However, your interpretation that
>my use of such a technique amounts to doing bad science is a rather
>offensive and poorly grounded comment. In my uses of such techniques, I
>must say that I have never had occasion to "find things we wished we had never
>learned," nor have I ever had occasion to come to conclusions which were not
>confirmed by later data.

I have found things that didn't replicate almost every time I've done a usual
stepwise analysis in which I held back some validation data. I could
name a paper I've published which used a stepwise analysis for which the
list of "important" variables changed shortly after publication. All we
did was to add a few patients to the data! After I finished graduate
training, I was oblivious to these problems until I began working with
a physician named Tom Rickert who had a lot of training in pattern
recognition. He challenged me to validate every model I developed (now I
use the bootstrap to do this so I don't have to hold back data).
Boy was I disappointed when I began doing that.
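
One standard way to validate without holding back data is Efron's optimism
bootstrap: refit on each resample, score that refit on the original data,
and subtract the average optimism from the apparent fit. A minimal sketch
in modern Python (numpy and scikit-learn; the synthetic data and the 200
resamples are arbitrary choices), not necessarily the exact procedure meant
here:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(2)
    n, p = 80, 10
    X = rng.normal(size=(n, p))
    y = X[:, 0] + rng.normal(size=n)   # one real signal, nine noise columns

    apparent = LinearRegression().fit(X, y).score(X, y)

    optimism = []
    for _ in range(200):               # bootstrap resamples
        idx = rng.integers(0, n, n)    # draw cases with replacement
        m = LinearRegression().fit(X[idx], y[idx])
        # optimism = performance on the training resample minus
        # performance of the same fit on the original data
        optimism.append(m.score(X[idx], y[idx]) - m.score(X, y))

    print("apparent R-squared: ", round(apparent, 2))
    print("optimism-corrected: ", round(apparent - np.mean(optimism), 2))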

>
>I totally agree with criticisms of misuses of any statistical tool; however,
>I seldom agree with criticisms which ask us to throw out the baby
>with the bath water. Let's put to rest condemnation of statistical tools which
>may, in fact, be used responsibly; enlighten researchers to the inherent
>weaknesses of some tools; and save our condemnations for the obvious
>misuses of any and all techniques.
>
>
>Byron L. Davis, Ph.D. by...@osiris.usi.utah.edu
> Staff Consultant for Statistics and Research Methodology
>Utah Supercomputing Institute University of Utah
>

----------------------------------------------------------------------------
Frank E Harrell Jr f...@biostat.mc.duke.edu

Paul F. Velleman

Feb 15, 1995, 11:07:40 AM
In article <3hgf1t$r...@news.duke.edu>, f...@biostat.mc.duke.edu wrote:

> I have seldom seen these
> techniques [stepwise regression] used responsibly.
> 95% of what I've seen is really bad
> science. The basic method is flawed (hence inherently bad) because
> among other things:
>
> 1. It yields R-squared values that are badly biased high
> 2. The F and chi-squared tests quoted next to each variable on the
> printout do not have the claimed distribution
> 3. The method yields confidence intervals for effects and predicted
> values that are falsely narrow (see Altman and Andersen, Stat in Med)
> 4. It yields P-values that do not have the proper meaning and the
> proper correction for them is a very difficult problem
> 5. It gives biased regression coefficients that need shrinkage
> 6. It has severe problems in the presence of collinearity
> 7. It is based on methods (e.g. F tests for nested models) that were
> intended to be used to test pre-specified hypotheses.
> 8. It allows us to not think about the problem
> 9. It uses a lot of paper
>

I would add that where data contain missing values,
10. the sample on which the analysis rests can itself change as a result
of variable selection and deletion.

I have seen examples in which the R-squared *increases* when a variable is omitted
by a backwards step. The explanation is that enough cases that had been
missing in the omitted variable were allowed to reenter the regression.
Most statistics packages don't track missing values carefully enough to
protect against this.
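
A minimal sketch of that mechanism in modern Python (pandas and
scikit-learn; the synthetic data and the 40% missingness rate are
arbitrary): under casewise deletion, dropping a predictor that is often
missing lets those cases back into the fit, so the two models are
estimated on different samples.

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(3)
    n = 200
    df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
    df["y"] = df["x1"] + 0.1 * df["x2"] + rng.normal(size=n)
    df.loc[rng.random(n) < 0.4, "x2"] = np.nan   # x2 missing ~40% of the time

    full = df.dropna()                           # casewise deletion, x2 in
    m_full = LinearRegression().fit(full[["x1", "x2"]], full["y"])

    reduced = df.dropna(subset=["x1", "y"])      # x2 out: cases reenter
    m_red = LinearRegression().fit(reduced[["x1"]], reduced["y"])

    print(len(full), "cases with x2 in the model;",
          len(reduced), "after dropping it")
    print("R-squared with x2:   ",
          round(m_full.score(full[["x1", "x2"]], full["y"]), 3))
    print("R-squared without x2:",
          round(m_red.score(reduced[["x1"]], reduced["y"]), 3))

Whether R-squared rises or falls on any one dataset, the point is that the
comparison is between fits to different sets of cases.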

-- Paul Velleman
