Number of events required

martin...@yahoo.co.uk

May 27, 2008, 11:32:41 AM
to MedStats
Hi,

Apologies for cross-posting.

When doing logistic regression, I've always understood that there must
be at least 10 of the rarer observations per parameter going into the
model - as an absolute minimum. Then I read in "Categorical Data
Analysis using the SAS System" by Maura E. Stokes et al (a SAS
Institute publication): "Your choice depends partially on the sample
size. There should be at least 5 observations for the rarer outcome
per parameter being considered in the expanded model. Some analysts
would prefer at least 10".

Does anyone know of any support for 5 ?

Best Wishes,

Martin Holt

Adrian Sayers

May 27, 2008, 11:37:01 AM
to MedS...@googlegroups.com
I think you can do exact logistic regression in Stata 10 and
StatXact; I wonder if you can't do it in SAS? There has got to be a
better solution for small numbers.

Adrian

Bruce Weaver

May 27, 2008, 1:44:40 PM
to MedStats
In his "Introduction to Medical Statistics" (3rd Ed., p. 323), Martin
Bland says this:

"Logistic regression is a large sample method. A rule of thumb is that
there should be at least 10 'yes's and 10 'no's, and preferably 20,
for each predictor variable (Peduzzi et al. 1996)."

As Adrian suggested, exact logistic regression should be used if you
do not have at least 10 events per predictor. You can get a free
30-day demo of LogXact from www.cytel.com, for example.
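
For anyone who wants to check a data set against these rules of thumb, here is a minimal sketch in Python. The function names and the counts (40 events, 360 non-events) are made up for illustration; the thresholds 5, 10 and 20 are the ones quoted in this thread. It only does the arithmetic and says nothing about which threshold is right.

```python
def events_per_parameter(n_events, n_nonevents, n_parameters):
    """Count of the rarer outcome divided by the number of model parameters."""
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    return min(n_events, n_nonevents) / n_parameters


def max_parameters(n_events, n_nonevents, per_parameter):
    """Largest number of parameters allowed by a 'k per parameter' rule of thumb."""
    if per_parameter <= 0:
        raise ValueError("per_parameter must be positive")
    return min(n_events, n_nonevents) // per_parameter


if __name__ == "__main__":
    # Hypothetical data set: 40 events and 360 non-events.
    for rule in (5, 10, 20):
        print(f"rule of {rule}: at most {max_parameters(40, 360, rule)} parameters")
```

Note that the count that matters is the rarer outcome (40 here), not the total sample size of 400.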

--
Bruce Weaver
bwe...@lakeheadu.ca
www.angelfire.com/wv/bwhomedir
"When all else fails, RTFM."

Simon, Steve (PhD)

May 27, 2008, 3:44:39 PM
to MedS...@googlegroups.com
Martin Holt writes:

> When doing logistic regression, I've always understood that there must
> be at least 10 of the rarer observations per parameter going into the
> model - as an absolute minimum. Then I read in "Categorical Data
> Analysis using the SAS System" by Maura E. Stokes et al (a SAS
> Institute publication): "Your choice depends partially on the sample
> size. There should be at least 5 observations for the rarer outcome
> per parameter being considered in the expanded model. Some analysts
> would prefer at least 10".
>
> Does anyone know of any support for 5?

As far as I know, the only empirical support for any of these rules
comes from a publication by Frank Harrell. He simulated some models and
showed that stepwise procedures did not replicate well when a ratio of
10-15 events per independent variable was not maintained.

Obviously this would not apply to models that use cross-validation like
CART. Also, if you are willing to cite your work as exploratory and
needing replication prior to use in the real world, then a 10-15 to 1
ratio is not as critical.

I believe the proper citation is

Regression modelling strategies for improved prognostic prediction.
Frank E. Harrell et al. Statistics in Medicine 1984; 3: 143-152.

but I don't have the article in front of me right now to verify this.
I mention these issues briefly at

http://www.childrensmercy.org/stats/weblog2004/survival.asp

and

http://www.childrensmercy.org/stats/weblog2004/ratioobsivs.asp

but do not address this in the detail that it deserves.

Steve Simon, ssi...@cmh.edu, Standard Disclaimer
Evidence Based Medicine gives my book 4/4.5 stars out of five!
Full text is at http://ebm.bmj.com/cgi/content/full/12/2/59

Pablo A. Mora

May 27, 2008, 4:08:45 PM
to MedS...@googlegroups.com
You may want to check this piece too for references and a discussion of logistic and cox regressions.

Babyak, M. A. (2004). What You See May Not Be What You Get: A Brief, Nontechnical Introduction to Overfitting in Regression-Type Models. Psychosomatic Medicine, 66(3), 411-421.

As
--
Pablo Mora, Ph.D.
30 College Avenue
The Institute for Health
Rutgers University
New Brunswick, NJ 08901-1293
Office Phone: (732) 932-1941 FAX: (732) 932-1945

martin...@yahoo.co.uk

May 28, 2008, 5:52:06 AM
to MedStats
Thanks for your reply, Adrian. I would have to look into it, but I
believe that SAS does now support exact logistic regression. The text
I was quoting from was written before this, and I am wondering if
there is anything to support use of 5 events per parameter. Maybe it's
an historical thing: that people believed this at first, but later had
to change up to 10.

Best Wishes,

Martin

On May 27, 4:37 pm, "Adrian Sayers" <adriansay...@gmail.com> wrote:
> I think you can do exact logistic regression in stata 10 and stat
> Xact, i wonder if you cant do it in sas? Got to be a better solution
> for small numbers.
>
> Adrian

Adrian Sayers

May 28, 2008, 7:56:32 AM
to MedS...@googlegroups.com
I wonder if having lots of parameters in a model and not enough events
to support them might lead to an inability to identify
overdispersion, and therefore to underestimating your SEs.

However, I am not sure. Overdispersion is one concept which I really
struggle with. I have been reading the grail of McCullagh and Nelder,
and I can't find a specific reference to the number of parameters to
estimate and its consequences for an overdispersed model,
except to say that if you have too many parameters you won't be able to
estimate the overdispersion parameter.

Not sure though?
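
To make the last point concrete, here is a rough sketch in Python, assuming the usual moment estimate of the dispersion parameter for grouped binomial data: the Pearson chi-square divided by the residual degrees of freedom (number of groups minus number of parameters). The function name, the counts and the fitted probabilities are all invented for illustration and do not come from a real model fit.

```python
def pearson_dispersion(successes, trials, fitted_probs, n_params):
    """Pearson chi-square divided by residual df (groups minus parameters).

    Assumes grouped binomial data with fitted probabilities strictly
    between 0 and 1, one per group.
    """
    resid_df = len(successes) - n_params
    if resid_df <= 0:
        # Nothing left over to estimate the dispersion from.
        raise ValueError("no residual degrees of freedom")
    chi2 = 0.0
    for y, m, p in zip(successes, trials, fitted_probs):
        expected = m * p
        variance = m * p * (1.0 - p)  # binomial variance for this group
        chi2 += (y - expected) ** 2 / variance
    return chi2 / resid_df


if __name__ == "__main__":
    # Hypothetical grouped data: 4 covariate patterns, 10 trials each.
    successes = [2, 8, 5, 9]
    trials = [10, 10, 10, 10]
    fitted = [0.3, 0.7, 0.5, 0.8]  # made-up fitted probabilities
    print(pearson_dispersion(successes, trials, fitted, n_params=2))
```

With 4 groups and 4 parameters the residual degrees of freedom are zero and the function raises an error, which is the situation described above. Values well above 1 would suggest overdispersion; the estimate is not meaningful for ungrouped 0/1 data.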

Adrian
