Sample size estimation from Median and IQR

ציפי שוחט

May 7, 2012, 12:07:36 PM
to meds...@googlegroups.com, Gidi Stein
I have a request to estimate a sample size for a parameter, where the only data in the literature are the median and interquartile range (IQR).

Can I estimate a sample size from this data?
Thanks

Tzippy Shochat

Steve Simon, P.Mean Consulting

May 7, 2012, 5:08:08 PM
to meds...@googlegroups.com, ציפי שוחט, Gidi Stein
On 5/7/2012 11:07 AM, ציפי שוחט wrote:

> I have a request to estimate a sample size for a parameter, where the
> only data in the literature are the median and interquartile range
> (IQR).
>
> Can I estimate a sample size from this data?

You didn't say whether you wanted to justify your sample size using the
desired width of the 95% confidence interval or using a power
calculation for a particular hypothesis test. But the simple answer is
to pretend that the median is actually the mean and pretend that the
data is approximately normally distributed. It may not be perfect, but I
doubt you could improve on things without more information about the
distribution of your data. Besides, you're probably going to be able to
rely on the central limit theorem.

So here's what you do. Notice that the standard normal has its 25th and
75th percentiles about 2/3 of a standard deviation (more precisely,
0.674) away from zero. So the interquartile range for a normal
distribution is going to be approximately 4/3 (about 1.35) of a standard
deviation. If your interquartile range is 12, then your standard
deviation is probably around 12/1.35, or roughly 9. Plug this standard
deviation into your confidence interval or power calculation formula.
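
The arithmetic above can be sketched in Python as follows. This is only an illustration under the normality assumption described in this post; the function names, the target half-width of 2, and the difference of 5 are hypothetical choices, and the sample size formulas are the usual large-sample normal approximations rather than anything specific to this thread.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def sd_from_iqr(iqr):
    """Approximate SD, assuming normal data: IQR = 2 * z_0.75 * SD."""
    return iqr / (2 * Z.inv_cdf(0.75))

def n_for_ci_half_width(sd, half_width, conf=0.95):
    """n so that a normal-theory CI for a mean has the given half-width."""
    z = Z.inv_cdf(1 - (1 - conf) / 2)
    return math.ceil((z * sd / half_width) ** 2)

def n_per_group_two_sample(sd, delta, alpha=0.05, power=0.80):
    """n per group to detect a mean difference delta (two-sided z approximation)."""
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    z_beta = Z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

sd = sd_from_iqr(12)                              # IQR of 12, as in the example
print(round(sd, 1))                               # roughly 9
print(n_for_ci_half_width(sd, half_width=2))      # hypothetical precision target
print(n_per_group_two_sample(sd, delta=5))        # hypothetical effect size
```

A t-based calculation would give slightly larger numbers for small samples; the z approximation is used here only to keep the sketch dependency-free.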

Steve Simon, n...@pmean.com, Standard Disclaimer.
Sign up for the Monthly Mean, the newsletter that
dares to call itself average at www.pmean.com/news

Abhaya Indrayan

May 7, 2012, 10:44:57 PM
to meds...@googlegroups.com
In addition to what Steve has suggested, do not forget to make an adjustment for a highly skewed distribution. The use of the median and IQR suggests that the distribution is highly skewed, yet you will be using the mean, SD and Gaussian distribution instead. I add at least 10% to the sample size for a skewed distribution. This is subjective but seems to work in practice. Additional adjustments may be needed for error in the estimate and for nonresponse, if any.
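
As a rough sketch of these adjustments: the 10% margin is the subjective figure mentioned above, while the function name, the 15% nonresponse rate, and the choice to divide by (1 - nonresponse rate) are assumptions made for illustration, not a method stated in this post.

```python
import math

def inflate_sample_size(n, skew_margin=0.10, nonresponse=0.0):
    """Inflate a computed sample size for skewness and expected nonresponse.

    skew_margin: fractional increase for a skewed distribution (subjective).
    nonresponse: expected fraction of subjects lost (assumed adjustment).
    """
    adjusted = n * (1 + skew_margin) / (1 - nonresponse)
    # Round first so floating-point noise (e.g. 110.00000000000001)
    # does not push the ceiling up by one.
    return math.ceil(round(adjusted, 6))

print(inflate_sample_size(76))                     # skewness margin only
print(inflate_sample_size(76, nonresponse=0.15))   # hypothetical 15% nonresponse
```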

~Abhaya Indrayan



--
To post a new thread to MedStats, send email to MedS...@googlegroups.com .
MedStats' home page is http://groups.google.com/group/MedStats .
Rules: http://groups.google.com/group/MedStats/web/medstats-rules




Mehwish Hussain

May 8, 2012, 12:57:42 AM
to meds...@googlegroups.com

Dear Abhaya and Steve,

Could you please provide references for the approaches you described?
--
Regards

Mehwish Hussain, PhD*
Senior Lecturer (DUHS)
Coordinator, MSBE Program (DUHS)
Manager, ORIC (HEC)
Pakistan

Frank Harrell

May 9, 2012, 8:28:27 AM
to meds...@googlegroups.com
Also, if the 3 quartiles are Q1 Q2 Q3 and the ratio of Q3:Q2 equals the ratio of Q2:Q1 you might take logs of all 3 and assume you have a log-normal distribution, solve for its sigma, and power based on fold change.
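
A minimal sketch of this log-normal idea, assuming the three quartiles are available. The quartile values (4, 8, 16), the 10% tolerance on the ratio check, the 1.5-fold change, and the function names are all hypothetical; the sample size formula is the usual two-sample normal approximation applied on the log scale.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def ratios_roughly_equal(q1, q2, q3, tol=0.10):
    """Check whether Q3/Q2 is within a relative tolerance of Q2/Q1."""
    lower, upper = q2 / q1, q3 / q2
    return abs(upper - lower) / lower <= tol

def log_sigma_from_quartiles(q1, q3):
    """Sigma of log(X), assuming log-normal data: ln(Q3/Q1) = 2 * z_0.75 * sigma."""
    return (math.log(q3) - math.log(q1)) / (2 * Z.inv_cdf(0.75))

def n_per_group_fold_change(sigma, fold, alpha=0.05, power=0.80):
    """n per group to detect a given fold change, working on the log scale."""
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    z_beta = Z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / math.log(fold)) ** 2)

q1, q2, q3 = 4.0, 8.0, 16.0   # hypothetical quartiles
if ratios_roughly_equal(q1, q2, q3):
    sigma = log_sigma_from_quartiles(q1, q3)
    print(round(sigma, 3))
    print(n_per_group_fold_change(sigma, fold=1.5))
```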

Frank


ציפי שוחט

May 10, 2012, 12:09:49 AM
to meds...@googlegroups.com
Thanks all.
Unfortunately, I have only the IQR, not the actual quartiles.
The data are probably skewed right, as they are bounded below by zero and the values are not very large.
So I will combine Dr. Simon's and Dr. Indrayan's suggestions.
Tzippy


ציפי שוחט

Apr 4, 2013, 8:56:36 AM
to emmanue...@gmail.com, meds...@googlegroups.com
In reply to Dr. Pedro Emmanuel: I encountered a similar situation a while back, and Medstaters were kind enough to assist.

Please take into account that the data may be skewed, and that this skewness may have motivated reporting the median and IQR instead of the mean and SD.

A related question: is the MAD (median absolute deviation from the median) ever used instead of the SD or IQR?
Thanks
Tzippy Shochat