On Wed, Jun 24, 2009 at 4:17 AM, Kylie Lange<kylie.la...@adelaide.edu.au> wrote:
> Hi all,
> Is anyone aware of any literature that discusses or investigates the
> statistical implications of analysing BMI (body mass index) as categories
> rather than leaving it in its continuous form?
> I am putting together a class discussing the statistical problems of
> dichotomising/categorising variables in analyses, and BMI is such a commonly
> used categorical measure that it would be nice to specifically discuss it.
Nothing specific to BMI, but Frank Harrell has a useful page on the problems associated with binning continuous variables...
Faraggi D, Simon R (1996) A simulation study of cross-validation for selecting an optimal cutpoint in univariate survival analysis. Statistics in Medicine 15(20):2203-2213 http://www.ncbi.nlm.nih.gov/pubmed/8910964
Hadzi-Pavlovic D (2007) Correlations II : categorizing continuous data. Acta Neuropsychiatrica 19(2):129-130
The problem is really independent of the variable that you are attempting to categorize, and I'm sure you could phrase the above points in the context of BMI.
Neil
--
"The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data." ~ John Tukey (1986), "Sunset salvo". The American Statistician 40(1).
For the source, I'd recommend reading all publications by the inventor of the BMI in the 19th century, the Belgian scientist Adolphe Quetelet. BMI used to be called the Quetelet Index (QI); initially the QI faced rival indices such as the Broca Index, which have become vestigial or extinct in recent years, at least in modern medical science. I trust that your French is up to scratch!
BUT
Current evidence suggests that waist-to-hip ratio is considerably more predictive of weight-related problems.
So WHY oh WHY do GPs and heart units routinely take BMI and ignore w/h?
Best
Diana
On 24/06/2009 10:48, "William Stanbury" <williamstanb...@gmail.com> wrote:
> For the source, I'd recommend reading all publications by the inventor of the
> BMI in the 19th century, the Belgian scientist Adolphe Quetelet. BMI used to be
> called the Quetelet Index (QI); initially the QI faced rival indices such as the
> Broca Index, which have become vestigial or extinct in recent years, at least in
> modern medical science. I trust that your French is up to scratch!
>
> Best wishes,
>
> William Stanbury.
On Wed, Jun 24, 2009 at 11:40 AM, kornbrot<d.e.kornb...@herts.ac.uk> wrote:
> BUT
> Current evidence suggests that waist-to-hip ratio is considerably more
> predictive of weight-related problems.
> So WHY oh WHY do GPs and heart units routinely take BMI and ignore w/h?
I'd suggest that there are a few reasons why...
1. They are unaware of the evidence that waist/hip ratio is more predictive.
2. Studies don't collect the waist or hip measurements because of this, but do collect height and weight (statisticians should be consulted at the design phase of a study, not at the end to perform surgery on the data that has been collected).
3. Studies using older data cannot use waist-hip ratio: only height and weight were collected at the time, because BMI was thought to be the appropriate metric.
It's a case of 'educating' people as to the most appropriate measure to take, and all statisticians have a role in this (assuming, of course, that they work in biostatistics, where this sort of measurement is used!).
Neil
Richard Kronmal (1993) "Spurious Correlation and the Fallacy of the
Ratio Standard Revisited". Journal of the Royal Statistical Society,
Series A, Vol. 156, No. 3, 379-392.
This does not deal with the categorization issue, but rather with the
fact that using a ratio rather than its 2 component numbers (whether
weight and height or waist and hip) can lead to misleading results, so
it is better to use the original values than their ratio.
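A small simulation can make the ratio problem concrete. The sketch below is not taken from Kronmal's paper; it assumes three independent, uniformly distributed variables (the ranges and the function name `simulate_ratio_correlation` are made up for illustration) and compares the correlation of two raw variables with the correlation of the same two variables after each is divided by the third.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_ratio_correlation(n=20000, seed=1):
    """Return (correlation of raw x and y, correlation of x/z and y/z).

    x, y and z are independent by construction (hypothetical ranges),
    so any correlation between the ratios comes from the shared divisor.
    """
    rng = random.Random(seed)
    x = [rng.uniform(50, 150) for _ in range(n)]
    y = [rng.uniform(50, 150) for _ in range(n)]
    z = [rng.uniform(50, 150) for _ in range(n)]
    raw = pearson(x, y)
    ratios = pearson([a / c for a, c in zip(x, z)],
                     [b / c for b, c in zip(y, z)])
    return raw, ratios


if __name__ == "__main__":
    raw, ratios = simulate_ratio_correlation()
    print(f"correlation of raw variables:   {raw:.3f}")
    print(f"correlation of the two ratios:  {ratios:.3f}")
```

Under these assumptions the raw correlation should sit near zero while the ratio correlation is clearly positive, which is the kind of artefact the reference warns about.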
Do the readers of this list think there is room for another article?
Maybe I will try to write one.
Peter
-----Original Message-----
From: John Whittington
Sent: Jun 25, 2009 6:41 AM
To: MedStats@googlegroups.com
Subject: {MEDSTATS} Re: categorising BMI
It's reassuring to find someone authoritative agreeing with me for
once!
The exercise of developing a prognostic model which Doug mentions
obviously introduces even more issues (such as he discusses), since one
is then having to decide upon the 'cutpoints' (category boundaries) as
well as developing the model.
The situations I was thinking of were those in which those cutpoints
(which relate to decision-making) are already (at least for the time
being) 'externally defined' (often pretty arbitrarily) - whether they
relate to criteria for diagnosis, treatment, prosecution or
whatever. It is in those situations that I feel that (having
analysed all of the data, without categorisation) tests of hypotheses
relating to the (pre-defined) categorisation really should be undertaken -
but I am surprised by how rarely this seems to happen. A prognostic model is,
I assume, likely to be tested against 'known facts' (i.e. actual observed
outcome/prognosis), so other analytical techniques would presumably be
employed.
Kind Regards,
John
At 10:25 25/06/2009 +0100, Doug Altman wrote:
I completely agree with John. We also made this point briefly in relation to developing a prognostic model:
We agree that medical decision making often requires categorization of data, e.g. to define a high-risk group of patients for a clinical trial ... However, categorization should be applied to the prognostic index, not to the original prognostic variables.
Royston P, Altman DG, Sauerbrei W. Dichotomizing continuous predictors in multiple regression: a bad idea. Statistics in Medicine 2006; 25:127-141.
Similar comments apply to other contexts. But how this should best be done is not agreed, both in terms of the number of groups and (especially) the placement of the cutpoints. I reviewed this issue in
Altman DG. Categorizing continuous variables. In: Armitage P, Colton T (eds) Encyclopedia of biostatistics. 2nd edn. Chichester: John Wiley, 2005: 708-711.
We do know though that choosing the cutpoints to minimise the P value - or maximise differences in outcome - is highly biased, in common with other data-dependent analysis approaches. See
Altman DG, Lausen B, Sauerbrei W, Schumacher M. Dangers of using “optimal” cutpoints in the evaluation of prognostic factors. [Commentary] Journal of the National Cancer Institute 1994; 86:829-835.
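For a class, the bias from data-driven cutpoints can be shown with a short simulation. The sketch below is only an illustration under stated assumptions, not the method of the papers cited above: predictor and outcome are simulated as independent normal variables (so the null hypothesis is true), groups are compared with a large-sample z test rather than an exact test, and candidate cutpoints are restricted to the central 80% of the predictor. The function names and default sample sizes are hypothetical.

```python
import math
import random


def two_sided_p(z):
    """Two-sided P value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_statistic(a, b):
    """Large-sample z statistic comparing the means of two groups."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((v - ma) ** 2 for v in a) / (len(a) - 1)
    vb = sum((v - mb) ** 2 for v in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))


def false_positive_rates(n_sims=200, n=100, seed=2):
    """Return (rate with a fixed median cutpoint, rate with the 'optimal' cutpoint)."""
    rng = random.Random(seed)
    fixed_hits = 0
    optimal_hits = 0
    for _ in range(n_sims):
        # Predictor and outcome are independent: there is no true effect.
        data = sorted((rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n))
        outcome = [y for _, y in data]  # outcomes ordered by the predictor
        # Try every split between the 10th and 90th percentiles of the predictor.
        p_values = {}
        for k in range(n // 10, n - n // 10 + 1):
            p_values[k] = two_sided_p(z_statistic(outcome[:k], outcome[k:]))
        if p_values[n // 2] < 0.05:          # cutpoint fixed in advance (median)
            fixed_hits += 1
        if min(p_values.values()) < 0.05:    # cutpoint chosen to minimise P
            optimal_hits += 1
    return fixed_hits / n_sims, optimal_hits / n_sims


if __name__ == "__main__":
    fixed, optimal = false_positive_rates()
    print(f"false-positive rate, median cutpoint:    {fixed:.2f}")
    print(f"false-positive rate, 'optimal' cutpoint: {optimal:.2f}")
```

With a pre-specified cutpoint the false-positive rate should stay close to the nominal 5%, whereas picking the cutpoint with the smallest P value should push it far higher.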
There are nevertheless situations where the ratio, in addition to the raw numbers, is useful. For example, the ratio of total cholesterol to LDL may be a better guide for statin treatment decisions than either LDL or total cholesterol.
Diana
On 24/06/2009 16:57, "Greg Snow" <greg.s...@imail.org> wrote:
> Richard Kronmal (1993) "Spurious Correlation and the Fallacy of the
> Ratio Standard Revisited". Journal of the Royal Statistical Society.
> Vol. 156, No 3, 379-392.
> Which does not deal with the categorization issue, but rather the fact
> that using the ratio rather than the 2 numbers (whether height-weight
> or waist-hip) can lead to misleading results and it is better to use
> the original values rather than their ratio.
> There are nevertheless situations where the ratio, in addition to the raw
> numbers, is useful.
> For example, the ratio of total cholesterol to LDL may be a better guide for
> statin treatment decisions than either LDL or total cholesterol.
Is not the point that, in reality, it is very unlikely that a ratio (or any other fixed mathematical combination) of two variables is going to remain an ideal predictor/guide across all values of both variables - hence the suggestion that it is better to look at both of the values? In effect, using your example, it would mean that for each value of total cholesterol there would be a specific 'treatment decision threshold' in terms of LDL level (or vice versa), with 'the ratio' (at the threshold point) not necessarily always being the same.
Of course, that's far more complicated (both to estimate the thresholds and to apply them) - so 'less-than-ideal' fixed combinations (e.g. a ratio) are likely to continue to be used in practice, for their simplicity.
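To illustrate the point about value-specific thresholds, here is a minimal sketch. It assumes a purely hypothetical linear risk score in total cholesterol and LDL; the coefficients, the cutoff and the function name `ldl_threshold` are invented for illustration and have no clinical meaning. For each total cholesterol value it solves for the LDL level at which the score reaches the decision cutoff, and reports the ratio at that point.

```python
def ldl_threshold(total_chol, intercept=-5.0, b_total=0.5, b_ldl=0.8, cutoff=0.0):
    """LDL level at which a hypothetical linear score reaches the cutoff.

    score = intercept + b_total * total_chol + b_ldl * ldl
    All coefficients are made-up illustrative values, not estimates.
    """
    return (cutoff - intercept - b_total * total_chol) / b_ldl


if __name__ == "__main__":
    for total_chol in (4.0, 5.0, 6.0, 7.0):
        ldl = ldl_threshold(total_chol)
        ratio = total_chol / ldl
        print(f"total={total_chol:.1f}  LDL threshold={ldl:.3f}  "
              f"ratio at threshold={ratio:.2f}")
```

In this toy model the LDL threshold changes with total cholesterol and the ratio at the threshold is different each time, so no single ratio cutoff reproduces the two-variable decision rule.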
Kind Regards,
John
----------------------------------------------------------------
Dr John Whittington, Mediscience Services
Twyford Manor, Twyford, Buckingham MK18 4EL, UK
Voice: +44 (0) 1296 730225
Fax: +44 (0) 1296 738893
E-mail: Joh...@mediscience.co.uk
----------------------------------------------------------------
There have also been differences between schools of thought on the
best cutoff for BMI between normal weight and overweight, with some setting
the cutoff at 27 and others at 25.
Using the ratio in its original (continuous) form would, of course,
maximise power. Using the individual weight and height rather than BMI
may require some transformation of height to allow for a better fit.
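The loss of power from cutting BMI at 25 or 27 can be sketched with simulated data. Everything below is an assumption made for illustration: BMI is drawn from a normal distribution with mean 26 and SD 4, the outcome depends linearly on BMI plus noise, and the function name `attenuation` is hypothetical. The correlation with the outcome is then compared for continuous BMI and for an overweight indicator at each cutoff.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def attenuation(n=20000, seed=3):
    """Correlation of a simulated outcome with continuous and dichotomised BMI."""
    rng = random.Random(seed)
    # Hypothetical BMI distribution and a linear effect on the outcome.
    bmi = [rng.gauss(26.0, 4.0) for _ in range(n)]
    outcome = [0.1 * b + rng.gauss(0.0, 1.0) for b in bmi]
    results = {"continuous": pearson(bmi, outcome)}
    for cutoff in (25.0, 27.0):
        overweight = [1.0 if b >= cutoff else 0.0 for b in bmi]
        results[f"cutoff {cutoff:g}"] = pearson(overweight, outcome)
    return results


if __name__ == "__main__":
    for label, r in attenuation().items():
        print(f"{label:>12}: r = {r:.3f}")
```

For comparison, when a normally distributed predictor is split exactly at its median the expected attenuation factor is sqrt(2/pi), about 0.80; cutoffs further from the median lose more.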
Neville
--
==================
Dr Neville Calleja
12 Mon Nid
Ganni Faure Str
Tarxien TXN2421
MALTA
> 2009/6/24 Neil Shephard <nsheph...@gmail.com>
>> Nothing specific to BMI, but Frank Harrell has a useful page on the
>> problems associated with binning continuous variables...
>> http://biostat.mc.vanderbilt.edu/wiki/Main/CatContinuous
>> There was some discussion earlier this year on MedStats itself (which
>> pertains to age)...
>> http://groups.google.com/group/MedStats/browse_thread/thread/f14ced3d.../ed262012c02d8ecc?pli=1
>> And here are a few references....
>> Altman DG (1991) Categorising continuous covariates. British Journal
>> of Cancer 64:975
>> Altman DG, Royston P (2006) The cost of dichotomising continuous
>> variables. BMJ 332(7549):1080
>> http://171.66.124.147/cgi/content/extract/332/7549/1080
>> Faraggi D, Simon R (1996) A simulation study of cross-validation for
>> selecting an optimal cutpoint in univariate survival analysis.
>> Statistics in Medicine 15(20):2203-2213
>> http://www.ncbi.nlm.nih.gov/pubmed/8910964
>> Owen SV, Froman RD (2005) Why carve up your continuous data? Research
>> in Nursing & Health 28(6):496-503
>> http://www3.interscience.wiley.com/journal/112141778/abstract
>> Hadzi-Pavlovic D (2007) Correlations II : categorizing continuous
>> data. Acta Neuropsychiatrica 19(2):129-130
>> The problem is really independent of the variable that you are
>> attempting to categorize, and I'm sure you could phrase the above
>> points in the context of BMI.
>> Neil
>> Email - nsheph...@gmail.com
>> Website - http://slack.ser.man.ac.uk/
>> Photos - http://www.flickr.com/photos/slackline/
> Professor Diana Kornbrot
> email: d.e.kornb...@herts.ac.uk
> web: http://web.mac.com/kornbrot/iweb/KornbrotHome.html
> Work: School of Psychology, University of Hertfordshire,
> College Lane, Hatfield, Hertfordshire AL10 9AB, UK
> voice: +44 (0) 170 728 4626
> fax: +44 (0) 170 728 5073
See also
@Article{fil07cat,
author = {Filardo, Giovanni and Hamilton, Cody and Hamman, Baron
and Ng, Hon K. T. and Grayburn, Paul},
title = {Categorizing {BMI} may lead to biased results in studies
investigating in-hospital mortality after isolated {CABG}},
journal = J Clin Epi,
year = 2007,
volume = 60,
pages = {1132-1139},
annote = {BMI;CABG;surgical adverse events;hospital
mortality;epidemiology;smoothing methods;categorization;categorizing
continuous variables;investigators should waive categorization
entirely and use smoothed functions for continuous variables;examples
of non-monotonic relationships}