Example where adding ratios is correct?

Frank Harrell

May 3, 2013, 1:07:56 PM

to meds...@googlegroups.com

I am having an argument with a journal that has published several papers where authors have created a prognostic score by anti-logging Cox proportional hazards model regression coefficients and adding the hazard ratios. This is wrong on so many levels that it is difficult to fathom. Ewout Steyerberg has pointed out that with this scheme, a protective factor will be scored as harmful upon anti-logging its negative regression coefficient.
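A minimal sketch of Steyerberg's point, using made-up coefficients (the covariate names and values below are assumptions for illustration, not taken from any of the papers in question). It compares the linear predictor, which is the scale the Cox model is fitted on, with a score built by adding anti-logged coefficients:

```python
import math

# Hypothetical log hazard ratios (Cox coefficients); illustrative values only.
COEFS = {"diabetes": 0.69, "age_per_decade": 0.40, "statin_use": -0.51}

def linear_predictor(patient):
    """Sum of coefficient * covariate: the scale the model was fitted on."""
    return sum(COEFS[name] * value for name, value in patient.items())

def summed_hazard_ratio_score(patient):
    """The flawed scheme: anti-log each coefficient, then add."""
    return sum(math.exp(COEFS[name]) * value for name, value in patient.items())

without_statin = {"diabetes": 1, "age_per_decade": 1, "statin_use": 0}
with_statin = {"diabetes": 1, "age_per_decade": 1, "statin_use": 1}

# On the log scale, adding the protective factor lowers the score.
lp_change = linear_predictor(with_statin) - linear_predictor(without_statin)

# Under the summed-HR scheme the same factor raises the score,
# because exp(negative number) is still positive.
hr_change = (summed_hazard_ratio_score(with_statin)
             - summed_hazard_ratio_score(without_statin))

print(f"change in linear predictor: {lp_change:+.2f}")
print(f"change in summed-HR score:  {hr_change:+.2f}")
```

The two changes have opposite signs, so the protective factor is scored as harmful as soon as hazard ratios are added.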

One thing I'd like to say to the journal is that there is no example where adding ratios is appropriate when (1) the ratios do not represent the ratios of parts to a whole (i.e., proportions) and (2) the numerators and denominators are both variables (i.e., not fixed constants). It is OK to add proportions, and we have variance formulas where you take 1/m + 1/n, but the numerators are fixed constants in the latter.

Can anyone think of an example in biology, medicine, statistics, physics, or any other field (subject to the two exclusions above) where adding ratios actually works? I note in passing that it is not legitimate to compute the arithmetic mean of ratios or percent change - that's why we have geometric means and why we average logs then anti-log to get fold change.
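As a rough illustration of the last remark, here is a small sketch comparing the arithmetic mean of ratios with the geometric mean (average the logs, then anti-log). The series 100 -> 150 -> 100 is an invented example, chosen so that the net change is exactly zero:

```python
import math

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(ratios):
    """Average the logs, then anti-log."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Illustrative series: a measurement goes 100 -> 150 -> 100.
series = [100.0, 150.0, 100.0]
ratios = [after / before for before, after in zip(series, series[1:])]

# Arithmetic mean of the step ratios is above 1, suggesting net growth.
mean_ratio = arithmetic_mean(ratios)

# Geometric mean is 1 (up to rounding), matching the absence of net change.
geo_ratio = geometric_mean(ratios)

print(f"arithmetic mean of ratios: {mean_ratio:.4f}")
print(f"geometric mean of ratios:  {geo_ratio:.4f}")
```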

Frank

Peter Flom

May 3, 2013, 2:16:57 PM

to meds...@googlegroups.com

I am reminded of a famous Ripleyism (his was in response to stepwise regression)

“The only reason to do this is to show why it is wrong”

But maybe one of us will come up with some exception!

Peter

--
--
To post a new thread to MedStats, send email to MedS...@googlegroups.com .
MedStats' home page is http://groups.google.com/group/MedStats .
Rules: http://groups.google.com/group/MedStats/web/medstats-rules

---
You received this message because you are subscribed to the Google Groups "MedStats" group.
To unsubscribe from this group and stop receiving emails from it, send an email to medstats+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

SteveDrD

May 10, 2013, 9:26:49 AM

to meds...@googlegroups.com

And I can think of exactly one trivial example (although it may fall into the sum of proportions caveat). Suppose one of the ratios was equal to zero. Would the sum be meaningful?

Steve Denham

Frank Harrell

May 10, 2013, 9:40:34 AM

to meds...@googlegroups.com

Hi Steve,

I'm looking for examples where the elements of the ratios are not fixed constants. No one has thought of one yet - I hope to get some more responses.

Frank

John Sorkin

May 10, 2013, 10:05:32 AM

to meds...@googlegroups.com

Prof. Harrell,

At the risk of appearing unknowledgeable, can you give me an example, or reason why it is not legitimate to compute the arithmetic mean of ratios or percent change? Additionally, what is the correct method to use to get a summary measure of these statistics? I don't want to do that which is not correct.

John

P.S. In asking this question, I take courage from the aphorism that the only stupid question is the unasked question.

Kornbrot, Diana

May 10, 2013, 10:17:49 AM

to meds...@googlegroups.com

There is no reason why you shouldn’t take the arithmetic mean of ratios or proportions.
BUT if the reason you are doing it is to predict what another similar group of entities will do next time, it may be more accurate to first transform to z-scores, take the mean of the z-scores, and then back-transform to the proportion.
WHY?
Because ratios have floor and ceiling effects as they approach 0 or 1.
It is much easier to increase % correct in an exam from 50% to 55% than from 90% to 95%.
It is much easier to increase % buying a product after viewing an ad from 50% to 55% than from 90% to 95%.
Etc.
Best

diana

Emeritus Professor Diana Kornbrot
email: d.e.ko...@herts.ac.uk
web:    http://dianakornbrot.wordpress.com/
Work
Department of Psychology
School of Life and Medical Sciences
University of Hertfordshire
College Lane, Hatfield, Hertfordshire AL10 9AB, UK
voice:   +44 (0) 170 728 4626
Home
19 Elmhurst Avenue
London N2 0LT, UK
voice:   +44 (0) 208 444 2081
mobile: +44 (0) 740 318 1612

John Whittington

May 10, 2013, 10:17:23 AM

to meds...@googlegroups.com

At 10:05 10/05/2013 -0400, John Sorkin wrote:
>At the risk of appearing unknowledgeable, can you give me an example, or
>reason why it is not legitimate to compute the arithmetic mean of ratios
>or percent change?

One of the most basic issues is presumably that such a practice can give
inappropriate weights to the figures being combined, if the denominators of
the ratios are very different. ... Consider two ratios, one of 0.1 (derived
from 1/10) and 0.9 (derived from 9000/10000). The arithmetic mean of the
two ratios would be 0.5. However, the actual overall situation would be
9001/10010, namely 0.899.
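The arithmetic in this example can be checked with a short sketch. The figures are the ones quoted in this message; the function names are just labels chosen here:

```python
# (numerator, denominator) pairs from the example: 1/10 and 9000/10000.
pairs = [(1, 10), (9000, 10000)]

def mean_of_ratios(pairs):
    """Unweighted arithmetic mean of the individual ratios."""
    return sum(num / den for num, den in pairs) / len(pairs)

def pooled_ratio(pairs):
    """Ratio of summed numerators to summed denominators."""
    return sum(num for num, _ in pairs) / sum(den for _, den in pairs)

def denominator_weighted_mean(pairs):
    """Mean of the ratios, each weighted by its own denominator."""
    total = sum(den for _, den in pairs)
    return sum((num / den) * den for num, den in pairs) / total

print(f"mean of ratios:            {mean_of_ratios(pairs):.3f}")
print(f"pooled ratio:              {pooled_ratio(pairs):.3f}")
print(f"denominator-weighted mean: {denominator_weighted_mean(pairs):.3f}")
```

Weighting each ratio by its denominator reproduces the pooled figure, which is one way to see that the unweighted mean gives the small sample as much say as the large one.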

Kind Regards,

John

----------------------------------------------------------------
Dr John Whittington, Voice: +44 (0) 1296 730225
Mediscience Services Fax: +44 (0) 1296 738893
Twyford Manor, Twyford, E-mail: Joh...@mediscience.co.uk
Buckingham MK18 4EL, UK
----------------------------------------------------------------

Frank Harrell

May 10, 2013, 10:57:33 AM

to meds...@googlegroups.com

Let me turn it around and ask you for an example where it is valid to do so, with specifics.

Details about problems caused by percent change are at http://biostat.mc.vanderbilt.edu/ManuscriptChecklist - look for Measures of Change.

For Cox and logistic models the simple answer is that the regression coefficients were derived using maximum likelihood for a model in which the effects are additive on the log scale (that is, multiplicative on the hazard or odds scale), not additive on the anti-logged scale. So adding antilogs gives the wrong estimate of risk and is inconsistent with how the betas were estimated.
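To spell out the algebra behind this point, here is a sketch for a Cox model with two covariates. The symbols h_0, beta_1, beta_2, x_1, x_2 are generic notation assumed for illustration, not taken from any particular paper:

```latex
% Cox model: covariate effects add on the log hazard scale
\log\frac{h(t \mid x)}{h_0(t)} = \beta_1 x_1 + \beta_2 x_2

% Anti-logging turns the sum into a product of hazard ratios
\frac{h(t \mid x)}{h_0(t)}
  = \exp(\beta_1 x_1 + \beta_2 x_2)
  = \bigl(e^{\beta_1}\bigr)^{x_1}\,\bigl(e^{\beta_2}\bigr)^{x_2}

% So with x_1 = x_2 = 1 the combined hazard ratio is the product,
% which in general differs from the sum
e^{\beta_1} e^{\beta_2} \neq e^{\beta_1} + e^{\beta_2}
```

A score that is consistent with the fitted model is therefore a sum of betas (or, equivalently, a product of hazard ratios), not a sum of hazard ratios.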

Frank

Frank Harrell

May 10, 2013, 11:01:19 AM

to meds...@googlegroups.com

Diana,

Floor effects are not the problem here. Z-transforms are not appropriate for ratios. The correct approach, usually, is to take logs of ratios and to operate on those, then later anti-log to get fold change.

Arithmetic means can be appropriate for proportions, not for ratios in general. If you know of cases where arithmetic means of two ratios is appropriate, where the ratios are not proportions and none of the numerators and denominators is a fixed literal constant, please post it.

Frank

John Sorkin

May 10, 2013, 11:06:26 AM

to meds...@googlegroups.com

Assume we have two subjects, one with an initial systolic blood pressure of 200 mmHg, which after treatment decreases to 150 mmHg, a 25% decrease. The second subject has an initial systolic blood pressure of 100 mmHg and after treatment has a systolic blood pressure of 75 mmHg, also a 25% decrease. Other than being concerned that I have lowered the second subject's systolic blood pressure too much, am I not correct to say that the average percentage decrease brought about by treatment was 25%?

John


Kornbrot, Diana

May 10, 2013, 11:33:52 AM

to meds...@googlegroups.com

Frank is right

My too-quick solution is for proportions, which must lie in the interval [0, 1], not ratios.
Ratios are bounded by zero and infinity, which IS very different from proportions.

Best is to log the ratios, take the mean, and then anti-log the mean.
Rationale:
Ratios are often used in situations where 1 indicates neutrality, or equivalence.
For example, accuracy: estimate/true value.
A’s estimate is 2 * true value, an overestimate.
B’s estimate is 0.5 * true value, an underestimate.
It seems to most that B’s underestimate is equivalent to A’s overestimate, BUT
Aratio - 1 = 1
Bratio - 1 = -0.5
So the mean of Aratio and Bratio is 1.25, NOT 1.
BUT log(1) = 0 and log(2) = -log(0.5). Hence the mean of log(Aratio) and log(Bratio) is 0, which is neutral, and antilog(0) = 1.

Sex ratios are a similar example. Suppose we want to compute mean sex ratios across schools in biology.
Define sex ratio as Nfemale/Nmale: school1 = 1.5, school2 = .33. On this scale school1 is 0.5 away from equality and school2 is 0.67 away.
Then another researcher defines sex ratio as Nmale/Nfemale. Now school1 = .67 and school2 = 3, so school1 is 0.33 away from equality and school2 is 2 away.
Doesn’t make sense! How far each school is from equality should not depend on which sex goes in the numerator.
Taking logs, so that neutral is zero, removes the problem: inverting a ratio only changes the sign of its log.
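A small sketch of why the log scale removes the dependence on which sex goes in the numerator. It uses the two ratios quoted above (1.5 and .33); the helper names are just labels chosen here:

```python
import math

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(ratios):
    """Average the logs, then anti-log."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

female_per_male = [1.5, 0.33]                        # ratios quoted above
male_per_female = [1 / r for r in female_per_male]   # same schools, inverted

# Arithmetic means: the two definitions give summaries that are
# not reciprocals of each other, so they tell different stories.
am_fm = arithmetic_mean(female_per_male)
am_mf = arithmetic_mean(male_per_female)

# Geometric means: one is the reciprocal of the other (up to rounding),
# so the summary does not depend on the arbitrary choice of numerator.
gm_fm = geometric_mean(female_per_male)
gm_mf = geometric_mean(male_per_female)

print(f"arithmetic: {am_fm:.3f} vs {am_mf:.3f}, product {am_fm * am_mf:.3f}")
print(f"geometric:  {gm_fm:.3f} vs {gm_mf:.3f}, product {gm_fm * gm_mf:.3f}")
```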

Importantly for medical applications, the ratio may be observed biomarker concentration / recommended concentration.

With all this in mind I fail Frank’s challenge. Cannot think of scenario where means of ratios is appropriate
Best

Diana

Another example of such a ratio: time to row 2 km relative to the mean time for the age and sex group.


Kornbrot, Diana

May 10, 2013, 11:44:44 AM

to meds...@googlegroups.com

See Frank’s comments.
The denominators are different for the 2 patients.
If the ratio were relative to a ‘healthy’ pressure of, say, 120, then the ratio has decreased from 200/120 to 150/120 for one patient and from 100/120 to 75/120 for the other. Taking logs of these ratios and subtracting shows you which patient has had the greatest change in ratio.
If one were comparing the efficacy of 2 drugs, then the average of log(start/healthy) - log(end/healthy) would enable you to compare the 2 drugs.
Best
Diana


John Whittington

May 10, 2013, 11:59:47 AM

to meds...@googlegroups.com

At 11:06 10/05/2013 -0400, John Sorkin wrote:

>Assume we have two subjects, one with an initial systolic blood pressure of 200 mmHg, which after treatment decreases to 150 mmHg, a 25% decrease. The second subject has an initial systolic blood pressure of 100 mmHg and after treatment has a systolic blood pressure of 75 mmHg, also a 25% decrease. Other than being concerned that I have lowered the second subject's systolic blood pressure too much, am I not correct to say that the average percentage decrease brought about by treatment was 25%?

In situations like that, I think the primary question is whether one should be using percentage changes in the first place, let alone working out the average of two or more of these percentage changes. There may be some situations (e.g. in some physical sciences) in which a response is linearly related to the starting value (hence maybe offering some basis for looking at percentage changes), but response to medical treatments is, in general, certainly not such a situation. Taking your example, a treatment which reduces a systolic BP of 200 mmHg by 50 mmHg is quite likely to result in a catastrophically high decrease (not a pro-rata smaller decrease) if the starting BP is 100 mmHg. Percentage change figures therefore have the potential to be very misleading in such situations.

Frank Harrell

May 10, 2013, 12:34:45 PM

to meds...@googlegroups.com

It would be unusual for log blood pressure to satisfy the Bland-Altman conditions for a change score. Changes should be computed on the scale such that the change is unrelated to the average of the transformed values.

Percents cause many problems. I wish we never used them. They get misused all the time. When the stock market declines by 3% one day, it does not get back to where it was when it increases by 3% the next day, regardless of what the media say. If the market goes down by a factor of 0.97 it has to go up by a factor of 1/0.97 to return to the original amount. I'm always amazed how many statisticians don't realize that percent change is an asymmetric measure. Many papers have been written about this (several are cited on the wiki I referenced in an earlier post).
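The stock market arithmetic can be verified with a short sketch. The starting value of 100 is an arbitrary assumption; the 3% figures are the ones used above:

```python
import math

start = 100.0                          # arbitrary starting index value

after_drop = start * (1 - 0.03)        # down 3%: factor 0.97
after_rise = after_drop * (1 + 0.03)   # up 3% the next day: factor 1.03

# The factor needed to undo a 0.97 decline is 1/0.97, a bit more than 3%.
required_rise = 1 / 0.97 - 1
recovered = after_drop * (1 + required_rise)

# On the log scale the decline and the recovery are equal and opposite.
log_drop = math.log(0.97)
log_recovery = math.log(1 / 0.97)

print(f"after -3% then +3%: {after_rise:.2f}")
print(f"rise needed to recover: {100 * required_rise:.2f}%")
print(f"log changes: {log_drop:+.5f} and {log_recovery:+.5f}")
```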

Frank

John Sorkin

May 10, 2013, 12:36:30 PM

to meds...@googlegroups.com

John,

The numbers used in my example were chosen simply for simplicity of exposition. They were not chosen to represent a true experiment. My question remains, am I wrong in stating that the drug, on average, results in a 25% decrease in blood pressure? Is there a first principle that would preclude the average, much as one does not average standard deviations, but rather converts the SDs to variances, averages the variances, and then converts the resulting value back to an SD?
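For reference, the standard deviation analogy can be sketched as follows. The two SD values are invented, and equal group sizes are assumed so that a plain average of the variances is appropriate:

```python
import math

def pooled_sd(sds):
    """Square the SDs, average the variances, take the square root.

    Assumes equal group sizes; unequal sizes would need weights.
    """
    return math.sqrt(sum(sd ** 2 for sd in sds) / len(sds))

sds = [2.0, 4.0]                     # illustrative values only
naive_mean = sum(sds) / len(sds)     # averaging SDs directly
pooled = pooled_sd(sds)              # averaging on the variance scale

print(f"mean of SDs: {naive_mean:.3f}")
print(f"pooled SD:   {pooled:.3f}")
```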

John


John Whittington

May 10, 2013, 12:52:38 PM

to meds...@googlegroups.com

At 12:36 10/05/2013 -0400, John Sorkin wrote:

>The numbers used in my example were chosen simply for simplicity of exposition. They were not chosen to represent a true experiment.

I obviously realised that, but the (fairly extreme) figures you chose were, in fact, quite useful to me in making my point!

>My question remains, am I wrong in stating that the drug, on average, results in a 25% decrease in blood pressure?

As I see it, it's not so much a question of 'right and wrong' as of 'meaningfulness'. A result stated as a percentage begs the question "percentage of what" - and in your case the answer is "percentage of the average pre-treatment value" (average pre-treatment = 150, average decrease = 37.5, hence 25%). Is that necessarily meaningful?

Greg Snow

May 10, 2013, 1:28:07 PM

to meds...@googlegroups.com

I wonder if the definition of “ratio” or lack thereof may be causing some confusion here.

When I first read Frank’s question my thought was that a ratio just meant a fraction, and I thought of a few different examples where adding proportions (with equal, fixed denominators) made sense. But reading further I believe that Frank is talking about epidemiologic ratios, which are a division of 2 rates or odds; what is important from a statistical view is that at least one piece in each of the numerator and denominator is a random variable. For example, the proportion of smokers who develop lung cancer is a rate (and a fraction and a proportion) but is not a ratio. The proportion of smokers who develop lung cancer divided by the proportion of non-smokers who develop lung cancer is a ratio.

I don’t think that the change in blood pressure example (and other examples so far) meet that definition of ratio that Frank is interested in.

From the statistics/random variable approach it makes sense to work with ratios on the log scale, because taking the log turns those pesky divisions into subtractions, and dealing with the difference between 2 random variables is simple; the ratio of 2 random variables is not so simple.

If I remember my elementary school math correctly, you can only add 2 fractions when the denominator is exactly equal. So adding 2 rates from the same population (therefore with the same, fixed denominator) can be meaningful. But if there is a random variable in the denominator then can the denominators ever be considered to be equal?

Even if we have a case where numerically the denominators are equal, for example we have an odds ratio (or hazard ratio, or risk ratio) for heavy smokers developing lung cancer compared to non-smokers and we also have an odds ratio for light smokers developing lung cancer compared to non-smokers, then we have the rate of lung cancer in non-smokers as the denominator in both cases. Can we add those 2 odds ratios together? Would that be the odds ratio for smokers (either heavy or light) developing lung cancer compared to non-smokers? It would be paradoxical if smokers (undetermined amount) had a higher odds ratio than either heavy or light smokers. It could be considered the risk of being both a light smoker and a heavy smoker at the same time (even though that is impossible), but that would still require additivity (no interaction) on the non-logged scale. Averaging might make sense here, but there is still the main issue of weights in the averaging.
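The paradox can be made concrete with a sketch. The counts below are entirely hypothetical (they are not from any study) and the group names are labels chosen here:

```python
# Hypothetical (cases, non-cases) counts; invented for illustration only.
non_smokers = (10, 990)
light_smokers = (30, 470)
heavy_smokers = (60, 440)

def odds(group):
    cases, non_cases = group
    return cases / non_cases

def odds_ratio(exposed, reference):
    return odds(exposed) / odds(reference)

or_light = odds_ratio(light_smokers, non_smokers)
or_heavy = odds_ratio(heavy_smokers, non_smokers)

# Pool light and heavy smokers into one "any smoker" group.
any_smokers = (light_smokers[0] + heavy_smokers[0],
               light_smokers[1] + heavy_smokers[1])
or_any = odds_ratio(any_smokers, non_smokers)

print(f"OR light: {or_light:.2f}")
print(f"OR heavy: {or_heavy:.2f}")
print(f"OR any smoker (pooled counts): {or_any:.2f}")
print(f"sum of the two ORs: {or_light + or_heavy:.2f}")
```

With these made-up counts the odds ratio for any smoker falls between the light and heavy values, while the sum of the two odds ratios exceeds both, which is the paradox described above.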

So my 2 cents worth is that I cannot think of a meaningful case of adding ratios (epidemiology definition) on the non-logged scale.

--

Gregory (Greg) L. Snow Ph.D.

Statistical Data Center

Intermountain Healthcare

greg...@imail.org

801.408.8111

--

John Whittington

May 10, 2013, 1:44:00 PM

to meds...@googlegroups.com

At 11:28 10/05/2013 -0600, Greg Snow wrote:
>I wonder if the definition of ratio or lack thereof may be causing some
>confusion here. .... I don't think that the change in blood pressure
>example (and other examples so far) meet that definition of ratio that
>Frank is interested in.

Well, for what it's worth, I thought he was talking geberally, about _any_
sort of ratios (or fractions/percentages), not least because I would
agree with that general concern.

Thompson,Paul

May 10, 2013, 1:48:01 PM

to meds...@googlegroups.com

"talking geberally"

Not an adverb that I am familiar with. Perhaps you meant "talking gerbilally" or "speaking in the manner of a gerbil"

Just a wild guess, mind you.



John Whittington

May 10, 2013, 2:40:34 PM

to meds...@googlegroups.com

At 17:48 10/05/2013 +0000, Thompson,Paul wrote:
>"talking geberally" ... Not an adverb that I am familiar with.

It must be the weekend !! I don't know about yours, but the 'b' and 'n'
keys on my keyboard are adjacent!

Thompson,Paul

May 10, 2013, 2:43:47 PM

to meds...@googlegroups.com

Just a little attempt at humor, John, lame and limp though it may be. In addition, I did not run my joke through the IACUC (animal welfare committee), but you can be assured that no gerbils were harmed in the manufacture of this humorous interlude.

So, back to adding ratios...


Pedro Emmanuel Alvarenga Americano do Brasil

May 15, 2013, 8:33:51 AM

to MedStats-list

StatMasters,

I did read the thread yesterday but I didn't have time to comment. The very first thing that came to me is that many years ago, a teacher of mine said that any rate can be decomposed into other rates. At the time the topic was transmissible diseases and epidemics. From now on I'm talking about rates as number of events/person-years.

A disease prevalence in a population is a balance of new cases, deaths and cures. This is the simplest way to understand it. But it can be more complex: there may be an inflow from other populations, this inflow may consist of immune and susceptible subjects, sick subjects can become sick again after they were cured, and there is also an outflow to other populations beyond cure and death.

So let's suppose I have a child death rate (events/person-year) of 20 deaths per 10000 person-years in a country. If I split this rate by semester or by month, I have several rates from the same population. If I average these rates, should I get the original rate of 20/10000? The same example can be explored by place. Instead of decomposing by time, we may decompose by place: split the country rate by states or counties. This is even more common. But, as those who do that also have the original country rate, averaging the rates to get the country rate does not make much sense. But is it reasonable?
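One way to explore this question is a small sketch with invented regional data, chosen so that the country total is 20 deaths per 10000 person-years. The region counts and function names are assumptions for illustration only:

```python
# Hypothetical regions as (deaths, person_years); invented numbers that
# add up to 20 deaths in 10000 person-years for the whole country.
regions = [(2, 4000), (6, 1000), (12, 5000)]

def rate_per_10k(deaths, person_years):
    return 10_000 * deaths / person_years

total_deaths = sum(deaths for deaths, _ in regions)
total_py = sum(py for _, py in regions)
country_rate = rate_per_10k(total_deaths, total_py)

region_rates = [rate_per_10k(deaths, py) for deaths, py in regions]

# Plain arithmetic mean of the regional rates.
unweighted_mean = sum(region_rates) / len(region_rates)

# Mean of the regional rates weighted by each region's person-years.
weighted_mean = sum(rate * py for rate, (_, py) in zip(region_rates, regions)) / total_py

print(f"country rate:            {country_rate:.2f} per 10000")
print(f"unweighted mean of rates: {unweighted_mean:.2f} per 10000")
print(f"person-year weighted:     {weighted_mean:.2f} per 10000")
```

In this sketch only the person-year-weighted mean recovers the country rate; the unweighted mean would do so only if every region contributed the same person-time.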

Also, in ancient times, when I was closer to public health, I did see reports of disease rates as averages of adjacent periods, following the rationale of moving averages of time series. For example, the rate of month 5 is the average of months 4, 5 and 6. The argument is that this procedure would make the rates more stable when these rates are very low compared to the international standard (1000 or 100000 person-years). However, I never gave any thought to the limitations of this procedure.
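The smoothing described here can be sketched as a centered three-period moving average. The monthly rates are invented, and the function name is a label chosen here; note that this plain average ignores differences in person-time between months, which is one possible limitation:

```python
def centered_moving_average(values, window=3):
    """Average each value with its neighbours; ends are left as None."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        if i < half or i >= len(values) - half:
            smoothed.append(None)   # not enough neighbours at the edges
        else:
            chunk = values[i - half:i + half + 1]
            smoothed.append(sum(chunk) / window)
    return smoothed

# Hypothetical monthly rates per 100000 person-years (illustrative only).
monthly_rates = [1.0, 4.0, 1.0, 7.0, 1.0, 4.0]
smoothed_rates = centered_moving_average(monthly_rates)

for month, (raw, smooth) in enumerate(zip(monthly_rates, smoothed_rates), start=1):
    print(month, raw, smooth)
```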

Does this bring any light?

Dr. Pedro Emmanuel A. A. do Brasil

Curriculum Lattes: http://lattes.cnpq.br/6597654894290806
ResearchGate.net: https://www.researchgate.net/profile/Pedro_Brasil2/

Instituto de Pesquisa Clínica Evandro Chagas
Fundação Oswaldo Cruz
Rio de Janeiro - Brasil
Av. Brasil 4365,
CEP 21040-360,
Tel 55 21 3865-9648
email: pedro....@ipec.fiocruz.br
email: emmanue...@gmail.com

---Support for free software
www.zotero.org - reference management.
www.broffice.org or www.libreoffice.org - documents, spreadsheets, or presentations.
www.epidata.dk - data entry.
www.r-project.org - data analysis.
www.ubuntu.com - operating system

Frank Harrell

May 27, 2013, 11:42:54 PM

to meds...@googlegroups.com

Pedro,

I'm trying to understand whether the average should be a geometric mean (i.e., average the log rates then anti-log) or should be arithmetic means of rates. I think it's the former but I don't know the literature in that area very well.

Frank

John Sorkin

Jun 3, 2013, 8:32:03 AM

to meds...@googlegroups.com

Colleagues,

What is the best way to quantify agreement when multiple observers are scoring an instrument where the outcome is measured on a ratio scale? In my particular case each observer will, for example, score 4 studies. I will have five people do the scoring. Each study will therefore be scored 5 times; there will be a total of 20 scores (5 observers * 4 studies = 20). The score on the questionnaire ranges from zero to a very large number. Each score will be a non-negative integer (e.g. 0, 1, 2, 3, ..., 100, etc.).

Thank you,

John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Swank, Paul R

Jun 3, 2013, 8:35:47 AM

to meds...@googlegroups.com

Rather than agreement, I would use an intraclass correlation to obtain reliability.

Paul

Paul R. Swank, Ph.D., Professor
Health Promotions and Behavioral Sciences
School of Public Health
University of Texas Health Science Center Houston

________________________________
From: meds...@googlegroups.com [meds...@googlegroups.com] On Behalf Of John Sorkin [jso...@grecc.umaryland.edu]
Sent: Monday, June 03, 2013 7:32 AM
To: meds...@googlegroups.com
Subject: {MEDSTATS} assessing agreement with multiple observers, data on ratio scale

Peter Flom

Jun 3, 2013, 9:42:31 AM

to meds...@googlegroups.com

I don’t know about “best” but it seems to me that you would get good information from a Cronbach’s alpha, or, rather, 4 alphas – one for each study

Peter

Peter Flom

My web site: http://www.statisticalanalysisconsulting.com/

Linked in: http://www.linkedin.com/in/peterflom

Twitter: @PeterFlomStat


Martin Bland

Jun 3, 2013, 10:16:49 AM

to meds...@googlegroups.com

I agree that one way is to use intra-cluster correlations. You can also compute within-subject, between-observer standard deviations and statistics derived from them. You could look at:

http://www-users.york.ac.uk/~mb55/meas/obshead.htm
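
As a rough illustration of the within-subject standard deviation, here is a minimal sketch with made-up scores (4 studies, 5 observers each). It assumes a balanced design, so the mean of the per-study variances equals the within-subject mean square from a one-way analysis of variance. The names are only illustrative.

```python
import math
import statistics

# Hypothetical scores: one row per study, one column per observer.
scores = [
    [10, 12, 11, 9, 13],
    [40, 38, 45, 41, 36],
    [3, 5, 4, 4, 4],
    [75, 80, 70, 78, 72],
]

def within_subject_sd(table):
    """Square root of the mean within-study variance (balanced design assumed)."""
    variances = [statistics.variance(row) for row in table]
    return math.sqrt(sum(variances) / len(variances))

sw = within_subject_sd(scores)
per_study_variance = [statistics.variance(row) for row in scores]
```

In these made-up data the variance is larger for studies with larger scores, which is the kind of pattern that would make a single pooled standard deviation hard to interpret.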

Martin

--
***************************************************
J. Martin Bland
Prof. of Health Statistics
Dept. of Health Sciences
ARRC Building
University of York
Heslington
York YO10 5DD

Email: martin...@york.ac.uk
Phone: 01904 321334 Fax: 01904 321382
Web site: http://martinbland.co.uk/
***************************************************

Giulio Flore

Jun 3, 2013, 10:18:37 AM

to meds...@googlegroups.com

Hi,

You might take a leaf from Multi-Dimensional Scaling (MDS) procedures and compute 'distances' between subjects as an inverse measure of agreement. Each paper score (instrument) would be a variable in a five-dimensional space. Each individual would be fixed in that space by these 5 variables (coordinates, if you will). Then distances between subjects can be calculated. The smaller the distance, the greater the agreement.

There are several ways to measure such distances. I would be cautious about using Euclidean distances, because a ratio-scale score looks deceptively like a continuous variable. Besides, what 80 means for rater A may not be what 80 means for rater B.

I would play safe and use an ordinal measurement method, based on the count of rank differences (city block distances). This is justifiable as it has been proved that using rankings provides optimal solutions (see below for references to software and texts on how to do that).

There is a bonus: MDS will not only extract the distances but also project the paper scores onto a simpler two-dimensional space, so that you can map your subjects graphically on a 2D plot too.

For reference on this methodology, either use the very useful little books in the Sage University Paper series "Quantitative Applications in the Social Sciences" on the subject or, for an exhaustive discussion, Ingwer Borg, Patrick J.F. Groenen "Modern Multidimensional Scaling Theory and Applications" by Springer.

Software-wise, R has MDS functions (cmdscale, for example), Stata I believe has them too, or otherwise you can use SPSS Analyze/Scale/Multidimensional Scaling (PROXSCAL).
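
The distance step can be sketched without any MDS software. The example below is only one reading of the idea: each rater is treated as a point whose coordinates are the ranks that rater gave to the 4 studies, and agreement between two raters is the city block distance between their rank vectors. The scores and names are made up, and the projection to 2D is left out.

```python
from itertools import combinations

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks

def city_block(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical scores: one entry per rater, one value per study.
scores_by_rater = {
    "A": [10, 40, 3, 75],
    "B": [12, 38, 5, 80],
    "C": [11, 45, 4, 70],
    "D": [41, 9, 4, 78],
    "E": [13, 36, 4, 72],
}

ranks = {rater: average_ranks(s) for rater, s in scores_by_rater.items()}
distances = {
    (p, q): city_block(ranks[p], ranks[q]) for p, q in combinations(ranks, 2)
}
```

Here raters A and B order the studies identically, so their distance is 0 even though their raw scores differ, while rater D swaps two studies and ends up at distance 2 from A. The price of ranking is that the size of a disagreement in raw score units is discarded.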

Hope it helps & good luck

Giulio

--

Basilio de Braganca Pereira

Jun 3, 2013, 12:54:40 PM

to jso...@grecc.umaryland.edu, meds...@googlegroups.com

Dear John
I believe you can adapt the results of the attached paper for your problem
Basilio

Pereira.pdf

John Sorkin

Jun 3, 2013, 1:35:45 PM

to meds...@googlegroups.com

Thank you,

John

Thompson,Paul

Jun 3, 2013, 1:36:51 PM

to meds...@googlegroups.com

But one comment – I doubt that you have “ratio” level data. You may have a “rational” zero, but that does not make it ratio-level data.

Ray Koopman

Jun 3, 2013, 4:34:04 PM

to MedStats

I would start with an ICC for absolute agreement. However, the
underlying model presumes an additive constant, which having a real
zero rules out. If all the scores are well away from zero then you
can probably break the rule with impunity, but otherwise you might
try a Poisson-link generalized linear model.
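
For the first suggestion, here is a minimal sketch of an absolute-agreement
ICC for a single rater, computed from two-way ANOVA mean squares (the form
often labelled ICC(2,1)). The data and function name are made up, the
design is assumed complete (every rater scores every study), and this is
the additive-model version, not the Poisson alternative.

```python
def icc_absolute_agreement(table):
    """Single-rater, absolute-agreement ICC from a subjects x raters table."""
    n = len(table)       # number of subjects (studies)
    k = len(table[0])    # number of raters
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical data: the second rater always scores 2 points higher.
offset_table = [[1, 3], [5, 7], [9, 11]]
icc_offset = icc_absolute_agreement(offset_table)

# Hypothetical data: all raters give identical scores.
perfect_table = [[1, 1, 1], [5, 5, 5], [9, 9, 9]]
icc_perfect = icc_absolute_agreement(perfect_table)
```

The constant 2-point offset leaves the ordering of the studies untouched
but still pulls the coefficient below 1, which is what distinguishes
absolute agreement from consistency.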

Pedro Emmanuel Alvarenga Americano do Brasil

Jun 3, 2013, 5:09:27 PM

to MedStats-list

John,

I read several parts of an earlier edition of this book and it helped me a lot. Perhaps you should take a look too.

http://www.agreestat.com/

Dr. Pedro Emmanuel A. A. do Brasil

John Sorkin

Jun 3, 2013, 5:51:58 PM

to MedStats-list

Thank you,

John

Pedro Emmanuel Alvarenga Americano do Brasil

Jun 5, 2013, 11:39:13 AM

to MedStats-list

StatMasters,

Several good comments above. I have never faced a problem like this, so I can't say my comment will be very useful. Nevertheless, when I was involved in reliability studies, reading several parts of Gwet's book was very handy. Estimating reliability/agreement in multirater-multireader studies is not straightforward, and here there is the added complication that scores on a ratio scale may not behave like a continuous scale.

I suggest you take a look at http://www.agreestat.com/ where Gwet provides xls and SAS macros to compute most of the methods in his book. Perhaps one fits your purpose.

Regards,

Dr. Pedro Emmanuel A. A. do Brasil

Frank Harrell

Jun 10, 2013, 2:25:02 PM

to meds...@googlegroups.com

No one has come up with an example where adding ratios is correct under the restrictions listed in my original post. I'm going to conclude that adding ratios is incorrect in this context.
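
For the record, here is a small sketch of the problem that started this thread. The coefficients and patients are made up, and the function names are only illustrative. It compares the Cox linear predictor (sum of coefficients times covariates) with a score built by anti-logging each coefficient and adding the hazard ratios.

```python
import math

# Hypothetical Cox regression coefficients (log hazard ratios).
coefficients = {"age_per_decade": 0.40, "smoker": 0.70, "on_treatment": -0.60}

def linear_predictor(patient):
    """Sum of coefficient * covariate: the scale on which the Cox model is additive."""
    return sum(coefficients[name] * value for name, value in patient.items())

def summed_hazard_ratio_score(patient):
    """The criticised scheme: anti-log each coefficient, then add."""
    return sum(math.exp(coefficients[name]) * value for name, value in patient.items())

# Two hypothetical patients who differ only in the protective factor.
untreated = {"age_per_decade": 6, "smoker": 1, "on_treatment": 0}
treated = {"age_per_decade": 6, "smoker": 1, "on_treatment": 1}

lp_change = linear_predictor(treated) - linear_predictor(untreated)
score_change = summed_hazard_ratio_score(treated) - summed_hazard_ratio_score(untreated)
hazard_ratio_treated_vs_untreated = math.exp(lp_change)
```

On the log scale the treated patient's predictor drops by 0.60, corresponding to a hazard ratio below 1. In the summed-ratio score the same patient gains exp(-0.60), about 0.55 points, so the protective factor is scored as harmful.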