Re: {MEDSTATS} inter-observer variation

Peter Flom

Dec 16, 2009, 10:13:35 AM

to MedStats

Evie <sos_...@yahoo.co.uk> wrote
>I need to compare the readings for 2 observers measured on 5 moulds,
>as you can see from the data below there is a wide range of values
>for the 5 moulds i.e. 1 very small and 1 large. Because of this I
>can't use the intra-class correlation coefficient. Have you any
>suggestions what I could use (I've calculated the % difference between
>observers)?
>
>Observer 1 Observer 2
>44.49 44.96
>9.29 9.03
>3.65 3.51
>1 1.02
>0.02 0.02
>
>Each observer measured each mould 10 times so the above values are the
>means of the 10 replicates.
>
>I appreciate your advice.

What's wrong with % difference between observers?
That sounds like a good measure to me.

Does it not do what you want?
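
For reference, a minimal sketch of the % difference on the five mould means quoted above. The choice of denominator (the mean of the two observers) is an assumption; Evie did not say which convention she used, and dividing by Observer 1's value would be equally defensible.

```python
# Mould means as quoted in the original post (each is a mean of 10 replicates).
obs1 = [44.49, 9.29, 3.65, 1.00, 0.02]
obs2 = [44.96, 9.03, 3.51, 1.02, 0.02]


def percent_difference(a, b):
    """Observer 2 minus Observer 1, as a percentage of the pair mean.

    Using the pair mean as denominator is an assumed convention.
    """
    return 100 * (b - a) / ((a + b) / 2)


pct = [percent_difference(a, b) for a, b in zip(obs1, obs2)]

for a, b, p in zip(obs1, obs2, pct):
    print(f"{a:6.2f} {b:6.2f} {p:+6.2f}%")
```

On these numbers the differences all fall within roughly plus or minus 4%, whatever the size of the mould.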

Peter

Peter L. Flom, PhD
Statistical Consultant
Website: http://www DOT statisticalanalysisconsulting DOT com/
Writing: http://www.associatedcontent.com/user/582880/peter_flom.html
Twitter: @peterflom

ציפי שוחט‎

Dec 16, 2009, 10:59:50 AM

to meds...@googlegroups.com

It seems that the problem is the non-normality of the observations.

Using a log transformation should normalize the data and shrink the range so that the ICC could be calculated for the log transformed data.

Hope this helps.

Tzippy Shochat

2009/12/16 Peter Flom <peterflom...@mindspring.com>

--
To post a new thread to MedStats, send email to MedS...@googlegroups.com .
MedStats' home page is http://groups.google.com/group/MedStats .
Rules: http://groups.google.com/group/MedStats/web/medstats-rules

BXC (Bendix Carstensen)

Dec 16, 2009, 4:45:00 PM

to meds...@googlegroups.com

Evie (et al.):

The actual distribution of the measurements is irrelevant.
What counts is the distribution of the DIFFERENCES of the measurements.

If you compute the differences you can produce a 95% prediction interval for the difference as mean +/- 2 SD, though 5 observations is not much to base an estimate of an SD on. This is the so-called Limits of Agreement, see:

author = "JM Bland and DG Altman",
title = "Statistical methods for assessing agreement between two
methods of clinical measurement",
journal = "Lancet",
year = "1986",
volume = "i",
pages = "307--310"

or

author = {JM Bland and DG Altman},
title = {Measuring agreement in method comparison studies.},
journal = {Statistical Methods in Medical Research},
year = {1999},
volume = {8},
pages = {135--160}

With 5 observations there is of course so little information that a test of the difference = 0 is non-significant, but this hypothesis is beside the point. You are interested in whether the two observers are sufficiently close, and that is what you use the prediction interval for the differences to assess. But with 5 obs. this is very poorly determined.
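
As a rough illustration of the limits of agreement described above, here is a sketch using the five mould means from the original post. It uses the mean +/- 2 SD rule of thumb as stated; note that these inputs are means of 10 replicates, so the interval refers to differences between such means rather than between single readings, and with n = 5 the SD is very poorly estimated.

```python
import statistics

# Mould means as quoted in the original post.
obs1 = [44.49, 9.29, 3.65, 1.00, 0.02]
obs2 = [44.96, 9.03, 3.51, 1.02, 0.02]

# Paired differences, Observer 2 minus Observer 1.
diffs = [b - a for a, b in zip(obs1, obs2)]

mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)  # sample SD, n - 1 in the denominator

# Limits of agreement: approximate 95% prediction interval for a difference.
loa_lower = mean_d - 2 * sd_d
loa_upper = mean_d + 2 * sd_d

print(f"mean difference = {mean_d:.3f}, SD = {sd_d:.3f}")
print(f"limits of agreement: {loa_lower:.3f} to {loa_upper:.3f}")
```

On the raw scale the largest mould dominates the SD, which is one reason to consider the log scale.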

You may of course also consider whether you should take the logarithm before you take the differences, in which case you will get the differences as the log of the ratio of the measurements, and you can back-transform to a prediction interval for the ratio of the measurements by the two observers.
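
A sketch of the log-scale version, again on the five mould means from the original post: take differences of natural logs, form mean +/- 2 SD, and exponentiate to get limits for the ratio Observer 2 / Observer 1. The same caveat about n = 5 applies.

```python
import math
import statistics

# Mould means as quoted in the original post.
obs1 = [44.49, 9.29, 3.65, 1.00, 0.02]
obs2 = [44.96, 9.03, 3.51, 1.02, 0.02]

# Difference of logs equals the log of the ratio Observer 2 / Observer 1.
log_ratios = [math.log(b) - math.log(a) for a, b in zip(obs1, obs2)]

mean_lr = statistics.mean(log_ratios)
sd_lr = statistics.stdev(log_ratios)  # sample SD, n - 1 in the denominator

# Back-transform the limits to the ratio scale.
ratio_lower = math.exp(mean_lr - 2 * sd_lr)
ratio_upper = math.exp(mean_lr + 2 * sd_lr)

print(f"mean ratio = {math.exp(mean_lr):.3f}")
print(f"limits for the ratio: {ratio_lower:.3f} to {ratio_upper:.3f}")
```

Limits expressed as a ratio apply equally to small and large moulds, which is what the % difference was trying to capture.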

The ICC is meaningless in the context of comparing two observers, as is any other correlation measure, see e.g.:

author = {G Atkinson and A Nevill},
title = {Comment on the use of concordance correlation to
assess the agreement between two variables.},
journal = {Biometrics},
year = {1997},
volume = {53},
pages = {775--777}

Your main problem is the small number of moulds measured. You can get a bit better handle on the individual observers' precision by using the original replicate data;
the problem can be solved using a fairly simple variance components model that can be stuffed into the usual statistical packages, see:

author = {B Carstensen and J Simpson and LC Gurrin},
title = {Statistical models for assessing agreement in method
comparison studies with replicate measurements},
journal = {International Journal of Biostatistics},
year = {2008},
volume = {4},
number = {1},
pages = {Article 16}

But even with 10 replicates by each observer, you will not get a reliable prediction interval with only 5 moulds measured, regardless of the scaling.

If you make the log-transform you should do it on the original replicate measurements, and the SDs you get from the variance components model can then be interpreted as coefficients of variation (provided you use the natural log).
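
A small numerical check of why an SD on the natural-log scale reads as a coefficient of variation. The sketch assumes the measurements are log-normally distributed, in which case the exact CV is sqrt(exp(sigma^2) - 1); the sigma values below are illustrative only and are not estimates from the mould data.

```python
import math


def lognormal_cv(sigma):
    """Exact CV of a log-normal variable whose natural log has SD sigma."""
    return math.sqrt(math.exp(sigma ** 2) - 1)


# Illustrative log-scale SDs (hypothetical values, not from the data).
for sigma in (0.02, 0.05, 0.10, 0.30, 0.50):
    cv = lognormal_cv(sigma)
    print(f"log-scale SD = {sigma:.2f}  ->  CV = {cv:.4f}")
```

For small values the two agree closely; the approximation loosens as the log-scale SD grows.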

Best regards,
Bendix Carstensen
_______________________________________________

Bendix Carstensen
Senior Statistician
Steno Diabetes Center
Niels Steensens Vej 2-4
DK-2820 Gentofte
Denmark
+45 44 43 87 38 (direct)
+45 30 75 87 38 (mobile)
b...@steno.dk http://www.biostat.ku.dk/~bxc
www.steno.dk
