Correlating ordinal and interval data


Liz

Aug 3, 2006, 9:43:46 AM
to MedStats
Dear all,
I have been asked to advise a student in the department who wishes to
examine the correlation between an ordinal scale (0-3) and an interval
level measurement, the values of which can range from 0 to 10000. I was
wondering what the best course of action might be. We could simply use
Spearman's rho, with or without collapsing the interval scale to, say,
10 categories, or we could perhaps try ordinal regression. Or we could
treat the ordinal scale as a factor and perform a Kruskal-Wallis test.
I would appreciate your comments.
Thanks,
Liz Hensor

Bland, M.

Aug 3, 2006, 10:10:30 AM
to MedS...@googlegroups.com
I would suggest none of the above. I suggest Kendall's tau b
coefficient of rank correlation. This is why:

Spearman's rank correlation is not so good for ties as Kendall's (so
Kendall said and he did study this subject).

Collapsing the interval scale is a waste of time. Never throw away
information.

Ordinal regression is very complicated for such a simple question.

Kruskal Wallis ignores the ordering of the categories and so throws away
information.
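
A minimal sketch of this suggestion in Python, assuming SciPy is available. The data are simulated purely for illustration (they are not the student's data), and the variable names are hypothetical.

```python
import numpy as np
from scipy.stats import kendalltau

# Simulated stand-in data (an assumption, not real measurements):
# x is the 0-3 ordinal scale, y an interval measurement in 0-10000.
rng = np.random.default_rng(0)
x = rng.integers(0, 4, size=60)
y = 2000 * x + rng.uniform(0, 4000, size=60)

# kendalltau computes the tau-b variant by default, which adjusts
# for the many ties that a four-level scale necessarily produces.
tau, p_value = kendalltau(x, y)
print(f"Kendall's tau-b = {tau:.3f}, p = {p_value:.4g}")
```

No collapsing of y is needed: the coefficient uses only the ordering of the y values, so the full 0-10000 range can be passed in as it is.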

Martin


--
***************************************************
J. Martin Bland
Prof. of Health Statistics
Dept. of Health Sciences
Seebohm Rowntree Building Area 2
University of York
Heslington
York YO10 5DD

Email: mb...@york.ac.uk
Phone: 01904 321334
Fax: 01904 321382
Web site: http://www-users.york.ac.uk/~mb55/
***************************************************

Liz

Aug 3, 2006, 10:50:28 AM
to MedStats
Thanks Martin, that's just the ticket! I too shied away from the idea
of collapsing the scale to categories, and thought ordinal regression
was a bit too highfalutin' for these purposes, but this student had
received some advice along these lines from other people he had
consulted so I thought it best to find out whether or not I was
dismissing those ideas unnecessarily...
Thanks again,
Liz

John Uebersax

Aug 4, 2006, 3:12:23 AM
to MedStats
I agree with Martin's suggestions.

I'd just offer one more alternative--the polyserial correlation
coefficient.

Advantage

Adjusts for censoring/discontinuity/ties in the ordered-category
variable

Potential Limitation

Assumes that the latent (unobserved), prediscretized version of the
ordered-category variable has a normal or near-normal distribution.
(Actually the assumption is that this latent variable, and your second,
interval-level measure are jointly distributed as bivariate normal).

The function, polyserial, available for R, will calculate the
polyserial correlation coefficient.
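
For readers without R, the idea can be sketched in Python. This is only the quick two-step estimate (thresholds from the cumulative proportions, then a rescaled Pearson correlation), not a maximum-likelihood fit and not a substitute for the R function. The function name and the simulated data are assumptions for illustration, and the sketch relies on the bivariate normality assumption described above.

```python
import numpy as np
from scipy.stats import norm

def polyserial_two_step(x, y_ord):
    """Two-step polyserial estimate between continuous x and ordinal y_ord."""
    x = np.asarray(x, dtype=float)
    # Recode the observed categories as consecutive integers 0..k-1.
    _, codes = np.unique(np.asarray(y_ord), return_inverse=True)
    n = len(x)
    # Estimate latent thresholds from the cumulative category proportions.
    cum_props = np.cumsum(np.bincount(codes))[:-1] / n
    thresholds = norm.ppf(cum_props)
    # Pearson correlation with the integer codes, rescaled by the
    # population SD of the codes over the sum of normal densities.
    r = np.corrcoef(x, codes)[0, 1]
    return r * codes.std() / norm.pdf(thresholds).sum()

# Simulated check: latent z and x are bivariate normal with rho = 0.6,
# and z is cut into four ordered categories (cut points are arbitrary).
rng = np.random.default_rng(1)
n, rho_true = 20000, 0.6
z = rng.standard_normal(n)
x = rho_true * z + np.sqrt(1 - rho_true**2) * rng.standard_normal(n)
y_ord = np.digitize(z, [-0.5, 0.3, 1.0])
estimate = polyserial_two_step(x, y_ord)
print(f"two-step polyserial estimate = {estimate:.3f}")
```

With a large simulated sample the estimate should land close to the latent correlation, whereas the plain Pearson correlation with the 0-3 codes is attenuated by the discretization.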

HTH

--
John Uebersax, PhD

Sanjoy Paul

Aug 4, 2006, 3:55:12 AM
to MedS...@googlegroups.com
Dear Prof. Bland,

I have a similar problem.

I am dealing with polytomous response data from a treatment adherence study
in a cluster-randomized clinical trial.

Some responses are coded as 1, 2, 3, 4, and 5.

For other responses, the patients score their level of agreement with a
question on a scale like

0 - 10 - 20 - 30 - 40 ..... up to 100

The values are spaced 10 units apart, and the patients are asked to circle
one of these 11 values on the scale.

The clinicians are interested in the correlation coefficients between these
two types of response.

What would be the best approach - please suggest.

Regards.

Sanjoy

Dr. Sanjoy K. Paul
Senior Medical Statistician
University of Oxford
Diabetes Trial Unit
OCDEM, Churchill Hospital
Old Road
Headington
Oxford, OX3 7LJ
Tel: +44 (0)1865 857283
Fax: +44 (0)1865 857260
Email: sanjo...@dtu.ox.ac.uk
sambh...@hotmail.com

Ted Harding

Aug 4, 2006, 8:38:50 AM
to MedS...@googlegroups.com

I'm going back to the original posting, since I think that subsequent
replies may have not quite addressed the real question.

The ordinal scale 0-3 (call it X) is clearly ranked in 0,1,2,3 order.

The very fact of seeking a "correlation" between this and the
"interval level measurement" (call it Y) -- i.e. ordinary numbers,
as I understand it -- implies that Liz is contemplating a monotonic
relationship (increasing or decreasing) between Y and X -- as X
moves upwards through 0,1,2,3 so the level of Y should move
upwards (or downwards), to within random scatter.

I say this straight out, since if the relationship is not
monotonic, then the notion of correlation makes less sense,
or even none.

So we should be looking for a measure of monotonic relationship
between the values of Y and the ordered categories of X. (I can
see no indication of, and will make no assumption about, whether
the values 0,1,2,3 of the ordinal scale have a quantitative or
"interval" meaning, or are merely ordered labels).

Now recall an earlier discussion about the "Williams test",
namely the ANOVA test devised by D.A. Williams to test a
null hypothesis of no difference between ordered groups
(labelled by X here), versus an alternative hypothesis that
the means differ, such that the power of the test is focussed
on differences which are monotonic with respect to X.

This thread on MedStats began with Jim Groenveld's posting
on 2 March 2006, and ended on 6 March. I quote in particular
from my reply on 2 March 19:15 --

"... the Williams test I'm aware of is designed to be applied
to the case where treatment groups receiving different dosages
are compared with a control or placebo group, AND it is expected
that the response will be monotonic in the dose (i.e. the higher
the dose, the greater the expected response).

"The basis of the test is to construct a series of contrasts
combining the group means in a specific order, and to apply
the test to the largest of these. The purpose of the test is
to target the power on alternatives which are of the 'monotonic'
type with respect to the group identities (identified according
to increasing dose).

"In contrast, the "usual" AOV methods treat all groups on the
same footing, without regard to order. Hence the Williams test
would be superior for monotonic alternatives, but likely to
be less powerful when the groups are "permutable" (unless by
chance the order of increasing effect in the groups was the
same as the order of the groups in the Williams test).

"The original Williams papers are in

Williams DA (1971). A test for differences between treatment
means when several dose levels are compared with a zero
dose control. Biometrics 27, 103-117.
Williams DA (1972). The comparison of several dose levels
with a zero dose control. Biometrics 28, 519-531.

"An outline of what goes on can be found at

http://www.stat.fi/isi99/proceedings/arkisto/varasto/brow0281.pdf

So I suggest that a similar approach would be suitable for
Liz's query, using the magnitude of the test statistic as
a measure of correlation. NOTE that the test depends essentially
on a regression of Y against *categorical* X, using contrasts
which reflect the targeted monotonicity.

If it would be of interest, I can try to work out details of
this and post them to the list.

As a final comment: Martin Bland suggested using a rank correlation
approach. This -- contrary to his stern (and commendable) injunction
"Never throw away information" -- throws away the information in the
values of Y (over and above the information in their ranks). It may
be that the distribution of the Y-values -- overall or within the
categories of X -- is so intangible that one is driven to resort
to a distribution-free method such as a rank-correlation, but that
is a matter of fact which depends on the context of the query and
can only be addressed in terms of the context and of the data on Y.

Best wishes to all,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.H...@nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 04-Aug-06 Time: 13:38:46
------------------------------ XFMail ------------------------------

greybeard

Aug 4, 2006, 10:05:20 AM
to MedStats
(Ted Harding) wrote:
> On 03-Aug-06 Liz wrote:
> >
> > Dear all,
> > I have been asked to advise a student in the department who wishes to
> > examine the correlation between an ordinal scale (0-3) and an interval
> > level measurement, the values of which can range from 0 to 10000.
snip

>
> I'm going back to the original posting, since I think that subsequent
> replies may have not quite addressed the real question.
>
> The ordinal scale 0-3 (call it X) is clearly ranked in 0,1,2,3 order.
>
snip

>
> I say this straight out, since if the relationship is not
> monotonic, then the notion of correlation makes less sense,
> or even none.
>
> So we should be looking for a measure of monotonic relationship
> between the values of Y and the ordered categories of X. (I can
> see no indication of, and will make no assumption about, whether
> the values 0,1,2,3 of the ordinal scale have a quantitative or
> "interval" meaning, or are merely ordered labels).
>
> Now recall an earlier discussion about the "Williams test",
> namely the ANOVA test devised by D.A. Williams to test a
> null hypothesis of no difference between ordered groups
> (labelled by X here), versus an alternative hypothesis that
> the means differ, such that the power of the test is focussed
> on differences which are monotonic with respect to X.
>
> This thread on MedStats began with Jim Groenveld's posting
> on 2 March 2006, and ended on 6 March. I quote in particular
> from my reply on 2 March 19:15 --
>
> "... the Williams test I'm aware of is designed to be applied
> to the case where treatment groups receiving different dosages
> are compared with a control or placebo group, AND it is expected
> that the response will be monotonic in the dose (i.e. the higher
> the dose, the greater the expected response).
>
snipped further information on Williams test

Does the Williams test have advantages over using an ordinary
regression approach, keeping the ordinal levels as a 0-3 score?
Polynomial alternatives to the linear score could also be constructed
and compared to a factored (3 indicator variables) treatment which
would be the order "thrown away" comparison model.

An additional reminder that the Jonckheere-Terpstra test would be a
rank (permutation) alternative if the assumption of equal variances of
residuals across levels of the score variable were violated.
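
A rough sketch of the Jonckheere-Terpstra idea in Python with NumPy, using a permutation p-value. The function names, the two-sided convention and the simulated data are assumptions for illustration, not a reference implementation.

```python
import numpy as np

def jt_statistic(y, x):
    """Sum, over ordered pairs of groups, of pairs where the higher group wins."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x)
    levels = np.unique(x)  # sorted, so this respects the ordering of X
    stat = 0.0
    for i, low in enumerate(levels[:-1]):
        y_low = y[x == low]
        for high in levels[i + 1:]:
            diff = y[x == high][:, None] - y_low[None, :]
            # Count pairs in the expected direction; ties count one half.
            stat += np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return stat

def jt_permutation_test(y, x, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the JT statistic."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    _, sizes = np.unique(x, return_counts=True)
    # Null mean: half of all between-group pairs.
    null_mean = (len(x) ** 2 - np.sum(sizes ** 2)) / 4
    observed = jt_statistic(y, x)
    perms = np.array([jt_statistic(rng.permutation(y), x) for _ in range(n_perm)])
    extreme = np.abs(perms - null_mean) >= abs(observed - null_mean)
    return observed, null_mean, (1 + extreme.sum()) / (n_perm + 1)

# Simulated data (an assumption): four ordered groups with a rising trend.
rng = np.random.default_rng(2)
x = np.repeat([0, 1, 2, 3], 10)
y = 1000 * x + rng.normal(0, 1000, size=40)
jt, jt_null_mean, jt_p = jt_permutation_test(y, x)
print(f"JT = {jt:.1f} (null mean {jt_null_mean:.1f}), p = {jt_p:.4f}")
```

Because only the ranks of Y within pairs of groups are used, the sketch does not rely on equal residual variances across the levels of the score.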

--
David Winsemius

Ted Harding

Aug 4, 2006, 10:28:38 AM
to MedS...@googlegroups.com
On 04-Aug-06 greybeard wrote:
> [...]

> snipped further information on Williams test
>
> Does the Williams test have advantages over using an ordinary
> regression approach, keeping the ordinal levels as a 0-3 score?
> Polynomial alternatives to the linear score could also be constructed
> and compared to a factored (3 indicator variables) treatment which
> would be the order "thrown away" comparison model.

It has the property that it need only depend on the ordering of the
categories (despite the fact that it was originally applied in
a context where the levels were ordered by a numerical variable,
namely the dose level).

> An additional reminder that the Jonckheere-Terpstra test would be a
> rank (permutation) alternative if the assumption of equal variances
> of residuals across levels of the score variable were violated.

Yes, that's also a possibility! (With its own properties, of course).

Best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.H...@nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861

Date: 04-Aug-06 Time: 15:28:33
------------------------------ XFMail ------------------------------

valter...@geriatrik.gu.se

Aug 6, 2006, 3:18:41 PM
to MedStats
I didn't know about the Williams test, but after a quick glance at the
document at Ted's link, the Williams test seems to be complicated to
perform, and it doesn't give effect measures, and as far as I know,
none of the standard statistics programs like SAS and SPSS report the
result of this test.
An important point about rank tests: they throw away information when
used on numerical data, as Ted Harding says, but their really important
disadvantage is that on categorical ordinal data they add information
- information that is irrelevant and leads to a random weighting of
the data. This is easy to understand for the Spearman test, which
inserts mid-rank values into the formula for the product moment
correlation, but it is also true for tests based on the number of
concordant-discordant pairs of observations, like Kendall's tau b.
So I would advise that one or more of the following methods should be
used:
(Y is the linear variable and X the ordinal variable)
1. A non-parametric test of the Pearson correlation coefficient.
2. A simple regression model Y = a + bX.
3. A polynomial model Y = a + b1X + b2X**2.
4. A categorical model Y = a + contrasts of X.
with explicitly motivated scores for the ordinal variable in the first
three cases.
The advantage of the polynomial model is that it does not assume that
the effect on Y of going one level up on X is the same for all levels
of X, which also means that it is less sensitive to the choice of
scores.
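
A rough sketch of how models (2) and (3) might be compared, in Python with NumPy and SciPy. The scores 0-3, the simulated data and the helper names are assumptions for illustration, and the partial F test assumes roughly normal residuals with equal variance.

```python
import numpy as np
from scipy.stats import f as f_dist

def rss_poly(x, y, degree):
    """Residual sum of squares of a least-squares polynomial fit of Y on X."""
    coefs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coefs, x)
    return float(resid @ resid)

def quadratic_term_f_test(x, y):
    """Partial F test of model (3) against model (2), i.e. of b2 = 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rss_linear = rss_poly(x, y, 1)     # Y = a + bX
    rss_quadratic = rss_poly(x, y, 2)  # Y = a + b1X + b2X**2
    df_resid = len(y) - 3              # three parameters in model (3)
    f_stat = (rss_linear - rss_quadratic) / (rss_quadratic / df_resid)
    return rss_linear, rss_quadratic, f_stat, f_dist.sf(f_stat, 1, df_resid)

# Simulated data (an assumption): the step in Y grows with the level of X.
rng = np.random.default_rng(3)
x = np.repeat(np.arange(4.0), 10)
y = 500 * x**2 + rng.normal(0, 300, size=40)
rss_linear, rss_quadratic, f_stat, f_p = quadratic_term_f_test(x, y)
print(f"RSS linear = {rss_linear:.0f}, RSS quadratic = {rss_quadratic:.0f}")
print(f"F = {f_stat:.1f}, p = {f_p:.3g}")
```

A small p-value here suggests that the effect of moving up one level is not constant across the scale, so the simple model (2) would be inadequate with these scores.
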
If you want to describe the association in greatest detail, use a
linear regression model with contrast coding of X (method 4).
If you define three contrast variables from X the following way:
IF X=0 THEN C1=0.75, C2=0.5, C3=0.25;
IF X=1 THEN C1=-0.25, C2=0.5, C3=0.25;
IF X=2 THEN C1=-0.25, C2=-0.5, C3=0.25;
IF X=3 THEN C1=-0.25, C2=-0.5, C3=-0.75;
and then fit the linear model
Y = a + b1C1 + b2C2 + b3C3,
you will get separate estimates of the effect on Y of the three
differences in the four-level scale 0-1, 1-2 and 2-3.
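
The contrast coding above can be sketched in Python with NumPy as follows. The data are simulated and the names are hypothetical. Working through the coding shows that each coefficient equals the mean of Y at the lower level minus the mean at the next level up (b1 = mean at 0 minus mean at 1, and so on), so a rising trend in Y gives negative coefficients.

```python
import numpy as np

# Contrast values copied from the IF/THEN coding for X = 0..3.
CONTRASTS = {
    0: (0.75, 0.5, 0.25),
    1: (-0.25, 0.5, 0.25),
    2: (-0.25, -0.5, 0.25),
    3: (-0.25, -0.5, -0.75),
}

def fit_contrast_model(x, y):
    """Least-squares fit of Y = a + b1*C1 + b2*C2 + b3*C3."""
    c = np.array([CONTRASTS[int(level)] for level in x])
    design = np.column_stack([np.ones(len(c)), c])
    coefs, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coefs  # a, b1, b2, b3

# Simulated data (an assumption): Y rises by about 1500 per level of X.
rng = np.random.default_rng(4)
x = np.repeat([0, 1, 2, 3], 8)
y = 1000 + 1500 * x + rng.normal(0, 500, size=32)

a, b1, b2, b3 = fit_contrast_model(x, y)
group_means = np.array([y[x == level].mean() for level in range(4)])
print("b1, b2, b3:", np.round([b1, b2, b3], 1))
print("successive differences of group means:", np.round(-np.diff(group_means), 1))
```

The intercept a is the unweighted average of the four group means, because each contrast sums to zero over the four levels.
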
If all three effects (b1, b2 and b3) are approximately the same, there
is an approximately linear effect of X on Y in the chosen scores, and the
simple correlation and regression coefficients are sufficient to
describe the association between Y and X, and the simple model (2) will
give a good fit to the data. If they are far from equal, examine the
values to decide if they change in a smooth way or not. If the effect
values change in a smooth way, the polynomial model (3) will probably
give a good fit to the data, otherwise report all three separate effects
from (4).
In conclusion: at least to a certain extent the polynomial model is
independent of the scores used, and the contrast model is, like the
Williams test, completely independent of the scoring, and when
considering how much easier they are to perform, I think they are
preferable to the Williams test in all situations I can think of.

Finally: there is nothing wrong with using an ordinal logistic
regression model in this case if you know what you are doing, and
understand the underlying assumptions behind the model and can
interpret the parameter values that are reported. In this respect the
linear models have an advantage: they are easier to understand and to
interpret.

Valter Sundh
Dept. of Community Medicine and Public Health
Sahlgrenska Academy,
Göteborg, Sweden
