
SPSS doesn't calculate Kappa when one variable is constant


Kurt

May 16, 2007, 12:21:48 PM
I am trying to assess the level of agreement between two raters who
rated items as either Yes or No. This calls for Kappa. But if one
rater rated all items the same, SPSS sees this as a constant and
doesn't calculate Kappa.

For example, SPSS will not calculate Kappa for the following data,
because Rater 2 rated everything a Yes.

Rater1 Rater2
Item1 Y Y
Item2 N Y
Item3 Y Y
Item4 Y Y
Item5 N Y

SPSS completes the crosstab (which shows that the raters agreed 60% of
the time), but as for Kappa, it returns this note:

"No measures of association are computed for the crosstabulation of
VARIABLE1 and VARIABLE2. At least one variable in each 2-way table
upon which measures of association are computed is a constant."

Is there any way to get around this? I can calculate Kappa by hand
with the above data; why doesn't SPSS?

Thanks.

Kurt

Bruce Weaver

May 16, 2007, 3:30:24 PM

Does this solve your problem?

* ---------------------------------- .
data list list / r1 r2 count (3f2.0) .
begin data.
1 1 3
1 2 0
2 1 2
2 2 0
end data.

var lab
r1 'Rater 1'
r2 'Rater 2'
.
val lab r1 r2 1 'Yes' 2 'No'
.
weight by count.
crosstabs r1 by r2 /stat = kappa .

* Kappa is not computed because r2 is constant .
* Repeat, but with a very small number in place of 0 .

recode count (0 = .0001) (else=copy).
crosstabs r1 by r2 /stat = kappa .

* Now kappa is computed .

* ---------------------------------- .

--
Bruce Weaver
bwe...@lakeheadu.ca
www.angelfire.com/wv/bwhomedir

klange

May 16, 2007, 7:58:06 PM


Hi Kurt,

Add one extra case to your file with the value of 'N' for Rater 2 (and
any value for Rater 1). Add a weighting variable that has a value of 1
for your real cases, and a very small value for this new dummy case
(e.g., 0.00000001). Weight the file by the weighting variable (Data >
Weight cases), and then run the Crosstabs/Kappa.

The new case is enough for the Kappa to be calculated, but the
weighting means that it won't impact your results.

Cheers,
Kylie.
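
A minimal syntax sketch of Kylie's suggestion, using Kurt's five items.
The variable names, the 1 = Yes / 2 = No coding, and the size of the
tiny weight are placeholders, not anything SPSS requires; as the
follow-up posts show, the kappa SPSS then reports for this table is 0.

* ---------------------------------- .
* Dummy-case approach: add one extra No/No case with a near-zero weight .
* so that neither rater is constant .
data list list / rater1 rater2 wt .
begin data.
1 1 1
2 1 1
1 1 1
1 1 1
2 1 1
2 2 0.000000001
end data.
val lab rater1 rater2 1 'Yes' 2 'No' .
weight by wt.
crosstabs rater1 by rater2 /stat = kappa .
* ---------------------------------- .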

Richard Ulrich

May 16, 2007, 8:31:56 PM
On 16 May 2007 09:21:48 -0700, Kurt <khei...@cox.net> wrote:

[snip. Problem with one rater constant.]


>
> SPSS completes the crosstab (which shows that the raters agreed 60% of
> the time), but as for Kappa, it returns this note:
>
> "No measures of association are computed for the crosstabulation of
> VARIABLE1 and VARIABLE2. At least one variable in each 2-way table
> upon which measures of association are computed is a constant."
>
> Is there anywhere to get around this? I can calculate Kappa by hand
> with the above data; why doesn't SPSS?

I'm a little curious about your formula, but I'm not
really worried about that.

When I wrote a program to do kappa, I had it trap
a zero row or column as an error. I still think that's
the proper treatment. Note that you don't get a *test* on
the result -- which makes the kappa rather dubious,
doesn't it? -- since the variance involves a division
by zero.

I think SPSS treats this right. If you want a value, you
should have to intervene, say, in the way that Bruce shows.

--
Rich Ulrich, wpi...@pitt.edu
http://www.pitt.edu/~wpilib/index.html

Kurt

May 18, 2007, 11:50:01 AM
Kylie:

I tried your method and SPSS correctly weighted out the dummy case.
The crosstab table showed 60% agreement (the raters agreed on 3 out of
5 valid ratings), which is correct. But it calculated Kappa as .000,
which is definitely not correct.

My test data was set up as follows:

###

Rater1 Rater2 Weight
Item1 Y Y 1
Item2 N Y 1
Item3 Y Y 1
Item4 Y Y 1
Item5 N Y 1
Dummy N N .000000001

###

Any ideas?

Kurt



Bruce Weaver

May 18, 2007, 11:58:06 AM
Kurt wrote:
> Kylie:
>
> I tried your method and SPSS correctly weighted out the dummy case.
> The crosstab table showed 60% agreement (the raters agreed on 3 out of
> 5 valid ratings) which is correct. But it calculated Kappa as .000,
> which is definitely not correct.
>
> My test data was set up as follows:
>
> ###
>
> Rater1 Rater2 Weight
> Item1 Y Y 1
> Item2 N Y 1
> Item3 Y Y 1
> Item4 Y Y 1
> Item5 N Y 1
> Dummy N N .000000001
>
> ###
>
> Any ideas?
>
> Kurt

I have a standalone Kappa program that gives these results for your table.

MEASUREMENT OF CLINICAL AGREEMENT FOR CATEGORICAL DATA:
THE KAPPA COEFFICIENTS

by

Louis Cyr and Kennon Francis

1992

COHEN'S (UNWEIGHTED) KAPPA
--------------------------

ESTIMATE STANDARD ERROR Z-STATISTIC
-------- -------------- -----------
KAPPA: 0.0000 0.0000 0.0000

STANDARD ERROR FOR CONSTRUCTING CONFIDENCE INTERVALS: 0.0000


JACKKNIFE ESTIMATE OF KAPPA
---------------------------

STANDARD ERROR
FOR
ESTIMATE CONFIDENCE INTERVALS
-------- --------------------
KAPPA: 0.0000 0.0000

Kurt

May 18, 2007, 1:39:32 PM
I calculated Kappa by hand and the Kappa is indeed .000. After looking
at the formula, I understand why this occurs mathematically, but
conceptually it doesn't make sense. If the percent observed agreement
is 60%, then shouldn't Kappa be somewhere above 0 (which suggests no
agreement at all)?


Brendan Halpin

May 18, 2007, 1:50:41 PM
Kurt <khei...@cox.net> writes:

> I calculated Kappa by hand and the Kappa is indeed .000. After looking
> at the formula, I understand why this occurs mathematically, but
> conceptually it doesn't make sense. If the percent observed agreement
> is 60%, then shouldn't Kappa be somewhere above 0 (which suggests no
> agreement at all)?

It doesn't measure raw agreement, but rather agreement above that
expected under independence. If rater 2 says yes all the time, then
the expected agreement under independence is zero for the No
category and, for the Yes category, simply however many items
rater 1 called Yes.
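
In symbols (this is just the standard Cohen definition, with p_o the
observed proportion of agreement and p_e the proportion expected
under independence, computed from the marginals):

\[
\kappa = \frac{p_o - p_e}{1 - p_e} .
\]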

Another way of thinking about it is that if rater 2 says yes all the
time, s/he's not a rater (his/her opinion is not a variable but a
constant).

Brendan
--
Brendan Halpin, Department of Sociology, University of Limerick, Ireland
Tel: w +353-61-213147 f +353-61-202569 h +353-61-338562; Room F2-025 x 3147
mailto:brendan...@ul.ie http://www.ul.ie/sociology/brendan.halpin.html

Bruce Weaver

May 18, 2007, 1:57:49 PM
Kurt wrote:
> I calculated Kappa by hand and the Kappa is indeed .000. After looking
> at the formula, I understand why this occurs mathematically, but
> conceptually it doesn't make sense. If the percent observed agreement
> is 60%, then shouldn't Kappa be somewhere above 0 (which suggests no
> agreement at all)?

Kappa = 0 suggests no agreement beyond what is expected by chance. Your
2x2 table looks like this:

            R2
           Y   N
   R1  Y   3   0
       N   2   0

The expected cell counts (which are computed as row total * column total
divided by grand total) are also 3, 0, 2, and 0. So there are no
agreements beyond the 3 that are expected by chance.
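
Working that through the usual formula, as a quick check with the
observed and expected counts above:

\[
p_o = \frac{3 + 0}{5} = 0.6, \qquad
p_e = \frac{3 \cdot 5 + 2 \cdot 0}{5^2} = 0.6, \qquad
\kappa = \frac{0.6 - 0.6}{1 - 0.6} = 0 ,
\]

so the observed 60% agreement is exactly what independence predicts,
and kappa is 0.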

Richard Ulrich

May 18, 2007, 10:45:03 PM
On 18 May 2007 10:39:32 -0700, Kurt <khei...@cox.net> wrote:

> I calculated Kappa by hand and the Kappa is indeed .000. After looking
> at the formula, I understand why this occurs mathematically, but
> conceptually it doesn't make sense. If the percent observed agreement
> is 60%, then shouldn't Kappa be somewhere above 0 (which suggests no
> agreement at all)?

[snip, rest]


I've seen a couple of other posts....

Here is another example of 'raw agreement'
being different from kappa.
Y N
Y 90 5
N 5 0
kappa= -.05 - NEGATIVE -
with 90% agreement.
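
A quick check of that table (N = 100, both marginals 95/5):

\[
p_o = \frac{90 + 0}{100} = 0.90, \qquad
p_e = \frac{95 \cdot 95 + 5 \cdot 5}{100^2} = 0.905, \qquad
\kappa = \frac{0.90 - 0.905}{1 - 0.905} \approx -0.05 .
\]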


Kurt

May 21, 2007, 10:50:21 AM
And this example baffles me.

On May 18, 10:45 pm, Richard Ulrich <Rich.Ulr...@comcast.net> wrote:
> Here is another example of 'raw agreement'
> being different from kappa.
> Y N
> Y 90 5
> N 5 0
> kappa= -.05 - NEGATIVE -
> with 90% agreement.

I guess interpreting Kappa is not as straightforward as I had thought.
Or, I'm more dense than I imagined.

If there is 90% raw agreement, and Kappa is negative, this essentially
tells us that there is no agreement beyond what is expected by chance.
In other words, the observed 90% agreement - those 90 out of 100
ratings which were rated Yes by both raters - is within the realm of
chance. Isn't that a little counterintuitive? If I developed an
instrument and my raters consistently rated 90% of the items
similarly, I would see that as some evidence of good reliability. But
Kappa would tell me otherwise?

Brian

May 21, 2007, 5:58:39 PM

I responded to Kurt offline with some syntax. Actually, what you all
are observing is called the Cicchetti paradox. In essence, as the
marginal heterogeneity increases, Pe becomes higher, which
diminishes the kappa. Try calculating 90% agreement with marginal
homogeneity, e.g.,

Y N
Y 45 5
N 5 45

Kappa will be much larger. The paradox to which Cicchetti refers is
that under ordinary rules of joint probability, the likelihood of
greater skewness (greater marginal heterogeneity) is smaller and
should result in a lower Pe. However, the kappa family of
chance-corrected statistics acts in opposition to that logic. Gwet's
AC1 statistic actually responds mildly in the reverse direction: as
the marginal heterogeneity increases, there are concomitant increases
in AC1. Also, the phenomenon you are observing diminishes as the
number of raters and categories increases.

Brian
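
A quick check of Brian's balanced table (N = 100, both marginals
50/50) with the same formula:

\[
p_o = \frac{45 + 45}{100} = 0.90, \qquad
p_e = \frac{50 \cdot 50 + 50 \cdot 50}{100^2} = 0.50, \qquad
\kappa = \frac{0.90 - 0.50}{1 - 0.50} = 0.80 ,
\]

so the same 90% raw agreement now yields kappa = 0.80.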

Ray Koopman

May 21, 2007, 7:24:04 PM

This is exactly the situation for which Cohen invented kappa. When the
marginals are homogeneous (i.e., the row marginals are the same as the
column marginals), the expected rate of agreement given independence
(a.k.a. the chance agreement rate) grows as the marginals become more
skewed. In this particular example, 90% is actually less than the
chance agreement rate for two raters who have the same 95% bias toward
saying Yes but whose ratings are unrelated. The intent is to disabuse
people of the notion that a high agreement rate is ipso facto a
demonstration of reliability.
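
To make that concrete: two unrelated raters who each say Yes 95% of
the time would agree, by chance alone, on a proportion

\[
p_e = 0.95 \cdot 0.95 + 0.05 \cdot 0.05 = 0.905
\]

of their ratings, which is already more than the observed 90%.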

Richard Ulrich

May 21, 2007, 10:45:36 PM


Thanks. Very good.

Another way to expand on the idea is to note that kappa
is thoroughly symmetrical. It does not matter which category
is YES and which is NO -- the kappa stays the same.

In my "90% agreement" example, look at the effect of switching
Yes and No. Each rater says that 5% have some trait. There
is 90% in the category where they agree for no-trait, but (now)
there is 0% in the category of Yes.

click...@gmail.com

Jul 2, 2020, 11:06:54 AM
If there are two raters, R1 and R2, can you tell me how to add a third
column for the weight in SPSS as you did? Could you share an SPSS
screenshot? It would be a big help for me. My email id: click...@gmail.com

Rich Ulrich

Jul 2, 2020, 12:11:22 PM
On Thu, 2 Jul 2020 08:06:50 -0700 (PDT), click...@gmail.com wrote:

>On Friday, May 18, 2007 at 9:20:01 PM UTC+5:30, Kurt wrote:
>> Kylie:
>>
>> I tried your method and SPSS correctly weighted out the dummy case.
>> The crosstab table showed 60% agreement (the raters agreed on 3 out of
>> 5 valid ratings) which is correct. But it calculated Kappa as .000,
>> which is definitely not correct.
>>
>> My test data was set up as follows:


< snip, details >
>
>If there are two raters, R1 and R2, can you tell me how to add a third column for the weight in SPSS as you did? Could you share an SPSS screenshot?
>It would be a big help for me. My email id: click...@gmail.com

The original thread from 2007 is available from Google,
https://groups.google.com/forum/#!topic/comp.soft-sys.stat.spss/ChdrpJTsvTk

and it gives plenty of reasons why you don't really want to
have a kappa reported when there is no variation.

In particular, study my posts and the one from Ray Koopman.

--
Rich Ulrich

Rich Ulrich

Jul 2, 2020, 12:39:18 PM
I will add to a point that I made in the original discussion.

The reader's problem arises because "agreement" is intuitively
sensible in usual circumstances, but it is nonsensical under
close examination when the marginal frequencies are extreme.

"Reliability" for a ratings of diagnosis logicallyy decomposes
into Sensitivity and Specificity -- picking out the Cases, and
picking out the Non-cases. Kappa is intended to combine
those measures, essentially. It looks at "performance above
chance." (For a 2x2 table, it is closely approximated by the
Pearson correlation.)

I gave a hypothetical table {90, 5; 5, 0} with a negative kappa.
90 5
5 0

One can arbitrarily label the rows and columns as starting with
Yes or with No. In one labeling there is 90% "agreement" as to
who is a case (cell A); in the other, there is 0% "agreement"
(cell D).

When each of two raters sees 95% as Cases, chance alone would
have them agree SOME of the time; so the "agreement" of 0 is below
chance, and the kappa is negative.

--
Rich Ulrich
