Re: comparing Kappa Statistics in case of dependence



On Tue, 15 Jan 2008 02:36:00 -0800 (PST), Bart Hamers
<bart.hamers@xxxxxxxxx> wrote:

Hello,

In order to compare different raters, we often use the nonparametric
Kappa statistic as a measure of agreement.
However, we would like to compare different kappa statistics. Moreover,
we need a test to compare the kappas in the case of dependence.
Is there a test to compare Kappa(X,Y) and Kappa(Y,Z) taking into
account the dependence introduced by the mutual rater Y?

First, I hope that you are using 2x2 kappa and not
anything else, since r x k kappa is a fairly lousy statistic
for the larger tables, with a bad dependence on marginal
counts. Alternatively, weighted kappa is practically equivalent
to using a correlation, so you might use a test for correlated
(dependent) correlations.

Here is the simple situation for 3 raters, 0/1 ratings.
There are 8 possible outcomes of the 2x2x2 table:

Take the 1st column as the shared rater.
YXZ
000 - all agree; most cases are either this or 111 (you hope)
001 outcome A: X agrees, Z disagrees with Y
010 outcome B: X disagrees, Z agrees with Y
011 - both disagree with Y

100 - both disagree with Y
101 outcome B
110 outcome A
111 - all agree; most cases not scored 000 land here

If raters X and Z are equivalent, the count of outcome A (Sa)
is expected to equal the count of outcome B (Sb), apart from
sampling error. This is the information that you have on
DIFFERENCES between the kappas, and on the differences
between the raters, ignoring the level-differences in how
many 0s/1s were given.
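
To make the bookkeeping concrete, here is a minimal sketch in Python of how Sa and Sb could be tallied from the raw ratings. The function name and the data are made up for illustration; each case is assumed to be a (Y, X, Z) triple of 0/1 scores with Y as the shared rater.

```python
def count_discordant(ratings):
    """Tally outcomes A and B from (y, x, z) triples of 0/1 ratings.

    Outcome A: X agrees with Y, Z disagrees with Y.
    Outcome B: X disagrees with Y, Z agrees with Y.
    Cases where both agree or both disagree with Y are ignored.
    """
    sa = sb = 0
    for y, x, z in ratings:
        x_agrees = (x == y)
        z_agrees = (z == y)
        if x_agrees and not z_agrees:
            sa += 1
        elif z_agrees and not x_agrees:
            sb += 1
    return sa, sb


# Hypothetical data, not from any real study.
ratings = (
    [(0, 0, 0)] * 40 + [(1, 1, 1)] * 30    # full agreement
    + [(0, 0, 1)] * 7 + [(1, 1, 0)] * 5    # outcome A
    + [(0, 1, 0)] * 2 + [(1, 0, 1)] * 2    # outcome B
    + [(0, 1, 1)] * 1                      # both disagree with Y
)
sa, sb = count_discordant(ratings)
print(sa, sb)  # 12 4
```

Only the 16 cases in outcomes A and B carry information about the difference; the 71 other cases drop out.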

A simple test on Sa vs Sb is the sign test, like using McNemar's
test on the off-diagonal scores for testing "change."
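
As a rough sketch of that sign test, the exact two-sided p-value can be computed from the binomial distribution with p = 1/2, using nothing beyond the Python standard library. The function name and the example counts are hypothetical.

```python
from math import comb


def sign_test_p(sa, sb):
    """Exact two-sided sign test of Sa vs Sb under H0: P(A) = P(B) = 1/2."""
    n = sa + sb
    if n == 0:
        return 1.0  # no discordant cases, no evidence either way
    k = min(sa, sb)
    # Probability of a split at least as lopsided, in one direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double for a two-sided test; cap at 1 for near-even splits.
    return min(1.0, 2 * tail)


# Hypothetical counts: 12 cases of outcome A, 4 of outcome B.
p = sign_test_p(12, 4)
print(round(p, 4))  # 0.0768
```

With these made-up counts, a 12-to-4 split among 16 discordant cases does not reach the usual 0.05 level.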

One reason for looking at this is that it may show
you that you have very little power for comparing
your ratings. A conclusion of "not different" is not very
powerful until the count of disagreements is large.
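
One way to see the power problem, sketched under the assumption of an exact two-sided sign test: even if every discordant case falls on the same side, the p-value is 2 * (1/2)^n, so a handful of disagreements can never reach significance. The function name below is made up for illustration.

```python
def min_discordant_for_significance(alpha=0.05):
    """Smallest number of discordant cases n for which a completely
    one-sided split (n vs 0) gives a two-sided sign-test p <= alpha."""
    n = 1
    while 2 * 0.5 ** n > alpha:
        n += 1
    return n


print(min_discordant_for_significance())  # 6
```

So with five or fewer discordant cases, "not different" is the only possible verdict at the 0.05 level, whatever the raters did.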

I haven't looked at the citation that Ray provided.
"Weighted least squares" is obviously more complicated.

--
Rich Ulrich
http://www.pitt.edu/~wpilib/index.html