Clinicians are right not to like Cohen's kappa

H.C.W. de Vet; L.B. Mokkink; C.B. Terwee; O.S. Hoekstra; D.L. Knol

doi:10.1136/bmj.f2125

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Clinicians are interested in observer variation in terms of the probability of other raters (interobserver) or themselves (intraobserver) obtaining the same answer. Cohen's κ is commonly used in the medical literature to express such agreement in categorical outcomes. The value of Cohen's κ, however, is not sufficiently informative because it is a relative measure, while the clinician's question of observer variation calls for an absolute measure. Using an example in which the observed agreement and κ lead to different conclusions, we illustrate that percentage agreement is an absolute measure (a measure of agreement) and that κ is a relative measure (a measure of reliability). For the data to be useful for clinicians, measures of agreement should be used. The proportion of specific agreement, expressing the agreement separately for the positive and the negative ratings, is the most appropriate measure for conveying the relevant information in a 2 × 2 table and is most informative for clinicians.
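
To make the contrast concrete, the following is a minimal Python sketch, not taken from the article, that computes the measures named in the abstract from a 2 × 2 table of two raters. The cell labels a, b, c, d, the function name agreement_measures, and the example counts are all hypothetical and chosen only for illustration; the formulas are the standard definitions of observed agreement, Cohen's κ, and the positive and negative proportions of specific agreement.

```python
import math


def agreement_measures(a, b, c, d):
    """Agreement statistics for a 2 x 2 table of two raters.

    Hypothetical cell layout (an assumption of this sketch):
      a = both raters positive      b = rater 1 positive, rater 2 negative
      c = rater 1 negative, rater 2 positive      d = both raters negative
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table must contain at least one rating pair")

    # Absolute measure: proportion of cases on which the raters agree.
    observed = (a + d) / n

    # Agreement expected by chance, from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2

    # Relative measure: agreement beyond chance, scaled by its maximum.
    # Undefined when chance agreement is already 1.
    kappa = (observed - expected) / (1 - expected) if expected < 1 else math.nan

    # Proportions of specific agreement, reported separately
    # for positive and negative ratings.
    positive = 2 * a / (2 * a + b + c) if (2 * a + b + c) > 0 else math.nan
    negative = 2 * d / (2 * d + b + c) if (2 * d + b + c) > 0 else math.nan

    return {
        "observed_agreement": observed,
        "expected_agreement": expected,
        "kappa": kappa,
        "positive_agreement": positive,
        "negative_agreement": negative,
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not data from the article.
    for name, value in agreement_measures(a=5, b=5, c=5, d=85).items():
        print(f"{name}: {value:.2f}")
```

With these illustrative counts the raters agree on 90% of cases, yet κ is only about 0.44. The specific agreements show why: agreement on negative ratings is about 0.94, while agreement on positive ratings is only 0.50.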

Original language	English
Article number	f2125
Pages (from-to)	1-7
Journal	British Medical Journal
Volume	346
DOIs	https://doi.org/10.1136/bmj.f2125
Publication status	Published - 2013

Cite this

@article{246ac0d17b3f4109a8daa46c58a97572,

title = "Clinicians are right not to like Cohen's kappa",

abstract = "Clinicians are interested in observer variation in terms of the probability of other raters (interobserver) or themselves (intraobserver) obtaining the same answer. Cohen's κ is commonly used in the medical literature to express such agreement in categorical outcomes. The value of Cohen's κ, however, is not sufficiently informative because it is a relative measure, while the clinician's question of observer variation calls for an absolute measure. Using an example in which the observed agreement and κ lead to different conclusions, we illustrate that percentage agreement is an absolute measure (a measure of agreement) and that κ is a relative measure (a measure of reliability). For the data to be useful for clinicians, measures of agreement should be used. The proportion of specific agreement, expressing the agreement separately for the positive and the negative ratings, is the most appropriate measure for conveying the relevant information in a 2 × 2 table and is most informative for clinicians.",

author = "{de Vet}, H.C.W. and Mokkink, L.B. and Terwee, C.B. and Hoekstra, O.S. and Knol, D.L.",

year = "2013",

doi = "10.1136/bmj.f2125",

language = "English",

volume = "346",

pages = "1--7",

journal = "British Medical Journal",

issn = "0959-535X",

publisher = "British Medical Association",

}

TY - JOUR

T1 - Clinicians are right not to like Cohen's kappa

AU - de Vet, H.C.W.

AU - Mokkink, L.B.

AU - Terwee, C.B.

AU - Hoekstra, O.S.

AU - Knol, D.L.

PY - 2013

Y1 - 2013

AB - Clinicians are interested in observer variation in terms of the probability of other raters (interobserver) or themselves (intraobserver) obtaining the same answer. Cohen's κ is commonly used in the medical literature to express such agreement in categorical outcomes. The value of Cohen's κ, however, is not sufficiently informative because it is a relative measure, while the clinician's question of observer variation calls for an absolute measure. Using an example in which the observed agreement and κ lead to different conclusions, we illustrate that percentage agreement is an absolute measure (a measure of agreement) and that κ is a relative measure (a measure of reliability). For the data to be useful for clinicians, measures of agreement should be used. The proportion of specific agreement, expressing the agreement separately for the positive and the negative ratings, is the most appropriate measure for conveying the relevant information in a 2 × 2 table and is most informative for clinicians.

U2 - https://doi.org/10.1136/bmj.f2125

DO - 10.1136/bmj.f2125

M3 - Article

SN - 0959-535X

VL - 346

SP - 1

EP - 7

JO - British Medical Journal

JF - British Medical Journal

M1 - f2125

ER -