Statistics From A (Agreement) to Z (z Score): A Guide to Interpreting Common Measures of Association, Agreement, Diagnostic Accuracy, Effect Size, Heterogeneity, and Reliability in Medical Research

Patrick Schober, Edward J. Mascha, Thomas R. Vetter

Research output: Contribution to journal; Article; Academic; peer-review

74 Citations (Scopus)

Abstract

Researchers reporting results of statistical analyses, as well as readers of manuscripts reporting original research, often seek guidance on how numeric results can be practically and meaningfully interpreted. With this article, we aim to provide benchmarks for cutoff or cut-point values and to suggest plain-language interpretations for a number of commonly used statistical measures of association, agreement, diagnostic accuracy, effect size, heterogeneity, and reliability in medical research. Specifically, we discuss correlation coefficients, Cronbach's alpha, I², intraclass correlation (ICC), Cohen's and Fleiss' kappa statistics, the area under the receiver operating characteristic curve (AUROC, concordance statistic), standardized mean differences (Cohen's d, Hedges' g, Glass' delta), and z scores. We base these cutoff values on what has been previously proposed by experts in the field in peer-reviewed literature and textbooks, as well as online statistical resources. We integrate, adapt, and/or expand previous suggestions in an attempt to (a) achieve a compromise between divergent recommendations, and (b) propose cutoffs that we perceive as sensible for the field of anesthesia and related specialties. While our suggestions provide guidance on how the results of statistical tests are typically interpreted, this does not mean that the results can universally be interpreted as suggested here. We discuss the well-known inherent limitations of using cutoff values to categorize continuous measures. We further emphasize that cutoff values may depend on the specific clinical or scientific context. Rule-of-thumb approaches to the interpretation of statistical measures should therefore be used judiciously.
Original language: English
Pages (from-to): 1633-1641
Number of pages: 9
Journal: Anesthesia and Analgesia
Volume: 133
Issue number: 6
Publication status: Published - 1 Dec 2021

Keywords

  • Algorithms
  • Area Under Curve
  • Benchmarking
  • Biomedical Research/statistics & numerical data
  • Correlation of Data
  • Data Interpretation, Statistical
  • Observer Variation
  • ROC Curve
  • Reference Values
  • Reproducibility of Results
