Visual rating of age-related white matter changes on magnetic resonance imaging: Scale comparison, interrater agreement, and correlations with quantitative measurements

P. Kapeller, R. Barber, R. J. Vermeulen, H. Adèr, P. Scheltens, W. Freidl, O. Almkvist, M. Moretti, T. Del Ser, P. Vaghfeldt, C. Enzinger, F. Barkhof, D. Inzitari, T. Erkinjunti, R. Schmidt, Franz Fazekas

Research output: Contribution to journal › Article › Academic › peer-review

267 Citations (Scopus)


Background and Purpose - To provide further insight into the MRI assessment of age-related white matter changes (ARWMCs) with visual rating scales, 3 raters with different levels of experience tested the interrater agreement and comparability of 3 widely used rating scales in a cross-sectional and follow-up setting. Furthermore, the correlation between visual ratings and quantitative volumetric measurement was assessed.

Methods - Three raters from different sites using 3 established rating scales (Manolio, Fazekas and Schmidt, Scheltens) evaluated 74 baseline and follow-up scans from 5 European centers. One investigator also rated baseline scans in a set of 255 participants of the Austrian Stroke Prevention Study (ASPS) and measured the volume of ARWMCs.

Results - The interrater agreement for the baseline investigation was fair to good for all scales (κ values, 0.59 to 0.78). On the follow-up scans, all 3 raters depicted significant ARWMC progression; however, the direct interrater agreement for this task was poor (κ, 0.19 to 0.39). Comparison of the interrater reliability between the 3 scales revealed a statistically significant difference between the scale of Manolio and that of Fazekas and Schmidt for the baseline investigation (z value, -2.9676; P=0.003), demonstrating better interrater agreement for the Fazekas and Schmidt scale. The rating results obtained with all 3 scales were highly correlated with each other (Spearman rank correlation, 0.712 to 0.806; P≤0.01), and there was significant agreement between all 3 visual rating scales and the quantitative volumetric measurement of ARWMC (Kendall W, 0.37, 0.48, and 0.57; P<0.001).

Conclusions - Our data demonstrate that the 3 rating scales studied reflect the actual volume of ARWMCs well. The 2 scales that provide more detailed information on ARWMCs seemed preferable to the 1 that yields more global information. The visual assessment of ARWMC progression remains problematic and may require modifications or extensions of existing rating scales.
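
For readers unfamiliar with the κ statistic reported above, the following is a minimal sketch of how agreement between two raters can be quantified. It assumes the unweighted Cohen's κ for a single pair of raters; the abstract does not state which κ variant the authors used or how the 3 raters were combined, and the grades below are hypothetical, not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same grade by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per grade.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_chance = sum(counts_a[g] * counts_b[g] for g in counts_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical grades (0 to 3) for ten scans; illustration only.
rater_1 = [0, 1, 1, 2, 3, 0, 2, 1, 3, 2]
rater_2 = [0, 1, 2, 2, 3, 0, 2, 1, 2, 2]
kappa = cohen_kappa(rater_1, rater_2)
print(round(kappa, 3))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is why κ values of 0.19 to 0.39 for progression are read as poor despite all raters seeing progression.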

Original language: English
Pages (from-to): 441-445
Number of pages: 5
Issue number: 2
Publication status: Published - 1 Feb 2003


  • Magnetic resonance imaging
  • White matter
