Reproducibility of deep gray matter atrophy rate measurement in a large multicenter dataset

A. Meijerman, H. Amiri, M. D. Steenwijk, M. A. Jonker, R. A. Van Schijndel, K. S. Cover, H. Vrenken

Research output: Contribution to journal › Article › Academic › peer-review

14 Citations (Scopus)


BACKGROUND AND PURPOSE: Precise in vivo measurement of deep GM volume change is a prerequisite for adequately evaluating disease progression and new treatments. However, quantitative data on the reproducibility of deep GM structure volumetry are not yet available. In this paper, we investigate this reproducibility using a large multicenter dataset.
MATERIALS AND METHODS: We assessed the reproducibility of 2 automated segmentation software packages (FreeSurfer and the FMRIB Integrated Registration and Segmentation Tool) by quantifying the volume changes of deep GM structures in back-to-back MR imaging scans from the Alzheimer Disease Neuroimaging Initiative's multicenter dataset. Five hundred sixty-two subjects with scans at baseline and 1 year were included. Reproducibility was investigated in the bilateral caudate nucleus, putamen, amygdala, globus pallidus, and thalamus using descriptive statistics as well as multilevel and variance component analyses.
RESULTS: Median absolute back-to-back differences varied between GM structures, ranging from 59.6 to 156.4 μL for volume change and from 1.26% to 8.63% for percentage volume change. FreeSurfer performed better for longitudinal volume change in the bilateral amygdala, putamen, left caudate nucleus (P < .005), and right thalamus (P < .001). For longitudinal percentage volume change, FreeSurfer performed better for the left amygdala, bilateral caudate nucleus, and left putamen (P < .001). Smaller limits of agreement were found for FreeSurfer for both outcomes for all GM structures except the globus pallidus. Back-to-back differences in 1-year percentage volume change were approximately 1.5-3.5 times larger than the mean measured 1-year volume change of those structures.
CONCLUSIONS: Longitudinal deep GM atrophy measures should be interpreted with caution. Furthermore, deep GM atrophy measurement techniques require substantially improved reproducibility, specifically when aiming for personalized medicine.
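
The reproducibility outcomes named in the abstract (back-to-back differences in volume change and percentage volume change, their median absolute value, and limits of agreement) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the pairing of scans (scan A and scan B at each time point), the 1.96 × SD form of the limits of agreement, the field names, and the example volumes are all assumptions made for illustration, and the multilevel and variance component analyses are not reproduced.

```python
from statistics import mean, median, stdev


def volume_change(baseline, followup):
    """Absolute 1-year volume change (same unit as the inputs, e.g. microliters)."""
    return followup - baseline


def percentage_volume_change(baseline, followup):
    """1-year volume change as a percentage of the baseline volume."""
    return 100.0 * (followup - baseline) / baseline


def back_to_back_differences(subjects, change_fn):
    """Difference between the change measured from scan A and from scan B.

    Assumption: each subject has two back-to-back scans (A and B) at
    baseline and at 1 year, and change is computed within each series.
    """
    diffs = []
    for s in subjects:
        change_a = change_fn(s["base_a"], s["year1_a"])
        change_b = change_fn(s["base_b"], s["year1_b"])
        diffs.append(change_a - change_b)
    return diffs


def median_absolute_difference(diffs):
    """Median of the absolute back-to-back differences."""
    return median(abs(d) for d in diffs)


def limits_of_agreement(diffs):
    """Bland-Altman style limits: mean difference +/- 1.96 sample SDs (assumed form)."""
    center = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return center - half_width, center + half_width


# Hypothetical volumes in microliters for one structure; not data from the study.
subjects = [
    {"base_a": 5000.0, "base_b": 5040.0, "year1_a": 4900.0, "year1_b": 4980.0},
    {"base_a": 4000.0, "base_b": 3980.0, "year1_a": 3960.0, "year1_b": 3900.0},
    {"base_a": 6000.0, "base_b": 6000.0, "year1_a": 5910.0, "year1_b": 5940.0},
]

abs_diffs = back_to_back_differences(subjects, volume_change)
pct_diffs = back_to_back_differences(subjects, percentage_volume_change)

print("median |diff|, volume change:", median_absolute_difference(abs_diffs))
print("median |diff|, % volume change:", median_absolute_difference(pct_diffs))
print("limits of agreement, volume change:", limits_of_agreement(abs_diffs))
```

Under these assumptions, a structure is measured more reproducibly when its median absolute difference is small and its limits of agreement are narrow relative to the 1-year change one hopes to detect.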

Original language: English
Pages (from-to): 46-53
Number of pages: 8
Journal: American Journal of Neuroradiology
Issue number: 1
Publication status: Published - 1 Jan 2018
