Performance of five research-domain automated WM lesion segmentation methods in a multi-center MS study

MAGNIMS Study Group; neuGRID

doi:https://doi.org/10.1016/j.neuroimage.2017.09.011

Research output: Contribution to journal › Article › Academic › peer-review

25 Citations (Scopus)

Abstract

Background and Purpose: In vivo identification of white matter lesions plays a key role in the evaluation of patients with multiple sclerosis (MS). Automated lesion segmentation methods have been developed to substitute manual outlining, but evidence of their performance in multi-center investigations is lacking. In this work, five research-domain automated segmentation methods were evaluated using a multi-center MS dataset.

Methods: 70 MS patients (median EDSS of 2.0 [range 0.0–6.5]) were included from a six-center dataset of the MAGNIMS Study Group (www.magnims.eu), which included 2D FLAIR and 3D T1 images with manual lesion segmentation as a reference. Automated lesion segmentations were produced using five algorithms: Cascade; Lesion Segmentation Toolbox (LST) with both the Lesion growth algorithm (LGA) and the Lesion prediction algorithm (LPA); Lesion-Topology preserving Anatomical Segmentation (Lesion-TOADS); and k-Nearest Neighbor with Tissue Type Priors (kNN-TTP). Main software parameters were optimized using a training set (N = 18), and formal testing was performed on the remaining patients (N = 52). To evaluate volumetric agreement with the reference segmentations, the intraclass correlation coefficient (ICC) as well as the mean difference in lesion volumes between the automated and reference segmentations were calculated. The Similarity Index (SI), False Positive (FP) volumes and False Negative (FN) volumes were used to examine spatial agreement. All analyses were repeated using a leave-one-center-out design, in which the center of interest was excluded from the training phase, to evaluate the performance of each method on an ‘unseen’ center.

Results: Compared to the reference mean lesion volume (4.85 ± 7.29 mL), the methods displayed a mean difference of 1.60 ± 4.83 (Cascade), 2.31 ± 7.66 (LGA), 0.44 ± 4.68 (LPA), 1.76 ± 4.17 (Lesion-TOADS) and −1.39 ± 4.10 mL (kNN-TTP). The ICCs were 0.755, 0.713, 0.851, 0.806 and 0.723, respectively. Spatial agreement with reference segmentations was higher for LPA (SI = 0.37 ± 0.23), Lesion-TOADS (SI = 0.35 ± 0.18) and kNN-TTP (SI = 0.44 ± 0.14) than for Cascade (SI = 0.26 ± 0.17) or LGA (SI = 0.31 ± 0.23). All methods showed highly similar results when used on data from a center not used in software parameter optimization.

Conclusion: The performance of the methods in this multi-center MS dataset was moderate, but appeared to be robust even with new datasets from centers not included in training the automated methods.
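The abstract names the spatial-agreement measures but does not define them. As a rough illustration only, the sketch below assumes the Similarity Index is the Dice overlap, 2·TP / (2·TP + FP + FN), and that FP and FN volumes are voxel counts multiplied by the voxel volume. The function name `spatial_agreement`, the flattened-mask representation and the `voxel_ml` parameter are hypothetical and are not taken from the study's software.

```python
from typing import Sequence, Tuple


def spatial_agreement(
    auto: Sequence[int], ref: Sequence[int], voxel_ml: float
) -> Tuple[float, float, float]:
    """Compare two flattened binary lesion masks of equal length.

    Returns (similarity index, false-positive volume in mL,
    false-negative volume in mL). SI is assumed here to be the
    Dice overlap; the abstract itself does not spell out the formula.
    """
    if len(auto) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    pairs = list(zip(auto, ref))
    tp = sum(1 for a, r in pairs if a and r)      # lesion in both masks
    fp = sum(1 for a, r in pairs if a and not r)  # automated only
    fn = sum(1 for a, r in pairs if r and not a)  # reference only
    denom = 2 * tp + fp + fn
    # Convention chosen for this sketch: two empty masks agree perfectly.
    si = 2 * tp / denom if denom else 1.0
    return si, fp * voxel_ml, fn * voxel_ml


# Toy example with six voxels of 0.001 mL each (made-up values).
auto_mask = [1, 1, 0, 0, 1, 0]
ref_mask = [1, 0, 0, 1, 1, 0]
si, fp_ml, fn_ml = spatial_agreement(auto_mask, ref_mask, voxel_ml=0.001)
print(f"SI={si:.2f} FP={fp_ml:.3f} mL FN={fn_ml:.3f} mL")
```

Under this assumed definition, SI ranges from 0 (no overlap) to 1 (identical masks), so the reported values of 0.26 to 0.44 would correspond to partial overlap with the manual reference.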

Original language	English
Pages (from-to)	106-114
Number of pages	9
Journal	NEUROIMAGE
Volume	163
DOIs	https://doi.org/10.1016/j.neuroimage.2017.09.011
Publication status	Published - 1 Dec 2017

Keywords

Automated methods segmentation
MRI
Multiple sclerosis
White matter lesion

Access to Document

https://doi.org/10.1016/j.neuroimage.2017.09.011

Cite this

@article{47c6d5102aae474689bb818c4920fd1f,

title = "Performance of five research-domain automated WM lesion segmentation methods in a multi-center MS study",

abstract = "Background and Purpose: In vivo identification of white matter lesions plays a key role in the evaluation of patients with multiple sclerosis (MS). Automated lesion segmentation methods have been developed to substitute manual outlining, but evidence of their performance in multi-center investigations is lacking. In this work, five research-domain automated segmentation methods were evaluated using a multi-center MS dataset. Methods: 70 MS patients (median EDSS of 2.0 [range 0.0–6.5]) were included from a six-center dataset of the MAGNIMS Study Group (www.magnims.eu), which included 2D FLAIR and 3D T1 images with manual lesion segmentation as a reference. Automated lesion segmentations were produced using five algorithms: Cascade; Lesion Segmentation Toolbox (LST) with both the Lesion growth algorithm (LGA) and the Lesion prediction algorithm (LPA); Lesion-Topology preserving Anatomical Segmentation (Lesion-TOADS); and k-Nearest Neighbor with Tissue Type Priors (kNN-TTP). Main software parameters were optimized using a training set (N = 18), and formal testing was performed on the remaining patients (N = 52). To evaluate volumetric agreement with the reference segmentations, the intraclass correlation coefficient (ICC) as well as the mean difference in lesion volumes between the automated and reference segmentations were calculated. The Similarity Index (SI), False Positive (FP) volumes and False Negative (FN) volumes were used to examine spatial agreement. All analyses were repeated using a leave-one-center-out design, in which the center of interest was excluded from the training phase, to evaluate the performance of each method on an {\textquoteleft}unseen{\textquoteright} center. Results: Compared to the reference mean lesion volume (4.85 ± 7.29 mL), the methods displayed a mean difference of 1.60 ± 4.83 (Cascade), 2.31 ± 7.66 (LGA), 0.44 ± 4.68 (LPA), 1.76 ± 4.17 (Lesion-TOADS) and −1.39 ± 4.10 mL (kNN-TTP). The ICCs were 0.755, 0.713, 0.851, 0.806 and 0.723, respectively. Spatial agreement with reference segmentations was higher for LPA (SI = 0.37 ± 0.23), Lesion-TOADS (SI = 0.35 ± 0.18) and kNN-TTP (SI = 0.44 ± 0.14) than for Cascade (SI = 0.26 ± 0.17) or LGA (SI = 0.31 ± 0.23). All methods showed highly similar results when used on data from a center not used in software parameter optimization. Conclusion: The performance of the methods in this multi-center MS dataset was moderate, but appeared to be robust even with new datasets from centers not included in training the automated methods.",

keywords = "Automated methods segmentation, MRI, Multiple sclerosis, White matter lesion",

author = "{de Sitter}, Alexandra and Steenwijk, {Martijn D.} and Aur{\'e}lie Ruet and Adriaan Versteeg and Yaou Liu and {van Schijndel}, {Ronald A.} and Pouwels, {Petra J.W.} and Kilsdonk, {Iris D.} and Cover, {Keith S.} and Stefan Ropele and Rocca, {Maria A.} and Marios Yiannakas and Wattjes, {Mike P.} and Soheil Damangir and Frisoni, {Giovanni B.} and Jaume Sastre-Garriga and Alex Rovira and Christian Enzinger and Massimo Filippi and Frederiksen, {Jette L.} and Olga Ciccarelli and Ludwig Kappos and Frederik Barkhof and Hugo Vrenken and {MAGNIMS Study Group} and neuGRID and {van der Flier}, WM and ND Prins",

year = "2017",

month = dec,

day = "1",

doi = "10.1016/j.neuroimage.2017.09.011",

language = "English",

volume = "163",

pages = "106--114",

journal = "NEUROIMAGE",

issn = "1053-8119",

publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - Performance of five research-domain automated WM lesion segmentation methods in a multi-center MS study

AU - de Sitter, Alexandra

AU - Steenwijk, Martijn D.

AU - Ruet, Aurélie

AU - Versteeg, Adriaan

AU - Liu, Yaou

AU - van Schijndel, Ronald A.

AU - Pouwels, Petra J.W.

AU - Kilsdonk, Iris D.

AU - Cover, Keith S.

AU - Ropele, Stefan

AU - Rocca, Maria A.

AU - Yiannakas, Marios

AU - Wattjes, Mike P.

AU - Damangir, Soheil

AU - Frisoni, Giovanni B.

AU - Sastre-Garriga, Jaume

AU - Rovira, Alex

AU - Enzinger, Christian

AU - Filippi, Massimo

AU - Frederiksen, Jette L.

AU - Ciccarelli, Olga

AU - Kappos, Ludwig

AU - Barkhof, Frederik

AU - Vrenken, Hugo

AU - MAGNIMS Study Group

AU - neuGRID

AU - van der Flier, WM

AU - Prins, ND

PY - 2017/12/1

Y1 - 2017/12/1

AB - Background and Purpose: In vivo identification of white matter lesions plays a key role in the evaluation of patients with multiple sclerosis (MS). Automated lesion segmentation methods have been developed to substitute manual outlining, but evidence of their performance in multi-center investigations is lacking. In this work, five research-domain automated segmentation methods were evaluated using a multi-center MS dataset. Methods: 70 MS patients (median EDSS of 2.0 [range 0.0–6.5]) were included from a six-center dataset of the MAGNIMS Study Group (www.magnims.eu), which included 2D FLAIR and 3D T1 images with manual lesion segmentation as a reference. Automated lesion segmentations were produced using five algorithms: Cascade; Lesion Segmentation Toolbox (LST) with both the Lesion growth algorithm (LGA) and the Lesion prediction algorithm (LPA); Lesion-Topology preserving Anatomical Segmentation (Lesion-TOADS); and k-Nearest Neighbor with Tissue Type Priors (kNN-TTP). Main software parameters were optimized using a training set (N = 18), and formal testing was performed on the remaining patients (N = 52). To evaluate volumetric agreement with the reference segmentations, the intraclass correlation coefficient (ICC) as well as the mean difference in lesion volumes between the automated and reference segmentations were calculated. The Similarity Index (SI), False Positive (FP) volumes and False Negative (FN) volumes were used to examine spatial agreement. All analyses were repeated using a leave-one-center-out design, in which the center of interest was excluded from the training phase, to evaluate the performance of each method on an ‘unseen’ center. Results: Compared to the reference mean lesion volume (4.85 ± 7.29 mL), the methods displayed a mean difference of 1.60 ± 4.83 (Cascade), 2.31 ± 7.66 (LGA), 0.44 ± 4.68 (LPA), 1.76 ± 4.17 (Lesion-TOADS) and −1.39 ± 4.10 mL (kNN-TTP). The ICCs were 0.755, 0.713, 0.851, 0.806 and 0.723, respectively. Spatial agreement with reference segmentations was higher for LPA (SI = 0.37 ± 0.23), Lesion-TOADS (SI = 0.35 ± 0.18) and kNN-TTP (SI = 0.44 ± 0.14) than for Cascade (SI = 0.26 ± 0.17) or LGA (SI = 0.31 ± 0.23). All methods showed highly similar results when used on data from a center not used in software parameter optimization. Conclusion: The performance of the methods in this multi-center MS dataset was moderate, but appeared to be robust even with new datasets from centers not included in training the automated methods.

KW - Automated methods segmentation

KW - MRI

KW - Multiple sclerosis

KW - White matter lesion

UR - http://www.scopus.com/inward/record.url?scp=85029663712&partnerID=8YFLogxK

U2 - https://doi.org/10.1016/j.neuroimage.2017.09.011

DO - https://doi.org/10.1016/j.neuroimage.2017.09.011

M3 - Article

C2 - 28899746

SN - 1053-8119

VL - 163

SP - 106

EP - 114

JO - NEUROIMAGE

JF - NEUROIMAGE

ER -
