Multicenter Evaluation of AI-generated DIR and PSIR for Cortical and Juxtacortical Multiple Sclerosis Lesion Detection

Piet M. Bouman, Samantha Noteboom, Fernando A. Nobrega Santos, Erin S. Beck, Gregory Bliault, Marco Castellaro, Massimiliano Calabrese, Declan T. Chard, Paul Eichinger, Massimo Filippi, Matilde Inglese, Caterina Lapucci, Andrzej Marciniak, Bastiaan Moraal, Alfredo Morales Pinzon, Mark Mühlau, Paolo Preziosa, Daniel S. Reich, Maria A. Rocca, Menno M. Schoonheim, Jos W. R. Twisk, Benedict Wiestler, Laura E. Jonkman, Charles R. G. Guttmann, Jeroen J. G. Geurts, Martijn D. Steenwijk

Research output: Contribution to journal › Article › Academic › peer-review

11 Citations (Scopus)


Background: Cortical multiple sclerosis lesions are clinically relevant but inconspicuous at conventional clinical MRI. Double inversion recovery (DIR) and phase-sensitive inversion recovery (PSIR) are more sensitive but often unavailable. In the past 2 years, artificial intelligence (AI) was used to generate DIR and PSIR from standard clinical sequences (eg, T1-weighted, T2-weighted, and fluid-attenuated inversion-recovery sequences), but multicenter validation is crucial for further implementation.

Purpose: To evaluate cortical and juxtacortical multiple sclerosis lesion detection for diagnostic and disease monitoring purposes on AI-generated DIR and PSIR images compared with MRI-acquired DIR and PSIR images in a multicenter setting.

Materials and Methods: Generative adversarial networks were used to generate AI-based DIR (n = 50) and PSIR (n = 43) images. The number of detected lesions between AI-generated images and MRI-acquired (reference) images was compared by randomized blinded scoring by seven readers (all with >10 years of experience in lesion assessment). Reliability was expressed as the intraclass correlation coefficient (ICC). Differences in lesion subtype were determined using Wilcoxon signed-rank tests.

Results: MRI scans of 202 patients with multiple sclerosis (mean age, 46 years ± 11 [SD]; 127 women) were retrospectively collected from seven centers (February 2020 to January 2021). In total, 1154 lesions were detected on AI-generated DIR images versus 855 on MRI-acquired DIR images (mean difference per reader, 35.0% ± 22.8; P < .001). On AI-generated PSIR images, 803 lesions were detected versus 814 on MRI-acquired PSIR images (98.9% ± 19.4; P = .87). Reliability was good for both DIR (ICC, 0.81) and PSIR (ICC, 0.75) across centers. Regionally, more juxtacortical lesions were detected on AI-generated DIR images than on MRI-acquired DIR images (495 [42.9%] vs 338 [39.5%]; P < .001). On AI-generated PSIR images, fewer juxtacortical lesions were detected than on MRI-acquired PSIR images (232 [28.9%] vs 282 [34.6%]; P = .02).

Conclusion: Artificial intelligence–generated double inversion-recovery and phase-sensitive inversion-recovery images performed well compared with their MRI-acquired counterparts and can be considered reliable in a multicenter setting, with good between-reader and between-center interpretative agreement.
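As a quick arithmetic check, the bracketed juxtacortical percentages in the Results appear to be the juxtacortical lesion count divided by the total number of lesions detected on the same image type. The following is a minimal Python sketch under that assumption; the helper name `juxtacortical_share` and the dictionary layout are illustrative and do not come from the article.

```python
# Lesion counts as reported in the abstract: (juxtacortical, total detected).
counts = {
    "AI-generated DIR": (495, 1154),
    "MRI-acquired DIR": (338, 855),
    "AI-generated PSIR": (232, 803),
    "MRI-acquired PSIR": (282, 814),
}


def juxtacortical_share(juxtacortical, total):
    """Percentage of detected lesions that are juxtacortical, to one decimal.

    Assumption: the denominator is the total lesion count for that image type.
    """
    return round(100 * juxtacortical / total, 1)


shares = {name: juxtacortical_share(j, t) for name, (j, t) in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

Under this assumption the computed shares match the reported values (42.9%, 39.5%, 28.9%, and 34.6%). The per-reader figures (35.0% ± 22.8 and 98.9% ± 19.4) are means across readers and cannot be recomputed from the pooled totals alone.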

Original language: English
Article number: e221425
Issue number: 2
Publication status: Published - 1 Apr 2023
