Video-Based Pairwise Comparison: Enabling the Development of Automated Rating of Motor Dysfunction in Multiple Sclerosis

Jessica Burggraaff, Jonas Dorn, Marcus D'Souza, Cecily Morrison, Christian P. Kamm, Peter Kontschieder, Prejaas Tewarie, Saskia Steinheimer, Abigail Sellen, Frank Dahlke, Ludwig Kappos, Bernard Uitdehaag

Research output: Contribution to journal › Article › Academic › peer-review

5 Citations (Scopus)


Objectives: To examine the feasibility, reliability, granularity, and convergent validity of a video-based pairwise comparison technique that uses algorithmic support to enable automated rating of motor dysfunction in patients with multiple sclerosis (MS).
Design: Feasibility and larger cross-sectional cohort study.
Setting: The outpatient clinics of 2 specialist university medical centers.
Participants: Selected sample from a cohort of patients with MS participating in the Assess MS study (N=42). Videos were randomly drawn from each stratum of the ataxia severity degrees defined in the Expanded Disability Status Scale (EDSS). In Basel: 19 videos of 17 patients (mean age, 43.4±11.6y; 10 women). In Amsterdam: 50 videos of 25 patients (mean age, 50.0±10.0y; 15 women).
Interventions: Not applicable.
Main Outcome Measures: In each center, neurologists (n=13; n=10) viewed pairs of videos of patients performing standardized movements (eg, finger-to-nose test) to assess relative performance. A comparative assessment score was calculated for each video using the TrueSkill algorithm and analyzed for intrarater reliability (test-retest; ratio of agreement), interrater reliability (intraclass correlation coefficient [ICC] for absolute agreement), and convergent validity (Spearman ρ). Granularity was estimated from the average difference in comparative assessment scores at which 80% of neurologists considered performance to be different.
Results: Intrarater reliability was excellent (median ratio of agreement≥0.87). The comparative assessment scores calculated from individual neurologists demonstrated good-excellent ICCs for interrater reliability (0.89; 0.71). The comparative assessment scores correlated (very) highly with their Neurostatus-EDSS equivalent (ρ=0.78, P<.001; ρ=0.91, P<.05), suggesting a more fine-grained rating.
Conclusions: Video-based pairwise comparison of motor dysfunction allows for reliable and fine-grained capturing of clinical judgment about neurologic performance, which can contribute to the development of a consistent quantified metric of motor ability in MS.
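The scoring step named under Main Outcome Measures can be illustrated with a minimal sketch. It assumes a simplified two-player, no-draw TrueSkill update without the dynamics factor, and uses the commonly cited default parameters (μ=25, σ=25/3, β=σ/2). The video IDs, the comparison outcomes, and the helper names `update` and `score_videos` are hypothetical, and this is not the study's implementation. The convention that the first video in each pair is the one judged to show better performance is also an assumption.

```python
import math
from statistics import NormalDist

# Commonly cited TrueSkill defaults (assumed here, not taken from the study).
MU0 = 25.0
SIGMA0 = MU0 / 3
BETA = SIGMA0 / 2

_STD = NormalDist()  # standard normal, for pdf and cdf


def update(winner, loser, beta=BETA):
    """Simplified two-player, no-draw TrueSkill update.

    winner, loser: (mu, sigma) tuples. Returns the two updated tuples.
    """
    (mu_w, s_w), (mu_l, s_l) = winner, loser
    c = math.sqrt(2 * beta**2 + s_w**2 + s_l**2)
    t = (mu_w - mu_l) / c
    v = _STD.pdf(t) / _STD.cdf(t)  # size of the mean shift
    w = v * (v + t)                # size of the variance shrinkage
    new_w = (mu_w + s_w**2 / c * v, s_w * math.sqrt(1 - s_w**2 / c**2 * w))
    new_l = (mu_l - s_l**2 / c * v, s_l * math.sqrt(1 - s_l**2 / c**2 * w))
    return new_w, new_l


def score_videos(video_ids, comparisons):
    """Fold a sequence of (better, worse) judgments into per-video ratings."""
    ratings = {v: (MU0, SIGMA0) for v in video_ids}
    for better, worse in comparisons:
        ratings[better], ratings[worse] = update(ratings[better], ratings[worse])
    return ratings


# Hypothetical judgments: the first video in each pair was rated as better.
comparisons = [("A", "B"), ("B", "C"), ("A", "C")]
ratings = score_videos(["A", "B", "C"], comparisons)

# Order videos by their mean score, best first.
ranking = sorted(ratings, key=lambda v: ratings[v][0], reverse=True)
for video in ranking:
    mu, sigma = ratings[video]
    print(f"{video}: mu={mu:.2f}, sigma={sigma:.2f}")
```

Each judgment moves the two means apart and shrinks both uncertainties, so videos that are compared more often get tighter scores. The resulting per-video means are continuous, which is what allows a finer ordering than an ordinal scale with few levels.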

Original language: English
Pages (from-to): 234-241
Number of pages: 8
Journal: Archives of Physical Medicine and Rehabilitation
Issue number: 2
Early online date: 30 Aug 2019
Publication status: Published - 1 Feb 2020
