TY - CONF
T1 - Differential dementia diagnosis on incomplete data with latent trees
AU - Ledig, Christian
AU - Kaltwang, Sebastian
AU - Tolonen, Antti
AU - Koikkalainen, Juha
AU - Scheltens, Philip
AU - Barkhof, Frederik
AU - Rhodius-Meester, Hanneke
AU - Tijms, Betty
AU - Lemstra, Afina W.
AU - van der Flier, Wiesje
AU - Lötjönen, Jyrki
AU - Rueckert, Daniel
PY - 2016
Y1 - 2016
N2 - Incomplete patient data is a substantial problem that is not sufficiently addressed in current clinical research. Many published methods assume both completeness and validity of study data. However, this assumption is often violated as individual features might be unavailable due to missing patient examination or distorted/wrong due to inaccurate measurements or human error. In this work we propose to use the Latent Tree (LT) generative model to address current limitations due to missing data. We show on 491 subjects of a challenging dementia dataset that LT feature estimation is more robust towards incomplete data as compared to mean or Gaussian Mixture Model imputation and has a synergistic effect when combined with common classifiers (we use SVM as an example). We show that LTs allow the inclusion of incomplete samples into classifier training. Using LTs, we obtain a balanced accuracy of 62% for the classification of all patients into five distinct dementia types even though 20% of the features are missing in both training and testing data (68% on complete data). Further, we confirm the potential of LTs to detect outlier samples within the dataset.
AB - Incomplete patient data is a substantial problem that is not sufficiently addressed in current clinical research. Many published methods assume both completeness and validity of study data. However, this assumption is often violated as individual features might be unavailable due to missing patient examination or distorted/wrong due to inaccurate measurements or human error. In this work we propose to use the Latent Tree (LT) generative model to address current limitations due to missing data. We show on 491 subjects of a challenging dementia dataset that LT feature estimation is more robust towards incomplete data as compared to mean or Gaussian Mixture Model imputation and has a synergistic effect when combined with common classifiers (we use SVM as an example). We show that LTs allow the inclusion of incomplete samples into classifier training. Using LTs, we obtain a balanced accuracy of 62% for the classification of all patients into five distinct dementia types even though 20% of the features are missing in both training and testing data (68% on complete data). Further, we confirm the potential of LTs to detect outlier samples within the dataset.
KW - Dementia
KW - Differential diagnosis
KW - Incomplete data
KW - Latent trees
UR - http://www.scopus.com/inward/record.url?scp=84996598640&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-46723-8_6
DO - 10.1007/978-3-319-46723-8_6
M3 - Paper
SP - 44
EP - 52
ER -