TY - GEN
T1 - Transferring clinical prediction models across hospitals and electronic health record systems
AU - Curth, Alicia
AU - Thoral, Patrick
AU - van den Wildenberg, Wilco
AU - Bijlstra, Peter
AU - de Bruin, Daan
AU - Elbers, Paul
AU - Fornasa, Mattia
PY - 2020/1/1
Y1 - 2020/1/1
N2 - Recent years have seen a surge in studies developing clinical prediction models based on electronic health records (EHRs) as a result of advances in machine learning techniques and data availability. Yet, validation and implementation of such models in practice are rare, in part because EHR-based clinical prediction models are more difficult to apply to new data sets than results of classical clinical studies due to less controlled clinical environments. In this paper we propose to use the theoretical framework of domain adaptation to analyze the problem of transferring machine-learning-based clinical prediction models across different hospitals and EHR systems. Using the model of Thoral et al. [12] predicting patient-level risk of readmission and mortality after intensive care unit discharge as a case study, we discuss, apply and compare multiple domain adaptation methods. We transfer the model from the original source data set to two new target data sets. We find that, while model performance deteriorates substantially when applying a model developed for one data set to another directly, updating models with training data from the target set and using methods that explicitly model differences in data sets always improves model performance. In a simulation experiment, we show that having access to data or model parameters from another hospital can substantially reduce the amount of data required to build an accurate prediction model for a new hospital. We also show that these performance gains diminish with increasing availability of data from the target hospital.
KW - Clinical prediction models
KW - Domain adaptation
KW - Electronic Health Records
KW - Intensive Care Medicine
KW - Transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85083735710&partnerID=8YFLogxK
U2 - https://doi.org/10.1007/978-3-030-43823-4_48
DO - https://doi.org/10.1007/978-3-030-43823-4_48
M3 - Conference contribution
SN - 9783030438227
VL - 1167 CCIS
T3 - Communications in Computer and Information Science
SP - 605
EP - 621
BT - Machine Learning and Knowledge Discovery in Databases - International Workshops of ECML PKDD 2019, Proceedings
A2 - Cellier, Peggy
A2 - Driessens, Kurt
PB - Springer
T2 - 19th Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2019
Y2 - 16 September 2019 through 20 September 2019
ER -