Transatlantic transferability of a new reinforcement learning model for optimizing haemodynamic treatment for critically ill patients with sepsis

Luca Roggeveen; Ali el Hassouni; Jonas Ahrendt; Tingjie Guo; Lucas Fleuren; Patrick Thoral; Armand RJ Girbes; Mark Hoogendoorn; Paul WG Elbers

doi:https://doi.org/10.1016/j.artmed.2020.102003

Research output: Contribution to journal › Article › Academic › peer-review

15 Citations (Scopus)

Abstract

Introduction: In recent years, reinforcement learning (RL) has gained traction in the healthcare domain. In particular, RL methods have been explored for haemodynamic optimization of septic patients in the Intensive Care Unit. Most hospitals, however, lack the data and expertise for model development, necessitating transfer of models developed using external datasets. This approach assumes model generalizability across different patient populations, the validity of which has not previously been tested. In addition, there is limited knowledge on safety and reliability. These challenges need to be addressed to further facilitate implementation of RL models in clinical practice.

Method: We developed and validated a new reinforcement learning model for haemodynamic optimization in sepsis on the MIMIC intensive care database from the USA using a dueling double deep Q network. We then transferred this model to the European AmsterdamUMCdb intensive care database. T-Distributed Stochastic Neighbor Embedding and Sequential Organ Failure Assessment scores were used to explore the differences between the patient populations. We applied off-policy policy evaluation methods to quantify model performance. In addition, we introduced and applied a novel deep policy inspection to analyse how the optimal policy relates to the different phases of sepsis and sepsis treatment, providing interpretable insight for assessing model safety and reliability.

Results: The off-policy evaluation revealed that the optimal policy outperformed the physician policy on both datasets despite marked differences between the two patient populations and physicians' policies. Our novel deep policy inspection method showed that the model could initiate therapy adequately and adjust therapy intensity to illness severity and disease progression, which indicated safe and reliable model behaviour. Compared to current physician behaviour, the developed policy prefers a more liberal use of vasopressors with a more restrained use of fluid therapy, in line with previous work.

Conclusion: We created a reinforcement learning model for optimal bedside haemodynamic management and demonstrated model transferability between populations from the USA and Europe for the first time. We proposed new methods for deep policy inspection integrating expert domain knowledge. This is expected to facilitate progression to bedside clinical decision support for the treatment of critically ill patients.
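
The abstract names a dueling double deep Q network but gives no architectural detail. As a rough illustration only, the sketch below shows the two generic textbook ingredients of that family: the dueling aggregation of a state value with per-action advantages, and the double Q-learning target in which one network selects the next action and another evaluates it. The function names, the plain-list representation, and the numbers are assumptions made for this example; they are not taken from the paper, whose state features, action space, reward, and network layers are not described here.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    done: bool,
    q_online_next: List[float],
    q_target_next: List[float],
) -> float:
    """Double Q-learning target for a single transition.

    The online network's Q-values pick the greedy next action; the target
    network's Q-values supply the value of that action.
    """
    if done:
        # Terminal transition: nothing to bootstrap from.
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical numbers, purely for illustration.
q_next = dueling_q_values(1.0, [0.0, 2.0, 4.0])
target = double_dqn_target(1.0, 0.5, False, [0.1, 0.9, 0.3], [2.0, 4.0, 10.0])
```

Decoupling action selection from action evaluation is the usual motivation for the double variant, since taking a maximum over noisy estimates from a single network tends to overestimate values.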

Original language	English
Article number	102003
Journal	Artificial Intelligence in Medicine
Volume	112
DOIs	https://doi.org/10.1016/j.artmed.2020.102003
Publication status	Published - 1 Feb 2021

Keywords

Deep Q learning
ICU
Reinforcement learning
Sepsis

Access to Document

https://doi.org/10.1016/j.artmed.2020.102003

Cite this

Roggeveen, L., el Hassouni, A., Ahrendt, J., Guo, T., Fleuren, L., Thoral, P., Girbes, A. RJ., Hoogendoorn, M., & Elbers, P. WG. (2021). Transatlantic transferability of a new reinforcement learning model for optimizing haemodynamic treatment for critically ill patients with sepsis. Artificial Intelligence in Medicine, 112, Article 102003. https://doi.org/10.1016/j.artmed.2020.102003

@article{cf8f4deac48145499ff88541c57dba5b,

title = "Transatlantic transferability of a new reinforcement learning model for optimizing haemodynamic treatment for critically ill patients with sepsis",

abstract = "Introduction: In recent years, reinforcement learning (RL) has gained traction in the healthcare domain. In particular, RL methods have been explored for haemodynamic optimization of septic patients in the Intensive Care Unit. Most hospitals, however, lack the data and expertise for model development, necessitating transfer of models developed using external datasets. This approach assumes model generalizability across different patient populations, the validity of which has not previously been tested. In addition, there is limited knowledge on safety and reliability. These challenges need to be addressed to further facilitate implementation of RL models in clinical practice. Method: We developed and validated a new reinforcement learning model for haemodynamic optimization in sepsis on the MIMIC intensive care database from the USA using a dueling double deep Q network. We then transferred this model to the European AmsterdamUMCdb intensive care database. T-Distributed Stochastic Neighbor Embedding and Sequential Organ Failure Assessment scores were used to explore the differences between the patient populations. We applied off-policy policy evaluation methods to quantify model performance. In addition, we introduced and applied a novel deep policy inspection to analyse how the optimal policy relates to the different phases of sepsis and sepsis treatment, providing interpretable insight for assessing model safety and reliability. Results: The off-policy evaluation revealed that the optimal policy outperformed the physician policy on both datasets despite marked differences between the two patient populations and physicians' policies. Our novel deep policy inspection method showed that the model could initiate therapy adequately and adjust therapy intensity to illness severity and disease progression, which indicated safe and reliable model behaviour. Compared to current physician behaviour, the developed policy prefers a more liberal use of vasopressors with a more restrained use of fluid therapy, in line with previous work. Conclusion: We created a reinforcement learning model for optimal bedside haemodynamic management and demonstrated model transferability between populations from the USA and Europe for the first time. We proposed new methods for deep policy inspection integrating expert domain knowledge. This is expected to facilitate progression to bedside clinical decision support for the treatment of critically ill patients.",

keywords = "Deep Q learning, ICU, Reinforcement learning, Sepsis",

author = "Luca Roggeveen and {el Hassouni}, Ali and Jonas Ahrendt and Tingjie Guo and Lucas Fleuren and Patrick Thoral and Girbes, {Armand RJ} and Mark Hoogendoorn and Elbers, {Paul WG}",

year = "2021",

month = feb,

day = "1",

doi = "10.1016/j.artmed.2020.102003",

language = "English",

volume = "112",

journal = "Artificial Intelligence in Medicine",

issn = "0933-3657",

publisher = "Elsevier",

}

TY - JOUR

T1 - Transatlantic transferability of a new reinforcement learning model for optimizing haemodynamic treatment for critically ill patients with sepsis

AU - Roggeveen, Luca

AU - el Hassouni, Ali

AU - Ahrendt, Jonas

AU - Guo, Tingjie

AU - Fleuren, Lucas

AU - Thoral, Patrick

AU - Girbes, Armand RJ

AU - Hoogendoorn, Mark

AU - Elbers, Paul WG

PY - 2021/2/1

Y1 - 2021/2/1

AB - Introduction: In recent years, reinforcement learning (RL) has gained traction in the healthcare domain. In particular, RL methods have been explored for haemodynamic optimization of septic patients in the Intensive Care Unit. Most hospitals, however, lack the data and expertise for model development, necessitating transfer of models developed using external datasets. This approach assumes model generalizability across different patient populations, the validity of which has not previously been tested. In addition, there is limited knowledge on safety and reliability. These challenges need to be addressed to further facilitate implementation of RL models in clinical practice. Method: We developed and validated a new reinforcement learning model for haemodynamic optimization in sepsis on the MIMIC intensive care database from the USA using a dueling double deep Q network. We then transferred this model to the European AmsterdamUMCdb intensive care database. T-Distributed Stochastic Neighbor Embedding and Sequential Organ Failure Assessment scores were used to explore the differences between the patient populations. We applied off-policy policy evaluation methods to quantify model performance. In addition, we introduced and applied a novel deep policy inspection to analyse how the optimal policy relates to the different phases of sepsis and sepsis treatment, providing interpretable insight for assessing model safety and reliability. Results: The off-policy evaluation revealed that the optimal policy outperformed the physician policy on both datasets despite marked differences between the two patient populations and physicians' policies. Our novel deep policy inspection method showed that the model could initiate therapy adequately and adjust therapy intensity to illness severity and disease progression, which indicated safe and reliable model behaviour. Compared to current physician behaviour, the developed policy prefers a more liberal use of vasopressors with a more restrained use of fluid therapy, in line with previous work. Conclusion: We created a reinforcement learning model for optimal bedside haemodynamic management and demonstrated model transferability between populations from the USA and Europe for the first time. We proposed new methods for deep policy inspection integrating expert domain knowledge. This is expected to facilitate progression to bedside clinical decision support for the treatment of critically ill patients.

KW - Deep Q learning

KW - ICU

KW - Reinforcement learning

KW - Sepsis

UR - http://www.scopus.com/inward/record.url?scp=85100429092&partnerID=8YFLogxK

U2 - https://doi.org/10.1016/j.artmed.2020.102003

DO - https://doi.org/10.1016/j.artmed.2020.102003

M3 - Article

C2 - 33581824

SN - 0933-3657

VL - 112

JO - Artificial Intelligence in Medicine

JF - Artificial Intelligence in Medicine

M1 - 102003

ER -
