Does Reinforcement Learning Improve Outcomes for Critically Ill Patients? A Systematic Review and Level-of-Readiness Assessment

Martijn Otten, Ameet R Jagesar, Tariq A Dam, Laurens A Biesheuvel, Floris den Hengst, Kirsten A Ziesemer, Patrick J Thoral, Harm-Jan de Grooth, Armand R J Girbes, Vincent François-Lavet, Mark Hoogendoorn, Paul W G Elbers

Research output: Contribution to journal › Article › Academic › peer-review



OBJECTIVE: Reinforcement learning (RL) is a machine learning technique uniquely effective at sequential decision-making, which makes it potentially relevant to ICU treatment challenges. We set out to systematically review, assess level-of-readiness and meta-analyze the effect of RL on outcomes for critically ill patients.

DATA SOURCES: A systematic search was performed in PubMed, Clarivate Analytics/Web of Science Core Collection, Elsevier/SCOPUS and the Institute of Electrical and Electronics Engineers Xplore Digital Library from inception to March 25, 2022, with subsequent citation tracking.

DATA EXTRACTION: Journal articles that used an RL technique in an ICU population and reported on patient health-related outcomes were included for full analysis. Conference papers were included for level-of-readiness assessment only. Descriptive statistics, characteristics of the models, outcome compared with clinician's policy and level-of-readiness were collected. RL-health risk of bias and applicability assessment was performed.

DATA SYNTHESIS: A total of 1,033 articles were screened, of which 18 journal articles and 18 conference papers were included. Thirty of those were prototyping or modeling articles and six were validation articles. All articles reported RL algorithms to outperform clinical decision-making by ICU professionals, but only in retrospective data. The modeling techniques for the state-space, action-space, reward function, RL model training, and evaluation varied widely. The risk of bias was high in all articles, mainly due to the evaluation procedure.

CONCLUSION: In this first systematic review on the application of RL in intensive care medicine, we found no studies that demonstrated improved patient outcomes from RL-based technologies. All studies reported that RL-agent policies outperformed clinician policies, but such assessments were all based on retrospective off-policy evaluation.

Original language: English
Pages (from-to): E79-E88
Journal: Critical Care Medicine
Issue number: 2
Early online date: 8 Nov 2023
Publication status: Published - 1 Feb 2024


  • artificial intelligence
  • intensive care medicine
  • machine learning
  • reinforcement learning
  • sequential decision-making
  • systematic review
