TY - JOUR
T1 - Development of a Reinforcement Learning Algorithm to Optimize Corticosteroid Therapy in Critically Ill Patients with Sepsis
AU - Bologheanu, Razvan
AU - Kapral, Lorenz
AU - Laxar, Daniel
AU - Maleczek, Mathias
AU - Dibiasi, Christoph
AU - Zeiner, Sebastian
AU - Agibetov, Asan
AU - Ercole, Ari
AU - Thoral, Patrick
AU - Elbers, Paul
AU - Heitzinger, Clemens
AU - Kimberger, Oliver
N1 - Publisher Copyright: © 2023 by the authors.
PY - 2023/2/1
Y1 - 2023/2/1
AB - Background: The optimal indication, dose, and timing of corticosteroids in sepsis is controversial. Here, we used reinforcement learning to derive the optimal steroid policy in septic patients based on data on 3051 ICU admissions from the AmsterdamUMCdb intensive care database. Methods: We identified septic patients according to the 2016 consensus definition. An actor-critic RL algorithm using ICU mortality as a reward signal was developed to determine the optimal treatment policy from time-series data on 277 clinical parameters. We performed off-policy evaluation and testing in independent subsets to assess the algorithm’s performance. Results: Agreement between the RL agent’s policy and the actual documented treatment reached 59%. Our RL agent’s treatment policy was more restrictive compared to the actual clinician behavior: our algorithm suggested withholding corticosteroids in 62% of the patient states, versus 52% according to the physicians’ policy. The 95% lower bound of the expected reward was higher for the RL agent than clinicians’ historical decisions. ICU mortality after concordant action in the testing dataset was lower both when corticosteroids had been withheld and when corticosteroids had been prescribed by the virtual agent. The most relevant variables were vital parameters and laboratory values, such as blood pressure, heart rate, leucocyte count, and glycemia. Conclusions: Individualized use of corticosteroids in sepsis may result in a mortality benefit, but optimal treatment policy may be more restrictive than the routine clinical practice. Whilst external validation is needed, our study motivates a ‘precision-medicine’ approach to future prospective controlled trials and practice.
KW - artificial intelligence
KW - corticosteroids
KW - outcomes
KW - reinforcement learning
KW - sepsis
UR - http://www.scopus.com/inward/record.url?scp=85148945046&partnerID=8YFLogxK
U2 - https://doi.org/10.3390/jcm12041513
DO - https://doi.org/10.3390/jcm12041513
M3 - Article
C2 - 36836046
SN - 2077-0383
VL - 12
JO - Journal of Clinical Medicine
JF - Journal of Clinical Medicine
IS - 4
M1 - 1513
ER -