The number of studies applying machine learning (ML) to predict acute kidney injury (AKI) has grown steadily over the past decade. We assess and critically appraise the state of the art in ML models for AKI prediction, considering performance, methodological soundness, and applicability.

We searched PubMed and arXiv, extracted data, and critically appraised studies based on the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD), Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS), and Prediction Model Risk of Bias Assessment Tool (PROBAST) guidelines.

Forty-six studies from 3166 titles were included. Thirty-eight studies developed a model, five developed and externally validated one, and three studies externally validated one. Flexible ML methods were used more often than deep learning, although the latter was common with temporal variables and text as predictors. Reported predictive performance, measured as the area under the receiver operating characteristic curve (AUROC), ranged from 0.49 to 0.99. Our critical appraisal identified a high risk of bias in 39 studies. Some studies lacked internal validation, whereas external validation and interpretability of results were rarely considered. Fifteen studies focused on AKI prediction in the intensive care setting, and the US-derived Medical Information Mart for Intensive Care (MIMIC) data set was commonly used. Reproducibility was limited as data and code were usually unavailable.

Flexible ML methods are popular for the prediction of AKI, although more complex models based on deep learning are emerging. Our critical appraisal identified a high risk of bias in most models: Studies should use calibration measures and external validation more often, improve model interpretability, and share data and code to improve reproducibility.
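
The recommendation to report calibration alongside discrimination can be illustrated with a minimal sketch. The data below are made up for illustration and are not drawn from any reviewed study, and the function names are hypothetical helpers, not part of any reviewed model. The sketch shows that two sets of predicted risks with the same ranking have the same AUROC even when one systematically under-predicts risk, which only a calibration measure reveals.

```python
def auroc(y_true, y_score):
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_in_the_large(y_true, y_prob):
    """Mean predicted risk minus observed event rate.
    Zero means no systematic over- or under-prediction."""
    return sum(y_prob) / len(y_prob) - sum(y_true) / len(y_true)


# Toy data (assumption: invented for illustration only).
y = [0, 0, 1, 1]                  # 1 = AKI occurred, 0 = no AKI
p_a = [0.1, 0.4, 0.35, 0.8]       # predicted risks from a hypothetical model A
p_b = [x / 2 for x in p_a]        # same ranking, but every risk halved

for name, p in (("A", p_a), ("B", p_b)):
    print(name, auroc(y, p), brier_score(y, p), calibration_in_the_large(y, p))
```

Both hypothetical models rank patients identically, so discrimination alone cannot distinguish them; the calibration measures differ because model B's predicted risks are uniformly lower than the observed event rate.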
Original language: English
Article number: sfac181
Journal: Clinical Kidney Journal
Publication status: Published - 2 Aug 2022


  • acute kidney injury
  • clinical prediction models
  • critical appraisal
  • machine learning
  • systematic review
