Machine learning algorithm improves the detection of NASH (NAS-based) and at-risk NASH: A development and validation study

Jenny Lee; Max Westphal; Yasaman Vali; Jerome Boursier; Salvatorre Petta; Rachel Ostroff; Leigh Alexander; Yu Chen; Celine Fournier; Andreas Geier; Sven Francque; Kristy Wonders; Dina Tiniakos; Pierre Bedossa; Mike Allison; Georgios Papatheodoridis; Helena Cortez-Pinto; Raluca Pais; Jean-Francois Dufour; Diana Julie Leeming; Stephen Harrison; Jeremy Cobbold; Adriaan G. Holleboom; Hannele Yki-Järvinen; Javier Crespo; Mattias Ekstedt; Guruprasad P. Aithal; Elisabetta Bugianesi; Manuel Romero-Gomez; Richard Torstenson; Morten Karsdal; Carla Yunis; Jörn M. Schattenberg; Detlef Schuppan; Vlad Ratziu; Clifford Brass; Kevin Duffin; Koos Zwinderman; Michael Pavlides; Quentin M. Anstee; Patrick M. Bossuyt

doi:https://doi.org/10.1097/HEP.0000000000000364

Research output: Contribution to journal › Article › Academic › peer-review

3 Citations (Scopus)

Abstract

Background and Aims: Detecting NASH remains challenging, while at-risk NASH (steatohepatitis and F ≥ 2) tends to progress and is of interest for drug development and clinical application. We developed prediction models by supervised machine learning techniques, with clinical data and biomarkers to stage and grade patients with NAFLD.

Approach and Results: Learning data were collected in the Liver Investigation: Testing Marker Utility in Steatohepatitis metacohort (966 biopsy-proven NAFLD adults), staged and graded according to NASH CRN. Conditions of interest were the clinical trial definition of NASH (NAS ≥ 4; 53%), at-risk NASH (NASH with F ≥ 2; 35%), significant (F ≥ 2; 47%), and advanced fibrosis (F ≥ 3; 28%). Thirty-five predictors were included. Missing data were handled by multiple imputations. Data were randomly split into training/validation (75/25) sets. A gradient boosting machine was applied to develop 2 models for each condition: clinical versus extended (clinical and biomarkers). Two variants of the NASH and at-risk NASH models were constructed: direct and composite models. Clinical gradient boosting machine models for steatosis/inflammation/ballooning had AUCs of 0.94/0.79/0.72. There were no improvements when biomarkers were included. The direct NASH model produced AUCs (clinical/extended) of 0.61/0.65. The composite NASH model performed significantly better (0.71) for both variants. The composite at-risk NASH model had an AUC of 0.83 (clinical and extended), an improvement over the direct model. Significant fibrosis models had AUCs (clinical/extended) of 0.76/0.78. The extended advanced fibrosis model (0.86) performed significantly better than the clinical version (0.82).

Conclusions: Detection of NASH and at-risk NASH can be improved by constructing independent machine learning models for each component, using only clinical predictors. Adding biomarkers only improved the accuracy of fibrosis.
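The four conditions of interest named in the abstract can be read as label rules over biopsy scores. The following is a minimal sketch, not the authors' code: it assumes NAS is the sum of the steatosis, lobular inflammation, and ballooning grades from the NASH CRN system, and it uses only the thresholds stated in the abstract (the full study definition may carry further criteria). The names `Biopsy`, `nas`, and `target_conditions` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Biopsy:
    """Histology scores for one patient (hypothetical container)."""
    steatosis: int      # grade 0-3
    inflammation: int   # lobular inflammation, grade 0-3
    ballooning: int     # grade 0-2
    fibrosis: int       # stage 0-4 (F0-F4)


def nas(b: Biopsy) -> int:
    """NAFLD activity score, assumed here to be the sum of the three grades."""
    return b.steatosis + b.inflammation + b.ballooning


def target_conditions(b: Biopsy) -> dict:
    """Binary labels using the thresholds quoted in the abstract."""
    nash = nas(b) >= 4
    return {
        "nash": nash,                            # NAS >= 4
        "at_risk_nash": nash and b.fibrosis >= 2,  # NASH with F >= 2
        "significant_fibrosis": b.fibrosis >= 2,   # F >= 2
        "advanced_fibrosis": b.fibrosis >= 3,      # F >= 3
    }


if __name__ == "__main__":
    print(target_conditions(Biopsy(steatosis=2, inflammation=1, ballooning=1, fibrosis=2)))
```

Written this way, at-risk NASH depends on both the activity components and the fibrosis stage, which is consistent with the abstract's point that separate models per component can be combined into a composite prediction. How the study actually combined the component models is not stated in the abstract and is not shown here.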

Original language	English
Pages (from-to)	258-271
Number of pages	14
Journal	Hepatology (Baltimore, Md.)
Volume	78
Issue number	1
Early online date	31 Mar 2023
DOIs	https://doi.org/10.1097/HEP.0000000000000364
Publication status	Published - 1 Jul 2023

Access to Document

https://doi.org/10.1097/HEP.0000000000000364

Cite this

Lee, J., Westphal, M., Vali, Y., Boursier, J., Petta, S., Ostroff, R., Alexander, L., Chen, Y., Fournier, C., Geier, A., Francque, S., Wonders, K., Tiniakos, D., Bedossa, P., Allison, M., Papatheodoridis, G., Cortez-Pinto, H., Pais, R., Dufour, J.-F., ... Bossuyt, P. M. (2023). Machine learning algorithm improves the detection of NASH (NAS-based) and at-risk NASH: A development and validation study. Hepatology (Baltimore, Md.), 78(1), 258-271. https://doi.org/10.1097/HEP.0000000000000364

@article{02a56c47f88748a3bc35922770179f67,

title = "Machine learning algorithm improves the detection of NASH (NAS-based) and at-risk NASH: A development and validation study",

abstract = "Background and Aims: Detecting NASH remains challenging, while at-risk NASH (steatohepatitis and F≥ 2) tends to progress and is of interest for drug development and clinical application. We developed prediction models by supervised machine learning techniques, with clinical data and biomarkers to stage and grade patients with NAFLD. Approach and Results: Learning data were collected in the Liver Investigation: Testing Marker Utility in Steatohepatitis metacohort (966 biopsy-proven NAFLD adults), staged and graded according to NASH CRN. Conditions of interest were the clinical trial definition of NASH (NAS ≥ 4;53%), at-risk NASH (NASH with F ≥ 2;35%), significant (F ≥ 2;47%), and advanced fibrosis (F ≥ 3;28%). Thirty-five predictors were included. Missing data were handled by multiple imputations. Data were randomly split into training/validation (75/25) sets. A gradient boosting machine was applied to develop 2 models for each condition: clinical versus extended (clinical and biomarkers). Two variants of the NASH and at-risk NASH models were constructed: direct and composite models. Clinical gradient boosting machine models for steatosis/inflammation/ballooning had AUCs of 0.94/0.79/0.72. There were no improvements when biomarkers were included. The direct NASH model produced AUCs (clinical/extended) of 0.61/0.65. The composite NASH model performed significantly better (0.71) for both variants. The composite at-risk NASH model had an AUC of 0.83 (clinical and extended), an improvement over the direct model. Significant fibrosis models had AUCs (clinical/extended) of 0.76/0.78. The extended advanced fibrosis model (0.86) performed significantly better than the clinical version (0.82). Conclusions: Detection of NASH and at-risk NASH can be improved by constructing independent machine learning models for each component, using only clinical predictors. Adding biomarkers only improved the accuracy of fibrosis.",

author = "Jenny Lee and Max Westphal and Yasaman Vali and Jerome Boursier and Salvatorre Petta and Rachel Ostroff and Leigh Alexander and Yu Chen and Celine Fournier and Andreas Geier and Sven Francque and Kristy Wonders and Dina Tiniakos and Pierre Bedossa and Mike Allison and Georgios Papatheodoridis and Helena Cortez-Pinto and Raluca Pais and Jean-Francois Dufour and Leeming, {Diana Julie} and Stephen Harrison and Jeremy Cobbold and Holleboom, {Adriaan G.} and Hannele Yki-J{\"a}rvinen and Javier Crespo and Mattias Ekstedt and Aithal, {Guruprasad P.} and Elisabetta Bugianesi and Manuel Romero-Gomez and Richard Torstenson and Morten Karsdal and Carla Yunis and Schattenberg, {J{\"o}rn M.} and Detlef Schuppan and Vlad Ratziu and Clifford Brass and Kevin Duffin and Koos Zwinderman and Michael Pavlides and Anstee, {Quentin M.} and Bossuyt, {Patrick M.}",

note = "Funding Information: The LITMUS project has received funding from the Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No. 777377. This Joint Undertaking receives support from the European Union{\textquoteright}s Horizon 2020 research and innovation program and EFPIA. Sven Francque holds a senior clinical investigator fellowship from the Research Foundation Flanders (FWO) (1802154N). Publisher Copyright: {\textcopyright} 2023 John Wiley and Sons Inc. All rights reserved.",

year = "2023",

month = jul,

day = "1",

doi = "10.1097/HEP.0000000000000364",

language = "English",

volume = "78",

pages = "258--271",

journal = "Hepatology (Baltimore, Md.)",

issn = "0270-9139",

publisher = "John Wiley and Sons Ltd",

number = "1",

}

Lee, J, Westphal, M, Vali, Y, Boursier, J, Petta, S, Ostroff, R, Alexander, L, Chen, Y, Fournier, C, Geier, A, Francque, S, Wonders, K, Tiniakos, D, Bedossa, P, Allison, M, Papatheodoridis, G, Cortez-Pinto, H, Pais, R, Dufour, J-F, Leeming, DJ, Harrison, S, Cobbold, J, Holleboom, AG, Yki-Järvinen, H, Crespo, J, Ekstedt, M, Aithal, GP, Bugianesi, E, Romero-Gomez, M, Torstenson, R, Karsdal, M, Yunis, C, Schattenberg, JM, Schuppan, D, Ratziu, V, Brass, C, Duffin, K, Zwinderman, K, Pavlides, M, Anstee, QM & Bossuyt, PM 2023, 'Machine learning algorithm improves the detection of NASH (NAS-based) and at-risk NASH: A development and validation study', Hepatology (Baltimore, Md.), vol. 78, no. 1, pp. 258-271. https://doi.org/10.1097/HEP.0000000000000364

TY - JOUR

T1 - Machine learning algorithm improves the detection of NASH (NAS-based) and at-risk NASH

T2 - A development and validation study

AU - Lee, Jenny

AU - Westphal, Max

AU - Vali, Yasaman

AU - Boursier, Jerome

AU - Petta, Salvatorre

AU - Ostroff, Rachel

AU - Alexander, Leigh

AU - Chen, Yu

AU - Fournier, Celine

AU - Geier, Andreas

AU - Francque, Sven

AU - Wonders, Kristy

AU - Tiniakos, Dina

AU - Bedossa, Pierre

AU - Allison, Mike

AU - Papatheodoridis, Georgios

AU - Cortez-Pinto, Helena

AU - Pais, Raluca

AU - Dufour, Jean-Francois

AU - Leeming, Diana Julie

AU - Harrison, Stephen

AU - Cobbold, Jeremy

AU - Holleboom, Adriaan G.

AU - Yki-Järvinen, Hannele

AU - Crespo, Javier

AU - Ekstedt, Mattias

AU - Aithal, Guruprasad P.

AU - Bugianesi, Elisabetta

AU - Romero-Gomez, Manuel

AU - Torstenson, Richard

AU - Karsdal, Morten

AU - Yunis, Carla

AU - Schattenberg, Jörn M.

AU - Schuppan, Detlef

AU - Ratziu, Vlad

AU - Brass, Clifford

AU - Duffin, Kevin

AU - Zwinderman, Koos

AU - Pavlides, Michael

AU - Anstee, Quentin M.

AU - Bossuyt, Patrick M.

N1 - Funding Information: The LITMUS project has received funding from the Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No. 777377. This Joint Undertaking receives support from the European Union’s Horizon 2020 research and innovation program and EFPIA. Sven Francque holds a senior clinical investigator fellowship from the Research Foundation Flanders (FWO) (1802154N). Publisher Copyright: © 2023 John Wiley and Sons Inc. All rights reserved.

PY - 2023/7/1

Y1 - 2023/7/1

N2 - Background and Aims: Detecting NASH remains challenging, while at-risk NASH (steatohepatitis and F≥ 2) tends to progress and is of interest for drug development and clinical application. We developed prediction models by supervised machine learning techniques, with clinical data and biomarkers to stage and grade patients with NAFLD. Approach and Results: Learning data were collected in the Liver Investigation: Testing Marker Utility in Steatohepatitis metacohort (966 biopsy-proven NAFLD adults), staged and graded according to NASH CRN. Conditions of interest were the clinical trial definition of NASH (NAS ≥ 4;53%), at-risk NASH (NASH with F ≥ 2;35%), significant (F ≥ 2;47%), and advanced fibrosis (F ≥ 3;28%). Thirty-five predictors were included. Missing data were handled by multiple imputations. Data were randomly split into training/validation (75/25) sets. A gradient boosting machine was applied to develop 2 models for each condition: clinical versus extended (clinical and biomarkers). Two variants of the NASH and at-risk NASH models were constructed: direct and composite models. Clinical gradient boosting machine models for steatosis/inflammation/ballooning had AUCs of 0.94/0.79/0.72. There were no improvements when biomarkers were included. The direct NASH model produced AUCs (clinical/extended) of 0.61/0.65. The composite NASH model performed significantly better (0.71) for both variants. The composite at-risk NASH model had an AUC of 0.83 (clinical and extended), an improvement over the direct model. Significant fibrosis models had AUCs (clinical/extended) of 0.76/0.78. The extended advanced fibrosis model (0.86) performed significantly better than the clinical version (0.82). Conclusions: Detection of NASH and at-risk NASH can be improved by constructing independent machine learning models for each component, using only clinical predictors. Adding biomarkers only improved the accuracy of fibrosis.

AB - Background and Aims: Detecting NASH remains challenging, while at-risk NASH (steatohepatitis and F≥ 2) tends to progress and is of interest for drug development and clinical application. We developed prediction models by supervised machine learning techniques, with clinical data and biomarkers to stage and grade patients with NAFLD. Approach and Results: Learning data were collected in the Liver Investigation: Testing Marker Utility in Steatohepatitis metacohort (966 biopsy-proven NAFLD adults), staged and graded according to NASH CRN. Conditions of interest were the clinical trial definition of NASH (NAS ≥ 4;53%), at-risk NASH (NASH with F ≥ 2;35%), significant (F ≥ 2;47%), and advanced fibrosis (F ≥ 3;28%). Thirty-five predictors were included. Missing data were handled by multiple imputations. Data were randomly split into training/validation (75/25) sets. A gradient boosting machine was applied to develop 2 models for each condition: clinical versus extended (clinical and biomarkers). Two variants of the NASH and at-risk NASH models were constructed: direct and composite models. Clinical gradient boosting machine models for steatosis/inflammation/ballooning had AUCs of 0.94/0.79/0.72. There were no improvements when biomarkers were included. The direct NASH model produced AUCs (clinical/extended) of 0.61/0.65. The composite NASH model performed significantly better (0.71) for both variants. The composite at-risk NASH model had an AUC of 0.83 (clinical and extended), an improvement over the direct model. Significant fibrosis models had AUCs (clinical/extended) of 0.76/0.78. The extended advanced fibrosis model (0.86) performed significantly better than the clinical version (0.82). Conclusions: Detection of NASH and at-risk NASH can be improved by constructing independent machine learning models for each component, using only clinical predictors. Adding biomarkers only improved the accuracy of fibrosis.

UR - http://www.scopus.com/inward/record.url?scp=85163490526&partnerID=8YFLogxK

U2 - https://doi.org/10.1097/HEP.0000000000000364

DO - https://doi.org/10.1097/HEP.0000000000000364

M3 - Article

C2 - 36994719

SN - 0270-9139

VL - 78

SP - 258

EP - 271

JO - Hepatology (Baltimore, Md.)

JF - Hepatology (Baltimore, Md.)

IS - 1

ER -
