Measuring the performance of prediction models to personalize treatment choice

Orestis Efthimiou; Jeroen Hoogland; Thomas P. A. Debray; Michael Seo; Toshiaki A. Furukawa; Matthias Egger; Ian R. White

doi:https://doi.org/10.1002/sim.9665

Research output: Contribution to journal › Article › Academic › peer-review

7 Citations (Scopus)

Abstract

When data are available from individual patients receiving either a treatment or a control intervention in a randomized trial, various statistical and machine learning methods can be used to develop models for predicting future outcomes under the two conditions, and thus to predict treatment effect at the patient level. These predictions can subsequently guide personalized treatment choices. Although several methods for validating prediction models are available, little attention has been given to measuring the performance of predictions of personalized treatment effect. In this article, we propose a range of measures that can be used to this end. We start by defining two dimensions of model accuracy for treatment effects, for a single outcome: discrimination for benefit and calibration for benefit. We then amalgamate these two dimensions into an additional concept, decision accuracy, which quantifies the model's ability to identify patients for whom the benefit from treatment exceeds a given threshold. Subsequently, we propose a series of performance measures related to these dimensions and discuss estimating procedures, focusing on randomized data. Our methods are applicable for continuous or binary outcomes, for any type of prediction model, as long as it uses baseline covariates to predict outcomes under treatment and control. We illustrate all methods using two simulated datasets and a real dataset from a trial in depression. We implement all methods in the R package predieval. Results suggest that the proposed measures can be useful in evaluating and comparing the performance of competing models in predicting individualized treatment effect.

Original language	English
Pages (from-to)	1188-1206
Number of pages	19
Journal	Statistics in Medicine
Volume	42
Issue number	8
Early online date	2023
DOIs	https://doi.org/10.1002/sim.9665
Publication status	Published - 15 Apr 2023

Keywords

heterogeneous treatment effects
personalized medicine
prediction modelling

Access to Document

https://doi.org/10.1002/sim.9665

Cite this

@article{dd69e8fa611c4a18b74482074792a0db,

title = "Measuring the performance of prediction models to personalize treatment choice",

abstract = "When data are available from individual patients receiving either a treatment or a control intervention in a randomized trial, various statistical and machine learning methods can be used to develop models for predicting future outcomes under the two conditions, and thus to predict treatment effect at the patient level. These predictions can subsequently guide personalized treatment choices. Although several methods for validating prediction models are available, little attention has been given to measuring the performance of predictions of personalized treatment effect. In this article, we propose a range of measures that can be used to this end. We start by defining two dimensions of model accuracy for treatment effects, for a single outcome: discrimination for benefit and calibration for benefit. We then amalgamate these two dimensions into an additional concept, decision accuracy, which quantifies the model's ability to identify patients for whom the benefit from treatment exceeds a given threshold. Subsequently, we propose a series of performance measures related to these dimensions and discuss estimating procedures, focusing on randomized data. Our methods are applicable for continuous or binary outcomes, for any type of prediction model, as long as it uses baseline covariates to predict outcomes under treatment and control. We illustrate all methods using two simulated datasets and a real dataset from a trial in depression. We implement all methods in the R package predieval. Results suggest that the proposed measures can be useful in evaluating and comparing the performance of competing models in predicting individualized treatment effect.",

keywords = "heterogeneous treatment effects, personalized medicine, prediction modelling",

author = "Efthimiou, Orestis and Hoogland, Jeroen and Debray, Thomas P. A. and Seo, Michael and Furukawa, Toshiaki A. and Egger, Matthias and White, Ian R.",

note = "Funding Information: OE, MS and ME were supported by the Swiss National Science Foundation (Ambizione grant number 180083, special project funding 189498). IW was supported by the Medical Research Council Programme MC_UU_00004/07. TD is supported by the European Union's Horizon 2020 research and innovation programme under ReCoDID grant agreement no. 825746. JH is supported by ZonMw (grant 91215058). Publisher Copyright: {\textcopyright} 2023 The Authors. Statistics in Medicine published by John Wiley \& Sons Ltd.",

year = "2023",

month = apr,

day = "15",

doi = "10.1002/sim.9665",

language = "English",

volume = "42",

pages = "1188--1206",

journal = "Statistics in Medicine",

issn = "0277-6715",

publisher = "John Wiley and Sons Ltd",

number = "8",

}

TY - JOUR

T1 - Measuring the performance of prediction models to personalize treatment choice

AU - Efthimiou, Orestis

AU - Hoogland, Jeroen

AU - Debray, Thomas P. A.

AU - Seo, Michael

AU - Furukawa, Toshiaki A.

AU - Egger, Matthias

AU - White, Ian R.

N1 - Funding Information: OE, MS and ME were supported by the Swiss National Science Foundation (Ambizione grant number 180083, special project funding 189498). IW was supported by the Medical Research Council Programme MC_UU_00004/07. TD is supported by the European Union's Horizon 2020 research and innovation programme under ReCoDID grant agreement no. 825746. JH is supported by ZonMw (grant 91215058). Publisher Copyright: © 2023 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

PY - 2023/4/15

Y1 - 2023/4/15

N2 - When data are available from individual patients receiving either a treatment or a control intervention in a randomized trial, various statistical and machine learning methods can be used to develop models for predicting future outcomes under the two conditions, and thus to predict treatment effect at the patient level. These predictions can subsequently guide personalized treatment choices. Although several methods for validating prediction models are available, little attention has been given to measuring the performance of predictions of personalized treatment effect. In this article, we propose a range of measures that can be used to this end. We start by defining two dimensions of model accuracy for treatment effects, for a single outcome: discrimination for benefit and calibration for benefit. We then amalgamate these two dimensions into an additional concept, decision accuracy, which quantifies the model's ability to identify patients for whom the benefit from treatment exceeds a given threshold. Subsequently, we propose a series of performance measures related to these dimensions and discuss estimating procedures, focusing on randomized data. Our methods are applicable for continuous or binary outcomes, for any type of prediction model, as long as it uses baseline covariates to predict outcomes under treatment and control. We illustrate all methods using two simulated datasets and a real dataset from a trial in depression. We implement all methods in the R package predieval. Results suggest that the proposed measures can be useful in evaluating and comparing the performance of competing models in predicting individualized treatment effect.

KW - heterogeneous treatment effects

KW - personalized medicine

KW - prediction modelling

UR - http://www.scopus.com/inward/record.url?scp=85147289187&partnerID=8YFLogxK

U2 - https://doi.org/10.1002/sim.9665

DO - 10.1002/sim.9665

M3 - Article

C2 - 36700492

SN - 0277-6715

VL - 42

SP - 1188

EP - 1206

JO - Statistics in Medicine

JF - Statistics in Medicine

IS - 8

ER -
