-
When accurate prediction models yield harmful self-fulfilling prophecies
Authors:
Wouter A. C. van Amsterdam,
Nan van Geloven,
Jesse H. Krijthe,
Rajesh Ranganath,
Giovanni Cinà
Abstract:
Objective: Prediction models are popular in medical research and practice. By predicting an outcome of interest for specific patients, these models may help inform difficult treatment decisions, and are often hailed as the poster children for personalized, data-driven healthcare. Many prediction models are deployed for decision support based on their prediction accuracy in validation studies. We investigate whether this is a safe and valid approach.
Materials and Methods: We show that using prediction models for decision making can lead to harmful decisions, even when the predictions exhibit good discrimination after deployment. These models are harmful self-fulfilling prophecies: their deployment harms a group of patients but the worse outcome of these patients does not invalidate the predictive power of the model.
Results: Our main result is a formal characterization of a set of such prediction models. Next, we show that models that are well calibrated before and after deployment are useless for decision making because they make no change in the data distribution.
Discussion: Our results point to the need to revise standard practices for validation, deployment and evaluation of prediction models that are used in medical decisions.
Conclusion: Outcome prediction models can yield harmful self-fulfilling prophecies when used for decision making. A new perspective on prediction model development, deployment and monitoring is needed.
Submitted 8 February, 2024; v1 submitted 2 December, 2023;
originally announced December 2023.
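A toy sketch of the mechanism described in the abstract above, not the paper's formal characterization. All numbers are hypothetical: a two-valued risk score is developed while every patient is treated, and after deployment treatment is withheld from the high-risk group, which raises that group's event rate. The helper `auc_binary_score` is an illustrative name and computes the AUC of a two-valued score from group-level event rates.

```python
def auc_binary_score(risk_low, risk_high, frac_high=0.5):
    """AUC of a two-valued risk score, given the event rate in each group."""
    # Probability mass of events and non-events per score group.
    ev_low = (1 - frac_high) * risk_low
    ev_high = frac_high * risk_high
    ne_low = (1 - frac_high) * (1 - risk_low)
    ne_high = frac_high * (1 - risk_high)
    # Concordant pairs: event in the high-score group, non-event in the low one.
    concordant = ev_high * ne_low
    # Tied pairs: event and non-event share the same score.
    ties = ev_high * ne_high + ev_low * ne_low
    total_pairs = (ev_low + ev_high) * (ne_low + ne_high)
    return (concordant + 0.5 * ties) / total_pairs


# Hypothetical event rates (assumptions for illustration only).
RISK_LOW = 0.10             # low-risk group, treated before and after deployment
RISK_HIGH_TREATED = 0.40    # high-risk group under the historical policy
RISK_HIGH_UNTREATED = 0.70  # high-risk group once treatment is withheld

auc_before = auc_binary_score(RISK_LOW, RISK_HIGH_TREATED)
auc_after = auc_binary_score(RISK_LOW, RISK_HIGH_UNTREATED)

print(f"AUC before deployment: {auc_before:.3f}")
print(f"AUC after deployment:  {auc_after:.3f}")
print(f"High-risk event rate:  {RISK_HIGH_TREATED} -> {RISK_HIGH_UNTREATED}")
```

In this sketch, discrimination improves after deployment precisely because the high-risk group was harmed, so a post-deployment AUC check alone would not flag the problem.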
-
From algorithms to action: improving patient care requires causality
Authors:
Wouter A. C. van Amsterdam,
Pim A. de Jong,
Joost J. C. Verhoeff,
Tim Leiner,
Rajesh Ranganath
Abstract:
In cancer research there is much interest in building and validating outcome prediction models to support treatment decisions. However, because most outcome prediction models are developed and validated without regard to the causal aspects of treatment decision making, many published outcome prediction models may cause harm when used for decision making, despite being found accurate in validation studies. Guidelines on prediction model validation and the checklist for risk model endorsement by the American Joint Committee on Cancer do not protect against prediction models that are accurate during development and validation but harmful when used for decision making. We explain why this is the case and how to build and validate models that are useful for decision making.
Submitted 1 April, 2024; v1 submitted 15 September, 2022;
originally announced September 2022.
-
Conditional average treatment effect estimation with marginally constrained models
Authors:
Wouter A. C. van Amsterdam,
Rajesh Ranganath
Abstract:
Treatment effect estimates are often available from randomized controlled trials as a single average treatment effect for a certain patient population. Estimates of the conditional average treatment effect (CATE) are more useful for individualized treatment decision making, but randomized trials are often too small to estimate the CATE. Examples in the medical literature make use of the relative treatment effect (e.g. an odds-ratio) reported by randomized trials to estimate the CATE using large observational datasets. One approach to estimating these CATE models is to use the relative treatment effect as an offset while estimating the covariate-specific untreated risk. We observe that the odds-ratios reported in randomized controlled trials are not the odds-ratios that are needed in offset models because trials often report the marginal odds-ratio. We introduce a constraint or regularizer to better use marginal odds-ratios from randomized controlled trials and find that under the standard observational causal inference assumptions this approach provides a consistent estimate of the CATE. Next, we show that the offset approach is not valid for CATE estimation in the presence of unobserved confounding. We study whether the offset assumption and the marginal constraint lead to better approximations of the CATE relative to the alternative of using the average treatment effect estimate from the randomized trial. We empirically show that when the underlying CATE has sufficient variation, the constraint and offset approaches lead to closer approximations to the CATE.
Submitted 23 July, 2023; v1 submitted 29 April, 2022;
originally announced April 2022.
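The abstract's point that a marginal odds-ratio is not the odds-ratio an offset model needs can be illustrated with a small numeric sketch. This is an illustration of odds-ratio non-collapsibility under assumed numbers, not the paper's constraint or regularizer: treatment is taken to be randomized, the conditional odds-ratio is fixed at 3 in every covariate stratum, and the two baseline logits and their weights are hypothetical.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    return p / (1.0 - p)


def marginal_odds_ratio(baseline_logits, weights, log_conditional_or):
    """Marginal odds-ratio when every stratum shares one conditional odds-ratio.

    Assumes randomized treatment, so the covariate distribution (weights)
    is the same in both arms.
    """
    p_untreated = sum(w * expit(b) for b, w in zip(baseline_logits, weights))
    p_treated = sum(
        w * expit(b + log_conditional_or)
        for b, w in zip(baseline_logits, weights)
    )
    return odds(p_treated) / odds(p_untreated)


conditional_or = 3.0
# Two equally sized strata with very different untreated risk (hypothetical).
marginal_or = marginal_odds_ratio(
    baseline_logits=[-2.0, 2.0],
    weights=[0.5, 0.5],
    log_conditional_or=math.log(conditional_or),
)

print(f"conditional odds-ratio: {conditional_or:.2f}")
print(f"marginal odds-ratio:    {marginal_or:.2f}")
```

Even without confounding, the marginal odds-ratio here is closer to 1 than the conditional one, so plugging a trial's marginal odds-ratio directly into a covariate-conditional offset would misstate the effect in this toy setting.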
-
Controlling for Biasing Signals in Images for Prognostic Models: Survival Predictions for Lung Cancer with Deep Learning
Authors:
Wouter A. C. van Amsterdam,
Marinus J. C. Eijkemans
Abstract:
Deep learning has shown remarkable results for image analysis and is expected to aid individual treatment decisions in health care. To achieve this, deep learning methods need to be promoted from the level of mere associations to being able to answer causal questions. We present a scenario with real-world medical images (CT-scans of lung cancers) and simulated outcome data. Through the sampling scheme, the images contain two distinct factors of variation that represent a collider and a prognostic factor. We show that when this collider can be quantified, unbiased individual prognosis predictions are attainable with deep learning. This is achieved by (1) setting a dual task for the network to predict both the outcome and the collider and (2) enforcing independence of the activation distributions of the last layer with ordinary least squares. Our method provides an example of combining deep learning and structural causal models for unbiased individual prognosis predictions.
Submitted 1 April, 2019;
originally announced April 2019.
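To see why a collider can act as a biasing signal, the following generic simulation (a sketch under assumed distributions, not the paper's image data or its dual-task network) draws two independent variables, builds a collider from their sum plus noise, and compares their correlation in the full sample with the correlation among cases selected on the collider. The variable names and the selection threshold are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 20000

# Two independent causes (for instance, a prognostic factor and another cause).
prognostic = [rng.gauss(0, 1) for _ in range(n)]
other_cause = [rng.gauss(0, 1) for _ in range(n)]

# The collider is a common effect of both causes, plus noise.
collider = [p + o + rng.gauss(0, 1) for p, o in zip(prognostic, other_cause)]

# Conditioning on the collider: keep only cases with a high collider value.
selected = [i for i in range(n) if collider[i] > 1.0]

r_all = pearson(prognostic, other_cause)
r_selected = pearson(
    [prognostic[i] for i in selected],
    [other_cause[i] for i in selected],
)

print(f"correlation, full sample:     {r_all:.3f}")
print(f"correlation, selected subset: {r_selected:.3f}")
```

The two causes are uncorrelated overall but become negatively correlated once the sample is conditioned on the collider, which is the kind of spurious association a prognostic model can pick up if the collider is not accounted for.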