Search | arXiv e-print repository

The risks of risk assessment: causal blind spots when using prediction models for treatment decisions

Authors: Nan van Geloven, Ruth H Keogh, Wouter van Amsterdam, Giovanni Cinà, Jesse H. Krijthe, Niels Peek, Kim Luijken, Sara Magliacane, Paweł Morzywołek, Thijs van Ommen, Hein Putter, Matthew Sperrin, Junfeng Wang, Daniala L. Weir, Vanessa Didelez

Abstract: Prediction models are increasingly proposed for guiding treatment decisions, but most fail to address the special role of treatments, leading to inappropriate use. This paper highlights the limitations of using standard prediction models for treatment decision support. We identify 'causal blind spots' in three common approaches to handling treatments in prediction modelling: including treatment as a predictor, restricting data based on treatment status and ignoring treatments. When predictions are used to inform treatment decisions, confounders, colliders and mediators, as well as changes in treatment protocols over time may lead to misinformed decision-making. We illustrate potential harmful consequences in several medical applications. We advocate for an extension of guidelines for development, reporting and evaluation of prediction models to ensure that the intended use of the model is matched to an appropriate risk estimand. When prediction models are intended to inform treatment decisions, prediction models should specify upfront the treatment decisions they aim to support and target a prediction estimand in line with that goal. This requires a shift towards developing predictions under the specific treatment options under consideration ('predictions under interventions'). Predictions under interventions need causal reasoning and inference techniques during development and validation. We argue that this will improve the efficacy of prediction models in guiding treatment decisions and prevent potential negative effects on patient outcomes.

Submitted 6 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

arXiv:2311.17547 [pdf, other]

Risk-based decision making: estimands for sequential prediction under interventions

Authors: Kim Luijken, Paweł Morzywołek, Wouter van Amsterdam, Giovanni Cinà, Jeroen Hoogland, Ruth Keogh, Jesse Krijthe, Sara Magliacane, Thijs van Ommen, Niels Peek, Hein Putter, Maarten van Smeden, Matthew Sperrin, Junfeng Wang, Daniala Weir, Vanessa Didelez, Nan van Geloven

Abstract: Prediction models are used amongst others to inform medical decisions on interventions. Typically, individuals with high risks of adverse outcomes are advised to undergo an intervention while those at low risk are advised to refrain from it. Standard prediction models do not always provide risks that are relevant to inform such decisions: e.g., an individual may be estimated to be at low risk because similar individuals in the past received an intervention which lowered their risk. Therefore, prediction models supporting decisions should target risks belonging to defined intervention strategies. Previous works on prediction under interventions assumed that the prediction model was used only at one time point to make an intervention decision. In clinical practice, intervention decisions are rarely made only once: they might be repeated, deferred and re-evaluated. This requires estimated risks under interventions that can be reconsidered at several potential decision moments. In the current work, we highlight key considerations for formulating estimands in sequential prediction under interventions that can inform such intervention decisions. We illustrate these considerations by giving examples of estimands for a case study about choosing between vaginal delivery and cesarean section for women giving birth. Our formalization of prediction tasks in a sequential, causal, and estimand context provides guidance for future studies to ensure that the right question is answered and appropriate causal estimation approaches are chosen to develop sequential prediction models that can inform intervention decisions.

Submitted 29 November, 2023; originally announced November 2023.

Comments: 32 pages, 2 figures

arXiv:2311.10856 [pdf]

Exploring the Consistency, Quality and Challenges in Manual and Automated Coding of Free-text Diagnoses from Hospital Outpatient Letters

Authors: Warren Del-Pinto, George Demetriou, Meghna Jani, Rikesh Patel, Leanne Gray, Alex Bulcock, Niels Peek, Andrew S. Kanter, William G Dixon, Goran Nenadic

Abstract: Coding of unstructured clinical free-text to produce interoperable structured data is essential to improve direct care, support clinical communication and to enable clinical research. However, manual clinical coding is difficult and time consuming, which motivates the development and use of natural language processing for automated coding. This work evaluates the quality and consistency of both manual and automated clinical coding of diagnoses from hospital outpatient letters. Using 100 randomly selected letters, two human clinicians performed coding of diagnosis lists to SNOMED CT. Automated coding was also performed using IMO's Concept Tagger. A gold standard was constructed by a panel of clinicians from a subset of the annotated diagnoses. This was used to evaluate the quality and consistency of both manual and automated coding via (1) a distance-based metric, treating SNOMED CT as a graph, and (2) a qualitative metric agreed upon by the panel of clinicians. Correlation between the two metrics was also evaluated. Comparing human and computer-generated codes to the gold standard, the results indicate that humans slightly out-performed automated coding, while both performed notably better when there was only a single diagnosis contained in the free-text description. Automated coding was considered acceptable by the panel of clinicians in approximately 90% of cases.

Submitted 17 November, 2023; originally announced November 2023.

arXiv:2308.13394 [pdf]

Calibration plots for multistate risk predictions models: an overview and simulation comparing novel approaches

Authors: Alexander Pate, Matthew Sperrin, Richard D. Riley, Niels Peek, Tjeerd Van Staa, Jamie C. Sergeant, Mamas A. Mamas, Gregory Y. H. Lip, Martin O Flaherty, Michael Barrowman, Iain Buchan, Glen P. Martin

Abstract: Introduction. There is currently no guidance on how to assess the calibration of multistate models used for risk prediction. We introduce several techniques that can be used to produce calibration plots for the transition probabilities of a multistate model, before assessing their performance in the presence of non-informative and informative censoring through a simulation. Methods. We studied pseudo-values based on the Aalen-Johansen estimator, binary logistic regression with inverse probability of censoring weights (BLR-IPCW), and multinomial logistic regression with inverse probability of censoring weights (MLR-IPCW). The MLR-IPCW approach results in a calibration scatter plot, providing extra insight about the calibration. We simulated data with varying levels of censoring and evaluated the ability of each method to estimate the calibration curve for a set of predicted transition probabilities. We also developed and evaluated the calibration of a model predicting the incidence of cardiovascular disease, type 2 diabetes and chronic kidney disease among a cohort of patients derived from linked primary and secondary healthcare records. Results. The pseudo-value, BLR-IPCW and MLR-IPCW approaches give unbiased estimates of the calibration curves under non-informative censoring. These methods remained unbiased in the presence of informative censoring, unless the mechanism was strongly informative, with bias concentrated in the areas of predicted transition probabilities of low density. Conclusions. We recommend implementing either the pseudo-value or BLR-IPCW approaches to produce a calibration curve, combined with the MLR-IPCW approach to produce a calibration scatter plot, which provides additional information over either of the other methods.

Submitted 25 August, 2023; originally announced August 2023.

Comments: Pre-print for article currently under review

arXiv:2206.12295 [pdf]

Imputation and Missing Indicators for handling missing data in the development and implementation of clinical prediction models: a simulation study

Authors: Rose Sisk, Matthew Sperrin, Niels Peek, Maarten van Smeden, Glen P. Martin

Abstract: Background: Existing guidelines for handling missing data are generally not consistent with the goals of prediction modelling, where missing data can occur at any stage of the model pipeline. Multiple imputation (MI), often heralded as the gold standard approach, can be challenging to apply in the clinic. Clearly, the outcome cannot be used to impute data at prediction time. Regression imputation (RI) may offer a pragmatic alternative in the prediction context, that is simpler to apply in the clinic. Moreover, the use of missing indicators can handle informative missingness, but it is currently unknown how well they perform within CPMs. Methods: We performed a simulation study where data were generated under various missing data mechanisms to compare the predictive performance of CPMs developed using both imputation methods. We consider deployment scenarios where missing data is permitted/prohibited, and develop models that use/omit the outcome during imputation and include/omit missing indicators. Results: When complete data must be available at deployment, our findings were in line with widely used recommendations; that the outcome should be used to impute development data under MI, yet omitted under RI. When imputation is applied at deployment, omitting the outcome from the imputation at development was preferred. Missing indicators improved model performance in some specific cases, but can be harmful when missingness is dependent on the outcome. Conclusion: We provide evidence that commonly taught principles of handling missing data via MI may not apply to CPMs, particularly when data can be missing at deployment. In such settings, RI and missing indicator methods can (marginally) outperform MI. As shown, the performance of the missing data handling method must be evaluated on a study-by-study basis, and should be based on whether missing data are allowed at deployment.

Submitted 24 June, 2022; originally announced June 2022.

Comments: 42 pages. Submitted to Statistical Methods in Medical Research in October 2021

arXiv:2205.13481 [pdf, other]

DeepJoint: Robust Survival Modelling Under Clinical Presence Shift

Authors: Vincent Jeanselme, Glen Martin, Niels Peek, Matthew Sperrin, Brian Tom, Jessica Barrett

Abstract: Observational data in medicine arise as a result of the complex interaction between patients and the healthcare system. The sampling process is often highly irregular and itself constitutes an informative process. When using such data to develop prediction models, this phenomenon is often ignored, leading to sub-optimal performance and generalisability of models when practices evolve. We propose a multi-task recurrent neural network which models three clinical presence dimensions -- namely the longitudinal, the inter-observation and the missingness processes -- in parallel to the survival outcome. On a prediction task using MIMIC III laboratory tests, explicit modelling of these three processes showed improved performance in comparison to state-of-the-art predictive models (C-index at 1 day horizon: 0.878). More importantly, the proposed approach was more robust to change in the clinical presence setting, demonstrated by performance comparison between patients admitted on weekdays and weekends. This analysis demonstrates the importance of studying and leveraging clinical presence to improve performance and create more transportable clinical models.

Submitted 26 May, 2022; originally announced May 2022.

arXiv:2205.00323 [pdf, other]

p5.fab: Direct Control of Digital Fabrication Machines from a Creative Coding Environment

Authors: Blair Subbaraman, Nadya Peek

Abstract: Machine settings and tuning are critical for digital fabrication outcomes. However, exploring these parameters is non-trivial. We seek to enable exploration of the full design space of digital fabrication. To identify where we might intervene, we studied how practitioners approach 3D printing. We found that beyond using CAD/CAM, they create bespoke routines and workflows to explore interdependent material and machine settings. We seek to provide a system that supports this workflow development. We identified design goals around material exploration, fine-tuned control, and iteration. Based on these, we present p5.fab, a system for controlling digital fabrication machines from the creative coding environment p5.js. We demonstrate p5.fab with examples of 3D prints that cannot be made with traditional 3D printing software. We evaluate p5.fab in workshops and find that it encourages novel printing workflows and artifacts. Finally, we discuss implications for future digital fabrication systems.

Submitted 30 April, 2022; originally announced May 2022.

Comments: Submitted to DIS 2022, 12 pages plus references

arXiv:2205.00079 [pdf, other]

"Short on time and big on ideas": Perspectives from Lab Members on DIYBio Work in Community Biolabs

Authors: Orlando de Lange, Kellie Dunn, Nadya Peek

Abstract: DIYbio challenges the status quo by positioning laboratory biology work outside of traditional institutions. HCI has increasingly explored the DIYbio movement, but we lack insight into sites of practice such as community biolabs. Therefore, we gathered data on eleven community biolabs by interviewing sixteen lab managers and members. These labs represent half of identified organizations in scope worldwide. Participants detailed their practices and motivations, outlining the constraints and opportunities of their community biolabs. We found that lab members conducted technically challenging project work with access to high-end equipment and professional expertise. We found that the unique nature of biowork exacerbated challenges for cooperative work, partially due to the particular time sensitivities of work with living organisms. Building on our findings, we discuss how community biolab members are creating new approaches to laboratory biology and how this has design implications for systems that support non-traditional settings for scientific practice.

Submitted 29 April, 2022; originally announced May 2022.

Comments: Submitted to ACM DIS 2022, 17 pages plus references

arXiv:2106.07722 [pdf]

EPICURE Ensemble Pretrained Models for Extracting Cancer Mutations from Literature

Authors: Jiarun Cao, Elke M van Veen, Niels Peek, Andrew G Renehan, Sophia Ananiadou

Abstract: To interpret the genetic profile present in a patient sample, it is necessary to know which mutations have important roles in the development of the corresponding cancer type. Named entity recognition (NER) is a core step in the text mining pipeline which facilitates mining valuable cancer information from the scientific literature. However, due to the scarcity of related datasets, previous NER attempts in this domain either suffer from low performance when deep-learning-based models are deployed, or apply feature-based machine learning models or rule-based models to tackle this problem, which requires intensive effort from domain experts and limits the models' generalization capability. In this paper, we propose EPICURE, an ensemble pre-trained model equipped with a conditional random field pattern layer and a span prediction pattern layer to extract cancer mutations from text. We also adopt a data augmentation strategy to expand our training set from multiple datasets. Experimental results on three benchmark datasets show competitive results compared to the baseline models.

Submitted 11 June, 2021; originally announced June 2021.

arXiv:2101.11054 [pdf, other]

Remote Learners, Home Makers: How Digital Fabrication Was Taught Online During a Pandemic

Authors: Gabrielle Benabdallah, Samuelle Bourgault, Nadya Peek, Jennifer Jacobs

Abstract: Digital fabrication courses that relied on physical makerspaces were severely disrupted by COVID-19. As universities shut down in Spring 2020, instructors developed new models for digital fabrication at a distance. Through interviews with faculty and students and examination of course materials, we recount the experiences of eight remote digital fabrication courses. We found that learning with hobbyist equipment and online social networks could emulate using industrial equipment in shared workshops. Furthermore, at-home digital fabrication offered unique learning opportunities including more iteration, machine tuning, and maintenance. These opportunities depended on new forms of labor and varied based on student living situations. Our findings have implications for remote and in-person digital fabrication instruction. They indicate how access to tools was important, but not as critical as providing opportunities for iteration; they show how remote fabrication exacerbated student inequities; and they suggest strategies for evaluating trade-offs in remote fabrication models with respect to learning objectives.

Submitted 28 January, 2021; v1 submitted 26 January, 2021; originally announced January 2021.

Comments: to be published at CHI 2021

arXiv:2011.09815 [pdf]

A scoping review of causal methods enabling predictions under hypothetical interventions

Authors: Lijing Lin, Matthew Sperrin, David A. Jenkins, Glen P. Martin, Niels Peek

Abstract: Background and Aims: The methods with which prediction models are usually developed mean that neither the parameters nor the predictions should be interpreted causally. However, when prediction models are used to support decision making, there is often a need for predicting outcomes under hypothetical interventions. We aimed to identify published methods for developing and validating prediction models that enable risk estimation of outcomes under hypothetical interventions, utilizing causal inference: their main methodological approaches, underlying assumptions, targeted estimands, and potential pitfalls and challenges with using the method, and unresolved methodological challenges. Methods: We systematically reviewed literature published by December 2019, considering papers in the health domain that used causal considerations to enable prediction models to be used for predictions under hypothetical interventions. Results: We identified 4919 papers through database searches and a further 115 papers through manual searches, of which 13 were selected for inclusion, from both the statistical and the machine learning literature. Most of the identified methods for causal inference from observational data were based on marginal structural models and g-estimation. Conclusions: There exist two broad methodological approaches for allowing prediction under hypothetical intervention into clinical prediction models: 1) enriching prediction models derived from observational studies with estimated causal effects from clinical trials and meta-analyses; and 2) estimating prediction models and causal effects directly from observational data. These methods require extending to dynamic treatment regimes, and consideration of multiple interventions to operationalise a clinical decision support system. Techniques for validating 'causal prediction models' are still in their infancy.

Submitted 12 January, 2021; v1 submitted 19 November, 2020; originally announced November 2020.

Journal ref: Diagnostic and Prognostic Research, 2021

arXiv:2001.08988 [pdf]

doi 10.1016/j.jclinepi.2020.07.014

Towards a Framework for the Design, Implementation and Reporting of Methodology Scoping Reviews

Authors: Glen P. Martin, David Jenkins, Lucy Bull, Rose Sisk, Lijing Lin, William Hulme, Anthony Wilson, Wenjuan Wang, Michael Barrowman, Camilla Sammut-Powell, Alexander Pate, Matthew Sperrin, Niels Peek

Abstract: Background: In view of the growth of published papers, there is an increasing need for studies that summarise scientific research. An increasingly common review is a 'Methodology scoping review', which provides a summary of existing analytical methods, techniques and software, proposed or applied in research articles, which address an analytical problem or further an analytical approach. However, guidelines for their design, implementation and reporting are limited. Methods: Drawing on the experiences of the authors, which were consolidated through a series of face-to-face workshops, we summarise the challenges inherent in conducting a methodology scoping review and offer suggestions of best practice to promote future guideline development. Results: We identified three challenges of conducting a methodology scoping review. First, identification of search terms; one cannot usually define the search terms a priori and the language used for a particular method can vary across the literature. Second, the scope of the review requires careful consideration since new methodology is often not described (in full) within abstracts. Third, many new methods are motivated by a specific clinical question, where the methodology may only be documented in supplementary materials. We formulated several recommendations that build upon existing review guidelines. These recommendations ranged from an iterative approach to defining search terms through to screening and data extraction processes. Conclusion: Although methodology scoping reviews are an important aspect of research, there is currently a lack of guidelines to standardise their design, implementation and reporting. We recommend a wider discussion on this topic.

Submitted 16 January, 2020; originally announced January 2020.

Comments: 22 pages, 2 tables

Journal ref: Journal of Clinical Epidemiology. (2020)

arXiv:1709.06859 [pdf]

Using marginal structural models to adjust for treatment drop-in when developing clinical prediction models

Authors: Matthew Sperrin, Glen Martin, Tjeerd Van Staa, Niels Peek, Iain Buchan

Abstract: Objectives: Clinical prediction models (CPMs) can inform decision-making concerning treatment initiation. Here, one requires predicted risks assuming that no treatment is given. This is challenging since CPMs are often derived in datasets where patients receive treatment; moreover, treatment can commence post-baseline - treatment drop-ins. This study presents a novel approach of using marginal structural models (MSMs) to adjust for treatment drop-in. Study Design and Setting: We illustrate the use of MSMs in the CPM framework through simulation studies, representing randomised controlled trials and observational data. The simulations include a binary treatment and a covariate, each recorded at two timepoints and having a prognostic effect on a binary outcome. The bias in predicted risk was examined in a model ignoring treatment, a model fitted on treatment naïve patients (at baseline), a model including baseline treatment, and the MSM. Results: In all simulation scenarios, all models except the MSM under-estimated the risk of outcome given absence of treatment. Consequently, CPMs that do not acknowledge treatment drop-in can lead to under-allocation of treatment. Conclusion: When developing CPMs to predict treatment-naïve risk, authors should consider using MSMs to adjust for treatment drop-in. MSMs also allow estimation of individual treatment effects.

Submitted 20 September, 2017; originally announced September 2017.

Showing 1–13 of 13 results for author: Peek, N