Showing 1–11 of 11 results for author: Van Calster, B

  1. arXiv:2407.03379  [pdf, other]

    stat.ME cs.LG stat.ML

    missForestPredict -- Missing data imputation for prediction settings

    Authors: Elena Albu, Shan Gao, Laure Wynants, Ben Van Calster

    Abstract: Prediction models are used to predict an outcome based on input variables. Missing data in input variables often occurs at model development and at prediction time. The missForestPredict R package proposes an adaptation of the missForest imputation algorithm that is fast, user-friendly and tailored for prediction settings. The algorithm iteratively imputes variables using random forests until a co…

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2405.01986  [pdf]

    stat.AP

    A comparison of regression models for static and dynamic prediction of a prognostic outcome during admission in electronic health care records

    Authors: Shan Gao, Elena Albu, Hein Putter, Pieter Stijnen, Frank Rademakers, Veerle Cossey, Yves Debaveye, Christel Janssens, Ben Van Calster, Laure Wynants

    Abstract: Objective: Hospitals register information in the electronic health records (EHR) continuously until discharge or death. As such, there is no censoring for in-hospital outcomes. We aimed to compare different dynamic regression modeling approaches to predict central line-associated bloodstream infections (CLABSI) in EHR while accounting for competing events precluding CLABSI. Materials and Methods: We…

    Submitted 6 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: 3388 words; 3 figures; 4 tables

  3. arXiv:2404.19494  [pdf, other]

    stat.ME

    The harms of class imbalance corrections for machine learning based prediction models: a simulation study

    Authors: Alex Carriero, Kim Luijken, Anne de Hond, Karel GM Moons, Ben van Calster, Maarten van Smeden

    Abstract: Risk prediction models are increasingly used in healthcare to aid in clinical decision making. In most clinical contexts, model calibration (i.e., assessing the reliability of risk estimates) is critical. Data available for model development are often not perfectly balanced with respect to the modeled outcome (i.e., individuals with vs. without the event of interest are not equally represented in…

    Submitted 30 April, 2024; originally announced April 2024.

  4. arXiv:2404.16127  [pdf, other]

    cs.LG stat.ML

    Comparison of static and dynamic random forests models for EHR data in the presence of competing risks: predicting central line-associated bloodstream infection

    Authors: Elena Albu, Shan Gao, Pieter Stijnen, Frank Rademakers, Christel Janssens, Veerle Cossey, Yves Debaveye, Laure Wynants, Ben Van Calster

    Abstract: Prognostic outcomes related to hospital admissions typically do not suffer from censoring, and can be modeled either categorically or as time-to-event. Competing events are common but often ignored. We compared the performance of random forest (RF) models to predict the risk of central line-associated bloodstream infections (CLABSI) using different outcome operationalizations. We included data fro…

    Submitted 24 May, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  5. arXiv:2402.18612  [pdf]

    stat.ME cs.CY cs.LG

    Understanding random forests and overfitting: a visualization and simulation study

    Authors: Lasai Barreñada, Paula Dhiman, Dirk Timmerman, Anne-Laure Boulesteix, Ben Van Calster

    Abstract: Random forests have become popular for clinical risk prediction modelling. In a case study on predicting ovarian malignancy, we observed training c-statistics close to 1. Although this suggests overfitting, performance was competitive on test data. We aimed to understand the behaviour of random forests by (1) visualizing data space in three real world case studies and (2) a simulation study. For t…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 20 pages, 8 figures

  6. arXiv:2312.12008  [pdf]

    stat.ME stat.AP

    How to develop, externally validate, and update multinomial prediction models

    Authors: Celina K Gehringer, Glen P Martin, Ben Van Calster, Kimme L Hyrich, Suzanne M M Verstappen, Jamie C Sergeant

    Abstract: Multinomial prediction models (MPMs) have a range of potential applications across healthcare where the primary outcome of interest has multiple nominal or ordinal categories. However, the application of MPMs is scarce, which may be due to the added methodological complexities that they bring. This article provides a guide of how to develop, externally validate, and update MPMs. Using a previously…

    Submitted 20 December, 2023; v1 submitted 19 December, 2023; originally announced December 2023.

  7. arXiv:2207.12892  [pdf]

    stat.ME stat.AP

    Minimum Sample Size for Developing a Multivariable Prediction Model using Multinomial Logistic Regression

    Authors: Alexander Pate, Richard D Riley, Gary S Collins, Maarten van Smeden, Ben Van Calster, Joie Ensor, Glen P Martin

    Abstract: Multinomial logistic regression models allow one to predict the risk of a categorical outcome with more than 2 categories. When developing such a model, researchers should ensure the number of participants (n) is appropriate relative to the number of events (E.k) and the number of predictor parameters (p.k) for each category k. We propose three criteria to determine the minimum n required in light…

    Submitted 26 July, 2022; originally announced July 2022.

  8. arXiv:2202.09101  [pdf]

    stat.ME

    The harm of class imbalance corrections for risk prediction models: illustration and simulation using logistic regression

    Authors: Ruben van den Goorbergh, Maarten van Smeden, Dirk Timmerman, Ben Van Calster

    Abstract: Methods to correct class imbalance, i.e. imbalance between the frequency of outcome events and non-events, are receiving increasing interest for developing prediction models. We examined the effect of imbalance correction on the performance of standard and penalized (ridge) logistic regression models in terms of discrimination, calibration, and classification. We examined random undersampling, ran…

    Submitted 18 February, 2022; originally announced February 2022.

    Comments: Main paper 21 pages, Supplement 53 pages

  9. arXiv:2104.09282  [pdf]

    stat.ME

    Risk prediction models for discrete ordinal outcomes: calibration and the impact of the proportional odds assumption

    Authors: Michael Edlinger, Maarten van Smeden, Hannes F Alber, Maria Wanitschek, Ben Van Calster

    Abstract: Calibration is a vital aspect of the performance of risk prediction models, but research in the context of ordinal outcomes is scarce. This study compared calibration measures for risk models predicting a discrete ordinal outcome, and investigated the impact of the proportional odds assumption on calibration and overfitting. We studied the multinomial, cumulative, adjacent category, continuation r…

    Submitted 18 November, 2021; v1 submitted 19 April, 2021; originally announced April 2021.

    Comments: Revised version submitted to Statistics in Medicine

  10. arXiv:1907.11493  [pdf]

    stat.ME

    On the variability of regression shrinkage methods for clinical prediction models: simulation study on predictive performance

    Authors: Ben Van Calster, Maarten van Smeden, Ewout W. Steyerberg

    Abstract: When developing risk prediction models, shrinkage methods are recommended, especially when the sample size is limited. Several earlier studies have shown that the shrinkage of model coefficients can reduce overfitting of the prediction model and subsequently result in better predictive performance on average. In this simulation study, we aimed to investigate the variability of regression shrinkage…

    Submitted 26 July, 2019; originally announced July 2019.

    Comments: 138 pages (incl. 114 supplementary pages). Main document: 5 figures and 2 tables

    MSC Class: 62J07

  11. arXiv:1806.10495  [pdf, other]

    stat.ME

    Impact of predictor measurement heterogeneity across settings on performance of prediction models: a measurement error perspective

    Authors: Kim Luijken, Rolf H. H. Groenwold, Ben van Calster, Ewout W. Steyerberg, Maarten van Smeden

    Abstract: It is widely acknowledged that the predictive performance of clinical prediction models should be studied in patients that were not part of the data in which the model was derived. Out-of-sample performance can be hampered when predictors are measured differently at derivation and external validation. This may occur, for instance, when predictors are measured using different measurement protocols…

    Submitted 5 February, 2019; v1 submitted 27 June, 2018; originally announced June 2018.

    Comments: 32 pages, 4 figures

    MSC Class: 97K80