Interpretable (not just posthoc-explainable) medical claims modeling for discharge placement to prevent avoidable all-cause readmissions or death
Authors:
Joshua C. Chang,
Ted L. Chang,
Carson C. Chow,
Rohit Mahajan,
Sonya Mahajan,
Joe Maisog,
Shashaank Vattikuti,
Hongjing Xia
Abstract:
We developed an inherently interpretable multilevel Bayesian framework for representing variation in regression coefficients that mimics the piecewise linearity of ReLU-activated deep neural networks. We used the framework to formulate a survival model that uses medical claims to predict hospital readmission and death, focusing on discharge placement and adjusting for confounding when estimating causal local average treatment effects. We trained the model on a 5% sample of Medicare beneficiaries from 2008 and 2011, based on their 2009--2011 inpatient episodes, and then tested the model on 2012 episodes. The model scored an AUROC of approximately 0.76 on predicting all-cause readmissions -- defined using official Centers for Medicare and Medicaid Services (CMS) methodology -- or death within 30 days of discharge. This is competitive with XGBoost and a Bayesian deep neural network, demonstrating that one need not sacrifice interpretability for accuracy. Crucially, because it is a regression model, it provides what black boxes cannot -- the exact gold-standard global interpretation of the model, identifying relative risk factors and quantifying the effect of discharge placement. We also show that the posthoc explainer SHAP fails to provide accurate explanations.
Submitted 29 January, 2023; v1 submitted 28 August, 2022;
originally announced August 2022.
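For readers unfamiliar with the AUROC figure quoted in the abstract above, the following is a minimal sketch of how such a score can be computed from binary outcomes and model scores. It is not the authors' evaluation code: the function name `auroc` and the toy labels and scores are invented for illustration, and the pairwise (Mann-Whitney) formulation is only one of several equivalent ways to compute the statistic.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = readmission or death within 30 days, 0 = neither.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.50, 0.60]
toy_auroc = auroc(labels, scores)
print(f"toy AUROC = {toy_auroc:.3f}")
```

Under this reading, an AUROC of about 0.76 means that a randomly chosen episode ending in readmission or death is scored above a randomly chosen episode without either outcome roughly 76% of the time.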
Using massive health insurance claims data to predict very high-cost claimants: a machine learning approach
Authors:
José M. Maisog,
Wenhong Li,
Yanchun Xu,
Brian Hurley,
Hetal Shah,
Ryan Lemberg,
Tina Borden,
Stephen Bandeian,
Melissa Schline,
Roxanna Cross,
Alan Spiro,
Russ Michael,
Alexander Gutfraind
Abstract:
Due to escalating healthcare costs, accurately predicting which patients will incur high costs is an important task for payers and providers of healthcare. High-cost claimants (HiCCs) are patients who have annual costs above $\$250,000$ and who represent just 0.16% of the insured population but currently account for 9% of all healthcare costs. In this study, we aimed to develop a high-performance algorithm to predict HiCCs to inform a novel care management system. Using health insurance claims from 48 million people, augmented with census data, we applied machine learning to train binary classification models that calculate each person's risk of becoming a HiCC. To train the models, we developed a platform starting with 6,006 variables across all clinical and demographic dimensions and constructed over one hundred candidate models. The best model achieved an area under the receiver operating characteristic curve of 91.2%. This exceeds the highest published performance (84%) and remains high for patients who have no prior history of high-cost status (89%), who have less than a full year of enrollment (87%), or who lack pharmacy claims data (88%). The model attains an area under the precision-recall curve of 23.1% and a precision of 74% at a threshold of 0.99. A care management program enrolling the 500 people with the highest HiCC risk is expected to treat 199 true HiCCs and generate a net savings of $\$7.3$ million per year. Our results demonstrate that high-performing predictive models can be constructed using claims data and publicly available data alone, even for rare high-cost claimants whose costs exceed $\$250,000$. Our model demonstrates the transformational power of machine learning and artificial intelligence in care management, which would allow healthcare payers and providers to introduce the next generation of care management programs.
Submitted 30 December, 2019;
originally announced December 2019.
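The care-management figures in the abstract above can be put in context with some simple arithmetic. The sketch below uses only numbers stated in the abstract (500 enrollees, 199 true HiCCs, a 0.16% base rate, $7.3 million net savings); the variable names and the derived quantities (precision among enrollees, lift over random selection, average savings per enrollee) are illustrative and are not reported by the authors.

```python
# Figures quoted in the abstract.
enrolled = 500          # people enrolled in the care management program
true_hiccs = 199        # enrollees expected to be true high-cost claimants
base_rate = 0.0016      # HiCCs are 0.16% of the insured population
net_savings = 7.3e6     # expected net savings per year, in dollars

# Derived quantities (illustrative, not reported in the abstract).
precision_at_500 = true_hiccs / enrolled         # share of enrollees who are HiCCs
expected_by_chance = enrolled * base_rate        # HiCCs expected from random enrollment
lift = precision_at_500 / base_rate              # improvement over random enrollment
savings_per_enrollee = net_savings / enrolled    # simple average, dollars per year

print(f"precision among top 500:      {precision_at_500:.1%}")
print(f"HiCCs expected by chance:     {expected_by_chance:.1f}")
print(f"lift over random selection:   {lift:.0f}x")
print(f"average savings per enrollee: ${savings_per_enrollee:,.0f}")
```

The comparison with random enrollment shows why a precision well below 100% can still be useful for an outcome this rare: selecting 500 people at random would be expected to include fewer than one HiCC.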