Developing and validating multi-modal models for mortality prediction in COVID-19 patients: a multi-center retrospective study
Authors:
Joy Tzung-yu Wu,
Miguel Ángel Armengol de la Hoz,
Po-Chih Kuo,
Joseph Alexander Paguio,
Jasper Seth Yao,
Edward Christopher Dee,
Wesley Yeung,
Jerry Jurado,
Achintya Moulick,
Carmelo Milazzo,
Paloma Peinado,
Paula Villares,
Antonio Cubillo,
José Felipe Varona,
Hyung-Chul Lee,
Alberto Estirado,
José Maria Castellano,
Leo Anthony Celi
Abstract:
The unprecedented global crisis brought about by the COVID-19 pandemic has sparked numerous efforts to create predictive models for the detection and prognostication of SARS-CoV-2 infections with the goal of helping health systems allocate resources. Machine learning models, in particular, hold promise for their ability to leverage patient clinical information and medical images for prediction. However, most of the published COVID-19 prediction models thus far have little clinical utility due to methodological flaws and lack of appropriate validation. In this paper, we describe our methodology to develop and validate multi-modal models for COVID-19 mortality prediction using multi-center patient data. The models for COVID-19 mortality prediction were developed using retrospective data from Madrid, Spain (N=2547) and were externally validated in patient cohorts from a community hospital in New Jersey, USA (N=242) and an academic center in Seoul, Republic of Korea (N=336). The models we developed performed differently across various clinical settings, underscoring the need for a guided strategy when employing machine learning for clinical decision-making. We demonstrated that using features from both the structured electronic health records and chest X-ray imaging data resulted in better 30-day-mortality prediction performance across all three datasets (areas under the receiver operating characteristic curves: 0.85 (95% confidence interval: 0.83-0.87), 0.76 (0.70-0.82), and 0.95 (0.92-0.98)). We discuss the rationale for the decisions made at every step in developing the models and have made our code available to the research community. We employed the best machine learning practices for clinical model development. Our goal is to create a toolkit that would assist investigators and organizations in building multi-modal models for prediction, classification and/or optimization.
Submitted 1 September, 2021;
originally announced September 2021.