Search | arXiv e-print repository

arXiv:2403.02558 [pdf]

The Minimum Information about CLinical Artificial Intelligence Checklist for Generative Modeling Research (MI-CLAIM-GEN)

Authors: Brenda Y. Miao, Irene Y. Chen, Christopher YK Williams, Jaysón Davidson, Augusto Garcia-Agundez, Shenghuan Sun, Travis Zack, Suchi Saria, Rima Arnaout, Giorgio Quer, Hossein J. Sadaei, Ali Torkamani, Brett Beaulieu-Jones, Bin Yu, Milena Gianfrancesco, Atul J. Butte, Beau Norgeot, Madhumita Sushil

Abstract: Recent advances in generative models, including large language models (LLMs), vision language models (VLMs), and diffusion models, have accelerated the field of natural language and image processing in medicine and marked a significant paradigm shift in how biomedical models can be developed and deployed. While these models are highly adaptable to new tasks, scaling and evaluating their usage presents new challenges not addressed in previous frameworks. In particular, the ability of these models to produce useful outputs with little to no specialized training data ("zero-" or "few-shot" approaches), as well as the open-ended nature of their outputs, necessitate the development of new guidelines for robust reporting of clinical generative model research. In response to gaps in standards and best practices for the development of clinical AI tools identified by US Executive Order 14110 and several emerging national networks for clinical AI evaluation, we begin to formalize some of these guidelines by building on the original MI-CLAIM checklist. The new checklist, MI-CLAIM-GEN (Table 1), aims to address differences in training, evaluation, interpretability, and reproducibility of new generative models compared to non-generative ("predictive") AI models. This MI-CLAIM-GEN checklist also seeks to clarify cohort selection reporting with unstructured clinical data and adds additional items on alignment with ethical standards for clinical AI research.

Submitted 11 July, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

arXiv:2208.10245 [pdf]

When BERT Fails -- The Limits of EHR Classification

Authors: Augusto Garcia-Agundez, Carsten Eickhoff

Abstract: Transformers are powerful text representation learners, useful for all kinds of clinical decision support tasks. Although they outperform baselines on readmission prediction, they are not infallible. Here, we look into one such failure case, and report patterns that lead to inferior predictive performance.

Submitted 26 July, 2022; originally announced August 2022.

Journal ref: AMIA 2022 Annual Symposium

arXiv:2108.03284 [pdf, other]

Estimating Active Cases of COVID-19

Authors: Javier Álvarez, Carlos Baquero, Elisa Cabana, Jaya Prakash Champati, Antonio Fernández Anta, Davide Frey, Augusto García-Agúndez, Chryssis Georgiou, Mathieu Goessens, Harold Hernández, Rosa Lillo, Raquel Menezes, Raúl Moreno, Nicolas Nicolaou, Oluwasegun Ojo, Antonio Ortega, Jesús Rufino, Efstathios Stavrakis, Govind Jeevan, Christin Glorioso

Abstract: Having accurate and timely data on confirmed active COVID-19 cases is challenging, since it depends on testing capacity and the availability of an appropriate infrastructure to perform tests and aggregate their results. In this paper, we propose methods to estimate the number of active cases of COVID-19 from the official data (of confirmed cases and fatalities) and from survey data. We show that the latter is a viable option in countries with reduced testing capacity or suboptimal infrastructures.

Submitted 6 August, 2021; originally announced August 2021.

Comments: Presented at the 2nd KDD Workshop on Data-driven Humanitarian Mapping: Harnessing Human-Machine Intelligence for High-Stake Public Policy and Resiliency Planning, August 15, 2021

arXiv:2005.12783 [pdf, other]

CoronaSurveys: Using Surveys with Indirect Reporting to Estimate the Incidence and Evolution of Epidemics

Authors: Oluwasegun Ojo, Augusto García-Agundez, Benjamin Girault, Harold Hernández, Elisa Cabana, Amanda García-García, Payman Arabshahi, Carlos Baquero, Paolo Casari, Ednaldo José Ferreira, Davide Frey, Chryssis Georgiou, Mathieu Goessens, Anna Ishchenko, Ernesto Jiménez, Oleksiy Kebkal, Rosa Lillo, Raquel Menezes, Nicolas Nicolaou, Antonio Ortega, Paul Patras, Julian C Roberts, Efstathios Stavrakis, Yuichi Tanaka, Antonio Fernández Anta

Abstract: The world is suffering from a pandemic called COVID-19, caused by the SARS-CoV-2 virus. National governments have problems evaluating the reach of the epidemic, due to having limited resources and tests at their disposal. This problem is especially acute in low and middle-income countries (LMICs). Hence, any simple, cheap and flexible means of evaluating the incidence and evolution of the epidemic in a given country with a reasonable level of accuracy is useful. In this paper, we propose a technique based on (anonymous) surveys in which participants report on the health status of their contacts. This indirect reporting technique, known in the literature as network scale-up method, preserves the privacy of the participants and their contacts, and collects information from a larger fraction of the population (as compared to individual surveys). This technique has been deployed in the CoronaSurveys project, which has been collecting reports for the COVID-19 pandemic for more than two months. Results obtained by CoronaSurveys show the power and flexibility of the approach, suggesting that it could be an inexpensive and powerful tool for LMICs.

Submitted 26 June, 2020; v1 submitted 24 May, 2020; originally announced May 2020.

Comments: Presented at The KDD Workshop on Humanitarian Mapping, San Diego, California USA, August 24, 2020

Showing 1–4 of 4 results for author: García-Agúndez, A