arXiv:2401.13588 [pdf]

Evaluation of General Large Language Models in Contextually Assessing Semantic Concepts Extracted from Adult Critical Care Electronic Health Record Notes

Authors: Darren Liu, Cheng Ding, Delgersuren Bold, Monique Bouvier, Jiaying Lu, Benjamin Shickel, Craig S. Jabaley, Wenhui Zhang, Soo** Park, Michael J. Young, Mark S. Wainwright, Gilles Clermont, Parisa Rashidi, Eric S. Rosenthal, Laurie Dimisko, Ran Xiao, Joo Heung Yoon, Carl Yang, Xiao Hu

Abstract: The field of healthcare has increasingly turned its focus towards Large Language Models (LLMs) due to their remarkable performance. However, their performance in actual clinical applications has been underexplored. Traditional evaluations based on question-answering tasks don't fully capture the nuanced contexts. This gap highlights the need for more in-depth and practical assessments of LLMs in real-world healthcare settings. Objective: We sought to evaluate the performance of LLMs in the complex clinical context of adult critical care medicine using systematic and comprehensible analytic methods, including clinician annotation and adjudication. Methods: We investigated the performance of three general LLMs in understanding and processing real-world clinical notes. Concepts from 150 clinical notes were identified by MetaMap and then labeled by 9 clinicians. Each LLM's proficiency was evaluated by identifying the temporality and negation of these concepts using different prompts for an in-depth analysis. Results: GPT-4 showed overall superior performance compared to other LLMs. In contrast, both GPT-3.5 and text-davinci-003 exhibit enhanced performance when the appropriate prompting strategies are employed. The GPT family models have demonstrated considerable efficiency, evidenced by their cost-effectiveness and time-saving capabilities. Conclusion: A comprehensive qualitative performance evaluation framework for LLMs is developed and operationalized. This framework goes beyond singular performance aspects. With expert annotations, this methodology not only validates LLMs' capabilities in processing complex medical data but also establishes a benchmark for future LLM evaluations across specialized domains.

Submitted 24 January, 2024; originally announced January 2024.

arXiv:2206.09074 [pdf, other]

Weakly Supervised Classification of Vital Sign Alerts as Real or Artifact

Authors: Arnab Dey, Mononito Goswami, Joo Heung Yoon, Gilles Clermont, Michael Pinsky, Marilyn Hravnak, Artur Dubrawski

Abstract: A significant proportion of clinical physiologic monitoring alarms are false. This often leads to alarm fatigue in clinical personnel, inevitably compromising patient safety. To combat this issue, researchers have attempted to build Machine Learning (ML) models capable of accurately adjudicating Vital Sign (VS) alerts raised at the bedside of hemodynamically monitored patients as real or artifact. Previous studies have utilized supervised ML techniques that require substantial amounts of hand-labeled data. However, manually harvesting such data can be costly, time-consuming, and mundane, and is a key factor limiting the widespread adoption of ML in healthcare (HC). Instead, we explore the use of multiple, individually imperfect heuristics to automatically assign probabilistic labels to unlabeled training data using weak supervision. Our weakly supervised models perform competitively with traditional supervised techniques and require less involvement from domain experts, demonstrating their use as efficient and practical alternatives to supervised learning in HC applications of ML.

Submitted 17 June, 2022; originally announced June 2022.

Comments: Accepted at American Medical Informatics Association (AMIA) Annual Symposium 2022. 10 pages, 4 figures and 2 tables

arXiv:2204.05477 [pdf, other]

Deep Normed Embeddings for Patient Representation

Authors: Thesath Nanayakkara, Gilles Clermont, Christopher James Langmead, David Swigon

Abstract: We introduce a novel contrastive representation learning objective and a training scheme for clinical time series. Specifically, we project high-dimensional EHR data to a closed unit ball of low dimension, encoding geometric priors so that the origin represents an idealized perfect health state and the Euclidean norm is associated with the patient's mortality risk. Moreover, using septic patients as an example, we show how we could learn to associate the angle between two vectors with the different organ system failures, thereby learning a compact representation which is indicative of both mortality risk and specific organ failure. We show how the learned embedding can be used for online patient monitoring, can supplement clinicians, and can improve the performance of downstream machine learning tasks. This work was partially motivated by the desire and the need to introduce a systematic way of defining intermediate rewards for Reinforcement Learning in critical care medicine. Hence, we also show how such a design in terms of the learned embedding can result in qualitatively different policies and value distributions, as compared with using only terminal rewards.

Submitted 3 July, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

Comments: A minimal implementation of this work can be found at https://github.com/thxsxth/normed_constrastive_metric

arXiv:2101.08477 [pdf, other]

doi 10.1371/journal.pdig.0000012

Unifying Cardiovascular Modelling with Deep Reinforcement Learning for Uncertainty Aware Control of Sepsis Treatment

Authors: Thesath Nanayakkara, Gilles Clermont, Christopher James Langmead, David Swigon

Abstract: Sepsis is a potentially life-threatening inflammatory response to infection or severe tissue damage. It has a highly variable clinical course, requiring constant monitoring of the patient's state to guide the management of intravenous fluids and vasopressors, among other interventions. Despite decades of research, there's still debate among experts on optimal treatment. Here, we combine, for the first time, distributional deep reinforcement learning with mechanistic physiological models to find personalized sepsis treatment strategies. Our method handles partial observability by leveraging known cardiovascular physiology, introducing a novel physiology-driven recurrent autoencoder, and quantifies the uncertainty of its own results. Moreover, we introduce a framework for uncertainty-aware decision support with humans in the loop. We show that our method learns physiologically explainable, robust policies that are consistent with clinical knowledge. Further, our method consistently identifies high-risk states that lead to death, which could potentially benefit from more frequent vasopressor administration, providing valuable guidance for future research.

Submitted 12 June, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

arXiv:q-bio/0404034 [pdf, ps, other]

Dynamics of Acute Inflammation

Authors: Rukmini Kumar, Gilles Clermont, Yoram Vodovotz, Carson Chow

Abstract: When the body is infected, it mounts an acute inflammatory response to rid itself of the pathogens and restore health. Uncontrolled acute inflammation due to infection is defined clinically as sepsis and can culminate in organ failure and death. We consider a three-dimensional ordinary differential equation model of inflammation consisting of a pathogen and two inflammatory mediators. The model reproduces the healthy outcome and diverse negative outcomes, depending on initial conditions and parameters. We analyze the various bifurcations between the different outcomes when key parameters are changed and suggest various therapeutic strategies. We suggest that the clinical condition of sepsis can arise from several distinct physiological states, each of which requires a different treatment approach.

Submitted 23 April, 2004; originally announced April 2004.

Comments: 27 pages, 9 figures, Accepted by the Journal of Theoretical Biology

Showing 1–5 of 5 results for author: Clermont, G