-
Two-layer retrieval augmented generation framework for low-resource medical question-answering: proof of concept using Reddit data
Authors:
Sudeshna Das,
Yao Ge,
Yuting Guo,
Swati Rajwal,
JaMor Hairston,
Jeanne Powell,
Drew Walker,
Snigdha Peddireddy,
Sahithi Lakamana,
Selen Bozkurt,
Matthew Reyna,
Reza Sameni,
Yunyu Xiao,
Sangmi Kim,
Rasheeta Chandler,
Natalie Hernandez,
Danielle Mowery,
Rachel Wightman,
Jennifer Love,
Anthony Spadaro,
Jeanmarie Perrone,
Abeed Sarker
Abstract:
Retrieval augmented generation (RAG) provides the capability to constrain generative model outputs, and mitigate the possibility of hallucination, by providing relevant in-context text. The number of tokens a generative large language model (LLM) can incorporate as context is finite, thus limiting the volume of knowledge from which to generate an answer. We propose a two-layer RAG framework for qu…
▽ More
Retrieval augmented generation (RAG) provides the capability to constrain generative model outputs, and mitigate the possibility of hallucination, by providing relevant in-context text. The number of tokens a generative large language model (LLM) can incorporate as context is finite, thus limiting the volume of knowledge from which to generate an answer. We propose a two-layer RAG framework for query-focused answer generation and evaluate a proof-of-concept for this framework in the context of query-focused summary generation from social media forums, focusing on emerging drug-related information. The evaluations demonstrate the effectiveness of the two-layer framework in resource constrained settings to enable researchers in obtaining near real-time data from users.
△ Less
Submitted 29 May, 2024;
originally announced May 2024.
-
STREAMLINE: An Automated Machine Learning Pipeline for Biomedicine Applied to Examine the Utility of Photography-Based Phenotypes for OSA Prediction Across International Sleep Centers
Authors:
Ryan J. Urbanowicz,
Harsh Bandhey,
Brendan T. Keenan,
Greg Maislin,
Sy Hwang,
Danielle L. Mowery,
Shannon M. Lynch,
Diego R. Mazzotti,
Fang Han,
Qing Yun Li,
Thomas Penzel,
Sergio Tufik,
Lia Bittencourt,
Thorarinn Gislason,
Philip de Chazal,
Bhajan Singh,
Nigel McArdle,
Ning-Hung Chen,
Allan Pack,
Richard J. Schwab,
Peter A. Cistulli,
Ulysses J. Magalang
Abstract:
While machine learning (ML) includes a valuable array of tools for analyzing biomedical data, significant time and expertise is required to assemble effective, rigorous, and unbiased pipelines. Automated ML (AutoML) tools seek to facilitate ML application by automating a subset of analysis pipeline elements. In this study we develop and validate a Simple, Transparent, End-to-end Automated Machine…
▽ More
While machine learning (ML) includes a valuable array of tools for analyzing biomedical data, significant time and expertise is required to assemble effective, rigorous, and unbiased pipelines. Automated ML (AutoML) tools seek to facilitate ML application by automating a subset of analysis pipeline elements. In this study we develop and validate a Simple, Transparent, End-to-end Automated Machine Learning Pipeline (STREAMLINE) and apply it to investigate the added utility of photography-based phenotypes for predicting obstructive sleep apnea (OSA); a common and underdiagnosed condition associated with a variety of health, economic, and safety consequences. STREAMLINE is designed to tackle biomedical binary classification tasks while adhering to best practices and accommodating complexity, scalability, reproducibility, customization, and model interpretation. Benchmarking analyses validated the efficacy of STREAMLINE across data simulations with increasingly complex patterns of association. Then we applied STREAMLINE to evaluate the utility of demographics (DEM), self-reported comorbidities (DX), symptoms (SYM), and photography-based craniofacial (CF) and intraoral (IO) anatomy measures in predicting any OSA or moderate/severe OSA using 3,111 participants from Sleep Apnea Global Interdisciplinary Consortium (SAGIC). OSA analyses identified a significant increase in ROC-AUC when adding CF to DEM+DX+SYM to predict moderate/severe OSA. A consistent but non-significant increase in PRC-AUC was observed with the addition of each subsequent feature set to predict any OSA, with CF and IO yielding minimal improvements. Application of STREAMLINE to OSA data suggests that CF features provide additional value in predicting moderate/severe OSA, but neither CF nor IO features meaningfully improved the prediction of any OSA beyond established demographics, comorbidity and symptom characteristics.
△ Less
Submitted 8 December, 2023;
originally announced December 2023.
-
Feature Studies to Inform the Classification of Depressive Symptoms from Twitter Data for Population Health
Authors:
Danielle Mowery,
Craig Bryan,
Mike Conway
Abstract:
The utility of Twitter data as a medium to support population-level mental health monitoring is not well understood. In an effort to better understand the predictive power of supervised machine learning classifiers and the influence of feature sets for efficiently classifying depression-related tweets on a large-scale, we conducted two feature study experiments. In the first experiment, we assesse…
▽ More
The utility of Twitter data as a medium to support population-level mental health monitoring is not well understood. In an effort to better understand the predictive power of supervised machine learning classifiers and the influence of feature sets for efficiently classifying depression-related tweets on a large-scale, we conducted two feature study experiments. In the first experiment, we assessed the contribution of feature groups such as lexical information (e.g., unigrams) and emotions (e.g., strongly negative) using a feature ablation study. In the second experiment, we determined the percentile of top ranked features that produced the optimal classification performance by applying a three-step feature elimination approach. In the first experiment, we observed that lexical features are critical for identifying depressive symptoms, specifically for depressed mood (-35 points) and for disturbed sleep (-43 points). In the second experiment, we observed that the optimal F1-score performance of top ranked features in percentiles variably ranged across classes e.g., fatigue or loss of energy (5th percentile, 288 features) to depressed mood (55th percentile, 3,168 features) suggesting there is no consistent count of features for predicting depressive-related tweets. We conclude that simple lexical features and reduced feature sets can produce comparable results to larger feature sets.
△ Less
Submitted 27 January, 2017;
originally announced January 2017.