-
Entangled Relations: Leveraging NLI and Meta-analysis to Enhance Biomedical Relation Extraction
Authors:
William Hogan,
Jingbo Shang
Abstract:
Recent research efforts have explored the potential of leveraging natural language inference (NLI) techniques to enhance relation extraction (RE). In this vein, we introduce MetaEntail-RE, a novel adaptation method that harnesses NLI principles to enhance RE performance. Our approach follows past works by verbalizing relation classes into class-indicative hypotheses, aligning a traditionally multi-class classification task to one of textual entailment. We introduce three key enhancements: (1) Instead of labeling non-entailed premise-hypothesis pairs with the uninformative "neutral" entailment label, we introduce meta-class analysis, which provides additional context by analyzing overarching meta relationships between classes when assigning entailment labels; (2) Feasible hypothesis filtering, which removes unlikely hypotheses from consideration based on pairs of entity types; and (3) Group-based prediction selection, which further improves performance by selecting highly confident predictions. MetaEntail-RE is conceptually simple and empirically powerful, yielding significant improvements over conventional relation extraction techniques and other NLI formulations. Our experimental results underscore the versatility of MetaEntail-RE, demonstrating performance gains across both biomedical and general domains.
Submitted 31 May, 2024;
originally announced June 2024.
-
Precision Rehabilitation for Patients Post-Stroke based on Electronic Health Records and Machine Learning
Authors:
Fengyi Gao,
Xingyu Zhang,
Sonish Sivarajkumar,
Parker Denny,
Bayan Aldhahwani,
Shyam Visweswaran,
Ryan Shi,
William Hogan,
Allyn Bove,
Yanshan Wang
Abstract:
In this study, we utilized statistical analysis and machine learning methods to examine whether rehabilitation exercises can improve the functional abilities of patients post-stroke, as well as to forecast the improvement in functional abilities. Our dataset comprises patients' rehabilitation exercises and demographic information recorded in unstructured electronic health record (EHR) data and free-text rehabilitation procedure notes. We collected data for 265 stroke patients from the University of Pittsburgh Medical Center. We employed a pre-existing natural language processing (NLP) algorithm to extract data on rehabilitation exercises and developed a rule-based NLP algorithm to extract Activity Measure for Post-Acute Care (AM-PAC) scores, covering basic mobility (BM) and applied cognitive (AC) domains, from procedure notes. Changes in AM-PAC scores were classified based on the minimal clinically important difference (MCID), and significance was assessed using Friedman and Wilcoxon tests. To identify impactful exercises, we used Chi-square tests, Fisher's exact tests, and logistic regression for odds ratios. Additionally, we developed five machine learning models to predict outcomes in functional ability: logistic regression (LR), AdaBoost (ADB), support vector machine (SVM), gradient boosting (GB), and random forest (RF). Statistical analyses revealed significant associations between functional improvements and specific exercises. The RF model achieved the best performance in predicting functional outcomes. In this study, we identified three rehabilitation exercises that significantly contributed to improvement in patients' post-stroke functional ability in the first two months. Additionally, the successful application of a machine learning model to predict patient-specific functional outcomes underscores the potential for precision rehabilitation.
Submitted 9 May, 2024;
originally announced May 2024.
-
Credentials in the Occupation Ontology
Authors:
John Beverley,
Robin McGill,
Sam Smith,
Jie Zheng,
Giacomo De Colle,
Finn Wilson,
Matthew Diller,
William D. Duncan,
William R. Hogan,
Yongqun He
Abstract:
The term credential encompasses educational certificates, degrees, certifications, and government-issued licenses. An occupational credential is a verification of an individual's qualification or competence issued by a third party with relevant authority. Job seekers often leverage such credentials as evidence that desired qualifications are satisfied by their holders. Many U.S. education and workforce development organizations have recognized the importance of credentials for employment and the challenges of understanding the value of credentials. In this study, we identified and ontologically defined credential and credential-related terms at the textual and semantic levels based on the Occupation Ontology (OccO), a BFO-based ontology. Different credential types and their authorization logic are modeled. We additionally defined a high-level hierarchy of credential-related terms and relations among many terms, which were initiated in concert with the Alabama Talent Triad (ATT) program, which aims to connect learners, earners, employers, and education/training providers through credentials and skills. To our knowledge, our research provides, for the first time, systematic ontological modeling of the important domain of credentials and related contents, supporting enhanced credential data and knowledge integration in the future.
Submitted 30 April, 2024;
originally announced May 2024.
-
READ: Improving Relation Extraction from an ADversarial Perspective
Authors:
Dawei Li,
William Hogan,
Jingbo Shang
Abstract:
Recent works in relation extraction (RE) have achieved promising benchmark accuracy; however, our adversarial attack experiments show that these works excessively rely on entities, making their generalization capability questionable. To address this issue, we propose an adversarial training method specifically designed for RE. Our approach introduces both sequence- and token-level perturbations to the sample and uses a separate perturbation vocabulary to improve the search for entity and context perturbations. Furthermore, we introduce a probabilistic strategy for leaving clean tokens in the context during adversarial training. This strategy enables a larger attack budget for entities and coaxes the model to leverage relational patterns embedded in the context. Extensive experiments show that compared to various adversarial training methods, our method significantly improves both the accuracy and robustness of the model. Additionally, experiments on different data availability settings highlight the effectiveness of our method in low-resource scenarios. We also perform in-depth analyses of our proposed method and provide further hints. We will release our code at https://github.com/David-Li0406/READ.
Submitted 2 April, 2024;
originally announced April 2024.
-
Creating a Discipline-specific Commons for Infectious Disease Epidemiology
Authors:
Michael M. Wagner,
William Hogan,
John Levander,
Adam Darr,
Matt Diller,
Max Sibilla,
Alexander T. Loiacono,
Terence Sperringer, Jr.,
Shawn T. Brown
Abstract:
Objective: To create a commons for infectious disease (ID) epidemiology in which epidemiologists, public health officers, data producers, and software developers can not only share data and software, but receive assistance in improving their interoperability. Materials and Methods: We represented 586 datasets, 54 software, and 24 data formats in OWL 2 and then used logical queries to infer potentially interoperable combinations of software and datasets, as well as statistics about the FAIRness of the collection. We represented the objects in DATS 2.2 and a software metadata schema of our own design. We used these representations as the basis for the Content, Search, FAIR-o-meter, and Workflow pages that constitute the MIDAS Digital Commons. Results: Interoperability was limited by lack of standardization of input and output formats of software. When formats existed, they were human-readable specifications (22/24; 92%); only 3 formats (13%) had machine-readable specifications. Nevertheless, logical search of a triple store based on named data formats was able to identify scores of potentially interoperable combinations of software and datasets. Discussion: We improved the findability and availability of a sample of software and datasets and developed metrics for assessing interoperability. The barriers to interoperability included poor documentation of software input/output formats and little attention to standardization of most types of data in this field. Conclusion: Centralizing and formalizing the representation of digital objects within a commons promotes FAIRness, enables its measurement over time and the identification of potentially interoperable combinations of data and software.
Submitted 12 November, 2023;
originally announced November 2023.
-
DAIL: Data Augmentation for In-Context Learning via Self-Paraphrase
Authors:
Dawei Li,
Yaxuan Li,
Dheeraj Mekala,
Shuyao Li,
Yulin Wang,
Xueqi Wang,
William Hogan,
Jingbo Shang
Abstract:
In-Context Learning (ICL) combined with pre-trained large language models has achieved promising results on various NLP tasks. However, ICL requires high-quality annotated demonstrations which might not be available in real-world scenarios. To overcome this limitation, we propose Data Augmentation for In-Context Learning (DAIL). DAIL leverages the intuition that large language models are more familiar with the content generated by themselves. It first utilizes the language model to generate paraphrases of the test sample and employs majority voting to determine the final result based on individual predictions. Our extensive empirical evaluation shows that DAIL outperforms the standard ICL method and other ensemble-based methods in the low-resource scenario. Additionally, we explore the use of voting consistency as a confidence score of the model when the logits of predictions are inaccessible. We believe our work will stimulate further research on ICL in low-resource settings.
Submitted 6 November, 2023;
originally announced November 2023.
-
Open-world Semi-supervised Generalized Relation Discovery Aligned in a Real-world Setting
Authors:
William Hogan,
Jiacheng Li,
Jingbo Shang
Abstract:
Open-world Relation Extraction (OpenRE) has recently garnered significant attention. However, existing approaches tend to oversimplify the problem by assuming that all unlabeled texts belong to novel classes, thereby limiting the practicality of these methods. We argue that the OpenRE setting should be more aligned with the characteristics of real-world data. Specifically, we propose two key improvements: (a) unlabeled data should encompass known and novel classes, including hard-negative instances; and (b) the set of novel classes should represent long-tail relation types. Furthermore, we observe that popular relations such as titles and locations can often be implicitly inferred through specific patterns, while long-tail relations tend to be explicitly expressed in sentences. Motivated by these insights, we present a novel method called KNoRD (Known and Novel Relation Discovery), which effectively classifies explicitly and implicitly expressed relations from known and novel classes within unlabeled data. Experimental evaluations on several Open-world RE benchmarks demonstrate that KNoRD consistently outperforms other existing methods, achieving significant performance gains.
Submitted 1 November, 2023; v1 submitted 22 May, 2023;
originally announced May 2023.
-
A Study of Generative Large Language Model for Medical Research and Healthcare
Authors:
Cheng Peng,
Xi Yang,
Aokun Chen,
Kaleb E Smith,
Nima PourNejatian,
Anthony B Costa,
Cheryl Martin,
Mona G Flores,
Ying Zhang,
Tanja Magoc,
Gloria Lipori,
Duane A Mitchell,
Naykky S Ospina,
Mustafa M Ahmed,
William R Hogan,
Elizabeth A Shenkman,
Yi Guo,
Jiang Bian,
Yonghui Wu
Abstract:
There is enormous enthusiasm and concern about using large language models (LLMs) in healthcare, yet current assumptions are all based on general-purpose LLMs such as ChatGPT. This study develops a clinical generative LLM, GatorTronGPT, using 277 billion words of mixed clinical and English text with a GPT-3 architecture of 20 billion parameters. GatorTronGPT improves biomedical natural language processing for medical research. Synthetic NLP models trained using GatorTronGPT-generated text outperform NLP models trained using real-world clinical text. A physicians' Turing test using a 1 (worst) to 9 (best) scale shows that there is no significant difference in linguistic readability (p = 0.22; 6.57 of GatorTronGPT compared with 6.93 of human) and clinical relevance (p = 0.91; 7.0 of GatorTronGPT compared with 6.97 of human) and that physicians cannot differentiate them (p < 0.001). This study provides insights on the opportunities and challenges of LLMs for medical research and healthcare.
Submitted 22 May, 2023;
originally announced May 2023.
-
Clinical Concept and Relation Extraction Using Prompt-based Machine Reading Comprehension
Authors:
Cheng Peng,
Xi Yang,
Zehao Yu,
Jiang Bian,
William R. Hogan,
Yonghui Wu
Abstract:
Objective: To develop a natural language processing system that solves both clinical concept extraction and relation extraction in a unified prompt-based machine reading comprehension (MRC) architecture with good generalizability for cross-institution applications.
Methods: We formulate both clinical concept extraction and relation extraction using a unified prompt-based MRC architecture and explore state-of-the-art transformer models. We compare our MRC models with existing deep learning models for concept extraction and end-to-end relation extraction using two benchmark datasets developed by the 2018 National NLP Clinical Challenges (n2c2) challenge (medications and adverse drug events) and the 2022 n2c2 challenge (relations of social determinants of health [SDoH]). We also evaluate the transfer learning ability of the proposed MRC models in a cross-institution setting. We perform error analyses and examine how different prompting strategies affect the performance of MRC models.
Results and Conclusion: The proposed MRC models achieve state-of-the-art performance for clinical concept and relation extraction on the two benchmark datasets, outperforming previous non-MRC transformer models. GatorTron-MRC achieves the best strict and lenient F1-scores for concept extraction, outperforming previous deep learning models on the two datasets by 1%~3% and 0.7%~1.3%, respectively. For end-to-end relation extraction, GatorTron-MRC and BERT-MIMIC-MRC achieve the best F1-scores, outperforming previous deep learning models by 0.9%~2.4% and 10%-11%, respectively. For cross-institution evaluation, GatorTron-MRC outperforms traditional GatorTron by 6.4% and 16% for the two datasets, respectively. The proposed method is better at handling nested/overlapped concepts, extracting relations, and has good portability for cross-institute applications.
Submitted 14 March, 2023;
originally announced March 2023.
-
SODA: A Natural Language Processing Package to Extract Social Determinants of Health for Cancer Studies
Authors:
Zehao Yu,
Xi Yang,
Chong Dang,
Prakash Adekkanattu,
Braja Gopal Patra,
Yifan Peng,
Jyotishman Pathak,
Debbie L. Wilson,
Ching-Yuan Chang,
Wei-Hsuan Lo-Ciganic,
Thomas J. George,
William R. Hogan,
Yi Guo,
Jiang Bian,
Yonghui Wu
Abstract:
Objective: We aim to develop an open-source natural language processing (NLP) package, SODA (i.e., SOcial DeterminAnts), with pre-trained transformer models to extract social determinants of health (SDoH) for cancer patients, examine the generalizability of SODA to a new disease domain (i.e., opioid use), and evaluate the extraction rate of SDoH using cancer populations.
Methods: We identified SDoH categories and attributes and developed an SDoH corpus using clinical notes from a general cancer cohort. We compared four transformer-based NLP models to extract SDoH, examined the generalizability of NLP models to a cohort of patients prescribed with opioids, and explored customization strategies to improve performance. We applied the best NLP model to extract 19 categories of SDoH from the breast (n=7,971), lung (n=11,804), and colorectal cancer (n=6,240) cohorts.
Results and Conclusion: We developed a corpus of 629 cancer patients' notes with annotations of 13,193 SDoH concepts/attributes from 19 categories of SDoH. The Bidirectional Encoder Representations from Transformers (BERT) model achieved the best strict/lenient F1 scores of 0.9216 and 0.9441 for SDoH concept extraction, and 0.9617 and 0.9626 for linking attributes to SDoH concepts. Fine-tuning the NLP models using new annotations from opioid use patients improved the strict/lenient F1 scores from 0.8172/0.8502 to 0.8312/0.8679. The extraction rates among 19 categories of SDoH varied greatly, where 10 SDoH could be extracted from >70% of cancer patients, but 9 SDoH had a low extraction rate (<70% of cancer patients). The SODA package with pre-trained transformer models is publicly available at https://github.com/uf-hobiinformatics-lab/SDoH_SODA.
Submitted 18 May, 2023; v1 submitted 6 December, 2022;
originally announced December 2022.
-
Tree-based Subgroup Discovery In Electronic Health Records: Heterogeneity of Treatment Effects for DTG-containing Therapies
Authors:
Jiabei Yang,
Ann W. Mwangi,
Rami Kantor,
Issa J. Dahabreh,
Monicah Nyambura,
Allison Delong,
Joseph W. Hogan,
Jon A. Steingrimsson
Abstract:
The rich longitudinal individual level data available from electronic health records (EHRs) can be used to examine treatment effect heterogeneity. However, estimating treatment effects using EHR data poses several challenges, including time-varying confounding, repeated and temporally non-aligned measurements of covariates, treatment assignments and outcomes, and loss-to-follow-up due to dropout. Here, we develop the Subgroup Discovery for Longitudinal Data (SDLD) algorithm, a tree-based algorithm for discovering subgroups with heterogeneous treatment effects using longitudinal data by combining the generalized interaction tree algorithm, a general data-driven method for subgroup discovery, with longitudinal targeted maximum likelihood estimation. We apply the algorithm to EHR data to discover subgroups of people living with human immunodeficiency virus (HIV) who are at higher risk of weight gain when receiving dolutegravir-containing antiretroviral therapies (ARTs) versus when receiving non dolutegravir-containing ARTs.
Submitted 30 August, 2022;
originally announced August 2022.
-
An Overview of Distant Supervision for Relation Extraction with a Focus on Denoising and Pre-training Methods
Authors:
William Hogan
Abstract:
Relation Extraction (RE) is a foundational task of natural language processing. RE seeks to transform raw, unstructured text into structured knowledge by identifying relational information between entity pairs found in text. RE has numerous uses, such as knowledge graph completion, text summarization, question-answering, and search querying. The history of RE methods can be roughly organized into four phases: pattern-based RE, statistical-based RE, neural-based RE, and large language model-based RE. This survey begins with an overview of a few exemplary works in the earlier phases of RE, highlighting limitations and shortcomings to contextualize progress. Next, we review popular benchmarks and critically examine metrics used to assess RE performance. We then discuss distant supervision, a paradigm that has shaped the development of modern RE methods. Lastly, we review recent RE works focusing on denoising and pre-training methods.
Submitted 17 July, 2022;
originally announced July 2022.
-
Ontology Development Kit: a toolkit for building, maintaining, and standardising biomedical ontologies
Authors:
Nicolas Matentzoglu,
Damien Goutte-Gattat,
Shawn Zheng Kai Tan,
James P. Balhoff,
Seth Carbon,
Anita R. Caron,
William D. Duncan,
Joe E. Flack,
Melissa Haendel,
Nomi L. Harris,
William R Hogan,
Charles Tapley Hoyt,
Rebecca C. Jackson,
HyeongSik Kim,
Huseyin Kir,
Martin Larralde,
Julie A. McMurry,
James A. Overton,
Bjoern Peters,
Clare Pilgrim,
Ray Stefancsik,
Sofia MC Robb,
Sabrina Toro,
Nicole A Vasilevsky,
Ramona Walls
, et al. (2 additional authors not shown)
Abstract:
Similar to managing software packages, managing the ontology life cycle involves multiple complex workflows such as preparing releases, continuous quality control checking, and dependency management. To manage these processes, a diverse set of tools is required, from command line utilities to powerful ontology engineering environments such as ROBOT. Particularly in the biomedical domain, which has developed a set of highly diverse yet inter-dependent ontologies, standardising release practices and metadata, and establishing shared quality standards, are crucial to enable interoperability. The Ontology Development Kit (ODK) provides a set of standardised, customisable, and automatically executable workflows, and packages all required tooling in a single Docker image. In this paper, we provide an overview of how the ODK works, show how it is used in practice, and describe how we envision it driving standardisation efforts in our community.
Submitted 5 July, 2022;
originally announced July 2022.
-
Fine-grained Contrastive Learning for Relation Extraction
Authors:
William Hogan,
Jiacheng Li,
Jingbo Shang
Abstract:
Recent relation extraction (RE) works have shown encouraging improvements by conducting contrastive learning on silver labels generated by distant supervision before fine-tuning on gold labels. Existing methods typically assume all these silver labels are accurate and treat them equally; however, distant supervision is inevitably noisy -- some silver labels are more reliable than others. In this paper, we propose fine-grained contrastive learning (FineCL) for RE, which leverages fine-grained information about which silver labels are and are not noisy to improve the quality of learned relationship representations for RE. We first assess the quality of silver labels via a simple and automatic approach we call "learning order denoising," where we train a language model to learn these relations and record the order of learned training instances. We show that learning order largely corresponds to label accuracy -- early-learned silver labels have, on average, more accurate labels than later-learned silver labels. Then, during pre-training, we increase the weights of accurate labels within a novel contrastive learning objective. Experiments on several RE benchmarks show that FineCL makes consistent and significant performance gains over state-of-the-art methods.
Submitted 20 October, 2022; v1 submitted 25 May, 2022;
originally announced May 2022.
-
GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records
Authors:
Xi Yang,
Aokun Chen,
Nima PourNejatian,
Hoo Chang Shin,
Kaleb E Smith,
Christopher Parisien,
Colin Compas,
Cheryl Martin,
Mona G Flores,
Ying Zhang,
Tanja Magoc,
Christopher A Harle,
Gloria Lipori,
Duane A Mitchell,
William R Hogan,
Elizabeth A Shenkman,
Jiang Bian,
Yonghui Wu
Abstract:
There is an increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology for medical AI systems utilizing clinical narratives. However, there are few clinical language models, the largest of which trained in the clinical domain is comparatively small at 110 million parameters (compared with billions of parameters in the general domain). It is not clear how large clinical language models with billions of parameters can help medical AI systems utilize unstructured EHRs. In this study, we develop from scratch a large clinical language model - GatorTron - using >90 billion words of text (including >82 billion words of de-identified clinical text) and systematically evaluate it on 5 clinical NLP tasks including clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference (NLI), and medical question answering (MQA). We examine how (1) scaling up the number of parameters and (2) scaling up the size of the training data could benefit these NLP tasks. GatorTron models scale up the clinical language model from 110 million to 8.9 billion parameters and improve 5 clinical NLP tasks (e.g., 9.6% and 9.5% improvement in accuracy for NLI and MQA), which can be applied to medical AI systems to improve healthcare delivery. The GatorTron models are publicly available at: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og.
Submitted 16 December, 2022; v1 submitted 2 February, 2022;
originally announced March 2022.
-
A flexible approach for causal inference with multiple treatments and clustered survival outcomes
Authors:
Liangyuan Hu,
Jiayi Ji,
Ronald D. Ennis,
Joseph W. Hogan
Abstract:
When drawing causal inferences about the effects of multiple treatments on clustered survival outcomes using observational data, we need to address implications of the multilevel data structure, multiple treatments, censoring and unmeasured confounding for causal analyses. Few off-the-shelf causal inference tools are available to simultaneously tackle these issues. We develop a flexible random-intercept accelerated failure time model, in which we use Bayesian additive regression trees to capture arbitrarily complex relationships between censored survival times and pre-treatment covariates and use the random intercepts to capture cluster-specific main effects. We develop an efficient Markov chain Monte Carlo algorithm to draw posterior inferences about the population survival effects of multiple treatments and examine the variability in cluster-level effects. We further propose an interpretable sensitivity analysis approach to evaluate the sensitivity of drawn causal inferences about treatment effect to the potential magnitude of departure from the causal assumption of no unmeasured confounding. Expansive simulations empirically validate and demonstrate good practical operating characteristics of our proposed methods. Applying the proposed methods to a dataset on older high-risk localized prostate cancer patients drawn from the National Cancer Database, we evaluate the comparative effects of three treatment approaches on patient survival, and assess the ramifications of potential unmeasured confounding. The methods developed in this work are readily available in the $\textsf{R}$ package $\textsf{riAFTBART}$.
Submitted 16 February, 2022;
originally announced February 2022.
-
Abstractified Multi-instance Learning (AMIL) for Biomedical Relation Extraction
Authors:
William Hogan,
Molly Huang,
Yannis Katsis,
Tyler Baldwin,
Ho-Cheol Kim,
Yoshiki Vazquez Baeza,
Andrew Bartko,
Chun-Nan Hsu
Abstract:
Relation extraction in the biomedical domain is a challenging task due to a lack of labeled data and a long-tail distribution of fact triples. Many works leverage distant supervision which automatically generates labeled data by pairing a knowledge graph with raw textual data. Distant supervision produces noisy labels and requires additional techniques, such as multi-instance learning (MIL), to denoise the training signal. However, MIL requires multiple instances of data and struggles with very long-tail datasets such as those found in the biomedical domain. In this work, we propose a novel reformulation of MIL for biomedical relation extraction that abstractifies biomedical entities into their corresponding semantic types. By grouping entities by types, we are better able to take advantage of the benefits of MIL and further denoise the training signal. We show this reformulation, which we refer to as abstractified multi-instance learning (AMIL), improves performance in biomedical relationship extraction. We also propose a novel relationship embedding architecture that further improves model performance.
Submitted 24 October, 2021;
originally announced October 2021.
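
The grouping idea in the AMIL abstract above can be illustrated with a small sketch. It is not the authors' implementation: the function name `build_bags`, the toy sentences, and the `entity_types` lookup are all hypothetical, and keying bags on the pair of semantic types is an assumption about how "abstractifying" entities might be realized.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Each instance is (head entity, tail entity, sentence text).
Instance = Tuple[str, str, str]


def build_bags(
    instances: List[Instance],
    entity_types: Optional[Dict[str, str]] = None,
) -> Dict[Tuple[str, str], List[str]]:
    """Group sentences into MIL bags.

    Without entity_types, bags are keyed on the concrete entity pair
    (standard MIL). With entity_types (a hypothetical entity -> semantic
    type lookup), bags are keyed on the type pair instead, so rare entity
    pairs that share types land in the same bag.
    """
    bags: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for head, tail, text in instances:
        if entity_types is None:
            key = (head, tail)
        else:
            key = (entity_types[head], entity_types[tail])
        bags[key].append(text)
    return dict(bags)


# Toy data, invented for illustration only.
toy_instances = [
    ("aspirin", "headache", "Aspirin relieves headache."),
    ("ibuprofen", "fever", "Ibuprofen reduces fever."),
    ("BRCA1", "breast cancer", "BRCA1 is linked to breast cancer."),
]
toy_types = {
    "aspirin": "Drug", "ibuprofen": "Drug",
    "headache": "Symptom", "fever": "Symptom",
    "BRCA1": "Gene", "breast cancer": "Disease",
}

entity_bags = build_bags(toy_instances)            # one sentence per bag
type_bags = build_bags(toy_instances, toy_types)   # drug sentences share a bag
```

In this toy case every entity-pair bag holds a single sentence, which is the long-tail situation the abstract describes, while the type-keyed version pools the two drug-symptom sentences into one bag.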
-
A Study of Social and Behavioral Determinants of Health in Lung Cancer Patients Using Transformers-based Natural Language Processing Models
Authors:
Zehao Yu,
Xi Yang,
Chong Dang,
Songzi Wu,
Prakash Adekkanattu,
Jyotishman Pathak,
Thomas J. George,
William R. Hogan,
Yi Guo,
Jiang Bian,
Yonghui Wu
Abstract:
Social and behavioral determinants of health (SBDoH) have important roles in shaping people's health. In clinical research studies, especially comparative effectiveness studies, failure to adjust for SBDoH factors will potentially cause confounding issues and misclassification errors in both statistical analyses and machine learning-based models. However, there are limited studies to examine SBDoH factors in clinical outcomes due to the lack of structured SBDoH information in current electronic health record (EHR) systems, while much of the SBDoH information is documented in clinical narratives. Natural language processing (NLP) is thus the key technology to extract such information from unstructured clinical text. However, there is not a mature clinical NLP system focusing on SBDoH. In this study, we examined two state-of-the-art transformer-based NLP models, BERT and RoBERTa, to extract SBDoH concepts from clinical narratives, applied the best performing model to extract SBDoH concepts on a lung cancer screening patient cohort, and examined the difference of SBDoH information between NLP extracted results and structured EHRs (SBDoH information captured in standard vocabularies such as the International Classification of Diseases codes). The experimental results show that the BERT-based NLP model achieved the best strict/lenient F1-score of 0.8791 and 0.8999, respectively. The comparison between NLP extracted SBDoH information and structured EHRs in the lung cancer patient cohort of 864 patients with 161,933 various types of clinical notes showed that much more detailed information about smoking, education, and employment was captured only in clinical narratives and that it is necessary to use both clinical narratives and structured EHRs to construct a more complete picture of patients' SBDoH factors.
Submitted 10 August, 2021;
originally announced August 2021.
-
Applications of artificial intelligence in drug development using real-world data
Authors:
Zhaoyi Chen,
Xiong Liu,
William Hogan,
Elizabeth Shenkman,
Jiang Bian
Abstract:
The US Food and Drug Administration (FDA) has been actively promoting the use of real-world data (RWD) in drug development. RWD can generate important real-world evidence reflecting the real-world clinical environment where the treatments are used. Meanwhile, artificial intelligence (AI), especially machine- and deep-learning (ML/DL) methods, have been increasingly used across many stages of the drug development process. Advancements in AI have also provided new strategies to analyze large, multidimensional RWD. Thus, we conducted a rapid review of articles from the past 20 years, to provide an overview of the drug development studies that use both AI and RWD. We found that the most popular applications were adverse event detection, trial recruitment, and drug repurposing. Here, we also discuss current research gaps and future opportunities.
Submitted 2 February, 2021; v1 submitted 21 January, 2021;
originally announced January 2021.
-
Inference for BART with Multinomial Outcomes
Authors:
Yizhen Xu,
Joseph W. Hogan,
Michael J. Daniels,
Rami Kantor,
Ann Mwangi
Abstract:
The multinomial probit Bayesian additive regression trees (MPBART) framework was proposed by Kindo et al. (KD), approximating the latent utilities in the multinomial probit (MNP) model with BART (Chipman et al. 2010). Compared to multinomial logistic models, MNP does not assume independent alternatives and the correlation structure among alternatives can be specified through multivariate Gaussian distributed latent utilities. We introduce two new algorithms for fitting the MPBART and show that the theoretical mixing rates of our proposals are equal or superior to the existing algorithm in KD. Through simulations, we explore the robustness of the methods to the choice of reference level, imbalance in outcome frequencies, and the specifications of prior hyperparameters for the utility error term. The work is motivated by the application of generating posterior predictive distributions for mortality and engagement in care among HIV-positive patients based on electronic health records (EHRs) from the Academic Model Providing Access to Healthcare (AMPATH) in Kenya. In both the application and simulations, we observe better performance using our proposals as compared to KD in terms of MCMC convergence rate and posterior predictive accuracy.
Submitted 12 August, 2022; v1 submitted 17 January, 2021;
originally announced January 2021.
-
Computation of Convex Hull Prices in Electricity Markets with Non-Convexities using Dantzig-Wolfe Decomposition
Authors:
Panagiotis Andrianesis,
Dimitris Bertsimas,
Michael C. Caramanis,
William W. Hogan
Abstract:
The presence of non-convexities in electricity markets has been an active research area for about two decades. The -- inevitable under current marginal cost pricing -- problem of guaranteeing that no market participant incurs losses in the day-ahead market is addressed in current practice through make-whole payments a.k.a. uplift. Alternative pricing rules have been studied to deal with this problem. Among them, Convex Hull (CH) prices associated with minimum uplift have attracted significant attention. Several US Independent System Operators (ISOs) have considered CH prices but resorted to approximations, mainly because determining exact CH prices is computationally challenging, while providing little intuition about the price formation rationale. In this paper, we describe the CH price estimation problem by relying on Dantzig-Wolfe decomposition and Column Generation, as a tractable, highly parallelizable, and exact method -- i.e., yielding exact, not approximate, CH prices -- with guaranteed finite convergence. Moreover, the approach provides intuition on the underlying price formation rationale. A test bed of stylized examples provides an exposition of the intuition in the CH price formation. In addition, a realistic ISO dataset is used to support scalability and validate the proof-of-concept.
Submitted 24 October, 2021; v1 submitted 24 December, 2020;
originally announced December 2020.
-
Witnessing History: Rates and Detectability of Naked-Eye Milky-Way Supernovae
Authors:
C. Tanner Murphey,
Jacob W. Hogan,
Brian D. Fields,
Gautham Narayan
Abstract:
The Milky Way hosts on average a few supernova explosions per century, yet in the past millennium only five supernovae have been identified confidently in the historical record. This deficit of naked-eye supernovae is at least partly due to dust extinction in the Galactic plane. We explore this effect quantitatively, developing a formalism for the supernova probability distribution, accounting for dust and for the observer's flux limit. We then construct a fiducial axisymmetric model for the supernova and dust densities, featuring an exponential dependence on galactocentric radius and height, with core-collapse events in a thin disk and Type Ia events including a thick disk component. When no flux limit is applied, our model predicts supernovae are intrinsically concentrated in the Galactic plane, with Type Ia events extending to higher latitudes reflecting their thick disk component. We then apply a flux limit and include dust effects, to predict the sky distribution of historical supernovae. We use well-observed supernovae as light-curve templates, and introduce naked-eye discovery criteria. The resulting sky distributions are strikingly inconsistent with the locations of confident historical supernovae, none of which lie near our model's central peaks. Indeed, SN 1054 lies off the plane almost exactly in the anticenter, and SN 1181 is in the 2nd Galactic quadrant. We discuss possible explanations for these discrepancies. We calculate the percentage of all supernovae bright enough for historical discovery: $\simeq 13\%$ of core-collapse and $\simeq 33\%$ of Type Ia events. Using these and the confident historical supernovae, we estimate the intrinsic Galactic supernova rates, finding general agreement with other methods. Finally, we urge searches for supernovae in historical records from civilizations in the southern hemisphere.
Submitted 11 December, 2020;
originally announced December 2020.
-
Informed Pooled Testing with Quantitative Assays
Authors:
Tao Liu,
Joseph W Hogan,
Wanning Su,
Yizhen Xu,
Michael J Daniels,
Kantor Rami
Abstract:
Pooled testing is widely used for screening for viral or bacterial infections with low prevalence when individual testing is not cost-efficient. Pooled testing with qualitative assays that give binary results has been well-studied. However, characteristics of pooling with quantitative assays were mostly demonstrated using simulations or empirical studies. We investigate properties of three pooling strategies with quantitative assays: traditional two-stage mini-pooling (MP) (Dorfman, 1943), mini-pooling with deconvolution algorithm (MPA) (May et al., 2010), and marker-assisted MPA (mMPA) (Liu et al., 2017). MPA and mMPA test individuals in a sequence after a positive pool and implement a deconvolution algorithm to determine when testing can cease to ascertain all individual statuses. mMPA uses information from other available markers to determine an optimal order for individual testings. We derive and compare the general statistical properties of the three pooling methods. We show that with a proper pool size, MP, MPA, and mMPA can be more cost-efficient than individual testing, and mMPA is superior to MPA and MP. For diagnostic accuracy, mMPA and MPA have higher specificity and positive predictive value but lower sensitivity and negative predictive value than MP and individual testing. Included in this paper are applications to various simulations and an application for HIV treatment monitoring.
Submitted 31 October, 2020;
originally announced November 2020.
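
The cost-efficiency claim for two-stage mini-pooling in the abstract above can be made concrete with the classical Dorfman calculation. The sketch below is a simplification and not the paper's analysis: it assumes an error-free binary assay and independent infections with a common prevalence, whereas the paper studies quantitative assays. The function names are illustrative.

```python
def expected_tests_per_person(prevalence: float, pool_size: int) -> float:
    """Expected number of tests per individual under two-stage pooling.

    Assumptions (not from the paper): infections are independent with a
    common prevalence, and the assay makes no errors.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be >= 1")
    if pool_size == 1:
        return 1.0  # no pooling: exactly one test per person
    p_pool_negative = (1.0 - prevalence) ** pool_size
    # One pooled test is shared by pool_size people; if the pool is
    # positive, every member is retested individually.
    return 1.0 / pool_size + (1.0 - p_pool_negative)


def best_pool_size(prevalence: float, max_size: int = 50) -> int:
    """Pool size in 1..max_size that minimizes expected tests per person."""
    return min(
        range(1, max_size + 1),
        key=lambda k: expected_tests_per_person(prevalence, k),
    )
```

Under these assumptions, any pool size with an expected value below 1.0 is cheaper than individual testing, which is the sense in which a "proper pool size" matters at low prevalence.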
-
Optimal Allocation of Gold Standard Testing under Constrained Availability: Application to Assessment of HIV Treatment Failure
Authors:
Tao Liu,
Joseph W. Hogan,
Lisa Wang,
Shangxuan Zhang,
Rami Kantor
Abstract:
The World Health Organization (WHO) guidelines for monitoring the effectiveness of HIV treatment in resource-limited settings (RLS) are mostly based on clinical and immunological markers (e.g., CD4 cell counts). Recent research indicates that the guidelines are inadequate and can result in high error rates. Viral load (VL) is considered the "gold standard", yet its widespread use is limited by cost and infrastructure. In this paper, we propose a diagnostic algorithm that uses information from routinely-collected clinical and immunological markers to guide a selective use of VL testing for diagnosing HIV treatment failure, under the assumption that VL testing is available only at a certain portion of patient visits. Our algorithm identifies the patient subpopulation, such that the use of limited VL testing on them minimizes a pre-defined risk (e.g., misdiagnosis error rate). Diagnostic properties of our proposed algorithm are assessed by simulations. For illustration, data from the Miriam Hospital Immunology Clinic (RI, USA) are analyzed.
Submitted 1 October, 2020;
originally announced October 2020.
-
Framework for the Treatment And Reporting of Missing data in Observational Studies: The TARMOS framework
Authors:
Katherine J Lee,
Kate Tilling,
Rosie P Cornish,
Roderick JA Little,
Melanie L Bell,
Els Goetghebeur,
Joseph W Hogan,
James R Carpenter
Abstract:
Missing data are ubiquitous in medical research. Although there is increasing guidance on how to handle missing data, practice is changing slowly and misapprehensions abound, particularly in observational research. We present a practical framework for handling and reporting the analysis of incomplete data in observational studies, which we illustrate using a case study from the Avon Longitudinal Study of Parents and Children. The framework consists of three steps: 1) Develop an analysis plan specifying the analysis model and how missing data are going to be addressed. An important consideration is whether a complete records analysis is likely to be valid, whether multiple imputation or an alternative approach is likely to offer benefits, and whether a sensitivity analysis regarding the missingness mechanism is required. 2) Explore the data, checking the methods outlined in the analysis plan are appropriate, and conduct the pre-planned analysis. 3) Report the results, including a description of the missing data, details on how the missing data were addressed, and the results from all analyses, interpreted in light of the missing data and the clinical relevance. This framework seeks to support researchers in thinking systematically about missing data, and transparently reporting the potential effect on the study results.
Submitted 29 April, 2020;
originally announced April 2020.
-
Identifying Cancer Patients at Risk for Heart Failure Using Machine Learning Methods
Authors:
Xi Yang,
Yan Gong,
Nida Waheed,
Keith March,
Jiang Bian,
William R. Hogan,
Yonghui Wu
Abstract:
Cardiotoxicity related to cancer therapies has become a serious issue, diminishing cancer treatment outcomes and quality of life. Early detection of cancer patients at risk for cardiotoxicity before cardiotoxic treatments and providing preventive measures are potential solutions to improve cancer patients' quality of life. This study focuses on predicting the development of heart failure in cancer patients after cancer diagnoses using historical electronic health record (EHR) data. We examined four machine learning algorithms using 143,199 cancer patients from the University of Florida Health (UF Health) Integrated Data Repository (IDR). We identified a total of 1,958 qualified cases and matched them to 15,488 controls by gender, age, race, and major cancer type. Two feature encoding strategies were compared to encode variables as machine learning features. The gradient boosting (GB) based model achieved the best AUC score of 0.9077 (with a sensitivity of 0.8520 and a specificity of 0.8138), outperforming other machine learning methods. We also looked into the subgroup of cancer patients with exposure to chemotherapy drugs and observed a lower specificity score (0.7089). The experimental results show that machine learning methods are able to capture clinical factors that are known to be associated with heart failure and that it is feasible to use machine learning methods to identify cancer patients at risk for cancer therapy-related heart failure.
Submitted 1 October, 2019;
originally announced October 2019.
-
Assessing the Validity of a a priori Patient-Trial Generalizability Score using Real-world Data from a Large Clinical Data Research Network: A Colorectal Cancer Clinical Trial Case Study
Authors:
Qian Li,
Zhe He,
Yi Guo,
Hansi Zhang,
Thomas J George Jr,
William Hogan,
Neil Charness,
Jiang Bian
Abstract:
Existing trials have not taken enough consideration of their population representativeness, which can lower effectiveness when the treatment is applied in real-world clinical practice. We analyzed the eligibility criteria of Bevacizumab colorectal cancer treatment trials, assessed their a priori generalizability, and examined how it affects patient outcomes when applied in real-world clinical settings. To do so, we extracted patient-level data from a large collection of electronic health records (EHRs) from the OneFlorida consortium. We built a zero-inflated negative binomial model using a composite patient-trial generalizability (cPTG) score to predict patients' clinical outcomes (i.e., number of serious adverse events (SAEs)). Our study results provide a body of evidence that 1) the cPTG scores can predict patient outcomes; and 2) patients who are more similar to the study population in the trials that were used to develop the treatment will have a significantly lower likelihood of experiencing serious adverse events.
Submitted 24 June, 2019;
originally announced June 2019.
-
Modeling the Causal Effect of Treatment Initiation Time on Survival: Application to HIV/TB Co-infection
Authors:
Liangyuan Hu,
Joseph W. Hogan,
Ann W. Mwangi,
Abraham Siika
Abstract:
The timing of antiretroviral therapy (ART) initiation for HIV and tuberculosis (TB) co-infected patients needs to be considered carefully. CD4 cell count can be used to guide decision making about when to initiate ART. Evidence from recent randomized trials and observational studies generally supports early initiation but does not provide information about effects of initiation time on a continuous scale. In this paper, we develop and apply a highly flexible structural proportional hazards model for characterizing the effect of treatment initiation time on a survival distribution. The model can be fitted using a weighted partial likelihood score function. Construction of both the score function and the weights must accommodate censoring of the treatment initiation time, the outcome, or both. The methods are applied to data on 4903 individuals with HIV/TB co-infection, derived from electronic health records in a large HIV care program in Kenya. We use a model formulation that flexibly captures the joint effects of ART initiation time and ART duration using natural cubic splines. The model is used to generate survival curves corresponding to specific treatment initiation times; and to identify optimal times for ART initiation for subgroups defined by CD4 count at time of TB diagnosis. Our findings potentially provide "higher resolution" information about the relationship between ART timing and mortality, and about the differential effect of ART timing within CD4 subgroups.
Submitted 2 April, 2019;
originally announced April 2019.
-
Causal comparative effectiveness analysis of dynamic continuous-time treatment initiation rules with sparsely measured outcomes and death
Authors:
Liangyuan Hu,
Joseph W. Hogan
Abstract:
Evidence supporting the current World Health Organization recommendations of early antiretroviral therapy (ART) initiation for adolescents is inconclusive. We leverage a large observational data set and compare, in terms of mortality and CD4 cell count, the dynamic treatment initiation rules for HIV-infected adolescents. Our approaches extend the marginal structural model for estimating outcome distributions under dynamic treatment regimes (DTR), developed in Robins et al. (2008), to allow the causal comparisons of both specific regimes and regimes along a continuum. Furthermore, we propose strategies to address three challenges posed by the complex data set: continuous-time measurement of the treatment initiation process; sparse measurement of longitudinal outcomes of interest, leading to incomplete data; and censoring due to dropout and death. We derive a weighting strategy for continuous time treatment initiation; use imputation to deal with missingness caused by sparse measurements and dropout; and define a composite outcome that incorporates both death and CD4 count as a basis for comparing treatment regimes. Our analysis suggests that immediate ART initiation leads to lower mortality and higher median values of the composite outcome, relative to other initiation rules.
Submitted 2 April, 2019;
originally announced April 2019.
-
Classification using Ensemble Learning under Weighted Misclassification Loss
Authors:
Yizhen Xu,
Tao Liu,
Michael J. Daniels,
Rami Kantor,
Ann Mwangi,
Joseph W. Hogan
Abstract:
Binary classification rules based on covariates typically depend on simple loss functions such as zero-one misclassification. Some cases may require more complex loss functions. For example, individual-level monitoring of HIV-infected individuals on antiretroviral therapy (ART) requires periodic assessment of treatment failure, defined as having a viral load (VL) value above a certain threshold. In some resource limited settings, VL tests may be limited by cost or technology, and diagnoses are based on other clinical markers. Depending on the scenario, a higher premium may be placed on avoiding false positives, which bring greater cost and reduced treatment options. Here, the optimal rule is determined by minimizing a weighted misclassification loss/risk.
We propose a method for finding and cross-validating optimal binary classification rules under weighted misclassification loss. We focus on rules comprising a prediction score and an associated threshold, where the score is derived using an ensemble learner. Simulations and examples show that our method, which derives the score and threshold jointly, more accurately estimates overall risk and has better operating characteristics compared with methods that derive the score first and the cutoff conditionally on the score, especially for finite samples.
Submitted 10 May, 2019; v1 submitted 16 December, 2018;
originally announced December 2018.
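
To make the notion of weighted misclassification loss in the abstract above concrete, here is a minimal sketch of choosing a cutoff for a fixed prediction score. It illustrates only the simpler "score first, cutoff second" baseline that the abstract contrasts with, not the authors' joint method; the function names, the weights, and the toy data are assumptions made for illustration.

```python
from typing import List, Sequence


def weighted_loss(
    scores: Sequence[float],
    labels: Sequence[int],
    threshold: float,
    fp_weight: float,
    fn_weight: float,
) -> float:
    """Average weighted misclassification loss of the rule score >= threshold."""
    false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    false_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return (fp_weight * false_pos + fn_weight * false_neg) / len(labels)


def best_threshold(
    scores: Sequence[float],
    labels: Sequence[int],
    fp_weight: float = 1.0,
    fn_weight: float = 1.0,
) -> float:
    """Cutoff minimizing the weighted loss over the observed scores.

    Candidates are the distinct observed scores plus +inf ("classify
    nobody as positive"). Ties go to the smallest candidate.
    """
    candidates: List[float] = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: weighted_loss(scores, labels, t, fp_weight, fn_weight),
    )


# Toy scores and failure labels, invented for illustration only.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]

cutoff_fp_costly = best_threshold(toy_scores, toy_labels, fp_weight=5.0, fn_weight=1.0)
cutoff_fn_costly = best_threshold(toy_scores, toy_labels, fp_weight=1.0, fn_weight=5.0)
```

Raising the false-positive weight pushes the chosen cutoff upward in this toy example, which matches the abstract's point that the optimal rule depends on how the two error types are priced.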
-
Reduced Bias for respondent driven sampling: accounting for non-uniform edge sampling probabilities in people who inject drugs in Mauritius
Authors:
Miles Q. Ott,
Krista J. Gile,
Matthew T. Harrison,
Lisa G. Johnston,
Joseph W. Hogan
Abstract:
People who inject drugs are an important population to study in order to reduce transmission of blood-borne illnesses including HIV and Hepatitis. In this paper we estimate the HIV and Hepatitis C prevalence among people who inject drugs, as well as the proportion of people who inject drugs who are female in Mauritius. Respondent driven sampling (RDS), a widely adopted link-tracing sampling design used to collect samples from hard-to-reach human populations, was used to collect this sample. The random walk approximation underlying many common RDS estimators assumes that each social relation (edge) in the underlying social network has an equal probability of being traced in the collection of the sample. This assumption does not hold in practice. We show that certain RDS estimators are sensitive to the violation of this assumption. In order to address this limitation in current methodology, and the impact it may have on prevalence estimates, we present a new method for improving RDS prevalence estimators using estimated edge inclusion probabilities, and apply this to data from Mauritius.
Submitted 25 December, 2017;
originally announced December 2017.
-
Inference about ATE from Observational Studies with Continuous Outcome and Unmeasured Confounding
Authors:
Tao Liu,
Joseph W. Hogan
Abstract:
For settings with a binary treatment and a binary outcome, instrumental variables can be used to construct bounds on a causal treatment effect. With continuous outcomes, meaningful bounds are more difficult to obtain because the domain of the outcome is typically unrestricted. In this paper, we combine an instrumental variable and subjective assumptions in the context of an observational cohort study of HIV-infected women to construct meaningful bounds on the initial-stage causal effect of antiretroviral therapy on CD4 count. The subjective assumptions are encoded in terms of the potential outcomes that are identified by observed data as well as a sensitivity parameter that captures the impact of unmeasured confounding. Measured confounding is adjusted using the method of inverse probability weighting (IPW). With extra information from an IV, we quantify both the causal treatment effect and the degree of the unmeasured confounding. We demonstrate our method by analyzing data from the HIV Epidemiology Research Study.
Submitted 25 March, 2013;
originally announced March 2013.
-
Bayesian Biosurveillance of Disease Outbreaks
Authors:
Gregory F. Cooper,
Denver Dash,
John Levander,
Weng-Keen Wong,
William Hogan,
Michael Wagner
Abstract:
Early, reliable detection of disease outbreaks is a critical problem today. This paper reports an investigation of the use of causal Bayesian networks to model spatio-temporal patterns of a non-contagious disease (respiratory anthrax infection) in a population of people. The number of parameters in such a network can become enormous, if not carefully managed. Also, inference needs to be performed in real time as population data stream in. We describe techniques we have applied to address both the modeling and inference challenges. A key contribution of this paper is the explication of assumptions and techniques that are sufficient to allow the scaling of Bayesian network modeling and inference to millions of nodes for real-time surveillance applications. The results reported here provide a proof-of-concept that Bayesian networks can serve as the foundation of a system that effectively performs Bayesian biosurveillance of disease outbreaks.
Submitted 11 July, 2012;
originally announced July 2012.
-
HIV dynamics and natural history studies: Joint modeling with doubly interval-censored event time and infrequent longitudinal data
Authors:
Li Su,
Joseph W. Hogan
Abstract:
Hepatitis C virus (HCV) coinfection has become one of the most challenging clinical situations to manage in HIV-infected patients. Recently the effect of HCV coinfection on HIV dynamics following initiation of highly active antiretroviral therapy (HAART) has drawn considerable attention. Post-HAART HIV dynamics are commonly studied in short-term clinical trials with frequent data collection design. For example, the elimination process of plasma virus during treatment is closely monitored with daily assessments in viral dynamics studies of AIDS clinical trials. In this article instead we use infrequent cohort data from long-term natural history studies and develop a model for characterizing post-HAART HIV dynamics and their associations with HCV coinfection. Specifically, we propose a joint model for doubly interval-censored data for the time between HAART initiation and viral suppression, and the longitudinal CD4 count measurements relative to the viral suppression. Inference is accomplished using a fully Bayesian approach. Doubly interval-censored data are modeled semiparametrically by Dirichlet process priors and Bayesian penalized splines are used for modeling population-level and individual-level mean CD4 count profiles. We use the proposed methods and data from the HIV Epidemiology Research Study (HERS) to investigate the effect of HCV coinfection on the response to HAART.
Submitted 3 May, 2011;
originally announced May 2011.
-
Decomposing Intraday Dependence in Currency Markets: Evidence from the AUD/USD Spot Market
Authors:
Jonathan A. Batten,
Craig A. Ellis,
Warren P. Hogan
Abstract:
The local Hurst exponent, a measure employed to detect the presence of dependence in a time series, may also be used to investigate the source of intraday variation observed in the returns in foreign exchange markets. Given that changes in the local Hurst exponent may be due to either a time-varying range, or standard deviation, or both of these simultaneously, values for the range, standard deviation and local Hurst exponent are recorded and analyzed separately. To illustrate this approach, a high-frequency data set of the spot Australian dollar/U.S. dollar provides evidence of the returns distribution across the 24-hour trading day with time-varying dependence and volatility clearly aligning with the opening and closing of markets. This variation is attributed to the effects of liquidity and the price-discovery actions of dealers.
Submitted 17 December, 2004;
originally announced December 2004.
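The abstract above separates the local Hurst exponent into its range and standard deviation components. A minimal sketch of that decomposition follows, assuming the common rescaled-range definition in which the local exponent for a window of n returns is log(R/S) / log(n). Whether the paper uses exactly this form is an assumption, and the function names and toy series are hypothetical.

```python
import math


def range_and_std(returns):
    """Return (R, S): range of cumulative mean deviations and standard deviation."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = sum(returns) / n
    running = 0.0
    cumulative = []
    for r in returns:
        running += r - mean  # cumulative deviation from the window mean
        cumulative.append(running)
    value_range = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((r - mean) ** 2 for r in returns) / n)
    return value_range, std


def local_hurst(returns):
    """Local Hurst exponent log(R/S) / log(n) for one window (assumed definition)."""
    value_range, std = range_and_std(returns)
    if value_range <= 0.0 or std <= 0.0:
        raise ValueError("window has no variation; exponent is undefined")
    return math.log(value_range / std) / math.log(len(returns))


# Recording R, S and the exponent separately shows which component moves it.
window = [1.0, 1.0, -1.0, -1.0]
r_value, s_value = range_and_std(window)
h_value = local_hurst(window)
```

Applying these functions to successive intraday windows would give three series (range, standard deviation, local exponent) that can be compared across the trading day.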
-
New Measurements of CP Violation Parameters as Tests of CPT in K Meson Decay
Authors:
G. D. Gollin,
W. P. Hogan
Abstract:
Using a technique which employs a pair of solid scintillator regenerators, the E773 collaboration has measured several CP violation parameters in K meson decay at Fermilab. We report new results for the phase of eta_+-, the K_L-K_S mass difference, the K_S lifetime, and the phase difference Arg(eta_00)-Arg(eta_+-) in K ->pi pi decay. In addition, we report a measurement of the magnitude and phase of eta_+-gamma in K ->pi+ pi- gamma decay. Our preliminary results are compared with theoretical expectations based on CPT symmetry. (Glasgow ICHEP94 paper Ref. gls0167, set with REVTeX)
Submitted 9 August, 1994;
originally announced August 1994.