Search | arXiv e-print repository

Ontologizing Health Systems Data at Scale: Making Translational Discovery a Reality

Authors: Tiffany J. Callahan, Adrianne L. Stefanski, Jordan M. Wyrwa, Chenjie Zeng, Anna Ostropolets, Juan M. Banda, William A. Baumgartner Jr., Richard D. Boyce, Elena Casiraghi, Ben D. Coleman, Janine H. Collins, Sara J. Deakyne-Davies, James A. Feinstein, Melissa A. Haendel, Asiyah Y. Lin, Blake Martin, Nicolas A. Matentzoglu, Daniella Meeker, Justin Reese, Jessica Sinclair, Sanya B. Taneja, Katy E. Trinkley, Nicole A. Vasilevsky, Andrew Williams, Xingman A. Zhang, et al. (7 additional authors not shown)

Abstract: Background: Common data models solve many challenges of standardizing electronic health record (EHR) data, but are unable to semantically integrate all the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide computable representations of biological knowledge and enable the integration of heterogeneous data. However, mapping EHR data to OBO ontologies requires significant manual curation and domain expertise. Objective: We introduce OMOP2OBO, an algorithm for mapping Observational Medical Outcomes Partnership (OMOP) vocabularies to OBO ontologies. Results: Using OMOP2OBO, we produced mappings for 92,367 conditions, 8,611 drug ingredients, and 10,673 measurement results, which covered 68-99% of concepts used in clinical practice when examined across 24 hospitals. When used to phenotype rare disease patients, the mappings helped systematically identify undiagnosed patients who might benefit from genetic testing. Conclusions: By aligning OMOP vocabularies to OBO ontologies our algorithm presents new opportunities to advance EHR-based deep phenotyping.

Submitted 30 January, 2023; v1 submitted 10 September, 2022; originally announced September 2022.

Comments: Supplementary Material is included at the end of the manuscript

ACM Class: J.3

arXiv:2208.13071 [pdf, other]

Analysis of Validating and Verifying OpenACC Compilers 3.0 and Above

Authors: A. M. Jarmusch, A. Liu, C. Munley, D. Horta, V. Ravichandran, J. Denny, S. Chandrasekaran

Abstract: OpenACC is a high-level directive-based parallel programming model that can manage the sophistication of heterogeneity in architectures and abstract it from the users. The portability of the model across CPUs and accelerators has gained the model a wide variety of users. This means it is also crucial to analyze the reliability of the compilers' implementations. To address this challenge, the OpenACC Validation and Verification team has proposed a validation testsuite to verify the OpenACC implementations across various compilers with an infrastructure for a more streamlined execution. This paper will cover the following aspects: (a) the new developments since the last publication on the testsuite, (b) outline the use of the infrastructure, (c) discuss tests that highlight our workflow process, (d) analyze the results from executing the testsuite on various systems, and (e) outline future developments.

Submitted 27 August, 2022; originally announced August 2022.

arXiv:2006.16868 [pdf, other]

Predicting Sample Collision with Neural Networks

Authors: Tuan Tran, Jory Denny, Chinwe Ekenna

Abstract: Many state-of-the-art robotics applications require fast and efficient motion planning algorithms. Existing motion planning methods become less effective as the dimensionality of the robot and its workspace increases, especially the computational cost of collision detection routines. In this work, we present a framework to address the cost of expensive primitive operations in sampling-based motion planning. This framework determines the validity of a sample robot configuration through a novel combination of a Contractive AutoEncoder (CAE), which captures an occupancy grid representation of the robot's workspace, and a Multilayer Perceptron, which efficiently predicts the collision state of the robot from the CAE and the robot's configuration. We evaluate our framework on multiple planning problems with a variety of robots in 2D and 3D workspaces. The results show that (1) the framework is computationally efficient in all investigated problems, and (2) the framework generalizes well to new workspaces.

Submitted 30 June, 2020; originally announced June 2020.

Comments: 7 pages, 7 figures

arXiv:1811.06183 [pdf]

Characterizing Design Patterns of EHR-Driven Phenotype Extraction Algorithms

Authors: Yizhen Zhong, Luke Rasmussen, Yu Deng, Jennifer Pacheco, Maureen Smith, Justin Starren, Wei-Qi Wei, Peter Speltz, Joshua Denny, Nephi Walton, George Hripcsak, Christopher G Chute, Yuan Luo

Abstract: The automatic development of phenotype algorithms from Electronic Health Record data with machine learning (ML) techniques is of great interest given the current practice is very time-consuming and resource intensive. The extraction of design patterns from phenotype algorithms is essential to understand their rationale and standard, with great potential to automate the development process. In this pilot study, we perform network visualization on the design patterns and their associations with phenotypes and sites. We classify design patterns using the fragments from previously annotated phenotype algorithms as the ground truth. The classification performance is used as a proxy for coherence at the attribution level. The bag-of-words representation with knowledge-based features generated a good performance in the classification task (0.79 macro-f1 scores). Good classification accuracy with simple features demonstrated the attribution coherence and the feasibility of automatic identification of design patterns. Our results point to both the feasibility and challenges of automatic identification of phenotyping design patterns, which would power the automatic development of phenotype algorithms.

Submitted 15 November, 2018; originally announced November 2018.

Comments: 4 pages, accepted by IEEE BIBM 2018 as short paper

arXiv:1407.2854 [pdf, other]

Graph Compartmentalization

Authors: Matthew J. Denny

Abstract: This article introduces a concept and measure of graph compartmentalization. This new measure allows for principled comparison between graphs of arbitrary structure, unlike existing measures such as graph modularity. The proposed measure is invariant to graph size and number of groups and can be calculated analytically, facilitating measurement on very large graphs. I also introduce a block model generative process for compartmentalized graphs as a benchmark on which to validate the proposed measure. Simulation results demonstrate improved performance of the new measure over modularity in recovering the degree of compartmentalization of graphs simulated from the generative model. I also explore an application to the measurement of political polarization.

Submitted 27 August, 2014; v1 submitted 10 July, 2014; originally announced July 2014.

Comments: 11 pages, 5 figures

Showing 1–5 of 5 results for author: Denny, J