Ontologizing Health Systems Data at Scale: Making Translational Discovery a Reality
Authors:
Tiffany J. Callahan,
Adrianne L. Stefanski,
Jordan M. Wyrwa,
Chenjie Zeng,
Anna Ostropolets,
Juan M. Banda,
William A. Baumgartner Jr.,
Richard D. Boyce,
Elena Casiraghi,
Ben D. Coleman,
Janine H. Collins,
Sara J. Deakyne-Davies,
James A. Feinstein,
Melissa A. Haendel,
Asiyah Y. Lin,
Blake Martin,
Nicolas A. Matentzoglu,
Daniella Meeker,
Justin Reese,
Jessica Sinclair,
Sanya B. Taneja,
Katy E. Trinkley,
Nicole A. Vasilevsky,
Andrew Williams,
Xingman A. Zhang
, et al. (7 additional authors not shown)
Abstract:
Background: Common data models solve many challenges of standardizing electronic health record (EHR) data, but are unable to semantically integrate all the resources needed for deep phenoty**. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide computable representations of biological knowledge and enable the integration of heterogeneous data. However, map** EHR data to OB…
▽ More
Background: Common data models solve many challenges of standardizing electronic health record (EHR) data, but are unable to semantically integrate all the resources needed for deep phenoty**. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide computable representations of biological knowledge and enable the integration of heterogeneous data. However, map** EHR data to OBO ontologies requires significant manual curation and domain expertise. Objective: We introduce OMOP2OBO, an algorithm for map** Observational Medical Outcomes Partnership (OMOP) vocabularies to OBO ontologies. Results: Using OMOP2OBO, we produced map**s for 92,367 conditions, 8611 drug ingredients, and 10,673 measurement results, which covered 68-99% of concepts used in clinical practice when examined across 24 hospitals. When used to phenotype rare disease patients, the map**s helped systematically identify undiagnosed patients who might benefit from genetic testing. Conclusions: By aligning OMOP vocabularies to OBO ontologies our algorithm presents new opportunities to advance EHR-based deep phenoty**.
△ Less
Submitted 30 January, 2023; v1 submitted 10 September, 2022;
originally announced September 2022.