-
Biolink Model: A Universal Schema for Knowledge Graphs in Clinical, Biomedical, and Translational Science
Authors:
Deepak R. Unni,
Sierra A. T. Moxon,
Michael Bada,
Matthew Brush,
Richard Bruskiewich,
Paul Clemons,
Vlado Dancik,
Michel Dumontier,
Karamarie Fecho,
Gustavo Glusman,
Jennifer J. Hadlock,
Nomi L. Harris,
Arpita Joshi,
Tim Putman,
Guangrong Qin,
Stephen A. Ramsey,
Kent A. Shefchek,
Harold Solbrig,
Karthik Soman,
Anne T. Thessen,
Melissa A. Haendel,
Chris Bizon,
Christopher J. Mungall,
the Biomedical Data Translator Consortium
Abstract:
Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness between core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these "knowledge graphs" (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open-access model for standardization across biomedical KGs have left the task of reconciling data sources to downstream consumers. Biolink Model is an open source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object-oriented classification and graph-oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates), representing biomedical entities such as gene, disease, chemical, anatomical structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.
Submitted 25 March, 2022;
originally announced March 2022.
-
A Simple Standard for Sharing Ontological Mappings (SSSOM)
Authors:
Nicolas Matentzoglu,
James P. Balhoff,
Susan M. Bello,
Chris Bizon,
Matthew Brush,
Tiffany J. Callahan,
Christopher G Chute,
William D. Duncan,
Chris T. Evelo,
Davera Gabriel,
John Graybeal,
Alasdair Gray,
Benjamin M. Gyori,
Melissa Haendel,
Henriette Harmse,
Nomi L. Harris,
Ian Harrow,
Harshad Hegde,
Amelia L. Hoyt,
Charles T. Hoyt,
Dazhi Jiao,
Ernesto Jiménez-Ruiz,
Simon Jupp,
Hyeongsik Kim,
Sebastian Koehler,
et al. (19 additional authors not shown)
Abstract:
Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Are they associated in some other way? Such relationships between the mapped terms are often not documented, leading to incorrect assumptions and making them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Also, the lack of descriptions of how mappings were done makes it hard to combine and reconcile mappings, particularly curated and automated ones.
The Simple Standard for Sharing Ontological Mappings (SSSOM) addresses these problems by: 1. Introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in mappings explicit. 2. Defining an easy-to-use table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data standards. 3. Implementing open and community-driven collaborative workflows designed to evolve the standard continuously to address changing requirements and mapping practices. 4. Providing reference tools and software libraries for working with the standard.
In this paper, we present the SSSOM standard, describe several use cases, and survey some existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable, and Reusable (FAIR). The SSSOM specification is at http://w3id.org/sssom/spec.
Submitted 13 December, 2021;
originally announced December 2021.
-
On structure-stabilizing electronic interferences in bcc-related phases. A research report
Authors:
Heinrich Solbrig
Abstract:
This study deals with cubic crystals where the contents of the simple cubic unit cells are close to $n \times n \times n$-bcc sublattices ($n$ = 2: diamond- and zinc-blende type, $n$ = 3: $γ$-brasses). First-principles results on the electronic structure are obtained from augmented LMTO-ASA calculations and interpreted within a VEC-based Hume-Rothery concept which employs joined planar-radial interferences to treat interference and hybridization on the same footing. We show that the charge redistribution supports enhanced electronic interference which causes the band energy to decrease. Several topics are included such as stabilizing networks, hardness and $s$-to-$p$ transfer, co-operation of interferences, interplay between local radial order and global planar order, electron-per-atom ratio, and the comparison with recent FLAPW-based results.
Submitted 22 March, 2023; v1 submitted 24 March, 2019;
originally announced March 2019.
-
Validating and describing linked data portals using shapes
Authors:
Jose-Emilio Labra-Gayo,
Eric Prud'hommeaux,
Harold Solbrig,
Iovka Boneva
Abstract:
Linked data portals need to be able to advertise and describe the structure of their content. A sufficiently expressive and intuitive schema language will allow portals to communicate these structures. Validation tools will aid in the publication and maintenance of linked data and increase their quality.
Two schema language proposals have recently emerged for describing the structures of RDF graphs: Shape Expressions (ShEx) and Shapes Constraint Language (SHACL). In this paper we describe how these formalisms can be used in the development of a linked data portal to describe and validate its contents. As a use case, we specify a data model inspired by the WebIndex data model, a medium-sized linked data portal, using both ShEx and SHACL, and we propose a benchmark that can generate compliant test data structures of any size. We then perform some preliminary experiments showing the performance of one validation engine based on ShEx.
Submitted 31 January, 2017;
originally announced January 2017.