Analyzing Partitioned FAIR Health Data Responsibly
Authors:
Chang Sun,
Lianne Ippel,
Birgit Wouters,
Johan van Soest,
Alexander Malic,
Onaopepo Adekunle,
Bob van den Berg,
Marco Puts,
Ole Mussmann,
Annemarie Koster,
Carla van der Kallen,
David Townend,
Andre Dekker,
Michel Dumontier
Abstract:
It is widely anticipated that the use of health-related big data will enable further understanding and improvements in human health and wellbeing. Our current project, funded through the Dutch National Research Agenda, aims to explore the relationship between the development of diabetes and socio-economic factors such as lifestyle and health care utilization. The analysis involves combining data from the Maastricht Study (DMS), a prospective clinical study, and data collected by Statistics Netherlands (CBS) as part of its routine operations. However, a wide array of social, legal, technical, and scientific issues hinder the analysis. In this paper, we describe these challenges and our progress towards addressing them.
Submitted 2 December, 2018;
originally announced December 2018.
Nanopublications: A Growing Resource of Provenance-Centric Scientific Linked Data
Authors:
Tobias Kuhn,
Albert Meroño-Peñuela,
Alexander Malic,
Jorrit H. Poelen,
Allen H. Hurlbert,
Emilio Centeno Ortiz,
Laura I. Furlong,
Núria Queralt-Rosinach,
Christine Chichester,
Juan M. Banda,
Egon Willighagen,
Friederike Ehrhart,
Chris Evelo,
Tareq B. Malas,
Michel Dumontier
Abstract:
Nanopublications are a Linked Data format for scholarly data publishing that has received considerable uptake in the last few years. In contrast to the common Linked Data publishing practice, nanopublications work at the granular level of atomic information snippets and provide a consistent container format to attach provenance and metadata at this atomic level. While the nanopublications format is domain-independent, the datasets that have become available in this format are mostly from Life Science domains, including data about diseases, genes, proteins, drugs, biological pathways, and biotic interactions. More than 10 million such nanopublications have been published, which now form a valuable resource for studies on the domain level of the given Life Science domains as well as on the more technical levels of provenance modeling and heterogeneous Linked Data. We provide here an overview of this combined nanopublication dataset, show the results of some overarching analyses, and describe how it can be accessed and queried.
Submitted 18 September, 2018;
originally announced September 2018.