-
Enriching the scholarly metadata commons with citation metadata and spatio-temporal metadata to support responsible research assessment and research discovery
Authors:
Daniel Nüst,
Gazi Yücel,
Anette Cordts,
Christian Hauschke
Abstract:
In this article, we focus on the importance of open research information as the foundation for transparent and responsible research assessment and discovery of research outputs. We introduce work in which we support the open research information commons by enabling, in particular, independent and small Open Access journals to provide metadata to several open data hubs (Open Citations, Wikidata, Open Research Knowledge Graph). In this context, we present The OPTIMETA Way, a means to integrate metadata collection, enrichment, and distribution in an effective and quality-ensured way that enables uptake even amongst small scholar-led publication venues. We have designed an implementation strategy for this approach in the form of two plugins for the most widely used journal publishing software, Open Journal Systems (OJS). These plugins collect, enrich, and automatically deliver citation metadata and spatio-temporal metadata for articles. Our contribution to research assessment and discovery with linked open bibliographic data is threefold. First, we enlarge the open research information data pool by advocating for the collection of enriched, user-validated metadata at the time of publication through open APIs. Second, we integrate data platforms and journals currently not included in the standard scientometric practices because of their language or lack of support from big publishing houses. Third, we allow new use cases based on location and temporal metadata that go beyond commonly used discovery features, specifically, the assessment of research activities using spatial coverage and new transdisciplinary connections between research outputs.
Submitted 4 January, 2023;
originally announced January 2023.
-
The role of metadata in reproducible computational research
Authors:
Jeremy Leipzig,
Daniel Nüst,
Charles Tapley Hoyt,
Stian Soiland-Reyes,
Karthik Ram,
Jane Greenberg
Abstract:
Reproducible computational research (RCR) is the keystone of the scientific method for in silico analyses, packaging the transformation of raw data to published results. In addition to its role in research integrity, RCR has the capacity to significantly accelerate evaluation and reuse. This potential and wide support for the FAIR principles have motivated interest in metadata standards supporting RCR. Metadata provides context and provenance to raw data and methods and is essential to both discovery and validation. Despite this shared connection with scientific data, few studies have explicitly described the relationship between metadata and RCR. This article employs a functional content analysis to identify metadata standards that support RCR functions across an analytic stack consisting of input data, tools, notebooks, pipelines, and publications. Our article provides background context, explores gaps, and discovers component trends of embeddedness and methodology weight from which we derive recommendations for future work.
Submitted 19 April, 2021; v1 submitted 15 June, 2020;
originally announced June 2020.
-
The Rockerverse: Packages and Applications for Containerization with R
Authors:
Daniel Nüst,
Dirk Eddelbuettel,
Dom Bennett,
Robrecht Cannoodt,
Dav Clark,
Gergely Daroczi,
Mark Edmondson,
Colin Fay,
Ellis Hughes,
Lars Kjeldgaard,
Sean Lopp,
Ben Marwick,
Heather Nolis,
Jacqueline Nolis,
Hong Ooi,
Karthik Ram,
Noam Ross,
Lori Shepherd,
Péter Sólymos,
Tyson Lee Swetnam,
Nitesh Turaga,
Charlotte Van Petegem,
Jason Williams,
Craig Willis,
Nan Xiao
Abstract:
The Rocker Project provides widely used Docker images for R across different application scenarios. This article surveys downstream projects that build upon the Rocker Project images and presents the current state of R packages for managing Docker images and controlling containers. These use cases cover diverse topics such as package development, reproducible research, collaborative work, cloud-based data processing, and production deployment of services. The variety of applications demonstrates the power of the Rocker Project specifically and containerisation in general. Across the diverse ways to use containers, we identified common themes: reproducible environments, scalability and efficiency, and portability across clouds. We conclude that the current growth and diversification of use cases is likely to continue its positive impact, but see the need for consolidating the Rockerverse ecosystem of packages, developing common practices for applications, and exploring alternative containerisation software.
Submitted 17 August, 2020; v1 submitted 28 January, 2020;
originally announced January 2020.
-
Publishing computational research -- A review of infrastructures for reproducible and transparent scholarly communication
Authors:
Markus Konkol,
Daniel Nüst,
Laura Goulier
Abstract:
The trend toward open science increases the pressure on authors to provide access to the source code and data they used to compute the results reported in their scientific papers. Since sharing materials reproducibly is challenging, several projects have developed solutions to support the release of executable analyses alongside articles. We reviewed 11 applications that can assist researchers in adhering to reproducibility principles. The applications were found through a literature search and interactions with the reproducible research community. An application was included in our analysis if it was actively maintained at the time the data for this paper was collected, supports the publication of executable code and data, and is connected to the scholarly publication process. By investigating the software documentation and published articles, we compared the applications across 19 criteria, e.g., features that support authors in creating and readers in studying executable papers. Of the 11 applications, eight allow publishers to self-host the system for free, whereas three provide paid services. Authors can submit an executable analysis using Jupyter Notebooks or R Markdown documents (10 applications support these formats). All approaches provide features to assist readers in studying the materials, e.g., one-click reproducible results or tools for manipulating the analysis parameters. Six applications allow for modifying materials after publication. The applications support authors in publishing reproducible research predominantly with literate programming. Concerning readers, most applications provide user interfaces to inspect and manipulate the computational analysis.
Submitted 14 July, 2020; v1 submitted 2 January, 2020;
originally announced January 2020.
-
How to Read a Research Compendium
Authors:
Daniel Nüst,
Carl Boettiger,
Ben Marwick
Abstract:
Researchers spend a great deal of time reading research papers. Keshav (2012) provides researchers with a three-pass method to improve their reading skills. This article extends Keshav's method for reading a research compendium. Research compendia are an increasingly used form of publication, which packages not only the research paper's text and figures, but also all data and software for better reproducibility. We introduce the existing conventions for research compendia and suggest how to utilise their shared properties in a structured reading process. Unlike the original, this article is not built upon a long history but intends to provide guidance at the outset of an emerging practice.
Submitted 11 June, 2018;
originally announced June 2018.