-
Toward Enabling Reproducibility for Data-Intensive Research using the Whole Tale Platform
Authors:
Kyle Chard,
Niall Gaffney,
Mihael Hategan,
Kacper Kowalik,
Bertram Ludaescher,
Timothy McPhillips,
Jarek Nabrzyski,
Victoria Stodden,
Ian Taylor,
Thomas Thelen,
Matthew J. Turk,
Craig Willis
Abstract:
Whole Tale (http://wholetale.org) is a web-based, open-source platform for reproducible research supporting the creation, sharing, execution, and verification of "Tales" for the scientific research community. Tales are executable research objects that capture the code, data, and environment along with narrative and workflow information needed to re-create computational results from scientific studies. Creating research objects that enable reproducibility, transparency, and re-execution for computational experiments requiring significant compute resources or utilizing massive data is an especially challenging open problem. We describe opportunities, challenges, and solutions for facilitating reproducibility for data- and compute-intensive research, which we call "Tales at Scale," using the Whole Tale computing platform. We highlight challenges and solutions in frontend responsiveness needs, gaps in current middleware design and implementation, network restrictions, containerization, and data access. Finally, we discuss challenges in packaging computational experiment implementations for portable data-intensive Tales and outline future work.
Submitted 12 May, 2020;
originally announced May 2020.
-
Computing Environments for Reproducibility: Capturing the "Whole Tale"
Authors:
Adam Brinckman,
Kyle Chard,
Niall Gaffney,
Mihael Hategan,
Matthew B. Jones,
Kacper Kowalik,
Sivakumar Kulasekaran,
Bertram Ludäscher,
Bryce D. Mecum,
Jarek Nabrzyski,
Victoria Stodden,
Ian J. Taylor,
Matthew J. Turk,
Kandace Turner
Abstract:
The act of sharing scientific knowledge is rapidly evolving away from traditional articles and presentations to the delivery of executable objects that integrate the data and computational details (e.g., scripts and workflows) upon which the findings rely. This envisioned coupling of data and process is essential to advancing science but faces technical and institutional barriers. The Whole Tale project aims to address these barriers by connecting computational, data-intensive research efforts with the larger research process, transforming the knowledge discovery and dissemination process into one where data products are united with research articles to create "living publications" or "tales". The Whole Tale focuses on the full spectrum of science, empowering both users in the long tail of science and power users with demands for access to big data and compute resources. We report here on the design, architecture, and implementation of the Whole Tale environment.
Submitted 1 May, 2018;
originally announced May 2018.
-
How to Scale a Code in the Human Dimension
Authors:
Matthew J. Turk
Abstract:
As scientists' needs for computational techniques and tools grow, they cease to be supportable by software developed in isolation. In many cases, these needs are being met by communities of practice, where software is developed by domain scientists to reach pragmatic goals and satisfy distinct and enumerable scientific goals. We present techniques that have been successful in growing and engaging communities of practice, specifically in the yt and Enzo communities.
Submitted 29 January, 2013;
originally announced January 2013.