-
Toward Enabling Reproducibility for Data-Intensive Research using the Whole Tale Platform
Authors:
Kyle Chard,
Niall Gaffney,
Mihael Hategan,
Kacper Kowalik,
Bertram Ludaescher,
Timothy McPhillips,
Jarek Nabrzyski,
Victoria Stodden,
Ian Taylor,
Thomas Thelen,
Matthew J. Turk,
Craig Willis
Abstract:
Whole Tale (http://wholetale.org) is a web-based, open-source platform for reproducible research supporting the creation, sharing, execution, and verification of "Tales" for the scientific research community. Tales are executable research objects that capture the code, data, and environment along with narrative and workflow information needed to re-create computational results from scientific studies. Creating research objects that enable reproducibility, transparency, and re-execution for computational experiments that require significant compute resources or use massive data is an especially challenging open problem. We describe opportunities, challenges, and solutions for facilitating reproducibility for data- and compute-intensive research, which we call "Tales at Scale," using the Whole Tale computing platform. We highlight challenges and solutions in frontend responsiveness needs, gaps in current middleware design and implementation, network restrictions, containerization, and data access. Finally, we discuss challenges in packaging computational experiment implementations for portable data-intensive Tales and outline future work.
Submitted 12 May, 2020;
originally announced May 2020.
-
The demise of the filesystem and multi level service architecture
Authors:
William O'Mullane,
Niall Gaffney,
Frossie Economou,
Arfon M. Smith,
J. Ross Thomson,
Tim Jenness
Abstract:
Many astronomy data centres still work on filesystems. Industry has moved on; current practice in computing infrastructure is to achieve Big Data scalability using object stores rather than POSIX file systems. This presents us with opportunities for portability and reuse of software underlying processing and archive systems but it also causes problems for legacy implementations in current data centers.
Submitted 31 July, 2019; v1 submitted 26 July, 2019;
originally announced July 2019.
-
FanStore: Enabling Efficient and Scalable I/O for Distributed Deep Learning
Authors:
Zhao Zhang,
Lei Huang,
Uri Manor,
Linjing Fang,
Gabriele Merlo,
Craig Michoski,
John Cazes,
Niall Gaffney
Abstract:
Emerging Deep Learning (DL) applications introduce heavy I/O workloads on computer clusters. The inherent long-lasting, repeated, and random file access pattern can easily saturate the metadata and data service and negatively impact other users. In this paper, we present FanStore, a transient runtime file system that optimizes DL I/O on existing hardware/software stacks. FanStore distributes datasets to the local storage of compute nodes, and maintains a global namespace. With the techniques of system call interception, distributed metadata management, and generic data compression, FanStore provides a POSIX-compliant interface with native hardware throughput in an efficient and scalable manner. Users do not have to make intrusive code changes to use FanStore and take advantage of the optimized I/O. Our experiments with benchmarks and real applications show that FanStore can scale DL training to 512 compute nodes with over 90% scaling efficiency.
Submitted 27 September, 2018;
originally announced September 2018.
-
Computing Environments for Reproducibility: Capturing the "Whole Tale"
Authors:
Adam Brinckman,
Kyle Chard,
Niall Gaffney,
Mihael Hategan,
Matthew B. Jones,
Kacper Kowalik,
Sivakumar Kulasekaran,
Bertram Ludäscher,
Bryce D. Mecum,
Jarek Nabrzyski,
Victoria Stodden,
Ian J. Taylor,
Matthew J. Turk,
Kandace Turner
Abstract:
The act of sharing scientific knowledge is rapidly evolving away from traditional articles and presentations to the delivery of executable objects that integrate the data and computational details (e.g., scripts and workflows) upon which the findings rely. This envisioned coupling of data and process is essential to advancing science but faces technical and institutional barriers. The Whole Tale project aims to address these barriers by connecting computational, data-intensive research efforts with the larger research process--transforming the knowledge discovery and dissemination process into one where data products are united with research articles to create "living publications" or "tales". The Whole Tale focuses on the full spectrum of science, empowering users in the long tail of science, and power users with demands for access to big data and compute resources. We report here on the design, architecture, and implementation of the Whole Tale environment.
Submitted 1 May, 2018;
originally announced May 2018.
-
Capturing the "Whole Tale" of Computational Research: Reproducibility in Computing Environments
Authors:
Bertram Ludaescher,
Kyle Chard,
Niall Gaffney,
Matthew B. Jones,
Jaroslaw Nabrzyski,
Victoria Stodden,
Matthew Turk
Abstract:
We present an overview of the recently funded "Merging Science and Cyberinfrastructure Pathways: The Whole Tale" project (NSF award #1541450). Our approach has two nested goals: 1) deliver an environment that enables researchers to create a complete narrative of the research process including exposure of the data-to-publication lifecycle, and 2) systematically and persistently link research publications to their associated digital scholarly objects such as the data, code, and workflows. To enable this, Whole Tale will create an environment where researchers can collaborate on data, workspaces, and workflows and then publish them for future adoption or modification. Published data and applications will either be consumed directly by users of the Whole Tale environment or be integrated into existing or future domain Science Gateways.
Submitted 28 October, 2016;
originally announced October 2016.