DALiuGE: A Graph Execution Framework for Harnessing the Astronomical Data Deluge
Authors:
Chen Wu,
Rodrigo Tobar,
Kevin Vinsen,
Andreas Wicenec,
Dave Pallot,
Baoqiang Lao,
Ruonan Wang,
Tao An,
Mark Boulton,
Ian Cooper,
Richard Dodson,
Markus Dolensky,
Ying Mei,
Feng Wang
Abstract:
The Data Activated Liu Graph Engine - DALiuGE - is an execution framework for processing large astronomical datasets at a scale required by the Square Kilometre Array Phase 1 (SKA1). It includes an interface for expressing complex data reduction pipelines consisting of both data sets and algorithmic components and an implementation run-time to execute such pipelines on distributed resources. By mapping the logical view of a pipeline to its physical realisation, DALiuGE separates the concerns of multiple stakeholders, allowing them to collectively optimise large-scale data processing solutions in a coherent manner. The execution in DALiuGE is data-activated, where each individual data item autonomously triggers the processing on itself. Such decentralisation also makes the execution framework very scalable and flexible, supporting pipeline sizes ranging from fewer than ten tasks running on a laptop to tens of millions of concurrent tasks on the second fastest supercomputer in the world. DALiuGE has been used in production for reducing interferometry data sets from the Karl E. Jansky Very Large Array and the Mingantu Ultrawide Spectral Radioheliograph; and is being developed as the execution framework prototype for the Science Data Processor (SDP) consortium of the Square Kilometre Array (SKA) telescope. This paper presents a technical overview of DALiuGE and discusses case studies from the CHILES and MUSER projects that use DALiuGE to execute production pipelines. In a companion paper, we provide in-depth analysis of DALiuGE's scalability to very large numbers of tasks on two supercomputing facilities.
Submitted 24 February, 2017;
originally announced February 2017.
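The data-activated execution described in the DALiuGE abstract can be illustrated with a minimal sketch. It is not DALiuGE's API: the class names `DataItem` and `Task`, their methods, and the toy pipeline are all hypothetical, chosen only to show a data item triggering its own downstream processing without a central scheduler.

```python
from typing import Callable, List, Optional


class DataItem:
    """Hypothetical data node: once written, it triggers its own consumers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.value: Optional[object] = None
        self.complete = False
        self.consumers: List["Task"] = []

    def write(self, value: object) -> None:
        self.value = value
        self.complete = True
        # No central scheduler: the data item itself notifies its consumers.
        for task in self.consumers:
            task.input_ready()


class Task:
    """Hypothetical algorithmic node: runs once all of its inputs are complete."""

    def __init__(self, name: str, func: Callable[..., object],
                 inputs: List[DataItem], output: DataItem) -> None:
        self.name = name
        self.func = func
        self.inputs = inputs
        self.output = output
        for item in inputs:
            item.consumers.append(self)

    def input_ready(self) -> None:
        # Fire only when every input has been written.
        if all(item.complete for item in self.inputs):
            self.output.write(self.func(*(item.value for item in self.inputs)))


# Toy pipeline (illustrative only): two raw inputs -> add -> double.
raw_a = DataItem("raw_a")
raw_b = DataItem("raw_b")
summed = DataItem("summed")
result = DataItem("result")
Task("add", lambda a, b: a + b, [raw_a, raw_b], summed)
Task("double", lambda x: 2 * x, [summed], result)

raw_a.write(3)  # "add" cannot run yet: raw_b is still incomplete
raw_b.write(4)  # completes the inputs; the chain runs through to "result"
print(result.value)
```

Writing the second input is what drives the whole chain here; nothing polls or dispatches the tasks from outside the graph.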
The Murchison Widefield Array Correlator
Authors:
S. M. Ord,
B. Crosse,
D. Emrich,
D. Pallot,
R. B. Wayth,
M. A. Clark,
S. E. Tremblay,
W. Arcus,
D. Barnes,
M. Bell,
G. Bernardi,
N. D. R. Bhat,
J. D. Bowman,
F. Briggs,
J. D. Bunton,
R. J. Cappallo,
B. E. Corey,
A. A. Deshpande,
L. deSouza,
A. Ewell-Wice,
L. Feng,
R. Goeke,
L. J. Greenhill,
B. J. Hazelton,
D. Herne,
et al. (42 additional authors not shown)
Abstract:
The Murchison Widefield Array (MWA) is a Square Kilometre Array (SKA) Precursor. The telescope is located at the Murchison Radio-astronomy Observatory (MRO) in Western Australia (WA). The MWA consists of 4096 dipoles arranged into 128 dual-polarisation aperture arrays forming a connected-element interferometer that cross-correlates signals from all 256 inputs. A hybrid approach to the correlation task is employed, with some processing stages being performed by bespoke hardware, based on Field Programmable Gate Arrays (FPGAs), and others by Graphics Processing Units (GPUs) housed in general-purpose rack-mounted servers. The correlation capability required is approximately 8 TFLOPS (Tera FLoating point Operations Per Second). The MWA has commenced operations and the correlator is generating 8.3 TB/day of correlation products, which are subsequently transferred 700 km from the MRO to Perth (WA) in real time for storage and offline processing. In this paper we outline the correlator design, signal path, and processing elements and present the data format for the internal and external interfaces.
Submitted 23 January, 2015;
originally announced January 2015.
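The figures quoted in the MWA correlator abstract can be put in perspective with a short back-of-the-envelope calculation. This is only a sketch under stated assumptions: "TB" is taken as the decimal terabyte (10^12 bytes), the output rate is treated as constant over 24 hours, and the pair counts are plain combinatorics over the 256 inputs rather than the correlator's documented product count.

```python
# Values taken from the abstract.
N_INPUTS = 256          # 128 dual-polarisation aperture arrays -> 256 inputs
TB_PER_DAY = 8.3        # correlation products generated per day

# Assumptions (not stated in the abstract).
BYTES_PER_TB = 10**12   # decimal terabyte
SECONDS_PER_DAY = 24 * 60 * 60

# Distinct pairs of inputs (cross-correlations), and the same count with
# each input also paired with itself (auto-correlations).
cross_pairs = N_INPUTS * (N_INPUTS - 1) // 2
pairs_with_autos = cross_pairs + N_INPUTS

# Average sustained rate implied by 8.3 TB/day.
bytes_per_second = TB_PER_DAY * BYTES_PER_TB / SECONDS_PER_DAY
mb_per_second = bytes_per_second / 10**6
mbit_per_second = bytes_per_second * 8 / 10**6

print(f"cross pairs:        {cross_pairs}")
print(f"pairs incl. autos:  {pairs_with_autos}")
print(f"average rate:       {mb_per_second:.1f} MB/s ({mbit_per_second:.0f} Mbit/s)")
```

Under these assumptions the 700 km link to Perth needs to sustain a little under 100 MB/s on average to keep up in real time.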