-
Processing All-Sky Images At Scale On The Amazon Cloud: A HiPS Example
Authors:
G. Bruce Berriman,
John C. Good
Abstract:
We report here on a project that has developed a practical approach to processing all-sky image collections on cloud platforms, using as an exemplar application the creation of three-color Hierarchical Progressive Survey (HiPS) maps of the 2MASS data set with the Montage Image Mosaic Engine on Amazon Web Services. We will emphasize issues that must be considered by scientists wishing to use cloud platforms to perform such parallel processing, so providing a guide for scientists wishing to exploit cloud platforms for similar large-scale processing. A HiPS map is based on the HEALPix sky-tiling scheme. Progressive zooming of a HiPS map reveals an image sampled at ever smaller or larger spatial scales that are defined by the HEALPix standard. Briefly, the approach used by Montage involves creating a base mosaic at the lowest required HEALPix level, usually chosen to match as closely as possible the spatial sampling of the input images, then cutting out the HiPS cells in PNG format from this mosaic. The process is repeated at successive HEALPix levels to create a nested collection of FITS files, from which PNG files are created that are shown in HiPS viewers. Stretching FITS files to produce PNGs is based on an image histogram. For composite regions (up to and including the whole sky), the histograms for each tile can be combined to create a composite histogram for the region. Using this single histogram for each of the individual FITS files means all the PNGs are on the same brightness scale and displaying them side by side in a HiPS viewer produces a continuous uniform map across the entire sky.
Submitted 6 February, 2024;
originally announced February 2024.
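The composite-histogram idea in the abstract above can be illustrated with a minimal sketch. It assumes every tile is binned on the same edges, and it substitutes a simple linear percentile stretch for Montage's adaptive stretch, which the abstract does not specify in detail. All function names here are hypothetical and are not part of Montage.

```python
from bisect import bisect_right


def tile_histogram(pixels, edges):
    """Count pixels per bin on shared edges; out-of-range values go to the end bins."""
    counts = [0] * (len(edges) - 1)
    for value in pixels:
        idx = bisect_right(edges, value) - 1
        idx = min(max(idx, 0), len(counts) - 1)
        counts[idx] += 1
    return counts


def combine_histograms(histograms):
    """Sum per-tile histograms bin by bin to get one histogram for the region."""
    return [sum(column) for column in zip(*histograms)]


def percentile_level(counts, edges, fraction):
    """Upper edge of the first bin where the cumulative count reaches `fraction`."""
    total = sum(counts)
    running = 0
    for i, count in enumerate(counts):
        running += count
        if running >= fraction * total:
            return edges[i + 1]
    return edges[-1]


def stretch(pixels, low, high):
    """Linear map of [low, high] onto 0..255 with clipping (an assumed stand-in stretch)."""
    span = high - low
    return [round(255 * min(max((v - low) / span, 0.0), 1.0)) for v in pixels]


# Two tiles with very different brightness ranges share one set of limits.
edges = [float(i) for i in range(11)]
tile_a = [0.5, 1.5, 2.5, 3.5]
tile_b = [6.5, 7.5, 8.5, 9.5]
combined = combine_histograms([tile_histogram(t, edges) for t in (tile_a, tile_b)])
low = percentile_level(combined, edges, 0.05)
high = percentile_level(combined, edges, 0.95)
png_a = stretch(tile_a, low, high)
png_b = stretch(tile_b, low, high)
```

Because both tiles are stretched with the same `low` and `high`, a given pixel value maps to the same output level in either tile, which is what keeps adjacent tiles on a common brightness scale.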
-
Astronomical Image Processing at Scale With Pegasus and Montage
Authors:
G. Bruce Berriman,
John C. Good,
Ewa Deelman,
Ryan Tanaka,
Karan Vahi
Abstract:
Image processing at scale is a powerful tool for creating new data sets, integrating them with existing data sets, and performing analysis and quality assurance investigations. Workflow managers offer advantages in this type of processing, which involves multiple data access and processing steps. Generally, they enable automation of the workflow by locating data and resources, recovering from failures, and monitoring performance. In this focus demo we demonstrate how the Pegasus Workflow Manager Python API manages image processing to create mosaics with the Montage Image Mosaic engine. Since 2001, Pegasus has been developed and maintained at USC/ISI. Montage was in fact one of the first applications used to design Pegasus and optimize its performance. Pegasus has since found application in many areas of science. LIGO exploited it in making discoveries of black holes. The Vera C. Rubin Observatory used it to compare the cost and performance of processing images on cloud platforms. While these are examples of projects at large scale, small team investigations on local clusters of machines can benefit from Pegasus as well.
Submitted 22 November, 2021;
originally announced November 2021.
-
Creating High Quality All-Sky Visualizations of Astronomy Image Data Sets: HiPS and Montage
Authors:
G. Bruce Berriman,
John C. Good,
Vandana Desai,
Steven L. Groom
Abstract:
We describe a case study to use the Montage image mosaic engine to create maps of the ALLWISE image data set in the Hierarchical Progressive Survey (HiPS) sky-tessellation scheme. Our approach demonstrates that Montage reveals the science content of infrared images in greater detail than has hitherto been possible in HiPS maps. The approach exploits two unique (to our knowledge) characteristics of the Montage image mosaic engine: background modeling to rectify the time variable image backgrounds to common levels; and an adaptive image stretch to present images for visualization. The creation of the maps is supported by the development of four new tools that when fully tested will become part of the Montage distribution. The compute-intensive part of the processing lies in the reprojection of the images, and we show how we optimized the processing for efficient creation of mosaics that are used in turn to create maps in the HiPS tiling scheme. We plan to apply our methodology to infrared image data sets such as those delivered by Spitzer, 2MASS, IRAS and Planck.
Submitted 28 January, 2020;
originally announced January 2020.
-
A Study of the Efficiency of Spatial Indexing Methods Applied to Large Astronomical Databases
Authors:
G. B. Berriman,
J. C. Good,
B. Shiao,
T. Donaldson
Abstract:
We report the results of a study to compare the performance of two common database indexing methods, HTM and HEALPix, on Solaris and Windows database servers installed with PostgreSQL, and a Windows Server installed with MS SQL Server. The indexing was applied to the 2MASS All-Sky Catalog and to the Hubble Source Catalog, which approximate the diversity of catalogs common in astronomy. On each server, the study compared indexing performance by submitting 1 million queries at each index level with random sky positions and random cone search radius, which was computed on a logarithmic scale between 1 arcsec and 1 degree, and measuring the time to complete the query and write the output. These simulated queries, intended to model realistic use patterns, were run in a uniform way on many combinations of indexing method and indexing depth. The query times in all simulations are strongly I/O-bound and are linear with number of records returned for large numbers of sources. There are, however, considerable differences between simulations, which reveal that hardware I/O throughput is a more important factor in managing the performance of a DBMS than the choice of indexing scheme. The choice of index itself is relatively unimportant: for comparable index levels, the performance is consistent within the scatter of the timings. At small index levels (large cells; e.g. level 4; cell size 3.7 deg), there is large scatter in the timings because of wide variations in the number of sources found in the cells. At larger index levels, performance improves and scatter decreases, but the improvement at level 8 (14 arcmin) and higher is masked to some extent in the timing scatter caused by the range of query sizes. At very high levels (20; 0.0004 arcsec), the granularity of the cells becomes so high that a large number of extraneous empty cells begin to degrade performance.
Submitted 22 June, 2018;
originally announced June 2018.
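The simulated queries and the cell sizes quoted in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: it assumes positions are drawn uniformly over the sphere (the abstract says only "random sky positions"), it draws the radius uniformly in log10 between 1 arcsec and 1 degree, and it estimates the HEALPix cell size as the square root of the equal-area pixel area. The function names are hypothetical.

```python
import math
import random

FULL_SKY_SQ_DEG = 4 * math.pi * (180.0 / math.pi) ** 2  # about 41253 square degrees


def healpix_cell_size_deg(level):
    """Approximate HEALPix cell size in degrees: sqrt(sky area / (12 * 4**level))."""
    npix = 12 * 4 ** level
    return math.sqrt(FULL_SKY_SQ_DEG / npix)


def random_cone_query(rng):
    """Return (ra_deg, dec_deg, radius_deg) for one simulated cone search."""
    ra = rng.uniform(0.0, 360.0)
    # Uniform in sin(dec) gives uniform density over the sphere (an assumption here).
    dec = math.degrees(math.asin(rng.uniform(-1.0, 1.0)))
    # Log-uniform radius between 1 arcsec (1/3600 deg) and 1 deg.
    log_radius = rng.uniform(math.log10(1.0 / 3600.0), 0.0)
    return ra, dec, 10.0 ** log_radius


rng = random.Random(42)  # fixed seed so a run is repeatable
queries = [random_cone_query(rng) for _ in range(1000)]
level4_deg = healpix_cell_size_deg(4)          # roughly 3.7 deg
level8_arcmin = healpix_cell_size_deg(8) * 60  # roughly 14 arcmin
```

The level 4 and level 8 values agree with the cell sizes quoted in the abstract. Drawing the radius in log space gives every decade of radius equal weight, so small cone searches are not swamped by large ones.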
-
Sustaining the Montage Image Mosaic Engine Since 2002
Authors:
G. Bruce Berriman,
John C. Good
Abstract:
This paper describes how we have sustained the Montage image mosaic engine (http://montage.ipac.caltech.edu) first released in 2002, to support the ever-growing scale and complexity of modern data sets. The key to its longevity has been its design as a toolkit written in ANSI-C, with each tool performing one distinct task, for easy integration into scripts, pipelines and workflows. The same code base now supports Windows, JavaScript and Python by taking advantage of recent advances in compilers. The design has led to applicability of Montage far beyond what was anticipated when Montage was first built, such as supporting observation planning for the JWST. Moreover, Montage is highly scalable and is in wide use within the IT community to develop advanced, fault-tolerant cyber-infrastructure, such as job schedulers for grids, workflow orchestration, and restructuring techniques for processing complex workflows and pipelines.
Submitted 11 June, 2018;
originally announced June 2018.
-
The KELT Follow-Up Network and Transit False Positive Catalog: Pre-vetted False Positives for TESS
Authors:
Karen A. Collins,
Kevin I. Collins,
Joshua Pepper,
Jonathan Labadie-Bartz,
Keivan Stassun,
B. Scott Gaudi,
Daniel Bayliss,
Joao Bento,
Knicole D. Colón,
Dax Feliz,
David James,
Marshall C. Johnson,
Rudolf B. Kuhn,
Michael B. Lund,
Matthew T. Penny,
Joseph E. Rodriguez,
Robert J. Siverd,
Daniel J. Stevens,
Xinyu Yao,
George Zhou,
Mundra Akshay,
Giulio F. Aldi,
Cliff Ashcraft,
Supachai Awiphan,
Özgür Baştürk
, et al. (86 additional authors not shown)
Abstract:
The Kilodegree Extremely Little Telescope (KELT) project has been conducting a photometric survey for transiting planets orbiting bright stars for over ten years. The KELT images have a pixel scale of ~23"/pixel, very similar to that of NASA's Transiting Exoplanet Survey Satellite (TESS), as well as a large point spread function, and the KELT reduction pipeline uses a weighted photometric aperture with radius 3'. At this angular scale, multiple stars are typically blended in the photometric apertures. In order to identify false positives and confirm transiting exoplanets, we have assembled a follow-up network (KELT-FUN) to conduct imaging with higher spatial resolution, cadence, and photometric precision than the KELT telescopes, as well as spectroscopic observations of the candidate host stars. The KELT-FUN team has followed up over 1,600 planet candidates since 2011, resulting in more than 20 planet discoveries. Excluding ~450 false alarms of non-astrophysical origin (i.e., instrumental noise or systematics), we present an all-sky catalog of the 1,128 bright stars (6<V<10) that show transit-like features in the KELT light curves, but which were subsequently determined to be astrophysical false positives (FPs) after photometric and/or spectroscopic follow-up observations. The KELT-FUN team continues to pursue KELT and other planet candidates and will eventually follow up certain classes of TESS candidates. The KELT FP catalog will help minimize the duplication of follow-up observations by current and future transit surveys such as TESS.
Submitted 19 September, 2018; v1 submitted 5 March, 2018;
originally announced March 2018.
-
The Application of the Montage Image Mosaic Engine To The Visualization Of Astronomical Images
Authors:
G. Bruce Berriman,
J. C. Good
Abstract:
The Montage Image Mosaic Engine was designed as a scalable toolkit, written in C for performance and portability across *nix platforms, that assembles FITS images into mosaics. The code is freely available and has been widely used in the astronomy and IT communities for research, product generation and for developing next-generation cyber-infrastructure. Recently, it has begun to find applicability in the field of visualization. This has come about because the toolkit design allows easy integration into scalable systems that process data for subsequent visualization in a browser or client. It also includes a visualization tool suitable for automation and for integration into Python: mViewer creates, with a single command, complex multi-color images overlaid with coordinate displays, labels, and observation footprints, and includes an adaptive image histogram equalization method that preserves the structure of a stretched image over its dynamic range. The Montage toolkit contains functionality originally developed to support the creation and management of mosaics but which also offers value to visualization: a background rectification algorithm that reveals the faint structure in an image; and tools for creating cutout and down-sampled versions of large images. Version 5 of Montage offers support for visualizing data written in the HEALPix sky-tessellation scheme, and functionality for processing and organizing images to comply with the TOAST sky-tessellation scheme required for consumption by the World Wide Telescope (WWT). Four online tutorials enable readers to reproduce and extend all the visualizations presented in this paper.
Submitted 8 February, 2017;
originally announced February 2017.
-
The Next Generation of the Montage Image Mosaic Toolkit
Authors:
G. Bruce Berriman,
J. C. Good,
B. Rusholme,
T. Robitaille
Abstract:
The scientific computing landscape has evolved dramatically in the past few years, with new schemes for organizing and storing data that reflect the growth in size and complexity of astronomical data sets. In response to this changing landscape, we are, over the next two years, deploying the next generation of the Montage toolkit ([ascl:1010.036]). The first release (October 2015) supports multi-dimensional data sets ("data cubes"), and insertion of XMP/AVM tags that allows images to "drop-in" to the WWT. The same release offers a beta-version of web-based interactive visualization of images; this includes wrappers for visualization in Python. Subsequent releases will support HEALPix (now standard in cosmic background experiments); incorporation of Montage into package managers (which enable automated management of software builds), and support for a library that will enable Montage to be called directly from Python. This next generation toolkit will inherit the architectural benefits of the current engine - component based tools, ANSI-C portability across Unix platforms and scalability for distributed processing. With the expanded functionality under development, Montage can be viewed not simply as a mosaic engine, but as a scalable, portable toolkit for managing, organizing and processing images.
Submitted 8 August, 2016;
originally announced August 2016.
-
A case study in adaptable and reusable infrastructure at the Keck Observatory Archive: VO interfaces, moving targets, and more
Authors:
G. Bruce Berriman,
Richard W. Cohen,
Andrew Colson,
Christopher R. Gelino,
John C. Good,
Mihseh Kong,
Anastasia C. Laity,
Jeffrey A. Mader,
Melanie A. Swain,
Hien D. Tran,
Shin-Ywan Wang
Abstract:
This paper describes how the Keck Observatory Archive (KOA) is extending open source software components to develop new services. In August 2015, KOA deployed a program interface to discover public data from all instruments equipped with an imaging mode. The interface complies with version 2 of the Simple Imaging Access Protocol (SIAP), under development by the International Virtual Observatory Alliance (IVOA), which defines a standard mechanism for discovering images through spatial queries. The heart of the KOA service is an R-tree-based, database-indexing mechanism prototyped by the Virtual Astronomical Observatory (VAO) and further developed by the Montage Image Mosaic project, designed to provide fast access to large imaging data sets as a first step in creating wide-area image mosaics. The KOA service uses the results of the spatial R-tree search to create an SQLite database for further relational filtering. The service uses a JSON configuration file to describe the association between instrument parameters and the service query parameters, and to make it applicable beyond the Keck instruments.
The R-tree program was itself extended to support temporal (in addition to spatial) indexing, in response to requests from the planetary science community for a search engine to discover observations of Solar System objects. With this 3D-indexing scheme, the service performs very fast time and spatial matches between the target ephemerides, obtained from the JPL SPICE service. Our experiments indicate these matches can be more than 100 times faster than when separating temporal and spatial searches. Images of the tracks of the moving targets, overlaid with the image footprints, are computed with a new command-line visualization tool, mViewer, released with the Montage distribution. The service is currently in test and will be released in Fall 2016.
Submitted 8 August, 2016;
originally announced August 2016.
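The combined time and spatial matching described in the abstract above can be pictured with a small sketch. It is not the KOA implementation: it assumes each image footprint is reduced to a box in (RA, Dec, time), it ignores RA wraparound at 0/360 degrees, and it scans footprints by brute force where the service uses an R-tree to prune candidates. All names are hypothetical.

```python
from typing import NamedTuple


class Footprint(NamedTuple):
    """Hypothetical image footprint as a box in RA, Dec (degrees) and time (days)."""
    name: str
    ra_min: float
    ra_max: float
    dec_min: float
    dec_max: float
    t_start: float
    t_end: float


class EphemerisPoint(NamedTuple):
    """One sample of a moving target's predicted position."""
    ra: float
    dec: float
    t: float


def contains(fp, point):
    """True if the point lies inside the footprint in all three dimensions at once."""
    return (
        fp.ra_min <= point.ra <= fp.ra_max
        and fp.dec_min <= point.dec <= fp.dec_max
        and fp.t_start <= point.t <= fp.t_end
    )


def match_track(footprints, track):
    """Names of footprints that contain at least one ephemeris point of the track."""
    return [fp.name for fp in footprints if any(contains(fp, p) for p in track)]


# Same sky region observed at two epochs; the target is there only at the first.
footprints = [
    Footprint("early", 10.0, 11.0, 20.0, 21.0, 100.0, 101.0),
    Footprint("late", 10.0, 11.0, 20.0, 21.0, 200.0, 201.0),
]
track = [EphemerisPoint(10.5, 20.5, 100.5), EphemerisPoint(15.0, 22.0, 200.5)]
hits = match_track(footprints, track)
```

Testing position and time together rejects the "late" image immediately, whereas a spatial search alone would return both images and leave the time filter for a second pass.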
-
Connecting the time domain community with the Virtual Astronomical Observatory
Authors:
Matthew J. Graham,
S. G. Djorgovski,
Ciro Donalek,
Andrew J. Drake,
Ashish A. Mahabal,
Raymond L. Plante,
Jeffrey Kantor,
John C. Good
Abstract:
The time domain has been identified as one of the most important areas of astronomical research for the next decade. The Virtual Observatory is in the vanguard with dedicated tools and services that enable and facilitate the discovery, dissemination and analysis of time domain data. These range in scope from rapid notifications of time-critical astronomical transients to annotating long-term variables with the latest modeling results. In this paper, we will review the prior art in these areas and focus on the capabilities that the VAO is bringing to bear in support of time domain science. In particular, we will focus on the issues involved with the heterogeneous collections of (ancillary) data associated with astronomical transients, and the time series characterization and classification tools required by the next generation of sky surveys, such as LSST and SKA.
Submitted 18 June, 2012;
originally announced June 2012.