-
The Tiny Time-series Transformer: Low-latency High-throughput Classification of Astronomical Transients using Deep Model Compression
Authors:
Tarek Allam Jr.,
Julien Peloton,
Jason D. McEwen
Abstract:
A new golden age in astronomy is upon us, dominated by data. Large astronomical surveys are broadcasting unprecedented rates of information, demanding machine learning as a critical component in modern scientific pipelines to handle the deluge of data. The upcoming Legacy Survey of Space and Time (LSST) of the Vera C. Rubin Observatory will raise the big-data bar for time-domain astronomy, with an expected 10 million alerts per night, and generating many petabytes of data over the lifetime of the survey. Fast and efficient classification algorithms that can operate in real-time, yet robustly and accurately, are needed for time-critical events where additional resources can be sought for follow-up analyses. In order to handle such data, state-of-the-art deep learning architectures coupled with tools that leverage modern hardware accelerators are essential. We showcase how the use of modern deep compression methods can achieve an $18\times$ reduction in model size, whilst preserving classification performance. We also show that in addition to the deep compression techniques, careful choice of file formats can improve inference latency, and thereby throughput of alerts, on the order of $8\times$ for local processing, and $5\times$ in a live production setting. To test this in a live setting, we deploy this optimised version of the original time-series transformer, t2, into the community alert broking system of FINK on real Zwicky Transient Facility (ZTF) alert data, and compare throughput performance with other science modules that exist in FINK. The results shown herein emphasise the time-series transformer's suitability for real-time classification at LSST scale, and beyond, and introduce deep model compression as a fundamental tool for improving deployability and scalable inference of deep learning models for transient classification.
Submitted 15 March, 2023;
originally announced March 2023.
-
Considerations for optimizing photometric classification of supernovae from the Rubin Observatory
Authors:
Catarina S. Alves,
Hiranya V. Peiris,
Michelle Lochner,
Jason D. McEwen,
Tarek Allam Jr,
Rahul Biswas
Abstract:
The Vera C. Rubin Observatory will increase the number of observed supernovae (SNe) by an order of magnitude; however, it is impossible to spectroscopically confirm the class for all the SNe discovered. Thus, photometric classification is crucial but its accuracy depends on the not-yet-finalized observing strategy of Rubin Observatory's Legacy Survey of Space and Time (LSST). We quantitatively analyze the impact of the LSST observing strategy on SNe classification using simulated multi-band light curves from the Photometric LSST Astronomical Time-Series Classification Challenge (PLAsTiCC). First, we augment the simulated training set to be representative of the photometric redshift distribution per supernova class, the cadence of observations, and the flux uncertainty distribution of the test set. Then we build a classifier using the photometric transient classification library snmachine, based on wavelet features obtained from Gaussian process fits, yielding similar performance to the winning PLAsTiCC entry. We study the classification performance for SNe with different properties within a single simulated observing strategy. We find that season length is important, with light curves of 150 days yielding the highest performance. Cadence also has an important impact on SNe classification; events with median inter-night gap <3.5 days yield higher classification performance. Interestingly, we find that large gaps (>10 days) in light curve observations do not impact performance if sufficient observations are available on either side, due to the effectiveness of the Gaussian process interpolation. This analysis is the first exploration of the impact of observing strategy on photometric supernova classification with LSST.
Submitted 29 October, 2021; v1 submitted 15 July, 2021;
originally announced July 2021.
-
Paying Attention to Astronomical Transients: Introducing the Time-series Transformer for Photometric Classification
Authors:
Tarek Allam Jr.,
Jason D. McEwen
Abstract:
Future surveys such as the Legacy Survey of Space and Time (LSST) of the Vera C. Rubin Observatory will observe an order of magnitude more astrophysical transient events than any previous survey. With this deluge of photometric data, it will be impossible for all such events to be classified by humans alone. Recent efforts have sought to leverage machine learning methods to tackle the challenge of astronomical transient classification, with ever improving success. Transformers are a recently developed deep learning architecture, first proposed for natural language processing, that have shown a great deal of recent success. In this work we develop a new transformer architecture, which uses multi-head self-attention at its core, for general multi-variate time-series data. Furthermore, the proposed time-series transformer architecture supports the inclusion of an arbitrary number of additional features, while also offering interpretability. We apply the time-series transformer to the task of photometric classification, minimising the reliance on expert domain knowledge for feature selection, while achieving results comparable to state-of-the-art photometric classification methods. We achieve a logarithmic-loss of 0.507 on imbalanced data in a representative setting using data from the Photometric LSST Astronomical Time-Series Classification Challenge (PLAsTiCC). Moreover, we achieve a micro-averaged receiver operating characteristic area under curve of 0.98 and micro-averaged precision-recall area under curve of 0.87.
Submitted 4 October, 2023; v1 submitted 13 May, 2021;
originally announced May 2021.
-
Results of the Photometric LSST Astronomical Time-series Classification Challenge (PLAsTiCC)
Authors:
R. Hložek,
K. A. Ponder,
A. I. Malz,
M. Dai,
G. Narayan,
E. E. O. Ishida,
T. Allam Jr,
A. Bahmanyar,
R. Biswas,
L. Galbany,
S. W. Jha,
D. O. Jones,
R. Kessler,
M. Lochner,
A. A. Mahabal,
K. S. Mandel,
J. R. Martínez-Galarza,
J. D. McEwen,
D. Muthukrishna,
H. V. Peiris,
C. M. Peters,
C. N. Setzer
Abstract:
Next-generation surveys like the Legacy Survey of Space and Time (LSST) on the Vera C. Rubin Observatory will generate orders of magnitude more discoveries of transients and variable stars than previous surveys. To prepare for this data deluge, we developed the Photometric LSST Astronomical Time-series Classification Challenge (PLAsTiCC), a competition which aimed to catalyze the development of robust classifiers under LSST-like conditions of a non-representative training set for a large photometric test set of imbalanced classes. Over 1,000 teams participated in PLAsTiCC, which was hosted on the Kaggle data science competition platform between Sep 28, 2018 and Dec 17, 2018, ultimately identifying three winners in February 2019. Participants produced classifiers employing a diverse set of machine learning techniques including hybrid combinations and ensemble averages of a range of approaches, among them boosted decision trees, neural networks, and multi-layer perceptrons. The strong performance of the top three classifiers on Type Ia supernovae and kilonovae represents a major improvement over the current state-of-the-art within astronomy. This paper summarizes the most promising methods and evaluates their results in detail, highlighting future directions both for classifier development and simulation needs for a next generation PLAsTiCC data set.
Submitted 22 December, 2020;
originally announced December 2020.
-
Fink, a new generation of broker for the LSST community
Authors:
Anais Möller,
Julien Peloton,
Emille E. O. Ishida,
Chris Arnault,
Etienne Bachelet,
Tristan Blaineau,
Dominique Boutigny,
Abhishek Chauhan,
Emmanuel Gangler,
Fabio Hernandez,
Julius Hrivnac,
Marco Leoni,
Nicolas Leroy,
Marc Moniez,
Sacha Pateyron,
Adrien Ramparison,
Damien Turpin,
Réza Ansari,
Tarek Allam Jr.,
Armelle Bajat,
Biswajit Biswas,
Alexandre Boucaud,
Johan Bregeon,
Jean-Eric Campagne,
Johann Cohen-Tanugi
, et al. (11 additional authors not shown)
Abstract:
Fink is a broker designed to enable science with large time-domain alert streams such as the one from the upcoming Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST). It exhibits traditional astronomy broker features such as automatised ingestion, annotation, selection and redistribution of promising alerts for transient science. It is also designed to go beyond traditional broker features by providing real-time transient classification which is continuously improved by using state-of-the-art Deep Learning and Adaptive Learning techniques. These evolving added values will enable more accurate scientific output from LSST photometric data for diverse science cases while also leading to a higher incidence of new discoveries which shall accompany the evolution of the survey. In this paper we introduce Fink, its science motivation, architecture and current status including first science verification cases using the Zwicky Transient Facility alert stream.
Submitted 16 December, 2020; v1 submitted 21 September, 2020;
originally announced September 2020.
-
Optimizing the LSST Observing Strategy for Dark Energy Science: DESC Recommendations for the Deep Drilling Fields and other Special Programs
Authors:
Daniel M. Scolnic,
Michelle Lochner,
Phillipe Gris,
Nicolas Regnault,
Renée Hložek,
Greg Aldering,
Tarek Allam Jr,
Humna Awan,
Rahul Biswas,
Jonathan Blazek,
Chihway Chang,
Eric Gawiser,
Ariel Goobar,
Isobel M. Hook,
Saurabh W. Jha,
Jason D. McEwen,
Rachel Mandelbaum,
Phil Marshall,
Eric Neilsen,
Jason Rhodes,
Daniel Rothchild,
Ignacio Sevilla Noarbe,
Anže Slosar,
Peter Yoachim
Abstract:
We review the measurements of dark energy enabled by observations of the Deep Drilling Fields and the optimization of survey design for cosmological measurements. This white paper is the result of efforts by the LSST DESC Observing Strategy Task Force (OSTF), which represents the entire collaboration, and aims to make recommendations on observing strategy for the DDFs that will benefit all cosmological analyses with LSST. It is accompanied by the DESC-WFD white paper (Lochner et al.). We argue for altering the nominal deep drilling plan to have $>6$ month seasons, interweaving $gri$ and $zy$ observations every 3 days with 2, 4, 8, 25, 4 visits in $grizy$, respectively. These recommendations are guided by metrics optimizing constraints on dark energy and mitigation of systematic uncertainties, including specific requirements on total number of visits after Y1 and Y10 for photometric redshifts (photo-$z$) and weak lensing systematics. We specify the precise locations for the previously-chosen LSST deep fields (ELAIS-S1, XMM-LSS, CDF-S, and COSMOS) and recommend Akari Deep Field South as the planned fifth deep field in order to synergize with Euclid and WFIRST. Our recommended DDF strategy uses $6.2\%$ of the LSST survey time. We briefly discuss synergy with white papers from other collaborations, as well as additional mini-surveys and Target-of-Opportunity programs that lead to better measurements of dark energy.
Submitted 30 November, 2018;
originally announced December 2018.
-
Optimizing the LSST Observing Strategy for Dark Energy Science: DESC Recommendations for the Wide-Fast-Deep Survey
Authors:
Michelle Lochner,
Daniel M. Scolnic,
Humna Awan,
Nicolas Regnault,
Philippe Gris,
Rachel Mandelbaum,
Eric Gawiser,
Husni Almoubayyed,
Christian N. Setzer,
Simon Huber,
Melissa L. Graham,
Renée Hložek,
Rahul Biswas,
Tim Eifler,
Daniel Rothchild,
Tarek Allam Jr,
Jonathan Blazek,
Chihway Chang,
Thomas Collett,
Ariel Goobar,
Isobel M. Hook,
Mike Jarvis,
Saurabh W. Jha,
Alex G. Kim,
Phil Marshall
, et al. (11 additional authors not shown)
Abstract:
Cosmology is one of the four science pillars of LSST, which promises to be transformative for our understanding of dark energy and dark matter. The LSST Dark Energy Science Collaboration (DESC) has been tasked with deriving constraints on cosmological parameters from LSST data. Each of the cosmological probes for LSST is heavily impacted by the choice of observing strategy. This white paper is written by the LSST DESC Observing Strategy Task Force (OSTF), which represents the entire collaboration, and aims to make recommendations on observing strategy that will benefit all cosmological analyses with LSST. It is accompanied by the DESC DDF (Deep Drilling Fields) white paper (Scolnic et al.). We use a variety of metrics to understand the effects of the observing strategy on measurements of weak lensing, large-scale structure, clusters, photometric redshifts, supernovae, strong lensing and kilonovae. In order to reduce systematic uncertainties, we conclude that the current baseline observing strategy needs to be significantly modified to result in the best possible cosmological constraints. We provide some key recommendations: moving the WFD (Wide-Fast-Deep) footprint to avoid regions of high extinction, taking visit pairs in different filters, changing the 2x15s snaps to a single exposure to improve efficiency, focusing on strategies that reduce long gaps (>15 days) between observations, and prioritizing spatial uniformity at several intervals during the 10-year survey.
Submitted 14 December, 2018; v1 submitted 30 November, 2018;
originally announced December 2018.
-
The Photometric LSST Astronomical Time-series Classification Challenge (PLAsTiCC): Data set
Authors:
The PLAsTiCC team,
Tarek Allam Jr.,
Anita Bahmanyar,
Rahul Biswas,
Mi Dai,
Lluís Galbany,
Renée Hložek,
Emille E. O. Ishida,
Saurabh W. Jha,
David O. Jones,
Richard Kessler,
Michelle Lochner,
Ashish A. Mahabal,
Alex I. Malz,
Kaisey S. Mandel,
Juan Rafael Martínez-Galarza,
Jason D. McEwen,
Daniel Muthukrishna,
Gautham Narayan,
Hiranya Peiris,
Christina M. Peters,
Kara Ponder,
Christian N. Setzer,
The LSST Dark Energy Science Collaboration,
The LSST Transients
, et al. (1 additional authors not shown)
Abstract:
The Photometric LSST Astronomical Time Series Classification Challenge (PLAsTiCC) is an open data challenge to classify simulated astronomical time-series data in preparation for observations from the Large Synoptic Survey Telescope (LSST), which will achieve first light in 2019 and commence its 10-year main survey in 2022. LSST will revolutionize our understanding of the changing sky, discovering and measuring millions of time-varying objects.
In this challenge, we pose the question: how well can we classify objects in the sky that vary in brightness from simulated LSST time-series data, with all its challenges of non-representativity? In this note we explain the need for a data challenge to help classify such astronomical sources and describe the PLAsTiCC data set and Kaggle data challenge, noting that while the references are provided for context, they are not needed to participate in the challenge.
Submitted 28 September, 2018;
originally announced October 2018.
-
The Photometric LSST Astronomical Time-series Classification Challenge (PLAsTiCC): Selection of a performance metric for classification probabilities balancing diverse science goals
Authors:
A. I. Malz,
R. Hložek,
T. Allam Jr,
A. Bahmanyar,
R. Biswas,
M. Dai,
L. Galbany,
E. E. O. Ishida,
S. W. Jha,
D. O. Jones,
R. Kessler,
M. Lochner,
A. A. Mahabal,
K. S. Mandel,
J. R. Martínez-Galarza,
J. D. McEwen,
D. Muthukrishna,
G. Narayan,
H. Peiris,
C. M. Peters,
K. A. Ponder,
C. N. Setzer,
The LSST Dark Energy Science Collaboration,
The LSST Transients and Variable Stars Science Collaboration
Abstract:
Classification of transient and variable light curves is an essential step in using astronomical observations to develop an understanding of their underlying physical processes. However, upcoming deep photometric surveys, including the Large Synoptic Survey Telescope (LSST), will produce a deluge of low signal-to-noise data for which traditional labeling procedures are inappropriate. Probabilistic classifications are more appropriate for the data but are incompatible with the traditional metrics used on deterministic classifications. Furthermore, large survey collaborations intend to use these classification probabilities for diverse science objectives, indicating a need for a metric that balances a variety of goals. We describe the process used to develop an optimal performance metric for an open classification challenge that seeks probabilistic classifications and must serve many scientific interests. The Photometric LSST Astronomical Time-series Classification Challenge (PLAsTiCC) is an open competition aiming to identify promising techniques for obtaining classification probabilities of transient and variable objects by engaging a broader community both within and outside astronomy. Using mock classification probability submissions emulating archetypes of those anticipated of PLAsTiCC, we compare the sensitivity of metrics of classification probabilities under various weighting schemes, finding that they yield qualitatively consistent results. We choose as a metric for PLAsTiCC a weighted modification of the cross-entropy because it can be meaningfully interpreted. Finally, we propose extensions of our methodology to ever more complex challenge goals and suggest some guiding principles for approaching the choice of a metric of probabilistic classifications.
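For readers unfamiliar with the idea of a weighted cross-entropy, the following is a minimal sketch of one common class-weighted form: the mean log-probability of the true class is computed per class, and the per-class means are then combined using class weights. This is an illustration only and is not claimed to be the exact metric definition adopted in the paper; the function name `weighted_log_loss`, its parameters, and the clipping constant `eps` are all assumptions made for this example.

```python
import math


def weighted_log_loss(y_true, probs, weights, eps=1e-15):
    """Class-weighted cross-entropy (illustrative sketch, hypothetical helper).

    y_true  : list of integer class indices, one per object
    probs   : list of per-object probability vectors (one entry per class)
    weights : list of per-class weights (assumed non-negative)
    eps     : clipping constant to avoid log(0); an assumption of this sketch
    """
    n_classes = len(weights)
    log_sum = [0.0] * n_classes   # sum of log p(true class), grouped by class
    count = [0] * n_classes       # number of objects whose true class is j

    for label, row in zip(y_true, probs):
        p = min(max(row[label], eps), 1.0 - eps)  # clip into (0, 1)
        log_sum[label] += math.log(p)
        count[label] += 1

    numerator = 0.0
    denominator = 0.0
    for j in range(n_classes):
        if count[j] == 0:
            continue  # classes absent from y_true do not contribute
        numerator += weights[j] * log_sum[j] / count[j]
        denominator += weights[j]

    return -numerator / denominator
```

Averaging within each class before weighting means that a rare class contributes as much as a common one when the weights are equal, which is one reason such a form can suit imbalanced data; raising a class weight makes errors on that class count for more.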
Submitted 31 July, 2021; v1 submitted 28 September, 2018;
originally announced September 2018.
-
SDSS-IV eBOSS emission-line galaxy pilot survey
Authors:
J. Comparat,
T. Delubac,
S. Jouvel,
A. Raichoor,
J-P. Kneib,
C. Yeche,
F. B. Abdalla,
C. Le Cras,
C. Maraston,
D. M. Wilkinson,
G. Zhu,
E. Jullo,
F. Prada,
D. Schlegel,
Z. Xu,
H. Zou,
J. Bautista,
D. Bizyaev,
A. Bolton,
J. R. Brownstein,
K. S. Dawson,
S. Escoffier,
P. Gaulme,
K. Kinemuchi,
E. Malanushenko,
V. Malanushenko
, et al. (61 additional authors not shown)
Abstract:
The Sloan Digital Sky Survey IV extended Baryonic Oscillation Spectroscopic Survey (SDSS-IV/eBOSS) will observe 195,000 emission-line galaxies (ELGs) to measure the Baryonic Acoustic Oscillation standard ruler (BAO) at redshift 0.9. To test different ELG selection algorithms, 9,000 spectra were observed with the SDSS spectrograph as a pilot survey based on data from several imaging surveys. First, using visual inspection and redshift quality flags, we show that the automated spectroscopic redshifts assigned by the pipeline meet the quality requirements for a reliable BAO measurement. We also show the correlations between sky emission, signal-to-noise ratio in the emission lines, and redshift error. Then we provide a detailed description of each target selection algorithm we tested and compare them with the requirements of the eBOSS experiment. As a result, we provide reliable redshift distributions for the different target selection schemes we tested. Finally, we determine the target selection algorithm that is best suited to be applied on DECam photometry because it fulfills the eBOSS survey efficiency requirements.
Submitted 21 June, 2016; v1 submitted 16 September, 2015;
originally announced September 2015.