Search | arXiv e-print repository

doi 10.1093/mnras/stac480

Investigating Deep Learning Methods for Obtaining Photometric Redshift Estimations from Images

Authors: Ben Henghes, Connor Pettitt, Jeyan Thiyagalingam, Tony Hey, Ofer Lahav

Abstract: Knowing the redshift of galaxies is one of the first requirements of many cosmological experiments, and as it's impossible to perform spectroscopy for every galaxy being observed, photometric redshift (photo-z) estimations are still of particular interest. Here, we investigate different deep learning methods for obtaining photo-z estimates directly from images, comparing these with traditional machine learning algorithms which make use of magnitudes retrieved through photometry. As well as testing a convolutional neural network (CNN) and inception-module CNN, we introduce a novel mixed-input model which allows for both images and magnitude data to be used in the same model as a way of further improving the estimated redshifts. We also perform benchmarking as a way of demonstrating the performance and scalability of the different algorithms. The data used in the study comes entirely from the Sloan Digital Sky Survey (SDSS) from which 1 million galaxies were used, each having 5-filter (ugriz) images with complete photometry and a spectroscopic redshift which was taken as the ground truth. The mixed-input inception CNN achieved a mean squared error (MSE)=0.009, which was a significant improvement (30%) over the traditional Random Forest (RF), and the model performed even better at lower redshifts achieving a MSE=0.0007 (a 50% improvement over the RF) in the range of z<0.3. This method could be hugely beneficial to upcoming surveys such as the Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) which will require vast numbers of photo-z estimates produced as quickly and accurately as possible.

Submitted 6 September, 2021; originally announced September 2021.

Comments: 13 pages, 14 figures, submitted to MNRAS

arXiv:2106.11315 [pdf, other]

doi 10.3847/1538-4357/ac3760

Expediting DECam Multimessenger Counterpart Searches with Convolutional Neural Networks

Authors: Adam Shandonay, Robert Morgan, Keith Bechtol, Clecio R. Bom, Brian Nord, Alyssa Garcia, Ben Henghes, Kenneth Herner, Megan Tabbutt, Antonella Palmese, Luidhy Santana-Silva, Marcelle Soares-Santos, Mandeep S. S. Gill, Juan Garcia-Bellido

Abstract: Searches for counterparts to multimessenger events with optical imagers use difference imaging to detect new transient sources. However, even with existing artifact detection algorithms, this process simultaneously returns several classes of false positives: false detections from poor quality image subtractions, false detections from low signal-to-noise images, and detections of pre-existing variable sources. Currently, human visual inspection to remove the false positives is a central part of multimessenger follow-up observations, but when next generation gravitational wave and neutrino detectors come online and increase the rate of multimessenger events, the visual inspection process will be prohibitively expensive. We approach this problem with two convolutional neural networks operating on the difference imaging outputs. The first network focuses on removing false detections and demonstrates an accuracy of 92 percent on our dataset. The second network focuses on sorting all real detections by the probability of being a transient source within a host galaxy and distinguishes between various classes of images that previously required additional human inspection. We find the number of images requiring human inspection will decrease by a factor of 1.5 using our approach alone and a factor of 3.6 using our approach in combination with existing algorithms, facilitating rapid multimessenger counterpart identification by the astronomical community.

Submitted 20 May, 2022; v1 submitted 21 June, 2021; originally announced June 2021.

Comments: Published in ApJ

Report number: FERMILAB-PUB-21-268-AE

arXiv:2104.01875 [pdf, other]

doi 10.1093/mnras/stab1513

Benchmarking and Scalability of Machine Learning Methods for Photometric Redshift Estimation

Authors: Ben Henghes, Connor Pettitt, Jeyan Thiyagalingam, Tony Hey, Ofer Lahav

Abstract: Obtaining accurate photometric redshift estimations is an important aspect of cosmology, remaining a prerequisite of many analyses. In creating novel methods to produce redshift estimations, there has been a shift towards using machine learning techniques. However, there has not been as much of a focus on how well different machine learning methods scale or perform with the ever-increasing amounts of data being produced. Here, we introduce a benchmark designed to analyse the performance and scalability of different supervised machine learning methods for photometric redshift estimation. Making use of the Sloan Digital Sky Survey (SDSS - DR12) dataset, we analysed a variety of the most used machine learning algorithms. By scaling the number of galaxies used to train and test the algorithms up to one million, we obtained several metrics demonstrating the algorithms' performance and scalability for this task. Furthermore, by introducing a new optimisation method, time-considered optimisation, we were able to demonstrate how a small concession of error can allow for a great improvement in efficiency. From the algorithms tested we found that the Random Forest performed best in terms of error with a mean squared error, MSE = 0.0042; however, as other algorithms such as Boosted Decision Trees and k-Nearest Neighbours performed incredibly similarly, we used our benchmarks to demonstrate how different algorithms could be superior in different scenarios. We believe benchmarks such as this will become even more vital with upcoming surveys, such as LSST, which will capture billions of galaxies requiring photometric redshifts.

Submitted 5 April, 2021; originally announced April 2021.

Comments: 9 pages, 6 figures, submitted to MNRAS

arXiv:2009.12856 [pdf, other]

doi 10.1088/1538-3873/abcaea

Machine Learning for Searching the Dark Energy Survey for Trans-Neptunian Objects

Authors: B. Henghes, O. Lahav, D. W. Gerdes, E. Lin, R. Morgan, T. M. C. Abbott, M. Aguena, S. Allam, J. Annis, S. Avila, E. Bertin, D. Brooks, D. L. Burke, A. Carnero Rosell, M. Carrasco Kind, J. Carretero, C. Conselice, M. Costanzi, L. N. da Costa, J. De Vicente, S. Desai, H. T. Diehl, P. Doel, S. Everett, I. Ferrero, et al. (34 additional authors not shown)

Abstract: In this paper we investigate how implementing machine learning could improve the efficiency of the search for Trans-Neptunian Objects (TNOs) within Dark Energy Survey (DES) data when used alongside orbit fitting. The discovery of multiple TNOs that appear to show a similarity in their orbital parameters has led to the suggestion that one or more undetected planets, an as yet undiscovered "Planet 9", may be present in the outer Solar System. DES is well placed to detect such a planet and has already been used to discover many other TNOs. Here, we perform tests on eight different supervised machine learning algorithms, using a dataset consisting of simulated TNOs buried within real DES noise data. We found that the best performing classifier was the Random Forest which, when optimised, performed well at detecting the rare objects. We achieve an area under the receiver operating characteristic (ROC) curve, (AUC) $= 0.996 \pm 0.001$. After optimizing the decision threshold of the Random Forest, we achieve a recall of 0.96 while maintaining a precision of 0.80. Finally, by using the optimized classifier to pre-select objects, we are able to run the orbit-fitting stage of our detection pipeline five times faster.

Submitted 10 December, 2020; v1 submitted 27 September, 2020; originally announced September 2020.

Comments: Published in PASP, 16 pages, 6 figures

Journal ref: PASP 133 014501 (2021)

Showing 1–4 of 4 results for author: Henghes, B