-
Statistical analysis of probability density functions for photometric redshifts through the KiDS-ESO-DR3 galaxies
Authors:
Valeria Amaro,
Stefano Cavuoti,
Massimo Brescia,
Civita Vellucci,
Giuseppe Longo,
Maciej Bilicki,
Jelte T. A. de Jong,
Crescenzo Tortora,
Mario Radovich,
Nicola R. Napolitano,
Hugo Buddelmeijer
Abstract:
Despite the high accuracy of photometric redshifts (zphot) derived using Machine Learning (ML) methods, the quantification of errors through reliable and accurate Probability Density Functions (PDFs) is still an open problem. First, because it is difficult to accurately assess the contribution from different sources of errors, namely internal to the method itself and from the photometric features defining the available parameter space. Second, because the problem of defining a robust statistical method, always able to quantify and qualify the PDF estimation validity, is still an open issue. We present a comparison among PDFs obtained using three different methods on the same data set: two ML techniques, METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts) and ANNz2, plus the spectral energy distribution template fitting method, BPZ. The photometric data were extracted from the KiDS (Kilo Degree Survey) ESO Data Release 3, while the spectroscopy was obtained from the GAMA (Galaxy and Mass Assembly) Data Release 2. The statistical evaluation of both individual and stacked PDFs was done through quantitative and qualitative estimators, including a dummy PDF, useful to verify whether different statistical estimators can correctly assess PDF quality. We conclude that, in order to quantify the reliability and accuracy of any zphot PDF method, a combined set of statistical estimators is required.
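As a rough illustration of what a "stacked" PDF is, the sketch below averages per-galaxy binned PDFs that share one redshift grid. The function name `stack_pdfs` and the list-of-lists input layout are assumptions made for this example; the abstract does not specify how the authors build their stacked PDFs.

```python
def stack_pdfs(pdfs):
    """Average per-galaxy binned PDFs defined on one shared redshift grid.

    `pdfs` is a list of equal-length lists of non-negative bin values
    (hypothetical layout chosen for this sketch).
    """
    if not pdfs:
        raise ValueError("need at least one PDF")
    n_bins = len(pdfs[0])
    if any(len(p) != n_bins for p in pdfs):
        raise ValueError("all PDFs must share the same redshift grid")

    stacked = [0.0] * n_bins
    for p in pdfs:
        total = sum(p)
        if total <= 0:
            raise ValueError("PDF has no probability mass")
        # Normalise each individual PDF to unit sum before accumulating,
        # so that every galaxy contributes equally to the stack.
        for i, value in enumerate(p):
            stacked[i] += value / total

    # Dividing by the number of galaxies keeps the stack normalised to 1.
    return [s / len(pdfs) for s in stacked]
```

Normalising each PDF before averaging is one possible design choice; an unnormalised sum would instead weight galaxies by their total bin content.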
Submitted 23 October, 2018;
originally announced October 2018.
-
Evolution of galaxy size–stellar mass relation from the Kilo Degree Survey
Authors:
N. Roy,
N. R. Napolitano,
F. La Barbera,
C. Tortora,
F. Getman,
M. Radovich,
M. Capaccioli,
M. Brescia,
S. Cavuoti,
G. Longo,
M. A. Raj,
E. Puddu,
G. Covone,
V. Amaro,
C. Vellucci,
A. Grado,
K. Kuijken,
G. Verdoes Kleijn,
E. Valentijn
Abstract:
We have obtained structural parameters of about 340,000 galaxies from the Kilo Degree Survey (KiDS) in 153 square degrees of data releases 1, 2 and 3. We have performed a seeing-convolved 2D single Sérsic fit to the galaxy images in the 4 photometric bands (u, g, r, i) observed by KiDS, selecting high signal-to-noise ratio (S/N > 50) systems in every band.
We have classified galaxies as spheroids and disc-dominated by combining their spectral energy distribution properties and their Sérsic index. Using photometric redshifts derived from a machine learning technique, we have determined the evolution of the effective radius, $R_e$, and stellar mass, $M_*$, versus redshift, for mass-complete samples of both spheroids and disc-dominated galaxies up to z ~ 0.6.
Our results show a significant evolution of the structural quantities at intermediate redshift for the massive spheroids ($\mbox{Log}\ M_*/M_\odot>11$, Chabrier IMF), while almost no evolution has been found for less massive ones ($\mbox{Log}\ M_*/M_\odot < 11$). On the other hand, disc-dominated systems show a milder evolution in the less massive systems ($\mbox{Log}\ M_*/M_\odot < 11$) and possibly no evolution of the more massive systems. These trends are generally consistent with predictions from hydrodynamical simulations and independent datasets out to redshift z ~ 0.6, although in some cases the scatter of the data is too large to draw final conclusions.
These results, based on 1/10 of the expected KiDS area, reinforce previous findings based on smaller statistical samples and show the route toward more accurate results, expected with the next survey releases.
Submitted 16 July, 2018;
originally announced July 2018.
-
Data Deluge in Astrophysics: Photometric Redshifts as a Template Use Case
Authors:
Massimo Brescia,
Stefano Cavuoti,
Valeria Amaro,
Giuseppe Riccio,
Giuseppe Angora,
Civita Vellucci,
Giuseppe Longo
Abstract:
Astronomy has entered the big data era and Machine Learning based methods have found widespread use in a large variety of astronomical applications. This is demonstrated by the recent huge increase in the number of publications making use of this new approach. The usage of machine learning methods, however, is still far from trivial and many problems still need to be solved. Using the evaluation of photometric redshifts as a case study, we outline the main problems and some ongoing efforts to solve them.
Submitted 16 July, 2018; v1 submitted 21 February, 2018;
originally announced February 2018.
-
Photometric redshifts for the Kilo-Degree Survey. Machine-learning analysis with artificial neural networks
Authors:
M. Bilicki,
H. Hoekstra,
M. J. I. Brown,
V. Amaro,
C. Blake,
S. Cavuoti,
J. T. A. de Jong,
C. Georgiou,
H. Hildebrandt,
C. Wolf,
A. Amon,
M. Brescia,
S. Brough,
M. V. Costa-Duarte,
T. Erben,
K. Glazebrook,
A. Grado,
C. Heymans,
T. Jarrett,
S. Joudaki,
K. Kuijken,
G. Longo,
N. Napolitano,
D. Parkinson,
C. Vellucci
, et al. (2 additional authors not shown)
Abstract:
We present a machine-learning photometric redshift analysis of the Kilo-Degree Survey Data Release 3, using two neural-network based techniques: ANNz2 and MLPQNA. Despite limited coverage of spectroscopic training sets, these ML codes provide photo-zs of quality comparable to, if not better than, those from the BPZ code, at least up to zphot<0.9 and r<23.5. At the bright end of r<20, where very complete spectroscopic data overlapping with KiDS are available, the performance of the ML photo-zs clearly surpasses that of BPZ, currently the primary photo-z method for KiDS.
Using the Galaxy And Mass Assembly (GAMA) spectroscopic survey as calibration, we furthermore study how photo-zs improve for bright sources when photometric parameters additional to magnitudes are included in the photo-z derivation, as well as when VIKING and WISE infrared bands are added. While the fiducial four-band ugri setup gives a photo-z bias $δz=-2\times10^{-4}$ and scatter $σ_z<0.022$ at mean z = 0.23, combining magnitudes, colours, and galaxy sizes reduces the scatter by ~7% and the bias by an order of magnitude. Once the ugri and IR magnitudes are joined into 12-band photometry spanning up to 12 $μ$m, the scatter decreases by more than 10% over the fiducial case. Finally, using the 12 bands together with optical colours and linear sizes gives $δz<4\times10^{-5}$ and $σ_z<0.019$.
This paper also serves as a reference for two public photo-z catalogues accompanying KiDS DR3, both obtained using the ANNz2 code. The first one, of general purpose, includes all the 39 million KiDS sources with four-band ugri measurements in DR3. The second dataset, optimized for low-redshift studies such as galaxy-galaxy lensing, is limited to r<20, and provides photo-zs of much better quality than in the full-depth case thanks to incorporating optical magnitudes, colours, and sizes in the GAMA-calibrated photo-z derivation.
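The bias and scatter figures quoted above are summary statistics of the photo-z residuals against spectroscopic redshifts. The sketch below shows one common way to compute such numbers. It assumes residuals normalised as (z_phot - z_spec)/(1 + z_spec), the mean as the bias, and the scaled median absolute deviation (1.4826 × MAD) as the scatter; the abstract does not state which definitions the authors adopt, so treat these as illustrative conventions and `photoz_stats` as a hypothetical helper.

```python
from statistics import mean, median


def photoz_stats(z_phot, z_spec):
    """Return (bias, scatter) of normalised photo-z residuals.

    Assumed conventions (not taken from the abstract):
      residual = (z_phot - z_spec) / (1 + z_spec)
      bias     = mean of the residuals
      scatter  = 1.4826 * median absolute deviation of the residuals
    """
    if len(z_phot) != len(z_spec) or len(z_phot) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    resid = [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]
    bias = mean(resid)

    # The factor 1.4826 makes the MAD match the standard deviation
    # for Gaussian-distributed residuals.
    centre = median(resid)
    scatter = 1.4826 * median([abs(r - centre) for r in resid])
    return bias, scatter
```

A robust scatter estimator such as the scaled MAD is less sensitive to catastrophic outliers than a plain standard deviation, which is why it is often preferred for photo-z quality assessment.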
Submitted 11 May, 2018; v1 submitted 13 September, 2017;
originally announced September 2017.
-
Probability density estimation of photometric redshifts based on machine learning
Authors:
Stefano Cavuoti,
Massimo Brescia,
Valeria Amaro,
Civita Vellucci,
Giuseppe Longo,
Crescenzo Tortora
Abstract:
Photometric redshifts (photo-z's) provide an alternative way to estimate the distances of large samples of galaxies and are therefore crucial to a large variety of cosmological problems. Among the various methods proposed over the years, supervised machine learning (ML) methods capable of interpolating the knowledge gained by means of spectroscopic data have proven to be very effective. METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts) is a novel method designed to provide a reliable PDF (Probability Density Function) of the error distribution of photometric redshifts predicted by ML methods. The method is implemented as a modular workflow, whose internal engine for photo-z estimation makes use of the MLPQNA neural network (Multi Layer Perceptron with Quasi Newton learning rule), with the possibility to easily replace the specific machine learning model chosen to predict photo-z's. After a short description of the software, we present a summary of results on public galaxy data (Sloan Digital Sky Survey - Data Release 9) and a comparison with a completely different method based on Spectral Energy Distribution (SED) template fitting.
Submitted 12 June, 2017;
originally announced June 2017.
-
The third data release of the Kilo-Degree Survey and associated data products
Authors:
J. T. A. de Jong,
G. A. Verdoes Kleijn,
T. Erben,
H. Hildebrandt,
K. Kuijken,
G. Sikkema,
M. Brescia,
M. Bilicki,
N. R. Napolitano,
V. Amaro,
K. G. Begeman,
D. R. Boxhoorn,
H. Buddelmeijer,
S. Cavuoti,
F. Getman,
A. Grado,
E. Helmich,
Z. Huang,
N. Irisarri,
F. La Barbera,
G. Longo,
J. P. McFarland,
R. Nakajima,
M. Paolillo,
E. Puddu
, et al. (18 additional authors not shown)
Abstract:
The Kilo-Degree Survey (KiDS) is an ongoing optical wide-field imaging survey with the OmegaCAM camera at the VLT Survey Telescope. It aims to image 1500 square degrees in four filters (ugri). The core science driver is mapping the large-scale matter distribution in the Universe, using weak lensing shear and photometric redshift measurements. Further science cases include galaxy evolution, Milky Way structure, detection of high-redshift clusters, and finding rare sources such as strong lenses and quasars. Here we present the third public data release (DR3) and several associated data products, adding further area, homogenized photometric calibration, photometric redshifts and weak lensing shear measurements to the first two releases. A dedicated pipeline embedded in the Astro-WISE information system is used for the production of the main release. Modifications with respect to earlier releases are described in detail. Photometric redshifts have been derived using both Bayesian template fitting and machine-learning techniques. For the weak lensing measurements, optimized procedures based on the THELI data reduction and lensfit shear measurement packages are used. In DR3 stacked ugri images, weight maps, masks, and source lists for 292 new survey tiles (~300 sq.deg) are made available. The multi-band catalogue, including homogenized photometry and photometric redshifts, covers the combined DR1, DR2 and DR3 footprint of 440 survey tiles (447 sq.deg). Limiting magnitudes are typically 24.3, 25.1, 24.9, 23.8 (5 sigma in a 2 arcsec aperture) in ugri, respectively, and the typical r-band PSF size is less than 0.7 arcsec. The photometric homogenization scheme ensures accurate colors and an absolute calibration stable to ~2% for gri and ~3% in u. Separately released are a weak lensing shear catalogue and photometric redshifts based on two different machine-learning techniques.
Submitted 21 May, 2017; v1 submitted 8 March, 2017;
originally announced March 2017.
-
METAPHOR: Probability density estimation for machine learning based photometric redshifts
Authors:
Valeria Amaro,
Stefano Cavuoti,
Massimo Brescia,
Civita Vellucci,
Crescenzo Tortora,
Giuseppe Longo
Abstract:
We present METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts), a method able to provide a reliable PDF for photometric galaxy redshifts estimated through empirical techniques. METAPHOR is a modular workflow, mainly based on the MLPQNA neural network as internal engine to derive photometric galaxy redshifts, but giving the possibility to easily replace MLPQNA with any other method to predict photo-z's and their PDF. We present here the results of a validation test of the workflow on the galaxies from SDSS-DR9, also showing the universality of the method by replacing MLPQNA with KNN and Random Forest models. The validation test also includes a comparison with the PDFs derived from a traditional SED template fitting method (Le Phare).
Submitted 7 March, 2017;
originally announced March 2017.
-
Cooperative photometric redshift estimation
Authors:
Stefano Cavuoti,
Crescenzo Tortora,
Massimo Brescia,
Giuseppe Longo,
Mario Radovich,
Nicola R. Napolitano,
Valeria Amaro,
Civita Vellucci
Abstract:
In modern galaxy surveys photometric redshifts play a central role in a broad range of studies, from gravitational lensing and dark matter distribution to galaxy evolution. Using a dataset of about 25,000 galaxies from the second data release of the Kilo Degree Survey (KiDS) we obtain photometric redshifts with five different methods: (i) Random forest, (ii) Multi Layer Perceptron with Quasi Newton Algorithm, (iii) Multi Layer Perceptron with an optimization network based on the Levenberg-Marquardt learning rule, (iv) the Bayesian Photometric Redshift model (or BPZ) and (v) a classical SED template fitting procedure (Le Phare). We show how SED fitting techniques could provide useful information on the galaxy spectral type which can be used to improve the capability of machine learning methods, constraining systematic errors and reducing the occurrence of catastrophic outliers. We use such classification to train specialized regression estimators, demonstrating that such a hybrid approach, involving SED fitting and machine learning in a single collaborative framework, is capable of improving the overall prediction accuracy of photometric redshifts.
Submitted 27 January, 2017;
originally announced January 2017.
-
A cooperative approach among methods for photometric redshifts estimation: an application to KiDS data
Authors:
Stefano Cavuoti,
Crescenzo Tortora,
Massimo Brescia,
Giuseppe Longo,
Mario Radovich,
Nicola R. Napolitano,
Valeria Amaro,
Civita Vellucci,
Francesco La Barbera,
Fedor Getman,
Aniello Grado
Abstract:
Photometric redshifts (photo-z's) are fundamental in galaxy surveys to address different topics, from gravitational lensing and dark matter distribution to galaxy evolution. The Kilo Degree Survey (KiDS), i.e. the ESO public survey on the VLT Survey Telescope (VST), provides the unprecedented opportunity to exploit a large galaxy dataset with an exceptional image quality and depth in the optical wavebands. Using a KiDS subset of about 25,000 galaxies with measured spectroscopic redshifts, we have derived photo-z's using i) three different empirical methods based on supervised machine learning, ii) the Bayesian Photometric Redshift model (or BPZ), and iii) a classical SED template fitting procedure (Le Phare). We confirm that, in the regions of the photometric parameter space properly sampled by the spectroscopic templates, machine learning methods provide better redshift estimates, with a lower scatter and a smaller fraction of outliers. SED fitting techniques, however, provide useful information on the galaxy spectral type which can be effectively used to constrain systematic errors and to better characterize potential catastrophic outliers. Such classification is then used to specialize the training of regression machine learning models, by demonstrating that a hybrid approach, involving SED fitting and machine learning in a single collaborative framework, can be effectively used to improve the accuracy of photo-z estimates.
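The hybrid idea described above (use a spectral-type classification to specialize the regressors) can be sketched in a few lines. Everything below is a toy under stated assumptions: the spectral-type labels are taken as given, a single colour is the only feature, and a straight-line least-squares fit stands in for the machine learning regressors used in the paper. The names `fit_line`, `train_per_class` and `predict` are hypothetical.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept (toy regressor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0  # degenerate class: fall back to the mean
    return slope, mean_y - slope * mean_x


def train_per_class(colours, z_spec, classes):
    """Train one regressor per spectral-type class (labels assumed given)."""
    groups = defaultdict(lambda: ([], []))
    for colour, z, label in zip(colours, z_spec, classes):
        groups[label][0].append(colour)
        groups[label][1].append(z)
    return {label: fit_line(xs, ys) for label, (xs, ys) in groups.items()}


def predict(models, colour, label):
    """Dispatch a galaxy to the regressor trained on its own class."""
    slope, intercept = models[label]
    return slope * colour + intercept
```

For example, `models = train_per_class(colours, z_spec, classes)` followed by `predict(models, colour, label)` returns the redshift estimate from the regressor that matches the galaxy's class.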
Submitted 7 December, 2016;
originally announced December 2016.
-
METAPHOR: A machine learning based method for the probability density estimation of photometric redshifts
Authors:
Stefano Cavuoti,
Valeria Amaro,
Massimo Brescia,
Civita Vellucci,
Crescenzo Tortora,
Giuseppe Longo
Abstract:
A variety of fundamental astrophysical science topics require the determination of very accurate photometric redshifts (photo-z's). A wide plethora of methods have been developed, based either on template model fitting or on empirical explorations of the photometric parameter space. Machine learning based techniques are not explicitly dependent on the physical priors and are able to produce accurate photo-z estimations within the photometric ranges derived from the spectroscopic training set. These estimates, however, are not easy to characterize in terms of a photo-z Probability Density Function (PDF), due to the fact that the analytical relation mapping the photometric parameters onto the redshift space is virtually unknown. We present METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts), a method designed to provide a reliable PDF of the error distribution for empirical techniques. The method is implemented as a modular workflow, whose internal engine for photo-z estimation makes use of the MLPQNA neural network (Multi Layer Perceptron with Quasi Newton learning rule), with the possibility to easily replace the specific machine learning model chosen to predict photo-z's. We present a summary of results on SDSS-DR9 galaxy data, used also to perform a direct comparison with PDFs obtained by the Le Phare SED template fitting. We show that METAPHOR is capable of estimating the precision and reliability of photometric redshifts obtained with three different self-adaptive techniques, i.e. MLPQNA, Random Forest and the standard K-Nearest Neighbors models.
Submitted 7 November, 2016;
originally announced November 2016.
-
DAMEWARE - Data Mining & Exploration Web Application Resource
Authors:
Massimo Brescia,
Stefano Cavuoti,
Francesco Esposito,
Michelangelo Fiore,
Mauro Garofalo,
Marisa Guglielmo,
Giuseppe Longo,
Francesco Manna,
Alfonso Nocella,
Civita Vellucci
Abstract:
Astronomy is undergoing a methodological revolution triggered by an unprecedented wealth of complex and accurate data. DAMEWARE (DAta Mining & Exploration Web Application REsource) is a general-purpose, Web-based, Virtual Observatory compliant, distributed data mining framework specialized in the exploration of massive data sets with machine learning methods. We present DAMEWARE, which allows the scientific community to perform data mining and exploratory experiments on massive data sets using a simple web browser. DAMEWARE offers several tools which can be seen as working environments in which to choose data analysis functionalities such as clustering, classification, regression, feature extraction etc., together with models and algorithms.
Submitted 16 March, 2016; v1 submitted 2 March, 2016;
originally announced March 2016.