Galaxy merger challenge: A comparison study between machine learning-based detection methods
Authors:
B. Margalef-Bentabol,
L. Wang,
A. La Marca,
C. Blanco-Prieto,
D. Chudy,
H. Domínguez-Sánchez,
A. D. Goulding,
A. Guzmán-Ortega,
M. Huertas-Company,
G. Martin,
W. J. Pearson,
V. Rodriguez-Gomez,
M. Walmsley,
R. W. Bickley,
C. Bottrell,
C. Conselice,
D. O'Ryan
Abstract:
Various galaxy merger detection methods have been applied to diverse datasets. However, it is difficult to understand how they compare. We aim to benchmark the relative performance of machine learning (ML) merger detection methods. We explore six leading ML methods using three main datasets. The first one (the training data) consists of mock observations from the IllustrisTNG simulations and allows us to quantify the performance metrics of the detection methods. The second one consists of mock observations from the Horizon-AGN simulations, introduced to evaluate the performance of classifiers trained on different, but comparable data. The third one consists of real observations from the Hyper Suprime-Cam Subaru Strategic Program (HSC-SSP) survey. For the binary classification task (mergers vs. non-mergers), all methods perform reasonably well in the domain of the training data. At $0.1<z<0.3$, precision and recall range between $\sim$70\% and 80\%, both of which decrease with increasing $z$ as expected (by $\sim$5\% for precision and $\sim$10\% for recall at $0.76<z<1.0$). When transferred to a different domain, the precision of all classifiers is only slightly reduced, but the recall is significantly worse (by $\sim$20-40\% depending on the method). Zoobot offers the best overall performance in terms of precision and F1 score. When applied to real HSC observations, all methods agree well with visual labels of clear mergers but can differ by more than an order of magnitude in predicting the overall fraction of major mergers. For the multi-class classification task to distinguish pre-, post- and non-mergers, none of the methods performs well, which could be partly due to limitations in resolution and depth of the data. With the advent of better-quality data (e.g. JWST and Euclid), it is important to improve our ability to detect mergers and distinguish between merger stages.
Submitted 15 April, 2024; v1 submitted 22 March, 2024;
originally announced March 2024.
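The abstract above compares classifiers by precision, recall and F1 score for the binary task (mergers vs. non-mergers). As a minimal sketch of how these three quantities relate, the snippet below computes them from true and predicted labels. The function name `binary_metrics` and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels.

    Convention assumed here: 1 = merger, 0 = non-merger.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # mergers correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-mergers flagged as mergers
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # mergers that were missed

    # Guard against empty denominators (no predicted or no true mergers).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example (made-up labels, purely illustrative).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
precision, recall, f1 = binary_metrics(y_true, y_pred)
print(precision, recall, f1)
```

Precision is the fraction of predicted mergers that are true mergers, recall is the fraction of true mergers that are recovered, and F1 is their harmonic mean. A drop in recall with stable precision, as reported for the domain transfer, therefore means more true mergers are missed while the flagged sample stays about as pure.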
Morphological signatures of mergers in the TNG50 simulation and the Kilo-Degree Survey: the merger fraction from dwarfs to Milky Way-like galaxies
Authors:
Alejandro Guzmán-Ortega,
Vicente Rodriguez-Gomez,
Gregory F. Snyder,
Katie Chamberlain,
Lars Hernquist
Abstract:
Using the TNG50 cosmological simulation and observations from the Kilo-Degree Survey (KiDS), we investigate the connection between galaxy mergers and optical morphology in the local Universe over a wide range of galaxy stellar masses ($8.5\leqslant\log(M_\ast/\text{M}_\odot)\leqslant11$). To this end, we have generated over 16,000 synthetic images of TNG50 galaxies designed to match KiDS observations, including the effects of dust attenuation and scattering, and used the $\mathrm{\mathtt{statmorph}}$ code to measure various image-based morphological diagnostics in the $r$-band for both data sets. Such measurements include the Gini-$M_{20}$ and concentration-asymmetry-smoothness statistics. Overall, we find good agreement between the optical morphologies of TNG50 and KiDS galaxies, although the former are slightly more concentrated and asymmetric than their observational counterparts. Afterwards, we trained a random forest classifier to identify merging galaxies in the simulation (including major and minor mergers) using the morphological diagnostics as the model features, along with merger statistics from the merger trees as the ground truth. We find that the asymmetry statistic exhibits the highest feature importance of all the morphological parameters considered. Thus, the performance of our algorithm is comparable to that of the more traditional method of selecting highly asymmetric galaxies. Finally, using our trained model, we estimate the galaxy merger fraction in both our synthetic and observational galaxy samples, finding in both cases that the galaxy merger fraction increases steadily as a function of stellar mass.
Submitted 10 November, 2022;
originally announced November 2022.
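The abstract above reports that the asymmetry statistic carries the highest feature importance. As a rough illustration of what that statistic measures, the sketch below compares an image with its 180-degree rotation and normalises the absolute residual by the total absolute flux. This is a simplified form under stated assumptions: it rotates about the array centre and omits the background-asymmetry correction and the centre optimisation that full morphology codes typically apply, so it is not the procedure used in the paper. The function name `asymmetry` and the toy images are hypothetical.

```python
def asymmetry(image):
    """Simplified rotational asymmetry of a 2D image (list of rows).

    A = sum(|I - I_180|) / sum(|I|), rotating about the array centre.
    No background term and no centre minimisation (assumed simplification).
    """
    # Rotating by 180 degrees reverses the row order and each row.
    rotated = [row[::-1] for row in image[::-1]]
    residual = sum(
        abs(a - b)
        for row, rot_row in zip(image, rotated)
        for a, b in zip(row, rot_row)
    )
    total = sum(abs(a) for row in image for a in row)
    return residual / total if total else 0.0


# Toy 3x3 "galaxies" (made-up pixel values, purely illustrative).
symmetric = [[0, 1, 0],
             [1, 4, 1],
             [0, 1, 0]]
lopsided = [[0, 0, 0],
            [0, 4, 3],
            [0, 0, 0]]

print(asymmetry(symmetric))  # symmetric under rotation, so the residual vanishes
print(asymmetry(lopsided))   # extra flux on one side raises the asymmetry
```

A perfectly point-symmetric image gives zero, while tidal features or a close companion leave flux with no rotated counterpart and push the value up, which is why a threshold on asymmetry is a traditional merger selection.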