
DeepAdversaries: Examining the Robustness of Deep Learning Models for Galaxy Morphology Classification

Authors: Aleksandra Ćiprijanović, Diana Kafkes, Gregory Snyder, F. Javier Sánchez, Gabriel Nathan Perdue, Kevin Pedro, Brian Nord, Sandeep Madireddy, Stefan M. Wild

Abstract: With the increased adoption of supervised deep learning methods for processing and analysis of cosmological survey data, the assessment of data perturbation effects (that can naturally occur in the data processing and analysis pipelines) and the development of methods that increase model robustness are increasingly important. In the context of morphological classification of galaxies, we study the effects of perturbations in imaging data. In particular, we examine the consequences of using neural networks when training on baseline data and testing on perturbed data. We consider perturbations associated with two primary sources: 1) increased observational noise as represented by higher levels of Poisson noise and 2) data processing noise incurred by steps such as image compression or telescope errors as represented by one-pixel adversarial attacks. We also test the efficacy of domain adaptation techniques in mitigating the perturbation-driven errors. We use classification accuracy, latent space visualizations, and latent space distance to assess model robustness. Without domain adaptation, we find that pixel-level processing errors easily flip the classification into an incorrect class and that higher observational noise makes the model trained on low-noise data unable to classify galaxy morphologies. On the other hand, we show that training with domain adaptation improves model robustness and mitigates the effects of these perturbations, improving the classification accuracy by 23% on data with higher observational noise. Domain adaptation also increases the latent space distance between the baseline and the incorrectly classified one-pixel perturbed image by a factor of ~2.3, making the model more robust to inadvertent perturbations.
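As a rough illustration of the first perturbation type named in the abstract, the sketch below adds Poisson (shot) noise to a toy image. It is not the authors' pipeline: the helper names `sample_poisson` and `add_poisson_noise`, the `exposure_scale` parameter, and the flat 20x20 test image are all assumptions introduced here for illustration. The idea is only that a smaller expected photon count per pixel yields a noisier image once the counts are rescaled to the original flux units.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's method (suitable for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def add_poisson_noise(image, exposure_scale, rng):
    """Return a noisy copy of `image` (a list of rows of non-negative fluxes).

    `exposure_scale` is a hypothetical knob: expected counts per pixel are
    flux * exposure_scale, and the sampled counts are divided by the same
    factor so the output stays in the input's flux units.
    """
    return [
        [sample_poisson(pixel * exposure_scale, rng) / exposure_scale for pixel in row]
        for row in image
    ]


def pixel_std(image):
    """Standard deviation over all pixels of a 2D list."""
    values = [pixel for row in image for pixel in row]
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


rng = random.Random(0)
flat_image = [[10.0] * 20 for _ in range(20)]  # toy image, constant flux

long_exposure = add_poisson_noise(flat_image, exposure_scale=1.0, rng=rng)
short_exposure = add_poisson_noise(flat_image, exposure_scale=0.1, rng=rng)

std_long = pixel_std(long_exposure)
std_short = pixel_std(short_exposure)
```

Comparing `std_short` with `std_long` shows the scatter growing as the assumed exposure shrinks, which is the kind of train/test mismatch the abstract describes.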

Submitted 6 July, 2022; v1 submitted 28 December, 2021; originally announced December 2021.

Comments: 20 pages, 6 figures, 5 tables; accepted in MLST

Report number: FERMILAB-PUB-21-767-SCD

arXiv:2111.00961

Robustness of deep learning algorithms in astronomy -- galaxy morphology studies

Authors: A. Ćiprijanović, D. Kafkes, G. N. Perdue, K. Pedro, G. Snyder, F. J. Sánchez, S. Madireddy, S. M. Wild, B. Nord

Abstract: Deep learning models are being increasingly adopted in a wide array of scientific domains, especially to handle the high dimensionality and volume of scientific data. However, these models tend to be brittle due to their complexity and overparametrization, especially to the inadvertent adversarial perturbations that can appear due to common image processing such as compression or blurring that are often seen with real scientific data. It is crucial to understand this brittleness and develop models robust to these adversarial perturbations. To this end, we study the effect of observational noise from the exposure time, as well as the worst-case scenario of a one-pixel attack as a proxy for compression or telescope errors, on the performance of ResNet18 trained to distinguish between galaxies of different morphologies in LSST mock data. We also explore how domain adaptation techniques can help improve model robustness against this type of naturally occurring attack and help scientists build more trustworthy and stable models.

Submitted 2 November, 2021; v1 submitted 1 November, 2021; originally announced November 2021.

Comments: Accepted in: Fourth Workshop on Machine Learning and the Physical Sciences (35th Conference on Neural Information Processing Systems; NeurIPS2021); final version

Report number: FERMILAB-CONF-21-561-SCD

arXiv:2109.08246

DeepGhostBusters: Using Mask R-CNN to Detect and Mask Ghosting and Scattered-Light Artifacts from Optical Survey Images

Authors: Dimitrios Tanoglidis, Aleksandra Ćiprijanović, Alex Drlica-Wagner, Brian Nord, Michael H. L. S. Wang, Ariel Jacob Amsellem, Kathryn Downey, Sydney Jenkins, Diana Kafkes, Zhuoqi Zhang

Abstract: Wide-field astronomical surveys are often affected by the presence of undesirable reflections (often known as "ghosting artifacts" or "ghosts") and scattered-light artifacts. The identification and mitigation of these artifacts is important for rigorous astronomical analyses of faint and low-surface-brightness systems. However, the identification of ghosts and scattered-light artifacts is challenging due to a) the complex morphology of these features and b) the large data volume of current and near-future surveys. In this work, we use images from the Dark Energy Survey (DES) to train, validate, and test a deep neural network (Mask R-CNN) to detect and localize ghosts and scattered-light artifacts. We find that the ability of the Mask R-CNN model to identify affected regions is superior to that of conventional algorithms and traditional convolutional neural network methods. We propose that a multi-step pipeline combining Mask R-CNN segmentation with a classical CNN classifier provides a powerful technique for the automated detection of ghosting and scattered-light artifacts in current and near-future surveys.

Submitted 16 September, 2021; originally announced September 2021.

Comments: 24 pages, 18 figures. Code and data related to this work can be found at: https://github.com/dtanoglidis/DeepGhostBusters

Report number: FERMILAB-PUB-21-374-AE

arXiv:2103.01373

DOI: 10.1093/mnras/stab1677

DeepMerge II: Building Robust Deep Learning Algorithms for Merging Galaxy Identification Across Domains

Authors: A. Ćiprijanović, D. Kafkes, K. Downey, S. Jenkins, G. N. Perdue, S. Madireddy, T. Johnston, G. F. Snyder, B. Nord

Abstract: In astronomy, neural networks are often trained on simulation data with the prospect of being used on telescope observations. Unfortunately, training a model on simulation data and then applying it to instrument data leads to a substantial and potentially even detrimental decrease in model accuracy on the new target dataset. Simulated and instrument data represent different data domains, and for an algorithm to work in both, domain-invariant learning is necessary. Here we employ two domain adaptation techniques, Maximum Mean Discrepancy (MMD) as an additional transfer loss and Domain Adversarial Neural Networks (DANNs), and demonstrate their viability to extract domain-invariant features within the astronomical context of classifying merging and non-merging galaxies. Additionally, we explore the use of Fisher loss and entropy minimization to enforce better in-domain class discriminability. We show that the addition of each domain adaptation technique improves the performance of a classifier when compared to conventional deep learning algorithms. We demonstrate this on two examples: between two Illustris-1 simulated datasets of distant merging galaxies, and between Illustris-1 simulated data of nearby merging galaxies and observed data from the Sloan Digital Sky Survey. The use of domain adaptation techniques in our experiments leads to an increase of target domain classification accuracy of up to ~20%. With further development, these techniques will allow astronomers to successfully implement neural network models trained on simulation data to efficiently detect and study astrophysical objects in current and future large-scale astronomical surveys.
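To make the MMD transfer loss mentioned in the abstract more concrete, here is a minimal sketch of a biased squared-MMD estimate between two sets of feature vectors. It is an illustration under stated assumptions, not the paper's implementation: the single Gaussian kernel, the `bandwidth` value, the function names, and the synthetic 2D "features" are all hypothetical choices made here. In a training loop, a quantity like this would be computed on latent features from the source and target domains and added to the classification loss.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of squared MMD: E[k(s,s')] + E[k(t,t')] - 2 E[k(s,t)]."""

    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )


# Synthetic stand-ins for latent features from two domains (assumed data).
rng = random.Random(0)
source = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(50)]
same_domain = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(50)]
shifted_domain = [[rng.gauss(3.0, 1.0) for _ in range(2)] for _ in range(50)]

mmd_same = mmd_squared(source, same_domain)
mmd_shifted = mmd_squared(source, shifted_domain)
```

The estimate is small when both sets come from the same distribution and grows when one is shifted, which is why minimizing it encourages domain-invariant features.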

Submitted 1 March, 2021; originally announced March 2021.

Comments: Submitted to MNRAS; 21 pages, 9 figures, 9 tables

Report number: FERMILAB-PUB-21-072-SCD

Journal ref: MNRAS, Volume 506, Issue 1, September 2021, Page 677

arXiv:2011.03591

Domain adaptation techniques for improved cross-domain study of galaxy mergers

Authors: A. Ćiprijanović, D. Kafkes, S. Jenkins, K. Downey, G. N. Perdue, S. Madireddy, T. Johnston, B. Nord

Abstract: In astronomy, neural networks are often trained on simulated data with the prospect of being applied to real observations. Unfortunately, simply training a deep neural network on images from one domain does not guarantee satisfactory performance on new images from a different domain. The ability to share cross-domain knowledge is the main advantage of modern deep domain adaptation techniques. Here we demonstrate the use of two techniques, Maximum Mean Discrepancy (MMD) and adversarial training with Domain Adversarial Neural Networks (DANN), for the classification of distant galaxy mergers from the Illustris-1 simulation, where the two domains presented differ only due to inclusion of observational noise. We show how the addition of either MMD or adversarial training greatly improves the performance of the classifier on the target domain when compared to conventional machine learning algorithms, thereby demonstrating great promise for their use in astronomy.

Submitted 13 November, 2020; v1 submitted 6 November, 2020; originally announced November 2020.

Comments: Accepted in: Machine Learning and the Physical Sciences - Workshop at the 34th Conference on Neural Information Processing Systems (NeurIPS); final version

Report number: FERMILAB-CONF-20-582-SCD

Showing 1–5 of 5 results for author: Kafkes, D