Hyperspectral shadow removal with Iterative Logistic Regression and latent Parametric Linear Combination of Gaussians

Authors: Core Francisco Park, Maya Nasr, Manuel Pérez-Carrasco, Eleanor Walker, Douglas Finkbeiner, Cecilia Garraffo

Abstract: Shadow detection and removal is a challenging problem in the analysis of hyperspectral images. Yet, this step is crucial for analyzing data for remote sensing applications like methane detection. In this work, we develop a shadow detection and removal method based only on the spectrum of each pixel and the overall distribution of spectral values. We first introduce Iterative Logistic Regression (ILR) to learn a spectral basis in which shadows can be linearly classified. We then model the joint distribution of the mean radiance and the projection coefficients of the spectra onto the above basis as a parametric linear combination of Gaussians. We can then extract the maximum likelihood mixing parameter of the Gaussians to estimate the shadow coverage and to correct the shadowed spectra. Our correction scheme reduces correction artefacts at shadow borders. The shadow detection and removal method is applied to hyperspectral images from MethaneAIR, a precursor to the satellite MethaneSAT.

Submitted 23 December, 2023; originally announced December 2023.

arXiv:2308.07538 [pdf, other]

Domain Adaptation via Minimax Entropy for Real/Bogus Classification of Astronomical Alerts

Authors: Guillermo Cabrera-Vives, César Bolivar, Francisco Förster, Alejandra M. Muñoz Arancibia, Manuel Pérez-Carrasco, Esteban Reyes

Abstract: Time domain astronomy is advancing towards the analysis of multiple massive datasets in real time, prompting the development of multi-stream machine learning models. In this work, we study Domain Adaptation (DA) for real/bogus classification of astronomical alerts using four different datasets: HiTS, DES, ATLAS, and ZTF. We study the domain shift between these datasets, and improve a naive deep learning classification model by using a fine-tuning approach and semi-supervised deep DA via Minimax Entropy (MME). We compare the balanced accuracy of these models for different source-target scenarios. We find that both the fine-tuning and MME models significantly improve on the base model with as few as one labeled item per class coming from the target dataset, but that MME does so without compromising performance on the source dataset.

Submitted 14 August, 2023; originally announced August 2023.

MSC Class: 85-08 ACM Class: I.2.1

Journal ref: ICML 2023 Workshop on Machine Learning for Astrophysics
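
Note on the metric named in the abstract above: balanced accuracy is the mean of the per-class recalls, so a rare class (for example, real alerts among many bogus ones) weighs as much as a common one. The following is a minimal pure-Python sketch of that definition; the function name and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; each class counts equally regardless of size."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    if not total:
        raise ValueError("no samples given")
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels: 8 bogus alerts and 2 real alerts.
y_true = ["bogus"] * 8 + ["real"] * 2
# The classifier gets every bogus alert right but misses one real alert.
y_pred = ["bogus"] * 8 + ["bogus", "real"]

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(score, plain)
```

In this toy case plain accuracy is 0.9 while balanced accuracy is 0.75, because the recall on the minority class is only 0.5.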

arXiv:2308.06404 [pdf, other]

Positional Encodings for Light Curve Transformers: Playing with Positions and Attention

Authors: Daniel Moreno-Cartagena, Guillermo Cabrera-Vives, Pavlos Protopapas, Cristobal Donoso-Oliva, Manuel Pérez-Carrasco, Martina Cádiz-Leyton

Abstract: We conducted empirical experiments to assess the transferability of a light curve transformer to datasets with different cadences and magnitude distributions using various positional encodings (PEs). We proposed a new approach to incorporate the temporal information directly into the output of the last attention layer. Our results indicated that using trainable PEs leads to significant improvements in transformer performance and training times. Our proposed PE on attention can be trained faster than the traditional non-trainable PE transformer while achieving competitive results when transferred to other datasets.

Submitted 11 August, 2023; originally announced August 2023.

Comments: In Proceedings of the 40th International Conference on Machine Learning (ICML), Workshop on Machine Learning for Astrophysics, PMLR 202, 2023, Honolulu, Hawaii, USA

Journal ref: In Proceedings of the 40th International Conference on Machine Learning (ICML), Workshop on Machine Learning for Astrophysics, PMLR 202, 2023, Honolulu, Hawaii, USA
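
Note on the "traditional non-trainable PE" mentioned in the abstract above: one common fixed encoding is the sinusoidal scheme from the original transformer architecture. The sketch below evaluates it at real-valued observation times instead of integer token indices; that substitution, the function name, and the base of 10000 are assumptions made for illustration, not details confirmed by the abstract.

```python
import math


def sinusoidal_encoding(times, d_model, base=10000.0):
    """Fixed (non-trainable) sinusoidal encoding evaluated at each time stamp.

    Returns one row per time stamp; each row interleaves sin and cos values
    at d_model // 2 geometrically spaced frequencies.
    """
    if d_model <= 0 or d_model % 2 != 0:
        raise ValueError("d_model must be a positive even integer")
    rows = []
    for t in times:
        row = []
        for i in range(d_model // 2):
            freq = 1.0 / (base ** (2 * i / d_model))
            row.append(math.sin(t * freq))  # even index: sine
            row.append(math.cos(t * freq))  # odd index: cosine
        rows.append(row)
    return rows


# Hypothetical, irregularly sampled observation times (arbitrary units).
pe = sinusoidal_encoding([0.0, 1.5, 4.2], d_model=4)
print(pe[0])
```

Because the encoding depends only on the time stamps, it has no parameters to train; a trainable PE would replace this function with learned weights.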

arXiv:2308.05011 [pdf, other]

Multi-Class Deep SVDD: Anomaly Detection Approach in Astronomy with Distinct Inlier Categories

Authors: Manuel Pérez-Carrasco, Guillermo Cabrera-Vives, Lorena Hernández-García, Francisco Forster, Paula Sánchez-Sáez, Alejandra Muñoz Arancibia, Nicolás Astorga, Franz Bauer, Amelia Bayo, Martina Cádiz-Leyton, Marcio Catelan

Abstract: With the increasing volume of astronomical data generated by modern survey telescopes, automated pipelines and machine learning techniques have become crucial for analyzing and extracting knowledge from these datasets. Anomaly detection, i.e. the task of identifying irregular or unexpected patterns in the data, is a complex challenge in astronomy. In this paper, we propose Multi-Class Deep Support Vector Data Description (MCDSVDD), an extension of the state-of-the-art anomaly detection algorithm One-Class Deep SVDD, specifically designed to handle different inlier categories with distinct data distributions. MCDSVDD uses a neural network to map the data into hyperspheres, where each hypersphere represents a specific inlier category. The distance of each sample from the centers of these hyperspheres determines the anomaly score. We evaluate the effectiveness of MCDSVDD by comparing its performance with several anomaly detection algorithms on a large dataset of astronomical light-curves obtained from the Zwicky Transient Facility. Our results demonstrate the efficacy of MCDSVDD in detecting anomalous sources while leveraging the presence of different inlier categories. The code and the data needed to reproduce our results are publicly available at https://github.com/mperezcarrasco/AnomalyALeRCE.

Submitted 10 August, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

Comments: Accepted to ICML 2023 Workshop on Machine Learning for Astrophysics
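
Note on the scoring step described in the abstract above: the abstract says the distance of each sample from the hypersphere centers determines the anomaly score. A minimal sketch of one way to turn that into a number is given below. It assumes the sample has already been embedded by the network (which is omitted here) and takes the distance to the nearest center as the score; that reduction, the class names, and the coordinates are all hypothetical and may differ from the authors' released code.

```python
import math


def anomaly_score(embedding, centers):
    """Euclidean distance from an embedded sample to its nearest class center.

    `centers` maps an inlier class name to the center of its hypersphere.
    Taking the minimum over classes is an assumption made for this sketch.
    """
    if not centers:
        raise ValueError("at least one center is required")
    return min(math.dist(embedding, c) for c in centers.values())


# Hypothetical 2-D centers for two inlier categories.
centers = {"class_a": (0.0, 0.0), "class_b": (4.0, 0.0)}

inlier = anomaly_score((0.3, 0.4), centers)   # close to class_a
outlier = anomaly_score((2.0, 3.0), centers)  # far from both centers
print(inlier, outlier)
```

A sample near any one center receives a low score, while a sample far from every center receives a high score, which matches the idea of several distinct inlier categories.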

arXiv:2204.01558 [pdf, other]

Con$^{2}$DA: Simplifying Semi-supervised Domain Adaptation by Learning Consistent and Contrastive Feature Representations

Authors: Manuel Pérez-Carrasco, Pavlos Protopapas, Guillermo Cabrera-Vives

Abstract: In this work, we present Con$^{2}$DA, a simple framework that extends recent advances in semi-supervised learning to the semi-supervised domain adaptation (SSDA) problem. Our framework generates pairs of associated samples by applying stochastic data transformations to a given input. Associated data pairs are mapped to a feature representation space using a feature extractor. We use different loss functions to enforce consistency between the feature representations of associated data pairs of samples. We show that these learned representations are useful to deal with differences in data distributions in the domain adaptation problem. We performed experiments to study the main components of our model and we show that (i) learning of the consistent and contrastive feature representations is crucial to extract good discriminative features across different domains, and (ii) our model benefits from the use of strong augmentation policies. With these findings, our method achieves state-of-the-art performance on three benchmark datasets for SSDA.

Submitted 11 August, 2023; v1 submitted 4 April, 2022; originally announced April 2022.

Comments: Accepted to NeurIPS 2021 Workshop on Distribution Shifts: Connecting Methods and Applications

arXiv:2106.07660 [pdf, other]

doi 10.3847/1538-3881/ac1426

Searching for changing-state AGNs in massive datasets -- I: applying deep learning and anomaly detection techniques to find AGNs with anomalous variability behaviours

Authors: P. Sánchez-Sáez, H. Lira, L. Martí, N. Sánchez-Pi, J. Arredondo, F. E. Bauer, A. Bayo, G. Cabrera-Vives, C. Donoso-Oliva, P. A. Estévez, S. Eyheramendy, F. Förster, L. Hernández-García, A. M. Muñoz Arancibia, M. Pérez-Carrasco, M. Sepúlveda, J. R. Vergara

Abstract: The classic classification scheme for Active Galactic Nuclei (AGNs) was recently challenged by the discovery of the so-called changing-state (changing-look) AGNs (CSAGNs). The physical mechanism behind this phenomenon is still a matter of open debate and the samples are too small and of serendipitous nature to provide robust answers. In order to tackle this problem, we need to design methods that are able to detect AGN right in the act of changing state. Here we present an anomaly detection (AD) technique designed to identify AGN light curves with anomalous behaviors in massive datasets. The main aim of this technique is to identify CSAGN at different stages of the transition, but it can also be used for more general purposes, such as cleaning massive datasets for AGN variability analyses. We used light curves from the Zwicky Transient Facility data release 5 (ZTF DR5), containing a sample of 230,451 AGNs of different classes. The ZTF DR5 light curves were modeled with a Variational Recurrent Autoencoder (VRAE) architecture, which allowed us to obtain a set of attributes from the VRAE latent space that describes the general behaviour of our sample. These attributes were then used as features for an Isolation Forest (IF) algorithm, which is an anomaly detector for a "one class" kind of problem. We used the VRAE reconstruction errors and the IF anomaly score to select a sample of 8,809 anomalies. These anomalies are dominated by bogus candidates, but we were able to identify 75 promising CSAGN candidates.

Submitted 12 July, 2021; v1 submitted 14 June, 2021; originally announced June 2021.

Comments: Accepted for publication in the Astronomical Journal (AJ)

Journal ref: AJ 162 206 (2021)

arXiv:2008.03303 [pdf, other]

doi 10.3847/1538-3881/abe9bc

The Automatic Learning for the Rapid Classification of Events (ALeRCE) Alert Broker

Authors: F. Förster, G. Cabrera-Vives, E. Castillo-Navarrete, P. A. Estévez, P. Sánchez-Sáez, J. Arredondo, F. E. Bauer, R. Carrasco-Davis, M. Catelan, F. Elorrieta, S. Eyheramendy, P. Huijse, G. Pignata, E. Reyes, I. Reyes, D. Rodríguez-Mancini, D. Ruz-Mieres, C. Valenzuela, I. Alvarez-Maldonado, N. Astorga, J. Borissova, A. Clocchiatti, D. De Cicco, C. Donoso-Oliva, M. J. Graham, et al. (15 additional authors not shown)

Abstract: We introduce the Automatic Learning for the Rapid Classification of Events (ALeRCE) broker, an astronomical alert broker designed to provide a rapid and self-consistent classification of large etendue telescope alert streams, such as that provided by the Zwicky Transient Facility (ZTF) and, in the future, the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST). ALeRCE is a Chilean-led broker run by an interdisciplinary team of astronomers and engineers, working to become intermediaries between survey and follow-up facilities. ALeRCE uses a pipeline which includes the real-time ingestion, aggregation, cross-matching, machine learning (ML) classification, and visualization of the ZTF alert stream. We use two classifiers: a stamp-based classifier, designed for rapid classification, and a light-curve-based classifier, which uses the multi-band flux evolution to achieve a more refined classification. We describe in detail our pipeline, data products, tools and services, which are made public for the community (see https://alerce.science). Since we began operating our real-time ML classification of the ZTF alert stream in early 2019, we have grown a large community of active users around the globe. We describe our results to date, including the real-time processing of $9.7\times10^7$ alerts, the stamp classification of $1.9\times10^7$ objects, the light curve classification of $8.5\times10^5$ objects, the report of 3088 supernova candidates, and different experiments using LSST-like alert streams. Finally, we discuss the challenges ahead to go from a single stream of alerts such as ZTF to a multi-stream ecosystem dominated by LSST.

Submitted 7 August, 2020; originally announced August 2020.

Comments: Submitted to AAS on Jun 29th. Preview for LSST PCW 2020. Comments welcome

arXiv:1909.11651 [pdf, other]

Matching Embeddings for Domain Adaptation

Authors: Manuel Pérez-Carrasco, Guillermo Cabrera-Vives, Pavlos Protopapas, Nicolás Astorga, Marouan Belhaj

Abstract: In this work we address the problem of transferring knowledge obtained from a vast annotated source domain to a sparsely labeled target domain. We propose Adversarial Variational Domain Adaptation (AVDA), a semi-supervised domain adaptation method based on deep variational embedded representations. We use approximate inference and domain adversarial methods to map samples from source and target domains into an aligned class-dependent embedding defined as a Gaussian Mixture Model. AVDA works as a classifier and considers a generative model that helps this classification. We used digits datasets for experimentation. Our results show that in a semi-supervised few-shot scenario our model outperforms previous methods in most of the adaptation tasks, even when using fewer labeled samples per class in the target domain.

Submitted 24 January, 2021; v1 submitted 25 September, 2019; originally announced September 2019.

Comments: 12 pages, 3 figures

arXiv:1810.07857 [pdf, other]

doi 10.1088/1538-3873/aaeeb4

Multiband galaxy morphologies for CLASH: a convolutional neural network transferred from CANDELS

Authors: Manuel Pérez-Carrasco, Guillermo Cabrera-Vives, Monserrat Martinez-Marín, Pierluigi Cerulo, Ricardo Demarco, Pavlos Protopapas, Julio Godoy, Marc Huertas-Company

Abstract: We present visual-like morphologies over 16 photometric bands, from ultra-violet to near infrared, for 8,412 galaxies in the Cluster Lensing And Supernova survey with Hubble (CLASH) obtained by a convolutional neural network (CNN) model. Our model follows the CANDELS main morphological classification scheme, obtaining the probability for each galaxy at each CLASH band of being spheroid, disk, irregular, point source, or unclassifiable. Our catalog contains morphologies for each galaxy with Hmag < 24.5 in every filter where the galaxy is observed. We trained an initial CNN model using approximately 7,500 expert eyeball labels from The Cosmic Assembly Near-IR Deep Extragalactic Legacy Survey (CANDELS). We created eyeball labels for 100 randomly selected galaxies for each of the 16 CLASH filters (1,600 galaxy images in total), where each image was classified by at least five of us. We use these labels to fine-tune the network in order to accurately predict labels for the CLASH data and to evaluate the performance of our model. We achieve a root-mean-square error of 0.0991 on the test set. We show that our proposed fine-tuning technique reduces the number of labeled images needed for training, as compared to directly training over the CLASH data, and achieves a better performance. This approach is very useful to minimize eyeball labeling efforts when classifying unlabeled data from new surveys. This will become particularly useful for massive datasets such as those coming from near-future surveys like EUCLID or the LSST. Our catalog consists of predicted probabilities of each morphology for each galaxy in its different bands and is made publicly available at http://www.inf.udec.cl/~guille/data/Deep-CLASH.csv.

Submitted 17 October, 2018; originally announced October 2018.

Comments: 11 pages, 11 figures, submitted to Publications of the Astronomical Society of the Pacific

Showing 1–9 of 9 results for author: Pérez-Carrasco, M