-
Systematic analysis of jellyfish galaxy candidates in Fornax, Antlia, and Hydra from the S-PLUS survey: A self-supervised visual identification aid
Authors:
Yash Gondhalekar,
Ana L. Chies-Santos,
Rafael S. de Souza,
Carolina Queiroz,
Amanda R. Lopes,
Fabricio Ferrari,
Gabriel M. Azevedo,
Hellen Monteiro-Pereira,
Roderik Overzier,
Analía V. Smith Castelli,
Yara L. Jaffé,
Rodrigo F. Haack,
P. T. Rahna,
Shiyin Shen,
Zihao Mu,
Ciria Lima-Dias,
Carlos E. Barbosa,
Gustavo B. Oliveira Schwarz,
Rogério Riffel,
Yolanda Jimenez-Teja,
Marco Grossi,
Claudia L. Mendes de Oliveira,
William Schoenell,
Thiago Ribeiro,
Antonio Kanaan
Abstract:
We study 51 jellyfish galaxy candidates in the Fornax, Antlia, and Hydra clusters. These candidates are identified using the JClass scheme based on the visual classification of wide-field, twelve-band optical images obtained from the Southern Photometric Local Universe Survey. A comprehensive astrophysical analysis of the jellyfish (JClass > 0), non-jellyfish (JClass = 0), and independently organized control samples is undertaken. We develop a semi-automated pipeline using self-supervised learning and similarity search to detect jellyfish galaxies. The proposed framework is designed to assist visual classifiers by providing more reliable JClasses for galaxies. We find that jellyfish candidates exhibit a lower Gini coefficient, higher entropy, and a lower 2D Sérsic index as the jellyfish features in these galaxies become more pronounced. Jellyfish candidates show elevated star formation rates (including contributions from the main body and tails) by $\sim$1.75 dex, suggesting a significant increase in the SFR caused by the ram-pressure stripping phenomenon. Galaxies in the Antlia and Fornax clusters preferentially fall towards the cluster's centre, whereas only a mild preference is observed for Hydra galaxies. Our self-supervised pipeline, applied in visually challenging cases, offers two main advantages: it reduces human visual biases and scales effectively for large datasets. This versatile framework promises substantial enhancements in morphology studies for future galaxy image surveys.
Submitted 6 June, 2024;
originally announced June 2024.
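The abstract above reports that jellyfish candidates have a lower Gini coefficient. As a rough illustration of that statistic only, here is a minimal sketch using the sorted-rank form commonly applied to galaxy pixel fluxes. The function name `gini` and the choice of this particular estimator are assumptions for illustration; the abstract does not say which definition or software the authors used.

```python
def gini(pixels):
    """Gini coefficient of a flat sequence of pixel fluxes.

    Assumed estimator (not confirmed by the abstract):
        G = sum_i (2i - n - 1) |x_i| / (mean(|x|) * n * (n - 1)),
    with the |x_i| sorted in ascending order and i = 1..n.
    """
    values = sorted(abs(v) for v in pixels)
    n = len(values)
    if n < 2:
        raise ValueError("need at least two pixels")
    mean = sum(values) / n
    if mean == 0:
        return 0.0  # all-zero image: treat as perfectly uniform
    total = sum((2 * i - n - 1) * v for i, v in enumerate(values, start=1))
    return total / (mean * n * (n - 1))


# Light spread evenly over all pixels versus concentrated in one pixel.
uniform = gini([1, 1, 1, 1])
concentrated = gini([0, 0, 0, 1])
```

With this estimator, evenly spread light gives a value of 0 and light concentrated in a single pixel gives 1, so a lower value corresponds to a more evenly distributed flux.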
-
ELEPHANT: ExtragaLactic alErt Pipeline for Hostless AstroNomical Transients
Authors:
P. J. Pessi,
R. Durgesh,
L. Nakazono,
E. E. Hayes,
R. A. P. Oliveira,
E. E. O. Ishida,
A. Moitinho,
A. Krone-Martins,
B. Moews,
R. S. de Souza,
R. Beck,
M. A. Kuhn,
K. Nowak,
S. Vaughan
Abstract:
Context. Transient astronomical events that exhibit no discernible association with a host galaxy are commonly referred to as hostless. These rare phenomena are associated with extremely energetic events, and they can offer unique insights into the properties and evolution of stars and galaxies. However, the sheer number of transients captured by contemporary high-cadence astronomical surveys renders the manual identification of all potential hostless transients impractical. Therefore, creating a systematic identification tool is crucial for studying these elusive events. Aims. We present the ExtragaLactic alErt Pipeline for Hostless AstroNomical Transients (ELEPHANT), a framework for filtering hostless transients in astronomical data streams. Methods. We used Fink to access all the ZTF alerts produced between January/2022 and December/2023, selecting only those associated with extragalactic transients. We then processed the associated stamps using a sequence of image analysis techniques to retrieve hostless candidates. Results. We find that less than 2% of all analyzed transients are potentially hostless. Among them, approximately 10% have a spectroscopic class reported on TNS, with Type Ia supernova being the most common class, followed by SLSN. Among the hostless candidates retrieved by our pipeline, there was SN 2018ibb, which has been proposed to be a PISN candidate; and SN 2022ann, one of only five known SNe Icn. When no class is reported on TNS, the dominant classes are QSO and SN candidates, the former obtained from SIMBAD and the latter inferred using the Fink ML classifier. Conclusions. ELEPHANT represents an effective strategy to filter extragalactic events within large and complex astronomical alert streams. There are many applications for which this pipeline will be useful, ranging from transient selection for follow-up to studies of transient environments.
Submitted 28 April, 2024;
originally announced April 2024.
-
Galmoss: A package for GPU-accelerated Galaxy Profile Fitting
Authors:
Mi Chen,
Rafael S. de Souza,
Quanfeng Xu,
Shiyin Shen,
Ana L. Chies-Santos,
Renhao Ye,
Marco A. Canossa-Gosteinski,
Yanping Cong
Abstract:
We introduce galmoss, a python-based, torch-powered tool for two-dimensional fitting of galaxy profiles. By seamlessly enabling GPU parallelization, galmoss meets the high computational demands of large-scale galaxy surveys, placing galaxy profile fitting in the LSST-era. It incorporates widely used profiles such as the Sérsic, Exponential disk, Ferrer, King, Gaussian, and Moffat profiles, and allows for the easy integration of more complex models. Tested on 8,289 galaxies from the Sloan Digital Sky Survey (SDSS) g-band with a single NVIDIA A100 GPU, galmoss completed classical Sérsic profile fitting in about 10 minutes. Benchmark tests show that galmoss achieves computational speeds that are 6 $\times$ faster than those of default implementations.
Submitted 11 April, 2024;
originally announced April 2024.
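For readers unfamiliar with the Sérsic profile named in the abstract above, the following is a minimal sketch of the one-dimensional radial form, $I(r) = I_e \exp\{-b_n[(r/r_e)^{1/n} - 1]\}$. It is not the galmoss API: the function name `sersic_intensity` and its parameters are hypothetical, and $b_n \approx 2n - 1/3$ is a standard approximation assumed here, reasonable for $n \gtrsim 0.5$.

```python
import math


def sersic_intensity(r, i_e, r_e, n):
    """Sérsic surface brightness at radius r.

    i_e : intensity at the effective radius r_e
    n   : Sérsic index (n = 1 is an exponential disk, n = 4 is de Vaucouleurs)
    Uses the approximation b_n ~ 2n - 1/3 (an assumption of this sketch).
    """
    b_n = 2.0 * n - 1.0 / 3.0
    return i_e * math.exp(-b_n * ((r / r_e) ** (1.0 / n) - 1.0))


# By construction the profile returns i_e at r = r_e.
at_effective_radius = sersic_intensity(2.0, 5.0, 2.0, 4.0)
```

A two-dimensional fit of the kind the abstract describes would evaluate such a profile on a pixel grid with an elliptical radius and compare it to the image. That step is omitted here.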
-
NSCs from groups to clusters: A catalogue of dwarf galaxies in the Shapley Supercluster and the role of environment in galaxy nucleation
Authors:
Emilio J. B. Zanatta,
Ruben Sanchéz-Janssen,
Rafael S. de Souza,
Ana L. Chies-Santos,
John P. Blakeslee
Abstract:
Nuclear star clusters (NSCs) are dense star clusters located at the centre of galaxies spanning a wide range of masses and morphologies. Analysing NSC occupation statistics in different environments provides an invaluable window into investigating early conditions of high-density star formation and mass assembly in clusters and group galaxies. We use HST/ACS deep imaging to obtain a catalogue of dwarf galaxies in two galaxy clusters in the Shapley Supercluster: the central cluster Abell 3558 and the northern Abell 1736a. The Shapley region is an ideal laboratory to study nucleation as it stands as the highest mass concentration in the nearby Universe. We investigate the NSC occurrence in quiescent dwarf galaxies as faint as $M_{I} = -10$ mag and compare it with all other environments where nucleation data is available. We use galaxy cluster/group halo mass as a proxy for the environment and employ a Bayesian logistic regression framework to model the nucleation fraction ($f_{n}$) as a function of galaxy luminosity and environment. We find a notably high $f_n$ in Abell 3558: at $M_{I} \approx -13.1$ mag, half the galaxies in the cluster host NSCs. This is higher than in the Virgo and Fornax clusters but comparable to the Coma Cluster. On the other hand, the $f_n$ in Abell 1736a is relatively lower, comparable to groups in the Local Volume. We find that the probability of nucleation varies with galaxy luminosity remarkably similarly in galaxy clusters. These results reinforce previous findings of the important role of the environment in NSC formation/growth.
Submitted 21 March, 2024;
originally announced March 2024.
-
The 2022-2023 accretion outburst of the young star V1741 Sgr
Authors:
Michael A. Kuhn,
Lynne A. Hillenbrand,
Michael S. Connelley,
R. Michael Rich,
Bart Staels,
Adolfo S. Carvalho,
Philip W. Lucas,
Christoffer Fremling,
Viraj R. Karambelkar,
Ellen Lee,
Tomás Ahumada,
Emille E. O. Ishida,
Kishalay De,
Rafael S. de Souza,
Mansi Kasliwal
Abstract:
V1741 Sgr (= SPICY 71482/Gaia22dtk) is a Classical T Tauri star on the outskirts of the Lagoon Nebula. After at least a decade of stability, in mid-2022, the optical source brightened by ~3 mag over two months, remained bright until early 2023, then dimmed erratically over the next four months. This event was monitored with optical and infrared spectroscopy and photometry. Spectra from the peak (October 2022) indicate an EX Lup-type (EXor) accretion outburst, with strong emission from H I, He I, and Ca II lines and CO bands. At this stage, spectroscopic absorption features indicated a temperature of T ~ 4750 K with low-gravity lines (e.g., Ba II and Sr II). By April 2023, with the outburst beginning to dim, strong TiO absorption appeared, indicating a cooler T ~ 3600 K temperature. However, once the source had returned to its pre-outburst flux in August 2023, the TiO absorption and the CO emission disappeared. When the star went into outburst, the source's spectral energy distribution became flatter, leading to bluer colours at wavelengths shorter than ~1.6 microns and redder colours at longer wavelengths. The brightening requires a continuum emitting area larger than the stellar surface, likely from optically thick circumstellar gas with cooler surface layers producing the absorption features. Additional contributions to the outburst spectrum may include blue excess from hotspots on the stellar surface, emission lines from diffuse gas, and reprocessed emission from the dust disc. Cooling of the circumstellar gas would explain the appearance of TiO, which subsequently disappeared once this gas had faded and the stellar spectrum reemerged.
Submitted 17 January, 2024;
originally announced January 2024.
-
Spatially resolved self-consistent spectral modelling of jellyfish galaxies from MUSE with FADO: trends with mass and stripping intensity
Authors:
Gabriel M. Azevedo,
Ana L. Chies-Santos,
Rogério Riffel,
Jean M. Gomes,
Augusto E. Lassen,
João P. V. Benedetti,
Rafael S. de Souza,
Quanfeng Xu
Abstract:
We present a spatially resolved stellar population analysis of 61 jellyfish galaxies and 47 control galaxies observed with ESO/MUSE attempting to understand the general trends of the stellar populations as a function of the stripping intensity and mass. This is the public sample from the GASP programme, with $0.01 < z < 0.15$ and $8.9 <\log(M_{\star}/M_{\odot}) < 12.0$. We apply the spectral population synthesis code FADO to fit self-consistently both the stellar and nebular contributions to the spectra of the sources. We present 2D morphological maps for mean stellar ages, metallicities, gas-phase oxygen abundances, and star formation rates for the galaxies with Integrated Nested Laplace Approximation (INLA), which is efficient in reconstructing spatial data of extended sources. We find that "extreme stripping" and "stripping" galaxies are typically younger than the other types. Regarding stellar and nebular metallicities, the "stripping" and "control passive" galaxies are the most metal-poor. Based on the phase space for jellyfish cluster members we find trends in ages, metallicities, and abundances with different regions of the diagram. We also compute radial profiles for the same quantities. We find that both the stripping and the stellar masses seem to influence the profiles, and we see differences between various groups and distinct mass bins. The radial profiles for different mass bins present relations already shown in the literature for undisturbed galaxies, i.e., profiles of ages and metallicities tend to increase with mass. However, beyond $\sim0.75$ effective radius, the ages of the most massive galaxies become similar to or lower than the ages of the lower mass ones.
Submitted 16 June, 2023; v1 submitted 31 May, 2023;
originally announced June 2023.
-
Are classification metrics good proxies for SN Ia cosmological constraining power?
Authors:
Alex I. Malz,
Mi Dai,
Kara A. Ponder,
Emille E. O. Ishida,
Santiago Gonzalez-Gaitain,
Rupesh Durgesh,
Alberto Krone-Martins,
Rafael S. de Souza,
Noble Kennamer,
Sreevarsha Sreejith,
Lluis Galbany,
The LSST Dark Energy Science Collaboration,
The Cosmostatistics Initiative
Abstract:
Context: When selecting a classifier to use for a supernova Ia (SN Ia) cosmological analysis, it is common to make decisions based on metrics of classification performance, i.e. contamination within the photometrically classified SN Ia sample, rather than a measure of cosmological constraining power. If the former is an appropriate proxy for the latter, this practice would save those designing an analysis pipeline from the computational expense of a full cosmology forecast. Aims: This study tests the assumption that classification metrics are an appropriate proxy for cosmology metrics. Methods: We emulate photometric SN Ia cosmology samples with controlled contamination rates of individual contaminant classes and evaluate each of them under a set of classification metrics. We then derive cosmological parameter constraints from all samples under two common analysis approaches and quantify the impact of contamination by each contaminant class on the resulting cosmological parameter estimates. Results: We observe that cosmology metrics are sensitive to both the contamination rate and the class of the contaminating population, whereas the classification metrics are insensitive to the latter. Conclusions: We therefore discourage exclusive reliance on classification-based metrics for cosmological analysis design decisions, e.g. classifier choice, and instead recommend optimizing using a metric of cosmological parameter constraining power.
Submitted 23 May, 2023;
originally announced May 2023.
-
Repeating Outbursts from the Young Stellar Object Gaia23bab (= SPICY 97589)
Authors:
Michael A. Kuhn,
Robert A. Benjamin,
Emille E. O. Ishida,
Rafael S. de Souza,
Julien Peloton,
Michele Delli Veneri
Abstract:
The light curve of Gaia23bab (= SPICY 97589) shows two significant ($ΔG>2$ mag) brightening events, one in 2017 and an ongoing event starting in 2022. The source's quiescent spectral energy distribution indicates an embedded ($A_V>5$ mag) pre-main-sequence star, with optical accretion emission and mid-infrared disk emission. This characterization is supported by the source's membership in an embedded cluster in the star-forming cloud DOBASHI 1604 at a distance of $900\pm45$~pc. Thus, the brightening events are probable accretion outbursts, likely of EX Lup-type.
Submitted 16 March, 2023;
originally announced March 2023.
-
From Images to Features: Unbiased Morphology Classification via Variational Auto-Encoders and Domain Adaptation
Authors:
Quanfeng Xu,
Shiyin Shen,
Rafael S. de Souza,
Mi Chen,
Renhao Ye,
Yumei She,
Zhu Chen,
Emille E. O. Ishida,
Alberto Krone-Martins,
Rupesh Durgesh
Abstract:
We present a novel approach for the dimensionality reduction of galaxy images by leveraging a combination of variational auto-encoders (VAE) and domain adaptation (DA). We demonstrate the effectiveness of this approach using a sample of low redshift galaxies with detailed morphological type labels from the Galaxy-Zoo DECaLS project. We show that 40-dimensional latent variables can effectively reproduce most morphological features in galaxy images. To further validate the effectiveness of our approach, we utilised a classical random forest (RF) classifier on the 40-dimensional latent variables to make detailed morphology feature classifications. This approach performs similarly to a direct neural network application on galaxy images. We further enhance our model by tuning the VAE network via DA using galaxies in the overlapping footprint of DECaLS and BASS+MzLS, enabling the unbiased application of our model to galaxy images in both surveys. We observed that DA led to even better morphological feature extraction and classification performance. Overall, this combination of VAE and DA can be applied to achieve image dimensionality reduction, defect image identification, and morphology classification in large optical surveys.
Submitted 13 October, 2023; v1 submitted 15 March, 2023;
originally announced March 2023.
-
Constraining Supernova Physics through Gravitational-Wave Observations
Authors:
Gergely Dálya,
Sibe Bleuzé,
Bence Bécsy,
Rafael S. de Souza,
Tamás Szalai
Abstract:
We examine the potential for using the LIGO-Virgo-KAGRA network of gravitational-wave detectors to provide constraints on the physical properties of core-collapse supernovae through the observation of their gravitational radiation. We use waveforms generated by 14 of the latest 3D hydrodynamic core-collapse supernova simulations, which are added to noise samples based on the predicted sensitivities of the GW detectors during the O5 observing run. Then we use the BayesWave algorithm to model-independently reconstruct the gravitational-wave waveforms, which are used as input for various machine learning algorithms. Our results demonstrate how these algorithms perform in terms of i) indicating the presence of specific features of the progenitor or the explosion, ii) predicting the explosion mechanism, and iii) estimating the mass and angular velocity of the progenitor, as a function of the signal-to-noise ratio of the observed supernova signal. The conclusions of our study highlight the potential for GW observations to complement electromagnetic detections of supernovae by providing unique information about the exact explosion mechanism and the dynamics of the collapse.
Submitted 22 February, 2023;
originally announced February 2023.
-
Enabling the discovery of fast transients: A kilonova science module for the Fink broker
Authors:
B. Biswas,
E. E. O. Ishida,
J. Peloton,
A. Moller,
M. V. Pruzhinskaya,
R. S. de Souza,
D. Muthukrishna
Abstract:
We describe the fast transient classification algorithm in the center of the kilonova (KN) science module currently implemented in the Fink broker and report classification results based on simulated catalogs and real data from the ZTF alert stream. We used noiseless, homogeneously sampled simulations to construct a basis of principal components (PCs). All light curves from a more realistic ZTF simulation were written as a linear combination of this basis. The corresponding coefficients were used as features in training a random forest classifier. The same method was applied to long (>30 days) and medium (<30 days) light curves. The latter aimed to simulate the data situation found within the ZTF alert stream. Classification based on long light curves achieved 73.87% precision and 82.19% recall. Medium baseline analysis resulted in 69.30% precision and 69.74% recall, thus confirming the robustness of precision results when limited to 30 days of observations. In both cases, dwarf flares and point Type Ia supernovae were the most frequent contaminants. The final trained model was integrated into the Fink broker and has been distributing fast transients, tagged as KN_candidates, to the astronomical community, especially through the GRANDMA collaboration. We showed that features specifically designed to grasp different light curve behaviors provide enough information to separate fast (KN-like) from slow (non-KN-like) evolving events. This module represents one crucial link in an intricate chain of infrastructure elements for multi-messenger astronomy which is currently being put in place by the Fink broker team in preparation for the arrival of data from the Vera Rubin Observatory Legacy Survey of Space and Time.
Submitted 5 October, 2023; v1 submitted 31 October, 2022;
originally announced October 2022.
-
Bayesian Estimation of the $S$ Factor and Thermonuclear Reaction Rate for $^{16}$O(p,$γ$)$^{17}$F
Authors:
Christian Iliadis,
Vimal Palanivelrajan,
Rafael S. de Souza
Abstract:
The $^{16}$O(p,$γ$)$^{17}$F reaction is the slowest hydrogen-burning process in the CNO mass region. Its thermonuclear rate sensitively impacts predictions of oxygen isotopic ratios in a number of astrophysical sites, including AGB stars. The reaction has been measured several times at low bombarding energies using a variety of techniques. The most recent evaluated experimental rates have a reported uncertainty of about 7.5\% below $1$~GK. However, the previous rate estimate represents a best guess only and was not based on rigorous statistical methods. We apply a Bayesian model to fit all reliable $^{16}$O(p,$γ$)$^{17}$F cross section data, and take into account independent contributions of statistical and systematic uncertainties. The nuclear reaction model employed is a single-particle potential model involving a Woods-Saxon potential for generating the radial bound state wave function. The model has three physical parameters, the radius and diffuseness of the Woods-Saxon potential, and the asymptotic normalization coefficients (ANCs) of the final bound state in $^{17}$F. We find that performing the Bayesian $S$ factor fit using ANCs as scaling parameters has a distinct advantage over adopting spectroscopic factors instead. Based on these results, we present the first statistically rigorous estimation of experimental $^{16}$O(p,$γ$)$^{17}$F reaction rates, with uncertainties ($\pm 4.2$\%) of about half the previously reported values.
Submitted 25 October, 2022;
originally announced October 2022.
-
SCONCE: A cosmic web finder for spherical and conic geometries
Authors:
Yikun Zhang,
Rafael S. de Souza,
Yen-Chi Chen
Abstract:
The latticework structure known as the cosmic web provides a valuable insight into the assembly history of large-scale structures. Despite the variety of methods to identify the cosmic web structures, they mostly rely on the assumption that galaxies are embedded in a Euclidean geometric space. Here we present a novel cosmic web identifier called SCONCE (Spherical and CONic Cosmic wEb finder) that inherently considers the 2D (RA,DEC) spherical or the 3D (RA,DEC,$z$) conic geometry. The proposed algorithms in SCONCE generalize the well-known subspace constrained mean shift (SCMS) method and primarily address the predominant filament detection problem. They are intrinsic to the spherical/conic geometry and invariant to data rotations. We further test the efficacy of our method with an artificial cross-shaped filament example and apply it to the SDSS galaxy catalogue, revealing that the 2D spherical version of our algorithms is robust even in regions of high declination. Finally, using N-body simulations from Illustris, we show that the 3D conic version of our algorithms is more robust in detecting filaments than the standard SCMS method under the redshift distortions caused by the peculiar velocities of halos. Our cosmic web finder is packaged in python as SCONCE-SCMS and has been made publicly available.
Submitted 14 July, 2022;
originally announced July 2022.
-
A graph-based spectral classification of Type II supernovae
Authors:
Rafael S. de Souza,
Stephen Thorp,
Lluís Galbany,
Emille E. O. Ishida,
Santiago González-Gaitán,
Morgan A. Schmitz,
Alberto Krone-Martins,
Christina Peters
Abstract:
Given the ever-increasing number of time-domain astronomical surveys, employing robust, interpretative, and automated data-driven classification schemes is pivotal. Based on graph theory, we present new data-driven classification heuristics for spectral data. A spectral classification scheme of Type II supernovae (SNe II) is proposed based on the phase relative to the maximum light in the $V$ band and the end of the plateau phase. We utilize a compiled optical data set that comprises 145 SNe and 1595 optical spectra in 4000-9000 $\overset{\circ}{\mathrm {A}}$. Our classification method naturally identifies outliers and arranges the different SNe in terms of their major spectral features. We compare our approach to the off-the-shelf umap manifold learning and show that both strategies are consistent with a continuous variation of spectral types rather than discrete families. The automated classification naturally reflects the fast evolution of Type II SNe around the maximum light while showcasing their homogeneity close to the end of the plateau phase. The scheme we develop could be more widely applicable to unsupervised time series classification or characterisation of other functional data.
Submitted 1 June, 2023; v1 submitted 28 June, 2022;
originally announced June 2022.
-
qrpca: A Package for Fast Principal Component Analysis with GPU Acceleration
Authors:
Rafael S. de Souza,
Xu Quanfeng,
Shiyin Shen,
Chen Peng,
Zihao Mu
Abstract:
We present qrpca, a fast and scalable QR-decomposition principal component analysis package. The software, written in both R and python languages, makes use of torch for internal matrix computations, and enables GPU acceleration, when available. qrpca provides similar functionalities to prcomp (R) and sklearn (python) packages respectively. A benchmark test shows that qrpca can achieve computational speeds 10-20 $\times$ faster for large dimensional matrices than default implementations, and is at least twice as fast for a standard decomposition of spectral data cubes. The qrpca source code is made freely available to the community.
Submitted 6 September, 2022; v1 submitted 14 June, 2022;
originally announced June 2022.
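The abstract above describes a QR-decomposition approach to principal component analysis. One common way to combine the two is sketched below: centre the data, factor it as $X_c = QR$, and take the SVD of the smaller factor $R$, whose singular values and right singular vectors match those of $X_c$. This sketch uses NumPy rather than torch, and the function name `qr_pca` and its return values are hypothetical. Whether qrpca follows exactly these steps is an assumption, not something the abstract confirms.

```python
import numpy as np


def qr_pca(X, n_components=None):
    """PCA via QR decomposition followed by an SVD of the R factor.

    X is an (n_samples, n_features) array. Returns the principal axes
    (rows), the explained variances, and the projected scores.
    """
    X = np.asarray(X, dtype=float)
    n_samples = X.shape[0]
    Xc = X - X.mean(axis=0)                   # centre each feature
    _, R = np.linalg.qr(Xc)                   # reduced QR: Xc = Q @ R
    _, s, Vt = np.linalg.svd(R, full_matrices=False)
    components = Vt[:n_components]            # principal axes
    explained_variance = (s ** 2 / (n_samples - 1))[:n_components]
    scores = Xc @ components.T                # data in the PC basis
    return components, explained_variance, scores


# Small synthetic check against the covariance eigenvalues.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4))
components, variances, scores = qr_pca(X_demo)
```

Because $Q$ has orthonormal columns, the SVD is performed on a matrix with at most `min(n_samples, n_features)` rows, which is where the saving comes from for tall data matrices.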
-
galmask: A Python package for unsupervised galaxy masking
Authors:
Yash Gondhalekar,
Rafael S. de Souza,
Ana L. Chies-Santos
Abstract:
Galaxy morphological classification is a fundamental aspect of galaxy formation and evolution studies. Various machine learning tools have been developed for automated pipeline analysis of large-scale surveys, enabling a fast search for objects of interest. However, crowded regions in the image may pose a challenge as they can lead to bias in the learning algorithm. In this Research Note, we present galmask, an open-source package for unsupervised galaxy masking to isolate the central object of interest in the image. galmask is written in Python and can be installed from PyPI via the pip command.
Submitted 11 June, 2022;
originally announced June 2022.
-
Spectroscopic Confirmation of a Population of Isolated, Intermediate-Mass YSOs
Authors:
Michael A. Kuhn,
Ramzi Saber,
Matthew S. Povich,
Rafael S. de Souza,
Alberto Krone-Martins,
Emille E. O. Ishida,
Catherine Zucker,
Robert A. Benjamin,
Lynne A. Hillenbrand,
Alfred Castro-Ginard,
Xingyu Zhou
Abstract:
Wide-field searches for young stellar objects (YSOs) can place useful constraints on the prevalence of clustered versus distributed star formation. The Spitzer/IRAC Candidate YSO (SPICY) catalog is one of the largest compilations of such objects (~120,000 candidates in the Galactic midplane). Many SPICY candidates are spatially clustered, but, perhaps surprisingly, approximately half the candidates appear spatially distributed. To better characterize this unexpected population and confirm its nature, we obtained Palomar/DBSP spectroscopy for 26 of the optically-bright (G<15 mag) "isolated" YSO candidates. We confirm the YSO classifications of all 26 sources based on their positions on the Hertzsprung-Russell diagram, H and Ca II line-emission from over half the sample, and robust detection of infrared excesses. This implies a contamination rate of <10% for SPICY stars that meet our optical selection criteria. Spectral types range from B4 to K3, with A-type stars most common. Spectral energy distributions, diffuse interstellar bands, and Galactic extinction maps indicate moderate to high extinction. Stellar masses range from ~1 to 7 $M_\odot$, and the estimated accretion rates, ranging from $3\times10^{-8}$ to $3\times10^{-7}$ $M_\odot$ yr$^{-1}$, are typical for YSOs in this mass range. The 3D spatial distribution of these stars, based on Gaia astrometry, reveals that the "isolated" YSOs are not evenly distributed in the Solar neighborhood but are concentrated in kpc-scale dusty Galactic structures that also contain the majority of the SPICY YSO clusters. Thus, the processes that produce large Galactic star-forming structures may yield nearly as many distributed as clustered YSOs.
Submitted 19 September, 2022; v1 submitted 8 June, 2022;
originally announced June 2022.
-
How have astronomers cited other fields in the last decade?
Authors:
Michele Delli Veneri,
Rafael S. de Souza,
Alberto Krone-Martins,
Emille E. O. Ishida,
Maria Luiza L. Dantas,
Noble Kennamer
Abstract:
We present a citation pattern analysis between astronomical papers and 13 other disciplines, based on the arXiv database over the past decade ($2010 - 2020$). We analyze 12,600 astronomical papers citing over 14,531 unique publications outside astronomy. Two striking patterns emerge. First, general relativity recently became the most cited field by astronomers, a trend highly correlated with the discovery of gravitational waves. Second, the number of referenced papers in computer science and statistics has grown rapidly, the former with a notable 15-fold increase since 2015. Such findings confirm the critical role of interdisciplinary efforts involving astronomy, statistics, and computer science in recent astronomical research.
Submitted 26 May, 2022;
originally announced May 2022.
-
yonder: A python package for data denoising and reconstruction
Authors:
Peng Chen,
Rafael S. de Souza
Abstract:
We present a standalone implementation of a data-deconvolution method based on singular value decomposition. The tool is written in python and packaged in the open-source yonder package. yonder receives as input two matrices, one for the data and another for the errors, and outputs a denoised version of the original dataset. In this Research Note, we briefly describe the methodology and show a demonstration of yonder on a simulated dataset.
Submitted 9 March, 2022;
originally announced March 2022.
-
J-PLUS: A catalogue of globular cluster candidates around the M81/M82/NGC3077 triplet of galaxies
Authors:
Ana L. Chies-Santos,
Rafael S. de Souza,
Juan P. Caso,
Ana I. Ennis,
Camila P. E. de Souza,
Renan S. Barbosa,
Peng Chen,
A. Javier Cenarro,
Alessandro Ederoclite,
David Cristóbal-Hornillos,
Carlos Hernández-Monteagudo,
Carlos López-Sanjuan,
Antonio Marín-Franch,
Mariano Moles,
Jesús Varela,
Héctor Vázquez Ramió,
Renato Dupke,
Laerte Sodré Jr.,
Raul E. Angulo
Abstract:
Globular clusters (GCs) are proxies of the formation assemblies of their host galaxies. However, few studies exist targeting GC systems of spiral galaxies up to several effective radii. Through 12-band Javalambre Photometric Local Universe Survey (J-PLUS) imaging, we study the point sources around the M81/M82/NGC3077 triplet in search of new GC candidates. We develop a tailored classification scheme to search for GC candidates based on their similarity to known GCs via a principal components analysis (PCA) projection. Our method accounts for missing data and photometric errors. We report 642 new GC candidates in a region of 3.5 deg$^2$ around the triplet, ranked according to their Gaia astrometric proper motions when available. We find tantalising evidence for an overdensity of GC candidate sources forming a bridge connecting M81 and M82. Finally, the spatial distribution of the GC candidates $(g-i)$ colours is consistent with halo/intra-cluster GCs, i.e. it gets bluer as they get further from the closest galaxy in the field. We further employ a regression-tree based model to estimate the metallicity distribution of the GC candidates based on their J-PLUS bands. The metallicity distribution of the sample candidates is broad and displays a bump towards the metal-rich end. Our list increases the population of GC candidates around the triplet by 3-fold, stresses the usefulness of multi-band surveys in finding these objects, and provides a testbed for further studies analysing their spatial distribution around nearby (spirals) galaxies.
Submitted 13 July, 2022; v1 submitted 23 February, 2022;
originally announced February 2022.
-
GLADE+: An Extended Galaxy Catalogue for Multimessenger Searches with Advanced Gravitational-wave Detectors
Authors:
G. Dálya,
R. Díaz,
F. R. Bouchet,
Z. Frei,
J. Jasche,
G. Lavaux,
R. Macas,
S. Mukherjee,
M. Pálfi,
R. S. de Souza,
B. D. Wandelt,
M. Bilicki,
P. Raffai
Abstract:
We present GLADE+, an extended version of the GLADE galaxy catalogue introduced in our previous paper for multimessenger searches with advanced gravitational-wave detectors. GLADE+ combines data from six separate but not independent astronomical catalogues: the GWGC, 2MPZ, 2MASS XSC, HyperLEDA, and WISExSCOSPZ galaxy catalogues, and the SDSS-DR16Q quasar catalogue. To allow corrections of CMB-frame redshifts for peculiar motions, we calculated peculiar velocities along with their standard deviations of all galaxies having $B$-band magnitude data within redshift $z=0.05$ using the "Bayesian Origin Reconstruction from Galaxies" formalism. GLADE+ is complete up to luminosity distance $d_L=47^{+4}_{-2}$ Mpc in terms of the total expected $B$-band luminosity of galaxies, and contains all of the brightest galaxies giving 90\% of the total $B$-band and $K$-band luminosity up to $d_L\simeq 130$ Mpc. We include estimations of stellar masses and individual binary neutron star merger rates for galaxies with $W1$ magnitudes. These parameters can help in ranking galaxies in a given gravitational wave localization volume in terms of their likelihood of being hosts, thereby possibly reducing the number of pointings and total integration time needed to find the electromagnetic counterpart.
Submitted 2 June, 2022; v1 submitted 12 October, 2021;
originally announced October 2021.
-
Bayesian Estimation of the D(p,$γ$)$^3$He Thermonuclear Reaction Rate
Authors:
Joseph Moscoso,
Rafael S. de Souza,
Alain Coc,
Christian Iliadis
Abstract:
Big bang nucleosynthesis (BBN) is the standard model theory for the production of the light nuclides during the early stages of the universe, taking place for a period of about 20 minutes after the big bang. Deuterium production, in particular, is highly sensitive to the primordial baryon density and the number of neutrino species, and its abundance serves as a sensitive test for the conditions in the early universe. The comparison of observed deuterium abundances with predicted ones requires reliable knowledge of the relevant thermonuclear reaction rates, and their corresponding uncertainties. Recent observations reported the primordial deuterium abundance with percent accuracy, but some theoretical predictions based on BBN are in tension with the measured values because of uncertainties in the cross section of the deuterium-burning reactions. In this work, we analyze the S-factor of the D(p,$γ$)$^3$He reaction using a hierarchical Bayesian model. We take into account the results of eleven experiments, spanning the period of 1955--2021; more than any other study. We also present results for two different fitting functions, a two-parameter function based on microscopic nuclear theory and a four-parameter polynomial. Our recommended reaction rates have a 2.2\% uncertainty at $0.8$~GK, which is the temperature most important for deuterium BBN. Differences between our rates and previous results are discussed.
Submitted 31 August, 2021;
originally announced September 2021.
-
Probabilistic modeling of asteroid diameters from Gaia DR2 errors
Authors:
Rafael S. de Souza,
Alberto Krone-Martins,
Valerio Carruba,
Rita de Cassia Domingos,
Emille E. O. Ishida,
Safwan Alijbaae,
Mariela Huaman Espinoza,
William Barletta
Abstract:
The Gaia Data Release 2 provides precise astrometry for nearly 1.5 billion sources across the entire sky, including several thousand asteroids. In this work, we provide evidence that reasonably large asteroids (diameter $>$ 20 km) have high correlations with Gaia relative flux uncertainties and systematic right ascension errors. We further capture these correlations using a logistic Bayesian additive regression tree model. We compile a small list of probable large asteroids that can be targeted for direct diameter measurements and shape reconstruction.
Submitted 26 August, 2021;
originally announced August 2021.
-
A high pitch angle structure in the Sagittarius Arm
Authors:
M. A. Kuhn,
R. A. Benjamin,
C. Zucker,
A. Krone-Martins,
R. S. de Souza,
A. Castro-Ginard,
E. E. O. Ishida,
M. S. Povich,
L. A. Hillenbrand
Abstract:
Context: In spiral galaxies, star formation tends to trace features of the spiral pattern, including arms, spurs, feathers, and branches. However, in our own Milky Way, it has been challenging to connect individual star-forming regions to their larger Galactic environment owing to our perspective from within the disk. One feature in nearly all modern models of the Milky Way is the Sagittarius Arm, located inward of the Sun with a pitch angle of ~12 deg. Aims: We map the 3D locations and velocities of star-forming regions in a segment of the Sagittarius Arm using young stellar objects (YSOs) from the Spitzer/IRAC Candidate YSO (SPICY) catalog to compare their distribution to models of the arm. Methods: Distances and velocities for these objects are derived from Gaia EDR3 astrometry and molecular line surveys. We infer parallaxes and proper motions for spatially clustered groups of YSOs and estimate their radial velocities from the velocities of spatially associated molecular clouds. Results: We identify 25 star-forming regions in the Galactic longitude range l~4.0-18.5 deg arranged in a narrow, ~1 kpc long linear structure with a high pitch angle of $ψ= 56$ deg and a high aspect ratio of ~7:1. This structure includes massive star-forming regions such as M8, M16, M17, and M20. The motions in the structure are remarkably coherent, with velocities in the direction of Galactic rotation of $240\pm3$ km/s (slightly higher than average) and slight drifts toward the Galactic center (-4.3 km/s) and in the negative Z direction (-2.9 km/s). The rotational shear experienced by the structure is 4.6 km/s/kpc. Conclusions: The observed 56 deg pitch angle is remarkably high for a segment of the Sagittarius Arm. We discuss possible interpretations of this feature as a substructure within the lower pitch angle Sagittarius Arm, as a spur, or as an isolated structure.
Submitted 12 July, 2021;
originally announced July 2021.
-
A high occurrence of nuclear star clusters in faint Coma galaxies, and the roles of mass and environment
Authors:
Emilio J. B. Zanatta,
Rubén Sánchez-Janssen,
Ana L. Chies-Santos,
Rafael S. de Souza,
John P. Blakeslee
Abstract:
We use deep high resolution \textit{HST/ACS} imaging of two fields in the core of the Coma cluster to investigate the occurrence of nuclear star clusters (NSCs) in quiescent dwarf galaxies as faint as $M_{I} = -10$ mag. We employ a hierarchical Bayesian logistic regression framework to model the faint end of the nucleation fraction ($f_{n}$) as a function of both galaxy luminosity and environment. We find that $f_n$ is remarkably high in Coma: at $M_{I} \approx -13$ mag half of the cluster dwarfs still host prominent NSCs. Comparison with dwarf systems in nearby clusters and groups shows that, within the uncertainties, the rate at which the probability of nucleation varies with galaxy luminosity is nearly universal. On the other hand, the fraction of nucleated galaxies at fixed luminosity does exhibit an environmental dependence. More massive environments feature higher nucleation fractions and fainter values of the half-nucleation luminosity, which roughly scales with host halo virial mass as $L_{I,f_{n50}} \propto \mathcal{M}_{200}^{-0.2}$. Our results reinforce the role of galaxy luminosity/mass as a major driver of the efficiency of NSC formation and also indicate a clear secondary dependence on the environment, hence paving the way to more refined theoretical models.
Submitted 11 August, 2021; v1 submitted 2 March, 2021;
originally announced March 2021.
-
Women in academia: a warning on selection bias in gender studies from the astronomical perspective
Authors:
M. L. L. Dantas,
E. Cameron,
Rafael S. de Souza,
A. R. da Silva,
A. L. Chies-Santos,
C. Heneka,
P. R. T. Coelho,
A. Ederoclite,
I. S. Beloto,
V. Branco,
Morgan S. Camargo,
V. M. Carvalho de Oliveira,
C. de Sá-Freitas,
G. Gonçalves,
T. A. Pacheco,
Isabel Rebollido
Abstract:
The recent paper by AlShebli et al. (2020) investigates the impact of mentorship on young scientists. Among their conclusions, they state that female protégés benefit more from male than female mentorship. We herein expose a critical flaw in their methodological design that is a common issue in Astronomy, namely "selection biases", an effect that, if not treated properly, may lead to unwarranted causality claims. In their analysis, selection biases seem to be present in the response rate of their survey (8.35%), the choice of database, the success criterion, and the overlooking of the numerous drawbacks female researchers face in academia. We discuss these issues and their implications -- one of them being the potential increase in obstacles for women in academia. Finally, we reinforce the dangers of not considering selection bias effects in studies aimed at retrieving causal relations.
Submitted 4 December, 2020;
originally announced December 2020.
-
SPICY: The Spitzer/IRAC Candidate YSO Catalog for the Inner Galactic Midplane
Authors:
Michael A. Kuhn,
Rafael S. de Souza,
Alberto Krone-Martins,
Alfred Castro-Ginard,
Emille E. O. Ishida,
Matthew S. Povich,
Lynne A. Hillenbrand
Abstract:
We present ~120,000 Spitzer/IRAC candidate young stellar objects (YSOs) based on surveys of the Galactic midplane between l~255 deg and 110 deg, including the GLIMPSE I, II, and 3D, Vela-Carina, Cygnus X, and SMOG surveys (613 square degrees), augmented by near-infrared catalogs. We employed a classification scheme that uses the flexibility of a tailored statistical learning method and curated YSO datasets to take full advantage of IRAC's spatial resolution and sensitivity in the mid-infrared ~3-9 micron range. Multi-wavelength color/magnitude distributions provide intuition about how the classifier separates YSOs from other red IRAC sources and validate that the sample is consistent with expectations for disk/envelope-bearing pre-main-sequence stars. We also identify areas of IRAC color space associated with objects with strong silicate absorption or polycyclic aromatic hydrocarbon emission. Spatial distributions and variability properties help corroborate the youthful nature of our sample. Most of the candidates are in regions with mid-IR nebulosity, associated with star-forming clouds, but others appear distributed in the field. Using Gaia DR2 distance estimates, we find groups of YSO candidates associated with the Local Arm, the Sagittarius-Carina Arm, and the Scutum-Centaurus Arm. Candidate YSOs visible to the Zwicky Transient Facility tend to exhibit higher variability amplitudes than randomly selected field stars of the same magnitude, with many high-amplitude variables having light-curve morphologies characteristic of YSOs. Given that no current or planned instruments will significantly exceed IRAC's spatial resolution while possessing its wide-area mapping capabilities, Spitzer-based catalogs such as ours will remain the main resources for mid-infrared YSOs in the Galactic midplane for the near future.
Submitted 12 July, 2021; v1 submitted 25 November, 2020;
originally announced November 2020.
-
Active learning with RESSPECT: Resource allocation for extragalactic astronomical transients
Authors:
Noble Kennamer,
Emille E. O. Ishida,
Santiago Gonzalez-Gaitan,
Rafael S. de Souza,
Alexander Ihler,
Kara Ponder,
Ricardo Vilalta,
Anais Moller,
David O. Jones,
Mi Dai,
Alberto Krone-Martins,
Bruno Quint,
Sreevarsha Sreejith,
Alex I. Malz,
Lluis Galbany
Abstract:
The recent increase in volume and complexity of available astronomical data has led to a wide use of supervised machine learning techniques. Active learning strategies have been proposed as an alternative to optimize the distribution of scarce labeling resources. However, due to the specific conditions in which labels can be acquired, fundamental assumptions, such as sample representativeness and labeling cost stability cannot be fulfilled. The Recommendation System for Spectroscopic follow-up (RESSPECT) project aims to enable the construction of optimized training samples for the Rubin Observatory Legacy Survey of Space and Time (LSST), taking into account a realistic description of the astronomical data environment. In this work, we test the robustness of active learning techniques in a realistic simulated astronomical data scenario. Our experiment takes into account the evolution of training and pool samples, different costs per object, and two different sources of budget. Results show that traditional active learning strategies significantly outperform random sampling. Nevertheless, more complex batch strategies are not able to significantly overcome simple uncertainty sampling techniques. Our findings illustrate three important points: 1) active learning strategies are a powerful tool to optimize the label-acquisition task in astronomy, 2) for upcoming large surveys like LSST, such techniques allow us to tailor the construction of the training sample for the first day of the survey, and 3) the peculiar data environment related to the detection of astronomical transients is a fertile ground that calls for the development of tailored machine learning algorithms.
Submitted 26 October, 2020; v1 submitted 12 October, 2020;
originally announced October 2020.
-
Launching the VASCO citizen science project
Authors:
Beatriz Villarroel,
Kristiaan Pelckmans,
Enrique Solano,
Mikael Laaksoharju,
Abel Souza,
Onyeuwaoma Nnaemeka Dom,
Khaoula Laggoune,
Jamal Mimouni,
Hichem Guergouri,
Lars Mattsson,
Aurora Lago García,
Johan Soodla,
Diego Castillo,
Matthew E. Shultz,
Rubby Aworka,
Sébastien Comerón,
Stefan Geier,
Geoffrey Marcy,
Alok C. Gupta,
Josefine Bergstedt,
Rudolf E. Bär,
Bart Buelens,
Emilio Enriquez,
Christopher K. Mellon,
M. Almudena Prieto
, et al. (3 additional authors not shown)
Abstract:
The Vanishing & Appearing Sources during a Century of Observations (VASCO) project investigates astronomical surveys spanning a time interval of 70 years, searching for unusual and exotic transients. We present herein the VASCO Citizen Science Project, which can identify unusual candidates driven by three different approaches: hypothesis, exploratory, and machine learning, which is particularly useful for SETI searches. To address the big data challenge, VASCO combines three methods: the Virtual Observatory, user-aided machine learning, and visual inspection through citizen science. Here we demonstrate the citizen science project and its improved candidate selection process, and we give a progress report. We also present the VASCO citizen science network led by amateur astronomy associations mainly located in Algeria, Cameroon, and Nigeria. At the moment of writing, the citizen science project has carefully examined 15,593 candidate image pairs in the data (ca. 10% of the candidates), and has so far identified 798 objects classified as "vanished". The most interesting candidates will be followed up with optical and infrared imaging, together with the observations by the most potent radio telescopes.
Submitted 26 December, 2022; v1 submitted 22 September, 2020;
originally announced September 2020.
-
Periodic Astrometric Signal Recovery through Convolutional Autoencoders
Authors:
Michele Delli Veneri,
Louis Desdoigts,
Morgan A. Schmitz,
Alberto Krone-Martins,
Emille E. O. Ishida,
Peter Tuthill,
Rafael S. de Souza,
Richard Scalzo,
Massimo Brescia,
Giuseppe Longo,
Antonio Picariello
Abstract:
Astrometric detection involves a precise measurement of stellar positions, and is widely regarded as the leading concept presently ready to find earth-mass planets in temperate orbits around nearby sun-like stars. The TOLIMAN space telescope[39] is a low-cost, agile mission concept dedicated to narrow-angle astrometric monitoring of bright binary stars. In particular the mission will be optimised to search for habitable-zone planets around Alpha Centauri AB. If the separation between these two stars can be monitored with sufficient precision, tiny perturbations due to the gravitational tug from an unseen planet can be witnessed and, given the configuration of the optical system, the scale of the shifts in the image plane is about one millionth of a pixel. Image registration at this level of precision has never been demonstrated (to our knowledge) in any setting within science. In this paper we demonstrate that a Deep Convolutional Auto-Encoder is able to retrieve such a signal from simplified simulations of the TOLIMAN data and we present the full experimental pipeline to recreate our experiments from the simulations to the signal analysis. In future works, all the more realistic sources of noise and systematic effects present in the real-world system will be injected into the simulations.
Submitted 24 June, 2020;
originally announced June 2020.
-
21st Century Statistical and Computational Challenges in Astrophysics
Authors:
Eric D. Feigelson,
Rafael S. de Souza,
Emille E. O. Ishida,
Gutti Jogesh Babu
Abstract:
Modern astronomy has been rapidly increasing our ability to see deeper into the universe, acquiring enormous samples of cosmic populations. Gaining astrophysical insights from these datasets requires a wide range of sophisticated statistical and machine learning methods. Long-standing problems in cosmology include characterization of galaxy clustering and estimation of galaxy distances from photometric colors. Bayesian inference, central to linking astronomical data to nonlinear astrophysical models, addresses problems in solar physics, properties of star clusters, and exoplanet systems. Likelihood-free methods are growing in importance. Detection of faint signals in complicated noise is needed to find periodic behaviors in stars and detect explosive gravitational wave events. Open issues concern treatment of heteroscedastic measurement errors and understanding probability distributions characterizing astrophysical systems. The field of astrostatistics needs increased collaboration with statisticians in the design and analysis stages of research projects, and to jointly develop new statistical methodologies. Together, they will draw more astrophysical insights into astronomical populations and the cosmos itself.
Submitted 26 May, 2020;
originally announced May 2020.
-
Ridges in the Dark Energy Survey for cosmic trough identification
Authors:
Ben Moews,
Morgan A. Schmitz,
Andrew J. Lawler,
Joe Zuntz,
Alex I. Malz,
Rafael S. de Souza,
Ricardo Vilalta,
Alberto Krone-Martins,
Emille E. O. Ishida
Abstract:
Cosmic voids and their corresponding redshift-projected mass densities, known as troughs, play an important role in our attempt to model the large-scale structure of the Universe. Understanding these structures enables us to compare the standard model with alternative cosmologies, constrain the dark energy equation of state, and distinguish between different gravitational theories. In this paper, we extend the subspace-constrained mean shift algorithm, a recently introduced method to estimate density ridges, and apply it to 2D weak lensing mass density maps from the Dark Energy Survey Y1 data release to identify curvilinear filamentary structures. We compare the obtained ridges with previous approaches to extract trough structure in the same data, and apply curvelets as an alternative wavelet-based method to constrain densities. We then invoke the Wasserstein distance between noisy and noiseless simulations to validate the denoising capabilities of our method. Our results demonstrate the viability of ridge estimation as a precursor for denoising weak lensing observables to recover the large-scale structure, paving the way for a more versatile and effective search for troughs.
Submitted 14 November, 2022; v1 submitted 18 May, 2020;
originally announced May 2020.
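The abstract above mentions using the Wasserstein distance between noisy and noiseless simulations as a validation measure. As a minimal sketch of that idea only, the snippet below computes the 1-D first-order Wasserstein distance between two equal-size samples; the function name `wasserstein_1d` and the toy values are hypothetical, and the paper's actual computation on 2D mass maps may differ.

```python
def wasserstein_1d(a, b):
    """First-order Wasserstein distance between two 1-D empirical samples.

    Assumes both samples have the same size, in which case the optimal
    transport plan simply pairs the sorted values.
    """
    if len(a) != len(b) or not a:
        raise ValueError("samples must be non-empty and of equal size")
    paired = zip(sorted(a), sorted(b))
    return sum(abs(x - y) for x, y in paired) / len(a)


# Toy stand-ins for pixel values of a noiseless map and a shifted copy.
noiseless = [0.0, 1.0, 2.0, 3.0]
shifted = [x + 0.5 for x in noiseless]

print(wasserstein_1d(noiseless, noiseless))
print(wasserstein_1d(noiseless, shifted))
```

A smaller distance between a denoised map and the noiseless reference would indicate better recovery under this measure.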
-
Hierarchical Bayesian Thermonuclear Rate for the $^7$Be(n,p)$^7$Li Big Bang Nucleosynthesis Reaction
Authors:
Rafael S. de Souza,
Tan Hong Kiat,
Alain Coc,
Christian Iliadis
Abstract:
Big bang nucleosynthesis provides the earliest probe of standard model physics, at a time when the universe was less than a thousand seconds old. It determines the abundances of the lightest nuclides, which give rise to the subsequent history of the visible matter in the Universe. This work derives new $^7$Be(n,p)$^7$Li thermonuclear reaction rates based on all available experimental information. This reaction sensitively impacts the primordial abundances of $^{7}$Be and $^7$Li during big bang nucleosynthesis. We critically evaluate all available data and disregard experimental results that are questionable. For the nuclear model, we adopt an incoherent sum of single-level, two-channel R-matrix approximation expressions, which are implemented into a hierarchical Bayesian model, to analyze the remaining six data sets we deem most reliable. In the fitting of the data, we consistently model all known sources of uncertainty, including discrepant absolute normalizations of different data sets, and also take the variation of the neutron and proton channel radii into account, hence providing less biased estimates of the $^7$Be(n,p)$^7$Li thermonuclear rates. From the resulting posteriors, we extract R-matrix parameters ($E_r$, $\gamma^2_n$, $\gamma^2_p$) and derive excitation energies, partial and total widths. Our fit is sensitive to the contributions of the first three levels above the neutron threshold. Reaction rates were computed by integrating 10,000 samples of the reduced cross section. Our $^7$Be(n,p)$^7$Li thermonuclear rates have uncertainties between 1.5% and 2.0% at temperatures of $\leq$1 GK. We compare our rates to previous results and find that the $^7$Be(n,p)$^7$Li rates most commonly used in big bang simulations have overly optimistic uncertainties.
Submitted 16 April, 2020; v1 submitted 12 December, 2019;
originally announced December 2019.
-
The Vanishing & Appearing Sources during a Century of Observations project: I. USNO objects missing in modern sky surveys and follow-up observations of a "missing star"
Authors:
Beatriz Villarroel,
Johan Soodla,
Sébastien Comerón,
Lars Mattsson,
Kristiaan Pelckmans,
Martín López-Corredoira,
Kevin Krisciunas,
Eduardo Guerras,
Oleg Kochukhov,
Josefine Bergstedt,
Bart Buelens,
Rudolf E. Bär,
Rubén Cubo,
J. Emilio Enriquez,
Alok C. Gupta,
Iñigo Imaz,
Torgny Karlsson,
M. Almudena Prieto,
Aleksey A. Shlyapnikov,
Rafael S. de Souza,
Irina B. Vavilova,
Martin J. Ward
Abstract:
In this paper we report the current status of a new research program. The primary goal of the "Vanishing & Appearing Sources during a Century of Observations" (VASCO) project is to search for vanishing and appearing sources using existing survey data to find examples of exceptional astrophysical transients. The implications of finding such objects extend from traditional astrophysics fields to the more exotic searches for evidence of technologically advanced civilizations. In this first paper we present new, deeper observations of the tentative candidate discovered by Villarroel et al. (2016). We then perform the first searches for vanishing objects throughout the sky by comparing 600 million objects from the US Naval Observatory Catalogue (USNO) B1.0 down to a limiting magnitude of $\sim 20 - 21$ with the recent Pan-STARRS Data Release-1 (DR1) with a limiting magnitude of $\sim$ 23.4. We find about 150,000 preliminary candidates that do not have any Pan-STARRS counterpart within a 30 arcsec radius. We show that these objects are redder and have larger proper motions than typical USNO objects. We visually examine the images for a subset of about 24,000 candidates, superseding the 2016 study with a sample ten times larger. We find $\sim$100 point sources visible in only one epoch in the red band of the USNO which may be of interest in searches for strong M dwarf flares, high-redshift supernovae or other categories of unidentified red transients.
Submitted 21 November, 2019; v1 submitted 12 November, 2019;
originally announced November 2019.
-
UV bright red-sequence galaxies: how do UV upturn systems evolve in redshift and stellar mass?
Authors:
M. L. L. Dantas,
P. R. T. Coelho,
R. S. de Souza,
T. S. Gonçalves
Abstract:
The so-called ultraviolet (UV) upturn of elliptical galaxies is a phenomenon characterised by the up-rise of their fluxes in bluer wavelengths, typically in the 1,200-2,500 Å range. This work aims at estimating the rate of occurrence of the UV upturn over the entire red-sequence population of galaxies that show significant UV emission. This assessment is made considering it as a function of three parameters: redshift, stellar mass, and -- what may seem counter-intuitive at first -- emission-line classification. We built a multiwavelength spectro-photometric catalogue from the Galaxy And Mass Assembly survey, together with aperture-matched data from the Galaxy Evolution Explorer Medium-Depth Imaging Survey (MIS) and the Sloan Digital Sky Survey, covering the redshift range between 0.06 and 0.40. From this sample, we analyse the UV emission among UV bright galaxies, by selecting those that occupy the red-sequence locus in the (NUV-r) x (FUV-NUV) chart; then, we stratify the sample by their emission-line classes. To that end, we make use of emission-line diagnostic diagrams, focusing the analysis on retired/passive lineless galaxies. Then, a Bayesian logistic model was built to simultaneously deal with the effects of all galaxy properties (including emission-line classification or lack thereof). The main results show that retired/passive systems host an up-rise in the fraction of UV upturn for redshifts between 0.06 and 0.25, followed by an in-fall up to 0.35. Additionally, we show that the fraction of UV upturn hosts rises with increasing stellar mass.
Submitted 23 December, 2019; v1 submitted 19 August, 2019;
originally announced August 2019.
-
Assessing the photometric redshift precision of the S-PLUS survey: the Stripe-82 as a test-case
Authors:
A. Molino,
M. V. Costa-Duarte,
L. Sampedro,
F. R. Herpich,
L. Sodré Jr.,
C. Mendes de Oliveira,
W. Schoenell,
C. E. Barbosa,
C. Queiroz,
E. V. R. Lima,
L. Azanha,
N. Muñoz-Elgueta,
T. Ribeiro,
A. Kanaan,
J. A. Hernandez-Jimenez,
A. Cortesi,
S. Akras,
R. Lopes de Oliveira,
S. Torres-Flores,
C. Lima-Dias,
J. L. Nilo Castellon,
G. Damke,
A. Alvarez-Candal,
Y. Jiménez-Teja,
P. Coelho
, et al. (20 additional authors not shown)
Abstract:
In this paper we present a thorough discussion about the photometric redshift (photo-z) performance of the Southern Photometric Local Universe Survey (S-PLUS). This survey combines a 7 narrow + 5 broad passband filter system, with a typical photometric-depth of r$\sim$21 AB. For this exercise, we utilize the Data Release 1 (DR1), corresponding to 336 deg$^{2}$ from the Stripe-82 region. We rely on the \texttt{BPZ2} code to compute our estimates, using a new library of SED models, which includes additional templates for quiescent galaxies. When compared to a spectroscopic redshift control sample of $\sim$100k galaxies, we find a precision of $\sigma_{z}<$0.8\%, $<$2.0\% or $<$3.0\% for galaxies with magnitudes r$<$17, $<$19 and $<$21, respectively. A precision of 0.6\% is attained for galaxies with the highest \texttt{Odds} values. These estimates have a negligible bias and a fraction of catastrophic outliers below 1\%. We identify a redshift window (i.e., 0.26$<z<$0.32) where our estimates double their precision, due to the simultaneous detection of two emission-lines in two distinct narrow-bands; this represents a window of opportunity to conduct statistical studies such as luminosity functions. We forecast a total of $\sim$2M, $\sim$16M and $\sim$32M galaxies in the S-PLUS survey with a photo-z precision of $\sigma_{z}<$1.0\%, $<$2.0\% and $<$2.5\% after observing 8000 deg$^{2}$. We also derive redshift Probability Density Functions, proving their reliability in encoding redshift uncertainties and their potential for recovering the $n(z)$ of galaxies at $z<0.4$, with an unprecedented precision for a photometric survey in the southern hemisphere.
Submitted 14 July, 2019;
originally announced July 2019.
-
The Southern Photometric Local Universe Survey (S-PLUS): improved SEDs, morphologies and redshifts with 12 optical filters
Authors:
C. Mendes de Oliveira,
T. Ribeiro,
W. Schoenell,
A. Kanaan,
R. A. Overzier,
A. Molino,
L. Sampedro,
P. Coelho,
C. E. Barbosa,
A. Cortesi,
M. V. Costa-Duarte,
F. R. Herpich,
J. A. Hernandez-Jimenez,
V. M. Placco,
H. S. Xavier,
L. R. Abramo,
R. K. Saito,
A. L. Chies-Santos,
A. Ederoclite,
R. Lopes de Oliveira,
D. R. Gonçalves,
S. Akras,
L. A. Almeida,
F. Almeida-Fernandes,
T. C. Beers
, et al. (120 additional authors not shown)
Abstract:
The Southern Photometric Local Universe Survey (S-PLUS) is imaging ~9300 deg^2 of the celestial sphere in twelve optical bands using a dedicated 0.8 m robotic telescope, the T80-South, at the Cerro Tololo Inter-American Observatory, Chile. The telescope is equipped with a 9.2k by 9.2k e2v detector with 10 um pixels, resulting in a field-of-view of 2 deg^2 with a plate scale of 0.55"/pixel. The survey consists of four main subfields, which include two non-contiguous fields at high Galactic latitudes (8000 deg^2 at |b| > 30 deg) and two areas of the Galactic plane and bulge (for an additional 1300 deg^2). S-PLUS uses the Javalambre 12-band magnitude system, which includes the 5 u, g, r, i, z broad-band filters and 7 narrow-band filters centered on prominent stellar spectral features: the Balmer jump/[OII], Ca H+K, H-delta, G-band, Mg b triplet, H-alpha, and the Ca triplet. S-PLUS delivers accurate photometric redshifts (delta_z/(1+z) = 0.02 or better) for galaxies with r < 20 AB mag and redshift < 0.5, thus producing a 3D map of the local Universe over a volume of more than 1 (Gpc/h)^3. The final S-PLUS catalogue will also enable the study of star formation and stellar populations in and around the Milky Way and nearby galaxies, as well as searches for quasars, variable sources, and low-metallicity stars. In this paper we introduce the main characteristics of the survey, illustrated with science verification data highlighting the unique capabilities of S-PLUS. We also present the first public data release of ~336 deg^2 of the Stripe-82 area, which is available at http://datalab.noao.edu/splus.
Submitted 2 September, 2019; v1 submitted 2 July, 2019;
originally announced July 2019.
-
Photometry of high-redshift blended galaxies using deep learning
Authors:
Alexandre Boucaud,
Marc Huertas-Company,
Caroline Heneka,
Emille E. O. Ishida,
Nima Sedaghat,
Rafael S. de Souza,
Ben Moews,
Hervé Dole,
Marco Castellano,
Emiliano Merlin,
Valerio Roscani,
Andrea Tramacere,
Madhura Killedar,
Arlindo M. M. Trindade
Abstract:
The new generation of deep photometric surveys requires unprecedentedly precise shape and photometry measurements of billions of galaxies to achieve their main science goals. At such depths, one major limiting factor is the blending of galaxies due to line-of-sight projection, with an expected fraction of blended galaxies of up to 50%. Current deblending approaches are in most cases either too slow or not accurate enough to reach the level of requirements. This work explores the use of deep neural networks to estimate the photometry of blended pairs of galaxies in monochrome space images, similar to the ones that will be delivered by the Euclid space telescope. Using a clean sample of isolated galaxies from the CANDELS survey, we artificially blend them and train two different network models to recover the photometry of the two galaxies. We show that our approach can recover the original photometry of the galaxies before being blended with $\sim$7% accuracy without any human intervention and without any assumption on the galaxy shape. This represents an improvement of at least a factor of 4 compared to the classical SExtractor approach. We also show that forcing the network to simultaneously estimate a binary segmentation map results in a slightly improved photometry. All data products and codes will be made public to ease the comparison with other approaches on a common data set.
Submitted 3 May, 2019;
originally announced May 2019.
-
Fallopian tube anatomy predicts pregnancy and pregnancy outcomes after tubal reversal surgery
Authors:
Rafael S. de Souza,
Gary S. Berger
Abstract:
We conducted this study to determine whether fallopian tube anatomy can predict the likelihood of pregnancy and pregnancy outcomes after tubal sterilization reversal. We built a flexible, non-parametric, multivariate model via generalized additive models to assess the effects of the following tubal parameters observed during tubal reparative surgery: tubal lengths; differences in tubal segment location, and diameters at the anastomosis sites; and fibrosis of the tubal muscularis. In this study population, age and tubal length - in that order - were the primary factors predicting the likelihood of pregnancy. For pregnancy outcomes, tubal length was the most influential predictor of birth and ectopic pregnancy, while age was the primary predictor of miscarriage. Segment location and diameters contributed slightly to the odds of miscarriage and ectopic pregnancy. Tubal muscularis fibrosis had little apparent effect. This study is the first to show that a statistical learning predictive model based on fallopian tube anatomy can predict pregnancy and pregnancy outcome probabilities after tubal reversal surgery.
Submitted 8 August, 2021; v1 submitted 20 April, 2019;
originally announced April 2019.
-
Astro2020 Science White Paper: The Next Decade of Astroinformatics and Astrostatistics
Authors:
A. Siemiginowska,
G. Eadie,
I. Czekala,
E. Feigelson,
E. B. Ford,
V. Kashyap,
M. Kuhn,
T. Loredo,
M. Ntampaka,
A. Stevens,
A. Avelino,
K. Borne,
T. Budavari,
B. Burkhart,
J. Cisewski-Kehe,
F. Civano,
I. Chilingarian,
D. A. van Dyk,
G. Fabbiano,
D. P. Finkbeiner,
D. Foreman-Mackey,
P. Freeman,
A. Fruscione,
A. A. Goodman,
M. Graham
, et al. (27 additional authors not shown)
Abstract:
Over the past century, major advances in astronomy and astrophysics have been largely driven by improvements in instrumentation and data collection. With the amassing of high quality data from new telescopes, and especially with the advent of deep and large astronomical surveys, it is becoming clear that future advances will also rely heavily on how those data are analyzed and interpreted. New methodologies derived from advances in statistics, computer science, and machine learning are beginning to be employed in sophisticated investigations that are not only bringing forth new discoveries, but are placing them on a solid footing. Progress in wide-field sky surveys, interferometric imaging, precision cosmology, exoplanet detection and characterization, and many subfields of stellar, Galactic and extragalactic astronomy, has resulted in complex data analysis challenges that must be solved to perform scientific inference. Research in astrostatistics and astroinformatics will be necessary to develop the state-of-the-art methodology needed in astronomy. Overcoming these challenges requires dedicated, interdisciplinary research. We recommend: (1) increasing funding for interdisciplinary projects in astrostatistics and astroinformatics; (2) dedicating space and time at conferences for interdisciplinary research and promotion; (3) developing sustainable funding for long-term astrostatistics appointments; and (4) funding infrastructure development for data archives and archive support, state-of-the-art algorithms, and efficient computing.
Submitted 15 March, 2019;
originally announced March 2019.
-
Thermonuclear fusion rates for tritium + deuterium using Bayesian methods
Authors:
Rafael S. de Souza,
S. Reece Boston,
Alain Coc,
Christian Iliadis
Abstract:
The $^3$H(d,n)$^4$He reaction has a large low-energy cross section and will likely be utilized in future commercial fusion reactors. This reaction also takes place during big bang nucleosynthesis. Studies of both scenarios require accurate and precise fusion rates. To this end, we implement a one-level, two-channel R-matrix approximation into a Bayesian model. Our main goals are to predict reliable astrophysical S-factors and to estimate R-matrix parameters using the Bayesian approach. All relevant parameters are sampled in our study, including the channel radii, boundary condition parameters, and data set normalization factors. In addition, we take uncertainties in both measured bombarding energies and S-factors rigorously into account. Thermonuclear rates and reactivities of the $^3$H(d,n)$^4$He reaction are derived by numerically integrating the Bayesian S-factor samples. The present reaction rate uncertainties at temperatures between $1.0$ MK and $1.0$ GK are in the range of 0.2% to 0.6%. Our reaction rates differ from previous results by 2.9% near 1.0 GK. Our reactivities are smaller than previous results, with a maximum deviation of 2.9% near a thermal energy of $4$ keV. The present rate or reactivity uncertainties are more reliable compared to previous studies that did not include the channel radii, boundary condition parameters, and data set normalization factors in the fitting. Finally, we investigate previous claims of electron screening effects in the published $^3$H(d,n)$^4$He data. No such effects are evident and only an upper limit for the electron screening potential can be obtained.
Submitted 14 January, 2019;
originally announced January 2019.
-
Stress testing the dark energy equation of state imprint on supernova data
Authors:
Ben Moews,
Rafael S. de Souza,
Emille E. O. Ishida,
Alex I. Malz,
Caroline Heneka,
Ricardo Vilalta,
Joe Zuntz
Abstract:
This work determines the degree to which a standard Lambda-CDM analysis based on type Ia supernovae can identify deviations from a cosmological constant in the form of a redshift-dependent dark energy equation of state w(z). We introduce and apply a novel random curve generator to simulate instances of w(z) from constraint families with increasing distinction from a cosmological constant. After producing a series of mock catalogs of binned type Ia supernovae corresponding to each w(z) curve, we perform a standard Lambda-CDM analysis to estimate the corresponding posterior densities of the absolute magnitude of type Ia supernovae, the present-day matter density, and the equation of state parameter. Using the Kullback-Leibler divergence between posterior densities as a difference measure, we demonstrate that a standard type Ia supernova cosmology analysis has limited sensitivity to extensive redshift dependencies of the dark energy equation of state. In addition, we report that larger redshift-dependent departures from a cosmological constant do not necessarily manifest as more easily detectable incompatibilities with the Lambda-CDM model. Our results suggest that physics beyond the standard model may simply be hidden in plain sight.
Submitted 5 July, 2019; v1 submitted 23 December, 2018;
originally announced December 2018.
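The abstract above uses the Kullback-Leibler divergence between posterior densities as a difference measure. As an illustrative sketch only, assuming each posterior is approximated by a 1-D Gaussian (an assumption made here, not something the abstract states), the divergence has a closed form. The function name `kl_gaussian` and the example numbers are hypothetical.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL divergence D(p || q) in nats between two 1-D Gaussians.

    p = N(mu_p, sigma_p^2) and q = N(mu_q, sigma_q^2); both standard
    deviations must be positive.
    """
    if sigma_p <= 0 or sigma_q <= 0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
        - 0.5
    )


# Hypothetical posterior summaries (mean, standard deviation) for the
# equation of state parameter under two simulated w(z) scenarios.
baseline = (-1.0, 0.10)
perturbed = (-0.9, 0.12)

print(kl_gaussian(*perturbed, *baseline))
```

Note that the divergence is asymmetric, so the choice of which posterior plays the role of reference matters when comparing scenarios.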
-
Gaia DR2 unravels incompleteness of nearby cluster population: New open clusters in the direction of Perseus
Authors:
T. Cantat-Gaudin,
A. Krone-Martins,
N. Sedaghat,
A. Farahi,
R. S. de Souza,
R. Skalidis,
A. I. Malz,
S. Macêdo,
B. Moews,
C. Jordi,
A. Moitinho,
A. Castro-Ginard,
E. E. O. Ishida,
C. Heneka,
A. Boucaud,
A. M. M. Trindade
Abstract:
Open clusters (OCs) are popular tracers of the structure and evolutionary history of the Galactic disk. The OC population is often considered to be complete within 1.8 kpc of the Sun. The recent Gaia Data Release 2 (DR2) allows the latter claim to be challenged. We perform a systematic search for new OCs in the direction of Perseus using precise and accurate astrometry from Gaia DR2. We implement a coarse-to-fine search method. First, we exploit spatial proximity using a fast density-aware partitioning of the sky via a k-d tree in the spatial domain of Galactic coordinates, (l, b). Secondly, we employ a Gaussian mixture model in the proper motion space to quickly tag fields around OC candidates. Thirdly, we apply an unsupervised membership assignment method, UPMASK, to scrutinise the candidates. We visually inspect colour-magnitude diagrams to validate the detected objects. Finally, we perform a diagnostic to quantify the significance of each identified overdensity in proper motion and in parallax space. We report the discovery of 41 new stellar clusters. This represents an increment of at least 20% of the previously known OC population in this volume of the Milky Way. We also report on the clear identification of NGC 886, an object previously considered an asterism. This letter challenges the previous claim of a near-complete sample of open clusters up to 1.8 kpc. Our results reveal that this claim requires revision, and a complete census of nearby open clusters is yet to be found.
Submitted 21 March, 2019; v1 submitted 12 October, 2018;
originally announced October 2018.
-
Astrophysical S-factors, thermonuclear rates, and electron screening potential for the $^3$He(d,p)$^{4}$He Big Bang reaction via a hierarchical Bayesian model
Authors:
Rafael S. de Souza,
Christian Iliadis,
Alain Coc
Abstract:
We developed a hierarchical Bayesian framework to estimate S-factors and thermonuclear rates for the $^3$He(d,p)$^{4}$He reaction, which impacts the primordial abundances of $^3$He and $^7$Li. The available data are evaluated and all direct measurements are taken into account in our analysis for which we can estimate separate uncertainties for systematic and statistical effects. For the nuclear reaction model, we adopt a single-level, two-channel approximation of R-matrix theory, suitably modified to take the effects of electron screening at lower energies into account. Apart from the usual resonance parameters (resonance location and reduced widths for the incoming and outgoing reaction channel), we include for the first time the channel radii and boundary condition in the fitting process. Our new analysis of the $^3$He(d,p)$^{4}$He S-factor data results in improved estimates for the thermonuclear rates. This work represents the first nuclear rate evaluation using the R-matrix theory embedded into a hierarchical Bayesian framework, properly accounting for all known sources of uncertainty. Therefore, it provides a test bed for future studies of more complex reactions.
Submitted 18 February, 2019; v1 submitted 18 September, 2018;
originally announced September 2018.
-
A case study of hurdle and generalized additive models in astronomy: the escape of ionizing radiation
Authors:
M. W. Hattab,
R. S. de Souza,
B. Ciardi,
J. -P. Paardekooper,
S. Khochfar,
C. Dalla Vecchia
Abstract:
The dark ages of the Universe end with the formation of the first generation of stars residing in primeval galaxies. These objects were the first to produce ultraviolet ionizing photons in a period when the cosmic gas changed from a neutral state to an ionized one, known as the Epoch of Reionization (EoR). A pivotal aspect of understanding the EoR is probing the intertwined relationship between the fraction of ionizing photons capable of escaping dark haloes, also known as the escape fraction ($f_{esc}$), and the physical properties of the galaxy. This work develops a sound statistical model suitable to account for such non-linear relationships and the non-Gaussian nature of $f_{esc}$. This model simultaneously estimates the probability that a given primordial galaxy starts the ionizing photon production and estimates the mean level of the $f_{esc}$ once it is triggered. The model was employed in the First Billion Years simulation suite, from which we show that the baryonic fraction and the rate of ionizing photons appear to have a larger impact on $f_{esc}$ than previously thought. A naive univariate analysis of the same problem would suggest smaller effects for these properties and a much larger impact for the specific star formation rate, which is lessened after accounting for other galaxy properties and non-linearities in the statistical model.
Submitted 13 January, 2019; v1 submitted 18 May, 2018;
originally announced May 2018.
-
GLADE: A Galaxy Catalogue for Multi-Messenger Searches in the Advanced Gravitational-Wave Detector Era
Authors:
Gergely Dálya,
Gábor Galgóczi,
László Dobos,
Zsolt Frei,
Ik Siong Heng,
Ronaldas Macas,
Christopher Messenger,
Péter Raffai,
Rafael S. de Souza
Abstract:
We introduce a value-added full-sky catalogue of galaxies, named the Galaxy List for the Advanced Detector Era, or GLADE. The purpose of this catalogue is to (i) help identify host candidates for gravitational-wave events, (ii) support target selections for electromagnetic follow-up observations of gravitational-wave candidates, (iii) provide input data on the matter distribution of the local universe for astrophysical or cosmological simulations, and (iv) help identify host candidates for poorly localised electromagnetic transients, such as gamma-ray bursts observed with the InterPlanetary Network. Since both are potential hosts of astrophysical sources of gravitational waves, GLADE includes inactive as well as active galaxies. GLADE was constructed by cross-matching and combining data from five separate (but not independent) astronomical catalogues: GWGC, 2MPZ, 2MASS XSC, HyperLEDA and SDSS-DR12Q. GLADE is complete up to $d_L = 37^{+3}_{-4}$ Mpc in terms of the cumulative B-band luminosity of galaxies within luminosity distance $d_L$, and contains all of the brightest galaxies giving half of the total B-band luminosity up to $d_L = 91$ Mpc. As B-band luminosity is expected to be a tracer of binary neutron star mergers (currently the prime targets of joint GW+EM detections), our completeness measures can be used as estimations of completeness for containing all binary neutron star merger hosts in the local universe.
Submitted 23 July, 2018; v1 submitted 13 April, 2018;
originally announced April 2018.
-
Optimizing spectroscopic follow-up strategies for supernova photometric classification with active learning
Authors:
E. E. O. Ishida,
R. Beck,
S. Gonzalez-Gaitan,
R. S. de Souza,
A. Krone-Martins,
J. W. Barrett,
N. Kennamer,
R. Vilalta,
J. M. Burgess,
B. Quint,
A. Z. Vitorelli,
A. Mahabal,
E. Gangler
Abstract:
We report a framework for spectroscopic follow-up design for optimizing supernova photometric classification. The strategy accounts for the unavoidable mismatch between spectroscopic and photometric samples, and can be used even at the beginning of a new survey, without any initial training set. The framework falls under the umbrella of active learning (AL), a class of algorithms that aims to minimize labelling costs by identifying a few carefully chosen objects that have high potential to improve the classifier predictions. As a proof of concept, we use the simulated data released after the Supernova Photometric Classification Challenge (SNPCC) and a random forest classifier. Our results show that, using only 12% of the number of training objects in the SNPCC spectroscopic sample, this approach is able to double purity results. Moreover, in order to take into account multiple spectroscopic observations in the same night, we propose a semi-supervised batch-mode AL algorithm which selects a set of the $N=5$ most informative objects each night. In comparison with the initial state using the traditional approach, our method achieves 2.3 times higher purity and comparable figure of merit results after only 180 days of observation, or 800 queries (73% of the SNPCC spectroscopic sample size). Such results were obtained using the same amount of spectroscopic time necessary to observe the original SNPCC spectroscopic sample, showing that this type of strategy is feasible with currently available spectroscopic resources. The code used in this work is available in the COINtoolbox: https://github.com/COINtoolbox/ActSNClass .
Submitted 3 January, 2019; v1 submitted 10 April, 2018;
originally announced April 2018.
-
Spatial field reconstruction with INLA: Application to IFU galaxy data
Authors:
S. González-Gaitán,
R. S. de Souza,
A. Krone-Martins,
E. Cameron,
P. Coelho,
L. Galbany,
E. E. O. Ishida
Abstract:
Astronomical observations of extended sources, such as cubes of integral field spectroscopy (IFS), encode auto-correlated spatial structures that cannot be optimally exploited by standard methodologies. This work introduces a novel technique to model IFS datasets, which treats the observed galaxy properties as realizations of an unobserved Gaussian Markov random field. The method is computationally efficient, resilient to the presence of low-signal-to-noise regions, and uses an alternative to Markov Chain Monte Carlo for fast Bayesian inference, the Integrated Nested Laplace Approximation (INLA). As a case study, we analyse 721 IFS data cubes of nearby galaxies from the CALIFA and PISCO surveys, for which we retrieve the maps of the following physical properties: age, metallicity, mass and extinction. The proposed Bayesian approach, built on a generative representation of the galaxy properties, enables the creation of synthetic images, recovery of areas with bad pixels, and an increased power to detect structures in datasets subject to substantial noise and/or sparsity of sampling. A code snippet to reproduce the analysis of this paper is available in the COIN toolbox, together with the field reconstructions of the CALIFA and PISCO samples.
Submitted 30 December, 2018; v1 submitted 17 February, 2018;
originally announced February 2018.
-
The relation between velocity dispersions and chemical abundances in RAVE giants
Authors:
R. Smiljanic,
R. S. de Souza
Abstract:
We developed a Bayesian framework to robustly determine the relation between velocity dispersions and chemical abundances in a sample of stars. Our modelling takes into account the uncertainties in the chemical and kinematic properties. We make use of RAVE DR5 radial velocities and abundances together with Gaia DR1 proper motions and parallaxes (when possible; otherwise UCAC4 data are used). We found that, in general, the velocity dispersions increase with decreasing [Fe/H] and increasing [Mg/Fe]. A possible decrease in velocity dispersion for stars with high [Mg/Fe] is a property of a negligible fraction of stars and hardly a robust result. At low [Fe/H] and high [Mg/Fe] the sample is incomplete, affected by biases, and likely not representative of the underlying stellar population.
Submitted 15 September, 2017;
originally announced September 2017.
-
Statistical methods in astronomy
Authors:
James P. Long,
Rafael S. de Souza
Abstract:
We present a review of data types and statistical methods often encountered in astronomy. The aim is to provide an introduction to statistical applications in astronomy for statisticians and computer scientists. We highlight the complex, often hierarchical, nature of many astronomy inference problems and advocate for cross-disciplinary collaborations to address these challenges.
Submitted 19 October, 2017; v1 submitted 16 July, 2017;
originally announced July 2017.