-
Clusternets: A deep learning approach to probe clustering dark energy
Authors:
Amirmohammad Chegeni,
Farbod Hassani,
Alireza Vafaei Sadr,
Nima Khosravi,
Martin Kunz
Abstract:
Machine Learning (ML) algorithms are becoming popular in cosmology for extracting valuable information from cosmological data. In this paper, we evaluate the performance of a Convolutional Neural Network (CNN) trained on matter density snapshots to distinguish clustering Dark Energy (DE) from the cosmological constant scenario and to detect the speed of sound ($c_s$) associated with clustering DE. We compare the CNN results with those from a Random Forest (RF) algorithm trained on power spectra. Varying the dark energy equation of state parameter $w_{\rm{DE}}$ within the range of -0.7 to -0.99, while keeping $c_s^2 = 1$, we find that the CNN approach results in a significant improvement in accuracy over the RF algorithm. The improvement in classification accuracy can be as high as $40\%$ depending on the physical scales involved. We also investigate the ML algorithms' ability to detect the impact of the speed of sound by choosing $c_s^2$ from the set $\{1, 10^{-2}, 10^{-4}, 10^{-7}\}$ while maintaining a constant $w_{\rm DE}$ for three different cases: $w_{\rm DE} \in \{-0.7, -0.8, -0.9\}$. Our results suggest that distinguishing between various values of $c_s^2$ and the case where $c_s^2=1$ is challenging, particularly at small scales and when $w_{\rm{DE}}\approx -1$. However, as we consider larger scales, the accuracy of $c_s^2$ detection improves. Notably, the CNN algorithm consistently outperforms the RF algorithm, leading to an approximate $20\%$ enhancement in $c_s^2$ detection accuracy in some cases.
Submitted 7 August, 2023;
originally announced August 2023.
-
SKA Science Data Challenge 2: analysis and results
Authors:
P. Hartley,
A. Bonaldi,
R. Braun,
J. N. H. S. Aditya,
S. Aicardi,
L. Alegre,
A. Chakraborty,
X. Chen,
S. Choudhuri,
A. O. Clarke,
J. Coles,
J. S. Collinson,
D. Cornu,
L. Darriba,
M. Delli Veneri,
J. Forbrich,
B. Fraga,
A. Galan,
J. Garrido,
F. Gubanov,
H. Håkansson,
M. J. Hardcastle,
C. Heneka,
D. Herranz,
K. M. Hess
, et al. (83 additional authors not shown)
Abstract:
The Square Kilometre Array Observatory (SKAO) will explore the radio sky to new depths in order to conduct transformational science. SKAO data products made available to astronomers will be correspondingly large and complex, requiring the application of advanced analysis techniques to extract key science findings. To this end, SKAO is conducting a series of Science Data Challenges, each designed to familiarise the scientific community with SKAO data and to drive the development of new analysis techniques. We present the results from Science Data Challenge 2 (SDC2), which invited participants to find and characterise 233245 neutral hydrogen (HI) sources in a simulated data product representing a 2000 h SKA MID spectral line observation from redshifts 0.25 to 0.5. Through the generous support of eight international supercomputing facilities, participants were able to undertake the Challenge using dedicated computational resources. Alongside the main challenge, 'reproducibility awards' were made in recognition of those pipelines which demonstrated Open Science best practice. The Challenge saw over 100 participants develop a range of new and existing techniques, with results that highlight the strengths of multidisciplinary and collaborative effort. The winning strategy -- which combined predictions from two independent machine learning techniques to yield a 20 percent improvement in overall performance -- underscores one of the main Challenge outcomes: that of method complementarity. It is likely that the combination of methods in a so-called ensemble approach will be key to exploiting very large astronomical datasets.
Submitted 14 March, 2023;
originally announced March 2023.
-
U-Net-based Models for Skin Lesion Segmentation: More Attention and Augmentation
Authors:
Pooya Mohammadi Kazaj,
MohammadHossein Koosheshi,
Ali Shahedi,
Alireza Vafaei Sadr
Abstract:
According to WHO[1], since the 1970s, diagnosis of melanoma skin cancer has been more frequent. However, if detected early, the 5-year survival rate for melanoma can increase to 99 percent. In this regard, skin lesion segmentation can be pivotal in monitoring and treatment planning. In this work, ten models and four augmentation configurations are trained on the ISIC 2016 dataset. The performance and overfitting are compared utilizing five metrics. Our results show that the U-Net-Resnet50 and the R2U-Net achieve the highest metric values, along with two data augmentation scenarios. We also investigate CBAM and AG blocks in the U-Net architecture, which enhance segmentation performance at a meager computational cost. In addition, we propose using pyramid, AG, and CBAM blocks in a sequence, which significantly surpasses the results of using the two individually. Finally, our experiments show that models that have exploited attention modules successfully overcome common skin lesion segmentation problems. In the spirit of reproducible research, we make our models and code publicly available.
Submitted 28 October, 2022;
originally announced October 2022.
-
Learning to Detect Interesting Anomalies
Authors:
Alireza Vafaei Sadr,
Bruce A. Bassett,
Emmanuel Sekyi
Abstract:
Anomaly detection algorithms are typically applied to static, unchanging, data features hand-crafted by the user. But how does a user systematically craft good features for anomalies that have never been seen? Here we couple deep learning with active learning -- in which an Oracle iteratively labels small amounts of data selected algorithmically over a series of rounds -- to automatically and dynamically improve the data features for efficient outlier detection. This approach, AHUNT, shows excellent performance on MNIST, CIFAR10, and Galaxy-DESI data, significantly outperforming both standard anomaly detection and active learning algorithms with static feature spaces. Beyond improved performance, AHUNT also allows the number of anomaly classes to grow organically in response to Oracle's evaluations. Extensive ablation studies explore the impact of Oracle question selection strategy and loss function on performance. We illustrate how the dynamic anomaly class taxonomy represents another step towards fully personalized rankings of different anomaly classes that reflect a user's interests, allowing the algorithm to learn to ignore statistically significant but uninteresting outliers (e.g., noise). This should prove useful in the era of massive astronomical datasets serving diverse sets of users who can only review a tiny subset of the incoming data.
Submitted 28 October, 2022;
originally announced October 2022.
-
Recommendations on test datasets for evaluating AI solutions in pathology
Authors:
André Homeyer,
Christian Geißler,
Lars Ole Schwen,
Falk Zakrzewski,
Theodore Evans,
Klaus Strohmenger,
Max Westphal,
Roman David Bülow,
Michaela Kargl,
Aray Karjauv,
Isidre Munné-Bertran,
Carl Orge Retzlaff,
Adrià Romero-López,
Tomasz Sołtysiński,
Markus Plass,
Rita Carvalho,
Peter Steinbach,
Yu-Chia Lan,
Nassim Bouteldja,
David Haber,
Mateo Rojas-Carulla,
Alireza Vafaei Sadr,
Matthias Kraft,
Daniel Krüger,
Rutger Fick
, et al. (5 additional authors not shown)
Abstract:
Artificial intelligence (AI) solutions that automatically extract information from digital histology images have shown great promise for improving pathological diagnosis. Prior to routine use, it is important to evaluate their predictive performance and obtain regulatory approval. This assessment requires appropriate test datasets. However, compiling such datasets is challenging and specific recommendations are missing.
A committee of various stakeholders, including commercial AI developers, pathologists, and researchers, discussed key aspects and conducted extensive literature reviews on test datasets in pathology. Here, we summarize the results and derive general recommendations for the collection of test datasets.
We address several questions: Which and how many images are needed? How to deal with low-prevalence subsets? How can potential bias be detected? How should datasets be reported? What are the regulatory requirements in different countries?
The recommendations are intended to help AI developers demonstrate the utility of their products and to help regulatory agencies and end users verify reported performance measures. Further research is needed to formulate criteria for sufficiently representative test datasets so that AI solutions can operate with less user intervention and better support diagnostic workflows in the future.
Submitted 21 April, 2022;
originally announced April 2022.
-
Deep Learning in Searching the Spectroscopic Redshift of Quasars
Authors:
F. Rastegar Nia,
M. T. Mirtorabi,
R. Moradi,
A. Vafaei Sadr,
Y. Wang
Abstract:
Studying cosmological sources in their cosmological rest frames is crucial for tracking the cosmic history and properties of compact objects. In view of the increasing data volume of existing and upcoming telescopes/detectors, we here construct a 1-dimensional convolutional neural network (CNN) with a residual neural network (ResNet) structure, named FNet, to estimate the redshift of quasars in the Sloan Digital Sky Survey IV (SDSS-IV) catalog from DR16 quasar-only (DR16Q) of eBOSS over a broad range of signal-to-noise ratios. Owing to its $24$ convolutional layers and the ResNet structure with different kernel sizes of $500$, $200$ and $15$, FNet is able to discover the "local" and "global" patterns in the whole sample of spectra by a self-learning procedure. It reaches an accuracy of 97.0$\%$ for the velocity difference for redshift, $|Δν|< 6000~ \rm km/s$, and 98.0$\%$ for $|Δν|< 12000~ \rm km/s$. By comparison, QuasarNET, a standard CNN adopted in the SDSS routine and constructed from 4 convolutional layers (no ResNet structure) with kernel sizes of $10$, measures the redshift by identifying seven emission lines (local patterns); it fails to estimate the redshift of $\sim 1.3\%$ of visually inspected quasars in the DR16Q catalog, and gives 97.8$\%$ for $|Δν|< 6000~ \rm km/s$ and 97.9$\%$ for $|Δν|< 12000~ \rm km/s$. Hence, FNet provides similar accuracy to QuasarNET, but it is applicable to a wider range of SDSS spectra, especially those missing the clear emission lines exploited by QuasarNET. These properties of FNet, together with the fast predictive power of machine learning, allow FNet to be a more accurate alternative to the pipeline redshift estimator and can make it practical in upcoming catalogs for reducing the number of spectra to inspect visually.
Submitted 10 January, 2022;
originally announced January 2022.
-
The Hydrogen Intensity and Real-time Analysis eXperiment: 256-Element Array Status and Overview
Authors:
Devin Crichton,
Moumita Aich,
Adam Amara,
Kevin Bandura,
Bruce A. Bassett,
Carlos Bengaly,
Pascale Berner,
Shruti Bhatporia,
Martin Bucher,
Tzu-Ching Chang,
H. Cynthia Chiang,
Jean-Francois Cliche,
Carolyn Crichton,
Romeel Dave,
Dirk I. L. de Villiers,
Matt A. Dobbs,
Aaron M. Ewall-Wice,
Scott Eyono,
Christopher Finlay,
Sindhu Gaddam,
Ken Ganga,
Kevin G. Gayley,
Kit Gerodias,
Tim Gibbon,
Austin Gumba
, et al. (75 additional authors not shown)
Abstract:
The Hydrogen Intensity and Real-time Analysis eXperiment (HIRAX) is a radio interferometer array currently in development, with an initial 256-element array to be deployed at the South African Radio Astronomy Observatory (SARAO) Square Kilometer Array (SKA) site in South Africa. Each of the 6 m, $f/0.23$ dishes will be instrumented with dual-polarisation feeds operating over a frequency range of 400-800 MHz. Through intensity mapping of the 21 cm emission line of neutral hydrogen, HIRAX will provide a cosmological survey of the distribution of large-scale structure over the redshift range of $0.775 < z < 2.55$ over $\sim$15,000 square degrees of the southern sky. The statistical power of such a survey is sufficient to produce $\sim$7 percent constraints on the dark energy equation of state parameter when combined with measurements from the Planck satellite. Additionally, HIRAX will provide a highly competitive platform for radio transient and HI absorber science while enabling a multitude of cross-correlation studies. In this paper, we describe the science goals of the experiment, give an overview of the design and status of the sub-components of the telescope system, and describe the expected performance of the initial 256-element array as well as the planned future expansion to the final, 1024-element array.
Submitted 17 January, 2022; v1 submitted 28 September, 2021;
originally announced September 2021.
-
Planck Limits on Cosmic String Tension Using Machine Learning
Authors:
M. Torki,
H. Hajizadeh,
M. Farhang,
A. Vafaei Sadr,
S. M. S. Movahed
Abstract:
We develop two parallel machine-learning pipelines to estimate the contribution of cosmic strings (CSs), conveniently encoded in their tension ($Gμ$), to the anisotropies of the cosmic microwave background radiation observed by Planck. The first approach is tree-based and feeds on certain map features derived by image processing and statistical tools. The second uses a convolutional neural network with the goal of exploring possible non-trivial features of the CS imprints. The two pipelines are trained on Planck simulations and, when applied to the Planck SMICA map, yield the $3σ$ upper bound of $Gμ\lesssim 8.6\times 10^{-7}$. We also train and apply the pipelines to make forecasts for futuristic CMB-S4-like surveys and conservatively find their minimum detectable tension to be $Gμ_{\rm min}\sim 1.9\times 10^{-7}$.
Submitted 31 May, 2021;
originally announced June 2021.
-
Design and implementation of a noise temperature measurement system for the Hydrogen Intensity and Real-time Analysis eXperiment (HIRAX)
Authors:
Emily R. Kuhn,
Benjamin R. B. Saliwanchik,
Maile Harris,
Moumita Aich,
Kevin Bandura,
Tzu-Ching Chang,
H. Cynthia Chiang,
Devin Crichton,
Aaron Ewall-Wice,
Austin A. Gumba,
N. Gupta,
Kabelo Calvin Kesebonye,
Jean-Paul Kneib,
Martin Kunz,
Kavilan Moodley,
Laura B. Newburgh,
Viraj Nistane,
Warren Naidoo,
Deniz Ölçek,
Jeffrey B. Peterson,
Alexandre Refregier,
Jonathan L. Sievers,
Corrie Ungerer,
Alireza Vafaei Sadr,
Jacques van Dyk
, et al. (2 additional authors not shown)
Abstract:
This paper describes the design, implementation, and verification of a test-bed for determining the noise temperature of radio antennas operating between 400-800 MHz. The requirements for this test-bed were driven by the HIRAX experiment, which uses antennas with embedded amplification, making system noise characterization difficult in the laboratory. The test-bed consists of two large cylindrical cavities, each containing radio-frequency (RF) absorber held at different temperatures (300 K and 77 K), allowing a measurement of system noise temperature through the well-known 'Y-factor' method. The apparatus has been constructed at Yale, and over the course of the past year has undergone detailed verification measurements. To date, three preliminary noise temperature measurement sets have been conducted using the system, putting us on track to make the first noise temperature measurements of the HIRAX feed and perform the first analysis of feed repeatability.
Submitted 15 January, 2021;
originally announced January 2021.
-
Dynamical Dark Energy Properties Hidden in the Dark Matter Halos and Voids
Authors:
Aghileh S Ebrahimi,
A. Vafaei Sadr,
S. Tavasoli
Abstract:
In this paper, we analysed the halo and void properties of a GR-based N-body simulation carried out at redshifts z = 0.0 and z = 0.8, examining the differences between dynamical dark energy models (namely PL and CPL) and LCDM. Analysing the halos demonstrates that both models, PL and CPL, behave like LCDM, although the velocity dispersion of halos was more sensitive to the dynamical dark energy model. In addition, a void finder was developed to extract the properties of voids from simulated data. Further statistical modelling of the voids confirms that the PL model produces larger voids. In summary, our novel simulation demonstrates that void properties are better than halo properties at discriminating between dark energy models. Hence, the results suggest making more use of the properties of voids in future studies that discriminate between dynamical dark energy models.
Submitted 8 November, 2020;
originally announced November 2020.
-
Square Kilometre Array Science Data Challenge 1: analysis and results
Authors:
A. Bonaldi,
T. An,
M. Bruggen,
S. Burkutean,
B. Coelho,
H. Goodarzi,
P. Hartley,
P. K. Sandhu,
C. Wu,
L. Yu,
M. H. Zhoolideh Haghighi,
S. Anton,
Z. Bagheri,
D. Barbosa,
J. P. Barraca,
D. Bartashevich,
M. Bergano,
M. Bonato,
J. Brand,
F. de Gasperin,
A. Giannetti,
R. Dodson,
P. Jain,
S. Jaiswal,
B. Lao
, et al. (20 additional authors not shown)
Abstract:
As the largest radio telescope in the world, the Square Kilometre Array (SKA) will lead the next generation of radio astronomy. The feats of engineering required to construct the telescope array will be matched only by the techniques developed to exploit the rich scientific value of the data. To drive forward the development of efficient and accurate analysis methods, we are designing a series of data challenges that will provide the scientific community with high-quality datasets for testing and evaluating new techniques. In this paper we present a description and results from the first such Science Data Challenge (SDC1). Based on SKA MID continuum simulated observations and covering three frequencies (560 MHz, 1400 MHz and 9200 MHz) at three depths (8 h, 100 h and 1000 h), SDC1 asked participants to apply source detection, characterization and classification methods to simulated data. The challenge opened in November 2018, with nine teams submitting results by the deadline of April 2019. In this work we analyse the results for 8 of those teams, showcasing the variety of approaches that can be successfully used to find, characterise and classify sources in a deep, crowded field. The results also demonstrate the importance of building domain knowledge and expertise on this kind of analysis to obtain the best performance. As high-resolution observations begin revealing the true complexity of the sky, one of the outstanding challenges emerging from this analysis is the ability to deal with highly resolved and complex sources as effectively as the unresolved source population.
Submitted 28 September, 2020;
originally announced September 2020.
-
IMDb data from Two Generations, from 1979 to 2019; Part one, Dataset Introduction and Preliminary Analysis
Authors:
M. Bahraminasr,
A. Vafaei Sadr
Abstract:
"IMDb", a user-regulated portal and one of the most visited, has provided an opportunity to create an enormous database. Analysis of the information on the Internet Movie Database (IMDb), whether related to the movies themselves or provided by users, would help to reveal the factors that determine each movie's success. Because a comprehensive dataset was lacking, we decided to create a compendious dataset for later analysis using statistical methods and machine learning models. It comprises various information provided on IMDb, such as rating data, genre, cast and crew, MPAA rating certificate, parental guide details, related movie information, posters, etc., for over 79k titles, which makes it the largest such dataset to date. The present paper is the first in a series aiming at the mentioned goals; it describes the created dataset and presents a preliminary analysis, including some trends in the data and a demographic analysis of IMDb scores and their relation to genre and MPAA rating certificate.
Submitted 6 September, 2020; v1 submitted 28 May, 2020;
originally announced May 2020.
-
Deep Learning improves identification of Radio Frequency Interference
Authors:
Alireza Vafaei Sadr,
Bruce A. Bassett,
Nadeem Oozeer,
Yabebal Fantaye,
Chris Finlay
Abstract:
Flagging of Radio Frequency Interference (RFI) is an increasingly important challenge in radio astronomy. We present R-Net, a deep convolutional ResNet architecture that significantly outperforms existing algorithms -- including the default MeerKAT RFI flagger, and deep U-Net architectures -- across all metrics including AUC, F1-score and MCC. We demonstrate the robustness of this improvement on both single dish and interferometric simulations and, using transfer learning, on real data. Our R-Net model's precision is approximately $90\%$ better than the current MeerKAT flagger at $80\%$ recall and has a 35\% higher F1-score with no additional performance cost. We further highlight the effectiveness of transfer learning from a model initially trained on simulated MeerKAT data and fine-tuned on real, human-flagged, KAT-7 data. Despite the wide differences in the nature of the two telescope arrays, the model achieves an AUC of 0.91, while the best model without transfer learning only reaches an AUC of 0.67. We consider the use of phase information in our models but find that without calibration the phase adds almost no extra information relative to amplitude data only. Our results strongly suggest that deep learning on simulations, boosted by transfer learning on real data, will likely play a key role in the future of RFI flagging of radio astronomy data.
Submitted 12 October, 2020; v1 submitted 18 May, 2020;
originally announced May 2020.
-
Inpainting via Generative Adversarial Networks for CMB data analysis
Authors:
Alireza Vafaei Sadr,
Farida Farsian
Abstract:
In this work, we propose a new method to inpaint the CMB signal in regions masked out following a point source extraction process. We adopt a modified Generative Adversarial Network (GAN) and compare different combinations of internal (hyper-)parameters and training strategies. We study the performance using a suitable $\mathcal{C}_r$ variable in order to estimate the performance regarding the CMB power spectrum recovery. We consider a test set where one point source is masked out in each sky patch with a 1.83 $\times$ 1.83 squared degree extension, which, in our gridding, corresponds to 64 $\times$ 64 pixels. The GAN is optimized for estimating performance on Planck 2018 total intensity simulations. The training makes the GAN effective in reconstructing a masking corresponding to about 1500 pixels with $1\%$ error down to angular scales corresponding to about 5 arcminutes.
Submitted 21 April, 2020; v1 submitted 8 April, 2020;
originally announced April 2020.
-
Clustering of Local Extrema in Planck CMB maps
Authors:
A. Vafaei Sadr,
S. M. S. Movahed
Abstract:
The clustering of local extrema will be exploited to examine Gaussianity, asymmetry, and the footprint of the cosmic-string network on the CMB observed by Planck. The number density of local extrema ($n_{\rm pk}$ for peak and $n_{\rm tr}$ for trough) and sharp clipping ($n_{\rm pix}$) statistics support the Gaussianity hypothesis for all component separations. However, the pixel at the threshold reveals a more consistent treatment with respect to end-to-end simulations. A very tiny deviation from associated simulations in the context of trough density, in the threshold range $θ\in [-2-0]$ for NILC and CR component separations, is detected. The unweighted two-point correlation function of the local extrema illustrates good consistency between different component separations and corresponding Gaussian simulations for almost all available thresholds. However, for high thresholds, a small deficit in the clustering of peaks is observed with respect to the Planck fiducial $Λ$CDM model. To put a significant constraint on the amplitude of the mass function based on the value of $Ψ$ around the Doppler peak ($θ\approx 70-75$ arcmin), we should consider $\vartheta\lesssim 0.0$. The scale-independent bias factors for the peak above a threshold for large separation angle and high threshold level are in agreement with the value expected for a pure Gaussian CMB. Applying the $n_{\rm pk}$, $n_{\rm tr}$, $Ψ_{\rm pk-pk}$ and $Ψ_{\rm tr-tr}$ measures on the tessellated CMB map with patches of $7.5^2$ deg$^2$ size proves statistical isotropy in the Planck maps. The peak clustering analysis puts the upper bound on the cosmic-string tension, $Gμ^{(\rm up)} \lesssim 5.59\times 10^{-7}$, in SMICA.
Submitted 22 April, 2021; v1 submitted 16 March, 2020;
originally announced March 2020.
-
A Flexible Framework for Anomaly Detection via Dimensionality Reduction
Authors:
Alireza Vafaei Sadr,
Bruce A. Bassett,
Martin Kunz
Abstract:
Anomaly detection is challenging, especially for large datasets in high dimensions. Here we explore a general anomaly detection framework based on dimensionality reduction and unsupervised clustering. We release DRAMA, a general python package that implements the general framework with a wide range of built-in options. We test DRAMA on a wide variety of simulated and real datasets, in up to 3000 dimensions, and find it robust and highly competitive with commonly-used anomaly detection algorithms, especially in high dimensions. The flexibility of the DRAMA framework allows for significant optimization once some examples of anomalies are available, making it ideal for online anomaly detection, active learning and highly unbalanced datasets.
Submitted 9 September, 2019;
originally announced September 2019.
-
Eigen-reconstruction of Perturbations to the Primordial Tensor Power Spectrum
Authors:
M. Farhang,
A. Vafaei Sadr
Abstract:
We explore the potential of the B-mode anisotropies of the Cosmic Microwave Background radiation (CMB) to constrain the shape of the primordial tensor power spectrum in a model-independent way. We expand possible perturbations to the power-law primordial tensor spectrum (predicted by the simplest single-field slow-roll inflationary models) using various sets of localized and nonlocalized basis functions and construct the Fisher matrix for their amplitudes. The eigen-analysis of the Fisher matrix would then yield a hierarchy of uncorrelated perturbation patterns (called tensor eigenmodes or TeMs) which are rank-ordered according to their measurability by data. We find that the first three TeMs are expected to be constrainable within a few percent by the next generation of B-mode experiments. We discuss how the method can be iteratively used to reconstruct the observable part of any general deviation from the fiducial power spectrum.
Submitted 10 October, 2018;
originally announced October 2018.
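The eigen-analysis step described in this abstract can be illustrated with a minimal sketch. It assumes a small symmetric, positive-definite Fisher matrix; the function name `fisher_eigenmodes` and the toy numbers are hypothetical and are not taken from the paper. Eigenvectors of the Fisher matrix are mutually uncorrelated parameter combinations, and the forecast 1σ error on each mode amplitude is $1/\sqrt{\lambda}$, so sorting by descending eigenvalue ranks the modes from best to worst measured.

```python
import numpy as np

def fisher_eigenmodes(fisher):
    """Return eigenvalues (descending), eigenmodes (as columns) and forecast 1-sigma errors."""
    fisher = np.asarray(fisher, dtype=float)
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(fisher)
    # Reorder so the best-constrained mode (largest eigenvalue) comes first
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    # Forecast error on each mode amplitude; requires strictly positive eigenvalues
    sigmas = 1.0 / np.sqrt(eigvals)
    return eigvals, eigvecs, sigmas

# Toy 3x3 Fisher matrix with made-up entries (illustration only)
toy_fisher = np.array([[4.0, 1.0, 0.0],
                       [1.0, 3.0, 0.5],
                       [0.0, 0.5, 1.0]])
eigvals, modes, sigmas = fisher_eigenmodes(toy_fisher)
```

In this sketch `modes[:, 0]` is the combination of basis-function amplitudes with the smallest forecast error, and `sigmas` grows along the hierarchy.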
-
DeepSource: Point Source Detection using Deep Learning
Authors:
A. Vafaei Sadr,
Etienne E. Vos,
Bruce A. Bassett,
Zafiirah Hosenie,
N. Oozeer,
Michelle Lochner
Abstract:
Point source detection at low signal-to-noise is challenging for astronomical surveys, particularly in radio interferometry images where the noise is correlated. Machine learning is a promising solution, allowing the development of algorithms tailored to specific telescope arrays and science cases. We present DeepSource - a deep learning solution - that uses convolutional neural networks to achieve these goals. DeepSource enhances the Signal-to-Noise Ratio (SNR) of the original map and then uses dynamic blob detection to detect sources. Trained and tested on two sets of 500 simulated 1 deg x 1 deg MeerKAT images with a total of 300,000 sources, DeepSource is essentially perfect in both purity and completeness down to SNR = 4 and outperforms PyBDSF in all metrics. For uniformly-weighted images it achieves a Purity x Completeness (PC) score at SNR = 3 of 0.73, compared to 0.31 for the best PyBDSF model. For natural weighting we find a smaller improvement of ~40% in the PC score at SNR = 3. If instead we ask where either purity or completeness first drops to 90%, we find that DeepSource reaches this value at SNR = 3.6 compared to the 4.3 of PyBDSF (natural weighting). A key advantage of DeepSource is that it can learn to optimally trade off purity and completeness for any science case under consideration. Our results show that deep learning is a promising approach to point source detection in astronomical images.
Submitted 7 July, 2018;
originally announced July 2018.
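The Purity x Completeness (PC) score quoted in this abstract can be illustrated with a short sketch. It assumes the usual definitions, which the abstract itself does not spell out: purity is the fraction of detections that match a real source, and completeness is the fraction of real sources that are detected. The function name and the counts below are hypothetical and are not results from the paper.

```python
def purity_completeness(true_positives, false_positives, false_negatives):
    """Return (purity, completeness, PC score) from detection counts."""
    detected = true_positives + false_positives   # everything the detector flagged
    real = true_positives + false_negatives       # every source actually present
    # Guard against empty catalogues to avoid division by zero
    purity = true_positives / detected if detected else 0.0
    completeness = true_positives / real if real else 0.0
    return purity, completeness, purity * completeness

# Made-up counts: 80 matched detections, 20 spurious detections, 20 missed sources
purity, completeness, pc = purity_completeness(80, 20, 20)
```

Because the score is a product, it is high only when both quantities are high, which is why it is a convenient single number for comparing detectors at a fixed SNR.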
-
Cosmic String Detection with Tree-Based Machine Learning
Authors:
A. Vafaei Sadr,
M. Farhang,
S. M. S. Movahed,
B. Bassett,
M. Kunz
Abstract:
We explore the use of random forest and gradient boosting, two powerful tree-based machine learning algorithms, for the detection of cosmic strings in maps of the cosmic microwave background (CMB), through their unique Gott-Kaiser-Stebbins effect on the temperature anisotropies. The information in the maps is compressed into feature vectors before being passed to the learning units. The feature vectors contain various statistical measures of processed CMB maps that boost the cosmic string detectability. Our proposed classifiers, after training, give results improved over or similar to the claimed detectability levels of the existing methods for string tension, $Gμ$. They can make $3σ$ detection of strings with $Gμ\gtrsim 2.1\times 10^{-10}$ for noise-free, $0.9'$-resolution CMB observations. The minimum detectable tension increases to $Gμ\gtrsim 3.0\times 10^{-8}$ for a more realistic, CMB S4-like (II) strategy, still a significant improvement over the previous results.
Submitted 12 January, 2018;
originally announced January 2018.
-
Multi-Scale Pipeline for the Search of String-Induced CMB Anisotropies
Authors:
A. Vafaei Sadr,
S. M. S. Movahed,
M. Farhang,
C. Ringeval,
F. R. Bouchet
Abstract:
We propose a multi-scale edge-detection algorithm to search for the Gott-Kaiser-Stebbins imprints of a cosmic string (CS) network on the Cosmic Microwave Background (CMB) anisotropies. Curvelet decomposition and an extended Canny algorithm are used to enhance the string detectability. Various statistical tools are then applied to quantify the deviation of CMB maps having a cosmic string contribution with respect to pure Gaussian anisotropies of inflationary origin. These statistical measures include the one-point probability density function, the weighted two-point correlation function (TPCF) of the anisotropies, the unweighted TPCF of the peaks and of the up-crossing map, as well as their cross-correlation. We use this algorithm on a hundred simulated Nambu-Goto CMB flat sky maps, covering approximately $10\%$ of the sky, and for different string tensions $Gμ$. On noiseless sky maps with an angular resolution of $0.9'$, we show that our pipeline detects CSs with $Gμ$ as low as $Gμ\gtrsim 4.3\times 10^{-10}$. At the same resolution, but with a noise level typical of a CMB-S4 phase II experiment, the detection threshold would be $Gμ\gtrsim 1.2 \times 10^{-7}$.
Submitted 30 September, 2017;
originally announced October 2017.
-
Primordial anisotropies from cosmic strings during inflation
Authors:
Sadra Jazayeri,
Alireza Vafaei Sadr,
Hassan Firouzjahi
Abstract:
In this work we study the imprints of a primordial cosmic string on the inflationary power spectrum. A cosmic string induces two distinct contributions to the curvature perturbation power spectrum. The first type of correction respects translation invariance while violating isotropy. This generates quadrupolar statistical anisotropy in CMB maps, which is constrained by the Planck data. The second contribution breaks both homogeneity and isotropy, generating a dipolar power asymmetry in the variance of temperature fluctuations with its amplitude falling on small scales. We show that the strongest constraint on the tension of the string is obtained from the quadrupolar anisotropy and argue that the mass scale of the underlying theory responsible for the formation of the string cannot be much higher than the GUT scale. The predictions of the string for the diagonal and off-diagonal components of the CMB angular power spectrum are presented.
Submitted 16 March, 2017;
originally announced March 2017.