-
UltraPINK -- New possibilities to explore Self-Organizing Kohonen Maps
Authors:
Fenja Kollasch,
Kai Polsterer
Abstract:
Unsupervised learning algorithms like self-organizing Kohonen maps are a promising approach to gaining an overview of massive datasets. With UltraPINK, researchers can train, inspect, and explore self-organizing maps, while the toolbox of interaction possibilities grows continually. A key feature of UltraPINK is its consideration of the versatility of astronomical data. By keeping the operations as abstract as possible and using design patterns meant for abstract usage, we ensure that data is compatible with UltraPINK, regardless of its type, formatting, or origin. Future work on the application will keep extending the catalogue of exploration tools and the interfaces towards other established applications to process astronomical data. Ultimately, we aim towards a solid infrastructure for data analysis in astronomy.
Submitted 6 June, 2024;
originally announced June 2024.
-
Spherinator and HiPSter: Representation Learning for Unbiased Knowledge Discovery from Simulations
Authors:
Kai L. Polsterer,
Bernd Doser,
Andreas Fehlner,
Sebastian Trujillo-Gomez
Abstract:
Simulations are the best approximation to experimental laboratories in astrophysics and cosmology. However, the complexity, richness, and large size of their outputs severely limit the interpretability of their predictions. We describe a new, unbiased, and machine learning based approach to obtaining useful scientific insights from a broad range of simulations. The method can be used on today's largest simulations and will be essential to solve the extreme data exploration and analysis challenges posed by the Exascale era. Furthermore, this concept is so flexible that it will also enable explorative access to observed data. Our concept is based on applying nonlinear dimensionality reduction to learn compact representations of the data in a low-dimensional space. The simulation data is projected onto this space for interactive inspection, visual interpretation, sample selection, and local analysis. We present a prototype using a rotationally invariant hyperspherical variational convolutional autoencoder, utilizing a power distribution in the latent space, and trained on galaxies from the IllustrisTNG simulation. Thereby, we obtain a natural Hubble tuning fork-like similarity space that can be visualized interactively on the surface of a sphere by exploiting the power of HiPS tilings in Aladin Lite.
Submitted 6 June, 2024;
originally announced June 2024.
-
Rotation and flipping invariant self-organizing maps with astronomical images: A cookbook and application to the VLA Sky Survey QuickLook images
Authors:
A. N. Vantyghem,
T. J. Galvin,
B. Sebastian,
C. P. O'Dea,
Y. A. Gordon,
M. Boyce,
L. Rudnick,
K. Polsterer,
Heinz Andernach,
M. Dionyssiou,
P. Venkataraman,
R. Norris,
S. A. Baum,
X. R. Wang,
M. Huynh
Abstract:
Modern wide-field radio surveys typically detect millions of objects. Techniques based on machine learning are proving to be useful for classifying large numbers of objects. The self-organizing map (SOM) is an unsupervised machine learning algorithm that projects a many-dimensional dataset onto a two- or three-dimensional lattice of neurons. This dimensionality reduction allows the user to visualize common features of the data better and develop algorithms for classifying objects that are not otherwise possible with large datasets. To this aim, we use the PINK implementation of a SOM. PINK incorporates rotation and flipping invariance so that the SOM algorithm may be applied to astronomical images. In this cookbook we provide instructions for working with PINK, including preprocessing the input images, training the model, and offering lessons learned through experimentation. The problem of imbalanced classes can be mitigated by careful selection of the training sample and increasing the number of neurons in the SOM (chosen by the user). Because PINK is not scale-invariant, structure can be smeared in the neurons. This can also be improved by increasing the number of neurons in the SOM. We also introduce pyink, a Python package used to read and write PINK binary files, assist in common preprocessing operations, perform standard analyses, visualize the SOM and preprocessed images, and create image-based annotations using a graphical interface. A tutorial is also provided to guide the user through the entire process. We present an application of PINK to VLA Sky Survey (VLASS) images. We demonstrate that PINK is generally able to group VLASS sources with similar morphology together. We use the results of PINK to estimate the probability that a given source in the VLASS QuickLook Catalogue is actually due to sidelobe contamination.
Submitted 15 April, 2024;
originally announced April 2024.
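The abstract above describes a SOM whose matching step is invariant to rotation and flipping. A minimal sketch of that idea follows, under loud assumptions: it is not PINK's or pyink's code, it only tries the four 90-degree rotations and their mirror images rather than finer rotation steps, and all function names (`rotate90`, `flip`, `variants`, `best_matching_unit`) are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]  # square image stored as a list of rows


def rotate90(img: Image) -> Image:
    """Rotate a square image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip(img: Image) -> Image:
    """Mirror an image left to right."""
    return [row[::-1] for row in img]


def variants(img: Image) -> List[Image]:
    """Return the 8 copies of img: 4 rotations, each with and without a flip."""
    out = []
    current = img
    for _ in range(4):
        out.append(current)
        out.append(flip(current))
        current = rotate90(current)
    return out


def sq_dist(a: Image, b: Image) -> float:
    """Sum of squared pixel differences between two equally sized images."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_matching_unit(img: Image, neurons: List[Image]) -> Tuple[int, float]:
    """Find the neuron closest to img, minimising over all variants of img.

    Assumption: neurons and img share the same square shape.
    """
    best_idx, best_d = -1, float("inf")
    for idx, neuron in enumerate(neurons):
        d = min(sq_dist(v, neuron) for v in variants(img))
        if d < best_d:
            best_idx, best_d = idx, d
    return best_idx, best_d
```

Because the minimum is taken over every transformed copy of the input, two images that differ only by one of these rotations or flips map to the same neuron at the same distance, which is the property the abstract relies on for astronomical images with no preferred orientation.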
-
PSF quality metrics in the problem of revealing Intermediate-Mass Black Holes using MICADO@ELT
Authors:
Mariia Demianenko,
Joerg-Uwe Pott,
Kai Polsterer
Abstract:
Nowadays, astronomers perform point spread function (PSF) fitting for most types of observational data. Interpolation of the PSF is often an intermediate step in such algorithms. In the case of the Multi-AO Imaging Camera for Deep Observations (MICADO) at the Extremely Large Telescope (ELT), PSF interpolation will play a crucial role in high-precision astrometry for stellar clusters and in confirming the presence of Intermediate-Mass Black Holes (IMBHs). Significant PSF variations across the field of view invalidate the approach of deconvolution with a mean PSF or on-axis PSF. Ignoring PSF variations can be especially unsatisfactory in the case of Single Conjugate Adaptive Optics (SCAO) observations, as these sophisticated and expensive systems are designed to achieve high resolution with ground-based telescopes by correcting for atmospheric turbulence in the direction of one reference star. Many tasks raise the question of how to establish the quality of PSF fitting or interpolation. Our study aims to demonstrate the variety of PSF quality metrics, including the problem of revealing IMBHs in stellar clusters.
Submitted 9 April, 2024;
originally announced April 2024.
-
A Gaussian process cross-correlation approach to time delay estimation in active galactic nuclei
Authors:
F. Pozo Nuñez,
N. Gianniotis,
K. L. Polsterer
Abstract:
We present a probabilistic cross-correlation approach to estimate time delays in the context of reverberation mapping (RM) of Active Galactic Nuclei (AGN). We reformulate the traditional interpolated cross-correlation method as a statistically principled model that delivers a posterior distribution for the delay. The method employs Gaussian processes as a model for observed AGN light curves. We describe the mathematical formalism and demonstrate the new approach using both simulated light curves and available RM observations. The proposed method delivers a posterior distribution for the delay that accounts for observational noise and the non-uniform sampling of the light curves. This feature allows us to fully quantify its uncertainty and propagate it to subsequent calculations of dependent physical quantities, e.g., black hole masses. It delivers out-of-sample predictions, which enables us to subject it to model selection, and it can calculate the joint posterior delay for more than two light curves. Because of the numerous advantages of our reformulation and the simplicity of its application, we anticipate that our method will find favour not only in the specialised community of RM, but in all fields where cross-correlation analysis is performed. We provide the algorithms and examples of their application as part of our Julia GPCC package.
Submitted 11 April, 2023;
originally announced April 2023.
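For orientation, the traditional interpolated cross-correlation that the abstract says it reformulates can be sketched as follows. This is not the GPCC package and not the probabilistic method of the paper: it returns a single best lag with no posterior, it ignores measurement noise, and the function names (`interp`, `pearson`, `iccf_delay`) and the choice of linear interpolation are assumptions made for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple


def interp(t: float, ts: Sequence[float], ys: Sequence[float]) -> Optional[float]:
    """Linearly interpolate ys(ts) at time t; None if t is outside the sampled range.

    Assumes ts is sorted in strictly increasing order.
    """
    if t < ts[0] or t > ts[-1]:
        return None
    for i in range(len(ts) - 1):
        if ts[i] <= t <= ts[i + 1]:
            w = (t - ts[i]) / (ts[i + 1] - ts[i])
            return ys[i] * (1.0 - w) + ys[i + 1] * w
    return None


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def iccf_delay(t1, y1, t2, y2, lags) -> Tuple[Optional[float], float]:
    """Return the trial lag that maximises the correlation of y1(t) with y2(t + lag)."""
    best_lag, best_r = None, -math.inf
    for lag in lags:
        a, b = [], []
        for t, y in zip(t1, y1):
            v = interp(t + lag, t2, y2)  # second curve evaluated at the shifted time
            if v is not None:
                a.append(y)
                b.append(v)
        if len(a) < 3:  # too little overlap to correlate
            continue
        r = pearson(a, b)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r
```

The sketch makes the limitation motivating the paper visible: the output is one number per pair of light curves, so uncertainty from noise and irregular sampling has to be estimated separately, whereas the abstract describes a model that yields a posterior distribution for the delay directly.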
-
Modeling photometric reverberation mapping data for the next generation of big data surveys. Quasar accretion disks sizes with the LSST
Authors:
F. Pozo Nuñez,
C. Bruckmann,
S. Desamutara,
B. Czerny,
S. Panda,
A. P. Lobban,
G. Pietrzyński,
K. L. Polsterer
Abstract:
Photometric reverberation mapping can detect the radial extent of the accretion disc (AD) in Active Galactic Nuclei by measuring the time delays between light curves observed in different continuum bands. Quantifying the constraints on the efficiency and accuracy of the delay measurements is important for recovering the AD size-luminosity relation, and potentially using quasars as standard candles. We have explored the possibility of determining the AD size of quasars using next-generation Big Data surveys. We focus on the Legacy Survey of Space and Time (LSST) at the Vera C. Rubin Observatory, which will observe several thousand quasars with the Deep Drilling Fields and up to 10 million quasars for the main survey in six broadband filters during its 10-year operational lifetime. We have developed extensive simulations that take into account the characteristics of the LSST survey and the intrinsic properties of the quasars. The simulations are used to characterise the light curves from which AD sizes are determined using various algorithms. We find that the time delays can be recovered with an accuracy of 5 and 15% for light curves with a time sampling of 2 and 5 days, respectively. The results depend strongly on the redshift of the source and the relative contribution of the emission lines to the bandpasses. Assuming an optically thick and geometrically thin AD, the recovered time-delay spectrum is consistent with black hole masses derived with 30% uncertainty.
Submitted 24 January, 2023; v1 submitted 18 December, 2022;
originally announced December 2022.
-
Applications of AI in Astronomy
Authors:
S. G. Djorgovski,
A. A. Mahabal,
M. J. Graham,
K. Polsterer,
A. Krone-Martins
Abstract:
We provide a brief, and inevitably incomplete, overview of the use of Machine Learning (ML) and other AI methods in astronomy, astrophysics, and cosmology. Astronomy entered the big data era with the first digital sky surveys in the early 1990s and the resulting Terascale data sets, which required the automation of many data processing and analysis tasks, for example the star-galaxy separation, with billions of feature vectors in hundreds of dimensions. The exponential data growth continued with the rise of synoptic sky surveys and Time Domain Astronomy, with the resulting Petascale data streams and the need for real-time processing, classification, and decision making. A broad variety of classification and clustering methods have been applied for these tasks, and this remains a very active area of research. Over the past decade we have seen an exponential growth of the astronomical literature involving a variety of ML/AI applications of ever-increasing complexity and sophistication. ML and AI are now a standard part of the astronomical toolkit. As the data complexity continues to increase, we anticipate further advances leading towards collaborative human-AI discovery.
Submitted 2 December, 2022;
originally announced December 2022.
-
Convolutional autoencoders for spatially-informed ensemble post-processing
Authors:
Sebastian Lerch,
Kai L. Polsterer
Abstract:
Ensemble weather predictions typically show systematic errors that have to be corrected via post-processing. Even state-of-the-art post-processing methods based on neural networks often rely solely on location-specific predictors that require an interpolation of the physical weather model's spatial forecast fields to the target locations. However, useful predictability information contained in large-scale spatial structures within the input fields is potentially lost in this interpolation step. Therefore, we propose the use of convolutional autoencoders to learn compact representations of spatial input fields which can then be used to augment location-specific information as additional inputs to post-processing models. The benefits of including this spatial information are demonstrated in a case study of 2-m temperature forecasts at surface stations in Germany.
Submitted 8 April, 2022;
originally announced April 2022.
-
Disentangling the optical AGN and Host-galaxy luminosity with a probabilistic Flux Variation Gradient
Authors:
N. Gianniotis,
F. Pozo Nuñez,
K. L. Polsterer
Abstract:
We present a novel Probabilistic Flux Variation Gradient (PFVG) approach to separate the contributions of active galactic nuclei (AGN) and host galaxies in the context of photometric reverberation mapping (PRM) of AGN. We explored the ability to recover the fractional contribution in a model-independent way using the entire set of light curves obtained through different filters and photometric apertures simultaneously. The method is based on the observed bluer-when-brighter phenomenon that is attributed to the superimposition of a two-component structure: the red host galaxy, which is constant in time, and the varying blue AGN. We describe the PFVG mathematical formalism and demonstrate its performance using simulated light curves and available PRM observations. The new probabilistic approach is able to recover host-galaxy fluxes to within 1% precision as long as the light curves do not show a significant contribution from time delays. This represents a significant improvement with respect to previous applications of the traditional FVG method to PRM data. The proposed PFVG provides an efficient and accurate way to separate the AGN and host-galaxy luminosities in PRM monitoring data. The method will be especially helpful in the case of large upcoming photometric survey telescopes such as the public optical/near-infrared Legacy Survey of Space and Time (LSST) at the Vera C. Rubin Observatory. Finally, we have made the algorithms freely available as part of our Julia PFVG package.
Submitted 28 October, 2021; v1 submitted 7 September, 2021;
originally announced September 2021.
-
From Photometric Redshifts to Improved Weather Forecasts: machine learning and proper scoring rules as a basis for interdisciplinary work
Authors:
Kai Lars Polsterer,
Antonio D'Isanto,
Sebastian Lerch
Abstract:
The amount, size, and complexity of astronomical data-sets and databases have grown rapidly in recent decades, due to new technologies and dedicated survey telescopes. Besides dealing with poly-structured and complex data, sparse data has become a field of growing scientific interest. A specific field of Astroinformatics research is the estimation of redshifts of extra-galactic sources by using sparse photometric observations. Many techniques have been developed to produce those estimates with increasing precision. In recent years, models have been favored which, instead of providing only a point estimate, are able to generate probability density functions (PDFs) in order to characterize and quantify the uncertainties of their estimates.
Crucial to the development of those models is a proper, mathematically principled way to evaluate and characterize their performance, based on scoring functions as well as on tools for assessing calibration. Still, the literature continues to use inappropriate methods to express the quality of the estimates; these are often insufficient and can generate misleading interpretations. In this work we summarize how to correctly evaluate errors and forecast quality when dealing with PDFs. We describe the use of the log-likelihood, the continuous ranked probability score (CRPS) and the probability integral transform (PIT) to characterize the calibration as well as the sharpness of predicted PDFs. We present what we achieved when using proper scoring rules to train deep neural networks as well as to evaluate the model estimates, and how this work led from well-calibrated redshift estimates to improvements in probabilistic weather forecasting. The presented work is an example of interdisciplinarity in data science and illustrates how methods can help to bridge gaps between different fields of application.
Submitted 5 March, 2021;
originally announced March 2021.
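The three evaluation tools named in the abstract above (log-likelihood, CRPS, PIT) can be illustrated for the simplest case of a Gaussian predictive PDF. The Gaussian assumption and the function names are choices made here for illustration and are not taken from the paper; the CRPS expression is the standard closed form for a normal distribution.

```python
import math


def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def neg_log_likelihood(mu: float, sigma: float, y: float) -> float:
    """Negative log-likelihood of observation y under N(mu, sigma^2); lower is better."""
    z = (y - mu) / sigma
    return 0.5 * z * z + math.log(sigma) + 0.5 * math.log(2.0 * math.pi)


def crps_gaussian(mu: float, sigma: float, y: float) -> float:
    """Closed-form CRPS of a Gaussian forecast N(mu, sigma^2); lower is better."""
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * norm_cdf(z) - 1.0)
                    + 2.0 * norm_pdf(z)
                    - 1.0 / math.sqrt(math.pi))


def pit_gaussian(mu: float, sigma: float, y: float) -> float:
    """PIT value: the predictive CDF evaluated at the observation."""
    return norm_cdf((y - mu) / sigma)
```

For a calibrated set of forecasts the PIT values collected over many objects should look uniform on [0, 1]; a U-shaped histogram suggests predicted PDFs that are too narrow and a hump-shaped one suggests PDFs that are too broad. The CRPS has the same units as the observation and, for a forecast centred on the truth with sigma = 1, evaluates to about 0.234.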
-
Unveiling the rarest morphologies of the LOFAR Two-metre Sky Survey radio source population with self-organised maps
Authors:
Rafaël I. J. Mostert,
Kenneth J. Duncan,
Huub J. A. Röttgering,
Kai L. Polsterer,
Philip N. Best,
Marisa Brienza,
Marcus Brüggen,
Martin J. Hardcastle,
Nika Jurlin,
Beatriz Mingo,
Raffaella Morganti,
Tim Shimwell,
Dan Smith,
Wendy L. Williams
Abstract:
The Low Frequency Array (LOFAR) Two-metre Sky Survey (LoTSS) is a low-frequency radio continuum survey of the Northern sky at an unparalleled resolution and sensitivity. In order to fully exploit this huge dataset and those produced by the Square Kilometre Array in the next decade, automated methods in machine learning and data-mining will be increasingly essential both for morphological classifications and for identifying optical counterparts to the radio sources. Using self-organising maps (SOMs), a form of unsupervised machine learning, we created a dimensionality reduction of the radio morphologies for the $\sim$25k extended radio continuum sources in the LoTSS first data release, which is only $\sim$2 percent of the final LoTSS survey. We made use of PINK, a code which extends the SOM algorithm with rotation and flipping invariance, increasing its suitability and effectiveness for training on astronomical sources. After training, the SOMs can be used for a wide range of science exploitation and we present an illustration of their potential by finding an arbitrary number of morphologically rare sources in our training data (424 square degrees) and subsequently in an area of the sky ($\sim$5300 square degrees) outside the training data. Objects found in this way span a wide range of morphological and physical categories: extended jets of radio active galactic nuclei, diffuse cluster haloes and relics, and nearby spiral galaxies. Finally, to enable accessible, interactive, and intuitive data exploration, we showcase the LOFAR-PyBDSF Visualisation Tool, which allows users to explore the LoTSS dataset through the trained SOMs.
Submitted 11 November, 2020;
originally announced November 2020.
-
Cataloging the radio-sky with unsupervised machine learning: a new approach for the SKA era
Authors:
T. J. Galvin,
M. Huynh,
R. P. Norris,
X. R. Wang,
E. Hopkins,
K. Polsterer,
N. O. Ralph,
A. N. O'Brien,
G. H. Heald
Abstract:
We develop a new analysis approach towards identifying related radio components and their corresponding infrared host galaxy based on unsupervised machine learning methods. By exploiting PINK, a self-organising map algorithm, we are able to associate radio and infrared sources without the a priori requirement of training labels. We present an example of this method using 894,415 images from the FIRST and WISE surveys centred towards positions described by the FIRST catalogue. We produce a set of catalogues that complement FIRST and describe 802,646 objects, including their radio components and their corresponding AllWISE infrared host galaxy. Using these data products we (i) demonstrate the ability to identify objects with rare and unique radio morphologies (e.g. 'X'-shaped galaxies, hybrid FR-I/FR-II morphologies), (ii) identify the potentially resolved radio components that are associated with a single infrared host, (iii) introduce a "curliness" statistic to search for bent and disturbed radio morphologies, and (iv) extract a set of 17 giant radio galaxies between 700-1100 kpc. As we require no training labels, our method can be applied to any radio-continuum survey, provided a sufficiently representative SOM can be trained.
Submitted 26 June, 2020;
originally announced June 2020.
-
Optical continuum photometric reverberation mapping of the Seyfert-1 galaxy Mrk509
Authors:
F. Pozo Nuñez,
N. Gianniotis,
J. Blex,
T. Lisow,
R. Chini,
K. L. Polsterer,
J. -U. Pott,
J. Esser,
G. Pietrzyński
Abstract:
We present the results of a two-year optical continuum photometric reverberation mapping campaign carried out on the nucleus of the Seyfert-1 galaxy Mrk509. Specially designed narrow-band filters were used in order to mitigate the line and pseudo-continuum contamination of the signal from the broad line region, while allowing for high-accuracy flux-calibration over a large field of view. We obtained light curves with a sub-day time sampling and typical flux uncertainties of $1\%$. The high photometric precision allowed us to measure inter-band continuum time delays of up to $\sim 2$ days across the optical range. The time delays are consistent with the relation $\tau \propto \lambda^{4/3}$ predicted for an optically thick and geometrically thin accretion disk model. The size of the disk is, however, a factor of 1.8 larger than predictions based on the standard thin-disk theory. We argue that, for the particular case of Mrk509, a larger black hole mass due to the unknown geometry scaling factor can reconcile the difference between the observations and theory.
Submitted 21 December, 2019;
originally announced December 2019.
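The scaling relation quoted in the abstract above, with delay proportional to wavelength to the power 4/3, can be turned into a small worked calculation. The sketch below only evaluates that power law; the reference values passed in are placeholders chosen by the caller, not measurements from the paper, and both function names are hypothetical.

```python
def scaled_delay(tau_ref: float, lambda_ref: float, lam: float,
                 beta: float = 4.0 / 3.0) -> float:
    """Delay at wavelength lam, given a delay tau_ref at lambda_ref and tau ~ lambda**beta."""
    return tau_ref * (lam / lambda_ref) ** beta


def interband_delay(tau_ref: float, lambda_ref: float, lam: float,
                    beta: float = 4.0 / 3.0) -> float:
    """Delay of the band at lam relative to the reference band at lambda_ref."""
    return scaled_delay(tau_ref, lambda_ref, lam, beta) - tau_ref
```

Under this relation, doubling the wavelength multiplies the delay by 2^(4/3), roughly 2.52, so inter-band delays grow quickly towards the red. Any wavelength unit can be used as long as `lambda_ref` and `lam` share it.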
-
Radio Galaxy Zoo: Knowledge Transfer Using Rotationally Invariant Self-Organising Maps
Authors:
T. J. Galvin,
M. Huynh,
R. P. Norris,
X. R. Wang,
E. Hopkins,
O. I. Wong,
S. Shabala,
L. Rudnick,
M. J. Alger,
K. L. Polsterer
Abstract:
With the advent of large-scale surveys, the manual analysis and classification of individual radio source morphologies is rendered impossible as existing approaches do not scale. The analysis of complex morphological features in the spatial domain is a particularly important task. Here we discuss the challenges of transferring crowdsourced labels obtained from the Radio Galaxy Zoo project and introduce a proper transfer mechanism via quantile random forest regression. By using parallelized rotation and flipping invariant Kohonen-maps, image cubes of Radio Galaxy Zoo selected galaxies formed from the FIRST radio continuum and WISE infrared all-sky surveys are first projected down to a two-dimensional embedding in an unsupervised way. This embedding can be seen as a discretised space of shapes with the coordinates reflecting morphological features as expressed by the automatically derived prototypes. We find that these prototypes have reconstructed physically meaningful processes across two-channel images at radio and infrared wavelengths in an unsupervised manner. In the second step, images are compared with those prototypes to create a heat-map, which is the morphological fingerprint of each object and the basis for transferring the user-generated labels. These heat-maps have reduced the feature space by a factor of 248 and can be used as the basis for subsequent ML methods. Using an ensemble of decision trees we achieve upwards of 85.7% and 80.7% accuracy when predicting the number of components and peaks in an image, respectively, using these heat-maps. We also question the currently used discrete classification schema and introduce a continuous scale that better reflects the uncertainty in transition between two classes, caused by sensitivity and resolution limits.
Submitted 5 April, 2019;
originally announced April 2019.
-
A Comparison of Photometric Redshift Techniques for Large Radio Surveys
Authors:
Ray P. Norris,
M. Salvato,
G. Longo,
M. Brescia,
T. Budavari,
S. Carliles,
S. Cavuoti,
D. Farrah,
J. Geach,
K. Luken,
A. Musaeva,
K. Polsterer,
G. Riccio,
N. Seymour,
V. Smolčić,
M. Vaccari,
P. Zinn
Abstract:
Future radio surveys will generate catalogues of tens of millions of radio sources, for which redshift estimates will be essential to achieve many of the science goals. However, spectroscopic data will be available for only a small fraction of these sources, and in most cases even the optical and infrared photometry will be of limited quality. Furthermore, radio sources tend to be at higher redshift than most optical sources, and so a significant fraction of radio source hosts differ from those for which most photometric redshift templates are designed. We therefore need to develop new techniques for estimating the redshifts of radio sources. As a starting point in this process, we evaluate a number of machine-learning techniques for estimating redshift, together with a conventional template-fitting technique. We pay special attention to how the performance is affected by the incompleteness of the training sample and by sparseness of the parameter space or by limited availability of ancillary multi-wavelength data. As expected, we find that the quality of the photometric redshifts degrades as the quality of the photometry decreases, but that even with the limited quality of photometry available for all-sky surveys, useful redshift information is available for the majority of sources, particularly at low redshift. We find that a template-fitting technique performs best with high-quality and almost complete multi-band photometry, especially if radio sources that are also X-ray emitting are treated separately. When we reduced the quality of photometry to match that available for the EMU all-sky radio survey, the quality of the template-fitting degraded and became comparable to some of the machine learning methods. Machine learning techniques currently perform better at low redshift than at high redshift, because of incompleteness of the currently available training data at high redshifts.
Submitted 13 February, 2019;
originally announced February 2019.
-
Return of the features. Efficient feature selection and interpretation for photometric redshifts
Authors:
Antonio D'Isanto,
Stefano Cavuoti,
Fabian Gieseke,
Kai Lars Polsterer
Abstract:
The explosion of data in recent years has generated an increasing need for new analysis techniques in order to extract knowledge from massive datasets. Machine learning has proved particularly useful to perform this task. Fully automatized methods have recently gathered great popularity, even though those methods often lack physical interpretability. In contrast, feature based approaches can provide both well-performing models and understandable causalities with respect to the correlations found between features and physical processes. Efficient feature selection is an essential tool to boost the performance of machine learning models. In this work, we propose a forward selection method in order to compute, evaluate, and characterize better performing features for regression and classification problems. Given the importance of photometric redshift estimation, we adopt it as our case study. We synthetically created 4,520 features by combining magnitudes, errors, radii, and ellipticities of quasars, taken from the SDSS. We apply a forward selection process, a recursive method in which a huge number of feature sets is tested through a kNN algorithm, leading to a tree of feature sets. The branches of the tree are then used to perform experiments with the random forest, in order to validate the best set with an alternative model. We demonstrate that the sets of features determined with our approach improve the performances of the regression models significantly when compared to the performance of the classic features from the literature. The found features are unexpected and surprising, being very different from the classic features. Therefore, a method to interpret some of the found features in a physical context is presented. The methodology described here is very general and can be used to improve the performance of machine learning models for any regression or classification task.
Submitted 9 May, 2018; v1 submitted 27 March, 2018;
originally announced March 2018.
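The forward selection summarised in the abstract above can be illustrated with a small sketch. It is a simplified, greedy single-branch variant (not the tree of feature sets the abstract mentions), it uses a brute-force kNN regressor scored by leave-one-out RMSE, and every function name and the toy data are hypothetical choices made for illustration only.

```python
import math

def knn_predict(train_X, train_y, query, k):
    """Average the targets of the k nearest training rows (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def loo_rmse(X, y, columns, k=3):
    """Leave-one-out RMSE of kNN regression restricted to the given feature columns."""
    rows = [[row[c] for c in columns] for row in X]
    sq_err = 0.0
    for i in range(len(rows)):
        train_X = rows[:i] + rows[i + 1:]
        train_y = y[:i] + y[i + 1:]
        sq_err += (knn_predict(train_X, train_y, rows[i], k) - y[i]) ** 2
    return math.sqrt(sq_err / len(rows))

def forward_select(X, y, max_features, k=3):
    """Greedily add the feature that lowers the validation error the most."""
    selected, best_err = [], float("inf")
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scores = [(loo_rmse(X, y, selected + [c], k), c) for c in remaining]
        err, col = min(scores)
        if err >= best_err:
            break  # no candidate improves the error: stop growing the set
        selected.append(col)
        remaining.remove(col)
        best_err = err
    return selected, best_err

# Toy data (assumption): the target depends on column 0 only;
# column 1 is an unrelated, scrambled quantity.
X = [[i / 10.0, float((i * 7) % 10)] for i in range(30)]
y = [2.0 * row[0] for row in X]
selected, err = forward_select(X, y, max_features=2)
```

On this toy data the sketch keeps only the informative column. A tree-based search as described in the abstract would instead keep several of the best candidates at each step rather than a single one.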
-
Photometric redshift estimation via deep learning
Authors:
Antonio D'Isanto,
Kai Lars Polsterer
Abstract:
The need to analyze the available large synoptic multi-band surveys drives the development of new data-analysis methods. Photometric redshift estimation is one field of application where such new methods have substantially improved the results. Up to now, the vast majority of applied redshift estimation methods have utilized photometric features. We aim to develop a method to derive probabilistic photometric redshift directly from multi-band imaging data, rendering pre-classification of objects and feature extraction obsolete. A modified version of a deep convolutional network was combined with a mixture density network. The estimates are expressed as Gaussian mixture models representing the probability density functions (PDFs) in the redshift space. In addition to the traditional scores, the continuous ranked probability score (CRPS) and the probability integral transform (PIT) were applied as performance criteria. We have adopted a feature based random forest and a plain mixture density network to compare performances on experiments with data from SDSS (DR9). We show that the proposed method is able to predict redshift PDFs independently from the type of source, for example galaxies, quasars or stars. Thereby the prediction performance is better than both presented reference methods and is comparable to results from the literature. The presented method is extremely general and allows us to solve any kind of probabilistic regression problem based on imaging data, for example estimating metallicity or star formation rate of galaxies. This kind of methodology is tremendously important for the next generation of surveys.
Submitted 8 September, 2017; v1 submitted 8 June, 2017;
originally announced June 2017.
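Because the abstract above expresses each redshift estimate as a Gaussian mixture and scores it with the CRPS and the PIT, a short sketch of both scores for a one-dimensional Gaussian mixture may help. It uses the standard closed form CRPS = E|X - z| - 0.5 E|X - X'| for Gaussian mixtures; the function names and the example mixture are assumptions for illustration and are not taken from the paper.

```python
import math

def _phi(x):
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def _Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def _mean_abs(mu, var):
    """E|X| for X ~ N(mu, var); building block of the closed-form CRPS."""
    s = math.sqrt(var)
    return 2.0 * s * _phi(mu / s) + mu * (2.0 * _Phi(mu / s) - 1.0)

def gmm_pit(weights, means, sigmas, z_true):
    """PIT: the predicted mixture CDF evaluated at the true value."""
    return sum(w * _Phi((z_true - m) / s)
               for w, m, s in zip(weights, means, sigmas))

def gmm_crps(weights, means, sigmas, z_true):
    """Closed-form CRPS of a Gaussian mixture against one true value."""
    # E|X - z_true| under the mixture
    first = sum(w * _mean_abs(z_true - m, s * s)
                for w, m, s in zip(weights, means, sigmas))
    # E|X - X'| for two independent draws from the mixture
    second = sum(wi * wj * _mean_abs(mi - mj, si * si + sj * sj)
                 for wi, mi, si in zip(weights, means, sigmas)
                 for wj, mj, sj in zip(weights, means, sigmas))
    return first - 0.5 * second

# Hypothetical bimodal redshift PDF and a true redshift lying in the first mode.
weights, means, sigmas = [0.7, 0.3], [0.5, 2.0], [0.1, 0.3]
crps = gmm_crps(weights, means, sigmas, z_true=0.55)
pit = gmm_pit(weights, means, sigmas, z_true=0.55)
```

A lower CRPS indicates a sharper and better-centred PDF, while PIT values collected over many objects should be uniformly distributed on [0, 1] if the PDFs are well calibrated.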
-
WTF? Discovering the Unexpected in next-generation radio continuum surveys
Authors:
Evan Crawford,
Ray P. Norris,
Kai Polsterer
Abstract:
Most major discoveries in astronomy have come from unplanned discoveries made by surveying the Universe in a new way, rather than by testing a hypothesis or conducting an investigation with planned outcomes. Next generation radio continuum surveys such as the Evolutionary Map of the Universe (EMU: the radio continuum survey on the new Australian SKA Pathfinder telescope), will significantly expand the volume of observational phase space, so we can be reasonably confident that we will stumble across unexpected new phenomena or new types of object. However, the complexity of the instrument and the large data volumes mean that it may be non-trivial to identify them. On the other hand, if we don't, then we may be missing out on the most exciting science results from EMU. We have therefore started a project called "WTF", which explicitly aims to mine EMU data to discover unexpected science that is not part of our primary science goals, using a variety of machine-learning techniques and algorithms. Although targeted specifically at EMU, we expect this approach will have broad applicability to astronomical survey data.
Submitted 9 November, 2016;
originally announced November 2016.
-
Uncertain Photometric Redshifts
Authors:
Kai Lars Polsterer,
Antonio D'Isanto,
Fabian Gieseke
Abstract:
Photometric redshifts play an important role as a measure of distance for various cosmological topics. Spectroscopic redshifts are only available for a very limited number of objects but can be used for creating statistical models. A broad variety of photometric catalogues provide uncertain low resolution spectral information for galaxies and quasars that can be used to infer a redshift. Many different techniques have been developed to produce those redshift estimates with increasing precision. Instead of providing a point estimate only, astronomers have started to generate probability density functions (PDFs) which should provide a characterisation of the uncertainties of the estimation. In this work we present two simple approaches to generating those PDFs. We use the example of generating the photometric redshift PDFs of quasars from SDSS (DR7) to validate our approaches and to compare them with point estimates. We do not aim to present a new best-performing method, but we choose an intuitive approach that is based on well-known machine learning algorithms. Furthermore we introduce proper tools for evaluating the performance of PDFs in the context of astronomy. The continuous ranked probability score (CRPS) and the probability integral transform (PIT) are well accepted in the weather forecasting community. Both tools reflect how well the PDFs reproduce the real values of the analysed objects. As we show, nearly all currently used measures in astronomy show severe weaknesses when used to evaluate PDFs.
Submitted 29 August, 2016;
originally announced August 2016.
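When a redshift PDF is available only as a set of samples (for example, one redshift per ensemble member), the CRPS and PIT named in the abstract above can be estimated empirically. The sketch below assumes such a sample-based representation, which is an illustrative choice and not necessarily one of the two approaches of the paper; all function names are hypothetical.

```python
def sample_crps(samples, truth):
    """Empirical CRPS: E|X - truth| - 0.5 * E|X - X'| over the samples."""
    n = len(samples)
    term1 = sum(abs(s - truth) for s in samples) / n
    term2 = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term1 - 0.5 * term2

def sample_pit(samples, truth):
    """Empirical PIT: fraction of samples at or below the true value."""
    return sum(1 for s in samples if s <= truth) / len(samples)

def pit_histogram(pits, bins=10):
    """Bin PIT values on [0, 1]; a flat histogram suggests calibrated PDFs."""
    counts = [0] * bins
    for p in pits:
        counts[min(int(p * bins), bins - 1)] += 1
    return counts

# Hypothetical redshift samples for one object and its spectroscopic redshift.
samples = [1.9, 2.0, 2.1, 2.2]
crps = sample_crps(samples, truth=2.05)
pit = sample_pit(samples, truth=2.05)
```

For a single sample the empirical CRPS reduces to the absolute error, which is what makes it directly comparable with point estimates. A U-shaped PIT histogram points to PDFs that are too narrow, a peaked one to PDFs that are too broad.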
-
A Spectral Model for Multimodal Redshift Estimation
Authors:
Sven D. Kugler,
Nikolaos Gianniotis,
Kai L. Polsterer
Abstract:
We present a physically inspired model for the problem of redshift estimation. Typically, redshift estimation has been treated as a regression problem that takes as input magnitudes and maps them to a single target redshift. In this work we acknowledge the fact that observed magnitudes may actually admit multiple plausible redshifts, i.e. the distribution of redshifts explaining the observed magnitudes (or colours) is multimodal. Hence, employing one of the standard regression models, as is typically done, is insufficient for this kind of problem, as most models implement either one-to-one or many-to-one mappings. The observed multimodality of solutions is a direct consequence of (a) the variety of physical mechanisms that give rise to the observations, (b) the limited number of measurements available and (c) the presence of noise in photometric measurements. Our proposed solution consists in formulating a model from first principles capable of generating spectra. The generated spectra are integrated over filter curves to produce magnitudes which are then matched to the observed magnitudes. The resulting model naturally expresses a multimodal posterior over possible redshifts, includes measurement uncertainty (e.g. missing values) and is shown to perform favourably on a real dataset.
Submitted 20 June, 2016;
originally announced June 2016.
-
Model-Coupled Autoencoder for Time Series Visualisation
Authors:
Nikolaos Gianniotis,
Sven D. Kügler,
Peter Tiňo,
Kai L. Polsterer
Abstract:
We present an approach for the visualisation of a set of time series that combines an echo state network with an autoencoder. For each time series in the dataset we train an echo state network, using a common and fixed reservoir of hidden neurons, and use the optimised readout weights as the new representation. Dimensionality reduction is then performed via an autoencoder on the readout weight representations. The crux of the work is to equip the autoencoder with a loss function that correctly interprets the reconstructed readout weights by associating them with a reconstruction error measured in the data space of sequences. This essentially amounts to measuring the predictive performance that the reconstructed readout weights exhibit on their corresponding sequences when plugged back into the echo state network with the same fixed reservoir. We demonstrate that the proposed visualisation framework can deal both with real valued sequences as well as binary sequences. We derive magnification factors in order to analyse distance preservations and distortions in the visualisation space. The versatility and advantages of the proposed method are demonstrated on datasets of time series that originate from diverse domains.
Submitted 21 January, 2016;
originally announced January 2016.
-
An Explorative Approach for Inspecting Kepler Data
Authors:
S. D. Kügler,
N. Gianniotis,
K. L. Polsterer
Abstract:
The Kepler survey has provided a wealth of astrophysical knowledge by continuously monitoring over 150,000 stars. The resulting database contains thousands of examples of known variability types and at least as many that cannot be classified yet. In order to reveal the knowledge hidden in the database, we introduce a new visualisation method that allows us to inspect time series exploratively. To that end, we propose dimensionality reduction on the parameters of a model capable of representing time series as a fixed-length vector representation. We show that a more refined objective function can be chosen by minimising the prediction error of the data reconstruction instead of the reconstruction of the model parameters. The proposed visualisation exhibits a strong correlation between the variability behaviour of the light curves and their physical properties. As a consequence, temperature and surface gravity can, for some stars, be directly inferred from non- (or quasi-) periodic light curves.
Submitted 4 November, 2015; v1 submitted 14 August, 2015;
originally announced August 2015.
-
Radio Galaxy Zoo: host galaxies and radio morphologies derived from visual inspection
Authors:
J. K. Banfield,
O. I. Wong,
K. W. Willett,
R. P. Norris,
L. Rudnick,
S. S. Shabala,
B. D. Simmons,
C. Snyder,
A. Garon,
N. Seymour,
E. Middelberg,
H. Andernach,
C. J. Lintott,
K. Jacob,
A. D. Kapinska,
M. Y. Mao,
K. L. Masters,
M. J. Jarvis,
K. Schawinski,
E. Paget,
R. Simpson,
H. R. Klockner,
S. Bamford,
T. Burchell,
K. E. Chow
, et al. (11 additional authors not shown)
Abstract:
We present results from the first twelve months of operation of Radio Galaxy Zoo, which upon completion will enable visual inspection of over 170,000 radio sources to determine the host galaxy of the radio emission and the radio morphology. Radio Galaxy Zoo uses $1.4\,$GHz radio images from both the Faint Images of the Radio Sky at Twenty Centimeters (FIRST) and the Australia Telescope Large Area Survey (ATLAS) in combination with mid-infrared images at $3.4\,μ$m from the {\it Wide-field Infrared Survey Explorer} (WISE) and at $3.6\,μ$m from the {\it Spitzer Space Telescope}. We present the early analysis of the WISE mid-infrared colours of the host galaxies. For images in which there is $>\,75\%$ consensus among the Radio Galaxy Zoo cross-identifications, the project participants are as effective as the science experts at identifying the host galaxies. The majority of the identified host galaxies reside in the mid-infrared colour space dominated by elliptical galaxies, quasi-stellar objects (QSOs), and luminous infrared radio galaxies (LIRGs). We also find a distinct population of Radio Galaxy Zoo host galaxies residing in a redder mid-infrared colour space consisting of star-forming galaxies and/or dust-enhanced non star-forming galaxies consistent with a scenario of merger-driven active galactic nuclei (AGN) formation. The completion of the full Radio Galaxy Zoo project will measure the relative populations of these hosts as a function of radio morphology and power while providing an avenue for the identification of rare and extreme radio structures. Currently, we are investigating candidates for radio galaxies with extreme morphologies, such as giant radio galaxies, late-type host galaxies with extended radio emission, and hybrid morphology radio sources.
Submitted 26 July, 2015;
originally announced July 2015.
-
Autoencoding Time Series for Visualisation
Authors:
Nikolaos Gianniotis,
Dennis Kügler,
Peter Tino,
Kai Polsterer,
Ranjeev Misra
Abstract:
We present an algorithm for the visualisation of time series. To that end we employ echo state networks to convert time series into a suitable vector representation which is capable of capturing the latent dynamics of the time series. Subsequently, the obtained vector representations are put through an autoencoder and the visualisation is constructed using the activations of the bottleneck. The crux of the work lies with defining an objective function that quantifies the reconstruction error of these representations in a principled manner. We demonstrate the method on synthetic and real data.
Submitted 5 May, 2015;
originally announced May 2015.
-
Featureless Classification of Light Curves
Authors:
Sven Dennis Kügler,
Nikos Gianniotis,
Kai Lars Polsterer
Abstract:
In the era of rapidly increasing amounts of time series data, classification of variable objects has become the main objective of time-domain astronomy. Classification of irregularly sampled time series is particularly difficult because the data cannot be represented naturally as a vector which can be directly fed into a classifier. In the literature, various statistical features serve as vector representations. In this work, we represent time series by a density model. The density model captures all the information available, including measurement errors. Hence, we view this model as a generalisation to the static features which directly can be derived, e.g., as moments from the density. Similarity between each pair of time series is quantified by the distance between their respective models. Classification is performed on the obtained distance matrix. In the numerical experiments, we use data from the OGLE and ASAS surveys and demonstrate that the proposed representation performs up to par with the best currently used feature-based approaches. The density representation preserves all static information present in the observational data, in contrast to a less complete description by features. The density representation is an upper boundary in terms of information made available to the classifier. Consequently, the predictive power of the proposed classification depends on the choice of similarity measure and classifier, only. Due to its principled nature, we advocate that this new approach of representing time series has potential in tasks beyond classification, e.g., unsupervised learning.
Submitted 20 May, 2015; v1 submitted 17 April, 2015;
originally announced April 2015.
-
Estimating Spectroscopic Redshifts by Using k Nearest Neighbors Regression I. Description of Method and Analysis
Authors:
S. D. Kügler,
K. Polsterer,
M. Hoecker
Abstract:
Context: In astronomy, new approaches to process and analyze the exponentially increasing amount of data are inevitable. While classical approaches (e.g. template fitting) are fine for objects of well-known classes, alternative techniques have to be developed to determine those that do not fit. Therefore a classification scheme should be based on individual properties instead of fitting to a global model and thereby losing valuable information. An important issue when dealing with large data sets is the outlier detection which at the moment is often treated problem-orientated. Aims: In this paper we present a method to statistically estimate the redshift z based on a similarity approach. This allows us to determine redshifts in spectra in emission as well as in absorption without using any predefined model. Additionally we show how an estimate of the redshift based on single features is possible. As a consequence we are e.g. able to filter objects which show multiple redshift components. We propose to apply this general method to all similar problems in order to identify objects where traditional approaches fail. Methods: The redshift estimation is performed by comparing predefined regions in the spectra and applying a k nearest neighbor regression model for every predefined emission and absorption region, individually. Results: We estimated a redshift for more than 50% of the analyzed 16,000 spectra of our reference and test sample. The redshift estimate yields a precision for every individually tested feature that is comparable with the overall precision of the redshifts of SDSS. In 14 spectra we find a significant shift between emission and absorption or emission and emission lines. The results already show the immense power of this simple machine learning approach for investigating huge databases such as the SDSS.
Submitted 6 March, 2015; v1 submitted 30 September, 2014;
originally announced September 2014.
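The per-region estimation described in the Methods part of the abstract above can be sketched as follows: one kNN regression per predefined spectral region, followed by a check for objects whose regions disagree (possible multiple redshift components). The data layout, the region names, the tolerance and all function names are assumptions made for this illustration, not details taken from the paper.

```python
import math

def knn_regress(ref_vectors, ref_redshifts, query, k=3):
    """Mean redshift of the k reference vectors closest to the query."""
    order = sorted(range(len(ref_vectors)),
                   key=lambda i: math.dist(ref_vectors[i], query))
    nearest = order[:k]
    return sum(ref_redshifts[i] for i in nearest) / len(nearest)

def per_region_redshifts(reference, query_regions, k=3):
    """Estimate one redshift per spectral region.

    reference:     {region name: (list of flux vectors, list of redshifts)}
    query_regions: {region name: flux vector of the object to estimate}
    """
    return {name: knn_regress(reference[name][0], reference[name][1], vec, k)
            for name, vec in query_regions.items() if name in reference}

def has_discrepant_regions(estimates, tolerance):
    """Flag objects whose per-region redshifts differ by more than tolerance."""
    values = list(estimates.values())
    return max(values) - min(values) > tolerance

# Toy reference set (assumption): one-dimensional "flux vectors" per region.
reference = {
    "emission":   ([[0.0], [1.0], [2.0], [3.0]], [0.1, 0.2, 0.3, 0.4]),
    "absorption": ([[0.0], [1.0], [2.0], [3.0]], [0.1, 0.2, 0.3, 0.4]),
}
query = {"emission": [1.0], "absorption": [2.0]}
estimates = per_region_redshifts(reference, query, k=1)
flagged = has_discrepant_regions(estimates, tolerance=0.05)
```

Keeping the regions independent is what allows emission and absorption features of the same spectrum to yield different redshifts, so that such objects can be singled out rather than averaged away.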
-
Finding New High-Redshift Quasars by Asking the Neighbours
Authors:
Kai Lars Polsterer,
Peter-Christian Zinn,
Fabian Gieseke
Abstract:
Quasars with a high redshift (z) are important to understand the evolution processes of galaxies in the early universe. However only a few of these distant objects are known to this date. The costs of building and operating a 10-metre class telescope limit the number of facilities and, thus, the available observation time. Therefore an efficient selection of candidates is mandatory. This paper presents a new approach to select quasar candidates with high redshift (z>4.8) based on photometric catalogues. We have chosen to use the z>4.8 limit for our approach because the dominant Lyman alpha emission line of a quasar can only be found in the Sloan i and z-band filters. As part of the candidate selection approach, a photometric redshift estimator is presented, too. Three of the 120,000 generated candidates have been spectroscopically analysed in follow-up observations and a new z=5.0 quasar was found. This result is consistent with the estimated detection ratio of about 50 per cent and we expect 60,000 high-redshift quasars to be part of our candidate sample. The created candidates are available for download at MNRAS or at http://www.astro.rub.de/polsterer/quasar-candidates.csv.
Submitted 26 October, 2012;
originally announced October 2012.
-
Detecting Quasars in Large-Scale Astronomical Surveys
Authors:
Fabian Gieseke,
Kai Lars Polsterer,
Andreas Thom,
Peter-Christian Zinn,
Dominik Bomanns,
Ralf-Jürgen Dettmar,
Oliver Kramer,
Jan Vahrenhold
Abstract:
We present a classification-based approach to identify quasi-stellar radio sources (quasars) in the Sloan Digital Sky Survey and evaluate its performance on a manually labeled training set. While reasonable results can already be obtained via approaches working only on photometric data, our experiments indicate that simple but problem-specific features extracted from spectroscopic data can significantly improve the classification performance. Since our approach works orthogonal to existing classification schemes used for building the spectroscopic catalogs, our classification results are well suited for a mutual assessment of the approaches' accuracies.
Submitted 23 August, 2011;
originally announced August 2011.
-
Infrared Narrow-Band Tomography of the Local Starburst NGC 1569 with LBT/LUCIFER
Authors:
A. Pasquali,
A. Bik,
S. Zibetti,
N. Ageorges,
W. Seifert,
W. Brandner,
H. -W. Rix,
M Juette,
V. Knierim,
P. Buschkamp,
C. Feiz,
H. Gemperlein,
A. Germeroth,
R. Hoffmann,
W. Laun,
R. Lederer,
M. Lehmitz,
R. Lenzen,
U. Mall,
H. Mandel,
P. Mueller,
V. Naranjo,
K. Polsterer,
A. Quirrenbach,
L. Schaeffner
, et al. (2 additional authors not shown)
Abstract:
We used the near-IR imager/spectrograph LUCIFER mounted on the Large Binocular Telescope (LBT) to image, with sub-arcsec seeing, the local dwarf starburst NGC 1569 in the JHK bands and HeI 1.08 micron, [FeII] 1.64 micron and Brgamma narrow-band filters. We obtained high-quality spatial maps of HeI, [FeII] and Brgamma emission across the galaxy, and used them together with HST/ACS images of NGC 1569 in the Halpha filter to derive the two-dimensional spatial map of the dust extinction and surface star formation rate density. We show that dust extinction is rather patchy and, on average, higher in the North-West (NW) portion of the galaxy [E_g(B-V) = 0.71 mag] than in the South-East [E_g(B-V) = 0.57 mag]. Similarly, the surface density of star formation rate peaks in the NW region of NGC 1569, reaching a value of about 4 x 10^-6 M_sun yr^-1 pc^-2. The total star formation rate as estimated from the integrated, dereddened Halpha luminosity is about 0.4 M_sun yr^-1, and the total supernova rate from the integrated, dereddened [FeII] luminosity is about 0.005 yr^-1 (assuming a distance of 3.36 Mpc). The azimuthally averaged [FeII]/Brgamma flux ratio is larger at the edges of the central, gas-deficient cavities (encompassing the super star clusters A and B) and in the galaxy outskirts. If we interpret this line ratio as the ratio between the average past star formation (as traced by supernovae) and on-going activity (represented by OB stars able to ionize the interstellar medium), it would then indicate that star formation has been quenched within the central cavities and lately triggered in a ring around them. The number of ionizing hydrogen and helium photons as computed from the integrated, dereddened Halpha and HeI luminosities suggests that the latest burst of star formation occurred about 4 Myr ago and produced new stars with a total mass of ~1.8 x 10^6 M_sun. [Abridged]
Submitted 17 January, 2011;
originally announced January 2011.
-
Black Hole Mass Estimates Based on CIV are Consistent with Those Based on the Balmer Lines
Authors:
R. J. Assef,
K. D. Denney,
C. S. Kochanek,
B. M. Peterson,
S. Kozlowski,
N. Ageorges,
R. S. Barrows,
P. Buschkamp,
M. Dietrich,
E. Falco,
C. Feiz,
H. Gemperlein,
A. Germeroth,
C. J. Grier,
R. Hofmann,
M. Juette,
R. Khan,
M. Kilic,
V. Knierim,
W. Laun,
R. Lederer,
M. Lehmitz,
R. Lenzen,
U. Mall,
K. K. Madsen
, et al. (17 additional authors not shown)
Abstract:
Using a sample of high-redshift lensed quasars from the CASTLES project with observed-frame ultraviolet or optical and near-infrared spectra, we have searched for possible biases between supermassive black hole (BH) mass estimates based on the CIV, Halpha and Hbeta broad emission lines. Our sample is based upon that of Greene, Peng & Ludwig, expanded with new near-IR spectroscopic observations, consistently analyzed high S/N optical spectra, and consistent continuum luminosity estimates at 5100A. We find that BH mass estimates based on the FWHM of CIV show a systematic offset with respect to those obtained from the line dispersion, sigma_l, of the same emission line, but not with those obtained from the FWHM of Halpha and Hbeta. The magnitude of the offset depends on the treatment of the HeII and FeII emission blended with CIV, but there is little scatter for any fixed measurement prescription. While we otherwise find no systematic offsets between CIV and Balmer line mass estimates, we do find that the residuals between them are strongly correlated with the ratio of the UV and optical continuum luminosities. Removing this dependency reduces the scatter between the UV- and optical-based BH mass estimates by a factor of approximately 2, from roughly 0.35 to 0.18 dex. The dispersion is smallest when comparing the CIV sigma_l mass estimate, after removing the offset from the FWHM estimates, and either Balmer line mass estimate. The correlation with the continuum slope is likely due to a combination of reddening, host contamination and object-dependent SED shapes. When we add additional heterogeneous measurements from the literature, the results are unchanged.
Submitted 30 August, 2011; v1 submitted 6 September, 2010;
originally announced September 2010.