Astrophysics > Astrophysics of Galaxies
[Submitted on 11 Oct 2019]
Title:Quasar and galaxy classification in Gaia Data Release 2
View PDFAbstract:We construct a supervised classifier based on Gaussian Mixture Models to probabilistically classify objects in Gaia data release 2 (GDR2) using only photometric and astrometric data in that release. The model is trained empirically to classify objects into three classes -- star, quasar, galaxy -- for G<=14.5 mag down to the Gaia magnitude limit of G=21.0 mag. Galaxies and quasars are identified for the training set by a cross-match to objects with spectroscopic classifications from the Sloan Digital Sky Survey. Stars are defined directly from GDR2. When allowing for the expectation that quasars are 500 times rarer than stars, and galaxies 7500 times rarer than stars (the class imbalance problem), samples classified with a threshold probability of 0.5 are predicted to have purities of 0.43 for quasars and 0.28 for galaxies, and completenesses of 0.58 and 0.72 respectively. The purities can be increased up to 0.60 by adopting a higher threshold. Not accounting for this expected low frequency of extragalactic objects (the class prior) would give both erroneously optimistic performance predictions and severely impure samples. Applying our model to all 1.20 billion objects in GDR2 with the required features, we classify 2.3 million objects as quasars and 0.37 million objects as galaxies (with individual probabilities above 0.5). The small number of galaxies is due to the strong bias of the satellite detection algorithm and on-ground data selection against extended objects. We infer the true number of quasars and galaxies -- as these classes are defined by our training set -- to be 690,000 and 110,000 respectively (+/- 50%). The aim of this work is to see how well extragalactic objects can be classified using only GDR2 data. Better classifications should be possible with the low resolution spectroscopy (BP/RP) planned for GDR3.
Submission history
From: Coryn Bailer-Jones [view email][v1] Fri, 11 Oct 2019 15:43:20 UTC (6,608 KB)
Current browse context:
astro-ph.GA
Change to browse by:
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
IArxiv Recommender
(What is IArxiv?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.