
Showing 1–47 of 47 results for author: Hennig, C

  1. arXiv:2404.13589  [pdf, other]

    stat.ME

    The quantile-based classifier with variable-wise parameters

    Authors: Marco Berrettini, Christian Hennig, Cinzia Viroli

    Abstract: Quantile-based classifiers can classify high-dimensional observations by minimising a discrepancy of an observation to a class based on suitable quantiles of the within-class distributions, corresponding to a unique percentage for all variables. The present work extends these classifiers by introducing a way to determine potentially different optimal percentages for different variables. Furthermor…

    Submitted 21 April, 2024; originally announced April 2024.

  2. arXiv:2401.12126  [pdf, other]

    q-bio.PE stat.AP stat.ME

    Approaches to biological species delimitation based on genetic and spatial dissimilarity

    Authors: Gabriele d'Angella, Christian Hennig

    Abstract: The delimitation of biological species, i.e., deciding which individuals belong to the same species and whether and how many different species are represented in a data set, is key to the conservation of biodiversity. Much existing work uses only genetic data for species delimitation, often employing some kind of cluster analysis. This can be misleading, because geographically distant groups of in…

    Submitted 3 June, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: Paper of 26 pages with 6 figures; appendix of 19 pages with 17 figures. February 2024 update: tiny notation edit, results unchanged. April 2024 update: additional simulation results and plots; introduction and description of the methodologies edited; broader appendix with new charts. June 2024 update: Minor edits in methods description

  3. arXiv:2311.06108  [pdf, other]

    math.ST stat.ML

    Nonparametric consistency for maximum likelihood estimation and clustering based on mixtures of elliptically-symmetric distributions

    Authors: Pietro Coretto, Christian Hennig

    Abstract: The consistency of the maximum likelihood estimator for mixtures of elliptically-symmetric distributions for estimating its population version is shown, where the underlying distribution $P$ is nonparametric and does not necessarily belong to the class of mixtures on which the estimator is based. In a situation where $P$ is a mixture of well enough separated but nonparametric distributions it is s…

    Submitted 26 April, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

    MSC Class: 62H30; 62F35

  4. arXiv:2309.08468  [pdf, other]

    stat.ME stat.CO

    Choice of trimming proportion and number of clusters in robust clustering based on trimming

    Authors: Luis Angel García-Escudero, Christian Hennig, Agustín Mayo-Iscar, Gianluca Morelli, Marco Riani

    Abstract: So-called "classification trimmed likelihood curves" have been proposed as a useful heuristic tool to determine the number of clusters and trimming proportion in trimming-based robust clustering methods. However, these curves need a careful visual inspection, and this way of choosing parameters requires subjective decisions. This work is intended to provide theoretical background for the understa…

    Submitted 15 September, 2023; originally announced September 2023.

  5. arXiv:2308.14478  [pdf, ps, other]

    stat.ML cs.LG stat.ME

    Some issues in robust clustering

    Authors: Christian Hennig

    Abstract: Some key issues in robust clustering are discussed with focus on Gaussian mixture model based clustering, namely the formal definition of outliers, ambiguity between groups of outliers and clusters, the interaction between robust clustering and the estimation of the number of clusters, the essential dependence of (not only) robust clustering on tuning decisions, and shortcomings of existing measur…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: 11 pages, no figures

    MSC Class: 62H30

  6. Onset of a conceptual outline map to get a hold on the jungle of cluster analysis

    Authors: Iven Van Mechelen, Christian Hennig, Henk A. L. Kiers

    Abstract: The domain of cluster analysis is a meeting point for a very rich multidisciplinary encounter, with cluster-analytic methods being studied and developed in discrete mathematics, numerical analysis, statistics, data analysis, data science, and computer science (including machine learning, data mining, and knowledge discovery), to name but a few. The other side of the coin, however, is that the doma…

    Submitted 11 July, 2024; v1 submitted 26 April, 2023; originally announced April 2023.

    Comments: 43 pages, 4 figures

    MSC Class: 62H30

    Journal ref: WIREs Data Mining and Knowledge Discovery, 2024, e1547

  7. arXiv:2204.09793  [pdf, other]

    stat.AP

    Clustering of football players based on performance data and aggregated clustering validity indexes

    Authors: Serhat Akhanli, Christian Hennig

    Abstract: We analyse football (soccer) player performance data with mixed type variables from the 2014-15 season of eight European major leagues. We cluster these data based on a tailor-made dissimilarity measure. In order to decide between the many available clustering methods and to choose an appropriate number of clusters, we use the approach by Akhanli and Hennig (2020). This is based on several valid…

    Submitted 20 April, 2022; originally announced April 2022.

    Comments: 26 pages, 5 figures

    MSC Class: 62H30

  8. arXiv:2108.09243  [pdf, other]

    stat.ME

    A comparison of different clustering approaches for high-dimensional presence-absence data

    Authors: Gabriele d'Angella, Christian Hennig

    Abstract: Presence-absence data is defined by vectors or matrices of zeroes and ones, where the ones usually indicate a "presence" in a certain place. Presence-absence data occur for example when investigating geographical species distributions, genetic information, or the occurrence of certain terms in texts. There are many applications for clustering such data; one example is to find so-called biotic elem…

    Submitted 22 November, 2021; v1 submitted 20 August, 2021; originally announced August 2021.

    Comments: 22 pages, 6 Figures

    MSC Class: 62H30

  9. Parameters not empirically identifiable or distinguishable, including correlation between Gaussian observations

    Authors: Christian Hennig

    Abstract: Note: Accepted version, published in Statistical Papers, https://doi.org/10.1007/s00362-023-01414-3. It is shown that some theoretically identifiable parameters cannot be empirically identified, meaning that no consistent estimator of them can exist. An important example is a constant correlation between Gaussian observations (in presence of such correlation not even the mean can be empirically…

    Submitted 17 April, 2023; v1 submitted 20 August, 2021; originally announced August 2021.

    Comments: 27 pages, no figures

    MSC Class: 62F99; 62G99; 62H30

  10. arXiv:2107.04946  [pdf, other]

    econ.EM stat.ME

    Inference for the proportional odds cumulative logit model with monotonicity constraints for ordinal predictors and ordinal response

    Authors: Javier Espinosa-Brito, Christian Hennig

    Abstract: The proportional odds cumulative logit model (POCLM) is a standard regression model for an ordinal response. Ordinality of predictors can be incorporated by monotonicity constraints for the corresponding parameters. It is shown that estimators defined by optimization, such as maximum likelihood estimators, for an unconstrained model and for parameters in the interior set of the parameter space of…

    Submitted 1 June, 2023; v1 submitted 10 July, 2021; originally announced July 2021.

  11. arXiv:2103.01281  [pdf, ps, other]

    stat.ME

    Validation of cluster analysis results on validation data: A systematic framework

    Authors: Theresa Ullmann, Christian Hennig, Anne-Laure Boulesteix

    Abstract: Cluster analysis refers to a wide range of data analytic techniques for class discovery and is popular in many application fields. To judge the quality of a clustering result, different cluster validation procedures have been proposed in the literature. While there is extensive work on classical validation techniques, such as internal and external validation, less attention has been given to valid…

    Submitted 10 January, 2022; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: 32 pages, 1 figure

  12. arXiv:2102.03645  [pdf, other]

    stat.ME

    An empirical comparison and characterisation of nine popular clustering methods

    Authors: Christian Hennig

    Abstract: Nine popular clustering methods are applied to 42 real data sets. The aim is to give a detailed characterisation of the methods by means of several cluster validation indexes that measure various individual aspects of the resulting clusters such as small within-cluster distances, separation of clusters, closeness to a Gaussian distribution etc. as introduced in Hennig (2019). 30 of the data sets c…

    Submitted 6 February, 2021; originally announced February 2021.

    Comments: 44 pages, 9 Figures

    MSC Class: 62H30

  13. arXiv:2010.07657  [pdf]

    physics.chem-ph

    The missing pieces of the PuO2 nanoparticle puzzle

    Authors: Evgeny Gerber, Anna Yu Romanchuk, Ivan Pidchenko, Lucia Amidani, Andre Rossberg, Christoph Hennig, Gavin B M Vaughan, Alexander Trigub, Tolganay Egorova, Stephen Bauters, Tatiana Plakhova, Myrtille O J Y Hunault, Stephan Weiss, Sergei M Butorin, Andreas C Scheinost, Stepan N Kalmykov, Kristina O Kvashnina

    Abstract: The nanoscience field often produces results more mystifying than any other discipline. It has been argued that changes in the plutonium dioxide (PuO2) particle size from bulk to nano can have a drastic effect on PuO2 properties. Here we report a full characterization of PuO2 nanoparticles (NPs) at the atomic level and probe their local and electronic structures by a variety of methods available a…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: Nanoscale (2020)

  14. arXiv:2009.00921  [pdf, other]

    stat.ME

    An adequacy approach for deciding the number of clusters for OTRIMLE robust Gaussian mixture based clustering

    Authors: Christian Hennig, Pietro Coretto

    Abstract: We introduce a new approach to deciding the number of clusters. The approach is applied to Optimally Tuned Robust Improper Maximum Likelihood Estimation (OTRIMLE; Coretto and Hennig 2016) of a Gaussian mixture model allowing for observations to be classified as "noise", but it can be applied to other clustering methods as well. The quality of a clustering is assessed by a statistic $Q$ that measur…

    Submitted 25 December, 2020; v1 submitted 2 September, 2020; originally announced September 2020.

    Comments: 35 pages, 13 figures

    MSC Class: 62H30

  15. Probability Models in Statistical Data Analysis: Uses, Interpretations, Frequentism-As-Model

    Authors: Christian Hennig

    Abstract: Note: Published now as a chapter in "Handbook of the History and Philosophy of Mathematical Practice" (Springer Nature, editor B. Sriraman, https://doi.org/10.1007/978-3-030-19071-2_105-1). The application of mathematical probability theory in statistics is quite controversial. Controversies regard both the interpretation of probability, and approaches to statistical inference. After having give…

    Submitted 18 November, 2023; v1 submitted 11 July, 2020; originally announced July 2020.

    Comments: 55 pages, no figures. Accepted for publication as a chapter in "Handbook of the History and Philosophy of Mathematical Practice - Practical, Historical and Philosophical Instances of Probability" (Springer Nature, editor Egan Chernoff)

    MSC Class: 62A01

  16. arXiv:2002.01822  [pdf, other]

    stat.ME

    Comparing clusterings and numbers of clusters by aggregation of calibrated clustering validity indexes

    Authors: Serhat Emre Akhanli, Christian Hennig

    Abstract: A key issue in cluster analysis is the choice of an appropriate clustering method and the determination of the best number of clusters. Different clusterings are optimal on the same data set according to different criteria, and the choice of such criteria depends on the context and aim of clustering. Therefore, researchers need to consider what data analytic characteristics the clusters they are a…

    Submitted 23 June, 2020; v1 submitted 5 February, 2020; originally announced February 2020.

    Comments: 42 pages, 11 figures

    MSC Class: 62H30

  17. arXiv:1911.13272  [pdf, ps, other]

    stat.ME

    Minkowski distances and standardisation for clustering and classification of high dimensional data

    Authors: Christian Hennig

    Abstract: There are many distance-based methods for classification and clustering, and for data with a high number of dimensions and a lower number of observations, processing distances is computationally advantageous compared to the raw data matrix. Euclidean distances are used as a default for continuous multivariate data, but there are alternatives. Here the so-called Minkowski distances, $L_1$ (city blo…

    Submitted 23 June, 2020; v1 submitted 29 November, 2019; originally announced November 2019.

    Comments: Preliminary version; final version to be published by Springer, using Springer's svmult LaTeX style

    MSC Class: 62H30

  18. arXiv:1910.11339  [pdf, other]

    stat.ML cs.LG

    Clustering with the Average Silhouette Width

    Authors: Fatima Batool, Christian Hennig

    Abstract: The Average Silhouette Width (ASW; Rousseeuw (1987)) is a popular cluster validation index to estimate the number of clusters. Here we address the question whether it also is suitable as a general objective function to be optimized for finding a clustering. We will propose two algorithms (the standard version OSil and a fast version FOSil) and compare them with existing clustering methods in an ex…

    Submitted 21 November, 2020; v1 submitted 24 October, 2019; originally announced October 2019.

    Comments: 36 pages

    MSC Class: 62H30

    ACM Class: I.5.3

  19. arXiv:1908.02218  [pdf, other]

    stat.ME

    Should we test the model assumptions before running a model-based test?

    Authors: M. Iqbal Shamsudheen, Christian Hennig

    Abstract: Statistical methods are based on model assumptions, and it is statistical folklore that a method's model assumptions should be checked before applying it. This can be formally done by running one or more misspecification tests of model assumptions before running a method that requires these assumptions; here we focus on model-based tests. A combined test procedure can be defined by specifying a pr…

    Submitted 17 April, 2023; v1 submitted 6 August, 2019; originally announced August 2019.

    Comments: 35 pages, 1 figure

    MSC Class: 62F03

  20. arXiv:1905.08876  [pdf, other]

    stat.OT

    Many perspectives on Deborah Mayo's "Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars"

    Authors: Andrew Gelman, Brian Haig, Christian Hennig, Art Owen, Robert Cousins, Stan Young, Christian Robert, Corey Yanofsky, E. J. Wagenmakers, Ron Kenett, Daniel Lakeland

    Abstract: The new book by philosopher Deborah Mayo is relevant to data science for topical reasons, as she takes various controversial positions regarding hypothesis testing and statistical practice, and also as an entry point to thinking about the philosophy of statistics. The present article is a slightly expanded version of a series of informal reviews and comments on Mayo's book. We hope this discussion…

    Submitted 29 May, 2019; v1 submitted 21 May, 2019; originally announced May 2019.

    Comments: 23 pages

  21. Benchmarking in cluster analysis: A white paper

    Authors: Iven Van Mechelen, Anne-Laure Boulesteix, Rainer Dangl, Nema Dean, Isabelle Guyon, Christian Hennig, Friedrich Leisch, Douglas Steinley

    Abstract: Note: A revised version of this is now published. Please cite and read (it's open access): Van Mechelen, I., Boulesteix, A.-L., Dangl, R., Dean, N., Hennig, C., Leisch, F., Steinley, D., Warrens, M. J. (2023). A white paper on good research practices in benchmarking: The case of cluster analysis. WIREs Data Mining and Knowledge Discovery, e1511. https://doi.org/10.1002/widm.1511 To achieve scien…

    Submitted 30 July, 2023; v1 submitted 27 September, 2018; originally announced September 2018.

    MSC Class: 62H30

    Journal ref: WIREs Data Mining and Knowledge Discovery, 2023, e1511

  22. arXiv:1806.10403  [pdf, ps, other]

    stat.ME

    Quantile-based clustering

    Authors: Christian Hennig, Cinzia Viroli, Laura Anderlucci

    Abstract: A new cluster analysis method, $K$-quantiles clustering, is introduced. $K$-quantiles clustering can be computed by a simple greedy algorithm in the style of the classical Lloyd's algorithm for $K$-means. It can be applied to large and high-dimensional datasets. It allows for within-cluster skewness and internal variable scaling based on within-cluster variation. Different versions allow for diffe…

    Submitted 8 November, 2019; v1 submitted 27 June, 2018; originally announced June 2018.

  23. arXiv:1804.08715  [pdf, other]

    stat.ME

    A constrained regression model for an ordinal response with ordinal predictors

    Authors: Javier Espinosa, Christian Hennig

    Abstract: A regression model is proposed for the analysis of an ordinal response variable depending on a set of multiple covariates containing ordinal and potentially other variables. The proportional odds model (McCullagh (1980)) is used for the ordinal response, and constrained maximum likelihood estimation is used to account for the ordinality of covariates. Ordinal predictors are coded by dummy variab…

    Submitted 23 April, 2018; originally announced April 2018.

    Comments: 33 pages, 7 figures, 1 appendix

    MSC Class: 62H12; 62J05; 62-07

  24. arXiv:1710.05908  [pdf, other]

    astro-ph.CO astro-ph.GA

    Galaxies in X-ray Selected Clusters and Groups in Dark Energy Survey Data II: Hierarchical Bayesian Modeling of the Red-Sequence Galaxy Luminosity Function

    Authors: Y. Zhang, C. J. Miller, P. Rooney, A. Bermeo, A. K. Romer, C. Vergara Cervantes, E. S. Rykoff, C. Hennig, R. Das, T. Mckay, J. Song, H. Wilcox, D. Bacon, S. L. Bridle, C. Collins, C. Conselice, M. Hilton, B. Hoyle, S. Kay, A. R. Liddle, R. G. Mann, N. Mehrtens, J. Mayers, R. C. Nichol, M. Sahlen, et al. (55 additional authors not shown)

    Abstract: Using $\sim 100$ X-ray selected clusters in the Dark Energy Survey Science Verification data, we constrain the luminosity function (LF) of cluster red sequence galaxies as a function of redshift. This is the first homogeneous optical/X-ray sample large enough to constrain the evolution of the luminosity function simultaneously in redshift ($0.1<z<1.05$) and cluster mass (…

    Submitted 29 June, 2019; v1 submitted 16 October, 2017; originally announced October 2017.

    Comments: Updated to match the accepted version

  25. arXiv:1704.00959  [pdf, other]

    stat.AP

    Using clustering of rankings to explain brand preferences with personality and socio-demographic variables

    Authors: Daniel Müllensiefen, Christian Hennig, Hedie Howells

    Abstract: The primary aim of market segmentation is to identify relevant groups of consumers that can be addressed efficiently by marketing or advertising campaigns. This paper addresses the issue whether consumer groups can be identified from background variables that are not brand-related and how much personality vs. socio-demographic variables contribute to the identification of consumer clusters. This i…

    Submitted 4 April, 2017; originally announced April 2017.

    Comments: 26 pages, 12 figures

    MSC Class: 62H30; 91B08

  26. arXiv:1703.09282  [pdf, other]

    stat.ME

    Cluster validation by measurement of clustering characteristics relevant to the user

    Authors: Christian Hennig

    Abstract: There are many cluster analysis methods that can produce quite different clusterings on the same dataset. Cluster validation is about the evaluation of the quality of a clustering; "relative cluster validation" is about using such criteria to compare clusterings. This can be used to select one of a set of clusterings from different methods, or from the same method run with different parameters suc…

    Submitted 8 September, 2020; v1 submitted 27 March, 2017; originally announced March 2017.

    Comments: 20 pages, 2 figures

    MSC Class: 62H30

  27. arXiv:1604.02668  [pdf, other]

    stat.ME stat.AP stat.ML

    Distance for Functional Data Clustering Based on Smoothing Parameter Commutation

    Authors: ShengLi Tzeng, Christian Hennig, Yu-Fen Li, Chien-Ju Lin

    Abstract: We propose a novel method to determine the dissimilarity between subjects for functional data clustering. Spline smoothing or interpolation is common to deal with data of such type. Instead of estimating the best-representing curve for each subject as fixed during clustering, we measure the dissimilarity between subjects based on varying curve estimates with commutation of smoothing parameters pai…

    Submitted 10 April, 2016; originally announced April 2016.

    Journal ref: Statistical Methods in Medical Research, 27 (2018)

  28. arXiv:1604.00988  [pdf, other]

    astro-ph.GA astro-ph.CO

    Galaxy Populations in Massive Galaxy Clusters to z=1.1: Color Distribution, Concentration, Halo Occupation Number and Red Sequence Fraction

    Authors: C. Hennig, J. J. Mohr, A. Zenteno, S. Desai, J. P. Dietrich, S. Bocquet, V. Strazzullo, A. Saro, T. M. C. Abbott, F. B. Abdalla, M. Bayliss, A. Benoit-Levy, R. A. Bernstein, E. Bertin, D. Brooks, R. Capasso, D. Capozzi, A. Carnero, M. Carrasco Kind, J. Carretero, I. Chiu, C. B. D'Andrea, L. N. daCosta, H. T. Diehl, P. Doel, et al. (48 additional authors not shown)

    Abstract: We study the galaxy populations in 74 Sunyaev Zeldovich Effect (SZE) selected clusters from the South Pole Telescope (SPT) survey that have been imaged in the science verification phase of the Dark Energy Survey (DES). The sample extends up to $z\sim 1.1$ with $4 \times 10^{14} M_{\odot}\le M_{200}\le 3\times 10^{15} M_{\odot}$. Using the band containing the 4000 Å break and its redward neighbor,…

    Submitted 4 April, 2016; originally announced April 2016.

    Comments: 22 pages, 14 figures, submitted to MNRAS

  29. Recovering the number of clusters in data sets with noise features using feature rescaling factors

    Authors: Renato Cordeiro de Amorim, Christian Hennig

    Abstract: In this paper we introduce three methods for re-scaling data sets aiming at improving the likelihood of clustering validity indexes to return the true number of spherical Gaussian clusters with additional noise features. Our method obtains feature re-scaling factors taking into account the structure of a given data set and the intuitive idea that different features may have different degrees of re…

    Submitted 22 February, 2016; originally announced February 2016.

    Journal ref: Information Sciences 324 (2015), 126-145

  30. Detection of Enhancement in Number Densities of Background Galaxies due to Magnification by Massive Galaxy Clusters

    Authors: I. Chiu, J. P. Dietrich, J. Mohr, D. E. Applegate, B. A. Benson, L. E. Bleem, M. B. Bayliss, S. Bocquet, J. E. Carlstrom, R. Capasso, S. Desai, C. Gangkofner, A. H. Gonzalez, N. Gupta, C. Hennig, H. Hoekstra, A. von der Linden, J. Liu, M. McDonald, C. L. Reichardt, A. Saro, T. Schrabback, V. Strazzullo, C. W. Stubbs, A. Zenteno

    Abstract: We present a detection of the enhancement in the number densities of background galaxies induced from lensing magnification and use it to test the Sunyaev-Zel'dovich effect (SZE) inferred masses in a sample of 19 galaxy clusters with median redshift $z\simeq0.42$ selected from the South Pole Telescope SPT-SZ survey. Two background galaxy populations are selected for this study through their photom…

    Submitted 9 February, 2016; v1 submitted 6 October, 2015; originally announced October 2015.

    Comments: 16 pages, 10 figures, accepted for publication in MNRAS

  31. arXiv:1508.05453  [pdf, ps, other]

    stat.OT stat.AP

    Beyond subjective and objective in statistics

    Authors: Andrew Gelman, Christian Hennig

    Abstract: We argue that the words "objectivity" and "subjectivity" in statistics discourse are used in a mostly unhelpful way, and we propose to replace each of them with broader collections of attributes, with objectivity replaced by transparency, consensus, impartiality, and correspondence to observable reality, and subjectivity replaced by awareness of multiple perspectives and context dependence. The ad…

    Submitted 21 August, 2015; originally announced August 2015.

    Comments: 35 pages

  32. Constraints on the Richness-Mass Relation and the Optical-SZE Positional Offset Distribution for SZE-Selected Clusters

    Authors: A. Saro, S. Bocquet, E. Rozo, B. A. Benson, J. Mohr, E. S. Rykoff, M. Soares-Santos, L. Bleem, S. Dodelson, P. Melchior, F. Sobreira, V. Upadhyay, J. Weller, T. Abbott, F. B. Abdalla, S. Allam, R. Armstrong, M. Banerji, A. H. Bauer, M. Bayliss, A. Benoit-Levy, G. M. Bernstein, E. Bertin, M. Brodwin, D. Brooks, et al. (77 additional authors not shown)

    Abstract: We cross-match galaxy cluster candidates selected via their Sunyaev-Zel'dovich effect (SZE) signatures in 129.1 deg$^2$ of the South Pole Telescope 2500d SPT-SZ survey with optically identified clusters selected from the Dark Energy Survey (DES) science verification data. We identify 25 clusters between $0.1\lesssim z\lesssim 0.8$ in the union of the SPT-SZ and redMaPPer (RM) samples. RM is an opt…

    Submitted 25 June, 2015; originally announced June 2015.

    Comments: 15 pages, 8 Figures, submitted to MNRAS

  33. arXiv:1504.02983  [pdf, ps, other]

    astro-ph.GA astro-ph.CO

    Galaxies in X-ray Selected Clusters and Groups in Dark Energy Survey Data I: Stellar Mass Growth of Bright Central Galaxies Since z~1.2

    Authors: Y. Zhang, C. Miller, T. Mckay, P. Rooney, A. E. Evrard, A. K. Romer, R. Perfecto, J. Song, S. Desai, J. Mohr, H. Wilcox, A. Bermeo, T. Jeltema, D. Hollowood, D. Bacon, D. Capozzi, C. Collins, R. Das, D. Gerdes, C. Hennig, M. Hilton, B. Hoyle, S. Kay, A. Liddle, R. G. Mann, et al. (58 additional authors not shown)

    Abstract: Using the science verification data of the Dark Energy Survey (DES) for a new sample of 106 X-Ray selected clusters and groups, we study the stellar mass growth of Bright Central Galaxies (BCGs) since redshift 1.2. Compared with the expectation in a semi-analytical model applied to the Millennium Simulation, the observed BCGs become under-massive/under-luminous with decreasing redshift. We incorpo…

    Submitted 2 December, 2015; v1 submitted 12 April, 2015; originally announced April 2015.

    Comments: Accepted to ApJ

  34. arXiv:1503.02059  [pdf, ps, other]

    stat.ME

    Clustering strategy and method selection

    Authors: Christian Hennig

    Abstract: This paper is a chapter in the forthcoming Handbook of Cluster Analysis, Hennig et al. (2015). For definitions of basic clustering methods and some further methodology, other chapters of the Handbook are referred to. To read this version of the paper without the Handbook, some knowledge of cluster analysis methodology is required. The aim of this chapter is to provide a framework for all the dec…

    Submitted 6 March, 2015; originally announced March 2015.

  35. arXiv:1502.02574  [pdf, ps, other]

    stat.ME

    Flexible parametric bootstrap for testing homogeneity against clustering and assessing the number of clusters

    Authors: Christian Hennig, Chien-Ju Lin

    Abstract: There are two notoriously hard problems in cluster analysis, estimating the number of clusters, and checking whether the population to be clustered is not actually homogeneous. Given a dataset, a clustering method and a cluster validation index, this paper proposes to set up null models that capture structural features of the data that cannot be interpreted as indicating clustering. Artificial dat…

    Submitted 9 February, 2015; originally announced February 2015.

    MSC Class: 62H30; 62F03; 62F40

  36. arXiv:1502.02555  [pdf, ps, other]

    stat.OT

    What are the true clusters?

    Authors: Christian Hennig

    Abstract: Constructivist philosophy and Hasok Chang's active scientific realism are used to argue that the idea of "truth" in cluster analysis depends on the context and the clustering aims. Different characteristics of clusterings are required in different situations. Researchers should be explicit about on what requirements and what idea of "true clusters" their research is based, because clustering becom…

    Submitted 9 February, 2015; originally announced February 2015.

    MSC Class: 03A05; 62H30; 91C20

  37. Baryon Content of Massive Galaxy Clusters (0.57 < z < 1.33)

    Authors: I. Chiu, J. Mohr, M. Mcdonald, S. Bocquet, M. L. Ashby, M. Bayliss, B. A. Benson, L. E. Bleem, M. Brodwin, S. Desai, J. P. Dietrich, W. R. Forman, C. Gangkofner, A. H. Gonzalez, C. Hennig, J. Liu, C. L. Reichardt, A. Saro, B. Stalder, S. A. Stanford, J. Song, T. Schrabback, R. Suhada, V. Strazzullo, A. Zenteno

    Abstract: We study the stellar, Brightest Cluster Galaxy (BCG) and intracluster medium (ICM) masses of 14 South Pole Telescope (SPT) selected galaxy clusters with median redshift $z=0.9$ and median mass $M_{500}=6\times10^{14}M_{\odot}$. We estimate stellar masses for each cluster and BCG using six photometric bands spanning the range from the ultraviolet to the near-infrared observed with the VLT, HST and…

    Submitted 3 October, 2015; v1 submitted 25 December, 2014; originally announced December 2014.

    Comments: Accepted for publication in MNRAS

  38. A Measurement of Gravitational Lensing of the Cosmic Microwave Background by Galaxy Clusters Using Data from the South Pole Telescope

    Authors: E. J. Baxter, R. Keisler, S. Dodelson, K. A. Aird, S. W. Allen, M. L. N. Ashby, M. Bautz, M. Bayliss, B. A. Benson, L. E. Bleem, S. Bocquet, M. Brodwin, J. E. Carlstrom, C. L. Chang, I. Chiu, H-M. Cho, A. Clocchiatti, T. M. Crawford, A. T. Crites, S. Desai, J. P. Dietrich, T. de Haan, M. A. Dobbs, R. J. Foley, W. R. Forman, et al. (50 additional authors not shown)

    Abstract: Clusters of galaxies are expected to gravitationally lens the cosmic microwave background (CMB) and thereby generate a distinct signal in the CMB on arcminute scales. Measurements of this effect can be used to constrain the masses of galaxy clusters with CMB data alone. Here we present a measurement of lensing of the CMB by galaxy clusters using data from the South Pole Telescope (SPT). We develop…

    Submitted 23 June, 2015; v1 submitted 23 December, 2014; originally announced December 2014.

    Comments: 14 pages, 3 figures. Published in ApJ. Replaced to match published version

    Journal ref: ApJ, 806, 247 (2015)

  39. Galaxy Clusters Discovered via the Sunyaev-Zel'dovich Effect in the 2500-square-degree SPT-SZ survey

    Authors: L. E. Bleem, B. Stalder, T. de Haan, K. A. Aird, S. W. Allen, D. E. Applegate, M. L. N. Ashby, M. Bautz, M. Bayliss, B. A. Benson, S. Bocquet, M. Brodwin, J. E. Carlstrom, C. L. Chang, I. Chiu, H. M. Cho, A. Clocchiatti, T. M. Crawford, A. T. Crites, S. Desai, J. P. Dietrich, M. A. Dobbs, R. J. Foley, W. R. Forman, E. M. George, et al. (49 additional authors not shown)

    Abstract: We present a catalog of galaxy clusters selected via their Sunyaev-Zel'dovich (SZ) effect signature from 2500 deg$^2$ of South Pole Telescope (SPT) data. This work represents the complete sample of clusters detected at high significance in the 2500-square-degree SPT-SZ survey, which was completed in 2011. A total of 677 (409) cluster candidates are identified above a signal-to-noise threshold of…

    Submitted 13 February, 2015; v1 submitted 2 September, 2014; originally announced September 2014.

    Comments: Minor changes to match accepted version; Associated data products available at http://pole.uchicago.edu/public/data/sptsz-clusters/index.html

    Journal ref: 2015, ApJS, 216, 27

  40. arXiv:1407.7520  [pdf, other]

    astro-ph.CO astro-ph.GA

    Analysis of Sunyaev-Zel'dovich Effect Mass-Observable Relations using South Pole Telescope Observations of an X-ray Selected Sample of Low Mass Galaxy Clusters and Groups

    Authors: J. Liu, J. Mohr, A. Saro, K. A. Aird, M. L. N. Ashby, M. Bautz, M. Bayliss, B. A. Benson, L. E. Bleem, S. Bocquet, M. Brodwin, J. E. Carlstrom, C. L. Chang, I. Chiu, H. M. Cho, A. Clocchiatti, T. M. Crawford, A. T. Crites, T. de Haan, S. Desai, J. P. Dietrich, M. A. Dobbs, R. J. Foley, D. Gangkofner, E. M. George, et al. (41 additional authors not shown)

    Abstract: (Abridged) We use 95, 150, and 220GHz observations from the SPT to examine the SZE signatures of a sample of 46 X-ray selected groups and clusters drawn from ~6 deg^2 of the XMM-BCS. These systems extend to redshift z=1.02, have characteristic masses ~3x lower than clusters detected directly in the SPT data and probe the SZE signal to the lowest X-ray luminosities (>10^42 erg s^-1) yet. We devel…

    Submitted 29 May, 2015; v1 submitted 28 July, 2014; originally announced July 2014.

    Comments: 15 pages, 7 figures

    Journal ref: MNRAS (April 11, 2015) 448 (3): 2085-2099

  41. Optical Confirmation and Redshift Estimation of the Planck Cluster Candidates overlapping the Pan-STARRS Survey

    Authors: J. Liu, C. Hennig, S. Desai, B. Hoyle, J. Koppenhoefer, J. J. Mohr, K. Paech, W. S. Burgett, K. C. Chambers, S. Cole, P. W. Draper, N. Kaiser, N. Metcalfe, J. S. Morgan, P. A. Price, C. W. Stubbs, J. L. Tonry, R. J. Wainscoat, C. Waters

    Abstract: We report results of a study of Planck Sunyaev-Zel'dovich effect (SZE) selected galaxy cluster candidates using the Panoramic Survey Telescope & Rapid Response System (Pan-STARRS) imaging data. We first examine 150 Planck confirmed galaxy clusters with spectroscopic redshifts to test our algorithm for identifying optical counterparts and measuring their redshifts; our redshifts have a typical accu…

    Submitted 29 May, 2015; v1 submitted 22 July, 2014; originally announced July 2014.

    Comments: 11 pages, 9 figures

    Journal ref: MNRAS (June 01, 2015) Vol. 449 3370-3380

  42. Mass Calibration and Cosmological Analysis of the SPT-SZ Galaxy Cluster Sample Using Velocity Dispersion $σ_v$ and X-ray $Y_\textrm{X}$ Measurements

    Authors: S. Bocquet, A. Saro, J. J. Mohr, K. A. Aird, M. L. N. Ashby, M. Bautz, M. Bayliss, G. Bazin, B. A. Benson, L. E. Bleem, M. Brodwin, J. E. Carlstrom, C. L. Chang, I. Chiu, H. M. Cho, A. Clocchiatti, T. M. Crawford, A. T. Crites, S. Desai, T. de Haan, J. P. Dietrich, M. A. Dobbs, R. J. Foley, W. R. Forman, D. Gangkofner, et al. (46 additional authors not shown)

    Abstract: We present a velocity dispersion-based mass calibration of the South Pole Telescope Sunyaev-Zel'dovich effect survey (SPT-SZ) galaxy cluster sample. Using a homogeneously selected sample of 100 cluster candidates from 720 deg$^2$ of the survey along with 63 velocity dispersion ($σ_v$) and 16 X-ray $Y_\textrm{X}$ measurements of sample clusters, we simultaneously calibrate the mass-observable relation and constra…

    Submitted 2 December, 2014; v1 submitted 10 July, 2014; originally announced July 2014.

    Comments: Accepted by ApJ (v2 is accepted version); 17 pages, 6 figures

  43. Robust improper maximum likelihood: tuning, computation, and a comparison with other methods for robust Gaussian clustering

    Authors: Pietro Coretto, Christian Hennig

    Abstract: The two main topics of this paper are the introduction of the "optimally tuned improper maximum likelihood estimator" (OTRIMLE) for robust clustering based on the multivariate Gaussian model for clusters, and a comprehensive simulation study comparing the OTRIMLE to Maximum Likelihood in Gaussian mixtures with and without noise component, mixtures of t-distributions, and the TCLUST approach for tr…

    Submitted 28 January, 2017; v1 submitted 2 June, 2014; originally announced June 2014.

    MSC Class: 62H30; 62F35; 62P25

    Journal ref: Journal of the American Statistical Association 111(516), pp. 1648-1659 (2016)

  44. Constraints on the CMB Temperature Evolution using Multi-Band Measurements of the Sunyaev Zel'dovich Effect with the South Pole Telescope

    Authors: A. Saro, J. Liu, J. J. Mohr, K. A. Aird, M. L. N. Ashby, M. Bayliss, B. A. Benson, L. E. Bleem, S. Bocquet, M. Brodwin, J. E. Carlstrom, C. L. Chang, I. Chiu, H. M. Cho, A. Clocchiatti, T. M. Crawford, A. T. Crites, T. de Haan, S. Desai, J. P. Dietrich, M. A. Dobbs, K. Dolag, J. P. Dudley, R. J. Foley, D. Gangkofner, et al. (46 additional authors not shown)

    Abstract: The adiabatic evolution of the temperature of the cosmic microwave background (CMB) is a key prediction of standard cosmology. We study deviations from the expected adiabatic evolution of the CMB temperature of the form $T(z) =T_0(1+z)^{1-α}$ using measurements of the spectrum of the Sunyaev Zel'dovich Effect with the South Pole Telescope (SPT). We present a method for using the ratio of the Sunya…

    Submitted 9 December, 2013; originally announced December 2013.

    Comments: Submitted to MNRAS Letters

  45. arXiv:1309.6895  [pdf, other]

    stat.ME

    Consistency, breakdown robustness, and algorithms for robust improper maximum likelihood clustering

    Authors: Pietro Coretto, Christian Hennig

    Abstract: The robust improper maximum likelihood estimator (RIMLE) is a new method for robust multivariate clustering finding approximately Gaussian clusters. It maximizes a pseudo-likelihood defined by adding a component with improper constant density for accommodating outliers to a Gaussian mixture. A special case of the RIMLE is MLE for multivariate finite Gaussian mixture models. In this paper we treat…

    Submitted 13 February, 2018; v1 submitted 26 September, 2013; originally announced September 2013.

    Comments: The title of this paper was originally: "A consistent and breakdown robust model-based clustering method"

    MSC Class: 62H30; 62F35

    Journal ref: 2017, Journal of Machine Learning Research, Vol. 18(142), pp. 1-39. Download link: http://jmlr.org/papers/v18/16-382.html

  46. arXiv:1303.1282  [pdf, ps, other]

    stat.ME

    Quantile-based classifiers

    Authors: Christian Hennig, Cinzia Viroli

    Abstract: Quantile classifiers for potentially high-dimensional data are defined by classifying an observation according to a sum of appropriately weighted component-wise distances of the components of the observation to the within-class quantiles. An optimal percentage for the quantiles can be chosen by minimizing the misclassification error in the training sample. It is shown that this is consistent, fo…

    Submitted 12 November, 2013; v1 submitted 6 March, 2013; originally announced March 2013.

  47. Breakdown points for maximum likelihood estimators of location-scale mixtures

    Authors: Christian Hennig

    Abstract: ML-estimation based on mixtures of Normal distributions is a widely used tool for cluster analysis. However, a single outlier can make the parameter estimation of at least one of the mixture components break down. Among others, the estimation of mixtures of t-distributions by McLachlan and Peel [Finite Mixture Models (2000) Wiley, New York] and the addition of a further mixture component accou…

    Submitted 5 October, 2004; originally announced October 2004.

    Comments: Published by the Institute of Mathematical Statistics (http://www.imstat.org) in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/009053604000000571

    Report number: IMS-AOS-AOS209

    MSC Class: 62F35 (Primary) 62H30 (Secondary)

    Journal ref: Annals of Statistics 2004, Vol. 32, No. 4, 1313-1340