-
Higher-order asymptotic corrections and their application to the Gamma Variance Model
Authors:
Enzo Canonero,
Alessandra Rosalba Brazzale,
Glen Cowan
Abstract:
We present improved methods for calculating confidence intervals and $p$-values in situations where standard asymptotic approaches fail due to small sample sizes. We apply these techniques to a specific class of statistical model that can incorporate uncertainties in parameters that themselves represent uncertainties (informally, "errors on errors") called the Gamma Variance Model. This model contains fixed parameters, generically called $\varepsilon$, that represent the relative uncertainties in estimates of standard deviations of Gaussian distributed measurements. If the $\varepsilon$ parameters are small, one can construct confidence intervals and $p$-values using standard asymptotic methods. This is formally similar to the familiar situation of a large data sample, in which estimators for all adjustable parameters have Gaussian distributions. Here we address the important case where the $\varepsilon$ parameters are not small and as a consequence the asymptotic distributions do not represent a good approximation. We investigate improved test statistics based on the technology of higher-order asymptotics ($p^*$ approximation and Bartlett correction).
Submitted 9 January, 2024; v1 submitted 20 April, 2023;
originally announced April 2023.
-
Locating γ-Ray Sources on the Celestial Sphere via Modal Clustering
Authors:
Anna Montin,
Alessandra R. Brazzale,
Giovanna Menardi
Abstract:
Searching for as yet undetected gamma-ray sources is a major target of the Fermi LAT Collaboration. We present an algorithm capable of identifying sources of this type by non-parametrically clustering the directions of arrival of the high-energy photons detected by the telescope onboard the Fermi spacecraft. In particular, the sources will be identified using a von Mises-Fisher kernel estimate of the photon count density on the unit sphere via an adjustment of the mean-shift algorithm to account for the directional nature of the data. This choice entails a number of desirable benefits. It allows us to bypass the difficulties inherent at the borders of any projection of the photon directions onto a 2-dimensional plane, while guaranteeing high flexibility. The smoothing parameter will be chosen adaptively, by combining scientific input with optimal selection guidelines, as known from the literature. Using statistical tools from hypothesis testing and classification, we furthermore present an automatic way to skim off sound candidate sources from the gamma-ray emitting diffuse background and to quantify their significance. The algorithm was calibrated on simulated data provided by the Fermi LAT Collaboration and will be illustrated on a real Fermi LAT case-study.
Submitted 25 January, 2023;
originally announced January 2023.
-
Likelihood Asymptotics in Nonregular Settings: A Review with Emphasis on the Likelihood Ratio
Authors:
Alessandra R. Brazzale,
Valentina Mameli
Abstract:
This paper reviews the most common situations where one or more regularity conditions which underlie classical likelihood-based parametric inference fail. We identify three main classes of problems: boundary problems, indeterminate parameter problems -- which include non-identifiable parameters and singular information matrices -- and change-point problems. The review focuses on the large-sample properties of the likelihood ratio statistic. We emphasize analytical solutions and acknowledge software implementations where available. We furthermore give a brief account of the tools that can be used to derive the key results. Other approaches to hypothesis testing and connections to estimation are listed in the annotated bibliography of the Supplementary Material.
Submitted 26 April, 2023; v1 submitted 30 June, 2022;
originally announced June 2022.
-
Identification of high-energy astrophysical point sources via hierarchical Bayesian nonparametric clustering
Authors:
Andrea Sottosanti,
Mauro Bernardi,
Alessandra R. Brazzale,
Alex Geringer-Sameth,
David C. Stenning,
Roberto Trotta,
David A. van Dyk
Abstract:
The light we receive from distant astrophysical objects carries information about their origins and the physical mechanisms that power them. The study of these signals, however, is complicated by the fact that observations are often a mixture of the light emitted by multiple localized sources situated in a spatially-varying background. A general algorithm to achieve robust and accurate source identification in this case remains an open question in astrophysics.
This paper focuses on high-energy light (such as X-rays and gamma-rays), for which observatories can detect individual photons (quanta of light), measuring their incoming direction, arrival time, and energy. Our proposed Bayesian methodology uses both the spatial and energy information to identify point sources, that is, separate them from the spatially-varying background, to estimate their number, and to compute the posterior probabilities that each photon originated from each identified source. This is accomplished via a Dirichlet process mixture while the background is simultaneously reconstructed via a flexible Bayesian nonparametric model based on B-splines. Our proposed method is validated with a suite of simulation studies and illustrated with an application to a complex region of the sky observed by the Fermi Gamma-ray Space Telescope.
Submitted 26 April, 2021; v1 submitted 23 April, 2021;
originally announced April 2021.
-
Margin-free classification and new class detection using finite Dirichlet mixtures
Authors:
Prince John,
Alessandra R. Brazzale,
Maria Süveges
Abstract:
We present a margin-free finite mixture model which allows us to simultaneously classify objects into known classes and to identify possible new object types using a set of continuous attributes. This application is motivated by the need to classify, and possibly detect new types of, a particular kind of star known as variable stars. We first suitably transform the physical attributes of the stars onto the simplex to achieve scale invariance while maintaining their dependence structure. This allows us to compare data collected by different sky surveys which can have different scales. The model hence combines a mixture of Dirichlet mixtures to represent the known classes with the semi-supervised classification strategy of Vatanen et al. (2012) for outlier detection. In line with previous work on semiparametric model-based clustering, the single Dirichlet distributions can be seen as providing the baseline pattern of the data. These are then combined to effectively model the complex distributions of the attributes for the different classes. The model is estimated using a hierarchical two-step procedure which combines a suitably adapted version of the Expectation-Maximization (EM) algorithm with Bayes' rule. We validate our model on a reliable sample of periodic variable stars available in the literature (Dubath et al., 2011), achieving an overall classification accuracy of 71.95%, a sensitivity of 86.11% and a specificity of 99.79% for new class detection.
Submitted 25 March, 2021;
originally announced March 2021.
-
Accurate Parametric Inference for Small Samples
Authors:
Alessandra R. Brazzale,
Anthony C. Davison
Abstract:
We outline how modern likelihood theory, which provides essentially exact inferences in a variety of parametric statistical problems, may routinely be applied in practice. Although the likelihood procedures are based on analytical asymptotic approximations, the focus of this paper is not on theory but on implementation and applications. Numerical illustrations are given for logistic regression, nonlinear models, and linear non-normal models, and we describe a sampling approach for the third of these classes. In the case of logistic regression, we argue that approximations are often more appropriate than `exact' procedures, even when these exist.
Submitted 22 June, 2009;
originally announced June 2009.