-
Optimal E-Values for Exponential Families: the Simple Case
Authors:
Peter Grünwald,
Tyron Lardy,
Yunda Hao,
Shaul K. Bar-Lev,
Martijn de Jong
Abstract:
We provide a general condition under which e-variables in the form of a simple-vs.-simple likelihood ratio exist when the null hypothesis is a composite, multivariate exponential family. Such `simple' e-variables are easy to compute and expected-log-optimal with respect to any stopping time. Simple e-variables were previously only known to exist in quite specific settings, but we offer a unifying theorem on their existence for testing exponential families. We start with a simple alternative $Q$ and a regular exponential family null. Together these induce a second exponential family ${\cal Q}$ containing $Q$, with the same sufficient statistic as the null. Our theorem shows that simple e-variables exist whenever the covariance matrices of ${\cal Q}$ and the null are in a certain relation. Examples in which this relation holds include some $k$-sample tests, Gaussian location- and scale tests, and tests for more general classes of natural exponential families.
Submitted 30 April, 2024;
originally announced April 2024.
-
Anytime-Valid Tests of Group Invariance through Conformal Prediction
Authors:
Tyron Lardy,
Muriel Felipe Pérez-Ortiz
Abstract:
We develop anytime-valid tests of invariance under the action of compact groups. The resulting test statistics are optimal in a logarithmic-growth sense. We apply our method to extend recent anytime-valid tests of independence and to construct tests of normality.
Submitted 23 May, 2024; v1 submitted 27 January, 2024;
originally announced January 2024.
-
Universal Reverse Information Projections and Optimal E-statistics
Authors:
Tyron Lardy,
Peter Grünwald,
Peter Harremoës
Abstract:
Information projections have found important applications in probability theory, statistics, and related areas. In the field of hypothesis testing in particular, the reverse information projection (RIPr) has recently been shown to lead to so-called growth-rate optimal (GRO) e-statistics for testing simple alternatives against composite null hypotheses. However, the RIPr as well as the GRO criterion are undefined whenever the infimum information divergence between the null and alternative is infinite. We show that in such scenarios there often still exists an element in the alternative that is 'closest' to the null: the universal reverse information projection. The universal reverse information projection and its non-universal counterpart coincide whenever information divergence is finite. Furthermore, the universal RIPr is shown to lead to optimal e-statistics in a sense that is a novel, but natural, extension of the GRO criterion. We also give conditions under which the universal RIPr is a strict sub-probability distribution, as well as conditions under which an approximation of the universal RIPr leads to approximate e-statistics. For this case we provide tight relations between the corresponding approximation rates.
Submitted 4 December, 2023; v1 submitted 28 June, 2023;
originally announced June 2023.
-
E-values for k-Sample Tests With Exponential Families
Authors:
Yunda Hao,
Peter Grünwald,
Tyron Lardy,
Long Long,
Reuben Adams
Abstract:
We develop and compare e-variables for testing whether $k$ samples of data are drawn from the same distribution, the alternative being that they come from different elements of an exponential family. We consider the GRO (growth-rate optimal) e-variables for (1) a `small' null inside the same exponential family, and (2) a `large' nonparametric null, as well as (3) an e-variable arrived at by conditioning on the sum of the sufficient statistics. (2) and (3) are efficiently computable, and extend ideas from Turner et al. [2021] and Wald [1947] respectively from Bernoulli to general exponential families. We provide theoretical and simulation-based comparisons of these e-variables in terms of their logarithmic growth rate, and find that for small effects all four e-variables behave surprisingly similarly; for the Gaussian location and Poisson families, e-variables (1) and (3) coincide; for Bernoulli, (1) and (2) coincide; but in general, whether (2) or (3) grows faster under the alternative is family-dependent. We furthermore discuss algorithms for numerically approximating (1).
Submitted 8 January, 2024; v1 submitted 1 March, 2023;
originally announced March 2023.
-
Anytime Valid Tests of Conditional Independence Under Model-X
Authors:
Peter Grünwald,
Alexander Henzi,
Tyron Lardy
Abstract:
We propose a sequential, anytime-valid method to test the conditional independence of a response $Y$ and a predictor $X$ given a random vector $Z$. The proposed test is based on e-statistics and test martingales, which generalize likelihood ratios and allow valid inference at arbitrary stopping times. In accordance with the recently introduced model-X setting, our test depends on the availability of the conditional distribution of $X$ given $Z$, or at least a sufficiently sharp approximation thereof. Within this setting, we derive a general method for constructing e-statistics for testing conditional independence, show that it leads to growth-rate optimal e-statistics for simple alternatives, and prove that our method yields tests with asymptotic power one in the special case of a logistic regression model. A simulation study demonstrates that the approach is competitive in terms of power when compared to established sequential and nonsequential testing methods, and robust with respect to violations of the model-X assumption.
Submitted 21 February, 2023; v1 submitted 26 September, 2022;
originally announced September 2022.
-
E-Statistics, Group Invariance and Anytime Valid Testing
Authors:
Muriel Felipe Pérez-Ortiz,
Tyron Lardy,
Rianne de Heide,
Peter Grünwald
Abstract:
We study worst-case-growth-rate-optimal (GROW) e-statistics for hypothesis testing between two group models. It is known that under a mild condition on the action of the underlying group G on the data, there exists a maximally invariant statistic. We show that among all e-statistics, invariant or not, the likelihood ratio of the maximally invariant statistic is GROW, both in the absolute and in the relative sense, and that an anytime-valid test can be based on it. The GROW e-statistic is equal to a Bayes factor with a right Haar prior on G. Our treatment avoids nonuniqueness issues that sometimes arise for such priors in Bayesian contexts. A crucial assumption on the group G is its amenability, a well-known group-theoretical condition, which holds, for instance, in scale-location families. Our results also apply to finite-dimensional linear regression.
Submitted 17 October, 2023; v1 submitted 16 August, 2022;
originally announced August 2022.