-
An asymptotic expansion of the empirical angular measure for bivariate extremal dependence
Authors:
Stéphane Lhaut,
Johan Segers
Abstract:
The angular measure on the unit sphere characterizes the first-order dependence structure of the components of a random vector in extreme regions and is defined in terms of standardized margins. Its statistical recovery is an important step in learning problems involving observations far away from the center. In the common situation that the components of the vector have different distributions, the rank transformation offers a convenient and robust way of standardizing data in order to build an empirical version of the angular measure based on the most extreme observations. We provide a functional asymptotic expansion for the empirical angular measure in the bivariate case based on the theory of weak convergence in the space of bounded functions. From the expansion, not only can the known asymptotic distribution of the empirical angular measure be recovered, but expansions and weak limits can also be derived for other statistics based on the associated empirical process or its quantile version.
Submitted 10 November, 2023; v1 submitted 26 May, 2023;
originally announced May 2023.
-
Uniform concentration bounds for frequencies of rare events
Authors:
Stéphane Lhaut,
Anne Sabourin,
Johan Segers
Abstract:
New Vapnik and Chervonenkis type concentration inequalities are derived for the empirical distribution of an independent random sample. Focus is on the maximal deviation over classes of Borel sets within a low probability region. The constants are explicit, enabling numerical comparisons.
Submitted 23 April, 2022; v1 submitted 12 October, 2021;
originally announced October 2021.
-
Concentration bounds for the empirical angular measure with statistical learning applications
Authors:
Stéphan Clémençon,
Hamid Jalalzai,
Stéphane Lhaut,
Anne Sabourin,
Johan Segers
Abstract:
The angular measure on the unit sphere characterizes the first-order dependence structure of the components of a random vector in extreme regions and is defined in terms of standardized margins. Its statistical recovery is an important step in learning problems involving observations far away from the center. In the common situation that the components of the vector have different distributions, the rank transformation offers a convenient and robust way of standardizing data in order to build an empirical version of the angular measure based on the most extreme observations. However, the study of the sampling distribution of the resulting empirical angular measure is challenging. It is the purpose of the paper to establish finite-sample bounds for the maximal deviations between the empirical and true angular measures, uniformly over classes of Borel sets of controlled combinatorial complexity. The bounds are valid with high probability and, up to logarithmic factors, scale as the square root of the effective sample size. The bounds are applied to provide performance guarantees for two statistical learning procedures tailored to extreme regions of the input space and built upon the empirical angular measure: binary classification in extreme regions through empirical risk minimization and unsupervised anomaly detection through minimum-volume sets of the sphere.
Submitted 17 October, 2022; v1 submitted 7 April, 2021;
originally announced April 2021.