-
How Flawed Is ECE? An Analysis via Logit Smoothing
Authors:
Muthu Chidambaram,
Holden Lee,
Colin McSwiggen,
Semon Rezchikov
Abstract:
Informally, a model is calibrated if its predictions are correct with a probability that matches the confidence of the prediction. By far the most common method in the literature for measuring calibration is the expected calibration error (ECE). Recent work, however, has pointed out drawbacks of ECE, such as the fact that it is discontinuous in the space of predictors. In this work, we ask: how fundamental are these issues, and what are their impacts on existing results? Towards this end, we completely characterize the discontinuities of ECE with respect to general probability measures on Polish spaces. We then use the nature of these discontinuities to motivate a novel continuous, easily estimated miscalibration metric, which we term Logit-Smoothed ECE (LS-ECE). By comparing the ECE and LS-ECE of pre-trained image classification models, we show in initial experiments that binned ECE closely tracks LS-ECE, indicating that the theoretical pathologies of ECE may be avoidable in practice.
Submitted 3 June, 2024; v1 submitted 15 February, 2024;
originally announced February 2024.
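For readers unfamiliar with the binned ECE mentioned in the abstract above, the following is a minimal sketch of the standard equal-width-bin estimator. It is not the authors' code, and it does not implement LS-ECE; the function name `binned_ece`, the default of 10 bins, and the right-inclusive bin convention are assumptions made for illustration.

```python
import numpy as np

def binned_ece(confidences, correct, n_bins=10):
    """Equal-width binned ECE (illustrative sketch, not the paper's code).

    confidences: predicted confidence in [0, 1] for each sample
    correct:     1 if the prediction was right, 0 otherwise
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Interior bin edges; right=True sends a confidence of exactly 1.0
    # to the last bin and 0.0 to the first.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # |accuracy - mean confidence| in the bin, weighted by the
            # fraction of samples that fall in the bin.
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece

# Four predictions at confidence 0.9, three of them correct.
example = binned_ece([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

Because each sample is assigned to a bin by a hard threshold, an arbitrarily small change in a confidence near a bin edge can move it to a different bin, which gives one informal view of why binned estimators of this kind can behave discontinuously.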
-
Asymptotics of Generalized Bessel Functions and Weight Multiplicities via Large Deviations of Radial Dunkl Processes
Authors:
Jiaoyang Huang,
Colin McSwiggen
Abstract:
This paper studies the asymptotic behavior of several central objects in Dunkl theory as the dimension of the underlying space grows large. Our starting point is the observation that a recent result from the random matrix theory literature implies a large deviations principle for the hydrodynamic limit of radial Dunkl processes. Using this fact, we prove a variational formula for the large-$N$ asymptotics of generalized Bessel functions, as well as a large deviations principle for the more general family of radial Heckman-Opdam processes. As an application, we prove a theorem on the asymptotic behavior of weight multiplicities of irreducible representations of compact or complex simple Lie algebras in the limit of large rank. The theorems in this paper generalize several known results describing analogous asymptotics for Dyson Brownian motion, spherical matrix integrals, and Kostka numbers.
Submitted 22 May, 2023; v1 submitted 6 May, 2023;
originally announced May 2023.
-
Moments of random quantum marginals via Weingarten calculus
Authors:
Sho Matsumoto,
Colin McSwiggen
Abstract:
The randomized quantum marginal problem asks about the joint distribution of the partial traces ("marginals") of a uniform random Hermitian operator with fixed spectrum acting on a space of tensors. We introduce a new approach to this problem based on studying the mixed moments of the entries of the marginals. For randomized quantum marginal problems that describe systems of distinguishable particles, bosons, or fermions, we prove formulae for these mixed moments, which determine the joint distribution of the marginals completely. Our main tool is Weingarten calculus, which provides a method for computing integrals of polynomial functions with respect to Haar measure on the unitary group. As an application, in the case of two distinguishable particles, we prove some results on the asymptotic behavior of the marginals as the dimension of one or both Hilbert spaces goes to infinity.
Submitted 17 April, 2023; v1 submitted 20 October, 2022;
originally announced October 2022.
-
Projections of Orbital Measures and Quantum Marginal Problems
Authors:
Benoît Collins,
Colin McSwiggen
Abstract:
This paper studies projections of uniform random elements of (co)adjoint orbits of compact Lie groups. Such projections generalize several widely studied ensembles in random matrix theory, including the randomized Horn's problem, the randomized Schur's problem, and the orbital corners process. In this general setting, we prove integral formulae for the probability densities, establish some properties of the densities, and discuss connections to multiplicity problems in representation theory as well as to known results in the symplectic geometry literature. As applications, we show a number of results on marginal problems in quantum information theory and also prove an integral formula for restriction multiplicities.
Submitted 16 May, 2023; v1 submitted 27 December, 2021;
originally announced December 2021.
-
Sampling Matrices from Harish-Chandra-Itzykson-Zuber Densities with Applications to Quantum Inference and Differential Privacy
Authors:
Jonathan Leake,
Colin S. McSwiggen,
Nisheeth K. Vishnoi
Abstract:
Given two $n \times n$ Hermitian matrices $Y$ and $Λ$, the Harish-Chandra-Itzykson-Zuber (HCIZ) distribution on the unitary group $\text{U}(n)$ is $e^{\text{tr}(UΛU^*Y)}dμ(U)$, where $μ$ is the Haar measure on $\text{U}(n)$. The density $e^{\text{tr}(UΛU^*Y)}$ is known as the HCIZ density. Random unitary matrices distributed according to the HCIZ density are important in various settings in physics and random matrix theory. However, the basic question of efficient sampling from the HCIZ distribution has remained open. We present two efficient algorithms to sample matrices from distributions that are close to the HCIZ distribution. The first algorithm outputs samples that are $ξ$-close in total variation distance and requires polynomially many arithmetic operations in $\log 1/ξ$ and the number of bits needed to encode $Y$ and $Λ$. The second algorithm comes with a stronger guarantee that the samples are $ξ$-close in infinity divergence, but the number of arithmetic operations depends polynomially on $1/ξ$, the number of bits needed to encode $Y$ and $Λ$, and the differences of the largest and the smallest eigenvalues of $Y$ and $Λ$.
HCIZ densities can also be viewed as exponential densities on $\text{U}(n)$-orbits, and these densities have been studied in statistics, machine learning, and theoretical computer science. Thus our results have the following applications: 1) an efficient algorithm to sample from complex versions of matrix Langevin distributions studied in statistics, 2) an efficient algorithm to sample from continuous max-entropy distributions on unitary orbits, which implies an efficient algorithm to sample a pure quantum state from the entropy-maximizing ensemble representing a given density matrix, and 3) an efficient algorithm for differentially private rank-$k$ approximation, with improved utility bounds for $k>1$.
Submitted 6 April, 2021; v1 submitted 10 November, 2020;
originally announced November 2020.
-
Majorization and Spherical Functions
Authors:
Colin McSwiggen,
Jonathan Novak
Abstract:
Majorization is a partial order on real vectors which plays an important role in a variety of subjects, ranging from algebra and combinatorics to probability and statistics. In this paper, we consider a generalized notion of majorization associated to an arbitrary root system $Φ,$ and show that it admits a natural characterization in terms of the values of spherical functions on any Riemannian symmetric space with restricted root system $Φ.$
Submitted 16 December, 2020; v1 submitted 15 June, 2020;
originally announced June 2020.
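As background for the abstract above, here is a small sketch of the classical majorization order on real vectors, which the paper generalizes to arbitrary root systems. It covers only the classical case; the function name `majorizes` and the tolerance parameter `tol` are hypothetical and chosen for illustration.

```python
import numpy as np

def majorizes(x, y, tol=1e-12):
    """Return True if x majorizes y in the classical sense.

    x majorizes y when, after sorting both in decreasing order, every
    partial sum of x is at least the matching partial sum of y and the
    total sums agree. Inputs must be non-empty and of equal length.
    """
    x = np.sort(np.asarray(x, dtype=float))[::-1]
    y = np.sort(np.asarray(y, dtype=float))[::-1]
    if x.shape != y.shape:
        raise ValueError("vectors must have the same length")

    cx, cy = np.cumsum(x), np.cumsum(y)
    partial_ok = np.all(cx[:-1] >= cy[:-1] - tol)
    totals_equal = abs(cx[-1] - cy[-1]) <= tol
    return bool(partial_ok and totals_equal)

# A concentrated vector majorizes a spread-out one with the same sum.
concentrated_over_flat = majorizes([3, 0, 0], [1, 1, 1])
flat_over_concentrated = majorizes([1, 1, 1], [3, 0, 0])
```

The order is only partial: for some pairs of vectors with equal sums, neither majorizes the other.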
-
Box splines, tensor product multiplicities and the volume function
Authors:
Colin McSwiggen
Abstract:
We study the relationship between the tensor product multiplicities of a compact semisimple Lie algebra $\mathfrak{g}$ and a special function $\mathcal{J}$ associated to $\mathfrak{g}$, called the volume function. The volume function arises in connection with the randomized Horn's problem in random matrix theory and has a related significance in symplectic geometry. Building on box spline deconvolution formulae of Dahmen-Micchelli and De Concini-Procesi-Vergne, we develop new techniques for computing the multiplicities from $\mathcal{J}$, answering a question posed by Coquereaux and Zuber. In particular, we derive an explicit algebraic formula for a large class of Littlewood-Richardson coefficients in terms of $\mathcal{J}$. We also give analogous results for weight multiplicities, and we show a number of further identities relating the tensor product multiplicities, the volume function and the box spline. To illustrate these ideas, we give new proofs of some known theorems.
Submitted 26 April, 2020; v1 submitted 26 September, 2019;
originally announced September 2019.
-
Revisiting Horn's Problem
Authors:
Robert Coquereaux,
Colin McSwiggen,
Jean-Bernard Zuber
Abstract:
We review recent progress on Horn's problem, which asks for a description of the possible eigenspectra of the sum of two matrices with known eigenvalues.
After revisiting the classical case, we consider several generalizations in which the space of matrices under study carries an action of a compact Lie group, and the goal is to describe an associated probability measure on the space of orbits. We review some recent results about the problem of computing the probability density via orbital integrals and about the locus of singularities of the density. We discuss some relations with representation theory, combinatorics, pictographs and symmetric polynomials, and we also include some novel remarks in connection with Schur's problem.
Submitted 7 October, 2019; v1 submitted 23 May, 2019;
originally announced May 2019.
-
On Horn's Problem and its Volume Function
Authors:
Robert Coquereaux,
Colin McSwiggen,
Jean-Bernard Zuber
Abstract:
We consider an extended version of Horn's problem: given two orbits $\mathcal{O}_α$ and $\mathcal{O}_β$ of a linear representation of a compact Lie group, let $A\in \mathcal{O}_α$, $B\in \mathcal{O}_β$ be independent and invariantly distributed random elements of the two orbits. The problem is to describe the probability distribution of the orbit of the sum $A+B$. We study in particular the familiar case of coadjoint orbits, and also the orbits of self-adjoint real, complex and quaternionic matrices under the conjugation actions of $\mathrm{SO}(n)$, $\mathrm{SU}(n)$ and $\mathrm{USp}(n)$ respectively. The probability density can be expressed in terms of a function that we call the volume function. In this paper, (i) we relate this function to the symplectic or Riemannian geometry of the orbits, depending on the case; (ii) we discuss its non-analyticities and possible vanishing; (iii) in the coadjoint case, we study its relation to tensor product multiplicities (generalized Littlewood--Richardson coefficients) and show that it computes the volume of a family of convex polytopes introduced by Berenstein and Zelevinsky. These considerations are illustrated by a detailed study of the volume function for the coadjoint orbits of $B_2=\mathfrak{so}(5)$.
Submitted 25 April, 2020; v1 submitted 1 April, 2019;
originally announced April 2019.
-
The Harish-Chandra integral: An introduction with examples
Authors:
Colin McSwiggen
Abstract:
This expository paper introduces the theory of Harish-Chandra integrals, a family of special functions that express the integral of an exponential function over the adjoint orbits of a compact Lie group. Originally studied in the context of harmonic analysis on Lie algebras, Harish-Chandra integrals now have diverse applications in many areas of mathematics and physics. We review a number of these applications, present several different proofs of Harish-Chandra's celebrated exact formula for the integrals, and give detailed derivations of the specific integral formulae for all compact classical groups. These notes are intended for mathematicians and physicists who are familiar with the basics of Lie groups and Lie algebras but who may not be specialists in representation theory or harmonic analysis.
Submitted 2 May, 2021; v1 submitted 28 June, 2018;
originally announced June 2018.
-
A new proof of Harish-Chandra's integral formula
Authors:
Colin McSwiggen
Abstract:
We present a new proof of Harish-Chandra's formula $$Π(h_1) Π(h_2) \int_G e^{\langle \mathrm{Ad}_g h_1, h_2 \rangle} dg = \frac{ [ \! [ Π, Π] \!] }{|W|} \sum_{w \in W} ε(w) e^{\langle w(h_1),h_2 \rangle},$$ where $G$ is a compact, connected, semisimple Lie group, $dg$ is normalized Haar measure, $h_1$ and $h_2$ lie in a Cartan subalgebra of the complexified Lie algebra, $Π$ is the discriminant, $\langle \cdot, \cdot \rangle$ is the Killing form, $[ \! [ \cdot, \cdot ] \!]$ is an inner product that extends the Killing form to polynomials, $W$ is a Weyl group, and $ε(w)$ is the sign of $w \in W$.
The proof in this paper follows from a relationship between heat flow on a semisimple Lie algebra and heat flow on a Cartan subalgebra, extending methods developed by Itzykson and Zuber for the case of an integral over the unitary group $U(N)$. The heat-flow proof allows a systematic approach to studying the asymptotics of orbital integrals over a wide class of groups.
Submitted 26 April, 2020; v1 submitted 11 December, 2017;
originally announced December 2017.
-
Visualizing the "Heartbeat" of a City with Tweets
Authors:
Urbano França,
Hiroki Sayama,
Colin McSwiggen,
Roozbeh Daneshvar,
Yaneer Bar-Yam
Abstract:
Describing the dynamics of a city is a crucial step to both understanding the human activity in urban environments and to planning and designing cities accordingly. Here we describe the collective dynamics of New York City and surrounding areas as seen through the lens of Twitter usage. In particular, we observe and quantify the patterns that emerge naturally from the hourly activities in different areas of New York City, and discuss how they can be used to understand the urban areas. Using a dataset that includes more than 6 million geolocated Twitter messages we construct a movie of the geographic density of tweets. We observe the diurnal "heartbeat" of the NYC area. The largest scale dynamics are the waking and sleeping cycle and commuting from residential communities to office areas in Manhattan. Hourly dynamics reflect the interplay of commuting, work and leisure, including whether people are preoccupied with other activities or actively using Twitter. Differences between weekday and weekend dynamics point to changes in when people wake and sleep, and engage in social activities. We show that by measuring the average distances to the heart of the city one can quantify the weekly differences and the shift in behavior during weekends. We also identify locations and times of high Twitter activity that occur because of specific activities. These include early morning high levels of traffic as people arrive and wait at air transportation hubs, and on Sunday at the Meadowlands Sports Complex and Statue of Liberty. We analyze the role of particular individuals where they have large impacts on overall Twitter activity. Our analysis points to the opportunity to develop insight into both geographic social dynamics and attention through social media analysis.
Submitted 3 November, 2014;
originally announced November 2014.