On Closed-Form Expressions for the Fisher-Rao Distance

Authors: Henrique K. Miyamoto, Fábio C. C. Meneghetti, Julianna Pinele, Sueli I. R. Costa

Abstract: The Fisher-Rao distance is the geodesic distance between probability distributions in a statistical manifold equipped with the Fisher metric, which is a natural choice of Riemannian metric on such manifolds. It has recently been applied to supervised and unsupervised problems in machine learning, in various contexts. Finding closed-form expressions for the Fisher-Rao distance is generally a non-trivial task, and those are only available for a few families of probability distributions. In this survey, we collect examples of closed-form expressions for the Fisher-Rao distance of both discrete and continuous distributions, aiming to present them in a unified and accessible language. In doing so, we also: illustrate the relation between negative multinomial distributions and the hyperbolic model, include a few new examples, and write a few more in the standard form of elliptical distributions.

Submitted 27 February, 2024; v1 submitted 28 April, 2023; originally announced April 2023.

Comments: 42 pages, 3 figures, revised and extended version

arXiv:2211.03477 [pdf, other]

doi 10.3390/psf2022005019

Information Properties of a Random Variable Decomposition through Lattices

Authors: Fábio C. C. Meneghetti, Henrique K. Miyamoto, Sueli I. R. Costa

Abstract: A full-rank lattice in the Euclidean space is a discrete set formed by all integer linear combinations of a basis. Given a probability distribution on $\mathbb{R}^n$, two operations can be induced by considering the quotient of the space by such a lattice: wrapping and quantization. For a lattice $Λ$, and a fundamental domain $D$ which tiles $\mathbb{R}^n$ through $Λ$, the wrapped distribution over the quotient is obtained by summing the density over each coset, while the quantized distribution over the lattice is defined by integrating over each fundamental domain translation. These operations define wrapped and quantized random variables over $D$ and $Λ$, respectively, which sum up to the original random variable. We investigate information-theoretic properties of this decomposition, such as entropy, mutual information and the Fisher information matrix, and show that it naturally generalizes to the more abstract context of locally compact topological groups.

Submitted 16 November, 2022; v1 submitted 7 November, 2022; originally announced November 2022.

Comments: 11 pages, 6 figures. v2: improved some explanations in sections 2 and 4, typos, and added citation in section 3

MSC Class: 94A17

Journal ref: Phys. Sci. Forum 2022, 5(1), 19

arXiv:2210.16401 [pdf, other]

doi 10.1007/s41884-022-00076-8

The Fisher-Rao Loss for Learning under Label Noise

Authors: Henrique K. Miyamoto, Fábio C. C. Meneghetti, Sueli I. R. Costa

Abstract: Choosing a suitable loss function is essential when learning by empirical risk minimisation. In many practical cases, the datasets used for training a classifier may contain incorrect labels, which prompts the interest for using loss functions that are inherently robust to label noise. In this paper, we study the Fisher-Rao loss function, which emerges from the Fisher-Rao distance in the statistical manifold of discrete distributions. We derive an upper bound for the performance degradation in the presence of label noise, and analyse the learning speed of this loss. Comparing with other commonly used losses, we argue that the Fisher-Rao loss provides a natural trade-off between robustness and training dynamics. Numerical experiments with synthetic and MNIST datasets illustrate this performance.

Submitted 26 November, 2022; v1 submitted 28 October, 2022; originally announced October 2022.

Comments: 20 pages, 5 figures, minor improvements. Accepted for publication in Information Geometry

Journal ref: Information Geometry, vol. 6, pp. 107-126, 2023

arXiv:2110.14748 [pdf, other]

doi 10.1109/TCOMM.2022.3173002

Context-Tree-Based Lossy Compression and Its Application to CSI Representation

Authors: Henrique K. Miyamoto, Sheng Yang

Abstract: We propose novel compression algorithms for time-varying channel state information (CSI) in wireless communications. The proposed scheme combines (lossy) vector quantisation and (lossless) compression. First, the new vector quantisation technique is based on a class of parametrised companders applied on each component of the normalised CSI vector. Our algorithm chooses a suitable compander in an intuitively simple way whenever empirical data are available. Then, the sequences of quantisation indices are compressed using a context-tree-based approach. Essentially, we update the estimate of the conditional distribution of the source at each instant and encode the current symbol with the estimated distribution. The algorithms have low complexity, are linear-time in both the spatial dimension and time duration, and can be implemented in an online fashion. We run simulations to demonstrate the effectiveness of the proposed algorithms in such scenarios.

Submitted 4 May, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

Comments: 12 pages, 9 figures. Accepted for publication in the IEEE Transactions on Communications

Journal ref: IEEE Transactions on Communications, vol. 70, no. 7, pp. 4417-4428, July 2022

arXiv:2008.10728 [pdf, other]

doi 10.1109/TIT.2021.3114094

Constructive Spherical Codes by Hopf Foliations

Authors: Henrique K. Miyamoto, Sueli I. R. Costa, Henrique N. Sá Earp

Abstract: We present a new systematic approach to constructing spherical codes in dimensions $2^k$, based on Hopf foliations. Using the fact that a sphere $S^{2n-1}$ is foliated by manifolds $S_{\cosη}^{n-1} \times S_{\sinη}^{n-1}$, $η\in[0,π/2]$, we distribute points in dimension $2^k$ via a recursive algorithm from a basic construction in $\mathbb{R}^4$. Our procedure outperforms some current constructive methods in several small-distance regimes and constitutes a compromise between achieving a large number of codewords for a minimum given distance and effective constructiveness with low encoding computational cost. Bounds for the asymptotic density are derived and compared with other constructions. The encoding process has storage complexity $O(n)$ and time complexity $O(n \log n)$. We also propose a sub-optimal decoding procedure, which does not require storing the codebook and has time complexity $O(n \log n)$.

Submitted 16 September, 2021; v1 submitted 24 August, 2020; originally announced August 2020.

Comments: 15 pages, 9 figures, minor improvements. Accepted to the IEEE Transactions on Information Theory

Journal ref: IEEE Transactions on Information Theory, vol. 67, no. 12, pp. 7925-7939, Dec. 2021

Showing 1–5 of 5 results for author: Miyamoto, H K