Search | arXiv e-print repository

Relative entropy bounds for sampling with and without replacement

Authors: Oliver Johnson, Lampros Gavalakis, Ioannis Kontoyiannis

Abstract: Sharp, nonasymptotic bounds are obtained for the relative entropy between the distributions of sampling with and without replacement from an urn with balls of $c\geq 2$ colors. Our bounds are asymptotically tight in certain regimes and, unlike previous results, they depend on the number of balls of each colour in the urn. The connection of these results with finite de Finetti-style theorems is explored, and it is observed that a sampling bound due to Stam (1978) combined with the convexity of relative entropy yield a new finite de Finetti bound in relative entropy, which achieves the optimal asymptotic convergence rate.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 17 pages, 1 figure

MSC Class: 60E05 (Primary) 60G09 (Secondary)
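
The quantity in the abstract above can be computed exactly for small urns. The sketch below is illustrative only and is not taken from the paper. It assumes the two-colour case ($c=2$) and uses the fact that both sampling schemes are exchangeable, so the relative entropy between the two sequence laws equals the relative entropy between the corresponding count laws (hypergeometric versus binomial). The function names `hypergeom_pmf`, `binom_pmf` and `relative_entropy_sampling` are hypothetical.

```python
from math import comb, log


def hypergeom_pmf(k, N, K, n):
    """P(k red balls in n draws WITHOUT replacement); urn has N balls, K red."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def binom_pmf(k, n, p):
    """P(k red balls in n draws WITH replacement); each draw is red w.p. p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def relative_entropy_sampling(N, K, n):
    """D(without replacement || with replacement) in nats, two colours only.

    Assumption: 0 < K < N and 1 <= n <= N. The sum runs over the support
    of the hypergeometric law, which is contained in the binomial support.
    """
    p = K / N
    d = 0.0
    for k in range(max(0, n - (N - K)), min(n, K) + 1):
        h = hypergeom_pmf(k, N, K, n)
        if h > 0:
            d += h * log(h / binom_pmf(k, n, p))
    return d


if __name__ == "__main__":
    # Fixed sample size n = 5, balanced urns of increasing size N.
    for N in (10, 100, 1000):
        print(N, relative_entropy_sampling(N, N // 2, 5))
```

For a single draw the two schemes coincide, so the value is zero. For fixed $n$ the value shrinks as the urn grows, which is the regime in which nonasymptotic bounds of this kind are of interest.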

arXiv:2403.07209 [pdf, ps, other]

The entropic doubling constant and robustness of Gaussian codebooks for additive-noise channels

Authors: Lampros Gavalakis, Ioannis Kontoyiannis, Mokshay Madiman

Abstract: Entropy comparison inequalities are obtained for the differential entropy $h(X+Y)$ of the sum of two independent random vectors $X,Y$, when one is replaced by a Gaussian. For identically distributed random vectors $X,Y$, these are closely related to bounds on the entropic doubling constant, which quantifies the entropy increase when adding an independent copy of a random vector to itself. Consequences of both large and small doubling are explored. For the former, lower bounds are deduced on the entropy increase when adding an independent Gaussian, while for the latter, a qualitative stability result for the entropy power inequality is obtained. In the more general case of non-identically distributed random vectors $X,Y$, a Gaussian comparison inequality with interesting implications for channel coding is established: For additive-noise channels with a power constraint, Gaussian codebooks come within a $\frac{\sf snr}{3{\sf snr}+2}$ factor of capacity. In the low-SNR regime this improves the half-a-bit additive bound of Zamir and Erez (2004). Analogous results are obtained for additive-noise multiple access channels, and for linear, additive-noise MIMO channels.

Submitted 11 March, 2024; originally announced March 2024.

Comments: 23 pages, no figures

arXiv:2401.15462 [pdf, ps, other]

On the monotonicity of discrete entropy for log-concave random vectors on $\mathbb{Z}^d$

Authors: Matthieu Fradelizi, Lampros Gavalakis, Martin Rapaport

Abstract: We prove the following type of discrete entropy monotonicity for sums of isotropic, log-concave, independent and identically distributed random vectors $X_1,\dots,X_{n+1}$ on $\mathbb{Z}^d$: $$ H(X_1+\cdots+X_{n+1}) \geq H(X_1+\cdots+X_{n}) + \frac{d}{2}\log{\Bigl(\frac{n+1}{n}\Bigr)} +o(1), $$ where $o(1)$ vanishes as $H(X_1) \to \infty$. Moreover, for the $o(1)$-term, we obtain a rate of convergence $ O\Bigl({H(X_1)}{e^{-\frac{1}{d}H(X_1)}}\Bigr)$, where the implied constants depend on $d$ and $n$. This generalizes to $\mathbb{Z}^d$ the one-dimensional result of the second named author (2023). As in dimension one, our strategy is to establish that the discrete entropy $H(X_1+\cdots+X_{n})$ is close to the differential (continuous) entropy $h(X_1+U_1+\cdots+X_{n}+U_{n})$, where $U_1,\dots, U_n$ are independent and identically distributed uniform random vectors on $[0,1]^d$ and to apply the theorem of Artstein, Ball, Barthe and Naor (2004) on the monotonicity of differential entropy. In fact, we show this result under more general assumptions than log-concavity, which are preserved up to constants under convolution. In order to show that log-concave distributions satisfy our assumptions in dimension $d\ge2$, more involved tools from convex geometry are needed because a suitable position is required. We show that, for a log-concave function on $\mathbb{R}^d$ in isotropic position, its integral, barycenter and covariance matrix are close to their discrete counterparts. Moreover, in the log-concave case, we weaken the isotropicity assumption to what we call almost isotropicity. One of our technical tools is a discrete analogue to the upper bound on the isotropic constant of a log-concave function, which extends to dimensions $d\ge1$ a result of Bobkov, Marsiglietti and Melbourne (2022).

Submitted 7 July, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

Comments: 26 pages, no figures. In the updated version we relax the hypotheses of the main theorems. Section 2 now has the proofs of the main results under the weaker assumptions and Section 3 is devoted to proving that log-concavity implies these new assumptions. Definition 2, Proposition 13 and Theorem 18 are new. The quantitative estimate in Lemma 14 has been improved from the previous version

MSC Class: Primary: 94A17 Secondary: 52C07; 39B62

arXiv:2308.15997 [pdf, other]

On the entropy and information of Gaussian mixtures

Authors: Alexandros Eskenazis, Lampros Gavalakis

Abstract: We establish several convexity properties for the entropy and Fisher information of mixtures of centered Gaussian distributions. First, we prove that if $X_1, X_2$ are independent scalar Gaussian mixtures, then the entropy of $\sqrt{t}X_1 + \sqrt{1-t}X_2$ is concave in $t \in [0,1]$, thus confirming a conjecture of Ball, Nayar and Tkocz (2016) for this class of random variables. In fact, we prove a generalisation of this assertion which also strengthens a result of Eskenazis, Nayar and Tkocz (2018). For the Fisher information, we extend a convexity result of Bobkov (2022) by showing that the Fisher information matrix is operator convex as a matrix-valued function acting on densities of mixtures in $\mathbb{R}^d$. As an application, we establish rates for the convergence of the Fisher information matrix of the sum of weighted i.i.d. Gaussian mixtures in the operator norm along the central limit theorem under mild moment assumptions.

Submitted 16 February, 2024; v1 submitted 30 August, 2023; originally announced August 2023.

Comments: 14 pages, no figures. Reviewer's comments have been incorporated. To appear in Mathematika

MSC Class: 94A17 (Primary) 60E15; 26B25 (Secondary)

arXiv:2304.05360 [pdf, ps, other]

A Third Information-Theoretic Approach to Finite de Finetti Theorems

Authors: Mario Berta, Lampros Gavalakis, Ioannis Kontoyiannis

Abstract: A new finite form of de Finetti's representation theorem is established using elementary information-theoretic tools. The distribution of the first $k$ random variables in an exchangeable vector of $n\geq k$ random variables is close to a mixture of product distributions. Closeness is measured in terms of the relative entropy and an explicit bound is provided. This bound is tighter than those obtained via earlier information-theoretic proofs, and its utility extends to random variables taking values in general spaces. The core argument employed has its origins in the quantum information-theoretic literature.

Submitted 25 April, 2024; v1 submitted 11 April, 2023; originally announced April 2023.

Comments: 11 pages, no figures. In the second version the introduction is slightly extended, two new references and Section 2.4 have been added

arXiv:2210.06624 [pdf, ps, other]

Approximate Discrete Entropy Monotonicity for Log-Concave Sums

Authors: Lampros Gavalakis

Abstract: It is proven that a conjecture of Tao (2010) holds true for log-concave random variables on the integers: For every $n \geq 1$, if $X_1,\ldots,X_n$ are i.i.d. integer-valued, log-concave random variables, then $$ H(X_1+\cdots+X_{n+1}) \geq H(X_1+\cdots+X_{n}) + \frac{1}{2}\log{\Bigl(\frac{n+1}{n}\Bigr)} - o(1) $$ as $H(X_1) \to \infty$, where $H$ denotes the (discrete) Shannon entropy. The problem is reduced to the continuous setting by showing that if $U_1,\ldots,U_n$ are independent continuous uniforms on $(0,1)$, then $$ h(X_1+\cdots+X_n + U_1+\cdots+U_n) = H(X_1+\cdots+X_n) + o(1) $$ as $H(X_1) \to \infty$, where $h$ stands for the differential entropy. Explicit bounds for the $o(1)$-terms are provided.

Submitted 18 October, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

Comments: 15 pages, no figures. A number of typos have been fixed and reviewers' comments have been incorporated. More details have been added in most of the proofs. To appear in Combinatorics, Probability and Computing

MSC Class: 94A17; 60E15
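
The inequality in the abstract above can be checked numerically on a simple example. The sketch below is a numerical illustration under stated assumptions, not the paper's argument. It takes $X_1$ uniform on $\{0,\ldots,m-1\}$ (a log-concave law whose entropy $\log m$ grows with $m$), forms the sums by direct convolution, and compares each entropy increment with $\frac{1}{2}\log\frac{n+1}{n}$. The function names `convolve`, `entropy` and `entropy_gaps` are hypothetical.

```python
from math import log


def convolve(p, q):
    """PMF of the sum of two independent integer-valued variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def entropy(p):
    """Discrete Shannon entropy in nats."""
    return -sum(x * log(x) for x in p if x > 0)


def entropy_gaps(pmf, n_max):
    """Return (n, H(S_{n+1}) - H(S_n), 0.5 * log((n+1)/n)) for n = 1..n_max,

    where S_n is the sum of n i.i.d. copies of a variable with the given PMF.
    """
    gaps = []
    s = list(pmf)
    for n in range(1, n_max + 1):
        s_next = convolve(s, pmf)
        gaps.append((n, entropy(s_next) - entropy(s), 0.5 * log((n + 1) / n)))
        s = s_next
    return gaps


if __name__ == "__main__":
    m = 200  # assumed support size; larger m means larger H(X_1)
    for n, gap, reference in entropy_gaps([1.0 / m] * m, 3):
        print(n, gap, reference)
```

For $n=1$ the increment is close to $1/2$ nat, the differential entropy of the triangular density on $[0,2]$, which exceeds $\frac{1}{2}\log 2$.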

arXiv:2204.05033 [pdf, ps, other]

Information in probability: Another information-theoretic proof of a finite de Finetti theorem

Authors: Lampros Gavalakis, Ioannis Kontoyiannis

Abstract: We recall some of the history of the information-theoretic approach to deriving core results in probability theory and indicate parts of the recent resurgence of interest in this area with current progress along several interesting directions. Then we give a new information-theoretic proof of a finite version of de Finetti's classical representation theorem for finite-valued random variables. We derive an upper bound on the relative entropy between the distribution of the first $k$ in a sequence of $n$ exchangeable random variables, and an appropriate mixture over product distributions. The mixing measure is characterised as the law of the empirical measure of the original sequence, and de Finetti's result is recovered as a corollary. The proof is nicely motivated by the Gibbs conditioning principle in connection with statistical mechanics, and it follows along an appealing sequence of steps. The technical estimates required for these steps are obtained via the use of a collection of combinatorial tools known within information theory as `the method of types.'

Submitted 26 April, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

Comments: Final version, to be published as part of a Festschrift volume in the Springer "Lecture Notes in Mathematics" series

arXiv:2106.00514 [pdf, ps, other]

Entropy and the Discrete Central Limit Theorem

Authors: Lampros Gavalakis, Ioannis Kontoyiannis

Abstract: A strengthened version of the central limit theorem for discrete random variables is established, relying only on information-theoretic tools and elementary arguments. It is shown that the relative entropy between the standardised sum of $n$ independent and identically distributed lattice random variables and an appropriately discretised Gaussian, vanishes as $n\to\infty$.

Submitted 1 June, 2021; originally announced June 2021.

Comments: 15 pages

MSC Class: 60F05; 94A17; 60E15

arXiv:2104.03882 [pdf, ps, other]

An Information-Theoretic Proof of a Finite de Finetti Theorem

Authors: Lampros Gavalakis, Ioannis Kontoyiannis

Abstract: A finite form of de Finetti's representation theorem is established using elementary information-theoretic tools: The distribution of the first $k$ random variables in an exchangeable binary vector of length $n\geq k$ is close to a mixture of product distributions. Closeness is measured in terms of the relative entropy and an explicit bound is provided.

Submitted 25 June, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

Comments: 5 pages. Revised version with some minor typos fixed and discussion slightly expanded

arXiv:2005.10823 [pdf, ps, other]

doi 10.3390/e22060705

Sharp Second-Order Pointwise Asymptotics for Lossless Compression with Side Information

Authors: Lampros Gavalakis, Ioannis Kontoyiannis

Abstract: The problem of determining the best achievable performance of arbitrary lossless compression algorithms is examined, when correlated side information is available at both the encoder and decoder. For arbitrary source-side information pairs, the conditional information density is shown to provide a sharp asymptotic lower bound for the description lengths achieved by an arbitrary sequence of compressors. This implies that, for ergodic source-side information pairs, the conditional entropy rate is the best achievable asymptotic lower bound to the rate, not just in expectation but with probability one. Under appropriate mixing conditions, a central limit theorem and a law of the iterated logarithm are proved, describing the inevitable fluctuations of the second-order asymptotically best possible rate. An idealised version of Lempel-Ziv coding with side information is shown to be universally first- and second-order asymptotically optimal, under the same conditions. These results are in part based on a new almost-sure invariance principle for the conditional information density, which may be of independent interest.

Submitted 21 May, 2020; originally announced May 2020.

Comments: 20 pages, no figures. Based on part of arXiv:1912.05734v1

arXiv:1912.05734 [pdf, other]

Fundamental Limits of Lossless Data Compression with Side Information

Authors: Lampros Gavalakis, Ioannis Kontoyiannis

Abstract: The problem of lossless data compression with side information available to both the encoder and the decoder is considered. The finite-blocklength fundamental limits of the best achievable performance are defined, in two different versions of the problem: Reference-based compression, when a single side information string is used repeatedly in compressing different source messages, and pair-based compression, where a different side information string is used for each source message. General achievability and converse theorems are established for arbitrary source-side information pairs. Nonasymptotic normal approximation expansions are proved for the optimal rate in both the reference-based and pair-based settings, for memoryless sources. These are stated in terms of explicit, finite-blocklength bounds, that are tight up to third-order terms. Extensions that go significantly beyond the class of memoryless sources are obtained. The relevant source dispersion is identified and its relationship with the conditional varentropy rate is established. Interestingly, the dispersion is different in reference-based and pair-based compression, and it is proved that the reference-based dispersion is in general smaller.

Submitted 21 February, 2021; v1 submitted 11 December, 2019; originally announced December 2019.

Comments: 26 pages, 1 figure. Revised, shorter, final version, focusing primarily on nonasymptotic results. This version has been accepted for publication in IEEE Transactions on Information Theory

Showing 1–11 of 11 results for author: Gavalakis, L