-
A Generalization of Relative Entropy to Count Vectors and its Concentration Property
Authors:
Kostas N. Oikonomou
Abstract:
We introduce a new generalization of relative entropy to non-negative vectors with sums $\gt 1$. We show in a purely combinatorial setting, with no probabilistic considerations, that in the presence of linear constraints defining a convex polytope, a concentration phenomenon arises for this generalized relative entropy, and we quantify the concentration precisely. We also present a probabilistic formulation, and extend the concentration results to it. In addition, we provide a number of simplifications and improvements to our previous work, notably in dualizing the optimization problem, in the concentration with respect to $\ell_{\infty}$ distance, and in the relationship to generalized KL-divergence. A number of our results apply to general compact convex sets, not necessarily polyhedral.
Submitted 7 May, 2024; v1 submitted 24 April, 2024;
originally announced April 2024.
-
Generalized Entropy Concentration for Counts
Authors:
Kostas N. Oikonomou
Abstract:
The phenomenon of entropy concentration provides strong support for the maximum entropy method, MaxEnt, for inferring a probability vector from information in the form of constraints. Here we extend this phenomenon, in a discrete setting, to non-negative integral vectors not necessarily summing to 1. We show that linear constraints that simply bound the allowable sums suffice for concentration to occur even in this setting. This requires a new, "generalized" entropy measure in which the sum of the vector plays a role. We measure the concentration in terms of deviation from the maximum generalized entropy value, or in terms of the distance from the maximum generalized entropy vector. We provide non-asymptotic bounds on the concentration in terms of various parameters, including a tolerance on the constraints which ensures that they are always satisfied by an integral vector. Generalized entropy maximization is not only compatible with ordinary MaxEnt, but can also be considered an extension of it, as it allows us to address problems that cannot be formulated as MaxEnt problems.
Submitted 7 January, 2021; v1 submitted 1 November, 2016;
originally announced November 2016.
-
Analytical Forms for Most Likely Matrices Derived from Incomplete Information
Authors:
Kostas N. Oikonomou
Abstract:
Consider a rectangular matrix describing some type of communication or transportation between a set of origins and a set of destinations, or a classification of objects by two attributes. The problem is to infer the entries of the matrix from limited information in the form of constraints, generally the sums of the elements over various subsets of the matrix, such as rows, columns, etc., or from bounds on these sums, down to individual elements. Such problems are routinely addressed by applying the maximum entropy method to compute the matrix numerically, but in this paper we derive analytical, closed-form solutions. For the most complicated cases we consider, the solution depends on the root of a non-linear equation, for which we provide an analytical approximation in the form of a power series. Some of our solutions extend to 3-dimensional matrices. Besides being valid for matrices of arbitrary size, the analytical solutions exhibit many of the appealing properties of maximum entropy, such as precise use of the available data, intuitive behavior with respect to changes in the constraints, and logical consistency.
Submitted 4 October, 2011;
originally announced October 2011.
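As a rough illustration of what a closed-form maximum entropy solution looks like, the sketch below handles only the classical textbook case in which all row sums and all column sums are known. In that case the maximum entropy matrix is the outer product of the row and column sums divided by the grand total. This is not necessarily one of the cases treated in the paper; the function names and the numbers are hypothetical.

```python
import math

def maxent_matrix(row_sums, col_sums):
    """Maximum-entropy matrix with the given row and column sums."""
    total = sum(row_sums)
    if not math.isclose(total, sum(col_sums)):
        raise ValueError("row sums and column sums must have the same total")
    # Closed form for this case: x[i][j] = r[i] * c[j] / total.
    return [[r * c / total for c in col_sums] for r in row_sums]

def entropy(matrix):
    """Shannon entropy (in nats) of the matrix normalized to sum to 1."""
    total = sum(sum(row) for row in matrix)
    return -sum((x / total) * math.log(x / total)
                for row in matrix for x in row if x > 0)

def shift_2x2(matrix, d):
    """Move mass d around a 2x2 matrix, preserving every row and column sum."""
    (a, b), (c, e) = matrix
    return [[a + d, b - d], [c - d, e + d]]

# Hypothetical data: two origins, two destinations.
row_sums = [30.0, 70.0]
col_sums = [40.0, 60.0]
best = maxent_matrix(row_sums, col_sums)

# Any other matrix with the same margins has lower entropy.
print(entropy(best), entropy(shift_2x2(best, 5.0)), entropy(shift_2x2(best, -5.0)))
```

The `shift_2x2` helper only exists to produce alternative matrices that satisfy the same constraints, so that their entropies can be compared with that of the closed-form solution.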
-
Explicit Bounds for Entropy Concentration under Linear Constraints
Authors:
Kostas N. Oikonomou, Peter D. Grunwald
Abstract:
Consider the set of all sequences of $n$ outcomes, each taking one of $m$ values, that satisfy a number of linear constraints. If $m$ is fixed while $n$ increases, most sequences that satisfy the constraints result in frequency vectors whose entropy approaches that of the maximum entropy vector satisfying the constraints. This well-known "entropy concentration" phenomenon underlies the maximum entropy method.
Existing proofs of the concentration phenomenon are based on limits or asymptotics and unrealistically assume that constraints hold precisely, supporting maximum entropy inference more in principle than in practice. We present, for the first time, non-asymptotic, explicit lower bounds on $n$ for a number of variants of the concentration result to hold to any prescribed accuracies, with the constraints holding up to any specified tolerance, taking into account the fact that allocations of discrete units can satisfy constraints only approximately. Again unlike earlier results, we measure concentration not by deviation from the maximum entropy value, but by the $\ell_1$ and $\ell_2$ distances from the maximum entropy-achieving frequency vector. One of our results holds independently of the alphabet size $m$ and is based on a novel proof technique using the multi-dimensional Berry-Esseen theorem. We illustrate and compare our results using various detailed examples.
Submitted 30 September, 2015; v1 submitted 29 July, 2011;
originally announced July 2011.
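The concentration phenomenon described in this abstract can be observed numerically on a toy problem. The sketch below assumes $m = 3$ outcomes and a single exact constraint, namely that the first outcome occurs in exactly half of the $n$ positions, for which the maximum entropy frequency vector is $(0.5, 0.25, 0.25)$. It counts sequences by brute force and reports the fraction whose frequency vector lies within $\ell_1$ distance `delta` of that vector. The constraint, the value of `delta`, and the function name are illustrative assumptions, and the code does not implement the paper's bounds or its constraint tolerances.

```python
import math

def concentration_fraction(n, delta):
    """Fraction of constraint-satisfying sequences of length n whose frequency
    vector is within l1 distance delta of the maximum entropy vector."""
    target = (0.5, 0.25, 0.25)  # maximum entropy vector under f1 = 1/2
    near = 0
    total = 0
    for n1 in range(n + 1):
        if 2 * n1 != n:  # toy linear constraint: first outcome has frequency 1/2
            continue
        for n2 in range(n - n1 + 1):
            n3 = n - n1 - n2
            # Number of sequences that produce the count vector (n1, n2, n3).
            ways = math.comb(n, n1) * math.comb(n - n1, n2)
            total += ways
            dist = sum(abs(c / n - t) for c, t in zip((n1, n2, n3), target))
            if dist <= delta:
                near += ways
    return near / total

# The fraction near the maximum entropy vector grows with n.
results = {n: concentration_fraction(n, 0.105) for n in (20, 200)}
print(results)
```

Counting sequences through multinomial coefficients, rather than enumerating them, keeps the computation exact while staying feasible for a few hundred positions.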