-
Tensor estimation with structured priors
Authors:
Clément Luneau,
Nicolas Macris
Abstract:
We consider rank-one symmetric tensor estimation when the tensor is corrupted by Gaussian noise and the spike forming the tensor is a structured signal coming from a generalized linear model. The latter is a mathematically tractable model of a non-trivial hidden lower-dimensional latent structure in a signal. We work in a high-dimensional regime with a fixed ratio of signal-to-latent space dimensions. Remarkably, in this asymptotic regime, the mutual information between the spike and the observations can be expressed as a finite-dimensional variational problem, and the minimum mean-square error can be deduced from its solution. Using examples, we discuss properties of the phase transitions as a function of the signal-to-noise ratio. Typically, the critical signal-to-noise ratio decreases as the ratio of signal-to-latent space dimensions increases. We discuss the limit of vanishing ratio of signal-to-latent space dimensions and determine the limiting tensor estimation problem. We also point out similarities and differences with the case of matrices.
Submitted 26 June, 2020;
originally announced June 2020.
-
Information theoretic limits of learning a sparse rule
Authors:
Clément Luneau,
Jean Barbier,
Nicolas Macris
Abstract:
We consider generalized linear models in regimes where the number of nonzero components of the signal and the number of accessible data points are both sublinear in the size of the signal. We prove a variational formula for the asymptotic mutual information per sample when the system size grows to infinity. This result allows us to derive an expression for the minimum mean-square error (MMSE) of the Bayesian estimator when the signal entries have a discrete distribution with finite support. We find that, for such signals and suitable vanishing scalings of the sparsity and sampling rate, the MMSE is nonincreasing and piecewise constant. In specific instances the MMSE even displays an all-or-nothing phase transition, that is, the MMSE sharply jumps from its maximum value to zero at a critical sampling rate. The all-or-nothing phenomenon has previously been shown to occur in high-dimensional linear regression. Our analysis goes beyond the linear case and applies to learning the weights of a perceptron with a general activation function in a teacher-student scenario. In particular, we discuss an all-or-nothing phenomenon for the generalization error with a sublinear set of training examples.
Submitted 27 October, 2020; v1 submitted 19 June, 2020;
originally announced June 2020.
-
High-dimensional rank-one nonsymmetric matrix decomposition: the spherical case
Authors:
Clément Luneau,
Nicolas Macris,
Jean Barbier
Abstract:
We consider the problem of estimating a rank-one nonsymmetric matrix under additive white Gaussian noise. The matrix to be estimated can be written as the outer product of two vectors, and we consider the special case in which both vectors are uniformly distributed on spheres. We prove a replica-symmetric formula for the average mutual information between these vectors and the observations in the high-dimensional regime. This goes beyond previous results, which considered vectors with independent and identically distributed elements. The method used can be extended to rank-one tensor problems.
Submitted 15 April, 2020;
originally announced April 2020.
-
Mutual information for low-rank even-order symmetric tensor estimation
Authors:
Clément Luneau,
Jean Barbier,
Nicolas Macris
Abstract:
We consider a statistical model for finite-rank symmetric tensor factorization and prove a single-letter variational expression for its asymptotic mutual information when the tensor is of even order. The proof applies the adaptive interpolation method, originally developed for rank-one factorization. Here we show how to extend the adaptive interpolation to finite-rank and even-order tensors. This requires nontrivial new ideas that go beyond the existing analyses in the literature. We also point out where the proof falls short when dealing with odd-order tensors.
Submitted 23 September, 2020; v1 submitted 9 April, 2019;
originally announced April 2019.
-
Entropy and mutual information in models of deep neural networks
Authors:
Marylou Gabrié,
Andre Manoel,
Clément Luneau,
Jean Barbier,
Nicolas Macris,
Florent Krzakala,
Lenka Zdeborová
Abstract:
We examine a class of deep learning models with a tractable method to compute information-theoretic quantities. Our contributions are three-fold: (i) We show how entropies and mutual informations can be derived from heuristic statistical physics methods, under the assumption that weight matrices are independent and orthogonally-invariant. (ii) We extend particular cases in which this result is known to be rigorously exact by providing a proof for two-layer networks with Gaussian random weights, using the recently introduced adaptive interpolation method. (iii) We propose an experimental framework with generative models of synthetic datasets, on which we train deep neural networks with a weight constraint designed so that the assumption in (i) is verified during learning. We study the behavior of entropies and mutual informations throughout learning and conclude that, in the proposed setting, the relationship between compression and generalization remains elusive.
Submitted 29 October, 2018; v1 submitted 24 May, 2018;
originally announced May 2018.