Search | arXiv e-print repository

arXiv:1907.03792 [pdf, other]

Asymptotic Bayes risk for Gaussian mixture in a semi-supervised setting

Abstract: Semi-supervised learning (SSL) uses unlabeled data for training and has been shown to greatly improve performance when compared to a supervised approach on the labeled data available. This claim depends both on the amount of labeled data available and on the algorithm used. In this paper, we compute analytically the gap between the best fully-supervised approach using only labeled data and the best semi-supervised approach using both labeled and unlabeled data. We quantify the best possible increase in performance obtained thanks to the unlabeled data, i.e. we compute the accuracy increase due to the information contained in the unlabeled data. Our work deals with a simple high-dimensional Gaussian mixture model for the data in a Bayesian setting. Our rigorous analysis builds on recent theoretical breakthroughs in high-dimensional inference and a large body of mathematical tools from statistical physics initially developed for spin glasses.

Submitted 28 September, 2019; v1 submitted 8 July, 2019; originally announced July 2019.

Comments: 13 pages

arXiv:1806.04343 [pdf, other]

Phase transitions in spiked matrix estimation: information-theoretic analysis

Authors: Léo Miolane

Abstract: We study here the so-called spiked Wigner and Wishart models, where one observes a low-rank matrix perturbed by some Gaussian noise. These models encompass many classical statistical tasks such as sparse PCA, submatrix localization, community detection or Gaussian mixture clustering. The goal of these notes is to present in a unified manner recent results (as well as new developments) on the information-theoretic limits of these spiked matrix models. We compute the minimal mean squared error for the estimation of the low-rank signal and compare it to the performance of spectral estimators and message passing algorithms. Phase transition phenomena are observed: depending on the noise level it is either impossible, easy (i.e. using polynomial-time estimators) or hard (information-theoretically possible, but no efficient algorithm is known to succeed) to recover the signal.

Submitted 24 June, 2019; v1 submitted 12 June, 2018; originally announced June 2018.

Comments: These notes present in a unified manner recent results (as well as new developments) on the information-theoretic limits in spiked matrix estimation

arXiv:1709.10368 [pdf, ps, other]

The Layered Structure of Tensor Estimation and its Mutual Information

Authors: Jean Barbier, Nicolas Macris, Léo Miolane

Abstract: We consider rank-one non-symmetric tensor estimation and derive simple formulas for the mutual information. We start with the order 2 problem, namely matrix factorization. We treat it completely in a simpler fashion than previous proofs using a new type of interpolation method developed in [1]. We then show how to harness the structure in "layers" of tensor estimation in order to obtain a formula for the mutual information for the order 3 problem from the knowledge of the formula for the order 2 problem, still using the same kind of interpolation. Our proof technique straightforwardly generalizes and allows one to rigorously obtain the mutual information at any order in a recursive way.

Submitted 27 November, 2018; v1 submitted 29 September, 2017; originally announced September 2017.

Comments: 55th Annual Allerton Conference on Communication, Control, and Computing, 2017

arXiv:1708.03395 [pdf, other]

doi 10.1073/pnas.1802705116

Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models

Authors: Jean Barbier, Florent Krzakala, Nicolas Macris, Léo Miolane, Lenka Zdeborová

Abstract: Generalized linear models (GLMs) arise in high-dimensional machine learning, statistics, communications and signal processing. In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes or benchmark models in neural networks. We evaluate the mutual information (or "free entropy") from which we deduce the Bayes-optimal estimation and generalization errors. Our analysis applies to the high-dimensional limit where both the number of samples and the dimension are large and their ratio is fixed. Non-rigorous predictions for the optimal errors existed for special cases of GLMs, e.g. for the perceptron, in the field of statistical physics based on the so-called replica method. Our present paper rigorously establishes those decades-old conjectures and brings forward their algorithmic interpretation in terms of performance of the generalized approximate message-passing algorithm. Furthermore, we tightly characterize, for many learning problems, regions of parameters for which this algorithm achieves the optimal performance, and locate the associated sharp phase transitions separating learnable and non-learnable regions. We believe that this random version of GLMs can serve as a challenging benchmark for multi-purpose algorithms. This paper is divided into two parts that can be read independently: The first part (main part) presents the model and main results, discusses some applications and sketches the main ideas of the proof. The second part (supplementary information) is much more detailed and provides more examples as well as all the proofs.

Submitted 1 November, 2018; v1 submitted 10 August, 2017; originally announced August 2017.

Comments: 101 pages, 5 figures

Journal ref: Proceedings of the National Academy of Sciences 116.12 (2019): 5451-5460

arXiv:1701.08010 [pdf, other]

doi 10.1109/ISIT.2017.8006580

Statistical and computational phase transitions in spiked tensor estimation

Authors: Thibault Lesieur, Léo Miolane, Marc Lelarge, Florent Krzakala, Lenka Zdeborová

Abstract: We consider tensor factorizations using a generative model and a Bayesian approach. We compute rigorously the mutual information, the Minimal Mean Squared Error (MMSE), and unveil information-theoretic phase transitions. In addition, we study the performance of Approximate Message Passing (AMP) and show that it achieves the MMSE for a large set of parameters, and that factorization is algorithmically "easy" in a much wider region than previously believed. There exists, however, a "hard" region where AMP fails to reach the MMSE and we conjecture that no polynomial algorithm will improve on AMP.

Submitted 16 December, 2017; v1 submitted 27 January, 2017; originally announced January 2017.

Comments: 17 pages, 3 figures, 1 table

Journal ref: IEEE International Symposium on Information Theory (ISIT), pp. 511-515 (2017)

Showing 1–5 of 5 results for author: Miolane, L