Neural Incremental Data Assimilation

Authors: Matthieu Blanke, Ronan Fablet, Marc Lelarge

Abstract: Data assimilation is a central problem in many geophysical applications, such as weather forecasting. It aims to estimate the state of a potentially large system, such as the atmosphere, from sparse observations, supplemented by prior physical knowledge. The size of the systems involved and the complexity of the underlying physical equations make it a challenging task from a computational point of view. Neural networks represent a promising method of emulating the physics at low cost, and therefore have the potential to considerably improve and accelerate data assimilation. In this work, we introduce a deep learning approach where the physical system is modeled as a sequence of coarse-to-fine Gaussian prior distributions parametrized by a neural network. This allows us to define an assimilation operator, which is trained in an end-to-end fashion to minimize the reconstruction error on a dataset with different observation processes. We illustrate our approach on chaotic dynamical physical systems with sparse observations, and compare it to traditional variational data assimilation methods.

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2312.09860

Automatic Rao-Blackwellization for Sequential Monte Carlo with Belief Propagation

Authors: Waïss Azizian, Guillaume Baudart, Marc Lelarge

Abstract: Exact Bayesian inference on state-space models (SSM) is in general intractable, and unfortunately, basic Sequential Monte Carlo (SMC) methods do not yield correct approximations for complex models. In this paper, we propose a mixed inference algorithm that computes closed-form solutions using belief propagation as much as possible, and falls back to sampling-based SMC methods when exact computations fail. This algorithm thus implements automatic Rao-Blackwellization and is even exact for Gaussian tree models.

Submitted 15 December, 2023; originally announced December 2023.

arXiv:2312.00477

Interpretable Meta-Learning of Physical Systems

Authors: Matthieu Blanke, Marc Lelarge

Abstract: Machine learning methods can be a valuable aid in the scientific process, but they need to face challenging settings where data come from inhomogeneous experimental conditions. Recent meta-learning methods have made significant progress in multi-task learning, but they rely on black-box neural networks, resulting in high computational costs and limited interpretability. Leveraging the structure of the learning problem, we argue that multi-environment generalization can be achieved using a simpler learning model, with an affine structure with respect to the learning task. Crucially, we prove that this architecture can identify the physical parameters of the system, enabling interpretable learning. We demonstrate the competitive generalization performance and the low computational cost of our method by comparing it to state-of-the-art algorithms on physical systems, ranging from toy models to complex, non-analytical systems. The interpretability of our method is illustrated with original applications to physical-parameter-induced adaptation and to adaptive control.

Submitted 20 March, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

Journal ref: The Twelfth International Conference on Learning Representations, ICLR 2024

arXiv:2304.13426

FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems

Authors: Matthieu Blanke, Marc Lelarge

Abstract: Model-based reinforcement learning is a powerful tool, but collecting data to fit an accurate model of the system can be costly. Exploring an unknown environment in a sample-efficient manner is hence of great importance. However, the complexity of dynamics and the computational limitations of real systems make this task challenging. In this work, we introduce FLEX, an exploration algorithm for nonlinear dynamics based on optimal experimental design. Our policy maximizes the information of the next step and results in an adaptive exploration algorithm, compatible with generic parametric learning models and requiring minimal resources. We test our method on a number of nonlinear environments covering different settings, including time-varying dynamics. Keeping in mind that exploration is intended to serve an exploitation objective, we also test our algorithm on downstream model-based classical control tasks and compare it to other state-of-the-art model-based and model-free approaches. The performance achieved by FLEX is competitive and its computational cost is low.

Submitted 26 April, 2023; originally announced April 2023.

Comments: Accepted at ICML 2023

arXiv:2301.08117

Convergence beyond the over-parameterized regime using Rayleigh quotients

Authors: David A. R. Robin, Kevin Scaman, Marc Lelarge

Abstract: In this paper, we present a new strategy to prove the convergence of deep learning architectures to a zero training (or even testing) loss by gradient flow. Our analysis is centered on the notion of Rayleigh quotients in order to prove Kurdyka-Łojasiewicz inequalities for a broader set of neural network architectures and loss functions. We show that Rayleigh quotients provide a unified view for several convergence analysis techniques in the literature. Our strategy produces a proof of convergence for various examples of parametric learning. In particular, our analysis does not require the number of parameters to tend to infinity, nor the number of samples to be finite, thus extending to test loss minimization and beyond the over-parameterized regime.

Submitted 19 January, 2023; originally announced January 2023.

Comments: Published at the 36th conference on Neural Information Processing Systems (NeurIPS 2022)

arXiv:2204.06375

doi 10.1109/CDC51059.2022.9993030

Online greedy identification of linear dynamical systems

Authors: Matthieu Blanke, Marc Lelarge

Abstract: This work addresses the problem of exploration in an unknown environment. For linear dynamical systems, we use an experimental design framework and introduce an online greedy policy where the control maximizes the information of the next step. In a setting with a limited number of experimental trials, our algorithm has low complexity and shows experimentally competitive performance compared to more elaborate gradient-based methods.

Submitted 13 April, 2022; originally announced April 2022.

Comments: 17 pages, 2 figures

arXiv:2203.10107

SiMCa: Sinkhorn Matrix Factorization with Capacity Constraints

Authors: Eric Daoud, Luca Ganassali, Antoine Baker, Marc Lelarge

Abstract: For a very broad range of problems, recommendation algorithms have been increasingly used over the past decade. In most of these algorithms, the predictions are built upon user-item affinity scores which are obtained from high-dimensional embeddings of items and users. In more complex scenarios, with geometrical or capacity constraints, prediction based on embeddings may not be sufficient and some additional features should be considered in the design of the algorithm. In this work, we study the recommendation problem in the setting where affinities between users and items are based both on their embeddings in a latent space and on their geographical distance in their underlying Euclidean space (e.g., $\mathbb{R}^2$), together with item capacity constraints. This framework is motivated by some real-world applications, for instance in healthcare: the task is to recommend hospitals to patients based on their location, pathology, and hospital capacities. In these applications, there is somewhat of an asymmetry between users and items: items are viewed as static points, their embeddings, capacities and locations constraining the allocation. Upon the observation of an optimal allocation, user embeddings, item capacities, and their positions in their underlying Euclidean space, our aim is to recover item embeddings in the latent space; doing so, we are then able to use this estimate, e.g., in order to predict future allocations.
We propose an algorithm (SiMCa) based on matrix factorization enhanced with optimal transport steps to model user-item affinities and learn item embeddings from observed data. We then illustrate and discuss the results of such an approach for hospital recommendation on synthetic data.

Submitted 18 March, 2022; originally announced March 2022.

Comments: All comments are welcome
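
To make the "optimal transport steps" with item capacity constraints concrete, here is a minimal sketch of generic entropy-regularized Sinkhorn iterations. It is not the SiMCa algorithm itself: the function name `sinkhorn_plan`, the regularization parameter `epsilon`, the iteration count, and the choice of unit mass per user are all assumptions made for illustration, and the capacities are assumed to sum to the number of users.

```python
import numpy as np


def sinkhorn_plan(affinity, capacities, epsilon=0.5, n_iters=500):
    """Soft user-to-item allocation under capacity constraints.

    affinity:   (n_users, n_items) array of affinity scores (higher is better).
    capacities: (n_items,) array; assumed to sum to n_users.
    Returns a nonnegative (n_users, n_items) plan whose columns sum to the
    capacities and whose rows sum (approximately) to 1.
    """
    affinity = np.asarray(affinity, dtype=float)
    col_mass = np.asarray(capacities, dtype=float)
    n_users, n_items = affinity.shape
    row_mass = np.ones(n_users)  # assumption: each user carries unit mass

    # Gibbs kernel; subtracting the max only rescales it and avoids overflow.
    kernel = np.exp((affinity - affinity.max()) / epsilon)

    u = np.ones(n_users)
    v = np.ones(n_items)
    for _ in range(n_iters):
        u = row_mass / (kernel @ v)    # rescale rows to match user mass
        v = col_mass / (kernel.T @ u)  # rescale columns to match capacities
    return u[:, None] * kernel * v[None, :]
```

In the hospital example from the abstract, `affinity` would combine latent-space similarity and geographical distance, and each column of the returned plan would say how a hospital's capacity is shared among patients. How SiMCa couples such a step with matrix factorization is described in the paper, not here.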

arXiv:2107.07623

Correlation detection in trees for planted graph alignment

Authors: Luca Ganassali, Laurent Massoulié, Marc Lelarge

Abstract: Motivated by alignment of correlated sparse random graphs, we introduce a hypothesis testing problem of deciding whether or not two random trees are correlated. We obtain sufficient conditions under which this testing is impossible or feasible. We propose MPAlign, a message-passing algorithm for graph alignment inspired by the tree correlation detection problem. We prove MPAlign to succeed in polynomial time at partial alignment whenever tree detection is feasible. As a result our analysis of tree detection reveals new ranges of parameters for which partial alignment of sparse random graphs is feasible in polynomial time. We then conjecture that graph alignment is not feasible in polynomial time when the associated tree detection problem is impossible. If true, this conjecture together with our sufficient conditions on tree detection impossibility would imply the existence of a hard phase for graph alignment, i.e. a parameter range where alignment cannot be done in polynomial time even though it is known to be feasible in non-polynomial time.

Submitted 5 December, 2022; v1 submitted 15 July, 2021; originally announced July 2021.

Comments: 38 pages, 9 figures

arXiv:2102.02685

Impossibility of Partial Recovery in the Graph Alignment Problem

Authors: Luca Ganassali, Laurent Massoulié, Marc Lelarge

Abstract: Random graph alignment refers to recovering the underlying vertex correspondence between two random graphs with correlated edges. This can be viewed as an average-case and noisy version of the well-known graph isomorphism problem. For the correlated Erdös-Rényi model, we prove an impossibility result for partial recovery in the sparse regime, with constant average degree and correlation, as well as a general bound on the maximal reachable overlap. Our bound is tight in the noiseless case (the graph isomorphism problem) and we conjecture that it is still tight with noise. Our proof technique relies on a careful application of the probabilistic method to build automorphisms between tree components of a subcritical Erdös-Rényi graph.

Submitted 29 June, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

Comments: 23 pages, 8 figures. Accepted for publication at COLT21

Journal ref: Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:2080-2102, 2021

arXiv:2011.02143

Conditioned Text Generation with Transfer for Closed-Domain Dialogue Systems

Authors: Stéphane d'Ascoli, Alice Coucke, Francesco Caltagirone, Alexandre Caulier, Marc Lelarge

Abstract: Scarcity of training data for task-oriented dialogue systems is a well-known problem that is usually tackled with costly and time-consuming manual data annotation. An alternative solution is to rely on automatic text generation which, although less accurate than human supervision, has the advantage of being cheap and fast. Our contribution is twofold. First we show how to optimally train and control the generation of intent-specific sentences using a conditional variational autoencoder. Then we introduce a new protocol called query transfer that makes it possible to leverage a large unlabelled dataset, possibly containing irrelevant queries, to extract relevant information. Comparison with two different baselines shows that this method, in the appropriate regime, consistently improves the diversity of the generated queries without compromising their quality. We also demonstrate the effectiveness of our generation method as a data augmentation technique for language modelling tasks.

Submitted 3 November, 2020; originally announced November 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:1911.03698

arXiv:2006.15646

Expressive Power of Invariant and Equivariant Graph Neural Networks

Authors: Waïss Azizian, Marc Lelarge

Abstract: Various classes of Graph Neural Networks (GNN) have been proposed and shown to be successful in a wide range of applications with graph structured data. In this paper, we propose a theoretical framework able to compare the expressive power of these GNN architectures. The current universality theorems only apply to intractable classes of GNNs. Here, we prove the first approximation guarantees for practical GNNs, paving the way for a better understanding of their generalization. Our theoretical results are proved for invariant GNNs computing a graph embedding (permutation of the nodes of the input graph does not affect the output) and equivariant GNNs computing an embedding of the nodes (permutation of the input permutes the output). We show that Folklore Graph Neural Networks (FGNN), which are tensor-based GNNs augmented with matrix multiplication, are the most expressive architectures proposed so far for a given tensor order. We illustrate our results on the Quadratic Assignment Problem (an NP-hard combinatorial problem) by showing that FGNNs are able to learn how to solve the problem, leading to much better average performance than existing algorithms (based on spectral, SDP or other GNN architectures). On a practical side, we also implement masked tensors to handle batches of graphs of varying sizes.

Submitted 6 June, 2021; v1 submitted 28 June, 2020; originally announced June 2020.

Comments: Appears in: Proceedings of the 9th International Conference on Learning Representations, ICLR 2021. 39 pages

ACM Class: G.1.6; I.2.6

arXiv:1912.00231

doi 10.1017/apr.2021.31

Spectral Alignment of Correlated Gaussian matrices

Authors: Luca Ganassali, Marc Lelarge, Laurent Massoulié

Abstract: In this paper we analyze a simple spectral method (EIG1) for the problem of matrix alignment, consisting in aligning their leading eigenvectors: given two matrices $A$ and $B$, we compute $v_1$ and $v'_1$ two corresponding leading eigenvectors. The algorithm returns the permutation $\hat{\pi}$ such that the rank of coordinate $\hat{\pi}(i)$ in $v_1$ and that of coordinate $i$ in $v'_1$ (up to the sign of $v'_1$) are the same. We consider a model of weighted graphs where the adjacency matrix $A$ belongs to the Gaussian Orthogonal Ensemble (GOE) of size $N \times N$, and $B$ is a noisy version of $A$ where all nodes have been relabeled according to some planted permutation $\pi$, namely $B = \Pi^T (A + \sigma H) \Pi$, where $\Pi$ is the permutation matrix associated with $\pi$ and $H$ is an independent copy of $A$. We show the following zero-one law: with high probability, under the condition $\sigma N^{7/6+\epsilon} \to 0$ for some $\epsilon > 0$, EIG1 recovers all but a vanishing part of the underlying permutation $\pi$, whereas if $\sigma N^{7/6-\epsilon} \to \infty$, this method cannot recover more than $o(N)$ correct matches. This result gives an understanding of the simplest and fastest spectral method for matrix alignment (or complete weighted graph alignment), and involves proof methods and techniques which could be of independent interest.

Submitted 11 May, 2021; v1 submitted 30 November, 2019; originally announced December 2019.

Comments: 26 pages, 4 figures. Figures and paper organization updated, typos corrected. Remark 4.2. added

Journal ref: Advances in Applied Probability (2022) 1-32
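
The EIG1 procedure described in the abstract above (match coordinates of the two leading eigenvectors by rank, up to a sign) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code. The helper names (`sample_goe`, `planted_pair`, `eig1`), the GOE normalization, the indexing convention `B[i, j] = (A + sigma * H)[pi[i], pi[j]]`, and the rule used to pick the sign (keep the candidate with the larger alignment score) are assumptions that the abstract does not specify.

```python
import numpy as np


def sample_goe(n, rng):
    """Symmetric Gaussian matrix; the 1/sqrt(2n) scaling is an assumption."""
    g = rng.normal(size=(n, n))
    return (g + g.T) / np.sqrt(2 * n)


def leading_eigenvector(m):
    """Unit eigenvector for the largest eigenvalue of a symmetric matrix."""
    _, vecs = np.linalg.eigh(m)  # eigenvalues are returned in ascending order
    return vecs[:, -1]


def eig1(a, b):
    """Return pi_hat with rank of pi_hat[i] in v_a == rank of i in +/- v_b."""
    v_a = leading_eigenvector(a)
    v_b = leading_eigenvector(b)
    order_a = np.argsort(v_a)  # order_a[k] = coordinate of rank k in v_a
    best_pi, best_score = None, -np.inf
    for sign in (1.0, -1.0):
        order_b = np.argsort(sign * v_b)
        pi_hat = np.empty(len(v_a), dtype=int)
        pi_hat[order_b] = order_a  # equal ranks are matched together
        # Assumed tie-break for the sign: keep the better-aligned candidate.
        score = np.sum(a[np.ix_(pi_hat, pi_hat)] * b)
        if score > best_score:
            best_pi, best_score = pi_hat, score
    return best_pi


def planted_pair(n, sigma, rng):
    """Sample (A, B, pi) with B a relabeled, noisy copy of A."""
    a = sample_goe(n, rng)
    h = sample_goe(n, rng)
    pi = rng.permutation(n)
    b = (a + sigma * h)[np.ix_(pi, pi)]
    return a, b, pi
```

A quick way to explore the stated threshold is to call `planted_pair(n, sigma, rng)` for several values of `sigma` and track the overlap `np.mean(eig1(a, b) == pi)`; the abstract predicts a transition around `sigma` of order `n ** (-7 / 6)`.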

arXiv:1911.03698

Conditioned Query Generation for Task-Oriented Dialogue Systems

Authors: Stéphane d'Ascoli, Alice Coucke, Francesco Caltagirone, Alexandre Caulier, Marc Lelarge

Abstract: Scarcity of training data for task-oriented dialogue systems is a well-known problem that is usually tackled with costly and time-consuming manual data annotation. An alternative solution is to rely on automatic text generation which, although less accurate than human supervision, has the advantage of being cheap and fast. In this paper we propose a novel controlled data generation method that could be used as a training augmentation framework for closed-domain dialogue. Our contribution is twofold. First we show how to optimally train and control the generation of intent-specific sentences using a conditional variational autoencoder. Then we introduce a novel protocol called query transfer that makes it possible to leverage a broad, unlabelled dataset to extract relevant information. Comparison with two different baselines shows that our method, in the appropriate regime, consistently improves the diversity of the generated queries without compromising their quality.

Submitted 9 November, 2019; originally announced November 2019.

arXiv:1907.03792

Asymptotic Bayes risk for Gaussian mixture in a semi-supervised setting

Authors: Marc Lelarge, Leo Miolane

Abstract: Semi-supervised learning (SSL) uses unlabeled data for training and has been shown to greatly improve performance when compared to a supervised approach on the labeled data available. This claim depends both on the amount of labeled data available and on the algorithm used. In this paper, we compute analytically the gap between the best fully-supervised approach using only labeled data and the best semi-supervised approach using both labeled and unlabeled data. We quantify the best possible increase in performance obtained thanks to the unlabeled data, i.e. we compute the accuracy increase due to the information contained in the unlabeled data. Our work deals with a simple high-dimensional Gaussian mixture model for the data in a Bayesian setting. Our rigorous analysis builds on recent theoretical breakthroughs in high-dimensional inference and a large body of mathematical tools from statistical physics initially developed for spin glasses.

Submitted 28 September, 2019; v1 submitted 8 July, 2019; originally announced July 2019.

Comments: 13 pages

arXiv:1809.11115

Weighted Spectral Embedding of Graphs

Authors: Thomas Bonald, Alexandre Hollocou, Marc Lelarge

Abstract: We present a novel spectral embedding of graphs that incorporates weights assigned to the nodes, quantifying their relative importance. This spectral embedding is based on the first eigenvectors of some properly normalized version of the Laplacian. We prove that these eigenvectors correspond to the configurations of lowest energy of an equivalent physical system, either mechanical or electrical, in which the weight of each node can be interpreted as its mass or its capacitance, respectively. Experiments on a real dataset illustrate the impact of weighting on the embedding.

Submitted 3 October, 2018; v1 submitted 28 September, 2018; originally announced September 2018.

arXiv:1806.08240

InfoCatVAE: Representation Learning with Categorical Variational Autoencoders

Authors: Edouard Pineau, Marc Lelarge

Abstract: This paper describes InfoCatVAE, an extension of the variational autoencoder that enables unsupervised disentangled representation learning. InfoCatVAE uses multimodal distributions for the prior and the inference network and then maximizes the evidence lower bound objective (ELBO). We connect the new ELBO derived for our model with a natural soft clustering objective which explains the robustness of our approach. We then adapt the InfoGANs method to our setting in order to maximize the mutual information between the categorical code and the generated inputs and obtain an improved model.

Submitted 25 June, 2018; v1 submitted 20 June, 2018; originally announced June 2018.

Comments: 9 pages, 3 appendix, 5 figures. arXiv admin note: text overlap with arXiv:1606.03657 by other authors

arXiv:1803.09533

Deep Representation for Patient Visits from Electronic Health Records

Authors: Jean-Baptiste Escudié, Alaa Saade, Alice Coucke, Marc Lelarge

Abstract: We show how to learn low-dimensional representations (embeddings) of patient visits from the corresponding electronic health record (EHR) where International Classification of Diseases (ICD) diagnosis codes are removed. We expect that these embeddings will be useful for the construction of predictive statistical models anticipated to drive personalized medicine and improve healthcare quality. These embeddings are learned using a deep neural network trained to predict ICD diagnosis categories. We show that our embeddings capture relevant clinical information and can be used directly as input to standard machine learning algorithms like multi-output classifiers for ICD code prediction. We also show that important medical information corresponds to particular directions in our embedding space.

Submitted 26 March, 2018; originally announced March 2018.

arXiv:1801.02889

Optimal Content Replication and Request Matching in Large Caching Systems

Authors: Arpan Mukhopadhyay, Nidhi Hegde, Marc Lelarge

Abstract: We consider models of content delivery networks in which the servers are constrained by two main resources: memory and bandwidth. In such systems, the throughput crucially depends on how contents are replicated across servers and how the requests of specific contents are matched to servers storing those contents. In this paper, we first formulate the problem of computing the optimal replication policy which if combined with the optimal matching policy maximizes the throughput of the caching system in the stationary regime. It is shown that computing the optimal replication policy for a given system is an NP-hard problem. A greedy replication scheme is proposed and it is shown that the scheme provides a constant factor approximation guarantee. We then propose a simple randomized matching scheme which avoids the problem of interruption in service of the ongoing requests due to re-assignment or repacking of the existing requests in the optimal matching policy. The dynamics of the caching system is analyzed under the combination of proposed replication and matching schemes. We study a limiting regime, where the number of servers and the arrival rates of the contents are scaled proportionally, and show that the proposed policies achieve asymptotic optimality. Extensive simulation results are presented to evaluate the performance of different policies and study the behavior of the caching system under different service time distributions of the requests.

Submitted 9 January, 2018; originally announced January 2018.

Comments: INFOCOM 2018

arXiv:1712.04337

A Streaming Algorithm for Graph Clustering

Authors: Alexandre Hollocou, Julien Maudet, Thomas Bonald, Marc Lelarge

Abstract: We introduce a novel algorithm to perform graph clustering in the edge streaming setting. In this model, the graph is presented as a sequence of edges that can be processed strictly once. Our streaming algorithm has an extremely low memory footprint as it stores only three integers per node and does not keep any edge in memory. We provide a theoretical justification of the design of the algorithm based on the modularity function, which is a usual metric to evaluate the quality of a graph partition. We perform experiments on massive real-life graphs ranging from one million to more than one billion edges and we show that this new algorithm runs more than ten times faster than existing algorithms and leads to similar or better detection scores on the largest graphs.

Submitted 9 December, 2017; originally announced December 2017.

Comments: NIPS Workshop on Advances in Modeling and Learning Interactions from Complex Data, 2017. arXiv admin note: substantial text overlap with arXiv:1703.02955

arXiv:1708.02457

doi 10.1007/s10955-018-1964-6

Replica Bounds by Combinatorial Interpolation for Diluted Spin Systems

Authors: Marc Lelarge, Mendes Oulamara

Abstract: In two papers Franz, Leone and Toninelli proved bounds for the free energy of diluted random constraints satisfaction problems, for a Poisson degree distribution [5] and a general distribution [6]. Panchenko and Talagrand [16] simplified the proof and generalized the result of [5] for the Poisson case. We provide a new proof for the general degree distribution case and as a corollary, we obtain new bounds for the size of the largest independent set (also known as hard core model) in a large random regular graph. Our proof uses a combinatorial interpolation based on biased random walks [21] and allows us to bypass the arguments in [6] based on the study of the Sherrington-Kirkpatrick (SK) model.

Submitted 17 January, 2018; v1 submitted 8 August, 2017; originally announced August 2017.

Comments: Accepted in Journal of Statistical Physics

arXiv:1703.02955 [pdf, other]

A linear streaming algorithm for community detection in very large networks

Authors: Alexandre Hollocou, Julien Maudet, Thomas Bonald, Marc Lelarge

Abstract: In this paper, we introduce a novel community detection algorithm in graphs, called SCoDA (Streaming Community Detection Algorithm), based on an edge streaming setting. This algorithm has an extremely low memory footprint and a lightning-fast execution time as it only stores two integers per node and processes each edge strictly once. The approach is based on the following simple observation: if we pick an edge uniformly at random in the network, this edge is more likely to connect two nodes of the same community than two nodes of distinct communities. We exploit this idea to build communities by local changes at each edge arrival. Using theoretical arguments, we relate the ability of SCoDA to detect communities to usual quality metrics of these communities like the conductance. Experimental results on massive real-life networks ranging from one million to more than one billion edges show that SCoDA runs more than ten times faster than existing algorithms and leads to similar or better detection scores on the largest graphs.

Submitted 8 March, 2017; originally announced March 2017.

Comments: Currently under review by an international conference
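
Illustrative note: the abstract above describes an edge-streaming scheme that keeps two integers per node and updates communities locally at each edge arrival. The sketch below is one plausible rule of that shape, not a confirmed transcription of SCoDA: the function name `stream_communities`, the degree threshold `max_degree`, and the tie-breaking choice are all assumptions made for illustration.

```python
def stream_communities(edges, max_degree):
    """One-pass community assignment over a stream of edges.

    Assumed rule (hypothetical, for illustration only): each node stores
    its current degree and a community label. When an edge arrives and
    both endpoints still have degree <= max_degree, the endpoint with the
    lower degree adopts the community of the other endpoint.
    """
    degree = {}     # first integer per node: degree seen so far
    community = {}  # second integer per node: community label
    for u, v in edges:
        for node in (u, v):
            degree.setdefault(node, 0)
            community.setdefault(node, node)  # every node starts alone
        degree[u] += 1
        degree[v] += 1
        if degree[u] <= max_degree and degree[v] <= max_degree:
            if degree[u] <= degree[v]:
                community[u] = community[v]
            else:
                community[v] = community[u]
    return community


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(stream_communities(triangle, max_degree=2))
```

No edge is stored: each edge is read once and only the two per-node integers are updated, which is the memory profile the abstract emphasizes.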

arXiv:1701.08010 [pdf, other]

doi 10.1109/ISIT.2017.8006580

Statistical and computational phase transitions in spiked tensor estimation

Authors: Thibault Lesieur, Léo Miolane, Marc Lelarge, Florent Krzakala, Lenka Zdeborová

Abstract: We consider tensor factorizations using a generative model and a Bayesian approach. We compute rigorously the mutual information, the Minimal Mean Squared Error (MMSE), and unveil information-theoretic phase transitions. In addition, we study the performance of Approximate Message Passing (AMP) and show that it achieves the MMSE for a large set of parameters, and that factorization is algorithmically "easy" in a much wider region than previously believed. There exists, however, a "hard" region where AMP fails to reach the MMSE and we conjecture that no polynomial algorithm will improve on AMP.

Submitted 16 December, 2017; v1 submitted 27 January, 2017; originally announced January 2017.

Comments: 17 pages, 3 figures, 1 table

Journal ref: IEEE International Symposium on Information Theory (ISIT), pp. 511-515 (2017)

arXiv:1611.03888 [pdf, other]

Fundamental limits of symmetric low-rank matrix estimation

Authors: Marc Lelarge, Léo Miolane

Abstract: We consider the high-dimensional inference problem where the signal is a low-rank symmetric matrix which is corrupted by an additive Gaussian noise. Given a probabilistic model for the low-rank matrix, we compute the limit in the large dimension setting for the mutual information between the signal and the observations, as well as the matrix minimum mean square error, while the rank of the signal remains constant. We also show that our model extends beyond the particular case of additive Gaussian noise and we prove a universality result connecting the community detection problem to our Gaussian framework. We unify and generalize a number of recent works on PCA, sparse PCA, submatrix localization or community detection by computing the information-theoretic limits for these problems in the high noise regime. In addition, we show that the posterior distribution of the signal given the observations is characterized by a parameter of the same dimension as the square of the rank of the signal (i.e. scalar in the case of rank one). Finally, we connect our work with the hard but detectable conjecture in statistical physics.

Submitted 30 March, 2017; v1 submitted 11 November, 2016; originally announced November 2016.

arXiv:1610.08722 [pdf, other]

Improving PageRank for Local Community Detection

Authors: Alexandre Hollocou, Thomas Bonald, Marc Lelarge

Abstract: Community detection is a classical problem in the field of graph mining. While most algorithms work on the entire graph, it is often interesting in practice to recover only the community containing some given set of seed nodes. In this paper, we propose a novel approach to this problem, using some low-dimensional embedding of the graph based on random walks starting from the seed nodes. From this embedding, we propose some simple yet efficient versions of the PageRank algorithm as well as a novel algorithm, called WalkSCAN, that is able to detect multiple communities, possibly overlapping. We provide insights into the performance of these algorithms through the theoretical analysis of a toy network and show that WalkSCAN outperforms existing algorithms on real networks.

Submitted 7 November, 2016; v1 submitted 27 October, 2016; originally announced October 2016.

Comments: Currently under review by an international conference

arXiv:1610.03680 [pdf, other]

Recovering asymmetric communities in the stochastic block model

Authors: Francesco Caltagirone, Marc Lelarge, Léo Miolane

Abstract: We consider the sparse stochastic block model in the case where the degrees are uninformative. The case where the two communities have approximately the same size has been extensively studied and we concentrate here on the community detection problem in the case of unbalanced communities. In this setting, spectral algorithms based on the non-backtracking matrix are known to solve the community detection problem (i.e. do strictly better than a random guess) when the signal is sufficiently large namely above the so-called Kesten Stigum threshold. In this regime and when the average degree tends to infinity, we show that if the community of a vanishing fraction of the vertices is revealed, then a local algorithm (belief propagation) is optimal down to Kesten Stigum threshold and we quantify explicitly its performance. Below the Kesten Stigum threshold, we show that, in the large degree limit, there is a second threshold called the spinodal curve below which, the community detection problem is not solvable. The spinodal curve is equal to the Kesten Stigum threshold when the fraction of vertices in the smallest community is above $p^*=\frac{1}{2}-\frac{1}{2\sqrt{3}}$, so that the Kesten Stigum threshold is the threshold for solvability of the community detection in this case. However when the smallest community is smaller than $p^*$, the spinodal curve only provides a lower bound on the threshold for solvability.
In the regime below the Kesten Stigum bound and above the spinodal curve, we also characterize the performance of best local algorithms as a function of the fraction of revealed vertices. Our proof relies on a careful analysis of the associated reconstruction problem on trees which might be of independent interest. In particular, we show that the spinodal curve corresponds to the reconstruction threshold on the tree.

Submitted 31 March, 2017; v1 submitted 12 October, 2016; originally announced October 2016.

arXiv:1609.02487 [pdf, ps, other]

Non-Backtracking Spectrum of Degree-Corrected Stochastic Block Models

Authors: Lennart Gulikers, Marc Lelarge, Laurent Massoulié

Abstract: Motivated by community detection, we characterise the spectrum of the non-backtracking matrix $B$ in the Degree-Corrected Stochastic Block Model. Specifically, we consider a random graph on $n$ vertices partitioned into two equal-sized clusters. The vertices have i.i.d. weights $\{ φ_u \}_{u=1}^n$ with second moment $Φ^{(2)}$. The intra-cluster connection probability for vertices $u$ and $v$ is $\frac{φ_u φ_v}{n}a$ and the inter-cluster connection probability is $\frac{φ_u φ_v}{n}b$. We show that with high probability, the following holds: The leading eigenvalue of the non-backtracking matrix $B$ is asymptotic to $ρ= \frac{a+b}{2} Φ^{(2)}$. The second eigenvalue is asymptotic to $μ_2 = \frac{a-b}{2} Φ^{(2)}$ when $μ_2^2 > ρ$, but asymptotically bounded by $\sqrt{ρ}$ when $μ_2^2 \leq ρ$. All the remaining eigenvalues are asymptotically bounded by $\sqrt{ρ}$. As a result, a clustering positively-correlated with the true communities can be obtained based on the second eigenvector of $B$ in the regime where $μ_2^2 > ρ.$ In a previous work we obtained that detection is impossible when $μ_2^2 < ρ,$ meaning that there occurs a phase-transition in the sparse regime of the Degree-Corrected Stochastic Block Model. As a corollary, we obtain that Degree-Corrected Erdős-Rényi graphs asymptotically satisfy the graph Riemann hypothesis, a quasi-Ramanujan property. A by-product of our proof is a weak law of large numbers for local-functionals on Degree-Corrected Stochastic Block Models, which could be of independent interest.

Submitted 18 May, 2017; v1 submitted 8 September, 2016; originally announced September 2016.
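
Illustrative note: the abstract above gives closed-form asymptotics for the two leading eigenvalues, $ρ = \frac{a+b}{2} Φ^{(2)}$ and $μ_2 = \frac{a-b}{2} Φ^{(2)}$, and states that the second eigenvector is informative when $μ_2^2 > ρ$. The short sketch below simply evaluates those formulas for given parameters; the function name `dcsbm_spectral_summary` and the example values are assumptions chosen for illustration.

```python
def dcsbm_spectral_summary(a, b, phi2):
    """Evaluate the asymptotic eigenvalue formulas quoted in the abstract.

    a, b  : intra- and inter-cluster connection constants
    phi2  : second moment of the vertex weights, Phi^(2)
    """
    rho = (a + b) / 2 * phi2   # asymptotic leading eigenvalue
    mu2 = (a - b) / 2 * phi2   # asymptotic second eigenvalue (when informative)
    return {
        "rho": rho,
        "mu2": mu2,
        # Regime in which the abstract says the second eigenvector of B
        # yields a clustering correlated with the true communities.
        "detectable": mu2 ** 2 > rho,
    }


if __name__ == "__main__":
    print(dcsbm_spectral_summary(a=5, b=1, phi2=1.0))
    print(dcsbm_spectral_summary(a=3, b=1, phi2=1.0))
```

With unit second moment, the pair a = 5, b = 1 gives $μ_2^2 = 4 > ρ = 3$, while a = 3, b = 1 gives $μ_2^2 = 1 < ρ = 2$, so only the first falls in the detectable regime.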

arXiv:1606.00858 [pdf, other]

Impact of Community Structure on Cascades

Authors: Mehrdad Moharrami, Vijay Subramanian, Mingyan Liu, Marc Lelarge

Abstract: We study cascades under the threshold model on sparse random graphs with community structure. In this model, individuals adopt the new behavior based on how many neighbors have already chosen it. Specifically, we consider the permanent adoption model wherein individuals that have adopted the new behavior (or opinion) cannot change their state. We present a differential-equation-based tight approximation to the stochastic process of adoption and prove the validity of the mean-field equations. In addition, we characterize both necessary and sufficient conditions for contagion to happen no matter how small the set of initial adopters is. Finally, we study the problem of optimum seeding given budget constraints and propose a gradient-based heuristic seeding strategy. Our algorithm, numerically, dispels commonly held beliefs in the literature that suggest the best seeding strategy is to seed over the vertices with the highest number of neighbors.

Submitted 4 May, 2022; v1 submitted 2 June, 2016; originally announced June 2016.

MSC Class: 05C80

arXiv:1605.06422 [pdf, other]

doi 10.1088/1742-6596/1036/1/012015

Fast Randomized Semi-Supervised Clustering

Authors: Alaa Saade, Florent Krzakala, Marc Lelarge, Lenka Zdeborová

Abstract: We consider the problem of clustering partially labeled data from a minimal number of randomly chosen pairwise comparisons between the items. We introduce an efficient local algorithm based on a power iteration of the non-backtracking operator and study its performance on a simple model. For the case of two clusters, we give bounds on the classification error and show that a small error can be achieved from $O(n)$ randomly chosen measurements, where $n$ is the number of items in the dataset. Our algorithm is therefore efficient both in terms of time and space complexities. We also investigate numerically the performance of the algorithm on synthetic and real world data.

Submitted 9 October, 2016; v1 submitted 20 May, 2016; originally announced May 2016.

Journal ref: Journal of Physics: Conf. Series 1036 (2018) 012015

arXiv:1601.06683 [pdf, other]

doi 10.1109/ISIT.2016.7541405

Clustering from Sparse Pairwise Measurements

Authors: Alaa Saade, Marc Lelarge, Florent Krzakala, Lenka Zdeborová

Abstract: We consider the problem of grouping items into clusters based on few random pairwise comparisons between the items. We introduce three closely related algorithms for this task: a belief propagation algorithm approximating the Bayes optimal solution, and two spectral algorithms based on the non-backtracking and Bethe Hessian operators. For the case of two symmetric clusters, we conjecture that these algorithms are asymptotically optimal in that they detect the clusters as soon as it is information theoretically possible to do so. We substantiate this claim for one of the spectral approaches we introduce.

Submitted 19 May, 2016; v1 submitted 25 January, 2016; originally announced January 2016.

Journal ref: Proceedings of the 2016 IEEE International Symposium on Information Theory (ISIT) Pages: 780 - 784

arXiv:1511.00546 [pdf, ps, other]

An Impossibility Result for Reconstruction in a Degree-Corrected Planted-Partition Model

Authors: Lennart Gulikers, Marc Lelarge, Laurent Massoulié

Abstract: We consider the Degree-Corrected Stochastic Block Model (DC-SBM): a random graph on $n$ nodes, having i.i.d. weights $(φ_u)_{u=1}^n$ (possibly heavy-tailed), partitioned into $q \geq 2$ asymptotically equal-sized clusters. The model parameters are two constants $a,b > 0$ and the finite second moment of the weights $Φ^{(2)}$. Vertices $u$ and $v$ are connected by an edge with probability $\frac{φ_u φ_v}{n}a$ when they are in the same class and with probability $\frac{φ_u φ_v}{n}b$ otherwise. We prove that it is information-theoretically impossible to estimate the clusters in a way positively correlated with the true community structure when $(a-b)^2 Φ^{(2)} \leq q(a+b)$. As by-products of our proof we obtain $(1)$ a precise coupling result for local neighbourhoods in DC-SBM's, that we use in a follow up paper [Gulikers et al., 2017] to establish a law of large numbers for local-functionals and $(2)$ that long-range interactions are weak in (power-law) DC-SBM's.

Submitted 24 November, 2018; v1 submitted 2 November, 2015; originally announced November 2015.

Comments: Appeared in Annals of Applied Probability

Journal ref: Annals of Applied Probability - Volume 28, Number 5 (2018), 3002-3027

arXiv:1507.04739 [pdf, ps, other]

Counting matchings in irregular bipartite graphs and random lifts

Authors: Marc Lelarge

Abstract: We give a sharp lower bound on the number of matchings of a given size in a bipartite graph. When specialized to regular bipartite graphs, our results imply Friedland's Lower Matching Conjecture and Schrijver's theorem proven by Gurvits and Csikvari. Indeed, our work extends the recent work of Csikvari done for regular and bi-regular bipartite graphs. Moreover, our lower bounds are order optimal as they are attained for a sequence of $2$-lifts of the original graph as well as for random $n$-lifts of the original graph when $n$ tends to infinity. We then extend our results to permanents and subpermanents sums. For permanents, we are able to recover the lower bound of Schrijver recently proved by Gurvits using stable polynomials. Our proof is algorithmic and borrows ideas from the theory of local weak convergence of graphs, statistical physics and covers of graphs. We provide new lower bounds for subpermanents sums and obtain new results on the number of matchings in random $n$-lifts with some implications for the matching measure and the spectral measure of random $n$-lifts as well as for the spectral measure of infinite trees.

Submitted 5 November, 2015; v1 submitted 16 July, 2015; originally announced July 2015.

Comments: 26 pages, extended version (results for random lifts and more related work)

arXiv:1506.08621 [pdf, other]

A spectral method for community detection in moderately-sparse degree-corrected stochastic block models

Authors: Lennart Gulikers, Marc Lelarge, Laurent Massoulié

Abstract: We consider community detection in Degree-Corrected Stochastic Block Models (DC-SBM). We propose a spectral clustering algorithm based on a suitably normalized adjacency matrix. We show that this algorithm consistently recovers the block-membership of all but a vanishing fraction of nodes, in the regime where the lowest degree is of order log$(n)$ or higher. Recovery succeeds even for very heterogeneous degree-distributions. The algorithm does not rely on parameters as input. In particular, it does not need to know the number of communities.

Submitted 7 February, 2017; v1 submitted 29 June, 2015; originally announced June 2015.

arXiv:1506.04158 [pdf, other]

A Spectral Algorithm with Additive Clustering for the Recovery of Overlapping Communities in Networks

Authors: Emilie Kaufmann, Thomas Bonald, Marc Lelarge

Abstract: This paper presents a novel spectral algorithm with additive clustering designed to identify overlapping communities in networks. The algorithm is based on geometric properties of the spectrum of the expected adjacency matrix in a random graph model that we call stochastic blockmodel with overlap (SBMO). An adaptive version of the algorithm, that does not require the knowledge of the number of hidden communities, is proved to be consistent under the SBMO when the degrees in the graph are (slightly more than) logarithmic. The algorithm is shown to perform well on simulated data and on real-world graphs with known overlapping communities.

Submitted 6 November, 2017; v1 submitted 12 June, 2015; originally announced June 2015.

Comments: Journal of Theoretical Computer Science (TCS), Elsevier, to appear

arXiv:1504.03156 [pdf, ps, other]

Streaming, Memory Limited Matrix Completion with Noise

Authors: Se-Young Yun, Marc Lelarge, Alexandre Proutiere

Abstract: In this paper, we consider the streaming memory-limited matrix completion problem when the observed entries are noisy versions of a small random fraction of the original entries. We are interested in scenarios where the matrix size is very large so the matrix is very hard to store and manipulate. Here, columns of the observed matrix are presented sequentially and the goal is to complete the missing entries after one pass on the data with limited memory space and limited computational complexity. We propose a streaming algorithm which produces an estimate of the original matrix with a vanishing mean square error, uses memory space scaling linearly with the ambient dimension of the matrix, i.e. the memory required to store the output alone, and spends computations as much as the number of non-zero entries of the input matrix.

Submitted 13 April, 2015; originally announced April 2015.

Comments: 21 pages

arXiv:1502.04631 [pdf, other]

Clustering and Inference From Pairwise Comparisons

Authors: Rui Wu, Jiaming Xu, R. Srikant, Laurent Massoulié, Marc Lelarge, Bruce Hajek

Abstract: Given a set of pairwise comparisons, the classical ranking problem computes a single ranking that best represents the preferences of all users. In this paper, we study the problem of inferring individual preferences, arising in the context of making personalized recommendations. In particular, we assume that there are $n$ users of $r$ types; users of the same type provide similar pairwise comparisons for $m$ items according to the Bradley-Terry model. We propose an efficient algorithm that accurately estimates the individual preferences for almost all users, if there are $r \max \{m, n\}\log m \log^2 n$ pairwise comparisons per type, which is near optimal in sample complexity when $r$ only grows logarithmically with $m$ or $n$. Our algorithm has three steps: first, for each user, compute the net-win vector which is a projection of its $\binom{m}{2}$-dimensional vector of pairwise comparisons onto an $m$-dimensional linear subspace; second, cluster the users based on the net-win vectors; third, estimate a single preference for each cluster separately. The net-win vectors are much less noisy than the high dimensional vectors of pairwise comparisons and clustering is more accurate after the projection as confirmed by numerical experiments. Moreover, we show that, when a cluster is only approximately correct, the maximum likelihood estimation for the Bradley-Terry model is still close to the true preference.

Submitted 17 December, 2015; v1 submitted 16 February, 2015; originally announced February 2015.

Comments: Corrected typos in the abstract

arXiv:1502.03475 [pdf, other]

Combinatorial Bandits Revisited

Authors: Richard Combes, M. Sadegh Talebi, Alexandre Proutiere, Marc Lelarge

Abstract: This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting under semi-bandit feedback, we derive a problem-specific regret lower bound, and discuss its scaling with the dimension of the decision space. We propose ESCB, an algorithm that efficiently exploits the structure of the problem and provide a finite-time analysis of its regret. ESCB has better performance guarantees than existing algorithms, and significantly outperforms these algorithms in practice. In the adversarial setting under bandit feedback, we propose CombEXP, an algorithm with the same regret scaling as state-of-the-art algorithms, but with lower computational complexity for some combinatorial problems.

Submitted 5 November, 2015; v1 submitted 11 February, 2015; originally announced February 2015.

Comments: 30 pages, Advances in Neural Information Processing Systems 28 (NIPS 2015)

arXiv:1502.03365 [pdf, other]

Reconstruction in the Labeled Stochastic Block Model

Authors: Marc Lelarge, Laurent Massoulié, Jiaming Xu

Abstract: The labeled stochastic block model is a random graph model representing networks with community structure and interactions of multiple types. In its simplest form, it consists of two communities of approximately equal size, and the edges are drawn and labeled at random with probability depending on whether their two endpoints belong to the same community or not. It has been conjectured in \cite{Heimlicher12} that correlated reconstruction (i.e., identification of a partition correlated with the true partition into the underlying communities) would be feasible if and only if a model parameter exceeds a threshold. We prove one half of this conjecture, i.e., reconstruction is impossible when below the threshold. In the positive direction, we introduce a weighted graph to exploit the label information. With a suitable choice of weight function, we show that when above the threshold by a specific constant, reconstruction is achieved by (1) minimum bisection, (2) a semidefinite relaxation of minimum bisection, and (3) a spectral method combined with removal of edges incident to vertices of high degree. Furthermore, we show that hypothesis testing between the labeled stochastic block model and the labeled Erdős-Rényi random graph model exhibits a phase transition at the conjectured reconstruction threshold.

Submitted 11 February, 2015; originally announced February 2015.

Comments: A preliminary version of this paper appeared in the Proceedings of the 2013 Information Theory Workshop

arXiv:1502.00163 [pdf, other]

doi 10.1109/ISIT.2015.7282642

Spectral Detection in the Censored Block Model

Authors: Alaa Saade, Florent Krzakala, Marc Lelarge, Lenka Zdeborová

Abstract: We consider the problem of partially recovering hidden binary variables from the observation of (few) censored edge weights, a problem with applications in community detection, correlation clustering and synchronization. We describe two spectral algorithms for this task based on the non-backtracking and the Bethe Hessian operators. These algorithms are shown to be asymptotically optimal for the partial recovery problem, in that they detect the hidden assignment as soon as it is information theoretically possible to do so.

Submitted 10 June, 2015; v1 submitted 31 January, 2015; originally announced February 2015.

Comments: ISIT 2015

Journal ref: IEEE International Symposium on Information Theory (ISIT), pp.1184-1188 (2015)

arXiv:1501.06087 [pdf, other]

Non-backtracking spectrum of random graphs: community detection and non-regular Ramanujan graphs

Authors: Charles Bordenave, Marc Lelarge, Laurent Massoulié

Abstract: A non-backtracking walk on a graph is a directed path such that no edge is the inverse of its preceding edge. The non-backtracking matrix of a graph is indexed by its directed edges and can be used to count non-backtracking walks of a given length. It has been used recently in the context of community detection and has appeared previously in connection with the Ihara zeta function and in some generalizations of Ramanujan graphs. In this work, we study the largest eigenvalues of the non-backtracking matrix of the Erdős-Rényi random graph and of the Stochastic Block Model in the regime where the number of edges is proportional to the number of vertices. Our results confirm the "spectral redemption" conjecture that community detection can be made on the basis of the leading eigenvectors above the feasibility threshold.

Submitted 22 April, 2015; v1 submitted 24 January, 2015; originally announced January 2015.

Comments: 59 pages

MSC Class: 05C80; 05C50; 91D30

arXiv:1412.1004 [pdf, other]

On rigidity, orientability and cores of random graphs with sliders

Authors: Julien Barré, Marc Lelarge, Dieter Mitsche

Abstract: Suppose that you add rigid bars between points in the plane, and suppose that a constant fraction $q$ of the points moves freely in the whole plane; the remaining fraction is constrained to move on fixed lines called sliders. When does a giant rigid cluster emerge? Under a genericity condition, the answer only depends on the graph formed by the points (vertices) and the bars (edges). We find for the random graph $G \in \mathcal{G}(n,c/n)$ the threshold value of $c$ for the appearance of a linear-sized rigid component as a function of $q$, generalizing results of Kasiviswanathan et al. We show that this appearance of a giant component undergoes a continuous transition for $q \leq 1/2$ and a discontinuous transition for $q > 1/2$. In our proofs, we introduce a generalized notion of orientability interpolating between 1- and 2-orientability, of cores interpolating between 2-core and 3-core, and of extended cores interpolating between 2+1-core and 3+2-core; we find the precise expressions for the respective thresholds and the sizes of the different cores above the threshold. In particular, this proves a conjecture of Kasiviswanathan et al. about the size of the 3+2-core. We also derive some structural properties of rigidity with sliders (matroid and decomposition into components) which can be of independent interest.

Submitted 20 February, 2015; v1 submitted 2 December, 2014; originally announced December 2014.

Comments: 32 pages, 1 figure

arXiv:1411.1279 [pdf, ps, other]

Streaming, Memory Limited Algorithms for Community Detection

Authors: Se-Young Yun, Marc Lelarge, Alexandre Proutiere

Abstract: In this paper, we consider sparse networks consisting of a finite number of non-overlapping communities, i.e. disjoint clusters, so that there is higher density within clusters than across clusters. Both the intra- and inter-cluster edge densities vanish when the size of the graph grows large, making the cluster reconstruction problem noisier and hence difficult to solve. We are interested in scenarios where the network size is very large, so that the adjacency matrix of the graph is hard to manipulate and store. The data stream model in which columns of the adjacency matrix are revealed sequentially constitutes a natural framework in this setting. For this model, we develop two novel clustering algorithms that extract the clusters asymptotically accurately. The first algorithm is offline, as it needs to store and keep the assignments of nodes to clusters, and requires a memory that scales linearly with the network size. The second algorithm is online, as it may classify a node when the corresponding column is revealed and then discard this information. This algorithm requires a memory growing sub-linearly with the network size. To construct these efficient streaming memory-limited clustering algorithms, we first address the problem of clustering with partial information, where only a small proportion of the columns of the adjacency matrix is observed, and develop, for this setting, a new spectral algorithm which is of independent interest.

Submitted 3 November, 2014; originally announced November 2014.

Comments: NIPS 2014

arXiv:1406.6897 [pdf, other]

Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results

Authors: Jiaming Xu, Laurent Massoulié, Marc Lelarge

Abstract: The classical setting of community detection consists of networks exhibiting a clustered structure. To more accurately model real systems we consider a class of networks (i) whose edges may carry labels and (ii) which may lack a clustered structure. Specifically we assume that nodes possess latent attributes drawn from a general compact space and edges between two nodes are randomly generated and labeled according to some unknown distribution as a function of their latent attributes. Our goal is then to infer the edge label distributions from a partially observed network. We propose a computationally efficient spectral algorithm and show it allows for asymptotically correct inference when the average node degree could be as low as logarithmic in the total number of nodes. Conversely, if the average node degree is below a specific constant threshold, we show that no algorithm can achieve better inference than guessing without using the observations. As a byproduct of our analysis, we show that our model provides a general procedure to construct random graph models with a spectrum asymptotic to a pre-specified eigenvalue distribution such as a power-law distribution.

Submitted 26 June, 2014; originally announced June 2014.

Comments: 17 pages

arXiv:1401.7923 [pdf, ps, other]

Loopy annealing belief propagation for vertex cover and matching: convergence, LP relaxation, correctness and Bethe approximation

Authors: Marc Lelarge

Abstract: For the minimum cardinality vertex cover and maximum cardinality matching problems, the max-product form of belief propagation (BP) is known to perform poorly on general graphs. In this paper, we present an iterative loopy annealing BP (LABP) algorithm which is shown to converge and to solve a Linear Programming relaxation of the vertex cover or matching problem on general graphs. LABP finds (asymptotically) a minimum half-integral vertex cover (hence provides a 2-approximation) and a maximum fractional matching on any graph. We also show that LABP finds (asymptotically) a minimum size vertex cover for any bipartite graph and as a consequence computes the matching number of the graph. Our proof relies on some subtle monotonicity arguments for the local iteration. We also show that the Bethe free entropy is concave and that LABP maximizes it. Using loop calculus, we also give an exact (though intractable for general graphs) expression of the partition function for matching in terms of the LABP messages which can be used to improve mean-field approximations.

Submitted 7 July, 2014; v1 submitted 30 January, 2014; originally announced January 2014.

Comments: revised version, 23 pages

arXiv:1401.1770 [pdf, ps, other]

Adaptive Replication in Distributed Content Delivery Networks

Authors: Mathieu Leconte, Marc Lelarge, Laurent Massoulié

Abstract: We address the problem of content replication in large distributed content delivery networks, composed of a data center assisted by many small servers with limited capabilities and located at the edge of the network. The objective is to optimize the placement of contents on the servers to offload the data center as much as possible. We model the system constituted by the small servers as a loss network, each loss corresponding to a request to the data center. Based on large system / storage behavior, we obtain an asymptotic formula for the optimal replication of contents and propose adaptive schemes related to those encountered in cache networks but reacting here to loss events, and faster algorithms generating virtual events at higher rate while keeping the same target replication. We show through simulations that our adaptive schemes significantly outperform standard replication strategies both in terms of loss rates and adaptation speed.

Submitted 8 January, 2014; originally announced January 2014.

Comments: 10 pages, 5 figures

arXiv:1303.4325 [pdf, other]

Contagions in Random Networks with Overlapping Communities

Authors: Emilie Coupechoux, Marc Lelarge

Abstract: We consider a threshold epidemic model on a clustered random graph with overlapping communities. In other words, our epidemic model is such that an individual becomes infected as soon as the proportion of her infected neighbors exceeds the threshold q of the epidemic. In our random graph model, each individual can belong to several communities. The distributions for the community sizes and the number of communities an individual belongs to are arbitrary. We consider the case where the epidemic starts from a single individual, and we prove a phase transition (when the parameter q of the model varies) for the appearance of a cascade, i.e. when the epidemic can be propagated to an infinite part of the population. More precisely, we show that our epidemic is entirely described by a multi-type (and alternating) branching process, and then we apply Sevastyanov's theorem about the phase transition of multi-type Galton-Watson branching processes. In addition, we compute the entries of the matrix whose largest eigenvalue gives the phase transition.

Submitted 31 January, 2014; v1 submitted 18 March, 2013; originally announced March 2013.

Comments: Minor modifications for the second version: added comments (end of Section 3.2, beginning of Section 5.3); moved remark (end of Section 3.1, beginning of Section 4.1); corrected typos; changed title

MSC Class: 60C05; 05C80; 91D30

arXiv:1302.6974 [pdf, ps, other]

Spectrum Bandit Optimization

Authors: Marc Lelarge, Alexandre Proutiere, M. Sadegh Talebi

Abstract: We consider the problem of allocating radio channels to links in a wireless network. Links interact through interference, modelled as a conflict graph (i.e., two interfering links cannot be simultaneously active on the same channel). We aim at identifying the channel allocation maximizing the total network throughput over a finite time horizon. Should we know the average radio conditions on each channel and on each link, an optimal allocation would be obtained by solving an Integer Linear Program (ILP). When radio conditions are unknown a priori, we look for a sequential channel allocation policy that converges to the optimal allocation while minimizing on the way the throughput loss or regret due to the need for exploring sub-optimal allocations. We formulate this problem as a generic linear bandit problem, and analyze it first in a stochastic setting where radio conditions are driven by a stationary stochastic process, and then in an adversarial setting where radio conditions can evolve arbitrarily. We provide new algorithms in both settings and derive upper bounds on their regrets.

Submitted 17 February, 2015; v1 submitted 27 February, 2013; originally announced February 2013.

Comments: 21 pages

arXiv:1210.4839 [pdf]

Leveraging Side Observations in Stochastic Bandits

Authors: Stephane Caron, Branislav Kveton, Marc Lelarge, Smriti Bhagat

Abstract: This paper considers stochastic bandits with side observations, a model that accounts for both the exploration/exploitation dilemma and relationships between arms. In this setting, after pulling an arm i, the decision maker also observes the rewards for some other actions related to i. We will see that this model is suited to content recommendation in social networks, where users' reactions may be endorsed or not by their friends. We provide efficient algorithms based on upper confidence bounds (UCBs) to leverage this additional information and derive new bounds improving on standard regret guarantees. We also evaluate these policies in the context of movie recommendation in social networks: experiments on real datasets show substantial learning rate speedups ranging from 2.2x to 14x on dense networks.

Submitted 16 October, 2012; originally announced October 2012.

Comments: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

Report number: UAI-P-2012-PG-142-151

arXiv:1209.2910 [pdf, other]

Community Detection in the Labelled Stochastic Block Model

Authors: Simon Heimlicher, Marc Lelarge, Laurent Massoulié

Abstract: We consider the problem of community detection from observed interactions between individuals, in the context where multiple types of interaction are possible. We use labelled stochastic block models to represent the observed data, where labels correspond to interaction types. Focusing on a two-community scenario, we conjecture a threshold for the problem of reconstructing the hidden communities in a way that is correlated with the true partition. To substantiate the conjecture, we prove that the given threshold correctly identifies a transition in the behaviour of belief propagation from insensitive to sensitive. We further prove that the same threshold corresponds to the transition in a related inference problem on a tree model from infeasible to feasible. Finally, numerical results using belief propagation for community detection give further support to the conjecture.

Submitted 13 September, 2012; originally announced September 2012.

Comments: 9 pages

arXiv:1208.3994 [pdf, ps, other]

doi 10.1109/JSAC.2012.121213

Coordination in Network Security Games: a Monotone Comparative Statics Approach

Authors: Marc Lelarge

Abstract: Malicious software, or malware for short, has become a major security threat. While originating in criminal behavior, its impact is also influenced by the decisions of legitimate end users. Getting agents in the Internet, and in networks in general, to invest in and deploy security features and protocols is a challenge, in particular because of economic reasons arising from the presence of network externalities. In this paper, we focus on the question of incentive alignment for agents of a large network towards better security. We start with an economic model for a single agent that determines the optimal amount to invest in protection. The model takes into account the vulnerability of the agent to a security breach and the potential loss if a security breach occurs. We derive conditions on the quality of the protection to ensure that the optimal amount spent on security is an increasing function of the agent's vulnerability and potential loss. We also show that for a large class of risks, only a small fraction of the expected loss should be invested. Building on these results, we study a network of interconnected agents subject to epidemic risks. We derive conditions to ensure that the incentives of all agents are aligned towards better security. When agents are strategic, we show that security investments are always socially inefficient due to the network externalities. Moreover, alignment of incentives typically implies a coordination problem, leading to an equilibrium with a very high price of anarchy.

Submitted 20 August, 2012; originally announced August 2012.

Comments: 10 pages, to appear in IEEE JSAC

arXiv:1208.3629 [pdf, ps, other]

Sublinear-Time Algorithms for Monomer-Dimer Systems on Bounded Degree Graphs

Authors: Marc Lelarge, Hang Zhou

Abstract: For a graph $G$, let $Z(G,λ)$ be the partition function of the monomer-dimer system defined by $\sum_k m_k(G)λ^k$, where $m_k(G)$ is the number of matchings of size $k$ in $G$. We consider graphs of bounded degree and develop a sublinear-time algorithm for estimating $\log Z(G,λ)$ at an arbitrary value $λ>0$ within additive error $εn$ with high probability. The query complexity of our algorithm does not depend on the size of $G$ and is polynomial in $1/ε$, and we also provide a lower bound quadratic in $1/ε$ for this problem. This is the first analysis of a sublinear-time approximation algorithm for a $\#P$-complete problem. Our approach is based on the correlation decay of the Gibbs distribution associated with $Z(G,λ)$. We show that our algorithm approximates the probability for a vertex to be covered by a matching, sampled according to this Gibbs distribution, in a near-optimal sublinear time. We extend our results to approximate the average size and the entropy of such a matching within an additive error with high probability, where again the query complexity is polynomial in $1/ε$ and the lower bound is quadratic in $1/ε$. Our algorithms are simple to implement and of practical use when dealing with massive datasets. Our results extend to other systems where the correlation decay is known to hold as for the independent set problem up to the critical activity.

Submitted 4 September, 2013; v1 submitted 17 August, 2012; originally announced August 2012.
