Search | arXiv e-print repository

Correlation detection in trees for planted graph alignment

Authors: Luca Ganassali, Laurent Massoulié, Marc Lelarge

Abstract: Motivated by alignment of correlated sparse random graphs, we introduce a hypothesis testing problem of deciding whether or not two random trees are correlated. We obtain sufficient conditions under which this testing is impossible or feasible. We propose MPAlign, a message-passing algorithm for graph alignment inspired by the tree correlation detection problem. We prove MPAlign to succeed in polynomial time at partial alignment whenever tree detection is feasible. As a result our analysis of tree detection reveals new ranges of parameters for which partial alignment of sparse random graphs is feasible in polynomial time. We then conjecture that graph alignment is not feasible in polynomial time when the associated tree detection problem is impossible. If true, this conjecture together with our sufficient conditions on tree detection impossibility would imply the existence of a hard phase for graph alignment, i.e. a parameter range where alignment cannot be done in polynomial time even though it is known to be feasible in non-polynomial time.

Submitted 5 December, 2022; v1 submitted 15 July, 2021; originally announced July 2021.

Comments: 38 pages, 9 figures

arXiv:2102.02685 [pdf, other]

Impossibility of Partial Recovery in the Graph Alignment Problem

Authors: Luca Ganassali, Laurent Massoulié, Marc Lelarge

Abstract: Random graph alignment refers to recovering the underlying vertex correspondence between two random graphs with correlated edges. This can be viewed as an average-case and noisy version of the well-known graph isomorphism problem. For the correlated Erdős-Rényi model, we prove an impossibility result for partial recovery in the sparse regime, with constant average degree and correlation, as well as a general bound on the maximal reachable overlap. Our bound is tight in the noiseless case (the graph isomorphism problem) and we conjecture that it is still tight with noise. Our proof technique relies on a careful application of the probabilistic method to build automorphisms between tree components of a subcritical Erdős-Rényi graph.

Submitted 29 June, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

Comments: 23 pages, 8 figures. Accepted for publication at COLT21

Journal ref: Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:2080-2102, 2021

arXiv:1912.00231 [pdf, other]

doi 10.1017/apr.2021.31

Spectral Alignment of Correlated Gaussian matrices

Authors: Luca Ganassali, Marc Lelarge, Laurent Massoulié

Abstract: In this paper we analyze a simple spectral method (EIG1) for the problem of matrix alignment, consisting in aligning their leading eigenvectors: given two matrices $A$ and $B$, we compute $v_1$ and $v'_1$ two corresponding leading eigenvectors. The algorithm returns the permutation $\hat{π}$ such that the rank of coordinate $\hat{π}(i)$ in $v_1$ and that of coordinate $i$ in $v'_1$ (up to the sign of $v'_1$) are the same. We consider a model of weighted graphs where the adjacency matrix $A$ belongs to the Gaussian Orthogonal Ensemble (GOE) of size $N \times N$, and $B$ is a noisy version of $A$ where all nodes have been relabeled according to some planted permutation $π$, namely $B= Π^T (A+σH) Π$, where $Π$ is the permutation matrix associated with $π$ and $H$ is an independent copy of $A$. We show the following zero-one law: with high probability, under the condition $σN^{7/6+ε} \to 0$ for some $ε>0$, EIG1 recovers all but a vanishing part of the underlying permutation $π$, whereas if $σN^{7/6-ε} \to \infty$, this method cannot recover more than $o(N)$ correct matches. This result gives an understanding of the simplest and fastest spectral method for matrix alignment (or complete weighted graph alignment), and involves proof methods and techniques which could be of independent interest.

Submitted 11 May, 2021; v1 submitted 30 November, 2019; originally announced December 2019.

Comments: 26 pages, 4 figures. Figures and paper organization updated, typos corrected. Remark 4.2. added

Journal ref: Advances in Applied Probability (2022) 1-32
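
The EIG1 procedure described in the abstract above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the rule used to resolve the sign of $v'_1$ (keep the candidate with the larger alignment score $\sum_{i,j} A_{\hat{π}(i)\hat{π}(j)} B_{ij}$) is an assumption, since the abstract only says that the ranks agree up to that sign.

```python
import numpy as np


def leading_eigenvector(M):
    """Eigenvector of the symmetric matrix M for its largest eigenvalue."""
    # np.linalg.eigh returns eigenvalues in ascending order,
    # so the last column belongs to the largest one.
    _, vectors = np.linalg.eigh(M)
    return vectors[:, -1]


def match_by_rank(v, v_prime):
    """Permutation pi_hat such that the rank of coordinate pi_hat[i] in v
    equals the rank of coordinate i in v_prime."""
    order_v = np.argsort(v)         # order_v[r]  = coordinate of rank r in v
    order_vp = np.argsort(v_prime)  # order_vp[r] = coordinate of rank r in v_prime
    pi_hat = np.empty(len(v), dtype=int)
    pi_hat[order_vp] = order_v
    return pi_hat


def alignment_score(A, B, pi_hat):
    """Sum over i, j of A[pi_hat[i], pi_hat[j]] * B[i, j]."""
    return float(np.sum(A[np.ix_(pi_hat, pi_hat)] * B))


def eig1(A, B):
    """Sketch of EIG1: align A and B through their leading eigenvectors."""
    v = leading_eigenvector(A)
    v_prime = leading_eigenvector(B)
    # Eigenvectors are defined up to sign, so try both orientations.
    # Assumption: keep the orientation with the larger alignment score.
    candidates = [match_by_rank(v, v_prime), match_by_rank(v, -v_prime)]
    return max(candidates, key=lambda p: alignment_score(A, B, p))
```

In this sketch `B[i, j]` plays the role of $(A+σH)_{π(i)π(j)}$, so in the noiseless case $σ = 0$ the leading eigenvector of `B` is the leading eigenvector of `A` with its coordinates permuted by $π$, and rank matching returns $π$ itself.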

arXiv:1907.03792 [pdf, other]

Asymptotic Bayes risk for Gaussian mixture in a semi-supervised setting

Authors: Marc Lelarge, Leo Miolane

Abstract: Semi-supervised learning (SSL) uses unlabeled data for training and has been shown to greatly improve performance when compared to a supervised approach on the labeled data available. This claim depends both on the amount of labeled data available and on the algorithm used. In this paper, we compute analytically the gap between the best fully-supervised approach using only labeled data and the best semi-supervised approach using both labeled and unlabeled data. We quantify the best possible increase in performance obtained thanks to the unlabeled data, i.e. we compute the accuracy increase due to the information contained in the unlabeled data. Our work deals with a simple high-dimensional Gaussian mixture model for the data in a Bayesian setting. Our rigorous analysis builds on recent theoretical breakthroughs in high-dimensional inference and a large body of mathematical tools from statistical physics initially developed for spin glasses.

Submitted 28 September, 2019; v1 submitted 8 July, 2019; originally announced July 2019.

Comments: 13 pages

arXiv:1708.02457 [pdf, other]

doi 10.1007/s10955-018-1964-6

Replica Bounds by Combinatorial Interpolation for Diluted Spin Systems

Authors: Marc Lelarge, Mendes Oulamara

Abstract: In two papers Franz, Leone and Toninelli proved bounds for the free energy of diluted random constraint satisfaction problems, for a Poisson degree distribution [5] and a general distribution [6]. Panchenko and Talagrand [16] simplified the proof and generalized the result of [5] for the Poisson case. We provide a new proof for the general degree distribution case and as a corollary, we obtain new bounds for the size of the largest independent set (also known as hard core model) in a large random regular graph. Our proof uses a combinatorial interpolation based on biased random walks [21] and allows us to bypass the arguments in [6] based on the study of the Sherrington-Kirkpatrick (SK) model.

Submitted 17 January, 2018; v1 submitted 8 August, 2017; originally announced August 2017.

Comments: Accepted in Journal of Statistical Physics

arXiv:1701.08010 [pdf, other]

doi 10.1109/ISIT.2017.8006580

Statistical and computational phase transitions in spiked tensor estimation

Authors: Thibault Lesieur, Léo Miolane, Marc Lelarge, Florent Krzakala, Lenka Zdeborová

Abstract: We consider tensor factorizations using a generative model and a Bayesian approach. We compute rigorously the mutual information, the Minimal Mean Squared Error (MMSE), and unveil information-theoretic phase transitions. In addition, we study the performance of Approximate Message Passing (AMP) and show that it achieves the MMSE for a large set of parameters, and that factorization is algorithmically "easy" in a much wider region than previously believed. There exists, however, a "hard" region where AMP fails to reach the MMSE and we conjecture that no polynomial algorithm will improve on AMP.

Submitted 16 December, 2017; v1 submitted 27 January, 2017; originally announced January 2017.

Comments: 17 pages, 3 figures, 1 table

Journal ref: IEEE International Symposium on Information Theory (ISIT), pp. 511-515 (2017)

arXiv:1611.03888 [pdf, other]

Fundamental limits of symmetric low-rank matrix estimation

Authors: Marc Lelarge, Léo Miolane

Abstract: We consider the high-dimensional inference problem where the signal is a low-rank symmetric matrix which is corrupted by an additive Gaussian noise. Given a probabilistic model for the low-rank matrix, we compute the limit in the large dimension setting for the mutual information between the signal and the observations, as well as the matrix minimum mean square error, while the rank of the signal remains constant. We also show that our model extends beyond the particular case of additive Gaussian noise and we prove a universality result connecting the community detection problem to our Gaussian framework. We unify and generalize a number of recent works on PCA, sparse PCA, submatrix localization or community detection by computing the information-theoretic limits for these problems in the high noise regime. In addition, we show that the posterior distribution of the signal given the observations is characterized by a parameter of the same dimension as the square of the rank of the signal (i.e. scalar in the case of rank one). Finally, we connect our work with the hard but detectable conjecture in statistical physics.

Submitted 30 March, 2017; v1 submitted 11 November, 2016; originally announced November 2016.

arXiv:1610.03680 [pdf, other]

Recovering asymmetric communities in the stochastic block model

Authors: Francesco Caltagirone, Marc Lelarge, Léo Miolane

Abstract: We consider the sparse stochastic block model in the case where the degrees are uninformative. The case where the two communities have approximately the same size has been extensively studied and we concentrate here on the community detection problem in the case of unbalanced communities. In this setting, spectral algorithms based on the non-backtracking matrix are known to solve the community detection problem (i.e. do strictly better than a random guess) when the signal is sufficiently large, namely above the so-called Kesten-Stigum threshold. In this regime and when the average degree tends to infinity, we show that if the community of a vanishing fraction of the vertices is revealed, then a local algorithm (belief propagation) is optimal down to the Kesten-Stigum threshold and we quantify explicitly its performance. Below the Kesten-Stigum threshold, we show that, in the large degree limit, there is a second threshold called the spinodal curve below which the community detection problem is not solvable. The spinodal curve is equal to the Kesten-Stigum threshold when the fraction of vertices in the smallest community is above $p^*=\frac{1}{2}-\frac{1}{2\sqrt{3}}$, so that the Kesten-Stigum threshold is the threshold for solvability of the community detection in this case. However when the smallest community is smaller than $p^*$, the spinodal curve only provides a lower bound on the threshold for solvability. In the regime below the Kesten-Stigum bound and above the spinodal curve, we also characterize the performance of best local algorithms as a function of the fraction of revealed vertices. Our proof relies on a careful analysis of the associated reconstruction problem on trees which might be of independent interest. In particular, we show that the spinodal curve corresponds to the reconstruction threshold on the tree.

Submitted 31 March, 2017; v1 submitted 12 October, 2016; originally announced October 2016.

arXiv:1609.02487 [pdf, ps, other]

Non-Backtracking Spectrum of Degree-Corrected Stochastic Block Models

Authors: Lennart Gulikers, Marc Lelarge, Laurent Massoulié

Abstract: Motivated by community detection, we characterise the spectrum of the non-backtracking matrix $B$ in the Degree-Corrected Stochastic Block Model. Specifically, we consider a random graph on $n$ vertices partitioned into two equal-sized clusters. The vertices have i.i.d. weights $\{ φ_u \}_{u=1}^n$ with second moment $Φ^{(2)}$. The intra-cluster connection probability for vertices $u$ and $v$ is $\frac{φ_u φ_v}{n}a$ and the inter-cluster connection probability is $\frac{φ_u φ_v}{n}b$. We show that with high probability, the following holds: The leading eigenvalue of the non-backtracking matrix $B$ is asymptotic to $ρ= \frac{a+b}{2} Φ^{(2)}$. The second eigenvalue is asymptotic to $μ_2 = \frac{a-b}{2} Φ^{(2)}$ when $μ_2^2 > ρ$, but asymptotically bounded by $\sqrt{ρ}$ when $μ_2^2 \leq ρ$. All the remaining eigenvalues are asymptotically bounded by $\sqrt{ρ}$. As a result, a clustering positively-correlated with the true communities can be obtained based on the second eigenvector of $B$ in the regime where $μ_2^2 > ρ.$ In a previous work we obtained that detection is impossible when $μ_2^2 < ρ,$ meaning that there occurs a phase-transition in the sparse regime of the Degree-Corrected Stochastic Block Model. As a corollary, we obtain that Degree-Corrected Erdős-Rényi graphs asymptotically satisfy the graph Riemann hypothesis, a quasi-Ramanujan property. A by-product of our proof is a weak law of large numbers for local-functionals on Degree-Corrected Stochastic Block Models, which could be of independent interest.

Submitted 18 May, 2017; v1 submitted 8 September, 2016; originally announced September 2016.
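
As a quick consistency check (a worked step added here, not part of the abstract), the detection condition $μ_2^2 > ρ$ can be rewritten in terms of $a$, $b$ and $Φ^{(2)}$ by substituting the two expressions given in the abstract, assuming $Φ^{(2)} > 0$:

```latex
\mu_2^2 > \rho
\;\iff\; \left(\frac{a-b}{2}\right)^{2} \left(\Phi^{(2)}\right)^{2} > \frac{a+b}{2}\,\Phi^{(2)}
\;\iff\; (a-b)^{2}\,\Phi^{(2)} > 2\,(a+b).
```

With $q = 2$ clusters, this is exactly the complement of the impossibility condition $(a-b)^2 Φ^{(2)} \leq q(a+b)$ quoted in the Degree-Corrected Planted-Partition entry further down this list.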

arXiv:1606.00858 [pdf, other]

Impact of Community Structure on Cascades

Authors: Mehrdad Moharrami, Vijay Subramanian, Mingyan Liu, Marc Lelarge

Abstract: We study cascades under the threshold model on sparse random graphs with community structure. In this model, individuals adopt the new behavior based on how many neighbors have already chosen it. Specifically, we consider the permanent adoption model wherein individuals that have adopted the new behavior (or opinion) cannot change their state. We present a differential-equation-based tight approximation to the stochastic process of adoption and prove the validity of the mean-field equations. In addition, we characterize both necessary and sufficient conditions for contagion to happen no matter how small the set of initial adopters is. Finally, we study the problem of optimum seeding given budget constraints and propose a gradient-based heuristic seeding strategy. Our algorithm, numerically, dispels commonly held beliefs in the literature that suggest the best seeding strategy is to seed over the vertices with the highest number of neighbors.

Submitted 4 May, 2022; v1 submitted 2 June, 2016; originally announced June 2016.

MSC Class: 05C80

arXiv:1605.06422 [pdf, other]

doi 10.1088/1742-6596/1036/1/012015

Fast Randomized Semi-Supervised Clustering

Authors: Alaa Saade, Florent Krzakala, Marc Lelarge, Lenka Zdeborová

Abstract: We consider the problem of clustering partially labeled data from a minimal number of randomly chosen pairwise comparisons between the items. We introduce an efficient local algorithm based on a power iteration of the non-backtracking operator and study its performance on a simple model. For the case of two clusters, we give bounds on the classification error and show that a small error can be achieved from $O(n)$ randomly chosen measurements, where $n$ is the number of items in the dataset. Our algorithm is therefore efficient both in terms of time and space complexities. We also investigate numerically the performance of the algorithm on synthetic and real world data.

Submitted 9 October, 2016; v1 submitted 20 May, 2016; originally announced May 2016.

Journal ref: Journal of Physics: Conf. Series 1036 (2018) 012015

arXiv:1511.00546 [pdf, ps, other]

An Impossibility Result for Reconstruction in a Degree-Corrected Planted-Partition Model

Authors: Lennart Gulikers, Marc Lelarge, Laurent Massoulié

Abstract: We consider the Degree-Corrected Stochastic Block Model (DC-SBM): a random graph on $n$ nodes, having i.i.d. weights $(φ_u)_{u=1}^n$ (possibly heavy-tailed), partitioned into $q \geq 2$ asymptotically equal-sized clusters. The model parameters are two constants $a,b > 0$ and the finite second moment of the weights $Φ^{(2)}$. Vertices $u$ and $v$ are connected by an edge with probability $\frac{φ_u φ_v}{n}a$ when they are in the same class and with probability $\frac{φ_u φ_v}{n}b$ otherwise. We prove that it is information-theoretically impossible to estimate the clusters in a way positively correlated with the true community structure when $(a-b)^2 Φ^{(2)} \leq q(a+b)$. As by-products of our proof we obtain $(1)$ a precise coupling result for local neighbourhoods in DC-SBM's, that we use in a follow up paper [Gulikers et al., 2017] to establish a law of large numbers for local-functionals and $(2)$ that long-range interactions are weak in (power-law) DC-SBM's.

Submitted 24 November, 2018; v1 submitted 2 November, 2015; originally announced November 2015.

Comments: Appeared in Annals of Applied Probability

Journal ref: Annals of Applied Probability - Volume 28, Number 5 (2018), 3002-3027

arXiv:1507.04739 [pdf, ps, other]

Counting matchings in irregular bipartite graphs and random lifts

Authors: Marc Lelarge

Abstract: We give a sharp lower bound on the number of matchings of a given size in a bipartite graph. When specialized to regular bipartite graphs, our results imply Friedland's Lower Matching Conjecture and Schrijver's theorem proven by Gurvits and Csikvari. Indeed, our work extends the recent work of Csikvari done for regular and bi-regular bipartite graphs. Moreover, our lower bounds are order optimal as they are attained for a sequence of $2$-lifts of the original graph as well as for random $n$-lifts of the original graph when $n$ tends to infinity. We then extend our results to permanents and subpermanent sums. For permanents, we are able to recover the lower bound of Schrijver recently proved by Gurvits using stable polynomials. Our proof is algorithmic and borrows ideas from the theory of local weak convergence of graphs, statistical physics and covers of graphs. We provide new lower bounds for subpermanent sums and obtain new results on the number of matchings in random $n$-lifts with some implications for the matching measure and the spectral measure of random $n$-lifts as well as for the spectral measure of infinite trees.

Submitted 5 November, 2015; v1 submitted 16 July, 2015; originally announced July 2015.

Comments: 26 pages, extended version (results for random lifts and more related work)

arXiv:1506.08621 [pdf, other]

A spectral method for community detection in moderately-sparse degree-corrected stochastic block models

Authors: Lennart Gulikers, Marc Lelarge, Laurent Massoulié

Abstract: We consider community detection in Degree-Corrected Stochastic Block Models (DC-SBM). We propose a spectral clustering algorithm based on a suitably normalized adjacency matrix. We show that this algorithm consistently recovers the block-membership of all but a vanishing fraction of nodes, in the regime where the lowest degree is of order $\log(n)$ or higher. Recovery succeeds even for very heterogeneous degree-distributions. The algorithm does not take any parameters as input. In particular, it does not need to know the number of communities.

Submitted 7 February, 2017; v1 submitted 29 June, 2015; originally announced June 2015.

arXiv:1504.03156 [pdf, ps, other]

Streaming, Memory Limited Matrix Completion with Noise

Authors: Se-Young Yun, Marc Lelarge, Alexandre Proutiere

Abstract: In this paper, we consider the streaming memory-limited matrix completion problem when the observed entries are noisy versions of a small random fraction of the original entries. We are interested in scenarios where the matrix size is very large so the matrix is very hard to store and manipulate. Here, columns of the observed matrix are presented sequentially and the goal is to complete the missing entries after one pass on the data with limited memory space and limited computational complexity. We propose a streaming algorithm which produces an estimate of the original matrix with a vanishing mean square error, uses memory space scaling linearly with the ambient dimension of the matrix, i.e. the memory required to store the output alone, and requires an amount of computation comparable to the number of non-zero entries of the input matrix.

Submitted 13 April, 2015; originally announced April 2015.

Comments: 21 pages

arXiv:1502.03475 [pdf, other]

Combinatorial Bandits Revisited

Authors: Richard Combes, M. Sadegh Talebi, Alexandre Proutiere, Marc Lelarge

Abstract: This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting under semi-bandit feedback, we derive a problem-specific regret lower bound, and discuss its scaling with the dimension of the decision space. We propose ESCB, an algorithm that efficiently exploits the structure of the problem and provide a finite-time analysis of its regret. ESCB has better performance guarantees than existing algorithms, and significantly outperforms these algorithms in practice. In the adversarial setting under bandit feedback, we propose CombEXP, an algorithm with the same regret scaling as state-of-the-art algorithms, but with lower computational complexity for some combinatorial problems.

Submitted 5 November, 2015; v1 submitted 11 February, 2015; originally announced February 2015.

Comments: 30 pages, Advances in Neural Information Processing Systems 28 (NIPS 2015)

arXiv:1502.00163 [pdf, other]

doi 10.1109/ISIT.2015.7282642

Spectral Detection in the Censored Block Model

Authors: Alaa Saade, Florent Krzakala, Marc Lelarge, Lenka Zdeborová

Abstract: We consider the problem of partially recovering hidden binary variables from the observation of (few) censored edge weights, a problem with applications in community detection, correlation clustering and synchronization. We describe two spectral algorithms for this task based on the non-backtracking and the Bethe Hessian operators. These algorithms are shown to be asymptotically optimal for the partial recovery problem, in that they detect the hidden assignment as soon as it is information theoretically possible to do so.

Submitted 10 June, 2015; v1 submitted 31 January, 2015; originally announced February 2015.

Comments: ISIT 2015

Journal ref: IEEE International Symposium on Information Theory (ISIT), pp.1184-1188 (2015)

arXiv:1501.06087 [pdf, other]

Non-backtracking spectrum of random graphs: community detection and non-regular Ramanujan graphs

Authors: Charles Bordenave, Marc Lelarge, Laurent Massoulié

Abstract: A non-backtracking walk on a graph is a directed path such that no edge is the inverse of its preceding edge. The non-backtracking matrix of a graph is indexed by its directed edges and can be used to count non-backtracking walks of a given length. It has been used recently in the context of community detection and has appeared previously in connection with the Ihara zeta function and in some generalizations of Ramanujan graphs. In this work, we study the largest eigenvalues of the non-backtracking matrix of the Erdős-Rényi random graph and of the Stochastic Block Model in the regime where the number of edges is proportional to the number of vertices. Our results confirm the "spectral redemption" conjecture that community detection can be made on the basis of the leading eigenvectors above the feasibility threshold.

Submitted 22 April, 2015; v1 submitted 24 January, 2015; originally announced January 2015.

Comments: 59 pages

MSC Class: 05C80; 05C50; 91D30
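
The definition of the non-backtracking matrix given in the abstract above can be made concrete with a small sketch. It assumes a simple undirected graph supplied as a list of edges, each listed once and with no self-loops; the function name is hypothetical and the construction is a plain illustration rather than an efficient implementation.

```python
import numpy as np


def non_backtracking_matrix(edges):
    """Build the non-backtracking matrix of an undirected simple graph.

    Rows and columns are indexed by directed edges. The entry for the pair
    (u -> v), (x -> y) is 1 when the second edge starts where the first one
    ends (v == x) and does not go straight back (y != u).
    """
    # Each undirected edge {u, v} yields two directed edges.
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    m = len(directed)
    B = np.zeros((m, m), dtype=int)
    for k, (u, v) in enumerate(directed):
        for l, (x, y) in enumerate(directed):
            if v == x and y != u:
                B[k, l] = 1
    return B, directed


# Example: a triangle on vertices 0, 1, 2.
B, directed = non_backtracking_matrix([(0, 1), (1, 2), (2, 0)])
# Number of non-backtracking walks made of 3 edges: sum of the entries of B^2.
walks_with_three_edges = int(np.linalg.matrix_power(B, 2).sum())
```

Summing the entries of the $(k-1)$-th power of `B` gives the number of non-backtracking walks with $k$ edges. On the triangle every directed edge has exactly one non-backtracking successor, so `B` is a permutation matrix.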

arXiv:1412.1004 [pdf, other]

On rigidity, orientability and cores of random graphs with sliders

Authors: Julien Barré, Marc Lelarge, Dieter Mitsche

Abstract: Suppose that you add rigid bars between points in the plane, and suppose that a constant fraction $q$ of the points moves freely in the whole plane; the remaining fraction is constrained to move on fixed lines called sliders. When does a giant rigid cluster emerge? Under a genericity condition, the answer only depends on the graph formed by the points (vertices) and the bars (edges). We find for the random graph $G \in \mathcal{G}(n,c/n)$ the threshold value of $c$ for the appearance of a linear-sized rigid component as a function of $q$, generalizing results of Kasiviswanathan et al. We show that this appearance of a giant component undergoes a continuous transition for $q \leq 1/2$ and a discontinuous transition for $q > 1/2$. In our proofs, we introduce a generalized notion of orientability interpolating between 1- and 2-orientability, of cores interpolating between 2-core and 3-core, and of extended cores interpolating between 2+1-core and 3+2-core; we find the precise expressions for the respective thresholds and the sizes of the different cores above the threshold. In particular, this proves a conjecture of Kasiviswanathan et al. about the size of the 3+2-core. We also derive some structural properties of rigidity with sliders (matroid and decomposition into components) which can be of independent interest.

Submitted 20 February, 2015; v1 submitted 2 December, 2014; originally announced December 2014.

Comments: 32 pages, 1 figure

arXiv:1406.6897 [pdf, other]

Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results

Authors: Jiaming Xu, Laurent Massoulié, Marc Lelarge

Abstract: The classical setting of community detection consists of networks exhibiting a clustered structure. To more accurately model real systems we consider a class of networks (i) whose edges may carry labels and (ii) which may lack a clustered structure. Specifically we assume that nodes possess latent attributes drawn from a general compact space and edges between two nodes are randomly generated and labeled according to some unknown distribution as a function of their latent attributes. Our goal is then to infer the edge label distributions from a partially observed network. We propose a computationally efficient spectral algorithm and show it allows for asymptotically correct inference when the average node degree could be as low as logarithmic in the total number of nodes. Conversely, if the average node degree is below a specific constant threshold, we show that no algorithm can achieve better inference than guessing without using the observations. As a byproduct of our analysis, we show that our model provides a general procedure to construct random graph models with a spectrum asymptotic to a pre-specified eigenvalue distribution such as a power-law distribution.

Submitted 26 June, 2014; originally announced June 2014.

Comments: 17 pages

arXiv:1401.7923 [pdf, ps, other]

Loopy annealing belief propagation for vertex cover and matching: convergence, LP relaxation, correctness and Bethe approximation

Authors: Marc Lelarge

Abstract: For the minimum cardinality vertex cover and maximum cardinality matching problems, the max-product form of belief propagation (BP) is known to perform poorly on general graphs. In this paper, we present an iterative loopy annealing BP (LABP) algorithm which is shown to converge and to solve a Linear Programming relaxation of the vertex cover or matching problem on general graphs. LABP finds (asymptotically) a minimum half-integral vertex cover (hence provides a 2-approximation) and a maximum fractional matching on any graph. We also show that LABP finds (asymptotically) a minimum size vertex cover for any bipartite graph and as a consequence computes the matching number of the graph. Our proof relies on some subtle monotonicity arguments for the local iteration. We also show that the Bethe free entropy is concave and that LABP maximizes it. Using loop calculus, we also give an exact (also intractable for general graphs) expression of the partition function for matching in terms of the LABP messages which can be used to improve mean-field approximations.

Submitted 7 July, 2014; v1 submitted 30 January, 2014; originally announced January 2014.

Comments: revised version, 23 pages

arXiv:1303.4325 [pdf, other]

Contagions in Random Networks with Overlapping Communities

Authors: Emilie Coupechoux, Marc Lelarge

Abstract: We consider a threshold epidemic model on a clustered random graph with overlapping communities. In other words, our epidemic model is such that an individual becomes infected as soon as the proportion of her infected neighbors exceeds the threshold q of the epidemic. In our random graph model, each individual can belong to several communities. The distributions for the community sizes and the number of communities an individual belongs to are arbitrary. We consider the case where the epidemic starts from a single individual, and we prove a phase transition (when the parameter q of the model varies) for the appearance of a cascade, i.e. when the epidemic can be propagated to an infinite part of the population. More precisely, we show that our epidemic is entirely described by a multi-type (and alternating) branching process, and then we apply Sevastyanov's theorem about the phase transition of multi-type Galton-Watson branching processes. In addition, we compute the entries of the matrix whose largest eigenvalue gives the phase transition.

Submitted 31 January, 2014; v1 submitted 18 March, 2013; originally announced March 2013.

Comments: Minor modifications for the second version: added comments (end of Section 3.2, beginning of Section 5.3); moved remark (end of Section 3.1, beginning of Section 4.1); corrected typos; changed title

MSC Class: 60C05; 05C80; 91D30

arXiv:1302.6974 [pdf, ps, other]

Spectrum Bandit Optimization

Authors: Marc Lelarge, Alexandre Proutiere, M. Sadegh Talebi

Abstract: We consider the problem of allocating radio channels to links in a wireless network. Links interact through interference, modelled as a conflict graph (i.e., two interfering links cannot be simultaneously active on the same channel). We aim at identifying the channel allocation maximizing the total network throughput over a finite time horizon. Should we know the average radio conditions on each channel and on each link, an optimal allocation would be obtained by solving an Integer Linear Program (ILP). When radio conditions are unknown a priori, we look for a sequential channel allocation policy that converges to the optimal allocation while minimizing on the way the throughput loss or {\it regret} due to the need for exploring sub-optimal allocations. We formulate this problem as a generic linear bandit problem, and analyze it first in a stochastic setting where radio conditions are driven by a stationary stochastic process, and then in an adversarial setting where radio conditions can evolve arbitrarily. We provide new algorithms in both settings and derive upper bounds on their regrets.

Submitted 17 February, 2015; v1 submitted 27 February, 2013; originally announced February 2013.

Comments: 21 pages

arXiv:1209.2910 [pdf, other]

Community Detection in the Labelled Stochastic Block Model

Authors: Simon Heimlicher, Marc Lelarge, Laurent Massoulié

Abstract: We consider the problem of community detection from observed interactions between individuals, in the context where multiple types of interaction are possible. We use labelled stochastic block models to represent the observed data, where labels correspond to interaction types. Focusing on a two-community scenario, we conjecture a threshold for the problem of reconstructing the hidden communities in a way that is correlated with the true partition. To substantiate the conjecture, we prove that the given threshold correctly identifies a transition on the behaviour of belief propagation from insensitive to sensitive. We further prove that the same threshold corresponds to the transition in a related inference problem on a tree model from infeasible to feasible. Finally, numerical results using belief propagation for community detection give further support to the conjecture.

Submitted 13 September, 2012; originally announced September 2012.

Comments: 9 pages

arXiv:1208.3629 [pdf, ps, other]

Sublinear-Time Algorithms for Monomer-Dimer Systems on Bounded Degree Graphs

Authors: Marc Lelarge, Hang Zhou

Abstract: For a graph $G$, let $Z(G,λ)$ be the partition function of the monomer-dimer system defined by $\sum_k m_k(G)λ^k$, where $m_k(G)$ is the number of matchings of size $k$ in $G$. We consider graphs of bounded degree and develop a sublinear-time algorithm for estimating $\log Z(G,λ)$ at an arbitrary value $λ>0$ within additive error $εn$ with high probability. The query complexity of our algorithm does not depend on the size of $G$ and is polynomial in $1/ε$, and we also provide a lower bound quadratic in $1/ε$ for this problem. This is the first analysis of a sublinear-time approximation algorithm for a $\#P$-complete problem. Our approach is based on the correlation decay of the Gibbs distribution associated with $Z(G,λ)$. We show that our algorithm approximates the probability for a vertex to be covered by a matching, sampled according to this Gibbs distribution, in a near-optimal sublinear time. We extend our results to approximate the average size and the entropy of such a matching within an additive error with high probability, where again the query complexity is polynomial in $1/ε$ and the lower bound is quadratic in $1/ε$. Our algorithms are simple to implement and of practical use when dealing with massive datasets. Our results extend to other systems where the correlation decay is known to hold as for the independent set problem up to the critical activity.

Submitted 4 September, 2013; v1 submitted 17 August, 2012; originally announced August 2012.
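
To make the definition in the abstract above concrete, here is a minimal sketch that evaluates $Z(G,λ) = \sum_k m_k(G)λ^k$ by exhaustively enumerating matchings. It is not the sublinear-time algorithm of the paper: the enumeration is exponential in the number of edges and is only meant for tiny graphs. The function names `matching_counts` and `partition_function` and the edge-list input format are assumptions made for this illustration.

```python
import math

def matching_counts(edges):
    """Return [m_0, m_1, ...], where m_k is the number of matchings with k edges.

    `edges` is a list of (u, v) pairs with u != v (hypothetical input format).
    Brute force: each edge is either skipped or taken if both endpoints are free.
    """
    counts = {}

    def recurse(idx, used, size):
        if idx == len(edges):
            counts[size] = counts.get(size, 0) + 1
            return
        # Option 1: leave edge idx out of the matching.
        recurse(idx + 1, used, size)
        # Option 2: take edge idx if neither endpoint is already covered.
        u, v = edges[idx]
        if u not in used and v not in used:
            recurse(idx + 1, used | {u, v}, size + 1)

    recurse(0, frozenset(), 0)
    return [counts.get(k, 0) for k in range(max(counts) + 1)]

def partition_function(edges, lam):
    """Evaluate Z(G, lam) = sum_k m_k(G) * lam**k by direct enumeration."""
    return sum(m * lam ** k for k, m in enumerate(matching_counts(edges)))

# Path on 4 vertices: one empty matching, three single edges, one pair of edges.
path = [(0, 1), (1, 2), (2, 3)]
print(matching_counts(path))
print(math.log(partition_function(path, 2.0)))
```

The quantity the abstract targets is $\log Z(G,λ)$ up to additive error $εn$; the sketch computes it exactly, which is feasible only because the example graph is tiny.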

arXiv:1207.7321 [pdf, ps, other]

doi 10.1214/14-AAP1010

Universality in polytope phase transitions and message passing algorithms

Authors: Mohsen Bayati, Marc Lelarge, Andrea Montanari

Abstract: We consider a class of nonlinear mappings $\mathsf{F}_{A,N}$ in $\mathbb{R}^N$ indexed by symmetric random matrices $A\in\mathbb{R}^{N\times N}$ with independent entries. Within spin glass theory, special cases of these mappings correspond to iterating the TAP equations and were studied by Bolthausen [Comm. Math. Phys. 325 (2014) 333-366]. Within information theory, they are known as "approximate message passing" algorithms. We study the high-dimensional (large $N$) behavior of the iterates of $\mathsf{F}$ for polynomial functions $\mathsf{F}$, and prove that it is universal; that is, it depends only on the first two moments of the entries of $A$, under a sub-Gaussian tail condition. As an application, we prove the universality of a certain phase transition arising in polytope geometry and compressed sensing. This solves, for a broad class of random projections, a conjecture by David Donoho and Jared Tanner.

Submitted 17 March, 2015; v1 submitted 31 July, 2012; originally announced July 2012.

Comments: Published at http://dx.doi.org/10.1214/14-AAP1010 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AAP-AAP1010

Journal ref: Annals of Applied Probability 2015, Vol. 25, 753-822

arXiv:1207.1659 [pdf, ps, other]

Convergence of multivariate belief propagation, with applications to cuckoo hashing and load balancing

Authors: Mathieu Leconte, Marc Lelarge, Laurent Massoulié

Abstract: This paper is motivated by two applications, namely i) generalizations of cuckoo hashing, a computationally simple approach to assigning keys to objects, and ii) load balancing in content distribution networks, where one is interested in determining the impact of content replication on performance. These two problems admit a common abstraction: in both scenarios, performance is characterized by the maximum weight of a generalization of a matching in a bipartite graph, featuring node and edge capacities. Our main result is a law of large numbers characterizing the asymptotic maximum weight matching in the limit of large bipartite random graphs, when the graphs admit a local weak limit that is a tree. This result specializes to the two application scenarios, yielding new results in both contexts. In contrast with previous results, the key novelty is the ability to handle edge capacities with arbitrary integer values. An analysis of belief propagation algorithms (BP) with multivariate belief vectors underlies the proof. In particular, we show convergence of the corresponding BP by exploiting monotonicity of the belief vectors with respect to the so-called upshifted likelihood ratio stochastic order. This auxiliary result can be of independent interest, providing a new set of structural conditions which ensure convergence of BP.

Submitted 6 July, 2012; originally announced July 2012.

Comments: 10 pages format + proofs in the appendix: total 24 pages

arXiv:1202.4974 [pdf, ps, other]

How Clustering Affects Epidemics in Random Networks

Authors: Emilie Coupechoux, Marc Lelarge

Abstract: Motivated by the analysis of social networks, we study a model of random networks that has both a given degree distribution and a tunable clustering coefficient. We consider two types of growth processes on these graphs: diffusion and symmetric threshold model. The diffusion process is inspired from epidemic models. It is characterized by an infection probability, each neighbor transmitting the epidemic independently. In the symmetric threshold process, the interactions are still local but the propagation rule is governed by a threshold (that might vary among the different nodes). An interesting example of symmetric threshold process is the contagion process, which is inspired by a simple coordination game played on the network. Both types of processes have been used to model spread of new ideas, technologies, viruses or worms and results have been obtained for random graphs with no clustering. In this paper, we are able to analyze the impact of clustering on the growth processes. While clustering inhibits the diffusion process, its impact for the contagion process is more subtle and depends on the connectivity of the graph: in a low connectivity regime, clustering also inhibits the contagion, while in a high connectivity regime, clustering favors the appearance of global cascades but reduces their size. For both diffusion and symmetric threshold models, we characterize conditions under which global cascades are possible and compute their size explicitly, as a function of the degree distribution and the clustering coefficient. Our results are applied to regular or power-law graphs with exponential cutoff and shed new light on the impact of clustering.

Submitted 22 February, 2012; originally announced February 2012.

Comments: 30 pages

MSC Class: 60C05; 05C80; 91D30

arXiv:1201.5335 [pdf, ps, other]

A new approach to the orientation of random hypergraphs

Authors: Marc Lelarge

Abstract: An h-uniform hypergraph H=(V,E) is called (l,k)-orientable if there exists an assignment of each hyperedge e to exactly l of its vertices such that no vertex is assigned more than k hyperedges. Let H_{n,m,h} be a hypergraph, drawn uniformly at random from the set of all h-uniform hypergraphs with n vertices and m edges. In this paper, we determine the threshold of the existence of an (l,k)-orientation of H_{n,m,h} for k>=1 and h>l>=1, extending recent results motivated by applications such as cuckoo hashing or load balancing with guaranteed maximum load. Our proof combines the local weak convergence of sparse graphs and a careful analysis of a Gibbs measure on spanning subgraphs with degree constraints. It allows us to deal with a much broader class than the uniform hypergraphs.

Submitted 25 January, 2012; originally announced January 2012.

Comments: 27 pages, preliminary version appeared at SODA 2012

arXiv:1112.6330 [pdf, ps, other]

doi 10.1214/14-AAP1034

The diameter of weighted random graphs

Authors: Hamed Amini, Marc Lelarge

Abstract: In this paper we study the impact of random exponential edge weights on the distances in a random graph and, in particular, on its diameter. Our main result consists of a precise asymptotic expression for the maximal weight of the shortest weight paths between all vertices (the weighted diameter) of sparse random graphs, when the edge weights are i.i.d. exponential random variables.

Submitted 16 April, 2015; v1 submitted 29 December, 2011; originally announced December 2011.

Comments: Published at http://dx.doi.org/10.1214/14-AAP1034 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AAP-AAP1034

Journal ref: Annals of Applied Probability 2015, Vol. 25, No. 3, 1686-1727

arXiv:1102.0712 [pdf, ps, other]

Matchings on infinite graphs

Authors: Charles Bordenave, Marc Lelarge, Justin Salez

Abstract: Elek and Lippner (2010) showed that the convergence of a sequence of bounded-degree graphs implies the existence of a limit for the proportion of vertices covered by a maximum matching. We provide a characterization of the limiting parameter via a local recursion defined directly on the limit of the graph sequence. Interestingly, the recursion may admit multiple solutions, implying non-trivial long-range dependencies between the covered vertices. We overcome this lack of correlation decay by introducing a perturbative parameter (temperature), which we let progressively go to zero. This allows us to uniquely identify the correct solution. In the important case where the graph limit is a unimodular Galton-Watson tree, the recursion simplifies into a distributional equation that can be solved explicitly, leading to a new asymptotic formula that considerably extends the well-known one by Karp and Sipser for Erdös-Rényi random graphs.

Submitted 11 April, 2012; v1 submitted 3 February, 2011; originally announced February 2011.

Comments: 23 pages

MSC Class: 05C70; 60C05 (Primary) 05C80 (Secondary)

arXiv:1012.2062 [pdf, ps, other]

Diffusion and Cascading Behavior in Random Networks

Authors: Marc Lelarge

Abstract: The spread of new ideas, behaviors or technologies has been extensively studied using epidemic models. Here we consider a model of diffusion where the individuals' behavior is the result of a strategic choice. We study a simple coordination game with binary choice and give a condition for a new action to become widespread in a random network. We also analyze the possible equilibria of this game and identify conditions for the coexistence of both strategies in large connected sets. Finally we look at how firms can use social networks to promote their goals with limited information. Our results differ strongly from those derived with epidemic models and show that connectivity plays an ambiguous role: while it allows the diffusion to spread, when the network is highly connected, the diffusion is also limited by high-degree nodes which are very stable.

Submitted 19 October, 2011; v1 submitted 9 December, 2010; originally announced December 2010.

arXiv:1011.5994 [pdf, ps, other]

Flooding in Weighted Random Graphs

Authors: Hamed Amini, Moez Draief, Marc Lelarge

Abstract: In this paper, we study the impact of edge weights on distances in diluted random graphs. We interpret these weights as delays, and take them as i.i.d. exponential random variables. We analyze the weighted flooding time defined as the minimum time needed to reach all nodes from one uniformly chosen node, and the weighted diameter corresponding to the largest distance between any pair of vertices. Under some regularity conditions on the degree sequence of the random graph, we show that these quantities grow as the logarithm of $n$, when the size of the graph $n$ tends to infinity. We also derive the exact value for the prefactors. These allow us to analyze an asynchronous randomized broadcast algorithm for random regular graphs. Our results show that the asynchronous version of the algorithm performs better than its synchronized version: in the large size limit of the graph, it will reach the whole network faster even if the local dynamics are similar on average.

Submitted 27 November, 2010; originally announced November 2010.

Comments: 15 pages, 1 figure

arXiv:0907.4244 [pdf, ps, other]

doi 10.1214/10-AOP567

The rank of diluted random graphs

Authors: Charles Bordenave, Marc Lelarge, Justin Salez

Abstract: We investigate the rank of the adjacency matrix of large diluted random graphs: for a sequence of graphs $(G_n)_{n\geq0}$ converging locally to a Galton--Watson tree $T$ (GWT), we provide an explicit formula for the asymptotic multiplicity of the eigenvalue 0 in terms of the degree generating function $φ_*$ of $T$. In the first part, we show that the adjacency operator associated with $T$ is always self-adjoint; we analyze the associated spectral measure at the root and characterize the distribution of its atomic mass at 0. In the second part, we establish a sufficient condition on $φ_*$ for the expectation of this atomic mass to be precisely the normalized limit of the dimension of the kernel of the adjacency matrices of $(G_n)_{n\geq 0}$. Our proofs borrow ideas from analysis of algorithms, functional analysis, random matrix theory and statistical physics.

Submitted 8 April, 2011; v1 submitted 24 July, 2009; originally announced July 2009.

Comments: Published at http://dx.doi.org/10.1214/10-AOP567 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOP-AOP567

Journal ref: Annals of Probability 2011, Vol. 39, No. 3, 1097-1121

arXiv:0801.0155 [pdf, ps, other]

Resolvent of Large Random Graphs

Authors: Charles Bordenave, Marc Lelarge

Abstract: We analyze the convergence of the spectrum of large random graphs to the spectrum of a limit infinite graph. We apply these results to graphs converging locally to trees and derive a new formula for the Stieltjes transform of the spectral measure of such graphs. We illustrate our results on the uniform regular graphs, Erdös-Rényi graphs and preferential attachment graphs. We sketch examples of application for weighted graphs, bipartite graphs and the uniform spanning tree of n vertices.

Submitted 5 May, 2009; v1 submitted 31 December, 2007; originally announced January 2008.

Comments: 21 pages, 1 figure

MSC Class: 05C80; 15A52 (Primary); 47A10 (Secondary)

arXiv:0710.0857 [pdf, ps, other]

Dynamic Programming Optimization over Random Data: the Scaling Exponent for Near-optimal Solutions

Authors: David J. Aldous, Charles Bordenave, Marc Lelarge

Abstract: A very simple example of an algorithmic problem solvable by dynamic programming is to maximize, over sets A in {1,2,...,n}, the objective function |A| - \sum_i ξ_i 1(i \in A,i+1 \in A) for given ξ_i > 0. This problem, with random (ξ_i), provides a test example for studying the relationship between optimal and near-optimal solutions of combinatorial optimization problems. We show that, amongst solutions differing from the optimal solution in a small proportion δ of places, we can find near-optimal solutions whose objective function value differs from the optimum by a factor of order δ^2 but not smaller order. We conjecture this relationship holds widely in the context of dynamic programming over random data, and Monte Carlo simulations for the Kauffman-Levin NK model are consistent with the conjecture. This work is a technical contribution to a broad program initiated in Aldous-Percus (2003) of relating such scaling exponents to the algorithmic difficulty of optimization problems.

Submitted 3 October, 2007; originally announced October 2007.

Comments: 35 pages

MSC Class: 68Q25; 90C39; 60J05.
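
The optimization problem stated in the abstract above can be solved by a two-state dynamic program over positions. The sketch below is one straightforward way to do this and is not taken from the paper. It assumes the sum runs over i = 1, ..., n-1, so the penalties are passed as a list `xi` of length n-1; the names `max_objective` and `brute_force` are hypothetical.

```python
def max_objective(xi):
    """Maximize |A| - sum_i xi[i] * 1(i in A and i+1 in A) over subsets A of n positions.

    `xi` holds the n-1 adjacency penalties (assumed indexing: xi[i] penalizes
    choosing both position i and position i+1, 0-based).
    """
    n = len(xi) + 1
    # out_best: best value so far with the current position NOT in A.
    # in_best:  best value so far with the current position in A.
    out_best, in_best = 0.0, 1.0
    for i in range(1, n):
        out_best, in_best = (
            max(out_best, in_best),
            # Gain 1 for position i; pay xi[i-1] only if position i-1 is also in A.
            1.0 + max(out_best, in_best - xi[i - 1]),
        )
    return max(out_best, in_best)

def brute_force(xi):
    """Exhaustive check over all 2^n subsets, for validating the DP on small n."""
    n = len(xi) + 1
    best = 0.0
    for mask in range(1 << n):
        size = bin(mask).count("1")
        penalty = sum(
            xi[i] for i in range(n - 1) if (mask >> i) & 1 and (mask >> (i + 1)) & 1
        )
        best = max(best, size - penalty)
    return best

xi = [0.5, 2.0, 0.3]
print(max_objective(xi), brute_force(xi))
```

The dynamic program runs in time linear in n, whereas the exhaustive check is exponential and serves only as a correctness reference on small instances.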

arXiv:math/0701420 [pdf, ps, other]

Tail Asymptotics for Discrete Event Systems

Authors: Marc Lelarge

Abstract: In the context of communication networks, the framework of stochastic event graphs allows a modeling of control mechanisms induced by the communication protocol and an analysis of its performances. We concentrate on the logarithmic tail asymptotics of the stationary response time for a class of networks that admit a representation as (max,plus)-linear systems in a random medium. We are able to derive analytic results when the distribution of the holding times is light-tailed. We show that the lack of independence may lead in dimension bigger than one to non-trivial effects in the asymptotics of the sojourn time. We also study in detail a simple queueing network with multipath routing.

Submitted 16 March, 2007; v1 submitted 15 January, 2007; originally announced January 2007.

Comments: 19 pages, 2 figures, mistake in appendix corrected

MSC Class: 60F10; 60K25

arXiv:math/0609547 [pdf, ps, other]

Near-Minimal Spanning Trees: a Scaling Exponent in Probability Models

Authors: David Aldous, Charles Bordenave, Marc Lelarge

Abstract: We study the relation between the minimal spanning tree (MST) on many random points and the "near-minimal" tree which is optimal subject to the constraint that a proportion $δ$ of its edges must be different from those of the MST. Heuristics suggest that, regardless of details of the probability model, the ratio of lengths should scale as $1 + Θ(δ^2)$. We prove this scaling result in the model of the lattice with random edge-lengths and in the Euclidean model.

Submitted 23 July, 2007; v1 submitted 20 September, 2006; originally announced September 2006.

Comments: 24 pages, 3 figures

MSC Class: 05C80; 60K35; 68W40

arXiv:math/0602130 [pdf, ps, other]

Sample path large deviations for queueing networks with Bernoulli routing

Authors: Marc Lelarge

Abstract: This paper is devoted to the problem of sample path large deviations for multidimensional queueing models with feedback. We derive a new version of the contraction principle where the continuous map is not well-defined on the whole space: we give conditions under which it allows to identify the rate function. We illustrate our technique by deriving a large deviation principle for a class of networks that contains the classical Jackson networks.

Submitted 7 February, 2006; originally announced February 2006.

Comments: 25 pages, 2 figures

MSC Class: 60F10; 60K25

arXiv:math/0510117 [pdf, ps, other]

Tail asymptotics for monotone-separable networks

Authors: Marc Lelarge

Abstract: A network belongs to the monotone separable class if its state variables are homogeneous and monotone functions of the epochs of the arrival process. This framework contains several classical queueing network models, including generalized Jackson networks, max-plus networks, polling systems, multiserver queues, and various classes of stochastic Petri nets. We use comparison relationships between networks of this class with i.i.d. driving sequences and the $GI /GI /1/1$ queue to obtain the tail asymptotics of the stationary maximal dater under light-tailed assumptions for service times. The exponential rate of decay is given as a function of a logarithmic moment generating function. We exemplify an explicit computation of this rate for the case of queues in tandem under various stochastic assumptions.

Submitted 16 March, 2007; v1 submitted 6 October, 2005; originally announced October 2005.

Comments: 15 pages, shortened version, case of (max,plus)-networks handled in a separate paper

MSC Class: 60F10; 60K25

Showing 1–40 of 40 results for author: Lelarge, M