Search | arXiv e-print repository

Weak recovery, hypothesis testing, and mutual information in stochastic block models and planted factor graphs

Authors: Elchanan Mossel, Allan Sly, Youngtak Sohn

Abstract: The stochastic block model is a canonical model of communities in random graphs. It was introduced in the social sciences and statistics as a model of communities, and in theoretical computer science as an average-case model for graph partitioning problems under the name of the ``planted partition model.'' Given a sparse stochastic block model, the two standard inference tasks are: (i) Weak recovery: can we estimate the communities with nontrivial overlap with the true communities? (ii) Detection/Hypothesis testing: can we distinguish if the sample was drawn from the block model or from a random graph with no community structure with probability tending to $1$ as the graph size tends to infinity? In this work, we show that for sparse stochastic block models, the two inference tasks are equivalent except at a critical point. That is, weak recovery is information-theoretically possible if and only if detection is possible. We thus find a strong connection between these two notions of inference for the model. We further prove that when detection is impossible, an explicit hypothesis test based on low degree polynomials in the adjacency matrix of the observed graph achieves the optimal statistical power. This low degree test is efficient as opposed to the likelihood ratio test, which is not known to be efficient. Moreover, we prove that the asymptotic mutual information between the observed network and the community structure exhibits a phase transition at the weak recovery threshold.
Our results are proven in much broader settings including the hypergraph stochastic block models and general planted factor graphs. In these settings we prove that the impossibility of weak recovery implies contiguity and provide a condition which guarantees the equivalence of weak recovery and detection.

Submitted 22 June, 2024; originally announced June 2024.

Comments: 78 pages

arXiv:2308.02075 [pdf, ps, other]

Upper bounds on the $2$-colorability threshold of random $d$-regular $k$-uniform hypergraphs for $k\geq 3$

Authors: Evan Chang, Neel Kolhe, Youngtak Sohn

Abstract: For a large class of random constraint satisfaction problems (CSP), deep but non-rigorous theory from statistical physics predicts the location of the sharp satisfiability transition. The works of Ding, Sly, Sun (2014, 2016) and Coja-Oghlan, Panagiotou (2014) established the satisfiability threshold for random regular $k$-NAE-SAT, random $k$-SAT, and random regular $k$-SAT for large enough $k\geq k_0$ where $k_0$ is a large non-explicit constant. Establishing the same for small values of $k\geq 3$ remains an important open problem in the study of random CSPs. In this work, we study two closely related models of random CSPs, namely the $2$-coloring on random $d$-regular $k$-uniform hypergraphs and the random $d$-regular $k$-NAE-SAT model. For every $k\geq 3$, we prove that there is an explicit $d_{\ast}(k)$ which gives a satisfiability upper bound for both of the models. Our upper bound $d_{\ast}(k)$ for $k\geq 3$ matches the prediction from statistical physics for the hypergraph $2$-coloring by Dall'Asta, Ramezanpour, Zecchina (2008), and is thus conjectured to be sharp. Moreover, $d_{\ast}(k)$ coincides with the satisfiability threshold of random regular $k$-NAE-SAT for large enough $k\geq k_0$ by Ding, Sly, Sun (2014).

Submitted 3 August, 2023; originally announced August 2023.

Comments: 23 pages, 1 table

arXiv:2305.17334 [pdf, other]

Local geometry of NAE-SAT solutions in the condensation regime

Authors: Allan Sly, Youngtak Sohn

Abstract: The local behavior of typical solutions of random constraint satisfaction problems (CSP) describes many important phenomena including clustering thresholds, decay of correlations, and the behavior of message passing algorithms. When the constraint density is low, studying the planted model is a powerful technique for determining this local behavior, which in many examples has a simple Markovian structure. Work of Coja-Oghlan, Kapetanopoulos, Müller (2020) showed that for a wide class of models, this description applies up to the so-called condensation threshold. Understanding the local behavior after the condensation threshold is more complex due to long-range correlations. In this work, we revisit the random regular NAE-SAT model in the condensation regime and determine the local weak limit which describes a random solution around a typical variable. This limit exhibits a complicated non-Markovian structure arising from the space of solutions being dominated by a small number of large clusters. This is the first description of the local weak limit in the condensation regime for any sparse random CSP in the one-step replica symmetry breaking (1RSB) class. Our result is non-asymptotic, and characterizes the tight fluctuation $O(n^{-1/2})$ around the limit. Our proof is based on coupling the local neighborhoods of an infinite spin system, which encodes the structure of the clusters, to a broadcast model on trees whose channel is given by the 1RSB belief-propagation fixed point.
We believe that our proof technique has broad applicability to random CSPs in the 1RSB class.

Submitted 29 July, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

Comments: 43 pages, 2 figures

arXiv:2302.14830 [pdf, other]

Sharp thresholds in inference of planted subgraphs

Authors: Elchanan Mossel, Jonathan Niles-Weed, Youngtak Sohn, Nike Sun, Ilias Zadik

Abstract: A major question in the study of the Erdős--Rényi random graph is to understand the probability that it contains a given subgraph. This study originated in classical work of Erdős and Rényi (1960). More recent work studies this question both in building a general theory of sharp versus coarse transitions (Friedgut and Bourgain, 1999; Hatami, 2012) and in results on the location of the transition (Kahn and Kalai, 2007; Talagrand, 2010; Frankston, Kahn, Narayanan, Park, 2019; Park and Pham, 2022). In inference problems, one often studies the optimal accuracy of inference as a function of the amount of noise. In a variety of sparse recovery problems, an ``all-or-nothing (AoN) phenomenon'' has been observed: Informally, as the amount of noise is gradually increased, at some critical threshold the inference problem undergoes a sharp jump from near-perfect recovery to near-zero accuracy (Gamarnik and Zadik, 2017; Reeves, Xu, Zadik, 2021). We can regard AoN as the natural inference analogue of the sharp threshold phenomenon in random graphs. In contrast with the general theory developed for sharp thresholds of random graph properties, the AoN phenomenon has only been studied so far in specific inference settings. In this paper we study the general problem of inferring a graph $H=H_n$ planted in an Erdős--Rényi random graph, thus naturally connecting the two lines of research mentioned above.
We show that questions of AoN are closely connected to first moment thresholds, and to a generalization of the so-called Kahn--Kalai expectation threshold that scans over subgraphs of $H$ of edge density at least $q$. In a variety of settings we characterize AoN by showing that AoN occurs if and only if this ``generalized expectation threshold'' is roughly constant in $q$. Our proofs combine techniques from random graph theory and Bayesian inference.

Submitted 28 February, 2023; originally announced February 2023.

Comments: 41 pages

arXiv:2212.03362 [pdf, ps, other]

Exact Phase Transitions for Stochastic Block Models and Reconstruction on Trees

Authors: Elchanan Mossel, Allan Sly, Youngtak Sohn

Abstract: In this paper we continue to rigorously establish the predictions in groundbreaking work in statistical physics by Decelle, Krzakala, Moore, Zdeborová (2011) regarding the block model, in particular in the case of $q=3$ and $q=4$ communities. We prove that for $q=3$ and $q=4$ there is no computational-statistical gap if the average degree is above some constant by showing it is information-theoretically impossible to detect below the Kesten-Stigum bound. The proof is based on showing that for the broadcast process on Galton-Watson trees, reconstruction is impossible for $q=3$ and $q=4$ if the average degree is sufficiently large. This improves on the result of Sly (2009), who proved similar results for regular trees for $q=3$. Our analysis of the critical case $q=4$ provides a detailed picture showing that the tightness of the Kesten-Stigum bound in the antiferromagnetic case depends on the average degree of the tree. We also prove that for $q\geq 5$, the Kesten-Stigum bound is not sharp. Our results prove conjectures of Decelle, Krzakala, Moore, Zdeborová (2011), Moore (2017), Abbe and Sandon (2018) and Ricci-Tersenghi, Semerjian, and Zdeborová (2019). Our proofs are based on a new general coupling of the tree and graph processes and on a refined analysis of the broadcast process on the tree.

Submitted 6 December, 2022; originally announced December 2022.

Comments: 52 pages

MSC Class: 05C80; 60J85

arXiv:2112.02409 [pdf, other]

Understanding Dynamic Spatio-Temporal Contexts in Long Short-Term Memory for Road Traffic Speed Prediction

Authors: Won Kyung Lee, Deuk Sin Kwon, So Young Sohn

Abstract: Reliable traffic flow prediction is crucial to creating intelligent transportation systems. Many big-data-based prediction approaches have been developed, but they do not reflect the complicated dynamic interactions between roads across time and location. In this study, we propose a dynamically localised long short-term memory (LSTM) model that involves both spatial and temporal dependence between roads. To do so, we use a localised dynamic spatial weight matrix along with its dynamic variation. Moreover, the LSTM model can deal with sequential data with long dependency as well as complex non-linear features. Empirical results indicated superior prediction performance of the proposed model compared to two different baseline methods.

Submitted 16 June, 2023; v1 submitted 4 December, 2021; originally announced December 2021.

Comments: 10 pages, 2 tables, 4 figures, 2017 KDD Cup

arXiv:2112.00152 [pdf, ps, other]

One-step replica symmetry breaking of random regular NAE-SAT II

Authors: Danny Nam, Allan Sly, Youngtak Sohn

Abstract: Continuing our earlier work in \cite{nss20a}, we study the random regular $k$-NAE-SAT model in the condensation regime. In \cite{nss20a}, the 1RSB properties of the model were established with positive probability. In this paper, we improve the result to probability arbitrarily close to one. To do so, we introduce a new framework which is the synthesis of two approaches: small subgraph conditioning and a variance decomposition technique using Doob martingales and discrete Fourier analysis. The main challenge is a delicate integration of the two methods to overcome the difficulty arising from applying the moment method to an unbounded state space.

Submitted 17 December, 2023; v1 submitted 30 November, 2021; originally announced December 2021.

Comments: 57 pages, 1 figure. Accepted to Communications in Mathematical Physics. arXiv admin note: text overlap with arXiv:2011.14270

MSC Class: 60G15; 60K35; 82B44; 82D30

arXiv:2102.13267 [pdf, other]

LazyTensor: combining eager execution with domain-specific compilers

Authors: Alex Suhan, Davide Libenzi, Ailing Zhang, Parker Schuh, Brennan Saeta, Jie Young Sohn, Denys Shabalin

Abstract: Domain-specific optimizing compilers have demonstrated significant performance and portability benefits, but require programs to be represented in their specialized IRs. Existing frontends to these compilers suffer from the "language subset problem" where some host language features are unsupported in the subset of the user's program that interacts with the domain-specific compiler. By contrast, define-by-run ML frameworks (colloquially called "eager" mode) are popular due to their ease of use and expressivity, where the full power of the host programming language can be used. LazyTensor is a technique to target domain-specific compilers without sacrificing define-by-run ergonomics. Initially developed to support PyTorch on Cloud TPUs, the technique, along with a substantially shared implementation, has been used by Swift for TensorFlow across CPUs, GPUs, and TPUs, demonstrating the generality of the approach across (1) Tensor implementations, (2) hardware accelerators, and (3) programming languages.

Submitted 25 February, 2021; originally announced February 2021.

arXiv:1506.07645 [pdf, other]

When Pilots Should Not Be Reused Across Interfering Cells in Massive MIMO

Authors: Ji Yong Sohn, Sung Whan Yoon, Jaekyun Moon

Abstract: The pilot reuse issue in massive multi-input multi-output (MIMO) antenna systems with interfering cells is closely examined. This paper considers scenarios where the ratio of the channel coherence time to the number of users in a cell may be sufficiently large. One such practical scenario arises when the number of users per unit coverage area cannot grow freely while user mobility is low, as in indoor networks. Another important scenario is when the service provider is interested in maximizing the sum rate over a fixed, selected number of users rather than the sum rate over all users in the cell. A sum-rate comparison analysis shows that in such scenarios less aggressive reuse of pilots, involving allocation of additional pilots for interfering users, yields a significant performance advantage relative to the case where all cells reuse the same pilot set. For a given ratio of the normalized coherence time interval to the number of users per cell, the optimal pilot assignment strategy is revealed via a closed-form solution and the resulting net sum-rate is compared with that of the full pilot reuse.

Submitted 25 June, 2015; originally announced June 2015.

Comments: 7 pages, accepted and presented at International Conference on Communications (ICC2015) Workshop on 5G & Beyond

arXiv:1105.0515 [pdf]

Core-Periphery Segregation in Evolving Prisoner's Dilemma Networks

Authors: Yunkyu Sohn, Jung-Kyoo Choi, T. K. Ahn

Abstract: Dense cooperative networks are an essential element of social capital for a prosperous society. These networks enable individuals to overcome collective action dilemmas by enhancing trust. In many biological and social settings, network structures evolve endogenously as agents exit relationships and build new ones. However, the process by which evolutionary dynamics lead to self-organization of dense cooperative networks has not been explored. Our large group prisoner's dilemma experiments with exit and partner choice options show that core-periphery segregation of cooperators and defectors drives the emergence of cooperation. Cooperators' Quit-for-Tat and defectors' Roving strategy lead to a highly asymmetric core and periphery structure. Densely connected to each other, cooperators successfully isolate defectors and earn larger payoffs than defectors. Our analysis of the topological characteristics of evolving networks illuminates how social capital is generated.

Submitted 9 December, 2012; v1 submitted 3 May, 2011; originally announced May 2011.

Showing 1–10 of 10 results for author: Sohn, Y