-
Weak recovery, hypothesis testing, and mutual information in stochastic block models and planted factor graphs
Authors:
Elchanan Mossel,
Allan Sly,
Youngtak Sohn
Abstract:
The stochastic block model is a canonical model of communities in random graphs. It was introduced in the social sciences and statistics as a model of communities, and in theoretical computer science as an average case model for graph partitioning problems under the name of the ``planted partition model.'' Given a sparse stochastic block model, the two standard inference tasks are: (i) Weak recovery: can we estimate the communities with nontrivial overlap with the true communities? (ii) Detection/Hypothesis testing: can we distinguish if the sample was drawn from the block model or from a random graph with no community structure with probability tending to $1$ as the graph size tends to infinity?
In this work, we show that for sparse stochastic block models, the two inference tasks are equivalent except at a critical point. That is, weak recovery is information theoretically possible if and only if detection is possible. We thus find a strong connection between these two notions of inference for the model. We further prove that when detection is impossible, an explicit hypothesis test based on low degree polynomials in the adjacency matrix of the observed graph achieves the optimal statistical power. This low degree test is efficient as opposed to the likelihood ratio test, which is not known to be efficient. Moreover, we prove that the asymptotic mutual information between the observed network and the community structure exhibits a phase transition at the weak recovery threshold.
Our results are proven in much broader settings including the hypergraph stochastic block models and general planted factor graphs. In these settings we prove that the impossibility of weak recovery implies contiguity and provide a condition which guarantees the equivalence of weak recovery and detection.
Submitted 22 June, 2024;
originally announced June 2024.
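To make the two inference tasks in the abstract above concrete, here is a minimal Python sketch of a sparse two-community block model. The parametrization (uniform labels, within-community edge probability `a/n`, between-community probability `b/n`) and the helper names `sample_sbm` and `overlap` are assumptions made for illustration, not taken from the paper; the overlap normalization is one common choice among several.

```python
import random


def sample_sbm(n, a, b, seed=None):
    """Sample a sparse two-community stochastic block model (illustrative).

    Each vertex gets an independent uniform label in {0, 1}. Two vertices
    are joined with probability a/n if their labels agree and b/n otherwise,
    so the average degree stays bounded as n grows.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(2) for _ in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = a / n if labels[u] == labels[v] else b / n
            if rng.random() < p:
                edges.append((u, v))
    return labels, edges


def overlap(labels, estimate):
    """Agreement with the true labels, maximized over swapping the two
    community names and rescaled so that a trivial estimate scores near 0
    and a perfect one scores 1 (assumed normalization)."""
    n = len(labels)
    agree = sum(x == y for x, y in zip(labels, estimate))
    return 2 * max(agree, n - agree) / n - 1
```

In this language, weak recovery asks for an estimator computed from `edges` alone whose `overlap` stays bounded away from 0 as `n` grows, while detection asks whether `edges` can be told apart from a random graph with no community structure.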
-
Upper bounds on the $2$-colorability threshold of random $d$-regular $k$-uniform hypergraphs for $k\geq 3$
Authors:
Evan Chang,
Neel Kolhe,
Youngtak Sohn
Abstract:
For a large class of random constraint satisfaction problems (CSP), deep but non-rigorous theory from statistical physics predicts the location of the sharp satisfiability transition. The works of Ding, Sly, Sun (2014, 2016) and Coja-Oghlan, Panagiotou (2014) established the satisfiability threshold for random regular $k$-NAE-SAT, random $k$-SAT, and random regular $k$-SAT for large enough $k\geq k_0$ where $k_0$ is a large non-explicit constant. Establishing the same for small values of $k\geq 3$ remains an important open problem in the study of random CSPs.
In this work, we study two closely related models of random CSPs, namely the $2$-coloring on random $d$-regular $k$-uniform hypergraphs and the random $d$-regular $k$-NAE-SAT model. For every $k\geq 3$, we prove that there is an explicit $d_{\ast}(k)$ which gives a satisfiability upper bound for both of the models. Our upper bound $d_{\ast}(k)$ for $k\geq 3$ matches the prediction from statistical physics for the hypergraph $2$-coloring by Dall'Asta, Ramezanpour, Zecchina (2008), and is thus conjectured to be sharp. Moreover, $d_{\ast}(k)$ coincides with the satisfiability threshold of random regular $k$-NAE-SAT for large enough $k\geq k_0$ by Ding, Sly, Sun (2014).
Submitted 3 August, 2023;
originally announced August 2023.
-
Local geometry of NAE-SAT solutions in the condensation regime
Authors:
Allan Sly,
Youngtak Sohn
Abstract:
The local behavior of typical solutions of random constraint satisfaction problems (CSP) describes many important phenomena including clustering thresholds, decay of correlations, and the behavior of message passing algorithms. When the constraint density is low, studying the planted model is a powerful technique for determining this local behavior which in many examples has a simple Markovian structure. Work of Coja-Oghlan, Kapetanopoulos, Müller (2020) showed that for a wide class of models, this description applies up to the so-called condensation threshold.
Understanding the local behavior after the condensation threshold is more complex due to long-range correlations. In this work, we revisit the random regular NAE-SAT model in the condensation regime and determine the local weak limit which describes a random solution around a typical variable. This limit exhibits a complicated non-Markovian structure arising from the space of solutions being dominated by a small number of large clusters. This is the first description of the local weak limit in the condensation regime for any sparse random CSP in the one-step replica symmetry breaking (1RSB) class. Our result is non-asymptotic, and characterizes the tight fluctuation $O(n^{-1/2})$ around the limit. Our proof is based on coupling the local neighborhoods of an infinite spin system, which encodes the structure of the clusters, to a broadcast model on trees whose channel is given by the 1RSB belief-propagation fixed point. We believe that our proof technique has broad applicability to random CSPs in the 1RSB class.
Submitted 29 July, 2023; v1 submitted 26 May, 2023;
originally announced May 2023.
-
Sharp thresholds in inference of planted subgraphs
Authors:
Elchanan Mossel,
Jonathan Niles-Weed,
Youngtak Sohn,
Nike Sun,
Ilias Zadik
Abstract:
A major question in the study of the Erdős--Rényi random graph is to understand the probability that it contains a given subgraph. This study originated in classical work of Erdős and Rényi (1960). More recent work studies this question both in building a general theory of sharp versus coarse transitions (Friedgut and Bourgain, 1999; Hatami, 2012) and in results on the location of the transition (Kahn and Kalai, 2007; Talagrand, 2010; Frankston, Kahn, Narayanan, Park, 2019; Park and Pham, 2022).
In inference problems, one often studies the optimal accuracy of inference as a function of the amount of noise. In a variety of sparse recovery problems, an ``all-or-nothing (AoN) phenomenon'' has been observed: Informally, as the amount of noise is gradually increased, at some critical threshold the inference problem undergoes a sharp jump from near-perfect recovery to near-zero accuracy (Gamarnik and Zadik, 2017; Reeves, Xu, Zadik, 2021). We can regard AoN as the natural inference analogue of the sharp threshold phenomenon in random graphs. In contrast with the general theory developed for sharp thresholds of random graph properties, the AoN phenomenon has only been studied so far in specific inference settings.
In this paper we study the general problem of inferring a graph $H=H_n$ planted in an Erdős--Rényi random graph, thus naturally connecting the two lines of research mentioned above. We show that questions of AoN are closely connected to first moment thresholds, and to a generalization of the so-called Kahn--Kalai expectation threshold that scans over subgraphs of $H$ of edge density at least $q$. In a variety of settings we characterize AoN, by showing that AoN occurs if and only if this ``generalized expectation threshold'' is roughly constant in $q$. Our proofs combine techniques from random graph theory and Bayesian inference.
Submitted 28 February, 2023;
originally announced February 2023.
-
Exact Phase Transitions for Stochastic Block Models and Reconstruction on Trees
Authors:
Elchanan Mossel,
Allan Sly,
Youngtak Sohn
Abstract:
In this paper we continue to rigorously establish the predictions in groundbreaking work in statistical physics by Decelle, Krzakala, Moore, Zdeborová (2011) regarding the block model, in particular in the case of $q=3$ and $q=4$ communities.
We prove that for $q=3$ and $q=4$ there is no computational-statistical gap if the average degree is above some constant by showing it is information theoretically impossible to detect below the Kesten-Stigum bound. The proof is based on showing that for the broadcast process on Galton-Watson trees, reconstruction is impossible for $q=3$ and $q=4$ if the average degree is sufficiently large. This improves on the result of Sly (2009), who proved similar results for regular trees for $q=3$. Our analysis of the critical case $q=4$ provides a detailed picture showing that the tightness of the Kesten-Stigum bound in the antiferromagnetic case depends on the average degree of the tree. We also prove that for $q\geq 5$, the Kesten-Stigum bound is not sharp.
Our results prove conjectures of Decelle, Krzakala, Moore, Zdeborová (2011), Moore (2017), Abbe and Sandon (2018) and Ricci-Tersenghi, Semerjian, and Zdeborová (2019). Our proofs are based on a new general coupling of the tree and graph processes and on a refined analysis of the broadcast process on the tree.
Submitted 6 December, 2022;
originally announced December 2022.
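For orientation, the Kesten-Stigum bound mentioned in the abstract above can be written out in one common parametrization, which is assumed here rather than taken from the listing: a symmetric block model on $n$ vertices with $q$ communities, within-community edge probability $a/n$ and between-community edge probability $b/n$. The symbols $a$, $b$, $d$ and $\lambda$ are notation introduced only for this sketch.

$$
d = \frac{a + (q-1)\,b}{q}, \qquad
\lambda = \frac{a - b}{a + (q-1)\,b}, \qquad
d\,\lambda^{2} > 1 \iff (a-b)^{2} > q\,\bigl(a + (q-1)\,b\bigr).
$$

Here $d$ is the average degree and $\lambda$ is the second eigenvalue of the associated broadcast channel. Being below the bound corresponds to $d\lambda^{2} < 1$, and the antiferromagnetic case corresponds to $a < b$, that is, $\lambda < 0$.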
-
Understanding Dynamic Spatio-Temporal Contexts in Long Short-Term Memory for Road Traffic Speed Prediction
Authors:
Won Kyung Lee,
Deuk Sin Kwon,
So Young Sohn
Abstract:
Reliable traffic flow prediction is crucial to creating intelligent transportation systems. Many big-data-based prediction approaches have been developed but they do not reflect complicated dynamic interactions between roads considering time and location. In this study, we propose a dynamically localised long short-term memory (LSTM) model that involves both spatial and temporal dependence between roads. To do so, we use a localised dynamic spatial weight matrix along with its dynamic variation. Moreover, the LSTM model can deal with sequential data with long dependency as well as complex non-linear features. Empirical results indicated superior prediction performances of the proposed model compared to two different baseline methods.
Submitted 16 June, 2023; v1 submitted 4 December, 2021;
originally announced December 2021.
-
One-step replica symmetry breaking of random regular NAE-SAT II
Authors:
Danny Nam,
Allan Sly,
Youngtak Sohn
Abstract:
Continuing our earlier work in \cite{nss20a}, we study the random regular $k$-NAE-SAT model in the condensation regime. In \cite{nss20a}, the 1RSB properties of the model were established with positive probability. In this paper, we improve the result to probability arbitrarily close to one. To do so, we introduce a new framework which is the synthesis of two approaches: the small subgraph conditioning and a variance decomposition technique using Doob martingales and discrete Fourier analysis. The main challenge is a delicate integration of the two methods to overcome the difficulty arising from applying the moment method to an unbounded state space.
Submitted 17 December, 2023; v1 submitted 30 November, 2021;
originally announced December 2021.
-
LazyTensor: combining eager execution with domain-specific compilers
Authors:
Alex Suhan,
Davide Libenzi,
Ailing Zhang,
Parker Schuh,
Brennan Saeta,
Jie Young Sohn,
Denys Shabalin
Abstract:
Domain-specific optimizing compilers have demonstrated significant performance and portability benefits, but require programs to be represented in their specialized IRs. Existing frontends to these compilers suffer from the "language subset problem" where some host language features are unsupported in the subset of the user's program that interacts with the domain-specific compiler. By contrast, define-by-run ML frameworks (colloquially called "eager" mode) are popular due to their ease of use and expressivity, where the full power of the host programming language can be used. LazyTensor is a technique to target domain-specific compilers without sacrificing define-by-run ergonomics. Initially developed to support PyTorch on Cloud TPUs, the technique, along with a substantially shared implementation, has been used by Swift for TensorFlow across CPUs, GPUs, and TPUs, demonstrating the generality of the approach across (1) Tensor implementations, (2) hardware accelerators, and (3) programming languages.
Submitted 25 February, 2021;
originally announced February 2021.
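The general idea behind the abstract above, keeping an eager-looking interface while deferring work so that a whole graph can be handed to a compiler, can be illustrated with a toy Python sketch. This is not the LazyTensor implementation or its API: the class below, its method names `constant`, `trace` and `materialize`, and the two supported operations are all hypothetical and chosen only to show operation recording followed by on-demand evaluation.

```python
class LazyTensor:
    """Toy deferred value (hypothetical): records operations instead of
    running them, then evaluates the recorded graph on demand."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # "const", "add" or "mul"
        self.inputs = inputs  # upstream LazyTensor nodes
        self.value = value    # set for constants, or after materialization

    @staticmethod
    def constant(value):
        return LazyTensor("const", value=value)

    # Arithmetic looks eager to the caller but only extends the graph.
    def __add__(self, other):
        return LazyTensor("add", (self, other))

    def __mul__(self, other):
        return LazyTensor("mul", (self, other))

    def trace(self):
        """Render the recorded graph as a nested expression string."""
        if self.op == "const":
            return repr(self.value)
        return f"{self.op}({', '.join(t.trace() for t in self.inputs)})"

    def materialize(self):
        """Evaluate the recorded graph. A real system would instead pass
        the graph to a domain-specific compiler at this point."""
        if self.value is None:
            left, right = (t.materialize() for t in self.inputs)
            self.value = left + right if self.op == "add" else left * right
        return self.value
```

For example, with `x = LazyTensor.constant(2.0)` and `y = LazyTensor.constant(3.0)`, the expression `x * y + x` performs no arithmetic until `materialize()` is called; before that, only the graph `add(mul(2.0, 3.0), 2.0)` exists.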
-
When Pilots Should Not Be Reused Across Interfering Cells in Massive MIMO
Authors:
Ji Yong Sohn,
Sung Whan Yoon,
Jaekyun Moon
Abstract:
The pilot reuse issue in massive multi-input multi-output (MIMO) antenna systems with interfering cells is closely examined. This paper considers scenarios where the ratio of the channel coherence time to the number of users in a cell may be sufficiently large. One such practical scenario arises when the number of users per unit coverage area cannot grow freely while user mobility is low, as in indoor networks. Another important scenario is when the service provider is interested in maximizing the sum rate over a fixed, selected number of users rather than the sum rate over all users in the cell. A sum-rate comparison analysis shows that in such scenarios less aggressive reuse of pilots, involving allocation of additional pilots for interfering users, yields a significant performance advantage relative to the case where all cells reuse the same pilot set. For a given ratio of the normalized coherence time interval to the number of users per cell, the optimal pilot assignment strategy is revealed via a closed-form solution and the resulting net sum-rate is compared with that of full pilot reuse.
Submitted 25 June, 2015;
originally announced June 2015.
-
Core-Periphery Segregation in Evolving Prisoner's Dilemma Networks
Authors:
Yunkyu Sohn,
Jung-Kyoo Choi,
T. K. Ahn
Abstract:
Dense cooperative networks are an essential element of social capital for a prosperous society. These networks enable individuals to overcome collective action dilemmas by enhancing trust. In many biological and social settings, network structures evolve endogenously as agents exit relationships and build new ones. However, the process by which evolutionary dynamics lead to self-organization of dense cooperative networks has not been explored. Our large group prisoner's dilemma experiments with exit and partner choice options show that core-periphery segregation of cooperators and defectors drives the emergence of cooperation. Cooperators' Quit-for-Tat and defectors' Roving strategy lead to a highly asymmetric core and periphery structure. Densely connected to each other, cooperators successfully isolate defectors and earn larger payoffs than defectors. Our analysis of the topological characteristics of evolving networks illuminates how social capital is generated.
Submitted 9 December, 2012; v1 submitted 3 May, 2011;
originally announced May 2011.