
Arbitrarily Large Labelled Random Satisfiability Formulas for Machine Learning Training

Authors: Dimitris Achlioptas, Amrit Daswaney, Periklis A. Papakonstantinou

Abstract: Applying deep learning to solve real-life instances of hard combinatorial problems has tremendous potential. Research in this direction has focused on the Boolean satisfiability (SAT) problem, both because of its theoretical centrality and practical importance. A major roadblock faced, though, is that training sets are restricted to random formulas of size several orders of magnitude smaller than formulas of practical interest, raising serious concerns about generalization. This is because labeling random formulas of increasing size rapidly becomes intractable. By exploiting the probabilistic method in a fundamental way, we remove this roadblock entirely: we show how to generate correctly labeled random formulas of any desired size, without having to solve the underlying decision problem. Moreover, the difficulty of the classification task for the formulas produced by our generator is tunable by varying a simple scalar parameter. This opens up an entirely new level of sophistication for the machine learning methods that can be brought to bear on Satisfiability. Using our generator, we train existing state-of-the-art models for the task of predicting satisfiability on formulas with 10,000 variables. We find that they do no better than random guessing. As a first indication of what can be achieved with the new generator, we present a novel classifier that performs significantly better than random guessing 99% on the same datasets, for most difficulty levels.
Crucially, unlike past approaches that learn based on syntactic features of a formula, our classifier performs its learning on a short prefix of a solver's computation, an approach that we expect to be of independent interest.

Submitted 4 June, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

arXiv:2111.08837 [pdf, ps, other]

The Lovász Local Lemma is Not About Probability

Authors: Dimitris Achlioptas, Kostas Zampetakis

Abstract: Given a collection of independent events each of which has strictly positive probability, the probability that all of them occur is also strictly positive. The Lovász local lemma (LLL) asserts that this remains true if the events are not too strongly negatively correlated. The formulation of the lemma involves a graph with one vertex per event, with edges indicating potential negative dependence. The word "Local" in LLL reflects that the condition for the negative correlation can be expressed solely in terms of the neighborhood of each vertex. In contrast to this local view, Shearer developed an exact criterion for the avoidance probability to be strictly positive, but it involves summing over all independent sets of the graph. In this work we make two contributions. The first is to develop a hierarchy of increasingly powerful, increasingly non-local lemmata for bounding the avoidance probability from below, each lemma associated with a different set of walks in the graph. Already, at its second level, our hierarchy is stronger than all known local lemmata. To demonstrate its power we prove new bounds for the negative-fugacity singularity of the hard-core model on several lattices, a central problem in statistical physics. Our second contribution is to prove that Shearer's connection between the probabilistic setting and the independent set polynomial holds for \emph{arbitrary supermodular} functions, not just probability measures.
This means that all LLL machinery can be employed to bound from below an arbitrary supermodular function, based only on information regarding its value at singleton sets and partial information regarding their interactions. We show that this readily implies both the quantum LLL of Ambainis, Kempe, and Sattath~[JACM 2012], and the quantum Shearer criterion of Sattath, Morampudi, Laumann, and Moessner~[PNAS 2016].

Submitted 16 November, 2021; originally announced November 2021.

MSC Class: 60C05; 82B20 ACM Class: G.2.1

arXiv:2011.04809 [pdf, ps, other]

On the 2-colorability of random hypergraphs

Authors: Dimitris Achlioptas, Cristopher Moore

Abstract: A 2-coloring of a hypergraph is a mapping from its vertices to a set of two colors such that no edge is monochromatic. Let $H_k(n,m)$ be a random $k$-uniform hypergraph on $n$ vertices formed by picking $m$ edges uniformly, independently and with replacement. It is easy to show that if $r \geq r_c = 2^{k-1} \ln 2 - (\ln 2) /2$, then with high probability $H_k(n,m=rn)$ is not 2-colorable. We complement this observation by proving that if $r \leq r_c - 1$ then with high probability $H_k(n,m=rn)$ is 2-colorable.

Submitted 9 November, 2020; originally announced November 2020.

Comments: This is an 18-year-old paper: it appeared in RANDOM 2002, but we neglected to post it on the arxiv and it is a bit hard to find outside paywalls. An enormous amount of progress has been made on this and related problems since then, but it might still be of interest as an example of using the second moment method to prove lower bounds on phase transitions in random combinatorial problems

Journal ref: Proc. 6th Intl. Workshop on Randomization and Approximation Techniques in Computer Science (RANDOM '02) 78-90 (2002)
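
As a rough numerical illustration of the two thresholds quoted in the abstract above, the following Python sketch evaluates $r_c = 2^{k-1} \ln 2 - (\ln 2)/2$ and the lower bound $r_c - 1$ for a few values of $k$. The function names are hypothetical and are not taken from the paper; only the two formulas come from the abstract.

```python
import math


def upper_threshold(k: int) -> float:
    """r_c = 2^(k-1) * ln 2 - (ln 2)/2, as stated in the abstract.

    For edge density r >= r_c the random hypergraph is, with high
    probability, not 2-colorable.
    """
    return 2 ** (k - 1) * math.log(2) - math.log(2) / 2


def lower_threshold(k: int) -> float:
    """r_c - 1: at or below this density the abstract states that the
    random hypergraph is 2-colorable with high probability."""
    return upper_threshold(k) - 1.0


# Print the window [r_c - 1, r_c] left open by the two bounds.
for k in (3, 4, 5, 10):
    print(k, round(lower_threshold(k), 4), round(upper_threshold(k), 4))
```

The gap between the two bounds is exactly 1 for every $k$, while $r_c$ itself grows like $2^{k-1} \ln 2$, so the relative uncertainty shrinks quickly as $k$ increases.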

arXiv:2002.03690 [pdf, other]

doi 10.1002/rsa.20993

The random 2-SAT partition function

Authors: Dimitris Achlioptas, Amin Coja-Oghlan, Max Hahn-Klimroth, Joon Lee, Noela Müller, Manuel Penschuck, Guangyan Zhou

Abstract: We show that throughout the satisfiable phase the normalised number of satisfying assignments of a random $2$-SAT formula converges in probability to an expression predicted by the cavity method from statistical physics. The proof is based on showing that the Belief Propagation algorithm renders the correct marginal probability that a variable is set to `true' under a uniformly random satisfying assignment.

Submitted 10 February, 2020; originally announced February 2020.

MSC Class: 05C80; 60C05; 68Q87

arXiv:1906.02613 [pdf, other]

Bad Global Minima Exist and SGD Can Reach Them

Authors: Shengchao Liu, Dimitris Papailiopoulos, Dimitris Achlioptas

Abstract: Several works have aimed to explain why overparameterized neural networks generalize well when trained by Stochastic Gradient Descent (SGD). The consensus explanation that has emerged credits the randomized nature of SGD for the bias of the training process towards low-complexity models and, thus, for implicit regularization. We take a careful look at this explanation in the context of image classification with common deep neural network architectures. We find that if we do not regularize \emph{explicitly}, then SGD can be easily made to converge to poorly-generalizing, high-complexity models: all it takes is to first train on a random labeling on the data, before switching to properly training with the correct labels. In contrast, we find that in the presence of explicit regularization, pretraining with random labels has no detrimental effect on SGD. We believe that our results give evidence that explicit regularization plays a far more important role in the success of overparameterized neural networks than what has been understood until now. Specifically, by penalizing complicated models independently of their fit to the data, regularization affects training dynamics also far away from optima, making simple models that fit the data well discoverable by local methods, such as SGD.

Submitted 22 February, 2021; v1 submitted 6 June, 2019; originally announced June 2019.

arXiv:1809.07910 [pdf, ps, other]

Simple Local Computation Algorithms for the General Lovasz Local Lemma

Authors: Dimitris Achlioptas, Themis Gouleakis, Fotis Iliopoulos

Abstract: We consider the task of designing Local Computation Algorithms (LCA) for applications of the Lovász Local Lemma (LLL). LCA is a class of sublinear algorithms proposed by Rubinfeld et al.~\cite{Ronitt} that have received a lot of attention in recent years. The LLL is an existential, sufficient condition for a collection of sets to have non-empty intersection (in applications, often, each set comprises all objects having a certain property). The ground-breaking algorithm of Moser and Tardos~\cite{MT} made the LLL fully constructive, following earlier results by Beck~\cite{beck_lll} and Alon~\cite{alon_lll} giving algorithms under significantly stronger LLL-like conditions. LCAs under those stronger conditions were given in~\cite{Ronitt}, where it was asked if the Moser-Tardos algorithm can be used to design LCAs under the standard LLL condition. The main contribution of this paper is to answer this question affirmatively. In fact, our techniques yield LCAs for settings beyond the standard LLL condition.

Submitted 6 July, 2020; v1 submitted 20 September, 2018; originally announced September 2018.

arXiv:1809.01537 [pdf, ps, other]

A Local Lemma for Focused Stochastic Algorithms

Authors: Dimitris Achlioptas, Fotis Iliopoulos, Vladimir Kolmogorov

Abstract: We develop a framework for the rigorous analysis of focused stochastic local search algorithms. These are algorithms that search a state space by repeatedly selecting some constraint that is violated in the current state and moving to a random nearby state that addresses the violation, while hopefully not introducing many new ones. An important class of focused local search algorithms with provable performance guarantees has recently arisen from algorithmizations of the Lovász Local Lemma (LLL), a non-constructive tool for proving the existence of satisfying states by introducing a background measure on the state space. While powerful, the state transitions of algorithms in this class must be, in a precise sense, perfectly compatible with the background measure. In many applications this is a very restrictive requirement and one needs to step outside the class. Here we introduce the notion of \emph{measure distortion} and develop a framework for analyzing arbitrary focused stochastic local search algorithms, recovering LLL algorithmizations as the special case of no distortion. Our framework takes as input an arbitrary such algorithm and an arbitrary probability measure and shows how to use the measure as a yardstick of algorithmic progress, even for algorithms designed independently of the measure.

Submitted 3 September, 2018; originally announced September 2018.

Comments: This paper is based on results that appeared in preliminary form in SODA 2016 and FOCS 2016. arXiv admin note: text overlap with arXiv:1507.07633

arXiv:1805.02026 [pdf, ps, other]

Beyond the Lovasz Local Lemma: Point to Set Correlations and Their Algorithmic Applications

Authors: Dimitris Achlioptas, Fotis Iliopoulos, Alistair Sinclair

Abstract: Following the groundbreaking algorithm of Moser and Tardos for the Lovasz Local Lemma (LLL), there has been a plethora of results analyzing local search algorithms for various constraint satisfaction problems. The algorithms considered fall into two broad categories: resampling algorithms, analyzed via different algorithmic LLL conditions; and backtracking algorithms, analyzed via entropy compression arguments. This paper introduces a new convergence condition that seamlessly handles resampling, backtracking, and hybrid algorithms, i.e., algorithms that perform both resampling and backtracking steps. Unlike all past LLL work, our condition replaces the notion of a dependency or causality graph by quantifying point-to-set correlations between bad events. As a result, our condition simultaneously: (i)~captures the most general algorithmic LLL condition known as a special case; (ii)~significantly simplifies the analysis of entropy compression applications; (iii)~relates backtracking algorithms, which are conceptually very different from resampling algorithms, to the LLL; and most importantly (iv)~allows for the analysis of hybrid algorithms, which were outside the scope of previous techniques. We give several applications of our condition, including a new hybrid vertex coloring algorithm that extends the recent breakthrough result of Molloy for coloring triangle-free graphs to arbitrary graphs.

Submitted 18 August, 2020; v1 submitted 5 May, 2018; originally announced May 2018.

arXiv:1707.09467 [pdf, ps, other]

Probabilistic Model Counting with Short XORs

Authors: Dimitris Achlioptas, Panos Theodoropoulos

Abstract: The idea of counting the number of satisfying truth assignments (models) of a formula by adding random parity constraints can be traced back to the seminal work of Valiant and Vazirani, showing that NP is as easy as detecting unique solutions. While theoretically sound, the random parity constraints in that construction have the following drawback: each constraint, on average, involves half of all variables. As a result, the branching factor associated with searching for models that also satisfy the parity constraints quickly gets out of hand. In this work we prove that one can work with much shorter parity constraints and still get rigorous mathematical guarantees, especially when the number of models is large so that many constraints need to be added. Our work is based on the realization that the essential feature for random systems of parity constraints to be useful in probabilistic model counting is that the geometry of their set of solutions resembles an error-correcting code.

Submitted 29 July, 2017; originally announced July 2017.

Comments: To appear in SAT 17

ACM Class: D.2.4; F.3.1

arXiv:1702.04539 [pdf, other]

doi 10.1109/ISIT.2017.8006551

Time-Invariant LDPC Convolutional Codes

Authors: Dimitris Achlioptas, Hamed Hassani, Wei Liu, Rüdiger Urbanke

Abstract: Spatially coupled codes have been shown to universally achieve the capacity for a large class of channels. Many variants of such codes have been introduced to date. We discuss a further such variant that is particularly simple and is determined by a very small number of parameters. More precisely, we consider time-invariant low-density convolutional codes with very large constraint lengths. We show via simulations that, despite their extreme simplicity, such codes still show the threshold saturation behavior known from the spatially coupled codes discussed in the literature. Further, we show how the size of the typical minimum stopping set is related to basic parameters of the code. Due to their simplicity and good performance, these codes might be attractive from an implementation perspective.

Submitted 15 February, 2017; originally announced February 2017.

Comments: Submitted to 2017 IEEE International Symposium on Information Theory

arXiv:1607.06494 [pdf, ps, other]

Stochastic Control via Entropy Compression

Authors: Dimitris Achlioptas, Fotis Iliopoulos, Nikos Vlassis

Abstract: We consider an agent trying to bring a system to an acceptable state by repeated probabilistic action. Several recent works on algorithmizations of the Lovasz Local Lemma (LLL) can be seen as establishing sufficient conditions for the agent to succeed. Here we study whether such stochastic control is also possible in a noisy environment, where both the process of state-observation and the process of state-evolution are subject to adversarial perturbation (noise). The introduction of noise causes the tools developed for LLL algorithmization to break down since the key LLL ingredient, the sparsity of the causality (dependence) relationship, no longer holds. To overcome this challenge we develop a new analysis where entropy plays a central role, both to measure the rate at which progress towards an acceptable state is made and the rate at which noise undoes this progress. The end result is a sufficient condition that allows a smooth tradeoff between the intensity of the noise and the amenability of the system, recovering an asymmetric LLL condition in the noiseless case.

Submitted 26 November, 2016; v1 submitted 21 July, 2016; originally announced July 2016.

Comments: 18 pages

arXiv:1507.07633 [pdf, ps, other]

Focused Stochastic Local Search and the Lovász Local Lemma

Authors: Dimitris Achlioptas, Fotis Iliopoulos

Abstract: We develop tools for analyzing focused stochastic local search algorithms. These are algorithms which search a state space probabilistically by repeatedly selecting a constraint that is violated in the current state and moving to a random nearby state which, hopefully, addresses the violation without introducing many new ones. A large class of such algorithms arise from the algorithmization of the Lovász Local Lemma, a non-constructive tool for proving the existence of satisfying states. Here we give tools that provide a unified analysis of such algorithms and of many more, expressing them as instances of a general framework.

Submitted 15 August, 2015; v1 submitted 27 July, 2015; originally announced July 2015.

Comments: Generalized the analysis of the Recursive Walk algorithm; corrected the proof of Acyclic Edge Coloring result

MSC Class: 68W20 ACM Class: F.1.2; G.3

arXiv:1502.07787 [pdf, ps, other]

Product Measure Approximation of Symmetric Graph Properties

Authors: Dimitris Achlioptas, Paris Siminelakis

Abstract: In the study of random structures we often face a trade-off between realism and tractability, the latter typically enabled by assuming some form of independence. In this work we initiate an effort to bridge this gap by developing tools that allow us to work with independence without assuming it. Let $\mathcal{G}_{n}$ be the set of all graphs on $n$ vertices and let $S$ be an arbitrary subset of $\mathcal{G}_{n}$, e.g., the set of graphs with $m$ edges. The study of random networks can be seen as the study of properties that are true for most elements of $S$, i.e., that are true with high probability for a uniformly random element of $S$. With this in mind, we pursue the following question: What are general sufficient conditions for the uniform measure on a set of graphs $S \subseteq \mathcal{G}_{n}$ to be approximable by a product measure?

Submitted 26 February, 2015; originally announced February 2015.

Comments: 16 pages

MSC Class: 05C80

arXiv:1501.04931 [pdf, ps, other]

Navigability is a Robust Property

Authors: Dimitris Achlioptas, Paris Siminelakis

Abstract: The Small World phenomenon has inspired researchers across a number of fields. A breakthrough in its understanding was made by Kleinberg who introduced Rank Based Augmentation (RBA): add to each vertex independently an arc to a random destination selected from a carefully crafted probability distribution. Kleinberg proved that RBA makes many networks navigable, i.e., it allows greedy routing to successfully deliver messages between any two vertices in a polylogarithmic number of steps. We prove that navigability is an inherent property of many random networks, arising without coordination, or even independence assumptions.

Submitted 20 January, 2015; originally announced January 2015.

ACM Class: G.2.2; G.3

arXiv:1406.0242 [pdf, ps, other]

Random Walks that Find Perfect Objects and the Lovász Local Lemma

Authors: Dimitris Achlioptas, Fotis Iliopoulos

Abstract: We give an algorithmic local lemma by establishing a sufficient condition for the uniform random walk on a directed graph to reach a sink quickly. Our work is inspired by Moser's entropic method proof of the Lovász Local Lemma (LLL) for satisfiability and completely bypasses the Probabilistic Method formulation of the LLL. In particular, our method works when the underlying state space is entirely unstructured. Similarly to Moser's argument, the key point is that the inevitability of reaching a sink is established by bounding the entropy of the walk as a function of time.

Submitted 8 April, 2015; v1 submitted 2 June, 2014; originally announced June 2014.

Comments: 28 pages, added weighted version, added Independent Sets version, added Latin Squares Application

MSC Class: 68W20 ACM Class: F.1.2; G.3

arXiv:1311.4643 [pdf, other]

Near-Optimal Entrywise Sampling for Data Matrices

Authors: Dimitris Achlioptas, Zohar Karnin, Edo Liberty

Abstract: We consider the problem of selecting non-zero entries of a matrix $A$ in order to produce a sparse sketch of it, $B$, that minimizes $\|A-B\|_2$. For large $m \times n$ matrices, such that $n \gg m$ (for example, representing $n$ observations over $m$ attributes) we give sampling distributions that exhibit four important properties. First, they have closed forms computable from minimal information regarding $A$. Second, they allow sketching of matrices whose non-zeros are presented to the algorithm in arbitrary order as a stream, with $O(1)$ computation per non-zero. Third, the resulting sketch matrices are not only sparse, but their non-zero entries are highly compressible. Lastly, and most importantly, under mild assumptions, our distributions are provably competitive with the optimal offline distribution. Note that the probabilities in the optimal offline distribution may be complex functions of all the entries in the matrix. Therefore, regardless of computational complexity, the optimal distribution might be impossible to compute in the streaming model.

Submitted 19 November, 2013; originally announced November 2013.

Comments: 14 pages, to appear in NIPS' 13

arXiv:1107.5550 [pdf, ps, other]

The solution space geometry of random linear equations

Authors: Dimitris Achlioptas, Michael Molloy

Abstract: We consider random systems of linear equations over GF(2) in which every equation binds k variables. We obtain a precise description of the clustering of solutions in such systems. In particular, we prove that with probability that tends to 1 as the number of variables, n, grows: for every pair of solutions σ, τ, either there exists a sequence of solutions σ,...,τ, in which successive elements differ by O(log n) variables, or every sequence of solutions σ,...,τ, contains a step requiring the simultaneous change of Ω(n) variables. Furthermore, we determine precisely which pairs of solutions are in each category. Our results are tight and highly quantitative in nature. Moreover, our proof highlights the role of unique extendability as the driving force behind the success of Low Density Parity Check codes and our techniques also apply to the problem of so-called pseudo-codewords in such codes.

Submitted 9 October, 2015; v1 submitted 27 July, 2011; originally announced July 2011.

Comments: Corrects an error from previous versions. Lemma 35(b) replaces Observation 5 in the journal publication

Journal ref: Random Structures and Algorithms 46, 197-231 (2015)

arXiv:0803.2122 [pdf, ps, other]

doi 10.1109/FOCS.2008.11

Algorithmic barriers from phase transitions

Authors: Dimitris Achlioptas, Amin Coja-Oghlan

Abstract: For many random Constraint Satisfaction Problems, by now, we have asymptotically tight estimates of the largest constraint density for which they have solutions. At the same time, all known polynomial-time algorithms for many of these problems already completely fail to find solutions at much smaller densities. For example, it is well-known that it is easy to color a random graph using twice as many colors as its chromatic number. Indeed, some of the simplest possible coloring algorithms already achieve this goal. Given the simplicity of those algorithms, one would expect there is a lot of room for improvement. Yet, to date, no algorithm is known that uses $(2-ε) χ$ colors, in spite of efforts by numerous researchers over the years. In view of the remarkable resilience of this factor of 2 against every algorithm hurled at it, we believe it is natural to inquire into its origin. We do so by analyzing the evolution of the set of $k$-colorings of a random graph, viewed as a subset of $\{1,...,k\}^{n}$, as edges are added. We prove that the factor of 2 corresponds in a precise mathematical sense to a phase transition in the geometry of this set. Roughly, the set of $k$-colorings looks like a giant ball for $k \ge 2 χ$, but like an error-correcting code for $k \le (2-ε) χ$. We prove that a completely analogous phase transition also occurs both in random $k$-SAT and in random hypergraph 2-coloring. And that for each problem, its location corresponds precisely with the point where all known polynomial-time algorithms fail.
To prove our results we develop a general technique that allows us to prove rigorously much of the celebrated 1-step Replica-Symmetry-Breaking hypothesis of statistical physics for random CSPs.

Submitted 21 May, 2008; v1 submitted 14 March, 2008; originally announced March 2008.

Comments: extended abstract

MSC Class: 05C80

Journal ref: Proc. 49th FOCS (2008) 793 - 802

arXiv:0706.1725 [pdf, ps, other]

The two possible values of the chromatic number of a random graph

Authors: Dimitris Achlioptas, Assaf Naor

Abstract: Given $d \in (0,\infty)$ let $k_d$ be the smallest integer $k$ such that $d < 2k\log k$. We prove that the chromatic number of a random graph $G(n,d/n)$ is either $k_d$ or $k_d+1$ almost surely.

Submitted 12 June, 2007; originally announced June 2007.

Comments: 17 pages, published version

MSC Class: 60C05

Journal ref: Ann. of Math. (2) 162 (2005), no. 3, 1335--1351
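
The definition of k_d in the abstract above can be evaluated directly. The following is a minimal sketch, assuming that log denotes the natural logarithm (the abstract does not state the base); the function name `k_d` is chosen here for illustration only.

```python
import math


def k_d(d: float) -> int:
    """Smallest integer k with d < 2*k*log(k), for d > 0.

    Assumption: log is the natural logarithm.
    """
    if d <= 0:
        raise ValueError("d must be positive")
    k = 1
    # 2*k*log(k) is 0 at k = 1 and strictly increasing afterwards,
    # so the loop stops at the first k that satisfies d < 2*k*log(k).
    while d >= 2 * k * math.log(k):
        k += 1
    return k
```

Under that assumption, the theorem says the chromatic number of G(n, d/n) is almost surely either `k_d(d)` or `k_d(d) + 1`.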

arXiv:cs/0611052 [pdf, ps, other]

On the Solution-Space Geometry of Random Constraint Satisfaction Problems

Authors: Dimitris Achlioptas, Federico Ricci-Tersenghi

Abstract: For a large number of random constraint satisfaction problems, such as random k-SAT and random graph and hypergraph coloring, there are very good estimates of the largest constraint density for which solutions exist. Yet, all known polynomial-time algorithms for these problems fail to find solutions even at much lower densities. To understand the origin of this gap we study how the structure of the space of solutions evolves in such problems as constraints are added. In particular, we prove that much before solutions disappear, they organize into an exponential number of clusters, each of which is relatively small and far apart from all other clusters. Moreover, inside each cluster most variables are frozen, i.e., take only one value. The existence of such frozen variables gives a satisfying intuitive explanation for the failure of the polynomial-time algorithms analyzed so far. At the same time, our results establish rigorously one of the two main hypotheses underlying Survey Propagation, a heuristic introduced by physicists in recent years that appears to perform extraordinarily well on random constraint satisfaction problems.

Submitted 15 December, 2006; v1 submitted 13 November, 2006; originally announced November 2006.

Comments: 25 pages, work presented at STOC'06

arXiv:cond-mat/0508737 [pdf, ps, other]

doi 10.1088/1742-5468/2005/10/P10012

Rapid Mixing for Lattice Colorings with Fewer Colors

Authors: Dimitris Achlioptas, Michael Molloy, Cristopher Moore, Frank Van Bussell

Abstract: We provide an optimally mixing Markov chain for 6-colorings of the square lattice on rectangular regions with free, fixed, or toroidal boundary conditions. This implies that the uniform distribution on the set of such colorings has strong spatial mixing, so that the 6-state Potts antiferromagnet has a finite correlation length and a unique Gibbs measure at zero temperature. Four and five are now the only remaining values of q for which it is not known whether there exists a rapidly mixing Markov chain for q-colorings of the square lattice.

Submitted 30 August, 2005; originally announced August 2005.

Comments: Appeared in Proc. LATIN 2004, to appear in JSTAT

arXiv:cs/0503046 [pdf, ps, other]

Hiding Satisfying Assignments: Two are Better than One

Authors: Dimitris Achlioptas, Haixia Jia, Cristopher Moore

Abstract: The evaluation of incomplete satisfiability solvers depends critically on the availability of hard satisfiable instances. A plausible source of such instances consists of random k-SAT formulas whose clauses are chosen uniformly from among all clauses satisfying some randomly chosen truth assignment A. Unfortunately, instances generated in this manner tend to be relatively easy and can be solved efficiently by practical heuristics. Roughly speaking, as the formula's density increases, for a number of different algorithms, A acts as a stronger and stronger attractor. Motivated by recent results on the geometry of the space of satisfying truth assignments of random k-SAT and NAE-k-SAT formulas, we introduce a simple twist on this basic model, which appears to dramatically increase its hardness. Namely, in addition to forbidding the clauses violated by the hidden assignment A, we also forbid the clauses violated by its complement, so that both A and the complement of A are satisfying. It appears that under this "symmetrization" the effects of the two attractors largely cancel out, making it much harder for algorithms to find any truth assignment. We give theoretical and experimental evidence supporting this assertion.

Submitted 19 March, 2005; originally announced March 2005.

Comments: Preliminary version appeared in AAAI 2004
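
The "two hidden assignments" construction described in the abstract above can be sketched by rejection sampling. This is an illustrative sketch only, not the authors' generator: the function names, the `(variable, polarity)` clause encoding, and the choice of k distinct variables per clause are all assumptions made here.

```python
import random


def two_hidden_formula(n, m, k=3, seed=None):
    """Sample m random k-clauses satisfied by a hidden assignment A
    and by its complement (hypothetical helper, rejection sampling).

    A clause is a list of (variable, polarity) pairs; polarity True
    means the positive literal. Requires 2 <= k <= n.
    """
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    rng = random.Random(seed)
    hidden = [rng.random() < 0.5 for _ in range(n)]
    clauses = []
    while len(clauses) < m:
        variables = rng.sample(range(n), k)
        clause = [(v, rng.random() < 0.5) for v in variables]
        # Truth value of each literal under the hidden assignment A.
        values = [hidden[v] == polarity for v, polarity in clause]
        # All false: violated by A. All true: violated by A's complement.
        # Keep the clause only if neither happens.
        if any(values) and not all(values):
            clauses.append(clause)
    return hidden, clauses


def satisfies(assignment, clauses):
    """True if every clause has at least one literal true under assignment."""
    return all(
        any(assignment[v] == polarity for v, polarity in clause)
        for clause in clauses
    )
```

For example, `hidden, clauses = two_hidden_formula(20, 50, k=3, seed=0)` yields a formula for which both `hidden` and its bitwise complement pass `satisfies`.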

arXiv:cond-mat/0503087 [pdf, ps, other]

On the Bias of Traceroute Sampling; or, Power-law Degree Distributions in Regular Graphs

Authors: Dimitris Achlioptas, Aaron Clauset, David Kempe, Cristopher Moore

Abstract: Understanding the structure of the Internet graph is a crucial step for building accurate network models and designing efficient algorithms for Internet applications. Yet, obtaining its graph structure is a surprisingly difficult task, as edges cannot be explicitly queried. Instead, empirical studies rely on traceroutes to build what are essentially single-source, all-destinations, shortest-path trees. These trees only sample a fraction of the network's edges, and a recent paper by Lakhina et al. found empirically that the resulting sample is intrinsically biased. For instance, the observed degree distribution under traceroute sampling exhibits a power law even when the underlying degree distribution is Poisson. In this paper, we study the bias of traceroute sampling systematically, and, for a very general class of underlying degree distributions, calculate the likely observed distributions explicitly. To do this, we use a continuous-time realization of the process of exposing the BFS tree of a random graph with a given degree distribution, calculate the expected degree distribution of the tree, and show that it is sharply concentrated. As example applications of our machinery, we show how traceroute sampling finds power-law degree distributions in both delta-regular and Poisson-distributed random graphs. Thus, our work puts the observations of Lakhina et al. on a rigorous footing, and extends them to nearly arbitrary degree distributions.

Submitted 29 March, 2006; v1 submitted 3 March, 2005; originally announced March 2005.

Comments: Long-format version (19 pages); includes small correction to section 6.1

Journal ref: Proc. 37th ACM Symposium on Theory of Computing (STOC) 2005

arXiv:cond-mat/0407278 [pdf, ps, other]

The Chromatic Number of Random Regular Graphs

Authors: Dimitris Achlioptas, Cristopher Moore

Abstract: Given any integer $d \ge 3$, let $k$ be the smallest integer such that $d < 2k \log k$. We prove that with high probability the chromatic number of a random $d$-regular graph is $k$, $k+1$, or $k+2$, and that if $(2k-1) \log k < d < 2k \log k$ then the chromatic number is either $k+1$ or $k+2$.

Submitted 11 July, 2004; originally announced July 2004.

Journal ref: Proc. RANDOM 2004

arXiv:cond-mat/0310227 [pdf, ps, other]

Random k-SAT: Two Moments Suffice to Cross a Sharp Threshold

Authors: Dimitris Achlioptas, Cristopher Moore

Abstract: Many NP-complete constraint satisfaction problems appear to undergo a "phase transition" from solubility to insolubility when the constraint density passes through a critical threshold. In all such cases it is easy to derive upper bounds on the location of the threshold by showing that above a certain density the first moment (expectation) of the number of solutions tends to zero. We show that in the case of certain symmetric constraints, considering the second moment of the number of solutions yields nearly matching lower bounds for the location of the threshold. Specifically, we prove that the threshold for both random hypergraph 2-colorability (Property B) and random Not-All-Equal k-SAT is $2^{k-1} \ln 2 - O(1)$. As a corollary, we establish that the threshold for random k-SAT is of order $\Theta(2^k)$, resolving a long-standing open problem.

Submitted 9 October, 2003; originally announced October 2003.

arXiv:math/0305151 [pdf, ps, other]

On the Maximum Satisfiability of Random Formulas

Authors: Dimitris Achlioptas, Assaf Naor, Yuval Peres

Abstract: Maximum satisfiability is a canonical NP-hard optimization problem that appears empirically hard for random instances. Let us say that a conjunctive normal form (CNF) formula consisting of $k$-clauses is $p$-satisfiable if there exists a truth assignment satisfying $1-2^{-k}+p 2^{-k}$ of all clauses (observe that every $k$-CNF is 0-satisfiable). Also, let $F_k(n,m)$ denote a random $k$-CNF on $n$ variables formed by selecting uniformly and independently $m$ out of all possible $k$-clauses. It is easy to prove that for every $k>1$ and every $p$ in $(0,1]$, there is $R_k(p)$ such that if $r >R_k(p)$, then the probability that $F_k(n,rn)$ is $p$-satisfiable tends to 0 as $n$ tends to infinity. We prove that there exists a sequence $δ_k \to 0$ such that if $r <(1-δ_k) R_k(p)$ then the probability that $F_k(n,rn)$ is $p$-satisfiable tends to 1 as $n$ tends to infinity. The sequence $δ_k$ tends to 0 exponentially fast in $k$.

Submitted 9 May, 2003; originally announced May 2003.

arXiv:cs/0305009 [pdf, ps, other]

The Threshold for Random k-SAT is 2^k ln2 - O(k)

Authors: Dimitris Achlioptas, Yuval Peres

Abstract: Let F be a random k-SAT formula on n variables, formed by selecting uniformly and independently m = rn out of all possible k-clauses. It is well-known that if r>2^k ln 2, then the formula F is unsatisfiable with probability that tends to 1 as n tends to infinity. We prove that there exists a sequence t_k = O(k) such that if r < 2^k ln 2 - t_k, then the formula F is satisfiable with probability that tends to 1 as n tends to infinity. Our technique yields an explicit lower bound for the random k-SAT threshold for every k. For k>3 this improves upon all previously known lower bounds. For example, when k=10 our lower bound is 704.94 while the upper bound is 708.94.

Submitted 8 September, 2003; v1 submitted 13 May, 2003; originally announced May 2003.

Comments: Added figures and explained the intuition behind our approach. Made a correction following comments of Chris Calabro

ACM Class: F.2.2
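
The "well-known" upper bound mentioned in the abstract above follows from a first-moment calculation: a fixed assignment satisfies a uniformly random k-clause with probability 1 - 2^{-k}, so the expected number of satisfying assignments of F is 2^n (1 - 2^{-k})^{rn}, which tends to 0 once r exceeds ln 2 / (-ln(1 - 2^{-k})). The sketch below computes that density; it is a standard textbook calculation with a function name chosen here for illustration, and it need not coincide with the sharper upper bound quoted in the abstract.

```python
import math


def first_moment_density(k: int) -> float:
    """Density r above which E[#satisfying assignments] -> 0.

    E = (2 * (1 - 2**-k)**r)**n, so E -> 0 exactly when
    r > ln 2 / (-ln(1 - 2**-k)).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # log1p(-x) computes ln(1 - x) accurately for small x.
    return math.log(2) / -math.log1p(-(2.0 ** -k))
```

Because -ln(1 - x) > x for 0 < x < 1, this value is always below 2^k ln 2, which is why r > 2^k ln 2 suffices for unsatisfiability with probability tending to 1.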

arXiv:cond-mat/0209622 [pdf, ps, other]

The Asymptotic Order of the k-SAT Threshold

Authors: Dimitris Achlioptas, Cristopher Moore

Abstract: Form a random k-SAT formula on n variables by selecting uniformly and independently m=rn clauses out of all 2^k (n choose k) possible k-clauses. The Satisfiability Threshold Conjecture asserts that for each k there exists a constant r_k such that, as n tends to infinity, the probability that the formula is satisfiable tends to 1 if r < r_k and to 0 if r > r_k. It has long been known that 2^k / k < r_k < 2^k. We prove that $r_k > 2^{k-1} \ln 2 - d_k$, where $d_k \to (1+\ln 2)/2$. Our proof also allows a blurry glimpse of the "geometry" of the set of satisfying truth assignments, and a nearly exact location of the threshold for Not-All-Equal (NAE) k-SAT.

Submitted 26 September, 2002; originally announced September 2002.

Comments: Conference version to appear in FOCS (Foundations of Computer Science) 2002

Showing 1–28 of 28 results for author: Achlioptas, D