Distributed Delta-Coloring under Bandwidth Limitations
Magnús M. Halldórsson [email protected] Reykjavik University, Iceland
Yannic Maus (supported by the Austrian Science Fund (FWF), Grant P36280-N) [email protected] TU Graz, Austria
Abstract
We consider the problem of coloring graphs of maximum degree Δ with Δ colors in the distributed setting with limited bandwidth. Specifically, we give a -round randomized algorithm in the CONGEST model. This is close to the lower bound of rounds from [Brandt et al., STOC ’16], which also holds in the more powerful LOCAL model. The core of our algorithm is a reduction to several special instances of the constructive Lovász local lemma (LLL) and the deg+1-list coloring problem.
1 Introduction
The objective in the -coloring problem is to color the vertices of a graph with such that any two adjacent vertices receive different colors. In the distributed setting, the -coloring problem has long been the focus of interest as the natural local coloring problem: any partial solution can be extended to a valid full solution. It has fast -round algorithms, both in [13] and [29], and so does the more general deg+1-list coloring problem (d1LC), which is what remains when a subset of the nodes has been -colored [30, 34].
The -coloring problem, on the other hand, is non-local: fixing the colors of just two nodes can make it impossible to form a proper -coloring, see Figure 1 for an example. Due to its simplicity, it has become the prototypical problem for the frontier of the unknown [27, 3]. Even the existence of such colorings is non-trivial: a celebrated result by Brooks from the ’40s shows that -colorings exist for any connected graph that is neither an odd cycle nor a clique on nodes [10].
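The non-locality claim can be checked by brute force on a toy instance. The sketch below is not the example of Figure 1; it assumes a three-node path (maximum degree 2) and a hypothetical helper `can_extend` that tests whether a partial coloring extends to a proper coloring with a given number of colors.

```python
from itertools import product

def can_extend(adj, num_colors, fixed):
    """Brute-force test: does the partial coloring `fixed` extend to a proper coloring?"""
    free = [v for v in adj if v not in fixed]
    for choice in product(range(num_colors), repeat=len(free)):
        coloring = dict(fixed)
        coloring.update(zip(free, choice))
        # A coloring is proper if no edge has both endpoints in the same color.
        if all(coloring[u] != coloring[v] for u in adj for v in adj[u]):
            return True
    return False

# Toy instance (an assumption for illustration): the path a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
delta = max(len(neighbors) for neighbors in path.values())  # maximum degree, here 2

print(can_extend(path, delta, {}))                     # a proper coloring with delta colors exists
print(can_extend(path, delta, {"a": 0, "c": 1}))       # two fixed colors block every extension
print(can_extend(path, delta + 1, {"a": 0, "c": 1}))   # one extra color makes the extension possible
```

With one additional color the same partial coloring extends, which matches the contrast drawn between the two problems.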
A -round -coloring algorithm was recently given in [22], but no non-trivial algorithm is known in . It is of natural interest to examine if the transition from local to non-local problems behaves differently in and in . Thus, we set out to answer the following question:
In this work, we answer the question in the affirmative. We prove the following theorem.
Theorem 1.1.
There is a randomized -round algorithm to -color any graph with maximum degree . The algorithm works with high probability.
Theorem 1.1 nearly matches the lower bound of that holds in [8]. In [3], the authors claim that in order to make progress in our understanding of distributed complexity theory, we require a -coloring algorithm that is genuinely different from the approaches in [42, 27]. This is due to the fact that the current state-of-the-art runtime for -coloring lies exactly in the regime that is poorly understood. The approaches of [42, 27] are based on brute-forcing solutions on carefully chosen subgraphs of super-constant diameter. In contrast, our results are based on a bandwidth-efficient deterministic reduction to a constant number of ‘simple’ Lovász Local Lemma (LLL) instances and instances of d1LC; the LLL is a general solution method applicable to a wide range of problems.
It is known that LLL is complete for sublogarithmic computation on constant-degree graphs, but its role on general graphs is widely open [15]. Our algorithm adds to the small list of problems (see the related work section in [32]) that can be solved in sublogarithmic time with an LLL-type approach, even under the presence of bandwidth restrictions. Before continuing further, let us first detail the computational model.
In the CONGEST model, a communication network is abstracted as an n-node graph G of maximum degree Δ, where nodes serve as computing entities and edges represent communication links. Initially, a node is unaware of the topology of G; nodes communicate with their neighbors in order to coordinate their actions. This communication happens in synchronous rounds where, in each round, a node can perform arbitrary local computations and send one message of O(log n) bits over each incident edge. At the end of the algorithm, each node outputs its own portion of the solution, e.g., its color in coloring problems. The LOCAL model is identical, except without restrictions on message size.
1.1 Technical Overview on Previous Approaches
Previous fast distributed -coloring algorithms either use huge bandwidth [42, 27] or use limited bandwidth but only work in the extreme cases of either very high-degree [22] or super low-degree graphs [40]. Optimally, we would like to take any of these solutions and run them with minor modifications to obtain an algorithm that uses low bandwidth and works for all degrees. This approach is entirely infeasible for the highly specialized algorithms in [42, 27, 28]. These works crucially rely on learning the full topology of non-constant diameter subgraphs, which is impossible in .
For graphs of super-low degree, i.e., at most , an efficient -coloring algorithm with low bandwidth can be deduced from the results in [40]. In fact, the paper takes a complexity-theoretic approach and shows that any problem can be solved in sublogarithmic time with low bandwidth as long as 1) the problem is defined on low-degree graphs, 2) a given solution can be checked efficiently for correctness by a distributed algorithm, and 3) the problem admits a sublogarithmic time model algorithm. As such, the results are not very constructive for any specific problem like the -coloring problem. In fact, it is known that these generic techniques cannot be extended to problems defined on graphs with larger degrees [4], which is the main target of our work.
Our best hope is then the -round model algorithm of [22]. We discuss it in detail throughout the next few pages as it motivates the design choices of our solution. Unfortunately, for maximum degrees that are at most poly-logarithmic, it relies on the prior -round model algorithm from [27] in a black-box manner. For large maximum degrees, however, when is , they provide a sophisticated constant-round randomized reduction to the -list coloring problem (d1LC) that also works with low bandwidth. The central ingredient in this reduction is the notion of slack.
Slack.
To reduce the Δ-coloring problem to d1LC, it suffices to obtain a unit amount of slack for each node. Namely, if two neighbors of a node are assigned the same color, there are then more colors available to the node than its number of uncolored neighbors. Slack can be easily generated w.h.p. (for most, but not all, kinds of nodes) with a simple single-round procedure termed SlackGeneration, as long as the graph has high degree. This observation has been used in numerous papers on various coloring problems, e.g., [19, 35, 13, 29, 22]. For intermediate-degree graphs, this slack generation problem can be formulated as an instance of the constructive Lovász Local Lemma (LLL), but one that seems inherently non-implementable in CONGEST, as we explain later.
Recall that the LLL is a general solution method applicable to a wide range of problems. Defined over a set of independent random variables, it asks for an assignment of the variables that avoids a set of “bad” events. The original theorem [18] shows that such an assignment exists as long as the probability of each event occurring is sufficiently small in relation to the dependence degree of the events, i.e., the number of other events that share a variable. There is now a general algorithm running in rounds of LOCAL [39, 16], but superfast algorithms are only known for restricted cases [20, 26, 17]. Even less is known about solvability in CONGEST [31, 32].
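To make the setup of variables and bad events concrete, here is a minimal centralized sketch of sequential resampling in the style of Moser and Tardos. It is not one of the distributed algorithms cited above. The toy clause instance, the helper names, and the step budget are assumptions for illustration. Each bad event is "all four literals of a clause are false", which occurs with probability 1/16 and shares variables with two other events, so the standard symmetric criterion e·p·(d+1) ≤ 1 holds.

```python
import random

def make_clause_event(literals):
    """Bad event of a clause: every literal (variable, wanted_bit) is false."""
    variables = [var for var, _ in literals]
    def occurs(values):
        return all(values[var] != wanted for var, wanted in literals)
    return variables, occurs

def resample_until_good(num_vars, bad_events, rng, max_steps=10_000):
    """Sample all variables, then resample the variables of some occurring bad event."""
    values = [rng.randint(0, 1) for _ in range(num_vars)]
    for _ in range(max_steps):
        violated = [event for event in bad_events if event[1](values)]
        if not violated:
            return values
        variables, _ = violated[0]
        for var in variables:
            values[var] = rng.randint(0, 1)
    raise RuntimeError("step budget exhausted; the instance may violate the LLL criterion")

# Hypothetical instance: 4 clauses of width 4 over 8 Boolean variables.
clauses = [
    [(0, 1), (1, 1), (2, 0), (3, 1)],
    [(2, 1), (3, 0), (4, 1), (5, 1)],
    [(4, 0), (5, 0), (6, 1), (7, 1)],
    [(6, 0), (7, 0), (0, 0), (1, 0)],
]
events = [make_clause_event(clause) for clause in clauses]
assignment = resample_until_good(8, events, random.Random(0))
print(assignment)
```

The difficulty discussed in the text is that a distributed event node must be able to evaluate and resample its own variables, which this centralized loop takes for granted.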
In the presented slack generation LLL, there is a bad event for each node that holds if the respective node does not obtain slack. The SlackGeneration procedure mentioned above works as follows. Each node is activated with a constant probability and picks a random candidate color, which it keeps if no neighbor picks the same color and discards otherwise (see Algorithm 3 in Section 3 for details). Hence, there are random variables for each node describing its activation status and candidate color choice. The main reason why this LLL cannot be directly implemented in CONGEST is that events involve values of variables at distance 2 in the communication graph. This makes it impossible for an event node to obtain full information on the status of all its variables, an ingredient that is essentially crucial in all known sublogarithmic-time LLL algorithms. The formal meaning of the word ‘essentially’ in that sentence is extremely technical and is captured by the notion of a simulatable LLL (see Definition 5.2). In essence, it says that the LLL is easy enough that event nodes can learn enough information about their variables to execute some simple primitives such as evaluating their status (does the event hold or not), resampling their variables, and computing certain conditional probabilities for the event to hold under partial variable assignments. The last condition is the most challenging one to ensure.
1.2 Our Technical Approach
What we have discussed so far is only half the truth. In fact, the slack generation process only works for sparse nodes, i.e., nodes with many non-edges in their neighborhood. If the graph is locally too dense, then slack cannot be obtained via this LLL. Thus, the algorithm of [22] carefully analyzes the topological structure of the hard instances for Δ-coloring, combining several different (deterministic and randomized) methods to create slack. Such a treatment seems to be inherent to the Δ-coloring problem, as a very similar classification was independently and concurrently discovered in the streaming model [1]. It has also been shown to be useful in other models of computation: in the aftermath of these works, it has been used to obtain efficient massively parallel algorithms for the problem [11].
Our algorithm is based on a fine-grained version of this classification equipped with a sequence of various LLLs for eventual slack generation. Each LLL is easier to solve in the model than the aforementioned slack generation LLL. In the following, we use the terminology of [22], and explain their algorithm and our solution in more detail.
As in all recent randomized distributed graph coloring algorithms, they divide the graph into sparse parts and dense parts, the latter referred to as “almost-cliques” (ACs). Then, they partition the ACs further into different types – ordinary, nice, difficult – each of which admits a different coloring approach. See Figure 1 for an example of an AC. One challenge is that all these different types of tricky subgraphs may appear in the same graph and close to each other. For this overview it is best to imagine each AC as a proper clique on almost nodes in which each node has a few external neighbors residing in other ACs and creating lots of dependencies between different ACs. Thus, their algorithm is fragile with regard to the order in which different types of ACs are colored. The starting point of our work is that the core step of their algorithm does not work in low-degree graphs. In more detail, the first step of their algorithm executes SlackGeneration (see Algorithm 3 in Section 3) on a carefully selected subset of nodes to achieve three objectives: a) giving slack to all sparse nodes, b) providing a slack-toehold for a subclass of the difficult ACs that the authors term “runaway”, and c) providing each ordinary clique with a node that has slack. (Footnote: A slack-toehold for an AC is an uncolored node that can be stalled to be colored later. All of its neighbors then lose one competitor for the remaining colors, providing them with temporary slack.) Each of these probabilistic guarantees holds w.h.p. as long as . Their proof shows that, in essence, all three cases are LLLs but ones that are far from being simulatable. We discuss our solutions for a)–c) separately.
Solution for a): Providing slack to sparse graphs is the main application of the LLL algorithm in [32]. In essence, we adapt their techniques to provide slack to sparse nodes but provide additional guarantees that are needed for other parts of the graph.
Solution for b): For the difficult cliques we propose a solution that eliminates randomness and solely colors all the nodes via a sequence of d1LC instances. See Figure 2 for an illustration of our solution. First, we adjust the classification of difficult almost-cliques from [22]. All nodes in a given difficult clique have the same external degree. We associate with each such AC a special node on its outside that has many neighbors on the inside (namely, more than twice the external degree of ’s nodes).
From here, we assign each difficult clique a layer that determines the step in which it gets colored. Those with a special node that is not contained in another difficult clique are treated separately and assigned to layer , to be dealt with at the very end. The other difficult cliques are assigned to layers indexed by the base-2 logarithm of their external degree. The crucial property that follows is that the cliques in a given layer have their special node in a higher layer. This allows us to color the cliques layer by layer, starting with smaller layers. The special node is stalled to be colored later, providing a toehold for . This way, we color the cliques and special nodes in all layers besides .
This leaves the problem of coloring the ACs in layer and their still uncolored special nodes. In this exposition, we assume that special nodes are not shared by multiple difficult cliques. In that case, we pair the special node up with some node that is not adjacent to with the objective of same-coloring the nodes, i.e., assigning both the same color. This is done via a virtual coloring problem capturing the dependencies between all selected pairs in the participating difficult cliques and the restrictions imposed by already colored vertices of the graph. We show that this virtual coloring instance is indeed a d1LC instance and can be solved efficiently in CONGEST despite being a problem on a virtual graph. As a result, the clique obtains an uncolored node that is adjacent to both and , has slack due to two same-colored neighbors, and can serve as a toehold for .
Besides removing the need for randomization to solve the difficult cliques, our classification of difficult cliques also captures significantly more ACs than the definition of difficult cliques in [22]. The additional structure provided to the remaining ACs is exploited down the line in the most challenging part of the algorithm, dealing with the ordinary cliques in part c).
Solution for c): The most involved part by far is dealing with case c). We split the ordinary cliques into the small (of size less than ) and large. The small ones can be handled just like the sparse nodes, as one can show that their induced neighborhoods are relatively sparse. The main effort then is to manually create slack for the large ordinary cliques. For this exposition, it is best to imagine an ordinary clique to be a clique on nodes in which each node of the clique has exactly one external neighbor that is again a member of a large ordinary clique. See Figure 3 for an illustration.
In order to create a slack-toehold in each large AC , we compute a “vee-shaped” triple of nodes, with and , but and is also a non-neighbor of . Then, we set up a virtual list coloring instance with a node for each such pair with the objective of same-coloring the pairs . As we ensure that is uncolored, it serves as a slack-toehold for the AC. As many of the important ACs can be mutually adjacent, the main difficulty lies in finding non-overlapping triples for the ACs. We ensure this by first computing a suitable candidate set from which we then pick the third node of the triple. Finding the set can be modeled as an ‘easy’ LLL fitting the framework of [32]. Finding the node can also be modeled as a different type of ‘easy’ LLL. In essence, the first LLL is easy (in CONGEST) as its bad events only consist of simple bounds on the number of neighbors in . Next, we elaborate on our LLL for finding in slightly more detail; due to further technicalities of the existing LLL algorithms, which we omit from this technical overview, our actual solution differs slightly from the one presented here.
With a given set , we model the problem of selecting as an LLL as follows. Each AC sends a proposal (to serve as its node) to each outside neighbor inside with probability . The proposal is successful if no other AC proposes to that node. We show that with a constant probability, no other AC proposes to the same node and that this is independent for different nodes in . Since we ensure has many neighbors in , we obtain that the probability that none of ’s proposals are successful is bounded above by . The main benefit is that this LLL and also the LLL for finding the set are simple enough to be simulatable (in contrast to LLLs based on randomized slack generation for those ACs that can be derived from the proofs in [32]).
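The proposal step can be sketched as a single randomized round. This is a minimal illustration, not the paper's actual LLL formulation (which, as noted above, differs slightly). The function name `propose`, the input encoding (each AC mapped to its outside neighbors inside the candidate set), and the probability parameter `p` are assumptions.

```python
import random

def propose(ac_to_candidates, p, rng):
    """Each AC proposes to each of its candidate nodes independently with probability p.

    Returns a dict mapping each AC to the candidates for which its proposal
    succeeded, i.e., no other AC proposed to the same node.
    """
    proposals = {}  # candidate node -> list of ACs that proposed to it
    for ac, candidates in ac_to_candidates.items():
        for node in sorted(candidates):
            if rng.random() < p:
                proposals.setdefault(node, []).append(ac)
    successes = {}
    for node, acs in proposals.items():
        if len(acs) == 1:  # exactly one proposer, so the proposal is successful
            successes.setdefault(acs[0], []).append(node)
    return successes

# Hypothetical input: three ACs with partially overlapping candidate nodes.
candidates = {"C1": {"z1", "z2", "z3"}, "C2": {"z3", "z4"}, "C3": {"z4", "z5", "z6"}}
successes = propose(candidates, 0.5, random.Random(7))
bad_events = [ac for ac in candidates if ac not in successes]  # ACs without a successful proposal
print(successes, bad_events)
```

In this encoding, the bad event of an AC is that it has no successful proposal. Evaluating it only requires each candidate node to report back whether it received exactly one proposal, which suggests why such an event is easier to evaluate than one depending on distance-2 color choices.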
Once we have found , the structure of large ordinary ACs implies that we can deterministically find the other two nodes and of the triple. Additional complications arise in ensuring that the list coloring instance of the pairs is a d1LC instance, i.e., that the size of the joint available color palette of and exceeds the maximum degree in the virtual graph induced by the pairs. The last difficulty that appears is solving the d1LC instance, as the bandwidth between the nodes within a pair is very limited and existing d1LC algorithms cannot be run in a black-box manner.
Further related work.
Graph coloring is fundamental to distributed computing as an elegant way of breaking symmetry and avoiding contention, and was, in fact, the topic of the original paper introducing the model [37]. There is an abundance of efficient deterministic and randomized -coloring algorithms in and for various settings, e.g., [2, 35, 21, 13, 9, 43, 39, 29, 33, 30, 24]. The excellent monograph on distributed graph coloring by Barenboim and Elkin is still a great resource for older results [5].
There are significantly fewer results for coloring with fewer than colors. A algorithm is known for -coloring in graphs not containing too large cliques [6]. An -round -coloring algorithm in the model is known for trees [12], matching the lower bound [8] within a constant factor. Additionally, there are works coloring special graph classes such as coloring planar graphs with or colors in rounds with a deterministic algorithm [14, 41].
Outline.
In Section 2, we define the notion of slack and state required results from prior work on solving d1LC and computing an almost-clique decomposition (ACD). In Section 3, we present our Δ-coloring algorithm with essentially all proofs. The algorithm consists of five phases, and all phases except for Phase 1 (ACD computation) and Phase 2 are deterministic reductions to various d1LC instances. In Phase 2, we provide slack to sparse nodes and the nodes in ordinary cliques; this refers to part a) and part c) described in Section 1.2. For ease of presentation, the (involved) Phase 2 is presented in two consecutive sections, where we first reduce Phase 2 to solving four different subproblems in Section 4 and then solve each of these subproblems via an instance of the constructive Lovász Local Lemma in Section 5.
2 Preliminaries: d1LC, Slack, Almost-Clique Decomposition, Graytone
In the deg+1-list coloring (d1LC) problem, each node of a graph receives as input a list of allowed colors whose size exceeds its degree. The goal is to compute a proper vertex coloring in which each node outputs a color from its list. The problem can be solved with a simple centralized greedy algorithm, and it also admits efficient distributed algorithms.
Lemma 2.1 (List coloring [30, 34]).
There is a randomized algorithm to -list-color (d1LC) any graph in rounds, w.h.p. This reduces to rounds when the degrees and the size of the color space is .
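The centralized greedy algorithm mentioned before Lemma 2.1 can be sketched as follows. The function name and the adjacency-list encoding are assumptions. The sketch processes nodes in an arbitrary order and relies only on the fact that every list is larger than the node's degree, so a free color always remains.

```python
def greedy_d1lc(adj, lists):
    """Greedy deg+1-list coloring: each node takes a list color unused by its neighbors."""
    coloring = {}
    for v in adj:
        if len(lists[v]) <= len(adj[v]):
            raise ValueError(f"list of node {v!r} does not exceed its degree")
        used = {coloring[u] for u in adj[v] if u in coloring}
        # At most deg(v) colors are blocked, so at least one list color is free.
        coloring[v] = min(color for color in lists[v] if color not in used)
    return coloring

# Hypothetical instance: a triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
lists = {0: {1, 2, 3}, 1: {2, 3, 4}, 2: {1, 2, 3, 5}, 3: {1, 5}}
print(greedy_d1lc(adj, lists))
```

The same argument shows that any partial solution of a d1LC instance can be extended, which is what makes the problem local.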
The slack of a node (potentially in a subgraph) is defined as the difference between the size of its palette and the number of uncolored neighbors (in the subgraph).
Definition 2.2 (Slack).
Let be a node with color palette in a subgraph of . The slack of in is the difference , where is the number of uncolored neighbors of in .
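As a small worked example of Definition 2.2, the sketch below computes slack on a star. It assumes that the palette of a node consists of the colors among `num_colors` not used by its already colored neighbors; the function name and the encoding are illustrative.

```python
def slack(v, adj, coloring, num_colors, subgraph):
    """Slack of v: palette size minus the number of uncolored neighbors inside `subgraph`."""
    used = {coloring[u] for u in adj[v] if u in coloring}
    palette = set(range(num_colors)) - used  # assumption: palette = colors unused by neighbors
    uncolored = sum(1 for u in adj[v] if u in subgraph and u not in coloring)
    return len(palette) - uncolored

# Hypothetical instance: a star with center "v" and four leaves, colored with 4 colors.
adj = {"v": ["a", "b", "c", "d"], "a": ["v"], "b": ["v"], "c": ["v"], "d": ["v"]}
everything = set(adj)

print(slack("v", adj, {}, 4, everything))                # no neighbor colored yet
print(slack("v", adj, {"a": 0, "b": 1}, 4, everything))  # two neighbors, two different colors
print(slack("v", adj, {"a": 0, "b": 0}, 4, everything))  # two neighbors share a color
print(slack("v", adj, {}, 4, {"v", "c"}))                # neighbors outside the subgraph are stalled
```

The third call illustrates how same-coloring two neighbors produces a unit of slack, and the last call shows the temporary slack obtained when neighbors outside the current subgraph are colored later.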
We use the following helpful terminology.
Definition 2.3 (Graytone [22]).
Consider an arbitrary step of the algorithm. A node is gray if it has unit-slack or a neighbor that will be colored in a later step of the algorithm. A node is grayish if it is not gray but has a gray neighbor. A set of gray and grayish nodes is said to be graytone.
Any graytone set can be colored as two d1LC instances: first the grayish nodes and then the gray. We emphasize that the graytone property depends on the order in which nodes are processed. It always refers to a certain step of the algorithm in which we color the respective set. Throughout our algorithm we aim at making more and more nodes graytone.
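The two-stage order can be sketched with a small classifier. The function name and the inputs are assumptions: `unit_slack` is the set of nodes that already have unit-slack, and `colored_later` is the set of nodes scheduled for a later step.

```python
def graytone_order(nodes, adj, unit_slack, colored_later):
    """Split a graytone set into (grayish, gray); the grayish nodes are colored first."""
    gray = {v for v in nodes
            if v in unit_slack or any(u in colored_later for u in adj[v])}
    grayish = {v for v in nodes
               if v not in gray and any(u in gray for u in adj[v])}
    if gray | grayish != set(nodes):
        raise ValueError("the given set is not graytone")
    return grayish, gray

# Hypothetical instance: path t - a - b, where t is stalled and colored in a later step.
adj = {"t": {"a"}, "a": {"t", "b"}, "b": {"a"}}
first, second = graytone_order({"a", "b"}, adj, unit_slack=set(), colored_later={"t"})
print(first, second)
```

Each grayish node has slack in the first instance because its gray neighbor is still uncolored, and each gray node has slack in the second instance by definition.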
The following construction is central to our approach.
Lemma 2.4 (ACD computation [1, 22]).
For any graph , there is a partition (almost-clique decomposition (ACD)) of into sets and such that each node in is -sparse and for every ,
(i) ,
(ii) Each has at least neighbors in : ,
(iii) Each node has at most neighbors in : .
Further, there is an -round algorithm to compute a valid ACD, w.h.p.
We adapt a proof from [22] that, as stated, applies only to the case when is sufficiently large. Technically, the argument differs only in that we build on [34] instead of [29] in the first step of the argument, where we compute a decomposition with weaker properties. We have opted to rephrase it, given the different constants in the definitions of these works and in order to make it more self-contained.
Proof of Lemma 2.4.
We first use a -round algorithm of [HNT22] to compute a weaker form of ACD with parameter . (Footnote: While such a statement is used in the paper, it is not explicitly stated. Alternatively, we may use a slower implementation (Lemma 4.4 in [Flin et al. (FGHKN22), arXiv:2301.06457]) that runs in rounds for and still suffices for our main result. The slowdown in [FGHKN22] comes from working with sparsified graphs, while a version also runs in rounds.) Namely, it computes w.h.p. a partition where nodes in are -sparse and we have, for each :
(a) , and
(b) , for each .
What this construction does not satisfy is condition (iii).
We form a modified decomposition as follows. For each , let consist of along with the nodes in with at least neighbors in . Let . Observe that the decomposition is well-defined, as a node cannot have neighbors in more than one .
We first bound from above the number of nodes added to each part . Each node in has at most outside neighbors, so the number of edges with exactly one endpoint in is at most , using (a) to bound . Each node in is incident on at least such edges (by definition). Thus,
(1) |
Now, (iii) holds since a node outside has at most neighbors in (by the definition of ) and at most other neighbors in (by Eq. 1). Also, (ii) holds for nodes in by (b) and for nodes in by the definition of . For the lower bound in (i), , by (b). For the upper bound of (i), we have (by (a) and Eq. 1).
Finally, the claim about follows from the definition of , as . ∎
We say that nodes in are sparse and other nodes are dense. It is immediate from Lemma 2.4 that each dense node has external degree (or neighbors outside its AC) at most and at most non-neighbors in its AC. Also, any pair of nodes in have at least common neighbors in .
Notation. For a graph and two nodes , let denote the length of a shortest (unweighted) path between and in . For a set we denote . denotes the set of neighbors of a node .
3 Δ-Coloring in CONGEST
In this section, we prove Theorem 1.1.
The extreme cases of very large and very small can be solved in the claimed runtime with prior work [22, 40], see the proof of Theorem 1.1 in Section 3.3. Here, we present an algorithm for the most challenging regime where .
In the extreme case that , the -coloring algorithm from [22] even runs in rounds. A lower bound of rounds in the model for the -coloring problem [8] rules out a algorithm for small . Hence, in this section, we aim for an algorithm using rounds. In fact, we reduce the -coloring problem to a few list coloring instances and a few LLL instances, each of which we solve in rounds.
3.1 Fine-Grained ACD Partition
The following definitions of types of almost-cliques are crucial for all results of the paper. The reader is hereby warned to read them slowly!
Definition 3.1 (Types of almost-cliques).
For an AC , let . An AC is easy if it contains a non-edge or a node of degree less than . A node is an intrusive neighbor of a non-easy if has at least neighbors in . A non-easy AC is difficult if it has an intrusive neighbor. Each difficult AC arbitrarily selects one of its intrusive neighbors as its special node . An AC is nice if it is easy or if it is both non-difficult and contains a special node (necessarily for another AC). An AC is ordinary if it is neither nice nor difficult.
Note that all ACs except the easy are proper cliques and all nodes in such a clique have external degree . We say that a node is ordinary (difficult, nice) if it belongs to an ordinary (difficult, nice) AC, respectively. The difficult ACs are divided into levels.
Definition 3.2 (Levels of difficult ACs).
The maximum level contains all difficult ACs whose special node is not contained in a difficult AC. A difficult AC that is not at the maximum level has level .
Observe that for all difficult ACs.
Definition 3.3 (Node classification).
The nodes are partitioned into the following sets:
1. : the set of special nodes that are not in difficult ACs,
2. : nodes in difficult ACs of level , (might include special nodes),
3. : nodes in nice ACs, excluding those in ,
4. : nodes in ordinary ACs, and
5. : nodes in , excluding those in .
Our classification is built on [22] but is subtly different and more fine-grained. We are driven by a need to limit the reach of probabilistic arguments, since we are in the challenging sub-logarithmic degree range. Thus, a strictly smaller set of dense nodes (the ordinary) needs probabilistic slack in our formulation. On the other hand, the easy, difficult, and nice definitions are more inclusive here. The difficult ones are here divided into a super-constant number of levels, as opposed to only two types in [22].
The underlying idea is to ensure that every node gets at least one unit of slack, ensuring that it can be colored as part of a d1LC instance. Easy nodes have such slack from the start; difficult ones get it from their special nodes (special nodes are used in several different ways to provide slack); sparse and ordinary nodes get it from probabilistic slack generation; and non-easy nice ones get it from same-coloring a non-edge it contains. The most challenging part of the low-degree regime is the probabilistic part. That has guided our definition, resulting in the ordinary ACs being defined as restrictively as possible and, in fact, much more restrictive than the ordinary ACs in [22].
3.2 Algorithm for Δ-coloring
Our Δ-coloring algorithm consists of the following five phases.
The remainder of the paper describes these phases in detail. Only Phases 1 and 2 are randomized. Phase 2 is also the most involved part of our algorithm. For ease of presentation, we defer its details when is at most logarithmic to Sections 4 and 5. In this section, we present Phase 2 in the case of for a sufficiently large constant , where Phase 2 does not require any LLL and which is sufficient to understand how Phase 2 interacts with the remaining phases. The remaining phases are identical in both cases.
3.2.1 Phase 1: Partitioning the Nodes
We first apply Lemma 2.4 to compute an ACD for and break the graph into nice ACs, difficult ACs, ordinary ACs, and the remaining nodes in according to Definition 3.3.
3.2.2 Phase 2: Sparse and Ordinary Nodes ()
In this subsection, we prove the following lemma.
Lemma 3.4.
There exists a -round algorithm that w.h.p. colors the sparse nodes and nodes in ordinary cliques if for a sufficiently large constant .
Lemma 3.4 essentially follows from the proof of Lemma 3.5 in [22, arxiv version]. However, as we have changed the definition of ordinary cliques, we spell out the required details.
Slack generation is based on trying a random color for a subset of nodes. Sample a set of nodes and a random color for each of the sampled nodes. Nodes keep the random color if none of their neighbors choose the same color. See Algorithm 3 for pseudocode. If there are enough non-edges in a node’s neighborhood, then it probabilistically gets significant slack.
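The single-round color trial described here can be sketched as follows. This is an illustration under stated assumptions and not a transcription of Algorithm 3: the function signature, the activation probability parameter, and the color range `range(num_colors)` are hypothetical.

```python
import random

def slack_generation(adj, nodes, num_colors, activation_prob, rng):
    """One round of random color trials on `nodes`; returns the colors that are kept."""
    # Each participating node is activated independently and picks a candidate color.
    candidate = {v: rng.randrange(num_colors)
                 for v in sorted(nodes) if rng.random() < activation_prob}
    # A node keeps its candidate only if no neighbor picked the same candidate.
    return {v: color for v, color in candidate.items()
            if all(candidate.get(u) != color for u in adj[v])}

# Hypothetical instance: a cycle on 8 nodes, 3 colors, activation probability 1/2.
cycle = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
kept = slack_generation(cycle, cycle.keys(), 3, 0.5, random.Random(3))
print(kept)
```

Whether a node ends up with slack depends on the candidates chosen by pairs of its neighbors, which is why the corresponding bad event involves variables at distance 2, as discussed in Section 1.1.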
We also require the following lemma from [22].
Lemma 3.5 ([22]).
Let be a non-easy AC, be a subset of nodes containing , and be an arbitrary matching between and . Then, after SlackGeneration is run on , contains uncolored nodes with unit-slack in , with probability .
There exists a large matching satisfying the hypothesis of Lemma 3.5, as the next lemma shows.
Lemma 3.6.
For each ordinary AC , there exists a matching between and of size .
Proof.
We use the following combinatorial result.
Claim 3.7.
Let be a bipartite graph where nodes in have degree at least and nodes in have degree at most . There exists a matching of size in .
Proof.
Let be a maximum matching in and suppose that more than half the nodes in are unmatched. Let be the set of nodes reachable from the unmatched nodes . Since has no augmenting path, contains no unmatched node of . All of the edges incident on have their other endpoint in . By the degree bound on , there are fewer than such edges. Thus, . Every node in is matched to a node in , while all unmatched nodes in are in . Thus, the number of unmatched nodes in is at most . This is a contradiction, and hence, at least half the nodes in are matched.
As is not easy, all its nodes have external degree , while nodes in are by assumption not intrusive neighbors of , so they have at most neighbors in . Claim 3.7 then implies that there exists a matching between and of size .
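The matching argument can be checked experimentally with a standard augmenting-path matching routine. The degree thresholds in the sketch are assumptions, since the exact constants of Claim 3.7 are not legible here: it assumes that every node on side A has degree at least d, every node on side B has degree at most 2d, and that the claim then yields a matching covering at least half of A. The toy instance is tight for this reading.

```python
def max_bipartite_matching(adj_a):
    """Maximum matching via augmenting paths; adj_a maps each A-node to its B-neighbors."""
    match_b = {}  # B-node -> A-node it is currently matched to

    def try_augment(a, seen):
        for b in adj_a[a]:
            if b in seen:
                continue
            seen.add(b)
            # Use b if it is free, or if its current partner can be rematched elsewhere.
            if b not in match_b or try_augment(match_b[b], seen):
                match_b[b] = a
                return True
        return False

    size = sum(try_augment(a, set()) for a in adj_a)
    return size, match_b

# Hypothetical instance with d = 1: A-nodes have degree 1, B-nodes have degree 2.
adj_a = {"a1": ["b1"], "a2": ["b1"], "a3": ["b2"], "a4": ["b2"]}
size, match_b = max_bipartite_matching(adj_a)
print(size, match_b)
```

In this example exactly half of the A-nodes are matched, so under the assumed thresholds the bound cannot be improved in general.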
The properties of Phase 2 are summarized in the following lemma.
Lemma 3.8.
If for a sufficiently large constant , the following properties hold w.h.p. after Step 1 of Algorithm 2:
(†) Each sparse node has unit-slack in ,
(††) Each ordinary AC has an uncolored unit-slack node in .
Proof.
We run SlackGeneration on the node set . Nodes with neighbors outside have slack while the rest of the graph is stalled. We focus on the remaining nodes. Each sparse node gets the respective slack with probability at least [19, Lemma 3.1], implying (†). By Lemma 3.6, there is a matching between and of size . Thus, (††) holds with probability at least , by Lemma 3.5.
Both probabilities become w.h.p. guarantees if for a sufficiently large constant . For for a sufficiently large constant we obtain an LLL. ∎
Proof of Lemma 3.4.
By Lemma 3.8 w.h.p. all sparse nodes become gray as they have unit slack. Also, the unit-slack node in each ordinary AC becomes gray and all other nodes of the AC become grayish as ordinary ACs induce cliques. This is sufficient to color all nodes with d1LC instances. ∎
Forward pointer: The main difficulty of Phase 2 for smaller values of is to mimic the properties of Lemma 3.8. Sections 4 and 5 are devoted to ensuring these properties via several LLLs and d1LC instances that can be solved in a bandwidth-efficient manner.
3.2.3 Phase 3: Nice ACs
We give a simpler treatment than [22]. We want a toehold in each nice AC: a node with permanent or temporary slack. With a toehold, the rest is easy. Namely, ACs have all nodes of internal degree at least , of which none are colored in previous phases. The neighbors of a toehold are gray, and there are at least of them by Lemma 2.4, all uncolored. The remaining nodes in the AC are then grayish, so the AC is graytone.
Nice ACs come in three types, depending on whether they contain a special node, a non-edge, or a degree-below-Δ node. The first and third types immediately give us a toehold. It remains then to consider nice ACs with a non-edge but with no special node, which we call hollow.
For a hollow AC , we identify an arbitrary non-edge and call it the pair for . We color the pairs for hollow ACs as a d1LC instance. The two nodes in a pair have at least common neighbors within and any of them can function as a toehold. It remains to argue that we can find a valid coloring of the pairs efficiently.
Lemma 3.9.
The pairs of hollow ACs can be colored in the model in rounds.
Proof.
As the nodes of a hollow were uncolored, the only nodes that can conflict with the coloring of the pair are the at most external neighbors. The colors we have to work with significantly exceed that. Thus, the pairs are -list colorable.
Both nodes of the pair have at least neighbors in , so they have at least common neighbors in . They provide the bandwidth to transmit to one node all the colors adjacent to the other node. Also, all messages to and from vis-a-vis its external neighbors can be forwarded in two rounds. Hence, we can simulate any coloring algorithm on the pairs with -factor slowdown; in particular, we can simulate the algorithm from Lemma 2.1. ∎
3.2.4 Phase 4: Difficult ACs in a Non-Maximum Level
By Definition 3.2, the special node of any difficult AC at a level other than is contained in another difficult AC . The next lemma shows that the level of must be strictly larger than the level of , which allows us to color fast while remains uncolored.
Claim 3.10.
For an AC with , let be the difficult AC that contains the special node . Then we have .
Proof.
The special node has external degree of at least as it is connected to at least nodes of that do not lie within . Hence, we obtain that the external degree in AC is at least , so . ∎
We color all ACs of a level in parallel, in increasing order of levels. Due to the previous claim, the special node of an AC is contained in a difficult clique in a larger level or not contained in a difficult clique at all. Hence, the special node is uncolored when the clique is processed. So, when processing some level , we color all nodes in ACs of that level, but we do not color their respective special nodes. Thus, the respective special node provides a toehold for the respective clique.
3.2.5 Phase 5: Difficult ACs in the Maximum Level
The maximum level is processed last and differently from the other levels. By definition, the special node of an AC in level is not contained in a difficult AC. Also, all nodes in and their special nodes are still uncolored at the beginning of this phase.
The algorithm has four steps: (1) Form pairs of selected non-adjacent nodes, (2) Color the nodes in each pair consistently, (3) Graytone color the remaining nodes of the AC, and (4) Color the special nodes . We explain each step in detail.
First, we form the following pairs. For each special node that is special for only one AC at level : Form a type-1 pair with a non-neighbor of in . For each special node that is special for more than one AC at level , form a type-2 pair , where and are arbitrary non-adjacent nodes in two of the ACs for which is special. Let be the set of the latter special nodes.
Claim 3.11.
The pairs can be properly formed.
Proof.
Type-1: An (uncolored) non-neighbor of exists as can have at most neighbors in by Lemma 2.4 (4), but the AC has at least vertices.
Type-2: Let and be two ACs at level for which is special, where . By definition, has at least () neighbors in (), respectively. Pick to be any neighbor of in . Node has at most neighbors in . Thus, there are at least nodes in that are neighbors of and non-neighbors of , and we can pick any such node as . ∎
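The type-2 argument above can be illustrated with a minimal sequential sketch. It assumes the graph is given as a dictionary of adjacency sets; the function name `form_type2_pair` and its parameters are hypothetical and not part of the paper. The proof picks the first node as any neighbor of the special node and argues that a compatible second node always exists; the sketch simply searches, so it also works on inputs where the counting bound does not hold.

```python
from typing import Dict, Hashable, Optional, Set, Tuple

Node = Hashable
Graph = Dict[Node, Set[Node]]  # undirected graph as adjacency sets


def form_type2_pair(
    adj: Graph, s: Node, c1: Set[Node], c2: Set[Node]
) -> Optional[Tuple[Node, Node]]:
    """Return (u, w) with u in c1 and w in c2, both adjacent to the special
    node s and not adjacent to each other, or None if no such pair exists."""
    # Hypothetical tie-breaking: iterate in a fixed order for reproducibility.
    for u in sorted(adj[s] & c1, key=repr):
        # Neighbors of s inside c2 that are non-neighbors of u.
        candidates = (adj[s] & c2) - adj[u]
        if candidates:
            return u, min(candidates, key=repr)
    return None
```

In the setting of Claim 3.11 the first candidate already succeeds, so the outer loop only serves as a safeguard.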
Lemma 3.12.
Coloring the pairs is a -list coloring instance that can be solved in rounds in , w.h.p.
Proof.
Type-1 pair , , : We say that a node conflicts with the pair if the node is already colored or is contained in an adjacent pair of the same phase. As does not contain a special node, is the only node of participating in the phase and all other nodes of are still uncolored. The node can only be adjacent to conflicting nodes as it has external degree at most . As has at least neighbors in , it can conflict with at most nodes. Thus, the pair conflicts with at most nodes, which is less than , the number of colors initially available. Thus, the problem of coloring such pairs is a -list coloring problem.
Type-2 pair : Each such pair is adjacent to at most nodes in other ACs. Further, all nodes in the ACs and are still uncolored, so both nodes have at least colors in their palette, and each pair is adjacent to at most other pairs or already colored neighbors, that is, the palette exceeds the degree.
Implementation. A type-1 pair has at least common neighbors (the special node has neighbors inside the clique by its definition that are all connected to ), which suffices to communicate the colors and all messages of external neighbors of to ( has at most external neighbors). Hence, the coloring can be achieved in .
Let be the common special node of a type-2 pair and let and be the respective cliques. For the node has at most outside neighbors and has neighbors in , denote these by . We simulate the pair by . The node can forward all initial colors of outside neighbors as well as all messages from them to by relaying them through . ∎
After coloring the pairs, each difficult AC has a node with unit-slack in , either because the clique contains an uncolored node with two neighbors appearing in a consistently colored type-1 pair , or because it contains an uncolored node with a neighbor in . In the former case, the uncolored node exists because has at least one neighbor in that is also a neighbor of . In the latter case, the special node with type-2 pair has by definition further neighbors besides and in each clique that are all uncolored.
Thus, we color all nodes in difficult cliques via the graytone property. At the end, we color the nodes in , which have unit-slack as they are adjacent to a type- pair.
3.3 Proof of Theorem 1.1
Proof of Theorem 1.1.
There are five cases, depending on the relation of and . Generally, we use Lemma 2.1 to solve d1LC instances in rounds. Whenever the d1LC instances require additional arguments to be solved in the respective time, e.g., because they are defined on a virtual graph, we argue about their runtime where they are introduced.
-
•
If , we use the algorithm from [22] to -color the graph.
-
•
For for a sufficiently large constant , the result follows by executing Algorithm 1 with the arguments of this section. Phases – only require rounds and a constant number of d1LC instances. In Phase 4, we iterate through the levels and solve a constant number of d1LC instances for each level. Phase 5 can be executed in time by Lemma 3.12.
-
•
When , we use Algorithm 1 from this section and replace Phase 2 with Algorithm 4 (presented in Section 4) whose correctness and runtime we prove in Sections 4 and 5.
- •
-
•
If , that is, for constant , there is an existing algorithm from [40].
In all cases, the algorithm runs in rounds. ∎
4 Phase 2 (): Sparse Nodes and Ordinary Cliques
In this section, we deal with Phase 2 for the most challenging regime of . The following lemma follows from all proofs in this section, together with Lemmas 4.3, 4.5 and 4.6 all proven in Section 5.
Lemma 4.1 (Phase 2).
There exists a -round algorithm that w.h.p. colors the sparse nodes and nodes in ordinary cliques if .
We first give high-level ideas of our method. We divide the ordinary cliques into the small, of size at most , and the large. Nodes in small ordinary cliques have significant sparsity (i.e., non-edges in their induced neighborhood), which means that the one-round procedure of trying a random color has a good probability of successfully generating slack. The natural LLL formulation of that step is therefore well-behaved enough that it can be solved fast in with a few additional tweaks, see Section 5.2. Large nodes need a different approach.
For each large AC, we produce unit slack for a single node. See Figure 3 for an illustration of the process we will describe. We identify for each such AC a triplet of nodes with the objective to color and with the same color, while remains uncolored. This way, receives unit slack, which gives us a toehold to color the whole AC.
Computing such triplets is non-trivial. We do so by breaking it into three steps, each solvable by a different LLL formulation. In brief, we first compute a set of candidate -nodes; next partition into two sets; and then select the actual -nodes to be used from these two sets. The split of into two sets is required to make the process of finally finding the -nodes fit the LLL solver from Theorem 5.3. The properties of the set imply that it is then much easier to identify compatible - and -nodes, and once we find such triplets, we set up a virtual coloring instance for same-coloring - and -nodes in each triple. We show that this instance is d1LC and can be solved with low bandwidth despite being defined on a virtual graph. This provides a slack-toehold to the -node of each triple and the coloring can be extended via d1LC instances to the whole instance.
Algorithm.
The first step of the algorithm is to compute a large matching between each ordinary clique and in parallel. We then classify the ordinary cliques as follows. Fix the parameter throughout this section.
Definition 4.2 (Small, Large, Unimportant and Important Ordinary cliques.).
An ordinary AC is large if it contains more than nodes, and small otherwise. A large AC is important if , and unimportant otherwise.
We say that a node is small/large/important/unimportant if it belongs to an AC of the corresponding type. Let , , , and be the set of important, unimportant, large, and small nodes, respectively.
Next, we present our full solution. The algorithm has the following steps, which are explained in detail below.
Step 0: Classifying ACs and computing matchings.
We compute a matching for each ordinary clique between the vertices in and the ones in . We use a 2.5-approximate algorithm of [23] running in rounds, obtaining that , using Lemma 3.6.
We view the edges of as being directed arcs with a head in and tail in . Each AC can determine its size and the size of in rounds and hence the classification of Definition 4.2 can be computed in rounds.
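To make the object computed in Step 0 concrete, the following sketch builds a matching between a clique and a set of outside nodes. It is an assumption-laden stand-in: it uses a sequential greedy maximal matching, not the distributed 2.5-approximate algorithm of [23], and the names `greedy_matching`, `clique`, and `outside` are hypothetical.

```python
from typing import Dict, Hashable, List, Set, Tuple

Node = Hashable
Graph = Dict[Node, Set[Node]]  # undirected graph as adjacency sets


def greedy_matching(
    adj: Graph, clique: Set[Node], outside: Set[Node]
) -> List[Tuple[Node, Node]]:
    """Greedy maximal matching using only edges with one endpoint in
    `clique` and the other endpoint in `outside`."""
    matched: Set[Node] = set()
    matching: List[Tuple[Node, Node]] = []
    for x in sorted(clique, key=repr):
        for y in sorted(adj[x] & outside, key=repr):
            if y not in matched:
                # x is visited once, so it is unmatched at this point.
                matching.append((x, y))
                matched.update((x, y))
                break
    return matching
```

A maximal matching is within a factor 2 of a maximum one, so this sequential stand-in gives a guarantee comparable to the 2.5-approximation used in the text, but it says nothing about round complexity.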
Step 1: Slack for sparse and small nodes.
In this step, we create slack for sparse nodes and all nodes in . The key property of small nodes is that they are relatively sparse (with many non-edges in their neighborhoods), so randomly trying colors is likely to produce slack. That leads to an LLL formulation that we can make simulatable and can therefore implement in .
The properties are summarized by the following lemma. Besides providing slack to all sparse nodes and the nodes in small ordinary ACs, it also guarantees that each neighborhood (and hence also each AC) does not have too many nodes colored and that the matching of each AC does not get too many nodes colored. The proof is in Section 5.2.
Lemma 4.3.
Assume that we are given a matching of size at least between and for each ordinary AC . There is a -round (LLL-based) algorithm that w.h.p. colors a subset and ensures that:
-
1.
Each uncolored node in has unit-slack in .
-
2.
In each of the following subsets, at most nodes are colored: for each and for each AC .
Step 2: Compute triple candidate set via LLL.
Let .
The goal of this step is to compute two disjoint sets of uncolored nodes such that each important AC has sufficiently many matching edges satisfying the following definition of usefulness.
Definition 4.4 (useful edge).
Given a subset and important AC , a matched arc is useful for if and . Refer to as the nodes and as the nodes. An edge is if both endpoints are .
An arc cannot be useful for the AC containing ; only the one containing .
Formally, Step 2 provides the following lemma that we prove in Section 5.3. For an AC and set , let denote the arcs of with one endpoint in (and the other in ).
Lemma 4.5.
Let . There is a -round (LLL-based) algorithm computing disjoint subsets satisfying the following properties, w.h.p.:
-
1.
, for and for each important AC , and
-
2.
, for all .
Step 3: Forming triples via LLL.
The goal of this step is to compute a triple of nodes that satisfy the conditions of the next lemma. These triple nodes are distinct for different ACs.
Lemma 4.6.
Given sets with the properties as in Lemma 4.5, there is a -round (LLL-based) algorithm that computes for each large important AC a triple of uncolored nodes such that w.h.p.:
-
1.
and ,
-
2.
, , ( and are non-adjacent; is adjacent to both and ) and
-
3.
the graph induced by has maximum degree .
We model the problem of selecting for each important AC as a disjoint variable set LLL. The proof of the lemma is given in Section 5.4.
Step 4: Same-coloring pairs.
Given a triple (, we will create a toehold for the AC at by coloring its non-adjacent neighbors and with the same color.
Let ( for pair) be the virtual graph consisting of one vertex for each pair and an edge between two pairs and if there is any edge in between and . The list of available colors consists of all colors that are not used by the already colored neighbors in of and .
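The construction of the virtual pair graph and of the color lists can be sketched as follows. This is a centralized illustration under stated assumptions: the graph is a dictionary of adjacency sets, `pairs` maps a (hypothetical) AC identifier to its pair, `color` holds the colors of already colored nodes, and the palette is taken to be `range(num_colors)`.

```python
from typing import Dict, Hashable, Set, Tuple

Node = Hashable
Graph = Dict[Node, Set[Node]]


def build_pair_graph(
    adj: Graph, pairs: Dict[Hashable, Tuple[Node, Node]]
) -> Dict[Hashable, Set[Hashable]]:
    """One virtual vertex per pair; two pairs are adjacent if some edge of
    the original graph joins a node of one pair to a node of the other."""
    ids = list(pairs)
    pair_adj: Dict[Hashable, Set[Hashable]] = {c: set() for c in ids}
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if any(y in adj[x] for x in pairs[a] for y in pairs[b]):
                pair_adj[a].add(b)
                pair_adj[b].add(a)
    return pair_adj


def pair_palette(
    adj: Graph, pair: Tuple[Node, Node], color: Dict[Node, int], num_colors: int
) -> Set[int]:
    """Colors not used by any already colored neighbor of either pair node."""
    used = {color[x] for v in pair for x in adj[v] if x in color}
    return set(range(num_colors)) - used
```

The instance is a list coloring instance on the virtual graph exactly when each palette returned by `pair_palette` is larger than the corresponding degree in `pair_adj`.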
Lemma 4.7.
The maximum degree of is upper bounded by .
Proof.
By Lemma 4.6, each node has at most neighbors in . Define the set . As contains at most one node per AC, the number of neighbors that a node in can have in is upper bounded by its external degree plus , which is upper bounded by . Thus, the maximum degree of the virtual graph is at most for sufficiently large . ∎
Lemma 4.8.
Coloring – i.e., same-coloring the pairs – is a -list coloring instance.
Proof.
By Lemma 4.7 we obtain . As we colored at most vertices in each neighborhood in Step 1, the list of available colors of each pair has at least colors available in their joint list. Hence, we obtain a -list coloring instance. ∎
implementation. Our algorithm is based on the -list coloring algorithm from [25, 7]. Before we show how to color the nodes in , we need to define a slow (it takes rounds) randomized algorithm. The algorithm is used in our analysis and it works as follows. In each iteration, each uncolored pair executes the following procedure, after which the pair either tries a color or does not try any color (see also Algorithm 5 for pseudocode). Throughout the algorithm, nodes and maintain lists and consisting of all colors not used by their respective neighbors in . Then, in one iteration node selects a color u.a.r. from its list of available colors , and sends it to the other endpoint through node . The other endpoint checks whether ; if so, both nodes agree on trying color , and the color is sent to their neighbors. If no incident pair tries the same color, the pair gets permanently colored with the color. Lastly, both nodes individually update their lists by removing the colors of adjacent vertices that got colored. There is no explicit coordination between the two vertices in maintaining a joint list of available colors.
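One iteration of the procedure just described can be simulated centrally as in the sketch below. It is not Algorithm 5 itself: the relay through the middle node and all message passing are abstracted away, and the function name, the data layout (`pairs`, `lists`, `colored`), and the use of a seeded `random.Random` are assumptions made for illustration.

```python
import random
from typing import Dict, Hashable, Set, Tuple

Node = Hashable


def pair_color_trial_iteration(
    adj: Dict[Node, Set[Node]],
    pairs: Dict[Hashable, Tuple[Node, Node]],
    pair_adj: Dict[Hashable, Set[Hashable]],
    lists: Dict[Node, Set[int]],
    colored: Dict[Hashable, int],
    rng: random.Random,
) -> Dict[Hashable, int]:
    """Run one iteration; returns the pairs colored in this iteration."""
    tried: Dict[Hashable, int] = {}
    for pid, (u, w) in pairs.items():
        if pid in colored or not lists[u]:
            continue
        c = rng.choice(sorted(lists[u]))  # first endpoint picks u.a.r.
        if c in lists[w]:                 # second endpoint checks its own list
            tried[pid] = c
    # A pair keeps its color if no adjacent pair tried the same color.
    newly = {
        pid: c for pid, c in tried.items()
        if all(tried.get(q) != c for q in pair_adj[pid])
    }
    colored.update(newly)
    # Each list owner removes the colors of neighbors that just got colored.
    for pid, c in newly.items():
        for x in pairs[pid]:
            for y in adj[x]:
                if y in lists:
                    lists[y].discard(c)
    return newly
```

Each endpoint keeps its own list, mirroring the remark that there is no jointly maintained list; a trial only goes ahead when the sampled color happens to lie in both lists.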
The next lemma shows that each pair gets colored with constant probability.
Lemma 4.9.
Consider an arbitrary iteration of Algorithm 5 and an arbitrary pair for a hiding AC that is uncolored at the start of the iteration. Then, we have
(2) |
The bound on the probability holds regardless of the outcome of previous iterations.
Proof.
Note that throughout the execution of Algorithm 5 the respective lists of nodes and are always of size at least as and , by Lemma 4.7. (Footnote 4: The constants in this proof are not chosen optimally in order to improve readability.) Note that both nodes keep their individual list of available colors in which they only remove the colors of immediate neighbors in from the list of available colors. Thus, at all times we have . Let be the set of colors tried by one of the pairs incident to in the current iteration. We obtain . As these colors are at least half of ’s palette, the probability that the pair gets colored is at least . ∎
Lemma 4.10.
There is a randomized -round algorithm that w.h.p. colors the pairs of .
Proof.
Consider the well-understood color trial algorithm in which nodes repeatedly try a color from their list of available colors, keep their color permanently if no neighbor tries the same color, and remove colors of permanently colored neighbors from their list of available colors. It is known that this algorithm colors each node with a constant probability in each iteration [7, 36]. Thus, it requires rounds to color all vertices of a graph. The shattering-based algorithm from [25] for d1LC runs in rounds. It requires three subroutines: a) A color trial algorithm like the one from [7, 36], b) a network decomposition algorithm that can run on small subgraphs (the ones in [43, 40, 38] do the job), and c) the possibility to run instances of the color trial algorithm in parallel. In our setting we want to solve the same problem, but on the virtual graph while the communication network is still the original graph . The subroutine for part b) can be taken from prior work as the same issue is dealt with formally in [40, 38, 32]. We refer to these works for the details and also the definition of a network decomposition. Let us sketch the main ingredient for the informed reader. Instead of computing a network decomposition of small subgraphs of , the subgraphs are first projected to , and a network decomposition of is computed afterwards. This only requires an increased distance between clusters such that the preimage of the decomposition induces a proper network decomposition of .
For ingredients a) and c), we observe that Ghaffari’s algorithm only requires the following properties for the color trial algorithm: i) one iteration can be executed in constant time and with bandwidth, allowing us to execute instances in parallel in the model, and ii) each node gets colored with a constant probability in each iteration. Thus, we can replace the color trial algorithm with the color trial algorithm for given in Algorithm 5. We have already argued that it can be implemented with bandwidth, showing i), and Lemma 4.9 provides its constant success probability for ii). ∎
Step 5: Completing the coloring.
To finish the coloring, we first color the unimportant nodes and then the important, small, and sparse nodes.
Lemma 4.11.
Unimportant nodes are graytone as long as the other ordinary nodes (small, sparse, important) are inactive.
Proof.
The only steps so far in which we colored vertices are Steps 1 and 4. In Step 1 we color at most vertices per AC and per matching of each ordinary AC . In Step 4 we only color (a subset of) the vertices in and one vertex per important AC (the vertex for AC ). As , we color at most vertices in each unimportant AC.
Fix some unimportant AC . Recall that the algorithm of [23] finds a 2.5-approximate matching, which by Lemma 3.6 implies that . As an unimportant AC has fewer than nodes in , we obtain that contains at least nodes that are not contained in . By Lemma 4.3, at most of these get colored in Step 1; denote the uncolored nodes of these by and let . By the earlier argument, at most nodes of are already colored, that is, there exists some that is still uncolored and has an uncolored neighbor . As is stalled to be colored later, is gray and other nodes of the AC are grayish. ∎
Lemma 4.12.
Small, sparse, and important nodes are graytone.
Proof.
By Lemma 4.3, each small or sparse node has slack in and is therefore gray (and stays gray until colored).
For an important AC with triple , the node is gray as and are same-colored. Hence, the remaining nodes of are either already colored or graytone as they are adjacent to . ∎
5 Solving Subproblems of Phase 2 via LLL
We show how the probabilistic subproblems of Section 4 can be solved via a fast LLL algorithm. We show for all four problems that they can be captured with the framework of [32]. We start by reviewing the framework of [32] and then solve each of the subproblems in respective subsections.
5.1 Framework for LLL in CONGEST
In this section, we present model LLL solvers from [32]. The definitions, theorems, and selected textual excerpts in this section have been sourced from [32].
Constructive Lovász Local Lemma (LLL).
An instance of the distributed Lovász local lemma (LLL) is given by a set of independent random variables and a family of "bad" events over these variables. Let denote the set of variables involving the event and note that is a binary function of . The dependency graph is a graph with a vertex for each event and an edge whenever . The dependency degree is the maximum degree of . We omit the subscript when the considered LLL is unambiguous. The Lovász Local Lemma [18] states that holds if , or in other words, there exists an assignment to the variables that avoids all bad events.
In the constructive Lovász local lemma one aims to compute such a feasible assignment, avoiding all bad events. This is often under much stronger conditions on the relation of and . The relation of and is referred to as the LLL criterion.
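As a small illustration of these definitions, the sketch below builds the dependency graph from the variable sets of the events and computes the dependency degree. The final check assumes the classical symmetric form of the criterion, e·p·(d+1) ≤ 1, where p bounds the probability of every bad event; all function and parameter names are hypothetical.

```python
import math
from typing import Dict, Hashable, Set

Event = Hashable


def dependency_graph(vbl: Dict[Event, Set[Hashable]]) -> Dict[Event, Set[Event]]:
    """Two events are adjacent iff their variable sets intersect."""
    events = list(vbl)
    dep: Dict[Event, Set[Event]] = {e: set() for e in events}
    for i, a in enumerate(events):
        for b in events[i + 1:]:
            if vbl[a] & vbl[b]:
                dep[a].add(b)
                dep[b].add(a)
    return dep


def dependency_degree(dep: Dict[Event, Set[Event]]) -> int:
    """Maximum degree of the dependency graph (0 for an empty instance)."""
    return max((len(nbrs) for nbrs in dep.values()), default=0)


def symmetric_lll_criterion(p: float, d: int) -> bool:
    """Assumed classical symmetric criterion: e * p * (d + 1) <= 1."""
    return math.e * p * (d + 1) <= 1
```

The constructive and distributed solvers discussed in this section typically need stronger criteria than this existential one, so passing the check says nothing about their applicability.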
Constructive Distributed Lovász Local Lemma
In the distributed setting, the LLL instance is mapped to a communication network . We are given a function that assigns each variable and each bad event to a node of the communication network. We assume that for each variable , the node knows the distribution of , including the range of the variable. We also say that node simulates the variable/event . For a vertex , we call the load of vertex . The (maximum) vertex load of an LLL instance is .
In the constructive distributed LLL, we execute a or algorithm on to compute a feasible assignment . Afterwards, for each variable , node has to output .
In general, the graph and the dependency graph do not have to coincide. However, distances between events in and the corresponding nodes in are in close relation, as formalized by the next definition.
Definition 5.1 (Locality).
A triple has locality if for all events of and variables .
(Partial) Assignments. We use the value for variables that have not been set yet. A partial assignment of a set of variables is a function with domain satisfying for all . A partial assignment agrees with another (partial) assignment if for all , i.e., if all proper values assigned by match those of . A retraction of a partial assignment is a partial assignment that agrees with . For an event and a partial assignment , we use the notation to mean that the probability is over assignments with which agrees; in other words, the randomness is only over the variables in .
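The notions of agreement and retraction can be made concrete with a short sketch. It assumes that partial assignments are dictionaries and that Python's `None` plays the role of the "not yet set" value, so `None` must not be a legitimate value of any variable; the helper names are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Optional

UNSET = None  # stands in for the "not yet set" value
Assignment = Dict[Hashable, Optional[int]]


def agrees(phi: Assignment, psi: Assignment) -> bool:
    """phi agrees with psi if every proper value assigned by phi
    matches the value assigned by psi."""
    return all(
        val is UNSET or psi.get(var, UNSET) == val
        for var, val in phi.items()
    )


def is_retraction(psi: Assignment, phi: Assignment) -> bool:
    """psi is a retraction of phi if psi agrees with phi."""
    return agrees(psi, phi)


def retract(phi: Assignment, variables: Iterable[Hashable]) -> Assignment:
    """Return a copy of phi in which the given variables are unset."""
    drop = set(variables)
    return {var: (UNSET if var in drop else val) for var, val in phi.items()}
```

For example, unsetting one variable of an assignment yields a retraction of it, while the original assignment does not agree with the retracted one.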
Simulatable Distributed Lovász Local Lemma (CONGEST)
Definition 5.2 (Simulatability).
We say an LLL is simulatable in if each of the following can be done in rounds:
-
1.
Test: Test in parallel which events of hold (without preprocessing).
-
2.
Min-aggregation: Given bit in each event (variable), each variable (event) can simultaneously find the minimum of the bits of its events (variables).
-
For the following items, it is sufficient if they hold in the setting that events and variables are given -bit identifiers that are unique within distance in (footnote 5: in general, for the whole LLL instance and for non-constant distances, such identifiers do not exist, but our LLL algorithms only use the primitives in settings where they do exist and are available):
-
3.
Evaluate: Given a partial assignment , and partial assignments , , in which each variable knows its values (or ), each event of can simultaneously decide if holds, where is a parameter known by all nodes of .
-
4.
Min-aggregation: We can compute the following for different instances in parallel: Given an -bit string in each event (variable), each variable (event) can simultaneously find the minimum of the strings for its events (for its variables).
Disjoint Variable Set LLLs
In a disjoint variable set LLL there are two disjoint sets of variables available for each event. In fact, we consider events that can be written as the conjunction of two events where and holds for . Note that to avoid it is sufficient to avoid either or .
Theorem 5.3.
There is a randomized algorithm that in rounds w.h.p. solves any disjoint variable set LLL of constant locality with dependency degree and bad event upper bound . The algorithm requires , , for constants , and that the LLL is simulatable.
Sampling LLLs
In a binary LLL the range of the variables is . We view the variables/nodes with black value as sampled. Thus, we also refer to them as sampling LLLs. The risk of a bad event upper bounds the probability of a bad event to hold under a certain type of retractions of an assignment that avoided an associated event .
Definition 5.4 (risk).
We say that an (associated) event testifies risk for some event if
(3) |
The risk of an event is the smallest risk testified by some event .
Here, is the set of retractions of assignments avoiding , where either (i) no variables of or (ii) all variables of are retracted. In our algorithms we will use several LLLs that have a low risk and hence can be solved with the following theorem.
Theorem 5.5.
There is a randomized algorithm that in rounds w.h.p. solves any LLL of constant locality with dependency degree and risk . The algorithm requires , , for constants and that the LLL is simulatable.
Events favor , or are monotone increasing, if changing any value to does not decrease the conditional probability of the event, respectively. A typical example of a monotone increasing event is sampling a subset of vertices containing many non-edges in the neighborhood of each node. We use this problem in our procedure to color sparse nodes. A key point is that it is easy to bound the risk of monotone increasing events as shown in the following lemma from [32].
Lemma 5.6.
The risk of a monotone increasing event is testified by .
Last but not least we will sample subsets of nodes satisfying certain degree bounds. The following lemma is helpful to bound the risk of such sampling LLLs.
Lemma 5.7.
Consider a random variable that is a sum of independent binary random variables. For some threshold parameter , let be the event that holds. Then, the risk of is at most testified by .
5.2 Generating Unit Slack for Sparse and Ordinary Nodes
The next observation shows that the nodes in small ordinary cliques are somewhat sparse. As each large AC is a proper clique consisting of nodes with degree , we obtain the following.
Observation 5.8 (Small ordinary cliques are sparse).
Any node in an ordinary AC has at least non-edges in its neighborhood. In particular, any small node has at least non-edges in its neighborhood.
Proof.
Since is not easy, each of its nodes has external neighbors. Since is not difficult, it has no intrusive neighbor. Thus, each external neighbor of has at most neighbors in , so at least non-neighbors. Hence, the first claim. A small node has , implying the second claim. ∎
The task of this subsection is to prove the following lemma.
See Lemma 4.3.
Proof.
Let and . Note that any node in with a neighbor automatically has unit-slack in as its neighbor is stalled to be colored later. Thus we can concentrate on the vertices in .
Each node is sparse and so has non-edges in its induced neighborhood, which is within . Each node in has at least non-edges in its neighborhood in by Observation 5.8. Let .
We first use Theorem 5.5 (twice) to compute two sets , satisfying the following properties:
-
1.
, for all
-
2.
, for all ordinary ACs ,
-
3.
Number of non-edges in is , for each .
In order to construct , consider the process that samples each node into with probability for a suitable constant . For suitable constants , introduce the following bad events.
-
1.
For all , event holds if
-
2.
For each ordinary AC , event holds if ,
-
3.
For each , event holds if the number of non-edges in is less than .
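The sampling process and the three families of bad events can be sketched as follows. The thresholds `max_nbrs`, `max_matched`, and `min_non_edges` are hypothetical parameters, and the sketch assumes that the first two events are upper-bound violations (too many sampled neighbors, too many sampled matching endpoints); the third is the lower bound on non-edges stated above. The set `targets` stands for the nodes for which the non-edge event is defined.

```python
import random
from itertools import combinations
from typing import Dict, Hashable, List, Set, Tuple

Node = Hashable
Graph = Dict[Node, Set[Node]]


def sample_nodes(nodes: Set[Node], p: float, rng: random.Random) -> Set[Node]:
    """Sample each node independently with probability p."""
    return {v for v in sorted(nodes, key=repr) if rng.random() < p}


def count_non_edges(adj: Graph, vertices: Set[Node]) -> int:
    """Number of non-adjacent pairs among the given vertices."""
    return sum(1 for a, b in combinations(sorted(vertices, key=repr), 2)
               if b not in adj[a])


def bad_events(
    adj: Graph,
    sample: Set[Node],
    matchings: Dict[Hashable, List[Tuple[Node, Node]]],
    targets: Set[Node],
    max_nbrs: int,
    max_matched: int,
    min_non_edges: int,
) -> Set[Tuple[str, Hashable]]:
    """Return the set of bad events that hold under the given sample."""
    bad: Set[Tuple[str, Hashable]] = set()
    for v in adj:
        if len(adj[v] & sample) > max_nbrs:
            bad.add(("too_many_neighbors", v))
    for ac, matching in matchings.items():
        hit = sum((x in sample) + (y in sample) for x, y in matching)
        if hit > max_matched:
            bad.add(("too_many_matched", ac))
    for v in targets:
        if count_non_edges(adj, adj[v] & sample) < min_non_edges:
            bad.add(("too_few_non_edges", v))
    return bad
```

An LLL solver would resample until `bad_events` returns the empty set; this sketch only evaluates the events for one fixed sample.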
Claim 5.9.
The sampling of with probability and the aforementioned bad events is a simulatable LLL with risk .
Proof.
We first bound the risk of the events and then reason about simulatability.
-
•
Fix some . The expected number of neighbors in is . Hence, by Chernoff. Additionally, define an associated event as the event that at most neighbors are sampled. We have . This bounds the risk of to be at most by Lemma 5.7.
-
•
The proof for bounding the risk of the event for each ordinary clique is identical to the proof for by considering the sampling status of the matching instead of the neighborhood of the respective node.
-
•
Fix a node and let be the fraction of non-edges of node in its neighborhood induced by . Observation 5.8 shows , regardless of whether or .
Now, fix the constant such that the event is the event that the number of non-edges in is less than . Let be a random variable for the number of non-edges in the graph induced by . Apply the non-edge hitting lemma (Lemma B.2) with and . The lemma shows that the expected number of non-edges is and that is also well concentrated. We obtain . is a monotone increasing event. Hence, its risk is at most by Lemma 5.6, where the associated event is itself.
In summary the risk is upper bounded by .
The simulatability of the first two types of events ( for and for ordinary cliques ) is immediate as it only counts the number of immediate neighbors of nodes and cliques, respectively. Here, the leader node can gather full information about the number of nodes in in any partial assignment sampling .
The lengthy proof of the simulatability of the event for is word-for-word identical to the proof of the simulatability of a similar type of event in [32, Lemma 8.4, arXiv version]. The crucial point is part 3 of the simulatability definition (Definition 5.2) where evaluations of conditional probabilities need to be done in parallel in the setting where locally unique IDs are represented with bits. These small IDs are sufficient for a preprocessing that is done simultaneously for all instances and in which learns the whole topology of . Once the topology is available, the sampling status of nodes will reveal the number of non-edges in ’s sampled neighborhood, showing simulatability. ∎
Due to Claim 5.9, we can apply Theorem 5.5 to solve the LLL in and compute a set with the required properties in rounds, w.h.p. We proceed analogously for but compute it as a subset of . The remaining steps are identical except that constant is replaced with a smaller constant as removing the set from may reduce the sparsity of the nodes in . Still, the reduction is limited to a constant factor for the following reason: removing at most nodes from the neighborhood of each node reduces the number of non-edges in each neighborhood by at most . Thus a node in still has non-edges available, where we used that is large enough and holds. For nodes in , removing the nodes in from also removes less than half of the initially available non-edges.
With the two sets and , we apply Lemma B.1 with two disjoint color palettes of size . The number of non-edges in satisfies as required. As a result, a subset is colored, such that all nodes in get slack. The second property of this lemma, stating that the number of nodes colored in of the respective nodes and in , follows from the bound on number of neighbors in and . The runtime immediately follows from Theorem 5.5 and Lemma B.1. ∎
5.3 Computing the Set
See Lemma 4.5.
We compute the sets and by two consecutive LLLs and . In the first LLL, we compute the set , which we split into the two sets and in the second LLL.
Definition 5.10 (First sampling LLL).
We define the following sampling LLL . Let .
-
•
Variables: Sample each node of with probability into . Denote .
-
•
Bad Events:
-
1.
For each , there is a bad event stating that .
-
2.
For each important AC , define an event that holds if fewer than edges of are useful.
-
•
Associated Events
-
1.
: For each , the bad event holds if ,
-
2.
: The event holds if there are fewer than useful edges or if there are fewer than edges in .
-
•
Event/variable assignment : Each variable and each event , are simulated by the corresponding node. The events and are simulated by the node of with the largest ID.
Note that is of different nature from .
Lemma 5.11.
We have the following upper bounds for the probabilities of the respective events.
1. For all : .
2. For all important ACs : .
Proof.
Throughout the proof we use that and are constant.
Bounding : As each node joins independently with probability , we have , and the first bound follows from a Chernoff bound.
Bounding : Let be the arcs of that have both endpoints in and uncolored after Step 1. All heads of arcs in are already in , and by the definition of an important AC, at least arcs in have their tails in . At most of are already colored. Thus, contains at least nodes.
Now, observe that the probability for an edge of to be useful is and the expected number of useful edges is . This property is independent for different edges in , so the claim regarding the number of useful edges follows from a Chernoff bound. Similarly, the probability for an edge to be is , and the expected number of edges in is . The claim regarding edges then follows with a Chernoff bound. ∎
Lemma 5.12.
is a sampling LLL with risk and dependency degree .
Proof.
The probabilities of the associated events and are at most by Lemma 5.11.
The dependency degree can be bounded as follows. Each variable of a node stating whether the node is or only appears in the events and of adjacent nodes and in the events and of adjacent ACs, bounding the variable degree by . We have and each event depends on two variables for each edge in . As , we obtain that each event depends on at most variables and the dependency degree can be upper bounded by .
Via Lemma 5.7 we obtain that testifies that has risk .
Next, we fix an important AC and reason that testifies that has risk . First note that , as required by Definition 5.4. Let , namely is a retraction of an assignment under which is avoided. By the definition of , the set of retracted variables is in one of the following two cases: 1) The set of retracted variables contains no variables of that were under , or 2) The set of retracted variables contains all variables of that were under .
Let us first consider the second case. As is avoided under , under the assignment at least edges of are white. In the second case, all of these obtain fresh randomness, and each of them is useful independently with probability . Thus, in expectation, at least of them are useful. With a Chernoff bound, we obtain that the probability of to happen in the second case is at most .
Now consider the first case. As is avoided under , under the assignment at least edges of are useful. Let be the set of nodes in those useful edges that are contained in . Note that all nodes in are under . Let be the nodes that are also under and let be the nodes that evaluate to under , i.e., got retracted. Nodes in are / with probability and , respectively. Let be the random variable describing the number of nodes of set to in this process. Let be the random variable describing the number of useful edges in after that process. We obtain , where we used that . The event holds if , which is smaller than . Hence, we obtain that happens with probability at most by a Chernoff bound. ∎
Lemma 5.13.
is simulatable.
Proof.
Each event depends only on variables that are immediately incident to the node is a function counting the number of nodes that is known to . Hence, the simulatability condition holds for . For all variables are simulated by nodes that are immediately incident to the AC and full knowledge about these variables can be relayed to the leader in the AC that simulates event . Again, whether the event holds can be evaluated with the values of the variables and the edges in , also for all conditional probabilities of partial assignments, as has full knowledge of the function . ∎
Let be the threshold of the number of useful edges that are guaranteed in each for each important AC by (see Definition 5.10). The second LLL is significantly simpler and given by the following definition.
Definition 5.14 (Second sampling LLL).
We define the following LLL . We split into two sets and , where each node in flips an unbiased coin to decide which set to join. There are bad events , and for each important AC , that hold if there are fewer than useful edges in , respectively.
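A minimal sketch of the splitting experiment in Definition 5.14 follows. It assumes that each useful edge of an AC is represented by its endpoint in the sampled set, and that `threshold` stands in for the required number of useful edges per part; both the representation and the names `split_unbiased` and `split_bad_events` are hypothetical.

```python
import random


def split_unbiased(nodes, rng):
    """Each node flips an unbiased coin to decide which of the two parts to join."""
    part_a, part_b = set(), set()
    for v in sorted(nodes):
        (part_a if rng.random() < 0.5 else part_b).add(v)
    return part_a, part_b


def split_bad_events(useful_endpoints, part_a, part_b, threshold):
    """Return the ACs that have fewer than `threshold` useful edges in some part.

    useful_endpoints maps an AC identifier to the list of sampled endpoints of
    its useful edges (an assumed encoding of the useful edges).
    """
    bad = []
    for ac, endpoints in useful_endpoints.items():
        in_a = sum(1 for v in endpoints if v in part_a)
        in_b = sum(1 for v in endpoints if v in part_b)
        if in_a < threshold or in_b < threshold:
            bad.append(ac)
    return bad
```

The sketch only evaluates the bad events for one random split; the LLL algorithm is what guarantees a split avoiding all of them.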
This LLL can be solved by a result in [32].
Lemma 5.15.
There is a -round algorithm for .
Proof.
Form the bipartite graph with the nodes of on one side and a node for each important AC on the other side. There is an edge for each useful arc . Each node has degree at least (by Lemma 4.6), while each node has degree at most (as is large). Splitting the subset into two parts such that each node has between and neighbors in each part is a vertex subset-splitting problem formulated as a bounded-risk LLL and solved in Lemma D.11(1) of [32].
We only need to verify that this problem remains simulatable in our embedded setting. The problem is simulatable because the node can obtain full knowledge of any partial assignment of and knows the function . ∎
Proof of Lemma 4.5.
First, apply Theorem 5.5 to solve in rounds, yielding a set that avoids all bad events of . The conditions of the theorem are met by Lemmas 5.12 and 5.13, and as implies that the criterion is strong enough. We then split the set into and by solving via Lemma 5.15. The requirements of the theorem are satisfied as .
The degree bound immediately follows from the conditions on imposed in the neighborhood of each vertex by (note that ). The second property follows from the avoided events of for each important AC. ∎
5.4 Forming Triples
See 4.6
We model the problem of finding for each important AC as a disjoint variable set LLL, where the disjoint sets and give rise to two disjoint sets of variables. Note that the respective nodes and will only be computed in the sequel via a deterministic method. Recall that .
Definition 5.16.
Define the following disjoint variable set LLL .
• Variables: For each important AC and each useful arc , there is a binary random variable that assumes with probability .
• Events: We call a useful arc successful if AC activated and no other AC activated an edge (for some ). There is one bad event for each important AC that holds if there is no successful edge for . We introduce corresponding events and restricted to arcs in , respectively. We have .
• The home node of is where is the node of with largest ID. The home node of is .
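The notion of a successful arc in Definition 5.16 can be made concrete with a short sketch. It follows the reading used in the proof of Lemma 5.17, namely that an activated arc is successful when no other activated arc shares its tail. The triple encoding `(ac, tail, head)`, the activation probability `q`, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def activate(arcs, q, rng):
    """Independently activate each useful arc with probability q."""
    return {arc for arc in arcs if rng.random() < q}


def successful_arcs(activated):
    """An activated arc (ac, tail, head) is successful if it is the only
    activated arc with that tail."""
    tail_count = Counter(tail for _, tail, _ in activated)
    return {arc for arc in activated if tail_count[arc[1]] == 1}


def bad_acs(important_acs, activated):
    """Return the important ACs that have no successful arc."""
    served = {ac for ac, _, _ in successful_arcs(activated)}
    return [ac for ac in important_acs if ac not in served]
```

If two ACs activate arcs with the same tail, neither of those arcs is successful, which is why the activation probability has to be small relative to the number of arcs sharing a tail.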
Lemma 5.17.
For each important AC and each , we have and the dependency degree of is upper bounded by .
Proof.
Fix an important AC and . For important arc , let be the event that is successful. Observe that depends only on arcs with as tail. It holds if is activated (i.e., has ) while the other arcs with as tail are not activated. The external degree of each is at most , as it is large, so at most that many useful arcs have as tail. Thus, .
The events and are independent, as they involve disjoint sets of arcs. The bad event holds only when no useful edge in becomes successful, which occurs with probability
The dependency degree is upper bounded by because the events of an AC only share variables with ACs that are within distance from one of the nodes of the AC. ∎
Lemma 5.18.
is simulatable.
Proof.
The respective nodes can check in rounds whether their events hold under a given assignment, and the aggregation primitives can be implemented efficiently as all variables are in distance at most from the ACs.
We next reason why we can compute the conditional probabilities of Definition 5.2. Let be any partial assignment. To compute the conditional probabilities , and , the node holding the respective event needs to compute the probability that one of the useful edges becomes successful for . The conditional probability is if there already is a useful edge that is successful for . For all other useful edges in , the probability of becoming successful is independent as the activation by happens independently, and also, all activations from other ACs influence at most one useful edge in . The probability to become successful for a single useful edge , conditioned on , can be computed from knowing whether activated in , whether any other edge is activated in , and from the number of useful edges of other ACs with endpoint that evaluate to under . The nodes can learn all this information in rounds using bits of communication per edge (here we use that to communicate the aforementioned number efficiently). This can be performed in parallel for the events of all important ACs. Knowing the probability for each edge to become successful, the respective node can compute the conditional probability for the event. This proof also subsumes that the events can be evaluated efficiently, as we did not require the locally unique IDs from a smaller ID space that are given by Definition 5.2. ∎
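The conditional probability computation described in this proof can be sketched as follows. The encoding is an assumption: a variable is `None` if it is unset in the partial assignment and `0` or `1` otherwise, `q` is the activation probability, and the arcs of one AC are assumed to have pairwise distinct tails so that their success events are independent.

```python
def arc_success_probability(q, own, others):
    """Probability that one arc becomes successful under a partial assignment.

    own    -- value of the arc's own variable (None, 0 or 1)
    others -- values of the variables of all other arcs sharing its tail
    """
    if own == 0 or any(x == 1 for x in others):
        return 0.0
    p_own = 1.0 if own == 1 else q
    unset_others = sum(1 for x in others if x is None)
    return p_own * (1.0 - q) ** unset_others


def bad_event_probability(q, arcs):
    """Probability that no arc of the AC becomes successful.

    arcs is a list of (own, others) pairs, one per useful arc of the AC.
    """
    prob_none = 1.0
    for own, others in arcs:
        prob_none *= 1.0 - arc_success_probability(q, own, others)
    return prob_none
```

Note that the home node only needs, per arc, its own variable, whether a competing arc is already activated, and the count of unset competing variables, which matches the information listed in the proof.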
Proof of Lemma 4.6.
Fix an important AC . First, we use Theorem 5.3 to solve with the sets and given from Lemma 4.5. The algorithm runs in rounds and works w.h.p. It provides us with a successful edge for the AC (see Definition 5.16). Next, we show that we can deterministically compute a node to form the triple of nodes as required for Lemma 4.6 in rounds.
The nodes in that cannot be used for are those that are either: a) neighbors of , b) already colored, or c) function as for another important AC . As is large, it has at most neighbors in . By Lemma 4.3, at most nodes of are already colored.
By Lemma 4.5, we obtain that at most nodes in are candidates for being the outside node in a triple. Hence, at least nodes in will do as a -node.
The graph induced by has maximum degree as Lemma 4.5 ensures that for all . ∎
References
- AKM [22] Sepehr Assadi, Pankaj Kumar, and Parth Mittal. Brooks’ theorem in graph streams: a single-pass semi-streaming algorithm for -coloring. In Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, pages 234–247, 2022.
- Bar [15] L. Barenboim. Deterministic ( + 1)-coloring in sublinear (in ) time in static, dynamic and faulty networks. In Proc. 34th ACM Symposium on Principles of Distributed Computing (PODC), pages 345–354, 2015.
- BBKO [22] Alkida Balliu, Sebastian Brandt, Fabian Kuhn, and Dennis Olivetti. Distributed -coloring plays hide-and-seek. In Proc. 54th ACM Symp. on Theory of Computing (STOC), 2022.
- BCM+ [21] Alkida Balliu, Keren Censor-Hillel, Yannic Maus, Dennis Olivetti, and Jukka Suomela. Locally checkable labelings with small messages. In Seth Gilbert, editor, 35th International Symposium on Distributed Computing, DISC 2021, October 4-8, 2021, Freiburg, Germany (Virtual Conference), volume 209 of LIPIcs, pages 8:1–8:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.
- BE [13] Leonid Barenboim and Michael Elkin. Distributed Graph Coloring: Fundamentals and Recent Developments. Morgan & Claypool Publishers, 2013.
- BE [19] Étienne Bamas and Louis Esperet. Distributed coloring of graphs with an optimal number of colors. volume 126 of LIPIcs, pages 10:1–10:15. LZI, 2019.
- BEPS [16] Leonid Barenboim, Michael Elkin, Seth Pettie, and Johannes Schneider. The locality of distributed symmetry breaking. Journal of the ACM, 63(3):20:1–20:45, 2016.
- BFH+ [16] Sebastian Brandt, Orr Fischer, Juho Hirvonen, Barbara Keller, Tuomo Lempiäinen, Joel Rybicki, Jukka Suomela, and Jara Uitto. A lower bound for the distributed Lovász local lemma. In Proc. 48th ACM Symposium on Theory of Computing (STOC 2016), pages 479–488. ACM, 2016.
- BKM [20] Philipp Bamberger, Fabian Kuhn, and Yannic Maus. Efficient deterministic distributed coloring with small bandwidth. In PODC ’20: ACM Symposium on Principles of Distributed Computing, Virtual Event, Italy, August 3-7, 2020, pages 243–252, 2020.
- Bro [41] R. Leonard Brooks. On colouring the nodes of a network. Mathematical Proceedings of the Cambridge Philosophical Society, 37(2):194–197, 1941.
- CCDM [24] Sam Coy, Artur Czumaj, Peter Davies, and Gopinath Mishra. Parallel derandomization for coloring, 2024. Note: https://arxiv.longhoe.net/abs/2302.04378v1 contains the Delta-coloring algorithm.
- CHL+ [20] Yi-Jun Chang, Qizheng He, Wenzheng Li, Seth Pettie, and Jara Uitto. Distributed edge coloring and a special case of the constructive Lovász local lemma. ACM Trans. Algorithms, 2020.
- CLP [18] Yi-Jun Chang, Wenzheng Li, and Seth Pettie. An optimal distributed (+1)-coloring algorithm? In Proceedings of the ACM Symposium on Theory of Computing (STOC), pages 445–456, 2018.
- CM [19] Shiri Chechik and Doron Mukhtar. Optimal distributed coloring algorithms for planar graphs in the LOCAL model. In Timothy M. Chan, editor, Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, pages 787–804. SIAM, 2019.
- CP [19] Yi-Jun Chang and Seth Pettie. A time hierarchy theorem for the LOCAL model. SIAM J. Comput., 48(1):33–69, 2019.
- CPS [17] Kai-Min Chung, Seth Pettie, and Hsin-Hao Su. Distributed algorithms for the Lovász local lemma and graph coloring. Distributed Comput., 30(4):261–280, 2017.
- Dav [23] Peter Davies. Improved distributed algorithms for the Lovász local lemma and edge coloring. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 4273–4295. SIAM, 2023.
- EL [74] Paul Erdős and László Lovász. Problems and Results on 3-chromatic Hypergraphs and some Related Questions. Colloquia Mathematica Societatis János Bolyai, pages 609–627, 1974.
- EPS [15] Michael Elkin, Seth Pettie, and Hsin-Hao Su. (2)-edge-coloring is much easier than maximal matching in the distributed setting. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, San Diego, CA, USA, January 4-6, 2015, pages 355–370, 2015.
- FG [17] Manuela Fischer and Mohsen Ghaffari. Sublogarithmic Distributed Algorithms for Lovász Local Lemma, and the Complexity Hierarchy. In the Proceedings of the 31st International Symposium on Distributed Computing (DISC), pages 18:1–18:16, 2017.
- FHK [16] Pierre Fraigniaud, Marc Heinrich, and Adrian Kosowski. Local conflict coloring. In Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), pages 625–634, 2016.
- FHM [23] Manuela Fischer, Magnús M. Halldórsson, and Yannic Maus. Fast distributed Brooks’ theorem. In Proceedings of the SIAM-ACM Symposium on Discrete Algorithms (SODA), pages 2567–2588, 2023.
- Fis [17] Manuela Fischer. Improved deterministic distributed matching via rounding. In Andréa W. Richa, editor, 31st International Symposium on Distributed Computing, DISC 2017, October 16-20, 2017, Vienna, Austria, volume 91 of LIPIcs, pages 17:1–17:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017.
- FK [23] Marc Fuchs and Fabian Kuhn. List defective colorings: Distributed algorithms and applications. In Rotem Oshman, editor, 37th International Symposium on Distributed Computing, DISC 2023, October 10-12, 2023, L’Aquila, Italy, volume 281 of LIPIcs, pages 22:1–22:23. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2023.
- Gha [19] Mohsen Ghaffari. Distributed maximal independent set using small messages. In Proc. 30th Symp. on Discrete Algorithms (SODA), pages 805–820, 2019.
- GHK [18] Mohsen Ghaffari, David G. Harris, and Fabian Kuhn. On derandomizing local distributed algorithms. In 59th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2018, Paris, France, October 7-9, 2018, pages 662–673, 2018.
- GHKM [18] Mohsen Ghaffari, Juho Hirvonen, Fabian Kuhn, and Yannic Maus. Improved distributed delta-coloring. In Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, PODC 2018, Egham, United Kingdom, July 23-27, 2018, pages 427–436, 2018.
- GK [21] Mohsen Ghaffari and Fabian Kuhn. Deterministic distributed vertex coloring: Simpler, faster, and without network decomposition. In Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), pages 1009–1020, 2021.
- HKMT [21] Magnús M. Halldórsson, Fabian Kuhn, Yannic Maus, and Tigran Tonoyan. Efficient randomized distributed coloring in CONGEST. In Proceedings of the ACM Symposium on Theory of Computing (STOC), pages 1180–1193, 2021. Full version at CoRR abs/2105.04700.
- HKNT [22] Magnús M. Halldórsson, Fabian Kuhn, Alexandre Nolin, and Tigran Tonoyan. Near-optimal distributed degree+1 coloring. In Stefano Leonardi and Anupam Gupta, editors, STOC ’22: 54th Annual ACM SIGACT Symposium on Theory of Computing, Rome, Italy, June 20 - 24, 2022, pages 450–463. ACM, 2022.
- HMN [22] Magnús M. Halldórsson, Yannic Maus, and Alexandre Nolin. Fast distributed vertex splitting with applications. In Christian Scheideler, editor, 36th International Symposium on Distributed Computing, DISC 2022, October 25-27, 2022, Augusta, Georgia, USA, volume 246 of LIPIcs, pages 26:1–26:24. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022.
- HMP [24] Magnús M. Halldórsson, Yannic Maus, and Saku Peltonen. Distributed Lovász local lemma under bandwidth limitations, 2024.
- HN [21] Magnús M. Halldórsson and Alexandre Nolin. Superfast coloring in CONGEST via efficient color sampling. In Tomasz Jurdzinski and Stefan Schmid, editors, Structural Information and Communication Complexity - 28th International Colloquium, SIROCCO 2021, Wrocław, Poland, June 28 - July 1, 2021, Proceedings, volume 12810 of Lecture Notes in Computer Science, pages 68–83. Springer, 2021.
- HNT [22] Magnús M. Halldórsson, Alexandre Nolin, and Tigran Tonoyan. Overcoming congestion in distributed coloring. In Proceedings of the ACM Symposium on Principles of Distributed Computing (PODC), pages 26–36. ACM, 2022.
- HSS [18] David G. Harris, Johannes Schneider, and Hsin-Hao Su. Distributed ()-coloring in sublogarithmic rounds. Journal of the ACM, 65:19:1–19:21, 2018.
- Joh [99] Öjvind Johansson. Simple distributed -coloring of graphs. Inf. Process. Lett., 70(5):229–232, 1999.
- Lin [92] Nati Linial. Locality in distributed graph algorithms. SIAM Journal on Computing, 21(1):193–201, 1992.
- MPU [23] Yannic Maus, Saku Peltonen, and Jara Uitto. Distributed symmetry breaking on power graphs via sparsification. In Proceedings of the 2023 ACM Symposium on Principles of Distributed Computing, PODC ’23, pages 157–167, New York, NY, USA, 2023. Association for Computing Machinery.
- MT [20] Yannic Maus and Tigran Tonoyan. Local conflict coloring revisited: Linial for lists. In Hagit Attiya, editor, 34th International Symposium on Distributed Computing, DISC 2020, October 12-16, 2020, Virtual Conference, volume 179 of LIPIcs, pages 16:1–16:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020.
- MU [21] Yannic Maus and Jara Uitto. Efficient CONGEST algorithms for the Lovász local lemma. In Seth Gilbert, editor, Proceedings of the International Symposium on Distributed Computing (DISC), volume 209 of LIPIcs, pages 31:1–31:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.
- Pos [19] Luke Postle. Linear-time and efficient distributed algorithms for list coloring graphs on surfaces. In David Zuckerman, editor, 60th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2019, Baltimore, Maryland, USA, November 9-12, 2019, pages 929–941. IEEE Computer Society, 2019.
- PS [95] Alessandro Panconesi and Aravind Srinivasan. The local nature of -coloring and its algorithmic applications. Combinatorica, 15(2):255–280, 1995.
- RG [20] Václav Rozhoň and Mohsen Ghaffari. Polylogarithmic-time deterministic network decomposition and distributed derandomization. In Proceedings of the ACM Symposium on Theory of Computing (STOC), pages 350–363, 2020.
Appendix A Concentration Bounds
Lemma A.1 (Chernoff bounds).
Let be a family of independent binary random variables with , and let . For any ,
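For orientation, one standard multiplicative form of the Chernoff bound is recorded below. It is a textbook statement and not necessarily the exact variant intended by Lemma A.1; the symbols $X_1,\dots,X_n$, $X$, $\mu$ and $\delta$ are chosen here for illustration.

```latex
% Assumption: X_1, ..., X_n are independent binary random variables,
% X = \sum_i X_i, and \mu = \mathbb{E}[X]. For any 0 < \delta \le 1:
\[
  \Pr\bigl[X \ge (1+\delta)\mu\bigr] \le \exp\!\left(-\frac{\delta^{2}\mu}{3}\right),
  \qquad
  \Pr\bigl[X \le (1-\delta)\mu\bigr] \le \exp\!\left(-\frac{\delta^{2}\mu}{2}\right).
\]
```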
Appendix B Further Supplementary Material from [32]
Slack generation with two given sets.
The following lemma shows that one can compute a partial coloring of the nodes in two given sets and such that any node that has sufficiently many non-edges in obtains slack. We use it in Section 5.2 and it is proven in [32].
Lemma B.1 ([32]).
Let . Let and be positive integers such that and for some constant . Let and let be disjoint sets such that for ,
• ,
• : the number of non-edges in is at least
There is a randomized algorithm that w.h.p. colors a subset of using a palette of size such that every node in has at least same-colored neighbors. Every node in has at most of its neighbors colored.
Non-edge hitting lemma.
An expected -fraction of the non-edges is preserved when sampling nodes into a set with probability . The following lemma shows that the probability of deviating from this expectation is small.
Lemma B.2 (Non-edge hitting lemma [32]).
Let be a graph on the vertex set with non-edges. Sample each node of with probability into a set and let be the random variable describing the number of non-edges in . Then we have .
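The sampling experiment behind the non-edge hitting lemma can be simulated with a small sketch. A non-edge survives exactly when both of its endpoints are sampled, so with sampling probability `p` the expected number of surviving non-edges is `p * p` times the initial number. The function names, `trials` and `seed` are hypothetical, and the simulation only estimates the expectation; it does not verify the tail bound of the lemma.

```python
import random
from itertools import combinations


def non_edges(nodes, edges):
    """List all unordered node pairs that are not joined by an edge."""
    edge_set = {frozenset(e) for e in edges}
    return [pair for pair in combinations(sorted(nodes), 2)
            if frozenset(pair) not in edge_set]


def surviving_non_edges(non_edge_list, sampled):
    """Count the non-edges with both endpoints in the sampled set."""
    return sum(1 for u, v in non_edge_list if u in sampled and v in sampled)


def average_survivors(nodes, edges, p, trials, seed=0):
    """Empirical mean of the number of surviving non-edges over several samples."""
    rng = random.Random(seed)
    pairs = non_edges(nodes, edges)
    total = 0
    for _ in range(trials):
        sampled = {v for v in nodes if rng.random() < p}
        total += surviving_non_edges(pairs, sampled)
    return total / trials
```

Comparing `average_survivors(...)` against `p * p * len(non_edges(...))` for a moderately large graph gives a quick sanity check of the stated expectation.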