-
Work-Efficient Parallel Derandomization II: Optimal Concentrations via Bootstrapping
Authors:
Mohsen Ghaffari,
Christoph Grunau
Abstract:
We present an efficient parallel derandomization method for randomized algorithms that rely on concentrations such as the Chernoff bound. This settles a classic problem in parallel derandomization, which dates back to the 1980s. Consider the \textit{set balancing} problem where $m$ sets of size at most $s$ are given in a ground set of size $n$, and we should partition the ground set into two parts such that each set is split evenly up to a small additive (discrepancy) bound. A random partition achieves a discrepancy of $O(\sqrt{s \log m})$ in each set, by the Chernoff bound. We give a deterministic parallel algorithm that matches this bound, using near-linear work and polylogarithmic depth. The previous results were weaker in discrepancy and/or work bounds: Motwani, Naor, and Naor [FOCS'89] and Berger and Rompel [FOCS'89] achieve discrepancy $s^{\varepsilon} \cdot O(\sqrt{s \log m})$ with work $\tilde{O}(m+n+\sum_{i=1}^{m} |S_i|) \cdot m^{Θ(1/\varepsilon)}$ and polylogarithmic depth; the discrepancy was optimized to $O(\sqrt{s \log m})$ in later work, e.g. by Harris [Algorithmica'19], but the work bound remained high at $\tilde{O}(m^4n^3)$. Ghaffari, Grunau, and Rozhon [FOCS'23] achieve discrepancy $s/poly(\log(nm)) + O(\sqrt{s \log m})$ with near-linear work and polylogarithmic depth. Notice that this discrepancy is barely sublinear with respect to the trivial bound of $s$. Our method relies on a novel bootstrapping idea that uses crude partitioning algorithms as a subroutine. In particular, we solve the problem recursively, by using the crude partition in each iteration to split the variables into many smaller parts, and then we find a constraint for the variables in each part such that we reduce the overall number of variables in the problem. The scheme relies on an interesting application of the multiplicative weights update method to control the variance losses in each iteration.
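To make the set balancing problem concrete, the following minimal Python sketch measures the discrepancy of a uniformly random partition and prints it next to the $\sqrt{s \log m}$ scale from the Chernoff bound. It only illustrates the randomized baseline mentioned in the abstract, not the deterministic parallel algorithm of the paper. The helper names `make_instance` and `random_partition_discrepancy`, the instance sizes, and the seeds are assumptions chosen for illustration.

```python
import math
import random


def make_instance(n, m, s, seed=1):
    """Hypothetical test instance: m sets, each with s distinct elements of range(n)."""
    rng = random.Random(seed)
    return [rng.sample(range(n), s) for _ in range(m)]


def random_partition_discrepancy(n, sets, seed=0):
    """Color every ground-set element +1 or -1 uniformly at random and
    return the largest imbalance |sum of colors| over all sets."""
    rng = random.Random(seed)
    color = [rng.choice((-1, 1)) for _ in range(n)]
    return max(abs(sum(color[e] for e in s)) for s in sets)


n, m, s = 2000, 200, 400
sets = make_instance(n, m, s)
disc = random_partition_discrepancy(n, sets)
chernoff_scale = math.sqrt(s * math.log(m))  # sqrt(s log m), constants omitted
print(disc, round(chernoff_scale, 1))
```

The printed discrepancy should be of the same order as `chernoff_scale` and far below the trivial bound `s`; the hidden constant in the $O(\cdot)$ is not modeled here.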
Submitted 22 November, 2023;
originally announced November 2023.
-
Work-Efficient Parallel Derandomization I: Chernoff-like Concentrations via Pairwise Independence
Authors:
Mohsen Ghaffari,
Christoph Grunau,
Václav Rozhoň
Abstract:
We present a novel technique for work-efficient parallel derandomization, for algorithms that rely on the concentration of measure bounds such as Chernoff, Hoeffding, and Bernstein inequalities. Our method increases the algorithm's computational work and depth by only polylogarithmic factors. Before our work, the only known method to obtain parallel derandomization with such strong concentrations was by the results of [Motwani, Naor, and Naor FOCS'89; Berger and Rompel FOCS'89], which perform a binary search in a $k$-wise independent space for $k=poly(\log n)$. However, that method blows up the computational work by a high $poly(n)$ factor and does not yield work-efficient parallel algorithms. Their method was an extension of the approach of [Luby FOCS'88], which gave a work-efficient derandomization but was limited to algorithms analyzed with only pairwise independence. Pushing the method from pairwise to the higher $k$-wise analysis resulted in the $poly(n)$ factor computational work blow-up. Our work can be viewed as an alternative extension from the pairwise case, which yields the desired strong concentrations while retaining work efficiency up to logarithmic factors.
Our approach works by casting the problem of determining the random variables as an iterative process with $poly(\log n)$ iterations, where different iterations have independent randomness. This is done so that for the desired concentrations, we need only pairwise independence inside each iteration. In particular, we model each binary random variable as a result of a gradual random walk, and our method shows that the desired Chernoff-like concentrations about the endpoints of these walks can be boiled down to some pairwise analysis on the steps of these random walks in each iteration (while having independence across iterations).
Submitted 22 November, 2023;
originally announced November 2023.
-
Conditionally Optimal Parallel Coloring of Forests
Authors:
Christoph Grunau,
Rustam Latypov,
Yannic Maus,
Shreyas Pai,
Jara Uitto
Abstract:
We show the first conditionally optimal deterministic algorithm for $3$-coloring forests in the low-space massively parallel computation (MPC) model. Our algorithm runs in $O(\log \log n)$ rounds and uses optimal global space. The best previous algorithm requires $4$ colors [Ghaffari, Grunau, Jin, DISC'20] and is randomized, while our algorithm is inherently deterministic.
Our main technical contribution is an $O(\log \log n)$-round algorithm to compute a partition of the forest into $O(\log n)$ ordered layers such that every node has at most two neighbors in the same or higher layers. Similar decompositions are often used in the area and we believe that this result is of independent interest. Our results also immediately yield conditionally optimal deterministic algorithms for maximal independent set and maximal matching for forests, matching the state of the art [Giliberti, Fischer, Grunau, SPAA'23]. In contrast to their solution, our algorithms are not based on derandomization, and are arguably simpler.
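The layered partition described above can be illustrated with a simple sequential peeling process: repeatedly remove every node that has at most two remaining neighbors and put the removed nodes into the next layer. This is a sketch of the decomposition's definition only, not the $O(\log \log n)$-round MPC algorithm of the paper; the function names `peel_layers` and `max_up_neighbors` and the example tree are assumptions made for illustration.

```python
from collections import defaultdict


def peel_layers(n, edges):
    """Assign each node of a forest on nodes 0..n-1 to a layer by repeatedly
    removing all nodes whose degree among the remaining nodes is at most 2."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining = set(range(n))
    layer = {}
    current = 0
    while remaining:
        low = [v for v in remaining if len(adj[v] & remaining) <= 2]
        if not low:
            # Cannot happen in a forest, where the average degree is below 2.
            raise ValueError("input is not a forest")
        for v in low:
            layer[v] = current
        remaining.difference_update(low)
        current += 1
    return layer


def max_up_neighbors(layer, edges):
    """Largest number of neighbors any node has in the same or a higher layer."""
    count = defaultdict(int)
    for u, v in edges:
        if layer[v] >= layer[u]:
            count[u] += 1
        if layer[u] >= layer[v]:
            count[v] += 1
    return max(count.values(), default=0)


# Complete binary tree on 15 nodes: node i has children 2i+1 and 2i+2.
edges = [(i, c) for i in range(7) for c in (2 * i + 1, 2 * i + 2)]
layer = peel_layers(15, edges)
print(max(layer.values()) + 1, max_up_neighbors(layer, edges))
```

In a forest, at least a third of the remaining nodes have degree at most 2 in every round, so this peeling uses $O(\log n)$ layers; each round here costs a full sequential pass, which is exactly the cost the MPC algorithm avoids.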
Submitted 1 August, 2023;
originally announced August 2023.
-
Fully Dynamic Consistent $k$-Center Clustering
Authors:
Jakub Łącki,
Bernhard Haeupler,
Christoph Grunau,
Václav Rozhoň,
Rajesh Jayaram
Abstract:
We study the consistent k-center clustering problem. In this problem, the goal is to maintain a constant factor approximate $k$-center solution during a sequence of $n$ point insertions and deletions while minimizing the recourse, i.e., the number of changes made to the set of centers after each point insertion or deletion. Previous works by Lattanzi and Vassilvitskii [ICML '12] and Fichtenberger, Lattanzi, Norouzi-Fard, and Svensson [SODA '21] showed that in the incremental setting, where deletions are not allowed, one can obtain $k \cdot \textrm{polylog}(n) / n$ amortized recourse for both $k$-center and $k$-median, and demonstrated a matching lower bound. However, no algorithm for the fully dynamic setting achieves less than the trivial $O(k)$ changes per update, which can be obtained by simply reclustering the full dataset after every update.
In this work, we give the first algorithm for consistent $k$-center clustering in the fully dynamic setting, i.e., when both point insertions and deletions are allowed, that improves upon the trivial $O(k)$ recourse bound. Specifically, our algorithm maintains a constant factor approximate solution while ensuring worst-case constant recourse per update, which is optimal in the fully dynamic setting. Moreover, our algorithm is deterministic and is therefore correct even if an adaptive adversary chooses the insertions and deletions.
Submitted 25 July, 2023;
originally announced July 2023.
-
Noisy k-means++ Revisited
Authors:
Christoph Grunau,
Ahmet Alper Özüdoğru,
Václav Rozhoň
Abstract:
The $k$-means++ algorithm by Arthur and Vassilvitskii [SODA 2007] is a classical and time-tested algorithm for the $k$-means problem. While being very practical, the algorithm also has good theoretical guarantees: its solution is $O(\log k)$-approximate, in expectation.
In a recent work, Bhattacharya, Eube, Röglin, and Schmidt [ESA 2020] considered the following question: does the algorithm retain its guarantees if we allow for a slight adversarial noise in the sampling probability distributions used by the algorithm? This is motivated e.g. by the fact that computations with real numbers in $k$-means++ implementations are inexact.
Surprisingly, the analysis under this scenario gets substantially more difficult and the authors were able to prove only a weaker approximation guarantee of $O(\log^2 k)$. In this paper, we close the gap by providing a tight, $O(\log k)$-approximate guarantee for the $k$-means++ algorithm with noise.
Submitted 25 July, 2023;
originally announced July 2023.
-
Nearly Work-Efficient Parallel DFS in Undirected Graphs
Authors:
Mohsen Ghaffari,
Christoph Grunau,
Jiahao Qu
Abstract:
We present the first parallel depth-first search algorithm for undirected graphs that has near-linear work and sublinear depth. Concretely, in any $n$-node $m$-edge undirected graph, our algorithm computes a DFS in $\tilde{O}(\sqrt{n})$ depth and using $\tilde{O}(m+n)$ work. All prior algorithms either required $Ω(n)$ depth, and thus were essentially sequential, or needed high $poly(n)$ work, and thus were far from work-efficient.
Submitted 19 April, 2023;
originally announced April 2023.
-
Faster Deterministic Distributed MIS and Approximate Matching
Authors:
Mohsen Ghaffari,
Christoph Grunau
Abstract:
$ \renewcommand{\tilde}{\widetilde} $We present an $\tilde{O}(\log^2 n)$ round deterministic distributed algorithm for the maximal independent set problem. By known reductions, this round complexity extends also to maximal matching, $Δ+1$ vertex coloring, and $2Δ-1$ edge coloring. These four problems are among the most central problems in distributed graph algorithms and have been studied extensively for the past four decades. This improved round complexity comes closer to the $\tilde{Ω}(\log n)$ lower bound of maximal independent set and maximal matching [Balliu et al. FOCS '19]. The previous best known deterministic complexity for all of these problems was $Θ(\log^3 n)$. Via the shattering technique, the improvement permeates also to the corresponding randomized complexities, e.g., the new randomized complexity of $Δ+1$ vertex coloring is now $\tilde{O}(\log^2\log n)$ rounds.
Our approach is a novel combination of the previously known two methods for developing deterministic algorithms for these problems, namely global derandomization via network decomposition (see e.g., [Rozhon, Ghaffari STOC'20; Ghaffari, Grunau, Rozhon SODA'21; Ghaffari et al. SODA'23]) and local rounding of fractional solutions (see e.g., [Fischer DISC'17; Harris FOCS'19; Fischer, Ghaffari, Kuhn FOCS'17; Ghaffari, Kuhn FOCS'21; Faour et al. SODA'23]). We consider a relaxation of the classic network decomposition concept, where instead of requiring the clusters in the same block to be non-adjacent, we allow each node to have a small number of neighboring clusters. We also show a deterministic algorithm that computes this relaxed decomposition faster than standard decompositions. We then use this relaxed decomposition to significantly improve the integrality of certain fractional solutions, before handing them to the local rounding procedure that now has to do fewer rounding steps.
Submitted 28 March, 2023;
originally announced March 2023.
-
Parallel and Distributed Exact Single-Source Shortest Paths with Negative Edge Weights
Authors:
Vikrant Ashvinkumar,
Aaron Bernstein,
Nairen Cao,
Christoph Grunau,
Bernhard Haeupler,
Yonggang Jiang,
Danupon Nanongkai,
Hsin Hao Su
Abstract:
This paper presents parallel and distributed algorithms for single-source shortest paths when edges can have negative weights (negative-weight SSSP). We show a framework that reduces negative-weight SSSP in either setting to $n^{o(1)}$ calls to any SSSP algorithm that works with a virtual source. More specifically, for a graph with $m$ edges, $n$ vertices, undirected hop-diameter $D$, and polynomially bounded integer edge weights, we show randomized algorithms for negative-weight SSSP with (i) $W_{SSSP}(m,n)n^{o(1)}$ work and $S_{SSSP}(m,n)n^{o(1)}$ span, given access to an SSSP algorithm with $W_{SSSP}(m,n)$ work and $S_{SSSP}(m,n)$ span in the parallel model, (ii) $T_{SSSP}(n,D)n^{o(1)}$ rounds, given access to an SSSP algorithm that takes $T_{SSSP}(n,D)$ rounds in $\mathsf{CONGEST}$. This work builds off the recent result of [Bernstein, Nanongkai, Wulff-Nilsen, FOCS'22], which gives a near-linear time algorithm for negative-weight SSSP in the sequential setting.
Using current state-of-the-art SSSP algorithms yields randomized algorithms for negative-weight SSSP with (i) $m^{1+o(1)}$ work and $n^{1/2+o(1)}$ span in the parallel model, (ii) $(n^{2/5}D^{2/5} + \sqrt{n} + D)n^{o(1)}$ rounds in $\mathsf{CONGEST}$.
Our main technical contribution is an efficient reduction for computing a low-diameter decomposition (LDD) of directed graphs to computations of SSSP with a virtual source. Efficiently computing an LDD has heretofore only been known for undirected graphs in both the parallel and distributed models. The LDD is a crucial step of the algorithm in [Bernstein, Nanongkai, Wulff-Nilsen, FOCS'22], and we think that its applications to other problems in parallel and distributed models are far from being exhausted.
Submitted 1 March, 2023;
originally announced March 2023.
-
Deterministic Massively Parallel Symmetry Breaking for Sparse Graphs
Authors:
Manuela Fischer,
Jeff Giliberti,
Christoph Grunau
Abstract:
We consider the problem of designing deterministic graph algorithms for the model of Massively Parallel Computation (MPC) that improve with the sparsity of the input graph, as measured by the notion of arboricity. For the problems of maximal independent set (MIS), maximal matching (MM), and vertex coloring, we improve the state of the art as follows. Let $λ$ denote the arboricity of the $n$-node input graph with maximum degree $Δ$.
MIS and MM: We develop a deterministic low-space MPC algorithm that reduces the maximum degree to $poly(λ)$ in $O(\log \log n)$ rounds, improving and simplifying the randomized $O(\log \log n)$-round $poly(\max(λ, \log n))$-degree reduction of Ghaffari, Grunau, Jin [DISC'20]. Our approach, when combined with the state-of-the-art $O(\log Δ+ \log \log n)$-round algorithm by Czumaj, Davies, Parter [SPAA'20, TALG'21], leads to an improved deterministic round complexity of $O(\log λ+ \log \log n)$ for MIS and MM in low-space MPC.
We also extend the above MIS and MM algorithms to work with linear global memory. Specifically, we show that both problems can be solved in deterministic time $O(\min(\log n, \log λ\cdot \log \log n))$, and even in $O(\log \log n)$ time for graphs with arboricity at most $\log^{O(1)} \log n$. In this setting, only an $O(\log^2 \log n)$ running time bound for trees was known due to Latypov and Uitto [ArXiv'21].
Vertex Coloring: We present a $O(1)$-round deterministic algorithm for the problem of $O(λ)$-coloring in linear-memory MPC with relaxed global memory of $n \cdot poly(λ)$ that solves the problem after just one single graph partitioning step. This matches the state-of-the-art randomized round complexity by Ghaffari and Sayyadi [ICALP'19] and improves upon the deterministic $O(λ^ε)$-round algorithm by Barenboim and Khazanov [CSR'18].
Submitted 30 June, 2023; v1 submitted 26 January, 2023;
originally announced January 2023.
-
Massively Parallel Algorithms for $b$-Matching
Authors:
Mohsen Ghaffari,
Christoph Grunau,
Slobodan Mitrović
Abstract:
This paper presents an $O(\log\log \bar{d})$ round massively parallel algorithm for $1+ε$ approximation of maximum weighted $b$-matchings, using near-linear memory per machine. Here $\bar{d}$ denotes the average degree in the graph and $ε$ is an arbitrarily small positive constant. Recall that $b$-matching is the natural and well-studied generalization of the matching problem where different vertices are allowed to have multiple (and differing number of) incident edges in the matching. Concretely, each vertex $v$ is given a positive integer budget $b_v$ and it can have up to $b_v$ incident edges in the matching. Previously, there were known algorithms with round complexity $O(\log\log n)$, or $O(\log\log Δ)$ where $Δ$ denotes maximum degree, for $1+ε$ approximation of weighted matching and for maximal matching [Czumaj et al., STOC'18, Ghaffari et al. PODC'18; Assadi et al. SODA'19; Behnezhad et al. FOCS'19; Gamlath et al. PODC'19], but these algorithms do not extend to the more general $b$-matching problem.
Submitted 14 November, 2022;
originally announced November 2022.
-
Parallel Breadth-First Search and Exact Shortest Paths and Stronger Notions for Approximate Distances
Authors:
Václav Rozhoň,
Bernhard Haeupler,
Anders Martinsson,
Christoph Grunau,
Goran Zuzic
Abstract:
We introduce stronger notions for approximate single-source shortest-path distances, show how to efficiently compute them from weaker standard notions, and demonstrate the algorithmic power of these new notions and transformations. One application is the first work-efficient parallel algorithm for computing exact single-source shortest paths graphs -- resolving a major open problem in parallel computing.
Given a source vertex in a directed graph with polynomially-bounded nonnegative integer lengths, the algorithm computes an exact shortest path tree in $m \log^{O(1)} n$ work and $n^{1/2+o(1)}$ depth. Previously, no parallel algorithm improving the trivial linear depths of Dijkstra's algorithm without significantly increasing the work was known, even for the case of undirected and unweighted graphs (i.e., for computing a BFS-tree).
Our main result is a black-box transformation that uses $\log^{O(1)} n$ standard approximate distance computations to produce approximate distances which also satisfy the subtractive triangle inequality (up to a $(1+\varepsilon)$ factor) and even induce an exact shortest path tree in a graph with only slightly perturbed edge lengths. These strengthened approximations are algorithmically significantly more powerful and overcome well-known and often encountered barriers for using approximate distances. In directed graphs they can even be boosted to exact distances. This results in a black-box transformation of any (parallel or distributed) algorithm for approximate shortest paths in directed graphs into an algorithm computing exact distances at essentially no cost. Applying this to the recent breakthroughs of Fineman et al. for computing approximate SSSP-distances via approximate hopsets gives new parallel and distributed algorithms for exact shortest paths.
Submitted 28 October, 2022;
originally announced October 2022.
-
A Simple Deterministic Distributed Low-Diameter Clustering
Authors:
Václav Rozhoň,
Bernhard Haeupler,
Christoph Grunau
Abstract:
We give a simple, local process for nodes in an undirected graph to form non-adjacent clusters that (1) have at most a polylogarithmic diameter and (2) contain at least half of all vertices. Efficient deterministic distributed clustering algorithms for computing strong-diameter network decompositions and other key tools follow immediately. Overall, our process is a direct and drastically simplified way for computing these fundamental objects.
Submitted 21 October, 2022;
originally announced October 2022.
-
Improved Distributed Network Decomposition, Hitting Sets, and Spanners, via Derandomization
Authors:
Mohsen Ghaffari,
Christoph Grunau,
Bernhard Haeupler,
Saeed Ilchi,
Václav Rozhoň
Abstract:
This paper presents significantly improved deterministic algorithms for some of the key problems in the area of distributed graph algorithms, including network decomposition, hitting sets, and spanners. As the main ingredient in these results, we develop novel randomized distributed algorithms that we can analyze using only pairwise independence, and we can thus derandomize efficiently. As our most prominent end-result, we obtain a deterministic construction for $O(\log n)$-color $O(\log n \cdot \log\log\log n)$-strong diameter network decomposition in $\tilde{O}(\log^3 n)$ rounds. This is the first construction that achieves almost $\log n$ in both parameters, and it improves on a recent line of exciting progress on deterministic distributed network decompositions [Rozhoň, Ghaffari STOC'20; Ghaffari, Grunau, Rozhoň SODA'21; Chang, Ghaffari PODC'21; Elkin, Haeupler, Rozhoň, Grunau FOCS'22].
Submitted 23 September, 2022;
originally announced September 2022.
-
Local Distributed Rounding: Generalized to MIS, Matching, Set Cover, and Beyond
Authors:
Salwa Faour,
Mohsen Ghaffari,
Christoph Grunau,
Fabian Kuhn,
Václav Rozhoň
Abstract:
We develop a general deterministic distributed method for locally rounding fractional solutions of graph problems for which the analysis can be broken down into analyzing pairs of vertices. Roughly speaking, the method can transform fractional/probabilistic label assignments of the vertices into integral/deterministic label assignments for the vertices, while approximately preserving a potential function that is a linear combination of functions, each of which depends on at most two vertices (subject to some conditions usually satisfied in pairwise analyses). The method unifies and significantly generalizes prior work on deterministic local rounding techniques [Ghaffari, Kuhn FOCS'21; Harris FOCS'19; Fischer, Ghaffari, Kuhn FOCS'17; Fischer DISC'17] to obtain polylogarithmic-time deterministic distributed solutions for combinatorial graph problems. Our general rounding result enables us to locally and efficiently derandomize a range of distributed algorithms for local graph problems, including maximal independent set (MIS), maximum-weight independent set approximation, and minimum-cost set cover approximation. As a highlight, we in particular obtain a deterministic $O(\log^2Δ\cdot\log n)$-round algorithm for computing an MIS in the LOCAL model and an almost as efficient $O(\log^2Δ\cdot\log\logΔ\cdot\log n)$-round deterministic MIS algorithm in the CONGEST model. As a result, the best known deterministic distributed time complexity of the four most widely studied distributed symmetry breaking problems (MIS, maximal matching, $(Δ+1)$-vertex coloring, and $(2Δ-1)$-edge coloring) is now $O(\log^2Δ\cdot\log n)$. Our new MIS algorithm is also the first direct polylogarithmic-time deterministic distributed MIS algorithm, which is not based on network decomposition.
Submitted 23 September, 2022;
originally announced September 2022.
-
A Nearly Tight Analysis of Greedy k-means++
Authors:
Christoph Grunau,
Ahmet Alper Özüdoğru,
Václav Rozhoň,
Jakub Tětek
Abstract:
The famous $k$-means++ algorithm of Arthur and Vassilvitskii [SODA 2007] is the most popular way of solving the $k$-means problem in practice. The algorithm is very simple: it samples the first center uniformly at random and each of the following $k-1$ centers is then always sampled proportional to its squared distance to the closest center so far. Afterward, Lloyd's iterative algorithm is run. The $k$-means++ algorithm is known to return a $Θ(\log k)$ approximate solution in expectation.
In their seminal work, Arthur and Vassilvitskii [SODA 2007] asked about the guarantees for its following \emph{greedy} variant: in every step, we sample $\ell$ candidate centers instead of one and then pick the one that minimizes the new cost. This is also how $k$-means++ is implemented in e.g. the popular Scikit-learn library [Pedregosa et al.; JMLR 2011].
We present nearly matching lower and upper bounds for the greedy $k$-means++: We prove that it is an $O(\ell^3 \log^3 k)$-approximation algorithm. On the other hand, we prove a lower bound of $Ω(\ell^3 \log^3 k / \log^2(\ell\log k))$. Previously, only an $Ω(\ell \log k)$ lower bound was known [Bhattacharya, Eube, Röglin, Schmidt; ESA 2020] and there was no known upper bound.
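The seeding rule and its greedy variant described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation analyzed in the paper or the one in Scikit-learn: the function names `sq_dist` and `greedy_kmeanspp`, the tie-breaking, the fallback for zero total cost, and the toy point set are all choices made here, and Lloyd's iterations are omitted. Setting `ell=1` gives the standard $k$-means++ seeding.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def greedy_kmeanspp(points, k, ell, seed=0):
    """Greedy k-means++ seeding: in each step draw `ell` candidates with
    probability proportional to squared distance to the closest center so
    far, and keep the candidate that minimizes the resulting cost."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]  # first center: uniformly at random
    # d2[i] = squared distance from points[i] to its closest chosen center
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point already coincides with a center; sample uniformly.
            candidates = [rng.choice(points) for _ in range(ell)]
        else:
            candidates = rng.choices(points, weights=d2, k=ell)
        best, best_d2 = None, None
        for c in candidates:
            new_d2 = [min(d, sq_dist(p, c)) for d, p in zip(d2, points)]
            if best_d2 is None or sum(new_d2) < sum(best_d2):
                best, best_d2 = c, new_d2
        centers.append(best)
        d2 = best_d2
    return centers, sum(d2)


points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
centers, cost = greedy_kmeanspp(points, k=3, ell=2)
print(centers, cost)
```

Each step evaluates `ell` candidates against all points, so the sketch costs $O(n k \ell)$ distance computations; the approximation bounds in the abstract concern how the choice of $\ell$ affects the expected cost of the returned centers.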
Submitted 16 July, 2022;
originally announced July 2022.
-
Improved Deterministic Connectivity in Massively Parallel Computation
Authors:
Manuela Fischer,
Jeff Giliberti,
Christoph Grunau
Abstract:
A long line of research about connectivity in the Massively Parallel Computation model has culminated in the seminal works of Andoni et al. [FOCS'18] and Behnezhad et al. [FOCS'19]. They provide a randomized algorithm for low-space MPC with a round complexity of $O(\log D + \log \log_{\frac m n} n)$, which is conjectured to be optimal, and $O(m)$ space, for graphs on $n$ vertices with $m$ edges and diameter $D$. Surprisingly, a recent result of Coy and Czumaj [STOC'22] shows how to achieve the same deterministically. Unfortunately, however, their algorithm suffers from large local computation time. We present a deterministic connectivity algorithm that matches all the parameters of the randomized algorithm and, in addition, significantly reduces the local computation time to nearly linear. Our derandomization method is based on reducing the amount of randomness needed to allow for a simpler efficient search. While similar randomness reduction approaches have been used before, our result is not only strikingly simpler, but it is the first to have efficient local computation. We therefore believe it can serve as a starting point for the systematic development of computation-efficient derandomization approaches in low-memory MPC.
Submitted 16 August, 2022; v1 submitted 3 June, 2022;
originally announced June 2022.
-
Deterministic Distributed Sparse and Ultra-Sparse Spanners and Connectivity Certificates
Authors:
Marcel Bezdrighin,
Michael Elkin,
Mohsen Ghaffari,
Christoph Grunau,
Bernhard Haeupler,
Saeed Ilchi,
Václav Rozhoň
Abstract:
This paper presents efficient distributed algorithms for a number of fundamental problems in the area of graph sparsification:
We provide the first deterministic distributed algorithm that computes an ultra-sparse spanner in $\textrm{polylog}(n)$ rounds in weighted graphs. Concretely, our algorithm outputs a spanning subgraph with only $n+o(n)$ edges in which the pairwise distances are stretched by a factor of at most $O(\log n \;\cdot\; 2^{O(\log^* n)})$.
We provide a $\textrm{polylog}(n)$-round deterministic distributed algorithm that computes a spanner with stretch $(2k-1)$ and $O(nk + n^{1 + 1/k} \log k)$ edges in unweighted graphs and with $O(n^{1 + 1/k} k)$ edges in weighted graphs.
We present the first $\textrm{polylog}(n)$-round randomized distributed algorithm that computes a sparse connectivity certificate. For an $n$-node graph $G$, a certificate for connectivity $k$ is a spanning subgraph $H$ that is $k$-edge-connected if and only if $G$ is $k$-edge-connected, and this subgraph $H$ is called sparse if it has $O(nk)$ edges. Our algorithm achieves a sparsity of $(1 + o(1))nk$ edges, which is within a $2(1 + o(1))$ factor of the best possible.
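To make the notion of a sparse connectivity certificate concrete, here is a minimal sequential sketch in Python. It is not the distributed algorithm of the paper; it uses the classical construction in which the union of $k$ successive edge-disjoint maximal spanning forests forms a certificate for connectivity $k$ with at most $k(n-1)$ edges. The names `DisjointSet` and `sparse_certificate` are hypothetical and chosen only for illustration.

```python
class DisjointSet:
    """Union-find with path halving, used to grow one spanning forest."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the components of a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True


def sparse_certificate(n, edges, k):
    """Return the union of k successive edge-disjoint maximal spanning forests.

    Illustrative sequential helper (an assumption of this sketch, not the
    paper's method). The result has at most k * (n - 1) edges.
    """
    remaining = list(edges)
    certificate = []
    for _ in range(k):
        forest = DisjointSet(n)
        leftover = []
        for u, v in remaining:
            if forest.union(u, v):
                certificate.append((u, v))  # edge joins two components
            else:
                leftover.append((u, v))     # edge is kept for later forests
        remaining = leftover
    return certificate


# Example: the complete graph on 5 vertices with k = 2.
edges = [(i, j) for i in range(5) for j in range(i + 1, 5)]
certificate = sparse_certificate(5, edges, 2)
```

For comparison with the bound stated above: every vertex of a $k$-edge-connected graph has degree at least $k$, so any such certificate needs at least $nk/2$ edges, which is why $(1 + o(1))nk$ edges is within a $2(1 + o(1))$ factor of the best possible.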
Submitted 23 September, 2022; v1 submitted 29 April, 2022;
originally announced April 2022.
-
Deterministic Distributed algorithms and Descriptive Combinatorics on Δ-regular trees
Authors:
Sebastian Brandt,
Yi-Jun Chang,
Jan Grebík,
Christoph Grunau,
Václav Rozhoň,
Zoltán Vidnyánszky
Abstract:
We study complexity classes of local problems on regular trees from the perspective of distributed local algorithms and descriptive combinatorics. We show that, surprisingly, some deterministic local complexity classes from the hierarchy of distributed computing exactly coincide with well studied classes of problems in descriptive combinatorics. Namely, we show that a local problem admits a continuous solution if and only if it admits a local algorithm with local complexity $O(\log^* n)$, and a Baire measurable solution if and only if it admits a local algorithm with local complexity $O(\log n)$.
Submitted 20 April, 2022;
originally announced April 2022.
-
Deterministic Low-Diameter Decompositions for Weighted Graphs and Distributed and Parallel Applications
Authors:
Václav Rozhoň,
Michael Elkin,
Christoph Grunau,
Bernhard Haeupler
Abstract:
This paper presents new deterministic and distributed low-diameter decomposition algorithms for weighted graphs. In particular, we show that if one can efficiently compute approximate distances in a parallel or a distributed setting, one can also efficiently compute low-diameter decompositions. This consequently implies solutions to many fundamental distance based problems using a polylogarithmic number of approximate distance computations.
Our low-diameter decomposition generalizes and extends the line of work starting from [Rozhoň, Ghaffari STOC 2020] to weighted graphs in a very model-independent manner. Moreover, our clustering results have additional useful properties, including strong-diameter guarantees, separation properties, restricting cluster centers to specified terminals, and more. Applications include:
-- The first near-linear work and polylogarithmic depth randomized and deterministic parallel algorithm for low-stretch spanning trees (LSST) with polylogarithmic stretch. Previously, the best parallel LSST algorithm required $m \cdot n^{o(1)}$ work and $n^{o(1)}$ depth and was inherently randomized. No deterministic LSST algorithm with truly sub-quadratic work and sub-linear depth was known.
-- The first near-linear work and polylogarithmic depth deterministic algorithm for computing an $\ell_1$-embedding into polylogarithmic dimensional space with polylogarithmic distortion. The best prior deterministic algorithms for $\ell_1$-embeddings either require large polynomial work or are inherently sequential.
Even when we apply our techniques to the classical problem of computing a ball-carving with strong-diameter $O(\log^2 n)$ in an unweighted graph, our new clustering algorithm still leads to an improvement in round complexity from $O(\log^{10} n)$ rounds [Chang, Ghaffari PODC 21] to $O(\log^{4} n)$.
Submitted 3 September, 2022; v1 submitted 18 April, 2022;
originally announced April 2022.
-
Undirected $(1+\varepsilon)$-Shortest Paths via Minor-Aggregates: Near-Optimal Deterministic Parallel & Distributed Algorithms
Authors:
Václav Rozhoň,
Christoph Grunau,
Bernhard Haeupler,
Goran Zuzic,
Jason Li
Abstract:
This paper presents near-optimal deterministic parallel and distributed algorithms for computing $(1+\varepsilon)$-approximate single-source shortest paths in any undirected weighted graph.
On a high level, we deterministically reduce this and other shortest-path problems to $\tilde{O}(1)$ Minor-Aggregations. A Minor-Aggregation computes an aggregate (e.g., max or sum) of node-values for every connected component of some subgraph.
Our reduction immediately implies:
- Optimal deterministic parallel (PRAM) algorithms with $\tilde{O}(1)$ depth and near-linear work.
- Universally-optimal deterministic distributed (CONGEST) algorithms, whenever deterministic Minor-Aggregate algorithms exist. For example, an optimal $\tilde{O}(HopDiameter(G))$-round deterministic CONGEST algorithm for excluded-minor networks.
Several novel tools developed for the above results are interesting in their own right:
- A local iterative approach for reducing shortest path computations "up to distance $D$" to computing low-diameter decompositions "up to distance $\frac{D}{2}$". Compared to the recursive vertex-reduction approach of [Li20], our approach is simpler, suitable for distributed algorithms, and eliminates many derandomization barriers.
- A simple graph-based $\tilde{O}(1)$-competitive $\ell_1$-oblivious routing based on low-diameter decompositions that can be evaluated in near-linear work. The previous such routing [ZGY+20] was $n^{o(1)}$-competitive and required $n^{o(1)}$ more work.
- A deterministic algorithm to round any fractional single-source transshipment flow into an integral tree solution.
- The first distributed algorithms for computing Eulerian orientations.
Submitted 23 September, 2022; v1 submitted 12 April, 2022;
originally announced April 2022.
-
The Landscape of Distributed Complexities on Trees and Beyond
Authors:
Christoph Grunau,
Vaclav Rozhon,
Sebastian Brandt
Abstract:
We study the local complexity landscape of locally checkable labeling (LCL) problems on constant-degree graphs with a focus on complexities below $\log^* n$.
Our contribution is threefold:
Our main contribution is that we complete the classification of the complexity landscape of LCL problems on trees in the LOCAL model, by proving that every LCL problem with local complexity $o(\log^* n)$ actually has complexity $O(1)$. This result improves upon the previous speedup result from $o(\log \log^* n)$ to $O(1)$ by [Chang, Pettie, FOCS 2017].
In the related LCA and Volume models [Alon, Rubinfeld, Vardi, Xie, SODA 2012, Rubinfeld, Tamir, Vardi, Xie, 2011, Rosenbaum, Suomela, PODC 2020], we prove the same speedup from $o(\log^* n)$ to $O(1)$ for all bounded degree graphs.
Similarly, we complete the classification of the LOCAL complexity landscape of oriented $d$-dimensional grids by proving that any LCL problem with local complexity $o(\log^* n)$ actually has complexity $O(1)$. This improves upon the previous speed-up from $o(\sqrt[d]{\log^* n})$ by Suomela in [Chang, Pettie, FOCS 2017].
Submitted 23 September, 2022; v1 submitted 9 February, 2022;
originally announced February 2022.
-
On Homomorphism Graphs
Authors:
Sebastian Brandt,
Yi-Jun Chang,
Jan Grebík,
Christoph Grunau,
Václav Rozhoň,
Zoltán Vidnyánszky
Abstract:
We introduce a new type of examples of bounded degree acyclic Borel graphs and study their combinatorial properties in the context of descriptive combinatorics, using a generalization of the determinacy method of Marks. The motivation for the construction comes from the adaptation of this method to the LOCAL model of distributed computing. Our approach unifies the previous results in the area, as well as produces new ones. In particular, we show that for $Δ>2$ it is impossible to give a simple characterization of acyclic $Δ$-regular Borel graphs with Borel chromatic number at most $Δ$: such graphs form a $\mathbf{Σ}^1_2$-complete set. This implies a strong failure of Brooks'-like theorems in the Borel context.
Submitted 29 April, 2024; v1 submitted 5 November, 2021;
originally announced November 2021.
-
Local Problems on Trees from the Perspectives of Distributed Algorithms, Finitary Factors, and Descriptive Combinatorics
Authors:
Sebastian Brandt,
Yi-Jun Chang,
Jan Grebík,
Christoph Grunau,
Václav Rozhoň,
Zoltán Vidnyánszky
Abstract:
We study connections between distributed local algorithms, finitary factors of iid processes, and descriptive combinatorics in the context of regular trees.
We extend the Borel determinacy technique of Marks coming from descriptive combinatorics and adapt it to the area of distributed computing. Using this technique, we prove deterministic distributed $Ω(\log n)$-round lower bounds for problems from a natural class of homomorphism problems. Interestingly, these lower bounds seem beyond the current reach of the powerful round elimination technique responsible for all substantial locality lower bounds of recent years. Our key technical ingredient is a novel ID graph technique that we expect to be of independent interest.
We prove that a local problem admits a Baire measurable coloring if and only if it admits a local algorithm with local complexity $O(\log n)$, extending the classification of Baire measurable colorings of Bernshteyn. A key ingredient of the proof is a new and simple characterization of local problems that can be solved in $O(\log n)$ rounds. We complement this result by showing separations between complexity classes from distributed computing, finitary factors, and descriptive combinatorics. Most notably, the class of problems that allow a distributed algorithm with sublogarithmic randomized local complexity is incomparable with the class of problems with a Borel solution.
We hope that our treatment will help to view all three perspectives as part of a common theory of locality, in which we follow the insightful paper of [Bernshteyn -- arXiv 2004.04905].
Submitted 3 June, 2021;
originally announced June 2021.
-
The randomized local computation complexity of the Lovász local lemma
Authors:
Sebastian Brandt,
Christoph Grunau,
Václav Rozhoň
Abstract:
The Local Computation Algorithm (LCA) model is a popular model in the field of sublinear-time algorithms that measures the complexity of an algorithm by the number of probes the algorithm makes in the neighborhood of one node to determine that node's output.
In this paper we show that the randomized LCA complexity of the Lovász Local Lemma (LLL) on constant degree graphs is $Θ(\log n)$. The lower bound follows by proving an $Ω(\log n)$ lower bound for the Sinkless Orientation problem introduced in [Brandt et al. STOC 2016]. This answers a question of [Rosenbaum, Suomela PODC 2020].
Additionally, we show that every randomized LCA algorithm for a locally checkable problem with a probe complexity of $o(\sqrt{\log{n}})$ can be turned into a deterministic LCA algorithm with a probe complexity of $O(\log^* n)$. This improves exponentially upon the currently best known speed-up result from $o(\log \log n)$ to $O(\log^* n)$ implied by the result of [Chang, Pettie FOCS 2017] in the LOCAL model.
Finally, we show that for every fixed constant $c \geq 2$, the deterministic VOLUME complexity of $c$-coloring a bounded degree tree is $Θ(n)$, where the VOLUME model is a close relative of the LCA model that was recently introduced by [Rosenbaum, Suomela PODC 2020].
Submitted 3 December, 2021; v1 submitted 30 March, 2021;
originally announced March 2021.
-
Improved Deterministic Network Decomposition
Authors:
Mohsen Ghaffari,
Christoph Grunau,
Václav Rozhoň
Abstract:
Network decomposition is a central tool in distributed graph algorithms. We present two improvements on the state of the art for network decomposition, which thus lead to improvements in the (deterministic and randomized) complexity of several well-studied graph problems.
- We provide a deterministic distributed network decomposition algorithm with $O(\log^5 n)$ round complexity, using $O(\log n)$-bit messages. This improves on the $O(\log^7 n)$-round algorithm of Rozhoň and Ghaffari [STOC'20], which used large messages, and their $O(\log^8 n)$-round algorithm with $O(\log n)$-bit messages. This directly leads to similar improvements for a wide range of deterministic and randomized distributed algorithms, whose solution relies on network decomposition, including the general distributed derandomization of Ghaffari, Kuhn, and Harris [FOCS'18].
- One drawback of the algorithm of Rozhoň and Ghaffari, in the $\mathsf{CONGEST}$ model, was its dependence on the length of the identifiers. Because of this, for instance, the algorithm could not be used in the shattering framework in the $\mathsf{CONGEST}$ model. Thus, the state of the art randomized complexity of several problems in this model remained with an additive $2^{O(\sqrt{\log\log n})}$ term, which was a clear leftover of the older network decomposition complexity [Panconesi and Srinivasan STOC'92]. We present a modified version that remedies this, constructing a decomposition whose quality does not depend on the identifiers, and thus improves the randomized round complexity for various problems.
Submitted 16 July, 2020;
originally announced July 2020.
-
Adapting $k$-means algorithms for outliers
Authors:
Christoph Grunau,
Václav Rozhoň
Abstract:
This paper shows how to adapt several simple and classical sampling-based algorithms for the $k$-means problem to the setting with outliers.
Recently, Bhaskara et al. (NeurIPS 2019) showed how to adapt the classical $k$-means++ algorithm to the setting with outliers. However, their algorithm needs to output $O(\log (k) \cdot z)$ outliers, where $z$ is the number of true outliers, to match the $O(\log k)$-approximation guarantee of $k$-means++. In this paper, we build on their ideas and show how to adapt several sequential and distributed $k$-means algorithms to the setting with outliers, but with substantially stronger theoretical guarantees: our algorithms output $(1+\varepsilon)z$ outliers while achieving an $O(1 / \varepsilon)$-approximation to the objective function. In the sequential world, we achieve this by adapting a recent algorithm of Lattanzi and Sohler (ICML 2019). In the distributed setting, we adapt a simple algorithm of Guha et al. (IEEE Trans. Know. and Data Engineering 2003) and the popular $k$-means$\|$ of Bahmani et al. (PVLDB 2012).
A theoretical application of our techniques is an algorithm with running time $\tilde{O}(nk^2/z)$ that achieves an $O(1)$-approximation to the objective function while outputting $O(z)$ outliers, assuming $k \ll z \ll n$. This is complemented with a matching lower bound of $Ω(nk^2/z)$ for this problem in the oracle model.
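As a small illustration of the objective discussed above, the following Python sketch evaluates the $k$-means cost with outliers for a fixed set of centers. It only assumes the standard definition of the objective: for fixed centers, the best choice of $z$ outliers is the $z$ points farthest from their nearest center. The function name `kmeans_cost_with_outliers` is hypothetical, and the sketch does not implement any of the algorithms from the paper.

```python
def kmeans_cost_with_outliers(points, centers, z):
    """Sum of squared distances to the nearest center, ignoring the z
    farthest points (for fixed centers, discarding those is optimal)."""
    sq_dists = sorted(
        min(sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers)
        for point in points
    )
    # Keep all but the z largest squared distances.
    kept = sq_dists[: max(len(sq_dists) - z, 0)]
    return sum(kept)


points = [(0, 0), (1, 0), (10, 10), (100, 100)]
centers = [(0, 0)]
cost_no_outliers = kmeans_cost_with_outliers(points, centers, 0)
cost_one_outlier = kmeans_cost_with_outliers(points, centers, 1)
```

In this toy instance a single far-away point dominates the cost, which is the situation that motivates allowing outliers in the first place.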
Submitted 23 September, 2022; v1 submitted 2 July, 2020;
originally announced July 2020.
-
Generalizing the Sharp Threshold Phenomenon for the Distributed Complexity of the Lovász Local Lemma
Authors:
Sebastian Brandt,
Christoph Grunau,
Václav Rozhoň
Abstract:
Recently, Brandt, Maus and Uitto [PODC'19] showed that, in a restricted setting, the dependency of the complexity of the distributed Lovász Local Lemma (LLL) on the chosen LLL criterion exhibits a sharp threshold phenomenon: They proved that, under the LLL criterion $p2^d < 1$, if each random variable affects at most $3$ events, the deterministic complexity of the LLL in the LOCAL model is $O(d^2 + \log^* n)$. In stark contrast, under the criterion $p2^d \leq 1$, there is a randomized lower bound of $Ω(\log \log n)$ by Brandt et al. [STOC'16] and a deterministic lower bound of $Ω(\log n)$ by Chang, Kopelowitz and Pettie [FOCS'16]. Brandt, Maus and Uitto conjectured that the same behavior holds for the unrestricted setting where each random variable affects arbitrarily many events.
We prove their conjecture, by providing an algorithm that solves the LLL in time $O(d^2 + \log^* n)$ under the LLL criterion $p2^d < 1$, which is tight in bounded-degree graphs due to an $Ω(\log^* n)$ lower bound by Chung, Pettie and Su [PODC'14]. By the work of Brandt, Maus and Uitto, obtaining such an algorithm can be reduced to proving that all members in a certain family of functions in arbitrarily high dimensions are convex on some specific domain. Unfortunately, an analytical description of these functions is known only for dimension at most $3$, which led to the aforementioned restriction of their result. While obtaining those descriptions for functions of (substantially) higher dimension seems out of the reach of current techniques, we show that their convexity can be inferred by combinatorial means.
Submitted 8 June, 2020;
originally announced June 2020.
-
Improved MPC Algorithms for MIS, Matching, and Coloring on Trees and Beyond
Authors:
Mohsen Ghaffari,
Christoph Grunau,
Ce Jin
Abstract:
We present $O(\log\log n)$ round scalable Massively Parallel Computation algorithms for maximal independent set and maximal matching, in trees and more generally graphs of bounded arboricity, as well as for constant coloring trees. Following the standards, by a scalable MPC algorithm, we mean that these algorithms can work on machines that have capacity/memory as small as $n^δ$ for any positive constant $δ<1$. Our results improve over the $O(\log^2\log n)$ round algorithms of Behnezhad et al. [PODC'19]. Moreover, our matching algorithm is presumably optimal as its bound matches an $Ω(\log\log n)$ conditional lower bound of Ghaffari, Kuhn, and Uitto [FOCS'19].
Submitted 10 August, 2020; v1 submitted 21 February, 2020;
originally announced February 2020.
-
k-means++: few more steps yield constant approximation
Authors:
Davin Choo,
Christoph Grunau,
Julian Portmann,
Václav Rozhoň
Abstract:
The k-means++ algorithm of Arthur and Vassilvitskii (SODA 2007) is a state-of-the-art algorithm for solving the k-means clustering problem and is known to give an O(log k)-approximation in expectation. Recently, Lattanzi and Sohler (ICML 2019) proposed augmenting k-means++ with O(k log log k) local search steps to yield a constant approximation (in expectation) to the k-means clustering problem. In this paper, we improve their analysis to show that, for any arbitrarily small constant $\varepsilon > 0$, with only $\varepsilon k$ additional local search steps, one can achieve a constant approximation guarantee (with high probability in k), resolving an open problem in their paper.
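The two ingredients mentioned above can be sketched in a few lines of Python. This is a minimal, unoptimized illustration under stated assumptions, not the authors' implementation: `kmeanspp_seed` performs the usual $D^2$-sampling seeding of k-means++, and `local_search_step` follows the general shape of a Lattanzi-Sohler style step (sample a candidate by $D^2$ sampling, then apply the best cost-reducing swap with an existing center, if any). All function names are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def cost(points, centers):
    """k-means cost: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def d2_sample(points, centers, rng):
    """Sample a point with probability proportional to its squared
    distance to the nearest current center (D^2 sampling)."""
    weights = [min(sq_dist(p, c) for c in centers) for p in points]
    if sum(weights) == 0:
        # Every point coincides with a center; fall back to uniform.
        return rng.choice(points)
    return rng.choices(points, weights=weights, k=1)[0]


def kmeanspp_seed(points, k, rng):
    """k-means++ seeding: one uniform center, then k - 1 D^2 samples."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(d2_sample(points, centers, rng))
    return centers


def local_search_step(points, centers, rng):
    """Try to swap one center for a D^2-sampled candidate and keep the
    swap that lowers the cost the most; otherwise keep the centers."""
    candidate = d2_sample(points, centers, rng)
    best_centers, best_cost = centers, cost(points, centers)
    for i in range(len(centers)):
        swapped = centers[:i] + [candidate] + centers[i + 1:]
        swapped_cost = cost(points, swapped)
        if swapped_cost < best_cost:
            best_centers, best_cost = swapped, swapped_cost
    return best_centers


rng = random.Random(0)
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.0, 0.2)]
centers = kmeanspp_seed(points, 3, rng)
improved = local_search_step(points, centers, rng)
```

By construction a local search step never increases the cost, since a swap is applied only when it strictly improves on the current centers.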
Submitted 18 February, 2020;
originally announced February 2020.
-
Improved Local Computation Algorithm for Set Cover via Sparsification
Authors:
Christoph Grunau,
Slobodan Mitrović,
Ronitt Rubinfeld,
Ali Vakilian
Abstract:
We design a Local Computation Algorithm (LCA) for the set cover problem. Given a set system where each set has size at most $s$ and each element is contained in at most $t$ sets, the algorithm reports whether a given set is in some fixed set cover whose expected size is $O(\log{s})$ times the minimum fractional set cover value. Our algorithm requires $s^{O(\log{s})} t^{O(\log{s} \cdot (\log \log{s} + \log \log{t}))}$ queries. This result improves upon the application of the reduction of [Parnas and Ron, TCS'07] on the result of [Kuhn et al., SODA'06], which leads to a query complexity of $(st)^{O(\log{s} \cdot \log{t})}$.
To obtain this result, we design a parallel set cover algorithm that admits an efficient simulation in the LCA model by using a sparsification technique introduced in [Ghaffari and Uitto, SODA'19] for the maximal independent set problem. The parallel algorithm adds a random subset of the sets to the solution in a style similar to the PRAM algorithm of [Berger et al., FOCS'89]. However, our algorithm differs in that it never revokes its decisions, which results in fewer adaptive rounds. This requires a novel approximation analysis which might be of independent interest.
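For reference, the minimum fractional set cover value used as the benchmark above is the optimum of the standard linear programming relaxation of set cover. The variable names $x_S$ below are notation assumed here for illustration, with one variable per set $S$ of the set system:

$$\min \sum_{S} x_S \quad \text{subject to} \quad \sum_{S \ni e} x_S \geq 1 \ \text{ for every element } e, \qquad x_S \geq 0 \ \text{ for every set } S.$$

Every integral set cover corresponds to a feasible solution with $x_S \in \{0, 1\}$, so the fractional value is at most the size of a minimum set cover. A bound of $O(\log{s})$ times the fractional value is therefore at least as strong as the same bound relative to the integral optimum.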
Submitted 5 November, 2019; v1 submitted 30 October, 2019;
originally announced October 2019.