-
Fully Scalable MPC Algorithms for Clustering in High Dimension
Authors:
Artur Czumaj,
Guichen Gao,
Shaofeng H. -C. Jiang,
Robert Krauthgamer,
Pavel Veselý
Abstract:
We design new parallel algorithms for clustering in high-dimensional Euclidean spaces. These algorithms run in the Massively Parallel Computation (MPC) model, and are fully scalable, meaning that the local memory in each machine may be $n^σ$ for arbitrarily small fixed $σ>0$. Importantly, the local memory may be substantially smaller than the number of clusters $k$, yet all our algorithms are fast, i.e., run in $O(1)$ rounds.
We first devise a fast MPC algorithm for $O(1)$-approximation of uniform facility location. This is the first fully-scalable MPC algorithm that achieves $O(1)$-approximation for any clustering problem in a general geometric setting; previous algorithms only provide $\mathrm{poly}(\log n)$-approximation or apply to restricted inputs, like low dimension or a small number of clusters $k$; e.g. [Bhaskara and Wijewardena, ICML'18; Cohen-Addad et al., NeurIPS'21; Cohen-Addad et al., ICML'22]. We then build on this facility location result and devise a fast MPC algorithm that achieves $O(1)$-bicriteria approximation for $k$-Median and for $k$-Means, namely, it computes $(1+\varepsilon)k$ clusters of cost within $O(1/\varepsilon^2)$-factor of the optimum for $k$ clusters.
A primary technical tool that we introduce, and that may be of independent interest, is a new MPC primitive for geometric aggregation, namely, computing for every data point a statistic of its approximate neighborhood, for statistics like range counting and nearest-neighbor search. Our implementation of this primitive works in high dimension, and is based on consistent hashing (aka sparse partition), a technique that was recently used for streaming algorithms [Czumaj et al., FOCS'22].
Submitted 6 July, 2024; v1 submitted 15 July, 2023;
originally announced July 2023.
-
Optimal (degree+1)-Coloring in Congested Clique
Authors:
Sam Coy,
Artur Czumaj,
Peter Davies,
Gopinath Mishra
Abstract:
We consider the distributed complexity of the (degree+1)-list coloring problem, in which each node $u$ of degree $d(u)$ is assigned a palette of $d(u)+1$ colors, and the goal is to find a proper coloring using these color palettes. The (degree+1)-list coloring problem is a natural generalization of the classical $(Δ+1)$-coloring and $(Δ+1)$-list coloring problems, both being benchmark problems extensively studied in distributed and parallel computing.
In this paper we settle the complexity of the (degree+1)-list coloring problem in the Congested Clique model by showing that it can be solved deterministically in a constant number of rounds.
Submitted 24 April, 2024; v1 submitted 21 June, 2023;
originally announced June 2023.
-
On Parallel k-Center Clustering
Authors:
Sam Coy,
Artur Czumaj,
Gopinath Mishra
Abstract:
We consider the classic $k$-center problem in a parallel setting, on the low-local-space Massively Parallel Computation (MPC) model, with local space per machine of $\mathcal{O}(n^δ)$, where $δ\in (0,1)$ is an arbitrary constant. As a central clustering problem, the $k$-center problem has been studied extensively. Still, until very recently, all parallel MPC algorithms required $Ω(k)$ or even $Ω(k n^δ)$ local space per machine. While this setting covers the case of small values of $k$, for a large number of clusters these algorithms require large local memory, making them poorly scalable. The case of large $k$, $k \ge Ω(n^δ)$, has been considered recently for the low-local-space MPC model by Bateni et al. (2021), who gave an $\mathcal{O}(\log \log n)$-round MPC algorithm that produces $k(1+o(1))$ centers whose cost has a multiplicative approximation of $\mathcal{O}(\log\log\log n)$. In this paper we extend the algorithm of Bateni et al. and design a low-local-space MPC algorithm that in $\mathcal{O}(\log\log n)$ rounds returns a clustering with $k(1+o(1))$ clusters that is an $\mathcal{O}(\log^*n)$-approximation for $k$-center.
Submitted 12 April, 2023;
originally announced April 2023.
-
Parallel Derandomization for Coloring
Authors:
Sam Coy,
Artur Czumaj,
Peter Davies,
Gopinath Mishra
Abstract:
Graph coloring problems are among the most fundamental problems in parallel and distributed computing, and have been studied extensively in both settings. In this context, designing efficient deterministic algorithms for these problems has been found particularly challenging.
In this work we consider this challenge, and design a novel framework for derandomizing algorithms for coloring-type problems in the Massively Parallel Computation (MPC) model with sublinear space. We give an application of this framework by showing that a recent (degree+1)-list coloring algorithm by Halldorsson et al. (STOC'22) in the LOCAL model of distributed computation can be translated to the MPC model and efficiently derandomized. Our algorithm runs in $O(\log \log \log n)$ rounds, which matches the complexity of the state-of-the-art algorithm for the $(Δ+1)$-coloring problem.
Submitted 25 April, 2024; v1 submitted 8 February, 2023;
originally announced February 2023.
-
Routing Schemes for Hybrid Communication Networks
Authors:
Sam Coy,
Artur Czumaj,
Christian Scheideler,
Philipp Schneider,
Julian Werthmann
Abstract:
We consider the problem of computing routing schemes in the $\mathsf{HYBRID}$ model of distributed computing where nodes have access to two fundamentally different communication modes. In this problem nodes have to compute small labels and routing tables that allow for efficient routing of messages in the local network, which typically offers the majority of the throughput. Recent work has shown that using the $\mathsf{HYBRID}$ model admits a significant speed-up compared to what would be possible if either communication mode were used in isolation. Nonetheless, if the input graph is a general graph, the computation of routing schemes still takes a polynomial number of rounds in the $\mathsf{HYBRID}$ model. We bypass this lower bound by restricting the local graph to unit-disc graphs and solve the problem deterministically with running time $O(|\mathcal H|^2 + \log n)$, label size $O(\log n)$, and size of routing tables $O(|\mathcal H|^2 \cdot \log n)$ where $|\mathcal H|$ is the number of "radio holes" in the network. Our work builds on recent work by Coy et al., who obtain this result in the much simpler setting where the input graph has no radio holes. We develop new techniques to achieve this, including a decomposition of the local graph into path-convex regions, where each region contains a shortest path for any pair of nodes in it.
Submitted 13 March, 2023; v1 submitted 11 October, 2022;
originally announced October 2022.
-
Streaming Facility Location in High Dimension via Geometric Hashing
Authors:
Artur Czumaj,
Arnold Filtser,
Shaofeng H. -C. Jiang,
Robert Krauthgamer,
Pavel Veselý,
Mingwei Yang
Abstract:
In Euclidean Uniform Facility Location (UFL), the input is a set of clients in $\mathbb{R}^d$ and the goal is to place facilities to serve them, so as to minimize the total cost of opening facilities plus connecting the clients. We study the setting of dynamic geometric streams, where the clients are presented as a sequence of insertions and deletions of points in the grid $\{1,\ldots,Δ\}^d$, and we focus on the \emph{high-dimensional regime}, where the algorithm must use space polynomial in $d\cdot\logΔ$.
We present a new algorithmic framework, based on importance sampling, for $O(1)$-approximation of UFL using only $\mathrm{poly}(d\cdot\logΔ)$ space. This framework is easy to implement in two passes, one for sampling points and the other for estimating their contribution. Over random-order streams, we can extend this to one pass by using the two halves of the stream separately. Our main result, for arbitrary-order streams, computes $O(d / \log d)$-approximation in one pass by combining the two passes differently. This improves upon previous algorithms that either need space $\exp(d)$ or only guarantee $O(d\cdot\log^2Δ)$-approximation, and therefore our algorithms for high dimension are the first to avoid the $O(\logΔ)$-factor in approximation that is inherent to the widely-used quadtree decomposition. Our improvement is achieved by employing a geometric hashing scheme that maps points in $\mathbb{R}^d$ into buckets of bounded diameter, with the key property that every point set of small-enough diameter is hashed into few buckets. By applying an alternative bound for this hashing, we also obtain an $O(1 / ε)$-approximation in one pass, using larger but still sublinear space $O(n^ε)$ where $n$ is the number of clients.
We complement our results by showing that $1.085$-approximation requires space exponential in $\mathrm{poly}(d\cdot\logΔ)$.
Submitted 28 January, 2023; v1 submitted 5 April, 2022;
originally announced April 2022.
-
Near-Shortest Path Routing in Hybrid Communication Networks
Authors:
Sam Coy,
Artur Czumaj,
Michael Feldmann,
Kristian Hinnenthal,
Fabian Kuhn,
Christian Scheideler,
Philipp Schneider,
Martijn Struijs
Abstract:
Hybrid networks, i.e., networks that leverage different means of communication, are becoming ever more widespread. To allow theoretical study of such networks, [Augustine et al., SODA'20] introduced the $\mathsf{HYBRID}$ model, which is based on the concept of synchronous message passing and uses two fundamentally different principles of communication: a local mode, which allows every node to exchange one message per round with each neighbor in a local communication graph; and a global mode where any pair of nodes can exchange messages, but only a few such exchanges can take place per round.
A sizable portion of the previous research for the $\mathsf{HYBRID}$ model revolves around basic communication primitives and computing distances or shortest paths in networks. In this paper, we extend this study to a related fundamental problem of computing compact routing schemes for near-shortest paths in the local communication graph. We demonstrate that, for the case where the local communication graph is a unit-disc graph with $n$ nodes that is realized in the plane and has no radio holes, we can deterministically compute a routing scheme that has constant stretch and uses labels and local routing tables of size $O(\log n)$ bits in only $O(\log n)$ rounds.
Submitted 16 February, 2022;
originally announced February 2022.
-
Improved Deterministic $(Δ+1)$-Coloring in Low-Space MPC
Authors:
Artur Czumaj,
Peter Davies,
Merav Parter
Abstract:
We present a deterministic $O(\log \log \log n)$-round low-space Massively Parallel Computation (MPC) algorithm for the classical problem of $(Δ+1)$-coloring on $n$-vertex graphs. In this model, every machine has a sublinear local memory of size $n^φ$ for any arbitrary constant $φ\in (0,1)$. Our algorithm works under the relaxed setting where each machine is allowed to perform exponential (in $n^φ$) local computation, while respecting the $n^φ$ space and bandwidth limitations.
Our key technical contribution is a novel derandomization of the ingenious $(Δ+1)$-coloring LOCAL algorithm by Chang-Li-Pettie (STOC 2018, SIAM J. Comput. 2020). The Chang-Li-Pettie algorithm runs in $T_{local}=\mathrm{poly}(\log\log n)$ rounds, which sets the state-of-the-art randomized round complexity for the problem in the LOCAL model. Our derandomization employs a combination of tools, most notably pseudorandom generators (PRG) and bounded-independence hash functions.
The achieved round complexity of $O(\log\log\log n)$ rounds matches the bound of $\log(T_{local})$, which currently serves as an upper bound barrier for all known randomized algorithms for locally-checkable problems in this model. Furthermore, no deterministic sublogarithmic low-space MPC algorithms for the $(Δ+1)$-coloring problem were previously known.
Submitted 10 December, 2021;
originally announced December 2021.
-
On Truly Parallel Time in Population Protocols
Authors:
Artur Czumaj,
Andrzej Lingas
Abstract:
The {\em parallel time} of a population protocol is defined as the average number of required interactions in which an agent in the protocol participates, i.e., the quotient between the total number of interactions required by the protocol and the total number $n$ of agents, or just roughly the number of required rounds with $n$ interactions. This naming triggers an intuition that, at least on average, a round of $n$ interactions can be implemented in $O(1)$ parallel steps. We show that when the transition function of a population protocol is treated as a black box, the expected maximum number of parallel steps necessary to implement a round of $n$ interactions is $Ω(\frac {\log n}{\log \log n})$. We also provide a combinatorial argument for a matching upper bound on the number of parallel steps in the average case under additional assumptions.
△ Less
Submitted 26 August, 2021;
originally announced August 2021.
-
Deterministic Massively Parallel Connectivity
Authors:
Sam Coy,
Artur Czumaj
Abstract:
We consider the problem of designing fundamental graph algorithms on the model of Massively Parallel Computation (MPC). The input to the problem is an undirected graph $G$ with $n$ vertices and $m$ edges, and with $D$ being the maximum diameter of any connected component in $G$. We consider the MPC with low local space, allowing each machine to store only $Θ(n^δ)$ words for an arbitrary constant $δ> 0$, and with linear global space (which is equal to the number of machines times the local space available), that is, with optimal utilization.
In a recent breakthrough, Andoni et al. (FOCS 18) and Behnezhad et al. (FOCS 19) designed parallel randomized algorithms that in $O(\log D + \log \log n)$ rounds on an MPC with low local space determine all connected components of an input graph, improving upon the classic bound of $O(\log n)$ derived from earlier works on PRAM algorithms.
In this paper, we show that asymptotically identical bounds can also be achieved for deterministic algorithms: we present a deterministic low-local-space MPC algorithm that in $O(\log D + \log \log n)$ rounds determines all connected components of the input graph.
Submitted 9 August, 2021;
originally announced August 2021.
-
Component Stability in Low-Space Massively Parallel Computation
Authors:
Artur Czumaj,
Peter Davies,
Merav Parter
Abstract:
We study the power and limitations of component-stable algorithms in the low-space model of Massively Parallel Computation (MPC). Recently Ghaffari, Kuhn and Uitto (FOCS 2019) introduced the class of component-stable low-space MPC algorithms, which are, informally, defined as algorithms for which the outputs reported by the nodes in different connected components are required to be independent. This very natural notion was introduced to capture most (if not all) of the known efficient MPC algorithms to date, and it was the first general class of MPC algorithms for which one can show non-trivial conditional lower bounds. In this paper we enhance the framework of component-stable algorithms and investigate its effect on the complexity of randomized and deterministic low-space MPC. Our key contributions include:
1) We revise and formalize the lifting approach of Ghaffari, Kuhn and Uitto. This requires a very delicate amendment of the notion of component stability, which allows us to fill in gaps in the earlier arguments.
2) We also extend the framework to obtain conditional lower bounds for deterministic algorithms and fine-grained lower bounds that depend on the maximum degree $Δ$.
3) We demonstrate a collection of natural graph problems for which non-component-stable algorithms break the conditional lower bound obtained for component-stable algorithms. This implies that, for both deterministic and randomized algorithms, component-stable algorithms are conditionally weaker than the non-component-stable ones.
Altogether our results imply that component-stability might limit the computational power of the low-space MPC model, at least in certain contexts, paving the way for improved upper bounds that escape the conditional lower bound setting of Ghaffari, Kuhn, and Uitto.
Submitted 3 June, 2021;
originally announced June 2021.
-
Streaming Algorithms for Geometric Steiner Forest
Authors:
Artur Czumaj,
Shaofeng H. -C. Jiang,
Robert Krauthgamer,
Pavel Veselý
Abstract:
We consider an important generalization of the Steiner tree problem, the \emph{Steiner forest problem}, in the Euclidean plane: the input is a multiset $X \subseteq \mathbb{R}^2$, partitioned into $k$ color classes $C_1, C_2, \ldots, C_k \subseteq X$. The goal is to find a minimum-cost Euclidean graph $G$ such that every color class $C_i$ is connected in $G$. We study this Steiner forest problem in the streaming setting, where the stream consists of insertions and deletions of points to $X$. Each input point $x\in X$ arrives with its color $\textsf{color}(x) \in [k]$, and as usual for dynamic geometric streams, the input points are restricted to the discrete grid $\{0, \ldots, Δ\}^2$.
We design a single-pass streaming algorithm that uses $\mathrm{poly}(k \cdot \logΔ)$ space and time, and estimates the cost of an optimal Steiner forest solution within ratio arbitrarily close to the famous Euclidean Steiner ratio $α_2$ (currently $1.1547 \le α_2 \le 1.214$). This approximation guarantee matches the state-of-the-art bound for streaming Steiner tree, i.e., when $k=1$, and it is a major open question to improve the ratio to $1 + ε$ even for this special case. Our approach relies on a novel combination of streaming techniques, like sampling and linear sketching, with the classical Arora-style dynamic-programming framework for geometric optimization problems, which usually requires large memory and has so far not been applied in the streaming setting.
We complement our streaming algorithm for the Steiner forest problem with simple arguments showing that any finite approximation requires $Ω(k)$ bits of space.
Submitted 10 May, 2024; v1 submitted 9 November, 2020;
originally announced November 2020.
-
Simple, Deterministic, Constant-Round Coloring in the Congested Clique
Authors:
Artur Czumaj,
Peter Davies,
Merav Parter
Abstract:
We settle the complexity of the $(Δ+1)$-coloring and $(Δ+1)$-list coloring problems in the CONGESTED CLIQUE model by presenting a simple deterministic algorithm for both problems running in a constant number of rounds. This matches the complexity of the recent breakthrough randomized constant-round $(Δ+1)$-list coloring algorithm due to Chang et al. (PODC'19), and significantly improves upon the state-of-the-art $O(\log Δ)$-round deterministic $(Δ+1)$-coloring bound of Parter (ICALP'18).
A remarkable property of our algorithm is its simplicity. Whereas the state-of-the-art randomized algorithms for this problem are based on the quite involved local coloring algorithm of Chang et al. (STOC'18), our algorithm can be described in just a few lines. At a high level, it applies a careful derandomization of a recursive procedure which partitions the nodes and their respective palettes into separate bins. We show that after $O(1)$ recursion steps, the remaining uncolored subgraph within each bin has linear size, and thus can be solved locally by collecting it to a single node. This algorithm can also be implemented in the Massively Parallel Computation (MPC) model provided that each machine has linear (in $n$, the number of nodes in the input graph) space.
We also show an extension of our algorithm to the MPC regime in which machines have sublinear space: we present the first deterministic $(Δ+1)$-list coloring algorithm designed for sublinear-space MPC, which runs in $O(\log Δ+ \log\log n)$ rounds.
Submitted 13 September, 2020;
originally announced September 2020.
-
Haystack Hunting Hints and Locker Room Communication
Authors:
Artur Czumaj,
George Kontogeorgiou,
Mike Paterson
Abstract:
We want to efficiently find a specific object in a large unstructured set, which we model by a random $n$-permutation, and we have to do it by revealing just a single element. Clearly, without any help this task is hopeless and the best one can do is select the element at random, and achieve the success probability $\frac{1}{n}$. Can we do better with some small amount of advice about the permutation, even without knowing the object sought? We show that by providing advice of just one integer in $\{0,1,\ldots,n-1\}$, one can improve the success probability considerably, by a $Θ(\frac{\log n}{\log\log n})$ factor. We study this and related problems, and show asymptotically matching upper and lower bounds for their optimal probability of success. Our analysis relies on a close relationship of such problems to some intrinsic properties of random permutations related to the rencontres number.
Submitted 3 June, 2021; v1 submitted 26 August, 2020;
originally announced August 2020.
-
Graph Sparsification for Derandomizing Massively Parallel Computation with Low Space
Authors:
Artur Czumaj,
Peter Davies,
Merav Parter
Abstract:
The Massively Parallel Computation (MPC) model is an emerging model which distills core aspects of distributed and parallel computation. It has been developed as a tool to solve (typically graph) problems in systems where the input is distributed over many machines with limited space. Recent work has focused on the regime in which machines have sublinear (in $n$, the number of nodes in the input graph) memory, with randomized algorithms presented for fundamental graph problems of Maximal Matching and Maximal Independent Set. However, there have been no prior corresponding \emph{deterministic} algorithms.
A major challenge underlying the sublinear space setting is that the local space of each machine might be too small to store all the edges incident to a single node. This poses a considerable obstacle compared to the classical models in which each node is assumed to know and have easy access to its incident edges. To overcome this barrier we introduce a new \emph{graph sparsification technique} that \emph{deterministically} computes a low-degree subgraph with additional desired properties. Using this framework to derandomize the well-known randomized algorithm of Luby [SICOMP'86], we obtain $O(\log Δ+\log\log n)$-round \emph{deterministic} MPC algorithms for solving the fundamental problems of \emph{Maximal Matching} and \emph{Maximal Independent Set} with $O(n^ε)$ space on each machine for any constant $ε> 0$. Based on the recent work of Ghaffari et al. [FOCS'18], this additive $O(\log\log n)$ factor is \emph{conditionally} essential. These algorithms can also be shown to run in $O(\log Δ)$ rounds in the closely related model of CONGESTED CLIQUE, improving upon the state-of-the-art bound of $O(\log^2 Δ)$ rounds by Censor-Hillel et al. [DISC'17].
△ Less
Submitted 19 February, 2020; v1 submitted 11 December, 2019;
originally announced December 2019.
-
A characterization of graph properties testable for general planar graphs with one-sided error (It is all about forbidden subgraphs)
Authors:
Artur Czumaj,
Christian Sohler
Abstract:
The problem of characterizing testable graph properties (properties that can be tested with a number of queries independent of the input size) is a fundamental problem in the area of property testing. While there has been extensive prior research characterizing testable graph properties in the dense graphs model and we have a good understanding of the bounded degree graphs model, no similar characterization has been known for general graphs, with no degree bounds. In this paper we take on this major challenge and consider the problem of characterizing all testable graph properties in general planar graphs.
We consider the model in which a general planar graph can be accessed by the random neighbor oracle that allows access to any given vertex and access to a random neighbor of a given vertex. We show that, informally, a graph property $P$ is testable with one-sided error for general planar graphs if and only if testing $P$ can be reduced to testing for a finite family of finite forbidden subgraphs. While our presentation focuses on planar graphs, our approach extends easily to general minor-free graphs.
Our analysis of the necessary condition relies on a recent construction of canonical testers in the random neighbor oracle model that is applied here to the one-sided error model for testing in planar graphs. The sufficient condition in the characterization reduces the problem to the task of testing $H$-freeness in planar graphs, and is the main and most challenging technical contribution of the paper: we show that for planar graphs (with arbitrary degrees), the property of being $H$-free is testable with one-sided error for every finite graph $H$, in the random neighbor oracle model.
Submitted 23 September, 2019;
originally announced September 2019.
-
Testable Properties in General Graphs and Random Order Streaming
Authors:
Artur Czumaj,
Hendrik Fichtenberger,
Pan Peng,
Christian Sohler
Abstract:
We present a novel framework closely linking the areas of property testing and data streaming algorithms in the setting of general graphs. It has been recently shown (Monemizadeh et al. 2017) that for bounded-degree graphs, any constant-query tester can be emulated in the random order streaming model by a streaming algorithm that uses only space required to store a constant number of words. However, in a more natural setting of general graphs, with no restriction on the maximum degree, no such results were known because of our lack of understanding of constant-query testers in general graphs and lack of techniques to appropriately emulate in the streaming setting off-line algorithms allowing many high-degree vertices.
In this work we advance our understanding of both of these challenges. First, we provide canonical testers for all constant-query testers for general graphs, for both one-sided and two-sided errors. Such canonizations were only known before (in the adjacency matrix model) for dense graphs (Goldreich and Trevisan 2003) and (in the adjacency list model) for bounded degree (di-)graphs (Goldreich and Ron 2011, Czumaj et al. 2016). Using the concept of canonical testers, we then prove that every property of general graphs that is constant-query testable with one-sided error can also be tested in constant space with one-sided error in the random order streaming model.
Our results imply, among others, that properties like $(s,t)$ disconnectivity, $k$-path-freeness, etc. are constant-space testable in random order streams.
Submitted 5 May, 2019;
originally announced May 2019.
-
Online Facility Location with Deletions
Authors:
Marek Cygan,
Artur Czumaj,
Marcin Mucha,
Piotr Sankowski
Abstract:
In this paper we study three previously unstudied variants of the online Facility Location problem, considering an intrinsic scenario when the clients and facilities are not only allowed to arrive to the system, but they can also depart at any moment.
We begin with the study of a natural fully-dynamic online uncapacitated model where clients can be both added and removed. When a client arrives, it has to be assigned either to an existing facility or to a new facility opened at the client's location. However, when a client who has also been one of the open facilities is to be removed, our model has to allow reconnecting all clients that have been connected to that removed facility. In this model, we present an optimal O(log n_act / log log n_act)-competitive algorithm, where n_act is the number of active clients at the end of the input sequence.
Next, we turn our attention to the capacitated Facility Location problem. We first note that if no deletions are allowed, then one can achieve an optimal competitive ratio of O(log n / log log n), where n is the length of the sequence. However, when deletions are allowed, the capacitated version of the problem is significantly more challenging than the uncapacitated one. We show that still, using a more sophisticated algorithmic approach, one can obtain an online O(log m + log c log n)-competitive algorithm for the capacitated Facility Location problem in the fully dynamic model, where m is the number of points in the input metric and c is the capacity of any open facility.
Submitted 10 July, 2018;
originally announced July 2018.
-
Detecting cliques in CONGEST networks
Authors:
Artur Czumaj,
Christian Konrad
Abstract:
The problem of detecting network structures plays a central role in distributed computing. One of the fundamental problems studied in this area is to determine whether for a given graph $H$, the input network contains a subgraph isomorphic to $H$ or not. We investigate this problem for $H$ being a clique $K_{l}$ in the classical distributed CONGEST model, where the communication topology is the same as the topology of the underlying network, and with limited communication bandwidth on the links.
Our first and main result is a lower bound, showing that detecting $K_{l}$ requires $Ω(\sqrt{n} / b)$ communication rounds, for every $4 \le l \le \sqrt{n}$, and $Ω(n / (l b))$ rounds for every $l \ge \sqrt{n}$, where $b$ is the bandwidth of the communication links. This result is obtained by using a reduction to the set disjointness problem in the framework of two-party communication complexity.
We complement our lower bound with a two-party communication protocol for listing all cliques in the input graph, which up to constant factors communicates the same number of bits as our lower bound for $K_4$ detection. This demonstrates that our lower bound cannot be improved using the two-party communication framework.
Submitted 3 July, 2018;
originally announced July 2018.
-
Randomized Communication Without Network Knowledge
Authors:
Artur Czumaj,
Peter Davies
Abstract:
Radio networks are a long-studied model for distributed systems of devices which communicate wirelessly. When these devices are mobile or have limited capabilities, the system is often best modeled by the ad-hoc variant, in which the devices do not know the structure of the network. A large body of work has been devoted to designing algorithms for the ad-hoc model, particularly for fundamental communications tasks such as broadcasting. Most of these algorithms, however, assume that devices have some network knowledge (usually bounds on the number of nodes in the network $n$, and the diameter $D$), which may not always be realistic in systems with weak devices or gradual deployment. Very little is known about what can be done when this information is not available.
This is the issue we address in this work, by presenting the first randomized broadcasting algorithms for blind networks in which nodes have no prior knowledge whatsoever. We demonstrate that lack of parameter knowledge can be overcome at only a small increase in running time. Specifically, we show that in networks without collision detection, broadcast can be achieved in $O(D\log\frac nD\log^2\log\frac nD + \log^2 n)$ time, almost reaching the $Ω(D\log\frac nD + \log^2 n)$ lower bound. We also give an algorithm for directed networks with collision detection, which requires only $O(D\log\frac nD\log\log\log\frac nD + \log^2 n)$ time.
Submitted 13 May, 2018;
originally announced May 2018.
-
Deterministic Blind Radio Networks
Authors:
Artur Czumaj,
Peter Davies
Abstract:
Ad-hoc radio networks and multiple access channels are classical and well-studied models of distributed systems, with a large body of literature on deterministic algorithms for fundamental communications primitives such as broadcasting and wake-up. However, almost all of these algorithms assume knowledge of the number of participating nodes and the range of possible IDs, and often make the further assumption that the latter is linear in the former. These are very strong assumptions for models which were designed to capture networks of weak devices organized in an ad-hoc manner. It was believed that without this knowledge, deterministic algorithms must necessarily be much less efficient.
In this paper we address this fundamental question and show that this is not the case. We present deterministic algorithms for blind networks (in which nodes know only their own IDs), which match or nearly match the running times of the fastest algorithms which assume network knowledge (and even surpass the previous fastest algorithms which assume parameter knowledge but not small labels).
Specifically, in multiple access channels with $k$ participating nodes and IDs up to $L$, we give a wake-up algorithm requiring $O(\frac{k\log L \log k}{\log\log k})$ time, improving dramatically over the $O(L^3 \log^3 L)$ time algorithm of De Marco et al. (2007), and a broadcasting algorithm requiring $O(k\log L \log\log k)$ time, improving over the $O(L)$ time algorithm of Gasieniec et al. (2001) in most circumstances. Furthermore, we show how these same algorithms apply directly to multi-hop radio networks, achieving even larger running time improvements.
Submitted 13 May, 2018;
originally announced May 2018.
-
Round Compression for Parallel Matching Algorithms
Authors:
Artur Czumaj,
Jakub Łącki,
Aleksander Mądry,
Slobodan Mitrović,
Krzysztof Onak,
Piotr Sankowski
Abstract:
For over a decade now we have been witnessing the success of massive parallel computation (MPC) frameworks, such as MapReduce, Hadoop, Dryad, or Spark. One of the reasons for their success is the fact that these frameworks are able to accurately capture the nature of large-scale computation. In particular, compared to the classic distributed algorithms or PRAM models, these frameworks allow for much more local computation. The fundamental question that arises in this context is: can we leverage this additional power to obtain even faster parallel algorithms?
A prominent example here is the maximum matching problem, one of the most classic graph problems. It is well known that in the PRAM model one can compute a 2-approximate maximum matching in $O(\log{n})$ rounds. However, the exact complexity of this problem in the MPC framework is still far from understood. Lattanzi et al. showed that if each machine has $n^{1+Ω(1)}$ memory, this problem can also be solved $2$-approximately in a constant number of rounds. These techniques, as well as the approaches developed in the follow-up work, seem, however, to get stuck in a fundamental way at roughly $O(\log{n})$ rounds once we enter the near-linear memory regime. It is thus entirely possible that in this regime, which captures in particular the case of sparse graph computations, the best MPC round complexity matches what one can already get in the PRAM model, without the need to take advantage of the extra local computation power.
In this paper, we finally refute that perplexing possibility. That is, we break the above $O(\log n)$ round complexity bound even in the case of slightly sublinear memory per machine. In fact, our improvement here is almost exponential: we are able to deliver a $(2+ε)$-approximation to maximum matching, for any fixed constant $ε>0$, in $O((\log \log n)^2)$ rounds.
Submitted 1 February, 2018; v1 submitted 11 July, 2017;
originally announced July 2017.
-
Exploiting Spontaneous Transmissions for Broadcasting and Leader Election in Radio Networks
Authors:
Artur Czumaj,
Peter Davies
Abstract:
We study two fundamental communication primitives: broadcasting and leader election in the classical model of multi-hop radio networks with unknown topology and without collision detection mechanisms.
It has been known for almost 20 years that in undirected networks with n nodes and diameter D, randomized broadcasting requires Omega(D log n/D + log^2 n) rounds in expectation, assuming that uninformed nodes are not allowed to communicate (until they are informed). Only very recently, Haeupler and Wajc (PODC'2016) showed that this bound can be slightly improved for the model with spontaneous transmissions, providing an O(D log n loglog n / log D + log^O(1) n)-time broadcasting algorithm. In this paper, we give a new and faster algorithm that completes broadcasting in O(D log n/log D + log^O(1) n) time, with high probability. This yields the first optimal O(D)-time broadcasting algorithm whenever D is polynomial in n.
Furthermore, our approach can be applied to design a new leader election algorithm that matches the performance of our broadcasting algorithm. Previously, all fast randomized leader election algorithms have used broadcasting as a subroutine, and their complexity has been asymptotically strictly bigger than the complexity of broadcasting. In particular, the fastest previously known randomized leader election algorithm of Ghaffari and Haeupler (SODA'2013) requires O(D log n/D min{loglog n, log n/D} + log^O(1) n) time with high probability. Our new algorithm requires O(D log n / log D + log^O(1) n) time with high probability, and it achieves the optimal O(D) time whenever D is polynomial in n.
Submitted 6 March, 2017;
originally announced March 2017.
-
Distributed Methods for Computing Approximate Equilibria
Authors:
Artur Czumaj,
Argyrios Deligkas,
Michail Fasoulakis,
John Fearnley,
Marcin Jurdziński,
Rahul Savani
Abstract:
We present a new, distributed method to compute approximate Nash equilibria in bimatrix games. In contrast to previous approaches that analyze the two payoff matrices at the same time (for example, by solving a single LP that combines the two players' payoffs), our algorithm first solves two independent LPs, each of which is derived from one of the two payoff matrices, and then computes approximate Nash equilibria using only limited communication between the players.
Our method has several applications for improved bounds for efficient computations of approximate Nash equilibria in bimatrix games. First, it yields the best known polynomial-time algorithm for computing approximate well-supported Nash equilibria (WSNE), which is guaranteed to find a 0.6528-WSNE in polynomial time. Furthermore, since our algorithm solves the two LPs separately, it can be used to improve upon the best known algorithms in the limited communication setting: the algorithm can be implemented to obtain a randomized expected-polynomial-time algorithm that uses poly-logarithmic communication and finds a 0.6528-WSNE. The algorithm can also be carried out to beat the best known bound in the query complexity setting, requiring $O(n \log n)$ payoff queries to compute a 0.6528-WSNE. Finally, our approach can also be adapted to provide the best known communication-efficient algorithm for computing approximate Nash equilibria: it uses poly-logarithmic communication to find a 0.382-approximate Nash equilibrium.
Submitted 10 December, 2015;
originally announced December 2015.
-
Deterministic Communication in Radio Networks
Authors:
Artur Czumaj,
Peter Davies
Abstract:
In this paper we improve the deterministic complexity of two fundamental communication primitives in the classical model of ad-hoc radio networks with unknown topology: broadcasting and wake-up. We consider an unknown radio network, in which all nodes have no prior knowledge about network topology, and know only the size of the network $n$, the maximum in-degree of any node $Δ$, and the eccentricity of the network $D$.
For such networks, we first give an algorithm for wake-up, based on the existence of small universal synchronizers. This algorithm runs in $O(\frac{\min\{n, D Δ\} \log n \log Δ}{\log\log Δ})$ time, the fastest known in both directed and undirected networks, improving over the previous best $O(n \log^2n)$-time result across all ranges of parameters, but particularly when maximum in-degree is small.
Next, we introduce a new combinatorial framework of block synchronizers and prove the existence of such objects of low size. Using this framework, we design a new deterministic algorithm for the fundamental problem of broadcasting, running in $O(n \log D \log\log\frac{D Δ}{n})$ time. This is the fastest known algorithm for the problem in directed networks, improving upon the $O(n \log n \log \log n)$-time algorithm of De Marco (2010) and the $O(n \log^2 D)$-time algorithm due to Czumaj and Rytter (2003). It is also the first to come within a log-logarithmic factor of the $Ω(n \log D)$ lower bound due to Clementi et al. (2003).
Our results also have direct implications on the fastest deterministic leader election and clock synchronization algorithms in both directed and undirected radio networks, tasks which are commonly used as building blocks for more complex procedures.
Submitted 16 March, 2019; v1 submitted 2 June, 2015;
originally announced June 2015.
-
Leader Election in Multi-Hop Radio Networks
Authors:
Artur Czumaj,
Peter Davies
Abstract:
In this paper we present a framework for leader election in multi-hop radio networks which yields randomized leader election algorithms taking $O(\text{broadcasting time})$ in expectation, and another which yields algorithms taking fixed $O(\sqrt{\log n})$-times broadcasting time. Both succeed with high probability.
We show how to implement these frameworks in radio networks without collision detection, and in networks with collision detection (in fact in the strictly weaker beep model). In doing so, we obtain the first optimal expected-time leader election algorithms in both settings, and also improve the worst-case running time in directed networks without collision detection by an $O(\sqrt {\log n})$ factor.
Submitted 16 March, 2019; v1 submitted 22 May, 2015;
originally announced May 2015.
-
Communicating with Beeps
Authors:
Artur Czumaj,
Peter Davies
Abstract:
The beep model is a very weak communications model in which devices in a network can communicate only via beeps and silence. As a result of its weak assumptions, it has broad applicability to many different implementations of communications networks. This comes at the cost of a restrictive environment for algorithm design.
Despite being only recently introduced, the beep model has received considerable attention, in part due to its relationship with other communication models such as that of ad-hoc radio networks. However, there has been no definitive published result for several fundamental tasks in the model. We aim to rectify this with our paper.
We present algorithms and lower bounds for a variety of fundamental global communications tasks in the model.
Submitted 16 March, 2019; v1 submitted 22 May, 2015;
originally announced May 2015.
-
Testing Cluster Structure of Graphs
Authors:
Artur Czumaj,
Pan Peng,
Christian Sohler
Abstract:
We study the problem of recognizing the cluster structure of a graph in the framework of property testing in the bounded degree model. Given a parameter $\varepsilon$, a $d$-bounded degree graph is defined to be $(k, φ)$-clusterable, if it can be partitioned into no more than $k$ parts, such that the (inner) conductance of the induced subgraph on each part is at least $φ$ and the (outer) conductance of each part is at most $c_{d,k}\varepsilon^4φ^2$, where $c_{d,k}$ depends only on $d,k$. Our main result is a sublinear algorithm with the running time $\widetilde{O}(\sqrt{n}\cdot\mathrm{poly}(φ,k,1/\varepsilon))$ that takes as input a graph with maximum degree bounded by $d$, parameters $k$, $φ$, $\varepsilon$, and with probability at least $\frac23$, accepts the graph if it is $(k,φ)$-clusterable and rejects the graph if it is $\varepsilon$-far from $(k, φ^*)$-clusterable for $φ^* = c'_{d,k}\frac{φ^2 \varepsilon^4}{\log n}$, where $c'_{d,k}$ depends only on $d,k$. By the lower bound of $Ω(\sqrt{n})$ on the number of queries needed for testing graph expansion, which corresponds to $k=1$ in our problem, our algorithm is asymptotically optimal up to polylogarithmic factors.
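To make the notion of the (outer) conductance of a part concrete, here is a minimal Python sketch. It assumes the common volume-based definition, cut size divided by the smaller of the two volumes; the paper's bounded-degree model may normalize differently (for example by $d|S|$), so the function names and the normalization here are illustrative assumptions, not the authors' definitions.

```python
from typing import Dict, Iterable, List

Graph = Dict[int, List[int]]  # adjacency lists of an undirected graph


def cut_size(adj: Graph, part: Iterable[int]) -> int:
    """Number of edges with exactly one endpoint in `part`."""
    inside = set(part)
    return sum(1 for u in inside for v in adj[u] if v not in inside)


def volume(adj: Graph, part: Iterable[int]) -> int:
    """Sum of degrees of the vertices in `part`."""
    return sum(len(adj[u]) for u in part)


def outer_conductance(adj: Graph, part: Iterable[int]) -> float:
    """Assumed definition: cut(S, V \\ S) / min(vol(S), vol(V \\ S))."""
    inside = set(part)
    outside = set(adj) - inside
    denom = min(volume(adj, inside), volume(adj, outside))
    if denom == 0:
        raise ValueError("conductance is undefined when one side has volume 0")
    return cut_size(adj, inside) / denom


# Two triangles joined by the single edge 2-3: a natural 2-cluster structure.
two_triangles: Graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
left_part_conductance = outer_conductance(two_triangles, {0, 1, 2})
```

In this toy graph each triangle has one outgoing edge and volume 7, so each part has low outer conductance, which is the kind of structure a $(k, φ)$-clusterable graph with $k=2$ exhibits.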
Submitted 13 April, 2015;
originally announced April 2015.
-
Approximate well-supported Nash equilibria in symmetric bimatrix games
Authors:
Artur Czumaj,
Michail Fasoulakis,
Marcin Jurdziński
Abstract:
The $\varepsilon$-well-supported Nash equilibrium is a strong notion of approximation of a Nash equilibrium, where no player has an incentive greater than $\varepsilon$ to deviate from any of the pure strategies that she uses in her mixed strategy. The smallest constant $\varepsilon$ currently known for which there is a polynomial-time algorithm that computes an $\varepsilon$-well-supported Nash equilibrium in bimatrix games is slightly below $2/3$. In this paper we study this problem for symmetric bimatrix games and we provide a polynomial-time algorithm that gives a $(1/2+δ)$-well-supported Nash equilibrium, for an arbitrarily small positive constant $δ$.
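The definition above can be checked mechanically for a given strategy profile. The following Python sketch is only an illustration of the definition, not the paper's algorithm: it verifies that every pure strategy played with positive probability is within $\varepsilon$ of a best response. The function names, the list-of-lists payoff representation, and the numerical tolerance `tol` are assumptions made for this example.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def row_payoffs(R: Matrix, y: Sequence[float]) -> List[float]:
    """Expected payoff of each pure row strategy against mixed column strategy y."""
    return [sum(r * q for r, q in zip(row, y)) for row in R]


def col_payoffs(C: Matrix, x: Sequence[float]) -> List[float]:
    """Expected payoff of each pure column strategy against mixed row strategy x."""
    return [sum(x[i] * C[i][j] for i in range(len(C))) for j in range(len(C[0]))]


def is_eps_wsne(R: Matrix, C: Matrix, x: Sequence[float], y: Sequence[float],
                eps: float, tol: float = 1e-12) -> bool:
    """True if (x, y) is an eps-well-supported Nash equilibrium of the game (R, C)."""
    rp, cp = row_payoffs(R, y), col_payoffs(C, x)
    best_row, best_col = max(rp), max(cp)
    # Every pure strategy in the support must be an eps-best response.
    row_ok = all(rp[i] >= best_row - eps - tol for i, p in enumerate(x) if p > 0)
    col_ok = all(cp[j] >= best_col - eps - tol for j, q in enumerate(y) if q > 0)
    return row_ok and col_ok


# Matching pennies with payoffs normalized to [0, 1].
R = [[1.0, 0.0], [0.0, 1.0]]
C = [[0.0, 1.0], [1.0, 0.0]]
uniform = [0.5, 0.5]
pure_first = [1.0, 0.0]
```

With these inputs, the uniform profile is an exact equilibrium, so it passes for $\varepsilon = 0$, while the pure profile leaves the column player with regret 1 and therefore passes only for $\varepsilon \ge 1$.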
Submitted 10 July, 2014;
originally announced July 2014.
-
Planar Graphs: Random Walks and Bipartiteness Testing
Authors:
Artur Czumaj,
Morteza Monemizadeh,
Krzysztof Onak,
Christian Sohler
Abstract:
We initiate the study of property testing in arbitrary planar graphs. We prove that bipartiteness can be tested in constant time, improving on the previous bound of $\tilde{O}(\sqrt{n})$ for graphs on $n$ vertices. The constant-time testability was only known for planar graphs with bounded degree.
Our algorithm is based on random walks. Since planar graphs have good separators, i.e., bad expansion, our analysis diverges from standard techniques that involve the fast convergence of random walks on expanders. We reduce the problem to the task of detecting an odd-parity cycle in a multigraph induced by constant-length cycles. We iteratively reduce the length of cycles while preserving the detection probability, until the multigraph collapses to a collection of easily discoverable self-loops.
Our approach extends to arbitrary minor-free graphs. We also believe that our techniques will find applications to testing other properties in arbitrary minor-free graphs.
Submitted 21 December, 2018; v1 submitted 8 July, 2014;
originally announced July 2014.
-
Finding Cycles and Trees in Sublinear Time
Authors:
Artur Czumaj,
Oded Goldreich,
Dana Ron,
C. Seshadhri,
Asaf Shapira,
Christian Sohler
Abstract:
We present sublinear-time (randomized) algorithms for finding simple cycles of length at least $k\geq 3$ and tree-minors in bounded-degree graphs. The complexity of these algorithms is related to the distance of the graph from being $C_k$-minor-free (resp., free from having the corresponding tree-minor). In particular, if the graph is far (i.e., $Ω(1)$-far) from being cycle-free, i.e., if one has to delete a constant fraction of edges to make it cycle-free, then the algorithm finds a cycle of polylogarithmic length in time $\tilde{O}(\sqrt{N})$, where $N$ denotes the number of vertices. This time complexity is optimal up to polylogarithmic factors.
The foregoing results are the outcome of our study of the complexity of one-sided error property testing algorithms in the bounded-degree graphs model. For example, we show that cycle-freeness of $N$-vertex graphs can be tested with one-sided error within time complexity $\tilde{O}(\mathrm{poly}(1/\epsilon)\cdot\sqrt{N})$. This matches the known $Ω(\sqrt{N})$ query lower bound, and contrasts with the fact that any minor-free property admits a two-sided error tester of query complexity that only depends on the proximity parameter $\epsilon$. For any constant $k\geq3$, we extend this result to testing whether the input graph has a simple cycle of length at least $k$. On the other hand, for any fixed tree $T$, we show that $T$-minor-freeness has a one-sided error tester of query complexity that only depends on the proximity parameter $\epsilon$.
Our algorithm for finding cycles in bounded-degree graphs extends to general graphs, where distances are measured with respect to the actual number of edges. Such an extension is not possible with respect to finding tree-minors in $o(\sqrt{N})$ complexity.
Submitted 3 April, 2012; v1 submitted 23 July, 2010;
originally announced July 2010.
-
PTAS for k-tour cover problem on the plane for moderately large values of k
Authors:
Anna Adamaszek,
Artur Czumaj,
Andrzej Lingas
Abstract:
Let P be a set of n points in the Euclidean plane and let O be the origin point in the plane. In the k-tour cover problem (frequently called the capacitated vehicle routing problem), the goal is to minimize the total length of tours that cover all points in P, such that each tour starts and ends in O and covers at most k points from P.
The k-tour cover problem is known to be NP-hard. It is also known to admit constant-factor approximation algorithms for all values of k and even a polynomial-time approximation scheme (PTAS) for small values of k, i.e., k = O(log n / log log n).
We significantly enlarge the set of values of k for which a PTAS is provable. We present a new PTAS for all values of k <= 2^{log^δ n}, where δ = δ(ε). The main technical result proved in the paper is a novel reduction of the k-tour cover problem with a set of n points to a small set of instances of the problem, each with O((k/ε)^O(1)) points.
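As an illustration of the objective only (not of the PTAS), the Python sketch below evaluates the cost of a candidate k-tour cover: it checks that every tour visits at most k points, that the tours together cover P, and sums the Euclidean lengths of the closed tours through O. The function names are hypothetical, and the check assumes each point of P is visited exactly once.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def tour_length(tour: Sequence[Point], origin: Point = (0.0, 0.0)) -> float:
    """Length of the closed tour origin -> tour[0] -> ... -> tour[-1] -> origin."""
    path = [origin, *tour, origin]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def tour_cover_cost(tours: Sequence[Sequence[Point]], points: Sequence[Point],
                    k: int, origin: Point = (0.0, 0.0)) -> float:
    """Total length of a k-tour cover; raises ValueError if it is infeasible."""
    if any(len(tour) > k for tour in tours):
        raise ValueError("some tour covers more than k points")
    visited = sorted(p for tour in tours for p in tour)
    if visited != sorted(points):
        raise ValueError("tours must visit every point exactly once")
    return sum(tour_length(tour, origin) for tour in tours)


# Three points, capacity k = 2: one tour takes two points, the other takes one.
P: List[Point] = [(3.0, 0.0), (3.0, 4.0), (-3.0, 0.0)]
candidate = [[(3.0, 0.0), (3.0, 4.0)], [(-3.0, 0.0)]]
total = tour_cover_cost(candidate, P, k=2)
```

Here the first tour has length 3 + 4 + 5 and the second has length 3 + 3, so the candidate cover costs 18; with k = 1 the same candidate is rejected as infeasible.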
Submitted 16 April, 2009;
originally announced April 2009.