Near-Optimal Massively Parallel Graph Connectivity

Behnezhad, Soheil; Dhulipala, Laxman; Esfandiari, Hossein; Łącki, Jakub; Mirrokni, Vahab

arXiv:1910.05385 (cs)

[Submitted on 11 Oct 2019 (v1), last revised 11 Mar 2020 (this version, v2)]

Abstract: Identifying the connected components of a graph, apart from being a fundamental problem with countless applications, is a key primitive for many other algorithms. In this paper, we consider this problem in parallel settings. In particular, we focus on the Massively Parallel Computations (MPC) model, which is the standard theoretical model for modern parallel frameworks such as MapReduce, Hadoop, or Spark. We consider the truly sublinear regime of MPC for graph problems where the space per machine is $n^\delta$ for some desirably small constant $\delta \in (0, 1)$.
We present an algorithm that for graphs with diameter $D$ in the wide range $[\log^{\epsilon} n, n]$, takes $O(\log D)$ rounds to identify the connected components and takes $O(\log \log n)$ rounds for all other graphs. The algorithm is randomized, succeeds with high probability, does not require prior knowledge of $D$, and uses an optimal total space of $O(m)$. We complement this by showing a conditional lower-bound based on the widely believed TwoCycle conjecture that $\Omega(\log D)$ rounds are indeed necessary in this setting.
Interest in parallel connectivity algorithms resurged after the pioneering work of Andoni et al. [FOCS 2018], who presented an algorithm with $O(\log D \cdot \log \log n)$ round complexity. Our algorithm improves this result for the whole range of values of $D$ and, given the conditional lower bound, almost settles the problem.
Additionally, we show that with minimal adjustments, our algorithm can also be implemented in a variant of the (CRCW) PRAM in asymptotically the same number of rounds.
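
To give some intuition for why the diameter $D$ governs the round complexity, the following is a minimal sequential sketch of two textbook baselines. It is not the algorithm of this paper. The function names (`path_graph`, `label_propagation`, `squaring_rounds`) are hypothetical and chosen only for illustration. Minimum-label propagation needs about $D$ rounds, while naive graph squaring needs about $\log_2 D$ rounds but can blow the edge count up far beyond $O(m)$, which is the kind of space cost the abstract's $O(m)$ total-space bound rules out.

```python
from typing import Dict, Set, Tuple

Graph = Dict[int, Set[int]]


def path_graph(n: int) -> Graph:
    """Undirected path 0 - 1 - ... - (n-1); its diameter is n - 1."""
    g: Graph = {v: set() for v in range(n)}
    for v in range(n - 1):
        g[v].add(v + 1)
        g[v + 1].add(v)
    return g


def label_propagation(g: Graph) -> Tuple[Dict[int, int], int]:
    """Baseline 1: every round, each vertex adopts the minimum label
    among itself and its neighbors. Returns (labels, rounds), where
    rounds counts only the rounds in which some label changed."""
    labels = {v: v for v in g}
    rounds = 0
    while True:
        new = {v: min([labels[v]] + [labels[u] for u in g[v]]) for v in g}
        if new == labels:
            return labels, rounds
        labels = new
        rounds += 1


def squaring_rounds(g: Graph) -> Tuple[Dict[int, int], int]:
    """Baseline 2: every round, connect each vertex to its 2-hop
    neighbors (graph squaring) until each component is a clique.
    Round count is logarithmic in the diameter, but the number of
    edges can grow quadratically, so total space is NOT O(m)."""
    adj = {v: set(ns) for v, ns in g.items()}
    rounds = 0
    while True:
        new: Graph = {}
        for v in adj:
            reach = set(adj[v])
            for u in adj[v]:
                reach |= adj[u]
            reach.discard(v)  # no self-loops
            new[v] = reach
        if new == adj:
            break
        adj = new
        rounds += 1
    # Each component is now a clique, so the minimum over the closed
    # neighborhood is the minimum vertex id of the component.
    labels = {v: min(adj[v] | {v}) for v in adj}
    return labels, rounds


if __name__ == "__main__":
    g = path_graph(17)  # diameter D = 16
    print(label_propagation(g)[1])  # 16 rounds, i.e. D
    print(squaring_rounds(g)[1])    # 4 rounds, i.e. log2(D)
```

On a path with diameter 16, the first baseline uses 16 rounds and the second uses 4, but the second ends with a complete graph on 17 vertices. Both are only meant as a contrast to the space-efficient $O(\log D)$-round result described above.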

Comments:	A preliminary version of this paper is to appear in the proceedings of The 60th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2019)
Subjects:	Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:1910.05385 [cs.DS]
	(or arXiv:1910.05385v2 [cs.DS] for this version)
	https://doi.org/10.48550/arXiv.1910.05385

Submission history

From: Soheil Behnezhad
[v1] Fri, 11 Oct 2019 19:51:13 UTC (134 KB)
[v2] Wed, 11 Mar 2020 23:51:43 UTC (135 KB)
