-
Fast and Simple Sorting Using Partial Information
Authors:
Bernhard Haeupler,
Richard Hladík,
John Iacono,
Vaclav Rozhon,
Robert Tarjan,
Jakub Tětek
Abstract:
We consider the problem of sorting a set of items having an unknown total order by doing binary comparisons of the items, given the outcomes of some pre-existing comparisons. We present a simple algorithm with a running time of $O(m+n+\log T)$, where $n$, $m$, and $T$ are the number of items, the number of pre-existing comparisons, and the number of total orders consistent with the outcomes of the pre-existing comparisons, respectively. The algorithm does $O(\log T)$ comparisons.
Our running time and comparison bounds are best possible up to constant factors, thus resolving a problem that has been studied intensely since 1976 (Fredman, Theoretical Computer Science). The best previous algorithm with a bound of $O(\log T)$ on the number of comparisons has a time bound of $O(n^{2.5})$ and is significantly more complicated. Our algorithm combines three classic algorithms: topological sort, heapsort with the right kind of heap, and efficient insertion into a sorted list.
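To make the quantity $T$ concrete, the following brute-force sketch counts the total orders consistent with a small set of pre-existing comparisons. It only illustrates the definition and takes exponential time; it is not the algorithm of the paper. The function name `count_consistent_orders` and the example comparisons are hypothetical.

```python
from itertools import permutations
from math import log2


def count_consistent_orders(items, comparisons):
    """Count total orders of `items` in which a precedes b for every (a, b) given."""
    count = 0
    for order in permutations(items):
        position = {item: i for i, item in enumerate(order)}
        if all(position[a] < position[b] for a, b in comparisons):
            count += 1
    return count


# Hypothetical instance: four items, two known outcomes (a < b and c < d).
items = ["a", "b", "c", "d"]
comparisons = [("a", "b"), ("c", "d")]
T = count_consistent_orders(items, comparisons)  # 4! / (2 * 2) = 6
print(T, log2(T))
```

Any comparison-based method needs at least $\log_2 T$ further comparisons in the worst case, which is why a bound of $O(\log T)$ comparisons is the natural target.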
Submitted 6 April, 2024;
originally announced April 2024.
-
Minimum-cost paths for electric cars
Authors:
Dani Dorfman,
Haim Kaplan,
Robert E. Tarjan,
Mikkel Thorup,
Uri Zwick
Abstract:
An electric car equipped with a battery of a finite capacity travels on a road network with an infrastructure of charging stations. Each charging station has a possibly different cost per unit of energy. Traversing a given road segment requires a specified amount of energy that may be positive, zero or negative. The car can only traverse a road segment if it has enough charge to do so (the charge cannot drop below zero), and it cannot charge its battery beyond its capacity.
To travel from one point to another the car needs to choose a \emph{travel plan} consisting of a path in the network and a recharging schedule that specifies how much energy to charge at each charging station on the path, making sure of having enough energy to reach the next charging station or the destination. The cost of the plan is the total charging cost along the chosen path. We reduce the problem of computing plans between every two junctions of the network to two problems: Finding optimal energetic paths when no charging is allowed and finding standard shortest paths. When there are no negative cycles in the network, we obtain an $O(n^3)$-time algorithm for computing all-pairs travel plans, where~$n$ is the number of junctions in the network. We obtain slightly faster algorithms under some further assumptions. We also consider the case in which a bound is placed on the number of rechargings allowed.
Submitted 25 March, 2024;
originally announced March 2024.
-
Universal Optimality of Dijkstra via Beyond-Worst-Case Heaps
Authors:
Bernhard Haeupler,
Richard Hladík,
Václav Rozhoň,
Robert Tarjan,
Jakub Tětek
Abstract:
This paper proves that Dijkstra's shortest-path algorithm is universally optimal in both its running time and number of comparisons when combined with a sufficiently efficient heap data structure.
Universal optimality is a powerful beyond-worst-case performance guarantee for graph algorithms that informally states that a single algorithm performs as well as possible for every single graph topology. We give the first application of this notion to any sequential algorithm.
We design a new heap data structure with a working-set property guaranteeing that the heap takes advantage of locality in heap operations. Our heap matches the optimal (worst-case) bounds of Fibonacci heaps but also provides the beyond-worst-case guarantee that the cost of extracting the minimum element is merely logarithmic in the number of elements inserted after it instead of logarithmic in the number of all elements in the heap. This makes the extraction of recently added elements cheaper.
We prove that our working-set property is sufficient to guarantee universal optimality, specifically, for the problem of ordering vertices by their distance from the source vertex: The locality in the sequence of heap operations generated by any run of Dijkstra's algorithm on a fixed topology is strong enough that one can couple the number of comparisons performed by any heap with our working-set property to the minimum number of comparisons required to solve the distance ordering problem on this topology.
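As a point of reference for the distance ordering problem, here is a minimal sketch of Dijkstra's algorithm that returns vertices in nondecreasing order of distance. It assumes nonnegative arc weights and uses Python's `heapq` binary heap with lazy deletion in place of decrease-key; it does not implement the working-set heap described above. The name `distance_order` and the example graph are illustrative.

```python
import heapq


def distance_order(graph, source):
    """Return (order, dist) for vertices reachable from `source`.

    `graph` maps each vertex to a list of (neighbor, nonnegative_weight) pairs.
    `order` lists vertices in nondecreasing order of distance from `source`.
    """
    dist = {source: 0}
    done = set()
    order = []
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry left behind by a later improvement
        done.add(u)
        order.append(u)
        for v, w in graph.get(u, []):
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return order, dist


graph = {
    "s": [("a", 4), ("b", 1)],
    "b": [("a", 2), ("c", 5)],
    "a": [("c", 1)],
    "c": [],
}
order, dist = distance_order(graph, "s")
print(order, dist)
```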
Submitted 9 April, 2024; v1 submitted 20 November, 2023;
originally announced November 2023.
-
Zip-zip Trees: Making Zip Trees More Balanced, Biased, Compact, or Persistent
Authors:
Ofek Gila,
Michael T. Goodrich,
Robert E. Tarjan
Abstract:
We define simple variants of zip trees, called zip-zip trees, which provide several advantages over zip trees, including overcoming a bias that favors smaller keys over larger ones. We analyze zip-zip trees theoretically and empirically, showing, e.g., that the expected depth of a node in an $n$-node zip-zip tree is at most $1.3863\log n-1+o(1)$, which matches the expected depth of treaps and binary search trees built by uniformly random insertions. Unlike these other data structures, however, zip-zip trees achieve their bounds using only $O(\log\log n)$ bits of metadata per node, w.h.p., as compared to the $Θ(\log n)$ bits per node required by treaps. In fact, we even describe a ``just-in-time'' zip-zip tree variant, which needs just an expected $O(1)$ number of bits of metadata per node. Moreover, we can define zip-zip trees to be strongly history independent, whereas treaps are generally only weakly history independent. We also introduce \emph{biased zip-zip trees}, which have an explicit bias based on key weights, so the expected depth of a key, $k$, with weight, $w_k$, is $O(\log (W/w_k))$, where $W$ is the weight of all keys in the weighted zip-zip tree. Finally, we show that one can easily make zip-zip trees partially persistent with only $O(n)$ space overhead w.h.p.
Submitted 2 May, 2024; v1 submitted 14 July, 2023;
originally announced July 2023.
-
Efficiency of Self-Adjusting Heaps
Authors:
Corwin Sinnamon,
Robert E. Tarjan
Abstract:
Since the invention of the pairing heap by Fredman et al., it has been an open question whether this or any other simple "self-adjusting" heap supports decrease-key operations on $n$-item heaps in $O(\log\log n)$ time. Using powerful new techniques, we answer this question in the affirmative. We prove that both slim and smooth heaps, recently introduced self-adjusting heaps, support heap operations on an $n$-item heap in the following amortized time bounds: $O(\log n)$ for delete-min and delete, $O(\log\log n)$ for decrease-key, and $O(1)$ for all other heap operations, including insert and meld. We also analyze the multipass pairing heap, a variant of pairing heaps. For this heap implementation, we obtain the same bounds except for decrease-key, for which our bound is $O(\log\log n \log\log\log n)$. Our bounds significantly improve the best previously known bounds for all three data structures. For slim and smooth heaps our bounds are tight, since they match lower bounds of Iacono and Özkan; for multipass pairing heaps our bounds are tight except for decrease-key, which by the lower bounds of Fredman and Iacono and Özkan must take $Ω(\log\log n)$ amortized time if delete-min takes $O(\log n)$ time.
Submitted 6 July, 2023;
originally announced July 2023.
-
Optimal energetic paths for electric cars
Authors:
Dani Dorfman,
Haim Kaplan,
Robert E. Tarjan,
Uri Zwick
Abstract:
A weighted directed graph $G=(V,A,c)$, where $A\subseteq V\times V$ and $c:A\to R$, describes a road network in which an electric car can roam. An arc $uv$ models a road segment connecting the two vertices $u$ and $v$. The cost $c(uv)$ of an arc $uv$ is the amount of energy the car needs to traverse the arc. This amount may be positive, zero or negative. To make the problem realistic, we assume there are no negative cycles.
The car has a battery that can store up to $B$ units of energy. It can traverse an arc $uv\in A$ only if it is at $u$ and the charge $b$ in its battery satisfies $b\ge c(uv)$. If it traverses the arc, it reaches $v$ with a charge of $\min(b-c(uv),B)$. Arcs with positive costs deplete the battery, arcs with negative costs charge the battery, but not above its capacity of $B$.
Given $s,t\in V$, can the car travel from $s$ to $t$, starting at $s$ with an initial charge $b$, where $0\le b\le B$? If so, what is the maximum charge with which the car can reach $t$? Equivalently, what is the smallest $δ_{B,b}(s,t)$ such that the car can reach $t$ with a charge of $b-δ_{B,b}(s,t)$, and which path should the car follow to achieve this? We refer to $δ_{B,b}(s,t)$ as the energetic cost of traveling from $s$ to $t$. We let $δ_{B,b}(s,t)=\infty$ if the car cannot travel from $s$ to $t$ starting with an initial charge of $b$. The problem of computing energetic costs is a strict generalization of the standard shortest paths problem.
We show that the single-source minimum energetic paths problem can be solved using simple, but subtle, adaptations of the Bellman-Ford and Dijkstra algorithms. To make Dijkstra's algorithm work in the presence of negative arcs, but no negative cycles, we use a variant of the $A^*$ search heuristic. These results are explicit or implicit in some previous papers. We provide a simpler and unified description of these algorithms.
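The traversal rule above (an arc $uv$ is usable only if $b\ge c(uv)$, and the car arrives with charge $\min(b-c(uv),B)$) can be sketched as a small simulation along a fixed path. This only evaluates one given path; it is not the adapted Bellman-Ford or Dijkstra algorithm. The function name `charge_along_path` and the sample numbers are illustrative.

```python
def charge_along_path(costs, capacity, initial_charge):
    """Simulate the battery along a path whose arcs have energy costs `costs`.

    Returns the final charge, or None if some arc cannot be traversed.
    """
    b = initial_charge
    for c in costs:
        if b < c:
            return None  # traversal requires b >= c(uv)
        b = min(b - c, capacity)  # negative costs recharge, capped at B
    return b


# Capacity B = 10, initial charge b = 5, arc costs 3, -9, 4.
# Charge evolves 5 -> 2 -> 10 (capped from 11) -> 6.
final = charge_along_path([3, -9, 4], capacity=10, initial_charge=5)
print(final)
```

Because of the cap, the final charge (6) differs from the initial charge minus the sum of the arc costs (7), which is one way to see that energetic cost is not simply path length.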
Submitted 30 May, 2023;
originally announced May 2023.
-
Optimal resizable arrays
Authors:
Robert E. Tarjan,
Uri Zwick
Abstract:
A \emph{resizable array} is an array that can \emph{grow} and \emph{shrink} by the addition or removal of items from its end, or both its ends, while still supporting constant-time \emph{access} to each item stored in the array given its \emph{index}. Since the size of an array, i.e., the number of items in it, varies over time, space-efficient maintenance of a resizable array requires dynamic memory management. A standard doubling technique allows the maintenance of an array of size~$N$ using only $O(N)$ space, with $O(1)$ amortized time, or even $O(1)$ worst-case time, per operation. Sitarski and Brodnik et al.\ describe much better solutions that maintain a resizable array of size~$N$ using only $N+O(\sqrt{N})$ space, still with $O(1)$ time per operation. Brodnik et al.\ give a simple proof that this is best possible.
We distinguish between the space needed for \emph{storing} a resizable array, and accessing its items, and the \emph{temporary} space that may be needed while growing or shrinking the array. For every integer $r\ge 2$, we show that $N+O(N^{1/r})$ space is sufficient for storing and accessing an array of size~$N$, if $N+O(N^{1-1/r})$ space can be used briefly during grow and shrink operations. Accessing an item by index takes $O(1)$ worst-case time while grow and shrink operations take $O(r)$ amortized time. Using an exact analysis of a \emph{growth game}, we show that for any data structure from a wide class of data structures that uses only $N+O(N^{1/r})$ space to store the array, the amortized cost of grow is $Ω(r)$, even if only grow and access operations are allowed. The time for grow and shrink operations cannot be made worst-case, unless $r=2$.
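For comparison with the space-efficient schemes above, here is a minimal sketch of the standard doubling technique, which uses $O(N)$ space and $O(1)$ amortized time per operation. It grows and shrinks at one end only and does not implement the $N+O(N^{1/r})$ structure. The class name `DoublingArray` and the halve-at-one-quarter policy are assumptions chosen for illustration.

```python
class DoublingArray:
    """Resizable array over a fixed-capacity backing store (standard doubling)."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._data = [None] * self._capacity

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError(index)
        return self._data[index]

    def _resize(self, capacity):
        # Copy the live items into a fresh backing store of the new capacity.
        new_data = [None] * capacity
        new_data[:self._size] = self._data[:self._size]
        self._data = new_data
        self._capacity = capacity

    def grow(self, item):
        if self._size == self._capacity:
            self._resize(2 * self._capacity)
        self._data[self._size] = item
        self._size += 1

    def shrink(self):
        if self._size == 0:
            raise IndexError("shrink from empty array")
        self._size -= 1
        item = self._data[self._size]
        self._data[self._size] = None
        # Halve only when a quarter full, so alternating grow/shrink stays cheap.
        if self._capacity > 1 and self._size <= self._capacity // 4:
            self._resize(self._capacity // 2)
        return item
```

During a resize the old and new backing stores coexist, which is the kind of temporary space the abstract distinguishes from the space needed for storing the array.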
Submitted 29 May, 2023; v1 submitted 20 November, 2022;
originally announced November 2022.
-
A Simpler Proof that Pairing Heaps Take O(1) Amortized Time per Insertion
Authors:
Corwin Sinnamon,
Robert Tarjan
Abstract:
The pairing heap is a simple "self-adjusting" implementation of a heap (priority queue). Inserting an item into a pairing heap or decreasing the key of an item takes O(1) time worst-case, as does melding two heaps. But deleting an item of minimum key can take time linear in the heap size in the worst case. The paper that introduced the pairing heap proved an O(log n) amortized time bound for each heap operation, where n is the number of items in the heap or heaps involved in the operation, by charging all but O(log n) of the time for each deletion to non-deletion operations, O(log n) to each. Later Iacono found a way to reduce the amortized time per insertion to O(1) and that of meld to zero while preserving the O(log n) amortized time bound for the other update operations. We give a simpler proof of Iacono's result with significantly smaller constant factors. Our analysis uses the natural representation of pairing heaps instead of the conversion to a binary tree used in the original analysis and in Iacono's.
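The operations discussed above can be sketched in the natural (multiway tree) representation as follows. This is a minimal illustration of the standard two-pass pairing heap, assuming comparable keys and omitting decrease-key; the class and method names are illustrative and the sketch says nothing about the amortized analysis itself.

```python
class PairingHeap:
    """Min pairing heap in its natural (multiway tree) representation."""

    class _Node:
        __slots__ = ("key", "children")

        def __init__(self, key):
            self.key = key
            self.children = []  # oldest child first, newest child last

    def __init__(self):
        self._root = None

    @staticmethod
    def _link(a, b):
        """Make the root with the larger key the newest child of the other."""
        if a is None:
            return b
        if b is None:
            return a
        if b.key < a.key:
            a, b = b, a
        a.children.append(b)
        return a

    def insert(self, key):
        self._root = self._link(self._root, self._Node(key))

    def meld(self, other):
        """Absorb `other` into this heap, leaving `other` empty."""
        self._root = self._link(self._root, other._root)
        other._root = None

    def find_min(self):
        if self._root is None:
            raise IndexError("find_min on empty heap")
        return self._root.key

    def delete_min(self):
        if self._root is None:
            raise IndexError("delete_min on empty heap")
        root = self._root
        kids = root.children[::-1]  # newest child first
        # Pass 1: link the children in pairs, newest to oldest.
        paired = [
            self._link(kids[i], kids[i + 1] if i + 1 < len(kids) else None)
            for i in range(0, len(kids), 2)
        ]
        # Pass 2: combine the resulting trees in the opposite direction.
        new_root = None
        for tree in reversed(paired):
            new_root = self._link(tree, new_root)
        self._root = new_root
        return root.key
```

Insert and meld each perform a single link, while delete-min does all the restructuring, which is the imbalance the amortized bounds account for.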
Submitted 24 August, 2022;
originally announced August 2022.
-
Finding Strong Components Using Depth-First Search
Authors:
Robert E. Tarjan,
Uri Zwick
Abstract:
We survey three algorithms that use depth-first search to find the strong components of a directed graph in linear time: (1) Tarjan's algorithm; (2) a cycle-finding algorithm; and (3) a bidirectional search algorithm.
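The first of the three algorithms can be sketched as below. This is a minimal recursive rendering of Tarjan's algorithm, assuming the graph is given as a dictionary of successor lists; the function name `strong_components` is illustrative, and the recursion means very deep graphs would need an explicit stack or a higher recursion limit.

```python
def strong_components(graph):
    """Tarjan's algorithm.

    `graph` maps each vertex to a list of successors.
    Returns a list of components, each a list of vertices.
    """
    index = {}        # preorder number of each visited vertex
    low = {}          # smallest preorder number reachable via the DFS rules
    on_stack = set()
    stack = []
    components = []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop it off the stack.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return components


graph = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4], 6: []}
components = strong_components(graph)
print(components)
```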
Submitted 11 April, 2022; v1 submitted 18 January, 2022;
originally announced January 2022.
-
A Tight Analysis of Slim Heaps and Smooth Heaps
Authors:
Corwin Sinnamon,
Robert E. Tarjan
Abstract:
The smooth heap and the closely related slim heap are recently invented self-adjusting implementations of the heap (priority queue) data structure. We analyze the efficiency of these data structures. We obtain the following amortized bounds on the time per operation: $O(1)$ for make-heap, insert, find-min, and meld; $O(\log\log n)$ for decrease-key; and $O(\log n)$ for delete-min and delete, where $n$ is the current number of items in the heap. These bounds are tight not only for smooth and slim heaps but for any heap implementation in Iacono and Özkan's pure heap model, intended to capture all possible "self-adjusting" heap implementations. Slim and smooth heaps are the first known data structures to match Iacono and Özkan's lower bounds and to satisfy the constraints of their model. Our analysis builds on Pettie's insights into the efficiency of pairing heaps, a classical self-adjusting heap implementation.
Submitted 5 November, 2021; v1 submitted 10 August, 2021;
originally announced August 2021.
-
Analysis of Smooth Heaps and Slim Heaps
Authors:
Maria Hartmann,
László Kozma,
Corwin Sinnamon,
Robert E. Tarjan
Abstract:
The smooth heap is a recently introduced self-adjusting heap [Kozma, Saranurak, 2018] similar to the pairing heap [Fredman, Sedgewick, Sleator, Tarjan, 1986]. The smooth heap was obtained as a heap-counterpart of Greedy BST, a binary search tree updating strategy conjectured to be \emph{instance-optimal} [Lucas, 1988], [Munro, 2000]. Several adaptive properties of smooth heaps follow from this connection; moreover, the smooth heap itself has been conjectured to be instance-optimal within a certain class of heaps. Nevertheless, no general analysis of smooth heaps has existed until now, the only previous analysis showing that, when used in \emph{sorting mode} ($n$ insertions followed by $n$ delete-min operations), smooth heaps sort $n$ numbers in $O(n\lg n)$ time.
In this paper we describe a simpler variant of the smooth heap we call the \emph{slim heap}. We give a new, self-contained analysis of smooth heaps and slim heaps in unrestricted operation, obtaining amortized bounds that match the best bounds known for self-adjusting heaps. Previous experimental work has found the pairing heap to dominate other data structures in this class in various settings. Our tests show that smooth heaps and slim heaps are competitive with pairing heaps, outperforming them in some cases, while being comparably easy to implement.
Submitted 10 July, 2021;
originally announced July 2021.
-
Concurrent Disjoint Set Union
Authors:
Siddhartha V. Jayanti,
Robert E. Tarjan
Abstract:
We develop and analyze concurrent algorithms for the disjoint set union (union-find) problem in the shared memory, asynchronous multiprocessor model of computation, with CAS (compare and swap) or DCAS (double compare and swap) as the synchronization primitive. We give a deterministic bounded wait-free algorithm that uses DCAS and has a total work bound of $O(m \cdot (\log(np/m + 1) + α(n, m/(np))))$ for a problem with $n$ elements and $m$ operations solved by $p$ processes, where $α$ is a functional inverse of Ackermann's function. We give two randomized algorithms that use only CAS and have the same work bound in expectation. The analysis of the second randomized algorithm is valid even if the scheduler is adversarial. Our DCAS and randomized algorithms take $O(\log n)$ steps per operation, worst-case for the DCAS algorithm, high-probability for the randomized algorithms. Our work and step bounds grow only logarithmically with $p$, making our algorithms truly scalable. We prove that for a class of symmetric algorithms that includes ours, no better step or work bound is possible.
Submitted 2 March, 2020;
originally announced March 2020.
-
Connected Components on a PRAM in Log Diameter Time
Authors:
S. Cliff Liu,
Robert E. Tarjan,
Peilin Zhong
Abstract:
We present an $O(\log d + \log\log_{m/n} n)$-time randomized PRAM algorithm for computing the connected components of an $n$-vertex, $m$-edge undirected graph with maximum component diameter $d$. The algorithm runs on an ARBITRARY CRCW (concurrent-read, concurrent-write with arbitrary write resolution) PRAM using $O(m)$ processors. The time bound holds with good probability.
Our algorithm is based on the breakthrough results of Andoni et al. [FOCS'18] and Behnezhad et al. [FOCS'19]. Their algorithms run on the more powerful MPC model and rely on sorting and computing prefix sums in $O(1)$ time, tasks that take $Ω(\log n / \log\log n)$ time on a CRCW PRAM with $\text{poly}(n)$ processors. Our simpler algorithm uses limited-collision hashing and does not sort or do prefix sums. It matches the time and space bounds of the algorithm of Behnezhad et al., who improved the time bound of Andoni et al.
It is widely believed that the larger private memory per processor and unbounded local computation of the MPC model admit algorithms faster than those on a PRAM. Our result suggests that such additional power might not be necessary, at least for fundamental graph problems like connected components and spanning forest.
Submitted 21 April, 2021; v1 submitted 1 March, 2020;
originally announced March 2020.
-
A Foundation for Proving Splay is Dynamically Optimal
Authors:
Caleb C. Levy,
Robert E. Tarjan
Abstract:
Consider the task of performing a sequence of searches in a binary search tree. After each search, we allow an algorithm to arbitrarily restructure the tree. The cost of executing the task is the sum of the time spent searching and the time spent optimizing the searches with restructuring operations. Sleator and Tarjan introduced this notion in 1985, along with an algorithm and a conjecture. The algorithm, Splay, is an elegant procedure for performing adjustments that move searched items to the top of the tree. The conjecture, called dynamic optimality, is that the cost of splaying is always within a constant factor of the optimal algorithm for performing searches. We lay a foundation for proving the dynamic optimality conjecture. Central to our method is approximate monotonicity. Approximately monotone algorithms are those whose cost does not increase by more than a fixed multiple after removing searches from the sequence. As we shall see, Splay is dynamically optimal if and only if it is approximately monotone. This result extends to a weaker form of approximate monotonicity as well as insertion, deletion, and related algorithms. We prove that a lower bound on optimal execution cost is approximately monotone and outline how to adapt this proof from the lower bound to Splay, and how to overcome the remaining barriers to establishing dynamic optimality.
Submitted 8 May, 2022; v1 submitted 14 July, 2019;
originally announced July 2019.
-
Splaying Preorders and Postorders
Authors:
Caleb C. Levy,
Robert E. Tarjan
Abstract:
Let $T$ be a binary search tree. We prove two results about the behavior of the Splay algorithm (Sleator and Tarjan 1985). Our first result is that inserting keys into an empty binary search tree via splaying in the order of either $T$'s preorder or $T$'s postorder takes linear time. Our proof uses the fact that preorders and postorders are pattern-avoiding: i.e. they contain no subsequences that are order-isomorphic to $(2,3,1)$ and $(3,1,2)$, respectively. Pattern-avoidance implies certain constraints on the manner in which items are inserted. We exploit this structure with a simple potential function that counts inserted nodes lying on access paths to uninserted nodes. Our methods can likely be extended to permutations that avoid more general patterns. Second, if $T'$ is any other binary search tree with the same keys as $T$ and $T$ is weight-balanced (Nievergelt and Reingold 1973), then splaying $T$'s preorder sequence or $T$'s postorder sequence starting from $T'$ takes linear time. To prove this, we demonstrate that preorders and postorders of balanced search trees do not contain many large "jumps" in symmetric order, and exploit this fact by using the dynamic finger theorem (Cole et al. 2000). Both of our results provide further evidence in favor of the elusive "dynamic optimality conjecture."
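The pattern-avoidance fact used above can be checked experimentally on small trees. The sketch below builds an unbalanced binary search tree from random keys and tests, by brute force over all triples, that its preorder contains no subsequence order-isomorphic to $(2,3,1)$ and its postorder none order-isomorphic to $(3,1,2)$. It does not involve splaying; the helper names and the nested-list tree representation are illustrative.

```python
from itertools import combinations
import random


def bst_insert(tree, key):
    """Insert `key` into an unbalanced BST stored as nested [key, left, right] lists."""
    if tree is None:
        return [key, None, None]
    side = 1 if key < tree[0] else 2
    tree[side] = bst_insert(tree[side], key)
    return tree


def preorder(tree):
    return [] if tree is None else [tree[0]] + preorder(tree[1]) + preorder(tree[2])


def postorder(tree):
    return [] if tree is None else postorder(tree[1]) + postorder(tree[2]) + [tree[0]]


def contains_pattern(seq, pattern):
    """True if some length-3 subsequence of `seq` is order-isomorphic to `pattern`."""
    target = sorted(range(3), key=lambda i: pattern[i])
    return any(
        sorted(range(3), key=lambda i: triple[i]) == target
        for triple in combinations(seq, 3)
    )


rng = random.Random(0)
keys = rng.sample(range(1000), 30)  # distinct keys in random insertion order
tree = None
for k in keys:
    tree = bst_insert(tree, k)

print(contains_pattern(preorder(tree), (2, 3, 1)))   # False
print(contains_pattern(postorder(tree), (3, 1, 2)))  # False
```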
Submitted 14 July, 2019;
originally announced July 2019.
-
Simple Concurrent Labeling Algorithms for Connected Components
Authors:
S. Cliff Liu,
Robert E. Tarjan
Abstract:
We study a class of simple algorithms for concurrently computing the connected components of an $n$-vertex, $m$-edge graph. Our algorithms are easy to implement in either the COMBINING CRCW PRAM or the MPC computing model. For two related algorithms in this class, we obtain $Θ(\lg n)$ step and $Θ(m \lg n)$ work bounds. For two others, we obtain $O(\lg^2 n)$ step and $O(m \lg^2 n)$ work bounds, which are tight for one of them. All our algorithms are simpler than related algorithms in the literature. We also point out some gaps and errors in the analysis of previous algorithms. Our results show that even a basic problem like connected components still has secrets to reveal.
Submitted 2 March, 2020; v1 submitted 14 December, 2018;
originally announced December 2018.
-
Zip Trees
Authors:
Robert E. Tarjan,
Caleb C. Levy,
Stephen Timmel
Abstract:
We introduce the zip tree, a form of randomized binary search tree that integrates previous ideas into one practical, performant, and pleasant-to-implement package. A zip tree is a binary search tree in which each node has a numeric rank and the tree is (max)-heap-ordered with respect to ranks, with rank ties broken in favor of smaller keys. Zip trees are essentially treaps (Seidel and Aragon 1996), except that ranks are drawn from a geometric distribution instead of a uniform distribution, and we allow rank ties. These changes enable us to use fewer random bits per node. We perform insertions and deletions by unmerging and merging paths ("unzipping" and "zipping") rather than by doing rotations, which avoids some pointer changes and improves efficiency. The methods of zipping and unzipping take inspiration from previous top-down approaches to insertion and deletion (Stephenson 1980; Martínez and Roura 1998; Sprugnoli 1980). From a theoretical standpoint, this work provides two main results. First, zip trees require only $O(\log \log n)$ bits (with high probability) to represent the largest rank in an $n$-node binary search tree; previous data structures require $O(\log n)$ bits for the largest rank. Second, zip trees are naturally isomorphic to skip lists (Pugh 1990), and simplify the mapping of Dean and Jones (2007) between skip lists and binary search trees.
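The rank rule and tie-breaking described above can be sketched as follows. For brevity this uses a simple recursive insertion that restructures on the way back up (in effect, by rotations) rather than the unzipping procedure of the abstract; it is meant to produce a tree with the same invariants, assuming distinct keys. The names `Node`, `random_rank`, and `insert` are illustrative.

```python
import random


class Node:
    def __init__(self, key, rank):
        self.key = key
        self.rank = rank
        self.left = None
        self.right = None


def random_rank(rng):
    """Geometric rank: the number of heads before the first tail of a fair coin."""
    rank = 0
    while rng.random() < 0.5:
        rank += 1
    return rank


def insert(root, x):
    """Insert node `x` into the tree rooted at `root`; return the new root.

    Invariants kept: symmetric order on keys, left child rank < parent rank,
    right child rank <= parent rank (ties favor the smaller key as ancestor).
    """
    if root is None:
        return x
    if x.key < root.key:
        if insert(root.left, x) is x:
            if x.rank < root.rank:
                root.left = x
            else:
                # x has the smaller key and rank >= root.rank: x moves above root.
                root.left = x.right
                x.right = root
                return x
    else:
        if insert(root.right, x) is x:
            if x.rank <= root.rank:
                root.right = x
            else:
                # x has the larger key and strictly larger rank: x moves above root.
                root.right = x.left
                x.left = root
                return x
    return root


rng = random.Random(1)
keys = list(range(200))
rng.shuffle(keys)
root = None
for k in keys:
    root = insert(root, Node(k, random_rank(rng)))
```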
Submitted 21 February, 2022; v1 submitted 18 June, 2018;
originally announced June 2018.
-
A Randomized Concurrent Algorithm for Disjoint Set Union
Authors:
Siddhartha V. Jayanti,
Robert E. Tarjan
Abstract:
The disjoint set union problem is a basic problem in data structures with a wide variety of applications. We extend a known efficient sequential algorithm for this problem to obtain a simple and efficient concurrent wait-free algorithm running on an asynchronous parallel random access machine (APRAM). Crucial to our result is the use of randomization. Under a certain independence assumption, for a problem instance in which there are n elements, m operations, and p processes, our algorithm does Theta(m (alpha(n, m/(np)) + log(np/m + 1))) expected work, where the expectation is over the random choices made by the algorithm and alpha is a functional inverse of Ackermann's function. In addition, each operation takes O(log n) steps with high probability. Our algorithm is significantly simpler and more efficient than previous algorithms proposed by Anderson and Woll. Under our independence assumption, our algorithm achieves almost-linear speed-up for applications in which all or most of the processes can be kept busy.
Submitted 5 December, 2016;
originally announced December 2016.
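For orientation, here is a small sequential disjoint-set sketch that uses randomization in its linking rule. It is an assumption-laden illustration only: the abstract does not spell out the linking or path-compaction rules, so the random priority order and path halving below are choices made for this example, and nothing here is concurrent or wait-free. The class name `DisjointSets` and its methods are hypothetical.

```python
import random


class DisjointSets:
    """Sequential union-find with randomized linking (illustrative only)."""

    def __init__(self, n, seed=None):
        rng = random.Random(seed)
        self.parent = list(range(n))
        # A random total order on the elements; the root with the
        # larger priority stays a root when two sets are united.
        self.priority = list(range(n))
        rng.shuffle(self.priority)

    def find(self, x):
        """Return the root of x's set, halving the path along the way."""
        parent = self.parent
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # point x at its grandparent
            x = parent[x]
        return x

    def same_set(self, x, y):
        return self.find(x) == self.find(y)

    def unite(self, x, y):
        """Merge the sets of x and y; return False if already merged."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.priority[rx] < self.priority[ry]:
            self.parent[rx] = ry
        else:
            self.parent[ry] = rx
        return True
```

A concurrent version would additionally need atomic updates of `parent` (for example via compare-and-swap), which this sketch does not attempt.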
-
A Note on Fault Tolerant Reachability for Directed Graphs
Authors:
Loukas Georgiadis,
Robert E. Tarjan
Abstract:
In this note we describe an application of low-high orders in fault-tolerant network design. Baswana et al. [DISC 2015] study the following reachability problem. We are given a flow graph $G = (V, A)$ with start vertex $s$, and a spanning tree $T =(V, A_T)$ rooted at $s$. We call a set of arcs $A'$ valid if the subgraph $G' = (V, A_T \cup A')$ of $G$ has the same dominators as $G$. The goal is to find a valid set of minimum size. Baswana et al. gave an $O(m \log{n})$-time algorithm to compute a minimum-size valid set, where $n = |V|$ and $m = |A|$. Here we provide a simple $O(m)$-time algorithm that uses the dominator tree $D$ of $G$ and a low-high order of it.
Submitted 24 November, 2015;
originally announced November 2015.
-
Hollow Heaps
Authors:
Thomas Dueholm Hansen,
Haim Kaplan,
Robert E. Tarjan,
Uri Zwick
Abstract:
We introduce the hollow heap, a very simple data structure with the same amortized efficiency as the classical Fibonacci heap. All heap operations except delete and delete-min take $O(1)$ time, worst case as well as amortized; delete and delete-min take $O(\log n)$ amortized time on a heap of $n$ items. Hollow heaps are by far the simplest structure to achieve this. Hollow heaps combine two novel ideas: the use of lazy deletion and re-insertion to do decrease-key operations, and the use of a dag (directed acyclic graph) instead of a tree or set of trees to represent a heap. Lazy deletion produces hollow nodes (nodes without items), giving the data structure its name.
Submitted 22 October, 2015;
originally announced October 2015.
-
Amortized Rotation Cost in AVL Trees
Authors:
Mahdi Amani,
Kevin A. Lai,
Robert E. Tarjan
Abstract:
An AVL tree is the original type of balanced binary search tree. An insertion in an $n$-node AVL tree takes at most two rotations, but a deletion in an $n$-node AVL tree can take $Θ(\log n)$ rotations. A natural question is whether deletions can take many rotations not only in the worst case but in the amortized case as well. A sequence of $n$ successive deletions in an $n$-node tree takes $O(n)$ rotations, but what happens when insertions are intermixed with deletions? Haeupler, Sen, and Tarjan conjectured that alternating insertions and deletions in an $n$-node AVL tree can cause each deletion to do $Ω(\log n)$ rotations, but they provided no construction to justify their claim. We provide such a construction: we show that, for infinitely many $n$, there is a set $E$ of {\it expensive} $n$-node AVL trees with the property that, given any tree in $E$, deleting a certain leaf and then reinserting it produces a tree in $E$, with the deletion having done $Θ(\log n)$ rotations. One can do an arbitrary number of such expensive deletion-insertion pairs. The difficulty in obtaining such a construction is that in general the tree produced by an expensive deletion-insertion pair is not the original tree. Indeed, if the trees in $E$ have even height $k$, $2^{k/2}$ deletion-insertion pairs are required to reproduce the original tree.
Submitted 10 June, 2015;
originally announced June 2015.
-
Fibonacci Heaps Revisited
Authors:
Haim Kaplan,
Robert E. Tarjan,
Uri Zwick
Abstract:
The Fibonacci heap is a classic data structure that supports deletions in logarithmic amortized time and all other heap operations in O(1) amortized time. We explore the design space of this data structure. We propose a version with the following improvements over the original: (i) Each heap is represented by a single heap-ordered tree, instead of a set of trees. (ii) Each decrease-key operation does only one cut and a cascade of rank changes, instead of doing a cascade of cuts. (iii) The outcomes of all comparisons done by the algorithm are explicitly represented in the data structure, so none are wasted. We also give an example to show that without cascading cuts or rank changes, both the original data structure and the new version fail to have the desired efficiency, solving an open problem of Fredman. Finally, we illustrate the richness of the design space by proposing several alternative ways to do cascading rank changes, including a randomized strategy related to one previously proposed by Karger. We leave the analysis of these alternatives as intriguing open problems.
Submitted 22 July, 2014;
originally announced July 2014.
-
A Back-to-Basics Empirical Study of Priority Queues
Authors:
Daniel H. Larkin,
Siddhartha Sen,
Robert E. Tarjan
Abstract:
The theory community has proposed several new heap variants in the recent past which have remained largely untested experimentally. We take the field back to the drawing board, with straightforward implementations of both classic and novel structures using only standard, well-known optimizations. We study the behavior of each structure on a variety of inputs, including artificial workloads, workloads generated by running algorithms on real map data, and workloads from a discrete event simulator used in recent systems networking research. We provide observations about which characteristics are most correlated to performance. For example, we find that the L1 cache miss rate appears to be strongly correlated with wallclock time. We also provide observations about how the input sequence affects the relative performance of the different heap variants. For example, we show (both theoretically and in practice) that certain random insertion-deletion sequences are degenerate and can lead to misleading results. Overall, our findings suggest that while the conventional wisdom holds in some cases, it is sorely mistaken in others.
Submitted 2 March, 2014;
originally announced March 2014.
-
Finding Dominators via Disjoint Set Union
Authors:
Wojciech Fraczak,
Loukas Georgiadis,
Andrew Miller,
Robert E. Tarjan
Abstract:
The problem of finding dominators in a directed graph has many important applications, notably in global optimization of computer code. Although linear and near-linear-time algorithms exist, they use sophisticated data structures. We develop an algorithm for finding dominators that uses only a "static tree" disjoint set data structure in addition to simple lists and maps. The algorithm runs in near-linear or linear time, depending on the implementation of the disjoint set data structure. We give several versions of the algorithm, including one that computes loop nesting information (needed in many kinds of global code optimization) and that can be made self-certifying, so that the correctness of the computed dominators is very easy to verify.
Submitted 8 October, 2013;
originally announced October 2013.
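To make the notion of a dominator concrete: a vertex v dominates w if every path from the start vertex s to w passes through v. The brute-force sketch below computes dominator sets directly from that definition by deleting one vertex at a time and re-running a reachability search. It takes O(nm) time and is emphatically not the near-linear algorithm of the paper; the function names and the adjacency-dict input format are assumptions made for this example.

```python
def reachable(succ, start, removed=None):
    """Vertices reachable from `start` when `removed` is deleted."""
    seen = set()
    if start == removed:
        return seen
    seen.add(start)
    stack = [start]
    while stack:
        u = stack.pop()
        for v in succ.get(u, ()):
            if v != removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def dominators(succ, s):
    """Map each vertex reachable from s to the set of its dominators."""
    all_reach = reachable(succ, s)
    dom = {w: {w} for w in all_reach}
    for v in all_reach:
        # Every vertex that becomes unreachable without v is dominated by v.
        still = reachable(succ, s, removed=v)
        for w in all_reach - still:
            dom[w].add(v)
    return dom
```

On a diamond-shaped graph s→a, s→b, a→c, b→c, c→d, neither a nor b dominates c because either branch can be bypassed, while c dominates d.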
-
Dominator Tree Certification and Independent Spanning Trees
Authors:
Loukas Georgiadis,
Robert E. Tarjan
Abstract:
How does one verify that the output of a complicated program is correct? One can formally prove that the program is correct, but this may be beyond the power of existing methods. Alternatively one can check that the output produced for a particular input satisfies the desired input-output relation, by running a checker on the input-output pair. Then one only needs to prove the correctness of the checker. But for some problems even such a checker may be too complicated to formally verify. There is a third alternative: augment the original program to produce not only an output but also a correctness certificate, with the property that a very simple program (whose correctness is easy to prove) can use the certificate to verify that the input-output pair satisfies the desired input-output relation.
We consider the following important instance of this general question: How does one verify that the dominator tree of a flow graph is correct? Existing fast algorithms for finding dominators are complicated, and even verifying the correctness of a dominator tree in the absence of additional information seems complicated. We define a correctness certificate for a dominator tree, show how to use it to easily verify the correctness of the tree, and show how to augment fast dominator-finding algorithms so that they produce a correctness certificate. We also relate the dominator certificate problem to the problem of finding independent spanning trees in a flow graph, and we develop algorithms to find such trees. All our algorithms run in linear time. Previous algorithms apply just to the special case of only trivial dominators, and they take at least quadratic time.
Submitted 7 March, 2013; v1 submitted 31 October, 2012;
originally announced October 2012.
-
A New Approach to Incremental Cycle Detection and Related Problems
Authors:
Michael A. Bender,
Jeremy T. Fineman,
Seth Gilbert,
Robert E. Tarjan
Abstract:
We consider the problem of detecting a cycle in a directed graph that grows by arc insertions, and the related problems of maintaining a topological order and the strong components of such a graph. For these problems, we give two algorithms, one suited to sparse graphs, and the other to dense graphs. The former takes the minimum of O(m^{3/2}) and O(mn^{2/3}) time to insert m arcs into an n-vertex graph; the latter takes O(n^2 log(n)) time. Our sparse algorithm is considerably simpler than a previous O(m^{3/2})-time algorithm; it is also faster on graphs of sufficient density. The time bound of our dense algorithm beats the previously best time bound of O(n^{5/2}) for dense graphs. Our algorithms rely for their efficiency on topologically ordered vertex numberings; bounds on the size of the numbers give bounds on the running times.
Submitted 4 December, 2011;
originally announced December 2011.
-
Incremental Cycle Detection, Topological Ordering, and Strong Component Maintenance
Authors:
Bernhard Haeupler,
Telikepalli Kavitha,
Rogers Mathew,
Siddhartha Sen,
Robert Endre Tarjan
Abstract:
We present two on-line algorithms for maintaining a topological order of a directed $n$-vertex acyclic graph as arcs are added, and detecting a cycle when one is created. Our first algorithm handles $m$ arc additions in $O(m^{3/2})$ time. For sparse graphs ($m/n = O(1)$), this bound improves the best previous bound by a logarithmic factor, and is tight to within a constant factor among algorithms satisfying a natural {\em locality} property. Our second algorithm handles an arbitrary sequence of arc additions in $O(n^{5/2})$ time. For sufficiently dense graphs, this bound improves the best previous bound by a polynomial factor. Our bound may be far from tight: we show that the algorithm can take $Ω(n^2 2^{\sqrt{2\lg n}})$ time by relating its performance to a generalization of the $k$-levels problem of combinatorial geometry. A completely different algorithm running in $Θ(n^2 \log n)$ time was given recently by Bender, Fineman, and Gilbert. We extend both of our algorithms to the maintenance of strong components, without affecting the asymptotic time bounds.
Submitted 12 May, 2011;
originally announced May 2011.
-
Heaps Simplified
Authors:
Bernhard Haeupler,
Siddhartha Sen,
Robert E. Tarjan
Abstract:
The heap is a basic data structure used in a wide variety of applications, including shortest path and minimum spanning tree algorithms. In this paper we explore the design space of comparison-based, amortized-efficient heap implementations. From a consideration of dynamic single-elimination tournaments, we obtain the binomial queue, a classical heap implementation, in a simple and natural way. We give four equivalent ways of representing heaps arising from tournaments, and we obtain two new variants of binomial queues, a one-tree version and a one-pass version. We extend the one-pass version to support key decrease operations, obtaining the {\em rank-pairing heap}, or {\em rp-heap}. Rank-pairing heaps combine the performance guarantees of Fibonacci heaps with simplicity approaching that of pairing heaps. Like pairing heaps, rank-pairing heaps consist of trees of arbitrary structure, but these trees are combined by rank, not by list position, and rank changes, but not structural changes, cascade during key decrease operations.
Submitted 28 February, 2009;
originally announced March 2009.
-
Incremental Topological Ordering and Strong Component Maintenance
Authors:
Bernhard Haeupler,
Siddhartha Sen,
Robert E. Tarjan
Abstract:
We present an on-line algorithm for maintaining a topological order of a directed acyclic graph as arcs are added, and detecting a cycle when one is created. Our algorithm takes O(m^{1/2}) amortized time per arc, where m is the total number of arcs. For sparse graphs, this bound improves the best previous bound by a logarithmic factor and is tight to within a constant factor for a natural class of algorithms that includes all the existing ones. Our main insight is that the bidirectional search method of previous algorithms does not require an ordered search, but can be more general. This allows us to avoid the use of heaps (priority queues) entirely. Instead, the deterministic version of our algorithm uses (approximate) median-finding. The randomized version of our algorithm avoids this complication, making it very simple. We extend our topological ordering algorithm to give the first detailed algorithm for maintaining the strong components of a directed graph, and a topological order of these components, as arcs are added. This extension also has an amortized time bound of O(m^{1/2}) per arc.
Submitted 6 March, 2008;
originally announced March 2008.
-
Finding a Feasible Flow in a Strongly Connected Network
Authors:
Bernhard Haeupler,
Robert E. Tarjan
Abstract:
We consider the problem of finding a feasible single-commodity flow in a strongly connected network with fixed supplies and demands, provided that the sum of supplies equals the sum of demands and the minimum arc capacity is at least this sum. A fast algorithm for this problem improves the worst-case time bound of the Goldberg-Rao maximum flow method by a constant factor. Erlebach and Hagerup gave a linear-time feasible flow algorithm. We give an arguably simpler one.
Submitted 3 December, 2007; v1 submitted 16 November, 2007;
originally announced November 2007.
-
Data Structures for Mergeable Trees
Authors:
Loukas Georgiadis,
Haim Kaplan,
Nira Shafrir,
Robert E. Tarjan,
Renato F. Werneck
Abstract:
Motivated by an application in computational topology, we consider a novel variant of the problem of efficiently maintaining dynamic rooted trees. This variant requires merging two paths in a single operation. In contrast to the standard problem, in which only one tree arc changes at a time, a single merge operation can change many arcs. In spite of this, we develop a data structure that supports merges on an n-node forest in O(log^2 n) amortized time and all other standard tree operations in O(log n) time (amortized, worst-case, or randomized depending on the underlying data structure). For the special case that occurs in the motivating application, in which arbitrary arc deletions (cuts) are not allowed, we give a data structure with an O(log n) time bound per operation. This is asymptotically optimal under certain assumptions. For the even-more special case in which both cuts and parent queries are disallowed, we give an alternative O(log n)-time solution that uses standard dynamic trees as a black box. This solution also applies to the motivating application. Our methods use previous work on dynamic trees in various ways, but the analysis of each algorithm requires novel ideas. We also investigate lower bounds for the problem under various assumptions.
Submitted 11 November, 2007;
originally announced November 2007.
-
Linear-Time Pointer-Machine Algorithms for Path-Evaluation Problems on Trees and Graphs
Authors:
Adam L. Buchsbaum,
Loukas Georgiadis,
Haim Kaplan,
Anne Rogers,
Robert E. Tarjan,
Jeffery R. Westbrook
Abstract:
We present algorithms that run in linear time on pointer machines for a collection of problems, each of which either directly or indirectly requires the evaluation of a function defined on paths in a tree. These problems previously had linear-time algorithms but only for random-access machines (RAMs); the best pointer-machine algorithms were super-linear by an inverse-Ackermann-function factor. Our algorithms are also simpler, in some cases substantially, than the previous linear-time RAM algorithms. Our improvements come primarily from three new ideas: a refined analysis of path compression that gives a linear bound if the compressions favor certain nodes, a pointer-based radix sort as a replacement for table-based methods, and a more careful partitioning of a tree into easily managed parts. Our algorithms compute nearest common ancestors off-line, verify and construct minimum spanning trees, do interval analysis on a flowgraph, find the dominators of a flowgraph, and build the component tree of a weighted tree.
Submitted 14 November, 2006; v1 submitted 15 July, 2002;
originally announced July 2002.
-
Faster Parametric Shortest Path and Minimum Balance Algorithms
Authors:
Neal Young,
Robert Tarjan,
James Orlin
Abstract:
The parametric shortest path problem is to find the shortest paths in a graph where the edge costs are of the form w_ij+lambda, where each w_ij is constant and lambda is a parameter that varies. The problem is to find shortest path trees for every possible value of lambda.
The minimum-balance problem is to find a ``weighting'' of the vertices so that adjusting the edge costs by the vertex weights yields a graph in which, for every cut, the minimum weight of any edge crossing the cut in one direction equals the minimum weight of any edge crossing the cut in the other direction.
The paper presents fast algorithms for both problems. The algorithms run in O(nm+n^2 log n) time. The paper also describes empirical studies of the algorithms on random graphs, suggesting that the expected time for finding a minimum-mean cycle (an important special case of both problems) is O(n log(n) + m).
Submitted 18 May, 2002;
originally announced May 2002.
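The minimum-mean cycle problem mentioned in the abstract above can be illustrated with Karp's classical dynamic program, which is a different and simpler technique than the parametric algorithms of the paper and runs in O(nm) time. The sketch below assumes integer edge weights and vertices numbered 0..n-1; the function name `min_mean_cycle` and the edge-triple input format are choices made for this example. Starting every vertex at distance 0 plays the role of a virtual source connected to all vertices.

```python
from fractions import Fraction
import math


def min_mean_cycle(n, edges):
    """Karp's algorithm: minimum mean weight of a directed cycle, or None.

    `edges` is a list of (u, v, w) triples with integer weights w.
    """
    inf = math.inf
    # d[k][v] = minimum weight of a walk with exactly k edges ending at v,
    # where a walk may start at any vertex.
    d = [[inf] * n for _ in range(n + 1)]
    d[0] = [0] * n
    for k in range(1, n + 1):
        prev, cur = d[k - 1], d[k]
        for u, v, w in edges:
            if prev[u] + w < cur[v]:
                cur[v] = prev[u] + w
    best = None
    for v in range(n):
        if d[n][v] == inf:
            continue  # no n-edge walk ends here
        worst = max(
            Fraction(d[n][v] - d[k][v], n - k)
            for k in range(n)
            if d[k][v] < inf
        )
        if best is None or worst < best:
            best = worst
    return best
```

Using `Fraction` keeps the result exact, so comparisons between cycle means do not depend on floating-point rounding. An acyclic input has no n-edge walks, so the function returns `None`.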