-
Bottom-up Rebalancing Binary Search Trees by Flipping a Coin
Authors:
Gerth Stølting Brodal
Abstract:
Rebalancing schemes for dynamic binary search trees are numerous in the literature, where the goal is to maintain trees of low height, either in the worst-case or expected sense. In this paper we study randomized rebalancing schemes for sequences of $n$ insertions into an initially empty binary search tree, under the assumption that a tree only stores the elements and the tree structure without any additional balance information. Seidel~(2009) presented a top-down randomized insertion algorithm, where insertions take expected $O\big(\lg^2 n\big)$ time, and the resulting trees have the same distribution as inserting a uniform random permutation into a binary search tree without rebalancing. Seidel states as an open problem if a similar result can be achieved with bottom-up insertions. In this paper we fail to answer this question.
We consider two simple canonical randomized bottom-up insertion algorithms on binary search trees, assuming that an insertion is given the position where to insert the next element. The subsequent rebalancing is performed bottom-up in expected $O(1)$ time, uses expected $O(1)$ random bits, performs at most two rotations, and the rotations appear with geometrically decreasing probability in the distance from the leaf. For some insertion sequences the expected depth of each node is proved to be $O(\lg n)$. On the negative side, we prove for both algorithms that there exist simple insertion sequences where the expected depth is $Ω(n)$, i.e., the studied rebalancing schemes are \emph{not} competitive with (most) other rebalancing schemes in the literature.
Submitted 12 April, 2024;
originally announced April 2024.
-
Dynamic Convex Hulls for Simple Paths
Authors:
Bruce Brewer,
Gerth Stølting Brodal,
Haitao Wang
Abstract:
We consider the planar dynamic convex hull problem. In the literature, solutions exist supporting the insertion and deletion of points in poly-logarithmic time and various queries on the convex hull of the current set of points in logarithmic time. If arbitrary insertion and deletion of points are allowed, constant time updates and fast queries are known to be impossible. This paper considers two restricted cases where worst-case constant time updates and logarithmic time queries are possible. We assume all updates are performed on a deque (double-ended queue) of points. The first case considers the monotonic path case, where all points are sorted in a given direction, say horizontally left-to-right, and only the leftmost and rightmost points can be inserted and deleted. The second case assumes that the points in the deque constitute a simple path. Note that the monotone case is a special case of the simple path case. For both cases, we present solutions supporting deque insertions and deletions in worst-case constant time and standard queries on the convex hull of the points in $O(\log n)$ time, where $n$ is the number of points in the current point set. The convex hull of the current point set can be reported in $O(h+\log n)$ time, where $h$ is the number of edges of the convex hull. For the 1-sided monotone path case, where updates are only allowed on one side, the reporting time can be reduced to $O(h)$, and queries on the convex hull are supported in $O(\log h)$ time. All our time bounds are worst case. In addition, we prove lower bounds that match these time bounds, and thus our results are optimal. For a quick comparison, the previous best update bounds for the simple path problem were amortized $O(\log n)$ time by Friedman, Hershberger, and Snoeyink [SoCG 1989].
Submitted 8 March, 2024;
originally announced March 2024.
-
Deterministic Cache-Oblivious Funnelselect
Authors:
Gerth Stølting Brodal,
Sebastian Wild
Abstract:
In the multiple-selection problem one is given an unsorted array $S$ of $N$ elements and an array of $q$ query ranks $r_1<\cdots<r_q$, and the task is to return, in sorted order, the $q$ elements in $S$ of rank $r_1, \ldots, r_q$, respectively. The asymptotic deterministic comparison complexity of the problem was settled by Dobkin and Munro [JACM 1981]. In the I/O model an optimal I/O complexity was achieved by Hu et al. [SPAA 2014]. Recently [ESA 2023], we presented a cache-oblivious algorithm with matching I/O complexity, named funnelselect, since it heavily borrows ideas from the cache-oblivious sorting algorithm funnelsort from the seminal paper by Frigo, Leiserson, Prokop and Ramachandran [FOCS 1999]. Funnelselect is inherently randomized as it relies on sampling for cheaply finding many good pivots. In this paper we present deterministic funnelselect, achieving the same optimal I/O complexity cache-obliviously without randomization. Our new algorithm essentially replaces a single (in expectation) reversed-funnel computation using random pivots by a recursive algorithm using multiple reversed-funnel computations. To meet the I/O bound, this requires a carefully chosen subproblem size based on the entropy of the sequence of query ranks; deterministic funnelselect thus raises distinct technical challenges not met by randomized funnelselect. The resulting worst-case I/O bound is $O\bigl(\sum_{i=1}^{q+1} \frac{Δ_i}{B} \cdot \log_{M/B} \frac{N}{Δ_i} + \frac{N}{B}\bigr)$, where $B$ is the external memory block size, $M\geq B^{1+ε}$ is the internal memory size, for some constant $ε>0$, and $Δ_i = r_{i} - r_{i-1}$ (assuming $r_0=0$ and $r_{q+1}=N + 1$).
Submitted 27 February, 2024;
originally announced February 2024.
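To make the problem statement of this entry concrete, here is a minimal Python sketch of multiple selection by full sorting, together with the gaps $Δ_i$ that appear in the stated I/O bound. It is a naive in-memory baseline for illustration only, not funnelselect, and the names `multiple_select_naive` and `rank_gaps` are hypothetical.

```python
def multiple_select_naive(S, ranks):
    """Return the elements of S with the given 1-based ranks, in sorted order.

    Naive baseline: sorts all of S, so it does no less work when q is small.
    Assumes ranks are strictly increasing and lie in 1..len(S).
    """
    ordered = sorted(S)
    return [ordered[r - 1] for r in ranks]


def rank_gaps(ranks, N):
    """Return [Δ_1, ..., Δ_{q+1}] where Δ_i = r_i - r_{i-1},
    using the abstract's conventions r_0 = 0 and r_{q+1} = N + 1."""
    padded = [0] + list(ranks) + [N + 1]
    return [padded[i] - padded[i - 1] for i in range(1, len(padded))]


if __name__ == "__main__":
    S = [5, 3, 9, 1, 7]
    print(multiple_select_naive(S, [2, 4]))
    print(rank_gaps([2, 4], len(S)))
```

The gaps sum to $N + 1$ by construction, which is a quick sanity check when experimenting with different rank sequences.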
-
Soft Sequence Heaps
Authors:
Gerth Stølting Brodal
Abstract:
Chazelle [JACM00] introduced the soft heap as a building block for efficient minimum spanning tree algorithms, and recently Kaplan et al. [SOSA2019] showed how soft heaps can be applied to achieve simpler algorithms for various selection problems. A soft heap trades off accuracy for efficiency by allowing $εN$ of the items in a heap to be corrupted after a total of $N$ insertions, where a corrupted item is an item with an artificially increased key and $0 < ε\leq 1/2$ is a fixed error parameter. Chazelle's soft heaps are based on binomial trees and support insertions in amortized $O(\lg(1/ε))$ time and extract-min operations in amortized $O(1)$ time.
In this paper we explore the design space of soft heaps. The main contribution of this paper is an alternative soft heap implementation based on merging sorted sequences, with time bounds matching those of Chazelle's soft heaps. We also discuss a variation of the soft heap by Kaplan et al. [SICOMP2013], where we avoid performing insertions lazily. It is based on ternary trees instead of binary trees and matches the time bounds of Kaplan et al., i.e. amortized $O(1)$ insertions and amortized $O(\lg(1/ε))$ extract-min. Both our data structures only introduce corruptions after extract-min operations which return the set of items corrupted by the operation.
Submitted 12 August, 2020;
originally announced August 2020.
-
Dynamic Planar Convex Hull
Authors:
Riko Jacob,
Gerth Stølting Brodal
Abstract:
In this article, we determine the amortized computational complexity of the planar dynamic convex hull problem by querying.
We present a data structure that maintains a set of n points in the plane under the insertion and deletion of points in amortized O(log n) time per operation. The space usage of the data structure is O(n). The data structure supports extreme point queries in a given direction, tangent queries through a given point, and queries for the neighboring points on the convex hull in O(log n) time. The extreme point queries can be used to decide whether or not a given line intersects the convex hull, and the tangent queries to determine whether a given point is inside the convex hull.
We give a lower bound on the amortized asymptotic time complexity that matches the performance of this data structure.
Submitted 28 February, 2019;
originally announced February 2019.
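The abstract of this entry notes that extreme point queries can decide whether a line intersects the convex hull. The reduction can be sketched as follows in Python, under loud assumptions: the extreme point query is replaced by a linear scan over the points (not the paper's O(log n) data structure), the line is given as `a*x + b*y = c` with `(a, b)` nonzero, and the function names are hypothetical.

```python
def extreme_point(points, dx, dy):
    """Return a point of `points` that is extreme in direction (dx, dy).

    Linear scan stand-in for a logarithmic-time extreme point query.
    """
    return max(points, key=lambda p: p[0] * dx + p[1] * dy)


def line_intersects_hull(points, a, b, c):
    """Decide whether the line a*x + b*y = c meets the convex hull of `points`.

    The hull lies between its extreme points in directions (a, b) and
    (-a, -b), so the line meets it exactly when c falls between the
    values of a*x + b*y at those two points.
    """
    hi = extreme_point(points, a, b)
    lo = extreme_point(points, -a, -b)
    return a * lo[0] + b * lo[1] <= c <= a * hi[0] + b * hi[1]


if __name__ == "__main__":
    square = [(0, 0), (2, 0), (2, 2), (0, 2)]
    print(line_intersects_hull(square, 1, 1, 1))
    print(line_intersects_hull(square, 1, 1, 5))
```

Only two extreme point queries are needed per line, which is why the query cost of the underlying structure carries over directly to this test.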
-
Cache Oblivious Algorithms for Computing the Triplet Distance Between Trees
Authors:
Gerth Stølting Brodal,
Konstantinos Mampentzidis
Abstract:
We study the problem of computing the triplet distance between two rooted unordered trees with $n$ labeled leaves. Introduced by Dobson in 1975, the triplet distance is the number of leaf triples that induce different topologies in the two trees. The current theoretically best algorithm is an $\mathrm{O}(n \log n)$ time algorithm by Brodal et al. (SODA 2013). Recently Jansson and Rajaby proposed a new algorithm that, while slower in theory, requiring $\mathrm{O}(n \log^3 n)$ time, outperforms the theoretically faster $\mathrm{O}(n \log n)$ algorithm in practice. Neither algorithm scales to external memory. We present two cache oblivious algorithms that combine the best of both worlds. The first algorithm is for the case when the two input trees are binary trees and the second is a generalized algorithm for two input trees of arbitrary degree. Analyzed in the RAM model, both algorithms require $\mathrm{O}(n \log n)$ time, and in the cache oblivious model $\mathrm{O}(\frac{n}{B} \log_{2} \frac{n}{M})$ I/Os. Their relative simplicity and the fact that they scale to external memory make them achieve the best practical performance. We note that these are the first algorithms that scale to external memory, both in theory and practice, for this problem.
Submitted 7 November, 2019; v1 submitted 30 June, 2017;
originally announced June 2017.
-
External Memory Three-Sided Range Reporting and Top-$k$ Queries with Sublogarithmic Updates
Authors:
Gerth Stølting Brodal
Abstract:
An external memory data structure is presented for maintaining a dynamic set of $N$ two-dimensional points under the insertion and deletion of points, and supporting 3-sided range reporting queries and top-$k$ queries, where top-$k$ queries report the $k$~points with highest $y$-value within a given $x$-range. For any constant $0<\varepsilon\leq \frac{1}{2}$, a data structure is constructed that supports updates in amortized $O(\frac{1}{\varepsilon B^{1-\varepsilon}}\log_B N)$ IOs and queries in amortized $O(\frac{1}{\varepsilon}\log_B N+K/B)$ IOs, where $B$ is the external memory block size, and $K$ is the size of the output to the query (for top-$k$ queries $K$ is the minimum of $k$ and the number of points in the query interval). The data structure uses linear space. The update bound is a significant factor $B^{1-\varepsilon}$ improvement over the previous best update bounds for the two query problems, while staying within the same query and space bounds.
Submitted 28 September, 2015;
originally announced September 2015.
-
Strictly Implicit Priority Queues: On the Number of Moves and Worst-Case Time
Authors:
Gerth Stølting Brodal,
Jesper Sindahl Nielsen,
Jakob Truelsen
Abstract:
The binary heap of Williams (1964) is a simple priority queue characterized by only storing an array containing the elements and the number of elements $n$ - here denoted a strictly implicit priority queue. We introduce two new strictly implicit priority queues. The first structure supports amortized $O(1)$ time Insert and $O(\log n)$ time ExtractMin operations, where both operations require amortized $O(1)$ element moves. No previous implicit heap with $O(1)$ time Insert supports both operations with $O(1)$ moves. The second structure supports worst-case $O(1)$ time Insert and $O(\log n)$ time (and moves) ExtractMin operations. Previous results were either amortized or needed $O(\log n)$ bits of additional state information between operations.
Submitted 1 May, 2015;
originally announced May 2015.
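For reference, the classical binary heap that this entry uses as its starting point can be sketched in Python as below. This is the textbook structure, not either of the two new priority queues from the paper; the class name `BinaryHeap` is hypothetical, and a Python list stands in for "an array plus the number of elements $n$", since the list tracks its own length.

```python
class BinaryHeap:
    """Textbook array-based min-heap: the only state is the array itself."""

    def __init__(self):
        self.a = []  # a[0] is the minimum; children of i are 2i+1 and 2i+2

    def insert(self, x):
        """Append x and sift it up; O(log n) time and moves in the worst case."""
        a = self.a
        a.append(x)
        i = len(a) - 1
        while i > 0:
            parent = (i - 1) // 2
            if a[parent] <= a[i]:
                break
            a[i], a[parent] = a[parent], a[i]
            i = parent

    def extract_min(self):
        """Remove and return the minimum; raises IndexError if the heap is empty."""
        a = self.a
        smallest = a[0]
        last = a.pop()
        if a:
            # Move the last element to the root and sift it down.
            a[0] = last
            i, n = 0, len(a)
            while True:
                child = 2 * i + 1
                if child >= n:
                    break
                if child + 1 < n and a[child + 1] < a[child]:
                    child += 1
                if a[i] <= a[child]:
                    break
                a[i], a[child] = a[child], a[i]
                i = child
        return smallest


if __name__ == "__main__":
    h = BinaryHeap()
    for x in [5, 3, 8, 1, 4]:
        h.insert(x)
    print([h.extract_min() for _ in range(5)])
```

In this baseline both operations take logarithmic time and may move a logarithmic number of elements, which is the cost the abstract's bounds on Insert time and element moves are measured against.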
-
Optimal Planar Orthogonal Skyline Counting Queries
Authors:
Gerth Stølting Brodal,
Kasper Green Larsen
Abstract:
The skyline of a set of points in the plane is the subset of maximal points, where a point $(x,y)$ is maximal if no other point $(x',y')$ satisfies $x'\ge x$ and $y'\ge y$. We consider the problem of preprocessing a set $P$ of $n$ points into a space efficient static data structure supporting orthogonal skyline counting queries, i.e. given a query rectangle $R$ to report the size of the skyline of $P$ intersected with $R$. We present a data structure for storing $n$ points with integer coordinates having query time $O(\lg n/\lg\lg n)$ and space usage $O(n)$. The model of computation is a unit cost RAM with logarithmic word size. We prove that these bounds are the best possible by presenting a lower bound in the cell probe model with logarithmic word size: Space usage $n\lg^{O(1)} n$ implies worst case query time $Ω(\lg n/\lg\lg n)$.
Submitted 24 April, 2014; v1 submitted 30 April, 2013;
originally announced April 2013.
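The skyline definition and the counting query from this entry can be illustrated with a small Python sketch. It is a naive sort-and-sweep that recomputes the skyline for every query, not the paper's data structure. It assumes distinct points and a closed query rectangle given as `(x1, x2, y1, y2)`, and the function names are hypothetical.

```python
def skyline(points):
    """Return the maximal points, ordered by decreasing x.

    Sweep from right to left: a point is maximal exactly when its y is
    strictly larger than every y seen so far (points are assumed distinct).
    """
    result = []
    best_y = float("-inf")
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result


def skyline_count(points, x1, x2, y1, y2):
    """Size of the skyline of the points inside the closed rectangle
    [x1, x2] x [y1, y2]."""
    inside = [(x, y) for x, y in points if x1 <= x <= x2 and y1 <= y <= y2]
    return len(skyline(inside))


if __name__ == "__main__":
    P = [(1, 5), (2, 3), (3, 4), (4, 1), (2, 2)]
    print(skyline(P))
    print(skyline_count(P, 1, 3, 2, 5))
```

Note that the skyline of the points inside $R$ is not, in general, the skyline of $P$ restricted to $R$: a point dominated only by points outside the rectangle becomes maximal once those are filtered out.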
-
Dynamic 3-sided Planar Range Queries with Expected Doubly Logarithmic Time
Authors:
Gerth Stølting Brodal,
Alexis C. Kaporis,
Apostolos N. Papadopoulos,
Spyros Sioutas,
Konstantinos Tsakalidis,
Kostas Tsichlas
Abstract:
This work studies the problem of 2-dimensional searching for the 3-sided range query of the form $[a, b]\times (-\infty, c]$ in both main and external memory, by considering a variety of input distributions. We present three sets of solutions, each of which examines the 3-sided problem in both the RAM and the I/O model. The presented data structures are deterministic and the expectation is with respect to the input distribution.
Submitted 12 January, 2012;
originally announced January 2012.
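For readers unfamiliar with the query type in this entry, a 3-sided query $[a, b]\times (-\infty, c]$ reports every point whose $x$-coordinate lies in $[a, b]$ and whose $y$-coordinate is at most $c$. The following Python sketch answers it by a linear scan; it only fixes the semantics and is not one of the paper's data structures, and the function name is hypothetical.

```python
def three_sided_query(points, a, b, c):
    """Report all points (x, y) with a <= x <= b and y <= c by linear scan."""
    return [(x, y) for x, y in points if a <= x <= b and y <= c]


if __name__ == "__main__":
    pts = [(1, 4), (2, 1), (3, 7), (5, 2), (6, 0)]
    print(three_sided_query(pts, 2, 5, 3))
```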
-
Cache-Oblivious Implicit Predecessor Dictionaries with the Working Set Property
Authors:
Gerth Stølting Brodal,
Casper Kejlberg-Rasmussen
Abstract:
In this paper we present an implicit dynamic dictionary with the working-set property, supporting insert(e) and delete(e) in O(log n) time, predecessor(e) in O(log l_{p(e)}) time, successor(e) in O(log l_{s(e)}) time and search(e) in O(log min(l_{p(e)},l_{e}, l_{s(e)})) time, where n is the number of elements stored in the dictionary, l_{e} is the number of distinct elements searched for since element e was last searched for and p(e) and s(e) are the predecessor and successor of e, respectively. The time-bounds are all worst-case. The dictionary stores the elements in an array of size n using no additional space. In the cache-oblivious model the log is base B and the cache-obliviousness is due to our black box use of an existing cache-oblivious implicit dictionary. This is the first implicit dictionary supporting predecessor and successor searches in the working-set bound. Previous implicit structures required O(log n) time.
Submitted 21 February, 2012; v1 submitted 22 December, 2011;
originally announced December 2011.
-
D$^2$-Tree: A New Overlay with Deterministic Bounds
Authors:
G. S. Brodal,
S. Sioutas,
K. Tsichlas,
C. Zaroliagis
Abstract:
We present a new overlay, called the {\em Deterministic Decentralized tree} ($D^2$-tree). The $D^2$-tree compares favourably to other overlays for the following reasons: (a) it provides matching and better complexities, which are deterministic for the supported operations; (b) the management of nodes (peers) and the management of elements are completely decoupled from each other; and (c) an efficient deterministic load-balancing mechanism is presented for the uniform distribution of elements into nodes, while at the same time probabilistic optimal bounds are provided for the congestion of operations at the nodes. The load-balancing scheme of elements into nodes is deterministic and general enough to be applied to other hierarchical tree-based overlays. This load-balancing mechanism is based on an innovative lazy weight-balancing mechanism, which is interesting in its own right.
Submitted 8 March, 2012; v1 submitted 16 September, 2010;
originally announced September 2010.