-
Online List Labeling with Near-Logarithmic Writes
Authors:
Martin P. Seybold
Abstract:
In the Online List Labeling problem, a set of $n \leq N$ elements from a totally ordered universe must be stored in sorted order in an array with $m=N+\lceil\varepsilon N \rceil$ slots, where $\varepsilon \in (0,1]$ is constant, while an adversary chooses elements that must be inserted and deleted from the set.
We devise a skip-list based algorithm for maintaining order against an oblivious adversary and show that the expected amortized number of writes is $O(\varepsilon^{-1}\log (n) \operatorname{poly}(\log \log n))$ per update.
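To make the problem setting concrete, here is a minimal sketch of a naive baseline that keeps the array sorted and counts slot writes. It is not the skip-list based algorithm of the paper; the class name `NaiveListLabeling` and its gap-or-shift strategy are assumptions made purely for illustration.

```python
import math


class NaiveListLabeling:
    """Hypothetical baseline: sorted elements in m = N + ceil(eps * N) slots."""

    def __init__(self, capacity_n, eps):
        self.N = capacity_n
        self.m = capacity_n + math.ceil(eps * capacity_n)
        self.slots = [None] * self.m   # None marks an empty slot
        self.n = 0                     # number of stored elements
        self.writes = 0                # total number of slot writes so far

    def _write(self, i, value):
        self.slots[i] = value
        self.writes += 1

    def insert(self, key):
        if self.n >= self.N:
            raise OverflowError("at most N elements may be stored")
        # Locate the slots of the predecessor (p) and successor (s) of key.
        p, s = -1, self.m
        for i, v in enumerate(self.slots):
            if v is None:
                continue
            if v < key:
                p = i
            elif v > key:
                s = i
                break
            else:
                return  # key already present
        if s - p > 1:
            # A free slot lies strictly between p and s: one write suffices.
            self._write((p + s) // 2, key)
        else:
            # No gap: shift a run of elements towards the nearest free slot.
            g = next((i for i in range(s, self.m) if self.slots[i] is None), None)
            if g is not None:
                for i in range(g, s, -1):      # shift slots s..g-1 one step right
                    self._write(i, self.slots[i - 1])
                self._write(s, key)
            else:
                g = next(i for i in range(p, -1, -1) if self.slots[i] is None)
                for i in range(g, p):          # shift slots g+1..p one step left
                    self._write(i, self.slots[i + 1])
                self._write(p, key)
        self.n += 1
```

Inserting keys in increasing order forces long shifts in this baseline, which is the kind of adversarial behaviour that the write bound in the abstract is about.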
Submitted 7 May, 2024;
originally announced May 2024.
-
Covering Rectilinear Polygons with Area-Weighted Rectangles
Authors:
Kathrin Hanauer,
Martin P. Seybold,
Julian Unterweger
Abstract:
Representing a polygon using a set of simple shapes has numerous applications in different use-case scenarios. We consider the problem of covering the interior of a rectilinear polygon with holes by a set of area-weighted, axis-aligned rectangles such that the total weight of the rectangles in the cover is minimized. Already the unit-weight case is known to be NP-hard and the general problem has, to the best of our knowledge, not been studied experimentally before.
We show a new basic property of optimal solutions of the weighted problem. This allows us to speed up existing algorithms for the unit-weight case, obtain an improved ILP formulation for both the weighted and unweighted problem, and develop several approximation algorithms and heuristics for the weighted case.
All our algorithms are evaluated in a large experimental study on 186 837 polygons combined with six cost functions, which provides evidence that our algorithms are both fast and yield close-to-optimal solutions in practice.
Submitted 13 December, 2023;
originally announced December 2023.
-
On the Complexity of Algorithms with Predictions for Dynamic Graph Problems
Authors:
Monika Henzinger,
Barna Saha,
Martin P. Seybold,
Christopher Ye
Abstract:
{\em Algorithms with predictions} incorporate machine learning predictions into algorithm design. A plethora of recent works incorporated predictions to improve on worst-case optimal bounds for online problems. In this paper, we initiate the study of complexity of dynamic data structures with predictions, including dynamic graph algorithms. Unlike in online algorithms, the main goal in dynamic data structures is to maintain the solution {\em efficiently} with every update.
Motivated by work in online algorithms, we investigate three natural models of predictions: (1) $\varepsilon$-accurate predictions where each predicted request matches the true request with probability at least $\varepsilon$, (2) list-accurate predictions where a true request comes from a list of possible requests, and (3) bounded delay predictions where the true requests are some (unknown) permutations of the predicted requests. For $\varepsilon$-accurate predictions, we show that lower bounds from the non-prediction setting of a problem carry over, up to a $1-\varepsilon$ factor. Then we give general reductions among the prediction models for a problem, showing that lower bounds for bounded delay imply lower bounds for list-accurate predictions, which imply lower bounds for $\varepsilon$-accurate predictions.
Further, we identify two broad problem classes based on lower bounds due to the Online Matrix Vector (OMv) conjecture. Specifically, we show that dynamic problems that are {\em locally correctable} have strong conditional lower bounds for list-accurate predictions that are equivalent to the non-prediction setting, unless list-accurate predictions are perfect. Moreover, dynamic problems that are {\em locally reducible} have a smooth transition in the running time. We categorize problems accordingly and give upper bounds that show that our lower bounds are almost tight, including problems in dynamic graphs.
Submitted 10 September, 2023; v1 submitted 31 July, 2023;
originally announced July 2023.
-
B-Treaps Revised: Write Efficient Randomized Block Search Trees with High Load
Authors:
Roodabeh Safavi,
Martin P. Seybold
Abstract:
Uniquely represented data structures represent each logical state with a unique storage state. We study the problem of maintaining a dynamic set of $n$ keys from a totally ordered universe in this context.
We introduce a two-layer data structure called $(α,\varepsilon)$-Randomized Block Search Tree (RBST) that is uniquely represented and suitable for external memory. Though RBSTs naturally generalize the well-known binary Treaps, several new ideas are needed to analyze the {\em expected} search, update, and storage efficiency in terms of block-reads, block-writes, and blocks stored. We prove that searches have $O(\varepsilon^{-1} + \log_α n)$ block-reads, that $(α, \varepsilon)$-RBSTs have an asymptotic load-factor of at least $(1-\varepsilon)$ for every $\varepsilon \in (0,1/2]$, and that dynamic updates perform $O(\varepsilon^{-1} + \log_α(n)/α)$ block-writes, i.e. $O(1/\varepsilon)$ writes if $α=Ω(\frac{\log n}{\log \log n} )$. Thus $(α, \varepsilon)$-RBSTs provide improved search-, storage-, and write-efficiency bounds compared to the known, uniquely represented B-Treap [Golovin; ICALP'09].
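As an illustration of unique representation, the following minimal sketch builds a plain binary treap whose priorities are derived from a hash of the key, so the tree shape depends only on the key set and not on the insertion order. It is not the RBST of the paper; the helper names `priority`, `insert`, `build`, and `inorder`, and the choice of SHA-256, are assumptions for illustration only.

```python
import hashlib


def priority(key):
    """Deterministic pseudo-random priority derived from the key (assumed scheme)."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def insert(node, key):
    """Insert key into a treap given as None or an immutable (key, left, right) tuple."""
    if node is None:
        return (key, None, None)
    k, left, right = node
    if key == k:
        return node
    if key < k:
        left = insert(left, key)
        if priority(left[0]) > priority(k):
            # Rotate right so that the higher-priority child becomes the root.
            lk, ll, lr = left
            return (lk, ll, (k, lr, right))
        return (k, left, right)
    right = insert(right, key)
    if priority(right[0]) > priority(k):
        # Rotate left, the mirror image of the case above.
        rk, rl, rr = right
        return (rk, (k, left, rl), rr)
    return (k, left, right)


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def inorder(node):
    if node is None:
        return []
    k, left, right = node
    return inorder(left) + [k] + inorder(right)
```

Building the treap from the same keys in two different orders, for example `build(range(1, 21))` and `build(reversed(range(1, 21)))`, gives identical tuples as long as the hashed priorities are distinct.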
Submitted 8 March, 2023;
originally announced March 2023.
-
Map matching queries on realistic input graphs under the Fréchet distance
Authors:
Joachim Gudmundsson,
Martin P. Seybold,
Sampson Wong
Abstract:
Map matching is a common preprocessing step for analysing vehicle trajectories. In the theory community, the most popular approach for map matching is to compute a path on the road network that is the most spatially similar to the trajectory, where spatial similarity is measured using the Fréchet distance. A shortcoming of existing map matching algorithms under the Fréchet distance is that every time a trajectory is matched, the entire road network needs to be reprocessed from scratch. An open problem is whether one can preprocess the road network into a data structure, so that map matching queries can be answered in sublinear time.
In this paper, we investigate map matching queries under the Fréchet distance. We provide a negative result for geometric planar graphs. We show that, unless SETH fails, there is no data structure that can be constructed in polynomial time that answers map matching queries in $O((pq)^{1-δ})$ query time for any $δ> 0$, where $p$ and $q$ are the complexities of the geometric planar graph and the query trajectory, respectively. We provide a positive result for realistic input graphs, which we regard as the main result of this paper. We show that for $c$-packed graphs, one can construct a data structure of $\tilde O(cp)$ size that can answer $(1+\varepsilon)$-approximate map matching queries in $\tilde O(c^4 q \log^4 p)$ time, where $\tilde O(\cdot)$ hides lower-order factors and dependence on $\varepsilon$.
Submitted 25 January, 2024; v1 submitted 5 November, 2022;
originally announced November 2022.
-
On Practical Nearest Sub-Trajectory Queries under the Fréchet Distance
Authors:
Joachim Gudmundsson,
John Pfeifer,
Martin P. Seybold
Abstract:
We study the problem of sub-trajectory nearest-neighbor queries on polygonal curves under the continuous Fréchet distance. Given an $n$ vertex trajectory $P$ and an $m$ vertex query trajectory $Q$, we seek to report a vertex-aligned sub-trajectory $P'$ of $P$ that is closest to $Q$, i.e. $P'$ must start and end on contiguous vertices of $P$. Since in real data $P$ typically contains a very large number of vertices, we focus on answering queries, without restrictions on $P$ or $Q$, using only precomputed structures of ${\mathcal{O}}(n)$ size.
We use three baseline algorithms obtained from straightforward extensions of known work; however, they have impractical performance on realistic inputs. Therefore, we propose a new Hierarchical Simplification Tree data structure and an adaptive clustering based query algorithm that efficiently explores relevant parts of $P$. The core of our query methods is a novel greedy-backtracking algorithm that solves the Fréchet decision problem using ${\cal O}(n+m)$ space and ${\cal O}(nm)$ time in the worst case.
Experiments on real and synthetic data show that our heuristic effectively prunes the search space and greatly reduces computations compared to baseline approaches.
Submitted 13 January, 2024; v1 submitted 19 March, 2022;
originally announced March 2022.
-
Exploring Sub-skeleton Trajectories for Interpretable Recognition of Sign Language
Authors:
Joachim Gudmundsson,
Martin P. Seybold,
John Pfeifer
Abstract:
Recent advances in tracking sensors and pose estimation software enable smart systems to use trajectories of skeleton joint locations for supervised learning. We study the problem of accurately recognizing sign language words, which is key to narrowing the communication gap between hard-of-hearing and hearing people.
Our method explores a geometric feature space that we call `sub-skeleton' aspects of movement. We assess similarity of feature space trajectories using natural, speed invariant distance measures, which enables clear and insightful nearest neighbor classification. The simplicity and scalability of our basic method allow for immediate application in different data domains with little to no parameter tuning.
We demonstrate the effectiveness of our basic method, and a boosted variation, with experiments on data from different application domains and tracking technologies. Surprisingly, our simple methods improve sign recognition over recent, state-of-the-art approaches.
Submitted 2 February, 2022;
originally announced February 2022.
-
Approximating Multiplicatively Weighted Voronoi Diagrams: Efficient Construction with Linear Size
Authors:
Joachim Gudmundsson,
Martin P. Seybold,
Sampson Wong
Abstract:
Given a set of $n$ sites from $\mathbb{R}^d$, each having some positive weight factor, the Multiplicatively Weighted Voronoi Diagram is a subdivision of space that associates each cell to the site whose weighted Euclidean distance is minimal for all points in the cell.
We give novel approximation algorithms that output a cube-based subdivision such that the weighted distance of a point with respect to the associated site is at most $(1+\varepsilon)$ times the minimum weighted distance, for any fixed parameter $\varepsilon \in (0,1)$. The diagram size is $O_d(n \log(1/\varepsilon)/\varepsilon^{d-1})$ and the construction time is within an $O_d(\log(n)/\varepsilon^{(d+5)/2})$-factor of the size bound. We also prove a matching lower bound for the size, showing that the proposed method is the first to achieve \emph{optimal size}, up to $Θ(1)^d$-factors. In particular, the obscure $\log(1/\varepsilon)$ factor is unavoidable. As a by-product, we obtain a factor $d^{O(d)}$ improvement in size for the unweighted case and $O(d \log(n) + d^2 \log(1/\varepsilon))$ point-location time in the subdivision, improving the known query bound by one $d$-factor.
The key ingredients of our approximation algorithms are the study of convex regions that we call cores, an adaptive refinement algorithm to obtain optimal size, and a novel notion of \emph{bisector coresets}, which may be of independent interest. In particular, we show that coresets with $O_d(1/\varepsilon^{(d+3)/2})$ worst-case size can be computed in near-linear time.
Submitted 18 March, 2024; v1 submitted 22 December, 2021;
originally announced December 2021.
-
Optimal Window Queries on Line Segments using the Trapezoidal Search DAG
Authors:
Milutin Brankovic,
Martin P. Seybold
Abstract:
We propose a new query application for the well-known Trapezoidal Search DAG (TSD) of a set of $n$~line segments in the plane, where queries are allowed to be {\em vertical line segments}.
We show that a simple Depth-First Search reports the $k$ trapezoids that are intersected by the query segment in $O(k+\log n)$ expected time, regardless of the spatial location of the query. This bound is optimal and matches known data structures with $O(n)$ size. In the important case of edges from a connected, planar graph, our simplistic approach yields an expected $O(n \log^*\!n)$ construction time, which improves on the construction time of known structures for vertical segment-queries. Also for connected input, a simple extension allows the TSD approach to directly answer axis-aligned window-queries in $O(k + \log n)$ expected time, where $k$ is the result size.
Submitted 13 January, 2024; v1 submitted 12 November, 2021;
originally announced November 2021.
-
A Tail Estimate with Exponential Decay for the Randomized Incremental Construction of Search Structures
Authors:
Joachim Gudmundsson,
Martin P. Seybold
Abstract:
The Randomized Incremental Construction (RIC) of search DAGs for point location in planar subdivisions, nearest-neighbor search in 2D points, and extreme point search in 3D convex hulls is well known to take ${\cal O}(n \log n)$ expected time for structures of ${\cal O}(n)$ expected size. Moreover, searching takes w.h.p. ${\cal O}(\log n)$ comparisons in the first and w.h.p. ${\cal O}(\log^2 n)$ comparisons in the latter two DAGs. However, the expected depth of the DAGs and high probability bounds for their size are unknown.
Using a novel analysis technique, we show that the three DAGs have w.h.p. i) a size of ${\cal O}(n)$, ii) a depth of ${\cal O}(\log n)$, and iii) a construction time of ${\cal O}(n \log n)$. One application of these new and improved results is a set of \emph{remarkably simple} Las Vegas verifiers to obtain search DAGs with optimal worst-case bounds. This positively answers the conjectured logarithmic search cost in the DAG of Delaunay triangulations [Guibas et al.; ICALP 1990] and a conjecture on the depth of the DAG of Trapezoidal subdivisions [Hemmer et al.; ESA 2012]. It also shows that history-based RIC circumvents a lower bound on runtime tail estimates of conflict-graph RICs [Sen; STACS 2019].
Submitted 18 July, 2021; v1 submitted 13 January, 2021;
originally announced January 2021.
-
A Practical Index Structure Supporting Fréchet Proximity Queries Among Trajectories
Authors:
Joachim Gudmundsson,
Michael Horton,
John Pfeifer,
Martin P. Seybold
Abstract:
We present a scalable approach for range and $k$ nearest neighbor queries under computationally expensive metrics, like the continuous Fréchet distance on trajectory data. Based on clustering for metric indexes, we obtain a dynamic tree structure whose size is linear in the number of trajectories, regardless of the trajectories' individual sizes or the spatial dimension, which allows one to exploit low `intrinsic dimensionality' of data sets for effective search space pruning.
Since the distance computation is expensive, generic metric indexing methods are rendered impractical. We present strategies that (i) improve on known upper and lower bound computations, (ii) build cluster trees without any or very few distance calls, and (iii) search using bounds for metric pruning, interval orderings for reduction, and randomized pivoting for reporting the final results.
We analyze the efficiency and effectiveness of our methods with extensive experiments on diverse synthetic and real-world data sets. The results show improvement over state-of-the-art methods for exact queries, and even further speed-ups are achieved for queries that may return approximate results. Surprisingly, the majority of exact nearest-neighbor queries on real data sets are answered without any distance computations.
Submitted 28 May, 2020;
originally announced May 2020.
-
A Simple Dynamization of Trapezoidal Point Location in Planar Subdivisions
Authors:
Milutin Brankovic,
Nikola Grujic,
André van Renssen,
Martin P. Seybold
Abstract:
We study how to dynamize the Trapezoidal Search Tree, a well-known randomized point location structure for planar subdivisions of kinetic line segments.
Our approach naturally extends incremental leaf-level insertions to recursive methods and allows adaptation for the online setting. Moreover, the dynamization carries over to the Trapezoidal Search DAG, offering a linear sized data structure with logarithmic point location costs as a by-product. On a set $S$ of non-crossing segments, each update performs expected ${\mathcal O}(\log^2|S|)$ operations.
We demonstrate the practicality of our method with an open-source implementation, based on the Computational Geometry Algorithms Library, and experiments on the update performance.
Submitted 6 December, 2019;
originally announced December 2019.
-
Estimating Flow Rates through Fracture Networks using Combinatorial Optimization
Authors:
A. Hobé,
D. Vogler,
M. P. Seybold,
A. Ebigbo,
R. R. Settgast,
M. O. Saar
Abstract:
To enable fast uncertainty quantification of fluid flow in a discrete fracture network (DFN), we present two approaches to quickly compute fluid flow in DFNs using combinatorial optimization algorithms. Specifically, the presented Hanan Shortest Path Maxflow (HSPM) and Intersection Shortest Path Maxflow (ISPM) methods translate DFN geometries and properties to a graph on which a max flow algorithm computes a combinatorial flow, from which an overall fluid flow rate is estimated using a shortest path decomposition of this flow. The two approaches are assessed by comparing their predictions with results from explicit numerical simulations of simple test cases as well as stochastic DFN realizations covering a range of fracture densities. Both methods have a high accuracy and very low computational cost, which can facilitate much-needed in-depth analyses of the propagation of uncertainty in fracture and fracture-network properties to fluid flow rates.
Submitted 25 January, 2018;
originally announced January 2018.
-
Rational Points on the Unit Sphere: Approximation Complexity and Practical Constructions
Authors:
Daniel Bahrdt,
Martin P. Seybold
Abstract:
Each non-zero point in $\mathbb{R}^d$ identifies a closest point $x$ on the unit sphere $\mathbb{S}^{d-1}$. We are interested in computing an $ε$-approximation $y \in \mathbb{Q}^d$ for $x$, that is exactly on $\mathbb{S}^{d-1}$ and has low bit size. We revise lower bounds on rational approximations and provide explicit, spherical instances.
We prove that floating-point numbers can only provide trivial solutions to the sphere equation in $\mathbb{R}^2$ and $\mathbb{R}^3$. Moreover, we show how to construct a rational point with denominators of at most $10(d-1)/ε^2$ for any given $ε \in \left(0,\tfrac 1 8\right]$, improving on a previous result. The method further benefits from algorithms for simultaneous Diophantine approximation.
Our open-source implementation and experiments demonstrate the practicality of our approach in the context of massive data sets geo-referenced by latitude and longitude values.
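One standard way to obtain rational points that lie exactly on the unit sphere is to round in the image of a stereographic projection and map back. The sketch below illustrates that idea under stated assumptions: the function name `rational_point_on_sphere` and the parameter `max_den` are hypothetical, rounding uses Python's `Fraction.limit_denominator`, and no claim is made that it reproduces the paper's construction or its denominator bound.

```python
import math
from fractions import Fraction


def rational_point_on_sphere(p, max_den=1000):
    """Return rational coordinates exactly on the unit sphere, close to p / |p|."""
    norm = math.sqrt(sum(c * c for c in p))
    if norm == 0:
        raise ValueError("p must be non-zero")
    x = [c / norm for c in p]
    # Project from the pole on the far side of x, so that 1 - x[-1] >= 1.
    flip = x[-1] > 0
    if flip:
        x[-1] = -x[-1]
    # Stereographic projection onto the hyperplane, followed by rational rounding.
    t = [Fraction(xi / (1 - x[-1])).limit_denominator(max_den) for xi in x[:-1]]
    s = sum((ti * ti for ti in t), Fraction(0))
    # The inverse projection maps rational t to a rational point on the sphere.
    y = [2 * ti / (s + 1) for ti in t] + [(s - 1) / (s + 1)]
    if flip:
        y[-1] = -y[-1]
    return y
```

Because all arithmetic after the rounding step is done with `Fraction`, the squared coordinates of the result sum to exactly 1, which floating-point coordinates cannot guarantee in general.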
Submitted 26 July, 2017;
originally announced July 2017.