-
Dynamic Dynamic Time Warping
Authors:
Karl Bringmann,
Nick Fischer,
Ivor van der Hoog,
Evangelos Kipouridis,
Tomasz Kociumaka,
Eva Rotenberg
Abstract:
The Dynamic Time Warping (DTW) distance is a popular similarity measure for polygonal curves (i.e., sequences of points). It finds many theoretical and practical applications, especially for temporal data, and is known to be a robust, outlier-insensitive alternative to the Fréchet distance. For static curves of at most $n$ points, the DTW distance can be computed in $O(n^2)$ time in constant dimension. This tightly matches a SETH-based lower bound, even for curves in $\mathbb{R}^1$.
In this work, we study dynamic algorithms for the DTW distance. Here, the goal is to design a data structure that can be efficiently updated to accommodate local changes to one or both curves, such as inserting or deleting vertices, and that reports the updated DTW distance after each operation. We give such a data structure with update and query time $O(n^{1.5} \log n)$, where $n$ is the maximum length of the curves.
As our main result, we prove that our data structure is conditionally optimal, up to subpolynomial factors. More precisely, we prove that, already for curves in $\mathbb{R}^1$, there is no dynamic algorithm to maintain the DTW distance with update and query time $O(n^{1.5 - δ})$ for any constant $δ > 0$, unless the Negative-$k$-Clique Hypothesis fails. In fact, we give matching upper and lower bounds for various trade-offs between update and query time, even in cases where the lengths of the curves differ.
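For orientation, the static quadratic-time computation that the abstract takes as its baseline can be sketched with the textbook dynamic program. This is a minimal illustration, not the dynamic data structure from the paper; it assumes curves in $\mathbb{R}^1$ with point distance $|a - b|$, and the function name `dtw_distance` is hypothetical.

```python
def dtw_distance(p, q):
    """Textbook O(n*m) DTW for two non-empty sequences of real numbers.

    Assumption: points live in R^1 and the ground distance is |a - b|.
    """
    n, m = len(p), len(q)
    inf = float("inf")
    # prev[j] holds the DTW cost of aligning p[:i-1] with q[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # aligning two empty prefixes costs nothing
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(p[i - 1] - q[j - 1])
            # Advance on p only, on q only, or on both curves.
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Recomputing this table from scratch after every insertion or deletion costs quadratic time per update, which is the naive bound that a dynamic data structure aims to beat.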
Submitted 13 November, 2023; v1 submitted 27 October, 2023;
originally announced October 2023.
-
Fitting Tree Metrics with Minimum Disagreements
Authors:
Evangelos Kipouridis
Abstract:
In the $L_0$ Fitting Tree Metrics problem, we are given all pairwise distances among the elements of a set $V$ and our output is a tree metric on $V$. The goal is to minimize the number of pairwise distance disagreements between the input and the output. We provide an $O(1)$ approximation for $L_0$ Fitting Tree Metrics, which is asymptotically optimal as the problem is APX-Hard.
For $p\ge 1$, solutions to the related $L_p$ Fitting Tree Metrics have typically used a reduction to $L_p$ Fitting Constrained Ultrametrics. Even though in FOCS '22 Cohen-Addad et al. solved $L_0$ Fitting (unconstrained) Ultrametrics within a constant approximation factor, their results did not extend to tree metrics.
We identify two possible reasons, and provide simple techniques to circumvent them. Our framework does not modify the algorithm from Cohen-Addad et al. It rather extends any $ρ$ approximation for $L_0$ Fitting Ultrametrics to a $6ρ$ approximation for $L_0$ Fitting Tree Metrics in a blackbox fashion.
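To make the $L_0$ objective concrete, the sketch below counts pairwise disagreements between an input distance function and a candidate output, and checks the standard three-point condition for ultrametrics, $d(x,z) \le \max(d(x,y), d(y,z))$. It only illustrates the problem statement, not the approximation algorithm; the helper names `make_dist`, `is_ultrametric`, and `l0_disagreements` are hypothetical.

```python
from itertools import combinations, permutations


def make_dist(table):
    """Build a symmetric distance function from a dict keyed by unordered pairs."""
    sym = {frozenset(pair): value for pair, value in table.items()}
    return lambda x, y: 0 if x == y else sym[frozenset((x, y))]


def is_ultrametric(points, d):
    """Three-point condition: d(x, z) <= max(d(x, y), d(y, z)) for all triples."""
    return all(d(x, z) <= max(d(x, y), d(y, z))
               for x, y, z in permutations(points, 3))


def l0_disagreements(points, d_in, d_out):
    """Number of pairs whose distance differs between input and output."""
    return sum(1 for x, y in combinations(points, 2) if d_in(x, y) != d_out(x, y))


points = ["a", "b", "c"]
d_in = make_dist({("a", "b"): 1, ("a", "c"): 3, ("b", "c"): 2})
d_out = make_dist({("a", "b"): 1, ("a", "c"): 2, ("b", "c"): 2})
print(is_ultrametric(points, d_in), is_ultrametric(points, d_out),
      l0_disagreements(points, d_in, d_out))
```

In this toy instance the input violates the three-point condition, and changing a single distance yields an ultrametric, so the $L_0$ cost of that output is 1.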
Submitted 29 July, 2023;
originally announced July 2023.
-
A Simple Algorithm for Multiple-Source Shortest Paths in Planar Digraphs
Authors:
Debarati Das,
Evangelos Kipouridis,
Maximilian Probst Gutenberg,
Christian Wulff-Nilsen
Abstract:
Given an $n$-vertex planar embedded digraph $G$ with non-negative edge weights and a face $f$ of $G$, Klein presented a data structure with $O(n\log n)$ space and preprocessing time which can answer any query $(u,v)$ for the shortest path distance in $G$ from $u$ to $v$ or from $v$ to $u$ in $O(\log n)$ time, provided $u$ is on $f$. This data structure is a key tool in a number of state-of-the-art algorithms and data structures for planar graphs.
Klein's data structure relies on dynamic trees and the persistence technique as well as a highly non-trivial interaction between primal shortest path trees and their duals. The construction of our data structure follows a completely different and in our opinion very simple divide-and-conquer approach that solely relies on Single-Source Shortest Path computations and contractions in the primal graph. Our space and preprocessing time bound is $O(n\log |f|)$ and query time is $O(\log |f|)$ which is an improvement over Klein's data structure when $f$ has small size.
Submitted 28 July, 2023; v1 submitted 14 November, 2021;
originally announced November 2021.
-
Fitting Distances by Tree Metrics Minimizing the Total Error within a Constant Factor
Authors:
Vincent Cohen-Addad,
Debarati Das,
Evangelos Kipouridis,
Nikos Parotsidis,
Mikkel Thorup
Abstract:
We consider the numerical taxonomy problem of fitting a positive distance function ${D:{S\choose 2}\rightarrow \mathbb R_{>0}}$ by a tree metric. We want a tree $T$ with positive edge weights and including $S$ among the vertices so that their distances in $T$ match those in $D$. A nice application is in evolutionary biology where the tree $T$ aims to approximate the branching process leading to the observed distances in $D$ [Cavalli-Sforza and Edwards 1967]. We consider the total error, that is the sum of distance errors over all pairs of points. We present a deterministic polynomial time algorithm minimizing the total error within a constant factor. We can do this both for general trees, and for the special case of ultrametrics with a root having the same distance to all vertices in $S$.
The problems are APX-hard, so a constant factor is the best we can hope for in polynomial time. The best previous approximation factor was $O((\log n)(\log \log n))$ by Ailon and Charikar [2005] who wrote "Determining whether an $O(1)$ approximation can be obtained is a fascinating question".
Submitted 11 March, 2022; v1 submitted 6 October, 2021;
originally announced October 2021.
-
Threshold-based Network Structural Dynamics
Authors:
Evangelos Kipouridis,
Paul G. Spirakis,
Kostas Tsichlas
Abstract:
Interest in dynamic processes on networks has been rising steadily in recent years. In this paper, we consider the $(α,β)$-Thresholded Network Dynamics ($(α,β)$-Dynamics), where $α\leq β$, in which only structural dynamics (dynamics of the network) are allowed, guided by local thresholding rules executed in each node. In particular, in each discrete round $t$, each pair of nodes $u$ and $v$ that is allowed to communicate by the scheduler computes a value $\mathcal{E}(u,v)$ (the potential of the pair) as a function of the local structure of the network at round $t$ around the two nodes. If $\mathcal{E}(u,v) < α$ then the link (if it exists) between $u$ and $v$ is removed; if $α\leq \mathcal{E}(u,v) < β$ then an existing link between $u$ and $v$ is maintained; if $β\leq \mathcal{E}(u,v)$ then a link between $u$ and $v$ is established if not already present.
The microscopic structure of $(α,β)$-Dynamics appears to be simple, so that we are able to rigorously argue about it, but still flexible, so that we are able to design meaningful microscopic local rules that give rise to interesting macroscopic behaviors. Our goals are the following: a) to investigate the properties of the $(α,β)$-Thresholded Network Dynamics and b) to show that $(α,β)$-Dynamics is expressive enough to solve complex problems on networks.
Our contribution in these directions is twofold. First, we rigorously establish the expressiveness of $(α,β)$-Dynamics, both by designing a simple protocol that provably computes the $k$-core of the network and by showing that $(α,β)$-Dynamics is in fact Turing-Complete. Second, and most importantly, we construct general tools for proving stabilization that work for a subclass of $(α,β)$-Dynamics and prove speed of convergence in a restricted setting.
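The thresholding rule described in the abstract can be sketched as a single round update. This is only an illustration under stated assumptions: all scheduled pairs are evaluated synchronously on the graph as it stands at round $t$, and the potential used in the example (number of common neighbours) is a hypothetical choice, not one taken from the paper. The names `apply_round` and `common_neighbours` are likewise illustrative.

```python
def apply_round(edges, pairs, potential, alpha, beta):
    """One round of (alpha, beta)-thresholding over the scheduled pairs.

    edges: set of frozenset({u, v}) links at round t.
    Assumption: potentials are all computed on the round-t graph (synchronous).
    """
    new_edges = set(edges)
    for u, v in pairs:
        e = potential(edges, u, v)
        link = frozenset((u, v))
        if e < alpha:
            new_edges.discard(link)   # remove the link if it exists
        elif e >= beta:
            new_edges.add(link)       # establish the link if absent
        # alpha <= e < beta: leave the pair unchanged
    return new_edges


def common_neighbours(edges, u, v):
    """Hypothetical potential: number of neighbours shared by u and v."""
    def nbrs(x):
        return {y for link in edges if x in link for y in link if y != x}
    return len(nbrs(u) & nbrs(v))


path = {frozenset((1, 2)), frozenset((2, 3))}
print(apply_round(path, [(1, 2), (2, 3), (1, 3)], common_neighbours, 1, 1))
```

On the path 1-2-3 with $α = β = 1$, the two existing links have potential 0 and are removed, while the pair (1, 3) has potential 1 and becomes linked.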
Submitted 22 June, 2021; v1 submitted 8 March, 2021;
originally announced March 2021.
-
No Repetition: Fast Streaming with Highly Concentrated Hashing
Authors:
Anders Aamand,
Debarati Das,
Evangelos Kipouridis,
Jakob B. T. Knudsen,
Peter M. R. Rasmussen,
Mikkel Thorup
Abstract:
To get estimators that work within a certain error bound with high probability, a common strategy is to design one that works with constant probability, and then boost the probability using independent repetitions. Important examples of this approach are small-space algorithms for estimating the number of distinct elements in a stream, or estimating the set similarity between large sets. Using standard strongly universal hashing to process each element, we get a sketch-based estimator where the probability of too large an error is, say, 1/4. By performing $r$ independent repetitions and taking the median of the estimators, the error probability falls exponentially in $r$. However, running $r$ independent experiments increases the processing time by a factor $r$.
Here we make the point that if we have a hash function with strong concentration bounds, then we get the same high probability bounds without any need for repetitions. Instead of $r$ independent sketches, we have a single sketch that is $r$ times bigger, so the total space is the same. However, we only apply a single hash function, so we save a factor $r$ in time, and the overall algorithms just get simpler.
Fast practical hash functions with strong concentration bounds were recently proposed by Aamand et al. (to appear in STOC 2020). Using their hashing schemes, the algorithms thus become very fast and practical, suitable for online processing of high volume data streams.
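The repetition-based baseline that the abstract argues against can be sketched in a few lines: run a constant-probability estimator $r$ times and report the median. This is a generic illustration of the median trick, not the single-sketch scheme from the paper; the function name `median_boost` and the stand-in estimator are hypothetical.

```python
import statistics


def median_boost(estimator, r):
    """Median of r independent runs of a constant-probability estimator.

    Each call to estimator() is assumed to be an independent experiment,
    so the processing time grows by a factor of r.
    """
    return statistics.median(estimator() for _ in range(r))


# Stand-in for a noisy estimator: mostly close to 10, with one large outlier.
runs = iter([10, 1000, 11, 9, 12])
print(median_boost(lambda: next(runs), 5))
```

The outlier does not affect the reported value, which is why the median of repetitions drives down the failure probability; the cost is the factor $r$ in processing time that the abstract highlights.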
Submitted 2 April, 2020;
originally announced April 2020.
-
On the Convergence of Network Systems
Authors:
Evangelos Kipouridis,
Kostas Tsichlas
Abstract:
The apparent disconnection between the microscopic and the macroscopic is a major issue in the understanding of complex systems. To this end, we study the convergence of repeatedly applying local rules on a network, and touch on the expressive power of this model. We look at network systems and study their behavior when different types of local rules are applied to them. For a very general class of local rules, we prove convergence and provide a certain member of this class that, when applied to a graph, efficiently computes its k-core and its (k-1)-crust, giving hints on the expressive power of such a model. Furthermore, we provide guarantees on the speed of convergence for an important subclass of the aforementioned class. We also study more general rules, and show that they do not converge. Our counterexamples also resolve an open question of (Zhang, Wang, Wang, Zhou, KDD 2009) concerning whether a certain process converges. Finally, we show the universality of our network system by providing a local rule under which it is Turing-Complete.
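For readers unfamiliar with the k-core mentioned above, it is the maximal subgraph in which every vertex has degree at least $k$. The sketch below uses the standard centralized peeling procedure, which repeatedly deletes vertices of degree below $k$; it is not the local rule from the paper, and the function name `k_core` is hypothetical.

```python
def k_core(adj, k):
    """Vertices of the k-core of an undirected graph, by standard peeling.

    adj: dict mapping each vertex to the set of its neighbours.
    """
    alive = set(adj)
    deg = {v: len(adj[v]) for v in adj}
    stack = [v for v in adj if deg[v] < k]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # already peeled via an earlier stack entry
        alive.remove(v)
        for w in adj[v]:
            if w in alive:
                deg[w] -= 1
                if deg[w] < k:
                    stack.append(w)
    return alive


# Triangle 1-2-3 with a pendant vertex 4 attached to 3.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(k_core(graph, 2))
```

Here the pendant vertex is peeled away and the triangle survives as the 2-core, while the 3-core of the same graph is empty.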
Submitted 8 February, 2020; v1 submitted 11 February, 2019;
originally announced February 2019.
-
Longest Common Subsequence on Weighted Sequences
Authors:
Evangelos Kipouridis,
Kostas Tsichlas
Abstract:
We consider the general problem of the Longest Common Subsequence (LCS) on weighted sequences. Weighted sequences are an extension of classical strings, where in each position every letter of the alphabet may occur with some probability. Previous results presented a PTAS and noticed that no FPTAS is possible unless P=NP. In this paper we essentially close the gap between upper and lower bounds by improving both. First of all, we provide an EPTAS for bounded alphabets (which is the most natural case), and prove that there does not exist any EPTAS for unbounded alphabets unless FPT=W[1]. Furthermore, under the Exponential Time Hypothesis, we provide a lower bound which shows that no significantly better PTAS can exist for unbounded alphabets. As a side note, we prove that it is sufficient to work with only one threshold in the general variant of the problem.
Submitted 19 July, 2020; v1 submitted 13 January, 2019;
originally announced January 2019.