-
A QPTAS for Facility Location on Unit Disk graphs
Authors:
Zachary Friggstad,
Mohsen Rezapour,
Mohammad R. Salavatipour,
Hao Sun
Abstract:
We study the classic \textsc{(Uncapacitated) Facility Location} problem on Unit Disk Graphs (UDGs). For a given point set $P$ in the plane, the unit disk graph UDG(P) on $P$ has vertex set $P$ and an edge between two distinct points $p, q \in P$ if and only if their Euclidean distance $|pq|$ is at most 1. The weight of the edge $pq$ is equal to their distance $|pq|$. An instance of \textsc{Facility Location} on UDG(P) consists of a set $C\subseteq P$ of clients and a set $F\subseteq P$ of facilities, each having an opening cost $f_i$. The goal is to pick a subset $F'\subseteq F$ to open while minimizing $\sum_{i\in F'} f_i + \sum_{v\in C} d(v,F')$, where $d(v,F')$ is the distance of $v$ to the nearest facility in $F'$ through UDG(P).
In this paper, we present the first Quasi-Polynomial Time Approximation Schemes (QPTAS) for the problem. While approximation schemes are well-established for facility location problems on sparse geometric graphs (such as planar graphs), there is a lack of such results for dense graphs. Specifically, prior to this study, to the best of our knowledge, there was no approximation scheme for any facility location problem on UDGs in the general setting.
Submitted 14 May, 2024;
originally announced May 2024.
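The objective in the abstract above can be evaluated directly for a given choice of open facilities. The following is a minimal sketch, not the paper's QPTAS: it only computes $\sum_{i\in F'} f_i + \sum_{v\in C} d(v,F')$ by building UDG(P) with a quadratic scan and running a multi-source Dijkstra from $F'$. The function names and the toy instance are illustrative assumptions.

```python
import heapq
import math


def udg_adjacency(points, radius=1.0):
    """Adjacency list of the unit disk graph: edge pq iff |pq| <= radius."""
    adj = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= radius:
                adj[i].append((j, d))
                adj[j].append((i, d))
    return adj


def distance_to_open(adj, open_facilities):
    """Multi-source Dijkstra: shortest-path distance from each vertex to F'."""
    dist = {v: math.inf for v in adj}
    heap = []
    for f in open_facilities:
        dist[f] = 0.0
        heap.append((0.0, f))
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def facility_location_cost(points, clients, opening_cost, open_facilities):
    """Opening costs of F' plus the sum of client distances to F' in UDG(P)."""
    dist = distance_to_open(udg_adjacency(points), open_facilities)
    return (sum(opening_cost[i] for i in open_facilities)
            + sum(dist[v] for v in clients))


# Hypothetical instance: four collinear points at unit spacing.
points = [(0, 0), (1, 0), (2, 0), (3, 0)]
clients = [0, 1, 2, 3]
opening_cost = {0: 2.0, 3: 2.0}
print(facility_location_cost(points, clients, opening_cost, [0]))
print(facility_location_cost(points, clients, opening_cost, [0, 3]))
```

On this toy instance, opening both facilities is cheaper than opening one, because the saved connection cost outweighs the extra opening cost.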
-
Approximation Schemes for Orienteering and Deadline TSP in Doubling Metrics
Authors:
Kinter Ren,
Mohammad R. Salavatipour
Abstract:
In this paper we look at $k$-stroll, point-to-point orienteering, as well as the deadline TSP problem on graphs with bounded doubling dimension and bounded treewidth and present approximation schemes for them. We are given a weighted graph $G=(V,E)$, a start node $s\in V$, distances $d:E\rightarrow \mathbb{Q}^+$, and an integer $k$. In the $k$-stroll problem the goal is to find a path starting at $s$ of minimum length that visits at least $k$ vertices. The dual problem to $k$-stroll is rooted orienteering, in which instead of $k$ we are given a budget $B$ and the goal is to find a walk of length at most $B$ starting at $s$ that visits as many vertices as possible. In P2P orienteering we are given start and end nodes $s,t$ for the path. In deadline TSP we are given a deadline $D(v)$ for each $v\in V$ and the goal is to find a walk starting at $s$ that visits as many vertices as possible before their deadlines. The best known approximation for rooted or P2P orienteering is a $(2+ε)$-approximation [12], and for deadline TSP it is an $O(\log n)$-approximation [3]. There is no known approximation scheme for deadline TSP for any metric (not even trees). Our main result is the first approximation scheme for deadline TSP on metrics with bounded doubling dimension. To do so we first show that if $G$ is a metric with doubling dimension $κ$ and aspect ratio $Δ$, there is a $(1+ε)$-approximation that runs in time $n^{O\left(\left(\logΔ/ε\right)^{2κ+1}\right)}$. We then extend these to obtain an approximation scheme for deadline TSP when the distances and deadlines are integers, which runs in time $n^{O\left(\left(\log Δ/ε\right)^{2κ+2}\right)}$. For graphs with treewidth $ω$ we show how to solve $k$-stroll and P2P orienteering exactly in polynomial time and give a $(1+ε)$-approximation for deadline TSP in time $n^{O((ω\logΔ/ε)^2)}$.
Submitted 1 May, 2024;
originally announced May 2024.
-
Improved Approximations for CVRP with Unsplittable Demands
Authors:
Zachary Friggstad,
Ramin Mousavi,
Mirmahdi Rahgoshay,
Mohammad R. Salavatipour
Abstract:
In this paper, we present improved approximation algorithms for the (unsplittable) Capacitated Vehicle Routing Problem (CVRP) in general metrics. In CVRP, introduced by Dantzig and Ramser (1959), we are given a set of points (clients) $V$ together with a depot $r$ in a metric space, with each $v\in V$ having a demand $d_v>0$, and a vehicle of bounded capacity $Q$. The goal is to find a minimum cost collection of tours for the vehicle, each starting and ending at the depot, such that each client is visited at least once and the total demand of the clients in each tour is at most $Q$. In the unsplittable variant we study, the demand of a node must be served entirely by one tour. We present two approximation algorithms for unsplittable CVRP: a combinatorial $(α+1.75)$-approximation, where $α$ is the approximation factor for the Traveling Salesman Problem, and an approximation algorithm based on LP rounding with approximation guarantee $α+1+\ln(2) + δ\approx 3.194 + δ$ in $n^{O(1/δ)}$ time. Both approximations can further be improved by a small amount when combined with recent work by Blauth, Traub, and Vygen (2021), who obtained an $(α+ 2\cdot (1 -ε))$-approximation for unsplittable CVRP for some constant $ε$ depending on $α$ ($ε> 1/3000$ for $α= 1.5$).
Submitted 15 November, 2021;
originally announced November 2021.
-
Hierarchical Clustering: New Bounds and Objective
Authors:
Mirmahdi Rahgoshay,
Mohammad R. Salavatipour
Abstract:
Hierarchical Clustering has been studied and used extensively as a method for analysis of data. More recently, Dasgupta [2016] defined a precise objective function. Given a set of $n$ data points with a weight function $w_{i,j}$ for each pair of items $i$ and $j$ denoting their similarity/dissimilarity, the goal is to build a recursive (tree-like) partitioning of the data points (items) into successively smaller clusters. He defined a cost function for a tree $T$ to be $Cost(T) = \sum_{i,j \in [n]} \big(w_{i,j} \times |T_{i,j}| \big)$ where $T_{i,j}$ is the subtree rooted at the least common ancestor of $i$ and $j$ and presented the first approximation algorithm for such clustering. Then Moseley and Wang [2017] considered the dual of Dasgupta's objective function for similarity-based weights and showed that both random partitioning and average linkage have approximation ratio $1/3$, which has been improved in a series of works to $0.585$ [Alon et al. 2020]. Later Cohen-Addad et al. [2019] considered the same objective function as Dasgupta's but for dissimilarity-based metrics, called $Rev(T)$. It is shown that both random partitioning and average linkage have ratio $2/3$, which has been only slightly improved to $0.667078$ [Charikar et al. SODA2020]. Our first main result is to consider $Rev(T)$ and present a more delicate algorithm and careful analysis that achieves approximation $0.71604$. We also introduce a new objective function for dissimilarity-based clustering. For any tree $T$, let $H_{i,j}$ be the number of common ancestors of $i$ and $j$. Intuitively, items that are similar are expected to remain within the same cluster as deep as possible. So, for dissimilarity-based metrics, we suggest the cost of each tree $T$, which we want to minimize, to be $Cost_H(T) = \sum_{i,j \in [n]} \big(w_{i,j} \times H_{i,j} \big)$. We present a $1.3977$-approximation for this objective.
Submitted 12 November, 2021;
originally announced November 2021.
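The two objectives in the abstract above, $Cost(T)$ and $Cost_H(T)$, can be computed on a small tree to see how they differ. This is a minimal sketch under stated assumptions, not the paper's algorithm: trees are nested tuples with integer leaves, sums run over unordered pairs $i<j$, and $H_{i,j}$ counts the internal nodes on the path from the root to the least common ancestor, root included. The function names and the example weights are hypothetical.

```python
from itertools import combinations


def leaves(tree):
    """All leaf labels of a nested-tuple tree (leaves are ints)."""
    if isinstance(tree, int):
        return [tree]
    return [x for child in tree for x in leaves(child)]


def pair_stats(tree, depth=1, out=None):
    """Map each leaf pair (i, j), i < j, to (|T_ij|, H_ij).

    |T_ij| is the number of leaves under the least common ancestor;
    H_ij is the number of common ancestors (assumed: root counts as 1).
    """
    if out is None:
        out = {}
    if isinstance(tree, int):
        return out
    child_leaves = [leaves(c) for c in tree]
    size = sum(len(c) for c in child_leaves)
    # Pairs split at this node have this node as their LCA.
    for a, b in combinations(child_leaves, 2):
        for i in a:
            for j in b:
                out[(min(i, j), max(i, j))] = (size, depth)
    for c in tree:
        pair_stats(c, depth + 1, out)
    return out


def dasgupta_cost(tree, w):
    """Cost(T) = sum of w_ij * |T_ij| over unordered pairs."""
    return sum(w[p] * size for p, (size, _) in pair_stats(tree).items())


def cost_h(tree, w):
    """Cost_H(T) = sum of w_ij * H_ij over unordered pairs."""
    return sum(w[p] * h for p, (_, h) in pair_stats(tree).items())


# Hypothetical instance: four items, balanced binary tree.
tree = ((0, 1), (2, 3))
w = {(0, 1): 3, (2, 3): 2, (0, 2): 1, (0, 3): 1, (1, 2): 1, (1, 3): 1}
print(dasgupta_cost(tree, w), cost_h(tree, w))
```

Pairs separated at the root contribute the full leaf count to $Cost(T)$ but only one common ancestor to $Cost_H(T)$, which is why the two objectives rank trees differently.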
-
Approximation Schemes for Capacitated Vehicle Routing on Graphs of Bounded Treewidth, Bounded Doubling, or Highway Dimension
Authors:
Aditya Jayaprakash,
Mohammad R. Salavatipour
Abstract:
In this paper, we present Approximation Schemes for the Capacitated Vehicle Routing Problem (CVRP) on several classes of graphs. In CVRP, introduced by Dantzig and Ramser (1959), we are given a graph $G=(V,E)$ with metric edge costs, a depot $r\in V$, and a vehicle of bounded capacity $Q$. The goal is to find a minimum cost collection of tours for the vehicle, each returning to the depot and visiting at most $Q$ nodes, such that they cover all the nodes. This generalizes classic TSP and has been studied extensively. In the more general setting, each node $v$ has a demand $d_v$ and the total demand of each tour must be no more than $Q$. Either the demand of each node must be served by one tour (unsplittable) or can be served by multiple tours (splittable). The best known approximation algorithm for general graphs has ratio $α+2(1-ε)$ (for the unsplittable) and $α+1-ε$ (for the splittable) for some fixed $ε>\frac{1}{3000}$, where $α$ is the best approximation for TSP. Even for the case of trees, the best approximation ratio is $4/3$ by Becker (2018) and it has been an open question if there is an approximation scheme for this simple class of graphs. Das and Mathieu (2015) presented an approximation scheme with time $n^{\log^{O(1/ε)}n}$ for the Euclidean plane $\mathbb{R}^2$. No other approximation scheme is known for any other class of metrics (without further restrictions on $Q$). In this paper, we make significant progress on this classic problem by presenting Quasi-Polynomial Time Approximation Schemes (QPTAS) for graphs of bounded treewidth, graphs of bounded highway dimension, and graphs of bounded doubling dimension. For comparison, our result implies an approximation scheme for the Euclidean plane with run time $n^{O(\log^{10}n/ε^{9})}$.
Submitted 28 June, 2021;
originally announced June 2021.
-
Approximations for Throughput Maximization
Authors:
Dylan Hyatt-Denesik,
Mirmahdi Rahgoshay,
Mohammad R. Salavatipour
Abstract:
In this paper we study the classical problem of throughput maximization. In this problem we have a collection $J$ of $n$ jobs, each having a release time $r_j$, deadline $d_j$, and processing time $p_j$. They have to be scheduled non-preemptively on $m$ identical parallel machines. The goal is to find a schedule which maximizes the number of jobs scheduled entirely in their $[r_j,d_j]$ window. This problem has been studied extensively (even for the case of $m=1$). Several special cases of the problem remain open. Bar-Noy et al. [STOC1999] presented an algorithm with ratio $1-1/(1+1/m)^m$ for $m$ machines, which approaches $1-1/e$ as $m$ increases. For $m=1$, Chuzhoy-Ostrovsky-Rabani [FOCS2001] presented an algorithm with approximation ratio $1-\frac{1}{e}-\varepsilon$ (for any $\varepsilon>0$). Recently Im-Li-Moseley [IPCO2017] presented an algorithm with ratio $1-1/e-\varepsilon_0$ for some absolute constant $\varepsilon_0>0$ for any fixed $m$. They also presented an algorithm with ratio $1-O(\sqrt{\log m/m})-\varepsilon$ for general $m$, which approaches 1 as $m$ grows. The approximability of the problem for $m=O(1)$ remains a major open question. Even for the case of $m=1$ and $c=O(1)$ distinct processing times the problem is open (Sgall [ESA2012]). In this paper we study the case of $m=O(1)$ and show that if there are $c$ distinct processing times, i.e. the $p_j$'s come from a set of size $c$, then there is a $(1-\varepsilon)$-approximation that runs in time $O(n^{mc^7\varepsilon^{-6}}\log T)$, where $T$ is the largest deadline. Therefore, for constant $m$ and constant $c$ this yields a PTAS. Our algorithm is based on proving structural properties of a near-optimum solution that allow one to use dynamic programming with pruning.
Submitted 12 February, 2020; v1 submitted 27 January, 2020;
originally announced January 2020.
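The ratio $1-1/(1+1/m)^m$ quoted in the abstract above for Bar-Noy et al. is easy to tabulate, which makes the claimed convergence to $1-1/e$ concrete. This is only a numerical illustration of that formula; the function name is an assumption.

```python
import math


def bar_noy_ratio(m):
    """Approximation ratio 1 - 1/(1 + 1/m)^m for m identical machines."""
    return 1.0 - 1.0 / (1.0 + 1.0 / m) ** m


limit = 1.0 - 1.0 / math.e  # value the ratio approaches as m grows

for m in (1, 2, 5, 10, 100, 1000):
    print(f"m={m:5d}  ratio={bar_noy_ratio(m):.6f}  gap to limit={limit - bar_noy_ratio(m):.6f}")
```

The ratio starts at $1/2$ for a single machine and increases with $m$, staying strictly below $1-1/e \approx 0.632$.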
-
Exact Algorithms and Lower Bounds for Stable Instances of Euclidean k-Means
Authors:
Zachary Friggstad,
Kamyar Khodamoradi,
Mohammad R. Salavatipour
Abstract:
We investigate the complexity of solving stable or perturbation-resilient instances of $k$-Means and $k$-Median clustering in fixed dimension Euclidean metrics (more generally doubling metrics). The notion of stable (perturbation resilient) instances was introduced by Bilu and Linial [2010] and Awasthi et al. [2012]. In our context we say a $k$-Means instance is $α$-stable if there is a unique OPT which remains optimum if distances are (non-uniformly) stretched by a factor of at most $α$. Stable clustering instances have been studied to explain why heuristics such as Lloyd's algorithm perform well in practice. In this work we show that for any fixed $ε>0$, $(1+ε)$-stable instances of $k$-Means in doubling metrics can be solved in polynomial time. More precisely, we show that a natural multiswap local search algorithm finds OPT for $(1+ε)$-stable instances of $k$-Means and $k$-Median in a polynomial number of iterations. We complement this result by showing that, under a new PCP theorem, this is essentially tight: when the dimension $d$ is part of the input, there is a fixed $ε_0>0$ such that there is not even a PTAS for $(1+ε_0)$-stable $k$-Means in $\mathbb{R}^d$ unless NP=RP. To do this, we consider a robust property of CSPs; call an instance stable if there is a unique optimum solution $x^*$ and for any other solution $x'$, the number of unsatisfied clauses is proportional to the Hamming distance between $x^*$ and $x'$. Dinur et al. have already shown that stable QSAT is hard to approximate for some constant Q; our hypothesis is simply that stable QSAT with bounded variable occurrence is also hard. Given this hypothesis we consider "stability-preserving" reductions to prove our hardness for stable $k$-Means. Such reductions seem to be more fragile than standard L-reductions and may be of further use to demonstrate that other stable optimization problems are hard.
Submitted 30 January, 2024; v1 submitted 14 July, 2018;
originally announced July 2018.
-
Approximation Schemes for Clustering with Outliers
Authors:
Zachary Friggstad,
Kamyar Khodamoradi,
Mohsen Rezapour,
Mohammad R. Salavatipour
Abstract:
Clustering problems are well-studied in a variety of fields such as data science, operations research, and computer science. Such problems include variants of centre location problems, $k$-median, and $k$-means to name a few. In some cases, not all data points need to be clustered; some may be discarded for various reasons.
We study clustering problems with outliers. More specifically, we look at Uncapacitated Facility Location (UFL), $k$-Median, and $k$-Means. In UFL with outliers, we have to open some centres, discard up to $z$ points of $\cal X$ and assign every other point to the nearest open centre, minimizing the total assignment cost plus centre opening costs. In $k$-Median and $k$-Means, we have to open up to $k$ centres but there are no opening costs. In $k$-Means, the cost of assigning $j$ to $i$ is $δ^2(j,i)$. We present several results. Our main focus is on cases where $δ$ is a doubling metric or is the shortest-path metric of a graph from a minor-closed family of graphs. For uniform-cost UFL with outliers on such metrics we show that a simple multiswap local search heuristic yields a PTAS. With a bit more work, we extend this to bicriteria approximations for the $k$-Median and $k$-Means problems in the same metrics where, for any constant $ε> 0$, we can find a solution using $(1+ε)k$ centres whose cost is at most a $(1+ε)$-factor of the optimum and uses at most $z$ outliers. We also show that natural local search heuristics that do not violate the number of clusters and outliers for $k$-Median (or $k$-Means) will have unbounded gap even in Euclidean metrics. Furthermore, we show how our analysis can be extended to general metrics for $k$-Means with outliers to obtain a $(25+ε,1+ε)$ bicriteria.
Submitted 13 July, 2017;
originally announced July 2017.
-
Local Search Yields a PTAS for k-Means in Doubling Metrics
Authors:
Zachary Friggstad,
Mohsen Rezapour,
Mohammad R. Salavatipour
Abstract:
The most well known and ubiquitous clustering problem encountered in nearly every branch of science is undoubtedly $k$-means: given a set of data points and a parameter $k$, select $k$ centres and partition the data points into $k$ clusters around these centres so that the sum of squares of distances of the points to their cluster centre is minimized. Typically these data points lie in $\mathbb{R}^d$ for some $d\geq 2$.
$k$-means and the first algorithms for it were introduced in the 1950's. Since then, hundreds of papers have studied this problem and many algorithms have been proposed for it. The most commonly used algorithm is known as Lloyd-Forgy, which is also referred to as "the" $k$-means algorithm, and various extensions of it often work very well in practice. However, they may produce solutions whose cost is arbitrarily large compared to the optimum solution. Kanungo et al. [2004] analyzed a simple local search heuristic to get a polynomial-time algorithm with approximation ratio $9+ε$ for any fixed $ε>0$ for $k$-means in Euclidean space.
Finding an algorithm with a better approximation guarantee has remained one of the biggest open questions in this area, in particular whether one can get a true PTAS for fixed dimension Euclidean space. We settle this problem by showing that a simple local search algorithm provides a PTAS for $k$-means in $\mathbb{R}^d$ for any fixed $d$. More precisely, for any error parameter $ε>0$, the local search algorithm that considers swaps of up to $ρ=d^{O(d)}\cdotε^{-O(d/ε)}$ centres at a time finds a solution using exactly $k$ centres whose cost is at most a $(1+ε)$-factor greater than the optimum.
Finally, we provide the first demonstration that local search yields a PTAS for the uncapacitated facility location problem and $k$-median with non-uniform opening costs in doubling metrics.
Submitted 9 January, 2017; v1 submitted 29 March, 2016;
originally announced March 2016.
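The local search described in the abstract above swaps up to $ρ$ centres at a time and keeps a swap when the cost drops. The sketch below illustrates only the simplest case and makes several assumptions that are not from the paper: $ρ=1$ (single swaps, for which the PTAS guarantee is not claimed), candidate centres restricted to the input points, an arbitrary initial solution, and a small relative-improvement threshold to force termination. All names are hypothetical.

```python
import math


def kmeans_cost(points, centres):
    """Sum of squared distances from each point to its nearest centre."""
    return sum(min(math.dist(p, c) ** 2 for c in centres) for p in points)


def single_swap_local_search(points, k, rel_improvement=1e-9):
    """Swap one centre for one candidate while the cost strictly improves.

    Assumptions: candidates are the input points, and the first k
    points serve as the starting solution.
    """
    candidates = list(points)
    centres = candidates[:k]
    best = kmeans_cost(points, centres)
    improved = True
    while improved:
        improved = False
        for out_idx in range(k):
            for c in candidates:
                if c in centres:
                    continue
                trial = centres[:out_idx] + [c] + centres[out_idx + 1:]
                cost = kmeans_cost(points, trial)
                if cost < best * (1.0 - rel_improvement):
                    centres, best = trial, cost
                    improved = True
                    break
            if improved:
                break  # restart the scan from the new solution
    return centres, best


# Hypothetical instance: two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
centres, cost = single_swap_local_search(points, k=2)
print(centres, cost)
```

Starting from both centres in the left pair, a single swap moves one centre to the right pair and the search then stops at a local optimum with one centre per pair.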
-
Asymmetric Traveling Salesman Path and Directed Latency Problems
Authors:
Zachary Friggstad,
Mohammad R. Salavatipour,
Zoya Svitkina
Abstract:
We study integrality gaps and approximability of two closely related problems on directed graphs. Given a set V of n nodes in an underlying asymmetric metric and two specified nodes s and t, both problems ask to find an s-t path visiting all other nodes. In the asymmetric traveling salesman path problem (ATSPP), the objective is to minimize the total cost of this path. In the directed latency problem, the objective is to minimize the sum of distances on this path from s to each node. Both of these problems are NP-hard. The best known approximation algorithms for ATSPP had ratio O(log n) until the very recent result that improves it to O(log n/ log log n). However, only a bound of O(sqrt(n)) for the integrality gap of its linear programming relaxation has been known. For directed latency, the best previously known approximation algorithm has a guarantee of O(n^(1/2+eps)), for any constant eps > 0. We present a new algorithm for the ATSPP problem that has an approximation ratio of O(log n), but whose analysis also bounds the integrality gap of the standard LP relaxation of ATSPP by the same factor. This solves an open problem posed by Chekuri and Pal [2007]. We then pursue a deeper study of this linear program and its variations, which leads to an algorithm for the k-person ATSPP (where k s-t paths of minimum total length are sought) and an O(log n)-approximation for the directed latency problem.
Submitted 1 June, 2010; v1 submitted 3 July, 2009;
originally announced July 2009.
-
A Weakly-Robust PTAS for Minimum Clique Partition in Unit Disk Graphs
Authors:
Imran A. Pirwani,
Mohammad R. Salavatipour
Abstract:
We consider the problem of partitioning the set of vertices of a given unit disk graph (UDG) into a minimum number of cliques. The problem is NP-hard and various constant factor approximations are known, with the current best ratio of 3. Our main result is a {\em weakly robust} polynomial time approximation scheme (PTAS) for UDGs expressed with edge-lengths: it either (i) computes a clique partition or (ii) gives a certificate that the graph is not a UDG. In case (i), where it computes a clique partition, we show that the partition is guaranteed to be within a $(1+\varepsilon)$ ratio of the optimum if the input is a UDG; however, if the input is not a UDG, it either computes a clique partition as in case (i) with no guarantee on the quality of the clique partition or detects that it is not a UDG. Noting that recognition of UDGs is NP-hard even if we are given edge lengths, our PTAS is a weakly-robust algorithm. Our algorithm can be transformed into an $O(\frac{\log^* n}{\varepsilon^{O(1)}})$ time distributed PTAS.
We consider a weighted version of the clique partition problem on vertex weighted UDGs that generalizes the problem. We note some key distinctions with the unweighted version, where ideas useful in obtaining a PTAS break down. Yet, surprisingly, it admits a $(2+\varepsilon)$-approximation algorithm for the weighted case where the graph is expressed, say, as an adjacency matrix. This improves on the best known 8-approximation for the {\em unweighted} case for UDGs expressed in standard form.
Submitted 3 December, 2009; v1 submitted 14 April, 2009;
originally announced April 2009.