Approximate Minimum Sum Colorings and Maximum $k$-Colorable Subgraphs of Chordal Graphs

Abstract: We give a $(1.796+ε)$-approximation for the minimum sum coloring problem on chordal graphs, improving over the previous 3.591-approximation by Gandhi et al. [2005]. To do so, we also design the first polynomial-time approximation scheme for the maximum $k$-colorable subgraph problem in chordal graphs.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 15 pages, preliminary version appeared in the proceedings of WADS 2023

ACM Class: F.2.2

arXiv:2405.12876 [pdf, other]

Approximating TSP Variants Using a Bridge Lemma

Authors: Martin Böhm, Zachary Friggstad, Tobias Mömke, Joachim Spoerhase

Abstract: We give improved approximations for two metric Traveling Salesman Problem (TSP) variants. In Ordered TSP (OTSP) we are given a linear ordering on a subset of nodes $o_1, \ldots, o_k$. The TSP solution must visit $o_{i+1}$ at some point after $o_i$ for each $1 \leq i < k$. This is the special case of Precedence-Constrained TSP (PTSP) in which the precedence constraints are given by a single chain on a subset of nodes. In $k$-Person TSP Path ($k$-TSPP), we are given pairs of nodes $(s_1,t_1), \ldots, (s_k,t_k)$. The goal is to find an $s_i$-$t_i$ path for each $i$ with minimum total cost such that every node is visited by at least one path. We obtain a $3/2 + e^{-1} < 1.878$ approximation for OTSP, the first improvement over a trivial $α+1$ approximation where $α$ is the current best TSP approximation. We also obtain a $1 + 2 \cdot e^{-1/2} < 2.214$ approximation for $k$-TSPP, the first improvement over a trivial $3$-approximation. These algorithms both use an adaptation of the Bridge Lemma that was initially used to obtain improved Steiner Tree approximations [Byrka et al., 2013]. Roughly speaking, our variant states that the cost of a cheapest forest rooted at a given set of terminal nodes will decrease by a substantial amount if we randomly sample a set of non-terminal nodes to also become terminals, provided each non-terminal has a constant probability of being sampled. We believe this view of the Bridge Lemma will find further use for improved vehicle routing approximations beyond this paper.

Submitted 21 May, 2024; originally announced May 2024.
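
The two constants quoted in the abstract above can be checked numerically. The sketch below only evaluates the closed-form expressions; the baseline uses $α = 1.5$ as an illustrative assumption, not the current best TSP factor.

```python
import math

# Guarantees as stated in the abstract.
otsp_ratio = 3 / 2 + math.exp(-1)        # Ordered TSP
ktspp_ratio = 1 + 2 * math.exp(-1 / 2)   # k-Person TSP Path

# Assumed value of alpha for the trivial (alpha + 1) OTSP baseline.
alpha = 1.5
otsp_baseline = alpha + 1

print(f"OTSP:   {otsp_ratio:.4f} (baseline {otsp_baseline:.1f})")
print(f"k-TSPP: {ktspp_ratio:.4f} (baseline 3)")
```

Both values fall below the rounded bounds 1.878 and 2.214 given in the abstract.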

arXiv:2405.08931 [pdf, ps, other]

A QPTAS for Facility Location on Unit Disk graphs

Authors: Zachary Friggstad, Mohsen Rezapour, Mohammad R. Salavatipour, Hao Sun

Abstract: We study the classic (Uncapacitated) Facility Location problem on Unit Disk Graphs (UDGs). For a given point set $P$ in the plane, the unit disk graph UDG(P) on $P$ has vertex set $P$ and an edge between two distinct points $p, q \in P$ if and only if their Euclidean distance $|pq|$ is at most 1. The weight of the edge $pq$ is equal to their distance $|pq|$. An instance of Facility Location on UDG(P) consists of a set $C\subseteq P$ of clients and a set $F\subseteq P$ of facilities, each having an opening cost $f_i$. The goal is to pick a subset $F'\subseteq F$ to open while minimizing $\sum_{i\in F'} f_i + \sum_{v\in C} d(v,F')$, where $d(v,F')$ is the distance from $v$ to the nearest facility in $F'$ through UDG(P). In this paper, we present the first quasi-polynomial time approximation scheme (QPTAS) for the problem. While approximation schemes are well-established for facility location problems on sparse geometric graphs (such as planar graphs), there is a lack of such results for dense graphs. Specifically, prior to this study, to the best of our knowledge, there was no approximation scheme for any facility location problem on UDGs in the general setting.

Submitted 14 May, 2024; originally announced May 2024.
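
To make the objective in the abstract above concrete, here is a minimal sketch that evaluates $\sum_{i\in F'} f_i + \sum_{v\in C} d(v,F')$ for a given set of open facilities. It is not the paper's QPTAS; the function names and the tiny instance are hypothetical, and distances are computed with a plain Dijkstra over the unit disk graph.

```python
import heapq
import math

def udg_distances(points, source):
    """Shortest-path distances from `source` in the unit disk graph on `points`."""
    dist = [math.inf] * len(points)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, q in enumerate(points):
            w = math.dist(points[u], q)
            # An edge exists only between distinct points at distance <= 1.
            if v != u and w <= 1.0 and d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def facility_location_cost(points, clients, opening_cost, opened):
    """Opening costs of `opened` plus each client's distance to its nearest open facility."""
    from_facility = {i: udg_distances(points, i) for i in opened}
    connection = sum(min(from_facility[i][v] for i in opened) for v in clients)
    return sum(opening_cost[i] for i in opened) + connection

# Hypothetical instance: three collinear points, facility at the middle one.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
cost = facility_location_cost(points, clients=[0, 2], opening_cost={1: 0.5}, opened=[1])
print(cost)
```

The sketch assumes `opened` is non-empty; a client that cannot reach any open facility through the graph contributes an infinite connection cost.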

arXiv:2302.04747 [pdf, other]

An $O(\log k)$-Approximation for Directed Steiner Tree in Planar Graphs

Authors: Zachary Friggstad, Ramin Mousavi

Abstract: We present an $O(\log k)$-approximation for both the edge-weighted and node-weighted versions of Directed Steiner Tree (DST) in planar graphs, where $k$ is the number of terminals. We extend our approach to Multi-Rooted Directed Steiner Tree (MDST), for which we get an $O(R+\log k)$-approximation in planar graphs, where $R$ is the number of roots. (In general graphs MDST and DST are easily seen to be equivalent, but in planar graphs this is not necessarily the case.)

Submitted 21 April, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

arXiv:2211.12423 [pdf, other]

On Narrative Information and the Distillation of Stories

Authors: Dylan R. Ashley, Vincent Herrmann, Zachary Friggstad, Jürgen Schmidhuber

Abstract: The act of telling stories is a fundamental part of what it means to be human. This work introduces the concept of narrative information, which we define to be the overlap in information space between a story and the items that compose the story. Using contrastive learning methods, we show how modern artificial neural networks can be leveraged to distill stories and extract a representation of the narrative information. We then demonstrate how evolutionary algorithms can leverage this to extract a set of narrative templates and how these templates -- in tandem with a novel curve-fitting algorithm we introduce -- can reorder music albums to automatically induce stories in them. In the process of doing so, we give strong statistical evidence that these narrative information templates are present in existing albums. While we experiment only with music albums here, the premises of our work extend to any form of (largely) independent media.

Submitted 13 February, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

Comments: presented in the Information-Theoretic Principles in Cognitive Systems Workshop at the 36th Conference on Neural Information Processing Systems; 4 pages in main text + 2 pages of references + 8 pages of appendices, 2 figures in main text + 3 in appendices, 1 table in main text, 2 algorithms in appendices; source code available at https://github.com/dylanashley/story-distiller

MSC Class: 68T07 (Primary) 68P30; 68W50; 94A15 (Secondary) ACM Class: H.1.1; H.5.5; I.2.6; I.5.1; J.5

arXiv:2112.10195 [pdf, ps, other]

Parameterized Approximation Algorithms for $k$-Center Clustering and Variants

Authors: Sayan Bandyapadhyay, Zachary Friggstad, Ramin Mousavi

Abstract: $k$-center is one of the most popular clustering models. While it admits a simple 2-approximation in polynomial time in general metrics, the Euclidean version is NP-hard to approximate within a factor of 1.93, even in the plane, if one insists the dependence on $k$ in the running time be polynomial. Without this restriction, a classic algorithm yields a $2^{O((k\log k)/ε)}dn$-time $(1+ε)$-approximation for Euclidean $k$-center, where $d$ is the dimension. We give a faster algorithm for small dimensions: roughly speaking an $O^*(2^{O((1/ε)^{O(d)} \cdot k^{1-1/d} \cdot \log k)})$-time $(1+ε)$-approximation. In particular, the running time is roughly $O^*(2^{O((1/ε)^{O(1)}\sqrt{k}\log k)})$ in the plane. We complement our algorithmic result with a matching hardness lower bound. We also consider a well-studied generalization of $k$-center, called Non-uniform $k$-center (NUkC), where clusters may have different radii. NUkC is NP-hard to approximate within any factor, even in the Euclidean case. We design a $2^{O(k\log k)}n^2$-time $3$-approximation for NUkC in general metrics, and a $2^{O((k\log k)/ε)}dn$-time $(1+ε)$-approximation for Euclidean NUkC. The latter time bound matches the bound for $k$-center.

Submitted 19 December, 2021; originally announced December 2021.

Comments: A preliminary version appears in AAAI 2022
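
The abstract above mentions a simple polynomial-time 2-approximation for $k$-center in general metrics. One classic algorithm with that guarantee is the greedy farthest-first traversal (usually attributed to Gonzalez); the sketch below illustrates it on Euclidean points and is not necessarily the algorithm the authors have in mind. The function name and the sample points are hypothetical.

```python
import math

def farthest_first_centers(points, k):
    """Greedy farthest-first traversal; returns (center indices, covering radius)."""
    centers = [0]  # the first center is arbitrary
    # nearest[j] = distance from point j to its closest chosen center
    nearest = [math.dist(p, points[0]) for p in points]
    while len(centers) < k:
        # Pick the point that is currently farthest from all chosen centers.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        centers.append(nxt)
        nearest = [min(d, math.dist(p, points[nxt])) for d, p in zip(nearest, points)]
    return centers, max(nearest)

points = [(0, 0), (1, 0), (10, 0), (11, 0)]
centers, radius = farthest_first_centers(points, k=2)
print(centers, radius)
```

The only metric access is through `math.dist`, so swapping in another distance function adapts the sketch to other metrics.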

arXiv:2111.08138 [pdf, ps, other]

Improved Approximations for CVRP with Unsplittable Demands

Authors: Zachary Friggstad, Ramin Mousavi, Mirmahdi Rahgoshay, Mohammad R. Salavatipour

Abstract: In this paper, we present improved approximation algorithms for the (unsplittable) Capacitated Vehicle Routing Problem (CVRP) in general metrics. In CVRP, introduced by Dantzig and Ramser (1959), we are given a set of points (clients) $V$ together with a depot $r$ in a metric space, with each $v\in V$ having a demand $d_v>0$, and a vehicle of bounded capacity $Q$. The goal is to find a minimum cost collection of tours for the vehicle, each starting and ending at the depot, such that each client is visited at least once and the total demand of the clients in each tour is at most $Q$. In the unsplittable variant we study, the demand of a node must be served entirely by one tour. We present two approximation algorithms for unsplittable CVRP: a combinatorial $(α+1.75)$-approximation, where $α$ is the approximation factor for the Traveling Salesman Problem, and an approximation algorithm based on LP rounding with approximation guarantee $α+1+\ln(2) + δ\approx 3.194 + δ$ in $n^{O(1/δ)}$ time. Both approximations can further be improved by a small amount when combined with recent work by Blauth, Traub, and Vygen (2021), who obtained an $(α+ 2\cdot (1 -ε))$-approximation for unsplittable CVRP for some constant $ε$ depending on $α$ ($ε> 1/3000$ for $α= 1.5$).

Submitted 15 November, 2021; originally announced November 2021.

arXiv:2111.07414 [pdf, ps, other]

Combinatorial Algorithms for Rooted Prize-Collecting Walks and Applications to Orienteering and Minimum-Latency Problems

Authors: Sina Dezfuli, Zachary Friggstad, Ian Post, Chaitanya Swamy

Abstract: We consider the rooted prize-collecting walks (PCW) problem, wherein we seek a collection $C$ of rooted walks having minimum prize-collecting cost, which is the (total cost of walks in $C$) + (total node-reward of nodes not visited by any walk in $C$). This problem arises naturally as the Lagrangian relaxation of both orienteering, where we seek a length-bounded walk of maximum reward, and the $\ell$-stroll problem, where we seek a minimum-length walk covering at least $\ell$ nodes. Our main contribution is to devise a simple, combinatorial algorithm for the PCW problem in directed graphs that returns a rooted tree whose prize-collecting cost is at most the optimum value of the prize-collecting walks problem. We utilize our algorithm to develop combinatorial approximation algorithms for two fundamental vehicle-routing problems (VRPs): (1) orienteering; and (2) $k$-minimum-latency problem ($k$-MLP), wherein we seek to cover all nodes using $k$ paths starting at a prescribed root node, so as to minimize the sum of the node visiting times. Our combinatorial algorithm allows us to sidestep the part where we solve a preflow-based LP in the LP-rounding algorithms of Friggstad and Swamy (2017) for orienteering, and in the state-of-the-art $7.183$-approximation algorithm for $k$-MLP in Post and Swamy (2015). Consequently, we obtain combinatorial implementations of these algorithms with substantially improved running times compared with the current-best approximation factors. We report computational results for our resulting (combinatorial implementations of) orienteering algorithms, which show that the algorithms perform quite well in practice, both in terms of the quality of the solution they return and the upper bound they yield on the orienteering optimum (which is obtained by leveraging the workings of our PCW algorithm).

Submitted 14 November, 2021; originally announced November 2021.

ACM Class: F.2.2; G.1.6; G.2

arXiv:2111.02572 [pdf, ps, other]

A Constant-Factor Approximation for Quasi-bipartite Directed Steiner Tree on Minor-Free Graphs

Authors: Zachary Friggstad, Ramin Mousavi

Abstract: We give the first constant-factor approximation algorithm for quasi-bipartite instances of Directed Steiner Tree on graphs that exclude fixed minors. In particular, for $K_r$-minor-free graphs our approximation guarantee is $O(r\cdot\sqrt{\log r})$ and, further, for planar graphs our approximation guarantee is 20. Our algorithm uses the primal-dual scheme. We employ a more involved method of determining when to buy an edge while raising dual variables since, as we show, the natural primal-dual scheme fails to raise enough dual value to pay for the purchased solution. As a consequence, we also demonstrate integrality gap upper bounds on the standard cut-based linear programming relaxation for the Directed Steiner Tree instances we consider.

Submitted 5 November, 2022; v1 submitted 3 November, 2021; originally announced November 2021.

arXiv:2111.02216 [pdf, other]

Automatic Embedding of Stories Into Collections of Independent Media

Authors: Dylan R. Ashley, Vincent Herrmann, Zachary Friggstad, Kory W. Mathewson, Jürgen Schmidhuber

Abstract: We look at how machine learning techniques that derive properties of items in a collection of independent media can be used to automatically embed stories into such collections. To do so, we use models that extract the tempo of songs to make a music playlist follow a narrative arc. Our work specifies an open-source tool that uses pre-trained neural network models to extract the global tempo of a set of raw audio files and applies these measures to create a narrative-following playlist. This tool is available at https://github.com/dylanashley/playlist-story-builder/releases/tag/v1.0.0

Submitted 3 November, 2021; originally announced November 2021.

Comments: 2 pages in main text + 1 page of references + 6 pages of appendices, 2 figures in main text + 3 figures in appendices, 1 algorithm in appendices; source code available at https://gist.github.com/dylanashley/1387a99deb85bfc0bce11286810cd98b

ACM Class: H.5.5; I.2.6; J.5

arXiv:1912.06198 [pdf, other]

A Constant-Factor Approximation for Directed Latency in Quasi-Polynomial Time

Authors: Zachary Friggstad, Chaitanya Swamy

Abstract: We give the first constant-factor approximation for the Directed Latency problem in quasi-polynomial time. Here, the goal is to visit all nodes in an asymmetric metric with a single vehicle starting at a depot $r$ to minimize the average time a node waits to be visited by the vehicle. The approximation guarantee is an improvement over the polynomial-time $O(\log n)$-approximation [Friggstad, Salavatipour, Svitkina, 2013] and no better quasi-polynomial time approximation algorithm was known. To obtain this, we must extend a recent result showing the integrality gap of the Asymmetric TSP-Path LP relaxation is bounded by a constant [Köhne, Traub, and Vygen, 2019], which itself builds on the breakthrough result that the integrality gap for standard Asymmetric TSP is also a constant [Svensson, Tarnawski, and Végh, 2018]. We show the standard Asymmetric TSP-Path integrality gap is bounded by a constant even if the cut requirements of the LP relaxation are relaxed from $x(δ^{in}(S)) \geq 1$ to $x(δ^{in}(S)) \geq ρ$ for some constant $1/2 < ρ\leq 1$. We also give a better approximation guarantee in the special case of Directed Latency in regret metrics, where the goal is to find a path $P$ minimizing the average time a node $v$ waits in excess of $c_{rv}$, i.e. $\frac{1}{|V|} \cdot \sum_{v \in V} (c_v(P)-c_{rv})$.

Submitted 15 April, 2020; v1 submitted 12 December, 2019; originally announced December 2019.
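
The regret objective $\frac{1}{|V|} \cdot \sum_{v \in V} (c_v(P)-c_{rv})$ from the abstract above can be evaluated directly for a given path. The sketch below assumes the path starts at the depot and visits every node exactly once; the function name and the small asymmetric cost matrix are hypothetical.

```python
def average_regret(path, cost):
    """Average over all nodes of (arrival time along `path`) minus (direct cost from the depot)."""
    root = path[0]
    arrival = 0.0  # c_v(P): time at which the current node is reached
    total = 0.0    # the depot itself contributes zero regret
    for prev, node in zip(path, path[1:]):
        arrival += cost[prev][node]
        total += arrival - cost[root][node]
    return total / len(path)

# Asymmetric costs on three nodes satisfying the triangle inequality; node 0 is the depot.
cost = [
    [0.0, 1.0, 1.5],
    [2.0, 0.0, 1.0],
    [1.0, 2.0, 0.0],
]
regret_a = average_regret([0, 1, 2], cost)
regret_b = average_regret([0, 2, 1], cost)
print(regret_a, regret_b)
```

On this instance the visiting order 0, 1, 2 has the lower average regret of the two orders.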

arXiv:1912.05010 [pdf, other]

Graph Pricing with Limited Supply

Authors: Zachary Friggstad, Maryam Mahboub

Abstract: We study approximation algorithms for graph pricing with vertex capacities yet without the traditional envy-free constraint. Specifically, we have a set of items $V$ and a set of customers $X$ where each customer $i \in X$ has a budget $b_i$ and is interested in a bundle of items $S_i \subseteq V$ with $|S_i| \leq 2$. However, there is a limited supply of each item: we only have $μ_v$ copies of item $v$ to sell for each $v \in V$. We should assign prices $p(v)$ to each $v \in V$ and choose a subset $Y \subseteq X$ of customers so that each $i \in Y$ can afford their bundle ($p(S_i) \leq b_i$) and at most $μ_v$ chosen customers have item $v$ in their bundle for each item $v \in V$. Each customer $i \in Y$ pays $p(S_i)$ for the bundle they purchased: our goal is to do this in a way that maximizes revenue. Such pricing problems have been studied from the perspective of envy-freeness where we also must ensure that $p(S_i) \geq b_i$ for each $i \notin Y$. However, the version where we simply allocate items to customers after setting prices and do not worry about the envy-free condition has received less attention. Our main result is an 8-approximation for the capacitated case via local search and a 7.8096-approximation in simple graphs with uniform vertex capacities. The latter is obtained by combining a more involved analysis of a multi-swap local search algorithm for constant capacities and an LP-rounding algorithm for larger capacities. If all capacities are bounded by a constant $C$, we further show a multi-swap local search algorithm yields a $\left(4 \cdot \frac{2C-1}{C} + ε\right)$-approximation. We also give a $(4+ε)$-approximation in simple graphs through LP rounding when all capacities are very large as a function of $ε$.

Submitted 10 December, 2019; originally announced December 2019.

arXiv:1807.05443 [pdf, other]

Exact Algorithms and Lower Bounds for Stable Instances of Euclidean k-Means

Authors: Zachary Friggstad, Kamyar Khodamoradi, Mohammad R. Salavatipour

Abstract: We investigate the complexity of solving stable or perturbation-resilient instances of $k$-Means and $k$-Median clustering in fixed dimension Euclidean metrics (more generally doubling metrics). The notion of stable (perturbation resilient) instances was introduced by Bilu and Linial [2010] and Awasthi et al. [2012]. In our context we say a $k$-Means instance is $α$-stable if there is a unique OPT which remains optimum if distances are (non-uniformly) stretched by a factor of at most $α$. Stable clustering instances have been studied to explain why heuristics such as Lloyd's algorithm perform well in practice. In this work we show that for any fixed $ε>0$, $(1+ε)$-stable instances of $k$-Means in doubling metrics can be solved in polynomial time. More precisely we show a natural multiswap local search algorithm finds OPT for $(1+ε)$-stable instances of $k$-Means and $k$-Median in a polynomial number of iterations. We complement this result by showing that under a new PCP theorem, this is essentially tight: when the dimension $d$ is part of the input, there is a fixed $ε_0>0$ such that there is not even a PTAS for $(1+ε_0)$-stable $k$-Means in $\mathbb{R}^d$ unless NP=RP. To do this, we consider a robust property of CSPs; call an instance stable if there is a unique optimum solution $x^*$ and for any other solution $x'$, the number of unsatisfied clauses is proportional to the Hamming distance between $x^*$ and $x'$. Dinur et al. have already shown stable QSAT is hard to approximate for some constant Q; our hypothesis is simply that stable QSAT with bounded variable occurrence is also hard. Given this hypothesis we consider "stability-preserving" reductions to prove our hardness for stable $k$-Means. Such reductions seem to be more fragile than standard L-reductions and may be of further use to demonstrate other stable optimization problems are hard.

Submitted 30 January, 2024; v1 submitted 14 July, 2018; originally announced July 2018.

Comments: 28 pages, 2 figures

MSC Class: 68W40; 68W25

arXiv:1708.01335 [pdf, other]

Compact, Provably-Good LPs for Orienteering and Regret-Bounded Vehicle Routing

Authors: Zachary Friggstad, Chaitanya Swamy

Abstract: We develop polynomial-size LP-relaxations for orienteering and the regret-bounded vehicle routing problem (RVRP) and devise suitable LP-rounding algorithms that lead to various new insights and approximation results for these problems. In orienteering, the goal is to find a maximum-reward $r$-rooted path, possibly ending at a specified node, of length at most some given budget $B$. In RVRP, the goal is to find the minimum number of $r$-rooted paths of regret at most a given bound $R$ that cover all nodes, where the regret of an $r$-$v$ path is its length $-$ $c_{rv}$. For rooted orienteering, we introduce a natural bidirected LP-relaxation and obtain a simple $3$-approximation algorithm via LP-rounding. This is the first LP-based guarantee for this problem. We also show that point-to-point (P2P) orienteering can be reduced to a regret-version of rooted orienteering at the expense of a factor-2 loss in approximation. For RVRP, we propose two compact LPs that lead to significant improvements, in both approximation ratio and running time, over the approach in [Friggstad and Swamy, 2014]. One of these is a natural modification of the LP for rooted orienteering; the other is an unconventional formulation that is motivated by certain structural properties of an RVRP-solution, which leads to a $15$-approximation algorithm for RVRP.

Submitted 3 August, 2017; originally announced August 2017.

ACM Class: F.2.2; G.1.6; G.2

arXiv:1707.04295 [pdf, other]

Approximation Schemes for Clustering with Outliers

Authors: Zachary Friggstad, Kamyar Khodamoradi, Mohsen Rezapour, Mohammad R. Salavatipour

Abstract: Clustering problems are well-studied in a variety of fields such as data science, operations research, and computer science. Such problems include variants of centre location problems, $k$-median, and $k$-means to name a few. In some cases, not all data points need to be clustered; some may be discarded for various reasons. We study clustering problems with outliers. More specifically, we look at Uncapacitated Facility Location (UFL), $k$-Median, and $k$-Means. In UFL with outliers, we have to open some centres, discard up to $z$ points of $\mathcal{X}$ and assign every other point to the nearest open centre, minimizing the total assignment cost plus centre opening costs. In $k$-Median and $k$-Means, we have to open up to $k$ centres but there are no opening costs. In $k$-Means, the cost of assigning $j$ to $i$ is $δ^2(j,i)$. We present several results. Our main focus is on cases where $δ$ is a doubling metric or is the shortest-path metric of a graph from a minor-closed family of graphs. For uniform-cost UFL with outliers on such metrics we show that a simple multiswap local search heuristic yields a PTAS. With a bit more work, we extend this to bicriteria approximations for the $k$-Median and $k$-Means problems in the same metrics where, for any constant $ε> 0$, we can find a solution using $(1+ε)k$ centres whose cost is at most a $(1+ε)$-factor of the optimum and uses at most $z$ outliers. We also show that natural local search heuristics that do not violate the number of clusters and outliers for $k$-Median (or $k$-Means) will have unbounded gap even in Euclidean metrics. Furthermore, we show how our analysis can be extended to general metrics for $k$-Means with outliers to obtain a $(25+ε,1+ε)$ bicriteria.

Submitted 13 July, 2017; originally announced July 2017.

arXiv:1705.10396 [pdf, other]

Further Approximations for Demand Matching: Matroid Constraints and Minor-Closed Graphs

Authors: Sara Ahmadian, Zachary Friggstad

Abstract: We pursue a study of the Generalized Demand Matching problem, a common generalization of the $b$-Matching and Knapsack problems. Here, we are given a graph with vertex capacities, edge profits, and asymmetric demands on the edges. The goal is to find a maximum-profit subset of edges so the demands of chosen edges do not violate vertex capacities. This problem is APX-hard and constant-factor approximations are known. Our results fall into two categories. First, using iterated relaxation and various filtering strategies, we show via an efficient rounding algorithm that, if an additional matroid structure $\mathcal M$ is given and we only allow sets $F \subseteq E$ that are independent in $\mathcal M$, the natural LP relaxation has an integrality gap of at most $\frac{25}{3} \approx 8.333$. This can be improved in various special cases; for example, we improve over the 15-approximation for the previously-studied Coupled Placement problem [Korupolu et al. 2014] by giving a $7$-approximation. Using similar techniques, we show the problem of computing a minimum-cost base in $\mathcal M$ satisfying vertex capacities admits a $(1,3)$-bicriteria approximation. This improves over the previous $(1,4)$-approximation in the special case that $\mathcal M$ is the graphic matroid over the given graph [Fukunaga and Nagamochi, 2009]. Second, we show Demand Matching admits a polynomial-time approximation scheme in graphs that exclude a fixed minor. If all demands are polynomially-bounded integers, this is somewhat easy using dynamic programming in bounded-treewidth graphs. Our main technical contribution is a sparsification lemma allowing us to scale the demands to be used in a more intricate dynamic programming algorithm, followed by randomized rounding to filter our scaled-demand solution to a feasible solution.

Submitted 29 May, 2017; originally announced May 2017.

arXiv:1604.08132 [pdf, other]

A Logarithmic Integrality Gap Bound for Directed Steiner Tree in Quasi-bipartite Graphs

Authors: Zachary Friggstad, Jochen Koenemann, Mohammad Shadravan

Abstract: We demonstrate that the integrality gap of the natural cut-based LP relaxation for the directed Steiner tree problem is $O(\log k)$ in quasi-bipartite graphs with $k$ terminals. Such instances can be seen to generalize set cover, so the integrality gap analysis is tight up to a constant factor. A novel aspect of our approach is that we use the primal-dual method, a technique that is rarely used in designing approximation algorithms for network design problems in directed graphs.

Submitted 27 April, 2016; originally announced April 2016.

arXiv:1603.08976 [pdf, other]

Local Search Yields a PTAS for k-Means in Doubling Metrics

Authors: Zachary Friggstad, Mohsen Rezapour, Mohammad R. Salavatipour

Abstract: The most well known and ubiquitous clustering problem encountered in nearly every branch of science is undoubtedly $k$-means: given a set of data points and a parameter $k$, select $k$ centres and partition the data points into $k$ clusters around these centres so that the sum of squares of distances of the points to their cluster centre is minimized. Typically these data points lie in $\mathbb{R}^d$ for some $d\geq 2$. $k$-means and the first algorithms for it were introduced in the 1950s. Since then, hundreds of papers have studied this problem and many algorithms have been proposed for it. The most commonly used algorithm is known as Lloyd-Forgy, which is also referred to as "the" $k$-means algorithm, and various extensions of it often work very well in practice. However, they may produce solutions whose cost is arbitrarily large compared to the optimum solution. Kanungo et al. [2004] analyzed a simple local search heuristic to get a polynomial-time algorithm with approximation ratio $9+ε$ for any fixed $ε>0$ for $k$-means in Euclidean space. Finding an algorithm with a better approximation guarantee has remained one of the biggest open questions in this area, in particular whether one can get a true PTAS for fixed dimension Euclidean space. We settle this problem by showing that a simple local search algorithm provides a PTAS for $k$-means in $\mathbb{R}^d$ for any fixed $d$.
More precisely, for any error parameter $ε>0$, the local search algorithm that considers swaps of up to $ρ=d^{O(d)}\cdotε^{-O(d/ε)}$ centres at a time finds a solution using exactly $k$ centres whose cost is at most a $(1+ε)$-factor greater than the optimum. Finally, we provide the first demonstration that local search yields a PTAS for the uncapacitated facility location problem and $k$-median with non-uniform opening costs in doubling metrics.
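To make the swap-based local search concrete, here is a minimal Python sketch of the simplest case, in which one centre is swapped at a time ($ρ=1$). It is an illustration only, not the algorithm analysed in the paper: the function names `kmeans_cost` and `single_swap_local_search` are hypothetical, candidate centres are assumed to be restricted to the input points, and the $(1+ε)$ guarantee stated in the abstract relies on the much larger swap size $ρ$.

```python
from itertools import product


def kmeans_cost(points, centres):
    """Sum over all points of the squared Euclidean distance to the nearest centre."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(point, centre)) for centre in centres)
        for point in points
    )


def single_swap_local_search(points, k):
    """Hypothetical helper: local search with swaps of size 1.

    Assumption (not from the abstract): candidate centres are the input points.
    """
    centres = list(points[:k])  # arbitrary initial solution with exactly k centres
    improved = True
    while improved:
        improved = False
        best = kmeans_cost(points, centres)
        # Try replacing the centre at index i with each candidate point.
        for i, candidate in product(range(k), points):
            if candidate in centres:
                continue
            trial = centres[:i] + [candidate] + centres[i + 1:]
            if kmeans_cost(points, trial) < best:
                centres, improved = trial, True
                break  # accept the first improving swap, then rescan
    return centres
```

The loop terminates because every accepted swap strictly lowers the cost and there are finitely many sets of $k$ centres. A practical variant would usually accept only swaps that improve the cost by a fixed relative factor, so that the number of iterations stays polynomial.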

Submitted 9 January, 2017; v1 submitted 29 March, 2016; originally announced March 2016.

arXiv:1603.00973 [pdf, other]

Tight Analysis of a Multiple-Swap Heuristic for Budgeted Red-Blue Median

Authors: Zachary Friggstad, Yifeng Zhang

Abstract: Budgeted Red-Blue Median is a generalization of classic $k$-Median in that there are two sets of facilities, say $\mathcal{R}$ and $\mathcal{B}$, that can be used to serve clients located in some metric space. The goal is to open $k_r$ facilities in $\mathcal{R}$ and $k_b$ facilities in $\mathcal{B}$ for some given bounds $k_r, k_b$ and connect each client to their nearest open facility in a way that minimizes the total connection cost. We extend work by Hajiaghayi, Khandekar, and Kortsarz [2012] and show that a multiple-swap local search heuristic can be used to obtain a $(5+ε)$-approximation for Budgeted Red-Blue Median for any constant $ε> 0$. This is an improvement over their single-swap analysis and beats the previous best approximation guarantee of 8 by Swamy [2014]. We also present a matching lower bound showing that for every $p \geq 1$, there are instances of Budgeted Red-Blue Median with local optimum solutions for the $p$-swap heuristic whose cost is $5 + Ω\left(\frac{1}{p}\right)$ times the optimum solution cost. Thus, our analysis is tight up to the lower order terms. In particular, for any $ε> 0$ we show the single-swap heuristic admits local optima whose cost can be as bad as $7-ε$ times the optimum solution cost.

Submitted 3 March, 2016; originally announced March 2016.

arXiv:1311.6024 [pdf, other]

Approximation Algorithms for Regret-Bounded Vehicle Routing and Applications to Distance-Constrained Vehicle Routing

Authors: Zachary Friggstad, Chaitanya Swamy

Abstract: We consider vehicle-routing problems (VRPs) that incorporate the notion of {\em regret} of a client, which is a measure of the waiting time of a client relative to its shortest-path distance from the depot. Formally, we consider both the additive and multiplicative versions of what we call the {\em regret-bounded vehicle routing problem} (RVRP). In these problems, we are given an undirected complete graph $G=(\{r\}\cup V,E)$ on $n$ nodes with a distinguished root (depot) node $r$, edge costs $\{c_{uv}\}$ that form a metric, and a regret bound $R$. Given a path $P$ rooted at $r$ and a node $v\in P$, let $c_P(v)$ be the distance from $r$ to $v$ along $P$. The goal is to find the fewest number of paths rooted at $r$ that cover all the nodes so that for every node $v$ covered by (say) path $P$: (i) its additive regret, $c_P(v)-c_{rv}$, with respect to $P$ is at most $R$ in {\em additive-RVRP}; or (ii) its multiplicative regret, $c_P(v)/c_{rv}$, with respect to $P$ is at most $R$ in {\em multiplicative-RVRP}. Our main result is the {\em first} constant-factor approximation algorithm for additive-RVRP by devising rounding techniques for a natural {\em configuration-style LP}. This is a substantial improvement over the previous-best $O(\log n)$-approximation. Additive-RVRP turns out to be a rather central vehicle-routing problem, whose study reveals insights into a variety of other regret-related problems as well as the classical {\em distance-constrained VRP} ({DVRP}).
We obtain approximation ratios of $O\bigl(\log(\frac{R}{R-1})\bigr)$ for multiplicative-RVRP, and $O\bigl(\min\bigl\{\mathit{OPT},\frac{\log D}{\log\log D}\bigr\}\bigr)$ for DVRP with distance bound $D$ via reductions to additive-RVRP; the latter improves upon the previous-best approximation for DVRP.
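The two regret measures defined in the abstract can be evaluated directly for a given rooted path. The sketch below is only an illustration of those definitions: the function name `regrets` is hypothetical, and it assumes nodes are points in the plane with Euclidean distances as the metric $c$, which is just one example of a metric.

```python
import math


def regrets(path):
    """Return {node: (additive regret, multiplicative regret)} for a rooted path.

    path[0] is the root r. Assumptions for illustration: nodes are coordinate
    tuples, the metric is Euclidean, and no node coincides with the root.
    """
    root = path[0]
    travelled = 0.0  # c_P(v): distance from r to v along the path
    result = {}
    for prev, node in zip(path, path[1:]):
        travelled += math.dist(prev, node)
        direct = math.dist(root, node)  # c_{rv}: direct distance from the root
        result[node] = (travelled - direct, travelled / direct)
    return result
```

For the path $r=(0,0) \to (3,4) \to (3,0)$, the first node is reached along a shortest route, so its additive regret is 0 and its multiplicative regret is 1. The second node is reached after travelling 9 units although it lies 3 units from the root, giving additive regret 6 and multiplicative regret 3.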

Submitted 23 November, 2013; originally announced November 2013.

ACM Class: F.2.2; G.1.6; G.2

arXiv:1302.3145 [pdf, other]

An Improved Integrality Gap for Asymmetric TSP Paths

Authors: Zachary Friggstad, Anupam Gupta, Mohit Singh

Abstract: The Asymmetric Traveling Salesperson Path Problem (ATSPP) is one where, given an asymmetric metric space $(V,d)$ with specified vertices s and t, the goal is to find an s-t path of minimum length that passes through all the vertices in V. This problem is closely related to the Asymmetric TSP (ATSP), which seeks to find a tour (instead of an $s-t$ path) visiting all the nodes: for ATSP, a $ρ$-approximation guarantee implies an $O(ρ)$-approximation for ATSPP. However, no such connection is known for the integrality gaps of the linear programming relaxations for these problems: the current-best approximation algorithm for ATSPP is $O(\log n/\log\log n)$, whereas the best bound on the integrality gap of the natural LP relaxation (the subtour elimination LP) for ATSPP is $O(\log n)$. In this paper, we close this gap, and improve the current best bound on the integrality gap from $O(\log n)$ to $O(\log n/\log\log n)$. The resulting algorithm uses the structure of narrow $s$-$t$ cuts in the LP solution to construct a (random) spanning tree that can be cheaply augmented to contain an Eulerian $s$-$t$ walk. We also build on a result of Oveis Gharan and Saberi and show a strong form of Goddyn's conjecture about thin spanning trees implies the integrality gap of the subtour elimination LP relaxation for ATSPP is bounded by a constant. Finally, we give a simpler family of instances showing the integrality gap of this LP is at least 2.

Submitted 6 January, 2015; v1 submitted 13 February, 2013; originally announced February 2013.

arXiv:1301.4478 [pdf, other]

Local-Search based Approximation Algorithms for Mobile Facility Location Problems

Authors: Sara Ahmadian, Zachary Friggstad, Chaitanya Swamy

Abstract: We consider the {\em mobile facility location} (MFL) problem. We are given a set of facilities and clients located in a common metric space. The goal is to move each facility from its initial location to a destination and assign each client to the destination of some facility so as to minimize the sum of the movement-costs of the facilities and the client-assignment costs. This abstracts facility-location settings where one has the flexibility of moving facilities from their current locations to other destinations so as to serve clients more efficiently by reducing their assignment costs. We give the first {\em local-search based} approximation algorithm for this problem and achieve the best-known approximation guarantee. Our main result is a $(3+ε)$-approximation for this problem for any constant $ε>0$ using local search. The previous best guarantee was an 8-approximation algorithm based on LP-rounding. Our guarantee {\em matches} the best-known approximation guarantee for the $k$-median problem. Since there is an approximation-preserving reduction from the $k$-median problem to MFL, any improvement of our result would imply an analogous improvement for the $k$-median problem. Furthermore, {\em our analysis is tight} (up to $o(1)$ factors) since the tight example for the local-search based 3-approximation algorithm for $k$-median can be easily adapted to show that our local-search algorithm has a tight approximation ratio of 3.
One of the chief novelties of the analysis is that in order to generate a suitable collection of local-search moves whose resulting inequalities yield the desired bound on the cost of a local-optimum, we define a tree-like structure that (loosely speaking) functions as a "recursion tree", using which we spawn off local-search moves by exploring this tree to a constant depth.

Submitted 18 January, 2013; originally announced January 2013.

ACM Class: F.2.2; G.1.6; G.2.1

arXiv:1207.5722 [pdf, ps, other]

Approximating Minimum-Cost Connected T-Joins

Authors: Joseph Cheriyan, Zachary Friggstad, Zhihan Gao

Abstract: We design and analyse approximation algorithms for the minimum-cost connected T-join problem: given an undirected graph G = (V;E) with nonnegative costs on the edges, and a subset of nodes T, find (if it exists) a spanning connected subgraph H of minimum cost such that every node in T has odd degree and every node not in T has even degree; H may have multiple copies of any edge of G. Two well-known special cases are the TSP (|T| = 0) and the s-t path TSP (|T| = 2). Recently, An, Kleinberg, and Shmoys [STOC 2012] improved on the long-standing 5/3-approximation guarantee for the latter problem and presented an algorithm based on LP rounding that achieves an approximation guarantee of (1+sqrt(5))/2 < 1.6181. We show that the methods of An et al. extend to the minimum-cost connected T-join problem. They presented a new proof for a 5/3-approximation guarantee for the s-t path TSP; their proof extends easily to the minimum-cost connected T-join problem. Next, we improve on the approximation guarantee of 5/3 by extending their LP-rounding algorithm to get an approximation guarantee of 13/8 = 1.625 for all |T| >= 4. Finally, we focus on the prize-collecting version of the problem, and present a primal-dual algorithm that is "Lagrangian multiplier preserving" and that achieves an approximation guarantee 3 - 2/(|T|-1) when |T| >= 4. Our primal-dual algorithm is a generalization of the known primal-dual 2-approximation for the prize-collecting s-t path TSP.
Furthermore, we show that our analysis is tight by presenting instances with |T| >= 4 such that the cost of the solution found by the algorithm is exactly 3 - 2/(|T|-1) times the cost of the constructed dual solution.
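The feasibility condition for a connected T-join, as defined in the abstract, can be checked mechanically. The following Python sketch is a hypothetical checker and not part of the paper's algorithms: the function name `is_connected_t_join` and the edge-list representation (a list of node pairs, with repeated pairs standing for multiple copies of an edge) are assumptions made for illustration.

```python
from collections import Counter


def is_connected_t_join(nodes, edges, T):
    """Check whether the multigraph H = (nodes, edges) is a connected T-join.

    edges is a list of (u, v) pairs; a repeated pair is a parallel copy.
    H must span and connect all nodes, with odd degree exactly on the nodes in T.
    """
    degree = Counter()
    adjacency = {v: set() for v in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        adjacency[u].add(v)
        adjacency[v].add(u)

    # Parity condition: a node has odd degree if and only if it belongs to T.
    parity_ok = all((degree[v] % 2 == 1) == (v in T) for v in nodes)

    # Connectivity condition: depth-first search must reach every node.
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return parity_ok and len(seen) == len(set(nodes))
```

With |T| = 0 the check accepts any connected Eulerian multigraph (the TSP case), and with T = {s, t} it accepts any connected multigraph containing an Eulerian s-t walk (the s-t path TSP case).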

Submitted 24 July, 2012; originally announced July 2012.

arXiv:1204.5489 [pdf, ps, other]

Understanding Set Cover: Sub-exponential Time Approximations and Lift-and-Project Methods

Authors: Eden Chlamtac, Zac Friggstad, Konstantinos Georgiou

Abstract: Recently, Cygan, Kowalik, and Wykurz [IPL 2009] gave sub-exponential-time approximation algorithms for the Set-Cover problem with approximation ratios better than ln(n). In light of this result, it is natural to ask whether such improvements can be achieved using lift-and-project methods. We present a simpler combinatorial algorithm which has nearly the same time-approximation tradeoff as the algorithm of Cygan et al., and which lends itself naturally to a lift-and-project based approach. At a high level, our approach is similar to the recent work of Karlin, Mathieu, and Nguyen [IPCO 2011], who examined a known PTAS for Knapsack (similar to our combinatorial Set-Cover algorithm) and its connection to hierarchies of LP and SDP relaxations for Knapsack. For Set-Cover, we show that, indeed, using the trick of "lifting the objective function", we can match the performance of our combinatorial algorithm using the LP hierarchy of Lovász and Schrijver. We also show that this trick is essential: even in the stronger LP hierarchy of Sherali and Adams, the integrality gap remains at least (1-eps) ln(n) at level Omega(n) (when the objective function is not lifted). As shown by Alekhnovich, Arora, and Tourlakis [STOC 2005], Set-Cover relaxations stemming from SDP hierarchies (specifically, LS+) have similarly large integrality gaps. This stands in contrast to Knapsack, where Karlin et al. showed that the (much stronger) Lasserre SDP hierarchy reduces the integrality gap to (1+eps) at level O(1).
For completeness, we show that LS+ also reduces the integrality gap for Knapsack to (1+eps). This result may be of independent interest, as our LS+ based rounding and analysis are rather different from those of Karlin et al., and to the best of our knowledge this is the first explicit demonstration of such a reduction in the integrality gap of LS+ relaxations after few rounds.

Submitted 25 October, 2012; v1 submitted 24 April, 2012; originally announced April 2012.

arXiv:1112.2930 [pdf, ps, other]

Multiple Traveling Salesmen in Asymmetric Metrics

Authors: Zachary Friggstad

Abstract: We consider some generalizations of the Asymmetric Traveling Salesman Path problem. Suppose we have an asymmetric metric G = (V,A) with two distinguished nodes s,t. We are also given a positive integer k. The goal is to find k paths of minimum total cost from s to t whose union spans all nodes. We call this the k-Person Asymmetric Traveling Salesmen Path problem (k-ATSPP). Our main result for k-ATSPP is a bicriteria approximation that, for some parameter b >= 1 we may choose, finds between k and k + k/b paths of total length O(b log |V|) times the optimum value of an LP relaxation based on the Held-Karp relaxation for the Traveling Salesman problem. On one extreme this is an O(log |V|)-approximation that uses up to 2k paths and on the other it is an O(k log |V|)-approximation that uses exactly k paths. Next, we consider the case where we have k pairs of nodes (s_1,t_1), ..., (s_k,t_k). The goal is to find an s_i-t_i path for every pair such that each node of G lies on at least one of these paths. Simple approximation algorithms are presented for the special cases where the metric is symmetric or where s_i = t_i for each i. We also show that the problem can be approximated within a factor O(log n) when k=2. On the other hand, we demonstrate that the general problem cannot be approximated within any bounded ratio unless P = NP.

Submitted 14 December, 2011; v1 submitted 13 December, 2011; originally announced December 2011.

Comments: 19 Pages, 3 Figures. First revision fixes a broken reference and adds to the discussion for General 2-ATSPP

ACM Class: F.2.2

arXiv:0907.0726 [pdf, ps, other]

Asymmetric Traveling Salesman Path and Directed Latency Problems

Authors: Zachary Friggstad, Mohammad R. Salavatipour, Zoya Svitkina

Abstract: We study integrality gaps and approximability of two closely related problems on directed graphs. Given a set V of n nodes in an underlying asymmetric metric and two specified nodes s and t, both problems ask to find an s-t path visiting all other nodes. In the asymmetric traveling salesman path problem (ATSPP), the objective is to minimize the total cost of this path. In the directed latency problem, the objective is to minimize the sum of distances on this path from s to each node. Both of these problems are NP-hard. The best known approximation algorithms for ATSPP had ratio O(log n) until the very recent result that improves it to O(log n/ log log n). However, only a bound of O(sqrt(n)) for the integrality gap of its linear programming relaxation has been known. For directed latency, the best previously known approximation algorithm has a guarantee of O(n^(1/2+eps)), for any constant eps > 0. We present a new algorithm for the ATSPP problem that has an approximation ratio of O(log n), but whose analysis also bounds the integrality gap of the standard LP relaxation of ATSPP by the same factor. This solves an open problem posed by Chekuri and Pal [2007]. We then pursue a deeper study of this linear program and its variations, which leads to an algorithm for the k-person ATSPP (where k s-t paths of minimum total length are sought) and an O(log n)-approximation for the directed latency problem.
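The two objectives contrasted in the abstract differ only in how the same s-t path is scored. The short Python sketch below evaluates both for a given visiting order. It is an illustration of the definitions only: the function name `path_cost_and_latency` and the nested-dictionary representation `d[u][v]` of the asymmetric distances are assumptions.

```python
def path_cost_and_latency(order, d):
    """Score an s-t path given as a visiting order.

    d[u][v] is the (possibly asymmetric) distance from u to v.
    Returns (total path cost, sum over nodes of their distance from s along the path).
    """
    cost = 0     # ATSPP objective: total length of the path
    latency = 0  # directed latency objective: sum of arrival distances
    for u, v in zip(order, order[1:]):
        cost += d[u][v]
        latency += cost  # `cost` is now the distance from s to v along the path
    return cost, latency
```

For example, with distances d(s,a) = 1 and d(a,t) = 2, the order s, a, t has path cost 3 and latency 1 + 3 = 4, because node a is reached at distance 1 and node t at distance 3.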

Submitted 1 June, 2010; v1 submitted 3 July, 2009; originally announced July 2009.

Showing 1–26 of 26 results for author: Friggstad, Z