-
Independent set in $k$-Claw-Free Graphs: Conditional $χ$-boundedness and the Power of LP/SDP Relaxations
Authors:
Parinya Chalermsook,
Ameet Gadekar,
Kamyar Khodamoradi,
Joachim Spoerhase
Abstract:
This paper studies $k$-claw-free graphs, exploring the connection between an extremal combinatorics question and the power of a convex program in approximating the maximum-weight independent set in this graph class. For the extremal question, we consider a notion that we call \textit{conditional $χ$-boundedness} of a graph: given a graph $G$ that is assumed to contain an independent set of a certain (constant) size, we are interested in upper bounding the chromatic number in terms of the clique number of $G$. This question, besides being interesting on its own, has algorithmic implications (which have been relatively neglected in the literature) on the performance of SDP relaxations in estimating the value of a maximum-weight independent set.
For $k=3$, Chudnovsky and Seymour (JCTB 2010) prove that any $3$-claw-free graph $G$ with an independent set of size three must satisfy $χ(G) \leq 2 ω(G)$. Their result implies a factor-$2$ estimation algorithm for the maximum-weight independent set via an SDP relaxation (providing the first non-trivial result for maximum-weight independent set in such graphs via a convex relaxation). An obvious open question is whether a similar conditional $χ$-boundedness phenomenon holds for any $k$-claw-free graph. Our main result answers this question negatively. We further present some evidence that our construction could be useful in studying more broadly the power of convex relaxations in the context of approximating maximum-weight independent set in $k$-claw-free graphs. In particular, we prove a lower bound on families of convex programs that are stronger than known convex relaxations used algorithmically in this context.
Submitted 30 August, 2023;
originally announced August 2023.
-
Parameterized Approximation for Robust Clustering in Discrete Geometric Spaces
Authors:
Fateme Abbasi,
Sandip Banerjee,
Jarosław Byrka,
Parinya Chalermsook,
Ameet Gadekar,
Kamyar Khodamoradi,
Dániel Marx,
Roohani Sharma,
Joachim Spoerhase
Abstract:
We consider the well-studied Robust $(k, z)$-Clustering problem, which generalizes the classic $k$-Median, $k$-Means, and $k$-Center problems. Given a constant $z\ge 1$, the input to Robust $(k, z)$-Clustering is a set $P$ of $n$ weighted points in a metric space $(M,δ)$ and a positive integer $k$. Further, each point belongs to one (or more) of the $m$ many different groups $S_1,S_2,\ldots,S_m$. Our goal is to find a set $X$ of $k$ centers such that $\max_{i \in [m]} \sum_{p \in S_i} w(p) δ(p,X)^z$ is minimized.
This problem arises in the domains of robust optimization [Anthony, Goyal, Gupta, Nagarajan, Math. Oper. Res. 2010] and in algorithmic fairness. For polynomial time computation, an approximation factor of $O(\log m/\log\log m)$ is known [Makarychev, Vakilian, COLT 2021], which is tight under a plausible complexity assumption even in line metrics. For FPT time, there is a $(3^z+ε)$-approximation algorithm, which is tight under GAP-ETH [Goyal, Jaiswal, Inf. Proc. Letters, 2023].
Motivated by the tight lower bounds for general discrete metrics, we focus on \emph{geometric} spaces such as the (discrete) high-dimensional Euclidean setting and metrics of low doubling dimension, which play an important role in data analysis applications. First, for a universal constant $η_0 >0.0006$, we devise a $3^z(1-η_{0})$-factor FPT approximation algorithm for discrete high-dimensional Euclidean spaces, thereby bypassing the lower bound for general metrics. We complement this result by showing that even the special case of $k$-Center in dimension $Θ(\log n)$ is $(\sqrt{3/2}- o(1))$-hard to approximate for FPT algorithms. Finally, we complete the FPT approximation landscape by designing an FPT $(1+ε)$-approximation scheme (EPAS) for metrics of sub-logarithmic doubling dimension.
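The objective stated in the abstract, $\max_{i \in [m]} \sum_{p \in S_i} w(p) δ(p,X)^z$, can be evaluated directly for a given set of centers. The following is a minimal sketch of that evaluation only, not of any algorithm from the paper; the function name `robust_cost`, its parameters, and the representation of groups as lists of point indices are assumptions made for illustration.

```python
from typing import Callable, Sequence, TypeVar

P = TypeVar("P")


def robust_cost(
    points: Sequence[P],
    weights: Sequence[float],
    groups: Sequence[Sequence[int]],
    centers: Sequence[P],
    dist: Callable[[P, P], float],
    z: float = 1,
) -> float:
    """Evaluate max over groups of sum_{p in group} w(p) * dist(p, X)^z.

    `groups` holds indices into `points` (hypothetical representation);
    a point may appear in several groups.
    """

    def dist_to_centers(p: P) -> float:
        # Distance from p to its nearest center in X.
        return min(dist(p, c) for c in centers)

    return max(
        sum(weights[i] * dist_to_centers(points[i]) ** z for i in group)
        for group in groups
    )
```

With a single group containing all points, the value reduces to the weighted $k$-Median cost for $z=1$ and the weighted $k$-Means cost for $z=2$.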
Submitted 12 May, 2023;
originally announced May 2023.
-
Parameterized Approximation Schemes for Clustering with General Norm Objectives
Authors:
Fateme Abbasi,
Sandip Banerjee,
Jarosław Byrka,
Parinya Chalermsook,
Ameet Gadekar,
Kamyar Khodamoradi,
Dániel Marx,
Roohani Sharma,
Joachim Spoerhase
Abstract:
This paper considers the well-studied algorithmic regime of designing a $(1+ε)$-approximation algorithm for a $k$-clustering problem that runs in time $f(k,ε)\,\mathrm{poly}(n)$ (sometimes called an efficient parameterized approximation scheme or EPAS for short). Notable results of this kind include EPASes in the high-dimensional Euclidean setting for $k$-center [Bădoiu, Har-Peled, Indyk; STOC'02] as well as $k$-median and $k$-means [Kumar, Sabharwal, Sen; J. ACM 2010]. However, existing EPASes handle only basic objectives (such as $k$-center, $k$-median, and $k$-means) and are tailored to the specific objective and metric space.
Our main contribution is a clean and simple EPAS that settles more than ten clustering problems (across multiple well-studied objectives as well as metric spaces) and unifies well-known EPASes. Our algorithm gives EPASes for a large variety of clustering objectives (for example, $k$-means, $k$-center, $k$-median, priority $k$-center, $\ell$-centrum, ordered $k$-median, socially fair $k$-median aka robust $k$-median, or more generally monotone norm $k$-clustering) and metric spaces (for example, continuous high-dimensional Euclidean spaces, metrics of bounded doubling dimension, bounded treewidth metrics, and planar metrics).
Key to our approach is a new concept that we call bounded $ε$-scatter dimension, an intrinsic complexity measure of a metric space that is a relaxation of the standard notion of bounded doubling dimension. Our main technical result shows that two conditions are essentially sufficient for our algorithm to yield an EPAS on the input metric $M$ for any clustering objective: (i) the objective is described by a monotone (not necessarily symmetric!) norm, and (ii) the $ε$-scatter dimension of $M$ is upper bounded by a function of $ε$.
Submitted 6 April, 2023;
originally announced April 2023.
-
Approximation Algorithms for Demand Strip Packing
Authors:
Waldo Gálvez,
Fabrizio Grandoni,
Afrouz Jabal Ameli,
Kamyar Khodamoradi
Abstract:
In the Demand Strip Packing problem (DSP), we are given a time interval and a collection of tasks, each characterized by a processing time and a demand for a given resource (such as electricity, computational power, etc.). A feasible solution consists of a schedule of the tasks within the mentioned time interval. Our goal is to minimize the peak resource consumption, i.e. the maximum total demand of tasks executed at any point in time.
It is known that DSP is NP-hard to approximate below a factor of $3/2$, and standard techniques for related problems imply a (polynomial-time) $2$-approximation. Our main result is a $(5/3+ε)$-approximation algorithm for any constant $ε>0$. We also achieve best-possible approximation factors for some relevant special cases.
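To make the DSP objective concrete, the sketch below computes the peak resource consumption of a given schedule with a sweep over start and end events. It only evaluates a schedule and is not the approximation algorithm from the paper. The tuple layout `(start, processing_time, demand)`, the treatment of each task as occupying the half-open interval `[start, start + processing_time)`, and the names `is_feasible` and `peak_demand` are assumptions for illustration.

```python
from typing import List, Tuple

Task = Tuple[float, float, float]  # (start, processing_time, demand)


def is_feasible(tasks: List[Task], horizon: float) -> bool:
    """Check that every task runs entirely inside the interval [0, horizon]."""
    return all(start >= 0 and start + length <= horizon
               for start, length, _ in tasks)


def peak_demand(tasks: List[Task]) -> float:
    """Maximum total demand of tasks running at any point in time."""
    events = []
    for start, length, demand in tasks:
        events.append((start, demand))            # task begins: load rises
        events.append((start + length, -demand))  # task ends: load drops
    # At equal times, negative deltas sort first, so a task ending at time t
    # does not overlap with a task starting at time t.
    events.sort()
    load = peak = 0
    for _, delta in events:
        load += delta
        peak = max(peak, load)
    return peak
```

For example, the schedule `[(0, 2, 3), (2, 2, 4), (1, 2, 2)]` has peak demand 6, reached while the second and third tasks overlap.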
Submitted 19 May, 2021; v1 submitted 18 May, 2021;
originally announced May 2021.
-
Exact Algorithms and Lower Bounds for Stable Instances of Euclidean k-Means
Authors:
Zachary Friggstad,
Kamyar Khodamoradi,
Mohammad R. Salavatipour
Abstract:
We investigate the complexity of solving stable or perturbation-resilient instances of $k$-Means and $k$-Median clustering in fixed dimension Euclidean metrics (more generally doubling metrics). The notion of stable (perturbation resilient) instances was introduced by Bilu and Linial [2010] and Awasthi et al. [2012]. In our context we say a $k$-Means instance is $α$-stable if there is a unique OPT which remains optimum if distances are (non-uniformly) stretched by a factor of at most $α$. Stable clustering instances have been studied to explain why heuristics such as Lloyd's algorithm perform well in practice. In this work we show that for any fixed $ε>0$, $(1+ε)$-stable instances of $k$-Means in doubling metrics can be solved in polynomial time. More precisely, we show that a natural multiswap local search algorithm finds OPT for $(1+ε)$-stable instances of $k$-Means and $k$-Median in a polynomial number of iterations. We complement this result by showing that under a new PCP theorem, this is essentially tight: when the dimension $d$ is part of the input, there is a fixed $ε_0>0$ s.t. there is not even a PTAS for $(1+ε_0)$-stable $k$-Means in $\mathbb{R}^d$ unless NP=RP. To do this, we consider a robust property of CSPs; call an instance stable if there is a unique optimum solution $x^*$ and for any other solution $x'$, the number of unsatisfied clauses is proportional to the Hamming distance between $x^*$ and $x'$. Dinur et al. have already shown that stable QSAT is hard to approximate for some constant Q; our hypothesis is simply that stable QSAT with bounded variable occurrence is also hard. Given this hypothesis, we consider "stability-preserving" reductions to prove our hardness for stable $k$-Means. Such reductions seem to be more fragile than standard L-reductions and may be of further use to demonstrate that other stable optimization problems are hard.
Submitted 30 January, 2024; v1 submitted 14 July, 2018;
originally announced July 2018.
-
Approximation Schemes for Clustering with Outliers
Authors:
Zachary Friggstad,
Kamyar Khodamoradi,
Mohsen Rezapour,
Mohammad R. Salavatipour
Abstract:
Clustering problems are well-studied in a variety of fields such as data science, operations research, and computer science. Such problems include variants of centre location problems, $k$-median, and $k$-means to name a few. In some cases, not all data points need to be clustered; some may be discarded for various reasons.
We study clustering problems with outliers. More specifically, we look at Uncapacitated Facility Location (UFL), $k$-Median, and $k$-Means. In UFL with outliers, we have to open some centres, discard up to $z$ points of $\cal X$ and assign every other point to the nearest open centre, minimizing the total assignment cost plus centre opening costs. In $k$-Median and $k$-Means, we have to open up to $k$ centres but there are no opening costs. In $k$-Means, the cost of assigning $j$ to $i$ is $δ^2(j,i)$. We present several results. Our main focus is on cases where $δ$ is a doubling metric or is the shortest-path metric of a graph from a minor-closed family of graphs. For uniform-cost UFL with outliers on such metrics we show that a simple multiswap local search heuristic yields a PTAS. With a bit more work, we extend this to bicriteria approximations for the $k$-Median and $k$-Means problems in the same metrics where, for any constant $ε> 0$, we can find a solution using $(1+ε)k$ centres whose cost is at most a $(1+ε)$-factor of the optimum and uses at most $z$ outliers. We also show that natural local search heuristics that do not violate the number of clusters and outliers for $k$-Median (or $k$-Means) will have unbounded gap even in Euclidean metrics. Furthermore, we show how our analysis can be extended to general metrics for $k$-Means with outliers to obtain a $(25+ε,1+ε)$ bicriteria.
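For a fixed set of open centres, the best choice of outliers is simply the $z$ points farthest from their nearest centre, so the $k$-Median or $k$-Means cost with outliers is easy to evaluate. The sketch below shows that evaluation only; the function name `cost_with_outliers` and the `power` parameter (1 for $k$-Median, 2 for $k$-Means) are assumptions for illustration, and opening costs as in UFL are not modelled.

```python
from typing import Callable, Sequence, TypeVar

P = TypeVar("P")


def cost_with_outliers(
    points: Sequence[P],
    centres: Sequence[P],
    dist: Callable[[P, P], float],
    z: int,
    power: int = 1,
) -> float:
    """Assignment cost after discarding the z most expensive points.

    power=1 gives the k-Median cost, power=2 the k-Means cost.
    """
    # Cost of serving each point from its nearest open centre.
    costs = sorted(min(dist(p, c) for c in centres) ** power for p in points)
    # Keep the cheapest len(points) - z points; the rest are the outliers.
    kept = max(len(costs) - z, 0)
    return sum(costs[:kept])
```

On the line with points `[0, 1, 2, 100]` and a single centre at `1`, discarding one outlier lowers the $k$-Median cost from 101 to 2.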
Submitted 13 July, 2017;
originally announced July 2017.
-
PTAS for Ordered Instances of Resource Allocation Problems with Restrictions on Inclusions
Authors:
Kamyar Khodamoradi,
Ramesh Krishnamurti,
Arash Rafiey,
Georgios Stamoulis
Abstract:
We consider the problem of allocating a set $I$ of $m$ indivisible resources (items) to a set $P$ of $n$ customers (players) competing for the resources. Each resource $j \in I$ has the same value $v_j > 0$ for a subset of customers interested in $j$, and zero value for the remaining customers. The utility received by each customer is the sum of the values of the resources allocated to her. The goal is to find a feasible allocation of the resources to the interested customers such that for the Max-Min allocation problem (Min-Max allocation problem) the minimum of the utilities (maximum of the utilities) received by the customers is maximized (minimized). The Max-Min allocation problem is also known as the \textit{Fair Allocation problem}, or the \textit{Santa Claus problem}. The Min-Max allocation problem is the problem of Scheduling on Unrelated Parallel Machines, and is also known as the $R \, | \, | C_{\max}$ problem.
In this paper, we are interested in instances of the problem that admit a Polynomial Time Approximation Scheme (PTAS). We show that an ordering property on the resources and the customers is important and paves the way for a PTAS. For the Max-Min allocation problem, we start with instances of the problem that can be viewed as a \textit{convex bipartite graph}; a bipartite graph for which there exists an ordering of the resources such that each customer is interested in (has a positive evaluation for) a set of \textit{consecutive} resources. We demonstrate a PTAS for the inclusion-free cases. This class of instances is equivalent to the class of bipartite permutation graphs. For the Min-Max allocation problem, we also obtain a PTAS for inclusion-free instances. These instances are not only of theoretical interest but also have practical applications.
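The two objectives and the consecutiveness condition described above can be checked on small instances with a few lines of code. This is a sketch for evaluating a given allocation and a given ordering, not the PTAS from the paper; the names `utilities` and `is_convex_under_order`, and the encoding of interest as one set of customer indices per resource, are assumptions for illustration.

```python
from typing import Dict, List, Sequence, Set


def utilities(
    values: Sequence[float],
    interest: Sequence[Set[int]],
    allocation: Dict[int, int],
    n_customers: int,
) -> List[float]:
    """Utility of each customer; allocation maps resource index -> customer index."""
    util = [0] * n_customers
    for resource, customer in allocation.items():
        if customer not in interest[resource]:
            raise ValueError("resource given to a customer not interested in it")
        util[customer] += values[resource]
    return util


def is_convex_under_order(
    interest: Sequence[Set[int]], order: Sequence[int], n_customers: int
) -> bool:
    """True if, in the given resource order, each customer's resources are consecutive."""
    for customer in range(n_customers):
        positions = [pos for pos, resource in enumerate(order)
                     if customer in interest[resource]]
        # Consecutive positions span exactly len(positions) slots.
        if positions and positions[-1] - positions[0] + 1 != len(positions):
            return False
    return True
```

Given `u = utilities(...)`, the Max-Min objective is `min(u)` (to be maximized) and the Min-Max objective is `max(u)` (to be minimized).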
Submitted 11 October, 2016; v1 submitted 30 September, 2016;
originally announced October 2016.