Search | arXiv e-print repository

A Polynomial-Time Approximation for Pairwise Fair $k$-Median Clustering

Authors: Sayan Bandyapadhyay, Eden Chlamtáč, Yury Makarychev, Ali Vakilian

Abstract: In this work, we study pairwise fair clustering with $\ell \ge 2$ groups, where for every cluster $C$ and every group $i \in [\ell]$, the number of points in $C$ from group $i$ must be at most $t$ times the number of points in $C$ from any other group $j \in [\ell]$, for a given integer $t$. To the best of our knowledge, only bi-criteria approximation and exponential-time algorithms follow for this problem from the prior work on fair clustering problems when $\ell > 2$. In our work, focusing on the $\ell > 2$ case, we design the first polynomial-time $(t^{\ell}\cdot \ell\cdot k)^{O(\ell)}$-approximation for this problem with $k$-median cost that does not violate the fairness constraints. We complement our algorithmic result by providing hardness of approximation results, which show that our problem even when $\ell=2$ is almost as hard as the popular uniform capacitated $k$-median, for which no polynomial-time algorithm with an approximation factor of $o(\log k)$ is known.

Submitted 16 May, 2024; originally announced May 2024.
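
The pairwise fairness condition stated in the abstract above can be checked directly for a given clustering. The following is only a minimal sketch of that feasibility check, not the paper's approximation algorithm; the function name `is_pairwise_fair` and the input encoding (each cluster as a list of integer group labels) are assumptions made for illustration.

```python
from collections import Counter


def is_pairwise_fair(clusters, num_groups, t):
    """Check the pairwise fairness condition for every cluster.

    clusters   -- list of clusters, each a list of group labels in range(num_groups)
                  (hypothetical encoding chosen for this sketch)
    num_groups -- the number of groups (ell in the abstract)
    t          -- the fairness parameter

    A cluster is fair if, for every pair of groups i and j, the count of
    group i is at most t times the count of group j. This holds for all
    pairs exactly when the largest count is at most t times the smallest.
    """
    for cluster in clusters:
        counts = Counter(cluster)
        sizes = [counts.get(g, 0) for g in range(num_groups)]
        if max(sizes) > t * min(sizes):
            return False
    return True
```

Under this reading of the definition, a non-empty cluster that misses some group entirely is never fair, because the smallest group count is zero.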

arXiv:2402.15837 [pdf, other]

An $O(n \log n)$-Time Approximation Scheme for Geometric Many-to-Many Matching

Authors: Sayan Bandyapadhyay, Jie Xue

Abstract: Geometric matching is an important topic in computational geometry and has been extensively studied over decades. In this paper, we study a geometric-matching problem, known as geometric many-to-many matching. In this problem, the input is a set $S$ of $n$ colored points in $\mathbb{R}^d$, which implicitly defines a graph $G = (S,E(S))$ where $E(S) = \{(p,q): p,q \in S \text{ have different colors}\}$, and the goal is to compute a minimum-cost subset $E^* \subseteq E(S)$ of edges that cover all points in $S$. Here the cost of $E^*$ is the sum of the costs of all edges in $E^*$, where the cost of a single edge $e$ is the Euclidean distance (or more generally, the $L_p$-distance) between the two endpoints of $e$. Our main result is a $(1+\varepsilon)$-approximation algorithm with an optimal running time $O_\varepsilon(n \log n)$ for geometric many-to-many matching in any fixed dimension, which works under any $L_p$-norm. This is the first near-linear approximation scheme for the problem in any $d \geq 2$. Prior to this work, only the bipartite case of geometric many-to-many matching was considered in $\mathbb{R}^1$ and $\mathbb{R}^2$, and the best known approximation scheme in $\mathbb{R}^2$ takes $O_\varepsilon(n^{1.5} \cdot \mathsf{poly}(\log n))$ time.

Submitted 3 March, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

Comments: In SoCG'24

arXiv:2312.01589 [pdf, other]

Euclidean Bottleneck Steiner Tree is Fixed-Parameter Tractable

Authors: Sayan Bandyapadhyay, William Lochet, Daniel Lokshtanov, Saket Saurabh, Jie Xue

Abstract: In the Euclidean Bottleneck Steiner Tree problem, the input consists of a set of $n$ points in $\mathbb{R}^2$ called terminals and a parameter $k$, and the goal is to compute a Steiner tree that spans all the terminals and contains at most $k$ points of $\mathbb{R}^2$ as Steiner points such that the maximum edge-length of the Steiner tree is minimized, where the length of a tree edge is the Euclidean distance between its two endpoints. The problem is well-studied and is known to be NP-hard. In this paper, we give a $k^{O(k)} n^{O(1)}$-time algorithm for Euclidean Bottleneck Steiner Tree, which implies that the problem is fixed-parameter tractable (FPT). This settles an open question explicitly asked by Bae et al. [Algorithmica, 2011], who showed that the $\ell_1$ and $\ell_{\infty}$ variants of the problem are FPT. Our approach can be generalized to the problem with $\ell_p$ metric for any rational $1 \le p \le \infty$, or even other metrics on $\mathbb{R}^2$.

Submitted 3 December, 2023; originally announced December 2023.

Comments: In SODA'24

arXiv:2308.15842 [pdf, other]

On Colorful Vertex and Edge Cover Problems

Authors: Sayan Bandyapadhyay, Aritra Banik, Sujoy Bhore

Abstract: In this paper, we study two generalizations of Vertex Cover and Edge Cover, namely Colorful Vertex Cover and Colorful Edge Cover. In the Colorful Vertex Cover problem, given an $n$-vertex edge-colored graph $G$ with colors from $\{1, \ldots, ω\}$ and coverage requirements $r_1, r_2, \ldots, r_ω$, the goal is to find a minimum-sized set of vertices that are incident on at least $r_i$ edges of color $i$, for each $1 \le i \le ω$, i.e., we need to cover at least $r_i$ edges of color $i$. Colorful Edge Cover is similar to Colorful Vertex Cover, except here we are given a vertex-colored graph and the goal is to cover at least $r_i$ vertices of color $i$, for each $1 \le i \le ω$, by a minimum-sized set of edges. These problems have several applications in fair covering and hitting of geometric set systems involving points and lines that are divided into multiple groups. Here, fairness ensures that the coverage (resp. hitting) requirement of every group is fully satisfied. We obtain a $(2+ε)$-approximation for the Colorful Vertex Cover problem in time $n^{O(ω/ε)}$. Thus, for a constant number of colors, the problem admits a $(2+ε)$-approximation in polynomial time. Next, for the Colorful Edge Cover problem, we design an $O(ωn^3)$ time exact algorithm, via a chain of reductions to a matching problem. For all intermediate problems in this chain of reductions, we design polynomial-time algorithms, which might be of independent interest.

Submitted 30 August, 2023; originally announced August 2023.

arXiv:2305.03985 [pdf, other]

Minimum-Membership Geometric Set Cover, Revisited

Authors: Sayan Bandyapadhyay, William Lochet, Saket Saurabh, Jie Xue

Abstract: We revisit a natural variant of geometric set cover, called minimum-membership geometric set cover (MMGSC). In this problem, the input consists of a set $S$ of points and a set $\mathcal{R}$ of geometric objects, and the goal is to find a subset $\mathcal{R}^*\subseteq\mathcal{R}$ to cover all points in $S$ such that the membership of $S$ with respect to $\mathcal{R}^*$, denoted by $\mathsf{memb}(S,\mathcal{R}^*)$, is minimized, where $\mathsf{memb}(S,\mathcal{R}^*)=\max_{p\in S}|\{R\in\mathcal{R}^*: p\in R\}|$. We achieve the following two main results. * We give the first polynomial-time constant-approximation algorithm for MMGSC with unit squares. This answers a question left open since the work of Erlebach and Leeuwen [SODA'08], who gave a constant-approximation algorithm with running time $n^{O(\mathsf{opt})}$ where $\mathsf{opt}$ is the optimum of the problem (i.e., the minimum membership). * We give the first polynomial-time approximation scheme (PTAS) for MMGSC with halfplanes. Prior to this work, it was even unknown whether the problem can be approximated with a factor of $o(\log n)$ in polynomial time, while it is well-known that the minimum-size set cover problem with halfplanes can be solved in polynomial time.
We also consider a problem closely related to MMGSC, called minimum-ply geometric set cover (MPGSC), in which the goal is to find $\mathcal{R}^*\subseteq\mathcal{R}$ to cover $S$ such that the ply of $\mathcal{R}^*$ is minimized, where the ply is defined as the maximum number of objects in $\mathcal{R}^*$ which have a nonempty common intersection. Very recently, Durocher et al. gave the first constant-approximation algorithm for MPGSC with unit squares which runs in $O(n^{12})$ time. We give a significantly simpler constant-approximation algorithm with near-linear running time.

Submitted 6 May, 2023; originally announced May 2023.

Comments: In SoCG'23
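
The membership objective $\mathsf{memb}(S,\mathcal{R}^*)$ defined in the abstract above can be evaluated by brute force for a candidate cover. The sketch below only evaluates the objective and is not the paper's algorithm; it assumes, for illustration, closed axis-aligned unit squares encoded by their lower-left corners, and the function name `membership` is hypothetical.

```python
def membership(points, squares, side=1.0):
    """Return max over points p of the number of squares containing p.

    points  -- list of (x, y) tuples (the set S)
    squares -- list of (x, y) lower-left corners of closed axis-aligned
               squares with the given side length (assumed encoding)
    """
    def contains(corner, p):
        cx, cy = corner
        px, py = p
        return cx <= px <= cx + side and cy <= py <= cy + side

    # Count, for each point, how many chosen squares contain it.
    return max(
        (sum(1 for sq in squares if contains(sq, p)) for p in points),
        default=0,
    )
```

For example, with points `(0.5, 0.5)` and `(1.5, 0.5)` and squares at corners `(0, 0)`, `(0.2, 0.2)`, and `(1, 0)`, the first point lies in two squares and the second in one, so the membership is 2.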

arXiv:2303.07923 [pdf, other]

FPT Constant-Approximations for Capacitated Clustering to Minimize the Sum of Cluster Radii

Authors: Sayan Bandyapadhyay, William Lochet, Saket Saurabh

Abstract: Clustering with capacity constraints is a fundamental problem that attracted significant attention throughout the years. In this paper, we give the first FPT constant-factor approximation algorithm for the problem of clustering points in a general metric into $k$ clusters to minimize the sum of cluster radii, subject to non-uniform hard capacity constraints. In particular, we give a $(15+ε)$-approximation algorithm that runs in $2^{O(k^2\log k)}\cdot n^3$ time. When capacities are uniform, we obtain the following improved approximation bounds: A (4 + $ε$)-approximation with running time $2^{O(k\log(k/ε))}n^3$, which significantly improves over the FPT 28-approximation of Inamdar and Varadarajan [ESA 2020]; a (2 + $ε$)-approximation with running time $2^{O(k/ε^2 \cdot\log(k/ε))}dn^3$ and a $(1+ε)$-approximation with running time $2^{O(kd\log ((k/ε)))}n^{3}$ in the Euclidean space; and a (1 + $ε$)-approximation in the Euclidean space with running time $2^{O(k/ε^2 \cdot\log(k/ε))}dn^3$ if we are allowed to violate the capacities by (1 + $ε$)-factor. We complement this result by showing that there is no (1 + $ε$)-approximation algorithm running in time $f(k)\cdot n^{O(1)}$, if any capacity violation is not allowed.

Submitted 20 February, 2024; v1 submitted 14 March, 2023; originally announced March 2023.

Comments: Updated version: fix an error in the proof of Lemma 2.5

arXiv:2303.01400 [pdf, other]

Coresets for Clustering in Geometric Intersection Graphs

Authors: Sayan Bandyapadhyay, Fedor V. Fomin, Tanmay Inamdar

Abstract: Designing coresets--small-space sketches of the data preserving cost of the solutions within $(1\pm ε)$-approximate factor--is an important research direction in the study of center-based $k$-clustering problems, such as $k$-means or $k$-median. Feldman and Langberg [STOC'11] have shown that for $k$-clustering of $n$ points in general metrics, it is possible to obtain coresets whose size depends logarithmically in $n$. Moreover, such a dependency in $n$ is inevitable in general metrics. A significant amount of recent work in the area is devoted to obtaining coresets whose sizes are independent of $n$ (i.e., "small" coresets) for special metrics, like $d$-dimensional Euclidean spaces, doubling metrics, metrics of graphs of bounded treewidth, or those excluding a fixed minor. In this paper, we provide the first constructions of small coresets for $k$-clustering in the metrics induced by geometric intersection graphs, such as Euclidean-weighted Unit Disk/Square Graphs. These constructions follow from a general theorem that identifies two canonical properties of a graph metric sufficient for obtaining small coresets. The proof of our theorem builds on the recent work of Cohen-Addad, Saulpic, and Schwiegelshohn [STOC '21], which ensures small-sized coresets conditioned on the existence of an interesting set of centers, called "centroid set". The main technical contribution of our work is the proof of the existence of such a small-sized centroid set for graphs that satisfy the two canonical geometric properties.
The new coreset construction helps to design the first $(1+ε)$-approximation for center-based clustering problems in UDGs and USGs, that is fixed-parameter tractable in $k$ and $ε$ (FPT-AS).

Submitted 2 March, 2023; originally announced March 2023.

Comments: Full version of a paper accepted to SoCG 2023. Abstract shortened to meet the arXiv character limit

arXiv:2301.03862 [pdf, other]

Proportionally Fair Matching with Multiple Groups

Authors: Sayan Bandyapadhyay, Fedor V. Fomin, Tanmay Inamdar, Kirill Simonov

Abstract: The study of fair algorithms has become mainstream in machine learning and artificial intelligence due to its increasing demand in dealing with biases and discrimination. Along this line, researchers have considered fair versions of traditional optimization problems including clustering, regression, ranking and voting. However, most of the efforts have been channeled into designing heuristic algorithms, which often do not provide any guarantees on the quality of the solution. In this work, we study matching problems with the notion of proportional fairness. Proportional fairness is one of the most popular notions of group fairness where every group is represented up to an extent proportional to the final selection size. Matching with proportional fairness or more commonly, proportionally fair matching, was introduced in [Chierichetti et al., AISTATS, 2019], where the problem was studied with only two groups. However, in many practical applications, the number of groups -- although often a small constant -- is larger than two. In this work, we make the first step towards understanding the computational complexity of proportionally fair matching with more than two groups. We design exact and approximation algorithms achieving reasonable guarantees on the quality of the matching as well as on the time complexity. Our algorithms are also supported by suitable hardness bounds.

Submitted 10 January, 2023; originally announced January 2023.

arXiv:2112.10195 [pdf, ps, other]

Parameterized Approximation Algorithms for $k$-Center Clustering and Variants

Authors: Sayan Bandyapadhyay, Zachary Friggstad, Ramin Mousavi

Abstract: $k$-center is one of the most popular clustering models. While it admits a simple 2-approximation in polynomial time in general metrics, the Euclidean version is NP-hard to approximate within a factor of 1.93, even in the plane, if one insists the dependence on $k$ in the running time be polynomial. Without this restriction, a classic algorithm yields a $2^{O((k\log k)/ε)}dn$-time $(1+ε)$-approximation for Euclidean $k$-center, where $d$ is the dimension. We give a faster algorithm for small dimensions: roughly speaking an $O^*(2^{O((1/ε)^{O(d)} \cdot k^{1-1/d} \cdot \log k)})$-time $(1+ε)$-approximation. In particular, the running time is roughly $O^*(2^{O((1/ε)^{O(1)}\sqrt{k}\log k)})$ in the plane. We complement our algorithmic result with a matching hardness lower bound. We also consider a well-studied generalization of $k$-center, called Non-uniform $k$-center (NUkC), where we allow different radii clusters. NUkC is NP-hard to approximate within any factor, even in the Euclidean case. We design a $2^{O(k\log k)}n^2$ time $3$-approximation for NUkC in general metrics, and a $2^{O((k\log k)/ε)}dn$ time $(1+ε)$-approximation for Euclidean NUkC. The latter time bound matches the bound for $k$-center.

Submitted 19 December, 2021; originally announced December 2021.

Comments: A preliminary version appears in AAAI 2022

arXiv:2112.06580 [pdf, other]

How to Find a Good Explanation for Clustering?

Authors: Sayan Bandyapadhyay, Fedor V. Fomin, Petr A. Golovach, William Lochet, Nidhi Purohit, Kirill Simonov

Abstract: $k$-means and $k$-median clustering are powerful unsupervised machine learning techniques. However, due to complicated dependences on all the features, it is challenging to interpret the resulting cluster assignments. Moshkovitz, Dasgupta, Rashtchian, and Frost [ICML 2020] proposed an elegant model of explainable $k$-means and $k$-median clustering. In this model, a decision tree with $k$ leaves provides a straightforward characterization of the data set into clusters. We study two natural algorithmic questions about explainable clustering. (1) For a given clustering, how to find the "best explanation" by using a decision tree with $k$ leaves? (2) For a given set of points, how to find a decision tree with $k$ leaves minimizing the $k$-means/median objective of the resulting explainable clustering? To address the first question, we introduce a new model of explainable clustering. Our model, inspired by the notion of outliers in robust statistics, is the following. We are seeking a small number of points (outliers) whose removal makes the existing clustering well-explainable. For addressing the second question, we initiate the study of the model of Moshkovitz et al. from the perspective of multivariate complexity. Our rigorous algorithmic analysis sheds some light on the influence of parameters like the input size, dimension of the data, the number of outliers, the number of clusters, and the approximation ratio, on the computational complexity of explainable clustering.

Submitted 16 December, 2021; v1 submitted 13 December, 2021; originally announced December 2021.

arXiv:2111.14196 [pdf, other]

Subexponential Parameterized Algorithms for Cut and Cycle Hitting Problems on H-Minor-Free Graphs

Authors: Sayan Bandyapadhyay, William Lochet, Daniel Lokshtanov, Saket Saurabh, Jie Xue

Abstract: We design the first subexponential-time (parameterized) algorithms for several cut and cycle-hitting problems on $H$-minor free graphs. In particular, we obtain the following results (where $k$ is the solution-size parameter). 1. $2^{O(\sqrt{k}\log k)} \cdot n^{O(1)}$ time algorithms for Edge Bipartization and Odd Cycle Transversal; 2. a $2^{O(\sqrt{k}\log^4 k)} \cdot n^{O(1)}$ time algorithm for Edge Multiway Cut and a $2^{O(r \sqrt{k} \log k)} \cdot n^{O(1)}$ time algorithm for Vertex Multiway Cut, where $r$ is the number of terminals to be separated; 3. a $2^{O((r+\sqrt{k})\log^4 (rk))} \cdot n^{O(1)}$ time algorithm for Edge Multicut and a $2^{O((\sqrt{rk}+r) \log (rk))} \cdot n^{O(1)}$ time algorithm for Vertex Multicut, where $r$ is the number of terminal pairs to be separated; 4. a $2^{O(\sqrt{k} \log g \log^4 k)} \cdot n^{O(1)}$ time algorithm for Group Feedback Edge Set and a $2^{O(g \sqrt{k}\log(gk))} \cdot n^{O(1)}$ time algorithm for Group Feedback Vertex Set, where $g$ is the size of the group. 5. In addition, our approach also gives $n^{O(\sqrt{k})}$ time algorithms for all above problems with the exception of $n^{O(r+\sqrt{k})}$ time for Edge/Vertex Multicut and $(ng)^{O(\sqrt{k})}$ time for Group Feedback Edge/Vertex Set. We obtain our results by giving a new decomposition theorem on graphs of bounded genus, or more generally, an $h$-almost-embeddable graph for any fixed constant $h$. In particular we show the following. Let $G$ be an $h$-almost-embeddable graph for a constant $h$.
Then for every $p\in\mathbb{N}$, there exist disjoint sets $Z_1,\dots,Z_p \subseteq V(G)$ such that for every $i \in \{1,\dots,p\}$ and every $Z'\subseteq Z_i$, the treewidth of $G/(Z_i\backslash Z')$ is $O(p+|Z'|)$. Here $G/(Z_i\backslash Z')$ is the graph obtained from $G$ by contracting edges with both endpoints in $Z_i \backslash Z'$.

Submitted 4 July, 2022; v1 submitted 28 November, 2021; originally announced November 2021.

Comments: A preliminary version appears in SODA'22

arXiv:2109.07524 [pdf, other]

Exact and Approximation Algorithms for Many-To-Many Point Matching in the Plane

Authors: Sayan Bandyapadhyay, Anil Maheshwari, Michiel Smid

Abstract: Given two sets $S$ and $T$ of points in the plane, of total size $n$, a many-to-many matching between $S$ and $T$ is a set of pairs $(p,q)$ such that $p\in S$, $q\in T$ and for each $r\in S\cup T$, $r$ appears in at least one such pair. The cost of a pair $(p,q)$ is the (Euclidean) distance between $p$ and $q$. In the minimum-cost many-to-many matching problem, the goal is to compute a many-to-many matching such that the sum of the costs of the pairs is minimized. This problem is a restricted version of minimum-weight edge cover in a bipartite graph, and hence can be solved in $O(n^3)$ time. In a more restricted setting where all the points are on a line, the problem can be solved in $O(n\log n)$ time [Colannino, Damian, Hurtado, Langerman, Meijer, Ramaswami, Souvaine, Toussaint; Graphs Comb., 2007]. However, no progress has been made in the general planar case in improving the cubic time bound. In this paper, we obtain an $O(n^2\cdot poly(\log n))$ time exact algorithm and an $O( n^{3/2}\cdot poly(\log n))$ time $(1+ε)$-approximation in the planar case. Our results affirmatively address an open problem posed in [Colannino et al., Graphs Comb., 2007].

Submitted 15 September, 2021; originally announced September 2021.

Comments: Accepted at ISAAC 2021

arXiv:2107.09481 [pdf, other]

FPT Approximation for Fair Minimum-Load Clustering

Authors: Sayan Bandyapadhyay, Fedor V. Fomin, Petr A. Golovach, Nidhi Purohit, Kirill Simonov

Abstract: In this paper, we consider the Minimum-Load $k$-Clustering/Facility Location (MLkC) problem where we are given a set $P$ of $n$ points in a metric space that we have to cluster and an integer $k$ that denotes the number of clusters. Additionally, we are given a set $F$ of cluster centers in the same metric space. The goal is to select a set $C\subseteq F$ of $k$ centers and assign each point in $P$ to a center in $C$, such that the maximum load over all centers is minimized. Here the load of a center is the sum of the distances between it and the points assigned to it. Although clustering/facility location problems have a rich literature, the minimum-load objective is not studied substantially, and hence MLkC has remained a poorly understood problem. More interestingly, the problem is notoriously hard even in some special cases including the one in line metrics as shown by Ahmadian et al. [ACM Trans. Algo. 2018]. They also show APX-hardness of the problem in the plane. On the other hand, the best-known approximation factor for MLkC is $O(k)$, even in the plane. In this work, we study a fair version of MLkC inspired by the work of Chierichetti et al. [NeurIPS, 2017], which generalizes MLkC. Here the input points are colored by one of the $\ell$ colors denoting the group they belong to. MLkC is the special case with $\ell=1$. Considering this problem, we are able to obtain a $3$-approximation in $f(k,\ell)\cdot n^{O(1)}$ time.
Also, our scheme leads to an improved $(1 + ε)$-approximation in case of Euclidean norm, and in this case, the running time depends only polynomially on the dimension $d$. Our results imply the same approximations for MLkC with running time $f(k)\cdot n^{O(1)}$, achieving the first constant approximations for this problem in general and Euclidean metric spaces.

Submitted 20 July, 2021; originally announced July 2021.
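
The minimum-load objective described in the abstract above (the load of a center is the sum of distances to its assigned points, and the objective is the maximum load) can be evaluated for a fixed assignment as follows. This is only a sketch of the objective in the Euclidean setting, not the paper's FPT approximation; the function name `max_load` and the index-based assignment encoding are assumptions for illustration.

```python
import math


def max_load(points, centers, assignment):
    """Return the maximum load over all centers for a given assignment.

    points     -- list of coordinate tuples (the set P)
    centers    -- list of coordinate tuples (the chosen set C)
    assignment -- assignment[i] is the index into centers of the center
                  serving points[i] (hypothetical encoding)
    """
    loads = [0.0] * len(centers)
    for p, c in zip(points, assignment):
        # The load of a center is the sum of distances to its assigned points.
        loads[c] += math.dist(p, centers[c])
    return max(loads, default=0.0)
```

For instance, with points `(0, 0)`, `(3, 4)`, `(10, 0)`, centers `(0, 0)` and `(10, 0)`, and the first two points assigned to the first center, the loads are 5 and 0, so the objective value is 5.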

arXiv:2107.07383 [pdf, other]

Lossy Kernelization of Same-Size Clustering

Authors: Sayan Bandyapadhyay, Fedor V. Fomin, Petr A. Golovach, Nidhi Purohit, Kirill Simonov

Abstract: In this work, we study the $k$-median clustering problem with an additional equal-size constraint on the clusters, from the perspective of parameterized preprocessing. Our main result is the first lossy ($2$-approximate) polynomial kernel for this problem, parameterized by the cost of clustering. We complement this result by establishing lower bounds for the problem that eliminate the existence of an (exact) kernel of polynomial size and a PTAS.

Submitted 15 July, 2021; originally announced July 2021.

arXiv:2105.03753 [pdf, other]

Parameterized Complexity of Feature Selection for Categorical Data Clustering

Authors: Sayan Bandyapadhyay, Fedor V. Fomin, Petr A. Golovach, Kirill Simonov

Abstract: We develop new algorithmic methods with provable guarantees for feature selection in regard to categorical data clustering. While feature selection is one of the most common approaches to reduce dimensionality in practice, most of the known feature selection methods are heuristics. We study the following mathematical model. We assume that there are some inadvertent (or undesirable) features of the input data that unnecessarily increase the cost of clustering. Consequently, we want to select a subset of the original features from the data such that there is a small-cost clustering on the selected features. More precisely, for given integers $\ell$ (the number of irrelevant features) and $k$ (the number of clusters), budget $B$, and a set of $n$ categorical data points (represented by $m$-dimensional vectors whose elements belong to a finite set of values $Σ$), we want to select $m-\ell$ relevant features such that the cost of any optimal $k$-clustering on these features does not exceed $B$. Here the cost of a cluster is the sum of Hamming distances ($\ell_0$-distances) between the selected features of the elements of the cluster and its center. The clustering cost is the total sum of the costs of the clusters. We use the framework of parameterized complexity to identify how the complexity of the problem depends on parameters $k$, $B$, and $|Σ|$. Our main result is an algorithm that solves the Feature Selection problem in time $f(k,B,|Σ|)\cdot m^{g(k,|Σ|)}\cdot n^2$ for some functions $f$ and $g$.
In other words, the problem is fixed-parameter tractable parameterized by $B$ when $|Σ|$ and $k$ are constants. Our algorithm is based on a solution to a more general problem, Constrained Clustering with Outliers. We also complement our algorithmic findings with complexity lower bounds.

Submitted 19 August, 2021; v1 submitted 8 May, 2021; originally announced May 2021.

Comments: 25 pages, full version
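
The clustering cost used in the abstract above (sum of Hamming distances, restricted to the selected features, between each element and its cluster center) can be computed for a fixed clustering as sketched below. This evaluates the cost only and does not reproduce the paper's algorithm; the function name `clustering_cost` and the tuple-based encoding of categorical vectors are assumptions for illustration.

```python
def clustering_cost(clusters, centers, selected):
    """Total Hamming cost of a clustering restricted to selected features.

    clusters -- list of clusters, each a list of equal-length tuples of
                categorical values (hypothetical encoding)
    centers  -- list of center tuples, one per cluster
    selected -- iterable of feature indices that are kept
    """
    selected = list(selected)
    total = 0
    for cluster, center in zip(clusters, centers):
        for x in cluster:
            # Hamming distance between x and the center on kept features only.
            total += sum(1 for j in selected if x[j] != center[j])
    return total
```

A feature selection with budget $B$ would be acceptable for this particular clustering when the returned value is at most $B$; dropping a feature on which many elements disagree with their center lowers the cost.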

arXiv:2007.11476

Approximate Covering with Lower and Upper Bounds via LP Rounding

Authors: Sayan Bandyapadhyay, Aniket Basu Roy

Abstract: In this paper, we study the lower- and upper-bounded covering (LUC) problem, where we are given a set $P$ of $n$ points, a collection $\mathcal{B}$ of balls, and parameters $L$ and $U$. The goal is to find a minimum-sized subset $\mathcal{B}'\subseteq \mathcal{B}$ and an assignment of the points in $P$ to $\mathcal{B}'$, such that each point $p\in P$ is assigned to a ball that contains $p$ and for each ball $B_i\in \mathcal{B}'$, at least $L$ and at most $U$ points are assigned to $B_i$. We obtain an LP rounding based constant approximation for LUC by violating the lower and upper bound constraints by small constant factors and expanding the balls by again a small constant factor. Similar results were known before for covering problems with only the upper bound constraint. We also show that with only the lower bound constraint, the above result can be obtained without any lower bound violation. Covering problems have close connections with facility location problems. We note that the known constant-approximation for the corresponding lower- and upper-bounded facility location problem, violates the lower and upper bound constraints by a constant factor.

Submitted 18 September, 2020; v1 submitted 22 July, 2020; originally announced July 2020.

Comments: There is an error in the algorithm for LUC in Section 3. The proof of Lemma 5 does not hold

arXiv:2007.10137 [pdf, other]

On Coresets for Fair Clustering in Metric and Euclidean Spaces and Their Applications

Authors: Sayan Bandyapadhyay, Fedor V. Fomin, Kirill Simonov

Abstract: Fair clustering is a constrained variant of clustering where the goal is to partition a set of colored points, such that the fraction of points of any color in every cluster is more or less equal to the fraction of points of this color in the dataset. This variant was recently introduced by Chierichetti et al. [NeurIPS, 2017] in a seminal work and became widely popular in the clustering literature. In this paper, we propose a new construction of coresets for fair clustering based on random sampling. The new construction allows us to obtain the first coreset for fair clustering in general metric spaces. For Euclidean spaces, we obtain the first coreset whose size does not depend exponentially on the dimension. Our coreset results solve open questions proposed by Schmidt et al. [WAOA, 2019] and Huang et al. [NeurIPS, 2019]. The new coreset construction helps to design several new approximation and streaming algorithms. In particular, we obtain the first true constant-approximation algorithm for metric fair clustering, whose running time is fixed-parameter tractable (FPT). In the Euclidean case, we derive the first $(1+ε)$-approximation algorithm for fair clustering whose time complexity is near-linear and does not depend exponentially on the dimension of the space. Besides, our coreset construction scheme is fairly general and gives rise to coresets for a wide range of constrained clustering problems. This leads to improved constant-approximations for these problems in general metrics and near-linear time $(1+ε)$-approximations in the Euclidean metric.

Submitted 20 July, 2020; originally announced July 2020.

arXiv:2006.12454 [pdf, other]

Improved Bounds for Metric Capacitated Covering Problems

Authors: Sayan Bandyapadhyay

Abstract: In the Metric Capacitated Covering (MCC) problem, given a set of balls $\mathcal{B}$ in a metric space $P$ with metric $d$ and a capacity parameter $U$, the goal is to find a minimum sized subset $\mathcal{B}'\subseteq \mathcal{B}$ and an assignment of the points in $P$ to the balls in $\mathcal{B}'$ such that each point is assigned to a ball that contains it and each ball is assigned at most $U$ points. MCC admits an $O(\log |P|)$-approximation using a greedy algorithm. On the other hand, it is hard to approximate within a factor of $o(\log |P|)$ even with $β< 3$ factor expansion of the balls. Bandyapadhyay et al. [SoCG 2018, DCG 2019] showed that one can obtain an $O(1)$-approximation for the problem with $6.47$ factor expansion of the balls. An open question left by their work is to reduce the gap between the lower bound $3$ and the upper bound $6.47$. In this work, we show that it is possible to obtain an $O(1)$-approximation with only $4.24$ factor expansion of the balls. We also show a similar upper bound of $5$ for a more generalized version of MCC for which the best previously known bound was $9$.

Submitted 22 June, 2020; originally announced June 2020.

Comments: To appear at European Symposium on Algorithms 2020

arXiv:2004.12633 [pdf, other]

On Perturbation Resilience of Non-Uniform $k$-Center

Authors: Sayan Bandyapadhyay

Abstract: The Non-Uniform $k$-center (NUkC) problem has recently been formulated by Chakrabarty, Goyal and Krishnaswamy [ICALP, 2016] as a generalization of the classical $k$-center clustering problem. In NUkC, given a set of $n$ points $P$ in a metric space and non-negative numbers $r_1, r_2, \ldots , r_k$, the goal is to find the minimum dilation $α$ and to choose $k$ balls centered at the points of $P$ with radius $α\cdot r_i$ for $1\le i\le k$, such that all points of $P$ are contained in the union of the chosen balls. They showed that the problem is NP-hard to approximate within any factor even in tree metrics. On the other hand, they designed a "bi-criteria" constant approximation algorithm that uses a constant times $k$ balls. Surprisingly, no true approximation is known even in the special case when the $r_i$'s belong to a fixed set of size 3. In this paper, we study the NUkC problem under perturbation resilience, which was introduced by Bilu and Linial [Combinatorics, Probability and Computing, 2012]. We show that the problem under 2-perturbation resilience is polynomial time solvable when the $r_i$'s belong to a constant sized set. However, we show that perturbation resilience does not help in the general case. In particular, our findings imply that even with perturbation resilience one cannot hope to find any "good" approximation for the problem.

Submitted 27 April, 2020; originally announced April 2020.

Comments: 20 pages, 5 figures

arXiv:1911.08924 [pdf, other]

doi 10.1016/j.tcs.2021.09.035

Geometric Planar Networks on Bichromatic Points

Authors: Sayan Bandyapadhyay, Aritra Banik, Sujoy Bhore, Martin Nöllenburg

Abstract: We study three classical graph problems - Hamiltonian path, minimum spanning tree, and minimum perfect matching on geometric graphs induced by bichromatic (red and blue) points. These problems have been widely studied for points in the Euclidean plane, and many of them are NP-hard. In this work, we consider these problems for collinear points. We show that almost all of these problems can be solved in linear time in this setting.

Submitted 16 July, 2022; v1 submitted 20 November, 2019; originally announced November 2019.

Comments: Appeared in Theoretical Computer Science (TCS) 2021

arXiv:1907.08906 [pdf, other]

A Constant Approximation for Colorful k-Center

Authors: Sayan Bandyapadhyay, Tanmay Inamdar, Shreyas Pai, Kasturi Varadarajan

Abstract: In this paper, we consider the colorful $k$-center problem, which is a generalization of the well-known $k$-center problem. Here, we are given red and blue points in a metric space, and a coverage requirement for each color. The goal is to find the smallest radius $ρ$, such that with $k$ balls of radius $ρ$, the desired number of points of each color can be covered. We obtain a constant approximation for this problem in the Euclidean plane. We obtain this result by combining a "pseudo-approximation" algorithm that works in any metric space, and an approximation algorithm that works for a special class of instances in the plane. The latter algorithm uses a novel connection to a certain matching problem in graphs.

Submitted 20 July, 2019; originally announced July 2019.

Comments: 14 pages, Published in ESA 2019

arXiv:1904.13369 [pdf, other]

Constrained Orthogonal Segment Stabbing

Authors: Sayan Bandyapadhyay, Saeed Mehrabi

Abstract: Let $S$ and $D$ each be a set of orthogonal line segments in the plane. A line segment $s\in S$ stabs a line segment $s'\in D$ if $s\cap s'\neq\emptyset$. It is known that the problem of stabbing the line segments in $D$ with the minimum number of line segments of $S$ is NP-hard. However, no better than $O(\log |S\cup D|)$-approximation is known for the problem. In this paper, we introduce a constrained version of this problem in which every horizontal line segment of $S\cup D$ intersects a common vertical line. We study several versions of the problem, depending on which line segments are used for stabbing and which line segments must be stabbed. We obtain several NP-hardness and constant approximation results for these versions. Our findings imply that the problem remains NP-hard even under the extra assumption on the input, but small constant approximation algorithms can be designed.

Submitted 22 June, 2019; v1 submitted 30 April, 2019; originally announced April 2019.

Comments: to appear at CCCG 2019

arXiv:1803.06216 [pdf, other]

Approximating Dominating Set on Intersection Graphs of Rectangles and L-frames

Authors: Sayan Bandyapadhyay, Anil Maheshwari, Saeed Mehrabi, Subhash Suri

Abstract: We consider the Minimum Dominating Set (MDS) problem on the intersection graphs of geometric objects. Even for simple and widely-used geometric objects such as rectangles, no sub-logarithmic approximation is known for the problem and (perhaps surprisingly) the problem is NP-hard even when all the rectangles are "anchored" at a diagonal line with slope -1 (Pandit, CCCG 2017). In this paper, we first show that for any $ε>0$, there exists a $(2+ε)$-approximation algorithm for the MDS problem on "diagonal-anchored" rectangles, providing the first $O(1)$-approximation for the problem on a non-trivial subclass of rectangles. It is not hard to see that the MDS problem on "diagonal-anchored" rectangles is the same as the MDS problem on "diagonal-anchored" L-frames: the union of a vertical and a horizontal line segment that share an endpoint. As such, we also obtain a $(2+ε)$-approximation for the problem with "diagonal-anchored" L-frames. On the other hand, we show that the problem is APX-hard in case the input L-frames intersect the diagonal, or the horizontal segments of the L-frames intersect a vertical line. However, as we show, the problem is linear-time solvable in case the L-frames intersect a vertical as well as a horizontal line. Finally, we consider the MDS problem in the so-called "edge intersection model" and obtain a number of results, answering two questions posed by Mehrabi (WAOA 2017).

Submitted 25 June, 2018; v1 submitted 16 March, 2018; originally announced March 2018.

Comments: 19 pages, 13 figures, a preliminary version to appear in MFCS 2018

arXiv:1710.08381 [pdf, ps, other]

Near-Optimal Clustering in the $k$-machine model

Authors: Sayan Bandyapadhyay, Tanmay Inamdar, Shreyas Pai, Sriram V. Pemmaraju

Abstract: The clustering problem, in its many variants, has numerous applications in operations research and computer science (e.g., in bioinformatics, image processing, and social network analysis). As sizes of data sets have grown rapidly, researchers have focused on designing algorithms for clustering problems in models of computation suited for large-scale computation such as MapReduce, Pregel, and streaming models. The $k$-machine model (Klauck et al., SODA 2015) is a simple, message-passing model for large-scale distributed graph processing. This paper considers three of the most prominent examples of clustering problems: the uncapacitated facility location problem, the $p$-median problem, and the $p$-center problem and presents $O(1)$-factor approximation algorithms for these problems running in $\tilde{O}(n/k)$ rounds in the $k$-machine model. These algorithms are optimal up to polylogarithmic factors because this paper also shows $\tilde{Ω}(n/k)$ lower bounds for obtaining polynomial-factor approximation algorithms for these problems. These are the first results for clustering problems in the $k$-machine model. We assume that the metric provided as input for these clustering problems is only implicitly provided, as an edge-weighted graph. In a nutshell, our main technical contribution is to show that constant-factor approximation algorithms for all three clustering problems can be obtained by learning only a small portion of the input metric.

Submitted 23 October, 2017; originally announced October 2017.

arXiv:1707.05170 [pdf, other]

Capacitated Covering Problems in Geometric Spaces

Authors: Sayan Bandyapadhyay, Santanu Bhowmick, Tanmay Inamdar, Kasturi Varadarajan

Abstract: In this article, we consider the following capacitated covering problem. We are given a set $P$ of $n$ points and a set $\mathcal{B}$ of balls from some metric space, and a positive integer $U$ that represents the capacity of each of the balls in $\mathcal{B}$. We would like to compute a subset $\mathcal{B}' \subseteq \mathcal{B}$ of balls and assign each point in $P$ to some ball in $\mathcal{B}'$ that contains it, such that the number of points assigned to any ball is at most $U$. The objective function that we would like to minimize is the cardinality of $\mathcal{B}'$. We consider this problem in arbitrary metric spaces as well as Euclidean spaces of constant dimension. In the metric setting, even the uncapacitated version of the problem is hard to approximate to within a logarithmic factor. In the Euclidean setting, the best known approximation guarantee in dimensions $3$ and higher is logarithmic in the number of points. Thus we focus on obtaining "bi-criteria" approximations. In particular, we are allowed to expand the balls in our solution by some factor, but optimal solutions do not have that flexibility. Our main result is that allowing constant factor expansion of the input balls suffices to obtain constant approximations for these problems. In fact, in the Euclidean setting, only $(1+ε)$ factor expansion is sufficient for any $ε> 0$, with the approximation factor being a polynomial in $1/ε$. We obtain these results using a unified scheme for rounding the natural LP relaxation; this scheme may be useful for other capacitated covering problems. We also complement these bi-criteria approximations by obtaining hardness of approximation results that shed light on our understanding of these problems.

Submitted 12 December, 2017; v1 submitted 17 July, 2017; originally announced July 2017.

arXiv:1610.00300 [pdf, other]

Polynomial Time Algorithms for Bichromatic Problems

Authors: Sayan Bandyapadhyay, Aritra Banik

Abstract: In this article, we consider a collection of geometric problems involving points colored by two colors (red and blue), referred to as bichromatic problems. The motivation behind studying these problems is twofold: (i) these problems appear naturally and frequently in fields like machine learning and data mining, and (ii) we are interested in extending the algorithms and techniques for single point set (monochromatic) problems to the bichromatic case. For all the problems considered in this paper, we design low polynomial time exact algorithms. These algorithms are based on novel techniques which might be of independent interest.

Submitted 2 October, 2016; originally announced October 2016.

arXiv:1512.02985 [pdf, other]

On Variants of k-means Clustering

Authors: Sayan Bandyapadhyay, Kasturi Varadarajan

Abstract: Clustering problems often arise in fields like data mining and machine learning, where the task is to group a collection of objects into similar groups with respect to a similarity (or dissimilarity) measure. Among the clustering problems, $k$-means clustering in particular has received much attention from researchers. Despite the fact that $k$-means is a very well studied problem, its status in the plane is still an open problem. In particular, it is unknown whether it admits a PTAS in the plane. The best known approximation bound in polynomial time is $9+ε$. In this paper, we consider the following variant of $k$-means. Given a set $C$ of points in $\mathcal{R}^d$ and a real $f > 0$, find a finite set $F$ of points in $\mathcal{R}^d$ that minimizes the quantity $f \cdot |F|+\sum_{p\in C} \min_{q \in F} {||p-q||}^2$. For any fixed dimension $d$, we design a local search PTAS for this problem. We also give a "bi-criterion" local search algorithm for $k$-means which uses $(1+ε)k$ centers and yields a solution whose cost is at most $(1+ε)$ times the cost of an optimal $k$-means solution. The algorithm runs in polynomial time for any fixed dimension. The contribution of this paper is twofold. On the one hand, we are able to handle the square of distances in an elegant manner, which yields a near optimal approximation bound. This leads us towards a better understanding of the $k$-means problem. On the other hand, our analysis of local search might also be useful for other geometric problems. This is important considering that very little is known about the local search method for geometric approximation.

Submitted 9 December, 2015; originally announced December 2015.

Comments: 15 pages

arXiv:1507.02222 [pdf, other]

Approximate Clustering via Metric Partitioning

Authors: Sayan Bandyapadhyay, Kasturi Varadarajan

Abstract: In this paper we consider two metric covering/clustering problems - Minimum Cost Covering Problem (MCC) and $k$-clustering. In the MCC problem, we are given two point sets $X$ (clients) and $Y$ (servers), and a metric on $X \cup Y$. We would like to cover the clients by balls centered at the servers. The objective function to minimize is the sum of the $α$-th power of the radii of the balls. Here $α\geq 1$ is a parameter of the problem (but not of a problem instance). MCC is closely related to the $k$-clustering problem. The main difference between $k$-clustering and MCC is that in $k$-clustering one needs to select $k$ balls to cover the clients. For any $ε > 0$, we describe quasi-polynomial time $(1 + ε)$ approximation algorithms for both of the problems. However, in case of $k$-clustering the algorithm uses $(1 + ε)k$ balls. Prior to our work, a $3^α$ and a ${c}^α$ approximation were achieved by polynomial-time algorithms for MCC and $k$-clustering, respectively, where $c > 1$ is an absolute constant. These two problems are thus interesting examples of metric covering/clustering problems that admit $(1 + ε)$-approximation (using $(1+ε)k$ balls in case of $k$-clustering), if one is willing to settle for quasi-polynomial time. In contrast, for the variant of MCC where $α$ is part of the input, we show under standard assumptions that no polynomial time algorithm can achieve an approximation factor better than $O(\log |X|)$ for $α\geq \log |X|$.

Submitted 3 October, 2016; v1 submitted 8 July, 2015; originally announced July 2015.

Comments: 19 pages

arXiv:1502.03847 [pdf, other]

A Constant Factor Approximation for Orthogonal Order Preserving Layout Adjustment

Authors: Sayan Bandyapadhyay, Santanu Bhowmick, Kasturi Varadarajan

Abstract: Given an initial placement of a set of rectangles in the plane, we consider the problem of finding a disjoint placement of the rectangles that minimizes the area of the bounding box and preserves the orthogonal order, i.e., maintains the sorted ordering of the rectangle centers along both $x$-axis and $y$-axis with respect to the initial placement. This problem is known as Layout Adjustment for Disjoint Rectangles (LADR). It was known that LADR is $\mathbb{NP}$-hard, but only heuristics were known for it. We show that a certain decision version of LADR is $\mathbb{APX}$-hard, and give a constant factor approximation for LADR.

Submitted 23 February, 2015; v1 submitted 12 February, 2015; originally announced February 2015.

Comments: Edited Section 5, re-arranged content

ACM Class: I.3.5

arXiv:1409.0173 [pdf, other]

A Variant of the Maximum Weight Independent Set Problem

Authors: Sayan Bandyapadhyay

Abstract: We study a natural extension of the Maximum Weight Independent Set Problem (MWIS), one of the most studied optimization problems in graph algorithms. We are given a graph $G=(V,E)$, a weight function $w: V \rightarrow \mathbb{R^+}$, a budget function $b: V \rightarrow \mathbb{Z^+}$, and a positive integer $B$. The weight (resp. budget) of a subset of vertices is the sum of weights (resp. budgets) of the vertices in the subset. A $k$-budgeted independent set in $G$ is a subset of vertices, such that no pair of vertices in that subset are adjacent, and the budget of the subset is at most $k$. The goal is to find a $B$-budgeted independent set in $G$ such that its weight is maximum among all the $B$-budgeted independent sets in $G$. We refer to this problem as MWBIS. Being a generalization of MWIS, MWBIS also has several applications in scheduling, wireless networks and so on. Due to the hardness results implied from MWIS, we study the MWBIS problem in several special classes of graphs. We design exact algorithms for trees, forests, cycle graphs, and interval graphs. In the unweighted case we design an approximation algorithm for $(d+1)$-claw free graphs whose approximation ratio ($d$) is competitive with the approximation ratio ($\frac{d}{2}$) of MWIS (unweighted). Furthermore, we extend Baker's technique [Baker83] to get a PTAS for MWBIS in planar graphs.

Submitted 28 September, 2014; v1 submitted 30 August, 2014; originally announced September 2014.

Comments: 18 pages

arXiv:1407.8474 [pdf, other]

Voronoi Game on Graphs

Authors: Sayan Bandyapadhyay, Aritra Banik, Sandip Das, Hirak Sarkar

Abstract: The Voronoi game is a geometric model of the competitive facility location problem played between two players. Users are generally modeled as points uniformly distributed on a given underlying space. Each player chooses a set of points in the underlying space to place their facilities. Each user avails service from its nearest facility. The service zone of a facility consists of the set of users which are closer to it than any other facility. The payoff of each player is defined by the quantity of users served by all of its facilities. The objective of each player is to maximize their respective payoff. In this paper we consider the two-player Voronoi game where the underlying space is a road network modeled by a graph. In this framework we consider the problem of finding $k$ optimal facility locations of Player 2 given any placement of $m$ facilities by Player 1. Our main result is a dynamic programming based polynomial time algorithm for this problem on tree networks. On the other hand, we show that the problem is strongly $\mathcal{NP}$-complete for graphs. This proves that finding a winning strategy of Player 2 is $\mathcal{NP}$-complete. Consequently, we design a $(1-\frac{1}{e})$-factor approximation algorithm, where $e \approx 2.718$.

Submitted 31 July, 2014; originally announced July 2014.

Comments: Journal preprint version, 18 pages

arXiv:1404.3776 [pdf, other]

Approximation Schemes for Partitioning: Convex Decomposition and Surface Approximation

Authors: Sayan Bandyapadhyay, Santanu Bhowmick, Kasturi Varadarajan

Abstract: We revisit two NP-hard geometric partitioning problems - convex decomposition and surface approximation. Building on recent developments in geometric separators, we present quasi-polynomial time algorithms for these problems with improved approximation guarantees.

Submitted 14 April, 2014; originally announced April 2014.

Comments: 21 pages, 6 figures

Showing 1–32 of 32 results for author: Bandyapadhyay, S