Search | arXiv e-print repository

On Equivalence of Parameterized Inapproximability of k-Median, k-Max-Coverage, and 2-CSP

Authors: Karthik C. S., Euiwoong Lee, Pasin Manurangsi

Abstract: Parameterized Inapproximability Hypothesis (PIH) is a central question in the field of parameterized complexity. PIH asserts that given as input a 2-CSP on $k$ variables and alphabet size $n$, it is W[1]-hard parameterized by $k$ to distinguish if the input is perfectly satisfiable or if every assignment to the input violates 1% of the constraints. An important implication of PIH is that it yields the tight parameterized inapproximability of the $k$-maxcoverage problem. In the $k$-maxcoverage problem, we are given as input a set system, a threshold $τ>0$, and a parameter $k$ and the goal is to determine if there exist $k$ sets in the input whose union is at least $τ$ fraction of the entire universe. PIH is known to imply that it is W[1]-hard parameterized by $k$ to distinguish if there are $k$ input sets whose union is at least $τ$ fraction of the universe or if the union of every $k$ input sets is not much larger than $τ\cdot (1-\frac{1}{e})$ fraction of the universe. In this work we present a gap preserving FPT reduction (in the reverse direction) from the $k$-maxcoverage problem to the aforementioned 2-CSP problem, thus showing that the assertion that approximating the $k$-maxcoverage problem to some constant factor is W[1]-hard implies PIH. In addition, we present a gap preserving FPT reduction from the $k$-median problem (in general metrics) to the $k$-maxcoverage problem, further highlighting the power of gap preserving FPT reductions over classical gap preserving polynomial time reductions.

Submitted 11 July, 2024; originally announced July 2024.
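
To make the $k$-maxcoverage decision problem in the abstract above concrete, here is a minimal brute-force sketch in Python. It is not the reduction from the paper; the function names and the input representation (a list of Python sets) are assumptions made for illustration. It tries all $k$-subsets of the input sets, so its running time grows roughly like $n^k$.

```python
from itertools import combinations

def max_coverage_fraction(universe, sets, k):
    """Largest fraction of `universe` covered by the union of any k input sets."""
    universe = set(universe)
    best = 0
    for chosen in combinations(sets, k):
        # Union of the k chosen sets, restricted to the universe.
        covered = set().union(*chosen) & universe
        best = max(best, len(covered))
    return best / len(universe)

def is_yes_instance(universe, sets, k, tau):
    """Decide whether some k sets cover at least a tau fraction of the universe."""
    return max_coverage_fraction(universe, sets, k) >= tau
```

For example, with universe `range(6)` and sets `[{0, 1, 2}, {2, 3}, {3, 4, 5}, {0, 5}]`, two sets suffice to cover everything, while a single set covers at most half of the universe.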

arXiv:2405.13877 [pdf, ps, other]

On connections between k-coloring and Euclidean k-means

Authors: Enver Aman, Karthik C. S., Sharath Punna

Abstract: In the Euclidean $k$-means problem we are given as input a set $P$ of $n$ points in $\mathbb{R}^d$ and the goal is to find a set of $k$ points $C\subseteq \mathbb{R}^d$, so as to minimize the sum of the squared Euclidean distances from each point in $P$ to its closest center in $C$. In this paper, we formally explore connections between the $k$-coloring problem on graphs and the Euclidean $k$-means problem. Our results are as follows: $\bullet$ For all $k\ge 3$, we provide a simple reduction from the $k$-coloring problem on regular graphs to the Euclidean $k$-means problem. Moreover, our technique extends to enable a reduction from a structured max-cut problem (which may be considered as a partial 2-coloring problem) to the Euclidean $2$-means problem. Thus, we have a simple and alternate proof of the NP-hardness of the Euclidean 2-means problem. $\bullet$ In the other direction, we mimic the $O(1.7297^n)$ time algorithm of Williams [TCS'05] for the max-cut problem on $n$ vertices to obtain an algorithm for the Euclidean 2-means problem with the same runtime, improving on the naive exhaustive search running in $2^n\cdot \text{poly}(n,d)$ time. $\bullet$ We prove similar results and connections as above for the Euclidean $k$-min-sum problem.

Submitted 22 May, 2024; originally announced May 2024.
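
The naive exhaustive search for Euclidean 2-means mentioned in the abstract above can be sketched as follows. This is only the $2^n\cdot \text{poly}(n,d)$ baseline, not the faster algorithm from the paper; the function names are hypothetical. The sketch relies on the standard fact that, for a fixed cluster, the centroid minimizes the sum of squared distances, and it assumes a non-empty list of equal-length tuples.

```python
from itertools import product

def cluster_cost(points):
    """Sum of squared Euclidean distances from the points to their centroid."""
    if not points:
        return 0.0
    d = len(points[0])
    centroid = [sum(p[i] for p in points) / len(points) for i in range(d)]
    return sum(sum((p[i] - centroid[i]) ** 2 for i in range(d)) for p in points)

def two_means_exhaustive(points):
    """Try every bipartition of the points and return (best cost, labels)."""
    n = len(points)
    best_cost, best_labels = float("inf"), None
    # Fix the first point in cluster 0 to skip mirror-image partitions.
    for rest in product((0, 1), repeat=n - 1):
        labels = (0,) + rest
        left = [p for p, lab in zip(points, labels) if lab == 0]
        right = [p for p, lab in zip(points, labels) if lab == 1]
        cost = cluster_cost(left) + cluster_cost(right)
        if cost < best_cost:
            best_cost, best_labels = cost, labels
    return best_cost, best_labels
```

On the four points `(0, 0), (0, 1), (10, 0), (10, 1)` the search separates the two left points from the two right points.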

arXiv:2404.04795 [pdf, other]

Range Longest Increasing Subsequence and its Relatives: Beating Quadratic Barrier and Approaching Optimality

Authors: Karthik C. S., Saladi Rahul

Abstract: In this work, we present a plethora of results for the range longest increasing subsequence problem (Range-LIS) and its variants. The input to Range-LIS is a sequence $\mathcal{S}$ of $n$ real numbers and a collection $\mathcal{Q}$ of $m$ query ranges and for each query in $\mathcal{Q}$, the goal is to report the LIS of the sequence $\mathcal{S}$ restricted to that query. Our two main results are for the following generalizations of the Range-LIS problem: $\bullet$ 2D Range Queries: In this variant of the Range-LIS problem, each query is a pair of ranges, one of indices and the other of values, and we provide an algorithm with running time $\tilde{O}(mn^{1/2}+ n^{3/2} +k)$, where $k$ is the cumulative length of the $m$ output subsequences. This breaks the quadratic barrier of $\tilde{O}(mn)$ when $m=Ω(\sqrt{n})$. Previously, the only known result breaking the quadratic barrier was of Tiskin [SODA'10] which could only handle 1D range queries (i.e., each query was a range of indices) and also just outputted the length of the LIS (instead of reporting the subsequence achieving that length). $\bullet$ Colored Sequences: In this variant of the Range-LIS problem, each element in $\mathcal{S}$ is colored and for each query in $\mathcal{Q}$, the goal is to report a monochromatic LIS contained in the sequence $\mathcal{S}$ restricted to that query. For 2D queries, we provide an algorithm for this colored version with running time $\tilde{O}(mn^{2/3}+ n^{5/3} +k)$. Moreover, for 1D queries, we provide an improved algorithm with running time $\tilde{O}(mn^{1/2}+ n^{3/2} +k)$. Thus, we again break the quadratic barrier of $\tilde{O}(mn)$. Additionally, we prove that, assuming the well-known Combinatorial Boolean Matrix Multiplication Hypothesis, the runtime for 1D queries is essentially tight for combinatorial algorithms.

Submitted 6 April, 2024; originally announced April 2024.

Comments: Abstract shortened to meet Arxiv requirements
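
For reference, the $\tilde{O}(mn)$ baseline that the abstract above improves on can be sketched for 1D queries as follows: answer each query independently by running a standard patience-sorting LIS on the queried slice. This is not the algorithm from the paper; the function names and the convention that a query is an inclusive index pair `(l, r)` are assumptions made for illustration.

```python
from bisect import bisect_left

def lis(seq):
    """Return one longest strictly increasing subsequence of seq."""
    tails = []      # tails[j]: smallest tail value of an increasing subsequence of length j + 1
    tails_idx = []  # tails_idx[j]: index in seq of that tail
    prev = [-1] * len(seq)  # predecessor links used to rebuild the subsequence
    for i, x in enumerate(seq):
        j = bisect_left(tails, x)
        if j == len(tails):
            tails.append(x)
            tails_idx.append(i)
        else:
            tails[j] = x
            tails_idx[j] = i
        prev[i] = tails_idx[j - 1] if j > 0 else -1
    out = []
    i = tails_idx[-1] if tails_idx else -1
    while i != -1:
        out.append(seq[i])
        i = prev[i]
    return out[::-1]

def range_lis_naive(seq, queries):
    """Answer each inclusive index range (l, r) by recomputing the LIS of the slice."""
    return [lis(seq[l:r + 1]) for l, r in queries]
```

Each query costs $O(n \log n)$ in the worst case, which is where the roughly quadratic total for $m \approx n$ queries comes from.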

arXiv:2401.17235 [pdf, other]

Explicit Good Codes Approaching Distance 1 in Ulam Metric

Authors: Elazar Goldenberg, Mursalin Habib, Karthik C. S.

Abstract: The Ulam distance of two permutations on $[n]$ is $n$ minus the length of their longest common subsequence. In this paper, we show that for every $\varepsilon>0$, there exists some $α>0$, and an infinite set $Γ\subseteq \mathbb{N}$, such that for all $n\inΓ$, there is an explicit set $C_n$ of $(n!)^α$ many permutations on $[n]$, such that every pair of permutations in $C_n$ has pairwise Ulam distance at least $(1-\varepsilon)\cdot n$. Moreover, we can compute the $i^{\text{th}}$ permutation in $C_n$ in poly$(n)$ time and can also decode in poly$(n)$ time, a permutation $π$ on $[n]$ to its closest permutation $π^*$ in $C_n$, if the Ulam distance of $π$ and $π^*$ is less than $ \frac{(1-\varepsilon)\cdot n}{4} $. Previously, it was implicitly known by combining works of Goldreich and Wigderson [Israel Journal of Mathematics'23] and Farnoud, Skachek, and Milenkovic [IEEE Transactions on Information Theory'13] in a black-box manner, that it is possible to explicitly construct $(n!)^{Ω(1)}$ many permutations on $[n]$, such that every pair of them have pairwise Ulam distance at least $\frac{n}{6}\cdot (1-\varepsilon)$, for any $\varepsilon>0$, and the bound on the distance can be improved to $\frac{n}{4}\cdot (1-\varepsilon)$ if the construction of Goldreich and Wigderson is directly analyzed in the Ulam metric.

Submitted 11 May, 2024; v1 submitted 30 January, 2024; originally announced January 2024.
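
The definition of Ulam distance in the abstract above can be computed directly. The sketch below (function name assumed for illustration) uses the standard observation that, for two permutations, the longest common subsequence equals the longest increasing subsequence after relabeling one permutation by positions in the other, which gives an $O(n \log n)$ computation.

```python
from bisect import bisect_left

def ulam_distance(pi, sigma):
    """n minus the length of the longest common subsequence of two permutations."""
    n = len(pi)
    # Position of each symbol in pi; sigma is rewritten in these coordinates.
    pos = {v: i for i, v in enumerate(pi)}
    relabeled = [pos[v] for v in sigma]
    # Longest increasing subsequence length via patience sorting.
    tails = []
    for x in relabeled:
        j = bisect_left(tails, x)
        if j == len(tails):
            tails.append(x)
        else:
            tails[j] = x
    return n - len(tails)
```

For instance, moving a single symbol from the front to the back of a permutation yields Ulam distance 1, while reversing a permutation on $[n]$ yields distance $n-1$.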

arXiv:2312.17140 [pdf, other]

On Inapproximability of Reconfiguration Problems: PSPACE-Hardness and some Tight NP-Hardness Results

Authors: Karthik C. S., Pasin Manurangsi

Abstract: The field of combinatorial reconfiguration studies search problems with a focus on transforming one feasible solution into another. Recently, Ohsaka [STACS'23] put forth the Reconfiguration Inapproximability Hypothesis (RIH), which roughly asserts that for some $ε>0$, given as input a $k$-CSP instance (for some constant $k$) over some constant sized alphabet, and two satisfying assignments $ψ_s$ and $ψ_t$, it is PSPACE-hard to find a sequence of assignments starting from $ψ_s$ and ending at $ψ_t$ such that every assignment in the sequence satisfies at least $(1-ε)$ fraction of the constraints and also that every assignment in the sequence is obtained by changing its immediately preceding assignment (in the sequence) on exactly one variable. Assuming RIH, many important reconfiguration problems have been shown to be PSPACE-hard to approximate by Ohsaka [STACS'23; SODA'24]. In this paper, we prove RIH and establish the first (constant factor) PSPACE-hardness of approximation results for many reconfiguration problems, resolving an open question posed by Ito et al. [TCS'11]. Our proof uses known constructions of Probabilistically Checkable Proofs of Proximity (in a black-box manner) to create the gap. Independently, Hirahara and Ohsaka [STOC'24] have also proved RIH. We also prove that the aforementioned $k$-CSP Reconfiguration problem is NP-hard to approximate to within a factor of $1/2 + ε$ (for any $ε>0$) when $k=2$. We complement this with a $(1/2 - ε)$-approximation polynomial time algorithm, which improves upon a $(1/4 - ε)$-approximation algorithm of Ohsaka [2023] (again for any $ε>0$). Finally, we show that Set Cover Reconfiguration is NP-hard to approximate to within a factor of $2 - ε$ for any constant $ε> 0$, which matches the simple linear-time 2-approximation algorithm by Ito et al. [TCS'11].

Submitted 15 February, 2024; v1 submitted 28 December, 2023; originally announced December 2023.
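
The two conditions on a reconfiguration sequence stated in the abstract above are easy to verify once a sequence is given; the hardness lies in finding one. Below is a minimal checker sketch. The representation is an assumption made for illustration: an assignment is a tuple of values, and each constraint is a pair `(scope, allowed)` where `scope` is a tuple of variable indices and `allowed` is the set of permitted value tuples.

```python
def satisfied_fraction(constraints, assignment):
    """Fraction of constraints whose scope values form an allowed tuple."""
    ok = sum(1 for scope, allowed in constraints
             if tuple(assignment[v] for v in scope) in allowed)
    return ok / len(constraints)

def is_valid_reconfiguration(constraints, sequence, psi_s, psi_t, eps):
    """Check endpoints, single-variable steps, and the (1 - eps) satisfaction bound."""
    if not sequence or sequence[0] != psi_s or sequence[-1] != psi_t:
        return False
    for prev, cur in zip(sequence, sequence[1:]):
        # Consecutive assignments must differ on exactly one variable.
        if sum(1 for a, b in zip(prev, cur) if a != b) != 1:
            return False
    return all(satisfied_fraction(constraints, a) >= 1 - eps for a in sequence)
```

As a small example, take three Boolean variables with the two equality constraints $x_0 = x_1$ and $x_1 = x_2$: flipping the variables one at a time from all-zeros to all-ones passes through assignments that satisfy only half of the constraints, so the sequence is valid for $ε = 1/2$ but not for $ε = 1/4$.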

arXiv:2312.02097 [pdf, other]

Inapproximability of Maximum Diameter Clustering for Few Clusters

Authors: Henry Fleischmann, Kyrylo Karlov, Karthik C. S., Ashwin Padaki, Stepan Zharkov

Abstract: In the Max-k-diameter problem, we are given a set of points in a metric space, and the goal is to partition the input points into k parts such that the maximum pairwise distance between points in the same part of the partition is minimized. The approximability of the Max-k-diameter problem was studied in the eighties, culminating in the work of Feder and Greene [STOC'88], wherein they showed it is NP-hard to approximate within a factor better than 2 in the $\ell_1$ and $\ell_\infty$ metrics, and NP-hard to approximate within a factor better than 1.969 in the Euclidean metric. This complements the celebrated 2 factor polynomial time approximation algorithm for the problem in general metrics (Gonzalez [TCS'85]; Hochbaum and Shmoys [JACM'86]). Over the last couple of decades, there has been increased interest from the algorithmic community to study the approximability of various clustering objectives when the number of clusters is fixed. In this setting, the framework of coresets has yielded PTAS for most popular clustering objectives, including k-means, k-median, k-center, k-minsum, and so on. In this paper, rather surprisingly, we prove that even when k=3, the Max-k-diameter problem is NP-hard to approximate within a factor of 1.5 in the $\ell_1$-metric (and Hamming metric) and NP-hard to approximate within a factor of 1.304 in the Euclidean metric. Our main conceptual contribution is the introduction of a novel framework called cloud systems which embed hypergraphs into $\ell_p$-metric spaces such that the chromatic number of the hypergraph is related to the quality of the Max-k-diameter clustering of the embedded pointset. Our main technical contributions are the constructions of nontrivial cloud systems in the Euclidean and $\ell_1$-metrics using extremal geometric structures.

Submitted 5 April, 2024; v1 submitted 4 December, 2023; originally announced December 2023.
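
As an illustration of the Max-k-diameter objective and of the factor-2 upper bound cited in the abstract above, here is a sketch of farthest-first traversal, the greedy rule commonly associated with Gonzalez's algorithm. It is a sketch under assumptions, not a transcription of the cited papers: the function names, the choice of the first point as the initial center, and the tie-breaking are all illustrative.

```python
def farthest_first_clustering(points, k, dist):
    """Pick k centers greedily (farthest-first) and assign points to the nearest center."""
    centers = [0]  # start from the first point; any starting point would do
    d_to_centers = [dist(points[0], p) for p in points]
    while len(centers) < min(k, len(points)):
        # Next center: the point farthest from all centers chosen so far.
        nxt = max(range(len(points)), key=lambda i: d_to_centers[i])
        centers.append(nxt)
        d_to_centers = [min(d, dist(points[nxt], p))
                        for d, p in zip(d_to_centers, points)]
    return [min(range(len(centers)), key=lambda c: dist(points[centers[c]], p))
            for p in points]

def max_diameter(points, labels, dist):
    """Maximum distance between two points that share a cluster label."""
    return max((dist(p, q)
                for i, p in enumerate(points)
                for j, q in enumerate(points)
                if i < j and labels[i] == labels[j]), default=0.0)
```

The `dist` argument can be any metric, for example `lambda p, q: sum(abs(a - b) for a, b in zip(p, q))` for the $\ell_1$-metric.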

arXiv:2312.01252 [pdf, ps, other]

On Steiner Trees of the Regular Simplex

Authors: Henry Fleischmann, Guillermo A. Gamboa Q., Karthik C. S., Josef Matějka, Jakub Petr

Abstract: In the Euclidean Steiner Tree problem, we are given as input a set of points (called terminals) in the $\ell_2$-metric space and the goal is to find the minimum-cost tree connecting them. Additional points (called Steiner points) from the space can be introduced as nodes in the solution. The seminal works of Arora [JACM'98] and Mitchell [SICOMP'99] provide a Polynomial Time Approximation Scheme (PTAS) for solving the Euclidean Steiner Tree problem in fixed dimensions. However, the problem remains poorly understood in higher dimensions (such as when the dimension is logarithmic in the number of terminals) and ruling out a PTAS for the problem in high dimensions is a notoriously long standing open problem (for example, see Trevisan [SICOMP'00]). Moreover, the explicit construction of optimal Steiner trees remains unknown for almost all well-studied high-dimensional point configurations. Furthermore, a vast majority of the state-of-the-art structural results on (high-dimensional) Euclidean Steiner trees were established in the 1960s, with no noteworthy update in over half a century. In this paper, we revisit high-dimensional Euclidean Steiner trees, proving new structural results. We also establish a link between the computational hardness of the Euclidean Steiner Tree problem and understanding the optimal Steiner trees of regular simplices (and simplicial complexes), proposing several conjectures and showing that some of them suffice to resolve the status of the inapproximability of the Euclidean Steiner Tree problem. Motivated by this connection, we investigate optimal Steiner trees of regular simplices, proving new structural properties of their optimal Steiner trees, revisiting an old conjecture of Smith [Algorithmica'92] about their optimal topology, and providing the first explicit, general construction of candidate optimal Steiner trees for that topology.

Submitted 2 December, 2023; originally announced December 2023.

arXiv:2311.05913 [pdf, ps, other]

Conditional lower bounds for sparse parameterized 2-CSP: A streamlined proof

Authors: Karthik C. S., Dániel Marx, Marcin Pilipczuk, Uéverton Souza

Abstract: Assuming the Exponential Time Hypothesis (ETH), a result of Marx (ToC'10) implies that there is no $f(k)\cdot n^{o(k/\log k)}$ time algorithm that can solve 2-CSPs with $k$ constraints (over a domain of arbitrarily large size $n$) for any computable function $f$. This lower bound is widely used to show that certain parameterized problems cannot be solved in $f(k)\cdot n^{o(k/\log k)}$ time (assuming the ETH). The purpose of this note is to give a streamlined proof of this result.

Submitted 17 April, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

arXiv:2306.02189 [pdf, ps, other]

On Approximability of Steiner Tree in $\ell_p$-metrics

Authors: Henry Fleischmann, Surya Teja Gavva, Karthik C. S.

Abstract: In the Continuous Steiner Tree problem (CST), we are given as input a set of points (called terminals) in a metric space and ask for the minimum-cost tree connecting them. Additional points (called Steiner points) from the metric space can be introduced as nodes in the solution. In the Discrete Steiner Tree problem (DST), we are given in addition to the terminals, a set of facilities, and any solution tree connecting the terminals can only contain the Steiner points from this set of facilities. Trevisan [SICOMP'00] showed that CST and DST are APX-hard when the input lies in the $\ell_1$-metric (and Hamming metric). Chlebík and Chlebíková [TCS'08] showed that DST is NP-hard to approximate to a factor of $96/95\approx 1.01$ in the graph metric (and consequently $\ell_\infty$-metric). Prior to this work, it was unclear if CST and DST are APX-hard in essentially every other popular metric! In this work, we prove that DST is APX-hard in every $\ell_p$-metric. We also prove that CST is APX-hard in the $\ell_{\infty}$-metric. Finally, we relate CST and DST, showing a general reduction from CST to DST in $\ell_p$-metrics. As an immediate consequence, this yields a $1.39$-approximation polynomial time algorithm for CST in $\ell_p$-metrics.

Submitted 21 September, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

Comments: Abstract shortened due to arxiv's requirements

arXiv:2305.16878 [pdf, ps, other]

Can You Solve Closest String Faster than Exhaustive Search?

Authors: Amir Abboud, Nick Fischer, Elazar Goldenberg, Karthik C. S., Ron Safier

Abstract: We study the fundamental problem of finding the best string to represent a given set, in the form of the Closest String problem: Given a set $X \subseteq Σ^d$ of $n$ strings, find the string $x^*$ minimizing the radius of the smallest Hamming ball around $x^*$ that encloses all the strings in $X$. In this paper, we investigate whether the Closest String problem admits algorithms that are faster than the trivial exhaustive search algorithm. We obtain the following results for the two natural versions of the problem: $\bullet$ In the continuous Closest String problem, the goal is to find the solution string $x^*$ anywhere in $Σ^d$. For binary strings, the exhaustive search algorithm runs in time $O(2^d poly(nd))$ and we prove that it cannot be improved to time $O(2^{(1-ε) d} poly(nd))$, for any $ε> 0$, unless the Strong Exponential Time Hypothesis fails. $\bullet$ In the discrete Closest String problem, $x^*$ is required to be in the input set $X$. While this problem is clearly in polynomial time, its fine-grained complexity has been pinpointed to be quadratic time $n^{2 \pm o(1)}$ whenever the dimension is $ω(\log n) < d < n^{o(1)}$. We complement this known hardness result with new algorithms, proving essentially that whenever $d$ falls out of this hard range, the discrete Closest String problem can be solved faster than exhaustive search. In the small-$d$ regime, our algorithm is based on a novel application of the inclusion-exclusion principle. Interestingly, all of our results apply (and some are even stronger) to the natural dual of the Closest String problem, called the Remotest String problem, where the task is to find a string maximizing the Hamming distance to all the strings in $X$.

Submitted 29 May, 2023; v1 submitted 26 May, 2023; originally announced May 2023.
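
The two exhaustive-search baselines discussed in the abstract above can be sketched in a few lines. These are only the trivial algorithms that the paper compares against, not its new algorithms; the function names and the representation of strings as equal-length Python `str` values are assumptions made for illustration.

```python
from itertools import product

def hamming(x, y):
    """Number of positions where two equal-length strings differ."""
    return sum(a != b for a, b in zip(x, y))

def radius(center, strings):
    """Radius of the smallest Hamming ball around `center` enclosing all strings."""
    return max(hamming(center, s) for s in strings)

def continuous_closest_string(strings, alphabet="01"):
    """Search all |alphabet|^d candidate centers (2^d for binary strings)."""
    d = len(strings[0])
    return min(("".join(c) for c in product(alphabet, repeat=d)),
               key=lambda c: radius(c, strings))

def discrete_closest_string(strings):
    """Search only the input strings: about n^2 Hamming distance computations."""
    return min(strings, key=lambda c: radius(c, strings))
```

For the input `["0000", "0011", "1100"]`, both versions find a center of radius 2, since two of the inputs are at Hamming distance 4 from each other.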

arXiv:2305.02850 [pdf, other]

Impossibility of Depth Reduction in Explainable Clustering

Authors: Chengyuan Deng, Surya Teja Gavva, Karthik C. S., Parth Patel, Adarsh Srinivasan

Abstract: Over the last few years Explainable Clustering has gathered a lot of attention. Dasgupta et al. [ICML'20] initiated the study of explainable k-means and k-median clustering problems where the explanation is captured by a threshold decision tree which partitions the space at each node using axis parallel hyperplanes. Recently, Laber et al. [Pattern Recognition'23] made a case to consider the depth of the decision tree as an additional complexity measure of interest. In this work, we prove that even when the input points are in the Euclidean plane, any depth reduction in the explanation incurs unbounded loss in the k-means and k-median cost. Formally, we show that there exists a data set X in the Euclidean plane, for which there is a decision tree of depth k-1 whose k-means/k-median cost matches the optimal clustering cost of X, but every decision tree of depth less than k-1 has unbounded cost w.r.t. the optimal cost of clustering. We extend our results to the k-center objective as well, albeit with weaker guarantees.

Submitted 4 May, 2023; originally announced May 2023.

arXiv:2210.09640 [pdf, other]

Clustering Categorical Data: Soft Rounding k-modes

Authors: Surya Teja Gavva, Karthik C. S., Sharath Punna

Abstract: Over the last three decades, researchers have intensively explored various clustering tools for categorical data analysis. Despite the proposal of various clustering algorithms, the classical k-modes algorithm remains a popular choice for unsupervised learning of categorical data. Surprisingly, our first insight is that in a natural generative block model, the k-modes algorithm performs poorly for a large range of parameters. We remedy this issue by proposing a soft rounding variant of the k-modes algorithm (SoftModes) and theoretically prove that our variant addresses the drawbacks of the k-modes algorithm in the generative model. Finally, we empirically verify that SoftModes performs well on both synthetic and real-world datasets.

Submitted 7 October, 2023; v1 submitted 18 October, 2022; originally announced October 2022.
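
For readers unfamiliar with the classical k-modes algorithm that the abstract above takes as its starting point, here is a minimal sketch: alternate between assigning each row to the nearest mode in Hamming distance and recomputing each mode attribute by attribute as the most frequent value. This is the classical procedure only; the SoftModes variant is not described in the abstract and is not shown here. The function names, the caller-supplied initial modes, and the tie-breaking are assumptions made for illustration.

```python
from collections import Counter

def hamming(x, y):
    """Number of attributes on which two categorical rows differ."""
    return sum(a != b for a, b in zip(x, y))

def k_modes(rows, initial_modes, max_iters=100):
    """Classical k-modes: nearest-mode assignment, then per-attribute mode update."""
    modes = [tuple(m) for m in initial_modes]
    labels = None
    for _ in range(max_iters):
        new_labels = [min(range(len(modes)), key=lambda c: hamming(r, modes[c]))
                      for r in rows]
        if new_labels == labels:
            break  # assignments are stable, so the modes will not change either
        labels = new_labels
        for c in range(len(modes)):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                # Most frequent value in each attribute column of the cluster.
                modes[c] = tuple(Counter(col).most_common(1)[0][0]
                                 for col in zip(*members))
    return labels, modes
```

Empty clusters simply keep their previous mode in this sketch; a practical implementation would likely re-seed them.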

arXiv:2202.10028 [pdf, other]

Obtaining Approximately Optimal and Diverse Solutions via Dispersion

Authors: Jie Gao, Mayank Goswami, Karthik C. S., Meng-Tsung Tsai, Shih-Yu Tsai, Hao-Tsung Yang

Abstract: There has been a long-standing interest in computing diverse solutions to optimization problems. Motivated by reallocation of governmental institutions in Sweden, in 1995 J. Krarup posed the problem of finding $k$ edge-disjoint Hamiltonian Circuits of minimum total weight, called the peripatetic salesman problem (PSP). Since then researchers have investigated the complexity of finding diverse solutions to spanning trees, paths, vertex covers, matchings, and more. Unlike the PSP that has a constraint on the total weight of the solutions, recent work has involved finding diverse solutions that are all optimal. However, sometimes the space of exact solutions may be too small to achieve sufficient diversity. Motivated by this, we initiate the study of obtaining sufficiently-diverse, yet approximately-optimal solutions to optimization problems. Formally, given an integer $k$, an approximation factor $c$, and an instance $I$ of an optimization problem, we aim to obtain a set of $k$ solutions to $I$ that a) are all $c$ approximately-optimal for $I$ and b) maximize the diversity of the $k$ solutions. Finding such solutions, therefore, requires a better understanding of the global landscape of the optimization function. We show that, given any metric on the space of solutions, and the diversity measure as the sum of pairwise distances between solutions, this problem can be solved by combining ideas from dispersion and multicriteria optimization. We first provide a general reduction to an associated budget-constrained optimization (BCO) problem, where one objective function is to be maximized (minimized) subject to a bound on the second objective function. We then prove that bi-approximations to the BCO can be used to give bi-approximations to the diverse approximately optimal solutions problem with a little overhead.

Submitted 21 February, 2022; originally announced February 2022.

arXiv:2112.03983 [pdf, ps, other]

Almost Polynomial Factor Inapproximability for Parameterized k-Clique

Authors: Karthik C. S., Subhash Khot

Abstract: The k-Clique problem is a canonical hard problem in parameterized complexity. In this paper, we study the parameterized complexity of approximating the k-Clique problem where an integer k and a graph G on n vertices are given as input, and the goal is to find a clique of size at least k/F(k) whenever the graph G has a clique of size k. When such an algorithm runs in time T(k)poly(n) (i.e., FPT-time) for some computable function T, it is said to be an F(k)-FPT-approximation algorithm for the k-Clique problem. Although the non-existence of an F(k)-FPT-approximation algorithm for any computable sublinear function F is known under gap-ETH [Chalermsook et al., FOCS 2017], it has remained a long standing open problem to prove the same inapproximability result under the more standard and weaker assumption, W[1]$\neq$FPT. In a recent breakthrough, Lin [STOC 2021] ruled out constant factor (i.e., F(k)=O(1)) FPT-approximation algorithms under W[1]$\neq$FPT. In this paper, we improve this inapproximability result (under the same assumption) to rule out every $F(k)=k^{1/H(k)}$ factor FPT-approximation algorithm for any increasing computable function H (for example $H(k)=\log^\ast k$). Our main technical contribution is introducing list decoding of Hadamard codes over large prime fields into the proof framework of Lin.

Submitted 16 July, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

arXiv:2112.03222 [pdf, ps, other]

On Complexity of 1-Center in Various Metrics

Authors: Amir Abboud, Mohammad Hossein Bateni, Vincent Cohen-Addad, Karthik C. S., Saeed Seddighin

Abstract: We consider the classic 1-center problem: Given a set $P$ of $n$ points in a metric space find the point in $P$ that minimizes the maximum distance to the other points of $P$. We study the complexity of this problem in $d$-dimensional $\ell_p$-metrics and in edit and Ulam metrics over strings of length $d$. Our results for the 1-center problem may be classified based on $d$ as follows. $\bullet$ Small $d$: Assuming the hitting set conjecture (HSC), we show that when $d=ω(\log n)$, no subquadratic algorithm can solve the 1-center problem in any of the $\ell_p$-metrics, or in edit or Ulam metrics. $\bullet$ Large $d$: When $d=Ω(n)$, we extend our conditional lower bound to rule out subquartic algorithms for the 1-center problem in edit metric (assuming Quantified SETH). On the other hand, we give a $(1+ε)$-approximation for 1-center in Ulam metric with running time $\tilde{O_{\varepsilon}}(nd+n^2\sqrt{d})$. We also strengthen some of the above lower bounds by allowing approximations or by reducing the dimension $d$, but only against a weaker class of algorithms which list all requisite solutions. Moreover, we extend one of our hardness results to rule out subquartic algorithms for the well-studied 1-median problem in the edit metric, where given a set of $n$ strings each of length $n$, the goal is to find a string in the set that minimizes the sum of the edit distances to the rest of the strings in the set.

Submitted 9 July, 2023; v1 submitted 6 December, 2021; originally announced December 2021.

arXiv:2111.10912 [pdf, ps, other]

Johnson Coverage Hypothesis: Inapproximability of k-means and k-median in L_p metrics

Authors: Vincent Cohen-Addad, Karthik C. S., Euiwoong Lee

Abstract: K-median and k-means are the two most popular objectives for clustering algorithms. Despite intensive effort, a good understanding of the approximability of these objectives, particularly in $\ell_p$-metrics, remains a major open problem. In this paper, we significantly improve upon the hardness of approximation factors known in literature for these objectives in $\ell_p$-metrics. We introduce a… ▽ More K-median and k-means are the two most popular objectives for clustering algorithms. Despite intensive effort, a good understanding of the approximability of these objectives, particularly in $\ell_p$-metrics, remains a major open problem. In this paper, we significantly improve upon the hardness of approximation factors known in literature for these objectives in $\ell_p$-metrics. We introduce a new hypothesis called the Johnson Coverage Hypothesis (JCH), which roughly asserts that the well-studied max k-coverage problem on set systems is hard to approximate to a factor greater than 1-1/e, even when the membership graph of the set system is a subgraph of the Johnson graph. We then show that together with generalizations of the embedding techniques introduced by Cohen-Addad and Karthik (FOCS '19), JCH implies hardness of approximation results for k-median and k-means in $\ell_p$-metrics for factors which are close to the ones obtained for general metrics. In particular, assuming JCH we show that it is hard to approximate the k-means objective: $\bullet$ Discrete case: To a factor of 3.94 in the $\ell_1$-metric and to a factor of 1.73 in the $\ell_2$-metric; this improves upon the previous factor of 1.56 and 1.17 respectively, obtained under UGC. $\bullet$ Continuous case: To a factor of 2.10 in the $\ell_1$-metric and to a factor of 1.36 in the $\ell_2$-metric; this improves upon the previous factor of 1.07 in the $\ell_2$-metric obtained under UGC. We also obtain similar improvements under JCH for the k-median objective. 
Additionally, we prove a weak version of JCH using the work of Dinur et al. (SICOMP '05) on Hypergraph Vertex Cover, and recover all the results stated above of Cohen-Addad and Karthik (FOCS '19) to (nearly) the same inapproximability factors but now under the standard NP$\neq$P assumption (instead of UGC).

Submitted 21 November, 2021; originally announced November 2021.

Comments: Abstract in metadata shortened to meet arxiv requirements

arXiv:2111.05518 [pdf, other]

Applications of Random Algebraic Constructions to Hardness of Approximation

Authors: Boris Bukh, Karthik C. S., Bhargav Narayanan

Abstract: In this paper, we show how one may (efficiently) construct two types of extremal combinatorial objects whose existence was previously conjectural. (*) Panchromatic Graphs: For fixed integer k, a k-panchromatic graph is, roughly speaking, a balanced bipartite graph with one partition class equipartitioned into k colour classes in which the common neighbourhoods of panchromatic k-sets of vertices are much larger than those of k-sets that repeat a colour. The question of their existence was raised by Karthik and Manurangsi [Combinatorica 2020]. (*) Threshold Graphs: For fixed integer k, a k-threshold graph is, roughly speaking, a balanced bipartite graph in which the common neighbourhoods of k-sets of vertices on one side are much larger than those of (k+1)-sets. The question of their existence was raised by Lin [JACM 2018]. As applications of our constructions, we show the following conditional time lower bounds on the parameterized set intersection problem where, given a collection of n sets over universe [n] and a parameter k, the goal is to find k sets with the largest intersection. (*) Assuming ETH, for any computable function F, no $n^{o(k)}$-time algorithm can approximate the parameterized set intersection problem up to factor F(k). This improves considerably on the previously best-known result under ETH due to Lin [JACM 2018], who ruled out any $n^{o(\sqrt{k})}$ time approximation algorithm for this problem.
(*) Assuming SETH, for every $\varepsilon>0$ and any computable function F, no $n^{k-\varepsilon}$-time algorithm can approximate the parameterized set intersection problem up to factor F(k). No result of comparable strength was previously known under SETH, even for solving this problem exactly.
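
For orientation, the parameterized set intersection problem described above can be solved exactly by trying every choice of k sets, which takes roughly $n^k$ steps up to polynomial factors; the lower bounds above say that, under the stated hypotheses, one cannot do substantially better even approximately. The following is a minimal brute-force sketch in Python, not code from the paper; the function name `max_k_intersection` and the input representation (a list of Python sets) are assumptions made for illustration.

```python
from itertools import combinations

def max_k_intersection(sets, k):
    """Brute force: try every choice of k sets and keep the largest common intersection.

    `sets` is a list of Python sets (illustrative representation, not from the paper).
    Returns (size of best intersection, tuple of chosen indices).
    """
    best_size, best_choice = -1, None
    for choice in combinations(range(len(sets)), k):
        # Elements common to all k chosen sets.
        common = set.intersection(*(sets[i] for i in choice))
        if len(common) > best_size:
            best_size, best_choice = len(common), choice
    return best_size, best_choice
```

The loop visits all $\binom{n}{k}$ index tuples, so it is only meant to make the problem statement concrete.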

Submitted 9 November, 2021; originally announced November 2021.

Comments: Abstract in metadata shortened to meet arxiv requirements

arXiv:2010.00087 [pdf, ps, other]

On Approximability of Clustering Problems Without Candidate Centers

Authors: Vincent Cohen-Addad, Karthik C. S., Euiwoong Lee

Abstract: The k-means objective is arguably the most widely-used cost function for modeling clustering tasks in a metric space. In practice and historically, k-means is thought of in a continuous setting, namely where the centers can be located anywhere in the metric space. For example, the popular Lloyd's heuristic locates a center at the mean of each cluster. Despite persistent efforts on understanding the approximability of k-means, and other classic clustering problems such as k-median and k-minsum, our knowledge of the hardness of approximation factors of these problems remains quite poor. In this paper, we significantly improve upon the hardness of approximation factors known in the literature for these objectives. We show that if the input lies in a general metric space, it is NP-hard to approximate: $\bullet$ Continuous k-median to a factor of $2-o(1)$; this improves upon the previous inapproximability factor of 1.36 shown by Guha and Khuller (J. Algorithms '99). $\bullet$ Continuous k-means to a factor of $4- o(1)$; this improves upon the previous inapproximability factor of 2.10 shown by Guha and Khuller (J. Algorithms '99). $\bullet$ k-minsum to a factor of $1.415$; this improves upon the APX-hardness shown by Guruswami and Indyk (SODA '03). Our results shed new and perhaps counter-intuitive light on the differences between clustering problems in the continuous setting versus the discrete setting (where the candidate centers are given as part of the input).

Submitted 2 October, 2020; v1 submitted 30 September, 2020; originally announced October 2020.

arXiv:2009.02778 [pdf, ps, other]

On Hardness of Approximation of Parameterized Set Cover and Label Cover: Threshold Graphs from Error Correcting Codes

Authors: Karthik C. S., Inbal Livni-Navon

Abstract: In the $(k,h)$-SetCover problem, we are given a collection $\mathcal{S}$ of sets over a universe $U$, and the goal is to distinguish between the case that $\mathcal{S}$ contains $k$ sets which cover $U$, from the case that at least $h$ sets in $\mathcal{S}$ are needed to cover $U$. Lin (ICALP'19) recently showed a gap creating reduction from the $(k,k+1)$-SetCover problem on universe of size $O_k(\log |\mathcal{S}|)$ to the $\left(k,\sqrt[k]{\frac{\log|\mathcal{S}|}{\log\log |\mathcal{S}|}}\cdot k\right)$-SetCover problem on universe of size $|\mathcal{S}|$. In this paper, we prove a more scalable version of his result: given any error correcting code $C$ over alphabet $[q]$, rate $ρ$, and relative distance $δ$, we use $C$ to create a reduction from the $(k,k+1)$-SetCover problem on universe $U$ to the $\left(k,\sqrt[2k]{\frac{2}{1-δ}}\right)$-SetCover problem on universe of size $\frac{\log|\mathcal{S}|}ρ\cdot|U|^{q^k}$. Lin established his result by composing the input SetCover instance (that has no gap) with a special threshold graph constructed from extremal combinatorial object called universal sets, resulting in a final SetCover instance with gap. Our reduction follows along the exact same lines, except that we generate the threshold graphs specified by Lin simply using the basic properties of the error correcting code $C$. We use the same threshold graphs mentioned above to prove inapproximability results, under W[1]$\neq$FPT and ETH, for the $k$-MaxCover problem introduced by Chalermsook et al. (SICOMP'20).
Our inapproximability results match the bounds obtained by Karthik et al. (JACM'19), although their proof framework is very different, and involves generalization of the distributed PCP framework. Prior to this work, it was not clear how to adopt the proof strategy of Lin to prove inapproximability results for $k$-MaxCover.

Submitted 6 September, 2020; originally announced September 2020.

arXiv:2008.06700 [pdf, other]

On Efficient Low Distortion Ultrametric Embedding

Authors: Vincent Cohen-Addad, Karthik C. S., Guillaume Lagarde

Abstract: A classic problem in unsupervised learning and data analysis is to find simpler and easy-to-visualize representations of the data that preserve its essential properties. A widely-used method to preserve the underlying hierarchical structure of the data while reducing its complexity is to find an embedding of the data into a tree or an ultrametric. The most popular algorithms for this task are the classic linkage algorithms (single, average, or complete). However, these methods on a data set of $n$ points in $Ω(\log n)$ dimensions exhibit a quite prohibitive running time of $Θ(n^2)$. In this paper, we provide a new algorithm which takes as input a set of points $P$ in $\mathbb{R}^d$, and for every $c\ge 1$, runs in time $n^{1+\fracρ{c^2}}$ (for some universal constant $ρ>1$) to output an ultrametric $Δ$ such that for any two points $u,v$ in $P$, we have $Δ(u,v)$ is within a multiplicative factor of $5c$ to the distance between $u$ and $v$ in the "best" ultrametric representation of $P$. Here, the best ultrametric is the ultrametric $\tildeΔ$ that minimizes the maximum distance distortion with respect to the $\ell_2$ distance, namely that minimizes $\underset{u,v \in P}{\max}\ \frac{\tildeΔ(u,v)}{\|u-v\|_2}$. We complement the above result by showing that under popular complexity theoretic assumptions, for every constant $\varepsilon>0$, no algorithm with running time $n^{2-\varepsilon}$ can distinguish between inputs in $\ell_\infty$-metric that admit isometric embedding and those that incur a distortion of $\frac{3}{2}$.
Finally, we present empirical evaluation on classic machine learning datasets and show that the output of our algorithm is comparable to the output of the linkage algorithms while achieving a much faster running time.

Submitted 15 August, 2020; originally announced August 2020.

arXiv:2008.05421 [pdf, other]

Deterministic Replacement Path Covering

Authors: Karthik C. S., Merav Parter

Abstract: In this article, we provide a unified and simplified approach to derandomize central results in the area of fault-tolerant graph algorithms. Given a graph $G$, a vertex pair $(s,t) \in V(G)\times V(G)$, and a set of edge faults $F \subseteq E(G)$, a replacement path $P(s,t,F)$ is an $s$-$t$ shortest path in $G \setminus F$. For integer parameters $L,f$, a replacement path covering (RPC) is a collection of subgraphs of $G$, denoted by $\textit{G}_{L,f}=\{G_1,\ldots, G_r \}$, such that for every set $F$ of at most $f$ faults (i.e., $|F|\le f$) and every replacement path $P(s,t,F)$ of at most $L$ edges, there exists a subgraph $G_i\in \textit{G}_{L,f}$ that contains all the edges of $P$ and does not contain any of the edges of $F$. The covering value of the RPC $\textit{G}_{L,f}$ is then defined to be the number of subgraphs in $\textit{G}_{L,f}$. We present efficient deterministic constructions of $(L,f)$-RPCs whose covering values almost match the randomized ones, for a wide range of parameters. Our time and value bounds improve considerably over the previous construction of Parter (DISC 2019). We also provide an almost matching lower bound for the value of these coverings. A key application of our above deterministic constructions is the derandomization of the algebraic construction of the distance sensitivity oracle by Weimann and Yuster (FOCS 2010). The preprocessing and query time of our deterministic algorithm nearly match the randomized bounds. This resolves the open problem of Alon, Chechik and Cohen (ICALP 2019).

Submitted 9 April, 2023; v1 submitted 12 August, 2020; originally announced August 2020.

arXiv:2006.04411 [pdf, other]

A Survey on Approximation in Parameterized Complexity: Hardness and Algorithms

Authors: Andreas Emil Feldmann, Karthik C. S., Euiwoong Lee, Pasin Manurangsi

Abstract: Parameterization and approximation are two popular ways of coping with NP-hard problems. More recently, the two have also been combined to derive many interesting results. We survey developments in the area both from the algorithmic and hardness perspectives, with emphasis on new techniques and potential future research directions.

Submitted 8 June, 2020; originally announced June 2020.

arXiv:1909.10958 [pdf, other]

On Communication Complexity of Fixed Point Computation

Authors: Anat Ganor, Karthik C. S., Dömötör Pálvölgyi

Abstract: Brouwer's fixed point theorem states that any continuous function from a compact convex space to itself has a fixed point. Roughgarden and Weinstein (FOCS 2016) initiated the study of fixed point computation in the two-player communication model, where each player gets a function from $[0,1]^n$ to $[0,1]^n$, and their goal is to find an approximate fixed point of the composition of the two functions. They left it as an open question to show a lower bound of $2^{Ω(n)}$ for the (randomized) communication complexity of this problem, in the range of parameters which make it a total search problem. We answer this question affirmatively. Additionally, we introduce two natural fixed point problems in the two-player communication model. $\bullet$ Each player is given a function from $[0,1]^n$ to $[0,1]^{n/2}$, and their goal is to find an approximate fixed point of the concatenation of the functions. $\bullet$ Each player is given a function from $[0,1]^n$ to $[0,1]^{n}$, and their goal is to find an approximate fixed point of the interpolation of the functions. We show a randomized communication complexity lower bound of $2^{Ω(n)}$ for these problems (for some constant approximation factor).
Finally, we initiate the study of finding a panchromatic simplex in a Sperner-coloring of a triangulation (guaranteed by Sperner's lemma) in the two-player communication model: A triangulation $T$ of the $d$-simplex is publicly known and one player is given a set $S_A\subset T$ and a coloring function from $S_A$ to $\{0,\ldots ,d/2\}$, and the other player is given a set $S_B\subset T$ and a coloring function from $S_B$ to $\{d/2+1,\ldots ,d\}$, such that $S_A\dot\cup S_B=T$, and their goal is to find a panchromatic simplex. We show a randomized communication complexity lower bound of $|T|^{Ω(1)}$ for the aforementioned problem as well (when $d$ is large).

Submitted 25 May, 2022; v1 submitted 24 September, 2019; originally announced September 2019.

arXiv:1909.01986 [pdf, ps, other]

Parameterized Intractability of Even Set and Shortest Vector Problem

Authors: Arnab Bhattacharyya, Édouard Bonnet, László Egri, Suprovat Ghoshal, Karthik C. S., Bingkai Lin, Pasin Manurangsi, Dániel Marx

Abstract: The $k$-Even Set problem is a parameterized variant of the Minimum Distance Problem of linear codes over $\mathbb F_2$, which can be stated as follows: given a generator matrix $\mathbf A$ and an integer $k$, determine whether the code generated by $\mathbf A$ has distance at most $k$, or in other words, whether there is a nonzero vector $\mathbf{x}$ such that $\mathbf A \mathbf{x}$ has at most $k$ nonzero coordinates. The question of whether $k$-Even Set is fixed parameter tractable (FPT) parameterized by the distance $k$ has been repeatedly raised in literature; in fact, it is one of the few remaining open questions from the seminal book of Downey and Fellows (1999). In this work, we show that $k$-Even Set is W[1]-hard under randomized reductions. We also consider the parameterized $k$-Shortest Vector Problem (SVP), in which we are given a lattice whose basis vectors are integral and an integer $k$, and the goal is to determine whether the norm of the shortest vector (in the $\ell_p$ norm for some fixed $p$) is at most $k$. Similar to $k$-Even Set, understanding the complexity of this problem is also a long-standing open question in the field of Parameterized Complexity. We show that, for any $p > 1$, $k$-SVP is W[1]-hard to approximate (under randomized reductions) to some constant factor.

Submitted 4 September, 2019; originally announced September 2019.

Comments: Preliminary version of this article appeared in ESA'16 (arXiv:1601.04935) and ICALP'18 (arXiv:1803.09717)

arXiv:1908.10248 [pdf, ps, other]

Hardness Amplification of Optimization Problems

Authors: Elazar Goldenberg, Karthik C. S.

Abstract: In this paper, we prove a general hardness amplification scheme for optimization problems based on the technique of direct products. We say that an optimization problem $Π$ is direct product feasible if it is possible to efficiently aggregate any $k$ instances of $Π$ and form one large instance of $Π$ such that given an optimal feasible solution to the larger instance, we can efficiently find optimal feasible solutions to all the $k$ smaller instances. Given a direct product feasible optimization problem $Π$, our hardness amplification theorem may be informally stated as follows: If there is a distribution $\mathcal{D}$ over instances of $Π$ of size $n$ such that every randomized algorithm running in time $t(n)$ fails to solve $Π$ on $\frac{1}{α(n)}$ fraction of inputs sampled from $\mathcal{D}$, then, assuming some relationships on $α(n)$ and $t(n)$, there is a distribution $\mathcal{D}'$ over instances of $Π$ of size $O(n\cdot α(n))$ such that every randomized algorithm running in time $\frac{t(n)}{poly(α(n))}$ fails to solve $Π$ on $\frac{99}{100}$ fraction of inputs sampled from $\mathcal{D}'$. As a consequence of the above theorem, we show hardness amplification of problems in various classes such as NP-hard problems like Max-Clique, Knapsack, and Max-SAT, problems in P such as Longest Common Subsequence, Edit Distance, Matrix Multiplication, and even problems in TFNP such as Factoring and computing Nash equilibrium.

Submitted 27 August, 2019; originally announced August 2019.

arXiv:1901.06220 [pdf, ps, other]

Towards a General Direct Product Testing Theorem

Authors: Elazar Goldenberg, Karthik C. S.

Abstract: The Direct Product encoding of a string $a\in \{0,1\}^n$ on an underlying domain $V\subseteq \binom{n}{k}$, is a function DP$_V(a)$ which gets as input a set $S\in V$ and outputs $a$ restricted to $S$. In the Direct Product Testing Problem, we are given a function $F:V\to \{0,1\}^k$, and our goal is to test whether $F$ is close to a direct product encoding, i.e., whether there exists some $a\in \{0,1\}^n$ such that on most sets $S$, we have $F(S)=$DP$_V(a)(S)$. A natural test is as follows: select a pair $(S,S')\in V$ according to some underlying distribution over $V\times V$, query $F$ on this pair, and check for consistency on their intersection. Note that the above distribution may be viewed as a weighted graph over the vertex set $V$ and is referred to as a test graph. The testability of direct products was studied over various specific domains and test graphs (for example see Dinur-Steurer [CCC'14]; Dinur-Kaufman [FOCS'17]). In this paper, we study the testability of direct products in a general setting, addressing the question: what properties of the domain and the test graph allow one to prove a direct product testing theorem? Towards this goal we introduce the notion of coordinate expansion of a test graph. Roughly speaking a test graph is a coordinate expander if it has global and local expansion, and has certain nice intersection properties on sampling. We show that whenever the test graph has coordinate expansion then it admits a direct product testing theorem.
Additionally, for every $k$ and $n$ we provide a direct product domain $V\subseteq \binom{n}{k}$ of size $n$, called the Sliding Window domain for which we prove direct product testability.
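
The natural two-query test described in this abstract can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the paper's construction: the domain produced by `cyclic_windows` (k cyclically consecutive indices) is a hypothetical stand-in chosen only because it has size $n$, the test graph is taken to be uniform over pairs, and all function names are invented for illustration.

```python
import random

def dp_encode(a, domain):
    """Direct product encoding: each set S (a sorted index tuple) maps to a restricted to S."""
    return {S: tuple(a[i] for i in S) for S in domain}

def consistent(F, S, T):
    """Check that F(S) and F(T) agree on every index in the intersection of S and T."""
    fs = dict(zip(S, F[S]))
    ft = dict(zip(T, F[T]))
    return all(fs[i] == ft[i] for i in set(S) & set(T))

def cyclic_windows(n, k):
    """Hypothetical toy domain of size n: k cyclically consecutive indices per set."""
    return [tuple(sorted((start + j) % n for j in range(k))) for start in range(n)]

def acceptance_rate(F, domain, trials, rng):
    """Run the two-query test on uniformly sampled pairs (an assumed test graph)."""
    passed = 0
    for _ in range(trials):
        S, T = rng.choice(domain), rng.choice(domain)
        passed += consistent(F, S, T)
    return passed / trials
```

A genuine direct product encoding passes every such check, so its acceptance rate is 1; the testing question is the converse, namely whether a high acceptance rate forces $F$ to be close to some encoding.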

Submitted 18 January, 2019; originally announced January 2019.

arXiv:1812.00901 [pdf, ps, other]

On Closest Pair in Euclidean Metric: Monochromatic is as Hard as Bichromatic

Authors: Karthik C. S., Pasin Manurangsi

Abstract: Given a set of $n$ points in $\mathbb R^d$, the (monochromatic) Closest Pair problem asks to find a pair of distinct points in the set that are closest in the $\ell_p$-metric. Closest Pair is a fundamental problem in Computational Geometry and understanding its fine-grained complexity in the Euclidean metric when $d=ω(\log n)$ was raised as an open question in recent works (Abboud-Rubinstein-Williams [FOCS'17], Williams [SODA'18], David-Karthik-Laekhanukit [SoCG'18]). In this paper, we show that for every $p\in\mathbb R_{\ge 1}\cup\{0\}$, under the Strong Exponential Time Hypothesis (SETH), for every $\varepsilon>0$, the following holds: $\bullet$ No algorithm running in time $O(n^{2-\varepsilon})$ can solve the Closest Pair problem in $d=(\log n)^{Ω_{\varepsilon}(1)}$ dimensions in the $\ell_p$-metric. $\bullet$ There exists $δ= δ(\varepsilon)>0$ and $c = c(\varepsilon)\ge 1$ such that no algorithm running in time $O(n^{1.5-\varepsilon})$ can approximate Closest Pair problem to a factor of $(1+δ)$ in $d\ge c\log n$ dimensions in the $\ell_p$-metric. At the heart of all our proofs is the construction of a dense bipartite graph with low contact dimension, i.e., we construct a balanced bipartite graph on $n$ vertices with $n^{2-\varepsilon}$ edges whose vertices can be realized as points in a $(\log n)^{Ω_\varepsilon(1)}$-dimensional Euclidean space such that every pair of vertices which have an edge in the graph are at distance exactly 1 and every other pair of vertices are at distance greater than 1.
This graph construction is inspired by the construction of locally dense codes introduced by Dumer-Micciancio-Sudan [IEEE Trans. Inf. Theory'03].

Submitted 3 December, 2018; originally announced December 2018.

arXiv:1803.09717 [pdf, ps, other]

Parameterized Intractability of Even Set and Shortest Vector Problem from Gap-ETH

Authors: Arnab Bhattacharyya, Suprovat Ghoshal, Karthik C. S., Pasin Manurangsi

Abstract: The $k$-Even Set problem is a parameterized variant of the Minimum Distance Problem of linear codes over $\mathbb F_2$, which can be stated as follows: given a generator matrix $\mathbf A$ and an integer $k$, determine whether the code generated by $\mathbf A$ has distance at most $k$. Here, $k$ is the parameter of the problem. The question of whether $k$-Even Set is fixed parameter tractable (FPT) has been repeatedly raised in literature and has earned its place in Downey and Fellows' book (2013) as one of the "most infamous" open problems in the field of Parameterized Complexity. In this work, we show that $k$-Even Set does not admit FPT algorithms under the (randomized) Gap Exponential Time Hypothesis (Gap-ETH) [Dinur'16, Manurangsi-Raghavendra'16]. In fact, our result rules out not only exact FPT algorithms, but also any constant factor FPT approximation algorithms for the problem. Furthermore, our result holds even under the following weaker assumption, which is also known as the Parameterized Inapproximability Hypothesis (PIH) [Lokshtanov et al.'17]: no (randomized) FPT algorithm can distinguish a satisfiable 2CSP instance from one which is only $0.99$-satisfiable (where the parameter is the number of variables). We also consider the parameterized $k$-Shortest Vector Problem (SVP), in which we are given a lattice whose basis vectors are integral and an integer $k$, and the goal is to determine whether the norm of the shortest vector (in the $\ell_p$ norm for some fixed $p$) is at most $k$.
Similar to $k$-Even Set, this problem is also a long-standing open problem in the field of Parameterized Complexity. We show that, for any $p > 1$, $k$-SVP is hard to approximate (in FPT time) to some constant factor, assuming PIH. Furthermore, for the case of $p = 2$, the inapproximability factor can be amplified to any constant.

Submitted 26 March, 2018; originally announced March 2018.

arXiv:1711.11029 [pdf, ps, other]

On the Parameterized Complexity of Approximating Dominating Set

Authors: Karthik C. S., Bundit Laekhanukit, Pasin Manurangsi

Abstract: We study the parameterized complexity of approximating the $k$-Dominating Set (DomSet) problem where an integer $k$ and a graph $G$ on $n$ vertices are given as input, and the goal is to find a dominating set of size at most $F(k) \cdot k$ whenever the graph $G$ has a dominating set of size $k$. When such an algorithm runs in time $T(k) \cdot poly(n)$ (i.e., FPT-time) for some computable function $T$, it is said to be an $F(k)$-FPT-approximation algorithm for $k$-DomSet. We prove the following for all computable functions $T, F$ and every constant $\varepsilon > 0$: $\bullet$ Assuming $W[1]\neq FPT$, there is no $F(k)$-FPT-approximation algorithm for $k$-DomSet. $\bullet$ Assuming the Exponential Time Hypothesis (ETH), there is no $F(k)$-approximation algorithm for $k$-DomSet that runs in $T(k) \cdot n^{o(k)}$ time. $\bullet$ Assuming the Strong Exponential Time Hypothesis (SETH), for every integer $k \geq 2$, there is no $F(k)$-approximation algorithm for $k$-DomSet that runs in $T(k) \cdot n^{k - \varepsilon}$ time. $\bullet$ Assuming the $k$-Sum Hypothesis, for every integer $k \geq 3$, there is no $F(k)$-approximation algorithm for $k$-DomSet that runs in $T(k) \cdot n^{\lceil k/2 \rceil - \varepsilon}$ time. Our results are obtained by establishing a connection between communication complexity and hardness of approximation, generalizing the ideas from a recent breakthrough work of Abboud et al. [FOCS 2017].
Specifically, we show that to prove hardness of approximation of a certain parameterized variant of the label cover problem, it suffices to devise a specific protocol for a communication problem that depends on which hypothesis we rely on. Each of these communication problems turns out to be either a well studied problem or a variant of one; this allows us to easily apply known techniques to solve them.

Submitted 12 April, 2018; v1 submitted 29 November, 2017; originally announced November 2017.

arXiv:1704.01104 [pdf, other]

Communication Complexity of Correlated Equilibrium in Two-Player Games

Authors: Anat Ganor, Karthik C. S.

Abstract: We show a communication complexity lower bound for finding a correlated equilibrium of a two-player game. More precisely, we define a two-player $N \times N$ game called the 2-cycle game and show that the randomized communication complexity of finding a 1/poly($N$)-approximate correlated equilibrium of the 2-cycle game is $Ω(N)$. For small approximation values, this answers an open question of Babichenko and Rubinstein (STOC 2017). Our lower bound is obtained via a direct reduction from the unique set disjointness problem.

Submitted 4 April, 2017; originally announced April 2017.

arXiv:1609.03840 [pdf, ps, other]

Did the Train Reach its Destination: The Complexity of Finding a Witness

Authors: Karthik C. S.

Abstract: Recently, Dohrau et al. studied a zero-player game on switch graphs and proved that deciding the termination of the game is in NP $\cap$ coNP. In this short paper, we show that the search version of this game on switch graphs, i.e., the task of finding a witness of termination (or of non-termination) is in PLS.

Submitted 6 January, 2017; v1 submitted 13 September, 2016; originally announced September 2016.

ACM Class: F.1.3; F.2.2

arXiv:1608.03245 [pdf, ps, other]

On the Complexity of Closest Pair via Polar-Pair of Point-Sets

Authors: Roee David, Karthik C. S., Bundit Laekhanukit

Abstract: Every graph $G$ can be represented by a collection of equi-radii spheres in a $d$-dimensional metric $Δ$ such that there is an edge $uv$ in $G$ if and only if the spheres corresponding to $u$ and $v$ intersect. The smallest integer $d$ such that $G$ can be represented by a collection of spheres (all of the same radius) in $Δ$ is called the sphericity of $G$, and if the collection of spheres are non-overlapping, then the value $d$ is called the contact-dimension of $G$. In this paper, we study the sphericity and contact dimension of the complete bipartite graph $K_{n,n}$ in various $L^p$-metrics and consequently connect the complexity of the monochromatic closest pair and bichromatic closest pair problems.

Submitted 15 November, 2018; v1 submitted 10 August, 2016; originally announced August 2016.

Comments: The paper was previously titled, "The Curse of Medium Dimension for Geometric Problems in Almost Every Norm"

ACM Class: F.2.2

arXiv:1607.08449

An Efficient Representation for Filtrations of Simplicial Complexes

Authors: Jean-Daniel Boissonnat, Karthik C. S.

Abstract: A filtration over a simplicial complex $K$ is an ordering of the simplices of $K$ such that all prefixes in the ordering are subcomplexes of $K$. Filtrations are at the core of Persistent Homology, a major tool in Topological Data Analysis. In order to represent the filtration of a simplicial complex, the entire filtration can be appended to any data structure that explicitly stores all the simplices of the complex such as the Hasse diagram or the recently introduced Simplex Tree [Algorithmica '14]. However, with the popularity of various computational methods that need to handle simplicial complexes, and with the rapidly increasing size of the complexes, the task of finding a compact data structure that can still support efficient queries is of great interest. In this paper, we propose a new data structure called the Critical Simplex Diagram (CSD) which is a variant of the Simplex Array List (SAL) [Algorithmica '17]. Our data structure allows one to store in a compact way the filtration of a simplicial complex, and allows for the efficient implementation of a large range of basic operations. Moreover, we prove that our data structure is essentially optimal with respect to the requisite storage space. Finally, we show that the CSD representation admits fast construction algorithms for Flag complexes and relaxed Delaunay complexes.

Submitted 4 February, 2018; v1 submitted 28 July, 2016; originally announced July 2016.

Comments: A preliminary version appeared in SODA 2017

ACM Class: E.1, F.2.2

arXiv:1607.05189

On the Sensitivity Conjecture for Disjunctive Normal Forms

Authors: Karthik C. S., Sébastien Tavenas

Abstract: The sensitivity conjecture of Nisan and Szegedy [CC '94] asks whether for any Boolean function $f$, the maximum sensitivity $s(f)$, is polynomially related to its block sensitivity $bs(f)$, and hence to other major complexity measures. Despite major advances in the analysis of Boolean functions over the last decade, the problem remains widely open. In this paper, we consider a restriction on the class of Boolean functions through a model of computation (DNF), and refer to the functions adhering to this restriction as admitting the Normalized Block property. We prove that for any function $f$ admitting the Normalized Block property, $bs(f) \leq 4s(f)^2$. We note that (almost) all the functions mentioned in literature that achieve a quadratic separation between sensitivity and block sensitivity admit the Normalized Block property. Recently, Gopalan et al. [ITCS '16] showed that every Boolean function $f$ is uniquely specified by its values on a Hamming ball of radius at most $2s(f)$. We extend this result and also construct examples of Boolean functions which provide the matching lower bounds.

Submitted 7 December, 2016; v1 submitted 18 July, 2016; originally announced July 2016.

arXiv:1503.07444

doi 10.1007/s00453-016-0207-y

Building Efficient and Compact Data Structures for Simplicial Complexes

Authors: Jean-Daniel Boissonnat, Karthik C. S., Sébastien Tavenas

Abstract: The Simplex Tree (ST) is a recently introduced data structure that can represent abstract simplicial complexes of any dimension and allows efficient implementation of a large range of basic operations on simplicial complexes. In this paper, we show how to optimally compress the Simplex Tree while retaining its functionalities. In addition, we propose two new data structures called the Maximal Simplex Tree (MxST) and the Simplex Array List (SAL). We analyze the compressed Simplex Tree, the Maximal Simplex Tree, and the Simplex Array List under various settings.

Submitted 5 November, 2016; v1 submitted 25 March, 2015; originally announced March 2015.

Comments: An extended abstract appeared in the proceedings of SoCG 2015

Showing 1–35 of 35 results for author: S., K C