-
How to Construct Quantum FHE, Generically
Authors:
Aparna Gupte,
Vinod Vaikuntanathan
Abstract:
We construct a (compact) quantum fully homomorphic encryption (QFHE) scheme starting from a (compact) classical fully homomorphic encryption scheme with decryption in $\mathsf{NC}^{1}$, together with a dual-mode trapdoor function family. Compared to previous constructions (Mahadev, FOCS 2018; Brakerski, CRYPTO 2018), which made non-black-box use of similar underlying primitives, our construction provides a pathway to instantiations from different assumptions. Our construction uses the techniques of Dulek, Schaffner and Speelman (CRYPTO 2016) and shows how to make the client in their QFHE scheme classical using dual-mode trapdoor functions. As an additional contribution, we show a new instantiation of dual-mode trapdoor functions from group actions.
Submitted 5 June, 2024;
originally announced June 2024.
-
Sparse Linear Regression and Lattice Problems
Authors:
Aparna Gupte,
Neekon Vafa,
Vinod Vaikuntanathan
Abstract:
Sparse linear regression (SLR) is a well-studied problem in statistics where one is given a design matrix $X\in\mathbb{R}^{m\times n}$ and a response vector $y=Xθ^*+w$ for a $k$-sparse vector $θ^*$ (that is, $\|θ^*\|_0\leq k$) and small, arbitrary noise $w$, and the goal is to find a $k$-sparse $\widehatθ \in \mathbb{R}^n$ that minimizes the mean squared prediction error $\frac{1}{m}\|X\widehatθ-Xθ^*\|^2_2$. While $\ell_1$-relaxation methods such as basis pursuit, Lasso, and the Dantzig selector solve SLR when the design matrix is well-conditioned, no general algorithm is known, nor is there any formal evidence of hardness in an average-case setting with respect to all efficient algorithms.
We give evidence of average-case hardness of SLR w.r.t. all efficient algorithms assuming the worst-case hardness of lattice problems. Specifically, we give an instance-by-instance reduction from a variant of the bounded distance decoding (BDD) problem on lattices to SLR, where the condition number of the lattice basis that defines the BDD instance is directly related to the restricted eigenvalue condition of the design matrix, which characterizes some of the classical statistical-computational gaps for sparse linear regression. Also, by appealing to worst-case to average-case reductions from the world of lattices, this shows hardness for a distribution of SLR instances; while the design matrices are ill-conditioned, the resulting SLR instances are in the identifiable regime.
Furthermore, for well-conditioned (essentially) isotropic Gaussian design matrices, where Lasso is known to behave well in the identifiable regime, we show hardness of outputting any good solution in the unidentifiable regime where there are many solutions, assuming the worst-case hardness of standard and well-studied lattice problems.
Submitted 22 February, 2024;
originally announced February 2024.
-
Initial Task Assignment in Multi-Human Multi-Robot Teams: An Attention-enhanced Hierarchical Reinforcement Learning Approach
Authors:
Ruiqi Wang,
Dezhong Zhao,
Arjun Gupte,
Byung-Cheol Min
Abstract:
Multi-human multi-robot (MH-MR) teams hold tremendous potential for tackling intricate and massive missions by merging the distinct strengths and expertise of individual members. The inherent heterogeneity of these teams necessitates advanced initial task assignment (ITA) methods that align tasks with the intrinsic capabilities of team members from the outset. While existing reinforcement learning approaches show encouraging results, they might fall short in addressing the nuances of long-horizon ITA problems, particularly in settings with large-scale MH-MR teams or multifaceted tasks. To bridge this gap, we propose an attention-enhanced hierarchical reinforcement learning approach that decomposes the complex ITA problem into structured sub-problems, facilitating more efficient allocations. To bolster sub-policy learning, we introduce a hierarchical cross-attribute attention (HCA) mechanism, encouraging each sub-policy within the hierarchy to discern and leverage the specific nuances in the state space that are crucial for its respective decision-making phase. Through an extensive environmental surveillance case study, we demonstrate the benefits of our model and of the HCA mechanism within it.
Submitted 7 October, 2023;
originally announced October 2023.
-
Multilinear formulations for computing Nash equilibrium of multi-player matrix games
Authors:
Miriam Fischer,
Akshay Gupte
Abstract:
We present multilinear and mixed-integer multilinear programs to find a Nash equilibrium in multi-player noncooperative games. We compare the formulations to common algorithms in Gambit, and conclude that a multilinear feasibility program finds a Nash equilibrium faster than any of the methods we compare it to, including the quantal response equilibrium method, which is recommended for large games. Hence, the multilinear feasibility program is an alternative method to find a Nash equilibrium in multi-player games, and outperforms many common algorithms. The mixed-integer formulations are generalisations of known mixed-integer programs for two-player games; however, unlike in the two-player case, these mixed-integer programs do not give better performance than existing algorithms.
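For orientation, the Nash conditions that a feasibility program of this kind encodes can be written in the generic form below. This is a standard textbook statement, not necessarily the authors' exact program; the symbols $n$ (number of players), $S_i$ (pure strategies of player $i$), $x^i$ (mixed strategy), $U_i$ (payoff tensor) and $v_i$ (expected payoff) are notation assumed here.

```latex
\begin{align*}
& \text{find } x^1,\dots,x^n,\ v_1,\dots,v_n \ \text{ such that} \\
& v_i = \sum_{a \in S_1\times\cdots\times S_n} U_i(a) \prod_{j=1}^{n} x^j_{a_j},
  && i=1,\dots,n, \\
& v_i \ge \sum_{a_{-i} \in S_{-i}} U_i(s, a_{-i}) \prod_{j\neq i} x^j_{a_j},
  && s \in S_i,\ i=1,\dots,n, \\
& \sum_{s\in S_i} x^i_s = 1, \quad x^i_s \ge 0,
  && s \in S_i,\ i=1,\dots,n.
\end{align*}
```

Every term is a product of at most one variable per player, which is what makes the system multilinear; for $n=2$ the products have degree at most two and the constraints become bilinear.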
Submitted 25 March, 2024; v1 submitted 5 August, 2022;
originally announced August 2022.
-
Large independent sets in recursive Markov random graphs
Authors:
Akshay Gupte,
Yiran Zhu
Abstract:
Computing the maximum size of an independent set in a graph is a famously hard combinatorial problem that has been well-studied for various classes of graphs. When it comes to random graphs, only the classical Erdős-Rényi-Gilbert random graph $G_{n,p}$ has been analysed and shown to have largest independent sets of size $Θ(\log{n})$ w.h.p. This classical model does not capture any dependency structure between edges that can appear in real-world networks. We initiate study in this direction by defining random graphs $G^{r}_{n,p}$ whose existence of edges is determined by a Markov process that is also governed by a decay parameter $r\in(0,1]$. We prove that w.h.p. $G^{r}_{n,p}$ has independent sets of size $(\frac{1-r}{2+ε}) \frac{n}{\log{n}}$ for arbitrary $ε> 0$, which implies an asymptotic lower bound of $Ω(π(n))$ where $π(n)$ is the prime-counting function. This is derived using bounds on the terms of a harmonic series, the Turán bound on the stability number, and a concentration analysis for a certain sequence of dependent Bernoulli variables that may also be of independent interest. Since $G^{r}_{n,p}$ collapses to $G_{n,p}$ when there is no decay, it follows that having even the slightest bit of dependency (any $r < 1$) in the random graph construction leads to the presence of large independent sets, and thus our random model has a phase transition at its boundary value of $r=1$. For the maximal independent set output by a greedy algorithm, we deduce that it has a performance ratio of at most $1 + \frac{\log{n}}{(1-r)}$ w.h.p. when the lowest degree vertex is picked at each iteration, and also show that under any other permutation of vertices the algorithm outputs a set of size $Ω(n^{1/(1+τ)})$, where $τ=1/(1-r)$, and hence has a performance ratio of $O(n^{\frac{1}{2-r}})$.
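The greedy procedure mentioned at the end of the abstract can be sketched as follows. This is a minimal illustration of the generic minimum-degree greedy rule, not the authors' code; the function name `greedy_independent_set`, the adjacency-dict input format, and the tie-breaking by vertex label are all assumptions made here.

```python
def greedy_independent_set(adj):
    """Minimum-degree greedy heuristic for a maximal independent set.

    adj maps each vertex to the set of its neighbours (undirected graph).
    Ties between equal-degree vertices are broken by vertex label
    (an assumption of this sketch).
    """
    # Work on a copy so the caller's graph is left untouched.
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    chosen = []
    while remaining:
        # Pick a vertex of lowest degree in the remaining graph.
        v = min(remaining, key=lambda u: (len(remaining[u]), u))
        chosen.append(v)
        # Delete v and all of its neighbours from the graph.
        removed = remaining[v] | {v}
        for u in removed:
            remaining.pop(u, None)
        for u in remaining:
            remaining[u] -= removed
    return chosen


# Path graph 0 - 1 - 2 - 3 - 4
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
result = greedy_independent_set(path)
```

On the path graph above the heuristic selects the two endpoints and the middle vertex, which happens to be a maximum independent set; in general the output is only guaranteed to be maximal.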
Submitted 24 May, 2024; v1 submitted 10 July, 2022;
originally announced July 2022.
-
Characterizing the Implicit Bias of Regularized SGD in Rank Minimization
Authors:
Tomer Galanti,
Zachary S. Siegel,
Aparna Gupte,
Tomaso Poggio
Abstract:
We study the bias of Stochastic Gradient Descent (SGD) to learn low-rank weight matrices when training deep neural networks. Our results show that training neural networks with mini-batch SGD and weight decay causes a bias towards rank minimization over the weight matrices. Specifically, we show, both theoretically and empirically, that this bias is more pronounced when using smaller batch sizes, higher learning rates, or increased weight decay. Additionally, we predict and observe empirically that weight decay is necessary to achieve this bias. Unlike previous literature, our analysis does not rely on assumptions about the data, convergence, or optimality of the weight matrices and applies to a wide range of neural network architectures of any width or depth. Finally, we empirically investigate the connection between this bias and generalization, finding that it has a marginal effect on generalization.
Submitted 25 October, 2023; v1 submitted 12 June, 2022;
originally announced June 2022.
-
Continuous LWE is as Hard as LWE & Applications to Learning Gaussian Mixtures
Authors:
Aparna Gupte,
Neekon Vafa,
Vinod Vaikuntanathan
Abstract:
We show direct and conceptually simple reductions between the classical learning with errors (LWE) problem and its continuous analog, CLWE (Bruna, Regev, Song and Tang, STOC 2021). This allows us to bring to bear the powerful machinery of LWE-based cryptography to the applications of CLWE. For example, we obtain the hardness of CLWE under the classical worst-case hardness of the gap shortest vector problem. Previously, this was known only under quantum worst-case hardness of lattice problems. More broadly, with our reductions between the two problems, any future developments to LWE will also apply to CLWE and its downstream applications.
As a concrete application, we show an improved hardness result for density estimation for mixtures of Gaussians. In this computational problem, given sample access to a mixture of Gaussians, the goal is to output a function that estimates the density function of the mixture. Under the (plausible and widely believed) exponential hardness of the classical LWE problem, we show that Gaussian mixture density estimation in $\mathbb{R}^n$ with roughly $\log n$ Gaussian components given $\mathsf{poly}(n)$ samples requires time quasi-polynomial in $n$. Under the (conservative) polynomial hardness of LWE, we show hardness of density estimation for $n^ε$ Gaussians for any constant $ε> 0$, which improves on Bruna, Regev, Song and Tang (STOC 2021), who show hardness for at least $\sqrt{n}$ Gaussians under polynomial (quantum) hardness assumptions.
Our key technical tool is a reduction from classical LWE to LWE with $k$-sparse secrets where the multiplicative increase in the noise is only $O(\sqrt{k})$, independent of the ambient dimension $n$.
Submitted 2 November, 2022; v1 submitted 5 April, 2022;
originally announced April 2022.
-
Lights, Camera, Action! A Framework to Improve NLP Accuracy over OCR documents
Authors:
Amit Gupte,
Alexey Romanov,
Sahitya Mantravadi,
Dalitso Banda,
Jianjie Liu,
Raza Khan,
Lakshmanan Ramu Meenal,
Benjamin Han,
Soundar Srinivasan
Abstract:
Document digitization is essential for the digital transformation of our societies, yet a crucial step in the process, Optical Character Recognition (OCR), is still not perfect. Even commercial OCR systems can produce questionable output depending on the fidelity of the scanned documents. In this paper, we demonstrate an effective framework for mitigating OCR errors for any downstream NLP task, using Named Entity Recognition (NER) as an example. We first address the data scarcity problem for model training by constructing a document synthesis pipeline, generating realistic but degraded data with NER labels. We measure the NER accuracy drop at various degradation levels and show that a text restoration model, trained on the degraded data, significantly closes the NER accuracy gaps caused by OCR errors, including on an out-of-domain dataset. For the benefit of the community, we have made the document synthesis pipeline available as an open-source project.
Submitted 5 August, 2021;
originally announced August 2021.
-
The Fine-Grained Hardness of Sparse Linear Regression
Authors:
Aparna Gupte,
Vinod Vaikuntanathan
Abstract:
Sparse linear regression is the well-studied inference problem where one is given a design matrix $\mathbf{A} \in \mathbb{R}^{M\times N}$ and a response vector $\mathbf{b} \in \mathbb{R}^M$, and the goal is to find a solution $\mathbf{x} \in \mathbb{R}^{N}$ which is $k$-sparse (that is, it has at most $k$ non-zero coordinates) and minimizes the prediction error $\|\mathbf{A} \mathbf{x} - \mathbf{b}\|_2$. On the one hand, the problem is known to be $\mathcal{NP}$-hard, which tells us that no polynomial-time algorithm exists unless $\mathcal{P} = \mathcal{NP}$. On the other hand, the best known algorithms for the problem do a brute-force search among $N^k$ possibilities. In this work, we show that there are no better-than-brute-force algorithms, assuming any one of a variety of popular conjectures including the weighted $k$-clique conjecture from the area of fine-grained complexity, or the hardness of the closest vector problem from the geometry of numbers. We also show the impossibility of better-than-brute-force algorithms when the prediction error is measured in other $\ell_p$ norms, assuming the strong exponential-time hypothesis.
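The brute-force baseline referred to above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not code from the paper: the function name `brute_force_sparse_regression` is hypothetical, the search enumerates supports of size exactly $\min(k, N)$ (which also covers sparser solutions, since coefficients on a support may be zero), and each candidate support is solved by ordinary least squares with NumPy.

```python
from itertools import combinations

import numpy as np


def brute_force_sparse_regression(A, b, k):
    """Exhaustive search for a k-sparse x minimising ||A x - b||_2.

    Tries every support of size min(k, N), at most N^k of them, and
    solves an ordinary least-squares problem on each.
    """
    M, N = A.shape
    best_x = np.zeros(N)
    best_err = float(np.linalg.norm(b))  # error of the all-zero solution
    for support in combinations(range(N), min(k, N)):
        cols = list(support)
        # Least squares restricted to the chosen columns.
        coef, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
        err = float(np.linalg.norm(A[:, cols] @ coef - b))
        if err < best_err:
            best_err = err
            best_x = np.zeros(N)
            best_x[cols] = coef
    return best_x, best_err


# Noiseless toy instance with a planted 2-sparse solution.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5))
x_true = np.array([0.0, 2.0, 0.0, -1.5, 0.0])
x_hat, err = brute_force_sparse_regression(A, A @ x_true, k=2)
```

The loop body costs one least-squares solve, so the running time is dominated by the number of supports, which is the $N^k$ factor the abstract shows cannot be substantially improved under the stated conjectures.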
Submitted 15 February, 2022; v1 submitted 6 June, 2021;
originally announced June 2021.
-
On lexicographic approximations of integer programs
Authors:
Michael Eldredge,
Akshay Gupte
Abstract:
We use the lexicographic order to define a hierarchy of primal and dual bounds on the optimum of a bounded integer program. These bounds are constructed using lex maximal and minimal feasible points taken under different permutations. Their strength is analyzed and it is shown that a family of primal bounds is tight for any $0\backslash 1$ program with nonnegative linear objective, and a different family of dual bounds is tight for any packing- or covering-type $0\backslash 1$ program with an arbitrary linear objective. The former result yields a structural characterization for the optimum of $0\backslash 1$ programs, with connections to matroid optimization, and a heuristic for general integer programs. The latter result implies a stronger polyhedral representation for the integer feasible points and a new approach for deriving strong valid inequalities to the integer hull. Since the construction of our bounds depends on the computation of lex optima, we derive explicit formulae for lex optima of some special polytopes, such as polytopes that are monotone with respect to each variable, and integral polymatroids and their base polytopes. We also classify $\mathrm{P}$ and $\mathrm{NP}$-$\mathrm{hard}$ cases of computing lex bounds and lex optima.
Submitted 26 April, 2023; v1 submitted 20 October, 2016;
originally announced October 2016.
-
Efficient storage of Pareto points in biobjective mixed integer programming
Authors:
Nathan Adelgren,
Pietro Belotti,
Akshay Gupte
Abstract:
In biobjective mixed integer linear programs (BOMILPs), two linear objectives are minimized over a polyhedron while restricting some of the variables to be integer. Since many of the techniques for finding or approximating the Pareto set of a BOMILP use and update a subset of nondominated solutions, it is highly desirable to efficiently store this subset. We present a new data structure, a variant of a binary tree that takes as input points and line segments in $\mathbb{R}^2$ and stores the nondominated subset of this input. When used within an exact solution procedure, such as branch-and-bound (BB), at termination this structure contains the set of Pareto optimal solutions.
We compare the efficiency of our structure in storing solutions to that of a dynamic list which updates via pairwise comparison. Then we use our data structure in two biobjective BB techniques available in the literature and solve three classes of instances of BOMILP, one of which is generated by us. The first experiment shows that our data structure handles up to $10^7$ points or segments much more efficiently than a dynamic list. The second experiment shows that our data structure handles points and segments much more efficiently than a list when used in a BB.
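To make the comparison baseline concrete, here is a minimal sketch of a dynamic list that is updated by pairwise comparison. It is an assumption-laden illustration rather than the authors' implementation: it handles points only (line segments are omitted), both objectives are minimized, and the function name `insert_point` is hypothetical.

```python
def insert_point(front, p):
    """Insert p = (f1, f2) into a list of nondominated points.

    Both objectives are minimised. Returns the updated list; the input
    list is not modified. Each insertion compares p against every
    stored point, so it costs time linear in the list length.
    """
    # Reject p if some stored point is at least as good in both objectives.
    for q in front:
        if q[0] <= p[0] and q[1] <= p[1]:
            return front
    # Otherwise drop every stored point that p dominates, then add p.
    kept = [q for q in front if not (p[0] <= q[0] and p[1] <= q[1])]
    kept.append(p)
    return kept


front = []
for point in [(3, 3), (1, 4), (4, 1), (2, 2), (5, 5), (2, 2)]:
    front = insert_point(front, point)
```

After the loop, `front` holds the three mutually nondominated points `(1, 4)`, `(4, 1)` and `(2, 2)`. The linear scan on every insertion is the cost that a tree-based structure, such as the one described in the abstract, aims to avoid.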
Submitted 2 May, 2017; v1 submitted 24 November, 2014;
originally announced November 2014.