-
Dueling Optimization with a Monotone Adversary
Authors:
Avrim Blum,
Meghal Gupta,
Gene Li,
Naren Sarayu Manoj,
Aadirupa Saha,
Yuanyuan Yang
Abstract:
We introduce and study the problem of dueling optimization with a monotone adversary, which is a generalization of (noiseless) dueling convex optimization. The goal is to design an online algorithm to find a minimizer $\mathbf{x}^{*}$ for a function $f\colon X \to \mathbb{R}$, where $X \subseteq \mathbb{R}^d$. In each round, the algorithm submits a pair of guesses, i.e., $\mathbf{x}^{(1)}$ and $\mathbf{x}^{(2)}$, and the adversary responds with any point in the space that is at least as good as both guesses. The cost of each query is the suboptimality of the worse of the two guesses; i.e., $\max\left( f(\mathbf{x}^{(1)}), f(\mathbf{x}^{(2)}) \right) - f(\mathbf{x}^{*})$. The goal is to minimize the number of iterations required to find an $\varepsilon$-optimal point and to minimize the total cost (regret) of the guesses over many rounds. Our main result is an efficient randomized algorithm for several natural choices of the function $f$ and set $X$ that incurs cost $O(d)$ and iteration complexity $O(d\log(1/\varepsilon)^2)$. Moreover, our dependence on $d$ is asymptotically optimal, as we show examples in which any randomized algorithm for this problem must incur $\Omega(d)$ cost and iteration complexity.
Submitted 18 November, 2023;
originally announced November 2023.
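The query protocol described in the abstract above can be sketched in a few lines. This is only an illustration of the interaction model (two guesses, a monotone response, and the per-round cost), not the authors' algorithm. The objective `f` (squared Euclidean norm), the particular adversary, and the perturbation-based learner in `play` are all assumptions chosen for the example.

```python
import random


def f(x):
    """Hypothetical objective: squared Euclidean norm, minimized at the origin."""
    return sum(v * v for v in x)


def monotone_adversary(x1, x2, rng):
    """Return a point at least as good as both guesses.

    This is one valid adversary among many (an assumption for this sketch):
    it shrinks the better guess toward the origin by a random factor in [0, 1],
    which never increases f for the objective above.
    """
    best = x1 if f(x1) <= f(x2) else x2
    shrink = rng.uniform(0.0, 1.0)
    return [shrink * v for v in best]


def query_cost(x1, x2, f_star=0.0):
    """Cost of one round: suboptimality of the worse of the two guesses."""
    return max(f(x1), f(x2)) - f_star


def play(x0, rounds, step, seed=0):
    """Toy learner (hypothetical): pair the current point with a random perturbation."""
    rng = random.Random(seed)
    x = list(x0)
    total_cost = 0.0
    for _ in range(rounds):
        x2 = [v + rng.gauss(0.0, step) for v in x]
        total_cost += query_cost(x, x2)
        # The learner simply adopts whatever point the adversary returns.
        x = monotone_adversary(x, x2, rng)
    return x, total_cost
```

Because the response is never worse than either guess, `f` evaluated at the learner's current point is non-increasing across rounds in this sketch, regardless of how the second guess is chosen.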
-
The Change-of-Measure Method, Block Lewis Weights, and Approximating Matrix Block Norms
Authors:
Naren Sarayu Manoj,
Max Ovsiankin
Abstract:
Given a matrix $\mathbf{A} \in \mathbb{R}^{k \times n}$, a partitioning of $[k]$ into groups $S_1,\dots,S_m$, an outer norm $p$, and a collection of inner norms such that either $p \ge 1$ and $p_1,\dots,p_m \ge 2$ or $p_1=\dots=p_m=p \ge 1/\log n$, we prove that there is a sparse weight vector $\boldsymbol{\beta} \in \mathbb{R}^{m}$ such that $\sum_{i=1}^m \beta_i \cdot \|\mathbf{A}_{S_i}\mathbf{x}\|_{p_i}^p \approx_{1\pm\varepsilon} \sum_{i=1}^m \|\mathbf{A}_{S_i}\mathbf{x}\|_{p_i}^p$, where the number of nonzero entries of $\boldsymbol{\beta}$ is at most $O_{p,p_i}(\varepsilon^{-2}n^{\max(1,p/2)}(\log n)^2(\log(n/\varepsilon)))$. When $p_1,\dots,p_m \ge 2$, this weight vector arises from an importance sampling procedure based on the block Lewis weights, a recently proposed generalization of Lewis weights. Additionally, we prove that there exist efficient algorithms to find the sparse weight vector $\boldsymbol{\beta}$ in several important regimes of $p$ and $p_1,\dots,p_m$.
Our main technical contribution is a substantial generalization of the change-of-measure method that Bourgain, Lindenstrauss, and Milman used to obtain the analogous result when every group has size $1$. Our generalization allows one to analyze change of measures beyond those implied by D. Lewis's original construction, including the measure implied by the block Lewis weights and natural approximations of this measure.
Submitted 16 November, 2023;
originally announced November 2023.
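To make the quantity in the abstract above concrete, the following sketch evaluates the block-norm sum $\sum_i \beta_i \|\mathbf{A}_{S_i}\mathbf{x}\|_{p_i}^p$ for a given weight vector (all weights equal to 1 gives the unweighted right-hand side). It does not compute block Lewis weights or perform the sampling; the function name, argument layout, and pure-Python matrix representation are assumptions made for illustration.

```python
def block_norm_objective(A, groups, inner_p, p, x, weights=None):
    """Evaluate sum_i w_i * ||A_{S_i} x||_{p_i}^p.

    A        : list of rows (each a list of floats), shape k x n
    groups   : list of row-index lists S_1, ..., S_m partitioning range(k)
    inner_p  : list of inner norm exponents p_1, ..., p_m
    p        : outer exponent
    x        : vector of length n
    weights  : optional list of m weights (defaults to all ones)
    """
    total = 0.0
    for i, rows in enumerate(groups):
        # Entries of the block product A_{S_i} x.
        block = [sum(a * b for a, b in zip(A[r], x)) for r in rows]
        # Inner p_i-norm of the block.
        inner = sum(abs(v) ** inner_p[i] for v in block) ** (1.0 / inner_p[i])
        w = 1.0 if weights is None else weights[i]
        total += w * inner ** p
    return total
```

Calling the function once with `weights=None` and once with a candidate sparse weight vector lets one compare both sides of the $(1\pm\varepsilon)$ approximation at a fixed $\mathbf{x}$.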
-
Near-Optimal Streaming Ellipsoidal Rounding for General Convex Polytopes
Authors:
Yury Makarychev,
Naren Sarayu Manoj,
Max Ovsiankin
Abstract:
We give near-optimal algorithms for computing an ellipsoidal rounding of a convex polytope whose vertices are given in a stream. The approximation factor is linear in the dimension (as in John's theorem) and only loses an excess logarithmic factor in the aspect ratio of the polytope. Our algorithms are nearly optimal in two senses: first, their runtimes nearly match those of the most efficient known algorithms for the offline version of the problem. Second, their approximation factors nearly match a lower bound we show against a natural class of geometric streaming algorithms. In contrast to existing works in the streaming setting that compute ellipsoidal roundings only for centrally symmetric convex polytopes, our algorithms apply to general convex polytopes. We also show how to use our algorithms to construct coresets from a stream of points that approximately preserve both the ellipsoidal rounding and the convex hull of the original set of points.
Submitted 15 November, 2023;
originally announced November 2023.
-
Interpolation Learning With Minimum Description Length
Authors:
Naren Sarayu Manoj,
Nathan Srebro
Abstract:
We prove that the Minimum Description Length learning rule exhibits tempered overfitting. We obtain tempered agnostic finite sample learning guarantees and characterize the asymptotic behavior in the presence of random label noise.
Submitted 14 February, 2023;
originally announced February 2023.
-
Streaming Algorithms for Ellipsoidal Approximation of Convex Polytopes
Authors:
Yury Makarychev,
Naren Sarayu Manoj,
Max Ovsiankin
Abstract:
We give efficient deterministic one-pass streaming algorithms for finding an ellipsoidal approximation of a symmetric convex polytope. The algorithms are near-optimal in that their approximation factors differ from that of the optimal offline solution only by a factor sub-logarithmic in the aspect ratio of the polytope.
Submitted 14 June, 2022;
originally announced June 2022.
-
An Optimal Algorithm for Certifying Monotone Functions
Authors:
Meghal Gupta,
Naren Sarayu Manoj
Abstract:
Given query access to a monotone function $f\colon\{0,1\}^n\to\{0,1\}$ with certificate complexity $C(f)$ and an input $x^{\star}$, we design an algorithm that outputs a size-$C(f)$ subset of $x^{\star}$ certifying the value of $f(x^{\star})$. Our algorithm makes $O(C(f) \cdot \log n)$ queries to $f$, which matches the information-theoretic lower bound for this problem and resolves the concrete open question posed in the STOC '22 paper of Blanc, Koch, Lange, and Tan [BKLT22].
We extend this result to an algorithm that finds a size-$2C(f)$ certificate for a real-valued monotone function with $O(C(f) \cdot \log n)$ queries. We also complement our algorithms with a hardness result, in which we show that finding the shortest possible certificate in $x^{\star}$ may require $\Omega\left(\binom{n}{C(f)}\right)$ queries in the worst case.
Submitted 3 April, 2022;
originally announced April 2022.
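For intuition about what a certificate for a monotone Boolean function is, here is a naive greedy baseline. It is explicitly not the algorithm from the abstract above: it spends up to $n+1$ queries rather than $O(C(f) \cdot \log n)$, and it returns an inclusion-minimal certificate, which need not have size $C(f)$. The function name and the example function in the usage note are hypothetical.

```python
def greedy_certificate(f, x):
    """Return indices of x that certify f(x) for a monotone Boolean f.

    For a monotone f, only coordinates equal to f(x) can matter:
    1-coordinates certify the value 1, and 0-coordinates certify the value 0.
    Each such coordinate is flipped to the adverse value in turn and
    stays flipped if the function value is unchanged.
    """
    value = f(x)
    y = list(x)
    for i in range(len(x)):
        if x[i] != value:
            continue  # already at the adverse value; cannot be needed
        y[i] = 1 - value
        if f(y) != value:
            y[i] = value  # this coordinate is needed; restore it
    # Coordinates still equal to `value` pin down f on every consistent input.
    return [i for i in range(len(x)) if y[i] == value]
```

For example, with the monotone function `lambda z: int((z[0] and z[1]) or z[2])` and input `[1, 1, 1, 0]`, the loop discards coordinates 0 and 1 and keeps only coordinate 2.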
-
Excess Capacity and Backdoor Poisoning
Authors:
Naren Sarayu Manoj,
Avrim Blum
Abstract:
A backdoor data poisoning attack is an adversarial attack wherein the attacker injects several watermarked, mislabeled training examples into a training set. The watermark does not impact the test-time performance of the model on typical data; however, the model reliably errs on watermarked examples.
To gain a better foundational understanding of backdoor data poisoning attacks, we present a formal theoretical framework within which one can discuss backdoor data poisoning attacks for classification problems. We then use this to analyze important statistical and computational issues surrounding these attacks.
On the statistical front, we identify a parameter we call the memorization capacity that captures the intrinsic vulnerability of a learning problem to a backdoor attack. This allows us to argue about the robustness of several natural learning problems to backdoor attacks. Our results favoring the attacker involve presenting explicit constructions of backdoor attacks, and our robustness results show that some natural problem settings cannot yield successful backdoor attacks.
From a computational standpoint, we show that under certain assumptions, adversarial training can detect the presence of backdoors in a training set. We then show that under similar assumptions, two closely related problems we call backdoor filtering and robust generalization are nearly equivalent. This implies that it is both asymptotically necessary and sufficient to design algorithms that can identify watermarked examples in the training set in order to obtain a learning algorithm that both generalizes well to unseen data and is robust to backdoors.
Submitted 3 November, 2021; v1 submitted 1 September, 2021;
originally announced September 2021.
-
Conditional Classification: A Solution for Computational Energy Reduction
Authors:
Ali Mirzaeian,
Sai Manoj,
Ashkan Vakil,
Houman Homayoun,
Avesta Sasan
Abstract:
Deep convolutional neural networks have shown high efficiency in computer vision and other applications. However, with the increase in the depth of the networks, the computational complexity is growing exponentially. In this paper, we propose a novel solution to reduce the computational complexity of convolutional neural network models used for many-class image classification. Our proposed technique breaks the classification task into two steps: 1) coarse-grain classification, in which the input samples are classified among a set of hyper-classes, and 2) fine-grain classification, in which the final labels are predicted among those hyper-classes detected at the first step. We illustrate that our proposed classifier can reach the level of accuracy reported by the best-in-class classification models with less computational complexity (FLOP count) by activating only the parts of the model that are needed for the image classification.
Submitted 7 January, 2021; v1 submitted 28 June, 2020;
originally announced June 2020.
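The two-step scheme described in the abstract above can be illustrated with a minimal dispatch sketch. It assumes a single hyper-class is selected per input and uses toy rule-based stand-ins for the coarse and fine classifiers; the names `conditional_classify`, `coarse`, `FINE`, and the feature keys are hypothetical and not taken from the paper.

```python
def conditional_classify(x, coarse_model, fine_models):
    """Two-step prediction: pick a hyper-class, then run only its fine classifier."""
    hyper = coarse_model(x)
    label = fine_models[hyper](x)  # the other fine classifiers are never evaluated
    return hyper, label


calls = []  # records which fine classifiers actually ran


def coarse(x):
    """Toy coarse-grain classifier over two hyper-classes."""
    return "animal" if x["legs"] > 0 else "vehicle"


def fine_animal(x):
    calls.append("animal")
    return "bird" if x["legs"] == 2 else "dog"


def fine_vehicle(x):
    calls.append("vehicle")
    return "car" if x["wheels"] == 4 else "bike"


FINE = {"animal": fine_animal, "vehicle": fine_vehicle}
```

The compute saving in this sketch comes from the dictionary lookup: only the fine classifier for the detected hyper-class is invoked, which mirrors the idea of activating only the needed parts of the model.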
-
Code-Bridged Classifier (CBC): A Low or Negative Overhead Defense for Making a CNN Classifier Robust Against Adversarial Attacks
Authors:
Farnaz Behnia,
Ali Mirzaeian,
Mohammad Sabokrou,
Sai Manoj,
Tinoosh Mohsenin,
Khaled N. Khasawneh,
Liang Zhao,
Houman Homayoun,
Avesta Sasan
Abstract:
In this paper, we propose Code-Bridged Classifier (CBC), a framework for making a Convolutional Neural Network (CNN) robust against adversarial attacks without increasing, or even while decreasing, the overall model's computational complexity. More specifically, we propose a stacked encoder-convolutional model, in which the input image is first encoded by the encoder module of a denoising auto-encoder, and then the resulting latent representation (without being decoded) is fed to a reduced-complexity CNN for image classification. We illustrate that this network not only is more robust to adversarial examples but also has a significantly lower computational complexity when compared to the prior-art defenses.
Submitted 16 January, 2020;
originally announced January 2020.
-
Urban Delay Tolerant Network Simulator (UDTNSim v0.1)
Authors:
Sarath Babu,
Gaurav Jain,
B. S. Manoj
Abstract:
Delay Tolerant Networking (DTN) is an approach to networking which handles network disruptions and high delays that may occur in many kinds of communication networks. The major reasons for high delay include partial connectivity of networks as can be seen in many types of ad hoc wireless networks with frequent network partitions, long propagation time as experienced in inter-planetary and deep space networks, and frequent link disruptions due to the mobility of nodes as observed in terrestrial wireless network environments. Experimenting with network architectures, protocols, and mobility models in such real-world scenarios is difficult due to the complexities involved in the network environment. Therefore, in this document, we present the documentation of an Urban Delay Tolerant Network Simulator (UDTNSim) version 0.1, capable of simulating urban road network environments with DTN characteristics including mobility models and routing protocols. The mobility models included in this version of UDTNSim are (i) Stationary Movement, (ii) Simple Random Movement, (iii) Path Type Based Movement, (iv) Path Memory Based Movement, (v) Path Type with Restricted Movement, and (vi) Path Type with Wait Movement. In addition to mobility models, we also provide three routing and data hand-off protocols: (i) Epidemic Routing, (ii) Superior Only Handoff, and (iii) Superior Peer Handoff. UDTNSim v0.1 is designed using an object-oriented programming approach in order to provide flexibility in adding new features to the DTN environment. UDTNSim v0.1 is distributed as an open source simulator for the use of the research community.
Submitted 17 September, 2017;
originally announced September 2017.
-
Graph Fourier Transform based on Directed Laplacian
Authors:
Rahul Singh,
Abhishek Chakraborty,
B. S. Manoj
Abstract:
In this paper, we redefine the Graph Fourier Transform (GFT) under the DSP$_\mathrm{G}$ framework. We consider the Jordan eigenvectors of the directed Laplacian as graph harmonics and the corresponding eigenvalues as the graph frequencies. For this purpose, we propose a shift operator based on the directed Laplacian of a graph. Based on our shift operator, we then define total variation of graph signals, which is used in frequency ordering. We achieve natural frequency ordering and interpretation via the proposed definition of GFT. Moreover, we show that our proposed shift operator makes the LSI filters under DSP$_\mathrm{G}$ polynomial in the directed Laplacian.
Submitted 13 January, 2016;
originally announced January 2016.