
Showing 1–21 of 21 results for author: Suggala, A

  1. arXiv:2406.17542  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    CDQuant: Accurate Post-training Weight Quantization of Large Pre-trained Models using Greedy Coordinate Descent

    Authors: Pranav Ajit Nair, Arun Sai Suggala

    Abstract: Large language models (LLMs) have recently demonstrated remarkable performance across diverse language tasks. But their deployment is often constrained by their substantial computational and storage requirements. Quantization has emerged as a key technique for addressing this challenge, enabling the compression of large models with minimal impact on performance. The recent GPTQ algorithm, a post-t…

    Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2403.12236  [pdf, other]

    cs.LG cs.CV

    Improving Generalization via Meta-Learning on Hard Samples

    Authors: Nishant Jain, Arun S. Suggala, Pradeep Shenoy

    Abstract: Learned reweighting (LRW) approaches to supervised learning use an optimization criterion to assign weights for training instances, in order to maximize performance on a representative validation dataset. We pose and formalize the problem of optimized selection of the validation set used in LRW training, to improve classifier generalization. In particular, we show that using hard-to-classify insta…

    Submitted 29 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted at CVPR 2024

  3. arXiv:2403.05683  [pdf, other]

    cs.AI cs.LG

    Efficient Public Health Intervention Planning Using Decomposition-Based Decision-Focused Learning

    Authors: Sanket Shah, Arun Suggala, Milind Tambe, Aparna Taneja

    Abstract: The declining participation of beneficiaries over time is a key concern in public health programs. A popular strategy for improving retention is to have health workers `intervene' on beneficiaries at risk of dropping out. However, the availability and time of these health workers are limited resources. As a result, there has been a line of research on optimizing these limited intervention resource…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 12 pages, 3 figures, 2 tables

  4. arXiv:2402.08929  [pdf, other]

    cs.LG stat.ML

    Second Order Methods for Bandit Optimization and Control

    Authors: Arun Suggala, Y. Jennifer Sun, Praneeth Netrapalli, Elad Hazan

    Abstract: Bandit convex optimization (BCO) is a general framework for online decision making under uncertainty. While tight regret bounds for general convex losses have been established, existing algorithms achieving these bounds have prohibitive computational costs for high dimensional data. In this paper, we propose a simple and practical BCO algorithm inspired by the online Newton step algorithm. We sh…

    Submitted 13 February, 2024; originally announced February 2024.

  5. arXiv:2311.03376  [pdf, other]

    cs.IR cs.LG stat.ML

    Blocked Collaborative Bandits: Online Collaborative Filtering with Per-Item Budget Constraints

    Authors: Soumyabrata Pal, Arun Sai Suggala, Karthikeyan Shanmugam, Prateek Jain

    Abstract: We consider the problem of \emph{blocked} collaborative bandits where there are multiple users, each with an associated multi-armed bandit problem. These users are grouped into \emph{latent} clusters such that the mean reward vectors of users within the same cluster are identical. Our goal is to design algorithms that maximize the cumulative reward accrued by all the users over time, under the \em…

    Submitted 31 October, 2023; originally announced November 2023.

    Comments: 44 pages, To Appear in NeurIPS 2023

  6. arXiv:2310.18832  [pdf, other]

    cs.AI

    Responsible AI (RAI) Games and Ensembles

    Authors: Yash Gupta, Runtian Zhai, Arun Suggala, Pradeep Ravikumar

    Abstract: Several recent works have studied the societal effects of AI; these include issues such as fairness, robustness, and safety. In many of these objectives, a learner seeks to minimize its worst-case loss over a set of predefined distributions (known as uncertainty sets), with usual examples being perturbed versions of the empirical distribution. In other words, aforementioned problems can be written…

    Submitted 28 October, 2023; originally announced October 2023.

  7. arXiv:2306.09222  [pdf, other]

    cs.LG cs.AI

    Stochastic Re-weighted Gradient Descent via Distributionally Robust Optimization

    Authors: Ramnath Kumar, Kushal Majmundar, Dheeraj Nagaraj, Arun Sai Suggala

    Abstract: We present Re-weighted Gradient Descent (RGD), a novel optimization technique that improves the performance of deep neural networks through dynamic sample importance weighting. Our method is grounded in the principles of distributionally robust optimization (DRO) with Kullback-Leibler divergence. RGD is simple to implement, computationally efficient, and compatible with widely used optimizers such…

    Submitted 26 February, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

  8. arXiv:2306.09048  [pdf, other]

    cs.LG stat.ML

    Optimal Best-Arm Identification in Bandits with Access to Offline Data

    Authors: Shubhada Agrawal, Sandeep Juneja, Karthikeyan Shanmugam, Arun Sai Suggala

    Abstract: Learning paradigms based purely on offline data as well as those based solely on sequential online learning have been well-studied in the literature. In this paper, we consider combining offline data with online learning, an area less studied but of obvious practical importance. We consider the stochastic $K$-armed bandit problem, where our goal is to identify the arm with the highest mean in the…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: 45 pages, 5 figures

  9. arXiv:2306.05785  [pdf, other]

    cs.LG

    End-to-End Neural Network Compression via $\frac{\ell_1}{\ell_2}$ Regularized Latency Surrogates

    Authors: Anshul Nasery, Hardik Shah, Arun Sai Suggala, Prateek Jain

    Abstract: Neural network (NN) compression via techniques such as pruning, quantization requires setting compression hyperparameters (e.g., number of channels to be pruned, bitwidths for quantization) for each layer either manually or via neural architecture search (NAS) which can be computationally expensive. We address this problem by providing an end-to-end technique that optimizes for model's Floating Po…

    Submitted 13 June, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

  10. arXiv:2301.13273  [pdf, other]

    cs.LG cs.CR math.ST stat.ML

    Near Optimal Private and Robust Linear Regression

    Authors: Xiyang Liu, Prateek Jain, Weihao Kong, Sewoong Oh, Arun Sai Suggala

    Abstract: We study the canonical statistical estimation problem of linear regression from $n$ i.i.d.~examples under $(\varepsilon,\delta)$-differential privacy when some response variables are adversarially corrupted. We propose a variant of the popular differentially private stochastic gradient descent (DP-SGD) algorithm with two innovations: a full-batch gradient descent to improve sample complexity and a nove…

    Submitted 30 January, 2023; originally announced January 2023.

  11. arXiv:2301.07040  [pdf, other]

    cs.LG stat.ML

    Optimal Algorithms for Latent Bandits with Cluster Structure

    Authors: Soumyabrata Pal, Arun Sai Suggala, Karthikeyan Shanmugam, Prateek Jain

    Abstract: We consider the problem of latent bandits with cluster structure where there are multiple users, each with an associated multi-armed bandit problem. These users are grouped into \emph{latent} clusters such that the mean reward vectors of users within the same cluster are identical. At each round, a user, selected uniformly at random, pulls an arm and observes a corresponding noisy reward. The goal…

    Submitted 11 July, 2023; v1 submitted 17 January, 2023; originally announced January 2023.

    Comments: 48 pages. Accepted to AISTATS 2023. Added Experiments

  12. arXiv:2206.03362  [pdf, other]

    cs.LG cs.AI cs.CR stat.ME stat.ML

    Building Robust Ensembles via Margin Boosting

    Authors: Dinghuai Zhang, Hongyang Zhang, Aaron Courville, Yoshua Bengio, Pradeep Ravikumar, Arun Sai Suggala

    Abstract: In the context of adversarial robustness, a single model does not usually have enough power to defend against all possible adversarial attacks, and as a result, has sub-optimal robustness. Consequently, an emerging line of work has focused on learning an ensemble of neural networks to defend against adversarial attacks. In this work, we take a principled approach towards building robust ensembles.…

    Submitted 7 June, 2022; originally announced June 2022.

    Comments: Accepted by ICML 2022

  13. arXiv:2110.13948  [pdf, other]

    cs.LG stat.ML

    Boosted CVaR Classification

    Authors: Runtian Zhai, Chen Dan, Arun Sai Suggala, Zico Kolter, Pradeep Ravikumar

    Abstract: Many modern machine learning tasks require models with high tail performance, i.e. high performance over the worst-off samples in the dataset. This problem has been widely studied in fields such as algorithmic fairness, class imbalance, and risk-sensitive decision making. A popular approach to maximize the model's tail performance is to minimize the CVaR (Conditional Value at Risk) loss, which com…

    Submitted 10 November, 2021; v1 submitted 26 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2021. 16 pages, 4 figures

  14. arXiv:2006.11430  [pdf, ps, other]

    stat.ML cs.LG stat.ME

    Learning Minimax Estimators via Online Learning

    Authors: Kartik Gupta, Arun Sai Suggala, Adarsh Prasad, Praneeth Netrapalli, Pradeep Ravikumar

    Abstract: We consider the problem of designing minimax estimators for estimating the parameters of a probability distribution. Unlike classical approaches such as the MLE and minimum distance estimators, we consider an algorithmic approach for constructing such estimators. We view the problem of designing minimax estimators as finding a mixed strategy Nash equilibrium of a zero-sum game. By leveraging recen…

    Submitted 19 June, 2020; originally announced June 2020.

    Comments: 60 pages. Under review

  15. arXiv:2006.07541  [pdf, ps, other]

    cs.LG stat.ML

    Follow the Perturbed Leader: Optimism and Fast Parallel Algorithms for Smooth Minimax Games

    Authors: Arun Sai Suggala, Praneeth Netrapalli

    Abstract: We consider the problem of online learning and its application to solving minimax games. For the online learning problem, Follow the Perturbed Leader (FTPL) is a widely studied algorithm which enjoys the optimal $O(T^{1/2})$ worst-case regret guarantee for both convex and nonconvex losses. In this work, we show that when the sequence of loss functions is predictable, a simple modification of FTPL…

    Submitted 12 June, 2020; originally announced June 2020.

    Comments: 38 pages. Under review

  16. arXiv:1903.08192  [pdf, ps, other]

    cs.LG stat.ML

    Adaptive Hard Thresholding for Near-optimal Consistent Robust Regression

    Authors: Arun Sai Suggala, Kush Bhatia, Pradeep Ravikumar, Prateek Jain

    Abstract: We study the problem of robust linear regression with response variable corruptions. We consider the oblivious adversary model, where the adversary corrupts a fraction of the responses in complete ignorance of the data. We provide a nearly linear time estimator which consistently estimates the true regression vector, even with $1-o(1)$ fraction of corruptions. Existing results in this setting eith…

    Submitted 19 March, 2019; originally announced March 2019.

  17. arXiv:1903.08110  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Online Non-Convex Learning: Following the Perturbed Leader is Optimal

    Authors: Arun Sai Suggala, Praneeth Netrapalli

    Abstract: We study the problem of online learning with non-convex losses, where the learner has access to an offline optimization oracle. We show that the classical Follow the Perturbed Leader (FTPL) algorithm achieves optimal regret rate of $O(T^{-1/2})$ in this setting. This improves upon the previous best-known regret rate of $O(T^{-1/3})$ for FTPL. We further show that an optimistic variant of FTPL achi…

    Submitted 20 September, 2019; v1 submitted 19 March, 2019; originally announced March 2019.

  18. arXiv:1901.09392  [pdf, other]

    cs.LG stat.ML

    On the (In)fidelity and Sensitivity for Explanations

    Authors: Chih-Kuan Yeh, Cheng-Yu Hsieh, Arun Sai Suggala, David I. Inouye, Pradeep Ravikumar

    Abstract: We consider objective evaluation measures of saliency explanations for complex black-box machine learning models. We propose simple robust variants of two notions that have been considered in recent literature: (in)fidelity, and sensitivity. We analyze optimal explanations with respect to both these measures, and while the optimal explanation for sensitivity is a vacuous constant explanation, the…

    Submitted 3 November, 2019; v1 submitted 27 January, 2019; originally announced January 2019.

    Comments: NeurIPS 2019 camera ready, previous version on arXiv: "How Sensitive are Sensitivity-Based Explanations"

  19. arXiv:1806.02924  [pdf, ps, other]

    stat.ML cs.LG

    Revisiting Adversarial Risk

    Authors: Arun Sai Suggala, Adarsh Prasad, Vaishnavh Nagarajan, Pradeep Ravikumar

    Abstract: Recent works on adversarial perturbations show that there is an inherent trade-off between standard test accuracy and adversarial accuracy. Specifically, they show that no classifier can simultaneously be robust to adversarial perturbations and achieve high standard test accuracy. However, this is contrary to the standard notion that on tasks such as image classification, humans are robust classif…

    Submitted 22 March, 2019; v1 submitted 7 June, 2018; originally announced June 2018.

  20. arXiv:1802.06485  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Robust Estimation via Robust Gradient Estimation

    Authors: Adarsh Prasad, Arun Sai Suggala, Sivaraman Balakrishnan, Pradeep Ravikumar

    Abstract: We provide a new computationally-efficient class of estimators for risk minimization. We show that these estimators are robust for general statistical models: in the classical Huber epsilon-contamination model and in heavy-tailed settings. Our workhorse is a novel robust variant of gradient descent, and we provide conditions under which our gradient descent variant provides accurate estimators in…

    Submitted 20 April, 2018; v1 submitted 18 February, 2018; originally announced February 2018.

    Comments: 48 pages, 5 figures

  21. arXiv:1505.05117  [pdf, other]

    stat.ML

    Vector-Space Markov Random Fields via Exponential Families

    Authors: Wesley Tansey, Oscar Hernan Madrid Padilla, Arun Sai Suggala, Pradeep Ravikumar

    Abstract: We present Vector-Space Markov Random Fields (VS-MRFs), a novel class of undirected graphical models where each variable can belong to an arbitrary vector space. VS-MRFs generalize a recent line of work on scalar-valued, uni-parameter exponential family and mixed graphical models, thereby greatly broadening the class of exponential families available (e.g., allowing multinomial and Dirichlet distr…

    Submitted 19 May, 2015; originally announced May 2015.

    Comments: See https://github.com/tansey/vsmrfs for code