Showing 1–50 of 61 results for author: Garg, S

Searching in archive stat.
  1. arXiv:2404.14758  [pdf, other]

    math.OC cs.LG stat.ML

    Second-order Information Promotes Mini-Batch Robustness in Variance-Reduced Gradients

    Authors: Sachin Garg, Albert S. Berahas, Michał Dereziński

    Abstract: We show that, for finite-sum minimization problems, incorporating partial second-order information of the objective function can dramatically improve the robustness to mini-batch size of variance-reduced stochastic gradient methods, making them more scalable while retaining their benefits over traditional Newton-type approaches. We demonstrate this phenomenon on a prototypical stochastic second-or…

    Submitted 23 April, 2024; originally announced April 2024.

    MSC Class: 65K05; 90C06; 90C30

  2. arXiv:2404.07815  [pdf, other]

    cs.LG cs.AI stat.ML

    Post-Hoc Reversal: Are We Selecting Models Prematurely?

    Authors: Rishabh Ranjan, Saurabh Garg, Mrigank Raman, Carlos Guestrin, Zachary Chase Lipton

    Abstract: Trained models are often composed with post-hoc transforms such as temperature scaling (TS), ensembling and stochastic weight averaging (SWA) to improve performance, robustness, uncertainty estimation, etc. However, such transforms are typically applied only after the base models have already been finalized by standard means. In this paper, we challenge this practice with an extensive empirical st…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 9 pages + references + appendix, 7 figures

  3. arXiv:2402.16926  [pdf, other]

    cs.CR cs.AI cs.LG stat.ML

    On the (In)feasibility of ML Backdoor Detection as an Hypothesis Testing Problem

    Authors: Georg Pichler, Marco Romanelli, Divya Prakash Manivannan, Prashanth Krishnamurthy, Farshad Khorrami, Siddharth Garg

    Abstract: We introduce a formal statistical definition for the problem of backdoor detection in machine learning systems and use it to analyze the feasibility of such problems, providing evidence for the utility and applicability of our definition. The main contributions of this work are an impossibility result and an achievability result for backdoor detection. We show a no-free-lunch theorem, proving that…

    Submitted 26 February, 2024; originally announced February 2024.

  4. arXiv:2312.03318  [pdf, other]

    cs.LG cs.CV stat.ML

    Complementary Benefits of Contrastive Learning and Self-Training Under Distribution Shift

    Authors: Saurabh Garg, Amrith Setlur, Zachary Chase Lipton, Sivaraman Balakrishnan, Virginia Smith, Aditi Raghunathan

    Abstract: Self-training and contrastive learning have emerged as leading techniques for incorporating unlabeled data, both under distribution shift (unsupervised domain adaptation) and when it is absent (semi-supervised learning). However, despite the popularity and compatibility of these techniques, their efficacy in combination remains unexplored. In this paper, we undertake a systematic empirical investi…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023

  5. arXiv:2311.11194  [pdf, other]

    cs.DS cs.IT cs.LG stat.ML

    Testing with Non-identically Distributed Samples

    Authors: Shivam Garg, Chirag Pabbaraju, Kirankumar Shiragur, Gregory Valiant

    Abstract: We examine the extent to which sublinear-sample property testing and estimation applies to settings where samples are independently but not identically distributed. Specifically, we consider the following distributional property testing framework: Suppose there is a set of distributions over a discrete support of size $k$, $\textbf{p}_1, \textbf{p}_2,\ldots,\textbf{p}_T$, and we obtain $c$ indepen…

    Submitted 18 November, 2023; originally announced November 2023.

  6. arXiv:2307.08999  [pdf, ps, other]

    cs.LG stat.ML

    Oracle Efficient Online Multicalibration and Omniprediction

    Authors: Sumegha Garg, Christopher Jung, Omer Reingold, Aaron Roth

    Abstract: A recent line of work has shown a surprising connection between multicalibration, a multi-group fairness notion, and omniprediction, a learning paradigm that provides simultaneous loss minimization guarantees for a large family of loss functions. Prior work studies omniprediction in the batch setting. We initiate the study of omniprediction in the online adversarial setting. Although there exist a…

    Submitted 18 July, 2023; originally announced July 2023.

  7. arXiv:2306.00312  [pdf, other]

    stat.ML cs.LG

    (Almost) Provable Error Bounds Under Distribution Shift via Disagreement Discrepancy

    Authors: Elan Rosenfeld, Saurabh Garg

    Abstract: We derive an (almost) guaranteed upper bound on the error of deep neural networks under distribution shift using unlabeled test data. Prior methods either give bounds that are vacuous in practice or give estimates that are accurate on average but heavily underestimate error for a sizeable fraction of shifts. In particular, the latter only give guarantees based on complex continuous measures such a…

    Submitted 31 May, 2023; originally announced June 2023.

  8. arXiv:2305.19570  [pdf, other]

    stat.ML cs.LG

    Online Label Shift: Optimal Dynamic Regret meets Practical Algorithms

    Authors: Dheeraj Baby, Saurabh Garg, Tzu-Ching Yen, Sivaraman Balakrishnan, Zachary Chase Lipton, Yu-Xiang Wang

    Abstract: This paper focuses on supervised and unsupervised online label shift, where the class marginals $Q(y)$ vary but the class-conditionals $Q(x|y)$ remain invariant. In the unsupervised setting, our goal is to adapt a learner, trained on some offline labeled data, to changing label distributions given unlabeled online data. In the supervised setting, we must both learn a classifier and adapt to the…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: First three authors contributed equally

  9. arXiv:2302.03020  [pdf, other]

    cs.LG cs.CV stat.ML

    RLSbench: Domain Adaptation Under Relaxed Label Shift

    Authors: Saurabh Garg, Nick Erickson, James Sharpnack, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton

    Abstract: Despite the emergence of principled methods for domain adaptation under label shift, their sensitivity to shifts in class conditional distributions is precariously underexplored. Meanwhile, popular deep domain adaptation heuristics tend to falter when faced with label proportion shifts. While several papers modify these heuristics in attempts to handle label proportion shifts, inconsistencies i…

    Submitted 5 June, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: Accepted at ICML 2023. Paper website: https://sites.google.com/view/rlsbench/

  10. arXiv:2302.01051  [pdf, other]

    stat.ML cs.LG

    Randomized prior wavelet neural operator for uncertainty quantification

    Authors: Shailesh Garg, Souvik Chakraborty

    Abstract: In this paper, we propose a novel data-driven operator learning framework referred to as the \textit{Randomized Prior Wavelet Neural Operator} (RP-WNO). The proposed RP-WNO is an extension of the recently proposed wavelet neural operator, which boasts excellent generalizing capabilities but cannot estimate the uncertainty associated with its predictions. RP-WNO, unlike the vanilla WNO, comes with…

    Submitted 2 February, 2023; originally announced February 2023.

  11. arXiv:2208.12926  [pdf, ps, other]

    cs.LG stat.ML

    Overparameterization from Computational Constraints

    Authors: Sanjam Garg, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody, Mingyuan Wang

    Abstract: Overparameterized models with millions of parameters have been hugely successful. In this work, we ask: can the need for large models be, at least in part, due to the \emph{computational} limitations of the learner? Additionally, we ask, is this situation exacerbated for \emph{robust} learning? We show that this indeed could be the case. We show learning tasks for which computationally bounded lea…

    Submitted 15 October, 2022; v1 submitted 27 August, 2022; originally announced August 2022.

  12. arXiv:2207.13179  [pdf, other]

    cs.LG stat.ML

    Unsupervised Learning under Latent Label Shift

    Authors: Manley Roberts, Pranav Mani, Saurabh Garg, Zachary C. Lipton

    Abstract: What sorts of structure might enable a learner to discover classes from unlabeled data? Traditional approaches rely on feature-space similarity and heroic assumptions on the data. In this paper, we introduce unsupervised learning under Latent Label Shift (LLS), where we have access to unlabeled data from multiple domains such that the label marginals $p_d(y)$ can shift across domains but the class…

    Submitted 1 December, 2022; v1 submitted 26 July, 2022; originally announced July 2022.

    Comments: NeurIPS 2022. Manley Roberts and Pranav Mani contributed equally to this work

  13. arXiv:2206.05655  [pdf, other]

    stat.ML cs.LG

    Variational Bayes Deep Operator Network: A data-driven Bayesian solver for parametric differential equations

    Authors: Shailesh Garg, Souvik Chakraborty

    Abstract: Neural network based data-driven operator learning schemes have shown tremendous potential in computational mechanics. DeepONet is one such neural network architecture which has gained widespread appreciation owing to its excellent prediction capabilities. Having said that, being set in a deterministic framework exposes DeepONet architecture to the risk of overfitting, poor generalization and in i…

    Submitted 12 June, 2022; originally announced June 2022.

  14. arXiv:2202.09931  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Deconstructing Distributions: A Pointwise Framework of Learning

    Authors: Gal Kaplun, Nikhil Ghosh, Saurabh Garg, Boaz Barak, Preetum Nakkiran

    Abstract: In machine learning, we traditionally evaluate the performance of a single model, averaged over a collection of test inputs. In this work, we propose a new approach: we measure the performance of a collection of models when evaluated on a $\textit{single input point}$. Specifically, we study a point's $\textit{profile}$: the relationship between models' average performance on the test distribution…

    Submitted 7 June, 2022; v1 submitted 20 February, 2022; originally announced February 2022.

    Comments: GK and NG contributed equally. v2: Added Figures 4, 5

  15. arXiv:2201.13145  [pdf, other]

    stat.ML cs.LG

    Assessment of DeepONet for reliability analysis of stochastic nonlinear dynamical systems

    Authors: Shailesh Garg, Harshit Gupta, Souvik Chakraborty

    Abstract: Time dependent reliability analysis and uncertainty quantification of structural systems subjected to a stochastic forcing function is a challenging endeavour as it necessitates considerable computational time. We investigate the efficacy of the recently proposed DeepONet in solving time dependent reliability analysis and uncertainty quantification of systems subjected to stochastic loading. Unlike conve…

    Submitted 31 January, 2022; originally announced January 2022.

    Comments: 21 pages

  16. arXiv:2201.04234  [pdf, other]

    cs.LG stat.ML

    Leveraging Unlabeled Data to Predict Out-of-Distribution Performance

    Authors: Saurabh Garg, Sivaraman Balakrishnan, Zachary C. Lipton, Behnam Neyshabur, Hanie Sedghi

    Abstract: Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions that may cause performance drops. In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data. We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on…

    Submitted 14 October, 2022; v1 submitted 11 January, 2022; originally announced January 2022.

    Comments: Accepted at ICLR 2022

  17. arXiv:2111.08706  [pdf, other]

    cs.NE cs.LG q-bio.NC stat.ML

    How and When Random Feedback Works: A Case Study of Low-Rank Matrix Factorization

    Authors: Shivam Garg, Santosh S. Vempala

    Abstract: The success of gradient descent in ML and especially for learning neural networks is remarkable and robust. In the context of how the brain learns, one aspect of gradient descent that appears biologically difficult to realize (if not implausible) is that its updates rely on feedback from later layers to earlier layers through the same connections. Such bidirected links are relatively few in brain…

    Submitted 10 April, 2022; v1 submitted 16 November, 2021; originally announced November 2021.

    Comments: Fixed minor typos. AISTATS 2022

  18. arXiv:2111.00980  [pdf, other]

    cs.LG stat.ML

    Mixture Proportion Estimation and PU Learning: A Modern Approach

    Authors: Saurabh Garg, Yifan Wu, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton

    Abstract: Given only positive examples and unlabeled examples (from both positive and negative classes), we might hope nevertheless to estimate an accurate positive-versus-negative classifier. Formally, this task is broken down into two subtasks: (i) Mixture Proportion Estimation (MPE): determining the fraction of positive examples in the unlabeled data; and (ii) PU-learning: given such an estimate, lea…

    Submitted 1 November, 2021; originally announced November 2021.

    Comments: Spotlight at NeurIPS 2021

  19. arXiv:2110.02667  [pdf, other]

    cs.LG cs.SI stat.ML

    Attentive Walk-Aggregating Graph Neural Networks

    Authors: Mehmet F. Demirel, Shengchao Liu, Siddhant Garg, Zhenmei Shi, Yingyu Liang

    Abstract: Graph neural networks (GNNs) have been shown to possess strong representation power, which can be exploited for downstream prediction tasks on graph-structured data, such as molecules and social networks. They typically learn representations by aggregating information from the $K$-hop neighborhood of individual vertices or from the enumerated walks in the graph. Prior studies have demonstrated the…

    Submitted 21 August, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: Published in TMLR (Transactions on Machine Learning Research) (08/2022) 32 pages

  20. arXiv:2109.00538  [pdf, other]

    stat.ML cs.LG physics.data-an

    Physics-integrated hybrid framework for model form error identification in nonlinear dynamical systems

    Authors: Shailesh Garg, Souvik Chakraborty, Budhaditya Hazra

    Abstract: For real-life nonlinear systems, the exact form of nonlinearity is often not known and the known governing equations are often based on certain assumptions and approximations. Such representations introduce model-form error into the system. In this paper, we propose a novel gray-box modeling approach that not only identifies the model-form error but also utilizes it to improve the predictive capab…

    Submitted 1 September, 2021; originally announced September 2021.

    Comments: 23 pages

  21. arXiv:2108.05828  [pdf, other]

    cs.LG cs.AI stat.ML

    A general class of surrogate functions for stable and efficient reinforcement learning

    Authors: Sharan Vaswani, Olivier Bachem, Simone Totaro, Robert Mueller, Shivam Garg, Matthieu Geist, Marlos C. Machado, Pablo Samuel Castro, Nicolas Le Roux

    Abstract: Common policy gradient methods rely on the maximization of a sequence of surrogate functions. In recent years, many such surrogate functions have been proposed, most without strong theoretical guarantees, leading to algorithms such as TRPO, PPO or MPO. Rather than design yet another surrogate function, we instead propose a general framework (FMA-PG) based on functional mirror ascent that gives ris…

    Submitted 30 October, 2023; v1 submitted 12 August, 2021; originally announced August 2021.

    Comments: Fixed minor typos

  22. arXiv:2105.00303  [pdf, other]

    cs.LG stat.ML

    RATT: Leveraging Unlabeled Data to Guarantee Generalization

    Authors: Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary C. Lipton

    Abstract: To assess generalization, machine learning scientists typically either (i) bound the generalization gap and then (after training) plug in the empirical risk to obtain a bound on the true risk; or (ii) validate empirically on holdout data. However, (i) typically yields vacuous guarantees for overparameterized models. Furthermore, (ii) shrinks the training set and its guarantee erodes with each re-u…

    Submitted 6 November, 2021; v1 submitted 1 May, 2021; originally announced May 2021.

    Comments: ICML 2021 (Long Talk)

  23. arXiv:2103.15636  [pdf, other]

    stat.ML cs.LG

    Machine learning based digital twin for stochastic nonlinear multi-degree of freedom dynamical system

    Authors: Shailesh Garg, Ankush Gogoi, Souvik Chakraborty, Budhaditya Hazra

    Abstract: The potential of digital twin technology is immense, specifically in the infrastructure, aerospace, and automotive sectors. However, practical implementation of this technology is not proceeding at the expected speed, specifically because of a lack of application-specific details. In this paper, we propose a novel digital twin framework for stochastic nonlinear multi-degree of freedom (MDOF) dynamical systems. T…

    Submitted 29 March, 2021; originally announced March 2021.

    Comments: 21 pages

  24. arXiv:2102.10264  [pdf, other]

    cs.LG cs.RO stat.ML

    On Proximal Policy Optimization's Heavy-tailed Gradients

    Authors: Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, J. Zico Kolter, Zachary C. Lipton, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Pradeep Ravikumar

    Abstract: Modern policy gradient algorithms such as Proximal Policy Optimization (PPO) rely on an arsenal of heuristics, including loss clipping and gradient clipping, to ensure successful learning. These heuristics are reminiscent of techniques from robust statistics, commonly used for estimation in outlier-rich ("heavy-tailed") regimes. In this paper, we present a detailed empirical study to characteriz…

    Submitted 12 July, 2021; v1 submitted 20 February, 2021; originally announced February 2021.

    Comments: ICML 2021

  25. arXiv:2009.09283  [pdf, other]

    cs.CV cs.AI cs.CR cs.LG stat.ML

    Subverting Privacy-Preserving GANs: Hiding Secrets in Sanitized Images

    Authors: Kang Liu, Benjamin Tan, Siddharth Garg

    Abstract: Unprecedented data collection and sharing have exacerbated privacy concerns and led to increasing interest in privacy-preserving tools that remove sensitive attributes from images while maintaining useful information for other tasks. Currently, state-of-the-art approaches use privacy-preserving generative adversarial networks (PP-GANs) for this purpose, for instance, to enable reliable facial expr…

    Submitted 19 September, 2020; originally announced September 2020.

  26. arXiv:2008.12338  [pdf, other]

    cs.LG cs.CV stat.ML

    Adversarially Robust Learning via Entropic Regularization

    Authors: Gauri Jagatap, Ameya Joshi, Animesh Basak Chowdhury, Siddharth Garg, Chinmay Hegde

    Abstract: In this paper we propose a new family of algorithms, ATENT, for training adversarially robust deep neural networks. We formulate a new loss function that is equipped with an additional entropic regularization. Our loss function considers the contribution of adversarial samples that are drawn from a specially designed distribution in the data space that assigns high probability to points with high…

    Submitted 19 February, 2021; v1 submitted 27 August, 2020; originally announced August 2020.

  27. arXiv:2008.02447  [pdf, other]

    cs.LG stat.ML

    Functional Regularization for Representation Learning: A Unified Theoretical Perspective

    Authors: Siddhant Garg, Yingyu Liang

    Abstract: Unsupervised and self-supervised learning approaches have become a crucial tool to learn representations for downstream prediction tasks. While these approaches are widely used in practice and achieve impressive empirical gains, their theoretical understanding largely lags behind. Towards bridging this gap, we present a unifying perspective where several such approaches can be viewed as imposing a…

    Submitted 21 October, 2020; v1 submitted 6 August, 2020; originally announced August 2020.

    Comments: Accepted at NeurIPS 2020

  28. arXiv:2008.01761  [pdf, other]

    cs.LG cs.CR stat.ML

    Can Adversarial Weight Perturbations Inject Neural Backdoors?

    Authors: Siddhant Garg, Adarsh Kumar, Vibhor Goel, Yingyu Liang

    Abstract: Adversarial machine learning has exposed several security hazards of neural models and has become an important research topic in recent times. Thus far, the concept of an "adversarial perturbation" has exclusively been used with reference to the input space referring to a small, imperceptible change which can cause an ML model to err. In this work we extend the idea of "adversarial perturbations" t…

    Submitted 21 September, 2020; v1 submitted 4 August, 2020; originally announced August 2020.

    Comments: Accepted as a conference paper at CIKM 2020

  29. arXiv:2007.09712  [pdf, other]

    cs.LG cs.DC stat.ML

    Deep Anomaly Detection for Time-series Data in Industrial IoT: A Communication-Efficient On-device Federated Learning Approach

    Authors: Yi Liu, Sahil Garg, Jiangtian Nie, Yang Zhang, Zehui Xiong, Jiawen Kang, M. Shamim Hossain

    Abstract: Since edge device failures (i.e., anomalies) seriously affect the production of industrial products in Industrial IoT (IIoT), accurate and timely anomaly detection is becoming increasingly important. Furthermore, data collected by the edge device may contain the user's private data, which challenges current detection approaches, as user privacy is drawing public concern in recen…

    Submitted 19 July, 2020; originally announced July 2020.

    Comments: IEEE Internet of Things Journal

  30. arXiv:2007.00611  [pdf, other]

    cs.LG cs.AI stat.ML

    Gradient Temporal-Difference Learning with Regularized Corrections

    Authors: Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White

    Abstract: It is still common to use Q-learning and temporal difference (TD) learning (even though they have divergence issues and sound Gradient TD alternatives exist) because divergence seems rare and they typically perform well. However, recent work with large neural network learning systems reveals that instability is more common than previously thought. Practitioners face a difficult dilemma: choose an ea…

    Submitted 17 September, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: Appeared in Proceedings of the 37th International Conference on Machine Learning (ICML2020)

  31. arXiv:2006.08733  [pdf, other]

    cs.LG cs.CR stat.ML

    CryptoNAS: Private Inference on a ReLU Budget

    Authors: Zahra Ghodsi, Akshaj Veldanda, Brandon Reagen, Siddharth Garg

    Abstract: Machine learning as a service has given rise to privacy concerns surrounding clients' data and providers' models and has catalyzed research in private inference (PI): methods to process inferences without disclosing inputs. Recently, researchers have adapted cryptographic techniques to show PI is possible; however, all solutions increase inference latency beyond practical limits. This paper makes…

    Submitted 13 May, 2021; v1 submitted 15 June, 2020; originally announced June 2020.

  32. arXiv:2005.08377  [pdf, ps, other]

    cs.LG stat.ML

    The Role of Randomness and Noise in Strategic Classification

    Authors: Mark Braverman, Sumegha Garg

    Abstract: We investigate the problem of designing optimal classifiers in the strategic classification setting, where the classification is part of a game in which players can modify their features to attain a favorable classification outcome (while incurring some cost). Previously, the problem has been considered from a learning-theoretic perspective and from the algorithmic fairness perspective. Our main c…

    Submitted 17 May, 2020; originally announced May 2020.

    Comments: 22 pages. Appeared in FORC, 2020

  33. arXiv:2004.12492  [pdf, other]

    cs.LG cs.CR stat.ML

    Bias Busters: Robustifying DL-based Lithographic Hotspot Detectors Against Backdooring Attacks

    Authors: Kang Liu, Benjamin Tan, Gaurav Rajavendra Reddy, Siddharth Garg, Yiorgos Makris, Ramesh Karri

    Abstract: Deep learning (DL) offers potential improvements throughout the CAD tool-flow, one promising application being lithographic hotspot detection. However, DL techniques have been shown to be especially vulnerable to inference and training time adversarial attacks. Recent work has demonstrated that a small fraction of malicious physical designers can stealthily "backdoor" a DL-based hotspot detector d…

    Submitted 26 April, 2020; originally announced April 2020.

  34. arXiv:2004.09900  [pdf, other]

    cs.LG stat.ML

    An RNN-Survival Model to Decide Email Send Times

    Authors: Harvineet Singh, Moumita Sinha, Atanu R. Sinha, Sahil Garg, Neha Banerjee

    Abstract: Email communications are ubiquitous. Firms control send times of emails and thereby the instants at which emails reach recipients (it is assumed email is received instantaneously from the send time). However, they do not control the duration it takes for recipients to open emails, labeled as time-to-open. Importantly, among emails that are opened, most opens occur within a short window from their send t…

    Submitted 21 April, 2020; originally announced April 2020.

    Comments: 11 pages, 3 figures, 2 tables

  35. arXiv:2003.12020  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    A Separation Result Between Data-oblivious and Data-aware Poisoning Attacks

    Authors: Samuel Deng, Sanjam Garg, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody, Abhradeep Thakurta

    Abstract: Poisoning attacks have emerged as a significant security threat to machine learning algorithms. It has been demonstrated that adversaries who make small changes to the training set, such as adding specially crafted data points, can hurt the performance of the output model. Some of the stronger poisoning attacks require the full knowledge of the training data. This leaves open the possibility of ac…

    Submitted 13 December, 2021; v1 submitted 26 March, 2020; originally announced March 2020.

  36. arXiv:2003.07554  [pdf, other]

    cs.LG stat.ML

    A Unified View of Label Shift Estimation

    Authors: Saurabh Garg, Yifan Wu, Sivaraman Balakrishnan, Zachary C. Lipton

    Abstract: Under label shift, the label distribution p(y) might change but the class-conditional distributions p(x|y) do not. There are two dominant approaches for estimating the label marginal. BBSE, a moment-matching approach based on confusion matrices, is provably consistent and provides interpretable error bounds. However, a maximum likelihood estimation approach, which we call MLLS, dominates empirical…

    Submitted 16 October, 2020; v1 submitted 17 March, 2020; originally announced March 2020.

    Comments: Accepted at NeurIPS 2020

  37. Temporal Attribute Prediction via Joint Modeling of Multi-Relational Structure Evolution

    Authors: Sankalp Garg, Navodita Sharma, Woojeong Jin, Xiang Ren

    Abstract: Time series prediction is an important problem in machine learning. Previous methods for time series prediction did not involve additional information. With a lot of dynamic knowledge graphs available, we can use this additional information to predict the time series better. Recently, there has been a focus on the application of deep representation learning on dynamic graphs. These methods predict…

    Submitted 13 July, 2020; v1 submitted 9 March, 2020; originally announced March 2020.

    Comments: In Proceedings of IJCAI 2020. Code can be found at https://github.com/INK-USC/DArtNet . The sole copyright holder is IJCAI (International Joint Conferences on Artificial Intelligence), all rights reserved. Original Publication available at https://www.ijcai.org/Proceedings/2020/386

  38. arXiv:2002.07375  [pdf, other]

    cs.LG cs.AI stat.ML

    Symbolic Network: Generalized Neural Policies for Relational MDPs

    Authors: Sankalp Garg, Aniket Bajpai, Mausam

    Abstract: A Relational Markov Decision Process (RMDP) is a first-order representation to express all instances of a single probabilistic planning domain with a possibly unbounded number of objects. Early work in RMDPs outputs generalized (instance-independent) first-order policies or value functions as a means to solve all instances of a domain at once. Unfortunately, this line of work met with limited succes…

    Submitted 29 June, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: In Proceedings of ICML 2020. Code can be found at https://github.com/dair-iitd/symnet

  39. arXiv:1910.01161  [pdf, ps, other]

    cs.LG stat.ML

    Stochastic Bandits with Delayed Composite Anonymous Feedback

    Authors: Siddhant Garg, Aditya Kumar Akash

    Abstract: We explore a novel setting of the Multi-Armed Bandit (MAB) problem inspired by real-world applications, which we call bandits with "stochastic delayed composite anonymous feedback (SDCAF)". In SDCAF, the rewards on pulling arms are stochastic with respect to time but spread over a fixed number of time steps in the future after pulling the arm. The complexity of this problem stems from the anonymo…

    Submitted 11 October, 2019; v1 submitted 2 October, 2019; originally announced October 2019.

    Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS) Workshop on Machine Learning with Guarantees

  40. arXiv:1909.03881  [pdf, other]

    cs.LG cs.AI cs.CL cs.IT stat.ML

    Nearly-Unsupervised Hashcode Representations for Relation Extraction

    Authors: Sahil Garg, Aram Galstyan, Greg Ver Steeg, Guillermo Cecchi

    Abstract: Recently, kernelized locality sensitive hashcodes have been successfully employed as representations of natural language text, especially showing high relevance to biomedical relation extraction tasks. In this paper, we propose to optimize the hashcode representations in a nearly unsupervised manner, in which we only use data points, but not their class labels, for learning. The optimized hashcode…

    Submitted 9 September, 2019; originally announced September 2019.

    Comments: Proceedings of EMNLP-19

  41. arXiv:1907.01643  [pdf, other]

    cs.IR cs.CL cs.LG stat.ML

    Pentagon at MEDIQA 2019: Multi-task Learning for Filtering and Re-ranking Answers using Language Inference and Question Entailment

    Authors: Hemant Pugaliya, Karan Saxena, Shefali Garg, Sheetal Shalini, Prashant Gupta, Eric Nyberg, Teruko Mitamura

    Abstract: Parallel deep learning architectures like fine-tuned BERT and MT-DNN have quickly become the state of the art, bypassing previous deep and shallow learning methods by a large margin. More recently, pre-trained models from large related datasets have been able to perform well on many downstream tasks by just fine-tuning on domain-specific datasets. However, using powerful models on non-trivial ta…

    Submitted 1 July, 2019; originally announced July 2019.

  42. arXiv:1906.10773  [pdf, other]

    cs.LG cs.CR stat.ML

    Are Adversarial Perturbations a Showstopper for ML-Based CAD? A Case Study on CNN-Based Lithographic Hotspot Detection

    Authors: Kang Liu, Haoyu Yang, Yuzhe Ma, Benjamin Tan, Bei Yu, Evangeline F. Y. Young, Ramesh Karri, Siddharth Garg

    Abstract: There is substantial interest in the use of machine learning (ML) based techniques throughout the electronic computer-aided design (CAD) flow, particularly those based on deep learning. However, while deep learning methods have surpassed state-of-the-art performance in several applications, they have exhibited intrinsic susceptibility to adversarial perturbations --- small but deliberate alteratio…

    Submitted 25 June, 2019; originally announced June 2019.

    Journal ref: ACM Trans. Des. Autom. Electron. Syst. 25, 5, Article 48 (August 2020)

  43. arXiv:1905.11564  [pdf, ps, other

    cs.LG cs.CC cs.CR stat.ML

    Adversarially Robust Learning Could Leverage Computational Hardness

    Authors: Sanjam Garg, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody

    Abstract: Over recent years, devising classification algorithms that are robust to adversarial perturbations has emerged as a challenging problem. In particular, deep neural nets (DNNs) seem to be susceptible to small imperceptible changes over test instances. However, the line of work in provable robustness, so far, has been focused on information-theoretic robustness, ruling out even the existence of any…

    Submitted 19 December, 2019; v1 submitted 27 May, 2019; originally announced May 2019.

  44. arXiv:1905.07088  [pdf, other

    cs.LG stat.ML

    Sliced Score Matching: A Scalable Approach to Density and Score Estimation

    Authors: Yang Song, Sahaj Garg, Jiaxin Shi, Stefano Ermon

    Abstract: Score matching is a popular method for estimating unnormalized statistical models. However, it has been so far limited to simple, shallow models or low-dimensional data, due to the difficulty of computing the Hessian of log-density functions. We show this difficulty can be mitigated by projecting the scores onto random vectors before comparing them. This objective, called sliced score matching, on…

    Submitted 27 June, 2019; v1 submitted 16 May, 2019; originally announced May 2019.

    Comments: UAI 2019

  45. arXiv:1904.12053  [pdf, other

    cs.LG math.ST stat.ML

    Sample Amplification: Increasing Dataset Size even when Learning is Impossible

    Authors: Brian Axelrod, Shivam Garg, Vatsal Sharan, Gregory Valiant

    Abstract: Given data drawn from an unknown distribution, $D$, to what extent is it possible to ``amplify'' this dataset and output an even larger set of samples that appear to have been drawn from $D$? We formalize this question as follows: an $(n,m)$ $\text{amplification procedure}$ takes as input $n$ independent draws from an unknown distribution $D$, and outputs a set of $m > n$ ``samples''. An amplifica…

    Submitted 2 December, 2019; v1 submitted 26 April, 2019; originally announced April 2019.

    Comments: Added discussion about potential applications

  46. arXiv:1904.09942  [pdf, other

    cs.LG stat.ML

    Tracking and Improving Information in the Service of Fairness

    Authors: Sumegha Garg, Michael P. Kim, Omer Reingold

    Abstract: As algorithmic prediction systems have become widespread, fears that these systems may inadvertently discriminate against members of underrepresented populations have grown. With the goal of understanding fundamental principles that underpin the growing number of approaches to mitigating algorithmic discrimination, we investigate the role of information in fair prediction. A common strategy for de…

    Submitted 1 August, 2019; v1 submitted 22 April, 2019; originally announced April 2019.

    Comments: Appeared at EC 2019

  47. arXiv:1904.09489  [pdf, other

    cs.LG cs.AI stat.ML

    Compression and Localization in Reinforcement Learning for ATARI Games

    Authors: Joel Ruben Antony Moniz, Barun Patra, Sarthak Garg

    Abstract: Deep neural networks have become commonplace in the domain of reinforcement learning, but are often expensive in terms of the number of parameters needed. While compressing deep neural networks has of late assumed great importance to overcome this drawback, little work has been done to address this problem in the context of reinforcement learning agents. This work aims at making first steps toward…

    Submitted 20 April, 2019; originally announced April 2019.

    Comments: NeurIPS 2018 Deep Reinforcement Learning Workshop

  48. arXiv:1902.03081  [pdf, other

    cs.LG stat.ML

    Size Independent Neural Transfer for RDDL Planning

    Authors: Sankalp Garg, Aniket Bajpai, Mausam

    Abstract: Neural planners for RDDL MDPs produce deep reactive policies in an offline fashion. These scale well with large domains, but are sample inefficient and time-consuming to train from scratch for each new problem. To mitigate this, recent work has studied neural transfer learning, so that a generic planner trained on other problems of the same domain can rapidly transfer to a new problem. However, th…

    Submitted 4 April, 2019; v1 submitted 8 February, 2019; originally announced February 2019.

    Comments: Published in ICAPS 2019

  49. arXiv:1811.06609  [pdf, other

    cs.LG stat.ML

    A Spectral View of Adversarially Robust Features

    Authors: Shivam Garg, Vatsal Sharan, Brian Hu Zhang, Gregory Valiant

    Abstract: Given the apparent difficulty of learning models that are robust to adversarial perturbations, we propose tackling the simpler problem of developing adversarially robust features. Specifically, given a dataset and metric of interest, the goal is to return a function (or multiple functions) that 1) is robust to adversarial perturbations, and 2) has significant variation across the datapoints. We es…

    Submitted 15 November, 2018; originally announced November 2018.

    Comments: To appear at NIPS 2018

  50. arXiv:1809.10610  [pdf, other

    cs.LG stat.ML

    Counterfactual Fairness in Text Classification through Robustness

    Authors: Sahaj Garg, Vincent Perot, Nicole Limtiaco, Ankur Taly, Ed H. Chi, Alex Beutel

    Abstract: In this paper, we study counterfactual fairness in text classification, which asks the question: How would the prediction change if the sensitive attribute referenced in the example were different? Toxicity classifiers demonstrate a counterfactual fairness issue by predicting that "Some people are gay" is toxic while "Some people are straight" is nontoxic. We offer a metric, counterfactual token f…

    Submitted 13 February, 2019; v1 submitted 27 September, 2018; originally announced September 2018.