Showing 1–50 of 77 results for author: Shanmugam, K

  1. arXiv:2406.05937  [pdf, other]

    cs.LG stat.ML

    Linear Causal Representation Learning from Unknown Multi-node Interventions

    Authors: Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam, Ali Tajer

    Abstract: Despite the multifaceted recent advances in interventional causal representation learning (CRL), they primarily focus on the stylized assumption of single-node interventions. This assumption is not valid in a wide range of applications, and generally, the subset of nodes intervened in an interventional environment is fully unknown. This paper focuses on interventional CRL under unknown multi-node…

    Submitted 9 June, 2024; originally announced June 2024.

  2. arXiv:2405.17035  [pdf, other]

    cs.LG

    Glauber Generative Model: Discrete Diffusion Models via Binary Classification

    Authors: Harshit Varma, Dheeraj Nagaraj, Karthikeyan Shanmugam

    Abstract: We introduce the Glauber Generative Model (GGM), a new class of discrete diffusion models, to obtain new samples from a distribution given samples from a discrete space. GGM deploys a discrete Markov chain called the heat bath dynamics (or the Glauber dynamics) to denoise a sequence of noisy tokens to a sample from a joint distribution of discrete tokens. Our novel conceptual framework provides an…

    Submitted 27 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  3. arXiv:2403.10638  [pdf, other]

    cs.LG cs.CY stat.ML

    A resource-constrained stochastic scheduling algorithm for homeless street outreach and gleaning edible food

    Authors: Conor M. Artman, Aditya Mate, Ezinne Nwankwo, Aliza Heching, Tsuyoshi Idé, Jiří Navrátil, Karthikeyan Shanmugam, Wei Sun, Kush R. Varshney, Lauri Goldkind, Gidi Kroch, Jaclyn Sawyer, Ian Watson

    Abstract: We developed a common algorithmic solution addressing the problem of resource-constrained outreach encountered by social change organizations with different missions and operations: Breaking Ground -- an organization that helps individuals experiencing homelessness in New York transition to permanent housing and Leket -- the national food bank of Israel that rescues food from farms and elsewhere t…

    Submitted 15 March, 2024; originally announced March 2024.

  4. arXiv:2402.00849  [pdf, other]

    cs.LG stat.ML

    Score-based Causal Representation Learning: Linear and General Transformations

    Authors: Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam, Abhishek Kumar, Ali Tajer

    Abstract: This paper addresses intervention-based causal representation learning (CRL) under a general nonparametric latent causal model and an unknown transformation that maps the latent variables to the observed variables. Linear and general transformations are investigated. The paper addresses both the identifiability and achievability aspects. Identifiability refers to determining algorithm-agnostic con…

    Submitted 26 February, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: (updated literature review) Linear transformations: stronger results than our previous paper Score-based Causal Representation Learning with Interventions (arXiv:2301.08230). General transformations: results also appear in our paper General Identifiability and Achievability for Causal Representation Learning (arXiv:2310.15450) accepted to AISTATS 2024 (oral). arXiv admin note: text overlap with arXiv:2310.15450

  5. arXiv:2311.03376  [pdf, other]

    cs.IR cs.LG stat.ML

    Blocked Collaborative Bandits: Online Collaborative Filtering with Per-Item Budget Constraints

    Authors: Soumyabrata Pal, Arun Sai Suggala, Karthikeyan Shanmugam, Prateek Jain

    Abstract: We consider the problem of \emph{blocked} collaborative bandits where there are multiple users, each with an associated multi-armed bandit problem. These users are grouped into \emph{latent} clusters such that the mean reward vectors of users within the same cluster are identical. Our goal is to design algorithms that maximize the cumulative reward accrued by all the users over time, under the \em…

    Submitted 31 October, 2023; originally announced November 2023.

    Comments: 44 pages, To Appear in NeurIPS 2023

  6. arXiv:2310.15450  [pdf, other]

    cs.LG stat.ML

    General Identifiability and Achievability for Causal Representation Learning

    Authors: Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam, Ali Tajer

    Abstract: This paper focuses on causal representation learning (CRL) under a general nonparametric latent causal model and a general transformation model that maps the latent data to the observational data. It establishes identifiability and achievability results using two hard uncoupled interventions per node in the latent causal graph. Notably, one does not know which pair of intervention environments hav…

    Submitted 14 February, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted to AISTATS 2024 (oral presentation). Also appeared at CRL Workshop @ NeurIPS 2023 (oral presentation) titled as "Score-based Causal Representation Learning: Nonparametric Identifiability"

  7. arXiv:2310.08056  [pdf, other]

    cs.LG cs.AI

    Learning from Label Proportions: Bootstrapping Supervised Learners via Belief Propagation

    Authors: Shreyas Havaldar, Navodita Sharma, Shubhi Sareen, Karthikeyan Shanmugam, Aravindan Raghuveer

    Abstract: Learning from Label Proportions (LLP) is a learning problem where only aggregate level labels are available for groups of instances, called bags, during training, and the aim is to get the best performance at the instance-level on the test data. This setting arises in domains like advertising and medicine due to privacy considerations. We propose a novel algorithmic framework for this problem that…

    Submitted 20 March, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: Published as a conference paper at The Twelfth International Conference on Learning Representations (ICLR 2024) & Oral Presentation at Regulatable ML @ NeurIPS 2023

  8. arXiv:2310.07535  [pdf, other]

    cs.LG cs.AI

    Fairness under Covariate Shift: Improving Fairness-Accuracy tradeoff with few Unlabeled Test Samples

    Authors: Shreyas Havaldar, Jatin Chauhan, Karthikeyan Shanmugam, Jay Nandy, Aravindan Raghuveer

    Abstract: Covariate shift in the test data is a common practical phenomenon that can significantly downgrade both the accuracy and the fairness performance of the model. Ensuring fairness across different sensitive groups under covariate shift is of paramount importance due to societal implications like criminal justice. We operate in the unsupervised regime where only a small set of unlabeled test samples a…

    Submitted 8 January, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: Accepted at The 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)

  9. arXiv:2307.06250  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Identifiability Guarantees for Causal Disentanglement from Soft Interventions

    Authors: Jiaqi Zhang, Chandler Squires, Kristjan Greenewald, Akash Srivastava, Karthikeyan Shanmugam, Caroline Uhler

    Abstract: Causal disentanglement aims to uncover a representation of data using latent variables that are interrelated through a causal model. Such a representation is identifiable if the latent model that explains the data is unique. In this paper, we focus on the scenario where unpaired observational and interventional data are available, with each intervention changing the mechanism of a latent variable.…

    Submitted 8 November, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

  10. arXiv:2306.11008  [pdf, other]

    cs.LG stat.ME stat.ML

    Front-door Adjustment Beyond Markov Equivalence with Limited Graph Knowledge

    Authors: Abhin Shah, Karthikeyan Shanmugam, Murat Kocaoglu

    Abstract: Causal effect estimation from data typically requires assumptions about the cause-effect relations either explicitly in the form of a causal graph structure within the Pearlian framework, or implicitly in terms of (conditional) independence statements between counterfactual variables within the potential outcomes framework. When the treatment variable and the outcome variable are confounded, front…

    Submitted 19 June, 2023; originally announced June 2023.

  11. arXiv:2306.09048  [pdf, other]

    cs.LG stat.ML

    Optimal Best-Arm Identification in Bandits with Access to Offline Data

    Authors: Shubhada Agrawal, Sandeep Juneja, Karthikeyan Shanmugam, Arun Sai Suggala

    Abstract: Learning paradigms based purely on offline data as well as those based solely on sequential online learning have been well-studied in the literature. In this paper, we consider combining offline data with online learning, an area less studied but of obvious practical importance. We consider the stochastic $K$-armed bandit problem, where our goal is to identify the arm with the highest mean in the…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: 45 pages, 5 figures

  12. arXiv:2305.19490  [pdf]

    cs.CR eess.SY

    Adoption of Blockchain Platform for Security Enhancement in Energy Transaction

    Authors: Madhuresh Gupta, Soumyakanti Giri, Prabhakar Karthikeyan Shanmugam, Mahajan Sagar Bhaskar, Jens Bo Holm-Nielsen, Sanjeevikumar Padmanaban

    Abstract: Renewable energy has become a reality in the present and is being preferred by countries to become a considerable part of the central grid. With the increasing adoption of renewables it will soon become crucial to have a platform which would facilitate secure transaction of energy for consumers as well as producers. This paper discusses and implements a Blockchain based platform which enhances and…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: 11 Pages, 6 Figures

  13. arXiv:2302.07920  [pdf, other]

    cs.LG

    InfoNCE Loss Provably Learns Cluster-Preserving Representations

    Authors: Advait Parulekar, Liam Collins, Karthikeyan Shanmugam, Aryan Mokhtari, Sanjay Shakkottai

    Abstract: The goal of contrasting learning is to learn a representation that preserves underlying clusters by keeping samples with similar content, e.g. the ``dogness'' of a dog, close to each other in the space generated by the representation. A common and successful approach for tackling this unsupervised learning problem is minimizing the InfoNCE loss associated with the training samples, where each samp…

    Submitted 15 February, 2023; originally announced February 2023.

  14. arXiv:2301.08230  [pdf, other]

    stat.ML cs.LG

    Score-based Causal Representation Learning with Interventions

    Authors: Burak Varici, Emre Acarturk, Karthikeyan Shanmugam, Abhishek Kumar, Ali Tajer

    Abstract: This paper studies the causal representation learning problem when the latent causal variables are observed indirectly through an unknown linear transformation. The objectives are: (i) recovering the unknown linear transformation (up to scaling) and (ii) determining the directed acyclic graph (DAG) underlying the latent variables. Sufficient conditions for DAG recovery are established, and it is s…

    Submitted 1 May, 2023; v1 submitted 19 January, 2023; originally announced January 2023.

    Comments: This version outlines large classes of non-linear causal models in the latent space for which our assumptions hold. It also discusses the latest updates of related literature

  15. arXiv:2301.07040  [pdf, other]

    cs.LG stat.ML

    Optimal Algorithms for Latent Bandits with Cluster Structure

    Authors: Soumyabrata Pal, Arun Sai Suggala, Karthikeyan Shanmugam, Prateek Jain

    Abstract: We consider the problem of latent bandits with cluster structure where there are multiple users, each with an associated multi-armed bandit problem. These users are grouped into \emph{latent} clusters such that the mean reward vectors of users within the same cluster are identical. At each round, a user, selected uniformly at random, pulls an arm and observes a corresponding noisy reward. The goal…

    Submitted 11 July, 2023; v1 submitted 17 January, 2023; originally announced January 2023.

    Comments: 48 pages. Accepted to AISTATS 2023. Added Experiments

  16. arXiv:2212.05987  [pdf, other]

    cs.LG

    Selective classification using a robust meta-learning approach

    Authors: Nishant Jain, Karthikeyan Shanmugam, Pradeep Shenoy

    Abstract: Predictive uncertainty, a model's self-awareness regarding its accuracy on an input, is key for both building robust models via training interventions and for test-time applications such as selective classification. We propose a novel instance-conditioned reweighting approach that captures predictive uncertainty using an auxiliary network and unifies these train- and test-time applications. The auxi…

    Submitted 2 January, 2024; v1 submitted 12 December, 2022; originally announced December 2022.

  17. arXiv:2208.12764  [pdf, other]

    stat.ML cs.LG

    Causal Bandits for Linear Structural Equation Models

    Authors: Burak Varici, Karthikeyan Shanmugam, Prasanna Sattigeri, Ali Tajer

    Abstract: This paper studies the problem of designing an optimal sequence of interventions in a causal graphical model to minimize cumulative regret with respect to the best intervention in hindsight. This is, naturally, posed as a causal bandit problem. The focus is on causal bandits for linear structural equation models (SEMs) and soft interventions. It is assumed that the graph's structure is known and h…

    Submitted 31 March, 2023; v1 submitted 26 August, 2022; originally announced August 2022.

    Comments: 61 pages; new to this version: added lower bounds and relaxed assumptions

  18. arXiv:2207.07174  [pdf, other]

    cs.LG stat.ML

    Causal Graphs Underlying Generative Models: Path to Learning with Limited Data

    Authors: Samuel C. Hoffman, Kahini Wadhawan, Payel Das, Prasanna Sattigeri, Karthikeyan Shanmugam

    Abstract: Training generative models that capture rich semantics of the data and interpreting the latent representations encoded by such models are very important problems in unsupervised learning. In this work, we provide a simple algorithm that relies on perturbation experiments on latent codes of a pre-trained generative autoencoder to uncover a causal graph that is implied by the generative model. We le…

    Submitted 14 July, 2022; originally announced July 2022.

  19. arXiv:2205.15196  [pdf, other]

    cs.LG stat.ML

    PAC Generalization via Invariant Representations

    Authors: Advait Parulekar, Karthikeyan Shanmugam, Sanjay Shakkottai

    Abstract: One method for obtaining generalizable solutions to machine learning tasks when presented with diverse training environments is to find \textit{invariant representations} of the data. These are representations of the covariates such that the best model on top of the representation is invariant across training environments. In the context of linear Structural Equation Models (SEMs), invariant repre…

    Submitted 14 August, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

  20. arXiv:2202.03712  [pdf, other]

    cs.LG cs.AI

    Fourier Representations for Black-Box Optimization over Categorical Variables

    Authors: Hamid Dadkhahi, Jesus Rios, Karthikeyan Shanmugam, Payel Das

    Abstract: Optimization of real-world black-box functions defined over purely categorical variables is an active area of research. In particular, optimization and design of biological sequences with specific functional or structural properties have a profound impact in medicine, materials science, and biotechnology. Standalone search algorithms, such as simulated annealing (SA) and Monte Carlo tree search (M…

    Submitted 24 February, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

  21. arXiv:2202.01011  [pdf, other]

    cs.LG cs.AI cs.CV

    Auto-Transfer: Learning to Route Transferrable Representations

    Authors: Keerthiram Murugesan, Vijay Sadashivaiah, Ronny Luss, Karthikeyan Shanmugam, Pin-Yu Chen, Amit Dhurandhar

    Abstract: Knowledge transfer between heterogeneous source and target networks and tasks has received a lot of attention in recent times as large amounts of quality labeled data can be difficult to obtain in many applications. Existing approaches typically constrain the target deep neural network (DNN) feature representations to be close to the source DNNs feature representations, which can be limiting. We,…

    Submitted 16 March, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

    Comments: Camera ready ICLR 2022

  22. arXiv:2111.07512  [pdf, other]

    stat.ME cs.LG stat.ML

    Scalable Intervention Target Estimation in Linear Models

    Authors: Burak Varici, Karthikeyan Shanmugam, Prasanna Sattigeri, Ali Tajer

    Abstract: This paper considers the problem of estimating the unknown intervention targets in a causal directed acyclic graph from observational and interventional data. The focus is on soft interventions in linear structural equation models (SEMs). Current approaches to causal structure learning either work with known intervention targets or use hypothesis testing to discover the unknown intervention target…

    Submitted 14 November, 2021; originally announced November 2021.

    Comments: 23 pages, 4 figures, NeurIPS 2021

  23. arXiv:2109.12151  [pdf, other]

    cs.LG cs.AI

    AI Explainability 360: Impact and Design

    Authors: Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Q. Vera Liao, Ronny Luss, Aleksandra Mojsilovic, Sami Mourad, Pablo Pedemonte, Ramya Raghavendra, John Richards, Prasanna Sattigeri, Karthikeyan Shanmugam, Moninder Singh, Kush R. Varshney, Dennis Wei, Yunfeng Zhang

    Abstract: As artificial intelligence and machine learning algorithms become increasingly prevalent in society, multiple stakeholders are calling for these algorithms to provide explanations. At the same time, these stakeholders, whether they be affected citizens, government regulators, domain experts, or system developers, have different explanation needs. To address these needs, in 2019, we created AI Expl…

    Submitted 24 September, 2021; originally announced September 2021.

    Comments: arXiv admin note: text overlap with arXiv:1909.03012

    Journal ref: IAAI 2022

  24. arXiv:2107.03263  [pdf, other]

    cs.LG

    Episodic Bandits with Stochastic Experts

    Authors: Nihal Sharma, Soumya Basu, Karthikeyan Shanmugam, Sanjay Shakkottai

    Abstract: We study a version of the contextual bandit problem where an agent can intervene through a set of stochastic expert policies. The agent interacts with the environment over episodes, with each episode having different context distributions; this results in the `best expert' changing across episodes. Our goal is to develop an agent that tracks the best expert over episodes. We introduce the Empirica…

    Submitted 26 October, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

  25. arXiv:2106.12729  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators

    Authors: Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, Karthikeyan Shanmugam

    Abstract: In temporal difference (TD) learning, off-policy sampling is known to be more practical than on-policy sampling, and by decoupling learning from data collection, it enables data reuse. It is known that policy evaluation (including multi-step off-policy importance sampling) has the interpretation of solving a generalized Bellman equation. In this paper, we derive finite-sample bounds for any genera…

    Submitted 23 June, 2021; originally announced June 2021.

  26. arXiv:2106.11560  [pdf, other]

    cs.LG

    Finding Valid Adjustments under Non-ignorability with Minimal DAG Knowledge

    Authors: Abhin Shah, Karthikeyan Shanmugam, Kartik Ahuja

    Abstract: Treatment effect estimation from observational data is a fundamental problem in causal inference. There are two very different schools of thought that have tackled this problem. On one hand, the Pearlian framework commonly assumes structural knowledge (provided by an expert) in the form of directed acyclic graphs and provides graphical criteria such as the back-door criterion to identify valid adjustment sets…

    Submitted 25 February, 2022; v1 submitted 22 June, 2021; originally announced June 2021.

  27. arXiv:2103.07788  [pdf, ps, other]

    cs.LG stat.ML

    Treatment Effect Estimation using Invariant Risk Minimization

    Authors: Abhin Shah, Kartik Ahuja, Karthikeyan Shanmugam, Dennis Wei, Kush Varshney, Amit Dhurandhar

    Abstract: Inferring causal individual treatment effect (ITE) from observational data is a challenging problem whose difficulty is exacerbated by the presence of treatment assignment bias. In this work, we propose a new way to estimate the ITE using the domain generalization framework of invariant risk minimization (IRM). IRM uses data from multiple domains, learns predictors that do not exploit spurious dom…

    Submitted 13 March, 2021; originally announced March 2021.

  28. arXiv:2103.03411  [pdf, other]

    cs.CR cs.AI cs.LG

    Efficient Encrypted Inference on Ensembles of Decision Trees

    Authors: Kanthi Sarpatwar, Karthik Nandakumar, Nalini Ratha, James Rayfield, Karthikeyan Shanmugam, Sharath Pankanti, Roman Vaculin

    Abstract: Data privacy concerns often prevent the use of cloud-based machine learning services for sensitive personal data. While homomorphic encryption (HE) offers a potential solution by enabling computations on encrypted data, the challenge is to obtain accurate machine learning models that work within the multiplicative depth constraints of a leveled HE scheme. Existing approaches for encrypted inferenc…

    Submitted 4 March, 2021; originally announced March 2021.

    Comments: 9 pages, 6 figures

  29. arXiv:2102.01567  [pdf, ps, other]

    cs.LG math.OC stat.ML

    A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants

    Authors: Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, Karthikeyan Shanmugam

    Abstract: This paper develops a unified framework to study finite-sample convergence guarantees of a large class of value-based asynchronous reinforcement learning (RL) algorithms. We do this by first reformulating the RL algorithms as \textit{Markovian Stochastic Approximation} (SA) algorithms to solve fixed-point equations. We then develop a Lyapunov analysis and derive mean-square error bounds on the co…

    Submitted 4 September, 2023; v1 submitted 2 February, 2021; originally announced February 2021.

  30. arXiv:2011.01979  [pdf, other]

    stat.ML cs.LG stat.ME

    High-Dimensional Feature Selection for Sample Efficient Treatment Effect Estimation

    Authors: Kristjan Greenewald, Dmitriy Katz-Rogozhnikov, Karthik Shanmugam

    Abstract: The estimation of causal treatment effects from observational data is a fundamental problem in causal inference. To avoid bias, the effect estimator must control for all confounders. Hence practitioners often collect data for as many covariates as possible to raise the chances of including the relevant confounders. While this addresses the bias, this has the side effect of significantly increasing…

    Submitted 3 November, 2020; originally announced November 2020.

  31. arXiv:2011.01016  [pdf, other]

    cs.LG

    Stochastic Linear Bandits with Protected Subspace

    Authors: Advait Parulekar, Soumya Basu, Aditya Gopalan, Karthikeyan Shanmugam, Sanjay Shakkottai

    Abstract: We study a variant of the stochastic linear bandit problem wherein we optimize a linear objective function but rewards are accrued only orthogonal to an unknown subspace (which we interpret as a \textit{protected space}) given only zero-order stochastic oracle access to both the objective itself and protected subspace. In particular, at each round, the learner must choose whether to query the obje…

    Submitted 1 March, 2021; v1 submitted 2 November, 2020; originally announced November 2020.

  32. arXiv:2011.00641  [pdf, other]

    stat.ME cs.LG stat.ML

    Active Structure Learning of Causal DAGs via Directed Clique Tree

    Authors: Chandler Squires, Sara Magliacane, Kristjan Greenewald, Dmitriy Katz, Murat Kocaoglu, Karthikeyan Shanmugam

    Abstract: A growing body of work has begun to study intervention design for efficient structure learning of causal directed acyclic graphs (DAGs). A typical setting is a causally sufficient setting, i.e. a system with no latent confounders, selection bias, or feedback, when the essential graph of the observational equivalence class (EC) is given as an input and interventions are assumed to be noiseless. Mos…

    Submitted 1 November, 2020; originally announced November 2020.

    Comments: NeurIPS 2020

  33. arXiv:2010.16412  [pdf, other]

    cs.LG stat.ML

    Empirical or Invariant Risk Minimization? A Sample Complexity Perspective

    Authors: Kartik Ahuja, Jun Wang, Amit Dhurandhar, Karthikeyan Shanmugam, Kush R. Varshney

    Abstract: Recently, invariant risk minimization (IRM) was proposed as a promising solution to address out-of-distribution (OOD) generalization. However, it is unclear when IRM should be preferred over the widely-employed empirical risk minimization (ERM) framework. In this work, we analyze both these frameworks from the perspective of sample complexity, thus taking a firm step towards answering this importa…

    Submitted 19 August, 2022; v1 submitted 30 October, 2020; originally announced October 2020.

  34. arXiv:2010.15234  [pdf, other]

    cs.LG

    Linear Regression Games: Convergence Guarantees to Approximate Out-of-Distribution Solutions

    Authors: Kartik Ahuja, Karthikeyan Shanmugam, Amit Dhurandhar

    Abstract: Recently, invariant risk minimization (IRM) (Arjovsky et al.) was proposed as a promising solution to address out-of-distribution (OOD) generalization. In Ahuja et al., it was shown that solving for the Nash equilibria of a new class of "ensemble-games" is equivalent to solving IRM. In this work, we extend the framework in Ahuja et al. for linear regressions by projecting the ensemble-game on an…

    Submitted 28 October, 2020; originally announced October 2020.

  35. arXiv:2006.06053  [pdf, other]

    cs.LG cs.CY cs.DB stat.ML

    Causal Feature Selection for Algorithmic Fairness

    Authors: Sainyam Galhotra, Karthikeyan Shanmugam, Prasanna Sattigeri, Kush R. Varshney

    Abstract: The use of machine learning (ML) in high-stakes societal decisions has encouraged the consideration of fairness throughout the ML lifecycle. Although data integration is one of the primary steps to generate high quality training data, most of the fairness literature ignores this stage. In this work, we consider fairness in the integration component of data management, aiming to identify features t…

    Submitted 31 March, 2022; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: Full version of the paper at SIGMOD 2022

  36. arXiv:2006.03963  [pdf, other]

    cs.LG stat.ML

    Combinatorial Black-Box Optimization with Expert Advice

    Authors: Hamid Dadkhahi, Karthikeyan Shanmugam, Jesus Rios, Payel Das, Samuel Hoffman, Troy David Loeffler, Subramanian Sankaranarayanan

    Abstract: We consider the problem of black-box function optimization over the boolean hypercube. Despite the vast literature on black-box function optimization over continuous domains, not much attention has been paid to learning models for optimization over combinatorial domains until recently. However, the computational complexity of the recently devised algorithms is prohibitive even for moderate number…

    Submitted 13 October, 2020; v1 submitted 6 June, 2020; originally announced June 2020.

    Journal ref: KDD 2020

  37. arXiv:2002.09575  [pdf, other]

    cs.LG stat.ML

    A Multi-Channel Neural Graphical Event Model with Negative Evidence

    Authors: Tian Gao, Dharmashankar Subramanian, Karthikeyan Shanmugam, Debarun Bhattacharjya, Nicholas Mattei

    Abstract: Event datasets are sequences of events of various types occurring irregularly over the time-line, and they are increasingly prevalent in numerous domains. Existing work for modeling events using conditional intensities relies on either using some underlying parametric form to capture historical dependencies, or on non-parametric models that focus primarily on tasks such as prediction. We propose a n…

    Submitted 21 February, 2020; originally announced February 2020.

    Comments: AAAI 2020

  38. arXiv:2002.08405  [pdf, other]

    cs.LG stat.ML

    On Under-exploration in Bandits with Mean Bounds from Confounded Data

    Authors: Nihal Sharma, Soumya Basu, Karthikeyan Shanmugam, Sanjay Shakkottai

    Abstract: We study a variant of the multi-armed bandit problem where side information in the form of bounds on the mean of each arm is provided. We develop the novel non-optimistic Global Under-Explore (GLUE) algorithm which uses the provided mean bounds (across all the arms) to infer pseudo-variances for each arm, which in turn decide the rate of exploration for the arms. We analyze the regret of GLUE and…

    Submitted 10 June, 2021; v1 submitted 19 February, 2020; originally announced February 2020.

  39. arXiv:2002.08247  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Global Transparent Models Consistent with Local Contrastive Explanations

    Authors: Tejaswini Pedapati, Avinash Balakrishnan, Karthikeyan Shanmugam, Amit Dhurandhar

    Abstract: There is a rich and growing literature on producing local contrastive/counterfactual explanations for black-box models (e.g. neural networks). In these methods, for an input, an explanation is in the form of a contrast point differing in very few features from the original input and lying in a different class. Other works try to build globally interpretable models like decision trees and rule li…

    Submitted 28 October, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

    Journal ref: NeurIPS 2020

  40. arXiv:2002.04692  [pdf, other]

    cs.LG stat.ML

    Invariant Risk Minimization Games

    Authors: Kartik Ahuja, Karthikeyan Shanmugam, Kush R. Varshney, Amit Dhurandhar

    Abstract: The standard risk minimization paradigm of machine learning is brittle when operating in environments whose test distributions are different from the training distribution due to spurious correlations. Training on data from many environments and finding invariant predictors reduces the effect of spurious features by concentrating models on features that have a causal relationship with the outcome.…

    Submitted 18 March, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

  41. arXiv:2002.00874  [pdf, other]

    cs.LG math.OC stat.ML

    Finite-Sample Analysis of Stochastic Approximation Using Smooth Convex Envelopes

    Authors: Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, Karthikeyan Shanmugam

    Abstract: Stochastic Approximation (SA) is a popular approach for solving fixed-point equations where the information is corrupted by noise. In this paper, we consider an SA involving a contraction mapping with respect to an arbitrary norm, and show its finite-sample error bounds while using different stepsizes. The idea is to construct a smooth Lyapunov function using the generalized Moreau envelope, and s…

    Submitted 30 June, 2021; v1 submitted 3 February, 2020; originally announced February 2020.

  42. arXiv:1910.12832  [pdf, other]

    cs.LG cs.CR cs.IT stat.ML

    Differentially Private Distributed Data Summarization under Covariate Shift

    Authors: Kanthi Sarpatwar, Karthikeyan Shanmugam, Venkata Sitaramagiridharganesh Ganapavarapu, Ashish Jagmohan, Roman Vaculin

    Abstract: We envision AI marketplaces to be platforms where consumers, with very less data for a target task, can obtain a relevant model by accessing many private data sources with vast number of data samples. One of the key challenges is to construct a training dataset that matches a target task without compromising on privacy of the data sources. To this end, we consider the following distributed data su…

    Submitted 9 January, 2020; v1 submitted 28 October, 2019; originally announced October 2019.

    Comments: To appear in the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

  43. arXiv:1909.03012  [pdf, other]

    cs.AI cs.CV cs.HC stat.ML

    One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques

    Authors: Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Q. Vera Liao, Ronny Luss, Aleksandra Mojsilović, Sami Mourad, Pablo Pedemonte, Ramya Raghavendra, John Richards, Prasanna Sattigeri, Karthikeyan Shanmugam, Moninder Singh, Kush R. Varshney, Dennis Wei, Yunfeng Zhang

    Abstract: As artificial intelligence and machine learning algorithms make further inroads into society, calls are increasing from multiple stakeholders for these algorithms to explain their outputs. At the same time, these stakeholders, whether they be affected citizens, government regulators, domain experts, or system developers, present different requirements for explanations. Toward addressing these need…

    Submitted 14 September, 2019; v1 submitted 6 September, 2019; originally announced September 2019.

  44. arXiv:1907.10154  [pdf, other]

    stat.ML cs.IT cs.LG

    Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions

    Authors: Matthew Faw, Rajat Sen, Karthikeyan Shanmugam, Constantine Caramanis, Sanjay Shakkottai

    Abstract: We consider a covariate shift problem where one has access to several different training datasets for the same learning problem and a small validation set which possibly differs from all the individual training distributions. This covariate shift is caused, in part, due to unobserved features in the datasets. The objective, then, is to find the best mixture distribution over the training datasets…

    Submitted 14 July, 2020; v1 submitted 23 July, 2019; originally announced July 2019.

    Comments: New from previous version: Adds Acknowledgements section

  45. arXiv:1906.00117  [pdf, other]

    cs.LG stat.ML

    Model Agnostic Contrastive Explanations for Structured Data

    Authors: Amit Dhurandhar, Tejaswini Pedapati, Avinash Balakrishnan, Pin-Yu Chen, Karthikeyan Shanmugam, Ruchir Puri

    Abstract: Recently, a method [7] was proposed to generate contrastive explanations for differentiable models such as deep neural networks, where one has complete access to the model. In this work, we propose a method, Model Agnostic Contrastive Explanations Method (MACEM), to generate contrastive explanations for any classification model where one is able to only query the class probabilities…

    Submitted 31 May, 2019; originally announced June 2019.

  46. arXiv:1905.13565  [pdf, other]

    cs.LG stat.ML

    Enhancing Simple Models by Exploiting What They Already Know

    Authors: Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss

    Abstract: There has been recent interest in improving performance of simple models for multiple reasons such as interpretability, robust learning from small data, deployment in memory constrained settings as well as environmental considerations. In this paper, we propose a novel method SRatio that can utilize information from high performing complex models (viz. deep neural networks, boosted trees, random f…

    Submitted 19 June, 2020; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: Accepted to ICML 2020

  47. arXiv:1905.12698  [pdf, other]

    cs.LG stat.ML

    Leveraging Latent Features for Local Explanations

    Authors: Ronny Luss, Pin-Yu Chen, Amit Dhurandhar, Prasanna Sattigeri, Yunfeng Zhang, Karthikeyan Shanmugam, Chun-Chen Tu

    Abstract: As the application of deep neural networks proliferates in numerous areas such as medical imaging, video surveillance, and self driving cars, the need for explaining the decisions of these models has become a hot research topic, both at the global and local level. Locally, most explanation methods have focused on identifying relevance of features, limiting the types of explanations possible. In th…

    Submitted 29 May, 2021; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: Accepted to KDD 2021

  48. arXiv:1903.02054  [pdf, other]

    stat.ML cs.AI cs.LG

    Size of Interventional Markov Equivalence Classes in Random DAG Models

    Authors: Dmitriy Katz, Karthikeyan Shanmugam, Chandler Squires, Caroline Uhler

    Abstract: Directed acyclic graph (DAG) models are popular for capturing causal relationships. From observational and interventional data, a DAG model can only be determined up to its interventional Markov equivalence class (I-MEC). We investigate the size of MECs for random DAG models generated by uniformly sampling and ordering an Erdős-Rényi graph. For constant density, we show that the expected…

    Submitted 5 March, 2019; originally announced March 2019.

    Comments: 19 pages, 5 figures. Accepted to AISTATS 2019

  49. arXiv:1902.10347  [pdf, other]

    stat.ME

    ABCD-Strategy: Budgeted Experimental Design for Targeted Causal Structure Discovery

    Authors: Raj Agrawal, Chandler Squires, Karren Yang, Karthik Shanmugam, Caroline Uhler

    Abstract: Determining the causal structure of a set of variables is critical for both scientific inquiry and decision-making. However, this is often challenging in practice due to limited interventional data. Given that randomized experiments are usually expensive to perform, we propose a general framework and theory based on optimal Bayesian experimental design to select experiments for targeted causal dis…

    Submitted 27 February, 2019; originally announced February 2019.

    Comments: To appear in AISTATS 2019

  50. arXiv:1807.07506  [pdf, other]

    cs.LG stat.ML

    Improving Simple Models with Confidence Profiles

    Authors: Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss, Peder Olsen

    Abstract: In this paper, we propose a new method called ProfWeight for transferring information from a pre-trained deep neural network that has a high test accuracy to a simpler interpretable model or a very shallow network of low complexity and a priori low test accuracy. We are motivated by applications in interpretability and model deployment in severely memory constrained environments (like sensors). Ou…

    Submitted 19 November, 2018; v1 submitted 19 July, 2018; originally announced July 2018.

    Comments: 16 pages; Accepted to NIPS 2018