Search | arXiv e-print repository

Causal Contextual Bandits with Adaptive Context

Authors: Rahul Madhavan, Aurghya Maiti, Gaurav Sinha, Siddharth Barman

Abstract: We study a variant of causal contextual bandits where the context is chosen based on an initial intervention chosen by the learner. At the beginning of each round, the learner selects an initial action, depending on which a stochastic context is revealed by the environment. Following this, the learner then selects a final action and receives a reward. Given $T$ rounds of interactions with the environment, the objective of the learner is to learn a policy (of selecting the initial and the final action) with maximum expected reward. In this paper we study the specific situation where every action corresponds to intervening on a node in some known causal graph. We extend prior work from the deterministic context setting to obtain simple regret minimization guarantees. This is achieved through an instance-dependent causal parameter, $λ$, which characterizes our upper bound. Furthermore, we prove that our simple regret is essentially tight for a large class of instances. A key feature of our work is that we use convex optimization to address the bandit exploration problem. We also conduct experiments to validate our theoretical results, and release our code at our project GitHub repository: https://github.com/adaptiveContextualCausalBandits/aCCB.

Submitted 2 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

Comments: Reinforcement Learning Conference (RLC) 2024, 10 pages (31 pages including appendix), 8 plots. arXiv admin note: text overlap with arXiv:2111.00886

arXiv:2401.15229 [pdf, other]

Evolving AI Risk Management: A Maturity Model based on the NIST AI Risk Management Framework

Authors: Ravit Dotan, Borhane Blili-Hamelin, Ravi Madhavan, Jeanna Matthews, Joshua Scarpino

Abstract: Researchers, government bodies, and organizations have been repeatedly calling for a shift in the responsible AI community from general principles to tangible and operationalizable practices in mitigating the potential sociotechnical harms of AI. Frameworks like the NIST AI RMF embody an emerging consensus on recommended practices in operationalizing sociotechnical harm mitigation. However, private sector organizations currently lag far behind this emerging consensus. Implementation is sporadic and selective at best. At worst, it is ineffective and can risk serving as a misleading veneer of trustworthy processes, providing an appearance of legitimacy to substantively harmful practices. In this paper, we provide a foundation for a framework for evaluating where organizations sit relative to the emerging consensus on sociotechnical harm mitigation best practices: a flexible maturity model based on the NIST AI RMF.

Submitted 13 February, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

arXiv:2311.11229 [pdf, other]

Causal ATE Mitigates Unintended Bias in Controlled Text Generation

Authors: Rahul Madhavan, Kahini Wadhawan

Abstract: We study attribute control in language models through the method of Causal Average Treatment Effect (Causal ATE). Existing methods for the attribute control task in Language Models (LMs) check for the co-occurrence of words in a sentence with the attribute of interest, and control for them. However, spurious correlation of the words with the attribute in the training dataset can cause models to hallucinate the presence of the attribute when presented with the spurious correlate during inference. We show that the simple perturbation-based method of Causal ATE removes this unintended effect. Specifically, we ground it in the problem of toxicity mitigation, where a significant challenge lies in the inadvertent bias that often emerges towards protected groups post detoxification. We show that this unintended bias can be solved by the use of the Causal ATE metric and rigorously prove our claim. We provide experimental validations for our claims and release our code (anonymously) here: https://github.com/causalate-mitigates-bias/causal-ate-mitigates-bias.

Submitted 16 February, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

Comments: 12 pages, 5 figures

arXiv:2306.00374 [pdf, other]

doi 10.18653/v1/2023.findings-acl.720

CFL: Causally Fair Language Models Through Token-level Attribute Controlled Generation

Authors: Rahul Madhavan, Rishabh Garg, Kahini Wadhawan, Sameep Mehta

Abstract: We propose a method to control the attributes of Language Models (LMs) for the text generation task using Causal Average Treatment Effect (ATE) scores and counterfactual augmentation. We explore this method in the context of LM detoxification, and propose the Causally Fair Language (CFL) architecture for detoxifying pre-trained LMs in a plug-and-play manner. Our architecture is based on a Structural Causal Model (SCM) that is mathematically transparent and computationally efficient as compared with many existing detoxification techniques. We also propose several new metrics that aim to better understand the behaviour of LMs in the context of toxic text generation. Further, we achieve state-of-the-art performance for toxic degeneration, which is computed using the RealToxicityPrompts (RTP) benchmark. Our experiments show that CFL achieves such a detoxification without much impact on the model perplexity. We also show that CFL mitigates the unintended bias problem through experiments on the BOLD dataset.

Submitted 1 June, 2023; originally announced June 2023.

Comments: 19 pages, 10 figures. Findings of ACL 2023

Journal ref: Findings of the Association for Computational Linguistics: ACL 2023

arXiv:2305.04638 [pdf, other]

Learning Good Interventions in Causal Graphs via Covering

Authors: Ayush Sawarni, Rahul Madhavan, Gaurav Sinha, Siddharth Barman

Abstract: We study the causal bandit problem that entails identifying a near-optimal intervention from a specified set $A$ of (possibly non-atomic) interventions over a given causal graph. Here, an optimal intervention in ${A}$ is one that maximizes the expected value for a designated reward variable in the graph, and we use the standard notion of simple regret to quantify near optimality. Considering Bernoulli random variables and for causal graphs on $N$ vertices with constant in-degree, prior work has achieved a worst case guarantee of $\widetilde{O} (N/\sqrt{T})$ for simple regret. The current work utilizes the idea of covering interventions (which are not necessarily contained within ${A}$) and establishes a simple regret guarantee of $\widetilde{O}(\sqrt{N/T})$. Notably, and in contrast to prior work, our simple regret bound depends only on explicit parameters of the problem instance. We also go beyond prior work and achieve a simple regret guarantee for causal graphs with unobserved variables. Further, we perform experiments to show improvements over baselines in this setting.

Submitted 8 May, 2023; originally announced May 2023.

Comments: 26 pages

arXiv:2203.03541 [pdf, other]

Fairness for Text Classification Tasks with Identity Information Data Augmentation Methods

Authors: Mohit Wadhwa, Mohan Bhambhani, Ashvini Jindal, Uma Sawant, Ramanujam Madhavan

Abstract: Counterfactual fairness methods address the question: How would the prediction change if the sensitive identity attributes referenced in the text instance were different? These methods are entirely based on generating counterfactuals for the given training and test set instances. Counterfactual instances are commonly prepared by replacing sensitive identity terms, i.e., the identity terms present in the instance are replaced with other identity terms that fall under the same sensitive category. Therefore, the efficacy of these methods depends heavily on the quality and comprehensiveness of identity pairs. In this paper, we offer a two-step data augmentation process where (1) the former stage consists of a novel method for preparing a comprehensive list of identity pairs with word embeddings, and (2) the latter consists of leveraging prepared identity pairs list to enhance the training instances by applying three simple operations (namely identity pair replacement, identity term blindness, and identity pair swap). We empirically show that the two-stage augmentation process leads to diverse identity pairs and an enhanced training set, with an improved counterfactual token-based fairness metric score on two well-known text classification tasks.

Submitted 4 February, 2022; originally announced March 2022.

arXiv:2111.00886 [pdf, other]

Intervention Efficient Algorithm for Two-Stage Causal MDPs

Authors: Rahul Madhavan, Aurghya Maiti, Gaurav Sinha, Siddharth Barman

Abstract: We study Markov Decision Processes (MDP) wherein states correspond to causal graphs that stochastically generate rewards. In this setup, the learner's goal is to identify atomic interventions that lead to high rewards by intervening on variables at each state. Generalizing the recent causal-bandit framework, the current work develops (simple) regret minimization guarantees for two-stage causal MDPs, with parallel causal graph at each state. We propose an algorithm that achieves an instance dependent regret bound. A key feature of our algorithm is that it utilizes convex optimization to address the exploration problem. We identify classes of instances wherein our regret guarantee is essentially tight, and experimentally validate our theoretical results.

Submitted 1 November, 2021; originally announced November 2021.

Comments: 29 pages

arXiv:2106.13849 [pdf, other]

A CNN Segmentation-Based Approach to Object Detection and Tracking in Ultrasound Scans with Application to the Vagus Nerve Detection

Authors: Abdullah F. Al-Battal, Yan Gong, Lu Xu, Timothy Morton, Chen Du, Yifeng Bu, Imanuel R Lerman, Radhika Madhavan, Truong Q. Nguyen

Abstract: Ultrasound scanning is essential in several medical diagnostic and therapeutic applications. It is used to visualize and analyze anatomical features and structures that influence treatment plans. However, it is both labor intensive, and its effectiveness is operator dependent. Real-time accurate and robust automatic detection and tracking of anatomical structures while scanning would significantly impact diagnostic and therapeutic procedures to be consistent and efficient. In this paper, we propose a deep learning framework to automatically detect and track a specific anatomical target structure in ultrasound scans. Our framework is designed to be accurate and robust across subjects and imaging devices, to operate in real-time, and to not require a large training set. It maintains a localization precision and recall higher than 90% when trained on training sets that are as small as 20% in size of the original training set. The framework backbone is a weakly trained segmentation neural network based on U-Net. We tested the framework on two different ultrasound datasets with the aim to detect and track the Vagus nerve, where it outperformed current state-of-the-art real-time object detection networks.

Submitted 25 June, 2021; originally announced June 2021.

Comments: 7 pages, 4 figures, submitted to the IEEE EMBC 2021 conference

arXiv:2104.07361 [pdf, other]

Scale Invariant Monte Carlo under Linear Function Approximation with Curvature based step-size

Authors: Rahul Madhavan, Hemanta Makwana

Abstract: We study the feature-scaled version of the Monte Carlo algorithm with linear function approximation. This algorithm converges to a scale-invariant solution, which is not unduly affected by states having feature vectors with large norms. The usual versions of the MCMC algorithm, obtained by minimizing the least-squares criterion, do not produce solutions that give equal importance to all states irrespective of feature-vector norm -- a requirement that may be critical in many reinforcement learning contexts. To speed up convergence in our algorithm, we introduce an adaptive step-size based on the curvature of the iterate convergence path -- a novelty that may be useful in more general optimization contexts as well. A key contribution of this paper is to prove convergence, in the presence of adaptive curvature based step-size and heavy-ball momentum. We provide rigorous theoretical guarantees and use simulations to demonstrate the efficacy of our ideas.

Submitted 29 May, 2022; v1 submitted 15 April, 2021; originally announced April 2021.

Comments: 42 pages, 9 figures (9 pages main body with 5 figures)

arXiv:2010.10737 [pdf, other]

Directed Graph Representation through Vector Cross Product

Authors: Ramanujam Madhavan, Mohit Wadhwa

Abstract: Graph embedding methods embed the nodes in a graph in low dimensional vector space while preserving graph topology to carry out the downstream tasks such as link prediction, node recommendation and clustering. These tasks depend on a similarity measure such as cosine similarity and Euclidean distance between a pair of embeddings that are symmetric in nature and hence do not hold good for directed graphs. Recent work on directed graphs, HOPE, APP, and NERD, proposed to preserve the direction of edges among nodes by learning two embeddings, source and target, for every node. However, these methods do not take into account the properties of directed edges explicitly. To understand the directional relation among nodes, we propose a novel approach that takes advantage of the non-commutative property of vector cross product to learn embeddings that inherently preserve the direction of edges among nodes. We learn the node embeddings through a Siamese neural network where the cross-product operation is incorporated into the network architecture. Although cross product between a pair of vectors is defined in three dimensions, the approach is extended to learn N dimensional embeddings while maintaining the non-commutative property. In our empirical experiments on three real-world datasets, we observed that even very low dimensional embeddings could effectively preserve the directional property while outperforming some of the state-of-the-art methods on link prediction and node recommendation tasks.

Submitted 20 October, 2020; originally announced October 2020.

arXiv:2002.12143 [pdf, other]

doi 10.1145/3340531

Fairness-Aware Learning with Prejudice Free Representations

Authors: Ramanujam Madhavan, Mohit Wadhwa

Abstract: Machine learning models are extensively being used to make decisions that have a significant impact on human life. These models are trained over historical data that may contain information about sensitive attributes such as race, sex, religion, etc. The presence of such sensitive attributes can impact certain population subgroups unfairly. It is straightforward to remove sensitive features from the data; however, a model could pick up prejudice from latent sensitive attributes that may exist in the training data. This has led to the growing apprehension about the fairness of the employed models. In this paper, we propose a novel algorithm that can effectively identify and treat latent discriminating features. The approach is agnostic of the learning algorithm and generalizes well for classification as well as regression tasks. It can also be used as a key aid in proving that the model is free of discrimination towards regulatory compliance if the need arises. The approach helps to collect discrimination-free features that would improve the model performance while ensuring the fairness of the model. The experimental results from our evaluations on publicly available real-world datasets show a near-ideal fairness measurement in comparison to other methods.

Submitted 26 February, 2020; originally announced February 2020.

arXiv:1704.05729 [pdf, other]

A generalized Bayesian framework for the analysis of subscription based businesses

Authors: Rahul Madhavan, Ankit Baraskar

Abstract: We have created a framework for analyzing subscription based businesses in terms of a unified metric which we call SCV (single customer value). The major advance in this paper is to model customer churn as an exponential decay variable, which directly follows from experimental data relating to subscription based businesses. This Bayesian probabilistic model was used to compute an expected value for the revenue contribution of a single user. We obtain an exact closed-form solution for the constant churn model, and an approximate closed-form solution for the exponential decay model. In addition, we define a general methodology for decision making processes using sensitivity analysis of the model equation, which we illustrate with a real-life case study for a food based subscription business.

Submitted 12 April, 2017; originally announced April 2017.

Comments: 12 pages, 4 figures, Atidiv Research

Showing 1–12 of 12 results for author: Madhavan, R