-
DIET: Conditional independence testing with marginal dependence measures of residual information
Authors:
Mukund Sudarshan,
Aahlad Manas Puli,
Wesley Tansey,
Rajesh Ranganath
Abstract:
Conditional randomization tests (CRTs) assess whether a variable $x$ is predictive of another variable $y$, having observed covariates $z$. CRTs require fitting a large number of predictive models, which is often computationally intractable. Existing solutions to reduce the cost of CRTs typically split the dataset into a train and test portion, or rely on heuristics for interactions, both of which lead to a loss in power. We propose the decoupled independence test (DIET), an algorithm that avoids both of these issues by leveraging marginal independence statistics to test conditional independence relationships. DIET tests the marginal independence of two random variables: $F(x \mid z)$ and $F(y \mid z)$ where $F(\cdot \mid z)$ is a conditional cumulative distribution function (CDF). These variables are termed "information residuals." We give sufficient conditions for DIET to achieve finite sample type-1 error control and power greater than the type-1 error rate. We then prove that when using the mutual information between the information residuals as a test statistic, DIET yields the most powerful conditionally valid test. Finally, we show DIET achieves higher power than other tractable CRTs on several synthetic and real benchmarks.
Submitted 11 April, 2023; v1 submitted 17 August, 2022;
originally announced August 2022.
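The idea in the abstract above, testing the marginal independence of $F(x \mid z)$ and $F(y \mid z)$, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a scalar $z$ and linear-Gaussian conditionals so the conditional CDFs can be estimated by least squares, and it swaps in an absolute-correlation statistic with a permutation null in place of the mutual information statistic the abstract mentions. All function names are hypothetical.

```python
import math
import numpy as np


def gaussian_cdf_residual(target, z):
    """Estimate F(target | z), assuming a linear-Gaussian conditional.

    The linear-Gaussian form is an assumption of this sketch, not part of DIET.
    """
    design = np.column_stack([np.ones(len(z)), z])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    scale = resid.std(ddof=2)  # two fitted parameters: intercept and slope
    std_normal_cdf = np.vectorize(
        lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    )
    return std_normal_cdf(resid / scale)


def marginal_permutation_pvalue(u, v, n_perm=500, seed=0):
    """Permutation p-value for marginal dependence between u and v.

    Uses |Pearson correlation| as a simple stand-in statistic.
    """
    rng = np.random.default_rng(seed)
    stat = abs(np.corrcoef(u, v)[0, 1])
    null = np.array(
        [abs(np.corrcoef(u, rng.permutation(v))[0, 1]) for _ in range(n_perm)]
    )
    return (1 + np.sum(null >= stat)) / (1 + n_perm)


def diet_like_test(x, y, z, n_perm=500, seed=0):
    """Test x independent of y given z via the two 'information residuals'."""
    u = gaussian_cdf_residual(x, z)
    v = gaussian_cdf_residual(y, z)
    return marginal_permutation_pvalue(u, v, n_perm=n_perm, seed=seed)


# Simulated data: y_dep depends on x given z, y_ind does not.
rng = np.random.default_rng(1)
n = 500
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y_dep = z + 0.5 * x + rng.normal(size=n)
y_ind = z + rng.normal(size=n)

p_dep = diet_like_test(x, y_dep, z)
p_ind = diet_like_test(x, y_ind, z)
```

In this simulation `p_dep` should be small because `x` carries information about `y_dep` beyond `z`, while `p_ind` is a draw from an approximately uniform null distribution. A correlation statistic only detects monotone dependence between the residuals, which is one reason a statistic such as mutual information may be preferable in practice.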
-
Bayesian Modeling of Marketing Attribution
Authors:
Ritwik Sinha,
David Arbour,
Aahlad Manas Puli
Abstract:
In a multi-channel marketing world, the purchase decision journey involves many interactions (e.g., email, mobile notifications, display advertising, and social media). These impressions have direct (main) effects as well as interactive influences on the customer's final decision. To maximize conversions, a marketer needs to understand how each of these marketing efforts, individually and collectively, affects that decision. This insight helps the marketer optimize the advertising budget over interacting marketing channels. The problem of interpreting the influence of various marketing channels on the customer's decision process is called marketing attribution. We propose a Bayesian model of marketing attribution that captures established modes of action of advertisements, including the direct effect of the ad, decay of the ad effect, interaction between ads, and customer heterogeneity. Our model allows us to incorporate information from customers' features and provides usable error bounds for parameters of interest, such as the ad effect or the half-life of an ad. We apply our model to a real-world dataset and evaluate its performance against alternatives in simulations.
Submitted 31 May, 2022;
originally announced May 2022.
-
General Control Functions for Causal Effect Estimation from Instrumental Variables
Authors:
Aahlad Manas Puli,
Rajesh Ranganath
Abstract:
Causal effect estimation relies on separating the variation in the outcome into parts due to the treatment and parts due to the confounders. To achieve this separation, practitioners often use instrumental variables (IVs): external sources of randomness that influence only the treatment. We study variables constructed from the treatment and the IV that help estimate effects, called control functions. We characterize general control functions for effect estimation in a meta-identification result. Then, we show that structural assumptions on the treatment process allow the construction of general control functions, thereby guaranteeing identification. To construct general control functions and estimate effects, we develop the general control function method (GCFN). GCFN's first stage, called variational decoupling (VDE), constructs general control functions by recovering the residual variation in the treatment given the IV. Using VDE's control function, GCFN's second stage estimates effects via regression. Further, we develop semi-supervised GCFN to construct general control functions using subsets of data that have both the IV and the confounders observed as supervision; this needs no structural assumptions on the treatment process. We evaluate GCFN on low- and high-dimensional simulated data and on recovering the causal effect of slave export on modern community trust.
Submitted 2 February, 2021; v1 submitted 8 July, 2019;
originally announced July 2019.
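The two-stage structure described in the abstract above (recover the residual variation in the treatment given the IV, then regress the outcome on the treatment and that residual) can be illustrated with the classical linear control function estimator. This sketch is not GCFN or VDE. It assumes a linear, additive treatment process and a simulated dataset with a true effect of 2, and the helper names are hypothetical.

```python
import numpy as np


def ols(features, target):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(target))] + list(features))
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def linear_control_function_effect(iv, treatment, outcome):
    """Classical linear control function estimate of the treatment effect."""
    # Stage 1: the residual of treatment given the IV is the control function.
    stage1 = ols([iv], treatment)
    control = treatment - (stage1[0] + stage1[1] * iv)
    # Stage 2: regress the outcome on the treatment and the control function.
    stage2 = ols([treatment, control], outcome)
    return stage2[1]  # coefficient on the treatment


# Simulated data (assumed for illustration): the confounder is unobserved
# by the estimator and affects both treatment and outcome.
rng = np.random.default_rng(0)
n = 5000
iv = rng.normal(size=n)
confounder = rng.normal(size=n)
treatment = iv + confounder + rng.normal(size=n)
outcome = 2.0 * treatment + 3.0 * confounder + rng.normal(size=n)

naive_effect = ols([treatment], outcome)[1]
cf_effect = linear_control_function_effect(iv, treatment, outcome)
```

Here `naive_effect` is biased upward by the hidden confounder, while `cf_effect` should land close to the simulated effect of 2 because the stage-one residual absorbs the confounder's contribution to the treatment. The linear residual is only a valid control function under the additive assumption stated above; the abstract's general control functions are meant to relax exactly this kind of restriction.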
-
Removing Hidden Confounding by Experimental Grounding
Authors:
Nathan Kallus,
Aahlad Manas Puli,
Uri Shalit
Abstract:
Observational data is increasingly used as a means for making individual-level causal predictions and intervention recommendations. The foremost challenge of causal inference from observational data is hidden confounding, whose presence cannot be tested in data and can invalidate any causal conclusion. Experimental data does not suffer from confounding but is usually limited in both scope and scale. We introduce a novel method of using limited experimental data to correct the hidden confounding in causal effect models trained on larger observational data, even if the observational data does not fully overlap with the experimental data. Our method makes strictly weaker assumptions than existing approaches, and we prove conditions under which it yields a consistent estimator. We demonstrate our method's efficacy using real-world data from a large educational experiment.
Submitted 27 October, 2018;
originally announced October 2018.