Search | arXiv e-print repository

Interventional Causal Discovery in a Mixture of DAGs

Authors: Burak Varıcı, Dmitriy Katz-Rogozhnikov, Dennis Wei, Prasanna Sattigeri, Ali Tajer

Abstract: Causal interactions among a group of variables are often modeled by a single causal graph. In some domains, however, these interactions are best described by multiple co-existing causal graphs, e.g., in dynamical systems or genomics. This paper addresses the hitherto unknown role of interventions in learning causal interactions among variables governed by a mixture of causal systems, each modeled… ▽ More Causal interactions among a group of variables are often modeled by a single causal graph. In some domains, however, these interactions are best described by multiple co-existing causal graphs, e.g., in dynamical systems or genomics. This paper addresses the hitherto unknown role of interventions in learning causal interactions among variables governed by a mixture of causal systems, each modeled by one directed acyclic graph (DAG). Causal discovery from mixtures is fundamentally more challenging than single-DAG causal discovery. Two major difficulties stem from (i) inherent uncertainty about the skeletons of the component DAGs that constitute the mixture and (ii) possibly cyclic relationships across these component DAGs. This paper addresses these challenges and aims to identify edges that exist in at least one component DAG of the mixture, referred to as true edges. First, it establishes matching necessary and sufficient conditions on the size of interventions required to identify the true edges. Next, guided by the necessity results, an adaptive algorithm is designed that learns all true edges using ${\cal O}(n^2)$ interventions, where $n$ is the number of nodes. Remarkably, the size of the interventions is optimal if the underlying mixture model does not contain cycles across its components. More generally, the gap between the intervention size used by the algorithm and the optimal size is quantified. It is shown to be bounded by the cyclic complexity number of the mixture model, defined as the size of the minimal intervention that can break the cycles in the mixture, which is upper bounded by the number of cycles among the ancestors of a node. △ Less

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2403.00233 [pdf, other]

Causal Bandits with General Causal Models and Interventions

Authors: Zirui Yan, Dennis Wei, Dmitriy Katz-Rogozhnikov, Prasanna Sattigeri, Ali Tajer

Abstract: This paper considers causal bandits (CBs) for the sequential design of interventions in a causal system. The objective is to optimize a reward function via minimizing a measure of cumulative regret with respect to the best sequence of interventions in hindsight. The paper advances the results on CBs in three directions. First, the structural causal models (SCMs) are assumed to be unknown and drawn… ▽ More This paper considers causal bandits (CBs) for the sequential design of interventions in a causal system. The objective is to optimize a reward function via minimizing a measure of cumulative regret with respect to the best sequence of interventions in hindsight. The paper advances the results on CBs in three directions. First, the structural causal models (SCMs) are assumed to be unknown and drawn arbitrarily from a general class $\mathcal{F}$ of Lipschitz-continuous functions. Existing results are often focused on (generalized) linear SCMs. Second, the interventions are assumed to be generalized soft with any desired level of granularity, resulting in an infinite number of possible interventions. The existing literature, in contrast, generally adopts atomic and hard interventions. Third, we provide general upper and lower bounds on regret. The upper bounds subsume (and improve) known bounds for special cases. The lower bounds are generally hitherto unknown. These bounds are characterized as functions of the (i) graph parameters, (ii) eluder dimension of the space of SCMs, denoted by $\operatorname{dim}(\mathcal{F})$, and (iii) the covering number of the function space, denoted by ${\rm cn}(\mathcal{F})$. Specifically, the cumulative achievable regret over horizon $T$ is $\mathcal{O}(K d^{L-1}\sqrt{T\operatorname{dim}(\mathcal{F}) \log({\rm cn}(\mathcal{F}))})$, where $K$ is related to the Lipschitz constants, $d$ is the graph's maximum in-degree, and $L$ is the length of the longest causal path. The upper bound is further refined for special classes of SCMs (neural network, polynomial, and linear), and their corresponding lower bounds are provided. △ Less

Submitted 29 February, 2024; originally announced March 2024.

Comments: 37 pages, 13 figures, conference

arXiv:2104.04633 [pdf, other]

Automated Meta-Analysis: A Causal Learning Perspective

Authors: Lu Cheng, Dmitriy A. Katz-Rogozhnikov, Kush R. Varshney, Ioana Baldini

Abstract: Meta-analysis is a systematic approach for understanding a phenomenon by analyzing the results of many previously published experimental studies. It is central to deriving conclusions about the summary effect of treatments and interventions in medicine, poverty alleviation, and other applications with social impact. Unfortunately, meta-analysis involves great human effort, rendering a process that… ▽ More Meta-analysis is a systematic approach for understanding a phenomenon by analyzing the results of many previously published experimental studies. It is central to deriving conclusions about the summary effect of treatments and interventions in medicine, poverty alleviation, and other applications with social impact. Unfortunately, meta-analysis involves great human effort, rendering a process that is extremely inefficient and vulnerable to human bias. To overcome these issues, we work toward automating meta-analysis with a focus on controlling for risks of bias. In particular, we first extract information from scientific publications written in natural language. From a novel causal learning perspective, we then propose to frame automated meta-analysis -- based on the input of the first step -- as a multiple-causal-inference problem where the summary effect is obtained through intervention. Built upon existing efforts for automating the initial steps of meta-analysis, the proposed approach achieves the goal of automated meta-analysis and largely reduces the human effort involved. Evaluations on synthetic and semi-synthetic datasets show that this approach can yield promising results. △ Less

Submitted 9 April, 2021; originally announced April 2021.

Comments: 11 pages, 6 figures

arXiv:2011.01979 [pdf, other]

High-Dimensional Feature Selection for Sample Efficient Treatment Effect Estimation

Authors: Kristjan Greenewald, Dmitriy Katz-Rogozhnikov, Karthik Shanmugam

Abstract: The estimation of causal treatment effects from observational data is a fundamental problem in causal inference. To avoid bias, the effect estimator must control for all confounders. Hence practitioners often collect data for as many covariates as possible to raise the chances of including the relevant confounders. While this addresses the bias, this has the side effect of significantly increasing… ▽ More The estimation of causal treatment effects from observational data is a fundamental problem in causal inference. To avoid bias, the effect estimator must control for all confounders. Hence practitioners often collect data for as many covariates as possible to raise the chances of including the relevant confounders. While this addresses the bias, this has the side effect of significantly increasing the number of data samples required to accurately estimate the effect due to the increased dimensionality. In this work, we consider the setting where out of a large number of covariates $X$ that satisfy strong ignorability, an unknown sparse subset $S$ is sufficient to include to achieve zero bias, i.e. $c$-equivalent to $X$. We propose a common objective function involving outcomes across treatment cohorts with nonconvex joint sparsity regularization that is guaranteed to recover $S$ with high probability under a linear outcome model for $Y$ and subgaussian covariates for each of the treatment cohort. This improves the effect estimation sample complexity so that it scales with the cardinality of the sparse subset $S$ and $\log |X|$, as opposed to the cardinality of the full set $X$. We validate our approach with experiments on treatment effect estimation. △ Less

Submitted 3 November, 2020; originally announced November 2020.

arXiv:1911.07819 [pdf, other]

Drug Repurposing for Cancer: An NLP Approach to Identify Low-Cost Therapies

Authors: Shivashankar Subramanian, Ioana Baldini, Sushma Ravichandran, Dmitriy A. Katz-Rogozhnikov, Karthikeyan Natesan Ramamurthy, Prasanna Sattigeri, Kush R. Varshney, Annmarie Wang, Pradeep Mangalath, Laura B. Kleiman

Abstract: More than 200 generic drugs approved by the U.S. Food and Drug Administration for non-cancer indications have shown promise for treating cancer. Due to their long history of safe patient use, low cost, and widespread availability, repurposing of generic drugs represents a major opportunity to rapidly improve outcomes for cancer patients and reduce healthcare costs worldwide. Evidence on the effica… ▽ More More than 200 generic drugs approved by the U.S. Food and Drug Administration for non-cancer indications have shown promise for treating cancer. Due to their long history of safe patient use, low cost, and widespread availability, repurposing of generic drugs represents a major opportunity to rapidly improve outcomes for cancer patients and reduce healthcare costs worldwide. Evidence on the efficacy of non-cancer generic drugs being tested for cancer exists in scientific publications, but trying to manually identify and extract such evidence is intractable. In this paper, we introduce a system to automate this evidence extraction from PubMed abstracts. Our primary contribution is to define the natural language processing pipeline required to obtain such evidence, comprising the following modules: querying, filtering, cancer type entity extraction, therapeutic association classification, and study type classification. Using the subject matter expertise on our team, we create our own datasets for these specialized domain-specific tasks. We obtain promising performance in each of the modules by utilizing modern language modeling techniques and plan to treat them as baseline approaches for future improvement of individual components. △ Less

Submitted 5 December, 2019; v1 submitted 18 November, 2019; originally announced November 2019.

arXiv:1211.4063 [pdf, ps, other]

Asymptotic Optimality of Constant-Order Policies for Lost Sales Inventory Models with Large Lead Times

Authors: David A. Goldberg, Dmitriy A. Katz-Rogozhnikov, Yingdong Lu, Mayank Sharma, Mark S. Squillante

Abstract: Lost sales inventory models with large lead times, which arise in many practical settings, are notoriously difficult to optimize due to the curse of dimensionality. In this paper we show that when lead times are large, a very simple constant-order policy, first studied by Reiman (\cite{Reiman04}), performs nearly optimally. The main insight of our work is that when the lead time is very large, suc… ▽ More Lost sales inventory models with large lead times, which arise in many practical settings, are notoriously difficult to optimize due to the curse of dimensionality. In this paper we show that when lead times are large, a very simple constant-order policy, first studied by Reiman (\cite{Reiman04}), performs nearly optimally. The main insight of our work is that when the lead time is very large, such a significant amount of randomness is injected into the system between when an order for more inventory is placed and when that order is received, that "being smart" algorithmically provides almost no benefit. Our main proof technique combines a novel coupling for suprema of random walks with arguments from queueing theory. △ Less

Submitted 3 September, 2014; v1 submitted 16 November, 2012; originally announced November 2012.

MSC Class: Primary: 90B05; secondary: 90B22

Showing 1–6 of 6 results for author: Katz-Rogozhnikov, D