-
Causal Layering via Conditional Entropy
Authors:
Itai Feigenbaum,
Devansh Arpit,
Huan Wang,
Shelby Heinecke,
Juan Carlos Niebles,
Weiran Yao,
Caiming Xiong,
Silvio Savarese
Abstract:
Causal discovery aims to recover information about an unobserved causal graph from the observable data it generates. Layerings are orderings of the variables which place causes before effects. In this paper, we provide ways to recover layerings of a graph by accessing the data via a conditional entropy oracle, when distributions are discrete. Our algorithms work by repeatedly removing sources or sinks from the graph. Under appropriate assumptions and conditioning, we can separate the sources or sinks from the remainder of the nodes by comparing their conditional entropy to the unconditional entropy of their noise. Our algorithms are provably correct and run in worst-case quadratic time. The main assumptions are faithfulness and injective noise, and either known noise entropies or weakly monotonically increasing noise entropies along directed paths. In addition, we require one of either a very mild extension of faithfulness, or strictly monotonically increasing noise entropies, or expanding noise injectivity to include an additional single argument in the structural functions.
Submitted 19 January, 2024;
originally announced January 2024.
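
The abstract above compares conditional entropies against noise entropies on discrete data. As a rough illustration of the quantity involved, here is a minimal sketch of a plug-in (empirical) estimate of $H(X \mid Z) = H(X, Z) - H(Z)$. The function names are hypothetical, and this is not the paper's oracle or algorithm, only the textbook identity applied to sample counts.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Empirical Shannon entropy (in bits) of a list of hashable samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def conditional_entropy(xs, zs):
    """Plug-in estimate of H(X | Z) via the chain rule H(X, Z) - H(Z).

    Assumption: xs and zs are equal-length sequences of discrete values.
    """
    return entropy(list(zip(xs, zs))) - entropy(list(zs))


# X independent of Z, both uniform on {0, 1}: conditioning removes nothing.
independent = conditional_entropy([0, 1, 0, 1], [0, 0, 1, 1])

# X is a copy of Z: conditioning on Z removes all uncertainty about X.
determined = conditional_entropy([0, 0, 1, 1], [0, 0, 1, 1])
```

In the first case the estimate is one bit and in the second it is zero, which is the kind of gap an entropy comparison can exploit.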
-
Editing Arbitrary Propositions in LLMs without Subject Labels
Authors:
Itai Feigenbaum,
Devansh Arpit,
Huan Wang,
Shelby Heinecke,
Juan Carlos Niebles,
Weiran Yao,
Caiming Xiong,
Silvio Savarese
Abstract:
Large Language Model (LLM) editing modifies factual information in LLMs. Locate-and-Edit (L\&E) methods accomplish this by finding where relevant information is stored within the neural network, and editing the weights at that location. The goal of editing is to modify the response of an LLM to a proposition independently of its phrasing, while not modifying its response to other related propositions. Existing methods are limited to binary propositions, which represent straightforward binary relations between a subject and an object. Furthermore, existing methods rely on semantic subject labels, which may not be available or even be well-defined in practice. In this paper, we show that both of these issues can be effectively skirted with a simple and fast localization method called Gradient Tracing (GT). This localization method allows editing arbitrary propositions instead of just binary ones, and does so without the need for subject labels. As propositions always have a truth value, our experiments prompt an LLM as a boolean classifier, and edit its T/F response to propositions. Our method applies GT for location tracing, and then edits the model at that location using a mild variant of Rank-One Model Editing (ROME). On datasets of binary propositions derived from the CounterFact dataset, we show that our method -- without access to subject labels -- performs close to state-of-the-art L\&E methods which have access to subject labels. We then introduce a new dataset, Factual Accuracy Classification Test (FACT), which includes non-binary propositions and for which subject labels are not generally applicable, and therefore is beyond the scope of existing L\&E methods. Nevertheless, we show that with our method editing is possible on FACT.
Submitted 15 January, 2024;
originally announced January 2024.
-
On the Unlikelihood of D-Separation
Authors:
Itai Feigenbaum,
Huan Wang,
Shelby Heinecke,
Juan Carlos Niebles,
Weiran Yao,
Caiming Xiong,
Devansh Arpit
Abstract:
Causal discovery aims to recover a causal graph from data generated by it; constraint-based methods do so by searching for a d-separating conditioning set of nodes in the graph via an oracle. In this paper, we provide analytic evidence that on large graphs, d-separation is a rare phenomenon, even when guaranteed to exist, unless the graph is extremely sparse. We then provide an analytic average-case analysis of the PC Algorithm for causal discovery, as well as a variant of the SGS Algorithm we call UniformSGS. We consider a set $V=\{v_1,\ldots,v_n\}$ of nodes, and generate a random DAG $G=(V,E)$ where $(v_a, v_b) \in E$ with i.i.d. probability $p_1$ if $a<b$ and $0$ if $a > b$. We provide upper bounds on the probability that a subset of $V-\{x,y\}$ d-separates $x$ and $y$, conditional on $x$ and $y$ being d-separable; our upper bounds decay exponentially fast to $0$ as $|V| \rightarrow \infty$. For the PC Algorithm, while it is known that its worst-case guarantees fail on non-sparse graphs, we show that the same is true for the average case, and that the sparsity requirement is quite demanding: for good performance, the density must go to $0$ as $|V| \rightarrow \infty$ even in the average case. For UniformSGS, while it is known that the running time is exponential for existing edges, we show that in the average case, that is the expected running time for most non-existing edges as well.
Submitted 3 October, 2023; v1 submitted 9 March, 2023;
originally announced March 2023.
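
The random DAG model stated in the abstract above (each pair $(v_a, v_b)$ with $a<b$ becomes an edge independently with probability $p_1$, and no edge points from a higher index to a lower one) can be sketched as follows. The function name and the `seed` parameter are hypothetical conveniences, not taken from the paper.

```python
import random


def random_dag(n, p1, seed=None):
    """Sample the edge set of a random DAG on nodes 1..n.

    Each ordered pair (a, b) with a < b is included independently with
    probability p1; pairs with a > b are never included, so the node
    indices form a topological order and the result is always acyclic.
    """
    rng = random.Random(seed)  # local generator, so global state is untouched
    return [
        (a, b)
        for a in range(1, n + 1)
        for b in range(a + 1, n + 1)
        if rng.random() < p1
    ]


edges = random_dag(n=6, p1=0.5, seed=0)
```

With `p1 = 1.0` this yields the complete DAG on `n` nodes with `n * (n - 1) / 2` edges, and with `p1 = 0.0` the empty graph.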
-
Salesforce CausalAI Library: A Fast and Scalable Framework for Causal Analysis of Time Series and Tabular Data
Authors:
Devansh Arpit,
Matthew Fernandez,
Itai Feigenbaum,
Weiran Yao,
Chenghao Liu,
Wenzhuo Yang,
Paul Josel,
Shelby Heinecke,
Eric Hu,
Huan Wang,
Stephen Hoi,
Caiming Xiong,
Kun Zhang,
Juan Carlos Niebles
Abstract:
We introduce the Salesforce CausalAI Library, an open-source library for causal analysis using observational data. It supports causal discovery and causal inference for tabular and time series data, of discrete, continuous and heterogeneous types. This library includes algorithms that handle linear and non-linear causal relationships between variables, and uses multi-processing for speed-up. We also include a data generator capable of generating synthetic data with a specified structural equation model for the aforementioned data formats and types, which helps users control the ground-truth causal process while investigating various algorithms. Finally, we provide a user interface (UI) that allows users to perform causal analysis on data without coding. The goal of this library is to provide a fast and flexible solution for a variety of problems in the domain of causality. This technical report describes the Salesforce CausalAI API along with its capabilities, the implementations of the supported algorithms, and experiments demonstrating their performance and speed. Our library is available at \url{https://github.com/salesforce/causalai}.
Submitted 22 September, 2023; v1 submitted 25 January, 2023;
originally announced January 2023.
-
Selfish Knapsack
Authors:
Itai Feigenbaum,
Matthew P. Johnson
Abstract:
We consider a selfish variant of the knapsack problem. In our version, the items are owned by agents, and each agent can misrepresent the set of items she owns---either by avoiding reporting some of them (understating), or by reporting additional ones that do not exist (overstating). Each agent's objective is to maximize, within the items chosen for inclusion in the knapsack, the total valuation of her own chosen items. The knapsack problem, in this context, seeks to minimize the worst-case approximation ratio for social welfare at equilibrium. We show that a randomized greedy mechanism has attractive strategic properties: in general, it has a correlated price of anarchy of $2$ (subject to a mild assumption). For overstating-only agents, it becomes strategyproof; we also provide a matching lower bound of $2$ on the (worst-case) approximation ratio attainable by randomized strategyproof mechanisms, and show that no deterministic strategyproof mechanism can provide any constant approximation ratio. We also deal with more specialized environments. For the case of $2$ understating-only agents, we provide a randomized strategyproof $\frac{5+4\sqrt{2}}{7} \approx 1.522$-approximate mechanism, and a lower bound of $\frac{5\sqrt{5}-9}{2} \approx 1.09$. When all agents but one are honest, we provide a deterministic strategyproof $\frac{1+\sqrt{5}}{2} \approx 1.618$-approximate mechanism with a matching lower bound. Finally, we consider a model where agents can misreport their items' properties rather than existence. Specifically, each agent owns a single item, whose value-to-size ratio is publicly known, but whose actual value and size are not. We show that an adaptation of the greedy mechanism is strategyproof and $2$-approximate, and provide a matching lower bound; we also show that no deterministic strategyproof mechanism can provide a constant approximation ratio.
Submitted 27 February, 2016; v1 submitted 25 October, 2015;
originally announced October 2015.
-
Strategyproof Mechanisms for One-Dimensional Hybrid and Obnoxious Facility Location
Authors:
Itai Feigenbaum,
Jay Sethuraman
Abstract:
We consider a strategic variant of the facility location problem. We would like to locate a facility on a closed interval. There are n agents located on that interval, divided into two types: type 1 agents, who wish for the facility to be as far from them as possible, and type 2 agents, who wish for the facility to be as close to them as possible. Our goal is to maximize a form of aggregated social benefit: either maxisum (the sum of the agents' utilities) or the egalitarian objective (the minimal agent utility). The strategic aspect of the problem is that the agents' locations are not known to us, but rather reported to us by the agents; an agent might misreport his location in an attempt to move the facility away from or towards his true location. We therefore require the facility-locating mechanism to be strategyproof, namely that reporting truthfully is a dominant strategy for each agent. As simply maximizing the social benefit is generally not strategyproof, our goal is to design strategyproof mechanisms with good approximation ratios.
For the maxisum objective, in the deterministic setting, we provide a best-possible 3-approximate strategyproof mechanism; in the randomized setting, we provide a 23/13-approximate strategyproof mechanism and a lower bound of $\frac{2}{\sqrt{3}}$. For the egalitarian objective, we provide a lower bound of 3/2 in the randomized setting, and show that no bounded approximation ratio is attainable in the deterministic setting. To obtain our deterministic lower bounds, we characterize all deterministic strategyproof mechanisms when all agents are of type 1. Finally, we consider a generalized model that allows an agent to control more than one location, and provide best-possible 3- and 3/2-approximate strategyproof mechanisms for maxisum, in the deterministic and randomized settings respectively, when only type 1 agents are present.
Submitted 17 July, 2015; v1 submitted 10 December, 2014;
originally announced December 2014.
-
Approximately Optimal Mechanisms for Strategyproof Facility Location: Minimizing $L_p$ Norm of Costs
Authors:
Itai Feigenbaum,
Jay Sethuraman,
Chun Ye
Abstract:
We consider the problem of locating a single facility on the real line. This facility serves a set of agents, each of whom is located on the line, and incurs a cost equal to his distance from the facility. An agent's location is private information that is known only to him. Agents report their location to a central planner who decides where to locate the facility. The planner's objective is to minimize a "social" cost function that depends on the agent-costs. However, agents might not report truthfully; to address this issue, the planner must restrict himself to {\em strategyproof} mechanisms, in which truthful reporting is a dominant strategy for each agent. A mechanism that simply chooses the optimal solution is generally not strategyproof, and so the planner aspires to use a mechanism that effectively {\em approximates} his objective function. In our paper, we study the problem described above with the social cost function being the $L_p$ norm of the vector of agent-costs. We show that the median mechanism (which is known to be strategyproof) provides a $2^{1-\frac{1}{p}}$ approximation ratio, and that is the optimal approximation ratio among all deterministic strategyproof mechanisms. For randomized mechanisms, we present two results. First, we present a negative result: we show that for integer $\infty>p>2$, no mechanism---from a rather large class of randomized mechanisms---has an approximation ratio better than that of the median mechanism. This is in contrast to the case of $p=2$ and $p=\infty$ where a randomized mechanism provably helps improve the worst-case approximation ratio. Second, for the case of 2 agents, we show that a mechanism called LRM, first designed by Procaccia and Tennenholtz for the special case of $L_{\infty}$, provides the optimal approximation ratio among all randomized mechanisms.
Submitted 15 September, 2014; v1 submitted 10 May, 2013;
originally announced May 2013.
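
To make the $2^{1-\frac{1}{p}}$ ratio in the abstract above concrete, here is a small worked sketch with two agents at 0 and 1. It assumes the median mechanism breaks ties for an even number of agents by taking the lower median (a hypothetical tie-breaking rule chosen for illustration), and it uses the midpoint 0.5 as the $L_p$-optimal location for this symmetric instance with $p > 1$. Function names are illustrative only.

```python
def lp_social_cost(facility, agents, p):
    """L_p norm of the vector of agent costs |facility - agent|."""
    return sum(abs(facility - a) ** p for a in agents) ** (1.0 / p)


def median_mechanism(reports):
    """Place the facility at the (lower) median of the reported locations.

    Assumption: for an even number of reports, the lower of the two middle
    values is chosen.
    """
    ordered = sorted(reports)
    return ordered[(len(ordered) - 1) // 2]


agents = [0.0, 1.0]
p = 2

chosen = median_mechanism(agents)               # lower median: 0.0
mechanism_cost = lp_social_cost(chosen, agents, p)
optimal_cost = lp_social_cost(0.5, agents, p)   # midpoint, by symmetry
ratio = mechanism_cost / optimal_cost
bound = 2 ** (1 - 1 / p)
```

On this instance the median's cost is $1$ while the midpoint's cost is $2^{\frac{1}{p}-1}$, so the ratio equals $2^{1-\frac{1}{p}}$ exactly, which shows the stated approximation ratio is attained by a two-agent example.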