-
An Experimental Design for Anytime-Valid Causal Inference on Multi-Armed Bandits
Authors:
Biyonka Liang,
Iavor Bojinov
Abstract:
Experimentation is crucial for managers to rigorously quantify the value of a change and determine if it leads to a statistically significant improvement over the status quo, thus augmenting their decision-making. Many companies now mandate that all changes undergo experimentation, presenting two challenges: (1) reducing the risk/cost of experimentation by minimizing the proportion of customers assigned to the inferior treatment and (2) increasing the experimentation velocity by enabling managers to stop experiments as soon as results are statistically significant. This paper simultaneously addresses both challenges by proposing the Mixture Adaptive Design (MAD), a new experimental design for multi-armed bandit (MAB) algorithms that enables anytime valid inference on the Average Treatment Effect (ATE) for any MAB algorithm. Intuitively, the MAD "mixes" any bandit algorithm with a Bernoulli design such that at each time step, the probability that a customer is assigned via the Bernoulli design is controlled by a user-specified deterministic sequence that can converge to zero. The sequence enables managers to directly and interpretably control the trade-off between regret minimization and inferential precision. Under mild conditions on the rate at which the sequence converges to zero, we provide a confidence sequence that is asymptotically anytime valid and demonstrate that the MAD is guaranteed to have a finite stopping time in the presence of a true non-zero ATE. Hence, the MAD allows managers to stop experiments early when a significant ATE is detected while ensuring valid inference, enhancing both the efficiency and reliability of adaptive experiments. Empirically, we demonstrate that the MAD achieves finite-sample anytime-validity while accurately and precisely estimating the ATE, all without incurring significant losses in reward compared to standard bandit designs.
Submitted 14 June, 2024; v1 submitted 9 November, 2023;
originally announced November 2023.
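The mixing step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes two arms, a Bernoulli design that assigns treatment with probability 1/2, and a hypothetical polynomially decaying sequence; the names `delta` and `mad_treatment_probability` and the exponent 0.25 are illustrative choices that do not come from the abstract.

```python
def delta(t: int, exponent: float = 0.25) -> float:
    """Hypothetical mixing sequence: deterministic and converging to zero in t."""
    return t ** (-exponent)


def mad_treatment_probability(bandit_prob: float, delta_t: float,
                              bernoulli_prob: float = 0.5) -> float:
    """Marginal probability of assigning treatment at one time step.

    With probability delta_t the unit is assigned by the Bernoulli design
    (assumed here to treat with probability bernoulli_prob); otherwise the
    bandit algorithm's own treatment probability bandit_prob is used.
    """
    return delta_t * bernoulli_prob + (1.0 - delta_t) * bandit_prob


if __name__ == "__main__":
    # A bandit that strongly favours treatment is pulled back toward 1/2,
    # and the pull weakens as t grows.
    for t in (1, 16, 10_000):
        print(t, mad_treatment_probability(0.95, delta(t)))
```

A larger `delta_t` keeps assignment closer to the Bernoulli design (better inferential precision), while a smaller one defers to the bandit (lower regret), which is the trade-off the abstract says the sequence controls.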
-
Balancing Risk and Reward: An Automated Phased Release Strategy
Authors:
Yufan Li,
Jialiang Mao,
Iavor Bojinov
Abstract:
Phased releases are a common strategy in the technology industry for gradually releasing new products or updates through a sequence of A/B tests in which the number of treated units gradually grows until full deployment or deprecation. Performing phased releases in a principled way requires selecting the proportion of units assigned to the new release in a way that balances the risk of an adverse effect with the need to iterate and learn from the experiment rapidly. In this paper, we formalize this problem and propose an algorithm that automatically determines the release percentage at each stage in the schedule, balancing the need to control risk while maximizing ramp-up speed. Our framework models the challenge as a constrained batched bandit problem that ensures that our pre-specified experimental budget is not depleted with high probability. Our proposed algorithm leverages an adaptive Bayesian approach in which the maximal number of units assigned to the treatment is determined by the posterior distribution, ensuring that the probability of depleting the remaining budget is low. Notably, our approach analytically solves the ramp sizes by inverting probability bounds, eliminating the need for challenging rare-event Monte Carlo simulation. It only requires computing means and variances of outcome subsets, making it highly efficient and parallelizable.
Submitted 16 May, 2023;
originally announced May 2023.
-
Design-Based Inference for Multi-arm Bandits
Authors:
Dae Woong Ham,
Iavor Bojinov,
Michael Lindon,
Martin Tingley
Abstract:
Multi-arm bandits are gaining popularity as they enable real-world sequential decision-making across application areas, including clinical trials, recommender systems, and online decision-making. Consequently, there is an increased desire to use the available adaptively collected datasets to distinguish whether one arm was more effective than the other, e.g., which product or treatment was more effective. Unfortunately, existing tools fail to provide valid inference when data is collected adaptively or require many untestable and technical assumptions, e.g., stationarity, iid rewards, bounded random variables, etc. Our paper introduces the design-based approach to inference for multi-arm bandits, where we condition on the full set of potential outcomes and perform inference on the obtained sample. Our paper constructs valid confidence intervals for both the reward mean of any arm and the mean reward difference between any arms in an assumption-light manner, allowing the rewards to be arbitrarily distributed, non-iid, and from non-stationary distributions. In addition to confidence intervals, we also provide valid design-based confidence sequences, sequences of confidence intervals that have uniform type-1 error guarantees over time. Confidence sequences allow the agent to perform a hypothesis test as the data arrives sequentially and stop the experiment as soon as the agent is satisfied with the inference, e.g., the mean reward of an arm is statistically significantly higher than a desired threshold.
Submitted 27 February, 2023;
originally announced February 2023.
-
Design-Based Confidence Sequences: A General Approach to Risk Mitigation in Online Experimentation
Authors:
Dae Woong Ham,
Iavor Bojinov,
Michael Lindon,
Martin Tingley
Abstract:
Randomized experiments have become the standard method for companies to evaluate the performance of new products or services. In addition to augmenting managers' decision-making, experimentation mitigates risk by limiting the proportion of customers exposed to innovation. Since many experiments are on customers arriving sequentially, a potential solution is to allow managers to "peek" at the results when new data becomes available and stop the test if the results are statistically significant. Unfortunately, peeking invalidates the statistical guarantees for standard statistical analysis and leads to uncontrolled type-1 error. Our paper provides valid design-based confidence sequences, sequences of confidence intervals with uniform type-1 error guarantees over time for various sequential experiments in an assumption-light manner. In particular, we focus on finite-sample estimands defined on the study participants as a direct measure of the incurred risks by companies. Our proposed confidence sequences are valid for a large class of experiments, including multi-arm bandits, time series, and panel experiments. We further provide a variance reduction technique incorporating modeling assumptions and covariates. Finally, we demonstrate the effectiveness of our proposed approach through a simulation study and three real-world applications from Netflix. Our results show that by using our confidence sequence, harmful experiments could be stopped after only observing a handful of units; for instance, an experiment that Netflix ran on its sign-up page on 30,000 potential customers would have been stopped by our method on the first day before 100 observations.
Submitted 24 May, 2023; v1 submitted 16 October, 2022;
originally announced October 2022.
-
Anytime-Valid Linear Models and Regression Adjusted Causal Inference in Randomized Experiments
Authors:
Michael Lindon,
Dae Woong Ham,
Martin Tingley,
Iavor Bojinov
Abstract:
Linear regression adjustment is commonly used to analyse randomised controlled experiments due to its efficiency and robustness against model misspecification. Current testing and interval estimation procedures leverage the asymptotic distribution of such estimators to provide Type-I error and coverage guarantees that hold only at a single sample size. Here, we develop the theory for the anytime-valid analogues of such procedures, enabling linear regression adjustment in the sequential analysis of randomised experiments. We first provide sequential $F$-tests and confidence sequences for the parametric linear model, which provide time-uniform Type-I error and coverage guarantees that hold for all sample sizes. We then relax all linear model parametric assumptions in randomised designs and provide nonparametric model-free sequential tests and confidence sequences for treatment effects. This formally allows experiments to be continuously monitored for significance and stopped early, and safeguards against statistical malpractices in data collection. A particular feature of our results is their simplicity. Our test statistics and confidence sequences all admit closed-form expressions, which are functions of statistics directly available from a standard linear regression table. We illustrate our methodology with the sequential analysis of software A/B experiments at Netflix, performing regression adjustment with pre-treatment outcomes.
Submitted 7 February, 2024; v1 submitted 16 October, 2022;
originally announced October 2022.
-
Quantifying the Value of Iterative Experimentation
Authors:
Jialiang Mao,
Iavor Bojinov
Abstract:
Over the past decade, most technology companies and a growing number of conventional firms have adopted online experimentation (or A/B testing) into their product development process. Initially, A/B testing was deployed as a static procedure in which an experiment was conducted by randomly splitting half of the users to see the control (the standard offering) and the other half the treatment (the new version). The results were then used to augment decision-making around which version to release widely. More recently, as experimentation has matured, firms have developed a more dynamic approach to experimentation in which a new version (the treatment) is gradually released to a growing number of units through a sequence of randomized experiments, known as iterations. In this paper, we develop a theoretical framework to quantify the value brought on by such dynamic or iterative experimentation. We apply our framework to seven months of LinkedIn experiments and show that iterative experimentation led to an additional 20% improvement in one of the firm's primary metrics.
Submitted 3 November, 2021;
originally announced November 2021.
-
Population Interference in Panel Experiments
Authors:
Kevin Han,
Iavor Bojinov,
Guillaume Basse
Abstract:
The phenomenon of population interference, where a treatment assigned to one experimental unit affects another experimental unit's outcome, has received considerable attention in standard randomized experiments. The complications produced by population interference in this setting are now readily recognized, and partial remedies are well known. Much less understood is the impact of population interference in panel experiments where treatment is sequentially randomized in the population, and the outcomes are observed at each time step. This paper proposes a general framework for studying population interference in panel experiments and presents new finite population estimation and inference results. Our findings suggest that, under mild assumptions, the addition of a temporal dimension to an experiment alleviates some of the challenges of population interference for certain estimands. In contrast, we show that the presence of carryover effects -- that is, when past treatments may affect future outcomes -- exacerbates the problem. Revisiting the special case of standard experiments with population interference, we prove a central limit theorem under weaker conditions than previous results in the literature and highlight the trade-off between flexibility in the design and the interference structure.
Submitted 6 June, 2023; v1 submitted 28 February, 2021;
originally announced March 2021.
-
Design and Analysis of Switchback Experiments
Authors:
Iavor Bojinov,
David Simchi-Levi,
Jinglong Zhao
Abstract:
Switchback experiments, where a firm sequentially exposes an experimental unit to random treatments, are among the most prevalent designs used in the technology sector, with applications ranging from ride-hailing platforms to online marketplaces. Although practitioners have widely adopted this technique, the derivation of the optimal design has been elusive, hindering practitioners from drawing valid causal conclusions with enough statistical power. We address this limitation by deriving the optimal design of switchback experiments under a range of different assumptions on the order of the carryover effect -- the length of time a treatment persists in impacting the outcome. We cast the optimal experimental design problem as a minimax discrete optimization problem, identify the worst-case adversarial strategy, establish structural results, and solve the reduced problem via a continuous relaxation. For switchback experiments conducted under the optimal design, we provide two approaches for performing inference. The first provides exact randomization based p-values, and the second uses a new finite population central limit theorem to conduct conservative hypothesis tests and build confidence intervals. We further provide theoretical results when the order of the carryover effect is misspecified and provide a data-driven procedure to identify the order of the carryover effect. We conduct extensive simulations to study the numerical performance and empirical properties of our results, and conclude with practical suggestions.
Submitted 1 April, 2022; v1 submitted 31 August, 2020;
originally announced September 2020.
-
Estimating the effectiveness of permanent price reductions for competing products using multivariate Bayesian structural time series models
Authors:
Fiammetta Menchetti,
Iavor Bojinov
Abstract:
The Florence branch of an Italian supermarket chain recently implemented a strategy that permanently lowered the price of numerous store brands in several product categories. To quantify the impact of such a policy change, researchers often use synthetic control methods for estimating causal effects when a subset of units receive a single persistent treatment, and the rest are unaffected by the change. In our application, however, competitor brands not assigned to treatment are likely impacted by the intervention because of substitution effects; more broadly, this type of interference occurs whenever the treatment assignment of one unit affects the outcome of another. This paper extends the synthetic control methods to accommodate partial interference, allowing interference within predefined groups but not between them. Focusing on a class of causal estimands that capture the effect both on the treated and control units, we develop a multivariate Bayesian structural time series model for generating synthetic controls that would have occurred in the absence of an intervention, enabling us to estimate our novel effects. In a simulation study, we explore our Bayesian procedure's empirical properties and show that it achieves good frequentist coverage even when the model is misspecified. We use our new methodology to make causal statements about the impact on sales of the affected store brands and their direct competitors. Our proposed approach is implemented in the CausalMBSTS R package.
Submitted 22 February, 2021; v1 submitted 22 June, 2020;
originally announced June 2020.
-
Panel Experiments and Dynamic Causal Effects: A Finite Population Perspective
Authors:
Iavor Bojinov,
Ashesh Rambachan,
Neil Shephard
Abstract:
In panel experiments, we randomly assign units to different interventions, measure their outcomes, and repeat the procedure in several periods. Using the potential outcomes framework, we define finite population dynamic causal effects that capture the relative effectiveness of alternative treatment paths. For a rich class of dynamic causal effects, we provide a nonparametric estimator that is unbiased over the randomization distribution and derive its finite population limiting distribution as either the sample size or the duration of the experiment increases. We develop two methods for inference: a conservative test for weak null hypotheses and an exact randomization test for sharp null hypotheses. We further analyze the finite population probability limit of linear fixed effects estimators. These commonly-used estimators do not recover a causally interpretable estimand if there are dynamic causal effects and serial correlation in the assignments, highlighting the value of our proposed estimator.
Submitted 27 May, 2021; v1 submitted 22 March, 2020;
originally announced March 2020.
-
A general theory of identification
Authors:
Guillaume Basse,
Iavor Bojinov
Abstract:
What does it mean to say that a quantity is identifiable from the data? Statisticians seem to agree on a definition in the context of parametric statistical models --- roughly, a parameter $θ$ in a model $\mathcal{P} = \{P_θ: θ\in Θ\}$ is identifiable if the mapping $θ\mapsto P_θ$ is injective. This definition raises important questions: Are parameters the only quantities that can be identified? Is the concept of identification meaningful outside of parametric statistics? Does it even require the notion of a statistical model? Partial and idiosyncratic answers to these questions have been discussed in econometrics, biological modeling, and in some subfields of statistics like causal inference. This paper proposes a unifying theory of identification that incorporates existing definitions for parametric and nonparametric models and formalizes the process of identification analysis. The applicability of this framework is illustrated through a series of examples and two extended case studies.
Submitted 14 February, 2020;
originally announced February 2020.
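As a worked illustration of the injectivity definition quoted in this abstract, consider two standard textbook models; they are chosen here for illustration and are not taken from the paper. In the first, distinct parameter values give distinct distributions, so the parameter is identifiable; in the second, two different parameter values give the same distribution, so it is not.

$$
\mathcal{P}_1 = \{N(\mu, 1) : \mu \in \mathbb{R}\}: \quad \mu \neq \mu' \implies N(\mu, 1) \neq N(\mu', 1), \text{ so } \mu \mapsto P_\mu \text{ is injective.}
$$

$$
\mathcal{P}_2 = \{N(a + b, 1) : (a, b) \in \mathbb{R}^2\}: \quad (a, b) = (0, 1) \text{ and } (a, b) = (1, 0) \text{ both give } N(1, 1), \text{ so } (a, b) \mapsto P_{(a, b)} \text{ is not injective.}
$$

In the second model the sum $a + b$ is still determined by the distribution even though the pair $(a, b)$ is not.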
-
Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery
Authors:
Jiachen Yang,
Igor Borovikov,
Hongyuan Zha
Abstract:
Human players in professional team sports achieve high-level coordination by dynamically choosing complementary skills and executing primitive actions to perform these skills. As a step toward creating intelligent agents with this capability for fully cooperative multi-agent settings, we propose a two-level hierarchical multi-agent reinforcement learning (MARL) algorithm with unsupervised skill discovery. Agents learn useful and distinct skills at the low level via independent Q-learning, while they learn to select complementary latent skill variables at the high level via centralized multi-agent training with an extrinsic team reward. The set of low-level skills emerges from an intrinsic reward that solely promotes the decodability of latent skill variables from the trajectory of a low-level skill, without the need for hand-crafted rewards for each skill. For scalable decentralized execution, each agent independently chooses latent skill variables and primitive actions based on local observations. Our overall method enables the use of general cooperative MARL algorithms for training high-level policies and single-agent RL for training low-level skills. Experiments on a stochastic high-dimensional team game show the emergence of useful skills and cooperative team play. The interpretability of the learned skills shows the promise of the proposed method for achieving human-AI cooperation in team sports games.
Submitted 7 May, 2020; v1 submitted 7 December, 2019;
originally announced December 2019.
-
Neural Network-based Object Classification by Known and Unknown Features (Based on Text Queries)
Authors:
A. Artemov,
I. Bolokhov,
D. Kem,
I. Khasenevich
Abstract:
The article presents a method that improves the quality of classification of objects described by a combination of known and unknown features. The method is based on a modernized Informational Neurobayesian Approach that takes unknown features into account. The proposed method was developed and trained on 1500 text queries of Promobot users in Russian to classify them into 20 categories (classes). As a result, the method completely solved the problem of misclassification for queries combining known and unknown features of the model. The theoretical substantiation of the method is given by a theorem, formulated and proved in the article, On the Model with Limited Knowledge. It states that, in conditions of limited data, an equal number of equally unknown features of an object cannot have different significance for the classification problem.
Submitted 3 June, 2019;
originally announced June 2019.
-
Causal inference from observational data: Estimating the effect of contributions on visitation frequency at LinkedIn
Authors:
Iavor Bojinov,
Ye Tu,
Min Liu,
Ya Xu
Abstract:
Randomized experiments (A/B tests) have become the standard way for web-facing companies to guide innovation, evaluate new products, and prioritize ideas. There are times, however, when running an experiment is too complicated (e.g., we have not built the infrastructure), costly (e.g., the intervention will have a substantial negative impact on revenue), or time-consuming (e.g., the effect may take months to materialize). Even if we can run an experiment, knowing the magnitude of the impact will significantly accelerate the product development life cycle by helping us prioritize tests and determine the appropriate traffic allocation for different treatment groups. In this setting, we should leverage observational data to quickly and cost-efficiently obtain a reliable estimate of the causal effect. Although causal inference from observational data has a long history, its adoption by data scientists in technology companies has been slow. In this paper, we rectify this by providing a brief introduction to the vast field of causal inference with a specific focus on the tools and techniques that data scientists can directly leverage. We illustrate how to apply some of these methodologies to measure the effect of contributions (e.g., post, comment, like or send private messages) on engagement metrics. Evaluating the impact of contributions on engagement through an A/B test requires encouragement design and the development of non-standard experimentation infrastructure, which can consume a tremendous amount of time and financial resources. We present multiple efficient strategies that exploit historical data to accurately estimate the contemporaneous (or instantaneous) causal effect of a user's contribution on her own and her neighbors' (i.e., the users she is connected to) subsequent visitation frequency. We apply these tools to LinkedIn data for several million members.
Submitted 18 March, 2019;
originally announced March 2019.
-
Diagnosing missing always at random in multivariate data
Authors:
Iavor Bojinov,
Natesh Pillai,
Donald Rubin
Abstract:
Models for analyzing multivariate data sets with missing values require strong, often unassessable, assumptions. The most common of these is that the mechanism that created the missing data is ignorable - a twofold assumption dependent on the mode of inference. The first part, which is the focus here, under the Bayesian and direct-likelihood paradigms, requires that the missing data are missing at random; in contrast, the frequentist-likelihood paradigm demands that the missing data mechanism always produces missing at random data, a condition known as missing always at random. Under certain regularity conditions, assuming missing always at random leads to an assumption that can be tested using the observed data alone: namely, the missing data indicators only depend on fully observed variables. Here, we propose three different diagnostic tests that not only indicate when this assumption is incorrect but also suggest which variables are the most likely culprits. Although missing always at random is not a necessary condition to ensure validity under the Bayesian and direct-likelihood paradigms, it is sufficient, and evidence for its violation should encourage the careful statistician to conduct targeted sensitivity analyses.
Submitted 2 April, 2018; v1 submitted 18 October, 2017;
originally announced October 2017.
-
Time series experiments and causal estimands: exact randomization tests and trading
Authors:
Iavor Bojinov,
Neil Shephard
Abstract:
We define causal estimands for experiments on single time series, extending the potential outcome framework to deal with temporal data. Our approach allows the estimation of some of these estimands and exact randomization-based p-values for testing causal effects, without imposing stringent assumptions. We test our methodology on simulated "potential autoregressions," which have a causal interpretation. Our methodology is partially inspired by data from a large number of experiments carried out by a financial company that compared the impact of two different ways of trading equity futures contracts. We use our methodology to make causal statements about their trading methods.
Submitted 18 July, 2017; v1 submitted 23 June, 2017;
originally announced June 2017.
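The idea of an exact randomization-based p-value mentioned in this abstract can be sketched for a toy case. This is a generic illustration under simplifying assumptions, not the paper's procedure: it assumes independent Bernoulli(1/2) assignments at each time step, a sharp null hypothesis of no effect at any time step (so the observed outcomes stay fixed under every alternative assignment path), and no carryover. The function names and the choice of test statistic are hypothetical.

```python
from itertools import product


def signed_mean(outcomes, assignment):
    """Horvitz-Thompson style treated-minus-control contrast for
    Bernoulli(1/2) assignments: (2 / n) * sum_t y_t * (2 * w_t - 1)."""
    n = len(outcomes)
    return 2.0 * sum(y * (2 * w - 1) for y, w in zip(outcomes, assignment)) / n


def exact_randomization_p_value(outcomes, assignment):
    """Two-sided p-value obtained by enumerating all 2^n assignment paths.

    Under the sharp null the outcomes do not depend on the assignment, so the
    statistic can be recomputed exactly for every path; with Bernoulli(1/2)
    assignments every path is equally likely.
    """
    observed = abs(signed_mean(outcomes, assignment))
    stats = [abs(signed_mean(outcomes, w))
             for w in product((0, 1), repeat=len(outcomes))]
    # Small tolerance guards against floating-point ties.
    extreme = sum(1 for s in stats if s >= observed - 1e-12)
    return extreme / len(stats)


if __name__ == "__main__":
    print(exact_randomization_p_value([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1]))
```

Full enumeration is only feasible for short series; for longer ones the same p-value is usually approximated by sampling assignment paths from the design.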
-
Multiple Imputation Using Gaussian Copulas
Authors:
Florian M. Hollenbach,
Iavor Bojinov,
Shahryar Minhas,
Nils W. Metternich,
Michael D. Ward,
Alexander Volfovsky
Abstract:
Missing observations are pervasive throughout empirical research, especially in the social sciences. Despite multiple approaches to dealing adequately with missing data, many scholars still fail to address this vital issue. In this paper, we present a simple-to-use method for generating multiple imputations using a Gaussian copula. The Gaussian copula for multiple imputation (Hoff, 2007) allows scholars to attain estimation results that have good coverage and small bias. The use of copulas to model the dependence among variables will enable researchers to construct valid joint distributions of the data, even without knowledge of the actual underlying marginal distributions. Multiple imputations are then generated by drawing observations from the resulting posterior joint distribution and replacing the missing values. Using simulated and observational data from published social science research, we compare imputation via Gaussian copulas with two other widely used imputation methods: MICE and Amelia II. Our results suggest that the Gaussian copula approach has a slightly smaller bias, higher coverage rates, and narrower confidence intervals compared to the other methods. This is especially true when the variables with missing data are not normally distributed. These results, combined with theoretical guarantees and ease-of-use suggest that the approach examined provides an attractive alternative for applied researchers undertaking multiple imputations.
Submitted 4 October, 2018; v1 submitted 3 November, 2014;
originally announced November 2014.