Search | arXiv e-print repository

arXiv:2402.12400 [pdf, other]

Estimating the age-conditioned average treatment effects curves: An application for assessing load-management strategies in the NBA

Authors: Shinpei Nakamura-Sakai, Laura Forastiere, Brian Macdonald

Abstract: In the realm of competitive sports, understanding the performance dynamics of athletes, represented by the age curve (showing progression, peak, and decline), is vital. Our research introduces a novel framework for quantifying age-specific treatment effects, enhancing the granularity of performance trajectory analysis. Firstly, we propose a methodology for estimating the age curve using game-level data, diverging from traditional season-level data approaches, and tackling its inherent complexities with a meta-learner framework that leverages advanced machine learning models. This approach uncovers intricate non-linear patterns missed by existing methods. Secondly, our framework enables the identification of causal effects, allowing for a detailed examination of age curves under various conditions. By defining the Age-Conditioned Treatment Effect (ACTE), we facilitate the exploration of causal relationships regarding treatment impacts at specific ages. Finally, applying this methodology to study the effects of rest days on performance metrics, particularly across different ages, offers valuable insights into load management strategies' effectiveness. Our findings underscore the importance of tailored rest periods, highlighting their positive impact on athlete performance and suggesting a reevaluation of current management practices for optimizing athlete performance.

Submitted 17 February, 2024; originally announced February 2024.

arXiv:2310.02151 [pdf, ps, other]

Estimation and inference for causal spillover effects in egocentric-network randomized trials in the presence of network membership misclassification

Authors: Ariel Chao, Donna Spiegelman, Ashley Buchanan, Laura Forastiere

Abstract: To leverage peer influence and increase population behavioral changes, behavioral interventions often rely on peer-based strategies. A common study design that assesses such strategies is the egocentric-network randomized trial (ENRT), in which those receiving the intervention are encouraged to disseminate information to their peers. The Average Spillover Effect (ASpE) measures the impact of the intervention on participants who do not receive it, but whose outcomes may be affected by others who do. The assessment of the ASpE relies on assumptions about, and correct measurement of, interference sets within which individuals may influence one another's outcomes. It can be challenging to properly specify interference sets, such as networks in ENRTs, and when mismeasured, intervention effects estimated by existing methods will be biased. In HIV prevention studies where social networks play an important role in disease transmission, correcting ASpE estimates for bias due to network misclassification is critical for accurately evaluating the full impact of interventions. We combined measurement error and causal inference methods to bias-correct the ASpE estimate for network misclassification in ENRTs, when surrogate networks are recorded in place of true ones, and validation data that relate the misclassified to the true networks are available. We investigated finite sample properties of our methods in an extensive simulation study, and illustrated our methods in the HIV Prevention Trials Network (HPTN) 037 study.

Submitted 3 October, 2023; originally announced October 2023.

arXiv:2308.03443 [pdf, other]

Doubly Robust Estimator for Off-Policy Evaluation with Large Action Spaces

Authors: Tatsuhiro Shimizu, Laura Forastiere

Abstract: We study Off-Policy Evaluation (OPE) in contextual bandit settings with large action spaces. The benchmark estimators suffer from severe bias and variance tradeoffs. Parametric approaches suffer from bias due to the difficulty of specifying the correct model, whereas those based on importance weighting suffer from high variance. To overcome these limitations, Marginalized Inverse Propensity Scoring (MIPS) was proposed to reduce the estimator's variance via embeddings of an action. Nevertheless, MIPS is unbiased only under the no direct effect assumption, which requires that the action embedding completely mediate the effect of an action on the reward. To relax this unrealistic assumption, we propose a Marginalized Doubly Robust (MDR) estimator. Theoretical analysis shows that the proposed estimator is unbiased under weaker assumptions than MIPS while reducing the variance relative to MIPS. Empirical experiments verify the superiority of MDR over existing estimators in large action spaces.

Submitted 14 December, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

Comments: 14 pages, 8 figures
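
For orientation, the two estimator families contrasted in this abstract can be sketched in a few lines. The code below is a minimal illustration of plain inverse propensity scoring (IPS) and the standard doubly robust (DR) estimator for a contextual bandit. It is not the MDR estimator proposed in the paper, and it does not use action embeddings. All function names, argument names, and the toy data are hypothetical.

```python
# Minimal sketch (assumption: standard IPS and DR, not the paper's MDR estimator).

def ips_estimate(rewards, target_probs, logging_probs):
    """Reweight each logged reward by pi_target(a|x) / pi_logging(a|x)."""
    weights = [pe / pb for pe, pb in zip(target_probs, logging_probs)]
    return sum(w * r for w, r in zip(weights, rewards)) / len(rewards)


def dr_estimate(rewards, actions, target_policy, logging_probs, q_hat):
    """Model-based baseline plus an importance-weighted residual correction.

    target_policy[i][a] is the target policy's probability of action a in
    round i; q_hat[i][a] is an estimated expected reward (hypothetical model).
    """
    total = 0.0
    for r, a, pi_e, pb, q in zip(rewards, actions, target_policy,
                                 logging_probs, q_hat):
        baseline = sum(p * qa for p, qa in zip(pi_e, q))  # model prediction
        weight = pi_e[a] / pb                              # importance weight
        total += baseline + weight * (r - q[a])            # residual correction
    return total / len(rewards)


# Toy logged data: two actions, uniform logging policy, and a target policy
# that always plays action 0. The reward model is exact in this example.
rewards = [1.0, 0.0, 1.0, 0.0]
actions = [0, 1, 0, 1]
logging_probs = [0.5, 0.5, 0.5, 0.5]
target_policy = [[1.0, 0.0] for _ in rewards]
q_hat = [[1.0, 0.0] for _ in rewards]

target_probs = [target_policy[i][a] for i, a in enumerate(actions)]
ips_value = ips_estimate(rewards, target_probs, logging_probs)
dr_value = dr_estimate(rewards, actions, target_policy, logging_probs, q_hat)
```

When the reward model is accurate, the residual term in the DR estimator is close to zero, so the importance weights contribute little variance. That is the general motivation for doubly robust corrections.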

arXiv:2308.00791 [pdf, other]

Design of egocentric network-based studies to estimate causal effects under interference

Authors: Junhan Fang, Donna Spiegelman, Ashley Buchanan, Laura Forastiere

Abstract: Many public health interventions are conducted in settings where individuals are connected to one another and the intervention assigned to randomly selected individuals may spill over to other individuals they are connected to. In these spillover settings, the effects of such interventions can be quantified in several ways. The average individual effect measures the intervention effect among those directly treated, while the spillover effect measures the effect among those connected to those directly treated. In addition, the overall effect measures the average intervention effect across the study population, over those directly treated along with those to whom the intervention spills over but who are not directly treated. Here, we develop methods for study design with the aim of estimating individual, spillover, and overall effects. In particular, we consider an egocentric network-based randomized design in which a set of index participants is recruited from the population and randomly assigned to treatment, while data are also collected from their untreated network members. We use the potential outcomes framework to define two clustered regression modeling approaches and clarify the underlying assumptions required to identify and estimate causal effects. We then develop sample size formulas for detecting individual, spillover, and overall effects. We investigate the roles of the intra-class correlation coefficient and the probability of treatment allocation on the required number of egocentric networks with a fixed number of network members for each egocentric network and vice-versa.

Submitted 1 August, 2023; originally announced August 2023.

Comments: 30 pages for main text including figures and tables, 5 figures and 3 tables

arXiv:2304.09361 [pdf, other]

Evaluating Spillover Effects in Network-Based Studies In the Presence of Missing Outcomes

Authors: TingFang Lee, Ashley L. Buchanan, Natallia Katenka, Laura Forastiere, M. Elizabeth Halloran, Georgios Nikolopoulos

Abstract: Estimating causal effects in the presence of spillover among individuals embedded within a social network is often challenging with missing information. The spillover effect is the effect of an intervention if a participant is not exposed to the intervention themselves but is connected to intervention recipients in the network. In network-based studies, outcomes may be missing due to the administrative end of a study or participants being lost to follow-up due to study dropout, also known as censoring. We propose an inverse probability censoring weighted (IPCW) estimator, which is an extension of an IPW estimator for network-based observational studies to settings where the outcome is subject to possible censoring. We demonstrated that the proposed estimator was consistent and asymptotically normal. We also derived a closed-form estimator of the asymptotic variance. We used the IPCW estimator to quantify the spillover effects in a network-based study of a nonrandomized intervention with censoring of the outcome. A simulation study was conducted to evaluate the finite-sample performance of the IPCW estimators. The simulation study demonstrated that the estimator performed well in finite samples when the sample size and number of connected subnetworks (components) were fairly large. We then employed the method to evaluate the spillover effects of community alerts on self-reported HIV risk behavior among people who inject drugs and their contacts in the Transmission Reduction Intervention Project (TRIP), 2013 to 2015, Athens, Greece. Community alerts were protective not only for the person who received the alert from the study but also among others in the network, likely through information shared between participants. In this study, we found that the risk of HIV behavior was reduced by increasing the proportion of a participant's immediate contacts exposed to community alerts.

Submitted 18 April, 2023; originally announced April 2023.

arXiv:2304.00231 [pdf]

Using Overlap Weights to Address Extreme Propensity Scores in Estimating Restricted Mean Counterfactual Survival Times

Authors: Zhiqiang Cao, Lama Ghazi, Claudia Mastrogiacomo, Laura Forastiere, F. Perry Wilson, Fan Li

Abstract: While the inverse probability of treatment weighting (IPTW) is a commonly used approach for treatment comparisons in observational data, the resulting estimates may be subject to bias and excessively large variance when there is a lack of overlap in the propensity score distributions. By smoothly down-weighting the units with extreme propensity scores, overlap weighting (OW) can help mitigate the bias and variance issues associated with IPTW. Although theoretical and simulation results have supported the use of OW with continuous and binary outcomes, its performance with right-censored survival outcomes remains to be further investigated, especially when the target estimand is defined based on the restricted mean survival time (RMST), a clinically meaningful summary measure free of the proportional hazards assumption. In this article, we combine propensity score weighting and inverse probability of censoring weighting to estimate the restricted mean counterfactual survival times, and propose computationally efficient variance estimators. We conduct simulations to compare the performance of IPTW, trimming, and OW in terms of bias, variance, and 95% confidence interval coverage, under various degrees of covariate overlap. Regardless of overlap, we demonstrate the advantage of OW over IPTW and trimming methods in bias, variance, and coverage when the estimand is defined based on RMST.

Submitted 10 February, 2024; v1 submitted 1 April, 2023; originally announced April 2023.
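
The contrast between IPTW and overlap weights described in this abstract can be illustrated with a short sketch. It assumes the usual definitions: IPTW weights of 1/e for treated units and 1/(1 - e) for controls, and overlap weights of 1 - e for treated units and e for controls, where e is the propensity score. It deliberately omits the censoring weights and the RMST estimand that the paper addresses. All names and the toy values are hypothetical.

```python
# Minimal sketch (assumption: uncensored outcomes, known propensity scores).

def iptw_weight(treated, propensity):
    """Inverse probability of treatment weight; unbounded as e nears 0 or 1."""
    return 1.0 / propensity if treated else 1.0 / (1.0 - propensity)


def overlap_weight(treated, propensity):
    """Overlap weight; always bounded between 0 and 1."""
    return 1.0 - propensity if treated else propensity


def weighted_mean_difference(outcomes, treatments, weights):
    """Difference in normalized weighted outcome means, treated minus control."""
    def arm_mean(arm):
        num = sum(w * y for y, t, w in zip(outcomes, treatments, weights)
                  if t == arm)
        den = sum(w for t, w in zip(treatments, weights) if t == arm)
        return num / den
    return arm_mean(1) - arm_mean(0)


# Toy data with two extreme propensity scores (0.99 and 0.01).
propensities = [0.99, 0.60, 0.40, 0.01]
treatments = [0, 1, 0, 1]
outcomes = [2.0, 5.0, 1.0, 3.0]

iptw = [iptw_weight(t == 1, e) for t, e in zip(treatments, propensities)]
ow = [overlap_weight(t == 1, e) for t, e in zip(treatments, propensities)]

iptw_diff = weighted_mean_difference(outcomes, treatments, iptw)
ow_diff = weighted_mean_difference(outcomes, treatments, ow)
```

In this toy example the two units with extreme propensity scores receive IPTW weights near 100, so they dominate their arms, while no overlap weight exceeds 1.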

arXiv:2211.09099 [pdf, other]

Selecting Subpopulations for Causal Inference in Regression Discontinuity Designs

Authors: Laura Forastiere, Alessandra Mattei, Julia M. Pescarini, Mauricio L. Barreto, Fabrizia Mealli

Abstract: The Brazil Bolsa Familia (BF) program is a conditional cash transfer program that aims to reduce short-term poverty by direct cash transfers and to fight long-term poverty by increasing human capital among poor Brazilian people. Eligibility for Bolsa Familia benefits depends on a cutoff rule, which classifies the BF study as a regression discontinuity (RD) design. Extracting causal information from RD studies is challenging. Following Li et al (2015) and Branson and Mealli (2019), we formally describe the BF RD design as a local randomized experiment within the potential outcome approach. Under this framework, causal effects can be identified and estimated on a subpopulation where a local overlap assumption, a local SUTVA and a local ignorability assumption hold. We first discuss the potential advantages of this framework over local regression methods based on continuity assumptions, which concern the definition of the causal estimands, the design and the analysis of the study, and the interpretation and generalizability of the results. A critical issue of this local randomization approach is how to choose subpopulations for which we can draw valid causal inference. We propose a Bayesian model-based finite mixture approach to clustering to classify observations into subpopulations where the RD assumptions hold and do not hold. This approach has important advantages: a) it accounts for the uncertainty in the subpopulation membership, which is typically neglected; b) it does not impose any constraint on the shape of the subpopulation; c) it is scalable to high-dimensional settings; d) it allows targeting causal estimands other than the average treatment effect (ATE); and e) it is robust to a certain degree of manipulation/selection of the running variable. We apply our proposed approach to assess causal effects of the Bolsa Familia program on leprosy incidence in 2009.

Submitted 11 October, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

arXiv:2109.07502 [pdf, other]

Causal Effects with Hidden Treatment Diffusion on Observed or Partially Observed Networks

Authors: Costanza Tortù, Irene Crimaldi, Fabrizia Mealli, Laura Forastiere

Abstract: In randomized experiments, interactions between units might generate a treatment diffusion process. This is common when the treatment of interest is an actual object or product that can be shared among peers (e.g., flyers, booklets, videos). For instance, if the intervention of interest is an information campaign realized through the distribution of a video to targeted individuals, some of these treated individuals might share the video they received with their friends. Such a phenomenon is usually unobserved, causing a misallocation of individuals in the two treatment arms: some of the initially untreated units might have actually received the treatment by diffusion. Treatment misclassification can, in turn, introduce a bias in the estimation of the causal effect. Inspired by a recent field experiment on the effect of different types of school incentives aimed at encouraging students to attend cultural events, we present a novel approach to deal with a hidden diffusion process on observed or partially observed networks. Specifically, we develop a simulation-based sensitivity analysis that assesses the robustness of the estimates against the possible presence of a treatment diffusion. We simulate several diffusion scenarios within a plausible range of sensitivity parameters and we compare the treatment effect which is estimated in each scenario with the one that is obtained while ignoring the diffusion process. Results suggest that even a treatment diffusion parameter of small size may lead to a significant bias in the estimation of the treatment effect.

Submitted 15 September, 2021; originally announced September 2021.

arXiv:2108.04865 [pdf, ps, other]

Estimating Causal Effects of HIV Prevention Interventions with Interference in Network-based Studies among People Who Inject Drugs

Authors: TingFang Lee, Ashley L. Buchanan, Natallia V. Katenka, Laura Forastiere, M. Elizabeth Halloran, Samuel R. Friedman, Georgios Nikolopoulos

Abstract: Evaluating causal effects in the presence of interference is challenging in network-based studies of hard-to-reach populations. Like many such populations, people who inject drugs (PWID) are embedded in social networks and often exert influence on others in their network. In our setting, the study design is observational with a non-randomized network-based HIV prevention intervention. Information is available on each participant and their connections that confer possible HIV risk through injection and sexual behaviors. We considered two inverse probability weighted (IPW) estimators to quantify the population-level effects of non-randomized interventions on subsequent health outcomes. We demonstrated that these two IPW estimators are consistent and asymptotically normal, and derived a closed-form estimator for the asymptotic variance, while allowing for overlapping interference sets (groups of individuals in which the interference is assumed possible). A simulation study was conducted to evaluate the finite-sample performance of the estimators. We analyzed data from the Transmission Reduction Intervention Project, which ascertained a network of PWID and their contacts in Athens, Greece, from 2013 to 2015. We evaluated the effects of community alerts on HIV risk behavior in this observed network, where the links between participants were defined by using substances or having unprotected sex together. In the study, community alerts were distributed to inform people of recent HIV infections among individuals in close proximity in the observed network. The estimates of the risk differences for spillover using either IPW estimator demonstrated a protective effect. The results suggest that HIV risk behavior can be mitigated by exposure to a community alert when an increased risk of HIV is detected in the network.

Submitted 14 November, 2022; v1 submitted 10 August, 2021; originally announced August 2021.

Comments: 22 pages, 7 figures

arXiv:2012.04831 [pdf, other]

Bipartite Interference and Air Pollution Transport: Estimating Health Effects of Power Plant Interventions

Authors: Corwin Zigler, Vera Liu, Fabrizia Mealli, Laura Forastiere

Abstract: Evaluating air quality interventions is confronted with the challenge of interference since interventions at a particular pollution source likely impact air quality and health at distant locations and air quality and health at any given location are likely impacted by interventions at many sources. The structure of interference in this context is dictated by complex atmospheric processes governing how pollution emitted from a particular source is transformed and transported across space, and can be cast with a bipartite structure reflecting the two distinct types of units: 1) interventional units on which treatments are applied or withheld to change pollution emissions; and 2) outcome units on which outcomes of primary interest are measured. We propose new estimands for bipartite causal inference with interference that construe two components of treatment: a "key-associated" (or "individual") treatment and an "upwind" (or "neighborhood") treatment. Estimation is carried out using a semi-parametric adjustment approach based on joint propensity scores. A reduced-complexity atmospheric model is deployed to characterize the structure of the interference network by modeling the movement of air parcels through time and space. The new methods are deployed to evaluate the effectiveness of installing flue-gas desulfurization scrubbers on 472 coal-burning power plants (the interventional units) in reducing Medicare hospitalizations among 21,577,552 Medicare beneficiaries residing across 25,553 ZIP codes in the United States (the outcome units).

Submitted 2 January, 2023; v1 submitted 8 December, 2020; originally announced December 2020.

arXiv:2010.02896 [pdf]

Mass Gatherings for Political Expression Had No Discernable Association with the Local Course of the COVID-19 Pandemic in the USA in 2020 and 2021

Authors: Eric M. Feltham, Laura Forastiere, Marcus Alexander, Nicholas A. Christakis

Abstract: Epidemic disease can spread during mass gatherings. We assessed the impact on the local-area trajectory of the COVID-19 epidemic of a type of mass gathering about which comprehensive data were available. Here, we examined five types of political events in 2020 and 2021: the US primary elections; the US Senate special election in Georgia; the gubernatorial elections in New Jersey and Virginia; Donald Trump's political rallies; and the Black Lives Matter protests. Our study period encompassed over 700 such mass gatherings during multiple phases of the pandemic. We used data from the 48 contiguous states, representing 3,119 counties, and we implemented a novel extension of a recently developed non-parametric, generalized difference-in-difference estimator with a (high-quality) matching procedure for panel data to estimate the average effect of the gatherings on local mortality and other outcomes. There were no statistically significant increases in cases, deaths, or a measure of epidemic transmissibility (Rt) in a 40-day period following large-scale political activities. We estimated small and statistically insignificant effects, corresponding to an average difference of -0.0567 deaths (95% CI = -0.319, 0.162), and 8.275 cases (95% CI = -1.383, 20.7), on each day, for counties that held mass gatherings for political expression compared to matched control counties. In sum, there is no statistical evidence of a material increase in local COVID-19 deaths, cases, or transmissibility after mass gatherings for political expression during the first two years of the pandemic in the USA. This may relate to the specific manner in which such activities are typically conducted.

Submitted 12 June, 2023; v1 submitted 6 October, 2020; originally announced October 2020.

arXiv:2008.00707 [pdf, other]

Heterogeneous Treatment and Spillover Effects under Clustered Network Interference

Authors: Falco J. Bargagli-Stoffi, Costanza Tortù, Laura Forastiere

Abstract: The bulk of causal inference studies rule out the presence of interference between units. However, in many real-world scenarios, units are interconnected by social, physical, or virtual ties, and the effect of the treatment can spill from one unit to other connected individuals in the network. In this paper, we develop a machine learning method that uses tree-based algorithms and a Horvitz-Thompson estimator to assess the heterogeneity of treatment and spillover effects with respect to individual, neighborhood, and network characteristics in the context of clustered networks and neighborhood interference within clusters. The proposed Network Causal Tree (NCT) algorithm has several advantages. First, it allows the investigation of the treatment effect heterogeneity, avoiding potential bias due to the presence of interference. Second, understanding the heterogeneity of both treatment and spillover effects can guide policy-makers in scaling up interventions, designing targeting strategies, and increasing cost-effectiveness. We investigate the performance of our NCT method using a Monte Carlo simulation study, and we illustrate its application to assess the heterogeneous effects of information sessions on the uptake of a new weather insurance policy in rural China.

Submitted 2 November, 2023; v1 submitted 3 August, 2020; originally announced August 2020.

arXiv:2004.13459 [pdf, other]

Causal Inference on Networks under Continuous Treatment Interference

Authors: Laura Forastiere, Davide Del Prete, Valerio Leone Sciabolazza

Abstract: This paper investigates the case of interference, when a unit's treatment also affects other units' outcomes. When interference is at work, policy evaluation mostly relies on the use of randomized experiments under cluster interference and binary treatment. Instead, we consider a non-experimental setting under continuous treatment and network interference. In particular, we define spillover effects by specifying the exposure to network treatment as a weighted average of the treatment received by units connected through physical, social or economic interactions. We provide a generalized propensity score-based estimator to estimate both direct and spillover effects of a continuous treatment. Our estimator also accommodates asymmetric network connections characterized by heterogeneous intensities. To showcase this methodology, we investigate whether and how spillover effects shape the optimal level of policy interventions in agricultural markets. Our results show that, in this context, neglecting interference may underestimate the degree of policy effectiveness.

Submitted 12 June, 2023; v1 submitted 28 April, 2020; originally announced April 2020.
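
The exposure mapping described in this abstract, a weighted average of the continuous treatment received by connected units, can be sketched as follows. This is only an illustration of that one definition, not the paper's generalized propensity score estimator. The function name, the row-wise weight convention, and the rule that isolated units receive zero exposure are assumptions made for the example.

```python
# Minimal sketch (assumptions: weights[i][j] is the intensity of the tie
# through which unit j influences unit i; isolated units get exposure 0.0).

def network_exposure(weights, treatment):
    """Weighted average of neighbors' continuous treatment for each unit."""
    exposures = []
    for i, row in enumerate(weights):
        # Exclude the unit's own entry so exposure reflects neighbors only.
        total = sum(w for j, w in enumerate(row) if j != i)
        if total == 0:
            exposures.append(0.0)
        else:
            weighted = sum(w * treatment[j] for j, w in enumerate(row)
                           if j != i)
            exposures.append(weighted / total)
    return exposures


# Asymmetric weighted network on three units with a continuous treatment.
weights = [
    [0.0, 1.0, 3.0],  # unit 0 is influenced by units 1 and 2
    [2.0, 0.0, 0.0],  # unit 1 is influenced by unit 0 only
    [0.0, 0.0, 0.0],  # unit 2 has no incoming ties
]
treatment = [10.0, 20.0, 40.0]

exposure = network_exposure(weights, treatment)
```

Because the weight matrix need not be symmetric, unit 2 influences unit 0 here without being influenced in return, which mirrors the asymmetric connections mentioned in the abstract.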

arXiv:2003.10525 [pdf, other]

Modelling Network Interference with Multi-valued Treatments: the Causal Effect of Immigration Policy on Crime Rates

Authors: C. Tortù, I. Crimaldi, F. Mealli, L. Forastiere

Abstract: Policy evaluation studies, which intend to assess the effect of an intervention, face some statistical challenges: in real-world settings treatments are not randomly assigned and the analysis might be further complicated by the presence of interference between units. Researchers have started to develop novel methods for handling spillover mechanisms in observational studies; recent works focus primarily on binary treatments. However, many policy evaluation studies deal with more complex interventions. For instance, in political science, evaluating the impact of policies implemented by administrative entities often implies a multivariate approach, as a policy towards a specific issue operates at many different levels and can be defined along a number of dimensions. In this work, we extend the statistical framework for causal inference under network interference in observational studies, allowing for a multi-valued individual treatment and an interference structure shaped by a weighted network. The estimation strategy is based on a joint multiple generalized propensity score and allows one to estimate direct effects, controlling for both individual and network covariates. We follow the proposed methodology to analyze the impact of the national immigration policy on the crime rate. We define a multi-valued characterization of political attitudes towards migrants and we assume that the extent to which each country can be influenced by another country is modeled by an appropriate indicator, summarizing their cultural and geographical proximity. Results suggest that implementing a highly restrictive immigration policy leads to an increase in the crime rate, and the estimated effect is larger if we take into account interference from other countries.

Submitted 23 June, 2020; v1 submitted 28 February, 2020; originally announced March 2020.

arXiv:1807.11038 [pdf, other]

Estimating Causal Effects Under Interference Using Bayesian Generalized Propensity Scores

Authors: Laura Forastiere, Fabrizia Mealli, Albert Wu, Edoardo Airoldi

Abstract: In most real-world systems, units are interconnected and can be represented as networks consisting of nodes and edges. For instance, in social systems individuals can have social ties, family or financial relationships. In settings where some units are exposed to a treatment and its effect spills over connected units, estimating both the direct effect of the treatment and spillover effects presents several challenges. First, assumptions on the way and the extent to which spillover effects occur along the observed network are required. Second, in observational studies, where the treatment assignment is not under the control of the investigator, confounding and homophily are potential threats to the identification and estimation of causal effects on networks. Here, we make two structural assumptions: i) neighborhood interference, which assumes interference operates only through a function of the immediate neighbors' treatments; and ii) unconfoundedness of the individual and neighborhood treatment, which rules out the presence of unmeasured confounding variables, including those driving homophily. Under these assumptions we develop a new covariate-adjustment estimator for treatment and spillover effects in observational studies on networks. Estimation is based on a generalized propensity score that balances individual and neighborhood covariates across units under different levels of individual treatment and of exposure to neighbors' treatment. Adjustment for the propensity score is performed using a penalized spline regression. Inference capitalizes on a three-step Bayesian procedure that accounts for the uncertainty in the propensity score estimation while avoiding model feedback. Finally, correlation of interacting units is taken into account using a community detection algorithm and incorporating random effects in the outcome model.

Submitted 29 July, 2018; originally announced July 2018.

Comments: arXiv admin note: text overlap with arXiv:1610.06511 by other authors

arXiv:1609.06245 [pdf, other]

Identification and estimation of treatment and interference effects in observational studies on networks

Authors: Laura Forastiere, Edoardo M. Airoldi, Fabrizia Mealli

Abstract: Causal inference on a population of units connected through a network often presents technical challenges, including how to account for interference. In the presence of local interference, for instance, potential outcomes of a unit depend on its treatment as well as on the treatments of other local units, such as its neighbors according to the network. In observational studies, a further complication is that the typical unconfoundedness assumption must be extended (say, to include the treatment of neighbors, and individual and neighborhood covariates) to guarantee identification and valid inference. Here, we propose new estimands that define treatment and interference effects. We then derive analytical expressions for the bias of a naive estimator that wrongly assumes away interference. The bias depends on the level of interference but also on the degree of association between individual and neighborhood treatments. We propose an extended unconfoundedness assumption that accounts for interference, and we develop new covariate-adjustment methods that lead to valid estimates of treatment and interference effects in observational studies on networks. Estimation is based on a generalized propensity score that balances individual and neighborhood covariates across units under different levels of individual treatment and of exposure to neighbors' treatment. We carry out simulations, calibrated using friendship networks and covariates in a nationally representative longitudinal study of adolescents in grades 7-12 in the United States, to explore finite-sample performance in different realistic settings.

Submitted 29 March, 2018; v1 submitted 20 September, 2016; originally announced September 2016.

arXiv:1605.07242 [pdf, other]

More Powerful Multiple Testing in Randomized Experiments with Non-Compliance

Authors: Joseph J. Lee, Laura Forastiere, Luke Miratrix, Natesh S. Pillai

Abstract: Two common concerns raised in analyses of randomized experiments are (i) appropriately handling issues of non-compliance, and (ii) appropriately adjusting for multiple tests (e.g., on multiple outcomes or subgroups). Although simple intention-to-treat (ITT) and Bonferroni methods are valid in terms of type I error, they can each lead to a substantial loss of power; when employing both simultaneously, the total loss may be severe. Alternatives exist to address each concern. Here we propose an analysis method for experiments involving both features that merges posterior predictive $p$-values for complier causal effects with randomization-based multiple comparisons adjustments; the results are valid familywise tests that are doubly advantageous: more powerful than both those based on standard ITT statistics and those using traditional multiple comparison adjustments. The operating characteristics and advantages of our method are demonstrated through a series of simulated experiments and an analysis of the United States Job Training Partnership Act (JTPA) Study, where our methods lead to different conclusions regarding the significance of estimated JTPA effects.

Submitted 23 May, 2016; originally announced May 2016.

Comments: To appear in Statistica Sinica

arXiv:1511.00521 [pdf, other]

Posterior Predictive P-values with Fisher Randomization Tests in Noncompliance Settings: Test Statistics vs Discrepancy Variables

Authors: Laura Forastiere, Fabrizia Mealli, Luke Miratrix

Abstract: In randomized experiments with noncompliance, tests may focus on compliers rather than on the overall sample. Rubin (1998) put forth such a method, and argued that testing for the complier average causal effect and averaging permutation-based p-values over the posterior distribution of the compliance status could increase power, as compared to general intent-to-treat tests. The general scheme is to repeatedly do a two-step process of imputing missing compliance statuses and conducting a permutation test with the completed data. In this paper, we explore this idea further, comparing the use of discrepancy measures, which depend on unknown but imputed parameters, to classical test statistics and exploring different approaches for imputing the unknown compliance statuses. We also examine consequences of model misspecification in the imputation step, and discuss to what extent this additional modeling undercuts the permutation test's model independence. We find that, especially for discrepancy measures, modeling choices can impact both power and validity. In particular, imputing missing compliance statuses assuming the null can radically reduce power, but not doing so can jeopardize validity. Fortunately, covariates predictive of compliance status can mitigate these results. Finally, we compare this overall approach to Bayesian model-based tests, that is, tests that are directly derived from posterior credible intervals, under both correct and incorrect model specification. We find that adding the permutation step in an otherwise Bayesian approach improves robustness to model specification without substantial loss of power.

Submitted 20 February, 2016; v1 submitted 2 November, 2015; originally announced November 2015.

Showing 1–18 of 18 results for author: Forastiere, L