
Bayesian principal stratification with longitudinal data and truncation by death

Authors: Giulio Grossi, Marco Mariani, Alessandra Mattei, Fabrizia Mealli

Abstract: In many causal studies, outcomes are censored by death, in the sense that they are neither observed nor defined for units who die. In such studies, the focus is usually on the stratum of always survivors up to a single fixed time s. Building on a recent strand of the literature, we propose an extended framework for the analysis of longitudinal studies, where units can die at different time points, and the main endpoints are observed and well defined only up to the death time. We develop a Bayesian longitudinal principal stratification framework, where units are cross classified according to the longitudinal death status. Under this framework, the focus is on causal effects for the principal strata of units that would be alive up to a time point s irrespective of their treatment assignment, where these strata may vary as a function of s. We can get precious insights into the effects of treatment by inspecting the distribution of baseline characteristics within each longitudinal principal stratum, and by investigating the time trend of both principal stratum membership and survivor-average causal effects. We illustrate our approach for the analysis of a longitudinal observational study aimed to assess, under the assumption of strong ignorability of treatment assignment, the causal effects of a policy promoting start ups on firms' survival and hiring policy, where firms' hiring status is censored by death.

Submitted 30 December, 2023; originally announced January 2024.

arXiv:2310.06653 [pdf, other]

Evaluating causal effects on time-to-event outcomes in an RCT in Oncology with treatment discontinuation due to adverse events

Authors: Veronica Ballerini, Björn Bornkamp, Alessandra Mattei, Fabrizia Mealli, Craig Wang, Yufen Zhang

Abstract: In clinical trials, patients sometimes discontinue study treatments prematurely due to reasons such as adverse events. Treatment discontinuation occurs after the randomisation as an intercurrent event, making causal inference more challenging. The Intention-To-Treat (ITT) analysis provides valid causal estimates of the effect of treatment assignment; still, it does not take into account whether or not patients had to discontinue the treatment prematurely. We propose to deal with the problem of treatment discontinuation using principal stratification, recognised in the ICH E9(R1) addendum as a strategy for handling intercurrent events. Under this approach, we can decompose the overall ITT effect into principal causal effects for groups of patients defined by their potential discontinuation behaviour in continuous time. In this framework, we must consider that discontinuation happening in continuous time generates an infinite number of principal strata and that discontinuation time is not defined for patients who would never discontinue. An additional complication is that discontinuation time and time-to-event outcomes are subject to administrative censoring. We employ a flexible model-based Bayesian approach to deal with such complications. We apply the Bayesian principal stratification framework to analyse synthetic data based on a recent RCT in Oncology, aiming to assess the causal effects of a new investigational drug combined with standard of care vs. standard of care alone on progression-free survival.
We simulate data under different assumptions that reflect real situations where patients' behaviour depends on critical baseline covariates. Finally, we highlight how such an approach makes it straightforward to characterise patients' discontinuation behaviour with respect to the available covariates with the help of a simulation study.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2309.14486 [pdf, other]

Principal stratification with continuous treatments and continuous post-treatment variables

Authors: Joseph Antonelli, Fabrizia Mealli, Brenden Beck, Alessandra Mattei

Abstract: In causal inference studies, interest often lies in understanding the mechanisms through which a treatment affects an outcome. One approach is principal stratification (PS), which introduces well-defined causal effects in the presence of confounded post-treatment variables, or mediators, and clearly defines the assumptions for identification and estimation of those effects. The goal of this paper is to extend the PS framework to studies with continuous treatments and continuous post-treatment variables, which introduces a number of unique challenges both in terms of defining causal effects and performing inference. This manuscript provides three key methodological contributions: 1) we introduce novel principal estimands for continuous treatments that provide valuable insights into different causal mechanisms, 2) we utilize Bayesian nonparametric approaches to model the joint distribution of the potential mediating variables based on both Gaussian processes and Dirichlet process mixtures to ensure our approach is robust to model misspecification, and 3) we provide theoretical and numerical justification for utilizing a model for the potential outcomes to identify the joint distribution of the potential mediating variables. Lastly, we apply our methodology to a novel study of the relationship between the economy and arrest rates, and how this is potentially mediated by police capacity.

Submitted 25 September, 2023; originally announced September 2023.

arXiv:2211.09099 [pdf, other]

Selecting Subpopulations for Causal Inference in Regression Discontinuity Designs

Authors: Laura Forastiere, Alessandra Mattei, Julia M. Pescarini, Mauricio L. Barreto, Fabrizia Mealli

Abstract: The Brazil Bolsa Familia (BF) program is a conditional cash transfer program aimed to reduce short-term poverty by direct cash transfers and to fight long-term poverty by increasing human capital among poor Brazilian people. Eligibility for Bolsa Familia benefits depends on a cutoff rule, which classifies the BF study as a regression discontinuity (RD) design. Extracting causal information from RD studies is challenging. Following Li et al (2015) and Branson and Mealli (2019), we formally describe the BF RD design as a local randomized experiment within the potential outcome approach. Under this framework, causal effects can be identified and estimated on a subpopulation where a local overlap assumption, a local SUTVA and a local ignorability assumption hold. We first discuss the potential advantages of this framework over local regression methods based on continuity assumptions, which concern the definition of the causal estimands, the design and the analysis of the study, and the interpretation and generalizability of the results. A critical issue of this local randomization approach is how to choose subpopulations for which we can draw valid causal inference. We propose a Bayesian model-based finite mixture approach to clustering to classify observations into subpopulations where the RD assumptions hold and do not hold.
This approach has important advantages: a) it accounts for the uncertainty in the subpopulation membership, which is typically neglected; b) it does not impose any constraint on the shape of the subpopulation; c) it is scalable to high-dimensional settings; d) it allows targeting causal estimands other than the average treatment effect (ATE); and e) it is robust to a certain degree of manipulation/selection of the running variable. We apply our proposed approach to assess causal effects of the Bolsa Familia program on leprosy incidence in 2009.

Submitted 11 October, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

arXiv:2011.11023 [pdf, other]

Exploiting network information to disentangle spillover effects in a field experiment on teens' museum attendance

Authors: Silvia Noirjean, Marco Mariani, Alessandra Mattei, Fabrizia Mealli

Abstract: A key element in the education of youths is their sensitization to historical and artistic heritage. We analyze a field experiment conducted in Florence (Italy) to assess how appropriate incentives assigned to high-school classes may induce teens to visit museums in their free time. Non-compliance and spillover effects make the impact evaluation of this clustered encouragement design challenging. We propose to blend principal stratification and causal mediation, by defining sub-populations of units according to their compliance behavior and using the information on their friendship networks as mediator. We formally define principal natural direct and indirect effects and principal controlled direct and spillover effects, and use them to disentangle spillovers from other causal channels. We adopt a Bayesian approach for inference.

Submitted 6 May, 2022; v1 submitted 22 November, 2020; originally announced November 2020.

Comments: Major changes (data/outcome/models)

arXiv:2004.05027 [pdf, other]

Direct and spillover effects of a new tramway line on the commercial vitality of peripheral streets. A synthetic-control approach

Authors: Giulio Grossi, Marco Mariani, Alessandra Mattei, Patrizia Lattarulo, Özge Öner

Abstract: In cities, the creation of public transport infrastructure such as light rails can cause changes on a very detailed spatial scale, with different stories unfolding next to each other within the same urban neighborhood. We study the direct effect of a light rail line built in Florence (Italy) on the retail density of the street where it was built and its spillover effect on other streets in the treated street's neighborhood. To this aim, we investigate the use of the Synthetic Control Group (SCG) methods in panel comparative case studies where interference between the treated and the untreated units is plausible, an issue still little researched in the SCG methodological literature. We frame our discussion in the potential outcomes approach. Under a partial interference assumption, we formally define relevant direct and spillover causal effects. We also consider the "unrealized" spillover effect on the treated street in the hypothetical scenario that another street in the treated unit's neighborhood had been assigned to the intervention.

Submitted 27 November, 2023; v1 submitted 10 April, 2020; originally announced April 2020.

arXiv:2002.11989 [pdf, other]

Assessing causal effects in the presence of treatment switching through principal stratification

Authors: Alessandra Mattei, Peng Ding, Veronica Ballerini, Fabrizia Mealli

Abstract: Clinical trials often allow patients in the control arm to switch to the treatment arm if their physical conditions are worse than certain tolerance levels. For instance, treatment switching arises in the Concorde clinical trial, which aims to assess causal effects on the time-to-disease progression or death of immediate versus deferred treatment with zidovudine among patients with asymptomatic HIV infection. The Intention-To-Treat analysis does not measure the effect of the actual receipt of the treatment and ignores the information on treatment switching. Other existing methods reconstruct the outcome a patient would have had if they had not switched under strong assumptions. Departing from the literature, we re-define the problem of treatment switching using principal stratification and focus on causal effects for patients belonging to subpopulations defined by the switching behavior under control. We use a Bayesian approach to inference, taking into account that (i) switching happens in continuous time; (ii) switching time is not defined for patients who never switch in a particular experiment; and (iii) survival time and switching time are subject to censoring. We apply this framework to analyze synthetic data based on the Concorde study. Our data analysis reveals that immediate treatment with zidovudine increases survival time for never switchers and that treatment effects are highly heterogeneous across different types of patients defined by the switching behavior.

Submitted 5 September, 2023; v1 submitted 27 February, 2020; originally announced February 2020.

arXiv:1710.07039 [pdf, ps, other]

Causal inference for binary non-independent outcomes

Authors: Monia Lupparelli, Alessandra Mattei

Abstract: Causal inference on multiple non-independent outcomes raises serious challenges, because multivariate techniques that properly account for the outcome's dependence structure need to be considered. We focus on the case of binary outcomes framing our discussion in the potential outcome approach to causal inference. We define causal effects of treatment on joint outcomes introducing the notion of product outcomes. We also discuss a decomposition of the causal effect on product outcomes into intrinsic and extrinsic causal effects, which respectively provide information on treatment effect on the intrinsic (product) structure of the product outcomes and on the outcomes' dependence structure. We propose a log-mean linear regression approach for modeling the distribution of the potential outcomes, which is particularly appealing because all the causal estimands of interest and the decomposition into intrinsic and extrinsic causal effects can be easily derived by model parameters. The method is illustrated in two randomized experiments concerning (i) the effect of the administration of oral pre-surgery morphine on pain intensity after surgery; and (ii) the effect of honey on nocturnal cough and sleep difficulty associated with childhood upper respiratory tract infections.

Submitted 10 May, 2018; v1 submitted 19 October, 2017; originally announced October 2017.

arXiv:1710.00720 [pdf, other]

A novel quantile-based decomposition of the indirect effect in mediation analysis with an application to infant mortality in the US population

Authors: Marco Geraci, Alessandra Mattei

Abstract: In mediation analysis, the effect of an exposure (or treatment) on an outcome variable is decomposed into two components: a direct effect, which pertains to an immediate influence of the exposure on the outcome, and an indirect effect, which the exposure exerts on the outcome through a third variable called mediator. Our motivating example concerns the relationship between maternal smoking (the exposure, $X$), birthweight (the mediator, $M$), and infant mortality (the outcome, $Y$), which has attracted the interest of epidemiologists and statisticians for many years. We introduce new causal estimands, named $u$-specific direct and indirect effects, which describe the direct and indirect effects of the exposure on the outcome at a specific quantile $u$ of the mediator, $0 < u < 1$. Under sequential ignorability we derive an interesting and novel decomposition of $u$-specific indirect effects. The components of this decomposition have a straightforward interpretation and can provide new insights into the complexity of the mechanisms underlying the indirect effect. We illustrate the proposed methods using data on infant mortality in the US population. We provide analytical evidence that supports the hypothesis that the risk of sudden infant death syndrome is not predicted by changes in the birthweight distribution.

Submitted 3 October, 2017; v1 submitted 2 October, 2017; originally announced October 2017.

Comments: 31 pages, 7 figures

MSC Class: 62J99

arXiv:1608.08485 [pdf]

Potential outcome approach to causal inference in assessing the short term impact of air pollution on mortality

Authors: Michela Baccini, Alessandra Mattei, Fabrizia Mealli, Pier Alberto Bertazzi, Michele Carugno

Abstract: The opportunity to assess short term impact of air pollution relies on the causal interpretation of the exposure-outcome association, but up to now few studies explicitly faced this issue within a causal inference framework. In this paper, we reformulated the problem of assessing the short term impact of air pollution on health using the potential outcome approach to causal inference. We focused on the impact of high daily levels of PM10 on mortality within two days from the exposure in the metropolitan area of Milan (Italy), during the period 2003-2006. After defining the number of attributable deaths in terms of difference between potential outcomes, we used the estimated propensity score to match each high exposure-day with a day with similar background characteristics but lower PM10 level. Then, we estimated the impact by comparing mortality between matched days. We found that during the study period daily exposures larger than 40 micrograms per cubic meter were responsible for 1079 deaths (116; 2042). The impact was more evident among the elderly than in the younger classes of age. The propensity score matching turned out to be an appealing method to assess historical impacts in this field.

Submitted 30 August, 2016; originally announced August 2016.

arXiv:1608.07180 [pdf, other]

Bayesian Inference for Sequential Treatments under Latent Sequential Ignorability

Authors: Federico Ricciardi, Alessandra Mattei, Fabrizia Mealli

Abstract: We focus on causal inference for longitudinal treatments, where units are assigned to treatments at multiple time points, aiming to assess the effect of different treatment sequences on an outcome observed at a final point. A common assumption in similar studies is Sequential Ignorability (SI): treatment assignment at each time point is assumed independent of future potential outcomes given past observed outcomes and covariates. SI is questionable when treatment participation depends on individual choices, and treatment assignment may depend on unobservable quantities associated with future outcomes. We rely on Principal Stratification to formulate a relaxed version of SI: Latent Sequential Ignorability (LSI) assumes that treatment assignment is conditionally independent of future potential outcomes given past treatments, covariates and principal stratum membership, a latent variable defined by the joint value of observed and missing intermediate outcomes. We evaluate SI and LSI, using theoretical arguments and simulation studies to investigate the performance of the two assumptions when one holds and inference is conducted under both. Simulations show that when SI does not hold, inference performed under SI leads to misleading conclusions. Conversely, LSI generally leads to correct posterior distributions, irrespective of which assumption holds.

Submitted 12 May, 2019; v1 submitted 25 August, 2016; originally announced August 2016.

arXiv:1507.04199 [pdf, ps, other]

Evaluating the Causal Effect of University Grants on Student Dropout: Evidence from a Regression Discontinuity Design Using Principal Stratification

Authors: Fan Li, Alessandra Mattei, Fabrizia Mealli

Abstract: Regression discontinuity (RD) designs are often interpreted as local randomized experiments: a RD design can be considered as a randomized experiment for units with a realized value of a so-called forcing variable falling around a pre-fixed threshold. Motivated by the evaluation of Italian university grants, we consider a fuzzy RD design where the receipt of the treatment is based on both eligibility criteria and a voluntary application status. Resting on the fact that grant application and grant receipt statuses are post-assignment (post-eligibility) intermediate variables, we use the principal stratification framework to define causal estimands within the Rubin Causal Model. We propose a probabilistic formulation of the assignment mechanism underlying RD designs, by re-formulating the Stable Unit Treatment Value Assumption (SUTVA) and making an explicit local overlap assumption for a subpopulation around the threshold. A local randomization assumption is invoked instead of more standard continuity assumptions. We also develop a model-based Bayesian approach to select the target subpopulation(s) with adjustment for multiple comparisons, and to draw inference for the target causal estimands in this framework. Applying the method to the data from two Italian universities, we find evidence that university grants are effective in preventing students from low-income families from dropping out of higher education.

Submitted 15 July, 2015; originally announced July 2015.

arXiv:1401.2344 [pdf, ps, other]

doi 10.1214/13-AOAS674

Exploiting multiple outcomes in Bayesian principal stratification analysis with application to the evaluation of a job training program

Authors: Alessandra Mattei, Fan Li, Fabrizia Mealli

Abstract: The causal effect of a randomized job training program, the JOBS II study, on trainees' depression is evaluated. Principal stratification is used to deal with noncompliance to the assigned treatment. Due to the latent nature of the principal strata, strong structural assumptions are often invoked to identify principal causal effects. Alternatively, distributional assumptions may be invoked using a model-based approach. These often lead to weakly identified models with substantial regions of flatness in the posterior distribution of the causal effects. Information on multiple outcomes is routinely collected in practice, but is rarely used to improve inference. This article develops a Bayesian approach to exploit multivariate outcomes to sharpen inferences in weakly identified principal stratification models. We show that inference for the causal effect on depression is significantly improved by using the re-employment status as a secondary outcome in the JOBS II study. Simulation studies are also performed to illustrate the potential gains in the estimation of principal causal effects from jointly modeling more than one outcome. This approach can also be used to assess plausibility of structural assumptions and sensitivity to deviations from these structural assumptions. Two model checking procedures via posterior predictive checks are also discussed.

Submitted 10 January, 2014; originally announced January 2014.

Comments: Published at http://dx.doi.org/10.1214/13-AOAS674 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS674

Journal ref: Annals of Applied Statistics 2013, Vol. 7, No. 4, 2336-2360

Showing 1–13 of 13 results for author: Mattei, A