Differential recall bias in estimating treatment effects in observational studies

Abstract.

Observational studies are frequently used to estimate the effect of an exposure or treatment on an outcome. To obtain an unbiased estimate of the treatment effect, it is crucial to measure the exposure accurately. A common type of exposure misclassification is recall bias, which occurs in retrospective cohort studies when study subjects inaccurately recall their past exposure. Particularly challenging is differential recall bias in the context of self-reported binary exposures, where the bias may be directional rather than random, and its extent varies according to the outcomes experienced. This paper makes several contributions: (1) it establishes bounds for the average treatment effect (ATE) even when a validation study is not available; (2) it proposes multiple estimation methods across various strategies predicated on different assumptions; and (3) it suggests a sensitivity analysis technique to assess the robustness of the causal conclusion, incorporating insights from prior research. The effectiveness of these methods is demonstrated through simulation studies that explore various model misspecification scenarios. These approaches are then applied to investigate the effect of childhood physical abuse on mental health in adulthood.

Key words and phrases:
Blocking; Causal inference; Differential recall bias; Prognostic score; Propensity score; Stratification.

This article has been accepted for publication in Biometrics, published by Oxford University Press.


Suhwan Bong1, Kwonsang Lee1,* and Francesca Dominici2

*Corresponding author: [email protected]


1Department of Statistics, Seoul National University

2Department of Biostatistics, Harvard T.H. Chan School of Public Health


1. Introduction

Observational studies are conducted to quantify the evidence of a potential causal relationship between an exposure or treatment and a given outcome. While numerous methods have been proposed to address confounding bias in observational studies, only a few have considered the challenges associated with accurately measuring exposure. One such challenge is the presence of recall bias, which refers to a systematic error that occurs when participants inaccurately recall or omit details of past experiences, potentially influenced by subsequent events. Recall bias is particularly problematic in studies relying on self-reporting, such as retrospective cohort studies. It can lead to exposure misclassification, which can manifest as random or differential misclassification (Rothman, 2012). Random recall bias occurs when inaccuracies in the reporting of past events are due to chance and are not influenced by any specific factors. If the inaccuracies are equally likely to occur across the groups, then the bias may cancel out, and the estimated treatment effect may be unbiased (Raphael, 1987). In contrast, differential recall bias occurs when the misclassification of exposure information varies according to the value of other study variables. Specifically, under differential recall bias, exposure is differentially under-reported (or over-reported) depending on the outcome, which will likely lead to a biased estimate (Rothman, 2012). In retrospective cohort studies or case-control studies, differential recall bias cannot be eliminated even after adjusting for confounders. This paper focuses on methods for addressing differential recall bias in retrospective cohort studies.

In our motivating example, which examines data on childhood physical abuse and adult anger, prior research indicates that substantial differential recall bias can occur. This introduces a systematic bias, further compounding the bias already present due to confounders. Adults tend to under-report their exposure to childhood abuse because they are hesitant to disclose their experiences, even in anonymous or confidential surveys, due to feelings of shame, guilt, or fear of retaliation. It is also possible that individuals who have experienced childhood abuse and suffer from anger issues are more likely to report their abuse, as their anger may be related to unresolved trauma or emotional distress stemming from the abuse. However, the relationship between childhood abuse and adult anger is complex, and under-reporting of childhood abuse is a common problem that can vary depending on a range of individual and contextual factors (Fergusson et al., 2000).

There is an extensive literature on measurement error correction and exposure misclassification in epidemiology (Carroll et al., 1995; Rothman et al., 2008). For instance, numerous studies have concentrated on the error-in-variables model, addressing measurement error in linear regression problems (Lindley, 1953; Lord, 1960; Cochran, 1968; Fuller, 1980; Carroll et al., 1985). The regression calibration algorithm was proposed as a general approach by Carroll and Stefanski (1990) and Gleser (1990). Several measurement error correction methods have also been developed for binary misclassification problems and logistic regression models (Bross, 1954; Armstrong, 1985; Stefanski and Carroll, 1985; Rosner et al., 1989, 1990). However, previous methodologies addressing measurement error have predominantly focused on outcome prediction and targeted the association between variables. In our research, we aim to address this issue while accounting for confounders and focusing on the causal relationship between variables. To the best of our knowledge, contributions regarding the impact of differential recall bias on measures with causal interpretations are scarce.

Accounting for measurement error in causal inference is important. However, studies on measurement error have typically focused on mismeasured covariates and misclassified outcomes. For example, previous studies have investigated measurement error in covariates (McCaffrey et al., 2013; Lockwood and McCaffrey, 2016) and misclassification of binary outcomes (Gravel and Platt, 2018). Only a limited number of studies have addressed misclassified exposure, or recall bias. Imai and Yamamoto (2010) proposed a nonparametric identification method for estimating the average treatment effect (ATE) with differential treatment measurement error. This method could address both over-reporting and under-reporting measurement errors, but it was based on strict assumptions, such as no misclassification for compliant groups. Furthermore, their bounds for the ATE were derived from the true treatment assignment probability, which may be unknown when recall bias is present. Babanezhad et al. (2010) and Braun et al. (2017) have shown that exposure misclassification can significantly impact causal analysis. Babanezhad et al. (2010) compared several causal estimators for time-varying exposure reclassification cases, and Braun et al. (2017) proposed a likelihood-based method that adjusts for exposure misclassification bias under a non-differential measurement error assumption. In summary, these studies highlight a gap regarding causal inference methodologies that can analytically assess the impact of differential recall bias.

The primary goal of this paper is to introduce a collection of robust estimators for estimating the ATE in the presence of differential recall bias. We emphasize the significance of delineating the impact of differential recall bias on the ATE within the causal inference framework. Additionally, this paper makes three further contributions. First, we derive bounds for the ATE that do not rely on a validation study. The bounds can be refined with additional information about the nature of differential recall bias. This allows researchers to tailor the bounds based on the available evidence. Second, we propose multiple estimation methods using two different strategies for estimating the ATE: maximum likelihood estimation (MLE) and stratification. Each of the methods is based on its own assumptions, offering valuable insights into potential model misspecification. Finally, we propose a novel sensitivity analysis approach to assess the impact of differential recall bias on our conclusion. This is a crucial and useful way to quantify the evidence, given that the degree of differential recall bias is typically unknown in practice. We illustrate the application and efficacy of this sensitivity analysis method by applying it to the real data in Section 6.

2. Notation and Recall Bias Model

2.1. Causal Inference Framework and Target Parameters

We start by introducing the potential outcome framework (Rubin, 1974). Assume $N$ individuals in total. We denote $\mathbf{X}_i \in \mathcal{X} \subseteq \mathbb{R}^d$ as an observed covariate vector for the $i$th individual. We let $Z_i = 1$ indicate that individual $i$ was exposed to a certain binary exposure, and $Z_i = 0$ otherwise. We can define potential outcomes as follows: if $Z_i = 0$, then individual $i$ exhibits response $Y_i(0)$; if $Z_i = 1$, then individual $i$ exhibits $Y_i(1)$. Only one of the two potential outcomes can be observed depending on the exposure of individual $i$. The response exhibited by individual $i$ is $Y_i = Z_i Y_i(1) + (1 - Z_i) Y_i(0)$. In this paper, $Y_i(0)$ and $Y_i(1)$ are assumed to be binary. Depending on the occurrence of the outcome, the potential outcome is equal to either 0 or 1. We consider two assumptions: (1) unconfoundedness and (2) positivity. The unconfoundedness assumption means that the potential outcomes $(Y_i(0), Y_i(1))$ are conditionally independent of the treatment $Z_i$ given $\mathbf{X}_i$, that is, $(Y_i(0), Y_i(1)) \perp\!\!\!\perp Z_i \mid \mathbf{X}_i$. The positivity assumption means that the probability $\Pr(Z_i = 1 \mid \mathbf{X}_i)$ lies in $(0, 1)$. These assumptions together are often called strong ignorability (Rosenbaum and Rubin, 1983). We also adopt the Stable Unit Treatment Value Assumption (Rubin, 1980) to identify causal effects; that is, the potential outcomes for each individual are not affected by the treatment status of other individuals.

Binary exposure is frequently investigated retrospectively to find the cause of the outcome in observational studies; thus, exposure to a risk factor is never randomized. A naive comparison of the prevalence of the outcome between the exposed and unexposed groups can be misleading due to confounding bias. The effect of treatment on individual $i$ is defined as the difference $Y_i(1) - Y_i(0)$. However, it is impossible to observe both $Y_i(0)$ and $Y_i(1)$ for any individual. Under strong ignorability assumptions, it is possible to identify the ATE. Thus, our parameter of interest is the ATE, $\tau = \mathbb{E}[Y_i(1)] - \mathbb{E}[Y_i(0)]$. In some instances, we are interested in estimating the conditional average treatment effect (CATE) at a given level of $\mathbf{X}_i = \mathbf{x}$ for $\mathbf{x} \in \mathcal{X}$, defined as $\tau(\mathbf{x}) = \mathbb{E}[Y_i(1) \mid \mathbf{X}_i = \mathbf{x}] - \mathbb{E}[Y_i(0) \mid \mathbf{X}_i = \mathbf{x}]$.
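To make the target parameter concrete, the following Python sketch computes a plug-in estimate of $\tau$ when the true exposure is observed without error and the covariate is discrete. It averages the stratum-specific risk differences (empirical analogues of $\tau(\mathbf{x})$) over the empirical covariate distribution. The function name `ate_by_standardization` and the restriction to a single discrete covariate are illustrative assumptions and are not part of the methods proposed in this paper.

```python
from collections import defaultdict


def ate_by_standardization(y, z, x):
    """Plug-in ATE estimate for binary y and z and a discrete (hashable) x.

    Illustrative helper only: it assumes the TRUE exposure z is observed and
    that strong ignorability holds given x.
    """
    # Collect outcomes by covariate level and exposure group.
    strata = defaultdict(lambda: {0: [], 1: []})
    for yi, zi, xi in zip(y, z, x):
        strata[xi][zi].append(yi)

    n = len(y)
    tau = 0.0
    for level, groups in strata.items():
        # Positivity: each stratum needs exposed and unexposed individuals.
        if not groups[0] or not groups[1]:
            raise ValueError(f"positivity fails in stratum {level!r}")
        # Empirical CATE: risk difference within the stratum.
        cate = sum(groups[1]) / len(groups[1]) - sum(groups[0]) / len(groups[0])
        # Weight by the empirical probability of the stratum.
        weight = (len(groups[0]) + len(groups[1])) / n
        tau += weight * cate
    return tau
```

When the reported exposure $Z_i^*$ is used in place of $Z_i$, the same computation generally no longer targets $\tau$, which is the problem addressed in the remainder of the paper.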

2.2. Differential Recall Bias Model

Some observational studies, including retrospective cohort studies, collect exposure information retrospectively. Thus, recall bias may occur when the exposures are self-reported. In this paper, we consider situations with differential recall bias where the exposure is under-reported (or over-reported) differently depending on the outcome. In the presence of differential recall bias, the underlying true exposure $Z_i$ is not observed. Instead, we observe the biased exposure $Z_i^*$ with recall bias. If no recall bias exists, then $Z_i = Z_i^*$. In the childhood abuse example, previous literature has indicated that exposure was mainly under-reported. To address this issue, we assume that only one of over-reporting or under-reporting recall bias occurs. Since over-reporting can be treated similarly, we focus on under-reporting recall bias in this paper.

Assumption 1 (Differential Recall Bias).

Recall bias occurs independently with probability $\eta_y(\mathbf{x})$ for individuals with $Y_i = y$, $Z_i = 1$, and $\mathbf{X}_i = \mathbf{x}$, where $y = 0, 1$ and $\mathbf{x} \in \mathcal{X}$:

$$\begin{aligned}
\eta_0(\mathbf{x}) &= \Pr(Z_i^* = 0 \mid Y_i = 0, Z_i = 1, \mathbf{X}_i = \mathbf{x}), \\
\eta_1(\mathbf{x}) &= \Pr(Z_i^* = 0 \mid Y_i = 1, Z_i = 1, \mathbf{X}_i = \mathbf{x}).
\end{aligned}$$

Assumption 1 proposes the differential recall bias model, which assumes that the occurrence and magnitude of bias depend on the observed outcome and $\mathbf{X}_i$. In essence, after stratifying the data based on $\mathbf{X}_i$, the $2 \times 2$ contingency table of $Y_i$ and $Z_i$ can be subject to misclassification. The parameters $\eta_0(\mathbf{x})$ and $\eta_1(\mathbf{x})$ represent the probability of under-reporting depending on the outcome. In under-reported cases, $Z_i = 0$ always implies $Z_i^* = 0$, but $Z_i = 1$ implies either $Z_i^* = 1$ or $Z_i^* = 0$. Therefore, recall bias occurs only when $Z_i = 1$. Note that if there is no recall bias, then $\eta_0(\mathbf{x}) = \eta_1(\mathbf{x}) = 0$.
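The under-reporting mechanism of Assumption 1 can be sketched as a small simulation. The Python code below generates a reported exposure $Z_i^*$ from the true exposure and the outcome. For simplicity it assumes that $\eta_0$ and $\eta_1$ are constants rather than functions of $\mathbf{x}$; the function name `apply_underreporting` and this simplification are assumptions made for illustration only.

```python
import random


def apply_underreporting(z, y, eta0, eta1, rng):
    """Return the reported exposure Z* under outcome-dependent under-reporting.

    Simplifying assumption for this sketch: eta0 and eta1 are constants,
    whereas Assumption 1 allows them to depend on the covariates x.
    """
    z_star = []
    for zi, yi in zip(z, y):
        if zi == 1:
            # A truly exposed individual fails to report exposure with
            # probability eta1 if Y = 1 and eta0 if Y = 0.
            eta = eta1 if yi == 1 else eta0
            z_star.append(0 if rng.random() < eta else 1)
        else:
            # Under pure under-reporting, Z = 0 always implies Z* = 0.
            z_star.append(0)
    return z_star


# Example call with hypothetical values of eta0 and eta1.
rng = random.Random(2024)
z_true = [1, 0, 1, 1, 0, 1]
y_obs = [1, 1, 0, 1, 0, 0]
z_reported = apply_underreporting(z_true, y_obs, eta0=0.1, eta1=0.3, rng=rng)
```

By construction, the reported exposure never exceeds the true exposure, and setting both probabilities to zero returns the true exposure unchanged.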

3. Identification of Causal Parameters

If recall bias is absent and the exposure is observed correctly, then $\tau$ and $\tau(\mathbf{x})$ can be identified under strong ignorability assumptions. However, in the presence of recall bias, if the inference is made on the basis of the observed $Z_i^*$ rather than $Z_i$, then we obtain a biased estimate because $(Y_i(0), Y_i(1)) \not\perp\!\!\!\perp Z_i^* \mid \mathbf{X}_i$. To describe this, consider the probabilities based on the observed exposure $Z_i^*$, $p_{y|z}^*(\mathbf{x}) = \Pr(Y_i = y \mid Z_i^* = z, \mathbf{X}_i = \mathbf{x})$ for $z = 0, 1$, $y = 0, 1$, and $\mathbf{x} \in \mathcal{X}$. Then,

$$p_{y|z}(\mathbf{x}) = \frac{p_{yz}(\mathbf{x})}{p_{1z}(\mathbf{x}) + p_{0z}(\mathbf{x})} \neq \frac{p_{yz}^*(\mathbf{x})}{p_{1z}^*(\mathbf{x}) + p_{0z}^*(\mathbf{x})} = p_{y|z}^*(\mathbf{x}),$$

where $p_{yz}(\mathbf{x}) = \Pr(Y_i = y, Z_i = z \mid \mathbf{X}_i = \mathbf{x})$ and $p_{yz}^*(\mathbf{x}) = \Pr(Y_i = y, Z_i^* = z \mid \mathbf{X}_i = \mathbf{x})$ for $z = 0, 1$, $y = 0, 1$, and $\mathbf{x} \in \mathcal{X}$.

The precise recall bias mechanism in real-life scenarios is often unknown. Assuming we lack precise knowledge of the recall bias parameter functions stated in Assumption 1, it becomes infeasible to identify the ATE with certainty. However, if we can establish bounds on the recall bias parameter functions, we can potentially bound the target parameters. Under the recall bias model in Assumption 1, the following relationships hold:

$$\begin{aligned}
p_{11}(\mathbf{x}) &= \frac{p_{11}^*(\mathbf{x})}{1 - \eta_1(\mathbf{x})}, \quad p_{10}(\mathbf{x}) = p_{10}^*(\mathbf{x}) - \frac{\eta_1(\mathbf{x})}{1 - \eta_1(\mathbf{x})} p_{11}^*(\mathbf{x}), \\
p_{01}(\mathbf{x}) &= \frac{p_{01}^*(\mathbf{x})}{1 - \eta_0(\mathbf{x})}, \quad p_{00}(\mathbf{x}) = p_{00}^*(\mathbf{x}) - \frac{\eta_0(\mathbf{x})}{1 - \eta_0(\mathbf{x})} p_{01}^*(\mathbf{x}).
\end{aligned} \tag{1}$$
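The relationships in (1) can be checked numerically. The Python sketch below maps true cell probabilities to observed ones under the under-reporting model, and then inverts the map using (1), within a single covariate stratum. The dictionary layout keyed by `(y, z)` and the function names are illustrative assumptions, and the numerical values are hypothetical.

```python
def observed_cell_probs(p, eta0, eta1):
    """Map true Pr(Y=y, Z=z | x) to observed Pr(Y=y, Z*=z | x).

    Cells are keyed by (y, z). Exposed cells lose a fraction eta_y of their
    mass to the corresponding reported-unexposed cell.
    """
    return {
        (1, 1): (1 - eta1) * p[(1, 1)],
        (1, 0): p[(1, 0)] + eta1 * p[(1, 1)],
        (0, 1): (1 - eta0) * p[(0, 1)],
        (0, 0): p[(0, 0)] + eta0 * p[(0, 1)],
    }


def true_cell_probs(p_star, eta0, eta1):
    """Recover true cell probabilities from observed ones via equation (1)."""
    if not (0 <= eta0 < 1 and 0 <= eta1 < 1):
        raise ValueError("eta0 and eta1 must lie in [0, 1)")
    return {
        (1, 1): p_star[(1, 1)] / (1 - eta1),
        (1, 0): p_star[(1, 0)] - eta1 / (1 - eta1) * p_star[(1, 1)],
        (0, 1): p_star[(0, 1)] / (1 - eta0),
        (0, 0): p_star[(0, 0)] - eta0 / (1 - eta0) * p_star[(0, 1)],
    }


# Hypothetical true table within one stratum, and hypothetical recall bias.
p_true = {(1, 1): 0.2, (1, 0): 0.1, (0, 1): 0.3, (0, 0): 0.4}
p_obs = observed_cell_probs(p_true, eta0=0.1, eta1=0.3)
p_back = true_cell_probs(p_obs, eta0=0.1, eta1=0.3)
```

In this hypothetical table the true conditional probability is $p_{1|1} = 0.2/0.5 = 0.4$, whereas the observed table gives $p_{1|1}^* = 0.14/0.41 \approx 0.34$, illustrating why inference based on $Z_i^*$ is biased when the recall bias parameters are ignored.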

The following proposition allows partial identification of the causal parameters if recall bias occurs with probability at most $\delta$.

Proposition 1.

Under Assumption 1, suppose there exists a constant $0 \leq \delta < 1$ such that $0 \leq \eta_0(\mathbf{x}), \eta_1(\mathbf{x}) \leq \delta$ holds for all $\mathbf{x} \in \mathcal{X}$. Then the following inequalities hold for all $\mathbf{x} \in \mathcal{X}$:

$$\begin{aligned}
\frac{p_{11}^*(\mathbf{x})}{p_{11}^*(\mathbf{x}) + \frac{1}{1-\delta} p_{01}^*(\mathbf{x})} &\leq p_{1|1}(\mathbf{x}) \leq \frac{p_{11}^*(\mathbf{x})}{p_{11}^*(\mathbf{x}) + (1-\delta) p_{01}^*(\mathbf{x})}, \\
\frac{p_{10}^*(\mathbf{x}) - \frac{\delta}{1-\delta} p_{11}^*(\mathbf{x})}{p_{10}^*(\mathbf{x}) + p_{00}^*(\mathbf{x}) - \frac{\delta}{1-\delta} p_{11}^*(\mathbf{x})} &\leq p_{1|0}(\mathbf{x}) \leq \frac{p_{10}^*(\mathbf{x})}{p_{10}^*(\mathbf{x}) + p_{00}^*(\mathbf{x}) - \frac{\delta}{1-\delta} p_{01}^*(\mathbf{x})}.
\end{aligned}$$

This proposition can be used when we can constrain the occurrence probability of recall bias using domain knowledge.
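As a rough illustration of how Proposition 1 might be applied within a single covariate stratum, the Python sketch below evaluates the bounds on $p_{1|1}(\mathbf{x})$ and $p_{1|0}(\mathbf{x})$ from an observed table and a chosen $\delta$. It then combines them into bounds on $\tau(\mathbf{x})$, using the fact that $\tau(\mathbf{x}) = p_{1|1}(\mathbf{x}) - p_{1|0}(\mathbf{x})$ under strong ignorability. The function names, the `(y, z)` dictionary layout, and the check that both denominators are positive are assumptions added for this sketch; the observed table is hypothetical.

```python
def prop1_bounds(p_star, delta):
    """Bounds of Proposition 1 on p_{1|1} and p_{1|0} within one stratum.

    p_star maps (y, z) to the observed Pr(Y=y, Z*=z | x); all four cells are
    assumed to be positive.
    """
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    p11, p10 = p_star[(1, 1)], p_star[(1, 0)]
    p01, p00 = p_star[(0, 1)], p_star[(0, 0)]
    r = delta / (1.0 - delta)

    # Added guard (not part of the proposition): the bounds on p_{1|0} are
    # only meaningful when both denominators are positive.
    if p10 + p00 - r * p11 <= 0 or p10 + p00 - r * p01 <= 0:
        raise ValueError("delta is too large for this observed table")

    lower_11 = p11 / (p11 + p01 / (1.0 - delta))
    upper_11 = p11 / (p11 + (1.0 - delta) * p01)
    lower_10 = (p10 - r * p11) / (p10 + p00 - r * p11)
    upper_10 = p10 / (p10 + p00 - r * p01)
    return {"p1|1": (lower_11, upper_11), "p1|0": (lower_10, upper_10)}


def cate_bounds(p_star, delta):
    """Bounds on tau(x) = p_{1|1}(x) - p_{1|0}(x) implied by prop1_bounds."""
    b = prop1_bounds(p_star, delta)
    return (b["p1|1"][0] - b["p1|0"][1], b["p1|1"][1] - b["p1|0"][0])


# Hypothetical observed table within one stratum.
p_star_example = {(1, 1): 0.14, (1, 0): 0.16, (0, 1): 0.27, (0, 0): 0.43}
interval = cate_bounds(p_star_example, delta=0.3)
```

Setting $\delta = 0$ collapses each interval to the naive quantities $p_{1|1}^*(\mathbf{x})$ and $p_{1|0}^*(\mathbf{x})$, and larger values of $\delta$ widen the intervals.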

Some additional assumptions may be useful to narrow the bounds of the estimands. For example, when studying the potential impact of childhood abuse on mental health issues in adulthood, it is important to consider the possibility of individuals hiding or feeling shame about their previous experiences. Additionally, those who have mental health problems in adulthood may be more likely to under-report their history of abuse, as they may feel particularly affected by their experiences and may be hesitant to disclose them. This additional information allows us to assume that $\eta_0(\mathbf{x}) \leq \eta_1(\mathbf{x})$.

Proposition 2.

Under Assumption 1,

  1. (a)

    Suppose there exists a constant $0 \leq \delta < 1$ such that $0 \leq \eta_0(\mathbf{x}) \leq \eta_1(\mathbf{x}) \leq \delta$ holds for all $\mathbf{x} \in \mathcal{X}$. Then, the following inequalities hold for all $\mathbf{x} \in \mathcal{X}$:

    $$\begin{aligned}
    p_{1|1}^*(\mathbf{x}) &\leq p_{1|1}(\mathbf{x}) \leq \frac{p_{11}^*(\mathbf{x})}{p_{11}^*(\mathbf{x}) + p_{01}^*(\mathbf{x})(1-\delta)}, \\
    \frac{p_{10}^*(\mathbf{x}) - \frac{\delta}{1-\delta} p_{11}^*(\mathbf{x})}{p_{10}^*(\mathbf{x}) + p_{00}^*(\mathbf{x}) - \frac{\delta}{1-\delta} p_{11}^*(\mathbf{x})} &\leq p_{1|0}(\mathbf{x}) \leq \max\left\{ p_{1|0}^*(\mathbf{x}), \frac{p_{10}^*(\mathbf{x}) - \frac{\delta}{1-\delta} p_{11}^*(\mathbf{x})}{p_{10}^*(\mathbf{x}) + p_{00}^*(\mathbf{x}) - \frac{\delta}{1-\delta} \{ p_{01}^*(\mathbf{x}) + p_{11}^*(\mathbf{x}) \}} \right\}.
    \end{aligned}$$
  2. (b)

    Suppose there exists a constant 0δ<10𝛿10\leq\delta<10 ≤ italic_δ < 1 which 0η1(𝐱)η0(𝐱)δ0subscript𝜂1𝐱subscript𝜂0𝐱𝛿0\leq\eta_{1}(\mathbf{x})\leq\eta_{0}(\mathbf{x})\leq\delta0 ≤ italic_η start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( bold_x ) ≤ italic_η start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ( bold_x ) ≤ italic_δ holds for all 𝐱𝒳𝐱𝒳\mathbf{x}\in\mathcal{X}bold_x ∈ caligraphic_X. Then, the following inequalities hold for all 𝐱𝒳𝐱𝒳\mathbf{x}\in\mathcal{X}bold_x ∈ caligraphic_X:

    p11(𝐱)p11(𝐱)+p01(𝐱)/(1δ)p1|1(𝐱)p1|1(𝐱),superscriptsubscript𝑝11𝐱superscriptsubscript𝑝11𝐱superscriptsubscript𝑝01𝐱1𝛿subscript𝑝conditional11𝐱subscriptsuperscript𝑝conditional11𝐱\displaystyle\frac{p_{11}^{*}(\mathbf{x})}{p_{11}^{*}(\mathbf{x})+p_{01}^{*}(% \mathbf{x})/(1-\delta)}\leq p_{1|1}(\mathbf{x})\leq p^{*}_{1|1}(\mathbf{x}),divide start_ARG italic_p start_POSTSUBSCRIPT 11 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_x ) end_ARG start_ARG italic_p start_POSTSUBSCRIPT 11 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_x ) + italic_p start_POSTSUBSCRIPT 01 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_x ) / ( 1 - italic_δ ) end_ARG ≤ italic_p start_POSTSUBSCRIPT 1 | 1 end_POSTSUBSCRIPT ( bold_x ) ≤ italic_p start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 | 1 end_POSTSUBSCRIPT ( bold_x ) ,
    min{p1|0(𝐱),p10(𝐱)δ1δp11(𝐱)p10(𝐱)+p00(𝐱)δ1δ{p01(𝐱)+p11(𝐱)}}p1|0(𝐱)p10(𝐱)p10(𝐱)+p00(𝐱)δ1δp01(𝐱).subscriptsuperscript𝑝conditional10𝐱superscriptsubscript𝑝10𝐱𝛿1𝛿superscriptsubscript𝑝11𝐱superscriptsubscript𝑝10𝐱superscriptsubscript𝑝00𝐱𝛿1𝛿superscriptsubscript𝑝01𝐱superscriptsubscript𝑝11𝐱subscript𝑝conditional10𝐱superscriptsubscript𝑝10𝐱superscriptsubscript𝑝10𝐱superscriptsubscript𝑝00𝐱𝛿1𝛿superscriptsubscript𝑝01𝐱\displaystyle\min\left\{p^{*}_{1|0}(\mathbf{x}),\frac{p_{10}^{*}(\mathbf{x})-% \frac{\delta}{1-\delta}p_{11}^{*}(\mathbf{x})}{p_{10}^{*}(\mathbf{x})+p_{00}^{% *}(\mathbf{x})-\frac{\delta}{1-\delta}\{p_{01}^{*}(\mathbf{x})+p_{11}^{*}(% \mathbf{x})\}}\right\}\leq p_{1|0}(\mathbf{x})\leq\frac{p_{10}^{*}(\mathbf{x})% }{p_{10}^{*}(\mathbf{x})+p_{00}^{*}(\mathbf{x})-\frac{\delta}{1-\delta}p_{01}^% {*}(\mathbf{x})}.roman_min { italic_p start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 | 0 end_POSTSUBSCRIPT ( bold_x ) , divide start_ARG italic_p start_POSTSUBSCRIPT 10 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_x ) - divide start_ARG italic_δ end_ARG start_ARG 1 - italic_δ end_ARG italic_p start_POSTSUBSCRIPT 11 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_x ) end_ARG start_ARG italic_p start_POSTSUBSCRIPT 10 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_x ) + italic_p start_POSTSUBSCRIPT 00 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_x ) - divide start_ARG italic_δ end_ARG start_ARG 1 - italic_δ end_ARG { italic_p start_POSTSUBSCRIPT 01 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_x ) + italic_p start_POSTSUBSCRIPT 11 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_x ) } end_ARG } ≤ italic_p start_POSTSUBSCRIPT 1 | 0 end_POSTSUBSCRIPT ( bold_x ) ≤ divide start_ARG italic_p start_POSTSUBSCRIPT 10 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_x ) end_ARG start_ARG italic_p start_POSTSUBSCRIPT 10 
end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_x ) + italic_p start_POSTSUBSCRIPT 00 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_x ) - divide start_ARG italic_δ end_ARG start_ARG 1 - italic_δ end_ARG italic_p start_POSTSUBSCRIPT 01 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( bold_x ) end_ARG .
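As a numerical illustration of Proposition 2, the following Python sketch evaluates the bounds at a single covariate value. It is not part of the original analysis. The function name `proposition2_bounds` and its arguments are hypothetical: `p11`, `p10`, `p01`, `p00` stand for the observed probabilities $p^{*}_{yz}(\mathbf{x})$, and the observed conditional probabilities are assumed to be $p^{*}_{1|z}(\mathbf{x})=p^{*}_{1z}(\mathbf{x})/\{p^{*}_{1z}(\mathbf{x})+p^{*}_{0z}(\mathbf{x})\}$.

```python
def proposition2_bounds(p11, p10, p01, p00, delta, case="a"):
    """Bounds on p_{1|1}(x) and p_{1|0}(x) as stated in Proposition 2.

    p_yz is the observed probability Pr(Y=y, Z*=z | X=x) (assumed notation).
    case "a": 0 <= eta0 <= eta1 <= delta; case "b": 0 <= eta1 <= eta0 <= delta.
    Returns ((lower, upper) for p_{1|1}, (lower, upper) for p_{1|0}).
    """
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must satisfy 0 <= delta < 1")
    r = delta / (1.0 - delta)
    obs_1_given_1 = p11 / (p11 + p01)  # p*_{1|1}(x)
    obs_1_given_0 = p10 / (p10 + p00)  # p*_{1|0}(x)
    # Ratio that appears inside both the max{...} and the min{...} terms.
    shared = (p10 - r * p11) / (p10 + p00 - r * (p01 + p11))
    if case == "a":
        bounds_11 = (obs_1_given_1, p11 / (p11 + p01 * (1.0 - delta)))
        bounds_10 = ((p10 - r * p11) / (p10 + p00 - r * p11),
                     max(obs_1_given_0, shared))
    elif case == "b":
        bounds_11 = (p11 / (p11 + p01 / (1.0 - delta)), obs_1_given_1)
        bounds_10 = (min(obs_1_given_0, shared),
                     p10 / (p10 + p00 - r * p01))
    else:
        raise ValueError("case must be 'a' or 'b'")
    return bounds_11, bounds_10


# Illustrative (made-up) observed probabilities with delta = 0.2.
b11, b10 = proposition2_bounds(0.12, 0.205, 0.135, 0.54, delta=0.2, case="a")
print("p_{1|1} bounds:", b11)
print("p_{1|0} bounds:", b10)
```

Setting `delta=0` collapses both intervals to the observed conditional probabilities, which corresponds to the case of no recall bias.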

This proposition implies that by assuming a relationship between the two parameters, $\eta_{0}(\mathbf{x})$ and $\eta_{1}(\mathbf{x})$, we can narrow down either the upper or lower bound of the ATE. We can partially identify the causal parameter when the exact recall bias parameter functions are unknown. We can also point identify the causal treatment effect parameters if the recall bias parameter functions are known.

Proposition 3.

Under Assumption 1, the following equality holds for all $\mathbf{x}\in\mathcal{X}$:

$$
\tau(\mathbf{x})=\frac{\frac{p_{11}^{*}(\mathbf{x})}{1-\eta_{1}(\mathbf{x})}}{\frac{p_{11}^{*}(\mathbf{x})}{1-\eta_{1}(\mathbf{x})}+\frac{p_{01}^{*}(\mathbf{x})}{1-\eta_{0}(\mathbf{x})}}-\frac{p_{10}^{*}(\mathbf{x})-\frac{\eta_{1}(\mathbf{x})}{1-\eta_{1}(\mathbf{x})}p_{11}^{*}(\mathbf{x})}{p_{10}^{*}(\mathbf{x})-\frac{\eta_{1}(\mathbf{x})}{1-\eta_{1}(\mathbf{x})}p_{11}^{*}(\mathbf{x})+p_{00}^{*}(\mathbf{x})-\frac{\eta_{0}(\mathbf{x})}{1-\eta_{0}(\mathbf{x})}p_{01}^{*}(\mathbf{x})}.
$$

If a validation study is available, $\eta_{0}(\mathbf{x})$ and $\eta_{1}(\mathbf{x})$ can be estimated and then plugged into the above equation. However, validation studies are unavailable in many situations, especially at an early stage of research.
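The plug-in idea can be checked numerically with a short Python sketch. It is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, and `observed_cells` encodes one reading of the recall mechanism that is consistent with Proposition 3, namely that truly exposed subjects with outcome $y$ report no exposure with probability $\eta_{y}(\mathbf{x})$ and unexposed subjects never report exposure.

```python
def observed_cells(e, m1, m0, eta0, eta1):
    """Observed Pr(Y=y, Z*=z | x), returned as (p11, p10, p01, p00).

    Assumption (not taken verbatim from the text): exposure is only
    under-reported, with eta_y = Pr(Z*=0 | Z=1, Y=y, x).
    e = Pr(Z=1 | x), m1 = Pr(Y=1 | Z=1, x), m0 = Pr(Y=1 | Z=0, x).
    """
    p11 = (1 - eta1) * m1 * e
    p01 = (1 - eta0) * (1 - m1) * e
    p10 = m0 * (1 - e) + eta1 * m1 * e
    p00 = (1 - m0) * (1 - e) + eta0 * (1 - m1) * e
    return p11, p10, p01, p00


def corrected_cate(p11, p10, p01, p00, eta0, eta1):
    """tau(x) from Proposition 3 for known eta0(x), eta1(x)."""
    exposed_y1 = p11 / (1 - eta1)                   # recovers Pr(Y=1, Z=1 | x)
    exposed_y0 = p01 / (1 - eta0)                   # recovers Pr(Y=0, Z=1 | x)
    unexposed_y1 = p10 - eta1 / (1 - eta1) * p11    # recovers Pr(Y=1, Z=0 | x)
    unexposed_y0 = p00 - eta0 / (1 - eta0) * p01    # recovers Pr(Y=0, Z=0 | x)
    return (exposed_y1 / (exposed_y1 + exposed_y0)
            - unexposed_y1 / (unexposed_y1 + unexposed_y0))


# Made-up truth: tau(x) = m1 - m0 = 0.5 - 0.25.
cells = observed_cells(e=0.3, m1=0.5, m0=0.25, eta0=0.1, eta1=0.2)
print("naive estimate:    ", corrected_cate(*cells, eta0=0.0, eta1=0.0))
print("corrected estimate:", corrected_cate(*cells, eta0=0.1, eta1=0.2))
```

In this constructed example the naive contrast, which ignores recall bias, understates the effect, while plugging in the true recall parameters recovers $m_{1}(\mathbf{x})-m_{0}(\mathbf{x})$ up to floating-point error.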

4. Methods for Recovering the Treatment Effects in the Presence of Recall Bias

In this section, we propose two estimation methods that provide consistent estimates of the ATE in the presence of recall bias and confounding: (1) maximum likelihood estimation and (2) stratification. We suggest three stratification techniques for the stratification method: (1) propensity score stratification, (2) prognostic score stratification, and (3) blocking. Furthermore, we discuss the nearest-neighbor combination method used to address the problems in the stratification method with recall bias.

For given $\eta_{0}(\mathbf{x})$ and $\eta_{1}(\mathbf{x})$, the maximum likelihood (ML)-based method requires correctly specified models for the exposure and the two potential outcomes to obtain a consistent estimate of $\tau$. The stratification-based method requires fewer model assumptions. Stratification can be implemented on the basis of either propensity scores or prognostic scores (Hansen, 2008). The propensity score stratification method requires a correctly specified exposure model, while the prognostic score stratification method needs a correctly specified outcome model. The blocking method suggested by Karmakar et al. (2021) does not need any model assumption. In the following subsections, we discuss these estimation methods in more detail.

4.1. Maximum Likelihood Estimation

Consider the outcome models $m_{z}(\mathbf{x})=\Pr(Y_{i}=1\mid Z_{i}=z,\mathbf{X}_{i}=\mathbf{x})$ for $z=0,1$, which model the two probabilities $p_{1|1}(\mathbf{x})$ and $p_{1|0}(\mathbf{x})$, and the propensity score model $e(\mathbf{x})$ for $\Pr(Z_{i}=1\mid\mathbf{X}_{i}=\mathbf{x})$. The probability $p_{z}(\mathbf{x})=\mathbb{E}[Y_{i}(z)\mid\mathbf{X}_{i}=\mathbf{x}]$ can be identified as $p_{1|z}(\mathbf{x})$, which can thus be estimated by $m_{z}(\mathbf{x})$. It is well known that, in the absence of recall bias, either $m_{z}(\mathbf{x})$ or $e(\mathbf{x})$ must be correctly specified to obtain a consistent estimate. However, in the presence of recall bias, neither $m_{z}(\mathbf{x})$ nor $e(\mathbf{x})$ can be estimated directly from the observable data set because the true $Z_{i}$ is not observed. Instead, we can estimate the ATE as a function of the tuning parameters of the recall bias model.

The first method presented in this subsection uses maximum likelihood estimation. To construct the likelihood function and obtain an estimate for given $\eta_{0}(\mathbf{x})$ and $\eta_{1}(\mathbf{x})$, both $m_{z}(\mathbf{x})$ and $e(\mathbf{x})$ must be specified. Under Assumption 1, the joint probability $\Pr(Y_{i},Z_{i}^{\ast}\mid\mathbf{X}_{i})$ of the observable variables can be represented as a function of $m_{0}(\mathbf{x})$, $m_{1}(\mathbf{x})$, and $e(\mathbf{x})$. We assume models $m_{z}(\mathbf{x};\bm{\gamma}_{z})$, $z=0,1$, and $e(\mathbf{x};\bm{\beta})$ with parameters $\bm{\gamma}_{z}$ and $\bm{\beta}$, respectively. For instance, logistic regressions can be used, such as $m(Z,\mathbf{X};\bm{\gamma})=\exp(\gamma_{z}Z+\bm{\gamma}_{\mathbf{X}}^{T}\mathbf{X})/\{1+\exp(\gamma_{z}Z+\bm{\gamma}_{\mathbf{X}}^{T}\mathbf{X})\}$ with $m_{1}(\mathbf{X})=m(1,\mathbf{X};\bm{\gamma})$ and $m_{0}(\mathbf{X})=m(0,\mathbf{X};\bm{\gamma})$, and $e(\mathbf{X})=\exp(\bm{\beta}^{T}\mathbf{X})/\{1+\exp(\bm{\beta}^{T}\mathbf{X})\}$. These model parameters can be estimated by solving the following maximization problem:

$$
\widehat{\bm{\theta}}=(\widehat{\bm{\beta}},\widehat{\bm{\gamma}}_{0},\widehat{\bm{\gamma}}_{1})=\operatorname*{argmax}_{\bm{\beta},\bm{\gamma}_{0},\bm{\gamma}_{1}}\sum_{i=1}^{N}\log\Pr(Y_{i}=y_{i},Z_{i}^{*}=z_{i}\mid\mathbf{X}_{i}=\mathbf{x}_{i}).
$$

Once we obtain the estimate $\widehat{\bm{\theta}}$, we can compute $\widehat{m}_{z}(\mathbf{x})=m_{z}(\mathbf{x};\widehat{\bm{\gamma}}_{z})$ and $\widehat{e}(\mathbf{x})=e(\mathbf{x};\widehat{\bm{\beta}})$. The marginal probabilities $p_{z}$ are then estimated by taking sample averages of $\widehat{m}_{z}(\mathbf{x})$ as $\widehat{p}_{1}^{ML}=\frac{1}{N}\sum_{i=1}^{N}\widehat{m}_{1}(\mathbf{X}_{i})$ and $\widehat{p}_{0}^{ML}=\frac{1}{N}\sum_{i=1}^{N}\widehat{m}_{0}(\mathbf{X}_{i})$. Then, the ATE estimate is $\widehat{\tau}^{ML}=\widehat{p}_{1}^{ML}-\widehat{p}_{0}^{ML}$. This estimate is consistent if $m_{z}(\mathbf{x};\bm{\gamma}_{z})$ and $e(\mathbf{x};\bm{\beta})$ are correctly specified for fixed $\eta_{0}(\mathbf{x}),\eta_{1}(\mathbf{x})$.
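To make the structure of the observed-data likelihood concrete, the following Python sketch writes out $\Pr(Y_{i},Z_{i}^{*}\mid\mathbf{X}_{i})$ for the logistic example and evaluates the log-likelihood and the resulting ATE average. It is a sketch under assumptions, not the authors' code: all function names are hypothetical, the recall parameters are taken to be constants `eta0` and `eta1`, and the four cell probabilities are derived by assuming that exposure is only under-reported, with $\eta_{y}=\Pr(Z^{*}=0\mid Z=1,Y=y,\mathbf{X})$, which is the reading consistent with Proposition 3.

```python
import math


def expit(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)
    return z / (1.0 + z)


def cell_probabilities(x, beta, gamma_z, gamma_x, eta0, eta1):
    """Pr(Y=y, Z*=z | X=x) keyed by (y, z), under the assumed recall model."""
    lin = sum(g * xi for g, xi in zip(gamma_x, x))
    e = expit(sum(b * xi for b, xi in zip(beta, x)))  # propensity score e(x)
    m1 = expit(gamma_z + lin)                         # m(1, x; gamma)
    m0 = expit(lin)                                   # m(0, x; gamma)
    return {
        (1, 1): (1 - eta1) * m1 * e,
        (0, 1): (1 - eta0) * (1 - m1) * e,
        (1, 0): m0 * (1 - e) + eta1 * m1 * e,
        (0, 0): (1 - m0) * (1 - e) + eta0 * (1 - m1) * e,
    }


def log_likelihood(data, beta, gamma_z, gamma_x, eta0, eta1):
    """Sum of log Pr(Y_i, Z*_i | X_i); data holds (y, z_star, x) triples.

    Each x is a list of covariates; include 1.0 for an intercept.
    """
    return sum(
        math.log(cell_probabilities(x, beta, gamma_z, gamma_x, eta0, eta1)[(y, z)])
        for y, z, x in data
    )


def ml_ate(xs, gamma_z, gamma_x):
    """Sample average of m1(X_i) - m0(X_i) at given outcome-model parameters."""
    diffs = []
    for x in xs:
        lin = sum(g * xi for g, xi in zip(gamma_x, x))
        diffs.append(expit(gamma_z + lin) - expit(lin))
    return sum(diffs) / len(diffs)
```

Any general-purpose numerical optimizer could then maximize `log_likelihood` over `beta`, `gamma_z`, and `gamma_x` for fixed `eta0` and `eta1`, after which `ml_ate` evaluated at the maximizer gives the plug-in ATE estimate.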

A key challenge with this method is the requirement for researchers to specify $\eta_{0}(\mathbf{x})$ and $\eta_{1}(\mathbf{x})$, which are typically unknown in practice. If these can be estimated from external sources, such estimates can be incorporated into the likelihood function. In the absence of information about $\eta_{0}(\mathbf{x})$ and $\eta_{1}(\mathbf{x})$, an additional assumption might be made that $\eta_{0}(\mathbf{x})=\eta_{0}$ and $\eta_{1}(\mathbf{x})=\eta_{1}$, implying constant values across all $\mathbf{x}$. To address this uncertainty, a sensitivity analysis could be employed, utilizing a plausible range for $\eta_{0}$ and $\eta_{1}$ informed by prior research. This involves testing every possible combination of $(\eta_{0},\eta_{1})$ within the specified range and assessing the impact on the ATE estimate's variation.
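A sensitivity analysis of this kind can be sketched in a few lines of Python. The sketch makes simplifying assumptions that are not in the text: it uses a finite grid in place of every possible combination, it treats the data as a single covariate stratum summarized by four observed probabilities, and it applies the correction of Proposition 3 with constant $\eta_{0}$ and $\eta_{1}$ instead of refitting the likelihood. The function names and the numerical inputs are hypothetical.

```python
def corrected_ate(p11, p10, p01, p00, eta0, eta1):
    """Recall-bias-corrected effect from observed Pr(Y=y, Z*=z), constant etas."""
    exposed_y1 = p11 / (1 - eta1)
    exposed_y0 = p01 / (1 - eta0)
    unexposed_y1 = p10 - eta1 / (1 - eta1) * p11
    unexposed_y0 = p00 - eta0 / (1 - eta0) * p01
    if unexposed_y1 <= 0 or unexposed_y0 <= 0:
        raise ValueError("eta values are incompatible with the observed cells")
    return (exposed_y1 / (exposed_y1 + exposed_y0)
            - unexposed_y1 / (unexposed_y1 + unexposed_y0))


def sensitivity_grid(cells, eta0_range, eta1_range, steps=5):
    """Evaluate the corrected effect on a steps-by-steps grid of (eta0, eta1)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    p11, p10, p01, p00 = cells
    rows = []
    for i in range(steps):
        eta0 = eta0_range[0] + (eta0_range[1] - eta0_range[0]) * i / (steps - 1)
        for k in range(steps):
            eta1 = eta1_range[0] + (eta1_range[1] - eta1_range[0]) * k / (steps - 1)
            rows.append((eta0, eta1, corrected_ate(p11, p10, p01, p00, eta0, eta1)))
    return rows


# Made-up observed probabilities, ordered (p11, p10, p01, p00).
rows = sensitivity_grid((0.12, 0.205, 0.135, 0.54), (0.0, 0.2), (0.0, 0.2))
estimates = [tau for _, _, tau in rows]
print("range of corrected estimates:", min(estimates), max(estimates))
```

The spread between the smallest and largest corrected estimate indicates how strongly the conclusion depends on the assumed recall parameters; a finer grid or a wider range can be substituted when prior research suggests one.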

4.2. Stratification

Stratification can alternatively be used to estimate $\tau$ by balancing the covariate distributions between the exposed and unexposed groups. Compared with the ML method, stratification generally requires fewer assumptions. If strata can be formed that successfully adjust for confounders, the estimation of $\tau$ is straightforward. Assume that there are $I$ strata. Each stratum $i$ contains $n_{i}$ individuals, so there are $N=\sum_{i=1}^{I}n_{i}$ individuals in total. Denote by $ij$ the $j$th individual in stratum $i$ for $j=1,\dots,n_{i}$. If we assume that $(Y_{ij}(1),Y_{ij}(0))\perp Z_{ij}$ holds within each stratum $i$, then the stratum-specific probabilities $p_{1i}=\mathbb{E}_{\mathbf{X}\mid\text{stratum }i}[p_{1}(\mathbf{X})]$ and $p_{0i}=\mathbb{E}_{\mathbf{X}\mid\text{stratum }i}[p_{0}(\mathbf{X})]$ can be identified from the $2\times 2$ table generated by stratum $i$. However, $Z_{ij}^{\ast}$ is observed instead of $Z_{ij}$ due to recall bias. Therefore, the recall bias adjustment using (3) is required. For stratum $i$, assume that Table 1 with $a_{i}^{*},b_{i}^{*},c_{i}^{*},d_{i}^{*}$ is observed.

Table 1. The observed $2\times 2$ contingency table for the $i$th stratum.

| | $Y=1$ | $Y=0$ | Total |
|---|---|---|---|
| Exposed $(Z^{*}=1)$ | $a_{i}^{*}$ | $b_{i}^{*}$ | $a_{i}^{*}+b_{i}^{*}$ |
| Not exposed $(Z^{*}=0)$ | $c_{i}^{*}$ | $d_{i}^{*}$ | $c_{i}^{*}+d_{i}^{*}$ |
| Total | $a_{i}^{*}+c_{i}^{*}$ | $b_{i}^{*}+d_{i}^{*}$ | $n_{i}^{*}$ |

Proposition 4.

Suppose there are $2\times 2$ contingency tables for $I$ strata on $Y$ and $Z^{*}$ with $a_{i}^{*},b_{i}^{*},c_{i}^{*},d_{i}^{*}$ as in Table 1. The stratum-specific ATE $\tau_{i}=\mathbb{E}_{\mathbf{X}\mid\text{stratum }i}[\tau(\mathbf{X})]$ can be estimated for known $\eta_{0}(\mathbf{x}),\eta_{1}(\mathbf{x})$ as

$$
\begin{aligned}
\widehat{\tau}_{i}={}&\frac{\sum_{j=1}^{n_{i}}\frac{Z_{ij}^{*}Y_{ij}}{1-\eta_{1}(\mathbf{x}_{ij})}}{\sum_{j=1}^{n_{i}}\frac{Z_{ij}^{*}Y_{ij}}{1-\eta_{1}(\mathbf{x}_{ij})}+\sum_{j=1}^{n_{i}}\frac{Z_{ij}^{*}(1-Y_{ij})}{1-\eta_{0}(\mathbf{x}_{ij})}}\\
&-\frac{\sum_{j=1}^{n_{i}}(1-Z_{ij}^{*})Y_{ij}-\sum_{j=1}^{n_{i}}Z_{ij}^{*}Y_{ij}\frac{\eta_{1}(\mathbf{x}_{ij})}{1-\eta_{1}(\mathbf{x}_{ij})}}{\sum_{j=1}^{n_{i}}(1-Z_{ij}^{*})Y_{ij}-\sum_{j=1}^{n_{i}}Z_{ij}^{*}Y_{ij}\frac{\eta_{1}(\mathbf{x}_{ij})}{1-\eta_{1}(\mathbf{x}_{ij})}+\sum_{j=1}^{n_{i}}(1-Z_{ij}^{*})(1-Y_{ij})-\sum_{j=1}^{n_{i}}Z_{ij}^{*}(1-Y_{ij})\frac{\eta_{0}(\mathbf{x}_{ij})}{1-\eta_{0}(\mathbf{x}_{ij})}}.
\end{aligned}
$$
end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) end_ARG end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ( 1 - italic_Z start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) italic_Y start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT - ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUPERSCRIPT italic_Z start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT italic_Y start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT divide start_ARG italic_η start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) end_ARG start_ARG 1 - italic_η start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) end_ARG + ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ( 1 - italic_Z start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ) ( 1 - italic_Y start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) - ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUPERSCRIPT italic_Z start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ( 1 - italic_Y start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) divide start_ARG italic_η start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) end_ARG start_ARG 1 - italic_η start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) 
end_ARG end_ARG .

Also, for $0\leq\eta_{0}(\mathbf{x}),\eta_{1}(\mathbf{x})\leq\delta_{i}$, the bound can be estimated as

$$
\frac{a_{i}^{*}}{a_{i}^{*}+b_{i}^{*}\frac{1}{1-\delta_{i}}}-\frac{c_{i}^{*}}{c_{i}^{*}+d_{i}^{*}-b_{i}^{*}\frac{\delta_{i}}{1-\delta_{i}}}\leq\widehat{\tau}_{i}\leq\frac{a_{i}^{*}}{a_{i}^{*}+b_{i}^{*}(1-\delta_{i})}-\frac{c_{i}^{*}-a_{i}^{*}\frac{\delta_{i}}{1-\delta_{i}}}{c_{i}^{*}+d_{i}^{*}-a_{i}^{*}\frac{\delta_{i}}{1-\delta_{i}}}.
$$

This proposition follows directly from Propositions 1 and 3. If $\eta_{0}(\mathbf{x})=\eta_{0}$ and $\eta_{1}(\mathbf{x})=\eta_{1}$, the estimate $\widehat{\tau}_{i}$ simplifies to

$$
\widehat{\tau}_{i}=\frac{a_{i}^{*}/(1-\eta_{1})}{a_{i}^{*}/(1-\eta_{1})+b_{i}^{*}/(1-\eta_{0})}-\frac{c_{i}^{*}-a_{i}^{*}(\eta_{1}/(1-\eta_{1}))}{c_{i}^{*}-a_{i}^{*}(\eta_{1}/(1-\eta_{1}))+d_{i}^{*}-b_{i}^{*}(\eta_{0}/(1-\eta_{0}))}.
$$

The marginal effect can be estimated by the weighted average of these stratum-specific $\widehat{\tau}_{i}$ with weights $s_{i}=n_{i}/N$. Therefore, the ATE is estimated by $\widehat{\tau}^{S}=\sum_{i=1}^{I}s_{i}\widehat{\tau}_{i}=\sum_{i=1}^{I}(n_{i}/N)\widehat{\tau}_{i}$. The bound can be similarly obtained.
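As a minimal sketch of the constant-$\eta$ case, the Python snippet below computes the adjusted stratum-specific estimate from the four cell counts and then takes the size-weighted average over strata. The cell layout ($a^{*}$: $Z^{*}=1,Y=1$; $b^{*}$: $Z^{*}=1,Y=0$; $c^{*}$: $Z^{*}=0,Y=1$; $d^{*}$: $Z^{*}=0,Y=0$) is inferred from the formulas above rather than copied from Table 1, and the function names and counts are hypothetical.

```python
from typing import Sequence, Tuple

Counts = Tuple[float, float, float, float]  # (a*, b*, c*, d*) for one stratum


def stratum_ate(a: float, b: float, c: float, d: float,
                eta0: float, eta1: float) -> float:
    """Recall-bias-adjusted risk difference for one stratum (constant eta0, eta1).

    Assumed layout: a = #(Z*=1, Y=1), b = #(Z*=1, Y=0),
                    c = #(Z*=0, Y=1), d = #(Z*=0, Y=0).
    """
    # Inflate the reported-exposed cells to recover the truly exposed counts.
    exposed_y1 = a / (1 - eta1)
    exposed_y0 = b / (1 - eta0)
    # Remove the under-reporters from the reported-unexposed cells.
    unexposed_y1 = c - a * eta1 / (1 - eta1)
    unexposed_y0 = d - b * eta0 / (1 - eta0)
    if unexposed_y1 < 0 or unexposed_y0 < 0:
        raise ValueError("eta values are incompatible with the observed counts")
    return (exposed_y1 / (exposed_y1 + exposed_y0)
            - unexposed_y1 / (unexposed_y1 + unexposed_y0))


def stratified_ate(tables: Sequence[Counts], eta0: float, eta1: float) -> float:
    """Weighted average of the stratum estimates with weights n_i / N."""
    total = sum(sum(t) for t in tables)
    return sum(sum(t) / total * stratum_ate(*t, eta0, eta1) for t in tables)


# Hypothetical counts for two strata, for illustration only.
tables = [(30, 20, 10, 40), (10, 10, 30, 50)]
print(stratified_ate(tables, eta0=0.0, eta1=0.0))  # no recall bias
print(stratified_ate(tables, eta0=0.1, eta1=0.2))  # differential under-reporting
```

Setting both parameters to zero reproduces the unadjusted stratified risk difference, which gives a quick sanity check before plugging in nonzero values.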

4.2.1. Propensity Score Stratification

Among stratification-based methods, stratification on the propensity score is the most common approach (Rosenbaum and Rubin, 1983). The propensity score is the conditional probability of treatment assignment given the observed covariates, $e(\mathbf{x})=\Pr(Z_{i}=1\mid\mathbf{X}_{i}=\mathbf{x})$. We only have to assume a treatment model to create strata using propensity scores. However, like many stratification-based methods, this method relies on the assumption that stratification achieves covariate balance at least approximately. Furthermore, strata are formed on the basis of the biased estimate $\widehat{e}^{*}(\mathbf{x})$ of $e^{*}(\mathbf{x})=\Pr(Z_{i}^{*}=1\mid\mathbf{X}_{i}=\mathbf{x})$, which uses $Z^{*}$ instead of the unobservable $Z$. It is therefore not feasible to compare the covariate distributions between the truly exposed and unexposed groups. Thus, constructing strata based on the propensity score can be problematic if $\eta_{0}(\mathbf{x})$ and $\eta_{1}(\mathbf{x})$ are significantly different from $0$. However, if $\eta_{0}(\mathbf{x})=\eta_{1}(\mathbf{x})=\eta$ for all $\mathbf{x}\in\mathcal{X}$, then the covariate balance between the $Z^{*}=1$ and $Z^{*}=0$ groups is asymptotically the same as that between the $Z=1$ and $Z=0$ groups since $e^{\ast}(\mathbf{x})=(1-\eta)e(\mathbf{x})$. If the recall bias occurs with the same probability across the $Y=0$ and $Y=1$ groups (i.e., recall bias is not differential), then $e^{\ast}(\mathbf{x})$ is also a balancing score. Thus, we can create valid strata using the biased propensity score obtained from observable variables.
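To see where the identity $e^{\ast}(\mathbf{x})=(1-\eta)e(\mathbf{x})$ comes from, the following short derivation may help. It is a sketch under two assumptions stated here for convenience: that $\eta_{y}(\mathbf{x})=\Pr(Z^{*}=0\mid Z=1,Y=y,\mathbf{X}=\mathbf{x})$, and that there are no false positives, i.e., $\Pr(Z^{*}=1\mid Z=0,\mathbf{X}=\mathbf{x})=0$.

$$
\begin{aligned}
e^{*}(\mathbf{x}) &= \Pr(Z^{*}=1\mid\mathbf{X}=\mathbf{x}) \\
&= \Pr(Z^{*}=1\mid Z=1,\mathbf{X}=\mathbf{x})\,\Pr(Z=1\mid\mathbf{X}=\mathbf{x}) \\
&= \Big[\sum_{y\in\{0,1\}}\{1-\eta_{y}(\mathbf{x})\}\Pr(Y=y\mid Z=1,\mathbf{X}=\mathbf{x})\Big]\,e(\mathbf{x}) \\
&= (1-\eta)\,e(\mathbf{x}) \quad\text{when } \eta_{0}(\mathbf{x})=\eta_{1}(\mathbf{x})=\eta .
\end{aligned}
$$

Because $e^{*}$ is then a strictly increasing function of $e$ whenever $\eta<1$, ranking individuals by $e^{*}$ gives the same ordering as ranking them by $e$, so quantile-based strata are unchanged at the population level.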

4.2.2. Prognostic Score Stratification

Instead of the propensity score, the prognostic score can be used to construct strata (Hansen, 2008). If there is $\Psi(\mathbf{X}_{ij})$ such that $Y_{ij}(0)\perp\!\!\!\perp\mathbf{X}_{ij}\mid\Psi(\mathbf{X}_{ij})$, then we call $\Psi(\cdot)$ the prognostic score. Similar to propensity score stratification, prognostic score stratification permits the estimation of exposure effects within the exposed group. If $(Y_{ij}(0),Y_{ij}(1))\perp\!\!\!\perp\mathbf{X}_{ij}\mid\Psi(\mathbf{X}_{ij})$ is further assumed, then prognostic score stratification is valid for estimating overall exposure effects. For instance, if $m(Z_{ij},\mathbf{X}_{ij};\bm{\gamma})=\exp(\gamma_{z}Z_{ij}+\bm{\gamma}_{\mathbf{X}}^{T}\mathbf{X}_{ij})/\{1+\exp(\gamma_{z}Z_{ij}+\bm{\gamma}_{\mathbf{X}}^{T}\mathbf{X}_{ij})\}$ is assumed, then $\Psi(\mathbf{X}_{ij})=\bm{\gamma}_{\mathbf{X}}^{T}\mathbf{X}_{ij}$ is the prognostic score.

Like propensity score stratification, stratification on the prognostic score leads to a desirable and balanced structure. Since we do not know $\Psi(\mathbf{X}_{ij})$ a priori, it has to be estimated from the data. As mentioned before, if $\eta_{0}(\mathbf{x})=\eta_{1}(\mathbf{x})=\eta$, then the probabilities of recall bias occurrence in the $Y=1$ and $Y=0$ groups are the same. In this case, the prognostic score can be used in stratification while estimating the treatment effect. Since the exposure is under-reported, we know that $Z_{ij}^{*}=1$ always implies $Z_{ij}=1$. We first estimate $\bm{\gamma}_{\mathbf{X}}$ using the data of the $Z_{ij}^{*}=1$ group. Assuming that the recall bias occurs randomly, we then calculate the prognostic scores $\Psi(\mathbf{X}_{ij})=\bm{\gamma}_{\mathbf{X}}^{T}\mathbf{X}_{ij}$ for all individuals. The outcome model should be correctly specified for prognostic score stratification. Even though $\widehat{\tau}^{Prog}$ needs fewer modeling assumptions than $\widehat{\tau}^{ML}$, a modeling assumption is still required. Moreover, score-based stratifications need the further assumption that $\eta_{0}(\mathbf{x})=\eta_{1}(\mathbf{x})=\eta$ to be justified.

4.2.3. Blocking

In Sections 4.2.1 and 4.2.2, proper scores based on modeling assumptions are required to create valid strata. Also, score-based stratifications could be problematic if $\eta_{0}(\mathbf{x})$ and $\eta_{1}(\mathbf{x})$ differ significantly. Stratification based on the propensity score requires a correctly specified treatment model, and the outcome model must be correctly specified to create strata with a prognostic score. In contrast, the blocking method does not require any model assumption. Our goal is to make the covariates $\mathbf{X}_{ij}$ within each block $i$ similar. If the covariates in each block are almost the same, then we assume that $(Y_{ij}(0),Y_{ij}(1))\perp\!\!\!\perp Z_{ij}$ holds in each block $i$. Karmakar et al. (2021) used the blockingChallenge package in R to build blocks.

Suppose that there are $N=Ik$ individuals. To make $I$ blocks of size $k$, $I$ individuals are first randomly chosen as template individuals, one for each block. The remaining $I(k-1)$ individuals are then matched to the template individuals using optimal matching at a ratio of $(k-1):1$. After this first blocking, the individual who is the most distant from the remaining $k-1$ individuals is separated from each block. Setting these $I$ individuals as the new template individuals, optimal matching is used again to build $I$ blocks. This process is repeated until no changes occur, yielding $I$ blocks of size $k$ that serve as strata with small within-block distances.
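The iterative procedure above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: it replaces optimal matching with a greedy nearest-template assignment, uses Euclidean distance on the raw covariates, takes "most distant" to mean the largest total distance to the other block members, and caps the number of passes. It is not the blockingChallenge implementation, and all function names are hypothetical.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def greedy_assign(X: Sequence[Point], templates: Sequence[int], k: int) -> List[List[int]]:
    """Fill each template's block up to size k, closest pairs first.

    Greedy stand-in for optimal (k-1):1 matching; not guaranteed to be optimal.
    """
    blocks = {t: [t] for t in templates}
    pairs = sorted(
        (math.dist(X[u], X[t]), u, t)
        for u in range(len(X)) if u not in blocks
        for t in templates
    )
    assigned = set()
    for _, u, t in pairs:
        if u not in assigned and len(blocks[t]) < k:
            blocks[t].append(u)
            assigned.add(u)
    return [sorted(b) for b in blocks.values()]


def most_distant(X: Sequence[Point], block: Sequence[int]) -> int:
    """Member with the largest total distance to the rest of its block."""
    return max(block, key=lambda u: sum(math.dist(X[u], X[v]) for v in block))


def build_blocks(X: Sequence[Point], k: int, seed: int = 0, max_iter: int = 20) -> List[List[int]]:
    """Iterate template selection and re-assignment until the blocks stop changing."""
    n = len(X)
    if n % k != 0:
        raise ValueError("N must equal I * k")
    rng = random.Random(seed)
    templates = rng.sample(range(n), n // k)  # random initial templates
    blocks = greedy_assign(X, templates, k)
    for _ in range(max_iter):
        templates = [most_distant(X, b) for b in blocks]
        new_blocks = greedy_assign(X, templates, k)
        if sorted(new_blocks) == sorted(blocks):
            break  # no change: converged
        blocks = new_blocks
    return sorted(blocks)


# Hypothetical covariates: two well-separated clusters of three individuals.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(build_blocks(pts, k=3))
```

On this toy input the sketch should group each cluster into its own block; with real covariates, an optimal matching routine would replace the greedy step.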

The blocking method does not require any model assumption. However, the covariates $\mathbf{X}_{ij}$ in each block $i$ need to be similar. When achieving covariate balance is difficult or overlap is weak, such blocks cannot be obtained. If covariate balance within blocks can be easily achieved, the blocking method is likely to provide a reliable estimator. Unlike $\widehat{\tau}^{ML}$, $\widehat{\tau}^{Prop}$, and $\widehat{\tau}^{Prog}$, an advantage of $\widehat{\tau}^{B}$ is that no modeling assumptions are needed. This stratification technique therefore remains robust under model misspecification. We examine the performance of these estimators in Section 5.

5. Simulation Studies

We conduct simulation studies to compare the performance of the proposed methods: (1) ML, (2) propensity score stratification, (3) prognostic score stratification, and (4) blocking. We consider various model specification scenarios to examine how they can successfully recover the true treatment effect under different model misspecification cases. In addition, we include Naïve estimators based on inverse probability weighting (IPW) and outcome regression (OR), assuming no misclassification error.

We consider four independent covariates, $\mathbf{X}_{i}=(X_{i1},X_{i2},X_{i3},X_{i4})$. $X_{i1}$ and $X_{i2}$ are binary covariates, whereas $X_{i3}$ and $X_{i4}$ are continuous covariates. We also consider four simulation scenarios where the exposure and outcome models are correctly specified or misspecified: (i) (cor, cor), (ii) (cor, mis), (iii) (mis, cor), and (iv) (mis, mis). For example, (mis, cor) means the exposure model is misspecified, but the outcome model is correctly specified. We randomly generate the exposure $Z_{i}$ and potential outcomes $(Y_{i}(0),Y_{i}(1))$ of each individual depending on the model specification scenario. However, due to recall bias, we cannot observe the true exposure $Z_{i}$; we observe the biased exposure $Z_{i}^{\ast}$ instead. We assume that the exposure is under-reported for this simulation study. We generate $Z_{i}^{\ast}$ based on the observed outcome $Y_{i}=Y_{i}(1)Z_{i}+Y_{i}(0)(1-Z_{i})$ (see the Web Appendix for the detailed simulation settings).
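The step of generating the reported exposure from the true exposure and the observed outcome might look like the sketch below. It assumes pure under-reporting (truly unexposed subjects never report exposure) and constant $\eta_{0},\eta_{1}$; the function name, seed, and toy data are hypothetical, and the actual data-generating models are those in the Web Appendix, not the ones used here.

```python
import random


def reported_exposure(z: int, y: int, eta0: float, eta1: float,
                      rng: random.Random) -> int:
    """Draw the self-reported exposure Z* given true exposure z and outcome y."""
    if z == 0:
        return 0  # under-reporting only: unexposed subjects never report exposure
    eta = eta1 if y == 1 else eta0  # under-reporting probability depends on Y
    return 0 if rng.random() < eta else 1


rng = random.Random(2024)
# Hypothetical truly exposed subjects, half with Y = 1 and half with Y = 0.
truth = [(1, 1)] * 5000 + [(1, 0)] * 5000
reports = [reported_exposure(z, y, eta0=0.1, eta1=0.2, rng=rng) for z, y in truth]
miss_y1 = 1 - sum(reports[:5000]) / 5000   # empirical under-reporting rate when Y = 1
miss_y0 = 1 - sum(reports[5000:]) / 5000   # empirical under-reporting rate when Y = 0
print(round(miss_y1, 3), round(miss_y0, 3))
```

The two printed rates should land near $\eta_{1}=0.2$ and $\eta_{0}=0.1$, which is the differential pattern used in the second simulation setting.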

We compare the methods by how well they recover the true ATE under the different model misspecification scenarios. In addition to model misspecification, we consider two sample sizes ($N=1000$ or $2000$) and two constant recall bias parameter settings ($(\eta_{0}(\mathbf{x}),\eta_{1}(\mathbf{x}))=(0.1,0.1)$ or $(0.1,0.2)$) throughout this simulation. We fix the stratum size to 50 in the stratification methods with the nearest-neighbor combination method. Table 2 shows the simulation results obtained from 1000 simulated datasets.

Naïve estimators that assume no misclassification error perform poorly across the model misspecification scenarios, particularly under differential recall bias. This highlights the need to adjust for the bias that arises when the potential for exposure misclassification is overlooked. Among the proposed estimators, if both the exposure and potential outcome models are correctly specified, then $\widehat{\tau}^{ML}$ is the best estimator. $\widehat{\tau}^{Prop}$ and $\widehat{\tau}^{Prog}$ show similar performance in each scenario. Notably, even in the treatment model misspecification scenario, stratification based on the propensity score shows slightly better results than stratification based on the prognostic score. Score-based stratifications perform reasonably well even when $\eta_{0}$ and $\eta_{1}$ differ. $\widehat{\tau}^{B}$ provides the least biased estimate when both models are misspecified. In contrast, $\widehat{\tau}^{ML}$ shows the worst performance in the (mis, mis) scenario. As expected, the model dependency is smallest for the blocking method and largest for the ML method, which explains the good result of the blocking estimator and the poor result of the ML estimator in the worst model misspecification scenario. Although it requires only weak assumptions, the blocking estimator performs well in every model misspecification scenario. If the models are misspecified, $\widehat{\tau}^{ML}$, $\widehat{\tau}^{Prop}$, and $\widehat{\tau}^{Prog}$ are no longer consistent estimators of $\tau$.

Table 2. Performance of the estimation methods for recovering the average treatment effect. Six methods are compared, (1) Naïve inverse probability weighting (IPW), (2) Naïve outcome regression (OR), (3) maximum likelihood, (4) stratification based on propensity scores, (5) stratification based on prognostic scores, and (6) blocking. Absolute bias and root mean square error (RMSE) are reported, with all values multiplied by 100.
| $(\eta_{0},\eta_{1})$ | Scenario | $N$ | Naïve IPW | Naïve OR | ML | Prop | Prog | Block |
|---|---|---|---|---|---|---|---|---|
| (0.1, 0.1) | (cor, cor) | 1000 | 0.553 (4.802) | 0.584 (4.629) | 0.040 (3.106) | 0.065 (3.338) | 0.399 (3.381) | 0.993 (3.391) |
| (0.1, 0.1) | (cor, cor) | 2000 | 0.579 (3.449) | 0.568 (3.313) | 0.004 (2.117) | 0.016 (2.233) | 0.268 (2.291) | 0.570 (2.349) |
| (0.1, 0.1) | (cor, mis) | 1000 | 0.310 (4.842) | 0.228 (4.532) | 0.238 (3.097) | 0.056 (3.289) | 0.143 (3.246) | 0.044 (3.226) |
| (0.1, 0.1) | (cor, mis) | 2000 | 0.529 (3.560) | 0.266 (3.336) | 0.201 (2.174) | 0.008 (2.284) | 0.157 (2.280) | 0.049 (2.289) |
| (0.1, 0.1) | (mis, cor) | 1000 | 0.352 (4.044) | 0.383 (4.025) | 0.031 (2.733) | 0.063 (2.879) | 0.075 (2.834) | 0.150 (2.913) |
| (0.1, 0.1) | (mis, cor) | 2000 | 0.610 (2.900) | 0.636 (2.892) | 0.015 (1.926) | 0.025 (1.954) | 0.046 (1.977) | 0.106 (2.027) |
| (0.1, 0.1) | (mis, mis) | 1000 | 2.050 (4.557) | 2.014 (4.531) | 2.952 (4.023) | 2.540 (3.837) | 2.658 (3.896) | 1.860 (3.432) |
| (0.1, 0.1) | (mis, mis) | 2000 | 2.273 (3.649) | 2.235 (3.613) | 3.106 (3.659) | 2.594 (3.318) | 2.782 (3.444) | 1.384 (2.457) |
| (0.1, 0.2) | (cor, cor) | 1000 | 4.843 (7.066) | 4.910 (6.922) | 0.061 (3.274) | 0.050 (3.567) | 0.321 (3.517) | 0.877 (3.565) |
| (0.1, 0.2) | (cor, cor) | 2000 | 4.523 (5.717) | 4.654 (5.754) | 0.051 (2.245) | 0.012 (2.334) | 0.207 (2.404) | 0.371 (2.431) |
| (0.1, 0.2) | (cor, mis) | 1000 | 4.608 (6.993) | 4.659 (6.733) | 0.271 (3.262) | 0.337 (3.541) | 0.465 (3.484) | 0.281 (3.397) |
| (0.1, 0.2) | (cor, mis) | 2000 | 4.881 (6.076) | 4.765 (5.884) | 0.026 (2.302) | 0.004 (2.390) | 0.233 (2.381) | 0.108 (2.445) |
| (0.1, 0.2) | (mis, cor) | 1000 | 5.155 (6.475) | 5.148 (6.447) | 0.062 (2.881) | 0.137 (2.998) | 0.180 (2.978) | 0.062 (3.073) |
| (0.1, 0.2) | (mis, cor) | 2000 | 4.922 (5.817) | 4.960 (5.846) | 0.086 (2.081) | 0.027 (2.139) | 0.021 (2.099) | 0.034 (2.208) |
| (0.1, 0.2) | (mis, mis) | 1000 | 2.334 (4.869) | 2.397 (4.891) | 3.370 (4.443) | 2.916 (4.198) | 3.006 (4.296) | 2.164 (3.743) |
| (0.1, 0.2) | (mis, mis) | 2000 | 2.255 (3.724) | 2.288 (3.733) | 3.129 (3.725) | 2.663 (3.407) | 2.778 (3.490) | 1.403 (2.557) |

6. Data Example: Child Abuse and Adult Anger

In this section, we apply the causal inference framework to the motivating example of our research, which examines the causal relationship between childhood abuse and adult anger. We consider a retrospective cohort study to address the question, “Does child abuse by either parent increase the likelihood of adult anger?” This study uses the publicly available 1993-1994 sibling survey of the Wisconsin Longitudinal Study (WLS). The treatment is defined as the presence or absence of childhood abuse by either the father or mother, and the outcome is a binary indicator of whether the respondent exhibits a high anger score in adulthood. See Springer et al. (2007) and Small et al. (2013) for additional details regarding the WLS data.

Springer et al. (2007) indicated that the results might be affected by a tendency to under-report abuse: adults may not report childhood abuse even when it occurred. With this information, we applied (1) ML, (2) propensity score stratification, (3) prognostic score stratification, and (4) blocking to estimate the ATE. For the ML method, we use a logistic outcome regression with the seven covariates and no interaction terms. The same exposure model is used for propensity score stratification, whereas prognostic score stratification is based on the same outcome model. Ten strata are constructed using the quantiles of the estimated score. The blocking method uses a block size of 20. Fergusson et al. (2000) asserted that a substantial proportion of false negative responses (approximately 50%) occurs when reporting childhood abuse, whereas false positive responses are absent. Based on this study, we consider $0 \leq \eta_0(\mathbf{x}), \eta_1(\mathbf{x}) \leq \delta$ with $\delta \leq 0.5$ and compute the bounds as $\delta$ increases to 0.5.
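
As an illustration of the stratum construction described above, the following Python sketch splits units into ten strata at the deciles of an estimated score. It is not the authors' R implementation: the function name `quantile_strata` and the simulated scores are hypothetical, and any estimated propensity or prognostic score could be supplied in their place.

```python
import numpy as np

def quantile_strata(scores, n_strata=10):
    """Assign each unit a stratum label 0..n_strata-1 by score quantiles."""
    scores = np.asarray(scores, dtype=float)
    # Interior cut points, e.g. the 10%, 20%, ..., 90% quantiles for ten strata.
    probs = np.linspace(0.0, 1.0, n_strata + 1)[1:-1]
    cuts = np.quantile(scores, probs)
    # For each score, count how many cut points lie at or below it.
    return np.searchsorted(cuts, scores, side="right")

# Hypothetical estimated scores for 1000 units (an assumption for illustration).
rng = np.random.default_rng(0)
scores = rng.uniform(size=1000)
strata = quantile_strata(scores)
counts = np.bincount(strata, minlength=10)
print(counts)  # number of units per stratum
```

With continuous scores, the strata are of nearly equal size, which is the usual motivation for cutting at quantiles rather than at fixed score values.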

Figure 1. (a) Bounds of the average treatment effect (ATE) with $0 \leq \eta_0, \eta_1 \leq \delta$, (b) bounds of the ATE with $0 \leq \eta_0 \leq \eta_1 \leq \delta$, (c) point estimates of the ATE along the line $\eta_0 = \eta_1$, and (d) point estimates of the ATE from the maximum likelihood (ML) and blocking methods with 95% bootstrapped confidence intervals.

As shown in Figure 1(a), the bounds become wider as $\delta$ increases. The bounds from all four methods stay above zero until $\delta = 0.22$. The four methods show a similar pattern, but the blocking method is the least sensitive to $\delta$. Since the prognostic score stratification is similar to the blocking, this figure may indicate that the propensity score model is misspecified. Moreover, Deblinger and Runyon (2005) stated that individuals who have high anger scores are more likely to experience recall bias when reporting childhood abuse experiences, that is, $\eta_0(\mathbf{x}) \leq \eta_1(\mathbf{x})$. This allows us to tighten the lower bounds of the ATE using Proposition 2, as presented in Figure 1(b). Hence, under this assumption, we can conclude that childhood abuse has a causal effect on high anger scores in individuals, even in the presence of differential recall bias and without knowing $\eta_0(\mathbf{x})$ and $\eta_1(\mathbf{x})$.

Table 3. The effects of recall bias for six values of $\eta_0 = \eta_1$. The estimates and 95% bootstrap confidence intervals are displayed for the maximum likelihood (ML) and stratification methods.
Method
$(\eta_0, \eta_1)$ ML Prop Prog Block
(0.0, 0.0) 0.067 (± 0.052) 0.066 (± 0.035) 0.088 (± 0.055) 0.090 (± 0.081)
(0.1, 0.1) 0.068 (± 0.053) 0.068 (± 0.035) 0.089 (± 0.055) 0.091 (± 0.082)
(0.2, 0.2) 0.070 (± 0.055) 0.069 (± 0.035) 0.091 (± 0.056) 0.093 (± 0.084)
(0.3, 0.3) 0.073 (± 0.057) 0.072 (± 0.034) 0.093 (± 0.058) 0.096 (± 0.086)
(0.4, 0.4) 0.076 (± 0.061) 0.075 (± 0.033) 0.097 (± 0.059) 0.100 (± 0.089)
(0.5, 0.5) 0.081 (± 0.066) 0.082 (± 0.033) 0.103 (± 0.062) 0.106 (± 0.093)

We may further narrow the bounds of the ATE if we make a stronger assumption. Fergusson et al. (2000) suggested that the probabilities of recall bias may not differ significantly based on an adult’s anger score, which implies that recall bias may not be strongly related to an individual’s level of anger. This allows us to assume $\eta_0(\mathbf{x}) = \eta_1(\mathbf{x})$. Robins et al. (1985) pointed out that reporters’ demographic characteristics, such as sex, age, and social class, have a minimal impact on recall bias, which further allows us to assume $\eta_0(\mathbf{x}) = \eta_1(\mathbf{x}) = \eta$. Restricting attention to $0 \leq \eta_0 = \eta_1 \leq 0.5$ requires the strongest assumption, but it allows the results to be summarized succinctly. The estimates for various values of $\eta_0$ and $\eta_1$ are shown in Table 3. In this case, variance estimates can also be obtained, so we can provide confidence intervals for various $\delta$ values. Figure 1(c) shows the estimates of the ATE along the line $\eta_0 = \eta_1$. All the estimates increase as $\eta_0 = \eta_1$ increases. Furthermore, the 95% CIs of all methods do not contain 0. In Figure 1(d), we focus on the results of the ML and blocking estimators when $\eta_0 = \eta_1$. Even though the confidence interval of the blocking estimator is wider than that of the ML estimator, possibly because it relies on weaker assumptions, it still stays above 0. These results imply that the under-reporting issue does not alter the initial conclusion; on the contrary, it strengthens the conclusion that there is significant evidence that child abuse increases the adult anger score.
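
The claim that no interval contains 0 can be checked directly from Table 3. The short Python sketch below does this arithmetic, under the assumption that each "±" entry is the half-width of the 95% bootstrap interval; the variable names are ours and the numbers are copied from the table.

```python
# (estimate, half-width) pairs from Table 3, for eta_0 = eta_1 = 0.0, 0.1, ..., 0.5.
# Assumption: the "±" value is the half-width of the 95% bootstrap interval.
table3 = {
    "ML":    [(0.067, 0.052), (0.068, 0.053), (0.070, 0.055),
              (0.073, 0.057), (0.076, 0.061), (0.081, 0.066)],
    "Prop":  [(0.066, 0.035), (0.068, 0.035), (0.069, 0.035),
              (0.072, 0.034), (0.075, 0.033), (0.082, 0.033)],
    "Prog":  [(0.088, 0.055), (0.089, 0.055), (0.091, 0.056),
              (0.093, 0.058), (0.097, 0.059), (0.103, 0.062)],
    "Block": [(0.090, 0.081), (0.091, 0.082), (0.093, 0.084),
              (0.096, 0.086), (0.100, 0.089), (0.106, 0.093)],
}

# Lower end of each interval: estimate minus half-width.
lower = {m: [est - hw for est, hw in rows] for m, rows in table3.items()}

for method, bounds in lower.items():
    print(method, "smallest lower limit:", round(min(bounds), 3),
          "| all above 0:", all(b > 0 for b in bounds))
```

The blocking intervals come closest to zero, which agrees with the observation that the blocking estimator has the widest intervals.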

Figure 2. Contour plots of the estimated ATE over values of $(\eta_0, \eta_1)$ using (a) maximum likelihood, (b) propensity score stratification, (c) prognostic score stratification, and (d) blocking methods in the region $0 \leq \eta_0, \eta_1 \leq 0.5$.

We also conduct a sensitivity analysis of recall bias with parameters $0 \leq \eta_0, \eta_1 \leq 0.5$. Figure 2 shows contour plots of the estimated ATEs for the values of $\eta_0$ and $\eta_1$ in this region. This figure reveals that most of the estimates are above 0. In particular, for the blocking method, estimates are below 0 only in the small region $\eta_0 \geq \frac{1}{2}\eta_1 + 0.4$ with $0 \leq \eta_1 \leq 0.2$.
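
To make the idea of a sensitivity grid concrete, the following Python sketch evaluates a bias-corrected risk difference over a grid of $(\eta_0, \eta_1)$ for a single 2x2 table. It is a simplified illustration, not the estimators used for Figure 2. We assume only under-reporting (no false positives), with $\eta_y$ the probability that a truly exposed unit with outcome $y$ reports no exposure, and we assume $a^*, b^*$ count reported-exposed units with outcome 1 and 0 and $c^*, d^*$ count reported-unexposed units with outcome 1 and 0. The counts themselves are hypothetical.

```python
import numpy as np

def corrected_risk_difference(a_star, b_star, c_star, d_star, eta0, eta1):
    """Risk difference after undoing assumed under-reporting in one 2x2 table."""
    # Truly exposed units: observed reporters scaled up by the reporting rate.
    a = a_star / (1.0 - eta1)
    b = b_star / (1.0 - eta0)
    # Truly unexposed units: remove exposed units that reported no exposure.
    c = c_star - a_star * eta1 / (1.0 - eta1)
    d = d_star - b_star * eta0 / (1.0 - eta0)
    if c < 0 or d < 0 or c + d == 0:
        return float("nan")  # (eta0, eta1) incompatible with the observed table
    return a / (a + b) - c / (c + d)

# Hypothetical observed counts (an assumption for illustration only).
a_star, b_star, c_star, d_star = 30, 70, 180, 720

etas = np.linspace(0.0, 0.5, 6)
# Rows index eta0, columns index eta1.
grid = np.array([[corrected_risk_difference(a_star, b_star, c_star, d_star, e0, e1)
                  for e1 in etas] for e0 in etas])
print(np.round(grid, 3))
```

In this toy table the corrected estimate grows along the diagonal $\eta_0 = \eta_1$ and turns negative at $(\eta_0, \eta_1) = (0.5, 0)$, that is, when under-reporting is assumed to be much stronger among units without the outcome.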

7. Discussion

In this paper, we derived bounds for the ATE and incorporated knowledge from prior research in order to narrow them. We also proposed several approaches to estimate the bounds. Most of the discussion focused on the ATE, but the same argument can be applied to other measures such as the average treatment effect on the treated (ATT), the risk ratio, or the odds ratio. The Web Appendix includes more detailed discussions of these measures.

Another difficulty in applying the proposed stratification methods is computational. For instance, in some cases, both $a_i^*$ and $b_i^*$ are zero. To avoid such computational issues, we also propose a nearest-neighbor combination method that improves the stability of the estimates. This method is discussed in the Web Appendix.

Finally, one limitation is that we cannot assess the covariate balance before adjustment since the exposure variable is misclassified. We can consider an indirect approach to check the covariate balance under the constant recall bias assumption. If stratification is successful, we may assume that the covariate distributions of the treated and control groups are equal, at least asymptotically. Also, if the magnitude of recall bias is independent of the covariates, we are able to assess the balance with the biased treatment $Z^*$. First, we calculate the bias-corrected numbers of treated and control units as $a_i^*/(1-\eta_1) + b_i^*/(1-\eta_0)$ and $c_i^* + d_i^* - a_i^*\,\eta_1/(1-\eta_1) - b_i^*\,\eta_0/(1-\eta_0)$, respectively. Second, under the assumption that the covariate vectors within each stratum are similar, we compute the average covariate vector for each stratum. Every individual in the same stratum shares the same average vector, which is not restrictive since we ultimately compare weighted means. Finally, we compare the absolute standardized mean difference between the treated and control groups using the corrected weights based on this assumption. This new technique is also discussed in the Web Appendix.
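
The first two steps of this balance check can be sketched in Python as follows. The sketch uses the bias-corrected group sizes given above and then forms weighted covariate means across strata; it stops short of the standardized mean difference, whose exact form is given in the Web Appendix. The stratum counts, the single covariate, and the values of $(\eta_0, \eta_1)$ are hypothetical, and the function name is ours.

```python
def corrected_group_sizes(a_star, b_star, c_star, d_star, eta0, eta1):
    """Bias-corrected numbers of treated and control units in one stratum."""
    treated = a_star / (1 - eta1) + b_star / (1 - eta0)
    control = (c_star + d_star
               - a_star * eta1 / (1 - eta1)
               - b_star * eta0 / (1 - eta0))
    return treated, control

# Hypothetical strata: (a*, b*, c*, d*, stratum mean of one covariate).
strata = [(10, 30, 20, 140, 0.2),
          (20, 40, 40, 100, 0.8)]
eta0, eta1 = 0.2, 0.5  # assumed recall-bias parameters

sizes = [corrected_group_sizes(a, b, c, d, eta0, eta1) for a, b, c, d, _ in strata]
xbars = [x for *_, x in strata]

# Weighted covariate means, using the corrected group sizes as weights.
treated_total = sum(t for t, _ in sizes)
control_total = sum(c for _, c in sizes)
treated_mean = sum(t * x for (t, _), x in zip(sizes, xbars)) / treated_total
control_mean = sum(c * x for (_, c), x in zip(sizes, xbars)) / control_total

print(sizes)
print(round(treated_mean, 4), round(control_mean, 4))
```

The correction only moves units from the control group to the treated group, so the corrected sizes in each stratum still sum to the observed stratum size.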

Acknowledgments

This work was supported by NIH grants (R01ES026217, R01MD012769, R01ES028033, 1R01ES030616, 1R01AG066793-01R01, 1R01ES029950, R01ES028033-S1), Alfred P. Sloan Foundation (G-2020-13946), Vice Provost for Research at Harvard University (Climate Change Solutions Fund), the New Faculty Startup Fund from Seoul National University, the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2021R1C1C1012750), and the Global-LAMP Program of the National Research Foundation of Korea (NRF) grant funded by the Ministry of Education (No. RS-2023-00301976).

Supplementary Materials

Supplementary materials containing the Web Appendices are available online. The R code is available at https://github.com/suhwanbong121/recall_bias_observational_study.

References

  • Armstrong, (1985) Armstrong, B. (1985). Measurement error in the generalised linear model. Communications in Statistics-Simulation and Computation, 14(3):529–544.
  • Babanezhad et al., (2010) Babanezhad, M., Vansteelandt, S., and Goetghebeur, E. (2010). Comparison of causal effect estimators under exposure misclassification. Journal of Statistical Planning and Inference, 140(5):1306–1319.
  • Braun et al., (2017) Braun, D., Gorfine, M., Parmigiani, G., Arvold, N. D., Dominici, F., and Zigler, C. (2017). Propensity scores with misclassified treatment assignment: a likelihood-based adjustment. Biostatistics, 18(4):695–710.
  • Bross, (1954) Bross, I. (1954). Misclassification in 2 x 2 tables. Biometrics, 10(4):478–486.
  • Carroll et al., (1985) Carroll, R. J., Gallo, P., and Gleser, L. J. (1985). Comparison of least squares and errors-in-variables regression, with special reference to randomized analysis of covariance. Journal of the American Statistical Association, 80(392):929–932.
  • Carroll et al., (1995) Carroll, R. J., Ruppert, D., and Stefanski, L. A. (1995). Measurement error in nonlinear models. New York: Chapman and Hall.
  • Carroll and Stefanski, (1990) Carroll, R. J. and Stefanski, L. A. (1990). Approximate quasi-likelihood estimation in models with surrogate predictors. Journal of the American Statistical Association, 85(411):652–663.
  • Cochran, (1968) Cochran, W. G. (1968). Errors of measurement in statistics. Technometrics, 10(4):637–666.
  • Deblinger and Runyon, (2005) Deblinger, E. and Runyon, M. K. (2005). Understanding and treating feelings of shame in children who have experienced maltreatment. Child Maltreatment, 10(4):364–376.
  • Fergusson et al., (2000) Fergusson, D. M., Horwood, L. J., and Woodward, L. J. (2000). The stability of child abuse reports: a longitudinal study of the reporting behaviour of young adults. Psychological Medicine, 30(3):529–544.
  • Fuller, (1980) Fuller, W. A. (1980). Properties of some estimators for the errors-in-variables model. The Annals of Statistics, 8(2):407–422.
  • Gleser, (1990) Gleser, L. J. (1990). Improvements of the naive approach to estimation in nonlinear errors-in-variables regression models. Contemporary Mathematics, 112:99–114.
  • Gravel and Platt, (2018) Gravel, C. A. and Platt, R. W. (2018). Weighted estimation for confounded binary outcomes subject to misclassification. Statistics in Medicine, 37(3):425–436.
  • Hansen, (2008) Hansen, B. B. (2008). The prognostic analogue of the propensity score. Biometrika, 95(2):481–488.
  • Imai and Yamamoto, (2010) Imai, K. and Yamamoto, T. (2010). Causal inference with differential measurement error: Nonparametric identification and sensitivity analysis. American Journal of Political Science, 54(2):543–560.
  • Karmakar et al., (2021) Karmakar, B., Small, D. S., and Rosenbaum, P. R. (2021). Reinforced designs: Multiple instruments plus control groups as evidence factors in an observational study of the effectiveness of Catholic schools. Journal of the American Statistical Association, 116(533):82–92.
  • Lindley, (1953) Lindley, D. (1953). Estimation of a functional relationship. Biometrika, 40(1/2):47–49.
  • Lockwood and McCaffrey, (2016) Lockwood, J. R. and McCaffrey, D. F. (2016). Matching and weighting with functions of error-prone covariates for causal inference. Journal of the American Statistical Association, 111(516):1831–1839.
  • Lord, (1960) Lord, F. M. (1960). Large-sample covariance analysis when the control variable is fallible. Journal of the American Statistical Association, 55(290):307–321.
  • McCaffrey et al., (2013) McCaffrey, D. F., Lockwood, J. R., and Setodji, C. M. (2013). Inverse probability weighting with error-prone covariates. Biometrika, 100(3):671–680.
  • Raphael, (1987) Raphael, K. (1987). Recall bias: a proposal for assessment and control. International Journal of Epidemiology, 16(2):167–170.
  • Robins et al., (1985) Robins, L. N., Schoenberg, S. P., Holmes, S. J., Ratcliff, K. S., Benham, A., and Works, J. (1985). Early home environment and retrospective recall: A test for concordance between siblings with and without psychiatric disorders. American Journal of Orthopsychiatry, 55(1):27–41.
  • Rosenbaum and Rubin, (1983) Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41–55.
  • Rosner et al., (1990) Rosner, B., Spiegelman, D., and Willett, W. C. (1990). Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. American Journal of Epidemiology, 132(4):734–745.
  • Rosner et al., (1989) Rosner, B., Willett, W., and Spiegelman, D. (1989). Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Statistics in Medicine, 8(9):1051–1069.
  • Rothman, (2012) Rothman, K. J. (2012). Epidemiology: an introduction. New York: Oxford University Press.
  • Rothman et al., (2008) Rothman, K. J., Greenland, S., Lash, T. L., et al. (2008). Modern epidemiology, volume 3. Wolters Kluwer Health/Lippincott Williams & Wilkins Philadelphia.
  • Rubin, (1974) Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5):688–701.
  • Rubin, (1980) Rubin, D. B. (1980). Randomization analysis of experimental data: The Fisher randomization test comment. Journal of the American Statistical Association, 75(371):591–593.
  • Small et al., (2013) Small, D. S., Cheng, J., Halloran, M. E., and Rosenbaum, P. R. (2013). Case definition and design sensitivity. Journal of the American Statistical Association, 108(504):1457–1468.
  • Springer et al., (2007) Springer, K. W., Sheridan, J., Kuo, D., and Carnes, M. (2007). Long-term physical and mental health consequences of childhood physical abuse: Results from a large population-based sample of men and women. Child Abuse & Neglect, 31(5):517–530.
  • Stefanski and Carroll, (1985) Stefanski, L. A. and Carroll, R. J. (1985). Covariate measurement error in logistic regression. The Annals of Statistics, 13(4):1335–1351.