License: CC BY-NC-ND 4.0
arXiv:2312.07697v1 [stat.ME] 12 Dec 2023

A Class of Computational Methods to Reduce Selection Bias when Designing Phase 3 Clinical Trials

Tianyu Zhan
Data and Statistical Sciences, AbbVie Inc., North Chicago, IL, USA
Tianyu Zhan is an employee of AbbVie Inc. Corresponding author email address: [email protected].
Abstract

When designing confirmatory Phase 3 studies, one usually evaluates one or more efficacious and safe treatment option(s) based on data from previous studies. However, several retrospective research articles reported the phenomenon of “diminished treatment effect in Phase 3” based on many case studies. Even under basic assumptions, it was shown that the commonly used estimator could substantially overestimate the efficacy of selected group(s). As alternatives, we propose a class of computational methods to reduce estimation bias and mean squared error (MSE) with a broader scope of multiple treatment groups and flexibility to accommodate summary results by group as input. Based on simulation studies and a real data example, we provide practical implementation guidance for this class of methods under different scenarios. For more complicated problems, our framework can serve as a starting point with additional layers built in. Proposed methods can also be widely applied to other selection problems.


Keywords: Bias correction; Estimation; Higher-order Bootstrap; Jackknife.

1 Introduction

In clinical drug development, confirmatory Phase 3 studies are usually conducted to comprehensively evaluate the safety and efficacy of the study drug after exploratory Phase 2 studies (ICH Guideline E8, 2022). To properly design Phase 3 studies, one needs to accurately characterize the efficacy profile of one or more selected efficacious and safe treatment option(s) to inform many key decisions, for example, Go/No-Go and sample size calculation. However, quite a few retrospective research articles reported the phenomenon of “diminished treatment effect in Phase 3”: FDA studied 22 recent cases in which promising Phase 2 results were not confirmed in Phase 3 and found 21 of them were due to lack of efficacy (Food and Drug Administration, 2017); treatment effect sizes of progression-free survival (PFS) were on average 26% larger in Phase 2 as compared with Phase 3 in 57 pairs of oncology studies (Liang et al., 2019); 35 out of 43 Phase 3 studies of chemotherapy in advanced solid malignancies had lower response rates than preceding Phase 2 studies (Zia et al., 2005).

This question concerning the efficacy gap between previous studies and Phase 3 studies is important, but is also challenging to resolve. Several factors may contribute to this gap, for example, temporal drift due to improvements in the standard of care or other factors (Saville et al., 2022), patient heterogeneity across studies (Liang et al., 2019), and variability in results due to limited sample sizes. As a starting point, we consider a typical approach of directly choosing the treatment group(s) with the best outcome(s) based on previous studies, and using the corresponding results as assumptions for Phase 3 design. Under a basic scenario where the true response mean of a selected group is the same across studies, this estimator may substantially overestimate the response mean in Phase 3, leaving the Phase 3 study with insufficient power, as further discussed in later sections, including a toy simulation in Table 1. As discussed in Section 6, this framework can be extended with additional layers to handle more complicated problems, e.g., temporal drift, and other base estimators, e.g., the minimum efficacious dose (MED) modeled from MCP-Mod (Bretz et al., 2005).

Several previous theoretical works studied the problem of estimating the larger of two means under specific distributions. Blumenthal and Cohen (1968) and Dahiya (1974) investigated this problem under two Normal distributions with a common known variance, and showed that no unbiased estimator could exist (Blumenthal and Cohen, 1968). This non-existence of unbiased estimators was further studied under more general two-group distributions, for example, Normal distributions with common but unknown variance (Hsieh, 1981), a general class of distributions (Ishwaei D et al., 1985), and double exponential distributions with unknown locations (Kumar and Sharma, 1993). On the other hand, unbiased estimators may exist under some special settings, e.g., two gamma distributions with a common and known shape parameter (Vellaisamy and Sharma, 1988). As a general approach, Rosenkranz (2014) proposed to correct bias using the non-parametric Bootstrap (Efron and Tibshirani, 1994; Davison and Hinkley, 1997; Kosmidis, 2014) to accommodate general distribution assumptions from two groups. However, patient-level data are needed to implement this method. In this article, we consider a more general scope of “previous studies”, in the sense that they can be in-house Phase 2 studies with patient-level data under multiple doses and/or multiple compounds, or external studies with only summary data available to characterize assumptions of the active comparator(s) in the new Phase 3 study. Additionally, it is also common to have more than two treatment options to choose from when designing Phase 3 trials.

Additionally, several methods have been proposed to correct selection bias in clinical trials with two or more stages. Based on Whitehead (1986), Stallard and Todd (2005) developed an iterative approach to reduce the estimation bias conditional on the selection of a treatment group. This method requires analytic derivation of the conditional bias given a specific setting, e.g., the equal-variance setting considered in Stallard and Todd (2005). The single and double Bootstrap methods introduced in Section 3.1 share the idea of correcting bias iteratively, but utilize empirical Bootstrap distributions to estimate the bias. Our proposed approaches are also more general and cover settings with unequal variances. Bauer et al. (2010) investigated the bias and MSE when estimating the efficacy of the best treatment in multi-stage trials with sample size adaptation and homogeneous variance. They emphasized that the quantification of the bias is possible only in designs with planned adaptivity (Bauer et al., 2010). Hwang (1993) and Lindley (1962) proposed a shrinkage estimator with a better Bayes risk than the typical maximum-likelihood estimator (MLE). This shrinkage estimator is briefly reviewed in Section 3.3 with comparison results in Section 4. Two recent papers reviewed point estimation for adaptive trial designs, including bias reduction in multi-arm multi-stage designs with treatment selection (Robertson et al., 2023a, b).

The motivation of our proposed methods is to empirically estimate the bias for correction with computational approaches, such as Bootstrap (Efron and Tibshirani, 1994) or Jackknife (Quenouille, 1949). This framework can naturally accommodate general settings, such as multiple (more than two) groups based on either subject-level data or group-level summary data, and homogeneous or heterogeneous variance between treatment groups. Our scope is therefore broad enough to cover both typical Phase 2 studies with patient-level data and external studies for which only summary data are available from the literature. As compared with single Bootstrap methods, double Bootstrap methods can further reduce bias at the price of a slightly larger mean squared error (MSE) and additional computation, with results in Section 4. We further propose hybrid estimators based on double Bootstrap estimators and shrinkage estimators (Hwang, 1993; Lindley, 1962) to balance the reductions in both bias and MSE.

The remainder of this article is organized as follows. In Section 2, we introduce the setup of this problem and notations. In Section 3, a class of computational methods is proposed, and an existing shrinkage estimator (Hwang, 1993; Lindley, 1962) is reviewed. Simulation studies are performed in Section 4 to demonstrate the potential gains of those proposed methods in terms of bias and MSE under different settings. We apply our methods to a Phase 2/3 seamless trial in Section 5. Discussions are provided in Section 6.

2 Setup

Consider a previous study with $I$ active treatment groups and $n_i$ patients randomized to the $i$th treatment group, for $i = 1, \cdots, I$. We assume that the response $X_{i,j}$ of subject $j$ in treatment group $i$, for $i = 1, \cdots, I$ and $j = 1, \cdots, n_i$, follows a Normal distribution,

$$X_{i,j} \sim \mathcal{N}\left(\theta_i, \sigma_i^2\right), \tag{1}$$

where $\theta_i$ is the mean and $\sigma_i$ is the standard deviation of treatment group $i$. We assume that a larger value of $X_{i,j}$ corresponds to a better outcome, and that $\sigma_i$ is unknown. Denote $\bm{X}_i = \left(X_{i,1}, \cdots, X_{i,n_i}\right)$ as the vector of responses from group $i$, and $\bm{X} = (\bm{X}_1, \cdots, \bm{X}_I)$.

After obtaining data from multiple treatment groups, the study team will usually select one or two treatment group(s) to confirm findings in Phase 3 studies. We consider a motivating scenario where all treatment groups have similar safety profiles, and the most efficacious group will be moved to Phase 3. A key question is how to accurately characterize the efficacy of this selected group for sample size calculation.

The corresponding statistical question is to use observed data $\left(\bm{X}_1, \cdots, \bm{X}_I\right)$ to estimate the parameter of interest $\theta_{max}$, defined as,

$$\theta_{max} = \max(\theta_1, \cdots, \theta_I). \tag{2}$$

A traditional estimator $\widehat{\theta}$ is commonly used in practice to estimate $\theta_{max}$:

$$\widehat{\theta}(\bm{X}) = \max\left[\widetilde{\theta}(\bm{X}_1), \cdots, \widetilde{\theta}(\bm{X}_I)\right], \tag{3}$$

where $\widetilde{\theta}(x)$ is the sample mean of $x$, and $\widetilde{\theta}(\bm{X}_i)$ is an unbiased estimator of $\theta_i$. However, $\widehat{\theta}(\bm{X})$ may overestimate $\theta_{max}$ in finite samples. Even though $\widetilde{\theta}(\bm{X}_i)$ estimates $\theta_i$ without bias for each treatment group $i$, one does not know from the observed data which treatment group has the highest true response mean $\theta_i$ in (2). Because the maximum in (3) is taken over noisy sample means, random highs are favored, and $E\left[\widehat{\theta}(\bm{X})\right] \geq \theta_{max}$ by Jensen's inequality.

To provide a numerical illustration of bias, we conduct a toy simulation with $I = 2$ treatment groups, $\theta_1 = 0.9$, $\theta_2 = 1$, and $\sigma_1 = \sigma_2 = 5$ under three magnitudes of sample size $n$. Table 1 shows that the traditional estimator $\widehat{\theta}$ can overestimate $\theta_{max}$ by $40\%$ under a moderate sample size $n = 40$, but the bias shrinks as $n$ increases. When $n = 40{,}000$, the probability of selecting the correct treatment group $i = 2$ is nearly $100\%$, and therefore, $\widehat{\theta}(\bm{X})$ is close to $\widetilde{\theta}(\bm{X}_2)$ as an unbiased estimator of $\theta_{max} = \theta_2$.

| $\theta_{max}$ | $n$ | $E(\widehat{\theta})$ | $E(\widehat{\theta}) - \theta_{max}$ | Probability of correctly selecting $i = 2$ |
|---|---|---|---|---|
| 1 | 40 | 1.40 | 0.40 | 0.53 |
| 1 | 4000 | 1.01 | 0.01 | 0.82 |
| 1 | 40000 | 1.00 | 0.00 | 1.00 |

Table 1: A toy simulation to evaluate the bias of $\widehat{\theta}$ when estimating $\theta_{max}$.
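
A simulation along the lines of Table 1 can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes that $n$ is the common per-group sample size, it draws each group's sample mean directly from its Normal sampling distribution rather than simulating patient-level data, and the function name `toy_simulation`, the number of replicates `n_sim`, and the seed are arbitrary choices rather than the settings used for Table 1. The printed values should land close to Table 1 up to Monte Carlo error.

```python
import math
import random


def toy_simulation(n, theta=(0.9, 1.0), sigma=5.0, n_sim=20000, seed=1):
    """Monte Carlo estimate of E(theta_hat) and of the probability of picking group 2."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)  # standard error of each group's sample mean
    total, correct = 0.0, 0
    for _ in range(n_sim):
        # Under model (1), each sample mean is Normal(theta_i, sigma^2 / n),
        # so the sample means are drawn directly (an assumption of this sketch).
        means = [rng.gauss(t, se) for t in theta]
        total += max(means)             # traditional estimator (3)
        correct += means[1] > means[0]  # group 2 has the larger true mean
    return total / n_sim, correct / n_sim


for n in (40, 4000, 40000):
    e_hat, p_correct = toy_simulation(n)
    print(f"n={n:>6}  E(theta_hat)={e_hat:.2f}  "
          f"bias={e_hat - 1.0:.2f}  P(select i=2)={p_correct:.2f}")
```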

3 Proposed Methods

In this section, we introduce a class of computational methods based on Bootstrap or Jackknife principles to reduce estimation bias.

3.1 Single and Double Bootstrap

Suppose that we have $\widehat{\theta}(\bm{X})$ in (3) as an initial estimator of $\theta_{max}$. Its bias at $\theta_0$ is denoted as $A(\theta_0)$,

$$A(\theta_0) = E\left[\widehat{\theta}(\bm{X})\right] - \theta_0. \tag{4}$$

Since the true value $\theta_0$ of $\theta_{max}$ is to be estimated and the functional form of $A(\cdot)$ is usually unknown, one can use $\widehat{A}\left[\widehat{\theta}(\bm{X})\right]$ to approximate $A(\theta_0)$,

$$\widehat{A}\left[\widehat{\theta}(\bm{X})\right] = \widehat{E}\left[\widehat{\theta}(\bm{X}_B)\right] - \widehat{\theta}(\bm{X}), \tag{5}$$

where $\widehat{E}$ is the empirical expectation based on Monte Carlo Bootstrap data $\bm{X}_B$ with size $B$. The single Bootstrap estimator $\widehat{\theta}^{(1)}(\bm{X})$ (Efron and Tibshirani, 1994; Davison and Hinkley, 1997; Kosmidis, 2014) can then be constructed as,

$$\widehat{\theta}^{(1)}(\bm{X}) = \widehat{\theta}(\bm{X}) - \widehat{A}\left[\widehat{\theta}(\bm{X})\right]. \tag{6}$$

The left-hand side of Figure 1 provides a graphical illustration of the construction above. Algorithm 1 streamlines the workflow to compute $\widehat{\theta}^{(1)}(\bm{X})$ based on $B$ Bootstrap samples.

To further reduce bias, we can iteratively apply the above approach with $\widehat{\theta}^{(1)}(\bm{X})$ as the initial estimator to obtain the double Bootstrap estimator $\widehat{\theta}^{(2)}(\bm{X})$, as in the right-hand side of Figure 1. Algorithm 2 demonstrates that the computation of $\widehat{\theta}^{(2)}(\bm{X})$ requires $B^2$ Bootstrap samples. This strategy is analogous to the calibration of Bootstrap to obtain second-order accurate confidence intervals (Efron and Tibshirani, 1994). Based on our simulation studies in Section 4, $\widehat{\theta}^{(2)}(\bm{X})$ has a satisfactory finite-sample performance in terms of bias and MSE. The triple (or even a higher-order) Bootstrap estimator can also be implemented to seek potential improvements, but at the cost of a much heavier computational burden. Section 4.2 provides more discussion on higher-order Bootstrap estimators.

Next we provide more details on simulating Bootstrap samples from data. Taking the single Bootstrap as an example, our strategy is to resample $\bm{X}_b^{\ast}$ from observed data $\bm{X}$ blocked by groups. To be more specific, for each treatment group $i$, we generate Bootstrap samples $\bm{Y}_{b,i}$ of size $n_i$ from $\bm{X}_i$, and then obtain $\bm{X}_b^{\ast} = \left(\bm{Y}_{b,1}, \cdots, \bm{Y}_{b,I}\right)$. For sampling methods, one can adopt the Nonparametric Bootstrap (NB) to sample $n_i$ observations from $\bm{X}_i$ with replacement to get $\bm{Y}_{b,i}$, as considered in Rosenkranz (2014).
An alternative approach is the Parametric Bootstrap (PB) with distribution assumptions, for example, a Normal distribution with sample mean $\widetilde{\theta}(\bm{X}_i)$ as the mean parameter and empirical standard deviation $\widetilde{\sigma}(\bm{X}_i)$ as the standard deviation parameter. PB is flexible enough to cover scenarios where only summary statistics (e.g., $n$, $\widetilde{\theta}$, $\widetilde{\sigma}$) for each group are reported in the literature or other external sources. In Section 4, we have scenarios with mixture distributions to evaluate the robustness of PB.
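
As a rough illustration of the summary-statistics case, the sketch below applies the single Parametric Bootstrap of (5) and (6) using only a per-group sample size, sample mean, and sample standard deviation. The function names `pb_single_bootstrap` and `simulate_group_mean`, the default `B = 2000`, and the input summaries are assumptions made for this example and are not taken from the paper.

```python
import random
from statistics import fmean


def simulate_group_mean(n, mean, sd, rng):
    """Sample mean of n Normal(mean, sd^2) draws, i.e. one Parametric Bootstrap group."""
    return fmean(rng.gauss(mean, sd) for _ in range(n))


def pb_single_bootstrap(summary, B=2000, seed=0):
    """Single Parametric Bootstrap estimate from per-group (n, mean, sd) summaries."""
    rng = random.Random(seed)
    naive = max(mean for _, mean, _ in summary)  # traditional estimator (3)
    boot = [
        max(simulate_group_mean(n, mean, sd, rng) for n, mean, sd in summary)
        for _ in range(B)
    ]
    bias_hat = fmean(boot) - naive  # Bootstrap bias estimate (5)
    return naive - bias_hat         # single Bootstrap estimator (6)


# Hypothetical summaries (n, sample mean, sample SD) for three groups; not from the paper.
summary = [(40, 1.2, 5.0), (40, 1.9, 5.0), (40, 2.1, 5.0)]
print(pb_single_bootstrap(summary))
```

Because only `(n, mean, sd)` enter the computation, the same function can be fed with values read from a publication, subject to the Normality assumption stated above.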

Now we have four Bootstrap estimators: $\widehat{\theta}^{(1)}_{PB}$, $\widehat{\theta}^{(2)}_{PB}$, $\widehat{\theta}^{(1)}_{NB}$, $\widehat{\theta}^{(2)}_{NB}$, where the superscript “$(1)$” corresponds to the single Bootstrap, “$(2)$” to the double Bootstrap, the subscript “$PB$” to Parametric Bootstrap, and “$NB$” to Nonparametric Bootstrap.

Figure 1: Graphical illustration of the single Bootstrap estimator $\widehat{\theta}^{(1)}$ (left) and the double Bootstrap estimator $\widehat{\theta}^{(2)}$ (right).
Algorithm 1 Single Bootstrap
Input: $\bm{X}$
Procedure:
For Bootstrap index $b$ from $1$ to $B$, Do:
  Simulate Bootstrap sample $\bm{X}_b^{\ast}$ based on $\bm{X}$
End
Compute $\widehat{A}\left[\widehat{\theta}(\bm{X})\right] = \sum_{b=1}^{B} \widehat{\theta}(\bm{X}_b^{\ast})/B - \widehat{\theta}(\bm{X})$
Output: $\widehat{\theta}^{(1)}(\bm{X}) = \widehat{\theta}(\bm{X}) - \widehat{A}\left[\widehat{\theta}(\bm{X})\right]$
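
Algorithm 1 can be sketched in Python as follows, here with the Nonparametric Bootstrap blocked by group. This is a minimal sketch under stated assumptions: patient-level data are supplied as one list of responses per group, and the names `theta_hat`, `resample`, and `single_bootstrap`, the default `B = 2000`, and the simulated example data are illustrative choices rather than the paper's implementation.

```python
import random
from statistics import fmean


def theta_hat(groups):
    """Traditional estimator (3): the largest group sample mean."""
    return max(fmean(g) for g in groups)


def resample(groups, rng):
    """Nonparametric Bootstrap blocked by group: n_i draws with replacement from X_i."""
    return [rng.choices(g, k=len(g)) for g in groups]


def single_bootstrap(groups, B=2000, seed=0):
    """Algorithm 1: subtract the Bootstrap bias estimate (5) from theta_hat, as in (6)."""
    rng = random.Random(seed)
    naive = theta_hat(groups)
    boot_mean = fmean(theta_hat(resample(groups, rng)) for _ in range(B))
    bias_hat = boot_mean - naive
    return naive - bias_hat


# Hypothetical patient-level data, simulated only to exercise the function.
data_rng = random.Random(2023)
groups = [[data_rng.gauss(m, 5.0) for _ in range(40)] for m in (0.9, 1.0)]
print(theta_hat(groups), single_bootstrap(groups))
```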

Algorithm 2 Double Bootstrap
Input: $\bm{X}$
Procedure:
For Bootstrap index $b$ from $1$ to $B$, Do:
  Simulate Bootstrap sample $\bm{X}_b^{\ast}$ based on $\bm{X}$
  For Bootstrap index $c$ from $1$ to $B$, Do:
    Simulate Bootstrap sample $\bm{X}_{b,c}^{\ast\ast}$ based on $\bm{X}_b^{\ast}$
  End
  Compute $\widehat{A}\left[\bm{X}_b^{\ast}\right] = \sum_{c=1}^{B} \widehat{\theta}(\bm{X}_{b,c}^{\ast\ast})/B - \widehat{\theta}(\bm{X}_b^{\ast})$
  Obtain $\widehat{\theta}^{(1)}(\bm{X}_b^{\ast}) = \widehat{\theta}(\bm{X}_b^{\ast}) - \widehat{A}\left[\bm{X}_b^{\ast}\right]$
End
Compute $\widehat{A}\left[\widehat{\theta}^{(1)}(\bm{X})\right] = \sum_{b=1}^{B} \widehat{\theta}^{(1)}(\bm{X}_b^{\ast})/B - \widehat{\theta}^{(1)}(\bm{X})$
Output: $\widehat{\theta}^{(2)}(\bm{X}) = \widehat{\theta}^{(1)}(\bm{X}) - \widehat{A}\left[\widehat{\theta}^{(1)}(\bm{X})\right]$
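
The nested structure of Algorithm 2 can be sketched in Python as below, again with the Nonparametric Bootstrap. Several choices here are assumptions of this sketch rather than details given in the text: $\widehat{\theta}^{(1)}(\bm{X})$ is computed from the same $B$ first-level samples that feed the inner loop, the helper names (`bias_corrected`, `double_bootstrap`) are hypothetical, and `B = 100` is picked only to keep the $B^2$ resamples cheap in pure Python.

```python
import random
from statistics import fmean


def theta_hat(groups):
    """Traditional estimator (3): the largest group sample mean."""
    return max(fmean(g) for g in groups)


def resample(groups, rng):
    """Nonparametric Bootstrap blocked by group."""
    return [rng.choices(g, k=len(g)) for g in groups]


def bias_corrected(groups, B, rng):
    """One round of (5)-(6); also returns the B resamples so they can be reused."""
    stars = [resample(groups, rng) for _ in range(B)]
    naive = theta_hat(groups)
    bias_hat = fmean(theta_hat(s) for s in stars) - naive
    return naive - bias_hat, stars


def double_bootstrap(groups, B=100, seed=0):
    """Algorithm 2: returns (theta^(1)(X), theta^(2)(X)) using B + B^2 resamples."""
    rng = random.Random(seed)
    single, stars = bias_corrected(groups, B, rng)  # theta^(1)(X)
    # Inner loop: theta^(1)(X*_b) from B second-level resamples of each X*_b.
    single_stars = [bias_corrected(s, B, rng)[0] for s in stars]
    bias_hat = fmean(single_stars) - single         # A-hat[theta^(1)(X)]
    return single, single - bias_hat


# Hypothetical patient-level data, simulated only to exercise the function.
data_rng = random.Random(7)
groups = [[data_rng.gauss(m, 5.0) for _ in range(40)] for m in (0.9, 1.0, 1.0)]
print(theta_hat(groups), *double_bootstrap(groups))
```

The run time grows with $B^2$, which is the computational cost of the double Bootstrap noted above; a Parametric Bootstrap version would only swap out `resample`.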

3.2 Jackknife

The Jackknife is a well-established technique to correct bias (Quenouille, 1949; Miller, 1974). Suppose that the bias of θ^(𝑿)^𝜃𝑿\widehat{\theta}(\bm{X})over^ start_ARG italic_θ end_ARG ( bold_italic_X ) under sample size n𝑛nitalic_n can be expressed as,

E[θ^(𝑿);n]θ0=a1n+a2n2+𝒪(n3),𝐸^𝜃𝑿𝑛subscript𝜃0subscript𝑎1𝑛subscript𝑎2superscript𝑛2𝒪superscript𝑛3E\left[\widehat{\theta}(\bm{X});n\right]-\theta_{0}=\frac{a_{1}}{n}+\frac{a_{2% }}{n^{2}}+\mathcal{O}(n^{-3}),italic_E [ over^ start_ARG italic_θ end_ARG ( bold_italic_X ) ; italic_n ] - italic_θ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = divide start_ARG italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_ARG start_ARG italic_n end_ARG + divide start_ARG italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_ARG start_ARG italic_n start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG + caligraphic_O ( italic_n start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT ) , (7)

where a1subscript𝑎1a_{1}italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and a2subscript𝑎2a_{2}italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT are unknown and do not depend on n𝑛nitalic_n. The bias with a sample size of n1𝑛1n-1italic_n - 1 is,

E[θ^(𝑿);n1]θ0=a1n1+a2(n1)2+𝒪(n3).𝐸^𝜃𝑿𝑛1subscript𝜃0subscript𝑎1𝑛1subscript𝑎2superscript𝑛12𝒪superscript𝑛3E\left[\widehat{\theta}(\bm{X});n-1\right]-\theta_{0}=\frac{a_{1}}{n-1}+\frac{% a_{2}}{(n-1)^{2}}+\mathcal{O}(n^{-3}).italic_E [ over^ start_ARG italic_θ end_ARG ( bold_italic_X ) ; italic_n - 1 ] - italic_θ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = divide start_ARG italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_ARG start_ARG italic_n - 1 end_ARG + divide start_ARG italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_ARG start_ARG ( italic_n - 1 ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG + caligraphic_O ( italic_n start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT ) . (8)

In order to correct the order 1/n1𝑛1/n1 / italic_n term, one can construct the following Jackknife estimator θ^JK(𝑿)subscript^𝜃𝐽𝐾𝑿\widehat{\theta}_{JK}(\bm{X})over^ start_ARG italic_θ end_ARG start_POSTSUBSCRIPT italic_J italic_K end_POSTSUBSCRIPT ( bold_italic_X ),

θ^JK(𝑿)=nθ^(𝑿)(n1)θ^()(𝑿),subscript^𝜃𝐽𝐾𝑿𝑛^𝜃𝑿𝑛1subscript^𝜃𝑿\widehat{\theta}_{JK}(\bm{X})=n\widehat{\theta}(\bm{X})-(n-1)\widehat{\theta}_% {(\bullet)}(\bm{X}),over^ start_ARG italic_θ end_ARG start_POSTSUBSCRIPT italic_J italic_K end_POSTSUBSCRIPT ( bold_italic_X ) = italic_n over^ start_ARG italic_θ end_ARG ( bold_italic_X ) - ( italic_n - 1 ) over^ start_ARG italic_θ end_ARG start_POSTSUBSCRIPT ( ∙ ) end_POSTSUBSCRIPT ( bold_italic_X ) , (9)

where $\widehat{\theta}_{(\bullet)}(\bm{X})=\sum_{j=1}^{n}\widehat{\theta}\left(\bm{X}_{-j}\right)/n$, and $\bm{X}_{-j}$ is $\bm{X}$ with the $j$th observation deleted.
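To see what (9) does to the expansion, take expectations and substitute (7) and (8). This is only a sketch: it assumes i.i.d. observations, so that every leave-one-out estimate $\widehat{\theta}(\bm{X}_{-j})$ has the bias given in (8), and it uses the fact that multiplying an $\mathcal{O}(n^{-3})$ remainder by $n$ or $n-1$ leaves an $\mathcal{O}(n^{-2})$ remainder.

$$\begin{aligned}
E\left[\widehat{\theta}_{JK}(\bm{X})\right]-\theta_{0}
&= n\left[\frac{a_{1}}{n}+\frac{a_{2}}{n^{2}}\right]-(n-1)\left[\frac{a_{1}}{n-1}+\frac{a_{2}}{(n-1)^{2}}\right]+\mathcal{O}(n^{-2}) \\
&= \frac{a_{2}}{n}-\frac{a_{2}}{n-1}+\mathcal{O}(n^{-2})
= -\frac{a_{2}}{n(n-1)}+\mathcal{O}(n^{-2}).
\end{aligned}$$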

It can be shown that the bias of $\widehat{\theta}_{JK}(\bm{X})$ is now of order $1/n^{2}$ (Miller, 1974). A graphical illustration of this bias reduction is provided in Figure 2. Compared with Bootstrap methods, $\widehat{\theta}_{JK}(\bm{X})$ is computationally inexpensive, and it gives exact, reproducible results without the need to specify a random seed.
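The construction in (9) can be sketched in a few lines of Python. This is a minimal, generic illustration for a single sample and an arbitrary estimator, not the implementation used in this article; the function names `jackknife` and `plugin_variance` are hypothetical. The plug-in variance is used as the example estimator because its bias is exactly of order $1/n$, so the Jackknife correction recovers the usual unbiased sample variance.

```python
import numpy as np

def jackknife(x, estimator):
    """Jackknife bias-corrected estimate, following (9).

    x         : one-dimensional sample of size n
    estimator : function mapping a sample to a scalar estimate
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    full = estimator(x)
    # Leave-one-out estimates: recompute the estimator with the j-th point deleted.
    loo = np.array([estimator(np.delete(x, j)) for j in range(n)])
    return n * full - (n - 1) * loo.mean()

def plugin_variance(x):
    """Variance estimator that divides by n (biased downward by a factor (n-1)/n)."""
    return np.mean((x - np.mean(x)) ** 2)

x = np.array([2.0, 4.0, 4.0, 5.0, 7.0, 9.0])
corrected = jackknife(x, plugin_variance)
unbiased = np.var(x, ddof=1)  # divides by n - 1
print(corrected, unbiased)
```

No random numbers are involved, which is the sense in which the Jackknife result is exact and seed-free. How observations are deleted when several treatment groups are present is not specified by this sketch.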

Figure 2: Graphical illustration of the Jackknife estimator $\widehat{\theta}_{JK}$.

3.3 Shrinkage Estimators

Hwang (1993) and Lindley (1962) considered the following shrinkage estimator $\widehat{\theta}_{S}(\bm{X})$ for $\theta_{max}$ to reduce MSE:

$$\begin{aligned}
\widehat{\theta}_{S}(\bm{X}) &= C_{+}\widehat{\theta}(\bm{X})+\left(1-C_{+}\right)\widetilde{\theta}(\bm{X}) && (10) \\
C_{+} &= \max(0,C) \\
C &= 1-\frac{(I-1)\sigma^{2}}{\sum_{i=1}^{I}n_{i}\left[\widetilde{\theta}(\bm{X}_{i})-\widetilde{\theta}(\bm{X})\right]^{2}} && (11)
\end{aligned}$$

Their original estimator is based on a setting of common and known variance $\sigma^{2}$ for each treatment group $i$ in (1). For evaluation in this article, we replace $\sigma^{2}$ in (11) by the average of the empirical variance estimators from all $I$ groups. Moreover, the constant $(I-1)$ is modified from $(I-3)$ for $I\geq 4$ in Hwang (1993) and Lindley (1962) to accommodate a general setting with $I\geq 2$, as suggested by Carreras and Brannath (2013). Intuitively, when $\theta_{1},\cdots,\theta_{I}$ are far from each other, $C$ in (11) will be close to $1$, because $\sum_{i=1}^{I}n_{i}\left[\widetilde{\theta}(\bm{X}_{i})-\widetilde{\theta}(\bm{X})\right]^{2}$ is relatively large. The estimator $\widehat{\theta}_{S}(\bm{X})$ will be close to $\widehat{\theta}(\bm{X})$ with small bias under this setting (Carreras and Brannath, 2013). Otherwise, when $\theta_{1},\cdots,\theta_{I}$ are close to each other, $\widehat{\theta}_{S}(\bm{X})$ will be close to the overall mean $\widetilde{\theta}(\bm{X})$. The superior performance of $\widehat{\theta}_{S}(\bm{X})$ in terms of Bayes risk was studied in Hwang (1993) and Carreras and Brannath (2013).
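The following Python sketch illustrates (10) and (11) with the plug-in variance described above. It rests on assumptions that are not spelled out in this section: $\widehat{\theta}(\bm{X})$ is taken to be the largest group sample mean, $\widetilde{\theta}(\bm{X}_{i})$ the sample mean of group $i$, and $\widetilde{\theta}(\bm{X})$ the pooled mean of all observations. The function name `shrinkage_estimate` is hypothetical.

```python
import numpy as np

def shrinkage_estimate(groups):
    """Shrinkage estimate of the largest mean, following (10) and (11).

    groups : list of 1-D arrays, one per treatment group.
    Assumes theta-hat is the largest group mean and theta-tilde(X)
    is the pooled mean of all observations.
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    I = len(groups)
    sizes = np.array([g.size for g in groups])
    means = np.array([g.mean() for g in groups])
    overall = np.concatenate(groups).mean()
    # sigma^2 replaced by the average of the per-group empirical variances.
    sigma2 = np.mean([g.var(ddof=1) for g in groups])
    spread = np.sum(sizes * (means - overall) ** 2)
    # If all group means coincide, shrink fully to the overall mean.
    C = 1.0 - (I - 1) * sigma2 / spread if spread > 0 else 0.0
    C_plus = max(0.0, C)
    return C_plus * means.max() + (1.0 - C_plus) * overall

# Well-separated groups: C is close to 1, so little shrinkage is applied.
far = [np.array([0.0, 1.0, 2.0]), np.array([10.0, 11.0, 12.0]), np.array([20.0, 21.0, 22.0])]
# Identical groups: the estimate collapses to the overall mean.
same = [np.array([0.0, 1.0, 2.0])] * 3
print(shrinkage_estimate(far), shrinkage_estimate(same))
```

In the first example the group means are 1, 11 and 21 with an average variance of 1, so $C = 1 - 2/600$ and the estimate stays close to the largest mean; in the second, the estimate equals the common mean.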

3.4 Hybrid Estimators based on Double Bootstrap and Shrinkage

The double Bootstrap estimators $\widehat{\theta}^{(2)}_{PB}$ and $\widehat{\theta}^{(2)}_{NB}$ may further reduce estimation bias as compared with their single Bootstrap versions, but at a potential cost of increased MSE. This phenomenon was observed in some previous works (Hsu et al., 1986; MacKinnon and Smith Jr, 1998; Ouysse, 2011), and in our simulation results in Section 4.

In this article, we also consider a natural generalization that combines the double Bootstrap estimators with the shrinkage estimator of Section 3.3, with the goal of balancing reductions in bias and MSE. The proposed hybrid estimators $\widehat{\theta}^{(2)}_{PB,S}$ and $\widehat{\theta}^{(2)}_{NB,S}$ substitute $\widehat{\theta}$ in (10) by $\widehat{\theta}^{(2)}_{PB}$ and $\widehat{\theta}^{(2)}_{NB}$, respectively.

4 Simulation

4.1 Main Study

In this section, we conduct simulations with $I=3$ treatment groups and $n=40$ per group to evaluate the performance of several existing estimators, $\widehat{\theta}$, $\widehat{\theta}_{S}$ (Hwang, 1993; Lindley, 1962; Carreras and Brannath, 2013) and $\widehat{\theta}^{(1)}_{NB}$ (Rosenkranz, 2014), and of our proposed estimators $\widehat{\theta}^{(2)}_{NB}$, $\widehat{\theta}^{(2)}_{NB,S}$, $\widehat{\theta}^{(1)}_{PB}$, $\widehat{\theta}^{(2)}_{PB}$, $\widehat{\theta}^{(2)}_{PB,S}$ and $\widehat{\theta}_{JK}$. An additional setting of $I=4$ is considered in Table 4. Denote $\bm{\theta}=(\theta_{1},\theta_{2},\theta_{3})$ as the response mean vector, and $\bm{\sigma}=(\sigma_{1},\sigma_{2},\sigma_{3})$ as the standard deviation vector. The number of Bootstrap samples is set at $B=80$, and the number of simulation iterations is $10{,}000$. A discussion of how to choose the value of $B$ is provided in Section 4.2.

We consider the following four simulation scenarios.

  • S1: Varying mean vector $\bm{\theta}$ at $(1,1,1)$, $(1,1,1.2)$, $(1,1.1,1.2)$ and $(1,1.2,1.2)$ with $\bm{\sigma}=(5,5,5)$ and Normal distribution of $\bm{X}$

  • S2: Varying mean vector $\bm{\theta}$ as in S1 but with $\bm{\sigma}=(3,4,5)$ and Normal distribution of $\bm{X}$

  • S3: Varying $w$ at $0.1$, $0.2$, $0.3$ and $0.5$ with $\bm{\sigma}=(5,5,5)$, $\bm{\theta}=(1,1.1,1.2)$ and a mixture of Gamma and Normal distributions of $\bm{X}$

  • S4: Varying $w$ at $0.1$, $0.2$, $0.3$ and $0.5$ with $\bm{\sigma}=(5,5,5)$, $\bm{\theta}=(1,1.1,1.2)$ and a mixture of Uniform and Normal distributions of $\bm{X}$

S1 considers different values of $\bm{\theta}$ under homogeneous $\bm{\sigma}=(5,5,5)$, while S2 is for heterogeneous $\bm{\sigma}=(3,4,5)$. S3 evaluates estimators based on a mixture distribution with a proportion $w$ ($w\in[0,1]$) of Gamma distribution and a proportion $1-w$ of Normal distribution for each treatment group. S4 uses the Uniform distribution as the outlier distribution instead. The shape and scale parameters of the Gamma distributions, or the minimum and maximum parameters of the Uniform distributions, are specified to match the mean and standard deviation of each treatment group. The Parametric Bootstrap estimators (i.e., $\widehat{\theta}^{(1)}_{PB}$, $\widehat{\theta}^{(2)}_{PB}$, $\widehat{\theta}^{(2)}_{PB,S}$) still use the Normal distribution as the re-sampling assumption. S3 and S4 essentially evaluate the robustness of different estimators when the data sampling distribution deviates from the Normal assumption.
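The moment matching described above can be made concrete with a short Python sketch. For a Gamma distribution with shape $k$ and scale $s$, the mean is $ks$ and the variance is $ks^{2}$; for a Uniform distribution on $[a,b]$, the mean is $(a+b)/2$ and the variance is $(b-a)^{2}/12$. The function names below are hypothetical, and drawing each observation from the outlier distribution independently with probability $w$ is an assumption of this sketch, since the exact mixing mechanism is not stated here.

```python
import numpy as np

def gamma_params(mean, sd):
    """Shape and scale of a Gamma distribution with the given mean and sd (mean > 0)."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale

def uniform_params(mean, sd):
    """Minimum and maximum of a Uniform distribution with the given mean and sd."""
    half_width = sd * np.sqrt(3.0)
    return mean - half_width, mean + half_width

def sample_mixture(rng, n, mean, sd, w, outlier="gamma"):
    """Draw n observations: outlier distribution with probability w, Normal otherwise."""
    x = rng.normal(mean, sd, size=n)
    is_outlier = rng.random(n) < w  # assumed per-observation mixing
    k = int(is_outlier.sum())
    if outlier == "gamma":
        shape, scale = gamma_params(mean, sd)
        x[is_outlier] = rng.gamma(shape, scale, size=k)
    else:
        lo, hi = uniform_params(mean, sd)
        x[is_outlier] = rng.uniform(lo, hi, size=k)
    return x

rng = np.random.default_rng(2023)
# One treatment group from S3 with theta = 1.2, sigma = 5 and w = 0.3.
x = sample_mixture(rng, n=40, mean=1.2, sd=5.0, w=0.3, outlier="gamma")
print(gamma_params(1.2, 5.0), uniform_params(1.2, 5.0), x.mean())
```

With a mean near 1 and a standard deviation of 5, the matched Gamma shape is far below 1, so the Gamma component is strongly right-skewed; this is what makes S3 a demanding check of the Normal re-sampling assumption.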

Table 2 evaluates the unconditional or marginal bias and MSE of those estimators when estimating $\theta_{max}$. Among existing estimators, $\widehat{\theta}_{S}$ and $\widehat{\theta}^{(1)}_{NB}$ can generally reduce bias and MSE as compared with the traditional estimator $\widehat{\theta}$. For our proposed estimators, both $\widehat{\theta}^{(2)}_{PB}$ and $\widehat{\theta}^{(2)}_{NB}$ can substantially reduce estimation bias but with increased MSE. As a better balance between bias reduction and MSE reduction, our hybrid estimators $\widehat{\theta}^{(2)}_{PB,S}$ and $\widehat{\theta}^{(2)}_{NB,S}$ have the smallest MSE, and also smaller bias than the three existing estimators. The Normal assumption for PB is usually reasonable for problems with the response mean as the parameter of interest and a moderate sample size (Efron and Tibshirani, 1994), with supporting results in S3 and S4. The Jackknife estimator $\widehat{\theta}_{JK}$ has bias similar to that of $\widehat{\theta}_{S}$ but with increased MSE.
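As an illustration of how a marginal bias and MSE entry can be obtained, the following Python sketch simulates the traditional estimator under S1 with $\bm{\theta}=(1,1,1)$. It assumes that $\widehat{\theta}$ is the largest group sample mean and that the estimation target is $\max_i \theta_i$; the function name `naive_max_bias_mse` and the seed are hypothetical, and the bias-corrected estimators are not implemented here.

```python
import numpy as np

def naive_max_bias_mse(theta, sigma, n=40, n_sim=10_000, seed=0):
    """Monte Carlo marginal bias and MSE of the largest group sample mean.

    theta, sigma : per-group true means and standard deviations
    n            : observations per group
    n_sim        : number of simulation iterations
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Array of shape (n_sim, n, I); theta and sigma broadcast over the last axis.
    x = rng.normal(theta, sigma, size=(n_sim, n, theta.size))
    # Group means per iteration, then the largest one (the naive estimator).
    estimates = x.mean(axis=1).max(axis=1)
    error = estimates - theta.max()  # assumed target: the largest true mean
    return error.mean(), np.mean(error ** 2)

bias, mse = naive_max_bias_mse(theta=(1.0, 1.0, 1.0), sigma=(5.0, 5.0, 5.0))
print(f"bias = {bias:.2f}, MSE = {mse:.2f}")
```

Up to Monte Carlo error, the output should be close to the 0.67 (0.80) entry reported for $\widehat{\theta}$ in the first row of Table 2. This agrees with the closed form for the maximum of three independent Normal means, whose expected excess over the common mean is $3/(2\sqrt{\pi})$ standard errors, here $0.846 \times 5/\sqrt{40} \approx 0.67$.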

In Table 3, we also evaluate the conditional bias and MSE given that the third treatment group ($i=3$) is selected. The true response mean of this group is larger than or equal to that of the other two groups under the four simulation scenarios specified above. The marginal bias and MSE are evaluated under some additional settings with $I=4$ treatment groups in Table 4. Results and conclusions of these two additional analyses are consistent with Table 2.

The overall recommendation is that the double Bootstrap estimators $\widehat{\theta}^{(2)}_{PB}$ and $\widehat{\theta}^{(2)}_{NB}$ can be applied to achieve the smallest bias, but with slightly larger MSE as compared with $\widehat{\theta}$. The hybrid estimators $\widehat{\theta}^{(2)}_{PB,S}$ and $\widehat{\theta}^{(2)}_{NB,S}$ achieve a better balance between bias reduction and MSE reduction.

Existing estimators: $\widehat{\theta}$, $\widehat{\theta}_{S}$, $\widehat{\theta}^{(1)}_{NB}$. Proposed estimators: $\widehat{\theta}^{(2)}_{NB}$, $\widehat{\theta}^{(2)}_{NB,S}$, $\widehat{\theta}^{(1)}_{PB}$, $\widehat{\theta}^{(2)}_{PB}$, $\widehat{\theta}^{(2)}_{PB,S}$, $\widehat{\theta}_{JK}$.

| Scenario | $\bm{\theta}$ | $\widehat{\theta}$ | $\widehat{\theta}_{S}$ | $\widehat{\theta}^{(1)}_{NB}$ | $\widehat{\theta}^{(2)}_{NB}$ | $\widehat{\theta}^{(2)}_{NB,S}$ | $\widehat{\theta}^{(1)}_{PB}$ | $\widehat{\theta}^{(2)}_{PB}$ | $\widehat{\theta}^{(2)}_{PB,S}$ | $\widehat{\theta}_{JK}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| S1 | (1, 1, 1) | 0.67 (0.80) | 0.18 (0.35) | 0.41 (0.65) | 0.07 (0.83) | 0.14 ($\underline{0.33}$) | 0.40 (0.65) | $\bm{0.06}$ (0.83) | 0.14 ($\underline{0.33}$) | 0.35 (1.15) |
| S1 | (1, 1, 1.2) | 0.54 (0.64) | 0.05 (0.32) | 0.27 (0.55) | -0.06 (0.83) | $\bm{0.01}$ ($\underline{0.31}$) | 0.27 (0.55) | -0.07 (0.84) | $\bm{0.01}$ ($\underline{0.31}$) | 0.22 (1.07) |
| S1 | (1, 1.1, 1.2) | 0.58 (0.69) | 0.09 (0.32) | 0.32 (0.58) | $-\bm{0.02}$ (0.83) | 0.05 ($\underline{0.31}$) | 0.31 (0.58) | -0.03 (0.84) | 0.05 ($\underline{0.31}$) | 0.27 (1.09) |
| S1 | (1, 1.2, 1.2) | 0.60 (0.72) | 0.11 (0.33) | 0.33 (0.60) | $\bm{0.00}$ (0.84) | 0.07 (0.32) | 0.33 (0.60) | -0.01 (0.85) | 0.07 ($\underline{0.31}$) | 0.27 (1.14) |
| S2 | (1, 1, 1) | 0.55 (0.55) | 0.26 (0.32) | 0.33 (0.44) | $\bm{0.05}$ (0.57) | 0.18 ($\underline{0.29}$) | 0.32 (0.44) | $\bm{0.05}$ (0.58) | 0.18 ($\underline{0.29}$) | 0.28 (0.80) |
| S2 | (1, 1, 1.2) | 0.42 (0.45) | 0.13 (0.30) | 0.21 (0.41) | -0.06 (0.62) | $\bm{0.05}$ ($\underline{0.28}$) | 0.20 (0.41) | -0.07 (0.62) | $\bm{0.05}$ (0.29) | 0.17 (0.76) |
| S2 | (1, 1.1, 1.2) | 0.46 (0.49) | 0.17 (0.31) | 0.25 (0.43) | $-\bm{0.02}$ (0.62) | 0.09 ($\underline{0.29}$) | 0.24 (0.43) | -0.03 (0.62) | 0.09 ($\underline{0.29}$) | 0.20 (0.78) |
| S2 | (1, 1.2, 1.2) | 0.49 (0.51) | 0.20 (0.32) | 0.27 (0.44) | $\bm{0.00}$ (0.62) | 0.12 ($\underline{0.30}$) | 0.27 (0.44) | -0.01 (0.63) | 0.11 ($\underline{0.30}$) | 0.21 (0.83) |

| Scenario | $w$ | $\widehat{\theta}$ | $\widehat{\theta}_{S}$ | $\widehat{\theta}^{(1)}_{NB}$ | $\widehat{\theta}^{(2)}_{NB}$ | $\widehat{\theta}^{(2)}_{NB,S}$ | $\widehat{\theta}^{(1)}_{PB}$ | $\widehat{\theta}^{(2)}_{PB}$ | $\widehat{\theta}^{(2)}_{PB,S}$ | $\widehat{\theta}_{JK}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| S3 | 0.1 | 0.52 (0.56) | 0.07 (0.27) | 0.27 (0.48) | $-\bm{0.03}$ (0.68) | $\bm{0.03}$ ($\underline{0.26}$) | 0.27 (0.48) | -0.04 (0.69) | $\bm{0.03}$ ($\underline{0.26}$) | 0.22 (0.91) |
| S3 | 0.2 | 0.46 (0.45) | 0.05 (0.22) | 0.24 (0.39) | -0.04 (0.57) | $\bm{0.02}$ ($\underline{0.21}$) | 0.23 (0.39) | -0.05 (0.57) | $\bm{0.02}$ ($\underline{0.21}$) | 0.19 (0.73) |
| S3 | 0.3 | 0.43 (0.40) | 0.06 (0.20) | 0.22 (0.35) | -0.03 (0.49) | $\bm{0.02}$ ($\underline{0.19}$) | 0.22 (0.34) | -0.04 (0.49) | $\bm{0.02}$ ($\underline{0.19}$) | 0.19 (0.62) |
| S3 | 0.5 | 0.39 (0.39) | 0.07 (0.21) | 0.21 (0.33) | $-\bm{0.02}$ (0.45) | $\bm{0.02}$ ($\underline{0.20}$) | 0.20 (0.32) | -0.03 (0.44) | $\bm{0.02}$ ($\underline{0.20}$) | 0.16 (0.51) |
| S4 | 0.1 | 0.51 (0.54) | 0.06 (0.26) | 0.26 (0.46) | -0.04 (0.68) | $\bm{0.03}$ ($\underline{0.25}$) | 0.26 (0.46) | -0.05 (0.68) | $\bm{0.03}$ ($\underline{0.25}$) | 0.21 (0.90) |
| S4 | 0.2 | 0.45 (0.45) | 0.05 (0.22) | 0.23 (0.38) | -0.04 (0.56) | $\bm{0.01}$ ($\underline{0.21}$) | 0.23 (0.38) | -0.05 (0.57) | $\bm{0.01}$ ($\underline{0.21}$) | 0.19 (0.74) |
| S4 | 0.3 | 0.41 (0.37) | 0.04 (0.18) | 0.21 (0.32) | -0.05 (0.48) | $\bm{0.01}$ ($\underline{0.17}$) | 0.20 (0.32) | -0.06 (0.49) | $\bm{0.01}$ ($\underline{0.17}$) | 0.16 (0.64) |
| S4 | 0.5 | 0.38 (0.32) | 0.03 (0.16) | 0.19 (0.27) | -0.05 (0.41) | $\bm{0.00}$ ($\underline{0.15}$) | 0.18 (0.27) | -0.05 (0.41) | $\bm{0.00}$ ($\underline{0.15}$) | 0.15 (0.54) |

Table 2: Marginal bias and MSE (in parentheses) of three existing estimators and six proposed estimators. Within each row, the bias with the smallest absolute value is in bold, and the smallest MSE is underlined.

Existing estimators: $\widehat{\theta}$, $\widehat{\theta}_{S}$, $\widehat{\theta}^{(1)}_{NB}$. Proposed estimators: $\widehat{\theta}^{(2)}_{NB}$, $\widehat{\theta}^{(2)}_{NB,S}$, $\widehat{\theta}^{(1)}_{PB}$, $\widehat{\theta}^{(2)}_{PB}$, $\widehat{\theta}^{(2)}_{PB,S}$, $\widehat{\theta}_{JK}$.

| Scenario | $\bm{\theta}$ | $\widehat{\theta}$ | $\widehat{\theta}_{S}$ | $\widehat{\theta}^{(1)}_{NB}$ | $\widehat{\theta}^{(2)}_{NB}$ | $\widehat{\theta}^{(2)}_{NB,S}$ | $\widehat{\theta}^{(1)}_{PB}$ | $\widehat{\theta}^{(2)}_{PB}$ | $\widehat{\theta}^{(2)}_{PB,S}$ | $\widehat{\theta}_{JK}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| S1 | (1, 1, 1) | 0.68 (0.81) | 0.19 (0.35) | 0.41 (0.65) | $\bm{0.07}$ (0.82) | 0.15 ($\underline{0.33}$) | 0.41 (0.65) | $\bm{0.07}$ (0.84) | 0.15 ($\underline{0.33}$) | 0.36 (1.13) |
| S1 | (1, 1, 1.2) | 0.58 (0.69) | 0.08 (0.34) | 0.33 (0.60) | 0.02 (0.84) | 0.03 ($\underline{0.33}$) | 0.32 (0.60) | $\bm{0.01}$ (0.84) | 0.03 (0.34) | 0.29 (1.07) |
| S1 | (1, 1.1, 1.2) | 0.62 (0.74) | 0.12 (0.34) | 0.36 (0.62) | 0.05 (0.84) | 0.07 ($\underline{0.33}$) | 0.36 (0.62) | $\bm{0.04}$ (0.85) | 0.07 ($\underline{0.33}$) | 0.32 (1.10) |
| S1 | (1, 1.2, 1.2) | 0.63 (0.76) | 0.14 (0.34) | 0.37 (0.63) | 0.05 (0.85) | 0.09 ($\underline{0.32}$) | 0.37 (0.63) | $\bm{0.04}$ (0.87) | 0.09 ($\underline{0.32}$) | 0.31 (1.15) |
| S2 | (1, 1, 1) | 0.71 (0.81) | 0.42 (0.49) | 0.50 (0.67) | 0.23 (0.77) | 0.34 ($\underline{0.44}$) | 0.49 (0.67) | $\bm{0.22}$ (0.78) | 0.34 ($\underline{0.44}$) | 0.44 (1.08) |
| S2 | (1, 1, 1.2) | 0.59 (0.66) | 0.30 (0.42) | 0.40 (0.59) | 0.16 (0.74) | 0.22 ($\underline{0.39}$) | 0.39 (0.59) | $\bm{0.15}$ (0.74) | 0.22 (0.40) | 0.37 (0.93) |
| S2 | (1, 1.1, 1.2) | 0.64 (0.71) | 0.34 (0.44) | 0.43 (0.61) | $\bm{0.19}$ (0.75) | 0.26 ($\underline{0.40}$) | 0.43 (0.61) | $\bm{0.19}$ (0.74) | 0.26 (0.41) | 0.42 (0.92) |
| S2 | (1, 1.2, 1.2) | 0.65 (0.72) | 0.35 (0.45) | 0.44 (0.61) | 0.18 (0.75) | 0.27 ($\underline{0.40}$) | 0.43 (0.62) | $\bm{0.17}$ (0.75) | 0.27 ($\underline{0.40}$) | 0.39 (1.00) |

| Scenario | $w$ | $\widehat{\theta}$ | $\widehat{\theta}_{S}$ | $\widehat{\theta}^{(1)}_{NB}$ | $\widehat{\theta}^{(2)}_{NB}$ | $\widehat{\theta}^{(2)}_{NB,S}$ | $\widehat{\theta}^{(1)}_{PB}$ | $\widehat{\theta}^{(2)}_{PB}$ | $\widehat{\theta}^{(2)}_{PB,S}$ | $\widehat{\theta}_{JK}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| S3 | 0.1 | 0.55 (0.61) | 0.10 (0.29) | 0.32 (0.52) | 0.03 (0.70) | 0.06 ($\underline{0.28}$) | 0.31 (0.52) | $\bm{0.02}$ (0.72) | 0.06 ($\underline{0.28}$) | 0.28 (0.92) |
| S3 | 0.2 | 0.48 (0.48) | 0.07 (0.23) | 0.27 (0.41) | 0.02 (0.57) | 0.03 ($\underline{0.22}$) | 0.27 (0.41) | $\bm{0.01}$ (0.56) | 0.04 (0.23) | 0.25 (0.72) |
| S3 | 0.3 | 0.45 (0.43) | 0.08 (0.22) | 0.26 (0.37) | 0.02 (0.51) | 0.04 ($\underline{0.21}$) | 0.25 (0.37) | $\bm{0.01}$ (0.50) | 0.04 ($\underline{0.21}$) | 0.23 (0.62) |
| S3 | 0.5 | 0.41 (0.40) | 0.11 (0.24) | 0.24 (0.34) | 0.04 (0.45) | 0.06 ($\underline{0.22}$) | 0.24 (0.34) | $\bm{0.03}$ (0.44) | 0.06 ($\underline{0.22}$) | 0.21 (0.51) |
| S4 | 0.1 | 0.55 (0.59) | 0.10 (0.27) | 0.31 (0.50) | 0.03 (0.69) | 0.06 ($\underline{0.26}$) | 0.31 (0.50) | $\bm{0.02}$ (0.69) | 0.06 ($\underline{0.26}$) | 0.28 (0.92) |
| S4 | 0.2 | 0.49 (0.49) | 0.08 ($\underline{0.23}$) | 0.28 (0.41) | 0.03 (0.57) | 0.04 ($\underline{0.23}$) | 0.28 (0.42) | $\bm{0.02}$ (0.57) | 0.04 ($\underline{0.23}$) | 0.25 (0.74) |
| S4 | 0.3 | 0.44 (0.40) | 0.06 (0.19) | 0.25 (0.35) | 0.01 (0.48) | 0.02 ($\underline{0.18}$) | 0.25 (0.34) | $\bm{0.00}$ (0.49) | 0.02 ($\underline{0.18}$) | 0.22 (0.63) |
| S4 | 0.5 | 0.40 (0.34) | 0.04 (0.17) | 0.22 (0.30) | $\bm{0.00}$ (0.42) | 0.01 ($\underline{0.16}$) | 0.21 (0.30) | -0.01 (0.43) | 0.01 ($\underline{0.16}$) | 0.18 (0.56) |

Table 3: Conditional bias and MSE (in parentheses) of three existing estimators and six proposed estimators. Within each row, the bias with the smallest absolute value is in bold, and the smallest MSE is underlined.
Existing Estimators Proposed Estimators
| $\bm{\theta}$ | $\widehat{\theta}$ | $\widehat{\theta}_{S}$ | $\widehat{\theta}^{(1)}_{NB}$ | $\widehat{\theta}^{(2)}_{NB}$ | $\widehat{\theta}^{(2)}_{NB,S}$ | $\widehat{\theta}^{(1)}_{PB}$ | $\widehat{\theta}^{(2)}_{PB}$ | $\widehat{\theta}^{(2)}_{PB,S}$ | $\widehat{\theta}_{JK}$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (1, 1, 1, 1) | 0.67 (0.80) | 0.19 (0.30) | 0.48 (0.69) | 0.07 (0.86) | 0.13 (0.27) | 0.48 (0.69) | $\bm{0.06}$ (0.87) | 0.13 ($\underline{0.26}$) | 0.41 (1.30) |
| (1, 1, 1, 1.2) | 0.47 (0.57) | 0.05 (0.27) | 0.34 (0.58) | -0.07 (0.87) | $\bm{-0.02}$ ($\underline{0.25}$) | 0.34 (0.58) | -0.08 (0.88) | $\bm{-0.02}$ (0.26) | 0.26 (1.22) |
| (1, 1.05, 1.1, 1.2) | 0.53 (0.63) | 0.09 (0.28) | 0.39 (0.61) | $\bm{-0.02}$ (0.85) | $\bm{0.02}$ ($\underline{0.26}$) | 0.38 (0.61) | -0.03 (0.86) | $\bm{0.02}$ ($\underline{0.26}$) | 0.32 (1.23) |
| (1, 1.1, 1.2, 1.2) | 0.57 (0.68) | 0.12 (0.29) | 0.42 (0.64) | 0.01 (0.85) | 0.05 ($\underline{0.26}$) | 0.41 (0.64) | $\bm{0.00}$ (0.86) | 0.05 ($\underline{0.26}$) | 0.36 (1.24) |
Table 4: Marginal bias and MSE (in parentheses) of three existing estimators and six proposed estimators when $I=4$. Within each row, the bias with the smallest absolute value is in bold, and the smallest MSE is underlined.

4.2 The Choice of $B$ and Higher-Order Bootstrap Methods

In this section, we provide guidance on how to choose the value of $B$ in Bootstrap methods and assess the feasibility of higher-order Bootstrap methods. Under 4 different values of $\bm{\theta}$ in S1, Table 5 evaluates single, double and triple Bootstrap methods in both parametric and nonparametric versions. Because of the computational burden, the triple Bootstrap methods $\widehat{\theta}^{(3)}_{PB}$ and $\widehat{\theta}^{(3)}_{NB}$ are assessed only under $B=80$ and $B=100$.

For single and double Bootstrap methods, there is limited improvement in bias and MSE when increasing $B$ from $80$ to $1000$ under the scenarios we considered. Therefore, we use $B=80$ in this simulation study, and $B=1000$ for the real data example in the next section. For other problems, one can implement Bootstrap methods with several values of $B$ to find one that gives stable results within a reasonable computation time. Higher-order Bootstrap methods, for example the triple Bootstrap, can have even worse bias and MSE than the single Bootstrap version: in Table 5, the triple Bootstrap estimators have negative biases of roughly $-0.4$ to $-0.5$ and MSEs of about $2$, roughly three times or more those of the single Bootstrap. The high MSE of double Bootstrap estimators is carried over to triple Bootstrap estimators by an additional layer of iteration. Since they also require even more intensive computation, triple or higher-order Bootstrap methods are not recommended for the settings considered.
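The suggestion to try several values of $B$ can be sketched as follows. This is a minimal illustration and not the author's implementation: it assumes the textbook single-Bootstrap bias correction $2\widehat{\theta} - \overline{\theta^{*}}$ applied to the maximum of group means, draws group means from Normal distributions, and uses hypothetical observed means and standard errors. The helper names `n_resamples`, `single_pb` and `spread_by_B` are invented for this sketch. It measures resampling noise for one fixed dataset, which is a different quantity from the bias and MSE across simulated datasets reported in Table 5.

```python
import random
import statistics
import time


def n_resamples(B, order):
    """Total resamples of a nested Bootstrap: B for single, B**2 for double, B**3 for triple."""
    return B ** order


def single_pb(means, ses, B, rng):
    """Single parametric Bootstrap correction of max(means): 2 * theta_hat - mean(theta*).

    Assumption: each group mean is resampled from Normal(mean_i, se_i**2).
    """
    theta_hat = max(means)
    boot = [max(rng.gauss(m, s) for m, s in zip(means, ses)) for _ in range(B)]
    return 2 * theta_hat - statistics.fmean(boot)


def spread_by_B(means, ses, B_values, reps=20, seed=0):
    """For each B, repeat the estimator `reps` times; return {B: (Monte Carlo SD, seconds)}."""
    rng = random.Random(seed)
    out = {}
    for B in B_values:
        start = time.perf_counter()
        ests = [single_pb(means, ses, B, rng) for _ in range(reps)]
        out[B] = (statistics.stdev(ests), time.perf_counter() - start)
    return out


if __name__ == "__main__":
    # Hypothetical observed group means and standard errors (not from the paper).
    means = [1.0, 1.1, 1.2]
    ses = [0.3, 0.3, 0.3]
    for B, (sd, secs) in spread_by_B(means, ses, [80, 100, 500, 1000]).items():
        print(f"B={B}: Monte Carlo SD={sd:.4f}, time={secs:.3f}s, "
              f"double Bootstrap would need {n_resamples(B, 2)} resamples")
```

A value of $B$ might then be chosen as the smallest one whose Monte Carlo SD is negligible relative to the estimator's sampling variability, keeping in mind that the cost grows as $B^2$ and $B^3$ for double and triple Bootstrap.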

| $\bm{\theta}$ | $B$ | $\widehat{\theta}^{(1)}_{PB}$ | $\widehat{\theta}^{(2)}_{PB}$ | $\widehat{\theta}^{(3)}_{PB}$ | $\widehat{\theta}^{(1)}_{NB}$ | $\widehat{\theta}^{(2)}_{NB}$ | $\widehat{\theta}^{(3)}_{NB}$ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (1, 1, 1) | 80 | 0.40 (0.65) | 0.06 (0.83) | -0.37 (1.97) | 0.40 (0.65) | 0.07 (0.83) | -0.36 (1.94) |
| | 100 | 0.39 (0.63) | 0.05 (0.82) | -0.39 (1.95) | 0.40 (0.63) | 0.06 (0.82) | -0.38 (1.92) |
| | 500 | 0.41 (0.64) | 0.07 (0.81) | | 0.41 (0.65) | 0.08 (0.81) | |
| | 1000 | 0.38 (0.62) | 0.04 (0.80) | | 0.39 (0.63) | 0.05 (0.80) | |
| (1, 1, 1.2) | 80 | 0.28 (0.57) | -0.06 (0.84) | -0.49 (2.08) | 0.28 (0.57) | -0.05 (0.84) | -0.47 (2.04) |
| | 100 | 0.27 (0.55) | -0.07 (0.83) | -0.51 (2.07) | 0.27 (0.56) | -0.06 (0.82) | -0.49 (2.03) |
| | 500 | 0.28 (0.56) | -0.06 (0.82) | | 0.28 (0.57) | -0.05 (0.81) | |
| | 1000 | 0.26 (0.55) | -0.08 (0.82) | | 0.27 (0.55) | -0.07 (0.81) | |
| (1, 1.1, 1.2) | 80 | 0.31 (0.59) | -0.03 (0.84) | -0.46 (2.06) | 0.31 (0.59) | -0.02 (0.84) | -0.45 (2.03) |
| | 100 | 0.30 (0.57) | -0.04 (0.82) | -0.48 (2.03) | 0.30 (0.57) | -0.04 (0.82) | -0.47 (2.00) |
| | 500 | 0.31 (0.58) | -0.03 (0.81) | | 0.32 (0.58) | -0.02 (0.81) | |
| | 1000 | 0.29 (0.57) | -0.05 (0.82) | | 0.30 (0.57) | -0.04 (0.81) | |
| (1, 1.2, 1.2) | 80 | 0.34 (0.61) | 0.00 (0.85) | -0.43 (2.04) | 0.35 (0.61) | 0.02 (0.84) | -0.42 (2.01) |
| | 100 | 0.33 (0.59) | 0.00 (0.82) | -0.44 (2.01) | 0.34 (0.60) | 0.00 (0.82) | -0.43 (1.97) |
| | 500 | 0.35 (0.61) | 0.01 (0.82) | | 0.35 (0.61) | 0.02 (0.81) | |
| | 1000 | 0.33 (0.59) | -0.01 (0.82) | | 0.33 (0.59) | 0.00 (0.81) | |
Table 5: Marginal bias and MSE (in parentheses) of single, double and triple Bootstrap methods with varying $B$. Triple Bootstrap methods are assessed only under $B=80$ and $B=100$.

5 Real Data Example

AWARD-5 was an adaptive, dose-finding, seamless Phase 2/3 study of dulaglutide for the treatment of type 2 diabetes mellitus (Geiger et al., 2012). The study had a dose-finding portion (Stage 1) with Bayesian response-adaptive randomization to evaluate 7 dulaglutide doses, and a fixed-allocation portion (Stage 2) to confirm the findings for 2 selected doses (0.75 mg and 1.5 mg) (Skrivanek et al., 2014). The adaptive randomization in Stage 1 and the dose selection at the end of Stage 1 were informed by a clinical utility index (CUI), a single metric that reflects four prespecified safety and efficacy response measures (Geiger et al., 2012). Sample size re-estimation was also performed for Stage 2 based on the data from Stage 1 (Geiger et al., 2012; Skrivanek et al., 2014; ClinicalTrials.gov, 2015).

For illustration purposes, we consider a simplified problem that treats Stage 1 as a previous Phase 2 study and Stage 2 as the new Phase 3 study. Our goal is to accurately estimate the response mean of the selected group, dulaglutide 1.5 mg, based on results in Stage 1, in order to plan its sample size for Stage 2. The dosing regimen dulaglutide 1.5 mg was selected as the most efficacious group for further testing in Stage 2 during the actual conduct of AWARD-5 (Skrivanek et al., 2014). Assessments are based on the primary efficacy endpoint of change from Baseline (CHG) in glycosylated hemoglobin (HbA1c) at Week 52. For notational consistency, we use the negative of CHG (the decrease in HbA1c), so that a larger value denotes a better response. Table 6 summarizes the response mean (based on the Bayesian posterior mean), the sample size, and the standard deviation (based on a Normal approximation of the Bayesian $95\%$ credible interval) for each of the 7 active treatment groups in the dose-selection Stage 1 (Skrivanek et al., 2014). Since the publicly available results are only summary statistics by group, as in Table 6, we apply the single PB $\widehat{\theta}^{(1)}_{PB}$, the double PB $\widehat{\theta}^{(2)}_{PB}$, and the hybrid estimator $\widehat{\theta}^{(2)}_{PB,S}$ to estimate the response mean of 1.5 mg for sample size re-assessment in Stage 2.
The Bootstrap sample size is $B=1{,}000$, and therefore $\widehat{\theta}^{(1)}_{PB}$, $\widehat{\theta}^{(2)}_{PB}$ and $\widehat{\theta}^{(2)}_{PB,S}$ require $10^{3}$, $10^{6}$ and $10^{6}$ resamples, respectively.
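The text states only that the SDs in Table 6 come from a Normal approximation of the $95\%$ credible intervals. One plausible reading, shown here purely as an assumption, is that the interval half-width equals $z_{0.975}$ times the standard error of the group mean, and that the SD is the standard error times $\sqrt{n}$. The function names below are hypothetical, and the interval in the usage lines is implied by this assumption rather than taken from Skrivanek et al. (2014).

```python
from statistics import NormalDist


def sd_from_interval(lower, upper, n, level=0.95):
    """Back out a group SD from an interval for the mean.

    Assumption: interval = mean +/- z * SD / sqrt(n), with z the Normal quantile.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = (upper - lower) / (2 * z)
    return se * n ** 0.5


def interval_from_sd(mean, sd, n, level=0.95):
    """Inverse of sd_from_interval under the same Normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sd / n ** 0.5
    return mean - half_width, mean + half_width


if __name__ == "__main__":
    # Dulaglutide 1.5 mg row of Table 6: mean 1.33, n = 18, SD = 0.67.
    lower, upper = interval_from_sd(1.33, 0.67, 18)
    print(f"implied interval: ({lower:.2f}, {upper:.2f})")
    print(f"recovered SD: {sd_from_interval(lower, upper, 18):.2f}")
```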

The traditional estimator $\widehat{\theta}$ in (3) is $1.33$, the maximum of the -CHGs from the 7 active treatment groups. Our three PB estimators are $\widehat{\theta}^{(1)}_{PB}=1.28$, $\widehat{\theta}^{(2)}_{PB}=1.20$ and $\widehat{\theta}^{(2)}_{PB,S}=1.16$, with computation times on a standard laptop of $0.04$ seconds, $23.1$ seconds and $23.1$ seconds, respectively. These results are consistent with the simulation results and conclusions in Section 4, where $\widehat{\theta}$ has a large positive estimation bias and $\widehat{\theta}^{(1)}_{PB}$ has a moderate positive bias, while $\widehat{\theta}^{(2)}_{PB}$ and $\widehat{\theta}^{(2)}_{PB,S}$ have biases close to zero.

| Group | -CHG of HbA1c at Week 52 | $n$ | SD |
| --- | --- | --- | --- |
| Dulaglutide 0.25 mg | 0.82 | 13 | 0.55 |
| Dulaglutide 0.5 mg | 0.95 | 16 | 0.42 |
| Dulaglutide 0.75 mg | 0.93 | 20 | 0.59 |
| Dulaglutide 1 mg | 1.00 | 8 | 0.40 |
| Dulaglutide 1.5 mg | 1.33 | 18 | 0.67 |
| Dulaglutide 2 mg | 1.28 | 24 | 0.49 |
| Dulaglutide 3 mg | 1.00 | 10 | 0.42 |

| Estimator | Value |
| --- | --- |
| $\widehat{\theta}$ | 1.33 |
| $\widehat{\theta}^{(1)}_{PB}$ | 1.28 |
| $\widehat{\theta}^{(2)}_{PB}$ | 1.20 |
| $\widehat{\theta}^{(2)}_{PB,S}$ | 1.16 |
Table 6: Summary statistics of Stage 1 are based on Bayesian posterior means and $95\%$ credible intervals of CHG of HbA1c at Week 52 (Skrivanek et al., 2014). The standard deviations (SD) are computed by Normal approximation of the $95\%$ credible intervals. Values of 4 estimators are presented.
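How single and double PB estimates can be computed from summary statistics alone is sketched below using the Table 6 inputs. This is an illustration under stated assumptions, not the author's code (the R code referenced in the Supplementary materials is the authoritative implementation). It assumes the textbook bias-correction identities $2\widehat{\theta} - \overline{\theta^{*}}$ for the single Bootstrap and $3\widehat{\theta} - 3\overline{\theta^{*}} + \overline{\theta^{**}}$ for the double Bootstrap, resamples each group mean from a Normal distribution with standard error $\mathrm{SD}/\sqrt{n}$, and holds the SDs fixed at both levels. The function name `pb_estimates` is hypothetical. The hybrid estimator is omitted because its shrinkage step is defined elsewhere in the paper. Because of these simplifications and Monte Carlo error, the output should not be expected to reproduce the reported values $1.28$ and $1.20$ exactly.

```python
import random

# Summary statistics from Table 6: (group, mean of -CHG, n, SD).
STAGE1 = [
    ("Dulaglutide 0.25 mg", 0.82, 13, 0.55),
    ("Dulaglutide 0.5 mg", 0.95, 16, 0.42),
    ("Dulaglutide 0.75 mg", 0.93, 20, 0.59),
    ("Dulaglutide 1 mg", 1.00, 8, 0.40),
    ("Dulaglutide 1.5 mg", 1.33, 18, 0.67),
    ("Dulaglutide 2 mg", 1.28, 24, 0.49),
    ("Dulaglutide 3 mg", 1.00, 10, 0.42),
]


def draw_means(means, ses, rng):
    """Draw one set of group means from Normal(mean_i, se_i**2)."""
    return [rng.gauss(m, s) for m, s in zip(means, ses)]


def pb_estimates(data, B, seed=0):
    """Return (theta_hat, single PB, double PB) for the maximum group mean.

    Assumed forms (standard Bootstrap bias correction, not confirmed by the paper):
      single = 2 * theta_hat - mean(theta*)
      double = 3 * theta_hat - 3 * mean(theta*) + mean(theta**)
    """
    rng = random.Random(seed)
    means = [m for _, m, _, _ in data]
    ses = [sd / n ** 0.5 for _, _, n, sd in data]  # SDs held fixed (assumption)
    theta_hat = max(means)

    level1_sum = 0.0
    level2_sum = 0.0
    for _ in range(B):  # first-level resamples
        star = draw_means(means, ses, rng)
        level1_sum += max(star)
        inner_sum = 0.0
        for _ in range(B):  # second-level resamples, centered at the first-level draw
            inner_sum += max(draw_means(star, ses, rng))
        level2_sum += inner_sum / B

    m1 = level1_sum / B
    m2 = level2_sum / B
    return theta_hat, 2 * theta_hat - m1, 3 * theta_hat - 3 * m1 + m2


if __name__ == "__main__":
    # B = 200 keeps the pure-Python double loop fast; the paper uses B = 1,000.
    theta_hat, single, double = pb_estimates(STAGE1, B=200)
    print(f"theta_hat={theta_hat:.2f}, single PB={single:.2f}, double PB={double:.2f}")
```

The nested loop makes the $B$ versus $B^2$ cost of single versus double Bootstrap explicit. A vectorized implementation would be preferable for $B=1{,}000$.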

6 Discussion

We summarize several attributes of these computational methods in Table 7. PB methods can use either patient-level data or summary results by treatment group, while NB methods and the Jackknife need patient-level data. The Jackknife method requires the least computation and gives exact results, i.e., results free of resampling randomness. Compared with PB, NB does not require specific distributional assumptions. Based on the simulation studies with outliers in Section 4, PB with the Normal assumption performs satisfactorily when inferring response means of continuous endpoints. Bootstrap methods can conduct a second-order resampling to decrease bias further. However, double Bootstrap and Jackknife are observed to have larger MSEs than the traditional estimator $\widehat{\theta}$ based on results in Section 4. Generally speaking, correcting the bias may increase the variance by more than it reduces the squared bias, which results in a larger MSE (Efron and Tibshirani, 1994). The Jackknife method makes a linear approximation to the Bootstrap method, and can be inefficient for nonlinear functions (Efron and Tibshirani, 1994). These arguments may explain why both Bootstrap and Jackknife methods can correct bias but with a larger MSE than $\widehat{\theta}$, and why the Jackknife method usually has the largest MSE. Hybrid estimators based on double Bootstrap and shrinkage can reduce both bias and MSE. Therefore, our overall recommendation is to implement PB methods if only summary results are available, to choose NB methods with patient-level data, and in either case to prefer the hybrid estimators when a balance between bias and MSE reduction is desired.
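The bias-variance argument above can be checked roughly against Table 5. The arithmetic below is a sketch that assumes the tabulated MSE is on the squared scale, that bias and MSE are computed about the same fixed target, and that the rounded simulation values can be treated as exact. It uses the row $\bm{\theta}=(1,1,1)$ with $B=80$.

```latex
\[
\mathrm{MSE}(\widehat{\theta}) = \mathrm{Bias}(\widehat{\theta})^{2} + \mathrm{Var}(\widehat{\theta}),
\]
\[
\mathrm{Var}\bigl(\widehat{\theta}^{(1)}_{PB}\bigr) \approx 0.65 - 0.40^{2} = 0.49,
\qquad
\mathrm{Var}\bigl(\widehat{\theta}^{(2)}_{PB}\bigr) \approx 0.83 - 0.06^{2} \approx 0.83 .
\]
```

Under these assumptions, moving from single to double PB reduces the squared bias by about $0.16$ but raises the variance by about $0.34$, which accounts for the net MSE increase of $0.83 - 0.65 = 0.18$.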

The last row of Table 7 summarizes the limitations of our computational methods for bias correction. Single Bootstrap methods achieve only moderate bias reduction, while double Bootstrap methods have increased MSE and require intensive computation. Hybrid estimators also rely on double Bootstrap and therefore on heavy computation. The Jackknife method achieves only moderate bias reduction and has increased MSE.

This article is not intended to completely fill the evidence gap between previous studies and Phase 3 studies. Under a basic scenario where efficacy profiles of the selected group(s) are the same between studies, we show that our computational methods can characterize the efficacy more accurately than the common practice. We have a broader scope with multiple groups for selection and flexibility to accommodate summary results as compared with some previous works. On top of this framework, additional layers can be added to accommodate more complicated problems, e.g., temporal drift and patient heterogeneity. Our framework can be integrated into MCP-Mod (Bretz et al., 2005) to accommodate the dose-ranging part of previous studies. The proposed method can also be broadly applied to other settings, e.g., response-adaptive randomization design with multiple active treatment groups, patient enrichment designs, and other general selection problems.

In this article, we consider continuous endpoints for illustration. The methods can be generalized to binary and time-to-event endpoints. Other future work includes targeting treatment differences by adjusting for the placebo effect, regression models to accommodate covariates, and improved computational methods to reduce the burden of higher-order Bootstrap methods.

| | Parametric Bootstrap: Single | Parametric Bootstrap: Double | Parametric Bootstrap: Hybrid | Nonparametric Bootstrap: Single | Nonparametric Bootstrap: Double | Nonparametric Bootstrap: Hybrid | Jackknife |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Notation | $\widehat{\theta}^{(1)}_{PB}$ | $\widehat{\theta}^{(2)}_{PB}$ | $\widehat{\theta}^{(2)}_{PB,S}$ | $\widehat{\theta}^{(1)}_{NB}$ | $\widehat{\theta}^{(2)}_{NB}$ | $\widehat{\theta}^{(2)}_{NB,S}$ | $\widehat{\theta}_{JK}$ |
| Data source | Subject level data or summary statistics by groups | Subject level data or summary statistics by groups | Subject level data or summary statistics by groups | Subject level data | Subject level data | Subject level data | Subject level data |
| Number of sampling iterations | $B$ | $B^{2}$ | $B^{2}$ | $B$ | $B^{2}$ | $B^{2}$ | $n$ |
| Features | Can utilize summary statistics; Double Bootstrap or hybrid with shrinkage to increase precision | Can utilize summary statistics; Double Bootstrap or hybrid with shrinkage to increase precision | Can utilize summary statistics; Double Bootstrap or hybrid with shrinkage to increase precision | Free of distribution assumptions; Double Bootstrap or hybrid with shrinkage to increase precision | Free of distribution assumptions; Double Bootstrap or hybrid with shrinkage to increase precision | Free of distribution assumptions; Double Bootstrap or hybrid with shrinkage to increase precision | Exact results; Less computationally intensive |
| Limitations | Moderate bias reduction | Increased MSE; Intensive computation | Intensive computation | Moderate bias reduction | Increased MSE; Intensive computation | Intensive computation | Moderate bias reduction; Increased MSE |

Table 7: Summary table of computational methods for bias correction

Acknowledgements

The author thanks the Editor, the Associate Editor and two reviewers for their insightful comments. This manuscript was supported by AbbVie Inc. AbbVie participated in the review and approval of the content. Tianyu Zhan is employed by AbbVie Inc., and may own AbbVie stock.

Supplementary materials

The R code to replicate results in Sections 4 and 5 is available on GitHub at https://github.com/tian-yu-zhan/Bias_Reduction. Data sharing is not applicable, because the simulation is based on results from the literature.


References

  • Bauer et al. (2010) Bauer, P., F. Koenig, W. Brannath, and M. Posch (2010). Selection and bias—two hostile brothers. Statistics in Medicine 29(1), 1–13.
  • Blumenthal and Cohen (1968) Blumenthal, S. and A. Cohen (1968). Estimation of the larger of two normal means. Journal of the American Statistical Association 63(323), 861–876.
  • Bretz et al. (2005) Bretz, F., J. C. Pinheiro, and M. Branson (2005). Combining multiple comparisons and modeling techniques in dose-response studies. Biometrics 61(3), 738–748.
  • Carreras and Brannath (2013) Carreras, M. and W. Brannath (2013). Shrinkage estimation in two-stage adaptive designs with midtrial treatment selection. Statistics in Medicine 32(10), 1677–1690.
  • ClinicalTrials.gov (2015) ClinicalTrials.gov (2015). A Study of LY2189265 Compared to Sitagliptin in Participants With Type 2 Diabetes Mellitus on Metformin. https://clinicaltrials.gov/ct2/show/NCT00734474.
  • Dahiya (1974) Dahiya, R. C. (1974). Estimation of the mean of the selected population. Journal of the American Statistical Association 69(345), 226–230.
  • Davison and Hinkley (1997) Davison, A. C. and D. V. Hinkley (1997). Bootstrap methods and their application. Cambridge University Press.
  • Efron and Tibshirani (1994) Efron, B. and R. J. Tibshirani (1994). An introduction to the bootstrap. CRC Press.
  • Food and Drug Administration (2017) Food and Drug Administration (2017). 22 Case Studies Where Phase 2 and Phase 3 Trials Had Divergent Results. https://www.fda.gov/media/102332/download.
  • Geiger et al. (2012) Geiger, M. J., Z. Skrivanek, B. Gaydos, J. Chien, S. Berry, D. Berry, and J. H. Anderson Jr (2012). An adaptive, dose-finding, seamless phase 2/3 study of a long-acting glucagon-like peptide-1 analog (dulaglutide): trial design and baseline characteristics. Journal of Diabetes Science and Technology 6(6), 1319–1327.
  • Hsieh (1981) Hsieh, H.-K. (1981). On estimating the mean of the selected population with unknown variance. Communications in Statistics-Theory and Methods 10(18), 1869–1878.
  • Hsu et al. (1986) Hsu, Y.-S., K.-N. Lau, H.-G. Fung, and E. F. Ulveling (1986). Monte Carlo studies on the effectiveness of the bootstrap bias reduction method on 2SLS estimates. Economics Letters 20(3), 233–239.
  • Hwang (1993) Hwang, J. T. (1993). Empirical Bayes estimation for the means of the selected populations. Sankhyā: The Indian Journal of Statistics, Series A, 285–304.
  • ICH Guideline E8 (2022) ICH Guideline E8 (2022). ICH guideline E8 (R1) on general considerations for clinical studies. https://www.ema.europa.eu/en/documents/scientific-guideline/ich-e-8-general-considerations-clinical-trials-step-5_en.pdf.
  • Ishwaei D et al. (1985) Ishwaei D, B., D. Shabma, and K. Krishnamoorthy (1985). Non-existence of unbiased estimators of ordered parameters. Statistics: A Journal of Theoretical and Applied Statistics 16(1), 89–95.
  • Kosmidis (2014) Kosmidis, I. (2014). Bias in parametric estimation: reduction and useful side-effects. Wiley Interdisciplinary Reviews: Computational Statistics 6(3), 185–196.
  • Kumar and Sharma (1993) Kumar, S. and D. Sharma (1993). Unbiased inestimability of the larger of two parameters. Statistics: A Journal of Theoretical and Applied Statistics 24(2), 137–142.
  • Liang et al. (2019) Liang, F., Z. Wu, M. Mo, C. Zhou, J. Shen, Z. Wang, and Y. Zheng (2019). Comparison of treatment effect from randomised controlled phase II trials and subsequent phase III trials using identical regimens in the same treatment setting. European Journal of Cancer 121, 19–28.
  • Lindley (1962) Lindley, D.-V. (1962). Discussion of Professor Stein’s paper “Confidence sets for the mean of a multivariate normal distribution”. Journal of the Royal Statistical Society, Series B 24, 285–287.
  • MacKinnon and Smith Jr (1998) MacKinnon, J. G. and A. A. Smith Jr (1998). Approximate bias correction in econometrics. Journal of Econometrics 85(2), 205–230.
  • Miller (1974) Miller, R. G. (1974). The jackknife-a review. Biometrika 61(1), 1–15.
  • Ouysse (2011) Ouysse, R. (2011). Computationally efficient approximation for the double bootstrap mean bias correction. Economics Bulletin 31(3), 2388–2403.
  • Quenouille (1949) Quenouille, M. H. (1949). Approximate tests of correlation in time-series 3. Mathematical Proceedings of the Cambridge Philosophical Society 45, 483–484.
  • Robertson et al. (2023a) Robertson, D. S., B. Choodari-Oskooei, M. Dimairo, L. Flight, P. Pallmann, and T. Jaki (2023a). Point estimation for adaptive trial designs I: A methodological review. Statistics in Medicine 42(2), 122–145.
  • Robertson et al. (2023b) Robertson, D. S., B. Choodari-Oskooei, M. Dimairo, L. Flight, P. Pallmann, and T. Jaki (2023b). Point estimation for adaptive trial designs II: Practical considerations and guidance. Statistics in Medicine.
  • Rosenkranz (2014) Rosenkranz, G. K. (2014). Bootstrap corrections of treatment effect estimates following selection. Computational Statistics & Data Analysis 69, 220–227.
  • Saville et al. (2022) Saville, B. R., D. A. Berry, N. S. Berry, K. Viele, and S. M. Berry (2022). The Bayesian time machine: accounting for temporal drift in multi-arm platform trials. Clinical Trials 19(5), 490–501.
  • Skrivanek et al. (2014) Skrivanek, Z., B. Gaydos, J. Chien, M. Geiger, M. Heathman, S. Berry, J. Anderson, T. Forst, Z. Milicevic, and D. Berry (2014). Dose-finding results in an adaptive, seamless, randomized trial of once-weekly dulaglutide combined with metformin in type 2 diabetes patients (AWARD-5). Diabetes, Obesity and Metabolism 16(8), 748–756.
  • Stallard and Todd (2005) Stallard, N. and S. Todd (2005). Point estimates and confidence regions for sequential trials involving selection. Journal of Statistical Planning and Inference 135(2), 402–419.
  • Vellaisamy and Sharma (1988) Vellaisamy, P. and D. Sharma (1988). Estimation of the mean of the selected gamma population. Communications in Statistics-Theory and Methods 17(8), 2797–2817.
  • Whitehead (1986) Whitehead, J. (1986). On the bias of maximum likelihood estimation following a sequential test. Biometrika 73(3), 573–581.
  • Zia et al. (2005) Zia, M. I., L. L. Siu, G. R. Pond, and E. X. Chen (2005). Comparison of outcomes of phase II studies and subsequent randomized control studies using identical chemotherapeutic regimens. Journal of Clinical Oncology 23(28), 6982–6991.