Parametric Modal Regression with Error in Covariates

Qingyang Liu
Department of Statistics
University of South Carolina
Columbia, SC 29201
[email protected]
Xianzheng Huang
Department of Statistics
University of South Carolina
Columbia, SC 29201
[email protected]
(June 29, 2024)
Abstract

An inference procedure is proposed to provide consistent estimators of parameters in a modal regression model with a covariate prone to measurement error. A score-based diagnostic tool exploiting parametric bootstrap is developed to assess adequacy of parametric assumptions imposed on the regression model. The proposed estimation method and diagnostic tool are applied to synthetic data generated from simulation experiments and data from real-world applications to demonstrate their implementation and performance. These empirical examples illustrate the importance of adequately accounting for measurement error in the error-prone covariate when inferring the association between a response and covariates based on a modal regression model that is especially suitable for skewed and heavy-tailed response data.

Keywords: Beta distribution · bootstrap · corrected score · $M$-estimation · model misspecification

1 Introduction

The mean, median, and mode are three widely used measures of central tendency of data. The mode can be a more informative and sensible central tendency measure than the other two for data arising from distributions that are heavy-tailed and skewed. This very virtue of the mode and the ubiquity of heavy-tailed and skewed data in biology, sociology, economics, and many other fields of study have recently revived data scientists’ interest in regression methodology focusing on the conditional mode of a response (Chacón, 2020).

While there exists an extensive literature on regression models that relate the mean or the median of a response variable $Y$ to covariates $\mathbf{X}$, there is much less work on regression models tailored for the conditional mode of $Y$ given $\mathbf{X}$ (Sager and Thisted, 1982; Lee, 1989, 1993). Among the limited existing modal regression methods, the majority are in the semi-/non-parametric framework (Yao and Li, 2013; Chen et al., 2016; Ota et al., 2019; Wang et al., 2019; Kemp et al., 2020; Zhang et al., 2021; Ullah et al., 2022; Xiang and Yao, 2022), which typically suffer from low statistical efficiency when compared with their parametric counterparts. One reality that discourages the use of parametric models for inferring the mode is that very few named distributions that allow asymmetry can be conveniently formulated as distribution families indexed by the mode along with other parameters. Among the few groups of authors who considered parametric modal regression models, Aristodemou (2014, Chapter 3) assumed a gamma distribution for a non-negative response with a covariate-dependent mode; Bourguignon et al. (2020) followed a similar model construction while also allowing a covariate-dependent precision parameter for the gamma distribution. Focusing on bounded response data, Zhou and Huang (2020) proposed two modal regression models, one based on a beta distribution and the other based on a generalized biparabolic distribution for the response given covariates. In all three aforementioned works, frequentist likelihood-based methods are developed to infer model parameters. Most recently, Zhou and Huang (2022) unified the mean regression and modal regression in a Bayesian framework by reparameterizing a four-parameter beta distribution with an unknown support so that the mean or the mode of $Y$ depends on $\mathbf{X}$.
Earlier works on Bayesian modal regression, including parametric and nonparametric methods, can also be found in Aristodemou (2014, Chapter 2).

All the above works on modal regression assume that covariates are measured precisely. Data analysts in many disciplines are well aware that, among all variables of interest, some often cannot be measured precisely due to inaccurate measuring devices or human error in data collection. Some variables are in principle inaccessible and only surrogates of them can be measured. For example, one’s long-term blood pressure is an important biomarker associated with one’s heart health, yet it cannot be directly measured. Instead, measurable surrogates of it are blood pressure readings collected during a doctor’s visit, which can be viewed as error-contaminated versions of one’s long-term blood pressure. It has also been well understood that ignoring covariate measurement error in mean regression or quantile regression usually leads to misleading inference results. There exists a large collection of works on mean regression methodology accounting for measurement error (Carroll et al., 2006; Fuller, 2009; Buonaccorsi, 2010; Yi, 2017), and also some works in quantile regression to address this complication (He and Liang, 2000; Wei and Carroll, 2009; Wang et al., 2012). Modal regression methodology that addresses this issue only emerged recently, including the methods developed by Zhou and Huang (2016), Li and Huang (2019), and Shi et al. (2021), all of which opted for a nonparametric model for the error term in the primary regression model. There is a lack of methodology to account for error-prone covariates in parametric modal regression, and our study presented in this article fills the void.

In preparation for proposing a method to account for measurement error in covariates that is applicable to any parametric modal regression model, we first formulate the measurement error model and discuss complications unique to modal regression models in Section 2. For concreteness, we then focus on the beta modal regression model for a response supported on [0, 1] with an error-prone covariate, and propose consistent estimation methods to infer model parameters that account for measurement error in Section 3. A model diagnostic method is developed in Section 4 to detect model misspecifications when adopting the beta modal regression model in a given application. Simulation studies are reported in Section 5 to demonstrate the performance of the estimation and diagnostic methods. We apply the proposed modal regression method accounting for covariate measurement error to data sets arising from two real-life studies in Section 6, where we also discuss revisions of the method to adapt to more general settings. Section 7 gives concluding remarks and future research directions.

2 Data and Model

2.1 Observed data

Suppose that, given $p$ covariates in $\mathbf{X}=(X_1,\ldots,X_p)^{\mathrm{T}}$, $Y$ follows a unimodal distribution specified by the probability density function (pdf), $f_{Y|\mathbf{X}}(y|\mathbf{x})$. Denote by $\theta(\mathbf{x})$ the mode of $Y$ given $\mathbf{X}=\mathbf{x}$. In modal regression without measurement error, one infers $\theta(\mathbf{x})$ based on a random sample of size $n$ from the joint distribution of $(Y,\mathbf{X})$, $\{(Y_j,\mathbf{X}_j)\}_{j=1}^n$, where $\mathbf{X}_j=(X_{1,j},\ldots,X_{p,j})^{\mathrm{T}}$.
Now suppose that a covariate in $\mathbf{X}$, say, $X_1$, is prone to measurement error, and a surrogate $W$ is observed instead of $X_1$, with $n_j$ replicate measures of $X_{1,j}$ in $\widetilde{W}_j=\{W_{j,k}\}_{k=1}^{n_j}$, for $j=1,\ldots,n$. In this study, we assume that $W_{j,k}$ relates to $X_{1,j}$ via an additive measurement error model,

$$W_{j,k}=X_{1,j}+U_{j,k}, \quad \text{for } j=1,\ldots,n \text{ and } k=1,\ldots,n_j, \tag{1}$$

where $\{U_{j,k},\,k=1,\ldots,n_j\}_{j=1}^n$ are independent and identically distributed (i.i.d.) mean-zero measurement errors, which are independent of $\{(Y_j,\mathbf{X}_j)\}_{j=1}^n$ to guarantee nondifferential measurement error as considered in the classical measurement error models (Carroll et al., 2006, Section 2.5).
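To make the data structure in (1) concrete, the following Python sketch generates replicate surrogates for a handful of subjects. It is only an illustration: the normal error distribution, the function name `simulate_surrogates`, and all numerical values are assumptions made here, since (1) itself only requires i.i.d. mean-zero errors.

```python
import random
from statistics import mean

def simulate_surrogates(x1, n_reps, sigma_u, rng):
    """Return W[j][k] = x1[j] + U[j][k], with U i.i.d. N(0, sigma_u^2) (assumed)."""
    return [[x + rng.gauss(0.0, sigma_u) for _ in range(n_j)]
            for x, n_j in zip(x1, n_reps)]

rng = random.Random(2024)
x1 = [rng.gauss(0.0, 1.0) for _ in range(5)]   # true, unobserved covariate values
n_reps = [3, 3, 2, 4, 3]                       # n_j, number of replicates per subject
w = simulate_surrogates(x1, n_reps, sigma_u=0.5, rng=rng)
w_bar = [mean(w_j) for w_j in w]               # subject-level averages of the replicates
```

In practice only `w` (and the response and error-free covariates) would be available to the analyst; `x1` is kept here solely so that the simulated error can be inspected.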

In a naive univariate modal regression analysis using the surrogate data, one treats $W$ as if it were $X=X_1$, and equivalently, views the conditional pdf of $Y$ given $W=w$, $f_{Y|W}(y|w)$, as the same as $f_{Y|X}(y|w)$. As a result, naive modal regression analysis essentially infers the mode of $f_{Y|W}(y|w)$ instead of $\theta(\cdot)$. In the context of univariate mean regression models not limited to linear regression, the attenuation effect of measurement error on covariate effect estimation is often noted in the literature (Carroll et al., 2006; Buonaccorsi, 2010), which causes the estimated covariate effect of a truly influential covariate to be pulled towards zero. Naive modal regression can suffer the same attenuation effect. For instance, if the mean and the mode of $f_{Y|X}(y|x)$ differ by a quantity that does not depend on covariates, such as for a Gumbel distribution that depends on a covariate $X$ only via the mode but not via the scale parameter, then the impact of measurement error on naive inference for the conditional mean mostly carries over to naive inference for $\theta(x)$.
In other model settings where the conditional mean and mode of $Y$ differ by a quantity that does depend on the error-prone covariate, the effect of measurement error on naive modal regression demands investigation on a case-by-case basis. Even before conducting such an investigation, a more fundamental question needs to be addressed, namely whether or not naive modal regression is meaningful, since unimodality of $f_{Y|X}(y|x)$ does not guarantee unimodality of $f_{Y|W}(y|w)$. Indeed, there is an extra layer of complication in modal regression with an error-prone covariate that does not exist in mean regression since, if the mean of $Y$ given $X$, $\mu(X)$, is well defined, then the mean of $Y$ given $W$ is $E\{\mu(X)|W\}$, which is also well defined in most settings of practical interest. Because of this additional complication, correcting naive inference to account for measurement error in modal regression is more challenging than the counterpart task in mean regression. For example, a strategy that can be easy to implement in mean regression is to correct the bias in a naive estimator of a parameter to produce an improved estimator accounting for measurement error (Carroll et al., 2006, Section 3.4). This idea of de-biasing naive estimation may not be a sensible approach now that the existence of a naive mode function is in question.
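The attenuation effect mentioned above is easiest to see in simple linear mean regression, where the naive slope shrinks by the factor $\sigma_x^2/(\sigma_x^2+\sigma_u^2)$ under classical additive error. The Python sketch below illustrates this for the mean regression case only; it is not a modal regression fit, and the sample size, variances, and the helper `ols_slope` are assumptions chosen for the illustration.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(1)
n, beta1, sigma_x, sigma_u = 20000, 2.0, 1.0, 1.0
x = [rng.gauss(0.0, sigma_x) for _ in range(n)]           # true covariate
y = [1.0 + beta1 * xi + rng.gauss(0.0, 0.5) for xi in x]  # linear mean model
w = [xi + rng.gauss(0.0, sigma_u) for xi in x]            # error-prone surrogate

slope_true = ols_slope(x, y)    # close to beta1
slope_naive = ols_slope(w, y)   # close to beta1 * reliability, i.e. pulled towards zero
reliability = sigma_x ** 2 / (sigma_x ** 2 + sigma_u ** 2)
```

With equal variances for the covariate and the error, the reliability is one half, so the naive slope lands near half of the true slope.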

2.2 Regression model

We propose to account for measurement error when inferring parameters in a modal regression model by exploiting the idea of corrected scores. In particular, we focus on modeling a bounded response $Y$, which is commonly encountered in practice, such as test scores, disease prevalence, and the fraction of household income spent on food. Any bounded response with a known support can be scaled to be supported on the unit interval [0, 1]. The beta distribution is a parametric family that encompasses various shapes of distributions supported on [0, 1], and thus serves as a relatively flexible basis for building a regression model for such responses. For a random variable $V$ that follows a beta distribution with shape parameters $\alpha_1,\alpha_2>0$, i.e., $V\sim\text{beta}(\alpha_1,\alpha_2)$, its density function is

$$f(v;\alpha_1,\alpha_2)=\frac{\Gamma(\alpha_1+\alpha_2)}{\Gamma(\alpha_1)\Gamma(\alpha_2)}v^{\alpha_1-1}(1-v)^{\alpha_2-1}, \quad \text{for } 0<v<1,$$

where $\Gamma(\cdot)$ is the Gamma function. When $\alpha_1,\,\alpha_2>1$, this distribution has a unique mode given by $\theta=(\alpha_1-1)/(\alpha_1+\alpha_2-2)$. To prepare for modal regression, we reparameterize the beta distribution by setting $\alpha_1=1+m\theta$ and $\alpha_2=1+m(1-\theta)$, where $m>0$ plays the role of a precision parameter, with a larger value of $m$ leading to a smaller variance of the distribution (Zhou and Huang, 2020). A similar parameterization of the beta distribution was used in Chen (1999) to formulate the beta kernel in kernel density estimators, and also in Bagnato and Punzo (2013) to construct beta mixture distributions. In both earlier works, the beta family is indexed by $\theta$ and a dispersion parameter equal to the reciprocal of $m$. The parameterization of beta distributions used in our study is also in line with the one in Kruschke (2015, see Equation (6.6)), except that a concentration parameter equal to our $m$ plus 2 is used there in place of our precision parameter.
Despite these small differences, all aforementioned parameterizations highlight the mode as the location parameter, with the original shape parameters $\alpha_1$ and $\alpha_2$ specified by the mode and a precision/concentration/dispersion parameter that is of secondary interest in drawing inference. By construction, as long as the mode $\theta\in(0,1)$ exists, which we assume throughout the study, we have $\alpha_1,\alpha_2>1$ following our parameterization.
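The mode-based parameterization can be checked numerically with a few lines of Python. The sketch below maps $(\theta, m)$ to $(\alpha_1, \alpha_2)$, recovers the mode from the shape parameters, and evaluates the usual beta variance formula; the function names are hypothetical and chosen only for this illustration.

```python
def beta_shapes(theta, m):
    """Map (mode theta, precision m) to the shape parameters (alpha1, alpha2)."""
    if not (0.0 < theta < 1.0 and m > 0.0):
        raise ValueError("need 0 < theta < 1 and m > 0")
    return 1.0 + m * theta, 1.0 + m * (1.0 - theta)

def beta_mode(a1, a2):
    """Mode of beta(a1, a2); valid when a1 > 1 and a2 > 1."""
    return (a1 - 1.0) / (a1 + a2 - 2.0)

def beta_variance(a1, a2):
    """Variance of beta(a1, a2)."""
    s = a1 + a2
    return a1 * a2 / (s * s * (s + 1.0))

a1, a2 = beta_shapes(theta=0.3, m=10.0)
mode_back = beta_mode(a1, a2)                       # recovers theta
var_low_m = beta_variance(*beta_shapes(0.3, 5.0))   # smaller precision
var_high_m = beta_variance(*beta_shapes(0.3, 50.0)) # larger precision, smaller variance
```

Both shape parameters exceed 1 for any admissible $(\theta, m)$, so the mode is always interior and unique under this parameterization.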

With a beta distribution family indexed by $(\theta,m)$ formulated, a beta modal regression model follows by introducing covariates-dependent mode of $Y$, $\theta(\mathbf{X})=g(\boldsymbol{\beta}^{\mathrm{T}}\tilde{\mathbf{X}})$, where $\tilde{\mathbf{X}}=(1,\mathbf{X}^{\mathrm{T}})^{\mathrm{T}}$, $\boldsymbol{\beta}=(\beta_0,\beta_1,\ldots,\beta_p)^{\mathrm{T}}$ with $\beta_0$ being the intercept and $\beta_1,\ldots,\beta_p$ representing covariate effects associated with the $p$ covariates in $\mathbf{X}$, and $g(\cdot)$ is a user-specified link function, such as logit, probit, log-log, and complementary log-log. Now a modal regression model for $Y$ is fully specified by the following conditional distribution of $Y$ given $\mathbf{X}$,

$$Y|\mathbf{X}\sim\text{beta}(1+m\theta(\mathbf{X}),\,1+m\{1-\theta(\mathbf{X})\}). \tag{2}$$

Combining (2) with (1) completes the specification of a modal regression model for a response $Y$ supported on [0, 1] and covariates $\mathbf{X}=(X_1,\ldots,X_p)^{\mathrm{T}}$, with $X_1$ subject to additive nondifferential measurement error. The focal point of inference lies in parameters involved in the primary regression model in (2), $\boldsymbol{\Omega}=(\boldsymbol{\beta}^{\mathrm{T}},m)^{\mathrm{T}}$. Parameters appearing in (1) are of secondary interest but required to specify the measurement error distribution.
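A small simulation can help fix ideas about how (1) and (2) fit together. The Python sketch below draws responses from (2) with a single covariate and then contaminates that covariate as in (1). Several choices are assumptions made only for illustration: $g$ is taken to be the inverse logit (the function `expit`), the measurement error is normal, each subject has three replicates, and the parameter values are arbitrary.

```python
import math
import random

def expit(t):
    """Inverse logit; plays the role of g under the logit choice (assumed here)."""
    return 1.0 / (1.0 + math.exp(-t))

def simulate_beta_modal(x, beta, m, rng):
    """Draw Y_j ~ beta(1 + m*theta_j, 1 + m*(1 - theta_j)), theta_j = expit(b0 + b1*x_j)."""
    y = []
    for xj in x:
        theta = expit(beta[0] + beta[1] * xj)
        y.append(rng.betavariate(1.0 + m * theta, 1.0 + m * (1.0 - theta)))
    return y

rng = random.Random(7)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]                   # true covariate
y = simulate_beta_modal(x, beta=(0.25, 0.5), m=20.0, rng=rng)   # responses from (2)
w = [[xj + rng.gauss(0.0, 0.4) for _ in range(3)] for xj in x]  # replicates from (1)
```

The observed data in this toy setting would be the pairs of `y` and `w`; the list `x` stands for the unobserved true covariate.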

3 Parameter estimation

3.1 Maximum likelihood estimation

In the absence of measurement error, one may carry out maximum likelihood estimation of $\boldsymbol{\Omega}$ straightforwardly by solving the normal score equations for $\boldsymbol{\Omega}$. More specifically, the log-likelihood of error-free data, $\mathcal{D}=\{(Y_j,\mathbf{X}_j)\}_{j=1}^n$, is

$$\begin{aligned}
\ell(\boldsymbol{\Omega};\mathcal{D}) ={}& \sum_{j=1}^n \ell(\boldsymbol{\Omega};Y_j,\mathbf{X}_j) \\
={}& n\log\Gamma(2+m)-\sum_{j=1}^n\log\left(\Gamma(1+m\theta(\mathbf{X}_j))\,\Gamma(1+m\{1-\theta(\mathbf{X}_j)\})\right) \\
& +m\sum_{j=1}^n\left[\theta(\mathbf{X}_j)\log Y_j+\{1-\theta(\mathbf{X}_j)\}\log(1-Y_j)\right].
\end{aligned} \tag{3}$$

Differentiating (3) with respect to $\boldsymbol{\Omega}$ leads to the score equations, $\sum_{j=1}^n\boldsymbol{\Psi}_0(\boldsymbol{\Omega};Y_j,\mathbf{X}_j)=\mathbf{0}$, where the score vector evaluated at the $j$-th data point, $\boldsymbol{\Psi}_0(\boldsymbol{\Omega};Y_j,\mathbf{X}_j)$, consists of the following scores, for $j=1,\ldots,n$,

$$\begin{aligned}
\frac{\partial \ell(\boldsymbol{\Omega};Y_j,\mathbf{X}_j)}{\partial\boldsymbol{\beta}} ={}& \left\{-m\psi(1+m\theta(\mathbf{X}_j))+m\psi(1+m\{1-\theta(\mathbf{X}_j)\})+m\log\left(\frac{Y_j}{1-Y_j}\right)\right\} \\
& \times g'(\boldsymbol{\beta}^{\mathrm{T}}\tilde{\mathbf{X}}_j)\tilde{\mathbf{X}}_j,
\end{aligned} \tag{4}$$

$$\begin{aligned}
\frac{\partial \ell(\boldsymbol{\Omega};Y_j,\mathbf{X}_j)}{\partial m} ={}& \psi(2+m)-\theta(\mathbf{X}_j)\psi(1+m\theta(\mathbf{X}_j))-\{1-\theta(\mathbf{X}_j)\}\psi(1+m\{1-\theta(\mathbf{X}_j)\}) \\
& +\theta(\mathbf{X}_j)\log Y_j+\{1-\theta(\mathbf{X}_j)\}\log(1-Y_j),
\end{aligned} \tag{5}$$

where $\psi(t)=(d/dt)\log\Gamma(t)$ is the digamma function and $g'(t)=(d/dt)g(t)$.
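As a sanity check on (3) to (5), the following Python sketch codes the per-observation log-likelihood and scores for a single covariate, so that the scores can be compared against finite differences of the log-likelihood. The inverse-logit choice for $g$, the helper names (`digamma`, `expit`, `loglik_j`, `score_j`), and the series-based digamma approximation are illustrative assumptions and not part of the method itself.

```python
import math

def digamma(x):
    """Approximate digamma for x > 0: shift x upward by recurrence, then use an asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x          # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return acc + math.log(x) - 0.5 / x - series

def expit(t):
    """Inverse logit, used as g in this sketch."""
    return 1.0 / (1.0 + math.exp(-t))

def loglik_j(beta, m, y, x):
    """Log-likelihood contribution of one error-free observation, following (3)."""
    theta = expit(beta[0] + beta[1] * x)
    a1, a2 = 1.0 + m * theta, 1.0 + m * (1.0 - theta)
    return (math.lgamma(2.0 + m) - math.lgamma(a1) - math.lgamma(a2)
            + m * (theta * math.log(y) + (1.0 - theta) * math.log(1.0 - y)))

def score_j(beta, m, y, x):
    """Scores (4) and (5) for one observation: (d/d beta0, d/d beta1, d/d m)."""
    theta = expit(beta[0] + beta[1] * x)
    a1, a2 = 1.0 + m * theta, 1.0 + m * (1.0 - theta)
    d_theta = m * (-digamma(a1) + digamma(a2) + math.log(y / (1.0 - y)))
    g_prime = theta * (1.0 - theta)   # derivative of the inverse logit
    d_m = (digamma(2.0 + m) - theta * digamma(a1) - (1.0 - theta) * digamma(a2)
           + theta * math.log(y) + (1.0 - theta) * math.log(1.0 - y))
    return (d_theta * g_prime, d_theta * g_prime * x, d_m)

scores = score_j((0.2, -0.5), 8.0, 0.35, 1.3)
```

Summing `score_j` over all observations and setting the result to zero gives the score equations for this one-covariate special case; a library digamma could replace the approximation where available.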

3.2 Monte-Carlo corrected scores

In the presence of measurement error, a naive estimator of $\boldsymbol{\Omega}$ solves the naive score equations resulting from replacing $X_{1,j}$ with $\overline{W}_j=n_j^{-1}\sum_{k=1}^{n_j}W_{j,k}$ in (4) and (5), for $j=1,\ldots,n$. As pointed out earlier and also evidenced in the simulation study to be presented later, this naive treatment typically results in misleading inference for $\boldsymbol{\Omega}$. We propose to follow the idea of the corrected score method (Nakamura, 1990) and revise the naive scores to obtain estimating equations that adequately account for measurement error.
The thrust of the corrected score method is to use the observed error-prone data, $\mathcal{D}^*=\{(Y_j,\widetilde{W}_j,\mathbf{X}_{-1,j})\}_{j=1}^n$ with $\widetilde{W}_j=\{W_{j,k}\}_{k=1}^{n_j}$ and $\mathbf{X}_{-1,j}=(X_{2,j},\ldots,X_{p,j})^{\mathrm{T}}$, to construct unbiased estimators of the above normal scores.
In this vein of thinking, one treats $\{X_{1,j}\}_{j=1}^n$ as unknown parameters instead of realizations of a random variable, and thus one takes on the functional point of view as opposed to the structural viewpoint of measurement error models where a distribution for $X_1$ is assumed (Carroll et al., 2006, Section 2.1).
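The notion of an unbiased estimator of an error-free quantity can be illustrated with two textbook corrected functions under normal measurement error with known variance $\sigma_u^2$: $w^2-\sigma_u^2$ is unbiased for $x^2$, and $\exp(bw-b^2\sigma_u^2/2)$ is unbiased for $\exp(bx)$. The Python sketch below verifies both by Monte Carlo. These toy functions are stand-ins chosen for illustration, not the corrected scores for the beta modal regression model, and all names and numerical values are assumptions.

```python
import math
import random

def corrected_square(w, sigma_u):
    """Unbiased for x**2 when w = x + U with U ~ N(0, sigma_u**2)."""
    return w * w - sigma_u ** 2

def corrected_exp(w, b, sigma_u):
    """Unbiased for exp(b*x) under the same normal error model."""
    return math.exp(b * w - 0.5 * (b * sigma_u) ** 2)

rng = random.Random(11)
x, b, sigma_u, reps = 0.8, 0.7, 0.5, 200000
draws = [x + rng.gauss(0.0, sigma_u) for _ in range(reps)]   # surrogates of a fixed x

naive_sq = sum(w * w for w in draws) / reps                         # overshoots x**2 by about sigma_u**2
corr_sq = sum(corrected_square(w, sigma_u) for w in draws) / reps   # approximately x**2
corr_exp = sum(corrected_exp(w, b, sigma_u) for w in draws) / reps  # approximately exp(b*x)
```

Here `x` is held fixed across draws, which mirrors the functional viewpoint in which the true covariate values are treated as unknown constants.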

We begin by applying the Monte-Carlo-amenable method proposed by Stefanski et al. (2005), a method originating from the idea described in Stefanski (1989). More specifically, we construct a score, $\boldsymbol{\Psi}(\boldsymbol{\Omega};Y_j,\widetilde{W}_j,\mathbf{X}_{-1,j})$, that satisfies $E\{\boldsymbol{\Psi}(\boldsymbol{\Omega};Y_j,\widetilde{W}_j,\mathbf{X}_{-1,j})\mid Y_j,\mathbf{X}_j\}=\boldsymbol{\Psi}_0(\boldsymbol{\Omega};Y_j,\mathbf{X}_j)$, for $j=1,\ldots,n$. This particular method is especially suitable for settings with a univariate error-prone covariate subject to normal measurement error $U$. We will address violation of the normality assumption on $U$ in Section 3, and describe revisions of the method to adapt to settings with multiple error-prone covariates in Section 6. 
As shown in Stefanski et al. (2005, Theorem 1), the minimum variance unbiased estimator of $\boldsymbol{\Psi}_0(\boldsymbol{\Omega};Y_j,\mathbf{X}_j)$ is given by

$$\boldsymbol{\Psi}(\boldsymbol{\Omega};Y_j,\widetilde{W}_j,\mathbf{X}_{-1,j})=E\left\{\left.\boldsymbol{\Psi}_0\left(\boldsymbol{\Omega};Y_j,\overline{W}_j+i\sqrt{\frac{(n_j-1)S_j^2}{n_j}}\,T,\mathbf{X}_{-1,j}\right)\right|Y_j,\overline{W}_j,S_j^2,\mathbf{X}_{-1,j}\right\}, \tag{6}$$

where $i$ is the imaginary unit, $S_j^2$ is the sample variance of $\widetilde{W}_j=\{W_{j,k}\}_{k=1}^{n_j}$, and $T=Z_1/(\sum_{k=1}^{n_j-1}Z_k^2)^{1/2}$ is independent of all observed data, in which $Z_1,\ldots,Z_{n_j-1}$ are independent standard normal random variables. The estimator of $\boldsymbol{\Psi}_0(\boldsymbol{\Omega};Y_j,\mathbf{X}_j)$ in (6) originates from a jackknife exact-extrapolant estimator constructed for the purpose of estimating a function of the mean of a normal distribution based on a random sample from the distribution. 
In the context of (6), this random sample is $\widetilde{W}_j$ from $N(X_{1,j},\sigma_u^2)$, where $\sigma_u^2$ is the measurement error variance, i.e., assuming $U\sim N(0,\sigma_u^2)$ in (1), and the function of the normal mean $X_{1,j}$ is $\boldsymbol{\Psi}_0(\boldsymbol{\Omega};Y_j,X_{1,j},\mathbf{X}_{-1,j})$. The expectation in (6) cannot be derived in closed form. But since the only quantity viewed as random when deriving this conditional expectation is $T$, which is independent of the observed data, one can estimate this expectation unbiasedly via an empirical mean based on simulated random samples of $T$. 
Moreover, as shown in Stefanski et al. (2005), even though (6) is complex-valued by construction, the expectation of its imaginary part is zero as long as $\boldsymbol{\Psi}_0(\boldsymbol{\Omega};Y_j,X_{1,j},\mathbf{X}_{-1,j})$ is infinitely differentiable with respect to $X_{1,j}$, which is guaranteed in our case by choosing a link function $g(t)$ that is infinitely differentiable. Hence, using the real part of the empirical version of (6) suffices for constructing an unbiased estimator of $\boldsymbol{\Psi}_0(\boldsymbol{\Omega};Y_j,\mathbf{X}_j)$. This leads to the following corrected score based on a simulated random sample of $T$ of size $B$, $\widetilde{T}_j=\{T_{j,b}\}_{b=1}^{B}$, for $j=1,\ldots,n$,

$$\boldsymbol{\Psi}(\boldsymbol{\Omega};Y_j,\widetilde{W}_j,\widetilde{T}_j,\mathbf{X}_{-1,j})=\frac{1}{B}\sum_{b=1}^{B}\mathrm{Re}\left\{\boldsymbol{\Psi}_0\left(\boldsymbol{\Omega};Y_j,\overline{W}_j+i\sqrt{\frac{(n_j-1)S_j^2}{n_j}}\,T_{j,b},\mathbf{X}_{-1,j}\right)\right\}, \tag{7}$$

where $\mathrm{Re}(t)$ denotes the real part of a complex-valued $t$.
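The Monte Carlo average in (7) can be sketched for a scalar function of the unknown covariate value. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `draw_t` and `corrected_estimate`, the replicate values, and the test function $f(x)=x^2$ are all hypothetical choices made for this example. For $f(x)=x^2$ the expectation over $T$ is available in closed form, $\overline{W}_j^2-S_j^2/n_j$, which gives a simple check on the simulation.

```python
import math
import random
import statistics


def draw_t(n_rep, rng):
    """Draw T = Z_1 / sqrt(Z_1^2 + ... + Z_{n_rep-1}^2) with Z_k iid N(0, 1)."""
    z = [rng.gauss(0.0, 1.0) for _ in range(n_rep - 1)]
    return z[0] / math.sqrt(sum(v * v for v in z))


def corrected_estimate(f, replicates, n_draws, rng):
    """Monte Carlo average in the style of (7) for a scalar function f.

    f must accept a complex argument; replicates are W_{j,1}, ..., W_{j,n_j}.
    """
    n_rep = len(replicates)
    if n_rep < 2:
        raise ValueError("at least two replicate measures are required")
    w_bar = statistics.fmean(replicates)
    s2 = statistics.variance(replicates)          # sample variance S_j^2
    scale = math.sqrt((n_rep - 1) * s2 / n_rep)   # multiplier of i*T in (7)
    total = 0.0
    for _ in range(n_draws):
        t = draw_t(n_rep, rng)
        total += f(complex(w_bar, scale * t)).real  # keep the real part only
    return total / n_draws


rng = random.Random(2024)
replicates = [0.9, 1.4, 1.1]                      # hypothetical replicate measures
estimate = corrected_estimate(lambda x: x * x, replicates, 20000, rng)
# For f(x) = x^2 the exact expectation over T is mean(W)^2 - S^2 / n_j.
target = statistics.fmean(replicates) ** 2 - statistics.variance(replicates) / 3
print(estimate, target)
```

With only two replicates, $T$ takes the values $\pm 1$, so for $f(x)=x^2$ every draw returns the same real part and the Monte Carlo average is exact.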

One can now solve the following system of $p+2$ equations based on the corrected score in (7),

$$\sum_{j=1}^{n}\boldsymbol{\Psi}(\boldsymbol{\Omega};Y_j,\widetilde{W}_j,\widetilde{T}_j,\mathbf{X}_{-1,j})=\mathbf{0}, \tag{8}$$

for $\boldsymbol{\Omega}$ to obtain a consistent estimator $\hat{\boldsymbol{\Omega}}$, where $\widetilde{T}_1,\ldots,\widetilde{T}_n$ are independent. Solving (8) for $\boldsymbol{\Omega}$ is equivalent to solving an optimization problem, that is,

$$\hat{\boldsymbol{\Omega}}=\underset{\boldsymbol{\Omega}\in\mathbb{R}^{p+1}\times\mathbb{R}^{+}}{\arg\min}\left\{\sum_{j=1}^{n}\boldsymbol{\Psi}(\boldsymbol{\Omega};Y_j,\widetilde{W}_j,\widetilde{T}_j,\mathbf{X}_{-1,j})\right\}^{\mathrm{T}}\left\{\sum_{j=1}^{n}\boldsymbol{\Psi}(\boldsymbol{\Omega};Y_j,\widetilde{W}_j,\widetilde{T}_j,\mathbf{X}_{-1,j})\right\}. \tag{9}$$

The equivalence between (9) and the solution to (8) is obvious when there exists a unique solution to (8). The benefit of working with an optimization problem is most apparent in the presence of model misspecification, which can lead to non-existence of a solution to (8), whereas (9) may still be well-defined with meaningful statistical interpretations according to White (1982).

3.3 Monte-Carlo corrected log-likelihood

At this point, estimating $\boldsymbol{\Omega}$ appears to be a straightforward optimization problem. But the numerical procedure to obtain (9) requires evaluating $p+2$ scores at each iteration, which can be cumbersome and very demanding on computer memory and processing time, especially due to the Monte Carlo nature of the score in (7) that involves computing a vector-valued score $B$ times. Viewing the quadratic form in (9) as an objective function that accounts for measurement error, we propose to use a different objective function that also takes measurement error into account and is computationally less cumbersome to optimize. This new objective function is obtained by correcting the naive log-likelihood function $\ell(\boldsymbol{\Omega};Y_j,\overline{W}_j,\mathbf{X}_{-1,j})$ that is the summand of (3) with $X_{1,j}$ evaluated at $\overline{W}_j$, for $j=1,\ldots,n$. Similar to the construction of the corrected score in (7) based on the naive score, the new objective function based on the naive log-likelihood evaluated at the $j$-th observed data point is

$$\tilde{\ell}(\boldsymbol{\Omega};Y_j,\widetilde{W}_j,\widetilde{T}_j,\mathbf{X}_{-1,j})=\frac{1}{B}\sum_{b=1}^{B}\mathrm{Re}\left\{\ell\left(\boldsymbol{\Omega};Y_j,\overline{W}_j+i\sqrt{\frac{(n_j-1)S_j^2}{n_j}}\,T_{j,b},\mathbf{X}_{-1,j}\right)\right\}, \tag{10}$$

which satisfies $E\{\tilde{\ell}(\boldsymbol{\Omega};Y_j,\widetilde{W}_j,\widetilde{T}_j,\mathbf{X}_{-1,j})\mid Y_j,\mathbf{X}_j\}=\ell(\boldsymbol{\Omega};Y_j,\mathbf{X}_j)$, for $j=1,\ldots,n$. We then define an estimator of $\boldsymbol{\Omega}$ as

$$\hat{\boldsymbol{\Omega}}=\underset{\boldsymbol{\Omega}\in\mathbb{R}^{p+1}\times\mathbb{R}^{+}}{\arg\max}\sum_{j=1}^{n}\tilde{\ell}(\boldsymbol{\Omega};Y_j,\widetilde{W}_j,\widetilde{T}_j,\mathbf{X}_{-1,j}), \tag{11}$$

which only requires repeated evaluation of a scalar function in (10) at each iteration of an optimization algorithm. In simulation studies (not presented in this article) where we estimate $\boldsymbol{\Omega}$ using these two routes of optimization according to (9) and (11), we obtain very similar estimates of $\boldsymbol{\Omega}$, with the former route more computationally demanding than the latter. The numerical similarity of (9) and (11) may be expected given the connection between the naive score and the naive log-likelihood, in addition to the equivalence between the solution to the normal score equation and the maximum likelihood estimator in the absence of measurement error. We refer to the estimator defined in (11) as the Monte Carlo corrected log-likelihood estimator, or MCCL for short.
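To see what the correction in (10) does, it may help to work through a toy log-likelihood that is quadratic in the error-prone covariate. This is only an illustrative assumption and is not the beta log-likelihood in (3); the symbols $\beta_0$, $\beta_1$, and $c_j$ are introduced for this example alone.

```latex
\begin{align*}
\ell(\boldsymbol{\Omega}; Y_j, x) &= -\tfrac{1}{2}(Y_j - \beta_0 - \beta_1 x)^2,
\qquad c_j = \sqrt{(n_j-1)S_j^2/n_j}, \\
\mathrm{Re}\{\ell(\boldsymbol{\Omega}; Y_j, \overline{W}_j + i c_j T)\}
&= -\tfrac{1}{2}\{(Y_j - \beta_0 - \beta_1 \overline{W}_j)^2 - \beta_1^2 c_j^2 T^2\}, \\
E(T^2) = \frac{1}{n_j-1}
\;&\Longrightarrow\;
E_T[\mathrm{Re}\{\ell(\boldsymbol{\Omega}; Y_j, \overline{W}_j + i c_j T)\}]
= -\tfrac{1}{2}\{(Y_j - \beta_0 - \beta_1 \overline{W}_j)^2 - \beta_1^2 S_j^2/n_j\}.
\end{align*}
```

Because $E\{(Y_j-\beta_0-\beta_1\overline{W}_j)^2\mid Y_j,X_{1,j}\}=(Y_j-\beta_0-\beta_1X_{1,j})^2+\beta_1^2\sigma_u^2/n_j$ and $E(S_j^2)=\sigma_u^2$, the subtracted term removes exactly the bias caused by measurement error in this toy case, which is the unbiasedness property stated after (10).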

Whether one follows the idea of correcting the naive scores or the route of correcting the naive log-likelihood to account for measurement error, our proposed estimation method falls in the general framework of $M$-estimation (Boos and Stefanski, 2013, Chapter 7). As an $M$-estimator, the MCCL estimator $\hat{\boldsymbol{\Omega}}$ is a consistent estimator of $\boldsymbol{\Omega}$ that is asymptotically normal under regularity conditions stated in, for example, Theorem 7.2 in Boos and Stefanski (2013). Moreover, motivated by its asymptotic variance of the sandwich form (Boos and Stefanski, 2013, Section 7.2.1), the variance of $\hat{\boldsymbol{\Omega}}$ can be estimated by

$$\mathbf{V}(\mathcal{D}^{*};\hat{\boldsymbol{\Omega}})=\left\{\mathbf{A}(\mathcal{D}^{*};\hat{\boldsymbol{\Omega}})\right\}^{-1}\mathbf{B}(\mathcal{D}^{*};\hat{\boldsymbol{\Omega}})\left[\left\{\mathbf{A}(\mathcal{D}^{*};\hat{\boldsymbol{\Omega}})\right\}^{-1}\right]^{\mathrm{T}}, \tag{12}$$

where

$$\begin{aligned}
\mathbf{A}(\mathcal{D}^{*};\hat{\boldsymbol{\Omega}}) &= \left.\frac{1}{n}\sum_{j=1}^{n}\frac{\partial}{\partial\boldsymbol{\Omega}^{\mathrm{T}}}\boldsymbol{\Psi}(\boldsymbol{\Omega};Y_j,\widetilde{W}_j,\widetilde{T}_j,\mathbf{X}_{-1,j})\right|_{\boldsymbol{\Omega}=\hat{\boldsymbol{\Omega}}},\\
\mathbf{B}(\mathcal{D}^{*};\hat{\boldsymbol{\Omega}}) &= \frac{1}{n}\sum_{j=1}^{n}\boldsymbol{\Psi}(\hat{\boldsymbol{\Omega}};Y_j,\widetilde{W}_j,\widetilde{T}_j,\mathbf{X}_{-1,j})\left\{\boldsymbol{\Psi}(\hat{\boldsymbol{\Omega}};Y_j,\widetilde{W}_j,\widetilde{T}_j,\mathbf{X}_{-1,j})\right\}^{\mathrm{T}}.
\end{aligned}$$
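For a one-dimensional parameter the matrices in (12) reduce to scalars, which makes the sandwich form easy to sketch. The function below is a hypothetical helper, not part of any software accompanying the paper: it assumes the per-observation corrected scores and their derivatives, both evaluated at the estimate, have already been computed by the user, and the numeric inputs are made up for illustration.

```python
from statistics import fmean


def sandwich_scalar(scores, score_derivs):
    """Scalar analogue of (12): V = A^{-1} B A^{-1}.

    scores[j]       : corrected score of observation j at the estimate
    score_derivs[j] : derivative of that score with respect to the parameter
    """
    if len(scores) != len(score_derivs):
        raise ValueError("scores and score_derivs must have equal length")
    a = fmean(score_derivs)                 # A: average derivative of the score
    b = fmean([s * s for s in scores])      # B: average squared score
    return b / (a * a)


# Hypothetical per-observation values, for illustration only.
v = sandwich_scalar([1.0, -1.0, 0.5, -0.5], [-2.0, -2.0, -2.0, -2.0])
print(v)
```

In the multi-parameter case the same three steps apply, with a matrix inverse and outer products in place of the scalar division and squares.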

4 Model diagnostics

Even though we avoid specifying the true covariate distribution by adopting the functional viewpoint of measurement error models, the primary regression model in (2) is fully parametric. This raises the concern of model misspecification and calls for model diagnostic tools. Model diagnostics based on error-prone data are more challenging than in settings without measurement error. In particular, conventional residual-based diagnostic methods that require evaluating an estimated regression function, whether it is the conditional mean $\mu(\mathbf{X})$ in mean regression or the conditional mode $\theta(\mathbf{X})$ in modal regression, are no longer applicable now that a true covariate is unobserved. Another contribution of our study is an effective score-based diagnostic tool that circumvents this obstacle that traditional residual-based diagnostic methods face in the presence of measurement error.

For the beta modal regression model without error in covariates, Zhou and Huang (2020) propose a score-based test statistic defined below for the purpose of model diagnostics,

$$Q(\hat{\boldsymbol{\Omega}}_0;\mathcal{D})=\frac{n-2}{2(n-1)}\overline{\mathbf{S}}^{\mathrm{T}}\hat{\boldsymbol{\Sigma}}^{-1}\overline{\mathbf{S}}, \tag{13}$$

where $\hat{\boldsymbol{\Omega}}_0$ is the maximum likelihood estimator of $\boldsymbol{\Omega}$, $\overline{\mathbf{S}}=n^{-1}\sum_{j=1}^{n}\mathbf{S}(\hat{\boldsymbol{\Omega}}_0;Y_j,\mathbf{X}_j)$, and $\hat{\boldsymbol{\Sigma}}=\{n(n-1)\}^{-1}\sum_{j=1}^{n}\{\mathbf{S}(\hat{\boldsymbol{\Omega}}_0;Y_j,\mathbf{X}_j)-\overline{\mathbf{S}}\}\{\mathbf{S}(\hat{\boldsymbol{\Omega}}_0;Y_j,\mathbf{X}_j)-\overline{\mathbf{S}}\}^{\mathrm{T}}$, in which, for $j=1,\ldots,n$,

$$\mathbf{S}(\boldsymbol{\Omega};Y_j,\mathbf{X}_j)=\begin{bmatrix}\log Y_j-\psi(1+m\theta(\mathbf{X}_j))+\psi(2+m)\\[4pt] Y_j\log Y_j-\dfrac{\{1+m\theta(\mathbf{X}_j)\}\{\psi(2+m\theta(\mathbf{X}_j))-\psi(3+m)\}}{2+m}\end{bmatrix} \tag{14}$$

is the score vector constructed by matching $\log V$ and $V\log V$ with their respective expectations for $V\sim\mbox{beta}(\alpha_1,\alpha_2)$, and thus $E\{\mathbf{S}(\boldsymbol{\Omega};Y_j,\mathbf{X}_j)\}=\mathbf{0}$ in the absence of model misspecification. By construction, a larger value of the nonnegative $Q(\hat{\boldsymbol{\Omega}}_0;\mathcal{D})$ provides stronger evidence indicating model misspecification. A parametric bootstrap procedure is developed in Zhou and Huang (2020) to estimate the null distribution of $Q(\hat{\boldsymbol{\Omega}}_0;\mathcal{D})$, from which one obtains an estimated $p$-value for the test.
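The arithmetic behind (13) can be sketched once the two-dimensional scores in (14) have been evaluated. The code below is an illustrative sketch, not the procedure of Zhou and Huang (2020): the function names `q_statistic` and `bootstrap_p_value` are hypothetical, the score vectors and bootstrap statistics are made-up numbers, and the bootstrap resampling step that would generate those statistics is not shown. The $p$-value is taken here as the upper-tail proportion, an assumption in line with larger values of $Q$ indicating misspecification.

```python
def q_statistic(scores):
    """Quadratic-form statistic in (13) for a list of 2-dimensional scores."""
    n = len(scores)
    if n < 3:
        raise ValueError("need at least three observations")
    m1 = sum(s[0] for s in scores) / n
    m2 = sum(s[1] for s in scores) / n
    # Sigma-hat = {n(n-1)}^{-1} * sum of outer products of centered scores
    denom = n * (n - 1)
    s11 = sum((s[0] - m1) ** 2 for s in scores) / denom
    s22 = sum((s[1] - m2) ** 2 for s in scores) / denom
    s12 = sum((s[0] - m1) * (s[1] - m2) for s in scores) / denom
    det = s11 * s22 - s12 * s12
    if det <= 0.0:
        raise ValueError("Sigma-hat is singular")
    # S-bar^T Sigma-hat^{-1} S-bar via the closed-form 2x2 inverse
    quad = (s22 * m1 * m1 - 2.0 * s12 * m1 * m2 + s11 * m2 * m2) / det
    return (n - 2) / (2 * (n - 1)) * quad


def bootstrap_p_value(q_observed, q_bootstrap):
    """Share of bootstrap statistics at least as large as the observed one."""
    return sum(q >= q_observed for q in q_bootstrap) / len(q_bootstrap)


scores = [(1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]            # hypothetical score vectors
q_obs = q_statistic(scores)
p_hat = bootstrap_p_value(q_obs, [0.2, 0.7, 1.5, 3.1])   # made-up bootstrap draws
print(q_obs, p_hat)
```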

Returning to our beta modal regression model with error-in-covariate, we apply the idea of corrected score here to construct a counterpart of (14) to obtain a score accounting for measurement error whose mean is zero in the absence of model misspecification. This yields the corrected score evaluated at the $j$-th observed data point for model diagnostics, for $j=1,\ldots,n$,

$$\tilde{\mathbf{S}}(\boldsymbol{\Omega};Y_{j},\widetilde{W}_{j},\widetilde{T}_{j},\mathbf{X}_{-1,j})=\frac{1}{B}\sum_{b=1}^{B}\mbox{Re}\left\{\mathbf{S}\left(\boldsymbol{\Omega};Y_{j},\overline{W}_{j}+i\sqrt{\frac{(n_{j}-1)S_{j}^{2}}{n_{j}}}\,T_{j,b},\mathbf{X}_{-1,j}\right)\right\}.\tag{15}$$
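To illustrate the Monte Carlo averaging in (15), here is a minimal sketch that applies the same complex-argument construction to a simple function $g$ in place of the score $\mathbf{S}$. The distribution of $T_{j,b}$ is defined earlier in the paper and is not restated in this section; the sketch assumes $T_{j,b}=Z_{1}/(Z_{1}^{2}+\cdots+Z_{n_{j}-1}^{2})^{1/2}$ with independent standard normal $Z_{k}$, which may differ from the paper's definition. The function name `corrected_mean` is hypothetical.

```python
import math
import random
import statistics


def corrected_mean(g, w_reps, B=20000, seed=0):
    """Average of Re{g(W_bar + i*sqrt((n-1)S^2/n)*T_b)} over b = 1, ..., B."""
    rng = random.Random(seed)
    n = len(w_reps)                      # number of replicates, must be >= 2
    w_bar = statistics.fmean(w_reps)     # replicate mean W_bar
    s2 = statistics.variance(w_reps)     # sample variance S^2 (denominator n - 1)
    scale = math.sqrt((n - 1) * s2 / n)
    total = 0.0
    for _ in range(B):
        z = [rng.gauss(0.0, 1.0) for _ in range(n - 1)]
        # ASSUMED form of T_{j,b}: first coordinate of a normalized normal vector
        t = z[0] / math.sqrt(sum(v * v for v in z))
        total += g(complex(w_bar, scale * t)).real
    return total / B
```

With $g(x)=x^{2}$ and this assumed $T$, the average converges to $\overline{W}^{2}-S^{2}/n$ as $B$ grows, which is unbiased for $X^{2}$ under normal measurement error; this is the sense in which the imaginary perturbation removes the bias that plugging $\overline{W}$ into $g$ would incur.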

The test statistic of the quadratic form, denoted by $\tilde{Q}(\hat{\boldsymbol{\Omega}};\mathcal{D}^{*})$, that is parallel to (13) follows by using the MCCL estimator $\hat{\boldsymbol{\Omega}}$ instead of $\hat{\boldsymbol{\Omega}}_{0}$, replacing $\overline{\mathbf{S}}$ appearing in (13) with $n^{-1}\sum_{j=1}^{n}\tilde{\mathbf{S}}(\boldsymbol{\Omega};Y_{j},\widetilde{W}_{j},\widetilde{T}_{j},\mathbf{X}_{-1,j})$, and revising $\hat{\boldsymbol{\Sigma}}$ accordingly. The next hurdle is the design of a parametric bootstrap procedure for estimating the null distribution of $\tilde{Q}(\hat{\boldsymbol{\Omega}};\mathcal{D}^{*})$. Traditional parametric bootstrap in the regression setting, such as the procedure in Zhou and Huang (2020), involves generating response data from the primary regression model, which requires evaluating an estimated regression function at the true covariates that are partly unobserved in the current context. We overcome this hurdle by “estimating” unobserved true covariate data, as implemented in the method of regression calibration (Carroll et al., 2006, Chapter 4) that takes the structural viewpoint of measurement error models. Under the classical measurement error in (1), the best linear predictor of $X_{1,j}$ is $E(X_{1,j}|\overline{W}_{j})=\mu_{1}+\lambda_{j}(\overline{W}_{j}-\mu_{1})$, where $\mu_{1}=E(X_{1})$ and $\lambda_{j}=n_{j}\sigma^{2}_{1}/\sigma^{2}_{W}$ is the reliability ratio associated with $\overline{W}_{j}$ (Carroll et al., 2006, Section 3.2.1), in which $\sigma^{2}_{1}$ and $\sigma^{2}_{W}$ denote the variance of $X_{1}$ and that of $W$, respectively. Replacing each unknown quantity in $E(X_{1,j}|\overline{W}_{j})$ with its method-of-moments estimator yields an “estimator” or prediction of $X_{1,j}$ given by

$$\hat{X}_{1,j}^{*}=\overline{W}+\hat{\lambda}(\overline{W}_{j}-\overline{W}),\text{ for $j=1,\ldots,n$,}\tag{16}$$

where $\overline{W}=n^{-1}\sum_{j=1}^{n}\overline{W}_{j}$ and $\hat{\lambda}=\hat{\sigma}^{2}_{1}/\hat{\sigma}^{2}_{W}$, in which $\hat{\sigma}^{2}_{W}$ is the sample variance of $(\overline{W}_{1},\ldots,\overline{W}_{n})$, $\hat{\sigma}^{2}_{1}=(\hat{\sigma}^{2}_{W}-\hat{\sigma}^{2}_{u})_{+}$, and $\hat{\sigma}_{u}^{2}=n^{-1}\sum_{j=1}^{n}S_{j}^{2}/n_{j}$, recalling that, for $j=1,\ldots,n$, $S_{j}^{2}$ is the sample variance of $(W_{j,1},\ldots,W_{j,n_{j}})$ computed earlier to evaluate the corrected score and the corrected log-likelihood. The idea of regression calibration is to regress $Y$ on the estimated covariate $\hat{X}^{*}_{1}$ defined by (16) and $\mathbf{X}_{-1}=(X_{2},\ldots,X_{p})^{\mathrm{T}}$ instead of regressing on $(W,\mathbf{X}_{-1}^{\mathrm{T}})^{\mathrm{T}}$. Even though this idea often yields estimators of parameters in the primary regression model that improve over naive estimators, Buonaccorsi et al. (2018) noted that (16) tends to underestimate the variability of the true covariate and thus can be problematic if used in a bootstrap procedure, as we intend to do. They then proposed to use

$$\hat{X}_{1,j}=\overline{W}+\hat{\lambda}^{1/2}(\overline{W}_{j}-\overline{W}),\text{ for $j=1,\ldots,n$,}\tag{17}$$

as estimated covariate data instead, so that these estimated covariate values have the mean and variance coinciding with method-of-moments estimates for the mean and variance of $X_{1}$.
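The contrast between (16) and (17) can be made concrete with a short computation. The following is a minimal sketch that follows the method-of-moments formulas stated above; the function name `estimate_true_covariate` and the toy replicate data in the usage note are hypothetical and serve only as illustration.

```python
import math
import statistics


def estimate_true_covariate(w_reps):
    """w_reps: list of per-subject replicate lists (W_{j,1}, ..., W_{j,n_j})."""
    w_bars = [statistics.fmean(w) for w in w_reps]        # W_bar_j
    w_bar = statistics.fmean(w_bars)                      # overall mean W_bar
    var_w = statistics.variance(w_bars)                   # sigma_W^2 hat
    var_u = statistics.fmean([statistics.variance(w) / len(w) for w in w_reps])
    var_x = max(var_w - var_u, 0.0)                       # sigma_1^2 hat, truncated at 0
    lam = var_x / var_w                                   # lambda hat
    x_rc = [w_bar + lam * (wb - w_bar) for wb in w_bars]              # equation (16)
    x_mm = [w_bar + math.sqrt(lam) * (wb - w_bar) for wb in w_bars]   # equation (17)
    return x_rc, x_mm, var_x
```

For example, with `w_reps = [[0.9, 1.2, 1.0], [-0.4, 0.1, -0.2], [2.1, 1.7, 2.3], [0.3, 0.8, 0.4], [-1.2, -0.9, -1.5]]`, the sample variance of `x_mm` equals `var_x` up to rounding, whereas the sample variance of `x_rc` is smaller by the factor $\hat{\lambda}$, which is the underestimation of variability noted above.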

With this last hurdle resolved, we are in a position to present the detailed algorithm of the parametric bootstrap for estimating the $p$-value associated with $\tilde{Q}(\hat{\boldsymbol{\Omega}};\mathcal{D}^{*})$ based on $M$ bootstrap samples.

  Step 1: Fit the beta modal regression model with classical measurement error to $\mathcal{D}^{*}$ by applying the MCCL method in Section 3.3. This gives the MCCL estimate $\hat{\boldsymbol{\Omega}}=(\hat{\boldsymbol{\beta}}^{\mathrm{T}},\hat{m})^{\mathrm{T}}$.

  Step 2: Compute the test statistic $\tilde{Q}(\hat{\boldsymbol{\Omega}};\mathcal{D}^{*})$.

  For $d=1,\ldots,M$, repeat Steps 3–5:

  Step 3: For $j=1,\ldots,n$, generate $Y^{(d)}_{j}$ from beta$(1+\hat{m}\hat{\theta}(\hat{X}_{1,j},\mathbf{X}_{-1,j}),\,1+\hat{m}\{1-\hat{\theta}(\hat{X}_{1,j},\mathbf{X}_{-1,j})\})$, and generate $W^{(d)}_{j,k}=\hat{X}_{1,j}+U^{(d)}_{j,k}$, for $k=1,\ldots,n_{j}$, where $\hat{X}_{1,j}$ is given by (17), and $\{U^{(d)}_{j,k}\}_{k=1}^{n_{j}}$ are i.i.d. from $N(0,S_{j}^{2})$. Let $\widetilde{W}_{j}^{(d)}=\{W^{(d)}_{j,k}\}_{k=1}^{n_{j}}$. This yields the $d$-th set of bootstrap data, $\mathcal{D}^{(d)}=\{(Y^{(d)}_{j},\,\widetilde{W}^{(d)}_{j},\mathbf{X}_{-1,j})\}_{j=1}^{n}$.

  Step 4: Fit the beta modal regression model with classical measurement error to $\mathcal{D}^{(d)}$, and obtain the MCCL estimate of $\boldsymbol{\Omega}$, denoted by $\hat{\boldsymbol{\Omega}}^{(d)}$.

  Step 5: Compute the test statistic, $\tilde{Q}(\hat{\boldsymbol{\Omega}}^{(d)};\mathcal{D}^{(d)})$.

  Step 6: Estimate the $p$-value by $M^{-1}\sum_{d=1}^{M}I\left\{\tilde{Q}\left(\hat{\boldsymbol{\Omega}}^{(d)};\mathcal{D}^{(d)}\right)>\tilde{Q}\left(\hat{\boldsymbol{\Omega}};\mathcal{D}^{*}\right)\right\}$.
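The bootstrap steps above can be sketched in code as follows. This is a minimal sketch, not the authors' implementation: it assumes a logistic form for the mode with one error-prone and one error-free covariate (as in the simulation designs), and the arguments `fit` and `statistic` are hypothetical placeholders for the MCCL fitting routine and the corrected quadratic-form statistic, neither of which is reproduced here.

```python
import math
import random


def logistic_mode(beta, x1, x2):
    """Assumed mode function theta(X) with a logistic link."""
    return 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x1 + beta[2] * x2)))


def bootstrap_sample(beta_hat, m_hat, x1_hat, x2, s2, n_reps, rng):
    """Step 3: generate one bootstrap data set of triples (Y_j, [W_j1..W_jn_j], X_2j)."""
    sample = []
    for xh, z, v, k in zip(x1_hat, x2, s2, n_reps):
        theta = logistic_mode(beta_hat, xh, z)
        # Response drawn from the fitted null model at the estimated covariate
        y = rng.betavariate(1.0 + m_hat * theta, 1.0 + m_hat * (1.0 - theta))
        # Replicate measurements around the estimated covariate, variance S_j^2
        w = [xh + rng.gauss(0.0, math.sqrt(v)) for _ in range(k)]
        sample.append((y, w, z))
    return sample


def bootstrap_p_value(q_obs, draw_sample, fit, statistic, M):
    """Steps 3 to 6: proportion of bootstrap statistics exceeding the observed one."""
    exceed = 0
    for _ in range(M):
        data_d = draw_sample()             # Step 3
        omega_d = fit(data_d)              # Step 4 (placeholder for the MCCL fit)
        if statistic(omega_d, data_d) > q_obs:   # Step 5
            exceed += 1
    return exceed / M                      # Step 6
```

In use, `draw_sample` would be a closure around `bootstrap_sample` that fixes the estimates from Step 1 and the covariate values from (17), so that every bootstrap data set is generated from the same fitted null model.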

In the absence of covariate measurement error, where $\{X_{1,j},\,j=1,\ldots,n\}$ are observed, the above algorithm (with $\hat{X}_{1,j}$ replaced by $X_{1,j}$ in Step 3) essentially follows the general guidelines of bootstrap hypothesis testing as discussed in Hall and Wilson (1991), Davison and Hinkley (1997), and Martin (2007). In particular, our targeted null hypothesis states that the response given true covariates follows a beta modal regression model; Step 1 in our bootstrap algorithm aims to “recover” the model consistent with the null, and response data obtained in Step 3 are generated from the fitted null model and thus reflect the null. This is precisely the first principle of model-based bootstrap for hypothesis testing: to generate bootstrap data that reflect the null. The unique challenge of bootstrap hypothesis testing in the presence of covariate measurement error is that true covariate values need to be estimated before generating response data. Unlike response data generation, which should reflect the null (that does not specify a distribution for the true covariate data), when “recovering” true covariate values, one aims to recover certain structures of the design matrix in the absence of measurement error. We accomplish this goal by using $\{\hat{X}_{1,j},\,j=1,\ldots,n\}$ in (17), which preserve certain structures of true covariate values in the sense that the first two moments of these estimated covariate values coincide with the method-of-moments estimates for the first two moments of $\{X_{1,j},\,j=1,\ldots,n\}$. The so-constructed estimated true covariate values are also used in Thomas et al. (2011) to recover true covariate data. Even though it is unclear whether there exists a better way to recover error-free covariate data for the purpose of bootstrap hypothesis testing, Buonaccorsi et al. (2016) showed that this approach substantially outperforms two obvious alternative methods: one is to use $\overline{W}_{j}$ to estimate $X_{1,j}$, the other is to use $\hat{X}_{1,j}^{*}$ in (16). In our context, empirical evidence from the simulation study presented in the next section suggests that the proposed bootstrap procedure can estimate the null distribution of $\tilde{Q}(\hat{\boldsymbol{\Omega}};\mathcal{D}^{*})$ accurately enough to preserve the right size of the test for model misspecification over a wide range of significance levels.

5 Simulation study

We carry out a simulation study to inspect the finite-sample performance of the proposed estimation method and the diagnostic method. The source code to reproduce results in this section is publicly available on the journal’s web page.

5.1 Design of simulation experiments

We generate data from each of the following four data generation processes.

  1. (M1)

    Generate response data according to (2), with $m=3$, $\theta(\mathbf{X})=1/\{1+\exp(-\beta_{0}-\beta_{1}X_{1}-\beta_{2}X_{2})\}$, $\boldsymbol{\beta}=(\beta_{0},\beta_{1},\beta_{2})^{\mathrm{T}}=(0.25,0.25,0.25)^{\mathrm{T}}$, $X_{2}\sim\text{Bernoulli}(0.5)$, and $X_{1}|X_{2}\sim N(I(X_{2}=1)-I(X_{2}=0),\,1)$, where $I(\cdot)$ is the indicator function. Contaminate data of $X_{1}$ according to (1) to generate $W_{j,k}$, for $j=1,\ldots,n$ and $k=1,2,3$, with $U_{j,k}\sim N(0,\sigma^{2}_{u})$.

  2. (M2)

    Same as (M1) except that $m=40$ and $\theta(\mathbf{X})=1/\{1+\exp(-\beta_{0}-\beta_{1}X_{1}-\beta_{2}X_{2}-\beta_{3}X_{1}^{2})\}$, with $\boldsymbol{\beta}=(\beta_{0},\beta_{1},\beta_{2},\beta_{3})^{\mathrm{T}}=(1,1,1,1)^{\mathrm{T}}$.

  3. (M3)

    Same as (M1) except that $\theta(\mathbf{X})=\Phi(\beta_{0}+\beta_{1}X_{1}+\beta_{2}X_{2})$ with $\boldsymbol{\beta}=(\beta_{0},\beta_{1},\beta_{2})^{\mathrm{T}}=(1,1,1)^{\mathrm{T}}$, where $\Phi(\cdot)$ is the cumulative distribution function of $N(0,1)$.

  4. (M4)

    Generate response data $\{Y_{j}\}_{j=1}^{n}$ according to $Y_{j}=(Y_{j}^{*}-Y^{*}_{(1)})/(Y^{*}_{(n)}-Y^{*}_{(1)})$, for $j=1,\ldots,n$, where $Y^{*}_{(1)}$ and $Y^{*}_{(n)}$ are the minimum and maximum order statistics of data $\{Y^{*}_{j}\}_{j=1}^{n}$, respectively, $Y^{*}_{j}\mid\mathbf{X}_{j}\sim\operatorname{Gumbel}(\theta(\mathbf{X}_{j}),\,\gamma^{-1}\{1-2\theta(\mathbf{X}_{j})\}/(2+m))$, in which $\theta(\mathbf{X}_{j})<0.5$ is the mode formulated as that in (M1) with $\boldsymbol{\beta}=(\beta_{0},\beta_{1},\beta_{2})^{\mathrm{T}}=(1,1,1)^{\mathrm{T}}$, $\gamma^{-1}\{1-2\theta(\mathbf{X}_{j})\}/(2+m)$ is the scale of the Gumbel distribution, and $\gamma$ stands for the Euler–Mascheroni constant.
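As an illustration of the first design, the following is a minimal sketch of a generator for (M1). It assumes that model (2) is the beta$(1+m\theta,\,1+m(1-\theta))$ distribution appearing in Step 3 of the bootstrap algorithm and that (1) is the additive model $W_{j,k}=X_{1,j}+U_{j,k}$; both equations lie outside this section, so these are assumptions. The function name `generate_m1` and the dictionary layout of each record are hypothetical.

```python
import math
import random


def generate_m1(n, sigma2_u, beta=(0.25, 0.25, 0.25), m=3.0, n_reps=3, seed=None):
    """Simulate n records under design (M1) with n_reps replicate measurements."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x2 = 1 if rng.random() < 0.5 else 0              # X2 ~ Bernoulli(0.5)
        x1 = rng.gauss(1.0 if x2 == 1 else -1.0, 1.0)    # X1 | X2 ~ N(+1 or -1, 1)
        theta = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x1 + beta[2] * x2)))
        # Assumed form of model (2): beta distribution with mode theta
        y = rng.betavariate(1.0 + m * theta, 1.0 + m * (1.0 - theta))
        # Assumed form of model (1): additive normal measurement error
        w = [x1 + rng.gauss(0.0, math.sqrt(sigma2_u)) for _ in range(n_reps)]
        data.append({"y": y, "w": w, "x2": x2, "x1": x1})
    return data
```

The true `x1` is stored only so that a simulation can compare estimates against an error-free benchmark; an analysis that mimics practice would use `y`, `w`, and `x2` alone.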

Regardless of the data generation process used to generate a particular data set, we always assume a beta modal regression model with $\theta(\mathbf{X})$ specified as in (M1) when carrying out modal regression analysis of $Y$ on $\mathbf{X}=(X_{1},X_{2})^{\mathrm{T}}$. By so doing, the design in (M1) allows us to monitor point estimation in the absence of model misspecification, and the latter three designs can be used to study operating characteristics of the proposed model diagnostic method in the presence of different sources of model misspecification. In particular, fitting the assumed model to data generated according to (M2) creates a scenario where one misspecifies the linear predictor in the regression function. When data are generated from (M3), the assumed model has a wrong link function. Finally, fitting the assumed model to data from (M4) gives rise to the most severe model misspecification in the sense that the true distribution of $Y$ given $\mathbf{X}$ is outside of the beta family.

5.2 Performance of point estimation

Besides assessing the quality of the MCCL estimator of $\boldsymbol{\Omega}$ in comparison with the naive maximum likelihood estimator, we aim at addressing the following three issues of point estimation in the simulation study: (i) the impact of having an error-free covariate along with an error-prone covariate on covariate effects estimation; (ii) the quality of the variance estimation based on (12); (iii) the robustness of the MCCL estimator to the normality assumption on $U$. We bring up the third issue because the corrected score method is developed under the assumption of normal measurement error. Due to our focus on covariate effects estimation in the presence of an error-prone covariate in a modal regression model for a bounded response, none of the existing modal regression methods accounting for measurement error referenced in Section 1 serves as a sensible competing method in the current simulation study (e.g., there are no covariate effect parameters $\boldsymbol{\beta}$ in a nonparametric modal regression model).

Based on data generated according to (M1) with $\sigma^2_u=0.6, 1.2$, we obtain the MCCL estimate of $\boldsymbol{\Omega}$ using $B=100, 200$ and the naive maximum likelihood estimate that ignores measurement error in $X_1$. Table 1 provides the median of MCCL estimates $\hat{\boldsymbol{\Omega}}$ and the median of naive estimates across 1000 Monte Carlo replicates at each of the two sample sizes $n=100, 200$. In contrast to the naive estimates, which exhibit bias that does not diminish as the sample size increases, the MCCL estimates are much improved despite the severity of error contamination in $X_1$. Raising $B$ from 100 to 200 provides negligible improvement in the quality of MCCL estimates. We thus set $B=100$ in the remaining empirical study and only show results corresponding to this default choice of $B$ in the sequel. Not surprisingly, the MCCL estimator corrects the bias of the naive estimator at the price of an inflation in variation.

Table 1: Medians of MCCL estimates and medians of naive estimates across 1000 Monte Carlo replicates generated according to (M1). The number in parentheses following each median is the interquartile range of the 1000 realizations of an estimator.

| $\sigma^2_u$ | $n$ | Method | $\beta_0$ | $\beta_1$ | $\beta_2$ | $\log m$ |
|---|---|---|---|---|---|---|
| 0.6 | 100 | $\text{MCCL}_{B=100}$ | 0.23 (0.34) | 0.24 (0.22) | 0.26 (0.59) | 1.18 (0.30) |
| 0.6 | 100 | $\text{MCCL}_{B=200}$ | 0.23 (0.35) | 0.24 (0.22) | 0.26 (0.59) | 1.18 (0.30) |
| 0.6 | 100 | Naive | 0.19 (0.31) | 0.20 (0.18) | 0.35 (0.55) | 1.16 (0.29) |
| 0.6 | 200 | $\text{MCCL}_{B=100}$ | 0.24 (0.23) | 0.25 (0.15) | 0.26 (0.40) | 1.14 (0.22) |
| 0.6 | 200 | $\text{MCCL}_{B=200}$ | 0.24 (0.23) | 0.25 (0.15) | 0.26 (0.40) | 1.14 (0.22) |
| 0.6 | 200 | Naive | 0.20 (0.22) | 0.20 (0.13) | 0.34 (0.37) | 1.13 (0.21) |
| 1.2 | 100 | $\text{MCCL}_{B=100}$ | 0.24 (0.34) | 0.24 (0.24) | 0.27 (0.65) | 1.18 (0.30) |
| 1.2 | 100 | $\text{MCCL}_{B=200}$ | 0.24 (0.35) | 0.24 (0.24) | 0.26 (0.66) | 1.18 (0.30) |
| 1.2 | 100 | Naive | 0.17 (0.31) | 0.17 (0.17) | 0.41 (0.54) | 1.16 (0.29) |
| 1.2 | 200 | $\text{MCCL}_{B=100}$ | 0.25 (0.25) | 0.25 (0.18) | 0.25 (0.43) | 1.14 (0.21) |
| 1.2 | 200 | $\text{MCCL}_{B=200}$ | 0.25 (0.25) | 0.25 (0.18) | 0.26 (0.43) | 1.14 (0.21) |
| 1.2 | 200 | Naive | 0.17 (0.21) | 0.17 (0.12) | 0.41 (0.36) | 1.12 (0.21) |

The attenuation effect of measurement error on the naive covariate effect estimation for $X_1$ is evident in Table 1. In contrast, the covariate effect of the error-free covariate $X_2$ is noticeably overestimated by the naive method. One may wonder if the observed opposite directions in the bias of naive estimation of the two covariate effects persist when the two covariates are independent. This relates to the first issue brought up above. To address this issue, we revise the data generating process in (M1) so that $X_1\sim N(0,1)$. Figure 1 includes boxplots of two sets of regression coefficient estimates, the MCCL estimates and the naive estimates, under (M1) where $X_1$ and $X_2$ are dependent (see the left panel in Figure 1) and under the revised (M1) with $X_1$ and $X_2$ independent (see the right panel in Figure 1). Here, we set $n=2000$ for each of 1000 Monte Carlo replicates. Interestingly, when $X_2$ is independent of the error-prone covariate $X_1$, naive estimation of the covariate effect of $X_2$ does not appear to be affected by measurement error. Regardless, the attenuation in the estimated covariate effect for $X_1$ remains.

Figure 1: Boxplots of regression coefficient estimates under (M1) with $X_1$ and $X_2$ dependent (left panel) and those under a revised version of (M1) with $X_1$ and $X_2$ independent (right panel). The two boxes associated with each parameter correspond to two estimators (from left to right): the MCCL estimator (red box) and the naive estimator (cyan box).
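The direction of the naive bias in the two settings can be illustrated with a much simpler model. The sketch below is only an analogy: it uses a linear mean regression fitted by least squares, not the beta modal regression model, and all parameter values, the correlation `rho`, and the function name `naive_ols` are hypothetical choices made for illustration.

```python
import numpy as np

def naive_ols(rho, n, sigma_u2, rng):
    """Fit y on (W, X2) by least squares, ignoring measurement error in W.

    rho is the correlation between the true covariates X1 and X2.
    All numeric settings below are hypothetical, chosen for illustration only.
    """
    x2 = rng.standard_normal(n)
    x1 = rho * x2 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    y = 0.25 + 0.25 * x1 + 0.25 * x2 + 0.5 * rng.standard_normal(n)
    w = x1 + np.sqrt(sigma_u2) * rng.standard_normal(n)  # error-prone surrogate of X1
    design = np.column_stack([np.ones(n), w, x2])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # (intercept, slope of W, slope of X2)

rng = np.random.default_rng(0)
dependent = naive_ols(rho=0.6, n=20000, sigma_u2=1.2, rng=rng)
independent = naive_ols(rho=0.0, n=20000, sigma_u2=1.2, rng=rng)
print("dependent covariates:  ", np.round(dependent, 3))
print("independent covariates:", np.round(independent, 3))
```

In this linear analogy, the slope of the surrogate is attenuated in both settings, whereas the slope of the error-free covariate is inflated only when the two covariates are positively correlated, which mirrors the pattern seen in Figure 1.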

Table 2 presents the average of the estimated standard deviations of each parameter estimator in $\hat{\boldsymbol{\Omega}}$ based on (12) across 1000 Monte Carlo replicates from (M1) with $n=200$. The Monte Carlo standard deviation of each parameter estimate is used as a reference/gold standard in this table. The closeness of the standard deviation estimates to the reference shown in the table suggests that the sandwich variance estimator in (12) provides reliable estimation of the variance of the MCCL estimator. This settles the second issue.

Table 2: Averages of standard deviation estimates, $\widehat{\text{s.d.}}$, and empirical standard deviation, s.d., across 1000 Monte Carlo replicates from (M1) with $\sigma^2_u=1.2$ and $n=200$. Numbers in parentheses are Monte Carlo standard errors associated with the Monte Carlo means.

| Method | $\beta_0$: $\widehat{\text{s.d.}}$ | $\beta_0$: s.d. | $\beta_1$: $\widehat{\text{s.d.}}$ | $\beta_1$: s.d. | $\beta_2$: $\widehat{\text{s.d.}}$ | $\beta_2$: s.d. | $\log m$: $\widehat{\text{s.d.}}$ | $\log m$: s.d. |
|---|---|---|---|---|---|---|---|---|
| MCCL | 0.19 (0.03) | 0.19 | 0.13 (0.03) | 0.13 | 0.32 (0.06) | 0.32 | 0.15 (0.02) | 0.16 |
| Naive | 0.16 (0.02) | 0.16 | 0.09 (0.01) | 0.09 | 0.26 (0.03) | 0.26 | 0.15 (0.01) | 0.16 |
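The comparison reported in Table 2 follows a generic recipe: compute a sandwich standard deviation estimate in each replicate, average these estimates, and compare the average with the empirical standard deviation of the point estimates. The sketch below illustrates that recipe for a least-squares estimator only; it does not implement (12), which is defined outside this section, and the function name `ols_sandwich` and all numeric settings are hypothetical.

```python
import numpy as np

def ols_sandwich(x, y):
    """Least-squares fit with a sandwich (A^{-1} B A^{-T} / n) variance estimate.

    This is the generic M-estimation form, used here as a stand-in for (12).
    """
    design = np.column_stack([np.ones_like(x), x])
    n = y.size
    beta = np.linalg.solve(design.T @ design, design.T @ y)
    resid = y - design @ beta
    bread = design.T @ design / n              # A: minus the averaged score derivative
    scores = design * resid[:, None]           # per-observation estimating functions
    meat = scores.T @ scores / n               # B: averaged outer product of scores
    bread_inv = np.linalg.inv(bread)
    cov = bread_inv @ meat @ bread_inv.T / n
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(1)
estimates, sd_hats = [], []
for _ in range(500):                           # hypothetical number of replicates
    x = rng.standard_normal(200)
    y = 0.25 + 0.25 * x + 0.5 * rng.standard_normal(200)
    beta_hat, sd_hat = ols_sandwich(x, y)
    estimates.append(beta_hat)
    sd_hats.append(sd_hat)

mean_sd_hat = np.mean(sd_hats, axis=0)             # analogue of the s.d.-hat columns
empirical_sd = np.std(estimates, axis=0, ddof=1)   # analogue of the s.d. columns
ratio = mean_sd_hat / empirical_sd
print(np.round(mean_sd_hat, 3), np.round(empirical_sd, 3))
```

A ratio close to one for every parameter is the kind of agreement that Table 2 reports for the MCCL estimator.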

The third issue concerns the normality assumption on measurement error in the development of the Monte Carlo corrected score method. To assess the robustness of the MCCL estimator to this normality assumption, we revise (M1) by letting $U_{j,k}\sim\text{Laplace}(0,0.5^{1/2})$ instead, for $k=1,2,3$, and set $n=200$. Table 3 provides the same summary statistics of parameter estimates as those shown in Table 1 (with $B=100$) under this revised setting. In addition to estimates parallel to those considered in Table 1, we also include summary statistics for MCCL estimates obtained without using replicate measures of $X_1$. That is, we keep $W_{j,1}$ in $\widetilde{W}_j=\{W_{j,k}\}_{k=1}^{3}$ as the only available error-contaminated measure of $X_{1,j}$, for $j=1,\ldots,200$, when constructing the corrected log likelihood function. In Section 6.2, we describe a modified version of the corrected log likelihood in (10) that does not require replicate measures but depends on the measurement error variance (see (18)). Using $W_{j,1}$ alone creates a scenario where the violation of the normality assumption associated with the measurement error in $W_{j,1}$ is more severe than when $\overline{W}_j=\sum_{k=1}^{3}W_{j,k}/3$ is used as a surrogate of $X_{1,j}$. As one can see from Table 3, despite the (severity in) violation of the normality assumption on $U$, the MCCL estimates remain close to the truth and significantly outperform the naive estimates. This robustness feature of the Monte Carlo corrected score method is also noted and explained in Novick and Stefanski (2002).

Table 3: Medians of MCCL estimates and medians of naive estimates across 1000 Monte Carlo replicates generated according to (M1) with $U_{j,k}\sim\text{Laplace}(0,0.5^{1/2})$ and $n=200$. The number in parentheses following each median is the interquartile range of the 1000 realizations of an estimator. $\text{MCCL}_1$ and $\text{MCCL}_2$ refer to MCCL estimates when replicate measures are present and absent, respectively.

| Method | $\beta_0$ | $\beta_1$ | $\beta_2$ | $\log m$ |
|---|---|---|---|---|
| $\text{MCCL}_1$ | 0.25 (0.26) | 0.25 (0.17) | 0.26 (0.41) | 1.12 (0.19) |
| $\text{MCCL}_2$ | 0.25 (0.26) | 0.25 (0.17) | 0.26 (0.41) | 1.12 (0.19) |
| Naive | 0.18 (0.22) | 0.19 (0.13) | 0.39 (0.36) | 1.10 (0.19) |

5.3 Performance of the model diagnostic method

Using 5000 Monte Carlo replicates from (M1) with $\sigma_u^2=1.2$ at each of the sample sizes $n=100$, 200, 500, 1000, we implement the bootstrap algorithm described in Section 4 with $M=300$ bootstrap samples to obtain estimated $p$-values associated with the test statistic $\tilde{Q}(\hat{\boldsymbol{\Omega}};\mathcal{D}^{*})$. We then record the proportion of replicates, across 5000 replicates, that lead to rejection of the null hypothesis of no model misspecification at various nominal levels. This rejection rate can be viewed as an empirical size of the test at a pre-specified significance level. Figure 2 depicts this rejection rate versus the significance level, from which one can see that the size of the test is well controlled by the bootstrap procedure over a wide range of nominal levels.

Figure 2: Rejection rates associated with the score-based diagnostic test across 5000 Monte Carlo replicates from (M1) versus the nominal level of the test. Black dashed lines are the $45^{\circ}$ reference lines.
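The bookkeeping behind Figure 2 can be sketched in a few lines. The sketch assumes a one-sided test that rejects for large values of the statistic, with the $p$-value estimated as the proportion of bootstrap statistics at least as large as the observed one; the exact rule of Section 4 is not restated here, so this convention, the function names, and the chi-square toy statistic are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_p_value(observed_stat, bootstrap_stats):
    """Proportion of bootstrap statistics at least as large as the observed one."""
    stats = np.asarray(bootstrap_stats, dtype=float)
    return float(np.mean(stats >= observed_stat))

def rejection_rate(p_values, alpha):
    """Proportion of Monte Carlo replicates whose p-value falls below alpha."""
    return float(np.mean(np.asarray(p_values, dtype=float) < alpha))

# Toy check of size control: the observed statistic and the M bootstrap
# statistics are drawn from the same (hypothetical) null distribution.
rng = np.random.default_rng(2)
M, n_rep = 300, 2000
p_values = [
    bootstrap_p_value(rng.chisquare(3), rng.chisquare(3, size=M))
    for _ in range(n_rep)
]
sizes = {alpha: rejection_rate(p_values, alpha) for alpha in (0.01, 0.05, 0.10)}
print(sizes)
```

When the bootstrap distribution matches the null distribution of the statistic, the empirical rejection rates track the nominal levels, which corresponds to points lying near the $45^{\circ}$ reference line.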

Table 4 presents rejection rates of the model diagnostic method in the presence of different forms of model misspecification that occur when fitting data generated according to (M2)–(M4) while assuming a beta modal regression model specified in (M1). As one can see in Table 4, the proposed score-based test has moderate power to detect a misspecified form of the linear predictor, with the power steadily increasing as $n$ increases, and is especially powerful in detecting violation of the distributional assumption on $Y$ given covariates; but the test is less sensitive to link misspecification. Low power of most goodness-of-fit tests to detect link misspecification has been reported in the context of generalized linear models (e.g., Hosmer et al., 1997). Given these reported findings in the literature, the low power observed under design (M3) may not be surprising, especially given the high similarity between the logit link in the assumed model and the probit link in the true model in (M3).

Table 4: Rejection rates of the score-based diagnostic test resulting from 300 Monte Carlo replicates in the presence of three types of model misspecification in (M2)–(M4)

| Model | $n=200$ | $n=300$ | $n=400$ | $n=500$ |
|---|---|---|---|---|
| (M2) | 0.283 | 0.407 | 0.550 | 0.580 |
| (M3) | 0.053 | 0.120 | 0.090 | 0.113 |
| (M4) | 1.000 | 0.997 | 0.997 | 1.000 |
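The similarity between the logit and probit links that underlies the low power under (M3) can be checked numerically. The sketch below compares the inverse logit link with a rescaled inverse probit link on a grid of linear predictor values; the scaling constant 1.702 is the classical one used to match the two curves and is not taken from this article, and the exact form of (M3) is not restated here.

```python
import math

def inverse_logit(t):
    """Inverse logit link: maps a linear predictor to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))

def inverse_probit(t):
    """Inverse probit link: the standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

# Largest gap between the two inverse links over a grid of linear predictor
# values, after rescaling the probit argument by the classical constant 1.702.
grid = [i / 100.0 for i in range(-800, 801)]
max_gap = max(abs(inverse_logit(t) - inverse_probit(t / 1.702)) for t in grid)
print(round(max_gap, 4))
```

Up to a rescaling of the regression coefficients, the two links produce nearly identical mode functions, so a test has little signal with which to distinguish them.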

When the assumed beta modal regression model is rejected by the proposed diagnostic test, one may consider a more flexible unimodal distribution for the response conditioning on true covariates, such as the unimodal distributions formulated in Fernández and Steel (1998), Quintana et al. (2009), Rubio and Steel (2015), and Liu et al. (2022b). A different assumed primary regression model leads to a different log likelihood function $\ell(\boldsymbol{\Omega};Y_j,\mathbf{X}_j)$ in (10), and our proposed strategy of correcting a naive log likelihood function to account for measurement error remains applicable to any parametric regression model.

6 Real-life data application

In this section, we analyze data arising from two different applications where a covariate of interest cannot be observed directly. Besides dealing with scientific questions in relevant fields, these applications provide opportunities for us to address some practical issues one faces when implementing the proposed estimation method and diagnostic method not discussed in the simulation study.

6.1 Application to dietary data

The Food Frequency Questionnaire (FFQ) is a convenient and inexpensive dietary assessment instrument in epidemiologic studies. To study the association between an individual’s FFQ intake and his/her long-term usual intake as the univariate covariate $X$, we analyze a dietary data set from the Women’s Interview Survey of Health (Carroll et al., 1997). The data set contains 271 females’ FFQ intake records, measured as the percentage of calories from fat, and six 24-hour food recalls, $W_{j,k}$, for $j=1,\ldots,271$ and $k=1,\ldots,6$. Because the $j$-th subject’s long-term usual intake $X_j$ cannot be measured directly, a generally accepted practice in epidemiology is to use $\overline{W}_j=\sum_{k=1}^{6}W_{j,k}/6$ as a surrogate of $X_j$, for $j=1,\ldots,271$. According to the preliminary analysis in existing literature, the distribution of the FFQ intake appears to be right-skewed and potentially heavy-tailed, which motivates the consideration of a modal regression model in place of a mean regression model. Here, we assume a beta modal regression model given in (2) with $\theta(X)=1/\{1+\exp(-\beta_0-\beta_1X)\}$ for the response data $\{Y_j\}_{j=1}^{271}$, where $Y_j$ is the $j$-th subject’s FFQ intake in kilocalorie divided by 8000, a biologically plausible upper bound of daily energy intakes for a general population.
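To make the role of the mode function concrete, the sketch below evaluates a beta density indexed by its mode and a precision parameter. Because (2) is defined outside this section, the parameterization $\text{Beta}(1+m\theta,\,1+m(1-\theta))$ used here is an assumption: it is one mode-precision form of the beta family whose mode equals $\theta$ for any $m>0$. The function names and the parameter values are hypothetical and are not estimates from the dietary data.

```python
import math
import numpy as np

def mode_logit(x, beta0, beta1):
    """Conditional mode under the logit link: 1 / {1 + exp(-beta0 - beta1 * x)}."""
    return 1.0 / (1.0 + math.exp(-beta0 - beta1 * x))

def beta_modal_logpdf(y, theta, m):
    """Log density of Beta(1 + m*theta, 1 + m*(1 - theta)); assumed form of the model."""
    a = 1.0 + m * theta
    b = 1.0 + m * (1.0 - theta)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * np.log(y) + (b - 1.0) * np.log1p(-y)

beta0, beta1, m = -1.5, 0.4, 20.0        # hypothetical parameter values
theta = mode_logit(1.0, beta0, beta1)    # conditional mode at x = 1

# Locate the maximizer of the density on a fine grid and compare it with theta.
grid = np.linspace(0.001, 0.999, 9981)
numerical_mode = float(grid[np.argmax(beta_modal_logpdf(grid, theta, m))])
print(round(theta, 4), round(numerical_mode, 4))
```

Under this assumed parameterization, the grid maximizer of the density agrees with $\theta(x)$, so the regression coefficients act directly on the conditional mode of the scaled response.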

We obtain the MCCL estimate of $\boldsymbol{\Omega}=(\beta_0,\beta_1,\log m)^{\mathrm{T}}$ according to (11), and also carry out regression analysis that ignores measurement error to obtain a naive maximum likelihood estimate of $\boldsymbol{\Omega}$. Moreover, we implement the simulation-extrapolation method (SIMEX; Carroll et al., 2006, Chapter 5) for the assumed beta modal regression model. In this particular application, SIMEX amounts to repeatedly estimating $\boldsymbol{\Omega}$, without accounting for measurement error, using data $\mathcal{D}_b^{*}(\zeta)=\{(Y_j,W_{j,b}(\zeta))\}_{j=1}^{n}$, for $b=1,\ldots,B$, where $W_{j,b}(\zeta)=\overline{W}_j+\sqrt{\zeta}\sigma_uZ_{j,b}$, in which $\{Z_{j,b},\,j=1,\ldots,n\}_{b=1}^{B}$ are independent standard normal errors, $\sigma_u$ is the standard deviation of measurement error associated with the surrogate measure $\overline{W}_j$, and $\zeta$ is a user-specified positive constant. Denote by $\hat{\boldsymbol{\Omega}}_b(\zeta)$ the (naive) estimator of $\boldsymbol{\Omega}$ based on data $\mathcal{D}^{*}_b(\zeta)$; then $\hat{\boldsymbol{\Omega}}(\zeta)=\sum_{b=1}^{B}\hat{\boldsymbol{\Omega}}_b(\zeta)/B$ is a naive estimator based on data resulting from further contaminating the original error-prone data $\mathcal{D}^{*}=\{(Y_j,\overline{W}_j)\}_{j=1}^{n}$, with the amount of additional contamination controlled by $\zeta$. Collecting a sequence of $\hat{\boldsymbol{\Omega}}(\zeta)$ as one varies $\zeta$ realizes the simulation step of SIMEX. In this data application, we set $B=300$ and let $\zeta$ vary from 0.125 to 1 in increments of 0.125. The extrapolation step of SIMEX entails extrapolating the sequence of estimates in $\{\hat{\boldsymbol{\Omega}}(\zeta),\,\zeta=0.125,0.25,\ldots,1\}$ to $\hat{\boldsymbol{\Omega}}(-1)$, leading to the so-called SIMEX estimator. A heuristic motivation for extrapolating towards $\zeta=-1$ can be revealed by noting that $\text{Var}(W_{j,b}(\zeta)|X_j)=\text{Var}(\overline{W}_j|X_j)+\zeta\sigma_u^2$, where $\sigma_u^2=\text{Var}(\overline{W}_j|X_j)$. Setting $\zeta=-1$ in the preceding variance expression gives $\text{Var}(W_{j,b}(-1)|X_j)=0$, as if $W_{j,b}(-1)$ contained no measurement error, and hence extrapolating $\{\hat{\boldsymbol{\Omega}}(\zeta),\text{ for }\zeta>0\}$ to obtain $\hat{\boldsymbol{\Omega}}(-1)$ is an attempt to “recover” an estimator of $\boldsymbol{\Omega}$ had there been no covariate measurement error. Shi et al. (2021) applied SIMEX to a kernel-based modal regression model with error-prone covariates.
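The simulation and extrapolation steps described above can be sketched as follows. To keep the example short and self-contained, the naive estimator here is a least-squares slope in a linear mean regression rather than the maximum likelihood estimator of the beta modal regression model; the function names, the data generating values, and the choice $B=50$ are hypothetical, while the grid of $\zeta$ values and the quadratic extrapolant follow the description in the text.

```python
import numpy as np

def ols_slope(w, y):
    """Least-squares slope of y on w (the 'naive' estimator in this toy setting)."""
    w_centered = w - w.mean()
    return float(w_centered @ (y - y.mean()) / (w_centered @ w_centered))

def simex_slope(w, y, sigma_u, zetas, B, rng):
    """Simulation step over zetas, then quadratic extrapolation to zeta = -1."""
    averaged = []
    for zeta in zetas:
        # Further contaminate the surrogate B times and average the naive estimates.
        fits = [
            ols_slope(w + np.sqrt(zeta) * sigma_u * rng.standard_normal(w.size), y)
            for _ in range(B)
        ]
        averaged.append(np.mean(fits))
    coefs = np.polyfit(zetas, averaged, deg=2)   # quadratic extrapolant
    return float(np.polyval(coefs, -1.0))

rng = np.random.default_rng(0)
n, true_slope, sigma_u = 5000, 1.0, np.sqrt(0.5)   # hypothetical settings
x = rng.standard_normal(n)
y = true_slope * x + 0.5 * rng.standard_normal(n)
w = x + sigma_u * rng.standard_normal(n)           # error-prone surrogate of x

zetas = np.linspace(0.125, 1.0, 8)                 # 0.125, 0.25, ..., 1
naive = ols_slope(w, y)
simex = simex_slope(w, y, sigma_u, zetas, B=50, rng=rng)
print(round(naive, 3), round(simex, 3))
```

In this toy setting the extrapolated estimate moves from the attenuated naive slope towards the true slope, although a quadratic extrapolant generally removes only part of the bias.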

Three estimates of $\boldsymbol{\Omega}$, the MCCL estimate, SIMEX estimate, and naive estimate, are given in Table 5. The covariate effect associated with the long-term intake suggested by the naive estimate is substantially weaker than that indicated by the MCCL estimate and SIMEX estimate, implying potentially significant attenuation of the covariate effect due to measurement error in the former, whereas the latter two correct for this attenuation. Figure 3 depicts the estimated regression functions $\hat{\theta}(x)$ resulting from these three methods, superimposed on the scaled response data versus the surrogate covariate data. This pictorial contrast between the three estimated regression functions shows that the proposed method and SIMEX are able to capture the underlying positive non-linear covariate effect that is partially concealed or weakened by the naive method. Although SIMEX produces inference results similar to those from our method, the simulation step relies on the error variance $\sigma^2_u$ when generating the $W_{j,b}(\zeta)$’s, which we estimate in this example based on replicate measures; and the extrapolation step depends on the choice of an extrapolant, a choice that usually lacks supporting data evidence in most applications. Here, we use a quadratic extrapolant to obtain the SIMEX estimate. Besides being more computationally burdensome than the MCCL method (due to repeatedly estimating $\boldsymbol{\Omega}$ based on further contaminated data), variance estimation for SIMEX estimators is also less straightforward than that for our estimator (Carroll et al., 1996). We resort to nonparametric bootstrap, with 1000 bootstrap samples, in this example to obtain the estimated standard errors associated with SIMEX estimates shown in Table 5. Finally, applying the proposed diagnostic method to this data set with $M=300$ bootstrap samples yields an estimated $p$-value of 0.097. We thus conclude that there is insufficient data evidence (at significance level 0.05) to indicate that the assumed beta modal regression model is inadequate for this application.

Table 5: Estimates of parameters in the beta modal regression model applied to the dietary data, along with the corresponding estimated standard errors in parentheses

| Method | $\beta_0$ | $\beta_1$ | $\log m$ |
|---|---|---|---|
| MCCL | $-1.578$ (0.033) | 0.381 (0.099) | 3.015 (0.196) |
| SIMEX | $-1.580$ (0.034) | 0.354 (0.087) | 3.008 (0.195) |
| Naive | $-1.581$ (0.041) | 0.270 (0.058) | 2.979 (0.094) |
Figure 3: Estimated conditional mode functions for the dietary data based on the MCCL estimate (red solid line) and the naive estimate (cyan dashed line), respectively. Observed covariate data $\{\overline{W}_j\}_{j=1}^{271}$ are treated as surrogates of long-term usual intakes in the scatter plot of the observed data (solid dots).

6.2 Application to Alzheimer’s disease data

Medical researchers have long recognized that cerebral atrophy is associated with dementia, and extensive research has been conducted to understand the association between volumetric changes of different brain regions and the severity of dementia. Abundant data collected from this line of research are available in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu/). Zhou and Huang (2020) analyzed a data set relating to 245 individuals diagnosed with mild cognitive impairment from this database. The goal is to study the roles that an individual’s volumetric measure of entorhinal cortex (ERC) and that of hippocampus (HPC) play in predicting one’s risk of developing Alzheimer’s disease. An individual’s test score from the Alzheimer’s disease assessment scale, known as ADAS-11, at month 12 since entering the ADNI cohort is used to assess one’s severity of cognitive impairment. Covariates of interest are the volumetric change in ERC (ERC.change) and that in HPC (HPC.change) at month 12 compared to the baseline measures collected at month 6. Assuming these volumetric measures are observed precisely, Zhou and Huang (2020) fitted the beta modal regression model to the data, with the response $Y$ defined as an individual’s ADAS-11 score divided by a perfect score of 70 and the log-log link in the mode function, $\theta(\mathbf{X})=\exp\{-\exp(-\beta_0-\beta_1\times\text{ERC.change}-\beta_2\times\text{HPC.change})\}$, and showed that it provides a better fit for the data compared to the beta mean regression model proposed by Ferrari and Cribari-Neto (2004).

In reality, measuring ERC volume is challenging because of lateral border discrimination from the perirhinal cortex (Price et al., 2010), and the accuracy of HPC measurements is also in question (Maclaren et al., 2014). It is thus more sensible to view the observed volumetric change of ERC or that of HPC as a noisy surrogate of the actual amount of change. Regardless of which covariate is viewed as error-prone, the current data present some challenges due to the lack of replicate measures for an individual’s true covariate value, and thus the estimation methods proposed in Section 3 are not applicable. For example, in (10), the term multiplying the imaginary unit $i$ is equal to zero now with the number of replicates $n_j=1$, making the “corrected” log-likelihood the same as the naive log-likelihood. A quick fix to the problem is to invoke a similar strategy of correcting naive scores to account for measurement error as discussed in Novick and Stefanski (2002). Following this strategy, a corrected log-likelihood evaluated at the $j$-th data point to use in place of (10) is

$$\tilde{\ell}(\boldsymbol{\Omega};Y_j,\mathbf{W}_j,\tilde{\mathbf{Z}}_j)=\frac{1}{B}\sum_{b=1}^{B}\text{Re}\{\ell(\boldsymbol{\Omega};Y_j,\mathbf{W}_j+i\boldsymbol{\Sigma}_u^{1/2}\mathbf{Z}_{j,b})\}, \qquad (18)$$

where $\tilde{\mathbf{Z}}_j=\{\mathbf{Z}_{j,b}\}_{b=1}^{B}$, for $j=1,\ldots,n$, and $\{\mathbf{Z}_{j,b}, b=1,\ldots,B\}_{j=1}^{n}$ are independent $p$-dimensional normal random vectors with mean zero and an identity variance-covariance matrix. This formulation accommodates multiple error-prone covariates in $\mathbf{X}$ by letting $\mathbf{W}_j$ be a $p$-dimensional multivariate surrogate of $\mathbf{X}_j$, contaminated by a multivariate normal measurement error $\mathbf{U}_j$ with variance-covariance matrix $\boldsymbol{\Sigma}_u$. Setting all entries in $\boldsymbol{\Sigma}_u$ to zero except for the first diagonal entry gives rise to the case considered in the majority of this article, with only $X_1$ prone to error.
Certainly, not having replicate measures still creates an obstacle to implementing this strategy due to its dependence on $\boldsymbol{\Sigma}_u$, which cannot be estimated without replicate measures of a true multivariate covariate value or other external validation data. A well-accepted practice among statisticians in similar situations is to carry out a sensitivity analysis, where one analyzes the data under different assumptions for the parameter that the data carry no information to infer, such as $\boldsymbol{\Sigma}_u$ in our case. If one obtains drastically different inference results when assuming different values for $\boldsymbol{\Sigma}_u$, including a matrix of zeros corresponding to naive estimation that ignores measurement error, then one should exercise caution when interpreting results from an inference procedure that assumes error-free covariates.
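
The Monte Carlo average in (18) can be sketched in a few lines of Python for the case of a single error-prone covariate ($p=1$, so that $\boldsymbol{\Sigma}_u^{1/2}$ reduces to a scalar $\sigma_u$). This is a minimal sketch under stated assumptions rather than the authors' implementation: the names `corrected_loglik` and `toy_loglik` are hypothetical, and a Gaussian working log-likelihood with a linear mean stands in for the beta modal log-likelihood, because the latter would require a log-gamma function that accepts complex arguments. The toy choice has the advantage that the limit of the Monte Carlo average is known in closed form, $\ell + \beta_1^2\sigma_u^2/2$.

```python
import random

def corrected_loglik(loglik, omega, y, w, sigma_u, B, rng):
    """Average of Re{loglik(omega, y, w + i*sigma_u*Z_b)} over B N(0,1) draws.

    This mirrors (18) for one data point with a scalar surrogate w.
    `rng` should be shared across data points so the Z_{j,b} stay independent.
    """
    total = 0.0
    for _ in range(B):
        z = rng.gauss(0.0, 1.0)
        w_complex = complex(w, sigma_u * z)  # W_j + i * sigma_u * Z_{j,b}
        total += complex(loglik(omega, y, w_complex)).real
    return total / B

def toy_loglik(omega, y, w):
    """Stand-in log-likelihood (up to a constant): -(y - b0 - b1*w)^2 / 2."""
    beta0, beta1 = omega
    resid = y - beta0 - beta1 * w
    return -0.5 * resid * resid

rng = random.Random(0)
omega, y_obs, w_obs, sigma_u = (0.5, 2.0), 1.0, 0.3, 0.4

naive = toy_loglik(omega, y_obs, w_obs)
# With sigma_u = 0 the imaginary part vanishes and the naive value is recovered.
no_error = corrected_loglik(toy_loglik, omega, y_obs, w_obs, 0.0, 10, rng)
with_error = corrected_loglik(toy_loglik, omega, y_obs, w_obs, sigma_u, 200000, rng)
# Closed-form limit for this toy model as B grows.
closed_form = naive + 0.5 * omega[1] ** 2 * sigma_u ** 2
print(naive, no_error, with_error, closed_form)
```

Setting `sigma_u` to zero reproduces the naive log-likelihood exactly, which corresponds to the zero matrix for $\boldsymbol{\Sigma}_u$ in the sensitivity analysis.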

For illustration purposes, we assume in the sensitivity analysis four values for $\boldsymbol{\Sigma}_u$ listed in Table 6, where inference results for model parameters under each assumed $\boldsymbol{\Sigma}_u$ are provided. According to Table 6, all four rounds of regression analyses lead to the conclusion that the volumetric change of ERC is an influential predictor for the severity of cognitive impairment, even though the magnitude of the estimated covariate effect is sensitive to the assumed error variance associated with this covariate. In particular, when assuming imprecise measurements for ERC.change, the revised MCCL method that employs the corrected log-likelihood in (18) with $B=100000$ produces results indicating a much stronger association than the naive analysis. By comparison, the magnitude of the estimate for the HPC.change effect is less sensitive to the assumed $\boldsymbol{\Sigma}_u$, but its statistical significance is noticeably affected by it. For example, one would conclude a moderately significant covariate effect of HPC.change based on the naive analysis assuming error-free covariates, but would claim a highly significant, moderately significant, or nonsignificant HPC.change effect depending on which covariate(s) one assumes to be error-prone and the severity of error contamination. This phenomenon is reminiscent of an observation made in Figure 1, and may suggest that ERC.change and HPC.change are correlated. In fact, measurements of ERC and HPC via magnetic resonance imaging are known to be highly correlated with observed clinical alterations in patients suffering mild cognitive impairment or at dementia phases of Alzheimer’s disease (Desikan et al., 2010; Jack et al., 2013; Varon et al., 2014).

Table 6: Sensitivity analysis using the ADNI data for the beta modal regression with the log-log link. Numbers in parentheses are estimated standard errors. Numbers in square brackets are $p$-values associated with covariate effects.

| $\boldsymbol{\Sigma}_u$ | $\beta_0$ | $\beta_1$ (ERC.change) | $\beta_2$ (HPC.change) | $\log m$ |
|---|---|---|---|---|
| $\begin{bmatrix}0&0\\ 0&0\end{bmatrix}$ | $-0.69$ (0.03) | $-0.12$ (0.05) [0.007] | $-0.22$ (0.11) [0.054] | 2.78 (0.15) |
| $\begin{bmatrix}0.16&0\\ 0&0\end{bmatrix}$ | $-0.88$ (0.03) | $-2.44$ (0.00) [0.000] | $0.39$ (0.46) [0.386] | 3.45 (0.04) |
| $\begin{bmatrix}0&0\\ 0&0.0225\end{bmatrix}$ | $-0.71$ (0.03) | $-0.11$ (0.05) [0.014] | $-0.47$ (0.27) [0.084] | 2.80 (0.16) |
| $\begin{bmatrix}0.16&0\\ 0&0.0225\end{bmatrix}$ | $-0.81$ (0.03) | $-2.42$ (0.00) [0.000] | $-0.85$ (0.00) [0.000] | 3.85 (0.02) |
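
The mechanics of such a sensitivity analysis can be illustrated with a deliberately simplified stand-in. The sketch below does not use the beta modal model or the MCCL method; it assumes a toy linear model with classical additive measurement error, for which the method-of-moments corrected slope $S_{wy}/(S_{ww}-\sigma_u^2)$ is available in closed form. All names and simulated values are hypothetical. It shows how the estimated covariate effect grows in magnitude as the assumed error variance increases, qualitatively echoing the pattern for ERC.change in Table 6.

```python
import random

def corrected_slope(w, y, sigma_u2):
    """Slope S_wy / (S_ww - sigma_u2); sigma_u2 = 0 gives the naive OLS slope."""
    n = len(w)
    w_bar = sum(w) / n
    y_bar = sum(y) / n
    s_wy = sum((wi - w_bar) * (yi - y_bar) for wi, yi in zip(w, y)) / (n - 1)
    s_ww = sum((wi - w_bar) ** 2 for wi in w) / (n - 1)
    if s_ww - sigma_u2 <= 0:
        raise ValueError("assumed error variance exceeds the surrogate variance")
    return s_wy / (s_ww - sigma_u2)

# Simulated toy data: true slope -1, true measurement error variance 0.16.
rng = random.Random(2024)
n, true_slope, true_sigma_u2 = 5000, -1.0, 0.16
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
w = [xi + rng.gauss(0.0, true_sigma_u2 ** 0.5) for xi in x]
y = [true_slope * xi + rng.gauss(0.0, 0.5) for xi in x]

# Refit under several assumed error variances, including zero (naive analysis).
slopes = {assumed: corrected_slope(w, y, assumed) for assumed in (0.0, 0.08, 0.16)}
for assumed, slope in slopes.items():
    print("assumed sigma_u^2 =", assumed, "-> slope", round(slope, 3))
```

In this toy setting the naive slope is attenuated toward zero, and only the analysis that assumes the correct error variance recovers a slope close to the true value, which is why results that change drastically across assumed values call for caution.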

In conclusion, results from the sensitivity analysis suggest that volumetric measures of different brain regions are likely to be subject to measurement error, and statistical analyses under the assumption of precisely measured covariates should be interpreted with caution. If replicate data are available for covariates of interest, the MCCL method can provide more reliable inference. Lastly, even though one can mimic (18) to construct a corrected score in place of $\tilde{\mathbf{S}}(\boldsymbol{\Omega};Y_j,\widetilde{W}_j,\widetilde{T}_j,\mathbf{X}_{-1,j})$ in (15) and then formulate the test statistic $\tilde{Q}(\hat{\boldsymbol{\Omega}};\mathcal{D}^{*})$ for model diagnostics, the dependence of the revised score on the unknown $\boldsymbol{\Sigma}_u$ remains an obstacle that hinders one from using the bootstrap procedure outlined in Section 4 to assess statistical significance of the revised test statistic. Alternative diagnostic methods that do not rely on parametric bootstrap or corrected score (e.g., Huang et al., 2006) can be used to detect inadequate assumptions imposed on the primary regression model.

7 Discussion

We propose an inference procedure based on the idea of corrected score that falls in the framework of $M$-estimation for modal regression with an error-prone covariate. Even though in this article we focus on the beta modal regression model as the primary regression model, the proposed MCCL method is applicable in other parametric modal regression models, such as the gamma modal regression models for non-negative responses proposed by Aristodemou (2014) and Bourguignon et al. (2020), and the flexible Gumbel regression model recently proposed by Liu et al. (2022b) for responses ranging over the entire real line. In fact, provided that a parametric modal regression model can provide reliable inference for the global mode in the absence of covariate measurement error (even when $Y$ follows a multimodal distribution given $\mathbf{X}$), such as the flexible unimodal regression models considered in Liu et al. (2022a), the proposed MCCL method applied to error-prone data is expected to improve over the counterpart naive method that ignores measurement error. A Python package for implementing the proposed methods for beta modal regression with errors-in-covariate is available at https://pypi.org/project/pybetareg/. All computer programs used in this paper are available at https://github.com/rh8liuqy/Modal_regression_with_measurement_error.

To accommodate situations without replicate measures of the true covariate or settings with multiple error-prone covariates, the MCCL method can be easily revised as demonstrated in Section 6.2, although one needs to specify the variance (or the variance-covariance matrix) of the (vector-valued) measurement error if one lacks replicate data or external validation data to estimate it.

Focusing on the current beta modal regression models, some extensions are worthy of further investigation, such as a zero-inflated beta modal regression model to fit disease prevalence data, which is especially suitable for rare diseases, and a four-parameter beta modal regression model as considered in Zhou and Huang (2022) for a bounded response with unknown support. Another follow-up research direction is variable selection based on a parametric modal regression model with or without measurement error contamination in covariates.

Conflict of Interest

The authors have declared no conflict of interest.

References

  • Aristodemou, (2014) Aristodemou, K. (2014). New regression methods for measures of central tendency. PhD thesis, Brunel University.
  • Bagnato and Punzo, (2013) Bagnato, L. and Punzo, A. (2013). Finite mixtures of unimodal beta and gamma densities and the $k$-bumps algorithm. Computational Statistics, 28(4):1571–1597.
  • Boos and Stefanski, (2013) Boos, D. D. and Stefanski, L. A. (2013). Essential Statistical Inference: Theory and Methods, volume 591. Springer.
  • Bourguignon et al., (2020) Bourguignon, M., Leão, J., and Gallardo, D. I. (2020). Parametric modal regression with varying precision. Biometrical Journal, 62(1):202–220.
  • Buonaccorsi et al., (2016) Buonaccorsi, J., Prochenka, A., Thoresen, M., and Ploski, R. (2016). Correcting for binomial measurement error in predictors in regression with application to analysis of DNA methylation rates by bisulfite sequencing. Statistics in medicine, 35(22):3987–4007.
  • Buonaccorsi, (2010) Buonaccorsi, J. P. (2010). Measurement Error: Models, Methods, and Applications. Chapman and Hall/CRC.
  • Buonaccorsi et al., (2018) Buonaccorsi, J. P., Romeo, G., and Thoresen, M. (2018). Model-based bootstrapping when correcting for measurement error with application to logistic regression. Biometrics, 74(1):135–144.
  • Carroll et al., (1997) Carroll, R. J., Freedman, L., and Pee, D. (1997). Design aspects of calibration studies in nutrition, with analysis of missing data in linear measurement error models. Biometrics, 53(4).
  • Carroll et al., (1996) Carroll, R. J., Küchenhoff, H., Lombard, F., and Stefanski, L. A. (1996). Asymptotics for the SIMEX estimator in nonlinear measurement error models. Journal of the American Statistical Association, 91(433):242–250.
  • Carroll et al., (2006) Carroll, R. J., Ruppert, D., Stefanski, L. A., and Crainiceanu, C. M. (2006). Measurement Error in Nonlinear Models. Chapman and Hall/CRC.
  • Chacón, (2020) Chacón, J. E. (2020). The modal age of statistics. International Statistical Review, 88(1):122–141.
  • Chen, (1999) Chen, S. X. (1999). Beta kernel estimators for density functions. Computational Statistics & Data Analysis, 31(2):131–145.
  • Chen et al., (2016) Chen, Y.-C., Genovese, C. R., Tibshirani, R. J., and Wasserman, L. (2016). Nonparametric modal regression. The Annals of Statistics, 44(2):489–514.
  • Davison and Hinkley, (1997) Davison, A. C. and Hinkley, D. V. (1997). Bootstrap methods and their application. Number 1. Cambridge university press.
  • Desikan et al., (2010) Desikan, R. S., Cabral, H. J., Settecase, F., Hess, C. P., Dillon, W. P., Glastonbury, C. M., Weiner, M. W., Schmansky, N. J., Salat, D. H., and Fischl, B. (2010). Automated MRI measures predict progression to Alzheimer’s disease. Neurobiology of Aging, 31(8).
  • Fernández and Steel, (1998) Fernández, C. and Steel, M. F. (1998). On bayesian modeling of fat tails and skewness. Journal of the american statistical association, 93(441):359–371.
  • Ferrari and Cribari-Neto, (2004) Ferrari, S. and Cribari-Neto, F. (2004). Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31(7):799–815.
  • Fuller, (2009) Fuller, W. A. (2009). Measurement Error Models. John Wiley & Sons.
  • Hall and Wilson, (1991) Hall, P. and Wilson, S. R. (1991). Two guidelines for bootstrap hypothesis testing. Biometrics, pages 757–762.
  • He and Liang, (2000) He, X. and Liang, H. (2000). Quantile regression estimates for a class of linear and partially linear errors-in-variables models. Statistica Sinica, 10(1):129–140.
  • Hosmer et al., (1997) Hosmer, D. W., Hosmer, T., Le Cessie, S., and Lemeshow, S. (1997). A comparison of goodness-of-fit tests for the logistic regression model. Statistics in medicine, 16(9):965–980.
  • Huang et al., (2006) Huang, X., Stefanski, L. A., and Davidian, M. (2006). Latent-model robustness in structural measurement error models. Biometrika, 93(1):53–64.
  • Jack et al., (2013) Jack, C. R., Knopman, D. S., Jagust, W. J., Petersen, R. C., Weiner, M. W., Aisen, P. S., Shaw, L. M., Vemuri, P., Wiste, H. J., Weigand, S. D., Lesnick, T. G., Pankratz, V. S., Donohue, M. C., and Trojanowski, J. Q. (2013). Tracking pathophysiological processes in Alzheimer’s disease: an updated hypothetical model of dynamic biomarkers. The Lancet Neurology, 12(2):207–216.
  • Kemp et al., (2020) Kemp, G. C., Parente, P. M., and Santos Silva, J. (2020). Dynamic vector mode regression. Journal of Business & Economic Statistics, 38(3):647–661.
  • Kruschke, (2015) Kruschke, J. K. (2015). Doing Bayesian Data Analysis: A tutorial with R, JAGS, and Stan. Academic Press.
  • Lee, (1989) Lee, M.-J. (1989). Mode regression. Journal of Econometrics, 42(3):337–349.
  • Lee, (1993) Lee, M.-J. (1993). Quadratic mode regression. Journal of Econometrics, 57(1-3):1–19.
  • Li and Huang, (2019) Li, X. and Huang, X. (2019). Linear mode regression with covariate measurement error. Canadian Journal of Statistics, 47(2).
  • Liu et al., (2022a) Liu, Q., Huang, X., and Bai, R. (2022a). Bayesian modal regression based on mixture distributions. arXiv preprint arXiv:2211.10776.
  • Liu et al., (2022b) Liu, Q., Huang, X., and Zhou, H. (2022b). The flexible gumbel distribution: A new model for inference about the mode. arXiv preprint arXiv:2212.01832.
  • Maclaren et al., (2014) Maclaren, J., Han, Z., Vos, S. B., Fischbein, N., and Bammer, R. (2014). Reliability of brain volume measurements: A test-retest dataset. Scientific Data, 1(1).
  • Martin, (2007) Martin, M. A. (2007). Bootstrap hypothesis testing for some common statistical problems: A critical evaluation of size and power properties. Computational Statistics & Data Analysis, 51(12):6321–6342.
  • Nakamura, (1990) Nakamura, T. (1990). Corrected score function for errors-in-variables models: Methodology and application to generalized linear models. Biometrika, 77(1).
  • Novick and Stefanski, (2002) Novick, S. J. and Stefanski, L. A. (2002). Corrected score estimation via complex variable simulation extrapolation. Journal of the American Statistical Association, 97(458):472–481.
  • Ota et al., (2019) Ota, H., Kato, K., and Hara, S. (2019). Quantile regression approach to conditional mode estimation. Electronic Journal of Statistics, 13(2):3120–3160.
  • Price et al., (2010) Price, C., Wood, M., Leonard, C., Towler, S., Ward, J., Montijo, H., Kellison, I., Bowers, D., Monk, T., Newcomer, J., et al. (2010). Entorhinal cortex volume in older adults: Reliability and validity considerations for three published measurement protocols. Journal of the International Neuropsychological Society, 16:846–855.
  • Quintana et al., (2009) Quintana, F. A., Steel, M. F., and Ferreira, J. T. (2009). Flexible univariate continuous distributions. Bayesian Analysis, 4(4):497–522.
  • Rubio and Steel, (2015) Rubio, F. and Steel, M. (2015). Bayesian modelling of skewness and kurtosis with two-piece scale and shape distributions. Electronic Journal of Statistics, 9:1884–1912.
  • Sager and Thisted, (1982) Sager, T. W. and Thisted, R. A. (1982). Maximum likelihood estimation of isotonic modal regression. The Annals of Statistics, 10(3):690–707.
  • Shi et al., (2021) Shi, J., Zhang, Y., Yu, P., and Song, W. (2021). SIMEX estimation in parametric modal regression with measurement error. Computational Statistics & Data Analysis, 157:107158.
  • Stefanski, (1989) Stefanski, L. A. (1989). Unbiased estimation of a nonlinear function a normal mean with application to measurement error models. Communications in Statistics-Theory and Methods, 18(12):4335–4358.
  • Stefanski et al., (2005) Stefanski, L. A., Novick, S. J., and Devanarayan, V. (2005). Estimating a nonlinear function of a normal mean. Biometrika, 92(3):732–736.
  • Thomas et al., (2011) Thomas, L., Stefanski, L., and Davidian, M. (2011). A moment-adjusted imputation method for measurement error models. Biometrics, 67(4):1461–1470.
  • Ullah et al., (2022) Ullah, A., Wang, T., and Yao, W. (2022). Nonlinear modal regression for dependent data with application for predicting COVID-19. Journal of the Royal Statistical Society. Series A (Statistics in Society), 185(3):1424–1453.
  • Varon et al., (2014) Varon, D., Barker, W., Loewenstein, D., Greig, M., Bohorquez, A., Santos, I., Shen, Q., Harper, M., Vallejo-Luces, T., and Duara, R. (2014). Visual rating and volumetric measurement of medial temporal atrophy in the Alzheimer’s disease neuroimaging initiative (ADNI) cohort: baseline diagnosis and the prediction of MCI outcome. International Journal of Geriatric Psychiatry, 30(2):192–200.
  • Wang et al., (2012) Wang, H. J., Stefanski, L. A., and Zhu, Z. (2012). Corrected-loss estimation for quantile regression with covariate measurement errors. Biometrika, 99(2):405.
  • Wang et al., (2019) Wang, K., Li, S., Sun, X., and Lin, L. (2019). Modal regression statistical inference for longitudinal data semivarying coefficient models: Generalized estimating equations, empirical likelihood and variable selection. Computational Statistics & Data Analysis, 133:257–276.
  • Wei and Carroll, (2009) Wei, Y. and Carroll, R. J. (2009). Quantile regression with measurement error. Journal of the American Statistical Association, 104(487):1129–1143.
  • White, (1982) White, H. (1982). Maximum likelihood estimation of misspecified models. Econometrica: Journal of the econometric society, pages 1–25.
  • Xiang and Yao, (2022) Xiang, S. and Yao, W. (2022). Nonparametric statistical learning based on modal regression. Journal of Computational and Applied Mathematics, 409:114130.
  • Yao and Li, (2013) Yao, W. and Li, L. (2013). A new regression model: Modal linear regression. Scandinavian Journal of Statistics, 41(3):656–671.
  • Yi, (2017) Yi, G. Y. (2017). Statistical Analysis with Measurement Error or Misclassification. Springer New York.
  • Zhang et al., (2021) Zhang, T., Kato, K., and Ruppert, D. (2021). Bootstrap inference for quantile-based modal regression. Journal of the American Statistical Association, pages 1–13.
  • Zhou and Huang, (2016) Zhou, H. and Huang, X. (2016). Nonparametric modal regression in the presence of measurement error. Electronic Journal of Statistics, 10(2).
  • Zhou and Huang, (2020) Zhou, H. and Huang, X. (2020). Parametric mode regression for bounded responses. Biometrical Journal, 62(7):1791–1809.
  • Zhou and Huang, (2022) Zhou, H. and Huang, X. (2022). Bayesian beta regression for bounded responses with unknown supports. Computational Statistics & Data Analysis, 167:107345.