Asymptotics of estimators for structured covariance matrices

Hendrik Paul Lopuhaä Delft University of Technology
(July 2, 2024)
Abstract

We show that the limiting variance of a sequence of estimators for a structured covariance matrix has a general form that appears as the variance of a scaled projection of a random matrix that is of radial type. A similar result is obtained for the corresponding sequence of estimators for the vector of variance components. These results are illustrated by the limiting behavior of estimators for a linear covariance structure in a variety of multivariate statistical models. We also derive a characterization for the influence function of corresponding functionals. Furthermore, we derive the limiting distribution and influence function of scale invariant mappings of such estimators and their corresponding functionals. As a consequence, the asymptotic relative efficiency of different estimators for the shape component of a structured covariance matrix can be compared by means of a single scalar and the gross error sensitivity of the corresponding influence functions can be compared by means of a single index. Similar results are obtained for estimators of the normalized vector of variance components. We apply our results to investigate how the efficiency, gross error sensitivity, and breakdown point of S-estimators for the normalized variance components are affected simultaneously by varying their cutoff value.

1 Introduction

Covariance matrices describe the relationships and variability between different variables in a dataset. When there is a known structure or pattern in these relationships, structured covariance matrices can be estimated to capture and represent that structure. The use of structured covariance matrices is a valuable tool for modeling the underlying patterns and dependencies in multivariate data. It provides a more nuanced understanding of the relationships between variables, especially in scenarios where variables exhibit specific structures or patterns of correlation. Structured covariance matrices are commonly used in the analysis of repeated measures, longitudinal data, and multivariate data with a known underlying structure. They are particularly useful when there are dependencies or correlations among different measurements or variables and are widely used in various fields, including biology, medicine, psychology, and social sciences.

When a covariance matrix is unstructured and can be any positive definite symmetric matrix $\mathbf{\Sigma}$, then the limiting behavior of covariance estimators $\mathbf{V}_n$ for $\mathbf{\Sigma}$ is well understood. For example, if $\mathbf{V}_n$ is based on a sample $\mathbf{y}_1,\ldots,\mathbf{y}_n\in\mathbb{R}^k$ from a distribution with an elliptically contoured density $|\mathbf{\Sigma}|^{-1/2}g((\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu}))$, then typically $\sqrt{n}(\mathbf{V}_n-\mathbf{\Sigma})$ converges in distribution to a random matrix $\mathbf{N}$ that has a multivariate normal distribution with mean zero and variance

$$\mathrm{var}\{\mathrm{vec}(\mathbf{N})\}=\sigma_1(\mathbf{I}_{k^2}+\mathbf{K}_{k,k})(\mathbf{\Sigma}\otimes\mathbf{\Sigma})+\sigma_2\,\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^T, \tag{1.1}$$

for some $\sigma_1\geq 0$ and $\sigma_2\geq -2\sigma_1/k$, where $\otimes$ denotes the Kronecker product, $\mathbf{K}_{k,k}$ is the commutation matrix, and vec is the operator that stacks the columns of a matrix. This form of limiting variance appears for many covariance estimators. Tyler [26] gives several examples, including the sample covariance matrix, and nicely explains that this general form will always appear when $\mathbf{N}$ is of radial type with respect to $\mathbf{\Sigma}$.

The situation becomes different when estimating a structured covariance matrix $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta})$, where $\mathbf{V}(\cdot)$ is a known covariance structure depending on a vector $\bm{\theta}=(\theta_1,\ldots,\theta_\ell)$ of unknown variance components. Asymptotic results for the maximum likelihood estimator of variance components in linear models with Gaussian errors having a structured covariance matrix $\mathbf{V}(\bm{\theta})$ can be found in Hartley and Rao [8], Miller [22], and Mardia and Marshall [20]. When scaled appropriately, the maximum likelihood estimator $\bm{\theta}_n$ is shown to be asymptotically normal with mean $\bm{\theta}$ and variance $\mathbf{J}^{-1}$, where $\mathbf{J}_{ij}=\mathrm{tr}(\mathbf{\Sigma}^{-1}\mathbf{L}_i\mathbf{\Sigma}^{-1}\mathbf{L}_j)/2$, for $i,j=1,\ldots,\ell$, with $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta})$ and $\mathbf{L}_i=\partial\mathbf{V}(\bm{\theta})/\partial\theta_i$. By employing the vec-notation, the limiting covariance of $\bm{\theta}_n$ can be expressed as

$$2\left(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\right)^{-1},$$

where $\mathbf{L}$ is the matrix with columns $\mathrm{vec}(\mathbf{L}_1),\ldots,\mathrm{vec}(\mathbf{L}_\ell)$. According to the delta method, the limiting covariance of $\mathrm{vec}(\mathbf{V}(\bm{\theta}_n))$ is then given by

$$2\mathbf{L}\left(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\right)^{-1}\mathbf{L}^T.$$
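As a hedged numerical illustration of the vec-notation step, the following Python sketch compares $\mathbf{J}^{-1}$, computed entrywise from the trace formula, with $2(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L})^{-1}$, and then forms the delta-method covariance of $\mathrm{vec}(\mathbf{V}(\bm{\theta}_n))$. The compound symmetry structure, the dimension $k=3$, the values of $\bm{\theta}$, and all function and variable names are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def vec(A):
    """Stack the columns of A (column-major order)."""
    return A.reshape(-1, order="F")

def fisher_information(Sigma, L_list):
    """J_ij = tr(Sigma^{-1} L_i Sigma^{-1} L_j) / 2, computed entry by entry."""
    S_inv = np.linalg.inv(Sigma)
    ell = len(L_list)
    J = np.empty((ell, ell))
    for i in range(ell):
        for j in range(ell):
            J[i, j] = np.trace(S_inv @ L_list[i] @ S_inv @ L_list[j]) / 2.0
    return J

# Hypothetical example: V(theta) = theta_1 * 11^T + theta_2 * I with k = 3.
k = 3
L_list = [np.ones((k, k)), np.eye(k)]
theta = np.array([0.5, 2.0])
Sigma = theta[0] * L_list[0] + theta[1] * L_list[1]

# k^2 x ell matrix with columns vec(L_1), ..., vec(L_ell).
L = np.column_stack([vec(Lj) for Lj in L_list])
S_inv = np.linalg.inv(Sigma)
W = np.kron(S_inv, S_inv)

cov_theta_trace = np.linalg.inv(fisher_information(Sigma, L_list))  # J^{-1}
cov_theta_vec = 2.0 * np.linalg.inv(L.T @ W @ L)                    # vec form
cov_vecV = L @ cov_theta_vec @ L.T                                  # delta method

print(np.allclose(cov_theta_trace, cov_theta_vec))
```

The two expressions agree because $\mathrm{tr}(\mathbf{\Sigma}^{-1}\mathbf{L}_i\mathbf{\Sigma}^{-1}\mathbf{L}_j)=\mathrm{vec}(\mathbf{L}_i)^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathrm{vec}(\mathbf{L}_j)$ when the $\mathbf{L}_j$ are symmetric.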

Similar results have been obtained in Lopuhaä et al [16] for the class of S-estimators based on observations that follow a linear model with a structured covariance $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta})$, where $\mathbf{V}$ is a linear function of $\bm{\theta}$. Under appropriate conditions, it holds that $\sqrt{n}(\bm{\theta}_n-\bm{\theta})$ is asymptotically normal with mean zero and variance

$$2\sigma_1\Big(\mathbf{L}^T\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}+\sigma_2\bm{\theta}\bm{\theta}^T, \tag{1.2}$$

and $\sqrt{n}(\mathbf{V}(\bm{\theta}_n)-\mathbf{\Sigma})$ converges in distribution to a random matrix $\mathbf{M}$ that has a multivariate normal distribution with mean zero and variance

$$\mathrm{var}\{\mathrm{vec}(\mathbf{M})\}=2\sigma_1\mathbf{L}\Big(\mathbf{L}^T\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}\mathbf{L}^T+\sigma_2\,\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^T. \tag{1.3}$$

One of the objectives of this paper is to show that this general form will always appear when $\mathbf{M}$ is a scaled projection on the column space of $\mathbf{L}$ of a random matrix that is of radial type with respect to $\mathbf{\Sigma}$. Moreover, we provide several examples of covariance estimators that exhibit this kind of limiting behavior.

Another objective concerns the asymptotic behavior of estimators for scale invariant mappings $H$ of positive definite symmetric matrices. For affine equivariant covariance estimators $\mathbf{V}_n$ with asymptotic variance (1.1), Tyler [27] shows that $H(\mathbf{V}_n)$ has an asymptotic variance that only depends on the scalar $\sigma_1$. When dealing with a structured covariance matrix, the covariance estimators are typically not affine equivariant and have asymptotic variance (1.3). The second objective of this paper is to show that Tyler's result for affine equivariant covariance estimators remains true for estimators of a structured covariance matrix. Moreover, we will establish a similar result for scale invariant mappings $H(\bm{\theta}_n)$ of estimators for the vector of variance components.

An example of a scale invariant mapping is the shape component $\mathbf{V}/|\mathbf{V}|^{1/k}$. A consequence of our results is that the asymptotic relative efficiency of estimators of the shape of a structured covariance can be compared simply by comparing the corresponding values for $\sigma_1$. For affine equivariant covariance estimators, this was already observed by Kent and Tyler [11] and Salibián et al [24]. Similar properties will be shown to hold for the direction component $\bm{\theta}/\|\bm{\theta}\|$ corresponding to the vector of variance components.

A final objective of this paper concerns the influence function of structured covariance functionals. For affine equivariant covariance functionals, Croux and Haesbroeck [5] show that the influence function at the multivariate normal is characterized by two real-valued functions. Structured covariance functionals, however, are not necessarily affine equivariant. We will show that such a characterization remains valid for structured covariance functionals at any elliptically contoured distribution, and similarly for the variance components functional. A nice consequence is that the influence function of scale invariant mappings $H$ of a structured covariance functional $\mathbf{V}(\bm{\theta}(\cdot))$, or of $\bm{\theta}(\cdot)$ itself, is characterized by a single real-valued function. As such, the gross-error-sensitivity (GES) is proportional to a single index, which can be used to compare the GES of different shape functionals or different direction functionals. Kent and Tyler [11] already observed such a property for the shape component of affine equivariant covariance functionals, see also Salibián et al [24].

Besides having merit of their own, our results also enable the construction of MM-estimators with auxiliary scale in linear mixed effects models and other linear models with structured covariances. These estimators inherit the robustness of S-estimators considered in Lopuhaä et al [16] and, in contrast to the simpler version considered in Lopuhaä [15], improve both the efficiency of the estimator of the fixed effects as well as the efficiency of the estimator of the covariance shape component and of the direction of the vector of variance components. Investigation of this version of MM-estimators will be postponed to a future manuscript, in which we will extend similar results that are already available for unstructured covariances in the multivariate location-scale model, see Tatsuoka and Tyler [25] or Salibián-Barrera et al [24], and in the multivariate regression model, see Kudraszow and Maronna [12].

The paper is organized as follows. In Section 2 we show that the general forms of (1.3) and (1.2) can be derived solely using a scaled projection of a random matrix that is of radial type. In Section 3 we investigate the limiting behavior of estimators of a linear covariance structure in a variety of multivariate models. We establish that these estimators asymptotically behave the same as a scaled projection of a sequence of affine equivariant covariance estimators that are asymptotically of radial type. In Section 4 we derive the limiting distribution of scale invariant mappings of estimators of a linear covariance structure that are asymptotically normal, and similarly for scale invariant mappings of estimators of the vector of variance components. In Section 5 we derive a characterization for the influence function of linearly structured covariance functionals and the corresponding functional of variance components, and of scale invariant mappings thereof. In Section 6 we apply our results to investigate how the efficiency, GES, and breakdown point of S-estimators of the variance components are affected simultaneously, when we vary the cut-off value of the rho-function that defines the S-estimator. All proofs are postponed to an appendix at the end of the paper.

2 Projection of a random matrix of radial type

A random matrix $\mathbf{R}$ is said to be of radial type, if for any orthogonal matrix $\mathbf{O}$, the distribution of $\mathbf{O}\mathbf{R}\mathbf{O}^T$ is the same as that of $\mathbf{R}$. The covariance structure of random matrices with a radial distribution was first given by Mallows [19] in index form. Tyler [26] gave the covariance structure in matrix form and provided necessary conditions on its parameters. A random matrix $\mathbf{N}$ is said to be of radial type with respect to the positive definite symmetric matrix $\mathbf{\Sigma}$, if $\mathbf{\Sigma}^{-1/2}\mathbf{N}\mathbf{\Sigma}^{-1/2}$ has a radial distribution. If the first two moments of $\mathbf{N}$ exist, then according to Corollary 1 in Tyler [26], the variance of $\mathbf{N}$ is given by (1.1).

Consider a $k\times k$ structured covariance matrix $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta})$, where $\mathbf{V}$ is a known covariance structure that is a linear function of $\bm{\theta}=(\theta_1,\ldots,\theta_\ell)$, a vector of unknown variance components. Define the $k^2\times\ell$ matrix

$$\mathbf{L}=\left[\begin{array}{ccc}\mathrm{vec}(\mathbf{L}_1)&\cdots&\mathrm{vec}(\mathbf{L}_\ell)\end{array}\right],\quad\mathbf{L}_j=\partial\mathbf{V}/\partial\theta_j,\text{ for }j=1,\ldots,\ell. \tag{2.1}$$

Note that since $\mathbf{V}$ is linear, we can write $\mathbf{\Sigma}=\theta_1\mathbf{L}_1+\cdots+\theta_\ell\mathbf{L}_\ell$ and $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}$. Furthermore, let $\Pi_L$ be the projection of a vector $\mathbf{x}\in\mathbb{R}^{k^2}$ on the column space of $\mathbf{L}$, re-scaled by $\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}$, that is

$$\Pi_L\mathbf{x}=\mathop{\mathrm{argmin}}_{\bm{\theta}\in\mathbb{R}^\ell}\,(\mathbf{x}-\mathbf{L}\bm{\theta})^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})(\mathbf{x}-\mathbf{L}\bm{\theta}). \tag{2.2}$$
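A minimal sketch of how (2.2) might be computed, under our own reading that $\Pi_L\mathbf{x}$ denotes the fitted vector $\mathbf{L}\hat{\bm{\theta}}$ in the column space of $\mathbf{L}$, with $\hat{\bm{\theta}}$ the minimizer of the weighted least squares criterion. Under that assumption $\Pi_L$ has the matrix $\mathbf{L}(\mathbf{L}^T\mathbf{W}\mathbf{L})^{-1}\mathbf{L}^T\mathbf{W}$ with $\mathbf{W}=\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}$. The compound symmetry structure, the numerical values, and the helper names below are hypothetical.

```python
import numpy as np

def vec(A):
    """Stack the columns of A (column-major order)."""
    return A.reshape(-1, order="F")

def scaled_projection(L, Sigma):
    """Matrix of x -> L @ theta_hat, where theta_hat minimizes
    (x - L theta)^T (Sigma^{-1} kron Sigma^{-1}) (x - L theta).
    Assumes L has full column rank and Sigma is positive definite."""
    S_inv = np.linalg.inv(Sigma)
    W = np.kron(S_inv, S_inv)
    return L @ np.linalg.solve(L.T @ W @ L, L.T @ W)

# Hypothetical linear structure: V(theta) = theta_1 * 11^T + theta_2 * I, k = 3.
k = 3
L_list = [np.ones((k, k)), np.eye(k)]
theta = np.array([0.5, 2.0])
L = np.column_stack([vec(Lj) for Lj in L_list])
Sigma = (L @ theta).reshape(k, k, order="F")   # vec(Sigma) = L theta

Pi = scaled_projection(L, Sigma)
S_inv = np.linalg.inv(Sigma)
W = np.kron(S_inv, S_inv)

# Project an arbitrary vector and inspect the weighted residual.
rng = np.random.default_rng(0)
x = rng.standard_normal(k * k)
residual = x - Pi @ x
print(np.allclose(L.T @ W @ residual, 0.0))
```

With this reading, $\Pi_L$ is idempotent and leaves $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}$ unchanged, which is what makes the term $\sigma_2\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^T$ of (1.1) reappear unaltered in (1.3).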

We then have the following theorem.

Theorem 1.

Let $\mathbf{N}$ be a random matrix that is of radial type with respect to a positive definite symmetric matrix $\mathbf{\Sigma}$. Suppose that $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta})$, for some $\bm{\theta}\in\mathbb{R}^\ell$, and that $\mathbf{V}$ is linear such that $\mathbf{L}$, as defined in (2.1), is of full column rank. Let $\Pi_L$ be the projection defined in (2.2) and define the random matrix $\mathbf{M}$ by $\mathrm{vec}(\mathbf{M})=\Pi_L\mathrm{vec}(\mathbf{N})$.

  • (i)

    If the first two moments of $\mathbf{N}$ exist, then there exist constants $\eta$, $\sigma_1$ and $\sigma_2$ with $\sigma_1\geq 0$ and $\sigma_2\geq -2\sigma_1/k$, such that $\mathbb{E}[\mathrm{vec}(\mathbf{M})]=\eta\,\mathrm{vec}(\mathbf{\Sigma})$ and $\mathrm{var}(\mathrm{vec}(\mathbf{M}))$ is given by (1.3).

  • (ii)

    If $\mathbf{T}\in\mathbb{R}^\ell$ is the random vector such that $\mathrm{vec}(\mathbf{M})=\mathbf{L}\mathbf{T}$, then $\mathbb{E}[\mathbf{T}]=\eta\bm{\theta}$ and $\mathrm{var}(\mathbf{T})$ is given by (1.2).

Note that the constants $\eta$, $\sigma_1$ and $\sigma_2$ have nothing to do with the projection $\Pi_L$, but are inherited from the variance (1.1) of the radial random matrix $\mathbf{N}$. Their existence is guaranteed by Corollary 1 in Tyler [26].
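The variance statements of Theorem 1 can be checked numerically at the level of second moments. The sketch below is an illustration rather than a proof: it assumes the matrix form $\Pi_L=\mathbf{L}(\mathbf{L}^T\mathbf{W}\mathbf{L})^{-1}\mathbf{L}^T\mathbf{W}$ with $\mathbf{W}=\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}$, starts from a variance of the form (1.1) with arbitrarily chosen $\sigma_1$ and $\sigma_2$, and compares the propagated variances with (1.3) and (1.2). The three-component linear structure, the constants, and the helper names are hypothetical.

```python
import numpy as np

def vec(A):
    """Stack the columns of A (column-major order)."""
    return A.reshape(-1, order="F")

def commutation_matrix(k):
    """K with K @ vec(A) == vec(A.T) for k x k matrices A."""
    K = np.zeros((k * k, k * k))
    for i in range(k):
        for j in range(k):
            K[j * k + i, i * k + j] = 1.0
    return K

# Hypothetical linear structure with three symmetric basis matrices (k = 3).
k = 3
band = np.diag(np.ones(k - 1), 1) + np.diag(np.ones(k - 1), -1)
L_list = [np.eye(k), np.ones((k, k)), band]
theta = np.array([2.0, 0.5, 0.3])
L = np.column_stack([vec(Lj) for Lj in L_list])
Sigma = (L @ theta).reshape(k, k, order="F")

# Arbitrary constants with sigma1 >= 0 and sigma2 >= -2 * sigma1 / k.
sigma1, sigma2 = 1.3, -0.2

S_inv = np.linalg.inv(Sigma)
W = np.kron(S_inv, S_inv)
A_inv = np.linalg.inv(L.T @ W @ L)
B = A_inv @ L.T @ W        # maps vec(N) to T
Pi = L @ B                 # maps vec(N) to vec(M) = L T

v = vec(Sigma)
var_N = (sigma1 * (np.eye(k * k) + commutation_matrix(k)) @ np.kron(Sigma, Sigma)
         + sigma2 * np.outer(v, v))                               # form (1.1)

var_M = Pi @ var_N @ Pi.T
var_T = B @ var_N @ B.T
rhs_13 = 2 * sigma1 * L @ A_inv @ L.T + sigma2 * np.outer(v, v)   # form (1.3)
rhs_12 = 2 * sigma1 * A_inv + sigma2 * np.outer(theta, theta)     # form (1.2)

print(np.allclose(var_M, rhs_13), np.allclose(var_T, rhs_12))
```

The check relies on the basis matrices $\mathbf{L}_j$ being symmetric, so that $\mathbf{K}_{k,k}\mathbf{L}=\mathbf{L}$, and on $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}$ lying in the column space of $\mathbf{L}$.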

Examples of multivariate statistical models with a linear covariance structure are linear mixed effects models. Linear models with errors generated by some autoregressive time series may also correspond to a linear covariance structure. When $\mathbf{\Sigma}$ is unstructured and can be any positive definite symmetric covariance matrix, it can also be seen as a linear covariance structure $\mathbf{V}(\bm{\theta})$, where $\bm{\theta}=\mathrm{vech}(\mathbf{\Sigma})$, and

$$\mathrm{vech}(\mathbf{A})=(a_{11},\ldots,a_{k1},a_{22},\ldots,a_{kk}) \tag{2.3}$$

is the unique $k(k+1)/2$-vector that stacks the columns of the lower triangle elements of a symmetric matrix $\mathbf{A}$. The matrix $\mathbf{L}=\partial\mathrm{vec}(\mathbf{V})/\partial\bm{\theta}^T$ is then equal to the so-called duplication matrix $\mathcal{D}_k$, which is the unique $k^2\times k(k+1)/2$ matrix with the properties $\mathcal{D}_k\mathrm{vech}(\mathbf{A})=\mathrm{vec}(\mathbf{A})$ and $(\mathcal{D}_k^T\mathcal{D}_k)^{-1}\mathcal{D}_k^T\mathrm{vec}(\mathbf{A})=\mathrm{vech}(\mathbf{A})$. Moreover, from the properties of $\mathcal{D}_k$ (e.g., see Magnus and Neudecker [18, Ch. 3, Sec. 8]), it follows that

$$\mathcal{D}_k\left(\mathcal{D}_k^T\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathcal{D}_k\right)^{-1}\mathcal{D}_k^T=\frac{1}{2}\left(\mathbf{I}_{k^2}+\mathbf{K}_{k,k}\right)\left(\mathbf{\Sigma}\otimes\mathbf{\Sigma}\right). \tag{2.4}$$

In this case, the expression (1.3) with $\mathbf{L}=\mathcal{D}_k$ coincides with the expression (1.1).
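The identity (2.4) and the two defining properties of the duplication matrix are easy to confirm numerically. The following sketch builds $\mathcal{D}_k$ and $\mathbf{K}_{k,k}$ directly from their definitions for $k=4$ and a randomly generated positive definite $\mathbf{\Sigma}$. The construction, the random seed, and the function names are our own illustrative choices and are not taken from the paper or from Magnus and Neudecker [18].

```python
import numpy as np

def vec(A):
    """Stack the columns of A (column-major order)."""
    return A.reshape(-1, order="F")

def vech(A):
    """Stack the columns of the lower triangle: (a11, ..., ak1, a22, ..., akk)."""
    k = A.shape[0]
    return np.concatenate([A[j:, j] for j in range(k)])

def duplication_matrix(k):
    """D with D @ vech(A) == vec(A) for symmetric k x k matrices A."""
    D = np.zeros((k * k, k * (k + 1) // 2))
    col = 0
    for j in range(k):
        for i in range(j, k):
            D[j * k + i, col] = 1.0   # position of a_ij in vec(A)
            D[i * k + j, col] = 1.0   # position of a_ji in vec(A)
            col += 1
    return D

def commutation_matrix(k):
    """K with K @ vec(A) == vec(A.T) for k x k matrices A."""
    K = np.zeros((k * k, k * k))
    for i in range(k):
        for j in range(k):
            K[j * k + i, i * k + j] = 1.0
    return K

k = 4
rng = np.random.default_rng(1)
G = rng.standard_normal((k, k))
Sigma = G @ G.T + k * np.eye(k)       # random positive definite matrix

D = duplication_matrix(k)
K = commutation_matrix(k)
S_inv = np.linalg.inv(Sigma)

lhs = D @ np.linalg.inv(D.T @ np.kron(S_inv, S_inv) @ D) @ D.T
rhs = 0.5 * (np.eye(k * k) + K) @ np.kron(Sigma, Sigma)

print(np.allclose(lhs, rhs))
```

Multiplying both sides of (2.4) by $2\sigma_1$ and adding $\sigma_2\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^T$ shows directly how (1.3) with $\mathbf{L}=\mathcal{D}_k$ reduces to (1.1).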

3 Projections of estimators of radial type

A sequence $\{\mathbf{N}_n\}$ of $k\times k$ symmetric estimators for $\mathbf{\Sigma}$ is said to be asymptotically of radial type if there exists a sequence of real numbers $a_n$ increasing to infinity, such that $a_n(\mathbf{N}_n-\mathbf{\Sigma})\to\mathbf{N}$ in distribution with $\mathbf{N}$ being of radial type with respect to $\mathbf{\Sigma}$, see Tyler [26]. In a large class of multivariate statistical models, for estimators $\mathbf{V}_n$ of a linearly structured covariance matrix, it turns out that the limiting behavior of $\mathrm{vec}(\mathbf{V}_n)$ is the same as that of the projection $\Pi_L\mathrm{vec}(\mathbf{N}_n)$ of a random matrix $\mathbf{N}_n$ that is asymptotically of radial type with respect to $\mathbf{\Sigma}$, where $\Pi_L$ is defined in (2.2). We illustrate this behavior in the following linear model with a structured covariance.

Consider independent observations $\mathbf{s}_1,\ldots,\mathbf{s}_n\in\mathbb{R}^k\times\mathbb{R}^{kq}$ with distribution $P$, where $\mathbf{s}_i=(\mathbf{y}_i,\mathbf{X}_i)$, $i=1,\ldots,n$, for which we assume the following model

$$\mathbf{y}_i=\mathbf{X}_i\bm{\beta}+\mathbf{u}_i,\quad i=1,\ldots,n, \tag{3.1}$$

where $\mathbf{y}_i\in\mathbb{R}^k$, $\bm{\beta}\in\mathbb{R}^q$ is an unknown parameter vector, $\mathbf{X}_i\in\mathbb{R}^{k\times q}$ is a known design matrix, and $\mathbf{u}_i\in\mathbb{R}^k$ are unobservable independent mean zero random vectors with covariance matrix $\mathbf{V}\in\text{PDS}(k)$, the class of positive definite symmetric $k\times k$ matrices. Suppose that the distribution $P$ for random variable $\mathbf{s}=(\mathbf{y},\mathbf{X})$ is such that $\mathbf{y}\mid\mathbf{X}$ has an elliptically contoured density

$$f_{\bm{\mu},\mathbf{\Sigma}}(\mathbf{y})=|\mathbf{\Sigma}|^{-1/2}g\left((\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})\right),\tag{3.2}$$

where $\bm{\mu}=\mathbf{X}\bm{\beta}_0$ and $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta}_0)=\theta_{01}\mathbf{L}_1+\cdots+\theta_{0\ell}\mathbf{L}_\ell$, for some vector $\bm{\theta}_0\in\mathbb{R}^\ell$ of variance components. This setup includes several multivariate statistical models of interest. One possibility is the linear mixed effects model, in which the random effects together with the measurement error yield a specific covariance structure. Other covariance structures may arise, for example, if the $\mathbf{u}_i$ are the outcome of a time series. Note that this setup also allows models with an unstructured covariance matrix, such as the multivariate location-scale model or the multivariate regression model. See, e.g., Jennrich and Schluchter [10] or Fitzmaurice et al. [6] for different possible covariance structures, and Lopuhaä et al. [16], who provide a uniform treatment of S-estimators in these models.
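As a small illustration of a linear covariance structure, the following Python sketch builds $\mathbf{V}(\bm{\theta})$ for a compound-symmetry pattern, which is one structure a random-intercept mixed model could produce. The basis matrices, the value of $\bm{\theta}_0$, and the helper names `vec` and `linear_structure` are assumptions chosen for illustration and are not taken from the paper; the matrix `L` is assumed to have columns $\mathrm{vec}(\mathbf{L}_j)$.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into a single vector (column-major order)."""
    return A.reshape(-1, order="F")

def linear_structure(theta, basis):
    """Return V(theta) = theta_1 L_1 + ... + theta_l L_l."""
    return sum(t * Lj for t, Lj in zip(theta, basis))

k = 4
# Hypothetical basis: L_1 = I_k (measurement error), L_2 = 1 1^T (random intercept).
basis = [np.eye(k), np.ones((k, k))]
theta0 = np.array([1.5, 0.5])  # illustrative variance components

V = linear_structure(theta0, basis)

# Matrix whose columns are vec(L_j), so that vec(V(theta)) = L @ theta.
L = np.column_stack([vec(Lj) for Lj in basis])

# V must be positive definite symmetric to be a valid covariance matrix.
is_pds = bool(np.allclose(V, V.T) and np.all(np.linalg.eigvalsh(V) > 0))
```

Because the structure is linear in $\bm{\theta}$, the relation `vec(V) == L @ theta0` holds exactly, which is what makes the projection arguments later in this section possible.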

Estimators $\bm{\xi}_n=(\bm{\beta}_n,\bm{\theta}_n)$ for $\bm{\xi}_0=(\bm{\beta}_0,\bm{\theta}_0)$ are typically solutions of estimating equations of the following type

$$\int\Psi(\mathbf{s},\bm{\xi})\,\text{d}\mathbb{P}_n(\mathbf{s})=\mathbf{0},\tag{3.3}$$

where $\mathbb{P}_n$ denotes the empirical measure corresponding to $\mathbf{s}_1,\ldots,\mathbf{s}_n$, and where $\Psi=(\Psi_{\bm{\beta}},\Psi_{\bm{\theta}})$, with

$$\begin{split}\Psi_{\bm{\beta}}(\mathbf{s},\bm{\xi})&=w_1(d)\mathbf{X}^T\mathbf{V}^{-1}(\mathbf{y}-\mathbf{X}\bm{\beta})\\ \Psi_{\bm{\theta}}(\mathbf{s},\bm{\xi})&=\mathbf{L}^T(\mathbf{V}^{-1}\otimes\mathbf{V}^{-1})\mathrm{vec}\left\{w_2(d)(\mathbf{y}-\mathbf{X}\bm{\beta})(\mathbf{y}-\mathbf{X}\bm{\beta})^T-w_3(d)\mathbf{V}\right\},\end{split}\tag{3.4}$$

where $d^2=(\mathbf{y}-\mathbf{X}\bm{\beta})^T\mathbf{V}^{-1}(\mathbf{y}-\mathbf{X}\bm{\beta})$, and where we write $\mathbf{V}$ for $\mathbf{V}(\bm{\theta})$. We give some examples below. Furthermore, typically $\bm{\xi}_n$ will then converge to a solution of the corresponding population equation

$$\int\Psi(\mathbf{s},\bm{\xi})\,\text{d}P(\mathbf{s})=\mathbf{0}.\tag{3.5}$$
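To make the estimating function in (3.4) concrete, here is a minimal Python sketch that evaluates $\Psi_{\bm{\beta}}$ and $\Psi_{\bm{\theta}}$ at a single observation and averages them over a sample as in (3.3). The function names `psi` and `empirical_equation`, the basis matrices, and the simulated data are illustrative assumptions; `L` is assumed to have columns $\mathrm{vec}(\mathbf{L}_j)$, and the weight functions are passed as plain callables of $d$.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into a single vector (column-major order)."""
    return A.reshape(-1, order="F")

def psi(y, X, beta, theta, basis, w1, w2, w3):
    """Evaluate (Psi_beta, Psi_theta) of (3.4) at one observation s = (y, X)."""
    V = sum(t * Lj for t, Lj in zip(theta, basis))
    Vinv = np.linalg.inv(V)
    r = y - X @ beta                      # residual y - X beta
    d = np.sqrt(r @ Vinv @ r)             # Mahalanobis distance
    L = np.column_stack([vec(Lj) for Lj in basis])
    psi_beta = w1(d) * (X.T @ Vinv @ r)
    inner = w2(d) * np.outer(r, r) - w3(d) * V
    psi_theta = L.T @ np.kron(Vinv, Vinv) @ vec(inner)
    return psi_beta, psi_theta

def empirical_equation(ys, Xs, beta, theta, basis, w1, w2, w3):
    """Left-hand side of (3.3): the sample average of Psi."""
    vals = [np.concatenate(psi(y, X, beta, theta, basis, w1, w2, w3))
            for y, X in zip(ys, Xs)]
    return np.mean(vals, axis=0)

# Illustrative data (not from the paper).
rng = np.random.default_rng(0)
k, q = 3, 2
basis = [np.eye(k), np.ones((k, k))]
theta = np.array([1.0, 0.3])
beta = np.array([0.5, -1.0])
X = rng.normal(size=(k, q))
y = X @ beta + rng.normal(size=k)

one = lambda d: 1.0                       # constant weights w1 = w2 = w3 = 1
psi_beta, psi_theta = psi(y, X, beta, theta, basis, one, one, one)
```

With constant weights, the $j$-th entry of `psi_theta` reduces to $\mathbf{r}^T\mathbf{V}^{-1}\mathbf{L}_j\mathbf{V}^{-1}\mathbf{r}-\mathrm{tr}(\mathbf{V}^{-1}\mathbf{L}_j)$ with $\mathbf{r}=\mathbf{y}-\mathbf{X}\bm{\beta}$, by the identity $(\mathbf{B}^T\otimes\mathbf{A})\mathrm{vec}(\mathbf{C})=\mathrm{vec}(\mathbf{A}\mathbf{C}\mathbf{B})$. Forming the Kronecker product explicitly costs $O(k^4)$ memory, so for large $k$ one would probably compute $\mathbf{V}^{-1}\mathbf{C}\mathbf{V}^{-1}$ directly instead.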

Let $\mathbf{V}_n=\mathbf{V}(\bm{\theta}_n)$. From the estimating equations (3.3) for $\bm{\xi}_n$, we will establish that $\mathrm{vec}(\mathbf{V}_n)$ is asymptotically equivalent to $\Pi_L\mathrm{vec}(\mathbf{N}_n)$, for some $\mathbf{N}_n$ that is asymptotically of radial type and $\Pi_L$ defined in (2.2). To this end, we require the following conditions:

  • (C1) $w_i(s)$ is of bounded variation and continuously differentiable, for $i=1,2,3$;

  • (C2) $w_1'(s)s^2$, $w_2'(s)s^3$, and $w_3'(s)s^2$ are bounded;

  • (C3) $\mathbb{E}_{\mathbf{0},\mathbf{I}_k}\Big[w_2'(\|\mathbf{z}\|)\|\mathbf{z}\|^3+k(k+2)w_3(\|\mathbf{z}\|)\Big]\neq 0$ and $\mathbb{E}_{\mathbf{0},\mathbf{I}_k}\Big[w_2'(\|\mathbf{z}\|)\|\mathbf{z}\|^3+2kw_3(\|\mathbf{z}\|)-kw_3'(\|\mathbf{z}\|)\|\mathbf{z}\|\Big]\neq 0$,

where $\mathbb{E}_{\mathbf{0},\mathbf{I}_k}$ denotes the expectation with respect to density (3.2) with parameters $(\bm{\mu},\mathbf{\Sigma})=(\mathbf{0},\mathbf{I}_k)$. Condition (C3) ensures the existence of the scalars $\sigma_1$ and $\sigma_2$ in Theorem 2. Maronna [21] and Tyler [26] consider M-estimators for multivariate location and covariance. Estimating equations for these estimators would correspond to $\Psi_{\bm{\theta}}$ without the factor $\mathbf{L}^T(\mathbf{V}^{-1}\otimes\mathbf{V}^{-1})$ (see Example 2 below) and $w_3=1$. Moreover, they assume that $w_2'$ is non-negative, which obviously implies (C3).

Theorem 2.

Let $P$ be a distribution for random variable $\mathbf{s}=(\mathbf{y},\mathbf{X})$, such that $\mathbf{y}\mid\mathbf{X}$ has an elliptically contoured density (3.2), with parameters $\bm{\mu}=\mathbf{X}\bm{\beta}_0$ and $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta}_0)$, for a linear covariance structure $\mathbf{V}$. Let $\bm{\xi}_n$ and $\bm{\xi}_0$ be solutions of (3.3) and (3.5), respectively, and suppose that $\bm{\xi}_n\to\bm{\xi}_0$ in probability. Suppose that $\mathbb{E}\|\mathbf{s}\|^4<\infty$ and that $\mathbf{X}$ has full rank with probability one. If $w_1$, $w_2$, and $w_3$ satisfy (C1)-(C3), then there exists a sequence $\{\mathbf{N}_n\}$ of random matrices, such that

$$\sqrt{n}\left\{\mathrm{vec}(\mathbf{V}_n)-\mathrm{vec}(\mathbf{\Sigma})\right\}=-\Pi_L\mathrm{vec}\left\{\sqrt{n}(\mathbf{N}_n-\mathbb{E}[\mathbf{N}_n])\right\}+o_P(1),$$

where $\Pi_L$ is defined in (2.2). Moreover, $\sqrt{n}(\mathbf{N}_n-\mathbb{E}[\mathbf{N}_n])\to\mathbf{N}$ in distribution, where $\mathbf{N}$ is a random matrix that has a multivariate normal distribution with mean zero and variance (1.1), with

$$\begin{split}\sigma_1&=\frac{k(k+2)\mathbb{E}_{\mathbf{0},\mathbf{I}_k}\left[w_2(\|\mathbf{z}\|)^2\|\mathbf{z}\|^4\right]}{\Big(\mathbb{E}_{\mathbf{0},\mathbf{I}_k}\Big[w_2'(\|\mathbf{z}\|)\|\mathbf{z}\|^3+k(k+2)w_3(\|\mathbf{z}\|)\Big]\Big)^2}\\ \sigma_2&=-\frac{2}{k}\sigma_1+\frac{4\mathbb{E}_{\mathbf{0},\mathbf{I}_k}\left[\Big(w_2(\|\mathbf{z}\|)\|\mathbf{z}\|^2-kw_3(\|\mathbf{z}\|)\Big)^2\right]}{\left(\mathbb{E}_{\mathbf{0},\mathbf{I}_k}\Big[w_2'(\|\mathbf{z}\|)\|\mathbf{z}\|^3+2kw_3(\|\mathbf{z}\|)-kw_3'(\|\mathbf{z}\|)\|\mathbf{z}\|\Big]\right)^2}.\end{split}$$

Remark 3.1.

From the proof of Theorem 2 one can obtain the following explicit expression for $\mathbf{N}_n$:

$$\mathbf{N}_n=\frac{1}{n}\sum_{i=1}^n\left\{v_1(d_i)(\mathbf{y}_i-\mathbf{X}_i\bm{\beta}_0)(\mathbf{y}_i-\mathbf{X}_i\bm{\beta}_0)^T-v_2(d_i)\mathbf{\Sigma}\right\},$$

where $d_i^2=(\mathbf{y}_i-\mathbf{X}_i\bm{\beta}_0)^T\mathbf{\Sigma}^{-1}(\mathbf{y}_i-\mathbf{X}_i\bm{\beta}_0)$, and

$$\begin{split}v_1(s)&=\frac{w_2(s)}{\gamma_1};\\ v_2(s)&=\frac{-\gamma_2w_2(s)s^2+\gamma_1w_3(s)}{\gamma_1(\gamma_1-k\gamma_2)},\end{split}$$

where $\gamma_1$ and $\gamma_2$ are defined in (A.6). Note that $\gamma_1$ and $\gamma_1-k\gamma_2$ are precisely the quantities that appear in condition (C3).

The random matrix $\mathbf{N}$ in Theorem 2 is of radial type with respect to $\mathbf{\Sigma}$. This follows from the fact that $\mathbf{R}=\mathbf{\Sigma}^{-1/2}\mathbf{N}\mathbf{\Sigma}^{-1/2}$ is multivariate normal with mean zero and variance

$$\begin{split}\text{var}\{\text{vec}(\mathbf{R})\}&=(\mathbf{\Sigma}^{-1/2}\otimes\mathbf{\Sigma}^{-1/2})\,\text{var}\left\{\text{vec}(\mathbf{N})\right\}(\mathbf{\Sigma}^{-1/2}\otimes\mathbf{\Sigma}^{-1/2})\\ &=\sigma_1(\mathbf{I}_{k^2}+\mathbf{K}_{k,k})+\sigma_2\text{vec}(\mathbf{I}_k)\text{vec}(\mathbf{I}_k)^T.\end{split}$$

This immediately gives that for any orthogonal matrix $\mathbf{O}$, the matrix $\mathbf{O}\mathbf{R}\mathbf{O}^T$ is multivariate normal with mean zero and the same variance. From Theorem 2 it follows that $\sqrt{n}\left\{\mathrm{vec}(\mathbf{V}_n)-\mathrm{vec}(\mathbf{\Sigma})\right\}$ is asymptotically normal with mean zero and a variance that is the same as the variance of $\mathrm{vec}(\mathbf{M})=\Pi_L\mathrm{vec}(\mathbf{N})$. According to Theorem 1 this variance is of the type given by (1.3). Furthermore, if we write $\mathrm{vec}(\mathbf{M})=\mathbf{L}\mathbf{T}$, then

$$\sqrt{n}(\bm{\theta}_n-\bm{\theta}_0)=(\mathbf{L}^T\mathbf{L})^{-1}\mathbf{L}^T\sqrt{n}\left\{\mathrm{vec}(\mathbf{V}_n)-\mathrm{vec}(\mathbf{\Sigma})\right\}\to\mathbf{T},$$

in distribution, where $\mathbf{T}$ is multivariate normal with mean zero and variance given by (1.2).
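Two algebraic facts used in the preceding paragraph can be checked numerically: a variance of the form $\sigma_1(\mathbf{I}_{k^2}+\mathbf{K}_{k,k})+\sigma_2\mathrm{vec}(\mathbf{I}_k)\mathrm{vec}(\mathbf{I}_k)^T$ is unchanged under $\mathbf{R}\mapsto\mathbf{O}\mathbf{R}\mathbf{O}^T$, and $(\mathbf{L}^T\mathbf{L})^{-1}\mathbf{L}^T$ recovers $\bm{\theta}$ from $\mathrm{vec}(\mathbf{V}(\bm{\theta}))$. The Python sketch below assumes that $\mathbf{K}_{k,k}$ is the commutation matrix, that $\mathbf{L}$ has columns $\mathrm{vec}(\mathbf{L}_j)$, and uses arbitrary illustrative values for $\sigma_1$, $\sigma_2$, $\bm{\theta}$, and the basis matrices; none of these choices come from the paper.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into a single vector (column-major order)."""
    return A.reshape(-1, order="F")

def commutation_matrix(k):
    """K_{k,k}, defined by K vec(A) = vec(A^T) for k x k matrices A."""
    K = np.zeros((k * k, k * k))
    for i in range(k):
        for j in range(k):
            # vec(A)[i + j*k] = A[i, j] and vec(A^T)[j + i*k] = A[i, j].
            K[j + i * k, i + j * k] = 1.0
    return K

def radial_variance(k, sigma1, sigma2):
    """sigma1 (I + K) + sigma2 vec(I) vec(I)^T."""
    v = vec(np.eye(k))
    return sigma1 * (np.eye(k * k) + commutation_matrix(k)) + sigma2 * np.outer(v, v)

k = 3
rng = np.random.default_rng(1)

# (i) Orthogonal invariance of the radial-type variance.
O, _ = np.linalg.qr(rng.normal(size=(k, k)))     # a random orthogonal matrix
W = radial_variance(k, sigma1=1.3, sigma2=-0.2)  # arbitrary illustrative scalars
OO = np.kron(O, O)
# vec(O R O^T) = (O kron O) vec(R), so its variance is OO @ W @ OO.T.
invariant = bool(np.allclose(OO @ W @ OO.T, W))

# (ii) Recovering theta from vec(V(theta)) by least squares.
basis = [np.eye(k), np.ones((k, k))]             # hypothetical L_1, L_2
L = np.column_stack([vec(Lj) for Lj in basis])
theta = np.array([2.0, 0.7])
V = theta[0] * basis[0] + theta[1] * basis[1]
theta_back = np.linalg.solve(L.T @ L, L.T @ vec(V))
```

Step (ii) requires $\mathbf{L}$ to have full column rank, which holds here because the two basis matrices are linearly independent.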

3.1 Examples

We discuss some examples of multivariate statistical models that are covered by the setup in (3.1), in which the estimators $(\bm{\beta}_n,\bm{\theta}_n)$ are solutions of estimating equation (3.3) for particular functions $w_1$, $w_2$, and $w_3$. In the Appendix we provide a detailed derivation of $\sigma_1$ and $\sigma_2$ for specific special cases and show that their expressions coincide with the ones in Tyler [26] and Lopuhaä et al. [15].

Example 1 (Maximum likelihood for multivariate normal).

Suppose that $(\mathbf{y}_1,\mathbf{X}_1),\ldots,(\mathbf{y}_n,\mathbf{X}_n)$ are independent, such that $\mathbf{y}_i\mid\mathbf{X}_i\sim N_k(\mathbf{X}_i\bm{\beta}_0,\mathbf{V}(\bm{\theta}_0))$. The loglikelihood is then given by

$$\mathcal{L}=-\frac{nk}{2}\log(2\pi)-\frac{n}{2}\log|\mathbf{V}(\bm{\theta})|-\frac{1}{2}\sum_{i=1}^n(\mathbf{y}_i-\mathbf{X}_i\bm{\beta})^T\mathbf{V}(\bm{\theta})^{-1}(\mathbf{y}_i-\mathbf{X}_i\bm{\beta}).$$

Setting the partial derivatives $\partial\mathcal{L}/\partial\bm{\beta}$ and $\partial\mathcal{L}/\partial\theta_j$ equal to zero gives the following estimating equations

$$\begin{split}\frac{1}{n}\sum_{i=1}^n\mathbf{X}_i^T\mathbf{V}^{-1}(\mathbf{y}_i-\mathbf{X}_i\bm{\beta})&=\mathbf{0},\\ \frac{1}{n}\sum_{i=1}^n\left\{(\mathbf{y}_i-\mathbf{X}_i\bm{\beta})^T\mathbf{V}^{-1}\mathbf{L}_j\mathbf{V}^{-1}(\mathbf{y}_i-\mathbf{X}_i\bm{\beta})-\text{tr}(\mathbf{V}^{-1}\mathbf{L}_j)\right\}&=0,\end{split}\tag{3.6}$$

for $j=1,\ldots,\ell$, where we write $\mathbf{V}$ for $\mathbf{V}(\bm{\theta})$. By using the vec-notation and $\mathbf{L}$ as defined in (2.1), we can combine the partial derivatives with respect to $\theta_j$ in the second line of (3.6) as follows

$$\mathbf{L}^T(\mathbf{V}^{-1}\otimes\mathbf{V}^{-1})\mathrm{vec}\left\{\frac{1}{n}\sum_{i=1}^n(\mathbf{y}_i-\mathbf{X}_i\bm{\beta})(\mathbf{y}_i-\mathbf{X}_i\bm{\beta})^T-\mathbf{V}\right\}=\mathbf{0}.\tag{3.7}$$

It follows that the maximum likelihood estimator $(\bm{\beta}_n,\bm{\theta}_n)$ satisfies (3.3) and $(\bm{\beta}_0,\bm{\theta}_0)$ satisfies (3.5), where $\Psi$ is defined in (3.4) with $w_1(s)=w_2(s)=w_3(s)=1$. Theorem 2 applies and one finds $\sigma_1=1$ and $\sigma_2=0$.
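The values $\sigma_1=1$ and $\sigma_2=0$ can be reproduced numerically from the expressions in Theorem 2. The Python sketch below assumes the Gaussian case, in which $\|\mathbf{z}\|$ follows a chi distribution with $k$ degrees of freedom, and evaluates the expectations by simple trapezoidal quadrature. The function names `chi_expectation` and `sigma_scalars`, the integration range, and the grid size are illustrative assumptions; the weight functions and their derivatives are supplied by the caller as vectorized callables.

```python
import math
import numpy as np

def chi_expectation(h, k, upper=40.0, num=200001):
    """Approximate E[h(||z||)] for z ~ N(0, I_k) by trapezoidal quadrature."""
    r = np.linspace(0.0, upper, num)
    # Density of the chi distribution with k degrees of freedom.
    norm = 2.0 ** (k / 2 - 1) * math.gamma(k / 2)
    dens = r ** (k - 1) * np.exp(-r ** 2 / 2) / norm
    f = h(r) * dens
    return float(np.sum((f[1:] + f[:-1]) / 2 * np.diff(r)))

def sigma_scalars(k, w2, dw2, w3, dw3):
    """Evaluate sigma_1 and sigma_2 of Theorem 2 under the Gaussian model."""
    E = lambda h: chi_expectation(h, k)
    den1 = E(lambda r: dw2(r) * r ** 3 + k * (k + 2) * w3(r))
    den2 = E(lambda r: dw2(r) * r ** 3 + 2 * k * w3(r) - k * dw3(r) * r)
    sigma1 = k * (k + 2) * E(lambda r: w2(r) ** 2 * r ** 4) / den1 ** 2
    num2 = E(lambda r: (w2(r) * r ** 2 - k * w3(r)) ** 2)
    sigma2 = -2.0 / k * sigma1 + 4.0 * num2 / den2 ** 2
    return sigma1, sigma2

# Maximum likelihood weights: w2 = w3 = 1 with zero derivatives.
ones = lambda r: np.ones_like(r)
zeros = lambda r: np.zeros_like(r)
sigma1_ml, sigma2_ml = sigma_scalars(3, ones, zeros, ones, zeros)
```

The result agrees with a direct calculation: for Gaussian $\mathbf{z}$ one has $\mathbb{E}\|\mathbf{z}\|^4=k(k+2)$ and $\mathrm{var}(\|\mathbf{z}\|^2)=2k$, so that $\sigma_1=1$ and $\sigma_2=-2/k+4\cdot 2k/(2k)^2=0$. Other weight functions can be passed in the same way, provided they satisfy (C1)-(C3).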

When each $\mathbf{X}_{i}=\mathbf{I}_{k}$, for $i=1,\ldots,n$, then the model (3.1) reduces to the multivariate location-scale model. If $\mathbf{\Sigma}$ is unstructured, then $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta}_{0})$, with $\bm{\theta}_{0}=\mathrm{vech}(\mathbf{\Sigma})$ and $\mathbf{L}=\partial\mathrm{vec}(\mathbf{V}(\bm{\theta}_{0}))/\partial\bm{\theta}^{T}$ is equal to the duplication matrix $\mathcal{D}_{k}$. In this case, we can remove the factor $\mathbf{L}^{T}(\mathbf{V}^{-1}\otimes\mathbf{V}^{-1})$ from (3.7), and $\mathbf{V}_{n}$ is simply the sample covariance of $\mathbf{y}_{1},\ldots,\mathbf{y}_{n}$. This example then coincides with Example 1 in Tyler [26].

Example 2 (M-estimators).

As mentioned in Example 1, when each $\mathbf{X}_{i}=\mathbf{I}_{k}$, for $i=1,\ldots,n$, and $\mathbf{\Sigma}$ is unstructured, then the model (3.1) reduces to the multivariate location-scale model and we can remove the factor $\mathbf{L}^{T}(\mathbf{V}^{-1}\otimes\mathbf{V}^{-1})$ from $\Psi_{\bm{\theta}}$ in (3.3). In that case, estimating equations (3.3) are equivalent to equations (1.1)-(1.2) in Maronna [21] or equations (4.11)-(4.12) in Huber [9] for M-estimators of multivariate location and covariance. In view of this, solutions $(\bm{\beta}_{n},\bm{\theta}_{n})$ of estimating equations (3.3) are called M-estimators for $(\bm{\beta}_{0},\bm{\theta}_{0})$. The expressions for $\sigma_{1}$ and $\sigma_{2}$ in Theorem 2 then coincide with the ones in Example 3 in Tyler [26].

As a special case, this includes the estimating equations that correspond to maximum likelihood estimators based on independent observations $(\mathbf{y}_{1},\mathbf{X}_{1}),\ldots,(\mathbf{y}_{n},\mathbf{X}_{n})$ from an elliptical density (3.2). The maximum likelihood estimators $(\bm{\beta}_{n},\bm{\theta}_{n})$ then satisfy estimating equations (3.3), for $w_{1}(s)=w_{2}(s)=-2g^{\prime}(s^{2})/g(s^{2})$ and $w_{3}(s)=1$. The expressions for $\sigma_{1}$ and $\sigma_{2}$ in Theorem 2 then coincide with the ones in Example 2 in Tyler [26].

Example 3 (S-estimators).

S-estimators for $(\bm{\beta}_{0},\bm{\theta}_{0})$ are defined by means of a function $\rho:\mathbb{R}\to[0,\infty)$, as the solution to minimizing $|\mathbf{V}(\bm{\theta})|$, subject to

$$\frac{1}{n}\sum_{i=1}^{n}\rho\left(\sqrt{(\mathbf{y}_{i}-\mathbf{X}_{i}\bm{\beta})^{T}\mathbf{V}(\bm{\theta})^{-1}(\mathbf{y}_{i}-\mathbf{X}_{i}\bm{\beta})}\right)\leq b_{0},$$

where the minimum is taken over all $\bm{\beta}\in\mathbb{R}^{q}$ and $\bm{\theta}\in\mathbb{R}^{\ell}$, such that $\mathbf{V}(\bm{\theta})\in\mathrm{PDS}(k)$. These estimators have been studied for linear mixed effects models in Copt and Victoria-Feser [4], Chervoneva and Vishnyakov [1, 2] and for general linear models with a structured covariance in Lopuhaä et al [15]. According to Section 7.2 in [15], S-estimators $(\bm{\beta}_{n},\bm{\theta}_{n})$ satisfy estimating equations (3.3), with $w_{1}(d)=\rho^{\prime}(d)/d$, $w_{2}(s)=k\rho^{\prime}(s)/s$ and $w_{3}(s)=\rho^{\prime}(s)s-\rho(s)+b_{0}$. The expressions for $\sigma_{1}$ and $\sigma_{2}$ in Theorem 2 coincide with the ones in Corollary 9.2 in Lopuhaä et al [15].
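To give a concrete feel for the three weight functions, the following sketch evaluates $w_{1}$, $w_{2}$ and $w_{3}$ for Tukey's biweight. The choice of the biweight, the cutoff `c`, and the values of `b0` and `k` are illustrative assumptions; the text does not fix a particular $\rho$.

```python
import numpy as np

def rho_biweight(s, c):
    """Tukey's biweight rho (illustrative choice): constant c^2/6 beyond the cutoff c."""
    s = np.minimum(np.abs(np.asarray(s, dtype=float)), c)
    return s**2 / 2 - s**4 / (2 * c**2) + s**6 / (6 * c**4)

def rho_prime_over_s(s, c):
    """rho'(s)/s = (1 - (s/c)^2)^2 for |s| <= c and 0 otherwise (well defined at s = 0)."""
    s = np.asarray(s, dtype=float)
    return np.where(np.abs(s) <= c, (1 - (s / c) ** 2) ** 2, 0.0)

def s_weights(s, c, b0, k):
    """Weights w1(s) = rho'(s)/s, w2(s) = k rho'(s)/s, w3(s) = rho'(s) s - rho(s) + b0."""
    s = np.asarray(s, dtype=float)
    u = rho_prime_over_s(s, c)
    w1 = u
    w2 = k * u
    w3 = u * s**2 - rho_biweight(s, c) + b0
    return w1, w2, w3

# Hypothetical values, chosen only for illustration.
w1, w2, w3 = s_weights(np.array([0.0, 1.0, 5.0]), c=3.0, b0=0.5, k=2)
```

With a bounded $\rho$ of this kind, $w_{1}$ and $w_{2}$ vanish beyond the cutoff, so observations with a large Mahalanobis-type distance receive zero weight in the corresponding estimating equations.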

4 Homogeneous mappings of order zero

Let $H(\mathbf{v})$ be a mapping from $\mathbb{R}^{l}$ to $\mathbb{R}^{m}$ that is homogeneous of order zero, that is

$$H(\mathbf{v})=H(\alpha\mathbf{v}),\text{ for all }\alpha>0. \tag{4.1}$$

These mappings have several applications to affine equivariant covariance estimators that have limiting variance (1.1). Tyler [27] uses such a mapping to show that the likelihood ratio criterion is asymptotically robust over the class of elliptical distributions. Kent and Tyler [11] consider the shape component of covariance CM-estimators and show that the limiting variance of CM-estimators of shape depends on $\sigma_{1}$ only, which may then serve as an index for the asymptotic relative efficiency. Salibián-Barrera et al [24] derive the influence function of the shape component of covariance MM-functionals and use this to obtain that the limiting variance of MM-estimators of shape only depends on a single scalar. This property of the shape component is a special case of a general result in Tyler [27] for multivariate functionals of affine equivariant covariance estimators that are asymptotically normal with limiting variance (1.1).

Estimators for a structured covariance matrix are typically not affine equivariant and have limiting variance (1.3) instead of (1.1), so that the previous results do not directly apply. The objective of this section is to extend Theorem 1 in Tyler [27] to estimators for a linearly structured covariance, and discuss its consequences for corresponding estimators of shape and scale. Moreover, we establish a similar result for estimators of the vector of variance components and apply this to its normalized version. We then have the following theorem.

Theorem 3.

Consider $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta}_{0})\in\mathrm{PDS}(k)$, for some vector $\bm{\theta}_{0}\in\mathbb{R}^{\ell}$ and linear variance structure $\mathbf{V}$. Let $\{\mathbf{V}_{n}:n\geq 1\}$ be a sequence of estimators for $\mathbf{\Sigma}$ and let $\{\bm{\theta}_{n}:n\geq 1\}$ be a sequence of estimators for the vector $\bm{\theta}_{0}\in\mathbb{R}^{\ell}$ of variance components.

  • (i) For $\mathbf{V}\in\mathrm{PDS}(k)$, let $H(\mathbf{V})$ be continuously differentiable satisfying (4.1). When $\sqrt{n}(\mathbf{V}_{n}-\mathbf{\Sigma})$ converges in distribution to a random matrix $\mathbf{M}$, that has a multivariate normal distribution with mean zero and variance given by (1.3), then $\sqrt{n}(H(\mathbf{V}_{n})-H(\mathbf{\Sigma}))$ is asymptotically normal with mean zero and variance

    $$2\sigma_{1}H^{\prime}(\mathbf{\Sigma})\mathbf{L}\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}\mathbf{L}^{T}H^{\prime}(\mathbf{\Sigma})^{T}.$$

  • (ii) When $\sqrt{n}(\bm{\theta}_{n}-\bm{\theta}_{0})$ is asymptotically normal with mean zero and variance (1.2), then for any mapping $H(\bm{\theta})$ that satisfies (4.1), it holds that $\sqrt{n}(H(\bm{\theta}_{n})-H(\bm{\theta}_{0}))$ is asymptotically normal with mean zero and variance

    $$2\sigma_{1}H^{\prime}(\bm{\theta}_{0})\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}H^{\prime}(\bm{\theta}_{0})^{T}.$$
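A short outline of why only $\sigma_{1}$ survives may be helpful. It assumes that the limiting variances (1.3) and (1.2), which are not restated in this section, consist of a $\sigma_{1}$-term plus a rank-one $\sigma_{2}$-term as written in the comments below; this form is consistent with the scale variance obtained in Example 4. The shorthand $\mathbf{\Gamma}$ is introduced here only for this outline, which is a sketch under that assumption and not the proof.

```latex
% Assumed form of (1.3):  2\sigma_1 \Gamma + \sigma_2 vec(\Sigma) vec(\Sigma)^T,
% assumed form of (1.2):  2\sigma_1 (L^T(\Sigma^{-1}\otimes\Sigma^{-1})L)^{-1} + \sigma_2 \theta_0\theta_0^T,
% with the shorthand \Gamma = L (L^T(\Sigma^{-1}\otimes\Sigma^{-1})L)^{-1} L^T.
\begin{align*}
0 &= \frac{\mathrm{d}}{\mathrm{d}\alpha} H(\alpha\mathbf{v})\Big|_{\alpha=1}
   = H'(\mathbf{v})\,\mathbf{v}
   && \text{(differentiate (4.1) with respect to } \alpha\text{)} \\
H'(\mathbf{\Sigma})\big\{2\sigma_1\mathbf{\Gamma}
   + \sigma_2\,\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^T\big\}H'(\mathbf{\Sigma})^T
  &= 2\sigma_1 H'(\mathbf{\Sigma})\,\mathbf{\Gamma}\,H'(\mathbf{\Sigma})^T
   && \text{(delta method, } H'(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})=\mathbf{0}\text{)} \\
H'(\bm{\theta}_0)\big\{2\sigma_1(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L})^{-1}
   + \sigma_2\,\bm{\theta}_0\bm{\theta}_0^T\big\}H'(\bm{\theta}_0)^T
  &= 2\sigma_1 H'(\bm{\theta}_0)(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L})^{-1}H'(\bm{\theta}_0)^T
   && \text{(same argument, } H'(\bm{\theta}_0)\bm{\theta}_0=\mathbf{0}\text{)}
\end{align*}
```

In words, a mapping that is homogeneous of order zero is flat in the radial direction, so the rank-one term carrying $\sigma_{2}$ is annihilated by $H^{\prime}$.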

When $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta}_{0})$ is unstructured, then $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}_{0}$, with $\bm{\theta}_{0}=\mathrm{vech}(\mathbf{\Sigma})$, as defined in (2.3), and $\mathbf{L}$ is the duplication matrix $\mathcal{D}_{k}$. Because $\mathbf{K}_{k,k}H^{\prime}(\mathbf{V})^{T}=H^{\prime}(\mathbf{V})^{T}$, for symmetric $\mathbf{V}$, from (2.4) it follows that Theorem 3(i) with $\mathbf{L}=\mathcal{D}_{k}$ recovers Theorem 1 in Tyler [27].

From Theorem 3 it follows immediately that the asymptotic relative efficiency of different estimators $H(\mathbf{V}_{n})$ for $H(\mathbf{\Sigma})$ can be compared by simply comparing the values of the corresponding scalar $\sigma_{1}$. Similarly, the scalar $\sigma_{1}$ can also be used as an index for the asymptotic relative efficiency of different estimators $H(\bm{\theta}_{n})$ for $H(\bm{\theta}_{0})$. We discuss some examples below.

Example 4 (Shape and scale of a structured covariance).

Suppose that $\sqrt{n}(\mathbf{V}_{n}-\mathbf{\Sigma})$ is asymptotically normal with mean zero and variance given by (1.3). Consider the shape component $H(\mathbf{C})=\mathrm{vec}(\mathbf{C})/|\mathbf{C}|^{1/k}$, where $\mathbf{C}\in\mathrm{PDS}(k)$. We have that

$$H^{\prime}(\mathbf{C})=\frac{\partial H(\mathbf{C})}{\partial\mathrm{vec}(\mathbf{C})^{T}}=-\frac{1}{k}|\mathbf{C}|^{-1/k}\mathrm{vec}(\mathbf{C})\mathrm{vec}(\mathbf{C}^{-1})^{T}+|\mathbf{C}|^{-1/k}\mathbf{I}_{k^{2}}. \tag{4.2}$$
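The derivative (4.2) can be checked numerically. The sketch below compares the closed form with a central finite-difference Jacobian at a randomly generated positive definite matrix and also evaluates $H^{\prime}(\mathbf{C})\mathrm{vec}(\mathbf{C})$, which should vanish because $H$ is homogeneous of order zero. The test matrix, the step size `eps` and all function names are illustrative assumptions.

```python
import numpy as np

def vec(M):
    """Column-major vectorization, as in vec(C)."""
    return M.reshape(-1, order="F")

def shape(v, k):
    """Shape component H(C) = vec(C) / |C|^{1/k}, as a function of v = vec(C)."""
    C = v.reshape(k, k, order="F")
    return v / np.linalg.det(C) ** (1.0 / k)

def shape_jacobian(C):
    """Closed form (4.2), evaluated at a symmetric positive definite C."""
    k = C.shape[0]
    d = np.linalg.det(C) ** (-1.0 / k)
    return d * (np.eye(k * k) - np.outer(vec(C), vec(np.linalg.inv(C))) / k)

def numerical_jacobian(f, v, eps=1e-6):
    """Central finite differences, one coordinate of v at a time."""
    J = np.empty((f(v).size, v.size))
    for j in range(v.size):
        e = np.zeros_like(v)
        e[j] = eps
        J[:, j] = (f(v + e) - f(v - e)) / (2 * eps)
    return J

rng = np.random.default_rng(0)
k = 3
A = rng.standard_normal((k, k))
C = A @ A.T + k * np.eye(k)  # an arbitrary positive definite test matrix

J_formula = shape_jacobian(C)
J_numeric = numerical_jacobian(lambda v: shape(v, k), vec(C))
max_abs_diff = np.max(np.abs(J_formula - J_numeric))
euler_residual = np.max(np.abs(J_formula @ vec(C)))  # H'(C) vec(C), expected near 0
```

Both `max_abs_diff` and `euler_residual` should be close to zero up to discretization and rounding error.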

Then, according to Theorem 3(i), for the shape component it follows that $\sqrt{n}(H(\mathbf{V}_{n})-H(\mathbf{\Sigma}))$ is asymptotically normal with mean zero and variance (see Appendix for details)

$$\frac{2\sigma_{1}}{|\mathbf{\Sigma}|^{2/k}}\left\{\mathbf{L}\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}\mathbf{L}^{T}-\frac{1}{k}\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^{T}\right\}. \tag{4.3}$$

When $\mathbf{\Sigma}$ is unstructured, then $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}_{0}$, with $\bm{\theta}_{0}=\mathrm{vech}(\mathbf{\Sigma})$ and $\mathbf{L}$ is the duplication matrix $\mathcal{D}_{k}$. In that case, from (2.4) it follows that (4.3) with $\mathbf{L}=\mathcal{D}_{k}$ reduces to

$$\frac{\sigma_{1}}{|\mathbf{\Sigma}|^{2/k}}\left\{\left(\mathbf{I}_{k^{2}}+\mathbf{K}_{k,k}\right)(\mathbf{\Sigma}\otimes\mathbf{\Sigma})-\frac{2}{k}\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^{T}\right\}.$$
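The reduction from (4.3) to the unstructured expression can be verified numerically. The sketch below builds the duplication matrix $\mathcal{D}_{k}$ and the commutation matrix $\mathbf{K}_{k,k}$ explicitly and compares both sides for a random positive definite matrix. The helper names, the test matrix and the value of `sigma1` are assumptions made for illustration.

```python
import numpy as np

def duplication_matrix(k):
    """D_k with D_k @ vech(A) == vec(A) for symmetric A (column-major vec)."""
    pairs = [(i, j) for j in range(k) for i in range(j, k)]  # vech ordering
    D = np.zeros((k * k, len(pairs)))
    for col, (i, j) in enumerate(pairs):
        D[j * k + i, col] = 1.0  # entry (i, j) of A inside vec(A)
        D[i * k + j, col] = 1.0  # its mirror entry (j, i)
    return D

def commutation_matrix(k):
    """K_{k,k} with K @ vec(A) == vec(A.T)."""
    K = np.zeros((k * k, k * k))
    for i in range(k):
        for j in range(k):
            K[j * k + i, i * k + j] = 1.0
    return K

k, sigma1 = 3, 1.0  # sigma1 is an arbitrary illustrative value
rng = np.random.default_rng(1)
B = rng.standard_normal((k, k))
Sigma = B @ B.T + k * np.eye(k)  # an arbitrary positive definite test matrix
Sinv = np.linalg.inv(Sigma)
v = Sigma.reshape(-1, order="F")
scale = sigma1 / np.linalg.det(Sigma) ** (2.0 / k)

D = duplication_matrix(k)
G = D @ np.linalg.inv(D.T @ np.kron(Sinv, Sinv) @ D) @ D.T

structured = 2 * scale * (G - np.outer(v, v) / k)  # (4.3) with L = D_k
unstructured = scale * (
    (np.eye(k * k) + commutation_matrix(k)) @ np.kron(Sigma, Sigma)
    - 2 * np.outer(v, v) / k
)
```

The two arrays `structured` and `unstructured` should agree up to rounding error.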

This coincides with expression (9) found in [24]. For completeness, consider the scale component $\sigma(\mathbf{C})=|\mathbf{C}|^{1/(2k)}$. It can be seen that

$$\sigma^{\prime}(\mathbf{C})=\frac{1}{2k}|\mathbf{C}|^{1/(2k)}\mathrm{vec}(\mathbf{C}^{-1})^{T}. \tag{4.4}$$

Application of the delta method then yields that $\sqrt{n}(\sigma(\mathbf{V}_{n})-\sigma(\mathbf{\Sigma}))$ is asymptotically normal with mean zero and variance

$$\frac{1}{4}\left(\frac{2\sigma_{1}}{k}+\sigma_{2}\right)|\mathbf{\Sigma}|^{1/k}.$$
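The delta-method step for the scale component can be spelled out as follows. The outline assumes, as the comment states, that (1.3) is the sum of a $\sigma_{1}$-term and a rank-one $\sigma_{2}$-term; (1.3) is not restated in this section, and the shorthands $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{\Gamma}$ are introduced only for this sketch.

```latex
% Assumed form of (1.3):  2\sigma_1 \Gamma + \sigma_2 v v^T,  with
%   \Gamma = L (L^T(\Sigma^{-1}\otimes\Sigma^{-1})L)^{-1} L^T,
%   v = vec(\Sigma) = L\theta_0,   u = vec(\Sigma^{-1}) = (\Sigma^{-1}\otimes\Sigma^{-1}) v.
\begin{align*}
\mathbf{\Gamma}\mathbf{u}
  &= \mathbf{L}\big(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\big)^{-1}
     \mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\bm{\theta}_0
   = \mathbf{L}\bm{\theta}_0 = \mathbf{v}, \\
\mathbf{u}^T\mathbf{\Gamma}\mathbf{u}
  &= \mathbf{u}^T\mathbf{v} = \mathrm{tr}(\mathbf{\Sigma}^{-1}\mathbf{\Sigma}) = k,
  \qquad (\mathbf{u}^T\mathbf{v})^2 = k^2, \\
\sigma'(\mathbf{\Sigma})\big\{2\sigma_1\mathbf{\Gamma}+\sigma_2\mathbf{v}\mathbf{v}^T\big\}\sigma'(\mathbf{\Sigma})^T
  &= \frac{|\mathbf{\Sigma}|^{1/k}}{4k^2}\big(2\sigma_1 k + \sigma_2 k^2\big)
   = \frac{1}{4}\left(\frac{2\sigma_1}{k}+\sigma_2\right)|\mathbf{\Sigma}|^{1/k}.
\end{align*}
```

Unlike the shape component, the scale component is not homogeneous of order zero, which is why $\sigma_{2}$ remains in its limiting variance.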
Example 5 (Direction of the vector of variance components).

In order to create a single scalar as an index of the asymptotic efficiency for estimators $\bm{\theta}_{n}$ for the vector $\bm{\theta}_{0}$ of variance components, it is helpful to separate $\bm{\theta}_{0}$ into its direction and length. The direction component $H(\bm{\theta})=\bm{\theta}/\|\bm{\theta}\|$ satisfies (4.1). Its derivative is given by

$$H^{\prime}(\bm{\theta})=\frac{\partial H(\bm{\theta})}{\partial\bm{\theta}^{T}}=\frac{1}{\|\bm{\theta}\|}\left(\mathbf{I}_{\ell}-\frac{\bm{\theta}\bm{\theta}^{T}}{\|\bm{\theta}\|^{2}}\right). \tag{4.5}$$

Then, according to Theorem 3(ii), for the direction estimator it follows that $\sqrt{n}(H(\bm{\theta}_{n})-H(\bm{\theta}_{0}))$ is asymptotically normal with mean zero and variance

$$\frac{2\sigma_{1}}{\|\bm{\theta}_{0}\|^{2}}\left(\mathbf{I}_{\ell}-\frac{\bm{\theta}_{0}\bm{\theta}_{0}^{T}}{\|\bm{\theta}_{0}\|^{2}}\right)\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}\left(\mathbf{I}_{\ell}-\frac{\bm{\theta}_{0}\bm{\theta}_{0}^{T}}{\|\bm{\theta}_{0}\|^{2}}\right).$$
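As an illustration, the limiting variance of the direction estimator can be evaluated for a small linear structure. The sketch below uses a hypothetical compound symmetry structure $\mathbf{V}(\bm{\theta})=\theta_{1}\mathbf{I}_{k}+\theta_{2}\mathbf{1}\mathbf{1}^{T}$ with made-up values for $\bm{\theta}_{0}$ and $\sigma_{1}$; none of these choices come from the text. It also confirms that the variance is singular in the direction of $\bm{\theta}_{0}$, as one expects from a mapping that is homogeneous of order zero.

```python
import numpy as np

def direction(theta):
    """Direction component H(theta) = theta / ||theta||."""
    return theta / np.linalg.norm(theta)

def direction_jacobian(theta):
    """Right-hand side of (4.5)."""
    norm = np.linalg.norm(theta)
    return (np.eye(theta.size) - np.outer(theta, theta) / norm**2) / norm

# Hypothetical linear structure: V(theta) = theta_1 * I_k + theta_2 * 1 1^T,
# so that vec(V(theta)) = L @ theta with L = [vec(I_k), vec(1 1^T)].
k, sigma1 = 3, 1.0
L = np.column_stack([
    np.eye(k).reshape(-1, order="F"),
    np.ones((k, k)).reshape(-1, order="F"),
])
theta0 = np.array([1.0, 0.5])
Sigma = (L @ theta0).reshape(k, k, order="F")
Sinv = np.linalg.inv(Sigma)

A = L.T @ np.kron(Sinv, Sinv) @ L
J = direction_jacobian(theta0)
var_direction = 2 * sigma1 * J @ np.linalg.inv(A) @ J.T

# The variance annihilates theta0, because H'(theta0) @ theta0 = 0.
residual = np.max(np.abs(var_direction @ theta0))
```

Here `var_direction` is an $\ell\times\ell$ matrix that depends on the estimator only through the factor `sigma1`.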

It does not seem possible to simplify this expression any further, but it illustrates that one can use the scalar $\sigma_{1}$ as an index for the asymptotic relative efficiency of estimators $H(\bm{\theta}_{n})$ for $H(\bm{\theta}_{0})$.

An alternative is the mapping $H(\bm{\theta})=\bm{\theta}/|\mathbf{V}(\bm{\theta})|^{1/k}$. Since $\mathbf{V}$ is linear, this $H$ also satisfies (4.1). For $\mathbf{V}_{n}=\mathbf{V}(\bm{\theta}_{n})$, it holds that $\bm{\theta}_{n}=(\mathbf{L}^{T}\mathbf{L})^{-1}\mathbf{L}^{T}\mathrm{vec}(\mathbf{V}_{n})$, so that

$$H(\bm{\theta}_{n})=(\mathbf{L}^{T}\mathbf{L})^{-1}\mathbf{L}^{T}\mathrm{vec}\left(\mathbf{V}_{n}/|\mathbf{V}_{n}|^{1/k}\right).$$

From Example 4, it follows that $\sqrt{n}(H(\bm{\theta}_{n})-H(\bm{\theta}_{0}))$ is asymptotically normal with mean zero and variance

$$\frac{2\sigma_{1}}{|\mathbf{\Sigma}|^{2/k}}\left\{\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}-\frac{1}{k}\bm{\theta}_{0}\bm{\theta}_{0}^{T}\right\}.$$
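The step from Example 4 to this expression amounts to pre- and post-multiplying (4.3) by $(\mathbf{L}^{T}\mathbf{L})^{-1}\mathbf{L}^{T}$ and its transpose. The sketch below checks this numerically for a hypothetical compound symmetry structure; the structure, the value of $\bm{\theta}_{0}$ and `sigma1` are illustrative assumptions only.

```python
import numpy as np

# Hypothetical linear structure: V(theta) = theta_1 * I_k + theta_2 * 1 1^T.
k, sigma1 = 3, 1.0
L = np.column_stack([
    np.eye(k).reshape(-1, order="F"),
    np.ones((k, k)).reshape(-1, order="F"),
])
theta0 = np.array([1.0, 0.5])
v = L @ theta0                      # vec(Sigma)
Sigma = v.reshape(k, k, order="F")
Sinv = np.linalg.inv(Sigma)

A_inv = np.linalg.inv(L.T @ np.kron(Sinv, Sinv) @ L)
scale = 2 * sigma1 / np.linalg.det(Sigma) ** (2.0 / k)

# Limiting variance (4.3) of the shape estimator vec(V_n) / |V_n|^{1/k}.
var_shape = scale * (L @ A_inv @ L.T - np.outer(v, v) / k)

# H(theta_n) is the linear map (L^T L)^{-1} L^T applied to the shape estimator.
P = np.linalg.solve(L.T @ L, L.T)
var_sandwich = P @ var_shape @ P.T

# Closed-form expression for the same limiting variance.
var_closed = scale * (A_inv - np.outer(theta0, theta0) / k)
```

The matrices `var_sandwich` and `var_closed` should coincide up to rounding error, since $(\mathbf{L}^{T}\mathbf{L})^{-1}\mathbf{L}^{T}\mathbf{L}=\mathbf{I}_{\ell}$ and $(\mathbf{L}^{T}\mathbf{L})^{-1}\mathbf{L}^{T}\mathrm{vec}(\mathbf{\Sigma})=\bm{\theta}_{0}$.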

This component $H$ leads to a simpler expression for the limiting variance and the scalar $\sigma_{1}$ can again be used as an index for the asymptotic relative efficiency of estimators $H(\bm{\theta}_{n})$ for $H(\bm{\theta}_{0})$.

5 Influence function of structured covariance functionals

The influence function measures the local robustness of an estimator. It describes the effect of an infinitesimal contamination at a single point on the corresponding functional (see Hampel [7]). Good local robustness is therefore illustrated by a bounded influence function. It is defined as follows. Let $P$ be a distribution on $\mathbb{R}^{k}$. For $0<h<1$ and $\mathbf{y}\in\mathbb{R}^{k}$ fixed, define the perturbed probability measure $P_{h,\mathbf{y}}=(1-h)P+h\delta_{\mathbf{y}}$, where $\delta_{\mathbf{y}}$ denotes the Dirac measure at $\mathbf{y}\in\mathbb{R}^{k}$. The influence function of a $k\times k$ covariance functional $\mathbf{C}(\cdot)$ at probability measure $P$, is defined as

$$\mathrm{IF}(\mathbf{y};\mathbf{C},P)=\lim_{h\downarrow 0}\frac{\mathbf{C}((1-h)P+h\delta_{\mathbf{y}})-\mathbf{C}(P)}{h}, \tag{5.1}$$

if this limit exists.
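As a sanity check on definition (5.1), the following sketch evaluates the difference quotient for the ordinary covariance functional $\mathbf{C}(P)=\mathrm{Cov}_P(\mathbf{y})$, chosen here only for illustration. For this functional the contaminated covariance can be computed exactly from the first two moments of $P$, and the limit in (5.1) equals $(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^T-\mathbf{\Sigma}$. The function names and the numerical values of $\bm{\mu}$, $\mathbf{\Sigma}$ and $\mathbf{y}$ are hypothetical.

```python
import numpy as np

def contaminated_cov(mu, sigma, y, h):
    """Covariance of (1 - h) P + h delta_y, computed from the moments of P."""
    mean_h = (1 - h) * mu + h * y
    second_h = (1 - h) * (sigma + np.outer(mu, mu)) + h * np.outer(y, y)
    return second_h - np.outer(mean_h, mean_h)

def difference_quotient(mu, sigma, y, h):
    """The quotient inside the limit of (5.1) for the covariance functional."""
    return (contaminated_cov(mu, sigma, y, h) - sigma) / h

# Hypothetical mean, covariance and contamination point.
mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.5, 0.2],
                  [0.3, 0.2, 1.0]])
y = np.array([4.0, 0.0, -3.0])

# Known limit for the covariance functional.
limit = np.outer(y - mu, y - mu) - sigma

for h in (1e-1, 1e-3, 1e-5):
    err = np.max(np.abs(difference_quotient(mu, sigma, y, h) - limit))
    print(f"h = {h:g}: max abs deviation from the limit = {err:.3e}")
```

The deviation shrinks linearly in $h$, because for this functional the quotient equals $(1-h)(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^T-\mathbf{\Sigma}$. The limit is unbounded in $\mathbf{y}$, which illustrates the lack of local robustness of the ordinary covariance.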

Let $P$ be a distribution on $\mathbb{R}^k$ with density $|\mathbf{\Sigma}|^{-1/2}g\left((\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})\right)$, where $\bm{\mu}\in\mathbb{R}^k$ and $\mathbf{\Sigma}\in\text{PDS}(k)$, and let $\mathbf{C}$ be Fisher consistent for $\mathbf{\Sigma}$, that is, $\mathbf{C}(P)=\mathbf{\Sigma}$, and affine equivariant, meaning $\mathbf{C}(P_{\mathbf{A}\mathbf{y}+\mathbf{b}})=\mathbf{A}\mathbf{C}(P_{\mathbf{y}})\mathbf{A}^T$ for any nonsingular $k\times k$ matrix $\mathbf{A}$ and $\mathbf{b}\in\mathbb{R}^k$, where $P_{\mathbf{y}}$ denotes the distribution of a random vector $\mathbf{y}$. Croux and Haesbroeck [5] show that the influence function of such covariance functionals at the $N_k(\bm{\mu},\mathbf{\Sigma})$ distribution is given by

$$\text{IF}(\mathbf{y};\mathbf{C},P)=\alpha_C(d(\mathbf{y}))(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^T-\beta_C(d(\mathbf{y}))\mathbf{\Sigma},\tag{5.2}$$

for some real valued functions $\alpha_C$ and $\beta_C$, where $d(\mathbf{y})^2=(\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})$. For more details on $\alpha_C$ and $\beta_C$ for different covariance functionals, see Croux and Haesbroeck [5].

Structured covariance functionals $\mathbf{M}(\cdot)=\mathbf{V}(\bm{\theta}(\cdot))$ are not necessarily affine equivariant, so that the above characterizations do not directly apply. However, Lopuhaä et al [16] find similar expressions for the influence function of the covariance S-functionals $\mathbf{M}(\cdot)$ and $\bm{\theta}(\cdot)$ in a linear model with a linearly structured covariance $\mathbf{V}$, see Corollary 8.4 in [16]. The next lemma shows that these expressions will always appear at elliptical distributions for covariance functionals that are a projection of some affine equivariant covariance functional.

Lemma 1.

Let $P$ be a distribution on $\mathbb{R}^k$ with density $|\mathbf{\Sigma}|^{-1/2}g\left((\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})\right)$, where $\bm{\mu}\in\mathbb{R}^k$ and $\mathbf{\Sigma}\in\text{PDS}(k)$. Let $\mathbf{C}$ be an affine equivariant covariance functional which possesses an influence function and is Fisher consistent for $\mathbf{\Sigma}$. Suppose that $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta}_0)$, for some $\bm{\theta}_0\in\mathbb{R}^{\ell}$, and that $\mathbf{V}$ is linear such that $\mathbf{L}$, as defined in (2.1), is of full column rank. Let $\Pi_L$ be the projection matrix defined in (2.2) and define the covariance functional $\mathbf{M}$ by $\mathrm{vec}(\mathbf{M})=\Pi_L\mathrm{vec}(\mathbf{C})$. Then the following holds.

  • (i)

    The functional $\mathbf{M}$ is Fisher consistent for $\mathbf{\Sigma}$ and there exist functions $\alpha_C,\beta_C:[0,\infty)\to\mathbb{R}$, such that $\text{IF}(\mathbf{y};\mathrm{vec}(\mathbf{M}),P)$ is given by

    $$\alpha_C(d(\mathbf{y}))\mathbf{L}\Big(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\Big)^{-1}\mathbf{L}^T\mathrm{vec}\left(\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}\right)-\beta_C(d(\mathbf{y}))\mathrm{vec}(\mathbf{\Sigma}),$$

    where $d^2(\mathbf{y})=(\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})$.

  • (ii)

    If $\bm{\theta}(P)\in\mathbb{R}^{\ell}$ is the functional such that $\mathrm{vec}(\mathbf{M}(\cdot))=\mathbf{L}\bm{\theta}(\cdot)$, then $\bm{\theta}$ is Fisher consistent for $\bm{\theta}_0$ and $\text{IF}(\mathbf{y};\bm{\theta},P)$ is given by

    $$\alpha_C(d(\mathbf{y}))\Big(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\Big)^{-1}\mathbf{L}^T\mathrm{vec}\left(\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}\right)-\beta_C(d(\mathbf{y}))\bm{\theta}_0.$$
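The two expressions in Lemma 1 can be evaluated numerically. The sketch below does so for a compound symmetry structure $\mathbf{V}(\bm{\theta})=\theta_1\mathbf{I}+\theta_2\mathbf{1}\mathbf{1}^T$, which is a hypothetical example. It assumes, since (2.1) is not restated in this section, that the columns of $\mathbf{L}$ are the vectorized basis matrices of the linear structure, so that $\mathrm{vec}(\mathbf{V}(\bm{\theta}))=\mathbf{L}\bm{\theta}$. The values of $\alpha_C$ and $\beta_C$ are passed in as plain numbers, and all function names are illustrative.

```python
import numpy as np

def vec(a):
    """Stack the columns of a matrix into one vector."""
    return a.reshape(-1, order="F")

def lemma1_if(L, sigma, mu, y, theta0, alpha, beta):
    """Evaluate the expressions of Lemma 1(i) and (ii) for given alpha, beta values."""
    sigma_inv = np.linalg.inv(sigma)
    c = y - mu
    W = np.kron(sigma_inv, sigma_inv)
    A = L.T @ W @ L                                   # L^T (Sigma^-1 (x) Sigma^-1) L
    rhs = L.T @ vec(sigma_inv @ np.outer(c, c) @ sigma_inv)
    core = np.linalg.solve(A, rhs)
    if_theta = alpha * core - beta * theta0           # Lemma 1(ii)
    if_vec_m = alpha * (L @ core) - beta * vec(sigma) # Lemma 1(i)
    return if_vec_m, if_theta

# Assumed example: V(theta) = theta_1 * I + theta_2 * J, with J the all-ones matrix.
k = 3
basis = [np.eye(k), np.ones((k, k))]
L = np.column_stack([vec(b) for b in basis])
theta0 = np.array([1.0, 0.5])
sigma = theta0[0] * basis[0] + theta0[1] * basis[1]
mu = np.zeros(k)
y = np.array([2.0, -1.0, 0.5])

# alpha = beta = 1 are placeholder values, not tied to a particular estimator here.
if_vec_m, if_theta = lemma1_if(L, sigma, mu, y, theta0, alpha=1.0, beta=1.0)
print("IF for theta:", if_theta)
print("IF for vec(M) equals L times IF for theta:", np.allclose(if_vec_m, L @ if_theta))
```

The final check reflects the relation $\mathrm{vec}(\mathbf{M}(\cdot))=\mathbf{L}\bm{\theta}(\cdot)$ of part (ii): the influence function in part (i) is $\mathbf{L}$ times the one in part (ii), because $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}_0$.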

Note that the functions $\alpha_C$ and $\beta_C$ have nothing to do with the projection $\Pi_L$, but are inherited from the influence function (5.2) of the affine equivariant covariance functional $\mathbf{C}$. At a distribution $P$ that has an elliptical density (3.2) with a linearly structured covariance, Lopuhaä et al [16] find expressions similar to the ones in Lemma 1 for the covariance S-functionals. If the S-functional is defined by some function $\rho$ and constant $b_0$ (see Example 3), then

$$\begin{split}\alpha_C(s)&=\frac{k\rho'(s)}{s\delta_1}\\ \beta_C(s)&=\frac{\rho'(s)s}{\delta_1}-\frac{2(\rho(s)-b_0)}{\delta_2},\end{split}\tag{5.3}$$

where

$$\begin{split}\delta_1&=\frac{\mathbb{E}_{\mathbf{0},\mathbf{I}}\left[\rho''(\|\mathbf{z}\|)\|\mathbf{z}\|^2+(k+1)\rho'(\|\mathbf{z}\|)\|\mathbf{z}\|\right]}{k+2}\\ \delta_2&=\mathbb{E}_{\mathbf{0},\mathbf{I}}\left[\rho'(\|\mathbf{z}\|)\|\mathbf{z}\|\right].\end{split}\tag{5.4}$$

These $\alpha_C$ and $\beta_C$ are the same as the ones that appear in the expression for the influence function of the affine equivariant covariance S-functional $\mathbf{C}$ in the multivariate location-scale model, see Lopuhaä [13] or Salibián-Barrera et al [24], or in the multivariate regression model, see Van Aelst and Willems [28]. Indeed, the influence function $\text{IF}(\mathbf{y};\mathrm{vec}(\mathbf{V}(\bm{\theta})),P)$ of the structured covariance functional in Lopuhaä et al [16] is precisely the projection $\Pi_L$ of $\text{IF}(\mathbf{y};\mathrm{vec}(\mathbf{C}),P)$ as obtained in [13, 24, 28].
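Formulas (5.3) and (5.4) can be evaluated numerically once $\rho$, $b_0$ and the distribution of $\|\mathbf{z}\|$ are fixed. The sketch below makes several assumptions that are not taken from the text: $\mathbf{z}$ is standard normal, so that $\|\mathbf{z}\|$ follows a chi distribution with $k$ degrees of freedom; $b_0=\mathbb{E}_{\mathbf{0},\mathbf{I}}\rho(\|\mathbf{z}\|)$; and Tukey's biweight with a hypothetical cutoff $c=4$ serves as an example of a bounded $\rho$. As a check, the unbounded choice $\rho(s)=s^2/2$ with $b_0=k/2$ gives $\delta_1=\delta_2=k$ and hence $\alpha_C=\beta_C=1$, the values for the ordinary covariance functional. All function names are illustrative.

```python
import math
import numpy as np

def chi_expectation(f, k, upper=40.0, n=200_001):
    """Approximate E f(||z||) for z ~ N(0, I_k) by the trapezoidal rule."""
    r = np.linspace(0.0, upper, n)
    # Density of ||z||: the chi distribution with k degrees of freedom.
    dens = 2.0 ** (1 - k / 2) * r ** (k - 1) * np.exp(-r ** 2 / 2) / math.gamma(k / 2)
    vals = f(r) * dens
    return float(np.sum((vals[1:] + vals[:-1]) * np.diff(r)) / 2)

def alpha_beta(rho, d_rho, d2_rho, b0, k):
    """Return alpha_C and beta_C of (5.3), with delta_1 and delta_2 from (5.4)."""
    delta1 = chi_expectation(
        lambda r: d2_rho(r) * r ** 2 + (k + 1) * d_rho(r) * r, k) / (k + 2)
    delta2 = chi_expectation(lambda r: d_rho(r) * r, k)

    def alpha(s):
        return k * d_rho(s) / (s * delta1)

    def beta(s):
        return d_rho(s) * s / delta1 - 2 * (rho(s) - b0) / delta2

    return alpha, beta

k = 3

# Check case: rho(s) = s^2 / 2 and b0 = E rho(||z||) = k / 2.
alpha_q, beta_q = alpha_beta(lambda s: s ** 2 / 2, lambda s: s,
                             lambda s: np.ones_like(s), k / 2, k)
print("quadratic rho:", alpha_q(2.0), beta_q(2.0))

# Tukey's biweight with a hypothetical cutoff c (an assumed choice of rho).
c = 4.0

def rho_bw(s):
    inside = s ** 2 / 2 - s ** 4 / (2 * c ** 2) + s ** 6 / (6 * c ** 4)
    return np.where(np.abs(s) <= c, inside, c ** 2 / 6)

def d_rho_bw(s):
    return np.where(np.abs(s) <= c, s * (1 - (s / c) ** 2) ** 2, 0.0)

def d2_rho_bw(s):
    u = (s / c) ** 2
    return np.where(np.abs(s) <= c, (1 - u) * (1 - 5 * u), 0.0)

b0_bw = chi_expectation(rho_bw, k)
alpha_bw, beta_bw = alpha_beta(rho_bw, d_rho_bw, d2_rho_bw, b0_bw, k)
for s in (0.5, 2.0, 5.0):
    print(f"s = {s}: alpha_C = {float(alpha_bw(s)):.4f}, beta_C = {float(beta_bw(s)):.4f}")
```

Because $\rho'$ vanishes beyond the cutoff, $\alpha_C(s)=0$ for $s>c$ in the biweight case, whereas the quadratic $\rho$ gives a constant $\alpha_C$.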

When $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta}_0)$ is unstructured, then $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}_0$ with $\bm{\theta}_0=\text{vech}(\mathbf{\Sigma})$ and $\mathbf{L}$ is the duplication matrix $\mathcal{D}_k$. In that case, from (2.4) it follows that the expression for $\text{IF}(\mathbf{y};\mathrm{vec}(\mathbf{M}),P)$ in Lemma 1(i) with $\mathbf{L}=\mathcal{D}_k$ reduces to

$$\text{IF}(\mathbf{y};\mathrm{vec}(\mathbf{M}),P)=\mathrm{vec}\left\{\alpha_C(d(\mathbf{y}))(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^T-\beta_C(d(\mathbf{y}))\mathbf{\Sigma}\right\}.\tag{5.5}$$

This coincides with the expression found in Lemma 1 in Croux and Haesbroeck [5].
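The reduction to (5.5) can be checked numerically. The sketch below builds a duplication matrix under the common convention that $\text{vech}$ stacks the lower triangle column by column (an assumption, since (2.3) is not restated here), evaluates the expression of Lemma 1(i) with $\mathbf{L}=\mathcal{D}_k$, and compares it with the right-hand side of (5.5). The values of $\alpha_C$, $\beta_C$, $\bm{\mu}$, $\mathbf{\Sigma}$ and $\mathbf{y}$ are arbitrary placeholders.

```python
import numpy as np

def vec(a):
    """Stack the columns of a matrix into one vector."""
    return a.reshape(-1, order="F")

def duplication_matrix(k):
    """D_k with D_k vech(S) = vec(S) for symmetric S (lower triangle, column by column)."""
    cols = []
    for j in range(k):
        for i in range(j, k):
            e = np.zeros((k, k))
            e[i, j] = e[j, i] = 1.0
            cols.append(vec(e))
    return np.column_stack(cols)

k = 3
D = duplication_matrix(k)

# Placeholder parameter values.
mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.5, 0.2],
                  [0.3, 0.2, 1.0]])
y = np.array([4.0, 0.0, -3.0])
alpha, beta = 0.7, 1.3

sigma_inv = np.linalg.inv(sigma)
c = y - mu
W = np.kron(sigma_inv, sigma_inv)

# Expression of Lemma 1(i) with L = D_k.
rhs = D.T @ vec(sigma_inv @ np.outer(c, c) @ sigma_inv)
lemma_form = alpha * D @ np.linalg.solve(D.T @ W @ D, rhs) - beta * vec(sigma)

# Right-hand side of (5.5).
reduced_form = vec(alpha * np.outer(c, c) - beta * sigma)

print("Lemma 1(i) with L = D_k agrees with (5.5):", np.allclose(lemma_form, reduced_form))
```

The agreement holds because $(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^T$ is symmetric, so its vectorization already lies in the column space of $\mathcal{D}_k$ and is left unchanged by the projection.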

Mappings $H$ that satisfy (4.1) also have useful applications to influence functions of affine equivariant covariance functionals $\mathbf{C}$ and their gross-error sensitivity (GES). Kent and Tyler [11] consider functionals $\mathbf{C}/|\mathbf{C}|^{1/k}$ and $\mathbf{C}/\text{tr}(\mathbf{C})$ to obtain that the GES of different CM-functionals is proportional to a single scalar. Salibián-Barrera et al [24] derive the influence function of the shape component of covariance MM-functionals and show that it is proportional to a single function $\alpha_C$, which no longer depends on the scale functional used in the first step. In fact, these properties hold more generally for functionals $H$ satisfying (4.1) applied to affine equivariant covariance functionals. The next lemma establishes similar results for linearly structured covariance functionals.

Lemma 2.

Let $P$ be a distribution on $\mathbb{R}^k$ with an elliptically contoured density (3.2). Suppose that $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta}_0)$, for some $\bm{\theta}_0\in\mathbb{R}^{\ell}$, and that $\mathbf{V}$ is linear such that $\mathbf{L}$, as defined in (2.1), is of full column rank.

  • (i)

    Let $\mathbf{M}\in\text{PDS}(k)$ be a covariance functional that is Fisher consistent for $\mathbf{\Sigma}$ and which possesses an influence function given by Lemma 1(i). Let $H(\mathbf{M})$ be continuously differentiable in a neighborhood of $\mathbf{M}(P)$ satisfying (4.1). Then $\text{IF}(\mathbf{y};H(\mathbf{M}),P)$ is given by

    $$\alpha_C(d(\mathbf{y}))H'(\mathbf{\Sigma})\mathbf{L}\Big(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\Big)^{-1}\mathbf{L}^T\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\Big((\mathbf{y}-\bm{\mu})\otimes(\mathbf{y}-\bm{\mu})\Big),$$

    where $d^2(\mathbf{y})=(\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})$.

  • (ii)

    Let $\bm{\theta}\in\mathbb{R}^{\ell}$ be a functional that is Fisher consistent for $\bm{\theta}_0$ and which possesses an influence function given by Lemma 1(ii). Let $H(\bm{\theta})$ be continuously differentiable in a neighborhood of $\bm{\theta}(P)$ satisfying (4.1). Then $\text{IF}(\mathbf{y};H(\bm{\theta}),P)$ is given by

    $$\alpha_C(d(\mathbf{y}))H'(\bm{\theta}_0)\Big(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\Big)^{-1}\mathbf{L}^T\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\Big((\mathbf{y}-\bm{\mu})\otimes(\mathbf{y}-\bm{\mu})\Big).$$

Consider the GES defined by $\sup_{\mathbf{y}\in\mathbb{R}^k}\|\text{IF}(\mathbf{y};\cdot)\|$, for some norm $\|\cdot\|$. From Lemma 2 it follows immediately that, regardless of the choice of the norm, the value $\|\text{IF}(\mathbf{y};H(\mathbf{M}),P)\|$ for different functionals $H(\mathbf{M}(P))$ is proportional to $|\alpha_C(d(\mathbf{y}))|$, and similarly for functionals $H(\bm{\theta}(P))$. We discuss some examples below.
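The step from Lemma 1 to Lemma 2 can be sketched as follows. The sketch assumes that condition (4.1), which is not restated in this section, expresses scale invariance $H(\lambda\mathbf{M})=H(\mathbf{M})$ for all $\lambda>0$, and that the usual chain rule for influence functions applies. Under these assumptions the $\beta_C$ term of Lemma 1(i) is annihilated by $H'(\mathbf{\Sigma})$, which would explain why only $\alpha_C$ appears in Lemma 2. The last line is the standard identity $\mathrm{vec}(\mathbf{A}\mathbf{B}\mathbf{C})=(\mathbf{C}^T\otimes\mathbf{A})\mathrm{vec}(\mathbf{B})$ applied to $\mathbf{B}=(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^T$.

```latex
% Sketch under the assumption that (4.1) reads H(\lambda M) = H(M) for all \lambda > 0.
\begin{align*}
\frac{\mathrm{d}}{\mathrm{d}\lambda}H(\lambda\mathbf{\Sigma})\Big|_{\lambda=1}
  &= H'(\mathbf{\Sigma})\,\mathrm{vec}(\mathbf{\Sigma}) = \mathbf{0},\\
\text{IF}(\mathbf{y};H(\mathbf{M}),P)
  &= H'(\mathbf{\Sigma})\,\text{IF}(\mathbf{y};\mathrm{vec}(\mathbf{M}),P)\\
  &= \alpha_C(d(\mathbf{y}))\,H'(\mathbf{\Sigma})\mathbf{L}
     \Big(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\Big)^{-1}\mathbf{L}^T
     \mathrm{vec}\left(\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}\right),\\
\mathrm{vec}\left(\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}\right)
  &= \left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)
     \Big((\mathbf{y}-\bm{\mu})\otimes(\mathbf{y}-\bm{\mu})\Big).
\end{align*}
```

The same argument with $H'(\bm{\theta}_0)\bm{\theta}_0=\mathbf{0}$ applies to mappings of the vector of variance components.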

Example 6 (Shape and scale of a structured covariance).

For the shape functional $H(\mathbf{M})=\mathrm{vec}(\mathbf{M})/|\mathbf{M}|^{1/k}$, from Lemma 2(i) together with (4.2) we find

$$\text{IF}(\mathbf{y};H(\mathbf{M}),P)=-\frac{1}{k}|\mathbf{\Sigma}|^{-1/k}\text{tr}\left(\mathbf{\Sigma}^{-1}\text{IF}(\mathbf{y};\mathbf{M},P)\right)\cdot\mathrm{vec}(\mathbf{\Sigma})+|\mathbf{\Sigma}|^{-1/k}\text{IF}(\mathbf{y};\mathrm{vec}(\mathbf{M}),P).$$

See also Salibián-Barrera et al [24]. In particular, at a distribution $P$ with an elliptically contoured density with parameters $\bm{\mu}$ and $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta}_0)$ one finds that $\text{IF}(\mathbf{y};H(\mathbf{M}),P)$ is given by

$$\frac{\alpha_C(d(\mathbf{y}))}{|\mathbf{\Sigma}|^{1/k}}\bigg\{\mathbf{L}\Big(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\Big)^{-1}\mathbf{L}^T\mathrm{vec}\left(\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}\right)-\frac{d(\mathbf{y})^2}{k}\mathrm{vec}(\mathbf{\Sigma})\bigg\},\tag{5.6}$$

where $d(\mathbf{y})^2=(\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})$. It follows that $\|\text{IF}(\mathbf{y};H(\mathbf{M}),P)\|$ will be proportional to $|\alpha_C(d(\mathbf{y}))d(\mathbf{y})^2|$. When $\mathbf{\Sigma}$ is unstructured, then $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}_0$, where $\bm{\theta}_0=\text{vech}(\mathbf{\Sigma})$, as defined in (2.3), and $\mathbf{L}$ is the duplication matrix $\mathcal{D}_k$. In that case, from (2.4) it follows that (5.6) with $\mathbf{L}=\mathcal{D}_k$ reduces to

$$\frac{\alpha_C(d(\mathbf{y}))}{|\mathbf{\Sigma}|^{1/k}}\mathrm{vec}\left\{(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^T-\frac{d(\mathbf{y})^2}{k}\mathbf{\Sigma}\right\},$$

which coincides with formula (3) in [24]. For completeness, consider the scale component $\sigma(\mathbf{M})=|\mathbf{M}|^{1/(2k)}$. From (4.4), it follows that

$$\text{IF}(\mathbf{y};\sigma,P)=\frac{1}{2}|\mathbf{\Sigma}|^{1/(2k)}\gamma_C(d(\mathbf{y})),$$

where $\gamma_C(s)=\alpha_C(s)s^2/k-\beta_C(s)$, which matches equation (4) in [24].
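The passage from the chain-rule expression for the shape functional to (5.6) can be verified numerically. The sketch below uses a hypothetical compound symmetry structure $\mathbf{V}(\bm{\theta})=\theta_1\mathbf{I}+\theta_2\mathbf{1}\mathbf{1}^T$ and assumes that the columns of $\mathbf{L}$ are the vectorized basis matrices, since (2.1) is not restated here. It inserts the influence function of Lemma 1(i) into the chain-rule expression for arbitrary placeholder values of $\alpha_C$ and $\beta_C$, and compares the result with (5.6), which does not involve $\beta_C$.

```python
import numpy as np

def vec(a):
    """Stack the columns of a matrix into one vector."""
    return a.reshape(-1, order="F")

# Assumed example: V(theta) = theta_1 * I + theta_2 * J, with J the all-ones matrix.
k = 3
basis = [np.eye(k), np.ones((k, k))]
L = np.column_stack([vec(b) for b in basis])
theta0 = np.array([1.0, 0.5])
sigma = theta0[0] * basis[0] + theta0[1] * basis[1]
sigma_inv = np.linalg.inv(sigma)
det_pow = np.linalg.det(sigma) ** (-1.0 / k)          # |Sigma|^(-1/k)

mu = np.zeros(k)
y = np.array([2.0, -1.0, 0.5])
c = y - mu
d2 = c @ sigma_inv @ c                                # d(y)^2

W = np.kron(sigma_inv, sigma_inv)
rhs = L.T @ vec(sigma_inv @ np.outer(c, c) @ sigma_inv)
proj_term = L @ np.linalg.solve(L.T @ W @ L, rhs)     # projected part of Lemma 1(i)

def shape_if_chain_rule(alpha, beta):
    """Chain-rule expression for the shape functional with IF(vec(M)) from Lemma 1(i)."""
    if_vec_m = alpha * proj_term - beta * vec(sigma)
    if_m = if_vec_m.reshape(k, k, order="F")
    trace_term = np.trace(sigma_inv @ if_m)
    return -det_pow / k * trace_term * vec(sigma) + det_pow * if_vec_m

def shape_if_closed_form(alpha):
    """Expression (5.6)."""
    return alpha * det_pow * (proj_term - d2 / k * vec(sigma))

alpha = 0.7                                           # placeholder value
for beta in (0.0, 1.3, -2.0):                         # placeholder values
    same = np.allclose(shape_if_chain_rule(alpha, beta), shape_if_closed_form(alpha))
    print(f"beta = {beta}: chain rule agrees with (5.6): {same}")
```

The agreement for every value of $\beta_C$ illustrates that the influence function of the shape functional depends on the underlying functional only through $\alpha_C$.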

Example 7 (Direction of the vector of variance components).

For the direction functional $H(\bm{\theta})=\bm{\theta}/\|\bm{\theta}\|$, from Lemma 2(ii) together with (4.5) we find that, at a distribution $P$ with an elliptically contoured density with parameters $\bm{\mu}$ and $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta}_0)$, $\text{IF}(\mathbf{y};H(\bm{\theta}),P)$ is given by

$$\alpha_C(d(\mathbf{y}))\left(\frac{1}{\|\bm{\theta}_0\|}\mathbf{I}_{\ell}-\frac{\bm{\theta}_0\bm{\theta}_0^T}{\|\bm{\theta}_0\|^3}\right)\Big(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\Big)^{-1}\mathbf{L}^T\mathrm{vec}\left(\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^T\mathbf{\Sigma}^{-1}\right).$$

It follows that $\|\text{IF}(\mathbf{y};H(\bm{\theta}),P)\|$ will be proportional to $|\alpha_C(d(\mathbf{y}))d(\mathbf{y})^2|$. An alternative is the mapping $H(\bm{\theta})=\bm{\theta}/|\mathbf{V}(\bm{\theta})|^{1/k}$. Since $\mathbf{V}$ is linear, $H$ satisfies (4.1). For $\mathbf{M}(P)=\mathbf{V}(\bm{\theta}(P))$, it holds that $\bm{\theta}(P)=(\mathbf{L}^T\mathbf{L})^{-1}\mathbf{L}^T\mathrm{vec}(\mathbf{M}(P))$, so that

$$H(\bm{\theta}(P))=(\mathbf{L}^T\mathbf{L})^{-1}\mathbf{L}^T\mathrm{vec}(\mathbf{M}(P))/|\mathbf{M}(P)|^{1/k}.$$

From Example 6 it follows that $\mathrm{IF}(\mathbf{y};H(\bm{\theta}),P)$ is given by

\[
\frac{\alpha_{C}(d(\mathbf{y}))}{|\mathbf{\Sigma}|^{1/k}}\bigg\{\Big(\mathbf{L}^{T}(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\Big)^{-1}\mathbf{L}^{T}\mathrm{vec}\left(\mathbf{\Sigma}^{-1}(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^{T}\mathbf{\Sigma}^{-1}\right)-\frac{d(\mathbf{y})^{2}}{k}\bm{\theta}_{0}\bigg\}
\]

using that $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}_{0}$. Again we find that $\|\mathrm{IF}(\mathbf{y};H(\bm{\theta}),P)\|$ is proportional to $|\alpha_{C}(d(\mathbf{y}))d(\mathbf{y})^{2}|$.
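The matrix factor $\|\bm{\theta}_{0}\|^{-1}\mathbf{I}_{\ell}-\bm{\theta}_{0}\bm{\theta}_{0}^{T}/\|\bm{\theta}_{0}\|^{3}$ in the first influence function of this example is the Jacobian of the map $\bm{\theta}\mapsto\bm{\theta}/\|\bm{\theta}\|$ at $\bm{\theta}_{0}$. A minimal Python sketch of this fact compares the closed form with central finite differences; the value of `theta0` and the helper names are hypothetical and chosen only for illustration.

```python
import numpy as np

def direction(theta):
    """Direction mapping theta / ||theta||."""
    return theta / np.linalg.norm(theta)

def direction_jacobian(theta):
    """Closed form: I / ||theta|| - theta theta^T / ||theta||^3."""
    nrm = np.linalg.norm(theta)
    return np.eye(theta.size) / nrm - np.outer(theta, theta) / nrm**3

def numerical_jacobian(f, x, h=1e-6):
    """Central finite-difference approximation of the Jacobian of f at x."""
    J = np.empty((f(x).size, x.size))
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2 * h)
    return J

theta0 = np.array([2.0, 0.5, 1.0])   # illustrative value, not from the paper
J = direction_jacobian(theta0)
print(np.max(np.abs(J - numerical_jacobian(direction, theta0))))
print(J @ theta0)                     # numerically zero: theta0 spans the null space
```

The second printed vector reflects that the direction mapping is insensitive to a rescaling of $\bm{\theta}_{0}$: perturbations along $\bm{\theta}_{0}$ itself are annihilated by the Jacobian.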

6 Application

We apply our results to S-estimators and S-functionals in the linear model (3.1). Let $P$ be the distribution of the random variable $\mathbf{s}=(\mathbf{y},\mathbf{X})$, which is such that $\mathbf{y}\mid\mathbf{X}$ has an elliptically contoured distribution (3.2) with parameters $\bm{\mu}=\mathbf{X}\bm{\beta}_{0}$ and $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta}_{0})=\theta_{01}\mathbf{L}_{1}+\cdots+\theta_{0\ell}\mathbf{L}_{\ell}$. Consider the S-estimator for $(\bm{\beta}_{0},\bm{\theta}_{0})$ defined as the solution to minimizing $|\mathbf{V}(\bm{\theta})|$, subject to

\[
\frac{1}{n}\sum_{i=1}^{n}\rho\left(\sqrt{(\mathbf{y}_{i}-\mathbf{X}_{i}\bm{\beta})^{T}\mathbf{V}(\bm{\theta})^{-1}(\mathbf{y}_{i}-\mathbf{X}_{i}\bm{\beta})}\right)=b_{0},
\]

where the minimum is taken over all $\bm{\beta}\in\mathbb{R}^{q}$ and $\bm{\theta}\in\mathbb{R}^{\ell}$, such that $\mathbf{V}(\bm{\theta})\in\mathrm{PDS}(k)$. For the function $\rho$ we take Tukey's bi-weight

\[
\rho_{B}(s;c)=\begin{cases}s^{2}/2-s^{4}/(2c^{2})+s^{6}/(6c^{4}),&|s|\leq c;\\ c^{2}/6,&|s|>c,\end{cases}\tag{6.1}
\]

and $b_{0}=\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}[\rho_{B}(\|\mathbf{z}\|;c)]$. From Theorem 6.1 in Lopuhaä et al. [15] it is known that the breakdown point of the S-estimator depends on the cut-off constant $c$ and is at least $\lceil nb_{0}/(c^{2}/6)\rceil/n$, or asymptotically $\epsilon^{*}=b_{0}/(c^{2}/6)$.
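The quantities $\rho_{B}$, $b_{0}$ and $\epsilon^{*}$ can be evaluated numerically. The Python sketch below assumes that the expectation $\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}$ is taken under the standard normal distribution, so that $\|\mathbf{z}\|$ follows a chi distribution with $k$ degrees of freedom; the function names are hypothetical, and the integral is computed by generic quadrature rather than by any method described in the paper.

```python
import numpy as np
from scipy import integrate, stats

def rho_biweight(s, c):
    """Tukey's bi-weight rho_B(s; c) as in (6.1)."""
    s = np.abs(s)
    inside = s**2 / 2 - s**4 / (2 * c**2) + s**6 / (6 * c**4)
    return np.where(s <= c, inside, c**2 / 6)

def b0_normal(c, k):
    """b_0 = E[rho_B(||z||; c)] for z ~ N(0, I_k), with ||z|| ~ chi(k)."""
    # Integrate over [0, c], where rho_B is a polynomial ...
    body, _ = integrate.quad(
        lambda s: float(rho_biweight(s, c)) * stats.chi.pdf(s, k), 0.0, c
    )
    # ... and add the constant part c^2/6 times P(||z|| > c).
    tail = (c**2 / 6) * stats.chi.sf(c, k)
    return body + tail

def asymptotic_breakdown(c, k):
    """epsilon^* = b_0 / (c^2 / 6)."""
    return b0_normal(c, k) / (c**2 / 6)

# Cut-off values from the 0.50 column of Table 1.
for k, c in [(1, 1.548), (2, 2.661), (5, 4.652), (10, 6.776)]:
    print(k, c, round(asymptotic_breakdown(c, k), 3))
```

If the normality assumption above matches the setting of Table 1, each printed breakdown point should be close to 0.5.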

Table 1: Cut-off values of $\rho_{B}$ for different breakdown points and dimensions.
\[
\begin{array}{crrrrrrrrrr}
\hline\hline
 & \multicolumn{10}{c}{\text{Breakdown point}}\\
k & 0.05 & 0.10 & 0.15 & 0.20 & 0.25 & 0.30 & 0.35 & 0.40 & 0.45 & 0.50\\
\cline{2-11}
1 & 7.545 & 5.182 & 4.096 & 3.421 & 2.937 & 2.561 & 2.252 & 1.988 & 1.756 & 1.548\\
2 & 10.767 & 7.474 & 5.981 & 5.069 & 4.427 & 3.938 & 3.542 & 3.209 & 2.920 & 2.661\\
5 & 17.114 & 11.950 & 9.628 & 8.220 & 7.242 & 6.505 & 5.918 & 5.432 & 5.017 & 4.652\\
10 & 24.246 & 16.961 & 13.694 & 11.719 & 10.351 & 9.324 & 8.510 & 7.840 & 7.271 & 6.776\\
\hline\hline
\end{array}
\]

Table 1 gives the cut-off values of $\rho_{B}$ for given asymptotic lower bounds $\epsilon^{*}=0.05,0.10,\ldots,0.50$ on the breakdown point in dimensions $k=1,2,5,10$. This table partly overlaps with Table 3 in Rousseeuw and Yohai [23].
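Entries of Table 1 can be approximated by solving $\epsilon^{*}=b_{0}/(c^{2}/6)$ for $c$ with a scalar root finder. The following Python sketch is one possible way to do this and is not necessarily how the table was produced: it assumes the expectation is taken under $N(\mathbf{0},\mathbf{I}_{k})$, uses Brent's method from SciPy, and relies on a hypothetical search bracket $[0.5, 40]$ that is wide enough for the dimensions and breakdown points of the table.

```python
from scipy import integrate, optimize, stats

def breakdown(c, k):
    """Asymptotic breakdown point b_0 / (c^2 / 6) for z ~ N(0, I_k)."""
    def rho(s):
        # Polynomial branch of Tukey's bi-weight, valid for 0 <= s <= c.
        return s**2 / 2 - s**4 / (2 * c**2) + s**6 / (6 * c**4)
    body, _ = integrate.quad(lambda s: rho(s) * stats.chi.pdf(s, k), 0.0, c)
    b0 = body + (c**2 / 6) * stats.chi.sf(c, k)
    return b0 / (c**2 / 6)

def cutoff_for_breakdown(eps, k, lo=0.5, hi=40.0):
    """Solve breakdown(c, k) = eps for c; the bracket [lo, hi] is an assumption."""
    return optimize.brentq(lambda c: breakdown(c, k) - eps, lo, hi)

for k in (1, 2, 5, 10):
    row = [round(cutoff_for_breakdown(eps, k), 3) for eps in (0.10, 0.25, 0.50)]
    print(k, row)
```

The root is unique because $\rho_{B}(s;c)/(c^{2}/6)$ is decreasing in $c$ for every fixed $s$, so the breakdown point is a decreasing function of the cut-off constant.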

According to Corollary 9.2 in Lopuhaä et al. [16], the scalar $\lambda=\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\rho_{B}^{\prime}(\|\mathbf{z}\|;c)^{2}\right]/(k\alpha^{2})$ serves as an index for the asymptotic efficiency of the regression S-estimator $\bm{\beta}_{n}$ relative to the least squares estimator (for which $\lambda=1$), where

\[
\alpha=\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\left(1-\frac{1}{k}\right)\frac{\rho_{B}^{\prime}(\|\mathbf{z}\|;c)}{\|\mathbf{z}\|}+\frac{1}{k}\rho_{B}^{\prime\prime}(\|\mathbf{z}\|;c)\right].\tag{6.2}
\]
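The scalars $\alpha$ and $\lambda$ can be evaluated by one-dimensional quadrature. The Python sketch below assumes that the expectation is taken under $N(\mathbf{0},\mathbf{I}_{k})$ and uses the derivatives $\rho_{B}^{\prime}(s;c)=s(1-s^{2}/c^{2})^{2}$ and $\rho_{B}^{\prime\prime}(s;c)=(1-s^{2}/c^{2})(1-5s^{2}/c^{2})$ for $|s|\leq c$, which follow from (6.1); both vanish for $|s|>c$. All function names are hypothetical.

```python
import numpy as np
from scipy import integrate, stats

def psi(s, c):
    """rho_B'(s; c) on 0 <= s <= c."""
    return s * (1 - (s / c) ** 2) ** 2

def psi_over_s(s, c):
    """rho_B'(s; c) / s on 0 <= s <= c, written without dividing by s."""
    return (1 - (s / c) ** 2) ** 2

def psi_prime(s, c):
    """rho_B''(s; c) on 0 <= s <= c."""
    return (1 - (s / c) ** 2) * (1 - 5 * (s / c) ** 2)

def expect_on_support(g, c, k):
    """E[g(||z||) 1{||z|| <= c}] for z ~ N(0, I_k), with ||z|| ~ chi(k)."""
    value, _ = integrate.quad(lambda s: g(s) * stats.chi.pdf(s, k), 0.0, c)
    return value

def alpha(c, k):
    """The constant alpha from (6.2)."""
    return expect_on_support(
        lambda s: (1 - 1 / k) * psi_over_s(s, c) + psi_prime(s, c) / k, c, k
    )

def lam(c, k):
    """lambda = E[rho_B'(||z||; c)^2] / (k alpha^2)."""
    return expect_on_support(lambda s: psi(s, c) ** 2, c, k) / (k * alpha(c, k) ** 2)

c, k = 2.661, 2   # 50% breakdown cut-off for k = 2 from Table 1
print(round(alpha(c, k), 4), round(1 / lam(c, k), 3))
```

For very large $c$ the bi-weight approaches the quadratic $s^{2}/2$, so $\alpha$ and $\lambda$ both tend to one, in line with the least squares benchmark.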

From Examples 4 and 5, together with Theorem 2 and Example 3, it follows that the scalar

\[
\sigma_{1}=\frac{k\,\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\rho_{B}^{\prime}(\|\mathbf{z}\|;c)^{2}\|\mathbf{z}\|^{2}\right]}{(k+2)\delta_{1}^{2}},
\]

where $\delta_{1}$ is defined in (5.4), serves as an index for the asymptotic efficiency of both the S-estimator of shape and the S-estimator for the direction of the vector of variance components, relative to the least squares estimators of shape and direction, respectively (for which $\sigma_{1}=1$). Finally, from Example 4, together with Example 3, it follows that

\[
\sigma_{3}=\frac{1}{4}\left(\frac{2\sigma_{1}}{k}+\sigma_{2}\right)=\frac{\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\left(\rho_{B}(\|\mathbf{z}\|;c)-b_{0}\right)^{2}\right]}{\delta_{2}^{2}},
\]

where $\delta_{2}$ is defined in (5.4), serves as an index for the asymptotic efficiency of the S-estimator of scale relative to the least squares estimator (for which $\sigma_{3}=1/(2k)$). As a consequence, the cut-off constant $c$ of $\rho_{B}$ can be tuned in such a way that the asymptotic efficiency $1/\lambda$ relative to the least squares estimator is high at the normal distribution, and similarly for $1/\sigma_{1}$ and $1/(2k\sigma_{3})$. Since $c$ also determines the breakdown point, this forces a trade-off between efficiency and breakdown point. Typically, large values of $c$ correspond to high efficiency and low breakdown point, and vice versa for moderate values of $c$.

We further investigate how this trade-off relates to the gross error sensitivity (GES) of the corresponding S-functionals. For simplicity we only consider perturbations in $\mathbf{y}$ and leave $\mathbf{X}$ unchanged. From Corollary 8.4 in Lopuhaä et al. [16], for the regression S-functional it then follows that $\|\mathrm{IF}(\mathbf{y};\bm{\beta},P)\|$ is proportional to $\alpha^{-1}\left|\rho_{B}^{\prime}(d(\mathbf{y});c)\right|$, where $\alpha$ is defined in (6.2) and $d(\mathbf{y})^{2}=(\mathbf{y}-\mathbf{X}\bm{\beta}_{0})^{T}\mathbf{\Sigma}^{-1}(\mathbf{y}-\mathbf{X}\bm{\beta}_{0})$. Therefore, we propose the scalar

\[
G_{1}=\frac{1}{\alpha}\sup_{s>0}\left|\rho_{B}^{\prime}(s;c)\right|,
\]

as an index for the GES of regression S-functionals. This coincides with the GES index for location CM-functionals in Kent and Tyler [11]. From Examples 6 and 7, together with Lemma 2 and (5.3), for both the shape and direction S-functional, it follows that $\|\mathrm{IF}(\mathbf{y})\|$ is proportional to $\delta_{1}^{-1}|\rho_{B}^{\prime}(d(\mathbf{y});c)d(\mathbf{y})|$, where $\delta_{1}$ is defined in (5.4). We propose the scalar

\[
G_{2}=\frac{k}{(k+2)\delta_{1}}\sup_{s>0}\left|\rho_{B}^{\prime}(s;c)s\right|,
\]

as an index for the GES of shape and direction S-functionals. In this way, $G_{2}$ coincides with the GES index for CM-functionals of shape in Kent and Tyler [11]. Finally, from Example 6 and (5.3), it follows that for the scale functional $\|\mathrm{IF}(\mathbf{y})\|$ is proportional to $\delta_{2}^{-1}|\rho_{B}(d(\mathbf{y});c)-b_{0}|$, where $\delta_{2}$ is defined in (5.4). We propose

\[
G_{3}=\frac{1}{\delta_{2}}\sup_{s>0}\left|\rho_{B}(s;c)-b_{0}\right|,
\]

as an index for the GES of the S-functional of scale.
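Of the three indices, $G_{1}$ only involves quantities defined in this section, so it can be sketched directly; $G_{2}$ and $G_{3}$ additionally require $\delta_{1}$ and $\delta_{2}$ from (5.4) and are not attempted here. The Python sketch below assumes expectations under $N(\mathbf{0},\mathbf{I}_{k})$ and uses that $\rho_{B}^{\prime}(s;c)=s(1-s^{2}/c^{2})^{2}$ on $[0,c]$ is maximal at $s=c/\sqrt{5}$, which gives $\sup_{s>0}|\rho_{B}^{\prime}(s;c)|=16c/(25\sqrt{5})$. The function names are hypothetical.

```python
import numpy as np
from scipy import integrate, stats

def alpha(c, k):
    """alpha from (6.2) for z ~ N(0, I_k); the integrand vanishes beyond c."""
    def integrand(s):
        r2 = (s / c) ** 2
        u = (1 - r2) ** 2                 # rho_B'(s; c) / s
        d2 = (1 - r2) * (1 - 5 * r2)      # rho_B''(s; c)
        return ((1 - 1 / k) * u + d2 / k) * stats.chi.pdf(s, k)
    value, _ = integrate.quad(integrand, 0.0, c)
    return value

def sup_abs_psi(c):
    """sup over s > 0 of |rho_B'(s; c)|, attained at s = c / sqrt(5)."""
    return 16 * c / (25 * np.sqrt(5))

def ges_regression(c, k):
    """GES index G_1 = sup |rho_B'| / alpha."""
    return sup_abs_psi(c) / alpha(c, k)

for c in (2.661, 3.722, 4.115):
    print(c, round(ges_regression(c, 2), 3))
```

The closed form for the supremum is specific to the bi-weight; for another $\rho$ function one would replace `sup_abs_psi` by a numerical maximization.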

We investigate how the asymptotic efficiency at the normal distribution of the S-estimators and the GES of the corresponding S-functionals behave as we vary the breakdown point of the S-estimator between 0 and 0.5. Given a value $\epsilon^{*}$ of the breakdown point, we determine the corresponding cut-off constant $c$ by solving $\epsilon^{*}=\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}[\rho_{B}(\|\mathbf{z}\|;c)]/(c^{2}/6)$. With this value of $c$, we compute the values of $\lambda$, $\sigma_{1}$ and $\sigma_{3}$ and the GES indices $G_{1}$, $G_{2}$ and $G_{3}$. In the top row of Figure 1 we have plotted the asymptotic relative efficiencies $1/\lambda$, $1/\sigma_{1}$ and $1/(2k\sigma_{3})$ as functions of the breakdown point for dimensions $k=2,5,10$; the bottom row contains plots of the GES indices $G_{1}$, $G_{2}$ and $G_{3}$ for the same dimensions.

Figure 1: ARE and GES as functions of the breakdown point at the multivariate normal in dimensions $k=2,5,10$.

As expected, the efficiency decreases with increasing breakdown point, but the loss of efficiency is less severe for the S-estimator of scale than for the S-estimator for regression and the S-estimators for shape and direction. In dimension $k=2$ (solid lines), the 50% breakdown S-estimators have asymptotic efficiencies $1/\lambda=0.580$, $1/\sigma_{1}=0.376$, and $1/(4\sigma_{3})=0.755$. However, one can both gain efficiency and lower the GES at the cost of a lower breakdown point. For example, the GES index of the regression functional attains its minimal value $G_{1}=1.927$ at breakdown point 28%, which corresponds to cut-off value $c=4.115$. For this cut-off value the GES index of the shape and direction functionals is $G_{2}=1.368$, which is not far from its minimal value 1.344, and the GES index for scale is $G_{3}=3.323$. Furthermore, the asymptotic efficiencies then become $1/\lambda=0.884$, $1/\sigma_{1}=0.803$, and $1/(4\sigma_{3})=0.939$, for the regression estimator, the estimators of shape and direction, and the scale estimator, respectively. Similarly, the GES index of the shape and direction functionals attains its minimal value $G_{2}=1.344$ for $c=3.722$. This would yield $G_{1}=1.947$, $G_{3}=2.844$, $1/\lambda=0.835$, $1/\sigma_{1}=0.723$, $1/(4\sigma_{3})=0.912$ and breakdown point 33%. The GES index of the scale functional attains its minimum value $G_{3}=1.852$ at 50% breakdown point, so no simultaneous gain in efficiency and smaller GES values $G_{1}$ and $G_{2}$ can be achieved at the cost of a smaller breakdown point.
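The location of the minimum of $G_{1}$ in dimension $k=2$ can be explored with a bounded scalar minimization over the cut-off constant. The Python sketch below is only an illustration: it assumes expectations under $N(\mathbf{0},\mathbf{I}_{2})$, restricts the search to the range of cut-off values for $k=2$ in Table 1, and uses hypothetical function names. It is not claimed to be the procedure used to produce the values quoted above.

```python
import numpy as np
from scipy import integrate, optimize, stats

def chi_expect(g, c, k):
    """Integral of g(s) against the chi(k) density over [0, c]."""
    value, _ = integrate.quad(lambda s: g(s) * stats.chi.pdf(s, k), 0.0, c)
    return value

def alpha(c, k):
    """alpha from (6.2); both terms vanish for s > c."""
    u = lambda s: (1 - (s / c) ** 2) ** 2                        # rho_B'(s; c) / s
    d2 = lambda s: (1 - (s / c) ** 2) * (1 - 5 * (s / c) ** 2)   # rho_B''(s; c)
    return chi_expect(lambda s: (1 - 1 / k) * u(s) + d2(s) / k, c, k)

def ges_regression(c, k):
    """G_1 = sup |rho_B'| / alpha, with sup |rho_B'| = 16 c / (25 sqrt(5))."""
    return 16 * c / (25 * np.sqrt(5)) / alpha(c, k)

def breakdown(c, k):
    """Asymptotic breakdown point b_0 / (c^2 / 6)."""
    rho = lambda s: s**2 / 2 - s**4 / (2 * c**2) + s**6 / (6 * c**4)
    b0 = chi_expect(rho, c, k) + (c**2 / 6) * stats.chi.sf(c, k)
    return b0 / (c**2 / 6)

k = 2
# Search between the 50% and 5% breakdown cut-offs for k = 2 (Table 1).
result = optimize.minimize_scalar(
    lambda c: ges_regression(c, k), bounds=(2.661, 10.767), method="bounded"
)
c_opt = result.x
print(round(c_opt, 3), round(result.fun, 3), round(breakdown(c_opt, k), 3))
```

If the sketch matches the computations behind Figure 1, the minimizer, the minimal index and the breakdown point should land near the values $c=4.115$, $G_{1}=1.927$ and 28% reported in the text.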

In dimension $k=5$ (dashed lines), the 50% breakdown S-estimators have asymptotic efficiencies $1/\lambda=0.864$, $1/\sigma_{1}=0.778$, and $1/(2k\sigma_{3})=0.918$. The GES index of the regression functional attains its minimal value $G_{1}=2.595$ at breakdown point 37%. The corresponding GES indices are $G_{2}=1.271$ for the shape and direction functionals and $G_{3}=1.480$ for the scale functional. Corresponding to this smaller regression GES index we observe a gain in the asymptotic efficiencies: $1/\lambda=0.932$, $1/\sigma_{1}=0.903$, and $1/(2k\sigma_{3})=0.965$, for the regression estimator, the estimators of shape and direction, and the scale estimator, respectively. The GES index of the shape and direction functionals attains its minimal value at breakdown point 47%, so the gain in both efficiency and a smaller $G_{2}$ value is negligible. The situation for the GES index for scale is the same as in dimension $k=2$, where no simultaneous gain in efficiency and smaller GES values $G_{1}$ and $G_{2}$ can be achieved at the cost of a smaller breakdown point.

Finally, in dimension $k=10$ (dotted lines), the 50% breakdown S-estimators have asymptotic efficiencies $1/\lambda=0.933$, $1/\sigma_{1}=0.915$, and $1/(2k\sigma_{3})=0.965$. The GES index of the regression functional attains its minimal value $G_{1}=3.426$ at breakdown point 42%. The corresponding GES indices are $G_{2}=1.221$ for the shape and direction functionals and $G_{3}=1.744$ for the scale functional. Corresponding to this smaller regression GES index we observe a gain in the asymptotic efficiencies: $1/\lambda=0.960$, $1/\sigma_{1}=0.949$, and $1/(2k\sigma_{3})=0.979$, for the regression estimator, the estimators of shape and direction, and the scale estimator, respectively. Both GES indices $G_{2}$ and $G_{3}$ attain their minimal values at 50% breakdown, so no simultaneous gain in efficiency and smaller GES value $G_{1}$ can be achieved at the cost of a smaller breakdown point.

We conclude that at a moderate loss of breakdown point, from 50% to about 30%-40%, one can gain efficiency of the S-estimators and at the same time reduce the GES of the regression S-estimator. The improvement becomes smaller as the dimension increases.

Appendix A Proofs

Proof of Theorem 1.

Proof.

It can be seen that the projection matrix, as defined in (2.2), is given by

\[
\Pi_L=\mathbf{L}\left(\mathbf{L}^T\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\right)^{-1}\mathbf{L}^T\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right). \tag{A.1}
\]

Since $\mathbf{N}$ is of radial type with respect to $\mathbf{\Sigma}$, it follows from Corollary 1 in Tyler [26] that there exist constants $\eta$, $\sigma_1$ and $\sigma_2$ with $\sigma_1\geq 0$ and $\sigma_2\geq -2\sigma_1/k$, such that $\mathbb{E}[\mathbf{N}]=\eta\mathbf{\Sigma}$ and

\[
\text{var}\{\mathrm{vec}(\mathbf{N})\}=\sigma_1(\mathbf{I}_{k^2}+\mathbf{K}_{k,k})(\mathbf{\Sigma}\otimes\mathbf{\Sigma})+\sigma_2\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^T.
\]

Since $\mathbf{V}$ is linear, it holds that $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}$, so that $\Pi_L\mathrm{vec}(\mathbf{\Sigma})=\mathrm{vec}(\mathbf{\Sigma})$. It follows that $\mathbf{M}$ has expectation

\[
\mathbb{E}[\mathrm{vec}(\mathbf{M})]=\Pi_L\mathrm{vec}(\mathbb{E}[\mathbf{N}])=\eta\Pi_L\mathrm{vec}(\mathbf{\Sigma})=\eta\mathrm{vec}(\mathbf{\Sigma}),
\]

and variance

\[
\begin{split}
\text{var}(\mathrm{vec}(\mathbf{M}))
&=\Pi_L\text{var}(\mathrm{vec}(\mathbf{N}))\Pi_L^T\\
&=\sigma_1\Pi_L(\mathbf{I}_{k^2}+\mathbf{K}_{k,k})(\mathbf{\Sigma}\otimes\mathbf{\Sigma})\Pi_L^T+\sigma_2\Pi_L\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^T\Pi_L^T\\
&=\sigma_1\Pi_L(\mathbf{I}_{k^2}+\mathbf{K}_{k,k})(\mathbf{\Sigma}\otimes\mathbf{\Sigma})\Pi_L^T+\sigma_2\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^T.
\end{split}
\]

Note that

\[
\begin{split}
\mathbf{K}_{k,k}(\mathbf{A}\otimes\mathbf{B})&=(\mathbf{B}\otimes\mathbf{A})\mathbf{K}_{k,k}\\
\mathbf{K}_{k,k}\mathrm{vec}(\mathbf{A})&=\mathrm{vec}(\mathbf{A}^T),
\end{split}
\tag{A.2}
\]

e.g., see [18, Chapter 3, Section 7]. Since $\mathbf{\Sigma}=\mathbf{V}(\bm{\theta})$ is symmetric, also $\mathbf{L}_j=\partial\mathbf{V}/\partial\theta_j$ is symmetric, for $j=1,\ldots,\ell$. This means that $\mathbf{K}_{k,k}\mathbf{L}=\mathbf{L}$ and it follows that

\[
\begin{split}
\Pi_L(\mathbf{I}_{k^2}+\mathbf{K}_{k,k})(\mathbf{\Sigma}\otimes\mathbf{\Sigma})\Pi_L^T
&=\mathbf{L}\left(\mathbf{L}^T\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\right)^{-1}\mathbf{L}^T(\mathbf{I}_{k^2}+\mathbf{K}_{k,k})\Pi_L^T\\
&=2\mathbf{L}\left(\mathbf{L}^T\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\right)^{-1}\mathbf{L}^T\Pi_L^T\\
&=2\mathbf{L}\left(\mathbf{L}^T\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\right)^{-1}\mathbf{L}^T.
\end{split}
\]
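As a numerical sanity check of the identities used in this step, the following sketch builds the commutation matrix $\mathbf{K}_{k,k}$, the projection (A.1), and both sides of the last display for a small example. The basis matrices, the value of $\bm{\theta}$, and the helper names `vec` and `commutation_matrix` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into one vector (column-major order)."""
    return A.reshape(-1, order="F")

def commutation_matrix(k):
    """Return K_{k,k}, the k^2 x k^2 matrix with K vec(A) = vec(A^T)."""
    K = np.zeros((k * k, k * k))
    for i in range(k):
        for j in range(k):
            K[i + j * k, j + i * k] = 1.0
    return K

k = 3
# Hypothetical symmetric basis matrices L_1, L_2, L_3 of a linear structure.
E12 = np.zeros((k, k))
E12[0, 1] = E12[1, 0] = 1.0
basis = [np.eye(k), np.ones((k, k)), E12]
L = np.column_stack([vec(B) for B in basis])       # k^2 x l, full column rank
theta = np.array([2.0, 0.5, 0.3])                  # illustrative values
Sigma = sum(t * B for t, B in zip(theta, basis))   # positive definite for this theta

K = commutation_matrix(k)
Sinv = np.linalg.inv(Sigma)
Sinv2 = np.kron(Sinv, Sinv)
G = np.linalg.inv(L.T @ Sinv2 @ L)
Pi = L @ G @ L.T @ Sinv2                           # projection matrix (A.1)

rng = np.random.default_rng(0)
A = rng.normal(size=(k, k))
B = rng.normal(size=(k, k))

# Identities (A.2) and the consequence K L = L for symmetric L_j.
assert np.allclose(K @ np.kron(A, B), np.kron(B, A) @ K)
assert np.allclose(K @ vec(A), vec(A.T))
assert np.allclose(K @ L, L)

# Both sides of the final display.
lhs = Pi @ (np.eye(k * k) + K) @ np.kron(Sigma, Sigma) @ Pi.T
rhs = 2.0 * L @ G @ L.T
assert np.allclose(lhs, rhs)
```

The column-major `vec` matters here: with row-major flattening the Kronecker factors in (A.1) would have to be handled in a different convention.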

This finishes the proof of part (i). Since 𝐋𝐋\mathbf{L}bold_L has full rank, it holds that

\[
(\mathbf{L}^T\mathbf{L})^{-1}\mathbf{L}^T\mathrm{vec}(\mathbf{\Sigma})=(\mathbf{L}^T\mathbf{L})^{-1}\mathbf{L}^T\mathbf{L}\bm{\theta}=\bm{\theta}, \tag{A.3}
\]

and similarly $(\mathbf{L}^T\mathbf{L})^{-1}\mathbf{L}^T\mathrm{vec}(\mathbf{M})=\mathbf{T}$. This immediately gives

\[
\mathbb{E}[\mathbf{T}]=(\mathbf{L}^T\mathbf{L})^{-1}\mathbf{L}^T\mathbb{E}[\mathrm{vec}(\mathbf{M})]=\eta(\mathbf{L}^T\mathbf{L})^{-1}\mathbf{L}^T\mathrm{vec}(\mathbf{\Sigma})=\eta\bm{\theta},
\]

and

\[
\text{var}(\mathbf{T})=(\mathbf{L}^T\mathbf{L})^{-1}\mathbf{L}^T\text{var}\{\mathrm{vec}(\mathbf{M})\}\mathbf{L}(\mathbf{L}^T\mathbf{L})^{-1}.
\]

When we insert (1.3) and apply (A.3), the theorem follows. ∎
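The second part of the argument can be illustrated with a small numerical sketch. It takes the variance of $\mathrm{vec}(\mathbf{M})$ in the form obtained by combining the displays of part (i), namely $2\sigma_1\mathbf{L}(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L})^{-1}\mathbf{L}^T+\sigma_2\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^T$, and checks that the least squares map recovers $\bm{\theta}$ as in (A.3) and sends this variance to $2\sigma_1(\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L})^{-1}+\sigma_2\bm{\theta}\bm{\theta}^T$. The basis matrices and the values of $\bm{\theta}$, $\sigma_1$ and $\sigma_2$ are illustrative assumptions.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into one vector (column-major order)."""
    return A.reshape(-1, order="F")

k = 3
# Hypothetical symmetric basis matrices of a linear covariance structure.
E12 = np.zeros((k, k))
E12[0, 1] = E12[1, 0] = 1.0
basis = [np.eye(k), np.ones((k, k)), E12]
L = np.column_stack([vec(B) for B in basis])
theta = np.array([2.0, 0.5, 0.3])                  # illustrative values
Sigma = sum(t * B for t, B in zip(theta, basis))

Sinv = np.linalg.inv(Sigma)
G = np.linalg.inv(L.T @ np.kron(Sinv, Sinv) @ L)

# Least squares map (L^T L)^{-1} L^T, computed without an explicit inverse.
P = np.linalg.solve(L.T @ L, L.T)
theta_rec = P @ vec(Sigma)                         # left-hand side of (A.3)

# Illustrative constants with sigma1 >= 0 and sigma2 >= -2 sigma1 / k.
sigma1, sigma2 = 0.8, -0.1
v = vec(Sigma)
var_M = 2.0 * sigma1 * (L @ G @ L.T) + sigma2 * np.outer(v, v)
var_T = P @ var_M @ P.T
var_T_closed = 2.0 * sigma1 * G + sigma2 * np.outer(theta, theta)

assert np.allclose(theta_rec, theta)
assert np.allclose(var_T, var_T_closed)
```

The check relies only on $(\mathbf{L}^T\mathbf{L})^{-1}\mathbf{L}^T\mathbf{L}=\mathbf{I}_\ell$, which is why full column rank of $\mathbf{L}$ is needed.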

Proof of Theorem 2.

The proof follows the line of reasoning used in the proofs of Theorem 9.1 and Corollary 9.2 in Lopuhaä et al. [16] for S-estimators. These proofs are based on estimating equations (3.3) with $w_1(d)=\rho'(d)/d$, $w_2(d)=k\rho'(d)/d$ and $w_3(d)=\rho'(d)d-\rho(d)+b_0$, and require conditions (R1)-(R5) in [16] on the function $\rho$. For the proof of Theorem 2 these conditions have been reformulated into similar conditions (C1)-(C3) for general $w_1$, $w_2$, and $w_3$. Furthermore, in order to incorporate the case $w_1=w_2=w_3=1$ of Example 1, we have slightly adapted some of the boundedness conditions and use that

\[
d^2=(\mathbf{y}-\mathbf{X}\bm{\beta})^T\mathbf{V}^{-1}(\mathbf{y}-\mathbf{X}\bm{\beta})
\leq\frac{\|\mathbf{y}-\mathbf{X}\bm{\beta}\|^2}{\lambda_k(\mathbf{V})}
\leq\frac{(\|\mathbf{y}\|+\|\mathbf{X}\|\cdot\|\bm{\beta}\|)^2}{\lambda_k(\mathbf{V})}
\leq\frac{\|\mathbf{s}\|^2(1+\|\bm{\beta}\|)^2}{\lambda_k(\mathbf{V})}.
\]

This will ensure that $d^2$ is bounded by a multiple of $\|\mathbf{s}\|^2$ on a neighborhood of $\bm{\xi}_0$. In order to apply dominated convergence, we then require $\mathbb{E}\|\mathbf{s}\|^4<\infty$ in Theorem 2 instead of $\mathbb{E}\|\mathbf{X}\|^2<\infty$, which was sufficient for Corollary 9.2 in [16].
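The chain of inequalities for $d^2$ can be checked on random inputs. The sketch below is only an illustration and rests on assumptions not spelled out in this appendix: $\mathbf{s}=(\mathbf{y},\mathbf{X})$ with $\|\mathbf{s}\|^2=\|\mathbf{y}\|^2+\|\mathbf{X}\|^2$, the Frobenius norm for $\mathbf{X}$, and $\lambda_k(\mathbf{V})$ read as the smallest eigenvalue of $\mathbf{V}$. The dimensions and the function name `bound_chain` are hypothetical.

```python
import numpy as np

def bound_chain(y, X, beta, V):
    """Return d^2 and the three successive upper bounds from the display."""
    r = y - X @ beta
    d2 = r @ np.linalg.solve(V, r)
    lam_min = np.linalg.eigvalsh(V)[0]          # eigvalsh sorts ascending
    beta_norm = np.linalg.norm(beta)
    s_norm2 = y @ y + np.sum(X ** 2)            # assumed ||s||^2 for s = (y, X)
    b1 = (r @ r) / lam_min
    b2 = (np.linalg.norm(y) + np.linalg.norm(X) * beta_norm) ** 2 / lam_min
    b3 = s_norm2 * (1.0 + beta_norm) ** 2 / lam_min
    return d2, b1, b2, b3

rng = np.random.default_rng(1)
k, q = 4, 3                                     # hypothetical dimensions
tol = 1.0 + 1e-9                                # slack for rounding error
checks = []
for _ in range(200):
    y = rng.normal(size=k)
    X = rng.normal(size=(k, q))
    beta = rng.normal(size=q)
    A = rng.normal(size=(k, k))
    V = A @ A.T + 0.1 * np.eye(k)               # positive definite
    d2, b1, b2, b3 = bound_chain(y, X, beta, V)
    checks.append(d2 <= b1 * tol and b1 <= b2 * tol and b2 <= b3 * tol)

all_hold = all(checks)
assert all_hold
```

The last bound is what makes a moment condition on $\|\mathbf{s}\|$ alone sufficient, since $\bm{\beta}$ and $\lambda_k(\mathbf{V})$ stay bounded on a neighborhood of $\bm{\xi}_0$.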

Proof.

Define

\[
\Lambda(\bm{\xi})=\int\Psi(\mathbf{s},\bm{\xi})\,\mathrm{d}P(\mathbf{s}). \tag{A.4}
\]

From the properties of elliptically contoured densities, one has that $\mathbb{E}\left[\Psi(\mathbf{s},\bm{\xi}_0)\big|\mathbf{X}\right]=\mathbf{0}$, so that $\Lambda(\bm{\xi}_0)=\mathbf{0}$. Conditions (C1)-(C3) yield that $\Lambda$ is continuously differentiable and by application of empirical process theory (see e.g., Lemma 11.8 in [17] for the special case of S-estimators) one finds

\[
\begin{split}
\mathbf{0}
&=\int\Psi(\mathbf{s},\bm{\xi}_n)\,\mathrm{d}P(\mathbf{s})+\int\Psi(\mathbf{s},\bm{\xi}_0)\,\mathrm{d}(\mathbb{P}_n-P)(\mathbf{s})+o_P(n^{-1/2})\\
&=\Lambda'(\bm{\xi}_0)(\bm{\xi}_n-\bm{\xi}_0)+\frac{1}{n}\sum_{i=1}^n\Big\{\Psi(\mathbf{s}_i,\bm{\xi}_0)-\mathbb{E}[\Psi(\mathbf{s}_i,\bm{\xi}_0)]\Big\}+o_P(n^{-1/2}).
\end{split}
\tag{A.5}
\]

Similar to Lemma 8.3 in Lopuhaä et al. [16], we find that $\Lambda'(\bm{\xi}_0)$ is a block matrix. This implies that $\sqrt{n}(\bm{\beta}_n-\bm{\beta}_0)$ and $\sqrt{n}(\bm{\theta}_n-\bm{\theta}_0)$ are asymptotically independent and from (A.5) we obtain

\[
\begin{split}
\sqrt{n}(\mathrm{vec}(\mathbf{V}_n)-\mathrm{vec}(\mathbf{\Sigma}))
&=\mathbf{L}\sqrt{n}(\bm{\theta}_n-\bm{\theta}_0)\\
&=-\mathbf{L}\Lambda_{\bm{\theta}}'(\bm{\xi}_0)^{-1}\frac{1}{\sqrt{n}}\sum_{i=1}^n\Big\{\Psi_{\bm{\theta}}(\mathbf{s}_i,\bm{\xi}_0)-\mathbb{E}[\Psi_{\bm{\theta}}(\mathbf{s}_i,\bm{\xi}_0)]\Big\}+o_P(1),
\end{split}
\]

where $\Psi_{\bm{\theta}}$ is defined in (3.4), and

\[
\Lambda_{\bm{\theta}}'(\bm{\xi}_0)=\int\frac{\partial\Psi_{\bm{\theta}}(\mathbf{s},\bm{\xi}_0)}{\partial\bm{\theta}}\,\mathrm{d}P(\mathbf{s})
=\gamma_1\mathbf{L}^T\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}-\gamma_2\mathbf{L}^T\mathrm{vec}(\mathbf{\Sigma}^{-1})\mathrm{vec}(\mathbf{\Sigma}^{-1})^T\mathbf{L},
\]

where

\[
\begin{split}
\gamma_1&=\frac{\mathbb{E}_{\mathbf{0},\mathbf{I}_k}\left[w_2'(\|\mathbf{z}\|)\|\mathbf{z}\|^3+k(k+2)w_3(\|\mathbf{z}\|)\right]}{k(k+2)}\\
\gamma_2&=\frac{\mathbb{E}_{\mathbf{0},\mathbf{I}_k}\left[(k+2)w_3'(\|\mathbf{z}\|)\|\mathbf{z}\|-w_2'(\|\mathbf{z}\|)\|\mathbf{z}\|^3\right]}{2k(k+2)}.
\end{split}
\tag{A.6}
\]
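To make the constants in (A.6) concrete, the following sketch evaluates their sample versions from simulated radii $\|\mathbf{z}\|$. It assumes, purely for illustration, that $\mathbf{z}$ is standard normal, although $\mathbb{E}_{\mathbf{0},\mathbf{I}_k}$ in the paper refers to the elliptical model at hand. The function name `gammas` and the smooth weight functions in the second example are hypothetical and do not correspond to any estimator discussed in the paper. For $w_2=w_3=1$, as in Example 1, the formulas give $\gamma_1=1$ and $\gamma_2=0$. The final check confirms the algebraic identity for $\gamma_1-k\gamma_2$ that is used later in this proof.

```python
import numpy as np

def gammas(r, k, w2_prime, w3, w3_prime):
    """Sample versions of gamma_1 and gamma_2 in (A.6) from radii r = ||z||."""
    t2 = w2_prime(r) * r ** 3
    g1 = np.mean(t2 + k * (k + 2) * w3(r)) / (k * (k + 2))
    g2 = np.mean((k + 2) * w3_prime(r) * r - t2) / (2 * k * (k + 2))
    return g1, g2

rng = np.random.default_rng(2)
k = 5
z = rng.standard_normal((100_000, k))     # assumption: Gaussian z
r = np.linalg.norm(z, axis=1)

# Case w2 = w3 = 1: both derivatives vanish.
g1_const, g2_const = gammas(r, k, np.zeros_like, np.ones_like, np.zeros_like)
assert np.isclose(g1_const, 1.0) and np.isclose(g2_const, 0.0)

# Hypothetical smooth weights, chosen only to exercise the formulas.
c = 10.0
w2_prime = lambda d: -2.0 * d / c * np.exp(-d ** 2 / c)   # w2(d) = exp(-d^2/c)
w3 = lambda d: 1.0 / (1.0 + d ** 2)
w3_prime = lambda d: -2.0 * d / (1.0 + d ** 2) ** 2

g1, g2 = gammas(r, k, w2_prime, w3, w3_prime)
# gamma_1 - k gamma_2 = E[w2' r^3 + 2k w3 - k w3' r] / (2k), by linearity.
rhs = np.mean(w2_prime(r) * r ** 3 + 2 * k * w3(r) - k * w3_prime(r) * r) / (2 * k)
assert np.isclose(g1 - k * g2, rhs)
```

Because the identity is linear in the expectations, it holds exactly for sample means as well, so the check does not depend on the Monte Carlo sample size.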

First note that we can write

\[
\Psi_{\bm{\theta}}(\mathbf{s},\bm{\xi}_0)=\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathrm{vec}\left\{\Psi_{\mathbf{C}}(\mathbf{s},\bm{\xi}_0)\right\} \tag{A.7}
\]

where

\[
\Psi_{\mathbf{C}}(\mathbf{s},\bm{\xi}_0)=w_2(d_0)(\mathbf{y}-\mathbf{X}\bm{\beta}_0)(\mathbf{y}-\mathbf{X}\bm{\beta}_0)^T-w_3(d_0)\mathbf{\Sigma}, \tag{A.8}
\]

with $d_0^2=(\mathbf{y}-\mathbf{X}\bm{\beta}_0)^T\mathbf{\Sigma}^{-1}(\mathbf{y}-\mathbf{X}\bm{\beta}_0)$. Furthermore, similar to Lemma 11.5 in [17], we find that

\[
\Lambda_{\bm{\theta}}'(\bm{\xi}_0)^{-1}=a(\mathbf{E}^T\mathbf{E})^{-1}+b(\mathbf{E}^T\mathbf{E})^{-1}\mathbf{E}^T\mathrm{vec}(\mathbf{I}_k)\mathrm{vec}(\mathbf{I}_k)^T\mathbf{E}(\mathbf{E}^T\mathbf{E})^{-1},
\]

where $\mathbf{E}=\left(\mathbf{\Sigma}^{-1/2}\otimes\mathbf{\Sigma}^{-1/2}\right)\mathbf{L}$ and $a=1/\gamma_1$ and $b=\gamma_2/(\gamma_1(\gamma_1-k\gamma_2))$, with

\[
\gamma_1-k\gamma_2=\frac{1}{2k}\mathbb{E}_{\mathbf{0},\mathbf{I}_k}\Big[w_2'(\|\mathbf{z}\|)\|\mathbf{z}\|^3+2kw_3(\|\mathbf{z}\|)-kw_3'(\|\mathbf{z}\|)\|\mathbf{z}\|\Big]\neq 0.
\]
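The stated inverse is a rank-one update of $(\mathbf{E}^T\mathbf{E})^{-1}$ and can be verified numerically. In the sketch below, the basis matrices, $\bm{\theta}_0$, and the values of $\gamma_1$ and $\gamma_2$ are illustrative assumptions, chosen so that $\gamma_1\neq 0$ and $\gamma_1-k\gamma_2\neq 0$. The matrix `Lam` is built from the expression for $\Lambda_{\bm{\theta}}'(\bm{\xi}_0)$ given above and multiplied by the claimed inverse.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into one vector (column-major order)."""
    return A.reshape(-1, order="F")

k = 3
# Hypothetical symmetric basis matrices of a linear covariance structure.
E12 = np.zeros((k, k))
E12[0, 1] = E12[1, 0] = 1.0
basis = [np.eye(k), np.ones((k, k)), E12]
L = np.column_stack([vec(B) for B in basis])
theta0 = np.array([2.0, 0.5, 0.3])                  # illustrative values
Sigma = sum(t * B for t, B in zip(theta0, basis))
Sinv = np.linalg.inv(Sigma)

# Symmetric inverse square root of Sigma via its eigendecomposition.
evals, evecs = np.linalg.eigh(Sigma)
S_mhalf = evecs @ np.diag(evals ** -0.5) @ evecs.T
E = np.kron(S_mhalf, S_mhalf) @ L

gamma1, gamma2 = 0.9, 0.1                           # illustrative constants
w = L.T @ vec(Sinv)
Lam = gamma1 * (L.T @ np.kron(Sinv, Sinv) @ L) - gamma2 * np.outer(w, w)

a = 1.0 / gamma1
b = gamma2 / (gamma1 * (gamma1 - k * gamma2))
EtE_inv = np.linalg.inv(E.T @ E)
u = EtE_inv @ E.T @ vec(np.eye(k))                  # equals theta0, as vec(Sigma) = L theta0
Lam_inv = a * EtE_inv + b * np.outer(u, u)

assert np.allclose(Lam @ Lam_inv, np.eye(L.shape[1]))
assert np.allclose(u, theta0)
```

The value of $b$ follows from the Sherman-Morrison formula together with $\mathrm{vec}(\mathbf{I}_k)^T\mathbf{E}(\mathbf{E}^T\mathbf{E})^{-1}\mathbf{E}^T\mathrm{vec}(\mathbf{I}_k)=k$, which holds because $\mathrm{vec}(\mathbf{I}_k)=\mathbf{E}\bm{\theta}_0$ lies in the column space of $\mathbf{E}$.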

This means that $\mathbf{E}^T\mathbf{E}=\mathbf{L}^T(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}$ and since $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}_0$, we have

\[
(\mathbf{E}^{T}\mathbf{E})^{-1}\mathbf{E}^{T}\mathrm{vec}(\mathbf{I}_{k})=\left(\mathbf{L}^{T}(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\right)^{-1}\mathbf{L}^{T}(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathrm{vec}(\mathbf{\Sigma})=\bm{\theta}_{0},
\]

so that $\Lambda_{\bm{\theta}}^{\prime}(\bm{\xi}_{0})^{-1}=a\left(\mathbf{L}^{T}(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L}\right)^{-1}+b\,\bm{\theta}_{0}\bm{\theta}_{0}^{T}$. Furthermore, since $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}_{0}$, together with

\[
\mathrm{vec}(\mathbf{A}\mathbf{B}\mathbf{C})=(\mathbf{C}^{T}\otimes\mathbf{A})\mathrm{vec}(\mathbf{B}) \tag{A.9}
\]

see, e.g., [18, Chapter 2, Section 4], and $\Pi_{L}$ as in (A.1), we find

\[
\begin{split}
\mathbf{L}\Lambda_{\bm{\theta}}^{\prime}(\bm{\xi}_{0})^{-1}\mathbf{L}^{T}(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})
&=a\Pi_{L}+b\,\mathbf{L}\bm{\theta}_{0}\bm{\theta}_{0}^{T}\mathbf{L}^{T}(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\\
&=a\Pi_{L}+b\,\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma}^{-1})^{T}.
\end{split}
\]
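The matrix identities used up to this point can be checked numerically. The following is a minimal sketch, not part of the original proof: the compound-symmetry structure $\mathbf{\Sigma}=\theta_{1}\mathbf{I}_{k}+\theta_{2}\mathbf{1}\mathbf{1}^{T}$, the values of `theta0`, `a`, and `b`, and the explicit form $\Pi_{L}=\mathbf{L}(\mathbf{L}^{T}(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathbf{L})^{-1}\mathbf{L}^{T}(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})$ are assumptions made for illustration only, since (A.1) is not restated here.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into a single vector (column-major vec)."""
    return M.reshape(-1, order="F")

rng = np.random.default_rng(0)
k = 3

# (A.9): vec(ABC) = (C^T kron A) vec(B), checked on random matrices.
A, B, C = (rng.standard_normal((k, k)) for _ in range(3))
lhs_a9 = vec(A @ B @ C)
rhs_a9 = np.kron(C.T, A) @ vec(B)

# Hypothetical linear structure: Sigma = theta_1 * I + theta_2 * 1 1^T.
basis = [np.eye(k), np.ones((k, k))]
L = np.column_stack([vec(M) for M in basis])      # k^2 x 2 design matrix
theta0 = np.array([1.5, 0.4])                     # illustrative values
Sigma = (L @ theta0).reshape(k, k, order="F")
Sigma_inv = np.linalg.inv(Sigma)
W = np.kron(Sigma_inv, Sigma_inv)                 # Sigma^{-1} kron Sigma^{-1}

# E = (Sigma^{-1/2} kron Sigma^{-1/2}) L with the symmetric square root.
evals, evecs = np.linalg.eigh(Sigma)
Sigma_inv_half = evecs @ np.diag(evals ** -0.5) @ evecs.T
E = np.kron(Sigma_inv_half, Sigma_inv_half) @ L

# (E^T E)^{-1} E^T vec(I_k) should recover theta0.
theta_rec = np.linalg.solve(E.T @ E, E.T @ vec(np.eye(k)))

# Assumed form of the projection Pi_L (see lead-in).
Pi_L = L @ np.linalg.solve(L.T @ W @ L, L.T @ W)

# L Lambda'^{-1} L^T W = a Pi_L + b vec(Sigma) vec(Sigma^{-1})^T,
# with arbitrary illustrative constants a and b.
a, b = 2.0, -0.3
Lambda_inv = a * np.linalg.inv(L.T @ W @ L) + b * np.outer(theta0, theta0)
lhs_proj = L @ Lambda_inv @ L.T @ W
rhs_proj = a * Pi_L + b * np.outer(vec(Sigma), vec(Sigma_inv))

print(np.allclose(lhs_a9, rhs_a9),
      np.allclose(theta_rec, theta0),
      np.allclose(lhs_proj, rhs_proj))
```

The check uses the column-major `vec`, which is the convention under which (A.9) holds with `np.kron`.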

It follows that

\[
\begin{split}
\mathbf{L}\Lambda_{\bm{\theta}}^{\prime}(\bm{\xi}_{0})^{-1}\Psi_{\bm{\theta}}(\mathbf{s},\bm{\xi}_{0})
&=\mathbf{L}\Lambda_{\bm{\theta}}^{\prime}(\bm{\xi}_{0})^{-1}\mathbf{L}^{T}(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1})\mathrm{vec}\left\{\Psi_{\mathbf{C}}(\mathbf{s},\bm{\xi}_{0})\right\}\\
&=a\Pi_{L}\mathrm{vec}\left\{\Psi_{\mathbf{C}}(\mathbf{s},\bm{\xi}_{0})\right\}+b\,\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma}^{-1})^{T}\mathrm{vec}\left\{\Psi_{\mathbf{C}}(\mathbf{s},\bm{\xi}_{0})\right\}\\
&=a\Pi_{L}\mathrm{vec}\left\{\Psi_{\mathbf{C}}(\mathbf{s},\bm{\xi}_{0})\right\}+b\,\mathrm{vec}(\mathbf{\Sigma})\,\mathrm{tr}\left\{\mathbf{\Sigma}^{-1}\Psi_{\mathbf{C}}(\mathbf{s},\bm{\xi}_{0})\right\}\\
&=a\Pi_{L}\mathrm{vec}\left\{\Psi_{\mathbf{C}}(\mathbf{s},\bm{\xi}_{0})\right\}+b\,\mathrm{vec}(\mathbf{\Sigma})\left(w_{2}(d_{0})d_{0}^{2}-kw_{3}(d_{0})\right).
\end{split}
\]

Because $\Pi_{L}\mathrm{vec}(\mathbf{\Sigma})=\mathrm{vec}(\mathbf{\Sigma})$, we conclude that

\[
\mathbf{L}\Lambda_{\bm{\theta}}^{\prime}(\bm{\xi}_{0})^{-1}\Psi_{\bm{\theta}}(\mathbf{s},\bm{\xi}_{0})=\Pi_{L}\mathrm{vec}\left\{\Psi_{\mathbf{N}}(\mathbf{s},\bm{\xi}_{0})\right\},
\]

where

\[
\Psi_{\mathbf{N}}(\mathbf{s},\bm{\xi})=v_{1}(d)(\mathbf{y}-\mathbf{X}\bm{\beta})(\mathbf{y}-\mathbf{X}\bm{\beta})^{T}-v_{2}(d)\mathbf{\Sigma}, \tag{A.10}
\]

with $d^{2}=(\mathbf{y}-\mathbf{X}\bm{\beta})^{T}\mathbf{V}^{-1}(\mathbf{y}-\mathbf{X}\bm{\beta})$ and

\[
\begin{split}
v_{1}(s)&=aw_{2}(s)\\
v_{2}(s)&=-bw_{2}(s)s^{2}+(a+bk)w_{3}(s).
\end{split} \tag{A.11}
\]
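The coefficients in (A.11) can be recovered by collecting terms. The following worked step is a sketch under one assumption: that $\Psi_{\mathbf{C}}$ in (A.8), which is not restated in this part of the proof, has the form $w_{2}(d)\mathbf{e}\mathbf{e}^{T}-w_{3}(d)\mathbf{\Sigma}$, where $\mathbf{e}=\mathbf{y}-\mathbf{X}\bm{\beta}$ is shorthand introduced here. This form is consistent with the trace $w_{2}(d_{0})d_{0}^{2}-kw_{3}(d_{0})$ obtained above. Using $\Pi_{L}\mathrm{vec}(\mathbf{\Sigma})=\mathrm{vec}(\mathbf{\Sigma})$, the scalar multiple of $\mathrm{vec}(\mathbf{\Sigma})$ can be moved inside the projection, so that at $\bm{\xi}_{0}$:

```latex
\begin{align*}
a\,\Psi_{\mathbf{C}}+b\left(w_{2}(d_{0})d_{0}^{2}-kw_{3}(d_{0})\right)\mathbf{\Sigma}
&=a\,w_{2}(d_{0})\,\mathbf{e}\mathbf{e}^{T}
  -a\,w_{3}(d_{0})\,\mathbf{\Sigma}
  +b\left(w_{2}(d_{0})d_{0}^{2}-kw_{3}(d_{0})\right)\mathbf{\Sigma}\\
&=\underbrace{a\,w_{2}(d_{0})}_{v_{1}(d_{0})}\,\mathbf{e}\mathbf{e}^{T}
  -\underbrace{\left(-b\,w_{2}(d_{0})d_{0}^{2}+(a+bk)\,w_{3}(d_{0})\right)}_{v_{2}(d_{0})}\mathbf{\Sigma}.
\end{align*}
```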

Hence, if we define

\[
\mathbf{N}_{n}=\frac{1}{n}\sum_{i=1}^{n}\Psi_{\mathbf{N}}(\mathbf{s}_{i},\bm{\xi}_{0}), \tag{A.12}
\]

with $\Psi_{\mathbf{N}}$ defined in (A.10), then it follows that

\[
\sqrt{n}\left(\mathrm{vec}(\mathbf{V}_{n})-\mathrm{vec}(\mathbf{\Sigma})\right)=\Pi_{L}\mathrm{vec}\left\{\sqrt{n}(\mathbf{N}_{n}-\mathbb{E}[\mathbf{N}_{n}])\right\}+o_{P}(1).
\]

This proves the first statement in Theorem 2.

To prove the second statement, note that from $\Lambda(\bm{\xi}_{0})=\mathbf{0}$, together with (A.7) and (A.9), it follows that

\[
\begin{split}
0&=\bm{\theta}_{0}^{T}\mathbb{E}\left[\Psi_{\bm{\theta}}(\mathbf{s},\bm{\xi}_{0})\right]=\mathbb{E}\left[\mathrm{vec}(\mathbf{\Sigma}^{-1})^{T}\mathrm{vec}\left\{\Psi_{\mathbf{C}}(\mathbf{s},\bm{\xi}_{0})\right\}\right]\\
&=\mathbb{E}\left[\mathrm{tr}\left(\mathbf{\Sigma}^{-1}\Psi_{\mathbf{C}}(\mathbf{s},\bm{\xi}_{0})\right)\right]=\mathbb{E}\left[w_{2}(d_{0})d_{0}^{2}-kw_{3}(d_{0})\right],
\end{split}
\]

where $\Psi_{\mathbf{C}}$ is defined in (A.8). Then, from the properties of elliptically contoured densities, together with (A.11), one finds $\mathbb{E}[\Psi_{\mathbf{N}}(\mathbf{s},\bm{\xi}_{0})]=\mathbf{0}$. This means that $\sqrt{n}(\mathbf{N}_{n}-\mathbb{E}[\mathbf{N}_{n}])$ is asymptotically normal with mean zero and variance

\[
\mathbb{E}\left[\mathrm{vec}(\Psi_{\mathbf{N}}(\mathbf{s},\bm{\xi}_{0}))\mathrm{vec}(\Psi_{\mathbf{N}}(\mathbf{s},\bm{\xi}_{0}))^{T}\right]=\mathbb{E}\left[\mathbb{E}\left[\mathrm{vec}(\Psi_{\mathbf{N}}(\mathbf{s},\bm{\xi}_{0}))\mathrm{vec}(\Psi_{\mathbf{N}}(\mathbf{s},\bm{\xi}_{0}))^{T}\,\Big|\,\mathbf{X}\right]\right].
\]

The inner expectation on the right-hand side is taken with respect to the conditional distribution of $\mathbf{y}\mid\mathbf{X}$, which is the same as the distribution of $\mathbf{\Sigma}^{1/2}\mathbf{z}+\bm{\mu}$, where $\mathbf{z}$ has a spherical density $f_{\mathbf{0},\mathbf{I}_{k}}(\mathbf{z})=g(\|\mathbf{z}\|^{2})$. This implies that

\[
\begin{split}
&\mathbb{E}\left[\mathrm{vec}\left(\mathbf{\Sigma}^{-1/2}\Psi_{\mathbf{N}}(\mathbf{s},\bm{\xi}_{0})\mathbf{\Sigma}^{-1/2}\right)\mathrm{vec}\left(\mathbf{\Sigma}^{-1/2}\Psi_{\mathbf{N}}(\mathbf{s},\bm{\xi}_{0})\mathbf{\Sigma}^{-1/2}\right)^{T}\right]\\
&=\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{1}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}\right]\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\mathrm{vec}\left(\mathbf{u}\mathbf{u}^{T}\right)\mathrm{vec}\left(\mathbf{u}\mathbf{u}^{T}\right)^{T}\right]\\
&\quad-\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{1}(\|\mathbf{z}\|)v_{2}(\|\mathbf{z}\|)\|\mathbf{z}\|^{2}\right]\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\mathrm{vec}\left(\mathbf{u}\mathbf{u}^{T}\right)\mathrm{vec}\left(\mathbf{I}_{k}\right)^{T}\right]\\
&\quad-\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{1}(\|\mathbf{z}\|)v_{2}(\|\mathbf{z}\|)\|\mathbf{z}\|^{2}\right]\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\mathrm{vec}\left(\mathbf{I}_{k}\right)\mathrm{vec}\left(\mathbf{u}\mathbf{u}^{T}\right)^{T}\right]\\
&\quad+\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{2}(\|\mathbf{z}\|)^{2}\right]\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\mathrm{vec}\left(\mathbf{I}_{k}\right)\mathrm{vec}\left(\mathbf{I}_{k}\right)^{T}\right],
\end{split}
\]

where $\mathbf{u}=\mathbf{z}/\|\mathbf{z}\|$. From Lemma 5.1 in [13], we have

\[
\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\mathrm{vec}(\mathbf{u}\mathbf{u}^{T})\mathrm{vec}(\mathbf{u}\mathbf{u}^{T})^{T}\right]=\sigma_{1}(\mathbf{I}_{k^{2}}+\mathbf{K}_{k,k})+\sigma_{2}\mathrm{vec}(\mathbf{I}_{k})\mathrm{vec}(\mathbf{I}_{k})^{T},
\]

where $\sigma_{1}=\sigma_{2}=(k(k+2))^{-1}$. Hence, the first term on the right-hand side is equal to

\[
\frac{\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{1}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}\right]}{k(k+2)}\left(\mathbf{I}_{k^{2}}+\mathbf{K}_{k,k}+\mathrm{vec}(\mathbf{I}_{k})\mathrm{vec}(\mathbf{I}_{k})^{T}\right).
\]
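The fourth-moment identity quoted from Lemma 5.1 can be illustrated by simulation. The sketch below is a Monte Carlo check, not a proof; the dimension `k`, the sample size `n`, and the seed are arbitrary choices. It draws directions $\mathbf{u}$ uniformly on the unit sphere by normalizing standard normal vectors and compares the empirical matrix with $(k(k+2))^{-1}\left(\mathbf{I}_{k^{2}}+\mathbf{K}_{k,k}+\mathrm{vec}(\mathbf{I}_{k})\mathrm{vec}(\mathbf{I}_{k})^{T}\right)$.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 3, 200_000   # illustrative dimension and Monte Carlo sample size

# Uniform directions on the unit sphere: u = z / ||z|| with z ~ N(0, I_k).
z = rng.standard_normal((n, k))
u = z / np.linalg.norm(z, axis=1, keepdims=True)

# Row i holds vec(u_i u_i^T); u u^T is symmetric, so the stacking
# order (row-major or column-major) does not matter here.
vec_uu = (u[:, :, None] * u[:, None, :]).reshape(n, k * k)
fourth_moment_mc = vec_uu.T @ vec_uu / n

# Commutation matrix K_{k,k}, defined by K vec(A) = vec(A^T)
# for the column-major vec operator.
K = np.zeros((k * k, k * k))
for i in range(k):
    for j in range(k):
        K[j * k + i, i * k + j] = 1.0

vec_I = np.eye(k).reshape(-1)
fourth_moment_theory = (
    np.eye(k * k) + K + np.outer(vec_I, vec_I)
) / (k * (k + 2))

# Largest entrywise deviation between simulation and the closed form.
max_abs_error = np.max(np.abs(fourth_moment_mc - fourth_moment_theory))
print(max_abs_error)
```

The deviation shrinks at the usual Monte Carlo rate of order $n^{-1/2}$, so it should be read as a sanity check of the constants rather than as an exact verification.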

This leads to one term $\mathbf{I}_{k^{2}}+\mathbf{K}_{k,k}$ with coefficient

\[
\frac{\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{1}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}\right]}{k(k+2)}
\]

and using that, according to Lemma 11.4 in [17], $\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\mathbf{u}\mathbf{u}^{T}\right]=(1/k)\mathbf{I}_{k}$, we find a second term $\mathrm{vec}(\mathbf{I}_{k})\mathrm{vec}(\mathbf{I}_{k})^{T}$ with coefficient

\[
\frac{\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{1}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}\right]}{k(k+2)}-\frac{2\,\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{1}(\|\mathbf{z}\|)v_{2}(\|\mathbf{z}\|)\|\mathbf{z}\|^{2}\right]}{k}+\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{2}(\|\mathbf{z}\|)^{2}\right].
\]

This means that

\[
\begin{split}
&\mathbb{E}\left[\mathrm{vec}\left(\mathbf{\Sigma}^{-1/2}\Psi_{\mathbf{N}}(\mathbf{s},\bm{\xi}_{0})\mathbf{\Sigma}^{-1/2}\right)\mathrm{vec}\left(\mathbf{\Sigma}^{-1/2}\Psi_{\mathbf{N}}(\mathbf{s},\bm{\xi}_{0})\mathbf{\Sigma}^{-1/2}\right)^{T}\right]\\
&=\sigma_{1}\left(\mathbf{I}_{k^{2}}+\mathbf{K}_{k,k}\right)+\sigma_{2}\mathrm{vec}(\mathbf{I}_{k})\mathrm{vec}(\mathbf{I}_{k})^{T},
\end{split}
\]

where

\[
\begin{split}
\sigma_{1}&=\frac{\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{1}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}\right]}{k(k+2)}\\
\sigma_{2}&=\sigma_{1}-\frac{2\,\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{1}(\|\mathbf{z}\|)v_{2}(\|\mathbf{z}\|)\|\mathbf{z}\|^{2}\right]}{k}+\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{2}(\|\mathbf{z}\|)^{2}\right],
\end{split} \tag{A.13}
\]

or equivalently

\[
\mathbb{E}\left[\mathrm{vec}(\Psi_{\mathbf{N}}(\mathbf{s},\bm{\xi}_{0}))\mathrm{vec}(\Psi_{\mathbf{N}}(\mathbf{s},\bm{\xi}_{0}))^{T}\right]=\sigma_{1}\left(\mathbf{I}_{k^{2}}+\mathbf{K}_{k,k}\right)(\mathbf{\Sigma}\otimes\mathbf{\Sigma})+\sigma_{2}\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^{T}.
\]

We rewrite $\sigma_{2}$:

\[
\begin{split}
\sigma_{2}&=-\frac{2\sigma_{1}}{k}+\frac{(k+2)\sigma_{1}}{k}-\frac{2\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{1}(\|\mathbf{z}\|)v_{2}(\|\mathbf{z}\|)\|\mathbf{z}\|^{2}\right]}{k}+\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{2}(\|\mathbf{z}\|)^{2}\right]\\
&=-\frac{2\sigma_{1}}{k}+\frac{\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{1}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}\right]}{k^{2}}-\frac{2k\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{1}(\|\mathbf{z}\|)v_{2}(\|\mathbf{z}\|)\|\mathbf{z}\|^{2}\right]}{k^{2}}+\frac{k^{2}\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[v_{2}(\|\mathbf{z}\|)^{2}\right]}{k^{2}}\\
&=-\frac{2\sigma_{1}}{k}+\frac{1}{k^{2}}\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\Big(v_{1}(\|\mathbf{z}\|)\|\mathbf{z}\|^{2}-kv_{2}(\|\mathbf{z}\|)\Big)^{2}\right].
\end{split}
\]
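The completed-square form of $\sigma_{2}$ is a purely algebraic consequence of (A.13), so it holds for any expectation operator, including a sample average. The following Python sketch checks this numerically. The weight functions `v1` and `v2`, the dimension `k = 3`, and the standard normal sample are hypothetical choices made only for illustration; they are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 3
z = rng.standard_normal((100_000, k))   # illustrative sample, z ~ N(0, I_k)
s = np.linalg.norm(z, axis=1)           # radii ||z||

# Hypothetical radial weight functions, chosen only for this check.
def v1(s):
    return 1.0 / (1.0 + s**2)

def v2(s):
    return np.exp(-s)

def mean(x):
    """Sample average standing in for the expectation E_{0,I_k}."""
    return float(np.mean(x))

sigma1 = mean(v1(s)**2 * s**4) / (k * (k + 2))

# sigma_2 as defined in (A.13).
sigma2_def = sigma1 - 2.0 * mean(v1(s) * v2(s) * s**2) / k + mean(v2(s)**2)

# sigma_2 after completing the square.
sigma2_sq = -2.0 * sigma1 / k + mean((v1(s) * s**2 - k * v2(s))**2) / k**2

print(sigma2_def, sigma2_sq)
```

Because the identity is algebraic, the two printed values should agree up to floating-point rounding for any sample and any choice of `v1` and `v2`.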

Furthermore

\[
v_{1}(s)s^{2}-kv_{2}(s)=(a+bk)\left\{w_{2}(s)s^{2}-kw_{3}(s)\right\},
\]

where

\[
a+kb=a+\frac{a\gamma_{2}}{\gamma_{1}-k\gamma_{2}}=\frac{1}{\gamma_{1}-k\gamma_{2}}
\]

where $\gamma_{1}$ and $\gamma_{2}$ are defined in (A.6), which yields

\[
\gamma_{1}-k\gamma_{2}=\frac{1}{2k}\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\Big[w_{2}^{\prime}(\|\mathbf{z}\|)\|\mathbf{z}\|^{3}+2kw_{3}(\|\mathbf{z}\|)-kw_{3}^{\prime}(\|\mathbf{z}\|)\|\mathbf{z}\|\Big].
\]

It follows that

\[
\begin{split}
\sigma_{1}&=\frac{k(k+2)\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[w_{2}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}\right]}{\Big(\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\Big[w_{2}^{\prime}(\|\mathbf{z}\|)\|\mathbf{z}\|^{3}+k(k+2)w_{3}(\|\mathbf{z}\|)\Big]\Big)^{2}}\\
\sigma_{2}&=-\frac{2}{k}\sigma_{1}+\frac{4\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\Big(w_{2}(\|\mathbf{z}\|)\|\mathbf{z}\|^{2}-kw_{3}(\|\mathbf{z}\|)\Big)^{2}\right]}{\left(\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\Big[w_{2}^{\prime}(\|\mathbf{z}\|)\|\mathbf{z}\|^{3}+2kw_{3}(\|\mathbf{z}\|)-kw_{3}^{\prime}(\|\mathbf{z}\|)\|\mathbf{z}\|\Big]\right)^{2}}.
\end{split}
\]

This proves the theorem. ∎

Proof of Theorem 3.

Proof.

Let $H:\mathbb{R}^{k\times k}\to\mathbb{R}^{m}$ and let

\[
H^{\prime}(\mathbf{V})=\frac{\partial H(\mathbf{V})}{\partial\mathrm{vec}(\mathbf{V})^{T}}=\left(\frac{\partial H_{i}(\mathbf{V})}{\partial v_{st}}\right)_{i=1,\ldots,m;\,s,t=1,\ldots,k}
\tag{A.14}
\]

be the $m\times k^{2}$ matrix of partial derivatives. According to the delta method, $\sqrt{n}(H(\mathbf{V}_{n})-H(\mathbf{\Sigma}))$ is asymptotically normal with mean zero and variance $H^{\prime}(\mathbf{\Sigma})\,\mathrm{var}\{\mathrm{vec}(\mathbf{M})\}\,H^{\prime}(\mathbf{\Sigma})^{T}$. Because $H$ is continuously differentiable and satisfies (4.1), it follows that

\[
\sum_{j=1}^{l}v_{j}\frac{\partial H(\mathbf{v})}{\partial v_{j}}=\mathbf{0}.
\tag{A.15}
\]

This means that $H^{\prime}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})=\mathbf{0}$. Then, after inserting (1.3) for $\mathrm{var}\{\mathrm{vec}(\mathbf{M})\}$, this finishes the proof of part (i).

For part (ii), let $H:\mathbb{R}^{\ell}\to\mathbb{R}^{m}$, and let

\[
H^{\prime}(\bm{\theta})=\frac{\partial H(\bm{\theta})}{\partial\bm{\theta}^{T}}=\left(\frac{\partial H_{i}(\bm{\theta})}{\partial\theta_{j}}\right)_{i=1,\ldots,m;\,j=1,\ldots,\ell}
\tag{A.16}
\]

be the $m\times\ell$ matrix of partial derivatives. According to the delta method, $\sqrt{n}(H(\bm{\theta}_{n})-H(\bm{\theta}_{0}))$ is asymptotically normal with mean zero and variance

\[
H^{\prime}(\bm{\theta}_{0})\left\{2\sigma_{1}\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}+\sigma_{2}\bm{\theta}_{0}\bm{\theta}_{0}^{T}\right\}H^{\prime}(\bm{\theta}_{0})^{T}.
\]

Because $H$ satisfies (4.1) and (A.15), it follows immediately that $H^{\prime}(\bm{\theta}_{0})\bm{\theta}_{0}=\mathbf{0}$. This finishes the proof of part (ii). ∎
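As a numerical illustration of part (i), the sketch below takes the shape map $H(\mathbf{V})=\mathbf{V}/\det(\mathbf{V})^{1/k}$, approximates $H^{\prime}(\mathbf{\Sigma})$ by central differences, and checks that $H^{\prime}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})$ vanishes, so that the $\sigma_{2}$ term drops out of the delta-method variance. Two assumptions are made here and are not taken from the visible text: that (4.1) expresses scale invariance, $H(\lambda\mathbf{V})=H(\mathbf{V})$ for $\lambda>0$, and that $\mathrm{var}\{\mathrm{vec}(\mathbf{M})\}$ has the form $\sigma_{1}(\mathbf{I}_{k^{2}}+\mathbf{K}_{k,k})(\mathbf{\Sigma}\otimes\mathbf{\Sigma})+\sigma_{2}\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^{T}$. The matrix `Sigma` and the values of `sigma1` and `sigma2` are arbitrary.

```python
import numpy as np

def vec(A):
    """Stack the columns of A (column-major vec)."""
    return A.reshape(-1, order="F")

def shape_map(V):
    """Scale-invariant map H(V) = vec(V / det(V)^(1/k))."""
    k = V.shape[0]
    return vec(V / np.linalg.det(V) ** (1.0 / k))

def jacobian(H, V, eps=1e-6):
    """Central-difference approximation of dH / d vec(V)^T."""
    k = V.shape[0]
    v0 = vec(V)
    cols = []
    for j in range(v0.size):
        e = np.zeros_like(v0)
        e[j] = eps
        plus = H((v0 + e).reshape(k, k, order="F"))
        minus = H((v0 - e).reshape(k, k, order="F"))
        cols.append((plus - minus) / (2.0 * eps))
    return np.column_stack(cols)

def commutation(k):
    """Commutation matrix K with K vec(A) = vec(A^T)."""
    K = np.zeros((k * k, k * k))
    for i in range(k):
        for j in range(k):
            K[j * k + i, i * k + j] = 1.0
    return K

Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
k = Sigma.shape[0]
sigma1, sigma2 = 1.3, 0.7    # arbitrary illustrative values

J = jacobian(shape_map, Sigma)
euler = J @ vec(Sigma)       # Euler-type identity (A.15): should vanish

# Assumed form of var{vec(M)}: radial part plus rank-one part.
core = sigma1 * (np.eye(k * k) + commutation(k)) @ np.kron(Sigma, Sigma)
full = core + sigma2 * np.outer(vec(Sigma), vec(Sigma))
var_full = J @ full @ J.T
var_reduced = J @ core @ J.T

print(np.max(np.abs(euler)))
print(np.max(np.abs(var_full - var_reduced)))
```

Both printed quantities should be zero up to finite-difference and rounding error, which is the numerical counterpart of the statement that only $\sigma_{1}$ enters the limiting variance of a scale-invariant map.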

Proof of Lemma 1

Proof.

We apply Lemma 1 in [5]. Although the lemma is established for the $N_{k}(\bm{\mu},\mathbf{\Sigma})$ distribution, the proof holds for any distribution with an elliptically contoured density. According to [5], there exist two functions $\alpha_{C},\beta_{C}:[0,\infty)\to\mathbb{R}$, such that

\[
\mathrm{IF}(\mathbf{y};\mathbf{C},P_{\bm{\mu},\mathbf{\Sigma}})=\alpha_{C}(d(\mathbf{y}))(\mathbf{y}-\bm{\mu})(\mathbf{y}-\bm{\mu})^{T}-\beta_{C}(d(\mathbf{y}))\mathbf{\Sigma}.
\tag{A.17}
\]

We have that

\[
\begin{split}
\mathrm{IF}(\mathbf{y};\mathrm{vec}(\mathbf{M}),P_{\bm{\mu},\mathbf{\Sigma}})&=\lim_{h\downarrow 0}\frac{\mathrm{vec}(\mathbf{M}((1-h)P_{\bm{\mu},\mathbf{\Sigma}}+h\delta_{\mathbf{y}}))-\mathrm{vec}(\mathbf{M})(P_{\bm{\mu},\mathbf{\Sigma}})}{h}\\
&=\Pi_{L}\lim_{h\downarrow 0}\frac{\mathrm{vec}(\mathbf{C}((1-h)P_{\bm{\mu},\mathbf{\Sigma}}+h\delta_{\mathbf{y}}))-\mathrm{vec}(\mathbf{C})(P_{\bm{\mu},\mathbf{\Sigma}})}{h}\\
&=\Pi_{L}\mathrm{vec}\left(\mathrm{IF}(\mathbf{y};\mathbf{C},P_{\bm{\mu},\mathbf{\Sigma}})\right).
\end{split}
\]

Since $\mathbf{V}$ is linear, it holds that $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}_{0}$ and because $\Pi_{L}$ is the projection on the column space of $\mathbf{L}$, it follows that $\Pi_{L}\mathrm{vec}(\mathbf{\Sigma})=\mathrm{vec}(\mathbf{\Sigma})$. When we insert the expression (A.1) for $\Pi_{L}$, together with (A.17) and the fact that $(\mathbf{B}^{T}\otimes\mathbf{A})\mathrm{vec}(\mathbf{v}\mathbf{v}^{T})=\mathrm{vec}(\mathbf{A}\mathbf{v}\mathbf{v}^{T}\mathbf{B})$ according to (A.9), this finishes the proof of part (i). Since $\mathbf{L}$ has full column rank, $(\mathbf{L}^{T}\mathbf{L})^{-1}\mathbf{L}^{T}\mathrm{vec}(\mathbf{M}(P))=\bm{\theta}(P)$, which yields

\[
\mathrm{IF}(\mathbf{y};\bm{\theta},P_{\bm{\mu},\mathbf{\Sigma}})=(\mathbf{L}^{T}\mathbf{L})^{-1}\mathbf{L}^{T}\,\mathrm{IF}(\mathbf{y};\mathrm{vec}(\mathbf{M}),P_{\bm{\mu},\mathbf{\Sigma}}).
\]

Part (i), together with (A.3) finishes the proof of part (ii). ∎
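The linear-algebra facts used in this proof can be checked on a small example. The sketch below assumes a hypothetical compound-symmetry structure with $k=2$ and $\ell=2$, so that $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}_{0}$, and uses two projections onto the column space of $\mathbf{L}$: the orthogonal one and one weighted by $\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}$. Whether either of these coincides with the expression (A.1) is not assumed here; the point is only that any projection onto the column space of $\mathbf{L}$ leaves $\mathrm{vec}(\mathbf{\Sigma})$ fixed.

```python
import numpy as np

def vec(A):
    """Stack the columns of A (column-major vec)."""
    return A.reshape(-1, order="F")

k = 2
# Hypothetical linear structure: Sigma = theta_1 * I + theta_2 * 1 1^T.
L = np.column_stack([vec(np.eye(k)), vec(np.ones((k, k)))])
theta0 = np.array([1.0, 0.5])
Sigma = (L @ theta0).reshape(k, k, order="F")

# Orthogonal projection onto the column space of L.
P_orth = L @ np.linalg.solve(L.T @ L, L.T)

# Projection onto the same space, weighted by W = Sigma^{-1} (x) Sigma^{-1}.
Sinv = np.linalg.inv(Sigma)
W = np.kron(Sinv, Sinv)
P_w = L @ np.linalg.solve(L.T @ W @ L, L.T @ W)

fixed_orth = np.allclose(P_orth @ vec(Sigma), vec(Sigma))
fixed_w = np.allclose(P_w @ vec(Sigma), vec(Sigma))

# Since L has full column rank, theta is recovered from vec(Sigma).
theta_rec = np.linalg.solve(L.T @ L, L.T @ vec(Sigma))

# Identity (B^T (x) A) vec(v v^T) = vec(A v v^T B) on random inputs.
rng = np.random.default_rng(1)
A = rng.standard_normal((k, k))
B = rng.standard_normal((k, k))
v = rng.standard_normal(k)
lhs = np.kron(B.T, A) @ vec(np.outer(v, v))
rhs = vec(A @ np.outer(v, v) @ B)

print(fixed_orth, fixed_w, theta_rec, np.allclose(lhs, rhs))
```

The column-major convention in `vec` matters: the Kronecker identity holds in this form only when `vec` stacks columns, which is why `order="F"` is used throughout.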

Proof of Lemma 2

Proof.

Let $H:\mathbb{R}^{k\times k}\to\mathbb{R}^{m}$ with derivative $H^{\prime}$ defined in (A.14). From the definition of influence function, it follows that

\[
\begin{split}
\mathrm{IF}(\mathbf{y};H(\mathbf{M}),P)&=\frac{\partial H(\mathbf{M}(P_{h,\mathbf{y}}))}{\partial h}\bigg|_{h=0}\\
&=\frac{\partial H(\mathbf{C})}{\partial\mathrm{vec}(\mathbf{C})^{T}}\bigg|_{\mathbf{C}=\mathbf{M}(P)}\frac{\partial\mathrm{vec}(\mathbf{M}(P_{h,\mathbf{y}}))}{\partial h}\bigg|_{h=0}\\
&=H^{\prime}(\mathbf{M}(P))\cdot\mathrm{IF}(\mathbf{y};\mathrm{vec}(\mathbf{M}),P).
\end{split}
\tag{A.18}
\]

Since $\mathbf{M}(P)=\mathbf{\Sigma}$, after inserting the expression in Lemma 1, together with $\mathrm{vec}(\mathbf{v}\mathbf{v}^{T})=\mathbf{v}\otimes\mathbf{v}$, for $\mathrm{IF}(\mathbf{y};\mathrm{vec}(\mathbf{M}),P)$, this proves part (i). Next, let $H:\mathbb{R}^{\ell}\to\mathbb{R}^{m}$ with derivative $H^{\prime}$ defined by (A.16). It follows that

\[
\mathrm{IF}(\mathbf{y};H(\bm{\theta}),P)=H^{\prime}(\bm{\theta}(P))\cdot\mathrm{IF}(\mathbf{y};\bm{\theta},P).
\tag{A.19}
\]

After inserting the expression in Lemma 1(ii) for $\mathrm{IF}(\mathbf{y};\bm{\theta},P)$, together with $\bm{\theta}(P)=\bm{\theta}_{0}$, this proves part (ii). ∎

Appendix B Derivation of $\sigma_{1}$ and $\sigma_{2}$

We compare the expressions for $\sigma_{1}$ and $\sigma_{2}$ derived in Theorem 2 with the ones obtained for specific cases in Tyler [26] and Lopuhaä et al. [16].

Example 1.

Inserting $w_{1}=w_{2}=w_{3}=1$ in the expressions for $\sigma_{1}$ and $\sigma_{2}$ in Theorem 2 gives

\[
\sigma_{1}=\frac{\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\|\mathbf{z}\|^{4}\right]}{k(k+2)},
\]

which equals 1 for the multivariate normal. Furthermore,

\[
\begin{split}
\sigma_{2}&=-\frac{2}{k}+\frac{4\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[(\|\mathbf{z}\|^{2}-k)^{2}\right]}{(2k)^{2}}\\
&=-\frac{2}{k}+\frac{\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\|\mathbf{z}\|^{4}\right]-2k\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[\|\mathbf{z}\|^{2}\right]+k^{2}}{k^{2}}\\
&=-\frac{2}{k}+\frac{k(k+2)-2k^{2}+k^{2}}{k^{2}}=0.
\end{split}
\]
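The arithmetic in Example 1 can be reproduced for several dimensions using the closed-form radial moments of the standard normal distribution, $\mathbb{E}\|\mathbf{z}\|^{2m}=2^{m}\Gamma(k/2+m)/\Gamma(k/2)$. This is a small Python sketch; the helper name `radial_moment` and the list of dimensions are introduced here only for illustration.

```python
from math import gamma

def radial_moment(k, m):
    """E||z||^(2m) for z ~ N(0, I_k), i.e. the m-th moment of a chi-square(k)."""
    return 2.0**m * gamma(k / 2 + m) / gamma(k / 2)

results = {}
for k in (2, 3, 5, 10):
    m2 = radial_moment(k, 1)          # E||z||^2
    m4 = radial_moment(k, 2)          # E||z||^4
    sigma1 = m4 / (k * (k + 2))
    # E[(||z||^2 - k)^2] expanded as m4 - 2 k m2 + k^2.
    sigma2 = -2.0 / k * sigma1 + 4.0 * (m4 - 2 * k * m2 + k**2) / (2 * k) ** 2
    results[k] = (sigma1, sigma2)
    print(k, sigma1, sigma2)
```

Up to rounding, each dimension should return $\sigma_{1}=1$ and $\sigma_{2}=0$, in line with the computation above.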

Example 2.

First consider the special case of maximum likelihood, with $w_{1}(s)=w_{2}(s)=-2g^{\prime}(s^{2})/g(s^{2})$ and $w_{3}(s)=1$. Note that

\[
\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[z(\|\mathbf{z}\|)\right]=\frac{2\pi^{k/2}}{\Gamma(k/2)}\int_{0}^{\infty}z(r)g(r^{2})r^{k-1}\,\mathrm{d}r,
\tag{B.1}
\]

see e.g., Lemma 1 in Lopuhaä [14]. If $\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}[\|\mathbf{z}\|^{2}]<\infty$, then integration by parts gives

$$\begin{split}\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\Big[w_{2}^{\prime}(\|\mathbf{z}\|)\|\mathbf{z}\|^{3}\Big]&=\frac{2\pi^{k/2}}{\Gamma(k/2)}\int_{0}^{\infty}\frac{4g^{\prime}(r^{2})^{2}}{g(r^{2})^{2}}g(r^{2})r^{k+3}\,\text{d}r-\frac{2\pi^{k/2}}{\Gamma(k/2)}\int_{0}^{\infty}\frac{4g^{\prime\prime}(r^{2})}{g(r^{2})}g(r^{2})r^{k+3}\,\text{d}r\\ &=\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\Big[w_{2}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}\Big]-k(k+2).\end{split}$$
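The integration by parts behind the second integral can be sketched as follows. This is an illustrative reconstruction rather than part of the original argument: $C_{k}=2\pi^{k/2}/\Gamma(k/2)$ is shorthand introduced here, and it is assumed that the boundary terms $r^{k+2}g^{\prime}(r^{2})$ and $r^{k}g(r^{2})$ vanish at $0$ and at $\infty$.

```latex
\begin{aligned}
\int_{0}^{\infty}4g^{\prime\prime}(r^{2})\,r^{k+3}\,\text{d}r
&=\int_{0}^{\infty}2r^{k+2}\,\frac{\text{d}}{\text{d}r}\big[g^{\prime}(r^{2})\big]\,\text{d}r
=-2(k+2)\int_{0}^{\infty}g^{\prime}(r^{2})\,r^{k+1}\,\text{d}r,\\
\int_{0}^{\infty}g^{\prime}(r^{2})\,r^{k+1}\,\text{d}r
&=\frac{1}{2}\int_{0}^{\infty}r^{k}\,\frac{\text{d}}{\text{d}r}\big[g(r^{2})\big]\,\text{d}r
=-\frac{k}{2}\int_{0}^{\infty}g(r^{2})\,r^{k-1}\,\text{d}r,\\
C_{k}\int_{0}^{\infty}4g^{\prime\prime}(r^{2})\,r^{k+3}\,\text{d}r
&=k(k+2)\,C_{k}\int_{0}^{\infty}g(r^{2})\,r^{k-1}\,\text{d}r=k(k+2),
\end{aligned}
```

where the last equality is (B.1) with $z\equiv 1$.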

It follows that

$$\sigma_{1}=\frac{k(k+2)\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[w_{2}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}\right]}{\Big(\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}[w_{2}^{\prime}(\|\mathbf{z}\|)\|\mathbf{z}\|^{3}]+k(k+2)\Big)^{2}}=\frac{k(k+2)}{\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[w_{2}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}\right]},\tag{B.2}$$

which coincides with the expression found in Example 2 in Tyler [26], who expresses expectations in terms of the random variable $T=\|\mathbf{z}\|^{2}$. To compute $\sigma_{2}$, first note that by means of integration by parts it follows that $\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}[w_{2}(\|\mathbf{z}\|)\|\mathbf{z}\|^{2}]=k$. Inserting this in the expression for $\sigma_{2}$ gives

$$\begin{split}\sigma_{2}&=-\frac{2}{k}\sigma_{1}+\frac{4\left(\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[w_{2}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}\right]-k^{2}\right)}{\left(\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}[w_{2}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}]-k(k+2)+2k\right)^{2}}=-\frac{2}{k}\sigma_{1}+\frac{4}{\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}[w_{2}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}]-k^{2}}.\end{split}$$

After inserting $\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}[w_{2}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}]=k(k+2)/\sigma_{1}$, as follows from (B.2), we find

$$\sigma_{2}=-\frac{2}{k}\sigma_{1}+\frac{4}{k(k+2)/\sigma_{1}-k^{2}}=-\frac{2\sigma_{1}(1-\sigma_{1})}{k+2-k\sigma_{1}},$$

which coincides with the expression found in Example 2 in Tyler [26].
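As a numerical illustration of the maximum likelihood case, the following minimal sketch evaluates (B.1) by quadrature and then computes $\sigma_{1}$ from (B.2) and $\sigma_{2}$ from the expression $-2\sigma_{1}/k+4/(\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}[w_{2}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}]-k^{2})$. The multivariate Student $t$ density with $\nu$ degrees of freedom is an assumption made here purely for illustration; it does not appear in the text. All function names, the substitution $r=u/(1-u)$ and the Simpson rule are likewise choices made for this sketch. For this density $w_{2}(s)=(k+\nu)/(\nu+s^{2})$, and a direct Beta-moment calculation suggests $\sigma_{1}=(k+\nu+2)/(k+\nu)$ and $\sigma_{2}=2\sigma_{1}/\nu$, which the code compares against.

```python
import math


def radial_expectation(z, g, k, n=20000):
    """Approximate E[z(||z||)] via (B.1) for a density g(||z||^2) on R^k.

    Uses the substitution r = u / (1 - u), u in [0, 1), and composite
    Simpson's rule with n (even) subintervals. Illustrative helper only.
    """
    const = 2.0 * math.pi ** (k / 2) / math.gamma(k / 2)

    def integrand(u):
        if u >= 1.0:
            return 0.0  # integrand is assumed to vanish as r tends to infinity
        r = u / (1.0 - u)
        return z(r) * g(r * r) * r ** (k - 1) / (1.0 - u) ** 2

    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return const * total * h / 3.0


def t_density_generator(k, nu):
    """Return g such that g(||z||^2) is the k-variate t density with scatter I_k."""
    c = math.gamma((k + nu) / 2) / (math.gamma(nu / 2) * (nu * math.pi) ** (k / 2))
    return lambda t: c * (1.0 + t / nu) ** (-(k + nu) / 2)


def mle_sigmas_t(k, nu):
    """sigma_1 from (B.2) and sigma_2 from the expression following it."""
    g = t_density_generator(k, nu)
    w2 = lambda s: (k + nu) / (nu + s * s)  # equals -2 g'(s^2) / g(s^2) for this g
    e4 = radial_expectation(lambda r: w2(r) ** 2 * r ** 4, g, k)
    sigma1 = k * (k + 2) / e4
    sigma2 = -2.0 * sigma1 / k + 4.0 / (e4 - k * k)
    return sigma1, sigma2


k, nu = 3, 5
sigma1, sigma2 = mle_sigmas_t(k, nu)
print("sigma1:", sigma1, "closed form:", (k + nu + 2) / (k + nu))
print("sigma2:", sigma2, "closed form:", 2 * sigma1 / nu)
```

The quadrature is deliberately simple; any standard numerical integrator could replace it.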

Next, consider the general case of M-estimators, with $w_{3}=1$. First note that Tyler [26] uses a function $u_{2}$, which relates to our function $w_{2}$ as $w_{2}(s)=u_{2}(s^{2})$. Then, since $\bm{\xi}_{0}=(\bm{\beta}_{0},\bm{\theta}_{0})$ satisfies (3.5), we find that

$$\begin{split}0&=\bm{\theta}_{0}^{T}\mathbb{E}\left[\Psi_{\bm{\theta}}(\mathbf{s},\bm{\xi}_{0})\right]\\ &=\mathbb{E}\left[\mathrm{vec}(\mathbf{\Sigma}^{-1})^{T}\mathrm{vec}\left\{w_{2}(d_{0})(\mathbf{y}-\mathbf{X}\bm{\beta}_{0})(\mathbf{y}-\mathbf{X}\bm{\beta}_{0})^{T}-\mathbf{\Sigma}\right\}\right]\\ &=\mathbb{E}\left[\text{tr}\left\{w_{2}(d_{0})(\mathbf{y}-\mathbf{X}\bm{\beta}_{0})(\mathbf{y}-\mathbf{X}\bm{\beta}_{0})^{T}\mathbf{\Sigma}^{-1}-\mathbf{I}_{k}\right\}\right]\\ &=\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[w_{2}(\|\mathbf{z}\|)\|\mathbf{z}\|^{2}-k\right],\end{split}$$

where $d_{0}^{2}=(\mathbf{y}-\mathbf{X}\bm{\beta}_{0})^{T}\mathbf{\Sigma}^{-1}(\mathbf{y}-\mathbf{X}\bm{\beta}_{0})$, so that $k=\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}[w_{2}(\|\mathbf{z}\|)\|\mathbf{z}\|^{2}]=\mathbb{E}[u_{2}(T)T]$. It then follows that

$$\begin{split}\mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\left[w_{2}(\|\mathbf{z}\|)^{2}\|\mathbf{z}\|^{4}\right]&=\mathbb{E}\left[u_{2}(T)^{2}T^{2}\right]=k(k+2)\psi_{1},\\ \mathbb{E}_{\mathbf{0},\mathbf{I}_{k}}\Big[w_{2}^{\prime}(\|\mathbf{z}\|)\|\mathbf{z}\|^{3}\Big]&=2\mathbb{E}\left[u_{2}^{\prime}(T)T^{2}\right]=2k(\psi_{2}-1),\end{split}$$

where $\psi_{1}$ and $\psi_{2}$ are defined in Example 3 in Tyler [26]. Then from the expressions provided in Theorem 2 we find

$$\begin{split}\sigma_{1}&=\frac{k^{2}(k+2)^{2}\psi_{1}}{\left(2k(\psi_{2}-1)+k(k+2)\right)^{2}}=\frac{(k+2)^{2}\psi_{1}}{(2\psi_{2}+k)^{2}},\\ \sigma_{2}&=-\frac{2\sigma_{1}}{k}+\frac{4\left\{k(k+2)\psi_{1}-k^{2}\right\}}{(2k\psi_{2})^{2}}.\end{split}$$

The expression for $\sigma_{1}$ coincides with the one in Example 3 in Tyler [26]. After inserting this in $\sigma_{2}$, one can verify that the expression for $\sigma_{2}$ also coincides with the one in Example 3 in Tyler [26].
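The two displayed expressions in terms of $\psi_{1}$ and $\psi_{2}$ can be evaluated directly, as in the following minimal sketch. The function names are hypothetical, and the code only transcribes the displays above; it does not implement Theorem 2 in general. As a sanity check it uses the case $u_{2}\equiv 1$ under a standard normal $\mathbf{z}$, for which the displayed relations give $\psi_{1}=\psi_{2}=1$ (since $\mathbb{E}[T^{2}]=k(k+2)$ and $u_{2}^{\prime}=0$), so that one expects $\sigma_{1}=1$ and $\sigma_{2}=0$.

```python
def sigmas_from_psi(k, psi1, psi2):
    """Evaluate sigma_1 and sigma_2 from psi_1, psi_2 as in the displays above."""
    e_w2_sq = k * (k + 2) * psi1        # E[w2(||z||)^2 ||z||^4]
    e_w2_prime = 2 * k * (psi2 - 1)     # E[w2'(||z||) ||z||^3]
    sigma1 = k * (k + 2) * e_w2_sq / (e_w2_prime + k * (k + 2)) ** 2
    sigma2 = -2 * sigma1 / k + 4 * (e_w2_sq - k ** 2) / (e_w2_prime + 2 * k) ** 2
    return sigma1, sigma2


def sigma1_simplified(k, psi1, psi2):
    """Simplified form (k + 2)^2 psi_1 / (2 psi_2 + k)^2 of sigma_1."""
    return (k + 2) ** 2 * psi1 / (2 * psi2 + k) ** 2


# Case u_2 = 1 with standard normal z, where psi_1 = psi_2 = 1.
print(sigmas_from_psi(3, 1.0, 1.0))

# The unsimplified and simplified forms of sigma_1 agree for arbitrary inputs.
print(sigmas_from_psi(4, 1.3, 0.8)[0], sigma1_simplified(4, 1.3, 0.8))
```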

Example 3.

With $w_{1}(s)=\rho^{\prime}(s)/s$, $w_{2}(s)=k\rho^{\prime}(s)/s$ and $w_{3}(s)=\rho^{\prime}(s)s-\rho(s)+b_{0}$, one can easily verify that the expressions for $\sigma_{1}$ and $\sigma_{2}$ in Theorem 2 coincide with the ones in Corollary 9.2 in Lopuhaä et al. [16].

Appendix C Details for Examples 4 and 5

Example 4.

From (4.2) we find

$$\begin{split}&H^{\prime}(\mathbf{\Sigma})\mathbf{L}\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}\mathbf{L}^{T}H^{\prime}(\mathbf{\Sigma})^{T}\\ &=\frac{1}{k^{2}}|\mathbf{\Sigma}|^{-2/k}\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma}^{-1})^{T}\mathbf{L}\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}\mathbf{L}^{T}\mathrm{vec}(\mathbf{\Sigma}^{-1})\mathrm{vec}(\mathbf{\Sigma})^{T}\\ &\quad-\frac{1}{k}|\mathbf{\Sigma}|^{-2/k}\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma}^{-1})^{T}\mathbf{L}\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}\mathbf{L}^{T}\\ &\quad-\frac{1}{k}|\mathbf{\Sigma}|^{-2/k}\mathbf{L}\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}\mathbf{L}^{T}\mathrm{vec}(\mathbf{\Sigma}^{-1})\mathrm{vec}(\mathbf{\Sigma})^{T}\\ &\quad+|\mathbf{\Sigma}|^{-2/k}\mathbf{L}\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}\mathbf{L}^{T}.\end{split}\tag{C.1}$$

Using that $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}_{0}$ and

$$\mathrm{vec}(\mathbf{\Sigma}^{-1})=\mathrm{vec}(\mathbf{\Sigma}^{-1}\mathbf{\Sigma}\mathbf{\Sigma}^{-1})=\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathrm{vec}(\mathbf{\Sigma})=\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\bm{\theta}_{0},$$

the first term on the right-hand side of (C.1) reduces to $(1/k)|\mathbf{\Sigma}|^{-2/k}\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^{T}$. Similarly, the second and third terms on the right-hand side of (C.1) are each equal to $-(1/k)|\mathbf{\Sigma}|^{-2/k}\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^{T}$. Putting everything together, we find that the limiting covariance of $\sqrt{n}(H(\mathbf{V}_{n})-H(\mathbf{\Sigma}))$ is given by (4.3).
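The reductions used for the first three terms of (C.1) can be written out as follows. This is a sketch of the intermediate steps, with $\mathbf{A}=\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}$ as shorthand introduced here; it uses that $\mathbf{A}$ is symmetric, that $\mathrm{vec}(\mathbf{\Sigma}^{-1})=\mathbf{A}\mathbf{L}\bm{\theta}_{0}$, and that $\mathbf{L}^{T}\mathbf{A}\mathbf{L}$ is invertible.

```latex
\begin{aligned}
\mathrm{vec}(\mathbf{\Sigma}^{-1})^{T}\mathbf{L}\big(\mathbf{L}^{T}\mathbf{A}\mathbf{L}\big)^{-1}\mathbf{L}^{T}
&=\bm{\theta}_{0}^{T}\mathbf{L}^{T}\mathbf{A}\mathbf{L}\big(\mathbf{L}^{T}\mathbf{A}\mathbf{L}\big)^{-1}\mathbf{L}^{T}
=\bm{\theta}_{0}^{T}\mathbf{L}^{T}
=\mathrm{vec}(\mathbf{\Sigma})^{T},\\
\mathrm{vec}(\mathbf{\Sigma}^{-1})^{T}\mathbf{L}\big(\mathbf{L}^{T}\mathbf{A}\mathbf{L}\big)^{-1}\mathbf{L}^{T}\mathrm{vec}(\mathbf{\Sigma}^{-1})
&=\mathrm{vec}(\mathbf{\Sigma})^{T}\mathrm{vec}(\mathbf{\Sigma}^{-1})
=\mathrm{tr}\big(\mathbf{\Sigma}\mathbf{\Sigma}^{-1}\big)=k,\\
\Big(\frac{1}{k}-\frac{1}{k}-\frac{1}{k}\Big)|\mathbf{\Sigma}|^{-2/k}\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^{T}
+|\mathbf{\Sigma}|^{-2/k}\mathbf{L}\big(\mathbf{L}^{T}\mathbf{A}\mathbf{L}\big)^{-1}\mathbf{L}^{T}
&=|\mathbf{\Sigma}|^{-2/k}\Big\{\mathbf{L}\big(\mathbf{L}^{T}\mathbf{A}\mathbf{L}\big)^{-1}\mathbf{L}^{T}-\frac{1}{k}\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^{T}\Big\}.
\end{aligned}
```

The last line is the sum of the four terms on the right-hand side of (C.1).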

Example 5.

From Example 4 and the delta method, it follows that the limiting variance of $\sqrt{n}(H(\bm{\theta}_{n})-H(\bm{\theta}))$ is given by

$$\begin{split}&(\mathbf{L}^{T}\mathbf{L})^{-1}\mathbf{L}^{T}\left[2\sigma_{1}|\mathbf{\Sigma}|^{-2/k}\left\{\mathbf{L}\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}\mathbf{L}^{T}-\frac{1}{k}\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^{T}\right\}\right]\mathbf{L}(\mathbf{L}^{T}\mathbf{L})^{-1}\\ &=2\sigma_{1}|\mathbf{\Sigma}|^{-2/k}\left\{\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}-\frac{1}{k}(\mathbf{L}^{T}\mathbf{L})^{-1}\mathbf{L}^{T}\mathrm{vec}(\mathbf{\Sigma})\mathrm{vec}(\mathbf{\Sigma})^{T}\mathbf{L}(\mathbf{L}^{T}\mathbf{L})^{-1}\right\}\\ &=\frac{2\sigma_{1}}{|\mathbf{\Sigma}|^{2/k}}\left\{\Big(\mathbf{L}^{T}\left(\mathbf{\Sigma}^{-1}\otimes\mathbf{\Sigma}^{-1}\right)\mathbf{L}\Big)^{-1}-\frac{1}{k}\bm{\theta}_{0}\bm{\theta}_{0}^{T}\right\},\end{split}$$

using that $\mathrm{vec}(\mathbf{\Sigma})=\mathbf{L}\bm{\theta}_{0}$.
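The matrix identities used in Examples 4 and 5 can be checked exactly on a small case, as in the following minimal sketch. The compound-symmetry structure $\mathbf{\Sigma}=\theta_{1}\mathbf{I}_{2}+\theta_{2}\mathbf{J}_{2}$ with $k=2$ and $\bm{\theta}_{0}=(1,1/2)^{T}$ is an assumption chosen here for illustration, and all helper names are hypothetical. Exact rational arithmetic is used so that the identities hold without rounding.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def kron(a, b):
    """Kronecker product a (x) b."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def inv2(m):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = F(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def vec(m):
    """Stack the columns of m into a column vector."""
    return [[m[i][j]] for j in range(len(m[0])) for i in range(len(m))]


# Linear structure vec(Sigma) = L theta_0 with L = [vec(I_2), vec(J_2)].
I2 = [[F(1), F(0)], [F(0), F(1)]]
J2 = [[F(1), F(1)], [F(1), F(1)]]
L = [[a[0], b[0]] for a, b in zip(vec(I2), vec(J2))]
theta0 = [[F(1)], [F(1, 2)]]

vec_sigma = matmul(L, theta0)
Sigma = [[vec_sigma[0][0], vec_sigma[2][0]], [vec_sigma[1][0], vec_sigma[3][0]]]
Sigma_inv = inv2(Sigma)
A = kron(Sigma_inv, Sigma_inv)            # Sigma^{-1} (x) Sigma^{-1}
Lt = transpose(L)
M = inv2(matmul(matmul(Lt, A), L))        # (L^T A L)^{-1}
v = vec(Sigma_inv)

# vec(Sigma^{-1})^T L (L^T A L)^{-1} L^T vec(Sigma^{-1}), expected to equal k = 2.
quad = matmul(matmul(matmul(matmul(transpose(v), L), M), Lt), v)[0][0]
# (L^T L)^{-1} L^T vec(Sigma), expected to recover theta_0.
theta_back = matmul(matmul(inv2(matmul(Lt, L)), Lt), vec_sigma)

print(quad, theta_back)
```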

References

  • [1] I. Chervoneva and M. Vishnyakov. Constrained $S$-estimators for linear mixed effects models with covariance components. Stat. Med., 30(14):1735–1750, 2011.
  • [2] I. Chervoneva and M. Vishnyakov. Generalized S-estimators for linear mixed effects models. Statistica Sinica, 24(3):1257–1276, 2014.
  • [3] S. Copt and S. Heritier. Robust alternatives to the F-test in mixed linear models based on MM-estimates. Biometrics, 63(4):1045–1052, 2007.
  • [4] S. Copt and M. P. Victoria-Feser. High-breakdown inference for mixed linear models. Journal of the American Statistical Association, 101(473):292–300, 2006.
  • [5] C. Croux and G. Haesbroeck. Principal component analysis based on robust estimators of the covariance or correlation matrix: influence functions and efficiencies. Biometrika, 87(3):603–618, 2000.
  • [6] G. M. Fitzmaurice, N. M. Laird, and J. H. Ware. Applied longitudinal analysis. Wiley Series in Probability and Statistics. John Wiley & Sons, Inc., Hoboken, NJ, second edition, 2011.
  • [7] F. R. Hampel. The influence curve and its role in robust estimation. J. Amer. Statist. Assoc., 69:383–393, 1974.
  • [8] H. O. Hartley and J. N. K. Rao. Maximum-likelihood estimation for the mixed analysis of variance model. Biometrika, 54:93–108, 1967.
  • [9] P. J. Huber. Robust statistics. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons, Inc., New York, 1981.
  • [10] R. I. Jennrich and M. D. Schluchter. Unbalanced repeated-measures models with structured covariance matrices. Biometrics, 42(4):805–820, 1986.
  • [11] J. T. Kent and D. E. Tyler. Constrained $M$-estimation for multivariate location and scatter. Ann. Statist., 24(3):1346–1370, 1996.
  • [12] N. L. Kudraszow and R. A. Maronna. Estimates of MM type for the multivariate linear model. J. Multivariate Anal., 102(9):1280–1292, 2011.
  • [13] H. P. Lopuhaä. On the relation between $S$-estimators and $M$-estimators of multivariate location and covariance. Ann. Statist., 17(4):1662–1683, 1989.
  • [14] H. P. Lopuhaä. Asymptotic expansion of $S$-estimators of location and covariance. Statist. Neerlandica, 51(2):220–237, 1997.
  • [15] H. P. Lopuhaä. Highly efficient estimators with high breakdown point for linear models with structured covariance matrices. Econometrics and Statistics, 2023.
  • [16] H. P. Lopuhaä, V. Gares, and A. Ruiz-Gazen. S-estimation in linear models with structured covariance matrices. Ann. Statist., 51(6):2415–2439, 2023.
  • [17] H. P. Lopuhaä, V. Gares, and A. Ruiz-Gazen. Supplement to “S-estimation in linear models with structured covariance matrices”. 2023.
  • [18] J. R. Magnus and H. Neudecker. Matrix differential calculus with applications in statistics and econometrics. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons, Ltd., Chichester, 1988.
  • [19] C. L. Mallows. Latent vectors of random symmetric matrices. Biometrika, 48:133–149, 1961.
  • [20] K. V. Mardia and R. J. Marshall. Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika, 71(1):135–146, 1984.
  • [21] R. A. Maronna. Robust $M$-estimators of multivariate location and scatter. Ann. Statist., 4(1):51–67, 1976.
  • [22] J. J. Miller. Asymptotic properties of maximum likelihood estimates in the mixed model of the analysis of variance. Ann. Statist., 5(4):746–762, 1977.
  • [23] P. Rousseeuw and V. Yohai. Robust regression by means of S-estimators. In Robust and nonlinear time series analysis (Heidelberg, 1983), volume 26 of Lect. Notes Stat., pages 256–272. Springer, New York, 1984.
  • [24] M. Salibián-Barrera, S. Van Aelst, and G. Willems. Principal components analysis based on multivariate MM estimators with fast and robust bootstrap. J. Amer. Statist. Assoc., 101(475):1198–1211, 2006.
  • [25] K. S. Tatsuoka and D. E. Tyler. On the uniqueness of $S$-functionals and $M$-functionals under nonelliptical distributions. Ann. Statist., 28(4):1219–1243, 2000.
  • [26] D. E. Tyler. Radial estimates and the test for sphericity. Biometrika, 69(2):429–436, 1982.
  • [27] D. E. Tyler. Robustness and efficiency properties of scatter matrices. Biometrika, 70(2):411–420, 1983.
  • [28] S. Van Aelst and G. Willems. Multivariate regression $S$-estimators for robust estimation and inference. Statist. Sinica, 15(4):981–1001, 2005.