
Achievable Rate Optimization for Stacked Intelligent Metasurface-Assisted Holographic MIMO Communications

Anastasios Papazafeiropoulos, Jiancheng An, Pandelis Kourtessis, Tharmalingam Ratnarajah, Symeon Chatzinotas

A. Papazafeiropoulos is with the Communications and Intelligent Systems Research Group, University of Hertfordshire, Hatfield AL10 9AB, U.K., and with SnT at the University of Luxembourg, Luxembourg. J. An is with the School of Electrical and Electronics Engineering, Nanyang Technological University, Singapore 639798. P. Kourtessis is with the Communications and Intelligent Systems Research Group, University of Hertfordshire, Hatfield AL10 9AB, U.K. T. Ratnarajah is with the School of Engineering, Institute for Digital Communications, The University of Edinburgh, EH8 9YL Edinburgh, U.K. S. Chatzinotas is with SnT at the University of Luxembourg, Luxembourg. A. Papazafeiropoulos was supported by the University of Hertfordshire's 5-year Vice Chancellor's Research Fellowship. S. Chatzinotas was supported by the National Research Fund, Luxembourg, under the project RISOTTI.
Abstract

Stacked intelligent metasurfaces (SIM) are a revolutionary technology that can outperform their single-layer counterparts by performing advanced signal processing that relies on wave propagation. In this work, we exploit SIM to enable transmit precoding and receiver combining in holographic multiple-input multiple-output (HMIMO) communications, and we study the achievable rate by formulating a joint optimization problem of the SIM phase shifts at both sides of the transceiver and the covariance matrix of the transmitted signal. Notably, we solve this problem by means of an iterative optimization algorithm that relies on the projected gradient method and accounts for all optimization parameters simultaneously. We also obtain the step size guaranteeing the convergence of the proposed algorithm. Simulation results provide fundamental insights, such as the performance improvements compared to the single-RIS counterpart and the conventional MIMO system. Remarkably, the proposed algorithm achieves the same rate as the alternating optimization (AO) benchmark but with fewer iterations.

Index Terms:
Holographic MIMO (HMIMO), stacked intelligent metasurfaces (SIM), reconfigurable intelligent surface (RIS), gradient projection, 6G networks.

I Introduction

The need for sixth-generation (6G) cellular networks has appeared on the horizon of wireless network evolution [1]. Their extreme targets require vast improvements in data rates and latency, together with wider connectivity to cover the explosive proliferation of Internet-of-Everything (IoE) applications, such as virtual/augmented reality (VR/AR), and the increase in connected devices. In particular, the latter is expected to reach 500 million by 2030 [2]. Various technologies, such as massive multiple-input multiple-output (mMIMO) and millimeter wave (mmWave) communications, which require low energy consumption and can achieve high data rates, have been suggested in recent years. However, they concern transceiver features and cannot shape the propagation channel, whose stochastic characteristics limit the performance [3].

Two possible approaches that address the aforementioned issues are the proposed 6G-enabling technologies of reconfigurable intelligent surfaces (RIS) [4, 5, 6, 7] and holographic multiple-input multiple-output (HMIMO) communications [8]. Specifically, RIS and its various equivalents have been proposed to realize smart reconfigurable environments [4]. A RIS is a planar metasurface equipped with a large number of nearly passive elements that induce phase shifts and/or amplitude attenuation to the impinging waves through a smart controller. Also, optimization of the reflected signals enables us to control the interaction of the reflection characteristics with the surrounding objects. RIS performance gains have been studied under various system and channel setups, but most works have focused on single-input single-output (SISO) or multiple-input single-output (MISO) systems with single-antenna receivers [9, 10, 11, 12, 6, 13, 14, 15, 16], while the research on RIS-assisted MIMO systems is limited [17, 18]. In particular, the study of the capacity limit of RIS-assisted MIMO systems requires the joint optimization of the RIS phase shifts and the MIMO transmit covariance matrix [19, 20]. It is worthwhile to mention that since RISs do not require active transmitter radio frequency (RF) chains, they can be densely implemented with low energy consumption and low cost [5]. However, single-layer RIS designs are not capable of implementing advanced MIMO functionalities because of hardware limitations, and they suffer from severe path-loss attenuation.

Building on mMIMO systems, with their improvements in spectral and energy efficiencies and their other benefits such as reduced latency [21], HMIMO has emerged as a new concept describing possibly the next MIMO generation [8, 22]. For example, in [23], it was shown that great improvements are expected from a large intelligent surface (LIS) including a massive number of elements. The fundamental limits of HMIMO systems were characterised in [24] by suggesting correlated random Gaussian fading for the far field. In the case of arbitrary scattering environments, a Fourier plane-wave series-based expansion of the HMIMO channel response was developed in [25]. In [26], a channel estimation scheme was proposed for arbitrary spatial correlation matrices. However, HMIMO systems have the disadvantages of excessive hardware cost and energy consumption because they are implemented by a large number of active components.

Recently, not only have stacked intelligent metasurfaces (SIMs) been proposed, where multiple surfaces are cascaded [27, 28], but SIMs have also been integrated with the transceiver to implement HMIMO communications [29]. It was shown that a SIM can implement signal processing in the electromagnetic (EM) wave regime. The motivation behind this work was to substitute the expensive active elements at the transceiver by exploiting the technology of programmable metasurfaces. In [23, 8], single-layer surfaces were used, but the multilayer surface design is more advantageous for enhancing the spatial-domain gain by forming diverse waveforms with high accuracy.

From theoretical and practical standpoints, it is crucial to optimise the achievable rate for RIS-assisted systems. Hence, various optimization methods have been suggested in prior works, which aim to find near-optimal solutions within reasonable run time and computational complexity. Most of these works considered single-antenna receive devices and relied on the alternating optimization (AO) method, which optimises the transmit beamformer and the RIS phase shifts in an alternating way [12, 6]. For example, in [12], the gradient method was employed in an AO fashion to optimize the phase shifts. Other examples of AO-based works on RIS-assisted MIMO communication are [30, 31, 19, 20]. In [30], a RIS was optimized to increase the rank of the channel matrix. In [31], the optimization took place in an indoor mmWave environment. Moreover, in [19], the achievable rate of RIS-assisted multi-stream MIMO was maximized by using the AO method. However, AO-based methods may require many iterations to converge, and this number increases with the size of the RIS. Note that a large RIS is precisely the case where a RIS is more beneficial in practice. Contrary to this background, a joint optimization of the RIS elements and the transmit covariance matrix was performed in [20], where an iterative projected gradient method was proposed.

Contributions: Motivated by the above observations, this paper studies SIM-enabled HMIMO systems by optimising all parameters of the joint optimization problem simultaneously to reduce the convergence time compared to an AO approach. Contrary to [29], which considers a fully analog SIM-enabled HMIMO architecture optimized by means of an AO approach, we focus on a more general hybrid digital and wave design, and we apply a more efficient algorithm that optimizes all relevant parameters simultaneously. Compared to [20] and [14], which assumed a conventional RIS-assisted system and a STAR-RIS system, respectively, where two parameters are optimized simultaneously, we consider a general SIM-enabled system, where we optimize three key parameters simultaneously. Our main contributions are summarised as follows.

  • We maximise the achievable rate of a multi-stream HMIMO system equipped with a SIM at the transmitter and a SIM at the receiver. To this end, we formulate the joint optimization problem of the transmit covariance matrix and the phase shift values of each surface at the transmitter and the receiver SIMs.

  • We propose an iterative projected gradient approach, which solves the underlying nonconvex problem. Also, we derive the gradients and the projection expressions for all parameters in closed form. Moreover, we show that the proposed approach converges to a critical point.

  • We determine the appropriate step size that makes the proposed algorithm converge by first deriving the Lipschitz constant.

  • Simulation and analytical results coincide and show that both the proposed approach and the AO method result in the same rate but our approach achieves it with a substantially lower number of iterations. Furthermore, the proposed approach has remarkably lower computational complexity with respect to the AO method.

Paper Outline: The structure of this paper is as follows. Section II presents the system model and the problem formulation of a SIM-assisted HMIMO system. Section III presents the simultaneous optimization of the achievable rate with respect to all its parameters. The section that follows provides the convergence and complexity analyses. The numerical results are presented next, and Section VI concludes the paper.

Notation: Vectors and matrices are denoted by boldface lower and upper case symbols, respectively. The notations $(\cdot)^{\mathsf{T}}$, $(\cdot)^{\mathsf{H}}$, and $\mathrm{tr}(\cdot)$ describe the transpose, Hermitian transpose, and trace operators, respectively. Moreover, the notations $\arg(\cdot)$ and $\mathbb{E}[\cdot]$ express the argument function and the expectation operator, respectively. The notation $\mathrm{diag}(\mathbf{A})$ describes a vector with elements equal to the diagonal elements of $\mathbf{A}$, the notation $\mathrm{diag}(\mathbf{x})$ describes a diagonal matrix whose diagonal elements are the entries of $\mathbf{x}$, while $\mathbf{b} \sim \mathcal{CN}(\mathbf{0}, \mathbf{\Sigma})$ describes a circularly symmetric complex Gaussian vector with zero mean and covariance matrix $\mathbf{\Sigma}$.

II System Model and Problem Formulation

II-A System Model

We consider a SIM-assisted HMIMO system, where the transmitter and the receiver have $N_t$ and $N_r$ antennas, respectively, and each of them is assisted by a SIM, as shown in Fig. 1. Specifically, each SIM consists of a closed vacuum container having several stacked metasurface layers [27]. The operation requires a customized field programmable gate array (FPGA), or generally a smart controller, which can adjust the phase shift of the EM waves impinging on each meta-atom. Hence, a customized spatial waveform shape at the output of each metasurface layer is automatically produced as the transmit signals propagate through the SIM. The EM waves can be transmitted from the output metasurface of the transmitter SIM into the ether and then acquired by the receiver SIM, i.e., HMIMO communication can be supported by leveraging SIM-based analog beamforming to approach its digital counterpart. In particular, the transmitter SIM plays the role of the precoder by sending the information-bearing EM wave through the ether to the receiver SIM, which can combine the received EM wave to recover the transmitted signal. In other words, precoding and combining take place partially in the wave domain [29].

Figure 1: A SIM-assisted HMIMO system.

According to Fig. 1, which illustrates the SIM-assisted HMIMO system supporting precoding and combining in the wave domain, we rely on the SIM design proposed in [29]. Specifically, we denote by $L$ the number of metasurface layers at the transmitter and by $K$ the number of metasurface layers at the receiver, while $\mathcal{L}=\{1,\ldots,L\}$ and $\mathcal{K}=\{1,\ldots,K\}$ represent their sets, respectively. Without any loss of generality, we assume that all metasurface layers at the transmitter, and likewise at the receiver, consist of an identical number of meta-atoms. In particular, we denote by $M$ and $N$ the number of meta-atoms on each metasurface layer at the transmitter SIM and receiver SIM, respectively. The respective sets are denoted as $\mathcal{M}=\{1,\ldots,M\}$ and $\mathcal{N}=\{1,\ldots,N\}$. On this ground, we denote by $\theta_{m}^{l}\in[0,2\pi)$, $m\in\mathcal{M}$, $l\in\mathcal{L}$, the phase shift induced by meta-atom $m$ on the transmit metasurface layer $l$, with $\phi_{m}^{l}=e^{j\theta_{m}^{l}}$ being the respective transmission coefficient.
The transmission coefficient matrix associated with the $l$-th transmit layer is denoted by ${\bm{\Phi}}^{l}=\mathrm{diag}({\bm{\phi}}^{l})\in\mathbb{C}^{M\times M}$, where ${\bm{\phi}}^{l}=[\phi^{l}_{1},\dots,\phi^{l}_{M}]^{\mathsf{T}}\in\mathbb{C}^{M\times 1}$. Similarly, we denote by $\xi_{n}^{k}\in[0,2\pi)$, $n\in\mathcal{N}$, $k\in\mathcal{K}$, the phase shift induced by meta-atom $n$ on the receive metasurface layer $k$, with $\psi_{n}^{k}=e^{j\xi_{n}^{k}}$ being the corresponding transmission coefficient.
The transmission coefficient matrix associated with the $k$-th receive layer is denoted by ${\bm{\Psi}}^{k}=\mathrm{diag}({\bm{\psi}}^{k})\in\mathbb{C}^{N\times N}$, where ${\bm{\psi}}^{k}=[\psi^{k}_{1},\dots,\psi^{k}_{N}]^{\mathsf{T}}\in\mathbb{C}^{N\times 1}$.
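
As a minimal sketch of the definitions above, the following Python snippet builds the diagonal transmission coefficient matrix of one layer from its phase shifts. The function name `transmission_matrix` and the example phase values are hypothetical and serve only as an illustration; they are not part of the system model.

```python
import cmath
import math


def transmission_matrix(thetas):
    """Return diag(e^{j*theta_1}, ..., e^{j*theta_M}) as a list of rows."""
    size = len(thetas)
    phi = [cmath.exp(1j * t) for t in thetas]  # unit-modulus coefficients
    return [[phi[r] if r == c else 0j for c in range(size)] for r in range(size)]


# Hypothetical phase shifts for one layer with M = 4 meta-atoms.
thetas = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
Phi = transmission_matrix(thetas)
```

Every diagonal entry has unit modulus by construction, which is the constant-modulus property that the optimization has to preserve.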

Remark 1

In this work, we assume continuously-adjustable phase shifts and constant modulus equal to $1$ to assess the performance of SIM-assisted HMIMO communications while maximizing the achievable rate as in [9]. Practical issues such as the assumption of coupled phase and magnitude [32] and the consideration of discrete phase shifts [27] will be studied in future work.

Remark 2

Note that our hybrid digital and wave architecture employing a multiple-layer SIM significantly outperforms the conventional hybrid benchmarks, whose performance is constrained by analog components, e.g., constant-modulus phase shifters. Remarkably, our hybrid architecture may approach the performance of an all-digital system, while the number of RF chains reduces from $M$ to $N_t$.

With all metasurface layers having an isomorphic lattice arrangement [27], we model each surface as a uniform planar array, where the element spacing between the $\tilde{m}$th and $m$th meta-atoms on the same transmit metasurface is given by [29]

\begin{align}
r_{m,\tilde{m}}=r_{e,t}\sqrt{(m_{z}-\tilde{m}_{z})^{2}+(m_{x}-\tilde{m}_{x})^{2}} \tag{1}
\end{align}

with $r_{e,t}$ expressing the element spacing between adjacent meta-atoms on the same transmit metasurface. Moreover, $m_{x}$ and $m_{z}$ are the indices of the $m$th meta-atom along the $x$-axis and the $z$-axis, respectively, which are given by

\begin{align}
m_{x}=\mathrm{mod}(m-1,m_{\mathrm{max}})+1,\quad m_{z}=\lceil m/m_{\mathrm{max}}\rceil \tag{2}
\end{align}

with $m_{\mathrm{max}}$ being the highest number of meta-atoms included on each row of the transmit surface. In a similar way, the element spacing between the $n$th meta-atom and the $\tilde{n}$th one on the same receive metasurface can be described as

\begin{align}
t_{\tilde{n},n}=t_{e,r}\sqrt{(\tilde{n}_{x}-n_{x})^{2}+(\tilde{n}_{z}-n_{z})^{2}}, \tag{3}
\end{align}

where $t_{e,r}$ expresses the element spacing between adjacent meta-atoms on the same receive metasurface, with $n_{x}$ and $n_{z}$ describing the indices of the $n$th meta-atom along the $x$-axis and the $z$-axis, respectively. These indices are given by

\begin{align}
n_{x}=\mathrm{mod}(n-1,n_{\mathrm{max}})+1,\quad n_{z}=\lceil n/n_{\mathrm{max}}\rceil, \tag{4}
\end{align}

where $n_{\mathrm{max}}$ is the maximum number of meta-atoms on each row of the receive metasurface. For the sake of simplicity, we assume square surfaces at both the transmitter and receiver sides, i.e., $M=m_{\mathrm{max}}^{2}$ and $N=n_{\mathrm{max}}^{2}$.
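
The index mapping in (2) and the in-plane distance in (1) can be sketched in a few lines of Python. The helper names `grid_index` and `in_plane_distance`, as well as the numerical values, are hypothetical choices for illustration. The same functions apply to the receive side by substituting $n_{\mathrm{max}}$ and $t_{e,r}$.

```python
import math


def grid_index(m, m_max):
    """Map a 1-based meta-atom index m to its (x, z) indices, as in (2)."""
    m_x = (m - 1) % m_max + 1
    m_z = -(-m // m_max)  # integer ceiling of m / m_max
    return m_x, m_z


def in_plane_distance(m, m_tilde, m_max, spacing):
    """Distance between two meta-atoms on the same metasurface, as in (1)."""
    m_x, m_z = grid_index(m, m_max)
    t_x, t_z = grid_index(m_tilde, m_max)
    return spacing * math.hypot(m_z - t_z, m_x - t_x)


# Hypothetical 3 x 3 surface (M = 9) with adjacent-element spacing 0.5.
corner_to_corner = in_plane_distance(1, 9, 3, 0.5)
```

For this square surface, meta-atoms 1 and 9 sit on opposite corners, so their distance equals the spacing multiplied by $\sqrt{8}$.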

By assuming uniform spacing among all surfaces and that all surfaces are parallel for the sake of simplicity, we define the transmission distance from meta-atom $\tilde{m}$ on the transmit metasurface $(l-1)$ to meta-atom $m$ on the transmit metasurface $l$ as

\begin{align}
r_{m,\tilde{m}}^{l}=\sqrt{r_{m,\tilde{m}}^{2}+d_{t}^{2}},~l\in\mathcal{L}\setminus\{1\}, \tag{5}
\end{align}

where $d_{t}=D_{t}/L$ is the spacing between any two adjacent metasurfaces at the transmitter SIM, with $D_{t}$ describing the thickness of the transmitter SIM.

Similarly, we define the distance from meta-atom $n$ on the receive metasurface $k$ to meta-atom $\tilde{n}$ on the receive metasurface $(k-1)$ as

\begin{align}
t_{\tilde{n},n}^{k}=\sqrt{d_{r}^{2}+t_{\tilde{n},n}^{2}},~k\in\mathcal{K}\setminus\{1\}, \tag{6}
\end{align}

where $d_{r}=D_{r}/K$ is the spacing between any two adjacent metasurfaces at the receiver SIM, with $D_{r}$ describing the thickness of the receiver SIM.

In addition, by assuming that the centers of the transmit and receive antenna arrays are aligned with the centers of all metasurfaces, while both antenna arrays are arranged as uniform linear arrays with element spacing $\lambda/2$, the distances from the $s$th source to the $m$th meta-atom on the input metasurface of the transmitter SIM and from the $n$th meta-atom on the output metasurface of the receiver SIM to the $s$th destination are provided by (7) and (8), respectively.

\begin{align}
r_{m,s}^{1}&=\sqrt{\left[\left(m_{z}-\frac{m_{\mathrm{max}}+1}{2}\right)r_{e,t}-\left(s-\frac{N_{t}+1}{2}\right)\frac{\lambda}{2}\right]^{2}+\left(m_{x}-\frac{m_{\mathrm{max}}+1}{2}\right)^{2}r_{e,t}^{2}+d_{t}^{2}}, \tag{7}\\
t_{s,n}^{1}&=\sqrt{\left[\left(n_{z}-\frac{n_{\mathrm{max}}+1}{2}\right)t_{e,r}-\left(s-\frac{N_{r}+1}{2}\right)\frac{\lambda}{2}\right]^{2}+\left(\frac{n_{\mathrm{max}}+1}{2}-n_{x}\right)^{2}t_{e,r}^{2}+d_{r}^{2}}. \tag{8}
\end{align}
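
To make the geometry behind (7) concrete, the short Python sketch below evaluates the distance from the $s$th transmit antenna to the $m$th meta-atom of the input layer. The function name `antenna_to_atom_distance` and all numerical values are hypothetical and chosen only for illustration. With centered arrays, the central antenna faces the central meta-atom, so their distance reduces to $d_{t}$, which gives a quick sanity check.

```python
import math


def antenna_to_atom_distance(m, s, m_max, n_t, r_e, d, lam):
    """Distance from antenna s to meta-atom m on the input layer, as in (7)."""
    m_x = (m - 1) % m_max + 1   # index along the x-axis, see (2)
    m_z = -(-m // m_max)        # index along the z-axis, see (2)
    center = (m_max + 1) / 2
    # Offset along z: meta-atom position minus antenna position on the array.
    dz = (m_z - center) * r_e - (s - (n_t + 1) / 2) * lam / 2
    # Offset along x: the linear antenna array has no extent along x.
    dx = (m_x - center) * r_e
    return math.sqrt(dz ** 2 + dx ** 2 + d ** 2)


# Hypothetical setup: 3 x 3 input layer, N_t = 3 antennas, wavelength 1 cm.
lam = 0.01
center_distance = antenna_to_atom_distance(5, 2, 3, 3, lam / 2, lam, lam)
```

The receive-side distance in (8) follows from the same function after substituting $n_{\mathrm{max}}$, $N_{r}$, $t_{e,r}$, and $d_{r}$.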

II-B Channel Model

According to the Rayleigh-Sommerfeld diffraction theory [33], the transmission coefficient from meta-atom $\tilde{m}$ on the transmit metasurface layer $(l-1)$ to meta-atom $m$ on the transmit metasurface layer $l$ is given by

\begin{align}
w_{m,\tilde{m}}^{l}=\frac{A_{t}\cos x_{m,\tilde{m}}^{l}}{r_{m,\tilde{m}}^{l}}\left(\frac{1}{2\pi r^{l}_{m,\tilde{m}}}-j\frac{1}{\lambda}\right)e^{j2\pi r_{m,\tilde{m}}^{l}/\lambda},~l\in\mathcal{L}, \tag{9}
\end{align}

where Atsubscript𝐴𝑡A_{t}italic_A start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT denotes the meta-atom area at the transmitter SIM, xm,m~lsuperscriptsubscript𝑥𝑚~𝑚𝑙x_{m,\tilde{m}}^{l}italic_x start_POSTSUBSCRIPT italic_m , over~ start_ARG italic_m end_ARG end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT is the angle between the propagation direction and the normal direction of the transmit metasurface layer (l1)𝑙1(l-1)( italic_l - 1 ), and rm,m~lsuperscriptsubscript𝑟𝑚~𝑚𝑙r_{m,\tilde{m}}^{l}italic_r start_POSTSUBSCRIPT italic_m , over~ start_ARG italic_m end_ARG end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_l end_POSTSUPERSCRIPT, is the respective transmission distance. Hence, the effect of the transmitter SIM can be written as

\begin{align}
\mathbf{P}=\bm{\Phi}^{L}\mathbf{W}^{L}\cdots\bm{\Phi}^{2}\mathbf{W}^{2}\bm{\Phi}^{1}\mathbf{W}^{1}\in\mathbb{C}^{M\times N_{t}}, \tag{10}
\end{align}

where $\mathbf{W}^{l}\in\mathbb{C}^{M\times M},l\in\mathcal{L}\setminus\{1\}$ is the transmission coefficient matrix between the transmit metasurface layer $(l-1)$ and the transmit metasurface layer $l$, while $\mathbf{W}^{1}\in\mathbb{C}^{M\times N_{t}}$ is the transmission coefficient matrix from the transmit antenna array to the input metasurface layer of the transmit SIM.
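As a rough numerical illustration of (9) and (10), the following NumPy sketch builds the inter-layer coefficient matrices and the cascaded response $\mathbf{P}$. The planar square-grid layout, the layer spacing `layer_gap`, the relation $\cos x = \text{gap}/r$ for parallel layers, the phase-only diagonal form of $\bm{\Phi}^{l}$, and all function names are assumptions made for illustration and are not specified in this section.

```python
import numpy as np

def grid_positions(n_side, spacing):
    """Planar (x, y) coordinates of a centred n_side x n_side grid (assumed layout)."""
    idx = (np.arange(n_side) - (n_side - 1) / 2.0) * spacing
    xx, yy = np.meshgrid(idx, idx, indexing="ij")
    return np.stack([xx.ravel(), yy.ravel()], axis=1)

def layer_coefficients(dst_xy, src_xy, layer_gap, area, wavelength):
    """Coefficients of the form (9) between two parallel planes separated by layer_gap."""
    diff = dst_xy[:, None, :] - src_xy[None, :, :]
    r = np.sqrt(np.sum(diff ** 2, axis=2) + layer_gap ** 2)  # transmission distance
    cos_x = layer_gap / r  # angle to the layer normal, valid for parallel layers
    amplitude = area * cos_x / r
    return amplitude * (1.0 / (2.0 * np.pi * r) - 1j / wavelength) * np.exp(
        1j * 2.0 * np.pi * r / wavelength
    )

def sim_response(W_list, phase_list):
    """Cascade Phi^L W^L ... Phi^1 W^1 as in (10); phase_list holds phases in radians."""
    P = np.diag(np.exp(1j * np.asarray(phase_list[0]))) @ W_list[0]
    for W, theta in zip(W_list[1:], phase_list[1:]):
        P = np.diag(np.exp(1j * np.asarray(theta))) @ W @ P
    return P

# Hypothetical example: L = 3 layers, M = 16 meta-atoms, N_t = 4 antennas.
wavelength = 0.01
atoms = grid_positions(4, wavelength / 2)
antennas = grid_positions(2, wavelength / 2)
area = (wavelength / 2) ** 2
W1 = layer_coefficients(atoms, antennas, wavelength, area, wavelength)
W2 = layer_coefficients(atoms, atoms, wavelength, area, wavelength)
rng = np.random.default_rng(0)
phases = [rng.uniform(0.0, 2.0 * np.pi, 16) for _ in range(3)]
P = sim_response([W1, W2, W2], phases)
print(P.shape)  # (16, 4)
```

The same two helpers could be reused for the receiver SIM by swapping the roles of source and destination planes and reversing the order of the product.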

The transmission coefficient from meta-atom $n$ on the receive metasurface layer $k$ to meta-atom $\tilde{n}$ on the receive metasurface layer $(k-1)$ is given by

\begin{align}
u_{\tilde{n},n}^{k}=\frac{A_{r}\cos\zeta_{\tilde{n},n}^{k}}{t_{\tilde{n},n}^{k}}\left(\frac{1}{2\pi t^{k}_{\tilde{n},n}}-j\frac{1}{\lambda}\right)e^{j2\pi t_{\tilde{n},n}^{k}/\lambda},\quad k\in\mathcal{K}, \tag{11}
\end{align}

where $A_{r}$ is the meta-atom area in the receiver SIM, $\zeta_{\tilde{n},n}^{k}$ is the angle between the propagation direction and the normal direction of the receive metasurface layer $(k-1)$, and $t_{\tilde{n},n}^{k}$ is the corresponding transmission distance. Thus, the effect of the receiver SIM is expressed by

\begin{align}
\mathbf{Z}=\mathbf{U}^{1}\bm{\Psi}^{1}\mathbf{U}^{2}\bm{\Psi}^{2}\cdots\mathbf{U}^{K}\bm{\Psi}^{K}\in\mathbb{C}^{N_{r}\times N}, \tag{12}
\end{align}

where $\mathbf{U}^{k}\in\mathbb{C}^{N\times N},k\in\mathcal{K}\setminus\{1\}$ is the transmission coefficient matrix from the receive metasurface layer $k$ to the receive metasurface layer $(k-1)$, and $\mathbf{U}^{1}\in\mathbb{C}^{N_{r}\times N}$ is the transmission coefficient matrix from the output metasurface layer of the receiver SIM to the receive antenna array.

Remark 3

Practical hardware imperfections such as innate modeling errors [27] may cause the transmission coefficients between adjacent metasurface layers to deviate from those given by (9) and (11). In this case, these coefficients must be calibrated for each individual SIM. Although the calibration process is beyond the scope of this work, one solution is to measure the response at the receive panel after the transmission of a known signal, as mentioned in [29]. The transmission coefficients could then be updated by applying the standard error back-propagation algorithm [34].

The HMIMO channel between the transmitter and receiver SIMs is written as [35]

\begin{align}
\mathbf{G}=\mathbf{R}^{1/2}_{\mathrm{R}}\tilde{\mathbf{G}}\mathbf{R}^{1/2}_{\mathrm{T}}\in\mathbb{C}^{N\times M}, \tag{13}
\end{align}

where $\tilde{\mathbf{G}}\sim\mathcal{CN}(\mathbf{0},\mathrm{PL}\,\mathbf{I}_{N}\otimes\mathbf{I}_{M})\in\mathbb{C}^{N\times M}$ denotes the independent and identically distributed (i.i.d.) Rayleigh fading channel, $\mathbf{R}_{\mathrm{T}}\in\mathbb{C}^{M\times M}$ is the spatial correlation matrix at the transmitter SIM, and $\mathbf{R}_{\mathrm{R}}\in\mathbb{C}^{N\times N}$ is the spatial correlation matrix at the receiver SIM. Note that $\mathrm{PL}$ corresponds to the average path loss between the transmitter and receiver SIMs. In particular, in the case of isotropic scattering and far-field propagation [24, 36], the spatial correlation matrices at the transmitter and receiver SIMs are given by [26]

\begin{align}
[\mathbf{R}_{\mathrm{T}}]_{m,\tilde{m}}&=\mathrm{sinc}(2r_{m,\tilde{m}}/\lambda),\quad m\in\mathcal{M},\ \tilde{m}\in\mathcal{M}, \tag{14}\\
[\mathbf{R}_{\mathrm{R}}]_{\tilde{n},n}&=\mathrm{sinc}(2t_{\tilde{n},n}/\lambda),\quad \tilde{n}\in\mathcal{N},\ n\in\mathcal{N}, \tag{15}
\end{align}

respectively.
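The correlated channel in (13) to (15) can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the normalized convention $\mathrm{sinc}(x)=\sin(\pi x)/(\pi x)$ (which is what `np.sinc` computes), a path loss supplied as a linear gain, and hypothetical helper names (`sinc_correlation`, `psd_sqrt`, `correlated_channel`) that do not come from the paper.

```python
import numpy as np

def sinc_correlation(xy, wavelength):
    """Correlation matrix with entries sinc(2 d / wavelength), d = pairwise distance."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    return np.sinc(2.0 * d / wavelength)  # np.sinc(x) = sin(pi x) / (pi x)

def psd_sqrt(R):
    """Hermitian square root via eigendecomposition; tiny negative eigenvalues are clipped."""
    vals, vecs = np.linalg.eigh(R)
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def correlated_channel(R_rx, R_tx, path_gain_linear, rng):
    """Draw G = R_R^{1/2} G_iid R_T^{1/2} with CN(0, path_gain_linear) i.i.d. entries."""
    n_rx, n_tx = R_rx.shape[0], R_tx.shape[0]
    g_iid = np.sqrt(path_gain_linear / 2.0) * (
        rng.standard_normal((n_rx, n_tx)) + 1j * rng.standard_normal((n_rx, n_tx))
    )
    return psd_sqrt(R_rx) @ g_iid @ psd_sqrt(R_tx)

# Hypothetical example: 4 elements on a line with quarter-wavelength spacing.
wavelength = 1.0
xy = np.stack([np.arange(4) * wavelength / 4, np.zeros(4)], axis=1)
R = sinc_correlation(xy, wavelength)
G = correlated_channel(R, R, 1e-6, np.random.default_rng(1))
print(G.shape)  # (4, 4)
```

With half-wavelength spacing on a line, the off-diagonal entries would vanish because the normalized sinc is zero at nonzero integers, so sub-half-wavelength spacing is what makes the correlation visible in this sketch.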

The path loss, which attenuates the received signal, is given by [37]

\begin{align}
\mathrm{PL}(d)=\mathrm{PL}(d_{0})+10b\log_{10}\left(\frac{d}{d_{0}}\right)+X_{\delta},\quad d\geq d_{0}, \tag{16}
\end{align}

where $X_{\delta}$ is a Gaussian random variable with a zero mean and a standard deviation $\delta$ that depends on shadow fading, $b$ is the path loss exponent, and $\mathrm{PL}(d_{0})=20\log_{10}(4\pi d_{0}/\lambda)~\mathrm{dB}$ denotes the free-space path loss at the reference distance $d_{0}$.
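For concreteness, the log-distance model in (16) can be evaluated as in the sketch below. The function names, the optional shadowing argument, and the conversion to a linear gain are illustrative assumptions; the numerical values are hypothetical and are not taken from the paper's simulation setup.

```python
import numpy as np

def path_loss_db(d, d0, wavelength, exponent, shadow_std_db=0.0, rng=None):
    """Path loss in dB following (16); shadowing is drawn only if rng is provided."""
    if d < d0:
        raise ValueError("the model is only defined for d >= d0")
    pl_d0 = 20.0 * np.log10(4.0 * np.pi * d0 / wavelength)  # free-space loss at d0
    shadow = 0.0
    if rng is not None and shadow_std_db > 0.0:
        shadow = rng.normal(0.0, shadow_std_db)  # zero-mean Gaussian term X_delta
    return pl_d0 + 10.0 * exponent * np.log10(d / d0) + shadow

def path_gain_linear(pl_db):
    """Convert a loss in dB to a linear power gain (assumed convention)."""
    return 10.0 ** (-pl_db / 10.0)

# Hypothetical example without shadowing: 4*pi*d0/wavelength = 1000 gives 60 dB at d0.
pl = path_loss_db(d=10.0, d0=1.0, wavelength=4.0 * np.pi / 1000.0, exponent=3.5)
print(round(pl, 6))  # 95.0
```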

The received signal vector at the destination is given by

\begin{align}
\mathbf{y}=\mathbf{H}\mathbf{x}+\mathbf{n}, \tag{17}
\end{align}

where $\mathbf{x}\in\mathbb{C}^{N_{t}\times 1}$ is the transmit signal vector, $\mathbf{n}\in\mathbb{C}^{N_{r}\times 1}$ is the noise vector distributed as $\mathcal{CN}\left(\mathbf{0},N_{0}\mathbf{I}\right)$, and $\mathbf{H}\in\mathbb{C}^{N_{r}\times N_{t}}$ is the end-to-end channel that can be written as

\begin{align}
\mathbf{H}=\mathbf{Z}\mathbf{G}\mathbf{P}. \tag{18}
\end{align}

We assume that $\mathbb{E}\{\mathbf{x}^{\mathsf{H}}\mathbf{x}\}\leq P$, where $P$ is the maximum average transmit power. Also, we denote $\mathbf{Q}=\mathbb{E}\{\mathbf{x}\mathbf{x}^{\mathsf{H}}\}$, where $\mathbf{Q}\succeq\mathbf{0}$ is the transmit covariance matrix. Note that the transmit power constraint can be rewritten as $\mathrm{tr}(\mathbf{Q})\leq P$.

Remark 4

To the best of our knowledge, this is the first paper to leverage a hybrid digital and wave-based beamforming design. Hence, we focus on the point-to-point MIMO case and assume that the channels associated with different meta-atoms have been estimated by existing methods, e.g., [38].

Remark 5

We highlight that the SIM aims to approach the performance of fully digital systems. The network performance and the impact of time-varying channels in point-to-point MIMO under the fully digital architecture have been well studied. Extending the proposed method to multicell networks by relying on existing works, e.g., [39], is straightforward and would motivate future research. For example, in [40, 41] the authors have evaluated the performance of SIM under multiuser scenarios, demonstrating the capability of suppressing interference by leveraging wave-based beamforming.

Remark 6

Furthermore, under time-varying channel conditions, the wave-based beamforming would have to be designed with imperfect or outdated CSI [6]. This requires a robust beamforming design that extends the proposed optimization method, which is beyond the scope of this paper and is left for future research.

II-C Problem Formulation

In this work, we aim at maximizing the achievable rate of the SIM-assisted HMIMO wireless communication system. For a given covariance matrix $\mathbf{Q}$, assuming Gaussian signaling, the achievable rate can be written as

\begin{align}
R=\log_{2}\det\left(\mathbf{I}+\frac{1}{N_{0}}\mathbf{H}\mathbf{Q}\mathbf{H}^{\mathsf{H}}\right)\quad(\mathrm{bit/s/Hz}), \tag{19}
\end{align}

where $\mathbf{H}$ is perfectly known at both the transmitter and the receiver, and depends on $\bm{\phi}^{l},l\in\mathcal{L}$ and $\bm{\psi}^{k},k\in\mathcal{K}$.
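A short numerical sketch of (19) is given next. It assumes that the end-to-end channel $\mathbf{H}$ is already available as a NumPy array; the function name `achievable_rate` and the toy matrices are hypothetical. The log-determinant is evaluated with `np.linalg.slogdet`, a numerically safer alternative to computing the determinant directly.

```python
import numpy as np

def achievable_rate(H, Q, noise_power):
    """Evaluate log2 det(I + H Q H^H / N0) in bit/s/Hz, as in (19)."""
    n_rx = H.shape[0]
    A = np.eye(n_rx) + (H @ Q @ H.conj().T) / noise_power
    _, logdet = np.linalg.slogdet(A)  # natural-log determinant of a positive definite matrix
    return logdet / np.log(2.0)

# Toy example: two parallel unit-gain channels with unit power each and N0 = 1.
H = np.eye(2, dtype=complex)
Q = np.eye(2, dtype=complex)
print(achievable_rate(H, Q, 1.0))  # close to 2.0, since det(2 I_2) = 4
```

Since $\mathbf{H}=\mathbf{Z}\mathbf{G}\mathbf{P}$, any change of the phase shifts alters $\mathbf{H}$ and hence the value returned by this function, which is what makes the rate a function of the SIM configuration.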

Mathematically, the optimization problem can be formulated as

(20a)
(20b)
(20c)
(20d)
(20e)
(20f)

where we have denoted $\bar{\mathbf{H}}=\mathbf{H}/\sqrt{N_{0}}$.

III Achievable Rate Optimization

We observe that problem $(\mathcal{P})$ is nonconvex: its objective function is neither concave nor convex with respect to its variables, and the constant-modulus constraints are nonconvex. Hence, contrary to conventional MIMO systems, the water-filling algorithm cannot be used to obtain the maximum achievable rate. Also, previously proposed optimization methods for RIS-assisted systems have relied on alternating optimization (AO) [9, 19], where the covariance matrix and the RIS phase shifts are optimized separately in an alternating fashion. However, despite the easy implementation of the AO method, its convergence may require many iterations, the number of which increases with the number of RIS elements [20]. Since each metasurface in a SIM-assisted HMIMO system has a large number of elements, AO is not recommended. These observations motivate us to apply an efficient projected gradient method similar to [42, 20], where the covariance matrix and the phase shifts at the transmitter and receiver SIMs are optimized simultaneously.

III-A Proposed Algorithm

According to the proposed approach, we optimize all variables simultaneously in each iteration instead of optimizing a single variable at a time. The numerical results section will demonstrate the faster convergence compared to the AO method.

We outline the proposed algorithm for solving (20) in Algorithm 1. The central idea is to start from an arbitrary point $(\mathbf{Q}^{0},\bm{\phi}_{l}^{0},\bm{\psi}_{k}^{0})$ and move in the direction of $\nabla f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$, i.e., the gradient of $f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$. The parameter $\mu_{n}^{q}>0$ for $q=1,2,3$ determines the step size of this move.

For the description of the proposed algorithm, we make use of the following sets.

\begin{align}
\mathcal{Q}&=\{\mathbf{Q}\in\mathbb{C}^{N_{t}\times N_{t}}:\mathrm{tr}(\mathbf{Q})\leq P;\ \mathbf{Q}\succeq\mathbf{0}\}, \tag{20u}\\
\Phi_{l}&=\{\bm{\phi}_{l}\in\mathbb{C}^{M\times 1}:|\phi^{l}_{i}|=1,\ i=1,\ldots,M\}, \tag{20v}\\
\Psi_{k}&=\{\bm{\psi}_{k}\in\mathbb{C}^{N\times 1}:|\psi^{k}_{i}|=1,\ i=1,\ldots,N\}. \tag{20w}
\end{align}

Note that after each step along the gradient of $f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$, the newly computed points $\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k}$ are projected onto their feasible sets $\mathcal{Q}$, $\Phi_{l}$, and $\Psi_{k}$, respectively. Otherwise, the updated point may lie outside the feasible set. Below, we provide $\nabla_{\mathbf{Q}}f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$, $\nabla_{\bm{\phi}_{l}}f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$, and $\nabla_{\bm{\psi}_{k}}f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$, which correspond to the directions in which the rate of change of $f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$ is maximum [43, Theorem 3.4].

Algorithm 1 Projected Gradient Ascent Method for SIM-assisted HMIMO Systems
1:  Input: $\mathbf{Q}^{0},\bm{\phi}_{l}^{0},\bm{\psi}_{k}^{0},\mu_{n}^{q}>0$ for $q=1,2,3$.
2:  for $n=1,2,\ldots$ do
3:       $\mathbf{Q}^{n+1}=P_{Q}(\mathbf{Q}^{n}+\mu_{n}^{1}\nabla_{\mathbf{Q}}f(\mathbf{Q}^{n},\bm{\phi}_{l}^{n},\bm{\psi}_{k}^{n}))$
4:       $\bm{\phi}_{l}^{n+1}=P_{\Phi_{l}}(\bm{\phi}_{l}^{n}+\mu_{n}^{2}\nabla_{\bm{\phi}_{l}}f(\mathbf{Q}^{n},\bm{\phi}_{l}^{n},\bm{\psi}_{k}^{n}))$
5:       $\bm{\psi}_{k}^{n+1}=P_{\Psi_{k}}(\bm{\psi}_{k}^{n}+\mu_{n}^{3}\nabla_{\bm{\psi}_{k}}f(\mathbf{Q}^{n},\bm{\phi}_{l}^{n},\bm{\psi}_{k}^{n}))$
6:  end for

Note that $P_{Q}(\cdot)$, $P_{\Phi_{l}}(\cdot)$, and $P_{\Psi_{k}}(\cdot)$ denote the projections onto $\mathcal{Q}$, $\Phi_{l}$, and $\Psi_{k}$, respectively.
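The projections used in Algorithm 1 are not spelled out in this section. The sketch below shows one common way they could be realized, under the assumption that Euclidean (Frobenius-norm) projections are intended: entrywise normalization for the unit-modulus sets, and an eigenvalue projection onto $\{\lambda_{i}\geq 0,\sum_{i}\lambda_{i}\leq P\}$ for $\mathcal{Q}$. The function names are hypothetical.

```python
import numpy as np

def project_unit_modulus(v):
    """Map each entry to the nearest point on the unit circle (zeros are mapped to 1)."""
    v = np.asarray(v, dtype=complex)
    mag = np.abs(v)
    out = np.ones_like(v)
    nonzero = mag > 0
    out[nonzero] = v[nonzero] / mag[nonzero]
    return out

def project_psd_trace(Q, power):
    """Frobenius-norm projection onto {Q >= 0, tr(Q) <= power}, assuming power > 0."""
    Qh = (Q + Q.conj().T) / 2.0  # enforce Hermitian symmetry
    vals, vecs = np.linalg.eigh(Qh)
    new_vals = np.clip(vals, 0.0, None)
    if new_vals.sum() > power:
        # Project the eigenvalues onto the simplex {x >= 0, sum(x) = power}.
        u = np.sort(vals)[::-1]
        css = np.cumsum(u) - power
        k = np.arange(1, len(u) + 1)
        rho = np.nonzero(u - css / k > 0)[0][-1]
        tau = css[rho] / (rho + 1)
        new_vals = np.clip(vals - tau, 0.0, None)
    return (vecs * new_vals) @ vecs.conj().T

# Toy example: eigenvalues (3, 1) with power budget 2 are mapped to (2, 0).
Q_proj = project_psd_trace(np.diag([3.0, 1.0]).astype(complex), 2.0)
print(np.round(np.real(np.diag(Q_proj)), 6))  # [2. 0.]
phi = project_unit_modulus(np.array([2j, -0.5, 0.0]))
print(phi)  # entries 1j, -1 and 1
```

With these two helpers, each line of the loop in Algorithm 1 reduces to a gradient step followed by one projection call, given gradient expressions such as those of Lemma 1.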

III-B Complex-Valued Gradients of $f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$

In this subsection, we provide the gradients of $f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$.

Lemma 1

The gradients of $f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$ with respect to $\mathbf{Q}^{*}$, $\bm{\phi}_{l}^{*}$, and $\bm{\psi}_{k}^{*}$ are given by

Figure 2: Convergence of the proposed algorithm and dependence on the initialization.

In Fig. 2(a), we depict the convergence of the proposed algorithm, i.e., Algorithm 1. Specifically, we plot the achievable rate of the proposed SIM-assisted HMIMO system versus the number of iterations for various numbers of meta-atoms per metasurface. Notably, the algorithm converges quickly in all cases. For example, when $M=N=100$, the algorithm converges in $60$ iterations. Furthermore, increasing the number of meta-atoms increases the number of iterations required to reach convergence, because the number of optimization variables grows and the search space is enlarged. In addition, a larger number of meta-atoms leads to higher complexity of each iteration of Algorithm 1, as mentioned in Sec. III. Furthermore, the gradient-based optimization method converges quickly to the optimum rate, while the AO method requires many iterations. Note that as $M$ and $N$ increase, the AO method approaches the optimum rate more slowly.

In addition, the non-convexity of the optimization problem means that its result depends on the initial point. In other words, different initial points may lead to different locally optimal solutions. Fig. 2(b) explores this dependence on the initialization by executing the algorithm for $30$ channel realizations. According to its description, the initialization of Algorithm 1 assumes that $\mathbf{Q}^{0}=\mathbf{I}_{S}$, $\bm{\phi}_{l}^{0}=\exp\left(j\pi/2\right)\mathbf{1}_{M}$, and $\bm{\psi}_{k}^{0}=\exp\left(j\pi/2\right)\mathbf{1}_{N}$. “Alg. 1-Test” in the figure considers the best initial point out of $50$ random initial points for each channel instance. The figure reveals that different initializations lead to different solutions, yet the achievable rate in both cases is almost the same. This suggests that the chosen initial values are a good choice.
In the same figure, we have plotted the case corresponding to different step sizes during the optimization of the three variables, where $L_{o}^{3}=0.5L_{o}^{1}$ and $L_{o}^{2}=0.1L_{o}^{1}$ with $L_{o}^{1}=10^{4}$. As can be seen, in this case, the optimal value is achieved sooner.

In practice, we do not wait for an optimization algorithm to reach a critical point, but stop once it is sufficiently close. For this reason, in the following table, we compare the computational complexity that the proposed projected gradient ascent method and the AO method require to acquire an achievable rate equal to $95\,\%$ of the average achievable rate at the $100$th iteration. The complexity of the AO is characterized in terms of the number of outer iterations $I_{\mathrm{OI}}$, which is required to obtain the optimal achievable rate. Note that one outer iteration is actually a sequence of $M+N+1$ conventional iterations, which update the entries of $\bm{\phi}^{l}=\{\phi_{m}^{l}\}_{m=1}^{M}$ and $\bm{\psi}^{l}=\{\psi_{n}^{l}\}_{n=1}^{N}$, and $\mathbf{Q}$. $C_{\mathrm{Alg.}1,\mathrm{IT}}$ is the computational complexity per iteration, while $I_{\mathrm{Alg.}1}$ expresses the number of iterations required to obtain the optimal achievable rate.
Their product gives the total computational complexity $C_{\mathrm{Alg.}1}$ of the proposed projected gradient ascent method. The computational complexity of the AO is denoted by $C_{\mathrm{AO}}$. Table I presents the computational complexity of the proposed algorithm and the AO method for varying $M$. We observe that $C_{\mathrm{Alg.}1}$ becomes larger with an increase of $M$. Moreover, the computational complexity of AO also increases with $M$. However, as can be seen, the complexity of AO is greater than that of the proposed algorithm for all considered values of $M$.

TABLE I: Computational complexity comparison between the proposed projected gradient ascent method and the AO method to reach 95 % of the average achievable rate at the 100th iteration.
\begin{tabular}{cccccc}
$M$ & $I_{\mathrm{Alg.}1}$ & $C_{\mathrm{Alg.}1,\mathrm{IT}}$ & $C_{\mathrm{Alg.}1}$ & $I_{\mathrm{OI}}$ & $C_{\mathrm{AO}}$ \\
\hline
5 & 97 & 11357 & 1101629 & 1 & 1648395 \\
25 & 86 & 20114 & 1729804 & 1 & 2793758 \\
60 & 71 & 32646 & 2317866 & 1 & 3594682 \\
100 & 62 & 51476 & 3208252 & 1 & 4732857 \\
\end{tabular}
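As a small illustration of how the entries of Table I relate to each other, the sketch below forms the total cost as the product of the iteration count and the per-iteration cost, and computes the ratio of the tabulated AO cost to the tabulated cost of the proposed method. The numbers are copied from Table I; the function and variable names are assumptions made for this example.

```python
# Rows of Table I: (M, I_Alg1, C_Alg1_IT, C_Alg1, I_OI, C_AO).
TABLE_I = [
    (5, 97, 11357, 1101629, 1, 1648395),
    (25, 86, 20114, 1729804, 1, 2793758),
    (60, 71, 32646, 2317866, 1, 3594682),
    (100, 62, 51476, 3208252, 1, 4732857),
]

def total_complexity(iterations, cost_per_iteration):
    """Total cost as the product of iteration count and per-iteration cost."""
    return iterations * cost_per_iteration

def ao_cost_ratio(row):
    """Ratio C_AO / C_Alg.1 taken from the tabulated totals of one row."""
    _, _, _, c_alg1, _, c_ao = row
    return c_ao / c_alg1

if __name__ == "__main__":
    for row in TABLE_I:
        print(f"M = {row[0]:3d}: C_AO / C_Alg.1 = {ao_cost_ratio(row):.2f}")
```

For the tabulated values, the ratio stays above one for every $M$, i.e., AO is roughly one and a half times as costly as the proposed method at the 95 % target.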
Figure 3: Achievable rate versus the number of transmit metasurface layers $L$ for $K=2$ layers of receive metasurfaces. Impact of phase shifts optimization of the SIMs.

In Fig. 3, we show the achievable rate versus the number of transmit metasurface layers $L$ for $K=2$ layers of receive metasurfaces, and we focus on the impact of optimizing the phase shifts of the SIMs. The case of optimal phase shifts achieves the best performance, while the cases of random phase shifts and equal phase shifts achieve a lower rate. In particular, the line corresponding to equal phase shifts presents the worst performance.

Figure 4: Achievable rate versus the number of transmit metasurface layers $L$ while varying the number of receiver metasurface layers $K$.

In Fig. 4, we depict the achievable rate versus the number of transmit metasurface layers $L$ while varying the number of receiver metasurface layers $K$. We observe that the achievable rate saturates as the number of transmit metasurface layers increases, i.e., for $L\geq 3$. Also, it is shown that after a certain number of metasurfaces, the system approaches the rate obtained with only digital precoding. Digital precoding (conventional MIMO system) has been implemented for the sake of comparison. Of course, care should be taken regarding the thickness of the SIM since densely implemented metasurfaces may lead to performance loss. Such a loss may occur when the number of layers is increased beyond a threshold. Notably, for the considered parameter values, increasing $L$ and $K$ beyond $3$ and $6$, respectively, does not improve the achievable rate.

(a) Achievable rate versus the number of meta-atoms per transmit layer.
Figure 5: Impact of the numbers of meta-atoms per transmit layer and transmit antennas.

In Fig. III-B.(a), we illustrate the achievable rate versus the number of meta-atoms per transmit layer. As can be seen, the rate increases monotonically when the number of meta-atoms increases at the transmitter and receiver SIMs. Moreover, it is better to employ the receiver SIM with more meta-atoms per surface. For example, the case $M=49,N=100$ performs better than $M=100,N=49$. In Fig. III-B.(b), we show the achievable rate versus the number of transmit antennas $N_{t}$ for a varying number of meta-atoms $M,N$. The rate saturates for a large number of data streams due to increasing multiplexing gain. Moreover, as the number of meta-atoms increases, the rate is improved.

VI Conclusion

In this paper, we proposed a SIM-assisted HMIMO communication system, where the transmitter and the receiver are implemented by SIMs that enable wave-based analog precoding and combining. Specifically, we formulated the achievable rate problem, and we proposed an efficient gradient ascent algorithm that optimizes the covariance matrix of the transmitted signal and the SIM phase shifts at both sides of the transceiver simultaneously. Moreover, we obtained a Lipschitz constant guaranteeing the convergence of the proposed iterative algorithm. Numerical results confirmed that the proposed algorithm requires fewer iterations than the commonly used AO approach. Finally, we showed the superiority of the architecture over its single-RIS counterpart and the conventional MIMO system.

Appendix A Proof of Lemma 1

For the derivation of $\nabla_{\mathbf{Q}}f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$, we start by deriving its differential. We have

\begin{align}
d(f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})) &= \tr\left(\mathbf{K}(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})\,d(\bar{\mathbf{H}}\mathbf{Q}\bar{\mathbf{H}}^{\mathsf{H}})\right) \tag{20bg}\\
&= \tr\left(\bar{\mathbf{H}}^{\mathsf{H}}\mathbf{K}(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})\bar{\mathbf{H}}\,d(\mathbf{Q})\right), \tag{20bh}
\end{align}

where, in (20bg), we have applied the property $d(\det(\mathbf{X}))=\det(\mathbf{X})\tr(\mathbf{X}^{-1}d(\mathbf{X}))$, and in (20bh), we have applied the property $\tr(\mathbf{A}\mathbf{B})=\tr(\mathbf{B}\mathbf{A})$. Note that we have denoted $\mathbf{K}(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})=\left(\mathbf{I}+\bar{\mathbf{H}}\mathbf{Q}\bar{\mathbf{H}}^{\mathsf{H}}\right)^{-1}$.

From (20bh), we obtain $\nabla_{\mathbf{Q}}f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$ by applying the suitable property from [43, Table 3.2]. Hence, we obtain

\begin{align}
\nabla_{\mathbf{Q}}f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k}) &= \frac{\partial}{\partial\mathbf{Q}^{*}}f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k}) \tag{20bi}\\
&= \left(\frac{\partial}{\partial\mathbf{Q}}f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})\right)^{\mathsf{T}} \tag{20bj}\\
&= \left(\bar{\mathbf{H}}^{\mathsf{T}}\left(\mathbf{I}+\bar{\mathbf{H}}^{*}\mathbf{Q}^{\mathsf{T}}\bar{\mathbf{H}}^{\mathsf{T}}\right)^{-1}\bar{\mathbf{H}}^{*}\right)^{\mathsf{T}} \tag{20bk}\\
&= \bar{\mathbf{H}}^{\mathsf{H}}\left(\mathbf{I}+\bar{\mathbf{H}}\mathbf{Q}\bar{\mathbf{H}}^{\mathsf{H}}\right)^{-1}\bar{\mathbf{H}}, \tag{20bl}
\end{align}

where, in the last step, we have used that $\mathbf{Q}^{*}=\mathbf{Q}^{\mathsf{T}}$.
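The closed form (20bl) can be sanity-checked numerically against the differential (20bh). The sketch below is only an illustration under stated assumptions: it takes $f=\ln\det(\mathbf{I}+\bar{\mathbf{H}}\mathbf{Q}\bar{\mathbf{H}}^{\mathsf{H}})$ with the natural logarithm (so that no constant factor appears), uses random matrices in place of the actual channel, and compares a central finite difference along a Hermitian direction $\mathbf{D}$ with $\tr(\nabla_{\mathbf{Q}}f\,\mathbf{D})$. All names are hypothetical.

```python
import numpy as np

def rate_nats(H, Q):
    """f = ln det(I + H Q H^H); the natural log is an assumption of this sketch."""
    n = H.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(n) + H @ Q @ H.conj().T)
    return logdet

def grad_Q(H, Q):
    """Closed form H^H (I + H Q H^H)^{-1} H, as in (20bl)."""
    n = H.shape[0]
    K = np.linalg.inv(np.eye(n) + H @ Q @ H.conj().T)
    return H.conj().T @ K @ H

def directional_check(seed=0, n_r=3, n_s=4, eps=1e-6):
    """Return (finite-difference, analytic) directional derivatives of f."""
    rng = np.random.default_rng(seed)
    H = (rng.standard_normal((n_r, n_s)) + 1j * rng.standard_normal((n_r, n_s))) / np.sqrt(2)
    B = rng.standard_normal((n_s, n_s)) + 1j * rng.standard_normal((n_s, n_s))
    Q = B @ B.conj().T / n_s            # Hermitian positive semidefinite
    D = rng.standard_normal((n_s, n_s)) + 1j * rng.standard_normal((n_s, n_s))
    D = (D + D.conj().T) / 2            # Hermitian perturbation direction
    fd = (rate_nats(H, Q + eps * D) - rate_nats(H, Q - eps * D)) / (2 * eps)
    an = np.real(np.trace(grad_Q(H, Q) @ D))
    return fd, an
```

Calling `directional_check()` for a few seeds should return two numbers that agree up to the finite-difference error.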

For the derivation of $\nabla_{\bm{\phi}_{l}}f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$, we focus on its differential. We have

\begin{align}
d(f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})) &= \tr\left(\mathbf{K}(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})\left(d(\bar{\mathbf{H}})\mathbf{Q}\bar{\mathbf{H}}^{\mathsf{H}}+\bar{\mathbf{H}}\mathbf{Q}\,d(\bar{\mathbf{H}}^{\mathsf{H}})\right)\right), \tag{20bm}\\
&= \tr\left(\mathbf{Q}\bar{\mathbf{H}}^{\mathsf{H}}\mathbf{K}(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})\,d(\bar{\mathbf{H}})+\mathbf{K}(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})\bar{\mathbf{H}}\mathbf{Q}\,d(\bar{\mathbf{H}}^{\mathsf{H}})\right). \tag{20bn}
\end{align}

From (18), it is easy to check that

\begin{align}
d(\bar{\mathbf{H}})=\mathbf{Z}\bar{\mathbf{G}}\,d(\mathbf{P}), \tag{20bo}
\end{align}

where $\bar{\mathbf{G}}=\mathbf{G}/\sqrt{N_{0}}$. Also, the differential of (10) can be written as

\begin{align}
d(\mathbf{P}) &= \bm{\Phi}^{L}\mathbf{W}^{L}\cdots\bm{\Phi}^{l+1}\mathbf{W}^{l+1}d(\bm{\Phi}^{l})\mathbf{W}^{l}\bm{\Phi}^{l-1} \nonumber\\
&\quad\times\mathbf{W}^{l-1}\cdots\bm{\Phi}^{1}\mathbf{W}^{1}. \tag{20bp}
\end{align}
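Because $\mathbf{P}$ is linear in each $\bm{\Phi}^{l}$, the expression (20bp) can be checked exactly by perturbing a single layer. The following sketch assumes the layer ordering $\mathbf{P}=\bm{\Phi}^{L}\mathbf{W}^{L}\cdots\bm{\Phi}^{1}\mathbf{W}^{1}$ implied by (20bp), with $\bm{\Phi}^{l}=\mathrm{diag}(\bm{\phi}_{l})$; the dimensions, the random matrices, and the function names are illustrative assumptions.

```python
import numpy as np

def cascade(phis, Ws):
    """Return Phi^n W^n ... Phi^1 W^1 for the given layers, Phi^l = diag(phi_l)."""
    P = np.diag(phis[0]) @ Ws[0]
    for phi, W in zip(phis[1:], Ws[1:]):
        P = np.diag(phi) @ W @ P
    return P

def dP_layer(phis, Ws, idx, dphi):
    """Right-hand side of (20bp) for a perturbation dphi of layer idx (0-based)."""
    # Left factor: Phi^L W^L ... Phi^{l+1} W^{l+1} (identity for the last layer).
    left = np.eye(len(phis[idx]), dtype=complex)
    for phi, W in zip(phis[idx + 1:], Ws[idx + 1:]):
        left = np.diag(phi) @ W @ left
    # Right factor: W^l Phi^{l-1} W^{l-1} ... Phi^1 W^1.
    right = Ws[idx]
    if idx > 0:
        right = right @ cascade(phis[:idx], Ws[:idx])
    return left @ np.diag(dphi) @ right

def demo(seed=0, L=3, M=4, S=2, idx=1):
    """Compare P(phi + dphi) - P(phi) with the prediction of (20bp)."""
    rng = np.random.default_rng(seed)
    phis = [np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, M)) for _ in range(L)]
    Ws = []
    for l in range(L):
        cols = S if l == 0 else M      # first layer maps S inputs to M meta-atoms
        Ws.append(rng.standard_normal((M, cols)) + 1j * rng.standard_normal((M, cols)))
    dphi = 1e-3 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
    perturbed = [p.copy() for p in phis]
    perturbed[idx] = perturbed[idx] + dphi
    exact = cascade(perturbed, Ws) - cascade(phis, Ws)
    return exact, dP_layer(phis, Ws, idx, dphi)
```

The two returned matrices should coincide up to rounding for any layer index, which reflects the fact that only $d(\bm{\Phi}^{l})$ changes while the factors to its left and right are held fixed.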

Substituting (20bp) and (20bo) into (20bn), we obtain

\begin{align}
d(f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})) &= \tr\left(\mathbf{A}_{l}\,d(\bm{\Phi}^{l})+\mathbf{A}_{l}^{\mathsf{H}}\,d((\bm{\Phi}^{l})^{\mathsf{H}})\right), \tag{20bq}
\end{align}

where

\begin{align}
\mathbf{A}_{l} &= \mathbf{W}^{l}\bm{\Phi}^{l-1}\mathbf{W}^{l-1}\cdots\bm{\Phi}^{1}\mathbf{W}^{1}\mathbf{Q}\bar{\mathbf{H}}^{\mathsf{H}}\mathbf{K}\mathbf{Z}\bar{\mathbf{G}}\bm{\Phi}^{L} \nonumber\\
&\quad\times\mathbf{W}^{L}\cdots\bm{\Phi}^{l+1}\mathbf{W}^{l+1}. \tag{20br}
\end{align}

Now, using the property $\tr\left(\mathbf{A}\mathbf{B}\right)=\left(\mathrm{diag}\left(\mathbf{A}\right)\right)^{\mathsf{T}}\mathrm{diag}\left(\mathbf{B}\right)$ for any matrices $\mathbf{A},\mathbf{B}$ with $\mathbf{B}$ being a diagonal matrix, (20bq) becomes

From (A), we can conclude that

Similar to the derivation of (A), we obtain

$\nabla_{\bm{\psi}_{k}}f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$

where

\begin{align}
\mathbf{C}_{k} &= \mathbf{U}^{k}\bm{\Psi}^{k-1}\mathbf{U}^{k-1}\cdots\bm{\Psi}^{1}\mathbf{U}^{1}\mathbf{K}\bar{\mathbf{H}}\mathbf{Q}\mathbf{P}^{\mathsf{H}}\bar{\mathbf{G}}^{\mathsf{H}}\bm{\Psi}^{K} \nonumber\\
&\quad\times\mathbf{U}^{K}\cdots\bm{\Psi}^{k+1}\mathbf{U}^{k+1}. \tag{20bs}
\end{align}

Appendix B Proof of Proposition LABEL:proposition1

During the proof, we will use the following inequalities

\begin{align}
\|\mathbf{A}\mathbf{B}\| &\leq \lambda_{\mathrm{max}}(\mathbf{A})\|\mathbf{B}\|, \tag{20bt}\\
\|\mathbf{A}\mathbf{B}\mathbf{C}\| &\leq \lambda_{\mathrm{max}}(\mathbf{A})\lambda_{\mathrm{max}}(\mathbf{C})\|\mathbf{B}\|, \tag{20bu}
\end{align}

where $\lambda_{\mathrm{max}}(\mathbf{X})$ is the largest singular value of $\mathbf{X}$. Also, we have $\lambda_{\mathrm{max}}(\mathbf{Q})\leq P$, and $\mathbf{K}(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})=\left(\mathbf{I}+\bar{\mathbf{H}}\mathbf{Q}\bar{\mathbf{H}}^{\mathsf{H}}\right)^{-1}\preceq\mathbf{I}$, which gives

\begin{align}
\lambda_{\mathrm{max}}(\mathbf{K}(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k}))\leq 1. \tag{20bv}
\end{align}
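The bounds (20bt), (20bu), and (20bv) are easy to probe numerically. The sketch below is an illustration only: it assumes the Frobenius norm for $\|\cdot\|$ (the norm is not specified in this section), draws random complex matrices instead of actual channels, and returns the slack of each inequality, which should be nonnegative. The function names are hypothetical.

```python
import numpy as np

def sigma_max(X):
    """Largest singular value of X (spectral norm)."""
    return np.linalg.norm(X, 2)

def fro(X):
    """Frobenius norm of X."""
    return np.linalg.norm(X, "fro")

def bound_slacks(seed=0, n=4):
    """Return the slack (right side minus left side) of (20bt), (20bu), (20bv)."""
    rng = np.random.default_rng(seed)
    draw = lambda: rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    A, B, C, H = draw(), draw(), draw(), draw()
    R = draw()
    Q = R @ R.conj().T                      # Hermitian positive semidefinite
    K = np.linalg.inv(np.eye(n) + H @ Q @ H.conj().T)
    return {
        "20bt": sigma_max(A) * fro(B) - fro(A @ B),
        "20bu": sigma_max(A) * sigma_max(C) * fro(B) - fro(A @ B @ C),
        "20bv": 1.0 - sigma_max(K),
    }
```

The bound (20bv) holds because $\bar{\mathbf{H}}\mathbf{Q}\bar{\mathbf{H}}^{\mathsf{H}}$ is positive semidefinite, so every eigenvalue of $\mathbf{I}+\bar{\mathbf{H}}\mathbf{Q}\bar{\mathbf{H}}^{\mathsf{H}}$ is at least one and every eigenvalue of its inverse is at most one.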

Making use of (LABEL:gradient1), we can write

\begin{align}
&\|\nabla_{\mathbf{Q}}f(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})-\nabla_{\mathbf{Q}}f(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\| \nonumber\\
&=\|\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\bar{\mathbf{H}}_{1}-\bar{\mathbf{H}}_{2}^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\bar{\mathbf{H}}_{2}\| \nonumber\\
&\leq\|(\mathbf{P}^{1})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\mathbf{G}\mathbf{P}^{1} \nonumber\\
&\quad-(\mathbf{P}^{1})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\mathbf{G}\mathbf{P}^{2}\| \nonumber\\
&\quad+\|(\mathbf{P}^{1})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\mathbf{G}\mathbf{P}^{2} \nonumber\\
&\quad-(\mathbf{P}^{2})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{2})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\mathbf{Z}^{2}\mathbf{G}\mathbf{P}^{2}\|, \tag{20bw}
\end{align}

where, in (20bw), we have used (18).

The first term of (20bw) can be upper-bounded as

\begin{align}
&\|(\mathbf{P}^{1})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\mathbf{G}\mathbf{P}^{1} \nonumber\\
&\quad-(\mathbf{P}^{1})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\mathbf{G}\mathbf{P}^{2}\| \nonumber\\
&\leq\frac{d^{2}b^{2}f}{c^{2}}\|\mathbf{P}^{1}-\mathbf{P}^{2}\|, \tag{20bx}
\end{align}

where, in (20bx), we have used (LABEL:bg), (LABEL:Eqf), and (20bv).
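
The step behind (20bx) can be sketched as follows, assuming a submultiplicative matrix norm. The shorthand $\mathbf{M}_{1}$ is introduced here only for illustration and is not part of the original derivation. Both products share the same left factor, so that

\begin{align*}
\mathbf{M}_{1}&=(\mathbf{P}^{1})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\mathbf{G},\\
\|\mathbf{M}_{1}\mathbf{P}^{1}-\mathbf{M}_{1}\mathbf{P}^{2}\|&=\|\mathbf{M}_{1}(\mathbf{P}^{1}-\mathbf{P}^{2})\|\leq\|\mathbf{M}_{1}\|\,\|\mathbf{P}^{1}-\mathbf{P}^{2}\|,
\end{align*}

and the cited bounds are then what is needed to bound $\|\mathbf{M}_{1}\|$ by $d^{2}b^{2}f/c^{2}$.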

We upper-bound the second term in (20bw) as

\begin{align}
&\|(\mathbf{P}^{1})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\mathbf{G}\mathbf{P}^{2}\nonumber\\
&\quad-(\mathbf{P}^{2})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{2})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\mathbf{Z}^{2}\mathbf{G}\mathbf{P}^{2}\|\nonumber\\
&\leq bf\|(\mathbf{P}^{1})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\nonumber\\
&\quad-(\mathbf{P}^{2})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{2})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\mathbf{Z}^{2}\| \tag{20by}\\
&\leq bf\big(\|(\mathbf{P}^{1})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\nonumber\\
&\quad-(\mathbf{P}^{2})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\|\nonumber\\
&\quad+\|(\mathbf{P}^{2})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\nonumber\\
&\quad-(\mathbf{P}^{2})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{2})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\mathbf{Z}^{2}\|\big), \tag{20bz}
\end{align}

where, in (20by), we have used (LABEL:bg) and (LABEL:Eqd).
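
The two steps leading to (20by) and (20bz) can be sketched as follows, assuming a submultiplicative norm. The symbols $\mathbf{X}_{1}$, $\mathbf{X}_{2}$, and $\mathbf{Y}$ are illustrative shorthand that does not appear in the original derivation. Let $\mathbf{X}_{i}=(\mathbf{P}^{i})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{i})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{i},\bm{\phi}_{l}^{i},\bm{\psi}_{k}^{i})\mathbf{Z}^{i}$ for $i=1,2$, and let $\mathbf{Y}=(\mathbf{P}^{2})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}$ be the mixed term. Pulling out the common right factor $\mathbf{G}\mathbf{P}^{2}$ and then adding and subtracting $\mathbf{Y}$ gives

\begin{align*}
\|\mathbf{X}_{1}\mathbf{G}\mathbf{P}^{2}-\mathbf{X}_{2}\mathbf{G}\mathbf{P}^{2}\|&\leq\|\mathbf{X}_{1}-\mathbf{X}_{2}\|\,\|\mathbf{G}\mathbf{P}^{2}\|,\\
\|\mathbf{X}_{1}-\mathbf{X}_{2}\|&=\|(\mathbf{X}_{1}-\mathbf{Y})+(\mathbf{Y}-\mathbf{X}_{2})\|\leq\|\mathbf{X}_{1}-\mathbf{Y}\|+\|\mathbf{Y}-\mathbf{X}_{2}\|.
\end{align*}

The first line yields (20by) provided that $\|\mathbf{G}\mathbf{P}^{2}\|\leq bf$, which is what the cited bounds are used for, and the second line is the triangle inequality behind (20bz).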

The first term of (20bz) is bounded as

\begin{align}
&\|(\mathbf{P}^{1})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\nonumber\\
&\quad-(\mathbf{P}^{2})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\|\leq\frac{bd^{2}}{c^{2}}\|\mathbf{P}^{1}-\mathbf{P}^{2}\|, \tag{20ca}
\end{align}

where we have used (20bv).

The second term in (20bz) is bounded as

\begin{align}
&\|(\mathbf{P}^{2})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\nonumber\\
&\quad-(\mathbf{P}^{2})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{2})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\mathbf{Z}^{2}\|\nonumber\\
&\leq bf\|(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}-(\mathbf{Z}^{2})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\mathbf{Z}^{2}\| \tag{20cb}\\
&\leq bf\big(\|(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}-(\mathbf{Z}^{2})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\|\nonumber\\
&\quad+\|(\mathbf{Z}^{2})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}-(\mathbf{Z}^{2})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\mathbf{Z}^{2}\|\big) \tag{20cc}\\
&\leq\frac{2bfd}{c}\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\|, \tag{20cd}
\end{align}

where, in (20cb), we have used (LABEL:bg) and (20bv).

Substituting (20ca) and (20cd) into (20bz) gives

\begin{align}
&\|(\mathbf{P}^{1})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{1})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{Z}^{1}\mathbf{G}\mathbf{P}^{2}\nonumber\\
&\quad-(\mathbf{P}^{2})^{\mathsf{H}}\mathbf{G}^{\mathsf{H}}(\mathbf{Z}^{2})^{\mathsf{H}}\mathbf{K}(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\mathbf{Z}^{2}\mathbf{G}\mathbf{P}^{2}\|\nonumber\\
&\leq bf\Big(\frac{bd^{2}}{c^{2}}\|\mathbf{P}^{1}-\mathbf{P}^{2}\|+\frac{2bfd}{c}\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\|\Big). \tag{20ce}
\end{align}

Now, inserting (20bx) and (20ce) into (20bw), we obtain

\begin{align}
&\|\nabla_{\mathbf{Q}}f(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})-\nabla_{\mathbf{Q}}f(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\|\nonumber\\
&\leq bf\Big(\frac{2bd^{2}}{c^{2}}\|\mathbf{P}^{1}-\mathbf{P}^{2}\|+\frac{2bfd}{c}\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\|\Big). \tag{20cf}
\end{align}
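
As a quick check of the constants in (20cf), note that the bound in (20bx) can be written with the common factor $bf$, since $d^{2}b^{2}f/c^{2}=bf\cdot bd^{2}/c^{2}$. Adding it to (20ce) therefore doubles only the coefficient of $\|\mathbf{P}^{1}-\mathbf{P}^{2}\|$:

\begin{align*}
&\frac{d^{2}b^{2}f}{c^{2}}\|\mathbf{P}^{1}-\mathbf{P}^{2}\|+bf\Big(\frac{bd^{2}}{c^{2}}\|\mathbf{P}^{1}-\mathbf{P}^{2}\|+\frac{2bfd}{c}\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\|\Big)\\
&\quad=bf\Big(\frac{2bd^{2}}{c^{2}}\|\mathbf{P}^{1}-\mathbf{P}^{2}\|+\frac{2bfd}{c}\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\|\Big).
\end{align*}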

Using (LABEL:gradient2), we obtain

\begin{align}
&\|\nabla_{\bm{\phi}_{l}}f(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})-\nabla_{\bm{\phi}_{l}}f(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\|\nonumber\\
&\leq\|\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{A}_{l,1}^{\mathsf{H}}-\mathbf{K}(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\mathbf{A}_{l,2}^{\mathsf{H}}\| \tag{20cg}\\
&\leq a_{l}b\|\mathbf{Q}^{1}\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}-\mathbf{Q}^{2}\bar{\mathbf{H}}_{2}^{\mathsf{H}}\mathbf{Z}^{2}\| \tag{20ch}\\
&=a_{l}b\|\mathbf{Q}^{1}\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}-\mathbf{Q}^{2}\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}+\mathbf{Q}^{2}\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}-\mathbf{Q}^{2}\bar{\mathbf{H}}_{2}^{\mathsf{H}}\mathbf{Z}^{2}\| \tag{20ci}\\
&\leq a_{l}b\big(\|\mathbf{Q}^{1}\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}-\mathbf{Q}^{2}\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}\|+\|\mathbf{Q}^{2}\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}-\mathbf{Q}^{2}\bar{\mathbf{H}}_{2}^{\mathsf{H}}\mathbf{Z}^{2}\|\big), \tag{20cj}
\end{align}

where, in (20cg), we have used that $\|\mathrm{diag}(\mathbf{X})\|\leq\|\mathbf{X}\|$. In (20ch), we have used (20bv), (LABEL:al), and (LABEL:bg).
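
The diagonal inequality used in (20cg) can be verified directly in one common setting. As an assumption made here for illustration only, take $\|\cdot\|$ to be the Frobenius norm and let $\mathrm{diag}(\mathbf{X})$ collect the diagonal entries $x_{ii}$ of $\mathbf{X}$. Dropping the off-diagonal entries can only reduce the sum of squared magnitudes:

\begin{align*}
\|\mathrm{diag}(\mathbf{X})\|^{2}=\sum_{i}|x_{ii}|^{2}\leq\sum_{i}\sum_{j}|x_{ij}|^{2}=\|\mathbf{X}\|_{F}^{2}.
\end{align*}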

We upper-bound the first term on the right-hand side of (20cj) as

\begin{align}
\|\mathbf{Q}^{1}\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}-\mathbf{Q}^{2}\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}\|\leq c\|\mathbf{Q}^{1}-\mathbf{Q}^{2}\|, \tag{20ck}
\end{align}

where we have used (LABEL:Eqd).

The second term of (20cj) can be bounded as

\begin{align}
\|\mathbf{Q}^{2}\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}-\mathbf{Q}^{2}\bar{\mathbf{H}}_{2}^{\mathsf{H}}\mathbf{Z}^{2}\| &\leq P\|\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}-\bar{\mathbf{H}}_{2}^{\mathsf{H}}\mathbf{Z}^{2}\| \tag{20cl}\\
&= P\|\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}-\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{2}+\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{2}-\bar{\mathbf{H}}_{2}^{\mathsf{H}}\mathbf{Z}^{2}\| \tag{20cm}\\
&\leq P\big(\|\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}-\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{2}\|+\|\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{2}-\bar{\mathbf{H}}_{2}^{\mathsf{H}}\mathbf{Z}^{2}\|\big). \tag{20cn}
\end{align}
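The add-and-subtract step in (20cm), followed by the triangle inequality in (20cn), is reused several times in this proof. As a minimal numerical sanity check (not part of the original derivation), the Python sketch below verifies the resulting product bound $\|\mathbf{A}_{1}\mathbf{B}_{1}-\mathbf{A}_{2}\mathbf{B}_{2}\|\leq\|\mathbf{A}_{1}\|\|\mathbf{B}_{1}-\mathbf{B}_{2}\|+\|\mathbf{A}_{1}-\mathbf{A}_{2}\|\|\mathbf{B}_{2}\|$ on random real matrices. It assumes the Frobenius norm, which is submultiplicative; the matrices and all helper names are hypothetical placeholders rather than quantities from the system model.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sub(a, b):
    """Entry-wise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fro(a):
    """Frobenius norm (assumed here; it is submultiplicative)."""
    return math.sqrt(sum(x * x for row in a for x in row))


def rand_matrix(rng, n):
    """Random n x n matrix with entries in [-1, 1] (illustrative data only)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]


def product_bound_holds(a1, b1, a2, b2, tol=1e-12):
    """Check ||A1 B1 - A2 B2|| <= ||A1|| ||B1 - B2|| + ||A1 - A2|| ||B2||.

    This follows from A1 B1 - A2 B2 = A1 (B1 - B2) + (A1 - A2) B2,
    i.e. adding and subtracting A1 B2, then the triangle inequality.
    """
    lhs = fro(sub(matmul(a1, b1), matmul(a2, b2)))
    rhs = fro(a1) * fro(sub(b1, b2)) + fro(sub(a1, a2)) * fro(b2)
    return lhs <= rhs + tol


rng = random.Random(0)
all_ok = all(
    product_bound_holds(*(rand_matrix(rng, 3) for _ in range(4)))
    for _ in range(200)
)
print(all_ok)
```

In the proof, $\mathbf{A}_{i}$ plays the role of $\bar{\mathbf{H}}_{i}^{\mathsf{H}}$ and $\mathbf{B}_{i}$ the role of $\mathbf{Z}^{i}$; the norms of the fixed factors are then replaced by the constants introduced earlier.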

The first term in (20cn) can be bounded as

\begin{align}
\|\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}-\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{2}\|\leq c\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\|, \tag{20co}
\end{align}

where we have used (LABEL:Eqc).

Next, the second term in (20cn) is bounded as

\begin{align}
\|\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{2}-\bar{\mathbf{H}}_{2}^{\mathsf{H}}\mathbf{Z}^{2}\| &\leq \frac{d}{c}\|\bar{\mathbf{H}}_{1}-\bar{\mathbf{H}}_{2}\| \tag{20cp}\\
&= \frac{d}{cN_{0}}\|\mathbf{Z}^{1}\mathbf{G}\mathbf{P}^{1}-\mathbf{Z}^{2}\mathbf{G}\mathbf{P}^{2}\| \tag{20cq}\\
&= \frac{d}{cN_{0}}\|\mathbf{Z}^{1}\mathbf{G}\mathbf{P}^{1}-\mathbf{Z}^{2}\mathbf{G}\mathbf{P}^{1}+\mathbf{Z}^{2}\mathbf{G}\mathbf{P}^{1}-\mathbf{Z}^{2}\mathbf{G}\mathbf{P}^{2}\| \tag{20cr}\\
&\leq \frac{d}{cN_{0}}\big(\|\mathbf{Z}^{1}\mathbf{G}\mathbf{P}^{1}-\mathbf{Z}^{2}\mathbf{G}\mathbf{P}^{1}\|+\|\mathbf{Z}^{2}\mathbf{G}\mathbf{P}^{1}-\mathbf{Z}^{2}\mathbf{G}\mathbf{P}^{2}\|\big), \tag{20cs}
\end{align}

where, in (20cp), we have used (LABEL:Eqd), in (20cq), we have inserted (18), and (20cs) follows from the triangle inequality.

The first term in (20cs) is bounded as

\begin{align}
\|\mathbf{Z}^{1}\mathbf{G}\mathbf{P}^{1}-\mathbf{Z}^{2}\mathbf{G}\mathbf{P}^{1}\|\leq bf\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\|, \tag{20ct}
\end{align}

where we have used (LABEL:Eqf). The second term in (20cs) yields

\begin{align}
\|\mathbf{Z}^{2}\mathbf{G}\mathbf{P}^{1}-\mathbf{Z}^{2}\mathbf{G}\mathbf{P}^{2}\|\leq b\frac{d}{c}\|\mathbf{P}^{1}-\mathbf{P}^{2}\|. \tag{20cu}
\end{align}

By substituting (20ct) and (20cu) into (20cs), we obtain

\begin{align}
\|\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{2}-\bar{\mathbf{H}}_{2}^{\mathsf{H}}\mathbf{Z}^{2}\| \leq \frac{bd}{cN_{0}}\Big(f\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\|+\frac{d}{c}\|\mathbf{P}^{1}-\mathbf{P}^{2}\|\Big). \tag{20cv}
\end{align}

Inserting (20co) and (20cv) into (20cn), we obtain

\begin{align}
\|\mathbf{Q}^{2}\bar{\mathbf{H}}_{1}^{\mathsf{H}}\mathbf{Z}^{1}-\mathbf{Q}^{2}\bar{\mathbf{H}}_{2}^{\mathsf{H}}\mathbf{Z}^{2}\| &\leq P\left(c+\frac{bdf}{cN_{0}}\right)\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\| \nonumber\\
&\quad+P\frac{bd^{2}}{c^{2}N_{0}}\|\mathbf{P}^{1}-\mathbf{P}^{2}\|. \tag{20cw}
\end{align}

Finally, substituting (20cw) and (20ck) into (20cj), we derive

\begin{align}
\|\nabla_{\bm{\phi}_{l}}f(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})-\nabla_{\bm{\phi}_{l}}f(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\| &\leq a_{l}bc\|\mathbf{Q}^{1}-\mathbf{Q}^{2}\| \nonumber\\
&\quad+a_{l}bP\left(c+\frac{bdf}{cN_{0}}\right)\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\| \nonumber\\
&\quad+a_{l}P\frac{b^{2}d^{2}}{c^{2}N_{0}}\|\mathbf{P}^{1}-\mathbf{P}^{2}\|. \tag{20cx}
\end{align}

The inequality regarding $\nabla_{\bm{\psi}_{k}}f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$ can be written as

\begin{align}
&\|\nabla_{\bm{\psi}_{k}}f(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})-\nabla_{\bm{\psi}_{k}}f(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\| \nonumber\\
&\leq \|\mathbf{K}(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})\mathbf{C}_{k,1}^{\mathsf{H}}-\mathbf{K}(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\mathbf{C}_{k,2}^{\mathsf{H}}\|. \tag{20cy}
\end{align}

The right-hand side of (20cy) can be upper bounded in a similar way. We omit the details due to limited space. Hence, we conclude the proof.

Appendix C Proof of Theorem LABEL:Theorem1

According to Lemma 1, we can write

\begin{align}
&\|\nabla_{\mathbf{Q}}f(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})-\nabla_{\mathbf{Q}}f(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\|^{2} \nonumber\\
&\quad+\|\nabla_{\bm{\phi}_{l}}f(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})-\nabla_{\bm{\phi}_{l}}f(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\|^{2} \nonumber\\
&\quad+\|\nabla_{\bm{\psi}_{k}}f(\mathbf{Q}^{1},\bm{\phi}_{l}^{1},\bm{\psi}_{k}^{1})-\nabla_{\bm{\psi}_{k}}f(\mathbf{Q}^{2},\bm{\phi}_{l}^{2},\bm{\psi}_{k}^{2})\|^{2} \nonumber\\
&\leq \Lambda_{\mathbf{Q}}^{2}\|\mathbf{Q}^{1}-\mathbf{Q}^{2}\|^{2}+\Lambda_{\bm{\phi}_{l}}^{2}\|\mathbf{P}^{1}-\mathbf{P}^{2}\|^{2}+\Lambda_{\bm{\psi}_{k}}^{2}\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\|^{2} \nonumber\\
&\leq \max(\Lambda_{\mathbf{Q}}^{2},\Lambda_{\bm{\phi}_{l}}^{2},\Lambda_{\bm{\psi}_{k}}^{2})\big(\|\mathbf{Q}^{1}-\mathbf{Q}^{2}\|^{2}+\|\mathbf{P}^{1}-\mathbf{P}^{2}\|^{2}+\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\|^{2}\big). \tag{20cz}
\end{align}

Taking the square root of both sides of (20cz) and using the inequality

\begin{align}
&2\|\mathbf{Q}^{1}-\mathbf{Q}^{2}\|\cdot\|\mathbf{P}^{1}-\mathbf{P}^{2}\|+2\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\|\cdot\|\mathbf{P}^{1}-\mathbf{P}^{2}\| \nonumber\\
&\quad+2\|\mathbf{Q}^{1}-\mathbf{Q}^{2}\|\cdot\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\| \nonumber\\
&\leq 2\big(\|\mathbf{Q}^{1}-\mathbf{Q}^{2}\|^{2}+\|\mathbf{P}^{1}-\mathbf{P}^{2}\|^{2}+\|\mathbf{Z}^{1}-\mathbf{Z}^{2}\|^{2}\big), \tag{20da}
\end{align}

we observe that $\max(\Lambda_{\mathbf{Q}}^{2},\Lambda_{\bm{\phi}_{l}}^{2},\Lambda_{\bm{\psi}_{k}}^{2})$ is a Lipschitz constant of the gradient of $f(\mathbf{Q},\bm{\phi}_{l},\bm{\psi}_{k})$, which concludes the proof.
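Two elementary facts carry the argument of this appendix: the blockwise sum $\sum_{i}\Lambda_{i}^{2}d_{i}^{2}$ is at most $\max_{i}\Lambda_{i}^{2}\sum_{i}d_{i}^{2}$, and each cross term obeys $2xy\leq x^{2}+y^{2}$. The short Python sketch below checks both on random nonnegative numbers. The values are placeholders standing in for the three blockwise constants and the three distances, not quantities computed from the system model, and the function names are hypothetical.

```python
import random


def blockwise_sum(lams, dists):
    """Sum of Lambda_i^2 * d_i^2 over the three variable blocks."""
    return sum((lam * d) ** 2 for lam, d in zip(lams, dists))


def uniform_sum(lams, dists):
    """max_i(Lambda_i^2) times the total squared distance."""
    return max(lam ** 2 for lam in lams) * sum(d ** 2 for d in dists)


def cross_term_holds(x, y, tol=1e-12):
    """Elementary bound 2xy <= x^2 + y^2, equivalent to (x - y)^2 >= 0."""
    return 2.0 * x * y <= x * x + y * y + tol


rng = random.Random(0)
checks = []
for _ in range(1000):
    # Placeholder constants and distances (illustrative only).
    lams = [rng.uniform(0.0, 5.0) for _ in range(3)]
    dists = [rng.uniform(0.0, 2.0) for _ in range(3)]
    checks.append(blockwise_sum(lams, dists) <= uniform_sum(lams, dists) + 1e-9)
    checks.append(cross_term_holds(dists[0], dists[1]))
print(all(checks))
```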

Appendix D Proof of Theorem LABEL:Theorem2

The point $\mathbf{Q}^{n+1}$ is obtained by projection onto $\mathcal{Q}$ according to

\begin{align}
\mathbf{Q}^{n+1} &= \arg\min_{\mathbf{Q}\in\mathcal{Q}}\|\mathbf{Q}-\mathbf{Q}^{n}-\mu_{n}^{1}\nabla_{\mathcal{Q}}f(\mathbf{Q}^{n},\bm{\phi}_{l}^{n},\bm{\psi}_{k}^{n})\|^{2} \nonumber\\
&= \arg\max_{\mathbf{Q}\in\mathcal{Q}}\langle\nabla_{\mathcal{Q}}f(\mathbf{Q}^{n},\bm{\phi}_{l}^{n},\bm{\psi}_{k}^{n}),\mathbf{Q}-\mathbf{Q}^{n}\rangle-\frac{1}{2\mu_{n}^{1}}\|\mathbf{Q}-\mathbf{Q}^{n}\|^{2}. \tag{20db}
\end{align}

For $\mathbf{Q}=\mathbf{Q}^{n}$, the objective in (20db) equals $0$. Hence, its maximum value is nonnegative, which means that the maximizer satisfies

\begin{align}
\langle\nabla_{\mathcal{Q}}f(\mathbf{Q}^{n},\bm{\phi}_{l}^{n},\bm{\psi}_{k}^{n}),\mathbf{Q}-\mathbf{Q}^{n}\rangle-\frac{1}{2\mu_{n}^{1}}\|\mathbf{Q}-\mathbf{Q}^{n}\|^{2}\geq 0. \tag{20dc}
\end{align}

Similarly, we obtain

\begin{align}
\langle\nabla_{\bm{\phi}_{l}}f(\mathbf{Q}^{n},\bm{\phi}_{l}^{n},\bm{\psi}_{k}^{n}),\bm{\phi}_{l}-\bm{\phi}_{l}^{n}\rangle-\frac{1}{2\mu_{n}^{2}}\|\bm{\phi}_{l}-\bm{\phi}_{l}^{n}\|^{2} &\geq 0, \tag{20dd}\\
\langle\nabla_{\bm{\psi}_{k}}f(\mathbf{Q}^{n},\bm{\phi}_{l}^{n},\bm{\psi}_{k}^{n}),\bm{\psi}_{k}-\bm{\psi}_{k}^{n}\rangle-\frac{1}{2\mu_{n}^{3}}\|\bm{\psi}_{k}-\bm{\psi}_{k}^{n}\|^{2} &\geq 0. \tag{20de}
\end{align}
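The inequalities (20dc)-(20de) rest on the same observation: the projected point maximizes a concave quadratic over the feasible set, while the current iterate is feasible and attains the value $0$. The Python sketch below checks this numerically on a toy problem. It assumes a Euclidean unit ball as a stand-in for the feasible set and a separable concave quadratic as a stand-in for $f$; neither is the actual feasible set or objective of this paper, and all names are hypothetical. In this toy setting the gradient is Lipschitz with constant $\max_{i}a_{i}$, so the sketch also records that a step size of $1/\max_{i}a_{i}$ keeps the objective nondecreasing.

```python
import math
import random


def project_ball(x, radius):
    """Euclidean projection onto the ball {y : ||y|| <= radius}."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [radius * v / norm for v in x]


def objective(x, a, b):
    """Toy concave objective f(x) = sum_i (b_i x_i - 0.5 a_i x_i^2), a_i > 0."""
    return sum(bi * xi - 0.5 * ai * xi * xi for ai, bi, xi in zip(a, b, x))


def gradient(x, a, b):
    """Gradient of the toy objective."""
    return [bi - ai * xi for ai, bi, xi in zip(a, b, x)]


def projected_step(x, a, b, mu, radius):
    """One projected gradient ascent step with step size mu."""
    g = gradient(x, a, b)
    return project_ball([xi + mu * gi for xi, gi in zip(x, g)], radius)


def projection_gap(x, x_next, a, b, mu):
    """<grad f(x), x_next - x> - ||x_next - x||^2 / (2 mu), cf. (20dc)."""
    g = gradient(x, a, b)
    d = [u - v for u, v in zip(x_next, x)]
    inner = sum(gi * di for gi, di in zip(g, d))
    return inner - sum(di * di for di in d) / (2.0 * mu)


rng = random.Random(1)
a = [rng.uniform(0.5, 2.0) for _ in range(4)]   # curvatures (illustrative)
b = [rng.uniform(-3.0, 3.0) for _ in range(4)]  # linear terms (illustrative)
mu = 1.0 / max(a)                               # step size 1 / Lipschitz constant
x = project_ball([rng.uniform(-1.0, 1.0) for _ in range(4)], 1.0)

gaps, values = [], [objective(x, a, b)]
for _ in range(50):
    x_next = projected_step(x, a, b, mu, 1.0)
    gaps.append(projection_gap(x, x_next, a, b, mu))
    values.append(objective(x_next, a, b))
    x = x_next

# Small tolerances absorb floating-point rounding.
print(min(gaps) >= -1e-9)
print(all(v2 >= v1 - 1e-9 for v1, v2 in zip(values, values[1:])))
```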

Now, applying the following inequality

f(𝐲)f(𝐲)+f(𝐱),𝐲𝐱Λ2𝐲𝐱2,𝑓𝐲𝑓𝐲𝑓𝐱𝐲𝐱Λ2superscriptnorm𝐲𝐱2\displaystyle f({\mathbf{y}})\geq f({\mathbf{y}})+\langle\nabla f({\mathbf{x}}% ),{\mathbf{y}}-{\mathbf{x}}\rangle-\frac{\Lambda}{2}\|{\mathbf{y}}-{\mathbf{x}% }\|^{2},italic_f ( bold_y ) ≥ italic_f ( bold_y ) + ⟨ ∇ italic_f ( bold_x ) , bold_y - bold_x ⟩ - divide start_ARG roman_Λ end_ARG start_ARG 2 end_ARG ∥ bold_y - bold_x ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , (20df)

which holds for any $\Lambda$-smooth function $f(\mathbf{x})$, we obtain

\begin{align}
f(\mathbf{Q}^{n+1}, \bm{\phi}_l^{n+1}, \bm{\psi}_k^{n+1}) &\geq f(\mathbf{Q}^n, \bm{\phi}_l^n, \bm{\psi}_k^n) \nonumber \\
&\quad + \langle \nabla_{\mathcal{Q}} f(\mathbf{Q}^n, \bm{\phi}_l^n, \bm{\psi}_k^n), \mathbf{Q} - \mathbf{Q}^n \rangle \nonumber \\
&\quad + \langle \nabla_{\bm{\phi}_l} f(\mathbf{Q}^n, \bm{\phi}_l^n, \bm{\psi}_k^n), \bm{\phi}_l - \bm{\phi}_l^n \rangle \nonumber \\
&\quad + \langle \nabla_{\bm{\psi}_k} f(\mathbf{Q}^n, \bm{\phi}_l^n, \bm{\psi}_k^n), \bm{\psi}_k - \bm{\psi}_k^n \rangle \nonumber \\
&\quad - \frac{\Lambda}{2} \|\mathbf{Q}^{n+1} - \mathbf{Q}^n\|^2 - \frac{\Lambda}{2} \|\bm{\phi}_l^{n+1} - \bm{\phi}_l^n\|^2 - \frac{\Lambda}{2} \|\bm{\psi}_k^{n+1} - \bm{\psi}_k^n\|^2 \nonumber \\
&\geq f(\mathbf{Q}^n, \bm{\phi}_l^n, \bm{\psi}_k^n) + \left( \frac{1}{2\mu_n^{1}} - \frac{\Lambda}{2} \right) \|\mathbf{Q}^{n+1} - \mathbf{Q}^n\|^2 \nonumber \\
&\quad + \left( \frac{1}{2\mu_n^{2}} - \frac{\Lambda}{2} \right) \|\bm{\phi}_l^{n+1} - \bm{\phi}_l^n\|^2 \nonumber \\
&\quad + \left( \frac{1}{2\mu_n^{3}} - \frac{\Lambda}{2} \right) \|\bm{\psi}_k^{n+1} - \bm{\psi}_k^n\|^2, \tag{20dg}
\end{align}

where, in the first inequality, we have applied Theorem 1.
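The sufficient-increase bound in (20dg) can be checked numerically on a toy problem. The following is only a sketch under stated assumptions: the concave quadratic objective, the box constraint standing in for the feasible sets, and all function names (`pg_step`, `check_sufficient_increase`, and so on) are illustrative choices and do not come from the paper.

```python
# Toy check of the sufficient-increase bound for projected gradient ascent.
# Assumptions (not from the paper): f(x) = -0.5 * sum_i a_i (x_i - c_i)^2,
# maximized over the box [0, 1]^d, which stands in for the feasible sets.

def f(x, a, c):
    """Concave quadratic objective; its gradient is Lipschitz with constant max(a)."""
    return -0.5 * sum(ai * (xi - ci) ** 2 for ai, xi, ci in zip(a, x, c))

def grad(x, a, c):
    """Gradient of f at x."""
    return [-ai * (xi - ci) for ai, xi, ci in zip(a, x, c)]

def project_box(x):
    """Euclidean projection onto the box [0, 1]^d."""
    return [min(1.0, max(0.0, xi)) for xi in x]

def pg_step(x, a, c, mu):
    """One projected gradient ascent step with step size mu."""
    g = grad(x, a, c)
    return project_box([xi + mu * gi for xi, gi in zip(x, g)])

def sq_dist(x, y):
    """Squared Euclidean distance between x and y."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

def check_sufficient_increase(x0, a, c, mu, n_iter=50):
    """Return (flag, final iterate); flag is True if the bound held at every step."""
    lam = max(a)  # smoothness constant (Lambda) of this particular f
    x = list(x0)
    ok = True
    for _ in range(n_iter):
        x_new = pg_step(x, a, c, mu)
        gain = f(x_new, a, c) - f(x, a, c)
        bound = (1.0 / (2.0 * mu) - lam / 2.0) * sq_dist(x_new, x)
        ok = ok and (gain >= bound - 1e-12)  # small tolerance for rounding
        x = x_new
    return ok, x

a = [1.0, 4.0, 2.0]
c = [0.3, 1.5, -0.2]
mu = 0.9 / max(a)  # step size strictly below 1 / Lambda
holds, x_final = check_sufficient_increase([0.5, 0.5, 0.5], a, c, mu)
```

On this instance `holds` evaluates to `True`, and the iterates approach the projection of `c` onto the box. A step size above `1 / max(a)` makes the coefficient of the squared distance negative, in which case the bound no longer implies a monotone increase.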

From (20dg), when $\mu_n^q < \frac{1}{\Lambda}$ for $q = 1, 2, 3$, we observe that $f(\mathbf{Q}^{n+1}, \bm{\phi}_l^{n+1}, \bm{\psi}_k^{n+1}) \geq f(\mathbf{Q}^n, \bm{\phi}_l^n, \bm{\psi}_k^n)$, i.e., the objective is non-decreasing along the iterates. Now, we denote by $f^{\star}$ the value of $f$ at all accumulation points, and we write (20dg) as

\begin{align}
&f(\mathbf{Q}^{n+1}, \bm{\phi}_l^{n+1}, \bm{\psi}_k^{n+1}) - f(\mathbf{Q}^n, \bm{\phi}_l^n, \bm{\psi}_k^n) \geq \left( \frac{1}{2\mu_n^{1}} - \frac{\Lambda}{2} \right) \|\mathbf{Q}^{n+1} - \mathbf{Q}^n\|^2 \nonumber \\
&\quad + \left( \frac{1}{2\mu_n^{2}} - \frac{\Lambda}{2} \right) \|\bm{\phi}_l^{n+1} - \bm{\phi}_l^n\|^2 + \left( \frac{1}{2\mu_n^{3}} - \frac{\Lambda}{2} \right) \|\bm{\psi}_k^{n+1} - \bm{\psi}_k^n\|^2, \tag{20dh}
\end{align}

which gives

\begin{align}
\infty > f^{\star} - f(\mathbf{Q}^{1}, \bm{\phi}_l^{1}, \bm{\psi}_k^{1}) > \sum_{n=1}^{\infty} \Bigg( &\left( \frac{1}{2\mu_n^{1}} - \frac{\Lambda}{2} \right) \|\mathbf{Q}^{n+1} - \mathbf{Q}^n\|^2 + \left( \frac{1}{2\mu_n^{2}} - \frac{\Lambda}{2} \right) \|\bm{\phi}_l^{n+1} - \bm{\phi}_l^n\|^2 \nonumber \\
&+ \left( \frac{1}{2\mu_n^{3}} - \frac{\Lambda}{2} \right) \|\bm{\psi}_k^{n+1} - \bm{\psi}_k^n\|^2 \Bigg). \tag{20di}
\end{align}

Given that $\mu_n^q < \frac{1}{\Lambda}$ for $q = 1, 2, 3$, every coefficient in (20di) is positive, so we obtain

\begin{align}
\|\mathbf{Q}^{n+1} - \mathbf{Q}^n\| &\to 0, \tag{20dj} \\
\|\bm{\phi}_l^{n+1} - \bm{\phi}_l^n\| &\to 0, \tag{20dk} \\
\|\bm{\psi}_k^{n+1} - \bm{\psi}_k^n\| &\to 0. \tag{20dl}
\end{align}
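The passage from (20dh) to (20dj)–(20dl) can be spelled out with a telescoping sum. The sketch below rests on two assumptions that are labeled here because they are not stated explicitly in the text: the step sizes are uniformly bounded away from $1/\Lambda$, i.e., $\mu_n^q \leq \bar{\mu} < 1/\Lambda$ for a constant $\bar{\mu}$, and $f$ is bounded above on the feasible set. The shorthand $f^{n}$ and the symbols $\bar{\mu}$ and $c$ are introduced only for this illustration.

```latex
% Shorthand (introduced here): f^{n} = f(Q^n, \phi_l^n, \psi_k^n).
% Assumption: \mu_n^q \le \bar{\mu} < 1/\Lambda for all n and q = 1, 2, 3.
\begin{align*}
c &\triangleq \frac{1}{2\bar{\mu}} - \frac{\Lambda}{2} > 0, \\
f^{N+1} - f^{1} &= \sum_{n=1}^{N} \left( f^{n+1} - f^{n} \right) \\
&\geq c \sum_{n=1}^{N} \left( \|\mathbf{Q}^{n+1} - \mathbf{Q}^{n}\|^2
   + \|\bm{\phi}_l^{n+1} - \bm{\phi}_l^{n}\|^2
   + \|\bm{\psi}_k^{n+1} - \bm{\psi}_k^{n}\|^2 \right).
\end{align*}
```

Since $f^{n}$ is non-decreasing and converges to $f^{\star}$, the left-hand side is at most $f^{\star} - f^{1} < \infty$ for every $N$. The partial sums of a series with non-negative terms are therefore bounded, the series converges, and its terms must vanish, which is the content of (20dj)–(20dl).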

The condition in (20db), concerning optimality, can be written as

\begin{align}
\left\langle \frac{1}{\mu_n^{1}} \left( \mathbf{Q}^{n+1} - \mathbf{Q}^n \right) - \nabla_{\mathcal{Q}} f(\mathbf{Q}^n, \bm{\phi}_l^n, \bm{\psi}_k^n), \mathbf{Q} - \mathbf{Q}^{n+1} \right\rangle \leq 0, \quad \forall \mathbf{Q} \in \mathcal{Q}. \tag{20dm}
\end{align}

Similar inequalities hold for the other two parameters $\bm{\phi}_l$ and $\bm{\psi}_k$.

By letting $n \to \infty$ in (20dm), we obtain

\begin{align}
\langle -\nabla_{\mathcal{Q}} f(\mathbf{Q}^{*}, \bm{\phi}_l^{*}, \bm{\psi}_k^{*}), \mathbf{Q} - \mathbf{Q}^{*} \rangle \leq 0, \quad \forall \mathbf{Q} \in \mathcal{Q} \tag{20dn}
\end{align}

due to the continuity of the gradient of $f$, i.e., $\nabla_{\mathcal{Q}} f(\mathbf{Q}^n, \bm{\phi}_l^n, \bm{\psi}_k^n) \to \nabla_{\mathcal{Q}} f(\mathbf{Q}^{*}, \bm{\phi}_l^{*}, \bm{\psi}_k^{*})$.

Similar observations hold for $\bm{\phi}_l$ and $\bm{\psi}_k$, which shows that $(\mathbf{Q}^{*}, \bm{\phi}_l^{*}, \bm{\psi}_k^{*})$ is a critical point of Problem $(\mathcal{P})$. This concludes the proof.
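The conclusion of the proof can be illustrated numerically: once the successive differences vanish, the iterate is (approximately) a fixed point of the projected gradient map. The sketch below is a toy illustration under stated assumptions and not the paper's algorithm: the objective, the Euclidean-ball constraint (a loose stand-in for a power constraint), and the names `pg_map` and `run` are all hypothetical.

```python
import math

# Toy problem (an assumption for illustration, not the paper's objective):
# maximize f(x) = <b, x> - 0.5 * sum_i a_i x_i^2 over the ball ||x|| <= r.

def f(x, a, b):
    """Concave quadratic objective with smoothness constant max(a)."""
    return sum(bi * xi - 0.5 * ai * xi * xi for ai, bi, xi in zip(a, b, x))

def grad(x, a, b):
    """Gradient of f at x."""
    return [bi - ai * xi for ai, bi, xi in zip(a, b, x)]

def project_ball(x, r):
    """Euclidean projection onto the ball of radius r centered at the origin."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    if norm <= r:
        return list(x)
    return [r * xi / norm for xi in x]

def pg_map(x, a, b, r, mu):
    """Projected gradient ascent map: x -> P(x + mu * grad f(x))."""
    g = grad(x, a, b)
    return project_ball([xi + mu * gi for xi, gi in zip(x, g)], r)

def dist(x, y):
    """Euclidean distance between x and y."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(x, y)))

def run(x0, a, b, r, mu, n_iter=300):
    """Iterate the map; record objective values and successive step norms."""
    x = list(x0)
    values = [f(x, a, b)]
    steps = []
    for _ in range(n_iter):
        x_new = pg_map(x, a, b, r, mu)
        steps.append(dist(x_new, x))
        x = x_new
        values.append(f(x, a, b))
    return x, values, steps

a = [1.0, 3.0]
b = [2.0, 1.0]
r = 1.0
mu = 0.5 / max(a)  # step size strictly below 1 / Lambda
x_star, values, steps = run([0.0, 0.0], a, b, r, mu)

# Fixed-point residual: close to zero when x_star is a critical point.
residual = dist(pg_map(x_star, a, b, r, mu), x_star)
```

Here `steps` plays the role of the norms in (20dj)–(20dl), `values` is the non-decreasing objective sequence, and a near-zero `residual` indicates that another projected gradient step leaves the point essentially unchanged.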

References

  • [1] K. B. Letaief et al., “The roadmap to 6G: AI empowered wireless networks,” IEEE Commun. Mag., vol. 57, no. 8, pp. 84–90, 2019.
  • [2] S. Dang et al., “What should 6G be?” Nature Electronics, vol. 3, no. 1, pp. 20–29, 2020.
  • [3] J. G. Andrews et al., “What will 5G be?” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1065–1082, 2014.
  • [4] M. Di Renzo et al., “Smart radio environments empowered by reconfigurable intelligent surfaces: How it works, state of research, and the road ahead,” IEEE J. Sel. Areas Commun., vol. 38, no. 11, pp. 2450–2525, 2020.
  • [5] Q. Wu and R. Zhang, “Towards smart and reconfigurable environment: Intelligent reflecting surface aided wireless network,” IEEE Commun. Mag., vol. 58, no. 1, pp. 106–112.
  • [6] A. Papazafeiropoulos et al., “Intelligent reflecting surface-assisted MU-MISO systems with imperfect hardware: Channel estimation and beamforming design,” IEEE Trans. Wireless Commun., vol. 21, no. 3, pp. 2077–2092, 2021.
  • [7] ——, “Cooperative RIS and STAR-RIS assisted mMIMO communication: Analysis and optimization,” vol. 72, no. 9, pp. 11 975–11 989.
  • [8] C. Huang et al., “Holographic MIMO surfaces for 6G wireless networks: Opportunities, challenges, and trends,” IEEE Wireless Commun., vol. 27, no. 5, pp. 118–125, 2020.
  • [9] Q. Wu and R. Zhang, “Intelligent reflecting surface enhanced wireless network via joint active and passive beamforming,” IEEE Trans. Wireless Commun., vol. 18, no. 11, pp. 5394–5409, 2019.
  • [10] E. Björnson, Ö. Özdogan, and E. G. Larsson, “Intelligent reflecting surface versus decode-and-forward: How large surfaces are needed to beat relaying?” vol. 9, no. 2, pp. 244–248.
  • [11] Y. Yang et al., “Intelligent reflecting surface meets OFDM: Protocol design and rate maximization,” IEEE Trans. Commun., vol. 68, no. 7, pp. 4522–4535, 2020.
  • [12] M.-M. Zhao et al., “Intelligent reflecting surface enhanced wireless networks: Two-timescale beamforming optimization,” IEEE Trans. Wireless Commun., vol. 20, no. 1, pp. 2–17, 2020.
  • [13] X. Mu et al., “Simultaneously transmitting and reflecting (STAR) RIS aided wireless communications,” IEEE Trans. Wireless Commun., vol. 21, no. 5, pp. 3083–3098, 2021.
  • [14] A. Papazafeiropoulos et al., “Achievable rate of a STAR-RIS assisted massive MIMO system under spatially-correlated channels,” IEEE Trans. Wireless Commun., pp. 1–1, 2023.
  • [15] A. Papazafeiropoulos, P. Kourtessis, and S. Chatzinotas, “Max-Min SINR analysis of STAR-RIS assisted massive MIMO systems with hardware impairments,” IEEE Trans. Wireless Commun., pp. 1–1, 2023.
  • [16] A. Papazafeiropoulos et al., “STAR-RIS assisted cell-free massive MIMO system under spatially-correlated channels,” pp. 1–16.
  • [17] C. Pan et al., “Multicell MIMO communications relying on intelligent reflecting surfaces,” IEEE Trans. Wireless Commun., vol. 19, no. 8, pp. 5218–5233, 2020.
  • [18] J. Ye, S. Guo, and M.-S. Alouini, “Joint reflecting and precoding designs for SER minimization in reconfigurable intelligent surfaces assisted MIMO systems,” IEEE Trans. Wireless Commun., vol. 19, no. 8, pp. 5561–5574, 2020.
  • [19] S. Zhang and R. Zhang, “Capacity characterization for intelligent reflecting surface aided MIMO communication,” IEEE J. Sel. Areas Commun., vol. 38, no. 8, pp. 1823–1838.
  • [20] N. S. Perović et al., “Achievable rate optimization for MIMO systems with reconfigurable intelligent surfaces,” IEEE Trans. Wireless Commun., vol. 20, no. 6, pp. 3865–3882, 2021.
  • [21] T. L. Marzetta et al., Fundamentals of Massive MIMO.   Cambridge University Press, 2016.
  • [22] Z. Wan et al., “Terahertz massive MIMO with holographic reconfigurable intelligent surfaces,” IEEE Trans. Commun., vol. 69, no. 7, pp. 4732–4750, 2021.
  • [23] S. Hu, F. Rusek, and O. Edfors, “Beyond massive MIMO: The potential of data transmission with large intelligent surfaces,” IEEE Trans. Signal Process., vol. 66, no. 10, pp. 2746–2758, 2018.
  • [24] A. Pizzo, T. L. Marzetta, and L. Sanguinetti, “Spatially-stationary model for holographic MIMO small-scale fading,” IEEE J. Sel. Areas Commun., vol. 38, no. 9, pp. 1964–1979, 2020.
  • [25] A. Pizzo, L. Sanguinetti, and T. L. Marzetta, “Fourier plane-wave series expansion for holographic MIMO communications,” IEEE Trans. Wireless Commun., vol. 21, no. 9, pp. 6890–6905, 2022.
  • [26] Ö. T. Demir, E. Björnson, and L. Sanguinetti, “Channel modeling and channel estimation for holographic massive MIMO with planar arrays,” IEEE Wireless Commun. Let., vol. 11, no. 5, pp. 997–1001, 2022.
  • [27] C. Liu et al., “A programmable diffractive deep neural network based on a digital-coding metasurface array,” Nature Electronics, vol. 5, no. 2, pp. 113–122, 2022.
  • [28] J. An et al., “Stacked intelligent metasurfaces for multiuser beamforming in the wave domain,” networks, vol. 7, p. 13, 2023.
  • [29] ——, “Stacked intelligent metasurfaces for efficient holographic MIMO communications in 6G,” IEEE J. Sel. Areas Commun., 2023.
  • [30] Ö. Özdogan, E. Björnson, and E. G. Larsson, “Using intelligent reflecting surfaces for rank improvement in MIMO communications,” in ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).   IEEE, 2020, pp. 9160–9164.
  • [31] N. S. Perović, M. Di Renzo, and M. F. Flanagan, “Channel capacity optimization using reconfigurable intelligent surfaces in indoor mmWave environments,” in ICC 2020-2020 IEEE International Conference on Communications (ICC).   IEEE, 2020, pp. 1–7.
  • [32] S. Abeywickrama et al., “Intelligent reflecting surface: Practical phase shift model and beamforming optimization,” IEEE Trans. Commun., vol. 68, no. 9, pp. 5849–5863, 2020.
  • [33] X. Lin et al., “All-optical machine learning using diffractive deep neural networks,” Science, vol. 361, no. 6406, pp. 1004–1008, 2018.
  • [34] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
  • [35] X. Hu et al., “Holographic beamforming for ultra massive MIMO with limited radiation amplitudes: How many quantized bits do we need?” IEEE Commun. Let., vol. 26, no. 6, pp. 1403–1407, 2022.
  • [36] L. Dai et al., “Reconfigurable intelligent surface-based wireless communications: Antenna design, prototy**, and experimental results,” IEEE Access, vol. 8, pp. 45 913–45 923, 2020.
  • [37] T. S. Rappaport et al., “Wideband millimeter-wave propagation measurements and channel models for future wireless communication system design,” IEEE Trans. Commun., vol. 63, no. 9, pp. 3029–3056, 2015.
  • [38] Q.-U.-A. Nadeem, J. An, and A. Chaaban, “Hybrid digital-wave domain channel estimator for stacked intelligent metasurface enabled multi-user MISO systems,” arXiv preprint arXiv:2309.16204.
  • [39] H. ElSawy et al., “Modeling and analysis of cellular networks using stochastic geometry: A tutorial,” IEEE Commun. Surveys Tuts., vol. 19, no. 1, pp. 167–203, 2017.
  • [40] J. An et al., “Stacked intelligent metasurfaces for multiuser downlink beamforming in the wave domain.”
  • [41] ——, “Stacked intelligent metasurface-aided MIMO transceiver design.”
  • [42] H. Li and Z. Lin, “Accelerated proximal gradient methods for nonconvex programming,” Advances in neural information processing systems, vol. 28, 2015.
  • [43] A. Hjørungnes, Complex-Valued Matrix Derivatives: With Applications in Signal Processing and Communications.   Cambridge University Press, 2011.
  • [44] T. M. Pham, R. Farrell, and L.-N. Tran, “Revisiting the MIMO capacity with per-antenna power constraint: Fixed-point iteration and alternating optimization,” IEEE Trans. Wireless Commun., vol. 18, no. 1, pp. 388–401, 2018.
\Closesolutionfile

solutionfile