
An Inexact augmented Lagrangian algorithm
for unsymmetric saddle-point systems

Na Huang Department of Applied Mathematics, College of Science, China Agricultural University, Beijing, China. E-mail: [email protected]. Research partially supported by National Natural Science Foundation of China (No. 12001531).    Yu-Hong Dai LSEC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China. E-mail: [email protected].    Dominique Orban GERAD and Department of Mathematics and Industrial Engineering, Polytechnique Montréal, QC, Canada. E-mail: [email protected]. Research partially supported by an NSERC Discovery Grant.    Michael A. Saunders Systems Optimization Laboratory, Department of Management Science and Engineering, Stanford University, Stanford, CA, USA. E-mail: [email protected]. Version of April 30, 2024.
Abstract

Augmented Lagrangian (AL) methods are a well known class of algorithms for solving constrained optimization problems. They have been extended to the solution of saddle-point systems of linear equations. We study an AL (SPAL) algorithm for unsymmetric saddle-point systems and derive convergence and semi-convergence properties, even when the system is singular. At each step, our SPAL requires the exact solution of a linear system of the same size but with an SPD (2,2) block. To improve efficiency, we introduce an inexact SPAL algorithm. We establish its convergence properties under reasonable assumptions. Specifically, we use a gradient method, known as the Barzilai-Borwein (BB) method, to solve the linear system at each iteration. We call the result the augmented Lagrangian BB (SPALBB) algorithm and study its convergence. Numerical experiments on test problems from Navier-Stokes equations and coupled Stokes-Darcy flow show that SPALBB is more robust and efficient than BICGSTAB and GMRES. SPALBB often requires the least CPU time, especially on large systems.

keywords:
augmented Lagrangian algorithm, saddle-point system, Barzilai-Borwein, convergence analysis.
AMS subject classifications:
65F10, 65F50.

1 Introduction

We consider the unsymmetric saddle-point system

\[
\begin{pmatrix} G & B \\ -B^{T} & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} f \\ g \end{pmatrix}, \tag{1}
\]

where $B\in\mathbb{R}^{n\times m}$ ($n\geq m$), and $G\in\mathbb{R}^{n\times n}$ is positive definite on the nullspace of $B^{T}$ but may be unsymmetric and/or singular. Thus, $x^{T}Gx>0$ for all nonzero $x\in\mathrm{Null}(B^{T})$. The change of sign in the second block-row of (1) makes the matrix semipositive real and positive semistable if $G$ is positive semidefinite [6]. Linear systems like (1) arise from certain discretizations of the Navier-Stokes equations [23], mixed and mixed-hybrid finite element approximations of the liquid crystal director model [38], coupled Stokes-Darcy flow [13], and interior methods for constrained optimization [25, 48]. System (1) is nonsingular if and only if $B$ has full column rank [8]. When $B$ corresponds to a discretized gradient operator, as for example in the Navier-Stokes equations [23, 28], $B$ does not have full column rank and (1) is singular.

Iterative methods for solving saddle-point systems have been studied for decades, such as stationary iterations [4, 8, 52], nonlinear inexact Uzawa methods [16, 30, 33], nullspace methods [37, 44, 45], Krylov subspace methods [20, 29, 35, 36], and preconditioning techniques [8, 7, 21, 41]. Some stationary iterative methods and their semi-convergence have been studied for singular cases [15, 49, 50].

Let $Q\in\mathbb{R}^{m\times m}$ be symmetric and positive definite (SPD). If we premultiply the second block-row of (1) by $-BQ^{-1}$ and add the result to the first block-row, we find that (1) is equivalent to

\[
\begin{pmatrix} G+BQ^{-1}B^{T} & B \\ -B^{T} & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} f-BQ^{-1}g \\ g \end{pmatrix}. \tag{2}
\]
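As a quick sanity check on this equivalence, the following sketch builds a small random instance of (1) and of (2) and compares their solutions. The test data (sizes, random seed, the particular choices of $G$, $B$, $Q$) are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 3

# Hypothetical test data: G unsymmetric with positive definite symmetric part,
# B with full column rank (almost surely), Q SPD.
K = rng.standard_normal((n, n))
G = np.eye(n) + 0.5 * (K - K.T)
B = rng.standard_normal((n, m))
C = rng.standard_normal((m, m))
Q = C @ C.T + m * np.eye(m)
f = rng.standard_normal(n)
g = rng.standard_normal(m)

Qinv_Bt = np.linalg.solve(Q, B.T)   # Q^{-1} B^T
Qinv_g = np.linalg.solve(Q, g)      # Q^{-1} g

# System (1).
A1 = np.block([[G, B], [-B.T, np.zeros((m, m))]])
b1 = np.concatenate([f, g])

# System (2): augmented (1,1) block and modified right-hand side.
A2 = np.block([[G + B @ Qinv_Bt, B], [-B.T, np.zeros((m, m))]])
b2 = np.concatenate([f - B @ Qinv_g, g])

z1 = np.linalg.solve(A1, b1)
z2 = np.linalg.solve(A2, b2)
gap = np.linalg.norm(z1 - z2)       # should be at rounding level
```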

Golub and Greif [27] and Golub et al. [28] showed that methods based on (2) may have advantages. Indeed, even if $G$ is singular or ill-conditioned, the $(1,1)$ block in (2) can be made nonsingular, positive definite or well-conditioned with suitable selections of $Q$. When $G$ is symmetric, the symmetric form

\[
T(Q) := \begin{pmatrix} G+BQ^{-1}B^{T} & B \\ B^{T} & 0 \end{pmatrix}
\]

of (2) is typically preferred. Golub and Greif [27] mainly consider the specific case $Q=\gamma I$, where $\gamma>0$ is constant and $I$ is the identity matrix. They provide analytical observations on the spectrum of $T(\gamma I)$ and show that there is a range of values of $\gamma$ that will improve the condition number of $T(\gamma I)$, as well as the condition number of its $(1,1)$ block and the associated Schur complement. In particular, $\gamma=\|B\|^{2}/\|G\|$ may often force the norm of the added term $\frac{1}{\gamma}BB^{T}$ to be of the same magnitude as the norm of $G$. Golub et al. [28] experimentally observe that this special choice is typically effective. Apart from the form of (2), they also show that when $G$ is symmetric positive semidefinite of nullity $1$, an effective approach to maintaining sparsity is to choose the augmented term as $\tau bb^{T}$, where $b$ is a known vector not orthogonal to the nullspace of $G$, and $\tau>0$ is a constant that approximately minimizes the condition number of $G+\tau bb^{T}$.
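The scaling $\gamma=\|B\|^{2}/\|G\|$ can be illustrated numerically. In the 2-norm, $\|BB^{T}\|=\|B\|^{2}$, so the added term $\frac{1}{\gamma}BB^{T}$ then has the same norm as $G$. The sketch below checks this on random data; the matrices and sizes are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 3

# Hypothetical data; any nonzero G and B would do for this check.
G = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))

norm_G = np.linalg.norm(G, 2)          # spectral norm of G
norm_B = np.linalg.norm(B, 2)          # spectral norm of B
gamma = norm_B**2 / norm_G

added = (B @ B.T) / gamma              # the term (1/gamma) B B^T
norm_added = np.linalg.norm(added, 2)  # matches norm_G up to rounding
```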

The approach of replacing (1) by (2) can be regarded as an augmented Lagrangian (SPAL) method, also called the method of multipliers [8, 27, 28]. For an extensive overview of the augmented Lagrangian approach and its applications, we refer to [11, 10]. Awanou and Lai [3] apply the Uzawa method [1] to (2) with $Q=\gamma I$ and propose the following SPAL (with $k=0,1,2,\dots$ and $y_{0}$ assumed given):

\[
\left\{\begin{array}{l}
(G+\frac{1}{\gamma}BB^{T})x_{k} = f-\frac{1}{\gamma}Bg-By_{k},\\[2pt]
y_{k+1} = y_{k}+\frac{1}{\gamma}(B^{T}x_{k}+g).
\end{array}\right.
\]

By introducing another parameter $\rho$, Awanou and Lai [2] further generalize SPAL as

\[
\left\{\begin{array}{l}
(G+\frac{1}{\gamma}BQ^{-1}B^{T})x_{k} = f-\frac{1}{\gamma}BQ^{-1}g-By_{k},\\[2pt]
y_{k+1} = y_{k}+\frac{1}{\rho}Q^{-1}(B^{T}x_{k}+g),
\end{array}\right. \tag{3}
\]

and give a first convergence analysis for the case of unsymmetric $G$. They say that the proofs in [26] using spectral arguments cannot be extended to the nonsymmetric case. Under the assumptions that $x^{T}Gx\geq 0$ for all $x$ and $x^{T}Gx=0$ with $B^{T}x=0$ implies $x=0$, they verify convergence by proving that $\|y_{k+1}-y_{*}\|_{Q}\leq\|y_{k}-y_{*}\|_{Q}$ and then $x_{k}$ converges to $x_{*}$, where $(x_{*},y_{*})$ is the exact solution of (1). Awanou and Lai [2] also say that their numerical experiments for an inexact Uzawa algorithm applied to (2) do not illustrate convergence. However, we have not been able to find their implementation of the inexact version and the numerical results.

We focus here on the inexact SPAL. Based on a simple splitting of the matrix in (1), we propose a stationary iterative method that is theoretically equivalent to (3) when $\gamma=\rho$. Hence, we also call it SPAL. We derive its convergence and semi-convergence for $B$ of any rank based on spectral arguments (unlike [2]) and obtain an explicit range of convergence for the parameter in SPAL. We allow $G$ here to be indefinite. Our SPAL requires an exact solution of a linear system at each step. To improve efficiency, we propose an inexact SPAL in which the linear system is solved inexactly. We show that it converges to the solution of (1) under reasonable conditions. Gradient methods are a class of simple optimization approaches that use the negative gradient of the objective function as a search direction. The Barzilai-Borwein (BB) method [5] is a gradient method for unconstrained optimization and has proved to be efficient for large, sparse unconstrained convex quadratic programs, which are equivalent to SPD linear systems. When $G$ is unsymmetric positive definite (UPD), the linear system (7) in SPAL is UPD as well. We use the BB method to solve this UPD linear system inexactly. We call the resulting method the augmented Lagrangian BB (SPALBB) algorithm and establish its convergence under suitable assumptions. Numerical experiments on linear systems from Navier-Stokes equations and coupled Stokes-Darcy flow show that SPALBB often solves problems more efficiently than GMRES [43] and BICGSTAB [47].
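For readers unfamiliar with the BB method, the following is a minimal sketch of the classical BB iteration with the step $\alpha_k = s^{T}s/s^{T}y$, applied to an SPD system $Ax=b$ (that is, to minimizing $\tfrac12 x^{T}Ax-b^{T}x$). It is not the variant used inside SPALBB for the UPD system (7), which is described later in the paper. The function name `bb_solve`, the initial step, and the stopping rule are assumptions made for illustration.

```python
import numpy as np

def bb_solve(A, b, x0=None, tol=1e-10, max_iter=1000):
    """Classical Barzilai-Borwein gradient iteration for A x = b, A SPD (sketch)."""
    x = np.zeros_like(b, dtype=float) if x0 is None else np.asarray(x0, dtype=float).copy()
    r = A @ x - b                         # gradient of 0.5 x^T A x - b^T x
    alpha = 1.0 / np.linalg.norm(A, 2)    # assumed initial step (not prescribed by BB)
    threshold = tol * max(np.linalg.norm(b), 1.0)
    for k in range(max_iter):
        if np.linalg.norm(r) <= threshold:
            return x, k
        x_new = x - alpha * r             # gradient step with the current BB step length
        r_new = A @ x_new - b
        s = x_new - x                     # change in the iterate
        y = r_new - r                     # change in the gradient, y = A s
        alpha = (s @ s) / (s @ y)         # BB step; s^T y = s^T A s > 0 for SPD A
        x, r = x_new, r_new
    return x, max_iter
```

The iteration needs only one matrix-vector product per step and no line search, which is the property that makes it attractive for large sparse systems.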

The paper is organized as follows. In Section 2, we introduce the augmented Lagrangian algorithm. Its convergence and semi-convergence are established in Sections 2.1 and 2.2. The inexact SPAL and its convergence analysis are provided in Section 3. The augmented Lagrangian BB algorithm is presented in Section 3.3. Numerical experiments are reported in Section 4. Conclusions appear in Section 5.

Notation

For any $H\in\mathbb{R}^{n\times n}$, we write its inverse, transpose, spectral set, nullspace and range space as $H^{-1}$, $H^{T}$, $\mathrm{sp}(H)$, $\mathrm{Null}(H)$, and $\mathrm{Range}(H)$. For any $x\in\mathbb{C}^{n}$, we write its conjugate transpose as $x^{*}$. For symmetric $H$, $\lambda_{\min}(H)$ and $\lambda_{\max}(H)$ denote the minimum and maximum eigenvalues. $\|\cdot\|$ denotes the $2$-norm of a vector or matrix. For an $n\times n$ SPD matrix $G$, $\|x\|_{G}=\sqrt{\langle Gx,x\rangle}=\|G^{1/2}x\|$ for all $x\in\mathbb{R}^{n}$, and $\|H\|_{G}=\sup_{x\neq 0}\frac{\|Hx\|_{G}}{\|x\|_{G}}=\|G^{1/2}HG^{-1/2}\|$ for all $H\in\mathbb{R}^{n\times n}$. For simplicity, the column vector $(x^{T}\ y^{T})^{T}$ is written $(x,y)$, $a_{+}:=\max\{0,a\}$, and $1/0:=+\infty$.
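The weighted norms can be checked numerically. The sketch below forms $G^{1/2}$ from an eigendecomposition and compares the two expressions for $\|x\|_{G}$; it also evaluates $\|H\|_{G}$ as $\|G^{1/2}HG^{-1/2}\|$ and verifies that one sample ratio $\|Hx\|_{G}/\|x\|_{G}$ does not exceed it. All data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

# Hypothetical data: an SPD weight G, an arbitrary H, and a test vector x.
C = rng.standard_normal((n, n))
G = C @ C.T + n * np.eye(n)
H = rng.standard_normal((n, n))
x = rng.standard_normal(n)

# Symmetric square root of G and its inverse via G = V diag(w) V^T.
w, V = np.linalg.eigh(G)
G_half = (V * np.sqrt(w)) @ V.T
G_half_inv = (V / np.sqrt(w)) @ V.T

x_norm_G = np.sqrt(x @ G @ x)               # sqrt(<Gx, x>)
x_norm_G_alt = np.linalg.norm(G_half @ x)   # ||G^{1/2} x||

H_norm_G = np.linalg.norm(G_half @ H @ G_half_inv, 2)
sample_ratio = np.linalg.norm(G_half @ (H @ x)) / x_norm_G   # ||Hx||_G / ||x||_G
```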

2 Augmented Lagrangian algorithm

We present SPAL for solving the unsymmetric saddle-point system (1). Let $Q$ be an SPD matrix and $\omega>0$. Since

\[
A := \begin{pmatrix} G & B \\ -B^{T} & 0 \end{pmatrix} = \begin{pmatrix} G & B \\ -B^{T} & \omega Q \end{pmatrix} - \begin{pmatrix} 0 & 0 \\ 0 & \omega Q \end{pmatrix}, \tag{4}
\]

the saddle-point system (1) is equivalent to

\[
\begin{pmatrix} G & B \\ -B^{T} & \omega Q \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} f \\ \omega Qy+g \end{pmatrix}.
\]

This suggests Algorithm 1 for solving system (1).

Lemma 2.2 shows that it is always possible to choose $Q$ and $\omega$ such that (7) is nonsingular, even if $A$ is singular.

If $G$ is symmetric, (1) is equivalent to the constrained optimization problem

\[
\min_{x}\ \tfrac{1}{2}x^{T}Gx-f^{T}x \quad\mathrm{s.t.}\quad g+B^{T}x=0. \tag{5}
\]

The $k$-th step of the augmented Lagrangian algorithm for (5) solves the subproblem

\[
\min_{x}\ \tfrac{1}{2}x^{T}Gx-f^{T}x+\frac{1}{2\omega}\left\|g+B^{T}x+\omega Qy_{k}\right\|_{Q^{-1}}^{2}, \tag{6}
\]

where $y_{k}$ is an estimate of the Lagrange multiplier.

Algorithm 1 The augmented Lagrangian algorithm SPAL for solving (1)
1:  Given $y_{0}\in\mathbb{R}^{m}$, $\omega>0$, and SPD $Q\in\mathbb{R}^{m\times m}$, set $k=0$.
2:  while a stopping condition is not satisfied do
3:     Compute $(x_{k+1},y_{k+1})$ according to the iteration
\[
\begin{pmatrix} G & B \\ -B^{T} & \omega Q \end{pmatrix}\begin{pmatrix} x_{k+1} \\ y_{k+1} \end{pmatrix} = \begin{pmatrix} f \\ \omega Qy_{k}+g \end{pmatrix}. \tag{7}
\]
4:     Increment $k$ by $1$.
5:  end while

The optimal solution $x_{k+1}$ of (6) satisfies

\[
(G+\tfrac{1}{\omega}BQ^{-1}B^{T})x_{k+1}+By_{k} = f-\tfrac{1}{\omega}BQ^{-1}g. \tag{8}
\]

The multiplier is updated as

\[
y_{k+1} = \frac{1}{\omega}Q^{-1}(g+B^{T}x_{k+1}+\omega Qy_{k}) = y_{k}+\frac{1}{\omega}Q^{-1}(B^{T}x_{k+1}+g). \tag{9}
\]

Note that (7) also gives (8)–(9): the second block-row of (7) is (9), and substituting (9) into the first block-row $Gx_{k+1}+By_{k+1}=f$ yields (8). Hence, we also call Algorithm 1 an augmented Lagrangian algorithm. Clearly, Algorithm 1 is theoretically equivalent to (3) if $\gamma=\rho=\omega$. When $G$ is symmetric, the convergence of SPAL or its variants has been studied in [26]. Awanou and Lai [2] first gave convergence results for (3) when $G$ is unsymmetric positive semidefinite but positive definite on $\mathrm{Null}(B^{T})$, based on analyzing the error $\|y_{k}-y_{*}\|_{Q}$, where $(x_{*},y_{*})$ is the exact solution of (1). Here we give the convergence analysis of SPAL in a different way, based on the spectral properties of $T$ in (15) below. We derive the explicit range of convergence for $\omega$ and do not require $G$ to be positive semidefinite.
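A minimal dense-matrix sketch of SPAL in the form (8)–(9) follows. It solves the $n\times n$ system with matrix $G+\frac{1}{\omega}BQ^{-1}B^{T}$ exactly at each step and then updates the multiplier. The function name `spal`, the stopping rule based on the residual of (1), and the default tolerances are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def spal(G, B, f, g, Q, omega, y0=None, tol=1e-10, max_iter=500):
    """Exact SPAL via (8)-(9) with dense solves (illustrative sketch)."""
    n, m = B.shape
    y = np.zeros(m) if y0 is None else np.asarray(y0, dtype=float).copy()
    Qinv_Bt = np.linalg.solve(Q, B.T)            # Q^{-1} B^T
    Qinv_g = np.linalg.solve(Q, g)               # Q^{-1} g
    S = G + (B @ Qinv_Bt) / omega                # G + (1/omega) B Q^{-1} B^T
    rhs0 = f - (B @ Qinv_g) / omega              # f - (1/omega) B Q^{-1} g
    x = np.zeros(n)
    for k in range(max_iter):
        x = np.linalg.solve(S, rhs0 - B @ y)                # x_{k+1} from (8)
        y = y + np.linalg.solve(Q, B.T @ x + g) / omega     # y_{k+1} from (9)
        r1 = G @ x + B @ y - f                   # first block residual of (1)
        r2 = -B.T @ x - g                        # second block residual of (1)
        if np.hypot(np.linalg.norm(r1), np.linalg.norm(r2)) <= tol:
            return x, y, k + 1
    return x, y, max_iter
```

In practice one would factorize the matrix of (8) once and reuse the factors across iterations; the repeated dense solve here is kept only for brevity.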

We call $A=M-N$ a splitting if $M$ is nonsingular. Defining $T=M^{-1}N$, we consider the following iteration scheme for solving $Az=\ell$:

\[
z_{k+1}=Tz_{k}+M^{-1}\ell. \tag{10}
\]
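The scheme (10) can be sketched generically as below. Each step is written as a solve with $M$ rather than forming $T$ explicitly. When $A$ is nonsingular, the standard criterion is that the iteration converges for every starting point exactly when the spectral radius of $T$ is less than one. The helper names `stationary_iteration` and `spectral_radius` are hypothetical.

```python
import numpy as np

def stationary_iteration(M, N, ell, z0, num_iter):
    """Run z_{k+1} = M^{-1}(N z_k + ell) for a splitting A = M - N (sketch)."""
    z = np.asarray(z0, dtype=float).copy()
    for _ in range(num_iter):
        z = np.linalg.solve(M, N @ z + ell)   # same as T z + M^{-1} ell
    return z

def spectral_radius(M, N):
    """Largest eigenvalue modulus of T = M^{-1} N."""
    T = np.linalg.solve(M, N)
    return np.max(np.abs(np.linalg.eigvals(T)))
```

For the splitting (4), `M` and `N` would be the two block matrices on the right-hand side of (4) and `ell` the vector $(f,g)$.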

First, we show that (4) is a splitting of $A$ in (1). For convenience, we introduce

\begin{align}
S_{Q} &= G+\frac{1}{\omega}BQ^{-1}B^{T}, & H &= \tfrac{1}{2}(G+G^{T}), \tag{11}\\
M &= \begin{pmatrix} G & B \\ -B^{T} & \omega Q \end{pmatrix}, & N &= \begin{pmatrix} 0 & 0 \\ 0 & \omega Q \end{pmatrix}. \tag{12}
\end{align}

Note that $S_{Q}$ is the Schur complement of $\omega Q$ in $M$.

Lemma 2.1.

Let $G\in\mathbb{R}^{n\times n}$ be unsymmetric but positive definite on $\mathrm{Null}(B^{T})$, and

\[
\eta=\inf_{x\notin\mathrm{Null}(B^{T})}\frac{x^{T}Hx}{x^{T}BQ^{-1}B^{T}x}. \tag{13}
\]

For any SPD $Q\in\mathbb{R}^{m\times m}$, if $0<\omega<1/(-\eta)_{+}$, then $S_{Q}$ is positive definite.

Proof.

Since G𝐺Gitalic_G is positive definite on Null(BT)Nullsuperscript𝐵𝑇\mathop{\mathrm{Null}}(B^{T}\!\,)roman_Null ( italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ), so is H𝐻Hitalic_H. Then for any nonzero xNull(BT)𝑥Nullsuperscript𝐵𝑇x\in\mathop{\mathrm{Null}}(B^{T})italic_x ∈ roman_Null ( italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ), it holds that xT(H+1ωBQ1BT)x=xTHx>0.superscript𝑥𝑇𝐻1𝜔𝐵superscript𝑄1superscript𝐵𝑇𝑥superscript𝑥𝑇𝐻𝑥0x^{T}(H+\tfrac{1}{\omega}BQ^{-1}B^{T})x=x^{T}Hx>0.italic_x start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( italic_H + divide start_ARG 1 end_ARG start_ARG italic_ω end_ARG italic_B italic_Q start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ) italic_x = italic_x start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_H italic_x > 0 . For any xNull(BT)𝑥Nullsuperscript𝐵𝑇x\notin\mathop{\mathrm{Null}}(B^{T})italic_x ∉ roman_Null ( italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ), as η>1/ω𝜂1𝜔\eta>-1/\omegaitalic_η > - 1 / italic_ω, we have

xT(H+1ωBQ1BT)x=xTHx+1ωxTBQ1BTx(η+1ω)xTBQ1BTx>0.superscript𝑥𝑇𝐻1𝜔𝐵superscript𝑄1superscript𝐵𝑇𝑥superscript𝑥𝑇𝐻𝑥1𝜔superscript𝑥𝑇𝐵superscript𝑄1superscript𝐵𝑇𝑥𝜂1𝜔superscript𝑥𝑇𝐵superscript𝑄1superscript𝐵𝑇𝑥0x^{T}(H+\frac{1}{\omega}BQ^{-1}B^{T})x=x^{T}Hx+\frac{1}{\omega}x^{T}BQ^{-1}B^{% T}x\geq(\eta+\frac{1}{\omega})x^{T}BQ^{-1}B^{T}x>0.italic_x start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( italic_H + divide start_ARG 1 end_ARG start_ARG italic_ω end_ARG italic_B italic_Q start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ) italic_x = italic_x start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_H italic_x + divide start_ARG 1 end_ARG start_ARG italic_ω end_ARG italic_x start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_B italic_Q start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x ≥ ( italic_η + divide start_ARG 1 end_ARG start_ARG italic_ω end_ARG ) italic_x start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_B italic_Q start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x > 0 .

Hence $S_Q$ is positive definite because, for any nonzero $x\in\mathbb{R}^n$, $x^T(S_Q+S_Q^T)x=2x^T(H+\tfrac{1}{\omega}BQ^{-1}B^T)x>0$.

By Lemma 2.1 and some algebraic manipulation, we have the following results.

Lemma 2.2.

Under the same conditions as in Lemma 2.1, $M$ is nonsingular and

\[
M^{-1}=\begin{pmatrix}
S_Q^{-1} & -\dfrac{1}{\omega}S_Q^{-1}BQ^{-1}\\[8pt]
\dfrac{1}{\omega}Q^{-1}B^TS_Q^{-1} & \dfrac{1}{\omega}Q^{-1}-\dfrac{1}{\omega^2}Q^{-1}B^TS_Q^{-1}BQ^{-1}
\end{pmatrix}.
\tag{14}
\]
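Formula (14) can be checked numerically. The NumPy sketch below is an illustration only, not part of the original analysis. It assumes $S_Q=G+\tfrac{1}{\omega}BQ^{-1}B^T$ (the form consistent with the proof of Lemma 2.1) and $M=\begin{pmatrix}G&B\\-B^T&\omega Q\end{pmatrix}$; the matrices `G`, `B`, `Q` and the value of `omega` are hypothetical test data.

```python
import numpy as np

n, m, omega = 5, 2, 0.5

# Hypothetical test data: G = 2I + (skew part) is unsymmetric and positive definite.
K = np.triu(np.ones((n, n)), 1) - np.tril(np.ones((n, n)), -1)
G = 2.0 * np.eye(n) + 0.5 * K
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [0.0, 2.0]])
Q = np.diag([1.0, 3.0])  # any SPD matrix

Qinv = np.linalg.inv(Q)
S_Q = G + (B @ Qinv @ B.T) / omega  # assumed definition of S_Q
S_inv = np.linalg.inv(S_Q)

M = np.block([[G, B], [-B.T, omega * Q]])

# The four blocks of (14).
M_inv_formula = np.block([
    [S_inv, -(S_inv @ B @ Qinv) / omega],
    [(Qinv @ B.T @ S_inv) / omega,
     Qinv / omega - (Qinv @ B.T @ S_inv @ B @ Qinv) / omega**2],
])

# If (14) is the inverse of M, this product is the identity up to rounding.
err_14 = np.linalg.norm(M_inv_formula @ M - np.eye(n + m))
print(err_14)
```

Explicit inverses are formed here only to mirror the formula; a practical solver would factorize instead.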

Lemma 2.3.

Under the same conditions as in Lemma 2.1, the iteration matrix of Algorithm 1 is

\[
T=M^{-1}N=\begin{pmatrix}
0 & -S_Q^{-1}B\\
0 & I-\dfrac{1}{\omega}Q^{-1}B^TS_Q^{-1}B
\end{pmatrix}
\tag{15}
\]

and the eigenvalues of $T$ are $0$ with algebraic multiplicity $n$, $1$ with algebraic multiplicity $m-s$, and the remaining $s$ eigenvalues are $\omega\mu/(1+\omega\mu)$, where $s$ is the rank of $B$ and $\mu$ is a generalized eigenvalue of $G$ and $BQ^{-1}B^T$ corresponding to the generalized eigenvector $x\notin\mathop{\mathrm{Null}}(B^T)$.

Proof.

It follows from (12) and (14) that

\[
T=\begin{pmatrix} G & B\\ -B^T & \omega Q\end{pmatrix}^{-1}
\begin{pmatrix} 0 & 0\\ 0 & \omega Q\end{pmatrix}
=\begin{pmatrix} 0 & -S_Q^{-1}B\\ 0 & I-\dfrac{1}{\omega}Q^{-1}B^TS_Q^{-1}B\end{pmatrix}.
\]

Clearly, $T$ has an eigenvalue $0$ with algebraic multiplicity $n$, and the remaining $m$ eigenvalues are $1-\lambda/\omega$, where $\lambda$ is an eigenvalue of $Q^{-1}B^TS_Q^{-1}B$.

Since $S_Q$ is positive definite and $Q$ is SPD, $Q^{-1}B^TS_Q^{-1}B$ is nonsingular when $B$ has full column rank. Thus, $\lambda=0$ is an eigenvalue if and only if $B$ is column rank-deficient. In this case, $1$ is an eigenvalue of $T$ with algebraic multiplicity $m-s$.

If $\lambda\neq 0$, note that $Q^{-1}B^TS_Q^{-1}B$ and $S_Q^{-1}BQ^{-1}B^T$ possess the same nonzero eigenvalues, and $\lambda$ is also an eigenvalue of $S_Q^{-1}BQ^{-1}B^T$. Then there exists $x\notin\mathop{\mathrm{Null}}(B^T)$ such that $S_Q^{-1}BQ^{-1}B^Tx=\lambda x$. Combining with (11) leads to

\[
Gx=\dfrac{\omega-\lambda}{\omega\lambda}BQ^{-1}B^Tx.
\tag{16}
\]

Hence there exists a generalized eigenvalue $\mu$ of $G$ and $BQ^{-1}B^T$ corresponding to the generalized eigenvector $x\notin\mathop{\mathrm{Null}}(B^T)$ such that $\mu=\tfrac{\omega-\lambda}{\omega\lambda}$, i.e., $\lambda=\tfrac{\omega}{1+\omega\mu}$. Therefore, we know that the remaining eigenvalues of $T$ are $1-\tfrac{1}{1+\omega\mu}=\tfrac{\omega\mu}{1+\omega\mu}$.
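The spectrum described in Lemma 2.3 can be illustrated numerically in the full column rank case. The sketch below is a hedged illustration with hypothetical test matrices, not the authors' code. It assumes $G$ is nonsingular, so that the finite generalized eigenvalues $\mu$ are the reciprocals of the eigenvalues of the $m\times m$ matrix $Q^{-1}B^TG^{-1}B$ (which shares its nonzero eigenvalues with $G^{-1}BQ^{-1}B^T$).

```python
import numpy as np

n, m, omega = 5, 2, 0.5

# Hypothetical test data (G nonsingular and positive definite, B of full column rank).
K = np.triu(np.ones((n, n)), 1) - np.tril(np.ones((n, n)), -1)
G = 2.0 * np.eye(n) + 0.5 * K
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [0.0, 2.0]])
Q = np.diag([1.0, 3.0])

M = np.block([[G, B], [-B.T, omega * Q]])
N = np.zeros((n + m, n + m))
N[n:, n:] = omega * Q

T = np.linalg.solve(M, N)          # iteration matrix T = M^{-1} N
eig_T = np.linalg.eigvals(T)

# G x = mu * (B Q^{-1} B^T) x with B^T x != 0  <=>  1/mu is an eigenvalue of W.
W = np.linalg.solve(Q, B.T @ np.linalg.solve(G, B))
mu = 1.0 / np.linalg.eigvals(W)
predicted = omega * mu / (1.0 + omega * mu)

# Count the zero eigenvalues and match each predicted value to a computed one.
zero_count = int(np.sum(np.abs(eig_T) < 1e-8))
mismatch = max(np.min(np.abs(eig_T - p)) for p in predicted)
print(zero_count, mismatch)
```

Matching each predicted value to its nearest computed eigenvalue avoids relying on any particular ordering of complex eigenvalues.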

We emphasize that Lemmas 2.1, 2.2 and 2.3 hold even if $B$ is column rank-deficient. From Lemma 2.2, we know that $A=M-N$ is a splitting of $A$. Then the convergence analysis of Algorithm 1 can be based on the spectral properties of $T=M^{-1}N$. In the following, we discuss the convergence of Algorithm 1 when $B$ has full column rank and when it does not.

2.1 Convergence analysis when $B$ has full column rank

In this case, $A$ is nonsingular and the saddle-point system (1) has a unique solution.

Theorem 2.1.

Suppose $B\in\mathbb{R}^{n\times m}$ has full column rank and $G\in\mathbb{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^T)$. For any SPD $Q\in\mathbb{R}^{m\times m}$, let $\eta$ be defined by (13). If $0<\omega<1/(-2\eta)_+$, then the sequence $\{x_k,y_k\}$ produced by Algorithm 1 converges to the unique solution of saddle-point system (1).

Proof 2.2.

Algorithm 1 is convergent if and only if the spectral radius of $T$ is less than $1$ [42, Theorem 4.1]. Since $0<\omega<1/(-2\eta)_+\le 1/(-\eta)_+$, the conditions of Lemma 2.1 hold. As $B$ has full column rank, it follows from Lemma 2.3 that $1$ is not an eigenvalue of $T$ and then

\[
\rho(T)=\max_{\mu}\dfrac{\omega|\mu|}{|1+\omega\mu|}
=\max_{\mu}\sqrt{\dfrac{(\omega\mu_1)^2+(\omega\mu_2)^2}{(1+\omega\mu_1)^2+(\omega\mu_2)^2}},
\tag{17}
\]

where $\mu=\mu_1+\mathrm{i}\mu_2$ is the generalized eigenvalue of $G$ and $BQ^{-1}B^T$ corresponding to the generalized eigenvector $x\notin\mathop{\mathrm{Null}}(B^T)$. Since $x\notin\mathop{\mathrm{Null}}(B^T)$ and $Q$ is SPD, we have $x^*BQ^{-1}B^Tx>0$. Combining with (16) gives $\mu=\frac{x^*Gx}{x^*BQ^{-1}B^Tx}$. Then

\[
\mu_1=\dfrac{x^*(G+G^T)x}{2x^*BQ^{-1}B^Tx}=\dfrac{x^*Hx}{x^*BQ^{-1}B^Tx}\ge\eta.
\tag{18}
\]

Note that $\eta>-1/(2\omega)$ and $\omega>0$, so that $1+\omega\mu_1\ge 1+\omega\eta>1/2$. Hence $(1+\omega\mu_1)^2-(\omega\mu_1)^2=1+2\omega\mu_1>0$, so every ratio under the square root in (17) is less than $1$, which leads to $\rho(T)<1$. Therefore, Algorithm 1 is convergent.
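The convergence statement can be illustrated with a small numerical experiment. The sketch below is a hedged illustration rather than the authors' implementation: it assumes Algorithm 1 is the stationary iteration $z_{k+1}=M^{-1}(Nz_k+\ell)$ associated with the splitting $A=M-N$, uses hypothetical test matrices with $G$ positive definite (so any $\omega>0$ is admissible), and picks an arbitrary right-hand side `rhs`.

```python
import numpy as np

n, m, omega = 5, 2, 0.1

# Hypothetical test data: G unsymmetric positive definite, B of full column rank.
K = np.triu(np.ones((n, n)), 1) - np.tril(np.ones((n, n)), -1)
G = 2.0 * np.eye(n) + 0.5 * K
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [0.0, 2.0]])
Q = np.eye(m)

A = np.block([[G, B], [-B.T, np.zeros((m, m))]])
M = np.block([[G, B], [-B.T, omega * Q]])
N = M - A                              # N = diag(0, omega * Q)
rhs = np.arange(1.0, n + m + 1.0)      # arbitrary right-hand side

# Spectral radius of the iteration matrix T = M^{-1} N.
rho = max(abs(np.linalg.eigvals(np.linalg.solve(M, N))))

# Stationary iteration z_{k+1} = M^{-1} (N z_k + rhs), started from zero.
z = np.zeros(n + m)
for k in range(200):
    z = np.linalg.solve(M, N @ z + rhs)

z_exact = np.linalg.solve(A, rhs)
err = np.linalg.norm(z - z_exact) / np.linalg.norm(z_exact)
print(rho, err)
```

In this sketch each step solves with $M$ directly; the inexact variants studied later in the paper replace that solve by an inner iteration.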

Remark 1.

From (17) we see that $\rho(T)$ decreases as $\omega$ decreases. This means that the convergence rate of Algorithm 1 will improve as $\omega$ decreases. In particular, if $\omega=0$ (which means no splitting), $\rho(T)=0$. Algorithm 1 then reduces to the exact method for problem (1). This is consistent with (7), i.e., Algorithm 1 performs only one iteration. In addition, since $\rho(T)\rightarrow 0$ as $|\mu|\rightarrow 0$, $Q$ should be chosen such that the generalized eigenvalues of $G$ and $BQ^{-1}B^T$ are very close to $0$. Therefore, we can choose $Q$ with very small norm.
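The dependence of $\rho(T)$ on $\omega$ can be observed on a small example. This is a hedged sketch with hypothetical test data; the helper name `spectral_radius` is introduced here for illustration and does not appear in the paper.

```python
import numpy as np

def spectral_radius(omega, G, B, Q):
    """Spectral radius of T = M^{-1} N for the splitting with parameter omega."""
    n, m = B.shape
    M = np.block([[G, B], [-B.T, omega * Q]])
    N = np.zeros((n + m, n + m))
    N[n:, n:] = omega * Q
    return max(abs(np.linalg.eigvals(np.linalg.solve(M, N))))

n, m = 5, 2
# Hypothetical test data: G unsymmetric positive definite, B of full column rank.
K = np.triu(np.ones((n, n)), 1) - np.tril(np.ones((n, n)), -1)
G = 2.0 * np.eye(n) + 0.5 * K
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [0.0, 2.0]])
Q = np.eye(m)

omegas = [0.01, 0.1, 1.0, 10.0]
rhos = [spectral_radius(w, G, B, Q) for w in omegas]
for w, r in zip(omegas, rhos):
    print(f"omega = {w:6.2f}   rho(T) = {r:.4f}")
```

Smaller values of `omega` give a smaller spectral radius here, at the price of a system with $M$ that is closer to the original matrix $A$.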

Remark 2.

If $G$ is positive semidefinite, we see that $\eta\ge 0$. Then Algorithm 1 is convergent for any $\omega>0$.

2.2 Convergence analysis when $B$ is rank-deficient

In this case, $A$ is singular. We assume that system (1) is solvable and show that Algorithm 1 is semi-convergent. To this end, we introduce some preliminaries on the semi-convergence of iteration scheme (10) for a general linear system $Az=\ell$.

Definition 3.

(Berman and Plemmons [9, Lemma 6.13])  Iteration (10) is semi-convergent if, for any initial guess $z_0$, the iteration sequence $\{z_k\}$ produced by (10) converges to a solution $z$ of $Az=\ell$ such that $z=(I-T)^DM^{-1}\ell+[I-(I-T)^D(I-T)]z_0$, where $(I-T)^D$ denotes the Drazin inverse [14] of $I-T$.

Lemma 4 ([9, Theorem 6.19]).

Iteration (10) is semi-convergent if and only if $\mathrm{index}(I-T)=1$ and $v(T)<1$, where $\mathrm{index}(I-T)$ is the smallest nonnegative integer $k$ such that the ranks of $(I-T)^k$ and $(I-T)^{k+1}$ are equal, and $v(T)=\max\{|\lambda|:\ \lambda\in\mathrm{sp}(T),\ \lambda\neq 1\}$ is called the pseudo-spectral radius of $T$.

Lemma 5 ([49, Theorem 2.5]).

$\mathrm{index}(I-T)=1$ holds if and only if, for all $0\neq w\in\mathop{\mathrm{Range}}(A)$, $w\notin\mathop{\mathrm{Null}}(AM^{-1})$, i.e., $\mathop{\mathrm{Range}}(A)\cap\mathop{\mathrm{Null}}(AM^{-1})=\{0\}$.

In the following, we analyze the semi-convergence property for Algorithm 1. By Lemma 4, we first need to show that $\mathrm{index}(I-T)=1$.

Theorem 2.3.

Suppose $B\in\mathbb{R}^{n\times m}$ is rank-deficient and $G\in\mathbb{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^T)$. For any SPD $Q\in\mathbb{R}^{m\times m}$, let $\eta$ be defined by (13). If $0<\omega<1/(-\eta)_+$, then $\mathrm{index}(I-T)=1$.

Proof 2.4.

Suppose $0\neq w\in\mathop{\mathrm{Range}}(A)$. Then there is $v=(v_1,v_2)\in\mathbb{R}^{n+m}$ such that

\[
w=Av=\begin{pmatrix} G & B\\ -B^T & 0\end{pmatrix}\begin{pmatrix} v_1\\ v_2\end{pmatrix}
=\begin{pmatrix} Gv_1+Bv_2\\ -B^Tv_1\end{pmatrix}\neq 0.
\tag{19}
\]

By (14), we have

\[
\begin{aligned}
AM^{-1}w &= \begin{pmatrix} I & 0\\ -B^TS_Q^{-1} & \dfrac{1}{\omega}B^TS_Q^{-1}BQ^{-1}\end{pmatrix}
\begin{pmatrix} Gv_1+Bv_2\\ -B^Tv_1\end{pmatrix}\\
&= \begin{pmatrix} Gv_1+Bv_2\\ -B^TS_Q^{-1}(Gv_1+Bv_2)-\dfrac{1}{\omega}B^TS_Q^{-1}BQ^{-1}B^Tv_1\end{pmatrix}.
\end{aligned}
\tag{20}
\]

If $Gv_1+Bv_2\neq 0$, clearly, $AM^{-1}w\neq 0$, which shows that $w\notin\mathop{\mathrm{Null}}(AM^{-1})$.

If $Gv_1+Bv_2=0$, it follows from (19) that $B^Tv_1\neq 0$ and (20) yields

\[
AM^{-1}w=\begin{pmatrix} 0\\ -\dfrac{1}{\omega}B^TS_Q^{-1}BQ^{-1}B^Tv_1\end{pmatrix}.
\tag{21}
\]

Note that $Q$ is SPD and $B^Tv_1\neq 0$, so that $BQ^{-1}B^Tv_1\neq 0$. We claim that

\[
B^TS_Q^{-1}BQ^{-1}B^Tv_1\neq 0.
\]

Indeed, if $B^TS_Q^{-1}BQ^{-1}B^Tv_1=0$, clearly $v_1^TBQ^{-1}B^TS_Q^{-1}BQ^{-1}B^Tv_1=0$. Since $S_Q$ is positive definite, $S_Q^{-1}$ is also positive definite, which leads to $BQ^{-1}B^Tv_1=0$. This is a contradiction. Therefore, we still get $w\notin\mathop{\mathrm{Null}}(AM^{-1})$ by (21).
Summing up, for any $0\neq w\in\mathop{\mathrm{Range}}(A)$, $w\notin\mathop{\mathrm{Null}}(AM^{-1})$. The result follows from Lemma 5.
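The index condition can be checked numerically through the rank characterization in Lemma 4: the index equals $1$ when $I-T$ is singular and $\mathrm{rank}(I-T)=\mathrm{rank}((I-T)^2)$. The sketch below is a hedged illustration with hypothetical test data (a rank-deficient `B` built by repeating a linear combination of columns) and an explicit rank tolerance `tol` chosen for this example.

```python
import numpy as np

n, m, omega = 5, 3, 0.5

# Hypothetical test data: G unsymmetric positive definite, B rank-deficient.
K = np.triu(np.ones((n, n)), 1) - np.tril(np.ones((n, n)), -1)
G = 2.0 * np.eye(n) + 0.5 * K
B2 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [0.0, 2.0]])
B = np.column_stack([B2, B2[:, 0] + B2[:, 1]])   # third column = sum of the first two
Q = np.eye(m)

A = np.block([[G, B], [-B.T, np.zeros((m, m))]])
M = np.block([[G, B], [-B.T, omega * Q]])
I_minus_T = np.linalg.solve(M, A)   # I - T = M^{-1}(M - N) = M^{-1} A

tol = 1e-10                         # explicit threshold for the numerical rank
s = np.linalg.matrix_rank(B, tol=tol)
rank1 = np.linalg.matrix_rank(I_minus_T, tol=tol)
rank2 = np.linalg.matrix_rank(I_minus_T @ I_minus_T, tol=tol)
print(s, rank1, rank2)
```

Equal ranks for the first and second powers, together with `rank1 < n + m`, indicate that the index is exactly $1$ on this example.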

Next, we show that $v(T)<1$.

Theorem 2.5.

Suppose $B\in\mathbb{R}^{n\times m}$ is rank-deficient and $G\in\mathbb{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^T)$. For any SPD $Q\in\mathbb{R}^{m\times m}$, let $\eta$ be defined by (13). If $0<\omega<1/(-2\eta)_+$, then $v(T)<1$.

Proof 2.6.

Since $0<\omega<1/(-2\eta)_+\le 1/(-\eta)_+$, the conditions of Lemma 2.1 hold. By the definition of the pseudo-spectral radius in Lemma 4 and by Lemma 2.3,

\[
v(T)=\max_{\mu}\dfrac{\omega|\mu|}{|1+\omega\mu|}
=\max_{\mu}\sqrt{\dfrac{(\omega\mu_1)^2+(\omega\mu_2)^2}{(1+\omega\mu_1)^2+(\omega\mu_2)^2}},
\]

where μ=μ1+iμ2𝜇subscript𝜇1isubscript𝜇2\mu=\mu_{1}+{\rm i}\mu_{2}italic_μ = italic_μ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + roman_i italic_μ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is the generalized eigenvalue of G𝐺Gitalic_G and BQ1BT𝐵superscript𝑄1superscript𝐵𝑇BQ^{-1}B^{T}italic_B italic_Q start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT that corresponds to the generalized eigenvector xNull(BT)𝑥Nullsuperscript𝐵𝑇x\notin\mathop{\mathrm{Null}}(B^{T})italic_x ∉ roman_Null ( italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ). By (18), ω>0𝜔0\omega>0italic_ω > 0 and η>1/(2ω)𝜂12𝜔\eta>-1/(2\omega)italic_η > - 1 / ( 2 italic_ω ), we have 1+2ωμ11+2ωη>012𝜔subscript𝜇112𝜔𝜂01+2\omega\mu_{1}\geq 1+2\omega\eta>01 + 2 italic_ω italic_μ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ≥ 1 + 2 italic_ω italic_η > 0, giving v(T)<1𝑣𝑇1v(T)<1italic_v ( italic_T ) < 1.
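The final implication of the proof rests on a one-line identity. The following display is a sketch of that algebra, using only the symbols $\omega$, $\mu_1$, $\mu_2$ already introduced; it is an added worked step, not part of the original argument.

```latex
\[
|1+\omega\mu|^2-\omega^2|\mu|^2
 =(1+\omega\mu_1)^2+(\omega\mu_2)^2-(\omega\mu_1)^2-(\omega\mu_2)^2
 =1+2\omega\mu_1 ,
\]
% Hence \omega|\mu| < |1+\omega\mu| holds exactly when 1+2\omega\mu_1 > 0.
```

Because $1+2\omega\mu_1>0$ for every admissible $\mu$, each ratio in the maximum is strictly below $1$.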

Combining Lemma 4 with Theorems 2.3 and 2.5 and $1/(-2\eta)_+\le 1/(-\eta)_+$, we get the following convergence result.

Theorem 2.7.

Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient, and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^T)$. For any SPD $Q\in\mathds{R}^{m\times m}$, let $\eta$ be defined by (13). If $0<\omega<1/(-2\eta)_+$, then the sequence $\{x_k,y_k\}$ produced by Algorithm 1 is semi-convergent to a solution of the singular saddle-point system (1).

3 Inexact augmented Lagrangian algorithm

In this section, we develop and analyze inexact SPAL to solve (1). Let $\ell=(f,g)$, $z_k=(x_k,y_k)$, and $r_k=Az_k-\ell$. It follows from (10) and $A=M-N$ that Algorithm 1 is equivalent to

\[
z_{k+1}=M^{-1}Nz_k+M^{-1}\ell=M^{-1}(M-A)z_k+M^{-1}\ell=z_k-M^{-1}r_k, \tag{22}
\]

where $M$ and $N$ are defined in (12). To describe the inexact version of Algorithm 1, as done in [30], we introduce a nonlinear mapping $\Psi:\mathds{R}^{n+m}\longrightarrow\mathds{R}^{n+m}$ such that for any given $r\in\mathds{R}^{n+m}$, $\Psi(r)$ approximates the solution $\Delta z$ of $M\Delta z=r$ in that

\[
\|r-M\Psi(r)\|_{*}\le\delta\|r\|_{*} \tag{23}
\]

for some $\delta\in[0,1)$ and some norm $\|\cdot\|_{*}$. We obtain the inexact augmented Lagrangian algorithm of Algorithm 2, where the main idea is to approximate $M^{-1}r_k$ in (22).

Algorithm 2 Inexact augmented Lagrangian algorithm
1:  Given $z_0=(x_0,y_0)\in\mathds{R}^{n+m}$, $\omega>0$, $0\le\delta<1$ and SPD $Q$, set $k=0$.
2:  while a stopping condition is not satisfied do
3:     Compute $r_k=Az_k-\ell$.
4:     Compute $\Psi(r_k)\approx M^{-1}r_k$ satisfying (23).
5:     Compute $z_{k+1}=z_k-\Psi(r_k)$.
6:     Increment $k$ by $1$.
7:  end while
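The loop of Algorithm 2 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the block forms $A=\begin{pmatrix}G&B\\-B^T&0\end{pmatrix}$ and $M=\begin{pmatrix}G&B\\-B^T&\omega Q\end{pmatrix}$ are assumed (they are consistent with the expression for $NM^{-1}$ used later, but (1) and (12) are not restated here); the function name `inexact_spal` and the test data are hypothetical; and $\Psi$ is realized by a plain minimal-residual iteration rather than the BB method, which requires the symmetric part of $G$ to be positive definite.

```python
import numpy as np

def inexact_spal(G, B, Q, f, g, omega, delta, beta=1.0,
                 tol=1e-10, max_outer=500, max_inner=100_000):
    """Sketch of Algorithm 2 with a minimal-residual inner solver as Psi."""
    n, m = B.shape
    # Assumed block forms of A and M (not restated in this section).
    A = np.block([[G, B], [-B.T, np.zeros((m, m))]])
    M = np.block([[G, B], [-B.T, omega * Q]])
    ell = np.concatenate([f, g])
    # P_beta = diag(I, beta * Q^{-1}) defines the norm used in (23).
    P = np.block([[np.eye(n), np.zeros((n, m))],
                  [np.zeros((m, n)), beta * np.linalg.inv(Q)]])

    def pnorm(v):
        return np.sqrt(v @ P @ v)

    def psi(r):
        # Hypothetical inner solver: minimal-residual steps on M dz = r,
        # stopped once ||r - M dz||_P <= delta * ||r||_P, which is (23).
        # If max_inner is exhausted, the last iterate is returned as is.
        dz = np.zeros_like(r)
        res = r.copy()
        target = delta * pnorm(r)
        for _ in range(max_inner):
            if pnorm(res) <= target:
                break
            M_res = M @ res
            alpha = (res @ M_res) / (M_res @ M_res)
            dz += alpha * res
            res -= alpha * M_res
        return dz

    z = np.zeros(n + m)
    for k in range(max_outer):
        r = A @ z - ell                       # step 3
        if np.linalg.norm(r) <= tol * np.linalg.norm(ell):
            return z, k
        z = z - psi(r)                        # steps 4 and 5
    return z, max_outer

# Tiny illustrative problem (n = 2, m = 1) with a known solution.
G = np.array([[2.0, 1.0], [-1.0, 2.0]])      # unsymmetric, symmetric part 2I
B = np.array([[1.0], [0.0]])                 # full column rank
Q = np.array([[1.0]])
z_true = np.array([1.0, 2.0, 3.0])
A = np.block([[G, B], [-B.T, np.zeros((1, 1))]])
ell = A @ z_true
z, iters = inexact_spal(G, B, Q, ell[:2], ell[2:], omega=0.5, delta=0.1)
print(iters, np.linalg.norm(z - z_true))
```

The inner loop never forms $M^{-1}$; it only needs products with $M$, which is the point of replacing the exact solve by $\Psi$.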

In our convergence analysis we use $\|\cdot\|_{P_\beta}$ in (23), where $P_\beta=\smash[b]{\begin{pmatrix}I&0\\ 0&\beta Q^{-1}\end{pmatrix}}$ is SPD and $\beta>0$ is an arbitrary constant. By Algorithm 2,

\begin{align*}
r_{k+1} &= Az_{k+1}-\ell=A(z_k-\Psi(r_k))-\ell=r_k-A\Psi(r_k) \tag{24}\\
&= (I-AM^{-1})r_k+AM^{-1}(r_k-M\Psi(r_k))\\
&= NM^{-1}r_k+(I-NM^{-1})(r_k-M\Psi(r_k)).
\end{align*}
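The last two lines of (24) use only the splitting $A=M-N$. The short derivation below spells out those steps; it is an added sketch that introduces no new symbols.

```latex
\begin{align*}
r_k-A\Psi(r_k) &= r_k-AM^{-1}r_k+AM^{-1}r_k-AM^{-1}M\Psi(r_k)\\
               &= (I-AM^{-1})r_k+AM^{-1}\bigl(r_k-M\Psi(r_k)\bigr),\\
I-AM^{-1} &= (M-A)M^{-1}=NM^{-1},\qquad AM^{-1}=I-NM^{-1}.
\end{align*}
% With an exact solve, \Psi(r_k)=M^{-1}r_k, the second term vanishes
% and r_{k+1}=NM^{-1}r_k, the residual recurrence of Algorithm 1.
```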

As for Algorithm 1, we discuss the convergence of Algorithm 2 separately for the case in which $B$ has full column rank and the case in which it does not.

3.1 Convergence analysis when $B$ has full column rank

Note that $P_\beta$ is SPD, and (24) gives

\[
P_\beta^{\tfrac{1}{2}}r_{k+1}=P_\beta^{\tfrac{1}{2}}NM^{-1}P_\beta^{-\tfrac{1}{2}}P_\beta^{\tfrac{1}{2}}r_k+P_\beta^{\tfrac{1}{2}}(I-NM^{-1})P_\beta^{-\tfrac{1}{2}}P_\beta^{\tfrac{1}{2}}(r_k-M\Psi(r_k)).
\]

This along with (23) yields

\begin{align*}
\|r_{k+1}\|_{P_\beta} &\le \|P_\beta^{\tfrac{1}{2}}NM^{-1}P_\beta^{-\tfrac{1}{2}}\|\,\|r_k\|_{P_\beta}+\|P_\beta^{\tfrac{1}{2}}(I-NM^{-1})P_\beta^{-\tfrac{1}{2}}\|\,\|r_k-M\Psi(r_k)\|_{P_\beta} \tag{25}\\
&\le \big(\|P_\beta^{\tfrac{1}{2}}NM^{-1}P_\beta^{-\tfrac{1}{2}}\|+\delta\|I-P_\beta^{\tfrac{1}{2}}NM^{-1}P_\beta^{-\tfrac{1}{2}}\|\big)\|r_k\|_{P_\beta}\\
&= \big(\|NM^{-1}\|_{P_\beta}+\delta\|I-NM^{-1}\|_{P_\beta}\big)\|r_k\|_{P_\beta}.
\end{align*}

The following result provides sufficient conditions for $\|NM^{-1}\|_{P_\beta}<1$.

Lemma 1.

Suppose $B\in\mathds{R}^{n\times m}$ has full column rank and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^T)$. For any $\beta>0$ and SPD $Q\in\mathds{R}^{m\times m}$, let $\eta$ be defined by (13) and $\lambda_1$ be the minimum eigenvalue of $2\omega H+BQ^{-1}B^T$. Then, $\lambda_1>0$ and if $0<\omega<\min\left\{1/(-2\eta)_+,\,\sqrt{\lambda_1/\beta}\,\right\}$, we have $\|NM^{-1}\|_{P_\beta}<1$.

Proof 3.1.

It follows from $0<\omega<1/(-2\eta)_+\le 1/(-\eta)_+$ that $S_Q$ is positive definite. Combining with (12) and (14) leads to

\begin{align*}
P_\beta^{\tfrac{1}{2}}NM^{-1}P_\beta^{-\tfrac{1}{2}} &= P_\beta^{\tfrac{1}{2}}\begin{pmatrix}0&0\\ B^TS_Q^{-1}&I-\dfrac{1}{\omega}B^TS_Q^{-1}BQ^{-1}\end{pmatrix}P_\beta^{-\tfrac{1}{2}}\\
&= \begin{pmatrix}0&0\\ \sqrt{\beta}\,Q^{-\tfrac{1}{2}}B^TS_Q^{-1}&I-E\end{pmatrix}=:\widetilde{T},
\end{align*}

where $E=\frac{1}{\omega}Q^{-\tfrac{1}{2}}B^TS_Q^{-1}BQ^{-\tfrac{1}{2}}$. This shows that

\[
\|NM^{-1}\|_{P_\beta}=\|P_\beta^{\tfrac{1}{2}}NM^{-1}P_\beta^{-\tfrac{1}{2}}\|=\Big(\rho(\widetilde{T}\widetilde{T}^T)\Big)^{\tfrac{1}{2}}. \tag{26}
\]

By direct calculation and (11), we have

\begin{align*}
\rho\left(\widetilde{T}\widetilde{T}^T\right) &= \rho\left((I-E)(I-E^T)+\beta Q^{-\tfrac{1}{2}}B^TS_Q^{-1}S_Q^{-T}BQ^{-\tfrac{1}{2}}\right) \tag{27}\\
&= \rho\left(I-\tfrac{1}{\omega}Q^{-\tfrac{1}{2}}B^TS_Q^{-1}\Big(S_Q+S_Q^T-\tfrac{1}{\omega}BQ^{-1}B^T-\omega\beta I\Big)S_Q^{-T}BQ^{-\tfrac{1}{2}}\right)\\
&= \rho\left(I-\tfrac{1}{\omega^2}Q^{-\tfrac{1}{2}}B^TS_Q^{-1}\Big(2\omega H+BQ^{-1}B^T-\omega^2\beta I\Big)S_Q^{-T}BQ^{-\tfrac{1}{2}}\right).
\end{align*}

Note that $B$ has full column rank and $\omega>0$, so if $2\omega H+BQ^{-1}B^T-\omega^2\beta I$ is SPD, so is $\tfrac{1}{\omega^2}Q^{-\tfrac{1}{2}}B^TS_Q^{-1}\Big(2\omega H+BQ^{-1}B^T-\omega^2\beta I\Big)S_Q^{-T}BQ^{-\tfrac{1}{2}}$. Then all eigenvalues of $\widetilde{T}\widetilde{T}^T$ are less than $1$, i.e., $\rho(\widetilde{T}\widetilde{T}^T)<1$. Therefore, in order to prove $\|NM^{-1}\|_{P_\beta}<1$, we just need to find $\omega$ to guarantee that $2\omega H+BQ^{-1}B^T-\omega^2\beta I$ is positive definite. Since $H$ is positive definite on $\mathop{\mathrm{Null}}(B^T)$, (13) and $2\omega\eta>-1$ imply $2\omega H+BQ^{-1}B^T$ is positive definite. Thus, $\lambda_1>0$. The smallest eigenvalue of $2\omega H+BQ^{-1}B^T-\omega^2\beta I$ is $\lambda_1-\omega^2\beta$, which is positive when $\omega<\sqrt{\lambda_1/\beta}$; this gives the result.
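For a small problem, the norm in Lemma 1 can be evaluated directly from (26) as the spectral norm of $P_\beta^{1/2}NM^{-1}P_\beta^{-1/2}$. The sketch below does this with NumPy. It assumes the block forms $M=\begin{pmatrix}G&B\\-B^T&\omega Q\end{pmatrix}$ and $N=\begin{pmatrix}0&0\\0&\omega Q\end{pmatrix}$, which reproduce the matrix displayed in the proof but are not restated in this section; the function name `p_beta_norm_of_NMinv` and the example data are hypothetical.

```python
import numpy as np

def p_beta_norm_of_NMinv(G, B, Q, omega, beta=1.0):
    """Spectral norm of P^{1/2} N M^{-1} P^{-1/2}, as in (26)."""
    n, m = B.shape
    # Assumed block forms of M and N.
    M = np.block([[G, B], [-B.T, omega * Q]])
    N = np.block([[np.zeros((n, n)), np.zeros((n, m))],
                  [np.zeros((m, n)), omega * Q]])
    # Square roots of Q from its symmetric eigendecomposition.
    w, V = np.linalg.eigh(Q)
    Q_half = V @ np.diag(np.sqrt(w)) @ V.T
    Q_inv_half = V @ np.diag(1.0 / np.sqrt(w)) @ V.T
    # P_beta = diag(I, beta Q^{-1}) and its inverse square root.
    P_half = np.block([[np.eye(n), np.zeros((n, m))],
                       [np.zeros((m, n)), np.sqrt(beta) * Q_inv_half]])
    P_inv_half = np.block([[np.eye(n), np.zeros((n, m))],
                           [np.zeros((m, n)), Q_half / np.sqrt(beta)]])
    T_tilde = P_half @ N @ np.linalg.inv(M) @ P_inv_half
    return np.linalg.norm(T_tilde, 2)

# Illustrative data: n = 2, m = 1, Q = 1, beta = 1, omega = 0.5.
G = np.array([[2.0, 1.0], [-1.0, 2.0]])
B = np.array([[1.0], [0.0]])
Q = np.array([[1.0]])
norm_value = p_beta_norm_of_NMinv(G, B, Q, omega=0.5, beta=1.0)
delta_max = 0.5 * (1.0 - norm_value)   # largest delta allowed by Theorem 3.2
print(norm_value, delta_max)
```

For these data the only nonzero row of $\widetilde{T}$ is $(2/9,\,-1/9,\,5/9)$, so the computed norm should agree with $\sqrt{30}/9$ up to rounding.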

Remark 2.

The conditions in Lemma 1 are reasonable. Indeed, for any given $\omega_0\in(0,\,1/(-2\eta)_+)$, $2H+\frac{1}{\omega_0}BQ^{-1}B^T$ is SPD. Then when $0<\omega\le\omega_0$, we have

\[
\lambda_1\ge\lambda_{\min}\left(2\omega H+\tfrac{\omega}{\omega_0}BQ^{-1}B^T\right)=\omega\lambda_{\min}\left(2H+\tfrac{1}{\omega_0}BQ^{-1}B^T\right).
\]

Then the conditions in Lemma 1 can be replaced by

\[
0<\omega<\min\left\{\omega_0,\,\tfrac{1}{\beta}\lambda_{\min}\left(2H+\tfrac{1}{\omega_0}BQ^{-1}B^T\right)\right\}.
\]

In particular, when $H$ is positive semidefinite, $\eta\ge 0$ and $2H+BQ^{-1}B^T$ is SPD. Then we can pick $\omega_0=1$ above and the last condition can be further simplified as

\[
0<\omega<\min\left\{1,\,\tfrac{1}{\beta}\lambda_{\min}(2H+BQ^{-1}B^T)\right\}.
\]
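The bound in this remark is straightforward to evaluate for a given problem. The following sketch computes $\min\{\omega_0,\ \lambda_{\min}(2H+\tfrac{1}{\omega_0}BQ^{-1}B^T)/\beta\}$ with NumPy. It assumes that $H$ denotes the symmetric part $\tfrac12(G+G^T)$ of $G$, and the function name `omega_upper_bound` is hypothetical. The code cannot verify that $\omega_0<1/(-2\eta)_+$; it only checks that the resulting matrix is positive definite.

```python
import numpy as np

def omega_upper_bound(G, B, Q, beta=1.0, omega0=1.0):
    """Upper bound on omega from Remark 2 (exclusive)."""
    H = 0.5 * (G + G.T)                      # assumed meaning of H
    K = 2.0 * H + (B @ np.linalg.solve(Q, B.T)) / omega0
    K = 0.5 * (K + K.T)                      # remove rounding asymmetry
    lam_min = np.linalg.eigvalsh(K).min()
    if lam_min <= 0.0:
        raise ValueError("2H + B Q^{-1} B^T / omega0 is not positive definite")
    return min(omega0, lam_min / beta)

# Illustrative data: H = 2I, so H is positive semidefinite and omega0 = 1 is allowed.
G = np.array([[2.0, 1.0], [-1.0, 2.0]])
B = np.array([[1.0], [0.0]])
Q = np.array([[1.0]])
bound_beta_1 = omega_upper_bound(G, B, Q, beta=1.0)
bound_beta_8 = omega_upper_bound(G, B, Q, beta=8.0)
print(bound_beta_1, bound_beta_8)
```

Here $2H+BQ^{-1}B^T=\mathrm{diag}(5,4)$, so the bound is $\min\{1,4/\beta\}$: a larger $\beta$ tightens the restriction on $\omega$.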

Theorem 3.2.

Suppose Bn×m𝐵superscript𝑛𝑚B\in\mathds{R}^{n\times m}italic_B ∈ blackboard_R start_POSTSUPERSCRIPT italic_n × italic_m end_POSTSUPERSCRIPT has full column rank and Gn×n𝐺superscript𝑛𝑛G\in\mathds{R}^{n\times n}italic_G ∈ blackboard_R start_POSTSUPERSCRIPT italic_n × italic_n end_POSTSUPERSCRIPT is unsymmetric but positive definite on Null(BT)Nullsuperscript𝐵𝑇\mathop{\mathrm{Null}}(B^{T}\!\,)roman_Null ( italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ). For any β>0𝛽0\beta>0italic_β > 0 and SPD Qm×m𝑄superscript𝑚𝑚Q\in\mathds{R}^{m\times m}italic_Q ∈ blackboard_R start_POSTSUPERSCRIPT italic_m × italic_m end_POSTSUPERSCRIPT, let η𝜂\etaitalic_η and δ𝛿\deltaitalic_δ be defined by (13) and (23), and λ1>0subscript𝜆10\lambda_{1}>0italic_λ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT > 0 be the minimum eigenvalue of 2ωH+BQ1BT2𝜔𝐻𝐵superscript𝑄1superscript𝐵𝑇2\omega H+BQ^{-1}B^{T}2 italic_ω italic_H + italic_B italic_Q start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT. If ω𝜔\omegaitalic_ω and δ𝛿\deltaitalic_δ satisfy

\[
0<\omega<\min\left\{\frac{1}{(-2\eta)_{+}},\,\sqrt{\frac{\lambda_{1}}{\beta}}\right\}\quad\mbox{and}\quad 0\leq\delta\leq\tfrac{1}{2}\Big(1-\|NM^{-1}\|_{P_{\beta}}\Big),
\]

then $\{x_{k},y_{k}\}$ produced by Algorithm 2 converges to the unique solution of (1).

Proof 3.3.

It follows from Lemma 1 that $\|NM^{-1}\|_{P_{\beta}}<1$, so that $\|I-NM^{-1}\|_{P_{\beta}}\leq 1+\|NM^{-1}\|_{P_{\beta}}<2$. The result follows from (25) and

\begin{align*}
\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}
&\leq\|NM^{-1}\|_{P_{\beta}}+\frac{1-\|NM^{-1}\|_{P_{\beta}}}{2}\|I-NM^{-1}\|_{P_{\beta}}\\
&<\|NM^{-1}\|_{P_{\beta}}+1-\|NM^{-1}\|_{P_{\beta}}=1.
\end{align*}
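The inequality chain in the proof can be checked numerically on a small example. The sketch below is only illustrative: the matrix `T` is a random stand-in for $NM^{-1}$ (not the actual iteration matrix), `P` is a hypothetical SPD weight playing the role of $P_{\beta}$, and `p_norm` is a helper name introduced here. It uses the weighted norm $\|T\|_{P}=\|P^{1/2}TP^{-1/2}\|_{2}$.

```python
import numpy as np

def p_norm(T, P):
    """Matrix norm induced by the SPD matrix P: ||P^{1/2} T P^{-1/2}||_2."""
    w, E = np.linalg.eigh(P)                       # P = E diag(w) E^T with w > 0
    P_half = E @ np.diag(np.sqrt(w)) @ E.T
    P_half_inv = E @ np.diag(1.0 / np.sqrt(w)) @ E.T
    return np.linalg.norm(P_half @ T @ P_half_inv, 2)

rng = np.random.default_rng(1)
k = 5
W = rng.standard_normal((k, k))
P = W @ W.T + k * np.eye(k)       # hypothetical SPD weight (stand-in for P_beta)
T = rng.standard_normal((k, k))   # stand-in for N M^{-1}
T *= 0.8 / p_norm(T, P)           # rescale so that ||T||_P = 0.8 < 1

a = p_norm(T, P)
delta_max = 0.5 * (1.0 - a)       # largest delta allowed by the theorem
factor = a + delta_max * p_norm(np.eye(k) - T, P)
print(a, delta_max, factor)
```

Because $\|I-T\|_{P}\leq 1+\|T\|_{P}<2$, the computed `factor` stays below 1 for any such `T`, which is the contraction property the proof relies on.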

Remark 3.

From (25) we have $\|r_{k}\|_{P_{\beta}}\leq\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\|r_{0}\|_{P_{\beta}}$. Hence, under the conditions of Theorem 3.2, $r_{k}$ converges to zero linearly. Let $z_{*}$ be the solution of (1). Then

\begin{align*}
\|z_{k}-z_{*}\|_{P_{\beta}}
&=\|A^{-1}r_{k}\|_{P_{\beta}}
=\|P_{\beta}^{\frac{1}{2}}A^{-1}P_{\beta}^{-\frac{1}{2}}P_{\beta}^{\frac{1}{2}}r_{k}\|
\leq\|P_{\beta}^{\frac{1}{2}}A^{-1}P_{\beta}^{-\frac{1}{2}}\|\,\|P_{\beta}^{\frac{1}{2}}r_{k}\|\\
&=\|A^{-1}\|_{P_{\beta}}\|r_{k}\|_{P_{\beta}}
\leq\|A^{-1}\|_{P_{\beta}}\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\|r_{0}\|_{P_{\beta}}\\
&=\|A^{-1}\|_{P_{\beta}}\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\|A(z_{0}-z_{*})\|_{P_{\beta}}\\
&\leq\|A^{-1}\|_{P_{\beta}}\|A\|_{P_{\beta}}\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\|z_{0}-z_{*}\|_{P_{\beta}}.
\end{align*}

This implies that $z_{k}$ converges linearly to $z_{*}$ under the conditions of Theorem 3.2.
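A bound of the form $\|z_{k}-z_{*}\|\leq\kappa\,\rho^{k}\|z_{0}-z_{*}\|$ with $0<\rho<1$ gives a simple worst-case iteration count for a target relative error. The sketch below assumes hypothetical names: `rho` stands for $\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}$, `kappa` for $\|A^{-1}\|_{P_{\beta}}\|A\|_{P_{\beta}}$, and `tol` for the desired reduction factor. The numbers used are illustrative and not results from the paper.

```python
import math

def iterations_for_tolerance(rho, kappa, tol):
    """Smallest k >= 0 with kappa * rho**k <= tol, for 0 < rho < 1.

    Worst-case estimate derived from a linear-convergence bound;
    actual convergence may be faster.
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie in (0, 1)")
    if kappa <= 0.0 or tol <= 0.0:
        raise ValueError("kappa and tol must be positive")
    k = math.ceil(math.log(tol / kappa) / math.log(rho))
    return max(0, k)

k_small = iterations_for_tolerance(rho=0.5, kappa=1.0, tol=1e-3)
k_large = iterations_for_tolerance(rho=0.9, kappa=100.0, tol=1e-6)
print(k_small, k_large)
```

The estimate grows only logarithmically in `kappa`, but it is sensitive to `rho` near 1.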

Remark 4.

If $\beta=\delta$ in Theorem 3.2, since $\omega>0$ and $\delta\geq 0$, we know that $\omega<\sqrt{\lambda_{1}/\delta}$ holds if and only if $\delta<\lambda_{1}/\omega^{2}$. Then the restrictions on $\omega$ and $\delta$ in Theorem 3.2 can be replaced by

\[
0<\omega<\frac{1}{(-2\eta)_{+}}\quad\mbox{and}\quad 0\leq\delta<\min\left\{\frac{\lambda_{1}}{\omega^{2}},\,\dfrac{1-\|NM^{-1}\|_{P_{\delta}}}{2}\right\}.
\]

It follows from (26) and (27) that

\begin{align}
\|NM^{-1}\|_{P_{\delta}}^{2}
&=\rho(\widetilde{T}\widetilde{T}^{T})
=\rho\Big(I-\tfrac{1}{\omega^{2}}Q^{-\frac{1}{2}}B^{T}S_{Q}^{-1}(2\omega H+BQ^{-1}B^{T}-\delta\omega^{2}I)S_{Q}^{-T}BQ^{-\frac{1}{2}}\Big)\notag\\
&=\rho\left(I-\tfrac{1}{\omega^{2}}Q^{-\frac{1}{2}}B^{T}S_{Q}^{-1}(2\omega H+BQ^{-1}B^{T})S_{Q}^{-T}BQ^{-\frac{1}{2}}+\delta Q^{-\frac{1}{2}}B^{T}S_{Q}^{-1}S_{Q}^{-T}BQ^{-\frac{1}{2}}\right).\tag{28}
\end{align}

Note that $\widetilde{T}\widetilde{T}^{T}$ is symmetric positive semidefinite and $Q^{-\frac{1}{2}}B^{T}S_{Q}^{-1}S_{Q}^{-T}BQ^{-\frac{1}{2}}$ is SPD. Hence $\|NM^{-1}\|_{P_{\delta}}$ increases with $\delta$, and

\[
\lim_{\delta\rightarrow\lambda_{1}/\omega^{2}}\|NM^{-1}\|_{P_{\delta}}=1,\qquad\lim_{\delta\rightarrow 0^{+}}\|NM^{-1}\|_{P_{\delta}}=\sqrt{1-\tilde{\lambda}_{1}/\omega^{2}}<1,
\]

where $\tilde{\lambda}_{1}>0$ is the minimum eigenvalue of $Q^{-\frac{1}{2}}B^{T}S_{Q}^{-1}(2\omega H+BQ^{-1}B^{T})S_{Q}^{-T}BQ^{-\frac{1}{2}}$. Then there exists $\delta>0$ such that $\|NM^{-1}\|_{P_{\delta}}<1$. Therefore, for any given $0<\omega<1/(-2\eta)_{+}$, Algorithm 2 is convergent for sufficiently small $\delta$. Moreover, the larger $\omega$ is, the smaller $\delta$ should be. Accordingly, a practical selection of $\delta$ could be a sequence $\{\delta_{k}\}$ such that $\delta_{k}\rightarrow 0$ as $k\rightarrow\infty$.
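The remark only asks for a sequence $\{\delta_{k}\}$ tending to zero and does not prescribe a rule. One hypothetical choice, used here purely for illustration, is geometric decay $\delta_{k}=\delta_{0}\theta^{k}$ with $0<\theta<1$. The names `delta_sequence`, `delta0` and `theta` are assumptions of this sketch, not notation from the paper.

```python
def delta_sequence(delta0, theta, num_steps):
    """Return [delta0 * theta**k for k = 0, ..., num_steps - 1].

    Illustrative geometric schedule for the inner-solve tolerance;
    it is nonincreasing and tends to 0 as k grows.
    """
    if delta0 < 0.0:
        raise ValueError("delta0 must be nonnegative")
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie in (0, 1)")
    return [delta0 * theta**k for k in range(num_steps)]

deltas = delta_sequence(delta0=0.1, theta=0.5, num_steps=5)
print(deltas)
```

Other decreasing schedules, such as $\delta_{k}=\delta_{0}/(k+1)$, would satisfy the same requirement. Which one works best in practice is not settled by the analysis above.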

Remark 5.

When $G$ is positive semidefinite, (13) yields $\eta\geq 0$, so that $(-2\eta)_{+}=0$. In this case, the sufficient conditions in Theorem 3.2 can be replaced by $0<\omega<\sqrt{\lambda_{1}/\beta}$ and $0\leq\delta\leq\tfrac{1}{2}\left(1-\|NM^{-1}\|_{P_{\beta}}\right)$. Furthermore, from Remark 4 we know that the restrictions can also be replaced by $\omega>0$ and $0\leq\delta<\min\left\{\tfrac{\lambda_{1}}{\omega^{2}},\,\tfrac{1-\|NM^{-1}\|_{P_{\delta}}}{2}\right\}$. This implies that when $G$ is positive semidefinite, for any $\omega>0$, Algorithm 2 is convergent for sufficiently small $\delta$.

3.2 Convergence analysis when $B$ is rank-deficient

Assume that the rank of $B$ is $s$ and $0<s<m$. Let $B=U\begin{pmatrix}\Sigma&0\end{pmatrix}V^{T}$ be the singular value decomposition (SVD), where $U\in\mathds{R}^{n\times n}$ and $V\in\mathds{R}^{m\times m}$ are orthogonal matrices, $\Sigma=\begin{pmatrix}\Sigma_{s}\\ 0\end{pmatrix}\in\mathds{R}^{n\times s}$ has full column rank, and $\Sigma_{s}={\rm diag}\{\sigma_{1},\sigma_{2},\ldots,\sigma_{s}\}$ with all $\sigma_{j}>0$ contains the nonzero singular values of $B$. Let $Q_{1}\in\mathds{R}^{s\times s}$ and $Q_{2}\in\mathds{R}^{(m-s)\times(m-s)}$ be SPD, and

\begin{align}
Q&=V\begin{pmatrix}Q_{1}&0\\ 0&Q_{2}\end{pmatrix}V^{T},
&\widetilde{D}&=\begin{pmatrix}U&0\\ 0&V\end{pmatrix},
&\widetilde{P}_{\beta}&=\begin{pmatrix}I&0\\ 0&\beta Q_{1}^{-1}\end{pmatrix},\tag{29}\\
\widetilde{A}&=\begin{pmatrix}U^{T}GU&\Sigma\\ -\Sigma^{T}&0\end{pmatrix},
&\widetilde{M}&=\begin{pmatrix}U^{T}GU&\Sigma\\ -\Sigma^{T}&\omega Q_{1}\end{pmatrix},
&\widetilde{N}&=\begin{pmatrix}0&0\\ 0&\omega Q_{1}\end{pmatrix}.\tag{30}
\end{align}
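The definitions in (29) and (30) can be checked numerically on a small rank-deficient example. The sketch below is an illustration under assumed data: the sizes `n`, `m`, `s`, the random matrices `G`, `B`, `Q1`, `Q2`, and the value of `omega` are all hypothetical. It builds $A=\begin{pmatrix}G&B\\ -B^{T}&0\end{pmatrix}$ and $M=\begin{pmatrix}G&B\\ -B^{T}&\omega Q\end{pmatrix}$, applies the congruence with $\widetilde{D}$, and compares the leading $(n+s)\times(n+s)$ blocks with $\widetilde{A}$ and $\widetilde{M}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, s = 6, 4, 2                  # hypothetical sizes with rank(B) = s < m
omega = 0.5                        # hypothetical parameter value

B = rng.standard_normal((n, s)) @ rng.standard_normal((s, m))  # rank-s matrix
G = rng.standard_normal((n, n))                                # generic unsymmetric block

U, sing, Vt = np.linalg.svd(B)     # full SVD: U is n-by-n, Vt is m-by-m
V = Vt.T
Sigma = np.zeros((n, s))
Sigma[:s, :s] = np.diag(sing[:s])  # Sigma = (Sigma_s; 0), n-by-s

def random_spd(k):
    """Return a random k-by-k SPD matrix."""
    W = rng.standard_normal((k, k))
    return W @ W.T + k * np.eye(k)

Q1, Q2 = random_spd(s), random_spd(m - s)
Z = np.zeros
Q = V @ np.block([[Q1, Z((s, m - s))], [Z((m - s, s)), Q2]]) @ V.T

A = np.block([[G, B], [-B.T, Z((m, m))]])
M = np.block([[G, B], [-B.T, omega * Q]])
D = np.block([[U, Z((n, m))], [Z((m, n)), V]])

A_t = np.block([[U.T @ G @ U, Sigma], [-Sigma.T, Z((s, s))]])
M_t = np.block([[U.T @ G @ U, Sigma], [-Sigma.T, omega * Q1]])

DAD = D.T @ A @ D
DMD = D.T @ M @ D
k = n + s
print(np.allclose(DAD[:k, :k], A_t), np.allclose(DMD[:k, :k], M_t))
```

In this example the trailing $(m-s)\times(m-s)$ block of the transformed $A$ is zero and that of the transformed $M$ equals $\omega Q_{2}$, up to rounding, with vanishing off-diagonal coupling.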

Let $\widetilde{r}_{k}=\widetilde{D}^{T}r_{k}=\begin{pmatrix}\widetilde{r}_{k}^{a},\,\widetilde{r}_{k}^{b}\end{pmatrix}$, $\widetilde{\Psi}(r_{k})=\widetilde{D}^{T}\Psi(r_{k})=\begin{pmatrix}\widetilde{\Psi}^{a}(r_{k}),\,\widetilde{\Psi}^{b}(r_{k})\end{pmatrix}$ with $\widetilde{r}_{k}^{a},\,\widetilde{\Psi}^{a}(r_{k})\in\mathds{R}^{n+s}$. It follows from (4), (12), (29) and (30) that

\begin{align}
\widetilde{D}^{T}A\widetilde{D}
&=\begin{pmatrix}U^{T}&0\\ 0&V^{T}\end{pmatrix}\begin{pmatrix}G&B\\ -B^{T}&0\end{pmatrix}\begin{pmatrix}U&0\\ 0&V\end{pmatrix}
=\begin{pmatrix}U^{T}GU&U^{T}BV\\ -V^{T}B^{T}U&0\end{pmatrix}\notag\\
&=\begin{pmatrix}U^{T}GU&\Sigma&0\\ -\Sigma^{T}&0&0\\ 0&0&0\end{pmatrix}
=:\begin{pmatrix}\widetilde{A}&0\\ 0&0\end{pmatrix},\tag{31}\\
\widetilde{D}^{T}M\widetilde{D}
&=\begin{pmatrix}U^{T}&0\\ 0&V^{T}\end{pmatrix}\begin{pmatrix}G&B\\ -B^{T}&\omega Q\end{pmatrix}\begin{pmatrix}U&0\\ 0&V\end{pmatrix}
=\begin{pmatrix}U^{T}GU&U^{T}BV\\ -V^{T}B^{T}U&\omega V^{T}QV\end{pmatrix}\notag\\
&=\begin{pmatrix}U^{T}GU&\Sigma&0\\ -\Sigma^{T}&\omega Q_{1}&0\\ 0&0&\omega Q_{2}\end{pmatrix}
=:\begin{pmatrix}\widetilde{M}&0\\ 0&\omega Q_{2}\end{pmatrix},\tag{32}\\
\widetilde{D}^{T}N\widetilde{D}
&=\begin{pmatrix}U^{T}&0\\ 0&V^{T}\end{pmatrix}\begin{pmatrix}0&0\\ 0&\omega Q\end{pmatrix}\begin{pmatrix}U&0\\ 0&V\end{pmatrix}
=\begin{pmatrix}0&0\\ 0&\omega V^{T}QV\end{pmatrix}\notag
\end{align}
(33) =(0000ωQ1000ωQ2)=:(N~00ωQ2).\displaystyle=\begin{pmatrix}0&0&0\\ 0&\omega Q_{1}&0\\ 0&0&\omega Q_{2}\end{pmatrix}=:\begin{pmatrix}\widetilde{N}&0\\ 0&\omega Q_{2}\end{pmatrix}.= ( start_ARG start_ROW start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL italic_ω italic_Q start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL italic_ω italic_Q start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_CELL end_ROW end_ARG ) = : ( start_ARG start_ROW start_CELL over~ start_ARG italic_N end_ARG end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL italic_ω italic_Q start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_CELL end_ROW end_ARG ) .

With this notation, we have the following results.

Lemma 6.

Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient with rank $s$. If (1) is solvable, then $\widetilde{r}_{k}^{b}=0$ for all $k\geq 1$.

Proof 3.4.

Let $z_{*}$ be a solution of (1), and let $\widetilde{z}_{*}=\widetilde{D}^{T}\!z_{*}=\begin{pmatrix}\widetilde{z}_{*}^{a},\,\widetilde{z}_{*}^{b}\end{pmatrix}$, $\widetilde{z}_{k}=\widetilde{D}^{T}\!z_{k}=\begin{pmatrix}\widetilde{z}_{k}^{a},\,\widetilde{z}_{k}^{b}\end{pmatrix}$, and $\widetilde{\ell}=\widetilde{D}^{T}\!\ell=\begin{pmatrix}\widetilde{\ell}^{a},\,\widetilde{\ell}^{b}\end{pmatrix}$, where $\widetilde{z}_{*}^{a},\,\widetilde{z}_{k}^{a},\,\widetilde{\ell}^{a}\in\mathds{R}^{n+s}$. It follows from $Az_{*}=\ell$ and (31) that

\[
\widetilde{D}^{T}\!A\widetilde{D}\widetilde{z}_{*}=\begin{pmatrix}\widetilde{A}&0\\ 0&0\end{pmatrix}\begin{pmatrix}\widetilde{z}_{*}^{a}\\ \widetilde{z}_{*}^{b}\end{pmatrix}=\begin{pmatrix}\widetilde{A}\widetilde{z}_{*}^{a}\\ 0\end{pmatrix}=\begin{pmatrix}\widetilde{\ell}^{a}\\ \widetilde{\ell}^{b}\end{pmatrix},
\]

which shows that $\widetilde{\ell}^{b}=0$. Then we have

\[
\widetilde{r}_{k}=\widetilde{D}^{T}\!r_{k}=\widetilde{D}^{T}\!(Az_{k}-\ell)=\widetilde{D}^{T}\!A\widetilde{D}\widetilde{D}^{T}\!z_{k}-\widetilde{D}^{T}\!\ell=\begin{pmatrix}\widetilde{A}\widetilde{z}_{k}^{a}-\widetilde{\ell}^{a}\\ -\widetilde{\ell}^{b}\end{pmatrix}=\begin{pmatrix}\widetilde{r}_{k}^{a}\\ 0\end{pmatrix}.
\]
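
To make Lemma 6 concrete, consider a small illustrative instance. The sizes and the choice of $B$ below are assumptions made here for illustration only, and we use the block form of $A$ and the SVD convention $U^{T}\!BV=(\Sigma\;\;0)$ implied by (31). Take $n=m=2$ and $s=1$ with $B=\mathrm{diag}(1,0)$ and $U=V=I_{2}$, so that $\Sigma=(1,\,0)^{T}$ and $\widetilde{D}=I_{4}$. For an arbitrary $G=(g_{ij})$,
\[
A=\begin{pmatrix}G&B\\ -B^{T}&0\end{pmatrix}
=\begin{pmatrix}g_{11}&g_{12}&1&0\\ g_{21}&g_{22}&0&0\\ -1&0&0&0\\ 0&0&0&0\end{pmatrix},
\qquad
\ell=\begin{pmatrix}\ell_{1}\\ \ell_{2}\\ \ell_{3}\\ \ell_{4}\end{pmatrix}.
\]
The last row of $A$ vanishes, so $Az=\ell$ can be solvable only if $\ell_{4}=0$, that is, $\widetilde{\ell}^{b}=0$. For any iterate $z_{k}$, the last entry of $r_{k}=Az_{k}-\ell$ is then $0-\ell_{4}=0$, which is $\widetilde{r}_{k}^{b}=0$ in this instance.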

Lemma 7.

Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient with rank $s$. For any $\omega,\,\beta>0$ and SPD $Q_{1}\in\mathds{R}^{(n+s)\times(n+s)}$ and $Q_{2}\in\mathds{R}^{(m-s)\times(m-s)}$, let $Q$ and $\delta$ be defined by (29) and (23). Then $\|\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\|_{\widetilde{P}_{\beta}}\leq\delta\|\widetilde{r}^{a}_{k}\|_{\widetilde{P}_{\beta}}$.

Proof 3.5.

For any $x\in\mathds{R}^{n+m}$ and $\widetilde{x}=\widetilde{D}^{T}x=\begin{pmatrix}\widetilde{x}^{a},\,\widetilde{x}^{b}\end{pmatrix}$ with $\widetilde{x}^{a}\in\mathds{R}^{n+s}$, since $\widetilde{D}$ is an orthogonal matrix, from (29) and the definition of $P_{\beta}$ in Section 3, we have

\begin{align}
\|x\|_{P_{\beta}}^{2}
&=x^{T}P_{\beta}x=x^{T}\widetilde{D}\widetilde{D}^{T}P_{\beta}\widetilde{D}\widetilde{D}^{T}x=\begin{pmatrix}(\widetilde{x}^{a})^{T}&(\widetilde{x}^{b})^{T}\end{pmatrix}\begin{pmatrix}\widetilde{P}_{\beta}&0\\ 0&\beta Q_{2}^{-1}\end{pmatrix}\begin{pmatrix}\widetilde{x}^{a}\\ \widetilde{x}^{b}\end{pmatrix} \notag\\
&=\|\widetilde{x}^{a}\|_{\widetilde{P}_{\beta}}^{2}+\|\widetilde{x}^{b}\|_{\beta Q_{2}^{-1}}^{2}. \tag{34}
\end{align}

Note that (32) and Lemma 6 give

\[
\widetilde{D}^{T}\left(r_{k}-M\Psi(r_{k})\right)=\widetilde{r}_{k}-\widetilde{D}^{T}M\widetilde{D}\widetilde{\Psi}(r_{k})=\begin{pmatrix}\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\\ -\omega Q_{2}\widetilde{\Psi}^{b}(r_{k})\end{pmatrix}.
\]

This, together with (34), leads to

\begin{align*}
\|r_{k}-M\Psi(r_{k})\|_{P_{\beta}}^{2}
&=\|\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\|_{\widetilde{P}_{\beta}}^{2}+\|\omega Q_{2}\widetilde{\Psi}^{b}(r_{k})\|_{\beta Q_{2}^{-1}}^{2}\\
&=\|\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\|_{\widetilde{P}_{\beta}}^{2}+\omega^{2}\beta\|\widetilde{\Psi}^{b}(r_{k})\|_{Q_{2}}^{2}.
\end{align*}

Using (23), (34) and $\widetilde{r}^{b}_{k}=0$ yields

\[
\|\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\|_{\widetilde{P}_{\beta}}\leq\|r_{k}-M\Psi(r_{k})\|_{P_{\beta}}\leq\delta\|r_{k}\|_{P_{\beta}}=\delta\|\widetilde{r}^{a}_{k}\|_{\widetilde{P}_{\beta}}.
\]
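
For readers who wish to verify the weighted-norm steps in this proof, the following worked computation may help. It is only a sketch and assumes the convention $\|v\|_{W}^{2}=v^{T}Wv$ for SPD $W$, as used for $P_{\beta}$ above; the shorthand $y:=\widetilde{\Psi}^{b}(r_{k})$ is introduced here for brevity. Since $Q_{2}$ is symmetric,
\begin{align*}
\|\omega Q_{2}y\|_{\beta Q_{2}^{-1}}^{2}
&=(\omega Q_{2}y)^{T}\big(\beta Q_{2}^{-1}\big)(\omega Q_{2}y)
=\omega^{2}\beta\,y^{T}Q_{2}Q_{2}^{-1}Q_{2}y
=\omega^{2}\beta\,y^{T}Q_{2}y
=\omega^{2}\beta\,\|y\|_{Q_{2}}^{2}.
\end{align*}
Because this term is nonnegative, dropping it gives the first inequality in the final chain. The last equality in that chain follows from (34) applied to $x=r_{k}$ together with Lemma 6:
\[
\|r_{k}\|_{P_{\beta}}^{2}=\|\widetilde{r}_{k}^{a}\|_{\widetilde{P}_{\beta}}^{2}+\|\widetilde{r}_{k}^{b}\|_{\beta Q_{2}^{-1}}^{2}=\|\widetilde{r}_{k}^{a}\|_{\widetilde{P}_{\beta}}^{2}.
\]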

We are now ready to establish the convergence theorem for Algorithm 2 when $B$ is rank-deficient.

Theorem 3.6.

Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient with rank $s$ and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T})$. For any $\beta>0$ and SPD $Q_{1}\in\mathds{R}^{(n+s)\times(n+s)}$ and $Q_{2}\in\mathds{R}^{(m-s)\times(m-s)}$, let $Q$, $\eta$ and $\delta$ be defined by (29), (13) and (23), and $\lambda_{1}$ be the minimum eigenvalue of $2\omega H+BQ^{-1}B^{T}$. If $\omega$ and $\delta$ satisfy

\[
0<\omega<\min\left\{\frac{1}{(-2\eta)_{+}},\,\sqrt{\frac{\lambda_{1}}{\beta}}\right\}\quad\mbox{and}\quad 0\leq\delta\leq\tfrac{1}{2}\Big(1-\|\widetilde{N}\widetilde{M}^{-1}\|_{\widetilde{P}_{\beta}}\Big),
\]

then $\{x_{k},y_{k}\}$ produced by Algorithm 2 converges to a solution of the singular saddle-point system (1).

Proof 3.7.

By Lemma 6, we just need to prove $\lim\limits_{k\rightarrow\infty}\widetilde{r}_{k}^{a}=0$. Since $\widetilde{D}$ is an orthogonal matrix, it follows from (24), (29), (32) and (33) that

\begin{align*}
\begin{pmatrix}\widetilde{r}_{k+1}^{a}\\ \widetilde{r}_{k+1}^{b}\end{pmatrix}
&=\widetilde{r}_{k+1}=\widetilde{D}^{T}\!r_{k+1}=\widetilde{D}^{T}\!\left[NM^{-1}r_{k}+(I-NM^{-1})(r_{k}-M\Psi(r_{k}))\right]\\
&=\widetilde{D}^{T}\!N\widetilde{D}(\widetilde{D}^{T}\!M\widetilde{D})^{-1}\widetilde{D}^{T}\!r_{k}+\left[I-\widetilde{D}^{T}\!N\widetilde{D}(\widetilde{D}^{T}\!M\widetilde{D})^{-1}\right]\left(\widetilde{D}^{T}\!r_{k}-\widetilde{D}^{T}\!M\widetilde{D}\widetilde{D}^{T}\!\Psi(r_{k})\right)\\
&=\begin{pmatrix}\widetilde{N}\widetilde{M}^{-1}&0\\ 0&I\end{pmatrix}\begin{pmatrix}\widetilde{r}_{k}^{a}\\ \widetilde{r}_{k}^{b}\end{pmatrix}+\left[I-\begin{pmatrix}\widetilde{N}\widetilde{M}^{-1}&0\\ 0&I\end{pmatrix}\right]\left[\begin{pmatrix}\widetilde{r}_{k}^{a}\\ \widetilde{r}_{k}^{b}\end{pmatrix}-\begin{pmatrix}\widetilde{M}&0\\ 0&\omega Q_{2}\end{pmatrix}\begin{pmatrix}\widetilde{\Psi}^{a}(r_{k})\\ \widetilde{\Psi}^{b}(r_{k})\end{pmatrix}\right]\\
&=\begin{pmatrix}\widetilde{N}\widetilde{M}^{-1}\widetilde{r}_{k}^{a}+(I-\widetilde{N}\widetilde{M}^{-1})(\widetilde{r}_{k}^{a}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k}))\\ \widetilde{r}_{k}^{b}\end{pmatrix}.
\end{align*}
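
The passage from the second to the third line can be checked blockwise. As a sketch, assume that $\widetilde{M}$ is nonsingular (as the appearance of $\widetilde{M}^{-1}$ in the theorem presupposes) and note that $\omega Q_{2}$ is invertible because $Q_{2}$ is SPD and $\omega>0$. Then (32) and (33) give
\begin{align*}
\widetilde{D}^{T}\!N\widetilde{D}\,(\widetilde{D}^{T}\!M\widetilde{D})^{-1}
&=\begin{pmatrix}\widetilde{N}&0\\ 0&\omega Q_{2}\end{pmatrix}\begin{pmatrix}\widetilde{M}^{-1}&0\\ 0&(\omega Q_{2})^{-1}\end{pmatrix}
=\begin{pmatrix}\widetilde{N}\widetilde{M}^{-1}&0\\ 0&I\end{pmatrix},\\
I-\widetilde{D}^{T}\!N\widetilde{D}\,(\widetilde{D}^{T}\!M\widetilde{D})^{-1}
&=\begin{pmatrix}I-\widetilde{N}\widetilde{M}^{-1}&0\\ 0&0\end{pmatrix}.
\end{align*}
The zero in the trailing diagonal block annihilates the second block row of the correction term, which is why the second component in the last line is simply $\widetilde{r}_{k}^{b}$.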

Thus, $\widetilde{r}_{k+1}^{a}=\widetilde{N}\widetilde{M}^{-1}\widetilde{r}_{k}^{a}+(I-\widetilde{N}\widetilde{M}^{-1})(\widetilde{r}_{k}^{a}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k}))$. Using (24), (31), (32), (33) and Lemma 7, we know that $\widetilde{r}_{k}^{a}$ is the $k$-th residual of Algorithm 2 applied to the saddle-point problem $\widetilde{A}\widetilde{z}=\widetilde{\ell}$.

Note that $x\in\mathop{\mathrm{Null}}(\Sigma^{T})$ if and only if $Ux\in\mathop{\mathrm{Null}}(B^{T})$, and $U^{T}GU$ is positive definite on $\mathop{\mathrm{Null}}(\Sigma^{T})$. With (13), (29), and the SVD of $B$, we have

\begin{align*}
\inf_{x\notin\mathop{\mathrm{Null}}(\Sigma^{T})}\dfrac{x^{T}U^{T}HUx}{x^{T}\Sigma Q_{1}^{-1}\Sigma^{T}x}
&\xlongequal{\hat{x}=Ux}\inf_{\hat{x}\notin\mathop{\mathrm{Null}}(B^{T})}\dfrac{\hat{x}^{T}H\hat{x}}{\hat{x}^{T}U\begin{pmatrix}\Sigma&0\end{pmatrix}V^{T}Q^{-1}V\begin{pmatrix}\Sigma^{T}\\ 0\end{pmatrix}U^{T}\hat{x}}\\
&=\inf_{\hat{x}\notin\mathop{\mathrm{Null}}(B^{T})}\dfrac{\hat{x}^{T}H\hat{x}}{\hat{x}^{T}BQ^{-1}B^{T}\hat{x}}=\eta.
\end{align*}

Since $\omega(U^{T}GU+U^{T}G^{T}U)+\Sigma Q_{1}^{-1}\Sigma^{T}$ is similar to $2\omega H+U\Sigma Q_{1}^{-1}\Sigma^{T}U^{T}=2\omega H+BQ^{-1}B^{T}$ and $\Sigma$ has full rank, Lemma 1 and Theorem 3.2 imply $\|\widetilde{N}\widetilde{M}^{-1}\|_{\widetilde{P}_{\beta}}<1$, and hence $\widetilde{r}_{k}^{a}$ converges to zero as $k\rightarrow\infty$.
Combining this with $\widetilde{r}_{k}^{b}=0$ completes the proof.

Similar to Remarks 4 and 5, when $B$ is rank-deficient, for any given $0<\omega<1/(-2\eta)_{+}$, Algorithm 2 is still convergent for sufficiently small $\delta\geq 0$. Furthermore, when $G$ is positive semidefinite, Algorithm 2 is convergent for any $\omega>0$ and sufficiently small $\delta\geq 0$.

3.3 Augmented Lagrangian BB algorithm

Gradient-type iterative methods for the unconstrained optimization problem $\min\limits_{z\in\mathds{R}^{\hat{n}}}\hat{f}(z)$ have the form

\[
z_{k+1}=z_{k}-\alpha_{k}g_{k}, \tag{35}
\]

where $\hat{f}:\mathds{R}^{\hat{n}}\rightarrow\mathds{R}$ is a sufficiently smooth function, $g_{k}=\nabla\hat{f}(z_{k})$ is the gradient, and $\alpha_{k}>0$ is a stepsize. Methods of this type differ in their stepsize rules. In 1988, Barzilai and Borwein [5] proposed two choices of $\alpha_{k}$, usually referred to as the BB method:

\[
\alpha_{k}^{\rm BB1}=\frac{s_{k-1}^{T}s_{k-1}}{s_{k-1}^{T}d_{k-1}}\quad\textrm{and}\quad\alpha_{k}^{\rm BB2}=\frac{s_{k-1}^{T}d_{k-1}}{d_{k-1}^{T}d_{k-1}}, \tag{36}
\]

where $s_{k-1}=z_{k}-z_{k-1}$ and $d_{k-1}=g_{k}-g_{k-1}$. The rationale behind these choices is related to viewing the gradient-type methods as quasi-Newton methods, where $\alpha_{k}$ in (35) is replaced by $D_{k}=\alpha_{k}I$. This matrix serves as an approximate inverse Hessian. Following the quasi-Newton approach, the stepsize is calculated by forcing either $D_{k}^{-1}$ (BB1 method) or $D_{k}$ (BB2 method) to satisfy the secant equation in the least squares sense.
The corresponding problems are $\min\limits_{D=\alpha I}\|D^{-1}s_{k-1}-d_{k-1}\|$ and $\min\limits_{D=\alpha I}\|s_{k-1}-Dd_{k-1}\|$.
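As a short worked step, the two least-squares problems above can be solved in closed form to recover (36). The sketch below assumes the Euclidean norm and $s_{k-1}^{T}d_{k-1}\neq 0$; the auxiliary variable $\beta=1/\alpha$ is introduced here only for illustration.

```latex
% BB1: with D^{-1} = \beta I, minimize over \beta and invert.
% BB2: with D = \alpha I, minimize over \alpha directly.
\begin{align*}
\frac{d}{d\beta}\,\|\beta s_{k-1}-d_{k-1}\|^{2}
  &= 2\beta\,s_{k-1}^{T}s_{k-1}-2\,s_{k-1}^{T}d_{k-1}=0
  &&\Longrightarrow\quad
  \alpha_{k}^{\rm BB1}=\frac{1}{\beta}=\frac{s_{k-1}^{T}s_{k-1}}{s_{k-1}^{T}d_{k-1}},\\
\frac{d}{d\alpha}\,\|s_{k-1}-\alpha d_{k-1}\|^{2}
  &= 2\alpha\,d_{k-1}^{T}d_{k-1}-2\,s_{k-1}^{T}d_{k-1}=0
  &&\Longrightarrow\quad
  \alpha_{k}^{\rm BB2}=\frac{s_{k-1}^{T}d_{k-1}}{d_{k-1}^{T}d_{k-1}}.
\end{align*}
```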

When $\hat{f}(z)$ is a convex quadratic, i.e., $\hat{f}(z)=\tfrac{1}{2}z^{T}\hat{A}z-\hat{\ell}^{T}z$ with $\hat{A}$ SPD, minimizing $\hat{f}$ is equivalent to solving $\hat{A}z=\hat{\ell}$. In this case, $g_{k}=\hat{A}z_{k}-\hat{\ell}=r_{k}$,

\[
s_{k-1}=-\alpha_{k-1}r_{k-1}\quad\mbox{and}\quad d_{k-1}=r_{k}-r_{k-1}=\hat{A}s_{k-1}=-\alpha_{k-1}\hat{A}r_{k-1}. \tag{37}
\]

Then the two BB stepsizes (36) can be reformulated as

\[
\alpha_{k}^{\rm BB1}=\frac{r_{k-1}^{T}r_{k-1}}{r_{k-1}^{T}\hat{A}r_{k-1}}\quad\textrm{and}\quad\alpha_{k}^{\rm BB2}=\frac{r_{k-1}^{T}\hat{A}r_{k-1}}{r_{k-1}^{T}\hat{A}^{T}\hat{A}r_{k-1}}.
\]
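The equivalence between the secant form (36) and the residual form of the two stepsizes can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names and the small random SPD test matrix are hypothetical and chosen only for illustration.

```python
import numpy as np

def bb_stepsizes_secant(s, d):
    """BB1 and BB2 stepsizes from the pair (s_{k-1}, d_{k-1}) as in (36)."""
    return (s @ s) / (s @ d), (s @ d) / (d @ d)

def bb_stepsizes_residual(A, r_prev):
    """BB1 and BB2 stepsizes written with the previous residual r_{k-1}."""
    Ar = A @ r_prev
    return (r_prev @ r_prev) / (r_prev @ Ar), (r_prev @ Ar) / (Ar @ Ar)

# Illustrative data (hypothetical): a small SPD matrix and a residual.
rng = np.random.default_rng(0)
C = rng.standard_normal((5, 5))
A = C @ C.T + np.eye(5)           # SPD by construction
r_prev = rng.standard_normal(5)   # plays the role of r_{k-1}
alpha_prev = 0.3                  # any nonzero previous stepsize

s = -alpha_prev * r_prev          # s_{k-1} from (37)
d = A @ s                         # d_{k-1} from (37)

print(bb_stepsizes_secant(s, d))
print(bb_stepsizes_residual(A, r_prev))
```

The previous stepsize cancels in both quotients, which is why the two printed pairs agree up to rounding.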

Global convergence of the BB method for minimizing quadratic forms was established by Raydan [39], and its R-linear convergence rate was established by Dai and Liao [17]. For general strongly convex functions with Lipschitz gradient, the local convergence of the BB method with R-linear rate was rigorously proved by Dai et al. [19]. Extensive numerical experiments show that the BB method can solve unconstrained optimization problems efficiently and is considerably superior to the steepest descent method [12, 40]. A variety of modifications and extensions of the BB method have been developed for optimization.

Several researchers have used the BB method to solve UPD linear systems. Dai et al. [18] gave an analysis of the BB1 method for two-by-two unsymmetric linear systems. Under mild conditions, they showed that the convergence rate of the BB1 method is $Q$-superlinear if the matrix has a double eigenvalue, but only $R$-superlinear if the matrix has two different real eigenvalues. We find that the BB1 method can diverge when applied to UPD linear systems. Indeed, consider

\[
\hat{A}z:=\begin{pmatrix}1&2\\ -2&1\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}=\begin{pmatrix}0\\ 0\end{pmatrix}.
\]

Note that $\hat{A}$ has two complex eigenvalues $1\pm 2{\rm i}$, so the conditions in [18] do not hold. It follows from (36) and (37) that $\alpha_{k}^{\rm BB1}=(s_{k-1}^{T}s_{k-1})/(s_{k-1}^{T}\hat{A}s_{k-1})=1$. Then, one BB1 iteration gives

\[
z_{k+1}=z_{k}-r_{k}=\begin{pmatrix}x_{k}\\ y_{k}\end{pmatrix}-\begin{pmatrix}x_{k}+2y_{k}\\ -2x_{k}+y_{k}\end{pmatrix}=\begin{pmatrix}-2y_{k}\\ 2x_{k}\end{pmatrix}.
\]

This leads to $\|z_{k+1}\|^{2}=4y_{k}^{2}+4x_{k}^{2}=4\|z_{k}\|^{2}$, which means that the sequence $\{z_{k}\}$ of the BB1 iterations diverges for any initial $z_{0}\neq 0$.
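The divergence of BB1 on this two-by-two example can be reproduced directly. The sketch below assumes NumPy; the function name `bb1_iterates` and the starting points are hypothetical. It records the BB1 stepsizes and the ratios $\|z_{k+1}\|^{2}/\|z_{k}\|^{2}$ obtained from the iterate formula $z_{k+1}=(-2y_{k},\,2x_{k})$.

```python
import numpy as np

# The 2x2 UPD matrix from the example; the right-hand side is zero.
A = np.array([[1.0, 2.0], [-2.0, 1.0]])

def bb1_iterates(z_prev, z0, steps):
    """Run BB1 on A z = 0 and return the iterates z_0, z_1, ... and stepsizes."""
    zs = [np.asarray(z_prev, dtype=float), np.asarray(z0, dtype=float)]
    alphas = []
    for _ in range(steps):
        s = zs[-1] - zs[-2]           # s_{k-1} = z_k - z_{k-1}
        d = A @ s                     # d_{k-1} = A s_{k-1}
        alpha = (s @ s) / (s @ d)     # BB1 stepsize from (36)
        r = A @ zs[-1]                # residual r_k = A z_k
        zs.append(zs[-1] - alpha * r)
        alphas.append(alpha)
    return zs[1:], alphas

# Hypothetical starting points z_{-1} and z_0.
zs, alphas = bb1_iterates([1.0, 0.0], [0.0, 1.0], steps=5)
ratios = [(zs[k + 1] @ zs[k + 1]) / (zs[k] @ zs[k]) for k in range(len(zs) - 1)]
print(alphas)   # BB1 stepsizes
print(ratios)   # growth of the squared norm per iteration
```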

For quadratic programming with $\hat{A}$ unsymmetric, the minimal gradient method [31, 32, 42] uses the stepsize $\alpha_{k}^{\rm MG}=(r_{k}^{T}\hat{A}r_{k})/(r_{k}^{T}\hat{A}^{T}\hat{A}r_{k})$, which gives an optimal residual in each iteration, namely,

\[
\alpha_{k}^{\rm MG}=\arg\min_{\alpha>0}\|\hat{A}(z_{k}-\alpha r_{k})-\hat{\ell}\|=\arg\min_{\alpha>0}\|r_{k}-\alpha\hat{A}r_{k}\|.
\]

Therefore, the minimal gradient method is convergent for solving UPD linear systems. Note that the difference between $\alpha_{k}^{\rm MG}$ and $\alpha_{k}^{\rm BB2}$ is that one uses $r_{k}$ and the other uses $r_{k-1}$. The BB2 method can be regarded as the minimal gradient method with delay [24]. Gradient methods with delay significantly improve the performance of gradient methods; see [51] and references therein. Hence, we use the BB2 method to derive the new iterates $x_{k+1}$ and $y_{k+1}$ in Algorithm 2 when $G$ is positive definite. The resulting augmented Lagrangian BB algorithm for solving (1) is given in Algorithm 3.
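For contrast with BB1, the minimal gradient stepsize can be tried on the same two-by-two matrix $\hat{A}$ with rows $(1,\,2)$ and $(-2,\,1)$. This is a minimal sketch, assuming NumPy; the function name and starting point are hypothetical. For this particular matrix $\hat{A}^{T}\hat{A}=5I$ and $r^{T}\hat{A}r=\|r\|^{2}$, so each step multiplies the squared residual norm by $1-1/5$.

```python
import numpy as np

def minimal_gradient(A, ell, z0, steps):
    """Minimal gradient iteration z <- z - alpha_MG * r with r = A z - ell.

    Returns the final iterate and the residual norms before each step.
    """
    z = np.asarray(z0, dtype=float)
    norms = []
    for _ in range(steps):
        r = A @ z - ell
        norms.append(np.linalg.norm(r))
        Ar = A @ r
        alpha = (r @ Ar) / (Ar @ Ar)   # minimizes ||r - alpha * A r|| over alpha
        z = z - alpha * r
    return z, norms

A = np.array([[1.0, 2.0], [-2.0, 1.0]])   # same UPD matrix as in the BB1 example
ell = np.zeros(2)
z, norms = minimal_gradient(A, ell, z0=[1.0, 1.0], steps=20)
print(norms[:5])
```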

Algorithm 3 Augmented Lagrangian BB algorithm, SPALBB
1:  Given $z_{-1}=(x_{-1},\,y_{-1})$, $z_{0}=(x_{0},\,y_{0})\in\mathds{R}^{n+m}$, $\omega>0$, $0\leq\delta<1$, and SPD $Q$, compute $r_{0}=Mz_{0}-(f,\,\omega Qy_{0}+g)$ and set $k=0$.
2:  while a stopping condition is not satisfied do
3:     Compute $\ell_{k}=(f,\,\omega Qy_{k}+g)$.
4:     while $\|r_{j}-Mz_{j}\|_{*}>\delta\|r_{j}\|_{*}$ do
5:        Compute $s_{j}=z_{j}-z_{j-1}$.
6:        Compute $d_{j}=Ms_{j}$.
7:        Compute $r_{j}=Mz_{j}-\ell_{k}$.
8:        Compute $\alpha_{j}=\frac{s_{j}^{T}d_{j}}{\|d_{j}\|^{2}}$.
9:        Compute $z_{j+1}=z_{j}-\alpha_{j}r_{j}$.
10:     end while
11:     Increment $k$ by $1$.
12:  end while
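A rough Python sketch of the outer/inner structure of Algorithm 3 is given below. It is not the authors' implementation and rests on several assumptions that this section does not spell out: the saddle-point matrix is taken to be $A=\begin{pmatrix}G&B^{T}\\ -B&0\end{pmatrix}$ with right-hand side $(f,\,g)$, the matrix $M$ is taken to be $A$ with $\omega Q$ in its (2,2) block, the Euclidean norm is used, and the inner loop stops once the inner residual falls below $\delta$ times the current outer residual. The function name `spalbb` and the parameters `tol`, `max_outer`, `max_inner` are hypothetical.

```python
import numpy as np

def spalbb(G, B, Q, f, g, omega, delta, z_prev, z0,
           tol=1e-10, max_outer=500, max_inner=200):
    """Sketch of an augmented Lagrangian outer loop with BB2 inner iterations.

    Assumed block forms (not taken from this section):
        A = [[G, B^T], [-B, 0]],   M = [[G, B^T], [-B, omega*Q]].
    z_prev and z0 play the roles of z_{-1} and z_0 and must differ.
    """
    n, m = G.shape[0], B.shape[0]
    A = np.block([[G, B.T], [-B, np.zeros((m, m))]])
    M = np.block([[G, B.T], [-B, omega * Q]])
    ell = np.concatenate([f, g])
    z_old = np.array(z_prev, dtype=float)
    z = np.array(z0, dtype=float)
    for outer in range(max_outer):
        res = np.linalg.norm(A @ z - ell)      # outer residual
        if res <= tol:
            break
        # Right-hand side of the current subproblem M z = ell_k.
        ell_k = np.concatenate([f, omega * (Q @ z[n:]) + g])
        for inner in range(max_inner):
            r = M @ z - ell_k                  # inner residual
            if np.linalg.norm(r) <= delta * res:
                break                          # assumed inexactness test
            s = z - z_old
            d = M @ s
            dd = d @ d
            if dd == 0.0:
                break                          # no usable BB2 stepsize
            alpha = (s @ d) / dd               # BB2 stepsize
            z_old, z = z, z - alpha * r
    return z, np.linalg.norm(A @ z - ell)
```

A call such as `spalbb(np.eye(4), 0.9 * np.eye(2, 4), np.eye(2), f, g, 1.0, 0.05, np.zeros(6), np.ones(6))` with vectors `f` of length 4 and `g` of length 2 exercises the sketch on a small well-conditioned problem; convergence for other data depends on the conditions discussed in the text.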

In the following, we establish the convergence of Algorithm 3. First, under some assumptions, we show that the BB2 method is convergent for solving a general UPD linear system $\hat{A}z=\hat{\ell}$, where the iterative scheme is $z_{k+1}=z_{k}-\alpha_{k}^{\rm BB2}r_{k}$ and $r_{k}=\hat{A}z_{k}-\hat{\ell}$. For convenience, we introduce

\[
\hat{A}_{h}=\tfrac{1}{2}(\hat{A}+\hat{A}^{T}),\qquad W=\hat{A}_{h}^{-1}\hat{A}^{T}\hat{A},
\]
\[
\theta_{j}=\max\left\{1-\frac{2u_{j}}{\lambda_{\min}(W)}+\frac{|\lambda_{j}|^{2}}{\lambda_{\min}(W)^{2}},\,1-\frac{2u_{j}}{\lambda_{\max}(W)}+\frac{|\lambda_{j}|^{2}}{\lambda_{\max}(W)^{2}}\right\}, \tag{38}
\]

where $\lambda_{j}=u_{j}+{\rm i}v_{j}$ $(1\leq j\leq n)$ are the eigenvalues of $\hat{A}$. When $\hat{A}$ is UPD, we know that $\hat{A}_{h}$ is SPD and $u_{j}>0$ $(1\leq j\leq n)$. By direct calculation, for all $1\leq j\leq n$, $\theta_{j}<1$ holds if and only if $1-\frac{2u_{j}}{\lambda_{\min}(W)}+\frac{|\lambda_{j}|^{2}}{\lambda_{\min}(W)^{2}}<1$ and $1-\frac{2u_{j}}{\lambda_{\max}(W)}+\frac{|\lambda_{j}|^{2}}{\lambda_{\max}(W)^{2}}<1$, which are equivalent to

(39) $$\max_{1\le j\le\hat{n}}\frac{|\lambda_j|^2}{u_j}<2\lambda_{\min}(W).$$

We are now ready to study the convergence of the BB2 method.

Theorem 3.8.

Suppose $\hat{A}\in\mathds{R}^{\hat{n}\times\hat{n}}$ is UPD. If its $\hat{n}$ eigenvalues $\lambda_j=u_j+{\rm i}v_j$ $(1\le j\le\hat{n})$ satisfy (39), then the sequence $\{z_k\}$ produced by the BB2 method converges to the unique solution of $\hat{A}z=\hat{\ell}$.

Proof 3.9.

It is well known that the BB method is invariant under unitary transformation of the variables [17]. By the Schur decomposition, we can assume without loss of generality that $\hat{A}$ is of the form

$$\begin{pmatrix}\lambda_1&a_{12}&a_{13}&\cdots&a_{1\hat{n}}\\ 0&\lambda_2&a_{23}&\cdots&a_{2\hat{n}}\\ \vdots&\ddots&\ddots&\ddots&\vdots\\ 0&\cdots&0&\lambda_{\hat{n}-1}&a_{\hat{n}-1,\hat{n}}\\ 0&\cdots&\cdots&0&\lambda_{\hat{n}}\end{pmatrix},$$

where $\lambda_j=u_j+{\rm i}v_j\in\mathbb{C}$, $j=1,2,\ldots,\hat{n}$. Because $r_{k+1}=\hat{A}z_{k+1}-\hat{\ell}=r_k-\alpha_k^{\rm BB2}\hat{A}r_k$,

(40) $$\left\{\begin{array}{l}r_{k+1}^{(\hat{n})}=r_k^{(\hat{n})}-\alpha_k^{\rm BB2}\lambda_{\hat{n}}r_k^{(\hat{n})},\\[3pt] r_{k+1}^{(j)}=r_k^{(j)}-\alpha_k^{\rm BB2}\lambda_jr_k^{(j)}-\alpha_k^{\rm BB2}\sum\limits_{t=j+1}^{\hat{n}}a_{j,t}r_k^{(t)},\quad j=\hat{n}-1,\ldots,1,\end{array}\right.$$

where $r_k^{(j)}$ is the $j$-th component of $r_k$. Note that $\hat{A}_h=\tfrac{1}{2}(\hat{A}+\hat{A}^T)$ and $r_{k-1}^T\hat{A}r_{k-1}=r_{k-1}^T\hat{A}^Tr_{k-1}$, giving $r_{k-1}^T\hat{A}r_{k-1}=\tfrac{1}{2}\left(r_{k-1}^T\hat{A}r_{k-1}+r_{k-1}^T\hat{A}^Tr_{k-1}\right)=r_{k-1}^T\hat{A}_hr_{k-1}.$ Since $\hat{A}_h$ is SPD, we obtain

$$\alpha_k^{\rm BB2}=\frac{r_{k-1}^T\hat{A}r_{k-1}}{r_{k-1}^T\hat{A}^T\hat{A}r_{k-1}}=\frac{r_{k-1}^T\hat{A}_hr_{k-1}}{r_{k-1}^T\hat{A}^T\hat{A}r_{k-1}}\xlongequal{\hat{r}=\hat{A}_h^{\frac{1}{2}}r_{k-1}}\frac{\hat{r}^T\hat{r}}{\hat{r}^T\hat{A}_h^{-\frac{1}{2}}\hat{A}^T\hat{A}\hat{A}_h^{-\frac{1}{2}}\hat{r}}.$$

By the Courant-Fischer min-max theorem and the fact that $\hat{A}_h^{-\frac{1}{2}}\hat{A}^T\hat{A}\hat{A}_h^{-\frac{1}{2}}$ is similar to $W$, we have

(41) $$\frac{1}{\lambda_{\max}(W)}\le\alpha_k^{\rm BB2}\le\frac{1}{\lambda_{\min}(W)}.$$
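The bound (41) can be checked numerically. The following is a minimal sketch, assuming $W=\hat{A}_h^{-1}\hat{A}^T\hat{A}$ (inferred from the similarity statement above and from Remark 8) and using a hypothetical $3\times 3$ UPD test matrix that does not come from the paper. The helper name `bb2_stepsize` is illustrative only.

```python
import numpy as np

# Hypothetical 3x3 UPD test matrix (illustrative, not from the paper).
A = np.array([[1.0, 0.1, 0.0],
              [-0.1, 1.1, 0.1],
              [0.0, -0.1, 1.2]])

Ah = 0.5 * (A + A.T)                  # symmetric part; SPD for this matrix
W = np.linalg.solve(Ah, A.T @ A)      # assumed definition: W = Ah^{-1} A^T A
eigW = np.linalg.eigvals(W).real      # W is similar to an SPD matrix, so its spectrum is real
lo, hi = 1.0 / eigW.max(), 1.0 / eigW.min()

def bb2_stepsize(A, r_prev):
    """BB2 stepsize r^T A r / (r^T A^T A r) evaluated at the previous residual."""
    Ar = A @ r_prev
    return (r_prev @ Ar) / (Ar @ Ar)

# Evaluate the stepsize on many random residual vectors.
rng = np.random.default_rng(0)
alphas = np.array([bb2_stepsize(A, rng.standard_normal(3)) for _ in range(1000)])

print("1/lambda_max(W) =", lo)
print("observed range  =", alphas.min(), alphas.max())
print("1/lambda_min(W) =", hi)
```

Every sampled stepsize should fall inside the interval in (41), since the stepsize is the reciprocal of a Rayleigh quotient of a symmetric matrix similar to $W$.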

It follows from $\lambda_j=u_j+{\rm i}v_j$, (38), (41), and the behavior of the quadratic function for $\alpha_k^{\rm BB2}$ that, for any $j=1,\ldots,\hat{n}$,

(42) $$\begin{aligned}\left|1-\alpha_k^{\rm BB2}\lambda_j\right|^2&=\left(1-\alpha_k^{\rm BB2}u_j\right)^2+\left(\alpha_k^{\rm BB2}v_j\right)^2=1-2\alpha_k^{\rm BB2}u_j+\left(\alpha_k^{\rm BB2}\right)^2\left|\lambda_j\right|^2\\ &\le\max\left\{1-\tfrac{2u_j}{\lambda_{\min}(W)}+\tfrac{|\lambda_j|^2}{\lambda_{\min}(W)^2},\,1-\tfrac{2u_j}{\lambda_{\max}(W)}+\tfrac{|\lambda_j|^2}{\lambda_{\max}(W)^2}\right\}=\theta_j.\end{aligned}$$
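The inequality in (42) rests on a convexity argument that can be spelled out as follows. The symbol $q_j$ is introduced here only for illustration and does not appear in the original text.

```latex
% q_j is a convex quadratic in the stepsize \alpha
q_j(\alpha) := 1 - 2\alpha u_j + \alpha^2 |\lambda_j|^2,
\qquad
q_j''(\alpha) = 2|\lambda_j|^2 > 0 .

% A convex function on a closed interval attains its maximum at an endpoint,
% so by (41), with a = 1/\lambda_{\max}(W) and b = 1/\lambda_{\min}(W),
\left|1-\alpha_k^{\rm BB2}\lambda_j\right|^2
  = q_j\!\left(\alpha_k^{\rm BB2}\right)
  \le \max\{\, q_j(a),\; q_j(b) \,\}
  = \theta_j .
```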

Combining with (39) and (40) gives

$$\left|r_{k+1}^{(\hat{n})}\right|=\left|1-\alpha_k^{\rm BB2}\lambda_{\hat{n}}\right|\,\left|r_k^{(\hat{n})}\right|\le\sqrt{\theta_{\hat{n}}}\left|r_k^{(\hat{n})}\right|<\left|r_k^{(\hat{n})}\right|.$$

This implies that $r_k^{(\hat{n})}\rightarrow 0$ as $k\rightarrow\infty$. For $j=\hat{n}-1,\ldots,1$, by (40) and (42),

$$\left|r_{k+1}^{(j)}\right|\le\left|1-\alpha_k^{\rm BB2}\lambda_j\right|\,\left|r_k^{(j)}\right|+\alpha_k^{\rm BB2}\left|\sum\limits_{t=j+1}^{\hat{n}}a_{j,t}r_k^{(t)}\right|\le\sqrt{\theta_j}\left|r_k^{(j)}\right|+\alpha_k^{\rm BB2}\left|\sum\limits_{t=j+1}^{\hat{n}}a_{j,t}r_k^{(t)}\right|.$$

Since $\theta_j<1$, $\alpha_k^{\rm BB2}$ is bounded by (41), and $\lim\limits_{k\rightarrow\infty}r_k^{(\hat{n})}=0$, backward induction on $j$ gives $\lim\limits_{k\rightarrow\infty}r_k^{(j)}=0$ for every $j=\hat{n}-1,\ldots,1$. Hence $r_k\rightarrow 0$, and $\{z_k\}$ converges to the unique solution of $\hat{A}z=\hat{\ell}$.
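Theorem 3.8 can be illustrated on a small example. The sketch below is not the authors' implementation: it assumes the update $z_{k+1}=z_k-\alpha_k^{\rm BB2}r_k$ with $r_k=\hat{A}z_k-\hat{\ell}$ (consistent with the residual recursion used in the proof), assumes $W=\hat{A}_h^{-1}\hat{A}^T\hat{A}$, and reuses $r_0$ for the first stepsize because no $r_{-1}$ exists. The function names and the test matrix are hypothetical.

```python
import numpy as np

def condition_39(A):
    """Return (max_j |lambda_j|^2 / u_j, 2 * lambda_min(W)) with W = Ah^{-1} A^T A (assumed)."""
    Ah = 0.5 * (A + A.T)
    W = np.linalg.solve(Ah, A.T @ A)
    lam = np.linalg.eigvals(A)
    lhs = np.max(np.abs(lam) ** 2 / lam.real)
    rhs = 2.0 * np.min(np.linalg.eigvals(W).real)
    return lhs, rhs

def bb2_solve(A, ell, tol=1e-12, max_iter=500):
    """Gradient-type iteration with the BB2 stepsize for A z = ell."""
    z = np.zeros_like(ell, dtype=float)
    r = A @ z - ell
    r_prev = r.copy()          # assumption: the first step reuses r_0
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol:
            return z, k
        Ar_prev = A @ r_prev
        alpha = (r_prev @ Ar_prev) / (Ar_prev @ Ar_prev)
        z = z - alpha * r
        r_prev, r = r, A @ z - ell
    return z, max_iter

# Hypothetical UPD test problem (illustrative, not from the paper).
A = np.array([[1.0, 0.1, 0.0],
              [-0.1, 1.1, 0.1],
              [0.0, -0.1, 1.2]])
ell = np.array([1.0, 2.0, 3.0])

lhs, rhs = condition_39(A)
z, iters = bb2_solve(A, ell)
print("condition (39) holds:", lhs < rhs)
print("iterations:", iters, "final residual:", np.linalg.norm(A @ z - ell))
```

For this matrix the skew-symmetric part is small relative to the symmetric part, so (39) holds with a comfortable margin and the iteration terminates well before `max_iter`.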

Remark 8.

As $\hat{A}$ is positive definite, so is $\hat{A}^{-1}$. Let $\tilde{\lambda}_j=\tilde{u}_j+{\rm i}\tilde{v}_j$ $(1\le j\le\hat{n})$ be the eigenvalues of $\hat{A}^{-1}$. Clearly, $\frac{1}{\tilde{\lambda}_j}=\frac{\tilde{u}_j-{\rm i}\tilde{v}_j}{\tilde{u}_j^2+\tilde{v}_j^2}$ is an eigenvalue of $\hat{A}$. Then $\max\limits_{1\le j\le\hat{n}}\frac{|\lambda_j|^2}{u_j}=\frac{1}{\min_{1\le j\le\hat{n}}\tilde{u}_j}$. This, along with

$$\begin{aligned}\lambda_{\min}(W)&=2\lambda_{\min}\left((\hat{A}+\hat{A}^T)^{-1}\hat{A}^T\hat{A}\right)=2\lambda_{\min}\left(\hat{A}(\hat{A}+\hat{A}^T)^{-1}\hat{A}^T\right)\\ &=2\lambda_{\min}\left((\hat{A}^{-1}+\hat{A}^{-T})^{-1}\right)=\frac{2}{\lambda_{\max}(\hat{A}^{-1}+\hat{A}^{-T})},\end{aligned}$$

shows that condition (39) is equivalent to $\lambda_{\max}(\hat{A}^{-1}+\hat{A}^{-T})<4\min\limits_{1\le j\le\hat{n}}\tilde{u}_j.$ Note that $\min\limits_{1\le j\le\hat{n}}\tilde{u}_j\ge\tfrac{1}{2}\lambda_{\min}(\hat{A}^{-1}+\hat{A}^{-T})$,$^{1}$ so the above inequality can be strengthened to $\lambda_{\max}(\hat{A}^{-1}+\hat{A}^{-T})<2\lambda_{\min}(\hat{A}^{-1}+\hat{A}^{-T}).$ When $\hat{A}$ is SPD, this reduces to $\lambda_{\max}(\hat{A})<2\lambda_{\min}(\hat{A})$, which is the same as the convergence condition of the preconditioned BB method for SPD linear systems [34]. This means that our condition (39) is weaker than that of [34].

$^{1}$ For any $j=1,\ldots,\hat{n}$, let $\tilde{x}_j$ be an eigenvector of $\hat{A}^{-1}$ corresponding to $\tilde{\lambda}_j$. Then we have $\tilde{\lambda}_j=\tfrac{\tilde{x}_j^*\hat{A}^{-1}\tilde{x}_j}{\tilde{x}_j^*\tilde{x}_j}$. Since $\hat{A}^{-1}+\hat{A}^{-T}$ is SPD, this gives
$$\tilde{u}_j=\frac{1}{2}\left(\tilde{\lambda}_j+\tilde{\lambda}_j^*\right)=\frac{1}{2}\left(\frac{\tilde{x}_j^*\hat{A}^{-1}\tilde{x}_j}{\tilde{x}_j^*\tilde{x}_j}+\frac{\tilde{x}_j^*\hat{A}^{-T}\tilde{x}_j}{\tilde{x}_j^*\tilde{x}_j}\right)=\frac{\tilde{x}_j^*\left(\hat{A}^{-1}+\hat{A}^{-T}\right)\tilde{x}_j}{2\tilde{x}_j^*\tilde{x}_j}\ge\frac{1}{2}\lambda_{\min}(\hat{A}^{-1}+\hat{A}^{-T}).$$
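The identities used in Remark 8 can be verified numerically. This is a minimal sketch on a hypothetical UPD test matrix (not from the paper), again assuming $W=\hat{A}_h^{-1}\hat{A}^T\hat{A}$. The dictionary `checks` and all variable names are illustrative.

```python
import numpy as np

# Hypothetical UPD test matrix (illustrative, not from the paper).
A = np.array([[1.0, 0.1, 0.0],
              [-0.1, 1.1, 0.1],
              [0.0, -0.1, 1.2]])

Ainv = np.linalg.inv(A)
lam = np.linalg.eigvals(A)                  # eigenvalues lambda_j = u_j + i v_j of A
u_tilde_min = np.linalg.eigvals(Ainv).real.min()   # min_j of the real parts for A^{-1}

S = Ainv + Ainv.T                           # symmetric matrix A^{-1} + A^{-T}
eigS = np.linalg.eigvalsh(S)                # ascending order

W = np.linalg.solve(0.5 * (A + A.T), A.T @ A)   # assumed W = Ah^{-1} A^T A
lam_min_W = np.linalg.eigvals(W).real.min()

lhs39 = np.max(np.abs(lam) ** 2 / lam.real)

checks = {
    "max |lam|^2/u = 1/min u~": bool(np.isclose(lhs39, 1.0 / u_tilde_min)),
    "lam_min(W) = 2/lam_max(S)": bool(np.isclose(lam_min_W, 2.0 / eigS[-1])),
    "min u~ >= lam_min(S)/2": bool(u_tilde_min >= 0.5 * eigS[0] - 1e-12),
    "(39) <=> lam_max(S) < 4 min u~": bool((lhs39 < 2.0 * lam_min_W) == (eigS[-1] < 4.0 * u_tilde_min)),
}
for name, ok in checks.items():
    print(name, "->", ok)
```

The third check is the footnote inequality, and the fourth confirms that the two forms of condition (39) agree on this example.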

When $G$ is UPD, so is $M$ in (12). Combining Theorem 3.8 with the convergence conditions of Algorithm 2 gives the following result.

Theorem 3.10.

Suppose $G\in\mathbb{R}^{n\times n}$ is UPD. For any SPD $Q\in\mathbb{R}^{m\times m}$ and any $\omega>0$, let $M$ be defined by (12) and $\lambda_j=u_j+{\rm i}v_j$ $(1\le j\le n+m)$ be its $n+m$ eigenvalues. If

\[
\max_{1\le j\le n+m}\frac{|\lambda_j|^2}{u_j}<\frac{4}{\lambda_{\max}\left(M^{-1}+M^{-T}\right)},
\tag{43}
\]

then for sufficiently small $\delta$, the sequence $\{x_k,y_k\}$ produced by Algorithm 3 converges to a solution of (1).

Remark 9.

The residuals generated by the BB method, even for SPD linear systems, are strongly nonmonotonic, which poses a challenge for convergence analysis [40, 17]. This is also why the convergence of Algorithm 3 is intricate. Our convergence analysis of Algorithm 3, which ensures a decrease of $\|r_k\|_*$, is quite stringent and relies on the rather strong assumption (43). The nonmonotonic behavior of $\|r_k\|$ in Figures 1 and 2 also indicates that the choices of $\omega$ in our numerical experiments do not satisfy (43). Thus, there is significant room for improving the convergence analysis of the BB method for UPD linear systems and of Algorithm 3.
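
To make the object of Remark 9 concrete, the following Python sketch runs a generic BB iteration on a small positive definite but unsymmetric system $Mx=b$ and records the relative residuals. It is an illustration only and is not the paper's Algorithm 2 or Algorithm 3: the step length $\alpha_k=s_{k-1}^Ts_{k-1}/s_{k-1}^Ty_{k-1}$ with $y_{k-1}=Ms_{k-1}$, the helper names `matvec`, `dot` and `bb_solve`, the default tolerances, and the $2\times 2$ test matrix are all assumptions.

```python
import math


def matvec(M, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def dot(x, y):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(x, y))


def bb_solve(M, b, tol=1e-6, max_iter=10**5):
    """Generic BB iteration for M x = b, started from the zero vector.

    Assumes x^T M x > 0 for all x != 0 (M may be unsymmetric), so every
    step length below is well defined. Returns the final iterate and the
    history of relative residuals ||r_k|| / ||r_0||.
    """
    x = [0.0] * len(b)
    r = list(b)                       # r_0 = b - M x_0 with x_0 = 0
    r0_norm = math.sqrt(dot(r, r))
    history = [1.0]
    if r0_norm == 0.0:
        return x, history
    # First step: reciprocal Rayleigh quotient of r_0.
    alpha = dot(r, r) / dot(r, matvec(M, r))
    for _ in range(max_iter):
        if history[-1] <= tol:
            break
        Mr = matvec(M, r)
        # Since s_k = alpha_k r_k and y_k = M s_k, the BB step for the
        # next iteration, s_k^T s_k / s_k^T y_k, equals r_k^T r_k / r_k^T M r_k.
        next_alpha = dot(r, r) / dot(r, Mr)
        x = [x_i + alpha * r_i for x_i, r_i in zip(x, r)]
        r = [r_i - alpha * mr_i for r_i, mr_i in zip(r, Mr)]
        alpha = next_alpha
        history.append(math.sqrt(dot(r, r)) / r0_norm)
    return x, history


# Toy data (assumption): symmetric part diag(1, 1.5), skew part +-0.5.
M = [[1.0, 0.5], [-0.5, 1.5]]
b = [1.0, 1.0]
x, history = bb_solve(M, b)
```

On this toy system the exact solution is $(4/7,\,6/7)$. Inspecting `history` for consecutive increases is one simple way to look for the nonmonotone behavior discussed in the remark on other test matrices.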

Remark 10.

Although assumption (43) is strong, it is still possible to choose $\omega$ to satisfy it. Indeed, consider the special case $n=m=1$ and $M=\begin{pmatrix}a&b\\ -b&\omega\end{pmatrix}$ with $a>0$ and $b\in\mathbb{R}$. Since $M^{-1}=\frac{1}{a\omega+b^{2}}\begin{pmatrix}\omega&-b\\ b&a\end{pmatrix}$, we have

\[
\lambda_{\min}\left(M^{-1}+M^{-T}\right)=\frac{2\min\{a,\omega\}}{a\omega+b^{2}}
\quad\mbox{and}\quad
\lambda_{\max}\left(M^{-1}+M^{-T}\right)=\frac{2\max\{a,\omega\}}{a\omega+b^{2}}.
\]

It follows from Remark 8 that (43) can be reinforced as $\lambda_{\max}\left(M^{-1}+M^{-T}\right)<2\lambda_{\min}\left(M^{-1}+M^{-T}\right)$, namely, $\max\{a,\omega\}<2\min\{a,\omega\}$. This implies that (43) holds when $\omega\in\left(a/2,\,2a\right)$.

For the general case, we can apply preconditioning techniques to (7) such that $M$ is well-conditioned. Preconditioning techniques for $M$ have been widely studied; see [8] and the references therein.
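
The $n=m=1$ case of Remark 10 can be checked numerically. The sketch below evaluates both sides of (43) for $M=\begin{pmatrix}a&b\\ -b&\omega\end{pmatrix}$ using the closed-form eigenvalues of a $2\times 2$ matrix and the expression for $\lambda_{\max}(M^{-1}+M^{-T})$ given in the remark. The function names `condition_43_sides` and `satisfies_43` and the sample values of $a$, $b$, $\omega$ are assumptions made for illustration.

```python
import cmath


def condition_43_sides(a, b, omega):
    """Return (lhs, rhs) of (43) for M = [[a, b], [-b, omega]], a, omega > 0."""
    det = a * omega + b * b
    # Eigenvalues of M: roots of t^2 - (a + omega) t + (a*omega + b^2) = 0.
    root = cmath.sqrt((a - omega) ** 2 - 4 * b * b)
    eigs = [((a + omega) + root) / 2, ((a + omega) - root) / 2]
    lhs = max(abs(lam) ** 2 / lam.real for lam in eigs)
    # M^{-1} + M^{-T} = (2 / det) * diag(omega, a), so its largest
    # eigenvalue is 2 * max(a, omega) / det.
    lam_max = 2 * max(a, omega) / det
    rhs = 4 / lam_max
    return lhs, rhs


def satisfies_43(a, b, omega):
    """True when the strict inequality (43) holds for the 2-by-2 example."""
    lhs, rhs = condition_43_sides(a, b, omega)
    return lhs < rhs


# Sample values with omega strictly between a/2 and 2a (a = 1).
inside = all(
    satisfies_43(1.0, b, omega)
    for b in (0.0, 0.5, 2.0)
    for omega in (0.6, 1.0, 1.9)
)
# A value of omega far outside that range, with b = 0.
outside = satisfies_43(1.0, 0.0, 10.0)
```

Here `inside` evaluates to `True` and `outside` to `False`: with $a=1$, $b=0$, $\omega=10$ the left side of (43) is $10$ while the right side is $2$.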

4 Numerical experiments

We present the results of numerical tests to examine the feasibility and effectiveness of SPALBB. All experiments were run using MATLAB R2022b on a PC with an Intel(R) Core(TM) i7-1260P CPU @ 2.10GHz and 32GB of RAM. The initial guess is the zero vector, and the algorithms are terminated when the number of iterations exceeds $10^{5}$ or ${\rm Res}:=\|r_k\|/\|r_0\|\le 10^{-6}$. We report the number of outer iterations, the total number of iterations (for SPALBB, this includes the inner iterations), the CPU time in seconds, and the final value of the relative residual, denoted by “Oiter”, “Titer”, “CPU” and “Res”, respectively.

In SPALBB, we set $Q=I$, used the stopping criterion (23) for inner iterations with $\delta=0.5$ and the $2$-norm, and tried $\omega=10^{-i}$ with $i=1$, $2$, $3$, $4$, $5$, denoted SPALBB($i$). We compared our method with BICGSTAB and restarted GMRES. We tested two restart values, $20$ and $50$, denoted GMRES(20) and GMRES(50).

Example 1.

The steady-state Navier-Stokes equations are

\[
-\nu\nabla^{2}\bm{u}+\bm{u}\cdot\nabla\bm{u}+\nabla p=\bm{h}
\quad{\rm and}\quad
{\rm div}\,\bm{u}=0,\quad \bm{z}=(x,y)\in\Omega,
\tag{44}
\]

where $\Omega\subseteq\mathbb{R}^{2}$ is a bounded domain, the vector field $\bm{u}$ represents the velocity in $\Omega$, $p$ represents pressure, and $\nu>0$ is the kinematic viscosity. The test problem is a model of the flow in a square cavity $\Omega=(-1,1)\times(-1,1)$ with the lid moving from left to right. A Dirichlet no-flow (zero velocity) condition is applied on the side and bottom boundaries, and the nonzero horizontal velocity on the lid is $\{y=1;\ -1\le x\le 1\mid u_x=1-x^{4}\}$.

Finite element discretization of (44) results in system (1) with $G=\nu G_1+G_2$. Here $G_1$ is SPD and consists of a set of uncoupled discrete Laplace operators, corresponding to diffusion, and $G_2$ is a discrete convection operator and is unsymmetric. Evidently, $G$ becomes more unsymmetric as $\nu$ decreases. Various methods have been developed for solving (44). However, the convergence rates of some approaches deteriorate as $\nu$ decreases [22]. Thus, for (44), we test three small values of the viscosity $\nu$: $0.005$, $0.01$, $0.05$.
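
The remark that $G=\nu G_1+G_2$ becomes more unsymmetric as $\nu$ decreases can be illustrated on a toy example. The sketch below measures the ratio of the Frobenius norms of the skew-symmetric and symmetric parts of $G$. The $2\times 2$ matrices `G1` and `G2` are stand-ins chosen for illustration (with `G2` taken purely skew-symmetric for simplicity) and are not matrices produced by IFISS; the function names are likewise assumptions.

```python
import math


def frobenius(A):
    """Frobenius norm of a list-of-rows matrix."""
    return math.sqrt(sum(v * v for row in A for v in row))


def asymmetry_ratio(G):
    """||(G - G^T)/2||_F divided by ||(G + G^T)/2||_F for a square matrix G."""
    n = len(G)
    skew = [[(G[i][j] - G[j][i]) / 2 for j in range(n)] for i in range(n)]
    sym = [[(G[i][j] + G[j][i]) / 2 for j in range(n)] for i in range(n)]
    return frobenius(skew) / frobenius(sym)


# Toy stand-ins (assumptions, not IFISS output).
G1 = [[2.0, -1.0], [-1.0, 2.0]]   # SPD, mimics a discrete Laplacian
G2 = [[0.0, 1.0], [-1.0, 0.0]]    # skew-symmetric, mimics convection


def build_G(nu):
    """Assemble G = nu * G1 + G2 for the toy matrices above."""
    return [[nu * G1[i][j] + G2[i][j] for j in range(2)] for i in range(2)]


# The three viscosity values tested in the text.
ratios = {nu: asymmetry_ratio(build_G(nu)) for nu in (0.05, 0.01, 0.005)}
```

For these toy matrices the ratio equals $\|G_2\|_F/(\nu\|G_1\|_F)$, so it grows like $1/\nu$ as the viscosity decreases.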

Example 1 is a classical test problem in fluid dynamics, known as driven-cavity flow. We discretize (44) using Picard iterations and the Q2–Q1 mixed finite element approximation [23] on uniform grids with grid parameter $h=2^{-6}$, $2^{-7}$, $2^{-8}$, $2^{-9}$. This discretization can be carried out with the IFISS software package [23, 46]. In this example, $G$ is UPD and $B$ is rank-deficient with rank $m-1$. Thus, the matrix in (1) is singular. The numerical results are reported in Tables 1, 2 and 3 and in the left-hand plots of Figure 1, where “-” means that the method failed to solve the problem and bold face indicates that the method performs best in terms of CPU time. It can be seen from Tables 1, 2 and 3 that the CPU time of all tested methods increases as $\nu$ decreases, and BICGSTAB and SPALBB(1) fail when $h=2^{-9}$ for $\nu=0.005$. The CPU time of SPALBB with $\omega\le 10^{-2}$ is about half that of GMRES, and the best cases of SPALBB take only about a third of the GMRES time for $h=2^{-9}$. The number of outer iterations of SPALBB decreases with $\omega$, which is consistent with Remark 1. Nevertheless, the total number of iterations is not the least for $\omega=10^{-5}$.
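
The "about half" and "about a third" statements can be reproduced from the CPU times reported in Table 1 for $\nu=0.005$ and $h=2^{-9}$. The short Python sketch below only re-uses those reported numbers; the dictionary and variable names are assumptions introduced for illustration.

```python
# CPU times in seconds copied from Table 1 (nu = 0.005, h = 2^-9).
gmres_cpu = {"GMRES(20)": 1563.36, "GMRES(50)": 1518.14}
spalbb_cpu = {
    "SPALBB(2)": 808.82,
    "SPALBB(3)": 539.99,
    "SPALBB(4)": 778.87,
    "SPALBB(5)": 783.08,
}

# Compare every SPALBB run with the faster of the two GMRES runs.
best_gmres = min(gmres_cpu.values())
ratios = {name: cpu / best_gmres for name, cpu in spalbb_cpu.items()}
best_name = min(ratios, key=ratios.get)

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} of the best GMRES time")
```

The fastest SPALBB run on this grid is SPALBB(3), at roughly 0.36 of the best GMRES time, while the other three runs lie between 0.51 and 0.54.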

Table 1: Numerical results for Example 1 with $\nu=0.005$.

| $h$ $(n,m)$ | $2^{-6}$ ($n=8{,}450$, $m=1{,}089$) | | | | $2^{-7}$ ($n=33{,}282$, $m=4{,}225$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | 6665.5 | 2.28 | 9.83E-07 | | 14211.5 | 22.75 | 8.95E-07 |
| GMRES(20) | | 4221 | 2.55 | 9.99E-07 | | 7735 | 16.86 | 9.99E-07 |
| GMRES(50) | | 4040 | 4.64 | 9.99E-07 | | 7162 | 27.94 | 1.00E-06 |
| SPALBB(1) | 1594 | 6765 | 1.23 | 9.97E-07 | 22057 | 50212 | 42.16 | 9.93E-07 |
| SPALBB(2) | 51 | 16705 | 2.83 | 9.98E-07 | 243 | 18537 | 15.42 | 9.83E-07 |
| SPALBB(3) | 17 | 18762 | 3.18 | 1.00E-06 | 27 | 24084 | 20.45 | 7.70E-07 |
| SPALBB(4) | 14 | 19036 | 3.22 | 9.99E-07 | 15 | 22801 | 18.78 | 1.00E-06 |
| SPALBB(5) | 14 | 24496 | 4.28 | 1.00E-06 | 14 | 35119 | 29.70 | 1.00E-06 |

| $h$ $(n,m)$ | $2^{-8}$ ($n=132{,}098$, $m=16{,}641$) | | | | $2^{-9}$ ($n=526{,}338$, $m=66{,}049$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | 39440.5 | 378.63 | 8.31E-07 | | - | - | - |
| GMRES(20) | | 15638 | 138.37 | 1.00E-06 | | 37240 | 1563.36 | 1.00E-06 |
| GMRES(50) | | 13265 | 179.20 | 1.00E-06 | | 25858 | 1518.14 | 1.00E-06 |
| SPALBB(1) | 55401 | 77589 | 374.99 | 9.99E-07 | - | - | - | - |
| SPALBB(2) | 2412 | 17982 | 85.91 | 1.00E-06 | 14384 | 38247 | 808.82 | 9.94E-07 |
| SPALBB(3) | 51 | 26095 | 127.44 | 1.00E-06 | 340 | 26085 | 539.99 | 9.69E-07 |
| SPALBB(4) | 17 | 28805 | 138.84 | 1.00E-06 | 23 | 36728 | 778.87 | 1.00E-06 |
| SPALBB(5) | 14 | 32707 | 165.62 | 1.00E-06 | 14 | 36635 | 783.08 | 1.00E-06 |

Table 2: Numerical results for Example 1 with $\nu=0.01$.

| $h$ $(n,m)$ | $2^{-6}$ ($n=8{,}450$, $m=1{,}089$) | | | | $2^{-7}$ ($n=33{,}282$, $m=4{,}225$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | - | - | - | | 7211.5 | 11.63 | 9.90E-07 |
| GMRES(20) | | 2453 | 1.60 | 9.98E-07 | | 5127 | 10.15 | 9.99E-07 |
| GMRES(50) | | 2237 | 2.63 | 9.98E-07 | | 4422 | 14.82 | 1.00E-06 |
| SPALBB(1) | 1649 | 4803 | 0.87 | 9.96E-07 | 8611 | 17952 | 15.60 | 9.49E-07 |
| SPALBB(2) | 38 | 5475 | 0.91 | 1.00E-06 | 411 | 9982 | 8.26 | 1.00E-06 |
| SPALBB(3) | 16 | 6295 | 0.98 | 1.00E-06 | 22 | 8554 | 6.94 | 1.00E-06 |
| SPALBB(4) | 15 | 10835 | 1.76 | 1.00E-06 | 15 | 10388 | 8.35 | 1.00E-06 |
| SPALBB(5) | 15 | 17674 | 2.89 | 1.00E-06 | 15 | 24493 | 20.09 | 1.00E-06 |

| $h$ $(n,m)$ | $2^{-8}$ ($n=132{,}098$, $m=16{,}641$) | | | | $2^{-9}$ ($n=526{,}338$, $m=66{,}049$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | 17765.5 | 161.00 | 8.20E-07 | | 29320 | 1024.60 | 1.46E-05 |
| GMRES(20) | | 13579 | 113.90 | 1.00E-06 | | 34224 | 1186.20 | 1.00E-06 |
| GMRES(50) | | 9419 | 118.86 | 1.00E-06 | | 22827 | 1140.94 | 1.00E-06 |
| SPALBB(1) | 25643 | 36571 | 165.72 | 9.99E-07 | 77225 | 87388 | 1632.85 | 9.94E-07 |
| SPALBB(2) | 2154 | 14993 | 67.21 | 1.00E-06 | 8277 | 31640 | 581.72 | 9.99E-07 |
| SPALBB(3) | 42 | 13434 | 60.32 | 9.99E-07 | 554 | 22207 | 402.99 | 1.00E-06 |
| SPALBB(4) | 16 | 14113 | 65.49 | 9.99E-07 | 22 | 21902 | 395.92 | 9.14E-07 |
| SPALBB(5) | 15 | 21503 | 101.01 | 1.00E-06 | 15 | 20219 | 369.43 | 1.00E-06 |

Table 3: Numerical results for Example 1 with $\nu=0.05$.

| $h$ $(n,m)$ | $2^{-6}$ ($n=8{,}450$, $m=1{,}089$) | | | | $2^{-7}$ ($n=33{,}282$, $m=4{,}225$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | 917.5 | 0.33 | 9.32E-07 | | 1648.5 | 2.72 | 9.61E-07 |
| GMRES(20) | | 1508 | 0.94 | 9.96E-07 | | 4507 | 9.58 | 9.98E-07 |
| GMRES(50) | | 989 | 1.15 | 9.96E-07 | | 2769 | 11.11 | 1.00E-06 |
| SPALBB(1) | 782 | 2947 | 0.54 | 9.96E-07 | 3142 | 9527 | 7.89 | 1.00E-06 |
| SPALBB(2) | 79 | 1952 | 0.34 | 9.95E-07 | 300 | 4762 | 3.84 | 9.98E-07 |
| SPALBB(3) | 19 | 1729 | 0.28 | 9.97E-07 | 36 | 3272 | 2.69 | 6.68E-07 |
| SPALBB(4) | 17 | 4261 | 0.67 | 9.98E-07 | 15 | 3878 | 3.21 | 9.95E-07 |
| SPALBB(5) | 17 | 5252 | 0.86 | 1.00E-06 | 16 | 8563 | 7.24 | 9.97E-07 |

| $h$ $(n,m)$ | $2^{-8}$ ($n=132{,}098$, $m=16{,}641$) | | | | $2^{-9}$ ($n=526{,}338$, $m=66{,}049$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | 3253.5 | 30.99 | 8.58E-07 | | 6277.5 | 252.29 | 9.54E-07 |
| GMRES(20) | | 11071 | 99.96 | 1.00E-06 | | 21205 | 797.99 | 1.00E-06 |
| GMRES(50) | | 7775 | 104.78 | 1.00E-06 | | 18460 | 1048.97 | 1.00E-06 |
| SPALBB(1) | 10731 | 30596 | 141.61 | 9.99E-07 | - | - | - | - |
| SPALBB(2) | 1005 | 13335 | 62.92 | 1.00E-06 | 3330 | 34174 | 700.69 | 9.99E-07 |
| SPALBB(3) | 100 | 7869 | 36.87 | 9.99E-07 | 339 | 21075 | 433.98 | 1.00E-06 |
| SPALBB(4) | 19 | 5156 | 24.64 | 7.36E-07 | 37 | 10542 | 216.78 | 9.99E-07 |
| SPALBB(5) | 17 | 12476 | 60.19 | 9.99E-07 | 16 | 11154 | 231.77 | 9.99E-07 |

Figure 1: Evolution of the relative residual of SPALBB tested on Example 1 (left) with $n=8450$, $m=1089$, and on Example 2 (right) with $n=8416$, $m=1096$ and $\omega$ as in (7). Panels: (a), (b) $\nu=0.005$; (c), (d) $\nu=0.01$; (e), (f) $\nu=0.05$.

Example 2.

We consider the steady-state Navier-Stokes equations (44), where the domain $\Omega$ is obtained from the rectangular region $(0,8)\times(-1,1)$ by deleting the square $(7/4,9/4)\times(-1/4,1/4)$. This test problem is a model of the flow in a rectangular channel with an obstacle. A Poiseuille profile is imposed on the inflow boundary $\{x=0;\ -1\le y\le 1\}$, and a Dirichlet no-flow condition is imposed on the obstruction and on the top and bottom walls. A Neumann condition is applied at the outflow boundary, which automatically sets the mean outflow pressure to zero.

In our tests, we set $\nu=0.005,\,0.01,\,0.05$ and discretize the Navier-Stokes equations (44) using Picard iterations and the Q2–Q1 mixed finite element approximation [23] on uniform grids with grid parameter $h=2^{-5},\,2^{-6},\,2^{-7},\,2^{-8}$. This discretization was accomplished using IFISS [23, 46]. The resulting matrices have $G$ UPD and $B$ of full column rank. The numerical results are reported in Tables 4, 5 and 6 and in the right-hand plots of Figure 1. Tables 4, 5 and 6 show that all choices of $\omega$ are successful in solving the tested problems, and, in terms of CPU time, $\omega=10^{-1}$ and $10^{-2}$ perform better than other choices. Although BICGSTAB requires the least CPU time for $\nu=0.05$, it fails on every grid for $\nu=0.005$ and on the grids $h=2^{-5},\,2^{-8}$ for $\nu=0.01$. The CPU time for every SPALBB test is less than for GMRES, and the best case of SPALBB takes about half the time of GMRES. Overall, SPALBB is more stable and efficient.

Table 4: Numerical results for Example 2 with $\nu=0.005$.

| $h$ $(n,m)$ | $2^{-5}$ ($n=8{,}416$, $m=1{,}096$) | | | | $2^{-6}$ ($n=32{,}960$, $m=4{,}208$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | - | - | - | | - | - | - |
| GMRES(20) | | 6319 | 3.60 | 9.99E-07 | | 11462 | 25.25 | 9.99E-07 |
| GMRES(50) | | 5540 | 6.10 | 9.98E-07 | | 11386 | 42.46 | 1.00E-06 |
| SPALBB(1) | 632 | 17567 | 2.78 | 1.00E-06 | 2211 | 16590 | 12.88 | 1.00E-06 |
| SPALBB(2) | 110 | 29351 | 4.95 | 1.00E-06 | 330 | 34576 | 27.70 | 9.99E-07 |
| SPALBB(3) | 30 | 29332 | 4.64 | 1.00E-06 | 61 | 34000 | 26.83 | 8.13E-07 |
| SPALBB(4) | 18 | 34374 | 5.25 | 1.00E-06 | 20 | 34842 | 27.95 | 9.99E-07 |
| SPALBB(5) | 18 | 40951 | 6.20 | 1.00E-06 | 17 | 44791 | 36.40 | 1.00E-06 |

| $h$ $(n,m)$ | $2^{-7}$ ($n=130{,}432$, $m=16{,}480$) | | | | $2^{-8}$ ($n=518{,}912$, $m=65{,}216$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | - | - | - | | - | - | - |
| GMRES(20) | | 20442 | 173.54 | 1.00E-06 | | 39863 | 1385.75 | 1.00E-06 |
| GMRES(50) | | 20511 | 276.89 | 1.00E-06 | | 38382 | 2019.57 | 1.00E-06 |
| SPALBB(1) | 9013 | 23219 | 102.78 | 1.00E-06 | 43791 | 61073 | 1155.48 | 1.00E-06 |
| SPALBB(2) | 917 | 46379 | 211.82 | 9.97E-07 | 2765 | 35310 | 651.49 | 1.00E-06 |
| SPALBB(3) | 135 | 44805 | 214.76 | 1.00E-06 | 361 | 60088 | 1099.24 | 1.00E-06 |
| SPALBB(4) | 32 | 46194 | 221.46 | 9.10E-07 | 65 | 59081 | 1080.07 | 8.08E-07 |
| SPALBB(5) | 16 | 52684 | 255.82 | 1.00E-06 | 19 | 57737 | 1064.29 | 1.00E-06 |

Table 5: Numerical results for Example 2 with $\nu=0.01$.

| $h$ $(n,m)$ | $2^{-5}$ ($n=8{,}416$, $m=1{,}096$) | | | | $2^{-6}$ ($n=32{,}960$, $m=4{,}208$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | - | - | - | | 5336 | 7.87 | 9.89E-07 |
| GMRES(20) | | 5145 | 2.98 | 9.99E-07 | | 9162 | 18.05 | 1.00E-06 |
| GMRES(50) | | 4904 | 5.59 | 1.00E-06 | | 9446 | 37.68 | 1.00E-06 |
| SPALBB(1) | 604 | 10445 | 1.54 | 9.99E-07 | 2455 | 10813 | 8.26 | 9.99E-07 |
| SPALBB(2) | 108 | 13769 | 2.06 | 9.42E-07 | 294 | 20425 | 15.79 | 9.77E-07 |
| SPALBB(3) | 30 | 14084 | 2.95 | 6.63E-07 | 57 | 19994 | 14.99 | 9.99E-07 |
| SPALBB(4) | 18 | 16042 | 2.50 | 9.98E-07 | 20 | 21636 | 16.45 | 7.17E-07 |
| SPALBB(5) | 18 | 18160 | 2.75 | 1.00E-06 | 17 | 26448 | 20.12 | 1.00E-06 |

| $h$ $(n,m)$ | $2^{-7}$ ($n=130{,}432$, $m=16{,}480$) | | | | $2^{-8}$ ($n=518{,}912$, $m=65{,}216$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | 11686.5 | 104.74 | 7.60E-07 | | - | - | - |
| GMRES(20) | | 17521 | 151.67 | 1.00E-06 | | 34452 | 1190.29 | 1.00E-06 |
| GMRES(50) | | 17430 | 237.68 | 1.00E-06 | | 33304 | 1629.56 | 1.00E-06 |
| SPALBB(1) | 8001 | 17667 | 79.71 | 9.98E-07 | 30579 | 43828 | 851.71 | 9.99E-07 |
| SPALBB(2) | 818 | 26085 | 115.24 | 1.00E-06 | 2576 | 28375 | 496.34 | 1.00E-06 |
| SPALBB(3) | 138 | 29649 | 143.63 | 9.98E-07 | 308 | 40365 | 719.53 | 9.99E-07 |
| SPALBB(4) | 31 | 28824 | 133.78 | 6.90E-07 | 67 | 47090 | 870.44 | 9.64E-07 |
| SPALBB(5) | 17 | 40333 | 172.89 | 1.00E-06 | 20 | 48639 | 909.21 | 7.29E-07 |

Table 6: Numerical results for Example 2 with $\nu=0.05$.

| $h$ $(n,m)$ | $2^{-5}$ ($n=8{,}416$, $m=1{,}096$) | | | | $2^{-6}$ ($n=32{,}960$, $m=4{,}208$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | 1127.5 | 0.39 | 7.52E-07 | | 2187.5 | 3.24 | 5.81E-07 |
| GMRES(20) | | 2420 | 2.63 | 9.97E-07 | | 5288 | 10.27 | 1.00E-06 |
| GMRES(50) | | 2686 | 4.79 | 9.99E-07 | | 5168 | 17.39 | 9.99E-07 |
| SPALBB(1) | 616 | 2780 | 0.47 | 9.92E-07 | 1877 | 6454 | 4.69 | 9.99E-07 |
| SPALBB(2) | 90 | 4043 | 0.59 | 9.98E-07 | 222 | 6876 | 4.96 | 9.96E-07 |
| SPALBB(3) | 28 | 3888 | 0.63 | 1.00E-06 | 47 | 6949 | 5.06 | 9.98E-07 |
| SPALBB(4) | 19 | 5246 | 0.79 | 9.97E-07 | 21 | 8136 | 6.06 | 1.00E-06 |
| SPALBB(5) | 18 | 5470 | 0.83 | 9.97E-07 | 18 | 9266 | 6.89 | 9.99E-07 |

| $h$ $(n,m)$ | $2^{-7}$ ($n=130{,}432$, $m=16{,}480$) | | | | $2^{-8}$ ($n=518{,}912$, $m=65{,}216$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | 4516.5 | 39.77 | 9.48E-07 | | 9366.5 | 340.85 | 8.40E-07 |
| GMRES(20) | | 10778 | 89.17 | 9.99E-07 | | 20466 | 711.25 | 1.00E-06 |
| GMRES(50) | | 10134 | 130.59 | 1.00E-06 | | 18845 | 921.70 | 9.99E-07 |
| SPALBB(1) | 6815 | 19080 | 83.73 | 9.98E-07 | 22561 | 57886 | 1903.39 | 1.00E-06 |
| SPALBB(2) | 738 | 11894 | 53.23 | 9.96E-07 | 2657 | 29081 | 536.54 | 9.99E-07 |
| SPALBB(3) | 100 | 12759 | 57.35 | 9.98E-07 | 304 | 24471 | 451.52 | 1.00E-06 |
| SPALBB(4) | 28 | 13047 | 59.88 | 1.00E-06 | 48 | 23291 | 434.16 | 1.00E-06 |
| SPALBB(5) | 18 | 15504 | 71.03 | 1.00E-06 | 21 | 25483 | 469.27 | 9.98E-07 |

Example 3.

We consider the steady-state Navier-Stokes equations (44), where the domain $\Omega$ is obtained from the rectangular region $(-1,5)\times(-1,1)$ by deleting $(-1,0)\times(-1,-1/2)\cup(-1,0)\times(1/2,1)$. This test problem is a model of the flow in a symmetric step channel. A Poiseuille flow profile is imposed on the inflow boundary $\{x=-1;\ -1/2\le y\le 1/2\}$, and a Dirichlet no-flow condition is imposed on the top and bottom walls and on the boundaries of the deleted parts. A Neumann condition is applied at the outflow boundary, which sets the mean outflow pressure to zero.

The discretization of the Navier-Stokes equations (44) is done as in Example 2 with the same settings. In this example, $G$ is UPD and $B$ has full column rank. The numerical results are reported in Tables 7, 8 and 9 and in the left-hand plots of Figure 2. As in Example 2, all choices of $\omega$ solve the problems successfully, and BICGSTAB performs best in the case of $\nu=0.05$. Except for $\nu=0.05$ and $\nu=0.01$ with $h=2^{-6}$, SPALBB requires the least CPU time. Hence, Tables 7, 8 and 9 still demonstrate the efficiency of SPALBB.

Table 7: Numerical results for Example 3 with $\nu=0.005$.

| $h$ $(n,m)$ | $2^{-5}$ ($n=5{,}890$, $m=769$) | | | | $2^{-6}$ ($n=23{,}042$, $m=2{,}945$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | - | - | - | | - | - | - |
| GMRES(20) | | 5662 | 2.23 | 9.98E-07 | | 12340 | 17.73 | 1.00E-06 |
| GMRES(50) | | 3918 | 3.01 | 1.00E-06 | | 10459 | 26.94 | 9.99E-07 |
| SPALBB(1) | 510 | 10007 | 0.96 | 9.96E-07 | 2392 | 13447 | 6.73 | 9.99E-07 |
| SPALBB(2) | 99 | 26993 | 2.44 | 9.44E-07 | 322 | 36764 | 17.71 | 1.00E-06 |
| SPALBB(3) | 23 | 21194 | 1.92 | 1.00E-06 | 54 | 29542 | 14.58 | 9.39E-07 |
| SPALBB(4) | 17 | 34776 | 3.20 | 1.00E-06 | 20 | 36454 | 18.04 | 1.00E-06 |
| SPALBB(5) | 18 | 43570 | 3.94 | 1.00E-06 | 17 | 47559 | 23.47 | 1.00E-06 |

| $h$ $(n,m)$ | $2^{-7}$ ($n=91{,}138$, $m=11{,}521$) | | | | $2^{-8}$ ($n=362{,}498$, $m=45{,}569$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | | - | - | - | | - | - | - |
| GMRES(20) | | 21340 | 119.38 | 1.00E-06 | | 41829 | 954.23 | 1.00E-06 |
| GMRES(50) | | 22164 | 196.79 | 9.99E-07 | | 41230 | 1409.32 | 1.00E-06 |
| SPALBB(1) | 9838 | 20630 | 50.41 | 9.97E-07 | 44486 | 63899 | 834.13 | 1.00E-06 |
| SPALBB(2) | 799 | 41186 | 103.26 | 9.99E-07 | 3052 | 37812 | 490.00 | 9.97E-07 |
| SPALBB(3) | 152 | 51307 | 133.36 | 6.43E-07 | 429 | 72440 | 930.67 | 9.88E-07 |
| SPALBB(4) | 25 | 33030 | 86.80 | 1.00E-06 | 56 | 51184 | 667.27 | 9.53E-07 |
| SPALBB(5) | 16 | 51574 | 139.75 | 1.00E-06 | 16 | 54033 | 707.80 | 1.00E-06 |


Table 8: Numerical results for Example 3 with $\nu = 0.01$.

| $h$ ($n$, $m$) | $2^{-5}$ ($n = 5{,}890$, $m = 769$) | | | | $2^{-6}$ ($n = 23{,}042$, $m = 2{,}945$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | - | | - | - | 4746.5 | | 4.53 | 9.85E-07 |
| GMRES(20) | 5518 | | 2.17 | 1.00E-06 | 10529 | | 14.70 | 1.00E-06 |
| GMRES(50) | 3782 | | 2.87 | 1.00E-06 | 9379 | | 26.42 | 1.00E-06 |
| SPALBB(1) | 584 | 8002 | 0.73 | 9.96E-07 | 2340 | 9722 | 4.70 | 1.00E-06 |
| SPALBB(2) | 106 | 12810 | 1.15 | 9.99E-07 | 297 | 20437 | 9.54 | 8.66E-07 |
| SPALBB(3) | 24 | 10581 | 0.91 | 4.68E-07 | 53 | 17322 | 8.02 | 9.98E-07 |
| SPALBB(4) | 17 | 16101 | 1.42 | 9.99E-07 | 18 | 18021 | 8.45 | 9.99E-07 |
| SPALBB(5) | 18 | 19627 | 1.75 | 1.00E-06 | 17 | 26925 | 13.12 | 1.00E-06 |
| $h$ ($n$, $m$) | $2^{-7}$ ($n = 91{,}138$, $m = 11{,}521$) | | | | $2^{-8}$ ($n = 362{,}498$, $m = 45{,}569$) | | | |
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | 9904.5 | | 48.44 | 9.99E-07 | - | | - | - |
| GMRES(20) | 19010 | | 104.40 | 9.99E-07 | 37453 | | 841.47 | 1.00E-06 |
| GMRES(50) | 19738 | | 187.88 | 1.00E-06 | 37074 | | 1319.69 | 1.00E-06 |
| SPALBB(1) | 9286 | 19210 | 45.65 | 9.98E-07 | 34222 | 47839 | 825.49 | 9.97E-07 |
| SPALBB(2) | 781 | 24365 | 61.89 | 1.00E-06 | 3013 | 28458 | 343.04 | 1.00E-06 |
| SPALBB(3) | 145 | 29784 | 77.31 | 9.74E-07 | 375 | 46354 | 557.08 | 9.99E-07 |
| SPALBB(4) | 25 | 23353 | 62.19 | 9.23E-07 | 60 | 41959 | 500.22 | 9.99E-07 |
| SPALBB(5) | 16 | 39604 | 103.42 | 1.00E-06 | 17 | 41434 | 507.15 | 1.00E-06 |

Table 9: Numerical results for Example 3 with $\nu = 0.05$.

| $h$ ($n$, $m$) | $2^{-5}$ ($n = 5{,}890$, $m = 769$) | | | | $2^{-6}$ ($n = 23{,}042$, $m = 2{,}945$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | 914.5 | | 0.22 | 8.68E-07 | 1888.5 | | 1.89 | 2.44E-07 |
| GMRES(20) | 3139 | | 1.35 | 9.98E-07 | 6106 | | 8.44 | 9.99E-07 |
| GMRES(50) | 2808 | | 2.14 | 9.99E-07 | 6467 | | 16.24 | 9.99E-07 |
| SPALBB(1) | 427 | 2487 | 0.27 | 9.81E-07 | 1449 | 5128 | 2.32 | 1.00E-06 |
| SPALBB(2) | 81 | 3657 | 0.33 | 8.61E-07 | 196 | 6386 | 3.11 | 9.99E-07 |
| SPALBB(3) | 24 | 4135 | 0.37 | 1.00E-06 | 45 | 7116 | 3.23 | 7.23E-07 |
| SPALBB(4) | 19 | 6332 | 0.59 | 1.00E-06 | 19 | 8819 | 4.27 | 9.98E-07 |
| SPALBB(5) | 18 | 7035 | 0.65 | 9.98E-07 | 18 | 11447 | 5.33 | 1.00E-06 |
| $h$ ($n$, $m$) | $2^{-7}$ ($n = 91{,}138$, $m = 11{,}521$) | | | | $2^{-8}$ ($n = 362{,}498$, $m = 45{,}569$) | | | |
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | 3811.5 | | 18.29 | 6.50E-07 | 8153.5 | | 194.33 | 8.93E-07 |
| GMRES(20) | 12413 | | 64.26 | 9.99E-07 | 23112 | | 526.24 | 1.00E-06 |
| GMRES(50) | 11779 | | 99.11 | 9.99E-07 | 21620 | | 1401.43 | 1.00E-06 |
| SPALBB(1) | 4898 | 13457 | 30.89 | 9.98E-07 | 15210 | 36389 | 441.11 | 9.99E-07 |
| SPALBB(2) | 542 | 9821 | 22.50 | 9.97E-07 | 1601 | 21727 | 264.76 | 1.00E-06 |
| SPALBB(3) | 101 | 12949 | 30.46 | 9.98E-07 | 250 | 21972 | 267.31 | 1.00E-06 |
| SPALBB(4) | 27 | 13639 | 33.74 | 6.40E-07 | 46 | 22431 | 274.17 | 9.98E-07 |
| SPALBB(5) | 18 | 17621 | 44.04 | 9.99E-07 | 19 | 26577 | 328.06 | 1.00E-06 |

Figure 2: Evolution of the relative residual of SPALBB tested on Example 3 (left) with $n = 5890$, $m = 769$, and on Example 4 (right) with $n = 12675$, $m = 1089$, and $\omega$ as in (7). Panels (a) and (b): $\nu = 0.005$; panels (c) and (d): $\nu = 0.01$; panels (e) and (f): $\nu = 0.05$.

Example 4.

Fluid flow in $\Omega_f \subset \mathbb{R}^2$ coupled with porous media flow in $\Omega_p \subset \mathbb{R}^2$ is governed by the static Stokes equations

$$-\nu \Delta\, \bm{u}_f + \nabla\, p_f = \bm{f}, \quad \text{and} \quad \operatorname{div}\, \bm{u}_f = 0, \quad \bm{z} \in \Omega_f, \tag{45}$$

where $\Omega_f \cap \Omega_p = \varnothing$ and $\overline{\Omega}_f \cap \overline{\Omega}_p = \Gamma$ with $\Gamma$ being an interface, $\nu > 0$ is the kinematic viscosity, and $\bm{f}$ is the external force. In the porous media region, the governing variable is $\phi = \frac{p_p}{\rho_f g}$, where $p_p$ is the pressure in $\Omega_p$, $\rho_f$ is the fluid density, and $g$ is the acceleration due to gravity. The velocity $\bm{u}_p$ of the porous media flow is related to $\phi$ via Darcy's law and is also divergence free:

$$\bm{u}_p = -\dfrac{\epsilon^2}{r\nu} \nabla \phi \quad \text{and} \quad -\operatorname{div}\, \bm{u}_p = 0, \quad \bm{z} \in \Omega_p, \tag{46}$$

where $r$ is the volumetric porosity and $\epsilon$ is the characteristic length of the porous media.
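
Substituting Darcy's law into the divergence-free condition in (46) eliminates $\bm{u}_p$ and leaves a scalar elliptic equation for $\phi$. The short derivation below is a sketch that assumes $r$, $\nu$ and $\epsilon$ are constant in $\Omega_p$:

```latex
-\operatorname{div}\,\bm{u}_p
  = \operatorname{div}\!\left(\frac{\epsilon^2}{r\nu}\,\nabla\phi\right)
  = \frac{\epsilon^2}{r\nu}\,\Delta\phi
  = 0,
  \qquad \bm{z}\in\Omega_p .
```

For parameter choices of the form $r = 1$ and $\epsilon = \sqrt{0.1\nu}$, the coefficient $\epsilon^2/(r\nu)$ equals $0.1$ for every value of $\nu$.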

In our numerical experiments, the computational domain is $\Omega_f = (0,1) \times (1,2)$, $\Omega_p = (0,1) \times (0,1)$ and the interface is $\Gamma = (0,1) \times \{1\}$. We use a uniform mesh with grid parameters $h = 2^{-5},\, 2^{-6},\, 2^{-7},\, 2^{-8}$ to decompose $\Omega_f$, P2–P1 elements in the fluid region, and P2 Lagrange elements in the porous media region. We set $r = 1$ and $\epsilon = \sqrt{0.1\nu}$, and again test $\nu = 0.005$, $0.01$, $0.05$. Applying finite element discretization to the mixed Stokes-Darcy model (45)–(46) with the Dirichlet no-flow boundary conditions leads to linear systems of form (1) with $G = \begin{pmatrix} G_{11} & G_{12} \\ -G_{12}^T & \nu G_{22} \end{pmatrix}$ [13]. Here $G$ is UPD and $B$ has full column rank.
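
The positive definiteness of $G$ can be read off from its block structure: the off-diagonal blocks $G_{12}$ and $-G_{12}^T$ cancel in the symmetric part, so $x^T G x = x_1^T G_{11} x_1 + \nu\, x_2^T G_{22} x_2$. The following Python sketch checks this numerically on small random blocks. It assumes that $G_{11}$ and $G_{22}$ are symmetric positive definite, which is not stated explicitly above. The helper names `random_spd` and `build_G`, the block sizes, and the random data are hypothetical and unrelated to the finite element matrices used in the experiments.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_spd(k):
    """Random symmetric positive definite k-by-k matrix (illustrative data only)."""
    M = rng.standard_normal((k, k))
    return M @ M.T + k * np.eye(k)


def build_G(G11, G12, G22, nu):
    """Assemble G = [[G11, G12], [-G12^T, nu * G22]]."""
    return np.block([[G11, G12], [-G12.T, nu * G22]])


n1, n2, nu = 6, 4, 0.005          # hypothetical block sizes; nu as in the experiments
G11, G22 = random_spd(n1), random_spd(n2)
G12 = rng.standard_normal((n1, n2))
G = build_G(G11, G12, G22, nu)

sym = 0.5 * (G + G.T)             # symmetric part: blkdiag(G11, nu * G22)
is_unsymmetric = not np.allclose(G, G.T)
min_eig = np.linalg.eigvalsh(sym).min()

print("G unsymmetric:", is_unsymmetric)
print("smallest eigenvalue of the symmetric part is positive:", min_eig > 0)
```

Under the stated assumption, the symmetric part is block diagonal, so its smallest eigenvalue scales with $\nu$ in the second block. This is consistent with $G$ being dominated by its skew-symmetric coupling when $\nu$ is small.
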
The numerical results are reported in Tables 10, 11 and 12 and in the right-hand plots of Figure 2. According to Tables 10, 11 and 12, all methods again perform better for larger $\nu$, and BICGSTAB requires the least CPU time in most cases, while SPALBB is more competitive for smaller $\nu$. For Example 4, SPALBB prefers smaller $\omega$, such as $\omega = 10^{-5}$.

Table 10: Numerical results for Example 4 with $\nu = 0.005$.

| $h$ ($n$, $m$) | $2^{-5}$ ($n = 12{,}675$, $m = 1{,}089$) | | | | $2^{-6}$ ($n = 49{,}923$, $m = 4{,}225$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | 3286.5 | | 0.96 | 9.25E-07 | 7366.5 | | 9.22 | 9.90E-07 |
| GMRES(20) | 7668 | | 5.25 | 1.00E-06 | 21422 | | 50.66 | 1.00E-06 |
| GMRES(50) | 6373 | | 8.74 | 1.00E-06 | 15042 | | 66.02 | 1.00E-06 |
| SPALBB(1) | 2048 | 9709 | 2.51 | 9.99E-07 | 6972 | 29955 | 56.13 | 1.00E-06 |
| SPALBB(2) | 230 | 7387 | 1.13 | 9.98E-07 | 740 | 15065 | 13.15 | 9.98E-07 |
| SPALBB(3) | 38 | 7709 | 1.08 | 9.98E-07 | 91 | 14640 | 9.56 | 1.00E-06 |
| SPALBB(4) | 18 | 8204 | 1.13 | 9.99E-07 | 23 | 13798 | 8.60 | 1.00E-06 |
| SPALBB(5) | 18 | 11310 | 1.77 | 9.99E-07 | 18 | 25189 | 15.77 | 1.00E-06 |
| $h$ ($n$, $m$) | $2^{-7}$ ($n = 198{,}147$, $m = 16{,}641$) | | | | $2^{-8}$ ($n = 789{,}507$, $m = 66{,}049$) | | | |
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | 15174.5 | | 107.22 | 8.93E-07 | 32027.5 | | 829.92 | 9.79E-07 |
| GMRES(20) | 47298 | | 450.74 | 1.00E-06 | 109310 | | 4099.15 | 1.00E-06 |
| GMRES(50) | 40626 | | 605.13 | 1.00E-06 | 81406 | | 5300.94 | 1.00E-06 |
| SPALBB(1) | 23244 | 94747 | 1623.05 | 1.00E-06 | - | - | - | - |
| SPALBB(2) | 2442 | 43102 | 305.56 | 1.00E-06 | - | - | - | - |
| SPALBB(3) | 272 | 30164 | 132.12 | 1.00E-06 | 815 | 66624 | 1503.82 | 9.99E-07 |
| SPALBB(4) | 42 | 27889 | 113.17 | 9.99E-07 | 98 | 56613 | 907.24 | 9.99E-07 |
| SPALBB(5) | 18 | 33198 | 138.39 | 1.00E-06 | 24 | 47077 | 717.58 | 1.00E-06 |

Table 11: Numerical results for Example 4 with $\nu = 0.01$.

| $h$ ($n$, $m$) | $2^{-5}$ ($n = 12{,}675$, $m = 1{,}089$) | | | | $2^{-6}$ ($n = 49{,}923$, $m = 4{,}225$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | 2485.5 | | 0.66 | 9.88E-07 | 4888.5 | | 5.27 | 9.95E-07 |
| GMRES(20) | 5717 | | 3.44 | 9.99E-07 | 14398 | | 29.13 | 9.99E-07 |
| GMRES(50) | 4276 | | 5.27 | 1.00E-06 | 11608 | | 47.31 | 1.00E-06 |
| SPALBB(1) | 2346 | 9426 | 2.59 | 9.99E-07 | 7762 | 30273 | 58.77 | 1.00E-06 |
| SPALBB(2) | 265 | 5132 | 0.78 | 9.99E-07 | 829 | 13805 | 12.72 | 1.00E-06 |
| SPALBB(3) | 43 | 5380 | 0.69 | 9.97E-07 | 104 | 11902 | 7.44 | 9.99E-07 |
| SPALBB(4) | 18 | 6560 | 0.85 | 9.98E-07 | 25 | 11578 | 7.05 | 9.99E-07 |
| SPALBB(5) | 18 | 9092 | 1.14 | 1.00E-06 | 18 | 17747 | 10.45 | 9.98E-07 |
| $h$ ($n$, $m$) | $2^{-7}$ ($n = 198{,}147$, $m = 16{,}641$) | | | | $2^{-8}$ ($n = 789{,}507$, $m = 66{,}049$) | | | |
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | 10216 | | 70.41 | 9.78E-07 | 20513.5 | | 579.78 | 9.94E-07 |
| GMRES(20) | 33336 | | 326.81 | 1.00E-06 | 58210 | | 6339.92 | 1.00E-06 |
| GMRES(50) | 28362 | | 438.14 | 1.00E-06 | 48116 | | 3648.93 | 1.00E-06 |
| SPALBB(1) | 25411 | 97928 | 1769.82 | 1.00E-06 | - | - | - | - |
| SPALBB(2) | 2680 | 42275 | 318.61 | 9.99E-07 | - | - | - | - |
| SPALBB(3) | 295 | 23860 | 111.65 | 9.97E-07 | 877 | 61065 | 1477.52 | 9.98E-07 |
| SPALBB(4) | 46 | 22297 | 93.48 | 9.98E-07 | 109 | 43631 | 703.09 | 1.00E-06 |
| SPALBB(5) | 19 | 21946 | 92.28 | 1.00E-06 | 26 | 37877 | 583.34 | 9.99E-07 |

Table 12: Numerical results for Example 4 with $\nu = 0.05$.

| $h$ ($n$, $m$) | $2^{-5}$ ($n = 12{,}675$, $m = 1{,}089$) | | | | $2^{-6}$ ($n = 49{,}923$, $m = 4{,}225$) | | | |
|---|---|---|---|---|---|---|---|---|
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | 883.5 | | 0.22 | 9.94E-07 | 1738.5 | | 1.57 | 9.55E-07 |
| GMRES(20) | 2589 | | 2.22 | 9.98E-07 | 5115 | | 9.27 | 9.99E-07 |
| GMRES(50) | 2509 | | 3.85 | 1.00E-06 | 4542 | | 16.25 | 1.00E-06 |
| SPALBB(1) | 3077 | 10839 | 2.94 | 9.97E-07 | 10033 | 35063 | 68.21 | 1.00E-06 |
| SPALBB(2) | 346 | 4649 | 0.70 | 9.99E-07 | 1090 | 14684 | 12.95 | 9.98E-07 |
| SPALBB(3) | 54 | 3899 | 0.46 | 9.98E-07 | 131 | 7892 | 4.30 | 9.94E-07 |
| SPALBB(4) | 21 | 3344 | 0.38 | 9.99E-07 | 29 | 6819 | 3.38 | 9.98E-07 |
| SPALBB(5) | 18 | 3263 | 0.37 | 9.97E-07 | 18 | 6022 | 2.95 | 1.00E-06 |
| $h$ ($n$, $m$) | $2^{-7}$ ($n = 198{,}147$, $m = 16{,}641$) | | | | $2^{-8}$ ($n = 789{,}507$, $m = 66{,}049$) | | | |
| | Oiter | Titer | CPU | RES | Oiter | Titer | CPU | RES |
| BICGSTAB | 3512.5 | | 23.21 | 9.92E-07 | 7305.5 | | 230.70 | 9.72E-07 |
| GMRES(20) | 10545 | | 107.69 | 1.00E-06 | 30657 | | 1438.72 | 1.00E-06 |
| GMRES(50) | 8036 | | 164.70 | 1.00E-06 | 14974 | | 1230.40 | 1.00E-06 |
| SPALBB(1) | - | - | - | - | - | - | - | - |
| SPALBB(2) | 3464 | 46071 | 375.30 | 9.98E-07 | - | - | - | - |
| SPALBB(3) | 373 | 21106 | 102.87 | 9.97E-07 | 1105 | 60707 | 1291.67 | 9.96E-07 |
| SPALBB(4) | 55 | 13647 | 56.56 | 9.99E-07 | 126 | 26414 | 469.18 | 1.00E-06 |
| SPALBB(5) | 20 | 11646 | 48.65 | 1.00E-06 | 28 | 21844 | 341.45 | 9.99E-07 |

In conclusion, Tables 1–12 and Figures 1 and 2 illustrate that SPALBB is a practical method whose advantages increase with problem size. SPALBB and GMRES are more robust than BICGSTAB. Unlike GMRES, SPALBB has constant storage. In terms of CPU time, SPALBB is more efficient than GMRES. We see from Tables 1–9 that the advantages of SPALBB are more obvious for smaller $\nu$, i.e., for more unsymmetric $G$. Figures 1 and 2 indicate that the convergence rate of SPALBB depends strongly on $\omega$. For larger $\omega$, the nonmonotonicity of $\|r_k\|$ in SPALBB becomes more pronounced. This strongly nonmonotone behavior is similar to that of the BB method [40].
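
The nonmonotone residual behavior noted in the preceding paragraph is characteristic of BB-type step sizes. As a minimal sketch, the Python code below runs the classical BB1 gradient iteration of Barzilai and Borwein on a small symmetric positive definite system and records the relative residual history. This is a simplification: it is not SPALBB, it does not treat the unsymmetric systems solved in the experiments, and it is not the authors' Matlab implementation. The function name `bb1_solve`, the diagonal test matrix, and the tolerance are illustrative assumptions.

```python
import numpy as np


def bb1_solve(A, b, x0=None, tol=1e-6, max_iter=10_000):
    """BB1 gradient iteration for A x = b with A symmetric positive definite.

    Returns the final iterate and the history of ||A x_k - b|| / ||b||.
    Assumes b is nonzero.
    """
    x = np.zeros_like(b, dtype=float) if x0 is None else x0.astype(float)
    g = A @ x - b                        # gradient of 0.5 x^T A x - b^T x
    alpha = (g @ g) / (g @ (A @ g))      # first step: exact line search
    b_norm = np.linalg.norm(b)
    history = [np.linalg.norm(g) / b_norm]
    for _ in range(max_iter):
        if history[-1] <= tol:
            break
        x_new = x - alpha * g
        g_new = A @ x_new - b
        s, y = x_new - x, g_new - g      # y = A s, so s @ y > 0 for SPD A
        alpha = (s @ s) / (s @ y)        # BB1 step size for the next iteration
        x, g = x_new, g_new
        history.append(np.linalg.norm(g) / b_norm)
    return x, np.array(history)


# Illustrative test problem (assumed data): condition number 100.
A = np.diag(np.linspace(1.0, 100.0, 20))
b = np.ones(20)
x, history = bb1_solve(A, b)

increases = int(np.sum(np.diff(history) > 0))
print(f"iterations: {len(history) - 1}, residual increases: {increases}")
```

The iteration stores only a fixed number of vectors regardless of the iteration count. Counting the positive entries of `np.diff(history)` shows how often the residual norm grows from one iteration to the next.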

5 Conclusions

We presented a theoretical and numerical study of the augmented Lagrangian (SPAL) algorithm and its inexact version for solving unsymmetric saddle-point systems. Specifically, we used a gradient method, known as the Barzilai-Borwein (BB) method, to solve the linear system in SPAL inexactly and proposed the augmented Lagrangian BB (SPALBB) algorithm. The numerical results presented for SPALBB are highly encouraging. SPALBB often requires the least CPU time, and its advantages are especially clear for larger problems. Developing practical methods for choosing $\omega$ and $Q$ to balance the inner and outer iterations is a topic for future research.

Acknowledgments

We thank our colleague and friend, Prof Dr Oleg Burdakov, for his devotion to this research. In particular, we express our gratitude to him for fundamental contributions that initiated this work. Oleg developed the SPALBB algorithm, proposed the counter-example to show that the BB1 method may be divergent, and gave many constructive suggestions on our Matlab implementation of SPALBB.

References

  • Arrow et al. [1958] K. J. Arrow, L. Hurwicz, H. Uzawa, and H. B. Chenery. Studies in Linear and Non-linear Programming, volume 2. Stanford University Press, 1958.
  • Awanou and Lai [2005a] G. Awanou and M. J. Lai. On convergence rate of the augmented Lagrangian algorithm for nonsymmetric saddle point problems. Appl. Numer. Math., 54(2):122–134, 2005a.
  • Awanou and Lai [2005b] G. Awanou and M. J. Lai. Trivariate spline approximations of 3D Navier–Stokes equations. Math. Comp., 74(250):585–601, 2005b.
  • Bai and Benzi [2017] Z. Z. Bai and M. Benzi. Regularized HSS iteration methods for saddle-point linear systems. BIT Numer. Math., 57(2):287–311, 2017.
  • Barzilai and Borwein [1988] J. Barzilai and J. M. Borwein. Two-point step size gradient methods. IMA J. Numer. Anal., 8(1):141–148, 1988.
  • Benzi and Golub [2004] M. Benzi and G. H. Golub. A preconditioner for generalized saddle point problems. SIAM J. Matrix Anal. Appl., 26(1):20–41, 2004.
  • Benzi and Wathen [2008] M. Benzi and A. J. Wathen. Some preconditioning techniques for saddle point problems. Model order reduction: theory, research aspects and applications, pages 195–211, 2008.
  • Benzi et al. [2005] M. Benzi, G. H. Golub, and J. Liesen. Numerical solution of saddle point problems. Acta Numerica, 14(2):1–137, 2005.
  • Berman and Plemmons [1994] A. Berman and R. J. Plemmons. Nonnegative Matrices in the Mathematical Sciences. SIAM, 1994.
  • Bertsekas [2014] D. P. Bertsekas. Constrained Optimization and Lagrange Multiplier Methods. Academic Press, 2014.
  • Birgin and Martínez [2014] E. G. Birgin and J. M. Martínez. Practical Augmented Lagrangian Methods for Constrained Optimization. SIAM, 2014.
  • Burdakov et al. [2019] O. Burdakov, Y. H. Dai, and N. Huang. Stabilized Barzilai-Borwein method. J. Comput. Math., 37(6):916–936, 2019.
  • Cai et al. [2009] M. C. Cai, M. Mu, and J. C. Xu. Preconditioning techniques for a mixed Stokes/Darcy model in porous media applications. J. Comput. Appl. Math., 233(2):346–355, 2009.
  • Campbell and Meyer [2009] S. L. Campbell and C. D. Meyer. Generalized Inverses of Linear Transformations. SIAM, 2009.
  • Cao and Miao [2016] Y. Cao and S. X. Miao. On semi-convergence of the generalized shift-splitting iteration method for singular nonsymmetric saddle point problems. Comput. Math. Appl., 71(7):1503–1511, 2016.
  • Cheng [2000] X. L. Cheng. On the nonlinear inexact Uzawa algorithm for saddle-point problems. SIAM J. Numer. Anal., 37(6):1930–1934, 2000.
  • Dai and Liao [2002] Y. H. Dai and L. Z. Liao. R-linear convergence of the Barzilai and Borwein gradient method. IMA J. Numer. Anal., 22(1):1–10, 2002.
  • Dai et al. [2005] Y. H. Dai, L. Z. Liao, and D. Li. An analysis of Barzilai-Borwein gradient method for unsymmetric linear equations. Optim. Control Appl., pages 183–211, 2005.
  • Dai et al. [2006] Y. H. Dai, W. W. Hager, K. Schittkowski, and H. C. Zhang. The cyclic Barzilai-Borwein method for unconstrained optimization. IMA J. Numer. Anal., 26(3):604–627, 2006.
  • Di Serafino and Orban [2021] D. Di Serafino and D. Orban. Constraint-preconditioned Krylov solvers for regularized saddle-point systems. SIAM J. Sci. Comput., 43(2):A1001–A1026, 2021.
  • Dollar et al. [2010] H. S. Dollar, N. I. Gould, M. Stoll, and A. J. Wathen. Preconditioning saddle-point systems with applications in optimization. SIAM J. Sci. Comput., 32(1):249–270, 2010.
  • Elman [1999] H. C. Elman. Preconditioning for the steady-state Navier–Stokes equations with low viscosity. SIAM Journal on Scientific Computing, 20(4):1299–1316, 1999.
  • Elman et al. [2007] H. C. Elman, A. Ramage, and D. J. Silvester. Algorithm 866: IFISS, a Matlab toolbox for modelling incompressible flow. ACM Trans. Math. Softw., 33(2):14–es, 2007.
  • Friedlander et al. [1998] A. Friedlander, J. M. Martínez, B. Molina, and M. Raydan. Gradient method with retards and generalizations. SIAM J. Numer. Anal., 36(1):275–289, 1998.
  • Ghannad et al. [2022] A. Ghannad, D. Orban, and M. A. Saunders. Linear systems arising in interior methods for convex optimization: a symmetric formulation with bounded condition number. Optim. Method Softw., 37(4):1344–1369, 2022.
  • Glowinski and Le Tallec [1989] R. Glowinski and P. Le Tallec. Augmented Lagrangian and Operator-splitting Methods in Nonlinear Mechanics. SIAM, 1989.
  • Golub and Greif [2003] G. H. Golub and C. Greif. On solving block-structured indefinite linear systems. SIAM J. Sci. Comput., 24(6):2076–2092, 2003.
  • Golub et al. [2005] G. H. Golub, C. Greif, and J. M. Varah. An algebraic analysis of a block diagonal preconditioner for saddle point systems. SIAM J. Matrix Anal. Appl., 27(3):779–792, 2005.
  • Gould et al. [2014] N. Gould, D. Orban, and T. Rees. Projected Krylov methods for saddle-point systems. SIAM J. Matrix Anal. Appl., 35(4):1329–1343, 2014.
  • Hu and Zou [2006] Q. Hu and J. Zou. Nonlinear inexact Uzawa algorithms for linear and nonlinear saddle-point problems. SIAM J. Optim., 16(3):798–825, 2006.
  • Kozjakin and Krasnosel’ski [1982] V. Kozjakin and M. Krasnosel’ski. Some remarks on the method of minimal residues. Numer. Funct. Anal. Optim., 4(3):211–239, 1982.
  • Krasnosel’skii and Krein [1952] M. A. Krasnosel’skii and S. G. Krein. An iteration process with minimal residuals. Matematicheskii Sbornik, 73(2):315–334, 1952.
  • Lu and Zhang [2010] J. Lu and Z. Zhang. A modified nonlinear inexact Uzawa algorithm with a variable relaxation parameter for the stabilized saddle point problem. SIAM J. Matrix Anal. Appl., 31(4):1934–1957, 2010.
  • Molina and Raydan [1996] B. Molina and M. Raydan. Preconditioned Barzilai-Borwein method for the numerical solution of partial differential equations. Numer. Algor., 13:45–60, 1996.
  • Montoison and Orban [2023] A. Montoison and D. Orban. GPMR: An iterative method for unsymmetric partitioned linear systems. SIAM J. Matrix Anal. Appl., 44(1):293–311, 2023.
  • Orban and Arioli [2017] D. Orban and M. Arioli. Full-Space Iterative Methods. In Iterative Solution of Symmetric Quasi-definite Linear Systems, chapter 6, pages 63–72. SIAM, 2017.
  • Pestana and Rees [2016] J. Pestana and T. Rees. Null-space preconditioners for saddle point systems. SIAM J. Matrix Anal. Appl., 37(3):1103–1128, 2016.
  • Ramage and Gartland Jr [2013] A. Ramage and E. C. Gartland Jr. A preconditioned nullspace method for liquid crystal director modeling. SIAM J. Sci. Comput., 35(1):B226–B247, 2013.
  • Raydan [1993] M. Raydan. On the Barzilai and Borwein choice of steplength for the gradient method. IMA J. Numer. Anal., 13(3):321–326, 1993.
  • Raydan [1997] M. Raydan. The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem. SIAM J. Optim., 7(1):26–33, 1997.
  • Rozlozník and Simoncini [2002] M. Rozlozník and V. Simoncini. Krylov subspace methods for saddle point problems with indefinite preconditioning. SIAM J. Matrix Anal. Appl., 24(2):368–391, 2002.
  • Saad [2003] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, 2003.
  • Saad and Schultz [1986] Y. Saad and M. H. Schultz. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. and Statist. Comput., 7(3):856–869, 1986.
  • Scott and Tuma [2020] J. Scott and M. Tuma. A null-space approach for symmetric saddle point systems with a non zero (2, 2) block. SIAM J. Sci. Comput., 2020.
  • Scott and Tuma [2022] J. Scott and M. Tuma. A null-space approach for large-scale symmetric saddle point systems with a small and non zero (2, 2) block. Numer. Algor., 90(4):1639–1667, 2022.
  • Silvester et al. [2023] D. Silvester, H. Elman, and A. Ramage. Incompressible Flow & Iterative Solver Software. https://personalpages.manchester.ac.uk/staff/david.silvester/ifiss/, 2023.
  • Van der Vorst [1992] H. A. Van der Vorst. Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems. SIAM J. Sci. and Statist. Comput., 13(2):631–644, 1992.
  • Wright [1997] S. J. Wright. Primal-dual Interior-point Methods. SIAM, 1997.
  • Zhang and Wei [2010] N. Zhang and Y. M. Wei. On the convergence of general stationary iterative methods for range-Hermitian singular linear systems. Numer. Linear Algebra Appl., 17:139–154, 2010.
  • Zheng et al. [2009] B. Zheng, Z. Z. Bai, and X. Yang. On semi-convergence of parameterized Uzawa methods for singular saddle point problems. Linear Algebra Appl., 431(5-7):808–817, 2009.
  • Zou and Magoulès [2022] Q. M. Zou and F. Magoulès. Delayed gradient methods for symmetric and positive definite linear systems. SIAM Rev., 64(3):517–553, 2022.
  • Zulehner [2002] W. Zulehner. Analysis of iterative methods for saddle point problems: a unified approach. Math. Comp., 71(238):479–505, 2002.