1 Introduction
We consider the unsymmetric saddle-point system
\[
\begin{pmatrix} G & B \\ -B^{T} & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} f \\ g \end{pmatrix},
\tag{1}
\]
where $B \in \mathbb{R}^{n\times m}$ $(n \geq m)$, and $G \in \mathbb{R}^{n\times n}$ is positive definite on the nullspace of $B^{T}$ but may be unsymmetric and/or singular. Thus, $x^{T}Gx > 0$ for all nonzero $x \in \mathrm{Null}(B^{T})$. The change of sign in the second block-row of (1) makes the matrix semipositive real and positive semistable if $G$ is positive semidefinite [6]. Linear systems like (1) arise from certain discretizations of the Navier-Stokes equations [23], mixed and mixed-hybrid finite element approximation of the liquid crystal director model [38] and coupled Stokes-Darcy flow [13], and within interior methods for constrained optimization [25, 48]. System (1) is nonsingular if and only if $B$ has full column rank [8]. When $B$ corresponds to a discretized gradient operator, as for example in the Navier-Stokes equations [23, 28], $B$ is column rank deficient and (1) is singular.
Iterative methods for solving saddle-point systems have been studied for decades. They include stationary iterations [4, 8, 52], nonlinear inexact Uzawa methods [16, 30, 33], nullspace methods [37, 44, 45], Krylov subspace methods [20, 29, 35, 36], and preconditioning techniques [7, 8, 21, 41]. Some stationary iterative methods and their semi-convergence have been studied for singular cases [15, 49, 50].
Let $Q \in \mathbb{R}^{m\times m}$ be symmetric and positive definite (SPD). If we premultiply the second block-row of (1) by $-BQ^{-1}$ and add the result to the first block equation, we find that (1) is equivalent to
\[
\begin{pmatrix} G + BQ^{-1}B^{T} & B \\ -B^{T} & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} f - BQ^{-1}g \\ g \end{pmatrix}.
\tag{2}
\]
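The equivalence of (1) and (2) can be checked numerically. The following is a minimal sketch, assuming NumPy and small dense matrices; the function names and the test data are hypothetical and chosen only for illustration.

```python
import numpy as np

def solve_system_1(G, B, f, g):
    """Solve (1) directly: [[G, B], [-B^T, 0]] (x, y) = (f, g)."""
    m = B.shape[1]
    A = np.block([[G, B], [-B.T, np.zeros((m, m))]])
    return np.linalg.solve(A, np.concatenate([f, g]))

def solve_system_2(G, B, f, g, Q):
    """Solve the augmented form (2) for an SPD matrix Q."""
    m = B.shape[1]
    BQinv = np.linalg.solve(Q, B.T).T  # equals B Q^{-1} because Q is symmetric
    A_aug = np.block([[G + BQinv @ B.T, B], [-B.T, np.zeros((m, m))]])
    rhs = np.concatenate([f - BQinv @ g, g])
    return np.linalg.solve(A_aug, rhs)

# Hypothetical data: G is unsymmetric with symmetric part diag(4, 3, 2).
G = np.array([[4.0, 1.0, 0.0], [-1.0, 3.0, 1.0], [0.0, -1.0, 2.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # full column rank
f = np.array([1.0, 2.0, 3.0])
g = np.array([1.0, -1.0])
Q = np.diag([2.0, 1.0])

z1 = solve_system_1(G, B, f, g)
z2 = solve_system_2(G, B, f, g, Q)
print(np.max(np.abs(z1 - z2)))  # difference between the two solutions
```

Both solves should return the same $(x, y)$ up to rounding, whatever SPD $Q$ is supplied, since (2) is obtained from (1) by a single block-row operation.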
Golub and Greif [27] and Golub et al. [28] showed that methods based on (2) may have advantages. Indeed, even if $G$ is singular or ill-conditioned, the $(1,1)$ block in (2) can be made nonsingular, positive definite or well-conditioned with suitable selections of $Q$. When $G$ is symmetric, the symmetric form
\[
T(Q) := \begin{pmatrix} G + BQ^{-1}B^{T} & B \\ B^{T} & 0 \end{pmatrix}
\]
of (2) is typically preferred.
Golub and Greif [27] mainly consider the specific case $Q = \gamma I$, where $\gamma > 0$ is constant and $I$ is the identity matrix. They provide analytical observations on the spectrum of $T(\gamma I)$ and show that there is a range of values of $\gamma$ that will improve the condition number of $T(\gamma I)$, as well as the condition number of its $(1,1)$ block and the associated Schur complement. In particular, $\gamma = \|B\|^{2}/\|G\|$ may often force the norm of the added term $\frac{1}{\gamma}BB^{T}$ to be of the same magnitude as the norm of $G$. Golub et al. [28] experimentally observe that this special choice is typically effective. Apart from the form of (2), they also show that when $G$ is symmetric positive semidefinite of nullity $1$, an effective approach to maintaining sparsity is to choose the augmented term as $\tau bb^{T}$, where $b$ is a known vector not orthogonal to the nullspace of $G$, and $\tau > 0$ is a constant that approximately minimizes the condition number of $G + \tau bb^{T}$.
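As a small illustration of the scaling $\gamma = \|B\|^{2}/\|G\|$, the sketch below evaluates it in the 2-norm. The helper name `golub_greif_gamma` and the data are assumptions made for this example only. In the 2-norm, $\|BB^{T}\| = \|B\|^{2}$, so the added term then has exactly the norm of $G$; in other norms the two are only of comparable size.

```python
import numpy as np

def golub_greif_gamma(G, B):
    """Return gamma = ||B||^2 / ||G||, with both norms taken as 2-norms."""
    return np.linalg.norm(B, 2) ** 2 / np.linalg.norm(G, 2)

# Hypothetical data, not taken from the paper.
G = np.array([[4.0, 1.0, 0.0], [-1.0, 3.0, 1.0], [0.0, -1.0, 2.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

gamma = golub_greif_gamma(G, B)
added_term = B @ B.T / gamma
# Compare the norm of the added term with the norm of G.
print(np.linalg.norm(added_term, 2), np.linalg.norm(G, 2))
```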
The approach of replacing (1) by (2) can be regarded as an augmented Lagrangian (SPAL) method, also called the method of multipliers [8, 27, 28]. For an extensive overview of the augmented Lagrangian approach and its applications, we refer to [10, 11]. Awanou and Lai [3] apply the Uzawa method [1] to (2) with $Q = \gamma I$ and propose the following SPAL (with $k = 0, 1, 2, \dots$ and $y_{0}$ assumed given):
\[
\left\{
\begin{array}{l}
(G + \frac{1}{\gamma}BB^{T})\,x_{k} = f - \frac{1}{\gamma}Bg - By_{k}, \\
y_{k+1} = y_{k} + \frac{1}{\gamma}(B^{T}x_{k} + g).
\end{array}
\right.
\]
By introducing another parameter $\rho$, Awanou and Lai [2] further generalize SPAL as
\[
\left\{
\begin{array}{l}
(G + \frac{1}{\gamma}BQ^{-1}B^{T})\,x_{k} = f - \frac{1}{\gamma}BQ^{-1}g - By_{k}, \\
y_{k+1} = y_{k} + \frac{1}{\rho}Q^{-1}(B^{T}x_{k} + g),
\end{array}
\right.
\tag{3}
\]
and give a first convergence analysis for the case of unsymmetric $G$. They say that the proofs in [26] using spectral arguments cannot be extended to the nonsymmetric case. Under the assumptions that $x^{T}Gx \geq 0$ for all $x$ and that $x^{T}Gx = 0$ with $B^{T}x = 0$ implies $x = 0$, they verify convergence by proving that $\|y_{k+1} - y_{*}\|_{Q} \leq \|y_{k} - y_{*}\|_{Q}$ and then that $x_{k}$ converges to $x_{*}$, where $(x_{*}, y_{*})$ is the exact solution of (1). Awanou and Lai [2] also say that their numerical experiments for an inexact Uzawa algorithm applied to (2) do not illustrate convergence. However, we have not been able to find their implementation of the inexact version and the numerical results.
We focus here on the inexact SPAL. Based on a simple splitting of the matrix in (1), we propose a stationary iterative method that is theoretically equivalent to (3) when $\gamma = \rho$. Hence, we also call it SPAL. We derive its convergence and semi-convergence for $B$ of any rank based on spectral arguments (unlike [2]) and obtain an explicit range of convergence for the parameter in SPAL. We allow $G$ here to be indefinite.
Our SPAL requires an exact solution of a linear system at each step. To improve efficiency, we propose an inexact SPAL in which the linear system is solved inexactly. We show that it converges to the solution of (1) under reasonable conditions. Gradient methods are a class of simple optimization approaches that use the negative gradient of the objective function as a search direction. The Barzilai-Borwein (BB) method [5] is a gradient method for unconstrained optimization and has proved to be efficient for large, sparse unconstrained convex quadratic programs, which are equivalent to SPD linear systems.
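To make the BB idea concrete, here is a minimal sketch of a BB gradient iteration for an SPD system $Ax = b$, viewed as minimizing $\tfrac12 x^{T}Ax - b^{T}x$. It is an illustration under stated assumptions, not the algorithm analyzed in this paper: it covers only the SPD case, uses the step $\alpha_k = s^{T}s / s^{T}y$, starts with one exact steepest-descent step, and stops on a relative residual test. The function name `bb_solve` and the data are hypothetical.

```python
import numpy as np

def bb_solve(A, b, x0=None, tol=1e-10, max_iter=1000):
    """Barzilai-Borwein gradient iteration for SPD A (solves A x = b)."""
    x = np.zeros_like(b, dtype=float) if x0 is None else np.array(x0, dtype=float)
    r = A @ x - b                       # gradient of 0.5 x^T A x - b^T x
    stop = tol * max(1.0, np.linalg.norm(b))
    if np.linalg.norm(r) <= stop:
        return x, 0
    alpha = (r @ r) / (r @ (A @ r))     # first step: exact steepest descent
    for k in range(1, max_iter + 1):
        x_new = x - alpha * r
        r_new = A @ x_new - b
        if np.linalg.norm(r_new) <= stop:
            return x_new, k
        s, y = x_new - x, r_new - r     # y = A s, so s^T y > 0 for SPD A
        alpha = (s @ s) / (s @ y)       # BB step length
        x, r = x_new, r_new
    return x, max_iter

# Hypothetical SPD test problem.
A_spd = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
b_spd = np.array([1.0, 2.0, 3.0])
x_bb, iters = bb_solve(A_spd, b_spd)
print(iters, np.linalg.norm(A_spd @ x_bb - b_spd))
```

Each iteration costs one matrix-vector product and needs no line search, which is what makes the method attractive for large sparse systems. The residual norm is not monotone from step to step.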
When $G$ is unsymmetric positive definite (UPD), the linear system (7) in SPAL is UPD as well. We use the BB method to solve this UPD linear system inexactly. We call the resulting method the augmented Lagrangian BB (SPALBB) algorithm and establish its convergence under suitable assumptions. Numerical experiments on linear systems from Navier-Stokes equations and coupled Stokes-Darcy flow show that SPALBB often solves problems more efficiently than GMRES [43] and BICGSTAB [47].
The paper is organized as follows. In Section 2, we introduce the augmented Lagrangian algorithm. Its convergence and semi-convergence are established in Sections 2.1 and 2.2. The inexact SPAL and its convergence analysis are provided in Section 3. The augmented Lagrangian BB algorithm is presented in Section 3.3. Numerical experiments are reported in Section 4. Conclusions appear in Section 5.
Notation
For any $H \in \mathbb{R}^{n\times n}$, we write its inverse, transpose, spectral set, nullspace and range space as $H^{-1}$, $H^{T}$, $\mathrm{sp}(H)$, $\mathrm{Null}(H)$, and $\mathrm{Range}(H)$. For any $x \in \mathbb{C}^{n}$, we write its conjugate transpose as $x^{*}$. For symmetric $H$, $\lambda_{\min}(H)$ and $\lambda_{\max}(H)$ denote the minimum and maximum eigenvalues. $\|\cdot\|$ denotes the $2$-norm of a vector or matrix. For an $n\times n$ SPD matrix $G$, $\|x\|_{G} = \sqrt{\langle Gx, x\rangle} = \|G^{\frac{1}{2}}x\|$ for all $x \in \mathbb{R}^{n}$, and $\|H\|_{G} = \sup_{x\neq 0} \frac{\|Hx\|_{G}}{\|x\|_{G}} = \|G^{\frac{1}{2}}HG^{-\frac{1}{2}}\|$ for all $H \in \mathbb{R}^{n\times n}$. For simplicity, the column vector $(x^{T}\ y^{T})^{T}$ is written $(x, y)$, $a_{+} := \max\{0, a\}$, and $1/0 := +\infty$.
2 Augmented Lagrangian algorithm
We present SPAL for solving the unsymmetric saddle-point system (1).
Let $Q$ be an SPD matrix and $\omega > 0$. Since
\[
A := \begin{pmatrix} G & B \\ -B^{T} & 0 \end{pmatrix}
= \begin{pmatrix} G & B \\ -B^{T} & \omega Q \end{pmatrix}
- \begin{pmatrix} 0 & 0 \\ 0 & \omega Q \end{pmatrix},
\tag{4}
\]
the saddle-point system (1) is equivalent to
\[
\begin{pmatrix} G & B \\ -B^{T} & \omega Q \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} f \\ \omega Qy + g \end{pmatrix}.
\]
This suggests Algorithm 1 for solving system (1).
Lemma 2.2 shows that it is always possible to choose $Q$ and $\omega$ such that (7) is nonsingular, even if $A$ is singular.
If $G$ is symmetric, (1) is equivalent to the constrained optimization problem
\[
\min_{x}\ \tfrac{1}{2}x^{T}Gx - f^{T}x \quad \mathrm{s.t.} \quad g + B^{T}x = 0.
\tag{5}
\]
The $k$-th step of the augmented Lagrangian algorithm for (5) solves the subproblem
\[
\min_{x}\ \tfrac{1}{2}x^{T}Gx - f^{T}x + \frac{1}{2\omega}\left\| g + B^{T}x + \omega Qy_{k} \right\|_{Q^{-1}}^{2},
\tag{6}
\]
Algorithm 1 The augmented Lagrangian algorithm SPAL for solving (1)
1: Given $y_{0} \in \mathbb{R}^{m}$, $\omega > 0$, and SPD $Q \in \mathbb{R}^{m\times m}$, set $k = 0$.
2: while a stopping condition is not satisfied do
3:   Compute $(x_{k+1}, y_{k+1})$ according to the iteration
\[
\begin{pmatrix} G & B \\ -B^{T} & \omega Q \end{pmatrix}
\begin{pmatrix} x_{k+1} \\ y_{k+1} \end{pmatrix}
= \begin{pmatrix} f \\ \omega Qy_{k} + g \end{pmatrix}.
\tag{7}
\]
4:   Increment $k$ by $1$.
5: end while
where $y_{k}$ is an estimate of the Lagrange multiplier. The optimal solution $x_{k+1}$ of (6) satisfies
\[
(G + \frac{1}{\omega}BQ^{-1}B^{T})\,x_{k+1} + By_{k} = f - \frac{1}{\omega}BQ^{-1}g.
\tag{8}
\]
The multiplier is updated as
\[
y_{k+1} = \frac{1}{\omega}Q^{-1}(g + B^{T}x_{k+1} + \omega Qy_{k}) = y_{k} + \frac{1}{\omega}Q^{-1}(B^{T}x_{k+1} + g).
\tag{9}
\]
Note that (7) also gives (8)–(9): the second block-row of (7) is (9), and substituting (9) into the first block-row of (7) gives (8). Hence, we also call Algorithm 1 an augmented Lagrangian algorithm. Clearly, Algorithm 1 is theoretically equivalent to (3) if $\gamma = \rho = \omega$. When $G$ is symmetric, the convergence of SPAL or its variants has been studied in [26]. Awanou and Lai [2] first gave convergence results for (3) when $G$ is unsymmetric positive semi-definite but positive definite on $\mathrm{Null}(B^{T})$, based on analyzing the error $\|y_{k} - y_{*}\|_{Q}$, where $(x_{*}, y_{*})$ is the exact solution of (1). Here we give the convergence analysis of SPAL in a different way, based on the spectral properties of $T$ in (15) below. We derive the explicit range of convergence for $\omega$ and do not require $G$ to be positive semi-definite.
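The iteration of Algorithm 1 can be sketched in its reduced form (8)–(9) as follows. This is a minimal dense-matrix illustration with exact inner solves. The function name `spal`, the stopping rule on the constraint residual $\|B^{T}x_{k+1} + g\|$, and the test data are assumptions made for this example, since no particular stopping condition is fixed above.

```python
import numpy as np

def spal(G, B, f, g, Q, omega, y0=None, tol=1e-10, max_iter=500):
    """Algorithm 1 via (8)-(9), solving the x-system exactly at each step."""
    m = B.shape[1]
    y = np.zeros(m) if y0 is None else np.array(y0, dtype=float)
    QinvBt = np.linalg.solve(Q, B.T)           # Q^{-1} B^T
    Qinv_g = np.linalg.solve(Q, g)             # Q^{-1} g
    S_Q = G + (B @ QinvBt) / omega             # G + (1/omega) B Q^{-1} B^T
    rhs_fixed = f - (B @ Qinv_g) / omega       # f - (1/omega) B Q^{-1} g
    for k in range(1, max_iter + 1):
        x = np.linalg.solve(S_Q, rhs_fixed - B @ y)    # equation (8)
        y = y + (QinvBt @ x + Qinv_g) / omega          # equation (9)
        # Assumed stopping rule: residual of the second block-row of (1).
        if np.linalg.norm(B.T @ x + g) <= tol:
            break
    return x, y, k

# Hypothetical data: G is unsymmetric positive definite, B has full column rank.
G = np.array([[4.0, 1.0, 0.0], [-1.0, 3.0, 1.0], [0.0, -1.0, 2.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
f = np.array([1.0, 2.0, 3.0])
g = np.array([1.0, -1.0])

x_spal, y_spal, n_steps = spal(G, B, f, g, Q=np.eye(2), omega=0.5)
A = np.block([[G, B], [-B.T, np.zeros((2, 2))]])
z_direct = np.linalg.solve(A, np.concatenate([f, g]))
print(n_steps, np.linalg.norm(np.concatenate([x_spal, y_spal]) - z_direct))
```

Only the right-hand side of (8) changes between steps, so in practice $S_Q$ would be factorized once and the factors reused. The sketch calls a dense solver each time to stay short.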
We call $A = M - N$ a splitting if $M$ is nonsingular.
Defining $T = M^{-1}N$, we consider the following iteration scheme for solving $Az = \ell$:
\[
z_{k+1} = Tz_{k} + M^{-1}\ell.
\tag{10}
\]
First, we show that (4) is a splitting of $A$ in (1). For convenience, we introduce
\[
S_{Q} = G + \frac{1}{\omega}BQ^{-1}B^{T}, \qquad H = \tfrac{1}{2}(G + G^{T}),
\tag{11}
\]
\[
M = \begin{pmatrix} G & B \\ -B^{T} & \omega Q \end{pmatrix}, \qquad
N = \begin{pmatrix} 0 & 0 \\ 0 & \omega Q \end{pmatrix}.
\tag{12}
\]
Note that $S_{Q}$ is the Schur complement of $\omega Q$ in $M$.
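For a nonsingular $A$, a stationary iteration of the form (10) converges for every starting vector exactly when the spectral radius of $T$ is below one. The sketch below assembles $M$ and $N$ from (12) for a small example and evaluates that spectral radius numerically. The function names and the data are hypothetical, and the computation forms $T$ densely, which is only sensible for small problems.

```python
import numpy as np

def splitting_matrices(G, B, Q, omega):
    """Return M and N from (12), so that A = M - N as in (4)."""
    n, m = B.shape
    M = np.block([[G, B], [-B.T, omega * Q]])
    N = np.block([[np.zeros((n, n)), np.zeros((n, m))],
                  [np.zeros((m, n)), omega * Q]])
    return M, N

def spectral_radius_of_T(G, B, Q, omega):
    """Spectral radius of T = M^{-1} N, computed without forming M^{-1}."""
    M, N = splitting_matrices(G, B, Q, omega)
    T = np.linalg.solve(M, N)
    return float(np.max(np.abs(np.linalg.eigvals(T))))

# Hypothetical data: G is unsymmetric positive definite, B has full column rank.
G = np.array([[4.0, 1.0, 0.0], [-1.0, 3.0, 1.0], [0.0, -1.0, 2.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Q = np.eye(2)

for omega in (0.1, 1.0, 10.0):
    print(omega, spectral_radius_of_T(G, B, Q, omega))
```

In this example the printed values should increase toward one as $\omega$ grows, which matches the usual trade-off: a small $\omega$ speeds up the outer iteration but puts more weight on the term $\frac{1}{\omega}BQ^{-1}B^{T}$ in $S_Q$.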
Lemma 2.1.
Let $G \in \mathbb{R}^{n\times n}$ be unsymmetric but positive definite on $\mathrm{Null}(B^{T})$, and
\[
\eta = \inf_{x \notin \mathrm{Null}(B^{T})} \frac{x^{T}Hx}{x^{T}BQ^{-1}B^{T}x}.
\tag{13}
\]
For any SPD $Q \in \mathbb{R}^{m\times m}$, if $0 < \omega < 1/(-\eta)_{+}$, then $S_{Q}$ is positive definite.
Proof.
Since G 𝐺 G italic_G is positive definite on Null ( B T ) Null superscript 𝐵 𝑇 \mathop{\mathrm{Null}}(B^{T}\!\,) roman_Null ( italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ) , so is H 𝐻 H italic_H . Then for any nonzero x ∈ Null ( B T ) 𝑥 Null superscript 𝐵 𝑇 x\in\mathop{\mathrm{Null}}(B^{T}) italic_x ∈ roman_Null ( italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ) , it holds that
x T ( H + 1 ω B Q − 1 B T ) x = x T H x > 0 . superscript 𝑥 𝑇 𝐻 1 𝜔 𝐵 superscript 𝑄 1 superscript 𝐵 𝑇 𝑥 superscript 𝑥 𝑇 𝐻 𝑥 0 x^{T}(H+\tfrac{1}{\omega}BQ^{-1}B^{T})x=x^{T}Hx>0. italic_x start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( italic_H + divide start_ARG 1 end_ARG start_ARG italic_ω end_ARG italic_B italic_Q start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ) italic_x = italic_x start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_H italic_x > 0 .
For any x ∉ Null ( B T ) 𝑥 Null superscript 𝐵 𝑇 x\notin\mathop{\mathrm{Null}}(B^{T}) italic_x ∉ roman_Null ( italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ) , as η > − 1 / ω 𝜂 1 𝜔 \eta>-1/\omega italic_η > - 1 / italic_ω , we have
x T ( H + 1 ω B Q − 1 B T ) x = x T H x + 1 ω x T B Q − 1 B T x ≥ ( η + 1 ω ) x T B Q − 1 B T x > 0 . superscript 𝑥 𝑇 𝐻 1 𝜔 𝐵 superscript 𝑄 1 superscript 𝐵 𝑇 𝑥 superscript 𝑥 𝑇 𝐻 𝑥 1 𝜔 superscript 𝑥 𝑇 𝐵 superscript 𝑄 1 superscript 𝐵 𝑇 𝑥 𝜂 1 𝜔 superscript 𝑥 𝑇 𝐵 superscript 𝑄 1 superscript 𝐵 𝑇 𝑥 0 x^{T}(H+\frac{1}{\omega}BQ^{-1}B^{T})x=x^{T}Hx+\frac{1}{\omega}x^{T}BQ^{-1}B^{%
T}x\geq(\eta+\frac{1}{\omega})x^{T}BQ^{-1}B^{T}x>0. italic_x start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( italic_H + divide start_ARG 1 end_ARG start_ARG italic_ω end_ARG italic_B italic_Q start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ) italic_x = italic_x start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_H italic_x + divide start_ARG 1 end_ARG start_ARG italic_ω end_ARG italic_x start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_B italic_Q start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x ≥ ( italic_η + divide start_ARG 1 end_ARG start_ARG italic_ω end_ARG ) italic_x start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_B italic_Q start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_B start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x > 0 .
Hence $S_{Q}$ is positive definite because, for any nonzero $x\in\mathbb{R}^{n}$, $x^{T}(S_{Q}+S_{Q}^{T})x=2x^{T}\bigl(H+\tfrac{1}{\omega}BQ^{-1}B^{T}\bigr)x>0$.
By Lemma 2.1 and some algebraic manipulation, we have the following results.
Lemma 2.2.
Under the same conditions as in Lemma 2.1, $M$ is nonsingular and
$$M^{-1}=\begin{pmatrix}S_{Q}^{-1} & -\dfrac{1}{\omega}S_{Q}^{-1}BQ^{-1}\\[8pt] \dfrac{1}{\omega}Q^{-1}B^{T}S_{Q}^{-1} & \dfrac{1}{\omega}Q^{-1}-\dfrac{1}{\omega^{2}}Q^{-1}B^{T}S_{Q}^{-1}BQ^{-1}\end{pmatrix}. \tag{14}$$
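As a numerical sanity check on (14), the following sketch builds a small random instance and tests whether the block formula really inverts $M$. The test matrices, the seed, and all variable names are illustrative assumptions, not data from the paper. It also assumes that $S_{Q}=G+\tfrac{1}{\omega}BQ^{-1}B^{T}$, which is consistent with the quadratic form used in the proof of Lemma 2.1.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, omega = 6, 3, 0.5

# Hypothetical test data: G is unsymmetric with symmetric part H = I,
# so G is positive definite and the conditions of Lemma 2.1 hold for any omega > 0.
K = rng.standard_normal((n, n))
G = np.eye(n) + 0.5 * (K - K.T)
B = rng.standard_normal((n, m))
C = rng.standard_normal((m, m))
Q = C @ C.T + m * np.eye(m)          # symmetric positive definite

Q_inv = np.linalg.inv(Q)
S_Q = G + (B @ Q_inv @ B.T) / omega  # assumed definition of S_Q
S_inv = np.linalg.inv(S_Q)

M = np.block([[G, B], [-B.T, omega * Q]])

# Block formula (14) for the inverse of M.
M_inv_formula = np.block([
    [S_inv, -(S_inv @ B @ Q_inv) / omega],
    [(Q_inv @ B.T @ S_inv) / omega,
     Q_inv / omega - (Q_inv @ B.T @ S_inv @ B @ Q_inv) / omega**2],
])

residual = np.linalg.norm(M @ M_inv_formula - np.eye(n + m))
print("||M * M_inv_formula - I|| =", residual)
```

A residual at the level of rounding error indicates agreement with (14) on this instance; it is a check, not a proof.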
Lemma 2.3.
Under the same conditions as in Lemma 2.1, the iteration matrix of Algorithm 1 is
$$T=M^{-1}N=\begin{pmatrix}0 & -S_{Q}^{-1}B\\ 0 & I-\dfrac{1}{\omega}Q^{-1}B^{T}S_{Q}^{-1}B\end{pmatrix}, \tag{15}$$
and the eigenvalues of $T$ are $0$ with algebraic multiplicity $n$, $1$ with algebraic multiplicity $m-s$, and the remaining $s$ eigenvalues are $\omega\mu/(1+\omega\mu)$, where $s$ is the rank of $B$ and $\mu$ is a generalized eigenvalue of $G$ and $BQ^{-1}B^{T}$ corresponding to a generalized eigenvector $x\notin\mathop{\mathrm{Null}}(B^{T})$.
Proof.
It follows from (12) and (14) that
$$T=\begin{pmatrix}G & B\\ -B^{T} & \omega Q\end{pmatrix}^{-1}\begin{pmatrix}0 & 0\\ 0 & \omega Q\end{pmatrix}=\begin{pmatrix}0 & -S_{Q}^{-1}B\\ 0 & I-\dfrac{1}{\omega}Q^{-1}B^{T}S_{Q}^{-1}B\end{pmatrix}.$$
Clearly, $T$ has an eigenvalue $0$ with algebraic multiplicity $n$, and the remaining $m$ eigenvalues are $1-\lambda/\omega$, where $\lambda$ is an eigenvalue of $Q^{-1}B^{T}S_{Q}^{-1}B$.
Since $S_{Q}$ is positive definite and $Q$ is SPD, $Q^{-1}B^{T}S_{Q}^{-1}B$ is nonsingular when $B$ has full column rank. Thus, $\lambda=0$ if and only if $B$ is column rank-deficient. In this case, $1$ is an eigenvalue of $T$ with algebraic multiplicity $m-s$.
If $\lambda\neq 0$, note that $Q^{-1}B^{T}S_{Q}^{-1}B$ and $S_{Q}^{-1}BQ^{-1}B^{T}$ possess the same nonzero eigenvalues, so $\lambda$ is also an eigenvalue of $S_{Q}^{-1}BQ^{-1}B^{T}$. Then there exists $x\notin\mathop{\mathrm{Null}}(B^{T})$ such that $S_{Q}^{-1}BQ^{-1}B^{T}x=\lambda x$. Combining with (11) leads to
$$Gx=\dfrac{\omega-\lambda}{\omega\lambda}BQ^{-1}B^{T}x. \tag{16}$$
Hence there exists a generalized eigenvalue $\mu$ of $G$ and $BQ^{-1}B^{T}$ corresponding to the generalized eigenvector $x\notin\mathop{\mathrm{Null}}(B^{T})$ such that $\mu=\tfrac{\omega-\lambda}{\omega\lambda}$, i.e., $\lambda=\tfrac{\omega}{1+\omega\mu}$.
Therefore, the remaining eigenvalues of $T$ are
$$1-\tfrac{1}{1+\omega\mu}=\tfrac{\omega\mu}{1+\omega\mu}.$$
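The eigenvalue formula of Lemma 2.3 can be checked numerically on a small example. The sketch below is an illustration under stated assumptions: the random test matrices and names are hypothetical, $B$ has full column rank (so $s=m$), and $G$ is nonsingular. Under the last assumption the finite generalized eigenvalues $\mu$ can be obtained as reciprocals of the eigenvalues $\nu$ of the $m\times m$ matrix $Q^{-1}B^{T}G^{-1}B$, which shares its nonzero eigenvalues with $G^{-1}BQ^{-1}B^{T}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, omega = 6, 3, 0.5

# Hypothetical test data: symmetric part of G is I, so G is positive definite.
K = rng.standard_normal((n, n))
G = np.eye(n) + 0.5 * (K - K.T)
B = rng.standard_normal((n, m))      # full column rank with probability 1
C = rng.standard_normal((m, m))
Q = C @ C.T + m * np.eye(m)          # symmetric positive definite

M = np.block([[G, B], [-B.T, omega * Q]])
N = np.zeros((n + m, n + m))
N[n:, n:] = omega * Q
T = np.linalg.solve(M, N)            # iteration matrix T = M^{-1} N

# The first n columns of T vanish, which gives the eigenvalue 0 with multiplicity n.
first_cols_zero = np.allclose(T[:, :n], 0.0)

# Generalized eigenvalues mu of (G, B Q^{-1} B^T) via nu = 1/mu (G assumed nonsingular).
nu = np.linalg.eigvals(np.linalg.solve(Q, B.T @ np.linalg.solve(G, B)))
mu = 1.0 / nu
predicted = omega * mu / (1.0 + omega * mu)

# Eigenvalues of the trailing m-by-m block of T, computed directly.
computed = np.linalg.eigvals(T[n:, n:])

# Largest distance from a predicted eigenvalue to its nearest computed one.
mismatch = np.abs(predicted[:, None] - computed[None, :]).min(axis=1).max()
print("first n columns zero:", first_cols_zero, " mismatch:", mismatch)
```

The comparison uses nearest-neighbour distances rather than sorted arrays, because complex conjugate pairs may be returned in a different order by the two eigenvalue computations.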
We emphasize that Lemmas 2.1, 2.2 and 2.3 hold even if $B$ does not have full column rank. From Lemma 2.2, $M$ is nonsingular, so $A=M-N$ is a splitting of $A$, and the convergence analysis of Algorithm 1 can be based on the spectral properties of $T=M^{-1}N$. In the following, we discuss the convergence of Algorithm 1 when $B$ has full column rank and when it does not.
2.1 Convergence analysis when $B$ has full column rank
In this case, $A$ is nonsingular and the saddle-point system (1) has a unique solution.
Theorem 2.1.
Suppose $B\in\mathbb{R}^{n\times m}$ has full column rank and $G\in\mathbb{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T})$. For any SPD $Q\in\mathbb{R}^{m\times m}$, let $\eta$ be defined by (13). If $0<\omega<1/(-2\eta)_{+}$, then the sequence $\{x_{k},y_{k}\}$ produced by Algorithm 1 converges to the unique solution of the saddle-point system (1).
Proof 2.2.
Algorithm 1 is convergent if and only if the spectral radius of $T$ is less than $1$ [42, Theorem 4.1]. Note that $0<\omega<1/(-2\eta)_{+}\leq 1/(-\eta)_{+}$, so the conditions of Lemma 2.1 hold. As $B$ has full column rank, it follows from Lemma 2.3 that $1$ is not an eigenvalue of $T$ and then
$$\rho(T)=\max_{\mu}\dfrac{\omega|\mu|}{|1+\omega\mu|}=\max_{\mu}\sqrt{\dfrac{(\omega\mu_{1})^{2}+(\omega\mu_{2})^{2}}{(1+\omega\mu_{1})^{2}+(\omega\mu_{2})^{2}}}, \tag{17}$$
where $\mu=\mu_{1}+\mathrm{i}\mu_{2}$ is a generalized eigenvalue of $G$ and $BQ^{-1}B^{T}$ corresponding to a generalized eigenvector $x\notin\mathop{\mathrm{Null}}(B^{T})$. Since $x\notin\mathop{\mathrm{Null}}(B^{T})$ and $Q$ is SPD, we have $x^{*}BQ^{-1}B^{T}x>0$. Combining with (16) gives $\mu=\frac{x^{*}Gx}{x^{*}BQ^{-1}B^{T}x}$. Then
$$\mu_{1}=\dfrac{x^{*}(G+G^{T})x}{2x^{*}BQ^{-1}B^{T}x}=\dfrac{x^{*}Hx}{x^{*}BQ^{-1}B^{T}x}\geq\eta. \tag{18}$$
Note that $\eta>-1/(2\omega)$ and $\omega>0$, so that $1+\omega\mu_{1}\geq 1+\omega\eta>1/2$, i.e., $1+2\omega\mu_{1}>0$. Since $(1+\omega\mu_{1})^{2}-(\omega\mu_{1})^{2}=1+2\omega\mu_{1}>0$, every ratio under the square root in (17) is less than $1$, which leads to $\rho(T)<1$. Therefore, Algorithm 1 is convergent.
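The statement of Theorem 2.1 can be illustrated with a short experiment. The sketch below assumes that Algorithm 1 is the stationary iteration $Mz_{k+1}=Nz_{k}+\ell$ induced by the splitting $A=M-N$ with $M$ and $N$ as in the proof of Lemma 2.3. The test problem, the choice $Q=0.1I$, and the iteration count are hypothetical. Because the symmetric part of the chosen $G$ is the identity, $\eta\geq 0$ and any $\omega>0$ is admissible.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, omega = 6, 3, 0.1

# Hypothetical test data: G unsymmetric with symmetric part H = I.
K = rng.standard_normal((n, n))
G = np.eye(n) + 0.5 * (K - K.T)
B = rng.standard_normal((n, m))      # full column rank with probability 1
Q = 0.1 * np.eye(m)                  # SPD with small norm

A = np.block([[G, B], [-B.T, np.zeros((m, m))]])
N = np.zeros((n + m, n + m))
N[n:, n:] = omega * Q
M = A + N                            # so that A = M - N

rhs = rng.standard_normal(n + m)
z_exact = np.linalg.solve(A, rhs)    # reference solution of A z = rhs

# Spectral radius of the iteration matrix T = M^{-1} N.
rho = np.abs(np.linalg.eigvals(np.linalg.solve(M, N))).max()

# Stationary iteration M z_{k+1} = N z_k + rhs, started from zero.
z = np.zeros(n + m)
for _ in range(500):
    z = np.linalg.solve(M, N @ z + rhs)

error = np.linalg.norm(z - z_exact)
print("rho(T) =", rho, " final error =", error)
```

In practice one would factor $M$ once and reuse the factorization in every sweep; the repeated call to `np.linalg.solve` is kept only for brevity.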
Remark 1.
From (17) we see that $\rho(T)$ decreases as $\omega$ decreases, so the convergence rate of Algorithm 1 improves as $\omega$ decreases. In particular, if $\omega=0$ (which means no splitting), then $\rho(T)=0$ and Algorithm 1 reduces to an exact method for problem (1). This is consistent with (7), i.e., Algorithm 1 performs only one iteration.
In addition, since $\rho(T)\rightarrow 0$ as $|\mu|\rightarrow 0$, $Q$ should be chosen such that the generalized eigenvalues of $G$ and $BQ^{-1}B^{T}$ are very close to $0$. Therefore, we can choose $Q$ with very small norm.
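The dependence of $\rho(T)$ on $\omega$ described in Remark 1 can be observed directly. The following sketch evaluates the spectral radius for a few values of $\omega$ on one hypothetical random problem. The helper name `spectral_radius_T` and the test data are assumptions made for illustration, and $G$ is chosen with positive definite symmetric part so that every $\omega>0$ is admissible.

```python
import numpy as np

def spectral_radius_T(G, B, Q, omega):
    """Spectral radius of T = M^{-1} N for M = [[G, B], [-B^T, omega*Q]], N = diag(0, omega*Q)."""
    n, m = B.shape
    N = np.zeros((n + m, n + m))
    N[n:, n:] = omega * Q
    M = np.block([[G, B], [-B.T, omega * Q]])
    return np.abs(np.linalg.eigvals(np.linalg.solve(M, N))).max()

rng = np.random.default_rng(3)
n, m = 6, 3

# Hypothetical test data: symmetric part of G is I, B has full column rank.
K = rng.standard_normal((n, n))
G = np.eye(n) + 0.5 * (K - K.T)
B = rng.standard_normal((n, m))
Q = np.eye(m)

omegas = [1.0, 0.5, 0.1, 0.01]
radii = [spectral_radius_T(G, B, Q, w) for w in omegas]
for w, r in zip(omegas, radii):
    print(f"omega = {w:5.2f}   rho(T) = {r:.3e}")
```

Since $M$ and $N$ contain $Q$ only through the product $\omega Q$, replacing $Q$ by $cQ$ has the same effect on $T$ as replacing $\omega$ by $c\omega$, which matches the recommendation to choose $Q$ with small norm.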
Remark 2.
If $G$ is positive semidefinite, we see that $\eta\geq 0$. Then Algorithm 1 is convergent for any $\omega>0$.
2.2 Convergence analysis when $B$ is rank-deficient
In this case, $A$ is singular. We assume that system (1) is solvable and show that Algorithm 1 is semi-convergent. To this end, we introduce some preliminaries on the semi-convergence of iteration scheme (10) for a general linear system $Az=\ell$.
Definition 3 (Berman and Plemmons [9, Lemma 6.13]).
Iteration (10) is semi-convergent if, for any initial guess $z_{0}$, the iteration sequence $\{z_{k}\}$ produced by (10) converges to a solution $z$ of $Az=\ell$ such that
$$z=(I-T)^{D}M^{-1}\ell+[I-(I-T)^{D}(I-T)]z_{0},$$
where $(I-T)^{D}$ denotes the Drazin inverse [14] of $I-T$.
Lemma 4 ([9, Theorem 6.19]).
Iteration (10) is semi-convergent if and only if $\mathrm{index}(I-T)=1$ and $v(T)<1$, where $\mathrm{index}(I-T)$ is the smallest nonnegative integer $k$ such that the ranks of $(I-T)^{k}$ and $(I-T)^{k+1}$ are equal, and
$$v(T)=\max\{|\lambda| : \lambda\in\mathrm{sp}(T),\ \lambda\neq 1\}$$
is called the pseudo-spectral radius of $T$.
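The two quantities in Lemma 4 are easy to compute for small dense matrices. The sketch below is illustrative only: the helper names `index_of` and `pseudo_spectral_radius`, the tolerance, and the two toy matrices are assumptions, and the rank is determined numerically with `np.linalg.matrix_rank`.

```python
import numpy as np

def index_of(E):
    """Smallest k >= 0 such that rank(E^k) == rank(E^(k+1))."""
    size = E.shape[0]
    power = np.eye(size)             # E^0
    rank_prev = size
    for k in range(size + 1):        # the index never exceeds the matrix size
        power = power @ E            # now holds E^(k+1)
        rank_next = np.linalg.matrix_rank(power)
        if rank_next == rank_prev:
            return k
        rank_prev = rank_next
    raise RuntimeError("rank sequence did not stabilise")

def pseudo_spectral_radius(T, tol=1e-6):
    """Largest modulus among the eigenvalues of T that differ from 1 (0.0 if there are none)."""
    eigs = np.linalg.eigvals(T)
    others = eigs[np.abs(eigs - 1.0) > tol]
    return float(np.abs(others).max()) if others.size else 0.0

# Eigenvalue 1 is semisimple here, so both conditions of Lemma 4 hold.
T_good = np.diag([1.0, 0.5, 0.0])
# Eigenvalue 1 sits in a 2-by-2 Jordan block, so index(I - T) = 2.
T_bad = np.array([[1.0, 1.0], [0.0, 1.0]])

index_good = index_of(np.eye(3) - T_good)
index_bad = index_of(np.eye(2) - T_bad)
v_good = pseudo_spectral_radius(T_good)
print(index_good, v_good, index_bad)
```

The second matrix shows why the index condition cannot be dropped: its pseudo-spectral radius is trivially below $1$, yet powers of a nontrivial Jordan block for the eigenvalue $1$ grow without bound.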
Lemma 5 ([49, Theorem 2.5]).
$\mathrm{index}(I-T)=1$ holds if and only if, for all $0\neq w\in\mathop{\mathrm{Range}}(A)$, $w\notin\mathop{\mathrm{Null}}(AM^{-1})$, i.e., $\mathop{\mathrm{Range}}(A)\cap\mathop{\mathrm{Null}}(AM^{-1})=\{0\}$.
In the following, we analyze the semi-convergence of Algorithm 1. By Lemma 4, we first need to show that $\mathrm{index}(I-T)=1$.
Theorem 2.3.
Suppose $B\in\mathbb{R}^{n\times m}$ is rank-deficient and $G\in\mathbb{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T})$. For any SPD $Q\in\mathbb{R}^{m\times m}$, let $\eta$ be defined by (13). If $0<\omega<1/(-\eta)_{+}$, then $\mathrm{index}(I-T)=1$.
Proof 2.4.
Suppose $0\neq w\in\mathop{\mathrm{Range}}(A)$. Then there is $v=(v_{1},v_{2})\in\mathbb{R}^{n+m}$ such that
$$w=Av=\begin{pmatrix}G & B\\ -B^{T} & 0\end{pmatrix}\begin{pmatrix}v_{1}\\ v_{2}\end{pmatrix}=\begin{pmatrix}Gv_{1}+Bv_{2}\\ -B^{T}v_{1}\end{pmatrix}\neq 0. \tag{19}$$
By (14), we have
$$\begin{aligned}
AM^{-1}w &= \begin{pmatrix}I & 0\\ -B^{T}S_{Q}^{-1} & \dfrac{1}{\omega}B^{T}S_{Q}^{-1}BQ^{-1}\end{pmatrix}\begin{pmatrix}Gv_{1}+Bv_{2}\\ -B^{T}v_{1}\end{pmatrix}\\
&= \begin{pmatrix}Gv_{1}+Bv_{2}\\ -B^{T}S_{Q}^{-1}(Gv_{1}+Bv_{2})-\dfrac{1}{\omega}B^{T}S_{Q}^{-1}BQ^{-1}B^{T}v_{1}\end{pmatrix}.
\end{aligned} \tag{20}$$
If $Gv_{1}+Bv_{2}\neq 0$, clearly, $AM^{-1}w\neq 0$, which shows that $w\notin\mathop{\mathrm{Null}}(AM^{-1})$.
If $Gv_{1}+Bv_{2}=0$, it follows from (19) that $B^{T}v_{1}\neq 0$ and (20) yields
$$AM^{-1}w=\begin{pmatrix}0\\ -\dfrac{1}{\omega}B^{T}S_{Q}^{-1}BQ^{-1}B^{T}v_{1}\end{pmatrix}. \tag{21}$$
Note that $Q$ is SPD and $B^{T}v_{1}\neq 0$, so that $BQ^{-1}B^{T}v_{1}\neq 0$.
We claim that
$$B^{T}S_{Q}^{-1}BQ^{-1}B^{T}v_{1}\neq 0.$$
Indeed, if $B^{T}S_{Q}^{-1}BQ^{-1}B^{T}v_{1}=0$, then multiplying on the left by $v_{1}^{T}BQ^{-1}$ gives $v_{1}^{T}BQ^{-1}B^{T}S_{Q}^{-1}BQ^{-1}B^{T}v_{1}=0$. Since $S_{Q}$ is positive definite, $S_{Q}^{-1}$ is also positive definite, which leads to $BQ^{-1}B^{T}v_{1}=0$. This is a contradiction.
Therefore, we still get $w\notin\mathop{\mathrm{Null}}(AM^{-1})$ by (21).
Summing up, for any 0 ≠ w ∈ Range ( A ) 0 𝑤 Range 𝐴 0\neq w\in\mathop{\mathrm{Range}}(A) 0 ≠ italic_w ∈ roman_Range ( italic_A ) , w ∉ Null ( A M − 1 ) 𝑤 Null 𝐴 superscript 𝑀 1 w\notin\mathop{\mathrm{Null}}({AM}^{-1}) italic_w ∉ roman_Null ( italic_A italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ) .
The result follows from Lemma 5 .
Next, we show that $v(T)<1$.
Theorem 2.5.
Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T})$. For any SPD $Q\in\mathds{R}^{m\times m}$, let $\eta$ be defined by (13). If $0<\omega<1/(-2\eta)_{+}$, then $v(T)<1$.
Proof 2.6 .
Since $0<\omega<1/(-2\eta)_{+}\leq 1/(-\eta)_{+}$, the conditions of Lemma 2.1 hold.
Note the definition of the pseudo-spectral radius in Lemma 4. From Lemma 2.3,
$$v(T)=\max_{\mu}\dfrac{\omega|\mu|}{|1+\omega\mu|}=\max_{\mu}\sqrt{\dfrac{(\omega\mu_{1})^{2}+(\omega\mu_{2})^{2}}{(1+\omega\mu_{1})^{2}+(\omega\mu_{2})^{2}}},$$
where $\mu=\mu_{1}+{\rm i}\mu_{2}$ is the generalized eigenvalue of $G$ and $BQ^{-1}B^{T}$ that corresponds to the generalized eigenvector $x\notin\mathop{\mathrm{Null}}(B^{T})$. By (18), $\omega>0$ and $\eta>-1/(2\omega)$, we have $1+2\omega\mu_{1}\geq 1+2\omega\eta>0$, giving $v(T)<1$.
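The last implication can be checked directly. The following short verification uses only the notation of the proof and is included as a reading aid, not as an additional result. For each generalized eigenvalue $\mu=\mu_{1}+{\rm i}\mu_{2}$ in the maximum:

```latex
\begin{aligned}
(\omega\mu_{1})^{2}+(\omega\mu_{2})^{2} < (1+\omega\mu_{1})^{2}+(\omega\mu_{2})^{2}
  &\iff (\omega\mu_{1})^{2} < 1+2\omega\mu_{1}+(\omega\mu_{1})^{2} \\
  &\iff 0 < 1+2\omega\mu_{1}.
\end{aligned}
% If 1 + 2\omega\mu_1 > 0, the denominator (1+\omega\mu_1)^2 + (\omega\mu_2)^2
% is strictly larger than the nonnegative numerator, hence positive, so
\frac{(\omega\mu_{1})^{2}+(\omega\mu_{2})^{2}}{(1+\omega\mu_{1})^{2}+(\omega\mu_{2})^{2}} < 1 .
```

Since the maximum is taken over finitely many $\mu$, each ratio being below $1$ gives $v(T)<1$.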
Combining Lemma 4 with Theorems 2.3 and 2.5 and $1/(-2\eta)_{+}<1/(-\eta)_{+}$, we get the following convergence result.
Theorem 2.7.
Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient, and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T})$. For any SPD $Q\in\mathds{R}^{m\times m}$, let $\eta$ be defined by (13). If $0<\omega<1/(-2\eta)_{+}$, then the sequence $\{x_{k},y_{k}\}$ produced by Algorithm 1 is semi-convergent to a solution of the singular saddle-point system (1).
3 Inexact augmented Lagrangian algorithm
In this section, we develop and analyze inexact SPAL to solve (1 ).
Let $\ell=(f,g)$, $z_{k}=(x_{k},y_{k})$, and $r_{k}=Az_{k}-\ell$. It follows from (10) and $A=M-N$ that Algorithm 1 is equivalent to
$$z_{k+1}=M^{-1}Nz_{k}+M^{-1}\ell=M^{-1}(M-A)z_{k}+M^{-1}\ell=z_{k}-M^{-1}r_{k},\tag{22}$$
where $M$ and $N$ are defined in (12). To describe the inexact version of Algorithm 1, as done in [30], we introduce a nonlinear mapping $\Psi:\mathds{R}^{n+m}\longrightarrow\mathds{R}^{n+m}$ such that for any given $r\in\mathds{R}^{n+m}$, $\Psi(r)$ approximates the solution $\Delta z$ of $M\Delta z=r$ in that
$$\|r-M\Psi(r)\|_{*}\leq\delta\|r\|_{*}\tag{23}$$
for some $\delta\in[0,1)$ and some norm $\|\cdot\|_{*}$. We obtain the inexact augmented Lagrangian algorithm of Algorithm 2, where the main idea is to approximate $M^{-1}r_{k}$ in (22).
Algorithm 2 Inexact augmented Lagrangian algorithm
1: Given $z_{0}=(x_{0},y_{0})\in\mathds{R}^{n+m}$, $\omega>0$, $0\leq\delta<1$ and SPD $Q$, set $k=0$.
2: while a stopping condition is not satisfied do
3: Compute $r_{k}=Az_{k}-\ell$.
4: Compute $\Psi(r_{k})\approx M^{-1}r_{k}$ satisfying (23).
5: Compute $z_{k+1}=z_{k}-\Psi(r_{k})$.
6: Increment $k$ by $1$.
7: end while
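As an illustration, Algorithm 2 can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation. The splitting matrix is assumed to be $M=\begin{pmatrix}G&B\\-B^{T}&\omega Q\end{pmatrix}$, which is consistent with the expression for $NM^{-1}$ used in Section 3.1 but is not restated here from (12). The inexact solve $\Psi$ is simulated by perturbing the right-hand side of an exact solve so that (23) holds in the Euclidean norm. The test data (`G`, `B`, `Q`, `ell`) and the names `psi` and `inexact_al` are hypothetical.

```python
import numpy as np

# Hypothetical test problem: G is unsymmetric with symmetric part 2I,
# B has full column rank, and Q = I.
n, m, omega, delta = 4, 2, 0.1, 0.1
K = np.triu(np.ones((n, n)), 1)
G = 2.0 * np.eye(n) + 0.5 * (K - K.T)
B = np.eye(n)[:, :m]
Q = np.eye(m)

A = np.block([[G, B], [-B.T, np.zeros((m, m))]])
M = np.block([[G, B], [-B.T, omega * Q]])   # assumed form of M in (12)
ell = np.arange(1.0, n + m + 1.0)           # right-hand side (f, g)

rng = np.random.default_rng(0)


def psi(r):
    """Simulated inexact solve of M dz = r satisfying (23) in the 2-norm."""
    e = rng.standard_normal(r.shape)
    e *= 0.9 * delta * np.linalg.norm(r) / np.linalg.norm(e)
    # Up to rounding, r - M @ psi(r) equals e, so ||r - M psi(r)|| <= delta ||r||.
    return np.linalg.solve(M, r - e)


def inexact_al(z0, tol=1e-12, max_iter=200):
    """Algorithm 2: z_{k+1} = z_k - psi(r_k) with r_k = A z_k - ell."""
    z = z0.copy()
    history = []
    for _ in range(max_iter):
        r = A @ z - ell
        history.append(np.linalg.norm(r))
        if history[-1] <= tol:      # stopping condition on the residual
            break
        z = z - psi(r)
    return z, history


z, history = inexact_al(np.zeros(n + m))
print(len(history), history[-1])
```

In practice $\Psi$ would be an inner iterative solver stopped once (23) holds; the perturbation used here only makes the inexactness explicit and reproducible.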
In our convergence analysis we use $\|\cdot\|_{P_{\beta}}$ in (23), where $P_{\beta}=\begin{pmatrix}I&0\\0&\beta Q^{-1}\end{pmatrix}$ is SPD and $\beta>0$ is an arbitrary constant. By Algorithm 2,
$$\begin{aligned}
r_{k+1} &= Az_{k+1}-\ell=A(z_{k}-\Psi(r_{k}))-\ell=r_{k}-A\Psi(r_{k})\\
&= (I-AM^{-1})r_{k}+AM^{-1}(r_{k}-M\Psi(r_{k}))\\
&= NM^{-1}r_{k}+(I-NM^{-1})(r_{k}-M\Psi(r_{k})).
\end{aligned}\tag{24}$$
Likewise, we discuss the convergence of Algorithm 2 when $B$ does or does not have full column rank, respectively.
3.1 Convergence analysis when $B$ has full column rank
Note that $P_{\beta}$ is SPD, and (24) gives
$$P_{\beta}^{\tfrac{1}{2}}r_{k+1}=P_{\beta}^{\tfrac{1}{2}}NM^{-1}P_{\beta}^{-\tfrac{1}{2}}P_{\beta}^{\tfrac{1}{2}}r_{k}+P_{\beta}^{\tfrac{1}{2}}(I-NM^{-1})P_{\beta}^{-\tfrac{1}{2}}P_{\beta}^{\tfrac{1}{2}}(r_{k}-M\Psi(r_{k})).$$
This along with (23 ) yields
$$\begin{aligned}
\|r_{k+1}\|_{P_{\beta}} &\leq \|P_{\beta}^{\tfrac{1}{2}}NM^{-1}P_{\beta}^{-\tfrac{1}{2}}\|\,\|r_{k}\|_{P_{\beta}}+\|P_{\beta}^{\tfrac{1}{2}}(I-NM^{-1})P_{\beta}^{-\tfrac{1}{2}}\|\,\|r_{k}-M\Psi(r_{k})\|_{P_{\beta}}\\
&\leq \big(\|P_{\beta}^{\tfrac{1}{2}}NM^{-1}P_{\beta}^{-\tfrac{1}{2}}\|+\delta\|I-P_{\beta}^{\tfrac{1}{2}}NM^{-1}P_{\beta}^{-\tfrac{1}{2}}\|\big)\|r_{k}\|_{P_{\beta}}\\
&= \big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)\|r_{k}\|_{P_{\beta}}.
\end{aligned}\tag{25}$$
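The factor multiplying $\|r_{k}\|_{P_{\beta}}$ in (25) can be evaluated numerically on a small example. The sketch below is illustrative only and assumes $M=\begin{pmatrix}G&B\\-B^{T}&\omega Q\end{pmatrix}$ with $N=M-A$ (consistent with the formula for $NM^{-1}$ in the proof of Lemma 1, but not quoted from (12)). The data and the helper names `spd_sqrt`, `p_norm_matrix`, and `factor` are hypothetical. The induced norm is computed as $\|X\|_{P_{\beta}}=\|P_{\beta}^{1/2}XP_{\beta}^{-1/2}\|_{2}$.

```python
import numpy as np

# Hypothetical data: G unsymmetric with symmetric part 2I, B of full column rank.
n, m, omega, delta, beta = 4, 2, 0.1, 0.1, 1.0
K = np.triu(np.ones((n, n)), 1)
G = 2.0 * np.eye(n) + 0.5 * (K - K.T)
B = np.eye(n)[:, :m]
Q = np.eye(m)

A = np.block([[G, B], [-B.T, np.zeros((m, m))]])
M = np.block([[G, B], [-B.T, omega * Q]])   # assumed form of M in (12)
N = M - A

# P_beta = diag(I, beta * Q^{-1})
P = np.block([[np.eye(n), np.zeros((n, m))],
              [np.zeros((m, n)), beta * np.linalg.inv(Q)]])


def spd_sqrt(X):
    """Symmetric square root of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(X)
    return (V * np.sqrt(w)) @ V.T


def p_norm_matrix(X, P):
    """Induced norm ||X||_P = ||P^{1/2} X P^{-1/2}||_2."""
    R = spd_sqrt(P)
    return np.linalg.norm(R @ X @ np.linalg.inv(R), 2)


NMinv = N @ np.linalg.inv(M)
I = np.eye(n + m)

# Contraction factor appearing on the right-hand side of (25).
factor = p_norm_matrix(NMinv, P) + delta * p_norm_matrix(I - NMinv, P)
print(factor)
```

A value of `factor` below $1$ means the bound (25) forces the residual norm to decrease at every step for this particular choice of $\omega$, $\delta$, and $\beta$.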
The following result provides sufficient conditions for $\|NM^{-1}\|_{P_{\beta}}<1$.
Lemma 1.
Suppose $B\in\mathds{R}^{n\times m}$ has full column rank and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T})$. For any $\beta>0$ and SPD $Q\in\mathds{R}^{m\times m}$, let $\eta$ be defined by (13) and $\lambda_{1}$ be the minimum eigenvalue of $2\omega H+BQ^{-1}B^{T}$. Then, $\lambda_{1}>0$ and if $0<\omega<\min\left\{1/(-2\eta)_{+},\,\sqrt{\lambda_{1}/\beta}\,\right\}$, we have $\|NM^{-1}\|_{P_{\beta}}<1$.
Proof 3.1 .
It follows from $0<\omega<1/(-2\eta)_{+}\leq 1/(-\eta)_{+}$ that $S_{Q}$ is positive definite. Combining with (12) and (14) leads to
$$\begin{aligned}
P_{\beta}^{\tfrac{1}{2}}NM^{-1}P_{\beta}^{-\tfrac{1}{2}} &= P_{\beta}^{\tfrac{1}{2}}\begin{pmatrix}0&0\\ B^{T}S_{Q}^{-1}&I-\dfrac{1}{\omega}B^{T}S_{Q}^{-1}BQ^{-1}\end{pmatrix}P_{\beta}^{-\tfrac{1}{2}}\\
&= \begin{pmatrix}0&0\\ \sqrt{\beta}Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}&I-E\end{pmatrix}=:\widetilde{T},
\end{aligned}$$
where $E=\frac{1}{\omega}Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}BQ^{-\tfrac{1}{2}}$.
This shows that
$$\|NM^{-1}\|_{P_{\beta}}=\|P_{\beta}^{\tfrac{1}{2}}NM^{-1}P_{\beta}^{-\tfrac{1}{2}}\|=\Big(\rho(\widetilde{T}\widetilde{T}^{T})\Big)^{\tfrac{1}{2}}.\tag{26}$$
By direct calculation and (11 ), we have
$$\begin{aligned}
\rho\left(\widetilde{T}\widetilde{T}^{T}\right) &= \rho\left((I-E)(I-E^{T})+\beta Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}S_{Q}^{-T}BQ^{-\tfrac{1}{2}}\right)\\
&= \rho\left(I-\tfrac{1}{\omega}Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}\Big(S_{Q}+S_{Q}^{T}-\tfrac{1}{\omega}BQ^{-1}B^{T}-\omega\beta I\Big)S_{Q}^{-T}BQ^{-\tfrac{1}{2}}\right)\\
&= \rho\left(I-\tfrac{1}{\omega^{2}}Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}\Big(2\omega H+BQ^{-1}B^{T}-\omega^{2}\beta I\Big)S_{Q}^{-T}BQ^{-\tfrac{1}{2}}\right).
\end{aligned}\tag{27}$$
Note that $B$ has full column rank and $\omega>0$. Hence, if $2\omega H+BQ^{-1}B^{T}-\omega^{2}\beta I$ is SPD, so is $\tfrac{1}{\omega^{2}}Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}\Big(2\omega H+BQ^{-1}B^{T}-\omega^{2}\beta I\Big)S_{Q}^{-T}BQ^{-\tfrac{1}{2}}$. Then all eigenvalues of $\widetilde{T}\widetilde{T}^{T}$ are less than $1$, i.e., $\rho(\widetilde{T}\widetilde{T}^{T})<1$.
Therefore, in order to prove $\|NM^{-1}\|_{P_{\beta}}<1$, we just need to find $\omega$ to guarantee that $2\omega H+BQ^{-1}B^{T}-\omega^{2}\beta I$ is positive definite.
Since $H$ is positive definite on $\mathop{\mathrm{Null}}(B^{T})$, (13) and $2\omega\eta>-1$ imply $2\omega H+BQ^{-1}B^{T}$ is positive definite. Thus, $\lambda_{1}>0$. Combining this with $\omega<\sqrt{\lambda_{1}/\beta}$, i.e., $\omega^{2}\beta<\lambda_{1}$, shows that the smallest eigenvalue $\lambda_{1}-\omega^{2}\beta$ of $2\omega H+BQ^{-1}B^{T}-\omega^{2}\beta I$ is positive, which gives the result.
Remark 2.
The conditions in Lemma 1 are reasonable.
Indeed, for any given $\omega_{0}\in\big(0,\,1/(-2\eta)_{+}\big)$, the matrix $2H+\frac{1}{\omega_{0}}BQ^{-1}B^{T}$ is SPD. Then, when $0<\omega\leq\omega_{0}$, we have
\[
\lambda_{1}\geq\lambda_{\min}\left(2\omega H+\tfrac{\omega}{\omega_{0}}BQ^{-1}B^{T}\right)=\omega\,\lambda_{\min}\left(2H+\tfrac{1}{\omega_{0}}BQ^{-1}B^{T}\right).
\]
Then the conditions in Lemma 1 can be replaced by
\[
0<\omega<\min\left\{\omega_{0},\,\tfrac{1}{\beta}\lambda_{\min}\left(2H+\tfrac{1}{\omega_{0}}BQ^{-1}B^{T}\right)\right\}.
\]
In particular, when $H$ is positive semidefinite, $\eta\geq 0$ and $2H+BQ^{-1}B^{T}$ is SPD. Then we can pick $\omega_{0}=1$ above, and the last condition simplifies to
\[
0<\omega<\min\left\{1,\,\tfrac{1}{\beta}\lambda_{\min}\big(2H+BQ^{-1}B^{T}\big)\right\}.
\]
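The simplified bound at the end of Remark 2 can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes $H=\tfrac12(G+G^{T})$ (the symmetric part of $G$), uses randomly generated test matrices, and the helper name `omega_upper_bound` is hypothetical. The default $\omega_{0}=1$ is only admissible when $\eta\geq 0$, which holds here because the test matrix is built with an SPD symmetric part.

```python
import numpy as np

def omega_upper_bound(G, B, Q, beta, omega0=1.0):
    """Bound min{omega0, lambda_min(2H + B Q^{-1} B^T / omega0) / beta}.

    Assumption: H = (G + G^T) / 2, and omega0 lies in (0, 1/(-2 eta)_+).
    """
    H = 0.5 * (G + G.T)
    K = B @ np.linalg.solve(Q, B.T)                    # B Q^{-1} B^T
    lam = np.linalg.eigvalsh(2.0 * H + K / omega0)[0]  # smallest eigenvalue
    if lam <= 0.0:
        raise ValueError("2H + (1/omega0) B Q^{-1} B^T is not positive definite")
    return min(omega0, lam / beta)

# Illustrative data (all hypothetical): unsymmetric G with SPD symmetric part.
rng = np.random.default_rng(0)
n, m, beta = 6, 3, 10.0
R = rng.standard_normal((n, n))
G = R @ R.T + np.eye(n) + (R - R.T)
B = rng.standard_normal((n, m))                        # full column rank (a.s.)
Qr = rng.standard_normal((m, m))
Q = Qr @ Qr.T + m * np.eye(m)                          # SPD

bound = omega_upper_bound(G, B, Q, beta)
omega = 0.9 * bound                                    # any value below the bound

# The matrix from Lemma 1 should then be positive definite.
H = 0.5 * (G + G.T)
K = B @ np.linalg.solve(Q, B.T)
shifted = 2.0 * omega * H + K - omega**2 * beta * np.eye(n)
margin = np.linalg.eigvalsh(shifted)[0]
print(f"omega bound = {bound:.4f}, smallest eigenvalue = {margin:.3e}")
```

Any $\omega$ strictly below the returned bound should give a positive smallest eigenvalue; the factor $0.9$ is an arbitrary safety margin.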
Theorem 3.2.
Suppose $B\in\mathbb{R}^{n\times m}$ has full column rank and $G\in\mathbb{R}^{n\times n}$ is unsymmetric but positive definite on $\mathrm{Null}(B^{T})$. For any $\beta>0$ and SPD $Q\in\mathbb{R}^{m\times m}$, let $\eta$ and $\delta$ be defined by (13) and (23), and let $\lambda_{1}>0$ be the minimum eigenvalue of $2\omega H+BQ^{-1}B^{T}$. If $\omega$ and $\delta$ satisfy
\[
0<\omega<\min\left\{\frac{1}{(-2\eta)_{+}},\,\sqrt{\frac{\lambda_{1}}{\beta}}\right\}\quad\mbox{and}\quad 0\leq\delta\leq\tfrac{1}{2}\Big(1-\|NM^{-1}\|_{P_{\beta}}\Big),
\]
then the sequence $\{x_{k},y_{k}\}$ produced by Algorithm 2 converges to the unique solution of (1).
Proof 3.3.
It follows from Lemma 1 that $\|NM^{-1}\|_{P_{\beta}}<1$, so that
\[
\|I-NM^{-1}\|_{P_{\beta}}\leq 1+\|NM^{-1}\|_{P_{\beta}}<2.
\]
The result follows from (25) and
\begin{align*}
\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}
&\leq\|NM^{-1}\|_{P_{\beta}}+\frac{1-\|NM^{-1}\|_{P_{\beta}}}{2}\,\|I-NM^{-1}\|_{P_{\beta}}\\
&<\|NM^{-1}\|_{P_{\beta}}+1-\|NM^{-1}\|_{P_{\beta}}=1.
\end{align*}
Remark 3.
From (25) we have
\[
\|r_{k}\|_{P_{\beta}}\leq\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\,\|r_{0}\|_{P_{\beta}}.
\]
Hence, under the conditions of Theorem 3.2, $r_{k}$ converges to zero linearly. Let $z_{*}$ be the solution of (1). Then
\begin{align*}
\|z_{k}-z_{*}\|_{P_{\beta}}
&=\|A^{-1}r_{k}\|_{P_{\beta}}=\|P_{\beta}^{\frac{1}{2}}A^{-1}P_{\beta}^{-\frac{1}{2}}P_{\beta}^{\frac{1}{2}}r_{k}\|\leq\|P_{\beta}^{\frac{1}{2}}A^{-1}P_{\beta}^{-\frac{1}{2}}\|\,\|P_{\beta}^{\frac{1}{2}}r_{k}\|\\
&=\|A^{-1}\|_{P_{\beta}}\|r_{k}\|_{P_{\beta}}\leq\|A^{-1}\|_{P_{\beta}}\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\,\|r_{0}\|_{P_{\beta}}\\
&=\|A^{-1}\|_{P_{\beta}}\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\,\|A(z_{0}-z_{*})\|_{P_{\beta}}\\
&\leq\|A^{-1}\|_{P_{\beta}}\|A\|_{P_{\beta}}\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\,\|z_{0}-z_{*}\|_{P_{\beta}}.
\end{align*}
This implies that $z_{k}$ converges linearly to $z_{*}$ under the conditions of Theorem 3.2.
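The error bound in Remark 3 has the form $\kappa\,\rho^{k}\,\|z_{0}-z_{*}\|_{P_{\beta}}$ with $\kappa=\|A^{-1}\|_{P_{\beta}}\|A\|_{P_{\beta}}$ and $\rho=\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}$. As a rough illustration, the sketch below turns this bound into a worst-case iteration count. The function names and the numerical values of $\kappa$, the norms, and the tolerance are hypothetical and chosen only for illustration; the bound is typically pessimistic.

```python
import math

def contraction_factor(a, b, delta):
    """rho = a + delta * b, with a = ||N M^{-1}|| and b = ||I - N M^{-1}||."""
    return a + delta * b

def iterations_for_tolerance(rho, kappa, tol):
    """Smallest k with kappa * rho**k <= tol (relative error bound of Remark 3)."""
    if not 0.0 < rho < 1.0:
        raise ValueError("linear convergence requires 0 < rho < 1")
    if kappa < 1.0 or tol <= 0.0:
        raise ValueError("need kappa >= 1 and tol > 0")
    k = max(0, math.ceil(math.log(tol / kappa) / math.log(rho)))
    while kappa * rho**k > tol:      # guard against floating-point rounding
        k += 1
    return k

# Hypothetical values: a = ||N M^{-1}|| = 0.6, the worst case b = 1 + a allowed
# by the triangle inequality, and the largest delta permitted by Theorem 3.2.
a = 0.6
b = 1.0 + a
delta = 0.5 * (1.0 - a)
rho = contraction_factor(a, b, delta)
k = iterations_for_tolerance(rho, kappa=50.0, tol=1e-8)
print(f"rho = {rho:.3f}, iterations needed = {k}")
```

Choosing $\delta$ at its upper limit pushes $\rho$ close to $1$, so a smaller $\delta$ (more accurate inner solves) trades cost per iteration against the number of outer iterations.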
Remark 4.
If $\beta=\delta$ in Theorem 3.2, then, since $\omega>0$ and $\delta\geq 0$, the condition $\omega<\sqrt{\lambda_{1}/\delta}$ holds if and only if $\delta<\lambda_{1}/\omega^{2}$.
The conditions on $\omega$ and $\delta$ in Theorem 3.2 can then be replaced by
\[
0<\omega<\frac{1}{(-2\eta)_{+}}\quad\mbox{and}\quad 0\leq\delta<\min\left\{\frac{\lambda_{1}}{\omega^{2}},\,\frac{1-\|NM^{-1}\|_{P_{\delta}}}{2}\right\}.
\]
It follows from (26) and (27) that
\begin{align*}
\|NM^{-1}\|_{P_{\delta}}^{2}
&=\rho(\widetilde{T}\widetilde{T}^{T})=\rho\Big(I-\tfrac{1}{\omega^{2}}Q^{-\frac{1}{2}}B^{T}S_{Q}^{-1}\big(2\omega H+BQ^{-1}B^{T}-\delta\omega^{2}I\big)S_{Q}^{-T}BQ^{-\frac{1}{2}}\Big)\\
&=\rho\Big(I-\tfrac{1}{\omega^{2}}Q^{-\frac{1}{2}}B^{T}S_{Q}^{-1}\big(2\omega H+BQ^{-1}B^{T}\big)S_{Q}^{-T}BQ^{-\frac{1}{2}}+\delta\,Q^{-\frac{1}{2}}B^{T}S_{Q}^{-1}S_{Q}^{-T}BQ^{-\frac{1}{2}}\Big).\tag{28}
\end{align*}
Since $\widetilde{T}\widetilde{T}^{T}$ is symmetric positive semidefinite and $Q^{-\frac{1}{2}}B^{T}S_{Q}^{-1}S_{Q}^{-T}BQ^{-\frac{1}{2}}$ is SPD, $\|NM^{-1}\|_{P_{\delta}}$ increases with $\delta$, and
\[
\lim_{\delta\rightarrow\lambda_{1}/\omega^{2}}\|NM^{-1}\|_{P_{\delta}}=1,\qquad\lim_{\delta\rightarrow 0^{+}}\|NM^{-1}\|_{P_{\delta}}=\sqrt{1-\tilde{\lambda}_{1}/\omega^{2}}<1,
\]
where $\tilde{\lambda}_{1}>0$ is the minimum eigenvalue of $Q^{-\frac{1}{2}}B^{T}S_{Q}^{-1}\big(2\omega H+BQ^{-1}B^{T}\big)S_{Q}^{-T}BQ^{-\frac{1}{2}}$. Then there exists $\delta>0$ such that $\|NM^{-1}\|_{P_{\delta}}<1$. Therefore, for any given $0<\omega<1/(-2\eta)_{+}$, Algorithm 2 is convergent for sufficiently small $\delta$. Moreover, the larger $\omega$ is, the smaller $\delta$ should be. Therefore, a practical choice of $\delta$ could be a sequence $\{\delta_{k}\}$ such that $\delta_{k}\rightarrow 0$ as $k\rightarrow\infty$.
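The monotone dependence of $\|NM^{-1}\|_{P_{\delta}}$ on $\delta$ described in Remark 4 can be observed on a small random example. The sketch below is illustrative only and rests on assumptions that are not restated in this section: it takes $M=\begin{pmatrix}G&B\\-B^{T}&\omega Q\end{pmatrix}$, $N=\begin{pmatrix}0&0\\0&\omega Q\end{pmatrix}$ and $P_{\delta}=\mathrm{diag}(I,\delta Q^{-1})$, the full-rank analogues of (29) and (30), and $H=\tfrac12(G+G^{T})$. The helper names and test data are hypothetical.

```python
import numpy as np

def spd_sqrt(Q):
    """Symmetric square root of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(Q)
    return (V * np.sqrt(w)) @ V.T

def weighted_norm(G, B, Q, omega, delta):
    """||N M^{-1}||_{P_delta} = ||P^{1/2} N M^{-1} P^{-1/2}||_2 (assumed M, N, P)."""
    n, m = B.shape
    M = np.block([[G, B], [-B.T, omega * Q]])
    N = np.zeros((n + m, n + m))
    N[n:, n:] = omega * Q
    Qh_inv = np.linalg.inv(spd_sqrt(Q))
    P_half = np.block([[np.eye(n), np.zeros((n, m))],
                       [np.zeros((m, n)), np.sqrt(delta) * Qh_inv]])
    X = P_half @ N @ np.linalg.inv(M) @ np.linalg.inv(P_half)
    return np.linalg.norm(X, 2)

# Illustrative data: unsymmetric G with SPD symmetric part, so any omega > 0 is allowed.
rng = np.random.default_rng(0)
n, m, omega = 6, 3, 0.5
R = rng.standard_normal((n, n))
G = R @ R.T + np.eye(n) + (R - R.T)
B = rng.standard_normal((n, m))
Qr = rng.standard_normal((m, m))
Q = Qr @ Qr.T + m * np.eye(m)

# lambda_1 = lambda_min(2 omega H + B Q^{-1} B^T); delta must stay below lambda_1 / omega^2.
H = 0.5 * (G + G.T)
lam1 = np.linalg.eigvalsh(2.0 * omega * H + B @ np.linalg.solve(Q, B.T))[0]
deltas = np.linspace(0.05, 0.9, 7) * lam1 / omega**2
norms = np.array([weighted_norm(G, B, Q, omega, d) for d in deltas])
for d, nrm in zip(deltas, norms):
    print(f"delta = {d:8.4f}   ||N M^-1||_P = {nrm:.6f}")
```

On such examples the printed norms should grow with $\delta$ while staying below $1$ on the sampled range, which is consistent with the limits stated in the remark.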
Remark 5.
When $G$ is positive semidefinite, (13) yields $\eta\geq 0$, and hence $(-2\eta)_{+}=0$. In this case, the sufficient conditions in Theorem 3.2 can be replaced by $0<\omega<\sqrt{\lambda_{1}/\beta}$ and $0\leq\delta\leq\tfrac{1}{2}\big(1-\|NM^{-1}\|_{P_{\beta}}\big)$.
Furthermore, from Remark 4 we know that the restrictions can also be replaced by $\omega>0$ and $0\leq\delta<\min\left\{\tfrac{\lambda_{1}}{\omega^{2}},\,\tfrac{1-\|NM^{-1}\|_{P_{\delta}}}{2}\right\}$.
This implies that when $G$ is positive semidefinite, for any $\omega>0$, Algorithm 2 is convergent for sufficiently small $\delta$.
3.2 Convergence analysis when $B$ is rank-deficient
Assume that the rank of $B$ is $s$ and $0<s<m$. Let $B=U\begin{pmatrix}\Sigma&0\end{pmatrix}V^{T}$ be the singular value decomposition (SVD), where $U\in\mathbb{R}^{n\times n}$ and $V\in\mathbb{R}^{m\times m}$ are orthogonal matrices, $\Sigma=\begin{pmatrix}\Sigma_{s}\\0\end{pmatrix}\in\mathbb{R}^{n\times s}$ has full column rank, and $\Sigma_{s}=\mathrm{diag}\{\sigma_{1},\sigma_{2},\ldots,\sigma_{s}\}$ with all $\sigma_{j}>0$ contains the nonzero singular values of $B$. Let $Q_{1}\in\mathbb{R}^{s\times s}$ and $Q_{2}\in\mathbb{R}^{(m-s)\times(m-s)}$ be SPD, and
(29)
Q = V ( Q 1 0 0 Q 2 ) V T , D ~ = ( U 0 0 V ) , P ~ β = ( I 0 0 β Q 1 − 1 ) , formulae-sequence 𝑄 𝑉 matrix subscript 𝑄 1 0 0 subscript 𝑄 2 superscript 𝑉 𝑇 formulae-sequence ~ 𝐷 matrix 𝑈 0 0 𝑉 subscript ~ 𝑃 𝛽 matrix 𝐼 0 0 𝛽 superscript subscript 𝑄 1 1 \displaystyle Q=V\begin{pmatrix}Q_{1}&0\\
0&Q_{2}\end{pmatrix}V^{T}\!,\qquad~{}\widetilde{D}=\begin{pmatrix}U&0\\
0&V\end{pmatrix},\qquad\quad~{}\widetilde{P}_{\beta}=\begin{pmatrix}I&0\\
0&\beta Q_{1}^{-1}\end{pmatrix}, italic_Q = italic_V ( start_ARG start_ROW start_CELL italic_Q start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL italic_Q start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_CELL end_ROW end_ARG ) italic_V start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT , over~ start_ARG italic_D end_ARG = ( start_ARG start_ROW start_CELL italic_U end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL italic_V end_CELL end_ROW end_ARG ) , over~ start_ARG italic_P end_ARG start_POSTSUBSCRIPT italic_β end_POSTSUBSCRIPT = ( start_ARG start_ROW start_CELL italic_I end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL italic_β italic_Q start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT end_CELL end_ROW end_ARG ) ,
(30)
A ~ = ( U T G U Σ − Σ 0 ) , M ~ = ( U T G U Σ − Σ ω Q 1 ) , N ~ = ( 0 0 0 ω Q 1 ) . formulae-sequence ~ 𝐴 matrix superscript 𝑈 𝑇 𝐺 𝑈 Σ Σ 0 formulae-sequence ~ 𝑀 matrix superscript 𝑈 𝑇 𝐺 𝑈 Σ Σ 𝜔 subscript 𝑄 1 ~ 𝑁 matrix 0 0 0 𝜔 subscript 𝑄 1 \displaystyle\widetilde{A}=\begin{pmatrix}U^{T}GU&\Sigma\\
-\Sigma&0\end{pmatrix},\qquad\widetilde{M}=\begin{pmatrix}U^{T}GU&\Sigma\\
-\Sigma&\omega Q_{1}\end{pmatrix},\qquad\widetilde{N}=\begin{pmatrix}0&0\\
0&\omega Q_{1}\end{pmatrix}. over~ start_ARG italic_A end_ARG = ( start_ARG start_ROW start_CELL italic_U start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_G italic_U end_CELL start_CELL roman_Σ end_CELL end_ROW start_ROW start_CELL - roman_Σ end_CELL start_CELL 0 end_CELL end_ROW end_ARG ) , over~ start_ARG italic_M end_ARG = ( start_ARG start_ROW start_CELL italic_U start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_G italic_U end_CELL start_CELL roman_Σ end_CELL end_ROW start_ROW start_CELL - roman_Σ end_CELL start_CELL italic_ω italic_Q start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_CELL end_ROW end_ARG ) , over~ start_ARG italic_N end_ARG = ( start_ARG start_ROW start_CELL 0 end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL italic_ω italic_Q start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_CELL end_ROW end_ARG ) .
Let $\widetilde{r}_{k}=\widetilde{D}^{T}\!r_{k}=\begin{pmatrix}\widetilde{r}_{k}^{a},\,\widetilde{r}_{k}^{b}\end{pmatrix}$ and $\widetilde{\Psi}(r_{k})=\widetilde{D}^{T}\!\Psi(r_{k})=\begin{pmatrix}\widetilde{\Psi}^{a}(r_{k}),\,\widetilde{\Psi}^{b}(r_{k})\end{pmatrix}$ with $\widetilde{r}_{k}^{a},\,\widetilde{\Psi}^{a}(r_{k})\in\mathds{R}^{n+s}$. It follows from (4), (12), (29) and (30) that
$$
\begin{aligned}
\widetilde{D}^{T}\!A\widetilde{D}
&=\begin{pmatrix}U^{T}&0\\ 0&V^{T}\end{pmatrix}
\begin{pmatrix}G&B\\ -B^{T}&0\end{pmatrix}
\begin{pmatrix}U&0\\ 0&V\end{pmatrix}
=\begin{pmatrix}U^{T}GU&U^{T}BV\\ -V^{T}B^{T}U&0\end{pmatrix}\\
&=\begin{pmatrix}U^{T}GU&\Sigma&0\\ -\Sigma^{T}&0&0\\ 0&0&0\end{pmatrix}
=:\begin{pmatrix}\widetilde{A}&0\\ 0&0\end{pmatrix},
\end{aligned}
\tag{31}
$$
$$
\begin{aligned}
\widetilde{D}^{T}\!M\widetilde{D}
&=\begin{pmatrix}U^{T}&0\\ 0&V^{T}\end{pmatrix}
\begin{pmatrix}G&B\\ -B^{T}&\omega Q\end{pmatrix}
\begin{pmatrix}U&0\\ 0&V\end{pmatrix}
=\begin{pmatrix}U^{T}GU&U^{T}BV\\ -V^{T}B^{T}U&\omega V^{T}QV\end{pmatrix}\\
&=\begin{pmatrix}U^{T}GU&\Sigma&0\\ -\Sigma^{T}&\omega Q_{1}&0\\ 0&0&\omega Q_{2}\end{pmatrix}
=:\begin{pmatrix}\widetilde{M}&0\\ 0&\omega Q_{2}\end{pmatrix},
\end{aligned}
\tag{32}
$$
$$
\begin{aligned}
\widetilde{D}^{T}\!N\widetilde{D}
&=\begin{pmatrix}U^{T}&0\\ 0&V^{T}\end{pmatrix}
\begin{pmatrix}0&0\\ 0&\omega Q\end{pmatrix}
\begin{pmatrix}U&0\\ 0&V\end{pmatrix}
=\begin{pmatrix}0&0\\ 0&\omega V^{T}QV\end{pmatrix}\\
&=\begin{pmatrix}0&0&0\\ 0&\omega Q_{1}&0\\ 0&0&\omega Q_{2}\end{pmatrix}
=:\begin{pmatrix}\widetilde{N}&0\\ 0&\omega Q_{2}\end{pmatrix}.
\end{aligned}
\tag{33}
$$
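The block structure in (31)-(33) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the sizes `n, m, s`, the value of `omega`, the random test matrices, and the helper names `random_spd` and `blkdiag` are all illustrative assumptions. It builds a rank-$s$ matrix $B$, forms $Q$ and $\widetilde{D}$ as in (29), and compares both sides of each identity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, s = 6, 4, 2          # hypothetical sizes with 0 < s < m <= n
omega = 0.7                # hypothetical omega > 0

def random_spd(k):
    """Random symmetric positive definite k-by-k matrix (test data only)."""
    X = rng.standard_normal((k, k))
    return X @ X.T + k * np.eye(k)

def blkdiag(X, Y):
    """Block-diagonal matrix diag(X, Y) for square X and Y."""
    p, q = X.shape[0], Y.shape[0]
    return np.block([[X, np.zeros((p, q))], [np.zeros((q, p)), Y]])

# Rank-s matrix B and a generic unsymmetric G (assumed test data).
B = rng.standard_normal((n, s)) @ rng.standard_normal((s, m))
G = rng.standard_normal((n, n))

# Full SVD: U is n x n, V is m x m; only the first s singular values are nonzero.
U, sing, Vt = np.linalg.svd(B, full_matrices=True)
V = Vt.T
Sigma = np.zeros((n, s))
Sigma[:s, :s] = np.diag(sing[:s])

# Q and D-tilde as in (29).
Q1, Q2 = random_spd(s), random_spd(m - s)
Q = V @ blkdiag(Q1, Q2) @ V.T
D = blkdiag(U, V)

# A, M, N in the original coordinates.
A = np.block([[G, B], [-B.T, np.zeros((m, m))]])
M = np.block([[G, B], [-B.T, omega * Q]])
N = M - A

# Reduced matrices as in (30).
A_t = np.block([[U.T @ G @ U, Sigma], [-Sigma.T, np.zeros((s, s))]])
M_t = np.block([[U.T @ G @ U, Sigma], [-Sigma.T, omega * Q1]])
N_t = M_t - A_t

ok31 = np.allclose(D.T @ A @ D, blkdiag(A_t, np.zeros((m - s, m - s))))
ok32 = np.allclose(D.T @ M @ D, blkdiag(M_t, omega * Q2))
ok33 = np.allclose(D.T @ N @ D, blkdiag(N_t, omega * Q2))
print(ok31, ok32, ok33)
```

The check relies only on the orthogonality of $U$ and $V$ and on the trailing $m-s$ singular values of $B$ being zero up to rounding.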
With this notation, we have the following results.
Lemma 6.
Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient with rank $s$. If (1) is solvable, then $\widetilde{r}_{k}^{b}=0$ for all $k\geq 1$.
Proof 3.4.
Let $z_{*}$ be a solution of (1), and let $\widetilde{z}_{*}=\widetilde{D}^{T}\!z_{*}=\begin{pmatrix}\widetilde{z}_{*}^{a},\,\widetilde{z}_{*}^{b}\end{pmatrix}$, $\widetilde{z}_{k}=\widetilde{D}^{T}\!z_{k}=\begin{pmatrix}\widetilde{z}_{k}^{a},\,\widetilde{z}_{k}^{b}\end{pmatrix}$, and $\widetilde{\ell}=\widetilde{D}^{T}\!\ell=\begin{pmatrix}\widetilde{\ell}^{a},\,\widetilde{\ell}^{b}\end{pmatrix}$, where $\widetilde{z}_{*}^{a},\,\widetilde{z}_{k}^{a},\,\widetilde{\ell}^{a}\in\mathds{R}^{n+s}$. It follows from $Az_{*}=\ell$ and (31) that
$$
\widetilde{D}^{T}\!A\widetilde{D}\widetilde{z}_{*}
=\begin{pmatrix}\widetilde{A}&0\\ 0&0\end{pmatrix}
\begin{pmatrix}\widetilde{z}_{*}^{a}\\ \widetilde{z}_{*}^{b}\end{pmatrix}
=\begin{pmatrix}\widetilde{A}\widetilde{z}_{*}^{a}\\ 0\end{pmatrix}
=\begin{pmatrix}\widetilde{\ell}^{a}\\ \widetilde{\ell}^{b}\end{pmatrix},
$$
which shows that $\widetilde{\ell}^{b}=0$. Then we have
$$
\widetilde{r}_{k}
=\widetilde{D}^{T}\!r_{k}
=\widetilde{D}^{T}\!(Az_{k}-\ell)
=\widetilde{D}^{T}\!A\widetilde{D}\widetilde{D}^{T}\!z_{k}-\widetilde{D}^{T}\!\ell
=\begin{pmatrix}\widetilde{A}\widetilde{z}_{k}^{a}-\widetilde{\ell}^{a}\\ -\widetilde{\ell}^{b}\end{pmatrix}
=\begin{pmatrix}\widetilde{r}_{k}^{a}\\ 0\end{pmatrix}.
$$
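Lemma 6 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the sizes, the random data, and the names `z_star`, `z_k`, and `r_b` are hypothetical, and the residual is taken as $r_{k}=Az_{k}-\ell$ as in the proof above. A consistent right-hand side $\ell=Az_{*}$ is generated, and the trailing $m-s$ entries of $\widetilde{D}^{T}r_{k}$ are checked to vanish for an arbitrary vector $z_{k}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, s = 6, 4, 2          # hypothetical sizes with 0 < s < m <= n

# Rank-s matrix B and a generic unsymmetric G (assumed test data).
B = rng.standard_normal((n, s)) @ rng.standard_normal((s, m))
G = rng.standard_normal((n, n))
A = np.block([[G, B], [-B.T, np.zeros((m, m))]])

# Orthogonal factors of the full SVD of B define D-tilde as in (29).
U, _, Vt = np.linalg.svd(B, full_matrices=True)
D = np.block([[U, np.zeros((n, m))], [np.zeros((m, n)), Vt.T]])

# A consistent right-hand side, so that the system is solvable.
z_star = rng.standard_normal(n + m)
ell = A @ z_star

# Arbitrary vector standing in for an iterate z_k.
z_k = rng.standard_normal(n + m)
r_tilde = D.T @ (A @ z_k - ell)

# The last m - s entries form the block r_k^b, which Lemma 6 says is zero.
r_b = r_tilde[n + s:]
print(np.linalg.norm(r_b))
```

The printed norm should be at the level of rounding error; with an inconsistent right-hand side (for example, `ell` plus a random perturbation) the block `r_b` is generally nonzero.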
Lemma 7.
Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient with rank $s$. For any $\omega,\,\beta>0$ and SPD $Q_{1}\in\mathds{R}^{s\times s}$ and $Q_{2}\in\mathds{R}^{(m-s)\times(m-s)}$, let $Q$ and $\delta$ be defined by (29) and (23). Then
$$
\|\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\|_{\widetilde{P}_{\beta}}\leq\delta\|\widetilde{r}^{a}_{k}\|_{\widetilde{P}_{\beta}}.
$$
Proof 3.5.
For any $x\in\mathds{R}^{n+m}$ and $\widetilde{x}=\widetilde{D}^{T}x=\begin{pmatrix}\widetilde{x}^{a},\,\widetilde{x}^{b}\end{pmatrix}$ with $\widetilde{x}^{a}\in\mathds{R}^{n+s}$, since $\widetilde{D}$ is an orthogonal matrix, from (29) and the definition of $P_{\beta}$ in Section 3, we have
$$
\begin{aligned}
\|x\|_{P_{\beta}}^{2}
&=x^{T}P_{\beta}x
=x^{T}\widetilde{D}\widetilde{D}^{T}P_{\beta}\widetilde{D}\widetilde{D}^{T}x
=\begin{pmatrix}(\widetilde{x}^{a})^{T}&(\widetilde{x}^{b})^{T}\end{pmatrix}
\begin{pmatrix}\widetilde{P}_{\beta}&0\\ 0&\beta Q_{2}^{-1}\end{pmatrix}
\begin{pmatrix}\widetilde{x}^{a}\\ \widetilde{x}^{b}\end{pmatrix}\\
&=\|\widetilde{x}^{a}\|_{\widetilde{P}_{\beta}}^{2}+\|\widetilde{x}^{b}\|_{\beta Q_{2}^{-1}}^{2}.
\end{aligned}
\tag{34}
$$
Note that (32) and Lemma 6 give
$$
\widetilde{D}^{T}\left(r_{k}-M\Psi(r_{k})\right)
=\widetilde{r}_{k}-\widetilde{D}^{T}M\widetilde{D}\widetilde{\Psi}(r_{k})
=\begin{pmatrix}\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\\ -\omega Q_{2}\widetilde{\Psi}^{b}(r_{k})\end{pmatrix}.
$$
This along with (34) leads to
$$
\begin{aligned}
\|r_{k}-M\Psi(r_{k})\|_{P_{\beta}}^{2}
&=\|\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\|_{\widetilde{P}_{\beta}}^{2}+\|\omega Q_{2}\widetilde{\Psi}^{b}(r_{k})\|_{\beta Q_{2}^{-1}}^{2}\\
&=\|\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\|_{\widetilde{P}_{\beta}}^{2}+\omega^{2}\beta\|\widetilde{\Psi}^{b}(r_{k})\|_{Q_{2}}^{2}.
\end{aligned}
$$
Since the second term on the right-hand side is nonnegative, using (23), (34) and $\widetilde{r}^{b}_{k}=0$ yields
$$
\|\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\|_{\widetilde{P}_{\beta}}
\leq\|r_{k}-M\Psi(r_{k})\|_{P_{\beta}}
\leq\delta\|r_{k}\|_{P_{\beta}}
=\delta\|\widetilde{r}^{a}_{k}\|_{\widetilde{P}_{\beta}}.
$$
We are now ready to establish the convergence theorem for Algorithm 2 when $B$ is rank-deficient.
Theorem 3.6.
Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient with rank $s$ and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T})$. For any $\beta>0$ and SPD $Q_{1}\in\mathds{R}^{(n+s)\times(n+s)}$ and $Q_{2}\in\mathds{R}^{(m-s)\times(m-s)}$, let $Q$, $\eta$ and $\delta$ be defined by (29), (13) and (23), and let $\lambda_{1}$ be the minimum eigenvalue of $2\omega H+BQ^{-1}B^{T}$. If $\omega$ and $\delta$ satisfy
\[
0<\omega<\min\left\{\frac{1}{(-2\eta)_{+}},\,\sqrt{\frac{\lambda_{1}}{\beta}}\right\}\quad\mbox{and}\quad 0\leq\delta\leq\tfrac{1}{2}\Big(1-\|\widetilde{N}\widetilde{M}^{-1}\|_{\widetilde{P}_{\beta}}\Big),
\]
then the sequence $\{x_{k},y_{k}\}$ produced by Algorithm 2 converges to a solution of the singular saddle-point system (1).
Proof 3.7.
By Lemma 6, we only need to prove $\lim\limits_{k\rightarrow\infty}\widetilde{r}_{k}^{a}=0$. Since $\widetilde{D}$ is an orthogonal matrix, it follows from (24), (29), (32) and (33) that
\begin{align*}
\begin{pmatrix}\widetilde{r}_{k+1}^{a}\\ \widetilde{r}_{k+1}^{b}\end{pmatrix}
&=\widetilde{r}_{k+1}=\widetilde{D}^{T}r_{k+1}=\widetilde{D}^{T}\left[NM^{-1}r_{k}+(I-NM^{-1})(r_{k}-M\Psi(r_{k}))\right]\\
&=\widetilde{D}^{T}N\widetilde{D}(\widetilde{D}^{T}M\widetilde{D})^{-1}\widetilde{D}^{T}r_{k}+\left[I-\widetilde{D}^{T}N\widetilde{D}(\widetilde{D}^{T}M\widetilde{D})^{-1}\right]\left(\widetilde{D}^{T}r_{k}-\widetilde{D}^{T}M\widetilde{D}\widetilde{D}^{T}\Psi(r_{k})\right)\\
&=\begin{pmatrix}\widetilde{N}\widetilde{M}^{-1}&0\\ 0&I\end{pmatrix}\begin{pmatrix}\widetilde{r}_{k}^{a}\\ \widetilde{r}_{k}^{b}\end{pmatrix}+\left[I-\begin{pmatrix}\widetilde{N}\widetilde{M}^{-1}&0\\ 0&I\end{pmatrix}\right]\left[\begin{pmatrix}\widetilde{r}_{k}^{a}\\ \widetilde{r}_{k}^{b}\end{pmatrix}-\begin{pmatrix}\widetilde{M}&0\\ 0&\omega Q_{2}\end{pmatrix}\begin{pmatrix}\widetilde{\Psi}^{a}(r_{k})\\ \widetilde{\Psi}^{b}(r_{k})\end{pmatrix}\right]\\
&=\begin{pmatrix}\widetilde{N}\widetilde{M}^{-1}\widetilde{r}_{k}^{a}+(I-\widetilde{N}\widetilde{M}^{-1})(\widetilde{r}_{k}^{a}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k}))\\ \widetilde{r}_{k}^{b}\end{pmatrix}.
\end{align*}
Thus,
\[
\widetilde{r}_{k+1}^{a}=\widetilde{N}\widetilde{M}^{-1}\widetilde{r}_{k}^{a}+(I-\widetilde{N}\widetilde{M}^{-1})(\widetilde{r}_{k}^{a}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})).
\]
Using (24), (31), (32), (33) and Lemma 7, we know that $\widetilde{r}_{k}^{a}$ is the $k$-th residual of Algorithm 2 applied to the saddle-point problem $\widetilde{A}\widetilde{z}=\widetilde{\ell}$.
Note that $x\in\mathop{\mathrm{Null}}(\Sigma^{T})$ if and only if $Ux\in\mathop{\mathrm{Null}}(B^{T})$, and $U^{T}GU$ is positive definite on $\mathop{\mathrm{Null}}(\Sigma^{T})$. With (13), (29), and the SVD of $B$, we have
\begin{align*}
\inf_{x\notin\mathop{\mathrm{Null}}(\Sigma^{T})}\frac{x^{T}U^{T}HUx}{x^{T}\Sigma Q_{1}^{-1}\Sigma^{T}x}
&\xlongequal{\hat{x}=Ux}\inf_{\hat{x}\notin\mathop{\mathrm{Null}}(B^{T})}\frac{\hat{x}^{T}H\hat{x}}{\hat{x}^{T}U\begin{pmatrix}\Sigma&0\end{pmatrix}V^{T}Q^{-1}V\begin{pmatrix}\Sigma^{T}\\ 0\end{pmatrix}U^{T}\hat{x}}\\
&=\inf_{\hat{x}\notin\mathop{\mathrm{Null}}(B^{T})}\frac{\hat{x}^{T}H\hat{x}}{\hat{x}^{T}BQ^{-1}B^{T}\hat{x}}=\eta.
\end{align*}
Since $\omega(U^{T}GU+U^{T}G^{T}U)+\Sigma Q_{1}^{-1}\Sigma^{T}$ is similar to $2\omega H+U\Sigma Q_{1}^{-1}\Sigma^{T}U^{T}=2\omega H+BQ^{-1}B^{T}$ and $\Sigma$ has full rank, Lemma 1 and Theorem 3.2 imply $\|\widetilde{N}\widetilde{M}^{-1}\|_{\widetilde{P}_{\beta}}<1$, and hence $\widetilde{r}_{k}^{a}$ converges to zero as $k\rightarrow\infty$.
Combining this with $\widetilde{r}_{k}^{b}=0$ completes the proof.
Similar to Remarks 4 and 5, when $B$ is rank-deficient, for any given $0<\omega<1/(-2\eta)_{+}$, Algorithm 2 is still convergent for sufficiently small $\delta\geq 0$. Furthermore, when $G$ is positive semidefinite, Algorithm 2 is convergent for any $\omega>0$ and sufficiently small $\delta\geq 0$.
3.3 Augmented Lagrangian BB algorithm
Gradient-type iterative methods for the unconstrained optimization problem $\min\limits_{z\in\mathds{R}^{\hat{n}}}\hat{f}(z)$ have the form
\[
z_{k+1}=z_{k}-\alpha_{k}g_{k}, \tag{35}
\]
where $\hat{f}:\mathds{R}^{\hat{n}}\rightarrow\mathds{R}$ is a sufficiently smooth function, $g_{k}=\nabla\hat{f}(z_{k})$ is the gradient, and $\alpha_{k}>0$ is a stepsize. Methods of this type differ in their stepsize rules. In 1988, Barzilai and Borwein [5] proposed two choices of $\alpha_{k}$, usually referred to as the BB method:
\[
\alpha_{k}^{\rm BB1}=\frac{s_{k-1}^{T}s_{k-1}}{s_{k-1}^{T}d_{k-1}}\quad\textrm{and}\quad\alpha_{k}^{\rm BB2}=\frac{s_{k-1}^{T}d_{k-1}}{d_{k-1}^{T}d_{k-1}}, \tag{36}
\]
where $s_{k-1}=z_{k}-z_{k-1}$ and $d_{k-1}=g_{k}-g_{k-1}$.
The rationale behind these choices is related to viewing the gradient-type methods as quasi-Newton methods, where $\alpha_{k}$ in (35) is replaced by $D_{k}=\alpha_{k}I$. This matrix serves as an approximate inverse Hessian. Following the quasi-Newton approach, the stepsize is calculated by forcing either $D_{k}^{-1}$ (BB1 method) or $D_{k}$ (BB2 method) to satisfy the secant equation in the least squares sense. The corresponding problems are $\min\limits_{D=\alpha I}\|D^{-1}s_{k-1}-d_{k-1}\|$ and $\min\limits_{D=\alpha I}\|s_{k-1}-Dd_{k-1}\|$.
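For completeness, the two scalar least-squares problems can be solved in closed form, which recovers (36). The short derivation below is a sketch that assumes the Euclidean norm and $s_{k-1}^{T}d_{k-1}\neq 0$; the auxiliary scalar $\tau=1/\alpha$ is introduced here only for illustration and is not notation from the paper.

```latex
\begin{align*}
\min_{\tau}\ \|\tau s_{k-1}-d_{k-1}\|^{2}
&\;\Longrightarrow\;
\tau=\frac{s_{k-1}^{T}d_{k-1}}{s_{k-1}^{T}s_{k-1}}
\;\Longrightarrow\;
\alpha_{k}^{\rm BB1}=\frac{1}{\tau}=\frac{s_{k-1}^{T}s_{k-1}}{s_{k-1}^{T}d_{k-1}},\\
\min_{\alpha}\ \|s_{k-1}-\alpha d_{k-1}\|^{2}
&\;\Longrightarrow\;
\alpha_{k}^{\rm BB2}=\frac{s_{k-1}^{T}d_{k-1}}{d_{k-1}^{T}d_{k-1}}.
\end{align*}
```

In both cases the minimizer follows from setting the derivative of the squared norm with respect to the scalar unknown to zero.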
When $\hat{f}(z)$ is a convex quadratic, i.e., $\hat{f}(z)=\tfrac{1}{2}z^{T}\hat{A}z-\hat{\ell}^{T}z$ with $\hat{A}$ SPD, minimizing $\hat{f}$ is equivalent to solving $\hat{A}z=\hat{\ell}$. In this case, $g_{k}=\hat{A}z_{k}-\hat{\ell}=r_{k}$,
\[
s_{k-1}=-\alpha_{k-1}r_{k-1}\quad\mbox{and}\quad d_{k-1}=r_{k}-r_{k-1}=\hat{A}s_{k-1}=-\alpha_{k-1}\hat{A}r_{k-1}. \tag{37}
\]
Then the two BB stepsizes (36) can be reformulated as
\[
\alpha_{k}^{\rm BB1}=\frac{r_{k-1}^{T}r_{k-1}}{r_{k-1}^{T}\hat{A}r_{k-1}}\quad\textrm{and}\quad\alpha_{k}^{\rm BB2}=\frac{r_{k-1}^{T}\hat{A}r_{k-1}}{r_{k-1}^{T}\hat{A}^{T}\hat{A}r_{k-1}}.
\]
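The residual form of the two stepsizes translates directly into a few lines of code. The following is a minimal NumPy sketch, not the authors' implementation: the function name `bb_solve`, its parameters, the stopping rule, and the choice of an exact line-search step for the very first iteration are all assumptions made for illustration, and the matrix passed in is assumed to be SPD.

```python
import numpy as np

def bb_solve(A, b, z0, variant="BB1", tol=1e-10, max_iter=500):
    """Gradient iteration z_{k+1} = z_k - alpha_k r_k with BB stepsizes.

    Hypothetical helper: A is assumed SPD and r_k = A z_k - b is the residual.
    Returns the final iterate and the number of iterations performed.
    """
    z = np.asarray(z0, dtype=float).copy()
    r = A @ z - b
    alpha = None
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol:
            return z, k
        Ar = A @ r
        if alpha is None:
            # First step: exact line search (an assumption; BB needs a start).
            alpha = (r @ r) / (r @ Ar)
        z = z - alpha * r
        # Stepsize for the NEXT iteration, built from the residual just used.
        if variant == "BB1":
            alpha = (r @ r) / (r @ Ar)
        else:  # BB2
            alpha = (r @ Ar) / (Ar @ Ar)
        r = A @ z - b
    return z, max_iter

# Small illustrative SPD system (values chosen arbitrarily).
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
z_bb1, it_bb1 = bb_solve(A, b, np.zeros(2), variant="BB1")
z_bb2, it_bb2 = bb_solve(A, b, np.zeros(2), variant="BB2")
```

Note that the stepsize used at iteration $k$ depends on the residual $r_{k-1}$ of the previous iteration, which is what distinguishes BB from steepest descent.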
Global convergence of the BB method for minimizing quadratic forms was established by Raydan [39], and its R-linear convergence rate was established by Dai and Liao [17]. For general strongly convex functions with Lipschitz gradient, the local convergence of the BB method with R-linear rate was rigorously proved by Dai et al. [19]. Extensive numerical experiments show that the BB method can solve unconstrained optimization problems efficiently and is considerably superior to the steepest descent method [12, 40].
A variety of modifications and extensions of the BB method have been developed for optimization.
Several researchers have used the BB method to solve UPD linear systems. Dai et al. [18] gave an analysis of the BB1 method for two-by-two unsymmetric linear systems. Under mild conditions, they showed that the convergence rate of the BB1 method is $Q$-superlinear if the matrix has a double eigenvalue, but only $R$-superlinear if the matrix has two different real eigenvalues. We find, however, that the BB1 method can diverge when applied to UPD linear systems. Indeed, consider
\[
\hat{A}z:=\begin{pmatrix}1&2\\ -2&1\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}=\begin{pmatrix}0\\ 0\end{pmatrix}.
\]
Note that $\hat{A}$ has two complex eigenvalues $1\pm 2{\rm i}$. The conditions in [18] do not hold.
Because the skew-symmetric part of $\hat{A}$ contributes nothing to $s_{k-1}^{T}\hat{A}s_{k-1}$, it follows from (36) and (37) that $\alpha_{k}^{\rm BB1}=(s_{k-1}^{T}s_{k-1})/(s_{k-1}^{T}\hat{A}s_{k-1})=1$. Then, one BB1 iteration gives
\[
z_{k+1}=z_{k}-r_{k}=\begin{pmatrix}x_{k}\\ y_{k}\end{pmatrix}-\begin{pmatrix}x_{k}+2y_{k}\\ -2x_{k}+y_{k}\end{pmatrix}=\begin{pmatrix}-2y_{k}\\ 2x_{k}\end{pmatrix}.
\]
This leads to $\|z_{k+1}\|^{2}=4y_{k}^{2}+4x_{k}^{2}=4\|z_{k}\|^{2}$, which means that the sequence $\{z_{k}\}$ of BB1 iterates diverges for any initial $z_{0}\neq 0$.
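The divergence is easy to reproduce numerically. The following is a minimal sketch, assuming NumPy; the function name `bb1_norms` and the two starting points are illustrative choices, not taken from the paper. It runs BB1 on the $2\times 2$ example with zero right-hand side and records the growth factor of $\|z_{k}\|$ per iteration.

```python
import numpy as np

A = np.array([[1.0, 2.0], [-2.0, 1.0]])

def bb1_norms(z_prev, z, iters=5):
    """Run BB1 on A z = 0 and return the Euclidean norms of the iterates."""
    norms = [np.linalg.norm(z)]
    for _ in range(iters):
        s = z - z_prev                      # s_{k-1} = z_k - z_{k-1}
        alpha = (s @ s) / (s @ (A @ s))     # BB1 stepsize, equals 1 for this A
        r = A @ z                           # residual (right-hand side is zero)
        z_prev, z = z, z - alpha * r
        norms.append(np.linalg.norm(z))
    return norms

# Hypothetical starting points z_{-1} and z_0 (any distinct pair works).
norms = bb1_norms(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
ratios = [b / a for a, b in zip(norms, norms[1:])]
print(ratios)
```

In this run each step multiplies $\|z_{k}\|$ by 2, so the squared norm grows by a factor of 4 per iteration.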
For quadratic programming with $\hat{A}$ unsymmetric, the minimal gradient method [31, 32, 42] uses the stepsize $\alpha_{k}^{\rm MG}=(r_{k}^{T}\hat{A}r_{k})/(r_{k}^{T}\hat{A}^{T}\hat{A}r_{k})$, which gives an optimal residual in each iteration, namely,
\[
\alpha_{k}^{\rm MG}=\arg\min_{\alpha>0}\|\hat{A}(z_{k}-\alpha r_{k})-b\|=\arg\min_{\alpha>0}\|r_{k}-\alpha\hat{A}r_{k}\|.
\]
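To see why this stepsize forces the residual to shrink, one can expand the squared norm as a quadratic in $\alpha$. The following is a standard calculation, included only as a sketch; it uses $r_{k+1}=r_{k}-\alpha\hat{A}r_{k}$ and no notation beyond that already introduced.

```latex
\begin{align*}
\|r_{k}-\alpha\hat{A}r_{k}\|^{2}
  &= \|r_{k}\|^{2}-2\alpha\,r_{k}^{T}\hat{A}r_{k}+\alpha^{2}\|\hat{A}r_{k}\|^{2},\\
\|r_{k}-\alpha_{k}^{\rm MG}\hat{A}r_{k}\|^{2}
  &= \|r_{k}\|^{2}-\frac{(r_{k}^{T}\hat{A}r_{k})^{2}}{\|\hat{A}r_{k}\|^{2}}.
\end{align*}
```

Setting the derivative of the first line with respect to $\alpha$ to zero gives $\alpha_{k}^{\rm MG}$, and the second line is the resulting minimum. Since $r_{k}^{T}\hat{A}r_{k}>0$ for UPD $\hat{A}$ and $r_{k}\neq 0$, the residual norm decreases strictly at every step.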
Therefore, the minimal gradient method is convergent for solving UPD linear systems. Note that the difference between $\alpha_{k}^{\rm MG}$ and $\alpha_{k}^{\rm BB2}$ is that the former uses $r_{k}$ and the latter uses $r_{k-1}$. The BB2 method can thus be regarded as the minimal gradient method with delay [24]. Introducing delay can significantly improve the performance of gradient methods; see [51] and references therein. Hence, we use the BB2 method to derive the new iterates $x_{k+1}$ and $y_{k+1}$ in Algorithm 2 when $G$ is positive definite. The resulting augmented Lagrangian BB algorithm for solving (1) is given in Algorithm 3.
Algorithm 3 Augmented Lagrangian BB algorithm, SPALBB
1: Given $z_{-1}=(x_{-1},\,y_{-1})$, $z_{0}=(x_{0},\,y_{0})\in\mathds{R}^{n+m}$, $\omega>0$, $0\leq\delta<1$, and SPD $Q$, compute $r_{0}=Mz_{0}-(f,\,\omega Qy_{0}+g)$ and set $k=0$.
2: while a stopping condition is not satisfied do
3: Compute $\ell_{k}=(f,\,\omega Qy_{k}+g)$.
4: while $\|r_{j}-Mz_{j}\|_{*}>\delta\|r_{j}\|_{*}$ do
5: Compute $s_{j}=z_{j}-z_{j-1}$.
6: Compute $d_{j}=Ms_{j}$.
7: Compute $r_{j}=Mz_{j}-\ell_{k}$.
8: Compute $\alpha_{j}=\frac{s_{j}^{T}d_{j}}{\|d_{j}\|^{2}}$.
9: Compute $z_{j+1}=z_{j}-\alpha_{j}r_{j}$.
10: end while
11: Increment $k$ by 1.
12: end while
In the following, we establish the convergence of Algorithm 3. First, under some assumptions, we show that the BB2 method converges when applied to a general UPD linear system $\hat{A}z=\hat{\ell}$, where the iterative scheme is
\[
z_{k+1}=z_{k}-\alpha_{k}^{\rm BB2}r_{k}\quad\text{and}\quad r_{k}=\hat{A}z_{k}-\hat{\ell}.
\]
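The scheme can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `bb2_solve`, the relative-residual stopping test, the tolerance, and the right-hand side are all hypothetical choices. The stepsize is formed from $s_{k-1}=z_{k}-z_{k-1}$ as in steps 5 to 8 of Algorithm 3, which requires two distinct starting points.

```python
import numpy as np

def bb2_solve(A, ell, z_prev, z, tol=1e-10, max_iter=1000):
    """BB2 iteration for A z = ell; returns (z, iterations used).

    Assumes z_prev != z so that the first stepsize is well defined.
    """
    ell_norm = np.linalg.norm(ell)
    for k in range(max_iter):
        r = A @ z - ell                     # residual r_k
        if np.linalg.norm(r) <= tol * ell_norm:
            return z, k
        s = z - z_prev                      # s_{k-1} = z_k - z_{k-1}
        d = A @ s
        alpha = (s @ d) / (d @ d)           # BB2 stepsize
        z_prev, z = z, z - alpha * r
    return z, max_iter

# The 2x2 matrix from the BB1 example, with an illustrative right-hand side.
A = np.array([[1.0, 2.0], [-2.0, 1.0]])
ell = np.array([1.0, 3.0])
z, iters = bb2_solve(A, ell, np.zeros(2), np.ones(2))
print(iters, z)
```

For this matrix $\hat{A}^{T}\hat{A}=5I$, so the BB2 stepsize is $1/5$ at every step and the residual norm contracts by $\sqrt{0.8}$ per iteration. Convergence is not guaranteed for every UPD matrix, which is why a sufficient condition is needed.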
For convenience, we introduce
\begin{align}
\hat{A}_{h} &= \tfrac{1}{2}(\hat{A}+\hat{A}^{T}),\qquad W=\hat{A}_{h}^{-1}\hat{A}^{T}\!\hat{A}, \notag\\
\theta_{j} &= \max\left\{1-\frac{2u_{j}}{\lambda_{\min}(W)}+\frac{|\lambda_{j}|^{2}}{\lambda_{\min}(W)^{2}},\,1-\frac{2u_{j}}{\lambda_{\max}(W)}+\frac{|\lambda_{j}|^{2}}{\lambda_{\max}(W)^{2}}\right\}, \tag{38}
\end{align}
where $\lambda_{j}=u_{j}+{\rm i}v_{j}$ $(1\leq j\leq\hat{n})$ are the eigenvalues of $\hat{A}$. When $\hat{A}$ is UPD, we know that $\hat{A}_{h}$ is SPD and $u_{j}>0$ $(1\leq j\leq\hat{n})$. By direct calculation, $\theta_{j}<1$ holds for all $1\leq j\leq\hat{n}$ if and only if
\[
1-\frac{2u_{j}}{\lambda_{\min}(W)}+\frac{|\lambda_{j}|^{2}}{\lambda_{\min}(W)^{2}}<1
\quad\text{and}\quad
1-\frac{2u_{j}}{\lambda_{\max}(W)}+\frac{|\lambda_{j}|^{2}}{\lambda_{\max}(W)^{2}}<1
\]
for all $j$. These inequalities read $|\lambda_{j}|^{2}/u_{j}<2\lambda_{\min}(W)$ and $|\lambda_{j}|^{2}/u_{j}<2\lambda_{\max}(W)$, so together they are equivalent to
\[
\max_{1\leq j\leq\hat{n}}\frac{|\lambda_{j}|^{2}}{u_{j}}<2\lambda_{\min}(W). \tag{39}
\]
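Condition (39) can be checked numerically for a given matrix. The sketch below assumes NumPy; the function name `check_condition_39` is hypothetical. Instead of forming $W$ directly, it uses a Cholesky factor $\hat{A}_{h}=LL^{T}$ and the symmetric matrix $L^{-1}\hat{A}^{T}\hat{A}L^{-T}$, which is similar to $W$, so that a symmetric eigenvalue routine can be used.

```python
import numpy as np

def check_condition_39(A):
    """Return (holds, theta): whether (39) holds, and the factors theta_j of (38)."""
    Ah = 0.5 * (A + A.T)
    L = np.linalg.cholesky(Ah)                # raises LinAlgError if A_h is not SPD
    X = np.linalg.solve(L, A.T @ A)           # L^{-1} A^T A
    S = np.linalg.solve(L, X.T).T             # L^{-1} A^T A L^{-T}, similar to W
    w = np.linalg.eigvalsh(0.5 * (S + S.T))   # real eigenvalues, ascending order
    w_min, w_max = w[0], w[-1]

    lam = np.linalg.eigvals(A)                # lambda_j = u_j + i v_j
    u, mod2 = lam.real, np.abs(lam) ** 2

    def q(w_val):
        # value of 1 - 2 u_j alpha + |lambda_j|^2 alpha^2 at alpha = 1 / w_val
        return 1.0 - 2.0 * u / w_val + mod2 / w_val**2

    theta = np.maximum(q(w_min), q(w_max))
    holds = bool(np.max(mod2 / u) < 2.0 * w_min)
    return holds, theta

A = np.array([[1.0, 2.0], [-2.0, 1.0]])       # the matrix from the BB1 example
holds, theta = check_condition_39(A)
print(holds, theta)
```

For this matrix $W=5I$ and $|\lambda_{j}|^{2}/u_{j}=5<10$, so the condition holds with $\theta_{j}=0.8$ for both eigenvalues.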
We are now ready to study the convergence of the BB2 method.
Theorem 3.8.
Suppose $\hat{A}\in\mathds{R}^{\hat{n}\times\hat{n}}$ is UPD. If its $\hat{n}$ eigenvalues $\lambda_{j}=u_{j}+{\rm i}v_{j}$ $(1\leq j\leq\hat{n})$ satisfy (39), then the sequence $\{z_{k}\}$ produced by the BB2 method converges to the unique solution of $\hat{A}z=\hat{\ell}$.
Proof 3.9.
It is well known that the BB method is invariant under unitary transformation of the variables [17]. By the Schur decomposition, we can assume without loss of generality that $\hat{A}$ is of the form
\[
\begin{pmatrix}\lambda_{1}&a_{12}&a_{13}&\cdots&a_{1\hat{n}}\\
0&\lambda_{2}&a_{23}&\cdots&a_{2\hat{n}}\\
\vdots&\ddots&\ddots&\ddots&\vdots\\
0&\cdots&0&\lambda_{\hat{n}-1}&a_{\hat{n}-1,\hat{n}}\\
0&\cdots&\cdots&0&\lambda_{\hat{n}}\end{pmatrix},
\]
where $\lambda_{j}=u_{j}+{\rm i}v_{j}\in\mathbb{C}$, $j=1,2,\ldots,\hat{n}$. Because $r_{k+1}=\hat{A}z_{k+1}-\hat{\ell}=r_{k}-\alpha_{k}^{\rm BB2}\hat{A}r_{k}$,
\[
\left\{\begin{array}{l}
r_{k+1}^{(\hat{n})}=r_{k}^{(\hat{n})}-\alpha_{k}^{\rm BB2}\lambda_{\hat{n}}r_{k}^{(\hat{n})},\\[3.0pt]
r_{k+1}^{(j)}=r_{k}^{(j)}-\alpha_{k}^{\rm BB2}\lambda_{j}r_{k}^{(j)}-\alpha_{k}^{\rm BB2}\sum\limits_{t=j+1}^{\hat{n}}a_{j,t}r_{k}^{(t)},~j=\hat{n}-1,\ldots,1,
\end{array}\right. \tag{40}
\]
where $r_{k}^{(j)}$ is the $j$-th component of $r_{k}$.
Note that $\hat{A}_{h}=\tfrac{1}{2}(\hat{A}+\hat{A}^{T})$ and $r_{k-1}^{T}\hat{A}r_{k-1}=r_{k-1}^{T}\hat{A}^{T}r_{k-1}$, giving
\[
r_{k-1}^{T}\hat{A}r_{k-1}=\tfrac{1}{2}\left(r_{k-1}^{T}\hat{A}r_{k-1}+r_{k-1}^{T}\hat{A}^{T}r_{k-1}\right)=r_{k-1}^{T}\hat{A}_{h}r_{k-1}.
\]
Since $\hat{A}_{h}$ is SPD, we obtain
\[
\alpha_{k}^{\rm BB2}=\frac{r_{k-1}^{T}\hat{A}r_{k-1}}{r_{k-1}^{T}\hat{A}^{T}\hat{A}r_{k-1}}=\frac{r_{k-1}^{T}\hat{A}_{h}r_{k-1}}{r_{k-1}^{T}\hat{A}^{T}\hat{A}r_{k-1}}\xlongequal{\hat{r}=\hat{A}_{h}^{\frac{1}{2}}r_{k-1}}\frac{\hat{r}^{T}\hat{r}}{\hat{r}^{T}\hat{A}_{h}^{-\frac{1}{2}}\hat{A}^{T}\hat{A}\hat{A}_{h}^{-\frac{1}{2}}\hat{r}}.
\]
By the Courant-Fischer min-max theorem and the fact that $\hat{A}_{h}^{-\frac{1}{2}}\hat{A}^{T}\hat{A}\hat{A}_{h}^{-\frac{1}{2}}$ is similar to $W$, we have
\[
\frac{1}{\lambda_{\max}(W)}\leq\alpha_{k}^{\rm BB2}\leq\frac{1}{\lambda_{\min}(W)}. \tag{41}
\]
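The bounds in (41) can be observed empirically by sampling residual directions. The sketch below assumes NumPy; the non-normal test matrix, the sample size, and the random seed are illustrative assumptions, not data from the paper.

```python
import numpy as np

A = np.array([[1.0, 1.5], [0.0, 1.0]])        # UPD but non-normal (illustrative)
Ah = 0.5 * (A + A.T)
W = np.linalg.solve(Ah, A.T @ A)              # W = A_h^{-1} A^T A
w = np.sort(np.linalg.eigvals(W).real)        # eigenvalues of W are real and positive
lo, hi = 1.0 / w[-1], 1.0 / w[0]              # the two bounds in (41)

rng = np.random.default_rng(0)
R = rng.standard_normal((1000, 2))            # each row plays the role of r_{k-1}
AR = R @ A.T                                  # each row is A r
alphas = np.einsum("ij,ij->i", R, AR) / np.einsum("ij,ij->i", AR, AR)
print(lo, alphas.min(), alphas.max(), hi)
```

For this matrix the eigenvalues of $W$ are $4/7$ and $4$, so every sampled stepsize lies in $[0.25,\,1.75]$, whatever the direction of the residual.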
It follows from $\lambda_{j}=u_{j}+{\rm i}v_{j}$, (38), (41), and the behavior of the quadratic function for $\alpha_{k}^{\rm BB2}$ that, for any $j=1,\ldots,\hat{n}$,
$$\begin{aligned}
\left|1-\alpha_{k}^{\rm BB2}\lambda_{j}\right|^{2}
&=\left(1-\alpha_{k}^{\rm BB2}u_{j}\right)^{2}+\left(\alpha_{k}^{\rm BB2}v_{j}\right)^{2}
=1-2\alpha_{k}^{\rm BB2}u_{j}+\left(\alpha_{k}^{\rm BB2}\right)^{2}\left|\lambda_{j}\right|^{2}\\
&\leq\max\left\{1-\tfrac{2u_{j}}{\lambda_{\min}(W)}+\tfrac{|\lambda_{j}|^{2}}{\lambda_{\min}(W)^{2}},\,1-\tfrac{2u_{j}}{\lambda_{\max}(W)}+\tfrac{|\lambda_{j}|^{2}}{\lambda_{\max}(W)^{2}}\right\}=\theta_{j}.
\end{aligned} \tag{42}$$
Combining with (39) and (40) gives
$$\left|r_{k+1}^{(\hat{n})}\right|=\left|1-\alpha_{k}^{\rm BB2}\lambda_{\hat{n}}\right|\,\left|r_{k}^{(\hat{n})}\right|\leq\sqrt{\theta_{\hat{n}}}\left|r_{k}^{(\hat{n})}\right|<\left|r_{k}^{(\hat{n})}\right|.$$
This implies that $r_{k}^{(\hat{n})}\rightarrow 0$ as $k\rightarrow\infty$. For $j=\hat{n}-1,\ldots,1$, by (40) and (42),
$$\left|r_{k+1}^{(j)}\right|\leq\left|1-\alpha_{k}^{\rm BB2}\lambda_{j}\right|\,\left|r_{k}^{(j)}\right|+\alpha_{k}^{\rm BB2}\left|\sum_{t=j+1}^{\hat{n}}a_{j,t}r_{k}^{(t)}\right|\leq\sqrt{\theta_{j}}\left|r_{k}^{(j)}\right|+\alpha_{k}^{\rm BB2}\left|\sum_{t=j+1}^{\hat{n}}a_{j,t}r_{k}^{(t)}\right|.$$
It follows that $\theta_{j}<1$ and $\lim\limits_{k\rightarrow\infty}r_{k}^{(\hat{n})}=0$.
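As a supplementary note (not part of the original proof), the endpoint bound in (42) can be sketched as follows. Write $\alpha$ for $\alpha_{k}^{\rm BB2}$ and introduce the auxiliary function $q_{j}$, a symbol used only here. Because $q_{j}$ is a convex quadratic in $\alpha$, its maximum over the interval given by (41) is attained at an endpoint. The last line assumes $u_{j}>0$ and $L>0$; it suggests why a condition of the form $\max_{j}|\lambda_{j}|^{2}/u_{j}<2\lambda_{\min}(W)$, which is how Remark 8 appears to use (39), would give $\theta_{j}<1$.

```latex
\begin{aligned}
q_{j}(\alpha) &= 1-2\alpha u_{j}+\alpha^{2}|\lambda_{j}|^{2},
\qquad q_{j}''(\alpha)=2|\lambda_{j}|^{2}\geq 0,\\
\max_{\alpha\in\left[\frac{1}{\lambda_{\max}(W)},\,\frac{1}{\lambda_{\min}(W)}\right]} q_{j}(\alpha)
&=\max\left\{q_{j}\!\left(\tfrac{1}{\lambda_{\min}(W)}\right),\,q_{j}\!\left(\tfrac{1}{\lambda_{\max}(W)}\right)\right\}=\theta_{j},\\
q_{j}\!\left(\tfrac{1}{L}\right)<1
&\iff \frac{|\lambda_{j}|^{2}}{L^{2}}<\frac{2u_{j}}{L}
\iff \frac{|\lambda_{j}|^{2}}{u_{j}}<2L
\qquad (L>0,\ u_{j}>0).
\end{aligned}
```

Since $\lambda_{\min}(W)\leq\lambda_{\max}(W)$, the requirement at $L=\lambda_{\min}(W)$ is the binding one.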
Remark 8 .
As $\hat{A}$ is positive definite, so is $\hat{A}^{-1}$.
Let $\tilde{\lambda}_{j}=\tilde{u}_{j}+{\rm i}\tilde{v}_{j}~(1\leq j\leq\hat{n})$ be the eigenvalues of $\hat{A}^{-1}$. Clearly, $\frac{1}{\tilde{\lambda}_{j}}=\frac{\tilde{u}_{j}-{\rm i}\tilde{v}_{j}}{\tilde{u}_{j}^{2}+\tilde{v}_{j}^{2}}$ is an eigenvalue of $\hat{A}$. Then $\max\limits_{1\leq j\leq\hat{n}}\frac{|\lambda_{j}|^{2}}{u_{j}}=\frac{1}{\min_{1\leq j\leq\hat{n}}\tilde{u}_{j}}$.
This, along with
$$\begin{aligned}
\lambda_{\min}(W)&=2\lambda_{\min}\left((\hat{A}+\hat{A}^{T})^{-1}\hat{A}^{T}\hat{A}\right)=2\lambda_{\min}\left(\hat{A}(\hat{A}+\hat{A}^{T})^{-1}\hat{A}^{T}\right)\\
&=2\lambda_{\min}\left((\hat{A}^{-1}+\hat{A}^{-T})^{-1}\right)=\frac{2}{\lambda_{\max}(\hat{A}^{-1}+\hat{A}^{-T})},
\end{aligned}$$
shows that condition (39) is equivalent to
$$\lambda_{\max}(\hat{A}^{-1}+\hat{A}^{-T})<4\min\limits_{1\leq j\leq\hat{n}}\tilde{u}_{j}.$$
Note that $\min\limits_{1\leq j\leq\hat{n}}\tilde{u}_{j}\geq\tfrac{1}{2}\lambda_{\min}(\hat{A}^{-1}+\hat{A}^{-T})$, so the above inequality can be reinforced as
$$\lambda_{\max}(\hat{A}^{-1}+\hat{A}^{-T})<2\lambda_{\min}(\hat{A}^{-1}+\hat{A}^{-T}).$$
When $\hat{A}$ is SPD, it reduces to $\lambda_{\max}(\hat{A})<2\lambda_{\min}(\hat{A})$, which is the same as the convergence condition of the preconditioned BB method for SPD linear systems [34]. This means that our condition (39) is weaker than that of [34].
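For the reader's convenience, the two facts used in Remark 8 can be checked directly; the short derivation below is an added sketch that uses only the definitions above. It assumes the eigenvalues are indexed so that $\lambda_{j}=1/\tilde{\lambda}_{j}$, and $x$ denotes a unit eigenvector of $\hat{A}^{-1}$ associated with $\tilde{\lambda}_{j}$ (a symbol introduced here for illustration).

```latex
\begin{aligned}
\lambda_{j}=\frac{1}{\tilde{\lambda}_{j}}
 =\frac{\tilde{u}_{j}-{\rm i}\tilde{v}_{j}}{\tilde{u}_{j}^{2}+\tilde{v}_{j}^{2}}
&\;\Longrightarrow\;
u_{j}=\frac{\tilde{u}_{j}}{\tilde{u}_{j}^{2}+\tilde{v}_{j}^{2}},\quad
|\lambda_{j}|^{2}=\frac{1}{\tilde{u}_{j}^{2}+\tilde{v}_{j}^{2}}
\;\Longrightarrow\;
\frac{|\lambda_{j}|^{2}}{u_{j}}=\frac{1}{\tilde{u}_{j}},\\
\tilde{u}_{j}={\rm Re}\left(x^{*}\hat{A}^{-1}x\right)
&=\tfrac{1}{2}\,x^{*}\left(\hat{A}^{-1}+\hat{A}^{-T}\right)x
\;\geq\;\tfrac{1}{2}\lambda_{\min}\left(\hat{A}^{-1}+\hat{A}^{-T}\right),
\qquad x^{*}x=1.
\end{aligned}
```

Taking the maximum over $j$ in the first line gives the identity $\max_{j}|\lambda_{j}|^{2}/u_{j}=1/\min_{j}\tilde{u}_{j}$, and the second line is the Rayleigh-quotient bound behind the reinforced inequality.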
When $G$ is UPD, so is $M$ in (12). Combining Theorem 3.8 with the convergence conditions of Algorithm 2 gives the following result.
Theorem 3.10 .
Suppose $G\in\mathds{R}^{n\times n}$ is UPD. For any SPD $Q\in\mathds{R}^{m\times m}$ and any $\omega>0$, let $M$ be defined by (12) and $\lambda_{j}=u_{j}+{\rm i}v_{j}~(1\leq j\leq n+m)$ be its $n+m$ eigenvalues. If
$$\max_{1\leq j\leq n+m}\frac{|\lambda_{j}|^{2}}{u_{j}}<\frac{4}{\lambda_{\max}\left(M^{-1}+M^{-T}\right)}, \tag{43}$$
then for sufficiently small $\delta$, the sequence $\{x_{k},y_{k}\}$ produced by Algorithm 3 converges to a solution of (1).
Remark 9 .
The residuals generated by the BB method, even for SPD linear systems, are strongly nonmonotonic, which poses a challenge for convergence analysis [40, 17]. This is also the reason why the convergence of Algorithm 3 is intricate. Our convergence analysis of Algorithm 3, which ensures a decrease of $\|r_{k}\|_{*}$, is quite stringent and relies on the rather strong assumption (43). The nonmonotonic behavior of $\|r_{k}\|$ in Figures 1 and 2 also indicates that the choices of $\omega$ in our numerical experiments do not meet (43). Thus, there is significant room for improving the convergence analysis of the BB method for UPD linear systems and of Algorithm 3.
Remark 10 .
Although assumption (43) is strong, it is still possible to choose $\omega$ to satisfy it. Indeed, consider the special case $n=m=1$ and $M=\begin{pmatrix}a&b\\ -b&\omega\end{pmatrix}$ with $a>0$ and $b\in\mathds{R}$. Since
$$M^{-1}=\frac{1}{a\omega+b^{2}}\begin{pmatrix}\omega&-b\\ b&a\end{pmatrix},$$
we have
$$\lambda_{\min}\left(M^{-1}+M^{-T}\right)=\frac{2\min\{a,\omega\}}{a\omega+b^{2}}\quad\mbox{and}\quad\lambda_{\max}\left(M^{-1}+M^{-T}\right)=\frac{2\max\{a,\omega\}}{a\omega+b^{2}}.$$
It follows from Remark 8 that (43) can be reinforced as $\lambda_{\max}\left(M^{-1}+M^{-T}\right)<2\lambda_{\min}\left(M^{-1}+M^{-T}\right)$, namely, $\max\{a,\omega\}<2\min\{a,\omega\}$.
This implies that (43) holds when $\omega\in\left(a/2,\,2a\right)$.
For the general case, we can apply preconditioning techniques to (7) such that $M$ is well-conditioned. Preconditioning techniques for $M$ have been widely studied; see [8] and the references therein.
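As an illustrative check of the $2\times 2$ case in Remark 10 (not part of the original text), the following Python sketch evaluates condition (43) for $M=\begin{pmatrix}a&b\\ -b&\omega\end{pmatrix}$. The function names `eigenvalues_2x2` and `satisfies_condition_43` are hypothetical, the closed form for $\lambda_{\max}(M^{-1}+M^{-T})$ is the one stated in the remark, and the code assumes $a>0$ and $\omega>0$ so that all eigenvalues have positive real part.

```python
import cmath


def eigenvalues_2x2(a, b, omega):
    """Eigenvalues of M = [[a, b], [-b, omega]] from its characteristic polynomial."""
    trace = a + omega
    det = a * omega + b * b
    # cmath.sqrt handles a negative discriminant (complex-conjugate eigenvalues).
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def satisfies_condition_43(a, b, omega):
    """Return True if max_j |lambda_j|^2 / u_j < 4 / lambda_max(M^{-1} + M^{-T})."""
    # Left-hand side of (43): u_j is the real part of lambda_j (positive for a, omega > 0).
    lhs = max(abs(lam) ** 2 / lam.real for lam in eigenvalues_2x2(a, b, omega))
    # Closed form from Remark 10: lambda_max(M^{-1} + M^{-T}) = 2 max{a, omega} / (a omega + b^2).
    lam_max = 2.0 * max(a, omega) / (a * omega + b * b)
    return lhs < 4.0 / lam_max


if __name__ == "__main__":
    for a, b, omega in [(1.0, 0.0, 1.5), (1.0, 0.0, 3.0), (1.0, 0.5, 1.0)]:
        print(a, b, omega, satisfies_condition_43(a, b, omega))
```

With $a=1$ and $b=0$, the choice $\omega=1.5$ satisfies (43) while $\omega=3$ does not, in line with the range of $\omega$ identified in the remark.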
4 Numerical experiments
We present the results of numerical tests to examine the feasibility and effectiveness of SPALBB. All experiments were run using MATLAB R2022b on a PC with an Intel(R) Core(TM) i7-1260P CPU @ 2.10GHz and 32GB of RAM. The initial guess is taken to be the zero vector, and the algorithms are terminated when the number of iterations exceeds $10^{5}$ or ${\rm Res}:=\|r_{k}\|/\|r_{0}\|\leq 10^{-6}$. We report the number of outer iterations, the total number of iterations (for SPALBB, it includes the number of inner iterations), the CPU time in seconds, and the final value of the relative residual, denoted by “Oiter”, “Titer”, “CPU” and “Res”, respectively.
In SPALBB, we set $Q=I$, used the stopping criterion (23) for inner iterations with $\delta=0.5$ and the $2$-norm, and tried $\omega=10^{-i}$ with $i=1,2,3,4,5$, denoted SPALBB($\omega$). We compared our method with BICGSTAB and restarted GMRES. We tested two restart values, $20$ and $50$, denoted GMRES(20) and GMRES(50).
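The experimental settings above can be summarized in a small configuration sketch. This is written in Python for illustration only (the experiments themselves were run in MATLAB), and all names here, such as `OMEGAS` and `should_stop`, are hypothetical; only the numerical values are taken from the text.

```python
import math

TOL = 1e-6                                     # threshold on Res = ||r_k|| / ||r_0||
MAX_ITER = 10**5                               # iteration cap
DELTA = 0.5                                    # inner stopping parameter delta in (23)
OMEGAS = [10.0 ** (-i) for i in range(1, 6)]   # omega = 10^{-i}, i = 1, ..., 5
GMRES_RESTARTS = (20, 50)                      # GMRES(20) and GMRES(50)


def norm2(v):
    """Euclidean norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(x * x for x in v))


def relative_residual(r_k, r_0):
    """Res := ||r_k|| / ||r_0|| in the 2-norm."""
    return norm2(r_k) / norm2(r_0)


def should_stop(r_k, r_0, iters):
    """Terminate when the iteration count exceeds MAX_ITER or Res <= TOL."""
    return iters > MAX_ITER or relative_residual(r_k, r_0) <= TOL
```

For example, `should_stop([1e-7, 0.0], [1.0, 0.0], 10)` returns `True` because the relative residual is below the tolerance.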
Example 1 .
The steady-state Navier-Stokes equations are
$$-\nu\nabla^{2}{\bm{u}}+{\bm{u}}\cdot\nabla{\bm{u}}+\nabla p={\bm{h}}\quad{\rm and}\quad{\rm div}\,{\bm{u}}=0,\quad{\bm{z}}=(x,y)\in\Omega, \tag{44}$$
where $\Omega\subseteq\mathds{R}^{2}$ is a bounded domain, the vector field ${\bm{u}}$ represents the velocity in $\Omega$, $p$ represents pressure, and $\nu>0$ is the kinematic viscosity. The test problem is a model of the flow in a square cavity $\Omega=(-1,1)\times(-1,1)$ with the lid moving from left to right. A Dirichlet no-flow (zero velocity) condition is applied on the side and bottom boundaries, and the nonzero horizontal velocity on the lid is $\{y=1;\,-1\leq x\leq 1\mid u_{x}=1-x^{4}\}$.
Finite element discretization of (44) results in system (1) with $G=\nu G_{1}+G_{2}$. Here $G_{1}$ is SPD and consists of a set of uncoupled discrete Laplace operators, corresponding to diffusion, and $G_{2}$ is a discrete convection operator and is unsymmetric. Evidently, $G$ becomes more unsymmetric as $\nu$ decreases. Various methods have been developed for solving (44). However, the convergence rates of some approaches deteriorate as $\nu$ decreases [22]. Thus, for (44), we test three small values of the viscosity $\nu$: $0.005,\,0.01,\,0.05$.
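To illustrate how $G=\nu G_{1}+G_{2}$ becomes more unsymmetric as $\nu$ decreases, the following toy sketch measures the ratio of the Frobenius norms of the skew-symmetric and symmetric parts of $G$. The $2\times 2$ matrices `G1` and `G2` are made-up stand-ins (an SPD Laplacian-like block and a skew-symmetric convection-like block), not the IFISS matrices used in the experiments, and the measure `asymmetry_ratio` is an assumption chosen for illustration.

```python
import math

# Hypothetical stand-ins: G1 is SPD (diffusion-like), G2 is skew-symmetric (convection-like).
G1 = [[2.0, -1.0], [-1.0, 2.0]]
G2 = [[0.0, 1.0], [-1.0, 0.0]]


def frobenius(A):
    """Frobenius norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(x * x for row in A for x in row))


def asymmetry_ratio(G):
    """||(G - G^T)/2||_F / ||(G + G^T)/2||_F; larger means more unsymmetric."""
    n = len(G)
    sym = [[0.5 * (G[i][j] + G[j][i]) for j in range(n)] for i in range(n)]
    skew = [[0.5 * (G[i][j] - G[j][i]) for j in range(n)] for i in range(n)]
    return frobenius(skew) / frobenius(sym)


def assemble(nu):
    """Form G = nu * G1 + G2 for a given viscosity nu."""
    return [[nu * G1[i][j] + G2[i][j] for j in range(2)] for i in range(2)]


# Viscosity values used in the experiments.
ratios = {nu: asymmetry_ratio(assemble(nu)) for nu in (0.05, 0.01, 0.005)}

if __name__ == "__main__":
    for nu, ratio in ratios.items():
        print(f"nu = {nu}: asymmetry ratio = {ratio:.2f}")
```

In this toy setting the symmetric part scales with $\nu$ while the skew part does not, so the ratio grows like $1/\nu$.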
Example 1 is a classical test problem used in fluid dynamics, known as driven-cavity flow. We discretize (44) using Picard iterations and the Q2–Q1 mixed finite element approximation [23] on uniform grids with grid parameter $h=2^{-6}$, $2^{-7}$, $2^{-8}$, $2^{-9}$. This discretization can be carried out with the IFISS software package [23, 46]. In this example, $G$ is UPD and $B$ is rank-deficient with rank $m-1$. Thus, the matrix in (1) is singular. The numerical results are reported in Tables 1, 2 and 3 and in the left-hand plots of Figure 1, where “-” means that the method failed to solve the problem and bold face indicates that the method performs best in terms of CPU time. It can be seen from Tables 1, 2 and 3 that the CPU time of all tested methods increases as $\nu$ decreases, and that BICGSTAB and SPALBB(1) fail when $h=2^{-9}$ for $\nu=0.005$. The CPU time of SPALBB with $\omega\leq 10^{-2}$ is about half that of GMRES, and the best cases of SPALBB require only a third of the CPU time of GMRES for $h=2^{-9}$. The number of outer iterations of SPALBB decreases with $\omega$, which is consistent with Remark 1. Nevertheless, the total number of iterations is not the least for $\omega=10^{-5}$.
Table 1: Numerical results for Example 1 with $\nu=0.005$.
Table 2: Numerical results for Example 1 with $\nu=0.01$.
Table 3: Numerical results for Example 1 with $\nu=0.05$.
Figure 1: Evolution of the relative residual of SPALBB tested on Example 1 (left) with $n=8450$, $m=1089$, and on Example 2 (right) with $n=8416$, $m=1096$ and $\omega$ as in (7).
Example 2 .
We consider the steady-state Navier-Stokes equations (44), where the domain $\Omega$ is a rectangular region $(0,8)\times(-1,1)$ generated by deleting the square $(7/4,9/4)\times(-1/4,1/4)$. This test problem is a model of the flow in a rectangular channel with an obstacle. A Poiseuille profile is imposed on the inflow boundary $\{x=0;\,-1\leq y\leq 1\}$, and a Dirichlet no-flow condition is imposed on the obstruction and on the top and bottom walls. A Neumann condition is applied at the outflow boundary that automatically sets the mean outflow pressure to zero.
In our tests, we set $\nu=0.005,\,0.01,\,0.05$ and discretize the Navier-Stokes equations (44) using Picard iterations and the Q2–Q1 mixed finite element approximation [23] on uniform grids with grid parameter $h=2^{-5},\,2^{-6},\,2^{-7},\,2^{-8}$. This discretization was accomplished using IFISS [23, 46]. In the resulting matrices, $G$ is UPD and $B$ has full column rank. The numerical results are reported in Tables 4, 5 and 6 and Figure 1. Tables 4, 5 and 6 show that all choices of $\omega$ are successful in solving the tested problems, and, in terms of CPU time, $\omega=10^{-1}$ and $10^{-2}$ perform better than the other choices. Although BICGSTAB requires the least CPU time for $\nu=0.05$, it fails for $\nu=0.005$ and $\nu=0.01$ with $h=2^{-5},\,2^{-8}$. The CPU time for every SPALBB test is less than for GMRES, and the best case of SPALBB takes about half the time of GMRES. Overall, SPALBB is more stable and efficient.
Table 4: Numerical results for Example 2 with $\nu=0.005$.
Table 5: Numerical results for Example 2 with $\nu=0.01$.
Table 6: Numerical results for Example 2 with $\nu=0.05$.
Example 3 .
We consider the steady-state Navier-Stokes equations (44), where the domain $\Omega$ is a rectangular region $(-1,5)\times(-1,1)$ generated by deleting $(-1,0)\times(-1,-1/2)\cup(-1,0)\times(1/2,1)$. This test problem is a model of the flow in a symmetric step channel. A Poiseuille flow profile is imposed on the inflow boundary $\{x=-1;\,-1/2\leq y\leq 1/2\}$, and a Dirichlet no-flow condition is imposed on the top and bottom walls and the boundaries of the deleted parts. A Neumann condition is applied at the outflow boundary that sets the mean outflow pressure to zero.
The discretization of the Navier-Stokes equations (44) is done as in Example 2 with the same settings. In this example, $G$ is UPD and $B$ has full column rank. The numerical results are reported in Tables 7, 8 and 9 and in the left-hand plots of Figure 2. As in Example 2, all choices of $\omega$ solve the problems successfully, and BICGSTAB performs best in the case of $\nu=0.05$. Except for $\nu=0.05$ and $\nu=0.01$ with $h=2^{-6}$, SPALBB requires the least CPU time. Hence, Tables 7, 8 and 9 still demonstrate the efficiency of SPALBB.
Table 7: Numerical results for Example 3 with $\nu=0.005$.
Table 8: Numerical results for Example 3 with $\nu=0.01$.
Table 9: Numerical results for Example 3 with $\nu=0.05$.
Figure 2: Evolution of the relative residual of SPALBB tested on Example 3 (left) with $n=5890$, $m=769$, and on Example 4 (right) with $n=12675$, $m=1089$ and $\omega$ as in (7).
Example 4.
Fluid flow in $\Omega_{f}\subset\mathds{R}^{2}$ coupled with porous media flow in $\Omega_{p}\subset\mathds{R}^{2}$ is governed by the static Stokes equations
$$-\nu\Delta\,{\bm{u}}_{f}+\nabla\,p_{f}={\bm{f}},\quad\textup{and}\quad{\rm div}\,{\bm{u}}_{f}=0,\quad{\bm{z}}\in\Omega_{f},\tag{45}$$
where $\Omega_{f}\cap\Omega_{p}=\varnothing$ and $\overline{\Omega}_{f}\cap\overline{\Omega}_{p}=\Gamma$ with $\Gamma$ being an interface, $\nu>0$ is the kinematic viscosity, and $\bm{f}$ is the external force. In the porous media region, the governing variable is $\phi=\frac{p_{p}}{\rho_{f}g}$, where $p_{p}$ is the pressure in $\Omega_{p}$, $\rho_{f}$ is the fluid density, and $g$ is the acceleration due to gravity. The velocity ${\bm{u}}_{p}$ of the porous media flow is related to $\phi$ via Darcy's law and is also divergence free:
$${\bm{u}}_{p}=-\dfrac{\epsilon^{2}}{r\nu}\nabla\phi\quad\textup{and}\quad-{\rm div}\,{\bm{u}}_{p}=0,\quad{\bm{z}}\in\Omega_{p},\tag{46}$$
where $r$ is the volumetric porosity and $\epsilon$ the characteristic length of the porous media.
In our numerical experiments, the computational domain is $\Omega_{f}=(0,1)\times(1,2)$, $\Omega_{p}=(0,1)\times(0,1)$ and the interface is $\Gamma=(0,1)\times\{1\}$. We use a uniform mesh with grid parameters $h=2^{-5},\,2^{-6},\,2^{-7},\,2^{-8}$ to decompose $\Omega_{f}$, P2–P1 elements in the fluid region, and P2 Lagrange elements in the porous media region. We set $r=1$ and $\epsilon=\sqrt{0.1\nu}$, and again test $\nu=0.005$, $0.01$, $0.05$. Applying finite element discretization to the mixed Stokes-Darcy model (45)–(46) with the Dirichlet no-flow boundary conditions leads to linear systems of form (1) with
$$G=\begin{pmatrix}G_{11}&G_{12}\\ -G_{12}^{T}&\nu G_{22}\end{pmatrix}$$
[13].
Here $G$ is UPD and $B$ has full column rank. The numerical results are reported in Tables 10, 11 and 12 and Figure 2. According to Tables 10, 11 and 12, all methods again perform better for larger $\nu$, and BICGSTAB requires the least CPU time in most cases, while SPALBB is more competitive for smaller $\nu$. For Example 4, SPALBB prefers smaller $\omega$, such as $\omega=10^{-5}$.
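The block form of $G$ explains why it can be positive definite although unsymmetric: the off-diagonal blocks $G_{12}$ and $-G_{12}^{T}$ cancel in the symmetric part, so $x^{T}Gx=x_{1}^{T}G_{11}x_{1}+\nu\,x_{2}^{T}G_{22}x_{2}$. The following sketch checks this on small random blocks. It assumes that $G_{11}$ and $G_{22}$ are symmetric positive definite, which the text does not state explicitly, and it uses random data rather than the finite element matrices; the helper names are hypothetical.

```python
import numpy as np


def random_spd(k, rng):
    """Random symmetric positive definite k-by-k matrix (stand-in data)."""
    M = rng.standard_normal((k, k))
    return M @ M.T + k * np.eye(k)


def build_G(G11, G12, G22, nu):
    """Assemble G = [[G11, G12], [-G12^T, nu * G22]]."""
    return np.block([[G11, G12], [-G12.T, nu * G22]])


rng = np.random.default_rng(0)
k1, k2, nu = 6, 4, 0.005          # block sizes are arbitrary; nu as in the tests
G11, G22 = random_spd(k1, rng), random_spd(k2, rng)
G12 = rng.standard_normal((k1, k2))
G = build_G(G11, G12, G22, nu)

# Symmetric part: the coupling blocks cancel, leaving a block-diagonal matrix.
sym_part = 0.5 * (G + G.T)
expected = np.block([
    [G11, np.zeros((k1, k2))],
    [np.zeros((k2, k1)), nu * G22],
])

is_unsymmetric = not np.allclose(G, G.T)
min_eig = np.linalg.eigvalsh(sym_part).min()

print("G unsymmetric:", is_unsymmetric)
print("symmetric part is block diagonal:", np.allclose(sym_part, expected))
print("smallest eigenvalue of symmetric part positive:", min_eig > 0)
```

Under this assumption the smallest eigenvalue of the symmetric part scales with $\nu$ in the second block, while the skew-symmetric coupling is independent of $\nu$.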
Table 10: Numerical results for Example 4 with $\nu=0.005$.
Table 11: Numerical results for Example 4 with $\nu=0.01$.
Table 12: Numerical results for Example 4 with $\nu=0.05$.
In conclusion, Tables 1–12 and Figures 1 and 2 illustrate that SPALBB is a practical method, and its advantages increase with problem size. SPALBB and GMRES are more robust than BICGSTAB. Unlike GMRES, SPALBB has constant storage. In terms of CPU time, SPALBB is more efficient than GMRES. We see from Tables 1–9 that the advantages of SPALBB are more obvious for smaller $\nu$, i.e., more unsymmetric $G$. Figures 1 and 2 indicate that the convergence rate of SPALBB depends strongly on $\omega$. For larger $\omega$, the nonmonotonicity of $\|r_{k}\|$ in SPALBB becomes more pronounced. This strongly nonmonotone behavior is similar to that of the BB method [40].
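To illustrate the kind of nonmonotone residual history referred to here, the sketch below runs the classical Barzilai-Borwein (BB) gradient method on a small symmetric positive definite system. It is not SPALBB: the test matrix, the first step length and the function name are assumptions chosen only for illustration.

```python
import numpy as np


def bb_gradient(A, b, x0, tol=1e-8, max_iter=2000):
    """BB gradient method for A x = b with A symmetric positive definite.

    Returns the final iterate and the history of residual norms ||A x_k - b||.
    """
    x = x0.copy()
    g = A @ x - b                          # gradient of 0.5 x^T A x - b^T x
    alpha = 1.0 / np.linalg.norm(A, 2)     # assumed first step (no BB data yet)
    history = [np.linalg.norm(g)]
    for _ in range(max_iter):
        if history[-1] <= tol * history[0]:
            break
        x_new = x - alpha * g
        g_new = A @ x_new - b
        s, y = x_new - x, g_new - g
        alpha = (s @ s) / (s @ y)          # BB step; s^T y = s^T A s > 0
        x, g = x_new, g_new
        history.append(np.linalg.norm(g))
    return x, history


A = np.diag(np.linspace(1.0, 100.0, 20))   # hypothetical SPD test matrix
b = np.ones(20)
x, history = bb_gradient(A, b, np.zeros(20))

# Count iterations where the residual norm went up rather than down.
increases = sum(r1 > r0 for r0, r1 in zip(history, history[1:]))
print("iterations:", len(history) - 1)
print("residual increases:", increases)
print("final relative residual:", history[-1] / history[0])
```

The BB step length uses only the two most recent iterates, so the storage stays fixed as the iteration proceeds; the residual norm typically rises on some iterations even though the method converges.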