
An Inexact augmented Lagrangian algorithm
for unsymmetric saddle-point systems

Na Huang, Department of Applied Mathematics, College of Science, China Agricultural University, Beijing, China. E-mail: [email protected]. Research partially supported by National Natural Science Foundation of China (No. 12001531).

Yu-Hong Dai, LSEC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China. E-mail: [email protected].

Dominique Orban, GERAD and Department of Mathematics and Industrial Engineering, Polytechnique Montréal, QC, Canada. E-mail: [email protected]. Research partially supported by an NSERC Discovery Grant.

Michael A. Saunders, Systems Optimization Laboratory, Department of Management Science and Engineering, Stanford University, Stanford, CA, USA. E-mail: [email protected].

Version of April 30, 2024.

Abstract

Augmented Lagrangian (AL) methods are a well-known class of algorithms for solving constrained optimization problems. They have been extended to the solution of saddle-point systems of linear equations. We study an AL (SPAL) algorithm for unsymmetric saddle-point systems and derive convergence and semi-convergence properties, even when the system is singular. At each step, our SPAL requires the exact solution of a linear system of the same size but with an SPD (2,2) block. To improve efficiency, we introduce an inexact SPAL algorithm. We establish its convergence properties under reasonable assumptions. Specifically, we use a gradient method, known as the Barzilai-Borwein (BB) method, to solve the linear system at each iteration. We call the result the augmented Lagrangian BB (SPALBB) algorithm and study its convergence. Numerical experiments on test problems from Navier-Stokes equations and coupled Stokes-Darcy flow show that SPALBB is more robust and efficient than BICGSTAB and GMRES. SPALBB often requires the least CPU time, especially on large systems.

keywords:

augmented Lagrangian algorithm, saddle-point system, Barzilai-Borwein, convergence analysis.

AMS subject classifications:

65F10, 65F50.

1 Introduction

We consider the unsymmetric saddle-point system

(1)

\begin{pmatrix}G&B\\ -B^{T}&0\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}=\begin{pmatrix}f\\ g\end{pmatrix},

where $B\in\mathds{R}^{n\times m}\ (n\geq m)$ , and $G\in\mathds{R}^{n\times n}$ is positive definite on the nullspace of $B^{T}$ but may be unsymmetric and/or singular. Thus, $x^{T}\!Gx>0$ for all nonzero $x\in\mathop{\mathrm{Null}}(B^{T}\!\,)$ . The change of sign in the second block-row of (1) makes the matrix semipositive real and positive semistable if $G$ is positive semidefinite [6]. Linear systems like (1) arise from certain discretizations of Navier-Stokes equations [23], mixed and mixed-hybrid finite element approximation of the liquid crystal director model [38] and coupled Stokes-Darcy flow [13], and within interior methods for constrained optimization [25, 48]. System (1) is nonsingular if and only if $B$ has full column rank [8]. When $B$ corresponds to a discretized gradient operator, as for example in Navier-Stokes equations [23, 28], $B$ is column rank-deficient and (1) is singular.

Iterative methods for solving saddle-point systems have been studied for decades, such as stationary iterations [4, 8, 52], nonlinear inexact Uzawa methods [16, 30, 33], nullspace methods [37, 44, 45], Krylov subspace methods [20, 29, 35, 36], and preconditioning techniques [8, 7, 21, 41]. Some stationary iterative methods and their semi-convergence have been studied for singular cases [15, 49, 50].

Let $Q\in\mathds{R}^{m\times m}$ be symmetric and positive definite (SPD). If we premultiply the second block-row of (1) by $-BQ^{-1}$ and add the result to the first block equation, we find that (1) is equivalent to

(2)

\begin{pmatrix}G+BQ^{-1}B^{T}\!&B\\ -B^{T}&0\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}=\begin{pmatrix}f-BQ^{-1}g\\ g\end{pmatrix}.

Golub and Greif [27] and Golub et al. [28] showed that methods based on (2) may have advantages. Indeed, even if $G$ is singular or ill-conditioned, the $(1,1)$ block in (2) can be made nonsingular, positive definite or well-conditioned with suitable selections of $Q$ . When $G$ is symmetric, the symmetric form

T(Q):=\begin{pmatrix}G+BQ^{-1}B^{T}\!&B\\ B^{T}&0\end{pmatrix}

of (2) is typically preferred. Golub and Greif [27] mainly consider the specific case $Q=\gamma I$ , where $\gamma>0$ is constant and $I$ is the identity matrix. They provide analytical observations on the spectrum of $T(\gamma I)$ and show that there is a range of values of $\gamma$ that will improve the condition number of $T(\gamma I)$ , as well as the condition number of its $(1,1)$ block and the associated Schur complement. In particular, $\gamma=\|B\|^{2}/\|G\|$ may often force the norm of the added term $\frac{1}{\gamma}BB^{T}\!$ to be of the same magnitude as the norm of $G$ . Golub et al. [28] experimentally observe that this special choice is typically effective. Apart from the form of (2), they also show that when $G$ is symmetric positive semidefinite of nullity $1$ , an effective approach to maintaining sparsity is to choose the augmented term as $\tau bb^{T}$ , where $b$ is a known vector not orthogonal to the nullspace of $G$ , and $\tau>0$ is a constant that approximately minimizes the condition number of $G+\tau bb^{T}$ .
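As a small numerical illustration of the reformulation (2) with $Q=\gamma I$ , the following sketch checks that (1) and (2) share the same solution and that $\gamma=\|B\|^{2}/\|G\|$ balances the norms of $G$ and $\frac{1}{\gamma}BB^{T}$ . The matrices `G` , `B` and the vectors `f` , `g` are made-up test data, not taken from the paper.

```python
import numpy as np

# Illustrative data (assumed): G is unsymmetric with SPD symmetric part,
# B has full column rank.
G = np.array([[4.0, 1.0, 0.0, 0.0],
              [-1.0, 3.0, 1.0, 0.0],
              [0.0, -1.0, 2.0, 1.0],
              [0.0, 0.0, -1.0, 3.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 2.0]])
f = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([1.0, -1.0])
n, m = B.shape

# gamma = ||B||^2 / ||G|| in the 2-norm, used with Q = gamma * I.
gamma = np.linalg.norm(B, 2) ** 2 / np.linalg.norm(G, 2)
Q_inv = np.eye(m) / gamma

# Original system (1) and augmented system (2).
A1 = np.block([[G, B], [-B.T, np.zeros((m, m))]])
A2 = np.block([[G + B @ Q_inv @ B.T, B], [-B.T, np.zeros((m, m))]])
z1 = np.linalg.solve(A1, np.concatenate([f, g]))
z2 = np.linalg.solve(A2, np.concatenate([f - B @ Q_inv @ g, g]))

same_solution = np.allclose(z1, z2)
# Ratio of the norm of the added term to the norm of G.
norm_ratio = np.linalg.norm(B @ B.T / gamma, 2) / np.linalg.norm(G, 2)
print(same_solution, norm_ratio)
```

In the 2-norm the ratio is exactly one up to rounding, because $\|BB^{T}\|=\|B\|^{2}$ .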

The approach of replacing (1) by (2) can be regarded as an augmented Lagrangian (SPAL) method, also called the method of multipliers [8, 27, 28]. For an extensive overview of the augmented Lagrangian approach and its applications, we refer to [11, 10]. Awanou and Lai [3] apply the Uzawa method [1] to (2) with $Q=\gamma I$ and propose the following SPAL (with $k=0,1,2,\dots$ and $y_{0}$ assumed given):

\left\{\begin{array}{l}(G+\frac{1}{\gamma}BB^{T}\!\,)x_{k}=f-\frac{1}{\gamma}Bg-By_{k},\\ y_{k+1}=y_{k}+\frac{1}{\gamma}(B^{T}\!x_{k}+g).\end{array}\right.

By introducing another parameter $\rho$ , Awanou and Lai [2] further generalize SPAL as

(3)

\left\{\begin{array}{l}(G+\frac{1}{\gamma}BQ^{-1}B^{T}\!\,)x_{k}=f-\frac{1}{\gamma}BQ^{-1}g-By_{k},\\ y_{k+1}=y_{k}+\frac{1}{\rho}Q^{-1}(B^{T}\!x_{k}+g),\end{array}\right.

and give a first convergence analysis for the case of unsymmetric $G$ . They say that the proofs in [26] using spectral arguments cannot be extended to the nonsymmetric case. Under the assumptions that $x^{T}Gx\geq 0$ for all $x$ and $x^{T}Gx=0$ with $B^{T}x=0$ implies $x=0$ , they verify convergence by proving that $\|y_{k+1}-y_{*}\|_{Q}\leq\|y_{k}-y_{*}\|_{Q}$ and then $x_{k}$ converges to $x_{*}$ , where $(x_{*},y_{*})$ is the exact solution of (1). Awanou and Lai [2] also say that their numerical experiments for an inexact Uzawa algorithm applied to (2) do not illustrate convergence. However, we have not been able to find their implementation of the inexact version and the numerical results.

We focus here on the inexact SPAL. Based on a simple splitting of the matrix in (1), we propose a stationary iterative method that is theoretically equivalent to (3) when $\gamma=\rho$ . Hence, we also call it SPAL. We derive its convergence and semi-convergence for $B$ of any rank based on spectral arguments (unlike [2]) and obtain an explicit range of convergence for the parameter in SPAL. We allow $G$ here to be indefinite. Our SPAL requires an exact solution of a linear system at each step. To improve efficiency, we propose an inexact SPAL in which the linear system is solved inexactly. We show that it converges to the solution of (1) under reasonable conditions. Gradient methods are a class of simple optimization approaches using the negative gradient of the objective function as a search direction. The Barzilai-Borwein (BB) method [5] is a gradient method for unconstrained optimization and has proved to be efficient for solving large, sparse unconstrained convex quadratic programs, which are equivalent to SPD linear systems. When $G$ is unsymmetric positive definite (UPD), the linear system (7) in SPAL is UPD as well. We use the BB method to solve this UPD linear system inexactly. We call the resulting method the augmented Lagrangian BB (SPALBB) algorithm and establish its convergence under suitable assumptions. Numerical experiments on linear systems from Navier-Stokes equations and coupled Stokes-Darcy flow show that SPALBB often solves problems more efficiently than GMRES [43] and BICGSTAB [47].

The paper is organized as follows. In Section 2, we introduce the augmented Lagrangian algorithm. Its convergence and semi-convergence are established in Sections 2.1 and 2.2. The inexact SPAL and its convergence analysis are provided in Section 3. The augmented Lagrangian BB algorithm is presented in Section 3.3. Numerical experiments are reported in Section 4. Conclusions appear in Section 5.

Notation

For any $H\in\mathds{R}^{n\times n}$ , we write its inverse, transpose, spectrum, nullspace and range space as $H^{-1}$ , $H^{T}$ , $\mathrm{sp}(H)$ , $\mathop{\mathrm{Null}}(H)$ , and $\mathop{\mathrm{Range}}(H)$ . For any $x\in\mathds{C}^{n}$ , we write its conjugate transpose as $x^{*}$ . For symmetric $H$ , $\lambda_{\min}(H)$ and $\lambda_{\max}(H)$ denote the minimum and maximum eigenvalues. The symbol $\|\cdot\|$ denotes the $2$ -norm of a vector or matrix. For an $n\times n$ SPD matrix $G$ , $\|x\|_{G}=\sqrt{\langle Gx,x\rangle}=\|G^{\tfrac{1}{2}}x\|$ for all $x\in\mathds{R}^{n}$ , and $\|H\|_{G}=\sup\limits_{x\neq 0}\frac{\|Hx\|_{G}}{\|x\|_{G}}=\|G^{\tfrac{1}{2}}HG^{-\tfrac{1}{2}}\|$ for all $H\in\mathds{R}^{n\times n}$ . For simplicity, the column vector $(x^{T}\!\ y^{T}\!\,)^{T}\!$ is written $(x,y)$ , $a_{+}:=\max\{0,a\}$ , and $1/0:=+\infty$ .

2 Augmented Lagrangian algorithm

We present SPAL for solving the unsymmetric saddle-point system (1). Let $Q$ be an SPD matrix and $\omega>0$ . Since

(4)

A:=\begin{pmatrix}G&B\\ -B^{T}&0\end{pmatrix}=\begin{pmatrix}G&B\\ -B^{T}&\omega Q\end{pmatrix}-\begin{pmatrix}0&0\\ 0&\omega Q\end{pmatrix},

the saddle-point system (1) is equivalent to

\begin{pmatrix}G&B\\ -B^{T}&\omega Q\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}=\begin{pmatrix}f\\ \omega Qy+g\end{pmatrix}.

This suggests Algorithm 1 for solving system (1).

Lemma 2.2 shows that it is always possible to choose $Q$ and $\omega$ such that (7) is nonsingular, even if $A$ is singular.

If $G$ is symmetric, (1) is equivalent to the constrained optimization problem

(5)

\min_{x}\ \tfrac{1}{2}x^{T}Gx-f^{T}x\mathrm{\quad s.t.\quad}g+B^{T}x=0.

The $k$ -th step of the augmented Lagrangian algorithm for (5) solves the subproblem

(6)

\min_{x}~{}\tfrac{1}{2}x^{T}Gx-f^{T}x+\frac{1}{2\omega}\left\|g+B^{T}x+\omega Qy_{k}\right\|_{Q^{-1}}^{2},

where $y_{k}$ is an estimate of the Lagrange multiplier.

Algorithm 1 The augmented Lagrangian algorithm SPAL for solving (1)

1: Given $y_{0}\in\mathds{R}^{m}$ , $\omega>0$ , and SPD $Q\in\mathds{R}^{m\times m}$ , set $k=0$

2: while a stopping condition is not satisfied do

3: Compute $(x_{k+1},y_{k+1})$ according to the iteration

(7)

\begin{pmatrix}G&B\\ -B^{T}&\omega Q\end{pmatrix}\begin{pmatrix}x_{k+1}\\ y_{k+1}\end{pmatrix}=\begin{pmatrix}f\\ \omega Qy_{k}+g\end{pmatrix}.

4: Increment $k$ by $1$

5: end while

The optimal solution $x_{k+1}$ of (6) satisfies

(8)

(G+\frac{1}{\omega}BQ^{-1}B^{T}\!\,)x_{k+1}+By_{k}=f-\frac{1}{\omega}BQ^{-1}g.

The multiplier is updated as

(9)

y_{k+1}=\frac{1}{\omega}Q^{-1}(g+B^{T}x_{k+1}+\omega Qy_{k})=y_{k}+\frac{1}{\omega}Q^{-1}(B^{T}x_{k+1}+g).

Note that (7) also gives (8)–(9). Hence, we also call it the augmented Lagrangian algorithm here. Clearly, Algorithm 1 is theoretically equivalent to (3) if $\gamma=\rho=\omega$ . When $G$ is symmetric, the convergence of SPAL or its variants has been studied in [26]. Awanou and Lai [2] first gave convergence results for (3) when $G$ is unsymmetric positive semi-definite but positive definite on $\mathop{\mathrm{Null}}(B^{T})$ , based on analyzing the error $\|y_{k}-y_{*}\|_{Q}$ , where $(x_{*},y_{*})$ is the exact solution of (1). Here we give the convergence analysis of SPAL in a different way, based on the spectral properties of $T$ in (15) below. We derive the explicit range of convergence for $\omega$ and do not require $G$ to be positive semi-definite.
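A minimal sketch of Algorithm 1 in its equivalent form (8)-(9) may help fix ideas. The function name `spal` , the stopping rule on the residual of (1), and the test data are assumptions made for illustration; the paper does not prescribe them.

```python
import numpy as np

def spal(G, B, f, g, Q, omega, y0, tol=1e-10, max_iter=500):
    """Sketch of Algorithm 1 via the updates (8)-(9); returns (x, y, iterations)."""
    Q_inv = np.linalg.inv(Q)
    S = G + (B @ Q_inv @ B.T) / omega            # S_Q = G + (1/omega) B Q^{-1} B^T
    rhs = f - (B @ Q_inv @ g) / omega
    y = y0.astype(float).copy()
    for k in range(1, max_iter + 1):
        x = np.linalg.solve(S, rhs - B @ y)          # x-update (8)
        y = y + Q_inv @ (B.T @ x + g) / omega        # multiplier update (9)
        # Residual of the saddle-point system (1) at the new iterate.
        residual = np.concatenate([G @ x + B @ y - f, -B.T @ x - g])
        if np.linalg.norm(residual) <= tol:
            break
    return x, y, k

# Illustrative data (assumed): unsymmetric positive definite G, full-rank B.
G = np.array([[4.0, 1.0, 0.0, 0.0],
              [-1.0, 3.0, 1.0, 0.0],
              [0.0, -1.0, 2.0, 1.0],
              [0.0, 0.0, -1.0, 3.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 2.0]])
f = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([1.0, -1.0])

x, y, iters = spal(G, B, f, g, Q=np.eye(2), omega=0.1, y0=np.zeros(2))

# Compare with a direct solve of (1).
A = np.block([[G, B], [-B.T, np.zeros((2, 2))]])
z_direct = np.linalg.solve(A, np.concatenate([f, g]))
matches_direct = np.allclose(np.concatenate([x, y]), z_direct)
print(iters, matches_direct)
```

Each pass solves one system with $G+\frac{1}{\omega}BQ^{-1}B^{T}$ ; a practical code would factorize this matrix once rather than call a dense solver at every step.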

We call $A=M-N$ a splitting if $M$ is nonsingular. Defining $T=M^{-1}N$ , we consider the following iteration scheme for solving $Az=\ell$ :

(10)

z_{k+1}=Tz_{k}+M^{-1}\ell.

First, we show that (4) is a splitting of $A$ in (1). For convenience, we introduce

(11)

S_{Q}=G+\dfrac{1}{\omega}BQ^{-1}B^{T},\qquad H=\tfrac{1}{2}(G+G^{T}),

(12)

M=\begin{pmatrix}G&B\\ -B^{T}&\omega Q\end{pmatrix},\qquad N=\begin{pmatrix}0&0\\ 0&\omega Q\end{pmatrix}.

Note that $S_{Q}$ is the Schur complement of $\omega Q$ in $M$ .

Lemma 2.1.

Let $G\in\mathds{R}^{n\times n}$ be unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T}\!\,)$ , and

(13)

\eta=\inf\limits_{x\notin\mathop{\mathrm{Null}}(B^{T})}\dfrac{x^{T}Hx}{x^{T}BQ^{-1}B^{T}x}.

For any SPD $Q\in\mathds{R}^{m\times m}$ , if $0<\omega<1/(-\eta)_{+}$ , then $S_{Q}$ is positive definite.

Proof.

Since $G$ is positive definite on $\mathop{\mathrm{Null}}(B^{T}\!\,)$ , so is $H$ . Then for any nonzero $x\in\mathop{\mathrm{Null}}(B^{T})$ , it holds that $x^{T}(H+\tfrac{1}{\omega}BQ^{-1}B^{T})x=x^{T}Hx>0.$ For any $x\notin\mathop{\mathrm{Null}}(B^{T})$ , as $\eta>-1/\omega$ , we have

x^{T}(H+\frac{1}{\omega}BQ^{-1}B^{T})x=x^{T}Hx+\frac{1}{\omega}x^{T}BQ^{-1}B^{T}x\geq(\eta+\frac{1}{\omega})x^{T}BQ^{-1}B^{T}x>0.

Hence $S_{Q}$ is positive definite because, for any nonzero $x\in\mathds{R}^{n}$ , $x^{T}(S_{Q}+S_{Q}^{T})x=2x^{T}(H+\tfrac{1}{\omega}BQ^{-1}B^{T})x>0.$

By Lemma 2.1 and some algebraic manipulation, we have the following results.

Lemma 2.2.

Under the same conditions as in Lemma 2.1, $M$ is nonsingular and

(14)

M^{-1}=\begin{pmatrix}S_{Q}^{-1}&-\dfrac{1}{\omega}S_{Q}^{-1}BQ^{-1}\\[8.0pt] \dfrac{1}{\omega}Q^{-1}B^{T}S_{Q}^{-1}&\dfrac{1}{\omega}Q^{-1}-\dfrac{1}{\omega^{2}}Q^{-1}B^{T}S_{Q}^{-1}BQ^{-1}\end{pmatrix}.
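The block formula (14) can be checked numerically on a small example. The sketch below uses made-up matrices `G` , `B` , `Q` and an arbitrary $\omega$ ; it also confirms that the symmetric part of $S_{Q}$ is positive definite for this data, as Lemma 2.1 predicts.

```python
import numpy as np

# Illustrative data (assumed, not from the paper).
G = np.array([[4.0, 1.0, 0.0, 0.0],
              [-1.0, 3.0, 1.0, 0.0],
              [0.0, -1.0, 2.0, 1.0],
              [0.0, 0.0, -1.0, 3.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 2.0]])
n, m = B.shape
Q = np.diag([2.0, 1.0])      # any SPD Q
omega = 0.5

Q_inv = np.linalg.inv(Q)
S = G + (B @ Q_inv @ B.T) / omega               # S_Q of (11)
S_inv = np.linalg.inv(S)
M = np.block([[G, B], [-B.T, omega * Q]])       # M of (12)

# S_Q is positive definite exactly when its symmetric part is SPD.
min_eig_sym_S = np.linalg.eigvalsh((S + S.T) / 2).min()

# Block formula (14) for the inverse of M.
M_inv = np.block([
    [S_inv, -S_inv @ B @ Q_inv / omega],
    [Q_inv @ B.T @ S_inv / omega,
     Q_inv / omega - Q_inv @ B.T @ S_inv @ B @ Q_inv / omega**2],
])
formula_error = np.linalg.norm(M_inv @ M - np.eye(n + m))
print(min_eig_sym_S > 0, formula_error)
```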

Lemma 2.3.

Under the same conditions as in Lemma 2.1, the iteration matrix of Algorithm 1 is

(15)

T=M^{-1}N=\smash[t]{\begin{pmatrix}0&-S_{Q}^{-1}B\\ 0&I-\dfrac{1}{\omega}Q^{-1}B^{T}S_{Q}^{-1}B\end{pmatrix}}

and the eigenvalues of $T$ are $0$ with algebraic multiplicity $n$ , $1$ with algebraic multiplicity $m-s$ , and the remaining $s$ eigenvalues are $\omega\mu/(1+\omega\mu)$ , where $s$ is the rank of $B$ and $\mu$ is a generalized eigenvalue of $G$ and $BQ^{-1}B^{T}$ corresponding to the generalized eigenvector $x\notin\mathop{\mathrm{Null}}(B^{T})$ .

Proof.

It follows from (12) and (14) that

T=\begin{pmatrix}G&B\\ -B^{T}&\omega Q\end{pmatrix}^{-1}\begin{pmatrix}0&0\\ 0&\omega Q\end{pmatrix}=\begin{pmatrix}0&-S_{Q}^{-1}B\\ 0&I-\dfrac{1}{\omega}Q^{-1}B^{T}S_{Q}^{-1}B\end{pmatrix}.

Clearly, $T$ has an eigenvalue $0$ with algebraic multiplicity $n$ , and the remaining $m$ eigenvalues are $1-\lambda/\omega$ , where $\lambda$ is an eigenvalue of $Q^{-1}B^{T}S_{Q}^{-1}B$ .

Since $S_{Q}$ is positive definite and $Q$ is SPD, $Q^{-1}B^{T}S_{Q}^{-1}B$ is nonsingular when $B$ has full column rank. Thus, $\lambda=0$ if and only if $B$ is column rank-deficient. In this case, $1$ is an eigenvalue of $T$ with algebraic multiplicity $m-s$ .

If $\lambda\neq 0$ , note that $Q^{-1}B^{T}S_{Q}^{-1}B$ and $S_{Q}^{-1}BQ^{-1}B^{T}$ possess the same nonzero eigenvalues, and $\lambda$ is also an eigenvalue of $S_{Q}^{-1}BQ^{-1}B^{T}$ . Then there exists $x\notin\mathop{\mathrm{Null}}(B^{T})$ such that $S_{Q}^{-1}BQ^{-1}B^{T}x=\lambda x$ . Combining with (11) leads to

(16)

Gx=\dfrac{\omega-\lambda}{\omega\lambda}BQ^{-1}B^{T}x.

Hence there exists a generalized eigenvalue $\mu$ of $G$ and $BQ^{-1}B^{T}$ corresponding to the generalized eigenvector $x\notin\mathop{\mathrm{Null}}(B^{T})$ such that $\mu=\tfrac{\omega-\lambda}{\omega\lambda}$ , i.e., $\lambda=\tfrac{\omega}{1+\omega\mu}$ . Therefore, we know that the remaining eigenvalues of $T$ are $1-\tfrac{1}{1+\omega\mu}=\tfrac{\omega\mu}{1+\omega\mu}.$
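The spectrum described in Lemma 2.3 can be observed numerically. The sketch below assumes a nonsingular $G$ , in which case $1/\mu$ is an eigenvalue of $Q^{-1}B^{T}G^{-1}B$ (multiply $Gx=\mu BQ^{-1}B^{T}x$ by $Q^{-1}B^{T}G^{-1}$ ), so that $\omega\mu/(1+\omega\mu)=\omega/(\omega+1/\mu)$ . The data and variable names are illustrative assumptions.

```python
import numpy as np

# Illustrative data (assumed): nonsingular unsymmetric G, full-rank B.
G = np.array([[4.0, 1.0, 0.0, 0.0],
              [-1.0, 3.0, 1.0, 0.0],
              [0.0, -1.0, 2.0, 1.0],
              [0.0, 0.0, -1.0, 3.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 2.0]])
n, m = B.shape
Q = np.diag([2.0, 1.0])
omega = 0.5
Q_inv = np.linalg.inv(Q)

# Iteration matrix T = M^{-1} N of (15).
M = np.block([[G, B], [-B.T, omega * Q]])
N = np.zeros((n + m, n + m))
N[n:, n:] = omega * Q
eig_T = np.linalg.eigvals(np.linalg.solve(M, N))

# nu = 1/mu are the eigenvalues of Q^{-1} B^T G^{-1} B (valid for nonsingular G).
nu = np.linalg.eigvals(Q_inv @ B.T @ np.linalg.solve(G, B))
predicted = omega / (omega + nu)      # equals omega*mu / (1 + omega*mu)

num_zero = int(np.sum(np.abs(eig_T) < 1e-8))
max_gap = max(np.min(np.abs(eig_T - p)) for p in predicted)
print(num_zero, max_gap)
```

Here $B$ has full column rank, so $s=m$ and no eigenvalue equal to $1$ appears.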

We should emphasize that Lemmas 2.1, 2.2 and 2.3 hold even if $B$ is column rank-deficient. From Lemma 2.2, we know that $A=M-N$ is a splitting of $A$ . Then the convergence analysis of Algorithm 1 can be based on the spectral properties of $T=M^{-1}N$ . In the following, we discuss the convergence of Algorithm 1 first when $B$ has full column rank and then when it does not.

2.1 Convergence analysis when $B$ has full column rank

In this case, $A$ is nonsingular and the saddle-point system (1) has a unique solution.

Theorem 2.1.

Suppose $B\in\mathds{R}^{n\times m}$ has full column rank and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T})$ . For any SPD $Q\in\mathds{R}^{m\times m}$ , let $\eta$ be defined by (13). If $0<\omega<1/(-2\eta)_{+}$ , then the sequence $\{x_{k},y_{k}\}$ produced by Algorithm 1 converges to the unique solution of saddle-point system (1).

Proof 2.2.

Algorithm 1 is convergent if and only if the spectral radius of $T$ is less than $1$ [42, Theorem 4.1]. Note that $0<\omega<1/(-2\eta)_{+}\leq 1/(-\eta)_{+}$ and the conditions of Lemma 2.1 hold. As $B$ has full column rank, it follows from Lemma 2.3 that $1$ is not an eigenvalue of $T$ and then

(17)

\rho(T)=\max_{\mu}\dfrac{\omega|\mu|}{|1+\omega\mu|}=\max_{\mu}\sqrt{\dfrac{(\omega\mu_{1})^{2}+(\omega\mu_{2})^{2}}{(1+\omega\mu_{1})^{2}+(\omega\mu_{2})^{2}}},

where $\mu=\mu_{1}+{\rm i}\mu_{2}$ is the generalized eigenvalue of $G$ and $BQ^{-1}B^{T}$ corresponding to the generalized eigenvector $x\notin\mathop{\mathrm{Null}}(B^{T})$ . Since $x\notin\mathop{\mathrm{Null}}(B^{T})$ and $Q$ is SPD, we have $x^{*}BQ^{-1}B^{T}x>0$ . Combining with (16) gives $\mu=\frac{x^{*}Gx}{x^{*}BQ^{-1}B^{T}x}$ . Then

(18)

\mu_{1}=\dfrac{x^{*}(G+G^{T})x}{2x^{*}BQ^{-1}B^{T}x}=\dfrac{x^{*}Hx}{x^{*}BQ^{-1}B^{T}x}\geq\eta.

Note that $\eta>-1/(2\omega)$ and $\omega>0$ , so that $1+\omega\mu_{1}\geq 1+\omega\eta>1/2$ . Hence $|1+\omega\mu|^{2}-|\omega\mu|^{2}=1+2\omega\mu_{1}>0$ for every such $\mu$ . This together with (17) leads to $\rho(T)<1$ . Therefore, Algorithm 1 is convergent.

Remark 1.

From (17) we see that $\rho(T)$ decreases as $\omega$ decreases. This means that the convergence rate of Algorithm 1 will improve as $\omega$ decreases. In particular, in the limiting case $\omega=0$ (which means no splitting), $\rho(T)=0$ . Algorithm 1 then reduces to the exact method for problem (1). This is consistent with (7), i.e., Algorithm 1 performs only one iteration. In addition, since $\rho(T)\rightarrow 0$ as $|\mu|\rightarrow 0$ , $Q$ should be chosen such that the generalized eigenvalues of $G$ and $BQ^{-1}B^{T}$ are very close to $0$ . Therefore, we can choose $Q$ with very small norm.
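The dependence of $\rho(T)$ on $\omega$ noted in Remark 1 can be observed on a small example. The helper `spectral_radius` and the data are assumptions made for illustration; for this positive definite $G$ , every $\omega>0$ is admissible.

```python
import numpy as np

def spectral_radius(omega, G, B, Q):
    """Spectral radius of T = M^{-1} N for the splitting (4)."""
    n, m = B.shape
    M = np.block([[G, B], [-B.T, omega * Q]])
    N = np.zeros((n + m, n + m))
    N[n:, n:] = omega * Q
    return float(np.max(np.abs(np.linalg.eigvals(np.linalg.solve(M, N)))))

# Illustrative data (assumed): unsymmetric positive definite G, full-rank B.
G = np.array([[4.0, 1.0, 0.0, 0.0],
              [-1.0, 3.0, 1.0, 0.0],
              [0.0, -1.0, 2.0, 1.0],
              [0.0, 0.0, -1.0, 3.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 2.0]])
Q = np.eye(2)

omegas = [0.01, 0.1, 1.0, 10.0]
radii = [spectral_radius(w, G, B, Q) for w in omegas]
for w, r in zip(omegas, radii):
    print(f"omega = {w:6.2f}   rho(T) = {r:.4f}")
```

Smaller $\omega$ gives faster convergence in exact arithmetic, at the price of a larger weight $1/\omega$ on $BQ^{-1}B^{T}$ in the system solved at each step.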

Remark 2.

If $G$ is positive semidefinite, we see that $\eta\geq 0$ . Then Algorithm 1 is convergent for any $\omega>0$ .

2.2 Convergence analysis when $B$ is rank-deficient

In this case, $A$ is singular. We assume that system (1) is solvable and show that Algorithm 1 is semi-convergent. To this end, we introduce some preliminaries on the semi-convergence of iteration scheme (10) for a general linear system $Az=\ell$ .

Definition 3.

(Berman and Plemmons [9, Lemma 6.13]) Iteration (10) is semi-convergent if, for any initial guess $z_{0}$ , the iteration sequence $\{z_{k}\}$ produced by (10) converges to a solution $z$ of $Az=\ell$ such that $z=(I-T)^{D}M^{-1}\ell+[I-(I-T)^{D}(I-T)]z_{0},$ where $(I-T)^{D}$ denotes the Drazin inverse [14] of $I-T$ .

Lemma 4 ([9, Theorem 6.19]).

Iteration (10) is semi-convergent if and only if ${\rm index}(I-T)=1$ and $v(T)<1$ , where ${\rm index}(I-T)$ is the smallest nonnegative integer $k$ such that the ranks of $(I-T)^{k}$ and $(I-T)^{k+1}$ are equal, and $v(T)=\max\{|\lambda|:~{}\lambda\in{\rm sp}(T),~{}\lambda\neq 1\}$ is called the pseudo-spectral radius of $T$ .

Lemma 5 ([49, Theorem 2.5]).

${\rm index}(I-T)=1$ holds if and only if, for all $0\neq w\in\mathop{\mathrm{Range}}(A)$ , $w\notin\mathop{\mathrm{Null}}({AM}^{-1})$ , i.e., $\mathop{\mathrm{Range}}(A)\cap\mathop{\mathrm{Null}}({AM}^{-1})=\{0\}$ .

In the following, we analyze the semi-convergence property for Algorithm 1. By Lemma 4, first, we need to show ${\rm index}(I-T)=1$ .

Theorem 2.3.

Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T})$ . For any SPD $Q\in\mathds{R}^{m\times m}$ , let $\eta$ be defined by (13). If $0<\omega<1/(-\eta)_{+}$ , then ${\rm index}(I-T)=1$ .

Proof 2.4.

Suppose $0\neq w\in\mathop{\mathrm{Range}}(A)$ . Then there is $v=(v_{1},v_{2})\in\mathds{R}^{n+m}$ such that

(19)

w=Av=\begin{pmatrix}G&B\\ -B^{T}&0\end{pmatrix}\begin{pmatrix}v_{1}\\ v_{2}\end{pmatrix}=\begin{pmatrix}Gv_{1}+Bv_{2}\\ -B^{T}v_{1}\end{pmatrix}\neq 0.

By (14), we have

(20)

\begin{aligned}{AM}^{-1}w&=\begin{pmatrix}I&0\\ -B^{T}S_{Q}^{-1}&\dfrac{1}{\omega}B^{T}S_{Q}^{-1}BQ^{-1}\end{pmatrix}\begin{pmatrix}Gv_{1}+Bv_{2}\\ -B^{T}v_{1}\end{pmatrix}\\ &=\begin{pmatrix}Gv_{1}+Bv_{2}\\ -B^{T}S_{Q}^{-1}(Gv_{1}+Bv_{2})-\dfrac{1}{\omega}B^{T}S_{Q}^{-1}BQ^{-1}B^{T}v_{1}\end{pmatrix}.\end{aligned}

If $Gv_{1}+Bv_{2}\neq 0$ , clearly, ${AM}^{-1}w\neq 0$ , which shows that $w\notin\mathop{\mathrm{Null}}({AM}^{-1})$ .

If $Gv_{1}+Bv_{2}=0$ , it follows from (19) that $B^{T}v_{1}\neq 0$ and (20) yields

(21)

{AM}^{-1}w=\begin{pmatrix}0\\ -\dfrac{1}{\omega}B^{T}S_{Q}^{-1}BQ^{-1}B^{T}v_{1}\end{pmatrix}.

Note that $Q$ is SPD and $B^{T}v_{1}\neq 0$ , so that $BQ^{-1}B^{T}v_{1}\neq 0$ . Then we would have

B^{T}S_{Q}^{-1}BQ^{-1}B^{T}v_{1}\neq 0.

Indeed, if $B^{T}S_{Q}^{-1}BQ^{-1}B^{T}v_{1}=0$ , clearly $v_{1}^{T}BQ^{-1}B^{T}S_{Q}^{-1}BQ^{-1}B^{T}v_{1}=0$ . Since $S_{Q}$ is positive definite, $S_{Q}^{-1}$ is also positive definite, which leads to $BQ^{-1}B^{T}v_{1}=0$ . This is a contradiction. Therefore, we still get $w\notin\mathop{\mathrm{Null}}({AM}^{-1})$ by (21). Summing up, for any $0\neq w\in\mathop{\mathrm{Range}}(A)$ , $w\notin\mathop{\mathrm{Null}}({AM}^{-1})$ . The result follows from Lemma 5.

Next, we show that $v(T)<1$ .

Theorem 2.5.

Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T}\!\,)$ . For any SPD $Q\in\mathds{R}^{m\times m}$ , let $\eta$ be defined by (13). If $0<\omega<1/(-2\eta)_{+}$ , then $v(T)<1$ .

Proof 2.6.

Since $0<\omega<1/(-2\eta)_{+}\leq 1/(-\eta)_{+}$ , the conditions of Lemma 2.1 hold. Note the definition of the pseudo-spectral radius in Lemma 4. From Lemma 2.3,

v(T)=\max_{\mu}\dfrac{\omega|\mu|}{|1+\omega\mu|}=\max_{\mu}\sqrt{\dfrac{(\omega\mu_{1})^{2}+(\omega\mu_{2})^{2}}{(1+\omega\mu_{1})^{2}+(\omega\mu_{2})^{2}}},

where $\mu=\mu_{1}+{\rm i}\mu_{2}$ is the generalized eigenvalue of $G$ and $BQ^{-1}B^{T}$ that corresponds to the generalized eigenvector $x\notin\mathop{\mathrm{Null}}(B^{T})$ . By (18), $\omega>0$ and $\eta>-1/(2\omega)$ , we have $1+2\omega\mu_{1}\geq 1+2\omega\eta>0$ , giving $v(T)<1$ .

Combining Lemma 4 with Theorems 2.3 and 2.5 and $1/(-2\eta)_{+}\leq 1/(-\eta)_{+}$ , we get the following convergence result.

Theorem 2.7.

Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient, and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T}\!\,)$ . For any SPD $Q\in\mathds{R}^{m\times m}$ , let $\eta$ be defined by (13). If $0<\omega<1/(-2\eta)_{+}$ , then the sequence $\{x_{k},y_{k}\}$ produced by Algorithm 1 is semi-convergent to a solution of the singular saddle-point system (1).
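Semi-convergence can be illustrated on a small singular example. In the sketch below, $B$ has two identical columns (rank $s=1$ , $m=2$ ), the right-hand side is built from a known vector so that (1) is solvable, and the stationary iteration (10) is run for a fixed number of steps. All data are made up for illustration.

```python
import numpy as np

# Illustrative data (assumed): positive definite unsymmetric G, rank-deficient B.
G = np.array([[4.0, 1.0, 0.0, 0.0],
              [-1.0, 3.0, 1.0, 0.0],
              [0.0, -1.0, 2.0, 1.0],
              [0.0, 0.0, -1.0, 3.0]])
B = np.array([[1.0, 1.0],
              [0.0, 0.0],
              [1.0, 1.0],
              [0.0, 0.0]])
n, m = B.shape
Q = np.eye(m)
omega = 0.5

A = np.block([[G, B], [-B.T, np.zeros((m, m))]])   # singular because rank(B) < m
M = np.block([[G, B], [-B.T, omega * Q]])
N = M - A

# Consistent right-hand side built from a known solution.
z_true = np.array([1.0, -1.0, 2.0, 0.5, 0.3, -0.2])
ell = A @ z_true

T = np.linalg.solve(M, N)
c = np.linalg.solve(M, ell)
z = np.zeros(n + m)
for _ in range(200):
    z = T @ z + c                                  # iteration (10)

residual = np.linalg.norm(A @ z - ell)
eig_T = np.linalg.eigvals(T)
num_one = int(np.sum(np.abs(eig_T - 1.0) < 1e-8))  # expected m - s = 1
x_matches = np.allclose(z[:n], z_true[:n])
print(residual, num_one, x_matches)
```

The $x$ part of the limit is unique here because $G$ is positive definite, while the $y$ part is determined only up to $\mathop{\mathrm{Null}}(B)$ and depends on $z_{0}$ .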

3 Inexact augmented Lagrangian algorithm

In this section, we develop and analyze inexact SPAL to solve (1). Let $\ell=(f,g)$ , $z_{k}=(x_{k},y_{k})$ , and $r_{k}=Az_{k}-\ell$ . It follows from (10) and $A=M-N$ that Algorithm 1 is equivalent to

(22)

z_{k+1}=M^{-1}Nz_{k}+M^{-1}\ell=M^{-1}(M-A)z_{k}+M^{-1}\ell=z_{k}-M^{-1}r_{k},

where $M$ and $N$ are defined in (12). To describe the inexact version of Algorithm 1, as done in [30], we introduce a nonlinear mapping $\Psi:\mathds{R}^{n+m}\longrightarrow\mathds{R}^{n+m}$ such that for any given $r\in\mathds{R}^{n+m}$ , $\Psi(r)$ approximates the solution $\Delta z$ of $M\Delta z=r$ in that

(23)

\|r-M\Psi(r)\|_{*}\leq\delta\|r\|_{*}

for some $\delta\in[0,1)$ and some norm $\|\cdot\|_{*}$ . We obtain the inexact augmented Lagrangian algorithm of Algorithm 2, where the main idea is to approximate $M^{-1}r_{k}$ in (22).

Algorithm 2 Inexact augmented Lagrangian algorithm

1: Given $z_{0}=(x_{0},y_{0})\in\mathds{R}^{n+m}$ , $\omega>0$ , $0\leq\delta<1$ , and SPD $Q$ , set $k=0$

2: while a stopping condition is not satisfied do

3: Compute $r_{k}=Az_{k}-\ell$

4: Compute $\Psi(r_{k})\approx M^{-1}r_{k}$ satisfying (23).

5: Compute $z_{k+1}=z_{k}-\Psi(r_{k})$

6: Increment $k$ by $1$

7: end while

In our convergence analysis we use $\|\cdot\|_{P_{\beta}}$ in (23), where $P_{\beta}=\begin{pmatrix}I&0\\ 0&\beta Q^{-1}\end{pmatrix}$ is SPD and $\beta>0$ is an arbitrary constant. By Algorithm 2,

(24)

\begin{aligned}r_{k+1}&=Az_{k+1}-\ell=A(z_{k}-\Psi(r_{k}))-\ell=r_{k}-A\Psi(r_{k})\\ &=(I-AM^{-1})r_{k}+AM^{-1}(r_{k}-M\Psi(r_{k}))\\ &=NM^{-1}r_{k}+(I-NM^{-1})(r_{k}-M\Psi(r_{k})).\end{aligned}

Likewise, we discuss the convergence of Algorithm 2 when $B$ does or does not have full column rank, respectively.
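The structure of Algorithm 2 can be sketched with a synthetic inexact solver. Here $\Psi(r)=M^{-1}(r-e)$ with a random perturbation $e$ scaled so that $\|e\|_{P_{\beta}}=\delta\|r\|_{P_{\beta}}$ , which satisfies (23) with equality; this choice of $\Psi$ , the helper names, and the data are assumptions for illustration only (the paper uses an iterative inner solver instead). The sketch also checks the residual recurrence (24) for one step.

```python
import numpy as np

def p_norm(v, P):
    """Vector norm induced by the SPD matrix P."""
    return float(np.sqrt(v @ P @ v))

def make_psi(M, P, delta, rng):
    """Synthetic inexact solver: Psi(r) = M^{-1}(r - e), ||e||_P = delta * ||r||_P."""
    def psi(r):
        e = rng.standard_normal(r.size)
        e *= delta * p_norm(r, P) / p_norm(e, P)
        return np.linalg.solve(M, r - e)
    return psi

def inexact_spal(A, ell, z0, psi, num_iter):
    """Sketch of Algorithm 2 with a fixed number of iterations."""
    z = z0.copy()
    for _ in range(num_iter):
        r = A @ z - ell          # step 3
        z = z - psi(r)           # steps 4-5
    return z

# Illustrative data (assumed).
G = np.array([[4.0, 1.0, 0.0, 0.0],
              [-1.0, 3.0, 1.0, 0.0],
              [0.0, -1.0, 2.0, 1.0],
              [0.0, 0.0, -1.0, 3.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 2.0]])
n, m = B.shape
Q = np.diag([2.0, 1.0])
omega, beta, delta = 0.5, 1.0, 0.1
A = np.block([[G, B], [-B.T, np.zeros((m, m))]])
M = np.block([[G, B], [-B.T, omega * Q]])
N = M - A
ell = np.array([1.0, 2.0, 3.0, 4.0, 1.0, -1.0])
P = np.block([[np.eye(n), np.zeros((n, m))],
              [np.zeros((m, n)), beta * np.linalg.inv(Q)]])
psi = make_psi(M, P, delta, np.random.default_rng(0))

# One step by hand: check (23) and the recurrence (24).
z0 = np.zeros(n + m)
r0 = A @ z0 - ell
d0 = psi(r0)
inexactness = p_norm(r0 - M @ d0, P) / p_norm(r0, P)
r1 = A @ (z0 - d0) - ell
NMinv = N @ np.linalg.inv(M)
r1_from_24 = NMinv @ r0 + (np.eye(n + m) - NMinv) @ (r0 - M @ d0)

z_few = inexact_spal(A, ell, z0, psi, num_iter=5)
print(inexactness, np.allclose(r1, r1_from_24))
```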

3.1 Convergence analysis when $B$ has full column rank

Note that $P_{\beta}$ is SPD, and (24) gives

P_{\beta}^{\tfrac{1}{2}}r_{k+1}=P_{\beta}^{\tfrac{1}{2}}NM^{-1}P_{\beta}^{-\tfrac{1}{2}}P_{\beta}^{\tfrac{1}{2}}r_{k}+P_{\beta}^{\tfrac{1}{2}}(I-NM^{-1})P_{\beta}^{-\tfrac{1}{2}}P_{\beta}^{\tfrac{1}{2}}(r_{k}-M\Psi(r_{k})).

This along with (23) yields

(25)

\begin{aligned}\|r_{k+1}\|_{P_{\beta}}&\leq\|P_{\beta}^{\tfrac{1}{2}}NM^{-1}P_{\beta}^{-\tfrac{1}{2}}\|\|r_{k}\|_{P_{\beta}}+\|P_{\beta}^{\tfrac{1}{2}}(I-NM^{-1})P_{\beta}^{-\tfrac{1}{2}}\|\|r_{k}-M\Psi(r_{k})\|_{P_{\beta}}\\ &\leq\big(\|P_{\beta}^{\tfrac{1}{2}}NM^{-1}P_{\beta}^{-\tfrac{1}{2}}\|+\delta\|I-P_{\beta}^{\tfrac{1}{2}}NM^{-1}P_{\beta}^{-\tfrac{1}{2}}\|\big)\|r_{k}\|_{P_{\beta}}\\ &=\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)\|r_{k}\|_{P_{\beta}}.\end{aligned}

The following result provides sufficient conditions for $\|NM^{-1}\|_{P_{\beta}}<1$ .

Lemma 1.

Suppose $B\in\mathds{R}^{n\times m}$ has full column rank and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T}\!\,)$ . For any $\beta>0$ and SPD $Q\in\mathds{R}^{m\times m}$ , let $\eta$ be defined by (13) and $\lambda_{1}$ be the minimum eigenvalue of $2\omega H+BQ^{-1}B^{T}$ . Then, $\lambda_{1}>0$ and if $0<\omega<\min\left\{1/(-2\eta)_{+},\,\sqrt{\lambda_{1}/\beta}\,\right\}$ , we have $\|NM^{-1}\|_{P_{\beta}}<1$ .

Proof 3.1.

It follows from $0<\omega<1/(-2\eta)_{+}\leq 1/(-\eta)_{+}$ that $S_{Q}$ is positive definite. Combining with (12) and (14) leads to

\begin{aligned}P_{\beta}^{\tfrac{1}{2}}NM^{-1}P_{\beta}^{-\tfrac{1}{2}}&=P_{\beta}^{\tfrac{1}{2}}\begin{pmatrix}0&0\\ B^{T}S_{Q}^{-1}&I-\dfrac{1}{\omega}B^{T}S_{Q}^{-1}BQ^{-1}\end{pmatrix}P_{\beta}^{-\tfrac{1}{2}}\\ &=\begin{pmatrix}0&0\\ \sqrt{\beta}Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}&I-E\end{pmatrix}=:\widetilde{T},\end{aligned}

where $E=\frac{1}{\omega}Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}BQ^{-\tfrac{1}{2}}$ . This shows that

(26)

\|NM^{-1}\|_{P_{\beta}}=\|P_{\beta}^{\tfrac{1}{2}}NM^{-1}P_{\beta}^{-\tfrac{1}{2}}\|=\Big(\rho(\widetilde{T}\widetilde{T}^{T})\Big)^{\tfrac{1}{2}}.

By direct calculation and (11), we have

(27)

\begin{aligned}\rho\left(\widetilde{T}\widetilde{T}^{T}\right)&=\rho\left((I-E)(I-E^{T}\!\,)+\beta Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}S_{Q}^{-T}BQ^{-\tfrac{1}{2}}\right)\\ &=\rho\left(I-\tfrac{1}{\omega}Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}\Big(S_{Q}+S_{Q}^{T}-\tfrac{1}{\omega}BQ^{-1}B^{T}-\omega\beta I\Big)S_{Q}^{-T}BQ^{-\tfrac{1}{2}}\right)\\ &=\rho\left(I-\tfrac{1}{\omega^{2}}Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}\Big(2\omega H+BQ^{-1}B^{T}-\omega^{2}\beta I\Big)S_{Q}^{-T}BQ^{-\tfrac{1}{2}}\right).\end{aligned}

Since $B$ has full column rank and $\omega>0$ , if $2\omega H+BQ^{-1}B^{T}-\omega^{2}\beta I$ is SPD, then so is $\tfrac{1}{\omega^{2}}Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}\Big(2\omega H+BQ^{-1}B^{T}-\omega^{2}\beta I\Big)S_{Q}^{-T}BQ^{-\tfrac{1}{2}}$ . Then all eigenvalues of $\widetilde{T}\widetilde{T}^{T}$ are less than $1$ , i.e., $\rho(\widetilde{T}\widetilde{T}^{T})<1$ . Therefore, in order to prove $\|NM^{-1}\|_{P_{\beta}}<1$ , we just need to find $\omega$ to guarantee that $2\omega H+BQ^{-1}B^{T}-\omega^{2}\beta I$ is positive definite. Since $H$ is positive definite on $\mathop{\mathrm{Null}}(B^{T})$ , (13) and $2\omega\eta>-1$ imply $2\omega H+BQ^{-1}B^{T}$ is positive definite. Thus, $\lambda_{1}>0$ . Combining with $\omega<\sqrt{\lambda_{1}/\beta}$ gives the result.
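The quantities in Lemma 1 are easy to evaluate on a small example. The sketch below computes $\lambda_{1}$ , checks $\omega<\sqrt{\lambda_{1}/\beta}$ , and compares $\|NM^{-1}\|_{P_{\beta}}$ with the square root of the last expression in (27). The weighted norm is evaluated through a Cholesky factor $P_{\beta}=LL^{T}$ , using $\|X\|_{P_{\beta}}=\|L^{T}XL^{-T}\|$ ; this is an implementation choice, and the data (with a diagonal $Q$ so that $Q^{-1/2}$ is trivial) are made up.

```python
import numpy as np

# Illustrative data (assumed); Q is diagonal.
G = np.array([[4.0, 1.0, 0.0, 0.0],
              [-1.0, 3.0, 1.0, 0.0],
              [0.0, -1.0, 2.0, 1.0],
              [0.0, 0.0, -1.0, 3.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 2.0]])
n, m = B.shape
Q = np.diag([2.0, 1.0])
omega, beta = 0.5, 1.0

H = (G + G.T) / 2
Q_inv = np.linalg.inv(Q)
Q_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(Q)))
S = G + (B @ Q_inv @ B.T) / omega

# lambda_1: smallest eigenvalue of 2*omega*H + B Q^{-1} B^T.
lam1 = np.linalg.eigvalsh(2 * omega * H + B @ Q_inv @ B.T).min()
omega_ok = omega < np.sqrt(lam1 / beta)

# ||N M^{-1}||_{P_beta} via the Cholesky factor of P_beta.
M = np.block([[G, B], [-B.T, omega * Q]])
N = np.zeros((n + m, n + m))
N[n:, n:] = omega * Q
P = np.block([[np.eye(n), np.zeros((n, m))],
              [np.zeros((m, n)), beta * Q_inv]])
L = np.linalg.cholesky(P)
NMinv = N @ np.linalg.inv(M)
norm_P = np.linalg.norm(L.T @ NMinv @ np.linalg.inv(L.T), 2)

# Last expression in (27).
K = Q_inv_sqrt @ B.T @ np.linalg.inv(S)
W = 2 * omega * H + B @ Q_inv @ B.T - omega**2 * beta * np.eye(n)
rho_27 = np.linalg.eigvalsh(np.eye(m) - K @ W @ K.T / omega**2).max()
print(lam1, omega_ok, norm_P, np.sqrt(rho_27))
```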

Remark 2.

The conditions in Lemma 1 are reasonable. Indeed, for any given $\omega_{0}\in(0,\,1/(-2\eta)_{+})$ , $2H+\frac{1}{\omega_{0}}BQ^{-1}B^{T}$ is SPD. Then when $0<\omega\leq\omega_{0}$ , we have

\lambda_{1}\geq\lambda_{\min}\left(2\omega H+\tfrac{\omega}{\omega_{0}}BQ^{-1}B^{T}\right)=\omega\lambda_{\min}\left(2H+\tfrac{1}{\omega_{0}}BQ^{-1}B^{T}\right).

Then the conditions in Lemma 1 can be replaced by

0<\omega<\min\left\{\omega_{0},\,\tfrac{1}{\beta}\lambda_{\min}\left(2H+\tfrac{1}{\omega_{0}}BQ^{-1}B^{T}\right)\right\}.

In particular, when $H$ is positive semidefinite, $\eta\geq 0$ and $2H+BQ^{-1}B^{T}$ is SPD. Then we can pick $\omega_{0}=1$ above and the last condition can be further simplified as

0<\omega<\min\left\{1,\,\tfrac{1}{\beta}\lambda_{\min}(2H+BQ^{-1}B^{T})\right\}.

Theorem 3.2.

Suppose $B\in\mathds{R}^{n\times m}$ has full column rank and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T}\!\,)$ . For any $\beta>0$ and SPD $Q\in\mathds{R}^{m\times m}$ , let $\eta$ and $\delta$ be defined by (13) and (23), and $\lambda_{1}>0$ be the minimum eigenvalue of $2\omega H+BQ^{-1}B^{T}$ . If $\omega$ and $\delta$ satisfy

0<\omega<\min\left\{\frac{1}{(-2\eta)_{+}},\,\sqrt{\frac{\lambda_{1}}{\beta}}\right\}\quad\mbox{and}\quad 0\leq\delta\leq\tfrac{1}{2}\Big(1-\|NM^{-1}\|_{P_{\beta}}\Big),

then $\{x_{k},y_{k}\}$ produced by Algorithm 2 converges to the unique solution of (1).

Proof 3.3.

It follows from Lemma 1 that $\|NM^{-1}\|_{P_{\beta}}<1$ , so that $\|I-NM^{-1}\|_{P_{\beta}}\leq 1+\|NM^{-1}\|_{P_{\beta}}<2.$ The result follows from (25) and

\begin{aligned}\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}&\leq\|NM^{-1}\|_{P_{\beta}}+\frac{1-\|NM^{-1}\|_{P_{\beta}}}{2}\|I-NM^{-1}\|_{P_{\beta}}\\ &<\|NM^{-1}\|_{P_{\beta}}+1-\|NM^{-1}\|_{P_{\beta}}=1.\end{aligned}

Remark 3.

From (25) we have $\|r_{k}\|_{P_{\beta}}\leq\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\|r_{0}\|_{P_{\beta}}.$ Hence, under the conditions of Theorem 3.2, $r_{k}$ converges to zero linearly. Let $z_{*}$ be the solution of (1). Then

\begin{aligned}\|z_{k}-z_{*}\|_{P_{\beta}}&=\|A^{-1}r_{k}\|_{P_{\beta}}=\|P_{\beta}^{\frac{1}{2}}A^{-1}P_{\beta}^{-\frac{1}{2}}P_{\beta}^{\frac{1}{2}}r_{k}\|\leq\|P_{\beta}^{\frac{1}{2}}A^{-1}P_{\beta}^{-\frac{1}{2}}\|\|P_{\beta}^{\frac{1}{2}}r_{k}\|\\ &=\|A^{-1}\|_{P_{\beta}}\|r_{k}\|_{P_{\beta}}\leq\|A^{-1}\|_{P_{\beta}}\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\|r_{0}\|_{P_{\beta}}\\ &=\|A^{-1}\|_{P_{\beta}}\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\|A(z_{0}-z_{*})\|_{P_{\beta}}\\ &\leq\|A^{-1}\|_{P_{\beta}}\|A\|_{P_{\beta}}\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\|z_{0}-z_{*}\|_{P_{\beta}}.\end{aligned}

This implies that $z_{k}$ converges linearly to $z_{*}$ under the conditions of Theorem 3.2.

Remark 4.

If $\beta=\delta$ in Theorem 3.2, since $\omega>0$ and $\delta\geq 0$ , we know that $\omega<\sqrt{\lambda_{1}/\delta}$ holds if and only if $\delta<\lambda_{1}/\omega^{2}$ . Then the conditions on $\omega$ and $\delta$ in Theorem 3.2 can be replaced by

0<\omega<\frac{1}{(-2\eta)_{+}}\quad\mbox{and}\quad 0\leq\delta<\min\left\{% \frac{\lambda_{1}}{\omega^{2}},\,\dfrac{1-\|NM^{-1}\|_{P_{\delta}}}{2}\right\}.

It follows from (26) and (27) that

	$\displaystyle\|NM^{-1}\|_{P_{\delta}}^{2}=\rho(\widetilde{T}\widetilde{T}^{T})=\rho\Big{(}I-\tfrac{1}{\omega^{2}}Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}(2\omega H+BQ^{-1}B^{T}-\delta\omega^{2}I)S_{Q}^{-T}BQ^{-\tfrac{1}{2}}\Big{)}$
(28)		$\displaystyle=\rho\left(I-\tfrac{1}{\omega^{2}}Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}(2\omega H+BQ^{-1}B^{T})S_{Q}^{-T}BQ^{-\tfrac{1}{2}}+\delta Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}S_{Q}^{-T}BQ^{-\tfrac{1}{2}}\right).$

Since $\widetilde{T}\widetilde{T}^{T}$ is symmetric positive semidefinite and $Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}S_{Q}^{-T}BQ^{-\tfrac{1}{2}}$ is SPD, $\|NM^{-1}\|_{P_{\delta}}$ increases with $\delta$ , and

\lim_{\delta\rightarrow\lambda_{1}/\omega^{2}}\|NM^{-1}\|_{P_{\delta}}=1,% \qquad\lim_{\delta\rightarrow 0^{+}}\|NM^{-1}\|_{P_{\delta}}=\sqrt{1-\tilde{% \lambda}_{1}/\omega^{2}}<1,

where $\tilde{\lambda}_{1}>0$ is the minimum eigenvalue of $Q^{-\tfrac{1}{2}}B^{T}S_{Q}^{-1}(2\omega H+BQ^{-1}B^{T})S_{Q}^{-T}BQ^{-\tfrac{% 1}{2}}$ . Then there exists $\delta>0$ such that $\|NM^{-1}\|_{P_{\delta}}<1$ . Therefore, for any given $0<\omega<1/(-2\eta)_{+}$ , Algorithm 2 is convergent for sufficiently small $\delta$ . Moreover, the larger $\omega$ is, the smaller $\delta$ should be. Therefore, a practical selection of $\delta$ could be a sequence $\{\delta_{k}\}$ such that $\delta_{k}\rightarrow 0$ as $k\rightarrow\infty$ .

Remark 5.

When $G$ is positive semidefinite, (13) yields $\eta\geq 0$ , so that $(-2\eta)_{+}=0$ . In this case, the sufficient conditions in Theorem 3.2 can be replaced by $0<\omega<\sqrt{\lambda_{1}/\beta}$ and $0\leq\delta\leq\tfrac{1}{2}\left(1-\|NM^{-1}\|_{P_{\beta}}\right)$ . Furthermore, from Remark 4 we know that the restrictions can also be replaced by $\omega>0$ and $0\leq\delta<\min\left\{\tfrac{\lambda_{1}}{\omega^{2}},\,\tfrac{1-\|NM^{-1}\|_{P_{\delta}}}{2}\right\}$ . This implies that when $G$ is positive semidefinite, for any $\omega>0$ , Algorithm 2 is convergent for sufficiently small $\delta$ .

3.2 Convergence analysis when $B$ is rank-deficient

Assume that the rank of $B$ is $s$ and $0<s<m$ . Let $B=U\begin{pmatrix}\Sigma&0\end{pmatrix}V^{T}$ be the singular value decomposition (SVD), where $U\in\mathds{R}^{n\times n}$ and $V\in\mathds{R}^{m\times m}$ are orthogonal matrices, $\Sigma=\begin{pmatrix}\Sigma_{s}\\ 0\end{pmatrix}\in\mathds{R}^{n\times s}$ has full column rank, and $\Sigma_{s}={\rm diag}\{\sigma_{1},\sigma_{2},\ldots,\sigma_{s}\}$ with all $\sigma_{j}>0$ contains the nonzero singular values of $B$ . Let $Q_{1}\in\mathds{R}^{s\times s}$ and $Q_{2}\in\mathds{R}^{(m-s)\times(m-s)}$ be SPD, and

(29)		$\displaystyle Q=V\begin{pmatrix}Q_{1}&0\\ 0&Q_{2}\end{pmatrix}V^{T}\!,\qquad~{}\widetilde{D}=\begin{pmatrix}U&0\\ 0&V\end{pmatrix},\qquad\quad~{}\widetilde{P}_{\beta}=\begin{pmatrix}I&0\\ 0&\beta Q_{1}^{-1}\end{pmatrix},$
(30)		$\displaystyle\widetilde{A}=\begin{pmatrix}U^{T}GU&\Sigma\\ -\Sigma^{T}&0\end{pmatrix},\qquad\widetilde{M}=\begin{pmatrix}U^{T}GU&\Sigma\\ -\Sigma^{T}&\omega Q_{1}\end{pmatrix},\qquad\widetilde{N}=\begin{pmatrix}0&0\\ 0&\omega Q_{1}\end{pmatrix}.$

Let $\widetilde{r}_{k}=\widetilde{D}^{T}\!r_{k}=\begin{pmatrix}\widetilde{r}_{k}^{a% },\,\widetilde{r}_{k}^{b}\end{pmatrix}$ , $\widetilde{\Psi}(r_{k})=\widetilde{D}^{T}\!\Psi(r_{k})=\begin{pmatrix}% \widetilde{\Psi}^{a}(r_{k}),\,\widetilde{\Psi}^{b}(r_{k})\end{pmatrix}$ with $\widetilde{r}_{k}^{a},\,\widetilde{\Psi}^{a}(r_{k})\in\mathds{R}^{n+s}$ . It follows from (4), (12), (29) and (30) that

	$\displaystyle\widetilde{D}^{T}\!A\widetilde{D}$	$\displaystyle=\begin{pmatrix}U^{T}\!&0\\ 0&V^{T}\!\end{pmatrix}\begin{pmatrix}G&B\\ -B^{T}&0\end{pmatrix}\begin{pmatrix}U&0\\ 0&V\end{pmatrix}=\begin{pmatrix}U^{T}\!GU&U^{T}\!BV\\ -V^{T}\!B^{T}U&0\end{pmatrix}$
(31)			$\displaystyle=\begin{pmatrix}U^{T}\!GU&\Sigma&0\\ -\Sigma^{T}\!&0&0\\ 0&0&0\end{pmatrix}=:\begin{pmatrix}\widetilde{A}&0\\ 0&0\end{pmatrix},$
	$\displaystyle\widetilde{D}^{T}\!M\widetilde{D}$	$\displaystyle=\begin{pmatrix}U^{T}\!&0\\ 0&V^{T}\!\end{pmatrix}\begin{pmatrix}G&B\\ -B^{T}&\omega Q\end{pmatrix}\begin{pmatrix}U&0\\ 0&V\end{pmatrix}=\begin{pmatrix}U^{T}\!GU&U^{T}\!BV\\ -V^{T}\!B^{T}U&\omega V^{T}\!QV\end{pmatrix}$
(32)			$\displaystyle=\begin{pmatrix}U^{T}\!GU&\Sigma&0\\ -\Sigma^{T}\!&\omega Q_{1}&0\\ 0&0&\omega Q_{2}\end{pmatrix}=:\begin{pmatrix}\widetilde{M}&0\\ 0&\omega Q_{2}\end{pmatrix},$
	$\displaystyle\widetilde{D}^{T}\!N\widetilde{D}$	$\displaystyle=\begin{pmatrix}U^{T}\!&0\\ 0&V^{T}\!\end{pmatrix}\begin{pmatrix}0&0\\ 0&\omega Q\end{pmatrix}\begin{pmatrix}U&0\\ 0&V\end{pmatrix}=\begin{pmatrix}0&0\\ 0&\omega V^{T}\!QV\end{pmatrix}$
(33)			$\displaystyle=\begin{pmatrix}0&0&0\\ 0&\omega Q_{1}&0\\ 0&0&\omega Q_{2}\end{pmatrix}=:\begin{pmatrix}\widetilde{N}&0\\ 0&\omega Q_{2}\end{pmatrix}.$
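The block structure in (31) and (32) can be checked numerically. The following is a minimal sketch, assuming dense NumPy arrays and randomly generated test data; the function name `transformed_blocks` and all numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np

def transformed_blocks(G, B, omega, Q1, Q2):
    """Return (D^T A D, D^T M D, s) for the saddle-point blocks built from G and B."""
    n, m = B.shape
    U, sing, Vt = np.linalg.svd(B, full_matrices=True)    # B = U [Sigma 0] V^T
    V = Vt.T
    s = int(np.sum(sing > 1e-12 * sing[0]))               # numerical rank of B
    Q = V @ np.block([[Q1, np.zeros((s, m - s))],
                      [np.zeros((m - s, s)), Q2]]) @ V.T   # Q as in (29)
    A = np.block([[G, B], [-B.T, np.zeros((m, m))]])
    M = np.block([[G, B], [-B.T, omega * Q]])
    D = np.block([[U, np.zeros((n, m))], [np.zeros((m, n)), V]])
    return D.T @ A @ D, D.T @ M @ D, s

rng = np.random.default_rng(0)
n, m, s_true = 6, 4, 2
G = rng.standard_normal((n, n)) + n * np.eye(n)            # unsymmetric test block
B = rng.standard_normal((n, s_true)) @ rng.standard_normal((s_true, m))  # rank 2
Q1, Q2 = 2.0 * np.eye(s_true), 3.0 * np.eye(m - s_true)
omega = 0.5
At, Mt, s = transformed_blocks(G, B, omega, Q1, Q2)

# The last m - s rows and columns of D^T A D vanish, and D^T M D
# decouples into the leading block and omega * Q2.
print(np.abs(At[n + s:, :]).max(), np.abs(At[:, n + s:]).max())
print(np.abs(Mt[n + s:, n + s:] - omega * Q2).max())
```

In this sketch the trailing zero block of the transformed $A$ reflects the singularity of the saddle-point matrix, while the transformed $M$ stays nonsingular because of the $\omega Q_{2}$ block.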

Based on the above notations, we have the following results.

Lemma 6.

Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient with rank $s$ . If (1) is solvable, then $\widetilde{r}_{k}^{b}=0$ for all $k\geq 1$ .

Proof 3.4.

Let $z_{*}$ be a solution of (1), and let $\widetilde{z}_{*}=\widetilde{D}^{T}\!z_{*}=\begin{pmatrix}\widetilde{z}_{*}^{a% },\,\widetilde{z}_{*}^{b}\end{pmatrix}$ , $\widetilde{z}_{k}=\widetilde{D}^{T}\!z_{k}=\begin{pmatrix}\widetilde{z}_{k}^{a% },\,\widetilde{z}_{k}^{b}\end{pmatrix}$ , and $\widetilde{\ell}=\widetilde{D}^{T}\!\ell=\begin{pmatrix}\widetilde{\ell}^{a},% \,\widetilde{\ell}^{b}\end{pmatrix}$ , where $\widetilde{z}_{*}^{a},\,\widetilde{z}_{k}^{a},\,\widetilde{\ell}^{a}\in\mathds% {R}^{n+s}$ . It follows from $Az_{*}=\ell$ and (31) that

\smash[t]{\widetilde{D}^{T}\!A\widetilde{D}\widetilde{z}_{*}=\begin{pmatrix}% \widetilde{A}&0\\ 0&0\end{pmatrix}\begin{pmatrix}\widetilde{z}_{*}^{a}\\ \widetilde{z}_{*}^{b}\end{pmatrix}=\begin{pmatrix}\widetilde{A}\widetilde{z}_{% *}^{a}\\ 0\end{pmatrix}=\begin{pmatrix}\widetilde{\ell}^{a}\\ \widetilde{\ell}^{b}\end{pmatrix},}

which shows that $\widetilde{\ell}^{b}=0$ . Then we have

\displaystyle\widetilde{r}_{k}=\widetilde{D}^{T}\!r_{k}=\widetilde{D}^{T}\!(Az_{k}-\ell)=\widetilde{D}^{T}\!A\widetilde{D}\widetilde{D}^{T}\!z_{k}-\widetilde{D}^{T}\!\ell=\smash[t]{\begin{pmatrix}\widetilde{A}\widetilde{z}_{k}^{a}-\widetilde{\ell}^{a}\\ -\widetilde{\ell}^{b}\end{pmatrix}}=\begin{pmatrix}\widetilde{r}_{k}^{a}\\ 0\end{pmatrix}.

Lemma 7.

Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient with rank $s$ . For any $\omega,\,\beta>0$ and SPD $Q_{1}\in\mathds{R}^{s\times s}$ and $Q_{2}\in\mathds{R}^{(m-s)\times(m-s)}$ , let $Q$ and $\delta$ be defined by (29) and (23). Then $\|\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\|_{\widetilde{P}_{\beta}}\leq\delta\|\widetilde{r}^{a}_{k}\|_{\widetilde{P}_{\beta}}$ .

Proof 3.5.

For any $x\in\mathds{R}^{n+m}$ and $\widetilde{x}=\widetilde{D}^{T}x=\begin{pmatrix}\widetilde{x}^{a},\,\widetilde% {x}^{b}\end{pmatrix}$ with $\widetilde{x}^{a}\in\mathds{R}^{n+s}$ , since $\widetilde{D}$ is an orthogonal matrix, from (29) and the definition of $P_{\beta}$ in Section 3, we have

	$\displaystyle\|x\|_{P_{\beta}}^{2}$	$\displaystyle=x^{T}P_{\beta}x=x^{T}\widetilde{D}\widetilde{D}^{T}P_{\beta}\widetilde{D}\widetilde{D}^{T}x=\begin{pmatrix}(\widetilde{x}^{a})^{T}&(\widetilde{x}^{b})^{T}\end{pmatrix}\begin{pmatrix}\widetilde{P}_{\beta}&0\\ 0&\beta Q_{2}^{-1}\end{pmatrix}\begin{pmatrix}\widetilde{x}^{a}\\ \widetilde{x}^{b}\end{pmatrix}$
(34)			$\displaystyle=\|\widetilde{x}^{a}\|_{\widetilde{P}_{\beta}}^{2}+\|\widetilde{x}^{b}\|_{\beta Q_{2}^{-1}}^{2}.$

Note that (32) and Lemma 6 give

\displaystyle\widetilde{D}^{T}\left(r_{k}-M\Psi(r_{k})\right)=\widetilde{r}_{k}-\widetilde{D}^{T}M\widetilde{D}\widetilde{\Psi}(r_{k})=\smash{\begin{pmatrix}\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\\ -\omega Q_{2}\widetilde{\Psi}^{b}(r_{k})\end{pmatrix}}.

This along with (34) leads to

	$\displaystyle\|r_{k}-M\Psi(r_{k})\|_{P_{\beta}}^{2}=\|\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\|_{\widetilde{P}_{\beta}}^{2}+\|\omega Q_{2}\widetilde{\Psi}^{b}(r_{k})\|_{\beta Q_{2}^{-1}}^{2}$
	$\displaystyle=\|\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})\|_{\widetilde{P}_{\beta}}^{2}+\omega^{2}\beta\|\widetilde{\Psi}^{b}(r_{k})\|_{Q_{2}}^{2}.$

Using (23), (34) and $\widetilde{r}^{b}_{k}=0$ yields

\displaystyle\|\widetilde{r}^{a}_{k}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})% \|_{\widetilde{P}_{\beta}}\leq\|r_{k}-M\Psi(r_{k})\|_{P_{\beta}}\leq\delta\|r_% {k}\|_{P_{\beta}}=\delta\|\widetilde{r}^{a}_{k}\|_{\widetilde{P}_{\beta}}.

We are now ready to establish the convergence theorem for Algorithm 2 when $B$ is rank-deficient.

Theorem 3.6.

Suppose $B\in\mathds{R}^{n\times m}$ is rank-deficient with rank $s$ and $G\in\mathds{R}^{n\times n}$ is unsymmetric but positive definite on $\mathop{\mathrm{Null}}(B^{T}\!\,)$ . For any $\beta>0$ and SPD $Q_{1}\in\mathds{R}^{s\times s}$ and $Q_{2}\in\mathds{R}^{(m-s)\times(m-s)}$ , let $Q$ , $\eta$ and $\delta$ be defined by (29), (13) and (23), and $\lambda_{1}$ be the minimum eigenvalue of $2\omega H+BQ^{-1}B^{T}$ . If $\omega$ and $\delta$ satisfy

0<\omega<\min\left\{\frac{1}{(-2\eta)_{+}},\,\sqrt{\frac{\lambda_{1}}{\beta}}% \right\}\quad\mbox{and}\quad 0\leq\delta\leq\tfrac{1}{2}\Big{(}1-\|\widetilde{% N}\widetilde{M}^{-1}\|_{\widetilde{P}_{\beta}}\Big{)},

then $\{x_{k},y_{k}\}$ produced by Algorithm 2 converges to a solution of the singular saddle-point system (1).

Proof 3.7.

By Lemma 6, we just need to prove $\lim\limits_{k\rightarrow\infty}\widetilde{r}_{k}^{a}=0$ . Since $\widetilde{D}$ is an orthogonal matrix, it follows from (24), (29), (32) and (33) that

	$\displaystyle\begin{pmatrix}\widetilde{r}_{k+1}^{a}\\ \widetilde{r}_{k+1}^{b}\end{pmatrix}=\widetilde{r}_{k+1}=\widetilde{D}^{T}\!r_% {k+1}=\widetilde{D}^{T}\!\left[NM^{-1}r_{k}+(I-NM^{-1})(r_{k}-M\Psi(r_{k}))\right]$
	$\displaystyle=\widetilde{D}^{T}\!N\widetilde{D}(\widetilde{D}^{T}\!M\widetilde% {D})^{-1}\widetilde{D}^{T}\!r_{k}+\left[I-\widetilde{D}^{T}\!N\widetilde{D}(% \widetilde{D}^{T}\!M\widetilde{D})^{-1}\right]\left(\widetilde{D}^{T}\!r_{k}-% \widetilde{D}^{T}\!M\widetilde{D}\widetilde{D}^{T}\!\Psi(r_{k})\right)$
	$\displaystyle=\begin{pmatrix}\widetilde{N}\widetilde{M}^{-1}&0\\ 0&I\end{pmatrix}\begin{pmatrix}\widetilde{r}_{k}^{a}\\ \widetilde{r}_{k}^{b}\end{pmatrix}+\left[I-\begin{pmatrix}\widetilde{N}% \widetilde{M}^{-1}&0\\ 0&I\end{pmatrix}\right]\left[\begin{pmatrix}\widetilde{r}_{k}^{a}\\ \widetilde{r}_{k}^{b}\end{pmatrix}-\begin{pmatrix}\widetilde{M}&0\\ 0&\omega Q_{2}\end{pmatrix}\begin{pmatrix}\widetilde{\Psi}^{a}(r_{k})\\ \widetilde{\Psi}^{b}(r_{k})\end{pmatrix}\right]$
	$\displaystyle=\begin{pmatrix}\widetilde{N}\widetilde{M}^{-1}\widetilde{r}_{k}^% {a}+(I-\widetilde{N}\widetilde{M}^{-1})(\widetilde{r}_{k}^{a}-\widetilde{M}% \widetilde{\Psi}^{a}(r_{k}))\\ \widetilde{r}_{k}^{b}\end{pmatrix}.$

Thus, $\widetilde{r}_{k+1}^{a}=\widetilde{N}\widetilde{M}^{-1}\widetilde{r}_{k}^{a}+(I-\widetilde{N}\widetilde{M}^{-1})(\widetilde{r}_{k}^{a}-\widetilde{M}\widetilde{\Psi}^{a}(r_{k})).$ Using (24), (31), (32), (33) and Lemma 7, we know that $\widetilde{r}_{k}^{a}$ is the $k$ -th residual of Algorithm 2 applied to the saddle-point problem $\widetilde{A}\widetilde{z}=\widetilde{\ell}$ .

Note that $x\in\mathop{\mathrm{Null}}(\Sigma^{T}\!\,)$ if and only if $Ux\in\mathop{\mathrm{Null}}(B^{T}\!\,)$ , so $U^{T}\!GU$ is positive definite on $\mathop{\mathrm{Null}}(\Sigma^{T}\!\,)$ . With (13), (29), and the SVD of $B$ , we have

	$\displaystyle\inf\limits_{x\notin\mathop{\mathrm{Null}}(\Sigma^{T}\!)}\dfrac{x^{T}U^{T}HUx}{x^{T}\Sigma Q_{1}^{-1}\Sigma^{T}x}\xlongequal{\hat{x}=Ux}\inf\limits_{\hat{x}\notin\mathop{\mathrm{Null}}(B^{T}\!)}\dfrac{\hat{x}^{T}H\hat{x}}{\hat{x}^{T}U\Sigma Q_{1}^{-1}\Sigma^{T}U^{T}\hat{x}}$
	$\displaystyle=\inf\limits_{\hat{x}\notin\mathop{\mathrm{Null}}(B^{T}\!)}\dfrac{\hat{x}^{T}H\hat{x}}{\hat{x}^{T}U\begin{pmatrix}\Sigma&0\end{pmatrix}V^{T}Q^{-1}V\begin{pmatrix}\Sigma^{T}\\ 0\end{pmatrix}U^{T}\hat{x}}=\inf\limits_{\hat{x}\notin\mathop{\mathrm{Null}}(B^{T}\!)}\dfrac{\hat{x}^{T}H\hat{x}}{\hat{x}^{T}BQ^{-1}B^{T}\hat{x}}=\eta.$

Since $\omega(U^{T}GU+U^{T}G^{T}U)+\Sigma Q_{1}^{-1}\Sigma^{T}$ is similar to $2\omega H+U\Sigma Q_{1}^{-1}\Sigma^{T}U^{T}=2\omega H+BQ^{-1}B^{T}$ and $\Sigma$ has full column rank, Lemma 1 and Theorem 3.2 imply $\|\widetilde{N}\widetilde{M}^{-1}\|_{\widetilde{P}_{\beta}}<1$ and hence $\widetilde{r}_{k}^{a}$ converges to zero as $k\rightarrow\infty$ . Combining this with $\widetilde{r}_{k}^{b}=0$ completes the proof.

Similar to Remarks 4 and 5, when $B$ is rank-deficient, for any given $0<\omega<1/(-2\eta)_{+}$ , Algorithm 2 is still convergent for sufficiently small $\delta\geq 0$ . Furthermore, when $G$ is positive semidefinite, Algorithm 2 is convergent for any $\omega>0$ and sufficiently small $\delta\geq 0$ .

3.3 Augmented Lagrangian BB algorithm

Gradient-type iterative methods for the unconstrained optimization problem $\min\limits_{z\in\mathds{R}^{\hat{n}}}\hat{f}(z)$ have the form

(35)

z_{k+1}=z_{k}-\alpha_{k}g_{k},

where $\hat{f}:\mathds{R}^{\hat{n}}\rightarrow\mathds{R}$ is a sufficiently smooth function, $g_{k}=\nabla\hat{f}(z_{k})$ is the gradient, and $\alpha_{k}>0$ is a stepsize. Methods of this type differ in their stepsize rules. In 1988, Barzilai and Borwein [5] proposed two choices of $\alpha_{k}$ , usually referred to as the BB method:

(36)

\alpha_{k}^{\rm BB1}=\smash[t]{\frac{s_{k-1}^{T}s_{k-1}}{s_{k-1}^{T}d_{k-1}}% \quad\textrm{and}\quad\alpha_{k}^{\rm BB2}=\frac{s_{k-1}^{T}d_{k-1}}{d_{k-1}^{% T}d_{k-1}},}

where $s_{k-1}=z_{k}-z_{k-1}$ and $d_{k-1}=g_{k}-g_{k-1}$ . The rationale behind these choices is related to viewing the gradient-type methods as quasi-Newton methods, where $\alpha_{k}$ in (35) is replaced by $D_{k}=\alpha_{k}I$ . This matrix serves as an approximate inverse Hessian. Following the quasi-Newton approach, the stepsize is calculated by forcing either $D_{k}^{-1}$ (BB1 method) or $D_{k}$ (BB2 method) to satisfy the secant equation in the least squares sense. The corresponding problems are $\min\limits_{D=\alpha I}~{}\|D^{-1}s_{k-1}-d_{k-1}\|$ and $\min\limits_{D=\alpha I}~{}\|s_{k-1}-Dd_{k-1}\|$ .

When $\hat{f}(z)$ is a convex quadratic, i.e., $\hat{f}(z)=\tfrac{1}{2}z^{T}\hat{A}z-\hat{\ell}^{T}z$ with $\hat{A}$ SPD, minimizing $\hat{f}$ is equivalent to solving $\hat{A}z=\hat{\ell}$ . In this case, $g_{k}=\hat{A}z_{k}-\hat{\ell}=r_{k}$ ,

(37)

s_{k-1}=-\alpha_{k-1}r_{k-1}\quad\mbox{and}\quad d_{k-1}=r_{k}-r_{k-1}=\hat{A}% s_{k-1}=-\alpha_{k-1}\hat{A}r_{k-1}.

Then the two BB stepsizes (36) can be reformulated as

\alpha_{k}^{\rm BB1}=\frac{r_{k-1}^{T}r_{k-1}}{r_{k-1}^{T}\hat{A}r_{k-1}}\quad% \textrm{and}\quad\alpha_{k}^{\rm BB2}=\frac{r_{k-1}^{T}\hat{A}r_{k-1}}{r_{k-1}% ^{T}\hat{A}^{T}\hat{A}r_{k-1}}.
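The equivalence between (36) and the residual-based formulas can be verified on a small example. This is a minimal sketch, assuming a randomly generated SPD test matrix; the helper name `bb_stepsizes` and the test data are hypothetical.

```python
import numpy as np

def bb_stepsizes(s, d):
    """BB1 and BB2 stepsizes of (36) from s = z_k - z_{k-1} and d = g_k - g_{k-1}."""
    return (s @ s) / (s @ d), (s @ d) / (d @ d)

rng = np.random.default_rng(1)
R = rng.standard_normal((5, 5))
A_hat = R @ R.T + 5.0 * np.eye(5)        # SPD test matrix (hypothetical data)
ell_hat = rng.standard_normal(5)

z_prev = rng.standard_normal(5)
r_prev = A_hat @ z_prev - ell_hat        # gradient equals residual for the quadratic
alpha_prev = 0.1                         # any nonzero previous stepsize
z = z_prev - alpha_prev * r_prev
r = A_hat @ z - ell_hat

# Stepsizes from the secant pairs (s, d) of (36) ...
bb1, bb2 = bb_stepsizes(z - z_prev, r - r_prev)
# ... and from the previous residual only.
Ar = A_hat @ r_prev
bb1_res = (r_prev @ r_prev) / (r_prev @ Ar)
bb2_res = (r_prev @ Ar) / (Ar @ Ar)
print(bb1, bb1_res, bb2, bb2_res)
```

The previous stepsize cancels in both quotients, which is why the residual-based formulas do not depend on `alpha_prev`.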

Global convergence of the BB method for minimizing quadratic forms was established by Raydan [39], and its R-linear convergence rate was established by Dai and Liao [17]. For general strongly convex functions with Lipschitz gradient, the local convergence of the BB method with R-linear rate was rigorously proved by Dai et al. [19]. Extensive numerical experiments show that the BB method can solve unconstrained optimization problems efficiently and is considerably superior to the steepest descent method [12, 40]. A variety of modifications and extensions of the BB method have been developed for optimization.

Several researchers have used the BB method to solve UPD linear systems. Dai et al. [18] gave an analysis of the BB1 method for two-by-two unsymmetric linear systems. Under mild conditions, they showed that the convergence rate of the BB1 method is $Q$ -superlinear if the matrix has a double eigenvalue, but only $R$ -superlinear if the matrix has two different real eigenvalues. We find that the BB1 method can diverge when applied to UPD linear systems. Indeed, consider

\hat{A}z:=\begin{pmatrix}1&2\\ -2&1\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}=\begin{pmatrix}0\\ 0\end{pmatrix}.

Note that $\hat{A}$ has two complex eigenvalues $1\pm 2{\rm i}$ . The conditions in [18] do not hold. It follows from (36) and (37) that $\alpha_{k}^{\rm BB1}=(s_{k-1}^{T}s_{k-1})/(s_{k-1}^{T}\hat{A}s_{k-1})=1.$ Then, one BB1 iteration gives

z_{k+1}=z_{k}-r_{k}=\smash[t]{\begin{pmatrix}x_{k}\\ y_{k}\end{pmatrix}-\begin{pmatrix}x_{k}+2y_{k}\\ -2x_{k}+y_{k}\end{pmatrix}=\begin{pmatrix}-2y_{k}\\ 2x_{k}\end{pmatrix}}.

This leads to $\|z_{k+1}\|^{2}=4\|z_{k}\|^{2}$ , which means that the sequence $\{z_{k}\}$ of the BB1 iterations diverges for any initial $z_{0}\neq 0$ .
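The divergence in this example is easy to reproduce. The following is a minimal sketch, assuming an arbitrary nonzero starting pair; it applies the BB1 stepsize from (36) to the $2\times 2$ system above and records the growth of $\|z_{k}\|$.

```python
import numpy as np

A_hat = np.array([[1.0, 2.0], [-2.0, 1.0]])    # eigenvalues 1 ± 2i
z_prev = np.array([1.0, 0.0])
z = np.array([0.5, 0.25])                      # arbitrary nonzero starting pair
norms = [np.linalg.norm(z)]
for _ in range(10):
    s = z - z_prev
    d = A_hat @ s                              # d_{k-1} = r_k - r_{k-1} = A s_{k-1}
    alpha = (s @ s) / (s @ d)                  # BB1 stepsize, equal to 1 for this matrix
    z_prev, z = z, z - alpha * (A_hat @ z)     # right-hand side is zero, so r_k = A z_k
    norms.append(np.linalg.norm(z))
ratios = [norms[i + 1] / norms[i] for i in range(10)]
print(ratios)
```

Each step maps $(x_{k},y_{k})$ to $(-2y_{k},2x_{k})$ , so the Euclidean norm of the iterate doubles at every iteration.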

For linear systems $\hat{A}z=\hat{\ell}$ with $\hat{A}$ unsymmetric, the minimal gradient method [31, 32, 42] uses the stepsize $\alpha_{k}^{\rm MG}=(r_{k}^{T}\hat{A}r_{k})/(r_{k}^{T}\hat{A}^{T}\hat{A}r_{k})$ , which gives the smallest residual norm along $r_{k}$ in each iteration, namely,

\alpha_{k}^{\rm MG}=\arg\min_{\alpha>0}\|\hat{A}(z_{k}-\alpha r_{k})-\hat{\ell}\|=\arg\min_{\alpha>0}\|r_{k}-\alpha\hat{A}r_{k}\|.

Therefore, the minimal gradient method is convergent for solving UPD linear systems. Note that the difference between $\alpha_{k}^{\rm MG}$ and $\alpha_{k}^{\rm BB2}$ is that one uses $r_{k}$ and the other uses $r_{k-1}$ . The BB2 method can be regarded as the minimal gradient method with delay [24]. Gradient methods with delay significantly improve the performance of gradient methods, see [51] and references therein. Hence, we use the BB2 method to derive the new iterates $x_{k+1}$ and $y_{k+1}$ in Algorithm 2 when $G$ is positive definite. Then the augmented Lagrangian BB algorithm for solving (1) is as in Algorithm 3.

Algorithm 3 Augmented Lagrangian BB algorithm, SPALBB

1: Given $z_{-1}=(x_{-1},\,y_{-1}),~{}z_{0}=(x_{0},\,y_{0})\in\mathds{R}^{n+m}$ , $\omega>0$ , $0\leq\delta<1$ , and SPD $Q$ , compute $r_{0}=Mz_{0}-(f,\,\omega Qy_{0}+g)$ and set $k=0$

2: while a stopping condition is not satisfied do

3: Compute $\ell_{k}=(f,\,\omega Qy_{k}+g)$

4: while $\|r_{j}-Mz_{j}\|_{*}>\delta\|r_{j}\|_{*}$ do

5: Compute $s_{j}=z_{j}-z_{j-1}$

6: Compute $d_{j}=Ms_{j}$

7: Compute $r_{j}=Mz_{j}-\ell_{k}$

8: Compute $\alpha_{j}=\frac{s_{j}^{T}d_{j}}{\|d_{j}\|^{2}}$

9: Compute $z_{j+1}=z_{j}-\alpha_{j}r_{j}$

10: end while

11: Increment $k$ by $1$

12: end while
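The overall structure of Algorithm 3 can be sketched in a few lines. This is only an illustrative sketch under several assumptions that are not taken from the source: $Q=I$ , dense NumPy arrays, the 2-norm in the inner test, a zero initial guess, an inner loop that stops once the subproblem residual falls below $\delta$ times the current outer residual, and a minimal-gradient step as the first inner step (no previous iterate is available). The function name `spalbb` and its parameters are hypothetical.

```python
import numpy as np

def spalbb(G, B, f, g, omega=1.0, delta=0.1, tol=1e-10,
           max_outer=500, max_inner=1000):
    """Sketch of an inexact augmented Lagrangian iteration with BB2 inner steps."""
    n, m = B.shape
    Q = np.eye(m)                                   # assumption: Q = I
    A = np.block([[G, B], [-B.T, np.zeros((m, m))]])
    M = np.block([[G, B], [-B.T, omega * Q]])
    ell = np.concatenate([f, g])
    z = np.zeros(n + m)
    outer = 0
    for outer in range(max_outer):
        r_outer = A @ z - ell
        if np.linalg.norm(r_outer) <= tol:
            break
        # Right-hand side of the k-th subproblem M z = ell_k.
        ell_k = np.concatenate([f, omega * (Q @ z[n:]) + g])
        target = delta * np.linalg.norm(r_outer)
        r = M @ z - ell_k          # equals r_outer at the start of the inner loop
        r_prev = r                 # first inner step is a minimal-gradient step
        for _ in range(max_inner):
            if np.linalg.norm(r) <= target:
                break
            Mr = M @ r_prev
            alpha = (r_prev @ Mr) / (Mr @ Mr)       # BB2 stepsize from the previous residual
            z = z - alpha * r
            r_prev, r = r, M @ z - ell_k
    return z, outer

# Tiny hypothetical system: x + y = 1, -x = -1, with solution (1, 0).
G = np.array([[1.0]])
B = np.array([[1.0]])
z, outer_iters = spalbb(G, B, f=np.array([1.0]), g=np.array([-1.0]))
print(z, outer_iters)
```

For this $2\times 2$ example with $\omega=1$ , the matrix $M$ is the identity plus a skew-symmetric part, so the BB2 stepsize is always $1/2$ and the inner residual shrinks by a factor $1/\sqrt{2}$ per step.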

In the following, we establish the convergence of Algorithm 3. First, under some assumptions, we show that the BB2 method is convergent for solving a general UPD linear system $\hat{A}z=\hat{\ell}$ , where the iterative scheme is $z_{k+1}=z_{k}-\alpha_{k}^{\rm BB2}r_{k}$ and $r_{k}=\hat{A}z_{k}-\hat{\ell}.$ For convenience, we introduce

	$\displaystyle\hat{A}_{h}=\tfrac{1}{2}(\hat{A}+\hat{A}^{T}),\qquad W=\hat{A}_{h}^{-1}\hat{A}^{T}\!\hat{A},$
(38)		$\displaystyle\theta_{j}=\max\left\{1-\frac{2u_{j}}{\lambda_{\min}(W)}+\frac{|\lambda_{j}|^{2}}{\lambda_{\min}(W)^{2}},\,1-\frac{2u_{j}}{\lambda_{\max}(W)}+\frac{|\lambda_{j}|^{2}}{\lambda_{\max}(W)^{2}}\right\},$

where $\lambda_{j}=u_{j}+{\rm i}v_{j}~{}(1\leq j\leq\hat{n})$ are the eigenvalues of $\hat{A}$ . When $\hat{A}$ is UPD, we know that $\hat{A}_{h}$ is SPD and $u_{j}>0~{}(1\leq j\leq\hat{n})$ . By direct calculation, $\theta_{j}<1$ holds for all $1\leq j\leq\hat{n}$ if and only if $1-\frac{2u_{j}}{\lambda_{\min}(W)}+\frac{|\lambda_{j}|^{2}}{\lambda_{\min}(W)^{2}}<1$ and $1-\frac{2u_{j}}{\lambda_{\max}(W)}+\frac{|\lambda_{j}|^{2}}{\lambda_{\max}(W)^{2}}<1$ for all $j$ , which is equivalent to

(39)

\smash[t]{\max_{1\leq j\leq\hat{n}}\frac{|\lambda_{j}|^{2}}{u_{j}}<2\lambda_{\min}(W).}
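Condition (39) can be evaluated directly for a given matrix. The following is a minimal sketch, assuming a small dense matrix for which a full eigenvalue computation is affordable; the function name `bb2_condition_39` and the two test matrices are hypothetical.

```python
import numpy as np

def bb2_condition_39(A_hat):
    """Return (lhs, rhs) of (39): max_j |lambda_j|^2 / u_j and 2 * lambda_min(W)."""
    lam = np.linalg.eigvals(A_hat)
    lhs = np.max(np.abs(lam) ** 2 / lam.real)
    A_h = 0.5 * (A_hat + A_hat.T)
    L = np.linalg.cholesky(A_h)                  # raises LinAlgError unless A_h is SPD
    X = np.linalg.solve(L, A_hat.T @ A_hat)      # L^{-1} A^T A
    S = np.linalg.solve(L, X.T).T                # L^{-1} A^T A L^{-T}, similar to W
    rhs = 2.0 * np.linalg.eigvalsh(0.5 * (S + S.T))[0]
    return lhs, rhs

# Identity plus a skew-symmetric part: (39) holds (2 < 4).
lhs1, rhs1 = bb2_condition_39(np.array([[1.0, 1.0], [-1.0, 1.0]]))
# SPD matrix with condition number 3: (39) fails (3 > 2).
lhs2, rhs2 = bb2_condition_39(np.diag([1.0, 3.0]))
print(lhs1 < rhs1, lhs2 < rhs2)
```

The sketch works with the symmetric matrix $\hat{A}_{h}^{-1/2}$-free form $L^{-1}\hat{A}^{T}\hat{A}L^{-T}$ , where $\hat{A}_{h}=LL^{T}$ , because it has the same eigenvalues as $W$ and allows a symmetric eigenvalue solver.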

We are now ready to study the convergence of the BB2 method.

Theorem 3.8.

Suppose $\hat{A}\in\mathds{R}^{\hat{n}\times\hat{n}}$ is UPD. If its $\hat{n}$ eigenvalues $\lambda_{j}=u_{j}+{\rm i}v_{j}~{}(1\leq j\leq\hat{n})$ satisfy (39), then the sequence $\{z_{k}\}$ produced by the BB2 method converges to the unique solution of $\hat{A}z=\hat{\ell}$ .

Proof 3.9.

It is well known that the BB method is invariant under unitary transformation of the variables [17]. By the Schur decomposition, we can assume without loss of generality that $\hat{A}$ is of the form

\begin{pmatrix}\lambda_{1}&a_{12}&a_{13}&\cdots&a_{1\hat{n}}\\ 0&\lambda_{2}&a_{23}&\cdots&a_{2\hat{n}}\\ \vdots&\ddots&\ddots&\ddots&\vdots\\ 0&\cdots&0&\lambda_{\hat{n}-1}&a_{\hat{n}-1,\hat{n}}\\ 0&\cdots&\cdots&0&\lambda_{\hat{n}}\end{pmatrix},

where $\lambda_{j}=u_{j}+{\rm i}v_{j}\in\mathbb{C},~{}j=1,2,\ldots,\hat{n}$ . Because $r_{k+1}=\hat{A}z_{k+1}-\hat{\ell}=r_{k}-\alpha_{k}^{\rm BB2}\hat{A}r_{k}$ ,

(40)

\left\{\begin{array}[]{l}r_{k+1}^{(\hat{n})}=r_{k}^{(\hat{n})}-\alpha_{k}^{\rm BB% 2}\lambda_{\hat{n}}r_{k}^{(\hat{n})},\\[3.0pt] r_{k+1}^{(j)}=r_{k}^{(j)}-\alpha_{k}^{\rm BB2}\lambda_{j}r_{k}^{(j)}-\alpha_{k% }^{\rm BB2}\sum\limits_{t=j+1}^{\hat{n}}a_{j,t}r_{k}^{(t)},~{}j=\hat{n}-1,% \ldots,1,\end{array}\right.

where $r_{k}^{(j)}$ is the $j$ -th component of $r_{k}$ . Note that $\hat{A}_{h}=\tfrac{1}{2}(\hat{A}+\hat{A}^{T})$ and $r_{k-1}^{T}\hat{A}r_{k-1}=r_{k-1}^{T}\hat{A}^{T}r_{k-1}$ , giving $r_{k-1}^{T}\hat{A}r_{k-1}=\tfrac{1}{2}\left(r_{k-1}^{T}\hat{A}r_{k-1}+r_{k-1}^% {T}\hat{A}^{T}r_{k-1}\right)=r_{k-1}^{T}\hat{A}_{h}r_{k-1}.$ Since $\hat{A}_{h}$ is SPD, it leads to

\alpha_{k}^{\rm BB2}=\frac{r_{k-1}^{T}\hat{A}r_{k-1}}{r_{k-1}^{T}\hat{A}^{T}% \hat{A}r_{k-1}}=\frac{r_{k-1}^{T}\hat{A}_{h}r_{k-1}}{r_{k-1}^{T}\hat{A}^{T}% \hat{A}r_{k-1}}\xlongequal{\hat{r}=\hat{A}_{h}^{\frac{1}{2}}r_{k-1}}\frac{\hat% {r}^{T}\hat{r}}{\hat{r}^{T}\hat{A}_{h}^{-\frac{1}{2}}\hat{A}^{T}\hat{A}\hat{A}% _{h}^{-\frac{1}{2}}\hat{r}}.

By the Courant-Fischer min-max theorem and the fact that $\hat{A}_{h}^{-\frac{1}{2}}\hat{A}^{T}\hat{A}\hat{A}_{h}^{-\frac{1}{2}}$ is similar to $W$ , we have

(41)

\smash[t]{\frac{1}{\lambda_{\max}(W)}\leq\alpha_{k}^{\rm BB2}\leq\frac{1}{% \lambda_{\min}(W)}.}

It follows from $\lambda_{j}=u_{j}+{\rm i}v_{j}$ , (38), (41), and the behavior of the quadratic function for $\alpha_{k}^{\rm BB2}$ that, for any $j=1,\ldots,\hat{n}$ ,

	$\displaystyle\left\|1-\alpha_{k}^{\rm BB2}\lambda_{j}\right\|^{2}=\left(1-\alpha% _{k}^{\rm BB2}u_{j}\right)^{2}+\left(\alpha_{k}^{\rm BB2}v_{j}\right)^{2}=1-2% \alpha_{k}^{\rm BB2}u_{j}+\left(\alpha_{k}^{\rm BB2}\right)^{2}\left\|\lambda_{% j}\right\|^{2}$
(42)		$\displaystyle\leq\max\left\{1-\tfrac{2u_{j}}{\lambda_{\min}(W)}+\tfrac{\|% \lambda_{j}\|^{2}}{\lambda_{\min}(W)^{2}},\,1-\tfrac{2u_{j}}{\lambda_{\max}(W)}% +\tfrac{\|\lambda_{j}\|^{2}}{\lambda_{\max}(W)^{2}}\right\}=\theta_{j}.$

Combining with (39) and (40) gives

\left|r_{k+1}^{(\hat{n})}\right|=\left|1-\alpha_{k}^{\rm BB2}\lambda_{\hat{n}}% \right|\,\left|r_{k}^{(\hat{n})}\right|\leq\sqrt{\theta_{\hat{n}}}\left|r_{k}^% {(\hat{n})}\right|<\left|r_{k}^{(\hat{n})}\right|.

This implies that $r_{k}^{(\hat{n})}\rightarrow 0$ as $k\rightarrow\infty$ . For $j=\hat{n}-1,\ldots,1$ , by (40) and (42), $\left|r_{k+1}^{(j)}\right|\leq\left|1-\alpha_{k}^{\rm BB2}\lambda_{j}\right|\,\left|r_{k}^{(j)}\right|+\alpha_{k}^{\rm BB2}\left|\sum\limits_{t=j+1}^{\hat{n}}a_{j,t}r_{k}^{(t)}\right|\leq\sqrt{\theta_{j}}\left|r_{k}^{(j)}\right|+\alpha_{k}^{\rm BB2}\left|\sum\limits_{t=j+1}^{\hat{n}}a_{j,t}r_{k}^{(t)}\right|$ . Since $\theta_{j}<1$ , $\alpha_{k}^{\rm BB2}$ is bounded by (41), and $\lim\limits_{k\rightarrow\infty}r_{k}^{(\hat{n})}=0$ , it follows by induction on $j=\hat{n}-1,\ldots,1$ that $r_{k}^{(j)}\rightarrow 0$ as $k\rightarrow\infty$ . Hence $r_{k}\rightarrow 0$ and $\{z_{k}\}$ converges to the unique solution of $\hat{A}z=\hat{\ell}$ .

Remark 8.

As $\hat{A}$ is positive definite, so is $\hat{A}^{-1}$ . Let $\tilde{\lambda}_{j}=\tilde{u}_{j}+{\rm i}\tilde{v}_{j}~{}(1\leq j\leq\hat{n})$ be the eigenvalues of $\hat{A}^{-1}$ . Clearly, $\frac{1}{\tilde{\lambda}_{j}}=\frac{\tilde{u}_{j}-{\rm i}\tilde{v}_{j}}{\tilde% {u}_{j}^{2}+\tilde{v}_{j}^{2}}$ is an eigenvalue of $\hat{A}$ . Then $\max\limits_{1\leq j\leq\hat{n}}\frac{|\lambda_{j}|^{2}}{u_{j}}=\frac{1}{\min_% {1\leq j\leq\hat{n}}\tilde{u}_{j}}$ . This, along with

	$\displaystyle\lambda_{\min}(W)=2\lambda_{\min}\left((\hat{A}+\hat{A}^{T})^{-1}% \hat{A}^{T}\hat{A}\right)=2\lambda_{\min}\left(\hat{A}(\hat{A}+\hat{A}^{T})^{-% 1}\hat{A}^{T}\right)$
	$\displaystyle=2\lambda_{\min}\left((\hat{A}^{-1}+\hat{A}^{-T})^{-1}\right)=% \frac{2}{\lambda_{\max}(\hat{A}^{-1}+\hat{A}^{-T})},$

shows that condition (39) is equivalent to $\lambda_{\max}(\hat{A}^{-1}+\hat{A}^{-T})<4\min\limits_{1\leq j\leq\hat{n}}\tilde{u}_{j}.$ Note that $\min\limits_{1\leq j\leq\hat{n}}\tilde{u}_{j}\geq\tfrac{1}{2}\lambda_{\min}(\hat{A}^{-1}+\hat{A}^{-T})$ ,¹ so a sufficient condition for the above inequality is $\lambda_{\max}(\hat{A}^{-1}+\hat{A}^{-T})<2\lambda_{\min}(\hat{A}^{-1}+\hat{A}^{-T}).$ When $\hat{A}$ is SPD, it reduces to $\lambda_{\max}(\hat{A})<2\lambda_{\min}(\hat{A})$ , which is the same as the convergence condition of the preconditioned BB method for SPD linear systems [34]. This means that our condition (39) is weaker than that of [34].

¹ For any $j=1,\ldots,\hat{n}$ , let $\tilde{x}_{j}$ be the eigenvector of $\hat{A}^{-1}$ corresponding to $\tilde{\lambda}_{j}$ . Then we have $\tilde{\lambda}_{j}=\tfrac{\tilde{x}_{j}^{*}\hat{A}^{-1}\tilde{x}_{j}}{\tilde{x}_{j}^{*}\tilde{x}_{j}}$ . Since $\hat{A}^{-1}+\hat{A}^{-T}$ is SPD, it gives $\tilde{u}_{j}=\frac{1}{2}\left(\tilde{\lambda}_{j}+\tilde{\lambda}_{j}^{*}\right)=\frac{1}{2}\left(\frac{\tilde{x}_{j}^{*}\hat{A}^{-1}\tilde{x}_{j}}{\tilde{x}_{j}^{*}\tilde{x}_{j}}+\frac{\tilde{x}_{j}^{*}\hat{A}^{-T}\tilde{x}_{j}}{\tilde{x}_{j}^{*}\tilde{x}_{j}}\right)=\frac{\tilde{x}_{j}^{*}\left(\hat{A}^{-1}+\hat{A}^{-T}\right)\tilde{x}_{j}}{2\tilde{x}_{j}^{*}\tilde{x}_{j}}\geq\frac{1}{2}\lambda_{\min}(\hat{A}^{-1}+\hat{A}^{-T}).$
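The two facts used in this remark, namely $\lambda_{\min}(W)=2/\lambda_{\max}(\hat{A}^{-1}+\hat{A}^{-T})$ and the lower bound on $\min_{j}\tilde{u}_{j}$ , can be checked numerically. This is a minimal sketch, assuming a randomly generated UPD test matrix (an SPD part plus a skew-symmetric part); all data and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
R = rng.standard_normal((4, 4))
K = rng.standard_normal((4, 4))
A_hat = R @ R.T + 4.0 * np.eye(4) + (K - K.T)      # SPD part plus skew part, hence UPD

A_h = 0.5 * (A_hat + A_hat.T)
W = np.linalg.solve(A_h, A_hat.T @ A_hat)          # W = A_h^{-1} A^T A
lam_min_W = np.min(np.linalg.eigvals(W).real)      # eigenvalues of W are real and positive

A_inv = np.linalg.inv(A_hat)
sym_inv = A_inv + A_inv.T                          # symmetric part of A^{-1}, times 2
eig_sym = np.linalg.eigvalsh(sym_inv)              # ascending order
u_tilde_min = np.min(np.linalg.eigvals(A_inv).real)

print(lam_min_W, 2.0 / eig_sym[-1])                # the two values should agree
print(u_tilde_min, 0.5 * eig_sym[0])               # first value is at least the second
```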

When $G$ is UPD, so is $M$ in (12). Combining Theorem 3.8 with the convergence conditions of Algorithm 2 gives the following result.

Theorem 3.10.

Suppose $G\in\mathds{R}^{n\times n}$ is UPD. For any SPD $Q\in\mathds{R}^{m\times m}$ and any $\omega>0$ , let $M$ be defined by (12) and $\lambda_{j}=u_{j}+{\rm i}v_{j}~{}(1\leq j\leq n+m)$ be its $n+m$ eigenvalues. If

(43)

\smash[t]{\max_{1\leq j\leq n+m}\frac{|\lambda_{j}|^{2}}{u_{j}}<\frac{4}{% \lambda_{\max}\left(M^{-1}+M^{-T}\right)},}

then for sufficiently small $\delta$ , $\{x_{k},y_{k}\}$ produced by Algorithm 3 converges to a solution of (1).

Remark 9.

The residuals generated by the BB method, even for SPD linear systems, are strongly nonmonotonic, which poses a challenge for the convergence analysis [40, 17]. This is also why the convergence of Algorithm 3 is intricate. Our convergence analysis of Algorithm 3, which ensures a decrease of $\|r_{k}\|_{*}$ , is quite stringent and relies on the rather strong assumption (43). The nonmonotonic behavior of $\|r_{k}\|$ in Figures 1 and 2 also indicates that the choices of $\omega$ in our numerical experiments do not meet (43). Thus, there is significant room for improving the convergence analysis of the BB method for UPD linear systems and of Algorithm 3.

Remark 10.

Although assumption (43) is strong, it is still possible to choose $\omega$ to satisfy it. Indeed, consider the special case $n=m=1$ and $M=\begin{pmatrix}a&b\\ -b&\omega\end{pmatrix}$ with $a>0$ and $b\in\mathds{R}$ . Since $M^{-1}=\frac{1}{a\omega+b^{2}}\begin{pmatrix}\omega&-b\\ b&a\end{pmatrix},$ we have

\lambda_{\min}\left(M^{-1}+M^{-T}\right)=\frac{2\min\{a,\omega\}}{a\omega+b^{2% }}\quad\mbox{and}\quad\lambda_{\max}\left(M^{-1}+M^{-T}\right)=\frac{2\max\{a,% \omega\}}{a\omega+b^{2}}.

It follows from Remark 8 that (43) holds whenever $\lambda_{\max}\left(M^{-1}+M^{-T}\right)<2\lambda_{\min}\left(M^{-1}+M^{-T}\right)$ , namely, $\max\{a,\omega\}<2\min\{a,\omega\}$ . This implies that (43) holds when $\omega\in\left(a/2,\,2a\right)$ .
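The closed-form eigenvalues in this $2\times 2$ case can be confirmed numerically. The following is a minimal sketch with hypothetical values $a=1$ , $b=3$ and $\omega=1.5$ ; the function name `sym_inverse_extremes` is also an assumption made for illustration.

```python
import numpy as np

def sym_inverse_extremes(a, b, omega):
    """Smallest and largest eigenvalue of M^{-1} + M^{-T} for M = [[a, b], [-b, omega]]."""
    M = np.array([[a, b], [-b, omega]])
    M_inv = np.linalg.inv(M)
    eigs = np.linalg.eigvalsh(M_inv + M_inv.T)     # ascending order
    return eigs[0], eigs[-1]

a, b, omega = 1.0, 3.0, 1.5
lo, hi = sym_inverse_extremes(a, b, omega)
denom = a * omega + b**2
print(lo, 2.0 * min(a, omega) / denom)             # smallest eigenvalue vs. formula
print(hi, 2.0 * max(a, omega) / denom)             # largest eigenvalue vs. formula
print(hi < 2.0 * lo)                               # sufficient condition from Remark 8
```

Here $\omega=1.5$ lies strictly between $a/2$ and $2a$ , so the ratio of the largest to the smallest eigenvalue is $1.5$ , below the bound of $2$ .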

For the general case, we can apply preconditioning techniques to (7) such that $M$ is well-conditioned. Preconditioning techniques for $M$ have been widely studied; see [8] and the references therein.

4 Numerical experiments

We present the results of numerical tests to examine the feasibility and effectiveness of SPALBB. All experiments were run using MATLAB R2022b on a PC with an Intel(R) Core(TM) i7-1260P CPU @ 2.10GHz and 32GB of RAM. The initial guess is taken to be the zero vector, and the algorithms are terminated when the number of iterations exceeds $10^{5}$ or ${\rm Res}:=\|r_{k}\|/\|r_{0}\|\leq 10^{-6}$ . We report the number of outer iterations, the total number of iterations (for SPALBB, it includes the number of inner iterations), the CPU time in seconds, and the final value of the relative residual, denoted by “Oiter”, “Titer”, “CPU” and “Res”.

In SPALBB, we set $Q=I$ , used the stopping criterion (23) for the inner iterations with $\delta=0.5$ and the $2$ -norm, and tried $\omega=10^{-i}$ with $i=1$ , $2$ , $3$ , $4$ , $5$ , denoted SPALBB( $\omega$ ). We compared our method with BICGSTAB and restarted GMRES. We tested two restart values: $20$ and $50$ , denoted GMRES(20) and GMRES(50).

Example 1.

The steady-state Navier-Stokes equations are

(44)

-\nu\nabla^{2}{\bm{u}}+{\bm{u}}\cdot\nabla{\bm{u}}+\nabla p={\bm{h}}\quad{\rm and% }\quad{\rm div}\,{\bm{u}}=0,\quad{\bm{z}}=(x,y)\in\Omega,

where $\Omega\subseteq\mathds{R}^{2}$ is a bounded domain, the vector field ${\bm{u}}$ represents the velocity in $\Omega$ , $p$ represents pressure, and $\nu>0$ is the kinematic viscosity. The test problem is a model of the flow in a square cavity $\Omega=(-1,1)\times(-1,1)$ with the lid moving from left to right. A Dirichlet no-flow (zero velocity) condition is applied on the side and bottom boundaries, and the nonzero horizontal velocity on the lid is $\{y=1;-1\leq x\leq 1\mid u_{x}=1-x^{4}\}$ .

Finite element discretization of (44) results in system (1) with $G=\nu G_{1}+G_{2}$ . Here $G_{1}$ is SPD and consists of a set of uncoupled discrete Laplace operators, corresponding to diffusion, and $G_{2}$ is a discrete convection operator and is unsymmetric. Evidently, $G$ becomes more unsymmetric as $\nu$ decreases. Various methods have been developed for solving (44). However, the convergence rates of some approaches deteriorate as $\nu$ decreases [22]. Thus, for (44), we test three small values of the viscosity $\nu$ : $0.005,\,0.01,\,0.05$ .

Example 1 is a classical test problem used in fluid dynamics, known as driven-cavity flow. We discretize (44) using Picard iterations and the Q2–Q1 mixed finite element approximation [23] on uniform grids with grid parameter $h=2^{-6}$ , $2^{-7}$ , $2^{-8}$ , $2^{-9}$ . This discretization can be carried out with the IFISS software package [23, 46]. In this example, $G$ is UPD and $B$ is rank-deficient with rank $m-1$ . Thus, the matrix in (1) is singular. The numerical results are reported in Tables 1, 2 and 3 and in the left-hand plots of Figure 1, where “-” means that the method failed to solve the problem and bold face indicates that the method performs best in terms of CPU time. It can be seen from Tables 1, 2 and 3 that the CPU time of all tested methods increases as $\nu$ decreases, and BICGSTAB and SPALBB(1) fail when $h=2^{-9}$ for $\nu=0.005$ . The CPU time of SPALBB with $\omega\leq 10^{-2}$ is about half that of GMRES, and in the best cases it is only a third of that of GMRES for $h=2^{-9}$ . The number of outer iterations of SPALBB decreases as $\omega$ decreases, which is consistent with Remark 1. Nevertheless, the total number of iterations is not the least for $\omega=10^{-5}$ .

Table 1: Numerical results for Example 1 with $\nu=0.005$

$h(n,m)$	$2^{-6}$ $(n=8,450,m=1,089)$				$2^{-7}$ $(n=33,282,m=4,225)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		6665.5	2.28	$9.83$ E $-07$		14211.5	22.75	$8.95$ E $-07$
GMRES(20)		4221	2.55	$9.99$ E $-07$		7735	16.86	$9.99$ E $-07$
GMRES(50)		4040	4.64	$9.99$ E $-07$		7162	27.94	$1.00$ E $-06$
SPALBB(1)	1594	6765	1.23	$9.97$ E $-07$	22057	50212	42.16	$9.93$ E $-07$
SPALBB(2)	51	16705	2.83	$9.98$ E $-07$	243	18537	15.42	$9.83$ E $-07$
SPALBB(3)	17	18762	3.18	$1.00$ E $-06$	27	24084	20.45	$7.70$ E $-07$
SPALBB(4)	14	19036	3.22	$9.99$ E $-07$	15	22801	18.78	$1.00$ E $-06$
SPALBB(5)	14	24496	4.28	$1.00$ E $-06$	14	35119	29.70	$1.00$ E $-06$
$h(n,m)$	$2^{-8}$ $(n=132,098,m=16,641)$				$2^{-9}$ $(n=526,338,m=66,049)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		39440.5	378.63	$8.31$ E $-07$		-	-	-
GMRES(20)		15638	138.37	$1.00$ E $-06$		37240	1563.36	$1.00$ E $-06$
GMRES(50)		13265	179.20	$1.00$ E $-06$		25858	1518.14	$1.00$ E $-06$
SPALBB(1)	55401	77589	374.99	$9.99$ E $-07$	-	-	-	-
SPALBB(2)	2412	17982	85.91	$1.00$ E $-06$	14384	38247	808.82	$9.94$ E $-07$
SPALBB(3)	51	26095	127.44	$1.00$ E $-06$	340	26085	539.99	$9.69$ E $-07$
SPALBB(4)	17	28805	138.84	$1.00$ E $-06$	23	36728	778.87	$1.00$ E $-06$
SPALBB(5)	14	32707	165.62	$1.00$ E $-06$	14	36635	783.08	$1.00$ E $-06$
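
As a quick check of the CPU-time comparison for $h=2^{-9}$, the following sketch divides the SPALBB times in Table 1 by the better of the two GMRES times. The numbers are copied from the table; the variable names are illustrative.

```python
# CPU times (seconds) copied from the h = 2^{-9} columns of Table 1 (nu = 0.005).
gmres_cpu = {"GMRES(20)": 1563.36, "GMRES(50)": 1518.14}
spalbb_cpu = {"SPALBB(2)": 808.82, "SPALBB(3)": 539.99,
              "SPALBB(4)": 778.87, "SPALBB(5)": 783.08}

best_gmres = min(gmres_cpu.values())
ratios = {name: cpu / best_gmres for name, cpu in spalbb_cpu.items()}
for name, r in ratios.items():
    print(f"{name}: {r:.2f} x best GMRES time")
```

The ratios are roughly 0.53, 0.36, 0.51 and 0.52, in line with the "about half" and "a third" statements for this grid.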

Table 2: Numerical results for Example 1 with $\nu=0.01$.

$h(n,m)$	$2^{-6}$ $(n=8,450,m=1,089)$				$2^{-7}$ $(n=33,282,m=4,225)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		-	-	-		7211.5	11.63	$9.90$ E $-07$
GMRES(20)		2453	1.60	$9.98$ E $-07$		5127	10.15	$9.99$ E $-07$
GMRES(50)		2237	2.63	$9.98$ E $-07$		4422	14.82	$1.00$ E $-06$
SPALBB(1)	1649	4803	0.87	$9.96$ E $-07$	8611	17952	15.60	$9.49$ E $-07$
SPALBB(2)	38	5475	0.91	$1.00$ E $-06$	411	9982	8.26	$1.00$ E $-06$
SPALBB(3)	16	6295	0.98	$1.00$ E $-06$	22	8554	6.94	$1.00$ E $-06$
SPALBB(4)	15	10835	1.76	$1.00$ E $-06$	15	10388	8.35	$1.00$ E $-06$
SPALBB(5)	15	17674	2.89	$1.00$ E $-06$	15	24493	20.09	$1.00$ E $-06$
$h(n,m)$	$2^{-8}$ $(n=132,098,m=16,641)$				$2^{-9}$ $(n=526,338,m=66,049)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		17765.5	161.00	$8.20$ E $-07$		29320	1024.60	$1.46$ E $-05$
GMRES(20)		13579	113.90	$1.00$ E $-06$		34224	1186.20	$1.00$ E $-06$
GMRES(50)		9419	118.86	$1.00$ E $-06$		22827	1140.94	$1.00$ E $-06$
SPALBB(1)	25643	36571	165.72	$9.99$ E $-07$	77225	87388	1632.85	$9.94$ E $-07$
SPALBB(2)	2154	14993	67.21	$1.00$ E $-06$	8277	31640	581.72	$9.99$ E $-07$
SPALBB(3)	42	13434	60.32	$9.99$ E $-07$	554	22207	402.99	$1.00$ E $-06$
SPALBB(4)	16	14113	65.49	$9.99$ E $-07$	22	21902	395.92	$9.14$ E $-07$
SPALBB(5)	15	21503	101.01	$1.00$ E $-06$	15	20219	369.43	$1.00$ E $-06$

Table 3: Numerical results for Example 1 with $\nu=0.05$.

$h(n,m)$	$2^{-6}$ $(n=8,450,m=1,089)$				$2^{-7}$ $(n=33,282,m=4,225)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		917.5	0.33	$9.32$ E $-07$		1648.5	2.72	$9.61$ E $-07$
GMRES(20)		1508	0.94	$9.96$ E $-07$		4507	9.58	$9.98$ E $-07$
GMRES(50)		989	1.15	$9.96$ E $-07$		2769	11.11	$1.00$ E $-06$
SPALBB(1)	782	2947	0.54	$9.96$ E $-07$	3142	9527	7.89	$1.00$ E $-06$
SPALBB(2)	79	1952	0.34	$9.95$ E $-07$	300	4762	3.84	$9.98$ E $-07$
SPALBB(3)	19	1729	0.28	$9.97$ E $-07$	36	3272	2.69	$6.68$ E $-07$
SPALBB(4)	17	4261	0.67	$9.98$ E $-07$	15	3878	3.21	$9.95$ E $-07$
SPALBB(5)	17	5252	0.86	$1.00$ E $-06$	16	8563	7.24	$9.97$ E $-07$
$h(n,m)$	$2^{-8}$ $(n=132,098,m=16,641)$				$2^{-9}$ $(n=526,338,m=66,049)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		3253.5	30.99	$8.58$ E $-07$		6277.5	252.29	$9.54$ E $-07$
GMRES(20)		11071	99.96	$1.00$ E $-06$		21205	797.99	$1.00$ E $-06$
GMRES(50)		7775	104.78	$1.00$ E $-06$		18460	1048.97	$1.00$ E $-06$
SPALBB(1)	10731	30596	141.61	$9.99$ E $-07$	-	-	-	-
SPALBB(2)	1005	13335	62.92	$1.00$ E $-06$	3330	34174	700.69	$9.99$ E $-07$
SPALBB(3)	100	7869	36.87	$9.99$ E $-07$	339	21075	433.98	$1.00$ E $-06$
SPALBB(4)	19	5156	24.64	$7.36$ E $-07$	37	10542	216.78	$9.99$ E $-07$
SPALBB(5)	17	12476	60.19	$9.99$ E $-07$	16	11154	231.77	$9.99$ E $-07$
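
The problem sizes $(n,m)$ listed in Tables 1, 2 and 3 follow a simple pattern in the grid parameter $h=2^{-k}$. The formula in the sketch below is an empirical fit to the listed sizes, consistent with $(2^{k}+1)^{2}$ nodes per velocity component and $(2^{k-1}+1)^{2}$ pressure nodes. It is not taken from the IFISS documentation, and the function name is hypothetical.

```python
def cavity_sizes(k):
    """Return (n, m) for grid parameter h = 2**(-k), fitted to the sizes in Tables 1-3."""
    n = 2 * (2**k + 1) ** 2        # two velocity components
    m = (2 ** (k - 1) + 1) ** 2    # pressure unknowns
    return n, m

# Sizes copied from the table headers.
table_sizes = {6: (8450, 1089), 7: (33282, 4225), 8: (132098, 16641), 9: (526338, 66049)}
for k, expected in table_sizes.items():
    print(f"h = 2^-{k}: formula {cavity_sizes(k)}, table {expected}")
```

Halving $h$ therefore multiplies both $n$ and $m$ by roughly four, which is the growth visible across the columns of each table.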

Example 2.

We consider the steady-state Navier-Stokes equations (44), where the domain $\Omega$ is a rectangular region $(0,8)\times(-1,1)$ generated by deleting the square $(7/4,9/4)\times(-1/4,1/4)$ . This test problem is a model of the flow in a rectangular channel with an obstacle. A Poiseuille profile is imposed on the inflow boundary $\{x=0;-1\leq y\leq 1\}$ , and a Dirichlet no-flow condition is imposed on the obstruction and on the top and bottom walls. A Neumann condition is applied at the outflow boundary that automatically sets the mean outflow pressure to zero.

In our tests, we set $\nu=0.005,\,0.01,\,0.05$ and discretize the Navier-Stokes equations (44) using Picard iterations and the Q2–Q1 mixed finite element approximation [23] on uniform grids with grid parameter $h=2^{-5},\,2^{-6},\,2^{-7},\,2^{-8}$ . This discretization was accomplished using IFISS [23, 46]. The resulting matrices have $G$ UPD and $B$ full column rank. The numerical results are reported in Tables 4, 5 and 6 and Figure 1. Tables 4, 5 and 6 show that all choices of $\omega$ are successful in solving the tested problems, and, in terms of CPU time, $\omega=10^{-1}$ and $10^{-2}$ perform better than other choices. Although BICGSTAB requires the least CPU time for $\nu=0.05$ , it fails for $\nu=0.005$ and $\nu=0.01$ with $h=2^{-5},\,2^{-8}$ . The CPU time for every SPALBB test is less than for GMRES, and the best case of SPALBB takes about half the time of GMRES. Overall, SPALBB is more stable and efficient.

Table 4: Numerical results for Example 2 with $\nu=0.005$.

$h(n,m)$	$2^{-5}$ $(n=8,416,m=1,096)$				$2^{-6}$ $(n=32,960,m=4,208)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		-	-	-		-	-	-
GMRES(20)		6319	3.60	$9.99$ E $-07$		11462	25.25	$9.99$ E $-07$
GMRES(50)		5540	6.10	$9.98$ E $-07$		11386	42.46	$1.00$ E $-06$
SPALBB(1)	632	17567	2.78	$1.00$ E $-06$	2211	16590	12.88	$1.00$ E $-06$
SPALBB(2)	110	29351	4.95	$1.00$ E $-06$	330	34576	27.70	$9.99$ E $-07$
SPALBB(3)	30	29332	4.64	$1.00$ E $-06$	61	34000	26.83	$8.13$ E $-07$
SPALBB(4)	18	34374	5.25	$1.00$ E $-06$	20	34842	27.95	$9.99$ E $-07$
SPALBB(5)	18	40951	6.20	$1.00$ E $-06$	17	44791	36.40	$1.00$ E $-06$
$h(n,m)$	$2^{-7}$ $(n=130,432,m=16,480)$				$2^{-8}$ $(n=518,912,m=65,216)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		-	-	-		-	-	-
GMRES(20)		20442	173.54	$1.00$ E $-06$		39863	1385.75	$1.00$ E $-06$
GMRES(50)		20511	276.89	$1.00$ E $-06$		38382	2019.57	$1.00$ E $-06$
SPALBB(1)	9013	23219	102.78	$1.00$ E $-06$	43791	61073	1155.48	$1.00$ E $-06$
SPALBB(2)	917	46379	211.82	$9.97$ E $-07$	2765	35310	651.49	$1.00$ E $-06$
SPALBB(3)	135	44805	214.76	$1.00$ E $-06$	361	60088	1099.24	$1.00$ E $-06$
SPALBB(4)	32	46194	221.46	$9.10$ E $-07$	65	59081	1080.07	$8.08$ E $-07$
SPALBB(5)	16	52684	255.82	$1.00$ E $-06$	19	57737	1064.29	$1.00$ E $-06$

Table 5: Numerical results for Example 2 with $\nu=0.01$.

$h(n,m)$	$2^{-5}$ $(n=8,416,m=1,096)$				$2^{-6}$ $(n=32,960,m=4,208)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		-	-	-		5336	7.87	$9.89$ E $-07$
GMRES(20)		5145	2.98	$9.99$ E $-07$		9162	18.05	$1.00$ E $-06$
GMRES(50)		4904	5.59	$1.00$ E $-06$		9446	37.68	$1.00$ E $-06$
SPALBB(1)	604	10445	1.54	$9.99$ E $-07$	2455	10813	8.26	$9.99$ E $-07$
SPALBB(2)	108	13769	2.06	$9.42$ E $-07$	294	20425	15.79	$9.77$ E $-07$
SPALBB(3)	30	14084	2.95	$6.63$ E $-07$	57	19994	14.99	$9.99$ E $-07$
SPALBB(4)	18	16042	2.50	$9.98$ E $-07$	20	21636	16.45	$7.17$ E $-07$
SPALBB(5)	18	18160	2.75	$1.00$ E $-06$	17	26448	20.12	$1.00$ E $-06$
$h(n,m)$	$2^{-7}$ $(n=130,432,m=16,480)$				$2^{-8}$ $(n=518,912,m=65,216)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		11686.5	104.74	$7.60$ E $-07$		-	-	-
GMRES(20)		17521	151.67	$1.00$ E $-06$		34452	1190.29	$1.00$ E $-06$
GMRES(50)		17430	237.68	$1.00$ E $-06$		33304	1629.56	$1.00$ E $-06$
SPALBB(1)	8001	17667	79.71	$9.98$ E $-07$	30579	43828	851.71	$9.99$ E $-07$
SPALBB(2)	818	26085	115.24	$1.00$ E $-06$	2576	28375	496.34	$1.00$ E $-06$
SPALBB(3)	138	29649	143.63	$9.98$ E $-07$	308	40365	719.53	$9.99$ E $-07$
SPALBB(4)	31	28824	133.78	$6.90$ E $-07$	67	47090	870.44	$9.64$ E $-07$
SPALBB(5)	17	40333	172.89	$1.00$ E $-06$	20	48639	909.21	$7.29$ E $-07$

Table 6: Numerical results for Example 2 with $\nu=0.05$.

$h(n,m)$	$2^{-5}$ $(n=8,416,m=1,096)$				$2^{-6}$ $(n=32,960,m=4,208)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		1127.5	0.39	$7.52$ E $-07$		2187.5	3.24	$5.81$ E $-07$
GMRES(20)		2420	2.63	$9.97$ E $-07$		5288	10.27	$1.00$ E $-06$
GMRES(50)		2686	4.79	$9.99$ E $-07$		5168	17.39	$9.99$ E $-07$
SPALBB(1)	616	2780	0.47	$9.92$ E $-07$	1877	6454	4.69	$9.99$ E $-07$
SPALBB(2)	90	4043	0.59	$9.98$ E $-07$	222	6876	4.96	$9.96$ E $-07$
SPALBB(3)	28	3888	0.63	$1.00$ E $-06$	47	6949	5.06	$9.98$ E $-07$
SPALBB(4)	19	5246	0.79	$9.97$ E $-07$	21	8136	6.06	$1.00$ E $-06$
SPALBB(5)	18	5470	0.83	$9.97$ E $-07$	18	9266	6.89	$9.99$ E $-07$
$h(n,m)$	$2^{-7}$ $(n=130,432,m=16,480)$				$2^{-8}$ $(n=518,912,m=65,216)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		4516.5	39.77	$9.48$ E $-07$		9366.5	340.85	$8.40$ E $-07$
GMRES(20)		10778	89.17	$9.99$ E $-07$		20466	711.25	$1.00$ E $-06$
GMRES(50)		10134	130.59	$1.00$ E $-06$		18845	921.70	$9.99$ E $-07$
SPALBB(1)	6815	19080	83.73	$9.98$ E $-07$	22561	57886	1903.39	$1.00$ E $-06$
SPALBB(2)	738	11894	53.23	$9.96$ E $-07$	2657	29081	536.54	$9.99$ E $-07$
SPALBB(3)	100	12759	57.35	$9.98$ E $-07$	304	24471	451.52	$1.00$ E $-06$
SPALBB(4)	28	13047	59.88	$1.00$ E $-06$	48	23291	434.16	$1.00$ E $-06$
SPALBB(5)	18	15504	71.03	$1.00$ E $-06$	21	25483	469.27	$9.98$ E $-07$

Example 3.

We consider the steady-state Navier-Stokes equations (44), where the domain $\Omega$ is a rectangular region $(-1,5)\times(-1,1)$ generated by deleting $(-1,0)\times(-1,-1/2)\cup(-1,0)\times(1/2,1)$ . This test problem is a model of the flow in a symmetric step channel. A Poiseuille flow profile is imposed on the inflow boundary $\{x=-1;-1/2\leq y\leq 1/2\}$ , and a Dirichlet no-flow condition is imposed on the top and bottom walls and the boundaries of deleted parts. A Neumann condition is applied at the outflow boundary that sets the mean outflow pressure to zero.

The discretization of the Navier-Stokes equations (44) is done as in Example 2 with the same settings. In this example, $G$ is UPD and $B$ has full column rank. The numerical results are reported in Tables 7, 8 and 9 and in the left-hand plots of Figure 2. As in Example 2, all choices of $\omega$ solve the problems successfully, and BICGSTAB performs best in the case of $\nu=0.05$. Except for $\nu=0.05$ and $\nu=0.01$ with $h=2^{-6}$, SPALBB requires the least CPU time. Hence, Tables 7, 8 and 9 still demonstrate the efficiency of SPALBB.

Table 7: Numerical results for Example 3 with $\nu=0.005$.

$h(n,m)$	$2^{-5}$ $(n=5,890,m=769)$				$2^{-6}$ $(n=23,042,m=2,945)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		-	-	-		-	-	-
GMRES(20)		5662	2.23	$9.98$ E $-07$		12340	17.73	$1.00$ E $-06$
GMRES(50)		3918	3.01	$1.00$ E $-06$		10459	26.94	$9.99$ E $-07$
SPALBB(1)	510	10007	0.96	$9.96$ E $-07$	2392	13447	6.73	$9.99$ E $-07$
SPALBB(2)	99	26993	2.44	$9.44$ E $-07$	322	36764	17.71	$1.00$ E $-06$
SPALBB(3)	23	21194	1.92	$1.00$ E $-06$	54	29542	14.58	$9.39$ E $-07$
SPALBB(4)	17	34776	3.20	$1.00$ E $-06$	20	36454	18.04	$1.00$ E $-06$
SPALBB(5)	18	43570	3.94	$1.00$ E $-06$	17	47559	23.47	$1.00$ E $-06$
$h(n,m)$	$2^{-7}$ $(n=91,138,m=11,521)$				$2^{-8}$ $(n=362,498,m=45,569)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		-	-	-		-	-	-
GMRES(20)		21340	119.38	$1.00$ E $-06$		41829	954.23	$1.00$ E $-06$
GMRES(50)		22164	196.79	$9.99$ E $-07$		41230	1409.32	$1.00$ E $-06$
SPALBB(1)	9838	20630	50.41	$9.97$ E $-07$	44486	63899	834.13	$1.00$ E $-06$
SPALBB(2)	799	41186	103.26	$9.99$ E $-07$	3052	37812	490.00	$9.97$ E $-07$
SPALBB(3)	152	51307	133.36	$6.43$ E $-07$	429	72440	930.67	$9.88$ E $-07$
SPALBB(4)	25	33030	86.80	$1.00$ E $-06$	56	51184	667.27	$9.53$ E $-07$
SPALBB(5)	16	51574	139.75	$1.00$ E $-06$	16	54033	707.80	$1.00$ E $-06$

Table 8: Numerical results for Example 3 with $\nu=0.01$.

$h(n,m)$	$2^{-5}$ $(n=5,890,m=769)$				$2^{-6}$ $(n=23,042,m=2,945)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		-	-	-		4746.5	4.53	$9.85$ E $-07$
GMRES(20)		5518	2.17	$1.00$ E $-06$		10529	14.70	$1.00$ E $-06$
GMRES(50)		3782	2.87	$1.00$ E $-06$		9379	26.42	$1.00$ E $-06$
SPALBB(1)	584	8002	0.73	$9.96$ E $-07$	2340	9722	4.70	$1.00$ E $-06$
SPALBB(2)	106	12810	1.15	$9.99$ E $-07$	297	20437	9.54	$8.66$ E $-07$
SPALBB(3)	24	10581	0.91	$4.68$ E $-07$	53	17322	8.02	$9.98$ E $-07$
SPALBB(4)	17	16101	1.42	$9.99$ E $-07$	18	18021	8.45	$9.99$ E $-07$
SPALBB(5)	18	19627	1.75	$1.00$ E $-06$	17	26925	13.12	$1.00$ E $-06$
$h(n,m)$	$2^{-7}$ $(n=91,138,m=11,521)$				$2^{-8}$ $(n=362,498,m=45,569)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		9904.5	48.44	$9.99$ E $-07$		-	-	-
GMRES(20)		19010	104.40	$9.99$ E $-07$		37453	841.47	$1.00$ E $-06$
GMRES(50)		19738	187.88	$1.00$ E $-06$		37074	1319.69	$1.00$ E $-06$
SPALBB(1)	9286	19210	45.65	$9.98$ E $-07$	34222	47839	825.49	$9.97$ E $-07$
SPALBB(2)	781	24365	61.89	$1.00$ E $-06$	3013	28458	343.04	$1.00$ E $-06$
SPALBB(3)	145	29784	77.31	$9.74$ E $-07$	375	46354	557.08	$9.99$ E $-07$
SPALBB(4)	25	23353	62.19	$9.23$ E $-07$	60	41959	500.22	$9.99$ E $-07$
SPALBB(5)	16	39604	103.42	$1.00$ E $-06$	17	41434	507.15	$1.00$ E $-06$

Table 9: Numerical results for Example 3 with $\nu=0.05$.

$h(n,m)$	$2^{-5}$ $(n=5,890,m=769)$				$2^{-6}$ $(n=23,042,m=2,945)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		914.5	0.22	$8.68$ E $-07$		1888.5	1.89	$2.44$ E $-07$
GMRES(20)		3139	1.35	$9.98$ E $-07$		6106	8.44	$9.99$ E $-07$
GMRES(50)		2808	2.14	$9.99$ E $-07$		6467	16.24	$9.99$ E $-07$
SPALBB(1)	427	2487	0.27	$9.81$ E $-07$	1449	5128	2.32	$1.00$ E $-06$
SPALBB(2)	81	3657	0.33	$8.61$ E $-07$	196	6386	3.11	$9.99$ E $-07$
SPALBB(3)	24	4135	0.37	$1.00$ E $-06$	45	7116	3.23	$7.23$ E $-07$
SPALBB(4)	19	6332	0.59	$1.00$ E $-06$	19	8819	4.27	$9.98$ E $-07$
SPALBB(5)	18	7035	0.65	$9.98$ E $-07$	18	11447	5.33	$1.00$ E $-06$
$h(n,m)$	$2^{-7}$ $(n=91,138,m=11,521)$				$2^{-8}$ $(n=362,498,m=45,569)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		3811.5	18.29	$6.50$ E $-07$		8153.5	194.33	$8.93$ E $-07$
GMRES(20)		12413	64.26	$9.99$ E $-07$		23112	526.24	$1.00$ E $-06$
GMRES(50)		11779	99.11	$9.99$ E $-07$		21620	1401.43	$1.00$ E $-06$
SPALBB(1)	4898	13457	30.89	$9.98$ E $-07$	15210	36389	441.11	$9.99$ E $-07$
SPALBB(2)	542	9821	22.50	$9.97$ E $-07$	1601	21727	264.76	$1.00$ E $-06$
SPALBB(3)	101	12949	30.46	$9.98$ E $-07$	250	21972	267.31	$1.00$ E $-06$
SPALBB(4)	27	13639	33.74	$6.40$ E $-07$	46	22431	274.17	$9.98$ E $-07$
SPALBB(5)	18	17621	44.04	$9.99$ E $-07$	19	26577	328.06	$1.00$ E $-06$

Example 4.

Fluid flow in $\Omega_{f}\subset\mathds{R}^{2}$ coupled with porous media flow in $\Omega_{p}\subset\mathds{R}^{2}$ is governed by the static Stokes equations

$$-\nu\Delta\,{\bm{u}}_{f}+\nabla\,p_{f}={\bm{f}}\quad\textup{and}\quad{\rm div}\,{\bm{u}}_{f}=0,\quad{\bm{z}}\in\Omega_{f},\tag{45}$$

where $\Omega_{f}\cap\Omega_{p}=\varnothing$ and $\overline{\Omega}_{f}\cap\overline{\Omega}_{p}=\Gamma$ with $\Gamma$ being an interface, $\nu>0$ is the kinematic viscosity, and $\bm{f}$ is the external force. In the porous media region, the governing variable is $\phi=\frac{p_{p}}{\rho_{f}g}$ , where $p_{p}$ is the pressure in $\Omega_{p}$ , $\rho_{f}$ is the fluid density, and $g$ is the acceleration due to gravity. The velocity ${\bm{u}}_{p}$ of the porous media flow is related to $\phi$ via Darcy’s law and is also divergence free:

$${\bm{u}}_{p}=-\dfrac{\epsilon^{2}}{r\nu}\nabla\phi\quad\textup{and}\quad-{\rm div}\,{\bm{u}}_{p}=0,\quad{\bm{z}}\in\Omega_{p},\tag{46}$$

where $r$ is the volumetric porosity and $\epsilon$ the characteristic length of the porous media.

In our numerical experiments, the computational domain is $\Omega_{f}=(0,1)\times(1,2)$, $\Omega_{p}=(0,1)\times(0,1)$ and the interface is $\Gamma=(0,1)\times\{1\}$. We use a uniform mesh with grid parameters $h=2^{-5},\,2^{-6},\,2^{-7},\,2^{-8}$ to decompose $\Omega_{f}$, P2–P1 elements in the fluid region, and P2 Lagrange elements in the porous media region. We set $r=1$ and $\epsilon=\sqrt{0.1\nu}$, and again test $\nu=0.005$, $0.01$, $0.05$. Applying finite element discretization to the mixed Stokes-Darcy model (45)–(46) with the Dirichlet no-flow boundary conditions leads to linear systems of the form (1) with $G=\begin{pmatrix}G_{11}&G_{12}\\ -G_{12}^{T}&\nu G_{22}\end{pmatrix}$ [13]. Here $G$ is UPD and $B$ has full column rank. The numerical results are reported in Tables 10, 11 and 12 and Figure 2. According to Tables 10, 11 and 12, all methods again perform better for larger $\nu$, and BICGSTAB requires the least CPU time in most cases, while SPALBB is more competitive for smaller $\nu$. For Example 4, SPALBB prefers smaller $\omega$, such as $\omega=10^{-5}$.
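The block structure of $G$ explains why it is positive definite although unsymmetric: the coupling blocks $G_{12}$ and $-G_{12}^{T}$ cancel in the symmetric part, leaving a block-diagonal matrix. The sketch below checks this numerically, taking UPD to mean $x^{T}Gx>0$ for all nonzero $x$ and assuming that $G_{11}$ and $G_{22}$ are SPD (an assumption, since their properties are not stated in this section). The blocks are random stand-ins, not the finite element matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, nu = 8, 5, 0.005

def random_spd(k):
    """Random SPD matrix of order k (hypothetical stand-in for a stiffness block)."""
    M = rng.standard_normal((k, k))
    return M @ M.T + k * np.eye(k)

# Hypothetical blocks: G11 and G22 SPD (assumed), G12 arbitrary.
G11, G22 = random_spd(n1), random_spd(n2)
G12 = rng.standard_normal((n1, n2))

G = np.block([[G11, G12], [-G12.T, nu * G22]])
sym_part = 0.5 * (G + G.T)

# The coupling blocks cancel, so the symmetric part is blkdiag(G11, nu * G22).
expected = np.block([[G11, np.zeros((n1, n2))], [np.zeros((n2, n1)), nu * G22]])
is_block_diag = np.allclose(sym_part, expected)
min_eig = np.linalg.eigvalsh(sym_part).min()
print("symmetric part is block diagonal:", is_block_diag)
print("smallest eigenvalue of symmetric part:", min_eig)
```

Under these assumptions the smallest eigenvalue of the symmetric part is bounded below by $\min\{\lambda_{\min}(G_{11}),\,\nu\lambda_{\min}(G_{22})\}$, so it shrinks with $\nu$ but stays positive.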

Table 10: Numerical results for Example 4 with $\nu=0.005$.

$h(n,m)$	$2^{-5}$ $(n=12,675,m=1,089)$				$2^{-6}$ $(n=49,923,m=4,225)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		3286.5	0.96	$9.25$ E $-07$		7366.5	9.22	$9.90$ E $-07$
GMRES(20)		7668	5.25	$1.00$ E $-06$		21422	50.66	$1.00$ E $-06$
GMRES(50)		6373	8.74	$1.00$ E $-06$		15042	66.02	$1.00$ E $-06$
SPALBB(1)	2048	9709	2.51	$9.99$ E $-07$	6972	29955	56.13	$1.00$ E $-06$
SPALBB(2)	230	7387	1.13	$9.98$ E $-07$	740	15065	13.15	$9.98$ E $-07$
SPALBB(3)	38	7709	1.08	$9.98$ E $-07$	91	14640	9.56	$1.00$ E $-06$
SPALBB(4)	18	8204	1.13	$9.99$ E $-07$	23	13798	8.60	$1.00$ E $-06$
SPALBB(5)	18	11310	1.77	$9.99$ E $-07$	18	25189	15.77	$1.00$ E $-06$
$h(n,m)$	$2^{-7}$ $(n=198,147,m=16,641)$				$2^{-8}$ $(n=789,507,m=66,049)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		15174.5	107.22	$8.93$ E $-07$		32027.5	829.92	$9.79$ E $-07$
GMRES(20)		47298	450.74	$1.00$ E $-06$		109310	4099.15	$1.00$ E $-06$
GMRES(50)		40626	605.13	$1.00$ E $-06$		81406	5300.94	$1.00$ E $-06$
SPALBB(1)	23244	94747	1623.05	$1.00$ E $-06$	-	-	-	-
SPALBB(2)	2442	43102	305.56	$1.00$ E $-06$	-	-	-	-
SPALBB(3)	272	30164	132.12	$1.00$ E $-06$	815	66624	1503.82	$9.99$ E $-07$
SPALBB(4)	42	27889	113.17	$9.99$ E $-07$	98	56613	907.24	$9.99$ E $-07$
SPALBB(5)	18	33198	138.39	$1.00$ E $-06$	24	47077	717.58	$1.00$ E $-06$

Table 11: Numerical results for Example 4 with $\nu=0.01$.

$h(n,m)$	$2^{-5}$ $(n=12,675,m=1,089)$				$2^{-6}$ $(n=49,923,m=4,225)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		2485.5	0.66	$9.88$ E $-07$		4888.5	5.27	$9.95$ E $-07$
GMRES(20)		5717	3.44	$9.99$ E $-07$		14398	29.13	$9.99$ E $-07$
GMRES(50)		4276	5.27	$1.00$ E $-06$		11608	47.31	$1.00$ E $-06$
SPALBB(1)	2346	9426	2.59	$9.99$ E $-07$	7762	30273	58.77	$1.00$ E $-06$
SPALBB(2)	265	5132	0.78	$9.99$ E $-07$	829	13805	12.72	$1.00$ E $-06$
SPALBB(3)	43	5380	0.69	$9.97$ E $-07$	104	11902	7.44	$9.99$ E $-07$
SPALBB(4)	18	6560	0.85	$9.98$ E $-07$	25	11578	7.05	$9.99$ E $-07$
SPALBB(5)	18	9092	1.14	$1.00$ E $-06$	18	17747	10.45	$9.98$ E $-07$
$h(n,m)$	$2^{-7}$ $(n=198,147,m=16,641)$				$2^{-8}$ $(n=789,507,m=66,049)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		10216	70.41	$9.78$ E $-07$		20513.5	579.78	$9.94$ E $-07$
GMRES(20)		33336	326.81	$1.00$ E $-06$		58210	6339.92	$1.00$ E $-06$
GMRES(50)		28362	438.14	$1.00$ E $-06$		48116	3648.93	$1.00$ E $-06$
SPALBB(1)	25411	97928	1769.82	$1.00$ E $-06$	-	-	-	-
SPALBB(2)	2680	42275	318.61	$9.99$ E $-07$	-	-	-	-
SPALBB(3)	295	23860	111.65	$9.97$ E $-07$	877	61065	1477.52	$9.98$ E $-07$
SPALBB(4)	46	22297	93.48	$9.98$ E $-07$	109	43631	703.09	$1.00$ E $-06$
SPALBB(5)	19	21946	92.28	$1.00$ E $-06$	26	37877	583.34	$9.99$ E $-07$

Table 12: Numerical results for Example 4 with $\nu=0.05$.

$h(n,m)$	$2^{-5}$ $(n=12,675,m=1,089)$				$2^{-6}$ $(n=49,923,m=4,225)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		883.5	0.22	$9.94$ E $-07$		1738.5	1.57	$9.55$ E $-07$
GMRES(20)		2589	2.22	$9.98$ E $-07$		5115	9.27	$9.99$ E $-07$
GMRES(50)		2509	3.85	$1.00$ E $-06$		4542	16.25	$1.00$ E $-06$
SPALBB(1)	3077	10839	2.94	$9.97$ E $-07$	10033	35063	68.21	$1.00$ E $-06$
SPALBB(2)	346	4649	0.70	$9.99$ E $-07$	1090	14684	12.95	$9.98$ E $-07$
SPALBB(3)	54	3899	0.46	$9.98$ E $-07$	131	7892	4.30	$9.94$ E $-07$
SPALBB(4)	21	3344	0.38	$9.99$ E $-07$	29	6819	3.38	$9.98$ E $-07$
SPALBB(5)	18	3263	0.37	$9.97$ E $-07$	18	6022	2.95	$1.00$ E $-06$
$h(n,m)$	$2^{-7}$ $(n=198,147,m=16,641)$				$2^{-8}$ $(n=789,507,m=66,049)$
	Oiter	Titer	CPU	RES	Oiter	Titer	CPU	RES
BICGSTAB		3512.5	23.21	$9.92$ E $-07$		7305.5	230.70	$9.72$ E $-07$
GMRES(20)		10545	107.69	$1.00$ E $-06$		30657	1438.72	$1.00$ E $-06$
GMRES(50)		8036	164.70	$1.00$ E $-06$		14974	1230.40	$1.00$ E $-06$
SPALBB(1)	-	-	-	-	-	-	-	-
SPALBB(2)	3464	46071	375.30	$9.98$ E $-07$	-	-	-	-
SPALBB(3)	373	21106	102.87	$9.97$ E $-07$	1105	60707	1291.67	$9.96$ E $-07$
SPALBB(4)	55	13647	56.56	$9.99$ E $-07$	126	26414	469.18	$1.00$ E $-06$
SPALBB(5)	20	11646	48.65	$1.00$ E $-06$	28	21844	341.45	$9.99$ E $-07$

In conclusion, Tables 1–12 and Figures 1 and 2 illustrate that SPALBB is a practical method, and its advantages increase with problem size. SPALBB and GMRES are more robust than BICGSTAB. Unlike GMRES, SPALBB has constant storage. In terms of CPU time, SPALBB is more efficient than GMRES. We see from Tables 1–9 that the advantages of SPALBB are more obvious for smaller $\nu$, i.e., more unsymmetric $G$. Figures 1 and 2 indicate that the convergence rate of SPALBB depends strongly on $\omega$. For larger $\omega$, the nonmonotonicity of $\|r_{k}\|$ in SPALBB becomes more pronounced. This strongly nonmonotone behavior is similar to that of the BB method [40].
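
The nonmonotone residual behavior of the BB method can be observed on a very small problem. The sketch below is a minimal illustration on a diagonal SPD system, not the inner solver of SPALBB: it uses the BB1 step length $\alpha_{k}=s^{T}s/s^{T}y$ with $s=x_{k}-x_{k-1}$ and $y=g_{k}-g_{k-1}$, and the test matrix, initial step and tolerances are arbitrary choices.

```python
import numpy as np

def bb1_solve(A, b, x0, tol=1e-8, max_iter=5000):
    """Barzilai-Borwein (BB1 step) iteration for SPD A; returns x and the residual-norm history."""
    x = x0.copy()
    g = A @ x - b                        # gradient of 0.5*x'Ax - b'x (minus the residual)
    alpha = 1.0 / np.linalg.norm(A, 2)   # conservative first step (arbitrary choice)
    history = [np.linalg.norm(g)]
    for _ in range(max_iter):
        if history[-1] <= tol * history[0]:
            break
        x_new = x - alpha * g
        g_new = A @ x_new - b
        s, y = x_new - x, g_new - g
        alpha = (s @ s) / (s @ y)        # BB1 step length used in the next iteration
        x, g = x_new, g_new
        history.append(np.linalg.norm(g))
    return x, np.array(history)

A = np.diag(np.linspace(1.0, 100.0, 10))
b = np.ones(10)
x, hist = bb1_solve(A, b, np.zeros(10))
increases = int(np.sum(np.diff(hist) > 0))
print(f"iterations: {len(hist) - 1}, final relative residual: {hist[-1] / hist[0]:.2e}")
print(f"steps where the residual norm increased: {increases}")
```

The method enforces no decrease of the residual norm at any step, so a positive count of increases is typical and convergence is reached regardless.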

5 Conclusions

We presented a theoretical and numerical study of the augmented Lagrangian (SPAL) algorithm and its inexact version for solving unsymmetric saddle-point systems. Specifically, we used a gradient method, known as the Barzilai-Borwein (BB) method, to solve the linear system in SPAL inexactly and proposed the augmented Lagrangian BB (SPALBB) algorithm. The numerical results presented for SPALBB are highly encouraging. SPALBB often requires the least CPU time, and its advantages are especially clear for larger problems. Finding practical methods for choosing $\omega$ and $Q$ to balance the inner and outer iterations is a topic for future research.

Acknowledgments

We thank our colleague and friend, Prof Dr Oleg Burdakov, for his devotion to this research. In particular, we express our gratitude to him for fundamental contributions that initiated this work. Oleg developed the SPALBB algorithm, proposed the counter-example to show that the BB1 method may be divergent, and gave many constructive suggestions on our Matlab implementation of SPALBB.

References

Arrow et al. [1958] K. J. Arrow, L. Hurwicz, H. Uzawa, and H. B. Chenery. Studies in Linear and Non-linear Programming, volume 2. Stanford University Press, 1958.
Awanou and Lai [2005a] G. Awanou and M. J. Lai. On convergence rate of the augmented Lagrangian algorithm for nonsymmetric saddle point problems. Appl. Numer. Math., 54(2):122–134, 2005a.
Awanou and Lai [2005b] G. Awanou and M. J. Lai. Trivariate spline approximations of 3D Navier–Stokes equations. Math. Comp., 74(250):585–601, 2005b.
Bai and Benzi [2017] Z. Z. Bai and M. Benzi. Regularized HSS iteration methods for saddle-point linear systems. BIT Numer. Math., 57(2):287–311, 2017.
Barzilai and Borwein [1988] J. Barzilai and J. M. Borwein. Two-point step size gradient methods. IMA J. Numer. Anal., 8(1):141–148, 1988.
Benzi and Golub [2004] M. Benzi and G. H. Golub. A preconditioner for generalized saddle point problems. SIAM J. Matrix Anal. Appl., 26(1):20–41, 2004.
Benzi and Wathen [2008] M. Benzi and A. J. Wathen. Some preconditioning techniques for saddle point problems. Model order reduction: theory, research aspects and applications, pages 195–211, 2008.
Benzi et al. [2005] M. Benzi, G. H. Golub, and J. Liesen. Numerical solution of saddle point problems. Acta Numerica, 14(2):1–137, 2005.
Berman and Plemmons [1994] A. Berman and R. J. Plemmons. Nonnegative Matrices in the Mathematical Sciences. SIAM, 1994.
Bertsekas [2014] D. P. Bertsekas. Constrained Optimization and Lagrange Multiplier Methods. Academic Press, 2014.
Birgin and Martínez [2014] E. G. Birgin and J. M. Martínez. Practical Augmented Lagrangian Methods for Constrained Optimization. SIAM, 2014.
Burdakov et al. [2019] O. Burdakov, Y. H. Dai, and N. Huang. Stabilized Barzilai-Borwein method. J. Comput. Math., 37(6):916–936, 2019.
Cai et al. [2009] M. C. Cai, M. Mu, and J. C. Xu. Preconditioning techniques for a mixed Stokes/Darcy model in porous media applications. J. Comput. Appl. Math., 233(2):346–355, 2009.
Campbell and Meyer [2009] S. L. Campbell and C. D. Meyer. Generalized Inverses of Linear Transformations. SIAM, 2009.
Cao and Miao [2016] Y. Cao and S. X. Miao. On semi-convergence of the generalized shift-splitting iteration method for singular nonsymmetric saddle point problems. Comput. Math. Appl., 71(7):1503–1511, 2016.
Cheng [2000] X. L. Cheng. On the nonlinear inexact Uzawa algorithm for saddle-point problems. SIAM J. Numer. Anal., 37(6):1930–1934, 2000.
Dai and Liao [2002] Y. H. Dai and L. Z. Liao. R-linear convergence of the Barzilai and Borwein gradient method. IMA J. Numer. Anal., 22(1):1–10, 2002.
Dai et al. [2005] Y. H. Dai, L. Z. Liao, and D. Li. An analysis of Barzilai-Borwein gradient method for unsymmetric linear equations. Optim. Control Appl., pages 183–211, 2005.
Dai et al. [2006] Y. H. Dai, W. W. Hager, K. Schittkowski, and H. C. Zhang. The cyclic Barzilai-Borwein method for unconstrained optimization. IMA J. Numer. Anal., 26(3):604–627, 2006.
Di Serafino and Orban [2021] D. Di Serafino and D. Orban. Constraint-preconditioned Krylov solvers for regularized saddle-point systems. SIAM J. Sci. Comput., 43(2):A1001–A1026, 2021.
Dollar et al. [2010] H. S. Dollar, N. I. Gould, M. Stoll, and A. J. Wathen. Preconditioning saddle-point systems with applications in optimization. SIAM J. Sci. Comput., 32(1):249–270, 2010.
Elman [1999] H. C. Elman. Preconditioning for the steady-state Navier–Stokes equations with low viscosity. SIAM J. Sci. Comput., 20(4):1299–1316, 1999.
Elman et al. [2007] H. C. Elman, A. Ramage, and D. J. Silvester. Algorithm 866: IFISS, a Matlab toolbox for modelling incompressible flow. ACM Trans. Math. Softw., 33(2):14–es, 2007.
Friedlander et al. [1998] A. Friedlander, J. M. Martínez, B. Molina, and M. Raydan. Gradient method with retards and generalizations. SIAM J. Numer. Anal., 36(1):275–289, 1998.
Ghannad et al. [2022] A. Ghannad, D. Orban, and M. A. Saunders. Linear systems arising in interior methods for convex optimization: a symmetric formulation with bounded condition number. Optim. Method Softw., 37(4):1344–1369, 2022.
Glowinski and Le Tallec [1989] R. Glowinski and P. Le Tallec. Augmented Lagrangian and Operator-splitting Methods in Nonlinear Mechanics. SIAM, 1989.
Golub and Greif [2003] G. H. Golub and C. Greif. On solving block-structured indefinite linear systems. SIAM J. Sci. Comput., 24(6):2076–2092, 2003.
Golub et al. [2005] G. H. Golub, C. Greif, and J. M. Varah. An algebraic analysis of a block diagonal preconditioner for saddle point systems. SIAM J. Matrix Anal. Appl., 27(3):779–792, 2005.
Gould et al. [2014] N. Gould, D. Orban, and T. Rees. Projected Krylov methods for saddle-point systems. SIAM J. Matrix Anal. Appl., 35(4):1329–1343, 2014.
Hu and Zou [2006] Q. Hu and J. Zou. Nonlinear inexact Uzawa algorithms for linear and nonlinear saddle-point problems. SIAM J. Optim., 16(3):798–825, 2006.
Kozjakin and Krasnosel’ski [1982] V. Kozjakin and M. Krasnosel’ski. Some remarks on the method of minimal residues. Numer. Funct. Anal. Optim., 4(3):211–239, 1982.
Krasnosel’skii and Krein [1952] M. A. Krasnosel’skii and S. G. Krein. An iteration process with minimal residuals. Matematicheskii Sbornik, 73(2):315–334, 1952.
Lu and Zhang [2010] J. Lu and Z. Zhang. A modified nonlinear inexact Uzawa algorithm with a variable relaxation parameter for the stabilized saddle point problem. SIAM J. Matrix Anal. Appl., 31(4):1934–1957, 2010.
Molina and Raydan [1996] B. Molina and M. Raydan. Preconditioned Barzilai-Borwein method for the numerical solution of partial differential equations. Numer. Algor., 13:45–60, 1996.
Montoison and Orban [2023] A. Montoison and D. Orban. GPMR: An iterative method for unsymmetric partitioned linear systems. SIAM J. Matrix Anal. Appl., 44(1):293–311, 2023.
Orban and Arioli [2017] D. Orban and M. Arioli. Full-Space Iterative Methods. In Iterative Solution of Symmetric Quasi-definite Linear Systems, chapter 6, pages 63–72. SIAM, 2017.
Pestana and Rees [2016] J. Pestana and T. Rees. Null-space preconditioners for saddle point systems. SIAM J. Matrix Anal. Appl., 37(3):1103–1128, 2016.
Ramage and Gartland Jr [2013] A. Ramage and E. C. Gartland Jr. A preconditioned nullspace method for liquid crystal director modeling. SIAM J. Sci. Comput., 35(1):B226–B247, 2013.
Raydan [1993] M. Raydan. On the Barzilai and Borwein choice of steplength for the gradient method. IMA J. Numer. Anal., 13(3):321–326, 1993.
Raydan [1997] M. Raydan. The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem. SIAM J. Optim., 7(1):26–33, 1997.
Rozlozník and Simoncini [2002] M. Rozlozník and V. Simoncini. Krylov subspace methods for saddle point problems with indefinite preconditioning. SIAM J. Matrix Anal. Appl., 24(2):368–391, 2002.
Saad [2003] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, 2003.
Saad and Schultz [1986] Y. Saad and M. H. Schultz. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. and Statist. Comput., 7(3):856–869, 1986.
Scott and Tuma [2020] J. Scott and M. Tuma. A null-space approach for symmetric saddle point systems with a non zero (2, 2) block. SIAM J. Sci. Comput., 2020.
Scott and Tuma [2022] J. Scott and M. Tuma. A null-space approach for large-scale symmetric saddle point systems with a small and non zero (2, 2) block. Numer. Algor., 90(4):1639–1667, 2022.
Silvester et al. [2023] D. Silvester, H. Elman, and A. Ramage. Incompressible Flow & Iterative Solver Software. https://personalpages.manchester.ac.uk/staff/david.silvester/ifiss/, 2023.
Van der Vorst [1992] H. A. Van der Vorst. Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems. SIAM J. Sci. and Statist. Comput., 13(2):631–644, 1992.
Wright [1997] S. J. Wright. Primal-dual Interior-point Methods. SIAM, 1997.
Zhang and Wei [2010] N. Zhang and Y. M. Wei. On the convergence of general stationary iterative methods for range-Hermitian singular linear systems. Numer. Linear Algebra Appl., 17:139–154, 2010.
Zheng et al. [2009] B. Zheng, Z. Z. Bai, and X. Yang. On semi-convergence of parameterized Uzawa methods for singular saddle point problems. Linear Algebra Appl., 431(5-7):808–817, 2009.
Zou and Magoulès [2022] Q. M. Zou and F. Magoulès. Delayed gradient methods for symmetric and positive definite linear systems. SIAM Rev., 64(3):517–553, 2022.
Zulehner [2002] W. Zulehner. Analysis of iterative methods for saddle point problems: a unified approach. Math. Comp., 71(238):479–505, 2002.

$$\begin{aligned}
\|z_{k}-z_{*}\|_{P_{\beta}}&=\|A^{-1}r_{k}\|_{P_{\beta}}=\|P_{\beta}^{\frac{1}{2}}A^{-1}P_{\beta}^{-\frac{1}{2}}P_{\beta}^{\frac{1}{2}}r_{k}\|\leq\|P_{\beta}^{\frac{1}{2}}A^{-1}P_{\beta}^{-\frac{1}{2}}\|\,\|P_{\beta}^{\frac{1}{2}}r_{k}\|\\
&=\|A^{-1}\|_{P_{\beta}}\|r_{k}\|_{P_{\beta}}\leq\|A^{-1}\|_{P_{\beta}}\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\|r_{0}\|_{P_{\beta}}\\
&=\|A^{-1}\|_{P_{\beta}}\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\|A(z_{0}-z_{*})\|_{P_{\beta}}\\
&\leq\|A^{-1}\|_{P_{\beta}}\|A\|_{P_{\beta}}\big(\|NM^{-1}\|_{P_{\beta}}+\delta\|I-NM^{-1}\|_{P_{\beta}}\big)^{k}\|z_{0}-z_{*}\|_{P_{\beta}}.
\end{aligned}$$
