Background results for robust minmax control
of linear dynamical systems

An earlier version of these results was presented at the short course “Robust Nonlinear Model Predictive Control: Recent Advances in Design and Computation,” University of California, Santa Barbara, CA, March 25–28, 2024.

James B. Rawlings, Davide Mannini, and Steven J. Kuntz
Department of Chemical Engineering
University of California, Santa Barbara
The authors gratefully acknowledge the financial support of the National Science Foundation (NSF) under Grant Nos. 2027091 and 2138985.
(June 21, 2024)

The purpose of this note is to summarize the arguments required to derive the results appearing in robust minmax control of linear dynamical systems using a quadratic stage cost. The main results required in robust minmax control are Corollary 19 and Proposition 20. Moreover, the solution to the trust-region problem given in Proposition 15 and Lemma 16 may be of more general interest.

Linear algebra

We assume throughout that the parameters satisfy $D \in \mathbb{R}^{n \times n} \geq 0$, $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^{m}$, and $d \in \mathbb{R}^{n}$ or $d \in \mathbb{R}^{n+m}$. Let $A^{+} \in \mathbb{R}^{n \times m}$ denote the pseudoinverse of matrix $A \in \mathbb{R}^{m \times n}$. Let $N(A)$ and $R(A)$ denote the null space and range space of matrix $A$, respectively. We will also make use of the singular value decomposition (SVD) of $A$ given by

$$A = \begin{bmatrix} U_1 & U_2 \end{bmatrix} \begin{bmatrix} \Sigma_r & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_1' \\ V_2' \end{bmatrix} = U_1 \Sigma_r V_1' \tag{1}$$

and $r$ is the rank of $A$. The properties of the SVD and the fundamental theorem of linear algebra imply that the orthonormal columns of $U_1$ and $U_2$ are bases for $R(A)$ and $N(A')$, respectively, and the orthonormal columns of $V_1$ and $V_2$ are bases for $R(A')$ and $N(A)$, respectively.[1] We also have that $A^{+} = V_1 \Sigma_r^{-1} U_1'$.

[1] Edge cases: $A = 0$ has $r = 0$ and empty $U_1, V_1, \Sigma_r$ matrices, so $U = U_2$, $V = V_2$ and $R(A) = \{0\}$, $R(A') = \{0\}$, $N(A) = \mathbb{R}^{n}$, $N(A') = \mathbb{R}^{m}$. At the other extreme, if $A$ is square and invertible, $r = m = n$ and $U_2, V_2$ are empty so $U = U_1$, $V = V_1$, and $R(A) = \mathbb{R}^{n}$, $R(A') = \mathbb{R}^{m}$, $N(A) = \{0\}$, $N(A') = \{0\}$.
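As a numerical illustration of the SVD partition in Eq. 1 and the associated subspace bases, the following is a minimal NumPy sketch. The helper name `svd_partition`, the rank tolerance `tol`, and the example matrix are assumptions made for illustration only; they do not come from this note.

```python
import numpy as np

def svd_partition(A, tol=1e-10):
    """Return U1, U2, Sigma_r, V1, V2 from the full SVD of A (the blocks of Eq. 1)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    r = int(np.sum(s > tol))            # numerical rank; tol is an assumed cutoff
    V = Vt.T
    return U[:, :r], U[:, r:], np.diag(s[:r]), V[:, :r], V[:, r:]

# Hypothetical example with m = 2, n = 3, and rank r = 1
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
U1, U2, Sigma_r, V1, V2 = svd_partition(A)

A_rebuilt = U1 @ Sigma_r @ V1.T                        # A = U1 Sigma_r V1'
A_pinv = V1 @ np.diag(1.0 / np.diag(Sigma_r)) @ U1.T   # A^+ = V1 Sigma_r^{-1} U1'

print(np.allclose(A_rebuilt, A))     # reconstruction check
print(np.allclose(A @ V2, 0.0))      # columns of V2 lie in N(A)
print(np.allclose(A.T @ U2, 0.0))    # columns of U2 lie in N(A')
```

Here the numerical rank is decided by a fixed threshold on the singular values; in practice the appropriate threshold depends on the scaling of $A$.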

First, we require solutions to linear algebra problems when such solutions exist.

Proposition 1 (Solving linear algebra problems).

Consider the linear algebra problem

$$Ax = b$$

  1. A solution exists if and only if $b \in R(A)$.

  2. For $b \in R(A)$, the solution (set of solutions) is given by[2]

     $$x^{0} \in A^{+} b + N(A) \tag{2}$$

[2] We overload the addition symbol to mean set addition when adding singletons ($A^{+} b$) and sets ($N(A)$).

Proof.

By definition of range, if $b \notin R(A)$ there is no $x$ such that $Ax = b$, and if $b \in R(A)$, there is an $x$ such that $Ax = b$, which is the same as the existence condition. For $b \in R(A)$, let $z \in \mathbb{R}^{n}$ denote a value so that $Az = b$, and let $q$ be an arbitrary element in $N(A)$ so $x^{0} = A^{+} b + q$. To show that the elements of Eq. 2 are solutions, note that

$$Ax^{0} = A(A^{+} b + q) = AA^{+}Az = Az = b$$

where we have used the definition of the null space and one of the pseudoinverse’s defining properties, $AA^{+}A = A$. To show that Eq. 2 gives all solutions, let $x'$ denote a solution. We then have $A(x' - A^{+} b) = b - b = 0$, so $x' - A^{+} b \in N(A)$ or $x' \in A^{+} b + N(A)$. Since $x'$ is an arbitrary solution, Eq. 2 gives all solutions. ∎

To derive Eq. 2 rather than merely verify it as we did here, use the two orthogonal coordinate systems provided by the SVD of $A$: let $x = V\alpha$ and $b = U\beta$, solve the resulting decoupled linear algebra problem for $\alpha^{0}$ as a function of $\beta$, and convert back to $x^{0}$ in terms of $b$.

If $b \notin R(A)$, $x^{0}$ is still well-defined, but $Ax^{0} - b = (AA^{+} - I)b = -U_2 U_2' b \neq 0$. In this case, the $x^{0}$ given in Eq. 2 solves $\min_x |Ax - b|$ (least-squares solution), and achieves value $|Ax^{0} - b| = |U_2' b|$.
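The two cases of Proposition 1 and the least-squares remark can be sketched numerically as follows. This is an illustrative sketch only: the function name `solve_linear`, the tolerance, and the test data are assumptions, not part of the note.

```python
import numpy as np

def solve_linear(A, b, tol=1e-10):
    """Return (x_p, V2, residual): x_p = A^+ b, V2 spans N(A), residual = |A x_p - b|."""
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    r = int(np.sum(s > tol))                       # numerical rank (assumed cutoff)
    U1, V1, V2 = U[:, :r], Vt[:r, :].T, Vt[r:, :].T
    x_p = V1 @ ((U1.T @ b) / s[:r])                # A^+ b = V1 Sigma_r^{-1} U1' b
    return x_p, V2, float(np.linalg.norm(A @ x_p - b))

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

# Case b in R(A): every element of A^+ b + N(A) solves A x = b
b_in = np.array([1.0, 2.0])
x_p, V2, res_in = solve_linear(A, b_in)
x_other = x_p + V2 @ np.array([0.7, -1.3])         # arbitrary null-space shift

# Case b not in R(A): x_p is only a least-squares solution
b_out = np.array([1.0, 0.0])
_, _, res_out = solve_linear(A, b_out)

print(res_in, res_out)
```

For the second right-hand side the residual norm should agree with $|U_2' b|$, which can be checked by hand for this small example.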

Positive semidefinite matrices.

We say that a matrix $M \in \mathbb{R}^{n \times n}$ is positive semidefinite, denoted $M \geq 0$, if $M$ is symmetric and $x'Mx \geq 0$ for all $x \in \mathbb{R}^{n}$.

Optimization

We shall appeal without proof to one theorem for existence of solutions to optimization problems, the Weierstrass (extreme value) theorem. It says that a continuous function on a closed and bounded set attains its minimum and maximum on the set.[3] As we specialize to the results of interest in this note, next we consider convex, differentiable functions.

[3] Proofs for the multivariate version required here can be found in Mangasarian (1994, p. 198), Polak (1997, Corollary 5.1.25), Rockafellar and Wets (1998, p. 11), and Rawlings et al. (2020, Proposition A.7).

Definition 2 (Convex function).

A function $V: \mathbb{R}^{n} \rightarrow \mathbb{R}$ is convex if

$$V(\alpha u + (1-\alpha)v) \leq \alpha V(u) + (1-\alpha)V(v) \tag{3}$$

for all $u, v \in \mathbb{R}^{n}$ and $0 \leq \alpha \leq 1$.

If the function $V$ is differentiable, then it is convex if and only if

$$V(v) \geq V(u) + (v-u)' \frac{dV}{du}(u) \tag{4}$$

for all $u, v \in \mathbb{R}^{n}$. See Boyd and Vandenberghe (2004, pp. 69–70) for a proof of this fact.

An immediate consequence of this global lower bound is that $u^{0}$ is a minimizer of $V$ if and only if $(dV/du)(u^{0}) = 0$.

Proposition 3.

A convex, differentiable function $V: \mathbb{R}^{n} \rightarrow \mathbb{R}$ has a minimizer $u^{0}$ if and only if $(dV/du)(u^{0}) = 0$.

Proof.

To establish sufficiency, assume $(dV/du)(u^{0}) = 0$; Eq. 4 then implies $V(v) \geq V(u^{0})$ for all $v \in \mathbb{R}^{n}$, and therefore $u^{0}$ is a minimizer of $V$.

To establish necessity, assume that $u^{0}$ is optimal but that, contrary to what is to be proven, $(dV/du)(u^{0}) \neq 0$, and let $h = -(dV/du)(u^{0})$ so that the directional derivative satisfies

$$\lim_{\lambda \rightarrow 0^{+}} \frac{V(u^{0} + \lambda h) - V(u^{0})}{\lambda} = h' \frac{dV}{du}(u^{0}) = -\left| \frac{dV}{du}(u^{0}) \right|^{2}$$

Given this limit, for every $\epsilon > 0$ there exists $\delta(\epsilon) > 0$ such that

$$\frac{V(u^{0} + \lambda h) - V(u^{0})}{\lambda} \leq -\left| \frac{dV}{du}(u^{0}) \right|^{2} + \epsilon$$

for all $0 < \lambda \leq \delta$. Choose $\epsilon = (1/2)\left| (dV/du)(u^{0}) \right|^{2} > 0$, and we have that

$$V(u^{0} + \lambda h) \leq V(u^{0}) - (\lambda/2) \left| \frac{dV}{du}(u^{0}) \right|^{2}$$

for $0 < \lambda \leq \delta$. This inequality contradicts the optimality of $u^{0}$ and, therefore, $(dV/du)(u^{0}) = 0$, which establishes necessity, and the proposition is proven. ∎
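The decrease inequality used in the necessity argument can be observed numerically. The sketch below uses the log-sum-exp function, a smooth convex function chosen here purely for illustration (it is an assumption, not an example from this note), and evaluates the inequality for a few small step sizes along $h = -(dV/du)(u^{0})$.

```python
import numpy as np

def V(u):
    """Log-sum-exp, a smooth convex function (illustrative choice only)."""
    return float(np.log(np.sum(np.exp(u))))

def grad_V(u):
    """Gradient of log-sum-exp (the softmax of u)."""
    e = np.exp(u)
    return e / e.sum()

u0 = np.array([0.3, -1.0, 0.5])   # arbitrary point with nonzero gradient
g = grad_V(u0)
h = -g                            # the direction used in the proof

# Check V(u0 + lam h) <= V(u0) - (lam/2)|g|^2 for several small lam
checks = [V(u0 + lam * h) <= V(u0) - 0.5 * lam * (g @ g)
          for lam in (1e-1, 1e-2, 1e-3)]
print(checks)
```

Because the gradient of this particular function never vanishes, every point admits such a descent step, which is consistent with the function having no minimizer.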

When considering robust control of linear dynamical systems with quadratic stage cost, quadratic functions play a central role. We have the following result about their convexity.

Proposition 4 (Convex quadratic functions).

The quadratic function $V(u) = (1/2)u'Du + u'd + c$ is convex if and only if $D \geq 0$.

Proof.

We establish that the quadratic term $f(u) \coloneqq u'Du$ is convex if and only if $D \geq 0$ by substituting $\alpha u + (1-\alpha)v$ into the function $f$ and rearranging the terms

$$f(\alpha u + (1-\alpha)v) - \big(\alpha f(u) + (1-\alpha)f(v)\big) = -\alpha(1-\alpha)(u-v)'D(u-v)$$

Since $-\alpha(1-\alpha) < 0$ for $\alpha \in (0,1)$, we have that the right-hand side is less than or equal to zero for every $u, v \in \mathbb{R}^{n}$ if and only if $D \geq 0$, verifying (3) for the function $f$. For $\alpha = 0$ and $\alpha = 1$, (3) holds with equality.

It is then straightforward to show that the linear function $u'd$ and the constant function $c$ both satisfy (3) with equality. Adding them to $(1/2)f(u)$ therefore changes both sides of (3) by the same amount, so $V$ satisfies (3) if and only if $f$ does, and, therefore, the function $V$ is convex if and only if $D \geq 0$. ∎
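The identity used in this proof is easy to check numerically. The following sketch evaluates both sides for an indefinite $D$, where the convexity inequality (3) fails. The matrix, the test points, and the helper name `is_psd` are assumptions chosen for illustration.

```python
import numpy as np

def is_psd(D, tol=1e-10):
    """Symmetric with nonnegative eigenvalues, up to an assumed tolerance."""
    return bool(np.allclose(D, D.T) and np.linalg.eigvalsh(D).min() >= -tol)

def f(u, D):
    """Quadratic term f(u) = u' D u."""
    return float(u @ D @ u)

D = np.array([[2.0, 0.0],
              [0.0, -1.0]])            # symmetric but indefinite
u = np.array([0.0, 1.0])
v = np.array([0.0, -1.0])
alpha = 0.5

# Left- and right-hand sides of the identity in the proof
lhs = f(alpha * u + (1 - alpha) * v, D) - (alpha * f(u, D) + (1 - alpha) * f(v, D))
rhs = -alpha * (1 - alpha) * float((u - v) @ D @ (u - v))

print(lhs, rhs, is_psd(D))
```

A positive value of `lhs` means inequality (3) is violated at this pair of points, which matches the eigenvalue test reporting that this $D$ is not positive semidefinite.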

The following optimization result for convex quadratic functions is then useful in the ensuing discussion.

Proposition 5 (Minimum of quadratic functions).

Consider the quadratic function $V(\cdot): \mathbb{R}^{n} \rightarrow \mathbb{R}$ with $D \in \mathbb{R}^{n \times n} \geq 0$

$$V(u) \coloneqq (1/2)u'Du + u'd$$

  1. A solution to $\min_u V$ exists if and only if $d \in R(D)$.

  2. For $d \in R(D)$, the minimizer and optimal value function are

     $$u^{0} \in -D^{+}d + N(D) \qquad V^{0} = -(1/2)d'D^{+}d \tag{5}$$

     and $(d/du)V(u) = 0$ at $u^{0}$.

Proof.

The function $V(u)$ is differentiable and convex (Proposition 4), so from Proposition 3, $u^{0}$ is a minimizer if and only if the derivative vanishes at $u^{0}$. Taking the derivative gives $(d/du)V(u) = Du + d$. From Proposition 1, $Du + d = 0$ has a solution if and only if $d \in R(D)$, and the set of all solutions is $u^{0} \in -D^{+}d + N(D)$. Evaluating $V$ at a solution, using $Du^{0} = -d$ and $q'd = 0$ for every $q \in N(D)$ (because $d \in R(D)$ and $D$ is symmetric), gives $V(u^{0}) = (1/2)d'u^{0} = -(1/2)d'D^{+}d$, establishing Eq. 5. ∎

For maximization problems, we can replace $D \geq 0$ with $D \leq 0$ and min with max.
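Proposition 5 can be sketched in a few lines of NumPy for a singular $D$. The function names `min_quadratic` and `V`, the tolerance, and the example data below are assumptions made for illustration; the range test $DD^{+}d = d$ is one possible numerical way to check $d \in R(D)$.

```python
import numpy as np

def V(u, D, d):
    """Quadratic function V(u) = (1/2) u'Du + u'd."""
    return float(0.5 * u @ D @ u + u @ d)

def min_quadratic(D, d, tol=1e-10):
    """Minimize V for symmetric D >= 0.

    Returns (u0, V0) with u0 = -D^+ d, or None when d is not in R(D),
    in which case no minimizer exists.
    """
    D_pinv = np.linalg.pinv(D, rcond=tol, hermitian=True)
    # d is in R(D) exactly when projecting d onto R(D) leaves it unchanged
    if np.linalg.norm(D @ D_pinv @ d - d) > tol * max(1.0, np.linalg.norm(d)):
        return None
    u0 = -D_pinv @ d
    return u0, float(-0.5 * d @ D_pinv @ d)

D = np.diag([2.0, 0.0])                    # positive semidefinite, singular
d_ok = np.array([4.0, 0.0])                # lies in R(D)
d_bad = np.array([4.0, 1.0])               # has a component in N(D)

u0, V0 = min_quadratic(D, d_ok)
q = np.array([0.0, 5.0])                   # element of N(D)
print(u0, V0, V(u0 + q, D, d_ok))          # shifting by q leaves the value unchanged
print(min_quadratic(D, d_bad))             # no minimizer in this case
```

Returning `None` for the infeasible case is a design choice of this sketch; an implementation could instead raise an exception or report that the problem is unbounded below.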

Partitioned semidefinite matrices.

We make extensive use of partitioned matrices

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{12}' & M_{22} \end{bmatrix}$$

We have the following result for positive semidefinite partitioned matrices (Boyd and Vandenberghe, 2004, p.651).

Proposition 6 (Positive semidefinite partitioned matrices).

The matrix $M \geq 0$ if and only if $M_{11} \geq 0$, $M_{22} - M_{12}'M_{11}^{+}M_{12} \geq 0$, and $R(M_{12}) \subseteq R(M_{11})$.

Proof.

  1. Forward implication. Define $V(x,y) \coloneqq (1/2)(x,y)'M(x,y)$, and assume $M \geq 0$. Expanding $V$ using the partitioned matrix

     $$\begin{aligned} V(x,y) &= (1/2)\begin{bmatrix} x \\ y \end{bmatrix}' \begin{bmatrix} M_{11} & M_{12} \\ M_{12}' & M_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \\ &= (1/2)\big(x'M_{11}x + 2x'M_{12}y + y'M_{22}y\big) \\ &\geq 0, \quad \text{for all } (x,y) \end{aligned} \tag{6}$$

     Setting $y = 0$ in Eq. 6 implies that $M_{11} \geq 0$. Since $M_{11} \geq 0$, $V(x,y)$ is a differentiable, convex function of $x$ for any $y$. Therefore $\min_x V(x,y)$ has a solution for every $y$: if it did not, Proposition 5 would give $M_{12}y \notin R(M_{11})$, so $M_{12}y$ would have a nonzero orthogonal projection $q$ onto $N(M_{11})$, and $V(-\lambda q, y) = -\lambda |q|^{2} + (1/2)y'M_{22}y \rightarrow -\infty$ as $\lambda \rightarrow \infty$ would contradict $V \geq 0$. Proposition 5 then implies $M_{12}y \in R(M_{11})$ for every $y$, which is equivalent to $R(M_{12}) \subseteq R(M_{11})$. Substituting the minimizer over $x$, $x^{0} = -M_{11}^{+}M_{12}y$, into $V$ gives

     $$V(x^{0}, y) = (1/2)y'(M_{22} - M_{12}'M_{11}^{+}M_{12})y \tag{7}$$

     and since $V(x,y) \geq 0$ for all $(x,y)$, we have that $M_{22} - M_{12}'M_{11}^{+}M_{12} \geq 0$, and the forward implication is established.

  2. Reverse implication. Assume $M_{11} \geq 0$, $M_{22} - M_{12}'M_{11}^{+}M_{12} \geq 0$, and $R(M_{12}) \subseteq R(M_{11})$, and we establish that $M \geq 0$. For proof by contradiction, assume there exists an $(\overline{x}, \overline{y})$ such that $(\overline{x}, \overline{y})'M(\overline{x}, \overline{y}) < 0$. By Proposition 5, we know that $\min_x V(x, \overline{y})$ exists since $M_{11} \geq 0$ and $M_{12}\overline{y} \in R(M_{11})$, and it has value $V^{0} = V(x^{0}, \overline{y})$ with $x^{0} = -M_{11}^{+}M_{12}\overline{y}$. Substituting this into $V$ gives $V^{0} = (1/2)\overline{y}'(M_{22} - M_{12}'M_{11}^{+}M_{12})\overline{y} \geq 0$ because matrix $M_{22} - M_{12}'M_{11}^{+}M_{12} \geq 0$. By optimality of $x^{0}$, $V(x, \overline{y}) \geq V^{0} \geq 0$ for all $x$. But that contradicts $V(\overline{x}, \overline{y}) < 0$, and we conclude $M \geq 0$, and the proof is complete. ∎

Note that

\[
\begin{bmatrix}M_{11}&M_{12}\\ M'_{12}&M_{22}\end{bmatrix}\geq 0\quad\text{if and only if}\quad\begin{bmatrix}M_{22}&M'_{12}\\ M_{12}&M_{11}\end{bmatrix}\geq 0
\]

So we can also conclude that $M\geq 0$ if and only if $M_{11}\geq 0$, $M_{22}\geq 0$, $M_{22}-M'_{12}M_{11}^{+}M_{12}\geq 0$, $M_{11}-M_{12}M_{22}^{+}M'_{12}\geq 0$, $R(M_{12})\subseteq R(M_{11})$, and $R(M_{12}')\subseteq R(M_{22})$. Note also that given the partitioning in $M$, we define

\begin{align}
\tilde{M}_{11} &\coloneqq M_{22}-M_{12}'M_{11}^{+}M_{12} \notag\\
\tilde{M}_{22} &\coloneqq M_{11}-M_{12}M_{22}^{+}M'_{12} \tag{8}
\end{align}

and $\tilde{M}_{11}$ is known as the Schur complement of $M_{11}$, and $\tilde{M}_{22}$ is known as the Schur complement of $M_{22}$.
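The positive semidefiniteness test above can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `is_psd` and `schur_psd_test`, the tolerances, and both example matrices are hypothetical choices made for illustration and do not come from the text. The first example has a singular $M_{11}$, so the pseudoinverse is genuinely needed; the second shows a matrix that passes the two semidefiniteness conditions but fails the range condition.

```python
import numpy as np

def is_psd(S, tol=1e-9):
    """Return True if the symmetric matrix S is positive semidefinite (up to tol)."""
    S = 0.5 * (S + S.T)  # symmetrize to suppress round-off asymmetry
    return bool(np.min(np.linalg.eigvalsh(S)) >= -tol)

def schur_psd_test(M, n1, tol=1e-9):
    """Test M >= 0 via M11 >= 0, the Schur complement of M11, and R(M12) in R(M11)."""
    M11, M12, M22 = M[:n1, :n1], M[:n1, n1:], M[n1:, n1:]
    M11_pinv = np.linalg.pinv(M11)
    schur = M22 - M12.T @ M11_pinv @ M12  # \tilde{M}_{11} in Eq. 8
    # R(M12) is contained in R(M11) iff projecting M12 onto R(M11) leaves it unchanged
    range_ok = bool(np.allclose(M11 @ M11_pinv @ M12, M12, atol=1e-8))
    return is_psd(M11, tol) and is_psd(schur, tol) and range_ok

# Hypothetical example 1: M >= 0 with singular M11 = diag(1, 0)
M_psd = np.array([[1.0, 0.0, 1.0],
                  [0.0, 0.0, 0.0],
                  [1.0, 0.0, 1.0]])

# Hypothetical example 2: M11 = 0 and Schur complement = 0, but R(M12) is not in R(M11)
M_bad = np.array([[0.0, 1.0],
                  [1.0, 0.0]])

result_psd = schur_psd_test(M_psd, n1=2)
result_bad = schur_psd_test(M_bad, n1=1)
print(result_psd, is_psd(M_psd))
print(result_bad, is_psd(M_bad))
```

In the second example the range condition is what rules out $M\geq 0$, which is why it cannot be dropped when $M_{11}$ is singular.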

Constraints and Lagrangians.

Next we require a standard optimization result for using a Lagrangian to reformulate a constrained minimization as an unconstrained minmax problem. The following result will be useful for this purpose. Let $U\subseteq\mathbb{R}^{n}$ be a nonempty compact set and $V(\cdot):U\rightarrow\mathbb{R}$ be a continuous function on $U$. Define the Lagrangian function $L(\cdot):U\times\mathbb{R}\rightarrow\mathbb{R}$ as

\[
L(u,\lambda)=V(u)-\lambda\rho(u,U) \tag{9}
\]

where $\rho(\cdot):\mathbb{R}^{n}\times U\rightarrow\mathbb{R}_{\geq 0}$ is any convenient continuous indicator function that evaluates to zero if and only if $u\in U$. Denote a minmax problem as

\[
\inf_{u}\sup_{\lambda}L(u,\lambda)
\]

When a solution to this problem exists, we define the optimal value $L^{*}$ and solution set $u^{*}$ as

\[
L^{*}=\min_{u}\max_{\lambda}L(u,\lambda)\qquad u^{*}=\arg\min_{u}\max_{\lambda}L(u,\lambda)
\]

It is convenient in the subsequent development to define the maximizer of the inner problem

\[
\overline{\lambda}(u)\coloneqq\arg\max_{\lambda}L(u,\lambda),\quad u\in U
\]
Proposition 7 (Constrained minimization and Lagrangian minmax).

Let $U\subseteq\mathbb{R}^{n}$ be a nonempty compact set and $V(\cdot):U\rightarrow\mathbb{R}$ be a continuous function on $U$, and $L(\cdot):U\times\mathbb{R}\rightarrow\mathbb{R}$ be defined as $L(u,\lambda)\coloneqq V(u)-\lambda\rho(u,U)$. Consider the constrained optimization problem

\[
\inf_{u\in U}V(u) \tag{10}
\]

and the (unconstrained) Lagrangian minmax problem

\[
\inf_{u}\sup_{\lambda}L(u,\lambda) \tag{11}
\]

  1. Solutions to both problems exist.

  2. Let $V^{0}$ be the solution and $u^{0}$ be the set of optimizers of $\min_{u\in U}V(u)$. Let $L^{*}$ be the solution and $u^{*}$ the set of optimizers of $\min_{u}\max_{\lambda}L(u,\lambda)$. Then

     \[
     V^{0}=L^{*}\qquad u^{0}=u^{*}\qquad\overline{\lambda}(u^{*})=\mathbb{R}
     \]
Proof.

The solution to Eq. 10 exists by the Weierstrass theorem. Denote the optimal value $V^{0}$ and solution set $u^{0}\subseteq U$ which satisfy $V(u^{0})=V^{0}$. We show that a solution to Eq. 11 also exists. Consider the inner supremum. From the definitions of functions $L$ and $\rho$, we conclude

\[
\sup_{\lambda}L(u,\lambda)=\begin{cases}V(u),&\quad u\in U\\ +\infty,&\quad u\notin U\end{cases}
\]

Then consider the outer infimum. We have that

\[
L^{*}=\inf_{u}\sup_{\lambda}L(u,\lambda)=\inf_{u\in U}V(u)=\min_{u\in U}V(u)=V^{0}
\]

So the solution to Eq. 11 exists with value $L^{*}=V^{0}$. Taking the argument gives

\[
u^{*}\coloneqq\arg\inf_{u}\sup_{\lambda}L(u,\lambda)=\arg\min_{u\in U}V(u)=u^{0}
\]

For the inner problem evaluated at $u^{*}$, we note that

\[
\sup_{\lambda}L(u^{*},\lambda)=\sup_{\lambda}V(u^{0})=\max_{\lambda}V^{0}=V^{0}
\]

Taking the argument then gives

\[
\overline{\lambda}(u^{*})=\arg\sup_{\lambda}L(u^{*},\lambda)=\arg\max_{\lambda}V^{0}=\mathbb{R}
\]

and the result is established. ∎
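Proposition 7 can be illustrated on a scalar example. The sketch below assumes NumPy and uses hypothetical data that are not from the text: $U=[-1,1]$, $V(u)=(u-2)^2$ extended to all of $\mathbb{R}$, and $\rho(u,U)$ taken as the distance from $u$ to $U$. Because the true inner supremum is $+\infty$ outside $U$, the code truncates the multiplier to $|\lambda|\leq\lambda_{\max}$ (the parameter `lam_max`, also an assumption of this sketch), which is enough to push the minimizer back into $U$ in this example.

```python
import numpy as np

# Hypothetical example data (not from the text): U = [-1, 1], V(u) = (u - 2)^2
def V(u):
    return (u - 2.0) ** 2

def rho(u):
    """Distance from u to U = [-1, 1]; zero if and only if u is in U."""
    return np.maximum(np.abs(u) - 1.0, 0.0)

def inner_sup(u, lam_max):
    """Supremum of L(u, lam) = V(u) - lam * rho(u) over |lam| <= lam_max.

    Since rho(u) >= 0, the supremum is attained at lam = -lam_max.
    """
    return V(u) + lam_max * rho(u)

u_grid = np.arange(-300, 301) / 100.0   # grid on [-3, 3], covers points outside U
in_U = rho(u_grid) == 0.0

# Constrained problem (10): minimize V over U
V0 = np.min(V(u_grid[in_U]))
u0 = u_grid[in_U][np.argmin(V(u_grid[in_U]))]

# Truncated version of the Lagrangian minmax problem (11)
vals = inner_sup(u_grid, lam_max=10.0)
L_star = np.min(vals)
u_star = u_grid[np.argmin(vals)]

# At u_star the Lagrangian does not depend on lam, so every lam is a maximizer
lams = np.array([-5.0, 0.0, 5.0])
L_at_star = V(u_star) - lams * rho(u_star)

print(V0, L_star, u0, u_star, L_at_star)
```

The two optimal values and the two minimizers coincide, and `L_at_star` is constant in $\lambda$, which is the numerical counterpart of $\overline{\lambda}(u^{*})=\mathbb{R}$.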

Minmax and Maxmin

More generally, we are interested in a function $V(u,w)$, $V:U\times W\rightarrow\mathbb{R}$, and the optimization problems

\[
\inf_{u\in U}\sup_{w\in W}V(u,w)\qquad\sup_{w\in W}\inf_{u\in U}V(u,w)
\]

We assume in the following that the $\inf$ and $\sup$ are achieved on the respective sets and replace them with $\min$ and $\max$.

Continuous functions.

We start with von Neumann's minimax theorem (von Neumann, 1928), stated here as given on Wikipedia.

Theorem 8 (Minimax Theorem).

Let $U\subset\mathbb{R}^{m}$ and $W\subset\mathbb{R}^{n}$ be compact convex sets. If $V:U\times W\to\mathbb{R}$ is a continuous function that is convex-concave, i.e., $V(\cdot,w):U\to\mathbb{R}$ is convex for all $w\in W$, and $V(u,\cdot):W\to\mathbb{R}$ is concave for all $u\in U$,
then we have that

\[
\min_{u\in U}\max_{w\in W}V(u,w)=\max_{w\in W}\min_{u\in U}V(u,w)
\]

Note that existence of min and max is guaranteed by compactness of $U,W$ (closed, bounded). Also note that the following holds for any continuous function $V$

\[
\min_{u\in U}\max_{w\in W}V(u,w)\geq\max_{w\in W}\min_{u\in U}V(u,w)
\]

This is often called weak duality. It’s easy to establish. We are regarding the switching of the order of min and max as a form of duality. (Think of observability and controllability as duals of each other.)
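One standard way to establish weak duality is sketched here; the derivation is not taken from the text and uses only the definitions of min and max. The first line holds pointwise, the second takes the maximum over $w$ on the left (the right-hand side does not depend on $w$), and the third takes the minimum over $u$ on the right (the left-hand side does not depend on $u$).

```latex
\begin{align*}
\min_{u'\in U} V(u',w) &\leq V(u,w) \leq \max_{w'\in W} V(u,w')
  && \text{for all } u\in U,\ w\in W \\
\max_{w\in W}\min_{u'\in U} V(u',w) &\leq \max_{w'\in W} V(u,w')
  && \text{for all } u\in U \\
\max_{w\in W}\min_{u\in U} V(u,w) &\leq \min_{u\in U}\max_{w\in W} V(u,w)
\end{align*}
```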

So when this inequality achieves equality, that’s often called strong duality. So the minimax theorem says that continuous functions that are convex-concave on compact sets satisfy strong duality. When strong duality is not achieved, we refer to the difference as the duality gap, which is positive due to weak duality

\[
\min_{u\in U}\max_{w\in W}V(u,w)-\max_{w\in W}\min_{u\in U}V(u,w)>0
\]
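A small numerical example shows a strictly positive duality gap. The sketch below assumes NumPy and a hypothetical function that is not from the text, $V(u,w)=(u-w)^2$ on $U=W=[0,1]$. This $V$ is continuous and convex in $u$, but it is convex rather than concave in $w$, so the minimax theorem does not apply. The min and max are approximated on a uniform grid.

```python
import numpy as np

# Hypothetical example (not from the text): V(u, w) = (u - w)^2 on U = W = [0, 1]
u = np.arange(0, 101) / 100.0
w = np.arange(0, 101) / 100.0
Vgrid = (u[:, None] - w[None, :]) ** 2   # Vgrid[i, j] = V(u[i], w[j])

minmax = np.min(np.max(Vgrid, axis=1))   # min over u of (max over w)
maxmin = np.max(np.min(Vgrid, axis=0))   # max over w of (min over u)
gap = minmax - maxmin                    # duality gap, nonnegative by weak duality

print(minmax, maxmin, gap)
```

Here the minimizing player can do no better than $u=1/2$, giving the value $1/4$, while the minimizer moving second can always match $w$ and achieve $0$.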

Saddle Points.

In characterizing solutions of these problems, it is useful to define a saddle point of the function $V(u,w)$.

Definition 9 (Saddle point).

The point (set) $(u^{*},w^{*})\subseteq U\times W$ is called a saddle point (set) of $V(\cdot)$ if

\[
V(u^{*},w)\leq V(u^{*},w^{*})\leq V(u,w^{*})\quad\text{for all }u\in U,\ w\in W \tag{12}
\]
Proposition 10 (Saddle-point theorem).

The point (set) $(u^{*},w^{*})\subseteq U\times W$ is a saddle point (set) of function $V(\cdot)$ if and only if strong duality holds and $(u^{*},w^{*})$ is a solution to the two problems

\begin{align}
&\min_{u\in U}\max_{w\in W}V(u,w)=\max_{w\in W}\min_{u\in U}V(u,w)=V(u^{*},w^{*}) \tag{13}\\
&u^{*}=\arg\min_{u\in U}\max_{w\in W}V(u,w)\qquad w^{*}=\arg\max_{w\in W}\min_{u\in U}V(u,w) \tag{14}
\end{align}

In the following development it is convenient to define the solutions to the minimization and maximization problems

\[
\overline{V}(u)\coloneqq\max_{w\in W}V(u,w),\quad u\in U\qquad\underline{V}(w)\coloneqq\min_{u\in U}V(u,w),\quad w\in W \tag{15}
\]

Note that Eq. 14 implies that $\max_{w\in W}\underline{V}(w)=\underline{V}(w^{*})$ and $\min_{u\in U}\overline{V}(u)=\overline{V}(u^{*})$.

Remark 11.

Note that Eq. 14 also implies that

\[
\max_{w\in W}\min_{u\in U}V(u,w)=\min_{u\in U}V(u,w^{*})\qquad\min_{u\in U}\max_{w\in W}V(u,w)=\max_{w\in W}V(u^{*},w) \tag{16}
\]

To establish this remark, note that

\[
\max_{w\in W}\min_{u\in U}V(u,w)=\max_{w\in W}\underline{V}(w)=\underline{V}(w^{*})=\min_{u\in U}V(u,w^{*})
\]

Similarly,

\[
\min_{u\in U}\max_{w\in W}V(u,w)=\min_{u\in U}\overline{V}(u)=\overline{V}(u^{*})=\max_{w\in W}V(u^{*},w)
\]
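The saddle-point condition, strong duality, and the identities in Remark 11 can be checked together on a grid. The sketch below assumes NumPy and a hypothetical convex-concave function that is not from the text, $V(u,w)=u^2-w^2+uw$ on $U=W=[-1,1]$; the names `V_upper` and `V_lower` stand for $\overline{V}$ and $\underline{V}$.

```python
import numpy as np

# Hypothetical convex-concave example (not from the text) on U = W = [-1, 1]
def V(u, w):
    return u**2 - w**2 + u * w

u = np.arange(-100, 101) / 100.0
w = np.arange(-100, 101) / 100.0
Vgrid = V(u[:, None], w[None, :])        # Vgrid[i, j] = V(u[i], w[j])

V_upper = np.max(Vgrid, axis=1)          # \overline{V}(u): max over w
V_lower = np.min(Vgrid, axis=0)          # \underline{V}(w): min over u

i_star = int(np.argmin(V_upper))         # index of u* = arg min max
j_star = int(np.argmax(V_lower))         # index of w* = arg max min
u_star, w_star = u[i_star], w[j_star]
V_star = Vgrid[i_star, j_star]

# Strong duality: minmax equals maxmin equals V(u*, w*)
strong_duality = bool(np.isclose(V_upper[i_star], V_lower[j_star])
                      and np.isclose(V_upper[i_star], V_star))

# Saddle-point inequality: V(u*, w) <= V(u*, w*) <= V(u, w*) on the grid
saddle = bool(np.all(Vgrid[i_star, :] <= V_star)
              and np.all(V_star <= Vgrid[:, j_star]))

# Remark 11: maxmin = min_u V(u, w*) and minmax = max_w V(u*, w)
remark_ok = bool(np.isclose(V_lower[j_star], np.min(Vgrid[:, j_star]))
                 and np.isclose(V_upper[i_star], np.max(Vgrid[i_star, :])))

print(u_star, w_star, strong_duality, saddle, remark_ok)
```

For this example the grid search returns the origin as both the minmax and maxmin solution, and the origin satisfies the saddle-point inequality, consistent with Proposition 10.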

Next we prove Proposition 10.

Proof.

First we establish that Eq. 14 implies Eq. 12. Note that by optimality, the first equality in Eq. 16, which is a consequence of assuming Eq. 14, implies that $V(u^{*},w^{*})\leq V(u,w^{*})$ for all $u\in U$, and the second implies that $V(u^{*},w^{*})\geq V(u^{*},w)$ for all $w\in W$. Taken together, these are Eq. 12.

Next we show that Eq. 12 implies Eq. 14. We know that the following holds by weak duality

\[
\max_{w\in W}\min_{u\in U}V(u,w)\leq\min_{u\in U}\max_{w\in W}V(u,w) \tag{17}
\]

So we wish to show that the reverse inequality also holds to establish strong duality, i.e., the first equality in Eq. 14. To that end note that from Eq. 12

\[
V(u^{*},w)\leq V(u,w^{*})\quad\text{for all }w\in W,\ u\in U
\]

Since this holds for all $w\in W$, it also holds for a maximizer, and therefore

\[
\max_{w\in W}V(u^{*},w)\leq V(u,w^{*})\quad\text{for all }u\in U
\]

The left-hand side will not be larger if instead of evaluating at $u=u^{*}\in U$, we minimize over all $u\in U$, giving

\[
\min_{u\in U}\max_{w\in W}V(u,w)\leq V(u,w^{*})\quad\text{for all }u\in U
\]

Now if this inequality holds for all $u\in U$, it also holds for the minimizer on the right-hand side so that

\[
\min_{u\in U}\max_{w\in W}V(u,w)\leq\min_{u\in U}V(u,w^{*})
\]

We can only increase the value of the right-hand side if instead of evaluating at $w=w^{*}\in W$, we maximize over all $w\in W$, giving

\[
\min_{u\in U}\max_{w\in W}V(u,w)\leq\max_{w\in W}\min_{u\in U}V(u,w)
\]

Note that this is the weak duality inequality Eq. 17 written in the reverse direction, so combining with weak duality, we have that

\[
\min_{u\in U}\max_{w\in W}V(u,w)=\max_{w\in W}\min_{u\in U}V(u,w)
\]

and strong duality is established.

We next show that $u^{*}$ solves the minmax problem. From the defined optimizations in Eq. 21 we have that

\begin{align}
\min_{u\in U}\max_{w\in W}V(u,w) &=\min_{u\in U}\overline{V}(u) \tag{18}\\
\max_{w\in W}\min_{u\in U}V(u,w) &=\max_{w\in W}\underline{V}(w) \tag{19}
\end{align}

Next choose an arbitrary $u_1\in U$ and assume for contradiction that $\overline{V}(u_1)<\overline{V}(u^*)$. From the definition of $\overline{V}$ we then have that

\[
\max_{w\in W}V(u_1,w)<\max_{w\in W}V(u^*,w)
\]

Therefore, since $w^*\in W$,

\[
V(u_1,w^*)<\max_{w\in W}V(u^*,w)
\]

But from the saddle-point condition, Eq. 12, $\max_{w\in W}V(u^*,w)\leq V(u,w^*)$ for all $u\in U$, which contradicts the previous inequality since $u_1\in U$. Therefore $\overline{V}(u_1)\geq\overline{V}(u^*)$, and since $u_1$ is an arbitrary element of $U$, $u^*$ solves the minmax problem Eq. 18.

Similarly we can show that $w^*$ solves the maxmin problem Eq. 19 by exchanging the variables $w$ and $u$ and the operations $\max$ and $\min$. Therefore $(w^*,u^*)$ solves Eq. 14, and we have established that Eq. 12 implies Eq. 14. ∎
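As a numerical illustration of the argument above, the following minimal sketch checks strong duality by brute force. The function `V`, the sets $U=W=[-1,1]$, and the grid are assumptions chosen here for illustration (they do not come from the text); this `V` is convex in $u$, concave in $w$, and has a saddle point at $(0,0)$.

```python
# Brute-force check that min-max equals max-min for a function with a saddle point.
# Assumption (not from the text): V(u, w) = u^2 + u*w - w^2 on U = W = [-1, 1],
# discretized on a uniform grid that contains the saddle point (0, 0).

def V(u, w):
    return u * u + u * w - w * w

grid = [i / 10 for i in range(-10, 11)]  # 21 points, includes 0.0 exactly

# Outer objectives: Vbar(u) = max_w V(u, w) and Vunder(w) = min_u V(u, w)
Vbar = {u: max(V(u, w) for w in grid) for u in grid}
Vunder = {w: min(V(u, w) for u in grid) for w in grid}

minmax = min(Vbar.values())    # min_u max_w V
maxmin = max(Vunder.values())  # max_w min_u V

u_star = min(grid, key=lambda u: Vbar[u])    # minimizer of the outer min
w_star = max(grid, key=lambda w: Vunder[w])  # maximizer of the outer max

print(minmax, maxmin, u_star, w_star)
```

On a grid this only tests the discretized problem, but it shows the structure of the proof: the outer optimizers are obtained from $\overline{V}$ and $\underline{V}$, and the two optimal values coincide at the saddle point.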

In the following development it is convenient to define the solutions to the inner minimization and maximization problems

\begin{align}
\underline{u}^0(w) &:= \arg\min_{u\in U}V(u,w),\quad w\in W \tag{20}\\
\overline{w}^0(u) &:= \arg\max_{w\in W}V(u,w),\quad u\in U \tag{21}
\end{align}

Note that these inner solution sets are too “large” in the following sense. Even if we evaluate them at the optimizers of their respective outer problems, we know only that

\[
u^*\subseteq\underline{u}^0(w^*)\qquad w^*\subseteq\overline{w}^0(u^*)
\]

and these subsets may be strict. So we have to exercise some care when we exploit strong duality and want to extract the optimizer from a dual problem. We shall illustrate this issue in the upcoming results.

Quadratic functions.

In control problems, we minimize and maximize over possibly unbounded sets, so we need something other than compactness to guarantee existence of solutions. When we have linear dynamic models and a quadratic stage cost (LQ), we can use the following results for quadratic functions.

Proposition 12 (Saddle-point theorem for quadratic functions).

Consider the quadratic function $V(\cdot):\mathbb{R}^{n+m}\rightarrow\mathbb{R}$

\[
V(u,w)\coloneqq(1/2)\begin{bmatrix}u\\ w\end{bmatrix}'\begin{bmatrix}M_{11}&M_{12}\\ M_{12}'&M_{22}\end{bmatrix}\begin{bmatrix}u\\ w\end{bmatrix}+\begin{bmatrix}u\\ w\end{bmatrix}'\begin{bmatrix}d_1\\ d_2\end{bmatrix}
\]

with $M_{22}\in\mathbb{R}^{n\times n}\leq 0$, $M_{11}\in\mathbb{R}^{m\times m}\geq 0$, $M_{12}\in\mathbb{R}^{m\times n}$, $d\in\mathbb{R}^{n+m}$.

  1. A solution to $\min_u\max_w V$ exists if and only if $d\in R(M)$. Similarly, a solution to $\max_w\min_u V$ exists if and only if $d\in R(M)$.

  2. For $d\in R(M)$, strong duality holds so that

    \[
    \min_u\max_w V(u,w)=\max_w\min_u V(u,w)=V(u^*,w^*)
    \]

    where $(u^*,w^*)$ are saddle points of the function $V$, satisfying

    \[
    \begin{bmatrix}u^*\\ w^*\end{bmatrix}\in-M^{+}d+N(M)\qquad V(u^*,w^*)=-(1/2)d'M^{+}d \tag{22}
    \]

    and $dV(u,w)/d(u,w)=0$ at $(u^*,w^*)$.

  3. For $d\in R(M)$, let $\underline{u}^0(w)\coloneqq\arg\min_u V(u,w)$ and $\overline{w}^0(u)\coloneqq\arg\max_w V(u,w)$. The solution sets and saddle points satisfy the following relationships

    \begin{align}
    u^* &=\arg\min_u\max_w V(u,w), & u^* &\subseteq\underline{u}^0(w^*), \tag{23}\\
    w^* &=\arg\max_w\min_u V(u,w), & w^* &\subseteq\overline{w}^0(u^*). \tag{24}
    \end{align}
Proof.

First we establish that $(u^*,w^*)$ satisfy (22) by analyzing the $\min_u\max_w V$ problem. We assume $d\in R(M)$ and expand $V(\cdot)$ as

\[
V(u,w)=(1/2)w'M_{22}w+w'(M_{12}'u+d_2)+(1/2)u'M_{11}u+u'd_1 \tag{25}
\]

From Proposition 5, $\max_w V$ exists if and only if $M_{12}'u+d_2\in R(M_{22})$. This condition is satisfied for some nonempty set of $u$ by the bottom half of $d\in R(M)$. For such $u$ we have the necessary and sufficient condition for the optimum

\[
M_{22}\overline{w}^0+M_{12}'u+d_2=0 \tag{26}
\]

which defines an implicit function $\overline{w}^0(u)$, and optimal value given by (5)

\begin{align*}
\overline{w}^0(u) &= -M_{22}^{+}(M_{12}'u+d_2)+N(M_{22})\\
V(u,\overline{w}^0(u)) &= (1/2)u'\tilde{M}_{22}u+u'(d_1-M_{12}M_{22}^{+}d_2)-(1/2)d_2'M_{22}^{+}d_2
\end{align*}

where $\tilde{M}_{22}$ is the Schur complement of $M_{22}$ defined in (8). Note that $\tilde{M}_{22}\geq 0$ since $M_{11}\geq 0$ and $M_{22}\leq 0$, which implies $M_{22}^{+}\leq 0$. However, we cannot simply set the derivative to zero because we require $M_{12}'u+d_2\in R(M_{22})$ for the existence of $V(u,\overline{w}^0(u))=\max_w V(u,w)$. To handle this range constraint, we use a linear equality constraint $M_{12}'u+d_2=M_{22}y$ where $y$ is a slack variable.
Under the equality constraint, the problem $\min_u\max_w V$ is equivalent to the following constrained minimization:

\[
\min_{u,y}V(u,\overline{w}^0(u))\qquad\textnormal{subject to}\qquad M_{12}'u+d_2=M_{22}y.
\]

Because $V(u,\overline{w}^0(u))$ is convex and differentiable in $(u,y)$, its minimum subject to the affine constraint $M_{12}'u+d_2=M_{22}y$ is achieved by the stationary points of the Lagrangian (Boyd and Vandenberghe, 2004, pp. 141–142):

\[
L(u,y,\lambda):=V(u,\overline{w}^0(u))+\lambda'(M_{12}'u+d_2-M_{22}y).
\]

Taking derivatives, we have that $(u^*,y^*,\lambda^*)$ is a stationary point if and only if

\begin{align}
\tilde{M}_{22}u^*+d_1-M_{12}M_{22}^{+}d_2+M_{12}\lambda^* &= 0 \tag{27}\\
-M_{22}\lambda^* &= 0 \tag{28}\\
M_{12}'u^*+d_2-M_{22}y^* &= 0. \tag{29}
\end{align}

Substituting Eq. 29 into Eq. 27, we have

\[
M_{11}u^*+M_{12}(\lambda^*-M_{22}^{+}M_{22}y^*)+d_1=0.
\]

Next, Eq. 28 implies $\lambda^*\in N(M_{22})$, so we can rewrite Eq. 29 as

\[
M_{12}'u^*+M_{22}(\lambda^*-M_{22}^{+}M_{22}y^*)+d_2=0.
\]

With $w^*:=\lambda^*-M_{22}^{+}M_{22}y^*$, we have the system Eqs. 29 and 27 as

\[
\begin{bmatrix}M_{11}&M_{12}\\ M_{12}'&M_{22}\end{bmatrix}\begin{bmatrix}u^*\\ w^*\end{bmatrix}+\begin{bmatrix}d_1\\ d_2\end{bmatrix}=0 \tag{30}
\]

which has solutions since $d\in R(M)$, and moreover, any such solution gives $y^*=-M_{22}^{+}M_{22}w^*$ and $\lambda^*=w^*+y^*$ satisfying Eqs. 27, 28 and 29. In fact, $w^*\subseteq\overline{w}^0(u^*)$ is implied by Eqs. 28 and 29. Finally, solving Eq. 30 and substituting the solution into $V(u,w)$ gives Eq. 22.
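The final substitution can be spelled out as follows. This is a short sketch that uses only the symmetry of $M$ and $d\in R(M)$; the shorthand $z^*$ and $n$ is introduced here for convenience and is not notation from the text.

```latex
% Shorthand (introduced here): z^* := [u^*; w^*], and n is any element of N(M).
\begin{align*}
M z^* + d = 0,\ d \in R(M)
  &\;\Longrightarrow\; z^* = -M^{+} d + n, \quad n \in N(M), \\
V(z^*) = \tfrac{1}{2} z^{*\prime} M z^* + z^{*\prime} d
  &= -\tfrac{1}{2} z^{*\prime} d + z^{*\prime} d
   = \tfrac{1}{2} d' z^* \\
  &= -\tfrac{1}{2} d' M^{+} d + \tfrac{1}{2} d' n
   = -\tfrac{1}{2} d' M^{+} d .
\end{align*}
% The term d'n vanishes because M is symmetric, so R(M) is orthogonal to N(M).
```

In particular, every saddle point in the set $-M^{+}d+N(M)$ yields the same optimal value.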

To solve the $\max_w\min_u V$ problem, take the negative of the objective to obtain $\max_w\min_u V=-\min_w\max_u(-V)$. Therefore the exact same procedure can be used here, and it produces the same solutions and optimal values Eq. 22, along with $u^*\subseteq\underline{u}^0(w^*)$.

Finally, note that if $d\notin R(M)$, we have no solution to $\max_w V(u,w)$ for any $u$, and therefore no solution to $\min_u\max_w V(u,w)$. Similarly we have no solution to $\min_u V(u,w)$ for any $w$, and therefore no solution to $\max_w\min_u V(u,w)$. We have thus established all the claims of the proposition. ∎
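As a quick numerical sanity check of Eq. 22 and of the saddle-point property, the sketch below works through a scalar instance in exact rational arithmetic. The data ($M_{11}=2$, $M_{12}=1$, $M_{22}=-1$, $d=(1,2)$) are assumed here for illustration only; $M$ is invertible in this instance, so $M^{+}=M^{-1}$ and $N(M)=\{0\}$, and the sketch does not exercise the singular case.

```python
from fractions import Fraction as F

# Assumed illustrative data (not from the text): scalar u and w with
# M11 = 2 >= 0, M22 = -1 <= 0, M12 = 1, and d = (1, 2).
# M is invertible here, so the pseudoinverse M^+ equals M^{-1}.
M11, M12, M22 = F(2), F(1), F(-1)
d1, d2 = F(1), F(2)

def V(u, w):
    quad = M11 * u * u + 2 * M12 * u * w + M22 * w * w
    return quad / 2 + u * d1 + w * d2

# Solve M [u; w] = -d by Cramer's rule (the stationarity system, Eq. 30).
det = M11 * M22 - M12 * M12
u_star = (-d1 * M22 + d2 * M12) / det
w_star = (-d2 * M11 + d1 * M12) / det

# Optimal value from Eq. 22: -(1/2) d' M^{-1} d = (1/2) d' [u*; w*].
value_eq22 = (d1 * u_star + d2 * w_star) / 2

# Saddle-point inequalities V(u*, w) <= V(u*, w*) <= V(u, w*) on a test grid.
grid = [F(i, 4) for i in range(-20, 21)]
is_saddle = all(
    V(u_star, w) <= V(u_star, w_star) <= V(u, w_star)
    for u in grid
    for w in grid
)

print(u_star, w_star, V(u_star, w_star), value_eq22, is_saddle)
```

Exact fractions are used so that the comparison between the direct evaluation of $V$ and the closed-form value is not affected by rounding.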

Applying Proposition 12 to the following example with $M_{11}=M_{22}=0$ and $M_{12}=1$

\[
M=\begin{bmatrix}0&1\\ 1&0\end{bmatrix}\qquad V(u,w)=uw+\begin{bmatrix}u\\ w\end{bmatrix}'d
\]

gives $(u^*,w^*)=-(d_2,d_1)$, $V(u^*,w^*)=-d_1d_2$, $\overline{w}^0(u^*)=\mathbb{R}$, $\underline{u}^0(w^*)=\mathbb{R}$. Note that both functions $\overline{w}^0(\cdot)$ and $\underline{u}^0(\cdot)$ are defined at only a single point, $u^*$ and $w^*$, respectively. So in this degenerate case, these functions are not even differentiable.
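The degenerate example can be checked directly with a few lines of code. In the sketch below, the numerical values of $d$ and the sample points are arbitrary choices made here for illustration; only the form of $V$ comes from the example.

```python
# Degenerate example: M11 = M22 = 0, M12 = 1, so V(u, w) = u*w + u*d1 + w*d2.
# The values of d and the sample points are arbitrary illustrative choices.
d1, d2 = 2, 3

def V(u, w):
    return u * w + u * d1 + w * d2

u_star, w_star = -d2, -d1   # saddle point -(d2, d1)
v_star = V(u_star, w_star)  # should equal -d1*d2

samples = range(-5, 6)

# At u = u*, V does not depend on w, so every w maximizes V(u*, .).
flat_in_w = all(V(u_star, w) == v_star for w in samples)
# At w = w*, V does not depend on u, so every u minimizes V(., w*).
flat_in_u = all(V(u, w_star) == v_star for u in samples)

# For u != u*, V(u, .) is affine in w with nonzero slope u + d2,
# so it is unbounded above and the inner maximum does not exist.
slope_in_w = [V(u, 1) - V(u, 0) for u in samples]
zero_slope_at = [u for u, s in zip(samples, slope_in_w) if s == 0]

print(v_star, flat_in_w, flat_in_u, zero_slope_at)
```

The check makes the point of the example concrete: the inner solution sets at the saddle point are all of $\mathbb{R}$, while away from $u^*$ the inner maximization has no solution at all.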

Lagrangian functions.

The connection between constrained optimization problems, treated via Lagrange multipliers, and game theory problems is useful (Rockafellar, 1993).

For optimization problems of convex type, Lagrange multipliers take on a game-theoretic role that could hardly even have been imagined before the creative insights of von Neumann [32], [33], in applying mathematics to models of social and economic conflict.

–R.T. Rockafellar

Next we are interested in the Lagrangian function L():n+m+1:𝐿superscript𝑛𝑚1L(\cdot):\mathbb{R}^{n+m+1}\rightarrow\mathbb{R}italic_L ( ⋅ ) : blackboard_R start_POSTSUPERSCRIPT italic_n + italic_m + 1 end_POSTSUPERSCRIPT → blackboard_R

\begin{align*}
L(u,w,\lambda) &\coloneqq (1/2)\begin{bmatrix}u\\ w\end{bmatrix}'\begin{bmatrix}M_{11}&M_{12}\\ M_{12}'&M_{22}\end{bmatrix}\begin{bmatrix}u\\ w\end{bmatrix}-(1/2)\lambda(w'w-1)\\
&=(1/2)\begin{bmatrix}u\\ w\end{bmatrix}'\begin{bmatrix}M_{11}&M_{12}\\ M_{12}'&M_{22}-\lambda I\end{bmatrix}\begin{bmatrix}u\\ w\end{bmatrix}+\lambda/2
\end{align*}

with $M\in\mathbb{R}^{(n+m)\times(n+m)}\geq 0$, and $M_{11}\in\mathbb{R}^{m\times m}$, $M_{12}\in\mathbb{R}^{m\times n}$, $M_{22}\in\mathbb{R}^{n\times n}$. Note that from Proposition 6, both $M_{11}\geq 0$ and $M_{22}\geq 0$ as well, so that $\max_w$ is not bounded unless $\lambda$ is large enough to make $M_{22}-\lambda I\leq 0$. The Schur complements of $M_{11}$ and $M_{22}-\lambda I$ are useful for expressing the solution.

\begin{align*}
\tilde{M}_{11}(\lambda) &\coloneqq (M_{22}-\lambda I)-M_{12}'M_{11}^{+}M_{12}\\
\tilde{M}_{22}(\lambda) &\coloneqq M_{11}-M_{12}(M_{22}-\lambda I)^{+}M_{12}'
\end{align*}

Note that both Schur complements depend on the parameter $\lambda$.
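As a minimal numerical sketch of these two definitions, the following NumPy function forms $\tilde{M}_{11}(\lambda)$ and $\tilde{M}_{22}(\lambda)$ from a partitioned matrix. The function name `schur_complements`, the argument `m` (taken here to be the dimension of $u$, so that $M_{11}$ is the leading $m\times m$ block), and the example matrix are illustrative assumptions and do not appear in the original note.

```python
import numpy as np

def schur_complements(M, m, lam):
    """Return (Mt11, Mt22) for a symmetric partitioned M and parameter lam.

    Assumption: the leading m x m block of M is M11 (acting on u) and the
    trailing n x n block is M22 (acting on w).
    """
    M11, M12, M22 = M[:m, :m], M[:m, m:], M[m:, m:]
    n = M22.shape[0]
    S = M22 - lam * np.eye(n)                      # M22 - lam*I
    Mt11 = S - M12.T @ np.linalg.pinv(M11) @ M12   # Schur complement of M11
    Mt22 = M11 - M12 @ np.linalg.pinv(S) @ M12.T   # Schur complement of M22 - lam*I
    return Mt11, Mt22

# Hypothetical 2 x 2 example with scalar u and w.
M = np.array([[2.0, 1.0], [1.0, 1.0]])
Mt11, Mt22 = schur_complements(M, m=1, lam=3.0)
```

The pseudoinverse is used in both places so that the sketch also covers singular $M_{11}$ or singular $M_{22}-\lambda I$, in line with the definitions above.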

Corollary 13 (Minmax and maxmin of a quadratic function with a parameter).

Consider the quadratic function $L(\cdot):\mathbb{R}^{n+m+1}\rightarrow\mathbb{R}$ expressed as

\begin{equation}
L(u,w,\lambda)\coloneqq(1/2)\begin{bmatrix}u\\ w\end{bmatrix}'\begin{bmatrix}M_{11}&M_{12}\\ M_{12}'&M_{22}-\lambda I\end{bmatrix}\begin{bmatrix}u\\ w\end{bmatrix}+\lambda/2 \tag{31}
\end{equation}

  1. A solution to $\min_u\max_w L$ exists if and only if

     \begin{equation}
     \lambda\geq\left|M_{22}\right| \tag{32}
     \end{equation}

     and the solution (set) and optimal value function are

     \begin{align*}
     w^0(u) &\in -(M_{22}-\lambda I)^{+}M_{12}'\,u+N(M_{22}-\lambda I)\\
     u^0 &\in N(\tilde{M}_{22}(\lambda))\\
     L(u^0,w^0(u^0),\lambda) &=\begin{cases}\lambda/2,\quad&\lambda\geq\left|M_{22}\right|\\ +\infty,\quad&\lambda<\left|M_{22}\right|\end{cases}
     \end{align*}

  2. A solution to $\max_w\min_u L$ exists if and only if

     \begin{equation}
     \lambda\geq\left|M_{22}-M_{12}'M_{11}^{+}M_{12}\right| \tag{33}
     \end{equation}

     and the solution (set) and optimal value function are

     \begin{align*}
     u^0(w) &\in -M_{11}^{+}M_{12}\,w+N(M_{11})\\
     w^0 &\in N(\tilde{M}_{11}(\lambda))\\
     L(u^0(w^0),w^0,\lambda) &=\begin{cases}\lambda/2,\quad&\lambda\geq\left|M_{22}-M_{12}'M_{11}^{+}M_{12}\right|\\ +\infty,\quad&\lambda<\left|M_{22}-M_{12}'M_{11}^{+}M_{12}\right|\end{cases}
     \end{align*}

  3. Strong duality holds and the duality gap is zero for $\lambda\geq\left|M_{22}\right|$.

  4. If $\left|M_{22}-M_{12}'M_{11}^{+}M_{12}\right|<\left|M_{22}\right|$, then there is an unbounded duality gap for $\lambda$ in that interval

     \[
     \min_u\max_w L-\max_w\min_u L=+\infty,\quad\text{for }\left|M_{22}-M_{12}'M_{11}^{+}M_{12}\right|\leq\lambda<\left|M_{22}\right|
     \]

Figure 1 illustrates a common outcome for Corollary 13. Note that the proof of Corollary 13 follows the proof of Proposition 14.


Figure 1: The optimal value function $L^0$ for $\min_u\max_w L$ and $\max_w\min_u L$ versus parameter $\lambda$. Strong duality holds only when $\lambda\geq\left|M_{22}\right|$. For $\lambda<\left|M_{22}\right|$, $\min_u\max_w L^0=+\infty$. For $\lambda<\left|M_{22}-M_{12}'M_{11}^{+}M_{12}\right|$, $\max_w\min_u L^0=+\infty$.

Proposition 14 (Minmax and maxmin of a quadratic function with a parameter and a linear term).

Consider the quadratic function $L(\cdot):\mathbb{R}^{n+m+1}\rightarrow\mathbb{R}$ expressed as

\begin{align*}
L(u,w,\lambda) &\coloneqq\frac{1}{2}\begin{bmatrix}u\\ w\end{bmatrix}'\begin{bmatrix}M_{11}&M_{12}\\ M_{12}'&M_{22}\end{bmatrix}\begin{bmatrix}u\\ w\end{bmatrix}+\begin{bmatrix}u\\ w\end{bmatrix}'\begin{bmatrix}d_1\\ d_2\end{bmatrix}-\frac{\lambda}{2}(w'w-1)\\
&=\frac{1}{2}\begin{bmatrix}u\\ w\end{bmatrix}'\underbrace{\begin{bmatrix}M_{11}&M_{12}\\ M_{12}'&M_{22}-\lambda I\end{bmatrix}}_{M(\lambda)}\begin{bmatrix}u\\ w\end{bmatrix}+\begin{bmatrix}u\\ w\end{bmatrix}'\begin{bmatrix}d_1\\ d_2\end{bmatrix}+\frac{\lambda}{2}
\end{align*}

with $M(\lambda=0)\geq 0$.

  1. A solution to $\min_u\max_w L$ exists if and only if

     \begin{equation}
     \lambda\geq\left|M_{22}\right| \tag{34}
     \end{equation}

     and the solution (set) and optimal value function are

     \begin{align*}
     w^0(u,\lambda) &\in -(M_{22}-\lambda I)^{+}(M_{12}'\,u+d_2)+N(M_{22}-\lambda I)\\
     u^0(\lambda) &\in -\tilde{M}_{22}^{+}(\lambda)(d_1-M_{12}(M_{22}-\lambda I)^{+}d_2)+N(\tilde{M}_{22}(\lambda))\\
     L(u^0,w^0(u^0),\lambda) &=\begin{cases}\frac{\lambda}{2}-\frac{1}{2}d'M^{+}(\lambda)d,\quad&\lambda\geq\left|M_{22}\right|\\ +\infty,\quad&\lambda<\left|M_{22}\right|\end{cases}
     \end{align*}

  2. A solution to $\max_w\min_u L$ exists if and only if

     \begin{equation}
     \lambda\geq\left|M_{22}-M_{12}'M_{11}^{+}M_{12}\right| \tag{35}
     \end{equation}

     and the solution (set) and optimal value function are

     \begin{align*}
     u^0(w,\lambda) &\in -M_{11}^{+}(M_{12}\,w+d_1)+N(M_{11})\\
     w^0(\lambda) &\in -\tilde{M}_{11}^{+}(\lambda)(d_2-M_{12}'M_{11}^{+}d_1)+N(\tilde{M}_{11}(\lambda))\\
     L(u^0(w^0),w^0,\lambda) &=\begin{cases}\frac{\lambda}{2}-\frac{1}{2}d'M^{+}(\lambda)d,\quad&\lambda\geq\left|M_{22}-M_{12}'M_{11}^{+}M_{12}\right|\\ +\infty,\quad&\lambda<\left|M_{22}-M_{12}'M_{11}^{+}M_{12}\right|\end{cases}
     \end{align*}

  3. Strong duality holds and the duality gap is zero for $\lambda\geq\left|M_{22}\right|$.

  4. If $\left|M_{22}-M_{12}'M_{11}^{+}M_{12}\right|<\left|M_{22}\right|$, then there is an unbounded duality gap for $\lambda$ in that interval

     \[
     \min_u\max_w L-\max_w\min_u L=+\infty,\quad\text{for }\left|M_{22}-M_{12}'M_{11}^{+}M_{12}\right|\leq\lambda<\left|M_{22}\right|
     \]

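The closed-form expressions in item 1 can be checked numerically on a small example. The following sketch is not taken from the original note: the helper names `minmax_solution`, `lagrangian`, and `value_formula`, the partition argument `m` (dimension of $u$), and the data `M`, `d`, `lam` are all hypothetical choices, with $\lambda$ picked strictly above $\left|M_{22}\right|$ so that $M(\lambda)$ is nonsingular.

```python
import numpy as np

def minmax_solution(M, d, m, lam):
    """Minimum-norm representatives of u0(lam) and w0(u0, lam) from item 1."""
    M11, M12, M22 = M[:m, :m], M[:m, m:], M[m:, m:]
    d1, d2 = d[:m], d[m:]
    S_pinv = np.linalg.pinv(M22 - lam * np.eye(M22.shape[0]))
    Mt22 = M11 - M12 @ S_pinv @ M12.T              # Schur complement Mt22(lam)
    u0 = -np.linalg.pinv(Mt22) @ (d1 - M12 @ S_pinv @ d2)
    w0 = -S_pinv @ (M12.T @ u0 + d2)
    return u0, w0

def lagrangian(M, d, lam, u, w):
    """Evaluate L(u, w, lam) directly from its definition."""
    z = np.concatenate([u, w])
    return 0.5 * z @ M @ z + z @ d - 0.5 * lam * (w @ w - 1.0)

def value_formula(M, d, m, lam):
    """Evaluate lam/2 - (1/2) d' M(lam)^+ d."""
    Mlam = np.array(M, dtype=float)                # copy, so M is not modified
    Mlam[m:, m:] -= lam * np.eye(M.shape[0] - m)   # M(lam)
    return 0.5 * lam - 0.5 * d @ np.linalg.pinv(Mlam) @ d

# Hypothetical data: scalar u and w, |M22| = 1, lam = 3.
M = np.array([[2.0, 1.0], [1.0, 1.0]])
d = np.array([1.0, 1.0])
lam = 3.0
u0, w0 = minmax_solution(M, d, m=1, lam=lam)
L_at_solution = lagrangian(M, d, lam, u0, w0)
L_formula = value_formula(M, d, m=1, lam=lam)
```

For this data the sketch gives $u^0=-0.6$ and $w^0=0.2$, and both `L_at_solution` and `L_formula` evaluate to approximately 1.3, which agrees with the value function stated in item 1.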
Proof.

Expand $L$ as

\[
L(u,w,\lambda)=\frac{1}{2}\big(u'M_{11}u+2u'M_{12}w+w'(M_{22}-\lambda I)w+2u'd_1+2w'd_2+\lambda\big)
\]

  1. 1.

    First note that maxwLsubscript𝑤𝐿\max_{w}Lroman_max start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT italic_L exists if and only if M22λI0subscript𝑀22𝜆𝐼0M_{22}-\lambda I\leq 0italic_M start_POSTSUBSCRIPT 22 end_POSTSUBSCRIPT - italic_λ italic_I ≤ 0. Otherwise maxwL(w,u,λ)=+subscript𝑤𝐿𝑤𝑢𝜆\max_{w}L(w,u,\lambda)=+\inftyroman_max start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT italic_L ( italic_w , italic_u , italic_λ ) = + ∞. And M22λI0subscript𝑀22𝜆𝐼0M_{22}-\lambda I\leq 0italic_M start_POSTSUBSCRIPT 22 end_POSTSUBSCRIPT - italic_λ italic_I ≤ 0 if and only if λ|M22|𝜆subscript𝑀22\lambda\geq\left|M_{22}\right|italic_λ ≥ | italic_M start_POSTSUBSCRIPT 22 end_POSTSUBSCRIPT |, which establishes (34). If this condition is satisfied, from Proposition 5 the solution is

    w0(u)superscript𝑤0𝑢\displaystyle w^{0}(u)italic_w start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ( italic_u ) (M22λI)+(M12u+d2)+N(M22λI)absentsuperscriptsubscript𝑀22𝜆𝐼superscriptsubscript𝑀12𝑢subscript𝑑2𝑁subscript𝑀22𝜆𝐼\displaystyle\in-(M_{22}-\lambda I)^{+}(M_{12}^{\prime}\;u+d_{2})+N(M_{22}-% \lambda I)∈ - ( italic_M start_POSTSUBSCRIPT 22 end_POSTSUBSCRIPT - italic_λ italic_I ) start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT ( italic_M start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT italic_u + italic_d start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) + italic_N ( italic_M start_POSTSUBSCRIPT 22 end_POSTSUBSCRIPT - italic_λ italic_I )
    L(u,w0(u),λ)𝐿𝑢superscript𝑤0𝑢𝜆\displaystyle L(u,w^{0}(u),\lambda)italic_L ( italic_u , italic_w start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ( italic_u ) , italic_λ ) =12(uM~22(λ)u+2u(d1M12(M22λI)+d2)d2(M22λI)+d2+λ)absent12superscript𝑢subscript~𝑀22𝜆𝑢2superscript𝑢subscript𝑑1subscript𝑀12superscriptsubscript𝑀22𝜆𝐼subscript𝑑2subscriptsuperscript𝑑2superscriptsubscript𝑀22𝜆𝐼subscript𝑑2𝜆\displaystyle=\frac{1}{2}\big{(}u^{\prime}\tilde{M}_{22}(\lambda)u+2u^{\prime}% (d_{1}-M_{12}(M_{22}-\lambda I)^{+}d_{2})-d^{\prime}_{2}(M_{22}-\lambda I)^{+}% d_{2}+\lambda\big{)}= divide start_ARG 1 end_ARG start_ARG 2 end_ARG ( italic_u start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT over~ start_ARG italic_M end_ARG start_POSTSUBSCRIPT 22 end_POSTSUBSCRIPT ( italic_λ ) italic_u + 2 italic_u start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( italic_d start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT - italic_M start_POSTSUBSCRIPT 12 end_POSTSUBSCRIPT ( italic_M start_POSTSUBSCRIPT 22 end_POSTSUBSCRIPT - italic_λ italic_I ) start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT italic_d start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) - italic_d start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( italic_M start_POSTSUBSCRIPT 22 end_POSTSUBSCRIPT - italic_λ italic_I ) start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT italic_d start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT + italic_λ )

    Since M22λI0subscript𝑀22𝜆𝐼0M_{22}-\lambda I\leq 0italic_M start_POSTSUBSCRIPT 22 end_POSTSUBSCRIPT - italic_λ italic_I ≤ 0, we have that (M22λI)+0superscriptsubscript𝑀22𝜆𝐼0(M_{22}-\lambda I)^{+}\leq 0( italic_M start_POSTSUBSCRIPT 22 end_POSTSUBSCRIPT - italic_λ italic_I ) start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT ≤ 0 as well and therefore M~22(λ)0subscript~𝑀22𝜆0\tilde{M}_{22}(\lambda)\geq 0over~ start_ARG italic_M end_ARG start_POSTSUBSCRIPT 22 end_POSTSUBSCRIPT ( italic_λ ) ≥ 0. Therefore, minuL(u,w0(u))subscript𝑢𝐿𝑢superscript𝑤0𝑢\min_{u}L(u,w^{0}(u))roman_min start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT italic_L ( italic_u , italic_w start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT ( italic_u ) ) exists and the solution from Eq. 5 is

    \[ u^0(\lambda) \in -\tilde{M}_{22}^+(\lambda)\big(d_1 - M_{12}(M_{22}-\lambda I)^+ d_2\big) + N(\tilde{M}_{22}(\lambda)) \]

    Substituting $u^0$ and $w^0(u^0)$ into $L$ gives

    \[ L(u^0, w^0(u^0), \lambda) = \begin{cases} \frac{\lambda}{2} - \frac{1}{2} d'M^+(\lambda)d, \quad & \lambda \geq |M_{22}| \\ +\infty, \quad & \lambda < |M_{22}| \end{cases} \]

    and this part is established.

  2.

    Since $M_{11} \geq 0$, $\min_u L$ exists from Proposition 5 and we have

    \begin{align*}
    u^0(w) &\in -M_{11}^+(M_{12} w + d_1) + N(M_{11}) \\
    L(u^0(w), w, \lambda) &= \frac{1}{2}\big( w'\tilde{M}_{11}(\lambda)w + 2w'(d_2 - M_{12}'M_{11}^+ d_1) - d_1'M_{11}^+ d_1 + \lambda \big)
    \end{align*}

    Next note that $\max_w L(u^0(w), w, \lambda)$ exists if and only if $\tilde{M}_{11}(\lambda) \leq 0$, or

    \begin{align*}
    0 &\geq (M_{22} - \lambda I) - M_{12}'M_{11}^+M_{12} \\
    \lambda I &\geq M_{22} - M_{12}'M_{11}^+M_{12}
    \end{align*}

    and the last inequality is satisfied if and only if $\lambda \geq |M_{22} - M_{12}'M_{11}^+M_{12}|$. If that does not hold, $\max_w L(u^0(w), w, \lambda) = +\infty$, which establishes (34). When this condition is satisfied,

    \[ w^0(\lambda) \in -\tilde{M}_{11}^+(\lambda)(d_2 - M_{12}'M_{11}^+ d_1) + N(\tilde{M}_{11}(\lambda)) \]

    Substituting these values into $L$ gives

    \[ L(u^0(w^0), w^0, \lambda) = \begin{cases} \frac{\lambda}{2} - \frac{1}{2} d'M^+(\lambda)d, \quad & \lambda \geq |M_{22} - M_{12}'M_{11}^+M_{12}| \\ +\infty, \quad & \lambda < |M_{22} - M_{12}'M_{11}^+M_{12}| \end{cases} \]

    and this part is established.

  3.

    Recall from Proposition 6 that $M \geq 0$ implies $M_{11} \geq 0$, $M_{22} \geq 0$, and $M_{22} - M_{12}'M_{11}^+M_{12} \geq 0$. Since $M_{11} \geq 0$, $M_{11}^+ \geq 0$ as well and therefore $M_{12}'M_{11}^+M_{12} \geq 0$. Therefore $0 \leq M_{22} - M_{12}'M_{11}^+M_{12} \leq M_{22}$, which implies $|M_{22} - M_{12}'M_{11}^+M_{12}| \leq |M_{22}|$. So for $\lambda > |M_{22}|$, both $\min_u \max_w L$ and $\max_w \min_u L$ have value $\frac{\lambda}{2} - \frac{1}{2} d'M^+(\lambda)d$, and strong duality holds.

  4.

    If $M$ is such that $|M_{22} - M_{12}'M_{11}^+M_{12}| < |M_{22}|$, then for $\lambda$ satisfying $|M_{22} - M_{12}'M_{11}^+M_{12}| \leq \lambda < |M_{22}|$, we have $\min_u \max_w L = +\infty$ and $\max_w \min_u L = \frac{\lambda}{2} - \frac{1}{2} d'M^+(\lambda)d$, so the duality gap is infinite, which establishes this part. ∎
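
A scalar instance makes the gap in part 4 concrete. This is only a sketch: the explicit form of $L$ written here is inferred from the reduced expressions in the proof (it is an assumption, not a quotation of the proposition), and we take $d = 0$ and $M_{11} = M_{12} = M_{22} = 1$, so that $|M_{22} - M_{12}'M_{11}^+M_{12}| = 0 < 1 = |M_{22}|$.

\begin{align*}
L(u, w, \lambda) &= \frac{1}{2}\big( u^2 + 2uw + (1 - \lambda)w^2 + \lambda \big) = \frac{1}{2}\big( (u + w)^2 - \lambda w^2 + \lambda \big) \\
\max_w L(u, w, \lambda) &= +\infty \quad \text{for every } u \text{ when } 0 \leq \lambda < 1, \text{ since the coefficient of } w^2 \text{ is } (1 - \lambda)/2 > 0 \\
\min_u L(u, w, \lambda) &= \frac{1}{2}\big( \lambda - \lambda w^2 \big) \quad \text{at } u = -w, \qquad \max_w \min_u L = \frac{\lambda}{2}
\end{align*}

Under these assumptions, $\min_u \max_w L = +\infty$ while $\max_w \min_u L = \lambda/2$ for $0 \leq \lambda < 1$, which agrees with $\frac{\lambda}{2} - \frac{1}{2} d'M^+(\lambda)d$ at $d = 0$.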

Note that setting $d = 0$ in the proof of Proposition 14 establishes Corollary 13.

For strong duality to hold for all $\lambda$ such that either problem has a bounded solution requires that $|M_{22}| = |M_{22} - M_{12}'M_{11}^+M_{12}|$. The following example shows that it is not necessary for $M_{12}'M_{11}^+M_{12} = 0$ for this condition to hold.

\[ M_{22} = M_{12} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \qquad M_{11} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = M_{11}^+ \]
\[ M_{22} - M_{12}'M_{11}^+M_{12} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \]

We have that $|M_{22}| = 1$ and $|M_{22} - M_{12}'M_{11}^+M_{12}| = 1$, so the norms are equal but $M_{12}'M_{11}^+M_{12} = M_{11}^+ \neq 0$.
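
The arithmetic in this example can be checked numerically. The following is a minimal sketch in plain Python; the helper names `matmul`, `transpose`, and `sym2_norm` are hypothetical and introduced only for illustration, and the norm is taken to be the spectral norm, computed here from the closed-form eigenvalues of a symmetric 2x2 matrix.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def sym2_norm(S):
    """Spectral norm of a symmetric 2x2 matrix: the largest |eigenvalue|."""
    a, b, c = S[0][0], S[0][1], S[1][1]
    mean = (a + c) / 2.0
    rad = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return max(abs(mean + rad), abs(mean - rad))

M22 = [[1.0, 0.0], [0.0, 1.0]]
M12 = [[1.0, 0.0], [0.0, 1.0]]
# M11 is an orthogonal projection, so its pseudoinverse is M11 itself.
M11_pinv = [[1.0, 0.0], [0.0, 0.0]]

# cross = M12' M11^+ M12 and schur = M22 - cross
cross = matmul(transpose(M12), matmul(M11_pinv, M12))
schur = [[M22[i][j] - cross[i][j] for j in range(2)] for i in range(2)]

norm_M22 = sym2_norm(M22)
norm_schur = sym2_norm(schur)
print(norm_M22, norm_schur, cross)
```

The two norms agree even though `cross` is nonzero, which is the point of the example.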

Constrained quadratic optimization

A mysterious piece of information has been uncovered. In our innocence we thought we were engaged straightforwardly in solving a single problem (P). But we find we’ve assumed the role of Player 1 in a certain game in which we have an adversary, Player 2, whose interests are diametrically opposed to ours!

–R. T. Rockafellar

We next consider maximization of a convex function, for which a constraint is required for a solution even to exist. We establish the following result.

Proposition 15 (Constrained quadratic optimization).

Define the convex quadratic function $V(\cdot): \mathbb{R}^n \rightarrow \mathbb{R}$ and compact constraint set $W$

\[ V(w) \coloneqq (1/2)w'Dw + w'd \qquad W \coloneqq \{w \mid w'w = 1\} \]

with $D \in \mathbb{R}^{n \times n} \geq 0$. Consider the constrained maximization problem

\[ \max_{w \in W} V(w) \tag{36} \]

Define the Lagrangian function

\[ L(w, \lambda) = V(w) - (1/2)\lambda(w'w - 1) \]

and the (unconstrained) Lagrangian problem

\[ \max_w \min_\lambda L(w, \lambda) \tag{37} \]

and the (unconstrained) dual Lagrangian problem

\[ \min_\lambda \max_w L(w, \lambda) \tag{38} \]

  1.

    Solutions to all three problems (36), (37), and (38) exist for all $D \geq 0$ and $d \in \mathbb{R}^n$ with optimal value

    \[ V^0 = L^0 = -(1/2)d'(D - \lambda_P I)^+ d + \lambda_P/2 \]

    where

    \[ \lambda_P \coloneqq \text{the largest real eigenvalue of } P \qquad P \coloneqq \begin{bmatrix} D & I \\ dd' & D \end{bmatrix} \tag{39} \]

  2.

    Problems (37) and (38) satisfy strong duality, and the function $L(w, \lambda)$ has saddle points (sets) $(w^*, \lambda^*)$ given by

    \begin{align*}
    w^* &= \begin{cases} \big( -(D - \lambda_P I)^+ d + N(D - \lambda_P I) \big) \cap W, \quad & \lambda_P = |D| \\ -(D - \lambda_P I)^{-1} d, \quad & \lambda_P > |D| \end{cases} \\
    \lambda^* &= \lambda_P
    \end{align*}

  3.

    The optimizer of (36) is given by $w^0 = w^*$.

  4.

    The optimizer of (37) is given by

    \[ w^0 = w^* \qquad \underline{\lambda}^0(w^0) = \mathbb{R} \]

  5.

    The optimizer of (38) is given by

    \begin{align*}
    \overline{w}^0(\lambda^0) &= \begin{cases} -(D - \lambda_P I)^+ d + N(D - \lambda_P I), \quad & \lambda_P = |D| \\ -(D - \lambda_P I)^{-1} d, \quad & \lambda_P > |D| \end{cases} \\
    \lambda^0 &= \lambda_P
    \end{align*}

  6.

    Additionally, $\lambda_P = |D|$ if and only if (i) $d \in R(D - |D|I)$ and (ii) $|(D - |D|I)^+ d| \leq 1$. If (i) or (ii) does not hold, then $\lambda_P > |D|$ and $|(D - \lambda_P I)^{-1} d| = 1$.
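
The case $\lambda_P > |D|$ can be illustrated numerically. The following is a sketch in plain Python under stated assumptions, not the authors' method: $D$ is taken to be diagonal, $d$ is assumed to have a nonzero component along the largest eigenvalue of $D$ (so condition (i) fails), and $\lambda_P$ is located by bisection on the condition $|(D - \lambda I)^{-1}d| = 1$ from item 6 rather than by an eigenvalue solver. The function names `secular`, `solve_interior_case`, and `det` are hypothetical. The result is compared with a brute-force maximum over the unit circle, and $P - \lambda_P I$ is checked to be numerically singular.

```python
import math

def secular(lam, D_diag, d):
    """|(D - lam I)^{-1} d|^2 - 1 for diagonal D and lam > max(D_diag)."""
    return sum((di / (lam - Di)) ** 2 for Di, di in zip(D_diag, d)) - 1.0

def solve_interior_case(D_diag, d, iters=200):
    """Bisection for lam > |D| with |(D - lam I)^{-1} d| = 1 (assumes (i) fails)."""
    normD = max(D_diag)                      # |D| for diagonal D >= 0
    norm_d = math.sqrt(sum(di * di for di in d))
    # At hi, |(D - hi I)^{-1} d| <= |d| / (hi - |D|) = 1, so secular(hi) <= 0.
    lo, hi = normD, normD + norm_d
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if secular(mid, D_diag, d) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = hi
    w = [di / (lam - Di) for Di, di in zip(D_diag, d)]   # -(D - lam I)^{-1} d
    value = 0.5 * sum(di * wi for di, wi in zip(d, w)) + 0.5 * lam
    return lam, w, value

def det(A):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in A]
    n, result = len(A), 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        if A[p][k] == 0.0:
            return 0.0
        if p != k:
            A[k], A[p] = A[p], A[k]
            result = -result
        result *= A[k][k]
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= f * A[k][j]
    return result

D_diag, d = [2.0, 1.0], [1.0, 1.0]           # illustrative data, n = 2
lam_P, w_star, V0 = solve_interior_case(D_diag, d)

def V(w):
    """V(w) = (1/2) w'Dw + w'd for diagonal D."""
    return (0.5 * sum(Di * wi * wi for Di, wi in zip(D_diag, w))
            + sum(wi * di for wi, di in zip(w, d)))

# Brute-force maximum of V over the unit circle.
V_brute = max(V([math.cos(t), math.sin(t)])
              for t in (2.0 * math.pi * k / 100000 for k in range(100000)))

# Build P - lam_P I with P = [[D, I], [dd', D]].
n = len(d)
P_shift = [[0.0] * (2 * n) for _ in range(2 * n)]
for i in range(n):
    P_shift[i][i] = D_diag[i] - lam_P
    P_shift[i][n + i] = 1.0
    P_shift[n + i][n + i] = D_diag[i] - lam_P
    for j in range(n):
        P_shift[n + i][j] = d[i] * d[j]

print(lam_P, V0, V_brute, det(P_shift))
```

The sketch only confirms that the computed multiplier is an eigenvalue of $P$; it does not check that it is the largest real one, which is what the proposition asserts.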

Figure 2 shows the possible behaviors. The green lines show the case $d \in R(D - |D|I)$ for different $|d|$. For $d = 0$ (bottom green line), $\lambda_P = |D|$ and the optimum is on the boundary. Increasing $|d|$ eventually produces a zero derivative at $\lambda = |D|$ (third green line from bottom). Further increasing $|d|$ makes the derivative at $\lambda = |D|$ negative (top two green lines), so that $\lambda_P > |D|$ (red dots) and the optimum moves to the interior. The blue line shows the case $d \notin R(D - |D|I)$. Here $L$ is unbounded at $\lambda = |D|$, so $\lambda_P > |D|$ (blue dot) and the optimum is again in the interior.

Figure 2: $L(w^0(\lambda), \lambda)$ versus $\lambda$ for the same $D$ but different $d$. Green lines: for $d \in R(D - |D|I)$, $L$ is bounded for all $\lambda \geq |D|$. Green dots: for $|(D - |D|I)^+ d| \leq 1$, the optimum is on the boundary and $\lambda_P = |D|$. Red dots: for $|(D - |D|I)^+ d| > 1$, the optimum is in the interior and $\lambda_P > |D|$. Blue line and dot: for $d \notin R(D - |D|I)$, $L$ is unbounded at $\lambda = |D|$, the optimum is in the interior, and $\lambda_P > |D|$.
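
The slopes referred to in the figure can be written out explicitly. As a sketch, assume $\lambda > |D|$, so that $D - \lambda I$ is invertible and $w^0(\lambda) = -(D - \lambda I)^{-1}d$. Using $\frac{d}{d\lambda}(D - \lambda I)^{-1} = (D - \lambda I)^{-2}$ gives

\begin{align*}
L(w^0(\lambda), \lambda) &= -\frac{1}{2} d'(D - \lambda I)^{-1} d + \frac{\lambda}{2} \\
\frac{d}{d\lambda} L(w^0(\lambda), \lambda) &= \frac{1}{2}\big( 1 - |(D - \lambda I)^{-1} d|^2 \big)
\end{align*}

so a stationary point in $\lambda > |D|$ satisfies $|(D - \lambda I)^{-1}d| = 1$, that is, $w^0(\lambda) \in W$. When $d \in R(D - |D|I)$, the limit $\lambda \downarrow |D|$ replaces the inverse by the pseudoinverse, and the sign of the slope at $\lambda = |D|$ is set by whether $|(D - |D|I)^+ d|$ is below, equal to, or above 1.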

To organize the proof of this proposition, we treat the Lagrangian, dual Lagrangian, and saddle-point problems in separate lemmas, and then combine them. We start with the dual Lagrangian minmax problem. We shall find that all of the information about $\lambda_P$ emerges from this problem.

Lemma 16 (Dual Lagrangian of constrained quadratic optimization).

Consider the dual Lagrangian problem

\[ \min_\lambda \max_w L(w, \lambda) \qquad L(w, \lambda) \coloneqq (1/2)w'Dw + w'd - (1/2)\lambda(w'w - 1) \tag{40} \]

with $D \geq 0$. We have the following results.

  1.

    This problem is equivalent to

    \[ \min_{\lambda \geq |D|} \max_w L(w, \lambda) \]

  2.

    The solution exists for all $D$ and $d$ and has optimal value

    \[ L^0 = -(1/2)d'(D - \lambda_P I)^+ d + \lambda_P/2 \tag{41} \]

    where λPsubscript𝜆𝑃\lambda_{P}italic_λ start_POSTSUBSCRIPT italic_P end_POSTSUBSCRIPT and matrix P2n×2n𝑃superscript2𝑛2𝑛P\in\mathbb{R}^{2n\times 2n}italic_P ∈ blackboard_R start_POSTSUPERSCRIPT 2 italic_n × 2 italic_n end_POSTSUPERSCRIPT are defined as

    λPthe largest real eigenvalue of PP[DIddD]formulae-sequencesubscript𝜆𝑃the largest real eigenvalue of P𝑃matrix𝐷𝐼𝑑superscript𝑑𝐷\lambda_{P}\coloneqq\;\text{the largest real eigenvalue of $P$}\qquad P% \coloneqq\begin{bmatrix}D&I\\ dd^{\prime}&D\end{bmatrix}italic_λ start_POSTSUBSCRIPT italic_P end_POSTSUBSCRIPT ≔ the largest real eigenvalue of italic_P italic_P ≔ [ start_ARG start_ROW start_CELL italic_D end_CELL start_CELL italic_I end_CELL end_ROW start_ROW start_CELL italic_d italic_d start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_CELL start_CELL italic_D end_CELL end_ROW end_ARG ] (42)
  3.

    The optimal $\lambda$ and $w^{0}(\lambda)$ are given by

    \begin{align*}
    \lambda^{0} &= \lambda_P\in[|D|,\infty) \\
    w^{0}(\lambda^{0}) &= \begin{cases} -(D-|D|I)^{+}d+N(D-|D|I), & \lambda_P=|D| \\ -(D-\lambda_P I)^{-1}d, & \lambda_P\in(|D|,\infty) \end{cases}
    \end{align*}
  4.

    We have that $\lambda_P=|D|$ if and only if (i) $d\in R(D-|D|I)$, and (ii) $\left|(D-|D|I)^{+}d\right|\leq 1$. Otherwise $\lambda_P>|D|$. If (i) is violated, $L(w^{0}(\lambda),\lambda)=+\infty$ at $\lambda=|D|$. If (i) holds but (ii) is violated, then $L(w^{0}(\lambda),\lambda)$ is finite at $\lambda=|D|$, but $(d/d\lambda)L(w^{0}(\lambda),\lambda)<0$ at $\lambda=|D|$.
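As a numerical illustration of statements 2 and 3 (not part of the lemma), the following Python sketch builds $P$ as in (42), takes its largest real eigenvalue, and evaluates (41). The use of NumPy, the function names `largest_real_eigenvalue`, `dual_solution`, and `dual_function`, the tolerance used to decide that an eigenvalue is real, and the test data are all assumptions made for this sketch.

```python
import numpy as np


def largest_real_eigenvalue(P, tol=1e-8):
    """Largest eigenvalue of P whose imaginary part is numerically zero.

    The tolerance is an assumption of this sketch, not part of the lemma.
    """
    eigvals = np.linalg.eigvals(P)
    scale = max(1.0, float(np.max(np.abs(eigvals))))
    real_parts = eigvals.real[np.abs(eigvals.imag) <= tol * scale]
    return float(np.max(real_parts))


def dual_solution(D, d):
    """Return (lambda_P, L0) following (41) and (42) for symmetric D >= 0."""
    n = D.shape[0]
    I = np.eye(n)
    P = np.block([[D, I], [np.outer(d, d), D]])
    lam_P = largest_real_eigenvalue(P)
    L0 = -0.5 * d @ np.linalg.pinv(D - lam_P * I) @ d + 0.5 * lam_P
    return lam_P, float(L0)


def dual_function(D, d, lam):
    """Value of max over w of L(w, lam); only valid for lam > |D|."""
    I = np.eye(D.shape[0])
    return float(-0.5 * d @ np.linalg.solve(D - lam * I, d) + 0.5 * lam)


# Hypothetical test data: |D| = 2 and d is not orthogonal to the top eigenvector.
D = np.diag([2.0, 1.0])
d = np.array([1.0, 1.0])

lam_P, L0 = dual_solution(D, d)
w0 = -np.linalg.solve(D - lam_P * np.eye(2), d)

print("lambda_P =", lam_P)               # expected to exceed |D| = 2
print("|w0|     =", np.linalg.norm(w0))  # expected to be close to 1
print("L0       =", L0)
```

For this data condition (i) of statement 4 fails, so the sketch should land in the interior case, where the stationarity condition in $\lambda$ forces $|w^{0}|=1$. Comparing `dual_function` on a grid of $\lambda>|D|$ against `L0` is one simple way to see that $\lambda_P$ minimizes the outer problem.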

Proof.

To establish statement 1 in the lemma, note that if $\lambda<|D|$, then $D-\lambda I$ has at least one positive eigenvalue, so $L(w,\lambda)$ is unbounded above in $w$ and $\max_{w}L(w,\lambda)=+\infty$. So adding the constraint $\lambda\geq|D|$ to the outer minimization does not alter the solution.

To establish statements 2–4 in the lemma, we make use of the SVD of matrix $D=UMU'$, which we partition as

\[ D=\begin{bmatrix} U_1 & U_2 \end{bmatrix}\begin{bmatrix} |D|I_p & \\ & M_2 \end{bmatrix}\begin{bmatrix} U_1' \\ U_2' \end{bmatrix} \]

where $p$ is the multiplicity of the largest eigenvalue of $D$, $1\leq p\leq n$. Also denote $y=U'd$, $y_1=U_1'd$, $y_2=U_2'd$.

We break the problem into two cases.

  1.

    Case $\lambda_P\in(|D|,\infty)$.

    For $\lambda\in(|D|,\infty)$, we have from Proposition 5 that $w^{0}(\lambda)=-(D-\lambda I)^{-1}d$ and $L(w^{0}(\lambda),\lambda)=-(1/2)d'(D-\lambda I)^{-1}d+\lambda/2$. The function $L(w^{0}(\lambda),\lambda)$ is differentiable, and taking two derivatives gives

    \begin{align*}
    \frac{dL}{d\lambda} &= (1/2)\left(1-d'(D-\lambda I)^{-2}d\right) \\
    \frac{d^{2}L}{d\lambda^{2}} &= -d'(D-\lambda I)^{-3}d
    \end{align*}

    Setting the first derivative to zero yields

    \begin{align*}
    0 &= 1-d'(D-\lambda I)^{-2}d \tag{43} \\
    &= \det\left(1-d'(D-\lambda I)^{-2}d\right) \\
    &= \det\left(I-dd'(D-\lambda I)^{-2}\right) \\
    &= \det\left((D-\lambda I)-dd'(D-\lambda I)^{-1}\right)\det\left((D-\lambda I)^{-1}\right)
    \end{align*}

    where we have used the fact that $\det(I+AB)=\det(I+BA)$. Since $\det(D-\lambda I)\neq 0$, we can multiply both sides of the last equality by $\det(D-\lambda I)^{2}$ to obtain

    \begin{align*}
    0 &= \det\left((D-\lambda I)-dd'(D-\lambda I)^{-1}\right)\det(D-\lambda I) \\
    &= \det\left(\begin{bmatrix} D-\lambda I & I \\ dd' & D-\lambda I \end{bmatrix}\right)=\det(P-\lambda I)
    \end{align*}

    where we have used the partitioned determinant formula, which is valid since $D-\lambda I$ is nonsingular for $\lambda>|D|$. Therefore the first derivative of $L$ vanishes in the interval $(|D|,\infty)$ if and only if there exists a real-valued eigenvalue of $P$ in this interval. Also, we have from Eq. 43 that

    \[ 1=d'(D-\lambda I)^{-2}d=d'U(M-\lambda I)^{-2}U'd=y'(M-\lambda I)^{-2}y \]

    with $y\coloneqq U'd$. So we conclude that $y\neq 0$ for this case. Examining the second derivative, we have

    \[ \frac{d^{2}L}{d\lambda^{2}}=d'(\lambda I-D)^{-3}d=y'(\lambda I-M)^{-3}y \]

    Note that since $\lambda>|D|$, $(\lambda I-M)^{-3}>0$, and since $y\neq 0$, we have that $d^{2}L/d\lambda^{2}>0$ on $(|D|,\infty)$ and therefore $L(w^{0}(\lambda),\lambda)$ is strictly convex on this interval. Therefore the minimizer of $L$ is unique and the first derivative is zero at the solution. We also know that there is only one real eigenvalue of $P$ in this interval due to the uniqueness of the optimal solution. Therefore we have established that $\lambda^{0}=\lambda_P\in(|D|,\infty)$ is the optimal solution. Substituting this solution into $L$ gives

    \[ L^{0}=L(w^{0}(\lambda^{0}),\lambda^{0})=-(1/2)d'(D-\lambda_P I)^{-1}d+\lambda_P/2 \]

    verifying that (41) holds for the first case.

  2.

    Case $\lambda_P\notin(|D|,\infty)$. In this case, we first show that $\lambda^{0}=|D|$. Using the SVD, we have for $\lambda\in(|D|,\infty)$

    \begin{align*}
    L(w^{0}(\lambda),\lambda) &= -(1/2)d'(D-\lambda I)^{-1}d+\lambda/2 \\
    &= -(1/2)\begin{bmatrix} y_1' & y_2' \end{bmatrix}\begin{bmatrix} \frac{1}{|D|-\lambda}I_p & \\ & (M_2-\lambda I)^{-1} \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}+\lambda/2
    \end{align*}

    From this expression for $L$, note that $y_1=U_1'd$ must be zero for this case. Otherwise $\lim_{\lambda\rightarrow|D|^{+}}L(w^{0}(\lambda),\lambda)=+\infty$, and since $\lim_{\lambda\rightarrow+\infty}L(w^{0}(\lambda),\lambda)=+\infty$ as well and $L$ is a smooth function on the interval $(|D|,\infty)$, it would have a minimum (with zero derivative) on that interval. This is a contradiction, because by assumption the derivative does not vanish on that interval. Note that $y_1=U_1'd=0$ is equivalent to $d\in R(D-|D|I)$, which can be seen from the SVD of $D-|D|I$

    \[ D-|D|I=\begin{bmatrix} U_1 & U_2 \end{bmatrix}\begin{bmatrix} 0 & \\ & M_2-|D|I \end{bmatrix}\begin{bmatrix} U_1' \\ U_2' \end{bmatrix} \]

    so the columns of $U_1$ are a basis for $N(D-|D|I)$, and $d$ is orthogonal to the columns of $U_1$, so $d\in R(D-|D|I)$. Substituting $y_1=0$ into the expression for $L(w^{0}(\lambda),\lambda)$ gives

    \[ L(w^{0}(\lambda),\lambda)=-(1/2)d'(D-\lambda I)^{+}d+\lambda/2,\qquad \lambda\geq|D|,\; d\in R(D-|D|I) \tag{44} \]

    and $L(w^{0}(\lambda),\lambda)$ is smooth on the interval including the left boundary, $[|D|,\infty)$, and the optimizer must be on the boundary, $\lambda^{0}=|D|$. For this value of $\lambda$, the inner maximization over $w$ gives from Proposition 5

    \[ w^{0}(\lambda^{0})=-(D-|D|I)^{+}d+N(D-|D|I) \]

    and evaluating $L^{0}$ gives

    \begin{align*}
    L(w^{0}(\lambda^{0}),\lambda^{0}) &= -(1/2)y_2'(M_2-|D|I)^{-1}y_2+|D|/2 \\
    &= -(1/2)d'(D-|D|I)^{+}d+|D|/2 \tag{45}
    \end{align*}

    verifying Eq. 41 for this case.

    Taking the derivative of Eq. 44 and evaluating at $\lambda=|D|$ gives

    \[ (d/d\lambda)L(w^{0}(\lambda),\lambda)=(1/2)\left(1-d'\left((D-|D|I)^{+}\right)^{2}d\right)=(1/2)\left(1-\left|(D-|D|I)^{+}d\right|^{2}\right) \]

    which is non-negative if and only if $\left|(D-|D|I)^{+}d\right|\leq 1$. Otherwise the derivative at the boundary is negative and the optimal $\lambda$ is in the interval $(|D|,\infty)$, which is the previous case. Therefore $\lambda^{0}=|D|$ if and only if

    \[ d\in R(D-|D|I),\quad \left|(D-|D|I)^{+}d\right|\leq 1 \]

    Next we show that $|D|$ is an eigenvalue of $P$ in this case. Factoring $P-\lambda I$ gives

    \begin{align*}
    P-\lambda I &= \begin{bmatrix} U & \\ & U \end{bmatrix}\begin{bmatrix} U'DU-\lambda I & I \\ U'dd'U & U'DU-\lambda I \end{bmatrix}\begin{bmatrix} U' & \\ & U' \end{bmatrix} \\
    &= \begin{bmatrix} U & \\ & U \end{bmatrix}\begin{bmatrix} (|D|-\lambda)I_p & & I & \\ & M_2-\lambda I & & I \\ y_1y_1' & y_1y_2' & (|D|-\lambda)I_p & \\ y_2y_1' & y_2y_2' & & M_2-\lambda I \end{bmatrix}\begin{bmatrix} U' & \\ & U' \end{bmatrix}
    \end{align*}

    Since the leading and trailing matrices are inverses of each other, we have a similarity transformation, so $P-\lambda I$ is singular if and only if the inner matrix is singular. Setting $y_1=0$ in the inner matrix and setting $\lambda=|D|$ gives a zero third block row, so the inner matrix is singular. Therefore $\lambda=|D|$ is an eigenvalue of $P$. Since there are no real eigenvalues of $P$ in $(|D|,\infty)$, we have that $|D|$ is the largest real eigenvalue of $P$ for this case, and we have established that $\lambda^{0}=\lambda_P$ also for this case.
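The two determinant steps used in the first case combine into the identity $\det(P-\lambda I)=\det(D-\lambda I)^{2}\left(1-d'(D-\lambda I)^{-2}d\right)$ for $\lambda>|D|$. The Python sketch below spot-checks this identity at a few sample points. It is a sanity check only, not part of the proof; the use of NumPy, the function names `det_shifted_P` and `det_via_stationarity`, and the test data are assumptions of the sketch.

```python
import numpy as np

# Hypothetical test data: symmetric D >= 0 with eigenvalues 1 and 3, so |D| = 3.
D = np.array([[2.0, 1.0], [1.0, 2.0]])
d = np.array([1.0, 0.5])
n = D.shape[0]
I = np.eye(n)
P = np.block([[D, I], [np.outer(d, d), D]])


def det_shifted_P(lam):
    """Left-hand side: det(P - lam*I) for the 2n x 2n matrix P."""
    return float(np.linalg.det(P - lam * np.eye(2 * n)))


def det_via_stationarity(lam):
    """Right-hand side: det(D - lam*I)^2 * (1 - d'(D - lam*I)^{-2} d)."""
    A = D - lam * I
    v = np.linalg.solve(A, d)  # v = (D - lam*I)^{-1} d; A is symmetric
    return float(np.linalg.det(A) ** 2 * (1.0 - v @ v))


for lam in (3.5, 4.0, 6.0):  # sample points with lam > |D|
    lhs, rhs = det_shifted_P(lam), det_via_stationarity(lam)
    print(f"lam={lam}: det(P - lam I)={lhs:.6f}, formula={rhs:.6f}")
```

Because the second factor on the right is twice the derivative $dL/d\lambda$, a sign change of either side on $(|D|,\infty)$ locates the stationary point, which is why a real eigenvalue of $P$ in that interval and a zero derivative occur together.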

Summarizing, we have broken the problem into two cases. In the first case we have shown that $\lambda^{0}=\lambda_P>|D|$, and $(d/d\lambda)L(w^{0}(\lambda),\lambda)$ is zero at $\lambda=\lambda_P$. In this case there is only one real eigenvalue of $P$ in $(|D|,\infty)$.

In the second case, we have that $\lambda^{0}=\lambda_P=|D|$, the boundary of the feasible set. We have also shown that $d\in R(D-|D|I)$, and $\left|(D-|D|I)^{+}d\right|\leq 1$ for this case. If $d\notin R(D-|D|I)$, then $L(w^{0}(\lambda),\lambda)$ is $+\infty$ at $\lambda=|D|$, which is in the first case. If $d\in R(D-|D|I)$, but $\left|(D-|D|I)^{+}d\right|>1$, then $L(w^{0}(\lambda),\lambda)$ is finite at $\lambda=|D|$, but the derivative is negative, which is again in the first case. Thus we have established statements 2–4 in the lemma and the proof is complete. ∎
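The case split summarized above can be illustrated on a small diagonal example. The Python sketch below checks conditions (i) and (ii) of statement 4 with a pseudoinverse and compares the outcome with the largest real eigenvalue of $P$. The use of NumPy, the helper names `largest_real_eigenvalue`, `lambda_P`, and `boundary_conditions_hold`, the tolerances, and the three test vectors are assumptions of this sketch.

```python
import numpy as np


def largest_real_eigenvalue(P, tol=1e-6):
    """Largest eigenvalue of P with numerically zero imaginary part.

    A loose tolerance is assumed here because |D| is a repeated
    eigenvalue of P in the boundary case, which is harder to resolve.
    """
    eigvals = np.linalg.eigvals(P)
    scale = max(1.0, float(np.max(np.abs(eigvals))))
    return float(np.max(eigvals.real[np.abs(eigvals.imag) <= tol * scale]))


def lambda_P(D, d):
    """Largest real eigenvalue of P = [[D, I], [d d', D]]."""
    n = D.shape[0]
    P = np.block([[D, np.eye(n)], [np.outer(d, d), D]])
    return largest_real_eigenvalue(P)


def boundary_conditions_hold(D, d, tol=1e-9):
    """Check (i) d in R(D - |D| I) and (ii) |(D - |D| I)^+ d| <= 1."""
    norm_D = float(np.linalg.eigvalsh(D)[-1])  # largest eigenvalue of D >= 0
    S = D - norm_D * np.eye(D.shape[0])
    S_pinv = np.linalg.pinv(S, rcond=1e-10)
    in_range = np.linalg.norm(S @ S_pinv @ d - d) <= tol  # condition (i)
    bounded = np.linalg.norm(S_pinv @ d) <= 1.0 + tol     # condition (ii)
    return bool(in_range and bounded)


D = np.diag([2.0, 1.0])  # hypothetical data with |D| = 2
cases = {
    "(i) and (ii) hold": np.array([0.0, 0.5]),
    "(ii) violated": np.array([0.0, 2.0]),
    "(i) violated": np.array([1.0, 0.0]),
}
results = {name: (boundary_conditions_hold(D, d), lambda_P(D, d))
           for name, d in cases.items()}
for name, (on_boundary, lam) in results.items():
    print(f"{name}: boundary conditions hold = {on_boundary}, lambda_P = {lam:.6f}")
```

For this data the first vector should give $\lambda_P\approx 2=|D|$, while the other two should give $\lambda_P\approx 3>|D|$, consistent with statement 4.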

Next we turn to the saddle points.

Lemma 17 (Saddle points of the Lagrangian of constrained quadratic optimization).

The following $(w^{*},\lambda^{*})$ are saddle points of $L(w,\lambda)\coloneqq(1/2)w'Dw+w'd-(1/2)\lambda(w'w-1)$.

\begin{align*}
w^{*} &= \begin{cases} \left(-(D-|D|I)^{+}d+N(D-|D|I)\right)\cap W, & \lambda_P=|D| \\ -(D-\lambda_P I)^{-1}d, & \lambda_P>|D| \end{cases} \\
\lambda^{*} &= \lambda_P
\end{align*}
Proof.

From the definition of a saddle point we need to establish the inequalities

$$L(w, \lambda^*) \le L(w^*, \lambda^*) \le L(w^*, \lambda)$$

hold for all $w \in \mathbb{R}^n$ and $\lambda \in \mathbb{R}$.

Taking the second inequality first, we have that $L(w^*, \lambda) = (1/2)(w^*)' D w^* + (w^*)' d$ for all $\lambda$ since $(w^*)' w^* = 1$. Therefore $L(w^*, \lambda) = L(w^*, \lambda^*)$ for all $\lambda$ and the second inequality is established with equality.

Turning to the first inequality, we consider two cases: (i) $\lambda^* = \lambda_P > |D|$, and (ii) $\lambda^* = \lambda_P = |D|$ and $d \in R(D - |D|I)$. For the first case we know that

$$\max_w L(w, \lambda_P) = -(1/2) d' (D - \lambda_P I)^{-1} d + \lambda_P / 2 = L(w^*, \lambda_P) = L(w^*, \lambda^*)$$

where the maximization over $w$ is unconstrained, and the first equality follows from Proposition 5. Since this equality holds for the maximizer, we have that $L(w, \lambda^*) = L(w, \lambda_P) \le L(w^*, \lambda^*)$ for all $w \in \mathbb{R}^n$, where the first equality comes from the definition of $\lambda^*$. Thus the first inequality holds for the first case.

Turning to the second case, we have similarly from Proposition 5

$$\max_w L(w, |D|) = -(1/2) d' (D - |D| I)^+ d + |D| / 2 = L(w^*, |D|) = L(w^*, \lambda^*)$$

where again the maximization over $w$ is unconstrained, and we have $L(w, \lambda^*) = L(w, |D|) \le L(w^*, \lambda^*)$ for all $w \in \mathbb{R}^n$. Thus both cases satisfy the first inequality, both inequalities have been established, and the result is proven. ∎
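To make the null-space term in case (ii) concrete, here is a small worked example. The data $D = \mathrm{diag}(1, 0)$ and $d = (0, 1/2)'$ are hypothetical values chosen for illustration; the steps only use the stationarity condition of $L(\cdot, \lambda^*)$ and the constraint $w'w = 1$.

```latex
\begin{align*}
D &= \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad
d = \begin{bmatrix} 0 \\ 1/2 \end{bmatrix}, \quad
|D| = 1, \quad
D - |D| I = \begin{bmatrix} 0 & 0 \\ 0 & -1 \end{bmatrix} \\
d &\in R(D - |D| I), \qquad |(D - |D| I)^+ d| = 1/2 \le 1
  \quad \Rightarrow \quad \lambda^* = \lambda_P = |D| = 1 \\
% stationarity of L(w, lambda^*) in w, combined with the unit-norm constraint
(D - |D| I) w^* &= -d, \quad (w^*)' w^* = 1
  \quad \Rightarrow \quad
  w^* = \begin{bmatrix} \pm \sqrt{3}/2 \\ 1/2 \end{bmatrix} \\
L(w^*, \lambda^*) &= V(w^*) = \tfrac{1}{2} \cdot \tfrac{3}{4} + \tfrac{1}{2} \cdot \tfrac{1}{2}
  = \tfrac{5}{8} = -\tfrac{1}{2} d' (D - |D| I)^+ d + \tfrac{|D|}{2}
\end{align*}
```

As a direct check, writing $w_2 = s$ on $W$ gives $V = (1 - s^2)/2 + s/2$, which is maximized at $s = 1/2$, so both points above attain the constrained maximum $5/8$.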

Finally, we address the original constrained optimization of the convex quadratic function and its Lagrangian.

Lemma 18 (Constrained quadratic optimization and its Lagrangian).

We are given the following: (i) a convex quadratic function and compact constraint set $W$ defined in Proposition 15

$$V(w) \coloneqq (1/2) w' D w + w' d \qquad W \coloneqq \{w \mid w' w = 1\}$$

with $D \in \mathbb{R}^{n \times n} \ge 0$, (ii) the constrained maximization problem Eq. 36

$$\max_{w \in W} V(w)$$

with Lagrangian function

$$L(w, \lambda) = V(w) - (1/2) \lambda (w' w - 1)$$

and the (unconstrained) Lagrangian problem Eq. 37

$$\max_w \min_\lambda L(w, \lambda)$$
  1. Solutions to (36) and (37) exist for all $D \ge 0$ and $d \in \mathbb{R}^n$ and achieve the same optimal value $V^0 = L^0 = -(1/2) d' (D - \lambda_P I)^+ d + \lambda_P / 2$.

  2. The optimizer of Eq. 36, denoted $w^0$, is given by $w^0 = w^*$, where $w^*$ is the saddle-point solution set from Lemma 17.

  3. The optimizer of Eq. 37, denoted $(w_L^0, \underline{\lambda}(w_L^0))$, is given by $w_L^0 = w^*$ and $\underline{\lambda}(w_L^0) = \mathbb{R}$.

Proof.

The solution to (36) exists since $V(\cdot)$ is continuous and $W$ is compact. The solution to Eq. 37 exists and satisfies strong duality with (38) due to the saddle-point theorem (Proposition 12) and Lemma 17, so we have that

$$L^0 = \max_w \min_\lambda L = \min_\lambda \max_w L = -(1/2) d' (D - \lambda_P I)^+ d + \lambda_P / 2$$

where the last equality follows by (41). From the saddle-point theorem, we also have that $w_L^0 = w^*$ for the optimizer of the Lagrangian. Since $w_L^0$ satisfies $(w_L^0)' w_L^0 = 1$, it follows that $\underline{\lambda}^0(w_L^0) = \mathbb{R}$. Finally, we have that value $V^0 = L^0$ and set $w^0 = w_L^0$ by Proposition 7, and the proof is complete. ∎

The proofs of Lemma 16, Lemma 17, and Lemma 18 have proven Proposition 15.
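The result can be checked numerically in the simplest setting. The following is a minimal sketch, assuming a diagonal $D \ge 0$ and a $d$ with a nonzero component along a largest diagonal entry of $D$, so that $d \notin R(D - |D|I)$ and therefore $\lambda_P > |D|$. The function name `solve_trust_region_diag` and the bisection on $|(D - \lambda I)^{-1} d| = 1$ (the stationarity condition of the dual in $\lambda$) are illustrative choices, not the authors' algorithm.

```python
import math


def solve_trust_region_diag(D, d, iterations=200):
    """Maximize (1/2) w'Dw + w'd over w'w = 1 for D = diag(D) with D >= 0.

    Sketch for the case lambda_P > |D| only: d must have a nonzero entry
    where D attains its largest value (an assumption of this example).
    """
    d_max = max(D)
    if not any(Di == d_max and di != 0.0 for Di, di in zip(D, d)):
        raise ValueError("boundary case lambda_P = |D| may occur; not handled here")

    def norm_w(lam):
        # |w(lam)| with w(lam) = -(D - lam I)^{-1} d, from stationarity of L in w
        return math.sqrt(sum((di / (lam - Di)) ** 2 for Di, di in zip(D, d)))

    # norm_w decreases from +inf at lam = |D| to at most 1 at lam = |D| + |d|
    lo = d_max
    hi = d_max + math.sqrt(sum(di * di for di in d))
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:  # bracket has collapsed to float precision
            break
        if norm_w(mid) > 1.0:
            lo = mid
        else:
            hi = mid

    lam = hi
    w = [di / (lam - Di) for Di, di in zip(D, d)]
    # optimal value -(1/2) d'(D - lam I)^{-1} d + lam/2
    value = 0.5 * sum(di * di / (lam - Di) for Di, di in zip(D, d)) + 0.5 * lam
    return w, lam, value


# Hypothetical data: D = diag(2, 1), d = (1, 1)
w, lam, value = solve_trust_region_diag([2.0, 1.0], [1.0, 1.0])
print(w, lam, value)
```

Bisection is used here only because $|w(\lambda)|$ is monotone on $\lambda > |D|$; the boundary case $\lambda_P = |D|$ requires the pseudoinverse and null-space construction of Proposition 15 and is deliberately left out of this sketch.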

Discussion of Proposition 15.

The basic problem of constrained, nonconvex quadratic optimization has appeared in several fields. In the optimization literature it is known as the “trust-region” problem. Nocedal and Wright discuss the numerical solution of the trust-region problem in the context of nonlinear programming (Nocedal and Wright, 2006, p. 69). Boyd and Vandenberghe establish strong duality of the Lagrangian and dual Lagrangian formulations of the problem (Boyd and Vandenberghe, 2004, Appendix B). The complete solution provided in Proposition 15 appears to be new to this work. The authors would also like to acknowledge Robin Strässer for his work on earlier versions of Proposition 15 (Mannini et al., 2024).

Constrained minmax and maxmin.

To compactly state the results in this section it is convenient to define two functions

$$M(\lambda) \coloneqq \begin{bmatrix} M_{11} & M_{12} \\ M_{12}' & M_{22} - \lambda I \end{bmatrix} \qquad L(\lambda) \coloneqq -(1/2) d' M^+(\lambda) d + \lambda / 2$$

and for convenience, we repeat the definition of the Schur complements

$$\tilde{M}_{11}(\lambda) \coloneqq (M_{22} - \lambda I) - M_{12}' M_{11}^+ M_{12} \qquad \tilde{M}_{22}(\lambda) \coloneqq M_{11} - M_{12} (M_{22} - \lambda I)^+ M_{12}'$$

Corollary 19 (Minmax and maxmin of constrained quadratic functions).

Consider quadratic function $V(\cdot): \mathbb{R}^{n+m} \rightarrow \mathbb{R}$ and compact constraint set $W$

$$V(u, w) \coloneqq \frac{1}{2} \begin{bmatrix} u \\ w \end{bmatrix}' \begin{bmatrix} M_{11} & M_{12} \\ M_{12}' & M_{22} \end{bmatrix} \begin{bmatrix} u \\ w \end{bmatrix} \qquad W \coloneqq \{w \mid w' w = 1\}$$

with $M \in \mathbb{R}^{(n+m) \times (n+m)} \ge 0$, $M_{22} \in \mathbb{R}^{n \times n}$, $M_{11} \in \mathbb{R}^{m \times m}$, $M_{12} \in \mathbb{R}^{m \times n}$, and the two constrained optimization problems.

$$\min_u \max_{w \in W} V(u, w) \tag{46}$$

$$\max_{w \in W} \min_u V(u, w) \tag{47}$$

  1. The solution to (46) is

    $$
    \begin{aligned}
    V(u^0, w^0(u^0)) &= \frac{\lambda^0}{2} \\
    u^0 &\in N(\tilde{M}_{22}(\lambda^0)) \\
    w^0(u^0) &\in \big( -(M_{22} - \lambda^0 I)^+ M_{12}' \, u^0 + N(M_{22} - \lambda^0 I) \big) \cap W
    \end{aligned}
    $$

    with $\lambda^0 = |M_{22}|$.

  2. The solution to (47) is

    $$
    \begin{aligned}
    V(u^0(w^0), w^0) &= \frac{\lambda^0}{2} \\
    w^0 &\in \big( N(\tilde{M}_{11}(\lambda^0)) \big) \cap W \\
    u^0(w^0) &\in -M_{11}^+ M_{12} \, w^0 + N(M_{11})
    \end{aligned}
    $$

    with $\lambda^0 = |M_{22} - M_{12}' M_{11}^+ M_{12}|$.

The proof of Corollary 19 is given at the end of the proof of Proposition 20.
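The two closed-form values in Corollary 19 can be compared against a brute-force search on a small instance. The sketch below assumes hypothetical data with scalar $u$ ($m = 1$) and $w \in \mathbb{R}^2$ ($n = 2$); the matrix entries, grid sizes, and helper names are illustrative choices, and the largest eigenvalue is computed with the closed form for symmetric $2 \times 2$ matrices.

```python
import math

# Hypothetical data: M = [[M11, M12], [M12', M22]] is diagonally dominant, so M >= 0
M11 = 2.0
M12 = (1.0, 0.5)
M22 = ((3.0, 1.0), (1.0, 2.0))


def largest_eig_2x2(a, b, c):
    """Largest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    return 0.5 * (a + c) + math.hypot(0.5 * (a - c), b)


def V(u, w):
    """V(u, w) = (1/2) [u; w]' M [u; w] for scalar u and w in R^2."""
    cross = u * (M12[0] * w[0] + M12[1] * w[1])
    quad_w = (M22[0][0] * w[0] ** 2 + 2.0 * M22[0][1] * w[0] * w[1]
              + M22[1][1] * w[1] ** 2)
    return 0.5 * M11 * u * u + cross + 0.5 * quad_w


def unit_circle(count):
    """Evenly spaced points on W = {w : w'w = 1} in R^2."""
    return [(math.cos(2.0 * math.pi * k / count), math.sin(2.0 * math.pi * k / count))
            for k in range(count)]


def minmax_brute(u_grid, w_grid):
    return min(max(V(u, w) for w in w_grid) for u in u_grid)


def maxmin_brute(w_grid):
    # inner minimization over scalar u solved exactly: u = -M12 w / M11
    return max(V(-(M12[0] * w[0] + M12[1] * w[1]) / M11, w) for w in w_grid)


# Closed-form values: |M22| / 2 and |M22 - M12' M11^+ M12| / 2
minmax_formula = 0.5 * largest_eig_2x2(M22[0][0], M22[0][1], M22[1][1])
schur = (M22[0][0] - M12[0] ** 2 / M11,
         M22[0][1] - M12[0] * M12[1] / M11,
         M22[1][1] - M12[1] ** 2 / M11)
maxmin_formula = 0.5 * largest_eig_2x2(*schur)

w_grid = unit_circle(1000)
u_grid = [0.02 * i for i in range(-100, 101)]
print("minmax:", minmax_formula, minmax_brute(u_grid, w_grid))
print("maxmin:", maxmin_formula, maxmin_brute(w_grid))
```

The grid search is only accurate to the grid resolution, so the comparison is a sanity check rather than a proof; it also shows that the minmax value is no smaller than the maxmin value, as expected.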

Next we add the linear term to $V(u, w)$, which may seem harmless but actually precludes a closed-form solution as in Corollary 19. Here a nonlinear optimization over scalar $\lambda$ remains.

Proposition 20 (Minmax and maxmin of constrained quadratic functions with linear term).

Consider quadratic function $V(\cdot): \mathbb{R}^{n+m} \rightarrow \mathbb{R}$ and compact constraint set $W$

$$V(u, w) \coloneqq \frac{1}{2} \begin{bmatrix} u \\ w \end{bmatrix}' \begin{bmatrix} M_{11} & M_{12} \\ M_{12}' & M_{22} \end{bmatrix} \begin{bmatrix} u \\ w \end{bmatrix} + \begin{bmatrix} u \\ w \end{bmatrix}' \begin{bmatrix} d_1 \\ d_2 \end{bmatrix} \qquad W \coloneqq \{w \mid w' w = 1\}$$

with $M \in \mathbb{R}^{(n+m) \times (n+m)} \ge 0$, and the two constrained optimization problems.

$$\min_u \max_{w \in W} V(u, w) \tag{48}$$

$$\max_{w \in W} \min_u V(u, w) \tag{49}$$

  1. The solution to (48) is

    $$
    \begin{aligned}
    V(u^0, w^0(u^0)) &= -(1/2) d' M^+(\lambda^0) d + \lambda^0 / 2 \\
    u^0 &\in -\tilde{M}_{22}^+(\lambda^0) \big( d_1 - M_{12} (M_{22} - \lambda^0 I)^+ d_2 \big) + N(\tilde{M}_{22}(\lambda^0)) \\
    w^0(u^0) &\in \big( -(M_{22} - \lambda^0 I)^+ (M_{12}' \, u^0 + d_2) + N(M_{22} - \lambda^0 I) \big) \cap W
    \end{aligned}
    $$

    where $\lambda^0$ must be computed from the following nonlinear optimization problem

    $$\lambda^0 = \arg\min_{\lambda \ge |M_{22}|} L(\lambda)$$

  2. The solution to (49) is

    $$
    \begin{aligned}
    V(u^0(w^0), w^0) &= -(1/2) d' M^+(\lambda^0) d + \lambda^0 / 2 \\
    w^0 &\in \big( -\tilde{M}_{11}^+(\lambda^0) (d_2 - M_{12}' M_{11}^+ d_1) + N(\tilde{M}_{11}(\lambda^0)) \big) \cap W \\
    u^0(w^0) &\in -M_{11}^+ (M_{12} \, w^0 + d_1) + N(M_{11})
    \end{aligned}
    $$

    where $\lambda^0$ must be computed from the following nonlinear optimization problem

    $$\lambda^0 = \arg\min_{\lambda \ge |M_{22} - M_{12}' M_{11}^+ M_{12}|} L(\lambda)$$
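To illustrate the scalar optimization over $\lambda$ in statement 1, here is a minimal sketch for hypothetical scalar data ($n = m = 1$), where $M(\lambda)$ is $2 \times 2$ and $W = \{-1, +1\}$. The numerical values, the search bracket, and the helper names (`minimize_scalar`, `worst_case`) are assumptions made for this example; ternary search is used only because $L(\lambda)$ is unimodal on the bracket for this data.

```python
import math

# Hypothetical scalar data (n = m = 1): M = [[a, b], [b, c]] > 0 and d = (d1, d2)
a, b, c = 2.0, 1.0, 1.0
d1, d2 = 0.5, 1.0


def L(lam):
    """L(lambda) = -(1/2) d' M(lambda)^{-1} d + lambda/2 for the 2x2 matrix M(lambda).

    Uses the explicit 2x2 inverse, so it assumes det M(lambda) != 0.
    """
    e = c - lam
    det = a * e - b * b
    quad = (e * d1 * d1 - 2.0 * b * d1 * d2 + a * d2 * d2) / det  # d' M(lambda)^{-1} d
    return -0.5 * quad + 0.5 * lam


def minimize_scalar(f, lo, hi, iterations=200):
    """Ternary search; assumes f is unimodal on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


# Minimize over lambda >= |M22| = c; the upper bracket c + 10 is an assumption
lam0 = minimize_scalar(L, c, c + 10.0)

# Optimizers from statement 1; the null spaces are trivial for this data
m22_tilde = a - b * b / (c - lam0)  # Schur complement evaluated at lam0
u0 = -(d1 - b * d2 / (c - lam0)) / m22_tilde
w0 = -(b * u0 + d2) / (c - lam0)


def worst_case(u):
    """max over w in {-1, +1} of V(u, w), evaluated directly from the definition."""
    return max(0.5 * a * u * u + b * u * w + 0.5 * c * w * w + u * d1 + w * d2
               for w in (-1.0, 1.0))


# Brute-force minmax over a grid of u values for comparison
brute = min(worst_case(0.001 * i) for i in range(-5000, 5001))
print(lam0, L(lam0), u0, w0, brute)
```

For this data the minimizer lies strictly inside the feasible interval, so $M_{22} - \lambda^0 I$ is invertible and the pseudoinverses reduce to ordinary inverses; a general implementation would need pseudoinverses and the null-space terms.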
Proof.

Expand V𝑉Vitalic_V as

$$V(u, w) = \frac{1}{2} u' M_{11} u + u' M_{12} w + \frac{1}{2} w' M_{22} w + u' d_1 + w' d_2$$

  1. 1.

    Note that from Proposition 7 the optimization problem

    $$\min_u \max_{w \in W} V(u,w)$$

    is equivalent to the Lagrangian problem

    $$\min_u \max_w \min_\lambda L(u,w,\lambda), \qquad L(u,w,\lambda) := V(u,w) - \frac{\lambda}{2}(w'w - 1)$$

    From Proposition 15 strong duality holds for $\max_w \min_\lambda L(u,w,\lambda)$, so the optimization problem is also equivalent to the dual Lagrangian problem

    $$\min_\lambda \min_u \max_w L(u,w,\lambda)$$

    but when we use the dual to obtain a solution set for the inner $\max_w$ problem, which gives $w^0(u,\lambda)$, that solution set may be too large. We fix this issue subsequently. From Proposition 14 a solution to $\min_u \max_w L(u,w,\lambda)$ exists if and only if $\lambda \geq \left|M_{22}\right|$, and the solution (set) and optimal value function are

    $$\begin{aligned}
    w^0(u,\lambda) &\in -(M_{22} - \lambda I)^{+}(M_{12}'\, u + d_2) + N(M_{22} - \lambda I) \\
    u^0(\lambda) &\in -\tilde{M}_{22}^{+}(\lambda)\big(d_1 - M_{12}(M_{22} - \lambda I)^{+} d_2\big) + N(\tilde{M}_{22}(\lambda)) \\
    L(u^0(\lambda), w^0(u^0,\lambda), \lambda) &= \begin{cases} L(\lambda), & \lambda \geq \left|M_{22}\right| \\ +\infty, & \lambda < \left|M_{22}\right| \end{cases}
    \end{aligned}$$

    The remaining optimization, which is equivalent to solving Eq. 48, is

    $$\min_{\lambda \geq \left|M_{22}\right|} L(\lambda) \tag{50}$$

    which establishes that

    $$\lambda^0 := \arg\min_{\lambda \geq \left|M_{22}\right|} L(\lambda)$$

    Substituting $\lambda^0$ into $u^0(\lambda)$, $w^0(u,\lambda)$, and $L(u^0(\lambda), w^0(u^0,\lambda), \lambda)$ gives

    $$\begin{aligned}
    w^0(u) &\in -(M_{22} - \lambda^0 I)^{+}(M_{12}'\, u + d_2) + N(M_{22} - \lambda^0 I) \\
    u^0(\lambda^0) &\in -\tilde{M}_{22}^{+}(\lambda^0)\big(d_1 - M_{12}(M_{22} - \lambda^0 I)^{+} d_2\big) + N(\tilde{M}_{22}(\lambda^0)) \\
    L(u^0, w^0(u^0), \lambda^0) &= -(1/2)\, d' M^{+}(\lambda^0)\, d + \lambda^0/2
    \end{aligned}$$

    Finally, we restrict the dual solution $w^0(u^0)$ to satisfy the constraint $w \in W$ by intersecting the solution set with $W$, which gives

    $$w^0(u^0) \in \bigg( -(M_{22} - \lambda^0 I)^{+}(M_{12}'\, u^0 + d_2) + N(M_{22} - \lambda^0 I) \bigg) \cap W$$

    Since the constraint is satisfied, $L(u^0, w^0(u^0), \lambda^0) = V(u^0, w^0(u^0))$, giving

    $$V(u^0, w^0(u^0)) = -(1/2)\, d' M^{+}(\lambda^0)\, d + \lambda^0/2$$

    and part 1 is established.

  2.

    From Proposition 7 the optimization problem

    $$\max_{w \in W} \min_u V(u,w)$$

    is equivalent to the Lagrangian maxmin problem

    $$\max_w \min_\lambda \min_u L(u,w,\lambda)$$

    Unlike the previous part, before we can use Proposition 15 and invoke strong duality, we must first examine the form of the innermost problem $\min_u L(u,w,\lambda)$. Using Proposition 5 to solve the $\min_u L(u,w,\lambda)$ problem and evaluating at the optimal $u$ gives

    $$L(u^0(w,\lambda), w, \lambda) = \frac{1}{2} w' \tilde{M}_{11}(\lambda) w + w'(d_2 - M_{12}' M_{11}^{+} d_1) - \frac{1}{2} d_1' M_{11}^{+} d_1 + \frac{\lambda}{2}$$

    Given the functional dependence on $w$ and $\lambda$ above, we see that Proposition 15 indeed applies, and the optimization problem (49) is also equivalent to the dual Lagrangian problem

    $$\min_\lambda \max_w \min_u L(u,w,\lambda)$$

    Again, using the dual to obtain the solution for the inner $\max_w$ problem, which gives $w^0(\lambda)$, may give a solution set that is too large, and we further restrict that set subsequently. From Proposition 14 a solution to $\max_w \min_u L(u,w,\lambda)$ exists if and only if $\lambda \geq \left|M_{22} - M_{12}' M_{11}^{+} M_{12}\right|$, and the solution (set) and optimal value function are

    $$\begin{aligned}
    u^0(w,\lambda) &\in -M_{11}^{+}(M_{12}\, w + d_1) + N(M_{11}) \\
    w^0(\lambda) &\in -\tilde{M}_{11}^{+}(\lambda)(d_2 - M_{12}' M_{11}^{+} d_1) + N(\tilde{M}_{11}(\lambda)) \\
    L(u^0(w^0,\lambda), w^0(\lambda), \lambda) &= \begin{cases} L(\lambda), & \lambda \geq \left|M_{22} - M_{12}' M_{11}^{+} M_{12}\right| \\ +\infty, & \lambda < \left|M_{22} - M_{12}' M_{11}^{+} M_{12}\right| \end{cases}
    \end{aligned}$$

    The remaining optimization, which is equivalent to solving Eq. 49, is

    $$\min_{\lambda \geq \left|M_{22} - M_{12}' M_{11}^{+} M_{12}\right|} L(\lambda) \tag{51}$$

    which establishes that

    $$\lambda^0 := \arg\min_{\lambda \geq \left|M_{22} - M_{12}' M_{11}^{+} M_{12}\right|} L(\lambda)$$

    Restricting $w^0$ by enforcing the constraint $w^0 \in W$ gives

    $$w^0(\lambda^0) \in \bigg( -\tilde{M}_{11}^{+}(\lambda^0)(d_2 - M_{12}' M_{11}^{+} d_1) + N(\tilde{M}_{11}(\lambda^0)) \bigg) \cap W$$

    Since the constraint is satisfied, $L(u^0(w^0), w^0, \lambda^0) = V(u^0(w^0), w^0)$, giving

    $$V(u^0(w^0), w^0) = -(1/2)\, d' M^{+}(\lambda^0)\, d + \lambda^0/2$$

    and part 2 is established. ∎
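As a numerical sanity check of both parts of the proof, the following sketch compares brute-force primal values with the dual values $\min_\lambda L(\lambda)$ for scalar $u$ and $w$. It rests on several assumptions that are not taken from the text above: the data `m11, m12, m22, d1, d2` are illustrative; $W$ is taken to be the unit ball $\{w : w^2 \leq 1\}$; $M(\lambda)$ is assumed to be $M$ with $\lambda I$ subtracted from the $(2,2)$ block; and $M(\lambda)$ is invertible wherever it is evaluated, so the pseudoinverse reduces to the ordinary inverse. The helper names (`ternary_min`, `worst_case`, `best_response_value`) are hypothetical.

```python
# Scalar sanity check of the duality argument (illustrative data, not from the paper).
# Assumptions: u and w are scalars, W = {w : w^2 <= 1}, m11 > 0, m22 >= 0, and
# M(lam) = [[m11, m12], [m12, m22 - lam]] is invertible at every lam we evaluate.

m11, m12, m22 = 2.0, 1.0, 1.0
d1, d2 = 1.0, 0.75


def V(u, w):
    """Quadratic objective V(u, w) with M = [[m11, m12], [m12, m22]]."""
    return 0.5 * m11 * u * u + m12 * u * w + 0.5 * m22 * w * w + u * d1 + w * d2


def L(lam):
    """Dual function L(lam) = -(1/2) d' M(lam)^{-1} d + lam / 2 (2x2 inverse by hand)."""
    a22 = m22 - lam
    det = m11 * a22 - m12 * m12
    quad = (a22 * d1 * d1 - 2.0 * m12 * d1 * d2 + m11 * d2 * d2) / det
    return -0.5 * quad + 0.5 * lam


def ternary_min(f, lo, hi, iters=200):
    """Minimize a convex scalar function on [lo, hi] by ternary search."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) <= f(b):
            hi = b
        else:
            lo = a
    x = 0.5 * (lo + hi)
    return x, f(x)


# Grid on W = [-1, 1]; it contains both endpoints exactly.
W_GRID = [-1.0 + 2.0 * i / 400 for i in range(401)]


def worst_case(u):
    """Inner maximization over w in W for fixed u (brute force on the grid)."""
    return max(V(u, w) for w in W_GRID)


def best_response_value(w):
    """Inner minimization over u for fixed w; closed form because m11 > 0."""
    u = -(m12 * w + d1) / m11
    return V(u, w)


# Part 1: min_u max_{w in W} V versus min over lam >= |M22| of L(lam).
u_minmax, minmax_primal = ternary_min(worst_case, -10.0, 10.0)
lam_minmax, minmax_dual = ternary_min(L, m22, m22 + 50.0)

# Part 2: max_{w in W} min_u V versus min over lam >= |M22 - M12' M11^+ M12| of L(lam).
schur = m22 - m12 * m12 / m11
maxmin_primal = max(best_response_value(w) for w in W_GRID)
lam_maxmin, maxmin_dual = ternary_min(L, schur, schur + 50.0)

print("minmax: primal", minmax_primal, "dual", minmax_dual, "lambda0", lam_minmax)
print("maxmin: primal", maxmin_primal, "dual", maxmin_dual, "lambda0", lam_maxmin)
```

For this data the two problems have different solutions: the minmax value is $0.3125$ with $\lambda^0 = \left|M_{22}\right| = 1$ on the boundary of its feasible set, while the maxmin value is $0.25$ with $\lambda^0 = 0.75$, which lies below $\left|M_{22}\right|$ and is admissible only because part 2 uses the smaller lower bound $\left|M_{22} - M_{12}' M_{11}^{+} M_{12}\right| = 0.5$.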

To prove Corollary 19, note that $d = 0$, so that $L(\lambda) = \lambda/2$ is increasing in $\lambda$ and both Eq. 50 and Eq. 51 are solved analytically at the lower bound of their feasible sets, giving $\lambda^0 = \left|M_{22}\right|$ for Eq. 50 and $\lambda^0 = \left|M_{22} - M_{12}' M_{11}^{+} M_{12}\right|$ for Eq. 51. Substituting these $\lambda^0$ values and $d = 0$ into the statement of Proposition 20 then establishes the results of Corollary 19.
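The $d = 0$ case can also be checked by direct enumeration. The sketch below, again for scalar $u$ and $w$ with illustrative numbers that are not from the paper, evaluates both problems on grids and compares them with $\lambda^0/2$. It assumes $W = \{w : w^2 \leq 1\}$, $M_{11} > 0$, and $M \geq 0$; the grid sizes are chosen so that the exact optimizers lie on the grids.

```python
# Scalar check of the d = 0 case (illustrative numbers, not from the paper).
# Assumptions: scalar u and w, W = {w : w^2 <= 1}, m11 > 0, and M >= 0.

m11, m12, m22 = 4.0, 2.0, 3.0


def V0(u, w):
    """V(u, w) with d = 0."""
    return 0.5 * m11 * u * u + m12 * u * w + 0.5 * m22 * w * w


U_GRID = [-2.0 + 4.0 * i / 400 for i in range(401)]  # contains 0 and +/-0.5
W_GRID = [-1.0 + 2.0 * i / 200 for i in range(201)]  # contains -1 and 1

# Brute-force values of the two problems on the grids.
minmax_grid = min(max(V0(u, w) for w in W_GRID) for u in U_GRID)
maxmin_grid = max(min(V0(u, w) for u in U_GRID) for w in W_GRID)

# Multipliers from the analytical solution; |.| is the absolute value for scalars.
lam_minmax = abs(m22)
lam_maxmin = abs(m22 - m12 * m12 / m11)

print("minmax:", minmax_grid, "lambda0/2:", lam_minmax / 2.0)
print("maxmin:", maxmin_grid, "lambda0/2:", lam_maxmin / 2.0)
```

With these numbers the grid values agree with $\lambda^0/2$ in both cases, namely $1.5$ for the minmax problem and $1.0$ for the maxmin problem.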

References

  • Boyd and Vandenberghe (2004) S. P. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
  • Mangasarian (1994) O. Mangasarian. Nonlinear Programming. SIAM, Philadelphia, PA, 1994.
  • Mannini et al. (2024) D. Mannini, R. Strässer, and J. B. Rawlings. Optimal design of disturbance attenuation feedback controllers for linear dynamical systems. In American Control Conference, Toronto, CA, July 8–12, 2024.
  • Nocedal and Wright (2006) J. Nocedal and S. J. Wright. Numerical Optimization. Springer, New York, second edition, 2006.
  • Polak (1997) E. Polak. Optimization: Algorithms and Consistent Approximations. Springer Verlag, New York, 1997. ISBN 0-387-94971-2.
  • Rawlings et al. (2020) J. B. Rawlings, D. Q. Mayne, and M. M. Diehl. Model Predictive Control: Theory, Design, and Computation. Nob Hill Publishing, Santa Barbara, CA, 2nd, paperback edition, 2020. 770 pages, ISBN 978-0-9759377-5-4.
  • Rockafellar (1993) R. T. Rockafellar. Lagrange multipliers and optimality. SIAM Rev., 35(2):183–238, 1993.
  • Rockafellar and Wets (1998) R. T. Rockafellar and R. J.-B. Wets. Variational Analysis. Springer-Verlag, 1998.
  • von Neumann (1928) J. von Neumann. Zur Theorie der Gesellschaftsspiele. Math. Ann., 100:295–320, 1928. doi: 10.1007/BF01448847.
  • von Neumann and Morgenstern (1944) J. von Neumann and O. Morgenstern. Theory of Games and Economic Behavior. Princeton University Press, Princeton and Oxford, 1944.