Computing Proximity Operators of Scale and Signed Permutation Invariant Functions

Jianqing Jia, Ashley Prater-Bennette, Lixin Shen [1]

[1] Department of Mathematics, Syracuse University, Syracuse, NY 13244, USA

[2] Information Directorate, Air Force Research Laboratory, Rome, NY 10587, USA

These authors contributed equally to this work.
Abstract

This paper investigates the computation of proximity operators for scale and signed permutation invariant functions. A scale-invariant function remains unchanged under uniform scaling, while a signed permutation invariant function retains its value under permutations and sign changes applied to its input variables. Noteworthy examples include the $\ell_0$ function and the ratio $\ell_1/\ell_2$ and its square, whose proximity operators are particularly important in sparse signal recovery. We study the properties of scale and signed permutation invariant functions and break the computation of their proximity operators into three sequential steps: the $\bm{w}$-step, the $r$-step, and the $d$-step. Together these steps form a procedure termed WRD, in which the $\bm{w}$-step is the most important and requires careful treatment. Leveraging this procedure, we present a method for explicitly computing the proximity operator of $(\ell_1/\ell_2)^2$ and introduce an efficient algorithm for the proximity operator of $\ell_1/\ell_2$.

Keywords: sparse promoting functions, proximity operator, $\ell_1/\ell_2$, $(\ell_1/\ell_2)^2$

MSC Classification: 90C26, 90C32, 90C55, 90C90, 65K05

1 Introduction

This paper addresses the computation of the proximity operator for scale and signed permutation invariant functions. A scale-invariant function is unchanged by uniform scaling: its value does not change when its input is multiplied by a positive constant. A signed permutation invariant function is unchanged by permutations and sign changes of its input variables: reordering the entries of the input, or replacing any entry by its negative, leaves the function value the same.

Several well-known examples of signed permutation invariant functions, as well as scale and signed permutation invariant functions, are presented:

  • All $\ell_p$ norms, where $0 < p \leq \infty$, and the log-sum penalty function in $\mathbb{R}^n$ are signed permutation invariant but not scale invariant, see [1, 2];

  • The $\ell_0$ norm and the effective sparsity measure $\left(\frac{\|\cdot\|_q}{\|\cdot\|_1}\right)^{\frac{q}{1-q}}$, $q \in (0,\infty)\setminus\{1\}$, are both scale and signed permutation invariant, see [3, 4, 5, 6, 7].

The proximity operator is a basic tool in optimization: it provides a computationally efficient way to solve optimization problems involving nonsmooth functions [8, 9, 10, 11, 12, 13, 14, 15]. Given a proper lower semicontinuous function $f$ and a point $\bm{x}$, the proximity operator of $f$ at $\bm{x}$, denoted as $\mathrm{prox}_f(\bm{x})$, is defined as:

$$\mathrm{prox}_f(\bm{x}) = \arg\min\left\{\frac{1}{2}\|\bm{u}-\bm{x}\|_2^2 + f(\bm{u}) : \bm{u}\in\mathbb{R}^n\right\}.$$

In simpler terms, the proximity operator finds a point $\bm{u}$ that minimizes the sum of the function $f$ and half of the squared Euclidean distance between $\bm{u}$ and a given point $\bm{x}$.
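As a concrete illustration of this definition, the following minimal sketch uses the $\ell_1$ norm, whose proximity operator is the well-known soft-thresholding map. The $\ell_1$ norm is chosen only because its proximity operator has a closed form; it is not one of the scale invariant functions studied in this paper, and the names `prox_l1`, `prox_objective`, and `l1_norm` are illustrative.

```python
import numpy as np

def prox_l1(x, lam=1.0):
    """Soft thresholding: the minimizer of 0.5*||u - x||^2 + lam*||u||_1."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def prox_objective(u, x, f):
    """Objective appearing in the definition of prox_f(x)."""
    u = np.asarray(u, dtype=float)
    x = np.asarray(x, dtype=float)
    return 0.5 * np.sum((u - x) ** 2) + f(u)

def l1_norm(u):
    return float(np.sum(np.abs(u)))

x = np.array([3.0, -0.5, 1.2])
u_star = prox_l1(x)                       # entries shrink toward zero by 1
best = prox_objective(u_star, x, l1_norm)

# Sanity check: random perturbations of u_star never achieve a lower objective.
rng = np.random.default_rng(0)
no_better = all(
    prox_objective(u_star + 0.1 * rng.standard_normal(3), x, l1_norm) >= best
    for _ in range(100)
)
```

Because the $\ell_1$ norm is convex, the minimizer here is unique; for the nonconvex functions considered in this paper the minimizer need not be unique.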

The focus of this paper is the proximity operator of scale and signed permutation invariant functions. Our approach for computing it is based on the following observation: the space $\mathbb{R}^n$ is isomorphic to the Cartesian product of $\mathbb{R}$ and the $(n-1)$-dimensional unit sphere, denoted by $\mathbb{S}^{n-1}$. Mathematically, this can be expressed as:

$$\mathbb{R}^n \cong \mathbb{R}\times\mathbb{S}^{n-1}.$$

That is, a nonzero $\bm{u}\in\mathbb{R}^n$ can be converted to a pair $(r,\bm{w})\in\mathbb{R}\times\mathbb{S}^{n-1}$ such that $\bm{u}=r\bm{w}$, where $r=\|\bm{u}\|_2$ and $\bm{w}=\bm{u}/\|\bm{u}\|_2$. With this conversion, the task of finding a point $\bm{u}\in\mathrm{prox}_f(\bm{x})$ becomes that of finding a pair $(r,\bm{w})\in\mathbb{R}\times\mathbb{S}^{n-1}$ such that $\bm{u}=r\bm{w}$. By exploiting the properties of the scale and signed permutation invariant function $f$, the process of finding this pair $(r,\bm{w})$ breaks into three consecutive steps. The first step solves an optimization problem in the variable $\bm{w}$ only, the second step straightforwardly yields $r=\langle\bm{x},\bm{w}\rangle$, and the final step decides whether to choose the origin or the scaled vector $\bm{u}=r\bm{w}$ as the resulting point. Clearly, the first step is crucial.
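The three steps can be sketched by brute force in $\mathbb{R}^2$. The sketch below is an illustration under stated assumptions, not the procedure developed in the paper. It assumes $f$ is scale invariant, $\bm{x}$ has nonnegative entries, and the value of $f$ at the origin is supplied by the caller (`f_at_origin`). For $r>0$ and $\|\bm{w}\|_2=1$ the objective is $\frac{1}{2}(r^2-2r\langle\bm{x},\bm{w}\rangle+\|\bm{x}\|_2^2)+f(\bm{w})$, so the best radius is $r=\langle\bm{x},\bm{w}\rangle$ and the remaining problem in $\bm{w}$ is to minimize $f(\bm{w})-\frac{1}{2}\langle\bm{x},\bm{w}\rangle^2$. The $\bm{w}$-step here is a plain grid search over the quarter circle, and the function names are hypothetical.

```python
import numpy as np

def l0(w, tol=1e-12):
    """Number of entries of w that are nonzero up to a small tolerance."""
    return float(np.count_nonzero(np.abs(w) > tol))

def wrd_prox_2d(x, f, f_at_origin, num_angles=2001):
    """Brute-force sketch of the w-, r-, and d-steps for x >= 0 in R^2."""
    x = np.asarray(x, dtype=float)
    # w-step: grid search on the quarter circle for the direction minimizing
    # f(w) - 0.5*<x, w>^2, the objective left after eliminating r.
    thetas = np.linspace(0.0, np.pi / 2, num_angles)
    W = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)
    values = np.array([f(w) - 0.5 * np.dot(x, w) ** 2 for w in W])
    w = W[np.argmin(values)]
    # r-step: the best radius along the chosen direction.
    r = float(np.dot(x, w))
    # d-step: keep r*w only if it is at least as good as the origin.
    u = r * w
    obj_u = 0.5 * np.sum((u - x) ** 2) + f(w)
    obj_0 = 0.5 * np.sum(x ** 2) + f_at_origin
    return u if obj_u <= obj_0 else np.zeros_like(x)

# With f = l0 the result should agree with hard thresholding at sqrt(2).
u_large = wrd_prox_2d([3.0, 0.5], l0, f_at_origin=0.0)
u_small = wrd_prox_2d([1.0, 0.5], l0, f_at_origin=0.0)
```

In the first call the entry 3.0 exceeds the threshold and is kept; in the second call both entries fall below it, so the d-step selects the origin.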

Among all scale and signed permutation invariant functions, we present a complete study of the following functions:

$$h_p(\bm{x}) = \left(\frac{\|\bm{x}\|_1}{\|\bm{x}\|_2}\right)^p \quad\mbox{for}\quad p=1,2.$$
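A short numerical sketch of $h_p$ may help fix ideas. It evaluates the function only at nonzero vectors, since the value at the origin is not specified in this section, and it illustrates the standard bounds $1\leq\|\bm{x}\|_1/\|\bm{x}\|_2\leq\sqrt{n}$: the lower bound is attained by vectors with a single nonzero entry and the upper bound by vectors whose entries all have equal magnitude. The function name `h` is illustrative.

```python
import numpy as np

def h(x, p=1):
    """h_p(x) = (||x||_1 / ||x||_2)^p for a nonzero vector x."""
    x = np.asarray(x, dtype=float)
    norm2 = np.linalg.norm(x, 2)
    if norm2 == 0.0:
        raise ValueError("this sketch evaluates h_p only at nonzero vectors")
    return (np.linalg.norm(x, 1) / norm2) ** p

one_sparse = np.array([2.0, 0.0, 0.0, 0.0])   # sparsest nonzero vector
flat = np.ones(4)                              # least sparse direction
x = np.array([1.5, 0.0, -2.0, 0.25])

h1_sparse = h(one_sparse, 1)                   # lower bound 1
h1_flat = h(flat, 1)                           # upper bound sqrt(4)
scale_ok = np.isclose(h(7.0 * x, 2), h(x, 2))  # scaling does not change h_p
```

Smaller values of $h_p$ correspond to sparser vectors, which is the intuition behind using these ratios as sparsity-promoting penalties.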

Notably, there has been a gap in the existing literature concerning the proximity operator of $h_2$, and we have observed a recent study that addresses the proximity operator of $h_1$ [16]. In our work, we aim to fill this gap by providing a comprehensive analysis of the proximity operator for both $h_1$ and $h_2$ within the context of scale and signed permutation invariant functions.

With our approach, the optimization problem for $\bm{w}$ associated with both $h_1$ and $h_2$ is nonconvex and takes the form of a constrained quadratic programming problem after certain simplifications. Although the objective functions and the constraint sets are nonconvex in both cases, the two problems differ in structure, and we adopt a distinct strategy for each.

For the $h_2$ function, the objective function of the quadratic programming problem involves only a quadratic term formulated by a structured symmetric rank-2 matrix. We explicitly demonstrate that this matrix possesses one positive eigenvalue and one negative eigenvalue. The constraint set of the problem is $\mathbb{S}^{n-1}\cap\mathbb{R}^n_+$, where $\mathbb{R}^n_+$ is the first orthant of $\mathbb{R}^n$. While both the objective function and the constraint set are nonconvex, we are able to develop a procedure that finds the optimal solution $\bm{w}$ through the eigenvector of the matrix corresponding to the negative eigenvalue, in a finite number of iterations.
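The eigenvalue pattern described here can be illustrated numerically. The specific matrix is constructed later in the paper and is not reproduced in this section, so the sketch below assumes a hypothetical matrix of the form $\bm{e}\bm{e}^\top - c\,\bm{x}\bm{x}^\top$, with $\bm{e}$ the all-ones vector, $c>0$ an arbitrary weight, and $\bm{x}$ not parallel to $\bm{e}$. Any such difference of two rank-one positive semidefinite matrices has rank 2 with exactly one positive and one negative eigenvalue.

```python
import numpy as np

# Hypothetical data: a sorted nonnegative point and an arbitrary positive weight.
x = np.array([3.0, 2.0, 2.0, 1.0, 0.5, 0.0])
e = np.ones_like(x)
c = 0.5

# Assumed form of a structured symmetric rank-2 matrix (not taken from the paper).
M = np.outer(e, e) - c * np.outer(x, x)

eigvals, eigvecs = np.linalg.eigh(M)      # eigenvalues in ascending order
tol = 1e-10
num_neg = int(np.sum(eigvals < -tol))
num_pos = int(np.sum(eigvals > tol))
rank = int(np.linalg.matrix_rank(M))

# Unit eigenvector belonging to the negative eigenvalue.
v_neg = eigvecs[:, 0]
```

The eigenvector `v_neg` need not lie in the first orthant, which is one reason the constrained problem requires more than a single eigendecomposition.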

For the $h_1$ function, the objective function of the quadratic programming problem comprises a quadratic term formulated by a rank-one symmetric matrix and one linear term. The rank-one matrix is negative semidefinite, and the constraint set remains $\mathbb{S}^{n-1}\cap\mathbb{R}^n_+$. As with $h_2$, both the objective function and the constraint set are nonconvex. However, the procedure used for $h_2$ cannot be directly adapted for $h_1$. To address this, we relax the nonconvex feasible set $\mathbb{S}^{n-1}\cap\mathbb{R}^n_+$ to the convex set $\{\bm{w}\in\mathbb{R}^n_+ : \|\bm{w}\|_2\leq 1\}$. The resulting optimization problem keeps the same objective function as the non-relaxed version but is now posed over a convex domain. We establish conditions ensuring that the optimal solution to the relaxed problem either lies on $\mathbb{S}^{n-1}\cap\mathbb{R}^n_+$ or is the origin. Subsequently, we propose a projected gradient method to solve the relaxed optimization problem. Leveraging the fact that the optimal solution is related to the proximity operator of $h_1$ at a given point, we use this information as prior knowledge to initialize the projected gradient method. In our numerical experiments, the algorithm consistently finds the optimal $\bm{w}$ for the original, unrelaxed optimization problem.
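A generic projected gradient iteration on the relaxed set can be sketched as follows. This is not the algorithm of Section 4: the objective $g(\bm{w})=\langle\bm{e},\bm{w}\rangle-\frac{c}{2}\langle\bm{x},\bm{w}\rangle^2$ is a hypothetical instance of "rank-one negative semidefinite quadratic plus linear term", the weight $c$ is arbitrary, and the starting point $\bm{x}/\|\bm{x}\|_2$ is a placeholder for the paper's initialization. The projection uses the known fact that projecting onto the intersection of a closed convex cone and the unit ball amounts to projecting onto the cone and then onto the ball.

```python
import numpy as np

def project_ball_orthant(v):
    """Euclidean projection onto {w >= 0, ||w||_2 <= 1}: clip, then rescale."""
    w = np.maximum(v, 0.0)
    return w / max(1.0, np.linalg.norm(w))

def projected_gradient(x, c, steps=500):
    """Projected gradient for g(w) = <e, w> - 0.5*c*<x, w>^2 on the relaxed set."""
    x = np.asarray(x, dtype=float)
    e = np.ones_like(x)

    def g(w):
        return float(np.dot(e, w) - 0.5 * c * np.dot(x, w) ** 2)

    def grad(w):
        return e - c * np.dot(x, w) * x

    step = 1.0 / (c * np.dot(x, x))   # 1/L, with L the Lipschitz constant of grad
    w = project_ball_orthant(x / np.linalg.norm(x))   # hypothetical starting point
    history = [g(w)]
    for _ in range(steps):
        w = project_ball_orthant(w - step * grad(w))
        history.append(g(w))
    return w, history

w, history = projected_gradient([3.0, 2.0, 0.5], c=1.0)
```

With step size $1/L$ the objective values in `history` are nonincreasing, although for a nonconvex objective this alone does not guarantee convergence to a global minimizer.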

It is worth noting that a different approach for the proximity operator of $h_1$ has been reported recently in [16]. That paper claimed to have derived the analytical solution of the proximity operator of $h_1$, relying on prior knowledge about the sparsity of the corresponding output from this proximity operator, which, however, is unknown in general. A bisection method was then applied for finding this desired sparsity.

The current literature, including works such as [3, 4, 5, 6], suggests that both the $h_1$ and $h_2$ functions can effectively promote sparsity in underlying signals. However, to the best of our knowledge, there is a lack of theoretical justification for this claim. In this paper, we provide a theoretical proof that both $h_1$ and $h_2$ qualify as sparsity-promoting functions, as defined in [17].

The outline of the rest of the paper is as follows. In Section 2, we begin by presenting some properties of the proximity operators for scale and signed permutation invariant functions. These properties allow us to focus our discussion on these proximity operators within a specific set: each point lies in the first orthant of $\mathbb{R}^n$, and the entries of the point are in descending order. By employing a different representation of the points in this set, determining the proximity operators of scale and signed permutation invariant functions at these points essentially reduces to solving a quadratic programming problem constrained on a nonconvex set. We then introduce a comprehensive procedure called the WRD procedure, which comprises three distinct steps: the $\bm{w}$-step, the $r$-step, and the $d$-step. This procedure enables efficient computation of proximity operators for scale and signed permutation invariant functions, offering a systematic approach to solving such problems.

In Section 3, utilizing the WRD procedure, we compute the proximity operator of $h_2$. We are able to provide an explicit solution for the proximity operator of $h_2$ at any point in a highly efficient manner, thereby demonstrating the effectiveness of our approach.

In Section 4, leveraging the WRD procedure, we compute the proximity operator of $h_1$. We are able to develop an efficient algorithm to evaluate the proximity operator of $h_1$ at any point, showcasing the versatility of our methodology.

The conclusion of this paper is drawn in Section 5, summarizing the findings and contributions of our study. We discuss the implications of our results and propose avenues for future research.

2 Scale and Signed Permutation Invariant Functions and their Proximity Operators

All functions in this work are defined on $\mathbb{R}^n$, the Euclidean space of dimension $n$. Bold lowercase letters, such as $\bm{x}$, signify vectors, with the $j$th component represented by the corresponding lowercase letter $x_j$. Matrices are indicated by bold uppercase letters such as $\mathsf{A}$ and $\mathsf{B}$. We use $\mathbb{R}^n_+$ to denote the set of points in $\mathbb{R}^n$ whose entries are all nonnegative. The cone of vectors $\bm{x}$ in $\mathbb{R}^n_+$ satisfying $x_1\geq x_2\geq\ldots\geq x_n\geq 0$ is denoted by $\mathbb{R}^n_\downarrow$. We use $\mathbb{S}^{n-1}$ (or $\mathbb{B}^n$) to denote the unit sphere (or ball) in $\mathbb{R}^n$. We use $\mathbb{S}^{n-1}_+$ ($\mathbb{B}^n_+$ or $\mathbb{B}^n_\downarrow$) to denote the partial unit sphere $\mathbb{S}^{n-1}\cap\mathbb{R}^n_+$ (the partial unit ball $\mathbb{B}^n\cap\mathbb{R}^n_+$ or $\mathbb{B}^n\cap\mathbb{R}^n_\downarrow$) in $\mathbb{R}^n$. Let $\mathcal{P}_n$ denote the set of all $n\times n$ signed permutation matrices: those matrices that have exactly one nonzero entry in every row and every column, with that entry equal to $\pm 1$.

The $\ell_p$ norm of $\bm{x}=[x_1,\ldots,x_n]^\top\in\mathbb{R}^n$ is defined as $\|\bm{x}\|_p=(\sum_{k=1}^n|x_k|^p)^{1/p}$ for $1\leq p<\infty$ and $\|\bm{x}\|_\infty=\max\{|x_k|:k=1,2,\ldots,n\}$. When $p=0$, $\|\bm{x}\|_0$ represents the number of non-zero components in $\bm{x}$. The standard inner product in $\mathbb{R}^n$ is denoted by $\langle\bm{u},\bm{v}\rangle$, where $\bm{u}$ and $\bm{v}$ are vectors in $\mathbb{R}^n$.

We denote by $[n]:=\{1,2,\ldots,n\}$ the index set up to a positive integer $n$. For a subset $S$ of $[n]$, the notation $|S|$ represents the cardinality of $S$. For a vector $\bm{x}\in\mathbb{R}^n$ and a subset $S$ of $[n]$, $\bm{x}_S$ denotes either the vector that retains the entries of $\bm{x}$ with indices in $S$ and sets the remaining entries to zero, or the subvector of $\bm{x}$ with indices solely from $S$. Which meaning of $\bm{x}_S$ is intended will be evident from the context.

A function $f:\mathbb{R}^n\rightarrow\mathbb{R}$ is considered scale invariant if for all $\bm{x}\in\mathbb{R}^n$ and $\alpha>0$, the following holds:

$$f(\alpha\bm{x})=f(\bm{x}).$$

In other words, scaling the input by any positive constant does not alter the value of the function.

A function $f:\mathbb{R}^n\rightarrow\mathbb{R}$ is considered signed permutation invariant if it remains unchanged under the action of permutations and sign changes of its input variables. Formally, a function $f$ is signed permutation invariant if, for all signed permutation matrices $\mathsf{P}\in\mathcal{P}_n$ and for all vectors $\bm{x}\in\mathbb{R}^n$, the following holds:

$$f(\mathsf{P}\bm{x})=f(\bm{x}).$$
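Both definitions are easy to test numerically. The sketch below builds a random signed permutation matrix and checks the two invariances for the $\ell_0$ function and the $\ell_1$ norm. The helper names are illustrative, and the checks are spot tests at one point rather than proofs.

```python
import numpy as np

def random_signed_permutation(n, rng):
    """n x n matrix with exactly one entry, equal to +1 or -1, in each row and column."""
    P = np.zeros((n, n))
    P[np.arange(n), rng.permutation(n)] = rng.choice([-1.0, 1.0], size=n)
    return P

def l0(x):
    return float(np.count_nonzero(x))

def l1(x):
    return float(np.sum(np.abs(x)))

rng = np.random.default_rng(0)
x = np.array([1.5, 0.0, -2.0, 0.25])
P = random_signed_permutation(4, rng)
alpha = 3.0

l0_signed_perm = l0(P @ x) == l0(x)            # invariant
l0_scale = l0(alpha * x) == l0(x)              # invariant
l1_signed_perm = np.isclose(l1(P @ x), l1(x))  # invariant
l1_scale = np.isclose(l1(alpha * x), l1(x))    # not invariant: l1 grows with alpha
```

This matches the examples in the introduction: the $\ell_1$ norm is signed permutation invariant but not scale invariant, while the $\ell_0$ function is both.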

A function $f$ defined on $\mathbb{R}^n$ with values in $\mathbb{R}\cup\{+\infty\}$ is proper if its domain $\mathrm{dom}(f)=\{\bm{x}\in\mathbb{R}^n:f(\bm{x})<+\infty\}$ is nonempty, and $f$ is lower semicontinuous if its epigraph is a closed set. The set of proper and lower semicontinuous functions on $\mathbb{R}^n$ to $\mathbb{R}\cup\{+\infty\}$ is denoted by $\Gamma(\mathbb{R}^n)$.

The proximity operator was introduced by Moreau in [18]. For a function $f\in\Gamma(\mathbb{R}^n)$, the proximity operator of $f$ at $\bm{z}\in\mathbb{R}^n$ with index $\alpha$ is defined by

$$\mathrm{prox}_{\alpha f}(\bm{z}):=\arg\min\left\{\frac{1}{2\alpha}\|\bm{x}-\bm{z}\|_2^2+f(\bm{x}):\bm{x}\in\mathbb{R}^n\right\}.$$

The proximity operator of $f$ is a set-valued operator from $\mathbb{R}^n$ to $2^{\mathbb{R}^n}$, the power set of $\mathbb{R}^n$. In this paper, for a scale and signed permutation invariant function, we always assume that the set $\mathrm{prox}_{\alpha f}(\bm{z})$ is nonempty and compact.
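A one-dimensional example shows why the operator is set-valued. For the $\ell_0$ function on $\mathbb{R}$, the objective $\frac{1}{2\alpha}(x-z)^2+|x|_0$ can only be minimized at $x=0$ (cost $z^2/(2\alpha)$) or, for $z\neq 0$, at $x=z$ (cost $1$), so both points are minimizers when $|z|=\sqrt{2\alpha}$. The sketch below returns the full set of minimizers; the function name `prox_l0_scalar` is illustrative.

```python
import math

def prox_l0_scalar(z, alpha):
    """All minimizers of (1/(2*alpha))*(x - z)**2 + |x|_0 over real x."""
    # Only x = 0 and x = z can be optimal, so compare their costs.
    cost_zero = z * z / (2.0 * alpha)
    cost_keep = 1.0 if z != 0 else 0.0
    if math.isclose(cost_zero, cost_keep):
        return sorted({0.0, float(z)})   # tie: both candidates are minimizers
    return [0.0] if cost_zero < cost_keep else [float(z)]

above = prox_l0_scalar(3.0, 2.0)   # |z| above the threshold sqrt(2*alpha) = 2
below = prox_l0_scalar(1.0, 2.0)   # |z| below the threshold
tie = prox_l0_scalar(2.0, 2.0)     # |z| exactly at the threshold
```

In every case the returned set is nonempty and finite, consistent with the standing assumption that the proximity operator is nonempty and compact.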

2.1 Properties

The proximity operator exhibits certain properties concerning scale and signed permutation invariant functions.

Lemma 2.1.

Let $\bm{x}\in\mathbb{R}^n$, $\mathsf{P}\in\mathcal{P}_n$, $\alpha>0$, and $\lambda>0$. The following statements hold:

  • (i) For a signed permutation invariant function $f\in\Gamma(\mathbb{R}^n)$, $\mathrm{prox}_{\lambda f}(\bm{x})=\mathsf{P}^{-1}\mathrm{prox}_{\lambda f}(\mathsf{P}\bm{x})$.

  • (ii) For a scale invariant function $f\in\Gamma(\mathbb{R}^n)$, $\mathrm{prox}_{\lambda f}(\alpha\bm{x})=\alpha\,\mathrm{prox}_{\lambda\alpha^{-2}f}(\bm{x})$.

Proof.

The proof of the two items is based on the definitions of the proximity operator and scale and signed permutation invariant function. We skip the details of the proof here. ∎
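One way to fill in the omitted details, assuming the standard change-of-variables argument is what is intended, is sketched below. Item (i) uses the substitution $\bm{u}=\mathsf{P}^{-1}\bm{v}$ together with the facts that $\mathsf{P}$ is orthogonal and $\mathsf{P}^{-1}\in\mathcal{P}_n$; item (ii) uses the substitution $\bm{u}=\alpha\bm{v}$ together with $f(\alpha\bm{v})=f(\bm{v})$.

```latex
% (i): substitute u = P^{-1} v; P is orthogonal, so ||P^{-1} v - x|| = ||v - P x||.
\begin{align*}
\mathrm{prox}_{\lambda f}(\bm{x})
&= \arg\min_{\bm{u}} \Big\{ \tfrac{1}{2\lambda}\|\bm{u}-\bm{x}\|_2^2 + f(\bm{u}) \Big\} \\
&= \mathsf{P}^{-1}\arg\min_{\bm{v}} \Big\{ \tfrac{1}{2\lambda}\|\mathsf{P}^{-1}\bm{v}-\bm{x}\|_2^2 + f(\mathsf{P}^{-1}\bm{v}) \Big\} \\
&= \mathsf{P}^{-1}\arg\min_{\bm{v}} \Big\{ \tfrac{1}{2\lambda}\|\bm{v}-\mathsf{P}\bm{x}\|_2^2 + f(\bm{v}) \Big\}
 = \mathsf{P}^{-1}\,\mathrm{prox}_{\lambda f}(\mathsf{P}\bm{x}).
\end{align*}
% (ii): substitute u = alpha v and use scale invariance f(alpha v) = f(v).
\begin{align*}
\mathrm{prox}_{\lambda f}(\alpha\bm{x})
&= \alpha\arg\min_{\bm{v}} \Big\{ \tfrac{1}{2\lambda}\|\alpha\bm{v}-\alpha\bm{x}\|_2^2 + f(\alpha\bm{v}) \Big\} \\
&= \alpha\arg\min_{\bm{v}} \Big\{ \tfrac{\alpha^2}{2\lambda}\|\bm{v}-\bm{x}\|_2^2 + f(\bm{v}) \Big\}
 = \alpha\,\mathrm{prox}_{\lambda\alpha^{-2} f}(\bm{x}).
\end{align*}
```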

For any vector $\bm{x}\in\mathbb{R}^n$, there is a signed permutation $\mathsf{P}\in\mathcal{P}_n$ such that $\mathsf{P}\bm{x}\in\mathbb{R}^n_\downarrow$; that is, the entries of $\bm{x}$ can be sorted so that $|x_{\sigma(1)}|\geq|x_{\sigma(2)}|\geq\ldots\geq|x_{\sigma(n)}|$, where $\sigma(i)$ is the index of the nonzero entry in the $i$th row of $\mathsf{P}$. By Lemma 2.1, for a signed permutation invariant function in $\Gamma(\mathbb{R}^n)$, it is sufficient to consider its proximity operator on $\mathbb{R}^n_\downarrow$.
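This reduction can be sketched in code. The example below builds a sorting signed permutation $\mathsf{P}$, evaluates a proximity operator at the sorted point $\mathsf{P}\bm{x}$, and maps the result back with $\mathsf{P}^{-1}=\mathsf{P}^\top$, as in Lemma 2.1(i). The $\ell_0$ function with hard thresholding is used only because its proximity operator is known in closed form; the function names are illustrative.

```python
import numpy as np

def sorting_signed_permutation(x):
    """Signed permutation matrix P such that P @ x is nonnegative and nonincreasing."""
    x = np.asarray(x, dtype=float)
    n = x.size
    order = np.argsort(-np.abs(x), kind="stable")   # row i of P picks x[order[i]]
    signs = np.where(x[order] < 0, -1.0, 1.0)       # flip negative entries
    P = np.zeros((n, n))
    P[np.arange(n), order] = signs
    return P

def prox_l0(x, lam):
    """Hard thresholding: one element of prox_{lam*||.||_0}(x); ties keep the entry."""
    x = np.asarray(x, dtype=float)
    return np.where(x ** 2 >= 2.0 * lam, x, 0.0)

x = np.array([-0.3, 2.0, -1.7, 0.9])
lam = 0.5
P = sorting_signed_permutation(x)
y = P @ x                          # sorted, nonnegative copy of x
u = P.T @ prox_l0(y, lam)          # P^{-1} = P^T for a signed permutation matrix
direct = prox_l0(x, lam)           # proximity operator evaluated without sorting
```

Here `u` and `direct` agree, so nothing is lost by working on the sorted cone and undoing the permutation afterwards.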

For a vector $\bm{x}\in\mathbb{R}^n_\downarrow$, we say that $\bm{x}$ has $k$ blocks if there are $(k+1)$ distinct indices $\{i_j:j\in[k+1]\}$ satisfying $i_1=1$, $i_{k+1}=n$, and $i_j<i_{j+1}$ such that $x_{i_j}=x_{i_{j+1}-1}>x_{i_{j+1}}$ for $1\leq j\leq k-1$ and $x_{i_k}=x_{i_{k+1}}$. In essence, the vector $\bm{x}$ comprises $k$ blocks, where entries within each block are identical, yet they differ from entries in other blocks.
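The block structure of a sorted vector can be extracted in a few lines. The sketch below reports each block by its starting position and length, using 0-based positions; this bookkeeping is a convenience of the example and differs from the index convention $i_1,\ldots,i_{k+1}$ used in the text.

```python
import numpy as np

def find_blocks(x):
    """Start positions (0-based) and lengths of the runs of equal entries in a sorted x."""
    x = np.asarray(x, dtype=float)
    # A new block starts wherever two consecutive entries differ.
    starts = np.concatenate(([0], np.flatnonzero(np.diff(x) != 0) + 1))
    lengths = np.diff(np.concatenate((starts, [x.size])))
    return starts, lengths

# Three blocks: {5, 5}, {3}, and {2, 2, 2}.
starts, lengths = find_blocks([5.0, 5.0, 3.0, 2.0, 2.0, 2.0])
k = len(starts)
```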

Lemma 2.2.

Let $f$ be a signed permutation invariant function in $\Gamma(\mathbb{R}^{n})$, and let $\lambda>0$. For $\bm{x}\in\mathbb{R}^{n}_{\downarrow}$, we have $\operatorname*{prox}_{\lambda f}(\bm{x})\subseteq\mathbb{R}^{n}_{+}$. Furthermore, there exists a point $\bm{u}\in\operatorname*{prox}_{\lambda f}(\bm{x})$ such that $\bm{u}\in\mathbb{R}_{\downarrow}^{n}$.

Proof.

To establish $\operatorname*{prox}_{\lambda f}(\bm{x})\subseteq\mathbb{R}^{n}_{+}$, we observe that the objective function from the definition of $\operatorname*{prox}_{\lambda f}(\bm{x})$ is $\frac{1}{2\lambda}\|\bm{u}-\bm{x}\|_{2}^{2}+f(\bm{u})$ for all $\bm{u}\in\mathbb{R}^{n}$. As $f$ is a signed permutation invariant function, $f(\bm{u})=f(\mathsf{P}\bm{u})$ for all $\mathsf{P}\in\mathcal{P}_{n}$.
Given $\bm{x}\in\mathbb{R}_{\downarrow}^{n}$, our discussion can be restricted to $\bm{u}\in\mathbb{R}_{+}^{n}$; otherwise, if, say, the first element $u_{1}$ of $\bm{u}$ is negative, then flipping its sign leaves $f$ unchanged and does not increase the quadratic term, since $(-u_{1}-x_{1})^{2}+\sum_{\ell=2}^{n}(u_{\ell}-x_{\ell})^{2}\leq(u_{1}-x_{1})^{2}+\sum_{\ell=2}^{n}(u_{\ell}-x_{\ell})^{2}$.
From the above discussion, we conclude that $\operatorname*{prox}_{\lambda f}(\bm{x})\subseteq\mathbb{R}_{+}^{n}$.

Now, suppose $\bm{u}\in\operatorname*{prox}_{\lambda f}(\bm{x})$. If the vector $\bm{x}$ has one block, that is, all entries of $\bm{x}$ are the same, then clearly we can rearrange the entries of $\bm{u}$ so that the rearranged vector is in $\mathbb{R}^{n}_{\downarrow}$ and is still in $\operatorname*{prox}_{\lambda f}(\bm{x})$. Next, suppose the vector $\bm{x}\in\mathbb{R}_{\downarrow}^{n}$ has $k\geq 2$ blocks, characterized by $(k+1)$ distinct indices $\{i_{j}:j\in[k+1]\}$.
We define $u_{\overline{j}}=\max\{u_{\ell}:i_{j}\leq\ell\leq i_{j+1}-1\}$ and $u_{\underline{j}}=\min\{u_{\ell}:i_{j}\leq\ell\leq i_{j+1}-1\}$ for $1\leq j\leq k-1$, and $u_{\overline{k}}=\max\{u_{\ell}:i_{k}\leq\ell\leq i_{k+1}\}$ and $u_{\underline{k}}=\min\{u_{\ell}:i_{k}\leq\ell\leq i_{k+1}\}$.
We claim that $u_{\underline{j}}\geq u_{\overline{j+1}}$ for $1\leq j\leq k-1$. If these inequalities do not hold for some $1\leq j\leq k-1$, assume, without loss of generality, that $u_{\underline{1}}<u_{\overline{2}}$. One can assume that $u_{1}=u_{\underline{1}}$ and $u_{i_{2}}=u_{\overline{2}}$. In this case, let $\widetilde{\bm{u}}$ be the vector obtained from $\bm{u}$ by exchanging its first and $i_{2}$th components. Immediately, $f(\widetilde{\bm{u}})=f(\bm{u})$, and

\begin{align*}
\|\widetilde{\bm{u}}-\bm{x}\|_{2}^{2}-\|\bm{u}-\bm{x}\|_{2}^{2} &= (u_{\overline{2}}-x_{1})^{2}+(u_{\underline{1}}-x_{i_{2}})^{2}-(u_{\underline{1}}-x_{1})^{2}-(u_{\overline{2}}-x_{i_{2}})^{2}\\
&= 2(u_{\underline{1}}-u_{\overline{2}})(x_{1}-x_{i_{2}})<0
\end{align*}

due to the conditions $x_{1}=x_{i_{1}}>x_{i_{2}}$ and $u_{\underline{1}}<u_{\overline{2}}$. This contradicts our assumption that $\bm{u}\in\operatorname*{prox}_{\lambda f}(\bm{x})$.

Finally, since all entries in each block of $\bm{x}$ are the same, arranging the entries of $\bm{u}\in\operatorname*{prox}_{\lambda f}(\bm{x})$ for the indices in the same block in descending order results in a vector that still belongs to $\operatorname*{prox}_{\lambda f}(\bm{x})$. Thus, there exists a point $\bm{u}\in\operatorname*{prox}_{\lambda f}(\bm{x})$ such that $\bm{u}\in\mathbb{R}_{\downarrow}^{n}$. ∎

2.2 Reformulation

This paper focuses on the proximity operator of scale and signed permutation invariant functions. Our approach to computing it is based on the following observation: the space $\mathbb{R}^{n}$ is isomorphic to the Cartesian product of $\mathbb{R}$ and $\mathbb{S}^{n-1}$. That is, a nonzero vector $\bm{u}\in\mathbb{R}^{n}$ can be converted to a pair $(r,\bm{w})\in\mathbb{R}\times\mathbb{S}^{n-1}$ such that

\[ \bm{u}=r\bm{w}, \]

where

\[ r=\|\bm{u}\|_{2}\in\mathbb{R}\quad\mbox{and}\quad\bm{w}=\frac{\bm{u}}{\|\bm{u}\|_{2}}\in\mathbb{S}^{n-1}. \]

With this conversion, the task of finding a point $\bm{u}\in\mathrm{prox}_{f}(\bm{x})$ transforms into finding a pair $(r,\bm{w})\in\mathbb{R}\times\mathbb{S}^{n-1}$ such that $\bm{u}=r\bm{w}$.
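As a minimal illustration of this conversion, the following plain-Python sketch splits a nonzero vector into its norm $r$ and direction $\bm{w}$ and then reassembles it. The helper names `to_polar` and `from_polar` are hypothetical and are not taken from the paper.

```python
import math

def to_polar(u):
    """Return (r, w) with r = ||u||_2 and w = u / ||u||_2 for a nonzero u."""
    r = math.sqrt(sum(ui * ui for ui in u))
    if r == 0.0:
        # The direction w is not defined at the origin.
        raise ValueError("u must be nonzero")
    return r, [ui / r for ui in u]

def from_polar(r, w):
    """Reassemble u = r * w."""
    return [r * wi for wi in w]

r, w = to_polar([3.0, 0.0, -4.0])   # r is the Euclidean norm, w has unit norm
u = from_polar(r, w)                # recovers the input up to rounding
```

The zero vector is handled separately because its direction is undefined, which mirrors the separate treatment of $r=0$ in the analysis of the proximity operator.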

Theorem 2.3.

Let $f$ be a scale and signed permutation invariant function in $\Gamma(\mathbb{R}^{n})$, and let $\rho>0$. Consider a vector $\bm{x}\in\mathbb{R}_{\downarrow}^{n}$ and define

\[ F(\bm{u}):=\frac{\rho}{2}\|\bm{u}-\bm{x}\|_{2}^{2}+f(\bm{u}). \tag{1} \]

Then $\bm{x}^{\star}\in\mathrm{prox}_{\frac{1}{\rho}f}(\bm{x})$ if and only if $\bm{x}^{\star}$ is given by

\[ \bm{x}^{\star}\in\left\{\begin{array}{ll}
\{\mathbf{0}\}, & \hbox{if $F(\mathbf{0})<F(\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star})$;}\\
\{\mathbf{0},\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}\}, & \hbox{if $F(\mathbf{0})=F(\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star})$;}\\
\{\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}\}, & \hbox{otherwise,}
\end{array}\right. \tag{2} \]

where $\bm{w}^{\star}$ is a solution to the following optimization problem

\[ \min\left\{-\frac{\rho}{2}\langle\bm{x},\bm{w}\rangle^{2}+f(\bm{w}):\bm{w}\in\mathbb{S}^{n-1}_{+}\right\}. \tag{3} \]
Proof.

From the definition of the proximity operator,

\[ \mathrm{prox}_{\frac{1}{\rho}f}(\bm{x})=\operatorname*{arg\,min}\left\{F(\bm{u}):\bm{u}\in\mathbb{R}^{n}\right\}. \]

By Lemma 2.1 and Lemma 2.2, for $\bm{x}\in\mathbb{R}_{\downarrow}^{n}$ we establish that

\[ \operatorname*{arg\,min}\left\{F(\bm{u}):\bm{u}\in\mathbb{R}^{n}\right\}=\operatorname*{arg\,min}\left\{F(\bm{u}):\bm{u}\in\mathbb{R}^{n}_{+}\right\}. \]

To delve deeper into the optimization problem on the right-hand side, we express $\bm{u}=r\bm{w}$ with $r\geq 0$ and $\bm{w}\in\mathbb{S}_{+}^{n-1}$. Consequently, for $r=0$,

\[ F(\mathbf{0})=\frac{\rho}{2}\|\bm{x}\|_{2}^{2}+f(\mathbf{0}) \]

and for $r>0$

\begin{align*}
F(\bm{u}) &= \frac{\rho}{2}\|r\bm{w}-\bm{x}\|_{2}^{2}+f(r\bm{w}) \tag{4}\\
&= \frac{\rho}{2}\left(r^{2}-2r\langle\bm{w},\bm{x}\rangle+\|\bm{x}\|_{2}^{2}\right)+f(\bm{w})\\
&= \frac{\rho}{2}\left(r-\langle\bm{w},\bm{x}\rangle\right)^{2}+\frac{\rho}{2}\|\bm{x}\|_{2}^{2}+\left(-\frac{\rho}{2}\langle\bm{w},\bm{x}\rangle^{2}+f(\bm{w})\right).
\end{align*}

In equation (4), the terms are as follows: the first term $\frac{\rho}{2}(r-\langle\bm{w},\bm{x}\rangle)^{2}$ can always achieve the minimum value $0$ by taking $r=\langle\bm{w},\bm{x}\rangle$; the second term $\frac{\rho}{2}\|\bm{x}\|_{2}^{2}$ is constant with respect to the pair $(r,\bm{w})$; and the third term $-\frac{\rho}{2}\langle\bm{w},\bm{x}\rangle^{2}+f(\bm{w})$ is solely a function of $\bm{w}$. Therefore, we seek $\bm{w}^{\star}$ that minimizes the third term with respect to $\bm{w}$, i.e., that solves the optimization problem (3), and then form the expression $\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}$. Comparing $F(\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star})$ with $F(\mathbf{0})$ yields the three cases in (2). Hence, the conclusion of this theorem holds. ∎

In the following discussion, we use the notation $F$ in (1) to represent the objective function for $\mathrm{prox}_{\frac{1}{\rho}f}(\bm{x})$ and denote

\[ G(\bm{w}):=-\frac{\rho}{2}\langle\bm{x},\bm{w}\rangle^{2}+f(\bm{w}) \tag{5} \]

to represent the objective function of (3).

The significance of the scale and signed permutation invariance of $f$ becomes evident in the proof of the theorem above. The scale invariance of $f$ moves the discussion from $\mathbb{R}^{n}$ to $\mathbb{S}^{n-1}$, while the signed permutation invariance narrows the focus from $\mathbb{S}^{n-1}$ to $\mathbb{S}^{n-1}_{+}$. Together they allow us to separate the roles of $r$ and $\bm{w}$ and to solve an optimization problem that involves $\bm{w}$ exclusively.

In accordance with Theorem 2.3, the process of determining the pair $(r,\bm{w})$ involves three distinct steps:

  • $\bm{w}$-step: In this step, the objective is to find an optimal solution $\bm{w}^{\star}$ to the optimization problem (3).

  • $r$-step: Following the $\bm{w}$-step, the corresponding $r^{\star}$ is computed as $r^{\star}=\langle\bm{x},\bm{w}^{\star}\rangle$, where $\bm{w}^{\star}$ is the output from the $\bm{w}$-step.

  • $d$-step: This final step determines $\bm{x}^{\star}$ according to (2).

Upon completing these three steps, as shown in (2), $\bm{x}^{\star}$ belongs to $\mathrm{prox}_{\frac{1}{\rho}f}(\bm{x})$. For ease of reference in the subsequent discussion, this procedure is referred to as WRD ($\bm{w}$-step, $r$-step, $d$-step).
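The three steps can be sketched in plain Python as follows. This is only an illustrative outline under stated assumptions. The function `wrd` and its argument `w_step` (a user-supplied solver for problem (3)) are hypothetical names, and the input is assumed to be sorted so that $x_{1}\geq\ldots\geq x_{n}\geq 0$. The sanity check uses the constant function $f=0$, which is trivially scale and signed permutation invariant and whose proximity operator is the identity.

```python
def wrd(x, rho, f, w_step):
    """Sketch of the WRD procedure for x sorted as x_1 >= ... >= x_n >= 0.

    f      : the scale and signed permutation invariant function (a callable)
    w_step : hypothetical user-supplied solver returning a minimizer of (3)
    """
    def F(u):
        # Objective (1): (rho/2) * ||u - x||_2^2 + f(u)
        return 0.5 * rho * sum((ui - xi) ** 2 for ui, xi in zip(u, x)) + f(u)

    w = w_step(x, rho)                          # w-step
    r = sum(xi * wi for xi, wi in zip(x, w))    # r-step: r = <x, w>
    candidate = [r * wi for wi in w]
    zero = [0.0] * len(x)
    # d-step: compare F(0) with F(<x, w> w) as in (2); on a tie both points
    # are valid, and this sketch returns the nonzero candidate.
    return zero if F(zero) < F(candidate) else candidate

# Sanity check with f = 0: the minimizer of (3) is the normalized x,
# and the procedure should return x itself (up to rounding).
def w_step_zero(x, rho):
    norm = sum(xi * xi for xi in x) ** 0.5
    return [xi / norm for xi in x]

result = wrd([3.0, 2.0, 1.0], 2.0, lambda u: 0.0, w_step_zero)
```

All of the difficulty is hidden in `w_step`; the $r$-step and $d$-step are a single inner product and a single comparison.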

To show the applicability of the WRD procedure, we present the proximity operator of the $\ell_{0}$ norm, a typical scale and signed permutation invariant function.

Example 2.4.

The proximity operator of the $\ell_{0}$ norm at $\bm{x}$ with index $1/\rho$ is (see, e.g., [19, 17])

\[ (\mathrm{prox}_{\frac{1}{\rho}\|\cdot\|_{0}}(\bm{x}))_{i}=\left\{\begin{array}{ll}
\{x_{i}\}, & \hbox{if $|x_{i}|>\sqrt{2/\rho}$;}\\
\{0,x_{i}\}, & \hbox{if $|x_{i}|=\sqrt{2/\rho}$;}\\
\{0\}, & \hbox{otherwise.}
\end{array}\right. \]
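The entrywise formula above is a hard-thresholding rule. A minimal Python sketch, assuming the hypothetical function name `prox_l0` and returning a single element of the set-valued operator, could look like this:

```python
import math

def prox_l0(x, rho):
    """One element of prox_{(1/rho)||.||_0}(x), computed entrywise."""
    threshold = math.sqrt(2.0 / rho)
    # Entries strictly above the threshold are kept; on a tie (|x_i| equal to
    # the threshold) both 0 and x_i are valid, and this sketch picks 0.
    return [xi if abs(xi) > threshold else 0.0 for xi in x]

y = prox_l0([3.0, 1.0, -0.5], rho=2.0)   # threshold is sqrt(2/2) = 1
```

Here the first entry is kept, the second lies exactly on the threshold and is set to zero by the tie-breaking choice, and the third is below the threshold.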

We intend to apply the WRD procedure for computing $\mathrm{prox}_{\frac{1}{\rho}\|\cdot\|_{0}}$. Assuming $\bm{x}\in\mathbb{R}^{n}_{\downarrow}$, and following the approach used in the proof of Theorem 2.3, we define $F(\bm{u}):=\frac{\rho}{2}\|\bm{u}-\bm{x}\|_{2}^{2}+\|\bm{u}\|_{0}$. The next step involves seeking the optimal solution to optimization problem (3) for $\bm{w}\in\mathbb{S}^{n-1}_{+}$, where $G(\bm{w}):=-\frac{\rho}{2}\langle\bm{x},\bm{w}\rangle^{2}+\|\bm{w}\|_{0}$. Thus, for $\bm{w}\in\mathbb{S}^{n-1}_{+}$ with an $\ell_{0}$ norm of $k$, the smallest value of $G$ is achieved when $\bm{w}$ is aligned with the first $k$ entries of $\bm{x}$, that is,

\[ G\left(\frac{\bm{x}_{[k]}}{\|\bm{x}_{[k]}\|_{2}}\right)=-\frac{\rho}{2}\|\bm{x}_{[k]}\|_{2}^{2}+k=\sum_{i=1}^{k}\left(-\frac{\rho}{2}x_{i}^{2}+1\right). \]

Here 𝐱[k]subscript𝐱delimited-[]𝑘\bm{x}_{[k]}bold_italic_x start_POSTSUBSCRIPT [ italic_k ] end_POSTSUBSCRIPT keeps the first k𝑘kitalic_k entries of 𝐱𝐱\bm{x}bold_italic_x and sets the remaining entries zeros. Therefore, the output in the 𝐰𝐰\bm{w}bold_italic_w-step of the WRD procedure is given by

$$\operatorname*{arg\,min}_{\bm{w}\in\mathbb{S}^{n-1}_{+}}G(\bm{w})=\left\{\begin{array}{ll}
\left\{\frac{\bm{x}_{[1]}}{\|\bm{x}_{[1]}\|_{2}}\right\}, & \hbox{if $x_{1}<\sqrt{2/\rho}$;}\\
\left\{\frac{\bm{x}_{S}}{\|\bm{x}_{S}\|_{2}}:S\subseteq[p],|S|\geq 1\right\}, & \hbox{if $\exists p\in[n]$ s.t. $x_{1}=x_{p}=\sqrt{2/\rho}>x_{p+1}$;}\\
\left\{\frac{\bm{x}_{[k]}}{\|\bm{x}_{[k]}\|_{2}}\right\}, & \hbox{if $\exists k\in[n]$ s.t. $x_{k}>\sqrt{2/\rho}>x_{k+1}$;}\\
\left\{\frac{\bm{x}_{[k]\cup S}}{\|\bm{x}_{[k]\cup S}\|_{2}}:S\subseteq[p],|S|\geq 1\right\}, & \hbox{if $\exists k\in[n]$ and $p\in[n-k]$ s.t.}\\
 & \hbox{$x_{k}>\sqrt{2/\rho}=x_{k+1}=x_{k+p}>x_{k+p+1}$.}
\end{array}\right.$$

This output represents the solutions to the $\bm{w}$-step of the WRD. Subsequently, choosing a vector $\bm{w}^{\star}\in\operatorname*{arg\,min}_{\bm{w}\in\mathbb{S}^{n-1}_{+}}G(\bm{w})$, the $r$-step generates $r=\langle\bm{x},\bm{w}^{\star}\rangle$. With the pair $(r,\bm{w}^{\star})$, the $d$-step of the WRD compares $F(\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star})$ with $F(\mathbf{0})$, resulting in

$$F(\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star})-F(\mathbf{0})=\sum_{i=1}^{k}\left(-\frac{\rho}{2}x_{i}^{2}+1\right),$$

where $k$ is equal to $1$ if $x_{1}\leq\sqrt{2/\rho}$ or is the integer such that $x_{k}>\sqrt{2/\rho}\geq x_{k+1}$. Clearly, the vector $\mathbf{0}$ is in $\mathrm{prox}_{\frac{1}{\rho}\|\cdot\|_{0}}(\bm{x})$ if $x_{1}<\sqrt{2/\rho}$, and both $\mathbf{0}$ and $\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}$ are in $\mathrm{prox}_{\frac{1}{\rho}\|\cdot\|_{0}}(\bm{x})$ if $x_{1}=\sqrt{2/\rho}$; otherwise $\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}$ is in $\mathrm{prox}_{\frac{1}{\rho}\|\cdot\|_{0}}(\bm{x})$. These discussions affirm that the WRD procedure accurately recovers the proximity operator of the $\ell_{0}$ norm.
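The three steps for the $\ell_{0}$ norm can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names `prox_l0_wrd` and `hard_threshold` are hypothetical, the input is assumed to be nonnegative and sorted in nonincreasing order, and in the tie cases the sketch returns only one element of the (set-valued) proximity operator.

```python
import numpy as np

def prox_l0_wrd(x, rho):
    """Return one element of prox_{(1/rho)||.||_0}(x) via the w-, r-, and d-steps.

    Assumption: x is nonnegative and sorted in nonincreasing order.
    """
    x = np.asarray(x, dtype=float)
    zero = np.zeros_like(x)
    thresh = np.sqrt(2.0 / rho)
    # w-step: keep the entries strictly above sqrt(2/rho), but at least one entry.
    k = max(1, int(np.sum(x > thresh)))
    x_k = zero.copy()
    x_k[:k] = x[:k]
    norm = np.linalg.norm(x_k)
    if norm == 0.0:  # x is the zero vector, so 0 is the minimizer
        return zero
    w = x_k / norm
    # r-step: optimal radius along the direction w.
    r = float(x @ w)
    # d-step: sign of F(r w) - F(0) = sum_{i<=k} (1 - rho/2 * x_i^2).
    gap = float(np.sum(1.0 - 0.5 * rho * x[:k] ** 2))
    return r * w if gap < 0 else zero

def hard_threshold(x, rho):
    """Classical hard-thresholding form of the same operator, for comparison."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) > np.sqrt(2.0 / rho), x, 0.0)
```

On sorted nonnegative inputs the two functions should agree, which is a quick way to sanity-check the case analysis above.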

In the rest of the paper, we focus on computing the proximity operator of the function below:

$$h_{p}(\bm{x})=\left\{\begin{array}{ll}\left(\frac{\|\bm{x}\|_{1}}{\|\bm{x}\|_{2}}\right)^{p}, & \hbox{if $\bm{x}\neq\mathbf{0}$;}\\ 0, & \hbox{otherwise,}\end{array}\right.\tag{6}$$

for $p=1$ and $2$. This function is lower semicontinuous and for all nonzero vectors $\bm{x}\in\mathbb{R}^{n}$, $1\leq h_{p}(\bm{x})\leq n^{p/2}$. Thus, the proximity operator of $h_{p}$ at any point is nonempty. Notably, setting the value of $h_{p}$ at the origin to any value smaller than or equal to 1 preserves the lower semicontinuity of the function. For example, $h_{1}(\mathbf{0})$ is set to be $1$ as illustrated in [16]. Therefore, our proposed WRD procedure remains applicable. Lastly, it’s important to note that in $\mathbb{R}$, our definition of $h_{p}$ aligns consistently with the $\ell_{0}$ norm, that is, $h_{p}(\bm{x})=\|\bm{x}\|_{0}$ for $\bm{x}\in\mathbb{R}$.
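As a small illustration of definition (6), the sketch below evaluates $h_{p}$ and can be used to spot-check the bounds $1\leq h_{p}(\bm{x})\leq n^{p/2}$ and the invariance under scaling, permutation, and sign changes. The function name `h` is a hypothetical helper introduced here for illustration only.

```python
import numpy as np

def h(x, p):
    """Evaluate h_p(x) = (||x||_1 / ||x||_2)^p, with h_p(0) = 0 as in (6)."""
    x = np.asarray(x, dtype=float)
    n2 = np.linalg.norm(x, 2)
    if n2 == 0.0:
        return 0.0
    return float((np.linalg.norm(x, 1) / n2) ** p)
```

For instance, a one-sparse vector attains the lower bound $1$, and a constant vector in $\mathbb{R}^{n}$ attains the upper bound $n^{p/2}$.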

In the next section, we first consider the computation of the proximity operator of $h_{2}$.

3 The Proximity Operator of $h_{2}$

We use the WRD procedure to compute the proximity operator of $h_{2}$. We begin by writing out the optimization problem (3) associated with the $\bm{w}$-step of the WRD.

Define $\bm{e}$ as a vector with all its components $1$. For $\bm{x}\in\mathbb{R}_{+}^{n}$, we have

$$\langle\bm{w},\bm{x}\rangle^{2}=\bm{w}^{\top}\bm{x}\bm{x}^{\top}\bm{w}\quad\mbox{and}\quad\|\bm{w}\|_{1}^{2}=\bm{w}^{\top}\bm{e}\bm{e}^{\top}\bm{w}.$$

Set

$$\mathsf{A}_{\rho,\bm{x}}:=2\bm{e}\bm{e}^{\top}-\rho\bm{x}\bm{x}^{\top}.\tag{7}$$

The corresponding function $G$ in (5) becomes

$$G(\bm{w})=\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}.$$

Hence, the optimization problem (3) is a quadratic program constrained to $\mathbb{S}_{+}^{n-1}$.
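To make the quadratic form concrete, the following sketch builds the matrix in (7) and evaluates $G$. Expanding (7) with the two identities above gives $\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}=\|\bm{w}\|_{1}^{2}-\frac{\rho}{2}\langle\bm{w},\bm{x}\rangle^{2}$ for nonnegative $\bm{w}$, which the code can be used to confirm. The names `A_matrix` and `G` are illustrative choices, not part of the paper.

```python
import numpy as np

def A_matrix(x, rho):
    """Build A_{rho,x} = 2 e e^T - rho x x^T as in (7)."""
    x = np.asarray(x, dtype=float)
    e = np.ones_like(x)
    return 2.0 * np.outer(e, e) - rho * np.outer(x, x)

def G(w, x, rho):
    """Evaluate the quadratic form G(w) = 0.5 * w^T A_{rho,x} w."""
    w = np.asarray(w, dtype=float)
    return float(0.5 * w @ A_matrix(x, rho) @ w)
```

For example, with $\bm{x}=(2,1)$, $\rho=1$, and the unit vector $\bm{w}=(0.6,0.8)$, one has $\|\bm{w}\|_{1}^{2}=1.96$ and $\frac{\rho}{2}\langle\bm{w},\bm{x}\rangle^{2}=2$, so $G(\bm{w})=-0.04$.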

We immediately obtain the following result concerning the proximity operator of $h_{2}$ at points that are multiples of the vector $\bm{e}$:

Theorem 3.1.

Let $\rho>0$ and $\bm{x}=\alpha\bm{e}$ for some $\alpha>0$. Then

$$\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})=\left\{\begin{array}{ll}\{\alpha\bm{e}\}, & \hbox{if $\rho\alpha^{2}>2$;}\\ \{\mathbf{0}\}\cup\{\alpha\|\bm{w}\|_{1}\bm{w}:\bm{w}\in\mathbb{S}_{+}^{n-1}\}, & \hbox{if $\rho\alpha^{2}=2$;}\\ \{\mathbf{0}\}, & \hbox{if $\rho\alpha^{2}<2$.}\end{array}\right.$$
Proof.

When $\bm{x}=\alpha\bm{e}$ for some $\alpha>0$, we have $\mathsf{A}_{\rho,\bm{x}}=(2-\rho\alpha^{2})\bm{e}\bm{e}^{\top}$ from (7). The objective function of problem (3) is $G(\bm{w})=\frac{1}{2}(2-\rho\alpha^{2})\|\bm{w}\|_{1}^{2}$. To determine the minimal value of this function on $\mathbb{S}_{+}^{n-1}$ and the points at which it is attained, we distinguish three situations according to the value of $\rho\alpha^{2}$.

If $\rho\alpha^{2}>2$, the minimal value of $\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}$ is achieved at the $\bm{w}\in\mathbb{S}_{+}^{n-1}$ that has the largest $\ell_{1}$ norm. Clearly, the optimal $\bm{w}^{\star}$ must be $\frac{1}{\sqrt{n}}\bm{e}$ and $G(\bm{w}^{\star})=\frac{1}{2}(2-\rho\alpha^{2})n<0$. Hence, $\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})=\{\langle\alpha\bm{e},\bm{w}^{\star}\rangle\bm{w}^{\star}\}=\{\alpha\bm{e}\}$.

If $\rho\alpha^{2}=2$, then $G(\bm{w})=0$ for all $\bm{w}\in\mathbb{S}^{n-1}_{+}$. Note that $\langle\alpha\bm{e},\bm{w}\rangle\bm{w}=\alpha\|\bm{w}\|_{1}\bm{w}$. Hence, $\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})=\{\mathbf{0}\}\cup\{\alpha\|\bm{w}\|_{1}\bm{w}:\bm{w}\in\mathbb{S}_{+}^{n-1}\}$.

Finally, if $\rho\alpha^{2}<2$, then the minimal value of $G$ on $\mathbb{S}_{+}^{n-1}$ is achieved at $\bm{w}^{\star}\in\{\bm{e}_{i}:i\in[n]\}$ and $G(\bm{e}_{i})=\frac{1}{2}(2-\rho\alpha^{2})>0$ for all $i\in[n]$. Hence, $\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})=\{\mathbf{0}\}$. ∎
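The case analysis of Theorem 3.1 can be spot-checked numerically. The sketch below assumes the standard characterization of $\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})$ as the set of minimizers of $\frac{\rho}{2}\|\bm{u}-\bm{x}\|_{2}^{2}+h_{2}(\bm{u})$; the names `h2`, `objective`, and `prox_h2_const` are hypothetical, and in the tie case $\rho\alpha^{2}=2$ the function returns only $\mathbf{0}$, one of the several minimizers.

```python
import numpy as np

def h2(u):
    """h_2(u) = (||u||_1 / ||u||_2)^2 with h_2(0) = 0."""
    n2 = np.linalg.norm(u)
    return 0.0 if n2 == 0.0 else float((np.sum(np.abs(u)) / n2) ** 2)

def objective(u, x, rho):
    """Assumed prox objective: rho/2 * ||u - x||^2 + h_2(u)."""
    return 0.5 * rho * float(np.sum((u - x) ** 2)) + h2(u)

def prox_h2_const(alpha, n, rho):
    """One element of prox_{(1/rho) h_2}(alpha * e), read off from the three cases."""
    if rho * alpha ** 2 > 2:
        return alpha * np.ones(n)
    return np.zeros(n)  # for rho * alpha^2 == 2 this is one of several minimizers
```

Comparing `objective` at the returned point against its values at randomly drawn points gives a simple, non-exhaustive consistency check of the theorem.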

By Lemma 2.1, we restrict our attention to the proximity operator of $h_{2}$ on $\mathbb{R}_{\downarrow}^{n}$. The complete discussion is presented in the following two subsections. In the first subsection, we analyze the proximity operator of $h_{2}$ specifically in $\mathbb{R}^{2}$. In the second subsection, we begin by investigating the properties of the eigenvectors of the matrix $\mathsf{A}_{\rho,\bm{x}}$. The eigenvector corresponding to a negative eigenvalue plays a pivotal role in determining the solution in the $\bm{w}$-step of the WRD procedure. Using these properties, we explicitly derive the proximity operator of $h_{2}$ over the entire space $\mathbb{R}^{n}$.

3.1 Special case: the proximity operator of $h_{2}$ on $\mathbb{R}^{2}$

The following result is about the proximity operator of $h_{2}$ on $\mathbb{R}^{2}_{\downarrow}$.

Theorem 3.2.

For $\rho>0$ and $\bm{x}\in\mathbb{R}_{\downarrow}^{2}$ not a multiple of $\bm{e}$, write

$$\theta^{\star}=\left\{\begin{array}{ll}\frac{1}{2}\arctan\left(\frac{-2(2-\rho x_{1}x_{2})}{\rho(x_{1}^{2}-x_{2}^{2})}\right), & \hbox{if $\rho x_{1}x_{2}>2$;}\\ 0, & \hbox{if $\rho x_{1}x_{2}\leq 2$.}\end{array}\right.$$

Then $\bm{w}^{\star}=\begin{bmatrix}\cos\theta^{\star}&\sin\theta^{\star}\end{bmatrix}^{\top}$ is the optimal solution to problem (3). Finally,

$$\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})=\left\{\begin{array}{ll}\{\mathbf{0}\}, & \hbox{if $\rho x_{1}x_{2}\leq 2$ and $\rho x_{1}^{2}<2$;}\\ \{\mathbf{0},x_{1}\bm{e}_{1}\}, & \hbox{if $\rho x_{1}x_{2}\leq 2$ and $\rho x_{1}^{2}=2$;}\\ \{x_{1}\bm{e}_{1}\}, & \hbox{if $\rho x_{1}x_{2}\leq 2$ and $\rho x_{1}^{2}>2$;}\\ \{\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}\}, & \hbox{if $\rho x_{1}x_{2}>2$.}\end{array}\right.$$
Proof.

For $\bm{w}\in\mathbb{S}^{1}_{+}$, we have

$$G(\bm{w})=\frac{1}{2}(2-\rho x_{1}^{2})w_{1}^{2}+\frac{1}{2}(2-\rho x_{2}^{2})w_{2}^{2}+(2-\rho x_{1}x_{2})w_{1}w_{2}.$$

Write $w_{1}=\cos\theta$ and $w_{2}=\sin\theta$. The function $G$ can be written as

$$G(\bm{w})=\frac{1}{2}(2-\rho x_{2}^{2})-\frac{\rho}{4}(x_{1}^{2}-x_{2}^{2})+\underbrace{\frac{\rho}{4}(x_{2}^{2}-x_{1}^{2})\cos(2\theta)+\frac{1}{2}(2-\rho x_{1}x_{2})\sin(2\theta)}_{=:\,Q(\theta)}.$$
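As a check on this expansion (an intermediate step spelled out here, not part of the original argument), substitute the double-angle identities $\cos^{2}\theta=\frac{1+\cos(2\theta)}{2}$, $\sin^{2}\theta=\frac{1-\cos(2\theta)}{2}$, and $\sin\theta\cos\theta=\frac{\sin(2\theta)}{2}$ into the expression for $G(\bm{w})$ in terms of $w_{1}$ and $w_{2}$:

$$\begin{aligned}
G(\bm{w})&=\frac{1}{4}(2-\rho x_{1}^{2})\bigl(1+\cos(2\theta)\bigr)+\frac{1}{4}(2-\rho x_{2}^{2})\bigl(1-\cos(2\theta)\bigr)+\frac{1}{2}(2-\rho x_{1}x_{2})\sin(2\theta)\\
&=1-\frac{\rho}{4}(x_{1}^{2}+x_{2}^{2})+\frac{\rho}{4}(x_{2}^{2}-x_{1}^{2})\cos(2\theta)+\frac{1}{2}(2-\rho x_{1}x_{2})\sin(2\theta),
\end{aligned}$$

and the constant term satisfies $1-\frac{\rho}{4}(x_{1}^{2}+x_{2}^{2})=\frac{1}{2}(2-\rho x_{2}^{2})-\frac{\rho}{4}(x_{1}^{2}-x_{2}^{2})$.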

It is clear that minimizing $\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}$ for $\bm{w}\in\mathbb{S}^{1}_{+}$ is equivalent to minimizing the function $Q(\theta)$ for $\theta\in[0,\pi/2]$. By Lemma 2.1 and Theorem 2.3, we can restrict the parameter to $\theta\in[0,\pi/4]$.

To investigate the global minimizer of $Q$ over the interval $\theta\in[0,\pi/4]$, we compute the derivative of $Q$ as follows

$$Q^{\prime}(\theta)=\frac{1}{2}\rho(x_{1}^{2}-x_{2}^{2})\sin(2\theta)+(2-\rho x_{1}x_{2})\cos(2\theta).$$

We consider two cases. Case 1: If $2-\rho x_{1}x_{2}\geq 0$, then $Q^{\prime}(\theta)\geq 0$ for $\theta\in[0,\pi/4]$. Hence, $Q$ achieves its global minimum at $\theta=0$. That is, $\bm{w}^{\star}=\bm{e}_{1}$. Case 2: If $2-\rho x_{1}x_{2}<0$, $Q^{\prime}$ has only one root $\theta^{\star}$ on $[0,\pi/4]$, given by $\theta^{\star}=\frac{1}{2}\arctan\left(\frac{-2(2-\rho x_{1}x_{2})}{\rho(x_{1}^{2}-x_{2}^{2})}\right)$. Since $Q^{\prime}(0)=2-\rho x_{1}x_{2}<0$ and $Q^{\prime}(\pi/4)=\frac{1}{2}\rho(x_{1}^{2}-x_{2}^{2})>0$, $Q$ achieves its global minimum at $\theta^{\star}$. As a result, $\bm{w}^{\star}=\begin{bmatrix}\cos\theta^{\star}&\sin\theta^{\star}\end{bmatrix}^{\top}$ is the optimal solution to problem (3). This completes the $\bm{w}$-step of the WRD procedure for $\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})$. The $r$-step follows immediately with $r^{\star}=\langle\bm{x},\bm{w}^{\star}\rangle$.

Finally, for the $d$-step of the WRD procedure, we only need to know the sign of $G(\bm{w}^{\star})$. For Case 1, $G(\bm{w}^{\star})=G(\bm{e}_1)=\frac{1}{2}(2-\rho x_1^2)$, which is positive if $\rho x_1^2<2$, zero if $\rho x_1^2=2$, and negative otherwise. For Case 2, $G(\bm{w}^{\star})<G(\bm{e}_1)=\frac{1}{2}(2-\rho x_1^2)<\frac{1}{2}(2-\rho x_1x_2)<0$.
So, from the sign of $G(\bm{w}^{\star})$, we obtain $\mathrm{prox}_{\frac{1}{\rho}h_2}(\bm{x})$. ∎
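The $\bm{w}$-step and $r$-step of the proof above can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names `w_step_2d` and `G` are hypothetical, the input is assumed to satisfy $x_1>x_2\geq 0$, and the objective is assumed to be $G(\bm{w})=\frac{1}{2}\bm{w}^{\top}(2\bm{e}\bm{e}^{\top}-\rho\bm{x}\bm{x}^{\top})\bm{w}$, a form inferred from the value $G(\bm{e}_1)=\frac{1}{2}(2-\rho x_1^2)$ used in the proof.

```python
import numpy as np

def w_step_2d(x, rho):
    """w-step and r-step in R^2 (hypothetical helper).

    Assumes x = (x1, x2) with x1 > x2 >= 0 and rho > 0.
    Returns the angle theta*, the unit vector w*, and r* = <x, w*>.
    """
    x1, x2 = float(x[0]), float(x[1])
    if not (x1 > x2 >= 0.0 and rho > 0.0):
        raise ValueError("expects x1 > x2 >= 0 and rho > 0")
    c = 2.0 - rho * x1 * x2
    if c >= 0.0:
        theta = 0.0  # Case 1: the minimizer is w* = e_1
    else:
        # Case 2: the unique root of Q' on [0, pi/4]
        theta = 0.5 * np.arctan(-2.0 * c / (rho * (x1**2 - x2**2)))
    w = np.array([np.cos(theta), np.sin(theta)])
    r = float(np.dot([x1, x2], w))  # r-step
    return theta, w, r

def G(w, x, rho):
    """Assumed objective: 0.5 * w^T (2 e e^T - rho x x^T) w."""
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    return 0.5 * (2.0 * w.sum() ** 2 - rho * np.dot(x, w) ** 2)

# Example: a Case 2 input; the sign of G(w*) then drives the d-step.
theta, w, r = w_step_2d((2.0, 1.5), 1.0)
sign_of_G = np.sign(G(w, (2.0, 1.5), 1.0))
```

Only the sign of `G(w, x, rho)` is needed afterwards, in line with the $d$-step described in the proof.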

To close this subsection, we examine the proximity operator of $h_2$ with index $1/\rho$ in $\mathbb{R}^2$ through plots. Since $h_2$ can be viewed as an approximation of the $\ell_0$ norm, the proximity operator of the $\ell_0$ norm with index $1/\rho$ is included for comparison. By Lemma 2.1, it suffices to present the behavior on $\mathbb{R}^2_{\downarrow}$.

Figure 3.1(a) illustrates the proximity operator of the $\ell_0$ norm. Following Example 2.4, the set $\mathbb{R}^2_{\downarrow}$ is divided into three regions I, II, and III, as depicted in Figure 3.1(a) and defined as follows:

\begin{align*}
\text{Region I} &= \{(x_1,x_2): 0\leq x_2\leq x_1\leq \sqrt{2/\rho}\},\\
\text{Region II} &= \{(x_1,x_2): 0\leq x_2\leq \sqrt{2/\rho}<x_1\},\\
\text{Region III} &= \{(x_1,x_2): \sqrt{2/\rho}<x_2\leq x_1\}.
\end{align*}

On Region I, $\mathrm{prox}_{\frac{1}{\rho}\|\cdot\|_0}$ at the corner $\sqrt{2/\rho}\,\bm{e}$ is $\{\mathbf{0},\sqrt{2/\rho}\,\bm{e},\sqrt{2/\rho}\,\bm{e}_1,\sqrt{2/\rho}\,\bm{e}_2\}$; at each other point on the line $x_1=\sqrt{2/\rho}$, it is $\sqrt{2/\rho}\,\bm{e}_1$; and at each other point in Region I, it is $\mathbf{0}$.
On Region II, $\mathrm{prox}_{\frac{1}{\rho}\|\cdot\|_0}$ at the point $(x_1,\sqrt{2/\rho})$ is $\{(x_1,\sqrt{2/\rho}),x_1\bm{e}_1\}$, and at each other point $(x_1,x_2)$ it is $x_1\bm{e}_1$. On Region III, $\mathrm{prox}_{\frac{1}{\rho}\|\cdot\|_0}$ at each point is the point itself.
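The region-by-region description above corresponds to componentwise hard thresholding at $\sqrt{2/\rho}$. The following sketch returns one element of the proximity operator; the function name `prox_l0` is hypothetical, and keeping an entry that sits exactly on the threshold is an assumed tie-breaking rule, since the operator is set-valued there.

```python
import numpy as np

def prox_l0(x, rho):
    """One selection of the proximity operator of the l0 norm with index 1/rho.

    Entries with |x_i| below sqrt(2/rho) are set to zero; the others are kept.
    At |x_i| == sqrt(2/rho) both choices are valid; this sketch keeps x_i.
    """
    x = np.asarray(x, dtype=float)
    tau = np.sqrt(2.0 / rho)
    return np.where(np.abs(x) >= tau, x, 0.0)

# With rho = 2 the threshold is 1, so the three regions are easy to probe.
in_region_1 = prox_l0([0.8, 0.3], 2.0)   # interior of Region I
in_region_2 = prox_l0([1.5, 0.3], 2.0)   # interior of Region II
in_region_3 = prox_l0([1.5, 1.2], 2.0)   # Region III
```

The three sample points return the zero vector, $x_1\bm{e}_1$, and the point itself, respectively, matching the behavior stated for Regions I, II, and III.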

Figure 3.1(b) shows the proximity operator of $h_2$ on the line $x_1=x_2$. The operator $\mathrm{prox}_{\frac{1}{\rho}h_2}$ at each point $\alpha\bm{e}$ is $\mathbf{0}$ if $\alpha<\sqrt{2/\rho}$ (blue dash-dot line); $\{\mathbf{0}\}\cup\{\alpha\|\bm{w}\|_1\bm{w}:\bm{w}\in\mathbb{S}_+^{n-1}\}$ if $\alpha=\sqrt{2/\rho}$ (marked by the square); and $\alpha\bm{e}$ itself if $\alpha>\sqrt{2/\rho}$ (magenta dotted line). Compared with the proximity operator of the $\ell_0$ norm, the main difference is at the point $\sqrt{2/\rho}\,\bm{e}$.

Figure 3.1(c) shows the proximity operator of $h_2$ on $\mathbb{R}^2_{\downarrow}$ excluding the line $x_1=x_2$. The set $\mathbb{R}^2_{\downarrow}$ is partitioned into three regions I, II, and III, as shown in Figure 3.1(c) and defined as follows:

\begin{align*}
\text{Region I} &= \{(x_1,x_2): 0\leq x_2<x_1\leq \sqrt{2/\rho}\},\\
\text{Region II} &= \{(x_1,x_2): 0\leq x_2\leq 2/(\rho x_1),\ x_1>\sqrt{2/\rho}\},\\
\text{Region III} &= \{(x_1,x_2): 2/(\rho x_1)<x_2\leq x_1,\ x_1>\sqrt{2/\rho}\}.
\end{align*}

On Region I, $\mathrm{prox}_{\frac{1}{\rho}h_2}$ at each point on the line $x_1=\sqrt{2/\rho}$ is $\{\mathbf{0},\sqrt{2/\rho}\,\bm{e}_1\}$; at each other point, it is $\mathbf{0}$. On Region II, $\mathrm{prox}_{\frac{1}{\rho}h_2}$ at each point $(x_1,x_2)$ is $x_1\bm{e}_1$ (see the red line).
On Region III, $\mathrm{prox}_{\frac{1}{\rho}h_2}$ at each point $\bm{x}$ is $\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}$, where $\bm{w}^{\star}$ is given in Theorem 3.2. Specifically, results are presented for three lines with slopes 0.9 (green line), 0.5 (cyan line), and 0.3 (black line), and the values of $\mathrm{prox}_{\frac{1}{\rho}h_2}$ at the points on these lines are represented by dashed lines in the corresponding colors.

Figure 3.1: The plots of the proximity operator in $\mathbb{R}^2_{\downarrow}$ for (a) the $\ell_0$ norm; (b) $h_2$ on the line with slope $1$; and (c) $h_2$ on $\mathbb{R}^2_{\downarrow}$ excluding the line with slope $1$.

3.2 General case: the proximity operator of $h_2$ on $\mathbb{R}^n$

In the preceding subsection, we determined the proximity operator of $h_2$ on $\mathbb{R}^2$ through the WRD procedure. The central idea was to parameterize $\mathbb{S}^1_+$ by a single variable, which reduces the problem in the $\bm{w}$-step of the WRD procedure to a one-dimensional problem that is easy to solve. While $\mathbb{S}^{n-1}_+$ for $n>2$ can be parameterized by $(n-1)$ parameters, the resulting problem in the $\bm{w}$-step appears too intricate for direct analysis. Consequently, an alternative approach is needed.

Given the pivotal role of the $\bm{w}$-step in the WRD procedure, this subsection focuses on this step. The objective function $G$ for the $\bm{w}$-step is a quadratic form. In this context, we invoke the following two results.

Lemma 3.3 (Theorem 1 in [20]).

Consider the following optimization problem

\[
\min\left\{\frac{1}{2}\bm{w}^{\top}\mathsf{H}\bm{w}+\bm{b}^{\top}\bm{w}:\|\bm{w}\|_2=r\right\}, \tag{8}
\]

where $\mathsf{H}$ is an $n\times n$ symmetric matrix, $\bm{b}\in\mathbb{R}^n$, and $r$ is a positive number. A vector $\bm{w}^{\star}$ is a solution to this problem if and only if there is a real number $\lambda^{\star}$ such that (i) $\mathsf{H}+\lambda^{\star}\mathrm{I}$ is positive semi-definite; (ii) $(\mathsf{H}+\lambda^{\star}\mathrm{I})\bm{w}^{\star}=-\bm{b}$; and (iii) $\|\bm{w}^{\star}\|_2=r$. Such a $\lambda^{\star}$ is unique.
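As an illustration of how the three conditions in the lemma can be checked for a candidate pair, consider the following sketch. The function name `satisfies_sphere_conditions` and the tolerance `tol` are hypothetical; the example uses $\bm{b}=\mathbf{0}$, for which a scaled eigenvector of the smallest eigenvalue of $\mathsf{H}$ together with the negative of that eigenvalue satisfies the conditions.

```python
import numpy as np

def satisfies_sphere_conditions(H, b, r, w, lam, tol=1e-9):
    """Check the three optimality conditions for problem (8) numerically."""
    H = np.asarray(H, dtype=float)
    b = np.asarray(b, dtype=float)
    w = np.asarray(w, dtype=float)
    M = H + lam * np.eye(H.shape[0])
    psd = np.linalg.eigvalsh(M).min() >= -tol         # (i) H + lam*I is PSD
    stationary = np.linalg.norm(M @ w + b) <= tol     # (ii) (H + lam*I) w = -b
    on_sphere = abs(np.linalg.norm(w) - r) <= tol     # (iii) ||w||_2 = r
    return bool(psd and stationary and on_sphere)

# Example with b = 0: the smallest eigenvalue of H is -2 with eigenvector e_3.
H = np.diag([3.0, 1.0, -2.0])
b = np.zeros(3)
r = 2.0
good = satisfies_sphere_conditions(H, b, r, w=[0.0, 0.0, 2.0], lam=2.0)
bad = satisfies_sphere_conditions(H, b, r, w=[2.0, 0.0, 0.0], lam=-3.0)
```

Here `good` is `True`, while `bad` is `False` because $\mathsf{H}-3\mathrm{I}$ fails to be positive semi-definite even though the other two conditions hold.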

Lemma 3.4 ([21, 20]).

Consider the optimization problem (8). If $\bm{b}$ is orthogonal to some eigenvector associated with the smallest eigenvalue of $\mathsf{H}$, then there is no local-nonglobal minimum for (8).

Note that both Lemma 3.3 and Lemma 3.4 concern quadratic optimization problems constrained to a sphere, whereas our problem in the $\bm{w}$-step is restricted to $\mathbb{S}^{n-1}_+$.

To investigate the applicability of Lemma 3.3 to the optimization problem in the $\bm{w}$-step, we first need to understand the eigenstructure of the matrix $\mathsf{A}_{\rho,\bm{x}}$ defined in (7). This matrix is the sum of two rank-1 matrices; consequently, it possesses at most two non-zero eigenvalues. To describe the eigenstructure of $\mathsf{A}_{\rho,\bm{x}}$, we introduce the following notation:

\begin{align}
\Delta &:= \left(\frac{\rho}{2}\|\bm{x}\|_2^2+n\right)^2-2\rho\|\bm{x}\|_1^2, \tag{9}\\
\underline{\alpha} &:= \left(\frac{\rho}{2}\|\bm{x}\|_2^2+n\right)-\sqrt{\Delta}, \tag{10}\\
\overline{\alpha} &:= \left(\frac{\rho}{2}\|\bm{x}\|_2^2+n\right)+\sqrt{\Delta}, \tag{11}\\
\underline{\lambda} &:= 2n-\overline{\alpha}, \tag{12}\\
\overline{\lambda} &:= 2n-\underline{\alpha}, \tag{13}\\
\underline{\bm{w}} &:= \bm{x}-\frac{\underline{\alpha}}{\rho\|\bm{x}\|_1}\bm{e}, \tag{14}\\
\overline{\bm{w}} &:= \bm{x}-\frac{\overline{\alpha}}{\rho\|\bm{x}\|_1}\bm{e}. \tag{15}
\end{align}

We make the following observations about the above notation. The inequality $\|\bm{x}\|_1\leq\sqrt{n}\|\bm{x}\|_2$ implies that $\Delta\geq\left(\frac{\rho}{2}\|\bm{x}\|_2^2-n\right)^2$, and the inequality is strict if $\bm{x}$ is not a multiple of $\bm{e}$. In particular, $\Delta\geq 0$, and since $\Delta\leq\left(\frac{\rho}{2}\|\bm{x}\|_2^2+n\right)^2$ by (9), both $\underline{\alpha}$ and $\overline{\alpha}$ (given in (10) and (11)) are non-negative numbers. For $\overline{\lambda}$ (given in (13)):

\[
\overline{\lambda}=2n-\underline{\alpha}=-\left(\frac{\rho}{2}\|\bm{x}\|_2^2-n\right)+\sqrt{\Delta}\geq-\left(\frac{\rho}{2}\|\bm{x}\|_2^2-n\right)+\left|\frac{\rho}{2}\|\bm{x}\|_2^2-n\right|\geq 0,
\]

where equality can hold only if $\bm{x}$ is a multiple of $\bm{e}$. Similarly, for $\underline{\lambda}$ (given in (12)):

\[
\underline{\lambda}=2n-\overline{\alpha}=-\left(\frac{\rho}{2}\|\bm{x}\|_2^2-n\right)-\sqrt{\Delta}\leq-\left(\frac{\rho}{2}\|\bm{x}\|_2^2-n\right)-\left|\frac{\rho}{2}\|\bm{x}\|_2^2-n\right|\leq 0,
\]

where, again, equality can hold only if $\bm{x}$ is a multiple of $\bm{e}$. Hence, if $\bm{x}$ is not a multiple of $\bm{e}$, then $\overline{\lambda}$ is positive, while $\underline{\lambda}$ is negative.
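The quantities in (9) through (13) are straightforward to evaluate, and their signs can be checked numerically. The sketch below is only an illustration under stated assumptions: the function name `eigen_quantities` is hypothetical, and the matrix in (7) is taken to be $\mathsf{A}_{\rho,\bm{x}}=2\bm{e}\bm{e}^{\top}-\rho\bm{x}\bm{x}^{\top}$, a form inferred from the rank argument and the direct computations in this subsection rather than quoted from (7).

```python
import numpy as np

def eigen_quantities(x, rho):
    """Evaluate Delta, the two alphas, and the two lambdas from (9)-(13)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    l1 = np.abs(x).sum()
    s = 0.5 * rho * np.dot(x, x) + n          # (rho/2)*||x||_2^2 + n
    delta = s**2 - 2.0 * rho * l1**2          # (9)
    alpha_lo = s - np.sqrt(delta)             # (10)
    alpha_hi = s + np.sqrt(delta)             # (11)
    lam_lo = 2 * n - alpha_hi                 # (12)
    lam_hi = 2 * n - alpha_lo                 # (13)
    return delta, alpha_lo, alpha_hi, lam_lo, lam_hi

# A nonnegative x that is not a multiple of e.
x = np.array([3.0, 1.0, 0.5])
rho = 0.7
delta, alpha_lo, alpha_hi, lam_lo, lam_hi = eigen_quantities(x, rho)

# Assumed form of the matrix in (7); eigvalsh returns eigenvalues in ascending order.
e = np.ones_like(x)
A = 2.0 * np.outer(e, e) - rho * np.outer(x, x)
eigs = np.linalg.eigvalsh(A)
```

For this input, `lam_lo` is negative, `lam_hi` is positive, and they agree with the smallest and largest entries of `eigs`, while the remaining eigenvalue is zero up to rounding.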

The following result describes the eigenstructure of the matrix $\mathsf{A}_{\rho,\bm{x}}$.

Proposition 3.5.

Let $\mathsf{A}_{\rho,\bm{x}}$ be given in (7) for $\rho>0$ and $\bm{x}\in\mathbb{R}_+^n$. Let $\underline{\alpha}$, $\overline{\alpha}$, $\underline{\lambda}$, $\overline{\lambda}$, $\underline{\bm{w}}$, and $\overline{\bm{w}}$ be given by (10), (11), (12), (13), (14), and (15), respectively. The following statements hold:

  • (i) Assume $\bm{x}=\alpha\bm{e}$ for some $\alpha>0$. Then the matrix $\mathsf{A}_{\rho,\bm{x}}$ has only zero as its eigenvalue if $\rho\alpha^2=2$; otherwise, it has $(2-\rho\alpha^2)n$ as its only non-zero eigenvalue, with $\frac{1}{\sqrt{n}}\bm{e}$ the corresponding eigenvector.

  • (ii) Assume $\bm{x}\neq\alpha\bm{e}$ for any $\alpha$. Then the matrix $\mathsf{A}_{\rho,\bm{x}}$ has exactly one positive eigenvalue $\overline{\lambda}$ and one negative eigenvalue $\underline{\lambda}$, given by $\overline{\lambda}=2n-\underline{\alpha}$ and $\underline{\lambda}=2n-\overline{\alpha}$. The eigenvectors associated with $\overline{\lambda}$ and $\underline{\lambda}$ are $\overline{\bm{w}}=\bm{x}-\frac{\overline{\alpha}}{\rho\|\bm{x}\|_1}\bm{e}$ and $\underline{\bm{w}}=\bm{x}-\frac{\underline{\alpha}}{\rho\|\bm{x}\|_1}\bm{e}$, respectively.

Proof.

Item (i). In this case, $\mathsf{A}_{\rho,\bm{x}}=\left(2-\rho\alpha^2\right)\bm{e}\bm{e}^{\top}$. Clearly, $\mathsf{A}_{\rho,\bm{x}}=\mathbf{0}$ if $\rho\alpha^2=2$, so $\mathsf{A}_{\rho,\bm{x}}$ has only zero as its eigenvalue. Otherwise, $\mathsf{A}_{\rho,\bm{x}}$ has $\left(2-\rho\alpha^2\right)n$ as its only non-zero eigenvalue, with the corresponding eigenvector $\frac{1}{\sqrt{n}}\bm{e}$.

Item (ii). Since $\mathrm{rank}(\mathsf{A}_{\rho,\bm{x}})\leq\mathrm{rank}(2\bm{e}\bm{e}^{\top})+\mathrm{rank}(\rho\bm{x}\bm{x}^{\top})=2$, the matrix $\mathsf{A}_{\rho,\bm{x}}$ has at most two nonzero eigenvalues, which we find as follows. For any $\lambda$, a direct computation gives

\[
\mathsf{A}_{\rho,\bm{x}}(\bm{x}-\lambda\bm{e})=-\rho\left(\|\bm{x}\|_2^2-\|\bm{x}\|_1\lambda\right)\bm{x}+2\left(\|\bm{x}\|_1-n\lambda\right)\bm{e}.
\]

If the vector $\bm{x}-\lambda\bm{e}$ is an eigenvector of $\mathsf{A}_{\rho,\bm{x}}$, then $\lambda$ satisfies the equation

\[
-\rho\left(\|\bm{x}\|_2^2-\|\bm{x}\|_1\lambda\right)=\frac{2\left(\|\bm{x}\|_1-n\lambda\right)}{-\lambda},
\]

and the value $\frac{2(\|\bm{x}\|_1-n\lambda)}{-\lambda}$ is the associated eigenvalue. Simplifying the above equation leads to the following quadratic equation

\[
\frac{\rho}{2}\|\bm{x}\|_1\lambda^2-\left(\frac{\rho}{2}\|\bm{x}\|_2^2+n\right)\lambda+\|\bm{x}\|_1=0.
\]

The discriminant of the quadratic equation with variable λ𝜆\lambdaitalic_λ is ΔΔ\Deltaroman_Δ given by (9). Since 𝒙12<n𝒙22evaluated-atsuperscriptsubscriptnorm𝒙12bra𝑛𝒙22\|\bm{x}\|_{1}^{2}<n\|\bm{x}\|_{2}^{2}∥ bold_italic_x ∥ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT < italic_n ∥ bold_italic_x ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, we have Δ>(ρ2𝒙22n)20Δsuperscript𝜌2superscriptsubscriptnorm𝒙22𝑛20\Delta>(\frac{\rho}{2}\|\bm{x}\|_{2}^{2}-n)^{2}\geq 0roman_Δ > ( divide start_ARG italic_ρ end_ARG start_ARG 2 end_ARG ∥ bold_italic_x ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - italic_n ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ≥ 0. Hence, the above quadratic equation has two real roots λ1=α¯ρ𝒙1subscript𝜆1¯𝛼𝜌subscriptnorm𝒙1\lambda_{1}=\frac{\overline{\alpha}}{\rho\|\bm{x}\|_{1}}italic_λ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = divide start_ARG over¯ start_ARG italic_α end_ARG end_ARG start_ARG italic_ρ ∥ bold_italic_x ∥ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_ARG and λ2=α¯ρ𝒙1subscript𝜆2¯𝛼𝜌subscriptnorm𝒙1\lambda_{2}=\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}italic_λ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = divide start_ARG under¯ start_ARG italic_α end_ARG end_ARG start_ARG italic_ρ ∥ bold_italic_x ∥ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_ARG. 
Substituting λ1subscript𝜆1\lambda_{1}italic_λ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and λ2subscript𝜆2\lambda_{2}italic_λ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT into 2(𝒙1nλ)λ2subscriptnorm𝒙1𝑛𝜆𝜆\frac{2(\|\bm{x}\|_{1}-n\lambda)}{-\lambda}divide start_ARG 2 ( ∥ bold_italic_x ∥ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT - italic_n italic_λ ) end_ARG start_ARG - italic_λ end_ARG yield two eigenvalues λ¯¯𝜆\overline{\lambda}over¯ start_ARG italic_λ end_ARG and λ¯¯𝜆\underline{\lambda}under¯ start_ARG italic_λ end_ARG of 𝖠ρ,𝒙subscript𝖠𝜌𝒙\mathsf{A}_{\rho,\bm{x}}sansserif_A start_POSTSUBSCRIPT italic_ρ , bold_italic_x end_POSTSUBSCRIPT, respectively. In this case, we know that λ¯>0¯𝜆0\overline{\lambda}>0over¯ start_ARG italic_λ end_ARG > 0 and λ¯<0¯𝜆0\underline{\lambda}<0under¯ start_ARG italic_λ end_ARG < 0. The eigenvectors corresponding to λ¯¯𝜆\overline{\lambda}over¯ start_ARG italic_λ end_ARG and λ¯¯𝜆\underline{\lambda}under¯ start_ARG italic_λ end_ARG are 𝒘¯¯𝒘\overline{\bm{w}}over¯ start_ARG bold_italic_w end_ARG and 𝒘¯¯𝒘\underline{\bm{w}}under¯ start_ARG bold_italic_w end_ARG, respectively. ∎
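As a numerical sanity check of item (ii), the following Python sketch verifies on a small example that $\bm{x}-\lambda_i\bm{e}$ are eigenvectors of $\mathsf{A}_{\rho,\bm{x}}=2\bm{e}\bm{e}^{\top}-\rho\bm{x}\bm{x}^{\top}$ with eigenvalues of opposite signs. It assumes that $\overline{\alpha}$ and $\underline{\alpha}$ in (10) equal $\left(\frac{\rho}{2}\|\bm{x}\|_2^2+n\right)\pm\sqrt{\Delta}$, as implied by the roots of the quadratic equation above; the function names `eigenpairs` and `apply_A` are illustrative and not part of the paper.

```python
import math

def eigenpairs(rho, x):
    """Return [(eigenvalue, eigenvector)] for the two nonzero eigenvalues of
    A = 2 e e^T - rho x x^T, for nonnegative x that is not a multiple of e."""
    n = len(x)
    l1 = sum(x)                          # ||x||_1 for nonnegative x
    l2sq = sum(t * t for t in x)         # ||x||_2^2
    c = 0.5 * rho * l2sq + n
    delta = c * c - 2.0 * rho * l1 * l1  # discriminant of the quadratic in lambda
    pairs = []
    # Assumed form of alpha_bar (plus sign) and alpha_under (minus sign).
    for alpha in (c + math.sqrt(delta), c - math.sqrt(delta)):
        lam = alpha / (rho * l1)                 # root of the quadratic
        eigval = 2.0 * (l1 - n * lam) / (-lam)   # associated eigenvalue
        pairs.append((eigval, [t - lam for t in x]))
    return pairs

def apply_A(rho, x, v):
    """Compute A v without forming A: A v = 2 (e^T v) e - rho (x^T v) x."""
    s = sum(v)
    d = sum(a * b for a, b in zip(x, v))
    return [2.0 * s - rho * d * a for a in x]

rho, x = 2.0, [3.0, 2.0, 1.0]
for eigval, w in eigenpairs(rho, x):
    Aw = apply_A(rho, x, w)
    residual = max(abs(a - eigval * b) for a, b in zip(Aw, w))
    print(eigval, residual)  # residual should be at rounding level
```

The first pair printed corresponds to $(\overline{\lambda},\overline{\bm{w}})$ and the second to $(\underline{\lambda},\underline{\bm{w}})$.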

We remark that for $\bm{x}\in\mathbb{R}^{n}_{\downarrow}$, the largest component of $\underline{\bm{w}}$ in (14), that is, the first component $x_{1}-\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}$, is always non-negative. Indeed, by (9), (10), and (14), we have

\begin{align*}
x_{1}-\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}
&= x_{1}-\frac{2\rho\|\bm{x}\|_{1}^{2}}{\rho\|\bm{x}\|_{1}\left[\left(\frac{\rho}{2}\|\bm{x}\|_{2}^{2}+n\right)+\sqrt{\left(\frac{\rho}{2}\|\bm{x}\|_{2}^{2}+n\right)^{2}-2\rho\|\bm{x}\|_{1}^{2}}\right]}\\
&\geq x_{1}-\frac{2\|\bm{x}\|_{1}}{\left(\frac{\rho}{2}\|\bm{x}\|_{2}^{2}+n\right)+\left|\frac{\rho}{2}\|\bm{x}\|_{2}^{2}-n\right|}
=\begin{cases}
x_{1}-\frac{2\|\bm{x}\|_{1}}{\rho\|\bm{x}\|_{2}^{2}}, & \text{if $\rho\|\bm{x}\|_{2}^{2}\geq 2n$;}\\
x_{1}-\frac{2\|\bm{x}\|_{1}}{2n}, & \text{if $\rho\|\bm{x}\|_{2}^{2}<2n$,}
\end{cases}
\end{align*}

which is always non-negative. This derivation also indicates that $x_{1}-\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}>0$ always holds if $\bm{x}$ is not parallel to $\bm{e}$.
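The chain of inequalities above can be checked numerically. The sketch below evaluates both the first component of $\underline{\bm{w}}$ and the lower bound from the derivation, using the expression $\underline{\alpha}=2\rho\|\bm{x}\|_1^2\big/\big[(\frac{\rho}{2}\|\bm{x}\|_2^2+n)+\sqrt{\Delta}\big]$ that appears in the first line of the display. The helper name `first_component_and_bound` is hypothetical, and the input is assumed to be non-negative and sorted in non-increasing order.

```python
import math

def first_component_and_bound(rho, x):
    """Return (x_1 - alpha_under / (rho * ||x||_1), lower bound from the derivation).

    x is assumed nonnegative and sorted in non-increasing order.
    """
    n = len(x)
    l1 = sum(x)
    l2sq = sum(t * t for t in x)
    c = 0.5 * rho * l2sq + n
    delta = c * c - 2.0 * rho * l1 * l1
    # alpha_under in the form used in the first line of the derivation
    alpha_under = 2.0 * rho * l1 * l1 / (c + math.sqrt(delta))
    value = x[0] - alpha_under / (rho * l1)
    bound = x[0] - 2.0 * l1 / (c + abs(0.5 * rho * l2sq - n))
    return value, bound

# One example in each regime: rho*||x||_2^2 >= 2n and rho*||x||_2^2 < 2n
for rho, x in [(2.0, [3.0, 2.0, 1.0]), (0.1, [1.0, 1.0, 0.5])]:
    print(first_component_and_bound(rho, x))
```

In both examples the first returned number should be at least the second, and the second should be non-negative.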

If the last component $x_{n}-\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}$ of $\underline{\bm{w}}$ in (14) is positive, we have the following result.

Theorem 3.6.

For $\rho>0$ and $\bm{x}\in\mathbb{R}_{\downarrow}^{n}$ not being a multiple of $\bm{e}$, if $x_{n}>\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}$, then the vector $\bm{w}^{\star}:=\frac{\underline{\bm{w}}}{\|\underline{\bm{w}}\|_{2}}$ with $\underline{\bm{w}}=\bm{x}-\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}\bm{e}$ is the solution to the optimization problem (3). Furthermore, we have

\[
\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})=\left\langle\bm{x},\bm{w}^{\star}\right\rangle\bm{w}^{\star}.
\]
Proof.

Since $\bm{x}\in\mathbb{R}_{\downarrow}^{n}$ is not a multiple of $\bm{e}$, $x_{n}>\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}$, and $\underline{\alpha}$ is nonnegative, we know that $\mathbf{0}\neq\underline{\bm{w}}\in\mathbb{R}_{\downarrow}^{n}$ and $\bm{w}^{\star}\in\mathbb{R}_{\downarrow}^{n}\cap\mathbb{S}^{n-1}_{+}$.
By identifying $\mathsf{A}_{\rho,\bm{x}}$, $\mathbf{0}$, and $1$ as $\mathsf{H}$, $\bm{b}$, and $r$ in (8) of Lemma 3.3, respectively, we know from item (ii) of Proposition 3.5 that $\mathsf{A}_{\rho,\bm{x}}-\underline{\lambda}\mathrm{I}$ is positive semi-definite and $(\mathsf{A}_{\rho,\bm{x}}-\underline{\lambda}\mathrm{I})\underline{\bm{w}}=\mathbf{0}$. Therefore, by Lemma 3.3, the unit vector $\bm{w}^{\star}\in\mathbb{R}_{+}^{n}$ is the solution to problem (3).

To determine $\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})$, we notice that the first entries of both $\bm{x}$ and $\bm{w}^{\star}$ are positive, hence $\langle\bm{x},\bm{w}^{\star}\rangle>0$. Furthermore, since $G(\bm{w}^{\star})=\underline{\lambda}<0$ for $G$ given in (5), we conclude that $F(\bm{w}^{\star})<F(\mathbf{0})$ for $F$ given in (1). This completes the proof of this theorem. ∎

There are two remarks on Theorem 3.6. The first one is that, under the conditions of this theorem, simplifying the expression $\left\langle\bm{x},\bm{w}^{\star}\right\rangle\bm{w}^{\star}$ leads to

\[
\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})=\frac{\|\bm{x}\|_{2}^{2}-\frac{\underline{\alpha}}{\rho}}{\|\bm{x}\|_{2}^{2}-2\frac{\underline{\alpha}}{\rho}+\frac{n\underline{\alpha}^{2}}{\rho^{2}\|\bm{x}\|_{1}^{2}}}\left(\bm{x}-\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}\bm{e}\right).
\]
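The closed-form expression above translates directly into a few lines of code. The following Python sketch is one possible implementation under the hypotheses of Theorem 3.6: it takes a non-negative vector sorted in non-increasing order that is not a multiple of $\bm{e}$, and it assumes $\underline{\alpha}=\left(\frac{\rho}{2}\|\bm{x}\|_2^2+n\right)-\sqrt{\Delta}$, evaluated in the equivalent quotient form used in the remark preceding the theorem. The function name `prox_h2_interior` is an illustrative choice and does not come from the paper.

```python
import math

def prox_h2_interior(rho, x):
    """Closed-form prox of (1/rho) h_2 at x when Theorem 3.6 applies.

    x: nonnegative, sorted in non-increasing order, not a multiple of e.
    Raises ValueError if the condition x_n > alpha_under / (rho * ||x||_1) fails.
    """
    n = len(x)
    l1 = sum(x)
    l2sq = sum(t * t for t in x)
    c = 0.5 * rho * l2sq + n
    delta = c * c - 2.0 * rho * l1 * l1
    alpha = 2.0 * rho * l1 * l1 / (c + math.sqrt(delta))  # alpha_under
    shift = alpha / (rho * l1)
    if x[-1] <= shift:
        raise ValueError("condition x_n > alpha_under/(rho*||x||_1) is violated")
    # Scalar factor <x, w> / ||w||_2^2 with w = x - shift * e
    scale = (l2sq - alpha / rho) / (
        l2sq - 2.0 * alpha / rho + n * alpha ** 2 / (rho ** 2 * l1 ** 2)
    )
    return [scale * (t - shift) for t in x]

print(prox_h2_interior(2.0, [3.0, 2.0, 1.0]))
```

Since the result has the form $\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}$ with a unit vector $\bm{w}^{\star}$, a simple consistency check is that its inner product with $\bm{x}$ equals its squared Euclidean norm.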

The second remark concerns the consistency of Theorem 3.6 in $\mathbb{R}^{2}_{\downarrow}$ with Theorem 3.2. That is, if the condition $x_{2}>\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}$ holds, then $\rho x_{1}x_{2}>2$ and the vectors $\bm{w}^{\star}$ in Theorem 3.6 and Theorem 3.2 are identical. To verify this with simpler expressions, let us denote

\[
a:=\sqrt{\left(\frac{\rho}{2}\|\bm{x}\|_{2}^{2}+2\right)^{2}-2\rho\|\bm{x}\|_{1}^{2}}\quad\mbox{and}\quad b:=\left(\frac{\rho}{2}\|\bm{x}\|_{2}^{2}+2\right)-\rho x_{2}\|\bm{x}\|_{1}.
\]

By (10), the condition $x_{2}>\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}$ implies $a>b$. We claim that $a>|b|$. If this claim does not hold, then $b$ must be negative and $0<a\leq|b|$. Squaring this inequality and simplifying yields $\rho x_{1}x_{2}\leq 2$. In this situation, $b=\frac{\rho}{2}(x_{1}^{2}-x_{2}^{2})+2-\rho x_{1}x_{2}>0$, which contradicts the negativity of $b$. Hence, $a>|b|$. Similarly, squaring this inequality and simplifying leads to $\rho x_{1}x_{2}>2$.

Further, defining $\beta:=2\theta^{\star}=\arctan\left(\frac{-2(2-\rho x_{1}x_{2})}{\rho(x_{1}^{2}-x_{2}^{2})}\right)$ and with the help of the identity

\[
\frac{\cos\theta^{\star}}{\sin\theta^{\star}}=\sqrt{1+\frac{1}{\tan^{2}\beta}}+\frac{1}{\tan\beta},
\]

we can show, after some simplifications, that the ratios of the entries of $\bm{w}^{\star}$ in Theorem 3.6 and Theorem 3.2 are the same:

\[
\frac{\cos\theta^{\star}}{\sin\theta^{\star}}=\frac{\rho x_{1}\|\bm{x}\|_{1}-\underline{\alpha}}{\rho x_{2}\|\bm{x}\|_{1}-\underline{\alpha}},
\]

which means that the vectors $\bm{w}^{\star}$ in Theorem 3.6 and Theorem 3.2 are identical.
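The agreement of the two ratios can also be observed numerically. The sketch below compares, for one example in $\mathbb{R}^2_{\downarrow}$, the ratio obtained from $\underline{\alpha}$ with the cotangent of $\theta^{\star}=\beta/2$. It assumes $\underline{\alpha}=\left(\frac{\rho}{2}\|\bm{x}\|_2^2+2\right)-a$, which is the reading of (10) implied by the step from the condition on $x_2$ to $a>b$, and it requires $x_1>x_2$ so that $\beta$ is well defined. The function names are hypothetical.

```python
import math

def ratio_from_alpha(rho, x1, x2):
    """Ratio of the two entries of w_under = x - alpha_under/(rho*||x||_1) * e."""
    l1 = x1 + x2
    c = 0.5 * rho * (x1 * x1 + x2 * x2) + 2.0
    a = math.sqrt(c * c - 2.0 * rho * l1 * l1)
    alpha_under = c - a  # assumed form of alpha_under for n = 2
    return (rho * x1 * l1 - alpha_under) / (rho * x2 * l1 - alpha_under)

def ratio_from_angle(rho, x1, x2):
    """cos(theta)/sin(theta) with theta = beta/2 and beta as defined in the text."""
    beta = math.atan(-2.0 * (2.0 - rho * x1 * x2) / (rho * (x1 * x1 - x2 * x2)))
    theta = 0.5 * beta
    return math.cos(theta) / math.sin(theta)

rho, x1, x2 = 2.0, 2.0, 1.0   # here rho * x1 * x2 = 4 > 2
print(ratio_from_alpha(rho, x1, x2), ratio_from_angle(rho, x1, x2))
```

For this example the two printed values should coincide up to rounding.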

The next result discusses the property of the solution from the $\bm{w}$-step under the condition that the last component $x_{n}-\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}$ of $\underline{\bm{w}}$ in (14) is non-positive.

Theorem 3.7.

For $\rho>0$ and $\bm{x}\in\mathbb{R}_{\downarrow}^{n}$, let $\bm{w}^{\star}$ be the optimal solution to the optimization problem (3). If $x_{n}\leq\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}$, then $\left(\bm{w}^{\star}\right)_{n}=0$.

Proof.

Suppose, to the contrary, that all components of $\bm{w}^{\star}$ are positive. Then $\bm{w}^{\star}\in\mathbb{S}_{+}^{n-1}\subset\mathbb{S}^{n-1}$, so $\bm{w}^{\star}$ is a local minimizer of

\[
\min\left\{\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}:\bm{w}\in\mathbb{S}^{n-1}\right\}. \tag{17}
\]

As the zero vector is orthogonal to any vector, it is in particular orthogonal to $\underline{\bm{w}}$, the eigenvector of $\mathsf{A}_{\rho,\bm{x}}$ associated with the negative eigenvalue $\underline{\lambda}$. By Lemma 3.4, there is no local-nonglobal minimum for (17). Hence $\bm{w}^{\star}$ is the global minimizer of problem (17). As a result, $\bm{w}^{\star}=\frac{\underline{\bm{w}}}{\|\underline{\bm{w}}\|_{2}}$, whose last component is non-positive by the given condition $x_{n}\leq\frac{\underline{\alpha}}{\rho\|\bm{x}\|_{1}}$. This contradicts the assumption that all components of $\bm{w}^{\star}$ are positive and completes our proof. ∎

To obtain an efficient approach for computing the proximity operator of $h_{2}$, let us examine the entries of the matrix $\mathsf{A}_{\rho,\bm{x}}$, which are

\[
\mathsf{A}_{\rho,\bm{x}}=\begin{bmatrix}
2-\rho x_{1}^{2} & 2-\rho x_{1}x_{2} & \cdots & 2-\rho x_{1}x_{n}\\
2-\rho x_{2}x_{1} & 2-\rho x_{2}^{2} & \cdots & 2-\rho x_{2}x_{n}\\
\vdots & \vdots & \ddots & \vdots\\
2-\rho x_{n}x_{1} & 2-\rho x_{n}x_{2} & \cdots & 2-\rho x_{n}^{2}
\end{bmatrix}.
\]

Since $\bm{x}\in\mathbb{R}^{n}_{\downarrow}$, the entries in each row, in each column, and along the diagonal are non-decreasing with respect to their indices. Based on the structure of this matrix, we define a function $\mu$ that maps every pair $(\rho,\bm{x})$ with $\rho>0$ and $\bm{x}\in\mathbb{R}^{n}_{\downarrow}$ to a non-negative integer as follows:

\[
\mu(\rho,\bm{x}):=\begin{cases}
0, & \text{if $(\mathsf{A}_{\rho,\bm{x}})_{11}\geq 0$;}\\
k, & \text{if there exists $1\leq k<n$ such that $(\mathsf{A}_{\rho,\bm{x}})_{1k}<0$ and $(\mathsf{A}_{\rho,\bm{x}})_{1(k+1)}\geq 0$;}\\
n, & \text{if $(\mathsf{A}_{\rho,\bm{x}})_{1n}<0$.}
\end{cases} \tag{18}
\]

The number $\mu(\rho,\bm{x})$ counts the negative components of the vector $2\bm{e}-\rho x_{1}\bm{x}$, which is the first column of the matrix $\mathsf{A}_{\rho,\bm{x}}$. Using $\mu(\rho,\bm{x})$, we distinguish the cases for the matrix $\mathsf{A}_{\rho,\bm{x}}$ treated in Theorem 3.8 below.
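The definition (18) amounts to a single pass over the first column of $\mathsf{A}_{\rho,\bm{x}}$. A minimal Python sketch is given below; the function name `mu` mirrors the notation of the text, and the input is assumed to be non-negative and sorted in non-increasing order, so that the entries $2-\rho x_1 x_j$ are non-decreasing in $j$ and the scan can stop at the first non-negative entry.

```python
def mu(rho, x):
    """Count the negative entries of the first column 2e - rho * x_1 * x.

    x is assumed nonnegative and sorted in non-increasing order.
    """
    count = 0
    for t in x:
        if 2.0 - rho * x[0] * t < 0:
            count += 1
        else:
            break  # all later entries are at least as large, hence non-negative
    return count

x = [3.0, 2.0, 1.0]
print(mu(2.0, x), mu(0.5, x), mu(0.1, x))
```

The three calls illustrate the three branches of (18): all entries negative, a sign change inside the column, and a non-negative leading entry.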

Theorem 3.8.

Let $\rho>0$ and let $\bm{x}\in\mathbb{R}_{\downarrow}^{n}$. Set $k=\mu(\rho,\bm{x})$. Then the following statements hold:

  • (i)

    If $k=0$, then $\bm{e}_{1}$ is the global minimizer of the optimization problem (3);

  • (ii)

    If $1\leq k\leq n$, then the vector

    \[
    \begin{bmatrix}\tilde{\bm{w}}^{\star}\\ \mathbf{0}_{(n-k)\times 1}\end{bmatrix}
    \]

    is the global minimizer of the optimization problem (3), where $\tilde{\bm{w}}^{\star}$ is the minimizer of the problem

    \[
    \min_{\bm{w}\in\mathbb{S}^{k-1}_{+}}\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}_{[k]}}\bm{w}.
    \]

    Here $\mathsf{A}_{\rho,\bm{x}_{[k]}}$ is the leading principal submatrix of order $k$ of $\mathsf{A}_{\rho,\bm{x}}$, obtained by removing its last $(n-k)$ rows and columns.

Proof.

(i) For $\bm{x}\in\mathbb{R}_{\downarrow}^{n}$, the fact that $(\mathsf{A}_{\rho,\bm{x}})_{11}\geq 0$ (which holds because $k=0$) implies $(\mathsf{A}_{\rho,\bm{x}})_{ij}\geq(\mathsf{A}_{\rho,\bm{x}})_{11}\geq 0$ for all $i,j\in[n]$. Therefore, for all $\bm{w}\in\mathbb{S}^{n-1}_{+}$, we have

\[ \frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}\geq\frac{1}{2}(\mathsf{A}_{\rho,\bm{x}})_{11}\sum_{i,j=1}^{n}w_{i}w_{j}=\frac{1}{2}(\mathsf{A}_{\rho,\bm{x}})_{11}\|\bm{w}\|_{1}^{2}\geq\frac{1}{2}(\mathsf{A}_{\rho,\bm{x}})_{11}\|\bm{w}\|_{2}^{2}=\frac{1}{2}(\mathsf{A}_{\rho,\bm{x}})_{11}. \]

Both inequalities hold with equality for $\bm{w}=\bm{e}_{1}$.

(ii) In this case, we partition the matrix $\mathsf{A}_{\rho,\bm{x}}$ into a $2\times 2$ block matrix as follows:

\[ \mathsf{A}_{\rho,\bm{x}}=\begin{bmatrix}\mathsf{A}_{11}&\mathsf{A}_{12}\\ \mathsf{A}_{21}&\mathsf{A}_{22}\end{bmatrix}, \]

where $\mathsf{A}_{11}$, $\mathsf{A}_{12}$, $\mathsf{A}_{21}$, and $\mathsf{A}_{22}$ are of size $k\times k$, $k\times(n-k)$, $(n-k)\times k$, and $(n-k)\times(n-k)$, respectively. In fact, $\mathsf{A}_{11}=\mathsf{A}_{\rho,\bm{x}_{[k]}}$. Moreover, all entries of $\mathsf{A}_{12}$, $\mathsf{A}_{21}$, and $\mathsf{A}_{22}$ are non-negative. For any $\bm{w}\in\mathbb{S}^{n-1}_{+}$, write

\[ \bm{w}=\begin{bmatrix}\bm{w}_{1}\\ \bm{w}_{2}\end{bmatrix} \]

with $\bm{w}_{1}\in\mathbb{R}^{k}$ and $\bm{w}_{2}\in\mathbb{R}^{n-k}$. We have

\[ \bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}=\bm{w}_{1}^{\top}\mathsf{A}_{11}\bm{w}_{1}+\bm{w}_{1}^{\top}\mathsf{A}_{12}\bm{w}_{2}+\bm{w}_{2}^{\top}\mathsf{A}_{21}\bm{w}_{1}+\bm{w}_{2}^{\top}\mathsf{A}_{22}\bm{w}_{2}\geq\bm{w}_{1}^{\top}\mathsf{A}_{11}\bm{w}_{1}. \]

The inequality $2-\rho x_{1}x_{k}<0$ implies $\min_{\bm{w}_{1}}\bm{w}_{1}^{\top}\mathsf{A}_{11}\bm{w}_{1}<0$. Thus,

\[ \min_{\bm{w}\in\mathbb{S}^{n-1}_{+}}\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}\geq\min_{\bm{w}\in\mathbb{S}^{n-1}_{+}}\frac{1}{2}\bm{w}_{1}^{\top}\mathsf{A}_{11}\bm{w}_{1}\geq\min_{\tilde{\bm{w}}\in\mathbb{S}^{k-1}_{+}}\frac{1}{2}\tilde{\bm{w}}^{\top}\mathsf{A}_{11}\tilde{\bm{w}}, \]

where the last inequality uses $\|\bm{w}_{1}\|_{2}\leq 1$ together with the negativity of the minimum. In particular, for every vector $\tilde{\bm{w}}\in\mathbb{S}^{n-1}_{+}$ with $\tilde{\bm{w}}_{2}=\mathbf{0}$, one has

\[ \frac{1}{2}\tilde{\bm{w}}_{1}^{\top}\mathsf{A}_{11}\tilde{\bm{w}}_{1}=\frac{1}{2}\tilde{\bm{w}}^{\top}\mathsf{A}_{\rho,\bm{x}}\tilde{\bm{w}}\geq\min_{\bm{w}\in\mathbb{S}^{n-1}_{+}}\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}. \]

We conclude that

\[ \min_{\bm{w}\in\mathbb{S}^{n-1}_{+}}\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}=\min_{\tilde{\bm{w}}\in\mathbb{S}^{k-1}_{+}}\frac{1}{2}\tilde{\bm{w}}^{\top}\mathsf{A}_{11}\tilde{\bm{w}}. \]

This completes the proof. ∎
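Case (i) can be checked numerically on a small instance. The sketch below is only a sanity check under the assumption $\mathsf{A}_{\rho,\bm{x}}=2\bm{e}\bm{e}^{\top}-\rho\bm{x}\bm{x}^{\top}$ (inferred from the first column $2\bm{e}-\rho x_{1}\bm{x}$); the test vector, the value of $\rho$, and the helper `G` are hypothetical choices for which $2-\rho x_{1}^{2}\geq 0$.

```python
import numpy as np

rng = np.random.default_rng(0)
rho, x = 1.5, np.array([1.0, 0.5, 0.2])            # here 2 - rho*x_1^2 = 0.5 >= 0, so mu = 0
A = 2.0 * np.ones((3, 3)) - rho * np.outer(x, x)   # assumed form of A

def G(w):
    """Objective of the w-step: 0.5 * w^T A w."""
    return 0.5 * w @ A @ w

# Sample random points on the nonnegative part of the unit sphere.
W = np.abs(rng.standard_normal((10000, 3)))
W /= np.linalg.norm(W, axis=1, keepdims=True)
values = 0.5 * np.einsum("ij,jk,ik->i", W, A, W)

e1 = np.array([1.0, 0.0, 0.0])
print(G(e1), values.min())   # no sampled value should fall below G(e1)
```

Random sampling cannot prove optimality; it only illustrates the chain of inequalities used in the proof of item (i).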

We remark that the entries of $\tilde{\bm{w}}^{\star}$ in Theorem 3.8 are not necessarily all positive; some entries may be zero, as demonstrated in the following example.

Example 3.9.

Let

\[ \bm{x}=\begin{bmatrix}2.5&1.5&1&0.5\end{bmatrix}^{\top}. \]

For this vector and two different values of $\rho$, we present the matrix $\mathsf{A}_{\rho,\bm{x}}$, its eigenvector $\bm{v}$ associated with the negative eigenvalue, and the minimizer $\bm{w}^{\star}$ of the problem $\min_{\bm{w}\in\mathbb{S}^{3}_{+}}\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}$.

For $\rho_{1}=2.5$, we have $\mathsf{A}_{\rho_{1},\bm{x}}$, $\bm{v}_{1}$, and $\bm{w}_{1}^{\star}$ as follows:

\[ \mathsf{A}_{\rho_{1},\bm{x}}=\frac{1}{8}\begin{bmatrix}-109&-59&-34&-9\\ -59&-29&-14&1\\ -34&-14&-4&6\\ -9&1&6&11\end{bmatrix},\quad\bm{v}_{1}=\begin{bmatrix}0.8598\\ 0.4481\\ 0.2422\\ 0.0363\end{bmatrix},\quad\mbox{and}\quad\bm{w}_{1}^{\star}=\begin{bmatrix}0.8598\\ 0.4481\\ 0.2422\\ 0.0363\end{bmatrix}. \]

For $\rho_{2}=1.8$, we have $\mathsf{A}_{\rho_{2},\bm{x}}$, $\bm{v}_{2}$, and $\bm{w}_{2}^{\star}$ as follows:

\[ \mathsf{A}_{\rho_{2},\bm{x}}=\begin{bmatrix}-9.25&-4.75&-2.50&-0.25\\ -4.75&-2.05&-0.70&0.65\\ -2.50&-0.70&0.20&1.10\\ -0.25&0.65&1.10&1.55\end{bmatrix},\quad\bm{v}_{2}=\begin{bmatrix}0.8795\\ 0.4294\\ 0.2043\\ -0.0207\end{bmatrix},\quad\mbox{and}\quad\bm{w}_{2}^{\star}=\begin{bmatrix}0.8804\\ 0.4286\\ 0.2027\\ 0\end{bmatrix}. \]

Notice that both $\rho_{1}=2.5$ and $\rho_{2}=1.8$ satisfy the condition $2-\rho x_{1}x_{4}<0$, that is, $\mu(\rho_{1},\bm{x})=\mu(\rho_{2},\bm{x})=4$. However, this condition alone does not guarantee the positivity of all components of $\bm{w}^{\star}$.
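The eigenvectors in this example can be recomputed with a few lines of NumPy. The sketch below assumes $\mathsf{A}_{\rho,\bm{x}}=2\bm{e}\bm{e}^{\top}-\rho\bm{x}\bm{x}^{\top}$, which matches the two matrices displayed in the example; the helper `negative_eigvec` is a hypothetical name. The last lines compare the eigenvector of the leading $3\times 3$ block, padded with a zero, against the reported $\bm{w}_{2}^{\star}$; whether the minimizer in the example was obtained in exactly this way is an assumption.

```python
import numpy as np

def negative_eigvec(M):
    """Unit eigenvector for the smallest eigenvalue of symmetric M, with first entry > 0."""
    vals, vecs = np.linalg.eigh(M)   # eigh returns eigenvalues in ascending order
    v = vecs[:, 0]
    return v if v[0] > 0 else -v

x = np.array([2.5, 1.5, 1.0, 0.5])
for rho in (2.5, 1.8):
    A = 2.0 * np.ones((4, 4)) - rho * np.outer(x, x)   # assumed form of A
    print(rho, np.round(negative_eigvec(A), 4))

# For rho = 1.8 the eigenvector has a negative last entry, so it is not feasible.
# Candidate from the leading 3x3 block, padded with a zero:
A2 = 2.0 * np.ones((4, 4)) - 1.8 * np.outer(x, x)
w2 = np.append(negative_eigvec(A2[:3, :3]), 0.0)
print(np.round(w2, 4))
```

The printed vectors can be compared with $\bm{v}_{1}$, $\bm{v}_{2}$, and $\bm{w}_{2}^{\star}$ above to four decimal places.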

From Theorem 3.8, in the case $\mu(\rho,\bm{x})=0$, we can establish that $h_{2}$ promotes sparsity. This assertion is made precise in the following result.

Theorem 3.10.

For $\rho>0$, the following inclusion holds for all $\bm{x}$ in the set $\{\bm{x}\in\mathbb{R}^{n}:\|\bm{x}\|_{\infty}\leq\sqrt{2/\rho}\}$:

\[ \mathbf{0}\in\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x}). \]

Proof.

By Lemma 2.1, it suffices to consider points in the set $\mathbb{R}^{n}_{\downarrow}$ whose $\ell_{\infty}$ norm is at most $\sqrt{2/\rho}$. For such $\bm{x}\in\mathbb{R}^{n}_{\downarrow}$, we examine two scenarios. If $\bm{x}=\alpha\bm{e}$ with $\alpha\leq\sqrt{2/\rho}$, the result holds by Theorem 3.1. If $\bm{x}\neq\alpha\bm{e}$ for any $\alpha>0$, then by Theorem 3.8 we have $G(\bm{e}_{1})=\frac{1}{2}(2-\rho x_{1}^{2})\geq 0$; hence, the result holds as well. ∎

This theorem underscores the sparsity-promoting nature of $h_{2}$ within the specified domain.
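The step from $G(\bm{e}_{1})\geq 0$ to the inclusion can be spelled out as a short derivation. It is a sketch under assumptions inferred from the surrounding text rather than quoted from it: $h_{2}(\bm{u})=(\|\bm{u}\|_{1}/\|\bm{u}\|_{2})^{2}$ for $\bm{u}\neq\mathbf{0}$ with $h_{2}(\mathbf{0})=0$, the objective $F(\bm{u})=h_{2}(\bm{u})+\frac{\rho}{2}\|\bm{u}-\bm{x}\|_{2}^{2}$, and $G(\bm{w})=\frac{1}{2}\bm{w}^{\top}(2\bm{e}\bm{e}^{\top}-\rho\bm{x}\bm{x}^{\top})\bm{w}$. For $\bm{x}\in\mathbb{R}^{n}_{\downarrow}$, $\bm{w}\in\mathbb{S}^{n-1}_{+}$, and $r\geq 0$:

\begin{align*}
F(r\bm{w}) &= \|\bm{w}\|_{1}^{2}+\frac{\rho}{2}\left(r^{2}-2r\langle\bm{x},\bm{w}\rangle+\|\bm{x}\|_{2}^{2}\right) \quad (r>0),\\
\min_{r\in\mathbb{R}}\left\{\|\bm{w}\|_{1}^{2}+\frac{\rho}{2}\|r\bm{w}-\bm{x}\|_{2}^{2}\right\} &= \|\bm{w}\|_{1}^{2}-\frac{\rho}{2}\langle\bm{x},\bm{w}\rangle^{2}+\frac{\rho}{2}\|\bm{x}\|_{2}^{2}=G(\bm{w})+F(\mathbf{0}) \quad (\text{attained at } r=\langle\bm{x},\bm{w}\rangle),\\
\|\bm{x}\|_{\infty}\leq\sqrt{2/\rho} &\;\Longrightarrow\; \min_{\bm{w}\in\mathbb{S}^{n-1}_{+}}G(\bm{w})=G(\bm{e}_{1})=\tfrac{1}{2}(2-\rho x_{1}^{2})\geq 0 \;\Longrightarrow\; F(r\bm{w})\geq F(\mathbf{0}).
\end{align*}

Under these assumptions, no nonzero candidate $r\bm{w}$ achieves a smaller objective value than the origin, which is the content of the inclusion.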

Given $\rho>0$ and $\bm{x}\in\mathbb{R}_{\downarrow}^{n}$, Theorem 3.8 provides a clear guideline for designing an algorithm that computes an optimal solution $\bm{w}$ of problem (3) and, eventually, $\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})$. If there exists an integer $k\in[1,n-1]$ such that $2-\rho x_{1}x_{k}<0$ and $2-\rho x_{1}x_{k+1}\geq 0$, it follows that $w_{k+1}=\cdots=w_{n}=0$. This allows us to safely truncate $\bm{x}$ by removing its last $n-k$ entries, which can significantly speed up the computation by restricting attention to the relevant components of $\bm{x}$.

We are now ready to present our algorithm, based on the WRD procedure, for computing $\mathrm{prox}_{\frac{1}{\rho}h_{2}}$ at an arbitrary $\bm{x}\in\mathbb{R}^{n}$; see Algorithm 1.

Algorithm 1 Computing the Proximal Operator of $h_{2}$
1: Input: Vector $\bm{x}\in\mathbb{R}^{n}$, parameter $\rho>0$
2: Output: The proximal operator $\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})$
3: procedure (WRD Procedure)
4:     Sort and convert $\bm{x}$ into $\mathbb{R}^{n}_{\downarrow}$ via a signed permutation matrix $\mathsf{P}$.
5:     Compute $k=\mu(\rho,\bm{x})$ by (18)
6:     if $k=0$ then
7:         $\bm{w}=\bm{e}_{1}$          (see item (i) of Theorem 3.8)
8:     else ($\bm{w}$-step)
9:         for $k:-1:1$ do
10:              Form a vector (still denoted by $\bm{x}$) from the first $k$ entries of $\bm{x}$
11:              if $\bm{x}=\alpha\bm{e}$ for some $\alpha>0$ then
12:                  $\bm{u}=\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})$ by Theorem 3.1
13:              else if $k=2$ then
14:                  return $\bm{w}$ by Theorem 3.2
15:              else
16:                  if the last entry of $\underline{\bm{w}}$, given by (14), is greater than $0$ then
17:                       return $\bm{w}\leftarrow\frac{\underline{\bm{w}}}{\|\underline{\bm{w}}\|_{2}}$ by Theorem 3.6
18:                  end if
19:              end if
20:         end for
21:     end if
22:     Pad $\bm{w}$ with a zero block such that the resulting vector, still denoted by $\bm{w}$, is in $\mathbb{S}^{n-1}_{+}$.
23:     Form $\bm{u}\leftarrow\langle\bm{x},\bm{w}\rangle\bm{w}$ ($r$-step)
24:     Determine $\bm{u}\leftarrow\left\{\begin{array}{ll}\mathbf{0},&\hbox{if $F(\mathbf{0})\leq F(\bm{u})$;}\\ \bm{u},&\hbox{otherwise.}\end{array}\right.$ ($d$-step)
25:     $\bm{u}\leftarrow\mathsf{P}^{-1}\bm{u}\in\mathrm{prox}_{\frac{1}{\rho}h_{2}}(\bm{x})$
26: end procedure
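The bookkeeping around the $\bm{w}$-step (sorting by a signed permutation, the $r$-step, the $d$-step, and the inverse permutation) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes $h_{2}(\bm{u})=(\|\bm{u}\|_{1}/\|\bm{u}\|_{2})^{2}$ with $h_{2}(\mathbf{0})=0$, $F(\bm{u})=h_{2}(\bm{u})+\frac{\rho}{2}\|\bm{u}-\bm{x}\|_{2}^{2}$, and that $\mathbb{R}^{n}_{\downarrow}$ consists of nonnegative vectors sorted in nonincreasing order. All function names are hypothetical, and the direction `w` used in the demonstration is a placeholder rather than the output of the $\bm{w}$-step.

```python
import numpy as np

def h2(u):
    """Assumed definition: (||u||_1 / ||u||_2)^2 for u != 0, and 0 at the origin."""
    n2 = np.linalg.norm(u)
    return 0.0 if n2 == 0.0 else (np.linalg.norm(u, 1) / n2) ** 2

def F(u, x, rho):
    """Assumed prox objective: h2(u) + (rho/2) * ||u - x||_2^2."""
    return h2(u) + 0.5 * rho * np.sum((u - x) ** 2)

def sort_signed(x):
    """Map x to a nonnegative, nonincreasing vector; return the data needed to undo the map."""
    signs = np.where(x < 0, -1.0, 1.0)
    order = np.argsort(-np.abs(x), kind="stable")
    return np.abs(x)[order], signs, order

def unsort_signed(u_sorted, signs, order):
    """Inverse of sort_signed, applied to a vector given in sorted coordinates."""
    u = np.empty_like(u_sorted)
    u[order] = u_sorted
    return signs * u

def r_and_d_steps(w, x_sorted, rho):
    """r-step: u = <x, w> w. d-step: keep whichever of 0 and u has the smaller F."""
    u = np.dot(x_sorted, w) * w
    zero = np.zeros_like(x_sorted)
    return zero if F(zero, x_sorted, rho) <= F(u, x_sorted, rho) else u

x = np.array([-1.5, 0.5, 2.5, -1.0])
xs, signs, order = sort_signed(x)
w = np.array([1.0, 0.0, 0.0, 0.0])   # placeholder direction, NOT the true w-step output
u = unsort_signed(r_and_d_steps(w, xs, 2.5), signs, order)
print(xs, u)
```

Storing the sign vector and the sort order, instead of an explicit matrix $\mathsf{P}$, keeps the cost of the permutation steps at $O(n\log n)$.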

4 The Proximal Operator of $h_{1}$

In this section, we detail the computation of the proximal operator of the function $h_{1}$ via the WRD procedure.

We begin by presenting the optimization problem (3) associated with the $\bm{w}$-step of the WRD procedure. For given $\rho>0$ and $\bm{x}\in\mathbb{R}_{+}^{n}$, define

\[ \mathsf{A}_{\rho,\bm{x}}=-\rho\cdot\bm{x}\bm{x}^{\top}. \tag{19} \]

The corresponding function $G$ in (5) for $h_{1}$ then becomes the quadratic function

\[ G(\bm{w})=\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}+\bm{e}^{\top}\bm{w}. \]
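Because $\mathsf{A}_{\rho,\bm{x}}$ in (19) has rank one, $G$ can be evaluated without forming the matrix: expanding the quadratic term gives $G(\bm{w})=\bm{e}^{\top}\bm{w}-\frac{\rho}{2}\langle\bm{x},\bm{w}\rangle^{2}$. The sketch below compares the two evaluations on one point; the function name `G_h1` and the test data are hypothetical.

```python
import numpy as np

def G_h1(w, x, rho):
    """G(w) = 0.5 * w^T A w + e^T w with A = -rho * x x^T, as in (19)."""
    A = -rho * np.outer(x, x)
    return 0.5 * w @ A @ w + np.sum(w)

x = np.array([2.5, 1.5, 1.0, 0.5])
w = np.array([0.5, 0.5, 0.5, 0.5])   # a unit vector with nonnegative entries
rho = 2.0

# Rank-one expansion: e^T w - (rho/2) * <x, w>^2
closed_form = np.sum(w) - 0.5 * rho * np.dot(x, w) ** 2
print(G_h1(w, x, rho), closed_form)
```

The expanded form needs only $O(n)$ operations per evaluation, which may matter when $G$ is evaluated repeatedly inside an iterative solver.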

By Lemma 2.1, we may restrict the discussion of the proximity operator of $h_{1}$ to $\mathbb{R}_{\downarrow}^{n}$. This discussion unfolds in the following three subsections.

In the first subsection, we show that the method for $h_{2}$ developed in Section 3 cannot be directly applied to $h_{1}$, even though such a transfer appears feasible at first, given their analogous reformulations. Additionally, we provide an explicit expression for the proximity operator of $h_{1}$ at specific points, showing that $h_{1}$ promotes sparsity.

The second subsection conducts an in-depth examination of the proximity operator of $h_{1}$ in $\mathbb{R}^{2}$. Notably, the method tailored to this case is difficult to extend to higher dimensions.

In the third subsection, we introduce a strategy to transform the optimization problem in the $\bm{w}$-step of the WRD procedure. This transformation converts the minimization of a concave objective function over a nonconvex set into the minimization of the same objective function over a closed and bounded convex set. The latter can be efficiently solved using the nonconvex gradient projection algorithm (see [8]).

4.1 The approach for $h_{2}$ does not work for $h_{1}$

Initially, it may seem feasible to directly apply the method for $h_{2}$ described in Section 3 to $h_{1}$, especially given their similar reformulations. However, this approach is not directly transferable to $h_{1}$. This becomes evident when considering Lemma 3.3, which leads us to the following result.

Proposition 4.1.

For $\bm{x}\in\mathbb{R}_{+}^{n}$ and $\rho>0$, consider the following quadratic optimization problem on the unit sphere:

\[
\min\left\{\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}+\bm{e}^{\top}\bm{w}:\bm{w}\in\mathbb{S}^{n-1}\right\}. \tag{20}
\]

A vector $\bm{w}^{\star}$ is a solution to (20) if and only if there is a unique $\lambda^{\star}>\rho\|\bm{x}\|_{2}^{2}$ such that

\[
(\mathsf{A}_{\rho,\bm{x}}+\lambda^{\star}\mathrm{I})\bm{w}^{\star}=-\bm{e}
\]

with $\bm{w}^{\star}$ being a unit vector.

Proof.

Problem (20) is a special case of problem (8) by identifying $\mathsf{A}_{\rho,\bm{x}}$, $\bm{e}$, and $1$ as $\mathsf{H}$, $\bm{b}$, and $r$, respectively.

The matrix $\mathsf{A}_{\rho,\bm{x}}=-\rho\bm{x}\bm{x}^{\top}$ is a rank-1 matrix whose only nonzero eigenvalue is $-\rho\|\bm{x}\|_{2}^{2}$, with associated unit eigenvector $\frac{\bm{x}}{\|\bm{x}\|_{2}}$. Hence, for any $\lambda\geq\rho\|\bm{x}\|_{2}^{2}$, the matrix $\mathsf{A}_{\rho,\bm{x}}+\lambda\mathrm{I}$ is positive semidefinite.

“$\Rightarrow$” If $\bm{w}^{\star}$ is the optimal solution to problem (20), then by Lemma 3.3 there exists a unique $\lambda^{\star}\geq\rho\|\bm{x}\|_{2}^{2}$ such that $(\mathsf{A}_{\rho,\bm{x}}+\lambda^{\star}\mathrm{I})\bm{w}^{\star}=-\bm{e}$ with $\bm{w}^{\star}$ being a unit vector. We claim that $\lambda^{\star}>\rho\|\bm{x}\|_{2}^{2}$. If not, assume that $\lambda^{\star}=\rho\|\bm{x}\|_{2}^{2}$, and let $\mathsf{U}$ be an orthogonal matrix whose first column is $\frac{\bm{x}}{\|\bm{x}\|_{2}}$. Then, the equality $(\mathsf{A}_{\rho,\bm{x}}+\lambda^{\star}\mathrm{I})\bm{w}^{\star}=-\bm{e}$ leads to

\[
\mathsf{U}\begin{bmatrix}0&&&\\ &\rho\|\bm{x}\|_{2}^{2}&&\\ &&\ddots&\\ &&&\rho\|\bm{x}\|_{2}^{2}\end{bmatrix}\mathsf{U}^{\top}\bm{w}^{\star}=-\bm{e}
\quad\mbox{or}\quad
\begin{bmatrix}0&&&\\ &\rho\|\bm{x}\|_{2}^{2}&&\\ &&\ddots&\\ &&&\rho\|\bm{x}\|_{2}^{2}\end{bmatrix}\mathsf{U}^{\top}\bm{w}^{\star}=-\begin{bmatrix}\frac{\|\bm{x}\|_{1}}{\|\bm{x}\|_{2}}\\ \star\\ \vdots\\ \star\end{bmatrix},
\]

which is inconsistent: the first row of the left-hand side is zero, whereas the first entry of the right-hand side is $-\frac{\|\bm{x}\|_{1}}{\|\bm{x}\|_{2}}<0$. Hence, $\lambda^{\star}$ is strictly greater than $\rho\|\bm{x}\|_{2}^{2}$.

“$\Leftarrow$” We show that there exists a $\lambda>\rho\|\bm{x}\|_{2}^{2}$ such that $\|(\mathsf{A}_{\rho,\bm{x}}+\lambda\mathrm{I})^{-1}\bm{e}\|_{2}=1$. For $\lambda\notin\{0,\rho\|\bm{x}\|_{2}^{2}\}$, the matrix $\mathsf{A}_{\rho,\bm{x}}+\lambda\mathrm{I}$ is invertible and its inverse is

\[
(\mathsf{A}_{\rho,\bm{x}}+\lambda\mathrm{I})^{-1}=\frac{1}{\lambda}\left(\mathrm{I}+\frac{\rho}{\lambda-\rho\|\bm{x}\|_{2}^{2}}\bm{x}\bm{x}^{\top}\right).
\]

For $\lambda>\rho\|\bm{x}\|_{2}^{2}$, from $\|(\mathsf{A}_{\rho,\bm{x}}+\lambda\mathrm{I})^{-1}\bm{e}\|_{2}=1$ together with the above equation, we obtain

\[
\left\|(\lambda-\rho\|\bm{x}\|_{2}^{2})\bm{e}+\rho\|\bm{x}\|_{1}\bm{x}\right\|_{2}=\lambda(\lambda-\rho\|\bm{x}\|_{2}^{2}). \tag{21}
\]

To study the roots of the above equation, we consider two different cases: (i) $\bm{x}=\alpha\bm{e}$ for some $\alpha>0$ and (ii) $\bm{x}\neq\alpha\bm{e}$ for any $\alpha>0$.

For the case of $\bm{x}=\alpha\bm{e}$ for some $\alpha>0$, one has $\|\bm{x}\|_{1}=\alpha n$ and $\|\bm{x}\|_{2}=\alpha\sqrt{n}$. It follows from (21) that $\lambda\sqrt{n}=\lambda(\lambda-\rho\alpha^{2}n)$. This equation has two real roots, $\lambda=0$ and $\lambda=\sqrt{n}+\rho\alpha^{2}n$, and the only root larger than $\rho\|\bm{x}\|_{2}^{2}$ is $\lambda^{\star}=\sqrt{n}+\rho\alpha^{2}n>\rho\alpha^{2}n=\rho\|\bm{x}\|_{2}^{2}$. By Lemma 3.3,

\[
\bm{w}^{\star}=-\frac{1}{\sqrt{n}}\bm{e} \tag{22}
\]

is the optimal solution to problem (20).

The rest of the proof considers the case of $\bm{x}\neq\alpha\bm{e}$ for any $\alpha>0$. Squaring both sides of the identity (21) and simplifying the resulting equation lead to the following quartic equation

\[
Q(q)=0,
\]

where $q=\lambda-\rho\|\bm{x}\|_{2}^{2}$ and

\[
Q(q)=q^{4}+2\rho\|\bm{x}\|_{2}^{2}q^{3}+(\rho^{2}\|\bm{x}\|_{2}^{4}-n)q^{2}-2\rho\|\bm{x}\|_{1}^{2}q-\rho^{2}\|\bm{x}\|_{1}^{2}\|\bm{x}\|_{2}^{2}.
\]
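For reference, the squaring step behind $Q$ can be written out as follows. This is only a worked expansion of (21) in the variable $q=\lambda-\rho\|\bm{x}\|_{2}^{2}$, using $\bm{e}^{\top}\bm{e}=n$ and $\bm{e}^{\top}\bm{x}=\|\bm{x}\|_{1}$ for $\bm{x}\in\mathbb{R}_{+}^{n}$; it introduces no new assumptions.

```latex
\begin{align*}
\left\|q\bm{e}+\rho\|\bm{x}\|_{1}\bm{x}\right\|_{2}^{2}
  &= nq^{2}+2\rho\|\bm{x}\|_{1}^{2}q+\rho^{2}\|\bm{x}\|_{1}^{2}\|\bm{x}\|_{2}^{2},\\
\lambda^{2}q^{2}
  &= \left(q+\rho\|\bm{x}\|_{2}^{2}\right)^{2}q^{2}
   = q^{4}+2\rho\|\bm{x}\|_{2}^{2}q^{3}+\rho^{2}\|\bm{x}\|_{2}^{4}q^{2}.
\end{align*}
% Subtracting the first line from the second gives Q(q).
```

Since both sides of (21) are nonnegative for $q>0$, squaring does not introduce spurious positive roots.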

Since $Q(0)=-\rho^{2}\|\bm{x}\|_{1}^{2}\|\bm{x}\|_{2}^{2}<0$ and $Q(q)$ is positive for sufficiently large $q$, there exists at least one root of $Q$ on the interval $[0,\infty)$. Whatever the sign of $(\rho^{2}\|\bm{x}\|_{2}^{4}-n)$, the coefficient sequence of the polynomial $Q$ has exactly one sign change. Therefore, by Descartes' Rule of Signs [22], we conclude that $Q$ has exactly one positive root, say $q^{\star}$. Hence, with $\lambda^{\star}=q^{\star}+\rho\|\bm{x}\|_{2}^{2}$,

\[
\bm{w}^{\star}=(\mathsf{A}_{\rho,\bm{x}}+\lambda^{\star}\mathrm{I})^{-1}(-\bm{e})=-\frac{1}{\lambda^{\star}}\left(\bm{e}+\frac{\rho\|\bm{x}\|_{1}}{\lambda^{\star}-\rho\|\bm{x}\|_{2}^{2}}\bm{x}\right) \tag{23}
\]

is the optimal solution to problem (20) by Lemma 3.3 again. ∎
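The construction in this proof can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function name `solve_sphere_qp` and the use of bisection are assumptions made here, and the input is assumed to be a nonzero vector with nonnegative entries. The sketch locates the unique positive root $q^{\star}$ of $Q$ and forms $\bm{w}^{\star}$ as in (23).

```python
import math

def solve_sphere_qp(x, rho):
    """Hypothetical helper: return (lambda_star, w_star) for problem (20).

    Assumes x is a nonzero list of nonnegative floats and rho > 0.
    """
    n = len(x)
    l1 = sum(x)                     # ||x||_1, valid because x >= 0
    l2sq = sum(t * t for t in x)    # ||x||_2^2

    def Q(q):
        # Quartic obtained by squaring (21), with q = lambda - rho*||x||_2^2.
        return (q**4 + 2.0 * rho * l2sq * q**3
                + (rho**2 * l2sq**2 - n) * q**2
                - 2.0 * rho * l1**2 * q
                - rho**2 * l1**2 * l2sq)

    # Q(0) < 0 and Q grows without bound, so double hi until Q(hi) > 0.
    lo, hi = 0.0, 1.0
    while Q(hi) <= 0.0:
        hi *= 2.0

    # Bisection on the bracket [lo, hi]; the positive root is unique.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if Q(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    q = 0.5 * (lo + hi)

    lam = q + rho * l2sq
    # Formula (23): w = -(1/lam) * (e + rho*||x||_1/(lam - rho*||x||_2^2) * x).
    w = [-(1.0 + rho * l1 * t / q) / lam for t in x]
    return lam, w
```

For a uniform input such as `x = [1.0, 1.0, 1.0, 1.0]` with `rho = 1.0`, the same routine reproduces the closed form $\lambda^{\star}=\sqrt{n}+\rho\alpha^{2}n$ and (22), because $Q(q)=\lambda^{2}(q^{2}-n)$ in that case.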

It is evident from the preceding proof that all entries of the optimal solution $\bm{w}^{\star}$, as given in (22) and (23), are negative. Consequently, this vector $\bm{w}^{\star}$ cannot serve as the solution to problem (3), which requires $\bm{w}\in\mathbb{S}^{n-1}_{+}$. Therefore, the methodology employed for $h_{2}$ is not applicable to $h_{1}$, necessitating a distinct approach.

Next, we provide the proximity operator of $h_{1}$ for vectors $\bm{x}$ with uniform entries.

Theorem 4.2.

Let $\rho>0$ and $\bm{x}=\alpha\bm{e}\in\mathbb{R}^{n}$ for some $\alpha>0$. Then

\[
\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})=\begin{cases}\{\bm{0}\},&\text{if }\alpha<\sqrt{\frac{2}{\rho\sqrt{n}}};\\ \{\bm{0},\bm{x}\},&\text{if }\alpha=\sqrt{\frac{2}{\rho\sqrt{n}}};\\ \{\bm{x}\},&\text{if }\alpha>\sqrt{\frac{2}{\rho\sqrt{n}}}.\end{cases}
\]
Proof.

In this situation, we have $\mathsf{A}_{\rho,\bm{x}}=-\rho\alpha^{2}\bm{e}\bm{e}^{\top}$ from (19). The objective function of problem (3) is

\[
G(\bm{w})=\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}+\bm{e}^{\top}\bm{w}=-\frac{1}{2}\rho\alpha^{2}\|\bm{w}\|_{1}^{2}+\|\bm{w}\|_{1}=-\frac{1}{2}\rho\alpha^{2}\left(\|\bm{w}\|_{1}-\frac{1}{\rho\alpha^{2}}\right)^{2}+\frac{1}{2\rho\alpha^{2}},
\]

where $\bm{w}\in\mathbb{S}^{n-1}_{+}$. Since $\|\bm{w}\|_{1}\in[1,\sqrt{n}]$ for all $\bm{w}\in\mathbb{S}^{n-1}_{+}$ and $G$ is a concave quadratic in $\|\bm{w}\|_{1}$, the above quantity achieves its global minimum when $\|\bm{w}\|_{1}$ equals $1$ or $\sqrt{n}$, whichever is farther from $\frac{1}{\rho\alpha^{2}}$. Hence, $\|\bm{w}^{\star}\|_{1}$, the $\ell_{1}$ norm of the optimal solution $\bm{w}^{\star}$ to problem (3), is $\sqrt{n}$ if $\frac{1}{\rho\alpha^{2}}<\frac{1}{2}(1+\sqrt{n})$; $1$ or $\sqrt{n}$ if $\frac{1}{\rho\alpha^{2}}=\frac{1}{2}(1+\sqrt{n})$; or $1$ if $\frac{1}{\rho\alpha^{2}}>\frac{1}{2}(1+\sqrt{n})$. As a result, the $\bm{w}$-step of the WRD procedure provides the optimal solution $\bm{w}^{\star}$ to problem (3) as follows:

\[
\bm{w}^{\star}\in\begin{cases}\left\{\frac{1}{\sqrt{n}}\bm{e}\right\},&\text{if }\frac{1}{\rho\alpha^{2}}<\frac{1}{2}(1+\sqrt{n});\\ \left\{\frac{1}{\sqrt{n}}\bm{e}\right\}\cup\{\bm{e}_{i}:i=1,\ldots,n\},&\text{if }\frac{1}{\rho\alpha^{2}}=\frac{1}{2}(1+\sqrt{n});\\ \{\bm{e}_{i}:i=1,\ldots,n\},&\text{if }\frac{1}{\rho\alpha^{2}}>\frac{1}{2}(1+\sqrt{n}).\end{cases}
\]

The $r$-step of the WRD procedure simply follows with $r^{\star}=\langle\bm{x},\bm{w}^{\star}\rangle$. At the $d$-step of the WRD procedure, we compare $F(r^{\star}\bm{w}^{\star})$ and $F(\bm{0})$ with $F$ defined in (1). Note that

\[
F(r^{\star}\bm{w}^{\star})-F(\bm{0})=G(\bm{w}^{\star})=\begin{cases}-\frac{1}{2}\rho\alpha^{2}n+\sqrt{n},&\text{if }\frac{1}{\rho\alpha^{2}}<\frac{1}{2}(1+\sqrt{n});\\ \frac{\sqrt{n}}{1+\sqrt{n}},&\text{if }\frac{1}{\rho\alpha^{2}}=\frac{1}{2}(1+\sqrt{n});\\ -\frac{1}{2}\rho\alpha^{2}+1,&\text{if }\frac{1}{\rho\alpha^{2}}>\frac{1}{2}(1+\sqrt{n}).\end{cases}
\]

We see that under the condition $\frac{1}{\rho\alpha^{2}}<\frac{1}{2}(1+\sqrt{n})$, the quantity $F(r^{\star}\bm{w}^{\star})-F(\bm{0})=-\frac{1}{2}\rho\alpha^{2}n+\sqrt{n}$ is positive if $\frac{1}{\rho\alpha^{2}}>\frac{\sqrt{n}}{2}$, zero if $\frac{1}{\rho\alpha^{2}}=\frac{\sqrt{n}}{2}$, or negative if $\frac{1}{\rho\alpha^{2}}<\frac{\sqrt{n}}{2}$. Under the condition $\frac{1}{\rho\alpha^{2}}=\frac{1}{2}(1+\sqrt{n})$, we have $F(r^{\star}\bm{w}^{\star})-F(\bm{0})=\frac{\sqrt{n}}{1+\sqrt{n}}>0$. Under the condition $\frac{1}{\rho\alpha^{2}}>\frac{1}{2}(1+\sqrt{n})$, i.e., $-\frac{1}{2}\rho\alpha^{2}>\frac{-1}{1+\sqrt{n}}$, we have $F(r^{\star}\bm{w}^{\star})-F(\bm{0})=-\frac{1}{2}\rho\alpha^{2}+1>\frac{\sqrt{n}}{1+\sqrt{n}}$, which is always positive. The result of this theorem follows from (2). ∎
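The case analysis in this proof can be mirrored in a few lines of code. The following is a minimal sketch under the setting of Theorem 4.2 only (uniform input $\bm{x}=\alpha\bm{e}$). The names `uniform_threshold` and `prox_h1_uniform` and the tolerance `tol` are hypothetical choices made for this illustration. The sketch evaluates $G$ at the two candidate minimizers from the $\bm{w}$-step and then applies the $d$-step comparison with $F(\bm{0})$.

```python
import math

def uniform_threshold(n, rho):
    """Threshold sqrt(2 / (rho * sqrt(n))) on alpha stated in Theorem 4.2."""
    return math.sqrt(2.0 / (rho * math.sqrt(n)))

def prox_h1_uniform(alpha, n, rho, tol=1e-12):
    """Hypothetical helper: prox of (1/rho)*h_1 at x = alpha*e, as a list of vectors.

    Assumes alpha > 0, rho > 0, n >= 1; tol is an illustrative tie tolerance.
    """
    # w-step: G depends only on t = ||w||_1 and is concave in t, so its
    # minimum over [1, sqrt(n)] is attained at an endpoint.
    g_uniform = -0.5 * rho * alpha**2 * n + math.sqrt(n)  # w = e / sqrt(n)
    g_axis = -0.5 * rho * alpha**2 + 1.0                  # w = e_i
    best = min(g_uniform, g_axis)

    zero = [0.0] * n
    x = [alpha] * n   # r*w = <x, w> w equals x when w = e / sqrt(n)

    # d-step: G(w*) = F(r*w*) - F(0) decides between 0 and r*w*.
    if best > tol:
        return [zero]
    if best < -tol:
        # As the proof shows, a negative minimum is attained at w = e / sqrt(n).
        return [x]
    return [zero, x]
```

For example, with `n = 4` and `rho = 1.0` the threshold is `1.0`, so `alpha = 0.5` is sent to the origin while `alpha = 2.0` is left unchanged.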

The next result shows that the function $h_{1}$ is indeed a sparsity-promoting function: its proximity operator sends the points in a neighborhood of the origin to the origin (see [17]).

Theorem 4.3.

For $\rho>0$, the inclusion

\[
\mathbf{0}\in\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})
\]

holds for every $\bm{x}\in\mathbb{R}^{n}_{+}$ with $\|\bm{x}\|_{2}\leq\sqrt{\frac{2}{\rho}}$.

Proof.

Let $G$ be the objective function of problem (3) associated with $h_{1}$. For $\bm{w}\in\mathbb{S}^{n-1}_{+}$, we have

\[
G(\bm{w})=-\frac{\rho}{2}\langle\bm{x},\bm{w}\rangle^{2}+\bm{e}^{\top}\bm{w}\geq-\frac{\rho}{2}\|\bm{x}\|_{2}^{2}+1\geq 0
\]

for $\bm{x}\in\mathbb{R}^{n}_{+}$ with $\|\bm{x}\|_{2}\leq\sqrt{\frac{2}{\rho}}$. Here the first inequality uses the Cauchy-Schwarz inequality together with $\bm{e}^{\top}\bm{w}=\|\bm{w}\|_{1}\geq\|\bm{w}\|_{2}=1$. We further have $F(\langle\bm{x},\bm{w}\rangle\bm{w})-F(\mathbf{0})=G(\bm{w})\geq 0$, where $F$ is defined in (1). Hence, $\mathbf{0}\in\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})$. ∎
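As a quick numerical sanity check of Theorem 4.3, the following sketch samples random candidates $\bm{u}$ and confirms that none of them attains a smaller objective value than the origin when $\|\bm{x}\|_{2}\leq\sqrt{2/\rho}$. It assumes that $F$ in (1) has the form $F(\bm{u})=\frac{\rho}{2}\|\bm{u}-\bm{x}\|_{2}^{2}+h_{1}(\bm{u})$ with the convention $h_{1}(\mathbf{0})=0$, which is consistent with the identity $F(\langle\bm{x},\bm{w}\rangle\bm{w})-F(\mathbf{0})=G(\bm{w})$ used in the proof. The function names and the sampling box are illustrative choices, and random sampling is of course not a proof.

```python
import math
import random


def h1(u):
    """Ratio l1/l2, with the convention h1(0) = 0 (an assumption of this sketch)."""
    norm2 = math.sqrt(sum(t * t for t in u))
    if norm2 == 0.0:
        return 0.0
    return sum(abs(t) for t in u) / norm2


def F(u, x, rho):
    """Assumed objective: (rho/2) * ||u - x||_2^2 + h1(u)."""
    return 0.5 * rho * sum((a - b) ** 2 for a, b in zip(u, x)) + h1(u)


def origin_beats_samples(x, rho, trials=20000, seed=0):
    """Return True if no sampled u satisfies F(u) < F(0)."""
    rng = random.Random(seed)
    f_zero = F([0.0] * len(x), x, rho)
    box = 3.0 * math.sqrt(2.0 / rho)  # hypothetical sampling range
    for _ in range(trials):
        u = [rng.uniform(-box, box) for _ in x]
        if F(u, x, rho) < f_zero - 1e-12:
            return False
    return True


rho = 2.0
x = [0.6, 0.5, 0.4]  # ||x||_2 is about 0.877, below sqrt(2/rho) = 1
print(origin_beats_samples(x, rho))
```

For an input outside the ball, such as $\bm{x}=2\bm{e}_{1}$ with $\rho=2$, the candidate $\bm{u}=\bm{x}$ already gives $F(\bm{x})=1<4=F(\mathbf{0})$, so the norm bound in the theorem cannot be dropped entirely.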

4.2 Special case: the proximity operator of $h_{1}$ on $\mathbb{R}^{2}$

The following result establishes a region in which the proximity operator of $h_{1}$ does not vanish on $\mathbb{R}_{\downarrow}^{2}$.

Proposition 4.4.

For $\rho>0$, define two sets in $\mathbb{R}_{\downarrow}^{2}$ as follows:

\begin{align*}
S_{1} &= \left\{\bm{x}\in\mathbb{R}_{\downarrow}^{2}:x_{1}>\sqrt{\frac{2}{\rho}}\right\},\\
S_{2} &= \left\{\bm{x}\in\mathbb{R}_{\downarrow}^{2}:x_{2}=\kappa x_{1},\ x_{1}>\sqrt{\frac{2(1+\kappa)}{\rho(1+\kappa^{2})^{3/2}}},\ \kappa\in[0,1]\right\}.
\end{align*}

Then, the origin is not in $\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})$ for every point $\bm{x}\in S_{1}\cup S_{2}$.

Proof.

For each point $\bm{x}\in S_{1}\cup S_{2}$, to prove that the origin is not in $\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})$ it is sufficient to show that there exists a point, say $\bm{z}$, in $\mathbb{R}_{\downarrow}^{2}$ such that $F(\bm{z})-F(\mathbf{0})<0$, where $F$ is defined in (1).

First, we choose $\bm{z}=x_{1}\bm{e}_{1}\in\mathbb{R}_{\downarrow}^{2}$. Then, $F(\bm{z})-F(\mathbf{0})=-\frac{\rho}{2}x_{1}^{2}+1<0$, which holds for $\bm{x}\in S_{1}$.

Next, we choose $\bm{z}=\bm{x}$. Then, with $\kappa=\frac{x_{2}}{x_{1}}$,

\[
F(\bm{z})-F(\mathbf{0})=(1+\kappa^{2})\left(-\frac{1}{2}\rho x_{1}^{2}+\frac{1+\kappa}{(1+\kappa^{2})^{3/2}}\right)<0
\]

for all points $\bm{x}\in S_{2}$. This completes the proof of this proposition. ∎
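The two witness points used in this proof can be checked numerically. The sketch below again assumes $F(\bm{u})=\frac{\rho}{2}\|\bm{u}-\bm{x}\|_{2}^{2}+h_{1}(\bm{u})$ with $h_{1}(\mathbf{0})=0$ and expects an input with $x_{1}\geq x_{2}\geq 0$ and $x_{1}>0$; the helper names `F_gap` and `witness` are hypothetical.

```python
import math


def h1(u):
    """Ratio l1/l2 on R^2, with h1(0) = 0 (assumed convention)."""
    norm2 = math.hypot(u[0], u[1])
    return 0.0 if norm2 == 0.0 else (abs(u[0]) + abs(u[1])) / norm2


def F_gap(z, x, rho):
    """F(z) - F(0), assuming F(u) = (rho/2) * ||u - x||^2 + h1(u)."""
    dist2 = (z[0] - x[0]) ** 2 + (z[1] - x[1]) ** 2
    norm2 = x[0] ** 2 + x[1] ** 2
    return 0.5 * rho * (dist2 - norm2) + h1(z)


def witness(x, rho):
    """Return the point z used in the proof, or None if x is not in S1 or S2."""
    x1, x2 = x
    kappa = x2 / x1
    if x1 > math.sqrt(2.0 / rho):  # x in S1: take z = x1 * e1
        return (x1, 0.0)
    bound = math.sqrt(2.0 * (1.0 + kappa) / (rho * (1.0 + kappa ** 2) ** 1.5))
    if x1 > bound:  # x in S2: take z = x
        return (x1, x2)
    return None


rho = 2.0
for x in [(1.2, 0.3), (0.95, 0.9), (0.3, 0.1)]:
    z = witness(x, rho)
    print(x, z, None if z is None else F_gap(z, x, rho))
```

A return value of `None` only means that the proposition gives no information for that input; it does not by itself imply that the origin belongs to the proximity operator.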

We comment on this proposition. Consider two curves parameterized by $\kappa\in[0,1]$ as follows:

\[
\mathcal{C}_{1}:[0,1]\ni\kappa\mapsto\sqrt{\frac{2}{\rho}}\,(1,\kappa)\quad\mbox{and}\quad\mathcal{C}_{2}:[0,1]\ni\kappa\mapsto\sqrt{\frac{2(1+\kappa)}{\rho(1+\kappa^{2})^{3/2}}}\,(1,\kappa).
\]

We have $\mathcal{C}_{1}(0)=\mathcal{C}_{2}(0)=\sqrt{\frac{2}{\rho}}\,(1,0)$, $\mathcal{C}_{1}(1)=\sqrt{\frac{2}{\rho}}\,(1,1)$, and $\mathcal{C}_{2}(1)=\sqrt{\frac{\sqrt{2}}{\rho}}\,(1,1)$. Apart from this common starting point, the two curves intersect at the point whose parameter $\kappa$ is the unique real root of the polynomial equation $\kappa^{5}+3\kappa^{3}+2\kappa-2=0$, which is obtained by equating the two scaling factors, that is, from $(1+\kappa)^{2}=(1+\kappa^{2})^{3}$ with $\kappa\neq 0$. This root is $\kappa\approx 0.6124$. The red shaded region in Figure 4.2 is the set $S_{1}\cup S_{2}$. The blue shaded region in Figure 4.2 represents the set where every point is mapped to the origin by $\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})$, as stipulated by Theorem 4.3. We will explore the blank region situated between the blue and red shaded areas in the subsequent analysis.
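The intersection parameter can be recovered numerically without expanding any polynomial. The sketch below applies plain bisection to the condition that the two scaling factors of $\mathcal{C}_{1}$ and $\mathcal{C}_{2}$ coincide, namely $(1+\kappa)/(1+\kappa^{2})^{3/2}=1$. The bracket $[0.1,1]$ is a choice made here to exclude the trivial coincidence at $\kappa=0$, and the function names are illustrative.

```python
def gap(kappa):
    """Ratio of the squared scaling factors of C2 and C1, minus one."""
    return (1.0 + kappa) / (1.0 + kappa * kappa) ** 1.5 - 1.0


def bisect(f, lo, hi, tol=1e-12):
    """Bisection for a sign change of f on [lo, hi]."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid  # root lies in the right half
        else:
            hi = mid  # root lies in the left half
    return 0.5 * (lo + hi)


kappa_star = bisect(gap, 0.1, 1.0)
print(kappa_star)
```

The computed value agrees with the root $\kappa\approx 0.6124$ quoted in the text to the displayed precision.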

Figure 4.2: The proximity operator $\mathrm{prox}_{\frac{1}{\rho}h_{1}}$ will map all points in the blue shaded region to the origin and all points in the red region to a nonzero point.

In the following analysis, we exclude the case where $\bm{x}$ has uniform entries, which was addressed in Theorem 4.2. We now focus on the case where $\bm{x}\in\mathbb{R}_{\downarrow}^{2}$. This scenario splits into two cases: one where $\bm{x}$ contains one zero entry, and another where it does not. We begin with the case where $\bm{x}$ has one zero entry, as detailed in the following proposition.

Proposition 4.5.

For $\rho>0$ and $\bm{x}=\alpha\bm{e}_{1}$ with $\alpha>0$, we have

\[
\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})=\begin{cases}\{\mathbf{0}\},&\text{if }\alpha<\sqrt{\frac{2}{\rho}};\\ \{\mathbf{0},\bm{x}\},&\text{if }\alpha=\sqrt{\frac{2}{\rho}};\\ \{\bm{x}\},&\text{if }\alpha>\sqrt{\frac{2}{\rho}}.\end{cases}
\]
Proof.

The objective function $G$ of problem (3) associated with $h_{1}$ for the given $\bm{x}$ is

\[
G(\bm{w})=-\frac{1}{2}\rho\alpha^{2}w_{1}^{2}+w_{1}+w_{2}=-\frac{1}{2}\rho\alpha^{2}w_{1}^{2}+w_{1}+\sqrt{1-w_{1}^{2}},
\]

where $w_{1}\in[0,1]$. A direct calculation shows that both functions $-\frac{1}{2}\rho\alpha^{2}w_{1}^{2}$ and $w_{1}+\sqrt{1-w_{1}^{2}}$ are concave with respect to $w_{1}$, so $G$ is concave on $[0,1]$ and attains its minimum at an endpoint of this interval. Since $G(\bm{e}_{1})=-\frac{1}{2}\rho\alpha^{2}+1<1=G(\bm{e}_{2})$, $G$ achieves its global minimum at $\bm{w}^{\star}=\bm{e}_{1}$.

The $r$-step of the WRD procedure simply follows with $r^{\star}=\langle\bm{x},\bm{w}^{\star}\rangle=\alpha$. At the $d$-step, we compare $F(r^{\star}\bm{w}^{\star})$ and $F(\bm{0})$ via their difference $F(r^{\star}\bm{w}^{\star})-F(\bm{0})=G(\bm{w}^{\star})=-\frac{1}{2}\rho\alpha^{2}+1$. The proposition follows immediately from the sign of this difference. ∎

We observe that Proposition 4.5 corroborates the findings of Proposition 4.4 for points lying on the $x_{1}$-axis. Further, $(\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x}))_{1}=\mathrm{prox}_{\frac{1}{\rho}|\cdot|_{0}}(x_{1})$ for $\bm{x}=\alpha\bm{e}_{1}$.
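Proposition 4.5 translates directly into a thresholding rule. The following minimal sketch returns the set-valued proximity operator for $\bm{x}=\alpha\bm{e}_{1}$ and, separately, confirms on a grid that $G$ from the proof is minimized at $w_{1}=1$. The function names, the tie tolerance `tol`, and the grid size are assumptions of this illustration.

```python
import math


def G_axis(w1, alpha, rho):
    """G(w) for x = alpha * e1 and w = (w1, sqrt(1 - w1^2)), w1 in [0, 1]."""
    return -0.5 * rho * alpha ** 2 * w1 ** 2 + w1 + math.sqrt(max(0.0, 1.0 - w1 * w1))


def prox_h1_on_axis(alpha, rho, tol=1e-12):
    """Set prox_{(1/rho) h1}(alpha * e1) as a list of points (Proposition 4.5)."""
    threshold = math.sqrt(2.0 / rho)
    if alpha < threshold - tol:
        return [(0.0, 0.0)]
    if alpha > threshold + tol:
        return [(alpha, 0.0)]
    return [(0.0, 0.0), (alpha, 0.0)]  # tie: both points are minimizers


# Grid check of the w-step: the minimizer of G over w1 in [0, 1] is w1 = 1.
grid = [i / 1000 for i in range(1001)]
best_w1 = min(grid, key=lambda w1: G_axis(w1, 1.5, 2.0))
print(best_w1, prox_h1_on_axis(1.5, 2.0))
```

The rule has the same form as hard thresholding of $x_{1}$ with threshold $\sqrt{2/\rho}$, which is the content of the remark on $\mathrm{prox}_{\frac{1}{\rho}|\cdot|_{0}}$ above.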

For $\bm{x}\in\mathbb{R}_{\downarrow}^{2}$ with $x_{1}\neq 0$, let $G$ be the objective function of problem (3) associated with $h_{1}$. We define $Q:[0,\frac{\pi}{4}]\rightarrow\mathbb{R}$ as

\[
Q(\theta):=G(\bm{w}(\theta))\quad\mbox{with}\quad\bm{w}(\theta)=\begin{bmatrix}\cos(\theta)\\ \sin(\theta)\end{bmatrix}.
\]

A direct computation yields

\[
Q(\theta)=-\frac{1}{2}\rho\|\bm{x}\|_{2}^{2}\cos^{2}\left(\theta-\frac{\alpha}{2}\right)+\sqrt{2}\sin\left(\theta+\frac{\pi}{4}\right), \tag{24}
\]

where the constant $\alpha$ is given by, with $\kappa=\frac{x_{2}}{x_{1}}\in[0,1]$,

\[
\alpha=\begin{cases}\arctan\left(\frac{2\kappa}{1-\kappa^{2}}\right)\in\left[0,\frac{\pi}{2}\right),&\text{if } x_{1}>x_{2};\\ \frac{\pi}{2},&\text{if } x_{1}=x_{2}.\end{cases} \tag{25}
\]
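For readers who want the intermediate steps behind (24), one way to organize the computation is sketched below. The auxiliary angle $\varphi$ is introduced here only for illustration and does not appear in the original text.

```latex
% Polar form of x with the auxiliary angle \varphi = \arctan\kappa \in [0, \pi/4]:
%   x = \|x\|_2 (\cos\varphi, \sin\varphi).
\begin{align*}
\langle\bm{x},\bm{w}(\theta)\rangle
  &= \|\bm{x}\|_{2}\left(\cos\varphi\cos\theta+\sin\varphi\sin\theta\right)
   = \|\bm{x}\|_{2}\cos(\theta-\varphi),\\
\tan(2\varphi)
  &= \frac{2\tan\varphi}{1-\tan^{2}\varphi}
   = \frac{2\kappa}{1-\kappa^{2}}
   \quad\Longrightarrow\quad \varphi=\frac{\alpha}{2}\quad(\kappa<1),\\
\bm{e}^{\top}\bm{w}(\theta)
  &= \cos\theta+\sin\theta
   = \sqrt{2}\sin\left(\theta+\frac{\pi}{4}\right),\\
Q(\theta)
  &= -\frac{\rho}{2}\langle\bm{x},\bm{w}(\theta)\rangle^{2}+\bm{e}^{\top}\bm{w}(\theta)
   = -\frac{1}{2}\rho\|\bm{x}\|_{2}^{2}\cos^{2}\left(\theta-\frac{\alpha}{2}\right)
     +\sqrt{2}\sin\left(\theta+\frac{\pi}{4}\right).
\end{align*}
% For \kappa = 1 one has \varphi = \pi/4 = \alpha/2 directly from (25).
```

In particular, $\frac{\alpha}{2}$ is the polar angle of $\bm{x}$, so the first term of $Q$ is smallest when $\bm{w}(\theta)$ is aligned with $\bm{x}$.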

Then, solving problem (3) involves minimizing $Q$ over the interval $[0,\frac{\pi}{4}]$. The minimal value of $Q$ on this interval can be attained at $0$, $\pi/4$, or the critical points of $Q$. To determine these critical points, we examine the properties of $Q^{\prime}$, which is

\[
Q^{\prime}(\theta)=\frac{1}{2}\rho\|\bm{x}\|_{2}^{2}\sin(2\theta-\alpha)+\sqrt{2}\cos\left(\theta+\frac{\pi}{4}\right).
\]

We immediately observe that: first, the function $\sqrt{2}\cos(\theta+\frac{\pi}{4})$ monotonically decreases from $1$ to $0$ as $\theta$ varies from $0$ to $\frac{\pi}{4}$; second, the function $\frac{1}{2}\rho\|\bm{x}\|_{2}^{2}\sin(2\theta-\alpha)$ monotonically increases from $\frac{1}{2}\rho\|\bm{x}\|_{2}^{2}\sin(-\alpha)=-\rho x_{1}x_{2}$ to $0$ as $\theta$ ranges from $0$ to $\frac{\alpha}{2}$, and from $0$ to $\frac{1}{2}\rho\|\bm{x}\|_{2}^{2}\cos(\alpha)=\frac{1}{2}\rho(x_{1}^{2}-x_{2}^{2})$ as $\theta$ goes from $\frac{\alpha}{2}$ to $\frac{\pi}{4}$. Thus, $Q^{\prime}$ is positive, and consequently $Q$ is increasing on $[\frac{\alpha}{2},\frac{\pi}{4}]$. Therefore, the minimal value of $Q$ over $[0,\frac{\pi}{4}]$ is attained at some point in the interval $[0,\frac{\alpha}{2}]$. Hence, we confine our analysis of $Q$ to this interval.
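The closed forms of $Q$ in (24) and of $Q^{\prime}$ can be cross-checked numerically against the definition $Q(\theta)=G(\bm{w}(\theta))$ with $G(\bm{w})=-\frac{\rho}{2}\langle\bm{x},\bm{w}\rangle^{2}+\bm{e}^{\top}\bm{w}$. The sketch below does this on a grid and also checks the sign of $Q^{\prime}$ on $[\frac{\alpha}{2},\frac{\pi}{4}]$; the test point, grid size, and finite-difference step are arbitrary choices made for illustration.

```python
import math


def alpha_of(x1, x2):
    """The constant alpha from (25)."""
    if x1 == x2:
        return math.pi / 2
    kappa = x2 / x1
    return math.atan(2.0 * kappa / (1.0 - kappa ** 2))


def G(theta, x1, x2, rho):
    """G(w(theta)) with w(theta) = (cos theta, sin theta)."""
    inner = x1 * math.cos(theta) + x2 * math.sin(theta)
    return -0.5 * rho * inner ** 2 + math.cos(theta) + math.sin(theta)


def Q(theta, x1, x2, rho):
    """Closed form (24)."""
    a = alpha_of(x1, x2)
    n2 = x1 * x1 + x2 * x2
    return (-0.5 * rho * n2 * math.cos(theta - a / 2) ** 2
            + math.sqrt(2) * math.sin(theta + math.pi / 4))


def dQ(theta, x1, x2, rho):
    """Closed form of Q'."""
    a = alpha_of(x1, x2)
    n2 = x1 * x1 + x2 * x2
    return (0.5 * rho * n2 * math.sin(2 * theta - a)
            + math.sqrt(2) * math.cos(theta + math.pi / 4))


def max_mismatch(x1, x2, rho, m=400, h=1e-6):
    """Largest deviation of (24) from G and of Q' from a central difference."""
    grid = [math.pi / 4 * i / m for i in range(m + 1)]
    err_q = max(abs(Q(t, x1, x2, rho) - G(t, x1, x2, rho)) for t in grid)
    err_dq = max(abs(dQ(t, x1, x2, rho)
                     - (Q(t + h, x1, x2, rho) - Q(t - h, x1, x2, rho)) / (2 * h))
                 for t in grid)
    return err_q, err_dq


x1, x2, rho = 1.0, 0.7, 1.5
print(max_mismatch(x1, x2, rho))
a_half = alpha_of(x1, x2) / 2
tail = [a_half + (math.pi / 4 - a_half) * i / 100 for i in range(101)]
print(all(dQ(t, x1, x2, rho) > 0 for t in tail))
```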

Remarkably, we can establish that $Q^{\prime}$ has at most two zeros in the interval $[0,\frac{\alpha}{2}]$. This can be demonstrated by factorizing $Q^{\prime}$ as a product of a positive function with a convex function:

\[
Q^{\prime}(\theta)=\frac{1}{2}\rho\|\bm{x}\|_{2}^{2}\cos\left(\theta+\frac{\pi}{4}\right)L(\theta),
\]

where $L:[0,\frac{\alpha}{2}]\rightarrow\mathbb{R}$ is defined as:

\[
L(\theta)=\frac{\sin(2\theta-\alpha)}{\cos(\theta+\frac{\pi}{4})}+\frac{2\sqrt{2}}{\rho\|\bm{x}\|_{2}^{2}}. \tag{26}
\]

We proceed to demonstrate that $L$ is convex on the interval $[0,\frac{\alpha}{2}]$.

Lemma 4.6.

For $\rho>0$ and a nonzero vector $\bm{x}\in\mathbb{R}_{\downarrow}^{2}$ with $\kappa=\frac{x_{2}}{x_{1}}\in[0,1)$, the following statements for the function $L$ given by (26) hold:

  • (i)

    $L$ is convex on the interval $[0,\frac{\alpha}{2}]$, where $\alpha$ is given in (25).

  • (ii)

    $L(0)$ is positive, zero, or negative if $\rho x_{1}x_{2}-1$ is negative, zero, or positive, respectively. $L^{\prime}(0)$ is nonnegative if $\kappa\leq\frac{\sqrt{5}-1}{2}$ and negative if $\kappa>\frac{\sqrt{5}-1}{2}$.

  • (iii)

    $L$ has at most two roots on the interval $[0,\frac{\alpha}{2}]$.

Proof.

Item (i). Notice that

\begin{align*}
L^{\prime}(\theta) &= \frac{2\cos(2\theta-\alpha)\cos(\theta+\frac{\pi}{4})+\sin(2\theta-\alpha)\sin(\theta+\frac{\pi}{4})}{\cos^{2}(\theta+\frac{\pi}{4})},\\
L^{\prime\prime}(\theta) &= \frac{\frac{1}{2}\sin(2\theta-\alpha)(\sin(2\theta)-1)+2\cos\alpha}{\cos^{3}(\theta+\frac{\pi}{4})}.
\end{align*}

Since both the numerator and the denominator of $L^{\prime\prime}$ are positive, $L^{\prime\prime}(\theta)>0$ for all $\theta\in[0,\frac{\alpha}{2}]$; hence, $L$ is strictly convex on this interval.

Item (ii). We notice that

\[
L(0)=\frac{2\sqrt{2}}{\rho\|\bm{x}\|_{2}^{2}}(1-\rho x_{1}x_{2}),\quad L^{\prime}(0)=\frac{2\sqrt{2}}{\|\bm{x}\|_{2}^{2}}(x_{1}^{2}-x_{2}^{2}-x_{1}x_{2}),
\]

where the expression for $L^{\prime}(0)$ does not involve $\rho$ because the $\rho$-dependent term of $L$ is constant in $\theta$. Since $x_{1}^{2}-x_{2}^{2}-x_{1}x_{2}=x_{1}^{2}(1-\kappa-\kappa^{2})$, the statements in item (ii) hold.

Item (iii). We have

\[
L\left(\frac{\alpha}{2}\right)=\frac{2\sqrt{2}}{\rho\|\bm{x}\|_{2}^{2}}>0,\quad L^{\prime}\left(\frac{\alpha}{2}\right)=\frac{2}{\cos(\frac{\alpha}{2}+\frac{\pi}{4})}>0.
\]

Together with the convexity of $L$ and the value of $L(0)$, we know that $L$ has at most two zeros on the interval $[0,\frac{\alpha}{2}]$. ∎
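The statements of Lemma 4.6 can be probed numerically for a given input. The sketch below evaluates $L$ from (26) on a grid of $[0,\frac{\alpha}{2}]$, tests convexity through second differences, counts sign changes, and reports $L(0)$ and a forward-difference estimate of $L^{\prime}(0)$. It is restricted to $\kappa<1$, and the helper names, grid size, and step `h` are choices made for this illustration only.

```python
import math


def alpha_of(x1, x2):
    """The constant alpha from (25), for kappa = x2 / x1 < 1."""
    kappa = x2 / x1
    return math.atan(2.0 * kappa / (1.0 - kappa * kappa))


def L(theta, x1, x2, rho):
    """The function L from (26)."""
    a = alpha_of(x1, x2)
    return (math.sin(2 * theta - a) / math.cos(theta + math.pi / 4)
            + 2 * math.sqrt(2) / (rho * (x1 * x1 + x2 * x2)))


def check_lemma(x1, x2, rho, m=2000, h=1e-4):
    """Return (convex on grid, number of sign changes, L(0), estimate of L'(0))."""
    a = alpha_of(x1, x2)
    grid = [0.5 * a * i / m for i in range(m + 1)]
    vals = [L(t, x1, x2, rho) for t in grid]
    convex = all(
        L(t - h, x1, x2, rho) - 2 * L(t, x1, x2, rho) + L(t + h, x1, x2, rho) > 0
        for t in grid[1:-1]
    )
    sign_changes = sum(1 for u, v in zip(vals, vals[1:]) if u * v < 0)
    slope0 = (L(h, x1, x2, rho) - L(0.0, x1, x2, rho)) / h
    return convex, sign_changes, vals[0], slope0


# kappa = 0.8 exceeds (sqrt(5) - 1) / 2 and rho * x1 * x2 = 0.96 < 1.
print(check_lemma(1.0, 0.8, 1.2))
```

For this input, $L(0)>0$ and $L^{\prime}(0)<0$, and $L$ dips below zero before returning to the positive value $L(\frac{\alpha}{2})$, so both roots allowed by item (iii) are present.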

With these preliminaries, we can now present the solution to problem (3) associated with $h_{1}$ in the following proposition, which provides the outcome of the $\bm{w}$-step of the WRD procedure for the proximity operator of $h_{1}$.

Proposition 4.7.

For $\rho>0$ and a nonzero vector $\bm{x}\in\mathbb{R}_{\downarrow}^{2}$ with $\kappa=\frac{x_{2}}{x_{1}}\in[0,1)$, let the function $Q$ be given by (24), and let the function $L$ be given by (26). Define $\alpha$ as in (25). Then, the optimal solution $\bm{w}^{\star}$ to problem (3) is represented as:

\[
\bm{w}^{\star}=\begin{bmatrix}\cos(\theta^{\star})\\ \sin(\theta^{\star})\end{bmatrix},
\]

where $\theta^{\star}$ is determined as follows:

  • (i)

    Case $\rho x_{1}x_{2}<1$. We choose

    \[
    \theta^{\star}=\left\{\begin{array}{ll}
    0,&\hbox{if $\kappa\leq\frac{\sqrt{5}-1}{2}$;}\\
    0,&\hbox{if $\frac{\sqrt{5}-1}{2}<\kappa<1$, $L(\theta_{0})\geq 0$ with $L'(\theta_{0})=0$;}\\
    \arg\min\{Q(\theta):\theta\in\{0,\theta_{1}\}\},&\hbox{if $\frac{\sqrt{5}-1}{2}<\kappa<1$, $L(\theta_{0})<0$ with $L'(\theta_{0})=0$.}
    \end{array}\right.
    \tag{27}
    \]

    Here $\theta_{1}$ is the root of $L$ on the interval $(\theta_{0},\frac{\alpha}{2})$.

  • (ii)

    Case $\rho x_{1}x_{2}=1$. If $\kappa\leq\frac{\sqrt{5}-1}{2}$, we choose $\theta^{\star}=0$; otherwise, $\theta^{\star}$ is chosen to be the root of $L$ on $(0,\frac{\alpha}{2})$.

  • (iii)

    Case $\rho x_{1}x_{2}>1$. We choose $\theta^{\star}$ to be the only root of $L$ on the interval $[0,\frac{\alpha}{2}]$.

Proof.

Case $\rho x_{1}x_{2}<1$. That is, $L(0)>0$ by Lemma 4.6. From the expression of $L'(0)$ obtained for item (ii) of Lemma 4.6, $L'(0)\geq 0$ holds if and only if $x_{1}^{2}-x_{2}^{2}-x_{1}x_{2}\geq 0$, that is, $\kappa\leq\frac{\sqrt{5}-1}{2}$. If $L'(0)\geq 0$, then $Q'$ has no root: in this situation, $L$ is positive, and so is $Q'$, on $[0,\frac{\alpha}{2}]$. Hence, we choose $\theta^{\star}=0$. If $L'(0)<0$, since $L'\left(\frac{\alpha}{2}\right)>0$, there exists one and only one point $\theta_{0}\in(0,\frac{\alpha}{2})$ such that $L'(\theta_{0})=0$. If $L(\theta_{0})\geq 0$, then $Q'$ has no root, and we choose $\theta^{\star}=0$. If $L(\theta_{0})<0$, then $L$ has a unique root, say $\theta_{1}$, on the interval $(\theta_{0},\frac{\alpha}{2})$. In this situation, we choose $\theta^{\star}=\arg\min\{Q(\theta):\theta\in\{0,\theta_{1}\}\}$. All situations are summarized in (27).

Case $\rho x_{1}x_{2}=1$. That is, $L(0)=0$ by Lemma 4.6. If $L'(0)\geq 0$, we choose $\theta^{\star}=0$. On the other hand, if $L'(0)<0$, let $\theta_{1}$ be the only root of $L$ on the open interval $(0,\frac{\alpha}{2})$; then $\theta^{\star}=\theta_{1}$.

Case $\rho x_{1}x_{2}>1$. That is, $L(0)<0$ by Lemma 4.6 again. Let $\theta_{1}$ be the only root of $L$ on the open interval $(0,\frac{\alpha}{2})$. Then, $\theta^{\star}=\theta_{1}$, and $Q$ achieves its global minimum at $\theta^{\star}$. ∎
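The case analysis of Proposition 4.7 can be sketched in code as follows. This is only an illustrative sketch under stated assumptions: the functions $L$, $L'$, and $Q$ of (26) and (24) and the angle $\alpha$ of (25) are passed in by the caller rather than reproduced here, the helper names `bisect` and `theta_star` are hypothetical, bisection is our choice of root finder (the proposition does not prescribe one), and the tolerance `eq_tol` used to detect $\rho x_{1}x_{2}=1$ in floating point is likewise an assumption.

```python
import math

# Threshold on kappa = x2 / x1 appearing in Proposition 4.7.
GOLDEN_KAPPA = (math.sqrt(5.0) - 1.0) / 2.0


def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Root of f on [lo, hi] by bisection (assumes a sign change on the interval)."""
    f_lo = f(lo)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or hi - lo < tol:
            return mid
        if (f_lo < 0.0) == (f_mid < 0.0):
            lo, f_lo = mid, f_mid  # root lies in the right half
        else:
            hi = mid               # root lies in the left half
    return 0.5 * (lo + hi)


def theta_star(L, dL, Q, alpha, rho, x1, x2, eq_tol=1e-12):
    """Select theta* following cases (i)-(iii); L, dL, Q are caller-supplied callables."""
    half = 0.5 * alpha
    kappa = x2 / x1
    p = rho * x1 * x2
    if p > 1.0 + eq_tol:
        # Case (iii): L(0) < 0 < L(alpha/2), so L has a single root.
        return bisect(L, 0.0, half)
    if kappa <= GOLDEN_KAPPA:
        # Cases (i) and (ii) with L'(0) >= 0.
        return 0.0
    # Here L'(0) < 0 < L'(alpha/2): locate the stationary point theta0 of L.
    theta0 = bisect(dL, 0.0, half)
    if abs(p - 1.0) <= eq_tol:
        # Case (ii): L(0) = 0, so the relevant root lies beyond theta0.
        return bisect(L, theta0, half)
    # Case (i) with kappa above the threshold.
    if L(theta0) >= 0.0:
        return 0.0
    theta1 = bisect(L, theta0, half)
    return min((0.0, theta1), key=Q)
```

Passing $L$, $L'$, and $Q$ as callables keeps the sketch independent of the closed forms in (24) and (26); a production implementation would likely replace the bisection by a faster bracketing method.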

Based on Proposition 4.7, the set $\mathbb{R}_{\downarrow}^{2}\setminus\{\alpha\bm{e}:\alpha\in\mathbb{R}\}$ is split into three disjoint sets $I_{1}$, $I_{2}$, and $I_{3}$, as follows:

\begin{eqnarray*}
I_{1}&=&\{(x_{1},x_{2})\in\mathbb{R}_{\downarrow}^{2}:x_{1}>x_{2},\ \rho x_{1}x_{2}<1\},\\
I_{2}&=&\{(x_{1},x_{2})\in\mathbb{R}_{\downarrow}^{2}:x_{1}>x_{2},\ \rho x_{1}x_{2}=1\},\\
I_{3}&=&\{(x_{1},x_{2})\in\mathbb{R}_{\downarrow}^{2}:x_{1}>x_{2},\ \rho x_{1}x_{2}>1\}.
\end{eqnarray*}

We further split $I_{1}$ as the union of $I_{1i}$, $i=1,2,3,4$, and $I_{2}$ as the union of $I_{2i}$, $i=1,2,3,4$, as follows:

\[
\begin{array}{ll}
I_{11}=\left\{(x_{1},x_{2})\in I_{1}:x_{1}>\sqrt{\frac{2}{\rho}}\right\}&I_{21}=\left\{(x_{1},x_{2})\in I_{2}:x_{1}>\sqrt{\frac{2}{\rho}}\right\}\\
I_{12}=\left\{(x_{1},x_{2})\in I_{1}:x_{1}=\sqrt{\frac{2}{\rho}}\right\}&I_{22}=\left\{\left(\sqrt{\frac{2}{\rho}},\sqrt{\frac{1}{2\rho}}\right)\right\}\\
I_{13}=\left\{(x_{1},x_{2})\in I_{1}:\frac{\sqrt{5}+1}{2}x_{2}\leq x_{1}<\sqrt{\frac{2}{\rho}}\right\}&I_{23}=\left\{(x_{1},x_{2})\in I_{2}:\sqrt{\frac{\sqrt{5}+1}{2\rho}}\leq x_{1}<\sqrt{\frac{2}{\rho}}\right\}\\
I_{14}=\left\{(x_{1},x_{2})\in I_{1}:\frac{\sqrt{5}+1}{2}x_{2}>x_{1}\right\}&I_{24}=\left\{(x_{1},x_{2})\in I_{2}:\sqrt{\frac{1}{\rho}}<x_{1}<\sqrt{\frac{\sqrt{5}+1}{2\rho}}\right\}
\end{array}
\]
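To make the partition concrete, the following sketch assigns a point $(x_{1},x_{2})$ to one of the sets $I_{11},\ldots,I_{14}$, $I_{21},\ldots,I_{24}$, or $I_{3}$ defined above. The function name `classify_region` and the absolute tolerance `tol` used to decide the equalities $\rho x_{1}x_{2}=1$ and $x_{1}=\sqrt{2/\rho}$ in floating point are assumptions made for illustration; the sketch also assumes $x_{1}>x_{2}\geq 0$, in line with $\kappa\in[0,1)$.

```python
import math

PHI = (math.sqrt(5.0) + 1.0) / 2.0  # the constant (sqrt(5) + 1) / 2 in I13, I14, I23, I24


def classify_region(x1, x2, rho, tol=1e-12):
    """Return the label of the set containing (x1, x2), e.g. "I13" or "I3"."""
    if rho <= 0.0 or not (x1 > x2 >= 0.0):
        raise ValueError("expected rho > 0 and x1 > x2 >= 0")
    p = rho * x1 * x2
    t = math.sqrt(2.0 / rho)  # threshold sqrt(2 / rho) on x1
    if p > 1.0 + tol:
        return "I3"
    if p < 1.0 - tol:
        # Inside I1: split by x1 against sqrt(2/rho), then by the ratio x1 / x2.
        if abs(x1 - t) <= tol:
            return "I12"
        if x1 > t:
            return "I11"
        return "I13" if PHI * x2 <= x1 else "I14"
    # Inside I2 (rho * x1 * x2 = 1 up to tol): split by x1 alone.
    if abs(x1 - t) <= tol:
        return "I22"
    if x1 > t:
        return "I21"
    return "I23" if x1 >= math.sqrt(PHI / rho) else "I24"
```

For instance, with $\rho=1$ the point $(1, 0.5)$ falls in $I_{13}$, since $\frac{\sqrt{5}+1}{2}\cdot 0.5\approx 0.809\leq 1<\sqrt{2}$.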

With the given sets, the proximity operator of $h_{1}$ from the WRD procedure is presented in the next theorem.

Theorem 4.8.

Let $\rho>0$. For $\bm{x}\in I_{1}\cup I_{2}$, we have

\[
\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})=\left\{\begin{array}{ll}
\{x_{1}\bm{e}_{1}\}&\hbox{if $\bm{x}\in I_{11}\cup I_{21}$;}\\
\{\mathbf{0},\sqrt{\frac{2}{\rho}}\bm{e}_{1}\}&\hbox{if $\bm{x}\in I_{12}\cup I_{22}$;}\\
\{\mathbf{0}\}&\hbox{if $\bm{x}\in I_{13}\cup I_{23}$;}\\
\arg\min\{F(\bm{z}):\bm{z}\in\{\mathbf{0},\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}\}\}&\hbox{if $\bm{x}\in I_{14}\cup I_{24}$,}
\end{array}\right.
\]

where $\bm{w}^{\star}$ is from item (i) or item (ii) of Proposition 4.7.

For $\bm{x}\in I_{3}$, we have

\[
\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})=\arg\min\{F(\bm{z}):\bm{z}\in\{\mathbf{0},\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}\}\},
\]

where $\bm{w}^{\star}$ is from item (iii) of Proposition 4.7.

Proof.

The $\bm{w}$-step of the WRD procedure provides $\bm{w}^{\star}$, the solution of optimization problem (3) associated with the function $h_{1}$, by Proposition 4.7. The $r$-step simply follows with $r^{\star}=\langle\bm{x},\bm{w}^{\star}\rangle$. At the $d$-step, we compare $F(r^{\star}\bm{w}^{\star})$ and $F(\bm{0})$ with $F$ defined in (1). Note that

\[
F(r^{\star}\bm{w}^{\star})-F(\bm{0})=G(\bm{w}^{\star}).
\]

If $G(\bm{w}^{\star})$ is positive, the zero vector is in $\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})$; if $G(\bm{w}^{\star})$ is negative, $r^{\star}\bm{w}^{\star}$ is in $\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})$; if $G(\bm{w}^{\star})$ is zero, both the zero vector and $r^{\star}\bm{w}^{\star}$ are in $\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})$. The rest of the result follows directly from Proposition 4.7. ∎
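The $r$-step and $d$-step used in this proof can be sketched as follows. The sketch assumes that, for a nonnegative unit vector $\bm{w}$, $G(\bm{w})=-\frac{1}{2}\rho\langle\bm{x},\bm{w}\rangle^{2}+\sum_{i}w_{i}$, which is the form read off from the decomposition of $H$ in the proof of Theorem 4.9; the function names `G` and `d_step` and the tolerance `tol` for deciding $G(\bm{w}^{\star})=0$ in floating point are illustrative assumptions. The vector $\bm{w}^{\star}$ is taken as given (the outcome of the $\bm{w}$-step).

```python
def G(x, w, rho):
    """Assumed form of G on the nonnegative unit sphere: -rho/2 * <x, w>^2 + sum(w)."""
    inner = sum(xi * wi for xi, wi in zip(x, w))
    return -0.5 * rho * inner ** 2 + sum(w)


def d_step(x, w_star, rho, tol=1e-12):
    """Return the list of minimizers among {0, r* w*} according to the sign of G(w*)."""
    r_star = sum(xi * wi for xi, wi in zip(x, w_star))  # r-step: r* = <x, w*>
    candidate = [r_star * wi for wi in w_star]
    zero = [0.0] * len(x)
    g = G(x, w_star, rho)
    if g > tol:
        return [zero]            # F(0) is strictly smaller
    if g < -tol:
        return [candidate]       # F(r* w*) is strictly smaller
    return [zero, candidate]     # tie: both belong to the proximity operator
```

As a consistency check with Theorem 4.8, taking $\rho=1$, $\bm{x}=(3,0)$, and $\bm{w}^{\star}=\bm{e}_{1}$ gives $G(\bm{w}^{\star})=-3.5<0$, so the sketch returns $x_{1}\bm{e}_{1}$, in agreement with the case $\bm{x}\in I_{11}$.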

Figure 4.3(a) illustrates the region where the proximity operator $\mathrm{prox}_{\frac{1}{\rho}h_{1}}$ maps points to the origin. According to Theorem 4.2, all points on the line segment from the origin to $(\sqrt{\sqrt{2}/\rho},\sqrt{\sqrt{2}/\rho})$ are mapped to the origin by $\mathrm{prox}_{\frac{1}{\rho}h_{1}}$. Additionally, as stated in Theorem 4.8, all points under the line $x_{2}=\frac{\sqrt{5}-1}{2}x_{1}$ in the red region are mapped to the origin by $\mathrm{prox}_{\frac{1}{\rho}h_{1}}$. The remaining points in both red and blue colors are obtained numerically with the assistance of Theorem 4.8.

Figure 4.3: (a) The proximity operator $\mathrm{prox}_{\frac{1}{\rho}h_{1}}$ maps all points in the shaded region to the origin; (b) numerical result for the region that is mapped to the origin by $\mathrm{prox}_{\frac{1}{\rho}h_{1}}$.

4.3 General case: the proximity operator of $h_{1}$ on $\mathbb{R}^{n}$

Here, we demonstrate that if the last $k$ entries of $\bm{x}\in\mathbb{R}^{n}_{\downarrow}$ are zero, then the last $k$ entries of $\bm{w}^{\star}$, the optimal solution to problem (3), are zero as well. Leveraging this result, we proceed by assuming that all entries of $\bm{x}\in\mathbb{R}^{n}_{\downarrow}$ are nonzero. The primary outcome of this subsection is the transformation of problem (3) into one with the same objective function but constrained to a convex set. The modified problem can be addressed using the nonconvex gradient projection algorithm in [8]. Subsequently, we introduce an algorithm for computing the proximity operator of $h_{1}$ on $\mathbb{R}^{n}$.

Theorem 4.9.

For $\rho>0$ and $\bm{x}\in\mathbb{R}_{\downarrow}^{n}$, suppose that the last $k\geq 1$ entries of $\bm{x}$ are zero, that is,

\[
\bm{x}=\begin{bmatrix}\bm{x}_{[n-k]}\\ \mathbf{0}\end{bmatrix}.
\]

Then, for an optimal solution $\bm{w}^{\star}$ to problem (3), we have $\bm{w}^{\star}_{[n]\setminus[n-k]}=\mathbf{0}$, that is, the last $k$ entries of $\bm{w}^{\star}$ are zero.

Proof.

The proof hinges on iteratively reducing the dimension by one, for up to $k$ steps. Without loss of generality, we assume that $k=1$. Let $F$ denote the objective function of problem (3) defined on $\mathbb{S}^{n-1}_{+}$. Throughout this proof, we consistently treat $\bm{w}_{[n-1]}$ as the truncation of $\bm{w}$ to its first $(n-1)$ entries.

Define $H:\mathbb{B}^{n-1}_{+}(\mathbf{0},1)\rightarrow\mathbb{R}$ as $H(\bm{w}_{[n-1]}):=G(\bm{w})$. Since the last entry of $\bm{x}$ is zero, we have

\begin{eqnarray*}
H(\bm{w}_{[n-1]})&=&\underbrace{-\frac{1}{2}\rho\langle\bm{x}_{[n-1]},\bm{w}_{[n-1]}\rangle^{2}}_{H_{1}(\bm{w}_{[n-1]})}+\underbrace{\sum_{i=1}^{n-1}w_{i}+\sqrt{1-\sum_{i=1}^{n-1}w_{i}^{2}}}_{H_{2}(\bm{w}_{[n-1]})}.
\end{eqnarray*}

We can verify that both $H_{1}$ and $H_{2}$ are concave functions over the domain $\mathbb{B}^{n-1}_{+}$; hence, the minimal value of $H_{1}+H_{2}$ is achieved at some $\bm{w}^{\star}_{[n-1]}$ on the boundary of the ball.

We remark that $\bm{w}^{\star}_{[n-1]}$ cannot be the zero vector. If it were, then $H(\bm{w}^{\star}_{[n-1]})=H(\mathbf{0})=1$. However, $H(\bm{e}_{1})=-\frac{1}{2}\rho x_{1}^{2}+1<1$, which contradicts $\bm{w}^{\star}_{[n-1]}$ being a minimizer of $H$.
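The comparison $H(\bm{e}_{1})<H(\mathbf{0})$ can be checked numerically with a short sketch. The function name `H` mirrors the notation of the proof, but its argument layout is an assumption made for illustration, and the clamp on the radicand is only a guard against floating-point rounding, not part of the definition.

```python
import math


def H(w_head, x_head, rho):
    """Evaluate H = H1 + H2 at w_[n-1], given x_[n-1] and rho > 0."""
    inner = sum(xi * wi for xi, wi in zip(x_head, w_head))
    radicand = 1.0 - sum(wi * wi for wi in w_head)
    h1 = -0.5 * rho * inner ** 2                          # concave quadratic part H1
    h2 = sum(w_head) + math.sqrt(max(radicand, 0.0))      # H2, with a rounding guard
    return h1 + h2
```

For example, with $\rho=1$ and $\bm{x}_{[n-1]}=(2,1)$, the sketch gives $H(\mathbf{0})=1$ and $H(\bm{e}_{1})=-\frac{1}{2}\cdot 4+1=-1$, so the zero vector is not a minimizer.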

Next, we show that $\bm{w}^{\star}_{[n-1]}$ must be a unit vector, that is, $\bm{w}^{\star}_{[n-1]}\in\mathbb{S}^{n-2}_{+}$. Suppose instead that $\|\bm{w}^{\star}_{[n-1]}\|_{2}<1$. We show that there then exists a better solution on the boundary of $\mathbb{B}^{n-1}_{+}(\mathbf{0},1)$, which contradicts the optimality of $\bm{w}_{[n-1]}^{\star}$.
Write $\tilde{\bm{w}}^{\star}_{[n-1]}=\frac{\bm{w}^{\star}_{[n-1]}}{\|\bm{w}^{\star}_{[n-1]}\|_{2}}$ and define $C:[0,1]\rightarrow\mathbb{R}$ as follows:

\[ C(\lambda):=H_{1}(\lambda\tilde{\bm{w}}^{\star}_{[n-1]})+H_{2}(\lambda\tilde{\bm{w}}^{\star}_{[n-1]}). \]

Clearly,

\[ C(\lambda)=-\frac{1}{2}\rho\langle\bm{x}_{[n-1]},\tilde{\bm{w}}^{\star}_{[n-1]}\rangle^{2}\lambda^{2}+\langle\bm{e},\tilde{\bm{w}}^{\star}_{[n-1]}\rangle\lambda+\sqrt{1-\lambda^{2}}, \]

which is nonconstant and concave with respect to the variable $\lambda$. Its minimal value over $[0,1]$ is therefore attained at an endpoint, and since $\lambda=0$ corresponds to the zero vector, which was excluded above, it can only be achieved at $\lambda=1$. Hence $\|\bm{w}^{\star}_{[n-1]}\|_{2}=1$. In other words, the $n$-th entry of the optimal solution $\bm{w}^{\star}$ to problem (3) must be $0$. This completes the proof. ∎

Note that for problem (3), the feasible set $\mathbb{S}_{+}^{n-1}$ is nonconvex, which complicates algorithm development. To address this, we present the following result, which allows us to pose the problem over a convex set, specifically $\mathbb{B}_{+}^{n}(\mathbf{0},1)$. This provides a more tractable route for algorithmic development and analysis.

Theorem 4.10.

Let $\bm{x}\in\mathbb{R}_{\downarrow}^{n}$ and assume that its last entry $x_{n}$ is nonzero. Let $\bm{w}^{\star}$ be an optimal solution to the following optimization problem

\begin{equation} \min\left\{\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}+\bm{e}^{\top}\bm{w}:\bm{w}\in\mathbb{B}_{\downarrow}^{n}(\mathbf{0},1)\right\}, \tag{28} \end{equation}

where $\mathsf{A}_{\rho,\bm{x}}$ is given by (19). Then, $\bm{w}^{\star}$ is either the origin or the optimal solution to the optimization problem (3). Furthermore, we have

\begin{equation} \langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}\in\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x}). \tag{29} \end{equation}
Proof.

The proof is trivial if $\bm{w}^{\star}$ is the zero vector. If $\bm{w}^{\star}\neq\mathbf{0}$, we now show that $\|\bm{w}^{\star}\|_{2}=1$, i.e., $\bm{w}^{\star}$ is the optimal solution to the optimization problem (3). Suppose not, and denote the objective function of problem (28) by $H$, that is,

\[ H(\bm{w})=\frac{1}{2}\bm{w}^{\top}\mathsf{A}_{\rho,\bm{x}}\bm{w}+\bm{e}^{\top}\bm{w}. \]

Set $\tilde{\bm{w}}^{\star}:=\frac{\bm{w}^{\star}}{\|\bm{w}^{\star}\|_{2}}$, and define $C:[0,1]\rightarrow\mathbb{R}$ as follows:

\[ C(\lambda)=H(\lambda\tilde{\bm{w}}^{\star})=-\lambda\left(\frac{1}{2}\rho\langle\bm{x},\tilde{\bm{w}}^{\star}\rangle^{2}\lambda-\|\tilde{\bm{w}}^{\star}\|_{1}\right). \]

Clearly, $C$ achieves its minimal value over $[0,1]$ at either $\lambda=0$ or $\lambda=1$. Hence,

\[ H(\bm{w}^{\star})=C(\|\bm{w}^{\star}\|_{2})>\min\{C(0),C(1)\}=\min\{H(\mathbf{0}),H(\tilde{\bm{w}}^{\star})\}. \]

This contradicts the optimality of $\bm{w}^{\star}$ for problem (28). We conclude that $\bm{w}^{\star}$ is either the origin or the optimal solution to the optimization problem (3).

Finally, we show that the inclusion (29) holds. If $\bm{w}^{\star}=\mathbf{0}$, then, for all $\bm{w}\in\mathbb{S}^{n-1}_{+}$,

\[ 0=H(\mathbf{0})\leq H(\bm{w})=G(\bm{w}), \]

where $G$ is the objective function of problem (3). Therefore, whatever the optimal solution to problem (3) is, we have $\mathbf{0}\in\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})$.

If $\bm{w}^{\star}\neq\mathbf{0}$, then $\bm{w}^{\star}$ is the optimal solution to problem (3) as well. Hence $\bm{w}^{\star}$ is the output of the $\bm{w}$-step of the WRD procedure and $G(\bm{w}^{\star})<0$. Obviously, $\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}\in\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})$ by the $r$-step and $d$-step of the WRD procedure. We conclude that the inclusion (29) holds. ∎
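The endpoint property of $C$ used in the proof of Theorem 4.10 can be spelled out as follows. This is a supplementary verification, not part of the original argument, and it assumes $\langle\bm{x},\tilde{\bm{w}}^{\star}\rangle\neq 0$ (which holds, for instance, when all entries of $\bm{x}$ are positive and $\tilde{\bm{w}}^{\star}$ is a nonzero vector with nonnegative entries). Differentiating $C$ twice gives

\[ C''(\lambda)=-\rho\langle\bm{x},\tilde{\bm{w}}^{\star}\rangle^{2}<0, \]

so $C$ is strictly concave on $[0,1]$, and for every $\lambda\in(0,1)$,

\[ C(\lambda)>(1-\lambda)C(0)+\lambda C(1)\geq\min\{C(0),C(1)\}. \]

Taking $\lambda=\|\bm{w}^{\star}\|_{2}\in(0,1)$ yields the strict inequality used in the proof.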

Based on Theorem 4.10, computing $\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})$ reduces to solving the optimization problem (28). This problem has a concave objective function restricted to a convex set. A popular algorithm for solving problem (28) is the nonconvex gradient projection algorithm: starting from an initial guess $\bm{w}^{(0)}$, iterate

\begin{equation} \bm{w}^{(k+1)}=P_{\mathbb{B}_{\downarrow}^{n}(\mathbf{0},1)}\left(\bm{w}^{(k)}-\frac{1}{2\rho\|\bm{x}\|_{2}^{2}}(\mathsf{A}_{\rho,\bm{x}}\bm{w}^{(k)}+\bm{e})\right), \tag{30} \end{equation}

where $P_{\mathbb{B}_{\downarrow}^{n}(\mathbf{0},1)}$ is the projection operator onto the set $\mathbb{B}_{\downarrow}^{n}(\mathbf{0},1)$. Since $\mathbb{B}_{\downarrow}^{n}(\mathbf{0},1)$ is a closed and bounded semi-algebraic convex subset of $\mathbb{R}^{n}$ and the gradient of the objective function of problem (28) is $\mathsf{A}_{\rho,\bm{x}}\bm{w}+\bm{e}$ with Lipschitz constant $\rho\|\bm{x}\|_{2}^{2}$, the sequence $\{\bm{w}^{(k)}\}_{k\in\mathbb{N}}$ converges; see, for example, [8, Theorem 5.3].
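The iteration (30) can be sketched in a few lines of Python. This is a minimal illustration under two assumptions that are not stated in this section: first, that $\mathsf{A}_{\rho,\bm{x}}=-\rho\bm{x}\bm{x}^{\top}$, which is inferred from the expression for $C(\lambda)$ in the proof of Theorem 4.10 rather than taken from (19); second, that the projection can be computed as the projection onto the nonnegative part of the unit ball (clip negative entries, then rescale if the norm exceeds one). The function names `objective`, `project_nonneg_unit_ball`, and `w_step` are hypothetical.

```python
import math

def objective(w, x, rho):
    """H(w) = -(rho/2) <x, w>^2 + sum(w), ASSUMING A_{rho,x} = -rho * x x^T."""
    inner = sum(xi * wi for xi, wi in zip(x, w))
    return -0.5 * rho * inner * inner + sum(w)

def project_nonneg_unit_ball(v):
    """Project onto {w >= 0, ||w||_2 <= 1} (assumed feasible set):
    clip negative entries, then rescale if the norm exceeds 1."""
    clipped = [max(vi, 0.0) for vi in v]
    norm = math.sqrt(sum(c * c for c in clipped))
    return clipped if norm <= 1.0 else [c / norm for c in clipped]

def w_step(x, rho, alpha=0.5, max_iter=10000, tol=1e-12):
    """Run iteration (30) from w0 = alpha * x / ||x||_2 until the iterates stall."""
    sq_norm = sum(xi * xi for xi in x)
    step = 1.0 / (2.0 * rho * sq_norm)  # step size 1 / (2 * rho * ||x||_2^2)
    w = [alpha * xi / math.sqrt(sq_norm) for xi in x]
    for _ in range(max_iter):
        inner = sum(xi * wi for xi, wi in zip(x, w))
        grad = [-rho * inner * xi + 1.0 for xi in x]  # A w + e under the assumption
        w_next = project_nonneg_unit_ball(
            [wi - step * gi for wi, gi in zip(w, grad)]
        )
        change = math.sqrt(sum((a - b) ** 2 for a, b in zip(w_next, w)))
        w = w_next
        if change < tol:
            break
    return w
```

For a sorted nonnegative $\bm{x}$ and a starting point proportional to $\bm{x}$, each update adds a nonnegative multiple of $\bm{x}$ and subtracts a constant from every entry, so the iterates in this sketch stay nonincreasing. Because the step size is below the reciprocal of the Lipschitz constant, the objective value does not increase from one iterate to the next.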

We are now ready to present our algorithm for computing $\mathrm{prox}_{\frac{1}{\rho}h_{1}}$ based on our WRD procedure for arbitrary $\bm{x}\in\mathbb{R}^{n}$. It is given in Algorithm 2.

1: Input: Vector $\bm{x}\in\mathbb{R}^{n}$, parameter $\rho>0$, and an initial guess $\bm{w}^{(0)}$
2: Output: The proximal operator $\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})$
3: procedure (WRD Procedure)
4:     Sort and convert $\bm{x}$ into $\mathbb{R}^{n}_{\downarrow}$ via a signed permutation matrix $\mathsf{P}$.
5:     Trim $\bm{x}$ if necessary by Theorem 4.9
6:     Generate $\bm{w}^{(k)}$ via (30) and denote its limit by $\bm{w}^{\star}$ ($\bm{w}$-step)
7:     Form $\bm{u}\leftarrow\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}$ by Theorem 4.10 ($r$-step and $d$-step)
8:     Pad $\bm{u}$ with a zero block if necessary by Theorem 4.9
9:     $\bm{u}\leftarrow\mathsf{P}^{-1}\bm{u}\in\mathrm{prox}_{\frac{1}{\rho}h_{1}}(\bm{x})$
10: end procedure
Algorithm 2 Computing the Proximal Operator of $h_{1}$
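Steps 4 and 9 of Algorithm 2 (applying the signed permutation $\mathsf{P}$ and its inverse) can be sketched as follows. This is an illustrative sketch only: it stores $\mathsf{P}$ as an index order plus a sign vector instead of a matrix, it does not implement the trimming and padding of Theorem 4.9, and the helper names `sort_by_signed_permutation` and `undo_signed_permutation` are hypothetical.

```python
def sort_by_signed_permutation(x):
    """Return (sorted_abs, order, signs), where sorted_abs holds |x| in
    nonincreasing order, order[pos] is the original index of the entry now at
    position pos, and signs[pos] is the sign of that original entry."""
    order = sorted(range(len(x)), key=lambda i: -abs(x[i]))
    signs = [1.0 if x[i] >= 0 else -1.0 for i in order]
    sorted_abs = [abs(x[i]) for i in order]
    return sorted_abs, order, signs

def undo_signed_permutation(u, order, signs):
    """Apply the inverse map: restore the signs and the original positions."""
    out = [0.0] * len(u)
    for pos, (i, s) in enumerate(zip(order, signs)):
        out[i] = s * u[pos]
    return out
```

In the setting of Algorithm 2, the vector $\bm{u}=\langle\bm{x},\bm{w}^{\star}\rangle\bm{w}^{\star}$ computed from the sorted input would be passed to `undo_signed_permutation` together with the stored `order` and `signs` to obtain an element of the proximity operator at the original $\bm{x}$.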

Due to the inherent nonconvexity of problem (28), the initial guess provided to any algorithm for this problem significantly influences the quality of the solution obtained. In our simulations, we have observed that choosing $\bm{w}^{(0)}=\alpha\frac{\bm{x}}{\|\bm{x}\|_{2}}$ with $\alpha\in[\frac{1}{4},\frac{3}{4}]$ tends to yield satisfactory results. The numerical result with Algorithm 2 in $\mathbb{R}^{2}$ is shown in Figure 4.3(b). The regions identified there as being mapped to the origin by $\mathrm{prox}_{\frac{1}{\rho}h_{1}}$ are consistent with those in Figure 4.3(a).

5 Conclusions

This paper addresses the computation of proximity operators of scale and signed permutation invariant functions. By delving into the intrinsic properties of these functions, we introduce a procedure called WRD, which consists of the $\bm{w}$-step, $r$-step, and $d$-step, to handle the computation of proximity operators. In particular, we investigate two such functions: the ratio $\ell_{1}/\ell_{2}$ and its square. For the function $(\ell_{1}/\ell_{2})^{2}$, we propose an algorithm capable of explicitly generating its proximity operator through a few straightforward steps. Additionally, for the function $\ell_{1}/\ell_{2}$, we devise an efficient algorithm with guaranteed convergence to compute its proximity operator.

In future endeavors, we aim to explore the practical applications of these developed algorithms, particularly in sparse signal recovery and image processing domains.

Declarations

  • The authors declare that they have no conflict of interest.

  • The work of L. Shen was supported in part by the National Science Foundation under grant DMS-2208385 and by 2023 and 2024 Air Force Summer Faculty Fellowship Program (SFFP). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation and AFRL (Air Force Research Laboratory).

References

  • Candes et al. [2008] Candes, E., Wakin, M.B., Boyd, S.: Enhancing sparsity by reweighted $\ell^{1}$ minimization. Journal of Fourier Analysis and Applications 14, 877–905 (2008)
  • Prater-Bennette et al. [2022] Prater-Bennette, A., Shen, L., Tripp, E.E.: The proximity operator of the log-sum penalty. Journal of Scientific Computing 93(3), 1–34 (2022)
  • Lopes [2016] Lopes, M.E.: Unknown sparsity in compressed sensing: Denoising and inference. IEEE Transactions on Information Theory 62(9), 5145–5166 (2016)
  • Rahimi et al. [2019] Rahimi, Y., Wang, C., Dong, H., Lou, Y.: A scale-invariant approach for sparse signal recovery. SIAM Journal on Scientific Computing 41(6), 3649–3672 (2019)
  • Tang and Nehorai [2011] Tang, G., Nehorai, A.: Performance analysis of sparse recovery based on constrained minimal singular values. IEEE Transactions on Signal Processing 59(12), 5734–5745 (2011)
  • Yin et al. [2014] Yin, P., Esser, E., Xin, J.: Ratio and difference of $\ell_{1}$ and $\ell_{2}$ norms and sparse representation with coherent dictionaries. Communications in Information and Systems 14(2), 87–109 (2014)
  • Xu et al. [2021] Xu, Y., Narayan, A., Tran, H., Webster, C.G.: Analysis of the ratio of $\ell_{1}$ and $\ell_{2}$ norms in compressed sensing. Applied and Computational Harmonic Analysis 55, 486–511 (2021)
  • Attouch et al. [2013] Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Mathematical Programming, Ser. A 137, 91–129 (2013)
  • Beck and Teboulle [2009] Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences 2, 183–202 (2009)
  • Bolte et al. [2014] Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Mathematical Programming 146, 449–494 (2014)
  • Combettes and Wajs [2005] Combettes, P., Wajs, V.: Signal recovery by proximal forward-backward splitting. Multiscale Modeling and Simulation: A SIAM Interdisciplinary Journal 4, 1168–1200 (2005)
  • Krol et al. [2012] Krol, A., Li, S., Shen, L., Xu, Y.: Preconditioned alternating projection algorithms for maximum a Posteriori ECT reconstruction. Inverse Problems 28, 115005–34 (2012)
  • Li et al. [2015] Li, Q., Shen, L., Xu, Y., Zhang, N.: Multi-step fixed-point proximity algorithms for solving a class of optimization problems arising from image processing. Advances in Computational Mathematics 41(2), 387–422 (2015)
  • Micchelli et al. [2011] Micchelli, C.A., Shen, L., Xu, Y.: Proximity algorithms for image models: Denoising. Inverse Problems 27, 045009–30 (2011)
  • Parikh and Boyd [2014] Parikh, N., Boyd, S.: Proximal algorithms. Foundations and Trends in Optimization 1, 123–231 (2014)
  • Tao [2022] Tao, M.: Minimization of $\ell_{1}$ over $\ell_{2}$ for sparse signal recovery with convergence guarantee. SIAM Journal on Scientific Computing 44(2), 770–797 (2022)
  • Shen et al. [2019] Shen, L., Suter, B.W., Tripp, E.E.: Structured sparsity promoting functions. Journal of Optimization Theory and Applications 183(3), 386–421 (2019)
  • Moreau [1962] Moreau, J.-J.: Fonctions convexes duales et points proximaux dans un espace hilbertien. C.R. Acad. Sci. Paris Sér. A Math. 255, 2897–2899 (1962)
  • Donoho [1995] Donoho, D.: De-noising by soft-thresholding. IEEE Transactions on Information Theory 41, 613–627 (1995)
  • Tao and An [1996] Tao, P.D., An, L.T.H.: Difference of convex functions optimization algorithms (DCA) for globally minimizing nonconvex quadratic forms on Euclidean balls and spheres. Operations Research Letters 19(5), 207–216 (1996)
  • Martínez [1994] Martínez, J.M.: Local minimizers of quadratic functions on Euclidean balls and spheres. SIAM Journal on Optimization 4(1), 159–176 (1994)
  • Wang [2004] Wang, X.: A simple proof of Descartes’s rule of signs. The American Mathematical Monthly 111(6), 525–526 (2004) https://doi.org/10.1080/00029890.2004.11920108