Calmness characterizations on PSD composite rank constraint systems and applications
Yitian Qian∗, Shaohua Pan† and Yulan Liu‡

February 23, 2021
Abstract
This paper is concerned with the positive semidefinite (PSD) composite rank constraint system. By leveraging its equivalent reformulations, we present several criteria to identify the calmness of the multifunction associated with its partial perturbation and discuss the relations among them. Then, a collection of examples is provided to illustrate that these criteria can be satisfied for some common PSD composite rank constraint sets. Finally, the calmness of the associated multifunction is employed to establish the global exact penalty for the equivalent reformulations of PSD composite rank constrained and regularized optimization problems.
In this paper, a hollow capital, say $\mathbb{X}$, denotes a finite dimensional vector space equipped with the inner product $\langle\cdot,\cdot\rangle$ and its induced norm $\|\cdot\|$; for example, $\mathbb{S}^n$ represents the space of all $n\times n$ real symmetric matrices, equipped with the trace inner product $\langle\cdot,\cdot\rangle$ and its induced Frobenius norm $\|\cdot\|_F$. Let $\Omega$ be a closed set in $\mathbb{S}^n$. For a given integer $1\le r\le n$, we are interested in the following PSD composite rank constraint set
$$\Gamma:=\big\{X\in\mathbb{S}^n_+\cap\Omega\ \big|\ \mathrm{rank}(X)\le r\big\},$$
where $\mathbb{S}^n_+$ denotes the closed convex cone consisting of all PSD matrices of $\mathbb{S}^n$. Notice that for any $Z\in\mathbb{S}^n_+$, $\mathrm{rank}(Z)\le r$ if and only if $\mathrm{tr}(Z)-S_r(Z)=0$, where $S_r(Z)$ denotes the sum of the $r$ largest eigenvalues of $Z$. So, the set $\Gamma$ can be rewritten as
$$\Gamma=\big\{X\in\mathbb{S}^n_+\cap\Omega\ \big|\ \mathrm{tr}(X)-S_r(X)=0\big\}.\tag{1}$$
Such a composite rank constraint set has wide applications in statistics [37], system identification and control [14], finance [38], machine learning [28, 20], and quantum tomography [19]. Moreover, it also frequently arises from the PSD relaxations of combinatorial optimization problems.

∗ School of Mathematics, South China University of Technology, Guangzhou.
† ([email protected]) School of Mathematics, South China University of Technology, Guangzhou.
‡ School of Applied Mathematics, Guangdong University of Technology, Guangzhou.

This paper concerns the calmness of the partial perturbation of $\Gamma$, i.e., the following multifunction
$$\Upsilon(\tau):=\big\{X\in\mathbb{S}^n_+\cap\Omega\ \big|\ \mathrm{tr}(X)-S_r(X)=\tau\big\}\quad\forall\,\tau\in\mathbb{R}.\tag{2}$$
The notion of calmness of a multifunction was first introduced in [46] under the term "pseudo upper-Lipschitz continuity", owing to the fact that it is a combination of Aubin's pseudo-Lipschitz continuity [1] and Robinson's upper-Lipschitz continuity [40]; the term "calmness" was later coined in [41]. A multifunction $\mathcal{M}\colon\mathbb{Y}\rightrightarrows\mathbb{X}$ is called calm at $\overline{y}$ for $\overline{x}\in\mathcal{M}(\overline{y})$ if there exists a constant $\kappa>0$ along with $\varepsilon>0$ and $\delta>0$ such that
$$\mathcal{M}(y)\cap\mathbb{B}(\overline{x},\delta)\subseteq\mathcal{M}(\overline{y})+\kappa\|y-\overline{y}\|\,\mathbb{B}_{\mathbb{X}}\quad\text{for all } y\in\mathbb{B}(\overline{y},\varepsilon).\tag{3}$$
By [10, Exercise 3H.4], the neighborhood restriction on $y$ in (3) can be removed. As observed by Henrion and Outrata [23], the calmness of $\mathcal{M}$ at $\overline{y}$ for $\overline{x}\in\mathcal{M}(\overline{y})$ is equivalent to the (metric) subregularity of its inverse at $\overline{x}$ for $\overline{y}\in\mathcal{M}^{-1}(\overline{x})$. Subregularity was introduced by Ioffe in [24] (under a different name) as a constraint qualification related to equality constraints in nonsmooth optimization problems, and was later generalized in [9] to generalized equations. Recall that a multifunction $\mathcal{F}\colon\mathbb{X}\rightrightarrows\mathbb{Y}$ is called (metrically) subregular at $\overline{x}$ for $\overline{y}\in\mathcal{F}(\overline{x})$ if there exist a constant $\nu>0$ along with $\varepsilon>0$ such that
$$\mathrm{dist}(x,\mathcal{F}^{-1}(\overline{y}))\le\nu\,\mathrm{dist}(\overline{y},\mathcal{F}(x))\quad\text{for all } x\in\mathbb{B}(\overline{x},\varepsilon).\tag{4}$$
Calmness and subregularity have already been studied by many authors under various names (see, e.g., [22, 23, 25, 15, 10, 48] and the references therein).

The notion of subregularity is closely related to the local error bound of a function associated with inequality systems. Consider a lower semicontinuous (lsc) function $h\colon\mathbb{X}\to(-\infty,\infty]$ and the set $S_h=\{x\in\mathbb{X}\mid h(x)\le 0\}$. The function $h$ is said to have a local error bound at $\overline{x}\in S_h$ with $h(\overline{x})=0$ if there exist a constant $\tau>0$ along with $\delta>0$ such that
$$\tau\,\mathrm{dist}(x,S_h)\le\max(0,h(x))\quad\text{for all } x\in\mathbb{B}(\overline{x},\delta).\tag{5}$$
By comparing with (4), the subregularity of $\mathcal{F}$ at $\overline{x}$ for $\overline{y}\in\mathcal{F}(\overline{x})$ is equivalent to the local error bound of $h(\cdot)=\mathrm{dist}(\overline{y},\mathcal{F}(\cdot))$ at $\overline{x}\in\mathcal{F}^{-1}(\overline{y})$. There is a host of literature dealing with the error bounds of general functions (see, e.g., [34, 31, 13, 27, 44, 49]).

Most of the works mentioned above are concerned with the calmness (or subregularity) of a general multifunction or the error bound of a general lsc function.
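The equivalence underlying the reformulation (1), namely that for $Z\in\mathbb{S}^n_+$ one has $\mathrm{rank}(Z)\le r$ exactly when $\mathrm{tr}(Z)-S_r(Z)=0$, is easy to check numerically. The sketch below (Python/NumPy, purely illustrative and not part of the paper's development) evaluates this residual on a rank-$r$ and on a full-rank PSD matrix:

```python
import numpy as np

def psi(Z, r):
    """tr(Z) - S_r(Z), where S_r(Z) is the sum of the r largest eigenvalues."""
    lam = np.linalg.eigvalsh(Z)        # eigenvalues in ascending order
    return lam.sum() - lam[-r:].sum()

rng = np.random.default_rng(0)
n, r = 6, 2

B = rng.standard_normal((n, r))
Z_low = B @ B.T                        # PSD with rank(Z_low) = r: residual vanishes
assert abs(psi(Z_low, r)) < 1e-8

C = rng.standard_normal((n, n))
Z_full = C @ C.T                       # PSD with full rank: residual is positive
assert psi(Z_full, r) > 1e-6
```

For an indefinite $Z$ the residual can vanish even though $\mathrm{rank}(Z)>r$ (positive and negative tail eigenvalues may cancel), which is why the equivalence is stated only on $\mathbb{S}^n_+$.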
In this paper, we pay our attention to the calmness of a specific multifunction, namely the mapping $\Upsilon\colon\mathbb{R}\rightrightarrows\mathbb{S}^n$ in (2), inspired by the fact that it is significant for characterizing the first-order optimality conditions of rank constrained and regularized problems as well as for establishing the global exact penalty for their equivalent reformulations. In particular, the latter opens a door to designing effective convex relaxation approaches to these difficult matrix optimization problems (see [4, 39]). Although there are some papers studying error bounds for semidefinite programs (see [26, 45, 47, 7]), to the best of our knowledge, few works focus on error bounds for PSD rank constrained problems except [3], where the local upper Lipschitz property of $\Upsilon$ was achieved for three special $\Omega$ via technical constructions.

The contribution of this work is to present several groups of criteria to identify the calmness of the mapping $\Upsilon$ and to provide a collection of examples demonstrating that they can be satisfied for some common PSD composite rank constraint sets. The criterion arising from the coderivative is shown to be the most convenient for use, though it is the strongest (see Section 3.3). As will be discussed at the end of Section 3.2.2, some criteria are easily obtained by applying the error bound results in [34, 31, 13, 27, 44] to
$$\psi(X):=\mathrm{tr}(X)-S_r(X)\quad\forall X\in\mathbb{S}^n,\tag{6}$$
but the DC (difference of convex) property of $\psi$ makes them inconvenient to use. This difficulty is well overcome in Section 3.2 by leveraging the equivalent characterizations of the calmness of $\Upsilon$. In Section 4, the calmness of $\Upsilon$ is used to establish the global exact penalty for the equivalent reformulations of rank constrained and regularized optimization problems, and the global exact penalty result in [33] is recovered as a byproduct.

Notation:
Throughout this paper, we write
$$\Xi:=\mathbb{S}^n_+\cap\Omega,\qquad \mathcal{R}:=\big\{X\in\mathbb{S}^n\mid\mathrm{rank}(X)\le r\big\}\qquad\text{and}\qquad \mathcal{R}_+:=\big\{X\in\mathbb{S}^n_+\mid\mathrm{rank}(X)\le r\big\}.$$
The notation $\mathbb{O}^n$ represents the set of all $n\times n$ matrices with orthonormal columns, and $I$ and $e$ denote an identity matrix and a vector of all ones, respectively, whose dimensions are known from the context. For a given $X\in\mathbb{S}^n$, $\mathrm{dist}(X,\Delta)$ means the distance from $X$ to a closed set $\Delta\subseteq\mathbb{S}^n$ in terms of the Frobenius norm, $\lambda(X)=(\lambda_1(X),\ldots,\lambda_n(X))^T$ and $\sigma(X)=(\sigma_1(X),\ldots,\sigma_n(X))^T$ denote the eigenvalue and singular value vectors of $X$ arranged in nonincreasing order, and $\mathbb{O}^n(X):=\{U\in\mathbb{O}^n\mid X=U\,\mathrm{Diag}(\lambda(X))U^T\}$. For a closed set $C$, $\Pi_C$ denotes the projection mapping onto $C$, which may be multi-valued when $C$ is nonconvex, and $\delta_C$ represents the indicator function of $C$, i.e., $\delta_C(x)=0$ if $x\in C$ and $\infty$ otherwise. The notations $\mathbb{B}$ and $\mathbb{S}$ denote the unit ball and sphere in a suitable space, and $\mathbb{B}(x,\delta)$ denotes the closed ball of radius $\delta>0$ centered at $x$. For an extended real-valued $h\colon\mathbb{X}\to(-\infty,\infty]$, the notation $x'\xrightarrow{h}x$ means $x'\to x$ with $h(x')\to h(x)$.

Let $h\colon\mathbb{X}\to(-\infty,\infty]$ be an extended real-valued function and consider any point $x\in\mathbb{X}$ with $h(x)$ finite. The local slope of the function $h$ at $x$ is defined as
$$|\nabla h|(x):=\limsup_{z\to x,\ z\ne x}\frac{[h(x)-h(z)]_+}{\|z-x\|},$$
while its subderivative $dh(x)\colon\mathbb{X}\to[-\infty,\infty]$ at $x$ is defined as
$$dh(x)(w):=\liminf_{\tau\downarrow 0,\ w'\to w}\frac{h(x+\tau w')-h(x)}{\tau}\quad\forall w\in\mathbb{X}.$$
By invoking [41, Proposition 8.32], the following relation holds between $|\nabla h|(x)$ and $dh(x)$.

Lemma 2.1
Let $h\colon\mathbb{X}\to(-\infty,\infty]$ be an extended real-valued function and consider a point $x\in\mathbb{X}$ with $h(x)$ finite. Then, it holds that
$$|\nabla h|(x)=-\min_{\|w\|\le 1}dh(x)(w)=\max\Big(0,\,-\min_{\|w\|=1}dh(x)(w)\Big).$$

2.1 Generalized subdifferentials and tangent cones

Definition 2.1 (see [41, 36]) Consider a function $h\colon\mathbb{X}\to(-\infty,\infty]$ and a point $x\in\mathbb{X}$ with $h(x)$ finite. The regular subdifferential of $h$ at $x$ is defined as
$$\widehat{\partial}h(x):=\Big\{v\in\mathbb{X}\ \Big|\ \liminf_{x'\to x,\ x'\ne x}\frac{h(x')-h(x)-\langle v,x'-x\rangle}{\|x'-x\|}\ge 0\Big\};$$
and the subdifferential (also known as the limiting subdifferential) of $h$ at $x$ is defined as
$$\partial h(x):=\Big\{v\in\mathbb{X}\ \Big|\ \exists\,x^k\xrightarrow{h}x \text{ and } v^k\in\widehat{\partial}h(x^k) \text{ with } v^k\to v\Big\}.$$
When $h$ is convex, its (regular) subdifferential reduces to the one in the sense of convex analysis. The following lemma provides the characterization of the subdifferential of the function $S_r$, which is immediate by combining [43, Lemma 2.2] with [29, Theorem 6].

Lemma 2.2
Consider a point $X\in\mathbb{S}^n_+$ with $\mathrm{rank}(X)>r$. Let $X$ have the eigenvalue decomposition $X=P\,\mathrm{Diag}(\lambda(X))P^T$ with $P\in\mathbb{O}^n(X)$, and define the index sets
$$\alpha:=\{i\mid\lambda_i(X)>\lambda_r(X)\},\quad\beta:=\{i\mid\lambda_i(X)=\lambda_r(X)\},\quad\gamma:=\{i\mid\lambda_i(X)<\lambda_r(X)\}.\tag{7}$$
Then, $\partial S_r(X)=\big\{[P_\alpha\ \ P_\beta Q\ \ P_\gamma]\,\mathrm{Diag}(v)\,[P_\alpha\ \ P_\beta Q\ \ P_\gamma]^T\ \big|\ v\in\partial s_r(\lambda(X)),\ Q\in\mathbb{O}^{|\beta|}\big\}$ with
$$\partial s_r(\lambda(X))=\big\{v\in\mathbb{R}^n\ \big|\ v_\alpha=e;\ v_\gamma=0;\ e^Tv_\beta=r-|\alpha| \text{ with } v_\beta\in[0,e]\big\}.$$
Consequently, for any $H\in\mathbb{S}^n$, $S_r'(X;H)=\langle P_\alpha P_\alpha^T,H\rangle+S_{r-|\alpha|}(P_\beta^THP_\beta)$.

When $h$ is the indicator function of a set $S\subset\mathbb{X}$, its regular subdifferential at $x\in S$ is the regular normal cone $\widehat{N}_S(x)$ to $S$ at $x$, and its subdifferential at $x\in S$ is the normal cone $N_S(x)$ to $S$ at $x$. Next we recall from [41, Chapter 6] the tangent cone to a set.

Definition 2.2
For a given set $S\subset\mathbb{X}$, the tangent cone to $S$ at $x\in S$ is defined as
$$\mathcal{T}_S(x):=\big\{d\in\mathbb{X}\ \big|\ \exists\,t_k\downarrow 0,\ d^k\to d \text{ such that } x+t_kd^k\in S \text{ for each } k\big\},$$
and the regular (or Clarke) tangent cone to $S$ at $x\in S$ is defined as
$$\widehat{\mathcal{T}}_S(x):=\Big\{d\in\mathbb{X}\ \Big|\ \forall\,t_k\downarrow 0,\ \forall\,x^k\xrightarrow{S}x,\ \exists\,\widetilde{x}^k\xrightarrow{S}x \text{ with } \frac{\widetilde{x}^k-x^k}{t_k}\to d\Big\}.$$
The following lemma characterizes the tangent and normal cones to the set $\mathbb{S}^n_+$.

Lemma 2.3 (see [2, Example 2.65]) Fix any $X\in\mathbb{S}^n_+$ with $\mathrm{rank}(X)=k$. Let $X$ have the spectral decomposition $P\,\mathrm{Diag}(\lambda(X))P^T$, and let $P_1\in\mathbb{R}^{n\times k}$ and $P_2\in\mathbb{R}^{n\times(n-k)}$ be the matrices consisting of the first $k$ columns and the last $n-k$ columns of $P$, respectively. Then,
$$\mathcal{T}_{\mathbb{S}^n_+}(X)=\big\{H\in\mathbb{S}^n\ \big|\ P_2^THP_2\succeq 0\big\}\quad\text{and}\quad N_{\mathbb{S}^n_+}(X)=\big\{W\in\mathbb{S}^n_-\ \big|\ P_1^TWP_1=0\big\}.$$

2.2 Normal and tangent cones to $\mathcal{R}$ and $\mathcal{R}_+$

The tangent cone to $\mathcal{R}$ was derived in [42, Theorem 3.2], and the normal cone to $\mathcal{R}$ was obtained in [30, Proposition 3.6]. Here, we provide a different proof for the latter.

Proposition 2.1
Consider any $X\in\mathcal{R}$ with the spectral decomposition $P\,\mathrm{Diag}(\lambda(X))P^T$. Define the index sets
$$\alpha:=\{i\mid\lambda_i(X)>0\},\quad\beta:=\{i\mid\lambda_i(X)=0\},\quad\gamma:=\{i\mid\lambda_i(X)<0\}.$$
(i) If $\mathrm{rank}(X)=r$, then $\widehat{N}_{\mathcal{R}}(X)=N_{\mathcal{R}}(X)=\{P_\beta HP_\beta^T\mid H\in\mathbb{S}^{|\beta|}\}$; if $\mathrm{rank}(X)<r$, then
$$\widehat{N}_{\mathcal{R}}(X)=\{0\}\subset N_{\mathcal{R}}(X)=\big\{H\in\mathbb{S}^n\mid\mathrm{rank}(H)\le n-r\big\}\cap\big\{P_\beta HP_\beta^T\mid H\in\mathbb{S}^{|\beta|}\big\}.$$
(ii) If $\mathrm{rank}(X)=r$, then $\widehat{\mathcal{T}}_{\mathcal{R}}(X)=\mathcal{T}_{\mathcal{R}}(X)=\{H\in\mathbb{S}^n\mid P_\beta^THP_\beta=0\}$; if $\mathrm{rank}(X)=s<r$, then
$$\mathcal{T}_{\mathcal{R}}(X)=\{H\in\mathbb{S}^n\mid P_\beta^THP_\beta=0\}+\big\{P_\beta WP_\beta^T\mid\mathrm{rank}(P_\beta WP_\beta^T)\le r-s,\ W\in\mathbb{S}^{|\beta|}\big\},$$
which is contained in $\big\{H\in\mathbb{S}^n\mid\|P_\beta^THP_\beta\|_*-\|P_\beta^THP_\beta\|_{(r-s)}\le 0\big\}$.

Proof:
Let $\mu_1>\mu_2>\cdots>\mu_l$ be the distinct eigenvalues of $X$. Define the index sets $a_k:=\{i\mid\lambda_i(X)=\mu_k\}$ for $k=1,2,\ldots,l$. For any $U\in\mathbb{O}^n(X)$, since $U^TP\,\mathrm{Diag}(\lambda(X))P^TU=\mathrm{Diag}(\lambda(X))$, by [8, Proposition 2.4],
$$P^TU=\mathrm{BlkDiag}(Q_1,\ldots,Q_l)\quad\text{with } Q_i\in\mathbb{O}^{|a_i|}\ \text{for } i=1,2,\ldots,l.\tag{8}$$
Notice that $\delta_{\mathcal{R}}(Z)=\delta_{\mathcal{R}_0}(\lambda(Z))$, where $\mathcal{R}_0:=\{z\in\mathbb{R}^n\mid\|z\|_0\le r\}$. Since the indicator function $\delta_{\mathcal{R}_0}$ of the set $\mathcal{R}_0$ is symmetric, by invoking [29, Theorem 6] we have
$$\widehat{\partial}\delta_{\mathcal{R}}(X)=\big\{U\,\mathrm{Diag}(\xi)U^T\mid\xi\in\widehat{\partial}\delta_{\mathcal{R}_0}(\lambda(X)),\ X=U\,\mathrm{Diag}(\lambda(X))U^T,\ U\in\mathbb{O}^n(X)\big\},$$
$$\partial\delta_{\mathcal{R}}(X)=\big\{U\,\mathrm{Diag}(\xi)U^T\mid\xi\in\partial\delta_{\mathcal{R}_0}(\lambda(X)),\ X=U\,\mathrm{Diag}(\lambda(X))U^T,\ U\in\mathbb{O}^n(X)\big\}.$$
By combining the two equalities with (8) and [6, Proposition 3.8 & Theorem 3.9], it is easy to obtain part (i). The first part of (ii) follows from the first part of (i) and [41, Corollary 6.29], and the equality in the second part of (ii) is due to [42, Theorem 3.2]. For the last part, let $f(Z):=f_1(Z)-f_2(Z)$, where $f_1(Z)=\|Z\|_*$ and $f_2(Z)=\|Z\|_{(r)}$ for $Z\in\mathbb{S}^n$ denote the nuclear norm and the Ky Fan $r$-norm of $Z$, respectively. Since $\mathcal{R}=\{Z\in\mathbb{S}^n\mid f(Z)\le 0\}$ and $f(X)=0$, by [41, Proposition 10.3 & Corollary 10.9],
$$\mathcal{T}_{\mathcal{R}}(X)\subseteq\big\{H\in\mathbb{S}^n\mid df(X)(H)\le 0\big\}\subseteq\big\{H\in\mathbb{S}^n\mid df_1(X)(H)+d(-f_2)(X)(H)\le 0\big\}.\tag{9}$$
Fix any $H\in\mathbb{S}^n$. Since $f_1$ and $f_2$ are globally Lipschitz continuous and directionally differentiable, $df_1(X)(H)=f_1'(X;H)$ and $d(-f_2)(X)(H)=-f_2'(X;H)$. In addition, by combining [43, Section 2] with [29, Theorem 6], it is not hard to obtain that
$$\partial\|X\|_*=\big\{P_\alpha P_\alpha^T-P_\gamma P_\gamma^T+P_\beta HP_\beta^T\mid\|H\|\le 1,\ H\in\mathbb{S}^{|\beta|}\big\},$$
$$\partial\|X\|_{(r)}=\big\{P_\alpha P_\alpha^T-P_\gamma P_\gamma^T+P_\beta(G-H)P_\beta^T\mid\langle I,G+H\rangle=r-s,\ 0\preceq G,H\preceq I\big\}.$$
This means that $f_1'(X;H)-f_2'(X;H)=\|P_\beta^THP_\beta\|_*-\|P_\beta^THP_\beta\|_{(r-s)}$. Together with the inclusion in (9), we obtain the last part of (ii).
$\Box$

Corollary 2.1 Consider a point $X\in\mathcal{R}_+$. If $\mathrm{rank}(X)=r$, then
$$\widehat{N}_{\mathcal{R}_+}(X)=N_{\mathcal{R}_+}(X)=N_{\mathbb{S}^n_+}(X)+N_{\mathcal{R}}(X)\quad\text{and}\quad\widehat{\mathcal{T}}_{\mathcal{R}_+}(X)=\mathcal{T}_{\mathcal{R}_+}(X)=\mathcal{T}_{\mathcal{R}}(X)\cap\mathcal{T}_{\mathbb{S}^n_+}(X);$$
if $\mathrm{rank}(X)<r$, then
$$N_{\mathbb{S}^n_+}(X)\subseteq\widehat{N}_{\mathcal{R}_+}(X)\subseteq N_{\mathcal{R}_+}(X)\subseteq N_{\mathbb{S}^n_+}(X)+N_{\mathcal{R}}(X),\tag{10a}$$
$$\mathcal{T}_{\mathbb{S}^n_+}(X)\cap\widehat{\mathcal{T}}_{\mathcal{R}}(X)\subseteq\widehat{\mathcal{T}}_{\mathcal{R}_+}(X)\subseteq\mathcal{T}_{\mathcal{R}_+}(X)\subseteq\mathcal{T}_{\mathbb{S}^n_+}(X)\cap\mathcal{T}_{\mathcal{R}}(X).\tag{10b}$$

Proof:
Notice that for any $Y\in\mathbb{S}^n_+$, $\mathrm{dist}(Y,\mathcal{R}_+)=\mathrm{dist}(Y,\mathcal{R})$. Hence, it holds that
$$\mathrm{dist}(Z,\mathcal{R}_+)\le\|Z-\Pi_{\mathbb{S}^n_+}(Z)\|_F+\mathrm{dist}(\Pi_{\mathbb{S}^n_+}(Z),\mathcal{R})\le 2\|Z-\Pi_{\mathbb{S}^n_+}(Z)\|_F+\mathrm{dist}(Z,\mathcal{R})=2\,\mathrm{dist}(Z,\mathbb{S}^n_+)+\mathrm{dist}(Z,\mathcal{R})\quad\forall Z\in\mathbb{S}^n.\tag{11}$$
This shows that the set $\mathcal{R}_+=\mathbb{S}^n_+\cap\mathcal{R}$ satisfies the metric qualification at any $Z\in\mathbb{S}^n$. From [25, Section 3.1] and [41, Corollary 10.9], it follows that
$$N_{\mathbb{S}^n_+}(X)+\widehat{N}_{\mathcal{R}}(X)\subseteq\widehat{N}_{\mathcal{R}_+}(X)\subseteq N_{\mathcal{R}_+}(X)\subseteq N_{\mathbb{S}^n_+}(X)+N_{\mathcal{R}}(X).\tag{12}$$
When $\mathrm{rank}(X)=r$, since $\widehat{N}_{\mathcal{R}}(X)=N_{\mathcal{R}}(X)$, the last inclusions become the equalities on the normal cones, and the equalities on the tangent cones to $\mathcal{R}_+$ then follow. When $\mathrm{rank}(X)<r$, since $\widehat{N}_{\mathcal{R}}(X)=\{0\}$ we have $N_{\mathbb{S}^n_+}(X)+\widehat{N}_{\mathcal{R}}(X)=N_{\mathbb{S}^n_+}(X)$, which along with (12) implies the inclusions in (10a). The inclusions in (10b) are immediate by using (10a) and [41, Theorem 6.28]. $\Box$

Lemma 2.4
Fix any $G\in\mathbb{S}^n$ with the SVD given by $G=U\,\mathrm{Diag}(\sigma(G))V^T$. Then
$$U_1\Sigma_r(G)V_1^T\in\mathop{\arg\min}_{Z\in\mathcal{R}}\|Z-G\|_*,$$
where $U_1$ and $V_1$ are the matrices consisting of the first $r$ columns of $U$ and $V$, respectively, and $\Sigma_r(G)=\mathrm{Diag}(\sigma_1(G),\ldots,\sigma_r(G))$ with $\sigma_1(G)\ge\sigma_2(G)\ge\cdots\ge\sigma_r(G)$.

Proof:
By Mirsky's theorem (see [50, Theorem 4.11]), $\|Z-G\|_*\ge\|\sigma(Z)-\sigma(G)\|_1$ for any $Z\in\mathbb{S}^n$. By this, it is easy to argue that if $Z^*$ is an optimal solution to $\min_{Z\in\mathcal{R}}\|Z-G\|_*$, then $\sigma(Z^*)$ is optimal to $\min_{\|z\|_0\le r}\|z-\sigma(G)\|_1$. Conversely, if $z^*$ is an optimal solution of $\min_{\|z\|_0\le r}\|z-\sigma(G)\|_1$, then $U\,\mathrm{Diag}(|z^*|^{\downarrow})V^T$ is optimal to $\min_{Z\in\mathcal{R}}\|Z-G\|_*$. Clearly, $\mathrm{diag}(\Sigma_r(G))$ is an optimal solution of $\min_{\|z\|_0\le r}\|z-\sigma(G)\|_1$. Then, the result holds. $\Box$

3 Calmness of the mapping $\Upsilon$

In order to achieve some criteria to identify the calmness of the mapping
$\Upsilon\colon\mathbb{R}\rightrightarrows\mathbb{S}^n$ at $0$ for $\overline{X}\in\Upsilon(0)=\Gamma$, we first present some characterizations of the calmness of $\Upsilon$.

3.1 Equivalent characterizations

The following theorem states some equivalent characterizations of the calmness of $\Upsilon$.

Theorem 3.1
Consider any $\overline{X}\in\Gamma$. The mapping $\Upsilon$ is calm at $0$ for $\overline{X}$ if and only if one of the following equivalent conditions holds:

(i) there exists a constant $\gamma>0$ along with $\delta>0$ such that for all $X\in\mathbb{B}(\overline{X},\delta)$,
$$\mathrm{dist}(X,\Upsilon(0))\le\gamma\Big[\mathrm{dist}(X,\Xi)+\textstyle\sum_{i=r+1}^n\sigma_i(X)\Big];\tag{13}$$

(ii) there exist $\gamma>0$ and $\delta>0$ such that for any $X\in\mathbb{B}(\overline{X},\delta)\cap\Xi$ with $0<\psi(X)<\delta$,
$$\sup_{Z\in\Xi\setminus\{X\}}\frac{[\mathrm{tr}(X-Z)-S_r(X)+S_r(Z)]_+}{\|Z-X\|_F}\ge\gamma;$$

(iii) the mapping $\mathcal{G}(X,Y):=\begin{cases}\{X-Y\}&\text{if }(X,Y)\in\Xi\times\mathcal{R},\\ \emptyset&\text{otherwise}\end{cases}$ is subregular at $(\overline{X},\overline{X})$ for the origin;

(iv) there exist $\gamma>0$ and $\delta>0$ such that for any $(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\delta)\cap[\Xi\times\mathcal{R}]$ with $0<\|X-Y\|_F<\delta$,
$$\sup_{\substack{(G,H)\in\Xi\times\mathcal{R}\\ (G,H)\ne(X,Y)}}\frac{[\|X-Y\|_F-\|G-H\|_F]_+}{\|(G-X,H-Y)\|_F}\ge\gamma.$$

Proof: (i)
By the definition, the calmness of the mapping $\Upsilon$ at $0$ for $\overline{X}$ is equivalent to the existence of $\gamma'>0$ and $\delta'>0$ such that for all $Z\in\mathbb{B}(\overline{X},\delta')$,
$$\mathrm{dist}(Z,\Upsilon(0))\le\gamma'\,\mathrm{dist}(0,\Upsilon^{-1}(Z))=\begin{cases}\gamma'(\mathrm{tr}(Z)-S_r(Z))&\text{if }Z\in\Xi;\\ \infty&\text{otherwise}.\end{cases}\tag{14}$$
If there exist $\gamma>0$ and $\delta>0$ such that inequality (13) holds for all $X\in\mathbb{B}(\overline{X},\delta)$, then inequality (14) obviously holds, and the mapping $\Upsilon$ is calm at $0$ for $\overline{X}$. Now assume that $\Upsilon$ is calm at $0$ for $\overline{X}$, i.e., there exist $\gamma'>0$ and $\delta'>0$ such that inequality (14) holds for all $X\in\mathbb{B}(\overline{X},\delta')$. We will show that inequality (13) holds with $\delta=\delta'/2$ and $\gamma=1+\sqrt{n}\gamma'$. Pick any $X\in\mathbb{B}(\overline{X},\delta)$. If $X\in\Xi$, then from (14) it follows that inequality (13) holds with $\gamma=\gamma'$. If $X\notin\Xi$, pick any $X^*\in\Pi_{\Xi}(X)$. By noting that $\|X^*-\overline{X}\|_F\le\|X^*-X\|_F+\|X-\overline{X}\|_F\le 2\|X-\overline{X}\|_F\le\delta'$ and using (14), we have
$$\begin{aligned}
\mathrm{dist}(X,\Upsilon(0))&\le\|X-X^*\|_F+\mathrm{dist}(X^*,\Upsilon(0))\le\mathrm{dist}(X,\Xi)+\gamma'\textstyle\sum_{i=r+1}^n\sigma_i(X^*)\\
&=\mathrm{dist}(X,\Xi)+\gamma'\min_{Z\in\mathcal{R}}\|Z-X^*\|_*\\
&\le\mathrm{dist}(X,\Xi)+\gamma'\|X-X^*\|_*+\gamma'\min_{Z\in\mathcal{R}}\|Z-X\|_*\\
&=\mathrm{dist}(X,\Xi)+\gamma'\|X-X^*\|_*+\gamma'\textstyle\sum_{i=r+1}^n\sigma_i(X)\\
&\le(1+\sqrt{n}\gamma')\,\mathrm{dist}(X,\Xi)+\gamma'\textstyle\sum_{i=r+1}^n\sigma_i(X),
\end{aligned}$$
where the second equality uses Lemma 2.4 and the last inequality uses $\|X-X^*\|_*\le\sqrt{n}\|X-X^*\|_F=\sqrt{n}\,\mathrm{dist}(X,\Xi)$. Hence (13) holds with $\gamma=1+\sqrt{n}\gamma'$.

(ii) Let $h(Z)=\psi(Z)+\delta_{\Xi}(Z)$ for $Z\in\mathbb{S}^n$. From (14), the calmness of $\Upsilon$ at $0$ for $\overline{X}$ is equivalent to the local error bound property of $h$ at $\overline{X}$. By [27, Corollary 2.4 (a)&(i)] and the nonnegativity of $h$, part (ii) is equivalent to the calmness of $\Upsilon$ at $0$ for $\overline{X}$.

(iii) It suffices to argue that $\mathcal{G}$ is subregular at $(\overline{X},\overline{X})$ for the origin iff part (i) holds.

"$\Longrightarrow$": Since $\mathcal{G}$ is subregular at $(\overline{X},\overline{X})$ for the origin, there exist $\kappa>0$ and $\varepsilon>0$ such that
$$\mathrm{dist}((X,Y),\mathcal{G}^{-1}(0))\le\kappa\,\mathrm{dist}(0,\mathcal{G}(X,Y))\quad\text{for all }(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)\cap[\Xi\times\mathcal{R}].\tag{15}$$
Pick any $X\in\mathbb{B}(\overline{X},\varepsilon/2)$. Obviously, $(X,X)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)$. If $X\in\Xi\cap\mathcal{R}$, from (15) and $\mathcal{G}^{-1}(0)\subseteq\Upsilon(0)\times\Upsilon(0)$, it follows that
$$\mathrm{dist}(X,\Upsilon(0))\le\mathrm{dist}((X,X),\mathcal{G}^{-1}(0))\le\kappa\,\mathrm{dist}(0,\mathcal{G}(X,X))=0.$$
If $X\in\Xi\setminus\mathcal{R}$, for any $X^*\in\Pi_{\mathcal{R}}(X)$ we have $(X,X^*)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)\cap[\Xi\times\mathcal{R}]$ by noting that $\|X^*-X\|_F\le\|\overline{X}-X\|_F\le\varepsilon/2$, and from (15) it follows that
$$\mathrm{dist}(X,\Upsilon(0))\le\mathrm{dist}((X,X),\mathcal{G}^{-1}(0))\le\mathrm{dist}((X,X^*),\mathcal{G}^{-1}(0))+\|X^*-X\|_F\le\kappa\,\mathrm{dist}(0,\mathcal{G}(X,X^*))+\mathrm{dist}(X,\mathcal{R})=(1+\kappa)\,\mathrm{dist}(X,\mathcal{R}).$$
If $X\in\mathcal{R}\setminus\Xi$, similar arguments yield $\mathrm{dist}(X,\Upsilon(0))\le(1+\kappa)\,\mathrm{dist}(X,\Xi)$. Finally, consider the case $X\notin\Xi\cup\mathcal{R}$. Pick any $X^*\in\Pi_{\Xi}(X)$ and any $Y^*\in\Pi_{\mathcal{R}}(X)$. Since $\|X^*-X\|_F\le\varepsilon/2$ and $\|Y^*-X\|_F\le\varepsilon/2$, we have $(X^*,Y^*)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)\cap[\Xi\times\mathcal{R}]$, which together with (15) and $\mathcal{G}^{-1}(0)\subseteq\Upsilon(0)\times\Upsilon(0)$ implies that
$$\mathrm{dist}(X,\Upsilon(0))\le\mathrm{dist}((X,X),\mathcal{G}^{-1}(0))\le\mathrm{dist}((X^*,Y^*),\mathcal{G}^{-1}(0))+\|(X^*,Y^*)-(X,X)\|_F\le\kappa\,\mathrm{dist}(0,\mathcal{G}(X^*,Y^*))+\mathrm{dist}(X,\Xi)+\mathrm{dist}(X,\mathcal{R})\le(1+\kappa)\big[\mathrm{dist}(X,\Xi)+\mathrm{dist}(X,\mathcal{R})\big].$$
From the arguments for the four cases and $\mathrm{dist}(X,\mathcal{R})\le\sum_{i=r+1}^n\sigma_i(X)$, part (i) holds.

"$\Longleftarrow$": Since part (i) holds, there exist $\gamma>0$ and $\delta>0$ such that inequality (13) holds for all $Z\in\mathbb{B}(\overline{X},\delta)$. Fix any $(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\delta)\cap[\Xi\times\mathcal{R}]$. Since $X\in\mathbb{B}(\overline{X},\delta)\cap\Xi$ and $Y\in\mathbb{B}(\overline{X},\delta)\cap\mathcal{R}$, from inequality (13) it follows that
$$\mathrm{dist}(X,\Upsilon(0))\le\gamma\sqrt{n-r}\,\mathrm{dist}(X,\mathcal{R})\quad\text{and}\quad\mathrm{dist}(Y,\Upsilon(0))\le\gamma\,\mathrm{dist}(Y,\Xi).$$
Notice that $\mathrm{dist}((X,Y),\mathcal{G}^{-1}(0))\le\mathrm{dist}(X,\Upsilon(0))+\mathrm{dist}(Y,\Upsilon(0))$. Then, it holds that
$$\mathrm{dist}((X,Y),\mathcal{G}^{-1}(0))\le\gamma\sqrt{n-r}\,\big[\mathrm{dist}(Y,\Xi)+\mathrm{dist}(X,\mathcal{R})\big]\le 2\gamma\sqrt{n-r}\,\|X-Y\|_F=2\gamma\sqrt{n-r}\,\mathrm{dist}(0,\mathcal{G}(X,Y)).$$
This shows that the mapping $\mathcal{G}$ is metrically subregular at $(\overline{X},\overline{X})$ for the origin.

(iv) Let $h(X,Y)=\|X-Y\|_F+\delta_{\Xi\times\mathcal{R}}(X,Y)$ for $X,Y\in\mathbb{S}^n$. The subregularity of $\mathcal{G}$ at $(\overline{X},\overline{X})$ for the origin is equivalent to the local error bound property of $h$ at $(\overline{X},\overline{X})$. By [27, Corollary 2.4 (a)&(i)] and the nonnegativity of $h$, (iv) is equivalent to (iii). $\Box$
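As a quick numerical sanity check of part (i), specialize to $\Omega=\mathbb{S}^n$, so that $\Xi=\mathbb{S}^n_+$ and $\Upsilon(0)=\mathcal{R}_+$. Combining inequality (11) with $\mathrm{dist}(X,\mathcal{R})\le\sum_{i>r}\sigma_i(X)$ shows that (13) then holds globally with $\gamma=2$. The script below (Python/NumPy, an illustration rather than part of the analysis) verifies this bound on random symmetric matrices, using closed-form eigenvalue expressions for the distances:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 6, 2

for _ in range(200):
    A = rng.standard_normal((n, n))
    X = (A + A.T) / 2
    lam = np.linalg.eigvalsh(X)[::-1]              # nonincreasing eigenvalues
    # dist(X, S^n_+): zero out the negative eigenvalues
    d_psd = np.linalg.norm(np.minimum(lam, 0.0))
    # sum_{i>r} sigma_i(X): singular values of a symmetric X are |eigenvalues|
    tail = np.sort(np.abs(lam))[::-1][r:].sum()
    # dist(X, R_+): keep the r largest eigenvalues clipped at zero, drop the rest
    d_plus = np.sqrt((np.minimum(lam[:r], 0.0) ** 2).sum() + (lam[r:] ** 2).sum())
    assert d_plus <= 2.0 * (d_psd + tail) + 1e-10  # (13) with gamma = 2
```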
Γ = Ξ ∩ R . Next weprovide a group of characterizations by
Γ = Ω ∩ R + under a mild restriction on Ξ .8 heorem 3.2 Consider any X ∈ Γ . The mapping Υ is calm at for X under one ofthe following equivalent conditions:(i) there exists a constant ν > along with ε > such that for all X ∈ B ( X, ε ) , dist( X, Υ(0)) ≤ ν (cid:2) dist( X, Ω) + dist( X, R + ) (cid:3) ; (16) (ii) there exist ν, ε > such that for any X ∈ B ( X, ε ) ∩ Ω with < dist( X, R + ) < ε , sup Z ∈ Ω \{ X } [dist( X, R + ) − dist( Z, R + )] + k Z − X k F ≥ ν ; (iii) H ( X, Y ) := (cid:26) { X − Y } if ( X, Y ) ∈ Ω ×R + ∅ otherwise is subregular at ( X, X ) for the origin;(iv) there exist ν > and ε > such that for any ( X, Y ) ∈ B (( X, X ) , ε ) ∩ [Ω × R + ] with < k X − Y k F < ε , sup ( G,H ) ∈ Ω ×R +( G,H ) =( X,Y ) [ k X − Y k F − k G − H k F ] + k ( G − X, H − Y ) k F ≥ ν. If in addition there exist a constant κ ′ > along with ǫ ′ > such that for all X ∈ B ( X, ǫ ′ )dist( X, Ξ) ≤ κ ′ (cid:2) dist( X, Ω) + dist( X, S n + ) (cid:3) , (17) then each condition in (i)-(iv) is equivalent to the calmness of Υ at for X . Proof: (i)
Pick any $X\in\mathbb{B}(\overline{X},\varepsilon)$. By combining inequality (16) with (11), we have
$$\mathrm{dist}(X,\Upsilon(0))\le 2\nu\Big[\mathrm{dist}(X,\Xi)+\textstyle\sum_{i=r+1}^n\sigma_i(X)\Big].$$
So, part (i) of Theorem 3.1 holds with $\delta=\varepsilon$ and $\gamma=2\nu$, and $\Upsilon$ is calm at $0$ for $\overline{X}$.

(ii) By following similar arguments to those for Theorem 3.1 (i), one may show that part (i) is equivalent to the existence of $\nu'>0$ and $\varepsilon'>0$ such that for all $Z\in\mathbb{B}(\overline{X},\varepsilon')$,
$$\mathrm{dist}(Z,\Upsilon(0))\le\begin{cases}\nu'\,\mathrm{dist}(Z,\mathcal{R}_+)&\text{if }Z\in\Omega;\\ \infty&\text{otherwise}.\end{cases}\tag{18}$$
This is equivalent to the local error bound of the function $\mathbb{S}^n\ni Z\mapsto\mathrm{dist}(Z,\mathcal{R}_+)+\delta_{\Omega}(Z)$ at $\overline{X}$. From [27, Corollary 2.4 (a)&(i)], part (ii) is equivalent to part (i).

(iii)-(iv) By following similar arguments to those for Theorem 3.1 (iii), it is not hard to show that part (iii) is equivalent to part (i). Define the function
$$f(X,Y):=\|X-Y\|_F+\delta_{\Omega\times\mathcal{R}_+}(X,Y)\quad\forall X,Y\in\mathbb{S}^n.\tag{19}$$
The subregularity of $\mathcal{H}$ at $(\overline{X},\overline{X})$ for the origin is equivalent to the local error bound property of $f$ at $(\overline{X},\overline{X})$. By [27, Corollary 2.4 (a)&(i)], (iv) is equivalent to part (iii).

To achieve the last conclusion, it suffices to argue that the calmness of $\Upsilon$ at $0$ for $\overline{X}$ along with the condition in (17) implies part (i). Indeed, since $\Upsilon$ is calm at $0$ for $\overline{X}$, by Theorem 3.1 (i) there exist $\gamma>0$ and $\delta>0$ such that inequality (13) holds for all $X\in\mathbb{B}(\overline{X},\delta)$. Set $\varepsilon=\min(\delta,\epsilon')$. Pick any $X\in\mathbb{B}(\overline{X},\varepsilon)$. Then, it holds that
$$\begin{aligned}
\mathrm{dist}(X,\Upsilon(0))&\le\gamma\Big[\kappa'\big(\mathrm{dist}(X,\Omega)+\mathrm{dist}(X,\mathbb{S}^n_+)\big)+\textstyle\sum_{i=r+1}^n\sigma_i(X)\Big]\\
&\le\gamma\max(\kappa',\sqrt{n-r})\big[\mathrm{dist}(X,\Omega)+\mathrm{dist}(X,\mathbb{S}^n_+)+\mathrm{dist}(X,\mathcal{R})\big]\\
&\le\gamma\max(\kappa',\sqrt{n-r})\big[\mathrm{dist}(X,\Omega)+2\,\mathrm{dist}(X,\mathbb{S}^n_+\cap\mathcal{R})\big]\\
&\le 2\gamma\max(\kappa',\sqrt{n-r})\big[\mathrm{dist}(X,\Omega)+\mathrm{dist}(X,\mathcal{R}_+)\big].
\end{aligned}$$
This shows that part (i) holds. The proof is then completed. $\Box$

Remark 3.1 (a)
When the closed set $\Omega$ is convex, from [5, Corollary 3] we know that the condition $\mathrm{ri}(\Omega)\cap\mathbb{S}^n_{++}\ne\emptyset$ is enough for the condition in (17) to hold. Clearly, there are many classes of closed convex sets $\Omega$ satisfying this basic constraint qualification.

(b) Fix any $\overline{X}\in\Gamma$. Notice that for any $(X,Y)\in\mathbb{S}^n\times\mathbb{S}^n$ with $X\in\Gamma$ or $Y\in\Gamma$,
$$\mathrm{dist}((X,Y),\mathcal{H}^{-1}(0))\le\|X-Y\|_F\le\mathrm{dist}(0,\mathcal{H}(X,Y)).$$
Hence, the subregularity of $\mathcal{H}$ at $(\overline{X},\overline{X})$ for the origin is equivalent to the existence of $\kappa>0$ and $\varepsilon>0$ such that for all $(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)\cap[(\Omega\setminus\mathcal{R}_+)\times(\mathcal{R}_+\setminus\Omega)]$,
$$\mathrm{dist}((X,Y),\mathcal{H}^{-1}(0))\le\kappa\|X-Y\|_F.$$

3.2 Criteria for the calmness of $\Upsilon$

The calmness of a mapping $\mathcal{M}\colon\mathbb{Y}\rightrightarrows\mathbb{X}$ at a point of its graph is implied by its Aubin property at this point, and the coderivative is a convenient tool for studying the latter. The coderivative of $\mathcal{M}$ at $(y,x)\in\mathrm{gph}\,\mathcal{M}$ is the mapping $D^*\mathcal{M}(y\,|\,x)\colon\mathbb{X}\rightrightarrows\mathbb{Y}$ defined by
$$u\in D^*\mathcal{M}(y\,|\,x)(v)\iff(u,-v)\in N_{\mathrm{gph}\,\mathcal{M}}(y,x).$$
Thus, by Theorem 3.2 (iii), we can provide a practical criterion to identify the calmness of $\Upsilon$ at $0$ for $\overline{X}\in\Gamma$ by characterizing the Aubin property of the inverse mapping $\mathcal{H}^{-1}$.

Proposition 3.1
Consider any $\overline{X}\in\Gamma$. The mapping $\Upsilon$ is calm at $0$ for $\overline{X}$ whenever
$$[-N_{\Omega}(\overline{X})]\cap N_{\mathcal{R}_+}(\overline{X})=\{0\},\tag{20}$$
which is equivalent to the Aubin property of $\mathcal{H}^{-1}$ at the origin for $(\overline{X},\overline{X})$.

Proof:
Observe that $\mathrm{gph}\,\mathcal{H}^{-1}=\mathcal{L}^{-1}(\Omega\times\mathcal{R}_+\times\{0\})$, where $\mathcal{L}(G,X,Y):=(X;Y;G-X+Y)$ for $G,X,Y\in\mathbb{S}^n$. By the surjectivity of $\mathcal{L}$ and [41, Exercise 6.7 & Proposition 6.41],
$$N_{\mathrm{gph}\,\mathcal{H}^{-1}}(0,\overline{X},\overline{X})=\big\{(\Delta W,\ \Delta S-\Delta W,\ \Delta Z+\Delta W)\mid\Delta S\in N_{\Omega}(\overline{X}),\ \Delta Z\in N_{\mathcal{R}_+}(\overline{X})\big\}.\tag{21}$$
From [35, Proposition 3.5] or [41, Theorem 9.40], the mapping $\mathcal{H}^{-1}$ has the Aubin property at the origin for $(\overline{X},\overline{X})$ if and only if $D^*\mathcal{H}^{-1}(0\,|\,(\overline{X},\overline{X}))(0,0)=\{0\}$, or equivalently
$$(\Delta G,0,0)\in N_{\mathrm{gph}\,\mathcal{H}^{-1}}(0,\overline{X},\overline{X})\ \Longrightarrow\ \Delta G=0.$$
This, together with (21), is equivalent to saying that condition (20) holds. $\Box$

Remark 3.2 (a)
By following the same arguments as those for Proposition 3.1, one may achieve the calmness of the mapping $\Upsilon$ at $0$ for $\overline{X}\in\Upsilon(0)$ under the condition
$$[-N_{\Xi}(\overline{X})]\cap N_{\mathcal{R}}(\overline{X})=\{0\},\tag{22}$$
which is equivalent to the Aubin property of $\mathcal{G}^{-1}$ at the origin for $(\overline{X},\overline{X})$. However, the criterion in (22) cannot hold since, by taking $Z=P_{\beta_1}P_{\beta_1}^T$ with $P\in\mathbb{O}^n(\overline{X})$ and $\beta_1\subset\beta:=\{i\mid\lambda_i(\overline{X})=0\}$ with $|\beta_1|\le n-r$, from Lemma 2.3 and Proposition 2.1 (i) we have $Z\in[-N_{\mathbb{S}^n_+}(\overline{X})]\cap N_{\mathcal{R}}(\overline{X})$, which together with $N_{\Xi}(\overline{X})\supseteq\widehat{N}_{\Omega}(\overline{X})+N_{\mathbb{S}^n_+}(\overline{X})$ means that $0\ne Z\in[-N_{\Xi}(\overline{X})]\cap N_{\mathcal{R}}(\overline{X})$. Inspired by this, in Section 3.2.2 we develop some criteria for the calmness of $\Upsilon$ via the subregularity of the mapping $\mathcal{H}$ instead of $\mathcal{G}$.

(b) Recently, Gfrerer [15, 16] proposed a sufficient condition to guarantee the subregularity of a multifunction. Its key ingredient for the subregularity of $\mathcal{H}$ is the limit set
$$\mathrm{Cr}^{>}\mathcal{H}((\overline{X},\overline{X}),0):=\Big\{(F,S,T)\in\mathbb{S}^n\times\mathbb{S}^n\times\mathbb{S}^n\ \Big|\ \exists\,t_k\downarrow 0,\ (F^k,S^k,T^k)\to(F,S,T),\ (G^k,H^k)\in\mathbb{S},\ R^k\in\mathbb{S},\ (\overline{X},\overline{X})+t_k(G^k,H^k)\notin\mathcal{H}^{-1}(0),\ (-S^k,-T^k,R^k)\in\widehat{N}_{\mathrm{gph}\,\mathcal{H}}\big((\overline{X},\overline{X})+t_k(G^k,H^k),\,t_kF^k\big)\Big\}.\tag{23}$$
Since criterion (20) is equivalent to the metric regularity of $\mathcal{H}$, from [15, Proposition 3.8] we know that it is stronger than the following subregularity criterion owing to Gfrerer:
$$(0,0,0)\notin\mathrm{Cr}^{>}\mathcal{H}((\overline{X},\overline{X}),0).\tag{24}$$

From Proposition 3.1 and Corollary 2.1, we immediately have the following result.

Corollary 3.1
Let $\Omega=\{X\in\mathbb{S}^n\mid\mathcal{A}(X)=b\}$, where $\mathcal{A}\colon\mathbb{S}^n\to\mathbb{R}^p$ is a linear mapping with $\mathcal{A}^*$ being its adjoint. Then the mapping $\Upsilon$ is calm at $0$ for $\overline{X}\in\Upsilon(0)$ provided that
$$\mathrm{Range}(\mathcal{A}^*)\cap\big[N_{\mathbb{S}^n_+}(\overline{X})+N_{\mathcal{R}}(\overline{X})\big]=\{0\}.$$

By Theorem 3.2 (iii), the calmness of $\Upsilon$ at $0$ for $\overline{X}\in\Gamma$ is implied by (and under the condition in (17) is equivalent to) the subregularity of the mapping $\mathcal{H}$ at $(\overline{X},\overline{X})$ for the origin. By invoking [11, Theorem 6.2], we can obtain the following conclusion.

Proposition 3.2
Consider any $\overline{X}\in\Gamma$. The mapping $\Upsilon$ is calm at $0$ for $\overline{X}$ provided that one of the following equivalent conditions holds:

(i) $\exists\,\gamma>0$ and $\varepsilon>0$ such that for any $(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)\cap[(\Omega\setminus\mathcal{R}_+)\times(\mathcal{R}_+\setminus\Omega)]$,
$$\max\big\{\mathrm{dist}(-W,N_{\Omega}(X)),\ \mathrm{dist}(W,N_{\mathcal{R}_+}(Y))\big\}\ge\gamma\quad\text{for }W=\frac{X-Y}{\|X-Y\|_F};$$

(ii) $\exists\,\gamma>0$ and $\varepsilon>0$ such that for any $(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)\cap[(\Omega\setminus\mathcal{R}_+)\times(\mathcal{R}_+\setminus\Omega)]$,
$$\max\big\{\mathrm{dist}(-W,\widehat{N}_{\Omega}(X)),\ \mathrm{dist}(W,\widehat{N}_{\mathcal{R}_+}(Y))\big\}\ge\gamma\quad\text{for }W=\frac{X-Y}{\|X-Y\|_F};\tag{25}$$

(iii) $\exists\,\gamma>0$ and $\varepsilon>0$ such that for any $(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)\cap[(\Omega\setminus\mathcal{R}_+)\times(\mathcal{R}_+\setminus\Omega)]$,
$$\limsup_{\substack{(X',Y')\xrightarrow{\Omega\times\mathcal{R}_+}(X,Y)\\ (X',Y')\ne(X,Y)}}\frac{[\|X-Y\|_F-\|X'-Y'\|_F]_+}{\|(X',Y')-(X,Y)\|_F}\ge\gamma;$$

(iv) $\exists\,\gamma>0$ and $\varepsilon>0$ such that for any $(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)\cap[(\Omega\setminus\mathcal{R}_+)\times(\mathcal{R}_+\setminus\Omega)]$,
$$\frac{\langle X-Y,\,G-H\rangle}{\|X-Y\|_F}\le-\gamma\quad\text{for some }(G,H)\in\mathbb{S}\cap\mathcal{T}_{\Omega\times\mathcal{R}_+}(X,Y).$$

Proof:
Notice that condition (i) is precisely the intrinsic transversality of the closed sets $\Omega$ and $\mathcal{R}_+$, which was introduced in [11] to establish the linear convergence of the alternating projection method for seeking a point in the intersection of two closed sets. By [11, Theorem 6.2], under condition (i) the mapping $\mathcal{H}$ is subregular at $(\overline{X},\overline{X})$ for the origin. Together with Theorem 3.2 (iii), the mapping $\Upsilon$ is calm at $0$ for $\overline{X}$.

(i) $\Longleftrightarrow$ (ii). It suffices to argue that (ii) $\Longrightarrow$ (i). Suppose that part (ii) holds. Let $\phi(X',Y'):=\|X'-Y'\|_F$ for $X',Y'\in\mathbb{S}^n$. Clearly, for any given $(X,Y)\in\Omega\times\mathcal{R}_+$ with $X\ne Y$, the function $\phi$ is continuously differentiable on a neighborhood of $(X,Y)$, which along with the expression of $f$ in (19) and [41, Exercise 8.8] implies that
$$\widehat{\partial}f(X,Y)=\Big(\frac{X-Y}{\|X-Y\|_F},\,\frac{Y-X}{\|X-Y\|_F}\Big)+\widehat{N}_{\Omega\times\mathcal{R}_+}(X,Y).$$
From part (ii), for all $(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)\cap[(\Omega\setminus\mathcal{R}_+)\times(\mathcal{R}_+\setminus\Omega)]$, it holds that
$$\mathrm{dist}\big((0,0),\widehat{\partial}f(X,Y)\big)\ge\gamma.$$
Let $\delta:=\varepsilon/2$. Pick any $(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\delta)\cap[(\Omega\setminus\mathcal{R}_+)\times(\mathcal{R}_+\setminus\Omega)]$. By [41, Exercise 8.8], $\partial f(X,Y)\ne\emptyset$. Let $(U,V)\in\partial f(X,Y)$ be such that $\mathrm{dist}(0,\partial f(X,Y))=\|(U,V)\|_F$. By Definition 2.1, there exist sequences $(X^k,Y^k)\xrightarrow{f}(X,Y)$ and $(U^k,V^k)\in\widehat{\partial}f(X^k,Y^k)$ with $(U^k,V^k)\to(U,V)$. Recall that $(X,Y)\in(\Omega\setminus\mathcal{R}_+)\times(\mathcal{R}_+\setminus\Omega)$. Along with $(X^k,Y^k)\xrightarrow{f}(X,Y)$ and $(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\delta)$, we have $(X^k,Y^k)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)\cap[(\Omega\setminus\mathcal{R}_+)\times(\mathcal{R}_+\setminus\Omega)]$ for all sufficiently large $k$. By invoking the last inequality with $(X,Y)=(X^k,Y^k)$ for all $k$ large enough, it follows that $\|(U^k,V^k)\|_F\ge\gamma$. Passing to the limit $k\to\infty$ yields $\mathrm{dist}(0,\partial f(X,Y))=\|(U,V)\|_F\ge\gamma$. Thus, together with
$$\partial f(X,Y)=\Big(\frac{X-Y}{\|X-Y\|_F},\,\frac{Y-X}{\|X-Y\|_F}\Big)+N_{\Omega\times\mathcal{R}_+}(X,Y),$$
we conclude that condition (i) holds.
The equivalence between (i) and (ii) follows.

The equivalence between (i) and (iii) was established in [11, Proposition 4.1]. Notice that condition (iii) is precisely the existence of $\gamma>0$ and $\varepsilon>0$ such that $|\nabla f|(X,Y)\ge\gamma$ for all $(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)\cap[(\Omega\setminus\mathcal{R}_+)\times(\mathcal{R}_+\setminus\Omega)]$. By Lemma 2.1, this is equivalent to the existence of $\gamma>0$ and $\varepsilon>0$ such that for all $(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)\cap[(\Omega\setminus\mathcal{R}_+)\times(\mathcal{R}_+\setminus\Omega)]$,
$$df(X,Y)(G,H)\le-\gamma\quad\text{for some }(G,H)\in\mathbb{S}.$$
For any given $(X,Y)\in\mathbb{B}((\overline{X},\overline{X}),\varepsilon)\cap[(\Omega\setminus\mathcal{R}_+)\times(\mathcal{R}_+\setminus\Omega)]$, from the global Lipschitz continuity and directional differentiability of $\phi$, for any $(G,H)\in\mathbb{S}^n\times\mathbb{S}^n$,
$$\begin{aligned}
df(X,Y)(G,H)&=\liminf_{\substack{\tau\downarrow 0\\ (G',H')\to(G,H)}}\frac{\phi(X+\tau G',Y+\tau H')-\phi(X,Y)}{\tau}+\liminf_{\substack{\tau\downarrow 0\\ (G',H')\to(G,H)}}\frac{\delta_{\Omega\times\mathcal{R}_+}(X+\tau G',Y+\tau H')}{\tau}\\
&=\frac{\langle X-Y,G\rangle+\langle Y-X,H\rangle}{\|X-Y\|_F}+\delta_{\mathcal{T}_{\Omega\times\mathcal{R}_+}(X,Y)}(G,H).
\end{aligned}$$
This shows that condition (iii) is equivalent to condition (iv). $\Box$
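Condition (i) is the intrinsic transversality property that drives the linear convergence of alternating projections in [11]. As an illustration of the objects involved, the sketch below runs alternating projections between $\mathcal{R}_+$ and the hypothetical affine choice $\Omega=\{X\in\mathbb{S}^n\mid\mathrm{diag}(X)=e\}$ (rank-$r$ correlation-type matrices); both projections admit closed forms, the one onto $\mathcal{R}_+$ keeping the $r$ largest eigenvalues clipped at zero. The final tolerance is an empirical observation, not a proven rate:

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 8, 3

def proj_Omega(X):
    # projection onto the hypothetical affine set Omega = {X in S^n : diag(X) = e}
    Y = X.copy()
    np.fill_diagonal(Y, 1.0)
    return Y

def proj_Rplus(X):
    # projection onto R_+: keep the r largest eigenvalues, clipped at zero
    lam, P = np.linalg.eigh((X + X.T) / 2)   # eigenvalues in ascending order
    lam[:n - r] = 0.0                        # drop all but the r largest
    lam = np.maximum(lam, 0.0)               # clip the remaining negatives
    return (P * lam) @ P.T

A = rng.standard_normal((n, n))
X = proj_Omega((A + A.T) / 2)
gap0 = np.linalg.norm(X - proj_Rplus(X))     # initial dist(X, R_+)
for _ in range(2000):
    X = proj_Omega(proj_Rplus(X))
gap = np.linalg.norm(X - proj_Rplus(X))      # dist of the final iterate to R_+

assert np.allclose(np.diag(X), 1.0)          # the iterate stays in Omega
assert gap <= gap0 + 1e-12                   # dist(X_k, R_+) is non-increasing
assert gap < 1e-2                            # empirically, near the intersection
```

The monotone decrease of $\mathrm{dist}(X_k,\mathcal{R}_+)$ is guaranteed for alternating projections, while the speed of decrease is precisely what transversality-type conditions such as (i) quantify.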
The criteria in Proposition 3.2 are implied by criterion (20). To see this, suppose that criterion (20) holds but Proposition 3.2 (i) does not hold. Then, there exists a sequence $\{(X^k,Y^k)\}\subseteq(\Omega\setminus\mathcal{R}_+)\times(\mathcal{R}_+\setminus\Omega)$ converging to $(\overline{X},\overline{X})$ such that
$$\mathrm{dist}(-W^k,N_{\Omega}(X^k))<\tfrac{1}{k}\quad\text{and}\quad\mathrm{dist}(W^k,N_{\mathcal{R}_+}(Y^k))<\tfrac{1}{k}\quad\text{for }W^k=\frac{X^k-Y^k}{\|X^k-Y^k\|_F}.$$
We assume (taking a subsequence if necessary) that $W^k\to W$ with $W\in\mathbb{S}$. Notice that
$$\mathrm{dist}(-W,N_{\Omega}(X^k))-\|W^k-W\|_F\le\mathrm{dist}(-W^k,N_{\Omega}(X^k))<1/k;$$
$$\mathrm{dist}(W,N_{\mathcal{R}_+}(Y^k))-\|W^k-W\|_F\le\mathrm{dist}(W^k,N_{\mathcal{R}_+}(Y^k))<1/k.$$
Since the mappings $N_{\Omega}$ and $N_{\mathcal{R}_+}$ are outer semicontinuous relative to $\Omega$ and $\mathcal{R}_+$, respectively (see [41, Proposition 6.6]), the functions $\mathrm{dist}(-W,N_{\Omega}(\cdot))$ and $\mathrm{dist}(W,N_{\mathcal{R}_+}(\cdot))$ are lsc relative to $\Omega$ and $\mathcal{R}_+$, respectively (see [41, Proposition 5.11]). Now taking the limit $k\to\infty$ and using the fact that $\{(X^k,Y^k)\}\subseteq\Omega\times\mathcal{R}_+$, we obtain $\mathrm{dist}(-W,N_{\Omega}(\overline{X}))=0$ and $\mathrm{dist}(W,N_{\mathcal{R}_+}(\overline{X}))=0$. This means that $0\ne W\in[-N_{\Omega}(\overline{X})]\cap N_{\mathcal{R}_+}(\overline{X})$, a contradiction to criterion (20). So, the stated implication holds true. It is worthwhile to point out that the criteria in Proposition 3.2 are independent of Gfrerer's criterion (24).

Recall that the subregularity of $\mathcal{H}$ at $(\overline{X},\overline{X})$ for the origin is equivalent to the local error bound property of $f$ in (19). By applying [27, Corollary 2.4] to the function $f$, it is not hard to obtain the following criteria, independent of those of Proposition 3.2, for the calmness of $\Upsilon$ at $0$ for $\overline{X}\in\Upsilon(0)$, where the equivalence between (ii) and (iii) is due to [27, Corollary 2.4 (vi)], and the equivalences between (i) and (ii), and between (iii) and (iv), can be obtained by following the same arguments as those for Proposition 3.2.
Proposition 3.3
Consider any $\overline X\in\Gamma$. The mapping $\Upsilon$ is calm at $0$ for $\overline X$ provided that one of the following equivalent conditions holds:

(i) there exist $\gamma>0$ and $\varepsilon>0$ such that for any $(X,Y)\in\mathbb{B}((\overline X,\overline X),\varepsilon)\cap[\Omega\times R_+]$ with $0<\frac{\|X-Y\|_F}{\|(X,Y)-(\overline X,\overline X)\|_F}\le\varepsilon$,
\[
\max\big\{\operatorname{dist}(-W,N_\Omega(X)),\,\operatorname{dist}(W,N_{R_+}(Y))\big\}\ge\gamma\quad\text{for } W=\tfrac{X-Y}{\|X-Y\|_F}; \tag{26}
\]

(ii) there exist $\gamma>0$ and $\varepsilon>0$ such that for any $(X,Y)\in\mathbb{B}((\overline X,\overline X),\varepsilon)\cap[\Omega\times R_+]$ with $0<\frac{\|X-Y\|_F}{\|(X,Y)-(\overline X,\overline X)\|_F}\le\varepsilon$,
\[
\max\big\{\operatorname{dist}(-W,\widehat N_\Omega(X)),\,\operatorname{dist}(W,\widehat N_{R_+}(Y))\big\}\ge\gamma\quad\text{for } W=\tfrac{X-Y}{\|X-Y\|_F}; \tag{27}
\]

(iii) there exist $\gamma>0$ and $\varepsilon>0$ such that for any $(X,Y)\in\mathbb{B}((\overline X,\overline X),\varepsilon)\cap[\Omega\times R_+]$ with $0<\frac{\|X-Y\|_F}{\|(X,Y)-(\overline X,\overline X)\|_F}\le\varepsilon$,
\[
\limsup_{\substack{(X',Y')\to(X,Y)\\ (X',Y')\ne(X,Y)}}\frac{[\,\|X-Y\|_F-\|X'-Y'\|_F\,]_+}{\|(X',Y')-(X,Y)\|_F}\ge\gamma;
\]

(iv) there exist $\gamma>0$ and $\varepsilon>0$ such that for any $(X,Y)\in\mathbb{B}((\overline X,\overline X),\varepsilon)\cap[\Omega\times R_+]$ with $0<\frac{\|X-Y\|_F}{\|(X,Y)-(\overline X,\overline X)\|_F}\le\varepsilon$,
\[
\frac{\langle X-Y,\,G-H\rangle}{\|X-Y\|_F}\le-\gamma\quad\text{for some } (G,H)\in\mathcal{S}\cap T_{\Omega\times R_+}(X,Y).
\]

Although the criteria of Proposition 3.2 are independent of Gfrerer's criterion (24), the following lemma states that the latter implies the criteria of Proposition 3.3.
Lemma 3.1
Fix any $\overline X\in\Gamma$. If $(0,0,0)\notin\operatorname{Cr}^{>}\!H(\overline X,\overline X,0)$, then Proposition 3.3 (i) holds.

Proof:
Firstly, by following the proof of Proposition 3.1, for any $(X,Y,Z)\in\operatorname{gph}H$,
\[
\widehat N_{\operatorname{gph}H}(X,Y,Z)=\big\{(\Delta X-\Delta Z,\,\Delta Y+\Delta Z,\,\Delta Z)\mid \Delta X\in\widehat N_\Omega(X),\ \Delta Y\in\widehat N_{R_+}(Y)\big\}. \tag{28}
\]
Suppose that Proposition 3.3 (ii) does not hold. Then for each $k\in\mathbb{N}$ there exists a point $(X^k,Y^k)\in\Omega\times R_+$ with $0<\frac{\|X^k-Y^k\|_F}{\|(X^k,Y^k)-(\overline X,\overline X)\|_F}\le\frac1k$ and $\|(X^k,Y^k)-(\overline X,\overline X)\|_F\le\frac1k$ such that $\operatorname{dist}(-W^k,\widehat N_\Omega(X^k))\le\frac1k$ and $\operatorname{dist}(W^k,\widehat N_{R_+}(Y^k))\le\frac1k$ with $W^k=\frac{X^k-Y^k}{\|X^k-Y^k\|_F}$. Then, for each $k\in\mathbb{N}$, there exists $(\Delta X^k,\Delta Y^k)\in\widehat N_\Omega(X^k)\times\widehat N_{R_+}(Y^k)$ such that $\|W^k+\Delta X^k\|_F\le\frac1k$ and $\|W^k-\Delta Y^k\|_F\le\frac1k$. Notice that $(X^k,Y^k,X^k-Y^k)\in\operatorname{gph}H$ with $(X^k,Y^k)\notin H^{-1}(0)$. By comparing with equation (28), it immediately follows that
\[
(\Delta X^k+W^k,\,\Delta Y^k-W^k,\,-W^k)\in\widehat N_{\operatorname{gph}H}(X^k,Y^k,X^k-Y^k).
\]
For each $k$, write
\[
t_k:=\sqrt{\|X^k-\overline X\|_F^2+\|Y^k-\overline X\|_F^2},\quad G^k:=t_k^{-1}(X^k-\overline X),\quad H^k:=t_k^{-1}(Y^k-\overline X),
\]
\[
S^k:=-(W^k+\Delta X^k),\quad T^k:=W^k-\Delta Y^k,\quad R^k:=-W^k,\quad F^k:=t_k^{-1}(X^k-Y^k).
\]
Clearly, $(X^k,Y^k)=(\overline X,\overline X)+t_k(G^k,H^k)$ with $(G^k,H^k)\in\mathcal{S}$ and $(X^k,Y^k)\notin H^{-1}(0)$. Also, $(-S^k,-T^k,R^k)\in\widehat N_{\operatorname{gph}H}((\overline X,\overline X)+t_k(G^k,H^k),\,t_kF^k)$ with $R^k\in\mathcal{S}$. Since $\|S^k\|_F\le\frac1k$, $\|T^k\|_F\le\frac1k$ and $\|F^k\|_F=\frac{\|X^k-Y^k\|_F}{\|(X^k,Y^k)-(\overline X,\overline X)\|_F}\le\frac1k$, we have $(F^k,S^k,T^k)\to(0,0,0)$. From the definition of $\operatorname{Cr}^{>}\!H(\overline X,\overline X,0)$ in (23), it follows that $(0,0,0)\in\operatorname{Cr}^{>}\!H(\overline X,\overline X,0)$, a contradiction to the given assumption. The proof is completed. $\Box$

By the proof of Theorem 3.1 (ii), the calmness of $\Upsilon$ at $0$ for $\overline X\in\Gamma$ is equivalent to the local error bound property of $h(\cdot)=\psi(\cdot)+\delta_\Xi(\cdot)$ at $\overline X$. By [27, Corollary 2.4] and Lemma 2.1, the latter is implied by the following criterion: there exist $\gamma>0$ and $\varepsilon>0$ such that for any $X\in\mathbb{B}(\overline X,\varepsilon)\cap(\Xi\backslash R)$,
\[
\psi'(X;H)\le-\gamma\quad\text{for some } H\in T_\Xi(X)\cap\mathcal{S}. \tag{29}
\]
Of course, by [27, Corollary 2.4], one can also derive criteria involving the subdifferential of $\psi$ and the normal cone to $\Xi$, but they are stronger than criterion (29) since the inclusion $\partial h(X)\subseteq N_\Xi(X)+\partial\psi(X)$ is generally strict. Criterion (29) appears to be independent of criterion (22), though it does hold for the set $\Xi=\{X\in\mathbb{S}^n_+\mid\operatorname{tr}(X)=1\}$ and $\overline X\in\Xi$ with $\operatorname{rank}(\overline X)=1$. Indeed, fix any $\varepsilon>0$. For any $X\in\mathbb{B}(\overline X,\varepsilon)\cap(\Xi\backslash R)$ we have $\lambda_1(X)>\lambda_2(X)\ge\cdots\ge\lambda_n(X)$ (shrinking $\varepsilon$ if necessary). Pick any $P\in\mathbb{O}^n(X)$ and take $H=P_1P_1^{T}-P_2P_2^{T}$, where $P_1$ and $P_2$ are the first and the second columns of $P$, respectively. It is easy to check that $H\in T_\Xi(X)$ with $\operatorname{tr}(H)=0$, and moreover, by Lemma 2.2,
\[
\psi'(X;H)=-S_r'(X;H)=-P_1^{T}HP_1=-1.
\]
Recall from [40] that a multifunction $\mathcal{M}:\mathbb{Y}\rightrightarrows\mathbb{X}$ is said to be locally upper-Lipschitzian (UL) at a point $\overline y$ if there exist $\kappa>0$ and $\varepsilon>0$ such that for all $y\in\mathbb{B}(\overline y,\varepsilon)$, $\mathcal{M}(y)\subseteq\mathcal{M}(\overline y)+\kappa\|y-\overline y\|\mathbb{B}$. Clearly, the local UL continuity of $\Upsilon$ at $0$ implies its calmness at $0$ for every $\overline X\in\Upsilon(0)$. The following proposition shows that the converse also holds if the set $\Xi$ is compact.

Proposition 3.4 Let
$\Upsilon:\mathbb{R}\rightrightarrows\mathbb{S}^n$ be the mapping in (2). Then the following results hold.

(i) The mapping $\Upsilon$ is locally UL at $0$ if and only if there exist $\kappa\ge0$ and $\delta>0$ such that for all $X\in\Xi\cap\{X\in\mathbb{S}^n\mid\operatorname{tr}(X)-S_r(X)\le\delta\}$,
\[
\operatorname{dist}(X,\Upsilon(0))\le\kappa\big(\operatorname{tr}(X)-S_r(X)\big). \tag{30}
\]

(ii) Suppose that the set $\Xi$ is compact. The mapping $\Upsilon$ is locally UL at $0$ if and only if there exists $\kappa'\ge0$ such that $\operatorname{dist}(X,\Gamma)\le\kappa'(\operatorname{tr}(X)-S_r(X))$ for all $X\in\Xi$.

(iii) Suppose that the set $\Xi$ is compact. If the mapping $\Upsilon$ is calm at $0$ for every $\overline X\in\Upsilon(0)$, then it is necessarily locally UL at $0$.

Proof: (i) The necessity is trivial by the definition of local upper Lipschitz continuity. For the sufficiency, consider any $\omega$ with $|\omega|\le\delta$. If $\omega<0$, then $\Upsilon(\omega)=\emptyset$, so $\Upsilon(\omega)\subseteq\Upsilon(0)+\kappa|\omega|\mathbb{B}$ for any $\kappa\ge0$. It thus suffices to consider the case $0\le\omega\le\delta$. Fix any $X\in\Upsilon(\omega)$. Clearly, $X\in\Xi\cap\{X\in\mathbb{S}^n\mid\operatorname{tr}(X)-S_r(X)\le\delta\}$. From inequality (30), we have $\Upsilon(\omega)\subseteq\Upsilon(0)+\kappa\big(\operatorname{tr}(X)-S_r(X)\big)\mathbb{B}\subseteq\Upsilon(0)+\kappa|\omega|\mathbb{B}$. So $\Upsilon$ is locally UL at $0$.

(ii) By part (i), it suffices to prove the necessity. Since $\Upsilon$ is locally UL at $0$, there exist $\delta>0$ and $\kappa\ge0$ such that (30) holds for all $X\in\Xi\cap\{X\in\mathbb{S}^n\mid\operatorname{tr}(X)-S_r(X)\le\delta\}$. Since the set $\Xi$ is compact, there exists $M>0$ such that $\|Z\|_F\le M$ for all $Z\in\Xi$. Fix any $X\in\Xi$. If $\operatorname{tr}(X)-S_r(X)\le\delta$, then by (30) we have $\operatorname{dist}(X,\Gamma)\le\kappa(\operatorname{tr}(X)-S_r(X))$. Otherwise, from $\operatorname{dist}(X,\Gamma)\le 2M$ we have $\operatorname{dist}(X,\Gamma)\le\frac{2M}{\delta}(\operatorname{tr}(X)-S_r(X))$. Thus $\kappa':=\max(\kappa,2M/\delta)$ is the required constant.

(iii) By the given assumption, for every $\overline X\in\Gamma$ there exist $\kappa_{\overline X}\ge0$ and $\varepsilon_{\overline X}>0$ such that
\[
\operatorname{dist}(Z,\Upsilon(0))\le\kappa_{\overline X}\big(\operatorname{tr}(Z)-S_r(Z)\big)\quad\forall\,Z\in\Xi\cap\mathbb{B}(\overline X,\varepsilon_{\overline X}).
\]
Observe that $\bigcup_{\overline X\in\Gamma}\mathbb{B}^{\circ}(\overline X,\varepsilon_{\overline X})$ is an open covering of the compact set $\Gamma$. By the Heine-Borel covering theorem, there exist $\overline X^1,\overline X^2,\ldots,\overline X^m\in\Gamma$ such that $\Gamma\subseteq\bigcup_{i=1}^m\mathbb{B}^{\circ}(\overline X^i,\varepsilon_{\overline X^i})$. Write $\widehat\kappa:=\max_{1\le i\le m}\kappa_{\overline X^i}$. From the last inequality, it immediately follows that
\[
\operatorname{dist}(Z,\Upsilon(0))\le\widehat\kappa\big(\operatorname{tr}(Z)-S_r(Z)\big)\quad\forall\,Z\in D:=\bigcup_{i=1}^m\big[\Xi\cap\mathbb{B}^{\circ}(\overline X^i,\varepsilon_{\overline X^i})\big].
\]
Consider the set $\widetilde\Xi:=\operatorname{cl}[\Xi\backslash D]$. Then there exists $\widetilde\kappa>0$ such that $\min_{X\in\widetilde\Xi}(\operatorname{tr}(X)-S_r(X))\ge\widetilde\kappa$ (if not, there would be a sequence $\{X^k\}\subseteq\widetilde\Xi$ with $\operatorname{tr}(X^k)-S_r(X^k)\le1/k$, and the compactness of $\widetilde\Xi$ together with the continuity of $X\mapsto\operatorname{tr}(X)-S_r(X)$ would yield a cluster point $\widetilde X\in\widetilde\Xi$ of $\{X^k\}$ with $\operatorname{tr}(\widetilde X)-S_r(\widetilde X)=0$; this implies $\widetilde X\in\Gamma\subseteq\bigcup_{i=1}^m\mathbb{B}^{\circ}(\overline X^i,\varepsilon_{\overline X^i})$, a contradiction to $\widetilde X\notin D$). In addition, since $\widetilde\Xi$ is compact, there exists $M>0$ such that $\operatorname{dist}(X,\Gamma)\le M$ for all $X\in\widetilde\Xi$. Together, the two facts show that
\[
\operatorname{dist}(Z,\Upsilon(0))\le(M/\widetilde\kappa)\big(\operatorname{tr}(Z)-S_r(Z)\big)\quad\forall\,Z\in\widetilde\Xi.
\]
Take $\kappa=\max(\widehat\kappa,M/\widetilde\kappa)$. From the last two inequalities, the inequality in (30) holds with such $\kappa$ and $\delta$. By part (ii), the mapping $\Upsilon$ is locally UL at $0$. $\Box$

We have obtained several groups of criteria for the calmness of $\Upsilon$ at $0$ for $\overline X\in\Upsilon(0)$ by leveraging the subregularity of the mapping $H$. Figure 1 summarizes the relations among criterion (20), criterion (24), the criteria of Propositions 3.2 (i)-(iv) and 3.3 (i)-(iv), the local UL property of $\Upsilon$ at $0$, and the calmness of $\Upsilon$ at $0$ for $\overline X\in\Upsilon(0)$.
Figure 1: The relations among the criteria for the calmness of $\Upsilon$ at $0$ for $\overline X\in\Upsilon(0)$
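The error bound characterization in Proposition 3.4 (ii) can be illustrated numerically. Take $\Xi=\{X\in\mathbb{S}^n_+\mid\operatorname{tr}(X)=1\}$ and $r=1$, so that $\Gamma$ consists of the rank-one density matrices: the rank-one matrix built from a top eigenvector is a feasible point of $\Gamma$, which bounds $\operatorname{dist}(X,\Gamma)$ by $\sqrt{2}\,(\operatorname{tr}(X)-S_1(X))$. The sketch below is our own illustration (the helper name `calmness_ratio` is not from the paper), not the paper's construction.

```python
import numpy as np

def calmness_ratio(X):
    """dist(X, Gamma) / (tr(X) - S_1(X)) on the trace-one spectraplex, with
    dist(X, Gamma) upper-bounded via the rank-one point v1 v1^T from the top
    eigenvector.  The ratio is at most sqrt(2), so kappa' = sqrt(2) works."""
    lam, U = np.linalg.eigh(X)             # eigenvalues in ascending order
    v = U[:, -1]                           # top eigenvector
    feasible = np.outer(v, v)              # a point of Gamma: PSD, rank 1, trace 1
    residual = np.trace(X) - lam[-1]       # tr(X) - S_1(X)
    return np.linalg.norm(X - feasible) / residual

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
X = A @ A.T / np.trace(A @ A.T)            # a random point of the spectraplex
print(calmness_ratio(X) <= np.sqrt(2) + 1e-9)   # True
```

The bound follows from $\|X-v_1v_1^T\|_F^2=(1-\lambda_1)^2+\sum_{i\ge2}\lambda_i^2\le 2(1-\lambda_1)^2$ and $\operatorname{tr}(X)-S_1(X)=1-\lambda_1$.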
Next we employ the criteria of Section 3.2 to find some closed sets $\Omega$ for which the associated mapping $\Upsilon$ is calm at $0$ for every $\overline X\in\Upsilon(0)$. Several examples are given as follows.
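All of the examples below rest on the residual $\operatorname{tr}(X)-S_r(X)$ from the reformulation (1), which vanishes on a PSD matrix exactly when $\operatorname{rank}(X)\le r$. A quick numerical sketch of this characterization (the helper name `rank_residual` is ours):

```python
import numpy as np

def rank_residual(X, r):
    """tr(X) - S_r(X): the sum of all but the r largest eigenvalues of PSD X."""
    lam = np.linalg.eigvalsh(X)            # eigenvalues in ascending order
    return float(lam[:-r].sum()) if r < len(lam) else 0.0

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 2))
X = B @ B.T                                # PSD with rank(X) = 2
print(abs(rank_residual(X, 2)) < 1e-10)    # True: residual vanishes since rank <= 2
print(rank_residual(X, 1) > 0)             # True: rank(X) exceeds 1
```

For a PSD matrix the residual is nonnegative, so the equality constraint $\operatorname{tr}(X)-S_r(X)=0$ in (1) encodes the nonconvex rank constraint through a difference of two convex spectral functions.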
Example 3.1
Fix any $\rho>0$ and let $\Omega=\{Z\in\mathbb{S}^n\mid|\!|\!|Z|\!|\!|\le\rho\}$, where $|\!|\!|\cdot|\!|\!|$ is an arbitrary matrix norm. We verify that criterion (20) holds at every $\overline X\in\Gamma$. Pick any $H\in[-N_\Omega(\overline X)]\cap[N_R(\overline X)+N_{\mathbb{S}^n_+}(\overline X)]$. There exist $H_1\in N_R(\overline X)$ and $H_2\in N_{\mathbb{S}^n_+}(\overline X)$ such that $H=H_1+H_2$. Clearly, $\langle H,\overline X\rangle=0$. Let $|\!|\!|\cdot|\!|\!|_*$ denote the dual norm of $|\!|\!|\cdot|\!|\!|$. Then
\[
\rho\,|\!|\!|{-H}|\!|\!|_*=\max_{|\!|\!|Z|\!|\!|\le\rho}\langle Z,-H\rangle=\max_{Z\in\Omega}\langle Z,-H\rangle.
\]
Since $-H\in N_\Omega(\overline X)$, we have $\max_{Z\in\Omega}\langle Z,-H\rangle\le\langle\overline X,-H\rangle$. Thus $\rho\,|\!|\!|{-H}|\!|\!|_*\le\langle-H,\overline X\rangle=0$, which gives $H=0$. Consequently, criterion (20) holds.

Example 3.2
Let
$\Omega=\{X\in\mathbb{S}^n\mid\operatorname{diag}(X)=e\}$. We verify that Corollary 3.1 holds for such $\Omega$ at every $\overline X\in\Gamma$. To this end, pick any $H\in\operatorname{Range}(\mathcal{A}^*)\cap[N_{\mathbb{S}^n_+}(\overline X)+N_R(\overline X)]$, where $\mathcal{A}:\mathbb{S}^n\to\mathbb{R}^n$ is the linear mapping defined by $\mathcal{A}(X)=\operatorname{diag}(X)$ for $X\in\mathbb{S}^n$. There exist $y\in\mathbb{R}^n$, $H_1\in N_{\mathbb{S}^n_+}(\overline X)$ and $H_2\in N_R(\overline X)$ such that $H=\operatorname{Diag}(y)=H_1+H_2$. From $H_1\in N_{\mathbb{S}^n_+}(\overline X)$, we have $H_1\overline X=0$. By Proposition 2.1 (i), $H_2\overline X=0$. So $\operatorname{Diag}(y)\overline X=0$, and together with $\operatorname{diag}(\overline X)=e$ this gives $y=0$. Thus Corollary 3.1 holds for such $\Omega$ at every $\overline X\in\Gamma$. For such $\Omega$, the associated $\Gamma$ is precisely the rank constraint set from [38]; when $r=1$, the set $\Gamma$ is exactly the PSD matrix reformulation of the max-cut problem (see [18]).

Example 3.3
Fix any $a\in\mathbb{R}$, $b\in\mathbb{R}\backslash\{0\}$ and any $B,C\in\mathbb{S}^n$ with $B-(a/b)C$ nonsingular. Consider the set $\Omega=\{X\in\mathbb{S}^n\mid\langle B,X\rangle=a,\ \langle C,X\rangle=b\}$. We verify that Corollary 3.1 with $\operatorname{Range}(\mathcal{A}^*)=\{y_1B+y_2C\mid y_1\in\mathbb{R},\,y_2\in\mathbb{R}\}$ holds for such $\Omega$ at every $\overline X\in\Gamma$. Pick any $H\in\operatorname{Range}(\mathcal{A}^*)\cap[N_{\mathbb{S}^n_+}(\overline X)+N_R(\overline X)]$. Then $H=y_1B+y_2C=H_1+H_2$ for some $y_1\in\mathbb{R}$, $y_2\in\mathbb{R}$, $H_1\in N_{\mathbb{S}^n_+}(\overline X)$ and $H_2\in N_R(\overline X)$. Since $\langle\overline X,H_1+H_2\rangle=0$, we have $y_1a+y_2b=0$, which implies that $H=y_1(B-(a/b)C)$. Notice that $H_1\overline X=0$ and $H_2\overline X=0$. We get $y_1(B-(a/b)C)\overline X=0$, which by the assumption on $B$ and $C$ (and the fact that $\overline X\ne0$ due to $\langle C,\overline X\rangle=b\ne0$) implies $y_1=0$, and then $y_2=0$ and $H=0$. So Corollary 3.1 holds for such $\Omega$. For such $\Omega$ and $r=1$, $\Gamma$ corresponds to the feasible set of the generalized eigenvalue problem (see [17]). When $B=0$ and $a=0$, $\Omega=\{X\in\mathbb{S}^n\mid\langle C,X\rangle=b\}$. By the above arguments, for such $\Omega$ with any $C\in\mathbb{S}^n$, the associated $\Upsilon$ is calm at $0$ for every $\overline X\in\Gamma$. For such $\Omega$ with $C=I$, the associated set $\Gamma$ often appears in quantum state tomography [19].

Example 3.4
Let
$\Omega=\{X\in\mathbb{S}^n\mid X_{11}=1,\ X_{ii}-\tfrac12(X_{1i}+X_{i1})=0\ \text{for } i=2,\ldots,n\}$. We verify that Corollary 3.1 holds for such $\Omega$ at each $\overline X\in\Gamma$, where $\mathcal{A}^*:\mathbb{R}^n\to\mathbb{S}^n$ is given by
\[
\mathcal{A}^*(y)=\begin{pmatrix}
y_1 & -\frac{y_2}{2} & \cdots & -\frac{y_n}{2}\\
-\frac{y_2}{2} & y_2 & & \\
\vdots & & \ddots & \\
-\frac{y_n}{2} & & & y_n
\end{pmatrix}\quad\text{for } y\in\mathbb{R}^n.
\]
Pick any $H\in\operatorname{Range}(\mathcal{A}^*)\cap[N_{\mathbb{S}^n_+}(\overline X)+N_R(\overline X)]$. There exist $y\in\mathbb{R}^n$, $H_1\in N_{\mathbb{S}^n_+}(\overline X)$ and $H_2\in N_R(\overline X)$ such that $H=\mathcal{A}^*(y)=H_1+H_2$. Notice that $H_1\overline X=0$ and $H_2\overline X=0$, so $H\overline X=0$. Hence, for $i=2,\ldots,n$,
\[
(H\overline X)_{ii}=y_i\big(\overline X_{ii}-\tfrac12\overline X_{1i}\big)=0,\qquad
(H\overline X)_{i1}=y_i\big(\overline X_{i1}-\tfrac12\overline X_{11}\big)=0.
\]
Together with $\overline X_{11}=1$ and $\overline X_{ii}-\overline X_{1i}=0$ for $i=2,\ldots,n$, these yield $y_i=0$ for all $i=2,\ldots,n$, which implies that $\langle H,\overline X\rangle=y_1\overline X_{11}=y_1$. Thus $y_1=0$ and $H=0$. So Corollary 3.1 holds for such $\Omega$ at every $\overline X\in\Gamma$. For such $\Omega$ and $r=1$, the set $\Gamma$ is precisely the PSD matrix reformulation of the unconstrained 0-1 quadratic problem.

Example 3.5
Let
$\Omega=\{X\in\mathbb{S}^n\mid\langle I,X_{ii}\rangle=1,\ \langle I,X_{ij}\rangle=0\ \text{for } i\ne j\in\{1,\ldots,k\}\}$ with $n=kp$, where $X_{ij}\in\mathbb{R}^{p\times p}$ is the $(i,j)$-th block of $X$. For such $\Omega$ and $r=1$, the associated $\Gamma$ is precisely the PSD reformulation of the orthogonal matrix set $\{Y\in\mathbb{R}^{p\times k}\mid Y^TY=I\}$ via $X=\operatorname{vec}(Y)\operatorname{vec}(Y)^T$. Fix any $\overline X\in\Gamma$. Pick any $H\in\operatorname{Range}(\mathcal{A}^*)\cap[N_{\mathbb{S}^n_+}(\overline X)+N_R(\overline X)]$, where $\mathcal{A}^*:\mathbb{S}^k\to\mathbb{S}^n$ is the linear operator whose value at $y=(y_{ij})$ has the block form
\[
\mathcal{A}^*(y)=\begin{pmatrix}
y_{11}I & y_{12}I & \cdots & y_{1k}I\\
y_{21}I & y_{22}I & \cdots & y_{2k}I\\
\vdots & \vdots & \ddots & \vdots\\
y_{k1}I & y_{k2}I & \cdots & y_{kk}I
\end{pmatrix}.
\]
Then there exist $y_{ij}\in\mathbb{R}$ for $i,j=1,\ldots,k$, $H_1\in N_{\mathbb{S}^n_+}(\overline X)$ and $H_2\in N_R(\overline X)$ such that $H=\mathcal{A}^*(y)=H_1+H_2$. Multiplying this equality by $\overline X$ and noticing that $H_1\overline X+H_2\overline X=0$ yields $H\overline X=0$. So, for all $i,j=1,\ldots,k$, $(H\overline X)_{ij}=\sum_{t=1}^k y_{it}\overline X_{tj}=0$. Consequently, $0=\langle I,\sum_{t=1}^k y_{it}\overline X_{tj}\rangle=y_{ij}$, where the last equality is due to $\overline X\in\Omega$. Thus $H=0$. By Corollary 3.1, the mapping $\Upsilon$ is calm at $0$ for $\overline X$.

The calmness of $\Upsilon$ at $0$ for every $\overline X\in\Gamma$ was obtained in [3] by using Proposition 3.4 (ii) and a technically complicated construction, whereas here we achieve it by directly verifying criterion (20). We also find that it is not an easy task to establish the calmness of $\Upsilon$ for Examples 3.1-3.5 by using the other criteria in Figure 1. For the following two examples, it is almost impossible to establish the calmness of $\Upsilon$ at $0$ for every $\overline X\in\Gamma$ by either criterion (20) or the criterion of Proposition 3.4 (ii); here we achieve it by using the special structure of $\Omega$ together with the results of Examples 3.2 and 3.4.

Example 3.6
Let
$\Omega=\{X\in\mathbb{S}^{n+1}\mid\operatorname{diag}(X)=e,\ \mathcal{A}(X)\le b\}$, where $b=(b_1,\ldots,b_m)^T$ is a given vector and the linear mapping $\mathcal{A}:\mathbb{S}^{n+1}\to\mathbb{R}^m$ is defined by
\[
\mathcal{A}(X):=\big(\langle A_1,X\rangle,\ldots,\langle A_m,X\rangle\big)^T\quad\text{with } A_i=\begin{pmatrix}0 & c_i^T\\ c_i & Q_i\end{pmatrix}\ \text{for } i=1,\ldots,m.
\]
For such $\Omega$ and $r=1$, the associated $\Gamma$ is precisely the feasible set of the PSD matrix reformulation of the combinatorial optimization problem
\[
\min_{x\in\{-1,1\}^n}\ \langle x,Q_0x\rangle+2\langle c_0,x\rangle\quad\text{s.t.}\ \langle x,Q_ix\rangle+2\langle c_i,x\rangle\le b_i,\ i=1,\ldots,m. \tag{31}
\]
Write $\widehat\Gamma:=\{X\in\mathbb{S}^{n+1}_+\mid\operatorname{rank}(X)\le1,\ \operatorname{diag}(X)=e\}$. It is a discrete set in the space $\mathbb{S}^{n+1}$ since $\widehat\Gamma\subseteq\{xx^T\mid x\in\{-1,1\}^{n+1}\}$. We employ this structure and the result of Example 3.2 to show that the associated $\Upsilon$ is calm at $0$ for every $\overline X\in\Gamma$. Indeed, since $\widehat\Gamma$ is a discrete set, there exists $\delta_1>0$ such that $\operatorname{dist}(X,\widehat\Gamma)=\|X-\overline X\|_F$ for all $X\in\mathbb{B}(\overline X,\delta_1)$. Since $\Gamma\subseteq\widehat\Gamma$, $\Gamma$ is also a discrete set, so there exists $\delta_2\in(0,\delta_1]$ such that $\operatorname{dist}(X,\Gamma)=\|X-\overline X\|_F=\operatorname{dist}(X,\widehat\Gamma)$ for all $X\in\mathbb{B}(\overline X,\delta_2)$. Let $\widetilde\Xi:=\{X\in\mathbb{S}^{n+1}_+\mid\operatorname{diag}(X)=e\}$. By Example 3.2, there exist $\gamma>0$ and $\delta_3>0$ such that
\[
\operatorname{dist}(X,\widehat\Gamma)\le\gamma\Big[\operatorname{dist}(X,\widetilde\Xi)+\sum_{i>r}\sigma_i(X)\Big]\le\gamma\Big[\operatorname{dist}(X,\Xi)+\sum_{i>r}\sigma_i(X)\Big]
\]
for all $X\in\mathbb{B}(\overline X,\delta_3)$. Set $\delta:=\min(\delta_2,\delta_3)$. Then, for all $X\in\mathbb{B}(\overline X,\delta)$, it holds that
\[
\operatorname{dist}(X,\Gamma)=\operatorname{dist}(X,\widehat\Gamma)\le\gamma\Big[\operatorname{dist}(X,\Xi)+\sum_{i>r}\sigma_i(X)\Big].
\]
By Theorem 3.1 (i), the mapping $\Upsilon$ for such $\Omega$ and $r=1$ is calm at $0$ for every $\overline X\in\Gamma$.

Example 3.7
Let
$\Omega=\{X\in\mathbb{S}^{n+1}\mid X_{11}=1,\ \operatorname{diag}(X)=X_{\cdot1},\ \mathcal{A}(X)\le b\}$, where $X_{\cdot1}$ is the first column of $X$, and the vector $b\in\mathbb{R}^m$ and the linear mapping $\mathcal{A}:\mathbb{S}^{n+1}\to\mathbb{R}^m$ are the same as those in Example 3.6. For such $\Omega$ and $r=1$, the associated $\Gamma$ is precisely the feasible set of the PSD matrix reformulation of the combinatorial problem
\[
\min_{x\in\{0,1\}^n}\ \langle x,Q_0x\rangle+2\langle c_0,x\rangle\quad\text{s.t.}\ \langle x,Q_ix\rangle+2\langle c_i,x\rangle\le b_i,\ i=1,\ldots,m. \tag{32}
\]
By Example 3.4 and a proof similar to that of Example 3.6, the mapping $\Upsilon$ for such $\Omega$ and $r=1$ is calm at $0$ for every $\overline X\in\Gamma$. The PSD reformulation of the MIMO detection problem also has a feasible set of the form $\Gamma$ associated to such $\Omega$ and $r=1$.

4 Applications of the calmness of Υ

An important application of the calmness of $\Upsilon$ is to characterize the first-order necessary optimality condition of the following rank constrained minimization problem
\[
\min_{X\in\mathbb{S}^n}\ \big\{f(X)\ \ \text{s.t.}\ \ \operatorname{rank}(X)\le r,\ X\in\Xi\big\} \tag{33}
\]
where $f:\mathbb{S}^n\to\mathbb{R}$ is a proper lsc function that is locally Lipschitz relative to $\Xi$. Let $\overline X$ be a local optimal solution of (33). By [41, Theorem 10.1 & Exercise 10.10], $0\in\partial f(\overline X)+N_\Gamma(\overline X)$. Under the calmness of $\Upsilon$ at $0$ for $\overline X$, from Theorem 3.1 (i) and [25, Section 3.1] we have
\[
N_\Gamma(\overline X)\subseteq N_\Xi(\overline X)+N_R(\overline X),
\]
which becomes an equality if $\operatorname{rank}(\overline X)=r$ and the set $\Xi$ is regular (respectively, becomes $N_\Gamma(\overline X)=N_\Omega(\overline X)+N_{R_+}(\overline X)$ if $\Omega$ is a convex set with $\operatorname{ri}(\Omega)\cap\mathbb{S}^n_{++}\ne\emptyset$). Thus it holds that $0\in\partial f(\overline X)+N_\Xi(\overline X)+N_R(\overline X)$ (respectively, $0\in\partial f(\overline X)+N_\Omega(\overline X)+N_{R_+}(\overline X)$).

Another important application of the calmness of $\Upsilon$ is to achieve the partial calmness, and hence the global exact penalty, of the following equivalent reformulation of problem (33):
\[
\min_{X\in\mathbb{S}^n}\ \big\{f(X)\ \ \text{s.t.}\ \ \operatorname{tr}(X)-S_r(X)=0,\ X\in\Xi\big\}. \tag{34}
\]
By [32, Lemma 2.1 & Proposition 2.1], we have the following global exact penalty result.

Theorem 4.1
If the mapping $\Upsilon$ is calm at $0$ for every $\overline X\in\Upsilon(0)$, then problem (34) is partially calm at every global optimal solution $X^*$, i.e., there exist $\varepsilon>0$ and $\mu>0$ such that for all $\tau\in\mathbb{R}$ and all $X\in\mathbb{B}(X^*,\varepsilon)\cap\Upsilon(\tau)$, one has $f(X)-f(X^*)+\mu\psi(X)\ge0$. So, if in addition $f$ is coercive on $\Xi$ or the set $\Xi$ is compact, there exists $\overline\mu>0$ such that the problem
\[
\min_{X\in\Xi}\ \big\{f(X)+\mu\big(\operatorname{tr}(X)-S_r(X)\big)\big\} \tag{35}
\]
associated to each $\mu\ge\overline\mu$ has the same global optimal solution set as problem (33) does.

Remark 4.1 (a)
For the sets $\Omega$ in Examples 3.1-3.7, problem (34) is partially calm at every global optimal solution, and hence problem (35) is a global exact penalty for (33) provided that $f$ is coercive on those noncompact $\Xi$. By leveraging the global exact penalty for the set $\Omega$ in Example 3.4, the first two authors have designed an effective feasible approach to a class of unconstrained binary polynomial programs (see [39]).

(b) Recently, for the set
$\Xi=\{X\in\mathbb{S}^n\mid 0\preceq X\preceq I\}$, Liu et al. [33] showed that
\[
\min_{X\in\Xi}\ \Big\{f(X)+\mu\sum_{i=r+1}^n[\lambda_i(X)]^p\Big\}\quad\text{for } p\in(0,1] \tag{36}
\]
is a global exact penalty of problem (33). When $p=1$, by noting that $\Xi=\Omega\cap\mathbb{S}^n_+$ with $\Omega$ given by Example 3.1 for the spectral norm, this result is implied by Theorem 4.1, i.e., there exists $\overline\mu>0$ such that problem (35) associated to every $\mu>\overline\mu$ has the same global optimal solution set as (33) does. When $p\in(0,1)$, let $X^*_\mu$ and $X_\mu$ be global optimal solutions of (35) and (36), respectively, associated to $\mu>\overline\mu$. Then
\[
f(X_\mu)+\mu\sum_{i=r+1}^n[\lambda_i(X_\mu)]^p\le f(X^*_\mu)+\mu\sum_{i=r+1}^n[\lambda_i(X^*_\mu)]^p=f(X^*_\mu)+\mu\sum_{i=r+1}^n\lambda_i(X^*_\mu)
\le f(X_\mu)+\mu\sum_{i=r+1}^n\lambda_i(X_\mu)\le f(X_\mu)+\mu\sum_{i=r+1}^n[\lambda_i(X_\mu)]^p,
\]
where the first inequality is due to the feasibility of $X^*_\mu$ for (36), the second one is due to the feasibility of $X_\mu$ for (35), and the last one uses $0\preceq X_\mu\preceq I$. This shows that problems (36) and (35) associated to every $\mu>\overline\mu$ have the same global optimal solution set. So problem (36) with $p\in(0,1]$ is a global exact penalty of (33).

Next we focus on the application of the calmness of $\Upsilon$ to achieving the partial calmness of an equivalent MPEC reformulation of the PSD rank regularized problem
\[
\min_{X\in\Xi}\ \big\{\nu f(X)+\operatorname{rank}(X)\big\} \tag{37}
\]
where $f$ is the same as above and $\nu>0$ is the regularization parameter. To this end, let $\mathscr{L}$ denote the family of proper lsc convex functions $\phi$ on $\mathbb{R}$ satisfying the conditions
\[
\operatorname{int}(\operatorname{dom}\phi)\supseteq[0,1],\quad t^*:=\arg\min_{0\le t\le1}\phi(t),\quad \phi(t^*)=0\ \ \text{and}\ \ \phi(1)=1. \tag{38}
\]
With $\phi\in\mathscr{L}$, problem (37) can be equivalently reformulated as the following problem
\[
\min_{X\in\Omega,\,W\succeq0}\ \Big\{\nu f(X)+\sum_{i=1}^n\phi(\lambda_i(W))\ \ \text{s.t.}\ \ \langle I-W,X\rangle=0,\ X\succeq0,\ I-W\succeq0\Big\}, \tag{39}
\]
which is a mathematical program with a PSD complementarity constraint. The following proposition shows that MPEC (39) is partially calm at every global optimal solution.
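The mechanics of the reformulation (39) can be checked numerically with the simplest member $\phi(t)=t$ of the family $\mathscr{L}$ (it satisfies (38) with $t^*=0$): pairing a rank-$k$ PSD matrix $X$ with the orthogonal projector $W$ onto its range makes the complementarity $\langle I-W,X\rangle=0$ hold while $\sum_i\phi(\lambda_i(W))$ counts $\operatorname{rank}(X)$. The construction below is our own illustration, not the paper's.

```python
import numpy as np

phi = lambda t: t     # a member of the family L: convex, phi(0) = 0, phi(1) = 1

rng = np.random.default_rng(2)
B = rng.standard_normal((6, 2))
X = B @ B.T                                # PSD with rank(X) = 2
lam, U = np.linalg.eigh(X)
U1 = U[:, lam > 1e-10]                     # orthonormal basis of range(X)
W = U1 @ U1.T                              # orthogonal projector: 0 <= W <= I

# Complementarity <I - W, X> = 0 holds, since W X = X.
print(abs(np.trace((np.eye(6) - W) @ X)) < 1e-8)                 # True
# Sum of phi over the eigenvalues of W (zeros and ones) counts rank(X).
print(round(sum(phi(t) for t in np.linalg.eigvalsh(W))) == 2)    # True
```

In this way the discontinuous objective $\operatorname{rank}(X)$ of (37) is traded for a smooth spectral function of the auxiliary variable $W$ plus the complementarity constraint.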
Proposition 4.1
For each $k\in\{1,\ldots,n\}$, define the multifunction $\Upsilon_k:\mathbb{R}_+\rightrightarrows\mathbb{S}^n$ by
\[
\Upsilon_k(\tau):=\big\{X\in\Xi\mid\operatorname{tr}(X)-S_k(X)=\tau\big\}\quad\text{for }\tau\ge0. \tag{40}
\]
If for each $k=1,2,\ldots,n$ the mapping $\Upsilon_k$ is calm at $0$ for all $\overline X\in\Upsilon_k(0)$, then MPEC (39) is partially calm at every global optimal solution $(X^*,W^*)$.

Proof:
For each $\tau\ge0$, write $\mathcal{F}_\tau:=\{(X,W)\in\Xi\times\mathbb{S}^n\mid\langle I-W,X\rangle=\tau,\ 0\preceq W\preceq I\}$. To show that MPEC (39) is partially calm at $(X^*,W^*)$, it suffices to prove that there exist $\varepsilon>0$ and $\rho>0$ such that for all $\tau\ge0$ and all $(X,W)\in\mathbb{B}((X^*,W^*),\varepsilon)\cap\mathcal{F}_\tau$,
\[
\nu f(X)+\sum_{i=1}^n\phi(\lambda_i(W))-\Big[\nu f(X^*)+\sum_{i=1}^n\phi(\lambda_i(W^*))\Big]+\rho\langle I-W,X\rangle\ge0.
\]
By the given assumption, for each $k\in\{1,\ldots,n\}$ there are $\delta_k>0$ and $\gamma_k>0$ such that
\[
\operatorname{dist}(Z,\Upsilon_k(0))\le\gamma_k\big[\operatorname{tr}(Z)-S_k(Z)\big]\quad\text{for all } Z\in\mathbb{B}(X^*,\delta_k)\cap\Xi. \tag{41}
\]
Let $\varepsilon:=\min_{1\le k\le n}\delta_k$. Fix any $\tau\ge0$ and pick any $(X,W)\in\mathbb{B}((X^*,W^*),\varepsilon)\cap\mathcal{F}_\tau$. Consider any $\rho\ge\overline\rho:=\frac{\nu\gamma\phi'_-(1)(1-t^*)L_f}{1-t_0}$ with $\gamma:=\max_{1\le k\le n}\gamma_k$, where $t_0\in[0,1)$ is such that $\frac{1}{1-t^*}\in\partial\phi(t_0)$ (its existence is due to [32, Lemma 1]) and $L_f$ is the Lipschitz constant of $f$ on $\Xi$. Write $J_0:=\{j\mid\rho\lambda_j(X)>\phi'_-(1)\}$ and $r_0:=|J_0|$. Clearly, $X\in\mathbb{B}(X^*,\delta_{r_0})\cap\Xi$. By invoking inequality (41), there necessarily exists $X_\rho\in\Upsilon_{r_0}(0)$ such that
\[
\|X-X_\rho\|_F\le\gamma\big[\operatorname{tr}(X)-S_{r_0}(X)\big]=\gamma\sum_{i=r_0+1}^n\lambda_i(X). \tag{42}
\]
Let $J_1:=\{j\mid\frac{1}{1-t^*}\le\rho\lambda_j(X)\le\phi'_-(1)\}$ and $J_2:=\{j\mid 0\le\rho\lambda_j(X)<\frac{1}{1-t^*}\}$. Notice that
\[
\sum_{i=1}^n\phi(\lambda_i(W))+\rho\langle I-W,X\rangle\ \ge\ \sum_{i=1}^n\min_{t\in[0,1]}\big\{\phi(t)+\rho\lambda_i(X)(1-t)\big\}.
\]
By invoking [32, Lemma 1] with $\omega=\rho\lambda_i(X)$, we can obtain the following inequalities:
\[
\sum_{i=1}^n\phi(\lambda_i(W))+\rho\langle I-W,X\rangle
\ \ge\ \operatorname{rank}(X_\rho)+\frac{\rho(1-t_0)}{\phi'_-(1)(1-t^*)}\sum_{j\in J_1}\lambda_j(X)+\rho(1-t_0)\sum_{j\in J_2}\lambda_j(X)
\]
\[
\ge\ \operatorname{rank}(X_\rho)+\frac{\rho(1-t_0)}{\phi'_-(1)(1-t^*)}\sum_{j\in J_1\cup J_2}\lambda_j(X)
\ \ge\ \operatorname{rank}(X_\rho)+\frac{\rho(1-t_0)}{\gamma\,\phi'_-(1)(1-t^*)}\|X-X_\rho\|_F
\]
\[
\ge\ \operatorname{rank}(X_\rho)+\nu L_f\|X-X_\rho\|_F\ \ge\ \operatorname{rank}(X_\rho)+\nu\big[f(X_\rho)-f(X)\big],
\]
where the second inequality is due to $\phi(t^*)-\phi(1)\ge\phi'_-(1)(t^*-1)$, i.e. $\phi'_-(1)(1-t^*)\ge1$, the third one is due to $J_1\cup J_2=\{j\mid\rho\lambda_j(X)\le\phi'_-(1)\}$ and inequality (42), and the last one uses the Lipschitz continuity of $f$ relative to $\Xi$. Now let $X_\rho$ have the eigenvalue decomposition $X_\rho=U\operatorname{Diag}(\lambda(X_\rho))U^T$ with $U\in\mathbb{O}^n(X_\rho)$, and write $W_\rho=U_1U_1^T+t^*U_2U_2^T$, where $U_1$ and $U_2$ consist of the first $r_1:=\operatorname{rank}(X_\rho)$ columns and the remaining $n-r_1$ columns of $U$, respectively. Clearly, $(X_\rho,W_\rho)$ is a feasible point of (39) and $\sum_{i=1}^n\phi(\lambda_i(W_\rho))=\operatorname{rank}(X_\rho)$. Together with the global optimality of $(X^*,W^*)$ for (39), the above inequalities show that (39) is partially calm at $(X^*,W^*)$. The proof is completed. $\Box$

Combining Proposition 4.1 with [32, Proposition 2.1], we obtain the following global exact penalty result, which greatly improves the result of [4, Theorem 3.4], where the global exact penalty was achieved only for the spectral norm unit ball $\Omega$.

Theorem 4.2
Suppose that for each $k=1,2,\ldots,n$ the mapping $\Upsilon_k$ is calm at $0$ for all $\overline X\in\Upsilon_k(0)$ (this automatically holds for the sets $\Omega$ from Examples 3.1-3.5). If $f$ is coercive on those noncompact $\Xi$, then there exists $\overline\rho>0$ such that the penalty problem
\[
\min_{X\in\Xi,\,0\preceq W\preceq I}\ \Big\{\nu f(X)+\sum_{i=1}^n\phi(\lambda_i(W))+\rho\langle I-W,X\rangle\Big\}
\]
associated to each $\rho\ge\overline\rho$ has the same global optimal solution set as MPEC (39) does.

5 Conclusions
We have presented several equivalent characterizations of the calmness of the mapping $\Upsilon$, and derived several groups of criteria to identify the calmness of $\Upsilon$ by leveraging these characterizations. A collection of common PSD rank constraint sets illustrates that these criteria can be satisfied, and that criterion (20) is the most convenient to use. The calmness of $\Upsilon$ is also employed to establish the global exact penalty for rank constrained and rank regularized optimization problems, recovering the result of [33] as a byproduct.

References

[1]
J. P. Aubin, Lipschitz behavior of solutions to convex minimization problems, Mathematics of Operations Research, 9(1984): 87-111.

[2] J. F. Bonnans and A. Shapiro, Perturbation Analysis of Optimization Problems, Springer, New York, 2000.

[3] S. J. Bi and S. H. Pan, Error bounds for rank constrained optimization problems and applications, Operations Research Letters, 44(2016): 336-341.

[4] S. J. Bi and S. H. Pan, Multistage convex relaxation approach to rank regularized minimization problems based on equivalent mathematical program with a generalized complementarity constraint, SIAM Journal on Control and Optimization, 55(2017): 2493-2518.

[5] H. H. Bauschke, J. M. Borwein and W. Li, Strong conical hull intersection property, bounded linear regularity, Jameson's property (G), and error bounds in convex optimization, Mathematical Programming, 86(1999): 135-160.

[6] H. H. Bauschke, D. R. Luke, H. M. Phan and X. F. Wang, Restricted normal cones and sparsity optimization with affine constraints, Foundations of Computational Mathematics, 14(2014): 63-83.

[7] S. Deng and H. Hu, Computable error bounds for semidefinite programming, Journal of Global Optimization, 14(1999): 105-115.

[8] C. Ding, An introduction to a class of matrix cone programming, PhD thesis, National University of Singapore, 2012.

[9] A. L. Dontchev and R. T. Rockafellar, Regularity and conditioning of solution mappings in variational analysis, Set-Valued Analysis, 12(2004): 79-109.

[10] A. L. Dontchev and R. T. Rockafellar, Implicit Functions and Solution Mappings, Springer Monographs in Mathematics, Springer, New York, 2009.

[11] D. Drusvyatskiy, A. D. Ioffe and A. S. Lewis, Transversality and alternating projections for nonconvex sets, Foundations of Computational Mathematics, 15(2015): 1637-1651.
[12] I. Dukanovic and F. Rendl, Semidefinite programming relaxations for graph coloring and maximal clique problems, Mathematical Programming, 109(2007): 345-365.

[13] M. J. Fabian, R. Henrion, A. Y. Kruger and J. V. Outrata, Error bounds: necessary and sufficient conditions, Set-Valued Analysis, 18(2010): 121-149.

[14] M. Fazel, Matrix Rank Minimization with Applications, PhD thesis, Stanford University, 2002.

[15] H. Gfrerer, First order and second order characterizations of metric subregularity and calmness of constraint set mappings, SIAM Journal on Optimization, 21(2011): 1439-1474.

[16] H. Gfrerer, On directional metric subregularity, subregularity and optimality conditions for nonsmooth mathematical programs, Set-Valued and Variational Analysis, 21(2013): 151-176.

[17] R. Ge, C. Jin, P. Netrapalli and A. Sidford, Efficient algorithms for large-scale generalized eigenvector computation and canonical correlation analysis, International Conference on Machine Learning, 2016: 2741-2750.

[18] M. X. Goemans and D. P. Williamson, Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming, Journal of the Association for Computing Machinery, 42(1995): 1115-1145.

[19] D. Gross, Y. K. Liu, S. T. Flammia, S. Becker and J. Eisert, Quantum state tomography via compressed sensing, Physical Review Letters, 105(2010): 150401.

[20] B. Hajek, Y. H. Wu and J. M. Xu, Semidefinite programs for exact recovery of a hidden community, 29th Annual Conference on Learning Theory, Proceedings of Machine Learning Research, 49(2016): 1051-1095.

[21] C. Helmberg and F. Rendl, Solving quadratic (0,1)-problems by semidefinite programs and cutting planes, Mathematical Programming, 82(1998): 291-315.

[22] R. Henrion, A. Jourani and J. Outrata, On the calmness of a class of multifunctions, SIAM Journal on Optimization, 13(2002): 520-534.

[23] R. Henrion and J. Outrata, Calmness of constraint systems with applications, Mathematical Programming, 104(2005): 437-464.

[24] A. D. Ioffe, Regular points of Lipschitz functions, Transactions of the American Mathematical Society, 251(1979): 61-69.

[25] A. D. Ioffe and J. V. Outrata, On metric and calmness qualification conditions in subdifferential calculus, Set-Valued Analysis, 16(2008): 199-227.
[26] A. Jourani and J. J. Ye, Error bounds for eigenvalue and semidefinite matrix inequality systems, Mathematical Programming, 104(2005): 525-540.

[27] A. Y. Kruger, Error bounds and metric subregularity, Optimization, 64(2015): 49-79.

[28] B. Kulis, M. A. Sustik and I. S. Dhillon, Low-rank kernel learning with Bregman matrix divergences, Journal of Machine Learning Research, 10(2009): 341-376.

[29] A. S. Lewis, Nonsmooth analysis of eigenvalues, Mathematical Programming, 84(1999): 1-24.

[30] D. R. Luke, Prox-regularity of rank constraint sets and implications for algorithms, Journal of Mathematical Imaging and Vision, 47(2013): 231-238.

[31] H. V. Ngai and M. Théra, Error bounds for systems of lower semicontinuous functions in Asplund spaces, Mathematical Programming, 116(2009): 397-427.

[32] Y. L. Liu, S. J. Bi and S. H. Pan, Equivalent Lipschitz surrogates for zero-norm and rank optimization problems, Journal of Global Optimization, 72(2018): 679-704.

[33] T. X. Liu, Z. S. Lu, X. J. Chen and Y. H. Dai, An exact penalty method for semidefinite-box-constrained low-rank matrix optimization problems, IMA Journal of Numerical Analysis, 40(2020): 563-586.

[34] K. W. Meng and X. Q. Yang, Equivalent conditions for local error bounds, Set-Valued and Variational Analysis, 20(2012): 617-636.

[35] B. S. Mordukhovich, Stability theory for parametric generalized equations and variational inequalities via nonsmooth analysis, Transactions of the American Mathematical Society, 343(1994): 609-656.

[36] B. S. Mordukhovich, Variational Analysis and Generalized Differentiation I, Springer, New York, 2006.

[37] S. Negahban and M. J. Wainwright, Estimation of (near) low-rank matrices with noise and high-dimensional scaling, The Annals of Statistics, 39(2011): 1069-1097.

[38] R. Pietersz and P. J. F. Groenen, Rank reduction of correlation matrices by majorization, Quantitative Finance, 4(2004): 649-662.

[39] Y. T. Qian and S. H. Pan, Relaxation approaches to a class of UBPPs based on equivalent DC penalized matrix programs, arXiv:2004.12345.
[40] S. M. Robinson, Some continuity properties of polyhedral multifunctions, Mathematical Programming Study, 14(1981): 206-214.

[41] R. T. Rockafellar and R. J-B. Wets, Variational Analysis, Springer, 1998.

[42] R. Schneider and A. Uschmajew, Convergence results for projected line-search methods on varieties of low-rank matrices via Łojasiewicz inequality, SIAM Journal on Optimization, 25(2015): 622-646.

[43] B. Wu, C. Ding, D. F. Sun and K. C. Toh, On the Moreau-Yosida regularization of the vector k-norm related functions, SIAM Journal on Optimization, 24(2014): 766-794.

[44] Z. L. Wu and J. J. Ye, First-order and second-order conditions for error bounds, SIAM Journal on Optimization, 14(2003): 621-645.

[45] J. F. Sturm, Error bounds for linear matrix inequalities, SIAM Journal on Optimization, 10(2000): 1228-1248.

[46] J. J. Ye and X. Y. Ye, Necessary optimality conditions for optimization problems with variational inequality constraints, Mathematics of Operations Research, 22(1997): 977-997.

[47] S. Z. Zhang, Global error bounds for convex conic problems, SIAM Journal on Optimization, 10(2000): 836-851.

[48] X. Y. Zheng and K. F. Ng, Metric subregularity and calmness for nonconvex generalized equations in Banach spaces, SIAM Journal on Optimization, 20(2010): 2119-2136.

[49] X. Y. Zheng and Z. Wei, Perturbation analysis of error bounds for quasi-subsmooth inequalities and semi-infinite constraint systems, SIAM Journal on Optimization, 22(2012): 41-65.

[50] G. W. Stewart and J. G. Sun, Matrix Perturbation Theory, Academic Press, 1990.