Naive constant rank-type constraint qualifications for multifold second-order cone programming and semidefinite programming
R. Andreani∗, G. Haeser†, L.M. Mito†, H. Ramírez‡, D.O. Santos§, T.P. Silveira†

September 11, 2020
Abstract
The constant rank constraint qualification, introduced by Janin in 1984 for nonlinear programming, has been extensively used for sensitivity analysis, global convergence of first- and second-order algorithms, and for computing the derivative of the value function. In this paper we discuss naive extensions of constant rank-type constraint qualifications to second-order cone programming and semidefinite programming, which are based on the Approximate-Karush-Kuhn-Tucker necessary optimality condition and on the application of the reduction approach. Our definitions are strictly weaker than Robinson's constraint qualification, and an application to the global convergence of an augmented Lagrangian algorithm is obtained.
Keywords:
Constraint qualifications; Optimality conditions; Second-order cone programming; Semidefinite programming; Global convergence.
1 Introduction

In this paper we investigate constraint qualifications (CQs) for second-order cone programming and semidefinite programming. In particular, we are interested in constant rank CQs as defined first in [15] and later extended in [8, 7, 18, 20] in the context of nonlinear programming. The definition in [15] gained some notoriety for its ability to compute the derivative of the value function, a result known to hold at the time only under the Mangasarian-Fromovitz CQ [23]. Also, the definition from [15] naturally includes the case of linear constraints, which is not covered by the Mangasarian-Fromovitz CQ. The ability to handle redundant constraints (in particular, linear ones) in the case of nonlinear programming is a powerful modeling tool that frees the model builder from the apprehension of including them without preprocessing. Actually, the effort of finding which constraints are redundant may be equivalent to the effort of solving the problem.

For conic programming, it is well known that linearity of the constraints is not a CQ [2, 21], and this somehow stresses the difficulties in extending these ideas to the conic context. In particular, a previous tentative extension to second-order cones [27] has been shown to be incorrect [3].

In this paper, we make use of the reduction approach in order to propose new constant rank-type CQs for second-order cone programming and semidefinite programming that are strictly weaker than Robinson's CQ.
In our approach, we separate the constraints into two sets: one consisting of the constraints that can be completely characterized by standard equality and inequality nonlinear programming constraints, and the other consisting of the irreducible conic constraints. For second-order cone programming, the second block consists of constraints that are active at the vertex of a multi-dimensional second-order cone, while for semidefinite programming these correspond to semidefinite blocks where the zero eigenvalue is non-simple.

We consider our conditions to be naive extensions of the corresponding nonlinear programming CQs in the sense that if the problem only has irreducible constraints then all our conditions coincide with Robinson's CQ; however, we show some interesting examples where our condition holds while Robinson's CQ fails. Extending these ideas to consider also the irreducible constraints is an ongoing topic of research.

∗ Department of Applied Mathematics, University of Campinas, Campinas-SP, Brazil. Email: [email protected]
† Department of Applied Mathematics, University of São Paulo, São Paulo-SP, Brazil. Email: {ghaeser,leokoto,thiagops}@ime.usp.br
‡ Departamento de Ingeniería Matemática and Centro de Modelamiento Matemático (CNRS UMI 2807), Universidad de Chile, Santiago, Chile. Email: [email protected]
§ Institute of Science and Technology, Federal University of São Paulo, São José dos Campos-SP, Brazil. Email: [email protected]

Despite our inability to deal with the irreducible conic constraints, the Approximate-Karush-Kuhn-Tucker (AKKT) [5] necessary optimality condition, recently extended to second-order cones [4] and semidefinite programming [9], can easily be used to handle the remaining constraints by means of the reduction approach. This allows obtaining CQs analogous to those defined in [8, 7, 15, 18, 20].
Analogous definitions of [15, 18] are independent of Robinson's CQ, while analogues of [8, 7, 20] are strictly weaker than Robinson's CQ.

Since several algorithms are expected to generate AKKT sequences (this is the case, for instance, of the augmented Lagrangian algorithms of [4] and [9]), a relevant corollary of our analysis is that all CQs introduced in this paper can be used for proving global convergence of these algorithms to a KKT point.

This paper is organized as follows. In Section 2, we briefly introduce constant rank CQs for nonlinear programming. In Section 3, we revisit constraint qualifications for second-order cone programming. Section 4 is devoted to the AKKT approach, while in Section 5 we introduce and explain our new CQs for second-order cones. In Section 6 we extend these ideas to semidefinite programming. Finally, our conclusions are presented in Section 7.

Notation:
For a continuously differentiable function g: R^n → R^m, we denote by Jg(x) the m × n Jacobian matrix of g at x, whose j-th row is the transposed gradient ∇g_j(x)^T of the j-th component function g_j: R^n → R, j = 1, ..., m. Any finite-dimensional space R^m is equipped with its standard Euclidean inner product ⟨x, y⟩ := x^T y = ∑_{j=1}^m x_j y_j. Then, given a closed convex cone K ⊆ R^m, we denote its polar by K° := { v ∈ R^m | ⟨v, y⟩ ≤ 0, ∀y ∈ K }. Finally, we adopt the following standard convention on the empty set ∅: the sum over an empty index set is zero.

2 Constant rank constraint qualifications in nonlinear programming

Consider the following nonlinear programming problem (NLP):

Minimize f(x),
s.t. h_i(x) = 0, i = 1, ..., p,          (1)
     g_j(x) ≤ 0, j = 1, ..., q,

where f, h_i, g_j: R^n → R are continuously differentiable functions. We denote by A(x*) := { j ∈ {1, ..., q} | g_j(x*) = 0 } the set of indices of active inequality constraints at a feasible point x*.

It is well known that at a local minimizer x*, it holds that −∇f(x*) ∈ T(x*)°, where T(x*) denotes the (Bouligand) tangent cone to the feasible set at x* (see, e.g., [19, Theorem 12.8]). However, since the tangent cone is a geometric object, this necessary optimality condition is not always easy to manipulate. For this reason, one considers the linearized cone, which is defined as follows:

L(x*) := { d ∈ R^n | ∇h_i(x*)^T d = 0, i = 1, ..., p; ∇g_j(x*)^T d ≤ 0, j ∈ A(x*) }.

Its polar may be computed via Farkas' Lemma, obtaining:

L(x*)° = { v ∈ R^n | v = ∑_{i=1}^p λ_i ∇h_i(x*) + ∑_{j ∈ A(x*)} μ_j ∇g_j(x*), μ_j ≥ 0, j ∈ A(x*) }.

Hence, when T(x*)° = L(x*)°, this geometric optimality condition takes the form of the usual, much more tractable, Karush-Kuhn-Tucker conditions.
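As a quick numerical illustration (the instance below is ours, chosen only for this sketch, not taken from the paper), consider minimizing x_0 + x_1 subject to −x_0 ≤ 0 and −x_1 ≤ 0. At x* = (0, 0) both constraints are active, and −∇f(x*) lies in L(x*)°, so the KKT conditions hold:

```python
import numpy as np

# Toy instance of (1), ours for illustration:
#   minimize x0 + x1  s.t.  -x0 <= 0, -x1 <= 0,
# with minimizer x* = (0, 0), where both inequalities are active.
grad_f = np.array([1.0, 1.0])
grad_g = np.array([[-1.0, 0.0],    # ∇g1(x*)
                   [0.0, -1.0]])   # ∇g2(x*)

# -∇f(x*) ∈ L(x*)°: solve ∇f(x*) + Σ_j μ_j ∇g_j(x*) = 0 for μ
mu, *_ = np.linalg.lstsq(grad_g.T, -grad_f, rcond=None)
assert np.allclose(grad_f + grad_g.T @ mu, 0)
assert np.all(mu >= 0)   # μ = (1, 1), so the KKT conditions hold at x*
```

Here the multiplier found by least squares is nonnegative, which certifies membership in the polar of the linearized cone for this toy instance.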
The vectors (λ_i, μ_j) above are called Lagrange multipliers associated with x*, and the set of all such vectors is denoted by Λ(x*) in this manuscript.

A constraint qualification (CQ) is a condition that ensures the equality T(x*)° = L(x*)°. One of the most used CQs in the NLP literature is the well-known Linear Independence Constraint Qualification (LICQ), which states the linear independence of the set of gradients {∇h_i(x*)}_{i=1}^p ∪ {∇g_j(x*)}_{j ∈ A(x*)}. LICQ ensures not only the existence, but also the uniqueness, of the Lagrange multiplier (see, e.g., [19, Section 12.3]). Several weaker CQs have been defined for NLP. In this paper, we are interested in constant rank-type ones, as first introduced by Janin in [15]. Recall that in the NLP setting, we say that the Constant Rank Constraint Qualification (CRCQ) holds at a feasible point x* if there exists a neighborhood V of x* such that, for all subsets I ⊆ {1, ..., p} and J ⊆ A(x*), the rank of {∇h_i(x), ∇g_j(x); i ∈ I, j ∈ J} remains constant for all x ∈ V. CRCQ is clearly weaker than LICQ.

Note that requiring only constant rank of the full set of gradients {∇h_i(x)}_{i=1}^p ∪ {∇g_j(x)}_{j ∈ A(x*)} (which is known as the Weak Constant Rank (WCR) property) is not a CQ, as shown in [10]. The necessity of considering every subset of this set of gradients may be seen from the definition of the linearized cone. Indeed, given d ∈ L(x*), the relevant index set of inequality constraint gradients is given by J = J_d := { j ∈ A(x*) | ∇g_j(x*)^T d = 0 }, which cannot be chosen in advance if one only considers the point x*. However, this suggests that there is no need to consider subsets of indices for the equality constraints, that is, it is enough to fix I = {1, ..., p}. This condition, called Relaxed-CRCQ (RCRCQ), has been shown to be a CQ in [17].
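The constant rank property is easy to probe numerically. The sketch below uses our own toy data (two redundant linear constraints, g1(x) = x_0 and g2(x) = 2x_0, both active at x* = 0): LICQ fails because the gradients are parallel, yet every subset of active gradients keeps constant rank on sampled points near x*, as CRCQ requires:

```python
import numpy as np
from itertools import combinations

# Toy active gradients near x* = (0, 0) (our own illustrative data):
# g1(x) = x0 and g2(x) = 2*x0 are both active at x*, with constant,
# parallel gradients.
def active_gradients(x):
    return [np.array([1.0, 0.0]),      # ∇g1(x)
            np.array([2.0, 0.0])]      # ∇g2(x)

rng = np.random.default_rng(0)
samples = [np.zeros(2)] + [0.1 * rng.standard_normal(2) for _ in range(5)]

# CRCQ asks every subset of active gradients to keep constant rank near x*.
crcq_ok = all(
    len({np.linalg.matrix_rank(np.array([active_gradients(x)[j] for j in S]))
         for x in samples}) == 1
    for r in (1, 2) for S in combinations(range(2), r))
assert crcq_ok  # constant rank on the sample; note LICQ fails (rank 1 < 2)
```

Sampling only gives numerical evidence on finitely many points, of course; the definition asks for constant rank on a whole neighborhood.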
This condition reads as follows: RCRCQ holds at a feasible point x* if there exists a neighborhood V of x* such that, for every subset J ⊆ A(x*), the rank of {∇h_i(x), ∇g_j(x); i ∈ {1, ..., p}, j ∈ J} remains constant for all x ∈ V.

These conditions can be seen as constant linear dependence conditions, and thus it is natural to weaken them by considering only constant positive linear dependence, providing the condition CPLD [20] and its relaxed variant RCPLD [8], both strictly weaker than the Mangasarian-Fromovitz CQ. This will be the most natural formulation for the CQs we propose in this paper. We refer the reader to [8].

It turns out that the idea behind the construction of RCRCQ can also be extended to inequality constraints, providing an even weaker CQ. One seeks to characterize a single index set J that is relevant for the constant rank property. This set consists of the indices of gradients defining the subspace component of L(x*)°, which is given by its lineality space. More precisely, the lineality space of L(x*)°, defined as the largest linear space contained in L(x*)°, is in this case given by L(x*)° ∩ −L(x*)°. Now, a gradient ∇g_j(x*) with j ∈ A(x*) belongs to L(x*)° ∩ −L(x*)° if, and only if, −∇g_j(x*) ∈ L(x*)°. Thus, for J = J_−(x*) := { j ∈ A(x*) | −∇g_j(x*) ∈ L(x*)° }, we say that the Constant Rank of the Subspace Component (CRSC) CQ holds at a feasible point x* if there exists a neighborhood V of x* such that the rank of {∇h_i(x), ∇g_j(x); i ∈ {1, ..., p}, j ∈ J_−(x*)} remains constant for all x ∈ V. It was proved in [7] that CRSC is sufficient for the existence of Lagrange multipliers at a local minimizer, and it is the weakest of the CQs we have discussed.

The CQ conditions discussed above in the NLP context have multiple applications.
For instance, RCRCQ was used to compute the derivative of the value function in [18], as well as to prove the convergence of a second-order augmented Lagrangian algorithm to second-order stationary points in [6]. RCPLD and CRSC were shown to be sufficient for proving first-order global convergence of several algorithms while also implying the validity of an error bound property (cf. [7]). It is noteworthy that, under CRSC, all inequality constraints in the set J_−(x*) behave locally as equality constraints, in the sense that they are active at any feasible point in a neighborhood of x*. Therefore, we strongly believe that the extension of these notions to a conic framework may have a major impact on stability and algorithmic theory for conic programming.

3 Constraint qualifications for second-order cone programming

Let us consider the second-order cone programming (SOCP) problem as follows:

Minimize f(x),
s.t. h_i(x) = 0, i = 1, ..., p,          (2)
     g_j(x) ∈ K_{m_j}, j = 1, ..., ℓ,

where the functions are continuously differentiable and the second-order cones are denoted by K_{m_j} := { (z_0, z̄) ∈ R × R^{m_j − 1} | z_0 ≥ ‖z̄‖ } when m_j > 1, and K_{m_j} := R_+ (non-negative reals) otherwise.

We say that the Karush-Kuhn-Tucker (KKT) conditions hold for problem (2) at a feasible point x* if there exist λ ∈ R^p and μ_j ∈ K_{m_j}, j = 1, ..., ℓ, such that

∇_x L(x*, λ, μ) = ∇f(x*) + Jh(x*)^T λ − ∑_{j=1}^ℓ Jg_j(x*)^T μ_j = 0,          (3)
⟨μ_j, g_j(x*)⟩ = 0, j = 1, ..., ℓ.          (4)

Here, L(x, λ, μ) := f(x) + ⟨λ, h(x)⟩ − ∑_{j=1}^ℓ ⟨μ_j, g_j(x)⟩ is the standard Lagrangian function for problem (2), and ∇_x L(x, λ, μ) denotes the gradient of L at (x, λ, μ) with respect to x. As usual, the set of all Lagrange multipliers (λ, μ) associated with the feasible point x*, such that (3)–(4) are fulfilled, is denoted by Λ(x*).

As in NLP, one needs to assume a suitable CQ in order to ensure the existence of Lagrange multipliers associated with a local minimizer. In what follows, we recall the elements needed to define these CQs in the SOCP context. The topological interior of K_{m_j}, denoted by int(K_{m_j}), and the non-zero boundary, denoted by bd+(K_{m_j}), are respectively defined by

int(K_{m_j}) := { (z_0, z̄) ∈ R × R^{m_j − 1} | z_0 > ‖z̄‖ },
bd+(K_{m_j}) := { (z_0, z̄) ∈ R × R^{m_j − 1} | z_0 = ‖z̄‖ > 0 }.

Thus, given a feasible point x*, we introduce the index sets:

I_int(x*) := { j ∈ {1, ..., ℓ} | g_j(x*) ∈ int(K_{m_j}) },
I_B(x*) := { j ∈ {1, ..., ℓ} | g_j(x*) ∈ bd+(K_{m_j}) },
I_0(x*) := { j ∈ {1, ..., ℓ} | g_j(x*) = 0 }.

Moreover, the complementarity condition (4) can be equivalently written as

μ_j ∘ g_j(x*) = 0, j = 1, ..., ℓ,          (5)

where the operation ∘ is defined for any pair of vectors y := (y_0, ȳ) and s := (s_0, s̄) of the same dimension as follows:

y ∘ s := ( ⟨y, s⟩, y_0 s̄ + s_0 ȳ ).
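The equivalence between (4) and (5) is easy to probe numerically; the boundary points below are an illustrative choice of ours, not taken from the paper:

```python
import numpy as np

def jordan(y, s):
    # y ∘ s = (⟨y, s⟩, y0*s̄ + s0*ȳ)
    return np.concatenate(([y @ s], y[0] * s[1:] + s[0] * y[1:]))

def in_soc(z):
    # membership test for the second-order cone K_m
    return z[0] >= np.linalg.norm(z[1:]) - 1e-12

y = np.array([5.0, 3.0, 4.0])     # boundary point: 5 = ||(3, 4)||
s = np.array([5.0, -3.0, -4.0])   # its reflection, also on the boundary
assert in_soc(y) and in_soc(s)
assert abs(y @ s) < 1e-12             # ⟨y, s⟩ = 0 ...
assert np.allclose(jordan(y, s), 0)   # ... and indeed y ∘ s = 0
```

For cone elements, orthogonality forces the bar parts to be antiparallel, which is exactly why the second block of y ∘ s vanishes as well.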
For more details about this operation, its algebraic properties, and its relation with Jordan algebras, see [1, Section 4] and references therein.

From (5), it is easy to check that the complementarity condition can be equivalently written in terms of the above-mentioned index sets as follows:

μ_j = 0 if j ∈ I_int(x*);   μ_j = α_j R_{m_j} g_j(x*), for some α_j ≥ 0, if j ∈ I_B(x*),          (6)

and no condition on μ_j can be inferred when j ∈ I_0(x*). Here, R_m is an m × m diagonal matrix whose first entry is 1 and the remaining ones are −1. Consequently, the KKT conditions at x* can be characterized as the existence of λ ∈ R^p, μ_j ∈ K_{m_j}, j ∈ I_0(x*), and α_j ≥ 0, j ∈ I_B(x*), such that

∇f(x*) + Jh(x*)^T λ − ∑_{j ∈ I_0(x*)} Jg_j(x*)^T μ_j − ∑_{j ∈ I_B(x*)} α_j ∇φ_j(x*) = 0,          (7)

where φ_j(x) := ([g_j(x)]_0^2 − ‖ḡ_j(x)‖^2)/2 for all j ∈ I_B(x*). Indeed, it is straightforward to check that ∇φ_j(x) = Jg_j(x)^T R_{m_j} g_j(x), and the multipliers μ_j for j ∈ I_B(x*) are recovered from (6).

The use of the mappings φ_j is a consequence of applying the reduction approach to problem (2). Actually, condition (7) is simply the KKT conditions at x* for a locally equivalent version of problem (2) in which the constraints g_j(x) ∈ K_{m_j} are replaced by φ_j(x) ≥ 0 when j ∈ I_B(x*), and are omitted when j ∈ I_int(x*). For the sake of completeness, this reduced equivalent problem is explicitly stated below:

Minimize f(x),
s.t. h_i(x) = 0, i = 1, ..., p,          (8)
     g_j(x) ∈ K_{m_j}, j ∈ I_0(x*),
     φ_j(x) ≥ 0, j ∈ I_B(x*).

Despite its apparent simplicity in the SOCP setting, the reduction approach is a key tool in conic programming. It permits obtaining first- and second-order optimality conditions and simplifying some well-known CQs, among other crucial properties. See [13, Section 3.4.4] and [12, Section 4] for more details. Throughout this article we will use the KKT condition (7) and problem (8) to adapt CQ conditions from NLP to the SOCP setting (2).

One of the most used (and strongest) conditions to guarantee the existence of a Lagrange multiplier at a local minimizer x* is the nondegeneracy condition. Thanks to the reduction approach (cf. [13, Equation 4.172]), this condition can be equivalently defined as follows: Definition 3.1.
Let x* be a feasible point of (2). Consider all the row vectors of the matrices Jh(x*) and Jg_j(x*), j ∈ I_0(x*), together with the row vectors ∇φ_j(x*)^T, j ∈ I_B(x*). We say that nondegeneracy holds at x* when these vectors are linearly independent.

The nondegeneracy condition implies the existence and uniqueness of a Lagrange multiplier at a local minimizer x*, and the converse is true provided that (x*, λ, μ) (with (λ, μ) ∈ Λ(x*)) is strictly complementary, that is, g_j(x*) + μ_j ∈ int(K_{m_j}) for all j = 1, ..., ℓ; see [13, Proposition 4.75]. Thus, nondegeneracy is the analogue of LICQ from nonlinear programming. Note that there are other definitions of nondegeneracy, e.g., [1, Definition 18] and [12, Definition 16]. However, all these definitions coincide in the case of the SOCP problem (2). We address the reader to [12, Section 4] for more details about nondegeneracy in the context of SOCP.

As with LICQ in NLP, the nondegeneracy condition is often considered too strong. For this reason, one typically assumes a weaker condition, called Robinson's CQ, which was originally defined in [22] for a general conic setting. In our SOCP setting, we can use the characterizations given in [13, Proposition 2.97, Corollary 2.98 and Lemma 2.99] to obtain the following equivalent definition: Definition 3.2.
Let x* be a feasible point of (2). We say that Robinson's CQ holds at x* if

Jh(x*)^T λ + ∑_{j=1}^ℓ Jg_j(x*)^T μ_j = 0, with λ ∈ R^p, μ_j ∈ K_{m_j}, ⟨μ_j, g_j(x*)⟩ = 0, j = 1, ..., ℓ
⇒ λ = 0 and μ_j = 0, j = 1, ..., ℓ.          (9)

As in NLP, when x* is assumed to be a local solution of (2), Robinson's CQ (9) is equivalent to saying that the set of Lagrange multipliers Λ(x*) is nonempty and compact (cf. [13, Props. 3.9 and 3.17]). In this sense, condition (9) can be seen as an extension of the Mangasarian-Fromovitz CQ in NLP to the SOCP setting (2), written in a dual form.

Thanks to (6), condition (9) can be rewritten as follows:

Jh(x*)^T λ + ∑_{j ∈ I_0(x*)} Jg_j(x*)^T μ_j + ∑_{j ∈ I_B(x*)} α_j ∇φ_j(x*) = 0, with λ ∈ R^p, μ_j ∈ K_{m_j}, j ∈ I_0(x*), and α_j ≥ 0, j ∈ I_B(x*)
⇒ λ = 0; μ_j = 0, j ∈ I_0(x*); α_j = 0, j ∈ I_B(x*).          (10)

As we will see in the forthcoming sections, condition (10) best fits our analysis.

Note that (10) can be interpreted as a conic linear independence of the (transposed) Jacobians and gradients involved in its definition. Indeed, given a finite number of closed convex cones C_j and denoting by ∏_j C_j the Cartesian product of these sets, we say that a corresponding set of matrices V_j of appropriate dimensions is ∏_j C_j-linearly independent if

∑_j V_j s_j = 0 and −s_j ∈ C_j° for all j ⇒ s_j = 0 for all j.

Then, (10) coincides with the {0_p} × ∏_{j ∈ I_0(x*)} K_{m_j} × R_+^{|I_B(x*)|}-linear independence of the matrices Jh(x*)^T, Jg_j(x*)^T with j ∈ I_0(x*), and ∇φ_j(x*) with j ∈ I_B(x*). Here, 0_p denotes the null vector in R^p. Moreover, when C_j = R_+ for all j in the definition above (and consequently, each matrix V_j is simply a column vector), ∏_j C_j-linear independence coincides with the well-known positive linear independence.
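The gradient formula ∇φ_j(x) = Jg_j(x)^T R_{m_j} g_j(x) that underlies (10) can be confirmed numerically. The smooth map g: R² → R³ below is an arbitrary illustrative choice of ours, not from the paper; the check compares the formula against central finite differences:

```python
import numpy as np

R = np.diag([1.0, -1.0, -1.0])   # reflection matrix R_m for m = 3

def g(x):
    # an arbitrary smooth map R^2 -> R^3, chosen only for illustration
    return np.array([x[0]**2 + x[1] + 2.0, x[0] - x[1], x[0] * x[1]])

def Jg(x):
    # Jacobian of g (rows are the gradients of the components of g)
    return np.array([[2 * x[0], 1.0],
                     [1.0, -1.0],
                     [x[1], x[0]]])

def phi(x):
    # reduction map: ([g(x)]_0^2 - ||ḡ(x)||^2) / 2
    z = g(x)
    return 0.5 * (z[0]**2 - z[1]**2 - z[2]**2)

x = np.array([0.3, -0.7])
grad_formula = Jg(x).T @ (R @ g(x))        # ∇φ(x) = Jg(x)^T R g(x)
eps = 1e-6
grad_fd = np.array([(phi(x + eps * e) - phi(x - eps * e)) / (2 * eps)
                    for e in np.eye(2)])   # central differences
assert np.allclose(grad_formula, grad_fd, atol=1e-5)
```

The identity is just the chain rule applied to φ(x) = ½ g(x)^T R g(x), since R is symmetric.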
Then, condition (10) recalls the characterization of the Mangasarian-Fromovitz CQ given by the positive linear independence of the gradients of the active constraints (after replacing each equality constraint h_i(x) = 0 by the two inequalities h_i(x) ≥ 0 and h_i(x) ≤ 0). Note that the analogous condition written without the reduction approach, namely the {0_p} × ∏_{j=1,...,ℓ} K_{m_j}-linear independence of the matrices Jh(x*)^T and Jg_j(x*)^T with j = 1, ..., ℓ, is strictly stronger than Robinson's CQ (9). This again shows how useful the reduction approach is for our analysis. Given the analysis above, when Robinson's CQ fails, we say that the corresponding matrices in (10) are conic linearly dependent.

4 The AKKT approach

For the nonlinear programming problem (1), the following
Approximate-KKT (AKKT) necessary optimality condition [5] is well known:
Theorem 4.1.
Let x* be a local minimizer of (1). Then, there exist sequences {x^k} ⊂ R^n, {λ^k} ⊂ R^p, and {μ^k} ⊂ R^q_+ such that x^k → x* and

∇f(x^k) + ∑_{i=1}^p λ_i^k ∇h_i(x^k) + ∑_{j ∈ A(x*)} μ_j^k ∇g_j(x^k) → 0.          (11)

Here we set μ_j^k := 0 for j ∉ A(x*). Note that this does not require any constraint qualification at all, and the sequence of approximate Lagrange multipliers {(λ^k, μ^k)} may be unbounded. If the sequence has a bounded subsequence, one may take a convergent subsequence such that the KKT conditions hold. In the unbounded case, one may define M^k := max{ |λ_i^k|, i = 1, ..., p; μ_j^k, j ∈ A(x*) } → +∞ and divide the expression in (11) by M^k. Thus, one may take an appropriate subsequence such that

λ^k / M^k → λ ∈ R^p and μ_j^k / M^k → μ_j ≥ 0, j ∈ A(x*),

obtaining the existence of scalars λ_i, i = 1, ..., p, and μ_j ≥ 0, j ∈ A(x*), not all equal to zero, satisfying

∑_{i=1}^p λ_i ∇h_i(x*) + ∑_{j ∈ A(x*)} μ_j ∇g_j(x*) = 0.

That is, the gradients of the equality constraints and of the active inequality constraints are positively linearly dependent. This provides a simple proof of the existence of Lagrange multipliers under the Mangasarian-Fromovitz CQ (MFCQ). A very similar argument shows that the set of Lagrange multipliers at x* is bounded if, and only if, MFCQ holds.

In order to go beyond MFCQ in nonlinear programming, one relies on the well-known Carathéodory's Lemma, as stated in [17]:
Lemma 4.1.
Let v_1, ..., v_{p+q} ∈ R^n be such that {v_i}_{i=1}^p are linearly independent. Consider scalars β_i, i = 1, ..., p + q, and denote y := ∑_{i=1}^{p+q} β_i v_i. Then, there exist J ⊆ {p + 1, ..., p + q} and scalars β̂_i, i ∈ {1, ..., p} ∪ J, such that {v_i}_{i ∈ {1,...,p} ∪ J} are linearly independent, β_i > 0 implies β̂_i > 0 for all i ∈ J, and y = ∑_{i ∈ {1,...,p} ∪ J} β̂_i v_i.

Thus, in order to prove that CRCQ (and its weaker variants) is a CQ for the nonlinear programming problem (1), we apply Carathéodory's Lemma to (11). This yields

∇f(x^k) + ∑_{i ∈ I^k} λ̃_i^k ∇h_i(x^k) + ∑_{j ∈ J^k} μ̃_j^k ∇g_j(x^k) → 0,

with I^k ⊆ {1, ..., p}, J^k ⊆ A(x*), μ̃_j^k ≥ 0, j ∈ J^k, and such that the vectors of the set {∇h_i(x^k)}_{i ∈ I^k} ∪ {∇g_j(x^k)}_{j ∈ J^k} are linearly independent for all k. Here, by the infinite pigeonhole principle and passing to a subsequence if necessary, the index subsets I^k and J^k can be taken as fixed, not depending on k. Then, the AKKT approach described above is followed similarly. It is worth emphasizing here that the application of Carathéodory's Lemma preserves the sign of the candidate multipliers, that is, μ̃_j^k has the same sign as μ_j^k. This is a crucial step, which does not clearly extend to the conic case (see [3]). Note that if {∇h_i(x^k)}_{i=1}^p is linearly independent for all k, we may take I^k = {1, ..., p}, which will be relevant in our analysis.

In the sequel, we will use the extension of the AKKT necessary optimality condition to second-order cone programming (2), as presented in [4]: Theorem 4.2.
Let x* be a local minimizer of (2). Then, there exist sequences {x^k} ⊂ R^n, {λ^k} ⊂ R^p, {μ_j^k} ⊂ K_{m_j}, j ∈ I_0(x*), and {α_j^k} ⊂ R_+, j ∈ I_B(x*), such that x^k → x* and

∇f(x^k) + Jh(x^k)^T λ^k − ∑_{j ∈ I_0(x*)} Jg_j(x^k)^T μ_j^k − ∑_{j ∈ I_B(x*)} α_j^k ∇φ_j(x^k) → 0.          (12)

5 Naive constant rank-type conditions for second-order cone programming

Following the previous discussion, we present a "naive" formulation of constant rank constraint qualifications for the second-order cone programming problem (2).
Definition 5.1.
Let x* be a feasible point of problem (2) and let I ⊆ {1, ..., p} be such that {∇h_i(x*)}_{i ∈ I} is a basis of the linear space generated by the vectors {∇h_i(x*)}_{i=1}^p. We say that the Relaxed Constant Positive Linear Dependence (RCPLD) condition holds at x* when, for all J ⊆ I_B(x*), there exists a neighborhood V of x* such that:

• {∇h_i(x)}_{i=1}^p has constant rank for all x in V;

• if the system

∑_{i ∈ I} λ_i ∇h_i(x*) + ∑_{j ∈ I_0(x*)} Jg_j(x*)^T μ_j + ∑_{j ∈ J} α_j ∇φ_j(x*) = 0,
λ_i ∈ R, i ∈ I; μ_j ∈ K_{m_j}, j ∈ I_0(x*); α_j ≥ 0, j ∈ J,

has a not-all-zero solution (λ_i)_{i ∈ I}, (μ_j)_{j ∈ I_0(x*)}, (α_j)_{j ∈ J}, then the vectors {∇h_i(x)}_{i ∈ I} ∪ {∇φ_j(x)}_{j ∈ J} are linearly dependent for all x in V.

Note that Robinson's CQ implies RCPLD, since it states the conic linear independence of the corresponding sets (and thus of all their subsets), while RCPLD allows conic linear dependence, as long as the linear dependence of a reduced subset is maintained in a neighborhood.

The definition above takes into account our inability to relax Robinson's CQ for cones K_{m_j} with j ∈ I_0(x*), as the linear dependence for x near x* is required only for the equalities and for the constraints at the non-zero boundary. Indeed, note that in the case when I_B(x*) = ∅ and no equalities are considered (i.e., p = 0), RCPLD coincides with Robinson's CQ. This is unnecessarily demanding when m_j = 1 for some j ∈ I_0(x*): in such a case, the associated constraint g_j(x) ∈ K_{m_j} corresponds to an inequality constraint of the form g_j(x) ≥ 0, which is active at x*. Hence, the RCPLD definition can be slightly modified to take this situation into account as follows: define A(x*) := { j ∈ I_0(x*) | m_j = 1 } and remove those indices from I_0(x*), that is, define Ĩ_0(x*) := I_0(x*) \ A(x*). Indices in A(x*) can thus be treated similarly to those in I_B(x*). So, by defining φ_j(x) := g_j(x) when j ∈ A(x*), a slightly weaker version of RCPLD can be obtained by replacing I_0(x*) by Ĩ_0(x*) and I_B(x*) by I_B(x*) ∪ A(x*) in Definition 5.1. Since this modification has no consequence in the proof of Theorem 5.1, we do not include it in its statement.

The point raised in the last paragraph explains why Definition 5.1 is considered a "naive" extension of a constant rank-type condition. Before proving that RCPLD is a CQ for problem (2), we make further observations related to this point.

Remark 5.1. a) When we choose J = ∅ in Definition 5.1, we necessarily obtain that there is no non-zero solution (λ_i, μ_j), with i ∈ I and j ∈ I_0(x*), to the system:

∑_{i ∈ I} λ_i ∇h_i(x*) + ∑_{j ∈ I_0(x*)} Jg_j(x*)^T μ_j = 0, with λ_i ∈ R, i ∈ I, and μ_j ∈ K_{m_j}, j ∈ I_0(x*).

This is equivalent to saying that Robinson's CQ holds at x* for the constraint set Γ := { x | h_i(x) = 0, i ∈ I; g_j(x) ∈ K_{m_j}, j ∈ I_0(x*) }. So, RCPLD ensures that Robinson's CQ is fulfilled at x* for the active set Γ. Actually, by using the slight modification discussed above, we can exclude standard nonlinear constraints from I_0(x*) and conclude that it only implies the weaker condition: Robinson's CQ holds at x* for the constraint set Γ̃ := { x | h_i(x) = 0, i ∈ I; g_j(x) ∈ K_{m_j}, j ∈ I_0(x*), m_j > 1 }.

b) Consider the case when problem (2) reduces to NLP (1), that is, Ĩ_0(x*) = ∅ and I_B(x*) = ∅. Then, RCPLD in Definition 5.1 reduces to the respective definition for nonlinear programming [8].
In particular, by enlarging the system to include α_j ∈ R, j ∈ J, instead of only considering α_j ≥ 0, j ∈ J, the definition reduces to an equivalent characterization (see [8]) of RCRCQ: {∇h_i(x)}_{i=1}^p has constant rank for x around x*, and for all J ⊆ A(x*), if the set {∇h_i(x*)}_{i ∈ I} ∪ {∇φ_j(x*)}_{j ∈ J} is linearly dependent, then {∇h_i(x)}_{i ∈ I} ∪ {∇φ_j(x)}_{j ∈ J} must remain linearly dependent for all x in a neighborhood of x* (here, the set I is fixed as in Definition 5.1). The latter also explains why RCPLD, given in Definition 5.1, is considered a constant rank-type condition for problem (2).

c) Differently from the definitions of nondegeneracy and Robinson's CQ, the choice of the reduction function φ(·) gives rise to different constant rank conditions. For instance, one could formulate a similar, but different, condition by considering the alternative reduction function φ̃_j(x) := [g_j(x)]_0 − ‖ḡ_j(x)‖ for j ∈ I_B(x*). This is a well-known fact in nonlinear programming, where it is established that when a constraint set satisfies CRCQ, it can be rewritten in such a way that it fulfills Robinson's CQ [16]. Theorem 5.1.
Let x* be a feasible point of problem (2) satisfying the AKKT condition (12) and RCPLD. Then, the KKT conditions hold at x*. In particular, RCPLD is a constraint qualification.

Proof. The AKKT condition (12) ensures the existence of sequences {x^k} ⊂ R^n, {λ^k} ⊂ R^p, {μ_j^k} ⊂ K_{m_j}, j ∈ I_0(x*), and {α_j^k} ⊂ R_+, j ∈ I_B(x*), such that x^k → x* and

∇f(x^k) + ∑_{i=1}^p λ_i^k ∇h_i(x^k) − ∑_{j ∈ I_0(x*)} Jg_j(x^k)^T μ_j^k − ∑_{j ∈ I_B(x*)} α_j^k ∇φ_j(x^k) → 0.

By the constant rank assumption on the equality constraints, and the definition of I, we may rewrite ∑_{i=1}^p λ_i^k ∇h_i(x^k) = ∑_{i ∈ I} λ̃_i^k ∇h_i(x^k) for new scalars λ̃_i^k ∈ R, i ∈ I, such that the vectors {∇h_i(x^k)}_{i ∈ I} are linearly independent. Applying Carathéodory's Lemma, for each k, we get J^k ⊆ I_B(x*) and new scalars λ̂_i^k ∈ R, i ∈ I, and α̂_j^k ≥ 0, j ∈ J^k, such that

∇f(x^k) + ∑_{i ∈ I} λ̂_i^k ∇h_i(x^k) − ∑_{j ∈ I_0(x*)} Jg_j(x^k)^T μ_j^k − ∑_{j ∈ J^k} α̂_j^k ∇φ_j(x^k) → 0,          (13)

and the vectors {∇h_i(x^k)}_{i ∈ I} ∪ {∇φ_j(x^k)}_{j ∈ J^k} are linearly independent. By the infinite pigeonhole principle, without loss of generality we can consider subsequences, renamed as the original ones, for which the sets J^k are the same for all k. This set is denoted by J.

Define M^k := max{ |λ̂_i^k|, i ∈ I; ‖μ_i^k‖, i ∈ I_0(x*); α̂_j^k, j ∈ J }. If {M^k} is bounded, any accumulation point of {λ̂_i^k, i ∈ I; μ_i^k, i ∈ I_0(x*); α̂_j^k, j ∈ J} (after setting to 0 the values for indices that are neither in I nor in J) satisfies (7). Hence, x* is a KKT point of (2). Otherwise, we may take a subsequence such that M^k → +∞ and divide the expression in (13) by M^k, considering convergent subsequences such that

−λ̂_i^k / M^k → λ_i ∈ R, i ∈ I;   μ_j^k / M^k → μ_j ∈ K_{m_j}, j ∈ I_0(x*);   α̂_j^k / M^k → α_j ≥ 0, j ∈ J,

with (λ_i, μ_j, α_j) ≠ 0, and obtaining

∑_{i ∈ I} λ_i ∇h_i(x*) + ∑_{j ∈ I_0(x*)} Jg_j(x*)^T μ_j + ∑_{j ∈ J} α_j ∇φ_j(x*) = 0.

Then, since the vectors {∇h_i(x^k)}_{i ∈ I} ∪ {∇φ_j(x^k)}_{j ∈ J} are linearly independent, this contradicts the definition of RCPLD. □

The exact definition of RCPLD in nonlinear programming can be consulted in [8]. The definitions of CRCQ [15], RCRCQ [18], and CPLD [20] may be analogously extended; they are omitted. We only introduce the extension of CRSC [7] to this SOCP setting, since its definition is more involved and differs from its nonlinear programming counterpart. For the sake of completeness, the definition of CRSC considers the sets Ĩ_0(x*) and A(x*). To prove that CRSC is a CQ, it is enough to follow the proof of Theorem 5.1, so the proof is omitted. Definition 5.2.
Let x* be a feasible point of (2) and let J_−(x*) ⊆ I_B(x*) ∪ A(x*) be defined as

J_−(x*) := { j ∈ I_B(x*) ∪ A(x*) | −∇φ_j(x*) = ∑_{i=1}^p λ_i ∇h_i(x*) + ∑_{j′ ∈ I_B(x*) ∪ A(x*)} α_{j′} ∇φ_{j′}(x*), for some λ_i ∈ R, α_{j′} ≥ 0 }.

Set J_+(x*) := (I_B(x*) ∪ A(x*)) \ J_−(x*). We also define I ⊆ {1, ..., p} and J ⊆ J_−(x*) such that {∇h_i(x*)}_{i ∈ I} ∪ {∇φ_j(x*)}_{j ∈ J} is a basis of the linear space generated by {∇h_i(x*)}_{i=1}^p ∪ {∇φ_j(x*)}_{j ∈ J_−(x*)}. We say that the Constant Rank of the Subspace Component (CRSC) condition holds at x* when there exists a neighborhood V of x* such that:

• {∇h_i(x)}_{i=1}^p ∪ {∇φ_j(x)}_{j ∈ J_−(x*)} has constant rank for all x in V;

• the system

∑_{i ∈ I} λ_i ∇h_i(x*) + ∑_{j ∈ Ĩ_0(x*)} Jg_j(x*)^T μ_j + ∑_{j ∈ J ∪ J_+(x*)} α_j ∇φ_j(x*) = 0,
λ_i ∈ R, i ∈ I; μ_j ∈ K_{m_j}, j ∈ Ĩ_0(x*); α_j ∈ R, j ∈ J; α_j ≥ 0, j ∈ J_+(x*),

has only the trivial solution.

Note that when Ĩ_0(x*) = ∅, the second requirement in the definition of CRSC always holds [7].

As said above, both definitions, RCPLD and CRSC, are "naive" in the sense that they do not improve on Robinson's CQ regarding multi-dimensional cones at zero. That is, when all constraint indices belong to Ĩ_0(x*), both definitions coincide with Robinson's CQ (9). However, the example below shows that RCPLD and CRSC are strictly weaker than Robinson's CQ: Example 5.1.
Consider the constraint set defined by

g(x) := (g_1(x), g_2(x)) := (x, x) ∈ K_2,

where x is one-dimensional. Clearly, x* = 0 is feasible and the single constraint is in the boundary, i.e., I_B(x*) is the only nonempty index set. The reduced constraint is such that φ(x) := g_1(x)² − g_2(x)² = 0 for all x. Then, it follows that ∇φ(x*) = 0 and, consequently, Robinson’s CQ fails. However, ∇φ(x) = 0 for all x, which implies that RCPLD holds. CRSC also holds by noting that the reduced constraint belongs to the index set J_−(x*), whose gradient has constant rank, and Ĩ(x*) = ∅, which is sufficient for ensuring the second condition. Indeed, J = ∅ is a basis for the linear space generated by the constraint gradient in J_−(x*), and the result follows from the linear independence of the empty set.

Extension to semidefinite programming
Consider the semidefinite programming (SDP) problem with multiple constraints:

Minimize f(x), s.t. h(x) = 0, g_j(x) ∈ S^{m_j}_+, j = 1,...,ℓ, (14)

where f: R^n → R, h: R^n → R^p, and g_j: R^n → S^{m_j} are continuously differentiable functions, S^{m_j} is the linear space of m_j × m_j real symmetric matrices equipped with the inner product A · B := trace(AB), where trace(AB) denotes the sum of the diagonal elements of AB for all matrices A, B ∈ S^{m_j}, and S^{m_j}_+ := { M ∈ S^{m_j} | z^T M z ≥ 0, ∀z ∈ R^{m_j} } is the closed convex cone of all positive semidefinite elements of S^{m_j}, for all j = 1,...,ℓ. We denote by ⪯_j the partial order relation induced by S^{m_j}_+, that is, A ⪯_j B if, and only if, B − A ∈ S^{m_j}_+. For the sake of notation, the index j is omitted throughout the paper and this order relation is simply denoted by ⪯. The order relations ⪰, ≻, and ≺ are defined similarly.

We end this subsection by recalling the Karush-Kuhn-Tucker conditions in the SDP framework. We say that the KKT conditions hold at a feasible point x* of problem (14) when there exist Lagrange multipliers λ ∈ R^p and µ_j ∈ S^{m_j}_+, j = 1,...,ℓ, such that

∇f(x*) + Jh(x*)^T λ − ∑_{j=1}^ℓ Jg_j(x*)^T µ_j = 0, (15a)
g_j(x*) · µ_j = 0, j = 1,...,ℓ, (15b)

with

Jg_j(x*)^T z := ( ∂_1 g_j(x*) · z, ..., ∂_n g_j(x*) · z )^T, ∀z ∈ S^{m_j},

where ∂_i g_j(x*) is the partial derivative of g_j with respect to the variable x_i at x*, for each i = 1,...,n. In fact, Jg_j(x*)^T is the adjoint of the linear mapping Jg_j(x*), defined by

Jg_j(x*) d := ∑_{i=1}^n d_i ∂_i g_j(x*),

for all d = (d_1,...,d_n)^T ∈ R^n, j = 1,...,ℓ. The constraint qualification conditions recalled in Section 3 for SOCP have also been well established for the SDP problem (14).
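The adjoint relation between Jg_j(x*) and Jg_j(x*)^T above can be checked numerically on a toy block. The following pure-Python sketch is our own illustration, not from the paper: the map g and all numbers are hypothetical stand-ins. It verifies that ⟨Jg(x)d, µ⟩ = ⟨d, Jg(x)^T µ⟩ under the trace inner product A · B = trace(AB):

```python
# Pure-Python sketch (our illustration, not from the paper): check that
# Jg(x)^T, defined entrywise by (Jg(x)^T mu)_i = d_i g(x) . mu with
# A . B = trace(AB), is the adjoint of Jg(x) d = sum_i d_i * d_i g(x).

def trace_inner(A, B):
    """A . B = trace(A B) for matrices given as nested lists."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))

# Partial derivatives of the toy map g(x) = [[x1, x2], [x2, x1]]:
dg = [[[1.0, 0.0], [0.0, 1.0]],   # d g / d x1
      [[0.0, 1.0], [1.0, 0.0]]]   # d g / d x2

def Jg_apply(d):
    """Jg(x) d = sum_i d_i * (d g / d x_i), a symmetric 2x2 matrix."""
    return [[sum(d[i] * dg[i][r][c] for i in range(2)) for c in range(2)]
            for r in range(2)]

def Jg_adjoint(mu):
    """(Jg(x)^T mu)_i = (d g / d x_i) . mu, a vector in R^2."""
    return [trace_inner(dg[i], mu) for i in range(2)]

d = [0.3, -1.2]
mu = [[2.0, 0.5], [0.5, 1.0]]
lhs = trace_inner(Jg_apply(d), mu)                        # <Jg(x) d, mu> in S^2
rhs = sum(di * wi for di, wi in zip(d, Jg_adjoint(mu)))   # <d, Jg(x)^T mu> in R^n
assert abs(lhs - rhs) < 1e-12
```

Since both sides are linear in d and µ, agreement on a generic pair of inputs is a reasonable smoke test of the bookkeeping.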
In this section, we start by quickly recalling Robinson’s CQ before proceeding with the study of the nondegeneracy condition, which needs more attention for our purposes. As in the SOCP setting, Robinson’s CQ [22] can be equivalently characterized, in its dual form, via the properties established in [13, Proposition 2.97, Corollary 2.98 and Lemma 2.99]:
Definition 6.1.
We say that
Robinson’s CQ holds at a feasible point x* of problem (14) when

Jh(x*)^T λ + ∑_{j=1}^ℓ Jg_j(x*)^T µ_j = 0,
g_j(x*) · µ_j = 0, ∀j = 1,...,ℓ,
µ_j ∈ S^{m_j}_+, ∀j = 1,...,ℓ,
⇒ λ = 0 and µ_j = 0, ∀j = 1,...,ℓ. (16)

As in SOCP, Robinson’s CQ is considered the natural extension of the Mangasarian-Fromovitz CQ from NLP to the SDP setting. Actually, when x* is assumed to be a local solution of (14), Robinson’s CQ (16) is equivalent to saying that the set of Lagrange multipliers Λ(x*) is nonempty and compact (cf. [13, Props. 3.9 and 3.17]).

Let us now recall the nondegeneracy condition in the SDP context. The notion of nondegeneracy (called transversality therein) was introduced by Shapiro and Fan in [25, Section 2] by means of tangent spaces in the context of eigenvalue optimization. An equivalent form is proven in [13, Equation (4.172)] for reducible cones. This is adopted as a formal definition in our multifold SDP setting:

Definition 6.2. We say that a feasible point x* of problem (14) is nondegenerate when the following relation is satisfied:

Im A(x*) + {0} × ∏_{j=1}^ℓ lin( T_{S^{m_j}_+}(g_j(x*)) ) = R^p × ∏_{j=1}^ℓ S^{m_j}, (17)

where A(x*) := ( Jh(x*); Jg_j(x*), j = 1,...,ℓ ) is a linear mapping from R^n to R^p × ∏_{j=1}^ℓ S^{m_j}.

As it happens in SOCP, the nondegeneracy condition is considered to be a natural analogue of LICQ from NLP in SDP. Actually, the nondegeneracy condition (17) implies the existence and uniqueness of a Lagrange multiplier at a local minimizer x*, and the converse is true provided that (x*, λ, µ) (with (λ, µ) ∈ Λ(x*)) is strictly complementary, that is, g_j(x*) + µ_j ≻ 0, j = 1,...,ℓ; see [13, Proposition 4.75]. However, this analogy only makes sense when the matrix blocks g_j(x*) are chosen in a “minimal” way, in the sense of avoiding zeros in the off-diagonal entries. In particular, an NLP problem with ℓ inequality constraints should be modeled as an instance of (14) with m_1 = ... = m_ℓ = 1. Only in that case does nondegeneracy coincide with LICQ.

To stress the point above, we recall below some results from [11, Section 5]. Consider the NLP problem of minimizing f(x) under two constraints, g_1(x) ≥ 0 and g_2(x) ≥ 0, where f, g_1, and g_2 are smooth real-valued functions. Let x* be a local minimum for which g_1(x*) = g_2(x*) = 0 and ∇g_1(x*) and ∇g_2(x*) are linearly independent (that is, LICQ holds). Denote by µ̄_1 and µ̄_2 the unique associated Lagrange multipliers, and assume that strict complementarity holds: µ̄_i > 0, i = 1, 2. If this NLP problem is written as the following SDP problem

Minimize f(x), s.t. [ g_1(x) 0; 0 g_2(x) ] ∈ S^2_+, (18)

then the nondegeneracy condition (17) never holds. Indeed, the Lagrange multiplier associated with x* for the reformulated problem (18) is never unique. It is enough to note that the matrix

µ̄ := [ µ̄_1 0; 0 µ̄_2 ]

is an associated Lagrange multiplier, as well as µ̄ + t [ 0 1; 1 0 ], for any t ∈ R such that t² ≤ µ̄_1 µ̄_2. Of course, this apparent inconsistency occurs not only for diagonal matrices but also for any SDP problem with a diagonal structure (see, e.g., [11, Lemma 5.1]), and it is due to an inappropriate modeling decision regarding the sparse structure of the studied SDP problem.

On the other hand, this phenomenon does not occur with Robinson’s CQ, which is always preserved independently of the block structure of the SDP constraint set. This may be one of the reasons why multifold SDP is not often taken into consideration in the literature, along with the fact that interior-point methods are known to be capable of exploiting block-diagonal structure (see Gondzio’s review [14] and the references therein for details). It is not expected, though, that every constraint qualification will be preserved between multifold and block-diagonal representations. In particular, the constraint qualifications we define in the next section exploit the multifold structure. In this context, they are strictly weaker than Robinson’s CQ, while for a single block-diagonal representation our conditions would reduce to Robinson’s CQ. Furthermore, since our analysis is related to AKKT sequences, which describe the output of many practical algorithms, our results provide a stronger convergence theory for those algorithms when applied to SDP problems under a multifold representation.

For more details about the nondegeneracy condition in the semidefinite programming context, see, e.g., [11, 24]. In particular, the nondegeneracy condition for multifold SDP given in Definition 6.2 and the discussion above are inspired by [11, Section 5]. In the next section we propose a naive RCPLD condition similar to Definition 5.1 for multifold SDP, as in (14).
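The non-uniqueness of the multiplier for the diagonal reformulation can be checked numerically. The sketch below uses hypothetical numbers of our own choosing (not from [11]): diagonal stand-ins for the partial derivatives of G(x) = diag(g_1(x), g_2(x)) and sample values for µ̄_1, µ̄_2. It confirms the two ingredients of the argument: the trace inner product of a diagonal matrix with the off-diagonal perturbation E vanishes (so stationarity is unaffected), and µ̄ + tE stays positive semidefinite while t² ≤ µ̄_1 µ̄_2:

```python
# Hypothetical numbers (not from [11]) illustrating the non-uniqueness above:
# in the block-diagonal reformulation, the partial derivatives of
# G(x) = diag(g1(x), g2(x)) are diagonal matrices, so the trace inner
# product cannot "see" the off-diagonal perturbation t*E, and
# mu_bar + t*E stays positive semidefinite while t^2 <= mu1*mu2.

import math

def eigvals_sym2(M):
    """Closed-form eigenvalues of a symmetric 2x2 matrix, smallest first."""
    a, b, c = M[0][0], M[0][1], M[1][1]
    mid, rad = (a + c) / 2.0, math.hypot((a - c) / 2.0, b)
    return (mid - rad, mid + rad)

def trace_prod(A, B):
    return sum(A[i][j] * B[j][i] for i in range(2) for j in range(2))

mu1, mu2 = 2.0, 0.5                 # stand-ins for the unique NLP multipliers
E = [[0.0, 1.0], [1.0, 0.0]]
dG = [[[1.0, 0.0], [0.0, 0.0]],     # stand-in for dG/dx1: diagonal
      [[0.0, 0.0], [0.0, 1.0]]]     # stand-in for dG/dx2: diagonal

# The adjoint of JG(x*) maps t*E to zero, so stationarity is unaffected:
assert all(abs(trace_prod(dGi, E)) < 1e-12 for dGi in dG)

for t in (-1.0, -0.5, 0.0, 0.5, 1.0):      # all satisfy t**2 <= mu1 * mu2
    mu = [[mu1, t], [t, mu2]]
    assert eigvals_sym2(mu)[0] >= -1e-12   # mu_bar + t*E remains PSD
```

At t = ±1 the bound t² = µ̄_1 µ̄_2 is attained and the perturbed multiplier sits exactly on the boundary of the PSD cone.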
We note that CPLD has already been used in the context of SDP problems in [26]; however, those authors consider the application of an augmented Lagrangian method to a mixed problem with SDP constraints and NLP constraints, where the NLP constraints are not penalized and are carried over to the subproblems. Hence, the usual CPLD is assumed for the NLP-constrained subproblems, in the context of feasibility results, while Robinson’s CQ is assumed for the full problem in the context of optimality results. In particular, no CPLD-type CQ is introduced for the full problem.

6.2 A constant rank condition for SDP

Denote the smallest eigenvalue of a matrix A by σ_min(A) and its associated unit eigenvectors by ν_min(A) and −ν_min(A). It is known that σ_min is continuously differentiable at A when σ_min(A) is simple, i.e., when it has algebraic multiplicity equal to one, and that Jσ_min(A) = ν_min(A) ν_min(A)^T in this case (see, e.g., [25]). So, given a local minimizer x*, the composition σ_min ∘ g_j is a reduction mapping for the block j when σ_min(g_j(x*)) is simple, playing a role similar to that of φ_j(x) for problem (8). Also, in this scenario,

∇(σ_min(g_j(x))) = Jg_j(x)^T Jσ_min(g_j(x)) (19)

when x is close enough to x*. This motivates us to define an analogue of problem (8) for SDP as follows:

Minimize f(x), s.t. h(x) = 0, g_j(x) ∈ S^{m_j}_+, j ∈ I_N(x*), σ_min(g_j(x)) ≥ 0, j ∈ I_R(x*), (20)

where I_R(x*) := { j ∈ {1,...,ℓ} | 0 = σ_min(g_j(x*)) is simple } and I_N(x*) := { j ∈ {1,...,ℓ} | 0 = σ_min(g_j(x*)) is not simple }. Note that (20) is locally equivalent to (14) and that, for simplicity, we have removed from I_N(x*) all the constraints such that g_j(x*) ≻ 0.
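The eigenvalue-gradient formula Jσ_min(A) = ν_min(A) ν_min(A)^T behind (19) can be sanity-checked by finite differences. The sketch below is our own illustration on a hypothetical 2×2 matrix A (the closed-form 2×2 eigenvalue formulas are standard, and the smallest eigenvalue of the chosen A is simple, as the formula requires):

```python
# Finite-difference sanity check (our own illustration) of the formula
# J sigma_min(A) = nu_min(A) nu_min(A)^T at a 2x2 symmetric A whose
# smallest eigenvalue is simple: the slope of sigma_min along a symmetric
# direction D should match <nu_min nu_min^T, D> = trace(nu_min nu_min^T D).

import math

def sig_min(M):
    """Smallest eigenvalue of a symmetric 2x2 matrix."""
    a, b, c = M[0][0], M[0][1], M[1][1]
    return (a + c) / 2.0 - math.hypot((a - c) / 2.0, b)

def nu_min(M):
    """A unit eigenvector for the smallest eigenvalue (b != 0 assumed)."""
    a, b = M[0][0], M[0][1]
    v = (b, sig_min(M) - a)          # satisfies (M - sigma_min I) v = 0
    n = math.hypot(*v)
    return (v[0] / n, v[1] / n)

A = [[2.0, 1.0], [1.0, 0.0]]         # sigma_min(A) = 1 - sqrt(2), simple
D = [[0.3, -0.7], [-0.7, 1.1]]       # an arbitrary symmetric direction
h = 1e-6
Ah = [[A[i][j] + h * D[i][j] for j in range(2)] for i in range(2)]
fd = (sig_min(Ah) - sig_min(A)) / h  # finite-difference directional slope
v = nu_min(A)
P = [[v[i] * v[j] for j in range(2)] for i in range(2)]   # nu_min nu_min^T
exact = sum(P[i][j] * D[j][i] for i in range(2) for j in range(2))
assert abs(fd - exact) < 1e-4
```

When σ_min(A) is not simple, σ_min is not differentiable and such a check would fail along directions that split the repeated eigenvalue, which is exactly why the construction above is restricted to I_R(x*).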
Roughly speaking, our approach consists of defining a constraint qualification that relaxes Robinson’s CQ to a constant rank-type condition, but only at the constraints indexed by I_R(x*), which are the ones that are well behaved enough to be fully replaceable by a single real-valued constraint. As in the SOCP case, our strategy for proving that this is indeed a constraint qualification is based on sequential optimality conditions.

In [9], the AKKT condition was extended to SDP. Next, we present an adapted version of it for problems with mixed NLP and SDP constraints, such as (20):

Theorem 6.1.
Let x* be a local minimizer of (20). Then, there exist AKKT sequences {x^k} ⊂ R^n, {λ^k} ⊂ R^p, {α_j^k} ⊂ R_+, and {µ_j^k} ⊂ S^{m_j}_+ such that x^k → x* and

∇f(x^k) + Jh(x^k)^T λ^k − ∑_{j∈I_N(x*)} Jg_j(x^k)^T µ_j^k − ∑_{j∈I_R(x*)} α_j^k ∇σ_min(g_j(x^k)) → 0, (21)

σ_i(g_j(x*)) > 0 ⇒ σ_i(µ_j^k) → 0, i = 1,...,m_j, ∀j ∈ I_N(x*), (22)

where σ_i(µ_j^k) and σ_i(g_j(x*)) denote corresponding eigenvalues of µ_j^k and g_j(x*), respectively, with respect to ordered orthonormal eigenbases {ν_i(µ_j^k)}_{i=1}^{m_j} and {ν_i(g_j(x*))}_{i=1}^{m_j} such that ν_i(µ_j^k) → ν_i(g_j(x*)) for all i = 1,...,m_j and all j ∈ I_N(x*).

With this result at hand, we proceed in a manner similar to Definition 5.1 in order to extend the
Relaxed Constant Positive Linear Dependence (RCPLD) condition to SDP via problem (20).
Definition 6.3.
Let x* be feasible for problem (14) and let I ⊆ {1,...,p} be such that {∇h_i(x*)}_{i∈I} is a basis for the space spanned by {∇h_i(x*)}_{i=1}^p. We say that Relaxed Constant Positive Linear Dependence holds at x* when, for every J ⊆ I_R(x*), there exists a neighborhood V of x* such that:

• {∇h_i(x)}_{i=1}^p has constant rank for all x ∈ V;

• if the system

Jh(x*)^T λ + ∑_{j∈I_N(x*)} Jg_j(x*)^T µ_j + ∑_{j∈J} α_j ∇σ_min(g_j(x*)) = 0,
λ ∈ R^p, µ_j ⪰ 0, ∀j ∈ I_N(x*), α_j > 0, ∀j ∈ J,

has a nontrivial solution, then {∇h_i(x)}_{i∈I} ∪ {∇σ_min(g_j(x))}_{j∈J} is linearly dependent for every x ∈ V.
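The constant rank clause in such definitions can be probed numerically by sampling points near x* and recomputing the rank of the gradient family. The sketch below is a naive probe of our own (not a verification procedure from the paper); the two gradient maps are hypothetical stand-ins chosen to be parallel everywhere, so their family has constant rank 1 near x*:

```python
# A naive numerical probe (not a proof) of the constant rank clause:
# sample points near x* and compare the rank of the gradient family.
# The two gradient maps below are hypothetical stand-ins; they are
# parallel everywhere near x* = (0, 0), so the rank is constantly 1.

import random

def rank(rows, tol=1e-9):
    """Rank of a small real matrix via Gaussian elimination with pivoting."""
    rows = [list(r) for r in rows]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        piv = max(range(r, len(rows)),
                  key=lambda i: abs(rows[i][col]), default=None)
        if piv is None or abs(rows[piv][col]) < tol:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][col] / rows[r][col]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

grads = [lambda x: [1.0, x[0]],          # stand-in for one gradient map
         lambda x: [2.0, 2.0 * x[0]]]    # a second one, parallel to the first

random.seed(0)
x_star = [0.0, 0.0]
ranks = set()
for _ in range(100):
    # perturb x* and evaluate all gradients at the SAME sampled point
    x = [xi + random.uniform(-1e-3, 1e-3) for xi in x_star]
    ranks.add(rank([g(x) for g in grads]))
assert ranks == {1}                      # constant rank near x*
```

Sampling can only refute constant rank (by finding two nearby points with different ranks); it cannot certify it, which is why the definitions above are stated over a full neighborhood V.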
Next, we show that RCPLD is a constraint qualification using AKKT sequences (Theorem 6.1).
Theorem 6.2.
Let x* be a feasible point of problem (14) satisfying the AKKT condition (21) and RCPLD as stated in Definition 6.3. Then, the KKT conditions (15) hold at x*. In particular, RCPLD is a constraint qualification.

Proof. Let {x^k} → x*, {λ^k} ⊂ R^p, {α_j^k} ⊂ R_+, and {µ_j^k} ⊂ S^{m_j}_+ be sequences such that (21) and (22) hold. By the constant rank assumption and the definition of I, the set {∇h_i(x^k)}_{i∈I} is a basis for the space spanned by {∇h_i(x^k)}_{i=1}^p when k is large enough. Hence, for all such k, there are new scalars λ̃^k ∈ R^{|I|} such that

∑_{i=1}^p λ_i^k ∇h_i(x^k) = ∑_{i∈I} λ̃_i^k ∇h_i(x^k),

for all k. Set λ̃_i^k = 0 for i ∉ I. So, Jh(x^k)^T λ^k = Jh(x^k)^T λ̃^k for all k.

Also, thanks to Carathéodory’s Lemma (Lemma 4.1) applied to (21), for every fixed k there is a nonempty subset J^k ⊆ I_R(x*) such that {∇h_i(x^k)}_{i∈I} ∪ {∇σ_min(g_j(x^k))}_{j∈J^k} is linearly independent and, consequently, (21) can be rewritten as follows:

∇f(x^k) + Jh(x^k)^T λ̃^k − ∑_{j∈I_N(x*)} Jg_j(x^k)^T µ_j^k − ∑_{j∈J^k} α̃_j^k ∇σ_min(g_j(x^k)) → 0, (23)

for some α̃_j^k > 0, j ∈ J^k. Note that in this process the scalars λ̃_i^k, i ∈ I, also change, but we abuse notation by still denoting them by λ̃_i^k. Now, by the infinite pigeonhole principle, we can assume, without loss of generality, that J^k = J for all k ∈ N; that is, we can take a subsequence if necessary such that J^k does not vary with k.

Now, we claim that the sequences {λ̃^k}, {µ_j^k}, j ∈ I_N(x*), and {α̃_j^k}, j ∈ J, are bounded. Indeed, set M_k := max{ α̃_j^k, j ∈ J; ‖µ_j^k‖, j ∈ I_N(x*); ‖λ̃^k‖ } and suppose that {M_k} is unbounded. This implies, by passing to a subsequence if necessary, that −λ̃_i^k/M_k → λ_i ∈ R, i ∈ I; µ_j^k/M_k → µ_j ∈ S^{m_j}_+, j ∈ I_N(x*); α̃_j^k/M_k → α_j ≥ 0, j ∈ J, with (λ_i, µ_j, α_j) ≠ 0. Then, by dividing (21) by M_k and passing to the limit, we contradict RCPLD.

Finally, let µ̄_j ∈ S^{m_j}_+ (j ∈ I_N(x*)), ᾱ_j ≥ 0 (j ∈ I_R(x*)), and λ̄ be limit points of the sequences {µ_j^k} (j ∈ I_N(x*)), {α̃_j^k} (j ∈ I_R(x*)), and {λ̃^k}, respectively. Note that these limit points are Lagrange multipliers associated with x*. Indeed, by the definition of I_R(x*), we always have σ_min(g_j(x*)) ᾱ_j = 0 for all j ∈ I_R(x*). So, for each j ∈ I_R(x*) the matrix µ̄_j := ᾱ_j ν_min(g_j(x*)) ν_min(g_j(x*))^T is positive semidefinite and satisfies Jg_j(x*)^T µ̄_j = ᾱ_j ∇σ_min(g_j(x*)) (cf. (19)). Additionally, set µ̄_j := 0 whenever j is such that g_j(x*) ≻ 0. Then, it follows from (21) that

∇f(x*) + Jh(x*)^T λ̄ − ∑_{j=1}^ℓ Jg_j(x*)^T µ̄_j = 0,

which together with (22) implies that g_j(x*) · µ̄_j = 0 for all j. The desired result follows.

The CRSC condition can also be extended in a very similar manner. That is, we treat the conic constraints that “look like equality constraints” near the feasible point x* as equality constraints, which means it is not necessary to consider the rank-type structure of every subset of their gradients, but only of one fixed set. To formalize our analysis, we define the set

J_−(x*) := { j ∈ I_R(x*) | −∇σ_min(g_j(x*)) = ∑_{i=1}^p λ_i ∇h_i(x*) + ∑_{k∈I_R(x*)} α_k ∇σ_min(g_k(x*)), for some λ_i ∈ R, α_k ≥ 0 }, (24)

and the set J_+(x*) := I_R(x*) \ J_−(x*). Now, the Constant Rank of the Subspace Component (CRSC) constraint qualification for SDP is defined as follows:
Definition 6.4.
Let x* be a feasible point of (14) and J_−(x*) ⊆ I_R(x*) be defined as in (24). We also take I ⊆ {1,...,p} and J ⊆ J_−(x*) such that {∇h_i(x*)}_{i∈I} ∪ {∇σ_min(g_j(x*))}_{j∈J} is a basis of the space spanned by the set {∇h_i(x*)}_{i=1}^p ∪ {∇σ_min(g_j(x*))}_{j∈J_−(x*)}. We say that the Constant Rank of the Subspace Component (CRSC) condition holds at x* when there exists a neighborhood V of x* such that:

• {∇h_i(x)}_{i=1}^p ∪ {∇σ_min(g_j(x))}_{j∈J_−(x*)} has constant rank for all x in V;

• the system

∑_{i∈I} λ_i ∇h_i(x*) + ∑_{j∈I_N(x*)} Jg_j(x*)^T µ_j + ∑_{j∈J∪J_+(x*)} α_j ∇σ_min(g_j(x*)) = 0,
λ_i ∈ R, i ∈ I; µ_j ∈ S^{m_j}_+, j ∈ I_N(x*); α_j ∈ R, j ∈ J; α_j ≥ 0, j ∈ J_+(x*),

has only the trivial solution.

It is possible to prove that CRSC is indeed a constraint qualification, but since the proof follows the same arguments as the proof of Theorem 6.2, it is omitted. The next example, analogous to Example 5.1, shows that CRSC and RCPLD are strictly weaker than Robinson’s CQ.
Example 6.1.
Consider the following pair of constraints:

g_1(x) := [ x+1 x−1; x−1 x+1 ] ∈ S^2_+,  g_2(x) := [ 1−x −x−1; −x−1 1−x ] ∈ S^2_+,

and the point x* = 0, which is the unique feasible point. The eigenvalues of g_1(x) are σ_min(g_1(x)) = 2x and σ_max(g_1(x)) = 2, with corresponding eigenvectors ν_min(g_1(x)) = (1, 1)^T/√2 and ν_max(g_1(x)) = (1, −1)^T/√2, respectively, for all x close to x*. With the same eigenvectors, the eigenvalues of g_2(x) are σ_min(g_2(x)) = −2x and σ_max(g_2(x)) = 2, when x is close to x*. Also, note that σ_min(g_1(x*)) and σ_min(g_2(x*)) are both simple, which means the reformulation of the problem as in (20) is simply an NLP problem. Moreover, we have ∇σ_min(g_1(x)) = 2 and ∇σ_min(g_2(x)) = −2 for all x close enough to x* = 0. Then, RCPLD and CRSC (with J_−(x*) = {1, 2} and, consequently, J_+(x*) = ∅ and J equal to either {1} or {2}) hold. However, Robinson’s CQ does not hold. Thus, RCPLD and CRSC are strictly implied by Robinson’s CQ.

Conclusions

We have presented naive definitions of constant rank-type CQs for second-order cone programming and semidefinite programming. The definitions are naive in the sense that no improvement is made with respect to irreducible constraints, where our definitions reduce to Robinson’s CQ. However, in general, our definitions are strictly weaker than Robinson’s CQ. In order to present a definition that takes into account the true conic constraints, we expect that a much more involved implicit function approach or Approximate-KKT approach would be needed, which is a subject of current research. Note that, since the augmented Lagrangian algorithms described in [4] and [9] generate an AKKT sequence for the SOCP (2) and SDP (14) problems, respectively, the CQs introduced in these notes are sufficient for showing global convergence to a KKT point without assuming Robinson’s CQ.
Acknowledgement
We would like to thank Ellen H. Fukuda (Kyoto University) and Paulo J.S. Silva (University of Campinas) for initialdiscussions on this topic. This work was supported by CEPID-CeMEAI (FAPESP 2013/07375-0), FAPESP (grants2018/24293-0, 2017/18308-2, 2017/17840-2, and 2017/12187-9), CNPq (grants 301888/2017-5, 303427/2018-3, and404656/2018-8), and FONDECYT grant 1201982 and Basal Program CMM-AFB 170001, both from ANID (Chile).
References

[1] F. Alizadeh and D. Goldfarb. Second-order cone programming. Mathematical Programming, 95(1):3–51, 2003.
[2] E. D. Andersen, C. Roos, and T. Terlaky. Notes on duality in second order and p-order cone optimization. Optimization, 51(4):627–643, 2002.
[3] R. Andreani, E. H. Fukuda, G. Haeser, H. Ramírez, D. O. Santos, P. J. S. Silva, and T. P. Silveira. Erratum to: New constraint qualifications and optimality conditions for second order cone programs. Submitted to Set-Valued and Variational Analysis, 2020.
[4] R. Andreani, E. H. Fukuda, G. Haeser, D. O. Santos, and L. D. Secchin. Optimality conditions for nonlinear second-order cone programming and symmetric cone programming. Optimization Online, 2019.
[5] R. Andreani, G. Haeser, and J. M. Martínez. On sequential optimality conditions for smooth constrained optimization. Optimization, 60(5):627–641, 2011.
[6] R. Andreani, G. Haeser, A. Ramos, and P. J. S. Silva. A second-order sequential optimality condition associated to the convergence of algorithms. IMA Journal of Numerical Analysis, 37(4):1902–1929, 2017.
[7] R. Andreani, G. Haeser, M. L. Schuverdt, and P. J. S. Silva. Two new weak constraint qualifications and applications. SIAM Journal on Optimization, 22(3):1109–1135, 2012.
[8] R. Andreani, G. Haeser, M. L. Schuverdt, and P. J. S. Silva. A relaxed constant positive linear dependence constraint qualification and applications. Mathematical Programming, 135(1-2):255–273, 2012.
[9] R. Andreani, G. Haeser, and D. S. Viana. Optimality conditions and global convergence for nonlinear semidefinite programming. Mathematical Programming, 180(1):203–235, 2020.
[10] R. Andreani, J. M. Martínez, and M. L. Schuverdt. On second-order optimality conditions for nonlinear programming. Optimization, 56:529–542, 2007.
[11] J. F. Bonnans and H. Ramírez. Strong regularity of semidefinite programs. Technical report DIM-CMM B-05-06-137, 2005.
[12] J. F. Bonnans and H. Ramírez. Perturbation analysis of second-order cone programming problems. Mathematical Programming, 104(2):205–227, 2005.
[13] J. F. Bonnans and A. Shapiro. Perturbation Analysis of Optimization Problems. Springer Verlag, New York, 2000.
[14] J. Gondzio. Interior point methods 25 years later. European Journal of Operational Research, 216(3):587–601, 2012.
[15] R. Janin. Directional derivative of the marginal function in nonlinear programming. Mathematical Programming Study, 21:110–126, 1984.
[16] S. Lu. Implications of the constant rank constraint qualification. Mathematical Programming, 126(2):365–392, 2011.
[17] L. Minchenko and S. Stakhovski. On relaxed constant rank regularity condition in mathematical programming. Optimization, 60(4):429–440, 2011.
[18] L. Minchenko and S. Stakhovski. Parametric nonlinear programming problems under the relaxed constant rank condition. SIAM Journal on Optimization, 21(1):314–332, 2011.
[19] J. Nocedal and S. Wright. Numerical Optimization. Springer Science & Business Media, 2006.
[20] L. Qi and Z. Wei. On the constant positive linear dependence condition and its application to SQP methods. SIAM Journal on Optimization, 10(4):963–981, 2000.
[21] M. V. Ramana, L. Tunçel, and H. Wolkowicz. Strong duality for semidefinite programming. SIAM Journal on Optimization, 7(3):641–662, 1997.
[22] S. M. Robinson. Stability theorems for systems of inequalities, Part II: differentiable nonlinear systems. SIAM Journal on Numerical Analysis, 13:497–513, 1976.
[23] S. M. Robinson. Generalized equations and their solutions, Part II: applications to nonlinear programming. In Optimality and Stability in Mathematical Programming, pages 200–221. Springer, 1982.
[24] A. Shapiro. First and second-order analysis of nonlinear semidefinite programs. Mathematical Programming, 77(2):301–320, 1997.
[25] A. Shapiro and M. K. H. Fan. On eigenvalue optimization. SIAM Journal on Optimization, 5:552–569, 1995.
[26] H. Wu, H. Luo, X. Ding, and G. Chen. Global convergence of modified augmented Lagrangian methods for nonlinear semidefinite programmings. Computational Optimization and Applications, 56(3):531–558, 2013.
[27] Y. Zhang and L. Zhang. New constraint qualifications and optimality conditions for second order cone programs. Set-Valued and Variational Analysis, 27:693–712, 2019.