Necessary Optimality Conditions and Exact Penalization for Non-Lipschitz Nonlinear Programs
Mathematical Programming manuscript No. (will be inserted by the editor)
Dedicated to R. Terry Rockafellar in honor of his birthday

Lei Guo · Jane J. Ye
Received: date / Accepted: date
Abstract
When the objective function is not locally Lipschitz, constraint qualifications are no longer sufficient for Karush-Kuhn-Tucker (KKT) conditions to hold at a local minimizer, let alone ensuring an exact penalization. In this paper, we extend quasi-normality and the relaxed constant positive linear dependence (RCPLD) condition to allow the non-Lipschitzness of the objective function and show that they are sufficient for KKT conditions to be necessary for optimality. Moreover, we derive exact penalization results for the following two special cases. When the non-Lipschitz term in the objective function is the sum of a composite function of a separable lower semi-continuous function with a continuous function and an indicator function of a closed subset, we show that a local minimizer of our problem is also a local minimizer of an exact penalization problem under a local error bound condition for a restricted constraint region and a suitable assumption on the outer separable function. When the non-Lipschitz term is the sum of a continuous function and an indicator function of a closed subset, we also show that our problem admits an exact penalization under an extended quasi-normality involving the coderivative of the continuous function.
Keywords
Non-Lipschitz program · necessary optimality · exact penalization · error bound

Mathematics Subject Classification (2010)

The first author's work was supported in part by NSFC Grant (No. 11401379) and the second author's work was supported in part by NSERC.

Lei Guo
Sino-US Global Logistics Institute, Shanghai Jiao Tong University, Shanghai 200030, China

Jane J. Ye
Department of Mathematics and Statistics, University of Victoria, Victoria, BC, V8W 2Y2, Canada
E-mail: [email protected]
The purpose of this paper is to study necessary optimality conditions and exact penalization for the following non-Lipschitz nonlinear program:

min f(x) + Φ(x)
s.t. g(x) ≤ 0,   (1)
     h(x) = 0,

where f: ℜ^d → ℜ, g: ℜ^d → ℜ^n, h: ℜ^d → ℜ^m are Lipschitz around the point of interest, and Φ: ℜ^d → (−∞, ∞] is an extended-valued lower semi-continuous function.

Including a non-Lipschitz term in the objective function has significantly enlarged the applicability of standard nonlinear programs. For example, it has recently been discovered that when the term Φ belongs to a certain class of non-Lipschitz functions, local minimizers of problem (1) are often sparse. This property makes problem (1) useful for seeking a sparse solution in many fields such as image restoration, signal processing, wireless communication, and portfolio selection in financial applications; see, e.g., [8,10,11,13,24].

It is well known that a constraint qualification is a condition imposed on constraint functions so that Karush-Kuhn-Tucker (KKT) conditions hold at a local minimizer. There exist very weak constraint qualifications such as Guignard's and Abadie's constraint qualifications ([1,18]), but they are not easy to verify since they involve computing the tangent or normal cone of the constraint region. The challenge is to find verifiable constraint qualifications that are applicable to as many situations as possible. For nonlinear programs where the objective function is locally Lipschitz, the verifiable classical constraint qualifications in the literature include the linear independence constraint qualification, Slater's condition, and the Mangasarian-Fromovitz constraint qualification (MFCQ). Moreover, it is well known that when all constraint functions are linear, no constraint qualification is required for KKT conditions to hold at a local minimizer. In recent years, quite a few new and weaker verifiable constraint qualifications have been introduced; see, e.g., [2-7,17,26].
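The sparsity-inducing effect of a non-Lipschitz penalty mentioned above can be seen in a tiny numerical experiment. The following sketch is our own illustration, not from the paper: the data point `a` and penalty weights `lam` are arbitrary choices. A grid search on the one-dimensional model φ(x) = (x − a)² + lam·√|x|, whose penalty is the 1-D instance of the ℓ_{1/2} term Σ_i √|x_i|, shows that a sufficiently large non-Lipschitz penalty pins the minimizer exactly at zero, while a small one leaves it near a.

```python
import numpy as np

def l_half_objective(x, a, lam):
    # (x - a)^2 data-fitting term plus the non-Lipschitz sqrt(|x|) penalty,
    # i.e. a 1-D instance of the l_{1/2} regularized problem
    return (x - a) ** 2 + lam * np.sqrt(np.abs(x))

grid = np.linspace(-1.0, 1.0, 200001)  # fine grid around the origin
a = 0.3

x_big = grid[np.argmin(l_half_objective(grid, a, lam=1.0))]    # strong penalty
x_small = grid[np.argmin(l_half_objective(grid, a, lam=0.01))]  # weak penalty

print(x_big, x_small)  # strong penalty drives the minimizer to (numerically) zero
```

Because √|·| has unbounded slope at the origin, any sufficiently large weight makes x = 0 a strict local (here global) minimizer, which is the sparsity phenomenon the cited applications exploit.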
In particular, quasi-normality is a weak constraint qualification that was first introduced in [7] and extended to locally Lipschitz programs in [33]. The recently introduced relaxed constant positive linear dependence (RCPLD) condition in [2] is also a weak constraint qualification. Both quasi-normality and RCPLD are weaker than MFCQ and hold automatically when all constraint functions are linear (see [7, Proposition 3.1]). Moreover, they each imply a local error bound for the constraint region (see [2,19]) and thus, by Clarke's exact penalization principle [15, Proposition 2.4.3], they are sufficient to ensure an exact penalization when the objective function is locally Lipschitz.

Little research has been done on KKT necessary optimality conditions for non-Lipschitz nonlinear programs in the literature, let alone exact penalization. The Fritz John type necessary optimality conditions for non-Lipschitz programs were first given by Kruger and Mordukhovich in [23] and reproved by Mordukhovich in [27, Theorem 1(b)] and Borwein et al. in [9, Corollary 2.6]. For our problem, since all functions are locally Lipschitz except the objective function, (A1) in [9, Corollary 2.6] never holds and consequently (A2) in [9, Corollary 2.6] holds. Hence, the Fritz John condition [9, Corollary 2.6] for our problem states that at a local minimizer x*, there exist λ ∈ ℜ^{|I*|}_+ and μ ∈ ℜ^m, not all zero, such that at least one of the following cases holds:

(i) 0 ∈ ∂^∞Φ(x*) + Σ_{i∈I*} λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*),
(ii) 0 ∈ ∂(f + Φ)(x*) + Σ_{i∈I*} λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*),

where I* := {i : g_i(x*) = 0}, and ∂, ∂^∞ denote the limiting subdifferential and the horizon subdifferential respectively (see the definitions in Section 2). Consequently, we can derive the KKT necessary optimality condition from the above Fritz John condition immediately as follows.
Suppose that there are no nonzero abnormal multipliers, i.e., the following implication holds:

0 ∈ ∂^∞Φ(x*) + Σ_{i=1}^n λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*),
λ_i ≥ 0, λ_i g_i(x*) = 0, i = 1,...,n   =⇒   (λ, μ) = 0.   (2)

Then the above condition (ii) holds, which means that x* is a KKT point. We call the implication (2) the ∂^∞-no nonzero abnormal multiplier constraint qualification (∂^∞-NNAMCQ) at x*. Note that when Φ is Lipschitz around x*, we have ∂^∞Φ(x*) = {0} and hence ∂^∞-NNAMCQ reduces to the standard NNAMCQ for nonlinear programs with equality and inequality constraints. When Φ is an indicator function of a closed subset, ∂^∞-NNAMCQ reduces to the standard NNAMCQ for nonlinear programs with equality, inequality, and abstract set constraints. When Φ is neither Lipschitz around x* nor equal to an indicator function, the implication (2) involves the horizon subdifferential ∂^∞Φ(x*) of the non-Lipschitz term. Thus, ∂^∞-NNAMCQ is no longer a constraint qualification since it is related to the objective function. However, since it is a condition under which a local minimizer is a KKT point, we call such a condition a qualification condition. Very recently, Chen et al. [12] gave some necessary optimality conditions for problem (1), where the non-Lipschitz term Φ is continuous and all the other functions are continuously differentiable, under RCPLD and the so-called basic qualification (BQ for short) (see the definition in (12)), and proposed an augmented Lagrangian method for solving this kind of problem. It should be noted that BQ is very difficult to verify, as discussed in the paragraph after Corollary 1.

In this paper, we extend the standard quasi-normality and the standard RCPLD to problem (1). Similar to ∂^∞-NNAMCQ, our new qualification conditions also involve ∂^∞Φ(x*) and we call them ∂^∞-quasi-normality and ∂^∞-RCPLD respectively.
Moreover, we derive two exact penalization results for two special cases of problem (1) under some suitable conditions. We summarize our main contributions as follows:

– We introduce two new verifiable qualification conditions called ∂^∞-quasi-normality and ∂^∞-RCPLD respectively and show that they are sufficient for KKT conditions to be necessary for optimality. These two qualification conditions are both weaker than ∂^∞-NNAMCQ and hold automatically when Φ is Lipschitz around the point of interest and g, h are linear. As a by-product, we extend the standard RCPLD on smooth constraint functions to the case where there is an extra abstract set constraint and show that KKT conditions are necessary for optimality.

– Exact penalization results for two special cases of problem (1) are derived. Case i): Φ is the sum of a composite function of a separable lower semi-continuous function with a continuous function and an indicator function of a closed subset. In this case, we show that a local minimizer of problem (1) is also a local minimizer of an exact penalization problem under a local error bound condition for a restricted constraint region and a suitable assumption on the outer separable function. Case ii): Φ is the sum of a continuous function and an indicator function of a closed subset. In this case, we introduce D*-quasi-normality, an extended quasi-normality involving the coderivative of the continuous function, and show that D*-quasi-normality is sufficient to ensure an exact penalization. Note that D*-quasi-normality reduces to the standard quasi-normality for nonlinear programs with equality, inequality, and abstract set constraints when the continuous function is Lipschitz around the point of interest.

The rest of this paper is organized as follows. In Section 2 we give some background materials. In Section 3 we propose some qualification conditions for problem (1).
In Section 4 we derive necessary optimality conditions for problem (1) under these qualification conditions. We investigate some sufficient conditions ensuring an exact penalization for problem (1) in Section 5.

The notations used in this paper are standard in the literature. The symbol N (resp., ℜ, ℜ_+, ℜ_−) denotes the set of nonnegative integers (resp., real numbers, nonnegative real numbers, nonpositive real numbers). For a finite set T, |T| denotes its cardinality. For any x ∈ ℜ^d, we denote by x_+ := max{x, 0} the non-negative part of x, by ‖x‖_p := (Σ_{i=1}^d |x_i|^p)^{1/p} for any p > 0, and by ‖x‖ any norm in ℜ^d. Let B_δ(x) denote a closed ball centered at x with positive radius δ. The indicator function of a subset D ⊆ ℜ^d is denoted by δ_D, and dist_D(x) denotes the Euclidean distance from x to D. Let F denote the constraint region of problem (1) and, for any x ∈ F, denote by I_g(x) := {i : g_i(x) = 0} the index set of active inequality constraints. We say that F admits a local error bound at x̄ ∈ F if there exist δ > 0 and κ > 0 such that

dist_F(x) ≤ κ (‖h(x)‖ + ‖g(x)_+‖) for all x ∈ B_δ(x̄).

We next give some background materials on variational analysis; see, e.g., [15,16,28,31] for more details. For a function φ: ℜ^d → [−∞, ∞] and a point x* ∈ ℜ^d where φ(x*) is finite, the regular (or Fréchet) subdifferential of φ at x* is defined as

∂̂φ(x*) := {v : φ(x) ≥ φ(x*) + v^T(x − x*) + o(‖x − x*‖) ∀x},

the limiting (or Mordukhovich) subdifferential of φ at x* is defined as

∂φ(x*) := {v : ∃ x^k →_φ x*, v^k ∈ ∂̂φ(x^k) s.t. v^k → v},

and the horizon (or singular Mordukhovich) subdifferential of φ at x* is defined as

∂^∞φ(x*) := {v : ∃ x^k →_φ x*, v^k ∈ ∂̂φ(x^k) and t_k → 0, t_k ≥ 0, s.t. t_k v^k → v},

where o(·) means o(α)/α → 0 as α → 0, and x^k →_φ x* means x^k → x* and φ(x^k) → φ(x*) as k → ∞. It is well known that φ is Lipschitz around x* if and only if ∂^∞φ(x*) = {0} by [31, Theorem 9.13].

The regular (or Fréchet) normal cone of D at x* ∈ D is the closed convex cone defined as N̂_D(x*) := ∂̂δ_D(x*), and the limiting (or Mordukhovich) normal cone of D at x* is the closed cone defined as N_D(x*) := ∂δ_D(x*). We say that D is regular at x* if D is locally closed at x* and N_D(x*) = N̂_D(x*).

Given a set-valued mapping S: ℜ^d ⇒ ℜ^m and a point x̄ with S(x̄) ≠ ∅, the coderivative of S at x̄ for any ū ∈ S(x̄) is the mapping D*S(x̄|ū): ℜ^m ⇒ ℜ^d defined by

D*S(x̄|ū)(y) := {v : (v, −y) ∈ N_{gph S}(x̄, ū)},

where gph S := {(x, y) : y ∈ S(x)}. When S is single-valued at x̄ with S(x̄) = ū, the notation D*S(x̄|ū) is simplified to D*S(x̄). In the case where S is not only single-valued but also Lipschitz around x̄, the coderivative is related to the limiting subdifferential by the scalarization formula:

D*S(x̄)(y) = ∂⟨y, S⟩(x̄) ∀ y ∈ ℜ^m.

We say that S is locally bounded at x̄ ∈ ℜ^d if there exist M > 0 and δ > 0 such that

‖v‖ ≤ M ∀ v ∈ S(x), ∀ x ∈ B_δ(x̄).

Recall from [31, Definition 5.4] that S is said to be outer semi-continuous at x̄ if

{v̄ : ∃ x^k → x̄, v^k ∈ S(x^k) s.t. v^k → v̄} ⊆ S(x̄).

It is well known that the limiting normal cone mapping, the limiting subdifferential mapping, and the horizon subdifferential mapping are all outer semi-continuous everywhere; see, e.g., [31, Propositions 6.6 and 8.7].

By using the outer semi-continuity of the limiting normal cone mapping and the definition of the coderivative, it is easy to give the following proposition, which will be useful in deriving exact penalization results in Section 5.
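To make the horizon subdifferential concrete, here is a small numerical sketch of our own (not from the paper) for φ(t) = √|t| at t* = 0. Along x_k = 1/k² ↓ 0 the regular subgradients v_k = 1/(2√(x_k)) = k/2 blow up, so φ is not Lipschitz near 0; yet rescaling by t_k := 1/v_k ↓ 0 gives t_k v_k → 1, exhibiting 1 as an element of ∂^∞φ(0), consistent with ∂^∞φ(0) ≠ {0}.

```python
import math

# phi(t) = sqrt(|t|); at x_k = 1/k^2 > 0 the regular subgradient is the
# ordinary derivative v_k = 1 / (2 * sqrt(x_k)) = k / 2.
ks = [10, 100, 1000, 10000]
vs = [1.0 / (2.0 * math.sqrt(1.0 / k**2)) for k in ks]  # v_k = k/2, unbounded
ts = [1.0 / v for v in vs]                              # t_k -> 0, t_k >= 0
scaled = [t * v for t, v in zip(ts, vs)]                # t_k * v_k -> 1

print(vs[-1], scaled[-1])  # subgradients diverge, scaled products equal 1
```

The divergence of v_k is exactly the failure of the Lipschitz property, while the convergent rescaled sequence is the kind of limit the definition of ∂^∞φ records.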
Proposition 1
The coderivative D*S(x|u): ℜ^m ⇒ ℜ^d is outer semi-continuous in the sense that if there exist v^k ∈ D*S(x^k|u^k)(y^k), where x^k → x*, y^k → y*, and u^k → u* with u^k ∈ S(x^k), such that v^k → v*, then v* ∈ D*S(x*|u*)(y*).
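For a single-valued smooth (hence locally Lipschitz) map, the scalarization formula above gives the coderivative in closed form: D*S(x̄)(y) = ∂⟨y, S⟩(x̄) = ∇S(x̄)ᵀy. The following finite-difference sanity check uses a toy map S of our own choosing (not from the paper):

```python
import numpy as np

def S(x):
    # a smooth single-valued map S: R^2 -> R^2
    return np.array([np.sin(x[0]) + x[1], x[0] * x[1]])

def jacobian(x):
    # exact Jacobian of S
    return np.array([[np.cos(x[0]), 1.0],
                     [x[1], x[0]]])

def num_grad(f, x, h=1e-6):
    # central-difference gradient of a scalar function f at x
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

x_bar = np.array([0.4, -1.2])
y = np.array([2.0, -0.5])

coderiv = jacobian(x_bar).T @ y             # D*S(x_bar)(y) via scalarization
grad = num_grad(lambda x: y @ S(x), x_bar)  # gradient of <y, S> numerically
print(np.max(np.abs(coderiv - grad)))       # agreement up to discretization error
```

In the smooth case the coderivative is thus just the adjoint Jacobian; the set-valued definition via the normal cone to gph S matters only beyond this setting.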
The following proposition collects some useful properties and calculus rules of the limiting subdifferential.
Proposition 2 (i) [31, Exercise 10.10]
Let f, g: ℜ^d → [−∞, ∞] be proper lower semi-continuous around x* ∈ ℜ^d and finite at x*, and let α, β be nonnegative scalars. Assume that at least one of them is Lipschitz around x*. Then

∂(αf + βg)(x*) ⊆ α∂f(x*) + β∂g(x*).

Here we let 0 · ∅ = {0} by convention.

(ii) [22, Theorem 2.5 and Remark (2)] Let g: ℜ^n → ℜ^m be Lipschitz around x* and f: ℜ^m → ℜ be Lipschitz around g(x*). Then the composite function f ∘ g is Lipschitz around x* and

∂(f ∘ g)(x*) ⊆ ∪_{ξ ∈ ∂f(g(x*))} ∂⟨ξ, g⟩(x*).

(iii) [28, Theorem 3.38] Let g: ℜ^d → ℜ^m be continuous at x* and f: ℜ^m → ℜ be Lipschitz around g(x*). Then

∂(f ∘ g)(x*) ⊆ ∪_{ξ ∈ ∂f(g(x*))} D*g(x*)(ξ),   ∂^∞(f ∘ g)(x*) ⊆ D*g(x*)(0).

(iv) [29, Theorem 7.5] Let f(x) := max{f_i(x) : i = 1,...,s}, where f_i: ℜ^d → ℜ is continuous at x* for all i = 1,...,s. If all but at most one of the functions {f_i : i = 1,...,s} are Lipschitz around x*, then

∂f(x*) ⊆ ∪ { Σ_{i∈I*} λ_i ⋄ ∂f_i(x*) : (λ_1,...,λ_s) ∈ Λ* },

where I* := {i : f_i(x*) = f(x*)} is the index set of active indices and

Λ* := { (λ_1,...,λ_s) : λ_i ≥ 0, i ∈ I*, λ_i = 0, i ∉ I*, Σ_{i∈I*} λ_i = 1 }.

Here we define α ⋄ ∂g by α∂g if α > 0 and by ∂^∞g if α = 0.

Since the objective function of problem (1) includes a non-Lipschitz term, KKT conditions are no longer necessary for optimality under constraint qualifications alone, such as the standard quasi-normality [33, Definition 5] and the standard RCPLD [2, Definition 4]. In this section we extend the standard quasi-normality and the standard RCPLD to problem (1) as follows, so that KKT conditions are necessary for optimality under the extended quasi-normality and the extended RCPLD respectively.
Definition 1
Let x ∈ F.

(a) We say that x is ∂^∞-quasi-normal if there is no nonzero vector (λ, μ) ∈ ℜ^n × ℜ^m such that there exists a sequence {x^k} converging to x as k → ∞ satisfying

0 ∈ ∂^∞Φ(x) + Σ_{i=1}^n λ_i ∂g_i(x) + Σ_{j=1}^m ∂(μ_j h_j)(x),   (3)
λ_i ≥ 0, λ_i g_i(x) = 0, i = 1,...,n,   (4)
g_i(x^k) > 0 ∀ i ∈ I, μ_j h_j(x^k) > 0 ∀ j ∈ J, ∀ k ∈ N,   (5)

where I := {i : λ_i > 0}, J := {j : μ_j ≠ 0}, and N is the set of all positive integers.

(b) Assume that g, h are smooth around x. Let J ⊆ {1,...,m} be such that {∇h_j(x) : j ∈ J} is a basis for span{∇h_j(x) : j = 1,...,m}. We say that the ∂^∞-RCPLD condition holds at x if there exists δ > 0 such that

(i) {∇h_j(y) : j = 1,...,m} has the same rank for each y ∈ B_δ(x);
(ii) for each I ⊆ I_g(x), if there exist {λ_i ≥ 0 : i ∈ I} and {μ_j : j ∈ J}, not all zero, such that

0 ∈ ∂^∞Φ(x) + Σ_{i∈I} λ_i ∇g_i(x) + Σ_{j∈J} μ_j ∇h_j(x),   (6)

then {∇g_i(y), ∇h_j(y) : i ∈ I, j ∈ J} is linearly dependent for each y ∈ B_δ(x).

It is easy to see that both ∂^∞-quasi-normality and ∂^∞-RCPLD are weaker than ∂^∞-NNAMCQ (i.e., implication (2)), but the reverse is not true; see Examples 1-2. Note that if Φ is Lipschitz around x, then ∂^∞Φ(x) = {0}, and thus ∂^∞-quasi-normality and ∂^∞-RCPLD reduce to the standard quasi-normality and the standard RCPLD respectively for nonlinear programs with equality and inequality constraints. If Φ is an indicator function of a closed subset Ω, i.e., Φ(x) = δ_Ω(x), then ∂^∞Φ(x) = N_Ω(x) by [31, Exercise 8.14]. Thus ∂^∞-quasi-normality reduces to the standard quasi-normality for nonlinear programs with equality, inequality, and abstract set constraints, and ∂^∞-RCPLD allows us to extend the original definition of RCPLD [2, Definition 4] to the problem where there is an extra abstract set constraint x ∈ Ω, since inclusion (6) becomes

0 ∈ Σ_{i∈I} λ_i ∇g_i(x) + Σ_{j∈J} μ_j ∇h_j(x) + N_Ω(x).
In this case we simply say that RCPLD holds.

We next extend the standard quasi-normality to problem (1) for ensuring an exact penalization.
Definition 2
Suppose that Φ(x) := Ψ(x) + δ_Ω(x), where Ψ is a continuous function and Ω is a closed subset of ℜ^d. Let x ∈ F. We say that x is D*-quasi-normal if there is no nonzero vector (λ, μ) ∈ ℜ^n × ℜ^m such that there exists a sequence {x^k} converging to x as k → ∞ satisfying (4)-(5) and

0 ∈ D*Ψ(x)(0) + Σ_{i=1}^n λ_i ∂g_i(x) + Σ_{j=1}^m ∂(μ_j h_j)(x) + N_Ω(x).

If Ψ is Lipschitz around x, then D*Ψ(x)(0) = {0} and hence D*-quasi-normality reduces to the standard quasi-normality for nonlinear programs with equality, inequality, and abstract set constraints. Since ∂^∞Φ(x) ⊆ D*Φ(x)(0) (see [28, Theorem 1.80]), D*-quasi-normality is stronger than ∂^∞-quasi-normality when Ω = ℜ^d.

We call problem (1) an ℓ_{1/2} minimization problem if the non-Lipschitz term Φ(x) is equal to (‖x‖_{1/2})^{1/2}. The problem in the following example is an ℓ_{1/2} minimization problem with linear constraints. It gives an example for which ∂^∞-RCPLD, ∂^∞-quasi-normality, and D*-quasi-normality are all satisfied but ∂^∞-NNAMCQ does not hold.

Example 1
Consider the following problem:

min f(x) := √|x_1| + √|x_2| + √|x_3| + √|x_4|
s.t. g(x) := x_1 + x_2 + x_3 + x_4 − 2 ≤ 0,
     h_1(x) := x_1 + x_2 − 1 = 0,
     h_2(x) := x_3 + x_4 − 1 = 0.

Let x* = (1, 0, 1, 0). Then ∂^∞f(x*) = {0} × ℜ × {0} × ℜ, ∇g(x) = (1, 1, 1, 1), ∇h_1(x) = (1, 1, 0, 0), and ∇h_2(x) = (0, 0, 1, 1) for any x. Direct verification shows that there exists (λ, μ_1, μ_2) ≠ 0 such that

0 ∈ ∂^∞f(x*) + λ∇g(x*) + μ_1∇h_1(x*) + μ_2∇h_2(x*), λ ≥ 0,

which means that ∂^∞-NNAMCQ does not hold at x*. But in this case, the family of gradients {∇g(x), ∇h_1(x), ∇h_2(x)} is linearly dependent for any x. Thus, ∂^∞-RCPLD holds at x*. To show that ∂^∞-quasi-normality also holds at x*, assume that there exist (λ, μ_1, μ_2) ≠ 0 and a sequence {x^k} converging to x* such that

0 ∈ ∂^∞f(x*) + λ∇g(x*) + μ_1∇h_1(x*) + μ_2∇h_2(x*), λ ≥ 0,   (7)
g(x^k) > 0 if λ > 0, μ_1 h_1(x^k) > 0 if μ_1 ≠ 0, μ_2 h_2(x^k) > 0 if μ_2 ≠ 0.   (8)

By (7), it follows that μ_1 = μ_2 = −λ. Thus λ > 0 and μ_1 = μ_2 < 0. These together with (8) lead to

2 < x_1^k + x_2^k + x_3^k + x_4^k < 2.

This contradiction shows that there is no nonzero (λ, μ_1, μ_2) satisfying (7)-(8). Thus, ∂^∞-quasi-normality holds at x*. Since it is easy to verify that D*f(x*)(0) = ∂^∞f(x*), D*-quasi-normality holds at x* as well. □

The following example of an ℓ_{1/2} minimization problem with nonlinear constraints illustrates that it is possible that ∂^∞-quasi-normality holds but ∂^∞-RCPLD does not hold.

Example 2
Consider the following problem:

min f(x) := √|x_1| + √|x_2| + √|x_3|
s.t. g(x) := x_1 + x_2^2 − x_3^2 − 1 ≤ 0,
     h(x) := x_1 + x_2^2 − 1 = 0.

Let x* = (1, 0, 0). Then ∂^∞-RCPLD does not hold at x*, since there exists (λ, μ) ≠ 0 such that

0 ∈ ∂^∞f(x*) + λ∇g(x*) + μ∇h(x*), λ ≥ 0,

while {∇g(x), ∇h(x)} is linearly independent for any x with x_3 ≠ 0. To show that ∂^∞-quasi-normality holds at x*, assume that there exist (λ, μ) ≠ 0 and a sequence {x^k} converging to x* such that

0 ∈ ∂^∞f(x*) + λ∇g(x*) + μ∇h(x*), λ ≥ 0,   (9)
g(x^k) > 0 if λ > 0, μh(x^k) > 0 if μ ≠ 0.   (10)

By (9), it follows that λ + μ = 0. Thus λ > 0 and μ < 0. But by (10), g(x^k) > 0 gives x_1^k + (x_2^k)^2 > 1 + (x_3^k)^2, while h(x^k) < 0 gives x_1^k + (x_2^k)^2 < 1, so that 1 + (x_3^k)^2 < 1. This contradiction shows that ∂^∞-quasi-normality holds at x*. □

We know that the standard RCPLD does not imply the standard quasi-normality (see [2, Example 2]). It is also easy to see that the standard RCPLD and the standard quasi-normality at a local minimizer x* of the problem

min f(x) s.t. x ∈ F   (11)

are equivalent to ∂^∞-RCPLD and ∂^∞-quasi-normality, respectively, at the local minimizer (x*, 0) of the perturbed problem

min f(x) + √|z| s.t. x ∈ F, z ∈ ℜ.

Thus, if problem (11) is such that the standard RCPLD holds but the standard quasi-normality does not hold, then ∂^∞-RCPLD holds but ∂^∞-quasi-normality does not hold for the above perturbed problem.

We next give some characterizations of ∂^∞-quasi-normality, ∂^∞-RCPLD, and D*-quasi-normality in terms of the standard quasi-normality and the standard RCPLD.

Proposition 3
Let x* ∈ F.

(i) If ∂^∞-quasi-normality holds at x*, then both the standard quasi-normality and the following basic qualification (BQ) hold at x*:

−∂^∞Φ(x*) ∩ N_F(x*) = {0}.   (12)

If g, h are smooth around x*, then ∂^∞-quasi-normality at x* is equivalent to the standard quasi-normality plus BQ (12) at x*.

(ii) ∂^∞-RCPLD holds at x* if and only if both the standard RCPLD and BQ (12) hold at x*.

(iii) If D*-quasi-normality holds at x*, then the standard quasi-normality holds at x* and

−D*Ψ(x*)(0) ∩ N_F(x*) = {0}.   (13)

If Ω is regular and g, h are smooth around x*, then D*-quasi-normality holds at x* if and only if both the standard quasi-normality and condition (13) hold at x*.

Proof (i) Suppose that ∂^∞-quasi-normality holds at x*. Then the standard quasi-normality also holds at x*, since 0 ∈ ∂^∞Φ(x*). Thus by [33, Proposition 4], it follows that

N_F(x*) ⊆ { Σ_{i∈I*} λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*) : λ_i ≥ 0, i ∈ I*, ∃{x^k} → x* s.t. g_i(x^k) > 0 ∀ i ∈ I, μ_j h_j(x^k) > 0 ∀ j ∈ J, ∀ k ∈ N },   (14)

where

I* := I_g(x*), I := {i : λ_i > 0}, J := {j : μ_j ≠ 0}.   (15)

We now show that BQ (12) holds. By contradiction, suppose that 0 ≠ ζ ∈ −∂^∞Φ(x*) ∩ N_F(x*). Then by (14), there exist (λ, μ) ∈ ℜ^{|I*|} × ℜ^m and a sequence {x^k} converging to x* such that

ζ ∈ Σ_{i∈I*} λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*),
λ_i ≥ 0, i ∈ I*, g_i(x^k) > 0 ∀ i ∈ I, μ_j h_j(x^k) > 0 ∀ j ∈ J, ∀ k ∈ N.

Since 0 ≠ ζ ∈ −∂^∞Φ(x*), it then follows that (λ, μ) ≠ 0 and

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I*} λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*),   (16)
λ_i ≥ 0, i ∈ I*, g_i(x^k) > 0 ∀ i ∈ I, μ_j h_j(x^k) > 0 ∀ j ∈ J, ∀ k ∈ N,   (17)

which contradicts ∂^∞-quasi-normality. Thus, BQ (12) holds.

We next show the converse part. Assume that g, h are smooth around x*, and that both the standard quasi-normality and BQ (12) hold at x*.
By contradiction, assume that ∂^∞-quasi-normality does not hold at x*. That is, there exist 0 ≠ (λ, μ) ∈ ℜ^{|I*|} × ℜ^m and a sequence {x^k} converging to x* such that

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*),   (18)
λ_i ≥ 0, i ∈ I*, g_i(x^k) > 0 ∀ i ∈ I, μ_j h_j(x^k) > 0 ∀ j ∈ J, ∀ k ∈ N,   (19)

where I*, I, J are defined as in (15). Moreover, since g, h are smooth around x*, it follows from [31, Theorem 6.14] that

{ Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) : λ_i ≥ 0, i ∈ I*, μ ∈ ℜ^m } ⊆ N_F(x*).   (20)

This and (18) imply that

Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) ∈ −∂^∞Φ(x*) ∩ N_F(x*),

which together with BQ (12) means that

Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) = 0.

This, together with (19) and the relation (λ, μ) ≠ 0, contradicts the standard quasi-normality. Thus, ∂^∞-quasi-normality holds at x*.

(ii) Let the standard RCPLD and BQ (12) hold at x*. Let J be the index set given in the definition of the standard RCPLD such that {∇h_j(x*) : j ∈ J} is a basis for span{∇h_j(x*) : j = 1,...,m}, and let I* be defined as in (15). To show ∂^∞-RCPLD, it suffices to show that Definition 1(b)(ii) holds. Assume that there exist {α_i ≥ 0 : i ∈ I} with I ⊆ I* and {β_j : j ∈ J}, not all zero, such that

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I} α_i ∇g_i(x*) + Σ_{j∈J} β_j ∇h_j(x*),

which together with (12) and (20) implies that

Σ_{i∈I} α_i ∇g_i(x*) + Σ_{j∈J} β_j ∇h_j(x*) = 0.

This and the standard RCPLD imply the existence of δ > 0 such that

{∇g_i(x), ∇h_j(x) : i ∈ I, j ∈ J} is linearly dependent for each x ∈ B_δ(x*).

Thus, Definition 1(b)(ii) holds and hence ∂^∞-RCPLD holds at x*.

To show the converse part, suppose that ∂^∞-RCPLD holds at x*. It then follows immediately that RCPLD holds at x*, since 0 ∈ ∂^∞Φ(x*).
Then by [20, Theorem 3.2], it follows that

N_F(x*) ⊆ { Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) : λ_i ≥ 0, i ∈ I*, μ ∈ ℜ^m }.   (21)

We now show that BQ (12) holds. To the contrary, assume that 0 ≠ ζ ∈ −∂^∞Φ(x*) ∩ N_F(x*). Let J be such that {∇h_j(x*) : j ∈ J} is a basis for span{∇h_j(x*) : j = 1,...,m}. Then by (21), there exist {λ_i > 0 : i ∈ I} with I ⊆ I* and {μ_j : j ∈ J}, not all zero, such that

ζ = Σ_{i∈I} λ_i ∇g_i(x*) + Σ_{j∈J} μ_j ∇h_j(x*),   (22)

which together with the relation ζ ∈ −∂^∞Φ(x*) implies that

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I} λ_i ∇g_i(x*) + Σ_{j∈J} μ_j ∇h_j(x*).

Then by Definition 1(b)(ii), there exist {α_i : i ∈ I} and {β_j : j ∈ J}, not all zero, such that

Σ_{i∈I} α_i ∇g_i(x*) + Σ_{j∈J} β_j ∇h_j(x*) = 0,

which together with (22) implies that for any γ ∈ ℜ,

ζ = Σ_{i∈I} (λ_i − γα_i) ∇g_i(x*) + Σ_{j∈J} (μ_j − γβ_j) ∇h_j(x*).

Choosing γ ≠ 0 with the smallest absolute value such that λ_i − γα_i = 0 for at least one i ∈ I, we are able to represent ζ with at least one fewer of the vectors ∇g_i(x*). We may repeat this procedure until ζ = Σ_{j∈J} θ_j ∇h_j(x*) for some {θ_j : j ∈ J} not all zero. Then by the relation ζ ∈ −∂^∞Φ(x*), it follows that

0 ∈ ∂^∞Φ(x*) + Σ_{j∈J} θ_j ∇h_j(x*).

Thus by Definition 1(b)(ii), {∇h_j(x*) : j ∈ J} must be linearly dependent. This contradicts the fact that {∇h_j(x*) : j ∈ J} is a basis. Hence BQ (12) holds.

(iii) When g, h are smooth around x* and Ω is regular, it follows from [31, Theorem 6.14] that

{ Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) + N_Ω(x*) : λ_i ≥ 0, i ∈ I*, μ ∈ ℜ^m } ⊆ N_F(x*).   (23)

The proof of (iii) is exactly the same as that of (i), except that ∂^∞Φ(x*) and (20) are replaced by D*Ψ(x*)(0) and (23), respectively.
□

When the constraint region is so simple that its limiting normal cone is easy to calculate directly, we can use Proposition 3 to verify our proposed qualification conditions. The following simple minimax problem illustrates that conditions (12)-(13) hold, and hence ∂^∞-quasi-normality, ∂^∞-RCPLD, and D*-quasi-normality are all satisfied, since the constraint function is linear.

Example 3
Consider the following minimax problem:

min_{x≤0} max_{y≥0} −x + xy.

Let V(x) := max{−x + xy : y ≥ 0}. Then it is easy to verify that the above minimax problem can be equivalently rewritten as

min V(x) s.t. x ≤ 0,   (24)

where V(x) = −x if x ≤ 0 and V(x) = ∞ otherwise. Clearly, x* = 0 is a minimizer of problem (24). We observe that V is not continuous at x* but is lower semi-continuous at x*. Moreover,

∂^∞V(x*) = D*V(x*)(0) = N_{ℜ_−}(x*) = ℜ_+,

which indicates that −∂^∞V(x*) ∩ N_{ℜ_−}(x*) = −D*V(x*)(0) ∩ N_{ℜ_−}(x*) = {0}. Since the constraint function of problem (24) is linear, the standard RCPLD obviously holds and, by [7, Proposition 3.1], the standard quasi-normality is also satisfied. It then follows from Proposition 3 that ∂^∞-quasi-normality, ∂^∞-RCPLD, and D*-quasi-normality are all satisfied at x* for problem (24). □

By using the outer semi-continuity of the horizon subdifferential mapping and Proposition 1, it is not difficult to show that both ∂^∞-quasi-normality and D*-quasi-normality are locally persistent, as follows.

Proposition 4 If ∂^∞-quasi-normality (D*-quasi-normality) holds at x* ∈ F, then there exists δ > 0 such that ∂^∞-quasi-normality (D*-quasi-normality) holds at every point in B_δ(x*) ∩ F.

We now show that ∂^∞-RCPLD is also locally persistent.

Proposition 5 If ∂^∞-RCPLD holds at x* ∈ F, then there exists δ > 0 such that ∂^∞-RCPLD holds at every point in B_δ(x*) ∩ F.

Proof Assume that ∂^∞-RCPLD holds at x*. Let J ⊆ {1,...,m} be such that {∇h_j(x*) : j ∈ J} is a basis for span{∇h_j(x*) : j = 1,...,m}. Then it is easy to see that there exists δ_1 ∈ (0, δ_0) such that {∇h_j(x) : j ∈ J} is linearly independent for all x ∈ B_{δ_1}(x*), where δ_0 is the constant given in Definition 1(b). It then follows from Definition 1(b)(i) that {∇h_j(x) : j ∈ J} is a basis for span{∇h_j(x) : j = 1,...,m} for any x ∈ B_{δ_1}(x*). Let δ_2 := δ_1/2. Then by Definition 1(b)(i) again, {∇h_j(y) : j = 1,...,m} has the same rank for all y ∈ B_{δ_2}(x) and x ∈ B_{δ_2}(x*). Thus it suffices to show that there exists δ_3 ∈ (0, δ_2) such that Definition 1(b)(ii) holds at every x ∈ B_{δ_3}(x*). Assume to the contrary that this is not true. That is, there exist a sequence {x^k} converging to x*, and {λ_i^k ≥ 0 : i ∈ I_k} with I_k ⊆ I_g(x^k) and {μ_j^k : j ∈ J}, not all zero, such that

0 ∈ ∂^∞Φ(x^k) + Σ_{i∈I_k} λ_i^k ∇g_i(x^k) + Σ_{j∈J} μ_j^k ∇h_j(x^k),   (25)

and there exists a sequence {y^{k,l}}_l converging to x^k such that

{∇g_i(y^{k,l}), ∇h_j(y^{k,l}) : i ∈ I_k, j ∈ J} is linearly independent for all l.

By the diagonalization law, there exists a sequence {z^k} converging to x* such that

{∇g_i(z^k), ∇h_j(z^k) : i ∈ I_k, j ∈ J} is linearly independent for all k.   (26)

Since g is continuous, it is easy to verify that I_g(x^k) ⊆ I_g(x*) for any k sufficiently large and hence I_k ⊆ I_g(x*). Since the number of possible sets I_k is finite, without loss of generality we may assume that I_k ≡ I_0 for all k sufficiently large. Let t_k := max{λ_i^k, |μ_j^k| : i ∈ I_0, j ∈ J}. Clearly, t_k > 0 for all k. Without loss of generality, we may assume that

λ_i^k / t_k → λ_i* ≥ 0, i ∈ I_0,   μ_j^k / t_k → μ_j*, j ∈ J,   as k → ∞.

It is easy to see that max{λ_i*, |μ_j*| : i ∈ I_0, j ∈ J} = 1. By the outer semi-continuity of the horizon subdifferential mapping, dividing (25) by t_k and taking limits on both sides as k → ∞ yields

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I_0} λ_i* ∇g_i(x*) + Σ_{j∈J} μ_j* ∇h_j(x*).

The last two relations and ∂^∞-RCPLD imply that for any x ∈ B_δ(x*),

{∇g_i(x), ∇h_j(x) : i ∈ I_0, j ∈ J} is linearly dependent,

which contradicts (26). The desired result follows immediately. □

The purpose of this section is to show that the KKT condition defined below is necessary for optimality under ∂^∞-quasi-normality or ∂^∞-RCPLD.

Definition 3 (KKT condition)
Let x* ∈ F. We say that x* is a KKT point of problem (1) if there exist multipliers λ ∈ ℜ^n and μ ∈ ℜ^m such that

0 ∈ ∂f(x*) + ∂Φ(x*) + Σ_{i=1}^n λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*),
λ_i ≥ 0, λ_i g_i(x*) = 0, i = 1, . . . , n.

We first show that the KKT condition holds at a local minimizer under a weaker qualification condition.
Lemma 1
Let x* be a local minimizer of problem (1). Suppose that BQ (12) holds at x* and

N_F(x*) ⊆ { Σ_{i∈I*} λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*) : λ_i ≥ 0 (i ∈ I*), μ ∈ ℜ^m },   (27)

where I* := I_g(x*). Then x* is a KKT point.

Proof It is clear that x* is a local minimizer of the problem

min f(x) + Φ(x) + δ_F(x).

Then by Fermat's rule (see, e.g., [31, Theorem 10.1]), we have

0 ∈ ∂f(x*) + ∂(Φ + δ_F)(x*).

Since BQ (12) holds at x*, it then follows from the sum rule for the limiting subdifferentials (see, e.g., [31, Corollary 10.9]) and the relation ∂δ_F(x*) = N_F(x*) that

0 ∈ ∂f(x*) + ∂Φ(x*) + N_F(x*).

This and (27) imply the desired result immediately. □
The following result follows immediately from the fact that (27) is implied by the local error bound condition (see, e.g., [21, Proposition 3.4]).
Corollary 1
Let x* be a local minimizer of problem (1). If BQ (12) holds at x* and F admits a local error bound at x*, then x* is a KKT point.

Let us revisit Example 1, which is posed in a four-dimensional space. Even in this low-dimensional space, it is not easy to calculate the limiting normal cone of the constraint region and hence BQ (12) is difficult to verify. For a constraint region involving many nonlinear constraints in high-dimensional spaces, it is almost impossible to calculate the limiting normal cone directly and thus BQ (12) is very difficult to verify. Fortunately, ∂^∞-quasi-normality and ∂^∞-RCPLD are expressed explicitly in terms of the problem data and hence are much easier to verify. The following result shows that these two proposed qualification conditions are sufficient for the KKT condition to hold at a local minimizer.

Theorem 1
Let x* be a local minimizer of problem (1). Assume that either ∂^∞-quasi-normality holds at x*, or ∂^∞-RCPLD holds at x* and g, h are smooth around x*. Then x* is a KKT point.

Proof By Proposition 3, either the standard quasi-normality and BQ (12) or the standard RCPLD and BQ (12) hold. It then follows from [33, Proposition 4] and [20, Theorem 3.2] that condition (27) holds. Thus, the desired result follows from Lemma 1 immediately. □
Corollary 2
Let x* be a local minimizer of problem (1) and let g, h be linear. Suppose also that the following implication holds:

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*), λ_i ≥ 0 (i ∈ I*)
  ⟹ Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) = 0,   where I* := I_g(x*).   (28)

Then x* is a KKT point.

Proof We first show that ∂^∞-RCPLD holds at x*. Since h is linear, Definition 1(b)(i) holds. It then suffices to show that Definition 1(b)(ii) holds. Let J be such that {∇h_j(x*) : j ∈ J} is a basis for span{∇h_j(x*) : j = 1, . . . , m} and let I ⊆ I*. Assume that there exist {λ_i : i ∈ I} and {μ_j : j ∈ J} not all zero such that

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I} λ_i ∇g_i(x*) + Σ_{j∈J} μ_j ∇h_j(x*), λ_i ≥ 0 (i ∈ I),

which implies by (28) that Σ_{i∈I} λ_i ∇g_i(x*) + Σ_{j∈J} μ_j ∇h_j(x*) = 0. This means that the family of gradients {∇g_i(x), ∇h_j(x) : i ∈ I, j ∈ J} is linearly dependent for all x, since g, h are linear. Thus, ∂^∞-RCPLD holds at x* and the desired result then follows immediately from Theorem 1. □

The following example illustrates the applicability of Corollary 2.
Example 4
Consider the following problem:

min f(x) := √|x_1| + √|x_2|
s.t. g_1(x) := x_1 + x_2 − 1 ≥ 0,
     g_2(x) := x_1 + x_2 − 1 ≤ 0

at x* = (1, 0). We have ∂^∞f(x*) = {0} × ℜ, and then 0 ∈ ∂^∞f(x*) − λ_1 ∇g_1(x*) + λ_2 ∇g_2(x*) implies that λ_1 = λ_2. Thus

−λ_1 (1, 1)^T + λ_2 (1, 1)^T = (0, 0)^T,

and by Corollary 2 it then follows that x* is a KKT point. □

Letting Φ be an indicator function of a closed subset Ω of ℜ^d, i.e., Φ(x) = δ_Ω(x), the following result follows immediately from Theorem 1; it extends the result of [2, Corollary 1] to allow an extra abstract set constraint x ∈ Ω and the nonsmoothness of the objective function.

Corollary 3
Let x* be a local minimizer of the nonlinear program

min_{x∈Ω} f(x)
s.t. g(x) ≤ 0,
     h(x) = 0,

where f, g, h are defined as in problem (1) and Ω is a closed subset of ℜ^d. Here we assume that g, h are smooth around x*. Suppose further that RCPLD holds at x*, i.e., there exists δ > 0 such that

(i) {∇h_j(x) : j = 1, . . . , m} has the same rank for each x ∈ B_δ(x*);
(ii) for each I ⊆ I_g(x*), if there exist {λ_i ≥ 0 : i ∈ I} and {μ_j : j ∈ J} not all zero such that

0 ∈ Σ_{i∈I} λ_i ∇g_i(x*) + Σ_{j∈J} μ_j ∇h_j(x*) + N_Ω(x*),

then {∇g_i(x), ∇h_j(x) : i ∈ I, j ∈ J} is linearly dependent for each x ∈ B_δ(x*),

where J ⊆ {1, . . . , m} is such that {∇h_j(x*) : j ∈ J} is a basis for span{∇h_j(x*) : j = 1, . . . , m}. Then there exist multipliers λ ∈ ℜ^n and μ ∈ ℜ^m such that

0 ∈ ∂f(x*) + Σ_{i=1}^n λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) + N_Ω(x*),
λ_i ≥ 0, λ_i g_i(x*) = 0, i = 1, . . . , n.

This section focuses on exact penalization for problem (1). We first give an exact penalization result for a special case of problem (1) where Φ is the sum of a composite function of a separable lower semi-continuous function with a continuous function and an indicator function of a closed subset. To this end, we give a characterization of the regular subdifferential as follows. It can be shown easily by using the definition of the regular subdifferential and thus we omit the proof here.

Lemma 2
Let ψ : ℜ^d → (−∞, +∞] be lower semi-continuous and let x* ∈ ℜ^d be such that ψ(x*) is finite. Then ∂̂ψ(x*) = ℜ^d if and only if for any M > 0 there exists δ > 0 such that

ψ(x) − ψ(x*) ≥ M ‖x − x*‖  ∀x ∈ B_δ(x*).

We are now ready to give the first main result on exact penalization.
Theorem 2
Assume that x* is a local minimizer of problem (1), where

Φ(x) := Σ_{i=1}^s φ_i(ω_i(x)) + δ_Ω(x).

Here Ω is a closed subset of ℜ^d and, for each i = 1, . . . , s, φ_i : ℜ → ℜ is lower semi-continuous and ω_i : ℜ^d → ℜ is continuous. Let t* := ω(x*), I := {i : ∂^∞φ_i(t_i^*) = {0}}, and let I^c be the complement of I with respect to {1, . . . , s}. Assume further that ∂̂φ_i(t_i^*) = ℜ for any i ∈ I^c and that the following restricted system with respect to (x, t):

g(x) ≤ 0, h(x) = 0, x ∈ Ω,
ω_i(x) − t_i = 0, i = 1, . . . , s,
t_i − t_i^* = 0, i ∈ I^c

admits a local error bound at (x*, t*). Then there exists ρ* > 0 such that for any ρ ≥ ρ*, x* is also a local minimizer of the exact penalization problem

min_{x∈Ω} f(x) + Σ_{i=1}^s φ_i(ω_i(x)) + ρ(‖g(x)_+‖ + ‖h(x)‖).

Proof
Since x* is a local minimizer of problem (1), it is not difficult to see that (x*, t*) is a local minimizer of the following problem:

min_{x∈Ω} Π(x, t) := f(x) + Σ_{i∈I} φ_i(t_i)
s.t. g(x) ≤ 0, h(x) = 0,
     ω_i(x) − t_i = 0, i = 1, . . . , s,   (29)
     t_i − t_i^* = 0, i ∈ I^c.

We observe that Π is Lipschitz around (x*, t*) and denote by L_Π its Lipschitz constant. Then by Clarke's exact penalization principle [15, Proposition 2.4.3], there exists δ_1 > 0 such that

Π(x*, t*) ≤ Π(x, t) + L_Π dist_{F'}(x, t)  ∀(x, t) ∈ B_{δ_1}(x*, t*) ∩ (Ω × ℜ^s),   (30)

where F' denotes the constraint region of problem (29). Since F' admits a local error bound at (x*, t*), there exist δ_2 ∈ (0, δ_1) and κ > 0 such that for any (x, t) ∈ B_{δ_2}(x*, t*) ∩ (Ω × ℜ^s),

dist_{F'}(x, t) ≤ κ ( Σ_{i=1}^s |ω_i(x) − t_i| + Σ_{i∈I^c} |t_i − t_i^*| + ‖g(x)_+‖ + ‖h(x)‖ ).

For simplicity, the above local error bound is expressed under the ℓ_1 norm. This together with (30) implies that for any (x, t) ∈ B_{δ_2}(x*, t*) ∩ (Ω × ℜ^s),

Π(x*, t*) ≤ Π(x, t) + κL_Π ( Σ_{i=1}^s |ω_i(x) − t_i| + Σ_{i∈I^c} |t_i − t_i^*| + ‖g(x)_+‖ + ‖h(x)‖ ).   (31)

Due to the continuity of the function ω, we may choose δ_3 ∈ (0, δ_2) such that (x, ω(x)) ∈ B_{δ_2}(x*, t*) for any x ∈ B_{δ_3}(x*). Thus, by letting t = ω(x) in (31), it follows that for any x ∈ B_{δ_3}(x*) ∩ Ω,

Π(x*, t*) ≤ Π(x, ω(x)) + κL_Π ( Σ_{i∈I^c} |ω_i(x) − t_i^*| + ‖g(x)_+‖ + ‖h(x)‖ ).   (32)

Since ∂̂φ_i(t_i^*) = ℜ for any i ∈ I^c, it then follows from Lemma 2 and the continuity of ω that there exists δ_4 ∈ (0, δ_3) such that

φ_i(ω_i(x)) − φ_i(t_i^*) ≥ κL_Π |ω_i(x) − t_i^*|  ∀x ∈ B_{δ_4}(x*), ∀i ∈ I^c.
This and (32) imply that for any x ∈ B_{δ_4}(x*) ∩ Ω,

f(x*) + Σ_{i=1}^s φ_i(t_i^*)
  = Π(x*, t*) + Σ_{i∈I^c} φ_i(t_i^*)
  ≤ Π(x, ω(x)) + Σ_{i∈I^c} φ_i(ω_i(x)) + Σ_{i∈I^c} φ_i(t_i^*) − Σ_{i∈I^c} φ_i(ω_i(x))
    + κL_Π ( Σ_{i∈I^c} |ω_i(x) − t_i^*| + ‖g(x)_+‖ + ‖h(x)‖ )
  ≤ f(x) + Σ_{i=1}^s φ_i(ω_i(x)) − κL_Π Σ_{i∈I^c} |ω_i(x) − t_i^*|
    + κL_Π ( Σ_{i∈I^c} |ω_i(x) − t_i^*| + ‖g(x)_+‖ + ‖h(x)‖ )
  = f(x) + Σ_{i=1}^s φ_i(ω_i(x)) + κL_Π (‖g(x)_+‖ + ‖h(x)‖).

Then the desired result follows immediately by the equivalence of all norms in finite dimensional spaces. □
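To make the mechanism of Theorem 2 concrete, consider the following toy instance (our own illustration, not taken from the paper): s = 1, φ(t) = √|t|, ω(x) = x, Ω = ℜ, f(x) = −2x, and the single linear constraint h(x) = x − 1 = 0, so that x* = 1 and t* = 1. Here φ is smooth at t*, so I^c is empty and the restricted error bound holds by linearity; Theorem 2 then predicts a finite penalty threshold, and a grid search suggests the penalized objective is minimized at x* once ρ is large enough (the threshold of roughly 1.5 below is specific to this toy problem).

```python
import math

def penalized(x, rho):
    """Penalized objective f(x) + phi(omega(x)) + rho*|h(x)| for the toy
    instance f(x) = -2x, phi(t) = sqrt(|t|), omega(x) = x, h(x) = x - 1."""
    return -2.0 * x + math.sqrt(abs(x)) + rho * abs(x - 1.0)

# Grid around the constrained minimizer x* = 1.
xs = [0.5 + 0.0005 * k for k in range(2001)]
val_at_star = penalized(1.0, 0.0)   # = -1.0; the penalty vanishes at x* for every rho

# For rho = 2 the penalty is exact: x* = 1 minimizes the penalized problem on the grid.
assert all(penalized(x, 2.0) >= val_at_star - 1e-12 for x in xs)
# For rho = 1 it is not: feasibility is traded away, e.g. at x = 1.2.
assert penalized(1.2, 1.0) < val_at_star
print("finite penalty threshold observed, as Theorem 2 predicts")
```

The names `penalized` and the threshold value are ours; the check only illustrates, on one instance, the qualitative statement of the theorem.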
It should be noted that Theorem 2 can be applied to a class of sparse optimization problems. For the bridge penalty φ(t) = |t|^p with p ∈ (0, 1), which is widely used in the sparse optimization literature, it is easy to see that φ is not Lipschitz around t* = 0. However, it is not hard to verify that ∂̂φ(t*) = ℜ, and thus the bridge penalty is a suitable outer function in the sense required by Theorem 2. In the following, we give some exact penalization results for problem (1) where the objective function involves the bridge penalty function.

The following result shows that the problem considered in [14], augmented with an extra abstract constraint set which is the union of finitely many polyhedral sets, admits an exact penalization.

Corollary 4
Assume that x* is a local minimizer of problem (1), where

Φ(x) := Σ_{i=1}^s |a_i^T x|^p + δ_Ω(x)

with a_i ∈ ℜ^d, p ∈ (0, 1), and Ω ⊆ ℜ^d the union of finitely many polyhedral sets. Assume further that g, h are linear. Then there exists ρ* > 0 such that for any ρ ≥ ρ*, x* is also a local minimizer of the exact penalization problem

min_{x∈Ω} f(x) + Σ_{i=1}^s |a_i^T x|^p + ρ(‖g(x)_+‖ + ‖h(x)‖).

Proof
Let φ(t) := |t|^p and t_i^* := a_i^T x*, i = 1, . . . , s. It is easy to verify that

I := {i : ∂^∞φ(t_i^*) = {0}} = {i : t_i^* ≠ 0}

and ∂̂φ(t_i^*) = ℜ for any i ∈ I^c, where I^c is the complement of I with respect to {1, . . . , s}. Since the constraint set

{ (x, t, p) : g(x) + p_g ≤ 0, h(x) + p_h = 0, t_i − t_i^* + p_{t_i} = 0 (i ∈ I^c), a_i^T x − t_i + p_{a_i} = 0 (i = 1, . . . , s), x ∈ Ω }

is the union of finitely many polyhedral sets, it then follows from the corollary in [30, Page 210] that the local error bound condition holds everywhere for the constraint set

{ (x, t) ∈ Ω × ℜ^s : g(x) ≤ 0, h(x) = 0, t_i − t_i^* = 0 (i ∈ I^c), a_i^T x − t_i = 0 (i = 1, . . . , s) }.

Then the desired result follows immediately from Theorem 2. □
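The key property behind Corollary 4, namely that ∂̂φ(0) = ℜ for the bridge penalty φ(t) = |t|^p, can be checked numerically via the characterization in Lemma 2: for every M > 0 the radius δ = M^{−1/(1−p)} works, since |t|^p ≥ M|t| exactly when |t| ≤ δ. The following sketch (our own illustration) verifies this growth condition and, at the same time, the failure of any Lipschitz bound at 0.

```python
p = 0.5                          # bridge penalty exponent, phi(t) = |t| ** p
phi = lambda t: abs(t) ** p

for M in (1.0, 10.0, 100.0):
    delta = M ** (-1.0 / (1.0 - p))   # for |t| <= delta we have |t|**p >= M * |t|
    ts = [delta * (k / 1000.0) for k in range(1, 1001)]
    # Lemma 2 growth condition: phi(t) - phi(0) >= M * |t - 0| on B_delta(0).
    assert all(phi(t) - phi(0.0) >= M * abs(t) - 1e-12 for t in ts)

# phi is nevertheless not Lipschitz at 0: the difference quotient blows up.
quotients = [(phi(10.0 ** -k) - phi(0.0)) / 10.0 ** -k for k in range(1, 8)]
assert quotients == sorted(quotients) and quotients[-1] > 1e3
print("regular subdifferential at 0 is all of R, yet no Lipschitz bound holds")
```

The specific constants (p = 1/2, the grid, the tolerance) are arbitrary choices for the demonstration.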
We next give an exact penalization result for a problem which is more general than the one considered in [25].
Corollary 5
Assume that x* is a local minimizer of problem (1), where

Φ(x) := Σ_{i=1}^s [(b_i − a_i^T x)_+]^p + δ_Ω(x)

with a_i ∈ ℜ^d, b_i ∈ ℜ, p ∈ (0, 1), and Ω ⊆ ℜ^d the union of finitely many polyhedral sets. Assume further that g, h are linear. Then there exists ρ* > 0 such that for any ρ ≥ ρ*, x* is also a local minimizer of the exact penalization problem

min_{x∈Ω} f(x) + Σ_{i=1}^s [(b_i − a_i^T x)_+]^p + ρ(‖g(x)_+‖ + ‖h(x)‖).

Proof
Let φ(t) := |t|^p and t_i^* := (b_i − a_i^T x*)_+, i = 1, . . . , s. Using the same notations I, I^c as in the proof of Corollary 4, it suffices to investigate the local error bound condition for the constraint set

{ (x, t) ∈ Ω × ℜ^s : g(x) ≤ 0, h(x) = 0, t_i − t_i^* = 0 (i ∈ I^c), (b_i − a_i^T x)_+ − t_i = 0 (i = 1, . . . , s) }.

It is easy to see that the parametric counterpart of the above set,

{ (x, t, p) : g(x) + p_g ≤ 0, h(x) + p_h = 0, t_i − t_i^* + p_{t_i} = 0 (i ∈ I^c), (b_i − a_i^T x)_+ − t_i + p_{a_i} = 0 (i = 1, . . . , s), x ∈ Ω },

is the union of finitely many polyhedral sets. Thus, by the corollary in [30, Page 210], the desired local error bound condition is satisfied. The proof is complete by applying Theorem 2. □

In the rest of this section, we investigate sufficient conditions ensuring an exact penalization for problem (1) where Φ is the sum of a continuous function and an indicator function of a closed subset. In particular, we investigate exact penalization for the following problem:

min_{x∈Ω} f(x) + Ψ(x)
s.t. g(x) ≤ 0,   (33)
     h(x) = 0,

where f, g, h are defined as in problem (1), Ψ : ℜ^d → ℜ is a continuous function, and Ω is a closed subset of ℜ^d. As discussed in Section 1, when the objective function of a nonlinear program is locally Lipschitz, the admittance of a local error bound for its constraint region is sufficient to ensure an exact penalization. For this purpose, we introduce the following auxiliary problem, whose objective function is locally Lipschitz:

min_{(x,y)∈Ω×ℜ} f(x) + y
s.t. Ψ(x) − y = 0,   (34)
     g(x) ≤ 0, h(x) = 0.

In the case where Ω = ℜ^d, the constraint region of problem (34) can be rewritten as

Λ = {(x, y) ∈ gph Ψ : g(x) ≤ 0, h(x) = 0}.
We observe that, by the definition of the coderivative, the inclusion

0 ∈ D*Ψ(x)(0) + Σ_{i∈I_g(x)} λ_i ∂g_i(x) + Σ_{j=1}^m ∂(μ_j h_j)(x)

can be rewritten as

(0, 0) ∈ Σ_{i∈I_g(x)} λ_i ∂g_i(x) × {0} + Σ_{j=1}^m ∂(μ_j h_j)(x) × {0} + N_{gph Ψ}(x, Ψ(x)).

Hence if x* is D*-quasi-normal for problem (33), then (x*, Ψ(x*)) is quasi-normal for problem (34). Moreover, gph Ψ is a closed subset of ℜ^{d+1} by the continuity of Ψ. These facts and the local Lipschitzness of g, h enable us to apply [19, Corollary 5.3] to derive the local error bound condition at (x*, Ψ(x*)): there exist δ > 0 and κ > 0 such that

dist_Λ(x, y) ≤ κ(‖g(x)_+‖ + ‖h(x)‖)  ∀(x, y) ∈ B_δ(x*, Ψ(x*)) ∩ gph Ψ.

The exact penalization result then follows from applying Clarke's exact penalization principle [15, Proposition 2.4.3] to problem (34). Unfortunately, the above argument does not work when the abstract constraint set Ω is not the whole space ℜ^d. Nevertheless, we have succeeded in deriving the following local error bound result under the D*-quasi-normality given in Definition 2.

Lemma 3
Suppose that D*-quasi-normality holds at x* ∈ F. Then the set

Λ := {(x, y) ∈ Ω × ℜ : Ψ(x) − y = 0, g(x) ≤ 0, h(x) = 0}   (35)

admits a local error bound at (x*, y*) with y* := Ψ(x*), that is, there exist δ > 0 and κ > 0 such that

dist_Λ(x, y) ≤ κ(|Ψ(x) − y| + ‖g(x)_+‖ + ‖h(x)‖)  ∀(x, y) ∈ B_δ(x*, y*) ∩ (Ω × ℜ).

Proof
First we observe that Λ defined in (35) can be rewritten as

Λ = {(x, y) : Ξ(x, y) + dist_Ω(x) = 0},

where Ξ(x, y) := max{H(x, y), g_1(x), . . . , g_n(x), |h_1(x)|, . . . , |h_m(x)|} with H(x, y) := |G(x, y)| and G(x, y) := Ψ(x) − y. Then, in order to obtain the desired result, by [32, Theorem 3.1] it suffices to show that there exist δ̃ > 0 and κ̃ > 0 such that

‖π‖ ≥ κ̃  ∀π ∈ ∂(Ξ + dist_Ω)(x, y), ∀(x, y) ∈ B_δ̃(x*, y*) ∩ (Ω × ℜ) \ Λ.   (36)

We now make some preparations for the subsequent analysis. Since |·| is globally Lipschitz and G is continuous, it follows from Proposition 2(iii) that for any (x, y),

∂H(x, y) ⊆ ⋃_{ξ∈∂|G(x,y)|} D*G(x, y)(ξ) ⊆ ⋃_{ξ∈∂|G(x,y)|} D*Ψ(x)(ξ) × {−ξ},   (37)

∂^∞H(x, y) ⊆ D*G(x, y)(0) ⊆ D*Ψ(x)(0) × {0}.   (38)

Since dist_Ω(·) is globally Lipschitz and Ξ is continuous, by Proposition 2(i) we have that for any (x, y),

∂(Ξ + dist_Ω)(x, y) ⊆ ∂Ξ(x, y) + ∂dist_Ω(x) × {0}.   (39)

Since g, h are both Lipschitz around x*, by Proposition 2(iv) it follows that for any (x, y) with x sufficiently close to x*, there exists (α, β, γ) ∈ M(x, y), where

M(x, y) := { (α, β, γ) : α ≥ 0, β ≥ 0, γ ≥ 0, α + ‖β‖ + ‖γ‖ = 1,
             α(H(x, y) − Ξ(x, y)) = 0,
             β_i(g_i(x) − Ξ(x, y)) = 0, i = 1, . . . , n,
             γ_j(|h_j(x)| − Ξ(x, y)) = 0, j = 1, . . . , m },

such that

∂Ξ(x, y) ⊆ ⋃_{(α,β,γ)∈M(x,y)} { α ⋄ ∂H(x, y) + Σ_{i=1}^n β_i ∂g_i(x) × {0} + Σ_{j=1}^m γ_j ∂|h_j|(x) × {0} }.   (40)

In the following, we prove (36) by contradiction. Assume to the contrary that there exist a sequence {(x^k, y^k)} with (x^k, y^k) ∈ (Ω × ℜ) \ Λ converging to (x*, y*) and π^k ∈ ∂(Ξ + dist_Ω)(x^k, y^k) such that π^k → 0. Then it follows from (39)–(40) that there exists (α^k, β^k, γ^k) ∈ M(x^k, y^k) such that

π^k ∈ α^k ⋄ ∂H(x^k, y^k) + Σ_{i=1}^n β_i^k ∂g_i(x^k) × {0} + Σ_{j=1}^m γ_j^k ∂|h_j|(x^k) × {0} + ∂dist_Ω(x^k) × {0}.   (41)

Noting that (x^k, y^k) ∈ (Ω × ℜ) \ Λ, we have that

Ξ(x^k, y^k) > 0  ∀k.   (42)

Since (α^k, β^k, γ^k) ∈ M(x^k, y^k), it follows that α^k ≥ 0, β^k ≥ 0, γ^k ≥ 0, and

α^k + ‖β^k‖ + ‖γ^k‖ = 1,   (43)
α^k (H(x^k, y^k) − Ξ(x^k, y^k)) = 0,   (44)
β_i^k (g_i(x^k) − Ξ(x^k, y^k)) = 0, i = 1, . . . , n,   (45)
γ_j^k (|h_j(x^k)| − Ξ(x^k, y^k)) = 0, j = 1, . . . , m.   (46)

Define γ̄_j^k := sign(h_j(x^k)) γ_j^k, where sign(0) := 0. Since it follows from (42) and (46) that γ_j^k = 0 when h_j(x^k) = 0, it is easy to see that ‖γ^k‖ = ‖γ̄^k‖. It then follows from (43) that

α^k + ‖β^k‖ + ‖γ̄^k‖ = 1.   (47)

Moreover, by Proposition 2(ii), we have that

γ_j^k ∂|h_j|(x^k) = ∂(γ̄_j^k h_j)(x^k).   (48)

We continue the proof by considering the two separate cases as follows.

Case (a): There exists a subsequence {α^k}_{k∈K} with K ⊆ ℕ such that α^k = 0 for any k ∈ K. Then it follows from (38), (41), (48), and the definition of the notation ⋄ that for any k ∈ K,

π^k ∈ D*Ψ(x^k)(0) × {0} + Σ_{i=1}^n β_i^k ∂g_i(x^k) × {0} + Σ_{j=1}^m ∂(γ̄_j^k h_j)(x^k) × {0} + ∂dist_Ω(x^k) × {0}.   (49)

In this case, it follows from (47) that ‖β^k‖ + ‖γ̄^k‖ = 1. Thus, there must exist subsequences {β^k}_{k∈K_0} and {γ̄^k}_{k∈K_0} with K_0 ⊆ K such that as K_0 ∋ k → ∞,

β^k → β* ≥ 0, γ̄^k → γ* with ‖β*‖ + ‖γ*‖ = 1.   (50)

Taking limits on both sides of (49), it then follows from (50), Proposition 1, and the local boundedness of the limiting subdifferential of locally Lipschitz functions that

0 ∈ D*Ψ(x*)(0) + Σ_{i=1}^n β_i^* ∂g_i(x*) + Σ_{j=1}^m ∂(γ_j^* h_j)(x*) + ∂dist_Ω(x*).   (51)

If g_i(x*) < 0, then g_i(x^k) < 0 for any k sufficiently large. Thus by (42) and (45), β_i^k = 0 for any k sufficiently large. This together with (50) implies that β_i^* = 0. In conclusion, we have

β_i^* ≥ 0, β_i^* g_i(x*) = 0, i = 1, . . . , n.   (52)

Moreover, if β_i^* > 0, then by (50) we have β_i^k > 0 for any k ∈ K_0 sufficiently large. This and (45) imply that g_i(x^k) = Ξ(x^k, y^k) > 0. If γ_j^* ≠ 0, then by (50) we have γ_j^* γ̄_j^k > 0 for any k ∈ K_0 sufficiently large. Thus, it follows from the definition of γ̄_j^k and the relation γ_j^k ≥ 0 that γ_j^* h_j(x^k) > 0 for any k ∈ K_0 sufficiently large. Thus, we have

β_i^* > 0 ⟹ g_i(x^k) > 0,  γ_j^* ≠ 0 ⟹ γ_j^* h_j(x^k) > 0,

which together with (50)–(52) contradicts D*-quasi-normality at x* by using the relation ∂dist_Ω(x*) ⊆ N_Ω(x*).

Case (b): There exists a subsequence {α^k}_{k∈K} with K ⊆ ℕ such that α^k > 0 for any k ∈ K. In this case, it then follows from (37), (41), (48), and the definition of the notation ⋄ that for any k ∈ K, there exists ξ^k ∈ ∂|G(x^k, y^k)| such that

π^k ∈ D*Ψ(x^k)(α^k ξ^k) × {−α^k ξ^k} + Σ_{i=1}^n β_i^k ∂g_i(x^k) × {0} + Σ_{j=1}^m ∂(γ̄_j^k h_j)(x^k) × {0} + ∂dist_Ω(x^k) × {0}.   (53)

Since α^k > 0, it follows from (42) and (44) that H(x^k, y^k) = Ξ(x^k, y^k) > 0 and hence

ξ^k = 1 if Ψ(x^k) − y^k > 0,  ξ^k = −1 otherwise.   (54)

It then follows that |ᾱ^k| = α^k, where ᾱ^k := α^k ξ^k. Thus by (47), it follows that

|ᾱ^k| + ‖β^k‖ + ‖γ̄^k‖ = 1.

Without loss of generality, we may assume that as K ∋ k → ∞,

ᾱ^k → α*, β^k → β*, γ̄^k → γ* with |α*| + ‖β*‖ + ‖γ*‖ = 1.   (55)

It then follows from (53) and the relation π^k → 0 that α^k → 0 as K ∋ k → ∞. Thus by (55), we have that

α* = 0, ‖β*‖ + ‖γ*‖ = 1.

Taking limits on both sides of (53), it then follows from (55), Proposition 1, and the local boundedness of the limiting subdifferential of locally Lipschitz functions that

0 ∈ D*Ψ(x*)(0) + Σ_{i=1}^n β_i^* ∂g_i(x*) + Σ_{j=1}^m ∂(γ_j^* h_j)(x*) + ∂dist_Ω(x*).

The rest of the proof for case (b) is similar to that for case (a). Therefore, there exist δ̃ > 0 and κ̃ > 0 such that (36) holds, and the proof is complete. □
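As a sanity check on the error bound asserted in Lemma 3, consider the following toy instance of our own (not from the paper): d = 1, Ψ(x) = |x|, a single inequality constraint g(x) = −x ≤ 0, no equality constraints, and Ω = ℜ. Then Λ = {(t, t) : t ≥ 0}, whose distance can be computed exactly by projecting onto this ray, and near the feasible point (1, 1) the bound dist_Λ(x, y) ≤ κ(|Ψ(x) − y| + ‖g(x)_+‖) can be verified numerically with κ = 1.

```python
import math
import random

def residual(x, y):
    """|Psi(x) - y| + ||g(x)_+|| for Psi(x) = |x| and g(x) = -x (i.e. x >= 0)."""
    return abs(abs(x) - y) + max(-x, 0.0)

def dist_to_Lambda(x, y):
    """Distance from (x, y) to Lambda = {(t, t) : t >= 0}, via projection onto the ray."""
    s = max((x + y) / 2.0, 0.0)
    return math.hypot(x - s, y - s)

random.seed(0)
kappa = 1.0
for _ in range(10000):
    # Sample points in a box around the feasible point (1, 1).
    x = 1.0 + random.uniform(-0.5, 0.5)
    y = 1.0 + random.uniform(-0.5, 0.5)
    assert dist_to_Lambda(x, y) <= kappa * residual(x, y) + 1e-12

print("local error bound of Lemma 3 verified on this toy instance")
```

Here κ = 1 is simply an observed modulus for this instance (in fact, near (1, 1) the true ratio is 1/√2); the lemma guarantees only that some κ exists.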
We are now ready to give the exact penalization result for problem (33).
Theorem 3
Let x* be a local minimizer of problem (33). If D*-quasi-normality holds at x*, then there exists ρ* > 0 such that for any ρ ≥ ρ*, x* is also a local minimizer of the exact penalization problem

min_{x∈Ω} f(x) + Ψ(x) + ρ(‖g(x)_+‖ + ‖h(x)‖).

Proof
By the local optimality of x*, it is easy to see that (x*, y*) with y* := Ψ(x*) is a local minimizer of the following auxiliary problem:

min f(x) + y  s.t. (x, y) ∈ Λ,

where Λ is defined in Lemma 3. Denote by ℓ the Lipschitz constant of the objective function f(x) + y around (x*, y*). By Clarke's exact penalization principle [15, Proposition 2.4.3], there exists δ_1 > 0 such that

f(x*) + y* ≤ f(x) + y + ℓ dist_Λ(x, y)  ∀(x, y) ∈ B_{δ_1}(x*, y*).

Then it follows from Lemma 3 that there exist δ_2 ∈ (0, δ_1) and κ > 0 such that for any (x, y) ∈ B_{δ_2}(x*, y*) ∩ (Ω × ℜ),

f(x*) + y* ≤ f(x) + y + ℓ dist_Λ(x, y) ≤ f(x) + y + κℓ(|Ψ(x) − y| + ‖g(x)_+‖ + ‖h(x)‖).   (56)

By the continuity of Ψ, we may choose δ_3 ∈ (0, δ_2) such that (x, Ψ(x)) ∈ B_{δ_2}(x*, y*) for any x ∈ B_{δ_3}(x*). Then the desired result follows immediately from (56) by letting y = Ψ(x) and ρ* := κℓ. □
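Theorem 3 can be visualized on a one-dimensional toy problem of our own making: f(x) = x², Ψ(x) = |x|, Ω = ℜ, and the single constraint h(x) = x − 1 = 0, so x* = 1 is the only feasible point, and the constraint gradient ∇h ≡ 1 makes the needed qualification easy to check. A grid search suggests that the penalized objective x² + |x| + ρ|x − 1| is minimized at x* once ρ is large enough (above roughly 3 for this instance), while a too-small ρ destroys exactness.

```python
def penalized(x, rho):
    """f(x) + Psi(x) + rho*|h(x)| with f(x) = x**2, Psi(x) = |x|, h(x) = x - 1."""
    return x * x + abs(x) + rho * abs(x - 1.0)

xs = [0.001 * k for k in range(2001)]   # grid on [0, 2] around x* = 1
val_at_star = penalized(1.0, 0.0)       # = 2.0; the penalty vanishes at x* for every rho

# rho = 4 (> 3): x* = 1 minimizes the penalized objective over the grid.
assert all(penalized(x, 4.0) >= val_at_star - 1e-12 for x in xs)
# rho = 1 (< 3): the penalized problem prefers the infeasible point x = 0.
assert penalized(0.0, 1.0) < val_at_star
print("an exact penalty threshold exists in this toy example, as Theorem 3 predicts")
```

All constants here (the threshold 3, the grid, the tolerance) are specific to this illustration, not part of the theorem.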
We thank the referees for their helpful suggestions and comments that have helped us to improve the presentation of the paper. We would also like to thank Jim Burke for a discussion on the topic of this research.
References
1. J.M. Abadie, On the Kuhn-Tucker theorem, in Nonlinear Programming, J. Abadie, ed., John Wiley, New York, 1967, 21–36.
2. R. Andreani, G. Haeser, M.L. Schuverdt and P.J. Silva, A relaxed constant positive linear dependence constraint qualification and applications, Math. Program., 135 (2012), 255–273.
3. R. Andreani, G. Haeser, M.L. Schuverdt and P.J. Silva, Two new weak constraint qualifications and applications, SIAM J. Optim., 22 (2012), 1109–1135.
4. R. Andreani, J.M. Martinez, A. Ramos and P.J. Silva, A cone-continuity constraint qualification and algorithmic consequences, SIAM J. Optim., 26 (2016), 96–110.
5. R. Andreani, J.M. Martinez, A. Ramos and P.J. Silva, Strict constraint qualifications and sequential optimality conditions for constrained optimization.
6. R. Andreani, J.M. Martinez and M.L. Schuverdt, On the relations between constant positive linear dependence condition and quasinormality constraint qualification, J. Optim. Theory Appl., 125 (2005), 473–485.
7. D.P. Bertsekas and A.E. Ozdaglar, Pseudonormality and a Lagrange multiplier theory for constrained optimization, J. Optim. Theory Appl., 114 (2002), 287–343.
8. W. Bian and X. Chen, Linearly constrained non-Lipschitz optimization for image restoration, SIAM J. Imaging Sci., 8 (2015), 2294–2322.
9. J. Borwein, J. Treiman and Q. Zhu, Necessary conditions for constrained optimization problems with semicontinuous and continuous data, Trans. Amer. Math. Soc., 350 (1998), 2409–2429.
10. A.M. Bruckstein, D.L. Donoho and M. Elad, From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Rev., 51 (2009), 34–81.
11. R. Chartrand, Exact reconstruction of sparse signals via nonconvex minimization, IEEE Signal Process. Lett., 14 (2007), 707–710.
12. X. Chen, L. Guo, Z. Lu and J.J. Ye, An augmented Lagrangian method for non-Lipschitz nonconvex programming, SIAM J. Numer. Anal., in press.
13. C. Chen, X. Li, C. Tolman, S. Wang and Y. Ye, Sparse portfolio selection via quasi-norm regularization, ArXiv preprint, arXiv:1312.6350, 2014.
14. X. Chen, L. Niu and Y. Yuan, Optimality conditions and a smoothing trust region Newton method for non-Lipschitz optimization, SIAM J. Optim., 23 (2013), 1528–1552.
15. F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley-Interscience, New York, 1983.
16. F.H. Clarke, Yu.S. Ledyaev, R.J. Stern and P.R. Wolenski, Nonsmooth Analysis and Control Theory, Springer, New York, 1998.
17. H. Gfrerer, First order and second order characterizations of metric subregularity and calmness of constraint set mappings, SIAM J. Optim., 21 (2011), 1439–1474.
18. M. Guignard, Generalized Kuhn-Tucker conditions for mathematical programming problems in a Banach space, SIAM J. Contr., 7 (1969), 232–241.
19. L. Guo, J.J. Ye and J. Zhang, Mathematical programs with geometric constraints in Banach spaces: enhanced optimality, exact penalty, and sensitivity, SIAM J. Optim., 23 (2013), 2295–2319.
20. L. Guo, J. Zhang and G.H. Lin, New results on constraint qualifications for nonlinear extremum problems and extensions, J. Optim. Theory Appl., 163 (2014), 737–754.
21. A. Ioffe and J.V. Outrata, On metric and calmness qualification conditions in subdifferential calculus, Set-Valued Anal., 16 (2008), 199–227.
22. A. Jourani and L. Thibault, The approximate subdifferential of composite functions, Bull. Aust. Math. Soc., 47 (1993), 443–456.
23. A.Y. Kruger and B.S. Mordukhovich, New necessary optimality conditions in problems of nondifferentiable programming, in Numerical Methods of Nonlinear Programming, 116–119, Kharkov, 1979 (in Russian).
24. Y.F. Liu, Y.H. Dai and S. Ma, Joint power and admission control: non-convex ℓq approximation and an effective polynomial time deflation approach, IEEE Trans. Signal Process., 63 (2015), 3641–3656.
25. Y.F. Liu, S. Ma, Y.H. Dai and S. Zhang, A smoothing SQP framework for a class of composite ℓq minimization over polyhedron, Math. Program., 158 (2016), 467–500.
26. L. Minchenko and S. Stakhovski, Parametric nonlinear programming problems under the relaxed constant rank condition, SIAM J. Optim., 21 (2011), 314–332.
27. B.S. Mordukhovich, Metric approximations and necessary optimality conditions for general classes of nonsmooth extremal problems, Soviet Math. Dokl., 22 (1980), 526–530.
28. B.S. Mordukhovich, Variational Analysis and Generalized Differentiation, I: Basic Theory, II: Applications, Grundlehren der Mathematischen Wissenschaften 330, Springer, Berlin, 2006.
29. B.S. Mordukhovich and Y.H. Shao, Nonsmooth sequential analysis in Asplund spaces, Trans. Amer. Math. Soc., 348 (1996), 1235–1280.
30. S.M. Robinson, Some continuity properties of polyhedral multifunctions, Math. Program. Stud., 14 (1981), 206–214.
31. R.T. Rockafellar and R.J.-B. Wets, Variational Analysis, Springer, 1998.
32. Z. Wu and J.J. Ye, Sufficient conditions for error bounds, SIAM J. Optim., 12 (2001), 421–435.
33. J.J. Ye and J. Zhang,