Necessary Optimality Conditions and Exact Penalization for Non-Lipschitz Nonlinear Programs
Mathematical Programming manuscript No. (will be inserted by the editor)
Dedicated to R. Terry Rockafellar in honor of his birthday

Lei Guo · Jane J. Ye
Received: date / Accepted: date
Abstract
When the objective function is not locally Lipschitz, constraint qualifications are no longer sufficient for Karush-Kuhn-Tucker (KKT) conditions to hold at a local minimizer, let alone ensuring an exact penalization. In this paper, we extend quasi-normality and the relaxed constant positive linear dependence (RCPLD) condition to allow the non-Lipschitzness of the objective function and show that they are sufficient for KKT conditions to be necessary for optimality. Moreover, we derive exact penalization results for the following two special cases. When the non-Lipschitz term in the objective function is the sum of a composite function of a separable lower semi-continuous function with a continuous function and an indicator function of a closed subset, we show that a local minimizer of our problem is also a local minimizer of an exact penalization problem under a local error bound condition for a restricted constraint region and a suitable assumption on the outer separable function. When the non-Lipschitz term is the sum of a continuous function and an indicator function of a closed subset, we also show that our problem admits an exact penalization under an extended quasi-normality involving the coderivative of the continuous function.
Keywords
Non-Lipschitz program · necessary optimality · exact penalization · error bound

Mathematics Subject Classification (2010)

The first author's work was supported in part by NSFC Grant (No. 11401379) and the second author's work was supported in part by NSERC.

Lei Guo
Sino-US Global Logistics Institute, Shanghai Jiao Tong University, Shanghai 200030, China

Jane J. Ye
Department of Mathematics and Statistics, University of Victoria, Victoria, BC, V8W 2Y2, Canada
E-mail: [email protected]
The purpose of this paper is to study necessary optimality conditions and exact penalization for the following non-Lipschitz nonlinear program:

min f(x) + Φ(x)
s.t. g(x) ≤ 0,   (1)
     h(x) = 0,

where f: ℜ^d → ℜ, g: ℜ^d → ℜ^n, h: ℜ^d → ℜ^m are Lipschitz around the point of interest, and Φ: ℜ^d → (−∞, ∞] is an extended-valued lower semi-continuous function.

Including a non-Lipschitz term in the objective function has significantly enlarged the applicability of standard nonlinear programs. For example, it has recently been discovered that when the term Φ belongs to a certain class of non-Lipschitz functions, local minimizers of problem (1) are often sparse. This property makes problem (1) useful for seeking a sparse solution in many fields such as image restoration, signal processing, wireless communication, and portfolio selection in financial applications; see, e.g., [8,10,11,13,24].

It is well known that a constraint qualification is a condition imposed on constraint functions so that Karush-Kuhn-Tucker (KKT) conditions hold at a local minimizer. There exist very weak constraint qualifications such as Guignard's and Abadie's constraint qualifications ([1,18]), but they are not easy to verify since they involve computing the tangent or normal cone of the constraint region. The challenge is to find verifiable constraint qualifications that are applicable to as many situations as possible. For nonlinear programs where the objective function is locally Lipschitz, the verifiable classical constraint qualifications in the literature include the linear independence constraint qualification, Slater's condition, and the Mangasarian-Fromovitz constraint qualification (MFCQ). Moreover, it is well known that when all constraint functions are linear, no constraint qualification is required for KKT conditions to hold at a local minimizer. In recent years, quite a few new and weaker verifiable constraint qualifications have been introduced; see, e.g., [2-7,17,26].
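The sparsity-inducing effect of a non-Lipschitz penalty mentioned above can be seen in a tiny numerical experiment. The following sketch is our own illustration, not from the paper: the data point `a` and penalty weights `lam` are arbitrary choices. A grid search on the one-dimensional model φ(x) = (x − a)² + lam·√|x|, whose penalty is the 1-D instance of the ℓ_{1/2} term Σ_i √|x_i|, shows that a sufficiently large non-Lipschitz penalty pins the minimizer exactly at zero, while a small one leaves it near a.

```python
import numpy as np

def l_half_objective(x, a, lam):
    # (x - a)^2 data-fitting term plus the non-Lipschitz sqrt(|x|) penalty,
    # i.e. a 1-D instance of the l_{1/2} regularized problem
    return (x - a) ** 2 + lam * np.sqrt(np.abs(x))

grid = np.linspace(-1.0, 1.0, 200001)  # fine grid around the origin
a = 0.3

x_big = grid[np.argmin(l_half_objective(grid, a, lam=1.0))]    # strong penalty
x_small = grid[np.argmin(l_half_objective(grid, a, lam=0.01))]  # weak penalty

print(x_big, x_small)  # strong penalty drives the minimizer to (numerically) zero
```

Because √|·| has unbounded slope at the origin, any sufficiently large weight makes x = 0 a strict local (here global) minimizer, which is the sparsity phenomenon the cited applications exploit.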
In particular, quasi-normality is a weak constraint qualification that was first introduced in [7] and extended to locally Lipschitz programs in [33]. The recently introduced relaxed constant positive linear dependence (RCPLD) condition in [2] is also a weak constraint qualification. Both quasi-normality and RCPLD are weaker than MFCQ and hold automatically when all constraint functions are linear (see [7, Proposition 3.1]). Moreover, they each imply a local error bound for the constraint region (see [2,19]) and thus, by Clarke's exact penalization principle [15, Proposition 2.4.3], they are sufficient to ensure an exact penalization when the objective function is locally Lipschitz.

Little research has been done on KKT necessary optimality conditions for non-Lipschitz nonlinear programs in the literature, let alone exact penalization. The Fritz John type necessary optimality conditions for non-Lipschitz programs were first given by Kruger and Mordukhovich in [23] and reproved by Mordukhovich in [27, Theorem 1(b)] and Borwein et al. in [9, Corollary 2.6]. For our problem, since all functions are locally Lipschitz except the objective function, (A1) in [9, Corollary 2.6] never holds and consequently (A2) in [9, Corollary 2.6] holds. Hence, the Fritz John condition [9, Corollary 2.6] for our problem states that at a local minimizer x*, there exist λ ∈ ℜ^{|I*|}_+ and μ ∈ ℜ^m, not all zero, such that at least one of the following cases holds:

(i) 0 ∈ ∂^∞Φ(x*) + Σ_{i∈I*} λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*),
(ii) 0 ∈ ∂(f + Φ)(x*) + Σ_{i∈I*} λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*),

where I* := {i : g_i(x*) = 0}, and ∂, ∂^∞ denote the limiting subdifferential and the horizon subdifferential respectively (see the definitions in Section 2). Consequently, we can derive the KKT necessary optimality condition from the above Fritz John condition immediately as follows.
Suppose that there are no nonzero abnormal multipliers, i.e., the following implication holds:

0 ∈ ∂^∞Φ(x*) + Σ_{i=1}^n λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*),
λ_i ≥ 0, λ_i g_i(x*) = 0, i = 1,...,n   =⇒   (λ, μ) = 0.   (2)

Then the above condition (ii) holds, which means that x* is a KKT point. We call the implication (2) the ∂^∞-no nonzero abnormal multiplier constraint qualification (∂^∞-NNAMCQ) at x*. Note that when Φ is Lipschitz around x*, we have ∂^∞Φ(x*) = {0} and hence ∂^∞-NNAMCQ reduces to the standard NNAMCQ for nonlinear programs with equality and inequality constraints. When Φ is an indicator function of a closed subset, ∂^∞-NNAMCQ reduces to the standard NNAMCQ for nonlinear programs with equality, inequality, and abstract set constraints. When Φ is neither Lipschitz around x* nor equal to an indicator function, the implication (2) involves the horizon subdifferential ∂^∞Φ(x*) of the non-Lipschitz term. Thus, ∂^∞-NNAMCQ is no longer a constraint qualification since it is related to the objective function. However, since it is a condition under which a local minimizer is a KKT point, we call such a condition a qualification condition. Very recently, Chen et al. [12] gave some necessary optimality conditions for problem (1), where the non-Lipschitz term Φ is continuous and all the other functions are continuously differentiable, under RCPLD and the so-called basic qualification (BQ for short) (see the definition in (12)), and proposed an augmented Lagrangian method for solving this kind of problem. It should be noted that BQ is very difficult to verify, as discussed in the paragraph after Corollary 1.

In this paper, we extend the standard quasi-normality and the standard RCPLD to problem (1). Similar to ∂^∞-NNAMCQ, our new qualification conditions also involve ∂^∞Φ(x*) and we call them ∂^∞-quasi-normality and ∂^∞-RCPLD respectively.
Moreover, we derive two exact penalization results for two special cases of problem (1) under some suitable conditions. We summarize our main contributions as follows:

– We introduce two new verifiable qualification conditions called ∂^∞-quasi-normality and ∂^∞-RCPLD respectively and show that they are sufficient for KKT conditions to be necessary for optimality. These two qualification conditions are both weaker than ∂^∞-NNAMCQ and hold automatically when Φ is Lipschitz around the point of interest and g, h are linear. As a by-product, we extend the standard RCPLD on smooth constraint functions to the case where there is an extra abstract set constraint and show that KKT conditions are necessary for optimality.

– Exact penalization results for two special cases of problem (1) are derived. Case i): Φ is the sum of a composite function of a separable lower semi-continuous function with a continuous function and an indicator function of a closed subset. In this case, we show that a local minimizer of problem (1) is also a local minimizer of an exact penalization problem under a local error bound condition for a restricted constraint region and a suitable assumption on the outer separable function. Case ii): Φ is the sum of a continuous function and an indicator function of a closed subset. In this case, we introduce D*-quasi-normality, an extended quasi-normality involving the coderivative of the continuous function, and show that D*-quasi-normality is sufficient to ensure an exact penalization. Note that D*-quasi-normality reduces to the standard quasi-normality for nonlinear programs with equality, inequality, and abstract set constraints when the continuous function is Lipschitz around the point of interest.

The rest of this paper is organized as follows. In Section 2 we give some background materials. In Section 3 we propose some qualification conditions for problem (1).
In Section 4 we derive necessary optimality conditions for problem (1) under these qualification conditions. We investigate some sufficient conditions ensuring an exact penalization for problem (1) in Section 5.

The notations used in this paper are standard in the literature. The symbol N (resp., ℜ, ℜ_+, ℜ_−) denotes the set of nonnegative integers (resp., real numbers, nonnegative real numbers, nonpositive real numbers). For a finite set T, |T| denotes its cardinality. For any x ∈ ℜ^d, we denote by x_+ := max{x, 0} the non-negative part of x, by ‖x‖_p := (Σ_{i=1}^d |x_i|^p)^{1/p} for any p > 0, and by ‖x‖ any norm in ℜ^d. Let B_δ(x) denote a closed ball centered at x with positive radius δ. The indicator function of a subset D ⊆ ℜ^d is denoted by δ_D, and dist_D(x) denotes the Euclidean distance from x to D. Let F denote the constraint region of problem (1) and, for any x ∈ F, denote by I_g(x) := {i : g_i(x) = 0} the index set of active inequality constraints. We say that F admits a local error bound at x̄ ∈ F if there exist δ > 0 and κ > 0 such that

dist_F(x) ≤ κ (‖h(x)‖ + ‖g(x)_+‖) for all x ∈ B_δ(x̄).

We next give some background materials on variational analysis; see, e.g., [15,16,28,31] for more details. For a function φ: ℜ^d → [−∞, ∞] and a point x* ∈ ℜ^d where φ(x*) is finite, the regular (or Fréchet) subdifferential of φ at x* is defined as

∂̂φ(x*) := {v : φ(x) ≥ φ(x*) + v^T(x − x*) + o(‖x − x*‖) ∀x},

the limiting (or Mordukhovich) subdifferential of φ at x* is defined as

∂φ(x*) := {v : ∃ x^k →_φ x*, v^k ∈ ∂̂φ(x^k) s.t. v^k → v},

and the horizon (or singular Mordukhovich) subdifferential of φ at x* is defined as

∂^∞φ(x*) := {v : ∃ x^k →_φ x*, v^k ∈ ∂̂φ(x^k) and t_k → 0, t_k ≥ 0, s.t. t_k v^k → v},

where o(·) means o(α)/α → 0 as α → 0, and x^k →_φ x* means x^k → x* and φ(x^k) → φ(x*) as k → ∞. It is well known that φ is Lipschitz around x* if and only if ∂^∞φ(x*) = {0} by [31, Theorem 9.13].

The regular (or Fréchet) normal cone of D at x* ∈ D is the closed convex cone defined as N̂_D(x*) := ∂̂δ_D(x*), and the limiting (or Mordukhovich) normal cone of D at x* is the closed cone defined as N_D(x*) := ∂δ_D(x*). We say that D is regular at x* if D is locally closed at x* and N_D(x*) = N̂_D(x*).

Given a set-valued mapping S: ℜ^d ⇒ ℜ^m and a point x̄ with S(x̄) ≠ ∅, the coderivative of S at x̄ for any ū ∈ S(x̄) is the mapping D*S(x̄|ū): ℜ^m ⇒ ℜ^d defined by

D*S(x̄|ū)(y) := {v : (v, −y) ∈ N_{gph S}(x̄, ū)},

where gph S := {(x, y) : y ∈ S(x)}. When S is single-valued at x̄ with S(x̄) = ū, the notation D*S(x̄|ū) is simplified to D*S(x̄). In the case where S is not only single-valued but also Lipschitz around x̄, the coderivative is related to the limiting subdifferential by the scalarization formula:

D*S(x̄)(y) = ∂⟨y, S⟩(x̄) ∀ y ∈ ℜ^m.

We say that S is locally bounded at x̄ ∈ ℜ^d if there exist M > 0 and δ > 0 such that

‖v‖ ≤ M ∀ v ∈ S(x), ∀ x ∈ B_δ(x̄).

Recall from [31, Definition 5.4] that S is said to be outer semi-continuous at x̄ if

{v̄ : ∃ x^k → x̄, v^k ∈ S(x^k) s.t. v^k → v̄} ⊆ S(x̄).

It is well known that the limiting normal cone mapping, the limiting subdifferential mapping, and the horizon subdifferential mapping are all outer semi-continuous everywhere; see, e.g., [31, Propositions 6.6 and 8.7].

By using the outer semi-continuity of the limiting normal cone mapping and the definition of the coderivative, it is easy to give the following proposition, which will be useful in deriving exact penalization results in Section 5.
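To make the horizon subdifferential concrete, here is a small numerical sketch of our own (not from the paper) for φ(t) = √|t| at t* = 0. Along x_k = 1/k² ↓ 0 the regular subgradients v_k = 1/(2√(x_k)) = k/2 blow up, so φ is not Lipschitz near 0; yet rescaling by t_k := 1/v_k ↓ 0 gives t_k v_k → 1, exhibiting 1 as an element of ∂^∞φ(0), consistent with ∂^∞φ(0) ≠ {0}.

```python
import math

# phi(t) = sqrt(|t|); at x_k = 1/k^2 > 0 the regular subgradient is the
# ordinary derivative v_k = 1 / (2 * sqrt(x_k)) = k / 2.
ks = [10, 100, 1000, 10000]
vs = [1.0 / (2.0 * math.sqrt(1.0 / k**2)) for k in ks]  # v_k = k/2, unbounded
ts = [1.0 / v for v in vs]                              # t_k -> 0, t_k >= 0
scaled = [t * v for t, v in zip(ts, vs)]                # t_k * v_k -> 1

print(vs[-1], scaled[-1])  # subgradients diverge, scaled products equal 1
```

The divergence of v_k is exactly the failure of the Lipschitz property, while the convergent rescaled sequence is the kind of limit the definition of ∂^∞φ records.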
Proposition 1
The coderivative D*S(x|u): ℜ^m ⇒ ℜ^d is outer semi-continuous in the sense that if there exist v^k ∈ D*S(x^k|u^k)(y^k), where x^k → x*, y^k → y*, and u^k → u* with u^k ∈ S(x^k), such that v^k → v*, then v* ∈ D*S(x*|u*)(y*).
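For a single-valued smooth (hence locally Lipschitz) map, the scalarization formula above gives the coderivative in closed form: D*S(x̄)(y) = ∂⟨y, S⟩(x̄) = ∇S(x̄)ᵀy. The following finite-difference sanity check uses a toy map S of our own choosing (not from the paper):

```python
import numpy as np

def S(x):
    # a smooth single-valued map S: R^2 -> R^2
    return np.array([np.sin(x[0]) + x[1], x[0] * x[1]])

def jacobian(x):
    # exact Jacobian of S
    return np.array([[np.cos(x[0]), 1.0],
                     [x[1], x[0]]])

def num_grad(f, x, h=1e-6):
    # central-difference gradient of a scalar function f at x
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

x_bar = np.array([0.4, -1.2])
y = np.array([2.0, -0.5])

coderiv = jacobian(x_bar).T @ y             # D*S(x_bar)(y) via scalarization
grad = num_grad(lambda x: y @ S(x), x_bar)  # gradient of <y, S> numerically
print(np.max(np.abs(coderiv - grad)))       # agreement up to discretization error
```

In the smooth case the coderivative is thus just the adjoint Jacobian; the set-valued definition via the normal cone to gph S matters only beyond this setting.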
The following proposition collects some useful properties and calculus rules of the limiting subdifferential.
Proposition 2 (i) [31, Exercise 10.10]
Let f, g: ℜ^d → [−∞, ∞] be proper lower semi-continuous around x* ∈ ℜ^d and finite at x*, and let α, β be nonnegative scalars. Assume that at least one of them is Lipschitz around x*. Then

∂(αf + βg)(x*) ⊆ α∂f(x*) + β∂g(x*).

Here we let 0 · ∅ = {0} by convention.

(ii) [22, Theorem 2.5 and Remark (2)] Let g: ℜ^n → ℜ^m be Lipschitz around x* and f: ℜ^m → ℜ be Lipschitz around g(x*). Then the composite function f ∘ g is Lipschitz around x* and

∂(f ∘ g)(x*) ⊆ ∪_{ξ ∈ ∂f(g(x*))} ∂⟨ξ, g⟩(x*).

(iii) [28, Theorem 3.38] Let g: ℜ^d → ℜ^m be continuous at x* and f: ℜ^m → ℜ be Lipschitz around g(x*). Then

∂(f ∘ g)(x*) ⊆ ∪_{ξ ∈ ∂f(g(x*))} D*g(x*)(ξ),   ∂^∞(f ∘ g)(x*) ⊆ D*g(x*)(0).

(iv) [29, Theorem 7.5] Let f(x) := max{f_i(x) : i = 1,...,s}, where f_i: ℜ^d → ℜ is continuous at x* for all i = 1,...,s. If all but at most one of the functions {f_i : i = 1,...,s} are Lipschitz around x*, then

∂f(x*) ⊆ ∪ { Σ_{i∈I*} λ_i ⋄ ∂f_i(x*) : (λ_1,...,λ_s) ∈ Λ* },

where I* := {i : f_i(x*) = f(x*)} is the index set of active indices and

Λ* := { (λ_1,...,λ_s) : λ_i ≥ 0, i ∈ I*, λ_i = 0, i ∉ I*, Σ_{i∈I*} λ_i = 1 }.

Here we define α ⋄ ∂g by α∂g if α > 0 and by ∂^∞g if α = 0.

Since the objective function of problem (1) includes a non-Lipschitz term, KKT conditions are no longer necessary for optimality under constraint qualifications alone, such as the standard quasi-normality [33, Definition 5] and the standard RCPLD [2, Definition 4]. In this section we extend the standard quasi-normality and the standard RCPLD to problem (1) as follows, so that KKT conditions are necessary for optimality under the extended quasi-normality and the extended RCPLD respectively.
Definition 1
Let x ∈ F.

(a) We say that x is ∂^∞-quasi-normal if there is no nonzero vector (λ, μ) ∈ ℜ^n × ℜ^m such that there exists a sequence {x^k} converging to x as k → ∞ satisfying

0 ∈ ∂^∞Φ(x) + Σ_{i=1}^n λ_i ∂g_i(x) + Σ_{j=1}^m ∂(μ_j h_j)(x),   (3)
λ_i ≥ 0, λ_i g_i(x) = 0, i = 1,...,n,   (4)
g_i(x^k) > 0 ∀ i ∈ I, μ_j h_j(x^k) > 0 ∀ j ∈ J, ∀ k ∈ N,   (5)

where I := {i : λ_i > 0}, J := {j : μ_j ≠ 0}, and N is the set of all positive integers.

(b) Assume that g, h are smooth around x. Let J ⊆ {1,...,m} be such that {∇h_j(x) : j ∈ J} is a basis for span{∇h_j(x) : j = 1,...,m}. We say that the ∂^∞-RCPLD condition holds at x if there exists δ > 0 such that

(i) {∇h_j(y) : j = 1,...,m} has the same rank for each y ∈ B_δ(x);
(ii) for each I ⊆ I_g(x), if there exist {λ_i ≥ 0 : i ∈ I} and {μ_j : j ∈ J}, not all zero, such that

0 ∈ ∂^∞Φ(x) + Σ_{i∈I} λ_i ∇g_i(x) + Σ_{j∈J} μ_j ∇h_j(x),   (6)

then {∇g_i(y), ∇h_j(y) : i ∈ I, j ∈ J} is linearly dependent for each y ∈ B_δ(x).

It is easy to see that both ∂^∞-quasi-normality and ∂^∞-RCPLD are weaker than ∂^∞-NNAMCQ (i.e., implication (2)), but the reverse is not true; see Examples 1-2. Note that if Φ is Lipschitz around x, then ∂^∞Φ(x) = {0}, and thus ∂^∞-quasi-normality and ∂^∞-RCPLD reduce to the standard quasi-normality and the standard RCPLD respectively for nonlinear programs with equality and inequality constraints. If Φ is an indicator function of a closed subset Ω, i.e., Φ(x) = δ_Ω(x), then ∂^∞Φ(x) = N_Ω(x) by [31, Exercise 8.14]. Thus ∂^∞-quasi-normality reduces to the standard quasi-normality for nonlinear programs with equality, inequality, and abstract set constraints, and ∂^∞-RCPLD allows us to extend the original definition of RCPLD [2, Definition 4] to the problem where there is an extra abstract set constraint x ∈ Ω, since inclusion (6) becomes

0 ∈ Σ_{i∈I} λ_i ∇g_i(x) + Σ_{j∈J} μ_j ∇h_j(x) + N_Ω(x).
In this case we simply say that RCPLD holds.

We next extend the standard quasi-normality to problem (1) for ensuring an exact penalization.
Definition 2
Suppose that Φ(x) := Ψ(x) + δ_Ω(x), where Ψ is a continuous function and Ω is a closed subset of ℜ^d. Let x ∈ F. We say that x is D*-quasi-normal if there is no nonzero vector (λ, μ) ∈ ℜ^n × ℜ^m such that there exists a sequence {x^k} converging to x as k → ∞ satisfying (4)-(5) and

0 ∈ D*Ψ(x)(0) + Σ_{i=1}^n λ_i ∂g_i(x) + Σ_{j=1}^m ∂(μ_j h_j)(x) + N_Ω(x).

If Ψ is Lipschitz around x, then D*Ψ(x)(0) = {0} and hence D*-quasi-normality reduces to the standard quasi-normality for nonlinear programs with equality, inequality, and abstract set constraints. Since ∂^∞Φ(x) ⊆ D*Φ(x)(0) (see [28, Theorem 1.80]), D*-quasi-normality is stronger than ∂^∞-quasi-normality when Ω = ℜ^d.

We call problem (1) an ℓ_{1/2} minimization problem if the non-Lipschitz term Φ(x) is equal to (‖x‖_{1/2})^{1/2}. The problem in the following example is an ℓ_{1/2} minimization problem with linear constraints. It gives an example for which ∂^∞-RCPLD, ∂^∞-quasi-normality, and D*-quasi-normality are all satisfied but ∂^∞-NNAMCQ does not hold.

Example 1
Consider the following problem:

min f(x) := √|x_1| + √|x_2| + √|x_3| + √|x_4|
s.t. g(x) := x_1 + x_2 + x_3 + x_4 − 2 ≤ 0,
     h_1(x) := x_1 + x_2 − 1 = 0,
     h_2(x) := x_3 + x_4 − 1 = 0.

Let x* = (1, 0, 1, 0). Then ∂^∞f(x*) = {0} × ℜ × {0} × ℜ, ∇g(x) = (1, 1, 1, 1), ∇h_1(x) = (1, 1, 0, 0), and ∇h_2(x) = (0, 0, 1, 1) for any x. Direct verification shows that there exists (λ, μ_1, μ_2) ≠ 0 such that

0 ∈ ∂^∞f(x*) + λ∇g(x*) + μ_1∇h_1(x*) + μ_2∇h_2(x*), λ ≥ 0,

which means that ∂^∞-NNAMCQ does not hold at x*. But in this case, the family of gradients {∇g(x), ∇h_1(x), ∇h_2(x)} is linearly dependent for any x. Thus, ∂^∞-RCPLD holds at x*. To show that ∂^∞-quasi-normality also holds at x*, assume that there exist (λ, μ_1, μ_2) ≠ 0 and a sequence {x^k} converging to x* such that

0 ∈ ∂^∞f(x*) + λ∇g(x*) + μ_1∇h_1(x*) + μ_2∇h_2(x*), λ ≥ 0,   (7)
g(x^k) > 0 if λ > 0, μ_1 h_1(x^k) > 0 if μ_1 ≠ 0, μ_2 h_2(x^k) > 0 if μ_2 ≠ 0.   (8)

By (7), it follows that μ_1 = μ_2 = −λ. Thus λ > 0 and μ_1 = μ_2 < 0. These together with (8) lead to

2 < x_1^k + x_2^k + x_3^k + x_4^k < 2.

This contradiction shows that there is no nonzero (λ, μ_1, μ_2) satisfying (7)-(8). Thus, ∂^∞-quasi-normality holds at x*. Since it is easy to verify that D*f(x*)(0) = ∂^∞f(x*), D*-quasi-normality holds at x* as well. □

The following example of an ℓ_{1/2} minimization problem with nonlinear constraints illustrates that it is possible that ∂^∞-quasi-normality holds but ∂^∞-RCPLD does not hold.

Example 2
Consider the following problem:

min f(x) := √|x_1| + √|x_2| + √|x_3|
s.t. g(x) := x_1 + x_2^2 − x_3^2 − 1 ≤ 0,
     h(x) := x_1 + x_2^2 − 1 = 0.

Let x* = (1, 0, 0). Then ∂^∞-RCPLD does not hold at x*, since there exists (λ, μ) ≠ 0 such that

0 ∈ ∂^∞f(x*) + λ∇g(x*) + μ∇h(x*), λ ≥ 0,

while {∇g(x), ∇h(x)} is linearly independent for any x with x_3 ≠ 0. To show that ∂^∞-quasi-normality holds at x*, assume that there exist (λ, μ) ≠ 0 and a sequence {x^k} converging to x* such that

0 ∈ ∂^∞f(x*) + λ∇g(x*) + μ∇h(x*), λ ≥ 0,   (9)
g(x^k) > 0 if λ > 0, μh(x^k) > 0 if μ ≠ 0.   (10)

By (9), it follows that λ + μ = 0. Thus λ > 0 and μ < 0. But by (10), g(x^k) > 0 gives x_1^k + (x_2^k)^2 > 1 + (x_3^k)^2, while h(x^k) < 0 gives x_1^k + (x_2^k)^2 < 1, so that 1 + (x_3^k)^2 < 1. This contradiction shows that ∂^∞-quasi-normality holds at x*. □

We know that the standard RCPLD does not imply the standard quasi-normality (see [2, Example 2]). It is also easy to see that the standard RCPLD and the standard quasi-normality at a local minimizer x* of the problem

min f(x) s.t. x ∈ F   (11)

are equivalent to ∂^∞-RCPLD and ∂^∞-quasi-normality, respectively, at the local minimizer (x*, 0) of the perturbed problem

min f(x) + √|z| s.t. x ∈ F, z ∈ ℜ.

Thus, if problem (11) is such that the standard RCPLD holds but the standard quasi-normality does not hold, then ∂^∞-RCPLD holds but ∂^∞-quasi-normality does not hold for the above perturbed problem.

We next give some characterizations of ∂^∞-quasi-normality, ∂^∞-RCPLD, and D*-quasi-normality in terms of the standard quasi-normality and the standard RCPLD.

Proposition 3
Let x* ∈ F.

(i) If ∂^∞-quasi-normality holds at x*, then both the standard quasi-normality and the following basic qualification (BQ) hold at x*:

−∂^∞Φ(x*) ∩ N_F(x*) = {0}.   (12)

If g, h are smooth around x*, then ∂^∞-quasi-normality at x* is equivalent to the standard quasi-normality plus BQ (12) at x*.

(ii) ∂^∞-RCPLD holds at x* if and only if both the standard RCPLD and BQ (12) hold at x*.

(iii) If D*-quasi-normality holds at x*, then the standard quasi-normality holds at x* and

−D*Ψ(x*)(0) ∩ N_F(x*) = {0}.   (13)

If Ω is regular and g, h are smooth around x*, then D*-quasi-normality holds at x* if and only if both the standard quasi-normality and condition (13) hold at x*.

Proof (i) Suppose that ∂^∞-quasi-normality holds at x*. Then the standard quasi-normality also holds at x*, since 0 ∈ ∂^∞Φ(x*). Thus by [33, Proposition 4], it follows that

N_F(x*) ⊆ { Σ_{i∈I*} λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*) : λ_i ≥ 0, i ∈ I*, ∃{x^k} → x* s.t. g_i(x^k) > 0 ∀ i ∈ I, μ_j h_j(x^k) > 0 ∀ j ∈ J, ∀ k ∈ N },   (14)

where

I* := I_g(x*), I := {i : λ_i > 0}, J := {j : μ_j ≠ 0}.   (15)

We now show that BQ (12) holds. By contradiction, suppose that 0 ≠ ζ ∈ −∂^∞Φ(x*) ∩ N_F(x*). Then by (14), there exist (λ, μ) ∈ ℜ^{|I*|} × ℜ^m and a sequence {x^k} converging to x* such that

ζ ∈ Σ_{i∈I*} λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*),
λ_i ≥ 0, i ∈ I*, g_i(x^k) > 0 ∀ i ∈ I, μ_j h_j(x^k) > 0 ∀ j ∈ J, ∀ k ∈ N.

Since 0 ≠ ζ ∈ −∂^∞Φ(x*), it then follows that (λ, μ) ≠ 0 and

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I*} λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*),   (16)
λ_i ≥ 0, i ∈ I*, g_i(x^k) > 0 ∀ i ∈ I, μ_j h_j(x^k) > 0 ∀ j ∈ J, ∀ k ∈ N,   (17)

which contradicts ∂^∞-quasi-normality. Thus, BQ (12) holds.

We next show the converse part. Assume that g, h are smooth around x*, and that both the standard quasi-normality and BQ (12) hold at x*.
By contradiction, assume that ∂^∞-quasi-normality does not hold at x*. That is, there exist 0 ≠ (λ, μ) ∈ ℜ^{|I*|} × ℜ^m and a sequence {x^k} converging to x* such that

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*),   (18)
λ_i ≥ 0, i ∈ I*, g_i(x^k) > 0 ∀ i ∈ I, μ_j h_j(x^k) > 0 ∀ j ∈ J, ∀ k ∈ N,   (19)

where I*, I, J are defined as in (15). Moreover, since g, h are smooth around x*, it follows from [31, Theorem 6.14] that

{ Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) : λ_i ≥ 0, i ∈ I*, μ ∈ ℜ^m } ⊆ N_F(x*).   (20)

This and (18) imply that

Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) ∈ −∂^∞Φ(x*) ∩ N_F(x*),

which together with BQ (12) means that

Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) = 0.

This, together with (19) and the relation (λ, μ) ≠ 0, contradicts the standard quasi-normality. Thus, ∂^∞-quasi-normality holds at x*.

(ii) Let the standard RCPLD and BQ (12) hold at x*. Let J be the index set given in the definition of the standard RCPLD such that {∇h_j(x*) : j ∈ J} is a basis for span{∇h_j(x*) : j = 1,...,m}, and let I* be defined as in (15). To show ∂^∞-RCPLD, it suffices to show that Definition 1(b)(ii) holds. Assume that there exist {α_i ≥ 0 : i ∈ I} with I ⊆ I* and {β_j : j ∈ J}, not all zero, such that

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I} α_i ∇g_i(x*) + Σ_{j∈J} β_j ∇h_j(x*),

which together with (12) and (20) implies that

Σ_{i∈I} α_i ∇g_i(x*) + Σ_{j∈J} β_j ∇h_j(x*) = 0.

This and the standard RCPLD imply the existence of δ > 0 such that

{∇g_i(x), ∇h_j(x) : i ∈ I, j ∈ J} is linearly dependent for each x ∈ B_δ(x*).

Thus, Definition 1(b)(ii) holds and hence ∂^∞-RCPLD holds at x*.

To show the converse part, suppose that ∂^∞-RCPLD holds at x*. It then follows immediately that RCPLD holds at x*, since 0 ∈ ∂^∞Φ(x*).
Then by [20, Theorem 3.2], it follows that

N_F(x*) ⊆ { Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) : λ_i ≥ 0, i ∈ I*, μ ∈ ℜ^m }.   (21)

We now show that BQ (12) holds. To the contrary, assume that 0 ≠ ζ ∈ −∂^∞Φ(x*) ∩ N_F(x*). Let J be such that {∇h_j(x*) : j ∈ J} is a basis for span{∇h_j(x*) : j = 1,...,m}. Then by (21), there exist {λ_i > 0 : i ∈ I} with I ⊆ I* and {μ_j : j ∈ J}, not all zero, such that

ζ = Σ_{i∈I} λ_i ∇g_i(x*) + Σ_{j∈J} μ_j ∇h_j(x*),   (22)

which together with the relation ζ ∈ −∂^∞Φ(x*) implies that

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I} λ_i ∇g_i(x*) + Σ_{j∈J} μ_j ∇h_j(x*).

Then by Definition 1(b)(ii), there exist {α_i : i ∈ I} and {β_j : j ∈ J}, not all zero, such that

Σ_{i∈I} α_i ∇g_i(x*) + Σ_{j∈J} β_j ∇h_j(x*) = 0,

which together with (22) implies that for any γ ∈ ℜ,

ζ = Σ_{i∈I} (λ_i − γα_i) ∇g_i(x*) + Σ_{j∈J} (μ_j − γβ_j) ∇h_j(x*).

Choosing γ ≠ 0 with the smallest absolute value such that λ_i − γα_i = 0 for at least one i ∈ I, we are able to represent ζ with at least one fewer of the vectors ∇g_i(x*). We may repeat this procedure until ζ = Σ_{j∈J} θ_j ∇h_j(x*) for some {θ_j : j ∈ J} not all zero. Then by the relation ζ ∈ −∂^∞Φ(x*), it follows that

0 ∈ ∂^∞Φ(x*) + Σ_{j∈J} θ_j ∇h_j(x*).

Thus by Definition 1(b)(ii), {∇h_j(x*) : j ∈ J} must be linearly dependent. This contradicts the fact that {∇h_j(x*) : j ∈ J} is a basis. Hence BQ (12) holds.

(iii) When g, h are smooth around x* and Ω is regular, it follows from [31, Theorem 6.14] that

{ Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) + N_Ω(x*) : λ_i ≥ 0, i ∈ I*, μ ∈ ℜ^m } ⊆ N_F(x*).   (23)

The proof of (iii) is exactly the same as that of (i), except that ∂^∞Φ(x*) and (20) are replaced by D*Ψ(x*)(0) and (23), respectively.
□

When the constraint region is so simple that its limiting normal cone is easy to calculate directly, we can use Proposition 3 to verify our proposed qualification conditions. The following simple minimax problem illustrates that conditions (12)-(13) hold, and hence ∂^∞-quasi-normality, ∂^∞-RCPLD, and D*-quasi-normality are all satisfied, since the constraint function is linear.

Example 3
Consider the following minimax problem:

min_{x≤0} max_{y≥0} −x + xy.

Let V(x) := max{−x + xy : y ≥ 0}. Then it is easy to verify that the above minimax problem can be equivalently rewritten as

min V(x) s.t. x ≤ 0,   (24)

where V(x) = −x if x ≤ 0 and V(x) = ∞ otherwise. Clearly, x* = 0 is a minimizer of problem (24). We observe that V is not continuous at x* but is lower semi-continuous at x*. Moreover,

∂^∞V(x*) = D*V(x*)(0) = N_{ℜ_−}(x*) = ℜ_+,

which indicates that −∂^∞V(x*) ∩ N_{ℜ_−}(x*) = −D*V(x*)(0) ∩ N_{ℜ_−}(x*) = {0}. Since the constraint function of problem (24) is linear, the standard RCPLD obviously holds and, by [7, Proposition 3.1], the standard quasi-normality is also satisfied. It then follows from Proposition 3 that ∂^∞-quasi-normality, ∂^∞-RCPLD, and D*-quasi-normality are all satisfied at x* for problem (24). □

By using the outer semi-continuity of the horizon subdifferential mapping and Proposition 1, it is not difficult to show that both ∂^∞-quasi-normality and D*-quasi-normality are locally persistent, as follows.

Proposition 4 If ∂^∞-quasi-normality (D*-quasi-normality) holds at x* ∈ F, then there exists δ > 0 such that ∂^∞-quasi-normality (D*-quasi-normality) holds at every point in B_δ(x*) ∩ F.

We now show that ∂^∞-RCPLD is also locally persistent.

Proposition 5 If ∂^∞-RCPLD holds at x* ∈ F, then there exists δ > 0 such that ∂^∞-RCPLD holds at every point in B_δ(x*) ∩ F.

Proof Assume that ∂^∞-RCPLD holds at x*. Let J ⊆ {1,...,m} be such that {∇h_j(x*) : j ∈ J} is a basis for span{∇h_j(x*) : j = 1,...,m}. Then it is easy to see that there exists δ_1 ∈ (0, δ_0) such that {∇h_j(x) : j ∈ J} is linearly independent for all x ∈ B_{δ_1}(x*), where δ_0 is the constant given in Definition 1(b). It then follows from Definition 1(b)(i) that {∇h_j(x) : j ∈ J} is a basis for span{∇h_j(x) : j = 1,...,m} for any x ∈ B_{δ_1}(x*). Let δ_2 := δ_1/2. Then by Definition 1(b)(i) again, {∇h_j(y) : j = 1,...,m} has the same rank for all y ∈ B_{δ_2}(x) and x ∈ B_{δ_2}(x*). Thus it suffices to show that there exists δ_3 ∈ (0, δ_2) such that Definition 1(b)(ii) holds at every x ∈ B_{δ_3}(x*). Assume to the contrary that this is not true. That is, there exist a sequence {x^k} converging to x*, and {λ_i^k ≥ 0 : i ∈ I_k} with I_k ⊆ I_g(x^k) and {μ_j^k : j ∈ J}, not all zero, such that

0 ∈ ∂^∞Φ(x^k) + Σ_{i∈I_k} λ_i^k ∇g_i(x^k) + Σ_{j∈J} μ_j^k ∇h_j(x^k),   (25)

and there exists a sequence {y^{k,l}}_l converging to x^k such that

{∇g_i(y^{k,l}), ∇h_j(y^{k,l}) : i ∈ I_k, j ∈ J} is linearly independent for all l.

By the diagonalization law, there exists a sequence {z^k} converging to x* such that

{∇g_i(z^k), ∇h_j(z^k) : i ∈ I_k, j ∈ J} is linearly independent for all k.   (26)

Since g is continuous, it is easy to verify that I_g(x^k) ⊆ I_g(x*) for any k sufficiently large and hence I_k ⊆ I_g(x*). Since the number of possible sets I_k is finite, without loss of generality we may assume that I_k ≡ I_0 for all k sufficiently large. Let t_k := max{λ_i^k, |μ_j^k| : i ∈ I_0, j ∈ J}. Clearly, t_k > 0 for all k. Without loss of generality, we may assume that

λ_i^k / t_k → λ_i* ≥ 0, i ∈ I_0,   μ_j^k / t_k → μ_j*, j ∈ J,   as k → ∞.

It is easy to see that max{λ_i*, |μ_j*| : i ∈ I_0, j ∈ J} = 1. By the outer semi-continuity of the horizon subdifferential mapping, dividing (25) by t_k and taking limits on both sides as k → ∞ yields

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I_0} λ_i* ∇g_i(x*) + Σ_{j∈J} μ_j* ∇h_j(x*).

The last two relations and ∂^∞-RCPLD imply that for any x ∈ B_δ(x*),

{∇g_i(x), ∇h_j(x) : i ∈ I_0, j ∈ J} is linearly dependent,

which contradicts (26). The desired result follows immediately. □

The purpose of this section is to show that the KKT condition defined below is necessary for optimality under ∂^∞-quasi-normality or ∂^∞-RCPLD.

Definition 3 (KKT condition)
Let x* ∈ F. We say that x* is a KKT point of problem (1) if there exist multipliers λ ∈ ℜ^n and μ ∈ ℜ^m such that

0 ∈ ∂f(x*) + ∂Φ(x*) + Σ_{i=1}^n λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*),
λ_i ≥ 0, λ_i g_i(x*) = 0, i = 1, . . . , n.

We first show that the KKT condition holds at a local minimizer under a weaker qualification condition.
Lemma 1
Let x* be a local minimizer of problem (1). Suppose that BQ (12) holds at x* and

N_F(x*) ⊆ { Σ_{i∈I*} λ_i ∂g_i(x*) + Σ_{j=1}^m ∂(μ_j h_j)(x*) : λ_i ≥ 0 (i ∈ I*), μ ∈ ℜ^m },   (27)

where I* := I_g(x*). Then x* is a KKT point.

Proof It is clear that x* is a local minimizer of the problem

min f(x) + Φ(x) + δ_F(x).

Then by Fermat's rule (see, e.g., [31, Theorem 10.1]), we have

0 ∈ ∂f(x*) + ∂(Φ + δ_F)(x*).

Since BQ (12) holds at x*, it then follows from the sum rule for the limiting subdifferentials (see, e.g., [31, Corollary 10.9]) and the relation ∂δ_F(x*) = N_F(x*) that

0 ∈ ∂f(x*) + ∂Φ(x*) + N_F(x*).

This and (27) imply the desired result immediately. □
The following result follows immediately from the fact that (27) is implied by the local error bound condition (see, e.g., [21, Proposition 3.4]).
Corollary 1
Let x* be a local minimizer of problem (1). If BQ (12) holds at x* and F admits a local error bound at x*, then x* is a KKT point.

Let us revisit Example 1, which is posed in a four-dimensional space. Even in this low-dimensional space, it is not easy to calculate the limiting normal cone of the constraint region and hence BQ (12) is difficult to verify. For a constraint region involving many nonlinear constraints in high-dimensional spaces, it is almost impossible to calculate the limiting normal cone directly and thus BQ (12) is very difficult to verify. Fortunately, ∂^∞-quasi-normality and ∂^∞-RCPLD are expressed explicitly in terms of the problem data and hence are much easier to verify. The following result shows that these two proposed qualification conditions are sufficient for the KKT condition to hold at a local minimizer.

Theorem 1
Let x* be a local minimizer of problem (1). Assume that either ∂^∞-quasi-normality holds at x*, or ∂^∞-RCPLD holds at x* and g, h are smooth around x*. Then x* is a KKT point.

Proof By Proposition 3, either the standard quasi-normality and BQ (12) or the standard RCPLD and BQ (12) hold. It then follows from [33, Proposition 4] and [20, Theorem 3.2] that condition (27) holds. Thus, the desired result follows from Lemma 1 immediately. □
Corollary 2
Let x* be a local minimizer of problem (1) and let g, h be linear. Suppose also that the following implication holds:

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*), λ_i ≥ 0 (i ∈ I*)
  ⟹ Σ_{i∈I*} λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) = 0,   where I* := I_g(x*).   (28)

Then x* is a KKT point.

Proof We first show that ∂^∞-RCPLD holds at x*. Since h is linear, Definition 1(b)(i) holds. It then suffices to show that Definition 1(b)(ii) holds. Let J be such that {∇h_j(x*) : j ∈ J} is a basis for span{∇h_j(x*) : j = 1, . . . , m} and let I ⊆ I*. Assume that there exist {λ_i : i ∈ I} and {μ_j : j ∈ J} not all zero such that

0 ∈ ∂^∞Φ(x*) + Σ_{i∈I} λ_i ∇g_i(x*) + Σ_{j∈J} μ_j ∇h_j(x*), λ_i ≥ 0 (i ∈ I),

which implies by (28) that Σ_{i∈I} λ_i ∇g_i(x*) + Σ_{j∈J} μ_j ∇h_j(x*) = 0. This means that the family of gradients {∇g_i(x), ∇h_j(x) : i ∈ I, j ∈ J} is linearly dependent for all x, since g, h are linear. Thus, ∂^∞-RCPLD holds at x* and the desired result then follows immediately from Theorem 1. □

The following example illustrates the applicability of Corollary 2.
Example 4
Consider the following problem:

min f(x) := √|x_1| + √|x_2|
s.t. g_1(x) := x_1 + x_2 − 1 ≥ 0,
     g_2(x) := x_1 + x_2 − 1 ≤ 0

at x* = (1, 0). We have ∂^∞f(x*) = {0} × ℜ, and then 0 ∈ ∂^∞f(x*) − λ_1 ∇g_1(x*) + λ_2 ∇g_2(x*) implies that λ_1 = λ_2. Thus

−λ_1 (1, 1)^T + λ_2 (1, 1)^T = (0, 0)^T,

and by Corollary 2 it then follows that x* is a KKT point. □

Letting Φ be an indicator function of a closed subset Ω of ℜ^d, i.e., Φ(x) = δ_Ω(x), the following result follows immediately from Theorem 1; it extends the result of [2, Corollary 1] to allow an extra abstract set constraint x ∈ Ω and the nonsmoothness of the objective function.

Corollary 3
Let x* be a local minimizer of the nonlinear program

min_{x∈Ω} f(x)
s.t. g(x) ≤ 0,
     h(x) = 0,

where f, g, h are defined as in problem (1) and Ω is a closed subset of ℜ^d. Here we assume that g, h are smooth around x*. Suppose further that RCPLD holds at x*, i.e., there exists δ > 0 such that

(i) {∇h_j(x) : j = 1, . . . , m} has the same rank for each x ∈ B_δ(x*);
(ii) for each I ⊆ I_g(x*), if there exist {λ_i ≥ 0 : i ∈ I} and {μ_j : j ∈ J} not all zero such that

0 ∈ Σ_{i∈I} λ_i ∇g_i(x*) + Σ_{j∈J} μ_j ∇h_j(x*) + N_Ω(x*),

then {∇g_i(x), ∇h_j(x) : i ∈ I, j ∈ J} is linearly dependent for each x ∈ B_δ(x*),

where J ⊆ {1, . . . , m} is such that {∇h_j(x*) : j ∈ J} is a basis for span{∇h_j(x*) : j = 1, . . . , m}. Then there exist multipliers λ ∈ ℜ^n and μ ∈ ℜ^m such that

0 ∈ ∂f(x*) + Σ_{i=1}^n λ_i ∇g_i(x*) + Σ_{j=1}^m μ_j ∇h_j(x*) + N_Ω(x*),
λ_i ≥ 0, λ_i g_i(x*) = 0, i = 1, . . . , n.

This section focuses on exact penalization for problem (1). We first give an exact penalization result for a special case of problem (1) where Φ is the sum of a composite function of a separable lower semi-continuous function with a continuous function and an indicator function of a closed subset. To this end, we give a characterization of the regular subdifferential as follows. It can be shown easily by using the definition of the regular subdifferential and thus we omit the proof here.

Lemma 2
Let ψ : ℜ^d → (−∞, +∞] be lower semi-continuous and let x* ∈ ℜ^d be such that ψ(x*) is finite. Then ∂̂ψ(x*) = ℜ^d if and only if for any M > 0 there exists δ > 0 such that

ψ(x) − ψ(x*) ≥ M ‖x − x*‖  ∀x ∈ B_δ(x*).

We are now ready to give the first main result on exact penalization.
Theorem 2
Assume that x* is a local minimizer of problem (1), where

Φ(x) := Σ_{i=1}^s φ_i(ω_i(x)) + δ_Ω(x).

Here Ω is a closed subset of ℜ^d and, for each i = 1, . . . , s, φ_i : ℜ → ℜ is lower semi-continuous and ω_i : ℜ^d → ℜ is continuous. Let t* := ω(x*), I := {i : ∂^∞φ_i(t_i^*) = {0}}, and let I^c be the complement of I with respect to {1, . . . , s}. Assume further that ∂̂φ_i(t_i^*) = ℜ for any i ∈ I^c and that the following restricted system with respect to (x, t):

g(x) ≤ 0, h(x) = 0, x ∈ Ω,
ω_i(x) − t_i = 0, i = 1, . . . , s,
t_i − t_i^* = 0, i ∈ I^c

admits a local error bound at (x*, t*). Then there exists ρ* > 0 such that for any ρ ≥ ρ*, x* is also a local minimizer of the exact penalization problem

min_{x∈Ω} f(x) + Σ_{i=1}^s φ_i(ω_i(x)) + ρ(‖g(x)_+‖ + ‖h(x)‖).

Proof
Since x* is a local minimizer of problem (1), it is not difficult to see that (x*, t*) is a local minimizer of the following problem:

min_{x∈Ω} Π(x, t) := f(x) + Σ_{i∈I} φ_i(t_i)
s.t. g(x) ≤ 0, h(x) = 0,
     ω_i(x) − t_i = 0, i = 1, . . . , s,   (29)
     t_i − t_i^* = 0, i ∈ I^c.

We observe that Π is Lipschitz around (x*, t*) and denote by L_Π its Lipschitz constant. Then by Clarke's exact penalization principle [15, Proposition 2.4.3], there exists δ_1 > 0 such that

Π(x*, t*) ≤ Π(x, t) + L_Π dist_{F'}(x, t)  ∀(x, t) ∈ B_{δ_1}(x*, t*) ∩ (Ω × ℜ^s),   (30)

where F' denotes the constraint region of problem (29). Since F' admits a local error bound at (x*, t*), there exist δ_2 ∈ (0, δ_1) and κ > 0 such that for any (x, t) ∈ B_{δ_2}(x*, t*) ∩ (Ω × ℜ^s),

dist_{F'}(x, t) ≤ κ ( Σ_{i=1}^s |ω_i(x) − t_i| + Σ_{i∈I^c} |t_i − t_i^*| + ‖g(x)_+‖ + ‖h(x)‖ ).

For simplicity, the above local error bound is expressed under the ℓ_1 norm. This together with (30) implies that for any (x, t) ∈ B_{δ_2}(x*, t*) ∩ (Ω × ℜ^s),

Π(x*, t*) ≤ Π(x, t) + κL_Π ( Σ_{i=1}^s |ω_i(x) − t_i| + Σ_{i∈I^c} |t_i − t_i^*| + ‖g(x)_+‖ + ‖h(x)‖ ).   (31)

Due to the continuity of the function ω, we may choose δ_3 ∈ (0, δ_2) such that (x, ω(x)) ∈ B_{δ_2}(x*, t*) for any x ∈ B_{δ_3}(x*). Thus, by letting t = ω(x) in (31), it follows that for any x ∈ B_{δ_3}(x*) ∩ Ω,

Π(x*, t*) ≤ Π(x, ω(x)) + κL_Π ( Σ_{i∈I^c} |ω_i(x) − t_i^*| + ‖g(x)_+‖ + ‖h(x)‖ ).   (32)

Since ∂̂φ_i(t_i^*) = ℜ for any i ∈ I^c, it then follows from Lemma 2 and the continuity of ω that there exists δ_4 ∈ (0, δ_3) such that

φ_i(ω_i(x)) − φ_i(t_i^*) ≥ κL_Π |ω_i(x) − t_i^*|  ∀x ∈ B_{δ_4}(x*), ∀i ∈ I^c.
This and (32) imply that for any x ∈ B_{δ_4}(x*) ∩ Ω,

f(x*) + Σ_{i=1}^s φ_i(t_i^*)
  = Π(x*, t*) + Σ_{i∈I^c} φ_i(t_i^*)
  ≤ Π(x, ω(x)) + Σ_{i∈I^c} φ_i(ω_i(x)) + Σ_{i∈I^c} φ_i(t_i^*) − Σ_{i∈I^c} φ_i(ω_i(x))
    + κL_Π ( Σ_{i∈I^c} |ω_i(x) − t_i^*| + ‖g(x)_+‖ + ‖h(x)‖ )
  ≤ f(x) + Σ_{i=1}^s φ_i(ω_i(x)) − κL_Π Σ_{i∈I^c} |ω_i(x) − t_i^*|
    + κL_Π ( Σ_{i∈I^c} |ω_i(x) − t_i^*| + ‖g(x)_+‖ + ‖h(x)‖ )
  = f(x) + Σ_{i=1}^s φ_i(ω_i(x)) + κL_Π (‖g(x)_+‖ + ‖h(x)‖).

Then the desired result follows immediately by the equivalence of all norms in finite dimensional spaces. □
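To make the mechanism of Theorem 2 concrete, consider the following toy instance (our own illustration, not taken from the paper): s = 1, φ(t) = √|t|, ω(x) = x, Ω = ℜ, f(x) = −2x, and the single linear constraint h(x) = x − 1 = 0, so that x* = 1 and t* = 1. Here φ is smooth at t*, so I^c is empty and the restricted error bound holds by linearity; Theorem 2 then predicts a finite penalty threshold, and a grid search suggests the penalized objective is minimized at x* once ρ is large enough (the threshold of roughly 1.5 below is specific to this toy problem).

```python
import math

def penalized(x, rho):
    """Penalized objective f(x) + phi(omega(x)) + rho*|h(x)| for the toy
    instance f(x) = -2x, phi(t) = sqrt(|t|), omega(x) = x, h(x) = x - 1."""
    return -2.0 * x + math.sqrt(abs(x)) + rho * abs(x - 1.0)

# Grid around the constrained minimizer x* = 1.
xs = [0.5 + 0.0005 * k for k in range(2001)]
val_at_star = penalized(1.0, 0.0)   # = -1.0; the penalty vanishes at x* for every rho

# For rho = 2 the penalty is exact: x* = 1 minimizes the penalized problem on the grid.
assert all(penalized(x, 2.0) >= val_at_star - 1e-12 for x in xs)
# For rho = 1 it is not: feasibility is traded away, e.g. at x = 1.2.
assert penalized(1.2, 1.0) < val_at_star
print("finite penalty threshold observed, as Theorem 2 predicts")
```

The names `penalized` and the threshold value are ours; the check only illustrates, on one instance, the qualitative statement of the theorem.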
It should be noted that Theorem 2 can be applied to a class of sparse optimization problems. For the bridge penalty φ(t) = |t|^p with p ∈ (0, 1), which is widely used in the sparse optimization literature, it is easy to see that φ is not Lipschitz around t* = 0. However, it is not hard to verify that ∂̂φ(t*) = ℜ, and thus the bridge penalty is a suitable outer function in the sense required by Theorem 2. In the following, we give some exact penalization results for problem (1) where the objective function involves the bridge penalty function.

The following result shows that the problem considered in [14], augmented with an extra abstract constraint set which is the union of finitely many polyhedral sets, admits an exact penalization.

Corollary 4
Assume that x* is a local minimizer of problem (1), where

Φ(x) := Σ_{i=1}^s |a_i^T x|^p + δ_Ω(x)

with a_i ∈ ℜ^d, p ∈ (0, 1), and Ω ⊆ ℜ^d the union of finitely many polyhedral sets. Assume further that g, h are linear. Then there exists ρ* > 0 such that for any ρ ≥ ρ*, x* is also a local minimizer of the exact penalization problem

min_{x∈Ω} f(x) + Σ_{i=1}^s |a_i^T x|^p + ρ(‖g(x)_+‖ + ‖h(x)‖).

Proof
Let φ(t) := |t|^p and t_i^* := a_i^T x*, i = 1, . . . , s. It is easy to verify that

I := {i : ∂^∞φ(t_i^*) = {0}} = {i : t_i^* ≠ 0}

and ∂̂φ(t_i^*) = ℜ for any i ∈ I^c, where I^c is the complement of I with respect to {1, . . . , s}. Since the constraint set

{ (x, t, p) : g(x) + p_g ≤ 0, h(x) + p_h = 0, t_i − t_i^* + p_{t_i} = 0 (i ∈ I^c), a_i^T x − t_i + p_{a_i} = 0 (i = 1, . . . , s), x ∈ Ω }

is the union of finitely many polyhedral sets, it then follows from the corollary in [30, Page 210] that the local error bound condition holds everywhere for the constraint set

{ (x, t) ∈ Ω × ℜ^s : g(x) ≤ 0, h(x) = 0, t_i − t_i^* = 0 (i ∈ I^c), a_i^T x − t_i = 0 (i = 1, . . . , s) }.

Then the desired result follows immediately from Theorem 2. □
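The key property behind Corollary 4, namely that ∂̂φ(0) = ℜ for the bridge penalty φ(t) = |t|^p, can be checked numerically via the characterization in Lemma 2: for every M > 0 the radius δ = M^{−1/(1−p)} works, since |t|^p ≥ M|t| exactly when |t| ≤ δ. The following sketch (our own illustration) verifies this growth condition and, at the same time, the failure of any Lipschitz bound at 0.

```python
p = 0.5                          # bridge penalty exponent, phi(t) = |t| ** p
phi = lambda t: abs(t) ** p

for M in (1.0, 10.0, 100.0):
    delta = M ** (-1.0 / (1.0 - p))   # for |t| <= delta we have |t|**p >= M * |t|
    ts = [delta * (k / 1000.0) for k in range(1, 1001)]
    # Lemma 2 growth condition: phi(t) - phi(0) >= M * |t - 0| on B_delta(0).
    assert all(phi(t) - phi(0.0) >= M * abs(t) - 1e-12 for t in ts)

# phi is nevertheless not Lipschitz at 0: the difference quotient blows up.
quotients = [(phi(10.0 ** -k) - phi(0.0)) / 10.0 ** -k for k in range(1, 8)]
assert quotients == sorted(quotients) and quotients[-1] > 1e3
print("regular subdifferential at 0 is all of R, yet no Lipschitz bound holds")
```

The specific constants (p = 1/2, the grid, the tolerance) are arbitrary choices for the demonstration.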
We next give an exact penalization result for a problem which is more general than the one considered in [25].
Corollary 5
Assume that x* is a local minimizer of problem (1), where

Φ(x) := Σ_{i=1}^s [(b_i − a_i^T x)_+]^p + δ_Ω(x)

with a_i ∈ ℜ^d, b_i ∈ ℜ, p ∈ (0, 1), and Ω ⊆ ℜ^d the union of finitely many polyhedral sets. Assume further that g, h are linear. Then there exists ρ* > 0 such that for any ρ ≥ ρ*, x* is also a local minimizer of the exact penalization problem

min_{x∈Ω} f(x) + Σ_{i=1}^s [(b_i − a_i^T x)_+]^p + ρ(‖g(x)_+‖ + ‖h(x)‖).

Proof
Let φ(t) := |t|^p and t_i^* := (b_i − a_i^T x*)_+, i = 1, . . . , s. Using the same notations I, I^c as in the proof of Corollary 4, it suffices to investigate the local error bound condition for the constraint set

{ (x, t) ∈ Ω × ℜ^s : g(x) ≤ 0, h(x) = 0, t_i − t_i^* = 0 (i ∈ I^c), (b_i − a_i^T x)_+ − t_i = 0 (i = 1, . . . , s) }.

It is easy to see that the parametric counterpart of the above set,

{ (x, t, p) : g(x) + p_g ≤ 0, h(x) + p_h = 0, t_i − t_i^* + p_{t_i} = 0 (i ∈ I^c), (b_i − a_i^T x)_+ − t_i + p_{a_i} = 0 (i = 1, . . . , s), x ∈ Ω },

is the union of finitely many polyhedral sets. Thus, by the corollary in [30, Page 210], the desired local error bound condition is satisfied. The proof is complete by applying Theorem 2. □

In the rest of this section, we investigate sufficient conditions ensuring an exact penalization for problem (1) where Φ is the sum of a continuous function and an indicator function of a closed subset. In particular, we investigate exact penalization for the following problem:

min_{x∈Ω} f(x) + Ψ(x)
s.t. g(x) ≤ 0,   (33)
     h(x) = 0,

where f, g, h are defined as in problem (1), Ψ : ℜ^d → ℜ is a continuous function, and Ω is a closed subset of ℜ^d. As discussed in Section 1, when the objective function of a nonlinear program is locally Lipschitz, the admittance of a local error bound for its constraint region is sufficient to ensure an exact penalization. For this purpose, we introduce the following auxiliary problem, whose objective function is locally Lipschitz:

min_{(x,y)∈Ω×ℜ} f(x) + y
s.t. Ψ(x) − y = 0,   (34)
     g(x) ≤ 0, h(x) = 0.

In the case where Ω = ℜ^d, the constraint region of problem (34) can be rewritten as

Λ = {(x, y) ∈ gph Ψ : g(x) ≤ 0, h(x) = 0}.
We observe that, by the definition of the coderivative, the inclusion

0 ∈ D*Ψ(x)(0) + Σ_{i∈I_g(x)} λ_i ∂g_i(x) + Σ_{j=1}^m ∂(μ_j h_j)(x)

can be rewritten as

(0, 0) ∈ Σ_{i∈I_g(x)} λ_i ∂g_i(x) × {0} + Σ_{j=1}^m ∂(μ_j h_j)(x) × {0} + N_{gph Ψ}(x, Ψ(x)).

Hence if x* is D*-quasi-normal for problem (33), then (x*, Ψ(x*)) is quasi-normal for problem (34). Moreover, gph Ψ is a closed subset of ℜ^{d+1} by the continuity of Ψ. These facts and the local Lipschitzness of g, h enable us to apply [19, Corollary 5.3] to derive the local error bound condition at (x*, Ψ(x*)): there exist δ > 0 and κ > 0 such that

dist_Λ(x, y) ≤ κ(‖g(x)_+‖ + ‖h(x)‖)  ∀(x, y) ∈ B_δ(x*, Ψ(x*)) ∩ gph Ψ.

The exact penalization result then follows from applying Clarke's exact penalization principle [15, Proposition 2.4.3] to problem (34). Unfortunately, the above argument does not work when the abstract constraint set Ω is not the whole space ℜ^d. Nevertheless, we have succeeded in deriving the following local error bound result under the D*-quasi-normality given in Definition 2.

Lemma 3
Suppose that D*-quasi-normality holds at x* ∈ F. Then the set

Λ := {(x, y) ∈ Ω × ℜ : Ψ(x) − y = 0, g(x) ≤ 0, h(x) = 0}   (35)

admits a local error bound at (x*, y*) with y* := Ψ(x*), that is, there exist δ > 0 and κ > 0 such that

dist_Λ(x, y) ≤ κ(|Ψ(x) − y| + ‖g(x)_+‖ + ‖h(x)‖)  ∀(x, y) ∈ B_δ(x*, y*) ∩ (Ω × ℜ).

Proof
First we observe that Λ defined in (35) can be rewritten as

Λ = {(x, y) : Ξ(x, y) + dist_Ω(x) = 0},

where Ξ(x, y) := max{H(x, y), g_1(x), . . . , g_n(x), |h_1(x)|, . . . , |h_m(x)|} with H(x, y) := |G(x, y)| and G(x, y) := Ψ(x) − y. Then, in order to obtain the desired result, by [32, Theorem 3.1] it suffices to show that there exist δ̃ > 0 and κ̃ > 0 such that

‖π‖ ≥ κ̃  ∀π ∈ ∂(Ξ + dist_Ω)(x, y), ∀(x, y) ∈ B_δ̃(x*, y*) ∩ (Ω × ℜ) \ Λ.   (36)

We now make some preparations for the subsequent analysis. Since |·| is globally Lipschitz and G is continuous, it follows from Proposition 2(iii) that for any (x, y),

∂H(x, y) ⊆ ⋃_{ξ∈∂|G(x,y)|} D*G(x, y)(ξ) ⊆ ⋃_{ξ∈∂|G(x,y)|} D*Ψ(x)(ξ) × {−ξ},   (37)

∂^∞H(x, y) ⊆ D*G(x, y)(0) ⊆ D*Ψ(x)(0) × {0}.   (38)

Since dist_Ω(·) is globally Lipschitz and Ξ is continuous, by Proposition 2(i) we have that for any (x, y),

∂(Ξ + dist_Ω)(x, y) ⊆ ∂Ξ(x, y) + ∂dist_Ω(x) × {0}.   (39)

Since g, h are both Lipschitz around x*, by Proposition 2(iv) it follows that for any (x, y) with x sufficiently close to x*, there exists (α, β, γ) ∈ M(x, y), where

M(x, y) := { (α, β, γ) : α ≥ 0, β ≥ 0, γ ≥ 0, α + ‖β‖ + ‖γ‖ = 1,
             α(H(x, y) − Ξ(x, y)) = 0,
             β_i(g_i(x) − Ξ(x, y)) = 0, i = 1, . . . , n,
             γ_j(|h_j(x)| − Ξ(x, y)) = 0, j = 1, . . . , m },

such that

∂Ξ(x, y) ⊆ ⋃_{(α,β,γ)∈M(x,y)} { α ⋄ ∂H(x, y) + Σ_{i=1}^n β_i ∂g_i(x) × {0} + Σ_{j=1}^m γ_j ∂|h_j|(x) × {0} }.   (40)

In the following, we prove (36) by contradiction. Assume to the contrary that there exist a sequence {(x^k, y^k)} with (x^k, y^k) ∈ (Ω × ℜ) \ Λ converging to (x*, y*) and π^k ∈ ∂(Ξ + dist_Ω)(x^k, y^k) such that π^k → 0. Then it follows from (39)–(40) that there exists (α^k, β^k, γ^k) ∈ M(x^k, y^k) such that

π^k ∈ α^k ⋄ ∂H(x^k, y^k) + Σ_{i=1}^n β_i^k ∂g_i(x^k) × {0} + Σ_{j=1}^m γ_j^k ∂|h_j|(x^k) × {0} + ∂dist_Ω(x^k) × {0}.   (41)

Noting that (x^k, y^k) ∈ (Ω × ℜ) \ Λ, we have that

Ξ(x^k, y^k) > 0  ∀k.   (42)

Since (α^k, β^k, γ^k) ∈ M(x^k, y^k), it follows that α^k ≥ 0, β^k ≥ 0, γ^k ≥ 0, and

α^k + ‖β^k‖ + ‖γ^k‖ = 1,   (43)
α^k (H(x^k, y^k) − Ξ(x^k, y^k)) = 0,   (44)
β_i^k (g_i(x^k) − Ξ(x^k, y^k)) = 0, i = 1, . . . , n,   (45)
γ_j^k (|h_j(x^k)| − Ξ(x^k, y^k)) = 0, j = 1, . . . , m.   (46)

Define γ̄_j^k := sign(h_j(x^k)) γ_j^k, where sign(0) := 0. Since it follows from (42) and (46) that γ_j^k = 0 when h_j(x^k) = 0, it is easy to see that ‖γ^k‖ = ‖γ̄^k‖. It then follows from (43) that

α^k + ‖β^k‖ + ‖γ̄^k‖ = 1.   (47)

Moreover, by Proposition 2(ii), we have that

γ_j^k ∂|h_j|(x^k) = ∂(γ̄_j^k h_j)(x^k).   (48)

We continue the proof by considering the two separate cases as follows.

Case (a): There exists a subsequence {α^k}_{k∈K} with K ⊆ ℕ such that α^k = 0 for any k ∈ K. Then it follows from (38), (41), (48), and the definition of the notation ⋄ that for any k ∈ K,

π^k ∈ D*Ψ(x^k)(0) × {0} + Σ_{i=1}^n β_i^k ∂g_i(x^k) × {0} + Σ_{j=1}^m ∂(γ̄_j^k h_j)(x^k) × {0} + ∂dist_Ω(x^k) × {0}.   (49)

In this case, it follows from (47) that ‖β^k‖ + ‖γ̄^k‖ = 1. Thus, there must exist subsequences {β^k}_{k∈K_0} and {γ̄^k}_{k∈K_0} with K_0 ⊆ K such that as K_0 ∋ k → ∞,

β^k → β* ≥ 0, γ̄^k → γ* with ‖β*‖ + ‖γ*‖ = 1.   (50)

Taking limits on both sides of (49), it then follows from (50), Proposition 1, and the local boundedness of the limiting subdifferential of locally Lipschitz functions that

0 ∈ D*Ψ(x*)(0) + Σ_{i=1}^n β_i^* ∂g_i(x*) + Σ_{j=1}^m ∂(γ_j^* h_j)(x*) + ∂dist_Ω(x*).   (51)

If g_i(x*) < 0, then g_i(x^k) < 0 for any k sufficiently large. Thus by (42) and (45), β_i^k = 0 for any k sufficiently large. This together with (50) implies that β_i^* = 0. In conclusion, we have

β_i^* ≥ 0, β_i^* g_i(x*) = 0, i = 1, . . . , n.   (52)

Moreover, if β_i^* > 0, then by (50) we have β_i^k > 0 for any k ∈ K_0 sufficiently large. This and (45) imply that g_i(x^k) = Ξ(x^k, y^k) > 0. If γ_j^* ≠ 0, then by (50) we have γ_j^* γ̄_j^k > 0 for any k ∈ K_0 sufficiently large. Thus, it follows from the definition of γ̄_j^k and the relation γ_j^k ≥ 0 that γ_j^* h_j(x^k) > 0 for any k ∈ K_0 sufficiently large. Thus, we have

β_i^* > 0 ⟹ g_i(x^k) > 0,  γ_j^* ≠ 0 ⟹ γ_j^* h_j(x^k) > 0,

which together with (50)–(52) contradicts D*-quasi-normality at x* by using the relation ∂dist_Ω(x*) ⊆ N_Ω(x*).

Case (b): There exists a subsequence {α^k}_{k∈K} with K ⊆ ℕ such that α^k > 0 for any k ∈ K. In this case, it then follows from (37), (41), (48), and the definition of the notation ⋄ that for any k ∈ K, there exists ξ^k ∈ ∂|G(x^k, y^k)| such that

π^k ∈ D*Ψ(x^k)(α^k ξ^k) × {−α^k ξ^k} + Σ_{i=1}^n β_i^k ∂g_i(x^k) × {0} + Σ_{j=1}^m ∂(γ̄_j^k h_j)(x^k) × {0} + ∂dist_Ω(x^k) × {0}.   (53)

Since α^k > 0, it follows from (42) and (44) that H(x^k, y^k) = Ξ(x^k, y^k) > 0 and hence

ξ^k = 1 if Ψ(x^k) − y^k > 0,  ξ^k = −1 otherwise.   (54)

It then follows that |ᾱ^k| = α^k, where ᾱ^k := α^k ξ^k. Thus by (47), it follows that

|ᾱ^k| + ‖β^k‖ + ‖γ̄^k‖ = 1.

Without loss of generality, we may assume that as K ∋ k → ∞,

ᾱ^k → α*, β^k → β*, γ̄^k → γ* with |α*| + ‖β*‖ + ‖γ*‖ = 1.   (55)

It then follows from (53) and the relation π^k → 0 that α^k → 0 as K ∋ k → ∞. Thus by (55), we have that

α* = 0, ‖β*‖ + ‖γ*‖ = 1.

Taking limits on both sides of (53), it then follows from (55), Proposition 1, and the local boundedness of the limiting subdifferential of locally Lipschitz functions that

0 ∈ D*Ψ(x*)(0) + Σ_{i=1}^n β_i^* ∂g_i(x*) + Σ_{j=1}^m ∂(γ_j^* h_j)(x*) + ∂dist_Ω(x*).

The rest of the proof for case (b) is similar to that for case (a). Therefore, there exist δ̃ > 0 and κ̃ > 0 such that (36) holds, and the proof is complete. □
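As a sanity check on the error bound asserted in Lemma 3, consider the following toy instance of our own (not from the paper): d = 1, Ψ(x) = |x|, a single inequality constraint g(x) = −x ≤ 0, no equality constraints, and Ω = ℜ. Then Λ = {(t, t) : t ≥ 0}, whose distance can be computed exactly by projecting onto this ray, and near the feasible point (1, 1) the bound dist_Λ(x, y) ≤ κ(|Ψ(x) − y| + ‖g(x)_+‖) can be verified numerically with κ = 1.

```python
import math
import random

def residual(x, y):
    """|Psi(x) - y| + ||g(x)_+|| for Psi(x) = |x| and g(x) = -x (i.e. x >= 0)."""
    return abs(abs(x) - y) + max(-x, 0.0)

def dist_to_Lambda(x, y):
    """Distance from (x, y) to Lambda = {(t, t) : t >= 0}, via projection onto the ray."""
    s = max((x + y) / 2.0, 0.0)
    return math.hypot(x - s, y - s)

random.seed(0)
kappa = 1.0
for _ in range(10000):
    # Sample points in a box around the feasible point (1, 1).
    x = 1.0 + random.uniform(-0.5, 0.5)
    y = 1.0 + random.uniform(-0.5, 0.5)
    assert dist_to_Lambda(x, y) <= kappa * residual(x, y) + 1e-12

print("local error bound of Lemma 3 verified on this toy instance")
```

Here κ = 1 is simply an observed modulus for this instance (in fact, near (1, 1) the true ratio is 1/√2); the lemma guarantees only that some κ exists.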
We are now ready to give the exact penalization result for problem (33).
Theorem 3
Let x* be a local minimizer of problem (33). If D*-quasi-normality holds at x*, then there exists ρ* > 0 such that for any ρ ≥ ρ*, x* is also a local minimizer of the exact penalization problem

min_{x∈Ω} f(x) + Ψ(x) + ρ(‖g(x)_+‖ + ‖h(x)‖).

Proof
By the local optimality of x*, it is easy to see that (x*, y*) with y* := Ψ(x*) is a local minimizer of the following auxiliary problem:

min f(x) + y  s.t. (x, y) ∈ Λ,

where Λ is defined in Lemma 3. Denote by ℓ the Lipschitz constant of the objective function f(x) + y around (x*, y*). By Clarke's exact penalization principle [15, Proposition 2.4.3], there exists δ_1 > 0 such that

f(x*) + y* ≤ f(x) + y + ℓ dist_Λ(x, y)  ∀(x, y) ∈ B_{δ_1}(x*, y*).

Then it follows from Lemma 3 that there exist δ_2 ∈ (0, δ_1) and κ > 0 such that for any (x, y) ∈ B_{δ_2}(x*, y*) ∩ (Ω × ℜ),

f(x*) + y* ≤ f(x) + y + ℓ dist_Λ(x, y) ≤ f(x) + y + κℓ(|Ψ(x) − y| + ‖g(x)_+‖ + ‖h(x)‖).   (56)

By the continuity of Ψ, we may choose δ_3 ∈ (0, δ_2) such that (x, Ψ(x)) ∈ B_{δ_2}(x*, y*) for any x ∈ B_{δ_3}(x*). Then the desired result follows immediately from (56) by letting y = Ψ(x) and ρ* := κℓ. □
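Theorem 3 can be visualized on a one-dimensional toy problem of our own making: f(x) = x², Ψ(x) = |x|, Ω = ℜ, and the single constraint h(x) = x − 1 = 0, so x* = 1 is the only feasible point, and the constraint gradient ∇h ≡ 1 makes the needed qualification easy to check. A grid search suggests that the penalized objective x² + |x| + ρ|x − 1| is minimized at x* once ρ is large enough (above roughly 3 for this instance), while a too-small ρ destroys exactness.

```python
def penalized(x, rho):
    """f(x) + Psi(x) + rho*|h(x)| with f(x) = x**2, Psi(x) = |x|, h(x) = x - 1."""
    return x * x + abs(x) + rho * abs(x - 1.0)

xs = [0.001 * k for k in range(2001)]   # grid on [0, 2] around x* = 1
val_at_star = penalized(1.0, 0.0)       # = 2.0; the penalty vanishes at x* for every rho

# rho = 4 (> 3): x* = 1 minimizes the penalized objective over the grid.
assert all(penalized(x, 4.0) >= val_at_star - 1e-12 for x in xs)
# rho = 1 (< 3): the penalized problem prefers the infeasible point x = 0.
assert penalized(0.0, 1.0) < val_at_star
print("an exact penalty threshold exists in this toy example, as Theorem 3 predicts")
```

All constants here (the threshold 3, the grid, the tolerance) are specific to this illustration, not part of the theorem.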
We thank the referees for their helpful suggestions and comments that have helped us to improve the presentation of the paper. We would also like to thank Jim Burke for a discussion on the topic of this research.
References
1. J.M. Abadie, On the Kuhn-Tucker theorem, in Nonlinear Programming, J. Abadie, ed., John Wiley, New York, 1967, 21–36.
2. R. Andreani, G. Haeser, M.L. Schuverdt and P.J. Silva, A relaxed constant positive linear dependence constraint qualification and applications, Math. Program., 135 (2012), 255–273.
3. R. Andreani, G. Haeser, M.L. Schuverdt and P.J. Silva, Two new weak constraint qualifications and applications, SIAM J. Optim., 22 (2012), 1109–1135.
4. R. Andreani, J.M. Martinez, A. Ramos and P.J. Silva, A cone-continuity constraint qualification and algorithmic consequences, SIAM J. Optim., 26 (2016), 96–110.
5. R. Andreani, J.M. Martinez, A. Ramos and P.J. Silva, Strict constraint qualifications and sequential optimality conditions for constrained optimization.
6. R. Andreani, J.M. Martinez and M.L. Schuverdt, On the relations between constant positive linear dependence condition and quasinormality constraint qualification, J. Optim. Theory Appl., 125 (2005), 473–485.
7. D.P. Bertsekas and A.E. Ozdaglar, Pseudonormality and a Lagrange multiplier theory for constrained optimization, J. Optim. Theory Appl., 114 (2002), 287–343.
8. W. Bian and X. Chen, Linearly constrained non-Lipschitz optimization for image restoration, SIAM J. Imaging Sci., 8 (2015), 2294–2322.
9. J. Borwein, J. Treiman and Q. Zhu, Necessary conditions for constrained optimization problems with semicontinuous and continuous data, Trans. Amer. Math. Soc., 350 (1998), 2409–2429.
10. A.M. Bruckstein, D.L. Donoho and M. Elad, From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Rev., 51 (2009), 34–81.
11. R. Chartrand, Exact reconstruction of sparse signals via nonconvex minimization, IEEE Signal Process. Lett., 14 (2007), 707–710.
12. X. Chen, L. Guo, Z. Lu and J.J. Ye, An augmented Lagrangian method for non-Lipschitz nonconvex programming, SIAM J. Numer. Anal., in press.
13. C. Chen, X. Li, C. Tolman, S. Wang and Y. Ye, Sparse portfolio selection via quasi-norm regularization, ArXiv preprint, arXiv:1312.6350, 2014.
14. X. Chen, L. Niu and Y. Yuan, Optimality conditions and a smoothing trust region Newton method for non-Lipschitz optimization, SIAM J. Optim., 23 (2013), 1528–1552.
15. F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley-Interscience, New York, 1983.
16. F.H. Clarke, Yu.S. Ledyaev, R.J. Stern and P.R. Wolenski, Nonsmooth Analysis and Control Theory, Springer, New York, 1998.
17. H. Gfrerer, First order and second order characterizations of metric subregularity and calmness of constraint set mappings, SIAM J. Optim., 21 (2011), 1439–1474.
18. M. Guignard, Generalized Kuhn-Tucker conditions for mathematical programming problems in a Banach space, SIAM J. Contr., 7 (1969), 232–241.
19. L. Guo, J.J. Ye and J. Zhang, Mathematical programs with geometric constraints in Banach spaces: enhanced optimality, exact penalty, and sensitivity, SIAM J. Optim., 23 (2013), 2295–2319.
20. L. Guo, J. Zhang and G.H. Lin, New results on constraint qualifications for nonlinear extremum problems and extensions, J. Optim. Theory Appl., 163 (2014), 737–754.
21. A. Ioffe and J.V. Outrata, On metric and calmness qualification conditions in subdifferential calculus, Set-Valued Anal., 16 (2008), 199–227.
22. A. Jourani and L. Thibault, The approximate subdifferential of composite functions, Bull. Aust. Math. Soc., 47 (1993), 443–456.
23. A.Y. Kruger and B.S. Mordukhovich, New necessary optimality conditions in problems of nondifferentiable programming, in Numerical Methods of Nonlinear Programming, 116–119, Kharkov, 1979 (in Russian).
24. Y.F. Liu, Y.H. Dai and S. Ma, Joint power and admission control: non-convex ℓq approximation and an effective polynomial time deflation approach, IEEE Trans. Signal Process., 63 (2015), 3641–3656.
25. Y.F. Liu, S. Ma, Y.H. Dai and S. Zhang, A smoothing SQP framework for a class of composite ℓq minimization over polyhedron, Math. Program., 158 (2016), 467–500.
26. L. Minchenko and S. Stakhovski, Parametric nonlinear programming problems under the relaxed constant rank condition, SIAM J. Optim., 21 (2011), 314–332.
27. B.S. Mordukhovich, Metric approximations and necessary optimality conditions for general classes of nonsmooth extremal problems, Soviet Math. Dokl., 22 (1980), 526–530.
28. B.S. Mordukhovich, Variational Analysis and Generalized Differentiation, I: Basic Theory, II: Applications, Grundlehren der Mathematischen Wissenschaften 330, Springer, Berlin, 2006.
29. B.S. Mordukhovich and Y.H. Shao, Nonsmooth sequential analysis in Asplund spaces, Trans. Amer. Math. Soc., 348 (1996), 1235–1280.
30. S.M. Robinson, Some continuity properties of polyhedral multifunctions, Math. Program. Stud., 14 (1981), 206–214.
31. R.T. Rockafellar and R.J.-B. Wets, Variational Analysis, Springer, 1998.
32. Z. Wu and J.J. Ye, Sufficient conditions for error bounds, SIAM J. Optim., 12 (2001), 421–435.
33. J.J. Ye and J. Zhang,