Codifferentials and Quasidifferentials of the Expectation of Nonsmooth Random Integrands and Two-Stage Stochastic Programming

Abstract

This work is devoted to an analysis of exact penalty functions and optimality conditions for nonsmooth two-stage stochastic programming problems. To this end, we first study the co-/quasi-differentiability of the expectation of nonsmooth random integrands and obtain explicit formulae for its co- and quasidifferential under some natural assumptions on the integrand. Then we analyse exact penalty functions for a variational reformulation of two-stage stochastic programming problems and obtain sufficient conditions for the global exactness of these functions with two different penalty terms. In the end of the paper, we combine our results on the co-/quasi-differentiability of the expectation of nonsmooth random integrands and exact penalty functions to derive optimality conditions for nonsmooth two-stage stochastic programming problems in terms of codifferentials.


arXiv preprint [math.OC]

M.V. Dolgopolik

February 15, 2021


Two-stage stochastic programming is one of the basic problems of stochastic optimization [3,40] that has multiple applications in various fields, including transportation planning [2,30], disaster management [25], optimal design of energy systems [49], resources management [27], etc. Although two-stage stochastic programming problems can be viewed as stochastic versions of bilevel optimization problems [8,9], their stochastic nature requires a largely different approach to their solution. Optimality conditions for two-stage stochastic programming problems were obtained in [26,36,40,45,46], while numerical methods for solving various classes of two-stage stochastic programming problems were studied, e.g., in [23,28,32,39] (see also the references therein).

The need for computing convex or nonconvex subdifferentials of the expectation of nonsmooth random integrands arises in many areas of stochastic optimization, including two-stage stochastic programming, as well as stochastic linear complementarity problems [6], stochastic variational inequalities [7], etc. The subdifferential in the sense of convex analysis of the expectation of a convex integrand was computed in [37], while its approximations were discussed in [31]. Various approximations of the Clarke subdifferential of the expectation of nonsmooth random integrands were studied in [5,47], while an outer estimate of its Mordukhovich basic subdifferential was obtained in [46]. Finally, a quasidifferential of the expectation of quasidifferentiable random integrands was computed in [29].

The main goal of this paper is to apply constructive nonsmooth analysis [12,14,15] to a theoretical analysis of nonsmooth two-stage stochastic programming problems. Firstly, we analyse the codifferentiability and quasidifferentiability of the expectation of nonsmooth random integrands and present explicit formulae for its codifferential and quasidifferential in a more general case and under different assumptions than in [29] (see Remark 2 for more details).

In the second part of the paper we study exact penalty functions for two-stage stochastic programming problems, reformulated as equivalent variational problems with pointwise constraints. With the use of the general theory of exact penalty functions [11,19,22,34,38,48], we obtain sufficient conditions for the global exactness of penalty functions for two-stage stochastic programming with two different types of penalty terms. The use of penalty terms of the first type leads to much less restrictive assumptions on constraints of the second stage problem, while the second type of penalty terms is more convenient for applications. In particular, it allows one to reformulate two-stage stochastic programming problems, whose second stage problem has a DC (Difference-of-Convex) objective function and DC constraints, as equivalent unconstrained DC optimization problems and apply the well-developed apparatus of DC optimization to find their solutions (cf. analogous results for bilevel programming problems in [33,42]). Let us also note that exact penalty functions for single-stage stochastic programming were analysed in [24].

Finally, at the end of the paper we combine our results on quasidifferentials of the expectation of nonsmooth random integrands and exact penalty functions for two-stage stochastic programming problems to obtain necessary optimality conditions for these problems in terms of codifferentials.

The paper is organised as follows. Some auxiliary definitions and facts from constructive nonsmooth analysis that are necessary for understanding the paper are collected in Section 2. Codifferentiability and quasidifferentiability of the expectation of nonsmooth random integrands are studied in Section 3, while Section 4 is devoted to nonsmooth two-stage stochastic programming problems. Exact penalty functions for such problems are analysed in Subsection 4.1, while optimality conditions for these problems in terms of codifferentials are derived in Subsection 4.2.

Let us introduce the notation and briefly recall several definitions from nonsmooth analysis that will be used throughout the article. For more details in the finite dimensional case see [12,14,15]. The infinite dimensional case was studied in [16–18,20].

Let $X$ be a real Banach space. Denote by $X^*$ its topological dual, and by $\langle \cdot, \cdot \rangle$ the duality pairing between $X$ and $X^*$. The weak$^*$ topology on $X^*$ is denoted by $w^*$ or $\sigma(X^*, X)$ depending on the context. Denote also by $\tau_{\mathbb{R}}$ the canonical topology of the real line $\mathbb{R}$. Finally, let $U \subset X$ be an open set.

Definition 1. A function $f \colon U \to \mathbb{R}$ is called codifferentiable at a point $x \in U$, if there exists a pair of convex subsets $\underline{d} f(x), \overline{d} f(x) \subset \mathbb{R} \times X^*$ that are compact in the topological product $(\mathbb{R} \times X^*, \tau_{\mathbb{R}} \times w^*)$, satisfy the equality

$$ \max_{(a, x^*) \in \underline{d} f(x)} a = \min_{(b, y^*) \in \overline{d} f(x)} b = 0, \tag{1} $$

and for any $\Delta x \in X$ satisfy the following condition:

$$ \lim_{\alpha \to +0} \frac{1}{\alpha} \Big| f(x + \alpha \Delta x) - f(x) - \max_{(a, x^*) \in \underline{d} f(x)} \big( a + \langle x^*, \alpha \Delta x \rangle \big) - \min_{(b, y^*) \in \overline{d} f(x)} \big( b + \langle y^*, \alpha \Delta x \rangle \big) \Big| = 0. $$

The pair $Df(x) = [\underline{d} f(x), \overline{d} f(x)]$ is called a codifferential of $f$ at $x$, the set $\underline{d} f(x)$ is referred to as a hypodifferential of $f$ at $x$, while the set $\overline{d} f(x)$ is called a hyperdifferential of $f$ at $x$.

Remark 1. (i) In the case when $X = \mathbb{R}^d$, a codifferential $Df(x)$ is a pair of convex compact subsets of $\mathbb{R} \times \mathbb{R}^d = \mathbb{R}^{d+1}$ satisfying the equalities from the previous definition. In addition, if $X$ is a Hilbert space, then it is natural to suppose that a codifferential $Df(x)$ is a pair of convex weakly compact subsets of the space $\mathbb{R} \times X$.

(ii) Note that a codifferential is not uniquely defined. In particular, one can easily verify that for any compact convex subset $C$ of the space $(\mathbb{R} \times X^*, \tau_{\mathbb{R}} \times w^*)$ the pair $[\underline{d} f(x) + C, \overline{d} f(x) - C]$ is a codifferential of $f$ at $x$ as well.

Definition 2. A function $f \colon U \to \mathbb{R}$ is called continuously codifferentiable at a point $x \in U$, if $f$ is codifferentiable at every point in a neighbourhood of $x$ and there exists a codifferential mapping $Df(\cdot) = [\underline{d} f(\cdot), \overline{d} f(\cdot)]$, defined in a neighbourhood of $x$ and such that the multifunctions $\underline{d} f(\cdot)$ and $\overline{d} f(\cdot)$ are continuous in the Hausdorff metric at $x$.

The class of functions that are continuously codifferentiable at a given point (or on a given set) is closed under addition, multiplication, composition with continuously differentiable functions, as well as pointwise maximum and minimum of finite families of functions. Moreover, any convex function is continuously codifferentiable in a neighbourhood of any given point from the interior of its effective domain, and any DC function (i.e. a function that can be represented as the difference of convex functions) is continuously codifferentiable in a neighbourhood of any given point. Numerous examples of continuously codifferentiable functions, as well as the main rules of codifferential calculus, can be found in [12,14,15,18,20].

Definition 3. A function $f \colon U \to \mathbb{R}$ is called quasidifferentiable at a point $x \in U$, if $f$ is directionally differentiable at $x$ and its directional derivative $f'(x, \cdot)$ at this point can be represented as the difference of sublinear functions or, equivalently, if there exists a pair $\underline{\partial} f(x), \overline{\partial} f(x) \subset X^*$ of convex weak$^*$ compact sets such that

$$ f'(x, h) = \max_{x^* \in \underline{\partial} f(x)} \langle x^*, h \rangle + \min_{y^* \in \overline{\partial} f(x)} \langle y^*, h \rangle \quad \forall h \in X. $$

The pair $\mathcal{D} f(x) = [\underline{\partial} f(x), \overline{\partial} f(x)]$ is called a quasidifferential of $f$ at $x$, the set $\underline{\partial} f(x)$ is called a subdifferential of $f$ at $x$, while the set $\overline{\partial} f(x)$ is referred to as a superdifferential of $f$ at $x$.

Just like a codifferential, a quasidifferential is not uniquely defined. Here we only mention that a function $f$ is codifferentiable at a point $x$ if and only if $f$ is quasidifferentiable at $x$, and one can easily compute a quasidifferential of $f$ at $x$ from its codifferential at this point and vice versa. Namely, if $Df(x)$ is a codifferential of $f$ at $x$, then the pair $\mathcal{D} f(x) = [\underline{\partial} f(x), \overline{\partial} f(x)]$ with

$$ \underline{\partial} f(x) = \Big\{ x^* \in X^* \Bigm| (0, x^*) \in \underline{d} f(x) \Big\}, \quad \overline{\partial} f(x) = \Big\{ y^* \in X^* \Bigm| (0, y^*) \in \overline{d} f(x) \Big\} \tag{2} $$

is a quasidifferential of $f$ at $x$. Conversely, if $\mathcal{D} f(x)$ is a quasidifferential of $f$ at $x$, then the pair $[\{0\} \times \underline{\partial} f(x), \{0\} \times \overline{\partial} f(x)]$ is a codifferential of $f$ at $x$ (see, e.g. [14,20]). Below we consider only quasidifferentials of the form (2), that is, we suppose that if a codifferentiable function $f$ and its codifferential $Df(x)$ are given, then $\mathcal{D} f(x)$ is a quasidifferential of $f$ of the form (2).

Let us finally recall one auxiliary definition from set-valued analysis that will be used later (see, e.g. [1, Sect. 8.2] for more details). Let $X$ and $Y$ be metric spaces and $(\Omega, \mathfrak{A}, \mu)$ be a measure space. A set-valued mapping $F \colon X \times \Omega \rightrightarrows Y$, $F = F(x, \omega)$, is called a Carathéodory map, if for every $x \in X$ the multifunction $F(x, \cdot)$ is measurable and for a.e. $\omega \in \Omega$ the multifunction $F(\cdot, \omega)$ is continuous.
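As a small illustration of Definition 1 and equality (2), the following Python sketch works with the one-dimensional function $f(x) = |x| = \max\{x, -x\}$. It assumes the standard construction of a hypodifferential for a maximum of affine functions (the convex hull of the points $(f_i(x) - f(x), f_i'(x))$, with the hyperdifferential equal to $\{(0, 0)\}$); the function names are hypothetical and chosen only for this example.

```python
def hypodifferential_abs(x0):
    """Vertices (a, v) of a hypodifferential of f(x) = |x| = max(x, -x) at x0.

    Assumed construction: for a maximum of affine pieces f_i, take the convex
    hull of the points (f_i(x0) - f(x0), f_i'(x0)); the hyperdifferential is {(0, 0)}.
    """
    fx = abs(x0)
    return [(x0 - fx, 1.0), (-x0 - fx, -1.0)]


def codifferential_increment(x0, dx):
    """max over the hypodifferential of (a + v * dx).

    A linear function on a segment attains its maximum at a vertex,
    so it is enough to check the two vertices.
    """
    return max(a + v * dx for a, v in hypodifferential_abs(x0))


def subdifferential_abs(x0, tol=1e-12):
    """Interval {v : (0, v) lies in the hypodifferential}, as in equality (2).

    Both vertices have a <= 0, so a point of the segment has a = 0 only if it
    is a convex combination of vertices whose a-component is zero.
    """
    zero_vertices = [v for a, v in hypodifferential_abs(x0) if abs(a) <= tol]
    return (min(zero_vertices), max(zero_vertices))
```

For this piecewise affine function the increment $f(x_0 + \Delta x) - f(x_0)$ coincides with the max term exactly, not only in the limit, and `subdifferential_abs(0.0)` recovers the interval $[-1, 1]$, which is the subdifferential of $|x|$ at the origin in the sense of convex analysis.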
Let $(\Omega, \mathfrak{A}, P)$ be a probability space, and suppose that a nonsmooth function $f \colon \mathbb{R}^d \times \mathbb{R}^m \times \Omega \to \mathbb{R}$, $f = f(x, y, \omega)$, is given. In this section we study the codifferentiability of the nonsmooth integral functional

$$ \mathcal{I}(x, y) = \mathbb{E}\big[ f(x, y(\cdot), \cdot) \big] := \int_\Omega f(x, y(\omega), \omega) \, dP(\omega), $$

where $x \in \mathbb{R}^d$ is a parameter and $y \in L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$ with $1 < p \le +\infty$ is an $m$-dimensional random vector. Although the case $p = 1$ can be included into the general theory under some additional assumptions, we exclude it for the sake of simplicity, since the proofs of the main results below are much more cumbersome in the case $p = 1$ than in the case $1 < p \le +\infty$.

Denote by $p' \in [1, +\infty)$ the conjugate exponent of $p$, i.e. $1/p + 1/p' = 1$, and let $|\cdot|$ be the Euclidean norm in $\mathbb{R}^n$. Let us impose some assumptions on the integrand $f$ that, as we will show below, ensure that the functional $\mathcal{I}$ is correctly defined and codifferentiable.

Namely, we will suppose that for a.e. $\omega \in \Omega$ and for all $(x, y) \in \mathbb{R}^d \times \mathbb{R}^m$ the function $f$ is codifferentiable jointly in $x$ and $y$, that is, there exists a pair of compact convex sets $\underline{d}_{x,y} f(x, y, \omega), \overline{d}_{x,y} f(x, y, \omega) \subset \mathbb{R} \times \mathbb{R}^d \times \mathbb{R}^m$ such that $\Phi_f(x, y, \omega; 0, 0) = \Psi_f(x, y, \omega; 0, 0) = 0$, and for all $(\Delta x, \Delta y) \in \mathbb{R}^d \times \mathbb{R}^m$ one has

$$ \lim_{\alpha \to +0} \frac{1}{\alpha} \Big| f(x + \alpha \Delta x, y + \alpha \Delta y, \omega) - f(x, y, \omega) - \Phi_f(x, y, \omega; \alpha \Delta x, \alpha \Delta y) - \Psi_f(x, y, \omega; \alpha \Delta x, \alpha \Delta y) \Big| = 0, $$

where

$$ \begin{aligned} \Phi_f(x, y, \omega; \Delta x, \Delta y) &= \max_{(a, v_x, v_y) \in \underline{d}_{x,y} f(x, y, \omega)} \big( a + \langle v_x, \Delta x \rangle + \langle v_y, \Delta y \rangle \big), \\ \Psi_f(x, y, \omega; \Delta x, \Delta y) &= \min_{(b, w_x, w_y) \in \overline{d}_{x,y} f(x, y, \omega)} \big( b + \langle w_x, \Delta x \rangle + \langle w_y, \Delta y \rangle \big). \end{aligned} \tag{3} $$

The pair $D_{x,y} f(x, y, \omega) = [\underline{d}_{x,y} f(x, y, \omega), \overline{d}_{x,y} f(x, y, \omega)]$ is called a codifferential of $f$ in $(x, y)$.

Assumption 1.

The function $f$ satisfies the following conditions:

1. for any $x \in \mathbb{R}^d$ the map $(y, \omega) \mapsto f(x, y, \omega)$ is a Carathéodory function;

2. the function $f$ satisfies the following growth condition of order $p$: for any $N > 0$ there exist $C_N > 0$ and an a.e. nonnegative function $\beta_N \in L^1(\Omega, \mathfrak{A}, P)$ such that $|f(x, y, \omega)| \le \beta_N(\omega) + C_N |y|^p$ for all $x \in \mathbb{R}^d$ with $|x| \le N$, all $y \in \mathbb{R}^m$, and a.e. $\omega \in \Omega$ in the case $1 < p < +\infty$, and $|f(x, y, \omega)| \le \beta_N(\omega)$ for a.e. $\omega \in \Omega$ and all $(x, y) \in \mathbb{R}^d \times \mathbb{R}^m$ with $\max\{|x|, |y|\} \le N$ in the case $p = +\infty$;

3. the multifunctions $(y, \omega) \mapsto \underline{d}_{x,y} f(x, y, \omega)$ and $(y, \omega) \mapsto \overline{d}_{x,y} f(x, y, \omega)$ are Carathéodory maps for any $x \in \mathbb{R}^d$;

4. the codifferential mapping $D_{x,y} f(\cdot)$ satisfies the following growth condition of order $p$: for any $N > 0$ there exist $C_N > 0$ and a.e. nonnegative functions $\beta_N \in L^1(\Omega, \mathfrak{A}, P)$ and $\gamma_N \in L^{p'}(\Omega, \mathfrak{A}, P)$ such that

$$ \max\{|a|, |v_x|\} \le \beta_N(\omega) + C_N |y|^p, \quad |v_y| \le \gamma_N(\omega) + C_N |y|^{p-1} $$

for all $(a, v_x, v_y) \in \underline{d}_{x,y} f(x, y, \omega) \cup \overline{d}_{x,y} f(x, y, \omega)$, all $x \in \mathbb{R}^d$ with $|x| \le N$, all $y \in \mathbb{R}^m$, and a.e. $\omega \in \Omega$ in the case $1 < p < +\infty$, and

$$ \max\{|a|, |v_x|, |v_y|\} \le \beta_N(\omega) $$

for all $(a, v_x, v_y) \in \underline{d}_{x,y} f(x, y, \omega) \cup \overline{d}_{x,y} f(x, y, \omega)$, a.e. $\omega \in \Omega$, and all $(x, y) \in \mathbb{R}^d \times \mathbb{R}^m$ with $\max\{|x|, |y|\} \le N$ in the case $p = +\infty$.

Note that the Carathéodory and the growth conditions on the function $f$ ensure that the value $\mathcal{I}(x, y)$ is correctly defined and finite for all $x \in \mathbb{R}^d$ and $y \in L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$. Let $X = \mathbb{R}^d \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$.

Theorem 1.

Let $1 < p \le +\infty$ and Assumption 1 be valid. Then the functional $\mathcal{I}$ is codifferentiable on $\mathbb{R}^d \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$, and for any $(x, y)$ from this space the pair $D\mathcal{I}(x, y) = [\underline{d}\mathcal{I}(x, y), \overline{d}\mathcal{I}(x, y)]$, defined as

$$ \begin{aligned} \underline{d}\mathcal{I}(x, y) = \Big\{ (A, x^*) \in \mathbb{R} \times X^* \Bigm| {} & A = \mathbb{E}[a], \\ & \langle x^*, (h_x, h_y) \rangle = \big\langle \mathbb{E}[v_x], h_x \big\rangle + \int_\Omega \langle v_y(\omega), h_y(\omega) \rangle \, dP(\omega) \quad \forall (h_x, h_y) \in X, \\ & (a(\cdot), v_x(\cdot), v_y(\cdot)) \text{ is a measurable selection of the map } \underline{d}_{x,y} f(x, y(\cdot), \cdot) \Big\} \end{aligned} \tag{4} $$

and

$$ \begin{aligned} \overline{d}\mathcal{I}(x, y) = \Big\{ (B, y^*) \in \mathbb{R} \times X^* \Bigm| {} & B = \mathbb{E}[b], \\ & \langle y^*, (h_x, h_y) \rangle = \big\langle \mathbb{E}[w_x], h_x \big\rangle + \int_\Omega \langle w_y(\omega), h_y(\omega) \rangle \, dP(\omega) \quad \forall (h_x, h_y) \in X, \\ & (b(\cdot), w_x(\cdot), w_y(\cdot)) \text{ is a measurable selection of the map } \overline{d}_{x,y} f(x, y(\cdot), \cdot) \Big\}, \end{aligned} $$

is a codifferential of $\mathcal{I}$ at $(x, y)$.

The proof of Theorem 1 is similar to the proof of the codifferentiability of the mapping $\mathcal{I}(u) = \int_\Omega f(x, u(x), \nabla u(x)) \, dx$ from the author's papers [17,21] (here $\Omega \subseteq \mathbb{R}^n$ is an open set and $u$ belongs to the Sobolev space). On the other hand, Theorem 1 cannot be directly deduced from the main results of [17,21]. That is why below we present a detailed proof of Theorem 1. It seems possible to prove a more general result on the codifferentiability of integral functionals defined on Banach spaces that subsumes Theorem 1 and the main results of [17,21] as particular cases. Developing such a general theorem on the codifferentiability of nonsmooth integral functionals is an interesting open problem for future research.

For the sake of convenience, we divide the proof of Theorem 1 into two lemmas.

Lemma 1.

Let $1 < p \le +\infty$ and Assumption 1 be valid. Then for any $(x, y) \in X$ the sets $\underline{d}\mathcal{I}(x, y)$ and $\overline{d}\mathcal{I}(x, y)$ from Theorem 1 are nonempty, convex, compact in the topological product $(\mathbb{R} \times X^*, \tau_{\mathbb{R}} \times w^*)$, and satisfy the following equalities:

$$ \max_{(A, x^*) \in \underline{d}\mathcal{I}(x, y)} A = \min_{(B, y^*) \in \overline{d}\mathcal{I}(x, y)} B = 0. \tag{5} $$

Proof.

Fix any $(x, y) \in X$. We prove the statement of the lemma only for the hypodifferential $\underline{d}\mathcal{I}(x, y)$, since the proof for the hyperdifferential $\overline{d}\mathcal{I}(x, y)$ is exactly the same.

By Assumption 1 the multifunction $(y, \omega) \mapsto \underline{d}_{x,y} f(x, y, \omega)$ is a Carathéodory map. Therefore by [1, Thrm. 8.2.8] the multifunction $\underline{d}_{x,y} f(x, y(\cdot), \cdot)$ is measurable, which by [1, Thrm. 8.1.3] implies that there exists a measurable selection $(a(\cdot), v_x(\cdot), v_y(\cdot))$ of this mapping. Furthermore, by the growth condition on the codifferential $D_{x,y} f(\cdot)$ from Assumption 1 all measurable selections of the set-valued mapping $\underline{d}_{x,y} f(x, y(\cdot), \cdot)$ belong to the space

$$ Y := L^1(\Omega, \mathfrak{A}, P) \times L^1(\Omega, \mathfrak{A}, P; \mathbb{R}^d) \times L^{p'}(\Omega, \mathfrak{A}, P; \mathbb{R}^m). \tag{6} $$

Consequently, the linear functional $x^*$, defined as

$$ \langle x^*, (h_x, h_y) \rangle = \big\langle \mathbb{E}[v_x], h_x \big\rangle + \int_\Omega \langle v_y(\omega), h_y(\omega) \rangle \, dP(\omega) \quad \forall (h_x, h_y) \in X, $$

belongs to $X^*$, and one can conclude that the hypodifferential $\underline{d}\mathcal{I}(x, y)$ is correctly defined and nonempty.

Denote by $\mathcal{E}(x, y)$ the set of all measurable selections $z(\cdot) = (a(\cdot), v_x(\cdot), v_y(\cdot))$ of the set-valued mapping $\underline{d}_{x,y} f(x, y(\cdot), \cdot)$. As was noted above, $\mathcal{E}(x, y)$ is a subset of the space $Y$ defined in (6). For any $z = (a, v_x, v_y) \in Y$ denote by $T(z)$ the pair $(A, x^*)$ defined as in (4). Then $\underline{d}\mathcal{I}(x, y) = T(\mathcal{E}(x, y))$.

By definition, for a.e. $\omega \in \Omega$ the hypodifferential $\underline{d}_{x,y} f(x, y(\omega), \omega)$ is a convex set. Therefore the set of measurable selections $\mathcal{E}(x, y)$ of the multifunction $\underline{d}_{x,y} f(x, y(\cdot), \cdot)$ is convex. Hence, taking into account the fact that the operator $T$ is linear, one obtains that the hypodifferential $\underline{d}\mathcal{I}(x, y)$ is a convex set as the image of a convex set under a linear map.

Recall that by the definition of hypodifferential one has $a \le 0$ for all $(a, v_x, v_y) \in \underline{d}_{x,y} f(x, y(\omega), \omega)$, $\omega \in \Omega$. Therefore $A \le 0$ for all $(A, x^*) \in \underline{d}\mathcal{I}(x, y)$. On the other hand, observe that thanks to equality (1) for a.e. $\omega \in \Omega$ one has

$$ 0 \in \Big\{ a \in \mathbb{R} \Bigm| \exists (v_x, v_y) \in \mathbb{R}^{d+m} \colon (a, v_x, v_y) \in \underline{d}_{x,y} f(x, y(\omega), \omega) \Big\}. $$

Hence by the Filippov theorem (see, e.g. [1, Thrm. 8.2.10]) there exists a measurable selection $(a_0(\cdot), v_{x0}(\cdot), v_{y0}(\cdot))$ of the multifunction $\underline{d}_{x,y} f(x, y(\cdot), \cdot)$ such that $a_0(\omega) = 0$ almost surely. Consequently, for $(A_0, x_0^*) = T(a_0, v_{x0}, v_{y0})$ one has $A_0 = 0$, which implies that equality (5) holds true.

Thus, it remains to prove the compactness of the set $\underline{d}\mathcal{I}(x, y)$ in the corresponding product topology. To this end, let us verify that the set $\mathcal{E}(x, y)$ is a weakly compact subset of the space $Y$ defined in (6), and the operator $T$ continuously maps the space $Y$ endowed with the weak topology to the topological product $(\mathbb{R}, \tau_{\mathbb{R}}) \times (X^*, w^*)$. Then one can conclude that the hypodifferential $\underline{d}\mathcal{I}(x, y)$ is compact in the corresponding product topology as a continuous image of a compact set.

We start with the proof of the continuity of the operator $T$. Let $V$ be an open subset of the product space $(\mathbb{R}, \tau_{\mathbb{R}}) \times (X^*, w^*)$. Let us show that its preimage $U = T^{-1}(V)$ under the map $T$ is weakly open in $Y$. Indeed, fix any $(a, v_x, v_y) \in U$. Then $(A, x^*) = T(a, v_x, v_y) \in V$, which due to the openness of the set $V$ in the corresponding topology implies that there exist $\varepsilon > 0$, $n \in \mathbb{N}$, and pairs $(h_i, \xi_i) \in X$, $i \in I = \{1, \ldots, n\}$, such that

$$ V_\varepsilon(A, x^*) = \Big\{ (B, y^*) \in \mathbb{R} \times X^* \Bigm| |B - A| < \varepsilon, \ \max_{i \in I} \big| \langle y^* - x^*, (h_i, \xi_i) \rangle \big| < \varepsilon \Big\} \subseteq V. $$
Introduce the set

$$ \begin{aligned} U_\varepsilon(a, v_x, v_y) = \Big\{ (b, w_x, w_y) \in Y \Bigm| {} & \big| \mathbb{E}(b - a) \big| < \varepsilon, \ \max_{i \in I} \Big| \int_\Omega \langle w_x(\omega) - v_x(\omega), h_i \rangle \, dP(\omega) \Big| < \frac{\varepsilon}{2}, \\ & \max_{i \in I} \Big| \int_\Omega \langle w_y(\omega) - v_y(\omega), \xi_i(\omega) \rangle \, dP(\omega) \Big| < \frac{\varepsilon}{2} \Big\}. \end{aligned} $$

This set is a neighbourhood of the point $(a, v_x, v_y)$ in the weak topology on $Y$. Moreover, by definition $T(U_\varepsilon(a, v_x, v_y)) \subseteq V_\varepsilon(A, x^*)$, which implies that $U_\varepsilon(a, v_x, v_y) \subseteq U$. Thus, for any point $(a, v_x, v_y) \in U$ there exists a neighbourhood of this point in the weak topology contained in $U$. In other words, the set $U$ is weakly open, and one can conclude that the operator $T$ is continuous with respect to the chosen topologies.

Let us finally prove the weak compactness of the set $\mathcal{E}(x, y)$ in the space $Y$ defined in (6). By the Eberlein–Šmulian theorem it suffices to prove that $\mathcal{E}(x, y)$ is weakly sequentially compact. To this end, choose any sequence $z_n(\cdot) = (a_n(\cdot), v_{xn}(\cdot), v_{yn}(\cdot)) \in \mathcal{E}(x, y)$, $n \in \mathbb{N}$. Let us consider two cases.

Case $p = +\infty$. By the growth condition on the codifferential $D_{x,y} f(\cdot)$ (see Assumption 1) there exists an a.e. nonnegative function $\beta \in L^1(\Omega, \mathfrak{A}, P)$ such that for a.e. $\omega \in \Omega$ one has

$$ \max\big\{ |a_n(\omega)|, |v_{xn}(\omega)|, |v_{yn}(\omega)| \big\} \le \beta(\omega) \quad \forall n \in \mathbb{N}. $$

Hence by the weak compactness criterion in $L^1$ (see, e.g. [4, Thrm. 4.7.20]) the closures of the sets $\{a_n\}_{n \in \mathbb{N}}$, $\{v_{xn}\}_{n \in \mathbb{N}}$, and $\{v_{yn}\}_{n \in \mathbb{N}}$ are weakly compact in the corresponding $L^1$ spaces. Therefore by the Eberlein–Šmulian theorem there exists a subsequence $z_{n_k} = (a_{n_k}, v_{x n_k}, v_{y n_k})$ weakly converging to some $z^*$ in $Y$. By Mazur's lemma there exists a sequence of convex combinations $\{\widehat{z}_k\}$ of elements of the sequence $z_{n_k}$ strongly converging to $z^*$. Therefore, as is well known, there exists a subsequence $\{\widehat{z}_{k_l}\}$ converging to $z^*$ almost surely.

Note that due to the convexity of $\mathcal{E}(x, y)$ one has $\{\widehat{z}_k\} \subset \mathcal{E}(x, y)$, that is, $\widehat{z}_k(\omega) \in \underline{d}_{x,y} f(x, y(\omega), \omega)$ for a.e. $\omega \in \Omega$ and all $k \in \mathbb{N}$. Hence, taking into account the fact that by definition the hypodifferential $\underline{d}_{x,y} f(x, y(\omega), \omega)$, $\omega \in \Omega$, is a closed set, one obtains that $z^*(\omega) \in \underline{d}_{x,y} f(x, y(\omega), \omega)$ for a.e. $\omega \in \Omega$. Thus, $z^* \in \mathcal{E}(x, y)$, and the set $\mathcal{E}(x, y)$ is weakly sequentially compact, which completes the proof in this case.

Case $p < +\infty$. By the growth condition on the codifferential $D_{x,y} f(\cdot)$ (see Assumption 1) there exist $C > 0$ and a.e. nonnegative functions $\beta \in L^1(\Omega, \mathfrak{A}, P)$ and $\gamma \in L^{p'}(\Omega, \mathfrak{A}, P)$ such that for a.e. $\omega \in \Omega$ and all $n \in \mathbb{N}$ one has

$$ \max\big\{ |a_n(\omega)|, |v_{xn}(\omega)| \big\} \le \beta(\omega) + C |y(\omega)|^p, \quad |v_{yn}(\omega)| \le \gamma(\omega) + C |y(\omega)|^{p-1}. $$

Observe that the right-hand side of the first inequality belongs to $L^1(\Omega, \mathfrak{A}, P)$, while the right-hand side of the second one belongs to $L^{p'}(\Omega, \mathfrak{A}, P)$. Thus, the sequence $\{v_{yn}\}$ is norm-bounded in $L^{p'}(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$, which due to the reflexivity of this space (note that $1 < p' < +\infty$, since $1 < p < +\infty$) implies that there exists a weakly convergent subsequence $\{v_{y n_k}\}$. In turn, the existence of weakly convergent subsequences of the sequences $\{a_n\}$ and $\{v_{xn}\}$ follows from the weak compactness criterion in $L^1$ (see [4, Thrm. 4.7.20]).

Thus, there exists a subsequence $\{z_{n_k}\}$ weakly converging to some $z^* \in Y$. Now, applying Mazur's lemma and arguing in precisely the same way as in the case $p = +\infty$, one can prove the weak compactness of the set $\mathcal{E}(x, y)$.

Denote by $\|\cdot\|_p$ the standard norm on $L^p(\Omega, \mathfrak{A}, P)$.

Lemma 2.

Let $1 < p \le +\infty$, Assumption 1 be valid, and the sets $\underline{d}\mathcal{I}(x, y)$ and $\overline{d}\mathcal{I}(x, y)$ be defined as in Theorem 1. Then for any $(x, y) \in X$ and $(\Delta x, \Delta y) \in X$ one has

$$ \lim_{\alpha \to +0} \frac{1}{\alpha} \Big| \mathcal{I}(x + \alpha \Delta x, y + \alpha \Delta y) - \mathcal{I}(x, y) - \max_{(A, x^*) \in \underline{d}\mathcal{I}(x, y)} \big( A + \langle x^*, \alpha(\Delta x, \Delta y) \rangle \big) - \min_{(B, y^*) \in \overline{d}\mathcal{I}(x, y)} \big( B + \langle y^*, \alpha(\Delta x, \Delta y) \rangle \big) \Big| = 0. $$

Proof.

Fix any $(x, y) \in X$ and $(\Delta x, \Delta y) \in X$, and choose an arbitrary sequence $\{\alpha_n\} \subset (0, +\infty)$ converging to zero. For a.e. $\omega \in \Omega$ and $n \in \mathbb{N}$ denote

$$ \begin{aligned} f_n(\omega) = \frac{1}{\alpha_n} \Big( & f(x + \alpha_n \Delta x, y(\omega) + \alpha_n \Delta y(\omega), \omega) - f(x, y(\omega), \omega) \\ & - \Phi_f\big( x, y(\omega), \omega; \alpha_n \Delta x, \alpha_n \Delta y(\omega) \big) - \Psi_f\big( x, y(\omega), \omega; \alpha_n \Delta x, \alpha_n \Delta y(\omega) \big) \Big), \end{aligned} \tag{7} $$

where the functions $\Phi_f$ and $\Psi_f$ are defined in (3). By the definition of codifferentiability the sequence $f_n$ converges to zero almost surely. Our aim is to prove that each term in the definition of $f_n$ belongs to $L^1(\Omega, \mathfrak{A}, P)$ and there exists an a.e. nonnegative function $\eta \in L^1(\Omega, \mathfrak{A}, P)$ such that $|f_n| \le \eta$ almost surely. Then by Lebesgue's dominated convergence theorem $\mathbb{E}[|f_n|] \to 0$ as $n \to \infty$. Hence, integrating each term in the definition of $f_n$ separately, one obtains that

$$ \begin{aligned} \lim_{n \to \infty} \frac{1}{\alpha_n} \Big| & \mathcal{I}(x + \alpha_n \Delta x, y + \alpha_n \Delta y) - \mathcal{I}(x, y) - \int_\Omega \Phi_f\big( x, y(\omega), \omega; \alpha_n \Delta x, \alpha_n \Delta y(\omega) \big) \, dP(\omega) \\ & - \int_\Omega \Psi_f\big( x, y(\omega), \omega; \alpha_n \Delta x, \alpha_n \Delta y(\omega) \big) \, dP(\omega) \Big| = 0. \end{aligned} $$

Let us check that

$$ \int_\Omega \Phi_f\big( x, y(\omega), \omega; \alpha_n \Delta x, \alpha_n \Delta y(\omega) \big) \, dP(\omega) = \max_{(A, x^*) \in \underline{d}\mathcal{I}(x, y)} \big( A + \langle x^*, \alpha_n(\Delta x, \Delta y) \rangle \big) \tag{8} $$

(a similar equality for the min terms involving the hyperdifferentials can be verified in the same way). Then one obtains the desired result.

Indeed, by definition (see (3)) for any measurable selection $(a(\cdot), v_x(\cdot), v_y(\cdot))$ of the set-valued mapping $\underline{d}_{x,y} f(x, y(\cdot), \cdot)$ one has

$$ \Phi_f\big( x, y(\omega), \omega; \alpha_n \Delta x, \alpha_n \Delta y(\omega) \big) \ge a(\omega) + \langle v_x(\omega), \alpha_n \Delta x \rangle + \langle v_y(\omega), \alpha_n \Delta y(\omega) \rangle, $$

which implies that

$$ \int_\Omega \Phi_f\big( x, y(\omega), \omega; \alpha_n \Delta x, \alpha_n \Delta y(\omega) \big) \, dP(\omega) \ge \max_{(A, x^*) \in \underline{d}\mathcal{I}(x, y)} \big( A + \langle x^*, \alpha_n(\Delta x, \Delta y) \rangle \big) $$

(see (4)). On the other hand, for a.e. $\omega \in \Omega$ one has

$$ \Phi_f\big( x, y(\omega), \omega; \alpha_n \Delta x, \alpha_n \Delta y(\omega) \big) \in \Big\{ a + \langle v_x, \alpha_n \Delta x \rangle + \langle v_y, \alpha_n \Delta y(\omega) \rangle \Bigm| (a, v_x, v_y) \in \underline{d}_{x,y} f(x, y(\omega), \omega) \Big\}. $$

Consequently, by the Filippov theorem (see, e.g. [1, Thrm. 8.2.10]) there exists a measurable selection $(a_0(\cdot), v_{x0}(\cdot), v_{y0}(\cdot))$ of the multifunction $\underline{d}_{x,y} f(x, y(\cdot), \cdot)$ such that

$$ \Phi_f\big( x, y(\omega), \omega; \alpha_n \Delta x, \alpha_n \Delta y(\omega) \big) = a_0(\omega) + \langle v_{x0}(\omega), \alpha_n \Delta x \rangle + \langle v_{y0}(\omega), \alpha_n \Delta y(\omega) \rangle $$

for a.e. $\omega \in \Omega$. Hence for the corresponding pair $(A_0, x_0^*) = T(a_0, v_{x0}, v_{y0})$ (see the proof of Lemma 1), which by definition belongs to $\underline{d}\mathcal{I}(x, y)$, one has

$$ \int_\Omega \Phi_f\big( x, y(\omega), \omega; \alpha_n \Delta x, \alpha_n \Delta y(\omega) \big) \, dP(\omega) = A_0 + \langle x_0^*, \alpha_n(\Delta x, \Delta y) \rangle, $$

and therefore equality (8) holds true.

Thus, it remains to show that Lebesgue's dominated convergence theorem is applicable to the sequence $\{f_n\}$. Indeed, the first two terms in the definition of $f_n$ (see (7)) belong to $L^1(\Omega, \mathfrak{A}, P)$ by virtue of the first two parts of Assumption 1. Let us check that these terms are dominated by a Lebesgue integrable function independent of $n$.

By the mean value theorem for codifferentiable functions [20, Prp. 2] for any $n \in \mathbb{N}$ and for a.e. $\omega \in \Omega$ there exist $\alpha_n(\omega) \in (0, \alpha_n)$ and

$$ \begin{aligned} (0, v_{xn}(\omega), v_{yn}(\omega)) &\in \underline{d}_{x,y} f(x + \alpha_n(\omega) \Delta x, y(\omega) + \alpha_n(\omega) \Delta y(\omega), \omega), \\ (0, w_{xn}(\omega), w_{yn}(\omega)) &\in \overline{d}_{x,y} f(x + \alpha_n(\omega) \Delta x, y(\omega) + \alpha_n(\omega) \Delta y(\omega), \omega) \end{aligned} $$

such that

$$ \frac{1}{\alpha_n} \Big( f(x + \alpha_n \Delta x, y(\omega) + \alpha_n \Delta y(\omega), \omega) - f(x, y(\omega), \omega) \Big) = \langle v_{xn}(\omega) + w_{xn}(\omega), \Delta x \rangle + \langle v_{yn}(\omega) + w_{yn}(\omega), \Delta y(\omega) \rangle. \tag{9} $$

Put $\alpha_* = \max_{n \in \mathbb{N}} \alpha_n$.
By the growth condition on the codifferential $D_{x,y} f$ (see Assumption 1) there exist $C_N > 0$ and a.e. nonnegative functions $\beta_N \in L^1(\Omega, \mathfrak{A}, P)$ and $\gamma_N \in L^{p'}(\Omega, \mathfrak{A}, P)$ (here $N = |x| + \alpha_* |\Delta x|$) such that

$$ \begin{aligned} \max\big\{ |v_{xn}(\omega)|, |w_{xn}(\omega)| \big\} &\le \beta_N(\omega) + C_N \big| y(\omega) + \alpha_n(\omega) \Delta y(\omega) \big|^p \le \beta_N(\omega) + C_N 2^p \Big( |y(\omega)|^p + \alpha_*^p |\Delta y(\omega)|^p \Big), \\ \max\big\{ |v_{yn}(\omega)|, |w_{yn}(\omega)| \big\} &\le \gamma_N(\omega) + C_N \big| y(\omega) + \alpha_n(\omega) \Delta y(\omega) \big|^{p-1} \end{aligned} $$

for a.e. $\omega \in \Omega$ and all $n \in \mathbb{N}$ in the case $1 < p < +\infty$, and there exists $\beta_N \in L^1(\Omega, \mathfrak{A}, P)$ (here $N = \max\{ |x| + \alpha_* |\Delta x|, \|y\|_\infty + \alpha_* \|\Delta y\|_\infty \}$) such that

$$ \max\big\{ |v_{xn}(\omega)|, |w_{xn}(\omega)|, |v_{yn}(\omega)|, |w_{yn}(\omega)| \big\} \le \beta_N(\omega) $$

for a.e. $\omega \in \Omega$ and all $n \in \mathbb{N}$ in the case $p = +\infty$. Hence with the use of (9) one obtains that in the case $p = +\infty$ the inequality

$$ \frac{1}{\alpha_n} \Big| f(x + \alpha_n \Delta x, y(\omega) + \alpha_n \Delta y(\omega), \omega) - f(x, y(\omega), \omega) \Big| \le 2 \beta_N(\omega) |\Delta x| + 2 \beta_N(\omega) \|\Delta y\|_\infty $$

holds true for a.e. $\omega \in \Omega$ and all $n \in \mathbb{N}$, which implies that the first two terms in the definition of $f_n$ (see (7)) are dominated by a Lebesgue integrable function independent of $n$. In the case $p < +\infty$ one has

$$ \begin{aligned} \frac{1}{\alpha_n} \Big| f(x + \alpha_n \Delta x, y(\omega) + \alpha_n \Delta y(\omega), \omega) - f(x, y(\omega), \omega) \Big| \le {} & 2 \Big( \beta_N(\omega) + C_N 2^p \big( |y(\omega)|^p + \alpha_*^p |\Delta y(\omega)|^p \big) \Big) |\Delta x| \\ & + 2 \Big( \gamma_N(\omega) + C_N 2^{p-1} \big( |y(\omega)|^{p-1} + \alpha_*^{p-1} |\Delta y(\omega)|^{p-1} \big) \Big) |\Delta y(\omega)|. \end{aligned} $$

The right-hand side of this inequality does not depend on $n$ and is Lebesgue integrable, as one can easily verify with the use of Hölder's inequality and the equality $p'(p - 1) = p$. Thus, in the case $p < +\infty$ the first two terms in the definition of $f_n$ are dominated by a Lebesgue integrable function independent of $n$ as well.

Let us finally check that the third term in the definition of $f_n$, denoted by

$$ \theta_n(\omega) := \frac{1}{\alpha_n} \max_{(a, v_x, v_y) \in \underline{d}_{x,y} f(x, y(\omega), \omega)} \big( a + \langle v_x, \alpha_n \Delta x \rangle + \langle v_y, \alpha_n \Delta y(\omega) \rangle \big) $$

(see (7)), is measurable and dominated by a Lebesgue integrable function independent of $n$. The fact that the last term (the min term) in the definition of $f_n$ is measurable and dominated by a Lebesgue integrable function independent of $n$ is proved in exactly the same way.

As was shown in the proof of Lemma 1, the set-valued mapping $\underline{d}_{x,y} f(x, y(\cdot), \cdot)$ is measurable. Consequently, the function $\theta_n$ is measurable by [1, Thrm. 8.2.11]. For any $\omega \in \Omega$ introduce the function

$$ g_\omega(t) = \max_{(a, v_x, v_y) \in \underline{d}_{x,y} f(x, y(\omega), \omega)} \big( a + \langle v_x, t \Delta x \rangle + \langle v_y, t \Delta y(\omega) \rangle \big). $$

Observe that by the definition of codifferential $g_\omega(0) = 0$ (see Def. 1) and for any $t, \Delta t \in \mathbb{R}$ and $\alpha > 0$ one has

$$ \frac{1}{\alpha} \Big| g_\omega(t + \alpha \Delta t) - g_\omega(t) - \max_{(a_g, v_g) \in \underline{d} g_\omega(t)} \big( a_g + v_g (\alpha \Delta t) \big) \Big| = 0, $$

where

$$ \begin{aligned} \underline{d} g_\omega(t) = \Big\{ (a_g, v_g) \in \mathbb{R} \times \mathbb{R} \Bigm| {} & a_g = a + \langle v_x, t \Delta x \rangle + \langle v_y, t \Delta y(\omega) \rangle - g_\omega(t), \\ & v_g = \langle v_x, \Delta x \rangle + \langle v_y, \Delta y(\omega) \rangle, \ (a, v_x, v_y) \in \underline{d}_{x,y} f(x, y(\omega), \omega) \Big\}. \end{aligned} $$

The set $\underline{d} g_\omega(t)$ is obviously convex and compact. Moreover, note that the equality $\max\{ a_g \mid (a_g, v_g) \in \underline{d} g_\omega(t) \} = g_\omega(t) - g_\omega(t) = 0$ holds true. Thus, the function $g_\omega$ is codifferentiable at every point $t \in \mathbb{R}$, and the pair $[\underline{d} g_\omega(t), \{0\}]$ is a codifferential of $g_\omega$ at the point $t$.

Applying the mean value theorem for codifferentiable functions [20, Prp. 2] one obtains that for any $n \in \mathbb{N}$ and for a.e. $\omega \in \Omega$ there exist $\alpha_n(\omega) \in (0, \alpha_n)$ and $(0, v_{gn}(\omega)) \in \underline{d} g_\omega(\alpha_n(\omega))$ such that

$$ \theta_n(\omega) = \frac{1}{\alpha_n} \big( g_\omega(\alpha_n) - g_\omega(0) \big) = v_{gn}(\omega) $$

or, equivalently, there exists $(a_n(\omega), v_{xn}(\omega), v_{yn}(\omega)) \in \underline{d}_{x,y} f(x, y(\omega), \omega)$ such that

$$ \theta_n(\omega) = \langle v_{xn}(\omega), \Delta x \rangle + \langle v_{yn}(\omega), \Delta y(\omega) \rangle \quad \forall n \in \mathbb{N}. $$

Hence by the growth condition on the codifferential $D_{x,y} f$ (see Assumption 1) there exist $C_N > 0$ and a.e. nonnegative functions $\beta_N \in L^1(\Omega, \mathfrak{A}, P)$ and $\gamma_N \in L^{p'}(\Omega, \mathfrak{A}, P)$ (here $N = |x|$) satisfying the inequality

$$ |\theta_n(\omega)| \le \Big( \beta_N(\omega) + C_N |y(\omega)|^p \Big) |\Delta x| + \Big( \gamma_N(\omega) + C_N |y(\omega)|^{p-1} \Big) |\Delta y(\omega)| $$

for a.e. $\omega \in \Omega$ in the case $p < +\infty$, and the inequality

$$ |\theta_n(\omega)| \le \beta_N(\omega) |\Delta x| + \beta_N(\omega) \|\Delta y\|_\infty $$

for a.e. $\omega \in \Omega$ in the case $p = +\infty$. The right-hand sides of these inequalities are Lebesgue integrable and do not depend on $n$. Thus, the sequence $\{\theta_n\}$ is dominated by a Lebesgue integrable function, which completes the proof.

With the use of Theorem 1 one can easily obtain sufficient conditions for the quasidifferentiability of the functional $\mathcal{I}$. Recall that $X = \mathbb{R}^d \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$.

Corollary 1.

Let $1 < p \le +\infty$ and Assumption 1 be valid. Then the functional $\mathcal{I}$ is quasidifferentiable on $\mathbb{R}^d \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$, and for any $(x, y)$ from this space the pair $\mathcal{D}\mathcal{I}(x, y) = [\underline{\partial}\mathcal{I}(x, y), \overline{\partial}\mathcal{I}(x, y)]$, defined as
\[
\underline{\partial}\mathcal{I}(x, y) = \Big\{ x^* \in X^* \Bigm| \langle x^*, (h_x, h_y) \rangle = \langle \mathbb{E}[v_x], h_x \rangle + \int_\Omega \langle v_y(\omega), h_y(\omega) \rangle \, dP(\omega) \;\; \forall (h_x, h_y) \in X, \;\; (0, v_x(\cdot), v_y(\cdot)) \text{ is a measurable selection of } \underline{d}_{x,y} f(x, y(\cdot), \cdot) \Big\}
\]
and
\[
\overline{\partial}\mathcal{I}(x, y) = \Big\{ y^* \in X^* \Bigm| \langle y^*, (h_x, h_y) \rangle = \langle \mathbb{E}[w_x], h_x \rangle + \int_\Omega \langle w_y(\omega), h_y(\omega) \rangle \, dP(\omega) \;\; \forall (h_x, h_y) \in X, \;\; (0, w_x(\cdot), w_y(\cdot)) \text{ is a measurable selection of } \overline{d}_{x,y} f(x, y(\cdot), \cdot) \Big\},
\]
is a quasidifferential of $\mathcal{I}$ at $(x, y)$. Moreover, the following equality holds true:
\[
\mathcal{I}'(x, y; h_x, h_y) = \int_\Omega \big[ f(\cdot, \cdot, \omega) \big]'(x, y(\omega); h_x, h_y(\omega)) \, dP(\omega) \quad \forall (h_x, h_y) \in X. \tag{10}
\]

Proof.

Applying Theorem 1 and the fact that any codifferentiable function $g$ with codifferential $Dg(x) = [\underline{d} g(x), \overline{d} g(x)]$ is quasidifferentiable and the pair
\[
\underline{\partial} g(x) = \big\{ x^* \in X^* \bigm| (0, x^*) \in \underline{d} g(x) \big\}, \qquad \overline{\partial} g(x) = \big\{ y^* \in X^* \bigm| (0, y^*) \in \overline{d} g(x) \big\}
\]
is a quasidifferential of $g$ at $x$ (see, e.g. [14, 20]), one obtains the required results on the quasidifferentiability of the functional $\mathcal{I}$.

To prove equality (10), recall that the set-valued maps $\underline{d}_{x,y} f(x, y(\cdot), \cdot)$ and $\overline{d}_{x,y} f(x, y(\cdot), \cdot)$ are measurable, as was shown in the proof of Lemma 1. Hence with the use of [1, Thrm. 8.2.4] one obtains that the set-valued mappings $\underline{\partial}_{x,y} f(x, y(\cdot), \cdot)$ and $\overline{\partial}_{x,y} f(x, y(\cdot), \cdot)$, defined according to equalities (2), are measurable as well. Consequently, applying the definition of quasidifferentiability and arguing in the same way as in the proof of Lemma 2 (or utilising the interchangeability principle; see, e.g. [35, Thrm. 14.60]) one gets that
\[
\mathcal{I}'(x, y; h_x, h_y) = \int_\Omega \Big( \max_{(v_x, v_y) \in \underline{\partial}_{x,y} f(x, y(\omega), \omega)} \big( \langle v_x, h_x \rangle + \langle v_y, h_y(\omega) \rangle \big) + \min_{(w_x, w_y) \in \overline{\partial}_{x,y} f(x, y(\omega), \omega)} \big( \langle w_x, h_x \rangle + \langle w_y, h_y(\omega) \rangle \big) \Big) \, dP(\omega)
\]
for all $(h_x, h_y) \in X$, which by the definition of quasidifferential of the function $f$ implies that equality (10) holds true.

Remark. In the particular case when the function $f$ does not depend on $y$, i.e. $f = f(x, \omega)$, the previous corollary contains sufficient conditions for the quasidifferentiability of the function $F(x) = \mathbb{E}[f(x, \cdot)]$. Quasidifferentiability of this function was studied in the recent paper [29] under different assumptions on the function $f$. Namely, instead of imposing any growth conditions, in [29] it was assumed that all integrals are correctly defined and the function $f$ is locally Lipschitz continuous in $x$ uniformly in $\omega$.

Let us finally show that under the assumptions of Theorem 1 the functional $\mathcal{I}(x, y)$ is not only codifferentiable, but also Lipschitz continuous on bounded sets.

Corollary 2.

Let $1 < p \le +\infty$ and Assumption 1 be valid. Then $\mathcal{I}$ is Lipschitz continuous on any bounded subset of the space $X = \mathbb{R}^d \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$.

Proof. With the use of the growth condition on the codifferential mapping $D_{x,y} f(\cdot)$ from Assumption 1 one can readily verify that both multifunctions $\underline{d}\mathcal{I}(\cdot)$ and $\overline{d}\mathcal{I}(\cdot)$ are bounded on bounded subsets of the space $X$. Therefore by [20, Corollary 2] the functional $\mathcal{I}$ is Lipschitz continuous on any bounded subset of this space.

Let, as above, $(\Omega, \mathfrak{A}, P)$ be a probability space. In this section we study a general two-stage stochastic programming problem of the form
\[
\min_{x \in A} \; \mathbb{E}\big[ F(x, \omega) \big], \tag{11}
\]
where $F(x, \omega)$ is the optimal value of the second stage problem
\[
\min_{y \in G(x, \omega)} f(x, y, \omega). \tag{12}
\]
Here $A \subset \mathbb{R}^d$ is a closed set, $f \colon \mathbb{R}^d \times \mathbb{R}^m \times \Omega \to \mathbb{R}$ is a Carathéodory function, and $G \colon \mathbb{R}^d \times \Omega \rightrightarrows \mathbb{R}^m$ is a multifunction. We assume that $G$ is measurable and for every $\omega \in \Omega$ the multifunction $G(\cdot, \omega)$ is closed.

Choose any $1 \le p \le +\infty$, and denote $X = \mathbb{R}^d \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$. By the interchangeability principle for two-stage stochastic programming (see, e.g. [40, Thrm. 2.20]), problem (11), (12) is equivalent to the following variational problem with pointwise constraints:
\[
\min_{(x, y) \in X} \; \mathbb{E}\big[ f(x, y(\cdot), \cdot) \big] \quad \text{subject to} \quad x \in A, \quad y(\omega) \in G(x, \omega) \text{ for a.e. } \omega \in \Omega, \tag{$\mathcal{P}$}
\]
in the sense that the optimal values of these problems coincide, and if this common optimal value is finite, then for any globally optimal solution $(x^*, y^*(\cdot))$ of the problem $(\mathcal{P})$ the point $x^*$ is a globally optimal solution of problem (11) and for a.e. $\omega \in \Omega$ the point $y^*(\omega)$ is a globally optimal solution of the second stage problem (12). Conversely, if $x^*$ is a globally optimal solution of problem (11) and for a.e. $\omega \in \Omega$ the point $y^*(\omega)$ is a globally optimal solution of problem (12) with $x = x^*$ such that $y^* \in L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$, then $(x^*, y^*)$ is a globally optimal solution of the problem $(\mathcal{P})$.

Since problem (11), (12) and the problem $(\mathcal{P})$ are equivalent, below we consider only the problem $(\mathcal{P})$. Our aim is to present several results on exact penalty functions for the problem $(\mathcal{P})$, which not only allow one to obtain optimality conditions for the original two-stage stochastic programming problem, but can also be used for the design and analysis of exact penalty methods for solving problem (11), (12).

Fix any $p \in [1, +\infty]$, and denote by
\[
\mathcal{I}(x, y) = \int_\Omega f(x, y(\omega), \omega) \, dP(\omega)
\]
the objective function of the problem $(\mathcal{P})$. Below we suppose that the functional $\mathcal{I}$ is correctly defined on the space $X := \mathbb{R}^d \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$ and does not take the value $-\infty$. In particular, it is sufficient to suppose that for any $x \in \mathbb{R}^d$ there exist $C > 0$ and $\beta \in L^1(\Omega, \mathfrak{A}, P)$ such that $|f(x, y, \omega)| \le \beta(\omega) + C |y|^p$ for a.e. $\omega \in \Omega$ and all $y \in \mathbb{R}^m$ in the case $p < +\infty$, and for any $x \in \mathbb{R}^d$ and $N > 0$ there exists $\beta_N \in L^1(\Omega, \mathfrak{A}, P)$ such that $|f(x, y, \omega)| \le \beta_N(\omega)$ for a.e. $\omega \in \Omega$ and all $y \in \mathbb{R}^m$ with $|y| \le N$.

Introduce the set
\[
M = \big\{ (x, y) \in X \bigm| y(\omega) \in G(x, \omega) \text{ for a.e. } \omega \in \Omega \big\}.
\]
Then the problem $(\mathcal{P})$ can be rewritten as follows:
\[
\min_{(x, y) \in X} \mathcal{I}(x, y) \quad \text{subject to} \quad (x, y) \in M \cap \big( A \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m) \big).
\]
Let $\varphi \colon X \to [0, +\infty]$ be any function such that $\varphi(x, y) = 0$ iff $(x, y) \in M$, and let $\Phi_c(x, y) = \mathcal{I}(x, y) + c \varphi(x, y)$. The function $\Phi_c$ is called a penalty function for the problem $(\mathcal{P})$ with $c \ge 0$ being the penalty parameter, while $\varphi$ is called a penalty term for the constraint $(x, y) \in M$.
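The construction of $\Phi_c$ can be sketched numerically on a finite probability space. The following is only an illustrative sketch under assumptions that are not part of the analysis above: two equiprobable scenarios, the hypothetical integrand $f(x, y, \omega) = y$, the hypothetical constraint set $G(x, \omega) = [x + \xi(\omega), +\infty)$, and the distance-based penalty term $\varphi(x, y) = (\mathbb{E}[\operatorname{dist}(y(\cdot), G(x, \cdot))^p])^{1/p}$ as one admissible choice of $\varphi$. All function and variable names are invented for the example.

```python
import numpy as np

def penalty_function(f, dist_to_G, x, y, probs, c, p=1.0):
    """Return (Phi_c(x, y), phi(x, y)) on a finite probability space.

    f(x, y_k, k)         -- integrand evaluated in scenario k
    dist_to_G(x, y_k, k) -- distance from y_k to the set G(x, omega_k)
    probs                -- scenario probabilities (positive, summing to 1)
    """
    probs = np.asarray(probs, dtype=float)
    n = len(probs)
    # I(x, y) = E[f(x, y(.), .)] written as a finite weighted sum
    objective = sum(probs[k] * f(x, y[k], k) for k in range(n))
    # phi(x, y) = (E[dist(y(.), G(x, .))^p])^(1/p); it vanishes iff every y_k is feasible
    d = np.array([dist_to_G(x, y[k], k) for k in range(n)], dtype=float)
    phi = float(probs @ d**p) ** (1.0 / p)
    return float(objective + c * phi), phi

# Hypothetical toy data: f(x, y, omega) = y and G(x, omega) = [x + xi(omega), +inf)
xi = [1.0, 2.0]
probs = [0.5, 0.5]
f = lambda x, y, k: y
dist_to_G = lambda x, y, k: max(0.0, x + xi[k] - y)

feasible = penalty_function(f, dist_to_G, 0.0, [1.0, 2.0], probs, c=2.0)
infeasible = penalty_function(f, dist_to_G, 0.0, [0.0, 0.0], probs, c=2.0)
cheap = penalty_function(f, dist_to_G, 0.0, [0.0, 0.0], probs, c=0.5)
```

In this toy instance with $x = 0$ fixed, the best feasible value is $1.5$. The infeasible point $y \equiv 0$ has penalized value $3.0$ for $c = 2$ but only $0.75$ for $c = 0.5$, so a too small penalty parameter lets an infeasible point undercut the constrained optimum.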
Our aim is to obtain sufficient conditions for the exactness of the penalty function $\Phi_c$.

Recall that the penalty function $\Phi_c$ is called globally exact, if there exists $c^* \ge 0$ such that for any $c \ge c^*$ the set of globally optimal solutions of the penalized problem
\[
\min_{(x, y) \in X} \Phi_c(x, y) \quad \text{subject to} \quad x \in A \tag{13}
\]
coincides with the set of globally optimal solutions of the problem $(\mathcal{P})$. The greatest lower bound of all such $c^*$ is called the least exact penalty parameter of the penalty function $\Phi_c$. One can verify that the penalty function $\Phi_c$ is globally exact iff there exists $c^* \ge 0$ such that for any $c \ge c^*$ the optimal values of the problem $(\mathcal{P})$ and problem (13) coincide, and the greatest lower bound of all such $c^*$ coincides with the least exact penalty parameter. See [11, 19, 22, 34, 38, 48] for more details on exact penalty functions.

Let us obtain sufficient conditions for the global exactness of the penalty function $\Phi_c$ with the penalty term $\varphi$ defined in several different ways. To this end, we will utilise general sufficient conditions for the exactness of penalty functions in metric and normed spaces from [19, 22], and the following auxiliary lemma, which is a slight generalization of [19, Prp. 3.13].

Lemma 3.

Let $Y$ be a normed space, $\mathcal{F} \subset Y$ be a nonempty set, and a function $F \colon Y \to \mathbb{R} \cup \{+\infty\}$ be such that for any bounded set $C \subset Y$ there exists a continuous from the right function $\omega_C \colon [0, +\infty) \to [0, +\infty)$ for which
\[
\big| F(y_1) - F(y_2) \big| \le \omega_C\big( \|y_1 - y_2\| \big) \quad \forall y_1, y_2 \in C. \tag{14}
\]
Then for any $R > 0$ there exists a bounded set $C \subset Y$ such that
\[
F(y) \ge \inf_{z \in \mathcal{F}} F(z) - \omega_C\big( \operatorname{dist}(y, \mathcal{F}) \big) \quad \forall y \in B(0, R) = \{ z \in Y \mid \|z\| \le R \}. \tag{15}
\]

Proof.

Denote $F^* = \inf_{z \in \mathcal{F}} F(z)$, and fix any $R > 0$ and $z \in \mathcal{F}$. By our assumption there exists a continuous from the right function $\omega_C$ such that inequality (14) holds true for $C = B(0, R + \|z\|)$.

Choose any $y \in B(0, R)$. If $y \in \mathcal{F}$, then inequality (15) trivially holds true. Suppose now that $y \in B(0, R) \setminus \mathcal{F}$. Clearly, there exists a sequence $\{y_n\} \subset \mathcal{F}$ such that $\|y - y_n\| \to \operatorname{dist}(y, \mathcal{F})$ as $n \to \infty$, and the inequalities $\|y - y_n\| \le \|y - z\| \le R + \|z\|$ and $\|y - y_n\| \ge \|y - y_{n+1}\|$ are satisfied for all $n \in \mathbb{N}$. By definition $\{y_n\} \subset C$, $y \in C$, and $F(y_n) \ge F^*$ for all $n \in \mathbb{N}$. Therefore, by applying inequality (14) one obtains that
\[
F^* - F(y) = F^* - F(y_n) + F(y_n) - F(y) \le F(y_n) - F(y) \le \omega_C\big( \|y - y_n\| \big)
\]
for any $n \in \mathbb{N}$. Hence passing to the limit as $n \to \infty$ with the use of the fact that the function $\omega_C$ is continuous from the right and the sequence $\{\|y - y_n\|\}$ is non-increasing one gets that inequality (15) holds true.

Remark. Note that if $F$ is Lipschitz continuous on bounded sets, then inequality (14) holds true with $\omega_C(t) = L_C t$, where $L_C$ is a Lipschitz constant of $F$ on $C$. In this case the statement of the lemma can be reformulated as follows: for any $R > 0$ there exists $L > 0$ such that $F(y) \ge F^* - L \operatorname{dist}(y, \mathcal{F})$ for all $y \in B(0, R)$. Thus, Lemma 3 provides a lower estimate of the decay of the function $F$ relative to a given set $\mathcal{F}$.

We start our analysis of the exactness of the penalty function $\Phi_c$ with the simplest case when the penalty term $\varphi$ is defined via the distance function to the multifunction $G$. Denote by $\mathcal{I}^*$ the optimal value of the problem $(\mathcal{P})$.

Theorem 2.

Let there exist a globally optimal solution of the problem $(\mathcal{P})$, the set-valued mapping $G$ have closed images, and
\[
\varphi(x, y) = \Big( \mathbb{E}\big[ \operatorname{dist}(y(\cdot), G(x, \cdot))^p \big] \Big)^{1/p} \quad \forall (x, y) \in X
\]
in the case $p < +\infty$, and $\varphi(x, y) = \operatorname{ess\,sup}_{\omega \in \Omega} \operatorname{dist}(y(\omega), G(x, \omega))$ for all $(x, y) \in X$ in the case $p = +\infty$. Suppose also that the functional $\mathcal{I}$ is Lipschitz continuous on bounded sets, and there exists $c \ge 0$ such that the set $\{ (x, y) \in X \mid x \in A, \; \Phi_c(x, y) < \mathcal{I}^* \}$ is bounded. Then the penalty function $\Phi_c$ is globally exact.

Proof. Observe that the function $\varphi$ is correctly defined for all $(x, y) \in X$, since the multifunction $G$ is measurable. Moreover, $\varphi$ is nonnegative, and $\varphi(x, y) = 0$ iff $(x, y) \in M$. Denote by $\mathcal{F}$ the feasible set of the problem $(\mathcal{P})$. Let us show that
\[
\varphi(x, y) \ge \operatorname{dist}\big( (x, y), \mathcal{F} \big) \quad \forall x \in A, \; y \in L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m). \tag{16}
\]
Indeed, fix any $(x, y) \in X$ such that $x \in A$. If $\varphi(x, y) = +\infty$, then inequality (16) obviously holds true. Suppose now that $\varphi(x, y) < +\infty$. Then, in particular, $G(x, \omega) \ne \emptyset$ for a.e. $\omega \in \Omega$.

By our assumptions the multifunction $G$ is measurable and has closed images. Therefore by [1, Crlr. 8.2.13] there exists a measurable selection $z$ of the set-valued mapping $G(x, \cdot)$ such that
\[
|y(\omega) - z(\omega)| = \operatorname{dist}\big( y(\omega), G(x, \omega) \big) \quad \text{for a.e. } \omega \in \Omega.
\]
Let us check that $z \in L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$. Then $(x, z) \in \mathcal{F}$ and
\[
\varphi(x, y) = \|y - z\|_p = \big\| (x, y) - (x, z) \big\| \ge \operatorname{dist}\big( (x, y), \mathcal{F} \big),
\]
that is, inequality (16) holds true.

To verify that $z$ belongs to the space $L^p$, observe that
\[
|z(\omega)| \le |y(\omega)| + |z(\omega) - y(\omega)| = |y(\omega)| + \operatorname{dist}\big( y(\omega), G(x, \omega) \big)
\]
for a.e. $\omega \in \Omega$. The right-hand side of this inequality belongs to $L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$ due to the fact that $\varphi(x, y) < +\infty$. Therefore the function $z$ belongs to this space as well. Thus, inequality (16) holds true.
Since the functional $\mathcal{I}$ is Lipschitz continuous on bounded sets, by Lemma 3 for any $R > 0$ there exists $L > 0$ such that
\[
\mathcal{I}(x, y) \ge \mathcal{I}^* - L \operatorname{dist}\big( (x, y), \mathcal{F} \big) \quad \forall (x, y) \in B(0, R).
\]
Hence by [19, Prp. 3.16 and Remark 15, part (ii)] the penalty function $\Phi_c$ is globally exact.

Remark. Note that by Corollary 2 the functional $\mathcal{I}$ is Lipschitz continuous on bounded sets in the case $p > 1$, provided the integrand $f$ satisfies Assumption 1. In turn, as one can readily verify, the set $\{ (x, y) \in X \mid x \in A, \; \Phi_c(x, y) < \mathcal{I}^* \}$ is bounded for some $c \ge 0$, if $1 \le p < +\infty$ and one of the following conditions is satisfied:

1. the set $A$ is bounded, and the multifunction $G$ is bounded on $A \times \Omega$;
2. the set $A$ is bounded, and there exist $C > 0$ and $\beta \in L^1(\Omega, \mathfrak{A}, P)$ such that $f(x, y, \omega) \ge C |y|^p + \beta(\omega)$ for all $(x, y) \in A \times \mathbb{R}^m$ and a.e. $\omega \in \Omega$;
3. the multifunction $G$ is bounded on $A \times \Omega$, and there exist $\beta \in L^1(\Omega, \mathfrak{A}, P)$ and a function $\rho \colon [0, +\infty) \to [0, +\infty)$ such that $\rho(t) \to +\infty$ as $t \to +\infty$, and $f(x, y, \omega) \ge \rho(|x|) + \beta(\omega)$ for all $(x, y) \in \mathbb{R}^{d+m}$ and a.e. $\omega \in \Omega$;
4. there exist $C > 0$, $\beta \in L^1(\Omega, \mathfrak{A}, P)$, and a function $\rho \colon [0, +\infty) \to [0, +\infty)$ such that $\rho(t) \to +\infty$ as $t \to +\infty$, and $f(x, y, \omega) \ge \rho(|x|) + C |y|^p + \beta(\omega)$ for all $(x, y) \in \mathbb{R}^{d+m}$ and a.e. $\omega \in \Omega$;
5. $(\Omega, \mathfrak{A}, P)$ is a finite probability space, and $\min_{\omega \in \Omega} f(x, y, \omega) \to +\infty$ as $|x| + |y| \to +\infty$.

In the case $p = +\infty$ the set $\{ (x, y) \in X \mid x \in A, \; \Phi_c(x, y) < \mathcal{I}^* \}$ is bounded, provided the first, the third or the last of the assumptions above is satisfied.

In most particular cases the feasible set $G(x, \omega)$ of the second stage problem (12) is not defined explicitly, but rather via some constraints. As a result, one usually does not know an explicit expression for the penalty term $\varphi$ from Theorem 2, which makes this theorem inapplicable to real-world problems, at least in a direct way. In some cases Theorem 2 can still be applied indirectly to reduce an analysis of the exactness of a penalty function for the problem $(\mathcal{P})$ to an analysis of constraints of the second stage problem. Let us explain this statement with the use of a simple example.

Example 1.

Suppose that the set-valued map $G$ is defined in the following way:
\[
G(x, \omega) = \big\{ y \in \mathbb{R}^m \bigm| 0 \in Q(x, y, \omega) \big\},
\]
where $Q \colon \mathbb{R}^d \times \mathbb{R}^m \times \Omega \rightrightarrows \mathbb{R}^s$ is a multifunction with closed images. In other words, the second stage problem (12) has the form:
\[
\min_y f(x, y, \omega) \quad \text{subject to} \quad 0 \in Q(x, y, \omega).
\]
In this case it is natural to define
\[
\varphi(x, y) = \Big( \mathbb{E}\big[ \operatorname{dist}(0, Q(x, y(\cdot), \cdot))^p \big] \Big)^{1/p}, \quad 1 \le p < +\infty.
\]
Then $\varphi(x, y) = 0$ iff $(x, y) \in M$. Suppose that there exists $K > 0$ such that
\[
K \operatorname{dist}(0, Q(x, y, \omega)) \ge \operatorname{dist}(y, G(x, \omega)) \quad \forall x \in A, \; y \in \mathbb{R}^m, \; \omega \in \Omega,
\]
that is, the function $g(y) = \operatorname{dist}(0, Q(x, y, \omega))$ admits a global error bound uniform for all $x \in A$ and $\omega \in \Omega$. Then
\[
\Phi_{Kc}(x, y) = \mathcal{I}(x, y) + K c \varphi(x, y) \ge \mathcal{I}(x, y) + c \psi(x, y)
\]
for all $x \in A$ and $y \in L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$, where
\[
\psi(x, y) = \Big( \mathbb{E}\big[ \operatorname{dist}(y(\cdot), G(x, \cdot))^p \big] \Big)^{1/p}.
\]
Therefore, as one can readily verify (cf. [19, Prp. 2.2]), under the assumptions of Theorem 2 the penalty function $\Phi_c$ is globally exact and its least exact penalty parameter is at most $K$ times greater than the least exact penalty parameter of the penalty function from Theorem 2.

Let us also point out two simple cases when Theorem 2 can be applied directly, that is, the cases when one can write a simple explicit expression for the penalty term $\varphi$ from this theorem. Note that Theorem 2 can be applied directly whenever the distance from a given point $y$ to the set $G(x, \omega)$ is easy to compute, e.g. when the set $G(x, \omega)$ is defined by linear or, more generally, convex quadratic constraints.

Example 2. Let $I := \{1, \ldots, m\}$. Suppose that the set $G(x, \omega)$ is defined by bound (box) constraints, that is,
\[
G(x, \omega) = \big\{ y = (y_1, \ldots, y_m)^T \in \mathbb{R}^m \bigm| a_i(x, \omega) \le y_i \le b_i(x, \omega), \; i \in I \big\}
\]
for some given functions $a_i$ and $b_i$. Let the space $\mathbb{R}^m$ be equipped with the $\ell_\infty$ norm. Then the penalty term $\varphi$ from Theorem 2 has the form
\[
\varphi(x, y) = \Big( \int_\Omega \max_{i \in I} \big\{ 0, \; y_i(\omega) - b_i(x, \omega), \; a_i(x, \omega) - y_i(\omega) \big\}^p \, dP(\omega) \Big)^{1/p}
\]
in the case $1 \le p < +\infty$.

Example 3.

Let $G(x, \omega) = B(z(x, \omega), R(x, \omega))$ be the closed ball with centre $z(x, \omega)$ and radius $R(x, \omega)$. Then the penalty term $\varphi$ from Theorem 2 has the form
\[
\varphi(x, y) = \Big( \int_\Omega \max\big\{ 0, \; |y(\omega) - z(x, \omega)| - R(x, \omega) \big\}^p \, dP(\omega) \Big)^{1/p}
\]
in the case $1 \le p < +\infty$.

Observe that the penalty terms from Theorem 2 and the examples above depend on the parameter $p$ that defines the space in which one solves the problem $(\mathcal{P})$. This parameter must be chosen to satisfy the assumption of Theorem 2.

Under some additional assumptions on constraints of the second stage problem one can prove the global exactness of the penalty function $\Phi_c$ with a penalty term $\varphi$ that does not depend on $p$. For the sake of simplicity, we will prove this result only in the case when the feasible set $G(x, \omega)$ of the second stage problem is defined by inequality constraints, i.e. it has the form
\[
G(x, \omega) = \big\{ y \in \mathbb{R}^m \bigm| g_i(x, y, \omega) \le 0, \; i \in I = \{1, \ldots, \ell\} \big\}
\]
for some functions $g_i \colon \mathbb{R}^d \times \mathbb{R}^m \times \Omega \to \mathbb{R}$. Below we suppose that for each $x \in \mathbb{R}^d$ the map $(y, \omega) \mapsto g_i(x, y, \omega)$, $i \in I$, is a Carathéodory function, so that the penalty term
\[
\varphi(x, y) = \int_\Omega \max_{i \in I} \big\{ 0, \; g_i(x, y(\omega), \omega) \big\} \, dP(\omega) \tag{17}
\]
is correctly defined. Note that $\varphi(x, y) = 0$ iff $(x, y) \in M$. We will assume that for any $x \in \mathbb{R}^d$ and a.e. $\omega \in \Omega$ the function $y \mapsto g_i(x, y, \omega)$, $i \in I$, is quasidifferentiable and denote by $\mathcal{D}_y g_i(x, y, \omega) = [\underline{\partial}_y g_i(x, y, \omega), \overline{\partial}_y g_i(x, y, \omega)]$ its quasidifferential. Denote also $I(x, y, \omega) = \{ i \in I \mid g_i(x, y, \omega) = \max_{k \in I} g_k(x, y, \omega) \}$.

Let $(Y, d)$ be a metric space, $K \subset Y$ be a given set, and $g \colon Y \to \mathbb{R} \cup \{+\infty\}$ be a given function. Recall that for any $y \in K \cap \operatorname{dom} g$ the quantity
\[
g^{\downarrow}_K(y) = \liminf_{z \to y, \; z \in K} \frac{g(z) - g(y)}{d(z, y)}
\]
is called the rate of steepest descent of $g$ at $y$. If $y$ is not a limit point of the set $K$, then by definition $g^{\downarrow}_K(y) = +\infty$.
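For instance, under the toy assumption (made here only for illustration) that $Y = K = \mathbb{R}$ with the usual metric and $g(y) = |y|$, the rate of steepest descent can be computed directly from the definition:
\[
g^{\downarrow}_K(0) = \liminf_{z \to 0} \frac{|z| - 0}{|z - 0|} = 1,
\qquad
g^{\downarrow}_K(1) = \liminf_{z \to 1} \frac{|z| - 1}{|z - 1|} = \liminf_{z \to 1} \operatorname{sign}(z - 1) = -1 .
\]
Thus the quantity is nonnegative at the minimizer $y = 0$ and negative at $y = 1$, where moving to the left decreases $g$ at unit rate.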
Recall also that a point $y \in K \cap \operatorname{dom} g$ is called an inf-stationary point of $g$ on the set $K$, if $g^{\downarrow}_K(y) \ge 0$. It should be noted that in various particular cases this inequality is reduced to standard stationarity conditions. For example, if $Y$ is a normed space, $g$ is Fréchet differentiable at a point $y \in K$, and the set $K$ is convex, then $g^{\downarrow}_K(y) \ge 0$ iff $g'(y)[z - y] \ge 0$ for all $z \in K$, where $g'(y)$ is the Fréchet derivative of $g$ at $y$. See [10, 11, 43, 44] for more details on the rate of steepest descent and the definition of inf-stationarity.

Theorem 3.

Let $1 \le p < +\infty$ and the following assumptions be valid:

1. there exists a globally optimal solution of the problem $(\mathcal{P})$;
2. the functional $\mathcal{I}$ is Lipschitz continuous on bounded sets;
3. the set $S_c(\gamma) = \{ (x, y) \in X \mid x \in A, \; \Phi_c(x, y) < \gamma \}$ is bounded for some $c \ge 0$ and $\gamma > \mathcal{I}^*$, where $\Phi_c$ is the penalty function with the penalty term (17);
4. for any $x \in A$ there exists an a.e. nonnegative function $L(\cdot) \in L^1(\Omega, \mathfrak{A}, P)$ such that $|g_i(x, y_1, \omega) - g_i(x, y_2, \omega)| \le L(\omega) \|y_1 - y_2\|$ for all $y_1, y_2 \in \mathbb{R}^m$, all $i \in I$ and a.e. $\omega \in \Omega$;
5. for all $i \in I$, $x \in A$, and $y \in L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$ the set-valued mappings $\underline{\partial}_y g_i(x, y(\cdot), \cdot)$ and $\overline{\partial}_y g_i(x, y(\cdot), \cdot)$ are measurable;
6. there exists $a > 0$ such that for any $(x, y) \in A \times \mathbb{R}^m$ and a.e. $\omega \in \Omega$ such that $y \notin G(x, \omega)$, and for all $i \in I(x, y, \omega)$ one can find $w_i(x, y, \omega) \in \overline{\partial}_y g_i(x, y, \omega)$ satisfying the following condition:
\[
\operatorname{dist}\Big( 0, \; \operatorname{co}\big\{ \underline{\partial}_y g_i(x, y, \omega) + w_i(x, y, \omega) \bigm| i \in I(x, y, \omega) \big\} \Big) \ge a. \tag{18}
\]

Then the penalty function $\Phi_c$ with the penalty term (17) is globally exact and there exists $c^* \ge 0$ such that for any $c \ge c^*$ the following statements hold true:

1. $(x^*, y^*) \in S_c(\gamma)$ is a locally optimal solution of the penalized problem (13) iff $(x^*, y^*)$ is a locally optimal solution of the problem $(\mathcal{P})$;
2. $(x^*, y^*) \in S_c(\gamma)$ is an inf-stationary point of the penalty function $\Phi_c$ on the set $A \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$ iff $(x^*, y^*)$ is an inf-stationary point of the functional $\mathcal{I}$ on the feasible set $\mathcal{F}$ of the problem $(\mathcal{P})$.

Proof. Let us show that under the assumptions of the theorem $\varphi(x, \cdot)^{\downarrow}(y) \le -a$ for any $(x, y) \in X \setminus \mathcal{F}$ such that $x \in A$ and $\varphi(x, y) < +\infty$ (here $\varphi(x, \cdot)^{\downarrow}(y)$ is the rate of steepest descent of the function $y \mapsto \varphi(x, y)$ at the point $y$). Then applying [22, Thrm. 2] one obtains the required result.

To prove the required estimate for $\varphi(x, \cdot)^{\downarrow}(y)$, we first construct a descent direction for the function $\varphi$ using condition (18), and then obtain an upper estimate for the rate of steepest descent via the directional derivative of $\varphi$ along the constructed descent direction.

Fix any $(x, y) \in X \setminus \mathcal{F}$ such that $x \in A$ and $\varphi(x, y) < +\infty$. Recall that by the definition of quasidifferential one has
\[
Q_i(h, \omega) = \big( g_i(x, \cdot, \omega) \big)'(y(\omega), h) = \max_{v \in \underline{\partial}_y g_i(x, y(\omega), \omega)} \langle v, h \rangle + \min_{w \in \overline{\partial}_y g_i(x, y(\omega), \omega)} \langle w, h \rangle \tag{19}
\]
(see Def. 3). Applying Assumption 5 and [1, Thrm. 8.2.11] one obtains that the function $Q_i$ is measurable in $\omega$ for any $h \in \mathbb{R}^m$. Moreover, since in the finite dimensional case the quasidifferential is a pair of compact convex sets, the function $Q_i$ is continuous in $h$ for a.e. $\omega \in \Omega$, i.e. $Q_i$ is a Carathéodory function.

Let us now prove that the multifunction $I(\cdot) := I(x, y(\cdot), \cdot)$, $I \colon \Omega \rightrightarrows \{1, \ldots, \ell\}$, is measurable. Indeed, by definition for any nonempty subset $K \subseteq \{1, \ldots, \ell\}$ one has
\[
I^{-1}(K) = \big\{ \omega \in \Omega \bigm| I(x, y(\omega), \omega) \cap K \ne \emptyset \big\} = \Big\{ \omega \in \Omega \Bigm| \max_{k \in K} g_k(x, y(\omega), \omega) \ge \max_{i \in I} g_i(x, y(\omega), \omega) \Big\}.
\]
This set is measurable, since the functions $g_i(x, y(\cdot), \cdot)$ are measurable due to the fact that the maps $(y, \omega) \mapsto g_i(x, y, \omega)$ are Carathéodory functions by our assumption. Thus, for any subset $K \subseteq \{1, \ldots, \ell\}$ the set $I^{-1}(K)$ is measurable, that is, the set-valued map $I(\cdot)$ is measurable by definition (see, e.g. [1, Def. 8.1.1]).

Introduce the set
\[
E = \Big\{ \omega \in \Omega \Bigm| \max_{i \in I} g_i(x, y(\omega), \omega) > 0 \Big\}.
\]
Note that the set $E$ is measurable, thanks to our assumption that the maps $(y, \omega) \mapsto g_i(x, y, \omega)$ are Carathéodory functions.
Moreover, $P(E) > 0$, since $(x, y)$ is not a feasible point of the problem $(\mathcal{P})$.

Since the multifunction $I(\cdot)$ is measurable and $Q_i$ are Carathéodory functions, the set-valued mapping
\[
H(\omega) := \Big\{ h \in \mathbb{R}^m \Bigm| |h| = 1, \; \max_{i \in I(\omega)} Q_i(h, \omega) = \min_{|z| = 1} \max_{i \in I(\omega)} Q_i(z, \omega) \Big\}, \quad \omega \in E,
\]
is measurable by [1, Thrm. 8.2.11]. Furthermore, this multifunction obviously has closed images. Therefore by [1, Thrm. 8.1.3] there exists a measurable function $h^* \colon E \to \mathbb{R}^m$ such that $h^*(\omega) \in H(\omega)$ for all $\omega \in E$. For any $\omega \in \Omega \setminus E$ define $h^*(\omega) = 0$. Then $h^* \colon \Omega \to \mathbb{R}^m$ is a measurable function and, moreover, $\|h^*\|_p = P(E) > 0$.

By condition (18), for a.e. $\omega \in E$ there exists $\widehat{h}(\omega) \in \mathbb{R}^m$ with $|\widehat{h}(\omega)| = 1$ such that
\[
\langle v, \widehat{h}(\omega) \rangle \le -a \quad \forall v \in \operatorname{co}\big\{ \underline{\partial}_y g_i(x, y(\omega), \omega) + w_i(x, y(\omega), \omega) \bigm| i \in I(\omega) \big\}.
\]
Hence with the use of (19) one obtains that $Q_i(\widehat{h}(\omega), \omega) \le -a$ for all $\omega \in E$ and $i \in I(\omega)$, which by the definition of $h^*$ implies that
\[
\max_{i \in I(\omega)} Q_i(h^*(\omega), \omega)
\begin{cases}
\le -a, & \text{if } \omega \in E, \\
= 0, & \text{if } \omega \notin E.
\end{cases} \tag{20}
\]
Thus, the function $h^*$ is the desired descent direction, along which we will evaluate the directional derivative of the penalty term $\varphi$.

Indeed, denote $\psi(\omega, \alpha) = \max_{i \in I} \{ 0, g_i(x, y(\omega) + \alpha h^*(\omega), \omega) \}$ for all $\alpha \ge 0$ and $\omega \in \Omega$. Applying relations (20) and standard calculus rules for directional derivatives (see, e.g. [14]) one gets that
\[
\lim_{\alpha \to +0} \frac{\psi(\omega, \alpha) - \psi(\omega, 0)}{\alpha} =
\begin{cases}
\max_{i \in I(\omega)} Q_i(h^*(\omega), \omega) \le -a, & \text{if } \omega \in E, \\
0, & \text{if } \omega \notin E.
\end{cases}
\]
Applying Assumption 4 and the well-known fact that the maximum of a finite family of Lipschitz continuous functions is Lipschitz continuous (see, e.g. [13, Appendix III]) one obtains that there exists an a.e. nonnegative function $L(\cdot) \in L^1(\Omega, \mathfrak{A}, P)$ such that
\[
\Big| \frac{\psi(\omega, \alpha) - \psi(\omega, 0)}{\alpha} \Big| \le L(\omega) |h^*(\omega)| \le L(\omega) \quad \forall \alpha > 0, \; \text{a.e. } \omega \in \Omega.
\]
Note also that $\psi(\cdot, 0) \in L^1(\Omega, \mathfrak{A}, P)$, since $\varphi(x, y) < +\infty$.
Hence by the inequality above $\psi(\cdot, \alpha) \in L^1(\Omega, \mathfrak{A}, P)$ for all $\alpha > 0$. Consequently, applying Lebesgue's dominated convergence theorem and the fact that $\varphi(x, y + \alpha h^*) = \mathbb{E}[\psi(\cdot, \alpha)]$ one obtains that
\[
\big[ \varphi(x, \cdot) \big]'(y; h^*) = \lim_{\alpha \to +0} \frac{\varphi(x, y + \alpha h^*) - \varphi(x, y)}{\alpha} = \int_E \max_{i \in I(\omega)} Q_i(h^*(\omega), \omega) \, dP(\omega) \le -a P(E).
\]
Therefore
\[
\varphi(x, \cdot)^{\downarrow}(y) = \liminf_{z \to y} \frac{\varphi(x, z) - \varphi(x, y)}{\|z - y\|_p} \le \liminf_{\alpha \to +0} \frac{\varphi(x, y + \alpha h^*) - \varphi(x, y)}{\alpha \|h^*\|_p} = \frac{\big[ \varphi(x, \cdot) \big]'(y; h^*)}{\|h^*\|_p} \le \frac{-a P(E)}{P(E)} = -a,
\]
and the proof is complete.

Remark. (i) Note that by [35, Crlr. 14.14] the multifunctions $\underline{\partial}_y g_i(x, y(\cdot), \cdot)$ and $\overline{\partial}_y g_i(x, y(\cdot), \cdot)$ are measurable for any measurable function $y(\cdot)$, provided for any $\omega \in \Omega$ the mapping $\partial_y g_i(x, \cdot, \omega)$ is outer semicontinuous and the graphical mapping $\omega \mapsto \operatorname{Graph} \partial_y g_i(x, \cdot, \omega)$ is measurable.

(ii) In the case when the functions $g_i$ are continuously differentiable in $y$, assumption (18) is satisfied iff there exists $a > 0$ such that for any $(x, y) \in \mathbb{R}^{d+m}$ and a.e. $\omega \in \Omega$ such that $y \notin G(x, \omega)$ one has
\[
\operatorname{dist}\Big( 0, \; \operatorname{co}\big\{ \nabla_y g_i(x, y, \omega) \bigm| i \in I(x, y, \omega) \big\} \Big) \ge a.
\]
This condition can be viewed as a uniform Mangasarian-Fromovitz constraint qualification. In turn, in the case when the functions $g_i$ are convex in $y$, assumption (18) is satisfied iff there exists $a > 0$ such that for any $(x, y) \in \mathbb{R}^{d+m}$ and a.e. $\omega \in \Omega$ such that $y \notin G(x, \omega)$ one has
\[
\operatorname{dist}\Big( 0, \; \operatorname{co}\big\{ \partial_y g_i(x, y, \omega) \bigm| i \in I(x, y, \omega) \big\} \Big) \ge a,
\]
where $\partial_y g_i(x, y, \omega)$ is the subdifferential of the function $g_i(x, \cdot, \omega)$ in the sense of convex analysis.

Remark. Suppose that for a.e.
$\omega \in \Omega$ the functions $(x, y) \mapsto f(x, y, \omega)$ and $(x, y) \mapsto g_i(x, y, \omega)$, $i \in I$, are DC (Difference-of-Convex), that is, there exist convex in $(x, y)$ functions $f_1(x, y, \omega)$, $f_2(x, y, \omega)$, $g_{i1}(x, y, \omega)$, and $g_{i2}(x, y, \omega)$ such that
\[
f(x, y, \omega) = f_1(x, y, \omega) - f_2(x, y, \omega), \qquad g_i(x, y, \omega) = g_{i1}(x, y, \omega) - g_{i2}(x, y, \omega)
\]
for all $(x, y) \in \mathbb{R}^{d+m}$, $i \in I$, and a.e. $\omega \in \Omega$. Then the penalty function from Theorem 3 is DC as well. Namely, one has $\Phi_c(x, y) = \Phi_c^1(x, y) - \Phi_c^2(x, y)$, where
\[
\Phi_c^1(x, y) = \int_\Omega \Big( f_1(x, y(\omega), \omega) + c \max_{i \in I} \Big\{ \sum_{k \in I} g_{k2}(x, y(\omega), \omega), \; g_{i1}(x, y(\omega), \omega) + \sum_{k \ne i} g_{k2}(x, y(\omega), \omega) \Big\} \Big) \, dP(\omega)
\]
and
\[
\Phi_c^2(x, y) = \int_\Omega \Big( f_2(x, y(\omega), \omega) + c \sum_{i \in I} g_{i2}(x, y(\omega), \omega) \Big) \, dP(\omega)
\]
are convex functionals. Therefore with the use of Theorem 3 and well-known global optimality conditions for DC optimization problems one can easily obtain global optimality conditions for the problem $(\mathcal{P})$ and the original two-stage stochastic programming problem (cf. [41]). Moreover, under the assumptions of Theorem 3 one can apply well-developed methods of DC optimization to find local or global minima of the DC penalty function $\Phi_c(x, y)$, which coincide with local/global minima of the problem $(\mathcal{P})$. Thus, Theorem 3 opens a way for applications of DC programming algorithms to two-stage stochastic programming problems (cf. [33, 42]).

Let us finally derive optimality conditions for the problem $(\mathcal{P})$ in terms of codifferentials.
We will derive these conditions by applying standard optimality conditions for quasidifferentiable functions to an exact penalty function for the problem $(\mathcal{P})$.

For the sake of shortness, we will consider only the case when the set $A$ is convex and obtain optimality conditions under the assumptions of Theorem 3. It should be noted that one can obtain such conditions under less restrictive assumptions on the functional $\mathcal{I}$ and the penalty function $\Phi_c$, if one considers the so-called local exactness of the penalty function instead of the global one (see [11, 19]). Moreover, one can significantly relax the assumptions on the constraints of the second-stage problem by considering the case $p = +\infty$ and utilising the highly nonsmooth penalty term
\[
\varphi(x, y) = \operatorname*{ess\,sup}_{\omega \in \Omega} \Big\{ \max_{i \in I} \big\{ 0, \; g_i(x, y(\omega), \omega) \big\} \Big\}.
\]
However, the price one has to pay for less restrictive assumptions on constraints is the reduced regularity of Lagrange multipliers (see the theorem below). Namely, in this case one must assume that the Lagrange multipliers are just finitely additive measures.

For any convex subset $K$ of a Banach space $Y$ and any $y \in K$ denote by $N_K(y) = \{ y^* \in Y^* \mid \langle y^*, z - y \rangle \le 0 \;\; \forall z \in K \}$ the normal cone to the set $K$ at the point $y$.

Theorem 4.

Let $1 < p < +\infty$, the set $A$ be convex, the feasible set of the second-stage problem (12) have the form
\[
G(x, \omega) = \big\{ y \in \mathbb{R}^m \bigm| g_i(x, y, \omega) \le 0, \; i \in I = \{1, \ldots, \ell\} \big\}
\]
for some functions $g_i \colon \mathbb{R}^d \times \mathbb{R}^m \times \Omega \to \mathbb{R}$, the function $f$ satisfy Assumption 1, and the functions $g_i$, $i \in I$, satisfy the same assumption. Suppose also that assumptions 1, 3–6 of Theorem 3 are valid, and $(x^*, y^*)$ is a locally optimal solution of the problem $(\mathcal{P})$ such that $(x^*, y^*) \in S_c(\gamma)$ for some $c \ge c^*$, where $c^*$ is from Theorem 3.

Then for any measurable selection $(0, w_x(\cdot), w_y(\cdot))$ of the set-valued mapping $\overline{d}_{x,y} f(x^*, y^*(\cdot), \cdot)$ and any measurable selections $(0, w_{xi}(\cdot), w_{yi}(\cdot))$ of the multifunctions $\overline{d}_{x,y} g_i(x^*, y^*(\cdot), \cdot)$, $i \in I$, there exist $\zeta \in L^1(\Omega, \mathfrak{A}, P; \mathbb{R}^d)$ and nonnegative multipliers $\lambda_i \in L^\infty(\Omega, \mathfrak{A}, P)$, $i \in I$, such that $\mathbb{E}[\zeta] \in -N_A(x^*)$, $\sum_{i \in I} \|\lambda_i\|_\infty \le c^*$, $\lambda_i(\omega) g_i(x^*, y^*(\omega), \omega) = 0$ for a.e. $\omega \in \Omega$ and all $i \in I$, and
\[
(0, \zeta(\omega), 0) \in \underline{d}_{x,y} f(x^*, y^*(\omega), \omega) + (0, w_x(\omega), w_y(\omega)) + \sum_{i=1}^{\ell} \lambda_i(\omega) \Big( \underline{d}_{x,y} g_i(x^*, y^*(\omega), \omega) + (0, w_{xi}(\omega), w_{yi}(\omega)) \Big)
\]
for a.e. $\omega \in \Omega$.

Proof. Under the assumptions of the theorem the functional $\mathcal{I}$ is Lipschitz continuous on bounded sets by Corollary 2. Let
\[
\varphi(x, y) = \int_\Omega \max_{i \in I} \big\{ 0, \; g_i(x, y(\omega), \omega) \big\} \, dP(\omega) \quad \forall (x, y) \in X.
\]
Then by Theorem 3 the pair $(x^*, y^*)$ is a point of local minimum of the penalty function $\Phi_c$ on the set $A \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$ for any $c \ge c^*$, where $c^*$ is from Theorem 3. Thus, in particular, $(x^*, y^*)$ is a point of local minimum of the problem
\[
\min_{(x, y)} \; J(x, y) = \int_\Omega f_0(x, y(\omega), \omega) \, dP(\omega) \quad \text{s.t.} \quad (x, y) \in A \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m),
\]
where $f_0 = f + c^* \max_{i \in I} \{ 0, g_i \}$. The function $f_0$ is codifferentiable in $(x, y)$, and applying codifferential calculus (see, e.g. [14]) one can compute its codifferential and verify that $f_0$ satisfies Assumption 1. Therefore by Corollary 1 the functional $J$ is directionally differentiable. Applying well-known necessary conditions for a minimum of a directionally differentiable function on a convex set (see, e.g. [14, Lemma V.1.2]) and Corollary 1 one obtains that
\[
J'(x^*, y^*; h_x, h_y) = \int_\Omega \big[ f_0(\cdot, \cdot, \omega) \big]'(x^*, y^*(\omega); h_x, h_y(\omega)) \, dP(\omega) \ge 0
\]
for all $(h_x, h_y) \in (A - x^*) \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$. Hence with the use of the standard calculus rules for directional derivatives (see [14, Sect. I.3]) one gets that for all such $(h_x, h_y)$ the following inequality holds true:
\[
J'(x^*, y^*; h_x, h_y) = \int_\Omega \Big( \big[ f(\cdot, \cdot, \omega) \big]'(x^*, y^*(\omega); h_x, h_y(\omega)) + c^* \max_{i \in \widehat{I}(\omega)} \big[ g_i(\cdot, \cdot, \omega) \big]'(x^*, y^*(\omega); h_x, h_y(\omega)) \Big) \, dP(\omega) \ge 0,
\]
where $g_0(x, y, \omega) \equiv 0$ and
\[
\widehat{I}(\omega) = \Big\{ i \in I \cup \{0\} \Bigm| g_i(x^*, y^*(\omega), \omega) = \max_{k \in I} \big\{ 0, \; g_k(x^*, y^*(\omega), \omega) \big\} \Big\}.
\]
Fix any measurable selection $(0, w_x(\cdot), w_y(\cdot))$ of the set-valued mapping $\overline{d}_{x,y} f(x^*, y^*(\cdot), \cdot)$ and any measurable selections $(0, w_{xi}(\cdot), w_{yi}(\cdot))$ of the set-valued mappings $\overline{d}_{x,y} g_i(x^*, y^*(\cdot), \cdot)$, $i \in I$, and denote $(w_{x0}(\cdot), w_{y0}(\cdot)) \equiv 0$. Then by the definition of quasidifferential (Def. 3) and equality (2) one has
\[
\int_\Omega \Big( \max_{(v_x, v_y) \in \underline{\partial} f(x^*, y^*(\omega), \omega)} \big( \langle v_x + w_x(\omega), h_x \rangle + \langle v_y + w_y(\omega), h_y(\omega) \rangle \big) + c^* \max_{i \in \widehat{I}(\omega)} \max \big( \langle v_{xi} + w_{xi}(\omega), h_x \rangle + \langle v_{yi} + w_{yi}(\omega), h_y(\omega) \rangle \big) \Big) \, dP(\omega) \ge 0
\]
for all $(h_x, h_y) \in (A - x^*) \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$, where the last maximum is taken over all $(v_{xi}, v_{yi}) \in \underline{\partial} g_i(x^*, y^*(\omega), \omega)$. Consequently, one has
\[
\int_\Omega \max_{(v_x, v_y) \in Q(\omega)} \big( \langle v_x, h_x \rangle + \langle v_y, h_y(\omega) \rangle \big) \, dP(\omega) \ge 0 \tag{21}
\]
for all $(h_x, h_y) \in (A - x^*) \times L^p(\Omega, \mathfrak{A}, P; \mathbb{R}^m)$, where
\[
Q(\omega) = \underline{\partial} f(x^*, y^*(\omega), \omega) + (w_x(\omega), w_y(\omega)) + c^* \operatorname{co}\Big\{ \underline{\partial} g_i(x^*, y^*(\omega), \omega) + (w_{xi}(\omega), w_{yi}(\omega)) \Bigm| i \in \widehat{I}(\omega) \Big\}
\]
for any $\omega \in \Omega$.

Let us show that the multifunction $Q(\cdot)$ is measurable. Indeed, as was pointed out in the proof of Corollary 1, Assumption 1 guarantees that the set-valued mappings $\underline{\partial} f(x^*, y^*(\cdot), \cdot)$ and $\underline{\partial} g_i(x^*, y^*(\cdot), \cdot)$, $i \in I \cup \{0\}$, are measurable. Hence with the use of [35, Prp. 14.11, part (c)] one gets that the set-valued maps $\underline{\partial} f(x^*, y^*(\cdot), \cdot) + (w_x(\cdot), w_y(\cdot))$ and $\underline{\partial} g_i(x^*, y^*(\cdot), \cdot) + (w_{xi}(\cdot), w_{yi}(\cdot))$, $i \in I \cup \{0\}$, are measurable as well.

Arguing in the same way as in the proof of Theorem 3 one can easily check that the multifunction $\widehat{I}(\cdot)$ is measurable, which implies that the set-valued maps
\[
Q_i(\omega) :=
\begin{cases}
\underline{\partial} g_i(x^*, y^*(\omega), \omega) + (w_{xi}(\omega), w_{yi}(\omega)), & \text{if } i \in \widehat{I}(\omega), \\
\emptyset, & \text{if } i \notin \widehat{I}(\omega)
\end{cases}
\]
are measurable for all $i \in I \cup \{0\}$. Therefore by [35, Prp. 14.11, part (b)] and [1, Thrm. 8.2.2] the set-valued map
\[
\operatorname{co}\Big( \bigcup_{i \in I \cup \{0\}} Q_i(\cdot) \Big) = \operatorname{co}\Big\{ \underline{\partial} g_i(x^*, y^*(\cdot), \cdot) + \{ (w_{xi}(\cdot), w_{yi}(\cdot)) \} \Bigm| i \in \widehat{I}(\cdot) \Big\}
\]

24s measurable. Hence applying [35, Prp. 14.11, part (c)] one ﬁnally gets thatthe multifunction Q ( · ) is measurable.Now, arguing in the same way as in the proof of Lemma 2 (or utilising theinterchangeability principle; see, e.g. [35, Thrm. 14.60]) one gets that inequality(21) is satisﬁed iﬀmax ( v x ( ω ) ,v y ( · )) Z Ω (cid:0) h v x ( ω ) , h x i + h v y ( ω ) , h y ( ω ) i (cid:1) dP ( ω ) ≥ h x , h y ) ∈ ( A − x ∗ ) × L p (Ω , A , P ; R m ), where the maximum is taken overall measurable selections of the multifunction Q ( · ) (note that at least one suchselection exists by [1, Thrm. 8.1.3]). From the deﬁnition of Q ( · ) and the growthcondition on the codiﬀerentials of the functions f and g i from Assumption 1 itfollows that the set of all measurable selection of Q ( · ) is a bounded subspace ofthe space L (Ω , A , P ; R d ) × L p ′ (Ω , A , P ; R m ). Therefore inequality (22) can berewritten as follows:max ( v ,v ) ∈Q ( x ∗ ,y ∗ ) (cid:16) h v , h x i + Z Ω h v y ( ω ) , h y ( ω ) i dP ( ω ) (cid:17) ≥ h x , h y ) ∈ ( A − x ∗ ) × L p (Ω , A , P ; R m ), where Q ( x ∗ , y ∗ ) := n ( v , v ) ∈ R d × L p ′ (Ω , A , P ; R m ) (cid:12)(cid:12)(cid:12) v = E [ v x ] , v = v y , ( v x ( · ) , v y ( · )) is a measurable selection of the map Q ( · ) o . The set Q ( x ∗ , y ∗ ) is bounded due to the boundedness of the set of all measurableselections of Q ( · ). Furthermore, the set Q ( x ∗ , y ∗ ) is convex and closed, since bydeﬁnition Q ( · ) has closed and convex images. Therefore, Q ( x ∗ , y ∗ ) is a weaklycompact convex subset of R d × L p ′ (Ω , A , P ; R m ). Hence taking into accountinequality (23) and applying the separation theorem one can easily check that Q ( x ∗ , y ∗ ) ∩ (cid:16)(cid:8) − N A ( x ∗ ) } × { } (cid:17) = ∅ . 
Consequently, by the definitions of $\mathcal{Q}(x^*, y^*)$ and $Q(\cdot)$ there exists a function $\zeta \in L^1(\Omega, \mathfrak{A}, P; \mathbb{R}^d)$ such that $\mathbb{E}[\zeta] \in -N_A(x^*)$ and

$$
(\zeta(\omega), 0) \in \underline{\partial} f(x^*, y^*(\omega), \omega) + (w_x(\omega), w_y(\omega))
+ c^* \operatorname{co} \Big\{ \underline{\partial} g_i(x^*, y^*(\omega), \omega) + (w_{xi}(\omega), w_{yi}(\omega)) \Bigm| i \in \widehat{I}(\omega) \Big\} \tag{24}
$$

for a.e. $\omega \in \Omega$.

Let $E_J = \{ \omega \in \Omega \mid \widehat{I}(\omega) = J \}$ for any nonempty subset $J \subseteq I \cup \{0\}$. The sets $E_J$ form a partition of $\Omega$. Moreover, these sets are measurable, since the multifunction $\widehat{I}(\cdot)$ is measurable.

Observe that from (24) it follows that

$$
(\zeta(\omega), 0) \in \underline{\partial} f(x^*, y^*(\omega), \omega) + (w_x(\omega), w_y(\omega))
+ c^* \operatorname{co} \Big\{ \underline{\partial} g_i(x^*, y^*(\omega), \omega) + (w_{xi}(\omega), w_{yi}(\omega)) \Bigm| i \in J \Big\}
$$

for any $\omega \in E_J$ and any nonempty $J \subseteq I \cup \{0\}$. With the use of the Filippov theorem (see, e.g. [1, Thrm. 8.2.10]) one can readily check that the previous inclusion implies that for any nonempty $J \subseteq I \cup \{0\}$ there exist nonnegative measurable functions $\alpha_i^J(\cdot)$, $i \in J$, such that $\sum_{i \in J} \alpha_i^J(\omega) = 1$ and

$$
(\zeta(\omega), 0) \in \underline{\partial} f(x^*, y^*(\omega), \omega) + (w_x(\omega), w_y(\omega))
+ c^* \sum_{i \in J} \alpha_i^J(\omega) \Big( \underline{\partial} g_i(x^*, y^*(\omega), \omega) + (w_{xi}(\omega), w_{yi}(\omega)) \Big)
$$

for a.e. $\omega \in E_J$. For any $i \in I$ define

$$
\lambda_i(\omega) =
\begin{cases}
c^* \alpha_i^J(\omega), & \text{if } \omega \in E_J, \; i \in J \text{ (or, equivalently, } i \in \widehat{I}(\omega)\text{)}, \\
0, & \text{otherwise.}
\end{cases}
$$

Observe that by definition $\lambda_i$, $i \in I$, are nonnegative measurable functions such that $\sum_{i \in I} \| \lambda_i \|_\infty \le c^*$, and $\lambda_i(\omega) g_i(x^*, y^*(\omega), \omega) = 0$ for a.e. $\omega \in \Omega$, since $\lambda_i(\omega) = 0$ whenever $i \notin \widehat{I}(\omega)$, i.e. $g_i(x^*, y^*(\omega), \omega) < 0$. Furthermore, bearing in mind the fact that $w_{x0}(\cdot) \equiv w_{y0}(\cdot) \equiv 0$, and $\underline{\partial} g_0(x^*, y^*(\omega), \omega) \equiv \{0\}$ one gets that

$$
(\zeta(\omega), 0) \in \underline{\partial} f(x^*, y^*(\omega), \omega) + (w_x(\omega), w_y(\omega))
+ \sum_{i \in I} \lambda_i(\omega) \Big( \underline{\partial} g_i(x^*, y^*(\omega), \omega) + (w_{xi}(\omega), w_{yi}(\omega)) \Big)
$$

for a.e. $\omega \in \Omega$. Hence applying equality (2) we arrive at the required result.
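The construction of the multipliers $\lambda_i$ in the proof above can be illustrated on a finite scenario set. The following is only a minimal sketch under stated assumptions: the sample space is replaced by three hypothetical scenarios, the constraint values `g_values` are made up, and uniform weights stand in for the coefficients $\alpha_i^J$ whose existence the proof obtains from the Filippov theorem. The helper names `active_set` and `multipliers` are illustrative and do not come from the paper.

```python
def active_set(g_row, tol=1e-12):
    """Indices i in {0, 1, ..., l} with g_i = max{0, g_1, ..., g_l} (g_0 is identically 0)."""
    ext = [0.0] + list(g_row)  # position 0 holds the artificial constraint g_0 = 0
    top = max(ext)
    return [i for i, v in enumerate(ext) if abs(v - top) <= tol]


def multipliers(g_row, c_star, weights=None):
    """Return [lambda_1, ..., lambda_l] with lambda_i = c* * alpha_i on the active set, 0 elsewhere.

    `weights` maps active indices to convex-combination coefficients alpha_i.
    Uniform weights are used when none are given (an illustrative assumption only).
    """
    J = active_set(g_row)
    if weights is None:
        weights = {i: 1.0 / len(J) for i in J}
    # The weights must be nonnegative, supported on the active set, and sum to one.
    assert set(weights) <= set(J) and all(a >= 0 for a in weights.values())
    assert abs(sum(weights.values()) - 1.0) < 1e-12
    # Index 0 is dropped: it only absorbs part of the convex combination.
    return [c_star * weights.get(i, 0.0) for i in range(1, len(g_row) + 1)]


# Hypothetical data: three scenarios, two constraints, feasible point (all g_i <= 0).
g_values = [[0.0, -1.0], [-2.0, -3.0], [0.0, 0.0]]
c_star = 2.0
lambdas = [multipliers(row, c_star) for row in g_values]

for row, lam in zip(g_values, lambdas):
    assert all(l >= 0 for l in lam)                            # nonnegativity
    assert sum(lam) <= c_star + 1e-12                          # pointwise sum bounded by c*
    assert all(abs(l * g) < 1e-12 for l, g in zip(lam, row))   # complementary slackness
```

In this toy setting the checks mirror the properties used at the end of the proof: the multipliers are nonnegative, their scenario-wise sum does not exceed $c^*$, and $\lambda_i g_i = 0$ because a constraint with $g_i < 0$ never belongs to the active set.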

Remark. It should be noted that with the use of the codifferential calculus one can compute a codifferential of the function $f$ from the proof of the previous theorem, apply necessary conditions for a minimum of a codifferentiable function on a convex set [17, Thrm. 2.8] to the functional $J$, and then directly rewrite these conditions in terms of the problem $(\mathcal{P})$ with the use of Theorem 1 and an explicit expression for a codifferential of $f$. However, one can check that this approach leads to more cumbersome optimality conditions than the ones from the theorem above. It is possible to verify that these conditions are equivalent, but in the author's opinion the proof of this equivalence is more difficult than the proof of the previous theorem. That is why we chose to present a simpler, but somewhat indirect derivation of optimality conditions for the problem $(\mathcal{P})$.

Remark. Note that in the case when the functions $f$ and $g_i$, $i \in I$, are differentiable jointly in $x$ and $y$, the optimality conditions from Theorem 4 take the following well-known form (cf. [26, 36, 40, 45, 46]). There exist nonnegative multipliers $\lambda_i \in L^\infty(\Omega, \mathfrak{A}, P)$, $i \in I$, such that $\lambda_i(\omega) g_i(x^*, y^*(\omega), \omega) = 0$ for a.e. $\omega \in \Omega$ and all $i \in I$, and

$$
\Big\langle \mathbb{E} \Big[ \nabla_x f(x^*, y^*(\cdot), \cdot) + \sum_{i \in I} \lambda_i(\cdot) \nabla_x g_i(x^*, y^*(\cdot), \cdot) \Big], x - x^* \Big\rangle \ge 0 \quad \forall x \in A,
$$

$$
\nabla_y f(x^*, y^*(\omega), \omega) + \sum_{i \in I} \lambda_i(\omega) \nabla_y g_i(x^*, y^*(\omega), \omega) = 0 \quad \text{for a.e. } \omega \in \Omega.
$$

This work was devoted to an analysis of nonsmooth two-stage stochastic programming problems with the use of tools of constructive nonsmooth analysis [14].
In the first part of the paper, we analysed the co-/quasi-differentiability of the expectation of nonsmooth random integrands and obtained explicit formulae for its co-/quasi-differentials under some natural measurability and growth conditions on the integrand and its codifferential.

In the second part of the paper, we obtained two types of sufficient conditions for the global exactness of a penalty function for two-stage stochastic programming problems, reformulated as equivalent variational problems with pointwise constraints. The first type of sufficient conditions is formulated for the penalty term defined via the $L^p$ norm of the distance to the feasible set of the second-stage problem, while the second type of sufficient conditions is formulated for the penalty term that is independent of $p$ and is defined via the constraints of the second-stage problems. Although the second type of sufficient conditions is much more restrictive than the first one, it is more convenient for applications and derivation of optimality conditions. Furthermore, as is pointed out in Remark 6, these conditions open a way for the derivation of global optimality conditions and application of DC optimization methods to two-stage stochastic programming problems, whose second-stage problem has DC objective function and DC constraints.

Finally, in the last part of the paper we combined our results on codifferentiability of the expectation of nonsmooth random integrands and exact penalty functions to derive optimality conditions for nonsmooth two-stage stochastic programming problems in terms of codifferentials, involving essentially bounded Lagrange multipliers.

References

[1] J.-P. Aubin and H. Frankowska. Set-Valued Analysis. Birkhäuser, Boston, 1990.
[2] G. Barbarosoğlu and Y. Arda. A two-stage stochastic programming framework for transportation planning in disaster response. J. Oper. Res. Soc., 55:43–53, 2004.
[3] J. R. Birge and F. Louveaux. Introduction to Stochastic Programming. Springer, New York, 2011.
[4] V. I. Bogachev. Measure Theory. Volume I. Springer-Verlag, Berlin, Heidelberg, 2007.
[5] J. V. Burke. The subdifferential of measurable composite max integrands and smoothing approximation. Math. Program., 181:229–264, 2020.
[6] X. Chen and M. Fukushima. Expected residual minimization method for stochastic linear complementarity problems. Math. Oper. Res., 30:916–638, 2005.
[7] X. Chen, R. J.-B. Wets, and Y. Zhang. Stochastic variational inequalities: residual minimization smoothing sample average approximations. SIAM J. Optim., 22:649–673, 2012.
[8] S. Dempe, V. Kalashnikov, G. A. Pérez-Valdés, and N. Kalashnykova, editors. Bilevel Programming Problems. Theory, Algorithms and Applications to Energy Networks. Springer, Berlin, Heidelberg, 2015.
[9] S. Dempe and A. Zemkoho, editors. Bilevel Optimization. Advances and Next Challenges. Springer, Cham, 2020.
[10] V. F. Demyanov. Conditions for an extremum in metric spaces. J. Glob. Optim., 17:55–63, 2000.
[11] V. F. Demyanov. Nonsmooth optimization. In G. Di Pillo and F. Schoen, editors, Nonlinear optimization. Lecture notes in mathematics, vol. 1989, pages 55–163. Springer-Verlag, Berlin, 2010.
[12] V. F. Demyanov and L. C. W. Dixon, editors. Quasidifferential Calculus. Springer, Berlin, Heidelberg, 1986.
[13] V. F. Dem'yanov and V. N. Malozemov. Introduction to Minimax. Dover Publications, New York, 2014.
[14] V. F. Demyanov and A. M. Rubinov. Constructive Nonsmooth Analysis. Peter Lang, Frankfurt am Main, 1995.
[15] V. F. Demyanov and A. M. Rubinov, editors. Quasidifferentiability and Related Topics. Kluwer Academic Publishers, Dordrecht, 2000.
[16] M. V. Dolgopolik. Codifferential calculus in normed spaces. J. Math. Sci., 173:441–462, 2011.
[17] M. V. Dolgopolik. Nonsmooth problems of calculus of variations via codifferentiation. ESAIM: Control Optim. Calc. Var., 20:1153–1180, 2014.
[18] M. V. Dolgopolik. Abstract convex approximations of nonsmooth functions. Optim., 64:1439–1469, 2015.
[19] M. V. Dolgopolik. A unifying theory of exactness of linear penalty functions. Optim., 65:1167–1202, 2016.
[20] M. V. Dolgopolik. A convergence analysis of the method of codifferential descent. Comput. Optim. Appl., 71:879–913, 2018.
[21] M. V. Dolgopolik. Constrained nonsmooth problems of the calculus of variations and nonsmooth Noether equations. arXiv: 2004.14061, pages 1–44, 2020.
[22] M. V. Dolgopolik and A. Fominyh. Exact penalty functions for optimal control problems I: main theorem and free-endpoint problems. Optim. Control Appl. Meth., 40:1018–1044, 2019.
[23] C. I. Fábián and Z. Szőke. Solving two-stage stochastic programming problems with level decomposition. Comput. Manag. Sci., 4:313–353, 2007.
[24] S. D. Flåm and J. Zowe. Exact penalty functions in single-stage stochastic programming. Optim., 21:723–734, 1990.
[25] E. Grass and K. Fischer. Two-stage stochastic programming in disaster management: a literature survey. Surv. Oper. Res. Manag. Sci., 21:85–100, 2016.
[26] J. B. Hiriart-Urruty. Conditions nécessaires d'optimalité pour un programme stochastique avec recours. SIAM J. Control Optim., 16:317–329, 1978.
[27] G. H. Huang and D. P. Loucks. An inexact two-stage stochastic programming model for water resources management under uncertainty. Civ. Eng. Environ. Syst., 17:95–118, 2000.
[28] H. Leövey and W. Römisch. Quasi-Monte Carlo methods for linear two-stage stochastic programming problems. Math. Program., 151:315–345, 2015.
[29] S. Lin, M. Huang, Z. Xia, and D. Li. Quasidifferentiabilities of the expectation functions of random quasidifferentiable functions. Optim., pages 1–16, 2020. DOI: 10.1080/02331934.2020.1818235.
[30] C. Liu, Y. Fan, and F. Ordóñez. A two-stage stochastic programming model for transportation network protection. Comput. Oper. Res., 36:1582–1590, 2009.
[31] A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro. Robust stochastic approximation approach to stochastic programming. SIAM J. Optim., 19:1574–1609, 2009.
[32] W. Oliveira, C. Sagastizábal, and S. Scheimberg. Inexact bundle methods for two-stage stochastic programming. SIAM J. Optim., 21:517–544, 2011.
[33] A. V. Orlov. On a solving bilevel D.C.-convex optimization problems. In Y. Kochetov, I. Bukadorov, and T. Gruzdeva, editors, Mathematical Optimization Theory and Operations Research. MOTOR 2020, pages 179–191. Springer, Cham, 2020.
[34] G. Di Pillo and L. Grippo. Exact penalty functions in constrained optimization. SIAM J. Control Optim., 27:1333–1360, 1989.
[35] R. T. Rockafellar and R. J.-B. Wets. Variational Analysis. Springer-Verlag, Berlin, Heidelberg, 1998.
[36] R. T. Rockafellar and R. J.-B. Wets. Stochastic convex programming: Kuhn-Tucker conditions. J. Math. Econ., 2:349–370, 1975.
[37] R. T. Rockafellar and R. J.-B. Wets. On the interchange of subdifferentiation and conditional expectation for convex functions. Stochastics, 7:173–182, 1982.
[38] A. Rubinov and X. Yang. Lagrange-Type Functions in Constrained Non-Convex Optimization. Kluwer Academic Publishers, Boston, 2003.
[39] A. Shapiro and T. H. de Mello. A simulation-based approach to two-stage stochastic programming with recourse. Math. Program., 81:301–325, 1998.
[40] A. Shapiro, D. Dentcheva, and A. Ruszczyński. Lectures on Stochastic Programming: Modeling and Theory. SIAM, Philadelphia, 2014.
[41] A. S. Strekalovsky. Global optimality conditions and exact penalization. Optim. Lett., 13:597–615, 2019.
[42] A. S. Strekalovsky and A. V. Orlov. Global search for bilevel optimization with quadratic data. In S. Dempe and A. Zemkoho, editors, Bilevel Optimization, pages 313–334. Springer, Cham, 2020.
[43] A. Uderzo. On the variational behaviour of functions with positive steepest descent rate. Positivity, 19:725–745, 2015.
[44] A. Uderzo. A strong metric subregularity analysis of nonsmooth mappings via steepest displacement rate. J. Optim. Theory Appl., 171:573–599, 2016.
[45] S. Vogel. Necessary optimality conditions for two-stage stochastic programming problems. Optim., 16:607–616, 1985.
[46] H. Xu and J. J. Ye. Necessary optimality conditions for two-stage stochastic programs with equilibrium constraints. SIAM J. Optim., 20:1685–1715, 2010.
[47] H. Xu and D. Zhang. Smooth sample average approximation of stationary points in nonsmooth stochastic optimization and applications. Math. Program., 119:371–401, 2009.
[48] A. J. Zaslavski. Optimization on Metric and Normed Spaces. Springer, New York, 2010.
[49] Z. Zhou, J. Zhang, P. Liu, Z. Li, M. C. Georgiadis, and E. N. Pistikopoulos. A two-stage stochastic programming model for the optimal design of distributed energy systems.