A general stochastic maximum principle for optimal control problems of forward-backward systems
Seid BAHLALI ∗ October 30, 2018
Abstract
The stochastic maximum principle for nonlinear controlled forward-backward systems, where the set of strict (classical) controls need not be convex and the diffusion coefficient depends explicitly on the control variable, is an open problem that cannot be solved by the classical method of spike variation. In this paper, we introduce a new approach to this open problem and establish necessary as well as sufficient conditions of optimality, in the form of a global stochastic maximum principle, for two models. The first concerns relaxed controls, which are measure-valued processes. The second is a restriction of the first to strict control problems.
AMS Subject Classification: 93Exx.
Keywords: forward-backward stochastic differential equations, stochastic maximum principle, strict control, relaxed control, adjoint equations, variational inequality.
1 Introduction

We study a stochastic control problem in which the system is governed by a nonlinear forward-backward stochastic differential equation (FBSDE for short) of the type
\[
\begin{cases}
dx_t^v = b(t,x_t^v,v_t)\,dt + \sigma(t,x_t^v,v_t)\,dW_t,\\
x_0^v = x,\\
dy_t^v = -f(t,x_t^v,y_t^v,z_t^v,v_t)\,dt + z_t^v\,dW_t,\\
y_T^v = \varphi(x_T^v),
\end{cases}
\]
where $b$, $\sigma$, $f$ and $\varphi$ are given maps and $W=(W_t)_{t\ge0}$ is a standard Brownian motion, defined on a filtered probability space $\big(\Omega,\mathcal F,(\mathcal F_t)_{t\ge0},P\big)$ satisfying the usual conditions.

(∗ Laboratory of Applied Mathematics, University Med Khider, PO Box 145, Biskra 07000, Algeria. [email protected])

The process $v=(v_t)$, called a strict (classical) control, is an $\mathcal F_t$-adapted process with values in some set $U$ of $\mathbb R^k$. We denote by $\mathcal U$ the class of all strict controls.

The criterion to be minimized over the set $\mathcal U$ has the form
\[
J(v) = E\Big[g(x_T^v) + h(y_0^v) + \int_0^T l(t,x_t^v,y_t^v,z_t^v,v_t)\,dt\Big],
\]
where $g$, $h$ and $l$ are given functions and $(x_t^v,y_t^v,z_t^v)$ is the trajectory of the system controlled by $v$. A control $u\in\mathcal U$ is called optimal if it satisfies
\[
J(u) = \inf_{v\in\mathcal U} J(v).
\]

The objective of this kind of stochastic control problem is to obtain optimality conditions on the controls in the form of a Pontryagin stochastic maximum principle. There are many works on the subject, including Peng [42], Xu [47], Wu [46], Shi and Wu [44], Ji and Zhou [30], Bahlali and Labed [5] and Bahlali [8]. All previous results on the stochastic maximum principle for forward-backward systems are established in the case where the control domain is convex or the diffusion coefficient is uncontrolled. The general case, where the set of strict controls need not be convex and the diffusion coefficient depends explicitly on the control variable, is an open problem, unsolved until now.
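The forward component of the controlled system above can be simulated with a standard Euler-Maruyama scheme. The following is a minimal sketch; the coefficients $b$ and $\sigma$, the control set $U=\{-1,1\}$ and the feedback rule are hypothetical choices for illustration only, not taken from the paper.

```python
import numpy as np

# Illustrative Euler-Maruyama discretization of the forward equation
# dx_t = b(t, x_t, v_t) dt + sigma(t, x_t, v_t) dW_t,  x_0 = x.
# All coefficients below are hypothetical examples.

rng = np.random.default_rng(0)

def b(t, x, v):       # hypothetical drift
    return -x + v

def sigma(t, x, v):   # hypothetical controlled diffusion (depends on v)
    return 0.5 * (1.0 + abs(v))

def control(t, x):    # a hypothetical bang-bang feedback with values in U = {-1, 1}
    return 1.0 if x < 0 else -1.0

T, N, x0 = 1.0, 200, 0.3
dt = T / N
x = x0
for k in range(N):
    t = k * dt
    v = control(t, x)
    dW = rng.normal(0.0, np.sqrt(dt))
    x = x + b(t, x, v) * dt + sigma(t, x, v) * dW

print(x)  # terminal state x_T of one simulated trajectory
```

The backward pair $(y,z)$ requires a genuine BSDE solver (e.g. a regression-based scheme) and is not sketched here.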
There is no result in the literature concerning this problem, and the classical way, which consists in using the spike-variation method on the strict controls, does not lead to any result. The approach developed by Peng [41] to solve the similar case of controlled stochastic differential equations (SDEs) cannot be applied to controlled FBSDEs. Indeed, since the control domain is not necessarily convex and the diffusion $\sigma$ depends on the control variable, the classical way of treating such a problem would be to use the spike-variation method on the strict controls and to introduce the second-order variational equation. But the FBSDE system depends on three variables ($x$, $y$ and $z$), and the second-order expansion leads to a nonlinear problem; it is then impossible to deduce the second-order variational inequality.

In this paper, we solve this open problem by using the new approach developed by Bahlali [7]. We introduce a bigger class $\mathcal R$ of processes by replacing the $U$-valued process $(v_t)$ by a $\mathbb P(U)$-valued process $(q_t)$, where $\mathbb P(U)$ is the space of probability measures on $U$ equipped with the topology of stable convergence. The elements of this new class are called relaxed controls; they have a richer convexity structure, for which the control problem becomes solvable. The main idea is to use the convexity of the set of relaxed controls and to treat the problem with the method of convex perturbation on relaxed controls (instead of the spike variation on strict ones). We then establish necessary and sufficient optimality conditions for relaxed controls, and we derive the optimality conditions for strict controls directly from those for relaxed ones.

In the relaxed model, the system is governed by the FBSDE
\[
\begin{cases}
dx_t^q = \int_U b(t,x_t^q,a)\,q_t(da)\,dt + \int_U \sigma(t,x_t^q,a)\,q_t(da)\,dW_t,\\
x_0^q = x,\\
dy_t^q = -\int_U f(t,x_t^q,y_t^q,z_t^q,a)\,q_t(da)\,dt + z_t^q\,dW_t,\\
y_T^q = \varphi(x_T^q).
\end{cases}
\]
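On a finite control set, the relaxed coefficients appearing in this system are simply averages of the strict coefficients against the measure $q_t$, and a Dirac mass at a point recovers the strict coefficient. A minimal numerical sketch (the set $U$ and the drift $b$ below are hypothetical illustrations):

```python
import numpy as np

# Sketch of the relaxed drift on a finite control set: the relaxed
# coefficient is the average of b(t, x, .) against q_t, and a Dirac mass
# at v recovers the strict coefficient b(t, x, v). U and b are hypothetical.

U = np.array([-1.0, 0.0, 1.0])

def b(t, x, a):                        # hypothetical drift coefficient
    return -x + np.sin(a) + a

t, x = 0.5, 0.2
q = np.array([0.25, 0.25, 0.5])        # a relaxed control value q_t in P(U)
relaxed_drift = float(np.dot(b(t, x, U), q))   # integral of b(t,x,.) dq_t

delta = (U == 1.0).astype(float)       # Dirac mass at v_t = 1 as a weight vector
strict_drift = float(np.dot(b(t, x, U), delta))
print(relaxed_drift, strict_drift)     # the second value equals b(t, x, 1)
```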
The cost functional to be minimized over the class $\mathcal R$ of relaxed controls is defined by
\[
J(q) = E\Big[g(x_T^q) + h(y_0^q) + \int_0^T\int_U l(t,x_t^q,y_t^q,z_t^q,a)\,q_t(da)\,dt\Big].
\]
A relaxed control $\mu$ is called optimal if it solves
\[
J(\mu) = \inf_{q\in\mathcal R} J(q).
\]

The relaxed control problem is a generalization of the strict control problem. Indeed, if $q_t(da)=\delta_{v_t}(da)$ is a Dirac measure concentrated at a single point $v_t\in U$, then we recover a strict control problem as a particular case of the relaxed one.

To achieve the objective of this paper and establish necessary and sufficient optimality conditions for these two models, we proceed as follows.

Firstly, we give the optimality conditions for relaxed controls. The idea is to use the fact that the set of relaxed controls is convex; we then establish necessary optimality conditions by the classical convex-perturbation method. More precisely, if $\mu$ denotes an optimal relaxed control and $q$ is an arbitrary element of $\mathcal R$, then, for a sufficiently small $\theta>0$ and each $t\in[0,T]$, we can define a perturbed control by
\[
\mu_t^\theta = \mu_t + \theta\,(q_t-\mu_t).
\]
We derive the variational equation from the state equation, and the variational inequality from the inequality
\[
0 \le J(\mu^\theta) - J(\mu).
\]
Since the coefficients $b$, $\sigma$, $f$ and $l$ are linear with respect to the relaxed control variable, the necessary optimality conditions are obtained directly in global form.

To conclude this part of the paper, we prove, under a minimal additional hypothesis, that these necessary optimality conditions for relaxed controls are also sufficient.

The second main result of the paper characterizes the optimality of strict control processes. It is derived directly from the above result by restricting from relaxed to strict controls. The idea is to replace the relaxed controls by Dirac measures charging strict controls.
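The convex perturbation $\mu^\theta=\mu+\theta(q-\mu)$ used above stays inside the class of relaxed controls, since a convex mixture of probability measures is again a probability measure, and integrals against it are affine in $\theta$. A small sketch on a finite control set (the set $U$, the weight vectors and the coefficient $b$ are hypothetical):

```python
import numpy as np

# Sketch: on a finite control set U, a relaxed control at a fixed time is a
# probability vector over U. The convex perturbation mu + theta*(q - mu) is
# again a probability vector, and integration against it is affine in theta.
# U, mu, q and b are hypothetical illustrations.

U = np.array([-1.0, 0.0, 1.0])
mu = np.array([0.2, 0.5, 0.3])    # candidate optimal relaxed control
q  = np.array([0.7, 0.1, 0.2])    # arbitrary relaxed control
theta = 0.05

mu_theta = mu + theta * (q - mu)
assert np.all(mu_theta >= 0) and np.isclose(mu_theta.sum(), 1.0)

b = lambda a: a**2 - a            # hypothetical coefficient a -> b(t, x, a)
integral = lambda m: float(np.dot(b(U), m))

# Linearity in the measure: the perturbed integral is the convex mix.
lhs = integral(mu_theta)
rhs = (1 - theta) * integral(mu) + theta * integral(q)
print(lhs, rhs)  # the two values agree
```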
Thus, we reduce the set of relaxed controls and minimize the cost $J$ over the subset $\delta(U)=\{q\in\mathcal R \,/\, q=\delta_v;\ v\in\mathcal U\}$. Necessary optimality conditions for strict controls are then obtained directly from those for relaxed ones. Finally, we prove that these necessary conditions become sufficient, without imposing either the convexity of $U$ or that of the Hamiltonian $H$ in $v$.

This paper can also be regarded as an extension of Bahlali [7] to forward-backward systems. Indeed, if we consider only the forward equation, without the backward one ($y=z=f=h=0$), we recover exactly all the results of [7].

The paper is organized as follows. In Section 2, we formulate the strict and relaxed control problems and give the various assumptions used throughout the paper. Section 3 is devoted to the study of the relaxed control problem; there we establish necessary as well as sufficient conditions of optimality for relaxed controls. In the last section, we derive the optimality conditions for strict controls directly from the results of Section 3.

Throughout this paper, we denote by $C$ some positive constant, by $\mathcal M_{n\times d}(\mathbb R)$ the space of $n\times d$ real matrices, and by $\mathcal M^d_{n\times n}(\mathbb R)$ the linear space of vectors $M=(M^1,\dots,M^d)$ with $M^i\in\mathcal M_{n\times n}(\mathbb R)$. We use the standard calculus of inner and matrix products.

2 Formulation of the strict and relaxed control problems

Let $\big(\Omega,\mathcal F,(\mathcal F_t)_{t\ge0},P\big)$ be a filtered probability space satisfying the usual conditions, on which a $d$-dimensional Brownian motion $W=(W_t)_{t\ge0}$ is defined. We assume that $(\mathcal F_t)$ is the $P$-augmentation of the natural filtration of $W$. Let $T$ be a strictly positive real number and $U$ a non-empty subset of $\mathbb R^k$.

Definition 1
An admissible strict control is an $\mathcal F_t$-adapted process $v=(v_t)$ with values in $U$ such that
\[
E\Big[\sup_{t\in[0,T]}|v_t|^2\Big] < \infty.
\]
We denote by $\mathcal U$ the set of all admissible strict controls.

For any $v\in\mathcal U$, we consider the following controlled FBSDE
\[
\begin{cases}
dx_t^v = b(t,x_t^v,v_t)\,dt + \sigma(t,x_t^v,v_t)\,dW_t,\\
x_0^v = x,\\
dy_t^v = -f(t,x_t^v,y_t^v,z_t^v,v_t)\,dt + z_t^v\,dW_t,\\
y_T^v = \varphi(x_T^v),
\end{cases}
\qquad (1)
\]
where
\[
\begin{aligned}
b &: [0,T]\times\mathbb R^n\times U \longrightarrow \mathbb R^n,\\
\sigma &: [0,T]\times\mathbb R^n\times U \longrightarrow \mathcal M_{n\times d}(\mathbb R),\\
f &: [0,T]\times\mathbb R^n\times\mathbb R^m\times\mathcal M_{m\times d}(\mathbb R)\times U \longrightarrow \mathbb R^m,\\
\varphi &: \mathbb R^n \longrightarrow \mathbb R^m,
\end{aligned}
\]
and $x$ is an $n$-dimensional $\mathcal F_0$-measurable random variable such that $E|x|^2<\infty$.

The criterion to be minimized is defined from $\mathcal U$ into $\mathbb R$ by
\[
J(v) = E\Big[g(x_T^v) + h(y_0^v) + \int_0^T l(t,x_t^v,y_t^v,z_t^v,v_t)\,dt\Big], \qquad (2)
\]
where
\[
g : \mathbb R^n \longrightarrow \mathbb R,\qquad
h : \mathbb R^m \longrightarrow \mathbb R,\qquad
l : [0,T]\times\mathbb R^n\times\mathbb R^m\times\mathcal M_{m\times d}(\mathbb R)\times U \longrightarrow \mathbb R.
\]
A strict control $u$ is called optimal if it satisfies
\[
J(u) = \inf_{v\in\mathcal U} J(v). \qquad (3)
\]

We assume that:

(4) $b$, $\sigma$, $f$, $g$, $h$, $l$ and $\varphi$ are continuously differentiable with respect to $(x,y,z)$; they are bounded by $C(1+|x|+|y|+|z|+|v|)$; and their derivatives with respect to $(x,y,z)$ are continuous in $(x,y,z,v)$ and uniformly bounded.

Under the above hypothesis, for every $v\in\mathcal U$, equation (1) has a unique strong solution and the cost functional $J$ is well defined from $\mathcal U$ into $\mathbb R$.

The idea of relaxing the strict control problem defined above is to embed the set $\mathcal U$ of strict controls into a wider class with a more suitable topological structure. In the relaxed model, the $U$-valued process $v$ is replaced by a $\mathbb P(U)$-valued process $q$, where $\mathbb P(U)$ denotes the space of probability measures on $U$ equipped with the topology of stable convergence.

Definition 2
A relaxed control $(q_t)_t$ is a $\mathbb P(U)$-valued process, progressively measurable with respect to $(\mathcal F_t)_t$ and such that, for each $t$, $\mathbf 1_{]0,t]}\cdot q$ is $\mathcal F_t$-measurable. We denote by $\mathcal R$ the set of all relaxed controls.

Remark 3  Every relaxed control $q$ may be disintegrated as $q(dt,da)=q(t,da)\,dt=q_t(da)\,dt$, where $q_t(da)$ is a progressively measurable process with values in the set of probability measures $\mathbb P(U)$. The set $\mathcal U$ is embedded into the set $\mathcal R$ of relaxed processes by the mapping $v\in\mathcal U \mapsto \delta_{v_t}(da)\,dt \in\mathcal R$, where $\delta_v$ is the atomic measure concentrated at the single point $v$. For more details on relaxed controls, see [4], [6], [7], [16], [21], [34], [37], [38].

For any $q\in\mathcal R$, we consider the following relaxed FBSDE
\[
\begin{cases}
dx_t^q = \int_U b(t,x_t^q,a)\,q_t(da)\,dt + \int_U \sigma(t,x_t^q,a)\,q_t(da)\,dW_t,\\
x_0^q = x,\\
dy_t^q = -\int_U f(t,x_t^q,y_t^q,z_t^q,a)\,q_t(da)\,dt + z_t^q\,dW_t,\\
y_T^q = \varphi(x_T^q).
\end{cases}
\qquad (5)
\]
The expected cost to be minimized, in the relaxed model, is defined from $\mathcal R$ into $\mathbb R$ by
\[
J(q) = E\Big[g(x_T^q) + h(y_0^q) + \int_0^T\int_U l(t,x_t^q,y_t^q,z_t^q,a)\,q_t(da)\,dt\Big]. \qquad (6)
\]
A relaxed control $\mu$ is called optimal if it solves
\[
J(\mu) = \inf_{q\in\mathcal R} J(q). \qquad (7)
\]

Remark 4
If we put
\[
\begin{aligned}
\bar b(t,x_t^q,q_t) &= \int_U b(t,x_t^q,a)\,q_t(da),\\
\bar\sigma(t,x_t^q,q_t) &= \int_U \sigma(t,x_t^q,a)\,q_t(da),\\
\bar f(t,x_t^q,y_t^q,z_t^q,q_t) &= \int_U f(t,x_t^q,y_t^q,z_t^q,a)\,q_t(da),\\
\bar l(t,x_t^q,y_t^q,z_t^q,q_t) &= \int_U l(t,x_t^q,y_t^q,z_t^q,a)\,q_t(da),
\end{aligned}
\]
then equation (5) becomes
\[
\begin{cases}
dx_t^q = \bar b(t,x_t^q,q_t)\,dt + \bar\sigma(t,x_t^q,q_t)\,dW_t,\\
x_0^q = x,\\
dy_t^q = -\bar f(t,x_t^q,y_t^q,z_t^q,q_t)\,dt + z_t^q\,dW_t,\\
y_T^q = \varphi(x_T^q),
\end{cases}
\]
with the cost functional given by
\[
J(q) = E\Big[g(x_T^q) + h(y_0^q) + \int_0^T \bar l(t,x_t^q,y_t^q,z_t^q,q_t)\,dt\Big].
\]
Hence, by introducing relaxed controls, we have replaced $U$ by the larger space $\mathbb P(U)$. We have gained the advantage that $\mathbb P(U)$ is convex. Furthermore, the new coefficients of equation (5) and the running cost are linear with respect to the relaxed control variable.

Remark 5
The coefficients $\bar b$, $\bar\sigma$ and $\bar f$ (defined in the above remark) satisfy the same assumptions as $b$, $\sigma$ and $f$ respectively. Hence, under assumptions (4), $\bar b$, $\bar\sigma$ and $\bar f$ are uniformly Lipschitz and of linear growth; by classical results on FBSDEs, for every $q\in\mathcal R$, equation (5) then has a unique strong solution. On the other hand, it is easy to see that $\bar l$ satisfies the same assumptions as $l$; the cost functional $J$ is therefore well defined from $\mathcal R$ into $\mathbb R$.

Remark 6  If $q_t=\delta_{v_t}$ is an atomic measure concentrated at a single point $v_t\in U$, then for each $t\in[0,T]$ we have
\[
\begin{aligned}
\int_U b(t,x_t^q,a)\,\delta_{v_t}(da) &= b(t,x_t^q,v_t),\\
\int_U \sigma(t,x_t^q,a)\,\delta_{v_t}(da) &= \sigma(t,x_t^q,v_t),\\
\int_U f(t,x_t^q,y_t^q,z_t^q,a)\,\delta_{v_t}(da) &= f(t,x_t^q,y_t^q,z_t^q,v_t),\\
\int_U l(t,x_t^q,y_t^q,z_t^q,a)\,\delta_{v_t}(da) &= l(t,x_t^q,y_t^q,z_t^q,v_t).
\end{aligned}
\]
In this case $(x^q,y^q,z^q)=(x^v,y^v,z^v)$ and $J(q)=J(v)$, and we get a strict control problem. So the strict control problem $\{(1),(2),(3)\}$ is a particular case of the relaxed control problem $\{(5),(6),(7)\}$.

Remark 7
The relaxed control problems studied in El Karoui et al. [16] and Bahlali, Mezerdi and Djehiche [4] differ from ours: they relax the infinitesimal generator of the state process, which leads to a martingale problem in which the state process is driven by an orthogonal martingale measure. In our setting, the driving martingale measure $q_t(da)\,dW_t$ is however not orthogonal. See Ma and Yong [34] for more details.

3 Optimality conditions for relaxed controls

In this section, we study the problem $\{(5),(6),(7)\}$ and establish necessary as well as sufficient conditions of optimality for relaxed controls.

3.1 Preliminary results

Since the set $\mathcal R$ is convex, the classical way to derive necessary optimality conditions for relaxed controls is the convex-perturbation method. More precisely, let $\mu$ be an optimal relaxed control and $(x^\mu,y^\mu,z^\mu)$ the solution of (5) controlled by $\mu$. Then, for each $t\in[0,T]$, we can define a perturbed relaxed control by
\[
\mu_t^\theta = \mu_t + \theta\,(q_t-\mu_t),
\]
where $\theta>0$ is sufficiently small and $q$ is an arbitrary element of $\mathcal R$. Denote by $(x^\theta,y^\theta,z^\theta)$ the solution of (5) associated with $\mu^\theta$. From the optimality of $\mu$, the variational inequality will be derived from the fact that
\[
0 \le J(\mu^\theta) - J(\mu).
\]
To this end, we need the following classical lemmas.
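The proofs of these lemmas combine an integral inequality with Gronwall's lemma. The backward Gronwall step, namely that $a(t)\le K\int_t^T a(s)\,ds+\alpha$ forces $a(t)\le \alpha e^{K(T-t)}$ (so $\alpha\to0$ forces $a\to0$), can be checked numerically as follows; the constants $K$, $\alpha$ and the grid are hypothetical.

```python
import numpy as np

# Numerical sketch of the backward Gronwall step: a(t) = alpha*exp(K*(T-t))
# satisfies a(t) = K * integral_t^T a(s) ds + alpha with equality, and the
# Gronwall bound a(t) <= alpha*exp(K*(T-t)) holds on the grid.
# K, alpha, T and N are hypothetical choices.

K, alpha, T, N = 3.0, 0.01, 1.0, 1000
ts = np.linspace(0.0, T, N + 1)
dt = T / N
a = alpha * np.exp(K * (T - ts))

# Trapezoid tail integrals: tail_int[i] approximates integral_{t_i}^T a(s) ds.
seg = 0.5 * (a[:-1] + a[1:]) * dt
tail_int = np.append(seg[::-1].cumsum()[::-1], 0.0)

assert np.allclose(a, K * tail_int + alpha, atol=1e-3)    # integral relation
assert np.all(a <= alpha * np.exp(K * (T - ts)) + 1e-9)   # Gronwall bound
print(a[0])  # equals alpha * exp(K*T)
```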
Lemma 8
Under assumptions (4), we have
\[
\lim_{\theta\to0}\Big[\sup_{t\in[0,T]} E\big|x_t^\theta-x_t^\mu\big|^2\Big] = 0, \qquad (8)
\]
\[
\lim_{\theta\to0}\Big[\sup_{t\in[0,T]} E\big|y_t^\theta-y_t^\mu\big|^2\Big] = 0, \qquad (9)
\]
\[
\lim_{\theta\to0} E\int_0^T \big|z_t^\theta-z_t^\mu\big|^2\,dt = 0. \qquad (10)
\]

Proof. (8) is proved in [7, Lemma 9, page 2085]. Let us prove (9) and (10). Applying Itô's formula to $\big(y_t^\theta-y_t^\mu\big)^2$, we have
\[
\begin{aligned}
E\big|y_t^\theta-y_t^\mu\big|^2 + E\int_t^T\big|z_s^\theta-z_s^\mu\big|^2ds
={}& E\big|\varphi(x_T^\theta)-\varphi(x_T^\mu)\big|^2\\
&+ 2\,E\int_t^T\Big|\big(y_s^\theta-y_s^\mu\big)\Big[\int_U f\big(s,x_s^\theta,y_s^\theta,z_s^\theta,a\big)\,\mu_s^\theta(da)-\int_U f\big(s,x_s^\mu,y_s^\mu,z_s^\mu,a\big)\,\mu_s(da)\Big]\Big|\,ds.
\end{aligned}
\]
We apply the Young inequality $2|ab|\le\frac1\varepsilon a^2+\varepsilon b^2$, $\varepsilon>0$, to the last term and split the difference of the two integrals into four pieces, changing one argument at a time; by the definition of $\mu^\theta$, the piece in which the measure changes equals
\[
\theta\Big[\int_U f\big(s,x_s^\theta,y_s^\theta,z_s^\theta,a\big)\,q_s(da)-\int_U f\big(s,x_s^\theta,y_s^\theta,z_s^\theta,a\big)\,\mu_s(da)\Big].
\]
Since $\varphi$ and $f$ are uniformly Lipschitz with respect to $(x,y,z)$, this yields
\[
E\big|y_t^\theta-y_t^\mu\big|^2 + E\int_t^T\big|z_s^\theta-z_s^\mu\big|^2ds
\le \Big(\frac1\varepsilon+C\varepsilon\Big)E\int_t^T\big|y_s^\theta-y_s^\mu\big|^2ds
+ C\varepsilon\,E\int_t^T\big|z_s^\theta-z_s^\mu\big|^2ds + \alpha_t^\theta, \qquad (11)
\]
where $\alpha_t^\theta$ is given by
\[
\alpha_t^\theta = C\,E\big|x_T^\theta-x_T^\mu\big|^2 + C\varepsilon\,E\int_t^T\big|x_s^\theta-x_s^\mu\big|^2ds + C\varepsilon\,\theta^2.
\]
By (8), we have
\[
\lim_{\theta\to0}\alpha_t^\theta = 0. \qquad (12)
\]
Choosing $\varepsilon=\frac1{2C}$, (11) becomes
\[
E\big|y_t^\theta-y_t^\mu\big|^2 + \frac12\,E\int_t^T\big|z_s^\theta-z_s^\mu\big|^2ds
\le \Big(2C+\frac12\Big)E\int_t^T\big|y_s^\theta-y_s^\mu\big|^2ds + \alpha_t^\theta.
\]
From this we derive the two inequalities
\[
E\big|y_t^\theta-y_t^\mu\big|^2 \le \Big(2C+\frac12\Big)E\int_t^T\big|y_s^\theta-y_s^\mu\big|^2ds + \alpha_t^\theta, \qquad (13)
\]
\[
E\int_t^T\big|z_s^\theta-z_s^\mu\big|^2ds \le (4C+1)\,E\int_t^T\big|y_s^\theta-y_s^\mu\big|^2ds + 2\alpha_t^\theta. \qquad (14)
\]
By using (12), (13), Gronwall's lemma and the Burkholder-Davis-Gundy inequality, we obtain (9). Finally, (10) follows from (14), (9) and (12).

Lemma 9
Let $\tilde x_t$ and $(\tilde y_t,\tilde z_t)$ be, respectively, the solutions of the following linear equations (called variational equations):
\[
\begin{aligned}
d\tilde x_t ={}& \int_U b_x(t,x_t^\mu,a)\,\mu_t(da)\,\tilde x_t\,dt + \int_U \sigma_x(t,x_t^\mu,a)\,\mu_t(da)\,\tilde x_t\,dW_t\\
&+ \Big[\int_U b(t,x_t^\mu,a)\,q_t(da)-\int_U b(t,x_t^\mu,a)\,\mu_t(da)\Big]dt\\
&+ \Big[\int_U \sigma(t,x_t^\mu,a)\,q_t(da)-\int_U \sigma(t,x_t^\mu,a)\,\mu_t(da)\Big]dW_t,\\
\tilde x_0 ={}& 0, \qquad (15)
\end{aligned}
\]
\[
\begin{aligned}
d\tilde y_t ={}& -\int_U\big[f_x(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\tilde x_t + f_y(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\tilde y_t + f_z(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\tilde z_t\big]\,\mu_t(da)\,dt\\
&- \Big[\int_U f(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,q_t(da)-\int_U f(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\mu_t(da)\Big]dt + \tilde z_t\,dW_t,\\
\tilde y_T ={}& \varphi_x(x_T^\mu)\,\tilde x_T. \qquad (16)
\end{aligned}
\]
Then the following estimates hold:
\[
\lim_{\theta\to0} E\Big|\frac{x_t^\theta-x_t^\mu}{\theta}-\tilde x_t\Big|^2 = 0, \qquad (17)
\]
\[
\lim_{\theta\to0} E\Big|\frac{y_t^\theta-y_t^\mu}{\theta}-\tilde y_t\Big|^2 = 0, \qquad (18)
\]
\[
\lim_{\theta\to0} E\int_0^T\Big|\frac{z_t^\theta-z_t^\mu}{\theta}-\tilde z_t\Big|^2dt = 0. \qquad (19)
\]

Proof. For simplicity, we put
\[
X_t^\theta = \frac{x_t^\theta-x_t^\mu}{\theta}-\tilde x_t, \qquad (20)
\]
\[
Y_t^\theta = \frac{y_t^\theta-y_t^\mu}{\theta}-\tilde y_t, \qquad (21)
\]
\[
Z_t^\theta = \frac{z_t^\theta-z_t^\mu}{\theta}-\tilde z_t, \qquad (22)
\]
\[
\Lambda_t^\theta(\lambda,a) = \big(t,\ x_t^\mu+\lambda\theta(X_t^\theta+\tilde x_t),\ y_t^\mu+\lambda\theta(Y_t^\theta+\tilde y_t),\ z_t^\mu+\lambda\theta(Z_t^\theta+\tilde z_t),\ a\big).
\]
i) (17) is proved in [7, Lemma 10, page 2086].

ii) Proof of (18) and (19). By (21) and (22), the pair $(Y^\theta,Z^\theta)$ satisfies, in integrated form,
\[
Y_t^\theta = Y_T^\theta + \int_t^T\big(F_s^y\,Y_s^\theta + F_s^z\,Z_s^\theta\big)\,ds + \gamma_t^\theta - \int_t^T Z_s^\theta\,dW_s,
\qquad
Y_T^\theta = \frac{\varphi(x_T^\theta)-\varphi(x_T^\mu)}{\theta}-\varphi_x(x_T^\mu)\,\tilde x_T,
\]
where
\[
F_t^y = -\int_0^1\int_U f_y\big(\Lambda_t^\theta(\lambda,a)\big)\,\mu_t(da)\,d\lambda,
\qquad
F_t^z = -\int_0^1\int_U f_z\big(\Lambda_t^\theta(\lambda,a)\big)\,\mu_t(da)\,d\lambda,
\]
and the remainder $\gamma_t^\theta$ is given by
\[
\begin{aligned}
\gamma_t^\theta ={}& \int_t^T\!\int_0^1\!\int_U f_x\big(\Lambda_s^\theta(\lambda,a)\big)\,X_s^\theta\,\mu_s(da)\,d\lambda\,ds\\
&+ \int_t^T\!\int_0^1\!\int_U \big[f_x\big(\Lambda_s^\theta(\lambda,a)\big)(x_s^\theta-x_s^\mu)+f_y\big(\Lambda_s^\theta(\lambda,a)\big)(y_s^\theta-y_s^\mu)+f_z\big(\Lambda_s^\theta(\lambda,a)\big)(z_s^\theta-z_s^\mu)\big]\,q_s(da)\,d\lambda\,ds\\
&- \int_t^T\!\int_0^1\!\int_U \big[f_x\big(\Lambda_s^\theta(\lambda,a)\big)(x_s^\theta-x_s^\mu)+f_y\big(\Lambda_s^\theta(\lambda,a)\big)(y_s^\theta-y_s^\mu)+f_z\big(\Lambda_s^\theta(\lambda,a)\big)(z_s^\theta-z_s^\mu)\big]\,\mu_s(da)\,d\lambda\,ds.
\end{aligned}
\]
Since $f_x$, $f_y$ and $f_z$ are continuous and bounded, from (8), (9), (10) and (17) we have
\[
\lim_{\theta\to0} E\big|\gamma_t^\theta\big|^2 = 0. \qquad (23)
\]
Applying Itô's formula to $\big(Y_t^\theta\big)^2$, then the Young inequality as in the proof of Lemma 8, and using the boundedness of $F^y$ and $F^z$, we get, for every $\varepsilon>0$,
\[
E\big|Y_t^\theta\big|^2 + E\int_t^T\big|Z_s^\theta\big|^2ds
\le \Big(\frac1\varepsilon+C\varepsilon\Big)E\int_t^T\big|Y_s^\theta\big|^2ds + C\varepsilon\,E\int_t^T\big|Z_s^\theta\big|^2ds + \eta_t^\theta,
\]
where
\[
\eta_t^\theta = E\big|Y_T^\theta\big|^2 + C\varepsilon\,E\int_t^T\big|\gamma_s^\theta\big|^2ds.
\]
Choosing $\varepsilon=\frac1{2C}$, we deduce the two inequalities
\[
E\big|Y_t^\theta\big|^2 \le \Big(2C+\frac12\Big)E\int_t^T\big|Y_s^\theta\big|^2ds + \eta_t^\theta, \qquad (24)
\]
\[
E\int_t^T\big|Z_s^\theta\big|^2ds \le (4C+1)\,E\int_t^T\big|Y_s^\theta\big|^2ds + 2\eta_t^\theta. \qquad (25)
\]
On the other hand, we have
\[
\begin{aligned}
E\big|Y_T^\theta\big|^2
&= E\Big|\varphi_x(x_T^\mu)\,\tilde x_T - \frac{\varphi(x_T^\theta)-\varphi(x_T^\mu)}{\theta}\Big|^2\\
&\le 2\,E\int_0^1\big|\big[\varphi_x(x_T^\mu)-\varphi_x\big(x_T^\mu+\lambda\theta(\tilde x_T+X_T^\theta)\big)\big]\tilde x_T\big|^2d\lambda
+ 2\,E\int_0^1\big|\varphi_x\big(x_T^\mu+\lambda\theta(\tilde x_T+X_T^\theta)\big)X_T^\theta\big|^2d\lambda.
\end{aligned}
\]
By using (17) and the fact that $\varphi_x$ is continuous and bounded, we get
\[
\lim_{\theta\to0} E\big|Y_T^\theta\big|^2 = 0. \qquad (26)
\]
From (23) and (26), we deduce that
\[
\lim_{\theta\to0}\eta_t^\theta = 0. \qquad (27)
\]
Finally, by using (24), (27), Gronwall's lemma and the Burkholder-Davis-Gundy inequality, we obtain (18); (19) is then derived from (25), (27) and (18).

Lemma 10
Let $\mu$ be an optimal control minimizing the functional $J$ over $\mathcal R$, and let $(x_t^\mu,y_t^\mu,z_t^\mu)$ be the solution of (5) associated with $\mu$. Then, for any $q\in\mathcal R$, we have
\[
\begin{aligned}
0 \le{}& E\big[g_x(x_T^\mu)\,\tilde x_T\big] + E\big[h_y(y_0^\mu)\,\tilde y_0\big]\\
&+ E\int_0^T\Big[\int_U l(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,q_t(da)-\int_U l(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\mu_t(da)\Big]dt \qquad (28)\\
&+ E\int_0^T\int_U\big[l_x(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\tilde x_t + l_y(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\tilde y_t + l_z(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\tilde z_t\big]\,\mu_t(da)\,dt.
\end{aligned}
\]

Proof.
Let $\mu$ be an optimal relaxed control minimizing the cost $J$ over $\mathcal R$. Then
\[
\begin{aligned}
0 \le{}& E\big[g(x_T^\theta)-g(x_T^\mu)\big] + E\big[h(y_0^\theta)-h(y_0^\mu)\big]\\
&+ E\int_0^T\Big[\int_U l(t,x_t^\theta,y_t^\theta,z_t^\theta,a)\,\mu_t^\theta(da)-\int_U l(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\mu_t(da)\Big]dt.
\end{aligned}
\]
From the definition of $\mu^\theta$, we get
\[
\begin{aligned}
0 \le{}& E\big[g(x_T^\theta)-g(x_T^\mu)\big] + E\big[h(y_0^\theta)-h(y_0^\mu)\big]\\
&+ \theta\,E\int_0^T\Big[\int_U l(t,x_t^\theta,y_t^\theta,z_t^\theta,a)\,q_t(da)-\int_U l(t,x_t^\theta,y_t^\theta,z_t^\theta,a)\,\mu_t(da)\Big]dt\\
&+ E\int_0^T\int_U\big[l(t,x_t^\theta,y_t^\theta,z_t^\theta,a)-l(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\big]\,\mu_t(da)\,dt.
\end{aligned}
\]
Dividing by $\theta$ and using first-order expansions of $g$, $h$ and $l$, we obtain
\[
\begin{aligned}
0 \le{}& E\int_0^1 g_x\big(x_T^\mu+\lambda\theta(\tilde x_T+X_T^\theta)\big)\,\tilde x_T\,d\lambda \qquad (29)\\
&+ E\int_0^1 h_y\big(y_0^\mu+\lambda\theta(\tilde y_0+Y_0^\theta)\big)\,\tilde y_0\,d\lambda\\
&+ E\int_0^T\!\int_0^1\!\int_U\big[l_x\big(\Lambda_t^\theta(\lambda,a)\big)\tilde x_t + l_y\big(\Lambda_t^\theta(\lambda,a)\big)\tilde y_t + l_z\big(\Lambda_t^\theta(\lambda,a)\big)\tilde z_t\big]\,\mu_t(da)\,d\lambda\,dt\\
&+ E\int_0^T\Big[\int_U l(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,q_t(da)-\int_U l(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\mu_t(da)\Big]dt + \rho_t^\theta,
\end{aligned}
\]
where $\rho_t^\theta$ collects the remainder terms:
\[
\begin{aligned}
\rho_t^\theta ={}& E\int_0^1 g_x\big(x_T^\mu+\lambda\theta(\tilde x_T+X_T^\theta)\big)\,X_T^\theta\,d\lambda
+ E\int_0^1 h_y\big(y_0^\mu+\lambda\theta(\tilde y_0+Y_0^\theta)\big)\,Y_0^\theta\,d\lambda\\
&+ E\int_0^T\!\int_0^1\!\int_U\big[l_x\big(\Lambda_t^\theta(\lambda,a)\big)(x_t^\theta-x_t^\mu)+l_y\big(\Lambda_t^\theta(\lambda,a)\big)(y_t^\theta-y_t^\mu)+l_z\big(\Lambda_t^\theta(\lambda,a)\big)(z_t^\theta-z_t^\mu)\big]\,q_t(da)\,d\lambda\,dt\\
&+ E\int_0^T\!\int_0^1\!\int_U\big[l_x\big(\Lambda_t^\theta(\lambda,a)\big)(x_t^\theta-x_t^\mu)+l_y\big(\Lambda_t^\theta(\lambda,a)\big)(y_t^\theta-y_t^\mu)+l_z\big(\Lambda_t^\theta(\lambda,a)\big)(z_t^\theta-z_t^\mu)\big]\,\mu_t(da)\,d\lambda\,dt\\
&+ E\int_0^T\!\int_0^1\!\int_U\big[l_x\big(\Lambda_t^\theta(\lambda,a)\big)X_t^\theta + l_y\big(\Lambda_t^\theta(\lambda,a)\big)Y_t^\theta + l_z\big(\Lambda_t^\theta(\lambda,a)\big)Z_t^\theta\big]\,\mu_t(da)\,d\lambda\,dt.
\end{aligned}
\]
Since the derivatives $g_x$, $h_y$, $l_x$, $l_y$ and $l_z$ are bounded, the Cauchy-Schwarz inequality gives
\[
\begin{aligned}
\rho_t^\theta \le{}& C\big(E|X_T^\theta|^2\big)^{1/2} + C\big(E|Y_0^\theta|^2\big)^{1/2}
+ C\Big(E\int_0^T|x_t^\theta-x_t^\mu|^2dt\Big)^{1/2}
+ C\Big(E\int_0^T|y_t^\theta-y_t^\mu|^2dt\Big)^{1/2}\\
&+ C\Big(E\int_0^T|z_t^\theta-z_t^\mu|^2dt\Big)^{1/2}
+ C\Big(E\int_0^T|X_t^\theta|^2dt\Big)^{1/2}
+ C\Big(E\int_0^T|Y_t^\theta|^2dt\Big)^{1/2}
+ C\Big(E\int_0^T|Z_t^\theta|^2dt\Big)^{1/2}.
\end{aligned}
\]
By using (8), (9), (10), (17), (18) and (19), we get
\[
\lim_{\theta\to0}\rho_t^\theta = 0.
\]
Since $g_x$, $h_y$, $l_x$, $l_y$ and $l_z$ are continuous and bounded, the proof is completed by letting $\theta$ go to $0$ in (29).

3.2 Necessary optimality conditions for relaxed controls

Starting from the variational inequality (28), we can now state necessary optimality conditions for the relaxed control problem $\{(5),(6),(7)\}$ in global form.

Theorem 11 (Necessary optimality conditions for relaxed controls). Let $\mu$ be an optimal relaxed control minimizing the functional $J$ over $\mathcal R$, and let $(x^\mu,y^\mu,z^\mu)$ be the solution of (5) controlled by $\mu$. Then there exist three adapted processes $(k^\mu,p^\mu,P^\mu)$, unique solution of the following FBSDE system (called adjoint equations),
\[
\begin{cases}
dk_t^\mu = H_y(t,x_t^\mu,y_t^\mu,z_t^\mu,\mu_t,k_t^\mu,p_t^\mu,P_t^\mu)\,dt + H_z(t,x_t^\mu,y_t^\mu,z_t^\mu,\mu_t,k_t^\mu,p_t^\mu,P_t^\mu)\,dW_t,\\
k_0^\mu = h_y(y_0^\mu),\\
dp_t^\mu = -H_x(t,x_t^\mu,y_t^\mu,z_t^\mu,\mu_t,k_t^\mu,p_t^\mu,P_t^\mu)\,dt + P_t^\mu\,dW_t,\\
p_T^\mu = g_x(x_T^\mu)+\varphi_x(x_T^\mu)\,k_T^\mu,
\end{cases}
\qquad (30)
\]
such that for every $q\in\mathbb P(U)$
\[
H(t,x_t^\mu,y_t^\mu,z_t^\mu,\mu_t,k_t^\mu,p_t^\mu,P_t^\mu) \le H(t,x_t^\mu,y_t^\mu,z_t^\mu,q,k_t^\mu,p_t^\mu,P_t^\mu), \quad \text{a.e., a.s.}, \qquad (31)
\]
where the Hamiltonian $H$ is defined from $[0,T]\times\mathbb R^n\times\mathbb R^m\times\mathcal M_{m\times d}(\mathbb R)\times\mathbb P(U)\times\mathbb R^m\times\mathbb R^n\times\mathcal M_{n\times d}(\mathbb R)$ into $\mathbb R$ by
\[
H(t,x,y,z,q,k,p,P) = \int_U l(t,x,y,z,a)\,q(da) + p\int_U b(t,x,a)\,q(da) + P\int_U \sigma(t,x,a)\,q(da) + k\int_U f(t,x,y,z,a)\,q(da).
\]

Proof.
Since $k_0^\mu=h_y(y_0^\mu)$ and $p_T^\mu=g_x(x_T^\mu)+\varphi_x(x_T^\mu)\,k_T^\mu$, the variational inequality (28) becomes
\[
\begin{aligned}
0 \le{}& E\big[p_T^\mu\,\tilde x_T\big] - E\big[\varphi_x(x_T^\mu)\,k_T^\mu\,\tilde x_T\big] + E\big[k_0^\mu\,\tilde y_0\big]\\
&+ E\int_0^T\int_U\big[l_x(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\tilde x_t + l_y(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\tilde y_t + l_z(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\tilde z_t\big]\,\mu_t(da)\,dt \qquad (32)\\
&+ E\int_0^T\Big[\int_U l(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,q_t(da)-\int_U l(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\mu_t(da)\Big]dt.
\end{aligned}
\]
By applying Itô's formula to $(p_t^\mu\,\tilde x_t)$ and $(k_t^\mu\,\tilde y_t)$, we have
\[
\begin{aligned}
E\big[p_T^\mu\,\tilde x_T\big] ={}& -E\int_0^T\Big[\int_U f_x(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\mu_t(da)\,k_t^\mu + \int_U l_x(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\mu_t(da)\Big]\,\tilde x_t\,dt\\
&+ E\int_0^T p_t^\mu\Big[\int_U b(t,x_t^\mu,a)\,q_t(da)-\int_U b(t,x_t^\mu,a)\,\mu_t(da)\Big]dt\\
&+ E\int_0^T P_t^\mu\Big[\int_U \sigma(t,x_t^\mu,a)\,q_t(da)-\int_U \sigma(t,x_t^\mu,a)\,\mu_t(da)\Big]dt,
\end{aligned}
\]
\[
\begin{aligned}
E\big[k_0^\mu\,\tilde y_0\big] ={}& E\big[k_T^\mu\,\tilde y_T\big] + E\int_0^T\int_U f_x(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\mu_t(da)\,\tilde x_t\,k_t^\mu\,dt\\
&- E\int_0^T\int_U\big[l_y(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\tilde y_t + l_z(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\tilde z_t\big]\,\mu_t(da)\,dt\\
&+ E\int_0^T k_t^\mu\Big[\int_U f(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,q_t(da)-\int_U f(t,x_t^\mu,y_t^\mu,z_t^\mu,a)\,\mu_t(da)\Big]dt.
\end{aligned}
\]
Since $\tilde y_T=\varphi_x(x_T^\mu)\,\tilde x_T$, the terminal terms cancel, and for every $q\in\mathcal R$, (32) becomes
\[
0 \le E\int_0^T\big[H(t,x_t^\mu,y_t^\mu,z_t^\mu,q_t,k_t^\mu,p_t^\mu,P_t^\mu)-H(t,x_t^\mu,y_t^\mu,z_t^\mu,\mu_t,k_t^\mu,p_t^\mu,P_t^\mu)\big]dt.
\]
Now let $q\in\mathcal R$, let $F$ be an arbitrary element of the $\sigma$-algebra $\mathcal F_t$, and set
\[
\pi_t = q_t\,\mathbf 1_F + \mu_t\,\mathbf 1_{\Omega\setminus F}.
\]
It is obvious that $\pi$ is an admissible relaxed control. Applying the above inequality with $\pi$, we get
\[
0 \le E\big[\mathbf 1_F\,\big\{H(t,x_t^\mu,y_t^\mu,z_t^\mu,q_t,k_t^\mu,p_t^\mu,P_t^\mu)-H(t,x_t^\mu,y_t^\mu,z_t^\mu,\mu_t,k_t^\mu,p_t^\mu,P_t^\mu)\big\}\big], \quad \forall F\in\mathcal F_t,
\]
which implies that
\[
0 \le E\big[H(t,x_t^\mu,y_t^\mu,z_t^\mu,q_t,k_t^\mu,p_t^\mu,P_t^\mu)-H(t,x_t^\mu,y_t^\mu,z_t^\mu,\mu_t,k_t^\mu,p_t^\mu,P_t^\mu)\,\big|\,\mathcal F_t\big].
\]
The quantity inside the conditional expectation is $\mathcal F_t$-measurable, and thus the result follows immediately.

3.3 Sufficient optimality conditions for relaxed controls

In this subsection, we study when the necessary optimality conditions (31) become sufficient. For any $q\in\mathcal R$, we denote by $(x^q,y^q,z^q)$ the solution of equation (5) controlled by $q$.

Theorem 12 (Sufficient optimality conditions for relaxed controls). Assume that the functions $g$, $h$ and $(x,y,z)\mapsto H(t,x,y,z,q,k,p,P)$ are convex, and that for any $q\in\mathcal R$, $y_T^q=\xi$, where $\xi$ is an $m$-dimensional $\mathcal F_T$-measurable random variable such that $E|\xi|^2<\infty$. Then $\mu$ is an optimal solution of the relaxed control problem $\{(5),(6),(7)\}$ if it satisfies (31).

Proof.
Let $\mu$ be an arbitrary element of $\mathcal{R}$ (candidate to be optimal). For any $q \in \mathcal{R}$, we have
\begin{align*}
J(q) - J(\mu) ={}& E\left[g(x^q_T) - g(x^\mu_T)\right] + E\left[h(y^q_0) - h(y^\mu_0)\right]\\
&+ E\int_0^T \left[\int_U l(t,x^q_t,y^q_t,z^q_t,a)\,q_t(da) - \int_U l(t,x^\mu_t,y^\mu_t,z^\mu_t,a)\,\mu_t(da)\right]dt.
\end{align*}
Since $g$ and $h$ are convex,
\begin{align*}
g(x^q_T) - g(x^\mu_T) &\ge g_x(x^\mu_T)\,(x^q_T - x^\mu_T),\\
h(y^q_0) - h(y^\mu_0) &\ge h_y(y^\mu_0)\,(y^q_0 - y^\mu_0).
\end{align*}
Thus
\begin{align*}
J(q) - J(\mu) \ge{}& E\left[g_x(x^\mu_T)(x^q_T - x^\mu_T)\right] + E\left[h_y(y^\mu_0)(y^q_0 - y^\mu_0)\right]\\
&+ E\int_0^T \left[\int_U l(t,x^q_t,y^q_t,z^q_t,a)\,q_t(da) - \int_U l(t,x^\mu_t,y^\mu_t,z^\mu_t,a)\,\mu_t(da)\right]dt.
\end{align*}
We remark from (30) that
\[
p^\mu_T = g_x(x^\mu_T), \qquad k^\mu_0 = h_y(y^\mu_0).
\]
Then we have
\begin{align*}
J(q) - J(\mu) \ge{}& E\left[p^\mu_T(x^q_T - x^\mu_T)\right] + E\left[k^\mu_0(y^q_0 - y^\mu_0)\right]\\
&+ E\int_0^T \left[\int_U l(t,x^q_t,y^q_t,z^q_t,a)\,q_t(da) - \int_U l(t,x^\mu_t,y^\mu_t,z^\mu_t,a)\,\mu_t(da)\right]dt.
\end{align*}
By applying Itô's formula respectively to $p^\mu_t(x^q_t - x^\mu_t)$ and $k^\mu_t(y^q_t - y^\mu_t)$, we obtain
\begin{align*}
E\left[p^\mu_T(x^q_T - x^\mu_T)\right] ={}& -E\int_0^T H_x(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\,(x^q_t - x^\mu_t)\,dt\\
&+ E\int_0^T p^\mu_t\left[\int_U b(t,x^q_t,a)\,q_t(da) - \int_U b(t,x^\mu_t,a)\,\mu_t(da)\right]dt\\
&+ E\int_0^T P^\mu_t\left[\int_U \sigma(t,x^q_t,a)\,q_t(da) - \int_U \sigma(t,x^\mu_t,a)\,\mu_t(da)\right]dt,\\
E\left[k^\mu_0(y^q_0 - y^\mu_0)\right] ={}& -E\int_0^T H_y(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\,(y^q_t - y^\mu_t)\,dt\\
&+ E\int_0^T k^\mu_t\left[\int_U f(t,x^q_t,y^q_t,z^q_t,a)\,q_t(da) - \int_U f(t,x^\mu_t,y^\mu_t,z^\mu_t,a)\,\mu_t(da)\right]dt\\
&- E\int_0^T H_z(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\,(z^q_t - z^\mu_t)\,dt.
\end{align*}
Combining the above identities, we obtain
\begin{align}
J(q) - J(\mu) \ge{}& E\int_0^T \left[H(t,x^q_t,y^q_t,z^q_t,q_t,k^\mu_t,p^\mu_t,P^\mu_t) - H(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\right]dt \nonumber\\
&- E\int_0^T H_x(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\,(x^q_t - x^\mu_t)\,dt \tag{33}\\
&- E\int_0^T H_y(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\,(y^q_t - y^\mu_t)\,dt \nonumber\\
&- E\int_0^T H_z(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\,(z^q_t - z^\mu_t)\,dt. \nonumber
\end{align}
Since $H$ is convex in $(x,y,z)$ and linear in $\mu$, then by using the Clarke generalized gradient of $H$ evaluated at $(x^\mu_t,y^\mu_t,z^\mu_t,\mu_t)$ and the necessary optimality conditions (31), it follows from [50, Lemmas 2.2 and 2.3] that
\begin{align*}
&H(t,x^q_t,y^q_t,z^q_t,q_t,k^\mu_t,p^\mu_t,P^\mu_t) - H(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\\
&\quad\ge H_x(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\,(x^q_t - x^\mu_t) + H_y(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\,(y^q_t - y^\mu_t)\\
&\quad\quad+ H_z(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\,(z^q_t - z^\mu_t),
\end{align*}
or, equivalently,
\begin{align*}
0 \le{}& H(t,x^q_t,y^q_t,z^q_t,q_t,k^\mu_t,p^\mu_t,P^\mu_t) - H(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\\
&- H_x(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\,(x^q_t - x^\mu_t) - H_y(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\,(y^q_t - y^\mu_t)\\
&- H_z(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t)\,(z^q_t - z^\mu_t).
\end{align*}
Then, from (33), we get $J(q) - J(\mu) \ge 0$. The theorem is proved.
In this section, we study the strict control problem $\{(1),(2),(3)\}$ and, from the results of Section 3, we derive optimality conditions for strict controls. Throughout this section, in addition to the assumptions (4), we suppose that
\[
U \text{ is compact}, \tag{34}
\]
\[
b,\ \sigma,\ f \text{ and } l \text{ are bounded}. \tag{35}
\]
Consider the following subset of $\mathcal{R}$:
\[
\delta(\mathcal{U}) = \{q \in \mathcal{R} \;/\; q = \delta_v;\ v \in \mathcal{U}\}.
\]
The set $\delta(\mathcal{U})$ is the collection of all relaxed controls in the form of a Dirac measure charging a strict control. Denote by $\delta(U)$ the action set of all relaxed controls in $\delta(\mathcal{U})$. If $q \in \delta(\mathcal{U})$, then $q = \delta_v$ with $v \in \mathcal{U}$; in this case we have, for each $t$, $q_t \in \delta(U)$ and $q_t = \delta_{v_t}$.

We equip $P(U)$ with the topology of stable convergence. Since $U$ is compact, $P(U)$ endowed with this topology is a compact metrizable space. Stable convergence is required for bounded measurable functions $f(t,a)$ such that, for each fixed $t \in [0,T]$, $f(t,\cdot)$ is continuous (instead of functions bounded and continuous with respect to the pair $(t,a)$, as for the weak topology). The space $P(U)$ is equipped with its Borel $\sigma$-field, which is the smallest $\sigma$-field such that the mappings $q \mapsto \int f(s,a)\,q(ds,da)$ are measurable for any bounded measurable function $f$, continuous with respect to $a$. For more details, see Jacod and Mémin [29] and El Karoui et al. [16].

We now collect some lemmas that will be used in the sequel.
Lemma 13 (Chattering Lemma). Let $q$ be a predictable process with values in the space of probability measures on $U$. Then there exists a sequence of predictable processes $(u^n)_n$ with values in $U$ such that
\[
dt\,q^n_t(da) = dt\,\delta_{u^n_t}(da) \xrightarrow[n \to \infty]{} dt\,q_t(da) \quad \text{stably}, \quad P\text{-a.s.}, \tag{36}
\]
where $\delta_{u^n_t}$ is the Dirac measure concentrated at the single point $u^n_t$ of $U$.

Proof.
See El Karoui et al [16].
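As a purely numerical illustration of the chattering approximation (the relaxed control $q_t = \frac{1}{2}\delta_{a_0} + \frac{1}{2}\delta_{a_1}$ and the test function $f$ below are illustrative choices, not objects from the paper), one can check that time integrals against the alternating strict controls $u^n$ approach integrals against $q$:

```python
# Hypothetical sketch: approximate the relaxed control q_t = (1/2) delta_{a0} +
# (1/2) delta_{a1} by strict controls u^n alternating between a0 and a1 on
# consecutive cells of width T/(2n), as in the chattering lemma.
import math

T = 1.0
a0, a1 = -1.0, 2.0

def f(t, a):
    # bounded, measurable in t, continuous in a for each fixed t (assumed)
    return math.cos(t) * a + a ** 2

def relaxed_integral(m=20000):
    # int_0^T int_U f(t,a) q_t(da) dt, midpoint rule on a fine grid
    h = T / m
    return sum(0.5 * (f((i + 0.5) * h, a0) + f((i + 0.5) * h, a1)) * h
               for i in range(m))

def strict_integral(n, sub=10):
    # int_0^T f(t, u^n_t) dt for the chattering control u^n
    m = 2 * n * sub                      # sub midpoints per chattering cell
    h = T / m
    total = 0.0
    for i in range(m):
        cell = i // sub                  # index of the chattering cell
        a = a0 if cell % 2 == 0 else a1
        total += f((i + 0.5) * h, a) * h
    return total

I_rel = relaxed_integral()
errors = [abs(strict_integral(n) - I_rel) for n in (2, 8, 32)]
print(errors)  # the gap shrinks as the chattering mesh T/(2n) is refined
```

The printed gaps decrease as $n$ grows, mirroring the stable convergence (36): the alternating control spends the correct fraction of time at each atom of $q_t$, so only the oscillation of $f(\cdot,a)$ within a cell contributes to the error.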
Lemma 14. Let $q$ be a relaxed control and $(u^n)_n$ a sequence of strict controls such that (36) holds. Then, for any bounded measurable function $f : [0,T] \times U \to \mathbb{R}$ such that, for each fixed $t \in [0,T]$, $f(t,\cdot)$ is continuous, we have
\[
\int_U f(t,a)\,\delta_{u^n_t}(da) \xrightarrow[n \to \infty]{} \int_U f(t,a)\,q_t(da), \quad dt\text{-a.e.} \tag{37}
\]

Proof.
By (36) and the definition of stable convergence (see Jacod and Mémin [29, Definition 1.1, page 529]), we have
\[
\int_0^T \int_U f(t,a)\,\delta_{u^n_t}(da)\,dt \xrightarrow[n \to \infty]{} \int_0^T \int_U f(t,a)\,q_t(da)\,dt.
\]
Put $g(s,a) = \mathbf{1}_{[0,t]}(s)\,f(s,a)$. Then $g$ is bounded, measurable and continuous with respect to $a$, hence
\[
\int_0^T \int_U g(s,a)\,\delta_{u^n_s}(da)\,ds \xrightarrow[n \to \infty]{} \int_0^T \int_U g(s,a)\,q_s(da)\,ds.
\]
By replacing $g(s,a)$ by its value, we have
\[
\int_0^t \int_U f(s,a)\,\delta_{u^n_s}(da)\,ds \xrightarrow[n \to \infty]{} \int_0^t \int_U f(s,a)\,q_s(da)\,ds.
\]
Since the intervals $[0,t]$, $0 \le t \le T$, generate $\mathcal{B}_{[0,T]}$, it follows that for every $B \in \mathcal{B}_{[0,T]}$,
\[
\int_B \int_U f(s,a)\,\delta_{u^n_s}(da)\,ds \xrightarrow[n \to \infty]{} \int_B \int_U f(s,a)\,q_s(da)\,ds.
\]
This implies that
\[
\int_U f(s,a)\,\delta_{u^n_s}(da) \xrightarrow[n \to \infty]{} \int_U f(s,a)\,q_s(da), \quad dt\text{-a.e.}
\]
The lemma is proved.

The next lemma gives the stability of the controlled FBSDE with respect to the control variable.
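This stability can be previewed numerically in a simplified deterministic setting ($\sigma \equiv 0$ and no backward component; the drift $b$ and the relaxed control below are assumptions made only for illustration): Euler iterates driven by the chattering controls approach the trajectory driven by the relaxed control.

```python
# Hedged deterministic preview: x' = b(t, x, u^n_t) under chattering controls
# versus x' averaged against q_t = (1/2) delta_{a0} + (1/2) delta_{a1}.
# All coefficients are illustrative assumptions, not the paper's.
import math

T, steps = 1.0, 6400
a0, a1 = -1.0, 1.0

def b(t, x, a):
    # Lipschitz in x, bounded and continuous in a (assumed)
    return -x + math.sin(a) + 0.1 * a

def euler_strict(n):
    # forward Euler under the strict control alternating on cells of width T/(2n)
    h, x = T / steps, 0.0
    for i in range(steps):
        cell = (i * 2 * n) // steps      # chattering cell index (exact integers)
        a = a0 if cell % 2 == 0 else a1
        x += b(i * h, x, a) * h
    return x

def euler_relaxed():
    # forward Euler with the drift averaged against q_t
    h, x = T / steps, 0.0
    for i in range(steps):
        x += 0.5 * (b(i * h, x, a0) + b(i * h, x, a1)) * h
    return x

x_rel = euler_relaxed()
gaps = [abs(euler_strict(n) - x_rel) for n in (2, 8, 32)]
print(gaps)  # terminal gap shrinks as the chattering mesh is refined
```

The shrinking terminal gap is the deterministic shadow of (38); the stochastic case below additionally controls the backward pair $(y, z)$ through BSDE estimates.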
Lemma 15. Let $q \in \mathcal{R}$ be a relaxed control and $(x^q, y^q, z^q)$ the corresponding trajectory. Then there exists a sequence $(u^n)_n \subset \mathcal{U}$ such that
\begin{align}
&\lim_{n \to \infty} E\Big[\sup_{t \in [0,T]} |x^n_t - x^q_t|^2\Big] = 0, \tag{38}\\
&\lim_{n \to \infty} E\Big[\sup_{t \in [0,T]} |y^n_t - y^q_t|^2\Big] = 0, \tag{39}\\
&\lim_{n \to \infty} \int_0^T E|z^n_t - z^q_t|^2\,dt = 0, \tag{40}\\
&\lim_{n \to \infty} J(u^n) = J(q), \tag{41}
\end{align}
where $(x^n, y^n, z^n)$ denotes the solution of equation (1) associated with $u^n$.

Proof. i) Proof of (38). We have
\begin{align*}
E|x^n_t - x^q_t|^2 \le{}& C\int_0^t E\left|b(s,x^n_s,u^n_s) - \int_U b(s,x^q_s,a)\,q_s(da)\right|^2 ds + C\int_0^t E\left|\sigma(s,x^n_s,u^n_s) - \int_U \sigma(s,x^q_s,a)\,q_s(da)\right|^2 ds\\
\le{}& C\int_0^t E|b(s,x^n_s,u^n_s) - b(s,x^q_s,u^n_s)|^2\,ds + C\int_0^t E\left|b(s,x^q_s,u^n_s) - \int_U b(s,x^q_s,a)\,q_s(da)\right|^2 ds\\
&+ C\int_0^t E|\sigma(s,x^n_s,u^n_s) - \sigma(s,x^q_s,u^n_s)|^2\,ds + C\int_0^t E\left|\sigma(s,x^q_s,u^n_s) - \int_U \sigma(s,x^q_s,a)\,q_s(da)\right|^2 ds.
\end{align*}
Since $b$ and $\sigma$ are uniformly Lipschitz with respect to $x$,
\begin{align*}
E|x^n_t - x^q_t|^2 \le{}& C\int_0^t E|x^n_s - x^q_s|^2\,ds + C\int_0^t E\left|\int_U b(s,x^q_s,a)\,\delta_{u^n_s}(da) - \int_U b(s,x^q_s,a)\,q_s(da)\right|^2 ds\\
&+ C\int_0^t E\left|\int_U \sigma(s,x^q_s,a)\,\delta_{u^n_s}(da) - \int_U \sigma(s,x^q_s,a)\,q_s(da)\right|^2 ds.
\end{align*}
Since $b$ and $\sigma$ are bounded, measurable and continuous with respect to $a$, by (37) and the dominated convergence theorem the second and third terms on the right-hand side tend to zero as $n$ tends to infinity. We then conclude by Gronwall's lemma and the Burkholder-Davis-Gundy inequality.

ii) Proof of (39) and (40).
We have
\begin{align*}
d(y^n_t - y^q_t) ={}& -\left[f(t,x^n_t,y^n_t,z^n_t,u^n_t) - f(t,x^n_t,y^q_t,z^q_t,u^n_t)\right]dt - \left[f(t,x^n_t,y^q_t,z^q_t,u^n_t) - f(t,x^q_t,y^q_t,z^q_t,u^n_t)\right]dt\\
&- \left[f(t,x^q_t,y^q_t,z^q_t,u^n_t) - \int_U f(t,x^q_t,y^q_t,z^q_t,a)\,q_t(da)\right]dt + (z^n_t - z^q_t)\,dW_t,\\
y^n_T - y^q_T ={}& \varphi(x^n_T) - \varphi(x^q_T).
\end{align*}
Put $Y^n_t = y^n_t - y^q_t$, $Z^n_t = z^n_t - z^q_t$ and
\begin{align}
\Psi^n(t,Y^n_t,Z^n_t) ={}& -\left[f(t,x^n_t,y^q_t,z^q_t,u^n_t) - f(t,x^q_t,y^q_t,z^q_t,u^n_t)\right] - \left[f(t,x^q_t,y^q_t,z^q_t,u^n_t) - \int_U f(t,x^q_t,y^q_t,z^q_t,a)\,q_t(da)\right] \nonumber\\
&- \int_0^1 f_y\big(t,x^q_t,y^q_t + \lambda(y^n_t - y^q_t),z^q_t + \lambda(z^n_t - z^q_t),u^n_t\big)\,Y^n_t\,d\lambda \tag{42}\\
&- \int_0^1 f_z\big(t,x^q_t,y^q_t + \lambda(y^n_t - y^q_t),z^q_t + \lambda(z^n_t - z^q_t),u^n_t\big)\,Z^n_t\,d\lambda. \nonumber
\end{align}
Then
\[
\begin{cases}
dY^n_t = \Psi^n(t,Y^n_t,Z^n_t)\,dt + Z^n_t\,dW_t,\\
Y^n_T = \varphi(x^n_T) - \varphi(x^q_T).
\end{cases} \tag{43}
\]
The above equation is a linear BSDE with bounded coefficients; applying a priori estimates (see Briand et al. [12]), we get
\[
E\Big[\sup_{t \in [0,T]} |Y^n_t|^2 + \int_0^T |Z^n_t|^2\,dt\Big] \le C\,E\Big[|\varphi(x^n_T) - \varphi(x^q_T)|^2 + \Big(\int_0^T |\Psi^n(t,0,0)|\,dt\Big)^2\Big].
\]
From (42), we get
\begin{align*}
E\Big[\sup_{t \in [0,T]} |Y^n_t|^2 + \int_0^T |Z^n_t|^2\,dt\Big] \le{}& C\,E|\varphi(x^n_T) - \varphi(x^q_T)|^2 + C\,E\int_0^T |f(t,x^n_t,y^q_t,z^q_t,u^n_t) - f(t,x^q_t,y^q_t,z^q_t,u^n_t)|^2\,dt\\
&+ C\,E\int_0^T \left|f(t,x^q_t,y^q_t,z^q_t,u^n_t) - \int_U f(t,x^q_t,y^q_t,z^q_t,a)\,q_t(da)\right|^2 dt.
\end{align*}
By (4), $\varphi$ and $f$ are uniformly Lipschitz with respect to $x$, so that
\begin{align*}
E\Big[\sup_{t \in [0,T]} |Y^n_t|^2 + \int_0^T |Z^n_t|^2\,dt\Big] \le{}& C\,E|x^n_T - x^q_T|^2 + C\,E\int_0^T |x^n_t - x^q_t|^2\,dt\\
&+ C\,E\int_0^T \left|\int_U f(t,x^q_t,y^q_t,z^q_t,a)\,\delta_{u^n_t}(da) - \int_U f(t,x^q_t,y^q_t,z^q_t,a)\,q_t(da)\right|^2 dt.
\end{align*}
By (38), the first and second terms on the right-hand side of the above inequality tend to zero as $n$ tends to infinity. Moreover, since $f$ is bounded, measurable and continuous with respect to $a$, by (37) and the dominated convergence theorem the third term on the right-hand side also tends to zero as $n$ tends to infinity. This proves (39) and (40).

iii) Proof of (41). Since $g$, $h$ and $l$ are uniformly Lipschitz with respect to $(x,y,z)$, by the Cauchy-Schwarz inequality we have
\begin{align*}
|J(u^n) - J(q)| \le{}& C\left(E|x^n_T - x^q_T|^2\right)^{1/2} + C\left(E|y^n_0 - y^q_0|^2\right)^{1/2} + C\left(\int_0^T E|x^n_t - x^q_t|^2\,dt\right)^{1/2}\\
&+ C\left(\int_0^T E|y^n_t - y^q_t|^2\,dt\right)^{1/2} + C\left(E\int_0^T |z^n_t - z^q_t|^2\,dt\right)^{1/2}\\
&+ \left(E\int_0^T \left|\int_U l(t,x^q_t,y^q_t,z^q_t,a)\,\delta_{u^n_t}(da) - \int_U l(t,x^q_t,y^q_t,z^q_t,a)\,q_t(da)\right| dt\right)^{1/2}.
\end{align*}
By (38), (39) and (40), the first five terms on the right-hand side converge to zero. Furthermore, since $l$ is bounded, measurable and continuous in $a$, by (37) and the dominated convergence theorem the sixth term on the right-hand side tends to zero as $n$ tends to infinity. This proves (41).

Lemma 16
As a consequence of (41), the strict and the relaxed control problems have the same value functions; that is,
\[
\inf_{v \in \mathcal{U}} J(v) = \inf_{q \in \mathcal{R}} J(q). \tag{44}
\]

Proof.
Let $u \in \mathcal{U}$ and $\mu \in \mathcal{R}$ be, respectively, a strict and a relaxed control such that
\[
J(u) = \inf_{v \in \mathcal{U}} J(v), \tag{45}
\]
\[
J(\mu) = \inf_{q \in \mathcal{R}} J(q). \tag{46}
\]
By (46), we have $J(\mu) \le J(q)$ for all $q \in \mathcal{R}$. Since $\delta(\mathcal{U}) \subset \mathcal{R}$, then $J(\mu) \le J(q)$ for all $q \in \delta(\mathcal{U})$. For $q \in \delta(\mathcal{U})$ we have $q = \delta_v$ with $v \in \mathcal{U}$, and then
\[
(x^q, y^q, z^q) = (x^v, y^v, z^v), \qquad J(q) = J(v).
\]
Hence $J(\mu) \le J(v)$ for all $v \in \mathcal{U}$. In particular, taking $v = u$, we get
\[
J(\mu) \le J(u). \tag{47}
\]
On the other hand, by (45) we have
\[
J(u) \le J(v), \quad \forall v \in \mathcal{U}. \tag{48}
\]
Since $\mu$ is a relaxed control, by Lemma 13 there exists a sequence $(u^n)_n$ of strict controls such that
\[
dt\,\mu^n_t(da) = dt\,\delta_{u^n_t}(da) \xrightarrow[n \to \infty]{} dt\,\mu_t(da) \quad \text{stably}, \quad P\text{-a.s.}
\]
By (48), we then get $J(u) \le J(u^n)$ for all $n \in \mathbb{N}$. By using (41) and letting $n$ go to infinity in this inequality, we get
\[
J(u) \le J(\mu). \tag{49}
\]
Finally, by (47) and (49), the proof is completed.

To establish necessary optimality conditions for strict controls, we need the following lemma.

Lemma 17
The strict control $u$ minimizes $J$ over $\mathcal{U}$ if and only if the relaxed control $\mu = \delta_u$ minimizes $J$ over $\mathcal{R}$.

Proof.
Suppose that $u$ minimizes the cost $J$ over $\mathcal{U}$, so that $J(u) = \inf_{v \in \mathcal{U}} J(v)$. By using (44), we get $J(u) = \inf_{q \in \mathcal{R}} J(q)$. Since $\mu = \delta_u$, we have
\[
(x^\mu, y^\mu, z^\mu) = (x^u, y^u, z^u), \qquad J(\mu) = J(u). \tag{50}
\]
This implies that $J(\mu) = \inf_{q \in \mathcal{R}} J(q)$.

Conversely, if $\mu = \delta_u$ minimizes $J$ over $\mathcal{R}$, then $J(\mu) = \inf_{q \in \mathcal{R}} J(q)$. From (44), we get $J(\mu) = \inf_{v \in \mathcal{U}} J(v)$. Since $\mu = \delta_u$, relations (50) hold, and we obtain $J(u) = \inf_{v \in \mathcal{U}} J(v)$. The proof is completed.

The following lemma, which will be used to establish sufficient optimality conditions for strict controls, shows that the results of the above lemma remain valid if we replace $\mathcal{R}$ by $\delta(\mathcal{U})$.

Lemma 18
The strict control $u$ minimizes $J$ over $\mathcal{U}$ if and only if the relaxed control $\mu = \delta_u$ minimizes $J$ over $\delta(\mathcal{U})$.

Proof.
Let $\mu = \delta_u$ be an optimal relaxed control minimizing the cost $J$ over $\delta(\mathcal{U})$; we then have $J(\mu) \le J(q)$ for all $q \in \delta(\mathcal{U})$. For any $q \in \delta(\mathcal{U})$, there exists $v \in \mathcal{U}$ such that $q = \delta_v$. It is easy to see that
\[
(x^\mu, y^\mu, z^\mu) = (x^u, y^u, z^u), \quad (x^q, y^q, z^q) = (x^v, y^v, z^v), \quad J(\mu) = J(u), \quad J(q) = J(v). \tag{51}
\]
Then we get $J(u) \le J(v)$ for all $v \in \mathcal{U}$.

Conversely, let $u$ be a strict control minimizing the cost $J$ over $\mathcal{U}$, so that $J(u) \le J(v)$ for all $v \in \mathcal{U}$. Since $u, v \in \mathcal{U}$, there exist $\mu, q \in \delta(\mathcal{U})$ such that $\mu = \delta_u$ and $q = \delta_v$. This implies that relations (51) hold. Consequently, we get $J(\mu) \le J(q)$ for all $q \in \delta(\mathcal{U})$. The lemma is proved.
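The strict Hamiltonian defined next has the form $H = l + pb + P\sigma + kf$, and the necessary condition (53) below states that an optimal control minimizes it pointwise over $U$. A minimal numerical sketch, in which all coefficients, state values and adjoint values are assumptions chosen only for illustration:

```python
# Hedged sketch: at a frozen time t, with frozen state (x, y, z) and adjoint
# values (k, p, P), condition (53) picks the minimizer of the strict Hamiltonian
# over a discretized compact control set U = [-2, 2]. Every coefficient below is
# an illustrative assumption, not a model from the paper.
def hamiltonian(t, x, y, z, v, k, p, P):
    l = 0.5 * v ** 2            # running cost (assumed)
    b = -x + v                  # controlled drift (assumed)
    sigma = 0.2 * v             # controlled diffusion coefficient (assumed)
    f = -y + 0.1 * z            # BSDE driver (assumed, control-free here)
    return l + p * b + P * sigma + k * f

t, x, y, z, k, p, P = 0.3, 1.0, 0.5, 0.1, 0.4, -0.7, 0.3
U = [-2.0 + 0.01 * i for i in range(401)]        # grid over the compact set U
u_star = min(U, key=lambda v: hamiltonian(t, x, y, z, v, k, p, P))

# Check (53) on the grid: H at u_star lies below H at every candidate v in U.
assert all(hamiltonian(t, x, y, z, u_star, k, p, P)
           <= hamiltonian(t, x, y, z, v, k, p, P) + 1e-12 for v in U)
print(u_star)
```

With these assumed values, the quadratic-in-$v$ part of $H$ is $\tfrac{1}{2}v^2 + (p + 0.2P)\,v$, so the grid minimizer sits at $v = -(p + 0.2P) = 0.64$; note that no convexity of $U$ or of $H$ in $v$ is needed for this pointwise minimization.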
Define the Hamiltonian $H$ in the strict case from $[0,T] \times \mathbb{R}^n \times \mathbb{R}^m \times \mathcal{M}_{m \times d}(\mathbb{R}) \times U \times \mathbb{R}^m \times \mathbb{R}^n \times \mathcal{M}_{n \times d}(\mathbb{R})$ into $\mathbb{R}$ by
\[
H(t,x,y,z,v,k,p,P) = l(t,x,y,z,v) + p\,b(t,x,v) + P\,\sigma(t,x,v) + k\,f(t,x,y,z,v).
\]

Theorem 19 (Necessary optimality conditions for strict controls). Let $u$ be an optimal control minimizing the functional $J$ over $\mathcal{U}$ and $(x^u, y^u, z^u)$ the solution of (1) associated with $u$. Then there exist three adapted processes $(k^u, p^u, P^u)$, unique solution of the following system of FBSDEs (called adjoint equations)
\[
\begin{cases}
dk^u_t = H_y(t,x^u_t,y^u_t,z^u_t,u_t,k^u_t,p^u_t,P^u_t)\,dt + H_z(t,x^u_t,y^u_t,z^u_t,u_t,k^u_t,p^u_t,P^u_t)\,dW_t,\\
k^u_0 = h_y(y^u_0),\\
dp^u_t = -H_x(t,x^u_t,y^u_t,z^u_t,u_t,k^u_t,p^u_t,P^u_t)\,dt + P^u_t\,dW_t,\\
p^u_T = g_x(x^u_T) + \varphi_x(x^u_T)\,k^u_T,
\end{cases} \tag{52}
\]
such that for every $v_t \in U$,
\[
H(t,x^u_t,y^u_t,z^u_t,u_t,k^u_t,p^u_t,P^u_t) \le H(t,x^u_t,y^u_t,z^u_t,v_t,k^u_t,p^u_t,P^u_t), \quad \text{a.e., a.s.} \tag{53}
\]

Proof.
Let $u$ be an optimal solution of the strict control problem $\{(1),(2),(3)\}$ and let $\mu \in \delta(\mathcal{U})$ be such that $\mu = \delta_u$. Since $u$ minimizes the cost $J$ over $\mathcal{U}$, by Lemma 17 $\mu$ minimizes $J$ over $\mathcal{R}$. Hence, by the necessary optimality conditions for relaxed controls (Theorem 11), there exist three unique adapted processes $(k^\mu, p^\mu, P^\mu)$, solution of the system of relaxed adjoint equations (30), such that for every $q_t \in P(U)$,
\[
H(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t) \le H(t,x^\mu_t,y^\mu_t,z^\mu_t,q_t,k^\mu_t,p^\mu_t,P^\mu_t), \quad \text{a.e., a.s.}
\]
Since $\delta(U) \subset P(U)$, in particular for every $q_t \in \delta(U)$,
\[
H(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t) \le H(t,x^\mu_t,y^\mu_t,z^\mu_t,q_t,k^\mu_t,p^\mu_t,P^\mu_t), \quad \text{a.e., a.s.} \tag{54}
\]
Since $q \in \delta(\mathcal{U})$, there exists $v \in \mathcal{U}$ such that $q = \delta_v$; note that $v$ is an arbitrary element of $\mathcal{U}$, since $q$ is arbitrary. Now, since $\mu = \delta_u$ and $q = \delta_v$, we can easily see that
\[
\begin{cases}
(x^\mu, y^\mu, z^\mu) = (x^u, y^u, z^u), \quad (x^q, y^q, z^q) = (x^v, y^v, z^v), \quad (k^\mu, p^\mu, P^\mu) = (k^u, p^u, P^u),\\
H(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t) = H(t,x^u_t,y^u_t,z^u_t,u_t,k^u_t,p^u_t,P^u_t),\\
H(t,x^\mu_t,y^\mu_t,z^\mu_t,q_t,k^\mu_t,p^\mu_t,P^\mu_t) = H(t,x^u_t,y^u_t,z^u_t,v_t,k^u_t,p^u_t,P^u_t),
\end{cases} \tag{55}
\]
where the pair $(p^u, P^u)$ and $k^u$ are, respectively, the unique solutions of the system of strict adjoint equations (52). Finally, by using (54) and (55), we easily deduce (53). The proof is completed.

4.2 Sufficient optimality conditions for strict controls

Theorem 20 (Sufficient optimality conditions for strict controls). Assume that the functions $g$, $h$ and $(x,y,z) \mapsto H(t,x,y,z,v,k,p,P)$ are convex, and that for any $v \in \mathcal{U}$, $y^v_T = \xi$, where $\xi$ is an $m$-dimensional $\mathcal{F}_T$-measurable random variable such that $E|\xi|^2 < \infty$. Then $u$ is an optimal solution of the control problem $\{(1),(2),(3)\}$ if it satisfies (53).

Proof.
Let $u$ be a strict control (candidate to be optimal) such that the optimality conditions (53) hold, i.e., for every $v_t \in U$,
\[
H(t,x^u_t,y^u_t,z^u_t,u_t,k^u_t,p^u_t,P^u_t) \le H(t,x^u_t,y^u_t,z^u_t,v_t,k^u_t,p^u_t,P^u_t), \quad \text{a.e., a.s.} \tag{56}
\]
The controls $u, v$ are elements of $\mathcal{U}$; then there exist $\mu, q \in \delta(\mathcal{U})$ such that $\mu = \delta_u$ and $q = \delta_v$. This implies that relations (55) hold. Then, by (56), we deduce that for every $q_t \in \delta(U)$,
\[
H(t,x^\mu_t,y^\mu_t,z^\mu_t,\mu_t,k^\mu_t,p^\mu_t,P^\mu_t) \le H(t,x^\mu_t,y^\mu_t,z^\mu_t,q_t,k^\mu_t,p^\mu_t,P^\mu_t), \quad \text{a.e., a.s.}
\]
Since the strict Hamiltonian $H$ is convex in $(x,y,z)$, it is easy to see that the relaxed Hamiltonian is also convex in $(x,y,z)$; and since $g$ and $h$ are convex, then by the same argument as in the proof of Theorem 12, we show that $\mu$ minimizes the cost $J$ over $\delta(\mathcal{U})$. Finally, by Lemma 18, we deduce that $u$ minimizes the cost $J$ over $\mathcal{U}$. The theorem is proved.

Remark 21
The sufficient optimality conditions for strict controls are proved without assuming either the convexity of $U$ or that of $H$ in $v$.

References

[1] F. Antonelli,
Backward-forward stochastic differential equations, Annals of Applied Probability, 1993, 3, pp. 777-793.

[2] F. Antonelli and J. Ma, Weak solutions of forward-backward SDE's, Stochastic Analysis and Applications, 2003, 21, No. 3, pp. 493-514.

[3] F. Armerin, Aspects of Cash Flow Valuation, Doctoral thesis, KTH Stockholm, Sweden, 2004.

[4] S. Bahlali, B. Mezerdi and B. Djehiche, Approximation and optimality necessary conditions in relaxed stochastic control problems, Journal of Applied Mathematics and Stochastic Analysis, Volume 2006, pp. 1-23.

[5] S. Bahlali and B. Labed, Necessary and sufficient conditions of optimality for optimal control problem with initial and terminal costs, Rand. Operat. and Stoch. Equ., 2006, Vol. 14, No. 3, pp. 291-301.

[6] S. Bahlali, B. Djehiche and B. Mezerdi, The relaxed maximum principle in singular control of diffusions, SIAM J. Control and Optim., 2007, Vol. 46, Issue 2, pp. 427-444.

[7] S. Bahlali, Necessary and sufficient conditions of optimality for relaxed and strict control problems, SIAM J. Control and Optim., 2008, Vol. 47, No. 4, pp. 2078-2095.

[8] S. Bahlali, Necessary and sufficient condition of optimality for optimal control problem of forward and backward systems, Theory of Probability and Its Applications (TVP), in revision.

[9] S. Bahlali, Necessary and sufficient optimality conditions for relaxed and strict control problems of backward systems, Stochastics and Dynamics, submitted.

[10] S. Bahlali, A general necessary and sufficient optimality conditions for singular control problems, SIAM J. Control and Optim., submitted.

[11] A. Bensoussan, Non linear filtering and stochastic control, Proc. Cortona 1981, Lect. Notes in Math. 972, Springer Verlag, 1982.

[12] Ph. Briand, B. Delyon, Y. Hu, E. Pardoux and L. Stoica, $L^p$ solutions of backward stochastic differential equations, Stochastic Processes and their Applications, No. 108, 2003, pp. 109-129.

[13] F. Delarue, On the existence and uniqueness of solutions to FBSDEs in a non-degenerate case, Stochastic Process. Appl., 2002, 99, pp. 209-286.

[14] N. Dokuchaev and X.Y. Zhou, Stochastic controls with terminal contingent conditions, Journal of Mathematical Analysis and Applications, 1999, 238, pp. 143-165.

[15] J. Douglas, J. Ma and P. Protter, Numerical methods for forward-backward stochastic differential equations, Ann. Appl. Probab., 1996, 6(3), pp. 940-968.

[16] N. El Karoui, N. Huu Nguyen and M. Jeanblanc-Picqué, Compactification methods in the control of degenerate diffusions, Stochastics, Vol. 20, 1987, pp. 169-219.

[17] N. El Karoui and L. Mazliak, Backward stochastic differential equations, Addison Wesley Longman, 1997.

[18] N. El Karoui, S. Peng and M.C. Quenez, Backward stochastic differential equations in finance, Mathematical Finance, 1997, 7(1), pp. 1-71.

[19] N. El Karoui, S. Peng and M.C. Quenez, A dynamic maximum principle for the optimization of recursive utilities under constraints, Annals of Applied Probability, 11 (2001), pp. 664-693.

[20] R.J. Elliott and M. Kohlmann,
The variational principle and stochastic optimal control, Stochastics, 3, 1980, pp. 229-241.

[21] W.H. Fleming, Generalized solutions in optimal stochastic control, Differential Games and Control Theory 2 (Kingston conference 1976), Lect. Notes in Pure and Appl. Math. 30, 1978.

[22] N.F. Framstad, B. Oksendal and A. Sulem, A sufficient stochastic maximum principle for optimal control of jump diffusions and applications to finance, J. Optim. Theory and Applications, 121, 2004, pp. 77-98.

[23] M. Fuhrman and G. Tessitore, Existence of optimal stochastic controls and global solutions of forward-backward stochastic differential equations, SIAM J. Control and Optim., 2004, Vol. 43, No. 3, pp. 813-830.

[24] U.G. Haussmann, General necessary conditions for optimal control of stochastic systems, Math. Programming Studies, 6, 1976, pp. 30-48.

[25] U.G. Haussmann, A stochastic maximum principle for optimal control of diffusions, Pitman Research Notes in Math., Series 151, 1986.

[26] Y. Hu, On the solution of forward-backward SDEs with monotone and continuous coefficients, Nonlinear Anal., 1999, 42, pp. 1-12.

[27] Y. Hu and S. Peng, Solution of forward-backward stochastic differential equations, Probab. Theory Rel. Fields, 1995, 103, pp. 273-283.

[28] Y. Hu and J. Yong, Forward-backward stochastic differential equations with nonsmooth coefficients, Stochastic Process. Appl., 2000, 87, pp. 93-106.

[29] J. Jacod and J. Mémin, Sur un type de convergence intermédiaire entre la convergence en loi et la convergence en probabilité, Sém. Proba. XV, Lect. Notes in Math. 851, Springer Verlag, 1980.

[30] S. Ji and X.Y. Zhou, A maximum principle for stochastic optimal control with terminal state constraints, and its applications, Commun. Inf. Syst., 2006, 6(4), pp. 321-338.

[31] H.J. Kushner, Necessary conditions for continuous parameter stochastic optimization problems, SIAM J. Control Optim., Vol. 10, 1973, pp. 550-565.

[32] N.V. Krylov, Controlled diffusion processes, Springer Verlag, 1980.

[33] J. Ma, P. Protter and J. Yong, Solving forward-backward stochastic differential equations explicitly - a four step scheme, Probab. Theory Rel. Fields, 1994, 98, pp. 339-359.

[34] J. Ma and J. Yong, Solvability of forward-backward SDEs and the nodal set of Hamilton-Jacobi-Bellman equations, Chinese Ann. Math. Ser. B, 16, 1995, No. 3, pp. 279-298 (a Chinese summary appears in Chinese Ann. Math. Ser. A, 16, 1995, No. 4, p. 532).

[35] J. Ma and J. Yong, Forward-backward stochastic differential equations and their applications, Lecture Notes in Math., volume 1702, Springer, Berlin, 1999.

[36] J. Ma and J. Zhang, Representation theorems for backward stochastic differential equations, Ann. Appl. Probab., 2002, 12(4), pp. 1390-1418.

[37] B. Mezerdi and S. Bahlali,
Approximation in optimal control of diffusion processes, Rand. Operat. and Stoch. Equ., 2000, Vol. 8, No. 4, pp. 365-372.

[38] B. Mezerdi and S. Bahlali, Necessary conditions for optimality in relaxed stochastic control problems, Stochastics and Stoch. Reports, 2002, Vol. 73 (3-4), pp. 201-218.

[39] E. Pardoux and S. Peng, Adapted solutions of backward stochastic differential equations, Sys. Control Letters, 1990, Vol. 14, pp. 55-61.

[40] E. Pardoux and S. Tang, Forward-backward stochastic differential equations and quasilinear parabolic PDEs, Probab. Theory Rel. Fields, 1999, 114, pp. 123-150.

[41] S. Peng, A general stochastic maximum principle for optimal control problems, SIAM J. Control and Optim., 1990, 28, No. 4, pp. 966-979.

[42] S. Peng, Backward stochastic differential equations and application to optimal control, Appl. Math. Optim., 1993, 27, pp. 125-144.

[43] S. Peng and Z. Wu, Fully coupled forward-backward stochastic differential equations and applications to optimal control, SIAM J. Control Optim., 1999, 37, No. 3, pp. 825-843.

[44] J.T. Shi and Z. Wu, The maximum principle for fully coupled forward-backward stochastic control system, Acta Automatica Sinica, Vol. 32, No. 2, 2006, pp. 161-169.

[45] A.V. Skorokhod, Studies in the theory of random processes, Addison Wesley, Reading, Mass., 1965.

[46] Z. Wu, Maximum principle for optimal control problem of fully coupled forward-backward stochastic systems, Systems Sci. Math. Sci., 1998, 11, No. 3, pp. 249-259.

[47] W. Xu, Stochastic maximum principle for optimal control problem of forward and backward system, J. Austral. Math. Soc. Ser. B, 37, 1995, pp. 172-185.

[48] J. Yong, Finding adapted solutions of forward-backward stochastic differential equations - method of continuation, Probability Theory Related Fields, 1997, 107, pp. 537-572.

[49] J. Yong and X.Y. Zhou, Stochastic controls: Hamiltonian systems and HJB equations, Vol. 43, Springer, New York, 1999.

[50] X.Y. Zhou,