A Mean-field Stochastic Control Problem with Partial Observations
Rainer Buckdahn, Juan Li, and Jin Ma

October 10, 2018
Abstract
In this paper we are interested in a new type of mean-field, non-Markovian stochastic control problem with partial observations. More precisely, we assume that the coefficients of the controlled dynamics depend not only on the paths of the state, but also on the conditional law of the state, given the observation to date. Our problem is strongly motivated by the recent study of mean-field games and the related McKean-Vlasov stochastic control problem, but with added aspects of path-dependence and partial observation. We shall first investigate the well-posedness of the state-observation dynamics, combining reference probability measure arguments from nonlinear filtering theory with the Schauder fixed point theorem. We then study the stochastic control problem with a partially observable system in which the conditional law appears nonlinearly in both the coefficients of the system and the cost function. As a consequence the control problem is intrinsically "time-inconsistent", and we prove that the Pontryagin Stochastic Maximum Principle holds in this case and characterize the adjoint equations, which turn out to be a new form of mean-field type BSDEs.
Keywords.
Conditional mean-field SDEs, non-Markovian stochastic control system, nonlinear filtering, stochastic maximum principle, mean-field backward SDEs.

∗ The authors would like to dedicate this paper to Prof. Hans-Jürgen Engelbert, on the occasion of his 70th birthday, for his generous guidance and inspirational discussions throughout the past decades.

Laboratoire de Mathématiques, Université de Bretagne-Occidentale, F-29285 Brest Cedex, France, email: [email protected]; School of Mathematics, Shandong University, Jinan, 250100, China. This work is part of the French ANR project CAESARS (ANR-15-CE05-0024).

Corresponding author. School of Mathematics, Shandong University, Weihai, Weihai, 264209, China. Email: [email protected]. This author has been supported by the NSF of P.R. China (No. 11222110), Shandong Province (No. JQ201202), NSFC-RS (No. 11661130148; NA150344).

Department of Mathematics, University of Southern California, Los Angeles, 90089, USA. Email: [email protected]. This author is supported in part by US NSF grant.

1 Introduction
In this paper we are interested in the following mean-field-type stochastic control problem, on a given filtered probability space $(\Omega,\mathcal{F},\mathbb{P};\mathbb{F}=\{\mathcal{F}_t\}_{t\ge0})$:
$$
dX_t = E\{b(t,\varphi_{\cdot\wedge t},E[X_t|\mathcal{G}_t],u)\}\big|_{\varphi=X,\,u=u_t}\,dt
+ E\{\sigma(t,\varphi_{\cdot\wedge t},E[X_t|\mathcal{G}_t],u)\}\big|_{\varphi=X,\,u=u_t}\,dB_t,\qquad X_0=x, \qquad(1.1)
$$
where $B$ is an $\mathbb{F}$-Brownian motion, $b$ and $\sigma$ are measurable functions satisfying reasonable conditions, $\varphi_{\cdot\wedge t}$ and $X_{\cdot\wedge t}$ denote the continuous function and process, respectively, "stopped" at $t$; $\mathbb{G}\triangleq\{\mathcal{G}_t\}_{t\ge0}$ is a given filtration that could involve the information of $X$ itself; and $u=\{u_t:t\ge0\}$ is the "control process", assumed to be adapted to a filtration $\mathbb{H}=\{\mathcal{H}_t\}_{t\ge0}$, where $\mathcal{H}_t\subseteq\mathcal{F}^X_t\vee\mathcal{G}_t$, $t\ge0$. We note that if $\mathcal{G}_t=\{\emptyset,\Omega\}$ for all $t\ge0$, $\mathcal{H}_t=\mathcal{F}^X_t$, and the coefficients are "Markovian" (i.e., $\varphi_{\cdot\wedge t}=\varphi_t$), then the problem becomes a stochastic control problem with McKean-Vlasov dynamics and/or a mean-field game (see, for example, [7, 8, 9] in its "forward" form, and [2, 3, 4] in its "backward" form). On the other hand, when $\mathbb{G}$ is a given filtration, this is the so-called conditional mean-field SDE (CMFSDE for short) studied in [12]. We note that in that case the conditioning is essentially "open-looped".

The problem on which this paper particularly focuses is the case $\mathcal{G}_t=\mathcal{F}^Y_t$, $t\ge0$, where $Y$ is an "observation process" of the dynamics of $X$, i.e., the case when the pair $(X,Y)$ forms a "close-looped" or "coupled" CMFSDE. More precisely, we shall consider the following partially observed controlled dynamics (assuming $b=0$ for notational simplicity):
$$
dX_t = E\{\sigma(t,\varphi_{\cdot\wedge t},E[X_t|\mathcal{F}^Y_t],u)\}\big|_{\varphi=X,\,u=u_t}\,dB^1_t;\quad
dY_t = h(t,X_t)\,dt+\hat{\sigma}\,dB^2_t;\quad X_0=x,\ Y_0=0. \qquad(1.2)
$$
Here $X$ is the "signal" process that can only be observed through $Y$, $(B^1,B^2)$ is a standard Brownian motion, and $\hat\sigma$ is a constant. We should note that in SDEs (1.2) the conditioning filtration $\mathbb{F}^Y$ now depends on $X$ itself, therefore it is much more convoluted than the CMFSDEs we have seen in the literature. Furthermore, the path-dependent nature of the coefficients makes the SDE essentially non-Markovian. Such a form of CMFSDE, to the best of our knowledge, has not been fully explored in the literature.

Our study of the CMFSDE (1.2) is strongly motivated by the following variation of the mean-field game in a finance context, which results in a stochastic control problem involving controlled dynamics of such a form. Consider a firm whose fundamental value, under the risk-neutral measure $\mathbb{P}$ with zero interest rate, evolves as the following SDE with "stochastic volatility" $\sigma=\sigma(t,\omega)$, $(t,\omega)\in[0,\infty)\times\Omega$:
$$
X_t = x+\int_0^t \sigma(s,\cdot)\,dB^1_s,\quad t\ge0, \qquad(1.3)
$$
where $B^1$ is the intrinsic noise from inside the firm. We assume that such a fundamental value process cannot be observed directly, but can be observed through a stochastic dynamics (e.g., its stock value) via an SDE:
$$
Y_t=\int_0^t h(s,X_s)\,ds+B^2_t,\quad t\ge0, \qquad(1.4)
$$
where $B^2$ is the noise from the market, which we assume is independent of $B^1$ (this is by no means necessary; we could certainly consider the filtering problem with correlated noises). Now let us assume that the volatility $\sigma$ in (1.3) is affected by the actions of a large number of investors, all of whom can only make decisions based on the information from the process $Y$.
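For intuition, the pair (1.3)-(1.4) can be simulated on a time grid by the Euler-Maruyama scheme. The sketch below is ours, not the paper's; the function and argument names are illustrative, and it produces one discretized path of the unobservable value $X$ together with its observation $Y$:

```python
import numpy as np

def simulate_signal_observation(x0, sigma, h, T=1.0, n_steps=200, rng=None):
    """Euler-Maruyama sketch of the signal-observation pair (1.3)-(1.4):
    X_t = x0 + int_0^t sigma(s) dB^1_s,   Y_t = int_0^t h(s, X_s) ds + B^2_t,
    with B^1, B^2 independent Brownian motions."""
    if rng is None:
        rng = np.random.default_rng(0)
    dt = T / n_steps
    X = np.empty(n_steps + 1)
    Y = np.empty(n_steps + 1)
    X[0], Y[0] = x0, 0.0
    for k in range(n_steps):
        t = k * dt
        dB1, dB2 = rng.normal(0.0, np.sqrt(dt), size=2)  # independent increments
        X[k + 1] = X[k] + sigma(t) * dB1          # unobservable "fundamental value"
        Y[k + 1] = Y[k] + h(t, X[k]) * dt + dB2   # observable process (e.g. stock value)
    return X, Y
```

Note that the controller never sees the array `X`; any admissible decision rule may depend on `Y` only, which is exactly the adaptedness constraint discussed below.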
Therefore, similarly to [8] (or [17]), we begin by considering $N$ individual investors, and assume that the $i$-th investor's private state dynamics is of the form:
$$
dU^i_t=\sigma_i(t,U^i_{\cdot\wedge t},\bar\nu^N_t,\alpha^i_t)\,dB^{1,i}_t,\quad t\ge0,\ 1\le i\le N, \qquad(1.5)
$$
where the $B^{1,i}$'s are independent Brownian motions, and $\bar\nu^N_t$ denotes the empirical conditional distribution of $U=(U^1,\cdots,U^N)$, given the (common) observation $Y=\{Y_t:t\ge0\}$, that is, $\bar\nu^N_t\triangleq\frac1N\sum_{j=1}^N\delta_{E[U^j_t|\mathcal{F}^Y_t]}$, where $\delta_x$ denotes the Dirac measure at $x$. More precisely, the notation in (1.5) means (see, e.g., [8]),
$$
\sigma_i(t,U^i_{\cdot\wedge t},\bar\nu^N_t,\alpha^i_t)\triangleq\int_{\mathbb{R}}\tilde\sigma_i(t,U^i_{\cdot\wedge t},y,\alpha^i_t)\,\bar\nu^N_t(dy)
=\frac1N\sum_{j=1}^N\int_{\mathbb{R}}\tilde\sigma_i(t,U^i_{\cdot\wedge t},y,\alpha^i_t)\,\delta_{E[U^j_t|\mathcal{F}^Y_t]}(dy)
=\frac1N\sum_{j=1}^N\tilde\sigma_i(t,U^i_{\cdot\wedge t},E[U^j_t|\mathcal{F}^Y_t],\alpha^i_t). \qquad(1.6)
$$
Here, the $\tilde\sigma_i$'s are functions defined on appropriate (Euclidean) spaces.

We now assume that each investor chooses an individual strategy to minimize the cost; the cost functional of the $i$-th agent is of the form:
$$
J_i(\alpha^i)\triangleq E\Big\{\Phi_i(U^i_T)+\int_0^T L_i(t,U^i_{\cdot\wedge t},\bar\nu^N_t,\alpha^i_t)\,dt\Big\},\quad 1\le i\le N. \qquad(1.7)
$$
Following the argument of Lasry and Lions [20] (see also [8, 9, 11, 12, 17]), if we assume that the game is symmetric, i.e., $\tilde\sigma_i=\tilde\sigma$, $L_i=L$, and $\Phi_i=\Phi$ are independent of $i$, and let the number of investors $N$ tend to $+\infty$, then under suitable technical conditions one could find (approximate) Nash equilibria through a limiting dynamics, and assign a representative investor the unified strategy $\alpha$, determined by a conditional McKean-Vlasov type SDE
$$
dX_t=\sigma(t,X_{\cdot\wedge t},\mu_t,\alpha_t)\,dB^1_t,\quad t\ge0, \qquad(1.8)
$$
where $\mu_t$ is the conditional distribution of $X_t$ given $\mathcal{F}^Y_t$, and
$$
\sigma(t,X_{\cdot\wedge t},\mu_t,u_t)\triangleq\int\sigma(t,X_{\cdot\wedge t},y,u_t)\,\mu_t(dy)=E\{\sigma(t,\varphi_{\cdot\wedge t},E[X_t|\mathcal{F}^Y_t],u)\}\big|_{\varphi=X,\,u=u_t}.
$$
Furthermore, the value function becomes, with similar notation,
$$
V(x)=\inf_\alpha J(\alpha)\triangleq E\Big\{\Phi(X_T)+\int_0^T L(t,X_{\cdot\wedge t},\mu_t,\alpha_t)\,dt\Big\}. \qquad(1.9)
$$
We note that (1.8) and (1.9), together with (1.4), form a stochastic control problem involving CMFSDE dynamics and partial observations, as we are proposing.

The main objective of this paper is two-fold: we shall first study the exact meaning as well as the well-posedness of the dynamics, and then investigate the Stochastic Maximum Principle for the corresponding stochastic control problem. For the well-posedness of (1.2) we shall use a scheme that combines the idea of [7] and the techniques of nonlinear filtering, and prove the existence and uniqueness of the solution to SDE (1.8) via Schauder's fixed point theorem on $\mathcal{P}_2(\Omega)$, the space of probability measures with finite second moment, endowed with the 2-Wasserstein metric. We note that the important elements in this argument include the so-called reference probability space that is often seen in nonlinear filtering theory and the Kallianpur-Striebel formula (cf. e.g., [1, 26]), which enable us to define the solution mapping.

Our next task is to prove Pontryagin's Maximum Principle for our stochastic control problem. The main idea is similar to earlier works of the first two authors ([4, 21]), with some significant modifications. In particular, since in the present case the control problem can only be carried out in a weak form, due to the lack of a strong solution of the CMFSDE, the existence of the common reference probability space is essential. Consequently, extra effort is needed to overcome the complexity caused by the change of probability measures, which, together with the path-dependent nature of the underlying dynamical system, makes even the first-order adjoint equation more complicated than the traditional ones. To the best of our knowledge, the resulting mean-field backward SDE is new.

The paper is organized as follows. In Section 2 we provide all the necessary preparations, including some known facts of nonlinear filtering. In Sections 3 and 4 we prove the well-posedness of the partially observable dynamics.
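Before moving on, we note that the empirical-average coefficient in (1.6) is straightforward to evaluate once the conditional means $E[U^j_t|\mathcal{F}^Y_t]$ are approximated. A minimal sketch, with function and argument names of our own choosing (not the paper's):

```python
import numpy as np

def sigma_bar(sigma_tilde, t, path_i, cond_means, alpha_i):
    """Averaged diffusion coefficient of (1.6):
    sigma_i = (1/N) * sum_j sigma_tilde(t, U^i, E[U^j_t | F^Y_t], alpha^i_t),
    i.e. the integral of sigma_tilde against the empirical measure
    nu^N_t = (1/N) sum_j delta_{E[U^j_t | F^Y_t]}."""
    cond_means = np.asarray(cond_means, float)
    return float(np.mean([sigma_tilde(t, path_i, y, alpha_i) for y in cond_means]))
```

As $N\to\infty$ this empirical average converges (under the symmetry and regularity conditions above) to the integral against the limiting law $\mu_t$ appearing in (1.8).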
In Section 5 we introduce the stochastic control problem, and in Section 6 we study the variational equations and give some important estimates. Finally, in Section 7 we prove the Pontryagin maximum principle.

2 Preliminaries

Throughout this paper we consider the canonical space $(\Omega,\mathcal{F})$, where $\Omega\triangleq C_0([0,\infty);\mathbb{R}^d)=\{\omega\in C([0,\infty);\mathbb{R}^d):\omega_0=0\}$, and $\mathcal{F}$ is its topological $\sigma$-field. Let $\mathbb{F}=\{\mathcal{F}_t\}_{t\ge0}$ be the natural filtration on $\Omega$, that is, for each $t\ge0$, $\mathcal{F}_t$ is the topological $\sigma$-field of the space $\Omega_t\triangleq\{\omega(\cdot\wedge t):\omega\in\Omega\}$. For simplicity, throughout this paper we assume $d=1$, and that all the processes are 1-dimensional, although the higher-dimensional cases can be argued similarly without substantial difficulties. Furthermore, we let $\mathcal{P}(\Omega)$ denote the space of all probability measures on $(\Omega,\mathcal{F})$, and for each $\mathbb{P}\in\mathcal{P}(\Omega)$ we assume that $\mathbb{F}$ is $\mathbb{P}$-augmented so that the filtered probability space $(\Omega,\mathcal{F},\mathbb{P};\mathbb{F})$ satisfies the usual hypotheses.

Next, for given $T>0$, let $C_T=C([0,T])$ be endowed with the supremum norm $\|\cdot\|_{C_T}$, and let $\mathcal{B}(C_T)$ be its topological $\sigma$-field. Consider now the space of all probability measures on $(C_T,\mathcal{B}(C_T))$, denoted by $\mathcal{P}(C_T)$, and for $p\ge1$ let $\mathcal{P}_p(C_T)\subseteq\mathcal{P}(C_T)$ be those that have finite $p$-th moment. We recall that the $p$-Wasserstein metric on $\mathcal{P}_p(C_T)$ is defined as a mapping $W_p:\mathcal{P}_p(C_T)\times\mathcal{P}_p(C_T)\to\mathbb{R}_+$ such that, for all $\mu,\nu\in\mathcal{P}_p(C_T)$,
$$
W_p(\mu,\nu)\triangleq\inf\Big\{\Big(\int_{C_T\times C_T}\|x-y\|^p_{C_T}\,\pi(dx,dy)\Big)^{1/p}:\pi\in\mathcal{P}_p(C_T\times C_T)\ \text{with marginals}\ \mu\ \text{and}\ \nu\Big\}. \qquad(2.1)
$$
In this paper we shall use the 2-Wasserstein metric $W_2$, and abbreviate $(\mathcal{P}_2(C_T),W_2)$ by $\mathcal{P}_2(C_T)$. Since $C_T$ is a separable Banach space, it is known that $\mathcal{P}_2(C_T)$ is a separable and complete metric space. Furthermore, it is known that (cf. e.g., [24]), for $\mu_n,\mu\in\mathcal{P}_2(C_T)$,
$$
\lim_{n\to\infty}W_2(\mu_n,\mu)=0\iff \mu_n\xrightarrow{w}\mu\ \text{in}\ \mathcal{P}(C_T)\ \text{and}\ \sup_n\int_\Omega\|\varphi\|^2_{C_T}\mathbf{1}_{\{\|\varphi\|_{C_T}\ge N\}}\,\mu_n(d\varphi)\to0,\ \text{as}\ N\to+\infty. \qquad(2.2)
$$
Next, for any $\mathbb{P}\in\mathcal{P}(\Omega)$, $p,q\ge1$, any sub-filtration $\mathbb{G}\subseteq\mathbb{F}$, and any Banach space $\mathbb{X}$, we denote by $L^p(\mathbb{P};\mathbb{X})$ the set of all $\mathbb{X}$-valued $L^p$-random variables under $\mathbb{P}$. In particular, we denote by $L^p(\mathbb{P};\mathbb{R})$ the set of all real-valued $L^p$-random variables under $\mathbb{P}$. Further, we denote by $L^p_{\mathbb{G}}(\mathbb{P};L^q([0,T]))$ the $L^p$-space of all $\mathbb{G}$-adapted processes $\eta$ such that
$$
\|\eta\|_{p,q,\mathbb{P}}\triangleq\Big\{E^{\mathbb{P}}\Big[\Big(\int_0^T|\eta_t|^q\,dt\Big)^{p/q}\Big]\Big\}^{1/p}<\infty. \qquad(2.3)
$$
If $p=q$, we simply write $L^p_{\mathbb{G}}(\mathbb{P};[0,T])\triangleq L^p_{\mathbb{G}}(\mathbb{P};L^p([0,T]))$. Finally, we define $L^{\infty-}_{\mathbb{G}}(\mathbb{P};[0,T])\triangleq\bigcap_{p>1}L^p_{\mathbb{G}}(\mathbb{P};[0,T])$ and $L^{\infty-}_{\mathbb{G}}(\mathbb{P};C_T)\triangleq\bigcap_{p>1}L^p_{\mathbb{G}}(\mathbb{P};C_T)$, where $L^p_{\mathbb{G}}(\mathbb{P};C_T)$ is the space of all continuous, $\mathbb{F}$-adapted processes $\xi=\{\xi_t\}$ such that $\|\xi\|_{C_T}\in L^p(\mathbb{P};\mathbb{R})$. We will often drop "$\mathbb{P}$" from the subscript/superscript when the context is clear.

We now give a more precise description of the SDEs (1.2), in terms of the standard McKean-Vlasov SDE. Again we consider only the case $b=0$, and we assume further that $\hat\sigma=1$ in (1.2) for simplicity.

We begin by introducing some notation. Let $X$ be the state process and $Y$ the observation process, defined on $(\Omega,\mathcal{F},\mathbb{P})$ for some $\mathbb{P}\in\mathcal{P}(\Omega)$. We denote the "filtered" state process by $U^{X|Y}_t=E^{\mathbb{P}}[X_t|\mathcal{F}^Y_t]$, $t\ge0$. Since (as we show in Lemma 3.2 below) the process $U^{X|Y}$ is continuous, we denote its law under $\mathbb{P}$ on $C_T$ by $\mu^{X|Y}=\mathbb{P}\circ[U^{X|Y}]^{-1}\in\mathcal{P}_2(C_T)$. Next, let $P_t(\varphi)=\varphi(t)$, $\varphi\in C_T$, $t\ge0$, be the projection mapping, and define $\mu^{X|Y}_t=\mu^{X|Y}\circ P_t^{-1}$. Then, for any $\varphi\in C_T$ and $u\in\mathbb{R}$, we can write
$$
E[\sigma(t,\varphi_{\cdot\wedge t},E[X_t|\mathcal{F}^Y_t],u)]=\int\sigma(t,\varphi_{\cdot\wedge t},y,u)\,\mu^{X|Y}_t(dy)\triangleq\sigma(t,\varphi_{\cdot\wedge t},\mu^{X|Y}_t,u).
$$
We should note that since the dynamics $X$ is non-observable, the decision of the controller can only be made based on the information observed from the process $Y$. Therefore, it is reasonable to assume that the control process $u$ is $\mathbb{F}^Y=\{\mathcal{F}^Y_t\}_{t\ge0}$-adapted (or progressively measurable). We should remark that, for a given such control, it is by no means clear that the state-observation SDEs will have a strong solution on a prescribed probability space, as we shall see from our well-posedness results in the next sections. We therefore consider a "weak formulation", which we now describe. Consider the pairs $(\mathbb{P},u)$, where $\mathbb{P}\in\mathcal{P}(\Omega)$, $u\in L^2_{\mathbb{F}}(\mathbb{P};[0,T])$, such that the following SDEs are well-defined:
$$
X_t=x+\int_0^t E^{\mathbb{P}}[\sigma(s,\varphi_{\cdot\wedge s},E^{\mathbb{P}}[X_s|\mathcal{F}^Y_s],z)]\Big|_{\varphi=X,\,z=u_s}\,dB^1_s
=x+\int_0^t\int_{\mathbb{R}}\sigma(s,X_{\cdot\wedge s},y,u_s)\,\mu_s(dy)\,dB^1_s
=x+\int_0^t\sigma(s,X_{\cdot\wedge s},\mu_s,u_s)\,dB^1_s, \qquad(2.4)
$$
$$
Y_t=\int_0^t h(s,X_s)\,ds+B^2_t,\quad t\ge0, \qquad(2.5)
$$
where $(B^1,B^2)$ is a standard 2-dimensional Brownian motion under $\mathbb{P}$, and $\mu_t(\cdot)\triangleq\mathbb{P}\circ\{E^{\mathbb{P}}[X_t|\mathcal{F}^Y_t]\}^{-1}(\cdot)$ is the distribution, under $\mathbb{P}$, of the conditional expectation of $X_t$, given $\mathcal{F}^Y_t$. We note that we do not require that the solution to (2.4) and (2.5) (or the probability $\mathbb{P}$ for given $u$) be unique(!). Now let $U$ be a convex subset of $\mathbb{R}^k$. For simplicity, assume $k=1$.

Definition 2.1.
A pair $(\mathbb{P},u)\in\mathcal{P}(\Omega)\times L^2_{\mathbb{F}}(\mathbb{P};[0,T])$ is called an "admissible control" if
(i) $u_t\in U$ for all $t\in[0,T]$, and $B=(B^1,B^2)$ is an $(\mathbb{F},\mathbb{P})$-Brownian motion;
(ii) there exist processes $(X,Y)\in L^2_{\mathbb{F}}(\mathbb{P};[0,T])$ satisfying SDEs (2.4) and (2.5); and
(iii) $u\in L^{\infty-}_{\mathbb{F}^Y}(\mathbb{P};[0,T])$.
We shall denote the set of all admissible controls by $\mathscr{U}_{ad}$. For simplicity, we often write $u\in\mathscr{U}_{ad}$, and denote the associated probability measure(s) $\mathbb{P}$ by $\mathbb{P}^u$, for $u\in\mathscr{U}_{ad}$.

Remark 2.2.
As we shall see later, under our standing assumptions there is only one probability measure $\mathbb{P}^u$ associated to every control $u\in\mathscr{U}_{ad}$. We should note, however, that unlike the traditional filtering problem, the main difficulty of SDEs (2.4)-(2.5) lies in the mutual dependence between the solution pair $(X^u,Y)$, via the law of the conditional expectation $\mu^u_t=\mathbb{P}^u\circ\{E^{\mathbb{P}^u}[X^u_t|\mathcal{F}^Y_t]\}^{-1}$ appearing in the coefficients. Moreover, the requirement that $u$ be $\mathbb{F}^Y$-adapted adds an additional, seemingly "circular", nature to the problem. Thus, the well-posedness of the problem is far from obvious, and will be the main subject of §3 and §4.

We also note that for different admissible controls, the corresponding solutions $(X^u,Y)$ are often defined on different probability spaces. To facilitate our discussion we shall designate a common space on which all the controlled dynamics can be evaluated. In light of nonlinear filtering theory, we make the following assumption.

Assumption 2.3.
There exists a probability measure $\mathbb{Q}$ on $(\Omega,\mathcal{F})$ such that, under $\mathbb{Q}$, $(B^1,Y)$ is a 2-dimensional Brownian motion, where $Y$ is the observation process.

We note that the probability measure $\mathbb{Q}$ is commonly known as the "reference probability measure" in nonlinear filtering theory. The existence of such a measure can be argued once the existence of the weak solution of (2.4)-(2.5) is known. Indeed, suppose that $u\in\mathscr{U}_{ad}$ and $\mathbb{P}^u\in\mathcal{P}(\Omega)$ is the associated probability such that the SDEs (2.4) and (2.5) have a solution $(X^u,Y)$ on $(\Omega,\mathcal{F},\mathbb{P}^u)$. Consider the following SDE:
$$
\bar L_t=1-\int_0^t h(s,X^u_s)\bar L_s\,dB^2_s=1+\int_0^t\bar L_s\,dZ^u_s, \qquad(2.6)
$$
where $Z^u_t=-\int_0^t h(s,X^u_s)\,dB^2_s$. We denote its solution by $\bar L^u$. Then, under appropriate conditions on $h$, both $Z^u$ and $\bar L^u$ are $\mathbb{P}^u$-martingales, and $\bar L^u$ is the stochastic exponential:
$$
\bar L^u_t=\exp\Big\{Z^u_t-\frac12\langle Z^u\rangle_t\Big\}=\exp\Big\{-\int_0^t h(s,X^u_s)\,dB^2_s-\frac12\int_0^t|h(s,X^u_s)|^2\,ds\Big\}. \qquad(2.7)
$$
Thus, the Girsanov Theorem suggests that $d\mathbb{Q}=\bar L^u_T\,d\mathbb{P}^u$ defines a new probability measure $\mathbb{Q}$ under which $(B^1,Y)$ is a Brownian motion, hence a "reference measure".

The essence of Assumption 2.3 is, therefore, to assign a prior distribution to the observation process $Y$ before the well-posedness of the control system is established. In fact, with such an assumption one can begin by assuming that $(B^1,Y)$ is the canonical process (i.e., $(B^1_t,Y_t)(\omega)=\omega(t)$, $\omega\in\Omega$) and $\mathbb{Q}$ the Wiener measure on $(\Omega,\mathcal{F})$, and then proceed to prove the existence of the weak solution of the system (2.4) and (2.5). This scheme will be carried out in detail in §3.

For $u\in\mathscr{U}_{ad}$, we define the cost functional by
$$
J(t,x;u)\triangleq E^{\mathbb{Q}}\Big\{\int_t^T f(s,X^u_{\cdot\wedge s},\mu^u_s,u_s)\,ds+\Phi(X^u_T,\mu^u_T)\Big\}
=E^{\mathbb{Q}}\Big\{\int_t^T E^{\mathbb{P}^u}[f(s,\varphi_{\cdot\wedge s},E^{\mathbb{P}^u}[X^u_s|\mathcal{F}^Y_s],u)]\Big|_{\varphi=X^u,\,u=u_s}\,ds
+E^{\mathbb{P}^u}[\Phi(x,E^{\mathbb{P}^u}[X^u_T|\mathcal{F}^Y_T])]\Big|_{x=X^u_T}\Big\}, \qquad(2.8)
$$
and we denote the value function by
$$
V(t,x)\triangleq\inf_{u\in\mathscr{U}_{ad}}J(t,x;u). \qquad(2.9)
$$
We shall make use of the following Standing Assumptions on the coefficients.
Assumption 2.4.
(i) The mappings $(t,\varphi,x,y,z)\mapsto\sigma(t,\varphi_{\cdot\wedge t},y,z)$, $h(t,x)$, $f(t,\varphi_{\cdot\wedge t},y,z)$, and $\Phi(x,y)$ are bounded and continuous, for $(t,\varphi,x,y,z)\in[0,T]\times C_T\times\mathbb{R}\times\mathbb{R}\times U$;
(ii) The partial derivatives $\partial_y\sigma$, $\partial_z\sigma$, $\partial_yf$, $\partial_zf$, $\partial_xh$, $\partial_x\Phi$, $\partial_y\Phi$ are bounded and continuous, for $(\varphi,x,y,z)\in C_T\times\mathbb{R}\times\mathbb{R}\times U$, uniformly in $t\in[0,T]$;
(iii) The mappings $\varphi\mapsto\sigma(t,\varphi_{\cdot\wedge t},y,z)$, $f(t,\varphi_{\cdot\wedge t},y,z)$, as functionals from $C_T$ to $\mathbb{R}$, are Fréchet differentiable. Furthermore, there exists a family of measures $\{\ell(t,\cdot)\}_{t\in[0,T]}$, satisfying $0\le\int_0^T\ell(t,ds)\le C$ for all $t\in[0,T]$, such that both derivatives, denoted by $D_\varphi\sigma=D_\varphi\sigma(t,\varphi_{\cdot\wedge t},y,z)$ and $D_\varphi f=D_\varphi f(t,\varphi_{\cdot\wedge t},y,z)$, respectively, satisfy
$$
|D_\varphi\sigma(t,\varphi_{\cdot\wedge t},y,z)(\psi)|+|D_\varphi f(t,\varphi_{\cdot\wedge t},y,z)(\psi)|\le\int_0^T|\psi(s)|\,\ell(t,ds),\quad\psi\in C_T, \qquad(2.10)
$$
uniformly in $(t,\varphi,y,z)$;
(iv) The mapping $y\mapsto y\,\partial_y\sigma(t,\varphi_{\cdot\wedge t},y,z)$ is uniformly bounded, uniformly in $(t,\varphi,z)$;
(v) The mapping $x\mapsto x\,\partial_xh(t,x)$ is bounded, uniformly in $(t,x)\in[0,T]\times\mathbb{R}$;
(vi) The mappings $x\mapsto x\,h(t,x)$, $x\mapsto x^2\,\partial_xh(t,x)$ are bounded, uniformly in $(t,x)\in[0,T]\times\mathbb{R}$.

We note that some of the assumptions above are merely technical and can be improved, but we prefer not to dwell on such technicalities and focus instead on the main ideas.

Remark 2.5.
Note that if $(t,\varphi,y,z)\mapsto\phi(t,\varphi_{\cdot\wedge t},y,z)$ is a function defined on $[0,T]\times C_T\times\mathbb{R}\times\mathbb{R}$ satisfying Assumption 2.4-(i), (ii), then for any $\mu\in\mathcal{P}_2(C_T)$ we can define a function on the space $[0,T]\times\Omega\times C_T\times\mathcal{P}_2(C_T)\times U$:
$$
\bar\phi(t,\omega,\varphi_{\cdot\wedge t},\mu_t,z)\triangleq\int_{\mathbb{R}}\phi(t,\varphi_{\cdot\wedge t},y,z)\,\mu_t(dy), \qquad(2.11)
$$
where $\mu_t=\mu\circ P_t^{-1}$ and $P_t(\varphi)\triangleq\varphi(t)$, $(t,\varphi)\in[0,T]\times C_T$. Then $\bar\phi$ must satisfy the following Lipschitz condition:
$$
|\bar\phi(t,\varphi^1_{\cdot\wedge t},\mu^1_t,z_1)-\bar\phi(t,\varphi^2_{\cdot\wedge t},\mu^2_t,z_2)|\le K\big\{\|\varphi^1-\varphi^2\|_{C_t}+W_2(\mu^1,\mu^2)+|z_1-z_2|\big\}, \qquad(2.12)
$$
where $\|\cdot\|_{C_t}$ is the sup-norm on $C([0,t])$ and $W_2(\cdot,\cdot)$ is the 2-Wasserstein metric.

Remark 2.6.
The Fréchet derivatives $D_\varphi\sigma$ and $D_\varphi f$ by definition belong to $C^*_T\triangleq M[0,T]$, the space of all finite signed Borel measures on $[0,T]$, endowed with the total variation norm $|\cdot|_{TV}$ (with a slight abuse of notation, we still denote it by $|\cdot|$). Thus Assumption 2.4-(iii) amounts to saying that, as measures,
$$
|D_\varphi\sigma(t,\varphi_{\cdot\wedge t},y,z)(ds)|+|D_\varphi f(t,\varphi_{\cdot\wedge t},y,z)(ds)|\le\ell(t,ds),\quad\forall(t,\varphi,y,z). \qquad(2.13)
$$
This inequality will be crucial in our discussion in Section 7.

To end this section we recall some basic facts from nonlinear filtering theory, adapted to our situation. We begin by considering the inverse Girsanov kernel of $\bar L^u$ defined by (2.7):
$$
L^u_t\triangleq[\bar L^u_t]^{-1}=\exp\Big\{\int_0^t h(s,X^u_s)\,dY_s-\frac12\int_0^t|h(s,X^u_s)|^2\,ds\Big\},\quad t\in[0,T]. \qquad(2.14)
$$
Then $L^u$ is a $\mathbb{Q}$-martingale, $d\mathbb{P}^u=L^u_T\,d\mathbb{Q}$, and $L^u$ satisfies the following SDE on $(\Omega,\mathcal{F},\mathbb{Q})$:
$$
L_t=1+\int_0^t h(s,X_s)L_s\,dY_s,\quad t\in[0,T]. \qquad(2.15)
$$
Let us now write $L=L^u$ for simplicity. An important ingredient that we are going to use frequently is the SDEs known as the Kushner-Stratonovich or Fujisaki-Kallianpur-Kunita (FKK) equations for the "normalized conditional probability". Let us denote
$$
S^1_t\triangleq E^{\mathbb{Q}}[L_tX_t|\mathcal{F}^Y_t],\quad S^2_t\triangleq E^{\mathbb{Q}}[L_t|\mathcal{F}^Y_t],\quad t\ge0. \qquad(2.16)
$$
Since under $\mathbb{Q}$ the process $(B^1,Y)$ is a Brownian motion, the $\sigma$-fields $\mathcal{F}^Y_{t,T}$ and $\mathcal{F}^Y_t\vee\mathcal{F}^{B^1}_t$ are independent, where $\mathcal{F}^Y_{t,T}\triangleq\sigma\{Y_r-Y_t:t\le r\le T\}$. It is standard to show that (in light of (2.15)) $S^1$ and $S^2$ satisfy the following SDEs:
$$
S^2_t=1+\int_0^t E^{\mathbb{Q}}[h(s,X_s)L_s|\mathcal{F}^Y_s]\,dY_s,\quad t\ge0, \qquad(2.17)
$$
and
$$
S^1_t=x+\int_0^t E^{\mathbb{Q}}[L_sX_sh(s,X_s)|\mathcal{F}^Y_s]\,dY_s,\quad t\ge0. \qquad(2.18)
$$
Furthermore, let $U_t\triangleq E^{\mathbb{P}^u}[X_t|\mathcal{F}^Y_t]$, $t\ge0$. Then, by the Bayes formula (also known as the Kallianpur-Striebel formula, see, e.g., [1]) we have
$$
U_t=\frac{E^{\mathbb{Q}}[L_tX_t|\mathcal{F}^Y_t]}{E^{\mathbb{Q}}[L_t|\mathcal{F}^Y_t]}=\frac{S^1_t}{S^2_t},\quad t\ge0,\ \mathbb{Q}\text{-a.s.} \qquad(2.19)
$$
A simple application of Itô's formula and some direct computation then lead to the following FKK equation:
$$
dU_t=\big\{E^{\mathbb{P}^u}[X_th(t,X_t)|\mathcal{F}^Y_t]-E^{\mathbb{P}^u}[X_t|\mathcal{F}^Y_t]E^{\mathbb{P}^u}[h(t,X_t)|\mathcal{F}^Y_t]\big\}\,dY_t
+\big\{E^{\mathbb{P}^u}[X_t|\mathcal{F}^Y_t]\big(E^{\mathbb{P}^u}[h(t,X_t)|\mathcal{F}^Y_t]\big)^2-E^{\mathbb{P}^u}[X_th(t,X_t)|\mathcal{F}^Y_t]E^{\mathbb{P}^u}[h(t,X_t)|\mathcal{F}^Y_t]\big\}\,dt. \qquad(2.20)
$$
In fact, one can easily show that
$$
S^1_t=U_t\exp\Big\{\int_0^t E^{\mathbb{P}^u}[h(s,X_s)|\mathcal{F}^Y_s]\,dY_s-\frac12\int_0^t\big(E^{\mathbb{P}^u}[h(s,X_s)|\mathcal{F}^Y_s]\big)^2\,ds\Big\}. \qquad(2.21)
$$

3 Well-posedness of the State-Observation Dynamics

In this and the next sections we investigate the well-posedness of the controlled state-observation system (2.4) and (2.5). More precisely, we shall argue that the admissible control set $\mathscr{U}_{ad}$, defined by Definition 2.1, is not empty. We first note that, for a fixed $\mathbb{P}\in\mathcal{P}(\Omega)$ and $u\in L^{\infty-}_{\mathbb{F}^Y}(\mathbb{P};[0,T])$, if we define
$$
\phi^u(t,\omega,\varphi_{\cdot\wedge t},\mu_t)\triangleq\int_{\mathbb{R}}\phi(t,\varphi_{\cdot\wedge t},y,u_t(\omega))\,\mu_t(dy), \qquad(3.1)
$$
where $\phi=b,\sigma$, then we can write the control-observation system (2.4) and (2.5) in a slightly more generic form (denoting $b^u=b$ and $\sigma^u=\sigma$ for simplicity):
$$
X_t=x+\int_0^t b(s,\cdot,X_{\cdot\wedge s},\mu^{X|Y}_s)\,ds+\int_0^t\sigma(s,\cdot,X_{\cdot\wedge s},\mu^{X|Y}_s)\,dB^1_s;\quad
Y_t=\int_0^t h(s,X_s)\,ds+B^2_t,\quad t\ge0, \qquad(3.2)
$$
where $B=(B^1,B^2)$ is a $\mathbb{P}$-Brownian motion, and $\mu^{X|Y}_t=\mathbb{P}\circ[E^{\mathbb{P}}[X_t|\mathcal{F}^Y_t]]^{-1}$. Our task is to prove the well-posedness of SDE (3.2) in a weak sense (i.e., including the existence of the probability measure $\mathbb{P}$(!)). In light of Remark 2.5, we shall assume that the coefficients $b$ and $\sigma$ in (3.2) satisfy the following assumptions, which are slightly weaker than Assumption 2.4 but sufficient for our purpose in this section.

Assumption 3.1.
The coefficients $b,\sigma:[0,T]\times\Omega\times C_T\times\mathcal{P}_2(C_T)\to\mathbb{R}$ enjoy the following properties:
(i) for fixed $(\varphi,\mu)\in C_T\times\mathcal{P}_2(C_T)$, the mapping $(t,\omega)\mapsto(b,\sigma)(t,\omega,\varphi,\mu)$ is an $\mathbb{F}$-progressively measurable process;
(ii) for fixed $t\in[0,T]$ and $\mathbb{Q}$-a.e. $\omega\in\Omega$, there exists $K>0$, independent of $(t,\omega)$, such that for all $(\varphi^1,\mu^1),(\varphi^2,\mu^2)\in C_T\times\mathcal{P}_2(C_T)$ it holds that
$$
|\phi(t,\omega,\varphi^1_{\cdot\wedge t},\mu^1_t)-\phi(t,\omega,\varphi^2_{\cdot\wedge t},\mu^2_t)|\le K\Big(\sup_{t\in[0,T]}|\varphi^1_t-\varphi^2_t|+W_2(\mu^1,\mu^2)\Big), \qquad(3.3)
$$
for $\phi=b,\sigma$, respectively.

In the rest of the section we shall still assume $b=0$, as it does not add extra difficulties. Now assume that $(X,Y)$ satisfies (3.2) under $\mathbb{P}$, and let us denote $U^{X|Y}_t\triangleq E^{\mathbb{P}}[X_t|\mathcal{F}^Y_t]$, $t\ge0$. (Here $U^{X|Y}$ should be understood as the "optional projection" of $X$ onto $\mathbb{F}^Y$!) We first check that $U^{X|Y}$ is indeed a continuous process.

Lemma 3.2.
Assume that Assumption 2.4 holds. Then $U^{X|Y}$ admits a continuous version.

Proof. First note that $\mathbb{P}\sim\mathbb{Q}$, and $X$ has continuous paths, $\mathbb{P}$-a.s. By the Bayes formula (2.19) we can write $U^{X|Y}_t=\frac{E^{\mathbb{Q}}[L_tX_t|\mathcal{F}^Y_t]}{E^{\mathbb{Q}}[L_t|\mathcal{F}^Y_t]}=\frac{S^1_t}{S^2_t}$, where $S^1$ and $S^2$ satisfy (2.18) and (2.17), respectively, and $L$ satisfies (2.15). Clearly, the representations (2.17) and (2.18) indicate that both $S^1$ and $S^2$ have continuous paths, thus $U^{X|Y}$ must have a continuous version.

We now define $\mu^{X|Y}(\cdot)=\mathbb{P}\circ[U^{X|Y}]^{-1}(\cdot)$, and $\mu^{X|Y}_t(\cdot)=\mathbb{P}\circ[U^{X|Y}_t]^{-1}(\cdot)$, for any $t\ge0$. Then $\mu^{X|Y}\in\mathcal{P}_2(C_T)$, justifying the definition of SDE (3.2). In what follows, when the context is clear, we shall omit "$X|Y$" from the superscript.

We note that the special circular nature of SDE (3.2), between its solution and the law of the conditional expectation (whence the underlying probability), makes it necessary to specify the meaning of a solution. We have the following definition.

Definition 3.3 (Weak Solution). An eight-tuple $(\Omega,\mathcal{F},\mathbb{P},\mathbb{F},X,Y,B^1,B^2)$ is called a solution to the filtering equation (3.2) if
(i) $(\Omega,\mathcal{F})$ is the canonical space, $\mathbb{P}\in\mathcal{P}(\Omega)$, and $\mathbb{F}$ is the canonical filtration;
(ii) $(B^1,B^2)$ is a 2-dimensional $\mathbb{F}$-Brownian motion under $\mathbb{P}$;
(iii) $(X,Y)$ is an $\mathbb{F}$-adapted continuous process such that (3.2) holds for all $t\in[0,T]$, $\mathbb{P}$-almost surely.

To prove the well-posedness we shall use a generalized version of the Schauder Fixed Point Theorem (see Cauty [13], or a recent generalization in [14]). To this end we consider the following subset of $\mathcal{P}_2(C_T)$:
$$
\mathcal{E}\triangleq\Big\{\mu\in\mathcal{P}_2(C_T)\ \Big|\ \sup_{t\in[0,T]}\int_{\mathbb{R}}|y|^2\,\mu_t(dy)<\infty\Big\}. \qquad(3.4)
$$
In the above, $\mu_t=\mu\circ P_t^{-1}\in\mathcal{P}(\mathbb{R})$, and $P_t(\varphi)=\varphi(t)$, $\varphi\in\Omega$, is the projection mapping. Clearly, $\mathcal{E}$ is a convex subset of $\mathcal{P}_2(C_T)$.

We now construct a mapping $\mathcal{T}:\mathcal{E}\to\mathcal{E}$ whose fixed point, if it exists, would give a solution to the SDE (3.2).
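Numerically, the Bayes ratio (2.19) is what a particle approximation of such a construction evaluates: one simulates i.i.d. copies of the signal under the reference measure $\mathbb{Q}$, attaches to each copy the Girsanov weight (2.14) computed along the same observed path $Y$, and forms the weighted average. A minimal sketch, assuming the particle values and their log-weights have already been computed (the function name is ours):

```python
import numpy as np

def kallianpur_striebel_estimate(particles, log_weights):
    """Particle form of the Bayes/Kallianpur-Striebel formula (2.19):
    U_t ~ sum_i L_i x_i / sum_i L_i, where x_i = value of particle i at time t
    and L_i = exp(log_weights[i]) is its Girsanov weight from (2.14),
    accumulated along the common observation path Y."""
    particles = np.asarray(particles, float)
    lw = np.asarray(log_weights, float)
    w = np.exp(lw - lw.max())          # normalize in log-space for stability
    return float(np.sum(w * particles) / np.sum(w))
```

When $h\equiv0$ the observation carries no information, all weights are equal, and the estimate collapses to the plain sample mean, as it should.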
We shall begin with the reference probability space $(\Omega,\mathcal{F},\mathbb{Q})$; thanks to Assumption 2.3, $(B^1,Y)$ is then a $\mathbb{Q}$-Brownian motion. We may assume without loss of generality that $(B^1,Y)$ is the canonical process, and that $\mathbb{Q}$ is the Wiener measure. For any $\mu\in\mathcal{E}$ we consider the SDE on the space $(\Omega,\mathcal{F},\mathbb{Q})$:
$$
X_t=x+\int_0^t\sigma(s,\cdot,X_{\cdot\wedge s},\mu_s)\,dB^1_s,\quad t\ge0. \qquad(3.5)
$$
Note that, as the distribution $\mu$ is given, (3.5) is an "open-loop" SDE with "functional Lipschitz" coefficient, thanks to Assumption 3.1. Thus there exists a unique (strong) solution to (3.5), which we denote by $X=X^\mu$.

Now, using $X^\mu$ we define the process $L^\mu=\{L^\mu_t\}_{t\ge0}$ as in (2.14) on the probability space $(\Omega,\mathcal{F},\mathbb{Q})$, and then we define the probability $d\mathbb{P}^\mu\triangleq L^\mu_T\,d\mathbb{Q}$. By the Kallianpur-Striebel formula (2.19) we can define a process
$$
U^\mu_t\triangleq E^{\mathbb{P}^\mu}[X^\mu_t|\mathcal{F}^Y_t]=\frac{E^{\mathbb{Q}}[L^\mu_tX^\mu_t|\mathcal{F}^Y_t]}{E^{\mathbb{Q}}[L^\mu_t|\mathcal{F}^Y_t]}=\frac{S^\mu_t}{S^{\mu,2}_t},\quad t\ge0, \qquad(3.6)
$$
where $S^\mu_t\triangleq E^{\mathbb{Q}}[L^\mu_tX^\mu_t|\mathcal{F}^Y_t]$, $S^{\mu,2}_t\triangleq E^{\mathbb{Q}}[L^\mu_t|\mathcal{F}^Y_t]$, $t\ge0$, and then we denote
$$
\mathcal{T}(\mu)\triangleq\nu^\mu=\mathbb{P}^\mu\circ[U^\mu]^{-1}\in\mathcal{P}_2(C_T). \qquad(3.7)
$$
Our task is to show that the solution mapping $\mathcal{T}:\mu\mapsto\nu^\mu$ satisfies the desired assumptions for Schauder's Fixed Point Theorem.

Theorem 3.4.
The solution mapping $\mathcal{T}:\mathcal{E}\to\mathcal{P}_2(C_T)$ enjoys the following properties:
(1) $\mathcal{T}(\mathcal{E})\subseteq\mathcal{E}$;
(2) $\mathcal{T}(\mathcal{E})$ is compact under the 2-Wasserstein metric;
(3) $\mathcal{T}:(\mathcal{E},W_1(\cdot,\cdot))\to(\mathcal{P}_2(C_T),W_2(\cdot,\cdot))$ is continuous, i.e., whenever $\mu,\mu^n\in\mathcal{E}$, $n\ge1$, are such that $W_1(\mu^n,\mu)\to0$, we have $W_2(\mathcal{T}(\mu^n),\mathcal{T}(\mu))\to0$.

We remark that an immediate consequence of (3) is that $\mathcal{T}:\mathcal{E}\to\mathcal{P}_2(C_T)$ is continuous under both the 1- and the 2-Wasserstein metrics. Moreover, the compactness of $\mathcal{T}(\mathcal{E})$ under the 2-Wasserstein metric stated in (2) implies that in the 1-Wasserstein metric.

Proof. (1) Given $\mu\in\mathcal{E}$, we need only show that
$$
\sup_{t\in[0,T]}\int_{\mathbb{R}}|y|^2\,\nu^\mu_t(dy)<\infty. \qquad(3.8)
$$
To see this we note that, for $t\in[0,T]$, by Jensen's inequality,
$$
\int_{\mathbb{R}}|y|^2\,\nu^\mu_t(dy)=\int_{\mathbb{R}}|y|^2\,\mathbb{P}^\mu\circ[U^\mu_t]^{-1}(dy)=E^{\mathbb{P}^\mu}\big[|E^{\mathbb{P}^\mu}[X^\mu_t|\mathcal{F}^Y_t]|^2\big]\le E^{\mathbb{P}^\mu}[|X^\mu_t|^2].
$$
Since under $\mathbb{Q}$, $B^1$ is also a Brownian motion, it is standard to argue that, as $X^\mu$ is the solution to the SDE (3.5),
$$
\sup_{0\le t\le T}E^{\mathbb{Q}}[|X^\mu_t|^{2n}]\le C(1+|x|^{2n}),\quad\text{for all}\ n\in\mathbb{N}. \qquad(3.9)
$$
Furthermore, noting that the process $L^\mu$ is an $L^2$-martingale under $\mathbb{Q}$, we have
$$
\sup_{0\le t\le T}\int_{\mathbb{R}}|y|^2\,\nu^\mu_t(dy)\le\sup_{0\le t\le T}E^{\mathbb{P}^\mu}\big[|X^\mu_t|^2\big]=\sup_{0\le t\le T}E^{\mathbb{Q}}\big[L^\mu_T|X^\mu_t|^2\big]\le\big(E^{\mathbb{Q}}[|L^\mu_T|^2]\big)^{1/2}\sup_{0\le t\le T}\big(E^{\mathbb{Q}}[|X^\mu_t|^4]\big)^{1/2}<\infty,
$$
thanks to (3.9). In other words, $\nu^\mu=\mathcal{T}(\mu)\in\mathcal{E}$, proving (1).

(2) We shall prove that for any sequence $\{\mu^n\}\subseteq\mathcal{E}$ there exists a subsequence, still denoted by $\{\mu^n\}$, such that $\lim_{n\to\infty}\mathcal{T}(\mu^n)=\nu$ in the 2-Wasserstein metric, for some $\nu\in\mathcal{T}(\mathcal{E})$. In light of the equivalence relation (2.2), we shall first argue that the family $\{\mathcal{T}(\mu^n)\}_{n\ge1}$ is tight. To this end, recall that
$$
U^n_t=E^{\mathbb{P}^n}[X^n_t|\mathcal{F}^Y_t]=\frac{S^n_t}{S^{n,2}_t}, \qquad(3.10)
$$
where $S^n_t\triangleq E^{\mathbb{Q}}[L^n_tX^n_t|\mathcal{F}^Y_t]$, $S^{n,2}_t\triangleq E^{\mathbb{Q}}[L^n_t|\mathcal{F}^Y_t]$, $t\ge0$, and $d\mathbb{P}^n\triangleq L^n_T\,d\mathbb{Q}$. It then follows from the FKK equation (2.20) that
$$
dU^n_t=\big\{E^{\mathbb{P}^n}[X^n_th(t,X^n_t)|\mathcal{F}^Y_t]-E^{\mathbb{P}^n}[X^n_t|\mathcal{F}^Y_t]E^{\mathbb{P}^n}[h(t,X^n_t)|\mathcal{F}^Y_t]\big\}\,dY_t
+\big\{E^{\mathbb{P}^n}[X^n_t|\mathcal{F}^Y_t]\big(E^{\mathbb{P}^n}[h(t,X^n_t)|\mathcal{F}^Y_t]\big)^2-E^{\mathbb{P}^n}[X^n_th(t,X^n_t)|\mathcal{F}^Y_t]E^{\mathbb{P}^n}[h(t,X^n_t)|\mathcal{F}^Y_t]\big\}\,dt. \qquad(3.11)
$$
Now denote $B^{2,n}_t\triangleq Y_t-\int_0^t h(s,X^n_s)\,ds$. Then $(B^1,B^{2,n})$ is a 2-dimensional standard $\mathbb{P}^n$-Brownian motion. Furthermore, since $h$ is bounded, so is $E^{\mathbb{P}^n}[h(t,X^n_t)|\mathcal{F}^Y_t]$. We thus have the following estimate:
$$
E^{\mathbb{P}^n}[|U^n_t-U^n_s|^4]\le C\,E^{\mathbb{P}^n}\Big[\Big(\int_s^t E^{\mathbb{P}^n}[|X^n_r|^2\,|\mathcal{F}^Y_r]\,dr\Big)^2\Big]\le C\,E^{\mathbb{P}^n}\Big[\sup_{0\le r\le T}\big(E^{\mathbb{P}^n}[\sup_{0\le v\le T}|X^n_v|^2\,|\mathcal{F}^Y_r]\big)^2\Big]|t-s|^2\le C\,E^{\mathbb{P}^n}\Big[\sup_{0\le r\le T}|X^n_r|^4\Big]|t-s|^2\le C|t-s|^2. \qquad(3.12)
$$
Thus, as $U^n_0=x$, $n\ge1$, the sequence of continuous processes $\{U^n\}$ is relatively compact (cf. e.g., Ethier-Kurtz [16]). Therefore, the sequence of their laws $\{\mathcal{T}(\mu^n)\triangleq\mathbb{P}^n\circ[U^n]^{-1},\,n\ge1\}\subseteq\mathcal{P}_2(C_T)$ is tight. Consequently, we can find a subsequence, still denoted by itself, that converges weakly to a limit $\nu\in\mathcal{P}(C_T)$. Furthermore, for each $n\ge1$, we apply the Jensen, Burkholder-Davis-Gundy, and Hölder inequalities to get, with $\nu^n\triangleq\mathcal{T}(\mu^n)$,
$$
\int_{C_T}\|\varphi\|^4_{C_T}\,\nu^n(d\varphi)=E^{\mathbb{P}^n}[\|U^n\|^4_{C_T}]=E^{\mathbb{P}^n}\Big[\sup_{0\le t\le T}\big|E^{\mathbb{P}^n}[X^n_t|\mathcal{F}^Y_t]\big|^4\Big]\le C\,E^{\mathbb{P}^n}\Big[\sup_{0\le r\le T}|X^n_r|^4\Big]=C\,E^{\mathbb{Q}}\Big[L^n_T\sup_{0\le r\le T}|X^n_r|^4\Big]\le C\big(E^{\mathbb{Q}}[(L^n_T)^2]\big)^{1/2}\big(E^{\mathbb{Q}}\big[\sup_{0\le r\le T}|X^n_r|^8\big]\big)^{1/2}<+\infty. \qquad(3.13)
$$
But noting that $h$ is bounded, one deduces from (3.9) that
$$
\sup_{n\ge1}\int_{C_T}\|\varphi\|^4_{C_T}\,\nu^n(d\varphi)<\infty, \qquad(3.14)
$$
and thus $\sup_{n\ge1}\int_{C_T}\|\varphi\|^2_{C_T}\mathbf{1}_{\{\|\varphi\|_{C_T}\ge N\}}\,\nu^n(d\varphi)\to0$, as $N\to+\infty$. This, together with the fact that $\nu^n=\mathcal{T}(\mu^n)\xrightarrow{w}\nu$, implies that $W_2(\nu^n,\nu)\to0$ and $\nu\in\mathcal{E}$, as $n\to\infty$, where $W_2(\cdot,\cdot)$ is the 2-Wasserstein metric on $\mathcal{P}_2(C_T)$. This proves (2).

(3) We now check that the mapping $\mathcal{T}:(\mathcal{E},W_1(\cdot,\cdot))\to(\mathcal{P}_2(C_T),W_2(\cdot,\cdot))$ is continuous. To this end, for each $\mu\in\mathcal{E}$, we consider the following SDE on the probability space $(\Omega,\mathcal{F},\mathbb{Q})$:
$$
dX_t=\sigma(t,X_{\cdot\wedge t},\mu_t)\,dB^1_t,\ X_0=x;\quad dB^2_t=dY_t-h(t,X_t)\,dt,\ B^2_0=0;\quad dL_t=h(t,X_t)L_t\,dY_t,\ L_0=1. \qquad(3.15)
$$
Now let $\{\mu^n\}\subseteq\mathcal{E}$ be any sequence such that $\mu^n\to\mu$, as $n\to\infty$, in the 1-Wasserstein metric, and denote by $(X^n,B^{2,n},L^n)$ the corresponding solutions to (3.15). Define $\sigma^n(t,\omega_{\cdot\wedge t})\triangleq\sigma(t,\omega_{\cdot\wedge t},\mu^n_t)$, $(t,\omega)\in[0,T]\times\Omega$. The $\sigma^n$'s are functional Lipschitz deterministic functions, with Lipschitz constant independent of $n$. This and standard SDE arguments lead to
$$
E^{\mathbb{Q}}\Big\{\sup_{0\le t\le T}|X^n_t-X_t|^p+\sup_{0\le t\le T}|L^n_t-L_t|^p\Big\}\to0,\quad\text{as}\ n\to\infty,\ \text{for all}\ p\ge1. \qquad(3.16)
$$
We deduce that $U^n_t=E^{\mathbb{P}^n}[X^n_t|\mathcal{F}^Y_t]=S^n_t/S^{n,2}_t$ converges in probability under $\mathbb{Q}$ to $\frac{E^{\mathbb{Q}}[L_tX_t|\mathcal{F}^Y_t]}{E^{\mathbb{Q}}[L_t|\mathcal{F}^Y_t]}=E^{\mathbb{P}}[X_t|\mathcal{F}^Y_t]$, where $d\mathbb{P}\triangleq L_T\,d\mathbb{Q}$.

Now for any $\psi\in C_b(\mathbb{R})$, letting $n\to\infty$ we have
$$
\langle\psi,\mathcal{T}(\mu^n)_t\rangle=E^{\mathbb{P}^n}\big[\psi(E^{\mathbb{P}^n}[X^n_t|\mathcal{F}^Y_t])\big]=E^{\mathbb{Q}}\big[L^n_T\psi(E^{\mathbb{P}^n}[X^n_t|\mathcal{F}^Y_t])\big]\longrightarrow E^{\mathbb{Q}}\big[L_T\psi(E^{\mathbb{P}}[X_t|\mathcal{F}^Y_t])\big]=E^{\mathbb{P}}\big[\psi(E^{\mathbb{P}}[X_t|\mathcal{F}^Y_t])\big]=\langle\psi,\mathbb{P}\circ[E^{\mathbb{P}}[X_t|\mathcal{F}^Y_t]]^{-1}\rangle. \qquad(3.17)
$$
This implies that $\nu_t=\mathbb{P}\circ[E^{\mathbb{P}}[X_t|\mathcal{F}^Y_t]]^{-1}=\mathcal{T}(\mu)_t$, for all $t\in[0,T]$. With the same argument one shows that, for any $0\le t_1<t_2<\cdots<t_k<\infty$,
$$
\mathcal{T}(\mu^n)_{t_1,\cdots,t_k}\triangleq\mathbb{P}^n\circ\big(E^{\mathbb{P}^n}[X^n_{t_1}|\mathcal{F}^Y_{t_1}],\cdots,E^{\mathbb{P}^n}[X^n_{t_k}|\mathcal{F}^Y_{t_k}]\big)^{-1}\xrightarrow{d}\nu_{t_1,\cdots,t_k},\quad\text{as}\ n\to\infty.
$$
That is, the finite-dimensional distributions of $\mathcal{T}(\mu^n)$ converge to those of $\nu$, and as $\{\mathcal{T}(\mu^n)\}_{n\ge1}$ is tight by part (2), we conclude that $\mathcal{T}(\mu^n)\xrightarrow{w}\nu$ in $\mathcal{P}(C_T)$. This, together with (3.13), further shows that $W_2(\mathcal{T}(\mu^n),\mathcal{T}(\mu))\to0$, as $n\to\infty$, proving the continuity of $\mathcal{T}$, whence (3). The proof is now complete.

As a consequence of Theorem 3.4, we have the following existence result for SDE (3.2).

Proposition 3.5.
Let Assumption 3.1 hold. Then SDE (3.2) has at least one solution in the sense of Definition 3.3.

Proof.
The proof follows from Theorem 3.4 and a generalization of the Schauder Fixed Point Theorem by Cauty (see [13], or a recent generalization [14]). To do this we must check: (i) $\mathcal{E}$ is a convex subset of a Hausdorff topological linear space; (ii) $T$ is continuous and $T(\mathcal{E}) \subseteq \mathcal{E}$; and (iii) $T(\mathcal{E}) \subset K$, for some compact $K$ in $\mathcal{P}_1(\mathbb{C}_T)$.

To imbed $\mathcal{E}$ into a Hausdorff topological linear space, we borrow the argument of Li-Min [22]. Let $\mathcal{M}_1(\mathbb{C}_T)$ be the space of all bounded signed Borel measures $\nu(\cdot)$ on $\mathbb{C}_T$ such that $|\int_{\mathbb{C}_T}\|\varphi\|_{\mathbb{C}_T}\,\nu(d\varphi)| < +\infty$, endowed with the norm
$$\|\nu\| := \sup\Big\{\Big|\int_{\mathbb{C}_T} h\,d\nu\Big| : h \in \mathrm{Lip}_1(\mathbb{C}_T),\ |h(0)| \le 1\Big\},$$
where $\mathrm{Lip}_1(\mathbb{C}_T)$ denotes the set of all real-valued Lipschitz functions over $\mathbb{C}_T$ with Lipschitz constant 1. Then $(\mathcal{M}_1(\mathbb{C}_T), \|\cdot\|)$ is a normed (hence Hausdorff topological) linear space. Since $\mathcal{P}_2(\mathbb{C}_T) \subset \mathcal{P}_1(\mathbb{C}_T) \subset \mathcal{M}_1(\mathbb{C}_T)$, and by the Kantorovich-Rubinstein formula,
$$\mathcal{W}_1(\nu^1, \nu^2) = \sup\Big\{\Big|\int_{\mathbb{C}_T} h\,d(\nu^1 - \nu^2)\Big| : h \in \mathrm{Lip}_1(\mathbb{C}_T),\ |h(0)| \le 1\Big\} = \|\nu^1 - \nu^2\|,$$
for all $\nu^1, \nu^2 \in \mathcal{P}_1(\mathbb{C}_T)$, the topology generated by the norm $\|\cdot\|$ on $\mathcal{P}_1(\mathbb{C}_T)$ coincides with the one generated by the 1-Wasserstein metric on $\mathcal{P}_1(\mathbb{C}_T)$. Thus, $\mathcal{E} \subset \mathcal{P}_1(\mathbb{C}_T)$ is a convex subset of $\mathcal{M}_1(\mathbb{C}_T)$, proving (i). Further, note that $T: \mathcal{E} \to \mathcal{P}_1(\mathbb{C}_T)$ is continuous under the 1-Wasserstein metric, hence also under the $\|\cdot\|$-norm, verifying (ii). Finally, $T(\mathcal{E}) \subset \mathcal{E}$, and $\mathcal{E}$ is compact under the 2-Wasserstein metric, hence also under the $\|\cdot\|$-norm, proving (iii). We can now apply Cauty's theorem to conclude the existence of a fixed point $\mu \in \mathcal{E} \subset \mathcal{P}_1(\mathbb{C}_T)$ such that $T(\mu) = \mu$.

We note that the existence of the fixed point $\mu$ amounts to saying that SDE (3.15) has a solution on the probability space $(\Omega, \mathcal{F}, \mathbb{Q})$, with $\mu = \mu^{X|Y} = \mathbb{P}\circ[U]^{-1}$ and $U_t = E^{\mathbb{P}}[X_t|\mathcal{F}^Y_t]$, $t \ge 0$, where $d\mathbb{P} = L_T\,d\mathbb{Q}$ by construction. But this in turn defines a solution of (3.2) on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, thanks to the Girsanov transformation. Indeed, since under $\mathbb{P}$ the pair $(B, B^1)$ constructed in (3.15) is a Brownian motion, $(\Omega, \mathcal{F}, \mathbb{P}, X, Y, B, B^1)$ defines a (weak) solution of SDE (3.2).

In this section we investigate the uniqueness of the solution to SDE (3.2). Since the general uniqueness of the weak solution for this problem is quite difficult, we shall content ourselves with a version that is relatively more amenable.

To begin with, let $\mathbb{Q}$ be the reference probability measure under which $(B, Y)$ is a Brownian motion. For each $u \in L^{\infty-}_{\mathbb{F}^Y}(\mathbb{Q}, [0,T])$, consider the SDE on $(\Omega, \mathcal{F}, \mathbb{Q})$:
$$\begin{cases} dX^u_t = \sigma(t, X^u_{\cdot\wedge t}, \mu^{X^u|Y}_t, u_t)\,dB_t, & X^u_0 = x;\\ dB^1_t = dY_t - h(t, X^u_t)\,dt, & B^1_0 = 0;\\ dL^u_t = h(t, X^u_t)L^u_t\,dY_t, & L^u_0 = 1,\end{cases} \qquad (4.1)$$
where $\mu^{X^u|Y}_t := \mathbb{P}^u\circ[E^{\mathbb{P}^u}[X^u_t|\mathcal{F}^Y_t]]^{-1}$, and $d\mathbb{P}^u := L^u_T\,d\mathbb{Q}$. We shall argue that, under Assumption 2.4, the solution of the SDE (4.1) is pathwise unique.

Remark 4.1.
It should be clear that if $u \in L^{\infty-}_{\mathbb{F}^Y}(\mathbb{Q}, [0,T])$ and $(X^u, B^1, L^u)$ is a solution to (4.1) under $\mathbb{Q}$, then $u \in L^{\infty-}_{\mathbb{F}^Y}(\mathbb{P}^u, [0,T])$ (since $\frac{d\mathbb{P}^u}{d\mathbb{Q}} \in L^p(\Omega)$ for all $p > 1$, thanks to Assumption 2.4), and the process $(X^u, Y, B, B^1)$ is a solution to (2.4) and (2.5) on the probability space $(\Omega, \mathcal{F}, \mathbb{P}^u, \mathbb{F})$ in the sense of Definition 3.3, where $\mathbb{F} := \mathbb{F}^{B,Y}$. Conversely, if $(\Omega, \mathcal{F}, \mathbb{P}^u, \mathbb{F}, B, B^1, X, Y)$ is a weak solution of (2.4)-(2.5), then, following the argument above, $d\mathbb{Q} = [\bar L^u_T]^{-1}\,d\mathbb{P}^u$ defines a reference measure, where $\bar L^u$ is defined by (2.6) or (2.7), and $(X, B^1, [\bar L^u]^{-1})$ will be a solution of (4.1) with respect to the $\mathbb{Q}$-Brownian motion $(B, Y)$. In what follows we shall call the solution to (4.1) the $\mathbb{Q}$-dynamics of the system (2.4) and (2.5).

Bearing Remark 4.1 in mind, let us first try to establish a result in the spirit of the Yamada-Watanabe Theorem: the pathwise uniqueness of (4.1) implies the uniqueness in law for the original SDEs (2.4) and (2.5). To do this, we begin by noting that, given the "regular" nature of the canonical space $\Omega$, a process $u \in L^{\infty-}_{\mathbb{F}^Y}(\mathbb{P}^u, [0,T])$ amounts to saying that (cf., e.g., [23, 25]) there exists a progressively measurable functional $\mathbf{u}: [0,T] \times \mathbb{C}_T \to U$ such that $u_t(\omega) = \mathbf{u}(t, Y_{\cdot\wedge t}(\omega))$, $dt\,d\mathbb{P}^u$-a.s., and $u$ has all finite moments under $\mathbb{P}^u$ (hence also under $\mathbb{Q} \sim \mathbb{P}^u$!). We have the following proposition.

Proposition 4.2.
Assume that Assumption 2.4 is in force, and that pathwise uniqueness holds for SDE (4.1). Let $\mathbf{u}: [0,T] \times \Omega \to U$ be a given progressively measurable functional, and let $(\Omega, \mathcal{F}, \mathbb{P}^i, \mathbb{F}, B^{0,i}, B^{1,i}, X^i, Y^i)$, $i = 1, 2$, be two (weak) solutions of (2.4)-(2.5) corresponding to the controls $u^i = \mathbf{u}(\cdot, Y^i)$, $i = 1, 2$, respectively. Then, it holds that
$$\mathbb{P}^1\circ[(B^{0,1}, B^{1,1}, X^1, Y^1)]^{-1} = \mathbb{P}^2\circ[(B^{0,2}, B^{1,2}, X^2, Y^2)]^{-1}.$$

Proof.
Following the argument above, let us define $d\mathbb{Q}^{0,i} = [L^i_T]^{-1}\,d\mathbb{P}^i$, where $L^i = [\bar L^i]^{-1}$ and $\bar L^i$ is the unique solution of the SDE (2.6) with respect to $(X^i, B^{0,i}, Y^i)$, $i = 1, 2$. Then, as the $\mathbb{Q}^{0,i}$-dynamics, $(X^i, B^{0,i}, L^i)$ satisfies (4.1), $\mathbb{Q}^{0,i}$-a.s., $i = 1, 2$. In particular, we recall from (3.6) that
$$U^{X^i|Y^i}_t = E^{\mathbb{P}^i}[X^i_t|\mathcal{F}^{Y^i}_t] = \frac{E^{\mathbb{Q}^{0,i}}[L^i_tX^i_t|\mathcal{F}^{Y^i}_t]}{E^{\mathbb{Q}^{0,i}}[L^i_t|\mathcal{F}^{Y^i}_t]}, \quad \mathbb{Q}^{0,i}\text{-a.s.},\ t \in [0,T].$$
Thus, there exist two progressively measurable functionals $\Phi^i: [0,T] \times \Omega \to \mathbb{R}$ such that $U^{X^i|Y^i}_t = \Phi^i(t, Y^i_{\cdot\wedge t})$, $dt\,d\mathbb{Q}^{0,i}$-a.s., $i = 1, 2$. We now consider an intermediate SDE on $(\Omega, \mathcal{F}, \mathbb{Q}^{0,2})$:
$$\begin{cases} d\widehat X_t = \sigma(t, \widehat X_{\cdot\wedge t}, \Phi^1(t, Y^2_{\cdot\wedge t}), \mathbf{u}(t, Y^2_{\cdot\wedge t}))\,dB^{0,2}_t, & \widehat X_0 = x;\\ d\widehat L_t = h(t, \widehat X_t)\widehat L_t\,dY^2_t, & \widehat L_0 = 1, \quad t \in [0,T].\end{cases} \qquad (4.2)$$
Clearly, compared to (4.1) for the $\mathbb{Q}^{0,1}$-dynamics $(X^1, B^{0,1}, L^1)$, this SDE has the same coefficients $\widehat\sigma(t, \omega, \varphi_{\cdot\wedge t}) := \sigma(t, \varphi_{\cdot\wedge t}, \Phi^1(t, \omega_{\cdot\wedge t}), \mathbf{u}(t, \omega_{\cdot\wedge t}))$ and $h(t,x)\ell$, which are jointly measurable, uniformly Lipschitz in $\varphi$, and of linear growth (in $\ell$), uniformly in $(t, \omega, \varphi, \ell)$, thanks to Assumption 2.4, except that it is driven by the $\mathbb{Q}^{0,2}$-Brownian motion $(B^{0,2}, Y^2)$. Thus, by the classical SDE theory (cf., e.g., [18]) we know that there exists a (unique) measurable functional $\Psi: \mathbb{C}_T \times \mathbb{C}_T \to \mathbb{C}_T \times \mathbb{C}_T$ such that $(X^1, L^1) = \Psi(B^{0,1}, Y^1)$, $\mathbb{Q}^{0,1}$-a.s., and $(\widehat X, \widehat L) = \Psi(B^{0,2}, Y^2)$, $\mathbb{Q}^{0,2}$-a.s. Since $\mathbb{Q}^{0,1}\circ(B^{0,1}, Y^1)^{-1} = \mathbb{Q}^{0,2}\circ(B^{0,2}, Y^2)^{-1} = \mathbb{Q}_0$, the Wiener measure on $(\Omega, \mathcal{F})$, we deduce that
$$\mathbb{Q}^{0,1}\circ(B^{0,1}, Y^1, X^1, L^1)^{-1} = \mathbb{Q}^{0,2}\circ(B^{0,2}, Y^2, \widehat X, \widehat L)^{-1}. \qquad (4.3)$$
We now claim that $(\widehat X, B^{0,2}, \widehat L)$ coincides with the $\mathbb{Q}^{0,2}$-dynamics of (2.4)-(2.5). Indeed, it suffices to argue that in SDE (4.2),
$$\Phi^1(t, Y^2_{\cdot\wedge t}) = E^{\widehat{\mathbb{P}}}[\widehat X_t|\mathcal{F}^{Y^2}_t] = U^{\widehat X|Y^2}_t, \quad \mathbb{Q}^{0,2}\text{-a.s.}, \qquad (4.4)$$
where $d\widehat{\mathbb{P}} := \widehat L_T\,d\mathbb{Q}^{0,2}$. To see this, we note that, for all $t \in [0,T]$ and any bounded Borel measurable function $f: \mathbb{C}_T \to \mathbb{R}$, it follows from (4.3) and the definition of $U^{X|Y}_t$ that
$$E^{\widehat{\mathbb{P}}}[f(Y^2_{\cdot\wedge t})\Phi^1(t, Y^2_{\cdot\wedge t})] = E^{\mathbb{Q}^{0,2}}[\widehat L_tf(Y^2_{\cdot\wedge t})\Phi^1(t, Y^2_{\cdot\wedge t})] = E^{\mathbb{Q}^{0,1}}[L^1_tf(Y^1_{\cdot\wedge t})\Phi^1(t, Y^1_{\cdot\wedge t})]$$
$$= E^{\mathbb{P}^1}[f(Y^1_{\cdot\wedge t})U^{X^1|Y^1}_t] = E^{\mathbb{P}^1}[f(Y^1_{\cdot\wedge t})X^1_t] = E^{\mathbb{Q}^{0,1}}[L^1_tf(Y^1_{\cdot\wedge t})X^1_t] = E^{\mathbb{Q}^{0,2}}[\widehat L_tf(Y^2_{\cdot\wedge t})\widehat X_t]$$
$$= E^{\widehat{\mathbb{P}}}[f(Y^2_{\cdot\wedge t})\widehat X_t] = E^{\widehat{\mathbb{P}}}[f(Y^2_{\cdot\wedge t})U^{\widehat X|Y^2}_t],$$
proving (4.4), whence the claim.

Now, by the pathwise uniqueness of SDE (4.1), we conclude that $(X^2, L^2) = (\widehat X, \widehat L)$, $\mathbb{Q}^{0,2}$-a.s.
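As an aside, the Kallianpur-Striebel ratio $U_t = E^{\mathbb{Q}}[L_tX_t|\mathcal{F}^Y_t]/E^{\mathbb{Q}}[L_t|\mathcal{F}^Y_t]$ that drives the argument above is precisely what a weighted particle approximation computes: under the reference measure the signal particles evolve independently of the observation, each carrying its likelihood weight $L_t$. The following minimal sketch (Euler discretization, scalar state; all numerical choices and names are illustrative assumptions, not part of the paper) indicates the computation:

```python
import numpy as np

def ks_ratio(y_increments, dt, h, sigma, x0, n_particles=5000, seed=0):
    """Approximate the Kallianpur-Striebel ratio
    U_t = E^Q[L_t X_t | F^Y_t] / E^Q[L_t | F^Y_t]
    by weighted Monte Carlo particles: under the reference measure Q
    the signal particles evolve independently of Y, and each carries
    the likelihood weight L_t solving dL = h(X) L dY."""
    rng = np.random.default_rng(seed)
    x = np.full(n_particles, x0, dtype=float)
    logL = np.zeros(n_particles)
    for dy in y_increments:
        # Euler step for the signal dX = sigma(X) dB under Q
        x += sigma(x) * np.sqrt(dt) * rng.standard_normal(n_particles)
        # log of the stochastic exponential: L_t = exp(∫h dY - ½∫h² dt)
        logL += h(x) * dy - 0.5 * h(x) ** 2 * dt
    w = np.exp(logL - logL.max())      # normalize for numerical stability
    return np.sum(w * x) / np.sum(w)   # the ratio U_t
```

With $h \equiv 0$ the weights stay constant and the ratio collapses to the plain sample mean of $X_t$, as one expects from the formula.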
Thus (4.3) implies that $\mathbb{Q}^{0,1}\circ[(B^{0,1}, Y^1, X^1, L^1)]^{-1} = \mathbb{Q}^{0,2}\circ[(B^{0,2}, Y^2, X^2, L^2)]^{-1}$, and consequently, $\mathbb{P}^1\circ[(B^{0,1}, B^{1,1}, X^1, Y^1)]^{-1} = \mathbb{P}^2\circ[(B^{0,2}, B^{1,2}, X^2, Y^2)]^{-1}$. This proves the uniqueness in law for the system (2.4)-(2.5).

We now turn our attention to the main result of this section: the pathwise uniqueness of (4.1). We shall establish some fundamental estimates which will also be useful in our future discussions. Since all controlled dynamics are constructed via the reference probability space $(\Omega, \mathcal{F}, \mathbb{Q})$, we shall consider only their $\mathbb{Q}$-dynamics, namely the solution to (4.1). Recall the space $L^p(\mathbb{Q}; L^2([0,T]))$, $p > 1$, and the norm $\|\cdot\|_{p,2,\mathbb{Q}}$ defined by (2.3). We have the following important result.

Proposition 4.3.
Assume that Assumption 2.4 is in force. Let $u, v \in \mathcal{U}_{ad}$ be given. Then, for any $p > 2$, there exists a constant $C_p > 0$ such that the following estimates hold:
$$\text{(i)}\quad E^{\mathbb{Q}}\Big[\sup_{0\le s\le T}\big(|X^u_s - X^v_s|^2 + |L^u_s - L^v_s|^2 + |X^u_sL^u_s - X^v_sL^v_s|^2\big)\Big] \le C_p\,\|u - v\|^2_{p,2,\mathbb{Q}}; \qquad (4.5)$$
$$\text{(ii)}\quad E^{\mathbb{Q}}\Big[\sup_{0\le s\le T}|X^u_s - X^v_s|^p\Big] \le C_p\,\|u - v\|^p_{p,2,\mathbb{Q}}. \qquad (4.6)$$

Proof.
We split the proof into several steps. Throughout this proof we let $C > 0$ denote a generic constant depending only on the time horizon $T > 0$ and the bounds in Assumption 2.4; it is allowed to vary from line to line.
Step 1 (Estimate for $X$). First let us denote, for any $u \in \mathcal{U}_{ad}$,
$$\sigma_u(t, \varphi_{\cdot\wedge t}, \mu^u_t) \triangleq \int_{\mathbb{R}} \sigma(t, \varphi_{\cdot\wedge t}, y, u_t)\,\mu^u_t(dy), \quad (t, \varphi) \in [0,T] \times \mathbb{C}_T, \qquad (4.7)$$
where $\mu^u_t \triangleq \mu^{X^u|Y}_t = \mathbb{P}^u\circ(E^{\mathbb{P}^u}[X^u_t|\mathcal{F}^Y_t])^{-1}$, $t \ge 0$. Then, we have
$$|\sigma_u(t, X^u_{\cdot\wedge t}, \mu^u_t) - \sigma_v(t, X^v_{\cdot\wedge t}, \mu^v_t)| = \Big|\int_{\mathbb{R}} \sigma(t, X^u_{\cdot\wedge t}, y, u_t)\,\mu^u_t(dy) - \int_{\mathbb{R}} \sigma(t, X^v_{\cdot\wedge t}, y, v_t)\,\mu^v_t(dy)\Big| \qquad (4.8)$$
$$\le C\Big\{|u_t - v_t| + \sup_{0\le s\le t}|X^u_s - X^v_s| + \Big|\int_{\mathbb{R}} \sigma(t, X^v_{\cdot\wedge t}, y, v_t)[\mu^u_t(dy) - \mu^v_t(dy)]\Big|\Big\}.$$
Next, let us denote $S^u_t = E^{\mathbb{Q}}[L^u_tX^u_t|\mathcal{F}^Y_t]$ and $S^{u,1}_t = E^{\mathbb{Q}}[L^u_t|\mathcal{F}^Y_t]$, and define $S^v_t$, $S^{v,1}_t$ in a similar way. By (2.19) and the fact that $d\mathbb{P}^u = L^u_T\,d\mathbb{Q}$, we see that
$$\Big|\int_{\mathbb{R}} \sigma(t, X^v_{\cdot\wedge t}, y, v_t)[\mu^u_t(dy) - \mu^v_t(dy)]\Big| = \Big|E^u[\sigma(t, \varphi_{\cdot\wedge t}, E^u[X^u_t|\mathcal{F}^Y_t], z)] - E^v[\sigma(t, \varphi_{\cdot\wedge t}, E^v[X^v_t|\mathcal{F}^Y_t], z)]\Big|_{\varphi = X^v, z = v_t} \qquad (4.9)$$
$$= \Big|E^{\mathbb{Q}}\Big\{L^u_t\,\sigma\Big(t, \varphi_{\cdot\wedge t}, \frac{E^{\mathbb{Q}}[L^u_tX^u_t|\mathcal{F}^Y_t]}{E^{\mathbb{Q}}[L^u_t|\mathcal{F}^Y_t]}, z\Big) - L^v_t\,\sigma\Big(t, \varphi_{\cdot\wedge t}, \frac{E^{\mathbb{Q}}[L^v_tX^v_t|\mathcal{F}^Y_t]}{E^{\mathbb{Q}}[L^v_t|\mathcal{F}^Y_t]}, z\Big)\Big\}\Big|_{\varphi = X^v, z = v_t} \le I_1 + I_2,$$
where (noting the definition of $S^u$, $S^{u,1}$ and the fact that they are both $\mathbb{F}^Y$-adapted)
$$I_1 = \Big|E^{\mathbb{Q}}\Big\{L^u_t\,\sigma\Big(t, \varphi_{\cdot\wedge t}, \frac{S^u_t}{S^{u,1}_t}, z\Big) - L^v_t\,\sigma\Big(t, \varphi_{\cdot\wedge t}, \frac{S^u_t}{S^{v,1}_t}, z\Big)\Big\}\Big|_{\varphi = X^v, z = v_t} = \Big|E^{\mathbb{Q}}\Big\{S^{u,1}_t\,\sigma\Big(t, \varphi_{\cdot\wedge t}, \frac{S^u_t}{S^{u,1}_t}, z\Big) - S^{v,1}_t\,\sigma\Big(t, \varphi_{\cdot\wedge t}, \frac{S^u_t}{S^{v,1}_t}, z\Big)\Big\}\Big|_{\varphi = X^v, z = v_t};$$
and
$$I_2 = \Big|E^{\mathbb{Q}}\Big\{L^v_t\Big[\sigma\Big(t, \varphi_{\cdot\wedge t}, \frac{S^u_t}{S^{v,1}_t}, z\Big) - \sigma\Big(t, \varphi_{\cdot\wedge t}, \frac{S^v_t}{S^{v,1}_t}, z\Big)\Big]\Big\}\Big|_{\varphi = X^v, z = v_t} = \Big|E^{\mathbb{Q}}\Big\{S^{v,1}_t\Big[\sigma\Big(t, \varphi_{\cdot\wedge t}, \frac{S^u_t}{S^{v,1}_t}, z\Big) - \sigma\Big(t, \varphi_{\cdot\wedge t}, \frac{S^v_t}{S^{v,1}_t}, z\Big)\Big]\Big\}\Big|_{\varphi = X^v, z = v_t}.$$
Since $\sigma$ is Lipschitz in $y$, it follows that
$$I_2 \le C\,E^{\mathbb{Q}}\Big\{S^{v,1}_t\,\frac{|S^u_t - S^v_t|}{S^{v,1}_t}\Big\} \le C\,E^{\mathbb{Q}}[|L^u_tX^u_t - L^v_tX^v_t|]. \qquad (4.10)$$
To estimate $I_1$, we write $\hat\sigma(t, \omega, \varphi_{\cdot\wedge t}, y, z) = y\,\sigma\big(t, \varphi_{\cdot\wedge t}, \frac{S^u_t(\omega)}{y}, z\big)$. Since
$$\partial_y\hat\sigma(t, \omega, \varphi_{\cdot\wedge t}, y, z) = \sigma\Big(t, \varphi_{\cdot\wedge t}, \frac{S^u_t(\omega)}{y}, z\Big) - \frac{S^u_t(\omega)}{y}\,\partial_y\sigma\Big(t, \varphi_{\cdot\wedge t}, \frac{S^u_t(\omega)}{y}, z\Big), \qquad (4.11)$$
we see that $\partial_y\hat\sigma(t, \omega, \varphi_{\cdot\wedge t}, y, z)$ is uniformly bounded, thanks to Assumption 2.4-(iv). Thus we have
$$I_1 \le C\,\|\partial_y\hat\sigma\|_\infty\,E^{\mathbb{Q}}|S^{u,1}_t - S^{v,1}_t| \le C\,E^{\mathbb{Q}}|L^u_t - L^v_t|. \qquad (4.12)$$
Now note that (4.1) implies that $X^u_t - X^v_t = \int_0^t[\sigma_u(s, X^u_{\cdot\wedge s}, \mu^u_s) - \sigma_v(s, X^v_{\cdot\wedge s}, \mu^v_s)]\,dB_s$. Combining (4.8)-(4.12), we see that
$$E^{\mathbb{Q}}\Big[\sup_{0\le s\le t}|X^u_s - X^v_s|^p\Big] \le C\,E^{\mathbb{Q}}\Big\{\Big[\int_0^t\Big(\sup_{r\in[0,s]}|X^u_r - X^v_r|^2 + |u_s - v_s|^2 + (E^{\mathbb{Q}}|L^u_s - L^v_s|)^2 + (E^{\mathbb{Q}}|L^u_sX^u_s - L^v_sX^v_s|)^2\Big)\,ds\Big]^{p/2}\Big\}. \qquad (4.13)$$
Applying the Gronwall inequality we obtain that
$$E^{\mathbb{Q}}\Big[\sup_{0\le s\le t}|X^u_s - X^v_s|^p\Big] \le C\,E^{\mathbb{Q}}\Big\{\Big[\int_0^t\big(|u_s - v_s|^2 + (E^{\mathbb{Q}}[|L^u_s - L^v_s|])^2 + (E^{\mathbb{Q}}[|L^u_sX^u_s - L^v_sX^v_s|])^2\big)\,ds\Big]^{p/2}\Big\}. \qquad (4.14)$$

Step 2 (Estimate for $L$). We first note that, for $t \in [0,T]$,
$$|L^u_th(t, X^u_t) - L^v_th(t, X^v_t)| = \Big|L^u_th\Big(t, \frac{L^u_tX^u_t}{L^u_t}\Big) - L^v_th\Big(t, \frac{L^v_tX^v_t}{L^v_t}\Big)\Big| \le \Big|L^u_th\Big(t, \frac{L^u_tX^u_t}{L^u_t}\Big) - L^u_th\Big(t, \frac{L^v_tX^v_t}{L^u_t}\Big)\Big| + \Big|L^u_th\Big(t, \frac{L^v_tX^v_t}{L^u_t}\Big) - L^v_th\Big(t, \frac{L^v_tX^v_t}{L^v_t}\Big)\Big| \qquad (4.15)$$
$$\le C|L^u_tX^u_t - L^v_tX^v_t| + \Big|L^u_th\Big(t, \frac{L^v_tX^v_t}{L^u_t}\Big) - L^v_th\Big(t, \frac{L^v_tX^v_t}{L^v_t}\Big)\Big|.$$
To estimate the second term above we define, as before, $\hat h(t, \omega, x) \triangleq x\,h\big(t, \frac{L^v_t(\omega)X^v_t(\omega)}{x}\big)$. Then, similar to (4.11), one shows that $\partial_x\hat h(t, \omega, x)$ is uniformly bounded, thanks to Assumption 2.4-(v). Consequently, we have
$$\Big|L^u_th\Big(t, \frac{L^v_tX^v_t}{L^u_t}\Big) - L^v_th\Big(t, \frac{L^v_tX^v_t}{L^v_t}\Big)\Big| = |\hat h(t, L^u_t) - \hat h(t, L^v_t)| \le \|\partial_x\hat h\|_\infty\,|L^u_t - L^v_t|. \qquad (4.16)$$
Now, combining (4.15) and (4.16) we obtain
$$|L^u_th(t, X^u_t) - L^v_th(t, X^v_t)| \le C\big(|L^u_t - L^v_t| + |L^u_tX^u_t - L^v_tX^v_t|\big). \qquad (4.17)$$
Therefore, noting that $L^u_t = 1 + \int_0^th(s, X^u_s)L^u_s\,dY_s$, we deduce from (4.17) and Gronwall's inequality that
$$E^{\mathbb{Q}}\Big[\sup_{0\le s\le t}|L^u_s - L^v_s|^2\Big] \le C\,E^{\mathbb{Q}}\Big[\int_0^t|L^u_sX^u_s - L^v_sX^v_s|^2\,ds\Big], \quad 0 \le t \le T. \qquad (4.18)$$

Step 3 (Estimate for $L_tX_t$). It is clear from (4.14) and (4.18) that it suffices to estimate $L^u_tX^u_t - L^v_tX^v_t$ in terms of $u - v$. To see this we note that
$$L^u_tX^u_t = x + \int_0^tL^u_sX^u_sh(s, X^u_s)\,dY_s + \int_0^tL^u_s\,E^{\mathbb{P}^u}[\sigma(s, \varphi_{\cdot\wedge s}, E^{\mathbb{P}^u}[X^u_s|\mathcal{F}^Y_s], z)]\big|_{\varphi = X^u, z = u_s}\,dB_s. \qquad (4.19)$$
Now define $\tilde h(t, x) \triangleq x\,h(t, x)$. Then it is easily seen that, as $h$ satisfies Assumption 2.4-(vi), $\tilde h$ satisfies Assumption 2.4-(v). Thus, similar to (4.17) we have
$$|L^u_sX^u_sh(s, X^u_s) - L^v_sX^v_sh(s, X^v_s)| = |L^u_s\tilde h(s, X^u_s) - L^v_s\tilde h(s, X^v_s)| \le C\big(|L^u_s - L^v_s| + |L^u_sX^u_s - L^v_sX^v_s|\big). \qquad (4.20)$$
On the other hand, for any $u \in \mathcal{U}_{ad}$, recalling (4.7) for the notations $\sigma_u$ and $\mu^u$, we have
$$\Delta^{u,v}_t \triangleq \Big|L^u_t\,E^{\mathbb{P}^u}[\sigma(t, \varphi_{\cdot\wedge t}, E^{\mathbb{P}^u}[X^u_t|\mathcal{F}^Y_t], z)]\big|_{\varphi = X^u, z = u_t} - L^v_t\,E^{\mathbb{P}^v}[\sigma(t, \varphi_{\cdot\wedge t}, E^{\mathbb{P}^v}[X^v_t|\mathcal{F}^Y_t], z)]\big|_{\varphi = X^v, z = v_t}\Big| = \big|L^u_t\sigma_u(t, X^u_{\cdot\wedge t}, \mu^u_t) - L^v_t\sigma_v(t, X^v_{\cdot\wedge t}, \mu^v_t)\big|.$$
Then, following a similar argument as in Step 1, we have
$$\Delta^{u,v}_t \le C\,L^v_t\big(E^{\mathbb{Q}}[|L^u_t - L^v_t|] + E^{\mathbb{Q}}[|X^u_tL^u_t - X^v_tL^v_t|]\big) + C\big(|L^u_t - L^v_t| + |L^u_tX^u_t - L^v_tX^v_t|\big) + C\,L^v_t|u_t - v_t|.$$
Squaring both sides above and then taking expectations we easily deduce that
$$E^{\mathbb{Q}}[|\Delta^{u,v}_t|^2] \le C\big(E^{\mathbb{Q}}[|L^u_t - L^v_t|^2] + E^{\mathbb{Q}}[|X^u_tL^u_t - X^v_tL^v_t|^2]\big) + C\,E^{\mathbb{Q}}[(L^v_t)^2|u_t - v_t|^2]. \qquad (4.21)$$
Now, combining (4.19)-(4.21), we have, for $p > 2$ and some constant $C_p > 0$,
$$E^{\mathbb{Q}}\Big[\sup_{0\le s\le t}|L^u_sX^u_s - L^v_sX^v_s|^2\Big] \le C\,E^{\mathbb{Q}}\Big[\int_0^t|L^u_sX^u_sh(s, X^u_s) - L^v_sX^v_sh(s, X^v_s)|^2\,ds\Big] + C\,E^{\mathbb{Q}}\int_0^t|\Delta^{u,v}_s|^2\,ds \qquad (4.22)$$
$$\le C_p\Big\{E^{\mathbb{Q}}\Big[\Big(\int_0^t|u_s - v_s|^2\,ds\Big)^{p/2}\Big]\Big\}^{2/p} + C\,E^{\mathbb{Q}}\int_0^t|L^u_s - L^v_s|^2\,ds + C\,E^{\mathbb{Q}}\int_0^t|L^u_sX^u_s - L^v_sX^v_s|^2\,ds.$$
Applying the Gronwall inequality then yields
$$E^{\mathbb{Q}}\Big[\sup_{0\le s\le t}|L^u_sX^u_s - L^v_sX^v_s|^2\Big] \le C_p\,\|u - v\|^2_{p,2,\mathbb{Q}} + C\,E^{\mathbb{Q}}\int_0^t|L^u_s - L^v_s|^2\,ds. \qquad (4.23)$$
Combining (4.23) with (4.18) and applying the Gronwall inequality again, we conclude that
$$E^{\mathbb{Q}}\Big[\sup_{0\le s\le t}|L^u_s - L^v_s|^2\Big] \le C_p\,\|u - v\|^2_{p,2,\mathbb{Q}}. \qquad (4.24)$$
This, together with (4.14) and (4.23), implies (4.5); (4.6) then follows easily from (4.5) and (4.13), proving the proposition.

A direct consequence of Proposition 4.3 is the following uniqueness result.

Corollary 4.4.
Assume that Assumption 2.4 holds. Then the solution to SDE (4.1) is pathwise unique.

Proof.
Setting $u = v$ in Proposition 4.3 we obtain the result.

We are now ready to study the stochastic control problem with partial observations. We first note that, in principle, for each $(\mathbb{P}^u, u) \in \mathcal{U}_{ad}$ our state-observation dynamics $(X^u, Y^u)$ lives on the probability space $(\Omega, \mathcal{F}, \mathbb{P}^u)$, which varies with the control $u$. We shall consider the $\mathbb{Q}$-dynamics so that our analysis can be carried out on a common probability space, thanks to Assumption 2.3. Therefore, in what follows, for each $(\mathbb{P}^u, u) \in \mathcal{U}_{ad}$ we consider only the $\mathbb{Q}$-dynamics $(X^u, Y, L^u)$, which satisfies the following SDE:
$$\begin{cases} dX^u_t = \sigma_u(t, X^u_{\cdot\wedge t}, \mu^u_t)\,dB_t, & X^u_0 = x;\\ dB^{1,u}_t = dY_t - h(t, X^u_t)\,dt, & B^{1,u}_0 = 0;\\ dL^u_t = h(t, X^u_t)L^u_t\,dY_t, & L^u_0 = 1, \quad t \ge 0,\end{cases} \qquad (5.1)$$
where $(B, Y)$ is a $\mathbb{Q}$-Brownian motion, $d\mathbb{P}^u = L^u_T\,d\mathbb{Q}$, and $\mu^u_t = \mu^{X^u|Y}_t = \mathbb{P}^u\circ[E^{\mathbb{P}^u}[X^u_t|\mathcal{F}^Y_t]]^{-1}$. For simplicity, we denote $E^u[\,\cdot\,] \triangleq E^{\mathbb{P}^u}[\,\cdot\,]$ and $E[\,\cdot\,] \triangleq E^{\mathbb{Q}}[\,\cdot\,]$.

Remark 5.1.
A convenient and practical way to identify admissible controls is to simply consider the space $L^{\infty-}_{\mathbb{F}^Y}(\mathbb{Q}; [0,T])$ (cf. Definition 2.1), which is well-defined independently of the control, thanks to Assumption 2.3. It is easy to check that, under Assumption 2.4, $u \in L^{\infty-}_{\mathbb{F}^Y}(\mathbb{Q}; [0,T])$ if and only if $u \in L^{\infty-}_{\mathbb{F}^Y}(\mathbb{P}^u; [0,T])$. Therefore, in what follows, by $u \in \mathcal{U}_{ad}$ we mean that $u \in L^{\infty-}_{\mathbb{F}^Y}(\mathbb{Q}; [0,T])$.

We recall that for $u \in \mathcal{U}_{ad}$ and $\mu \in \mathcal{P}_1(\mathbb{C}_T)$, the coefficient $\sigma_u$ in (5.1) is defined by (4.7). Thus we can write the cost functional as
$$J(u) \triangleq E\Big\{\Phi(X^u_T, \mu^u_T) + \int_0^T f_u(s, X^u_s, \mu^u_s)\,ds\Big\}. \qquad (5.2)$$
An admissible control $u^* \in \mathcal{U}_{ad}$ is said to be optimal if
$$J(u^*) = \inf_{u\in\mathcal{U}_{ad}} J(u). \qquad (5.3)$$
We remark that the cost functional $J(\cdot)$ involves the law of the conditional expectation of the solution in a nonlinear way. Therefore, such a control problem is intrinsically "time-inconsistent" and, thus, the dynamic programming approach in general does not apply. For this reason, we shall consider only the necessary condition for an optimal solution, that is, Pontryagin's Maximum Principle.

To this end, we let $u^* \in \mathcal{U}_{ad}$ be an optimal control, and consider the convex variations of $u^*$:
$$u^{\theta,v}_t := u^*_t + \theta(v_t - u^*_t), \quad t \in [0,T],\ 0 < \theta < 1,\ v \in \mathcal{U}_{ad}. \qquad (5.4)$$
Here, $u^*, v \in L^{\infty-}_{\mathbb{F}^Y}(\mathbb{Q}; [0,T])$. Since $U$ is convex, $u^{\theta,v}_t \in U$, for all $t \in [0,T]$, $v \in \mathcal{U}_{ad}$, and $\theta \in (0,1)$. Denote by $(X^{\theta,v}, Y, L^{\theta,v})$ the corresponding $\mathbb{Q}$-dynamics that satisfies (5.1), with control $u^{\theta,v}$. Applying Proposition 4.3 ((4.5) and (4.6)) and noting that $Y$ is a Brownian motion under $\mathbb{Q}$, we get, for $p > 2$,
$$\lim_{\theta\to 0}E\Big[\sup_{0\le t\le T}|X^{\theta,v}_t - X^{u^*}_t|^2\Big] \le C_p\lim_{\theta\to 0}\|u^{\theta,v} - u^*\|^2_{p,2,\mathbb{Q}} = 0; \qquad (5.5)$$
$$\lim_{\theta\to 0}E\Big[\sup_{0\le t\le T}|L^{\theta,v}_t - L^{u^*}_t|^2\Big] = 0. \qquad (5.6)$$
In the rest of the section we shall derive, heuristically, the "variational equations" which play a fundamental role in the study of the Maximum Principle. The complete proof will be given in the next section.
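Before turning to the variational equations, it may help to recall what the convex variation (5.4) buys: the quotient $(J(u^{\theta,v}) - J(u^*))/\theta$ converges, as $\theta \to 0$, to a directional derivative, which is the object the maximum principle characterizes. A toy finite-dimensional sketch (the quadratic cost below is a stand-in chosen purely for illustration, not the functional (5.2)):

```python
import numpy as np

def directional_derivative(J, u, v, theta=1e-6):
    """One-sided difference quotient of J along the convex variation
    u + theta*(v - u), mirroring (J(u^{theta,v}) - J(u)) / theta."""
    return (J(u + theta * (v - u)) - J(u)) / theta

# Toy quadratic cost J(u) = |u - a|^2; its exact directional
# derivative at u in direction (v - u) is 2 <u - a, v - u>.
a = np.array([1.0, -2.0])
J = lambda u: float(np.sum((u - a) ** 2))
u_star = np.array([0.5, 0.0])
v = np.array([2.0, 1.0])

numeric = directional_derivative(J, u_star, v)
exact = 2.0 * float(np.dot(u_star - a, v - u_star))
```

For this quadratic example the quotient equals the exact derivative up to a $O(\theta)$ remainder, which is exactly the pattern exploited in (5.5)-(5.6).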
For notational simplicity we shall write $u = u^*$ for the optimal control from now on, bearing in mind that all discussions are carried out for the $\mathbb{Q}$-dynamics, hence on the same probability space.

Now for $u^1, u^2 \in \mathcal{U}_{ad}$, let $(X^1, L^1)$ and $(X^2, L^2)$ denote the corresponding solutions of (5.1). We define $\delta X = \delta X^{1,2} = \delta X^{u^1,u^2} \triangleq X^{u^1} - X^{u^2}$ and $\delta L = \delta L^{1,2} = \delta L^{u^1,u^2} \triangleq L^{u^1} - L^{u^2}$, and will often drop the superscript "$1,2$" when the context is clear. Then $\delta X$ and $\delta L$ satisfy the equations:
$$\delta X_t = \int_0^t[\sigma_{u^1}(s, X^1_{\cdot\wedge s}, \mu^1_s) - \sigma_{u^2}(s, X^2_{\cdot\wedge s}, \mu^2_s)]\,dB_s; \qquad \delta L_t = \int_0^t[L^1_sh(s, X^1_s) - L^2_sh(s, X^2_s)]\,dY_s. \qquad (5.7)$$
As before, let $U^i_t \triangleq E^{u^i}[X^i_t|\mathcal{F}^Y_t]$ and $\mu^i_t = \mathbb{P}^{u^i}\circ[U^i_t]^{-1}$, $t \ge 0$, $i = 1, 2$. We can easily check that
$$\sigma_{u^1}(t, X^1_{\cdot\wedge t}, \mu^1_t) - \sigma_{u^2}(t, X^2_{\cdot\wedge t}, \mu^2_t) = E\big\{L^1_t\sigma(t, \varphi^1_{\cdot\wedge t}, U^1_t, z^1) - L^2_t\sigma(t, \varphi^2_{\cdot\wedge t}, U^2_t, z^2)\big\}\Big|_{\varphi^1 = X^1, \varphi^2 = X^2;\ z^1 = u^1_t, z^2 = u^2_t}$$
$$= E\Big\{\delta L^{1,2}_t\,\sigma(t, \varphi^1_{\cdot\wedge t}, U^1_t, z^1) + L^2_t\Big[\int_0^1D_\varphi\sigma(t, \varphi^2_{\cdot\wedge t} + \lambda(\varphi^1_{\cdot\wedge t} - \varphi^2_{\cdot\wedge t}), U^1_t, z^1)(\varphi^1_{\cdot\wedge t} - \varphi^2_{\cdot\wedge t})\,d\lambda \qquad (5.8)$$
$$+ \int_0^1\partial_y\sigma(t, \varphi^2_{\cdot\wedge t}, U^2_t + \lambda(U^1_t - U^2_t), z^1)\,d\lambda\cdot(U^1_t - U^2_t) + \int_0^1\partial_z\sigma(t, \varphi^2_{\cdot\wedge t}, U^2_t, z^2 + \lambda(z^1 - z^2))\,d\lambda\cdot(z^1 - z^2)\Big]\Big\}\Big|_{\varphi^1 = X^1, \varphi^2 = X^2;\ z^1 = u^1_t, z^2 = u^2_t}.$$
Now let $u^1 = u^{\theta,v}$ and $u^2 = u^* = u$, and denote
$$\delta_\theta X \triangleq \delta_\theta X^{u,v} = \frac{X^{\theta,v} - X^u}{\theta}, \quad \delta_\theta L \triangleq \delta_\theta L^{u,v} = \frac{L^{\theta,v} - L^u}{\theta}, \quad \delta_\theta U \triangleq \delta_\theta U^{u,v} = \frac{U^{\theta,v} - U^u}{\theta}.$$
Combining (5.7) and (5.8) we have
$$\delta_\theta X_t = \int_0^t\Big\{E\{\delta_\theta L_s\,\sigma(s, \varphi_{\cdot\wedge s}, U^{\theta,v}_s, z)\}\big|_{\varphi = X^{\theta,v}, z = u^{\theta,v}_s} + [D\sigma]^{\theta,u,v}_s(\delta_\theta X_{\cdot\wedge s}) + E\{B^{\theta,u,v}(s, \varphi_{\cdot\wedge s}, z)\,\delta_\theta U_s\}\big|_{\varphi = X^u; z = u^{\theta,v}_s} + C^{\theta,u,v}_\sigma(s)(v_s - u_s)\Big\}\,dB_s, \qquad (5.9)$$
where
$$[D\sigma]^{\theta,u,v}_t(\psi) = E\Big\{L^u_t\int_0^1D_\varphi\sigma(t, \varphi^2_{\cdot\wedge t} + \lambda(\varphi^1_{\cdot\wedge t} - \varphi^2_{\cdot\wedge t}), U^{\theta,v}_t, z)(\psi)\,d\lambda\Big\}\Big|_{\varphi^1 = X^{\theta,v}, \varphi^2 = X^u, z = u^{\theta,v}_t},$$
$$B^{\theta,u,v}(t, \varphi_{\cdot\wedge t}, z) = L^u_t\int_0^1\partial_y\sigma(t, \varphi_{\cdot\wedge t}, U^u_t + \lambda(U^{\theta,v}_t - U^u_t), z)\,d\lambda, \qquad (5.10)$$
$$C^{\theta,u,v}_\sigma(t) = E\Big\{L^u_t\int_0^1\partial_z\sigma(t, \varphi_{\cdot\wedge t}, U^u_t, z^2 + \lambda(z^1 - z^2))\,d\lambda\Big\}\Big|_{\varphi = X^u;\ z^1 = u^{\theta,v}_t, z^2 = u_t}.$$
Here the integral involving the Fréchet derivative $D_\varphi\sigma$ is in the sense of Bochner. Noting that $U^{\theta,v}_t = \frac{E[L^{\theta,v}_tX^{\theta,v}_t|\mathcal{F}^Y_t]}{E[L^{\theta,v}_t|\mathcal{F}^Y_t]}$ and $U^u_t = \frac{E[L^u_tX^u_t|\mathcal{F}^Y_t]}{E[L^u_t|\mathcal{F}^Y_t]}$, we can easily check that
$$\delta_\theta U_t = \frac{E[L^u_t|\mathcal{F}^Y_t]\,E[L^{\theta,v}_tX^{\theta,v}_t|\mathcal{F}^Y_t] - E[L^{\theta,v}_t|\mathcal{F}^Y_t]\,E[L^u_tX^u_t|\mathcal{F}^Y_t]}{\theta\,E[L^{\theta,v}_t|\mathcal{F}^Y_t]\,E[L^u_t|\mathcal{F}^Y_t]} \qquad (5.11)$$
$$= \frac{E[L^u_t|\mathcal{F}^Y_t]\,E[\delta_\theta L_tX^{\theta,v}_t + L^u_t\delta_\theta X_t|\mathcal{F}^Y_t] - E[\delta_\theta L_t|\mathcal{F}^Y_t]\,E[L^u_tX^u_t|\mathcal{F}^Y_t]}{E[L^{\theta,v}_t|\mathcal{F}^Y_t]\,E[L^u_t|\mathcal{F}^Y_t]} = \frac{E[\delta_\theta L_tX^{\theta,v}_t + L^u_t\delta_\theta X_t|\mathcal{F}^Y_t]}{E[L^{\theta,v}_t|\mathcal{F}^Y_t]} - \frac{E[\delta_\theta L_t|\mathcal{F}^Y_t]}{E[L^{\theta,v}_t|\mathcal{F}^Y_t]}\,U^u_t.$$
Letting $\theta \to 0$, and assuming that
$$K_t = K^{u,v}_t \triangleq \lim_{\theta\to 0}\delta_\theta X^{u,v}_t; \qquad R_t = R^{u,v}_t \triangleq \lim_{\theta\to 0}\delta_\theta L^{u,v}_t \qquad (5.12)$$
both exist in $L^2(\mathbb{Q})$, it then follows from (5.7)-(5.11) that, at least formally,
$$K_t = \int_0^t\Big\{E[R_s\sigma(s, \varphi_{\cdot\wedge s}, U^u_s, z)]\big|_{\varphi = X^u, z = u_s} + [D\sigma]^{u,v}_s(K_{\cdot\wedge s}) + E\Big[B^{u,v}(s, \varphi_{\cdot\wedge s}, z)\Big(\frac{E[R_sX^u_s + L^u_sK_s|\mathcal{F}^Y_s]}{E[L^u_s|\mathcal{F}^Y_s]} - \frac{E[R_s|\mathcal{F}^Y_s]}{E[L^u_s|\mathcal{F}^Y_s]}U^u_s\Big)\Big]\Big|_{\varphi = X^u; z = u_s} + C^{u,v}_\sigma(s)(v_s - u_s)\Big\}\,dB_s, \qquad (5.13)$$
where
$$[D\sigma]^{u,v}_t(\psi) \triangleq E\{L^u_tD_\varphi\sigma(t, \varphi_{\cdot\wedge t}, U^u_t, z)(\psi)\}\big|_{\varphi = X^u; z = u_t}, \quad B^{u,v}(t, \varphi_{\cdot\wedge t}, z) \triangleq L^u_t\,\partial_y\sigma(t, \varphi_{\cdot\wedge t}, U^u_t, z), \quad C^{u,v}_\sigma(t) \triangleq E\{L^u_t\,\partial_z\sigma(t, \varphi_{\cdot\wedge t}, U^u_t, z)\}\big|_{\varphi = X^u; z = u_t}. \qquad (5.14)$$
Observing also that $U^u_t$ is $\mathcal{F}^Y_t$-measurable, we have
$$E\Big[B^{u,v}(s, \varphi_{\cdot\wedge s}, z)\Big(\frac{E[R_sX^u_s + L^u_sK_s|\mathcal{F}^Y_s]}{E[L^u_s|\mathcal{F}^Y_s]} - \frac{E[R_s|\mathcal{F}^Y_s]}{E[L^u_s|\mathcal{F}^Y_s]}U^u_s\Big)\Big]\Big|_{\varphi = X^u; z = u_s} = E^u\Big[\partial_y\sigma(s, \varphi_{\cdot\wedge s}, U^u_s, z)\,E^u\big\{(L^u_s)^{-1}R_s[X^u_s - U^u_s] + K_s\,\big|\,\mathcal{F}^Y_s\big\}\Big]\Big|_{\varphi = X^u; z = u_s} \qquad (5.15)$$
$$= E^u\big[(L^u_s)^{-1}\,\partial_y\sigma(s, \varphi_{\cdot\wedge s}, U^u_s, z)\{R_s[X^u_s - U^u_s] + L^u_sK_s\}\big]\big|_{\varphi = X^u; z = u_s} = E\big[\partial_y\sigma(s, \varphi_{\cdot\wedge s}, U^u_s, z)(R_sX^u_s + L^u_sK_s) - U^u_s\,\partial_y\sigma(s, \varphi_{\cdot\wedge s}, U^u_s, z)R_s\big]\big|_{\varphi = X^u; z = u_s}.$$
Consequently, if we define
$$\Psi(t, \varphi_{\cdot\wedge t}, x, y, z) \triangleq \sigma(t, \varphi_{\cdot\wedge t}, y, z) + \partial_y\sigma(t, \varphi_{\cdot\wedge t}, y, z)(x - y), \qquad (5.16)$$
then we can rewrite (5.13) as
$$K_t = \int_0^t\Big\{E\big[\Psi(s, \varphi_{\cdot\wedge s}, X^u_s, U^u_s, z)R_s + \partial_y\sigma(s, \varphi_{\cdot\wedge s}, U^u_s, z)L^u_sK_s\big]\big|_{\varphi = X^u; z = u_s} + [D\sigma]^{u,v}_s(K_{\cdot\wedge s}) + C^{u,v}_\sigma(s)(v_s - u_s)\Big\}\,dB_s. \qquad (5.17)$$
Similarly, we can formally write down the SDE for $R$:
$$R_t = \int_0^t[R_sh(s, X^u_s) + L^u_s\,\partial_xh(s, X^u_s)K_s]\,dY_s, \quad t \ge 0.$$
$(5.18)$

The following theorem addresses the well-posedness of the SDEs (5.17) and (5.18).

Theorem 5.2. Assume that Assumption 2.4 is in force, and let $u, v \in L^{\infty-}_{\mathbb{F}^Y}(\mathbb{Q}; [0,T])$ be given. Then, there is a unique solution $(K, R) \in L^{\infty-}_{\mathbb{F}}(\mathbb{Q}; \mathbb{C}_T)$ to the SDEs (5.17) and (5.18).

Proof. Let $u, v \in L^{\infty-}_{\mathbb{F}^Y}(\mathbb{Q}; [0,T])$ be given. We define $F^1_t(K, R)$ and $F^2_t(K, R)$, $t \in [0,T]$, to be the right-hand sides of (5.17) and (5.18), respectively.

We first observe that $F^1_t(0, 0) = \int_0^tC^{u,v}_\sigma(s)(v_s - u_s)\,dB_s$, and $F^2_t(0, 0) \equiv 0$, $t \in [0,T]$. Then, for any $p \ge 2$, it holds that
$$E^u\Big[\sup_{0\le s\le t}|F^1_s(0, 0)|^p\Big] \le C_p\,E^u\Big[\Big(\int_0^t|v_s - u_s|^2\,ds\Big)^{p/2}\Big], \quad t \in [0,T]. \qquad (5.19)$$
Now let $(K^i, R^i) \in L^{\infty-}_{\mathbb{F}}(\mathbb{Q}; \mathbb{C}_T)$, $i = 1, 2$. We define $\widetilde K^i \triangleq F^1(K^i, R^i)$, $\widetilde R^i \triangleq F^2(K^i, R^i)$, $i = 1, 2$, and $\bar K \triangleq K^1 - K^2$, $\bar R \triangleq R^1 - R^2$, $\widehat K \triangleq \widetilde K^1 - \widetilde K^2$, and $\widehat R \triangleq \widetilde R^1 - \widetilde R^2$. Then, noting that $\sigma$, $\partial_y\sigma$, $y\partial_y\sigma$, and $\partial_z\sigma$ are all bounded, thanks to Assumption 2.4, we see that
$$|\Psi(t, \varphi_{\cdot\wedge t}, x, y, z)| \le C(1 + |x|), \quad (t, x, y, z) \in [0,T] \times \mathbb{R}^3,\ \varphi \in \mathbb{C}_T,$$
where, and in what follows, $C > 0$ is a generic constant. Consequently,
$$\Big|E\big[\Psi(t, \varphi_{\cdot\wedge t}, X^u_t, U^u_t, z)\bar R_t + \partial_y\sigma(t, \varphi_{\cdot\wedge t}, U^u_t, z)L^u_t\bar K_t\big]\Big| \le C\,E\big[(1 + |X^u_t|)|\bar R_t| + |L^u_t\bar K_t|\big] \le C\Big\{E\big[|\bar K_t|^2 + |\bar R_t|^2\big]\Big\}^{1/2}. \qquad (5.20)$$
Furthermore, since $D_\varphi\sigma$ is also bounded, we have $|[D\sigma]^{u,v}_t(\psi)| \le C\sup_{0\le s\le t}|\psi(s)|$, for $\psi \in \mathbb{C}_T$. Then from the definition of $\widehat K$ and (5.20) we have, for any $p \ge 2$ and $t \in [0,T]$,
$$E\Big[\sup_{0\le s\le t}|\widehat K_s|^p\Big] \le C_p\int_0^t\big(E[|\bar R_s|^2 + |\bar K_s|^2]\big)^{p/2}\,ds + C_p\int_0^tE\Big[\sup_{0\le r\le s}|\bar K_r|^p\Big]\,ds. \qquad (5.21)$$
On the other hand, recalling the definition of $\widehat R$, the boundedness of $h$ and $\partial_xh$ implies that, for $p \ge 2$ and $t \in [0,T]$,
$$\Big(E\Big[\sup_{s\le t}|\widehat R_s|^p\Big]\Big)^2 \le C_p\int_0^t\big(E[|\bar R_s|^p]\big)^2\,ds + C_p\int_0^t\big(E[|L^u_s\bar K_s|^p]\big)^2\,ds \le C_p\int_0^t\big(E[|\bar R_s|^p]\big)^2\,ds + C_p\int_0^tE[|\bar K_s|^{2p}]\,ds. \qquad (5.22)$$
Combining (5.21) (applied with $2p$) and (5.22), and using the Lyapunov inequality, we have, for $t \in [0,T]$,
$$E\Big[\sup_{0\le s\le t}|\widehat K_s|^{2p}\Big] + \Big(E\Big[\sup_{0\le s\le t}|\widehat R_s|^p\Big]\Big)^2 \le C_p\int_0^t\Big(E\Big[\sup_{0\le r\le s}|\bar K_r|^{2p}\Big] + \Big(E\Big[\sup_{0\le r\le s}|\bar R_r|^p\Big]\Big)^2\Big)\,ds.$$
A standard contraction (Picard iteration) argument then yields a unique solution $(K, R) \in L^{\infty-}_{\mathbb{F}}(\mathbb{Q}; \mathbb{C}_T)$ of (5.17) and (5.18), such that for all $p \ge 2$,
$$E\big[\|K\|^p_{\mathbb{C}_T}\big] + E\big[\|R\|^p_{\mathbb{C}_T}\big] \le C_p\,\|v - u\|^p_{p,2,\mathbb{Q}}. \qquad (5.23)$$
We leave the details to the interested reader; this completes the proof.

In this section we validate the heuristic arguments of the previous section and derive the variational equation of the optimal trajectory rigorously. Recall the processes $\delta_\theta X = \delta_\theta X^{u,v}$, $\delta_\theta L = \delta_\theta L^{u,v}$, and $(K, R)$ defined in the previous section. Denote
$$\eta^\theta_t \triangleq \delta_\theta X_t - K_t, \qquad \tilde\eta^\theta_t \triangleq \delta_\theta L_t - R_t, \quad t \in [0,T].$$
$(6.1)$

Our main purpose in this section is to prove the following result.

Proposition 6.1. Let $(\mathbb{P}^u, u) = (\mathbb{P}^{u^*}, u^*) \in \mathcal{U}_{ad}$ be an optimal control, $(X^u, L^u)$ be the corresponding solution of (5.1), and let $U^u_t = E^u[X^u_t|\mathcal{F}^Y_t]$, $t \ge 0$. For any $v \in \mathcal{U}_{ad}$, let $(K, R) = (K^{u,v}, R^{u,v})$ be the solution of the linear equations (5.17) and (5.18). Then, for all $p \ge 2$, it holds that
$$\lim_{\theta\to 0}E[\|\eta^\theta\|^p_{\mathbb{C}_T}] = \lim_{\theta\to 0}E\Big[\sup_{s\in[0,T]}\Big|\frac{X^{\theta,v}_s - X^u_s}{\theta} - K_s\Big|^p\Big] = 0; \qquad (6.2)$$
$$\lim_{\theta\to 0}E[\|\tilde\eta^\theta\|^p_{\mathbb{C}_T}] = \lim_{\theta\to 0}E\Big[\sup_{s\in[0,T]}\Big|\frac{L^{\theta,v}_s - L^u_s}{\theta} - R_s\Big|^p\Big] = 0. \qquad (6.3)$$
The proof of Proposition 6.1 is quite lengthy, so we split it into two parts.

[Proof of (6.3)]. This part is relatively easy. A direct calculation using the equations (5.7) and (5.18) shows that $\tilde\eta^\theta$ satisfies the following SDE:
$$\tilde\eta^\theta_t = \int_0^t\tilde\eta^\theta_rh(r, X^{\theta,v}_r)\,dY_r + \int_0^tL^u_r\int_0^1\partial_xh(r, X^u_r + \lambda\theta(\eta^\theta_r + K_r))\,\eta^\theta_r\,d\lambda\,dY_r + I^{1,\theta}_t + I^{2,\theta}_t, \qquad (6.4)$$
where
$$I^{1,\theta}_t = \int_0^tR_r\big(h(r, X^{\theta,v}_r) - h(r, X^u_r)\big)\,dY_r; \qquad I^{2,\theta}_t = \int_0^tL^u_r\int_0^1\partial_xh(r, X^u_r + \lambda\theta(\eta^\theta_r + K_r))\,K_r\,d\lambda\,dY_r - \int_0^tL^u_r\,\partial_xh(r, X^u_r)K_r\,dY_r.$$
We claim that, for all $p \ge 2$,
$$\lim_{\theta\to 0}E^u\Big[\sup_{t\in[0,T]}|I^{1,\theta}_t|^p\Big] = 0, \qquad \lim_{\theta\to 0}E^u\Big[\sup_{t\in[0,T]}|I^{2,\theta}_t|^p\Big] = 0. \qquad (6.5)$$
Indeed, note that $dY_t = dB^{1,u}_t + h(t, X^u_t)\,dt$, and $B^{1,u}$ is a $\mathbb{P}^u$-Brownian motion. Proposition 4.3, together with the boundedness and continuity of $h$ and $\partial_xh$, leads to, for all $p \ge 2$,
$$E^u\Big[\sup_{t\in[0,T]}|I^{1,\theta}_t|^p\Big] = E\Big[L^u_T\sup_{t\in[0,T]}\Big|\int_0^tR_s[h(s, X^{\theta,v}_s) - h(s, X^u_s)]\,dY_s\Big|^p\Big]$$
$$\le 2^{p-1}E^u\Big[\sup_{t\in[0,T]}\Big|\int_0^tR_s[h(s, X^{\theta,v}_s) - h(s, X^u_s)]\,dB^{1,u}_s\Big|^p\Big] + 2^{p-1}E\Big[L^u_T\sup_{t\in[0,T]}\Big|\int_0^tR_s[h(s, X^{\theta,v}_s) - h(s, X^u_s)]\,h(s, X^u_s)\,ds\Big|^p\Big]$$
$$\le C_p\,E\Big[L^u_T\int_0^T|R_s|^p\big(|X^{\theta,v}_s - X^u_s|^p \wedge 1\big)\,ds\Big] \le C_p\big\{E[(L^u_T)^2]\big\}^{1/2}\Big\{E\Big[\sup_{s\in[0,T]}|R_s|^{4p}\Big]\Big\}^{1/4}\Big\{E\Big[\sup_{s\in[0,T]}\big(|X^{\theta,v}_s - X^u_s|^{4p} \wedge 1\big)\Big]\Big\}^{1/4} \le C_p\,\|u - u^{\theta,v}\|^p_{4p,2,\mathbb{Q}} \le C|\theta|^p,$$
where we used the following estimate, valid for any $f \in L^\infty(\mathbb{R})$ bounded by $C_0 \ge 1/2$:
$$|f(x) - f(x')|^p \le \big(2C_0(|f(x) - f(x')| \wedge 1)\big)^p \le (2C_0)^p\big(|f(x) - f(x')| \wedge 1\big), \quad \forall p \ge 1. \qquad (6.6)$$
Similarly, we have
$$E^u\Big[\sup_{t\in[0,T]}|I^{2,\theta}_t|^p\Big] = E\Big[L^u_T\sup_{t\in[0,T]}\Big|\int_0^tL^u_rK_r\Big[\int_0^1\big(\partial_xh(r, X^u_r + \lambda\theta(\eta^\theta_r + K_r)) - \partial_xh(r, X^u_r)\big)\,d\lambda\Big]\,dY_r\Big|^p\Big]$$
$$\le C_p\,E\Big[L^u_T\int_0^T|L^u_r|^p|K_r|^p\Big[\int_0^1\big|\partial_xh(r, X^u_r + \lambda\theta(\eta^\theta_r + K_r)) - \partial_xh(r, X^u_r)\big|\,d\lambda\Big]^p\,dr\Big]$$
$$\le C_p\Big\{E\Big[\int_0^T\Big(\int_0^1\big[|\partial_xh(r, X^u_r + \lambda\theta(\eta^\theta_r + K_r)) - \partial_xh(r, X^u_r)| \wedge 1\big]\,d\lambda\Big)^2\,dr\Big]\Big\}^{1/2}.$$
Here the second inequality follows from (6.6) applied to $\partial_xh$, the Hölder inequality, and the fact that $L^u, K \in L^{\infty-}_{\mathbb{F}}(\mathbb{Q}; \mathbb{C}_T)$ (see Theorem 5.2), and the last inequality follows from the $L^p$-estimate (5.23).
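The elementary bound (6.6) can be sanity-checked numerically; the function and sample points below are arbitrary illustrative choices, not taken from the paper:

```python
import itertools, math

# Numerical check of (6.6): for |f| <= C0 with C0 >= 1/2 and p >= 1,
#   |f(x) - f(x')|^p  <=  (2*C0)^p * min(|f(x) - f(x')|, 1).
# If the difference is <= 1, raising it to p >= 1 only shrinks it;
# otherwise it is capped by 2*C0 = (2*C0) * min(d, 1).
C0 = 1.0
f = math.tanh                        # bounded by C0 = 1
xs = [-3.0, -0.7, 0.0, 0.4, 2.5]
for (x, xp), p in itertools.product(itertools.combinations(xs, 2), [1, 2, 3.5, 6]):
    d = abs(f(x) - f(xp))
    assert d ** p <= (2 * C0) ** p * min(d, 1.0) + 1e-12
```

The point of the bound, as used above, is to trade a high power of a bounded difference for a single truncated factor, which is what lets dominated convergence finish the argument.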
Now, from (4.5), (5.17), and (5.18) we see that
$$E\Big[\sup_{t\in[0,T]}\big(|\eta^\theta_t|^2 + |K_t|^2\big)\Big] \le C, \quad \theta \in (0,1).$$
Thus $\theta[\|\eta^\theta\|_{\mathbb{C}_T} + \|K\|_{\mathbb{C}_T}] \to 0$ in probability $\mathbb{Q}$, as $\theta \to 0$; the continuity of $\partial_xh$ and the Bounded Convergence Theorem then imply (6.5), proving the claim. Recalling (6.4), we see that (6.3) follows from (6.5), provided (6.2) holds, which we now substantiate.

[Proof of (6.2)]. This part is more involved. We first rewrite (5.9) as follows:
$$\delta_\theta X_t = \int_0^t\Big\{E\{(\tilde\eta^\theta_s + R_s)\,\sigma(s, \varphi_{\cdot\wedge s}, U^{\theta,v}_s, z)\}\big|_{\varphi = X^{\theta,v}, z = u^{\theta,v}_s} + [D\sigma]^{\theta,u,v}_s(\eta^\theta_{\cdot\wedge s} + K_{\cdot\wedge s}) + E\{B^{\theta,u,v}(s, \varphi_{\cdot\wedge s}, z)\,\delta_\theta U_s\}\big|_{\varphi = X^u; z = u^{\theta,v}_s} + C^{\theta,u,v}_\sigma(s)(v_s - u_s)\Big\}\,dB_s. \qquad (6.7)$$
Here $[D\sigma]^{\theta,u,v}$, $B^{\theta,u,v}$, and $C^{\theta,u,v}_\sigma$ are defined by (5.10). Furthermore, in light of (5.11), we can also write:
$$\delta_\theta U_t = \frac{E[(\tilde\eta^\theta_t + R_t)X^{\theta,v}_t + L^u_t(\eta^\theta_t + K_t)|\mathcal{F}^Y_t]}{E[L^{\theta,v}_t|\mathcal{F}^Y_t]} - \frac{E[\tilde\eta^\theta_t + R_t|\mathcal{F}^Y_t]}{E[L^{\theta,v}_t|\mathcal{F}^Y_t]}\,U^u_t.$$
Plugging this into (6.7) we have
$$\delta_\theta X_t = \int_0^t\Big\{E\{\tilde\eta^\theta_s\,\sigma(s, \varphi_{\cdot\wedge s}, U^{\theta,v}_s, z)\}\big|_{\varphi = X^{\theta,v}, z = u^{\theta,v}_s} + [D\sigma]^{\theta,u,v}_s(\eta^\theta_{\cdot\wedge s}) + E\Big\{B^{\theta,u,v}(s, \varphi_{\cdot\wedge s}, z)\Big[\frac{E[\tilde\eta^\theta_sX^{\theta,v}_s + L^u_s\eta^\theta_s|\mathcal{F}^Y_s]}{E[L^{\theta,v}_s|\mathcal{F}^Y_s]} - \frac{E[\tilde\eta^\theta_s|\mathcal{F}^Y_s]}{E[L^{\theta,v}_s|\mathcal{F}^Y_s]}U^u_s\Big]\Big\}\Big|_{\varphi = X^u; z = u^{\theta,v}_s}\Big\}\,dB_s$$
$$+ \int_0^t\Big\{E\{R_s\,\sigma(s, \varphi_{\cdot\wedge s}, U^{\theta,v}_s, z)\}\big|_{\varphi = X^{\theta,v}, z = u^{\theta,v}_s} + [D\sigma]^{\theta,u,v}_s(K_{\cdot\wedge s}) + E\Big\{B^{\theta,u,v}(s, \varphi_{\cdot\wedge s}, z)\Big[\frac{E[R_sX^{\theta,v}_s + L^u_sK_s|\mathcal{F}^Y_s]}{E[L^{\theta,v}_s|\mathcal{F}^Y_s]} - \frac{E[R_s|\mathcal{F}^Y_s]}{E[L^{\theta,v}_s|\mathcal{F}^Y_s]}U^u_s\Big]\Big\}\Big|_{\varphi = X^{\theta,v}; z = u^{\theta,v}_s} + C^{\theta,u,v}_\sigma(s)(v_s - u_s)\Big\}\,dB_s.$$
Now, recalling (5.17) (or, more conveniently, (5.13)) we have
\begin{align}
\eta^\theta_t=\delta_\theta X_t-K_t=&\int_0^t\Big\{E\{\tilde\eta^\theta_s\sigma(s,\varphi_{\cdot\wedge s},U^{\theta,v}_s,z)\}\Big|_{\varphi=X^{\theta,v},\,z=u^{\theta,v}_s}+[D\sigma]^{\theta,u,v}_s(\eta^\theta_{\cdot\wedge s})\nonumber\\
&\quad+E\Big\{B^{\theta,u,v}(s,\varphi_{\cdot\wedge s},z)\Big[\frac{E[\tilde\eta^\theta_sX^{\theta,v}_s+L^u_s\eta^\theta_s\,|\,\mathcal F^Y_s]}{E[L^{\theta,v}_s\,|\,\mathcal F^Y_s]}-\frac{E[\tilde\eta^\theta_s\,|\,\mathcal F^Y_s]}{E[L^{\theta,v}_s\,|\,\mathcal F^Y_s]}U^u_s\Big]\Big\}\Big|_{\varphi=X^u;\,z=u^{\theta,v}_s}\Big\}\,dB_s\nonumber\\
&+I^{3,\theta,1}_t+I^{3,\theta,2}_t+I^{3,\theta,3}_t+I^{3,\theta,4}_t, \tag{6.8}
\end{align}
where, for $t\in[0,T]$,
\begin{align}
I^{3,\theta,1}_t&\triangleq\int_0^t\Big[E\big\{R_s\sigma(s,\varphi_{\cdot\wedge s},U^{\theta,v}_s,z)\big\}\Big|_{\varphi=X^{\theta,v},\,z=u^{\theta,v}_s}-E\big\{R_s\sigma(s,\varphi_{\cdot\wedge s},U^u_s,z)\big\}\Big|_{\varphi=X^u,\,z=u_s}\Big]\,dB_s;\nonumber\\
I^{3,\theta,2}_t&\triangleq\int_0^t E\big\{[D\sigma]^{\theta,u,v}_s(K_{\cdot\wedge s})-[D\sigma]^{u,v}_s(K_{\cdot\wedge s})\big\}\,dB_s; \tag{6.9}\\
I^{3,\theta,3}_t&\triangleq\int_0^t\Big\{E\Big\{B^{\theta,u,v}(s,\varphi_{\cdot\wedge s},z)\Big(\frac{E[R_sX^{\theta,v}_s+L^u_sK_s\,|\,\mathcal F^Y_s]}{E[L^{\theta,v}_s\,|\,\mathcal F^Y_s]}-\frac{E[R_s\,|\,\mathcal F^Y_s]}{E[L^{\theta,v}_s\,|\,\mathcal F^Y_s]}U^u_s\Big)\Big\}\Big|_{\varphi=X^u;\,z=u^{\theta,v}_s}\nonumber\\
&\qquad\quad-E\Big\{B^{u,v}(s,\varphi_{\cdot\wedge s},z)\Big(\frac{E[R_sX^u_s+L^u_sK_s\,|\,\mathcal F^Y_s]}{E[L^u_s\,|\,\mathcal F^Y_s]}-\frac{E[R_s\,|\,\mathcal F^Y_s]}{E[L^u_s\,|\,\mathcal F^Y_s]}U^u_s\Big)\Big\}\Big|_{\varphi=X^u;\,z=u_s}\Big\}\,dB_s;\nonumber\\
I^{3,\theta,4}_t&\triangleq\int_0^t E\big[C^{\theta,u,v}_\sigma(s)(v_s-u_s)-C^{u,v}_\sigma(s)(v_s-u_s)\big]\,dB_s.\nonumber
\end{align}
We have the following lemma.
Lemma 6.2.
Suppose that Assumption 2.4 holds. Then, for all $p>1$,
\[
\lim_{\theta\to 0}E\Big\{\sup_{0\le t\le T}|I^{3,\theta,i}_t|^p\Big\}=0,\qquad i=1,\cdots,4. \tag{6.10}
\]
Proof.
We first recall that $U^{\theta,v}_s\triangleq E^{\theta,v}[X^{\theta,v}_s|\mathcal F^Y_s]$ and $U^u_s\triangleq E^u[X^u_s|\mathcal F^Y_s]$. Using the Kallianpur–Striebel formula we have
\begin{align}
E\int_0^T|U^{\theta,v}_s-U^u_s|^p\,ds&\le C_p\Big\{E\int_0^T\Big|\frac{E[L^{\theta,v}_sX^{\theta,v}_s\,|\,\mathcal F^Y_s]}{E[L^{\theta,v}_s\,|\,\mathcal F^Y_s]}-\frac{E[L^u_sX^u_s\,|\,\mathcal F^Y_s]}{E[L^{\theta,v}_s\,|\,\mathcal F^Y_s]}\Big|^p ds\nonumber\\
&\qquad+E\int_0^T\Big|\frac{E[L^u_sX^u_s\,|\,\mathcal F^Y_s]}{E[L^{\theta,v}_s\,|\,\mathcal F^Y_s]}-\frac{E[L^u_sX^u_s\,|\,\mathcal F^Y_s]}{E[L^u_s\,|\,\mathcal F^Y_s]}\Big|^p ds\Big\} \tag{6.11}\\
&\triangleq C_p\{J^\theta_1+J^\theta_2\}.\nonumber
\end{align}
We now estimate $J^\theta_1$ and $J^\theta_2$ respectively. First note that, for any $p>1$, we can find a constant $C_p>0$ such that for all $\theta\in(0,1)$ and $u\in\mathcal U_{ad}$,
\[
E[(L^{\theta,v}_s)^p]+E[(L^{\theta,v}_s)^{-p}]+E[(L^u_s)^p]\le C_p.
\]
Thus, applying the Hölder and Jensen inequalities as well as Proposition 4.3, we have, for any $p>1$ and $\theta\in(0,1]$,
\begin{align}
&E\int_0^T\Big|\frac{E[L^{\theta,v}_sX^{\theta,v}_s\,|\,\mathcal F^Y_s]-E[L^u_sX^u_s\,|\,\mathcal F^Y_s]}{E[L^{\theta,v}_s\,|\,\mathcal F^Y_s]}\Big|^p ds
\ \le\ \int_0^T E\Big\{\frac{|L^{\theta,v}_sX^{\theta,v}_s-L^u_sX^u_s|^p}{E[L^{\theta,v}_s\,|\,\mathcal F^Y_s]^p}\Big\}ds\nonumber\\
&\quad\le\int_0^T\big\{E|L^{\theta,v}_sX^{\theta,v}_s-L^u_sX^u_s|^2\big\}^{1/2}\Big\{E\Big[\frac{|L^{\theta,v}_sX^{\theta,v}_s-L^u_sX^u_s|^{2p-2}}{E[L^{\theta,v}_s\,|\,\mathcal F^Y_s]^{2p}}\Big]\Big\}^{1/2}ds \tag{6.12}\\
&\quad\le\int_0^T\big\{E|L^{\theta,v}_sX^{\theta,v}_s-L^u_sX^u_s|^2\big\}^{1/2}\Big\{E\Big[|L^{\theta,v}_sX^{\theta,v}_s-L^u_sX^u_s|^{2p-2}\,E\big[(L^{\theta,v}_s)^{-2p}\,\big|\,\mathcal F^Y_s\big]\Big]\Big\}^{1/2}ds\nonumber\\
&\quad\le C_p\,\theta^p\|u-v\|^p_{2,Q}.\nonumber
\end{align}
Similarly, one can also argue that, for any $p>1$, the following estimate holds:
\[
E\int_0^T\Big|E[L^{\theta,v}_s\,|\,\mathcal F^Y_s]-E[L^u_s\,|\,\mathcal F^Y_s]\Big|^p ds\le C_p\,\theta^p\|u-v\|^p_{2,Q},\qquad\theta\in(0,1]. \tag{6.13}
\]
Clearly, (6.12) and (6.13) imply that $J^\theta_1+J^\theta_2\le C_p\,\theta^p\|u-v\|^p_{2,Q}$, for some constant $C_p>0$ depending only on $p$, the Lipschitz constant of the coefficients, and $T$. Therefore we have
\[
E\int_0^T|U^{\theta,v}_s-U^u_s|^p\,ds\le C_p\,\theta^p\|u-v\|^p_{2,Q}\to 0,\qquad\text{as }\theta\to 0. \tag{6.14}
\]
We can now prove (6.10) for $i=1,\cdots,4$. First, by the Burkholder–Davis–Gundy inequality we have
\[
E\Big[\sup_{0\le t\le T}|I^{3,\theta,1}_t|^2\Big]\le C\int_0^T E\Big|E\big\{R_s\sigma(s,\varphi_{\cdot\wedge s},U^{\theta,v}_s,z)\big\}\Big|_{\varphi=X^{\theta,v},\,z=u^{\theta,v}_s}-E\big\{R_s\sigma(s,\varphi_{\cdot\wedge s},U^u_s,z)\big\}\Big|_{\varphi=X^u,\,z=u_s}\Big|^2 ds.
\]
Since $\sigma$ is bounded and Lipschitz continuous in $(\varphi,y,z)$, it follows from Proposition 4.3 and (6.14) that $\lim_{\theta\to 0}E[\sup_{0\le t\le T}|I^{3,\theta,1}_t|^2]=0$. By similar arguments, using the continuity of $D_\varphi\sigma$ and that of $\partial_z\sigma$, respectively, it is not hard to show that, for all $p>1$,
\[
\lim_{\theta\to 0}E\Big[\sup_{0\le t\le T}|I^{3,\theta,2}_t|^p\Big]=0;\qquad\lim_{\theta\to 0}E\Big[\sup_{0\le t\le T}|I^{3,\theta,4}_t|^p\Big]=0.
\]
It remains to prove the convergence of $I^{3,\theta,3}$. To this end, we note that, for any $p>1$,
\[
E\Big[\sup_{s\in[0,T]}\big(|R_s|^p+|K_s|^p\big)\Big]\le C_p, \tag{6.15}
\]
and by (6.14) we have, for $p>1$,
\[
\lim_{\theta\to 0}E\int_0^T\Big|E\big\{\big|B^{\theta,u,v}(s,\varphi_{\cdot\wedge s},z)-B^{u,v}(s,\varphi_{\cdot\wedge s},z')\big|\big\}\Big|_{\varphi=X^u,\,z=u^{\theta,v}_s,\,z'=u_s}\Big|^p ds=0. \tag{6.16}
\]
This, together with (6.13), (6.14), an estimate similar to (6.12), and Proposition 4.3, yields that $\lim_{\theta\to 0}E[\sup_{0\le t\le T}|I^{3,\theta,3}_t|^2]=0$, proving the lemma.

We now continue the proof of (6.2).
First we rewrite (6.8) as
\begin{align}
\eta^\theta_t=&\int_0^t\Big\{E\{\tilde\eta^\theta_s\sigma(s,\varphi_{\cdot\wedge s},U^{\theta,v}_s,z)\}\Big|_{\varphi=X^{\theta,v},\,z=u^{\theta,v}_s}+[D\sigma]^{\theta,u,v}_s(\eta^\theta_{\cdot\wedge s})\nonumber\\
&\quad+E\Big\{B^{\theta,u,v}(s,\varphi_{\cdot\wedge s},z)\Big[\frac{E[\tilde\eta^\theta_sX^{\theta,v}_s+L^u_s\eta^\theta_s\,|\,\mathcal F^Y_s]}{E[L^u_s\,|\,\mathcal F^Y_s]}-\frac{E[\tilde\eta^\theta_s\,|\,\mathcal F^Y_s]}{E[L^u_s\,|\,\mathcal F^Y_s]}U^u_s\Big]\Big\}\Big|_{\varphi=X^u;\,z=u^{\theta,v}_s}\Big\}\,dB_s\nonumber\\
&+I^{3,\theta,0}_t+\sum_{i=1}^4 I^{3,\theta,i}_t, \tag{6.17}
\end{align}
where
\begin{align*}
I^{3,\theta,0}_t\triangleq\int_0^t E\Big\{B^{\theta,u,v}(s,\varphi_{\cdot\wedge s},z)\Big[&\frac{E[\tilde\eta^\theta_sX^{\theta,v}_s+L^u_s\eta^\theta_s\,|\,\mathcal F^Y_s]}{E[L^{\theta,v}_s\,|\,\mathcal F^Y_s]}-\frac{E[\tilde\eta^\theta_s\,|\,\mathcal F^Y_s]}{E[L^{\theta,v}_s\,|\,\mathcal F^Y_s]}U^u_s\\
&-\frac{E[\tilde\eta^\theta_sX^{\theta,v}_s+L^u_s\eta^\theta_s\,|\,\mathcal F^Y_s]}{E[L^u_s\,|\,\mathcal F^Y_s]}+\frac{E[\tilde\eta^\theta_s\,|\,\mathcal F^Y_s]}{E[L^u_s\,|\,\mathcal F^Y_s]}U^u_s\Big]\Big\}\Big|_{\varphi=X^u;\,z=u^{\theta,v}_s}\,dB_s.
\end{align*}
We note that the same argument as before shows that $\lim_{\theta\to 0}E[\sup_{0\le t\le T}|I^{3,\theta,0}_t|^2]=0$. On the other hand, similar to (5.15), one can argue that
\begin{align*}
&E\Big[B^{\theta,u,v}(s,\varphi_{\cdot\wedge s},z)\Big(\frac{E[\tilde\eta^\theta_sX^{\theta,v}_s+L^u_s\eta^\theta_s\,|\,\mathcal F^Y_s]}{E[L^u_s\,|\,\mathcal F^Y_s]}-\frac{E[\tilde\eta^\theta_s\,|\,\mathcal F^Y_s]}{E[L^u_s\,|\,\mathcal F^Y_s]}U^u_s\Big)\Big]\Big|_{\varphi=X^u;\,z=u^{\theta,v}_s}\\
&\quad=E\Big[\int_0^1\partial_y\sigma\big(s,\varphi_{\cdot\wedge s},U^u_s+\lambda(U^{\theta,v}_s-U^u_s),z\big)\,d\lambda\cdot\big(\tilde\eta^\theta_sX^{\theta,v}_s+L^u_s\eta^\theta_s-U^u_s\tilde\eta^\theta_s\big)\Big]\Big|_{\varphi=X^u;\,z=u^{\theta,v}_s}.
\end{align*}
Consequently, we have
\begin{align*}
\eta^\theta_t=&\int_0^t\Big\{E\{\alpha^{1,\theta}_s(\varphi^1_{\cdot\wedge s},\varphi^2_{\cdot\wedge s},z)\tilde\eta^\theta_s\}\Big|_{\varphi^1=X^{\theta,v},\,\varphi^2=X^u,\,z=u^{\theta,v}_s}+E\{\alpha^{2,\theta}_s(\varphi_{\cdot\wedge s},z)\tilde\eta^\theta_s\}\Big|_{\varphi=X^u,\,z=u^{\theta,v}_s}\Big\}\,dB_s\\
&+\int_0^t\Big\{E\{\beta^\theta_s(\varphi_{\cdot\wedge s},z)\eta^\theta_s\}\Big|_{\varphi=X^u,\,z=u^{\theta,v}_s}+[D\sigma]^{\theta,u,v}_s(\eta^\theta_{\cdot\wedge s})\Big\}\,dB_s+I^{3,\theta}_t,
\end{align*}
where $I^{3,\theta}_t=\sum_{i=0}^4 I^{3,\theta,i}_t$, and
\begin{align*}
\alpha^{1,\theta}_s(\varphi^1_{\cdot\wedge s},\varphi^2_{\cdot\wedge s},z)&\triangleq\int_0^1 D_\varphi\sigma\big(s,\varphi^2_{\cdot\wedge s}+\lambda(\varphi^1_{\cdot\wedge s}-\varphi^2_{\cdot\wedge s}),U^{\theta,v}_s,z\big)(\varphi^1_{\cdot\wedge s}-\varphi^2_{\cdot\wedge s})\,d\lambda,\\
\alpha^{2,\theta}_s(\varphi_{\cdot\wedge s},z)&\triangleq\sigma(s,\varphi_{\cdot\wedge s},U^{\theta,v}_s,z)+\int_0^1\partial_y\sigma\big(s,\varphi_{\cdot\wedge s},U^u_s+\lambda(U^{\theta,v}_s-U^u_s),z\big)\,d\lambda\,(U^{\theta,v}_s-U^u_s);\\
\beta^\theta_s(\varphi_{\cdot\wedge s},z)&\triangleq L^u_s\int_0^1\partial_y\sigma\big(s,\varphi_{\cdot\wedge s},U^u_s+\lambda(U^{\theta,v}_s-U^u_s),z\big)\,d\lambda.
\end{align*}
Notice that
\[
|\alpha^{1,\theta}_s(\varphi^1_{\cdot\wedge s},\varphi^2_{\cdot\wedge s},z)|+|\alpha^{2,\theta}_s(\varphi_{\cdot\wedge s},z)|\le C\big(1+|\varphi^1_{\cdot\wedge s}|+|\varphi^2_{\cdot\wedge s}|+|U^{\theta,v}_s|+|U^u_s|\big),\qquad
|\beta^\theta_s(\varphi_{\cdot\wedge s},z)|\le CL^u_s.
\]
Now, by the Burkholder and Cauchy–Schwarz inequalities we have, for all $p\ge 2$ and $t\in[0,T]$,
\[
E\Big[\sup_{s\in[0,t]}|\eta^\theta_s|^p\Big]\le C_p\Big\{E\big[\|I^{3,\theta}\|^p_{C_T}\big]+E\Big[\Big(\int_0^t\big(E[|\eta^\theta_s|+|\tilde\eta^\theta_s|]+\sup_{r\in[0,s]}|\eta^\theta_r|\big)\,ds\Big)^p\Big]\Big\},
\]
and from Gronwall's inequality one has
\[
E\Big[\sup_{s\in[0,t]}|\eta^\theta_s|^p\Big]\le C_p\Big\{E\big[\|I^{3,\theta}\|^p_{C_T}\big]+\int_0^t E[|\tilde\eta^\theta_s|^p]\,ds\Big\},\qquad t\in[0,T]. \tag{6.18}
\]
On the other hand, setting $I^\theta_t\triangleq I^{1,\theta}_t+I^{2,\theta}_t$, $t\in[0,T]$, we have from (6.4) that, for $p\ge 2$,
\[
E\Big[\sup_{s\in[0,t]}|\tilde\eta^\theta_s|^p\Big]\le C_p\Big\{E\big[\|I^\theta\|^p_{C_T}\big]+\int_0^t E[|\tilde\eta^\theta_s|^p]\,ds+\int_0^t E[|\eta^\theta_s|^p]\,ds\Big\},\qquad t\in[0,T].
\]
Then Gronwall's inequality leads to
\[
E\Big[\sup_{s\in[0,t]}|\tilde\eta^\theta_s|^p\Big]\le C_p\Big\{E\big[\|I^\theta\|^p_{C_T}\big]+\int_0^t E[|\eta^\theta_s|^p]\,ds\Big\},\qquad t\in[0,T]. \tag{6.19}
\]
Combining (6.18) and (6.19), applying (6.5) and Lemma 6.2 as well as the Gronwall inequality, we can easily deduce (6.2) by sending $\theta\to 0$. Consequently, (6.3) holds as well.

From Proposition 6.1, (5.11), and the above development we also obtain the following corollary.
Corollary 6.3.
We assume that Assumption 2.4 holds. Then, for all $p>1$,
\[
\lim_{\theta\to 0}E\big[\|\delta_\theta U-V\|^p_{C_T}\big]=\lim_{\theta\to 0}E\Big[\sup_{0\le s\le T}\Big|\frac{U^{\theta,v}_s-U^u_s}{\theta}-V_s\Big|^p\Big]=0,
\]
where
\[
V_t\triangleq\frac{E[R_tX^u_t+L^u_tK_t\,|\,\mathcal F^Y_t]}{E[L^u_t\,|\,\mathcal F^Y_t]}-\frac{E[R_t\,|\,\mathcal F^Y_t]}{E[L^u_t\,|\,\mathcal F^Y_t]}\,U^u_t,\qquad t\in[0,T].
\]

We are now ready to study the Stochastic Maximum Principle. The main task will be to determine the appropriate adjoint equation, which we expect to be a backward stochastic differential equation of mean-field type. We begin with a simple analysis. Suppose that $u=u^*$ is an optimal control, and for any $v\in\mathcal U_{ad}$, define $u^{\theta,v}$ by (5.4). Then we have
\begin{align}
0\le\frac{J(u^{\theta,v})-J(u)}{\theta}=\frac1\theta E\Big\{&E[L^{\theta,v}_T\Phi(x,U^{\theta,v}_T)]\big|_{x=X^{\theta,v}_T}-E[L^u_T\Phi(x,U^u_T)]\big|_{x=X^u_T} \tag{7.1}\\
&+\int_0^T\Big[E[L^{\theta,v}_sf(s,\varphi_{\cdot\wedge s},U^{\theta,v}_s,z)]\big|_{\varphi=X^{\theta,v},\,z=u^{\theta,v}_s}-E[L^u_sf(s,\varphi_{\cdot\wedge s},U^u_s,z)]\big|_{\varphi=X^u,\,z=u_s}\Big]\,ds\Big\}.\nonumber
\end{align}
Now, repeating the same analysis as that in Proposition 4.3 and then sending $\theta\to 0$, it follows from Propositions 4.3 and 6.1 and the continuity of the functions $\Phi$ and $f$ that
\begin{align}
0\le\ & E[K_T\xi]+E[R_T\Theta]+E\Big\{\int_0^T\Big\{E[R_sf(s,\varphi_{\cdot\wedge s},U^u_s,z)]\big|_{\varphi=X^u,\,z=u_s}\nonumber\\
&+E\big[\partial_yf(s,\varphi_{\cdot\wedge s},U^u_s,z)\big((X^u_s-U^u_s)R_s+L^u_sK_s\big)\big]\big|_{\varphi=X^u,\,z=u_s} \tag{7.2}\\
&+E[L^u_sD_\varphi f(s,\varphi_{\cdot\wedge s},U^u_s,z)(\psi_{\cdot\wedge s})]\big|_{\varphi=X^u,\,z=u_s,\,\psi=K}+E[L^u_s\partial_zf(s,\varphi_{\cdot\wedge s},U^u_s,z)]\big|_{\varphi=X^u,\,z=u_s}(v_s-u_s)\Big\}\,ds\Big\},\nonumber
\end{align}
where
\[
\xi\triangleq E[L^u_T\partial_x\Phi(x,U^u_T)]\big|_{x=X^u_T}+L^u_T\,E[\partial_y\Phi(X^u_T,y)]\big|_{y=U^u_T},\qquad
\Theta\triangleq E[\Phi(X^u_T,y)]\big|_{y=U^u_T}+(X^u_T-U^u_T)\,E[\partial_y\Phi(X^u_T,y)]\big|_{y=U^u_T}. \tag{7.3}
\]
We now consider the adjoint equations, which take the following form of backward SDEs on the reference space $(\Omega,\mathcal F,Q)$:
\[
\begin{cases}
dp_t=-\alpha_t\,dt+d\Gamma_t+q_t\,dB_t+\tilde q_t\,dY_t, & p_T=\xi,\\
dQ_t=-\beta_t\,dt+d\Sigma_t+M_t\,dB_t+\widetilde M_t\,dY_t, & Q_T=\Theta.
\end{cases} \tag{7.4}
\]
Here the coefficients $\alpha,\beta$, as well as the two bounded variation processes $\Gamma$ and $\Sigma$, are to be determined. Applying Itô's formula and recalling the variational equations (5.17) and (5.18), we can easily derive (denoting $U^u_t=E^u[X^u_t|\mathcal F^Y_t]$, $t\in[0,T]$)
\begin{align}
&E[\xi K_T]+E[\Theta R_T]\nonumber\\
&=\int_0^T\Big\{-E[K_s\alpha_s]-E[R_s\beta_s]+E\Big[q_sE[R_s\sigma(s,\varphi_{\cdot\wedge s},U^u_s,z)]\big|_{\varphi=X^u,\,z=u_s}\Big]\nonumber\\
&\qquad+E\Big[q_sE\big[\partial_y\sigma(s,\varphi_{\cdot\wedge s},U^u_s,z)\big((X^u_s-U^u_s)R_s+L^u_sK_s\big)\big]\Big|_{\varphi=X^u,\,z=u_s}\Big] \tag{7.5}\\
&\qquad+E\big[q_s[D\sigma]^{u,v}_s(K_{\cdot\wedge s})+q_sC^{u,v}_\sigma(s)(v_s-u_s)+\widetilde M_sR_sh(s,X^u_s)+\widetilde M_sK_sL^u_s\partial_xh(s,X^u_s)\big]\Big\}\,ds\nonumber\\
&\quad+E\Big\{\int_0^T[K_s\,d\Gamma_s+R_s\,d\Sigma_s]\Big\},\nonumber
\end{align}
where $[D\sigma]^{u,v}$ and $C^{u,v}$ are defined by (5.14).

By Fubini's Theorem we see that
\begin{align}
E\Big[q_sE[R_s\sigma(s,\varphi_{\cdot\wedge s},U^u_s,z)]\big|_{\varphi=X^u,\,z=u_s}\Big]&=E\Big[R_sE[q_s\sigma(s,X_{\cdot\wedge s},y,u_s)]\big|_{y=U^u_s}\Big];\nonumber\\
E\Big[q_sE\big[\partial_y\sigma(s,\varphi_{\cdot\wedge s},U^u_s,z)\big((X^u_s-U^u_s)R_s+L^u_sK_s\big)\big]\Big|_{\varphi=X^u,\,z=u_s}\Big]&=E\Big[E\big[q_s\partial_y\sigma(s,X_{\cdot\wedge s},y,u_s)\big]\big|_{y=U^u_s}\big((X^u_s-U^u_s)R_s+L^u_sK_s\big)\Big]. \tag{7.6}
\end{align}
Furthermore, in light of the definition of $[D\sigma]^{u,v}$ (see (5.14)), if we denote, for fixed $(t,\varphi,z)$,
\[
\mu_\sigma(t,\varphi_{\cdot\wedge t},z)(\cdot)\triangleq E[L^u_tD_\varphi\sigma(t,\varphi_{\cdot\wedge t},U^u_t,z)](\cdot)\in\mathcal M[0,T], \tag{7.7}
\]
where $\mathcal M[0,T]$ denotes the set of all Borel measures on $[0,T]$, then we can write
\[
[D\sigma]^{u,v}_t(K_{\cdot\wedge t})=E\big[L^u_tD_\varphi\sigma(t,\varphi_{\cdot\wedge t},U^u_t,z)(\psi)\big]\big|_{\varphi=X^u,\,z=u_t,\,\psi=K_{\cdot\wedge t}}=\int_0^tK_r\,\mu_\sigma(r,X^u_{\cdot\wedge r},u_r)(dr). \tag{7.8}
\]
Let us now argue that a similar Fubini-type argument holds for the random measure $\mu_\sigma(t,X^u_{\cdot\wedge t},u_t)(\cdot)$. First, for a given process $q\in L^2_{\mathbb F}(Q;[0,T])$, consider the following finite variation (FV) process (in fact, under Assumption 2.4, a process of integrable variation (IV)):
\[
A^\sigma_t\triangleq\int_0^T\int_0^{t\wedge s}q_s\,\mu_\sigma(s,X^u_{\cdot\wedge s},u_s)(dr)\,ds,\qquad t\in[0,T]. \tag{7.9}
\]
It is easy to check that, as a (randomized) signed measure on $[0,T]$, it holds $Q$-almost surely that $dA^\sigma_t=\int_t^Tq_s\,\mu_\sigma(s,X^u_{\cdot\wedge s},u_s)(dt)\,ds$. We note that, being a "raw FV" process, the process $A^\sigma$ is not $\mathbb F$-adapted. We therefore consider its dual predictable projection:
\[
{}^p\Big(\int_t^Tq_s\,\mu_\sigma(s,X^u_{\cdot\wedge s},u_s)(dt)\,ds\Big)\triangleq d[{}^pA^\sigma_t],\qquad t\in[0,T]. \tag{7.10}
\]
We remark that $d[{}^pA^\sigma_t]$ is a predictable random measure that can be formally understood as
\[
d[{}^pA^\sigma_t]=E[dA^\sigma_t\,|\,\mathcal F_{t-}]=E\Big[\int_t^Tq_s\,\mu_\sigma(s,X^u_{\cdot\wedge s},u_s)(dt)\,ds\,\Big|\,\mathcal F_{t-}\Big],\qquad t\in[0,T].
\]
Using the definition of the dual predictable projection and (7.8), we see that, for the continuous process $K\in L^2_{\mathbb F}(Q;C_T)$,
\begin{align}
\int_0^TE\big[q_s[D\sigma]^{u,v}_s(K_{\cdot\wedge s})\big]\,ds&=\int_0^TE\Big[q_s\int_0^sK_r\,\mu_\sigma(r,X^u_{\cdot\wedge r},u_r)(dr)\Big]ds=E\Big[\int_0^TK_r\,dA^\sigma_r\Big]\nonumber\\
&=E\Big[\int_0^TK_r\,d[{}^pA^\sigma_r]\Big]=E\Big[\int_0^TK_r\,{}^p\Big(\int_r^Tq_s\,\mu_\sigma(s,X^u_{\cdot\wedge s},u_s)(dr)\,ds\Big)\Big]. \tag{7.11}
\end{align}
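The interchange of the $dr$- and $ds$-integrations behind (7.9)–(7.11) is a plain Fubini step, and can be checked in a discretized form. In the sketch below, arbitrary nonnegative arrays of our own choosing stand in for $q$, $K$, and the kernel $\mu_\sigma$ (supported on $r\le s$):

```python
import random

random.seed(1)
n = 40  # time grid standing in for [0, T]

# Discrete stand-ins: q[s], K[r], and a kernel mu[s][r] supported on r <= s.
q = [random.random() for _ in range(n)]
K = [random.random() for _ in range(n)]
mu = [[random.random() if r <= s else 0.0 for r in range(n)] for s in range(n)]

# One order, as in the left side of (7.11): sum_s q_s * ( sum_{r<=s} K_r mu_s(r) )
lhs = sum(q[s] * sum(K[r] * mu[s][r] for r in range(s + 1)) for s in range(n))

# The other order, as in A^sigma of (7.9): sum_r K_r * ( sum_{s>=r} q_s mu_s(r) )
rhs = sum(K[r] * sum(q[s] * mu[s][r] for s in range(r, n)) for r in range(n))

print(abs(lhs - rhs) < 1e-9)  # True: both orders of summation agree
```

Both sums run over the same set of pairs $\{(r,s): r\le s\}$, which is exactly why $\int_0^T q_s\big(\int_0^s K_r\,\mu_s(dr)\big)ds$ can be rewritten as an integral of $K$ against the FV process $A^\sigma$.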
Similarly, we denote $A^f_t\triangleq\int_0^T\int_0^{t\wedge s}\mu_f(s,X^u_{\cdot\wedge s},u_s)(dr)\,ds$, $t\in[0,T]$, and denote its dual predictable projection by ${}^p\big(\int_t^T\mu_f(s,X^u_{\cdot\wedge s},u_s)(dt)\,ds\big)=d[{}^pA^f_t]$, $t\in[0,T]$.

We now plug (7.6) and (7.11) into (7.5) to get:
\begin{align}
&E[\xi K_T]+E[\Theta R_T]\nonumber\\
&=E\Big\{\int_0^T\Big\{K_s\Big[-\alpha_s+L^u_sE\big[q_s\partial_y\sigma(s,X^u_{\cdot\wedge s},y,u_s)\big]\big|_{y=U^u_s}+\widetilde M_sL^u_s\partial_xh(s,X^u_s)\Big]\nonumber\\
&\qquad+R_s\Big[-\beta_s+E[q_s\sigma(s,X_{\cdot\wedge s},y,u_s)]\big|_{y=U^u_s}+\widetilde M_sh(s,X^u_s)\Big]+q_sC^{u,v}_\sigma(s)(v_s-u_s)\nonumber\\
&\qquad+R_sE\big[q_s\partial_y\sigma(s,X^u_{\cdot\wedge s},y,u_s)\big]\big|_{y=U^u_s}(X^u_s-U^u_s)\Big\}\,ds+\int_0^TK_s\,d[{}^pA^\sigma_s]\Big\} \tag{7.12}\\
&\qquad+E\Big\{\int_0^T[K_s\,d\Gamma_s+R_s\,d\Sigma_s]\Big\}\nonumber\\
&=E\Big\{\int_0^T\big[-K_s\hat\alpha_s-R_s\hat\beta_s+q_sC^{u,v}_\sigma(s)(v_s-u_s)\big]\,ds+\int_0^TK_s\,d[{}^pA^\sigma_s]+\int_0^T[K_s\,d\Gamma_s+R_s\,d\Sigma_s]\Big\},\nonumber
\end{align}
where
\[
\begin{split}
\hat\alpha_t&\triangleq\alpha_t-L^u_tE\big[q_t\partial_y\sigma(t,X^u_{\cdot\wedge t},y,u_t)\big]\big|_{y=U^u_t}-\widetilde M_tL^u_t\partial_xh(t,X^u_t);\\
\hat\beta_t&\triangleq\beta_t-E[q_t\sigma(t,X_{\cdot\wedge t},y,u_t)]\big|_{y=U^u_t}-\widetilde M_th(t,X^u_t)-E\big[q_t\partial_y\sigma(t,X^u_{\cdot\wedge t},y,u_t)\big]\big|_{y=U^u_t}(X^u_t-U^u_t).
\end{split} \tag{7.13}
\]
Combining (7.2) and (7.12) and using the processes $dA^\sigma$, $dA^f$ and their dual predictable projections, we have
\begin{align}
0\le\ &E\Big\{\int_0^T\big[-K_s\hat\alpha_s-R_s\hat\beta_s+q_sC^{u,v}_\sigma(s)(v_s-u_s)\big]\,ds+\int_0^TK_s\,d[{}^pA^\sigma_s]\Big\} \tag{7.14}\\
&+E\Big\{\int_0^T\Big[R_s\Big(E[f(s,X_{\cdot\wedge s},y,u_s)]\big|_{y=U^u_s}+E\big[\partial_yf(s,X^u_{\cdot\wedge s},y,u_s)\big]\big|_{y=U^u_s}(X^u_s-U^u_s)\Big)\nonumber\\
&\qquad+L^u_sK_sE\big[\partial_yf(s,X^u_{\cdot\wedge s},y,u_s)\big]\big|_{y=U^u_s}+C^{u,v}_f(s)(v_s-u_s)\Big]\,ds+\int_0^TK_s\,d[{}^pA^f_s]\Big\}\nonumber\\
&+E\Big\{\int_0^T[K_s\,d\Gamma_s+R_s\,d\Sigma_s]\Big\},\nonumber
\end{align}
where $C^{u,v}_f(s)\triangleq E[L^u_s\partial_zf(s,\varphi_{\cdot\wedge s},U^u_s,z)]\big|_{\varphi=X^u,\,z=u_s}$.
Now, if we set $\Sigma_t\equiv 0$ and
\[
\begin{split}
\hat\alpha_t&=L^u_tE\big[\partial_yf(t,X^u_{\cdot\wedge t},y,u_t)\big]\big|_{y=U^u_t},\\
\hat\beta_t&=E[f(t,X_{\cdot\wedge t},y,u_t)]\big|_{y=U^u_t}+E\big[\partial_yf(t,X^u_{\cdot\wedge t},y,u_t)\big]\big|_{y=U^u_t}(X^u_t-U^u_t),\\
d\Gamma_t&=-d[{}^pA^\sigma_t]-d[{}^pA^f_t],
\end{split} \tag{7.15}
\]
then (7.14) becomes
\[
0\le E\Big\{\int_0^T\big[q_sC^{u,v}_\sigma(s)+C^{u,v}_f(s)\big](v_s-u_s)\,ds\Big\},\qquad v\in\mathcal U_{ad}. \tag{7.16}
\]
From this we should be able to derive the maximum principle, provided that the adjoint equation (7.4), with coefficients $\alpha$, $\beta$, and $\Gamma$ determined by (7.13) and (7.15), is well-defined.

Remark 7.1.
1) We remark that the process $\Gamma$ in (7.15) should be considered as a mapping from the space $L^2_{\mathbb F}([0,T]\times\Omega)\times L^2_{\mathbb F}(\Omega;C_T)\times L^2_{\mathbb F}([0,T]\times\Omega;U)$ to $\mathcal M_{\mathbb F}([0,T])$, the space of all random measures $\mu$ on $[0,T]$ such that
(i) $(t,\omega)\mapsto\mu(t,\omega,A)$ is $\mathbb F$-progressively measurable for all $A\in\mathcal B([0,T])$;
(ii) $\mu(t,\omega,\cdot)\in\mathcal M([0,T])$ is a finite Borel measure on $[0,T]$.

2) Assumption 2.4-(iii) implies that the random measure $D_\sigma[q,X^u,u](t,dt)$ satisfies the following estimate: for any $q\in L^2_{\mathbb F}([0,T]\times\Omega)$ and $u\in\mathcal U_{ad}$,
\begin{align}
E\Big[\int_0^T\big|d[{}^pA^\sigma_t]\big|\Big]&=E\Big\{\int_0^T\Big|{}^p\Big(\int_t^Tq_s\,\mu_\sigma(s,X^u_{\cdot\wedge s},u_s)(dt)\,ds\Big)\Big|\Big\}\le E\Big\{\int_0^T\int_0^s|q_s|\,\big|\mu_\sigma(s,X^u_{\cdot\wedge s},u_s)(dt)\big|\,ds\Big\} \tag{7.17}\\
&\le E\Big\{\int_0^T|q_s|\int_0^s\ell(s,dt)\,ds\Big\}\le CE\Big\{\int_0^T|q_s|\,ds\Big\}\le C\|q\|_{2,Q}.\nonumber
\end{align}
The same estimate holds for $D_f[X^u,u](t,dt)$ as well.

3) Clearly, the processes $A^\sigma$ and $A^f$ originate from the Fréchet derivatives of $\sigma$ and $f$, respectively, with respect to the path $\varphi_{\cdot\wedge t}$. If $\sigma$ and $f$ are of Markovian type, then $A^\sigma$ and $A^f$ will be absolutely continuous with respect to the Lebesgue measure.

We shall now validate all the arguments presented above.
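Before turning to the validation, the dual predictable projection used in (7.10)–(7.11) can be made concrete in a discrete model. In the two-period coin-flip sketch below (the filtration, increments, and integrand are all our own illustrative choices), replacing each raw increment $\Delta A_t$ by $E[\Delta A_t\,|\,\mathcal F_{t-1}]$ leaves $E[\sum_t K_t\,\Delta A_t]$ unchanged for every predictable $K$ — the discrete analogue of $E[\int K\,dA]=E[\int K\,d({}^pA)]$:

```python
from itertools import product

# Two fair coin flips; F_0 trivial, F_1 = sigma(w1), F_2 = everything.
outcomes = list(product([0, 1], repeat=2))
P = 1 / len(outcomes)

# A "raw" increment process: dA_1 peeks at the future flip w2 (so A is not adapted),
# dA_2 depends on the whole path. The numeric values are arbitrary.
def dA(t, w):
    w1, w2 = w
    return (1.0 + 2 * w2 - w1) if t == 1 else (0.5 + w1 * w2)

# Dual predictable projection: replace dA_t by its conditional expectation given F_{t-1}.
def pdA(t, w):
    w1, _ = w
    if t == 1:                                      # E[dA_1 | F_0]: average over everything
        return sum(dA(1, v) for v in outcomes) / len(outcomes)
    matches = [v for v in outcomes if v[0] == w1]   # E[dA_2 | F_1]: average over w2 only
    return sum(dA(2, v) for v in matches) / len(matches)

# A predictable integrand: K_t is F_{t-1}-measurable.
def K(t, w):
    return 2.0 if t == 1 else (1.0 + 3 * w[0])

lhs = sum(P * sum(K(t, w) * dA(t, w) for t in (1, 2)) for w in outcomes)
rhs = sum(P * sum(K(t, w) * pdA(t, w) for t in (1, 2)) for w in outcomes)
print(abs(lhs - rhs) < 1e-12)  # True: E[sum K dA] = E[sum K d(pA)]
```

This is the identity exploited in (7.11): the non-adapted measure $dA^\sigma$ may be traded for its predictable compensator when integrated against a predictable (here continuous adapted) process.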
To begin with, we note that the choice of $\alpha$, $\beta$, and $\Gamma$ via (7.13) and (7.15), together with the terminal condition $(\xi,\Theta)$ given by (7.3), amounts to saying that the processes $(p,q,\tilde q)$ and $(Q,M,\widetilde M)$ solve the BSDE:
\[
\begin{cases}
dp_t=-L^u_t\Big\{E\big[\partial_yf(t,X^u_{\cdot\wedge t},y,u_t)\big]\big|_{y=U^u_t}+E\big[q_t\partial_y\sigma(t,X^u_{\cdot\wedge t},y,u_t)\big]\big|_{y=U^u_t}+\widetilde M_t\partial_xh(t,X^u_t)\Big\}\,dt\\
\qquad\quad-d[{}^pA^\sigma_t]-d[{}^pA^f_t]+q_t\,dB_t+\tilde q_t\,dY_t,\\
dQ_t=-\Big\{E[q_t\sigma(t,X^u_{\cdot\wedge t},y,u_t)]\big|_{y=U^u_t}+\widetilde M_th(t,X^u_t)+E\big[q_t\partial_y\sigma(t,X^u_{\cdot\wedge t},y,u_t)\big]\big|_{y=U^u_t}(X^u_t-U^u_t)\\
\qquad\qquad+E[f(t,X_{\cdot\wedge t},y,u_t)]\big|_{y=U^u_t}+E\big[\partial_yf(t,X^u_{\cdot\wedge t},y,u_t)\big]\big|_{y=U^u_t}(X^u_t-U^u_t)\Big\}\,dt+M_t\,dB_t+\widetilde M_t\,dY_t,\\
p_T=\xi,\qquad Q_T=\Theta.
\end{cases} \tag{7.18}
\]
Now, if we denote $\eta=(p,Q)^T$, $W=(B,Y)^T$, and
\[
\Xi=\begin{bmatrix}q&\tilde q\\ M&\widetilde M\end{bmatrix},
\]
then we can rewrite (7.18) in a more abstract (vector) form:
\[
\begin{cases}
d\eta_t=-\big\{A_t+E[G_t\Xi_tg(t,y)]\big|_{y=U^u_t}+H_t\Xi_th_t\big\}\,dt-\Gamma(\Xi)(t,dt)-\Gamma^0(t,dt)+\Xi_t\,dW_t,\\
\eta_T=\Upsilon,
\end{cases} \tag{7.19}
\]
where $\Upsilon\in L^2_{\mathcal F^W_T}(\Omega;Q)$; $A$, $G$, $H$, and $h$ are bounded, vector- or matrix-valued $\mathbb F^W$-adapted processes of appropriate dimensions; $g$ is an $\mathbb R$-valued progressively measurable random field; and $U$ is an $\mathbb F^Y$-adapted process. Moreover, the $\mathbb R^2$-valued finite variation processes $\Gamma(\Xi)(t,dt)$ and $\Gamma^0(t,dt)$ take the form
\[
\Gamma(\Xi)(t,dt)={}^p\Big(\int_t^T\Xi_r\,\mu^1_r(dt)\,dr\Big),\qquad \Gamma^0(t,dt)={}^p\Big(\int_t^T\mu^2_r(dt)\,dr\Big), \tag{7.20}
\]
where $r\mapsto\mu^i_r(\cdot)$, $i=1,2$, are $\mathcal M[0,T]$-valued measurable random processes satisfying, as measures with respect to the total variation norm,
\[
|\mu^1_r(dt)|+|\mu^2_r(dt)|\le\ell(r,dt),\qquad r\in[0,T],\ Q\text{-a.s.} \tag{7.21}
\]
We note that $\Gamma(\Xi)(t,dt)$ and $\Gamma^0(t,dt)$ represent $d[{}^pA^\sigma_t]$ and $d[{}^pA^f_t]$ in (7.18), respectively, and can be substantiated by (7.9) and (7.10). Furthermore, by Assumption 2.4, they both satisfy (7.21). To the best of our knowledge, BSDE (7.19) is beyond all the existing frameworks of BSDEs, and we shall give a brief proof of its well-posedness.

Theorem 7.2.
Assume that Assumption 2.4 is in force. Then the BSDE (7.19) admits a unique solution $(\eta,\Xi)$.

Proof. The proof is more or less standard; we shall only point out a key estimate. For any given $\widetilde\Xi^i\in L^2_{\mathbb F^W}([0,T]\times\Omega;\mathbb R^{2\times 2})$, we obviously have a unique solution $(\eta^i,\Xi^i)$ of (7.19), $i=1,2$, respectively, i.e.,
\[
\begin{cases}
d\eta^i_t=-\big\{A_t+E[G_t\widetilde\Xi^i_tg(t,y)]\big|_{y=U^u_t}+H_t\widetilde\Xi^i_th_t\big\}\,dt-\Gamma(\widetilde\Xi^i)(t,dt)-\Gamma^0(t,dt)+\Xi^i_t\,dW_t,\\
\eta^i_T=\Upsilon.
\end{cases}
\]
We define $\hat\eta=\eta^1-\eta^2$, $\hat\Xi=\Xi^1-\Xi^2$, and $\widehat{\widetilde\Xi}=\widetilde\Xi^1-\widetilde\Xi^2$. Noting the linearity of BSDE (7.19), we see that $\hat\eta$ satisfies
\[
\hat\eta_t=\int_t^T\Big\{E[G_s\widehat{\widetilde\Xi}_sg(s,y)]\big|_{y=U^u_s}+H_s\widehat{\widetilde\Xi}_sh_s\Big\}\,ds+\int_t^T\Gamma(\widehat{\widetilde\Xi})(s,ds)-M^T_t, \tag{7.22}
\]
where $M^T_t\triangleq\int_t^T\hat\Xi_s\,dW_s$. Therefore,
\[
|\hat\eta_t+M^T_t|^2\le 2\Big\{\Big|\int_t^T\big\{E[G_s\widehat{\widetilde\Xi}_sg(s,y)]\big|_{y=U^u_s}+H_s\widehat{\widetilde\Xi}_sh_s\big\}\,ds\Big|^2+\Big|\int_t^T\Gamma(\widehat{\widetilde\Xi})(s,ds)\Big|^2\Big\}.
\]
Taking expectations on both sides above and noting that $E[\hat\eta_tM^T_t]=0$ and
\[
E\Big\{\Big|\int_t^T\big\{E[G_s\widehat{\widetilde\Xi}_sg(s,y)]\big|_{y=U^u_s}+H_s\widehat{\widetilde\Xi}_sh_s\big\}\,ds\Big|^2\Big\}\le C(T-t)E\Big[\int_t^T|\widehat{\widetilde\Xi}_s|^2\,ds\Big],
\]
we have
\[
E[|\hat\eta_t|^2]+E\Big[\int_t^T|\hat\Xi_s|^2\,ds\Big]\le C(T-t)E\Big[\int_t^T|\widehat{\widetilde\Xi}_s|^2\,ds\Big]+2E\Big\{\Big|\int_t^T\Gamma(\widehat{\widetilde\Xi})(s,ds)\Big|^2\Big\}. \tag{7.23}
\]
To estimate the term involving $\Gamma(\widehat{\widetilde\Xi})$, we note (recall (7.20)) that if a square-integrable process $V$ is increasing and continuous, then so is its dual predictable projection ${}^pV$. Thus, by the definition of ${}^pV$ we have
\begin{align*}
E\Big[\Big|\int_t^Td[{}^pV_s]\Big|^2\Big]&=2E\Big[\int_t^T({}^pV_s-{}^pV_t)\,d[{}^pV_s]\Big]=2E\Big[\int_t^T({}^pV_s-{}^pV_t)\,dV_s\Big]\\
&\le 2E\big[({}^pV_T-{}^pV_t)(V_T-V_t)\big]\le 2\Big(E\Big[\Big|\int_t^Td[{}^pV_s]\Big|^2\Big]\Big)^{1/2}\Big(E\Big[\Big|\int_t^TdV_s\Big|^2\Big]\Big)^{1/2}.
\end{align*}
That is,
\[
E\Big[\Big|\int_t^Td[{}^pV_s]\Big|^2\Big]\le 4\,E\Big[\Big|\int_t^TdV_s\Big|^2\Big]. \tag{7.24}
\]
Applying this to $V_t\triangleq\int_0^T\int_0^{t\wedge r}|\widehat{\widetilde\Xi}_r|\,|\mu^1_r(ds)|\,dr$, $t\in[0,T]$, we have
\begin{align*}
E\Big[\Big|\int_t^T\Gamma(\widehat{\widetilde\Xi})(s,ds)\Big|^2\Big]&\le 4\,E\Big[\Big|\int_t^T\int_s^T|\widehat{\widetilde\Xi}_r|\,|\mu^1_r(ds)|\,dr\Big|^2\Big]\le 4\,E\Big[\Big|\int_t^T\int_s^T|\widehat{\widetilde\Xi}_r|\,\ell(r,ds)\,dr\Big|^2\Big]\\
&\le CE\Big[\Big|\int_t^T|\widehat{\widetilde\Xi}_r|\,dr\Big|^2\Big]\le C(T-t)E\Big[\int_t^T|\widehat{\widetilde\Xi}_s|^2\,ds\Big],
\end{align*}
and therefore (7.23) becomes
\[
E[|\hat\eta_t|^2]+E\Big[\int_t^T|\hat\Xi_s|^2\,ds\Big]\le C(T-t)E\Big[\int_t^T|\widehat{\widetilde\Xi}_s|^2\,ds\Big]. \tag{7.25}
\]
With this estimate, and following the standard (contraction) argument, one shows that BSDE (7.18) is well-posed on $[T-\delta,T]$ for some (uniform) $\delta>0$. Iterating the argument, one then obtains the well-posedness on $[0,T]$. We leave the details to the interested reader.

We are now ready to prove the main result of this paper. Let us define the
Hamiltonian: for $(\varphi,\mu)\in C_T\times\mathcal P(C_T)$, an adapted process $k:[0,T]\times\Omega\to\mathbb R$, and $(t,\omega,z)\in[0,T]\times\Omega\times\mathbb R$,
\[
H(t,\omega,\varphi_{\cdot\wedge t},\mu,z;k)\triangleq k_t(\omega)\cdot\sigma(t,\varphi_{\cdot\wedge t},\mu,z)+f(t,\varphi_{\cdot\wedge t},\mu,z). \tag{7.26}
\]
We have the following theorem.

Theorem 7.3 (Stochastic Maximum Principle). Assume that Assumptions 2.4 and 3.1 hold. Assume further that the mapping $z\mapsto H(t,\varphi_{\cdot\wedge t},\mu,z)$ is convex. Let $u=u^*\in\mathcal U_{ad}$ be an optimal control and $X^u$ the corresponding trajectory. Then, for $dt\times dQ$-a.e. $(t,\omega)\in[0,T]\times\Omega$ it holds that
\[
H(t,\omega,X^u_{\cdot\wedge t},\mu^u_t,u_t;q_t)=\inf_{v\in U}H(t,\omega,X^u_{\cdot\wedge t},\mu^u_t,v;q_t), \tag{7.27}
\]
where $(p,q,\tilde q)$ and $(Q,M,\widetilde M)$ constitute the unique solution of the BSDE (7.18).

Proof. We first recall from (5.14) that
\[
C^{u,v}_f(t)=E[L^u_t\partial_zf(t,\varphi_{\cdot\wedge t},U^u_t,z)]\big|_{\varphi=X^u,\,z=u_t}=\partial_zf(t,X^u_{\cdot\wedge t},\mu^u_t,u_t);\qquad
C^{u,v}_\sigma(t)=E\big[L^u_t\partial_z\sigma(t,\varphi_{\cdot\wedge t},U^u_t,z)\big]\big|_{\varphi=X^u,\,z=u_t}=\partial_z\sigma(t,X^u_{\cdot\wedge t},\mu^u_t,u_t).
\]
Then (7.16) implies that
\begin{align}
0&\le E\Big[\int_0^T\big[q_tC^{u,v}_\sigma(t)+C^{u,v}_f(t)\big](v_t-u_t)\,dt\Big] \tag{7.28}\\
&=E\Big[\int_0^T\partial_zH(t,\omega,X^u_{\cdot\wedge t},\mu^u_t,u_t;q_t)(v_t-u_t)\,dt\Big].\nonumber
\end{align}
Consequently, for $dt\times dQ$-a.e. $(t,\omega)\in[0,T]\times\Omega$ and any $v\in U$, it holds that
\[
\partial_zH(t,\omega,X^u_{\cdot\wedge t},\mu^u_t,u_t;q_t)(v-u_t)\ge 0. \tag{7.29}
\]
Now, for any $v\in U$, one has, $dt\times dQ$-a.e. on $[0,T]\times\Omega$,
\begin{align*}
&H(t,\omega,X^u_{\cdot\wedge t},\mu^u_t,v;q_t)-H(t,\omega,X^u_{\cdot\wedge t},\mu^u_t,u_t;q_t)\\
&=\int_0^1\partial_zH\big(t,\omega,X^u_{\cdot\wedge t},\mu^u_t,u_t+\lambda(v-u_t);q_t\big)(v-u_t)\,d\lambda\\
&=\int_0^1\Big[\partial_zH\big(t,\omega,X^u_{\cdot\wedge t},\mu^u_t,u_t+\lambda(v-u_t);q_t\big)-\partial_zH(t,\omega,X^u_{\cdot\wedge t},\mu^u_t,u_t;q_t)\Big](v-u_t)\,d\lambda\\
&\quad+\partial_zH(t,\omega,X^u_{\cdot\wedge t},\mu^u_t,u_t;q_t)(v-u_t)\ \ge\ 0.
\end{align*}
Here the first integral on the right-hand side is nonnegative due to the convexity of $H$ in the variable $z$, and the last term is nonnegative because of (7.29). The identity (7.27) now follows immediately.

Remark 7.4.
In the stochastic control literature the inequality (7.28) is sometimes referred to as the Stochastic Maximum Principle in integral form, which is useful in many applications, as it does not require the convexity assumption on the Hamiltonian $H$.

Acknowledgment.
We would like to thank the anonymous referee for his/her very carefulreading of the manuscript and many incisive and constructive questions and suggestions,which helped us to make the paper a much better product.