A Concise Introduction to Control Theory for Stochastic Partial Differential Equations∗

Qi Lü† and Xu Zhang‡

Abstract
The aim of these notes is to give a concise introduction to control theory for systems governed by stochastic partial differential equations. We shall mainly focus on controllability and optimal control problems for these systems. For the first one, we present results on the exact controllability of stochastic transport equations, the null and approximate controllability of stochastic parabolic equations, and the lack of exact controllability of stochastic hyperbolic equations. For the second one, we first introduce the stochastic linear quadratic optimal control problem and then the Pontryagin-type maximum principle for general optimal control problems. It deserves mentioning that, in order to solve some difficult problems in this field, one has to develop new tools, say, the stochastic transposition method introduced in our previous works.

Mathematics Subject Classification. Primary 93E20, 60H15, 93B05, 93B07.
Key Words. Stochastic partial differential equation, controllability, observability, optimal control, linear quadratic control problem, Pontryagin-type maximum principle.
Contents

∗ These are the lecture notes of "A concise introduction to control theory for stochastic partial differential equations". They were written for the EECI (European Embedded Control Institute) Summer School "International Graduate School on Control" at Sichuan University, Chengdu, China, from July 8 to July 27, 2019. We refer to the monograph [53] for a more detailed presentation of the whole theory, while the survey paper [56] is mainly for engineering-oriented readers.

† School of Mathematics, Sichuan University, Chengdu 610064, Sichuan Province, China. The author is partially supported by the NSF of China under grants 12025105, 11971334 and 11931011, and by the Chang Jiang Scholars Program from the Chinese Education Ministry. E-mail: [email protected].

‡ School of Mathematics, Sichuan University, Chengdu 610064, Sichuan Province, China. The author is supported by the NSF of China under grants 11931011 and 11821001. E-mail: zhang [email protected].
11 A sufficient condition for an optimal control

12 Further comments and open problems
For the readers' convenience, we collect in this section some basic knowledge of Functional Analysis, Partial Differential Equations (PDEs for short), Stochastic Analysis and Stochastic Partial Differential Equations (SPDEs for short), which will be used later. For more details, we refer to [7, 22, 53, 58, 61, 76, 79]. Unless otherwise stated, all definitions, theorems and examples in this section are taken from these references. Throughout this paper, $\mathbb{N}$ is the set of positive integers, while $\mathbb{R}$ and $\mathbb{C}$ stand, respectively, for the fields of real numbers and complex numbers. Also, we shall denote by $C$ a generic positive constant which may change from line to line (unless otherwise stated).

In these notes, all linear spaces are over the field $\mathbb{R}$ or over the field $\mathbb{C}$. Clearly, any linear space over the field $\mathbb{C}$ is also a linear space over the field $\mathbb{R}$. For any $c \in \mathbb{C}$, denote by $\bar{c}$ its complex conjugate.

Definition 1.1. Let $X$ be a linear space. A map $|\cdot|_X : X \to \mathbb{R}$ is called a norm on $X$ if it satisfies the following:
\[
\begin{cases}
|x|_X \ge 0, \ \forall\, x \in X; \quad |x|_X = 0 \Longleftrightarrow x = 0;\\
|\alpha x|_X = |\alpha|\,|x|_X, \ \forall\, \alpha \in \mathbb{C},\ x \in X;\\
|x + y|_X \le |x|_X + |y|_X, \ \forall\, x, y \in X.
\end{cases}
\tag{1.1}
\]
A linear space $X$ with the above norm $|\cdot|_X$ is called a normed linear space and denoted by $(X, |\cdot|_X)$ (or simply by $X$ if the norm $|\cdot|_X$ is clear from the context).

Definition 1.2
Let $X$ be a normed linear space.

1) We call $\{x_k\}_{k=1}^\infty \subset X$ a Cauchy sequence (in $(X, |\cdot|_X)$) if for any $\varepsilon > 0$, there is $k_0 \in \mathbb{N}$ such that $|x_k - x_j|_X < \varepsilon$ for all $k, j \ge k_0$.

2) We say that $\{x_k\}_{k=1}^\infty \subset X$ converges to some $x\ (\in X)$ in $X$, denoted by $\lim_{k\to\infty} x_k = x$ in $X$, if $\lim_{k\to\infty} |x_k - x|_X = 0$.

Definition 1.3. Let $X$ be a normed linear space.

1) A subset $G \subset X$ is called bounded if there is a constant $C$ such that $|x|_X \le C$ for any $x \in G$.

2) A subset $G$ is said to be dense in $X$ if for any $x \in X$, one can find a sequence $\{x_k\}_{k=1}^\infty \subset G$ such that $\lim_{k\to\infty} x_k = x$ in $X$.

Definition 1.4
A normed linear space $(X, |\cdot|_X)$ is called a Banach space if it is complete, i.e., for any Cauchy sequence $\{x_k\}_{k=1}^\infty \subset X$, there exists $x \in X$ so that $\lim_{k\to\infty} x_k = x$ in $X$.

Definition 1.5
A Banach space X is called separable if there exists a countable dense subset of X . Definition 1.6
Let $X$ be a linear space. A map $\langle\cdot,\cdot\rangle_X : X \times X \to \mathbb{C}$ is called an inner product on $X$ if it satisfies the following:
\[
\begin{cases}
\langle x, x\rangle_X \ge 0, \ \forall\, x \in X; \quad \langle x, x\rangle_X = 0 \Longleftrightarrow x = 0;\\
\langle x, y\rangle_X = \overline{\langle y, x\rangle_X}, \ \forall\, x, y \in X;\\
\langle \alpha x + \beta y, z\rangle_X = \alpha\langle x, z\rangle_X + \beta\langle y, z\rangle_X, \ \forall\, \alpha, \beta \in \mathbb{C},\ x, y, z \in X.
\end{cases}
\tag{1.2}
\]
A linear space $X$ with the inner product $\langle\cdot,\cdot\rangle_X$ is called an inner product space and denoted by $(X, \langle\cdot,\cdot\rangle_X)$ (or simply by $X$ if the inner product $\langle\cdot,\cdot\rangle_X$ is clear from the context).

The following result gives a relationship between norms and inner products.
Proposition 1.1
Let $(X, \langle\cdot,\cdot\rangle_X)$ be an inner product space. Then the map $x \mapsto \sqrt{\langle x, x\rangle_X}$, $x \in X$, is a norm on $X$.

By Proposition 1.1, any inner product space $(X, \langle\cdot,\cdot\rangle_X)$ can be regarded as a normed linear space. We call $|x|_X = \sqrt{\langle x, x\rangle_X}$ the norm induced by $\langle\cdot,\cdot\rangle_X$.

Definition 1.7
An inner product space $X$ is called a Hilbert space if it is complete under the norm induced by its inner product.

In the rest of this section, unless otherwise stated, $\mathcal{X}$ and $\mathcal{Y}$ are normed linear spaces, and $X$ and $Y$ are Banach spaces.

Definition 1.8
A map $A : X \to Y$ is called a bounded linear operator if it is linear, i.e.,
\[
A(\alpha x + \beta y) = \alpha Ax + \beta Ay, \quad \forall\, x, y \in X,\ \alpha, \beta \in \mathbb{C},
\tag{1.3}
\]
and $A$ maps bounded subsets of $X$ into bounded subsets of $Y$.

Denote by $\mathcal{L}(X; Y)$ the set of all bounded linear operators from $X$ to $Y$. We simply write $\mathcal{L}(X)$ instead of $\mathcal{L}(X; X)$. For any $\alpha, \beta \in \mathbb{R}$ and $A, B \in \mathcal{L}(X; Y)$, we define $\alpha A + \beta B$ as follows:
\[
(\alpha A + \beta B)(x) = \alpha Ax + \beta Bx, \quad \forall\, x \in X.
\tag{1.4}
\]
Then $\mathcal{L}(X; Y)$ is also a linear space. Let
\[
|A|_{\mathcal{L}(X;Y)} \triangleq \sup_{x \in X \setminus \{0\}} \frac{|Ax|_Y}{|x|_X}.
\tag{1.5}
\]
One can show that $|\cdot|_{\mathcal{L}(X;Y)}$ defined by (1.5) is a norm on $\mathcal{L}(X; Y)$, and $\mathcal{L}(X; Y)$ is a Banach space under this norm.

We shall use the following two theorems.

Theorem 1.1 (Inverse Mapping). If $A \in \mathcal{L}(X; Y)$ is bijective, then $A^{-1} \in \mathcal{L}(Y; X)$.

Theorem 1.2 (Uniform Boundedness). If $\Lambda$ is a given index set, $A_\lambda \in \mathcal{L}(X; Y)$ for each $\lambda \in \Lambda$ and
\[
\sup_{\lambda \in \Lambda} |A_\lambda x|_Y < \infty, \quad \forall\, x \in X,
\]
then $\sup_{\lambda \in \Lambda} |A_\lambda|_{\mathcal{L}(X;Y)} < \infty$.

Let us consider the special case $Y = \mathbb{C}$. Any $f \in \mathcal{L}(X; \mathbb{C})$ is called a bounded linear functional on $X$. Hereafter, we write $X' = \mathcal{L}(X; \mathbb{C})$ and call it the dual (space) of $X$. We also denote
\[
f(x) = \langle f, x\rangle_{X', X}, \quad \forall\, x \in X.
\tag{1.6}
\]
The symbol $\langle\cdot,\cdot\rangle_{X',X}$ is referred to as the duality pairing between $X'$ and $X$. It follows from (1.5) that
\[
|f|_{X'} = \sup_{x \in X,\ |x|_X \le 1} |f(x)|, \quad \forall\, f \in X'.
\tag{1.7}
\]
Clearly, both $X'$ and $X'' \triangleq (X')'$ are Banach spaces. In particular, $X$ is called reflexive if $X'' = X$.

The following result is quite useful.

Theorem 1.3 (Hahn-Banach)
Let $X_0$ be a linear subspace of $X$ and $f_0 \in X_0'$. Then there exists $f \in X'$ with $|f|_{X'} = |f_0|_{X_0'}$, such that
\[
f(x) = f_0(x), \quad \forall\, x \in X_0.
\]

An immediate consequence of Theorem 1.3 is as follows:
Proposition 1.2
For any $x_0 \in X$, there is $f \in X'$ with $|f|_{X'} = 1$ such that $f(x_0) = |x_0|_X$.

We shall use the following theorem.
Theorem 1.4 (Riesz Representation). If $(X, \langle\cdot,\cdot\rangle_X)$ is a Hilbert space and $F \in X'$, then there is $y \in X$ such that
\[
F(x) = \langle x, y\rangle_X, \quad \forall\, x \in X.
\]

For any $A \in \mathcal{L}(X; Y)$, we define a map $A^* : Y' \to X'$ by the following:
\[
\langle A^* y', x\rangle_{X', X} = \langle y', Ax\rangle_{Y', Y}, \quad \forall\, y' \in Y',\ x \in X.
\tag{1.8}
\]
Clearly, $A^*$ is linear and bounded. We call $A^*$ the adjoint operator of $A$. It is easy to check that for any $A, B \in \mathcal{L}(X; Y)$ and $\alpha, \beta \in \mathbb{R}$, $(\alpha A + \beta B)^* = \alpha A^* + \beta B^*$.

Let $V$ and $V_1$ be separable Hilbert spaces, and let $\{e_j\}_{j=1}^\infty$ be an orthonormal basis of $V$. An operator $F \in \mathcal{L}(V; V_1)$ is called a Hilbert-Schmidt operator if $\sum_{j=1}^\infty |F e_j|_{V_1}^2 < \infty$. Denote by $\mathcal{L}_2(V; V_1)$ the space of all Hilbert-Schmidt operators from $V$ into $V_1$. One can show that $\mathcal{L}_2(V; V_1)$, equipped with the inner product
\[
\langle F, G\rangle_{\mathcal{L}_2(V; V_1)} = \sum_{j=1}^\infty \langle F e_j, G e_j\rangle_{V_1}, \quad \forall\, F, G \in \mathcal{L}_2(V; V_1),
\]
is a separable Hilbert space.

1.1.3 Unbounded linear operators

In this section, unless otherwise stated, $H$ is a Hilbert space.

Definition 1.9
For a linear map $A : D(A) \to H$, where $D(A)$ is a linear subspace of $H$, we call $D(A)$ the domain of $A$. The graph of $A$ is the subset of $H \times H$ consisting of all elements of the form $(x, Ax)$ with $x \in D(A)$. The operator $A$ is called closed (resp. densely defined) if its graph is a closed subspace of $H \times H$ (resp. $D(A)$ is dense in $H$).

Unlike bounded linear operators, the linear operator $A$ in Definition 1.9 might be unbounded, i.e., it may map some bounded sets in $H$ to unbounded sets in $H$. A typical densely defined, closed, unbounded linear operator (on $H = L^2(0,1)$) is $\frac{d}{dx}$, with the domain
\[
\Big\{ y \in L^2(0,1) \ \Big|\ y \text{ is absolutely continuous on } [0,1],\ \frac{dy}{dx} \in L^2(0,1),\ y(0) = 0 \Big\}.
\]

In the sequel, we shall assume that the operator $A$ is densely defined. The domain $D(A^*)$ of the adjoint operator $A^*$ of $A$ is defined as the set of all $f \in H$ such that, for some $g_f \in H$,
\[
\langle Ax, f\rangle_H = \langle x, g_f\rangle_H, \quad \forall\, x \in D(A).
\]
In this case, we define $A^* f \triangleq g_f$.

Denote by $I$ the identity operator on $H$. The resolvent set $\rho(A)$ of $A$ is defined by
\[
\rho(A) \triangleq \big\{ \lambda \in \mathbb{C} \ \big|\ \lambda I - A : D(A) \to H \text{ is bijective and } (\lambda I - A)^{-1} \in \mathcal{L}(H) \big\}.
\]
The resolvent set $\rho(A)$ is open in $\mathbb{C}$.

In this subsection, we recall some basic definitions and results on measure and integration. Let $\Omega$ be a nonempty set. For any subset $E \subset \Omega$, denote by $\chi_E(\cdot)$ the characteristic function of $E$, defined on $\Omega$, that is,
\[
\chi_E(\omega) =
\begin{cases}
1, & \text{if } \omega \in E,\\
0, & \text{if } \omega \in \Omega \setminus E.
\end{cases}
\]

Definition 1.10
Let $\mathcal{F}$ be a family of subsets of $\Omega$. $\mathcal{F}$ is called a $\sigma$-field on $\Omega$ if 1) $\Omega \in \mathcal{F}$; 2) $\Omega \setminus E \in \mathcal{F}$ for any $E \in \mathcal{F}$; and 3) $\bigcup_{i=1}^\infty E_i \in \mathcal{F}$ whenever each $E_i \in \mathcal{F}$.

If $\mathcal{F}$ is a $\sigma$-field on $\Omega$, then $(\Omega, \mathcal{F})$ is called a measurable space. Any element $E \in \mathcal{F}$ is called a measurable set on $(\Omega, \mathcal{F})$, or simply a measurable set. In the sequel, we shall fix a measurable space $(\Omega, \mathcal{F})$.

Definition 1.11
A set function $\mu : \mathcal{F} \to [0, +\infty]$ is called a measure on $(\Omega, \mathcal{F})$ if $\mu(\emptyset) = 0$ and $\mu$ is countably additive, i.e.,
\[
\mu\Big( \bigcup_{i=1}^\infty E_i \Big) = \sum_{i=1}^\infty \mu(E_i)
\]
whenever $\{E_i\}_{i=1}^\infty \subset \mathcal{F}$ are mutually disjoint, i.e., $E_i \cap E_j = \emptyset$ for all $i, j \in \mathbb{N}$ with $i \neq j$.

The triple $(\Omega, \mathcal{F}, \mu)$ is called a measure space. We shall fix below a measure space $(\Omega, \mathcal{F}, \mu)$.

Definition 1.12. The measure $\mu$ is called finite (resp. $\sigma$-finite) if $\mu(\Omega) < \infty$ (resp. there exists a sequence $\{E_i\}_{i=1}^\infty \subset \mathcal{F}$ so that $\Omega = \bigcup_{i=1}^\infty E_i$ and $\mu(E_i) < \infty$ for each $i \in \mathbb{N}$).

We call any $E \in \mathcal{F}$ a $\mu$-null (measurable) set if $\mu(E) = 0$.

Definition 1.13
The measure space $(\Omega, \mathcal{F}, \mu)$ is said to be complete (and $\mu$ is said to be complete on $\mathcal{F}$) if
\[
\mathcal{N} \triangleq \{ \widetilde{E} \subset \Omega \mid \widetilde{E} \subset E \text{ for some } \mu\text{-null set } E \} \subset \mathcal{F}.
\]

If the measure space $(\Omega, \mathcal{F}, \mu)$ is not complete, then the class $\overline{\mathcal{F}}$ of all sets of the form $(E \setminus N) \cup (N \setminus E)$, with $E \in \mathcal{F}$ and $N \in \mathcal{N}$ (see Definition 1.13 for $\mathcal{N}$), is a $\sigma$-field which contains $\mathcal{F}$ as a proper sub-class, and the set function $\bar{\mu}$ defined by $\bar{\mu}((E \setminus N) \cup (N \setminus E)) = \mu(E)$ is a complete measure on $\overline{\mathcal{F}}$. The measure $\bar{\mu}$ is called the completion of $\mu$.

If a certain proposition concerning the points of $(\Omega, \mathcal{F}, \mu)$ is true for every point $\omega \in \Omega$, with the exception at most of a set of points which form a $\mu$-null set, it is customary to say that the proposition is true $\mu$-a.e. (or simply a.e., if the measure $\mu$ is clear from the context). A function $g : (\Omega, \mathcal{F}, \mu) \to \mathbb{R}$ is called essentially bounded if it is bounded $\mu$-a.e., i.e., if there exists a positive, finite constant $c$ such that $\{\omega \in \Omega \mid |g(\omega)| > c\}$ is a $\mu$-null set. The infimum of the values of $c$ for which this statement is true is called the essential supremum of $|g|$, abbreviated to $\operatorname{ess\,sup}_{\omega \in \Omega} |g(\omega)|$.

Let $(\Omega_1, \mathcal{F}_1), \cdots, (\Omega_n, \mathcal{F}_n)$ ($n \in \mathbb{N}$) be measurable spaces. Denote by $\mathcal{F}_1 \times \cdots \times \mathcal{F}_n$ the $\sigma$-field (on the Cartesian product $\Omega_1 \times \cdots \times \Omega_n$) generated by the subsets of the form $E_1 \times \cdots \times E_n$, where $E_i \in \mathcal{F}_i$, $1 \le i \le n$. Let $\mu_i$ be a measure on $(\Omega_i, \mathcal{F}_i)$. We call $\mu$ a product measure on $(\Omega_1 \times \cdots \times \Omega_n, \mathcal{F}_1 \times \cdots \times \mathcal{F}_n)$ induced by $\mu_1, \cdots, \mu_n$ if
\[
\mu(E_1 \times \cdots \times E_n) = \prod_{i=1}^n \mu_i(E_i), \quad \forall\, E_i \in \mathcal{F}_i.
\]

Theorem 1.5
Let $(\Omega_1, \mathcal{F}_1, \mu_1), \cdots, (\Omega_n, \mathcal{F}_n, \mu_n)$ be $\sigma$-finite measure spaces. Then there is a unique product measure $\mu$ (denoted by $\mu_1 \times \cdots \times \mu_n$) on $(\Omega_1 \times \cdots \times \Omega_n, \mathcal{F}_1 \times \cdots \times \mathcal{F}_n)$ induced by $\{\mu_i\}_{i=1}^n$.

In the rest of this subsection, we fix a Banach space $X$.

Definition 1.14
The smallest $\sigma$-field containing all open sets of $X$ is called the Borel $\sigma$-field of $X$ and denoted by $\mathcal{B}(X)$. Any set $E \in \mathcal{B}(X)$ is called a Borel set (in $X$).

Let $(\Omega', \mathcal{F}')$ be another measurable space, and let $f : \Omega \to \Omega'$ be a map.

Definition 1.15
The above map $f$ is said to be $\mathcal{F}/\mathcal{F}'$-measurable, or simply $\mathcal{F}$-measurable, or even measurable (in the case that no confusion would occur), if $f^{-1}(\mathcal{F}') \subset \mathcal{F}$. In particular, if $(\Omega', \mathcal{F}') = (X, \mathcal{B}(X))$, then $f$ is said to be an ($X$-valued) $\mathcal{F}$-measurable (or simply measurable) function.

Let $P$ be a property concerning the above map $f$ at some elements in $\Omega$. We shall simply denote by $\{P\}$ the subset $\{\omega \in \Omega \mid P \text{ holds for } f(\omega)\}$. For example, when $\Omega' = \mathbb{R}$, to simplify the notation we denote $\{\omega \in \Omega \mid f(\omega) \ge 0\}$ by $\{f \ge 0\}$.

For a measurable map $f : (\Omega, \mathcal{F}) \to (\Omega', \mathcal{F}')$, it is obvious that $f^{-1}(\mathcal{F}')$ is a sub-$\sigma$-field of $\mathcal{F}$. We call it the $\sigma$-field generated by $f$, and denote it by $\sigma(f)$. Further, for a given index set $\Lambda$ and a family of measurable maps $\{f_\lambda\}_{\lambda \in \Lambda}$ (defined on $(\Omega, \mathcal{F})$, with possibly different ranges), we denote by $\sigma(f_\lambda;\ \lambda \in \Lambda)$ the $\sigma$-field generated by $\bigcup_{\lambda \in \Lambda} \sigma(f_\lambda)$.

Note that here and henceforth the $\sigma$-field $\mathcal{F}_1 \times \cdots \times \mathcal{F}_n$ does not stand for the Cartesian product of $\mathcal{F}_1, \cdots, \mathcal{F}_n$. More generally, when $X$ is a topological space, one can define its Borel $\sigma$-field $\mathcal{B}(X)$ in the same way. When $\Omega' = X$, we call $f$ an $X$-valued function.

Definition 1.16. Let $\{f_k\}_{k=1}^\infty$ be a sequence of $X$-valued functions defined on $\Omega$. The sequence $\{f_k\}_{k=1}^\infty$ is said to converge to $f$ (denoted by $\lim_{k\to\infty} f_k = f$) in $X$, $\mu$-a.e., if $\{\lim_{k\to\infty} f_k = f\} \in \mathcal{F}$ and $\mu(\{\lim_{k\to\infty} f_k \neq f\}) = 0$.

In the setting of infinite dimensions, the notion of $\mathcal{F}$-measurability in Definition 1.15 may not provide any means for approximation arguments. Thus, we need another notion of measurability. We restrict ourselves to Banach space valued functions, although some of the results below can be generalized to functions with values in metric spaces.

Definition 1.17
Let $f : \Omega \to X$ be an $X$-valued function.

1) We call $f(\cdot)$ an $\mathcal{F}$-simple function (or simply a simple function when $\mathcal{F}$ is clear from the context) if
\[
f(\cdot) = \sum_{i=1}^k \chi_{E_i}(\cdot)\, h_i,
\tag{1.9}
\]
for some $k \in \mathbb{N}$, $h_i \in X$, and mutually disjoint sets $E_1, \cdots, E_k \in \mathcal{F}$ satisfying $\bigcup_{i=1}^k E_i = \Omega$.

2) The function $f(\cdot)$ is said to be strongly $\mathcal{F}$-measurable with respect to (w.r.t. for short) $\mu$ (or simply strongly measurable) if there exists a sequence of $\mathcal{F}$-simple functions $\{f_k\}_{k=1}^\infty$ converging to $f$ in $X$, $\mu$-a.e. In this case, sometimes we also say that $f : (\Omega, \mathcal{F}, \mu) \to X$ is strongly measurable.

In order to characterize the strong measurability of $X$-valued functions, we introduce some terminology.

Definition 1.18
A function $f : \Omega \to X$ is called $\mu$-separably valued (or simply separably valued) if there exists a separable closed subspace $X_0 \subset X$ such that $f(\omega) \in X_0$ for $\mu$-a.e. $\omega \in \Omega$.

Theorem 1.6
Let $(\Omega, \mathcal{F}, \mu)$ be a $\sigma$-finite measure space. For a function $f : \Omega \to X$, the following assertions are equivalent:

1) $f$ is strongly measurable;

2) $f$ is separably valued and measurable.

By Theorem 1.6, if the Banach space $X$ is separable and the measure space $(\Omega, \mathcal{F}, \mu)$ is $\sigma$-finite, then $f$ is strongly measurable $\Leftrightarrow$ $f$ is measurable.

Let us fix below a $\sigma$-finite measure space $(\Omega, \mathcal{F}, \mu)$.

Definition 1.19
Let $f(\cdot)$ be an ($X$-valued) simple function in the form (1.9). We call $f(\cdot)$ Bochner integrable if $\mu(E_i) < \infty$ for each $i = 1, \cdots, k$. In this case, for any $E \in \mathcal{F}$, the Bochner integral of $f(\cdot)$ over $E$ is defined by
\[
\int_E f(s)\, d\mu = \sum_{i=1}^k \mu(E \cap E_i)\, h_i.
\]

In general, we have the following notion.
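As a concrete, entirely elementary illustration of Definition 1.19, the following sketch computes the Bochner integral of an $\mathbb{R}^2$-valued simple function on $\Omega = [0,1]$ with the Lebesgue measure. The intervals $E_i$ and values $h_i$ are toy choices of ours, not taken from the notes.

```python
import numpy as np

# The Bochner integral of a simple function f = sum_i chi_{E_i} h_i over a set
# E is sum_i mu(E cap E_i) h_i.  Here Omega = [0,1] with Lebesgue measure,
# X = R^2, and the E_i are intervals (toy data).
pieces = [((0.0, 0.3), np.array([1.0, 0.0])),   # E_1 = [0, 0.3),   h_1
          ((0.3, 0.7), np.array([0.0, 2.0])),   # E_2 = [0.3, 0.7), h_2
          ((0.7, 1.0), np.array([1.0, 1.0]))]   # E_3 = [0.7, 1],   h_3

def bochner_integral(E, pieces):
    """Bochner integral of the simple function over the interval E = (a, b)."""
    a, b = E
    total = np.zeros(2)
    for (c, d), h in pieces:
        overlap = max(0.0, min(b, d) - max(a, c))  # mu(E cap E_i)
        total += overlap * h
    return total

print(bochner_integral((0.0, 1.0), pieces))  # 0.3*h_1 + 0.4*h_2 + 0.3*h_3
```

Note that the integral is additive over disjoint sets $E$, exactly as one expects from the defining formula.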
Definition 1.20
A strongly measurable function $f(\cdot) : \Omega \to X$ is said to be Bochner integrable (w.r.t. $\mu$) if there exists a sequence of Bochner integrable simple functions $\{f_i(\cdot)\}_{i=1}^\infty$ converging to $f(\cdot)$ in $X$, $\mu$-a.e., so that
\[
\lim_{i,j\to\infty} \int_\Omega |f_i(s) - f_j(s)|_X\, d\mu = 0.
\]
In this case, we also say that $f(\cdot) : (\Omega, \mathcal{F}, \mu) \to X$ is Bochner integrable. For any $E \in \mathcal{F}$, the Bochner integral of $f(\cdot)$ over $E$ is defined by
\[
\int_E f(s)\, d\mu = \lim_{i\to\infty} \int_\Omega \chi_E(s) f_i(s)\, d\mu(s) \quad \text{in } X.
\tag{1.10}
\]
It is easy to verify that the limit in the right hand side of (1.10) exists and its value is independent of the choice of the sequence $\{f_i(\cdot)\}_{i=1}^\infty$. Clearly, when $X = \mathbb{R}^n$ (for some $n \in \mathbb{N}$), the above Bochner integral coincides with the usual Lebesgue integral for $\mathbb{R}^n$-valued functions.

The following result reveals the relationship between the Bochner integral (for vector-valued functions) and the usual Lebesgue integral (for scalar functions).

Theorem 1.7
Let $f(\cdot) : \Omega \to X$ be strongly measurable. Then $f(\cdot)$ is Bochner integrable (w.r.t. $\mu$) if and only if the scalar function $|f(\cdot)|_X : \Omega \to \mathbb{R}$ is integrable (w.r.t. $\mu$).

Further properties of the Bochner integral are collected as follows.
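Theorem 1.7 reduces vector-valued integrability to a scalar condition. A quick numerical sanity check, on a toy example of ours with $X = \mathbb{R}^2$ and $\Omega = (0,1]$: the function $f(t) = t^{-1/2}(\cos t, \sin t)$ blows up at $0$, yet $|f(t)|_X = t^{-1/2}$ is integrable, so $f$ is Bochner integrable and its integral is controlled by the scalar one.

```python
import numpy as np

# Toy check of the scalar criterion: f(t) = t**(-1/2) * (cos t, sin t) on (0,1].
# |f(t)| = t**(-1/2) is integrable (its integral over (0,1] is 2), so f is
# Bochner integrable despite the blow-up at t = 0.
t = np.linspace(1e-6, 1.0, 1_000_000)
dt = t[1] - t[0]
f = np.stack([np.cos(t), np.sin(t)]) / np.sqrt(t)

vector_integral = (f * dt).sum(axis=1)      # componentwise Riemann sum of f
scalar_integral = (dt / np.sqrt(t)).sum()   # Riemann sum of |f|, about 2

# the norm of the vector integral is dominated by the scalar integral
assert np.linalg.norm(vector_integral) <= scalar_integral
print(scalar_integral)
```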
Theorem 1.8
Let $f(\cdot), g(\cdot) : (\Omega, \mathcal{F}, \mu) \to X$ be Bochner integrable. Then:

1) For any $a, b \in \mathbb{R}$, the function $a f(\cdot) + b g(\cdot)$ is Bochner integrable, and for any $E \in \mathcal{F}$,
\[
\int_E \big( a f(s) + b g(s) \big)\, d\mu = a \int_E f(s)\, d\mu + b \int_E g(s)\, d\mu.
\]

2) For any $E \in \mathcal{F}$,
\[
\Big| \int_E f(s)\, d\mu \Big|_X \le \int_E |f(s)|_X\, d\mu.
\]

3) The Bochner integral is $\mu$-absolutely continuous, that is,
\[
\lim_{E \in \mathcal{F},\ \mu(E) \to 0} \int_E f(s)\, d\mu = 0 \quad \text{in } X.
\]

4) If $F \in \mathcal{L}(X; Y)$, then $F f(\cdot)$ is a $Y$-valued Bochner integrable function, and for any $E \in \mathcal{F}$,
\[
\int_E F f(s)\, d\mu = F \int_E f(s)\, d\mu.
\]

The following result, known as
Dominated Convergence Theorem, is very useful.
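A numerical sketch of the phenomenon, on toy data of ours with $X = \mathbb{R}^2$ on $[0,1]$: the functions $f_i(t) = (t^i,\, t/i)$ converge a.e. to $0$ and are dominated by $g(t) = 1 + t$, so their integrals must tend to the integral of the limit, namely $0$.

```python
import numpy as np

# Dominated convergence, numerically: f_i(t) = (t**i, t/i) on [0, 1] tends to
# (0, 0) for every t in [0, 1), and |f_i(t)| <= 1 + t =: g(t) for all i, so the
# integrals of f_i must shrink to 0 (toy illustration, not from the notes).
t = np.linspace(0.0, 1.0, 100_001)
dt = t[1] - t[0]

def integral(i):
    fi = np.stack([t**i, t / i])
    assert np.all(np.linalg.norm(fi, axis=0) <= 1 + t + 1e-12)  # |f_i| <= g
    return (fi * dt).sum(axis=1)                                 # Riemann sum

norms = [np.linalg.norm(integral(i)) for i in (1, 10, 100, 1000)]
print(norms)  # decreasing towards 0
```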
Theorem 1.9
Let $f : (\Omega, \mathcal{F}, \mu) \to X$ be strongly measurable, and let $g : (\Omega, \mathcal{F}, \mu) \to \mathbb{R}$ be a real-valued, nonnegative, integrable function. Assume that each $f_i : (\Omega, \mathcal{F}, \mu) \to X$ is Bochner integrable, that $|f_i|_X \le g$, $\mu$-a.e. for each $i \in \mathbb{N}$, and that $\lim_{i\to\infty} f_i = f$ in $X$, $\mu$-a.e. Then $f$ is Bochner integrable (w.r.t. $\mu$), and
\[
\lim_{i\to\infty} \int_E f_i(s)\, d\mu = \int_E f(s)\, d\mu \quad \text{in } X, \quad \forall\, E \in \mathcal{F}.
\]
Also, one has the following
Fubini Theorem (on Bochner integrals).
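Before the formal statement, here is a finite, discrete sanity check (a toy setup of ours): with weighted counting measures $\mu_1, \mu_2$ on finite sets and $f$ taking values in $X = \mathbb{R}^2$, the two iterated sums and the double sum against the product weights all agree.

```python
import numpy as np

rng = np.random.default_rng(0)
# Discrete Fubini: mu1, mu2 are weights on finite sets Omega_1, Omega_2, and
# f(t, s) takes values in X = R^2 (toy data, not from the notes).
mu1 = rng.random(5)          # measure of each point of Omega_1
mu2 = rng.random(7)          # measure of each point of Omega_2
f = rng.random((5, 7, 2))    # f(t, s) in R^2

# iterated integrals in both orders, and the double integral
int_s_then_t = ((f * mu2[None, :, None]).sum(axis=1) * mu1[:, None]).sum(axis=0)
int_t_then_s = ((f * mu1[:, None, None]).sum(axis=0) * mu2[:, None]).sum(axis=0)
double = (f * (mu1[:, None, None] * mu2[None, :, None])).sum(axis=(0, 1))

print(np.allclose(int_s_then_t, double), np.allclose(int_t_then_s, double))
```

For genuinely infinite measure spaces the $\sigma$-finiteness hypothesis below is essential; the discrete case only illustrates the bookkeeping.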
Theorem 1.10
Let $(\Omega_1, \mathcal{F}_1, \mu_1)$ and $(\Omega_2, \mathcal{F}_2, \mu_2)$ be $\sigma$-finite measure spaces. If $f(\cdot,\cdot) : \Omega_1 \times \Omega_2 \to X$ is Bochner integrable (w.r.t. $\mu_1 \times \mu_2$), then the functions $y(\cdot) \equiv \int_{\Omega_1} f(t, \cdot)\, d\mu_1(t)$ and $z(\cdot) \equiv \int_{\Omega_2} f(\cdot, s)\, d\mu_2(s)$ are a.e. defined on $\Omega_2$ and $\Omega_1$, respectively, and Bochner integrable w.r.t. $\mu_2$ and $\mu_1$, respectively. Moreover,
\[
\int_{\Omega_1 \times \Omega_2} f(t, s)\, d(\mu_1 \times \mu_2)(t, s) = \int_{\Omega_1} z(t)\, d\mu_1(t) = \int_{\Omega_2} y(s)\, d\mu_2(s).
\]

For $p \in [1, \infty)$, denote by $L^p_{\mathcal{F}}(\Omega; X) \triangleq L^p(\Omega, \mathcal{F}, \mu; X)$ the set of all (equivalence classes of) strongly measurable functions $f : (\Omega, \mathcal{F}, \mu) \to X$ such that $\int_\Omega |f|_X^p\, d\mu < \infty$. It is a Banach space with the norm
\[
|f|_{L^p_{\mathcal{F}}(\Omega; X)} = \Big( \int_\Omega |f|_X^p\, d\mu \Big)^{1/p}.
\]
When $X$ is a Hilbert space, so is $L^2_{\mathcal{F}}(\Omega; X)$. Denote by $L^\infty_{\mathcal{F}}(\Omega; X) \triangleq L^\infty(\Omega, \mathcal{F}, \mu; X)$ the set of all (equivalence classes of) strongly measurable ($X$-valued) functions $f$ such that $\operatorname{ess\,sup}_{\omega \in \Omega} |f(\omega)|_X < \infty$. It is also a Banach space, with the norm $|f|_{L^\infty_{\mathcal{F}}(\Omega; X)} = \operatorname{ess\,sup}_{\omega \in \Omega} |f(\omega)|_X$.

For $1 \le p \le \infty$ and any non-empty open subset $G$ of $\mathbb{R}^n$, we shall simply denote $L^p(G, \mathcal{L}, m; X)$ by $L^p(G; X)$, where $\mathcal{L}$ is the family of Lebesgue measurable sets in $G$, and $m$ is the Lebesgue measure on $G$. Also, we simply denote $L^p_{\mathcal{F}}(\Omega; \mathbb{R})$ and $L^p(G; \mathbb{R})$ by $L^p_{\mathcal{F}}(\Omega)$ and $L^p(G)$, respectively. In particular, if $G = (0, T) \subset \mathbb{R}$ for some $T > 0$, we simply write $L^p((0,T); X)$ and $L^p((0,T))$ as $L^p(0, T; X)$ and $L^p(0, T)$, respectively.

For any $p \in [1, \infty)$, denote by $p'$ the Hölder conjugate of $p$, i.e.,
\[
p' =
\begin{cases}
\dfrac{p}{p-1}, & \text{if } p > 1,\\[1mm]
\infty, & \text{if } p = 1.
\end{cases}
\]
The following result characterizes the dual space of $L^p_{\mathcal{F}}(\Omega; X)$.

Proposition 1.3
Let $X$ be a reflexive Banach space and $p \in [1, \infty)$. Then
\[
L^p_{\mathcal{F}}(\Omega; X)' = L^{p'}_{\mathcal{F}}(\Omega; X').
\tag{1.11}
\]

Let $\Phi : (\Omega, \mathcal{F}) \to (\Omega', \mathcal{F}')$ be a measurable map. Then, for the measure $\mu$ on $(\Omega, \mathcal{F})$, the map $\Phi$ induces a measure $\mu'$ on $(\Omega', \mathcal{F}')$ via
\[
\mu'(E') \triangleq \mu(\Phi^{-1}(E')), \quad \forall\, E' \in \mathcal{F}'.
\tag{1.12}
\]
The following is a change-of-variable formula:

Theorem 1.11
A function $f(\cdot) : \Omega' \to X$ is Bochner integrable w.r.t. $\mu'$ if and only if $f(\Phi(\cdot))$ (defined on $(\Omega, \mathcal{F})$) is Bochner integrable w.r.t. $\mu$. Furthermore,
\[
\int_{\Omega'} f(\omega')\, d\mu'(\omega') = \int_\Omega f(\Phi(\omega))\, d\mu(\omega).
\tag{1.13}
\]

In this subsection, we recall the notions of continuity and differentiability for vector-valued functions. Let $X$ and $Y$ be Banach spaces, let $\mathcal{X} \subset X$, and let $F : \mathcal{X} \to Y$ be a function (not necessarily linear).

Definition 1.21. 1) We say that $F$ is continuous at $x_0 \in \mathcal{X}$ if
\[
\lim_{x \in \mathcal{X},\, x \to x_0} |F(x) - F(x_0)|_Y = 0.
\]
If $F$ is continuous at each point of $\mathcal{X}$, we say that $F$ is continuous on $\mathcal{X}$.

2) We say that $F$ is Fréchet differentiable at $x_0 \in \mathcal{X}$ if there exists $F_1 \in \mathcal{L}(X; Y)$ such that
\[
\lim_{x \in \mathcal{X},\, x \to x_0} \frac{|F(x) - F(x_0) - F_1(x - x_0)|_Y}{|x - x_0|_X} = 0.
\tag{1.14}
\]
We call $F_1$ the Fréchet derivative of $F$ at $x_0$, and write $F_x(x_0) = F_1$. If $F$ is Fréchet differentiable at each point of $\mathcal{X}$, we say that $F$ is Fréchet differentiable on $\mathcal{X}$. Moreover, when the map $F_x : \mathcal{X} \to \mathcal{L}(X; Y)$ is continuous, we say that $F$ is continuously Fréchet differentiable on $\mathcal{X}$.

3) We say that $F$ is second order Fréchet differentiable at $x_0 \in \mathcal{X}$ if $F : \mathcal{X} \to Y$ is continuously Fréchet differentiable and there exists $F_2 \in \mathcal{L}(X; \mathcal{L}(X; Y))$ such that
\[
\lim_{x \in \mathcal{X},\, x \to x_0} \frac{|F_x(x) - F_x(x_0) - F_2(x - x_0)|_{\mathcal{L}(X;Y)}}{|x - x_0|_X} = 0.
\tag{1.15}
\]
We call $F_2$ the second order Fréchet derivative of $F$ at $x_0$, and write $F_{xx}(x_0) = F_2$. If $F$ is second order Fréchet differentiable at each point of $\mathcal{X}$, we say that $F$ is second order Fréchet differentiable on $\mathcal{X}$. Moreover, when the map $F_{xx} : \mathcal{X} \to \mathcal{L}(X; \mathcal{L}(X; Y))$ is continuous, we say that $F$ is second order continuously Fréchet differentiable on $\mathcal{X}$.

The set of all continuous (resp. continuously Fréchet differentiable, second order continuously Fréchet differentiable) functions from $\mathcal{X}$ to $Y$ is denoted by $C(\mathcal{X}; Y)$ (resp. $C^1(\mathcal{X}; Y)$, $C^2(\mathcal{X}; Y)$). When $Y = \mathbb{R}$, we simply denote it by $C(\mathcal{X})$ (resp. $C^1(\mathcal{X})$, $C^2(\mathcal{X})$).

Next, we recall the definition of bounded bilinear operators. Let $Z$ be another Banach space.

Definition 1.22
A mapping $M : X \times Z \to Y$ is called a bounded bilinear operator if $M$ is linear in each argument and there is a constant $C > 0$ such that
\[
|M(x, z)|_Y \le C |x|_X |z|_Z, \quad \forall\, (x, z) \in X \times Z.
\]

Denote by $\mathcal{L}(X, Z; Y)$ the set of all bounded bilinear operators from $X \times Z$ to $Y$. The norm of $M \in \mathcal{L}(X, Z; Y)$ is defined by
\[
|M|_{\mathcal{L}(X, Z; Y)} = \sup_{x \in X \setminus \{0\},\, z \in Z \setminus \{0\}} \frac{|M(x, z)|_Y}{|x|_X |z|_Z}.
\]
$\mathcal{L}(X, Z; Y)$ is a Banach space w.r.t. this norm.

Any $L \in \mathcal{L}(X; \mathcal{L}(Z; Y))$ defines a bounded bilinear operator $\widetilde{L} \in \mathcal{L}(X, Z; Y)$ as follows:
\[
\widetilde{L}(x, z) = \big( L(x) \big)(z), \quad \forall\, (x, z) \in X \times Z.
\]
Conversely, any $\widetilde{L} \in \mathcal{L}(X, Z; Y)$ defines an $L \in \mathcal{L}(X; \mathcal{L}(Z; Y))$ in the following way:
\[
\big( L(x) \big)(z) = \widetilde{L}(x, z), \quad \forall\, (x, z) \in X \times Z.
\]
Hence, any $F \in C(\mathcal{X}; \mathcal{L}(X; \mathcal{L}(Z; Y)))$ can be regarded as an element $\widetilde{F} \in C(\mathcal{X}; \mathcal{L}(X, Z; Y))$. In the rest of this paper, if there is no confusion, we identify $F \in C(\mathcal{X}; \mathcal{L}(X; \mathcal{L}(Z; Y)))$ with $\widetilde{F} \in C(\mathcal{X}; \mathcal{L}(X, Z; Y))$.

1.4 Generalized functions and Sobolev spaces

We begin by introducing some notation. For any $\alpha = (\alpha_1, \alpha_2, \cdots, \alpha_n)$ (with $\alpha_j \in \mathbb{N} \cup \{0\}$, $j = 1, \cdots, n$), put
\[
|\alpha| = \sum_{j=1}^n \alpha_j \quad \text{and} \quad \partial^\alpha \equiv \frac{\partial^{\alpha_1}}{\partial x_1^{\alpha_1}} \frac{\partial^{\alpha_2}}{\partial x_2^{\alpha_2}} \cdots \frac{\partial^{\alpha_n}}{\partial x_n^{\alpha_n}}.
\]
Let $G \subset \mathbb{R}^n$ be a bounded domain with the boundary $\Gamma$, and write $\overline{G}$ for its closure. For any $x_0 \in \mathbb{R}^n$ and $r > 0$, denote by $B(x_0, r)$ the open ball with radius $r$, centered at $x_0$.

Definition 1.23
We say the boundary $\Gamma$ (of $G$) is $C^k$ (for some $k \in \mathbb{N}$) if for each point $x_0 \in \Gamma$ there exist $r > 0$ and a $C^k$ function $\gamma : \mathbb{R}^{n-1} \to \mathbb{R}$ such that, upon relabeling and reorienting the coordinate axes if necessary,
\[
G \cap B(x_0, r) = \big\{ x \in B(x_0, r) \ \big|\ x_n > \gamma(x_1, \ldots, x_{n-1}) \big\}.
\]
We say that $\Gamma$ is $C^\infty$ if $\Gamma$ is $C^k$ for every $k \in \mathbb{N}$.

If $\Gamma$ is $C^1$, then along $\Gamma$ one can define the unit outward normal vector field (of $G$): $\nu = (\nu^1, \ldots, \nu^n)$. The unit outward normal vector at each point $x \in \Gamma$ is hence $\nu(x) = (\nu^1(x), \cdots, \nu^n(x))$. For any $u \in C^1(\overline{G})$, we call $\frac{\partial u}{\partial \nu} \triangleq \nabla u \cdot \nu$ the outward normal derivative of $u$.

Although most of the results in these notes hold when $\Gamma$ is $C^k$ for some suitable $k \in \mathbb{N}$ (usually, $k = 1$ or $2$), for simplicity, in the sequel we shall always assume that $\Gamma$ is $C^\infty$ (unless otherwise stated).

For any $m \in \mathbb{N} \cup \{0\}$, denote by $C^m(G)$ and $C^m(\overline{G})$ the sets of all $m$ times continuously differentiable functions on $G$ and $\overline{G}$, respectively, and by $C^m_0(G)$ the set of all functions $f \in C^m(G)$ such that
\[
\operatorname{supp} f \triangleq \overline{\{ x \in G \mid f(x) \neq 0 \}}
\]
is compact in $G$. $C^m(\overline{G})$ is a Banach space with the norm
\[
|f|_{C^m(\overline{G})} = \max_{x \in \overline{G}} \sum_{|\alpha| \le m} |\partial^\alpha f(x)|, \quad \forall\, f \in C^m(\overline{G}).
\tag{1.16}
\]
In the sequel, we shall write $C(\overline{G})$ for $C^0(\overline{G})$.

Denote by $C^\infty(G)$ the set of infinitely differentiable functions on $G$. Put
\[
C^\infty_0(G) = \big\{ f \in C^\infty(G) \ \big|\ \operatorname{supp} f \text{ is compact in } G \big\}.
\]

Definition 1.24
Let $\{f_k\}_{k=1}^\infty$ be a sequence in $C^\infty_0(G)$. We say that $f_k \to 0$ in $C^\infty_0(G)$ as $k \to \infty$ if:

1) for some compact subset $K$ in $G$, $\operatorname{supp} f_k \subset K$ for all $k \in \mathbb{N}$;

2) for every multi-index $\alpha = (\alpha_1, \cdots, \alpha_n)$,
\[
\sup_{x \in K} |\partial^\alpha f_k(x)| \to 0, \quad \text{as } k \to \infty.
\]
Generally, for $f \in C^\infty_0(G)$, we say that $f_k \to f$ in $C^\infty_0(G)$ as $k \to \infty$ if $f_k - f \to 0$ in $C^\infty_0(G)$ as $k \to \infty$.

Denote by $\mathcal{D}(G)$ the linear space $C^\infty_0(G)$ equipped with the sequential convergence given in Definition 1.24. A linear functional $F : \mathcal{D}(G) \to \mathbb{C}$ is said to be a $\mathcal{D}'(G)$ generalized function, in symbol $F \in \mathcal{D}'(G)$, if $F$ is continuous on $\mathcal{D}(G)$, i.e., for any sequence $\{f_k\}_{k=1}^\infty \subset C^\infty_0(G)$ with $f_k \to 0$ in $C^\infty_0(G)$ as $k \to \infty$, one has $\lim_{k\to\infty} F(f_k) = 0$.

Example 1.1
Any function $f \in L^1(G)$ can be identified with a $\mathcal{D}'(G)$ generalized function in the following way:
\[
F_f(\varphi) = \int_G f(x) \varphi(x)\, dx, \quad \forall\, \varphi \in \mathcal{D}(G).
\]

Example 1.2
Let $a \in G$. The Dirac $\delta_a$-function (in $G$) is defined as
\[
\delta_a(\varphi) = \varphi(a), \quad \forall\, \varphi \in \mathcal{D}(G).
\]
It is easy to show that $\delta_a \in \mathcal{D}'(G)$. Also, one can show that $\delta_a$ cannot be identified with a "usual" function in $G$.

Let $F \in \mathcal{D}'(G)$. The generalized derivative of $F$ w.r.t. $x_k$ (for some $k \in \{1, 2, \cdots, n\}$), denoted by $\frac{\partial F}{\partial x_k}$, is defined by
\[
\Big( \frac{\partial F}{\partial x_k} \Big)(\varphi) = -F\Big( \frac{\partial \varphi}{\partial x_k} \Big), \quad \forall\, \varphi \in \mathcal{D}(G).
\]
One can easily show that $\frac{\partial F}{\partial x_k} \in \mathcal{D}'(G)$.

Let $p \in [1, \infty)$. For $f \in C^m(\overline{G})$, define
\[
|f|_{W^{m,p}(G)} \triangleq \Big( \int_G \sum_{|\alpha| \le m} |\partial^\alpha f|^p\, dx \Big)^{1/p},
\tag{1.17}
\]
which is a norm on $C^m(\overline{G})$. Write $W^{m,p}(G)$ for the completion of $C^m(\overline{G})$ w.r.t. the norm (1.17). Similarly, the completion of $C^m_0(G)$ under the norm (1.17) is denoted by $W^{m,p}_0(G)$. For $p = 2$, we also write $H^m(G)$ for $W^{m,2}(G)$ and $H^m_0(G)$ for $W^{m,2}_0(G)$. Both $H^m(G)$ and $H^m_0(G)$ are Hilbert spaces with the inner product
\[
\langle f, g\rangle_{H^m(G)} = \int_G \sum_{|\alpha| \le m} \partial^\alpha f\, \partial^\alpha \bar{g}\, dx, \quad \forall\, f, g \in H^m(G).
\tag{1.18}
\]
It can be proved that a function $y \in W^{m,p}(G)$ if and only if there exist functions $f_\alpha \in L^p(G)$, $|\alpha| \le m$, such that
\[
\int_G y\, \partial^\alpha \varphi\, dx = (-1)^{|\alpha|} \int_G f_\alpha \varphi\, dx, \quad \forall\, \varphi \in C^\infty_0(G).
\tag{1.19}
\]
The above function $f_\alpha$ is referred to as the $\alpha$-th generalized derivative of $y$.

Noting that $C^\infty_0(G) \subset W^{m,p}_0(G)$, people introduce the following space:

Definition 1.25
Each element of $(W^{m,p}_0(G))'$, the dual space of $W^{m,p}_0(G)$, determines a $\mathcal{D}'(G)$ generalized function. All of these generalized functions form a subspace of $\mathcal{D}'(G)$, and we denote it by $W^{-m,p'}(G)$, where $p'$ is the Hölder conjugate of $p$. Especially, we denote $H^{-m}(G) = W^{-m,2}(G)$.

$W^{-m,p'}(G)$ is a Banach space with the canonical norm, i.e.,
\[
|F|_{W^{-m,p'}(G)} = \sup_{\varphi \in W^{m,p}_0(G) \setminus \{0\}} \frac{|F(\varphi)|}{|\varphi|_{W^{m,p}_0(G)}}, \quad \forall\, F \in W^{-m,p'}(G).
\]
Especially, $H^{-m}(G)$ is a Hilbert space.

Remark 1.1. Since $H^m_0(G)$ is a Hilbert space, by the classical Riesz representation theorem (i.e., Theorem 1.4), there exists an isomorphism between $H^m_0(G)$ and $H^{-m}(G) \equiv (H^m_0(G))'$. But this does not mean that these two spaces are the same. Indeed, the elements in $H^m_0(G)$ are functions with certain regularities, while the elements in $H^{-m}(G)$ need not even be "usual" functions.

All the above spaces $W^{m,p}(G)$, $W^{m,p}_0(G)$ and $W^{-m,p'}(G)$ are called Sobolev spaces.

Recall that $H$ is a Hilbert space.

Definition 1.26. A $C_0$-semigroup on $H$ is a family of bounded linear operators $\{S(t)\}_{t \ge 0}$ (on $H$) such that $S(t)S(s) = S(t+s)$ for any $s, t \ge 0$, $S(0) = I$ and, for any $y \in H$, $\lim_{t \to 0^+} S(t)y = y$ in $H$. When $|S(t)|_{\mathcal{L}(H)} \le 1$ for all $t \ge 0$, the semigroup is called contractive.

For any $C_0$-semigroup $\{S(t)\}_{t \ge 0}$ on $H$, define a linear operator $A$ on $H$ as follows:
\[
\begin{cases}
D(A) \triangleq \Big\{ x \in H \ \Big|\ \lim_{t \to 0^+} \dfrac{S(t)x - x}{t} \text{ exists in } H \Big\},\\[2mm]
Ax = \lim_{t \to 0^+} \dfrac{S(t)x - x}{t}, \quad \forall\, x \in D(A).
\end{cases}
\tag{1.20}
\]
We say that the above operator $A$ is the infinitesimal generator of $\{S(t)\}_{t \ge 0}$, or that $A$ generates the $C_0$-semigroup $\{S(t)\}_{t \ge 0}$. One can show that $D(A)$ is dense in $H$, $A$ is a closed operator, and $S(t)D(A) \subset D(A)$ for all $t \ge 0$.

Proposition 1.4
Let $\{S(t)\}_{t \ge 0}$ be a $C_0$-semigroup on $H$. Then:

1) There exist positive constants $M$ and $\alpha$ such that $|S(t)|_{\mathcal{L}(H)} \le M e^{\alpha t}$ for all $t \ge 0$;

2) $\dfrac{dS(t)x}{dt} = AS(t)x = S(t)Ax$ for every $x \in D(A)$ and $t \ge 0$;

3) For every $z \in D(A^*)$ and $x \in H$, the map $t \mapsto \langle z, S(t)x\rangle_H$ is differentiable and
\[
\frac{d}{dt} \langle z, S(t)x\rangle_H = \langle A^* z, S(t)x\rangle_H;
\]

4) If a function $y \in C([0, +\infty); D(A))$ satisfies $\dfrac{dy(t)}{dt} = Ay(t)$ for every $t \in [0, +\infty)$, then $y(t) = S(t)y(0)$;

5) $\{S(t)^*\}_{t \ge 0}$ is a $C_0$-semigroup on $H$ with the infinitesimal generator $A^*$.

Motivated by the conclusions 2) and 4) in Proposition 1.4, it is natural to consider the following evolution equation:
\[
\begin{cases}
y_t(t) = Ay(t), & \forall\, t > 0,\\
y(0) = y_0,
\end{cases}
\tag{1.21}
\]
and, for any $y_0 \in H$, we call $y(\cdot) = S(\cdot)y_0$ the mild solution to (1.21).

The following result is quite useful for showing that a specific unbounded operator under consideration generates a contraction semigroup.

Theorem 1.12
Let $A : D(A) \subset H \to H$ be a linear, densely defined, closed operator. If
\[
\langle Ax, x\rangle_H \le 0, \quad \forall\, x \in D(A)
\]
and
\[
\langle A^* x', x'\rangle_H \le 0, \quad \forall\, x' \in D(A^*),
\]
then $A$ generates a contraction semigroup on $H$.

Example 1.3
Let G be given as in Subsection 1.4 and convex, and let O ∈ R n satisfy | O | R n = 1 .Take H = L ( G ) . Write Γ − ∆ = { x ∈ Γ | O · ν ( x ) ≤ } . Define an unbounded linear operator A on H as follows: D ( A ) = (cid:8) f ∈ H ( G ) (cid:12)(cid:12) f = 0 on Γ − (cid:9) ,Af = − O · ∇ f, ∀ f ∈ D ( A ) . Then, by Theorem 1.12, A generates a contraction semigroup { S ( t ) } t ≥ on L ( G ) . This semigrouparises in the study of (homogeneous) transport equation: y t + O · ∇ y = 0 in G × (0 , + ∞ ) ,y = 0 on Γ − × (0 , + ∞ ) ,y (0) = y in G. (1.22) Example 1.4
Take H = L ( G ) , and define an unbounded linear operator A on H as follows: D ( A ) = H ( G ) T H ( G ) ,Af = ∆ f = n X k =1 ∂ f∂x k , ∀ f ∈ D ( A ) . Then, by Theorem 1.12, A generates a contraction semigroup { S ( t ) } t ≥ on L ( G ) . This semigrouparises in the study of (homogeneous) heat equation: y t − ∆ y = 0 in G × (0 , + ∞ ) ,y = 0 on Γ × (0 , + ∞ ) ,y (0) = y in G. (1.23) Example 1.5
Let H = H₀¹(G) × L²(G) and define an unbounded linear operator A on H as follows:

    D(A) = {(y, z)ᵀ | y ∈ H₀¹(G) ∩ H²(G), z ∈ H₀¹(G)},

    A(y, z)ᵀ = (z, Δy)ᵀ ≡ [[0, I], [Δ, 0]] (y, z)ᵀ, ∀ (y, z)ᵀ ∈ D(A).

By Theorem 1.12, A generates a contraction semigroup on H. This semigroup arises in the study of the following (homogeneous) wave equation:

    y_tt − Δy = 0 in G × (0,+∞),
    y = 0 on Γ × (0,+∞),        (1.24)
    y(0) = y₀, y_t(0) = y₁ in G.

In fact, if we set z = y_t, then (1.24) can be transformed into the following equation:

    w_t = Aw in (0,+∞),    w(0) = w₀,    (1.25)

where w = (y, z)ᵀ and w₀ = (y₀, y₁)ᵀ.
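The order reduction (1.24) → (1.25) can be illustrated for a single Fourier mode sin(kx), for which the wave equation reduces to the finite-dimensional system w′ = Aw with A = [[0, 1], [−k², 0]]. The sketch below (with illustrative k and initial data, not taken from the notes) uses the exact flow of this system and checks that the energy k²y² + z², the mode-wise analogue of the H₀¹ × L² norm, is conserved:

```python
import math

# Single-mode sketch of (1.24) -> (1.25): y'' = -k^2 y becomes w' = Aw with
# A = [[0, 1], [-k^2, 0]].  k, y0, z0 are illustrative constants.
k = 3.0

def flow(t, y0, z0):
    """Exact solution of y'' = -k^2 y with y(0) = y0, y'(0) = z0."""
    return (y0 * math.cos(k * t) + (z0 / k) * math.sin(k * t),
            -y0 * k * math.sin(k * t) + z0 * math.cos(k * t))

y0, z0 = 1.0, 0.5
E0 = k * k * y0 * y0 + z0 * z0           # mode energy at t = 0
for t in (0.1, 1.0, 7.3):
    y, z = flow(t, y0, z0)
    # the wave group is unitary: the energy is conserved exactly
    assert abs(k * k * y * y + z * z - E0) < 1e-9
```

The conservation reflects the fact that the wave semigroup here is in fact a unitary group, a special case of the contraction property given by Theorem 1.12.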
As applications of C₀-semigroups, one is concerned with the well-posedness of the following semilinear evolution equation:

    y_t(t) = Ay(t) + f(t, y(t)), t ∈ (0, T],    y(0) = y₀,    (1.26)

where A : D(A)(⊂ H) → H generates a C₀-semigroup {S(t)}_{t≥0} on H, y₀ ∈ H and f : [0,T] × H → H satisfies the following:

1) For each x ∈ H, f(·, x) : [0,T] → H is strongly measurable; and

2) There exists a constant C > 0 such that, for a.e. t ∈ [0,T],

    |f(t, x₁) − f(t, x₂)|_H ≤ C|x₁ − x₂|_H, ∀ x₁, x₂ ∈ H,    |f(t, 0)|_H ≤ C.    (1.27)

Definition 1.27
1) We call y ∈ C([0,T]; H) a strong solution to (1.26) if y(0) = y₀, y is differentiable for a.e. t ∈ (0,T), y(t) ∈ D(A) for a.e. t ∈ (0,T) and y_t(t) = Ay(t) + f(t, y(t)) for a.e. t ∈ (0,T).

2) We call y ∈ C([0,T]; H) a weak solution to (1.26) if for any φ ∈ D(A*) and t ∈ [0,T],

    ⟨y(t), φ⟩_H = ⟨y₀, φ⟩_H + ∫₀ᵗ (⟨y(s), A*φ⟩_H + ⟨f(s, y(s)), φ⟩_H) ds.    (1.28)
3) We call y ∈ C([0,T]; H) a mild solution to (1.26) if

    y(t) = S(t)y₀ + ∫₀ᵗ S(t−s) f(s, y(s)) ds, t ∈ [0,T].    (1.29)

It is not difficult to show the following result.

Proposition 1.5
A function y ∈ C([0,T]; H) is a mild solution to (1.26) if and only if it is a weak solution to the same equation.

Hereafter, we will not distinguish between mild and weak solutions to (1.26). The following result is concerned with the existence and uniqueness of solutions to (1.26).
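In one dimension the mild formulation (1.29) can be computed directly as a fixed point. The sketch below (with illustrative choices A = a < 0, so S(t) = e^{at}, and the Lipschitz nonlinearity f(t, y) = sin(y), none of which come from the notes) runs a Picard iteration on the integral equation and compares the result with an explicit Euler solve of the differential form:

```python
import math

# One-dimensional sketch of (1.29): the mild solution is the fixed point of
#   y(t) = e^{at} y0 + int_0^t e^{a(t-s)} sin(y(s)) ds.
# a, y0, T, n are illustrative choices.
a, y0, T, n = -1.0, 1.0, 1.0, 200
dt = T / n
t = [i * dt for i in range(n + 1)]

y = [y0] * (n + 1)                        # initial Picard guess
for _ in range(15):                       # Picard iteration (a contraction here)
    new = [y0]
    for i in range(1, n + 1):             # left-endpoint quadrature of the convolution
        conv = sum(math.exp(a * (t[i] - t[j])) * math.sin(y[j]) * dt
                   for j in range(i))
        new.append(math.exp(a * t[i]) * y0 + conv)
    y = new

w = y0                                    # explicit Euler on y' = a y + sin(y)
for _ in range(n):
    w += dt * (a * w + math.sin(w))

# both discretizations approximate the same solution at time T
assert abs(y[-1] - w) < 0.05
```

The agreement of the two computations is a finite-dimensional shadow of the equivalence of mild and strong/weak solutions discussed above.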
Proposition 1.6
For any y₀ ∈ H, (1.26) admits a unique mild solution y(·) ∈ C([0,T]; H), and

    |y(·)|_{C([0,T];H)} ≤ C(1 + |y₀|_H).

We note that the solution y to (1.26) is not necessarily differentiable because A is usually unbounded. This may cause trouble in some situations. The following convergence result can sometimes help us to remove such an obstacle.

Proposition 1.7
Let y be the mild solution to (1.26) and let y_λ be the strong solution to the following equation:

    y_λ(t) = S_λ(t)y₀ + ∫₀ᵗ S_λ(t−s) f(s, y_λ(s)) ds, t ∈ [0,T],    (1.30)

where {S_λ(t)}_{t≥0} is the C₀-semigroup generated by A_λ = λA(λI − A)^{-1}, with λ ∈ ρ(A), the resolvent set of A. Then,

    lim_{λ→∞} sup_{t∈[0,T]} |y_λ(t) − y(t)|_H = 0.    (1.31)

1.6 Probability, random variables and expectation

We first recall that a probability space is simply a measure space (Ω, F, P) for which P(Ω) = 1. In this case, P is called a probability measure.

In the sequel, we fix a probability space (Ω, F, P) and a Banach space X. Any ω ∈ Ω is called a sample point; any A ∈ F is called an event, and P(A) represents the probability of the event A. If an event A ∈ F is such that P(A) = 1, then we may alternatively say that A holds P-a.s., or simply that A holds a.s. (if the probability P is clear from the context).

Any X-valued, strongly measurable function f : (Ω, F) → (X, B(X)) is called an (X-valued) random variable. If f is Bochner integrable w.r.t. the measure P, then we denote the integral by E f and call it the mean or mathematical expectation of f.

For any p ∈ [1,∞], one can define the Banach spaces L^p_F(Ω; X) ≡ L^p(Ω, F, P; X) and L^p_F(Ω) as in Subsection 1.2. For any f ∈ L²_F(Ω; R^m) (for some m ∈ N), we define the variance of f by

    Var f = E[(f − E f)(f − E f)ᵀ].

Let
A, B ∈ F. We say that A and B are independent if P(A ∩ B) = P(A)P(B). Let J₁ and J₂ be two subfamilies of F. We say that J₁ and J₂ are independent if

    P(A ∩ B) = P(A)P(B), ∀ (A, B) ∈ J₁ × J₂.

Let f, g : (Ω, F) → (X, B(X)) be two random variables. We say that f and g (resp. f and J₁) are independent if σ(f) and σ(g) (resp. σ(f) and J₁) are independent.

Let X = (X₁, ···, X_m) : (Ω, F) → (R^m, B(R^m)) be a random variable. We call

    F(x) ≡ F(x₁, ···, x_m) ≜ P{X₁ ≤ x₁, ···, X_m ≤ x_m}

the distribution function of X. If for some nonnegative function f(·), one has

    F(x) ≡ F(x₁, ···, x_m) = ∫_{−∞}^{x₁} ··· ∫_{−∞}^{x_m} f(ξ₁, ···, ξ_m) dξ₁ ··· dξ_m,

then the function f(·) is called the density of X. If

    f(x) = (2π)^{−m/2} |det Q|^{−1/2} exp{ −(1/2)(x − λ)ᵀ Q^{−1} (x − λ) },

where λ ∈ R^m and Q is a positive definite m × m matrix, then X is called a normally distributed random variable (or X has a normal distribution), denoted by X ∼ N(λ, Q). When λ = 0 and Q is the m-dimensional identity matrix I_m, we call X a standard normally distributed random variable.

Let I = [0,T] with T > 0. A family of X-valued random variables {X(t)}_{t∈I} on (Ω, F, P) is called a stochastic process. In the sequel, we shall interchangeably use {X(t)}_{t∈I}, X(·) or even X to denote a (stochastic) process. For any ω ∈ Ω, the map t ↦ X(t, ω) is called a sample path (of X). X(·) is said to be continuous (resp. càdlàg, i.e., right-continuous with left limits) if there is a P-null set N ∈ F such that for any ω ∈ Ω \ N, the sample path X(·, ω) is continuous (resp. càdlàg) in X.

Definition 1.28
Two (X-valued) processes X₁(·) and X₂(·) are said to be stochastically equivalent if P({X₁(t) = X₂(t)}) = 1 for all t ∈ I. In this case, one is said to be a modification of the other.

Definition 1.29 We call a family of sub-σ-fields {F_t}_{t∈I} of F a filtration if F_{t₁} ⊂ F_{t₂} for all t₁, t₂ ∈ I with t₁ ≤ t₂. For any t ∈ [0,T), we put

    F_{t+} ≜ ∩_{s∈(t,T]} F_s,    F_{t−} ≜ ∪_{s∈[0,t)} F_s.

If F_{t+} = F_t (resp. F_{t−} = F_t), then {F_t}_{t∈I} is said to be right (resp. left) continuous.

In the sequel, we simply denote {F_t}_{t∈I} by F unless we want to emphasize what F_t or I exactly is. We call (Ω, F, F, P) a filtered probability space.

Definition 1.30
We say that (Ω, F, F, P) satisfies the usual condition if (Ω, F, P) is complete, F₀ contains all P-null sets in F, and F is right continuous.

In what follows, unless otherwise stated, we always assume that (Ω, F, F, P) satisfies the usual condition.

Definition 1.31
Let X(·) be an X-valued process.

1) X(·) is said to be measurable if the map (t, ω) ↦ X(t, ω) is strongly (B(I) × F)/B(X)-measurable;

2) X(·) is said to be F-adapted if it is measurable and, for each t ∈ I, the map ω ↦ X(t, ω) is strongly F_t/B(X)-measurable;

3) X(·) is said to be F-progressively measurable if for each t ∈ I, the map (s, ω) ↦ X(s, ω) from [0,t] × Ω to X is strongly (B([0,t]) × F_t)/B(X)-measurable.

Definition 1.32
A set A ⊂ I × Ω is called progressively measurable w.r.t. F if the process χ_A(·) is progressively measurable.

The class of all progressively measurable sets is a σ-field, called the progressive σ-field w.r.t. F, and denoted by F. One can show that a process φ : [0,T] × Ω → X is F-progressively measurable if and only if it is strongly F-measurable.

It is clear that if X(·) is F-progressively measurable, then it is F-adapted. Conversely, it can be proved that, for any F-adapted process X(·), there is an F-progressively measurable process X̃(·) which is stochastically equivalent to X(·). For this reason, in the sequel, by saying that a process X(·) is F-adapted, we mean that it is F-progressively measurable.

For any p, q ∈ [1,∞), write

    L^p_F(Ω; L^q(0,T; X)) ≜ { φ : (0,T) × Ω → X | φ(·) is F-adapted and E( ∫₀ᵀ |φ(t)|^q_X dt )^{p/q} < ∞ },

and

    L^q_F(0,T; L^p(Ω; X)) ≜ { φ : (0,T) × Ω → X | φ(·) is F-adapted and ∫₀ᵀ ( E|φ(t)|^p_X )^{q/p} dt < ∞ }.

Similarly, we may also define (for 1 ≤ p, q < ∞)

    L^∞_F(Ω; L^q(0,T; X)), L^p_F(Ω; L^∞(0,T; X)), L^∞_F(Ω; L^∞(0,T; X)),
    L^∞_F(0,T; L^p(Ω; X)), L^q_F(0,T; L^∞(Ω; X)), L^∞_F(0,T; L^∞(Ω; X)).

We simply denote L^p_F(Ω; L^p(0,T; X)) ≡ L^p_F(0,T; L^p(Ω; X)) by L^p_F(0,T; X), and further simply write L^p_F(0,T) for L^p_F(0,T; R).

For any p ∈ [1,∞), set

    L^p_F(Ω; C([0,T]; X)) ≜ { φ : [0,T] × Ω → X | φ(·) is continuous, F-adapted and E(|φ(·)|^p_{C([0,T];X)}) < ∞ }

and

    C_F([0,T]; L^p(Ω; X)) ≜ { φ : [0,T] × Ω → X | φ(·) is F-adapted and φ(·) : [0,T] → L^p_{F_T}(Ω; X) is continuous }.
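The two families of spaces just defined carry genuinely different norms: the C_F([0,T]; L^p) norm takes max over t of a moment, while the L^p_F(Ω; C([0,T])) norm takes the moment of a pathwise max, and the latter dominates. The toy computation below (a two-point time grid with an illustrative distribution, not from the notes) exhibits a strict gap for p = 2 by exact enumeration:

```python
import itertools
import math

# Toy gap between  max_t (E|phi(t)|^2)^{1/2}  and  (E max_t |phi(t)|^2)^{1/2}:
# phi(t_1) = xi_1, phi(t_2) = xi_2, with xi_1, xi_2 independent and each
# equal to 0 or 2 with probability 1/2 (illustrative choice).
outcomes = list(itertools.product([0.0, 2.0], repeat=2))   # 4 equally likely paths
p_each = 1.0 / len(outcomes)

# C_F([0,T]; L^2)-type norm: maximize the second moment over the two times
norm_CF = max(
    math.sqrt(sum(p_each * w[i] ** 2 for w in outcomes)) for i in range(2)
)
# L^2_F(Omega; C([0,T]))-type norm: second moment of the pathwise maximum
norm_L2C = math.sqrt(sum(p_each * max(w[0] ** 2, w[1] ** 2) for w in outcomes))

assert abs(norm_CF - math.sqrt(2.0)) < 1e-12
assert abs(norm_L2C - math.sqrt(3.0)) < 1e-12
assert norm_CF < norm_L2C        # E max dominates max E
```

This is why L^p_F(Ω; C([0,T]; H)) (as in Theorem 1.18 below) is a strictly stronger conclusion than C_F([0,T]; L^p(Ω; H)).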
One can show that both L^p_F(Ω; C([0,T]; X)) and C_F([0,T]; L^p(Ω; X)) are Banach spaces with the norms

    |φ(·)|_{L^p_F(Ω;C([0,T];X))} = [ E|φ(·)|^p_{C([0,T];X)} ]^{1/p}

and

    |φ(·)|_{C_F([0,T];L^p(Ω;X))} = max_{t∈[0,T]} [ E|φ(t)|^p_X ]^{1/p},

respectively. Also, we denote by D_F([0,T]; L^p(Ω; X)) the Banach space of all processes which are right continuous with left limits in L^p_{F_T}(Ω; X) w.r.t. t ∈ [0,T], with the norm | [E|φ(·)|^p_X]^{1/p} |_{L^∞(0,T)}.

Definition 1.33
A continuous, R^m-valued, F-adapted process {W(t)}_{t≥0} is called a (standard) m-dimensional Brownian motion if:

1) P({W(0) = 0}) = 1; and

2) For any s, t ∈ [0,∞) with 0 ≤ s < t < ∞, the random variable W(t) − W(s) is independent of F_s, and W(t) − W(s) ∼ N(0, (t−s)I_m).

Similarly, one can define R^m-valued Brownian motions over any time interval [a,b] or [a,b) with 0 ≤ a < b ≤ ∞.

In the rest of this paper, unless otherwise stated, we fix a 1-dimensional standard Brownian motion W(·) on (Ω, F, F, P). Write

    F_t^W ≜ σ(W(s); s ∈ [0,t]) ⊂ F_t, ∀ t ∈ I.    (1.32)

Generally, the filtration {F_t^W}_{t∈I} is left-continuous, but not necessarily right-continuous. Nevertheless, the augmentation {F̂_t^W}_{t∈I} of {F_t^W}_{t∈I} obtained by adding all P-null sets is continuous, and W(·) is still a Brownian motion on the (augmented) filtered probability space (Ω, F, {F̂_t^W}_{t∈I}, P). In the sequel, by saying that F is the natural filtration generated by W(·), we mean that F is generated as in (1.32) with the above augmentation, and hence it is continuous.

Let H be a Hilbert space. We now define the Itô integral

    ∫₀ᵀ X(t) dW(t)    (1.33)

of an H-valued, F-adapted stochastic process X(·) (satisfying suitable conditions) w.r.t. W(t). Note that one cannot define (1.33) as a Lebesgue–Stieltjes type integral by regarding ω as a parameter. Indeed, the map t (∈ [0,T]) ↦ W(t, ·) is not of bounded variation, a.s.

Write L₀ for the set of f ∈ L²_F(0,T; H) of the form

    f(t, ω) = Σ_{j=0}^n f_j(ω) χ_{[t_j, t_{j+1})}(t), (t, ω) ∈ [0,T] × Ω,    (1.34)

where n ∈ N, 0 = t₀ < t₁ < ··· < t_{n+1} = T, and f_j is F_{t_j}-measurable with

    sup{ |f_j(ω)|_H | j ∈ {0, ···, n}, ω ∈ Ω } < ∞.

One can show that L₀ is dense in L²_F(0,T; H).

Assume that f ∈ L₀ takes the form (1.34). Then we set

    I(f)(t, ω) = Σ_{j=0}^n f_j(ω) [W(t ∧ t_{j+1}, ω) − W(t ∧ t_j, ω)].
(1.35)

It is easy to show that I(f) ∈ L²_{F_t}(Ω; H) and the following Itô isometry holds:

    |I(f)|_{L²_{F_t}(Ω;H)} = |f|_{L²_F(0,t;H)}.    (1.36)

Generally, for f ∈ L²_F(0,T; H), one can find a sequence {f_k}_{k=1}^∞ ⊂ L₀ such that

    lim_{k→∞} |f_k − f|_{L²_F(0,T;H)} = 0.

Since |I(f_k) − I(f_j)|_{L²_{F_t}(Ω;H)} = |f_k − f_j|_{L²_F(0,t;H)}, {I(f_k)}_{k=1}^∞ is a Cauchy sequence in L²_{F_t}(Ω; H) and hence it converges to a unique element of L²_{F_t}(Ω; H), which is determined uniquely by f and is independent of the particular choice of {f_k}_{k=1}^∞. We call this element the Itô integral of f (w.r.t. the Brownian motion W(·)) on [0,t] and denote it by ∫₀ᵗ f dW.

For 0 ≤ s < t ≤ T, we call ∫₀ᵗ f dW − ∫₀ˢ f dW the Itô integral of f ∈ L²_F(0,T; H) (w.r.t. the Brownian motion W(·)) on [s,t]. We shall denote it by ∫ₛᵗ f(τ) dW(τ) or simply ∫ₛᵗ f dW.

For any p ∈ (0,∞), write L^{p,loc}_F(0,T; H) for the set of all H-valued, F-adapted stochastic processes f(·) satisfying only ∫₀ᵀ |f(t)|^p_H dt < ∞, a.s. For any p ∈ [1,∞), one can also define the Itô integral ∫₀ᵗ f dW for f ∈ L^{2,loc}_F(0,T; H), and especially for f ∈ L^p_F(Ω; L²(0,T; H)) (see [53] for more details). The Itô integral has the following properties.

Theorem 1.13
Let p ∈ [1,∞), let f, g ∈ L^p_F(Ω; L²(0,T; H)) and let a, b ∈ L^∞_{F_s}(Ω), 0 ≤ s < t ≤ T. Then:

1) ∫ₛᵗ f dW ∈ L^p_F(Ω; C([s,t]; H));

2) ∫ₛᵗ (af + bg) dW = a ∫ₛᵗ f dW + b ∫ₛᵗ g dW, a.s.;

3) E( ∫ₛᵗ f dW ) = 0;

4) When p ≥ 2,

    E ⟨ ∫ₛᵗ f dW, ∫ₛᵗ g dW ⟩_H = E ∫ₛᵗ ⟨f(r,·), g(r,·)⟩_H dr.

The following result, known as the Burkholder–Davis–Gundy inequality, links Itô's integral to the Lebesgue/Bochner integral.
Theorem 1.14
For any p ∈ [1,∞), there exists a constant C_p > 0 such that for any T > 0 and f ∈ L^p_F(Ω; L²(0,T; H)),

    C_p^{-1} E( ∫₀ᵀ |f(s)|²_H ds )^{p/2} ≤ E( sup_{t∈[0,T]} | ∫₀ᵗ f(s) dW(s) |^p_H ) ≤ C_p E( ∫₀ᵀ |f(s)|²_H ds )^{p/2}.

We need the following notion for an important class of stochastic processes.
Definition 1.34 An H-valued, F-adapted, continuous process X(·) is called an Itô process if there exist two H-valued stochastic processes φ(·) ∈ L^{1,loc}_F(0,T; H) and Φ(·) ∈ L^{2,loc}_F(0,T; H) such that

    X(t) = X(0) + ∫₀ᵗ φ(s) ds + ∫₀ᵗ Φ(s) dW(s), a.s., ∀ t ∈ [0,T].    (1.37)

The following fundamental result is known as Itô's formula.

Theorem 1.15
Let X(·) be given by (1.37). Let F : [0,T] × H → R be a function such that its partial derivatives F_t, F_x and F_xx are uniformly continuous on any bounded subset of [0,T] × H. Then,

    F(t, X(t)) − F(0, X(0))
    = ∫₀ᵗ F_x(s, X(s)) Φ(s) dW(s)
      + ∫₀ᵗ [ F_t(s, X(s)) + F_x(s, X(s)) φ(s) + (1/2)⟨F_xx(s, X(s)) Φ(s), Φ(s)⟩_H ] ds,
    a.s., ∀ t ∈ [0,T].    (1.38)

Remark 1.2
Usually, one writes the formula (1.38) in the following differential form:

    dF(t, X(t)) = F_x(t, X(t)) Φ(t) dW(t) + F_t(t, X(t)) dt + F_x(t, X(t)) φ(t) dt + (1/2)⟨F_xx(t, X(t)) Φ(t), Φ(t)⟩_H dt.

Theorem 1.15 works well for Itô processes in the (strong) form (1.37). However, this is usually too restrictive in the study of stochastic differential equations in infinite dimensions. Indeed, in the infinite dimensional setting one sometimes has to handle Itô processes in a weaker form, to be presented below.

Let V be a Hilbert space such that the embedding V ⊂ H is continuous and dense. Denote by V* the dual space of V w.r.t. the pivot space H. Hence, V ⊂ H = H* ⊂ V*, continuously and densely, and

    ⟨z, v⟩_{V, V*} = ⟨z, v⟩_H, ∀ (v, z) ∈ H × V.

We have the following Itô formula for Itô processes in a weak form.
Theorem 1.16
Suppose that X₀ ∈ L²_{F₀}(Ω; H), φ(·) ∈ L²_F(0,T; V*), and Φ(·) ∈ L^p_F(Ω; L²(0,T; H)) for some p ≥ 1. Let

    X(t) = X₀ + ∫₀ᵗ φ(s) ds + ∫₀ᵗ Φ(s) dW(s), t ∈ [0,T].    (1.39)

If X(·) ∈ L²_F(0,T; V), then X(·) ∈ C([0,T]; H), a.s., and for any t ∈ [0,T],

    |X(t)|²_H = |X₀|²_H + 2 ∫₀ᵗ ⟨φ(s), X(s)⟩_{V*, V} ds + 2 ∫₀ᵗ ⟨Φ(s), X(s)⟩_H dW(s) + ∫₀ᵗ |Φ(s)|²_H ds, a.s.    (1.40)

Remark 1.3
For simplicity, we usually write formula (1.40) in the following differential form:

    d|X(t)|²_H = 2⟨φ(t), X(t)⟩_{V*, V} dt + |Φ(t)|²_H dt + 2⟨Φ(t), X(t)⟩_H dW(t),

and denote |Φ(t)|²_H dt by |dX(t)|²_H for simplicity.

In the rest of these notes, unless otherwise stated, we shall always assume that H is a separable Hilbert space, and that A is an unbounded linear operator (with domain D(A)) on H which is the infinitesimal generator of a C₀-semigroup {S(t)}_{t≥0}.

Consider the following stochastic evolution equation (SEE for short):

    dX(t) = (AX(t) + F(t, X(t))) dt + F̃(t, X(t)) dW(t) in (0,T],    X(0) = X₀,    (1.41)

where X₀ ∈ L^p_{F₀}(Ω; H) (for some p ≥ 1), and F(·,·) and F̃(·,·) are two given functions from [0,T] × Ω × H to H.

First, let us give the definition of strong solutions to (1.41).

Definition 1.35 An H-valued, F-adapted, continuous stochastic process X(·) is called a strong solution to the equation (1.41) if:

1) X(t) ∈ D(A) for a.e. (t, ω) ∈ [0,T] × Ω and AX(·) ∈ L¹(0,T; H), a.s.;

2) F(·, X(·)) ∈ L¹(0,T; H) a.s. and F̃(·, X(·)) ∈ L^{2,loc}_F(0,T; H); and

3) For all t ∈ [0,T],

    X(t) = X₀ + ∫₀ᵗ (AX(s) + F(s, X(s))) ds + ∫₀ᵗ F̃(s, X(s)) dW(s), a.s.

Generally speaking, one needs very restrictive conditions to guarantee the existence of strong solutions to (1.41). Thus, two types of “weak” solutions are introduced.
Definition 1.36 An H-valued, F-adapted, continuous stochastic process X(·) is called a weak solution to (1.41) if F(·, X(·)) ∈ L¹(0,T; H) a.s., F̃(·, X(·)) ∈ L^{2,loc}_F(0,T; H), and for any t ∈ [0,T] and ξ ∈ D(A*),

    ⟨X(t), ξ⟩_H = ⟨X₀, ξ⟩_H + ∫₀ᵗ ( ⟨X(s), A*ξ⟩_H + ⟨F(s, X(s)), ξ⟩_H ) ds + ∫₀ᵗ ⟨F̃(s, X(s)), ξ⟩_H dW(s), a.s.

Definition 1.37 An H-valued, F-adapted, continuous stochastic process X(·) is called a mild solution to (1.41) if F(·, X(·)) ∈ L¹(0,T; H) a.s., F̃(·, X(·)) ∈ L^{2,loc}_F(0,T; H), and for any t ∈ [0,T],

    X(t) = S(t)X₀ + ∫₀ᵗ S(t−s) F(s, X(s)) ds + ∫₀ᵗ S(t−s) F̃(s, X(s)) dW(s), a.s.

Clearly, a strong solution to (1.41) is a weak/mild solution to the same equation. The following result provides a sufficient condition for a mild solution to be a strong solution to (1.41).
Proposition 1.8
A mild solution X(·) to (1.41) is a strong solution (to the same equation) if the following three conditions hold for all x ∈ H and 0 ≤ s ≤ t ≤ T, a.s.:

1) X₀ ∈ D(A), S(t−s)F(s, x) ∈ D(A) and S(t−s)F̃(s, x) ∈ D(A);

2) |AS(t−s)F(s, x)|_H ≤ α(t−s)|x|_H for some real-valued stochastic process α(·) ∈ L^{1,loc}_F(0,T), a.s.; and

3) |AS(t−s)F̃(s, x)|_H ≤ β(t−s)|x|_H for some real-valued stochastic process β(·) ∈ L^{2,loc}_F(0,T), a.s.

The next result gives the relationship between mild and weak solutions to (1.41).
Proposition 1.9
Any weak solution to (1.41) is also a mild solution to the same equation, and vice versa.
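In one dimension the mild-solution formula can be checked by simulation. The sketch below (with illustrative constants a, σ, X₀, not from the notes) takes A = a < 0 and additive noise, so dX = aX dt + σ dW has the mild solution X(t) = e^{at}X₀ + σ∫₀ᵗ e^{a(t−s)} dW(s); it discretizes the stochastic convolution by an exponential-Euler step and verifies the mean E X(T) = e^{aT}X₀ by Monte Carlo:

```python
import math
import random

# Monte-Carlo sketch of the mild-solution formula for dX = aX dt + sigma dW.
# a, sigma, X0 and the grid sizes are illustrative choices.
random.seed(3)
a, sigma, X0 = -1.0, 0.5, 2.0
T, n, paths = 1.0, 200, 20000
dt = T / n

acc = 0.0
for _ in range(paths):
    X = X0
    for _ in range(n):
        # exponential-Euler step: propagate by the semigroup, then add noise
        X = math.exp(a * dt) * X + sigma * random.gauss(0.0, math.sqrt(dt))
    acc += X
mean_XT = acc / paths

# E X(T) = e^{aT} X0, since the stochastic convolution has mean zero
assert abs(mean_XT - math.exp(a * T) * X0) < 0.02
```

The zero-mean property of the convolution term used in the final check is exactly property 3) of Theorem 1.13.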
By Proposition 1.9, in what follows we will not distinguish between the mild and the weak solutions to (1.41).

In the rest of this subsection, we assume that both F(·, x) and F̃(·, x) are F-adapted for each x ∈ H, and that there exist two nonnegative (real-valued) functions L₁(·) ∈ L¹(0,T) and L₂(·) ∈ L²(0,T) such that for a.e. t ∈ [0,T] and all y, z ∈ H,

    |F(t, y) − F(t, z)|_H ≤ L₁(t)|y − z|_H, a.s.,
    |F̃(t, y) − F̃(t, z)|_H ≤ L₂(t)|y − z|_H, a.s.,    (1.42)
    F(·, 0) ∈ L^p_F(Ω; L¹(0,T; H)), F̃(·, 0) ∈ L^p_F(Ω; L²(0,T; H)).

We have the following result:

Theorem 1.17
Let p ≥ 1. Then, there is a unique mild solution X(·) ∈ C_F([0,T]; L^p(Ω; H)) to (1.41). Moreover,

    |X(·)|_{C_F([0,T];L^p(Ω;H))} ≤ C( |X₀|_{L^p_{F₀}(Ω;H)} + |F(·,0)|_{L^p_F(Ω;L¹(0,T;H))} + |F̃(·,0)|_{L^p_F(Ω;L²(0,T;H))} ).    (1.43)

If the semigroup {S(t)}_{t≥0} is contractive, then one can obtain a better regularity of the mild solution w.r.t. the time variable, as follows:

Theorem 1.18 If A generates a contraction semigroup and p ≥ 1, then (1.41) admits a unique mild solution X(·) ∈ L^p_F(Ω; C([0,T]; H)). Moreover,

    |X(·)|_{L^p_F(Ω;C([0,T];H))} ≤ C( |X₀|_{L^p_{F₀}(Ω;H)} + |F(·,0)|_{L^p_F(Ω;L¹(0,T;H))} + |F̃(·,0)|_{L^p_F(Ω;L²(0,T;H))} ).    (1.44)

The following result indicates the regularity of mild solutions to a class of stochastic evolution equations.

Theorem 1.19
Let p ≥ 2. Assume that A is a self-adjoint, negative definite (unbounded linear) operator on H. Then the equation (1.41) admits a unique mild solution X(·) ∈ L^p_F(Ω; C([0,T]; H)) ∩ L^p_F(Ω; L²(0,T; D((−A)^{1/2}))). Moreover,

    |X(·)|_{L^p_F(Ω;C([0,T];H))} + |X(·)|_{L^p_F(Ω;L²(0,T;D((−A)^{1/2})))}
    ≤ C( |X₀|_{L^p_{F₀}(Ω;H)} + |F(·,0)|_{L^p_F(Ω;L¹(0,T;H))} + |F̃(·,0)|_{L^p_F(Ω;L²(0,T;H))} ).    (1.45)

Usually, a mild solution does not have “enough” regularity for many situations. For example, when establishing the pointwise identity for Carleman estimates on stochastic partial differential equations of second order, we actually need the functions under consideration to be twice differentiable, in the sense of generalized derivatives, w.r.t. the spatial variables. Nevertheless, such problems can be handled by the following strategy:

1. Introduce some auxiliary equations with strong solutions such that the limit of these strong solutions is the mild or weak solution to the original equation.

2. Obtain the desired properties for these strong solutions.

3. Use a density argument to establish the desired properties for mild/weak solutions to the original equation.

There are many methods to implement the above three steps in the setting of deterministic PDEs. Roughly speaking, any of these methods which does not destroy the adaptedness of the solution w.r.t. F can be applied to SEEs. Here we present only one approach.

Introduce a family of auxiliary equations for (1.41) as follows:

    dX_λ(t) = AX_λ(t) dt + R(λ)F(t, X_λ(t)) dt + R(λ)F̃(t, X_λ(t)) dW(t) in (0,T],
    X_λ(0) = R(λ)X₀ ∈ D(A).    (1.46)

Here λ ∈ ρ(A) and R(λ) ≜ λ(λI − A)^{-1}.

Theorem 1.20
For each X₀ ∈ L^p_{F₀}(Ω; H) with p ≥ 1 and each λ ∈ ρ(A), the equation (1.46) admits a unique strong solution X_λ(·) ∈ C_F([0,T]; L^p(Ω; D(A))). Moreover, as λ → ∞, the solution X_λ(·) converges to X(·) in C_F([0,T]; L^p(Ω; H)), where X(·) is the mild solution to (1.41).

Finally, we give a Burkholder–Davis–Gundy type inequality, which is very useful in the study of mild solutions to SEEs.
Proposition 1.10
For any p ≥ 1, there exists a constant C_p > 0 such that for any g ∈ L^p_F(Ω; L²(0,T; H)) and t ∈ [0,T],

    E | ∫₀ᵗ S(t−s) g(s) dW(s) |^p_H ≤ C_p E( ∫₀ᵗ |g(s)|²_H ds )^{p/2}.    (1.47)

1.10 Backward stochastic evolution equations

Backward stochastic differential equations and, more generally, backward stochastic evolution equations (BSEEs for short) are by-products of the study of stochastic control theory. These equations are of independent interest and are applied elsewhere.

Consider the following BSEE:

    dY(t) = −(A*Y(t) + F(t, Y(t), Z(t))) dt − Z(t) dW(t) in [0,T),    Y(T) = Y_T.    (1.48)

Here Y_T ∈ L²_{F_T}(Ω; H) and F : [0,T] × Ω × H × H → H is a given function.

Remark 1.4
The diffusion term “Z(t)dW(t)” in (1.48) may be replaced by the more general form “(F̃(t, Y(t)) + Z(t))dW(t)” (for a function F̃ : [0,T] × Ω × H → H). Clearly, the latter can be reduced to the former by means of the simple transformation

    Ỹ(·) = Y(·),    Z̃(·) = F̃(·, Y(·)) + Z(·),

under suitable assumptions on the function F̃.

Similarly to the case of SEEs, we introduce below the notions of strong, weak and mild solutions to the equation (1.48).
Definition 1.38 An H × H-valued process (Y(·), Z(·)) is called a strong solution to (1.48) if:

1) Y(·) is F-adapted and continuous, and Z(·) ∈ L^{2,loc}_F(0,T; H);

2) Y(t) ∈ D(A*) for a.e. (t, ω) ∈ [0,T] × Ω, and A*Y(·) ∈ L¹(0,T; H) and F(·, Y(·), Z(·)) ∈ L¹(0,T; H), a.s.; and

3) For any t ∈ [0,T],

    Y(t) = Y_T + ∫ₜᵀ (A*Y(s) + F(s, Y(s), Z(s))) ds + ∫ₜᵀ Z(s) dW(s), a.s.

Definition 1.39 An H × H-valued process (Y(·), Z(·)) is called a weak solution to (1.48) if:

1) Y(·) is F-adapted and continuous, Z(·) ∈ L^{2,loc}_F(0,T; H), and F(·, Y(·), Z(·)) ∈ L¹(0,T; H), a.s.; and

2) For any t ∈ [0,T] and η ∈ D(A),

    ⟨Y(t), η⟩_H = ⟨Y_T, η⟩_H + ∫ₜᵀ ⟨Y(s), Aη⟩_H ds + ∫ₜᵀ ⟨F(s, Y(s), Z(s)), η⟩_H ds + ∫ₜᵀ ⟨Z(s), η⟩_H dW(s), a.s.

Definition 1.40 An H × H-valued process (Y(·), Z(·)) is called a mild solution to (1.48) if:

1) Y(·) is F-adapted and continuous, Z(·) ∈ L^{2,loc}_F(0,T; H), and F(·, Y(·), Z(·)) ∈ L¹(0,T; H), a.s.; and

2) For any t ∈ [0,T],

    Y(t) = S(T−t)*Y_T + ∫ₜᵀ S(s−t)*F(s, Y(s), Z(s)) ds + ∫ₜᵀ S(s−t)*Z(s) dW(s), a.s.

Similarly to Proposition 1.9, we have the following result concerning the equivalence between weak and mild solutions to the equation (1.48).

Proposition 1.11 An H × H-valued process (Y(·), Z(·)) is a weak solution to (1.48) if and only if it is a mild solution to the same equation.

According to Proposition 1.11, in the rest of this paper we will not distinguish between the mild and weak solutions to (1.48).

Clearly, if (Y(·), Z(·)) is a strong solution to (1.48), then it is also a weak/mild solution to the same equation. On the other hand, starting from Definition 1.39, it is easy to show the following result.

Proposition 1.12
A weak/mild solution (Y(·), Z(·)) to (1.48) is a strong solution (to the same equation), provided that Y(t) ∈ D(A*) for a.e. (t, ω) ∈ [0,T] × Ω, and A*Y(·) ∈ L¹(0,T; H), a.s.

In the rest of this subsection, we assume that F(·, y, z) is F-adapted for all y, z ∈ H, and that there exist two nonnegative functions L₁(·) ∈ L¹(0,T) and L₂(·) ∈ L²(0,T) such that, for a.e. t ∈ [0,T],

    |F(t, y₁, z₁) − F(t, y₂, z₂)|_H ≤ L₁(t)|y₁ − y₂|_H + L₂(t)|z₁ − z₂|_H, ∀ y₁, y₂, z₁, z₂ ∈ H, a.s.,
    F(·, 0, 0) ∈ L¹_F(0,T; L²(Ω; H)).    (1.49)

Similarly to Theorem 1.17 (but here one needs the filtration F to be the natural one generated by W(·)), we have the following well-posedness result for the equation (1.48) in the sense of mild solution.

Theorem 1.21
Assume that F is the natural filtration generated by W(·). Then, for every Y_T ∈ L²_{F_T}(Ω; H), the equation (1.48) admits a unique mild solution (Y(·), Z(·)) ∈ L²_F(Ω; C([0,T]; H)) × L²_F(0,T; H), and

    |(Y, Z)|_{L²_F(Ω;C([0,T];H)) × L²_F(0,T;H)} ≤ C( |Y_T|_{L²_{F_T}(Ω;H)} + |F(·,0,0)|_{L¹_F(0,T;L²(Ω;H))} ).    (1.50)

Similarly to Theorem 1.19, the following result describes the smoothing effect of mild solutions to a class of backward stochastic evolution equations.

Theorem 1.22
Let F be the natural filtration generated by W(·), and let A be a self-adjoint, negative definite (unbounded linear) operator on H. Then, for any Y_T ∈ L²_{F_T}(Ω; H), the equation (1.48) admits a unique mild solution (Y(·), Z(·)) ∈ ( L²_F(Ω; C([0,T]; H)) ∩ L²_F(0,T; D((−A)^{1/2})) ) × L²_F(0,T; H). Moreover,

    |Y(·)|_{L²_F(Ω;C([0,T];H))} + |Y(·)|_{L²_F(0,T;D((−A)^{1/2}))} + |Z(·)|_{L²_F(0,T;H)}
    ≤ C( |Y_T|_{L²_{F_T}(Ω;H)} + |F(·,0,0)|_{L¹_F(0,T;L²(Ω;H))} ).    (1.51)

Next, for each λ ∈ ρ(A*), we introduce the following approximate equation of (1.48):

    dY_λ(t) = −(A*Y_λ(t) + R*(λ)F(t, Y(t), Z(t))) dt − R*(λ)Z_λ(t) dW(t) in [0,T),
    Y_λ(T) = R*(λ)Y_T,    (1.52)

where R*(λ) ≜ λ(λI − A*)^{-1}, Y_T ∈ L²_{F_T}(Ω; H) and (Y(·), Z(·)) is the mild solution to (1.48). Similarly to Theorem 1.20, we have the following result.

Theorem 1.23 Assume that F is the natural filtration generated by W(·). Then, for each Y_T ∈ L²_{F_T}(Ω; H) and λ ∈ ρ(A*), the equation (1.52) admits a unique strong solution (Y_λ(·), Z_λ(·)) ∈ L²_F(Ω; C([0,T]; D(A*))) × L²_F(0,T; D(A*)). Moreover, as λ → ∞, (Y_λ(·), Z_λ(·)) converges to (Y(·), Z(·)) in L²_F(Ω; C([0,T]; H)) × L²_F(0,T; H).

Note that, in Theorems 1.21–1.23, we need the filtration F to be the one generated by W(·). For a general filtration, as we shall see below, we need to employ the stochastic transposition method (developed in our previous works [47, 48]) to show the well-posedness of the equation (1.48).

In order to solve the BSEE (1.48) in a general filtered probability space, a fundamental idea in the stochastic transposition method ([47, 48]) is to introduce the following test SEE:

    dφ = (Aφ + ψ₁) ds + ψ₂ dW(s) in (t, T],    φ(t) = η,    (1.53)

where t ∈ [0,T), ψ₁ ∈ L¹_F(t,T; L²(Ω; H)), ψ₂ ∈ L²_F(t,T; H) and η ∈ L²_{F_t}(Ω; H).
By Theorem 1.17, the equation (1.53) admits a unique solution φ ∈ C_F([t,T]; L²(Ω; H)) such that

    |φ|_{C_F([t,T];L²(Ω;H))} ≤ C |(ψ₁(·), ψ₂(·), η)|_{L¹_F(t,T;L²(Ω;H)) × L²_F(t,T;H) × L²_{F_t}(Ω;H)}.    (1.54)

Definition 1.41
We call (Y(·), Z(·)) ∈ D_F([0,T]; L²(Ω; H)) × L²_F(0,T; H) a transposition solution to (1.48) if for any t ∈ [0,T], ψ₁(·) ∈ L¹_F(t,T; L²(Ω; H)), ψ₂(·) ∈ L²_F(t,T; H), η ∈ L²_{F_t}(Ω; H) and the corresponding solution φ ∈ C_F([t,T]; L²(Ω; H)) to (1.53), it holds that

    E⟨φ(T), Y_T⟩_H − E ∫ₜᵀ ⟨φ(s), F(s, Y(s), Z(s))⟩_H ds
    = E⟨η, Y(t)⟩_H + E ∫ₜᵀ ⟨ψ₁(s), Y(s)⟩_H ds + E ∫ₜᵀ ⟨ψ₂(s), Z(s)⟩_H ds.    (1.55)

Remark 1.5 If (1.48) admits a weak solution (Y(·), Z(·)) ∈ C_F([0,T]; L²(Ω; H)) × L²_F(0,T; H), then Itô's formula yields the variational equality (1.55). This is the very reason for us to introduce Definition 1.41.

We have the following result on the well-posedness of the equation (1.48) (see [48, Theorem 3.1]).
Theorem 1.24
The equation (1.48) admits a unique transposition solution (Y(·), Z(·)) ∈ D_F([0,T]; L²(Ω; H)) × L²_F(0,T; H). Furthermore,

    |(Y(·), Z(·))|_{D_F([0,T];L²(Ω;H)) × L²_F(0,T;H)} ≤ C( |Y_T|_{L²_{F_T}(Ω;H)} + |F(·,0,0)|_{L¹_F(0,T;L²(Ω;H))} ).    (1.56)

Unlike the case in which F is the natural filtration generated by W(·), the proof of Theorem 1.24 is based on a new stochastic Riesz-type representation theorem (proved in [44, 45]).

Remark 1.6
The stochastic transposition method works well not only for the vector-valued (i.e., H-valued) BSEE (1.48), but also (and more importantly) for the difficult operator-valued (i.e., L(H)-valued) BSEEs (see the equations (8.5) and (10.26), also [48, 49, 51, 54] and particularly [53] for more details).

2 Control systems governed by stochastic partial differential equations
It is well known that Control Theory was founded by N. Wiener in 1948. Since then, owing to the great efforts of numerous mathematicians and engineers, this theory has been greatly extended to various settings and widely used in science, technology, engineering and economics, and particularly in Artificial Intelligence in recent years. Usually, in terms of the state-space technique ([26]), one describes the control system under consideration by a suitable state equation.

Control theory for finite dimensional systems is now relatively mature. There exists a huge literature on control theory for deterministic distributed parameter systems (typically governed by partial differential equations), though the field is still quite active; the same can be said of control theory for stochastic (ordinary) differential equations (i.e., stochastic differential equations in finite dimensions). By contrast, control theory for stochastic distributed parameter systems (described by stochastic differential equations in infinite dimensions, typically by stochastic partial differential equations) is still at its very beginning stage, though it was “born” at almost the same time as that for deterministic distributed parameter control systems.

Because of their inherent complexity, many control systems in reality exhibit very complicated dynamics, including substantial model uncertainty, actuator and state constraints, and high (usually infinite) dimensionality. These systems are often best described by stochastic partial differential equations or by even more complicated stochastic equations in infinite dimensions (e.g., [59]).

Generally speaking, any PDE can be regarded as an SPDE provided that at least one of its coefficients, forcing terms, initial or boundary conditions is random. The terminology SPDE is a little misused and may mean different types of equations in different places.
The analysis of equations with random coefficients differs dramatically from that of equations with stochastic noises. In these notes, we focus on the latter.

The study of SPDEs is mainly motivated by two aspects:

• One is the rapid development of stochastic analysis in recent years. The topic of SPDEs is an interdisciplinary area involving both stochastic processes (random fields) and PDEs, and it has quickly become a discipline in itself. In the last two decades, SPDEs have been one of the most active areas in stochastic analysis.

• The other is the requirement from some branches of the sciences. In many phenomena in physics, biology, economics and control theory (including filter theory in particular), stochastic effects play a central role. Thus, stochastic corrections to the deterministic models are indispensable. These backgrounds have influenced the development of SPDEs remarkably.

In this section, we present two typical models described by SPDEs, in which some control actions may be introduced. The readers are referred to [7, 24] and so on for more SPDE systems.
Example 2.1 Stochastic parabolic equations
The following equation was introduced to describe the evolution of the density of a population (e.g., [8]):

dy = κ y_{xx} dt + α√y dW(t)   in (0,T)×(0,L),
y_x = 0   on (0,T)×{0,L},
y(0) = y_0   in (0,L),    (2.1)

where κ > 0, α > 0 and L > 0 are given constants and y_0 ∈ L²(0,L). The derivation of (2.1) is as follows.

Suppose that a bacteria population is distributed in the interval [0,L]. Denote by y(t,x) the density of this population at time t ∈ [0,T] and position x ∈ [0,L]. If there is no large-scale migration, the variation of the density is governed by

dy(t,x) = κ y_{xx}(t,x) dt + dξ(t,x,y).    (2.2)

In (2.2), κ y_{xx}(t,x) describes the population's diffusion from the high density place to the low one, while ξ(t,x,y) is a random perturbation caused by lots of small independent random disturbances. Suppose that the random perturbation ξ(t,x,y) at time t and position x can be approximated by a Gaussian stochastic process whose variance Var ξ is monotone w.r.t. y. In the study of bacteria, L is very small and we may assume that ξ is independent of x. Further, when y is small, the variance of ξ can be approximated by α² y, where α² is the derivative of Var ξ at y = 0. Under these assumptions, it follows that

dξ(t,x,y) = α√y dW(t).    (2.3)

Combining (2.2) and (2.3), we arrive at the first equation of (2.1). If no bacteria enter or leave [0,L] through its boundary, we have y_x(t,0) = y_x(t,L) = 0. As a result, we obtain the boundary condition of (2.1).

In order to change the population's density, one can put in or draw out some species. Under such actions, the equation (2.1) becomes a controlled stochastic parabolic equation:

dy = (κ y_{xx} + u) dt + (α√y + v) dW(t)   in (0,T)×(0,L),
y_x = 0   on (0,T)×{0,L},
y(0) = y_0   in (0,L),    (2.4)

where u and v are the ways of putting in or drawing out species.

Example 2.2 Stochastic wave equations
To study the vibration of a thin string/membrane perturbed by a random force, people introduced the following stochastic wave equation (e.g., [19]):

dy_t = ∆y dt + αy dW(t)   in (0,T)×G,
y = 0   on (0,T)×Γ,
y(0) = y_0, y_t(0) = y_1   in G.    (2.5)

Here G is given as in Subsection 1.4, α(·) is a suitable function, while (y_0, y_1) is an initial datum. Let us recall below a derivation of (2.5) for G = (0,L) (for some L > 0) by studying the motion of a DNA molecule in a fluid (e.g., [19, 57]).

Compared with its length, the diameter of such a DNA molecule is very small, and hence, it can be viewed as a thin and long elastic string. One can describe its position by using an R³-valued function y = (y¹, y², y³) defined on [0,L]×[0,+∞). Usually, a DNA molecule floats in a fluid. Thus, it is always struck by the fluid molecules, just as a particle of pollen floating in a fluid.

For simplicity, we assume that the mass of this string per unit length is equal to 1. Then, the acceleration at position x ∈ [0,L] along the string at time t ∈ [0,+∞) is y_{tt}(t,x). There are mainly four kinds of forces acting on the string: the elastic force F_1(t,x), the friction F_2(t,x) due to viscosity of the fluid, the impact force F_3(t,x) from the flowing of the fluid and the random impulse F_4(t,x) from the impacts of the fluid molecules. By Newton's Second Law, it follows that

y_{tt}(t,x) = F_1(t,x) + F_2(t,x) + F_3(t,x) + F_4(t,x).    (2.6)

Similar to the derivation of deterministic wave equations, the elastic force F_1(t,x) = y_{xx}(t,x). The friction depends on the property of the fluid and the velocity and shape of the string. When y, y_x and y_t are small, F_2(t,x) approximately depends on them linearly, i.e., F_2 = a_1 y_x + a_2 y_t + a_3 y for some suitable functions a_1, a_2 and a_3. From the classical theory of Statistical Mechanics (e.g., [67, Chapter VI]), the random impulse F_4(t,x) at time t and position x can be approximated by a Gaussian stochastic process with a covariance k(·,y), which also depends on the property of the fluid, and therefore we may assume that

F_4(t,x) = ∫_0^t k(x, y(s,x)) dW(s).

Thus, the equation (2.6) can be rewritten as

dy_t(t,x) = (y_{xx} + a_1 y_x + a_2 y_t + a_3 y) dt + F_3(t,x) dt + k(x, y(t,x)) dW(t).    (2.7)

When y is small, we may assume that k(·,y) is linear w.r.t. y. In this case, k(x, y(t,x)) = a_4(x) y(t,x) for a suitable function a_4(·). More generally, one may assume that a_4 depends on t (and even on the sample point ω ∈ Ω). Thus, (2.7) is reduced to the following:

dy_t(t,x) = (y_{xx} + a_1 y_x + a_2 y_t + a_3 y) dt + F_3(t,x) dt + a_4(t,x) y(t,x) dW(t).    (2.8)

Many biological behaviors are related to the motions of DNA molecules. Hence, there is a strong motivation for controlling these motions. In (2.8), F_3(·,·) in the drift term and some part of the diffusion term can be designed as controls acting on the fluid. Furthermore, one can introduce some forces on the endpoints of the molecule. In this way, we arrive at the following controlled stochastic wave equation in one space dimension:

dy_t = (y_{xx} + a_1 y_x + a_2 y_t + a_3 y + u) dt + (a_4 y + v) dW(t)   in (0,T)×(0,L),
y = f_1   on (0,T)×{0},
y = f_2   on (0,T)×{L},
y(0) = y_0, y_t(0) = y_1   in (0,L).    (2.9)

In (2.9), (f_1, f_2, u, v) are controls belonging to some suitable set, and a_4 is a suitable function. A natural question is as follows:

Can we drive the DNA molecule (governed by (2.9)) from any (y_0, y_1) to any target (z_0, z_1) at time T by choosing suitable controls (f_1, f_2, u, v)?

Clearly, this is an exact controllability problem for the stochastic wave equation (2.9).
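Before turning to controllability, the free dynamics of the stochastic wave model above can be explored numerically. The following is a minimal pure-Python sketch, not taken from the text: it sets the fluid impact force and the lower-order friction coefficients to zero, takes a constant multiplicative noise intensity (called a4 in the code), and all grid sizes and coefficient values are illustrative assumptions. One Brownian increment per time step drives the velocity, and the displacement is advanced by a semi-implicit (symplectic) Euler step.

```python
import math
import random

random.seed(1)

# Illustrative explicit scheme for dy_t = y_xx dt + a4 * y dW(t) on (0, L_len),
# with y = 0 at both endpoints. All parameter values are assumptions.
L_len, T = 1.0, 0.5
nx, nt = 40, 4000
dx, dt = L_len / (nx + 1), T / nt
a4 = 0.3  # constant multiplicative noise intensity (an assumption)

x = [(i + 1) * dx for i in range(nx)]               # interior grid points
y = [math.sin(math.pi * xi / L_len) for xi in x]    # initial displacement
v = [0.0] * nx                                      # initial velocity

for _ in range(nt):
    dW = random.gauss(0.0, math.sqrt(dt))           # one Brownian increment per step
    for i in range(nx):
        yl = y[i - 1] if i > 0 else 0.0             # Dirichlet boundary condition
        yr = y[i + 1] if i < nx - 1 else 0.0
        # velocity update from the discrete Laplacian plus multiplicative noise
        v[i] += (yl - 2.0 * y[i] + yr) / dx ** 2 * dt + a4 * y[i] * dW
    for i in range(nx):
        y[i] += v[i] * dt                           # displacement update

# discrete energy: kinetic part plus elastic part
grad = [(y[i + 1] - y[i]) / dx for i in range(nx - 1)]
energy = sum(vi ** 2 for vi in v) * dx + sum(g ** 2 for g in grad) * dx
```

For these parameter values the discrete energy stays of order one on this time horizon: the multiplicative noise injects energy only moderately, in contrast with the boundary controls, whose effect is the subject of the controllability question above.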
Although there are four controls in the system (2.9), as we shall show in Section 6, the answer to the above question is negative for controls belonging to some reasonable set.

On the other hand, though the system (2.9) is not exactly controllable, it may well happen that for some (y_0, y_1) and (z_0, z_1), there exist some controls (f_1, f_2, u, v) (in some Hilbert space H) so that the corresponding solutions to (2.9) verify y(T) = z_0 and y_t(T) = z_1, a.s. Usually, such controls are not unique, and therefore, one may hope to find an optimal one among them, which minimizes the functional |(f_1, f_2, u, v)|_H. This is typically an optimal control problem for the stochastic wave equation (2.9).

The above are controllability and optimal control problems, actually the two fundamental problems that we shall be concerned with in the rest of these notes.

3 Some simple results on controllability and observability of SEEs

In this section, we shall present some results on the controllability and observability of SEEs. As the title suggests, these results are simple and their proofs are easy. However, some of them may not be obvious for beginners. On the other hand, some results in this respect reveal new phenomena for controllability problems in the stochastic setting. Unless otherwise stated, the content of this section is taken from [53, Chapter 7].

Throughout this section, U is a given Hilbert space. We first consider the following control system evolved on H:

dX(t) = (A + F(t))X(t)dt + Bu(t)dt + K(t)X(t)dW(t)   in (0,T],
X(0) = X_0.    (3.1)

In (3.1), X_0 ∈ H, F, K ∈ L^∞_F(0,T; L(H)), B ∈ L(U; H) and u ∈ L²_F(0,T; U).

We begin with the notion of exact controllability of (3.1).

Definition 3.1
The system (3.1) is called exactly controllable at time T if for any X_0 ∈ H and X_1 ∈ L²_{F_T}(Ω; H), there is a control u ∈ L²_F(0,T; U) such that the corresponding solution X(·) to (3.1) fulfills X(T) = X_1, a.s.

Define a function η(·) on [0,T] by

η(t) = 1, for t ∈ [(1 − 2^{−j})T, (1 − 2^{−j−1})T), j = 0, 2, 4, ···,
η(t) = −1, otherwise.    (3.2)

Let us recall the following known result ([63, Lemma 2.1]).

Lemma 3.1 Let ξ = ∫_0^T η(t) dW(t). It is impossible to find (ϱ_1, ϱ_2) ∈ L¹_F(0,T) × C_F([0,T]; L²(Ω)) and x_0 ∈ R such that

ξ = x_0 + ∫_0^T ϱ_1(t) dt + ∫_0^T ϱ_2(t) dW(t).    (3.3)

We have the following negative controllability result for the system (3.1).

Theorem 3.1 Let K(·) ∈ C_F([0,T]; L^∞(Ω; L(H))). Then the system (3.1) is not exactly controllable at any time T > 0.

Proof. Without loss of generality, we assume that F(·) = 0. Let ϕ ∈ D(A*) be such that |ϕ|_H = 1. We use the contradiction argument. Suppose that (3.1) were exactly controllable at some time T > 0. Then, for X_0 = 0 and X_1 = (∫_0^T η(t) dW(t))ϕ with η(·) given by (3.2), one could find a control u ∈ L²_F(0,T; U) such that the corresponding solution X(·) to (3.1) fulfills X(T) = X_1, a.s. Hence, by the definition of weak solutions to (3.1), it follows that

∫_0^T η(t) dW(t) = ∫_0^T (⟨X(t), A*ϕ⟩_H + ⟨Bu(t), ϕ⟩_H) dt + ∫_0^T ⟨K(t)X(t), ϕ⟩_H dW(t).    (3.4)

Since X(·) ∈ C_F([0,T]; L²(Ω; H)), we deduce that ⟨X(·), A*ϕ⟩_H + ⟨Bu(·), ϕ⟩_H ∈ L¹_F(0,T) and ⟨K(·)X(·), ϕ⟩_H ∈ C_F([0,T]; L²(Ω)). Clearly, (3.4) contradicts Lemma 3.1.

Theorem 3.1 shows that an SEE is not exactly controllable when the control, which is merely square integrable w.r.t. the time variable, acts only in the drift term. Further results for controllability problems of such stochastic control systems can be found in [9, 63, 75].

To begin with, let us introduce the following concepts.
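The obstruction in Lemma 3.1 is not that ξ fails to be a stochastic integral: ξ = ∫_0^T η dW is a perfectly good Itô integral with E ξ² = ∫_0^T η(t)² dt = T by the Itô isometry; what fails is the time regularity of any representation of the form (3.3). The following pure-Python Monte Carlo sketch (grid and sample sizes are illustrative assumptions) checks the isometry numerically; since |η| = 1 everywhere, the result does not depend on the precise dyadic switching pattern used for η.

```python
import math
import random

random.seed(2)

# Monte Carlo check of the Ito isometry E xi^2 = int_0^T eta(t)^2 dt = T
# for xi = int_0^T eta dW, with a +1/-1 profile whose switches accumulate at T.
T, n, paths = 1.0, 512, 4000
dt = T / n

def eta(t):
    # +1 on dyadic intervals accumulating at T, -1 elsewhere; only |eta| = 1
    # matters for the isometry computed below
    for j in range(0, 60, 2):
        lo, hi = (1.0 - 2.0 ** (-j)) * T, (1.0 - 2.0 ** (-j - 1)) * T
        if lo <= t < hi:
            return 1.0
    return -1.0

second_moment = 0.0
for _ in range(paths):
    # Riemann-Ito sum: sum of eta(t_k) * (W(t_{k+1}) - W(t_k))
    xi = sum(eta(k * dt) * random.gauss(0.0, math.sqrt(dt)) for k in range(n))
    second_moment += xi ** 2
second_moment /= paths
```

The sample second moment comes out close to T = 1, as the isometry predicts; yet, by Lemma 3.1, no drift-plus-continuous-diffusion representation (3.3) of ξ exists, which is exactly what defeats exact controllability in Theorem 3.1.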
Definition 3.2
The system (3.1) is called null controllable at time T if for any X_0 ∈ H, there is a control u ∈ L²_F(0,T; U) such that the corresponding solution X(·) to (3.1) fulfills X(T) = 0, a.s.

Definition 3.3 The system (3.1) is called approximately controllable at time T if for any X_0 ∈ H, X_1 ∈ L²_{F_T}(Ω; H) and ε > 0, there is a control u ∈ L²_F(0,T; U) such that the corresponding solution X(·) to (3.1) fulfills |X(T) − X_1|_{L²_{F_T}(Ω;H)} < ε.

It is quite easy to study the null controllability of (3.1) when the operator F(·) is "deterministic", i.e., F(·) ∈ L^∞(0,T; L(H)). In this case, let us introduce the following deterministic control system:

dX̂(t)/dt = (A + F(t))X̂(t) + Bû(t)   in (0,T],
X̂(0) = X_0,    (3.5)

where X_0 ∈ H and û ∈ L²(0,T; U). Similarly to Definition 3.2, the system (3.5) is called null controllable at time T if for any X_0 ∈ H, there is a control û ∈ L²(0,T; U) such that the corresponding solution to (3.5) satisfies X̂(T) = 0. We have the following result:

Theorem 3.2 Suppose that F(·) ∈ L^∞(0,T; L(H)) and K(·) ∈ L^∞_F(0,T). Then, the system (3.1) is null controllable at time T if and only if so is the system (3.5).

Proof. The "only if" part. For arbitrary X_0 ∈ H, let u(·) ∈ L²_F(0,T; U) be a control which steers the solution of (3.1) to 0 at time T. Then, clearly, EX is the solution to (3.5) with the control û = Eu ∈ L²(0,T; U). Since X(T) = 0, a.s., we have EX(T) = 0, which implies that (3.5) is null controllable at time T.

The "if" part. For a given X_0 ∈ H, let û(·) ∈ L²(0,T; U) be a control which steers the solution of (3.5) to 0 at time T. Let

X(t) = e^{∫_0^t K(s)dW(s) − ½∫_0^t K(s)²ds} X̂(t).

Then, X(·) is the unique solution of (3.1) with the control

u(·) = e^{∫_0^· K(s)dW(s) − ½∫_0^· K(s)²ds} û(·).

Since X̂(T) = 0, we have

X(T) = e^{∫_0^T K(s)dW(s) − ½∫_0^T K(s)²ds} X̂(T) = 0.

Further,

E∫_0^T |u(t)|²_U dt = ∫_0^T |û(t)|²_U E[e^{2∫_0^t K(s)dW(s) − ∫_0^t K(s)²ds}] dt ≤ C∫_0^T |û(t)|²_U dt < ∞.

This completes the proof of Theorem 3.2.
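The key point in the "if" part is that the exponential factor relating X and X̂ is a mean-one exponential martingale, so multiplying by it does not change expectations. A scalar Monte Carlo sketch (a constant K and all numerical values are assumptions made only for this illustration):

```python
import math
import random

random.seed(3)

# The factor e^{int_0^t K dW - (1/2) int_0^t K^2 ds} has expectation 1,
# so X = factor * Xhat satisfies E X(t) = Xhat(t). Scalar toy check.
T, n, paths, K = 1.0, 100, 10000, 0.7
dt = T / n

mean_factor = 0.0
for _ in range(paths):
    # simulate int_0^T K dW as a sum of independent Gaussian increments
    stoch_int = sum(K * random.gauss(0.0, math.sqrt(dt)) for _ in range(n))
    mean_factor += math.exp(stoch_int - 0.5 * K ** 2 * T)
mean_factor /= paths
```

The sample mean of the factor comes out close to 1, consistent with the martingale property used in the proof; the same computation with the ½∫K²ds correction omitted would instead give a mean near e^{K²T/2}.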
Remark 3.1
The technique in the proof of Theorem 3.2 cannot be applied to get the exact controllability or approximate controllability of the system (3.1). Indeed, let us assume that (3.5) is exactly controllable, i.e., for any X_0, X_1 ∈ H, there exists a control û such that the solution to (3.5) fulfills X̂(T) = X_1. Put

A_T ≜ {X(T) | X(·) solves (3.1) for some X_0 ∈ H and u(·) = e^{∫_0^· K(s)dW(s) − ½∫_0^· K(s)²ds} û(·) for û(·) ∈ L²(0,T; U)}.

Then, from the exact controllability of (3.5), we see that

A_T = {e^{∫_0^T K(s)dW(s) − ½∫_0^T K(s)²ds} X_1 | X_1 ∈ H}.

Clearly, A_T ≠ L²_{F_T}(Ω; H) and A_T is not dense in L²_{F_T}(Ω; H). Consequently, by means of such controls, the system (3.1) is neither exactly controllable nor approximately controllable.

Remark 3.2
When A generates a C_0-group on H, the null controllability of (3.5) implies the exact/approximate controllability of (3.5). From Remark 3.1, it is easy to see that this is no longer true for (3.1), which is a new phenomenon in the stochastic setting.

For deterministic control systems, if the control operator B is invertible, then it is easy to get controllability results. The following theorem shows that the same holds for stochastic control systems.

Theorem 3.3 If B ∈ L(U; H) is invertible, then the system (3.1) is both null and approximately controllable at any time T > 0.

Proof. Consider the following BSEE:

dY(t) = −(A*Y(t) + F(t)Y(t) + K(t)Z(t))dt + Z(t)dW(t)   in [0,T),
Y(T) = Y_T,    (3.6)

where Y_T ∈ L²_{F_T}(Ω; H). By [53, Theorem 7.17], to prove that (3.1) is null controllable, it suffices to show that

|Y(0)|²_H ≤ C ∫_0^T E|B*Y(t)|²_U dt.    (3.7)

By Theorem 1.21, there is a constant C > 0 such that

|Y(0)|²_H ≤ C E|Y(t)|²_H, ∀ t ∈ [0,T].    (3.8)

Since B is invertible, there is a constant C > 0 such that

|y|_H ≤ C|B*y|_U, ∀ y ∈ H.

This, together with (3.8), implies the inequality (3.7).

To prove that (3.1) is approximately controllable, it suffices to show that Y_T = 0, provided that

B*Y(·) = 0 in L²_F(0,T; U).    (3.9)

Since B is invertible, we have Y(·) = 0 in L²_F(0,T; H). Noting that Y(·) ∈ C_F([0,T]; L²(Ω; H)), we get Y_T = 0.

Consider the following SEE:

dX(t) = (A + F(t))X(t)dt + G(t)X(t)dW(t)   in (0,T],
X(0) = X_0,    (3.10)

where X_0 ∈ H, F ∈ L^∞(0,T; L(H)) and G ∈ L^∞_F(0,T; L(H)). Further, we introduce the following deterministic evolution equation:

dX̂(t)/dt = (A + F(t))X̂(t)   in (0,T],
X̂(0) = X_0.    (3.11)

Let Ũ be another Hilbert space and C ∈ L(H; Ũ). We say that the equation (3.10) (resp. (3.11)) is continuously initially observable on [0,T] for the observation operator C if

|X_0|²_H ≤ C ∫_0^T E|CX(t)|²_Ũ dt    (3.12)

(resp.

|X_0|²_H ≤ C ∫_0^T |CX̂(t)|²_Ũ dt).    (3.13)

The following result reveals the relationship between the observability of (3.10) and (3.11).

Theorem 3.4 The equation (3.10) is continuously initially observable on [0,T] for the observation operator C, provided that so is the equation (3.11).

Proof. Let X(·) be a solution to (3.10). Then, EX(·) is a solution to (3.11). If (3.11) is continuously initially observable on [0,T], then, by (3.13), we have

|X_0|²_H ≤ C ∫_0^T |C EX(t)|²_Ũ dt.

This, together with Hölder's inequality, implies (3.12).
Remark 3.3
Clearly, if F is stochastic, then one cannot use the above argument to solve the observability problems for SEEs. For establishing observability estimates for SPDEs, a powerful method is the global Carleman estimate (e.g., [16, 21, 33, 35, 36, 37, 72, 77, 78, 80, 81]).

Remark 3.4
In this section, for simplicity, we assume that B ∈ L(U; H) and C ∈ L(H; Ũ). Some results for unbounded B and C can be found in [40, 53].

4 Exact controllability of stochastic transport equations

The stochastic transport equation can be regarded as the simplest SPDE. In this section, we consider the exact controllability of this equation. By a duality argument, the controllability problem is reduced to a suitable observability estimate for backward stochastic transport equations, and we shall employ a stochastic version of the global Carleman estimate to derive such an estimate. The content of this section is a simplified version of [39].

In Sections 4, 5 and 6, we assume that F = {F_t}_{t∈[0,T]} is the natural filtration generated by W(·). Let G be given as in Subsection 1.4 and convex. For any T > 0, put

Q ≡ (0,T)×G, Σ ≡ (0,T)×Γ.    (4.1)

Fix an O ∈ R^n satisfying |O|_{R^n} = 1. Set

Γ_− ≜ {x ∈ Γ | O·ν(x) < 0}, Γ_+ ≜ Γ \ Γ_−, Σ_− ≜ (0,T)×Γ_−, Σ_+ ≜ (0,T)×Γ_+.

Define the Hilbert space L²_O(Γ_−) as the completion of all h ∈ C^∞(Γ_−) w.r.t. the norm

|h|_{L²_O(Γ_−)} ≜ (−∫_{Γ_−} O·ν |h|² dΓ)^{1/2}.

Clearly, L²(Γ_−) is dense in L²_O(Γ_−).

Let us first recall the controllability problem for the following deterministic transport equation:

y_t + O·∇y = a_1 y   in Q,
y = u   on Σ_−,
y(0) = y_0   in G,    (4.2)

where y_0 ∈ L²(G) and a_1 ∈ L^∞(0,T; L^∞(G)). In (4.2), y is the state and u is the control. The state and control spaces are chosen to be L²(G) and L²(0,T; L²_O(Γ_−)), respectively. It is easy to show that, for any y_0 ∈ L²(G) and u ∈ L²(0,T; L²_O(Γ_−)), (4.2) admits a unique (transposition) solution y(·) ∈ C([0,T]; L²(G)) (e.g., [30]).

Definition 4.1
The system (4.2) is said to be exactly controllable at time T if for any given y_0, y_1 ∈ L²(G), one can find a control u ∈ L²(0,T; L²_O(Γ_−)) such that the corresponding solution y(·) to (4.2) satisfies y(T) = y_1.

To study the exact controllability problem of (4.2), people introduce the following dual equation:

z_t + O·∇z = −a_1 z   in Q,
z = 0   on Σ_+,
z(T) = z_T   in G.    (4.3)

By means of the standard duality argument, it is easy to show the following result (e.g., [27]).

Proposition 4.1 The equation (4.2) is exactly controllable at time T if and only if solutions to the equation (4.3) satisfy the following observability estimate:

|z_T|_{L²(G)} ≤ C|z|_{L²(0,T; L²_O(Γ_−))}, ∀ z_T ∈ L²(G).    (4.4)

Controllability problems for deterministic transport equations are now well understood. Indeed, one can use the global Carleman estimate to prove the observability inequality (4.4) for T > 2 max_{x∈Γ}|x|_{R^n} (cf. [27]). In the rest of this section, we shall see that things are quite different in the stochastic setting.

Now, we consider the following controlled stochastic transport equation:

dy + O·∇y dt = (a_1 y + a_2 v) dt + (a_3 y + v) dW(t)   in Q,
y = u   on Σ_−,
y(0) = y_0   in G.    (4.5)

Here y_0 ∈ L²(G), and a_1, a_2, a_3 ∈ L^∞_F(0,T; L^∞(G)). In (4.5), y is the state, while u ∈ L²_F(0,T; L²_O(Γ_−)) and v ∈ L²_F(0,T; L²(G)) are two controls.

Remark 4.1
In practice, very likely, the control v in the diffusion term affects the system also through the drift term (say, in the form a_2 v, as in (4.5)). This means one cannot simply put a control in the diffusion term only. The cost is that it affects the drift term in an undesired way, i.e., a_2 v does not work as a control but as a disturbance.

A solution to (4.5) is also understood in the transposition sense. For this, we introduce the following "reference" equation, which is a backward stochastic transport equation:

dz + O·∇z dt = −(a_1 z + a_3 Z) dt + Z dW(t)   in Q_τ,
z = 0   on Σ_+^τ,
z(τ) = z_τ   in G,    (4.6)

where τ ∈ (0,T], Q_τ ≜ (0,τ)×G, Σ_+^τ ≜ (0,τ)×Γ_+ and z_τ ∈ L²_{F_τ}(Ω; L²(G)).

Recalling the operator A given in Example 1.3, as an easy consequence of Theorem 1.21, we have the following well-posedness result for (4.6).

Proposition 4.2 For any τ ∈ (0,T] and z_τ ∈ L²_{F_τ}(Ω; L²(G)), the equation (4.6) admits a unique mild solution (z, Z) ∈ L²_F(Ω; C([0,τ]; L²(G))) × L²_F(0,τ; L²(G)) such that

|z|_{L²_F(Ω; C([0,τ]; L²(G)))} + |Z|_{L²_F(0,τ; L²(G))} ≤ C|z_τ|_{L²_{F_τ}(Ω; L²(G))},    (4.7)

where C is a constant, independent of τ and z_τ.

We need the following further regularity result for solutions to (4.6).
Proposition 4.3
For any τ ∈ (0,T], solutions to the equation (4.6) satisfy

|z|²_{L²_F(0,τ; L²_O(Γ_−))} ≤ C E|z_τ|²_{L²(G)}, ∀ z_τ ∈ L²_{F_τ}(Ω; L²(G)).

Proof. For simplicity, we only consider the case τ = T. Let A be the operator introduced in Example 1.3 and H = L²(G). Then the equation (4.6) can be written in the following form:

dz + A*z dt = f dt + Z dW(t)   in Q_τ,
z(T) = z_T,    (4.8)

where f = −(a_1 z + a_3 Z). Let {(z_λ, Z_λ)}_{λ∈ρ(A*)} be the approximation of (z, Z) introduced in (1.52), i.e., (z_λ, Z_λ) solves

dz_λ + A*z_λ dt = R*(λ)f dt + R*(λ)Z_λ dW(t)   in [0,T),
z_λ(T) = R*(λ)z_T,    (4.9)

where R*(λ) ≜ λ(λI − A*)^{−1}. By Theorem 1.21 and Proposition 1.12, one can show that the solution to (4.9) satisfies (z_λ, Z_λ) ∈ (L²_F(Ω; C([0,T]; L²(G))) ∩ L²_F(0,T; D(A*))) × L²_F(0,T; L²(G)). It follows from Itô's formula that

E|R(λ)z_T|²_{L²(G)} − |z_λ(0)|²_{L²(G)} = −2E∫_0^T∫_G z_λ O·∇z_λ dxdt + E∫_0^T∫_G (2z_λ R*(λ)f + |R*(λ)Z_λ|²) dxdt.

Therefore,

−E∫_0^T∫_{Γ_−} O·ν z_λ² dΓ_− dt ≤ E|R(λ)z_T|²_{L²(G)} − E∫_0^T∫_G (2z_λ R*(λ)f + |R*(λ)Z_λ|²) dxdt.    (4.10)

By letting λ tend to ∞ in (4.10), noting Theorem 1.23, we obtain that

−E∫_0^T∫_{Γ_−} O·ν z² dΓ_− dt ≤ E|z_T|²_{L²(G)} − E∫_0^T∫_G (2zf + |Z|²) dxdt ≤ CE|z_T|²_{L²(G)}.

This completes the proof of Proposition 4.3.
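The weighted boundary norm controlled by Proposition 4.3 can be made concrete on a simple geometry. The following pure-Python sketch (the unit disk in R² with O = (1,0) is an illustrative choice, not taken from the text) computes the inflow part Γ_− and the weight −∫_{Γ_−} O·ν dΓ, i.e. the square of the L²_O(Γ_−) norm of the constant function h ≡ 1:

```python
import math

# On the unit circle the outward normal is nu(x) = x, so for O = (1, 0) the
# inflow set Gamma_- = {O . nu < 0} is the half-circle with cos(theta) < 0.
O = (1.0, 0.0)
n = 200000
dtheta = 2.0 * math.pi / n

weight = 0.0   # -int_{Gamma_-} O . nu dGamma  (= |1|^2 in L^2_O(Gamma_-))
length = 0.0   # surface measure of Gamma_-
for k in range(n):
    th = (k + 0.5) * dtheta          # midpoint quadrature on the circle
    nu = (math.cos(th), math.sin(th))
    dot = O[0] * nu[0] + O[1] * nu[1]
    if dot < 0.0:
        weight += -dot * dtheta
        length += dtheta
```

The weight comes out as 2, strictly smaller than the surface measure π of Γ_−: the L²_O(Γ_−) norm degenerates near the grazing set where O·ν ≈ 0, which is why L²(Γ_−) is dense in, but strictly smaller than, L²_O(Γ_−).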
Remark 4.2
By Proposition 4.2, the first component z (of the solution to (4.6)) belongs to L²_F(Ω; C([0,τ]; L²(G))). This does not yield the regularity z|_{Γ_−} ∈ L²_F(0,τ; L²_O(Γ_−)) guaranteed by Proposition 4.3. Hence, the latter is sometimes called a hidden regularity property.

Definition 4.2
A stochastic process y ∈ C_F([0,T]; L²(Ω; L²(G))) is called a transposition solution to (4.5) if for any τ ∈ (0,T] and z_τ ∈ L²_{F_τ}(Ω; L²(G)), it holds that

E⟨y(τ), z_τ⟩_{L²(G)} − ⟨y_0, z(0)⟩_{L²(G)} = E∫_0^τ ⟨v, a_2 z + Z⟩_{L²(G)} dt − E∫_0^τ∫_{Γ_−} O·ν u z dΓ_− dt.    (4.11)

Here (z, Z) solves (4.6).

Remark 4.3
In Subsection 1.10, we have introduced the notion of transposition solution for BSEEs with a general filtration, to overcome the difficulty that there is no Martingale Representation Theorem. Here and in Section 6, we use the notion of transposition solution for controlled SPDEs to overcome the difficulty arising from the boundary control. Note, however, that the idea for solving these two different problems is very similar, namely, to study the well-posedness of a "less-understood" equation by means of another "well-understood" one. For clarity, it might be better to use two different notions for them, but we keep the present presentation because the readers should be able to detect the differences easily from the context.
We have the following well-posedness result for (4.5) (see Subsection 4.2 for its proof).
Proposition 4.4
For any y_0 ∈ L²(G), u ∈ L²_F(0,T; L²_O(Γ_−)) and v ∈ L²_F(0,T; L²(G)), the equation (4.5) admits a unique (transposition) solution y ∈ C_F([0,T]; L²(Ω; L²(G))) such that

|y|_{C_F([0,T]; L²(Ω; L²(G)))} ≤ C(|y_0|_{L²(G)} + |u|_{L²_F(0,T; L²_O(Γ_−))} + |v|_{L²_F(0,T; L²(G))}).    (4.12)

Now we can introduce the notion of exact controllability of (4.5).

Definition 4.3
The system (4.5) is called exactly controllable at time T if for any y_0 ∈ L²(G) and y_T ∈ L²_{F_T}(Ω; L²(G)), one can find a pair of controls (u, v) ∈ L²_F(0,T; L²_O(Γ_−)) × L²_F(0,T; L²(G)) such that the corresponding (transposition) solution y to (4.5) satisfies y(T) = y_T in L²(G), a.s.

The main result in this section is the following exact controllability result for (4.5).
Theorem 4.1 If T > 2 max_{x∈Γ}|x|_{R^n}, then the system (4.5) is exactly controllable at time T.

Remark 4.4
We introduce two controls into the system (4.5). Moreover, the control v acts on the whole domain. Compared with the deterministic transport equations, it seems that our choice of controls is too restrictive. One may consider the following three weaker cases:

1) Only one control acts on the system, that is, u = 0 or v = 0 in (4.5).

2) Neither u nor v is zero, but v = 0 in (0,T)×G_0, where G_0 is a nonempty open subset of G.

3) Two controls are imposed on the system, but both of them are in the drift term.

For the above three cases, according to the controllability result for deterministic transport equations, it may seem that the corresponding control system should be exactly controllable. However, similarly to the proof of Theorem 3.1, one can show that this is not the case (cf. [39]).

Using the standard duality argument, in order to prove Theorem 4.1, it suffices to establish a suitable observability estimate for (4.6) with τ = T. The latter will be done in Subsection 4.3, by means of a stochastic version of the global Carleman estimate.

4.2 Well-posedness of the control system and a weighted identity

This subsection presents some preliminary results. To begin with, let us prove the well-posedness result (i.e., Proposition 4.4) for the controlled stochastic transport equation (4.5).
Proof [of Proposition 4.4]. We first prove the existence of transposition solutions to (4.5). Since u ∈ L²_F(0,T; L²_O(Γ_−)), there exists a sequence {u_m}_{m=1}^∞ ⊂ C_F([0,T]; H^{1/2}(Γ_−)) with u_m(0) = 0 for all m ∈ N such that

lim_{m→∞} u_m = u in L²_F(0,T; L²_O(Γ_−)).    (4.13)

For each m ∈ N, we can find a ũ_m ∈ C_F([0,T]; H¹(G)) such that ũ_m|_{Γ_−} = u_m and ũ_m(0) = 0. Consider the following stochastic transport equation:

dỹ_m + O·∇ỹ_m dt = (a_1 ỹ_m + a_2 v + ζ_m) dt + [a_3(ỹ_m + ũ_m) + v] dW(t)   in Q,
ỹ_m = 0   on Σ,
ỹ_m(0) = y_0   in G,    (4.14)

where ζ_m = −ũ_{m,t} − O·∇ũ_m + a_1 ũ_m. By Theorem 1.17, the system (4.14) admits a unique mild solution ỹ_m ∈ C_F([0,T]; L²(Ω; L²(G))).

Let y_m = ỹ_m + ũ_m. For any τ ∈ (0,T] and z_τ ∈ L²_{F_τ}(Ω; L²(G)), by Itô's formula and integration by parts, we have

E⟨y_m(τ), z_τ⟩_{L²(G)} − ⟨y_0, z(0)⟩_{L²(G)} = E∫_0^τ ⟨v, a_2 z + Z⟩_{L²(G)} dt − E∫_0^τ∫_{Γ_−} O·ν u_m z dΓ_− dt,    (4.15)

where (z, Z) is the mild solution to (4.6). Consequently, for any m_1, m_2 ∈ N, it holds that

E⟨y_{m_1}(τ) − y_{m_2}(τ), z_τ⟩_{L²(G)} = −E∫_0^τ∫_{Γ_−} O·ν (u_{m_1} − u_{m_2}) z dΓ_− dt.    (4.16)

By Proposition 1.2, we can choose z_τ ∈ L²_{F_τ}(Ω; L²(G)) so that |z_τ|_{L²_{F_τ}(Ω; L²(G))} = 1 and

E⟨y_{m_1}(τ) − y_{m_2}(τ), z_τ⟩_{L²(G)} ≥ |y_{m_1}(τ) − y_{m_2}(τ)|_{L²_{F_τ}(Ω; L²(G))}.    (4.17)

It follows from (4.16)–(4.17) and Proposition 4.3 that

|y_{m_1}(τ) − y_{m_2}(τ)|_{L²_{F_τ}(Ω; L²(G))} ≤ |E∫_0^τ∫_{Γ_−} O·ν (u_{m_1} − u_{m_2}) z dΓ_− ds| ≤ C|u_{m_1} − u_{m_2}|_{L²_F(0,T; L²_O(Γ_−))}|z_τ|_{L²_{F_τ}(Ω; L²(G))} ≤ C|u_{m_1} − u_{m_2}|_{L²_F(0,T; L²_O(Γ_−))},

where the constant C is independent of τ. Consequently, it holds that

|y_{m_1} − y_{m_2}|_{C_F([0,T]; L²(Ω; L²(G)))} ≤ C|u_{m_1} − u_{m_2}|_{L²_F(0,T; L²_O(Γ_−))}.

Hence, {y_m}_{m=1}^∞ is a Cauchy sequence in C_F([0,T]; L²(Ω; L²(G))). Denote by y the limit of {y_m}_{m=1}^∞. Letting m → ∞ in (4.15), we see that y satisfies (4.11). Thus, y is a transposition solution to (4.5).

By Proposition 1.2, there exists z_τ ∈ L²_{F_τ}(Ω; L²(G)) such that |z_τ|_{L²_{F_τ}(Ω; L²(G))} = 1 and

E⟨y(τ), z_τ⟩_{L²(G)} ≥ |y(τ)|_{L²_{F_τ}(Ω; L²(G))}.    (4.18)

Combining (4.11), (4.18) and using Proposition 4.3 again, we obtain that

|y(τ)|_{L²_{F_τ}(Ω; L²(G))} ≤ |⟨y_0, z(0)⟩_{L²(G)}| + |E∫_0^τ ⟨v, a_2 z + Z⟩_{L²(G)} dt| + |E∫_0^τ∫_{Γ_−} O·ν u z dΓ_− dt| ≤ C(|y_0|_{L²(G)} + |u|_{L²_F(0,T; L²_O(Γ_−))} + |v|_{L²_F(0,T; L²(G))}),

where the constant C is independent of τ. Therefore, we obtain the desired estimate (4.12).

Now we prove the uniqueness of solutions. Assume that y_1 and y_2 are two transposition solutions to (4.5). From (4.11), we deduce that, for every τ ∈ [0,T],

E⟨y_1(τ) − y_2(τ), z_τ⟩_{L²(G)} = 0, ∀ z_τ ∈ L²_{F_τ}(Ω; L²(G)),

which implies that y_1 = y_2 in C_F([0,T]; L²(Ω; L²(G))). This completes the proof of Proposition 4.4.

Next, we present a weighted identity (for stochastic transport operators), which will play a key role in establishing the global Carleman estimate for (4.6).

Proposition 4.5
Let ℓ ∈ C²([0,T]×Ḡ) and θ = e^ℓ. Let u be an H¹(R^n)-valued Itô process, and v = θu. Then,

−θ(ℓ_t + O·∇ℓ)v(du + O·∇u dt)
= −½ d[(ℓ_t + O·∇ℓ)v²] − ½ O·∇[(ℓ_t + O·∇ℓ)v²] dt + ½[ℓ_tt + O·∇(O·∇ℓ) + 2O·∇ℓ_t]v² dt + ½(ℓ_t + O·∇ℓ)(dv)² + (ℓ_t + O·∇ℓ)²v² dt.    (4.19)

Proof. Clearly,

θ(du + O·∇u dt) = θ d(θ^{−1}v) + θ O·∇(θ^{−1}v) dt = dv + O·∇v dt − (ℓ_t + O·∇ℓ)v dt.

Thus,

−θ(ℓ_t + O·∇ℓ)v(du + O·∇u dt) = −(ℓ_t + O·∇ℓ)v[dv + O·∇v dt − (ℓ_t + O·∇ℓ)v dt]
= −(ℓ_t + O·∇ℓ)v(dv + O·∇v dt) + (ℓ_t + O·∇ℓ)²v² dt.    (4.20)

See Remark 1.3 for the term (dv)². Note that

−ℓ_t v dv = −½ d(ℓ_t v²) + ½ ℓ_tt v² dt + ½ ℓ_t (dv)²,
−O·∇ℓ v dv = −½ d(O·∇ℓ v²) + ½ (O·∇ℓ)_t v² dt + ½ O·∇ℓ (dv)²,
−ℓ_t v O·∇v dt = −½ O·∇(ℓ_t v²) dt + ½ O·∇ℓ_t v² dt,
−O·∇ℓ v O·∇v dt = −½ O·∇(O·∇ℓ v²) dt + ½ O·∇(O·∇ℓ) v² dt.

This, together with (4.20), implies (4.19). This completes the proof of Proposition 4.5.
4.3 Observability estimate

In this subsection, we shall prove the following observability estimate for the equation (4.6), which implies Theorem 4.1.
Theorem 4.2 If T > x ∈ Γ | x | R n , then solutions to the equation (4.6) with τ = T satisfy that | z T | L F T (Ω; L ( G )) ≤ C (cid:0) | z | L F (0 ,T ; L O (Γ − )) + | a z + Z | L F (0 ,T ; L ( G )) (cid:1) , ∀ z T ∈ L F T (Ω; L ( G )) . (4.21) Proof . Let λ >
0, and let c ∈ (0 ,
1) be such that cT > x ∈ Γ | x | R n . Put ℓ = λ h | x | R n − c (cid:16) t − T (cid:17) i . (4.22)Applying Proposition 4.5 to the equation (4.6) with u = z and ℓ given by (4.22), integrating (4.19)on Q , using integration by parts and taking expectation, we obtain that − E Z Q θ ( ℓ t + O · ∇ ℓ ) z ( dz + O · ∇ zdt ) dx = λ E Z G ( cT − O · x ) θ ( T ) z ( T ) dx + λ E Z G ( cT + 2 O · x ) θ (0) z (0) dx + λ E Z Σ − O · ν (cid:2) c ( T − t ) − O · x (cid:3) θ z d Σ − + 2(1 − c ) λ E Z Q θ z dxdt + E Z Q θ ( ℓ t + O · ∇ ℓ )( dz ) dx + 2 E Z Q θ ( ℓ t + O · ∇ ℓ ) z dxdt. (4.23)Noting that ( z, Z ) solves (4.6) with τ = T , we obtain that − E Z Q θ ( ℓ t + O · ∇ ℓ ) z ( dz + O · ∇ zdt ) dx = 2 E Z Q θ ( ℓ t + O · ∇ ℓ ) z (cid:0) a z + a Z (cid:1) dxdt ≤ E Z Q θ ( ℓ t + O · ∇ ℓ ) z dxdt + C E Z Q θ ( z + | Z | ) dxdt ≤ E Z Q θ ( ℓ t + O · ∇ ℓ ) z dxdt + C E Z Q θ ( z + | a z + Z | ) dxdt (4.24)41nd E Z Q θ ( ℓ t + O · ∇ ℓ )( dz ) dx = E Z Q θ ( ℓ t + O · ∇ ℓ ) | ( a z + Z ) − a z | dxdt ≤ E Z Q θ ( ℓ t + O · ∇ ℓ ) z dxdt + C E Z Q θ z dxdt +4 E Z Q θ | ℓ t + O · ∇ ℓ || a z + Z | dxdt. (4.25)From (4.23)–(4.25) and z ( T ) = z T in G a.s., it follows that λ E Z G ( cT − O · x ) θ ( T ) z T dx + λ Z G ( cT + 2 O · x ) θ (0) z (0) dx +2(1 − c ) λ E Z Q θ z dxdt + E Z Q θ ( ℓ t + O · ∇ ℓ ) z dxdt ≤ C E Z Q θ ( z + λ | a z + Z | ) dxdt − λ E Z Σ − O · ν (cid:2) c ( T − t ) − O · x (cid:3) θ z d Σ − , (4.26)where the constant C > λ .Finally, recalling that c ∈ (0 ,
1) satisfies cT > max_{x∈Γ} |x|_{R^n}, by choosing λ large enough in (4.26), we obtain the desired estimate (4.21). This completes the proof of Theorem 4.2. This subsection is addressed to a proof of the exact controllability of (4.5), i.e., Theorem 4.1.
Proof .[Proof of Theorem 4.1] Fix any y ∈ L ( G ) and y T ∈ L F T (Ω; L ( G )). Let us introduce alinear subspace of L F (0 , T ; L O (Γ − )) × L F (0 , T ; L ( G )) as follows: Y ∆ = n(cid:0) − z | Γ − , a z + Z (cid:1) (cid:12)(cid:12)(cid:12) ( z, Z ) solves the equation (4.6) with some z T ∈ L F T (Ω; L ( G )) o , and define a linear functional F on Y as follows: F ( − z | Γ − S , a z + Z ) = E Z G y T z T dx − Z G y z (0) dx. By Theorem 4.2, we see that F is a bounded linear functional on Y . By means of Theorem 1.3, F canbe extended to be a bounded linear functional on the space L F (0 , T ; L O (Γ − )) × L F (0 , T ; L ( G )).For simplicity, we still use F to denote this extension. By Theorem 1.4, there exists ( u, v ) ∈ L F (0 , T ; L O (Γ − )) × L F (0 , T ; L ( G )) such that E Z G y T z T dx − Z G y z (0) dx = F ( − z | Γ − S , a z + Z )= − E Z T Z Γ − O · νzud Γ − dt + E Z T Z G v ( a z + Z ) dxdt. (4.27)42e claim that u and v are the desired controls. Indeed, by the definition of the transpositionsolution to (4.5), we have E Z G y ( T, · ) z T dx − Z G y z (0) dx = − E Z T Z Γ − O · νzud Γ − dt + E Z T Z G v ( a z + Z ) dxdt. (4.28)From (4.27) and (4.28), we see that E Z G y T z T dx = E Z G y ( T ) z T dx for any z T ∈ L F T (Ω; L ( G )).Hence, y ( T ) = y T in L ( G ), a.s. This completes the proof of Theorem 4.1. Remark 4.5
In the proof of Theorem 4.1, we only prove the existence of controls which drive the state to the desired destination. A further characterization of these controls can be found in [53, Section 7.4].
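The duality argument used above has a transparent finite-dimensional analogue, which may help readers meeting it for the first time: for x' = Ax + Bu, invertibility of the controllability Gramian plays the role of the observability estimate (4.21), and the minimal-norm control is built from the adjoint flow. The matrices and horizon below are purely illustrative (a sketch, not taken from the notes):

```python
import numpy as np

# Finite-dimensional analogue of the duality argument: for x' = A x + B u,
# invertibility of the controllability Gramian
#   W = \int_0^T e^{At} B B^T e^{A^T t} dt
# is equivalent to an observability inequality for the adjoint system, and
#   u(t) = B^T e^{A^T (T - t)} W^{-1} (x_T - e^{AT} x_0)
# steers x_0 to x_T.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])      # harmonic oscillator
B = np.array([[0.0], [1.0]])

def eAt(t):
    # closed-form matrix exponential of this particular A (a rotation)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, s], [-s, c]])

T, N = 1.0, 20000
dt = T / N
ts = np.linspace(0.0, T, N, endpoint=False)

# Gramian by a Riemann sum on a fine grid
W = sum(eAt(t) @ B @ B.T @ eAt(t).T * dt for t in ts)

x0 = np.array([1.0, 0.0])
xT = np.array([0.0, 0.0])
eta = np.linalg.solve(W, xT - eAt(T) @ x0)

x = x0.copy()
for t in ts:
    u = B.T @ eAt(T - t).T @ eta             # minimal-norm control
    x = x + dt * (A @ x + (B @ u).ravel())   # explicit Euler step
x_final = x

assert np.linalg.norm(x_final - xT) < 1e-2
```

Here the linear functional F of the proof corresponds to solving W η = x_T − e^{AT} x_0, and the Riesz representation step corresponds to reading the control off the adjoint trajectory.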
This section, based mainly on [70], is devoted to studying the null/approximate controllability and observability of stochastic parabolic equations, one class of typical SPDEs.
Let G be given as in Subsection 1.4, and let G_0 be a given nonempty open subset of G. For any T > 0, let Q and Σ be as in (4.1), and put Q_0 = (0, T) × G_0. We begin with the following controlled deterministic parabolic equation:

y_t − ∆y = a y + χ_{G_0} u in Q,
y = 0 on Σ,
y(0) = y_0 in G,   (5.1)

where a ∈ L^∞(0, T; L^∞(G)). In (5.1), y and u are the state and control variables, while the state and control spaces are chosen to be L^2(G) and L^2(Q), respectively. Definition 5.1
The equation (5.1) is said to be null controllable at time T (resp. approximately controllable at time T) if for any given y_0 ∈ L^2(G) (resp. for any given ε > 0 and y_0, y_1 ∈ L^2(G)), one can find a control u ∈ L^2(Q) such that the solution y(·) ∈ C([0, T]; L^2(G)) ∩ L^2(0, T; H_0^1(G)) to (5.1) satisfies y(T) = 0 (resp. |y(T) − y_1|_{L^2(G)} ≤ ε). Remark 5.1
Due to the smoothing effect of solutions to parabolic equations, the exact controllability for (5.1) is impossible, i.e., the ε in Definition 5.1 cannot be zero.

The dual equation of (5.1) is

z_t + ∆z = −a z in Q,
z = 0 on Σ,
z(T) = z_T in G.   (5.2)

By means of the standard duality argument, it is easy to show the following result.

Proposition 5.1 i) The equation (5.1) is null controllable at time T if and only if solutions to the equation (5.2) satisfy the following observability estimate:

|z(0)|_{L^2(G)} ≤ C |z|_{L^2(Q_0)}, ∀ z_T ∈ L^2(G);   (5.3)

ii) The equation (5.1) is approximately controllable at time T if and only if every solution to the equation (5.2) satisfies the following unique continuation property: z = 0 in Q_0 ⟹ z_T = 0.

Controllability of deterministic parabolic equations is now well understood. One can use the global Carleman estimate to prove the observability inequality (5.3) and the unique continuation property of (5.2) (cf. [17, 20]).

We now consider the following controlled stochastic parabolic equation:

dy − ∆y dt = (a_1 y + χ_{G_0} u + a_2 v) dt + (a_3 y + v) dW(t) in Q,
y = 0 on Σ,
y(0) = y_0 in G,   (5.4)

where a_1, a_2, a_3 ∈ L^∞_F(0, T; L^∞(G)). In the system (5.4), the initial state y_0 ∈ L^2(G), y is the state variable, and the control variable consists of (u, v) ∈ L^2_F(0, T; L^2(G)) × L^2_F(0, T; L^2(G)). Remark 5.2
Similarly to the control system (4.5), the control v in the diffusion term affects the drift term in the form a v. By Theorem 1.19, for any y_0 ∈ L^2(G) and (u, v) ∈ L^2_F(0, T; L^2(G)) × L^2_F(0, T; L^2(G)), the system (5.4) admits a unique weak solution y ∈ L^2_F(Ω; C([0, T]; L^2(G))) ∩ L^2_F(0, T; H_0^1(G)). Moreover,

|y|_{L^2_F(Ω; C([0,T]; L^2(G))) ∩ L^2_F(0,T; H_0^1(G))} ≤ C (|y_0|_{L^2(G)} + |(u, v)|_{L^2_F(0,T; L^2(G)) × L^2_F(0,T; L^2(G))}).   (5.5)

Definition 5.2
The system (5.4) is said to be null controllable (resp. approximately controllable) at time T if for any y_0 ∈ L^2(G) (resp. for any given ε > 0, y_0 ∈ L^2(G) and y_1 ∈ L^2_{F_T}(Ω; L^2(G))), there exists a pair of controls (u, v) ∈ L^2_F(0, T; L^2(G)) × L^2_F(0, T; L^2(G)) such that the corresponding solution to (5.4) fulfills y(T) = 0, a.s. (resp. |y(T) − y_1|_{L^2_{F_T}(Ω; L^2(G))} ≤ ε).
Theorem 5.1
System (5.4) is null and approximately controllable at any time
T > 0. The rest of this section is addressed to proving Theorem 5.1. For this, we need some preliminaries.
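Before developing the Carleman machinery, it may help to see the dynamics of (5.4) numerically. The following is a minimal one-dimensional finite-difference/Euler–Maruyama sketch of the uncontrolled system with constant coefficients (the names a1, a3 and all numerical values are illustrative assumptions, not from the notes); the Monte Carlo energy E|y(T)|^2 stays bounded, in the spirit of the well-posedness estimate (5.5):

```python
import numpy as np

# 1-D finite-difference / Euler-Maruyama sketch of an uncontrolled, constant-
# coefficient analogue of (5.4): dy = (y_xx + a1*y) dt + a3*y dW(t) on (0, 1)
# with homogeneous Dirichlet boundary conditions. The scheme is implicit in the
# stiff Laplacian and explicit in the (spatially constant) noise.
rng = np.random.default_rng(0)

n, M, T = 50, 500, 0.25
h, dt = 1.0 / (n + 1), T / M
x = np.linspace(h, 1.0 - h, n)
a1, a3 = 1.0, 0.3

# discrete Dirichlet Laplacian on the interior grid points
L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2
step = np.linalg.inv(np.eye(n) - dt * L)

paths = 100
energies = []
for _ in range(paths):
    y = np.sin(np.pi * x)                    # initial datum y_0
    for _ in range(M):
        dW = np.sqrt(dt) * rng.standard_normal()
        y = step @ (y + dt * a1 * y + a3 * y * dW)
    energies.append(h * np.sum(y**2))

# Monte Carlo version of the energy bound in (5.5): E |y(T)|^2 stays bounded
mean_energy = float(np.mean(energies))
assert 0.0 < mean_energy < 10.0
```

Here the heat semigroup dominates the multiplicative noise, so the sample energies in fact decay; the point of the sketch is only the boundedness asserted by (5.5).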
In this subsection, we derive a weighted identity for the stochastic parabolic operator “dh + ∆h dt”. In the rest of these notes, for simplicity, we use the notation y_{x_j} = ∂y/∂x_j, j = 1, · · · , n, for the partial derivative of a function y w.r.t. x_j, where x_j is the j-th coordinate of a generic point x = (x_1, · · · , x_n) in R^n.

Assume that ℓ and Ψ are sufficiently smooth functions on Q. Write

A = |∇ℓ|^2 − ∆ℓ − Ψ − ℓ_t,  B = 2 [A Ψ + div(A ∇ℓ)] − A_t + ∆Ψ,
c^{jk} = 2 ℓ_{x_j x_k} − δ_{jk} ∆ℓ − Ψ δ_{jk},  j, k = 1, · · · , n,   (5.6)

where δ_{jk} = 1 if j = k and δ_{jk} = 0 if j ≠ k. We have the following fundamental weighted identity.
Theorem 5.2
Let h be an H ( G ) -valued Itˆo process. Set θ = e ℓ and w = θh . Then, for any t ∈ [0 , T ] and a.e. ( x, ω ) ∈ G × Ω , θ (cid:0) ∆ w + A w (cid:1)(cid:0) dh + ∆ hdt (cid:1) − ∇ wdw )+2div h (cid:0) ∇ ℓ · ∇ w (cid:1) ∇ w − |∇ w | ∇ ℓ − Ψ w ∇ w + (cid:16) A∇ ℓ + ∇ Ψ2 (cid:17) w i dt = 2 n X j,k =1 c jk w x j w x k dt + B w dt − d (cid:0) |∇ w | − A w (cid:1) + 2 (cid:0) ∆ w + A w (cid:1) dt + θ | d ∇ h + ∇ ℓdh | − θ A ( dh ) . (5.7) Proof . Recalling that θ = e ℓ and w = θh , one has dh = θ − ( dw − ℓ t wdt ) and h x j = θ − ( w x j − ℓ x j w ) for j = 1 , , · · · , n . Hence, θ ∆ h = θ n X j =1 [ θ − ( w x j − ℓ x j w )] x j = n X j =1 [( w x j − ℓ x j w )] x j − n X j =1 ( w x j − ℓ x j w ) ℓ x j = ∆ w − ∇ ℓ · ∇ w + ( |∇ ℓ | − ∆ ℓ ) w. (5.8)Put I ∆ = ∆ w + A w, I = (cid:0) ∆ w + A w (cid:1) dt,I = dw − ∇ ℓ · ∇ wdt, I = Ψ wdt. (5.9)From (5.8) and (5.9), it follows that2 θ (cid:0) ∆ w + A w (cid:1)(cid:0) dh + ∆ hdt (cid:1) = 2 I ( I + I + I ) . (5.10)Now, we compute the right hand side of (5.10). Let us first deal with 2 II . We have2 n X j,k =1 ℓ x j w x k w x j x k = n X j,k =1 ℓ x j ( w x k ) x j = n X j,k =1 (cid:2) ( ℓ x j w x k ) x j − ℓ x j x j w x k (cid:3) . (5.11) See Remark 1.3 for the term ( dh ) − (cid:0) ∆ w + A w (cid:1) ∇ ℓ · ∇ w = − ∇ ℓ · ∇ w ) ∇ w ] + 4 n X j,k =1 ℓ x j x k w x j w x k +4 n X j,k =1 ℓ x j w x k w x j x k − A∇ ℓ · ∇ ( w )= − (cid:2) (cid:0) ∇ ℓ · ∇ w (cid:1) ∇ w − |∇ w | ∇ ℓ + A∇ ℓw (cid:3) +2 n X j,k =1 (cid:0) ℓ x j x k − δ jk ∆ ℓ (cid:1) w x j w x k + 2div ( A∇ ℓ ) w . (5.12)Using Itˆo’s formula, we have2 (cid:0) ∆ w + A w (cid:1) dw = 2div ( ∇ wdw ) − ∇ w · d ∇ w + 2 A wdw = 2div ( ∇ wdw ) + d (cid:0) − |∇ w | + A w (cid:1) + | d ∇ w | − A t w dt − A ( dw ) . 
(5.13)Combining (5.9), (5.12) and (5.13), we arrive at2 II = − (cid:2) (cid:0) ∇ ℓ · ∇ w (cid:1) ∇ w − |∇ w | ∇ ℓ + A w ∇ ℓ (cid:3) dt + 2div ( ∇ wdw )+ d (cid:0) − |∇ w | + A w (cid:1) + 2 n X j,k =1 (cid:0) ℓ x j x k − δ jk ∆ ℓ (cid:1) w x j w x k dt − (cid:2) A t − A∇ ℓ ) (cid:3) w dt + | d ∇ w | − A ( dw ) . (5.14)Next, we compute 2 II . By (5.9), we get2 II = 2 (cid:0) ∆ w + A w (cid:1) Ψ wdt = (cid:2) (cid:0) Ψ w ∇ w (cid:1) − |∇ w | − ∇ Ψ · ∇ ( w ) + 2 A Ψ w (cid:3) dt = (cid:2) div (cid:0) w ∇ w − w ∇ Ψ (cid:1) − |∇ w | + (cid:0) ∆Ψ + 2 A Ψ (cid:1) w (cid:3) dt. (5.15)Finally, combining the equalities (5.10), (5.14) and (5.15), and noting that −| d ∇ w | + A ( dw ) = − θ |∇ h + dh ∇ ℓ | + θ A ( dh ) , we obtain the desired equality (5.7) immediately.Next, for any fixed nonnegative and nonzero function ψ ∈ C ( G ), and parameters λ > µ >
1, we choose θ = e^ℓ, ℓ = λα, with

α(t, x) = (e^{µψ(x)} − e^{2µ|ψ|_{C(Ḡ)}}) / (t(T − t)),  ϕ(t, x) = e^{µψ(x)} / (t(T − t)),   (5.16)

and

Ψ = −∆ℓ.   (5.17)

In what follows, for a nonnegative integer m, we denote by O(µ^m) a function of order µ^m for large µ (which is independent of λ); by O_µ(λ^m) a function of order λ^m for fixed µ and for large λ. Likewise, we use the notation O(e^{2µ|ψ|_{C(Ḡ)}}) and so on. For j, k = 1, 2, · · · , n, it is easy to check that

ℓ_t = λα_t,  ℓ_{x_j} = λµϕψ_{x_j},  ℓ_{x_j x_k} = λµ^2 ϕψ_{x_j} ψ_{x_k} + λµϕψ_{x_j x_k}   (5.18)

and that

α_t = ϕ^2 O(e^{2µ|ψ|_{C(Ḡ)}}),  α_{tt} = ϕ^3 O(e^{2µ|ψ|_{C(Ḡ)}}),  ϕ_t = ϕ^2 O(1),  ϕ_{tt} = ϕ^3 O(1).   (5.19)

We shall need the following estimates on the A, B and c^{jk} appearing in the weighted identity (5.7) (see (5.6) for their definitions, with ℓ and Ψ given in (5.16)–(5.17)): Proposition 5.2
When λ and µ are large enough, it holds that A = λ µ ϕ |∇ ψ | + λϕ O ( e µ | ψ | C ( G ) ) , B ≥ λ µ ϕ |∇ ψ | + λ ϕ O ( µ )+ λ ϕ O ( µ e µ | ψ | C ( G ) )+ λ ϕ O ( µ ) , n X j,k =1 c jk ξ j ξ k ≥ [ λµ ϕ |∇ ψ | + λϕO ( µ )] | ξ | , ∀ ξ = ( ξ , · · · , ξ n ) ∈ R n , (5.20) for any t ∈ [0 , T ] and x ∈ G with |∇ ψ ( x ) | > .Proof . Noting (5.17)–(5.18), from (5.6), we have that ℓ x j x k = λµ ϕψ x j ψ x k + λϕO ( µ )and that n X j,k =1 c jk ξ j ξ k = n X j,k =1 (cid:0) ℓ x j x k + δ jk ∆ ℓ (cid:1) ξ j ξ k = n X j,k =1 (cid:2)(cid:0) λµ ϕψ x j ψ x k + λµ ϕδ jk |∇ ψ | + λϕO ( µ ) (cid:1)(cid:3) ξ j ξ k = 2 λµ ϕ (cid:0) ∇ ψ · ξ (cid:1) + λµ ϕ |∇ ψ | | ξ | + λϕ |∇ w | O ( µ ) ≥ (cid:2) λµ ϕ |∇ ψ | + λϕO ( µ ) (cid:3) | ξ | , which gives the last inequality in (5.20).Similarly, by the definition of A in (5.6), and noting (5.19), we see that A = |∇ ℓ | + ∆ ℓ − ℓ t = λµ (cid:0) λµϕ |∇ ψ | + µϕ |∇ ψ | + ϕ ∆ ψ (cid:1) + λϕ O ( e µ | ψ | C ( G ) )= λ µ ϕ |∇ ψ | + λϕ O ( e µ | ψ | C ( G ) ) . Hence, we get the first estimate in (5.20). 47ow, let us estimate B (recall (5.6) for the definition of B ). By (5.18), and recalling the definitionof Ψ (in (5.17)), we see thatΨ = − λµ ( µϕ |∇ ψ | + ϕ ∆ ψ ) = − λµ ϕ |∇ ψ | + λϕO ( µ ) , Ψ x k = − ℓ x k = − λµ ϕ |∇ ψ | ψ x k + λϕO ( µ ) , Ψ x j x k = − ℓ x j x k = − λµ ϕ |∇ ψ | ψ x j ψ x k + λϕO ( µ ) , ∆Ψ = − λµ ϕ |∇ ψ | + λϕO ( µ ) . 
Hence, recalling the definition of A (in (5.6)), and using (5.18) and (5.19), we have that A Ψ = − λ µ ϕ |∇ ψ | + λ ϕ O ( µ ) + λ µ ϕ O (cid:0) e µ | ψ | C ( G ) (cid:1) , A x k = n X j =1 (cid:0) ℓ x j ℓ x j x k + ℓ x j x j x k (cid:1) − ℓ tx k = 2 λ µ ϕ |∇ ψ | ψ x k + λ ϕ O ( µ ) + λϕO ( µ ) + λµϕ O (cid:0) e µ | ψ | C ( G ) (cid:1) , − n X j =1 A x j ℓ x j = − λ µ ϕ |∇ ψ | + λ ϕ O ( µ )+ λ ϕ O ( µ )+ λ µ ϕ O (cid:0) e µ | ψ | C ( G ) (cid:1) , − n X j =1 (cid:0) A ℓ x j (cid:1) x j = − n X j =1 A x j ℓ x j − A ∆ ℓ = − λ µ ϕ |∇ ψ | + λ ϕ O ( µ ) + λ ϕ O ( µ )+ λ µ ϕ O (cid:0) e µ | ψ | C ( G ) (cid:1) , A t = (cid:0) |∇ ℓ | + ∆ ℓ + ℓ t (cid:1) t = λ µ ϕ O ( e µ | ψ | C ( G ) ) . From the definition of B (see (5.6)), we have that B = − λ µ ϕ |∇ ψ | + λ ϕ O ( µ ) + λ µ ϕ O ( e µ | ψ | C ( G ) )+6 λ µ ϕ |∇ ψ | + λ ϕ O ( µ ) + λ µ ϕ O ( e µ | ψ | C ( G ) )+ λ µ ϕ O ( e µ | ψ | C ( G ) ) + λ ϕ O ( µ ) ≥ λ µ ϕ |∇ ψ | + λ ϕ O ( µ ) + λ ϕ O ( µ e µ | ψ | C ( G ) ) + λ ϕ O ( µ ) , which leads to the second estimate in (5.20). To study the null and approximate controllability problems for (5.4), we introduce the followingbackward stochastic parabolic equation: dz + ∆ zdt = − (cid:0) a z + a Z (cid:1) dt + ZdW ( t ) in Q,z = 0 on Σ ,z ( T ) = z T in G. (5.21)By Theorem 1.22, for any z T ∈ L F T (Ω; L ( G )), the system (5.21) admits a unique weak solution( z, Z ) ∈ (cid:0) L F (Ω; C ([0 , T ]; L ( G ))) ∩ L F (0 , T ; H ( G )) (cid:1) × L F (0 , T ; L ( G )). Moreover, for any t ∈ [0 , T ], | ( z ( · ) , Z ( · )) | ( L F (Ω; C ([0 ,t ]; L ( G ))) ∩ L F (0 ,t ; H ( G )) ) × L F (0 ,t ; L ( G )) ≤ C| z ( t ) | L F t (Ω; L ( G )) . (5.22)48e also need the following known result ([20, p. 4, Lemma 1.1]). Lemma 5.1
For any nonempty open subset G_1 of G, there is a ψ ∈ C^2(Ḡ) such that ψ > 0 in G, ψ = 0 on Γ, and |∇ψ(x)| > 0 for all x ∈ Ḡ \ G_1.

Choose θ and ℓ as the ones in (5.16), and ψ given by Lemma 5.1 with G_1 being any fixed nonempty open subset of G such that Ḡ_1 ⊂ G_0. We have the following global Carleman estimate for (5.21): Theorem 5.3
There is a constant µ = µ ( G, G , T ) > such that for all µ ≥ µ , one can findtwo constants C = C ( µ ) > and λ = λ ( µ ) > such that for all λ ≥ λ and z T ∈ L F T (Ω; L ( G )) ,the solution ( z, Z ) to (5.21) satisfies that λ µ E Z Q θ ϕ z dxdt + λµ E Z Q θ ϕ |∇ z | dxdt ≤ C (cid:16) λ µ E Z Q θ ϕ z dxdt + λ µ E Z Q θ ϕ ( a z + Z ) dxdt (cid:17) . (5.23) Proof . We shall use Theorem 5.2 and Proposition 5.2. Integrating the equality (5.7) on Q ,taking expectation in both sides, and noting (5.20), we conclude that2 E Z Q θ (cid:0) ∆ w + A w (cid:1)(cid:0) dz + ∆ zdt (cid:1) dx − E Z Q div ( dw ∇ w ) dx +2 E Z Q div h (cid:0) ∇ ℓ · ∇ w (cid:1) ∇ w − |∇ w | ∇ ℓ − Ψ w ∇ w + (cid:16) A∇ ℓ + ∇ Ψ2 (cid:17) w i dxdt ≥ E Z Q (cid:2) ϕ (cid:0) λµ |∇ ψ | + λO ( µ ) (cid:1) |∇ w | + ϕ (cid:0) λ µ ϕ |∇ ψ | + λ ϕ O ( µ ) (5.24)+ λ ϕ O ( µ )+ λ µ ϕ O ( e µ | ψ | C ( G ) ) (cid:1) w (cid:3) dxdt +2 E Z Q | ∆ w + A w | dxdt + E Z Q θ | d ∇ z + dz ∇ ℓ | dx − E Z Q θ A ( dz ) dx. Recall that ν ( x ) = ( ν ( x ) , · · · , ν n ( x )) stands for the unit outward normal vector (of G ) at x ∈ Γ. Noting that z = 0 on Σ, we have that w = 0 and w x j = θv x j = θ ∂z∂ν ν j on Σ . Similarly, by Lemma 5.1, we get ℓ x j = λµϕψ x j = λµϕ ∂ψ∂ν ν j and ∂ψ∂ν < . Therefore, utilizing integration by parts, we obtain that − E Z Q div ( dw ∇ w ) dx = − E Z Σ n X j =1 w x j ν j dwdx = − E Z Σ θ ∂z∂ν dwdx = 0 (5.25)and that 2 E Z Q div h (cid:0) ∇ ℓ · ∇ w (cid:1) ∇ w − |∇ w | ∇ ℓ − Ψ w ∇ w + (cid:16) A∇ ℓ + ∇ Ψ2 (cid:17) w i dxdt
49 2 E Z Σ h (cid:16) n X j =1 λµϕ ∂ψ∂ν ν j λµϕ ∂z∂ν ν j (cid:17) θ ∂z∂ν ν j − (cid:16) n X j =1 θ (cid:12)(cid:12)(cid:12) ∂z∂ν (cid:12)(cid:12)(cid:12) | ν j | (cid:17) λµϕ ∂ψ∂ν i dxdt = 2 λµ E Z Σ θ ϕ ∂ψ∂ν (cid:16) ∂z∂ν (cid:17) d Γ dt ≤ . (5.26)It follows from (5.21) that2 E Z Q θ (cid:0) ∆ w + A w (cid:1)(cid:0) dz + ∆ zdt (cid:1) dx = 2 E Z Q θ (cid:0) ∆ w + A w (cid:1)(cid:2) − (cid:0) a z + a Z (cid:1) dt + ZdW ( t ) (cid:3) dx = − E Z Q θ (cid:0) ∆ w + A w (cid:1) ( a z + a Z (cid:1) dxdt (5.27) ≤ E Z Q (cid:0) ∆ w + A w (cid:1) dtdx + E Z Q θ ( a z + a Z (cid:1) dxdt. By (5.24)–(5.27), we have that2 E Z Q (cid:2) ϕ (cid:0) λµ |∇ ψ | + λO ( µ ) (cid:1) |∇ w | + ϕ (cid:0) λ µ |∇ ψ | + λ O ( µ ) + λ µ O ( e µ | ψ | C ( G ) ) + λ ϕ − O ( µ ) (cid:1) w (cid:3) dxdt ≤ E Z Q θ (cid:2)(cid:0) a z + a Z (cid:1) + 2 A ( Z + a z ) + 2 a A z (cid:3) dxdt. (5.28)Choose a cut-off function ζ ∈ C ∞ ( G ; [0 , ζ ≡ G . By d ( θ ϕh ) = h ( θ ϕ ) t dt + 2 θ ϕhdh + θ ϕ ( dh ) , recalling lim t → + ϕ ( t, · ) = lim t → T − ϕ ( t, · ) ≡ E Z Q θ (cid:2) ζ z ( ϕ t + 2 λϕη t ) + 2 ζ ϕ |∇ z | + 2 µζ ϕ (1 + 2 λϕ ) z ∇ z · ∇ ψ +4 ζϕz ∇ z · ∇ ζ + 2 ζ ϕf z + ζ ϕZ (cid:3) dxdt. Therefore, for any ε >
0, one has2 E Z Q θ ζ ϕ |∇ z | dxdt + E Z Q θ ζ ϕZ dxdt ≤ ε E Z Q θ ζ ϕ |∇ z | dxdt + C ε E Z Q θ h λ µ (cid:0) a z + a Z (cid:1) + λ µ ϕ z i dxdt. This yields that E Z T Z G θ ϕ |∇ z | dxdt ≤ C E Z Q θ h λ µ (cid:0) a z + a Z (cid:1) + λ µ ϕ z i dxdt. (5.29)50rom (5.28) and (5.29), we conclude that there is a µ > µ ≥ µ , one can finda constant λ = λ ( µ ) so that for any λ ≥ λ , the desired estimate (5.23) holds. This completesthe proof of Theorem 5.3.As a consequence of Theorem 5.3, we have the following observability estimate and uniquecontinuation for (5.21): Corollary 5.1 Solutions ( z, Z ) to the system (5.21) satisfy | z (0) | L F (Ω; L ( G )) ≤ C (cid:0) | χ G z | L F (0 ,T ; L ( G )) + | a z + Z | L F (0 ,T ; L ( G )) (cid:1) , ∀ z T ∈ L F T (Ω; L ( G )) . (5.30)2) If for some z T ∈ L F T (Ω; L ( G )) , the corresponding solution ( z, Z ) satisfies that z = 0 in Q and a z + Z = 0 in Q , a.s., then z T = 0 in G , a.s.Proof . We only prove the conclusion 1). Applying Theorem 5.3 to the equation (5.21), andchoosing µ = µ and λ = C , from (5.23), we deduce that, E Z Q θ ϕ z dxdt ≤ C h E Z Q θ ϕ z dxdt + E Z Q θ ϕ ( a z + Z ) dxdt i . (5.31)Recalling (5.16), it follows from (5.31) that E Z T/ T/ Z G z dxdt ≤ C sup ( t,x ) ∈ Q (cid:16) θ ( t, x ) ϕ ( t, x )+ θ ( t, x ) ϕ ( t, x ) (cid:17) inf x ∈ G (cid:16) θ ( T / , x ) ϕ ( T / , x ) (cid:17) h E Z Q z dxdt + E Z Q ( a z + Z ) dxdt i ≤ C h E Z Q z dxdt + E Z Q ( a z + Z ) dxdt i . (5.32)From (5.22), it follows that E Z G z (0) dx ≤ C E Z G z ( t ) dx, ∀ t ∈ [0 , T ] . (5.33)Combining (5.32) and (5.33), we conclude that, the solution ( z, Z ) to the equation (5.21) satisfies(5.30). This subsection is addressed to a proof of Theorem 5.1.
Proof. [Proof of Theorem 5.1] Similarly to the proof of Theorem 4.1, we use the classical duality argument. We first prove that (5.4) is null controllable at any time T >
0. Let us introduce a linearsubspace of L F (0 , T ; L ( G )) × L F (0 , T ; L ( G )) as follows: Y ∆ = n(cid:0) χ G z, a z + Z (cid:1) (cid:12)(cid:12)(cid:12) ( z, Z ) solves the equation (5.21) with some z T ∈ L F T (Ω; L ( G )) o F on Y as follows: F ( χ G z, a z + Z ) = − Z G y z (0) dx. By the conclusion 1) in Corollary 5.1, we see that F is a bounded linear functional on Y . By meansof Theorem 1.3, F can be extended to be a bounded linear functional on the space L F (0 , T ; L ( G )) × L F (0 , T ; L ( G )). For simplicity, we still use F to denote this extension. By Theorem 1.4, thereexists ( u, v ) ∈ L F (0 , T ; L ( G )) × L F (0 , T ; L ( G )) such that − Z G y z (0) dx = F ( χ G z, a z + Z )= E Z T Z G zudxdt + E Z T Z G v ( a z + Z ) dxdt. (5.34)We claim that the above obtained u and v are the desired controls. Indeed, by Itˆo’s formula andintegration by parts, we have E Z G y ( T ) z T dx − Z G y z (0) dx = − E Z T Z G zudxdt + E Z T Z G v ( a z + Z ) dxdt, (5.35)where z is the solution to (5.21) with z T = η and y is the state of (5.4). From (5.34) and (5.35),we see that E Z G y ( T ) z T dx = 0 . (5.36)Since z T can be an arbitrary element in L F T (Ω; L ( G )), from the equality (5.36), we conclude that y ( T ) = 0 in L ( G ), a.s. This concludes the null controllability of (5.4).Next, we prove that (5.4) is approximately controllable at time T . It suffices to show that theset A T ∆ = (cid:8) y ( T ) (cid:12)(cid:12) y is the state of (5.4) with some controls( u, v ) ∈ L F (0 , T ; L ( G )) × L F (0 , T ; L ( G )) (cid:9) is dense in L F T (Ω; L ( G )). Let us prove this by the contradiction argument. Assume that therewas a nonzero η ∈ L F T (Ω; L ( G )) such that E Z G y ( T ) ηdx = 0 , ∀ y ( T ) ∈ A T . Then, by Itˆo’s formula and integration by parts, we would obtain that E Z G y ( T ) ηdx − Z G y z (0) dx = E Z T Z G zudxdt + E Z T Z G ( a z + Z ) vdxdt, (5.37)where z is the solution to (5.21) with z T = η and y is the state of (5.4). 
Hence,

− ∫_G y_0 z(0) dx = E ∫_0^T ∫_{G_0} z u dx dt + E ∫_0^T ∫_G (a z + Z) v dx dt,
∀ (u, v) ∈ L^2_F(0, T; L^2(G_0)) × L^2_F(0, T; L^2(G)).   (5.38)

Note that the left hand side of (5.38) is independent of (u, v). By choosing (u, v) = (0, 0), we find that ∫_G y_0 z(0) dx = 0. Thus,

0 = E ∫_0^T ∫_{G_0} z u dx dt + E ∫_0^T ∫_G (a z + Z) v dx dt, ∀ (u, v) ∈ L^2_F(0, T; L^2(G_0)) × L^2_F(0, T; L^2(G)),

which yields that z = 0 in G_0 × (0, T), a.s., and a z + Z = 0 in G × (0, T), a.s. By the conclusion 2) in Corollary 5.1, we then have η = 0, a contradiction.

Similarly to parabolic equations, hyperbolic equations (including particularly the wave equations) are another class of typical PDEs. In the deterministic setting, the controllability problems for hyperbolic equations are extensively studied. As we shall see in this section, the usual stochastic hyperbolic equation, i.e., the classical hyperbolic equation perturbed by a term of Itô's integral, is not exactly controllable even if the controls are effective everywhere in both the drift and diffusion terms, which differs dramatically from its deterministic counterpart. This section is based on [55].
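Before turning to the stochastic model, recall the basic mechanism that distinguishes hyperbolic from parabolic problems: finite speed of propagation, which is also why a minimal control time appears in the deterministic theory below. The following sketch (a toy computation, not taken from [55]) checks it for the 1-D wave equation using a leapfrog scheme at CFL number 1:

```python
import numpy as np

# Finite speed of propagation for the 1-D wave equation y_tt = y_xx:
# a bump supported in (0.2, 0.4) cannot influence the region x >= 0.85
# before time t = 0.45. The leapfrog scheme at CFL number 1 has the same
# discrete domain of dependence, so the far field is exactly zero at t = 0.4.
n = 400
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
dt = h                                         # CFL number 1

bump = np.where(np.abs(x - 0.3) < 0.1,
                np.cos(np.pi * (x - 0.3) / 0.2) ** 2, 0.0)

y_old = bump.copy()                            # y at t = 0
y = bump.copy()                                # y at t = dt (zero initial velocity)
y[1:-1] = 0.5 * (bump[2:] + bump[:-2])
y[0] = y[-1] = 0.0

steps = int(round(0.4 / dt)) - 1               # advance to t = 0.4
for _ in range(steps):
    y_new = np.empty_like(y)
    y_new[1:-1] = y[2:] + y[:-2] - y_old[1:-1]
    y_new[0] = y_new[-1] = 0.0                 # homogeneous Dirichlet boundary
    y_old, y = y, y_new

far_field = np.max(np.abs(y[x >= 0.85]))
assert far_field == 0.0
```

Since each leapfrog step moves information by exactly one grid cell, the grid points with x ≥ 0.85 only ever combine values that are exactly zero up to t = 0.4; no tolerance is needed in the final check.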
Let T > 0 and G be given as in Subsection 1.4, and let Γ_0 be a subset of Γ. Put Q and Σ as in (4.1), and Σ_0 = (0, T) × Γ_0. Similarly to the previous two sections, we begin with the following controlled (deterministic) hyperbolic equation:

y_tt − ∆y = a y in Q,
y = χ_{Σ_0} h on Σ,
y(0) = y_0, y_t(0) = y_1 in G.   (6.1)

Here (y_0, y_1) ∈ L^2(G) × H^{−1}(G), a ∈ L^∞(Q), (y, y_t) is the state variable, and h ∈ L^2(Σ_0) is the control variable. It is well known that the system (6.1) admits a unique transposition solution y ∈ C([0, T]; L^2(G)) ∩ C^1([0, T]; H^{−1}(G)) (e.g., [31]). Definition 6.1
We say the system (6.1) is exactly controllable at time T if for any given ( y , y ) , (˜ y , ˜ y ) ∈ L ( G ) × H − ( G ) , one can find a control h ∈ L (Σ ) such that the solution to (6.1)satisfying y ( T ) = ˜ y and y t ( T ) = ˜ y . To show the exact controllability of (6.1), people introduce its dual equation as follows z tt − ∆ z = a z in Q,z = 0 on Σ ,z ( T ) = z , z t ( T ) = z in G, (6.2)where ( z , z ) ∈ H ( G ) × L ( G ).By means of the standard duality argument, it is easy to show the following result (e.g., [29]).53 roposition 6.1 The equation (6.1) is exactly controllable at time T if and only if solutions tothe equation (6.2) satisfy the following observability estimate: | ( z , z ) | H ( G ) × L ( G ) ≤ C (cid:12)(cid:12)(cid:12) ∂z∂ν (cid:12)(cid:12)(cid:12) L (Σ ) , ∀ ( z , z ) ∈ H ( G ) × L ( G ) . (6.3)It is known that (e.g., [81, 83]), under some assumptions on ( T, G, Γ )(for example, Γ = { x ∈ Γ | ( x − x ) · ν ( x ) > } for some x ∈ R n and T > x ∈ G | x − x | ), the inequality (6.3) holds. Asa result, the system (6.1) is exactly controllable. In the rest of this section, we shall see that thecontrollability property of stochastic hyperbolic equation is quite different from its deterministiccounterpart.Now, let us consider the following controlled stochastic hyperbolic equation: dy t − ∆ ydt = ( a y + g ) dt + ( a y + g ) dW ( t ) in Q,y = h on Σ ,y (0) = y , y t (0) = y in G, (6.4)where ( y , y ) ∈ L ( G ) × H − ( G ), a , a ∈ L ∞ F (0 , T ; L ∞ ( G )), ( y, y t ) are the state variable , g , g ∈ L ∞ F (0 , T ; H − ( G )) and h ∈ L F (0 , T ; L (Γ)) are the control variables . As we shall see in Subsec-tion 6.3, the equation (6.4) admits a unique transposition solution y ∈ C F ([0 , T ]; L (Ω; L ( G ))) ∩ C F ([0 , T ]; L (Ω; H − ( G ))). Definition 6.2
The system (6.4) is called exactly controllable at time T if for any (y_0, y_1) ∈ L^2(G) × H^{−1}(G) and (ỹ_0, ỹ_1) ∈ L^2_{F_T}(Ω; L^2(G)) × L^2_{F_T}(Ω; H^{−1}(G)), there is a triple of controls (g_1, g_2, h) ∈ L^2_F(0, T; H^{−1}(G)) × L^2_F(0, T; H^{−1}(G)) × L^2_F(0, T; L^2(Γ)) such that the corresponding solution y to the system (6.4) satisfies (y(T), y_t(T)) = (ỹ_0, ỹ_1), a.s.
Theorem 6.1
The system (6.4) is not exactly controllable at any time
T > 0. Clearly, the controls in (6.4) are the strongest ones people can introduce. Nevertheless, the result in Theorem 6.1 differs significantly from the above mentioned controllability property of deterministic hyperbolic equations. Since (6.4) is a generalization of the classical hyperbolic equation to the stochastic setting, from the viewpoint of control theory, we believe that some key feature has been ignored in the derivation of the equation (6.4). A proof of Theorem 6.1 and some further discussions will be given in Subsection 6.4.
System (6.4) is a nonhomogeneous boundary value problem. Similarly to the control system (4.5),its solution is understood in the sense of transposition. To this end, we introduce the following“reference” equation: dz = ˆ zdt + ZdW ( t ) in Q τ ∆ =(0 , τ ) × G,d ˆ z − ∆ zdt = ( a z + a Z ) dt + b ZdW ( t ) in Q τ ,z = 0 on Σ τ ∆ =(0 , τ ) × Γ ,z ( τ ) = z τ , ˆ z ( τ ) = ˆ z τ in G, (6.5)54here τ ∈ (0 , T ] and ( z τ , ˆ z τ ) ∈ L F τ (Ω; H ( G )) × L F τ (Ω; L ( G )). By Theorem 1.21, the system(6.5) admits a unique solution ( z, ˆ z, Z, b Z ) ∈ L F (Ω; C ([0 , τ ]; H ( G ))) × L F (Ω; C ([0 , τ ]; L ( G ))) × L F (0 , τ ; H ( G )) × L F (0 , τ ; L ( G )) . Moreover, | z | L F (Ω; C ([0 ,τ ]; H ( G ))) + | ˆ z | L F (Ω; C ([0 ,τ ]; L ( G ))) + | Z | L F (0 ,τ ; H ( G )) + | b Z | L F (0 ,τ ; L ( G )) ≤ C (cid:0) | z τ | L F τ (Ω; H ( G )) + | ˆ z τ | L F τ (Ω; L ( G )) (cid:1) . (6.6)Similarly to Section 4, we need to establish a hidden regularity for solutions to (6.5). Proposition 6.2
For any ( z τ , ˆ z τ ) ∈ L F τ (Ω; H ( G )) × L F τ (Ω; L ( G )) , the solution ( z, ˆ z, Z, b Z ) to (6.5) satisfies ∂z∂ν (cid:12)(cid:12) Γ ∈ L F (0 , τ ; L (Γ)) . Furthermore, (cid:12)(cid:12)(cid:12) ∂z∂ν (cid:12)(cid:12)(cid:12) L F (0 ,τ ; L (Γ)) ≤ C (cid:0) | z τ | L F τ (Ω; H ( G )) + | ˆ z τ | L F τ (Ω; L ( G )) (cid:1) , (6.7) where the constant C is independent of τ .Proof . One can find a vector field Ξ = (Ξ , · · · , Ξ n ) ∈ C ( R n ; R n ) such that Ξ( x ) = ν ( x ) for x ∈ Γ (See [29, p. 29])). By Itˆo’s formula and the first equation of (6.5), we have d (ˆ z Ξ · ∇ z ) = d ˆ z Ξ · ∇ z + ˆ z Ξ · ∇ dz + d ˆ z Ξ · ∇ dz = d ˆ z Ξ · ∇ z + ˆ z Ξ · ∇ (cid:0) ˆ zdt + ZdW ( t ) (cid:1) + d ˆ z Ξ · ∇ dz = d ˆ z Ξ · ∇ z + 12 (cid:2) div (ˆ z Ξ) − (div Ξ)ˆ z (cid:3) dt + ˆ z Ξ · ∇ ZdW ( t ) + d ˆ z Ξ · ∇ dz. (6.8)It follows from a direct computation thatdiv (cid:2) · ∇ z ) ∇ z − Ξ |∇ z | (cid:3) = 2 (cid:16) ∆ z Ξ · ∇ z + n X j,k =1 Ξ kx j z x j z x k (cid:17) − (div Ξ) |∇ z | . (6.9)Combining (6.8) and (6.9), we obtain that − div (cid:2) · ∇ z ) ∇ z + Ξ (cid:0) ˆ z − |∇ z | (cid:1)(cid:3) dt = 2 h − d (ˆ z Ξ · ∇ z ) + (cid:0) d ˆ z − ∆ zdt (cid:1) Ξ · ∇ z − n X j,k =1 Ξ kx j z x j z x k dt i − (div Ξ)ˆ z dt + (div Ξ) |∇ z | dt + 2 d ˆ z Ξ · ∇ dz + 2ˆ z Ξ · ∇ ZdW ( t ) . (6.10)Integrating (6.10) in Q , taking expectation on Ω, using the second equation of (6.5) and notingthat z = 0 on (0 , τ ) × Γ, we get that − E Z Σ τ (cid:12)(cid:12)(cid:12) ∂z∂ν (cid:12)(cid:12)(cid:12) d Σ τ = − E Z G ˆ z T Ξ · ∇ z T dx + 2 E Z G ˆ z (0)Ξ · ∇ z (0) dx + 2 Z Q τ h(cid:0) a z + a Z (cid:1) Ξ · ∇ z − n X j,k =1 Ξ kx j z x j z x k − (div Ξ)ˆ z + (div Ξ) |∇ z | + 2 b Z Ξ · ∇ Z i dxdt ∆ = I . By (6.6), we have |I| ≤ C (cid:0) | z τ | L F τ (Ω; H ( G )) + | ˆ z τ | L F τ (Ω; L ( G )) (cid:1) , which, together with the aboveequality, implies (6.7). 55 .3 Well-posedness of the control system We begin with the following notion.
Definition 6.3
A stochastic process y ∈ C F ([0 , T ]; L (Ω; L ( G ))) ∩ C F ([0 , T ]; L (Ω; H − ( G ))) iscalled a transposition solution to (6.4) if for any τ ∈ (0 , T ] , ( z τ , ˆ z τ ) ∈ L F τ (Ω; H ( G )) × L F τ (Ω; L ( G )) and the corresponding solution ( z, ˆ z, Z, b Z ) to (6.5) , it holds that E h y t ( τ ) , z τ i H − ( G ) ,H ( G ) − h y , z (0) i H − ( G ) ,H ( G ) − E h y ( τ ) , ˆ z τ i L ( G ) + h y , ˆ z (0) i L ( G ) (6.11)= E Z τ h g , z i H − ( G ) ,H ( G ) dt + E Z τ h g , Z i H − ( G ) ,H ( G ) dt − E Z Σ τ ∂z∂ν hd Σ τ . Remark 6.1
Note that, by Proposition 6.2, the solution ( z, ˆ z, Z, b Z ) to (6.5) satisfies ∂z∂ν (cid:12)(cid:12) Γ ∈ L F (0 , τ ; L (Γ)) . Hence, the term “ E R Σ τ ∂z∂ν hd Σ τ ” in (6.11) makes sense. We have the following well-posedness result for (6.4).
Proposition 6.3
For each ( y , y ) ∈ L ( G ) × H − ( G ) and ( g , g , h ) ∈ L F (0 , T ; H − ( G )) × L F (0 , T ; H − ( G )) × L F (0 , T ; L (Γ )) , the system (6.4) admits a unique transposition solution y ∈ C F ([0 , T ]; L (Ω; L ( G ))) ∩ C F ([0 , T ]; L (Ω; H − ( G ))) , and | y | C F ([0 ,T ]; L (Ω; L ( G ))) ∩ C F ([0 ,T ]; L (Ω; H − ( G ))) ≤ C (cid:0) | y | L ( G ) + | y | H − ( G ) + | g | L F (0 ,T ; H − ( G )) + | g | L F (0 ,T ; H − ( G )) + | h | L F (0 ,T ; L (Γ)) (cid:1) . (6.12) Proof . Uniqueness . Assume that y and ˜ y are two transposition solutions of (6.4). It followsfrom Definition 6.3 that for any τ ∈ (0 , T ] and ( z τ , ˆ z τ ) ∈ L F τ (Ω; H ( G )) × L F τ (Ω; L ( G )), E h y t ( τ ) , z τ i H − ( G ) ,H ( G ) − E h y ( τ ) , ˆ z τ i L ( G ) = E h ˜ y t ( τ ) , z τ i H − ( G ) ,H ( G ) − E h ˜ y ( τ ) , ˆ z τ i L ( G ) , (6.13)which implies that (cid:0) y ( τ ) , y t ( τ ) (cid:1) = (cid:0) ˜ y ( τ ) , ˜ y t ( τ ) (cid:1) , a.s., ∀ τ ∈ (0 , T ]. Hence, y = ˜ y in C F ([0 , T ]; L (Ω; L ( G ))) ∩ C F ([0 , T ]; L (Ω; H − ( G ))) . Existence . Since h ∈ L F (0 , T ; L (Γ)), there exists a sequence { h m } ∞ m =1 ⊂ C F ([0 , T ]; H / (Γ))with h m (0) = 0 for all m ∈ N such thatlim m →∞ h m = h in L F (0 , T ; L (Γ)) . (6.14)For each m ∈ N , we can find an ˜ h m ∈ C F ([0 , T ]; H ( G )) such that ˜ h m | Γ = h m and ˜ h m (0) = 0.Consider the following equation: d ˜ y m,t − ∆˜ y m dt = ( a ˜ y m + ζ m ) dt + [ a (˜ y m + ˜ h m ) + g ] dW ( t ) in Q, ˜ y m = 0 on Σ , ˜ y m (0) = y , ˜ y m,t (0) = y in G, (6.15)56here ζ m = g + ∆˜ h m + a ˜ h m . By Theorem 1.17, the system (6.15) admits a unique mild (alsoweak) solution ˜ y m ∈ C F ([0 , T ]; L (Ω; L ( G ))) ∩ C F ([0 , T ]; L (Ω; H − ( G ))).Let y m = ˜ y m + ˜ h m . 
By Itˆo’s formula and integration by parts, we have that E h y m,t ( τ ) , z τ i H − ( G ) ,H ( G ) − h y , z (0) i H − ( G ) ,H ( G ) − E h y m ( τ ) , ˆ z τ i L ( G ) + h y , ˆ z (0) i L ( G ) (6.16)= − E Z τ h g , z i H − ( G ) ,H ( G ) dt + E Z τ h g , Z i H − ( G ) ,H ( G ) dt − E Z Σ τ ∂z∂ν h m d Σ τ . Consequently, for any m , m ∈ N , E h y m ,t ( τ ) − y m ,t ( τ ) , z τ i H − ( G ) ,H ( G ) − E h y m ( τ ) − y m ( τ ) , ˆ z τ i L ( G ) = − E Z Σ τ ∂z∂ν ( h m − h m ) d Σ τ . (6.17)By Proposition 1.2, there is ( z τ , ˆ z τ ) ∈ L F τ (Ω; H ( G )) × L F τ (Ω; L ( G )) so that | z τ | L F τ (Ω; H ( G )) = 1 , | ˆ z τ | L F τ (Ω; L ( G )) = 1and that E h y m ,t ( τ ) − y m ,t ( τ ) , z τ i H − ( G ) ,H ( G ) − E h y m ( τ ) − y m ( τ ) , ˆ z τ i L ( G ) ≥ (cid:0) | y m ( τ ) − y m ( τ ) | L F τ (Ω; L ( G )) + | y m ,t ( τ ) − y m ,t ( τ ) | L F τ (Ω; H − ( G )) (cid:1) . (6.18)It follows from (6.17), (6.18) and Proposition 6.2 that | y m ( τ ) − y m ( τ ) | L F τ (Ω; L ( G )) + | y m ,t ( τ ) − y m ,t ( τ ) | L F T (Ω; H − ( G )) ≤ (cid:12)(cid:12)(cid:12) E Z Σ τ ∂z∂ν ( h m − h m ) d Σ τ (cid:12)(cid:12)(cid:12) ≤ C| h m − h m | L F (0 ,T ; L (Γ)) | ( z τ , ˆ z τ ) | L F τ (Ω; H ( G )) × L F τ (Ω; L ( G )) ≤ C| h m − h m | L F (0 ,T ; L (Γ)) , where the constant C is independent of τ . Consequently, it holds that | y m − y m | C F ([0 ,T ]; L (Ω; L ( G ))) + | y m ,t − y m | C F ([0 ,T ]; L (Ω; H − ( G ))) ≤ C| h m ,t − h m | L F (0 ,T ; L (Γ)) . This concludes that { y m } ∞ m =1 is a Cauchy sequence in C F ([0 , T ]; L (Ω; L ( G ))) ∩ C F ([0 , T ]; L (Ω; H − ( G ))). Denote by y the limit of { y m } ∞ m =1 . Letting m → ∞ in (6.16), we see that y satisfies(6.11). 
Thus, y is a transposition solution to (6.4).By Proposition 1.2, there is ( z τ , ˆ z τ ) ∈ L F τ (Ω; H ( G )) × L F τ (Ω; L ( G )) such that | z τ | L F τ (Ω; H ( G )) = 1, | ˆ z τ | L F τ (Ω; L ( G )) = 1 and E h y t ( τ ) , z τ i H − ( G ) ,H ( G ) − E h y ( τ ) , ˆ z τ i L ( G ) ≥ (cid:0) | y ( τ ) | L F τ (Ω; L ( G )) + | ˆ y ( τ ) | L F τ (Ω; H − ( G )) (cid:1) . (6.19)57ombining (6.11), (6.19) and Proposition 6.2, we obtain that, for any τ ∈ (0 , T ], | y ( τ ) | L F τ (Ω; L ( G )) + | y t ( τ ) | L F τ (Ω; H − ( G )) ≤ (cid:16)(cid:12)(cid:12) h y , z (0) i H − ( G ) ,H ( G ) (cid:12)(cid:12) + (cid:12)(cid:12) h y , ˆ z (0) i L ( G ) (cid:12)(cid:12) + (cid:12)(cid:12)(cid:12) E Z τ h g , z i H − ( G ) ,H ( G ) dt (cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12) E Z τ h g , Z i H − ( G ) ,H ( G ) dt (cid:12)(cid:12)(cid:12) + (cid:12)(cid:12)(cid:12) E Z Σ τ ∂z∂ν hd Σ τ (cid:12)(cid:12)(cid:12)(cid:17) ≤ C (cid:0) | y | L ( G ) + | y | H − ( G ) + | g | L F (0 ,T ; H − ( G )) + | g | L F (0 ,T ; H − ( G )) + | h | L F (0 ,T ; L (Γ)) (cid:1) × | ( z τ , ˆ z τ ) | L F τ (Ω; H ( G )) × L F τ (Ω; L ( G )) ≤ C (cid:0) | y | L ( G ) + | y | H − ( G ) + | g | L F (0 ,T ; H − ( G )) + | g | L F (0 ,T ; H − ( G )) + | h | L F (0 ,T ; L (Γ)) (cid:1) , where the constant C is independent of τ . Therefore, we obtain the estimate (6.12). This completesthe proof of Proposition 6.3. First of all, let us prove the negative controllability result in Theorem 6.1:
Proof. [Proof of Theorem 6.1] We use the contradiction argument. Choose $\psi \in H_0^1(G)$ satisfying $|\psi|_{L^2(G)} = 1$ and let $\tilde y = \xi\psi$, where $\xi$ is given in Lemma 3.1. Assume that (6.4) were exactly controllable at some time $T >$
0. Then, for any y ∈ L ( G ), we would find a triple of controls( g , g , h ) ∈ L F (0 , T ; H − ( G )) × L F (0 , T ; H − ( G )) × L F (0 , T ; L (Γ)) such that the correspondingsolution y ∈ C F ([0 , T ]; L (Ω; L ( G ))) ∩ C F ([0 , T ]; L (Ω; H − ( G ))) to the equation (6.4) satisfiesthat y ( T ) = ˜ y . Clearly, Z G ˜ y ψdx − Z G y ψdx = Z T h y t , ψ i H − ( G ) ,H ( G ) dt, which leads to ξ = Z G y ψdx + Z T h y t , ψ i H − ( G ) ,H ( G ) dt. This contradicts Lemma 3.1.Then, as we did in [55], motivated by Theorem 6.1, we propose below a refined model to describethe DNA molecule considered in Example 2.2. For this purpose, we partially employ a dynamicaltheory of Brownian motions, developed in [60], to describe the motion of a particle perturbed byrandom forces.According to [60, Chapter 11], we may suppose that y ( t, x ) = Z t ˜ y ( s, x ) ds + Z t F ( s, x, y ( s )) dW ( s ) . (6.20)Here ˜ y ( · , · ) is the expected velocity, F ( · , · , · ) is the random perturbation from the fluid molecule.When y is small, one can assume that F ( · , · , · ) is linear in the third argument, i.e., for a suitable b ( · , · ), F ( s, x, y ( t, x )) = b ( t, x ) y ( t, x ) . (6.21)58ormally, the acceleration at position x along the string at time t is ˜ y t ( t, x ). By Newton’s secondlaw, it follows that ˜ y t ( t, x ) = F ( t, x ) + F ( t, x ) + F ( t, x ) + F ( t, x ) . Similar to (2.8), we obtain then that d ˜ y ( t, x ) = y xx ( t, x ) dt + F ( t, x ) dt + F ( t, x ) dt + a ( t, x ) y ( t, x ) dW ( t ) . (6.22)Combining (6.20), (6.21) and (6.22), we arrive at the following modified version of (2.8): ( dy = ˜ ydt + b ydW ( t ) in (0 , T ) × (0 , L ) ,d ˜ y = ( y xx + a y x + a y t + a y ) dt + a ydW ( t ) in (0 , T ) × (0 , L ) . 
(6.23)

Then, similarly to (2.9), we obtain the following control system:
$$
\begin{cases}
dy = \tilde y\,dt + (b_1 y + u)\,dW(t) & \text{in } (0,T)\times(0,L),\\
d\tilde y = \big(y_{xx} + a_1 y_x + a_2 y_t + a_3 y\big)dt + (a_4 y + v)\,dW(t) & \text{in } (0,T)\times(0,L),\\
y = f_1 & \text{on } (0,T)\times\{0\},\\
y = f_2 & \text{on } (0,T)\times\{L\},\\
y(0) = y_0, \quad \tilde y(0) = y_1 & \text{in } (0,L),
\end{cases}
$$
where $(f_1, f_2, u, v)$ are controls which belong to some suitable spaces. Under some assumptions, one can show the exact controllability of the above control system (see [53, Theorem 10.12 in Chapter 10]). This, in turn, justifies our modification (6.23).

Some further results related to the exact controllability of the above refined stochastic wave equations (even in several space dimensions) can be found in [55].

From this section to Section 9, we deal with linear quadratic optimal control problems (LQ problems for short) for SEEs. The content of these sections is taken from [41, 53].
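Although the LQ theory below is developed in infinite dimensions, every object in it already makes sense for a scalar SDE. As a concrete orientation, the following sketch (all coefficients are illustrative choices of ours, not taken from the text) simulates a one-dimensional controlled state equation $dX = (aX + bu)\,dt + (cX + du)\,dW$ by the Euler–Maruyama scheme and estimates the associated quadratic cost by Monte Carlo.

```python
import numpy as np

# Finite-dimensional (scalar) analogue of a controlled linear SEE:
#   dX = (a*X + b*u) dt + (c*X + d*u) dW,  X(0) = eta,
# with quadratic cost
#   J = 1/2 E[ int_0^T (m*X^2 + r*u^2) dt + g*X(T)^2 ].
# All coefficients below are illustrative, not taken from the notes.

rng = np.random.default_rng(0)
a, b, c, d = -1.0, 1.0, 0.3, 0.1
m, r, g = 1.0, 1.0, 1.0
T, N, M = 1.0, 200, 20000          # horizon, time steps, Monte Carlo paths
dt = T / N
eta = 1.0

def cost(feedback):
    """Monte Carlo estimate of J(eta; u) for a feedback law u = feedback(t, X)."""
    X = np.full(M, eta)
    J = np.zeros(M)
    for k in range(N):
        t = k * dt
        u = feedback(t, X)
        J += 0.5 * (m * X**2 + r * u**2) * dt
        dW = rng.normal(0.0, np.sqrt(dt), size=M)
        X = X + (a * X + b * u) * dt + (c * X + d * u) * dW
    J += 0.5 * g * X**2
    return J.mean()

J0 = cost(lambda t, X: 0.0 * X)    # no control
J1 = cost(lambda t, X: -0.5 * X)   # a (suboptimal) linear feedback
print(J0, J1)
```

Since the weights $m, r, g$ are nonnegative with $r > 0$, this is a "standard" LQ problem in the sense introduced later, so finite costs are guaranteed; the two printed numbers merely compare the uncontrolled dynamics with one admissible linear feedback.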
Fix any $T > 0$, let both $H$ and $A$ (generating a $C_0$-semigroup $S(\cdot)$ on $H$) be given as in Subsection 1.9, and let $U$ be another Hilbert space.

In what follows, for a Hilbert space $\widehat H$, denote by $\mathcal S(\widehat H)$ the set of all self-adjoint operators on $\widehat H$. For $M, N \in \mathcal S(\widehat H)$, we use the notation $M \geq N$ (resp. $M > N$) to indicate that $M - N$ is positive semi-definite (resp. positive definite). For any $\mathcal S(\widehat H)$-valued stochastic process $F$ on $[0,T]$, we write $F \geq 0$ (resp. $F > 0$; $F \gg 0$) if $F(t,\omega) \geq 0$ (resp. $F(t,\omega) > 0$; $F(t,\omega) \geq \delta I$ for some $\delta > 0$, where $I$ is the identity operator on $\widehat H$) for a.e. $(t,\omega) \in [0,T] \times \Omega$.

Consider a control system governed by the following linear SEE:
$$
\begin{cases}
dX(t) = \big(AX(t) + A_1(t)X(t) + B(t)u(t)\big)dt + \big(C(t)X(t) + D(t)u(t)\big)dW(t) & \text{in } (0,T],\\
X(0) = \eta.
\end{cases}
\tag{7.1}
$$
Here $\eta \in H$, $A_1, C \in L^\infty_{\mathbb F}(0,T;\mathcal L(H))$ and $B, D \in L^\infty_{\mathbb F}(0,T;\mathcal L(U;H))$. In (7.1), $X(\cdot)$ is the state variable and $u(\cdot) \in \mathcal U[0,T] \triangleq L^2_{\mathbb F}(0,T;U)$ is the control variable. For any $u \in \mathcal U[0,T]$, by Theorem 1.17, the system (7.1) admits a unique solution $X(\cdot;\eta,u) \in C_{\mathbb F}([0,T]; L^2(\Omega;H))$ such that
$$
|X(\cdot;\eta,u)|_{C_{\mathbb F}([0,T]; L^2(\Omega;H))} \leq C\big(|\eta|_H + |u|_{L^2_{\mathbb F}(0,T;U)}\big).
\tag{7.2}
$$
When there is no confusion, we simply denote the solution by $X(\cdot)$.

Associated with the system (7.1), we consider the following quadratic cost functional:
$$
\mathcal J(\eta; u(\cdot)) = \frac12\,\mathbb E\Big[\int_0^T \big(\langle M(t)X(t), X(t)\rangle_H + \langle R(t)u(t), u(t)\rangle_U\big)dt + \langle GX(T), X(T)\rangle_H\Big].
\tag{7.3}
$$
Here $M(\cdot) \in L^\infty_{\mathbb F}(0,T;\mathcal S(H))$, $R(\cdot) \in L^\infty_{\mathbb F}(0,T;\mathcal S(U))$ and $G \in L^\infty_{\mathcal F_T}(\Omega;\mathcal S(H))$. In what follows, to simplify the notation, the time variable $t$ will be suppressed in $X(\cdot)$, $B(\cdot)$, $C(\cdot)$, $D(\cdot)$, $M(\cdot)$ and $R(\cdot)$, and therefore we shall simply write them as $X$, $B$, $C$, $D$, $M$ and $R$, respectively (if there is no confusion).

Let us now state our stochastic LQ problem as follows:

Problem (SLQ).
For each $\eta \in H$, find a $\bar u(\cdot) \in \mathcal U[0,T]$ such that
$$
\mathcal J(\eta; \bar u(\cdot)) = \inf_{u(\cdot) \in \mathcal U[0,T]} \mathcal J(\eta; u(\cdot)).
\tag{7.4}
$$
Problem (SLQ) is by now extensively studied in the case of finite dimensions (i.e., $\dim H < \infty$) and of the natural filtration. A nice treatise on this topic is [68], and the reader can find rich references therein. To handle the infinite dimensional case, we borrow some ideas, such as the optimal feedback control operator, from [68]. At first glance, one might think that the study of Problem (SLQ) for $\dim H = \infty$ is simply a routine extension of that for $\dim H < \infty$. However, the infinite dimensional setting leads to significant new difficulties.

We begin with the following notions.

Definition 7.1
1) Problem (SLQ) is said to be finite at $\eta \in H$ if $\displaystyle\inf_{u(\cdot)\in\mathcal U[0,T]} \mathcal J(\eta; u(\cdot)) > -\infty$.
2) Problem (SLQ) is said to be solvable at $\eta \in H$ if there exists a control $\bar u(\cdot) \in \mathcal U[0,T]$ such that
$$
\mathcal J(\eta; \bar u(\cdot)) = \inf_{u(\cdot)\in\mathcal U[0,T]} \mathcal J(\eta; u(\cdot)).
$$
In this case, $\bar u(\cdot)$ is called an optimal control. The corresponding state $\bar X(\cdot)$ and the pair $(\bar X(\cdot), \bar u(\cdot))$ are called an optimal state and an optimal pair, respectively.

3) Problem (SLQ) is said to be finite (resp. solvable) if it is finite (resp. solvable) at any $\eta \in H$.

In this subsection, we are concerned with the finiteness and solvability of Problem (SLQ). By the variation of constants formula, the solution to (7.1) can be expressed explicitly in terms of the initial state and the control, in a linear form. Substituting this into the cost functional (which is quadratic in the state and the control), one obtains a quadratic functional with respect to the control. Thus, Problem (SLQ) can be transformed into a quadratic optimization problem on $\mathcal U[0,T]$. This leads to some necessary and sufficient conditions for the finiteness and solvability of Problem (SLQ). Let us present the details below.

To begin with, we define four linear operators $\Xi: \mathcal U[0,T] \to L^2_{\mathbb F}(0,T;H)$, $\widehat\Xi: \mathcal U[0,T] \to L^2_{\mathcal F_T}(\Omega;H)$, $\Gamma: H \to L^2_{\mathbb F}(0,T;H)$ and $\widehat\Gamma: H \to L^2_{\mathcal F_T}(\Omega;H)$ as follows:
$$
\begin{cases}
(\Xi u(\cdot))(\cdot) \triangleq X(\cdot; 0, u), \quad \widehat\Xi u(\cdot) \triangleq X(T; 0, u), & \forall\, u(\cdot) \in \mathcal U[0,T],\\
(\Gamma\eta)(\cdot) \triangleq X(\cdot; \eta, 0), \quad \widehat\Gamma\eta \triangleq X(T; \eta, 0), & \forall\, \eta \in H,
\end{cases}
\tag{7.5}
$$
where $X(\cdot) \equiv X(\cdot;\eta,u(\cdot))$ solves the equation (7.1). By (7.2), the above operators are all bounded. By (7.5), for any $\eta \in H$ and $u(\cdot) \in \mathcal U[0,T]$, the corresponding state process $X(\cdot)$ and its terminal value $X(T)$ satisfy
$$
X(\cdot) = (\Gamma\eta)(\cdot) + (\Xi u(\cdot))(\cdot), \qquad X(T) = \widehat\Gamma\eta + \widehat\Xi u(\cdot).
\tag{7.6}
$$
We need to compute the adjoint operators of $\Xi$, $\widehat\Xi$, $\Gamma$ and $\widehat\Gamma$.
To this end, let us introduce the following BSEE:
$$
\begin{cases}
dY(t) = -\big(A^*Y(t) + A_1^*Y(t) + C^*Z(t) + \xi(t)\big)dt + Z(t)\,dW(t) & \text{in } [0,T),\\
Y(T) = Y_T,
\end{cases}
\tag{7.7}
$$
where $Y_T \in L^2_{\mathcal F_T}(\Omega;H)$ and $\xi(\cdot) \in L^2_{\mathbb F}(0,T;H)$. Since we do not assume that the filtration $\mathbb F$ is the natural one generated by $W(\cdot)$, the equation (7.7) may not have a weak/mild solution. As we have explained in Subsection 1.10, its solution is understood in the sense of transposition. For any $Y_T \in L^2_{\mathcal F_T}(\Omega;H)$ and $\xi(\cdot) \in L^2_{\mathbb F}(0,T;H)$, by Theorem 1.24, there exists a unique transposition solution $(Y(\cdot), Z(\cdot)) \in D_{\mathbb F}([0,T]; L^2(\Omega;H)) \times L^2_{\mathbb F}(0,T;H)$ to (7.7). Moreover,
$$
\sup_{0\leq t\leq T}\mathbb E|Y(t)|_H^2 + \mathbb E\int_0^T |Z(t)|_H^2\, dt \leq C\,\mathbb E\Big(|Y_T|_H^2 + \int_0^T |\xi(t)|_H^2\, dt\Big),
\tag{7.8}
$$
for some constant $C > 0$.

Proposition 7.1
For any ξ ( · ) ∈ L F (0 , T ; H ) , ( (Ξ ∗ ξ )( t ) = B ∗ Y ( t ) + D ∗ Z ( t ) , t ∈ [0 , T ] , Γ ∗ ξ = Y (0) , (7.9) where ( Y ( · ) , Z ( · )) is the transposition solution to (7.7) with Y T = 0 . Similarly, for any Y T ∈ L F T (Ω; H ) , ( ( b Ξ ∗ Y T )( t ) = B ∗ Y ( t ) + D ∗ Z ( t ) , t ∈ [0 , T ] , b Γ ∗ Y T = Y (0) , (7.10) where ( Y ( · ) , Z ( · )) is the transposition solution to (7.7) with ξ ( · ) = 0 . Here and henceforth, for any operator-valued process ( resp. random variable) R , we denote by R ∗ its pointwisedual operator-valued process ( resp. random variable). For example, if R ∈ L r F (0 , T ; L r (Ω; L ( H ))), then R ∗ ∈ L r F (0 , T ; L r (Ω; L ( H ))), and | R | L r F (0 ,T ; L r (Ω; L ( H ))) = | R ∗ | L r F (0 ,T ; L r (Ω; L ( H ))) . roof . For any η ∈ H and u ( · ) ∈ U [0 , T ], let X ( · ) be the solution of (7.1). From the definitionof the transposition solution to the equation (7.7), we find that E h X ( T ) , Y T i H − E h η, Y (0) i H = E Z T (cid:0) h u ( t ) , B ∗ Y ( t ) + D ∗ Z ( t ) i U − h X ( t ) , ξ ( t ) i H (cid:1) dt. This implies that E (cid:0) h b Γ η + b Ξ u ( · ) , Y T i H − h η, Y (0) i H (cid:1) = E Z T (cid:0) h u ( t ) , B ∗ Y ( t ) + D ∗ Z ( t ) i U − h (Γ η )( t ) + (Ξ u ( · ))( t ) , ξ ( t ) i H (cid:1) dt. (7.11)Choosing Y T = 0 and η = 0 in (7.11), we have E Z T h (Ξ u ( · ))( t ) , ξ ( t ) i H dt = E Z T h u ( t ) , B ∗ Y ( t ) + D ∗ Z ( t ) i U dt, which yields the first equality in (7.9). Choosing u ( · ) = 0 and Y T = 0 in (7.11), we obtain that E Z T h (Γ η )( t ) , ξ ( t ) i H dt = E h η, y (0) i H . This proves the second equality in (7.9).Letting η = 0 and ξ ( · ) = 0 in (7.11), we find that E h b Ξ u ( · ) , Y T i H = E Z T h u ( t ) , B ∗ Y ( t ) + D ∗ Z ( t ) i H dt. This proves the first equality in (7.10). 
Letting u ( · ) = 0 and ξ ( · ) = 0 in (7.11), we see that E h b Γ η, Y T i H = E h η, Y (0) i H , which gives the second equality in (7.10).From Proposition 7.1, we get immediately the following result, which is a representation for thecost functional (7.3), and will play an important role in the sequel. Proposition 7.2
The cost functional given by (7.3) can be represented as follows:
$$
\mathcal J(\eta; u(\cdot)) = \frac12\Big[\mathbb E\int_0^T \big(\langle \mathcal N u, u\rangle_U + 2\langle \mathcal H(\eta), u\rangle_U\big)dt + \mathbb M(\eta)\Big],
\tag{7.12}
$$
where
$$
\begin{cases}
\mathcal N = R + \Xi^*M\Xi + \widehat\Xi^*G\widehat\Xi,\\
\mathcal H(\eta) = \big(\Xi^*M\Gamma\eta\big)(\cdot) + \big(\widehat\Xi^*G\widehat\Gamma\eta\big)(\cdot),\\
\mathbb M(\eta) = \big\langle M\Gamma\eta, \Gamma\eta\big\rangle_{L^2_{\mathbb F}(0,T;H)} + \big\langle G\widehat\Gamma\eta, \widehat\Gamma\eta\big\rangle_{L^2_{\mathcal F_T}(\Omega;H)}.
\end{cases}
\tag{7.13}
$$

As an application of Proposition 7.2, we have the following result for the finiteness and solvability of Problem (SLQ).

Theorem 7.1 1) If Problem (SLQ) is finite at some $\eta \in H$, then
$$
\mathcal N \geq 0.
\tag{7.14}
$$
2) Problem (SLQ) is solvable at $\eta \in H$ if and only if $\mathcal N \geq 0$ and there exists a control $\bar u(\cdot) \in \mathcal U[0,T]$ such that
$$
\mathcal N\bar u(\cdot) + \mathcal H(\eta) = 0.
\tag{7.15}
$$
In this case, $\bar u(\cdot)$ is an optimal control.

3) If $\mathcal N \gg 0$, then for any $\eta \in H$, $\mathcal J(\eta;\cdot)$ admits a unique minimizer $\bar u(\cdot)$ given by
$$
\bar u(\cdot) = -\mathcal N^{-1}\mathcal H(\eta).
\tag{7.16}
$$
In this case, it holds that
$$
\inf_{u(\cdot)\in\mathcal U[0,T]} \mathcal J(\eta; u(\cdot)) = \mathcal J(\eta; \bar u(\cdot)) = \frac12\Big(\mathbb M(\eta) - \big\langle \mathcal N^{-1}\mathcal H(\eta), \mathcal H(\eta)\big\rangle_{L^2_{\mathbb F}(0,T;U)}\Big).
$$

Proof. We prove the assertion 1) by contradiction. Suppose that (7.14) were not true. Then there would exist $u_0 \in \mathcal U[0,T]$ such that
$$
\mathbb E\int_0^T \langle \mathcal N u_0, u_0\rangle_U\, ds < 0.
$$
Define a sequence $\{u_k\}_{k=1}^\infty \subset \mathcal U[0,T]$ by $u_k(\cdot) = ku_0(\cdot)$ for $k \in \mathbb N$. For $u_k(\cdot)$, we have
$$
\mathcal J(\eta; u_k(\cdot)) = \frac{k^2}{2}\,\mathbb E\Big(\int_0^T \langle \mathcal N u_0, u_0\rangle_U\, dt + \frac{2}{k}\int_0^T \langle \mathcal H(\eta), u_0\rangle_U\, dt + \frac{1}{k^2}\mathbb M(\eta)\Big).
$$
Since $\mathbb E\int_0^T \langle \mathcal H(\eta), u_0\rangle_U\, dt + \mathbb M(\eta) < +\infty$, there is a $k_0 \in \mathbb N$ such that, for any $k \geq k_0$,
$$
\mathcal J(\eta; u_k) \leq \frac{k^2}{4}\,\mathbb E\int_0^T \langle \mathcal N u_0, u_0\rangle_U\, dt.
$$
By letting $k \to \infty$, we find that $\mathcal J(\eta; u_k(\cdot)) \to -\infty$, which contradicts that Problem (SLQ) is finite at $\eta$.

Now we prove the assertion 2). First, let $\bar u(\cdot) \in \mathcal U[0,T]$ be an optimal control of Problem (SLQ) for $\eta \in H$. From the optimality of $\bar u(\cdot)$, for any $u(\cdot) \in \mathcal U[0,T]$, we see that
$$
0 \leq \lim_{\lambda \to 0^+} \frac{1}{\lambda}\big(\mathcal J(\eta; \bar u + \lambda u) - \mathcal J(\eta; \bar u)\big) = \mathbb E\int_0^T \langle \mathcal N\bar u + \mathcal H(\eta), u\rangle_U\, dt.
$$
Since $u(\cdot)$ is arbitrary (one may replace $u$ by $-u$), this forces $\mathcal N\bar u + \mathcal H(\eta) = 0$.
Conversely, let $(\eta, \bar u(\cdot)) \in H \times \mathcal U[0,T]$ satisfy (7.15). For any $u \in \mathcal U[0,T]$, from (7.14), we have
$$
\begin{aligned}
\mathcal J(\eta; u(\cdot)) - \mathcal J(\eta; \bar u(\cdot)) &= \mathcal J\big(\eta; \bar u(\cdot) + u(\cdot) - \bar u(\cdot)\big) - \mathcal J(\eta; \bar u(\cdot))\\
&= \mathbb E\int_0^T \big\langle \mathcal N\bar u + \mathcal H(\eta),\, u - \bar u\big\rangle_U\, dt + \frac12\,\mathbb E\int_0^T \big\langle \mathcal N(u - \bar u),\, u - \bar u\big\rangle_U\, dt\\
&= \frac12\,\mathbb E\int_0^T \big\langle \mathcal N(u - \bar u),\, u - \bar u\big\rangle_U\, dt \geq 0.
\end{aligned}
$$
Hence, $\bar u(\cdot)$ is an optimal control.

Finally, we prove the assertion 3). Since all optimal controls satisfy (7.15) and $\mathcal N$ is invertible, the assertion 3) follows immediately.

When
$$
R \gg 0, \quad M \geq 0, \quad G \geq 0,
\tag{7.17}
$$
Problem (SLQ) is called a standard SLQ problem. In such a case, $\mathcal N \gg 0$. Then, by Theorem 7.1, it is uniquely solvable and the optimal control is given by (7.16).
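In finite dimensions, the quadratic representation of Proposition 7.2 can be assembled numerically, and (7.16) becomes a plain linear solve. The sketch below uses a deterministic discrete-time analogue with matrices chosen purely for illustration: it recovers $\mathcal N$ and $\mathcal H(\eta)$ from finite differences of the (exactly quadratic) cost and checks that $\bar u = -\mathcal N^{-1}\mathcal H(\eta)$ is indeed a minimizer.

```python
import numpy as np

# Discrete-time deterministic analogue of Problem (SLQ):
#   x_{k+1} = A x_k + B u_k,  J = 1/2 ( sum_k x_k'M x_k + u_k'R u_k + x_K'G x_K ).
# We assemble the quadratic representation J = 1/2 (<N u,u> + 2 <H(eta),u> + const)
# by polarization, and verify that u_bar = -N^{-1} H(eta) minimizes J.
# The matrices are illustrative, not from the notes.

n, m, K = 2, 1, 20
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Mw = np.eye(n); R = 0.1 * np.eye(m); G = np.eye(n)
eta = np.array([1.0, 0.0])

def states(eta, u):                # u has shape (K, m)
    xs = [eta]
    for k in range(K):
        xs.append(A @ xs[-1] + B @ u[k])
    return np.array(xs)

def J(eta, u):
    xs = states(eta, u)
    run = sum(x @ Mw @ x for x in xs[:-1]) + sum(uk @ R @ uk for uk in u)
    return 0.5 * (run + xs[-1] @ G @ xs[-1])

dim = K * m
N = np.zeros((dim, dim)); H = np.zeros(dim)
for i in range(dim):
    e = np.zeros(dim); e[i] = 1.0
    ui = e.reshape(K, m)
    # second differences of the quadratic J recover H and N exactly
    H[i] = J(eta, ui) - J(eta, 0 * ui) - (J(0 * eta, ui) - J(0 * eta, 0 * ui))
    for j in range(dim):
        f = np.zeros(dim); f[j] = 1.0
        uj = f.reshape(K, m)
        N[i, j] = (J(0 * eta, (e + f).reshape(K, m)) - J(0 * eta, ui)
                   - J(0 * eta, uj) + J(0 * eta, 0 * ui))

u_bar = np.linalg.solve(N, -H).reshape(K, m)   # the analogue of (7.16)
J_opt = J(eta, u_bar)
rng = np.random.default_rng(1)
assert all(J_opt <= J(eta, u_bar + 0.1 * rng.normal(size=(K, m))) for _ in range(5))
print(J_opt)
```

In infinite dimensions this assembly is not available, which is why Remark 7.1 stresses that inverting $\mathcal N$ is difficult; the finite-dimensional computation only illustrates the algebra behind (7.12)–(7.16).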
Remark 7.1
The main drawback of the formula (7.16) is that it is very difficult to compute the inverse of the operator $\mathcal N$.

In this subsection, we derive the following Pontryagin type maximum principle for Problem (SLQ).
Theorem 7.2
Let Problem (SLQ) be solvable at $\eta \in H$ with $(\bar X(\cdot), \bar u(\cdot))$ being an optimal pair. Then, for the transposition solution $(Y(\cdot), Z(\cdot))$ to
$$
\begin{cases}
dY = -\big(A^*Y + A_1^*Y + C^*Z - M\bar X\big)dt + Z\,dW(t) & \text{in } [0,T),\\
Y(T) = -G\bar X(T),
\end{cases}
\tag{7.18}
$$
we have that
$$
R\bar u - B^*Y - D^*Z = 0, \quad \text{a.e. } (t,\omega) \in [0,T] \times \Omega.
\tag{7.19}
$$
To prove Theorem 7.2, we need the following preliminary result.

Lemma 7.1
Let $\widetilde U$ be a convex subset of $U$. If $F(\cdot) \in L^2_{\mathbb F}(0,T;U)$ and $\bar u(\cdot) \in L^2_{\mathbb F}(0,T;\widetilde U)$ satisfy
$$
\mathbb E\int_0^T \big\langle F(t,\cdot),\, u(t,\cdot) - \bar u(t,\cdot)\big\rangle_U\, dt \leq 0, \quad \forall\, u(\cdot) \in L^2_{\mathbb F}(0,T;\widetilde U),
\tag{7.20}
$$
then
$$
\big\langle F(t,\omega),\, \rho - \bar u(t,\omega)\big\rangle_U \leq 0, \quad \text{a.e. } (t,\omega) \in [0,T] \times \Omega, \ \forall\, \rho \in \widetilde U.
\tag{7.21}
$$
Proof . We prove (7.21) by the contradiction argument. Suppose that (7.21) did not hold. Then,there would exist u ∈ e U and ε > α ε ∆ = Z Ω Z T χ Λ ε ( t, ω ) dtd P > , where Λ ε ∆ = (cid:8) ( t, ω ) ∈ [0 , T ] × Ω (cid:12)(cid:12) (cid:10) F ( t, ω ) , u − ¯ u ( t, ω ) (cid:11) U ≥ ε (cid:9) . For any m ∈ N , letΛ ε,m ∆ = Λ ε ∩ (cid:8) ( t, ω ) ∈ [0 , T ] × Ω (cid:12)(cid:12) | ¯ u ( t, ω ) | U ≤ m (cid:9) . It is clear that lim m →∞ Λ ε,m = Λ ε . Hence, there is m ε ∈ N such that Z Ω Z T χ Λ ε,m ( t, ω ) dtd P > α ε > , ∀ m ≥ m ε . (cid:10) F ( · ) , u − ¯ u ( · ) (cid:11) U is F -measurable, so is the process χ Λ ε,m ( · ). Setˆ u ε,m ( t, ω ) ∆ = u χ Λ ε,m ( t, ω ) + ¯ u ( t, ω ) χ Λ cε,m ( t, ω ) , ( t, ω ) ∈ [0 , T ] × Ω . Since | ¯ u ( · ) | U ≤ m on Λ ε,m , we see that ˆ u ε,m ( · ) ∈ L F (0 , T ; e U ) and ˆ u ε,m ( · ) − ¯ u ( · ) ∈ L F (0 , T ; U ).Hence, for any m ≥ m ε , we have that E Z T (cid:10) F ( t ) , ˆ u ε,m ( t ) − ¯ u ( t ) (cid:11) U dt = Z Ω Z T χ Λ ε,m ( t, ω ) (cid:10) F ( t, ω ) , u − ¯ u ( t, ω ) (cid:11) U dtd P ≥ ε Z Ω Z T χ Λ ε,m ( t, ω ) dtd P ≥ εα ε > , which contradicts (7.20). This completes the proof of Lemma 7.1.Now, let us prove Theorem 7.2. Proof .[Proof of Theorem 7.2] For the optimal pair ( X ( · ) , ¯ u ( · )) and a control u ( · ) ∈ U [0 , T ], wesee that u ε ( · ) ∆ = ¯ u ( · ) + ε [ u ( · ) − ¯ u ( · )] = (1 − ε )¯ u ( · ) + εu ( · ) ∈ U [0 , T ] , ∀ ε ∈ [0 , . Denote by X ε ( · ) the state process of (7.1) corresponding to the control u ε ( · ). Write X ε ( · ) ∆ = X ε ( · ) − X ( · ) ε , δu ( · ) ∆ = u ( · ) − ¯ u ( · ) . It is easy to see that X ε ( · ) solves the following SEE: ( dX ε = (cid:0) AX ε + A X ε + Bδu (cid:1) dt + (cid:0) CX ε + Dδu (cid:1) dW ( t ) in (0 , T ] ,X ε (0) = 0 . 
(7.22)Since ( X ( · ) , ¯ u ( · )) is an optimal pair of Problem (SLQ), we have0 ≤ lim ε → J ( η ; u ε ( · )) − J ( η ; ¯ u ( · )) ε = E Z T (cid:0)(cid:10) M X ( t ) , X ε ( t ) (cid:11) H + (cid:10) R ¯ u ( t ) , δu ( t ) (cid:11) U (cid:1) dt + E (cid:10) GX ( T ) , X ε ( T ) (cid:11) H . (7.23)By the definition of the transposition solution to (7.18), we obtain that − E (cid:10) GX ( T ) , X ε ( T ) (cid:11) H + E Z T (cid:10) M X ( t ) , X ε ( t ) (cid:11) H dt = E Z T (cid:0)(cid:10) Bδu ( t ) , Y ( t ) (cid:11) H + (cid:10) Dδu ( t ) , Z ( t ) (cid:11) H (cid:1) dt. (7.24)Combining (7.23) and (7.24), we arrive at E Z T (cid:10) R ¯ u ( t ) − B ∗ Y ( t ) − D ∗ Z ( t ) , u ( t ) − ¯ u ( t ) (cid:11) U dt ≥ , ∀ u ( · ) ∈ U [0 , T ] . (7.25)65ence, by Lemma 7.1 (where we choose e U = U ), we conclude that (cid:10) R ¯ u − B ∗ Y − D ∗ Z, ρ − ¯ u (cid:11) U ≥ , a.e. [0 , T ] × Ω , ∀ ρ ∈ U. (7.26)This implies (7.19).Next, we introduce the following decoupled forward-backward stochastic evolution equation(FBSEE for short): dX = (cid:0) AX + A X + Bu (cid:1) dt + (cid:0) CX + Du (cid:1) dW ( t ) in (0 , T ] ,dY = − (cid:0) A ∗ Y + A ∗ Y − M X + C ∗ Z (cid:1) dt + ZdW ( t ) in [0 , T ) ,X (0) = η, Y ( T ) = − GX ( T ) . (7.27)We call ( X ( · ) , Y ( · ) , Z ( · )) a transposition solution to the equation (7.27) if X ( · ) is the mild solutionto the forward SEE and ( Y ( · ) , Z ( · )) is the transposition solution to the backward one.Since the equation (7.27) is decoupled, its well-posedness is easy to be obtained. Given η ∈ H and u ( · ) ∈ U [0 , T ], one can first solve the forward SEE to get X ( · ), and then find the transpositionsolution ( Y ( · ) , Z ( · )) to the backward one. Hence, (7.27) admits a unique transposition solution( X ( · ) , Y ( · ) , Z ( · )) corresponding to η and u ( · ). The following result is a further consequence ofProposition 7.1. Proposition 7.3
For any ( η, u ( · )) ∈ H × U [0 , T ] , let ( X ( · ) , Y ( · ) , Z ( · )) be the transposition solutionto (7.27) . Then (cid:0) N u + H ( η ) (cid:1) ( t ) = Ru ( t ) − B ∗ Y ( t ) − D ∗ Z ( t ) , a.e. ( t, ω ) ∈ [0 , T ] × Ω . (7.28) Proof . Let ( X ( · ) , Y ( · ) , Z ( · )) be the transposition solution of (7.27). From (7.13), we obtainthat N u + H ( η ) = ( R + Ξ ∗ M Ξ + b Ξ ∗ G b Ξ) u + (Ξ ∗ M Γ η ) + b Ξ ∗ G b Γ η = Ru + Ξ ∗ M (cid:0) Γ η + Ξ u (cid:1) + b Ξ ∗ G ( b Γ η + b Ξ u )= Ru + Ξ ∗ M X + b Ξ ∗ GX ( T ) . (7.29)By (7.9) and (7.10), we find Ξ ∗ M X + b Ξ ∗ GX ( T ) = − B ∗ Y − D ∗ Z. This, together with (7.29), implies(7.28) immediately.As a variant of Theorem 7.1, we have the following result, in which the conditions are given interms of an FBSEE.
Theorem 7.3
Problem (SLQ) is solvable at $\eta \in H$ with an optimal pair $(\bar X(\cdot), \bar u(\cdot))$ if and only if there exists a unique $(\bar X(\cdot), \bar u(\cdot), Y(\cdot), Z(\cdot))$ satisfying the FBSEE
$$
\begin{cases}
d\bar X = \big(A\bar X + A_1\bar X + B\bar u\big)dt + \big(C\bar X + D\bar u\big)dW(t) & \text{in } (0,T],\\
dY = -\big(A^*Y + A_1^*Y + C^*Z - M\bar X\big)dt + Z\,dW(t) & \text{in } [0,T),\\
\bar X(0) = \eta, \quad Y(T) = -G\bar X(T)
\end{cases}
\tag{7.30}
$$
with the condition
$$
R\bar u - B^*Y - D^*Z = 0, \quad \text{a.e. } (t,\omega) \in [0,T] \times \Omega,
\tag{7.31}
$$
and, for any $u(\cdot) \in \mathcal U[0,T]$, the unique transposition solution $(X(\cdot), Y(\cdot), Z(\cdot))$ to (7.27) with $\eta = 0$ satisfies
$$
\mathbb E\int_0^T \big\langle Ru(t) - B^*Y(t) - D^*Z(t),\, u(t)\big\rangle_U\, dt \geq 0.
\tag{7.32}
$$

Proof. The “only if” part. Clearly, (7.31) and (7.32) follow from Theorem 7.2 immediately.
The “if” part. From Proposition 7.3, the inequality (7.32) is equivalent to $\mathcal N \geq 0$. Now, let $(\bar X(\cdot), Y(\cdot), Z(\cdot))$ be a transposition solution to (7.30) such that (7.31) holds. Then, by Proposition 7.3, we see that (7.31) is the same as (7.15). Hence, by Theorem 7.1, Problem (SLQ) is solvable.

Theorem 7.3 is nothing but a restatement of Theorem 7.1. However, when $R(t)$ is invertible for a.e. $(t,\omega) \in [0,T] \times \Omega$ and
$$
R(\cdot)^{-1} \in L^\infty_{\mathbb F}(0,T; \mathcal L(U)),
\tag{7.33}
$$
it gives a way to find the optimal control by solving the following coupled FBSEE:
$$
\begin{cases}
dX = \big[(A + A_1)X + BR^{-1}B^*Y + BR^{-1}D^*Z\big]dt\\
\qquad\quad + \big(CX + DR^{-1}B^*Y + DR^{-1}D^*Z\big)dW(t) & \text{in } (0,T],\\
dY = -\big[(A + A_1)^*Y + C^*Z - MX\big]dt + Z\,dW(t) & \text{in } [0,T),\\
X(0) = \eta, \quad Y(T) = -GX(T).
\end{cases}
\tag{7.34}
$$
Clearly, (7.34) is a coupled linear FBSEE. As consequences of Theorem 7.3, we have the following results.

Corollary 7.1
Let (7.33) hold and $\mathcal N \geq 0$. Then Problem (SLQ) is uniquely solvable at $\eta \in H$ if and only if the FBSEE (7.34) admits a unique transposition solution $(X(\cdot), Y(\cdot), Z(\cdot))$. In this case,
$$
\bar u(t) = R^{-1}\big(B^*Y(t) + D^*Z(t)\big), \quad \text{a.e. } (t,\omega) \in [0,T] \times \Omega
\tag{7.35}
$$
is an optimal control.
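When all coefficients are deterministic and, in a finite-dimensional analogue, $C = D = 0$, one may take $Z \equiv 0$ in (7.34), which then reduces to a classical Hamiltonian two-point boundary value problem. The following shooting sketch (with matrices chosen for illustration only) solves this boundary value problem and recovers the control (7.35).

```python
import numpy as np

# Deterministic finite-dimensional specialization of the coupled FBSEE (7.34)
# (C = D = 0, A a matrix, Z = 0): the Hamiltonian two-point boundary value problem
#   X' = A X + B R^{-1} B' Y,   Y' = -A' Y + M X,   X(0) = eta,  Y(T) = -G X(T),
# with control u(t) = R^{-1} B' Y(t) as in (7.35).  All data are illustrative.

n, m, T, N = 2, 1, 1.0, 2000
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0], [1.0]])
Mw = np.eye(n); R = np.array([[1.0]]); G = np.eye(n)
eta = np.array([1.0, 0.0])
dt = T / N

# Hamiltonian matrix and its flow over [0, T] (RK4 on the matrix ODE Phi' = H Phi).
Hm = np.block([[A, B @ np.linalg.inv(R) @ B.T],
               [Mw, -A.T]])
Phi = np.eye(2 * n)
for _ in range(N):
    k1 = Hm @ Phi
    k2 = Hm @ (Phi + 0.5 * dt * k1)
    k3 = Hm @ (Phi + 0.5 * dt * k2)
    k4 = Hm @ (Phi + dt * k3)
    Phi = Phi + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

P11, P12 = Phi[:n, :n], Phi[:n, n:]
P21, P22 = Phi[n:, :n], Phi[n:, n:]
# Shooting: choose Y(0) so that Y(T) = -G X(T).
Y0 = -np.linalg.solve(P22 + G @ P12, (P21 + G @ P11) @ eta)

# March forward, accumulating the cost of the control u = R^{-1} B' Y.
z = np.concatenate([eta, Y0])
cost = 0.0
for _ in range(N):
    X, Y = z[:n], z[n:]
    u = np.linalg.solve(R, B.T @ Y)
    cost += 0.5 * (X @ Mw @ X + u @ R @ u) * dt
    k1 = Hm @ z
    k2 = Hm @ (z + 0.5 * dt * k1)
    k3 = Hm @ (z + 0.5 * dt * k2)
    k4 = Hm @ (z + dt * k3)
    z = z + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
XT, YT = z[:n], z[n:]
cost += 0.5 * XT @ G @ XT
residual = np.linalg.norm(YT + G @ XT)   # terminal condition of (7.34)
print(cost, residual)
```

The residual measures the boundary condition $Y(T) + GX(T) = 0$, which the shooting step enforces up to round-off; in the stochastic case with $D \neq 0$ no such reduction is available, which is precisely why the transposition solution of (7.34) is needed.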
Corollary 7.2
Let (7.17) hold. Then the FBSEE (7.34) admits a unique transposition solution $(X(\cdot), Y(\cdot), Z(\cdot))$ and Problem (SLQ) is uniquely solvable, with the optimal control given by (7.35).

The optimal control given in the previous section (say (7.35) in Corollary 7.1) is not in a feedback form. In Control Theory, one of the fundamental issues is to find optimal feedback controls, which is particularly important in practical applications. The main advantage of feedback controls is that they keep the corresponding control strategy robust with respect to (small) perturbations/disturbances, which are usually unavoidable in realistic situations. Unfortunately, it is actually very difficult to find optimal feedback controls for a control problem.

In this section, we study the optimal feedback control for Problem (SLQ) introduced in Section 7. To simplify notations, without loss of generality, hereafter we assume that $A_1 = 0$.

To begin with, we introduce some notations. Let $H_1$ and $H_2$ be Hilbert spaces. For $1 \leq p_1, p_2, q_1, q_2 \leq \infty$, let
$$
\begin{aligned}
&\mathcal L_{pd}\big(L^{p_1}_{\mathbb F}(0,T; L^{q_1}(\Omega;H_1)); L^{p_2}(0,T; L^{q_2}_{\mathbb F}(\Omega;H_2))\big)\\
&\triangleq \Big\{L \in \mathcal L\big(L^{p_1}_{\mathbb F}(0,T; L^{q_1}(\Omega;H_1)); L^{p_2}(0,T; L^{q_2}_{\mathbb F}(\Omega;H_2))\big) \ \Big|\ \text{for a.e. } (t,\omega) \in [0,T] \times \Omega,\\
&\qquad \text{there is } \widetilde L(t,\omega) \in \mathcal L(H_1;H_2) \text{ satisfying that } \big(Lu(\cdot)\big)(t,\omega) = \widetilde L(t,\omega)u(t,\omega),\\
&\qquad \forall\, u(\cdot) \in L^{p_1}_{\mathbb F}(0,T; L^{q_1}(\Omega;H_1))\Big\}.
\end{aligned}
\tag{8.1}
$$
In what follows, if there is no confusion, we identify the above $L \in \mathcal L_{pd}\big(L^{p_1}_{\mathbb F}(0,T; L^{q_1}(\Omega;H_1)); L^{p_2}(0,T; L^{q_2}_{\mathbb F}(\Omega;H_2))\big)$ with $\widetilde L(\cdot,\cdot)$.

Fix $p \geq q \geq 1$. Write
$$
\Upsilon_{p,q}(H_1;H_2) \triangleq \big\{J(\cdot,\cdot) \in \mathcal L_{pd}\big(L^p_{\mathbb F}(\Omega; L^\infty(0,T;H_1)); L^p_{\mathbb F}(\Omega; L^q(0,T;H_2))\big) \,\big|\, |J(\cdot,\cdot)|_{\mathcal L(H_1;H_2)} \in L^\infty_{\mathbb F}(\Omega; L^q(0,T))\big\}.
\tag{8.2}
$$
In the sequel, we shall simply denote $\Upsilon_{p,p}(H_1;H_2)$ (resp. $\Upsilon_{p,p}(H_1;H_1)$) by $\Upsilon_p(H_1;H_2)$ (resp. $\Upsilon_p(H_1)$).

For Problem (SLQ), let us introduce the notion of optimal feedback operator as follows:

Definition 8.1
An operator $\Theta(\cdot) \in \Upsilon_2(H;U)$ is called an optimal feedback operator for Problem (SLQ) if
$$
\mathcal J(\eta; \Theta(\cdot)\bar X(\cdot)) \leq \mathcal J(\eta; u(\cdot)), \quad \forall\, (\eta, u(\cdot)) \in H \times \mathcal U[0,T],
\tag{8.3}
$$
where $\bar X(\cdot) = \bar X(\cdot; \eta, \Theta(\cdot)\bar X(\cdot))$ solves the following equation:
$$
\begin{cases}
d\bar X = \big(A\bar X + B\Theta\bar X\big)dt + \big(C\bar X + D\Theta\bar X\big)dW(t) & \text{in } (0,T],\\
\bar X(0) = \eta.
\end{cases}
\tag{8.4}
$$

Remark 8.1
In Definition 8.1, $\Theta(\cdot)$ is required to be independent of $\eta \in H$. For a fixed $\eta \in H$, the inequality (8.3) implies that the control $\bar u(\cdot) \equiv \Theta(\cdot)\bar X(\cdot) \in \mathcal U[0,T]$ is optimal for Problem (SLQ). Therefore, for Problem (SLQ), the existence of an optimal feedback operator implies the existence of optimal controls for any $\eta \in H$, but not vice versa.

Stimulated by the pioneer work [3] (for stochastic LQ problems in finite dimensions), to study $\Theta$, we introduce the following operator-valued, backward stochastic Riccati equation for Problem (SLQ):
$$
\begin{cases}
dP = -\big(PA + A^*P + \Lambda C + C^*\Lambda + C^*PC + M - L^*K^{-1}L\big)dt + \Lambda\, dW(t) & \text{in } [0,T),\\
P(T) = G,
\end{cases}
\tag{8.5}
$$
where
$$
K \equiv R + D^*PD > 0, \qquad L = B^*P + D^*(PC + \Lambda).
\tag{8.6}
$$
There is a new essential difficulty in the study of (8.5) when $\dim H = \infty$. Indeed, in the infinite dimensional setting, although $\mathcal L(H)$ is still a Banach space, it is neither reflexive (needless to say, a Hilbert space) nor separable, even if the Hilbert space $H$ itself is separable. As far as we know, there exists no stochastic integration/evolution equation theory in general Banach spaces that can be employed to treat the well-posedness of (8.5), especially to handle the (stochastic integral) term “$\Lambda\, dW(t)$”. For example, the existing results on stochastic integration/evolution equations in UMD Banach spaces (e.g. [71]) do not fit the present case, because $\mathcal L(H)$ is not a UMD Banach space.

Because of the above mentioned difficulty, we have to employ the stochastic transposition method ([48]) to introduce a new type of solutions to (8.5). To this end, let us first introduce the following two SEEs:
$$
\begin{cases}
d\varphi_1 = \big(A\varphi_1 + u_1\big)d\tau + \big(C\varphi_1 + v_1\big)dW(\tau) & \text{in } (t,T],\\
\varphi_1(t) = \xi_1
\end{cases}
\tag{8.7}
$$
and
$$
\begin{cases}
d\varphi_2 = \big(A\varphi_2 + u_2\big)d\tau + \big(C\varphi_2 + v_2\big)dW(\tau) & \text{in } (t,T],\\
\varphi_2(t) = \xi_2.
\end{cases}
$$
(8.8)Here t ∈ [0 , T ), ξ , ξ are suitable random variables and u , u , v , v are suitable stochastic pro-cesses.Also, we need to give the solution spaces for (8.5). Let V be a Hilbert space such that H ⊂ V and the embedding from H to V is a Hilbert-Schmidt operator. Denote by V ′ the dual space of V with the pivot space H . Put D F ,w ([0 , T ]; L ∞ (Ω; L ( H ))) ∆ = (cid:8) P ∈ D F ([0 , T ]; L ∞ (Ω; L ( H ; V ))) (cid:12)(cid:12) P ( t, ω ) ∈ S ( H ) , a.e. ( t, ω ) ∈ [0 , T ] × Ω , and χ [ t,T ] P ( · ) ζ ∈ D F ([ t, T ]; L (Ω; H )) , ∀ ζ ∈ L F t (Ω; H ) (cid:9) and L F ,w (0 , T ; L ( H )) ∆ = (cid:8) Λ ∈ L F (0 , T ; L ( H ; V )) (cid:12)(cid:12) D ∗ Λ ∈ L pd (cid:0) L ∞ F (0 , T ; L (Ω; H )); L F (0 , T ; U ) (cid:1)(cid:9) . Now, we introduce the notion of transposition solutions to (8.5):
Definition 8.2
We call (cid:0) P ( · ) , Λ( · ) (cid:1) ∈ D F ,w ([0 , T ]; L ∞ (Ω; L ( H ))) × L F ,w (0 , T ; L ( H )) a transpositionsolution to (8.5) if the following three conditions hold: K ( t, ω ) (cid:0) ≡ R ( t, ω ) + D ( t, ω ) ∗ P ( t, ω ) D ( t, ω ) (cid:1) > and its left inverse K ( t, ω ) − is a denselydefined closed operator for a.e. ( t, ω ) ∈ [0 , T ] × Ω ; For any t ∈ [0 , T ) , ξ , ξ ∈ L F t (Ω; H ) , u ( · ) , u ( · ) ∈ L F (Ω; L ( t, T ; H )) and v ( · ) , v ( · ) ∈ L F (Ω; L ( t, T ; V ′ )) , it holds that E h Gϕ ( T ) , ϕ ( T ) i H + E Z Tt (cid:10) M ( τ ) ϕ ( τ ) , ϕ ( τ ) (cid:11) H dτ − E Z Tt (cid:10) K ( τ ) − L ( τ ) ϕ ( τ ) , L ( τ ) ϕ ( τ ) (cid:11) H dτ = E (cid:10) P ( t ) ξ , ξ (cid:11) H + E Z Tt (cid:10) P ( τ ) u ( τ ) , ϕ ( τ ) (cid:11) H dτ + E Z Tt (cid:10) P ( τ ) ϕ ( τ ) , u ( τ ) (cid:11) H dτ + E Z Tt (cid:10) P ( τ ) C ( τ ) ϕ ( τ ) , v ( τ ) (cid:11) H dτ + E Z Tt (cid:10) P ( τ ) v ( τ ) , C ( τ ) ϕ ( τ ) + v ( τ ) (cid:11) H dτ + E Z Tt (cid:10) v ( τ ) , Λ( τ ) ϕ ( τ ) (cid:11) V ′ ,V dτ + E Z Tt (cid:10) Λ( τ ) ϕ ( τ ) , v ( τ ) (cid:11) V,V ′ dτ, (8.9) where ϕ ( · ) and ϕ ( · ) solve (8.7) and (8.8) , respectively . For any t ∈ [0 , T ) , ξ , ξ ∈ L F t (Ω; H ) , u ( · ) , u ( · ) ∈ L F ( t, T ; H ) and v ( · ) , v ( · ) ∈ L F ( t, T ; U ) ,it holds that E h Gϕ ( T ) , ϕ ( T ) i H + E Z Tt (cid:10) M ( τ ) ϕ ( τ ) , ϕ ( τ ) (cid:11) H dτ By Theorem 1.17, one has ϕ ( · ) , ϕ ( · ) ∈ C F ([0 , T ]; L (Ω; H )). 
E Z Tt (cid:10) K ( τ ) − L ( τ ) ϕ ( τ ) , L ( τ ) ϕ ( τ ) (cid:11) H dτ = E (cid:10) P ( t ) ξ , ξ (cid:11) H + E Z Tt (cid:10) P ( τ ) u ( τ ) , ϕ ( τ ) (cid:11) H dτ (8.10)+ E Z Tt (cid:10) P ( τ ) ϕ ( τ ) , u ( τ ) (cid:11) H dτ + E Z Tt (cid:10) P ( τ ) C ( τ ) ϕ ( τ ) , D ( τ ) v ( τ ) (cid:11) H dτ + E Z Tt (cid:10) P ( τ ) D ( τ ) v ( τ ) , C ( τ ) ϕ ( τ ) + D ( τ ) v ( τ ) (cid:11) H dτ + E Z Tt (cid:10) v ( τ ) , D ( τ ) ∗ Λ( τ ) ϕ ( τ ) (cid:11) U dτ + E Z Tt (cid:10) D ( τ ) ∗ Λ( τ ) ϕ ( τ ) , v ( τ ) (cid:11) U dτ. Here, ϕ ( · ) and ϕ ( · ) solve (8.7) and (8.8) with v and v replaced by Dv and Dv , respectively . Theorem 8.1
If the Riccati equation (8.5) admits a transposition solution (cid:0) P ( · ) , Λ( · ) (cid:1) ∈ D F ,w ([0 ,T ]; L ∞ (Ω; L ( H ))) × L F ,w (0 , T ; L ( H )) such that K ( · ) − (cid:2) B ( · ) ∗ P ( · ) + D ( · ) ∗ P ( · ) C ( · ) + D ( · ) ∗ Λ( · ) (cid:3) ∈ Υ ( H ; U ) , (8.11) then Problem (SLQ) has an optimal feedback operator Θ( · ) ∈ Υ ( H ; U ) . In this case, the optimalfeedback operator Θ( · ) is given by Θ( · ) = − K ( · ) − [ B ( · ) ∗ P ( · ) + D ( · ) ∗ P ( · ) C ( · ) + D ( · ) ∗ Λ( · )] . (8.12) Furthermore, inf u ∈U [0 ,T ]) J ( η ; u ) = 12 h P (0) η, η i H . (8.13) Proof . For any η ∈ H and u ( · ) ∈ U [0 , T ], let X ( · ) ≡ X ( · ; η, u ( · )) be the corresponding state for(7.1). Choose ξ = ξ = η , u = u = Bu and v = v = Du in (8.7)–(8.8). From (8.6), (8.10) andthe pointwise symmetry of K ( · ), we obtain that E h GX ( T ) , X ( T ) i H + E Z T (cid:10) M ( r ) X ( r ) , X ( r ) (cid:11) H dr − E Z T (cid:10) Θ( r ) ∗ K ( r )Θ( r ) X ( r ) , X ( r ) (cid:11) H dr (8.14)= E (cid:10) P (0) η, η (cid:11) H + E Z T (cid:10) P ( r ) B ( r ) u ( r ) , X ( r ) (cid:11) H dr + E Z T (cid:10) P ( r ) X ( r ) , B ( r ) u ( r ) (cid:11) H dr + E Z T (cid:10) P ( r ) C ( r ) X ( r ) , D ( r ) u ( r ) (cid:11) H dr + E Z T (cid:10) P ( r ) D ( r ) u ( r ) , C ( r ) X ( r ) + D ( r ) u ( r ) (cid:11) H dr + E Z T (cid:10) u ( r ) , D ( r ) ∗ Λ( r ) X ( r ) (cid:11) U dr + E Z T (cid:10) D ( r ) ∗ Λ( r ) X ( r ) , u ( r ) (cid:11) U dr. Then, by (8.14), and recalling the definition of L ( · ) and K ( · ), we arrive at2 J ( η ; u ( · )) By Theorem 1.17, one has ϕ ( · ) , ϕ ( · ) ∈ C F ([0 , T ]; L (Ω; H )). 
E h Z T (cid:0)(cid:10) M X ( r ) , X ( r ) (cid:11) H + (cid:10) Ru ( r ) , u ( r ) (cid:11) U (cid:1) dr + h GX ( T ) , X ( T ) i H i = E (cid:10) P (0) η, η (cid:11) H + E Z T (cid:10) P Bu ( r ) , X ( r ) (cid:11) H dr + E Z T (cid:10) P X ( r ) , Bu ( r ) (cid:11) H dr + E Z T (cid:10) P CX ( r ) , Du ( r ) (cid:11) H dr + E Z T (cid:10) P Du ( r ) , CX ( r ) + Du ( r ) (cid:11) H dr + E Z T (cid:10) u ( r ) , D ∗ Λ( r ) X ( r ) (cid:11) U dr + E Z T (cid:10) D ∗ Λ( r ) x ( r ) , u ( r ) (cid:11) U dr + E Z T (cid:10) Θ ∗ K Θ X ( r ) , X ( r ) (cid:11) H dr + E Z T (cid:10) Ru ( r ) , u ( r ) (cid:11) U dr = E h(cid:10) P (0) η, η (cid:11) H + Z T (cid:0)(cid:10) Θ ∗ K Θ X ( r ) , X ( r ) (cid:11) H + 2 (cid:10) Lx ( r ) , u ( r ) (cid:11) U + (cid:10) Ku ( r ) , u ( r ) (cid:11) U (cid:1) dr i . This, together with (8.12), implies that J ( η ; u ( · ))= 12 E h(cid:10) P (0) η, η (cid:11) H + Z T (cid:0)(cid:10) K Θ X, Θ X (cid:11) U − (cid:10) K Θ X, u (cid:11) U + h Ku, u i U (cid:1) dr i = 12 E h(cid:10) P (0) η, η (cid:11) H + Z T (cid:10) K ( u − Θ X ) , u − Θ X (cid:11) U dr i = J (cid:0) η ; Θ X (cid:1) + 12 E Z T (cid:10) K ( u − Θ X ) , u − Θ X (cid:11) U dr. Hence, J ( η ; Θ X ) ≤ J ( η ; u ) , ∀ u ( · ) ∈ U [0 , T ] . Consequently, Θ( · ) is an optimal feedback operator for Problem (SLQ), and (8.13) holds. Thiscompletes the proof of Theorem 8.1. Remark 8.2
In Definition 8.2, we only ask that K ( t, ω ) has left inverse for a.e. ( t, ω ) ∈ (0 , T ) × Ω ,and therefore K ( t, ω ) − may be unbounded. Nevertheless, this result cannot be improved. Anexample can be found in [54]. By Theorem 8.1, the existence of an optimal feedback operator is reduced to the existence ofsuitable transposition solutions to the Riccati equation (8.5). One may expect that (8.5) wouldadmit a transposition solution ( P, Λ) without further assumptions. Unfortunately, this is incorrecteven in finite dimensions, i.e., H = R n (e.g., [43, Example 6.2]). So far, there is only some partialanswer to this challenging problem (e.g., [54]). In this section, we study a special case of Problem (SLQ), i.e., all coefficients appeared in the stateequation (7.1) and the cost functional (7.3) are deterministic, and A ( · ) = 0 , B ( · ) , D ( · ) ∈ L ∞ (0 , T ; L ( U ; H )) , C ( · ) ∈ L ∞ (0 , T ; L ( H )) ,G ∈ S ( H ) , M ( · ) ∈ L (0 , T ; S ( H )) , R ( · ) ∈ L ∞ (0 , T ; S ( U )) . B ( · ) ∈ L (0 , T ; L ( U ; H )) and C ( · ) ∈ L (0 , T ; L ( H ))(e.g. [41]). Denote by C S ([0 , T ]; S ( H )) the set of all strongly continuous mappings F : [0 , T ] → S ( H ), thatis, F ( · ) ξ is continuous on [0 , T ] for each ξ ∈ H . A sequence { F n } ∞ n =1 ⊂ C S ([0 , T ]; S ( H )) is said toconverge strongly to F ∈ C S ([0 , T ]; S ( H )) iflim n →∞ F n ( · ) ξ = F ( · ) ξ, ∀ ξ ∈ H. In this case, we write lim n →∞ F n = F in C S ([0 , T ]; S ( H )). If F ∈ C S ([0 , T ]; S ( H )), then, by theUniform Boundedness Theorem (i.e., Theorem 1.2), the quantity | F | C S ([0 ,T ]; S ( H )) ∆ = sup t ∈ [0 ,T ] | F ( t ) | L ( H ) (9.1)is finite.Let X and Y be two Banach spaces. For 1 ≤ p ≤ ∞ , let L p, S (0 , T ; L ( X ; Y )) ∆ = { F : [0 , T ] → L ( X ; Y ) (cid:12)(cid:12) F η ∈ L p (0 , T ; Y ) , ∀ η ∈ X , | F | L ( X ; Y ) ∈ L p (0 , T ) } . Particularly, L p, S (0 , T ; S ( H )) ∆ = { F : [0 , T ] → S ( H ) (cid:12)(cid:12) F ∈ L p, S (0 , T ; L ( H )) } . 
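The strong convergence used for $C_S([0,T]; \mathcal S(H))$ is strictly weaker than convergence in the operator norm of $\mathcal L(H)$. A standard example, sketched numerically below on a finite truncation of $\ell^2$, is the sequence of coordinate projections $P_n$: they converge strongly to the identity while $|P_n - I|_{\mathcal L(H)} = 1$ for every $n$.

```python
import numpy as np

# Strong (pointwise) vs. uniform operator convergence, the distinction behind
# C_S([0,T]; S(H)): coordinate-truncation projections P_n on a truncation of l^2
# converge strongly to the identity, yet |P_n - I| = 1 for every n < N.
N = 500
v = np.array([1.0 / (k + 1) for k in range(N)])   # a fixed vector in (truncated) l^2

def P(n):
    D = np.zeros(N)
    D[:n] = 1.0
    return D                                       # diagonal of the projection P_n

errs = [np.linalg.norm((P(n) - 1.0) * v) for n in (10, 50, 250)]
norms = [np.abs(P(n) - 1.0).max() for n in (10, 50, 250)]   # operator norm of P_n - I
print(errs, norms)   # errs decrease toward 0; norms stay 1
```

This is also why the quantity (9.1) is finite only thanks to the Uniform Boundedness Theorem: strong continuity of $F(\cdot)$ alone does not give continuity of $t \mapsto |F(t)|_{\mathcal L(H)}$.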
As a special case of (8.5), we introduce the following operator-valued Riccati equation:
$$\begin{cases}\dfrac{dP}{dt}+PA+A^*P+C^*PC+M-L^*K^{-1}L=0&\text{in }[0,T),\\[1mm] P(T)=G,\end{cases}\tag{9.2}$$
where
$$L(\cdot)=B(\cdot)^*P(\cdot)+D(\cdot)^*P(\cdot)C(\cdot),\qquad K(\cdot)=R(\cdot)+D(\cdot)^*P(\cdot)D(\cdot).\tag{9.3}$$
Solutions to (9.2) are understood in the following sense.

Definition 9.1
We call $P\in C_S([0,T];\mathcal{S}(H))$ a strongly regular mild solution to (9.2) if for any $\eta\in H$ and $s\in[0,T]$,
$$P(s)\eta=S(T-s)^*GS(T-s)\eta+\int_s^T S(\tau-s)^*\big(C^*PC+M-L^*K^{-1}L\big)(\tau)S(\tau-s)\eta\,d\tau\tag{9.4}$$
and
$$K(s)\ge c_0 I,\qquad\text{a.e. }s\in[0,T],\tag{9.5}$$
for some $c_0>0$.

Remark 9.1 To define the mild solution to (9.2), we only need the equality (9.4) to make sense. To this end, (9.5) is unnecessary (e.g. [41]). Nevertheless, in order to find the desired optimal feedback control for the corresponding Problem (SLQ), it is more convenient to consider more "regular" solutions to (9.2). A natural one is the so-called "regular mild solution" (e.g. [41]). In these notes, to avoid too many technical details, we consider only the strongly regular mild solutions introduced in Definition 9.1.
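In finite dimensions ($H=\mathbb{R}^2$, $U=\mathbb{R}$), the Riccati equation (9.2) becomes a matrix ODE that can be integrated backward in time. The sketch below uses made-up test matrices (all data are illustrative assumptions, not from the notes) and checks the strong regularity condition (9.5) for $K=R+D^*PD$.

```python
import numpy as np

# Finite-dimensional sketch of (9.2): integrate
#   dP/dt = -(P A + A^T P + C^T P C + M - L^T K^{-1} L),  P(T) = G,
# backward by explicit Euler, with L = B^T P + D^T P C and K = R + D^T P D.
# All matrices are made-up test data for illustration.

A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
C = 0.3 * np.eye(2)
D = np.array([[0.0], [0.2]])
M = np.eye(2)
R = np.array([[1.0]])
G = np.eye(2)

T, steps = 1.0, 5000
dt = T / steps
P = G.copy()
for _ in range(steps):                      # march from t = T down to t = 0
    K = R + D.T @ P @ D
    L = B.T @ P + D.T @ P @ C
    dPdt = -(P @ A + A.T @ P + C.T @ P @ C + M - L.T @ np.linalg.solve(K, L))
    P = P - dt * dPdt                       # backward Euler step
    P = 0.5 * (P + P.T)                     # re-symmetrize against round-off

K0 = R + D.T @ P @ D                        # K(0); (9.5) asks for K >= c_0 I
```

With $M,G\ge 0$ and $R>0$, the computed $P(0)$ stays symmetric positive definite and $K(0)\ge R$, in line with (9.5).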
Similar to the proof of Theorem 8.1, one can show the following result; hence we omit its proof.
Theorem 9.1
If the equation (9.2) admits a strongly regular mild solution $P\in C_S([0,T];\mathcal{S}(H))$, then Problem (SLQ) has an optimal feedback control
$$\bar u(\cdot)=-K(\cdot)^{-1}L(\cdot)X(\cdot),\tag{9.6}$$
where $X(\cdot)$ solves the following closed-loop system (see (9.3) for $K$ and $L$):
$$\begin{cases}dX=\big(A-BK^{-1}L\big)X\,dt+\big(C-DK^{-1}L\big)X\,dW(t)&\text{in }(0,T],\\ X(0)=\eta.\end{cases}\tag{9.7}$$

Further, we have the following two results.

Theorem 9.2
The Riccati equation (9.2) admits at most one strongly regular mild solution.
Theorem 9.3
The Riccati equation (9.2) admits a strongly regular mild solution if and only if the map $u(\cdot)\mapsto J(0;u(\cdot))$ is uniformly convex, i.e., for some constant $c>0$,
$$J(0;u(\cdot))\ge c\,\mathbb{E}\int_0^T|u(s)|_U^2\,ds,\qquad\forall\,u(\cdot)\in\mathcal{U}[0,T].\tag{9.8}$$

Remark 9.2
Clearly, if (7.17) holds, then the map $u(\cdot)\mapsto J(0;u(\cdot))$ is uniformly convex. On the other hand, there are some interesting cases in which this map is uniformly convex but (7.17) does not hold (e.g., [41]). The proofs of Theorems 9.2–9.3 will be given in Subsection 9.3.
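The feedback construction of Theorem 9.1 can also be checked numerically in a scalar analogue. In the sketch below all coefficients ($a,b,c,m,r,g$) are made-up scalars and $D=0$, so $K=R$ and the feedback (9.6) reduces to $\Theta=-bP/r$; the closed-loop system (9.7) is simulated by Euler–Maruyama and the Monte Carlo cost is compared with the predicted optimal value $\tfrac12 P(0)|\eta|^2$.

```python
import numpy as np

# Scalar analogue of Theorem 9.1 (made-up coefficients, D = 0 so K = R):
# solve dP/dt + 2aP + c^2 P + m - b^2 P^2 / r = 0, P(T) = g, backward in time,
# then simulate dX = (a + b*Theta) X dt + c X dW with Theta = -b P / r.

a, b, c = 1.0, 1.0, 0.2
m, r, g = 1.0, 1.0, 1.0
T, steps, paths, eta = 1.0, 400, 4000, 1.0
dt = T / steps

P = np.empty(steps + 1)
P[steps] = g
for i in range(steps, 0, -1):                # backward Euler for the scalar Riccati ODE
    P[i - 1] = P[i] + dt * (2 * a * P[i] + c ** 2 * P[i] + m - b ** 2 * P[i] ** 2 / r)

rng = np.random.default_rng(1)
X = np.full(paths, eta)
cost = np.zeros(paths)
for i in range(steps):                       # Euler-Maruyama for the closed-loop system
    u = -b * P[i] / r * X                    # feedback (9.6)
    cost += 0.5 * (m * X ** 2 + r * u ** 2) * dt
    X = X + (a * X + b * u) * dt + c * X * np.sqrt(dt) * rng.standard_normal(paths)
cost += 0.5 * g * X ** 2

j_fb = float(np.mean(cost))                  # Monte Carlo estimate of the closed-loop cost
j_pred = 0.5 * P[0] * eta ** 2               # value predicted by the Riccati solution
```

Up to discretization and Monte Carlo error, `j_fb` reproduces `j_pred`, the optimal value (8.13) specialized to this scalar case.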
Remark 9.3 By (7.5), for any $(\eta,u(\cdot))\in H\times\mathcal{U}[0,T]$, the cost functional can be written as
$$J(\eta;u(\cdot))=\langle G(\widehat\Gamma\eta+\widehat\Xi u),\widehat\Gamma\eta+\widehat\Xi u\rangle_H+\langle M(\Gamma\eta+\Xi u),\Gamma\eta+\Xi u\rangle_H+\langle Ru,u\rangle_U.$$
Hence, $u(\cdot)\mapsto J(0;u(\cdot))$ is uniformly convex if and only if for some $c>0$,
$$\widehat\Xi^*G\widehat\Xi+\Xi^*M\Xi+R\ge cI.\tag{9.9}$$
If $R\ge\delta I$ for some $\delta>0$, then (9.9) holds for $c=\delta$. On the other hand, even if $R\ge\delta I$ does not hold, (9.9) may still be true when $G$ is large enough. Such an example can be found in [41].

9.2 Some preliminaries

In order to prove Theorems 9.2–9.3, we need some preliminary results. Let us consider the following (operator-valued) Lyapunov equation:
$$\begin{cases}\dfrac{d\widetilde P}{dt}+\widetilde P(A+\widetilde A)+(A+\widetilde A)^*\widetilde P+\widetilde C^*\widetilde P\widetilde C+\widetilde M=0&\text{in }[0,T),\\[1mm] \widetilde P(T)=\widetilde G,\end{cases}\tag{9.10}$$
where $\widetilde A(\cdot)\in L^{1,S}(0,T;\mathcal{L}(H))$, $\widetilde C(\cdot)\in L^{2,S}(0,T;\mathcal{L}(H))$, $\widetilde M(\cdot)\in L^{1,S}(0,T;\mathcal{S}(H))$ and $\widetilde G\in\mathcal{S}(H)$. We call $\widetilde P\in C_S([0,T];\mathcal{S}(H))$ a mild solution to (9.10) if for any $s\in[0,T]$ and $\eta\in H$,
$$\widetilde P(s)\eta=S(T-s)^*\widetilde GS(T-s)\eta+\int_s^T S(\tau-s)^*\big(\widetilde P\widetilde A+\widetilde A^*\widetilde P+\widetilde C^*\widetilde P\widetilde C+\widetilde M\big)(\tau)S(\tau-s)\eta\,d\tau.\tag{9.11}$$
We have the following well-posedness result for the equation (9.10).

Lemma 9.1
The equation (9.10) admits a unique mild solution P ( · ) ∈ C S ([0 , T ]; S ( H )) . Moreover, | e P | C S ([0 ,T ]; S ( H )) ≤ C e R T (2 | e A | L ( H ) + | e C | L ( H ) ) ds (cid:16) | e G | L ( H ) + Z T | f M | L ( H ) ds (cid:17) . (9.12) Proof . Let M T ∆ = sup t ∈ [0 ,T ] | S ( t ) | L ( H ) . Let T ∈ [0 , T ) such that Z TT (cid:0) | e A | L ( H ) + | e C | L ( H ) (cid:1) ds < M T . Define a map G : C S ([ T , T ]; S ( H )) → C S ([ T , T ]; S ( H )) as follows: G ( e P )( r ) ζ ∆ = S ( T − r ) ∗ GS ( T − r ) η + Z Tr S ( s − r ) ∗ (cid:0) e P e A + e A ∗ e P + e C ∗ e P e C + f M (cid:1) S ( s − r ) ζds, ∀ ζ ∈ H. Let e P , e P ∈ C S ([0 , T ]; S ( H )). For each ζ ∈ H ,sup r ∈ [ T ,T ] (cid:12)(cid:12)(cid:0) G ( e P ) − G ( e P ) (cid:1) ( r ) ζ (cid:12)(cid:12) H = sup r ∈ [ T ,T ] (cid:12)(cid:12)(cid:12) Z Tr S ( s − r ) ∗ [( e P − e P ) e A + e A ∗ ( e P − e P )+ e C ∗ ( e P − e P ) e C ] S ( s − r ) ζds (cid:12)(cid:12)(cid:12) H ≤ Z TT (cid:12)(cid:12) S ( s − r ) ∗ (cid:12)(cid:12) L ( H ) (cid:2)(cid:0)(cid:12)(cid:12) e A (cid:12)(cid:12) L ( H ) + (cid:12)(cid:12) e A ∗ (cid:12)(cid:12) L ( H ) (cid:1)(cid:12)(cid:12) ( e P − e P ) (cid:12)(cid:12) L ( H ) + (cid:12)(cid:12) e C ∗ (cid:12)(cid:12) L ( H ) (cid:12)(cid:12) e P − e P (cid:12)(cid:12) L ( H ) (cid:12)(cid:12) e C (cid:12)(cid:12) L ( H ) (cid:3)(cid:12)(cid:12) S ( s − r ) (cid:12)(cid:12) L ( H ) (cid:12)(cid:12) ζ (cid:12)(cid:12) H ds ≤ M T Z TT (cid:0) (cid:12)(cid:12) e A (cid:12)(cid:12) L ( H ) + (cid:12)(cid:12) e C (cid:12)(cid:12) L ( H ) (cid:1) ds sup r ∈ [ T ,T ] (cid:12)(cid:12) ( e P − e P )( r ) (cid:12)(cid:12) L ( H ) (cid:12)(cid:12) ζ (cid:12)(cid:12) H ≤
12 sup r ∈ [ T ,T ] (cid:12)(cid:12) ( e P − e P )( r ) (cid:12)(cid:12) L ( H ) (cid:12)(cid:12) ζ (cid:12)(cid:12) H . r ∈ [ T ,T ] (cid:12)(cid:12)(cid:0) G ( e P ) − G ( e P ) (cid:1) ( r ) (cid:12)(cid:12) L ( H ) ≤
12 sup r ∈ [ T ,T ] (cid:12)(cid:12) ( e P − e P )( r ) (cid:12)(cid:12) L ( H ) . (9.13)This implies that G is contractive. Consequently, there is a unique fixed point of G , which is themild solution to (9.10) on [ T , T ]. Repeating this process gives us the unique e P ∈ C S ([0 , T ]; S ( H ))which satisfies (9.11). The uniqueness of the solution is obvious.From (9.11), we see that for any r ∈ [0 , T ] and ζ ∈ H , (cid:12)(cid:12) e P ( r ) ζ (cid:12)(cid:12) H ≤ (cid:12)(cid:12) S ( T − r ) (cid:12)(cid:12) L ( H ) (cid:12)(cid:12) e G (cid:12)(cid:12) L ( H ) (cid:12)(cid:12) ζ (cid:12)(cid:12) H + Z Tr (cid:12)(cid:12) S ( s − r ) (cid:12)(cid:12) L ( H ) (cid:2)(cid:0) (cid:12)(cid:12) e A (cid:12)(cid:12) L ( H ) + (cid:12)(cid:12) e C (cid:12)(cid:12) L ( H ) (cid:1)(cid:12)(cid:12) e P (cid:12)(cid:12) L ( H ) + (cid:12)(cid:12) f M (cid:12)(cid:12) L ( H ) (cid:3)(cid:12)(cid:12) ζ (cid:12)(cid:12) H ds. Consequently, (cid:12)(cid:12) e P ( r ) (cid:12)(cid:12) H ≤ C n(cid:12)(cid:12) e G (cid:12)(cid:12) L ( H ) + Z Tr (cid:2)(cid:0) (cid:12)(cid:12) e A (cid:12)(cid:12) L ( H ) + (cid:12)(cid:12) e C (cid:12)(cid:12) L ( H ) (cid:1)(cid:12)(cid:12) e P (cid:12)(cid:12) L ( H ) + (cid:12)(cid:12) f M (cid:12)(cid:12) L ( H ) (cid:3) ds o . This, together with Gronwall’s inequality, implies (9.12).The following result illustrates the differentiability of mild solutions to (9.10).
Proposition 9.1
Let e P be a mild solution to (9.10) . Then for any η, ζ ∈ D ( A ) , h e P ( · ) η, ζ i H isdifferentiable in [0 , T ] and ddt h e P η, ζ i H = −h e P η, ( A + e A ) ζ i H − h e P ( A + e A ) η, ζ i H −h e P e Cη, e Cζ i H − h f M η, ζ i H . (9.14) Proof . For any η, ζ ∈ H , we have that h e P ( r ) η, ζ i H = h e GS ( T − r ) η, S ( T − r ) ζ i H + Z Tr h (cid:0) e P e A + e A ∗ e P + e C ∗ e P e C + f M (cid:1) S ( s − r ) η, S ( s − r ) ζ i H ds. (9.15)If η, ζ ∈ D ( A ), it follows (9.15) that h e P ( r ) η, ζ i H is differentiable w.r.t. r . A simple computationgives (9.14).Now, let us show the following result: Lemma 9.2 If e G ≥ , f M ( t ) ≥ , a.e. t ∈ [0 , T ] , (9.16) then the mild solution e P ( · ) to (9.10) satisfies that e P ( t ) ≥ for all t ∈ [0 , T ] .Proof . Let t ∈ [0 , T ). Consider the following equation: ( d e X = ( A + e A ) e Xds + e C e XdW ( s ) in ( t, T ] , e X ( t ) = η ∈ H. Clearly this equation admits a unique solution e X ( · ).75or each λ ∈ ρ ( A ), the resolvent of A , write e X λ ( · ) ∆ = R ( λ ) e X ( · ), where R ( λ ) = λI ( λI − A ) − .Then e X λ ( · ) solves the following equation: ( d e X λ = (cid:0) A e X λ + R ( λ ) e A e X (cid:1) ds + R ( λ ) e C e XdW ( s ) in ( t, T ] , e X λ ( t ) = R ( λ ) η. By Itˆo’s formula and Proposition 9.1, we have E h e G e X λ ( T ) , e X λ ( T ) i H − E h e P ( t ) R ( λ ) η, R ( λ ) η i H = − E Z Tt (cid:2) h e P e X λ , ( A + e A ) e X λ i H + h e P ( A + e A ) e X λ , e X λ i H + h e P e C e X λ , e P e C e X λ i H + h f M ( s ) e X λ , e X λ i H (cid:3) ds + E Z Tt (cid:2) h e P e X λ , ( A e X λ + R ( λ ) e A e X ) i H + h e P ( A e X λ + R ( λ ) e A e X ) , e X λ i H + h e P R ( λ ) e C e X, R ( λ ) e C e X i H (cid:3) ds = − E Z Tt (cid:2) h e P e X λ , e A e X λ − R ( λ ) e A e X i H + h e P (cid:0) e A e X λ − R ( λ ) e A e X (cid:1) , e X λ i H + h e P e C e X λ , e P e C e X λ i H − h e P R ( λ ) e C e X, R ( λ ) e C e X i H + h f M e X λ , e X λ i H (cid:3) ds. 
(9.17)By Theorem 1.20, we have thatlim λ →∞ e X λ = e X in C F ([ t, T ]; L (Ω; H )) . (9.18)Noting that for any ζ ∈ H , lim λ →∞ R ( λ ) ζ = ζ in H, (9.19)we get from (9.18) that for a.e. s ∈ [ t, T ],lim λ →∞ (cid:0) h e P ( s ) e X λ ( s ) , e A ( s ) e X λ ( s ) − R ( λ ) e A ( s ) e X ( s ) i H (cid:1) = 0 , a.s. (9.20)It follows from the definition of R ( λ ) that (cid:12)(cid:12) h e P ( s ) e X λ ( s ) , e A ( s ) e X λ ( s ) − R ( λ ) e A ( s ) e X ( s ) i H (cid:12)(cid:12) ≤ C| e P ( s ) | L ( H ) | e A ( s ) | L ( H ) (cid:0) | e X ( s ) | H + 1 (cid:1) , a.s.This, together with (9.20) and Dominated Convergence Theorem (Theorem 1.9), implies thatlim λ →∞ E Z Tt h e P e X λ , e A e X λ − R ( λ ) e A e X i H ds = 0 . By a similar argument, letting λ → ∞ in both sides of (9.17), we get h e P ( t ) η, η i H = E h e G e X ( T ) , e X ( T ) i H + E Z Tt h f M e X, e X i H ds, ∀ t ∈ [0 , T ] . This, together with (9.16), implies that e P ( t ) ≥ t ∈ [0 , T ].76 emark 9.4 Since e X ( · ) may not be D ( A ) -valued, in the proof of Lemma 9.2, we introduce a familyof { e X λ ( · ) } λ ∈ ρ ( A ) as Theorem 1.20 to apply Itˆo’s formula and Proposition 9.1. In the rest of thissection, we omit such procedures to save space. The readers are encouraged to give the omitted partthemselves. Similarly to the proof of Proposition 9.1, we can obtain the following result.
Proposition 9.2
Let P be a strongly regular mild solution to (9.2) . Then for any η, ζ ∈ D ( A ) , h P ( · ) η, ζ i H is differentiable in [0 , T ] and ddt h P η, ζ i H = −h P η, Aζ i H − h P Aη, ζ i − h
P Cη, Cζ i H −h M η, ζ i H + h K − Lη, Lζ i H . (9.21)Now, for any Θ( · ) ∈ L , S (0 , T ; L ( H ; U )), let us consider the following (operator-valued) Lya-punov equation: dP Θ dt + P Θ ( A + B Θ) + ( A + B Θ) ∗ P Θ +( C + D Θ) ∗ P Θ ( C + D Θ) + Θ ∗ R Θ + M = 0 in [0 , T ) ,P Θ ( T ) = G. (9.22)As an immediate consequence of Lemma 9.1, we have the following result. Corollary 9.1
There is a unique mild solution to (9.22) . Moreover, | P Θ | C S ([0 ,T ]; S ( H )) ≤ C e R T (2 | B Θ | L ( H ) + | C + D Θ | L ( H ) ) dt h | G | L ( H ) + Z T (cid:0) | Θ | L ( H ; U ) | R | L ( U ) + | M | L ( H ) (cid:1) dt i . (9.23)For any r ∈ [0 , T ) and ξ ∈ L F r (Ω; H ), we consider the following control system: ( dX = (cid:0) AX + Bu (cid:1) dt + (cid:0) CX + Du (cid:1) dW ( t ) in ( r, T ] ,X ( r ) = ξ (9.24)with the cost functional J ( r, ξ ; u ( · )) ∆ = 12 E h Z Tr (cid:0) h M X, X i H + h Ru, u i U (cid:1) ds + h GX ( T ) , X ( T ) i H i , (9.25)where u ∈ U [ r, T ] ∆ = L F ( r, T ; U ). We need to introduce the following optimal control problem(parameterized by r ∈ [0 , T )): Problem (SLQ- r ). Find a control ¯ u ( · ) ∈ U [ r, T ] such that J ( r, ξ ; ¯ u ( · )) = inf u ( · ) ∈U [ r,T ] J ( r, ξ ; u ( · )) . (9.26)Clearly, if the map u ( · )
7→ J (0; u ( · )) is uniformly convex, then so is the map u ( · )
7→ J ( r, u ( · )),i.e., for some c > J ( r, u ( · )) ≥ c E Z Tr | u ( s ) | U ds, ∀ u ( · ) ∈ U [ r, T ] . (9.27)The result below gives a relation between the cost functional (9.25) and the Lyapunov equation(9.22). 77 emma 9.3 Let P Θ ( · ) solve (9.22) . Then, J ( r, ξ ; Θ( · ) X ( · ) + u ( · ))= E Z Tr (cid:2) h (cid:0) L Θ + K Θ Θ (cid:1) X, u i U + h K Θ u, u i U (cid:3) dt + E h P Θ ( r ) ξ, ξ i H , (9.28) where K Θ ( · ) ∆ = R ( · ) + D ( · ) P Θ ( · ) D ( · ) , L Θ ( · ) ∆ = B ( · ) ∗ P Θ ( · ) + D ( · ) ∗ P Θ ( · ) C ( · ) . (9.29) Proof . For any η ∈ H and u ( · ) ∈ U [ r, T ], let X ( · ) be the solution to dX = (cid:2) ( A + B Θ) X + Bu (cid:3) dt + (cid:2) ( C + D Θ) X + Du (cid:3) dW ( t ) in ( r, T ] ,X ( r ) = ξ. By Itˆo’s formula and Proposition 9.1, proceeding as (9.17) in the proof of Lemma 9.2, we can obtainthat E h Z Tr (cid:0) h M X, X i H + h R (Θ X + u ) , Θ X + u i U (cid:1) dt + h GX ( T ) , X ( T ) i H i = E Z Tr (cid:2) h (cid:0) L Θ + K Θ Θ (cid:1) X, u i U + h K Θ u, u i U (cid:3) dt + E h P Θ ( r ) ξ, ξ i H , which implies (9.28).Next, we give a result for the existence and uniqueness of optimal controls for Problem (SLQ- r ). Proposition 9.3
Suppose the map u ( · )
7→ J ( r, u ( · )) is uniformly convex. Then Problem (SLQ- r ) admits a unique optimal control, and for some constant α ∈ R , inf u ∈U [ r,T ] J ( r, ξ ; u ) ≥ α E | ξ | H , ∀ ξ ∈ L F r (Ω; H ) . (9.30) Proof . Clearly, for some c > J ( r, u ( · )) ≥ c E Z Tr | u ( t ) | U dt, ∀ u ( · ) ∈ U [ r, T ] . (9.31)Denote by X ( resp. X ) the solution to (9.24) with u = 0 ( resp. ξ = 0). Then, X = X + X , and J ( r, ξ ; u ( · ))= J ( r, ξ ; 0) + J ( r, u ( · )) + Z Tr h M X , X i U dt + h GX ( T ) , X ( T ) i H ≥ J ( r, ξ ; 0) + c E Z Tr | u ( t ) | U dt − c E Z Tr | u ( t ) | U dt − C E Z Tr | X | H dt ≥ J ( r, ξ ; 0) + c E Z T | u ( t ) | U dt − C E | ξ | H dt, ∀ ξ ∈ L F r (Ω; H ) . (9.32)This implies that u ( · )
7→ J ( r, ξ ; u ( · )) is coercive. Clearly, u ( · )
7→ J ( r, ξ ; u ( · )) is continuous andconvex. Consequently, by a standard argument involving a minimizing sequence and locally weakcompactness of Hilbert spaces, this functional has a unique minimizer. Moreover, (9.32) impliesthat inf u ∈U [0 ,T ] J ( r, ξ ; u ) ≥ J ( r, ξ ; 0) − C E | ξ | H dt. (9.33)Since the terms in the right-hand side of (9.33) are quadratic in ξ , we get (9.30).The next result shows that the solution to (9.22) is bounded from below.78 roposition 9.4 Let (9.8) hold. Then for any Θ( · ) ∈ L , S (0 , T ; L ( U ; H )) , the mild solution P Θ ( · ) to (9.22) and the process K Θ ( · ) defined by (9.29) satisfy K Θ ( t ) ≥ c I, a.e. t ∈ [0 , T ] (9.34) and P Θ ( t ) ≥ αI, ∀ t ∈ [0 , T ] , (9.35) where c > and α ∈ R are the constants appearing in (9.8) and (9.30) , respectively.Proof . For any u ( · ) ∈ U [0 , T ], let X ( · ) be the solution to dX = (cid:2) ( A + B Θ) X + Bu (cid:3) dt + (cid:2) ( C + D Θ) X + Du (cid:3) dW ( t ) in (0 , T ] ,X (0) = 0 . It follows from (9.8) and Lemma 9.3 that c E Z T | Θ X + u | U dt ≤ J (0; Θ( · ) X ( · ) + u ( · )) = E Z T (cid:2) h (cid:0) L Θ + K Θ Θ (cid:1) X , u i U + h K Θ u, u i U (cid:3) dt, which yields that, for any u ( · ) ∈ U [0 , T ], E Z T (cid:8) h (cid:2) L Θ + ( K Θ − c I )Θ (cid:3) X , u i U + h ( K Θ − c I ) u, u i U (cid:9) dt = c E Z T | Θ( t ) X ( t ) | U dt ≥ . (9.36)Fix any u ∈ U and t ∈ (0 , T ), and choose any h > t + h ≤ T . Take u ( · ) = u χ [ t ,t + h ] ( · ). Then | X | C F ([0 ,T ]; L (Ω; H )) ≤ C| u | L F (0 ,T ; U ) ≤ C√ h | u | U . (9.37)Dividing both sides of (9.36) by h and then letting h →
0, noting (9.37), we obtain h (cid:0) K Θ ( t ) − c I (cid:1) u , u i U ≥ , a.e. t ∈ (0 , T ) , ∀ u ∈ U. This gives (9.34).Now we prove (9.35). For any Θ( · ) ∈ L , S (0 , T ; L ( H ; U )) and ξ ∈ L F r (Ω; H ), let X ( · ) be thesolution of the following closed-loop system: ( dX = (cid:0) A + B Θ (cid:1) Xdt + (cid:0) C + D Θ (cid:1) XdW ( t ) in ( r, T ] ,X ( r ) = ξ. (9.38)It follows from Proposition 9.3 and Lemma 9.3 that α E | ξ | ≤ J ( r, ξ ; Θ( · ) X ( · )) = E h P Θ ( r ) ξ, ξ i H , ∀ ξ ∈ L F r (Ω; H ) . In particular, h P Θ ( r ) η, η i H ≥ α | η | for all η ∈ H , which yields (9.35).We shall also need the following result: 79 emma 9.4 For any u ( · ) ∈ U [0 , T ] , let X be the corresponding solution to (7.1) with η = 0 . Thenfor every Θ( · ) ∈ L , S (0 , T ; L ( H ; U )) , there exists a constant c Θ > such that E Z T (cid:12)(cid:12) u ( s ) − Θ( s ) X ( s ) (cid:12)(cid:12) U ds ≥ c Θ E Z T | u ( s ) | U ds, ∀ u ( · ) ∈ U [0 , T ] . (9.39) Proof . Define a bounded linear operator L : U [0 , T ] → U [0 , T ] by L u = u − Θ X . Then L isbijective and its inverse L − is given by L − u = u + Θ e X , where e X ( · ) is the solution to d e X = (cid:2)(cid:0) A + B Θ (cid:1) e X + Bu (cid:3) ds + (cid:2)(cid:0) C + D Θ (cid:1) e X + Du (cid:3) dW ( s ) in (0 , T ] , e X (0) = 0 . By the Inverse Mapping Theorem (i.e., Theorem 1.1), L − is bounded with the norm | L − | L ( U [0 ,T ]) >
0. Thus, E Z T | u ( s ) | U ds = E Z T | ( L − L u )( s ) | U ds ≤ | L − | L ( U [ t,T ]) E Z T | ( L u )( s ) | U ds = | L − | L ( U [ t,T ]) E Z T (cid:12)(cid:12) u ( s ) − Θ( s ) X ( s ) (cid:12)(cid:12) U ds, ∀ u ( · ) ∈ U [0 , T ] , which implies (9.39) with c Θ = | L − | − L ( U [0 ,T ]) . The purpose of this subsection is to prove Theorems 9.2–9.3, addressed to the uniqueness andexistence of solutions to the equation (9.2), respectively.
Proof. [Proof of Theorem 9.2] Let $P_1,P_2\in C_S([0,T];\mathcal{S}(H))$ be two strongly regular mild solutions to (9.2). Then, it follows from (9.4) that for $j=1,$
2, for any η ∈ H and t ∈ [0 , T ], P j ( t ) η = S ( T − t ) ∗ Ge A ( T − t ) η + Z Tt S ( τ − t ) ∗ (cid:0) C ∗ P j C + M − L ∗ j K − j L j (cid:1) S ( τ − t ) ηdτ, (9.40)where L j ( · ) = B ( · ) ∗ P j ( · ) + D ( · ) ∗ P j ( · ) C ( · ) , K j ( · ) = R ( · ) + D ( · ) ∗ P j ( · ) D ( · ) . From (9.40) and (9.5), we have that (cid:12)(cid:12)(cid:0) P ( t ) − P ( t ) (cid:1) η (cid:12)(cid:12) H = (cid:12)(cid:12)(cid:12) Z Tt S ( τ − t ) ∗ (cid:2) C ∗ (cid:0) P − P (cid:1) C − L ∗ K − (cid:0) L − L (cid:1) (9.41) − (cid:0) L ∗ − L ∗ (cid:1) K − L − L ∗ (cid:0) K − − K − (cid:1) L (cid:3) S ( τ − t ) ηdτ (cid:12)(cid:12)(cid:12) H ≤ C ( | P | C S ([0 ,T ]; S ( H )) , | P | C S ([0 ,T ]; S ( H )) , c , R, B, C, D ) | η | H Z Tt | P − P | L ( H ) dτ. By (9.41) and the arbitrariness of η ∈ H , we get that (cid:12)(cid:12) P ( t ) − P ( t ) (cid:12)(cid:12) H ≤ C ( | P | C S ([0 ,T ]; S ( H )) , | P | C S ([0 ,T ]; S ( H )) , c , R, B, C, D ) Z Tt | P − P | L ( H ) dτ. P ( t ) = P ( t ), ∀ t ∈ [0 , T ]. Proof .[Proof of Theorem 9.3] “
The "if" part. We divide the proof into three steps.
Step 1 . In this step, we introduce a sequence of operator-valued functions { P j } Nj =1 .Let P be the solution to dP dt + P A + A ∗ P + C ∗ P C + M = 0 in [0 , T ) ,P ( T ) = G. (9.42)Applying Proposition 9.4 to (9.42) with Θ = 0, we obtain that R ( t ) + D ( t ) ∗ P ( t ) D ( t ) ≥ c I, P ( t ) ≥ αI, a.e. t ∈ [0 , T ] . (9.43)Inductively, for j = 0 , , , · · · , we set K j ∆ = R + D ∗ P j D, L j ∆ = B ∗ P j + D ∗ P j C, Θ j ∆ = − K − j L j , A j ∆ = B Θ j , C j ∆ = C + D Θ j , (9.44)and let P j +1 be the solution to dP j +1 dt + P j +1 ( A + A j ) + ( A + A j ) ∗ P j +1 + C ∗ j P j +1 C j + Θ ∗ j R Θ j + M = 0 in [0 , T ) ,P j +1 ( T ) = G. (9.45) Step 2 . In this step, we show the uniform boundedness of the sequence { P j } ∞ j =1 .From (9.43), we have that K ( t ) ≥ c I, P ( t ) ≥ αI, a.e. t ∈ [0 , T ] . (9.46)This implies that Θ = − K − L ∈ L , S (0 , T ; L ( H ; U )) . It follows from Proposition 9.4 (with P Θ and Θ in (9.22) replaced by P and Θ , respectively) that K ( t ) ≥ c I, P ( t ) ≥ αI, a.e. t ∈ [0 , T ] . Inductively, we have that K j +1 ( t ) ≥ c I, P j +1 ( t ) ≥ αI, a.e. t ∈ [0 , T ] , j = 0 , , , · · · (9.47)Let ∆ j ∆ = P j − P j +1 , Υ j ∆ = Θ j − − Θ j , j ≥ . Then for j ≥ ζ ∈ H , we have − ∆ j ( t ) ζ = P j +1 ( t ) ζ − P j ( t ) ζ = Z Tt S ( r − t ) ∗ (cid:2)(cid:0) P j − P j +1 (cid:1) A j + A ∗ j (cid:0) P j − P j +1 (cid:1) + C ∗ j (cid:0) P j − P j +1 ) C j + P j ( A j − − A j ) + ( A j − − A j ) ∗ P j + C ∗ j − P j C j − (9.48)81 C ∗ j P j C j + Θ ∗ j − R Θ j − − Θ ∗ j R Θ j (cid:3) S ( r − t ) ζdr = Z Tt S ( r − t ) ∗ (cid:2) ∆ j A j + A ∗ j ∆ j + C ∗ j ∆ j C j + P j ( A j − − A j ) + ( A j − − A j ) ∗ P j + C ∗ j − P j C j − − C ∗ j P j C j + Θ ∗ j − R Θ j − − Θ ∗ j R Θ j (cid:3) S ( r − t ) ζdr. 
From (9.44), we have that A j − − A j = B Θ j − − B Θ j = B (Θ j − − Θ j ) = B Υ j ,C j − − C j = C + D Θ j − − C − D Θ j = D (Θ j − − Θ j ) = D Υ j , and C ∗ j − P j C j − − C ∗ j P j C j = ( C + D Θ j − ) ∗ P j ( C + D Θ j − ) − ( C + D Θ j ) ∗ P j ( C + D Θ j )= Θ ∗ j − D ∗ P j D Θ j − − Θ ∗ j D ∗ P j D Θ j + Θ ∗ j − D ∗ P j C + C ∗ P j D Θ j − − Θ ∗ j D ∗ P j C − C ∗ P j D Θ j = (Θ j − − Θ j ) ∗ D ∗ P j D (Θ j − − Θ j ) + ( C + D Θ j ) ∗ P j D (Θ j − − Θ j )+(Θ j − − Θ j ) ∗ D ∗ P j ( C + D Θ j )= Υ ∗ j D ∗ P j D Υ j + C ∗ j P j D Υ j + Υ ∗ j D ∗ P j C j . Similarly, ( Θ ∗ j − R Θ j − − Θ ∗ j R Θ j = Υ ∗ j R Υ j + Υ ∗ j R Θ j + Θ ∗ j R Υ j ,B ∗ P j + D ∗ P j C j + R Θ j = B ∗ P j + D ∗ P j C + ( R + D ∗ P j D )Θ j = 0 . These, together with (9.48), yields that for any ζ ∈ H ,∆ j ( t ) ζ − Z Tt S ( r − t ) ∗ (cid:0) ∆ j A j + A ∗ j ∆ j + C ∗ j ∆ j C j (cid:1) S ( r − t ) ζdr = Z Tt S ( r − t ) ∗ (cid:0) P j B Υ j + Υ ∗ j B ∗ P j + Υ ∗ j D ∗ P j D Υ j + C ∗ j P j D Υ j (9.49)+Υ ∗ j D ∗ P j C j + Υ ∗ j R Υ j + Υ ∗ j R Θ j + Θ ∗ j R Υ j (cid:1) S ( r − t ) ζdr = Z Tt S ( r − t ) ∗ (cid:2) Υ ∗ j K j Υ j + ( P j B + C ∗ j P j D + Θ ∗ j R )Υ j +Υ ∗ j ( B ∗ P j + D ∗ P j C j + R Θ j ) (cid:3) S ( r − t ) ζdr = Z Tt S ( r − t ) ∗ Υ ∗ j K j Υ j S ( r − t ) ζdr. By (9.49), ∆ j ( · ) solves (9.10) with e G = 0, e A = A j , e C = C j and f M = Υ ∗ j K j Υ j ≥
0. Using Lemma9.2, we have ∆ j ( · ) ≥
0, namely, P j − ( · ) − P j ( · ) ≥ j ≥
1. By (9.47), for each j ∈ N and t ∈ [0 , T ], P ( t ) ≥ P j ( t ) ≥ P j +1 ( t ) ≥ αI. Hence, the sequence { P j } ∞ j =1 is uniformly bounded. Consequently,there exists a constant C > j ≥ t ∈ [0 , T ], | P j ( t ) | L ( H ) ≤ C , | K j ( t ) | L ( U ) ≤ C , | Θ j ( t ) | L ( H ; U ) ≤ C (cid:0) | B ( t ) | L ( U ; H ) + | C ( t ) | L ( H ) (cid:1) , |A j ( t ) | L ( H ) ≤ C| B ( t ) | L ( U ; H ) (cid:0) | B ( t ) | L ( U ; H ) + | C ( t ) | L ( H ) (cid:1) , | C j ( t ) | L ( H ) ≤ C (cid:0) | B ( t ) | L ( U ; H ) + | C ( t ) | L ( H ) (cid:1) . (9.50)82 tep 3 . In this step, we prove the convergence of the sequence { P j } ∞ j =1 .Noting Λ j = Θ j − − Θ j = K − j D ∗ ∆ j − DK − j − L j − K − j − (cid:0) B ∗ ∆ j − + D ∗ ∆ j − C (cid:1) and (9.50), onehas | Υ j ( t ) ∗ K j ( t )Υ j ( t ) | L ( H ) ≤ (cid:0) | Θ j ( t ) | L ( U ) + | Θ j − ( t ) | L ( U ) (cid:1) | K j ( t ) | L ( U ) | Θ j − ( t ) − Θ j ( t ) | L ( U ) ≤ C (cid:0) | B ( t ) | L ( U ; H ) + | C ( t ) | L ( H ) (cid:1) | ∆ j − ( t ) | L ( H ) . (9.51)By (9.49), it follows that for any ζ ∈ H ,∆ j ( t ) ζ = − Z Tt S ( r − t ) ∗ (cid:0) ∆ j A j + A ∗ j ∆ j + C ∗ j ∆ i C j + Υ ∗ j K j Υ j (cid:1) S ( r − t ) ζdr. Making use of (9.51) and noting (9.50), by the arbitrariness of ζ ∈ H , we get | ∆ j ( t ) | L ( H ) ≤ Z Tt ϕ ( r ) (cid:0) | ∆ j ( r ) | L ( H ) + | ∆ j − ( r ) | L ( H ) (cid:1) dr, ∀ t ∈ [0 , T ] , where ϕ ( · ) is a nonnegative integrable function independent of ∆ j ( · ) ( j ∈ N ). Hence, | ∆ j ( t ) | L ( H ) ≤ a Z Tt ϕ ( r ) | ∆ j − ( r ) | L ( H ) dr, where a = e R T ϕ ( r ) dr . Set b ∆ = max ≤ r ≤ T | ∆ ( r ) | L ( H ) . By induction, we deduce that | ∆ j ( t ) | L ( H ) ≤ b a j j ! (cid:16) Z Tt ϕ ( r ) dr (cid:17) j , ∀ t ∈ [0 , T ] , which implies the uniform convergence of { P j } ∞ j =1 . Denote by P the limit of { P j } ∞ j =1 , then (noting(9.47)) K ( t ) = lim j →∞ K j ( t ) ≥ c I for a.e. 
t ∈ [0 , T ], and lim j →∞ Θ j = − K − L ≡ Θ in L , S (0 , T ; L ( H ; U )) , lim j →∞ A j = B Θ in L , S (0 , T ; L ( H )) , lim j →∞ C j = C + D Θ in L , S (0 , T ; L ( H )) . Therefore, P ( · ) solves the following equation (in the sense of strongly regular mild solution): ˙ P + P ( A + B Θ) + ( A + B Θ) ∗ P +( C + D Θ) ∗ P ( C + D Θ) + Θ ∗ R Θ + M = 0 in [0 , T ) ,P ( T ) = G, which is equivalent to (9.2).“ The “only if” part ”. Let P ( · ) be the strongly regular solution to (9.2). Then (9.5) holds forsome c >
0. Put $\Theta\triangleq-K^{-1}L\ (\in L^{2,S}(0,T;\mathcal{L}(H;U)))$. For any $u(\cdot)\in\mathcal{U}[0,T]$, let $X(\cdot)=X(\cdot\,;0,u)$ be the solution to (7.1) with $\eta=0$. Applying Itô's formula to $t$
7→ h P ( t ) X ( t ) , X ( t ) i H , we have J (0; u ( · )) = E h Z T (cid:0) h M X, X i H + h Ru, u i U (cid:1) dt + h GX ( T ) , X ( T ) i H i = E Z T n h− (cid:0) P A + A ∗ P + C ∗ P C + M − L ∗ K − L (cid:1) X, X i H + h P (cid:0) AX + Bu (cid:1) , X i H + h P X, AX + Bu i H + h P (cid:0) CX + Du (cid:1) , CX + Du i H + h M X, X i H + h Ru, u i U o dt = E Z T (cid:0) h Θ ∗ K Θ X, X i H − h K Θ X, u i U + h Ku, u i U (cid:1) dt = E Z T h K (cid:0) u − Θ X (cid:1) , u − Θ X i U dt. Hence, by (9.5) and Lemma 9.4, it holds that, for some c > u ( · ) ∈ U [0 , T ], J (0; u ( · )) = E Z T h K (cid:0) u − Θ x (cid:1) , u − Θ x i U ds ≥ c E Z T | u ( s ) | U ds, which completes the proof of Theorem 9.3.
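The construction in Steps 1–3 of the proof above can be reproduced numerically in a scalar deterministic analogue: $P_0$ solves the Lyapunov equation (9.42), each $\Theta_j=-K_j^{-1}L_j$ feeds the next Lyapunov equation (9.45), and the iterates decrease monotonically to the Riccati solution. All scalar coefficients below ($a,b,c,d,m,r,g$) are made-up stand-ins for $A,B,C,D,M,R,G$.

```python
import numpy as np

# Scalar analogue of the iteration (9.42)-(9.45): each Lyapunov equation is linear
# in P_{j+1} and is integrated backward by explicit Euler; the iterates P_j should
# decrease monotonically toward the directly integrated Riccati solution of (9.2).

a, b, c, d, m, r, g = 0.5, 1.0, 0.3, 0.1, 1.0, 1.0, 1.0
T, n = 1.0, 4000
dt = T / n

def lyapunov(theta):
    """Backward Euler for (9.45) with a given feedback gain path theta."""
    P = np.empty(n + 1)
    P[n] = g
    for i in range(n, 0, -1):
        rhs = (2 * (a + b * theta[i]) * P[i] + (c + d * theta[i]) ** 2 * P[i]
               + r * theta[i] ** 2 + m)
        P[i - 1] = P[i] + dt * rhs
    return P

def riccati():
    """Direct backward integration of the scalar version of (9.2)."""
    P = np.empty(n + 1)
    P[n] = g
    for i in range(n, 0, -1):
        K = r + d ** 2 * P[i]
        L = b * P[i] + d * P[i] * c
        P[i - 1] = P[i] + dt * (2 * a * P[i] + c ** 2 * P[i] + m - L ** 2 / K)
    return P

P = lyapunov(np.zeros(n + 1))               # P_0: equation (9.42), Theta = 0
iterates = [P]
for _ in range(10):                          # P_{j+1} from Theta_j = -K_j^{-1} L_j
    K = r + d ** 2 * P
    L = b * P + d * P * c
    P = lyapunov(-L / K)
    iterates.append(P)

gap = float(np.max(np.abs(iterates[-1] - riccati())))
```

On this data the monotone decrease $P_j\ge P_{j+1}$ established in Step 2 is visible in the iterates, and `gap` becomes negligible after a few iterations, mirroring the convergence argument of Step 3.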
10 Pontryagin-type maximum principle for controlled stochastic evolution equations
In this section, we shall present the first-order necessary optimality condition, or more precisely, the Pontryagin-type maximum principle, for optimal control problems of nonlinear stochastic evolution equations in infinite dimensions, in which both the drift and diffusion terms may contain the control variable, and the control domain is allowed to be nonconvex. For second-order necessary optimality conditions, we refer the readers to [15, 46]. The results in this section are taken from [48, 49] (see also [52, 53]). Some related results can be found in [11, 18].
Throughout this section, both H and A (generating a C -semigroup S ( · ) on H ) are the sameas before, (Ω , F , F , P ) (with F ∆ = {F t } t ∈ [0 ,T ] for some T >
0) is a fixed filtered probability space(satisfying the usual condition), on which a 1-dimensional standard Brownian motion W ( · ) isdefined. Denote by F the progressive σ -field w.r.t. F . Let U be a separable metric space with ametric d ( · , · ). Put U [0 , T ] ∆ = (cid:8) u ( · ) : [0 , T ] × Ω → U (cid:12)(cid:12) u ( · ) is F -adapted (cid:9) (10.1)For any given functions a ( · , · , · , · ) : [0 , T ] × Ω × H × U → H and b ( · , · , · , · ) : [0 , T ] × Ω × H × U → H ,let us consider the following controlled SEE: ( dX = (cid:0) AX + a ( t, X, u ) (cid:1) dt + b ( t, X, u ) dW ( t ) in (0 , T ] ,X (0) = X , (10.2)where u ( · ) ∈ U [0 , T ] is the control variable , X ( · ) is the state variable , and the (given) initial state X ∈ H . Throughout this section, For ψ = a ( · , · , · , · ) , b ( · , · , · , · ), we assume the following condition:84 AS1) i) For any ( x, u ) ∈ H × U , the function ψ ( · , · , x, u ) : [0 , T ] × Ω → H is F -measurable;ii) For any ( t, x ) ∈ [0 , T ] × H , the function ψ ( t, x, · ) : U → H is continuous; and iii) For any ( x , x , u ) ∈ H × H × U and a.e. ( t, ω ) ∈ (0 , T ) × Ω , ( | ψ ( t, x , u ) − ψ ( t, x , u ) | H ≤ C| x − x | H , | ψ ( t, , u ) | H ≤ C . (10.3)By Theorem 1.43, it is clear that Proposition 10.1
Let the assumption (AS1) hold. Then, for any p ≥ , X ∈ H and u ( · ) ∈U [0 , T ] , the equation (10.2) admits a unique mild solution X ∈ C F ([0 , T ]; L p (Ω; H )) . Furthermore, | X ( · ) | C F ([0 ,T ]; L p (Ω; H )) ≤ C (cid:0) | X | H (cid:1) . Also, we need the following condition: (AS2)
Suppose that g ( · , · , · , · ) : [0 , T ] × Ω × H × U → R and h ( · , · ) : Ω × H → R are two functionssatisfying: i) For any ( x, u ) ∈ H × U , the function g ( · , · , x, u ) : [0 , T ] × Ω → R is F -measurable andthe function h ( · , x ) : Ω → R is F T -measurable; ii) For a.e. ( t, ω ) ∈ [0 , T ] × Ω and any x ∈ H , thefunction g ( t, ω, x, · ) : U → R is continuous; and iii) For all ( t, x , x , u ) ∈ [0 , T ] × H × H × U , ( | g ( t, x , u ) − g ( t, x , u ) | + | h ( x ) − h ( x ) | ≤ C| x − x | H , a.s. , | g ( t, , u ) | + | h (0) | ≤ C , a.s. (10.4)Define a cost functional J ( · ) (for the controlled system (10.2)) as follows: J ( u ( · )) ∆ = E (cid:16) Z T g ( t, X ( t ) , u ( t )) dt + h ( X ( T )) (cid:17) , ∀ u ( · ) ∈ U [0 , T ] , (10.5)where X ( · ) is the state of (10.2) corresponding to u .Consider the following optimal control problem: Problem (OP)
Find a ¯ u ( · ) ∈ U [0 , T ] such that J (¯ u ( · )) = inf u ( · ) ∈U [0 ,T ] J ( u ( · )) . (10.6) Any ¯ u ( · ) satisfying (10.6) is called an optimal control. The corresponding state X ( · ) is called anoptimal state, and ( X ( · ) , ¯ u ( · )) is called an optimal pair. Remark 10.1
We do not impose state or endpoint constraints in our optimal control problem. Readers who are interested in such constraints are referred to [15] for some recent results.
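To make the cost (10.5) concrete, the following sketch estimates $J(u(\cdot))$ by Euler–Maruyama and Monte Carlo for a toy one-dimensional instance of (10.2). The data $a(t,x,u)=u$, $b(t,x,u)=0.2$, $g(t,x,u)=u^2$, $h(x)=5x^2$ are hypothetical choices, and the linear feedback $u=-X$ is compared with the zero control.

```python
import numpy as np

# Toy instance of (10.2)/(10.5) (hypothetical data): dX = u dt + 0.2 dW, X(0) = 1,
# J(u) = E[ int_0^T u^2 dt + 5 X(T)^2 ], estimated by Euler-Maruyama + Monte Carlo.

rng = np.random.default_rng(0)
T, steps, paths, sigma, x0 = 1.0, 200, 4000, 0.2, 1.0
dt = T / steps

def cost(feedback):
    """Monte Carlo estimate of the cost functional (10.5) for a feedback law."""
    X = np.full(paths, x0)
    running = np.zeros(paths)
    for _ in range(steps):
        u = feedback(X)
        running += u ** 2 * dt
        X = X + u * dt + sigma * np.sqrt(dt) * rng.standard_normal(paths)
    return float(np.mean(running + 5.0 * X ** 2))

j_zero = cost(lambda x: np.zeros_like(x))   # zero control
j_fb = cost(lambda x: -x)                   # linear feedback u = -X
```

On this data the feedback control achieves a markedly smaller cost, illustrating the comparison between admissible controls implicit in (10.6).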
Problem (OP) is now well understood in the case of finite dimensions (i.e., $\dim H<\infty$) and natural filtration. In this case, a Pontryagin-type maximum principle was obtained in [62] for general stochastic control systems with control-dependent diffusion coefficients and possibly nonconvex control regions, and it was found there that the corresponding result differs significantly from its deterministic counterpart. The main purpose of this section is to see what happens when $\dim H=\infty$.

In this subsection, we shall present a necessary condition for optimal controls of Problem (OP) under the following conditions:

(AS3)
The control region U is a convex subset of a separable Hilbert space H and the metric of U is induced by the norm of H , i.e., d ( u , u ) = | u − u | H . (AS4) For a.e. ( t, ω ) ∈ (0 , T ) × Ω , the functions a ( t, · , · ) : H × U → H and b ( t, · , · ) : H × U → H , g ( t, · , · ) : H × U → R and h ( · ) : H → R are C . Moreover, for any ( x, u ) ∈ H × U and a.e. ( t, ω ) ∈ (0 , T ) × Ω , ( | a x ( t, x, u ) | L ( H ) + | b x ( t, x, u ) | L ( H ) + | g x ( t, x, u ) | H + | h x ( x ) | H ≤ C , | a u ( t, x, u ) | L ( H ; H ) + | b u ( t, x, u ) | L ( H ; H ) + | g u ( t, x, u ) | H ≤ C . Further, we need to introduce the following H -valued BSEE: ( dY ( t ) = − A ∗ Y ( t ) dt + f ( t, Y ( t ) , Z ( t )) dt + Z ( t ) dW ( t ) in [0 , T ) ,y ( T ) = Y T . (10.7)Here Y T ∈ L F T (Ω; H )) and f ( · , · , · , · ) : [0 , T ] × Ω × H × H → H satisfies f ( · , · , , ∈ L F (0 , T ; L (Ω; H )) , | f ( t, ω, y , z ) − f ( t, ω, y , z ) | H ≤ C (cid:0) | y − y | H + | z − z | H (cid:1) , a.e. ( t, ω ) ∈ [0 , T ] × Ω , ∀ y , y , z , z ∈ H. (10.8)By Theorem 1.24, the equation (10.7) admits a unique transposition solution ( Y ( · ) , Z ( · )) ∈ D F ([0 , T ]; L (Ω; H )) × L F (0 , T ; H ). Furthermore, | ( Y ( · ) , Z ( · )) | D F ([0 ,T ]; L (Ω; H )) × L F (0 ,T ; H ) ≤ C (cid:0) | Y T | L F T (Ω; H ) + | f ( · , , | L F (0 ,T ; L (Ω; H )) (cid:1) . (10.9)We have the following necessary condition for optimal controls of Problem (OP) with a convexcontrol domain. Theorem 10.1
Let the assumptions (AS1)–(AS4) hold, and let ( X ( · ) , ¯ u ( · )) be an optimal pair ofProblem (OP). Let ( Y ( · ) , Z ( · )) be the transposition solution to the equation (10.7) with Y T and f ( · , · , · ) given by ( Y T = − h x (cid:0) X ( T ) (cid:1) ,f ( t, y , y ) = − a x ( t, X ( t ) , ¯ u ( t )) ∗ y − b x (cid:0) t, X ( t ) , ¯ u ( t ) (cid:1) ∗ y + g x (cid:0) t, X ( t ) , ¯ u ( t ) (cid:1) . (10.10) Then, (cid:10) a u ( t, X ( t ) , ¯ u ( t )) ∗ Y ( t ) + b u ( t, X ( t ) , ¯ u ( t )) ∗ Z ( t ) − g u ( t, ¯ u ( t ) , X ( t )) , ρ − ¯ u ( t ) (cid:11) H ≤ , ∀ ρ ∈ U, a.e. ( t, ω ) ∈ [0 , T ] × Ω . (10.11)86 roof . We use the convex perturbation technique and divide the proof into three steps. Step 1 . We fix any control u ( · ) ∈ U [0 , T ]. Since U is convex, for any ε ∈ [0 , u ε ( · ) ∆ = ¯ u ( · ) + ε (cid:0) u ( · ) − ¯ u ( · ) (cid:1) = (1 − ε )¯ u ( · ) + εu ( · ) ∈ U [0 , T ] . Denote by X ε ( · ) the state of (10.2) corresponding to the control u ε ( · ). It follows from Proposition10.1 that | X ε ( · ) | C F ([0 ,T ]; L (Ω; H )) ≤ C (cid:0) | X | H (cid:1) , ∀ ε ∈ [0 , . (10.12)Write X ε ( · ) = 1 ε (cid:0) X ε ( · ) − X ( · ) (cid:1) , δu ( · ) = u ( · ) − ¯ u ( · ) . It is easy to see that X ε ( · ) solves the following SEE: ( dX ε = (cid:0) AX ε + a ε X ε + a ε δu (cid:1) dt + (cid:0) b ε X ε + b ε δu (cid:1) dW ( t ) in (0 , T ] ,X ε (0) = 0 , (10.13)where for ψ = a, b , ψ ε ( t ) = Z ψ x ( t, X ( t ) + σεX ε ( t ) , u ε ( t )) dσ,ψ ε ( t ) = Z ψ u ( t, X ( t ) , ¯ u ( t ) + σεδu ( t )) dσ. (10.14)Consider the following SEE: ( dX = (cid:0) AX + a X + a δu (cid:1) dt + (cid:0) b X + b δu (cid:1) dW ( t ) in (0 , T ] ,X (0) = 0 , (10.15)where for ψ = a, b , ψ ( t ) = ψ x ( t, X ( t ) , ¯ u ( t )) , ψ ( t ) = ψ u ( t, X ( t ) , ¯ u ( t )) . (10.16) Step 2 . In this step, we shall show thatlim ε → (cid:12)(cid:12) X ε − X (cid:12)(cid:12) L ∞ F (0 ,T ; L (Ω; H )) = 0 . 
First, using the Burkholder–Davis–Gundy inequality (i.e., Proposition 1.10) and the assumption (AS4), we find that
\[
\begin{aligned}
\mathbb E|X_1^\varepsilon(t)|_H^2
&=\mathbb E\Big|\int_0^tS(t-s)a_1^\varepsilon(s)X_1^\varepsilon(s)\,ds+\int_0^tS(t-s)a_2^\varepsilon(s)\delta u(s)\,ds\\
&\qquad+\int_0^tS(t-s)b_1^\varepsilon(s)X_1^\varepsilon(s)\,dW(s)+\int_0^tS(t-s)b_2^\varepsilon(s)\delta u(s)\,dW(s)\Big|_H^2\\
&\le C\Big(\mathbb E\Big|\int_0^tS(t-s)a_1^\varepsilon(s)X_1^\varepsilon(s)\,ds\Big|_H^2+\mathbb E\Big|\int_0^tS(t-s)b_1^\varepsilon(s)X_1^\varepsilon(s)\,dW(s)\Big|_H^2\\
&\qquad+\mathbb E\Big|\int_0^tS(t-s)a_2^\varepsilon(s)\delta u(s)\,ds\Big|_H^2+\mathbb E\Big|\int_0^tS(t-s)b_2^\varepsilon(s)\delta u(s)\,dW(s)\Big|_H^2\Big)\\
&\le C\Big(\int_0^t\mathbb E|X_1^\varepsilon(s)|_H^2\,ds+\int_0^T\mathbb E|\delta u(s)|_{H_1}^2\,ds\Big).
\end{aligned}
\tag{10.18}
\]
It follows from (10.18) and Gronwall's inequality that
\[
\mathbb E|X_1^\varepsilon(t)|_H^2\le C|\bar u-u|^2_{L^2_{\mathbb F}(0,T;H_1)},\quad\forall\,t\in[0,T].
\tag{10.19}
\]
Similarly,
\[
\mathbb E|X_1(t)|_H^2\le C|\bar u-u|^2_{L^2_{\mathbb F}(0,T;H_1)},\quad\forall\,t\in[0,T].
\tag{10.20}
\]
Put $\widetilde X^\varepsilon=X_1^\varepsilon-X_1$. Then $\widetilde X^\varepsilon$ solves the following equation:
\[
\begin{cases}
d\widetilde X^\varepsilon=\big[A\widetilde X^\varepsilon+a_1^\varepsilon(t)\widetilde X^\varepsilon+\big(a_1^\varepsilon(t)-a_1(t)\big)X_1+\big(a_2^\varepsilon(t)-a_2(t)\big)\delta u\big]dt\\
\hskip1.5cm+\big[b_1^\varepsilon(t)\widetilde X^\varepsilon+\big(b_1^\varepsilon(t)-b_1(t)\big)X_1+\big(b_2^\varepsilon(t)-b_2(t)\big)\delta u\big]dW(t) &\text{in }(0,T],\\
\widetilde X^\varepsilon(0)=0.
\end{cases}
\tag{10.21}
\]
It follows from (10.20)–(10.21) that
\[
\begin{aligned}
\mathbb E|\widetilde X^\varepsilon(t)|_H^2
&=\mathbb E\Big|\int_0^tS(t-s)a_1^\varepsilon(s)\widetilde X^\varepsilon(s)\,ds+\int_0^tS(t-s)b_1^\varepsilon(s)\widetilde X^\varepsilon(s)\,dW(s)\\
&\qquad+\int_0^tS(t-s)\big[a_1^\varepsilon(s)-a_1(s)\big]X_1(s)\,ds+\int_0^tS(t-s)\big[b_1^\varepsilon(s)-b_1(s)\big]X_1(s)\,dW(s)\\
&\qquad+\int_0^tS(t-s)\big[a_2^\varepsilon(s)-a_2(s)\big]\delta u(s)\,ds+\int_0^tS(t-s)\big[b_2^\varepsilon(s)-b_2(s)\big]\delta u(s)\,dW(s)\Big|_H^2\\
&\le C\Big[\mathbb E\int_0^t|\widetilde X^\varepsilon(s)|_H^2\,ds+|X_1|^2_{L^\infty_{\mathbb F}(0,T;L^2(\Omega;H))}\int_0^T\mathbb E\big(|a_1^\varepsilon-a_1|^2_{\mathcal L(H)}+|b_1^\varepsilon-b_1|^2_{\mathcal L(H)}\big)dt\\
&\qquad+|u-\bar u|^2_{L^2_{\mathbb F}(0,T;H_1)}\int_0^T\mathbb E\big(|a_2^\varepsilon-a_2|^2_{\mathcal L(H_1;H)}+|b_2^\varepsilon-b_2|^2_{\mathcal L(H_1;H)}\big)dt\Big]\\
&\le C\big(1+|u-\bar u|^2_{L^2_{\mathbb F}(0,T;H_1)}\big)\Big[\mathbb E\int_0^t|\widetilde X^\varepsilon(s)|_H^2\,ds\\
&\qquad+\int_0^T\mathbb E\big(|a_1^\varepsilon-a_1|^2_{\mathcal L(H)}+|b_1^\varepsilon-b_1|^2_{\mathcal L(H)}+|a_2^\varepsilon-a_2|^2_{\mathcal L(H_1;H)}+|b_2^\varepsilon-b_2|^2_{\mathcal L(H_1;H)}\big)dt\Big].
\end{aligned}
\]
This, together with Gronwall's inequality, implies that
\[
\mathbb E|\widetilde X^\varepsilon(t)|_H^2\le Ce^{C|u-\bar u|^2_{L^2_{\mathbb F}(0,T;H_1)}}\int_0^T\mathbb E\big(|a_1^\varepsilon-a_1|^2_{\mathcal L(H)}+|b_1^\varepsilon-b_1|^2_{\mathcal L(H)}+|a_2^\varepsilon-a_2|^2_{\mathcal L(H_1;H)}+|b_2^\varepsilon-b_2|^2_{\mathcal L(H_1;H)}\big)ds,\quad\forall\,t\in[0,T].
\tag{10.22}
\]
Note that (10.19) implies $X_\varepsilon(\cdot)\to\bar X(\cdot)$ (in $H$) in probability, as $\varepsilon\to0$.
Hence, by (10.14), (10.16) and the continuity of $a_x(t,\cdot,\cdot)$, $b_x(t,\cdot,\cdot)$, $a_u(t,\cdot,\cdot)$ and $b_u(t,\cdot,\cdot)$, we deduce that
\[
\lim_{\varepsilon\to0}\int_0^T\mathbb E\big(|a_1^\varepsilon(s)-a_1(s)|^2_{\mathcal L(H)}+|b_1^\varepsilon(s)-b_1(s)|^2_{\mathcal L(H)}+|a_2^\varepsilon(s)-a_2(s)|^2_{\mathcal L(H_1;H)}+|b_2^\varepsilon(s)-b_2(s)|^2_{\mathcal L(H_1;H)}\big)ds=0.
\]
This, together with (10.22), gives (10.17).

Step 3. Since $(\bar X(\cdot),\bar u(\cdot))$ is an optimal pair of Problem (OP), from (10.17), we find that
\[
0\le\lim_{\varepsilon\to0}\frac{J(u_\varepsilon(\cdot))-J(\bar u(\cdot))}{\varepsilon}
=\mathbb E\int_0^T\Big(\big\langle g_1(t),X_1(t)\big\rangle_H+\big\langle g_2(t),\delta u(t)\big\rangle_{H_1}\Big)dt+\mathbb E\big\langle h_x(\bar X(T)),X_1(T)\big\rangle_H,
\tag{10.23}
\]
where $g_1(t)=g_x(t,\bar X(t),\bar u(t))$ and $g_2(t)=g_u(t,\bar X(t),\bar u(t))$.

By the definition of the transposition solution to (10.7), we have that
\[
-\mathbb E\big\langle h_x(\bar X(T)),X_1(T)\big\rangle_H-\mathbb E\int_0^T\big\langle g_1(t),X_1(t)\big\rangle_H\,dt
=\mathbb E\int_0^T\Big(\big\langle a_2(t)\delta u(t),Y(t)\big\rangle_H+\big\langle b_2(t)\delta u(t),Z(t)\big\rangle_H\Big)dt.
\tag{10.24}
\]
Combining (10.23) and (10.24), we find that
\[
\mathbb E\int_0^T\big\langle a_2(t)^*Y(t)+b_2(t)^*Z(t)-g_2(t),\,u(t)-\bar u(t)\big\rangle_{H_1}\,dt\le0
\tag{10.25}
\]
for any $u(\cdot)\in\mathcal U[0,T]$ satisfying $u(\cdot)-\bar u(\cdot)\in L^2_{\mathbb F}(0,T;H_1)$. Hence, by Lemma 7.1 and (10.25), we conclude (10.11). This completes the proof of Theorem 10.1.

In this subsection, we give a necessary condition for optimal controls of Problem (OP) for a general control domain.

As we shall see later, the main difficulty in dealing with a non-convex control domain $U$ is that one needs to solve the following $\mathcal L(H)$-valued BSEE:
\[
\begin{cases}
dP=-(A^*+J^*)P\,dt-P(A+J)\,dt-K^*PK\,dt-(K^*Q+QK)\,dt+F\,dt+Q\,dW(t) &\text{in }[0,T),\\
P(T)=P_T.
\end{cases}
\tag{10.26}
\]
Here and henceforth, $F\in L^1_{\mathbb F}(0,T;L^2(\Omega;\mathcal L(H)))$, $P_T\in L^2_{\mathcal F_T}(\Omega;\mathcal L(H))$, $J\in L^1_{\mathbb F}(0,T;L^\infty(\Omega;\mathcal L(H)))$ and $K\in L^2_{\mathbb F}(0,T;L^\infty(\Omega;\mathcal L(H)))$.
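Before developing the machinery needed for general control domains, the convex-domain condition just proved can be sanity-checked numerically. The script below is entirely our own finite-dimensional, deterministic toy illustration (not from the text): for a smooth cost $J$ minimized over a convex set $U$, the convex perturbation $u_\varepsilon=\bar u+\varepsilon(\rho-\bar u)$ of Step 1 yields the variational inequality $\langle J'(\bar u),\rho-\bar u\rangle\ge0$ for every $\rho\in U$; in (10.11) the inequality appears with the opposite sign because it is written for a Hamiltonian-type quantity rather than for the cost.

```python
import numpy as np

# Toy problem (our own choice): J(u) = |u - z|^2 over the convex set U = [0,1]^2.
z = np.array([1.5, -0.3])
u_bar = np.clip(z, 0.0, 1.0)        # minimizer of J over U: projection of z onto U
grad = 2.0 * (u_bar - z)            # J'(u_bar)

def J(u):
    return float(np.sum((u - z) ** 2))

# The variational inequality <J'(u_bar), rho - u_bar> >= 0 holds at every rho in U ...
rng = np.random.default_rng(0)
assert all(np.dot(grad, rho - u_bar) >= -1e-12
           for rho in rng.uniform(0.0, 1.0, size=(1000, 2)))

# ... and the difference quotient along a fixed convex perturbation reproduces it:
# J(u_bar + eps*(rho - u_bar)) - J(u_bar) ~ eps * <J'(u_bar), rho - u_bar> >= 0.
rho = np.array([0.2, 0.9])
eps = 1e-3
quotient = (J(u_bar + eps * (rho - u_bar)) - J(u_bar)) / eps
assert abs(quotient - np.dot(grad, rho - u_bar)) < 1e-2
```

In infinite dimensions the same perturbation argument goes through because $u_\varepsilon$ stays admissible for every $\varepsilon\in[0,1]$; convexity of $U$ is exactly what makes the one-sided derivative available.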
As we have explained in Section 8, the previous literature provides no stochastic integration theory in general Banach spaces that can be employed to treat the well-posedness of (10.26). In order to overcome this difficulty, similarly to Section 8, we employ the stochastic transposition method introduced in our previous works [47, 48].

To define solutions to (10.26), for $t\in[0,T)$, $\xi_i\in L^4_{\mathcal F_t}(\Omega;H)$ and $u_i,v_i\in L^2_{\mathbb F}(t,T;L^4(\Omega;H))$ ($i=1,2$), we introduce the following two SEEs:
\[
\begin{cases}
d\varphi_1=(A+J)\varphi_1\,ds+u_1\,ds+K\varphi_1\,dW(s)+v_1\,dW(s) &\text{in }(t,T],\\
\varphi_1(t)=\xi_1
\end{cases}
\tag{10.27}
\]
and
\[
\begin{cases}
d\varphi_2=(A+J)\varphi_2\,ds+u_2\,ds+K\varphi_2\,dW(s)+v_2\,dW(s) &\text{in }(t,T],\\
\varphi_2(t)=\xi_2.
\end{cases}
\tag{10.28}
\]
Also, we shall introduce the solution space for (10.26). Write
\[
\begin{aligned}
\mathcal P[0,T]\triangleq\Big\{P(\cdot,\cdot)\ \Big|\ & P(\cdot,\cdot)\in\mathcal L_{pd}\big(L^2_{\mathbb F}(0,T;L^4(\Omega;H));L^2_{\mathbb F}(0,T;L^{4/3}(\Omega;H))\big)\ \text{and, for every }t\in[0,T]\\
&\text{and }\xi\in L^4_{\mathcal F_t}(\Omega;H),\ P(\cdot,\cdot)\xi\in D_{\mathbb F}([t,T];L^{4/3}(\Omega;H))\\
&\text{with }|P(\cdot,\cdot)\xi|_{D_{\mathbb F}([t,T];L^{4/3}(\Omega;H))}\le C|\xi|_{L^4_{\mathcal F_t}(\Omega;H)}\Big\}
\end{aligned}
\]
and
\[
\begin{aligned}
\mathcal Q[0,T]\triangleq\Big\{\big(Q^{(\cdot)},\widehat Q^{(\cdot)}\big)\ \Big|\ &\text{for any }t\in[0,T],\ Q^{(t)}\text{ and }\widehat Q^{(t)}\text{ are bounded linear operators from}\\
&L^4_{\mathcal F_t}(\Omega;H)\times L^2_{\mathbb F}(t,T;L^4(\Omega;H))\times L^2_{\mathbb F}(t,T;L^4(\Omega;H))\\
&\text{to }L^2_{\mathbb F}(t,T;L^{4/3}(\Omega;H))\text{ and }Q^{(t)}(0,0,\cdot)^*=\widehat Q^{(t)}(0,0,\cdot)\Big\}.
\end{aligned}
\]
The norms on $\mathcal P[0,T]$ and $\mathcal Q[0,T]$ are given respectively by
\[
|P(\cdot,\cdot)|_{\mathcal P[0,T]}\triangleq|P(\cdot,\cdot)|_{\mathcal L(L^2_{\mathbb F}(0,T;L^4(\Omega;H));L^2_{\mathbb F}(0,T;L^{4/3}(\Omega;H)))}
\]
and
\[
\big|\big(Q^{(\cdot)},\widehat Q^{(\cdot)}\big)\big|_{\mathcal Q[0,T]}\triangleq\sup_{t\in[0,T]}\big|\big(Q^{(t)},\widehat Q^{(t)}\big)\big|_{\mathcal L(L^4_{\mathcal F_t}(\Omega;H)\times L^2_{\mathbb F}(t,T;L^4(\Omega;H))\times L^2_{\mathbb F}(t,T;L^4(\Omega;H));\,L^2_{\mathbb F}(t,T;L^{4/3}(\Omega;H)))}.
\]
The following notion will play a fundamental role in the sequel.
Definition 10.1
We call $\big(P(\cdot),Q^{(\cdot)},\widehat Q^{(\cdot)}\big)\in\mathcal P[0,T]\times\mathcal Q[0,T]$ a relaxed transposition solution to the equation (10.26) if, for any $t\in[0,T]$, $\xi_1,\xi_2\in L^4_{\mathcal F_t}(\Omega;H)$ and $u_1(\cdot),u_2(\cdot),v_1(\cdot),v_2(\cdot)\in L^2_{\mathbb F}(t,T;L^4(\Omega;H))$, it holds that
\[
\begin{aligned}
&\mathbb E\big\langle P_T\varphi_1(T),\varphi_2(T)\big\rangle_H-\mathbb E\int_t^T\big\langle F(s)\varphi_1(s),\varphi_2(s)\big\rangle_H\,ds\\
&=\mathbb E\big\langle P(t)\xi_1,\xi_2\big\rangle_H+\mathbb E\int_t^T\big\langle P(s)u_1(s),\varphi_2(s)\big\rangle_H\,ds+\mathbb E\int_t^T\big\langle P(s)\varphi_1(s),u_2(s)\big\rangle_H\,ds\\
&\quad+\mathbb E\int_t^T\big\langle P(s)K(s)\varphi_1(s),v_2(s)\big\rangle_H\,ds+\mathbb E\int_t^T\big\langle P(s)v_1(s),K(s)\varphi_2(s)+v_2(s)\big\rangle_H\,ds\\
&\quad+\mathbb E\int_t^T\big\langle v_1(s),\widehat Q^{(t)}(\xi_2,u_2,v_2)(s)\big\rangle_H\,ds+\mathbb E\int_t^T\big\langle Q^{(t)}(\xi_1,u_1,v_1)(s),v_2(s)\big\rangle_H\,ds.
\end{aligned}
\]
Here, $\varphi_1(\cdot)$ and $\varphi_2(\cdot)$ solve (10.27) and (10.28), respectively.

We have the following well-posedness result for the equation (10.26) (see [48] for its proof).
Theorem 10.2
Suppose that $L^p_{\mathcal F_T}(\Omega)$ ($1\le p<\infty$) is separable. Then the equation (10.26) admits a unique relaxed transposition solution $\big(P(\cdot),Q^{(\cdot)},\widehat Q^{(\cdot)}\big)\in\mathcal P[0,T]\times\mathcal Q[0,T]$. Furthermore,
\[
|P|_{\mathcal P[0,T]}+\big|\big(Q^{(\cdot)},\widehat Q^{(\cdot)}\big)\big|_{\mathcal Q[0,T]}\le C\big(|F|_{L^1_{\mathbb F}(0,T;L^2(\Omega;\mathcal L(H)))}+|P_T|_{L^2_{\mathcal F_T}(\Omega;\mathcal L(H))}\big).
\]

Remark 10.2 It is well known that $L^p_{\mathcal F_T}(\Omega)$ is separable if the probability space $(\Omega,\mathcal F_T,\mathbb P)$ is separable, i.e., there exists a countable family $\mathcal E\subset\mathcal F_T$ such that, for any $\varepsilon>0$ and $B\in\mathcal F_T$, one can find $B_1\in\mathcal E$ with $\mathbb P\big((B\setminus B_1)\cup(B_1\setminus B)\big)<\varepsilon$ (e.g., [4, Section 13.4]). Except for some artificial examples, almost all frequently used probability spaces are separable (e.g., [25]).

Next, we give a regularity result for relaxed transposition solutions to (10.26). To this end, we recall two preliminary results (see [49] for their proofs). The first one is as follows:
Lemma 10.1
For each $t\in[0,T]$, if $u_2=v_2=0$ in the equation (10.28), then there exists an operator $U(\cdot,t)\in\mathcal L\big(L^4_{\mathcal F_t}(\Omega;H);C_{\mathbb F}([t,T];L^4(\Omega;H))\big)$ such that the solution to (10.28) can be represented as $\varphi_2(\cdot)=U(\cdot,t)\xi_2$.

Let $\{\Delta_n\}_{n=1}^\infty$ be a sequence of partitions of $[0,T]$, that is,
\[
\Delta_n\triangleq\big\{t_i^n\ \big|\ i=0,1,\cdots,n,\ \text{and }0=t_0^n<t_1^n<\cdots<t_n^n=T\big\},
\]
such that $\Delta_n\subset\Delta_{n+1}$ and $\max_{0\le i\le n-1}(t_{i+1}^n-t_i^n)\to0$ as $n\to\infty$. We introduce the following subspaces of $L^2_{\mathbb F}(0,T;L^4(\Omega;H))$:
\[
\mathcal H_n\triangleq\Big\{\sum_{i=0}^{n-1}\chi_{[t_i^n,t_{i+1}^n)}(\cdot)\,U(\cdot,t_i^n)h_i\ \Big|\ h_i\in L^4_{\mathcal F_{t_i^n}}(\Omega;H),\ t_i^n\in\Delta_n,\ i=0,\cdots,n-1\Big\}.
\tag{10.29}
\]
Here $U(\cdot,\cdot)$ is the operator introduced in Lemma 10.1. We have the following result.

Lemma 10.2
The set $\bigcup_{n=1}^\infty\mathcal H_n$ is dense in $L^2_{\mathbb F}(0,T;L^4(\Omega;H))$.

The regularity result for solutions to (10.26) is stated as follows (see [49, 53] for its proof).
Lemma 10.3
Suppose that the assumptions in Theorem 10.2 hold, and let $(P(\cdot),Q^{(\cdot)},\widehat Q^{(\cdot)})$ be the relaxed transposition solution to the equation (10.26). Then, for each $n\in\mathbb N$, there exist two pointwise defined, linear operators $Q^n$ and $\widehat Q^n$, both of which are from $\mathcal H_n$ to $L^2_{\mathbb F}(0,T;L^{4/3}(\Omega;H))$, such that, for any $\xi_1,\xi_2\in L^4_{\mathcal F_0}(\Omega;H)$, $u_1(\cdot),u_2(\cdot)\in L^4_{\mathbb F}(\Omega;L^2(0,T;H))$ and $v_1(\cdot),v_2(\cdot)\in\mathcal H_n$, it holds that
\[
\begin{aligned}
&\mathbb E\int_0^T\big\langle v_1(s),\widehat Q^{(0)}(\xi_2,u_2,v_2)(s)\big\rangle_H\,ds+\mathbb E\int_0^T\big\langle Q^{(0)}(\xi_1,u_1,v_1)(s),v_2(s)\big\rangle_H\,ds\\
&=\mathbb E\int_0^T\Big[\big\langle Q^n(s)v_1(s),\varphi_2(s)\big\rangle_H+\big\langle\varphi_1(s),\widehat Q^n(s)v_2(s)\big\rangle_H\Big]ds,
\end{aligned}
\tag{10.30}
\]
where $\varphi_1(\cdot)$ and $\varphi_2(\cdot)$ solve (10.27) and (10.28) with $t=0$, respectively.

We first assume the following further conditions for Problem (OP).

(AS5)
The functions $a(t,\cdot,u)$, $b(t,\cdot,u)$, $g(t,\cdot,u)$ and $h(\cdot)$ are $C^2$ w.r.t. $x\in H$, such that, for $\varphi(t,x,u)=a(t,x,u),b(t,x,u)$ and $\psi(t,x,u)=g(t,x,u),h(x)$, the maps $\varphi_x(t,x,\cdot)$, $\psi_x(t,x,\cdot)$, $\varphi_{xx}(t,x,\cdot)$ and $\psi_{xx}(t,x,\cdot)$ are continuous (one can define pointwise defined, linear operators here similarly to (8.1)). Moreover, for all $(x,u)\in H\times U$ and a.e. $(t,\omega)\in[0,T]\times\Omega$,
\[
\begin{cases}
|\varphi_x(t,x,u)|_{\mathcal L(H)}+|\psi_x(t,x,u)|_H\le C_L,\\
|\varphi_{xx}(t,x,u)|_{\mathcal L(H,H;H)}+|\psi_{xx}(t,x,u)|_{\mathcal L(H)}\le C_L.
\end{cases}
\tag{10.31}
\]
Let
\[
\mathbb H(t,x,u,k_1,k_2)\triangleq\big\langle k_1,a(t,x,u)\big\rangle_H+\big\langle k_2,b(t,x,u)\big\rangle_H-g(t,x,u),\quad(t,x,u,k_1,k_2)\in[0,T]\times H\times U\times H\times H.
\tag{10.32}
\]
We have the following result.

Theorem 10.3
Suppose that $L^p_{\mathcal F_T}(\Omega)$ ($1\le p<\infty$) is a separable Banach space. Let the assumptions (AS1), (AS2) and (AS5) hold, and let $(\bar X(\cdot),\bar u(\cdot))$ be an optimal pair of Problem (OP). Let $\big(Y(\cdot),Z(\cdot)\big)$ be the transposition solution to (10.7) with $Y_T$ and $f(\cdot,\cdot,\cdot)$ given by (10.10). Assume that $(P(\cdot),Q^{(\cdot)},\widehat Q^{(\cdot)})$ is the relaxed transposition solution to the equation (10.26) in which $P_T$, $J(\cdot)$, $K(\cdot)$ and $F(\cdot)$ are given by
\[
\begin{cases}
P_T=-h_{xx}\big(\bar X(T)\big),\quad J(t)=a_x(t,\bar X(t),\bar u(t)),\\
K(t)=b_x(t,\bar X(t),\bar u(t)),\quad F(t)=-\mathbb H_{xx}\big(t,\bar X(t),\bar u(t),Y(t),Z(t)\big).
\end{cases}
\tag{10.33}
\]
Then, for a.e. $(t,\omega)\in[0,T]\times\Omega$ and for all $u\in U$,
\[
\begin{aligned}
&\mathbb H\big(t,\bar X(t),\bar u(t),Y(t),Z(t)\big)-\mathbb H\big(t,\bar X(t),u,Y(t),Z(t)\big)\\
&\quad-\frac12\big\langle P(t)\big[b\big(t,\bar X(t),\bar u(t)\big)-b\big(t,\bar X(t),u\big)\big],\,b\big(t,\bar X(t),\bar u(t)\big)-b\big(t,\bar X(t),u\big)\big\rangle_H\ge0.
\end{aligned}
\tag{10.34}
\]

Proof. We divide the proof into several steps.
Step 1. For each $\varepsilon\in(0,T)$, let $E_\varepsilon\subset[0,T]$ be a Lebesgue measurable set with measure $\varepsilon$. Put
\[
u^\varepsilon(\cdot)=\bar u(\cdot)\chi_{[0,T]\setminus E_\varepsilon}(\cdot)+u(\cdot)\chi_{E_\varepsilon}(\cdot),
\]
where $u(\cdot)$ is an arbitrarily given element in $\mathcal U[0,T]$.

Let us introduce some notations which will be used later. For $\psi=a,b,g$, let
\[
\begin{cases}
\psi_1(t)=\psi_x(t,\bar X(t),\bar u(t)),\qquad \psi_2(t)=\psi_{xx}(t,\bar X(t),\bar u(t)),\\[1mm]
\tilde\psi_1^\varepsilon(t)=\displaystyle\int_0^1\psi_x\big(t,\bar X(t)+\sigma(X^\varepsilon(t)-\bar X(t)),u^\varepsilon(t)\big)\,d\sigma,\\[1mm]
\tilde\psi_2^\varepsilon(t)=2\displaystyle\int_0^1(1-\sigma)\psi_{xx}\big(t,\bar X(t)+\sigma(X^\varepsilon(t)-\bar X(t)),u^\varepsilon(t)\big)\,d\sigma,
\end{cases}
\tag{10.35}
\]
and
\[
\begin{cases}
\delta\psi(t)=\psi(t,\bar X(t),u(t))-\psi(t,\bar X(t),\bar u(t)),\\
\delta\psi_1(t)=\psi_x(t,\bar X(t),u(t))-\psi_x(t,\bar X(t),\bar u(t)),\\
\delta\psi_2(t)=\psi_{xx}(t,\bar X(t),u(t))-\psi_{xx}(t,\bar X(t),\bar u(t)).
\end{cases}
\tag{10.36}
\]
Consider the following equation:
\[
\begin{cases}
dX^\varepsilon=\big(AX^\varepsilon+a(t,X^\varepsilon,u^\varepsilon)\big)dt+b(t,X^\varepsilon,u^\varepsilon)\,dW(t) &\text{in }(0,T],\\
X^\varepsilon(0)=X_0.
\end{cases}
\tag{10.37}
\]
Let $\widetilde X^\varepsilon(\cdot)\triangleq X^\varepsilon(\cdot)-\bar X(\cdot)$. Then we see that $\widetilde X^\varepsilon(\cdot)$ solves the following SEE:
\[
\begin{cases}
d\widetilde X^\varepsilon=\big(A\widetilde X^\varepsilon+\tilde a_1^\varepsilon(t)\widetilde X^\varepsilon+\chi_{E_\varepsilon}(t)\delta a(t)\big)dt+\big(\tilde b_1^\varepsilon(t)\widetilde X^\varepsilon+\chi_{E_\varepsilon}(t)\delta b(t)\big)dW(t) &\text{in }(0,T],\\
\widetilde X^\varepsilon(0)=0.
\end{cases}
\tag{10.38}
\]
The equation (10.38) suggests introducing the following first order "variational" equation:
\[
\begin{cases}
dX_1^\varepsilon=\big(AX_1^\varepsilon+a_1(t)X_1^\varepsilon\big)dt+\big(b_1(t)X_1^\varepsilon+\chi_{E_\varepsilon}(t)\delta b(t)\big)dW(t) &\text{in }(0,T],\\
X_1^\varepsilon(0)=0.
\end{cases}
\tag{10.39}
\]
We also need to introduce the following second order "variational" equation:
\[
\begin{cases}
dX_2^\varepsilon=\Big[AX_2^\varepsilon+a_1(t)X_2^\varepsilon+\chi_{E_\varepsilon}(t)\delta a(t)+\frac12a_2(t)\big(X_1^\varepsilon,X_1^\varepsilon\big)\Big]dt\\
\hskip1.2cm+\Big[b_1(t)X_2^\varepsilon+\chi_{E_\varepsilon}(t)\delta b_1(t)X_1^\varepsilon+\frac12b_2(t)\big(X_1^\varepsilon,X_1^\varepsilon\big)\Big]dW(t) &\text{in }(0,T],\\
X_2^\varepsilon(0)=0.
\end{cases}
\tag{10.40}
\]
Denote by $C(X_0)$ a generic constant (depending on $X_0$, $T$ and $A$), which may change from line to line.
One can show that
\[
|X_1^\varepsilon(\cdot)|_{C_{\mathbb F}([0,T];L^4(\Omega;H))}\le C(X_0)\sqrt{\varepsilon},
\tag{10.41}
\]
\[
|X_2^\varepsilon(\cdot)|_{C_{\mathbb F}([0,T];L^2(\Omega;H))}\le C(X_0)\,\varepsilon.
\tag{10.42}
\]

Step 2. We now compute $J(u^\varepsilon(\cdot))-J(\bar u(\cdot))$.

Using Taylor's formula and the continuity of both $h_{xx}(x)$ and $g_{xx}(t,x,u)$ with respect to $x$, and noting (10.41)–(10.42), after a careful computation we can show that, as $\varepsilon\to0$,
\[
\begin{aligned}
J(u^\varepsilon(\cdot))-J(\bar u(\cdot))
&=\mathbb E\int_0^T\Big(\big\langle g_1(t),X_1^\varepsilon(t)+X_2^\varepsilon(t)\big\rangle_H+\frac12\big\langle g_2(t)X_1^\varepsilon(t),X_1^\varepsilon(t)\big\rangle_H+\chi_{E_\varepsilon}(t)\delta g(t)\Big)dt\\
&\quad+\mathbb E\big\langle h_x\big(\bar X(T)\big),X_1^\varepsilon(T)+X_2^\varepsilon(T)\big\rangle_H+\frac12\mathbb E\big\langle h_{xx}\big(\bar X(T)\big)X_1^\varepsilon(T),X_1^\varepsilon(T)\big\rangle_H+o(\varepsilon).
\end{aligned}
\tag{10.43}
\]
Next, we shall get rid of $X_1^\varepsilon(\cdot)$ and $X_2^\varepsilon(\cdot)$ in (10.43) by means of the solutions to the equations (10.7) and (10.26). It follows from the definition of the transposition solution to the equation (10.7) (with $Y_T$ and $f(\cdot,\cdot,\cdot)$ given by (10.10)) that
\[
-\mathbb E\big\langle h_x(\bar X(T)),X_1^\varepsilon(T)\big\rangle_H-\mathbb E\int_0^T\big\langle g_1(t),X_1^\varepsilon(t)\big\rangle_H\,dt=\mathbb E\int_0^T\big\langle Z(t),\delta b(t)\big\rangle_H\,\chi_{E_\varepsilon}(t)\,dt.
\tag{10.44}
\]
(Recall that, by Subsection 1.3, for any $C^2$-function $f(\cdot)$ defined on a Banach space $\mathbb X$ and $X_0\in\mathbb X$, one has $f_{xx}(X_0)\in\mathcal L(\mathbb X,\mathbb X;\mathbb X)$. This means that, for any $X_1,X_2\in\mathbb X$, $f_{xx}(X_0)(X_1,X_2)\in\mathbb X$. Hence, by (10.35), $a_2(t)\big(X_1^\varepsilon,X_1^\varepsilon\big)$ in (10.40) stands for $a_{xx}(t,\bar X(t),\bar u(t))\big(X_1^\varepsilon(t),X_1^\varepsilon(t)\big)$; one has a similar meaning for $b_2(t)\big(X_1^\varepsilon,X_1^\varepsilon\big)$ and so on.) Similarly,
\[
\begin{aligned}
&-\mathbb E\big\langle h_x(\bar X(T)),X_2^\varepsilon(T)\big\rangle_H-\mathbb E\int_0^T\big\langle g_1(t),X_2^\varepsilon(t)\big\rangle_H\,dt\\
&=\mathbb E\int_0^T\Big[\frac12\Big(\big\langle Y(t),a_2(t)\big(X_1^\varepsilon(t),X_1^\varepsilon(t)\big)\big\rangle_H+\big\langle Z(t),b_2(t)\big(X_1^\varepsilon(t),X_1^\varepsilon(t)\big)\big\rangle_H\Big)\\
&\qquad\quad+\chi_{E_\varepsilon}(t)\Big(\big\langle Y(t),\delta a(t)\big\rangle_H+\big\langle Z(t),\delta b_1(t)X_1^\varepsilon(t)\big\rangle_H\Big)\Big]dt.
\end{aligned}
\tag{10.45}
\]
According to (10.43)–(10.45), we conclude that, as $\varepsilon\to0$,
\[
\begin{aligned}
J(u^\varepsilon(\cdot))-J(\bar u(\cdot))
&=\frac12\mathbb E\int_0^T\Big(\big\langle g_2(t)X_1^\varepsilon(t),X_1^\varepsilon(t)\big\rangle_H-\big\langle Y(t),a_2(t)\big(X_1^\varepsilon(t),X_1^\varepsilon(t)\big)\big\rangle_H-\big\langle Z(t),b_2(t)\big(X_1^\varepsilon(t),X_1^\varepsilon(t)\big)\big\rangle_H\Big)dt\\
&\quad+\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\Big(\delta g(t)-\big\langle Y(t),\delta a(t)\big\rangle_H-\big\langle Z(t),\delta b(t)\big\rangle_H\Big)dt\\
&\quad+\frac12\mathbb E\big\langle h_{xx}\big(\bar X(T)\big)X_1^\varepsilon(T),X_1^\varepsilon(T)\big\rangle_H+o(\varepsilon).
\end{aligned}
\tag{10.46}
\]
By the definition of the relaxed transposition solution to the equation (10.26) (with $P_T$, $J(\cdot)$, $K(\cdot)$ and $F(\cdot)$ given by (10.33)), we obtain that
\[
\begin{aligned}
&-\mathbb E\big\langle h_{xx}\big(\bar X(T)\big)X_1^\varepsilon(T),X_1^\varepsilon(T)\big\rangle_H+\mathbb E\int_0^T\big\langle\mathbb H_{xx}\big(t,\bar X(t),\bar u(t),Y(t),Z(t)\big)X_1^\varepsilon(t),X_1^\varepsilon(t)\big\rangle_H\,dt\\
&=\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\big\langle b_1(t)X_1^\varepsilon(t),P(t)^*\delta b(t)\big\rangle_H\,dt+\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\big\langle P(t)\delta b(t),b_1(t)X_1^\varepsilon(t)\big\rangle_H\,dt\\
&\quad+\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\big\langle P(t)\delta b(t),\delta b(t)\big\rangle_H\,dt+\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\big\langle\delta b(t),\widehat Q^{(0)}(0,0,\chi_{E_\varepsilon}\delta b)(t)\big\rangle_H\,dt\\
&\quad+\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\big\langle Q^{(0)}(0,0,\chi_{E_\varepsilon}\delta b)(t),\delta b(t)\big\rangle_H\,dt.
\end{aligned}
\tag{10.47}
\]
Now, we estimate the terms on the right-hand side of (10.47). By (10.41), we have, as $\varepsilon\to0$,
\[
\Big|\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\big\langle b_1(t)X_1^\varepsilon(t),P(t)^*\delta b(t)\big\rangle_H\,dt+\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\big\langle P(t)\delta b(t),b_1(t)X_1^\varepsilon(t)\big\rangle_H\,dt\Big|=o(\varepsilon).
\tag{10.48}
\]
In what follows, for any $\tau\in[0,T)$, we choose $E_\varepsilon=[\tau,\tau+\varepsilon]\subset[0,T]$.

By Lemma 10.2, we can find a sequence $\{\beta_n\}_{n=1}^\infty$ such that $\beta_n\in\mathcal H_n$ (recall (10.29) for the definition of $\mathcal H_n$) and
\[
\lim_{n\to\infty}\beta_n=\delta b\quad\text{in }L^2_{\mathbb F}(0,T;L^4(\Omega;H)).
\]
Hence,
\[
|\beta_n|_{L^2_{\mathbb F}(0,T;L^4(\Omega;H))}\le C(X_0)<\infty,\quad\forall\,n\in\mathbb N,
\tag{10.49}
\]
and there is a subsequence $\{n_k\}_{k=1}^\infty\subset\mathbb N$ such that
\[
\lim_{k\to\infty}|\beta_{n_k}(t)-\delta b(t)|_{L^4_{\mathcal F_t}(\Omega;H)}=0\quad\text{for a.e. }t\in[0,T].
\tag{10.50}
\]
Denote by $Q^{n_k}$ and $\widehat Q^{n_k}$ the corresponding pointwise defined, linear operators from $\mathcal H_{n_k}$ to $L^2_{\mathbb F}(0,T;L^{4/3}(\Omega;H))$, given in Lemma 10.3.

Consider the following SEE:
\[
\begin{cases}
dX_1^{\varepsilon,n_k}=\big(AX_1^{\varepsilon,n_k}+a_1(t)X_1^{\varepsilon,n_k}\big)dt+\big(b_1(t)X_1^{\varepsilon,n_k}+\chi_{E_\varepsilon}(t)\beta_{n_k}(t)\big)dW(t) &\text{in }(0,T],\\
X_1^{\varepsilon,n_k}(0)=0.
\end{cases}
\tag{10.51}
\]
Applying Theorem 1.17 to (10.51) and using (10.49), we obtain that
\[
|X_1^{\varepsilon,n_k}(\cdot)|^2_{C_{\mathbb F}([0,T];L^4(\Omega;H))}\le C\int_{E_\varepsilon}|\beta_{n_k}(s)|^2_{L^4_{\mathcal F_s}(\Omega;H)}\,ds\le C(X_0,k)\,\varepsilon.
\tag{10.52}
\]
Here and henceforth, $C(X_0,k)$ is a generic constant (depending on $X_0$, $k$, $T$ and $A$), which may be different from line to line.

For any fixed $k\in\mathbb N$, since $Q^{n_k}\beta_{n_k}\in L^2_{\mathbb F}(0,T;L^{4/3}(\Omega;H))$, by (10.52), we find that, as $\varepsilon\to0$,
\[
\begin{aligned}
\Big|\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\big\langle\big(Q^{n_k}\beta_{n_k}\big)(t),X_1^{\varepsilon,n_k}(t)\big\rangle_H\,dt\Big|
&\le|X_1^{\varepsilon,n_k}(\cdot)|_{L^\infty_{\mathbb F}(0,T;L^4(\Omega;H))}\int_{E_\varepsilon}\big|\big(Q^{n_k}\beta_{n_k}\big)(t)\big|_{L^{4/3}_{\mathcal F_t}(\Omega;H)}\,dt\\
&\le C(X_0,k)\sqrt{\varepsilon}\int_{E_\varepsilon}\big|\big(Q^{n_k}\beta_{n_k}\big)(t)\big|_{L^{4/3}_{\mathcal F_t}(\Omega;H)}\,dt=o(\varepsilon).
\end{aligned}
\tag{10.53}
\]
Similarly, as $\varepsilon\to0$,
\[
\Big|\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\big\langle X_1^{\varepsilon,n_k}(t),\big(\widehat Q^{n_k}\beta_{n_k}\big)(t)\big\rangle_H\,dt\Big|=o(\varepsilon).
\tag{10.54}
\]
From (10.30) in Lemma 10.3, and noting that both $Q^{n_k}$ and $\widehat Q^{n_k}$ are pointwise defined, we get
\[
\begin{aligned}
&\mathbb E\int_0^T\big\langle\chi_{E_\varepsilon}(t)\beta_{n_k}(t),\widehat Q^{(0)}(0,0,\chi_{E_\varepsilon}\beta_{n_k})(t)\big\rangle_H\,dt+\mathbb E\int_0^T\big\langle Q^{(0)}(0,0,\chi_{E_\varepsilon}\beta_{n_k})(t),\chi_{E_\varepsilon}(t)\beta_{n_k}(t)\big\rangle_H\,dt\\
&=\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\Big[\big\langle\big(Q^{n_k}\beta_{n_k}\big)(t),X_1^{\varepsilon,n_k}(t)\big\rangle_H+\big\langle X_1^{\varepsilon,n_k}(t),\big(\widehat Q^{n_k}\beta_{n_k}\big)(t)\big\rangle_H\Big]dt.
\end{aligned}
\tag{10.55}
\]
From (10.50) and the density of the Lebesgue points, we find that, for a.e. $\tau\in[0,T)$,
\[
\begin{aligned}
&\lim_{k\to\infty}\varlimsup_{\varepsilon\to0}\frac1\varepsilon\Big|\mathbb E\int_0^T\big\langle\chi_{E_\varepsilon}(t)\delta b(t),\widehat Q^{(0)}(0,0,\chi_{E_\varepsilon}\delta b)(t)\big\rangle_H\,dt-\mathbb E\int_0^T\big\langle\chi_{E_\varepsilon}(t)\delta b(t),\widehat Q^{(0)}(0,0,\chi_{E_\varepsilon}\beta_{n_k})(t)\big\rangle_H\,dt\Big|\\
&\le C\lim_{k\to\infty}\varlimsup_{\varepsilon\to0}\frac1\varepsilon\,\big|\chi_{E_\varepsilon}\delta b\big|_{L^2_{\mathbb F}(0,T;L^4(\Omega;H))}\,\big|\widehat Q^{(0)}(0,0,\chi_{E_\varepsilon}(\delta b-\beta_{n_k}))\big|_{L^2_{\mathbb F}(0,T;L^{4/3}(\Omega;H))}\\
&\le C\lim_{k\to\infty}\varlimsup_{\varepsilon\to0}\frac1\varepsilon\,\big|\chi_{E_\varepsilon}\delta b\big|_{L^2_{\mathbb F}(0,T;L^4(\Omega;H))}\,\big|\chi_{E_\varepsilon}(\delta b-\beta_{n_k})\big|_{L^2_{\mathbb F}(0,T;L^4(\Omega;H))}\\
&\le C\lim_{k\to\infty}\varlimsup_{\varepsilon\to0}\frac1\varepsilon\Big(\sqrt\varepsilon\,|\delta b(\tau)|_{L^4_{\mathcal F_\tau}(\Omega;H)}\Big)\Big(\int_\tau^{\tau+\varepsilon}|\delta b(t)-\beta_{n_k}(t)|^2_{L^4_{\mathcal F_t}(\Omega;H)}\,dt\Big)^{1/2}\\
&=C\lim_{k\to\infty}|\delta b(\tau)|_{L^4_{\mathcal F_\tau}(\Omega;H)}\,|\delta b(\tau)-\beta_{n_k}(\tau)|_{L^4_{\mathcal F_\tau}(\Omega;H)}=0.
\end{aligned}
\tag{10.56}
\]
Similarly,
\[
\begin{aligned}
&\lim_{k\to\infty}\varlimsup_{\varepsilon\to0}\frac1\varepsilon\Big|\mathbb E\int_0^T\big\langle\chi_{E_\varepsilon}(t)\delta b(t),\widehat Q^{(0)}(0,0,\chi_{E_\varepsilon}\beta_{n_k})(t)\big\rangle_H\,dt-\mathbb E\int_0^T\big\langle\chi_{E_\varepsilon}(t)\beta_{n_k}(t),\widehat Q^{(0)}(0,0,\chi_{E_\varepsilon}\beta_{n_k})(t)\big\rangle_H\,dt\Big|\\
&\le C\lim_{k\to\infty}\varlimsup_{\varepsilon\to0}\frac1\varepsilon\,\big|\widehat Q^{(0)}(0,0,\chi_{E_\varepsilon}\beta_{n_k})\big|_{L^2_{\mathbb F}(0,T;L^{4/3}(\Omega;H))}\,\big|\chi_{E_\varepsilon}(\delta b-\beta_{n_k})\big|_{L^2_{\mathbb F}(0,T;L^4(\Omega;H))}\\
&\le C\lim_{k\to\infty}\varlimsup_{\varepsilon\to0}\frac1\varepsilon\,\big|\chi_{E_\varepsilon}\beta_{n_k}\big|_{L^2_{\mathbb F}(0,T;L^4(\Omega;H))}\,\big|\chi_{E_\varepsilon}(\delta b-\beta_{n_k})\big|_{L^2_{\mathbb F}(0,T;L^4(\Omega;H))}\\
&\le C\lim_{k\to\infty}\varlimsup_{\varepsilon\to0}\frac1\varepsilon\Big(\big|\chi_{E_\varepsilon}\delta b\big|_{L^2_{\mathbb F}(0,T;L^4(\Omega;H))}+\big|\chi_{E_\varepsilon}(\delta b-\beta_{n_k})\big|_{L^2_{\mathbb F}(0,T;L^4(\Omega;H))}\Big)\big|\chi_{E_\varepsilon}(\delta b-\beta_{n_k})\big|_{L^2_{\mathbb F}(0,T;L^4(\Omega;H))}\\
&\le C\lim_{k\to\infty}\Big[|\delta b(\tau)|_{L^4_{\mathcal F_\tau}(\Omega;H)}\,|\delta b(\tau)-\beta_{n_k}(\tau)|_{L^4_{\mathcal F_\tau}(\Omega;H)}+|\delta b(\tau)-\beta_{n_k}(\tau)|^2_{L^4_{\mathcal F_\tau}(\Omega;H)}\Big]=0.
\end{aligned}
\tag{10.57}
\]
From (10.56)–(10.57), we find that
\[
\lim_{k\to\infty}\varlimsup_{\varepsilon\to0}\frac1\varepsilon\Big|\mathbb E\int_0^T\big\langle\chi_{E_\varepsilon}(t)\delta b(t),\widehat Q^{(0)}(0,0,\chi_{E_\varepsilon}\delta b)(t)\big\rangle_H\,dt-\mathbb E\int_0^T\big\langle\chi_{E_\varepsilon}(t)\beta_{n_k}(t),\widehat Q^{(0)}(0,0,\chi_{E_\varepsilon}\beta_{n_k})(t)\big\rangle_H\,dt\Big|=0.
\tag{10.58}
\]
By a similar argument, we get that
\[
\lim_{k\to\infty}\varlimsup_{\varepsilon\to0}\frac1\varepsilon\Big|\mathbb E\int_0^T\big\langle Q^{(0)}(0,0,\chi_{E_\varepsilon}\delta b)(t),\chi_{E_\varepsilon}(t)\delta b(t)\big\rangle_H\,dt-\mathbb E\int_0^T\big\langle Q^{(0)}(0,0,\chi_{E_\varepsilon}\beta_{n_k})(t),\chi_{E_\varepsilon}(t)\beta_{n_k}(t)\big\rangle_H\,dt\Big|=0.
\tag{10.59}
\]
From (10.53)–(10.55) and (10.58)–(10.59), we obtain that, as $\varepsilon\to0$,
\[
\Big|\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\big\langle\delta b(t),\widehat Q^{(0)}(0,0,\chi_{E_\varepsilon}\delta b)(t)\big\rangle_H\,dt+\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\big\langle Q^{(0)}(0,0,\chi_{E_\varepsilon}\delta b)(t),\delta b(t)\big\rangle_H\,dt\Big|=o(\varepsilon).
\tag{10.60}
\]
Combining (10.46)–(10.48) and (10.60), we end up with
\[
J(u^\varepsilon(\cdot))-J(\bar u(\cdot))=\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\Big(\delta g(t)-\big\langle Y(t),\delta a(t)\big\rangle_H-\big\langle Z(t),\delta b(t)\big\rangle_H-\frac12\big\langle P(t)\delta b(t),\delta b(t)\big\rangle_H\Big)dt+o(\varepsilon),\quad\text{as }\varepsilon\to0.
\]
Since $\bar u(\cdot)$ is the optimal control, $J(u^\varepsilon(\cdot))-J(\bar u(\cdot))\ge0$.
Thus, as $\varepsilon\to0$,
\[
\mathbb E\int_0^T\chi_{E_\varepsilon}(t)\Big(\big\langle Y(t),\delta a(t)\big\rangle_H+\big\langle Z(t),\delta b(t)\big\rangle_H-\delta g(t)+\frac12\big\langle P(t)\delta b(t),\delta b(t)\big\rangle_H\Big)dt\le o(\varepsilon).
\tag{10.61}
\]

Step 3. We are now in a position to complete the proof.

Since $L^1_{\mathcal F_T}(\Omega)$ is separable, for any $t\in[0,T]$, $\mathcal F_t$ is countably generated by a sequence $\{M_k\}_{k=1}^\infty\subset\mathcal F_t$; that is, for any $M\in\mathcal F_t$, there exists a subsequence $\{M_{k_j}\}_{j=1}^\infty\subset\{M_k\}_{k=1}^\infty$ such that
\[
\lim_{j\to\infty}\mathbb P\big((M\setminus M_{k_j})\cup(M_{k_j}\setminus M)\big)=0.
\]
Denote by $\{t_i\}_{i=1}^\infty$ the sequence of rational numbers in $[0,T)$, and by $\{v_i\}_{i=1}^\infty$ a dense subset of $U$. For each $i\in\mathbb N$, we choose $\{M_{ij}\}_{j=1}^\infty\subset\mathcal F_{t_i}$ to be a sequence which generates $\mathcal F_{t_i}$.

Fix $i,j,k\in\mathbb N$ arbitrarily. For any $\tau\in[t_i,T)$ and $\theta\in(0,T-\tau)$, write $E_\theta^i=[\tau,\tau+\theta)$. Put
\[
u^{k,\theta}_{ij}=\begin{cases}
v_k, &\text{if }(t,\omega)\in E_\theta^i\times M_{ij},\\
\bar u(t,\omega), &\text{if }(t,\omega)\in([0,T]\times\Omega)\setminus(E_\theta^i\times M_{ij}).
\end{cases}
\]
Clearly, $u^{k,\theta}_{ij}\in\mathcal U[0,T]$ and
\[
u^{k,\theta}_{ij}(t,\omega)-\bar u(t,\omega)=\big(v_k-\bar u(t,\omega)\big)\chi_{M_{ij}}(\omega)\chi_{E_\theta^i}(t),\quad(t,\omega)\in[0,T]\times\Omega.
\]
From (10.61), we know that
\[
\begin{aligned}
&\mathbb E\int_\tau^{\tau+\theta}\Big(\mathbb H\big(t,\bar X(t),\bar u(t),Y(t),Z(t)\big)-\mathbb H\big(t,\bar X(t),u^{k,\theta}_{ij}(t),Y(t),Z(t)\big)\\
&\qquad-\frac12\big\langle P(t)\big(b\big(t,\bar X(t),\bar u(t)\big)-b\big(t,\bar X(t),u^{k,\theta}_{ij}(t)\big)\big),\,b\big(t,\bar X(t),\bar u(t)\big)-b\big(t,\bar X(t),u^{k,\theta}_{ij}(t)\big)\big\rangle_H\Big)dt\ge o(\theta).
\end{aligned}
\tag{10.62}
\]
Dividing both sides of (10.62) by $\theta$ and letting $\theta\to0^+$, by the property of Lebesgue points, we conclude that for any $i,j,k\in\mathbb N$, there exists a set $E^k_{i,j}\subset[t_i,T)$ of zero Lebesgue measure such that, for all $t\in[t_i,T)\setminus E^k_{i,j}$,
\[
\begin{aligned}
\mathbb E\Big[\Big(&\mathbb H\big(t,\bar X(t),\bar u(t),Y(t),Z(t)\big)-\mathbb H\big(t,\bar X(t),v_k,Y(t),Z(t)\big)\\
&-\frac12\big\langle P(t)\big(b\big(t,\bar X(t),\bar u(t)\big)-b\big(t,\bar X(t),v_k\big)\big),\,b\big(t,\bar X(t),\bar u(t)\big)-b\big(t,\bar X(t),v_k\big)\big\rangle_H\Big)\chi_{M_{ij}}\Big]\ge0.
\end{aligned}
\]
Let
\[
E_0=\bigcup_{i,j,k\in\mathbb N}E^k_{i,j}.
\]
Then the Lebesgue measure of $E_0$ is zero, and for any $i,j,k\in\mathbb N$,
\[
\begin{aligned}
\mathbb E\Big[\Big(&\mathbb H\big(t,\bar X(t),\bar u(t),Y(t),Z(t)\big)-\mathbb H\big(t,\bar X(t),v_k,Y(t),Z(t)\big)\\
&-\frac12\big\langle P(t)\big(b\big(t,\bar X(t),\bar u(t)\big)-b\big(t,\bar X(t),v_k\big)\big),\,b\big(t,\bar X(t),\bar u(t)\big)-b\big(t,\bar X(t),v_k\big)\big\rangle_H\Big)\chi_{M_{ij}}\Big]\ge0,\quad\forall\,t\in[t_i,T)\setminus E_0.
\end{aligned}
\]
This, together with the construction of $\{M_{ij}\}_{j=1}^\infty$, the right continuity of the filtration $\mathbb F$ and the density of $\{v_k\}_{k=1}^\infty$, implies that (10.34) holds. This completes the proof of Theorem 10.3.
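The spike (needle) variation in Step 1 has a simple deterministic counterpart that can be run numerically. The sketch below is entirely our own finite-dimensional illustration (not from the text): the reference control is perturbed only on a small set $E_\theta=[\tau,\tau+\theta)$, the state deviation is of order $\theta$, and the difference quotients $(J(u^\theta)-J(\bar u))/\theta$ stabilise as $\theta\to0^+$, which is the mechanism behind passing from (10.61) to (10.62) and then to the pointwise inequality. (In the stochastic case the first order state correction is only $O(\sqrt\theta)$, through the $dW$ term, which is why the second order expansion and the operator $P$ are needed.)

```python
import numpy as np

# Toy problem (our own choice): dx/dt = -x + u, x(0) = 1, J(u) = int_0^1 (x^2 + u^2) dt,
# reference control u_bar = 0, spike value v = 1 on E_theta = [tau, tau + theta).
h = 1e-4
t = np.arange(0.0, 1.0, h)

def cost(u):
    x = np.empty_like(t); x[0] = 1.0
    for n in range(len(t) - 1):          # explicit Euler for the state equation
        x[n + 1] = x[n] + h * (-x[n] + u[n])
    return float(np.sum(x**2 + u**2) * h), x

u_bar = np.zeros_like(t)
J_bar, x_bar = cost(u_bar)

tau, v = 0.2, 1.0
ratios = []
for theta in (0.04, 0.01):
    u_theta = u_bar.copy()
    u_theta[(t >= tau) & (t < tau + theta)] = v   # the needle variation
    J_theta, x_theta = cost(u_theta)
    # The state shift caused by the spike is O(theta) in the deterministic case.
    assert np.max(np.abs(x_theta - x_bar)) <= theta + 1e-12
    ratios.append((J_theta - J_bar) / theta)

# The difference quotients approach a fixed limit (the Hamiltonian increment at tau).
assert abs(ratios[0] - ratios[1]) < 0.2
```

Shrinking $\theta$ further confirms the convergence; the limiting value is the pointwise cost increment that the maximum principle forces to have a definite sign at Lebesgue points.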
11 A sufficient condition for an optimal control
Both Theorems 10.1 and 10.3 provide first order necessary conditions for optimal controls of Problem (OP). In this section, we shall consider very quickly the related first order sufficient (optimality) condition. For simplicity, we assume that the control region $U$ is a convex subset of the separable Hilbert space $H_1$ appearing in (AS3) (in Subsection 10.2), and that $L^p_{\mathcal F_T}(\Omega)$ ($p\ge1$) is separable. The finite dimensional version of this result was obtained in [82], and we borrow some ideas from that paper.
In this subsection, as preliminaries, we recall the definition and some basic properties of Clarke's generalized gradient. A detailed introduction to this topic can be found in [6].

In the rest of this subsection, $O$ is a domain in $H$ and $\varphi:O\to\mathbb R$ is a locally Lipschitz continuous function.

Definition 11.1
For any $x\in O$, the (Clarke) generalized gradient of $\varphi$ at $x$ is defined by
\[
\partial\varphi(x)\triangleq\Big\{\xi\in H\ \Big|\ \langle\xi,y\rangle_H\le\varlimsup_{z\to x,\,z\in O,\,r\to0^+}\frac{\varphi(z+ry)-\varphi(z)}{r},\ \forall\,y\in H\Big\}.
\tag{11.1}
\]
The following lemma provides some properties of the Clarke generalized gradient.

Lemma 11.1 The following results hold:

1) $\partial\varphi(x)$ is a nonempty, bounded, convex subset of $H$;

2) $\partial(-\varphi)(x)=-\partial\varphi(x)$;
3) $0\in\partial\varphi(x)$ if $\varphi$ attains a local minimum or maximum at $x$.

Proof. 1) The boundedness of $\partial\varphi(x)$ follows immediately from (11.1). We only need to show the convexity and non-emptiness. Since $\varphi:O\to\mathbb R$ is locally Lipschitz continuous, for the given $x\in O$ and a small $\delta>0$ with
\[
B_\delta(x)\triangleq\big\{z\in O\ \big|\ |z-x|_H\le\delta\big\}\subseteq O,
\]
there exists a constant $C>0$ such that
\[
|\varphi(z)-\varphi(\widehat z)|\le C|z-\widehat z|_H,\quad\forall\,z,\widehat z\in B_\delta(x).
\]
Thus, for any fixed $y\in H$ and $r>0$ small enough,
\[
\frac{|\varphi(z+ry)-\varphi(z)|}{r}\le C|y|_H,\quad\forall\,z\in B_{\delta/2}(x).
\]
Hence, the following functional is well-defined:
\[
\varphi^0(x;y)\triangleq\varlimsup_{z\to x,\,z\in O,\,r\to0^+}\frac{\varphi(z+ry)-\varphi(z)}{r}.
\tag{11.2}
\]
It is easy to check that
\[
\varphi^0(x;\lambda y)=\lambda\varphi^0(x;y),\quad\forall\,y\in H,\ \lambda\ge0,
\tag{11.3}
\]
\[
\varphi^0(x;y+z)\le\varphi^0(x;y)+\varphi^0(x;z),\quad\forall\,y,z\in H.
\tag{11.4}
\]
Thus, $y\mapsto\varphi^0(x;y)$ is a convex function. Also, (11.4) implies that
\[
-\varphi^0(x;y)\le\varphi^0(x;-y),\quad\forall\,y\in H.
\tag{11.5}
\]
Next, we fix a $z_0\in H$ and define $F:\{\lambda z_0\,|\,\lambda\in\mathbb R\}\to\mathbb R$ by
\[
F(\lambda z_0)=\lambda\varphi^0(x;z_0),\quad\forall\,\lambda\in\mathbb R.
\]
Then, for $\lambda\ge0$,
\[
\begin{cases}
F(\lambda z_0)\equiv\lambda\varphi^0(x;z_0)=\varphi^0(x;\lambda z_0),\\
F(-\lambda z_0)\equiv-\lambda\varphi^0(x;z_0)=-\varphi^0(x;\lambda z_0)\le\varphi^0(x;-\lambda z_0),
\end{cases}
\]
which implies
\[
F(\lambda z_0)\le\varphi^0(x;\lambda z_0),\quad\forall\,\lambda\in\mathbb R.
\tag{11.6}
\]
Therefore, $F$ is a linear functional defined on the linear space spanned by $z_0$, and it is dominated by the convex function $\varphi^0(x;\cdot)$. By the Hahn–Banach theorem (i.e., Theorem 1.3), $F$ can be extended to a bounded linear functional on $H$. Then, by the classical Riesz representation theorem (i.e., Theorem 1.4), there exists $\xi\in H$ such that
\[
\begin{cases}
\langle\xi,\lambda z_0\rangle_H=F(\lambda z_0)\equiv\lambda\varphi^0(x;z_0),\quad\forall\,\lambda\in\mathbb R,\\
\langle\xi,y\rangle_H\le\varphi^0(x;y),\quad\forall\,y\in H.
\end{cases}
\tag{11.7}
\]
This implies $\xi\in\partial\varphi(x)$. Consequently, $\partial\varphi(x)$ is nonempty. The convexity of $\partial\varphi(x)$ follows from (11.1), since $\partial\varphi(x)$ is an intersection of half-spaces.

2) It follows from (11.2) that
\[
(-\varphi)^0(x;y)=\varlimsup_{z\to x,\,r\to0^+}\frac{-\varphi(z+ry)+\varphi(z)}{r}=\varlimsup_{z'\to x,\,r\to0^+}\frac{-\varphi(z')+\varphi(z'-ry)}{r}=\varphi^0(x;-y).
\]
Thus, $\xi\in\partial(-\varphi)(x)$ if and only if
\[
\langle-\xi,y\rangle_H=\langle\xi,-y\rangle_H\le(-\varphi)^0(x;-y)=\varphi^0(x;y),\quad\forall\,y\in H,
\]
which is equivalent to $-\xi\in\partial\varphi(x)$.

3) Suppose that $\varphi$ attains a local minimum at $x$. Then
\[
\varphi^0(x;y)=\varlimsup_{z\to x,\,r\to0^+}\frac{\varphi(z+ry)-\varphi(z)}{r}\ge\varlimsup_{r\to0^+}\frac{\varphi(x+ry)-\varphi(x)}{r}\ge0=\langle0,y\rangle_H,\quad\forall\,y\in H.
\]
This implies $0\in\partial\varphi(x)$.

If $\varphi$ attains a local maximum at $x$, then the conclusion follows from the fact that $-\varphi$ attains a local minimum at $x$, together with 2).

By fixing some arguments of a function, one may naturally define its partial generalized gradient. For example, if $\psi:H\times H_1\to\mathbb R$ is locally Lipschitz, by $\partial_x\psi(x,u)$ (resp. $\partial_u\psi(x,u)$) we mean the partial generalized gradient of $\psi$ in $x$ (resp. in $u$) at $(x,u)\in H\times U$; similarly, $\partial_{x,u}\psi(x,u)$ denotes the generalized gradient of $\psi$ in the pair $(x,u)$.

Next, we present a technical lemma.

Lemma 11.2
Let $\psi$ be a convex or concave function on $H\times H_1$. Assume that $\psi(\cdot,u)$ is differentiable and that $\psi_x(\cdot,\cdot)$ is continuous. Then
\[
\big\{(\psi_x(\hat x,\hat u),r)\ \big|\ r\in\partial_u\psi(\hat x,\hat u)\big\}\subseteq\partial_{x,u}\psi(\hat x,\hat u),\quad\forall\,(\hat x,\hat u)\in H\times U.
\tag{11.8}
\]

Proof. We first handle the case that $\psi$ is convex. Fix $r\in\partial_u\psi(\hat x,\hat u)$. For any $\xi\in H$ and $u\in H_1$, we choose a sequence $\{(x_j,\delta_j)\}_{j=1}^\infty\subset H\times\mathbb R^+$ as follows:
\[
(x_j,\hat u)\in H\times U,\quad(x_j+\delta_j\xi,\hat u+\delta_ju)\in H\times U,\quad\delta_j\to0^+\ \text{as }j\to\infty,\quad\text{and}\quad|x_j-\hat x|_H\le\delta_j^2.
\]
By the convexity of $\psi$, we have
\[
\varliminf_{j\to\infty}\frac{\psi(x_j+\delta_j\xi,\hat u+\delta_ju)-\psi(\hat x,\hat u+\delta_ju)}{\delta_j}\ge\varliminf_{j\to\infty}\frac{\langle\psi_x(\hat x,\hat u+\delta_ju),\,x_j-\hat x+\delta_j\xi\rangle_H}{\delta_j}=\langle\psi_x(\hat x,\hat u),\xi\rangle_H.
\tag{11.9}
\]
Similarly,
\[
\varliminf_{j\to\infty}\frac{\psi(\hat x,\hat u+\delta_ju)-\psi(\hat x,\hat u)}{\delta_j}\ge\langle r,u\rangle_{H_1}.
\tag{11.10}
\]
Also,
\[
\varliminf_{j\to\infty}\frac{\psi(\hat x,\hat u)-\psi(x_j,\hat u)}{\delta_j}\ge\varliminf_{j\to\infty}\frac{\langle\psi_x(x_j,\hat u),\,\hat x-x_j\rangle_H}{\delta_j}=0.
\tag{11.11}
\]
It follows from (11.9)–(11.11) that
\[
\varliminf_{j\to\infty}\frac{\psi(x_j+\delta_j\xi,\hat u+\delta_ju)-\psi(x_j,\hat u)}{\delta_j}\ge\langle\psi_x(\hat x,\hat u),\xi\rangle_H+\langle r,u\rangle_{H_1}.
\]
This, together with the definition of the generalized gradient, implies that $(\psi_x(\hat x,\hat u),r)\in\partial_{x,u}\psi(\hat x,\hat u)$.

If $\psi$ is concave, the desired result follows immediately by noting that $-\psi$ is convex and using the conclusion 2) in Lemma 11.1.

Let $\bar u(\cdot)$ be an admissible control and $\bar X(\cdot)$ be the corresponding state.
Let $\big(Y(\cdot),Z(\cdot)\big)$ be the transposition solution to (10.7) with $Y_T$ and $f(\cdot,\cdot,\cdot)$ given by
\[
\begin{cases}
Y_T=-h_x\big(\bar X(T)\big),\\
f(t,y_1,y_2)=-a_x(t,\bar X(t),\bar u(t))^*y_1-b_x\big(t,\bar X(t),\bar u(t)\big)^*y_2+g_x\big(t,\bar X(t),\bar u(t)\big),
\end{cases}
\tag{11.12}
\]
and let $(P(\cdot),Q^{(\cdot)},\widehat Q^{(\cdot)})$ be the relaxed transposition solution to the equation (10.26) in which $F(\cdot)$, $J(\cdot)$, $K(\cdot)$ and $P_T$ are given by
\[
\begin{cases}
F(t)=-\mathbb H_{xx}\big(t,\bar X(t),\bar u(t),Y(t),Z(t)\big),\quad P_T=-h_{xx}\big(\bar X(T)\big),\\
K(t)=b_x(t,\bar X(t),\bar u(t)),\quad J(t)=a_x(t,\bar X(t),\bar u(t)).
\end{cases}
\tag{11.13}
\]
Recall (10.32) for the definition of $\mathbb H$ and let
\[
\begin{aligned}
\mathscr H(t,\xi,\eta)\triangleq\ &\mathbb H\big(t,\xi,\eta,Y(t),Z(t)\big)+\frac12\big\langle P(t)b(t,\xi,\eta),b(t,\xi,\eta)\big\rangle_H\\
&-\big\langle P(t)b\big(t,\bar X(t),\bar u(t)\big),b(t,\xi,\eta)\big\rangle_H,\quad\forall\,(t,\xi,\eta)\in[0,T]\times H\times U.
\end{aligned}
\]
Suppose that (AS1)–(AS3) and (AS5) hold. Then, for a.e. $(t,\omega)\in[0,T]\times\Omega$,
$$\partial_u H\big(t,\bar X(t),\bar u(t),Y(t),Z(t)\big)=\partial_u\mathbb H\big(t,\bar X(t),\bar u(t)\big).\eqno(11.14)$$

Proof. Fix a $t\in[0,T]$. Let
$$\widetilde H(\eta)\triangleq H\big(t,\bar X(t),\eta,Y(t),Z(t)\big),\quad \widetilde{\mathbb H}(\eta)\triangleq\mathbb H\big(t,\bar X(t),\eta\big),\quad \tilde b(\eta)\triangleq b\big(t,\bar X(t),\eta\big),$$
$$\tilde\psi(\eta)\triangleq\frac12\big\langle P(t)\tilde b(\eta),\tilde b(\eta)\big\rangle_H-\big\langle P(t)\tilde b(\bar u(t)),\tilde b(\eta)\big\rangle_H.$$
Then $\widetilde{\mathbb H}(\eta)=\widetilde H(\eta)+\tilde\psi(\eta)$. Note that, for $r\to0^+$ and $\eta,\,\eta+r\kappa\in U$ with $\eta\to\bar u(t)$,
$$\tilde\psi(\eta+r\kappa)-\tilde\psi(\eta)=\frac12\big\langle P(t)\big[\tilde b(\eta+r\kappa)+\tilde b(\eta)-2\tilde b(\bar u(t))\big],\,\tilde b(\eta+r\kappa)-\tilde b(\eta)\big\rangle_H=o(r).$$
Hence,
$$\limsup_{\substack{\eta\to\bar u(t),\,\eta\in U\\ r\to0^+}}\frac{\widetilde{\mathbb H}(\eta+r\kappa)-\widetilde{\mathbb H}(\eta)}{r}=\limsup_{\substack{\eta\to\bar u(t),\,\eta\in U\\ r\to0^+}}\frac{\widetilde H(\eta+r\kappa)-\widetilde H(\eta)}{r}.$$
Consequently, by (11.1), the desired result (11.14) follows.

Now we present the following sufficient condition for optimality for Problem (OP).
Theorem 11.1
Let (AS1)–(AS3) and (AS5) hold. Suppose that $h(\cdot)$ is convex, that $H(t,\cdot,\cdot,Y(t),Z(t))$ is concave for all $t\in[0,T]$, a.s., and that
$$\mathbb H\big(t,\bar X(t),\bar u(t)\big)=\max_{u\in U}\mathbb H\big(t,\bar X(t),u\big),\quad\text{a.e. }(t,\omega)\in[0,T]\times\Omega.\eqno(11.15)$$
Then $(\bar X(\cdot),\bar u(\cdot))$ is an optimal pair of Problem (OP).

Proof. By the maximum condition (11.15), Lemma 11.3 and the assertion 3) in Lemma 11.1, we have
$$0\in\partial_u\mathbb H\big(t,\bar X(t),\bar u(t)\big)=\partial_u H\big(t,\bar X(t),\bar u(t),Y(t),Z(t)\big).\eqno(11.16)$$
Hence, by Lemma 11.2, we conclude that
$$\Big(H_x\big(t,\bar X(t),\bar u(t),Y(t),Z(t)\big),\,0\Big)\in\partial_{x,u}H\big(t,\bar X(t),\bar u(t),Y(t),Z(t)\big).\eqno(11.17)$$
This, together with the concavity of $H(t,\cdot,\cdot,Y(t),Z(t))$, implies that
$$\int_0^T\Big(H\big(t,\widetilde X(t),\tilde u(t),Y(t),Z(t)\big)-H\big(t,\bar X(t),\bar u(t),Y(t),Z(t)\big)\Big)dt\le\int_0^T\Big\langle H_x\big(t,\bar X(t),\bar u(t),Y(t),Z(t)\big),\,\widetilde X(t)-\bar X(t)\Big\rangle_H dt\eqno(11.18)$$
for any admissible pair $(\widetilde X(\cdot),\tilde u(\cdot))$. Let $\xi(t)\triangleq\widetilde X(t)-\bar X(t)$. Then $\xi(\cdot)$ solves
$$\begin{cases}d\xi(t)=\big[A\xi(t)+a_x\big(t,\bar X(t),\bar u(t)\big)\xi(t)+\alpha(t)\big]dt+\big[b_x\big(t,\bar X(t),\bar u(t)\big)\xi(t)+\beta(t)\big]dW(t)&\text{in }(0,T],\\ \xi(0)=0,\end{cases}\eqno(11.19)$$
where
$$\alpha(t)\triangleq-a_x\big(t,\bar X(t),\bar u(t)\big)\xi(t)+a\big(t,\widetilde X(t),\tilde u(t)\big)-a\big(t,\bar X(t),\bar u(t)\big),$$
$$\beta(t)\triangleq-b_x\big(t,\bar X(t),\bar u(t)\big)\xi(t)+b\big(t,\widetilde X(t),\tilde u(t)\big)-b\big(t,\bar X(t),\bar u(t)\big).$$
It follows from (11.18), (11.19) and the definition of the transposition solution to (10.7) that
$$\begin{aligned}
\mathbb E\big\langle h_x(\bar X(T)),\xi(T)\big\rangle_H
&=-\mathbb E\big\langle Y(T),\xi(T)\big\rangle_H+\mathbb E\big\langle Y(0),\xi(0)\big\rangle_H\\
&=-\mathbb E\int_0^T\Big[\big\langle g_x(t,\bar X(t),\bar u(t)),\xi(t)\big\rangle_H+\big\langle Y(t),\alpha(t)\big\rangle_H+\big\langle Z(t),\beta(t)\big\rangle_H\Big]dt\\
&=\mathbb E\int_0^T\big\langle H_x(t,\bar X(t),\bar u(t),Y(t),Z(t)),\xi(t)\big\rangle_H dt\\
&\quad-\mathbb E\int_0^T\Big(\big\langle Y(t),a(t,\widetilde X(t),\tilde u(t))-a(t,\bar X(t),\bar u(t))\big\rangle_H+\big\langle Z(t),b(t,\widetilde X(t),\tilde u(t))-b(t,\bar X(t),\bar u(t))\big\rangle_H\Big)dt\\
&\ge\mathbb E\int_0^T\Big(H(t,\widetilde X(t),\tilde u(t),Y(t),Z(t))-H(t,\bar X(t),\bar u(t),Y(t),Z(t))\Big)dt\\
&\quad-\mathbb E\int_0^T\Big(\big\langle Y(t),a(t,\widetilde X(t),\tilde u(t))-a(t,\bar X(t),\bar u(t))\big\rangle_H+\big\langle Z(t),b(t,\widetilde X(t),\tilde u(t))-b(t,\bar X(t),\bar u(t))\big\rangle_H\Big)dt\\
&=-\mathbb E\int_0^T\Big(g(t,\widetilde X(t),\tilde u(t))-g(t,\bar X(t),\bar u(t))\Big)dt.
\end{aligned}\eqno(11.20)$$
On the other hand, the convexity of $h$ implies that
$$\mathbb E\big\langle h_x(\bar X(T)),\xi(T)\big\rangle_H\le\mathbb E\,h\big(\widetilde X(T)\big)-\mathbb E\,h\big(\bar X(T)\big).$$
This, together with (11.20), yields $J(\bar u(\cdot))\le J(\tilde u(\cdot))$. Since $\tilde u(\cdot)\in\mathcal U[0,T]$ is arbitrary, the desired result follows.
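The last implication can be spelled out. Assuming, as in the formulation of Problem (OP) in the earlier sections, that the cost functional has the standard form $J(u(\cdot))=\mathbb E\big[\int_0^T g(t,X(t),u(t))\,dt+h(X(T))\big]$ (a running cost plus a terminal cost), combining the two displayed inequalities gives:

```latex
\begin{aligned}
J(\tilde u(\cdot)) - J(\bar u(\cdot))
  &= \mathbb{E}\int_0^T \Big( g\big(t,\widetilde X(t),\tilde u(t)\big)
       - g\big(t,\bar X(t),\bar u(t)\big) \Big)\,dt
     + \mathbb{E}\,h\big(\widetilde X(T)\big) - \mathbb{E}\,h\big(\bar X(T)\big)\\
  &\ge -\,\mathbb{E}\big\langle h_x\big(\bar X(T)\big),\,\xi(T)\big\rangle_H
     + \mathbb{E}\,h\big(\widetilde X(T)\big) - \mathbb{E}\,h\big(\bar X(T)\big)
     \qquad\text{(by (11.20))}\\
  &\ge 0 \qquad\text{(by the convexity of $h$)},
\end{aligned}
```

that is, $J(\bar u(\cdot))\le J(\tilde u(\cdot))$ for every admissible $\tilde u(\cdot)$.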
12 Further comments and open problems

• For an earlier version of these notes, we refer to [52]. The latter, which treats controllability and optimal control problems both for stochastic differential equations in finite dimensions and for stochastic evolution equations in infinite dimensions, is a set of lecture notes for the LIASFMA Hangzhou Autumn School on "Control and Inverse Problems for Partial Differential Equations", held during October 17–22, 2016 at Zhejiang University, Hangzhou, China.

• Since the seminal paper [26], controllability and observability have been basic notions of Control Theory. In these notes, owing to space limitations, we reduce controllability problems for SEEs to observability problems for BSEEs (by means of the so-called "duality method") without proofs. Readers are referred to [53] for a detailed introduction to this topic.

• Besides the "duality method", there are several other methods for solving controllability problems for deterministic PDEs. Nevertheless, most of them cannot simply be applied to controllability problems for SPDEs. Let us briefly explain this for transport equations and parabolic equations.

To establish the exact controllability of deterministic transport equations, there are two further approaches. The first is to utilize the explicit formula for their solutions. By this method, for some simple transport equations, one can explicitly construct a control steering the system from a given initial state to another given final state, provided that the time is large enough. It seems that this method cannot be used to solve our stochastic problem since, in general, we do not have an explicit formula for solutions to the system (4.5). The second is the extension method.
However, the extension method seems to be valid only for time-reversible systems, and it is therefore not suitable for stochastic problems in the framework of Itô's integral.

In [14], the null controllability of the heat equation in one space dimension is obtained by solving a moment problem, based on the classical result on the linear independence of a suitably chosen family of real exponentials in $L^2(0,T)$. However, it seems very hard to employ the same idea to prove the null controllability of stochastic parabolic equations. Indeed, so far it is unclear how to reduce the stochastic null controllability problem to a suitable moment problem. For example, let us consider the following equation:
$$\begin{cases}dy(t,x)-y_{xx}(t,x)dt=f(x)u(t)dt+a(x)y(t,x)dW(t)&\text{in }(0,T]\times(0,1),\\ y(t,0)=y(t,1)=0&\text{on }(0,T),\\ y(0,x)=y_0(x)&\text{in }(0,1).\end{cases}\eqno(12.1)$$
Here $y_0\in L^2(0,1)$, $a\in L^\infty(0,1)$, $f\in L^2(0,1)$, $y$ is the state and $u\in L^2_{\mathbb F}(0,T)$ is the control. One can see that it is not easy to reduce the null controllability problem for the system (12.1) to the usual moment problem. Under some conditions, it seems possible to reduce it to a stochastic moment problem, but this remains to be done.

In [64], it was shown that if the wave equation is exactly controllable for some $T>0$ with controls supported in a subset $G_0$ of $G$, then the heat equation with controls supported in $G_0$ is null controllable for any $T>0$. It seems that one can follow this idea to establish a connection between the null controllability of stochastic heat equations and that of stochastic wave equations. Nevertheless, at this moment the null controllability of stochastic wave equations is still open, and it seems even more difficult than that of stochastic heat equations.

• In these notes, we only give a very brief introduction to controllability problems for three kinds of SPDEs. Recently, there have also been works on controllability problems for other types of SPDEs, such as [16] for stochastic complex parabolic equations, [21] for stochastic Kuramoto-Sivashinsky equations, [33, 72] for degenerate stochastic parabolic equations, [32] for coupled stochastic parabolic equations, [38] for stochastic Schrödinger equations and [55] for a refined stochastic wave equation.

• By means of the tools that we developed for solving stochastic controllability problems, in [35, 36, 50] we initiated the study of inverse problems for SPDEs (see also [72, 77, 78] for further interesting progress), in which the point is that, unlike in most previous works on this topic, the problems are genuinely stochastic and therefore cannot be reduced to deterministic ones. In particular, it was found in [50] that both the formulation of stochastic inverse problems and the tools to solve them differ considerably from their deterministic counterparts.
Indeed, as the counterexample in [50, Remark 2.7] shows, the inverse problem considered therein makes sense only in the stochastic setting!

• The classical transposition method was introduced by J.-L. Lions and E. Magenes ([30, 31]) to solve non-homogeneous boundary value problems (including, in particular, boundary control problems) for deterministic PDEs. The main idea is to interpret solutions to a less understood equation by means of those to another, well understood one (see also Remark 4.3). Nevertheless, in the stochastic setting the philosophy is quite different. Indeed, the main purpose of introducing the stochastic transposition method is not to solve non-homogeneous boundary value problems for SPDEs (though it does work for this sort of problem) but to solve backward stochastic equations (especially the difficult operator-valued BSEEs (8.5) and (10.26)), in which no boundary conditions appear explicitly!

• In these notes, we do not present the dynamic programming method for solving optimal control problems for SPDEs. We refer the readers to [13] for a comprehensive introduction to this topic.
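Returning to the moment-problem discussion around (12.1): to make that system concrete, the following is a minimal finite-difference Euler–Maruyama simulation sketch of (12.1). It is purely illustrative (no controllability construction is attempted); the profiles f and a, the open-loop control u, and all discretization parameters are arbitrary placeholder choices.

```python
import numpy as np

# Finite-difference Euler-Maruyama sketch of the controlled stochastic
# heat equation (12.1):
#   dy - y_xx dt = f(x) u(t) dt + a(x) y dW(t),   y(t,0) = y(t,1) = 0.
# All concrete choices below (f, a, u, grid sizes) are illustrative only.
rng = np.random.default_rng(0)
N, M, T = 50, 20000, 1.0            # space intervals, time steps, horizon
h, dt = 1.0 / N, T / M              # dt < h^2/2, so the explicit scheme is stable
x = np.linspace(0.0, 1.0, N + 1)

f = np.sin(np.pi * x)               # control profile f (placeholder)
a = 0.1 * np.ones_like(x)           # noise coefficient a (placeholder)
u = lambda t: -np.exp(-t)           # scalar open-loop control u (placeholder)

y = np.sin(2.0 * np.pi * x)         # initial state y_0
for m in range(M):
    lap = np.zeros_like(y)
    lap[1:-1] = (y[2:] - 2.0 * y[1:-1] + y[:-2]) / h**2   # discrete y_xx
    dW = np.sqrt(dt) * rng.standard_normal()              # one Brownian increment
    y = y + (lap + f * u(m * dt)) * dt + a * y * dW
    y[0] = y[-1] = 0.0                                    # Dirichlet boundary
print(float(np.abs(y).max()))       # size of the state at time T
```

Note that, matching (12.1), the control u is scalar-valued (it multiplies the fixed profile f) and the noise comes from a single Brownian motion; this is what makes the reduction to a classical moment problem delicate.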
In our opinion, control theory for SPDEs is still at an early stage, and there are many challenging open problems in this field. We list below some of those that seem particularly interesting:

• Controllability of the stochastic parabolic equation with one control
Note that we introduce two controls $u$ and $v$ in (5.4). In view of the controllability result for deterministic parabolic equations, it is more natural to use only one control and consider the following controlled stochastic parabolic equation (which is the special case of (5.4) with $v\equiv0$):
$$\begin{cases}dy-\Delta y\,dt=\big(a_1y+\chi_{G_0}u\big)dt+a_2y\,dW(t)&\text{in }Q,\\ y=0&\text{on }\Sigma,\\ y(0)=y_0&\text{in }G.\end{cases}\eqno(12.2)$$
In order to obtain the null controllability of (12.2), one has to prove that solutions to the system (5.21) satisfy the following observability estimate:
$$|z(0)|_{L^2(G)}\le C|z|_{L^2_{\mathbb F}(0,T;L^2(G_0))},\quad\forall\,z_T\in L^2_{\mathcal F_T}(\Omega;L^2(G)).\eqno(12.3)$$
Unfortunately, at this moment, we do not know how to obtain (12.3). Indeed, positive results are available only for some special cases of (12.2), when the spectral method can be applied (cf. [32, 34, 73]). Even in these special cases, some new phenomena appear in the stochastic setting. Inspired by [32, 34, 73], we believe that one control is enough for the null controllability of a stochastic parabolic equation. The main difficulty in establishing (12.3) is that, although the correction term "$Z$" plays a "coercive" role for the well-posedness of (5.21), it seems to be a "bad" (non-homogeneous) term when one tries to prove (12.3) using the global Carleman estimate. In that case, this term behaves like a non-homogeneous term and appears on the right-hand side of the Carleman estimate (see (5.23) in Theorem 5.3).

• Null/approximate controllability for stochastic hyperbolic equations
In Section 6, we show that stochastic hyperbolic equations are not exactly controllable. It is interesting to study whether they are null/approximately controllable. As far as we know, there are no nontrivial published results in this respect.

A possible way to solve these problems is to establish a suitable observability estimate/unique continuation property for some backward stochastic hyperbolic equation. For example, to get the null controllability of (6.4), one should establish the following observability estimate for the equation (6.5) with $\tau=T$:
$$|(z(0),\hat z(0))|^2_{H^1_0(G)\times L^2(G)}\le C\Big(\int_{\Sigma_0}\Big|\frac{\partial z}{\partial\nu}\Big|^2d\Sigma+|Z|^2_{L^2_{\mathbb F}(0,T;H^1_0(G))}\Big).$$
However, until now, we can only prove the following (see [55]):
$$|(z(0),\hat z(0))|^2_{H^1_0(G)\times L^2(G)}\le C\Big(\int_{\Sigma_0}\Big|\frac{\partial z}{\partial\nu}\Big|^2d\Sigma+|Z|^2_{L^2_{\mathbb F}(0,T;H^1_0(G))}+|\widehat Z|^2_{L^2_{\mathbb F}(0,T;L^2(G))}\Big).$$
How to get rid of the extra term $|\widehat Z|^2_{L^2_{\mathbb F}(0,T;L^2(G))}$ is an unsolved problem.

• Controllability problems for one-dimensional stochastic hyperbolic systems
When the system is linear, it seems possible to generalize the powerful method of [28] (developed for deterministic problems) to obtain exact/null controllability results for stochastic hyperbolic systems in one space dimension. However, when the system is quasilinear, to the best of our knowledge it is even unknown how to define the characteristics, and it is therefore unclear how to study the controllability problem for this sort of nonlinear system.

• The existence of an optimal feedback control operator for stochastic linear quadratic optimal control problems under less restrictive conditions
To guarantee the existence of an optimal feedback control operator for stochastic linear quadratic optimal control problems, very restrictive conditions are needed at present (e.g., [54]). It is interesting to relax these conditions. We believe that the theory of forward-backward stochastic evolution equations and the Malliavin calculus are very likely to be helpful for this problem.

• Linear quadratic optimal control problems for SEEs with unbounded control operators
In Sections 7–9, we assume that $B$, $D$ and $Q$ are bounded, linear operator-valued functions. In many concrete control systems, however, it is quite common that both the control and the observation operators are unbounded w.r.t. the state space. Typical examples are systems governed by SPDEs in which the actuators and sensors act on lower-dimensional hypersurfaces or on the boundary of the spatial domain. The unboundedness of the control and observation operators leads to substantial technical difficulties, even in the formulation of the state space. To study such systems, one needs to make some further assumptions, for example that the semigroup $\{S(t)\}_{t\ge0}$ has some smoothing effect; this has been studied when $D=0$ (e.g., [23]). In the deterministic setting, to overcome this sort of difficulty, the notions of admissible control operator and admissible observation operator were introduced (e.g., [66]). On the other hand, the notion of a well-posed linear system (with possibly unbounded control and observation operators) was introduced, which requires, roughly speaking, that the map from the input space to the output space be bounded (e.g., [66]). Well-posed linear systems form a very general class whose basic properties are rich enough for the study of LQ problems. The concept of a well-posed linear system was generalized to the stochastic setting in [40], but it seems that much remains to be done to develop a satisfactory theory of stochastic well-posed linear control systems.

• Optimal control problems for SEEs with state/endpoint constraints
We do not consider state/endpoint constraints for the optimal control problems in these notes. Some results can be found in [15]. However, the constraints considered in [15] are not very strong, and hence many interesting cases are not covered, for example an endpoint constraint of the form $X(T)\in S$, a.s., for some given set $S$.

• High order necessary conditions for optimal controls
The Pontryagin type maximum principle is a first order necessary condition for optimal controls. In addition to first order necessary conditions, some higher order necessary conditions should be established to distinguish optimal controls from candidates that satisfy the first order necessary conditions, especially when the optimal controls are singular, i.e., when they satisfy the first order necessary conditions trivially. This happens, for instance, when the Hamiltonian corresponding to the optimal controls equals a constant on a subset of the control region, or when the gradient and the Hessian (w.r.t. the control variable $u$) of the corresponding Hamiltonian vanish/degenerate. In such cases, the first order necessary conditions are not enough to determine the optimal controls. When the control domain $U$ is convex, some pointwise second order necessary conditions are obtained in [46]. When $U$ is not convex, some integral-type second order necessary conditions are obtained in [15]. However, compared with their deterministic counterparts, the results in [15, 46] are not yet satisfactory.

• Time-inconsistent optimal control problems for SEEs
In recent years, there have been many works on time-inconsistent optimal control problems for stochastic differential equations (see [74] and the references cited therein). It would be quite interesting to study the same problems for stochastic evolution equations in infinite dimensions. As far as we know, there are only very few publications on this topic (e.g., [10]).

• Optimal control problems for mean-field SEEs
Mean field approximation is extremely useful for describing macroscopic phenomena starting from microscopic descriptions. In recent years, due to various applications in economics, finance and physics, there has been growing interest in control problems for mean-field stochastic differential equations (e.g., [2, 5] and the rich references therein). However, publications on control problems for mean-field SEEs are quite limited. Indeed, to the best of our knowledge, [1, 12, 42, 69] are the only papers dealing with optimal control problems for this sort of system.

• Control problems for other types of stochastic control systems
The control problems considered in these notes and mentioned above also make sense for forward-backward stochastic evolution equations, stochastic evolution equations driven by fractional Brownian motions or $G$-Brownian motions, forward-backward doubly stochastic evolution equations, stochastic Volterra integral equations in infinite dimensions, and stochastic evolution equations driven by general martingales (even with incomplete information). As far as we know, all of these remain to be studied, and very likely new tools will have to be developed to obtain interesting new results.

References

[1] N. Agram and B. Øksendal,
Stochastic control of memory mean-field processes.
Appl. Math.Optim., (2019), 181–204.[2] A. Bensoussan, J. Frehse and P. Yam, Mean field games and mean field type control theory.
Springer, New York, 2013.[3] J.-M. Bismut,
Linear quadratic optimal stochastic control with random coefficients.
SIAM J.Control Optim., (1976), 419–444.[4] A. M. Bruckner, J. B. Bruckner and B. S. Thomson, Real analysis . Prentice Hall (Pearson),Upper Saddle River, 1997.[5] R. A. Carmona and F. Delarue,
Probabilistic theory of mean field games with applications. I. Mean field FBSDEs, control, and games. Springer, Cham, 2018.
[6] F. Clarke,
Functional analysis, calculus of variations and optimal control . Graduate Texts inMathematics, 264. Springer, London, 2013.[7] G. Da Prato and J. Zabczyk,
Stochastic equations in infinite dimensions . Cambridge UniversityPress, Cambridge, 1992.[8] D. A. Dawson,
Stochastic evolution equations.
Math. Biosci. , (1972), 287–316.[9] F. Dou and Q. L¨u, Partial approximate controllability for linear stochastic control systems.
SIAM J. Control Optim. , (2019), 1209–1229.[10] F. Dou and Q. L¨u, Time-inconsistent linear quadratic optimal control problems for stochasticevolution equations.
SIAM J. Control Optim. , (2020), 485–509.[11] K. Du and Q. Meng, A maximum principle for optimal control of stochastic evolution equations.
SIAM J. Control Optim. , (2013), 4343–4362.[12] R. Dumitrescu, B. Øksendal and A. Sulem, Stochastic control for mean-field stochastic partialdifferential equations with jumps.
J. Optim. Theory Appl. , (2018), 559–584.[13] G. Fabbri, F. Gozzi and A. ´Swi¸ech, Stochastic optimal control in infinite dimension. Dynamicprogramming and HJB equations . Springer, Cham, 2017.[14] H. O. Fattorini and D. L. Russell,
Exact controllability theorems for linear parabolic equationsin one space dimension.
Arch. Rational Mech. Anal. , (1971), 272–292.[15] H. Frankowska and Q. L¨u, First and second order necessary optimality conditions for controlledstochastic evolution equations with control and state constraints.
J. Differential Equations, (2020), 2949–3015.
[16] X. Fu and X. Liu, Controllability and observability of some stochastic complex Ginzburg-Landau equations.
SIAM J. Control Optim. , (2017), 1102–1127.[17] X. Fu, Q. L¨u and X. Zhang, Carleman estimates for second order partial differential operatorsand applications, a unified approach . Springer, Cham, 2019.[18] M. Fuhrman, Y. Hu and G. Tessitore,
Stochastic maximum principle for optimal control ofSPDEs.
Appl. Math. Optim. , (2018), 255–285.[19] T. Funaki, Random motion of strings and related stochastic evolution equations.
Nagoya Math.J. , (1983), 129–193.[20] A. V. Fursikov and O. Yu. Imanuvilov, Controllability of evolution equations . Lecture NotesSeries , Research Institute of Mathematics, Seoul National University, Seoul, Korea, 1994.[21] P. Gao, M. Chen and Y. Li, Observability estimates and null controllability for forward andbackward linear stochastic Kuramoto-Sivashinsky equations.
SIAM J. Control Optim. , (2015), 475–500.[22] P. R. Halmos, Measure theory . D. Van Nostrand Company, Inc., New York, 1950.[23] C. Hafizoglu, I. Lasiecka, T. Levajkovi´c, H. Mena, A. Tuffaha,
The stochastic linear quadraticcontrol problem with singular estimates.
SIAM J. Control Optim., (2017), 595–626.
[24] H. Holden, B. Øksendal, J. Ubøe and T. Zhang, Stochastic partial differential equations. A modeling, white noise functional approach. Universitext. Springer, New York, 2010.
[25] K. Itô,
Introduction to probability theory . Cambridge University Press, Cambridge, 1984.[26] R. E. Kalman,
On the general theory of control systems.
Proceedings of the first IFAC congress ,Moscow, 1960;
Butterworth, London, 1961, vol.1, 481–492.[27] M. V. Klibanov and M. Yamamoto,
Exact controllability for the time dependent transportequation.
SIAM J. Control Optim. , (2007), 2071–2195.[28] T. Li, Controllability and observability for quasilinear hyperbolic systems , American Instituteof Mathematical Sciences (AIMS), Springfield, MO; Higher Education Press, Beijing, 2010.[29] J.-L. Lions,
Contrôlabilité exacte, perturbations et stabilisation de systèmes distribués, Tome 1, Contrôlabilité exacte. Rech. Math. Appl. 8, Masson, Paris, 1988.
[30] J.-L. Lions and E. Magenes, Non-homogeneous boundary value problems and applications. Vol. I. Springer-Verlag, New York-Heidelberg, 1972.
[31] J.-L. Lions and E. Magenes,
Non-homogeneous boundary value problems and applications . Vol.II. Springer-Verlag, New York-Heidelberg, 1972.[32] X. Liu,
Controllability of some coupled stochastic parabolic systems with fractional order spatialdifferential operators by one control in the drift.
SIAM J. Control Optim. , (2014), 836–860.[33] X. Liu and Y. Yu, Carleman estimates of some stochastic degenerate parabolic equations andapplication.
SIAM J. Control Optim. , (2019), 3527–3552.[34] Q. L¨u, Some results on the controllability of forward stochastic parabolic equations with controlon the drift.
J. Funct. Anal. , (2011), 832–851.[35] Q. L¨u, Carleman estimate for stochastic parabolic equations and inverse stochastic parabolicproblems.
Inverse Problems , (2012), 045008.[36] Q. L¨u, Observability estimate for stochastic Schr¨odinger equations and its applications.
SIAMJ. Control Optim. , (2013), 121–144.[37] Q. L¨u, Observability estimate and state observation problems for stochastic hyperbolic equa-tions.
Inverse Problems , (2013), 095011.[38] Q. L¨u, Exact controllability for stochastic Schr¨odinger equations.
J. Differential Equations , (2013), 2484–2504.[39] Q. L¨u, Exact controllability for stochastic transport equations.
SIAM J. Control Optim. , (2014), 397–419.[40] Q. L¨u, Stochastic well-posed systems and well-posedness of some stochastic partial differentialequations with boundary control and observation.
SIAM J. Control Optim. , (2015), 3457–3482.[41] Q. L¨u, Well-posedness of stochastic Riccati equations and closed-loop solvability for stochasticlinear quadratic optimal control problems.
J. Differential Equations, (2019), 180–227.
[42] Q. Lü, Stochastic linear quadratic optimal control problems for mean-field stochastic evolution equations.
ESAIM Control Optim. Calc. Var. , (2020), Paper No. 127.[43] Q. L¨u, T. Wang and X. Zhang, Characterization of optimal feedback for stochastic linearquadratic control problems.
Probab. Uncertain. Quant. Risk. , (2017), Paper no. 11, DOI10.1186/s41546-017-0022-7.[44] Q. L¨u, J. Yong and X. Zhang, Representation of Itˆo integrals by Lebesgue/Bochner integrals.
J. Eur. Math. Soc. , (2012), 1795–1823.[45] Q. L¨u, J. Yong and X. Zhang, Erratum to “Representation of Itˆo integrals by Lebesgue/ Bochnerintegrals” (J. Eur. Math. Soc. 14, 1795–1823 (2012)) [MR2984588].
J. Eur. Math. Soc. , (2018), 259–260.[46] Q. L¨u, H. Zhang and X. Zhang, Second order optimality conditions for optimal control problemsof stochastic evolution equations. preprint, arXiv: 1811.07337.[47] Q. L¨u and X. Zhang,
Well-posedness of backward stochastic differential equations with generalfiltration.
J. Differential Equations , (2013), 3200–3227.[48] Q. L¨u and X. Zhang, General Pontryagin-type stochastic maximum principle and backwardstochastic evolution equations in infinite dimensions . Springer, Cham, 2014.[49] Q. L¨u and X. Zhang,
Transposition method for backward stochastic evolution equations revis-ited, and its application.
Math. Control Relat. Fields , (2015), 529–555.[50] Q. L¨u and X. Zhang, Global uniqueness for an inverse stochastic hyperbolic problem with threeunknowns.
Comm. Pure Appl. Math. , (2015), 948–963.[51] Q. L¨u and X. Zhang, Operator-valued backward stochastic Lyapunov equations in infinite di-mensions, and its application.
Math. Control Relat. Fields , (2018), 337–381.[52] Q. L¨u and X. Zhang, A mini-course on stochastic control, in
Control and inverse problems forpartial differential equations . Edited by G. Bao, J.-M. Coron and T.-T. Li, Higher EducationPress, Beijing, 2019, 171–254.[53] Q. L¨u and X. Zhang,
Mathematical theory for stochastic distributed parameter control systems.
Springer-Verlag, in press.[54] Q. L¨u and X. Zhang,
Optimal feedback for stochastic linear quadratic control and backwardstochastic Riccati equations in infinite dimensions. preprint, arXiv: 1901.00978.[55] Q. L¨u and X. Zhang,
Exact controllability for a refined stochastic wave equation. preprint,arXiv: 1901.06074.[56] Q. L¨u and X. Zhang,
Control theory for stochastic distributed parameter systems, an engineer-ing perspective. submitted to
Annual Reviews in Control .[57] R. S. Manning, J. H. Maddocks and J. D. Kahn,
A continuum rod model of sequence-dependentDNA structure.
J. Chem. Phys., (1996), 5626–5646.
[58] P. A. Meyer, Probability and potentials. Blaisdell Publishing Co. Ginn and Co., Waltham, Mass.-Toronto, Ont.-London, 1966.
[59] R. M. Murray et al., Control in an information rich world. Report of the panel on future directions in control, dynamics, and systems. Papers from the meeting held in College Park, MD, July 16-17, 2000. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2003.
[60] E. Nelson,
Dynamical theories of Brownian motion . Princeton University Press, Princeton,N.J. 1967.[61] A. Pazy,
Semigroups of linear operators and applications to partial differential equations , Ap-plied Mathematical Sciences, 44. Springer-Verlag, New York, 1983.[62] S. Peng,
A general stochastic maximum principle for optimal control problems.
SIAM J. ControlOptim. , (1990), 966–979.[63] S. Peng, Backward stochastic differential equation and exact controllability of stochastic controlsystems.
Progr. Natur. Sci. (English Ed.). , (1994), 274–284.[64] D. L. Russell, A unified boundary controllability theory for hyperbolic and parabolic partialdifferential equations.
Studies in Appl. Math. , (1973), 189–211.[65] D. L. Russell, Controllability and stabilizability theory for linear partial differential equations:recent progress and open problems.
SIAM Rev. , (1978), 639–739.[66] D. Salamon, Infinite-dimensional linear systems with unbounded control and observation: Afunctional analytic approach.
Trans. Amer. Math. Soc. , (1987), 383–431.[67] K. Stowe, An introduction to thermodynamics and statistical mechanics . Cambridge UniversityPress, Cambridge, 2007.[68] J. Sun and J. Yong,
Stochastic linear-quadratic optimal control theory: open-loop and closed-loop solutions . Springer, Cham, 2019.[69] M. Tang, Q. Meng and M. Wang,
Forward and backward mean-field stochastic partial differ-ential equation and optimal control.
Chin. Ann. Math. Ser. B , (2019), 515–540.[70] S. Tang and X. Zhang, Null controllability for forward and backward stochastic parabolic equa-tions.
SIAM J. Control Optim. , (2009), 2191–2216.[71] J. M. A. M. van Neerven, γ -radonifying operators—a survey, in The AMSI-ANU workshopon spectral theory and harmonic analysis . Proc. Centre Math. Appl. Austral. Nat. Univ., 44,Austral. Nat. Univ., Canberra, (2010), 1–61.[72] B. Wu, Q. Chen and Z. Wang,
Carleman estimates for a stochastic degenerate parabolic equa-tion and applications to null controllability and an inverse random source problem.
InverseProblems , (2020), 075014.[73] D. Yang and J. Zhong, Observability inequality of backward stochastic heat equations for mea-surable sets and its applications.
SIAM J. Control Optim. , (2016), 1157–1175.[74] J. Yong, Time-inconsistent optimal control problems.
Proceedings of the international congressof mathematicians 2014, Vol. IV. Seoul, Korea, 2014, 947–969.[75] J. Yong and X. Y. Zhou,
Stochastic controls: Hamiltonian systems and HJB equations. Springer-Verlag, New York, 1999.
[76] K. Yosida,
Functional analysis . Classics in Mathematics. Springer-Verlag, Berlin, 1995.[77] G. Yuan,
Determination of two kinds of sources simultaneously for a stochastic wave equation.
Inverse Problems , (2015), 085003.[78] G. Yuan, Conditional stability in determination of initial data for stochastic parabolic equa-tions.
Inverse Problems , (2017), 035014.[79] E. Zeidler, Nonlinear functional analysis and its applications. I. fixed-point theorems . Springer-Verlag, New York, 1986.[80] X. Zhang,
Carleman and observability estimates for stochastic wave equations.
SIAM J. Math.Anal. , (2008), 851–868.[81] X. Zhang, A unified controllability/observability theory for some stochastic and deterministicpartial differential equations. in Proceedings of the international congress of mathematicians,Volume IV , 3008–3034, Hindustan Book Agency, New Delhi, 2010.[82] X. Zhou,
Sufficient conditions of optimality for stochastic systems with controllable diffusions.