Robust Mean Field Linear-Quadratic-Gaussian Games with Unknown L^2-Disturbance
arXiv preprint [math.OC]

Jianhui Huang† and Minyi Huang‡

Abstract
This paper considers a class of mean field linear-quadratic-Gaussian (LQG) games with model uncertainty. The drift term in the dynamics of the agents contains a common unknown function. We take a robust optimization approach where a representative agent in the limiting model views the drift uncertainty as an adversarial player. By including the mean field dynamics in an augmented state space, we solve two optimal control problems sequentially, which combined with consistent mean field approximations provides a solution to the robust game. A set of decentralized control strategies is derived by use of forward-backward stochastic differential equations (FBSDE) and shown to be a robust ε-Nash equilibrium.

1 Introduction

Mean field game theory provides an effective methodology for the analysis and strategy design in a large population of players which are individually insignificant but collectively have strong impact (see e.g. [24, 27, 28, 34]). A typical model analyzes a system of N players with mean field coupling in their dynamics or costs, or both. The linear-quadratic-Gaussian (LQG) framework is of particular interest since it allows an explicit solution procedure. Consider a large population of N agents. The dynamics of agent i are given by the stochastic differential equation (SDE)

  dx_i(t) = (A x_i(t) + B u_i(t) + G x^{(N)}(t)) dt + D dW_i(t),  t ≥ 0,  (1)

where x^{(N)} = (1/N) Σ_{i=1}^N x_i denotes the mean field coupling term. The cost of agent i is given by

  J_i(u_1, ..., u_N) = E[ ∫_0^T ( |x_i − Γx^{(N)} − η|_Q^2 + u_i^T R u_i ) dt + x_i^T(T) H x_i(T) ],  (2)

where we denote |z|_Q^2 = z^T Q z and the symmetric matrices Q ≥ 0, H ≥ 0, R > 0. The LQG modeling framework was first developed in [24, 27] to obtain a set of strategies (û_1, ..., û_N) such that each û_i only uses the local sample path information of x_i and some deterministic functions reflecting the collective behavior of the agents, and such that (û_1, ..., û_N) is an ε-Nash equilibrium. There exists a substantial body of literature adopting the LQG framework [4, 7, 31, 35, 43, 47].

∗ A compressed version of this paper without detailed proofs has been presented at the 2015 IEEE CDC.
† J. Huang is with the Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong ([email protected]). This author was supported by RGC Early Career Scheme (ECS) grant 502412P, 500613P.
‡ M. Huang is with the School of Mathematics and Statistics, Carleton University, Ottawa, ON K1S 5B6, Canada ([email protected]). This author was supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada under a Discovery Grant and a Discovery Accelerator Supplements Program. Part of this author's work was conducted at the Department of Applied Mathematics, The Hong Kong Polytechnic University during March-May 2014. Please address all correspondence to this author.

For an N-player static game with finite action spaces and an uncertain payoff matrix, a robust-optimization equilibrium is introduced in [2], where each player optimizes its worst case payoff with respect to the uncertainty set. A similar method is applied to hierarchical static games [22]. Robustness has been addressed in dynamic games as well. A linear-quadratic (LQ) game with system parameter uncertainties is presented in [29], and the deviation from the Nash equilibrium is estimated for a set of nominal strategies. Robust Nash equilibria are analyzed in [49] for an LQ game with an unknown time-varying disturbance signal as an adversarial player.
In the first case, a soft-constrained game is solved where the cost includes a quadratic penalty term for the disturbance. The second case introduces a hard constraint by specifying an L^2 bound on the disturbance function. The work [30] deals with stochastic games where the payoff and state transition probabilities contain uncertainty. The solution is developed by letting each player solve a robust Markov decision problem to optimize its worst case cost while the other players' strategies are fixed.

This paper aims to address model uncertainty in the mean field LQG game context. Specifically, we focus on drift uncertainty by adding to (1) a common unknown L^2-disturbance f. A practical motivation is that in many decision problems, a large number of agents can share a common uncertainty source fluctuating with time; examples include taxation, subsidy, interest rates, and so on. A direct consequence of our modeling is that this disturbance has a global influence on the population. To address robustness, each agent locally views the disturbance as an adversarial player, and for this purpose we incorporate into (2) an effort penalty term for the disturbance, which first maximizes the resulting cost; the agent then minimizes. The framework of letting the disturbance maximize while its effort is penalized is called the soft-constraint approach [5, 19, 49]. It has the advantage of analytical tractability. When a hard constraint is considered, the robust mean field game is more difficult to tackle; see some preliminary analysis in [25]. Regarding robustness in mean field games, a related work is [46] where each agent is paired with its local disturbance as an adversarial player. The resulting solution replaces the usual HJB equation by a Hamilton-Jacobi-Isaacs (HJI) equation.

To design the individual strategies it is necessary to build the dynamics of the mean field (i.e., the state average of the agents) evolving under the disturbance.
This technique shares its spirit with the state augmentation method in major player models [26, 41, 42]. The subsequent robust optimization problem, as a minimax control problem, leads to two optimal control problems with indefinite state weights [51]. They are different from the well known stochastic control problems with indefinite control weights [17, 37]. We will follow a convex optimization approach to solve the two control problems via variational analysis and forward-backward stochastic differential equations (FBSDE) [23, 40, 44]. Both the information structure and the solution procedure for our model are different from [46], where each player and its local disturbance have access to its state and so dynamic programming is applicable.

Our main contributions are summarized as follows:

• We formulate a class of mean field LQG games where the players face a common uncertainty source, and introduce the robust optimization approach to solve two convex optimal control problems.
• Decentralized strategies are obtained for the robust mean field game via a set of FBSDE.

• The performance of the decentralized strategies for the N players is characterized as a robust ε-Nash equilibrium.

The rest of this paper is organized as follows. Section 2 introduces the mean field LQG game with a common disturbance and defines the worst case cost for a player. Section 3 studies the limiting robust optimization problem, which leads to two optimal control problems solved sequentially by the disturbance and the representative player. The solution equation system of the mean field game is obtained in Section 4 based on consistent mean field approximations. A key error estimate of the mean field approximation is developed in Section 5. Section 6 characterizes the set of decentralized strategies as a robust ε-Nash equilibrium. An extension of the analysis to players with random initial states is presented in Section 7, and Section 8 concludes the paper.

2 The Mean Field LQG Game with a Common Disturbance

Consider a finite time horizon [0, T] for T > 0. Suppose that (Ω, F, {F_t}_{0≤t≤T}, P) is a complete filtered probability space. Throughout this paper, we denote by R^k the k-dimensional Euclidean space and by R^{n×k} the set of all n×k matrices. We use |·| to denote the norm of a Euclidean space, or the Frobenius norm for matrices. For a vector or matrix M, M^T denotes its transpose. Let L^2_F(0,T;R^k) denote the space of all R^k-valued F_t-progressively measurable processes x(·) satisfying E∫_0^T |x(t)|^2 dt < ∞; C([0,T];R^k) (resp., C^1([0,T];R^k)) is the space of all R^k-valued functions h(·) defined on [0,T] which are continuous (resp., continuously differentiable); L^2(0,T;R^k) is the space of all R^k-valued measurable functions h(·) on [0,T] satisfying ∫_0^T |h(t)|^2 dt < ∞, and we denote the norm ||h||_{L^2} = (∫_0^T |h(t)|^2 dt)^{1/2}. Throughout the paper, we use C (or C_1, C_2, ...) to denote a generic constant which does not depend on the population size N and may vary from place to place.

Consider N agents (or players) denoted by A_i, 1 ≤ i ≤ N. The state x_i of A_i is R^n-valued and satisfies the linear SDE

  dx_i(t) = (A x_i(t) + B u_i(t) + G x^{(N)}(t) + f(t)) dt + D dW_i(t),  1 ≤ i ≤ N,  (3)

where x^{(N)} = (1/N) Σ_{j=1}^N x_j. The control u_i takes its value in R^n. The R^n-valued standard Brownian motions {W_i(t), 1 ≤ i ≤ N} are independent. The initial states {x_i(0), 1 ≤ i ≤ N} are deterministic and their empirical mean has the limit lim_{N→∞} (1/N) Σ_{i=1}^N x_i(0) = m_0. We take {F_t}_{0≤t≤T} as the natural filtration generated by the Nn-dimensional Brownian motion (W_1(t), ..., W_N(t)), and F = F_T. The admissible control set U of A_i is

  U := { u_i(·) : u_i ∈ L^2_F(0,T;R^n) }.

Denote u = (u_1, ..., u_N) and u_{−i} = (u_1, ..., u_{i−1}, u_{i+1}, ..., u_N).

The function f ∈ L^2(0,T;R^n) is an unknown disturbance to characterize the model uncertainty, and represents an influence from the common environment for decision-making. A natural motivation for considering a deterministic disturbance is the following. Although each player A_i regards the disturbance as adversarial, it should not be excessively pessimistic by assuming that the latter will use the sample path information of W_i to play against it, and instead only considers a deterministic f. The cost functional of A_i is

  J_i(u_i, u_{−i}, f) = E[ ∫_0^T ( |x_i − (Γx^{(N)} + η)|_Q^2 + u_i^T R u_i − (1/γ^2)|f(t)|^2 ) dt + x_i^T(T) H x_i(T) ],  (4)

where the symmetric matrices Q ≥ 0, R > 0, H ≥ 0, and the scalar γ > 0. We assume uniform agents in the sense that they share the same parameter datum (A, B, G, D; Γ, η, Q, R, γ, H). Also, to simplify the analysis, we consider constant parameters.

Due to the unknown function f, A_i cannot evaluate its cost even if all control policies (u_1, ..., u_N) are known. To address this indeterminacy, we approach the game from a robust optimization point of view where each agent takes f as an adversarial player. Here a soft constraint [5, 19, 49] for the disturbance is adopted in that the penalty term −(1/γ^2)|f(t)|^2 is included in (4) while f attempts to maximize. For given (u_i, u_{−i}), define the worst case cost of A_i as

  J_i^{wo}(u_i, u_{−i}) = sup_{f ∈ L^2(0,T;R^n)} J_i(u_i, u_{−i}, f).

A set of strategies (û_1, ..., û_N) is a robust ε-Nash equilibrium for the N players if for ε ≥ 0,

  J_i^{wo}(û_i, û_{−i}) − ε ≤ inf_{u_i ∈ U} J_i^{wo}(u_i, û_{−i}) ≤ J_i^{wo}(û_i, û_{−i}).  (5)

Our central objective is to design decentralized strategies based on the above solution notion.

3 The Limiting Robust Optimization Problem

We start by making an appropriate approximation of the coupling term x^{(N)}. Adding up the N equations in (3) and normalizing by 1/N, we obtain

  dx^{(N)} = [(A + G)x^{(N)} + Bu^{(N)} + f] dt + D(1/N) Σ_{j=1}^N dW_j,

where u^{(N)} = (1/N) Σ_{j=1}^N u_j. Intuitively, from the point of view of A_i, u^{(N)} may be approximated by a deterministic function ū. Moreover, when N → ∞, (1/N) Σ_{j=1}^N dW_j vanishes due to the law of large numbers. In turn, a deterministic function m can be used to approximate x^{(N)}. The above reasoning suggests to introduce the limiting ordinary differential equation (ODE)

  dm/dt = (A + G)m + Bū + f,  m(0) = m_0.  (6)

Consider the optimization problem of a representative agent A_i:

  dx_i = (Ax_i + Bu_i + Gm_i + f) dt + D dW_i,
  dm_i/dt = (A + G)m_i + Bū + f,  (7)

where the second equation is motivated by (6) and m_i(0) = m_0. For the limiting model (7), (W_i, x_i(0)) is the same as in (3).
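The law-of-large-numbers step above, namely that the averaged noise term (1/N) Σ_j dW_j vanishes, can be illustrated numerically. The following sketch (scalar Brownian motions, illustrative horizon T = 1; the function names are mine, not the paper's) estimates Var[(1/N) Σ_{j=1}^N W_j(T)] by Monte Carlo; the theoretical value is T/N, which tends to 0 as N grows.

```python
import random

def averaged_brownian(N, T=1.0, steps=50, seed=0):
    """One sample of W^(N)(T) = (1/N) * sum_{j=1}^N W_j(T)."""
    rng = random.Random(seed)
    dt = T / steps
    total = 0.0
    for _ in range(N):
        w = 0.0
        for _ in range(steps):
            w += rng.gauss(0.0, dt ** 0.5)  # Brownian increment over dt
        total += w
    return total / N

def sample_var(N, trials=120):
    """Monte Carlo estimate of Var[W^(N)(T)]; theory gives T/N."""
    vals = [averaged_brownian(N, seed=k) for k in range(trials)]
    mean = sum(vals) / trials
    return sum((v - mean) ** 2 for v in vals) / trials

# sample_var(4) is close to 1/4, while sample_var(100) is close to 1/100,
# so the noise driving x^(N) is negligible for a large population.
```

This is only a numerical illustration of why the deterministic ODE (6) is a sensible limit; the rigorous error estimate is developed in Section 5.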
We reuse (x_i, A_i) to denote the state and the corresponding agent; this causes no risk of confusion. Since f will be determined in its worst case form depending on x_i(0), m_i is associated with the agent index i so that it is ready as an appropriate notation for the subsequent closed-loop dynamics. The cost functional is given by

  J̄_i(u_i, f) = E ∫_0^T ( |x_i − (Γm_i + η)|_Q^2 + u_i^T R u_i − (1/γ^2)|f(t)|^2 ) dt + E x_i^T(T) H x_i(T).
We aim to find a solution pair (f̂, û_i) such that

  J̄_i(û_i, f̂) = min_{u_i ∈ U} max_{f ∈ L^2(0,T;R^n)} J̄_i(u_i, f).  (8)

Finally, we need a consistency condition, i.e., (1/N) Σ_{i=1}^N û_i converges to ū in some sense (this will be made precise in Section 4), and we look for ū ∈ C([0,T];R^n); the feasibility of doing so will be clear from our solution procedure. The next part of our plan is to show that such strategies have the property in (5) when applied in the game of N agents. In the following, we solve the optimization problem (8) in two steps.

Let u_i ∈ U and ū ∈ C([0,T];R^n) be fixed. The optimal control problem is

  (P1)  maximize_{f ∈ L^2(0,T;R^n)} J̄_i(u_i, f).  (9)

Clearly (P1) is equivalent to the following problem:

  (P1a)  minimize_{f ∈ L^2(0,T;R^n)} J̄_i'(u_i, f) = E ∫_0^T ( −|x_i − (Γm_i + η)|_Q^2 + (1/γ^2)|f(t)|^2 ) dt − E x_i^T(T) H x_i(T).

(P1a) is an optimal control problem with negative semi-definite state weights. We are interested in the situation where (P1a) is a strictly convex problem with a coercivity property. This ensures that the worst case disturbance is uniquely determined by A_i. The procedure below to identify conditions ensuring convexity is similar to [37].

To study the convexity of J̄_i' in f, we construct a simpler auxiliary optimal control problem. Denote Q̂ = (I − Γ)^T Q(I − Γ). Consider the dynamics

  dz/dt = (A + G)z + g,  z(0) = 0,  (10)

where g ∈ L^2(0,T;R^n). The optimal control problem is

  (P1b)  minimize J̄_i''(g) = ∫_0^T ( −z^T Q̂ z + (1/γ^2)|g(t)|^2 ) dt − z^T(T) H z(T).

For any s ∈ R, we have J̄_i''(sg) = s^2 J̄_i''(g), and so we view J̄_i'' as a quadratic functional of g.

Definition 1
Let F(g) be a real-valued functional of g ∈ L^2(0,T;R^n). If F(g) ≥ 0 for all g, F is said to be positive semi-definite. If, furthermore, F(g) > 0 for all g ≠ 0, F is said to be positive definite.

Lemma 2 J̄_i'(u_i, f) is convex (resp., strictly convex) in f if and only if J̄_i''(g) is positive semi-definite (resp., positive definite).

Proof. Let (x_i, m_i) and (x_i', m_i') be the state processes of (7) corresponding to (u_i, f) and (u_i, f'), respectively. Take any λ ∈ [0,1] and denote λ̄ = 1 − λ. Then

  λ J̄_i'(u_i, f) + λ̄ J̄_i'(u_i, f') − J̄_i'(u_i, λf + λ̄f')
  = λλ̄ E ∫_0^T ( −|x_i − x_i' − Γ(m_i − m_i')|_Q^2 + (1/γ^2)|f(t) − f'(t)|^2 ) dt − λλ̄ E|x_i(T) − x_i'(T)|_H^2.

Set g = f − f' and z = x_i − x_i'. Therefore, z is deterministic and satisfies (10). In addition, m_i − m_i' = z for t ∈ [0,T]. Hence

  λ J̄_i'(u_i, f) + λ̄ J̄_i'(u_i, f') − J̄_i'(u_i, λf + λ̄f') = λλ̄ J̄_i''(g),

and the lemma follows. □

For our further existence analysis, we need to ensure that J̄_i'(u_i, f) is both strictly convex and coercive in f. For this purpose, we introduce the following assumption.

(H1) There exists a small ǫ_0 > 0 such that J̄_i''(g) − ǫ_0 ||g||_{L^2}^2 is positive semi-definite.

Note that (H1) is completely determined by the parameters (Q̂, γ, ǫ_0, H, T), and does not depend on u_i. Concerning (H1), we have the following result.

Proposition 3
The following statements are equivalent:

(i) (H1) holds true on [0, T].

(ii) The Riccati equation

  dP/dt + (A + G)^T P + P(A + G) − γ^2 P^2 − Q̂ = 0,  P(T) = −H  (11)

has a unique solution on [0, T].

(iii) For any t ∈ [0, T], det{(0, I) e^{At} (0, I)^T} > 0, where

  A = [ A + G + γ^2 H    −γ^2 I
        Q̆               −(A + G + γ^2 H)^T ]

and Q̆ = γ^2 H^2 + Q̂ + (A + G)^T H + H(A + G).

Proof.
In fact, (H1) is the uniform convexity condition proposed in [45], and the equivalence between (i) and (ii) is a corollary of Theorem 4.6 of [45]. Moreover, (iii) ⟹ (ii) is given in Theorem 4.3 of [40]. On the other hand, (ii) ⟹ (iii) is implied by Theorems 2.7 and 2.9 of [54]. □

For illustration of condition (ii), we give the following example.
Example 4
Consider system (3)-(4) with scalar parameters, where B = 1, Q = 1, H = 0, γ = 1, and A, G, Γ, R are chosen such that Â^2 > γ^2 Q̂, where Â = A + G and Q̂ = (1 − Γ)^2 Q. Solving (11) gives

  P(t) = −Q̂ (e^{α(t−T)} − e^{−α(t−T)}) / (λ_1 e^{α(t−T)} − λ_2 e^{−α(t−T)}),  (12)

where

  λ_1 = −Â − (Â^2 − γ^2 Q̂)^{1/2},  λ_2 = −Â + (Â^2 − γ^2 Q̂)^{1/2},  α = (Â^2 − γ^2 Q̂)^{1/2}.

If 0 < T < T_max = (1/(2α)) log(λ_1/λ_2), then P(t) given by (12) is well defined on [0, T]. By the local Lipschitz continuity property of the vector field in (11), P(t) is the unique solution.

Note that (11) is not a standard Riccati equation since the state weight matrix −Q̂ is not positive semi-definite. In general, the solvability of (11) cannot be ensured on an arbitrary time horizon. Condition (iii) enables us to determine the solvability of (11) on a given time horizon. Note that condition (iii) is equivalent to det{(0, I)e^{At}(0, I)^T} ≠ 0 for all t ∈ [0, T], by noting that det{(0, I)e^{At}(0, I)^T} = 1 at t = 0. Condition (iii) is more checkable, as illustrated by the following example.

Example 5
Consider system (3)-(4) with scalar parameters, where Q = 1, H = 0, γ = 1, and A, G, Γ are such that Â = A + G < 0 and α = (Â^2 − Q̂)^{1/2} > 0. With H = 0 we have Q̆ = Q̂ and

  A = [ Â    −1
        Q̂   −Â ].

Since A^2 = (Â^2 − Q̂)I = α^2 I, we obtain e^{At} = cosh(αt) I + (sinh(αt)/α) A, and hence

  det{(0, I)e^{At}(0, I)^T} = cosh(αt) − (Â/α) sinh(αt) = ((α − Â)/(2α)) e^{αt} + ((α + Â)/(2α)) e^{−αt} > 0,  ∀ t ≥ 0,  (13)

where positivity holds since the expression equals 1 at t = 0 and has derivative α sinh(αt) − Â cosh(αt) > 0 for t ≥ 0. Thus for any T > 0, (11) admits a unique solution on [0, T]. Therefore, (H1) holds true on [0, T].
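The closed form (12) in Example 4 can be cross-checked numerically. The sketch below uses hypothetical scalar data (Â = 1, Q̂ = 0.64, γ = 1, H = 0, so α = 0.6 and T_max ≈ 1.16; these values are illustrative only, not the examples' original data) and compares (12) against a backward RK4 integration of (11).

```python
import math

# Hypothetical scalar data (illustrative only): hatA = A + G, hatQ = (1-Gamma)^2 Q.
hatA, hatQ, gam, T, H = 1.0, 0.64, 1.0, 1.0, 0.0

alpha = math.sqrt(hatA ** 2 - gam ** 2 * hatQ)      # 0.6
lam1 = -hatA - alpha                                # -1.6
lam2 = -hatA + alpha                                # -0.4
Tmax = math.log(lam1 / lam2) / (2 * alpha)          # ~1.155, so T = 1 is admissible

def P_closed(t):
    """Closed-form solution (12) of the scalar Riccati equation (11)."""
    ep = math.exp(alpha * (t - T))
    em = math.exp(-alpha * (t - T))
    return -hatQ * (ep - em) / (lam1 * ep - lam2 * em)

def P_numeric(t, steps=20000):
    """Integrate dP/dt = -2*hatA*P + gam^2*P^2 + hatQ backward from P(T) = -H."""
    h = (T - t) / steps
    rhs = lambda P: -2 * hatA * P + gam ** 2 * P ** 2 + hatQ
    P = -H
    for _ in range(steps):        # RK4 with a negative time step, from T down to t
        k1 = rhs(P)
        k2 = rhs(P - 0.5 * h * k1)
        k3 = rhs(P - 0.5 * h * k2)
        k4 = rhs(P - h * k3)
        P -= h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return P

# P_closed and P_numeric agree on [0, T] as long as T < Tmax; the denominator
# of (12) vanishes (finite-time blow-up of P) as T approaches Tmax.
```

The same kind of scalar check applies to (13): with Â < 0 and α real, the quantity in condition (iii) is increasing from 1 and never vanishes.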
Assume (H1). Then J̄_i'(u_i, f) is strictly convex in f. Moreover, J̄_i'(u_i, f) is coercive in f and, in particular, there exists a constant C_{u_i,x_i(0)} depending on (u_i, x_i(0)) such that

  J̄_i'(u_i, f) ≥ (ǫ_0/2) ||f||_{L^2}^2 − C_{u_i,x_i(0)}.

Proof.
Since J̄_i''(g) − ǫ_0 ||g||_{L^2}^2 is positive semi-definite by (H1), J̄_i''(g) is positive definite. By Lemma 2, J̄_i'(u_i, f) is strictly convex in f. Following the method in proving Lemma 2, we can further show that χ(f) := J̄_i'(u_i, f) − ǫ_0 ||f||_{L^2}^2 is convex in f. By (7) and direct estimates, we can show

  sup_{||f||_{L^2} ≤ 1} |χ(f)| ≤ C_{1,u_i,x_i(0)},

where the constant C_{1,u_i,x_i(0)} depends on (u_i, x_i(0)). Now consider f with ||f||_{L^2} ≥ 1. Define f_1 = f/||f||_{L^2}. The convexity of χ implies

  χ(f_1) ≤ (1/||f||_{L^2}) χ(f) + (1 − 1/||f||_{L^2}) χ(0) ≤ (1/||f||_{L^2}) χ(f) + C_{1,u_i,x_i(0)}.  (14)

Consequently, for ||f||_{L^2} ≥ 1, (14) gives χ(f) ≥ −2C_{1,u_i,x_i(0)} ||f||_{L^2}. Hence for any f,

  χ(f) ≥ −C_{1,u_i,x_i(0)} (2||f||_{L^2} + 1).

It follows that

  J̄_i'(u_i, f) = χ(f) + ǫ_0 ||f||_{L^2}^2 ≥ ǫ_0 ||f||_{L^2}^2 − C_{1,u_i,x_i(0)}(2||f||_{L^2} + 1) ≥ (ǫ_0/2) ||f||_{L^2}^2 − C_{u_i,x_i(0)}

for some constant C_{u_i,x_i(0)}. □

Theorem 7
Suppose that (H1) holds and let u_i ∈ U and ū be fixed. Then

(i) J̄_i'(u_i, f) has a unique minimizer f̂, or equivalently, J̄_i(u_i, f) has a unique maximizer f̂;

(ii) there exists a unique solution (x_i, m_i, p_i) ∈ L^2_F(0,T;R^n) × C([0,T];R^n) × C([0,T];R^n) to the equation system

  dx_i = (Ax_i + Bu_i + Gm_i + γ^2 p_i) dt + D dW_i,
  dm_i/dt = (A + G)m_i + Bū + γ^2 p_i,
  dp_i/dt = −(A + G)^T p_i − (I − Γ)^T Q[E x_i − (Γm_i + η)],  (15)

where m_i(0) = m_0 and p_i(T) = H E x_i(T), and furthermore f̂ = γ^2 p_i.

Proof. (i) By Lemma 6, J̄_i' is strictly convex and coercive. In addition, J̄_i' is continuous in f. Hence there exists a unique f̂ such that J̄_i'(u_i, f̂) = inf_f J̄_i'(u_i, f) [33, Chap. 7], [39].

(ii) We start by establishing existence. Let the optimal state-control triple be denoted by (x_i, m_i, f̂), which is uniquely determined. We have

  dx_i = (Ax_i + Bu_i + Gm_i + f̂) dt + D dW_i,  (16)
  dm_i/dt = (A + G)m_i + Bū + f̂,  (17)

where m_i(0) = m_0. By using (x_i, m_i), we obtain a unique solution p_i from

  dp_i/dt = −(A + G)^T p_i − (I − Γ)^T Q[E x_i − (Γm_i + η)],  (18)

where p_i(T) = H E x_i(T).

Now we consider another control f = f̂ + f̃ ∈ L^2(0,T;R^n) in place of f̂. Let x̃_i and m̃_i be the first variations of x_i and m_i, respectively, which result from the variation f̃ of f̂. Then we have x̃_i = m̃_i for all t ∈ [0,T] and

  dx̃_i/dt = (A + G)x̃_i + f̃,  x̃_i(0) = 0.

Since J̄_i' has a minimum at (x_i, m_i, f̂), the first variation of the cost satisfies

  0 = δJ̄_i' = E ∫_0^T ( −[x_i − (Γm_i + η)]^T Q(I − Γ)x̃_i + (1/γ^2) f̂^T f̃ ) dt − E x_i^T(T) H x̃_i(T).  (19)

On the other hand,

  d(p_i^T x̃_i)/dt = x̃_i^T (dp_i/dt) + p_i^T (dx̃_i/dt) = −[E x_i − (Γm_i + η)]^T Q(I − Γ)x̃_i + p_i^T f̃.  (20)

Integrating both sides of (20) and invoking (19), we obtain

  p_i^T(T) x̃_i(T) = ∫_0^T ( p_i^T f̃ − (1/γ^2) f̂^T f̃ ) dt + E x_i^T(T) H x̃_i(T).  (21)

Recalling p_i(T) = H E x_i(T), since f̃ is arbitrary, it follows from (21) that

  f̂ = γ^2 p_i  for a.e. t ∈ [0,T].

Therefore, (x_i, m_i, p_i) determined by (16)-(18) is a solution to (15).

We proceed to show uniqueness. Suppose that (x_i', m_i', p_i') is another solution of (15). Set the control f' = γ^2 p_i'. It is straightforward to show that the first variation of J̄_i' at the state-control triple (x_i', m_i', f') is zero. Since J̄_i' is strictly convex, this implies that (x_i', m_i', f') is the unique optimal state-control triple and so coincides with (x_i, m_i, f̂), where (x_i, m_i) is the optimal state process determined from (16)-(18). This further implies p_i' = p_i. So uniqueness follows. The last part of (ii) is now obvious. □

3.3 The control problem of player A_i

Assume that (H1) holds. This will ensure that all the equation systems in this section have a well defined solution. The dynamics are given by

  dx_i = (Ax_i + Bu_i + Gm_i + γ^2 p_i) dt + D dW_i,
  dm_i/dt = (A + G)m_i + Bū + γ^2 p_i,
  dp_i/dt = −(A + G)^T p_i − (I − Γ)^T Q[E x_i − (Γm_i + η)],  (22)

where m_i(0) = m_0 and p_i(T) = H E x_i(T). The optimal control problem is

  (P2)  minimize_{u_i ∈ L^2_F(0,T;R^n)} J̄_i(u_i, f̂_{u_i}) = E ∫_0^T ( |x_i − (Γm_i + η)|_Q^2 + u_i^T R u_i − γ^2 |p_i(t)|^2 ) dt + E x_i^T(T) H x_i(T).

Here we have taken f̂_{u_i} = γ^2 p_i, which depends on u_i. We may simply write J̄_i(u_i). This is again a linear quadratic optimal control problem with indefinite weight for the state vector (x_i, m_i, p_i). Note that a perturbation in u_i will cause a change of the mean term E x_i.
So this is essentially a mean field type optimal control problem; see related work [3, 53].

We continue to identify conditions under which (P2) is strictly convex and coercive. These conditions will be characterized by using an auxiliary control problem with dynamics

  dz_i/dt = Az_i + Bν_i + Gz + γ^2 q,
  dz/dt = (A + G)z + γ^2 q,
  dq/dt = −(A + G)^T q − (I − Γ)^T Q(z_i − Γz),  (23)

where z_i(0) = z(0) = 0 and q(T) = Hz_i(T). The control ν_i ∈ L^2(0,T;R^n). The optimal control problem is

  (P2a)  minimize J̄_i^a(ν_i) = ∫_0^T ( |z_i − Γz|_Q^2 + ν_i^T R ν_i − γ^2 |q(t)|^2 ) dt + |z_i(T)|_H^2.  (24)

We may view this as a deterministic optimal control problem with two point boundary value conditions for the state trajectory. We say J̄_i^a is positive semi-definite if J̄_i^a(ν_i) ≥ 0 for all ν_i; if, furthermore, J̄_i^a(ν_i) > 0 for all ν_i ≠ 0, we say J̄_i^a is positive definite. In order to have a well defined optimal control problem, we need to show that (23) has a unique solution.

Lemma 8
Assume (H1). For each ν_i, there exists a unique solution (z_i, z, q) ∈ C([0,T];R^n) × C([0,T];R^n) × C([0,T];R^n) to (23).

Proof. Indeed, by taking u_i = 0 and u_i = ν_i ∈ L^2(0,T;R^n) in (22), we obtain two solutions (x_i^0, m_i^0, p_i^0) and (x_i^{ν_i}, m_i^{ν_i}, p_i^{ν_i}), respectively. It is easy to show that (z_i, z, q) := (x_i^{ν_i} − x_i^0, m_i^{ν_i} − m_i^0, p_i^{ν_i} − p_i^0) is a solution of (23), by observing that x_i^{ν_i} − x_i^0 is deterministic.

If there exist two different solutions to (23) for some ν_i, then we can construct two different solutions to (22) for a given u_i, which is a contradiction to Theorem 7. □

Lemma 9 J̄_i(u_i) is convex (resp., strictly convex) in u_i ∈ U if and only if J̄_i^a(ν_i) is positive semi-definite (resp., positive definite).

Proof. See Appendix A. □

We introduce the following assumption.

(H2)
There exists a small constant δ_0 > 0 such that J̄_i^a(ν_i) − δ_0 ||ν_i||_{L^2}^2 ≥ 0 for all ν_i ∈ L^2(0,T;R^n).

3.4 Representation of the quadratic functional

We intend to find an expression of J̄_i^a(ν_i) so that (H2) can be characterized in a more explicit form. A change of coordinates will make the computation more convenient. Define ž = z_i − z. Then (23) becomes

  dž/dt = Až + Bν_i,
  dz/dt = (A + G)z + γ^2 q,
  dq/dt = −Q̂z − (A + G)^T q − (I − Γ)^T Q ž,  (25)

where ž(0) = z(0) = 0 and q(T) = H(ž(T) + z(T)).

Define the Hamiltonian matrix

  H = [ A + G    γ^2 I
        −Q̂     −(A + G)^T ]

and the matrix ODE dΦ(t)/dt = HΦ(t), where Φ(0) = I. Denote the partition

  Φ(t) = [ Φ_11(t)  Φ_12(t)
           Φ_21(t)  Φ_22(t) ],

where each submatrix Φ_ij is an n×n matrix function. We have

  ž(t) = ∫_0^t e^{A(t−τ)} B ν_i(τ) dτ.  (26)

By solving (z, q) in (25), we obtain

  z(t) = Φ_12(t) q(0) − ∫_0^t Φ_12(t−s)(I − Γ)^T Q ž(s) ds,
  q(t) = Φ_22(t) q(0) − ∫_0^t Φ_22(t−s)(I − Γ)^T Q ž(s) ds,

where q(0) is to be determined. At the terminal time,

  z(T) = Φ_12(T) q(0) − ∫_0^T Φ_12(T−s)(I − Γ)^T Q ž(s) ds

and

  q(T) = Φ_22(T) q(0) − ∫_0^T Φ_22(T−s)(I − Γ)^T Q ž(s) ds
       = Hž(T) + HΦ_12(T) q(0) − H ∫_0^T Φ_12(T−s)(I − Γ)^T Q ž(s) ds,

where the second equality is due to the terminal condition of q. It follows that

  [Φ_22(T) − HΦ_12(T)] q(0) = Hž(T) + ∫_0^T [Φ_22(T−s) − HΦ_12(T−s)](I − Γ)^T Q ž(s) ds.  (27)

Proposition 10 If (H1) holds, Φ_22(T) − HΦ_12(T) is nonsingular.

Proof. Under (H1), (25) has a unique solution by Lemma 8, and accordingly, q(0) is uniquely determined. If Φ_22(T) − HΦ_12(T) were singular, we could find two different solutions of q(0) from (27), which would further give two different solutions to (25), leading to a contradiction. Hence, Φ_22(T) − HΦ_12(T) is nonsingular.
□

By solving q(0) from (27) and further eliminating ž, we write z and q as integrals depending on ν_i. Define the linear operator

  [L(ν_i)](t) = (ž(t), z(t), q(t)).

By standard estimates we can show that L is a linear and bounded operator from L^2(0,T;R^n) to L^2(0,T;R^{3n}). Let L^* be its adjoint operator. Define the operator

  L_T ν_i = ž(T) + z(T).

It can be shown that L_T is a linear and bounded operator from L^2(0,T;R^n) to R^n. Let L_T^* be its adjoint operator. Now J̄_i^a may be represented in terms of the inner product on L^2(0,T;R^n):

  J̄_i^a(ν_i) = ⟨Θν_i, ν_i⟩ + ⟨Rν_i, ν_i⟩ + ⟨Θ_T ν_i, ν_i⟩,  (28)

where

  Θν_i = L^* [ Q             Q(I − Γ)   0
               (I − Γ)^T Q   Q̂         0
               0             0          −γ^2 I ] L ν_i,
  Θ_T ν_i = L_T^* H L_T ν_i.

Proposition 11 (i) J̄_i(u_i) is convex in u_i ∈ U if and only if ⟨(Θ + Θ_T + R)ν_i, ν_i⟩ ≥ 0 for all ν_i ∈ L^2(0,T;R^n). (ii) (H2) holds if and only if there exists δ_0 > 0 such that ⟨(Θ + Θ_T + R)ν_i, ν_i⟩ ≥ δ_0 ||ν_i||_{L^2}^2 for all ν_i ∈ L^2(0,T;R^n).

Proof. (i) follows from Lemma 9 and the representation (28). (ii) follows from (28). □

The criterion in part (ii) of Proposition 11 still involves the operators Θ and Θ_T on an infinite dimensional space. Here we give a sufficient condition to ensure (H2) based on some more computable parameters. It is clear that

  ⟨(Θ + Θ_T + R)ν_i, ν_i⟩ ≥ ∫_0^T ( |ν_i(t)|_R^2 − γ^2 |q(t)|^2 ) dt.

For simplicity, we only consider the case H = 0, and simple computations lead to

  q(t) = Φ_22(t)Φ_22^{-1}(T) ∫_0^T Φ_22(T−s)(I − Γ)^T Q ∫_0^s e^{A(s−τ)}Bν_i(τ) dτ ds − ∫_0^t Φ_22(t−s)(I − Γ)^T Q ∫_0^s e^{A(s−τ)}Bν_i(τ) dτ ds
       =: q_1(t) − q_2(t).

Denote b_0 = sup_{0≤t≤T}|Φ_22(t)|, b_1 = sup_{0≤t≤T}|Φ_22(t)Φ_22^{-1}(T)|, b_2 = |Q(I − Γ)|, b_3 = ∫_0^T |e^{As}B| ds and b_4 = sup_{0≤t≤T}|e^{At}B|. By exchanging the order of integration in q_1 and q_2, it is easy to show

  |q_1(t)| ≤ (b_0 b_1 b_2 b_4) T ∫_0^T |ν_i(s)| ds,  |q_2(t)| ≤ b_0 b_2 b_4 ∫_0^t (t − τ)|ν_i(τ)| dτ,

which further gives

  ∫_0^T |q(t)|^2 dt ≤ C_q ∫_0^T |ν_i(t)|^2 dt,  (29)

where C_q = 2(b_0 b_1 b_2 b_4)^2 T^4 + (b_0 b_2 b_4)^2 T^4. For the case H = 0, (H2) holds whenever R > γ^2 C_q I.

3.5 The solution of (P2)

Let ū ∈ C([0,T];R^n) be fixed.

Lemma 12
Assume (H1)-(H2). Then (P2) has a unique optimal state-control pair of the form (x_i, m_i, p_i, û_i) satisfying

  dx_i = (Ax_i + Bû_i + Gm_i + γ^2 p_i) dt + D dW_i,
  dm_i/dt = (A + G)m_i + Bū + γ^2 p_i,
  dp_i/dt = −(A + G)^T p_i − (I − Γ)^T Q[E x_i − (Γm_i + η)],  (30)

where m_i(0) = m_0 and p_i(T) = H E x_i(T). Furthermore, the backward stochastic differential equation (BSDE)

  dy_i = { −A^T y_i + Q[x_i − (Γm_i + η)] } dt + ζ_i dW_i,  y_i(T) = −Hx_i(T)  (31)

has a unique solution (y_i, ζ_i) ∈ L^2_F(0,T;R^n) × L^2_F(0,T;R^{n×n}) and

  û_i = R^{-1} B^T y_i.  (32)

Proof.
Under (H2), by adapting Lemma 9 to the auxiliary control problem with cost functional J̄_i(u_i) − δ_0 E∫_0^T |u_i|^2 dt, we can show that J̄_i(u_i) − δ_0 E∫_0^T |u_i|^2 dt is convex in u_i. By the method in proving Lemma 6, we can further show that J̄_i is strictly convex and coercive in u_i. Hence (P2) has a unique optimal state-control pair (x_i, m_i, p_i, û_i) which minimizes J̄_i(u_i).

Given (x_i, m_i, p_i, û_i), (31) is a standard linear BSDE and so has a unique solution (y_i, ζ_i). Further define the BSDE

  dy = { −G^T y_i − (A + G)^T y − Γ^T Q[x_i − (Γm_i + η)] } dt + ζ dW_i,

where y(T) = 0. It also has a unique solution (y, ζ) ∈ L^2_F(0,T;R^n) × L^2_F(0,T;R^{n×n}). It can be checked that

  d/dt [E(y + y_i) + p_i] = −(A + G)^T [E(y + y_i) + p_i]

and E(y(T) + y_i(T)) + p_i(T) = 0. So

  E(y_i + y) + p_i = 0  (33)

for all t ∈ [0,T].

Let û_i be replaced by û_i + ũ_i ∈ L^2_F(0,T;R^n) in (30), and let the resulting solution be denoted by (x_i + x̃_i, m_i + m̃_i, p_i + p̃_i), which exists and is unique by Theorem 7. It follows that

  dx̃_i/dt = Ax̃_i + Bũ_i + Gm̃_i + γ^2 p̃_i,
  dm̃_i/dt = (A + G)m̃_i + γ^2 p̃_i,
  dp̃_i/dt = −(A + G)^T p̃_i − (I − Γ)^T Q(E x̃_i − Γm̃_i),

where x̃_i(0) = m̃_i(0) = 0 and p̃_i(T) = H E x̃_i(T). The first variation of J̄_i about û_i satisfies

  0 = δJ̄_i = E ∫_0^T { (x̃_i − Γm̃_i)^T Q[x_i − (Γm_i + η)] + ũ_i^T R û_i − γ^2 p̃_i^T p_i } dt + E x̃_i^T(T) H x_i(T).  (34)

By applying Ito's formula to x̃_i^T y_i, we obtain

  E x̃_i^T(T) y_i(T) − E x̃_i^T(0) y_i(0) = E ∫_0^T { x̃_i^T Q[x_i − (Γm_i + η)] + y_i^T(Bũ_i + Gm̃_i + γ^2 p̃_i) } dt.

Similarly,

  E m̃_i^T(T) y(T) − E m̃_i^T(0) y(0) = E ∫_0^T { γ^2 y^T p̃_i − m̃_i^T ( G^T y_i + Γ^T Q[x_i − (Γm_i + η)] ) } dt.

Therefore, adding up the two equations yields

  −E x̃_i^T(T) H x_i(T) = E ∫_0^T { (x̃_i − Γm̃_i)^T Q[x_i − (Γm_i + η)] + y_i^T Bũ_i + γ^2 (y + y_i)^T p̃_i } dt.  (35)

By (34) and (35),

  E ∫_0^T [ ũ_i^T R û_i − γ^2 p̃_i^T p_i − ũ_i^T B^T y_i − γ^2 p̃_i^T(y + y_i) ] dt = 0.

Note that by (33),

  E ∫_0^T p̃_i^T (p_i + y + y_i) dt = ∫_0^T p̃_i^T [p_i + E(y + y_i)] dt = 0.

Hence,

  E ∫_0^T ũ_i^T (R û_i − B^T y_i) dt = 0.

Since ũ_i ∈ L^2_F(0,T;R^n) is arbitrary, (32) follows. □

After substituting û_i = R^{-1}B^T y_i into (30), we form the equation system

  dx_i = (Ax_i + BR^{-1}B^T y_i + Gm_i + γ^2 p_i) dt + D dW_i,
  dm_i/dt = (A + G)m_i + Bū + γ^2 p_i,
  dp_i/dt = −(A + G)^T p_i − (I − Γ)^T Q[E x_i − (Γm_i + η)],
  dy_i = { −A^T y_i + Q[x_i − (Γm_i + η)] } dt + ζ_i dW_i,  (36)

where x_i(0) is given, m_i(0) = m_0, p_i(T) = H E x_i(T), and y_i(T) = −Hx_i(T). This equation system consists of 2 forward equations and 2 backward equations. It is clear that the solution of the optimal control problem (P2) satisfies the above FBSDE. A natural question is whether this FBSDE's solution completely determines the optimal control. This is answered by the next theorem. Denote

  S[0,T] = L^2_F(0,T;R^n) × C([0,T];R^n) × C([0,T];R^n) × L^2_F(0,T;R^n) × L^2_F(0,T;R^{n×n}).

Theorem 13
Assume (H1)--(H2). Then the FBSDE (36) has a unique solution $(x_i,m_i,p_i,y_i,\zeta_i)\in\mathcal S[0,T]$, and the optimal control for (P2) is given by $\hat u_i=R^{-1}B^Ty_i$.

Proof. We solve (P1) first and (P2) next to determine $\hat u_i$. By Lemma 12, we obtain $(x_i,m_i,p_i,y_i,\zeta_i)$ satisfying (30)--(31) and $\hat u_i=R^{-1}B^Ty_i$. Obviously, $(x_i,m_i,p_i,y_i,\zeta_i)$ satisfies (36).

We continue to show uniqueness. Suppose that $(x_i,m_i,p_i,y_i,\zeta_i)$ and $(x_i',m_i',p_i',y_i',\zeta_i')$ are two solutions of (36). Define $\check u_i=R^{-1}B^Ty_i$ and $u_i'=R^{-1}B^Ty_i'$, which are both well-defined elements of $L^2_{\mathcal F}(0,T;\mathbb R^n)$. In particular, we have
\[ dx_i=(Ax_i+B\check u_i+Gm_i+\gamma p_i)dt+DdW_i,\quad \dot m_i=(A+G)m_i+B\bar u+\gamma p_i, \]
\[ \dot p_i=-(A+G)^Tp_i-(I-\Gamma)^TQ[Ex_i-(\Gamma m_i+\eta)],\quad dy_i=\{-A^Ty_i+Q[x_i-(\Gamma m_i+\eta)]\}dt+\zeta_idW_i, \tag{37} \]
where $x_i(0)$ is given, $m_i(0)=m_0$, $p_i(T)=HEx_i(T)$, and $y_i(T)=-Hx_i(T)$.

As in the proof of Lemma 12, we evaluate the first variation of $\bar J_i(u_i)$ at $(x_i,m_i,p_i,\check u_i)$ and can show $\delta\bar J_i=0$. Since $\bar J_i$ is convex, this zero first variation condition implies that $\check u_i$ is an optimal control of (P2). By the same reasoning, $u_i'$ is also an optimal control. By strict convexity, we have $\check u_i=u_i'$. Subsequently, $(x_i,m_i,p_i)=(x_i',m_i',p_i')$ by Theorem 7, which further implies $(y_i,\zeta_i)=(y_i',\zeta_i')$. $\Box$

The Solution of the Robust Game
Note that Theorem 13 determines the strategy of a representative agent when $\bar u$ is fixed. Denote
\[ x^{(N)}=\frac1N\sum_{i=1}^Nx_i,\quad y^{(N)}=\frac1N\sum_{i=1}^Ny_i,\quad m^{(N)}=\frac1N\sum_{i=1}^Nm_i,\quad p^{(N)}=\frac1N\sum_{i=1}^Np_i. \tag{38} \]
By (36), we obtain
\[ dx^{(N)}=\big(Ax^{(N)}+BR^{-1}B^Ty^{(N)}+Gm^{(N)}+\gamma p^{(N)}\big)dt+\frac DN\sum_{i=1}^NdW_i,\quad \frac{dm^{(N)}}{dt}=(A+G)m^{(N)}+B\bar u+\gamma p^{(N)}, \]
\[ \frac{dp^{(N)}}{dt}=-(A+G)^Tp^{(N)}-(I-\Gamma)^TQ\big[Ex^{(N)}-(\Gamma m^{(N)}+\eta)\big],\quad dy^{(N)}=\big\{-A^Ty^{(N)}+Q[x^{(N)}-(\Gamma m^{(N)}+\eta)]\big\}dt+\frac1N\sum_{i=1}^N\zeta_idW_i, \tag{39} \]
where $x^{(N)}(0)=(1/N)\sum_{i=1}^Nx_i(0)$, $m^{(N)}(0)=m_0$, $p^{(N)}(T)=HEx^{(N)}(T)$, and $y^{(N)}(T)=-Hx^{(N)}(T)$. As an approximation to (39), we construct the following limiting system
\[ \dot{\mathbf x}=A\mathbf x+BR^{-1}B^T\mathbf y+G\mathbf m+\gamma\mathbf p,\quad \dot{\mathbf m}=(A+G)\mathbf m+B\bar u+\gamma\mathbf p, \]
\[ \dot{\mathbf p}=-(A+G)^T\mathbf p-(I-\Gamma)^TQ[\mathbf x-(\Gamma\mathbf m+\eta)],\quad \dot{\mathbf y}=-A^T\mathbf y+Q[\mathbf x-(\Gamma\mathbf m+\eta)], \tag{40} \]
where $\mathbf x(0)=\mathbf m(0)=m_0$, $\mathbf p(T)=H\mathbf x(T)$, and $\mathbf y(T)=-H\mathbf x(T)$. This is a two-point boundary value problem.

Note that $\mathbf y$ is intended as an approximation of $y^{(N)}$ as $N\to\infty$. The consistency requirement imposes
\[ \bar u=R^{-1}B^T\mathbf y. \tag{41} \]
Under the condition (41), the first two equations in (40) coincide, giving $\mathbf x=\mathbf m$ for all $t\in[0,T]$. Consequently, we eliminate the equation of $\mathbf x$ and introduce the new system
\[ \dot{\mathbf m}=(A+G)\mathbf m+BR^{-1}B^T\mathbf y+\gamma\mathbf p,\quad \dot{\mathbf p}=-(A+G)^T\mathbf p-(I-\Gamma)^TQ[\mathbf m-(\Gamma\mathbf m+\eta)],\quad \dot{\mathbf y}=-A^T\mathbf y+Q[\mathbf m-(\Gamma\mathbf m+\eta)], \tag{42} \]
where $\mathbf m(0)=m_0$, $\mathbf p(T)=H\mathbf m(T)$, and $\mathbf y(T)=-H\mathbf m(T)$. This is still a two-point boundary value problem. The next corollary follows from Theorem 13.

Corollary 14
Assume (H1)--(H2). Suppose that (42) has a unique solution $(\mathbf m,\mathbf p,\mathbf y)\in(C([0,T];\mathbb R^n))^3$ and take $\bar u=R^{-1}B^T\mathbf y$ in (36). Then (36) has a unique solution $(x_i,m_i,p_i,y_i,\zeta_i)\in\mathcal S[0,T]$. $\Box$

Consider the special case where all agents have the same initial condition $x_i(0)=m_0$ for all $i\ge1$. For each fixed $\bar u\in C([0,T];\mathbb R^n)$, solve (36) and define the map $\Lambda(\bar u)=R^{-1}B^TEy_i$. Clearly $R^{-1}B^TEy_i$ is a continuous $\mathbb R^n$-valued function of $t\in[0,T]$. By the consistency requirement $\bar u=\Lambda(\bar u)$, we set $\bar u=R^{-1}B^TEy_i$ in the second equation of (36) to obtain the equation system of the mean field game:
\[ dx_i=(Ax_i+BR^{-1}B^Ty_i+Gm_i+\gamma p_i)dt+DdW_i,\quad \dot m_i=(A+G)m_i+BR^{-1}B^TEy_i+\gamma p_i, \]
\[ \dot p_i=-(A+G)^Tp_i-(I-\Gamma)^TQ[Ex_i-(\Gamma m_i+\eta)],\quad dy_i=\{-A^Ty_i+Q[x_i-(\Gamma m_i+\eta)]\}dt+\zeta_idW_i, \tag{43} \]
where $x_i(0)=m_i(0)=m_0$, $p_i(T)=HEx_i(T)$, and $y_i(T)=-Hx_i(T)$.

An interesting fact is that the existence and uniqueness of a solution to (43) are completely determined by the ODE system (42), without further use of (H1)--(H2).

Theorem 15  (43) has a unique solution $(x_i,m_i,p_i,y_i,\zeta_i)\in\mathcal S[0,T]$ if and only if (42) has a unique solution.

Proof. By Lemma B.1, (43) has a unique solution if and only if the FBSDE (B.1) has a unique solution. By Lemma B.2 and Lemma B.3-(iii), the FBSDE (B.1) has a unique solution if and only if (42) has a unique solution. The theorem follows. $\Box$

Solvability of (42)
To study the existence and uniqueness of a solution to (42), we use a fixed point approach and introduce the equation system
\[ \dot{\mathbf m}=(A+G)\mathbf m+h+\gamma\mathbf p,\quad \dot{\mathbf p}=-(A+G)^T\mathbf p-\widehat Q\mathbf m+(I-\Gamma)^TQ\eta,\quad \dot{\mathbf y}=-A^T\mathbf y+Q[\mathbf m-(\Gamma\mathbf m+\eta)], \tag{44} \]
where $h\in C([0,T];\mathbb R^n)$, $\widehat Q=(I-\Gamma)^TQ(I-\Gamma)$, $\mathbf m(0)=m_0$, $\mathbf p(T)=H\mathbf m(T)$, and $\mathbf y(T)=-H\mathbf m(T)$. The next lemma identifies a sufficient condition for (44) to have a unique solution for any $h\in C([0,T];\mathbb R^n)$.

Lemma 16
Suppose that the Riccati equation
\[ \dot K+K(A+G)+(A+G)^TK-\gamma K^2-\widehat Q=0,\quad K(T)=-H, \tag{45} \]
has a unique solution on $[0,T]$. Then (44) defines a mapping from $C([0,T];\mathbb R^n)$ to itself: $\Lambda:h\mapsto BR^{-1}B^T\mathbf y$, where $(\mathbf m,\mathbf p,\mathbf y)$ solves (44) with input $h$.

Proof. We write $\mathbf p=-K\mathbf m+\phi$ for (44) and obtain the ODE
\[ \dot\phi=-(A+G-\gamma K)^T\phi+Kh+(I-\Gamma)^TQ\eta,\quad \phi(T)=0. \]
It follows that $\dot{\mathbf m}=(A+G-\gamma K)\mathbf m+h+\gamma\phi$. Let the fundamental solution matrices of the two ODEs
\[ \dot\varphi=(A+G-\gamma K)\varphi,\qquad \dot\psi=-(A+G-\gamma K)^T\psi \]
be $\Phi(t,s)$ and $\Psi(t,s)$, respectively, with $\Phi(s,s)=\Psi(s,s)=I$. Then $\Psi(t,s)=\Phi^T(s,t)$. We obtain
\[ \phi(t)=-\int_t^T\Psi(t,s)[K(s)h(s)+(I-\Gamma)^TQ\eta]ds, \]
\[ \mathbf m(t)=\Phi(t,0)m_0+\int_0^t\Phi(t,s)h(s)ds-\gamma\int_0^t\Phi(t,s)\int_s^T\Psi(s,s_1)[K(s_1)h(s_1)+(I-\Gamma)^TQ\eta]ds_1ds. \]
We further solve
\[ \mathbf y(t)=-\int_t^Te^{-A^T(t-s)}Q[(I-\Gamma)\mathbf m(s)-\eta]ds-e^{-A^T(t-T)}H\mathbf m(T), \]
which implies $\mathbf y\in C([0,T];\mathbb R^n)$. The lemma follows. $\Box$

To simplify the existence analysis for (42) in this section, we consider the case $H=0$. Below $\Upsilon_k$ denotes a continuous function of $t$ which does not depend on $h$ and can be easily determined. Consequently,
\[ \mathbf y(t)=-\int_t^Te^{-A^T(t-s)}Q[(I-\Gamma)\mathbf m(s)-\eta]ds=-\int_t^Te^{-A^T(t-s)}Q(I-\Gamma)\mathbf m(s)ds+\Upsilon_0(t) \]
\[ =-\int_t^Te^{-A^T(t-s)}Q(I-\Gamma)\int_0^s\Phi(s,s_1)h(s_1)ds_1ds+\gamma\int_t^Te^{-A^T(t-s)}Q(I-\Gamma)\int_0^s\Phi(s,s_1)\int_{s_1}^T\Psi(s_1,s_2)K(s_2)h(s_2)ds_2ds_1ds+\Upsilon_1(t). \]
Now we have
\[ \Lambda(h)(t)=BR^{-1}B^T\mathbf y(t)=-BR^{-1}B^T\int_t^Te^{-A^T(t-s)}Q(I-\Gamma)\int_0^s\Phi(s,s_1)h(s_1)ds_1ds \]
\[ +\gamma BR^{-1}B^T\int_t^Te^{-A^T(t-s)}Q(I-\Gamma)\int_0^s\Phi(s,s_1)\int_{s_1}^T\Psi(s_1,s_2)K(s_2)h(s_2)ds_2ds_1ds+BR^{-1}B^T\Upsilon_1(t)=:\Lambda_1(h)(t)+BR^{-1}B^T\Upsilon_1(t). \]
It is clear that $\Lambda_1$ maps $C([0,T];\mathbb R^n)$ to itself.
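The Riccati equation (45) is the computational core of the decoupling above. As a quick sanity check, one can integrate it backward in time and verify that a bounded solution exists on the whole interval; the sketch below does this for the scalar case with backward Euler steps. All parameter values are illustrative assumptions, not taken from the paper's examples.

```python
# Backward Euler integration of the scalar form of the Riccati equation (45):
#   K' + 2(A+G)K - gamma*K^2 - Q_hat = 0,  K(T) = -H.
# All parameter values below are illustrative assumptions.

def solve_riccati(A, G, gamma, Q_hat, H, T, steps=20000):
    """Integrate K from t = T back to t = 0; return (K(0), max_t |K(t)|)."""
    dt = T / steps
    K = -H                      # terminal condition K(T) = -H
    max_abs = abs(K)
    for _ in range(steps):
        # from (45): K' = -2(A+G)K + gamma*K^2 + Q_hat
        dK = -2.0 * (A + G) * K + gamma * K * K + Q_hat
        K -= dt * dK            # one backward step in time
        max_abs = max(max_abs, abs(K))
    return K, max_abs

K0, c0 = solve_riccati(A=0.5, G=0.2, gamma=0.1, Q_hat=1.0, H=0.0, T=1.0)
print(K0, c0)
```

For these assumed values $K$ stays bounded on $[0,1]$ (numerically $K(0)\approx-2.34$), so the hypothesis of Lemma 16 holds; for larger $\gamma$ or $\widehat Q$ the backward flow can blow up before reaching $t=0$, in which case (45) has no solution on the whole interval.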
Define the constants
\[ c_0=\max_{t\in[0,T]}|K(t)|,\quad c_1=\max_{0\le t,s\le T}|\Phi(t,s)|,\quad c_2=\max_{t\in[0,T]}\int_t^T|e^{A(s-t)}|s\,ds,\quad c_3=\max_{t\in[0,T]}\int_t^T|e^{A(s-t)}|\Big(Ts-\frac{s^2}2\Big)ds. \]
Note that
$Ts-\frac{s^2}2\ge0$ for $s\in[0,T]$. Denote $\|h\|_\infty=\max_{t\in[0,T]}|h(t)|$.

Theorem 17
Assume $H=0$. If
\[ c_1|BR^{-1}B^T|\cdot|Q(I-\Gamma)|\,(c_2+\gamma c_0c_1c_3)<1, \tag{46} \]
then (42) has a unique solution.

Proof. For each $t$,
\[ |\Lambda_1(h)(t)|\le c_1\|h\|_\infty|BR^{-1}B^T|\cdot|Q(I-\Gamma)|\int_t^T|e^{A(s-t)}|s\,ds+\gamma c_0c_1^2\|h\|_\infty|BR^{-1}B^T|\cdot|Q(I-\Gamma)|\int_t^T|e^{A(s-t)}|\Big(Ts-\frac{s^2}2\Big)ds \]
\[ \le c_1|BR^{-1}B^T|\cdot|Q(I-\Gamma)|\,(c_2+\gamma c_0c_1c_3)\,\|h\|_\infty. \]
Hence $\Lambda_1$, and therefore $\Lambda$ (whose $h$-dependence is through $\Lambda_1$), is a contraction, and $\Lambda$ has a unique fixed point. So (42) has a unique solution. $\Box$

The constants $c_0,\dots,c_3$ in (46) do not depend on $BR^{-1}B^T$. If $BR^{-1}B^T$ is suitably small, (46) can be ensured.

Example 18
Consider the system with parameters given by Example 4. Take $T=1$. In analogue to (12), we can solve $K(t)$ on $[0,T]$ for (45). It can be shown that $K(t)\le0$ for $t\in[0,T]$ and that $|K(t)|$ attains its maximum on $[0,T]$ at $t=0$, which gives $c_0=|K(0)|$. Furthermore, $c_1\le e^{(A+G+\gamma c_0)T}$, and
\[ c_2\le\int_0^Te^{As}s\,ds,\qquad c_3\le\int_0^Te^{As}\Big(Ts-\frac{s^2}2\Big)ds. \]
With the resulting numerical values, $c_1|BR^{-1}B^T|\cdot|Q(I-\Gamma)|(c_2+\gamma c_0c_1c_3)<1$. So (46) holds.

Remark 1
For the two-point boundary value problem, the contraction estimate in the fixed point method may be conservative and typically works on small time intervals for the solvability of (42) (see, e.g., [40, Ch. 1, Sec. 5]).

We continue to derive another condition under which (42) is solvable without restriction to a small time horizon. To this end, we first rewrite (42) in the following form:
\[ \begin{pmatrix}\dot{\mathbf m}\\ \dot{\mathbf p}\\ \dot{\mathbf y}\end{pmatrix}=\widetilde A\begin{pmatrix}\mathbf m\\ \mathbf p\\ \mathbf y\end{pmatrix}+\widetilde\eta,\qquad \mathbf m(0)=m_0,\ \mathbf p(T)=H\mathbf m(T),\ \mathbf y(T)=-H\mathbf m(T), \tag{47} \]
where
\[ \widetilde A=\begin{pmatrix}A+G&\gamma I&BR^{-1}B^T\\ -(I-\Gamma)^TQ(I-\Gamma)&-(A+G)^T&0\\ Q(I-\Gamma)&0&-A^T\end{pmatrix},\qquad \widetilde\eta=\begin{pmatrix}0\\ (I-\Gamma)^TQ\eta\\ -Q\eta\end{pmatrix}. \]
Then, by the variation of constants formula, we have
\[ \begin{pmatrix}\mathbf m(t)\\ \mathbf p(t)\\ \mathbf y(t)\end{pmatrix}=\Theta(t)\begin{pmatrix}m_0\\ \mu\\ \nu\end{pmatrix}+\Theta(t)\int_0^t\Theta^{-1}(s)\widetilde\eta\,ds, \tag{48} \]
where $\Theta(t)=e^{\widetilde At}$ and $\mathbf p,\mathbf y$ have the (unknown) initial conditions $\mathbf p(0)=\mu$, $\mathbf y(0)=\nu$. Noting the terminal condition in (47), we now present the following result.

Proposition 19 ([40, Ch. 2, Sec. 3]) If for given $T>0$,
\[ \det(\widetilde\Theta(T))\ne0,\qquad\text{where}\quad \widetilde\Theta(T)=\begin{pmatrix}-H&I&0\\ H&0&I\end{pmatrix}\Theta(T)\begin{pmatrix}0&0\\ I&0\\ 0&I\end{pmatrix}, \]
then (42) has a unique solution on $[0,T]$ for any initial value $m_0$.

For illustration, we give the following example.
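The determinant condition in Proposition 19 is straightforward to test numerically: integrate $\dot\Theta=\widetilde A\Theta$, $\Theta(0)=I$, and evaluate $\det\widetilde\Theta(T)$. The sketch below (a hedged illustration, not the paper's computation) treats a scalar case with $\Gamma=1$ and $H=0$ as in Example 20 below, where $\widetilde A$ is upper triangular and the determinant has the closed form $e^{-(2A+G)T}$; all parameter values, including `b2` standing in for $R^{-1}B^2$, are assumptions.

```python
import math

a, g, gamma, b2 = 0.4, 0.3, 0.2, 0.5   # illustrative scalars; b2 plays the role of R^{-1}B^2
T = 1.0

# tilde-A for the (m, p, y) system (47) when Gamma = 1, so Q(I - Gamma) = 0
At = [[a + g, gamma, b2],
      [0.0, -(a + g), 0.0],
      [0.0, 0.0, -a]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Integrate Theta' = At * Theta, Theta(0) = I, by explicit Euler
steps = 50000
h = T / steps
Theta = [[float(i == j) for j in range(3)] for i in range(3)]
for _ in range(steps):
    dTheta = mat_mul(At, Theta)
    Theta = [[Theta[i][j] + h * dTheta[i][j] for j in range(3)] for i in range(3)]

# With H = 0, tilde-Theta(T) is the lower-right 2x2 block of Theta(T)
det = Theta[1][1] * Theta[2][2] - Theta[1][2] * Theta[2][1]
expected = math.exp(-(2 * a + g) * T)   # closed form in this triangular case
print(det, expected)
```

In the general (non-triangular) case one would form the full $\widetilde\Theta(T)$ with the $H$-blocks and check that its determinant stays away from zero.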
Example 20
Consider system (3)--(4) with all parameters scalar-valued and $\Gamma=1$, $H=0$. We calculate
\[ \mathbf A=\begin{pmatrix}A+G&-\gamma\\ 0&-(A+G)\end{pmatrix},\qquad \widetilde A=\begin{pmatrix}A+G&\gamma&R^{-1}B^2\\ 0&-(A+G)&0\\ 0&0&-A\end{pmatrix}, \]
where $\mathbf A$ is defined in Proposition 3. By direct computation, we obtain
\[ \det\{(0,I)e^{\mathbf At}(0,I)^T\}=e^{-(A+G)t}>0\quad\text{for all }t\in[0,T], \]
which ensures (H1) by Proposition 3. Moreover, $b=0$ gives $C_q=0$ in (29), so that (H2) always holds for $R>0$. Finally, $\det(\widetilde\Theta(t))=e^{-(2A+G)t}>0$, and subsequently (42) has a unique solution on any interval $[0,T]$. To summarize, (H1), (H2) and the solvability of (42) are all satisfied by this system.

Note that the solvability of (42) in Example 20 does not depend on the value of $R^{-1}B^2$, which differs from the condition in Theorem 17.

We suppose that (42) has a unique solution $(\mathbf m,\mathbf p,\mathbf y)$ and accordingly take $\bar u$ in (36) as
\[ \bar u^*=R^{-1}B^T\mathbf y. \tag{49} \]
The FBSDE system (36) now becomes
\[ dx_i=(Ax_i+BR^{-1}B^Ty_i+Gm_i+\gamma p_i)dt+DdW_i,\quad \dot m_i=(A+G)m_i+B\bar u^*+\gamma p_i, \]
\[ \dot p_i=-(A+G)^Tp_i-(I-\Gamma)^TQ[Ex_i-(\Gamma m_i+\eta)],\quad dy_i=\{-A^Ty_i+Q[x_i-(\Gamma m_i+\eta)]\}dt+\zeta_idW_i, \tag{50} \]
where $x_i(0)$ is given, $m_i(0)=m_0$, $p_i(T)=HEx_i(T)$, and $y_i(T)=-Hx_i(T)$. By Corollary 14, this FBSDE has a unique solution. In the game of $N$ players, let $y_i$ be solved from (50) and denote the control for $\mathcal A_i$ by
\[ \hat u_i=R^{-1}B^Ty_i,\quad 1\le i\le N, \tag{51} \]
which is a well defined process in $L^2_{\mathcal F}(0,T;\mathbb R^n)$. For $\hat u^{(N)}=(1/N)\sum_{i=1}^N\hat u_i$, we aim to estimate $E|\hat u^{(N)}(t)-\bar u^*(t)|^2$.

Note that $\hat u_1,\dots,\hat u_N$ are independent, but not necessarily identically distributed, due to possibly different initial states of the agents. This fact will somewhat complicate our error estimate. The key result of this section is the following theorem.

Theorem 21 Assume that (H1)--(H2) hold and that (42) has a unique solution.
We have
\[ \sup_{0\le t\le T}E|\hat u^{(N)}-\bar u^*|^2=O(1/N)+O(|x^{(N)}(0)-m_0|^2), \]
where $x^{(N)}(0)=(1/N)\sum_{i=1}^Nx_i(0)$. $\Box$

The proof of Theorem 21 is provided in the remaining part of this section. To do this, we need to prove some lemmas under the assumptions of the theorem. Recalling (38), we take $\bar u=\bar u^*$ in (39) to write
\[ dx^{(N)}=\big(Ax^{(N)}+BR^{-1}B^Ty^{(N)}+Gm^{(N)}+\gamma p^{(N)}\big)dt+\frac DN\sum_{i=1}^NdW_i,\quad \frac{dm^{(N)}}{dt}=(A+G)m^{(N)}+B\bar u^*+\gamma p^{(N)}, \]
\[ \frac{dp^{(N)}}{dt}=-(A+G)^Tp^{(N)}-(I-\Gamma)^TQ\big[Ex^{(N)}-(\Gamma m^{(N)}+\eta)\big],\quad dy^{(N)}=\big\{-A^Ty^{(N)}+Q[x^{(N)}-(\Gamma m^{(N)}+\eta)]\big\}dt+\frac1N\sum_{i=1}^N\zeta_idW_i, \tag{52} \]
where $x^{(N)}(0)=(1/N)\sum_{i=1}^Nx_i(0)$, $m^{(N)}(0)=m_0$, $p^{(N)}(T)=HEx^{(N)}(T)$, and $y^{(N)}(T)=-Hx^{(N)}(T)$. Denote the ODE system
\[ \dot{\mathbf x}_N=A\mathbf x_N+BR^{-1}B^T\mathbf y_N+G\mathbf m_N+\gamma\mathbf p_N,\quad \dot{\mathbf m}_N=(A+G)\mathbf m_N+B\bar u^*+\gamma\mathbf p_N, \]
\[ \dot{\mathbf p}_N=-(A+G)^T\mathbf p_N-(I-\Gamma)^TQ[\mathbf x_N-(\Gamma\mathbf m_N+\eta)],\quad \dot{\mathbf y}_N=-A^T\mathbf y_N+Q[\mathbf x_N-(\Gamma\mathbf m_N+\eta)], \tag{53} \]
where $\mathbf x_N(0)=(1/N)\sum_{i=1}^Nx_i(0)$, $\mathbf m_N(0)=m_0$, $\mathbf p_N(T)=H\mathbf x_N(T)$, and $\mathbf y_N(T)=-H\mathbf x_N(T)$. The initial condition $\mathbf x_N(0)$ is different from that of (40).

Lemma 22 (53) has a unique solution, which can be written as $(\mathbf x_N,\mathbf m_N,\mathbf p_N,\mathbf y_N)=(Ex^{(N)},m^{(N)},p^{(N)},Ey^{(N)})$.

Proof.
Existence follows by taking expectations in (52). To show uniqueness, suppose that (53) has two different solutions $(\mathbf x_N,\mathbf m_N,\mathbf p_N,\mathbf y_N)$ and $(\mathbf x_N',\mathbf m_N',\mathbf p_N',\mathbf y_N')$. Then for any $\lambda\in\mathbb R$,
\[ (x_i,m_i,p_i,y_i,\zeta_i)+\lambda(\mathbf x_N-\mathbf x_N',\,\mathbf m_N-\mathbf m_N',\,\mathbf p_N-\mathbf p_N',\,\mathbf y_N-\mathbf y_N',\,0) \]
would solve (36), contradicting the uniqueness asserted in Corollary 14. $\Box$

Lemma 23
We have
\[ \sup_{0\le t\le T}\big(E|x^{(N)}-Ex^{(N)}|^2+E|y^{(N)}-Ey^{(N)}|^2\big)=O(1/N). \]

Proof.
Define $(\theta_1,\theta_2)=(x^{(N)}-Ex^{(N)},\,y^{(N)}-Ey^{(N)})$. By (52), (53) and Lemma 22,
\[ d\theta_1=(A\theta_1+BR^{-1}B^T\theta_2)dt+\frac DN\sum_{i=1}^NdW_i,\qquad d\theta_2=(-A^T\theta_2+Q\theta_1)dt+\frac1N\sum_{i=1}^N\zeta_idW_i, \]
where $\theta_1(0)=0$ and $\theta_2(T)=-H\theta_1(T)$. Let $P$ be the solution of the Riccati equation
\[ \dot P+A^TP+PA-PBR^{-1}B^TP+Q=0,\quad P(T)=H. \]
Denote $\theta_2=-P\theta_1+\psi$, where $\psi(T)=0$. This gives the equation
\[ d\psi=-(A-BR^{-1}B^TP)^T\psi\,dt+\frac1N\sum_{i=1}^N(PD+\zeta_i)dW_i, \]
where $\psi(T)=0$. There is a unique solution $\psi=0$ for $t\in[0,T]$. This implies
\[ d\theta_1=(A-BR^{-1}B^TP)\theta_1dt+\frac DN\sum_{i=1}^NdW_i. \]
Hence, $\sup_{0\le t\le T}E|\theta_1(t)|^2=O(1/N)$. The lemma follows since $\theta_2=-P\theta_1$. $\Box$

When $(\mathbf m,\mathbf p,\mathbf y)$ is the unique solution of (42), it can be shown that $(\mathbf x,\mathbf m,\mathbf y,\mathbf p):=(\mathbf m,\mathbf m,\mathbf y,\mathbf p)$ is the unique solution of (40) under the condition (41).

Lemma 24
We have
\[ \sup_{0\le t\le T}\big[|\mathbf x_N-\mathbf x|+|\mathbf m_N-\mathbf m|+|\mathbf p_N-\mathbf p|+|\mathbf y_N-\mathbf y|\big]=O(|x^{(N)}(0)-m_0|). \]

Proof.
Consider
\[ \dot h_1=Ah_1+BR^{-1}B^Th_4+Gh_2+\gamma h_3,\quad \dot h_2=(A+G)h_2+\gamma h_3, \]
\[ \dot h_3=-(A+G)^Th_3-(I-\Gamma)^TQ(h_1-\Gamma h_2),\quad \dot h_4=-A^Th_4+Q(h_1-\Gamma h_2), \tag{54} \]
where $h_1(0)$ is given, $h_2(0)=0$, $h_3(T)=Hh_1(T)$, and $h_4(T)=-Hh_1(T)$. It is constructed as a homogeneous version of (53). We claim that (54) has a unique solution for any given value of $h_1(0)$. If this were not true, there would exist $h_1(0)$ such that (54) has multiple solutions which, in turn, could be used to construct multiple solutions to (53). This would contradict Lemma 22.

It is clear that
\[ (\mathbf x_N-\mathbf x,\ \mathbf m_N-\mathbf m,\ \mathbf p_N-\mathbf p,\ \mathbf y_N-\mathbf y)=:(h_1,h_2,h_3,h_4) \]
is a solution of (54) with $h_1(0)=\mathbf x_N(0)-m_0$.

Let $e_1,\dots,e_n$ be the canonical basis of $\mathbb R^n$. For $h_1(0)=e_k$, we obtain a solution of (54), denoted by $h^k=(h_1^k,h_2^k,h_3^k,h_4^k)$. Let $(z)_k$ be the $k$th component of a vector $z$. We may uniquely write $(\mathbf x_N-\mathbf x,\mathbf m_N-\mathbf m,\mathbf p_N-\mathbf p,\mathbf y_N-\mathbf y)$ as a linear combination of $h^1,\dots,h^n$:
\[ (\mathbf x_N-\mathbf x,\ \mathbf m_N-\mathbf m,\ \mathbf p_N-\mathbf p,\ \mathbf y_N-\mathbf y)=\sum_{k=1}^n(\mathbf x_N(0)-m_0)_k\,(h_1^k,h_2^k,h_3^k,h_4^k). \]
The lemma follows readily. $\Box$
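The $O(1/N)$ rate in Lemma 23 ultimately comes from the averaged noise term $(D/N)\sum_idW_i$: the average of $N$ independent Brownian motions at time $T$ is Gaussian with variance $T/N$. A seeded Monte Carlo check of this elementary fact (sample sizes are illustrative assumptions):

```python
import random

random.seed(0)
T, M = 1.0, 5000            # horizon and number of Monte Carlo samples

def avg_bm_second_moment(N):
    """Estimate E |(1/N) sum_i W_i(T)|^2 from M samples; exact value is T/N."""
    total = 0.0
    for _ in range(M):
        s = sum(random.gauss(0.0, T ** 0.5) for _ in range(N)) / N
        total += s * s
    return total / M

v10, v100 = avg_bm_second_moment(10), avg_bm_second_moment(100)
print(v10, v100)            # close to T/10 = 0.1 and T/100 = 0.01
```

Increasing $N$ by a factor of 10 shrinks the second moment by the same factor, which is exactly the mechanism behind $\sup_tE|\theta_1(t)|^2=O(1/N)$.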
Proof of Theorem 21.
For $\bar u=\bar u^*$, we write $\hat u^{(N)}=R^{-1}B^T(1/N)\sum_{i=1}^Ny_i=R^{-1}B^Ty^{(N)}$. We have
\[ E|\hat u^{(N)}-\bar u^*|^2=E|R^{-1}B^T(y^{(N)}-\mathbf y)|^2\le CE|y^{(N)}-\mathbf y|^2=CE|y^{(N)}-Ey^{(N)}+Ey^{(N)}-\mathbf y|^2 \]
\[ \le C(1/N)+C|\mathbf y_N-\mathbf y|^2=O(1/N)+O(|x^{(N)}(0)-m_0|^2). \]
The second inequality follows from Lemmas 22 and 23, and the last step follows from Lemma 24. $\Box$

Robust Nash Equilibrium
Throughout this section, we assume that (42) has a unique solution and take $\bar u=\bar u^*$ determined by (49). For $f\in L^2(0,T;\mathbb R^n)$ and $u_i\in L^2_{\mathcal F}(0,T;\mathbb R^n)$, $1\le i\le N$, recall the worst case cost
\[ J_i^{wo}(u_i,u_{-i})=\sup_{f\in L^2(0,T;\mathbb R^n)}J_i(u_i,u_{-i},f). \]
It is clear that for each $i$ and any $(u_i,u_{-i})$, $\sup_fJ_i(u_i,u_{-i},f)\ge J_i(u_i,u_{-i},0)$. We consider the set of strategies $(\hat u_i,\hat u_{-i})$ given by (51) for a population of $N$ players with dynamics (3). It should be emphasized that we only use (50)--(51) to obtain a well defined process $\hat u_i$ in $L^2_{\mathcal F}(0,T;\mathbb R^n)$, which should not be understood as a feedback strategy. The main result of this section is the next theorem, which characterizes the performance of this set of strategies.

Theorem 25
Assume that (i) (H1)--(H2) hold; (ii) $\sup_{i\ge1}|x_i(0)|\le M$, where $M$ does not depend on $N$; (iii) (42) has a unique solution. Then the set of strategies $(\hat u_1,\dots,\hat u_N)$ given by (51) is a robust $\varepsilon_N$-Nash equilibrium for the $N$ players, i.e.,
\[ J_i^{wo}(\hat u_i,\hat u_{-i})-\varepsilon_N\le\inf_{u_i\in\mathcal U}J_i^{wo}(u_i,\hat u_{-i})\le J_i^{wo}(\hat u_i,\hat u_{-i}), \tag{55} \]
where $0\le\varepsilon_N=O(1/\sqrt N+|x^{(N)}(0)-m_0|)$ and $x^{(N)}(0)=(1/N)\sum_{j=1}^Nx_j(0)$. $\Box$

The rest of this section is devoted to the proof of Theorem 25. For any given $f\in L^2(0,T;\mathbb R^n)$, denote the state processes of (3) corresponding to $(\hat u_i,\hat u_{-i},f)$ by $\hat x_j$, $1\le j\le N$, and $\hat x^{(N)}=(1/N)\sum_{j=1}^N\hat x_j$. Denote
\[ \dot{\bar m}=(A+G)\bar m+B\bar u^*+f,\quad \bar m(0)=m_0. \tag{56} \]
All subsequent lemmas are proved under the assumptions of Theorem 25.

Lemma 26
We have $\sup_{0\le t\le T,\,f}E|\hat x^{(N)}-\bar m|^2\le C(1/N+|x^{(N)}(0)-m_0|^2)$.

Proof.
Note that
\[ d\hat x^{(N)}=[(A+G)\hat x^{(N)}+B\hat u^{(N)}+f]dt+\frac DN\sum_{i=1}^NdW_i. \]
Therefore,
\[ d(\hat x^{(N)}-\bar m)=[(A+G)(\hat x^{(N)}-\bar m)+B(\hat u^{(N)}-\bar u^*)]dt+\frac DN\sum_{i=1}^NdW_i. \]
By linear SDE estimates,
\[ E|\hat x^{(N)}(t)-\bar m(t)|^2\le C|x^{(N)}(0)-m_0|^2+C/N+CE\int_0^t|\hat u^{(N)}(\tau)-\bar u^*(\tau)|^2d\tau. \]
By Theorem 21, the lemma follows. $\Box$

Lemma 27 There exists a constant $\hat C_1$ independent of $N$ such that $\max_{1\le i\le N}\sup_fJ_i(\hat u_i,\hat u_{-i},f)\le\hat C_1$.

Proof.
Denote
\[ dx_i'=(Ax_i'+B\hat u_i+G\bar m+f)dt+DdW_i, \tag{57} \]
where $x_i'(0)=x_i(0)$. By Lemma 26, it is easy to show
\[ \sup_{0\le t\le T,\,f}E|\hat x_i(t)-x_i'(t)|^2\le C(1/N+|x^{(N)}(0)-m_0|^2). \]
We have
\[ J_i(\hat u_i,\hat u_{-i},f)\le\bar J_i(\hat u_i,f)+E\int_0^T|(\hat x_i-x_i')+\Gamma(\bar m-\hat x^{(N)})|_Q^2dt+E|\hat x_i(T)-x_i'(T)|_H^2 \]
\[ +2E\int_0^T[x_i'-(\Gamma\bar m+\eta)]^TQ[(\hat x_i-x_i')+\Gamma(\bar m-\hat x^{(N)})]dt+2E[x_i'^T(T)H(\hat x_i(T)-x_i'(T))]. \tag{58} \]
Combining Lemma 6 with condition (ii) in Theorem 25, we obtain
\[ \bar J_i(\hat u_i,f)\le C-\epsilon_0\|f\|_{L^2}^2 \tag{59} \]
for some $\epsilon_0>0$, where $C$ does not depend on $(i,N)$. Since neither $\hat x_i-x_i'$ nor $\bar m-\hat x^{(N)}$ depends on $f$, there exists a constant $C$ such that
\[ \Big|E\int_0^T[x_i'-(\Gamma\bar m+\eta)]^TQ[(\hat x_i-x_i')+\Gamma(\bar m-\hat x^{(N)})]dt\Big|\le C\Big(E\int_0^T|x_i'-(\Gamma\bar m+\eta)|_Q^2dt\Big)^{1/2}\le C(1+\|f\|_{L^2}^2)^{1/2}\le C+\frac{\epsilon_0}8\|f\|_{L^2}^2, \tag{60} \]
where the second inequality follows from elementary estimates based on the solutions of (56) and (57). Similarly,
\[ E[x_i'^T(T)H(\hat x_i(T)-x_i'(T))]\le C+\frac{\epsilon_0}8\|f\|_{L^2}^2. \tag{61} \]
Finally, combining (58)--(61) with Lemma 26 leads to
\[ J_i(\hat u_i,\hat u_{-i},f)\le C-\frac{\epsilon_0}2\|f\|_{L^2}^2. \]
The lemma follows. $\Box$
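The structure of the estimates above is worth isolating: every $f$-dependent term is bounded by an expression affine in $\|f\|_{L^2}$, while the cost itself pays $-\epsilon_0\|f\|_{L^2}^2$, so the worst-case cost is dominated by a concave quadratic in $\|f\|_{L^2}$ and its supremum is finite. A toy numerical check of this mechanism, with `C0`, `a`, `eps` as arbitrary assumed values:

```python
# A bound of the form C0 + a*x - eps*x^2, with x standing for ||f||_{L^2} >= 0.
# C0, a, eps are arbitrary assumptions for illustration.
C0, a, eps = 5.0, 2.0, 0.5

def upper_bound(x):
    return C0 + a * x - eps * x * x

# brute-force supremum over a grid vs. the closed form C0 + a^2/(4*eps)
grid_sup = max(upper_bound(i * 0.001) for i in range(20001))
closed_form = C0 + a * a / (4 * eps)
print(grid_sup, closed_form)
```

The maximizer $x^*=a/(2\epsilon)$ mirrors the fact that the worst-case disturbance has $L^2$-norm bounded uniformly in the control (cf. (72) below in the proof of Lemma 30).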
Consider the set of strategies $(u_i,\hat u_{-i})$ and the corresponding state processes
\[ dx_i=(Ax_i+Bu_i+Gx^{(N)}+f)dt+DdW_i, \tag{62} \]
\[ dx_j=(Ax_j+B\hat u_j+Gx^{(N)}+f)dt+DdW_j,\quad 1\le j\le N,\ j\ne i. \tag{63} \]

Lemma 28 If $u_i$ in (62) satisfies $\sup_fJ_i(u_i,\hat u_{-i},f)\le\hat C_1$, there exists $\hat C_2$ independent of $N$ such that
\[ E\int_0^T|u_i(t)|^2dt\le\hat C_2. \tag{64} \]

Proof.
Suppose $\sup_fJ_i(u_i,\hat u_{-i},f)\le\hat C_1$. Then for any $f$,
\[ E\int_0^T\Big(|x_i-(\Gamma x^{(N)}+\eta)|_Q^2+u_i^TRu_i-\gamma|f(t)|^2\Big)dt+E[x_i^T(T)Hx_i(T)]\le\hat C_1, \]
where $(x_1,\dots,x_N)$ is generated by $(u_i,\hat u_{-i})$ and $f$. Taking $f=0$, we obtain
\[ E\int_0^T\Big(|x_i-(\Gamma x^{(N)}+\eta)|_Q^2+u_i^TRu_i\Big)dt\le\hat C_1. \]
Therefore, (64) holds. $\Box$
Let $\mathcal U_{\hat C_2}$ denote the set of processes $u_i\in L^2_{\mathcal F}(0,T;\mathbb R^n)$ which satisfy (64). For (62)--(63), denote $x^{(N)}=(1/N)\sum_{j=1}^Nx_j$.

Lemma 29
Suppose $u_i\in\mathcal U_{\hat C_2}$ in (62). Then
\[ \sup_{0\le t\le T,\,f,\,u_i\in\mathcal U_{\hat C_2}}E|x^{(N)}(t)-\bar m(t)|^2=O(1/N+|x^{(N)}(0)-m_0|^2). \]

Proof.
Rewrite (62) in the form
\[ dx_i=[Ax_i+B\hat u_i+Gx^{(N)}+f]dt+B(u_i-\hat u_i)dt+DdW_i. \tag{65} \]
By (63) and (65),
\[ dx^{(N)}=[(A+G)x^{(N)}+B\hat u^{(N)}+f]dt+\frac BN(u_i-\hat u_i)dt+\frac DN\sum_{j=1}^NdW_j, \]
which combined with (56) gives
\[ d(x^{(N)}-\bar m)=[(A+G)(x^{(N)}-\bar m)+B(\hat u^{(N)}-\bar u^*)]dt+\frac BN(u_i-\hat u_i)dt+\frac DN\sum_{j=1}^NdW_j. \]
By Theorem 21 and the fact that $E\int_0^T|u_i-\hat u_i|^2dt\le C$ for all $u_i\in\mathcal U_{\hat C_2}$, where the constants $C$ do not depend on $(f,u_i)$, elementary SDE estimates lead to
\[ \sup_{0\le t\le T,\,f}E|x^{(N)}(t)-\bar m(t)|^2\le C(1/N+|x^{(N)}(0)-m_0|^2), \]
where $C$ does not depend on $u_i$. The lemma follows. $\Box$

Lemma 30
For each $u_i\in\mathcal U_{\hat C_2}$, $\sup_fJ_i(u_i,\hat u_{-i},f)$ is finite and attained by some $f$ depending on $u_i$, denoted by $f_{u_i}$. Moreover,
\[ \sup_{u_i\in\mathcal U_{\hat C_2}}\big|\sup_fJ_i(u_i,\hat u_{-i},f)-\bar J_i(u_i,\hat f_{u_i})\big|=O(1/\sqrt N+|x^{(N)}(0)-m_0|), \]
where $\hat f_{u_i}$ is determined by Theorem 7 for the given $u_i$.

Proof. Note that we have
\[ dx_i=[Ax_i+Bu_i+G\bar m+G(x^{(N)}-\bar m)+f]dt+DdW_i, \tag{66} \]
\[ \dot{\bar m}=(A+G)\bar m+B\bar u^*+f, \]
where $\bar m(0)=m_0$. Define the auxiliary process
\[ dx_i^\dagger=(Ax_i^\dagger+Bu_i+G\bar m+f)dt+DdW_i, \]
where $x_i^\dagger(0)=x_i(0)$ and $(u_i,f,W_i)$ is the same as in (66). By Lemma 29, it is easy to show
\[ \sup_{0\le t\le T,\,f}E|x_i(t)-x_i^\dagger(t)|^2=O(1/N+|x^{(N)}(0)-m_0|^2). \tag{67} \]
We have the relation
\[ |x_i-(\Gamma x^{(N)}+\eta)|_Q^2=|x_i^\dagger-(\Gamma\bar m+\eta)|_Q^2+|(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})|_Q^2+2[x_i^\dagger-(\Gamma\bar m+\eta)]^TQ[(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})]. \]
The cost can be rewritten as
\[ J_i(u_i,\hat u_{-i},f)=\bar J_i(u_i,f)+E\int_0^T|(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})|_Q^2dt+E|x_i(T)-x_i^\dagger(T)|_H^2 \]
\[ +2E\int_0^T[x_i^\dagger-(\Gamma\bar m+\eta)]^TQ[(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})]dt+2E[(x_i^\dagger(T))^TH(x_i(T)-x_i^\dagger(T))] \tag{68} \]
\[ \le\bar J_i(u_i,f)+C(1/N+|x^{(N)}(0)-m_0|^2)+2E\int_0^T[x_i^\dagger-(\Gamma\bar m+\eta)]^TQ[(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})]dt+2E[(x_i^\dagger(T))^TH(x_i(T)-x_i^\dagger(T))], \tag{69} \]
where the inequality follows from Lemma 29 and (67). Note that neither $x_i-x_i^\dagger$ nor $\bar m-x^{(N)}$ in (68) depends on $f$. The terms $x_i^\dagger$ and $x_i^\dagger-(\Gamma\bar m+\eta)$ are affine in $f$, and $-\bar J_i(u_i,f)$ is convex in $f$ by Lemma 6. Consequently, it follows from (68) that $-J_i(u_i,\hat u_{-i},f)$ is convex in $f$. For $u_i\in\mathcal U_{\hat C_2}$, in analogue to (59), we obtain
\[ \bar J_i(u_i,f)\le C-\epsilon_0\|f\|_{L^2}^2, \tag{70} \]
where $C$ does not depend on $u_i$.
We have
\[ \Big|E\int_0^T[x_i^\dagger-(\Gamma\bar m+\eta)]^TQ[(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})]dt\Big|\le\Big\{E\int_0^T|x_i^\dagger-(\Gamma\bar m+\eta)|_Q^2dt\Big\}^{1/2}\Big\{E\int_0^T|(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})|_Q^2dt\Big\}^{1/2} \]
\[ \le C\big(1/\sqrt N+|x^{(N)}(0)-m_0|\big)(1+\|f\|_{L^2}^2)^{1/2}\le C+\frac{\epsilon_0}8\|f\|_{L^2}^2. \]
Similarly,
\[ \big|E[(x_i^\dagger(T))^TH(x_i(T)-x_i^\dagger(T))]\big|\le C+\frac{\epsilon_0}8\|f\|_{L^2}^2. \]
Hence, (69) gives
\[ J_i(u_i,\hat u_{-i},f)\le C-\frac{\epsilon_0}2\|f\|_{L^2}^2, \tag{71} \]
where $C$ does not depend on $(N,u_i)$. So for given $u_i\in\mathcal U_{\hat C_2}$, $J_i(u_i,\hat u_{-i},f)$ attains a finite supremum at some $f_{u_i}$ since it is a continuous functional of $f$, and by (71) we may further find a constant $\hat C_3$ such that
\[ \sup_{u_i\in\mathcal U_{\hat C_2}}\|f_{u_i}\|_{L^2}\le\hat C_3. \tag{72} \]
By (69),
\[ J_i(u_i,\hat u_{-i},f)\le\bar J_i(u_i,f)+C(1/N+|x^{(N)}(0)-m_0|^2)+C\big(1/N+|x^{(N)}(0)-m_0|^2\big)^{1/2}\Big(E\int_0^T|x_i^\dagger-(\Gamma\bar m+\eta)|_Q^2dt\Big)^{1/2}+C\big(1/N+|x^{(N)}(0)-m_0|^2\big)^{1/2}\big(E|x_i^\dagger(T)|^2\big)^{1/2}. \tag{73} \]
Now for $u_i\in\mathcal U_{\hat C_2}$ and the resulting $f_{u_i}$ satisfying (72), we further obtain
\[ E|x_i^\dagger(T)|^2+E\int_0^T|x_i^\dagger-(\Gamma\bar m+\eta)|_Q^2dt\le C. \]
For $u_i\in\mathcal U_{\hat C_2}$, (73) gives
\[ \sup_fJ_i(u_i,\hat u_{-i},f)\le\bar J_i(u_i,f_{u_i})+C(1/\sqrt N+|x^{(N)}(0)-m_0|)\le\bar J_i(u_i,\hat f_{u_i})+C(1/\sqrt N+|x^{(N)}(0)-m_0|), \]
where $\hat f_{u_i}$ is determined by Theorem 7. Due to (70),
\[ \sup_{u_i\in\mathcal U_{\hat C_2}}\|\hat f_{u_i}\|_{L^2}\le C \tag{74} \]
for some constant $C$. By (74) and the method in (68), we similarly derive
\[ J_i(u_i,\hat u_{-i},\hat f_{u_i})\ge\bar J_i(u_i,\hat f_{u_i})-C(1/\sqrt N+|x^{(N)}(0)-m_0|). \]
Hence, for all $u_i\in\mathcal U_{\hat C_2}$,
\[ \sup_fJ_i(u_i,\hat u_{-i},f)\ge\bar J_i(u_i,\hat f_{u_i})-C(1/\sqrt N+|x^{(N)}(0)-m_0|). \]
The constant $C$ in the various places does not depend on $u_i$. The lemma follows. $\Box$

Proof of Theorem 25.
It suffices to show the first inequality by checking $u_i\in\mathcal U_{\hat C_2}$. By Lemma 30, we have
\[ \sup_fJ_i(u_i,\hat u_{-i},f)\ge\bar J_i(u_i,\hat f_{u_i})-C(1/\sqrt N+|x^{(N)}(0)-m_0|)\ge\bar J_i(\hat u_i,\hat f_{\hat u_i})-C(1/\sqrt N+|x^{(N)}(0)-m_0|). \tag{75} \]
On the other hand, by taking the particular control $\hat u_i$ in Lemma 30,
\[ \sup_fJ_i(\hat u_i,\hat u_{-i},f)\le\bar J_i(\hat u_i,\hat f_{\hat u_i})+C(1/\sqrt N+|x^{(N)}(0)-m_0|). \tag{76} \]
Subsequently, (75) and (76) imply
\[ \sup_fJ_i(u_i,\hat u_{-i},f)\ge\sup_fJ_i(\hat u_i,\hat u_{-i},f)-2C(1/\sqrt N+|x^{(N)}(0)-m_0|). \]
This completes the proof. $\Box$
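A recurring device in this paper, and the heart of Lemma A.1 in Appendix A below, is the mean--fluctuation decomposition $E|z|_Q^2=|Ez|_Q^2+E|z-Ez|_Q^2$. The same identity holds exactly for empirical averages, so it can be checked on arbitrary sample data (the scalar weight and numbers below are assumptions chosen for illustration):

```python
Q = 2.0                                   # a scalar weight, Q >= 0 (assumed)
samples = [0.3, -1.2, 0.7, 2.5, -0.4]     # arbitrary sample data

mean = sum(samples) / len(samples)
lhs = sum(Q * z * z for z in samples) / len(samples)
rhs = Q * mean * mean + sum(Q * (z - mean) ** 2 for z in samples) / len(samples)
print(lhs, rhs)   # equal up to floating-point rounding
```

The decomposition is what allows positive definiteness of a cost along deterministic (mean) trajectories to be transferred to the stochastic problem.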
This section extends the results to a more general model with random initial states. For agent $\mathcal A_i$, the dynamics are given by
\[ dx_i^o(t)=(Ax_i^o(t)+Bu_i(t)+Gx^{o(N)}(t)+f(t))dt+DdW_i(t),\quad 1\le i\le N, \]
where $x^{o(N)}=(1/N)\sum_{j=1}^Nx_j^o$. The initial states of the agents are given by $x_i^o(0)=\xi_i$. As in (4), we define $J_i(u_i,u_{-i},f)$ by using $x_j^o$ in place of $x_j$, $1\le j\le N$. Let $\{\mathcal F_t^o\}_{0\le t\le T}$ be the filtration generated by $\{\xi_i,W_i(t),1\le i\le N\}$, and $L^2_{\mathcal F^o}(0,T;\mathbb R^k)$ is defined accordingly.

(H0) The sequence $\{\xi_i,i\ge1\}$ consists of independent random variables which are also independent of the Brownian motions $\{W_i,i\ge1\}$. In addition, $\lim_{N\to\infty}(1/N)\sum_{i=1}^NE\xi_i=m_0$ and $\sup_iE|\xi_i|^2\le c$ for some constant $c$ independent of $N$.

For fixed $\bar u$, we consider the FBSDE
\[ dx_i^o=(Ax_i^o+BR^{-1}B^Ty_i^o+Gm_i^o+\gamma p_i^o)dt+DdW_i,\quad \dot m_i^o=(A+G)m_i^o+B\bar u+\gamma p_i^o, \]
\[ \dot p_i^o=-(A+G)^Tp_i^o-(I-\Gamma)^TQ[Ex_i^o-(\Gamma m_i^o+\eta)],\quad dy_i^o=\{-A^Ty_i^o+Q[x_i^o-(\Gamma m_i^o+\eta)]\}dt+\zeta_i^odW_i, \tag{77} \]
where $x_i^o(0)=\xi_i$, $m_i^o(0)=m_0$, $p_i^o(T)=HEx_i^o(T)$, and $y_i^o(T)=-Hx_i^o(T)$. Except for the random initial state, this FBSDE has the same form as (36).

For the current situation, where the filtration is not generated only by the Brownian motions, the proof of Lemma 12 is not applicable. The solution procedure of (P2) as presented in Section 3.5 is applied only heuristically to derive (77). Nevertheless, we can study (77) directly and use it to construct decentralized strategies. We still define $J_i^{wo}(u_i,u_{-i})=\sup_{f\in L^2(0,T;\mathbb R^n)}J_i(u_i,u_{-i},f)$. The next theorem subsumes Corollary 14 and Theorem 25.

Theorem 31
Assume that (H0)--(H2) hold and that (42) has a unique solution $(\mathbf m,\mathbf p,\mathbf y)$. We further take $\bar u=R^{-1}B^T\mathbf y$ in (77). Then the following two assertions hold. (i) (77) has a unique solution in $L^2_{\mathcal F^o}(0,T;\mathbb R^n)\times(C([0,T];\mathbb R^n))^2\times(L^2_{\mathcal F^o}(0,T;\mathbb R^n))^2$. (ii) For $\hat u_i=R^{-1}B^Ty_i^o$, $1\le i\le N$, we have
\[ J_i^{wo}(\hat u_i,\hat u_{-i})-\varepsilon_N\le\inf_{u_i\in\mathcal U}J_i^{wo}(u_i,\hat u_{-i})\le J_i^{wo}(\hat u_i,\hat u_{-i}), \tag{78} \]
where $0\le\varepsilon_N=O(1/\sqrt N+|(1/N)\sum_{j=1}^NE\xi_j-m_0|)$.

Proof. (i) Consider (36) by setting
\[ \bar u=R^{-1}B^T\mathbf y,\quad x_i(0)=E\xi_i. \tag{79} \]
Further construct the ODE system obtained by taking expectations in (36):
\[ \dot{\bar x}_i=A\bar x_i+BR^{-1}B^T\bar y_i+G\bar m_i+\gamma\bar p_i,\quad \dot{\bar m}_i=(A+G)\bar m_i+B\bar u+\gamma\bar p_i, \]
\[ \dot{\bar p}_i=-(A+G)^T\bar p_i-(I-\Gamma)^TQ[\bar x_i-(\Gamma\bar m_i+\eta)],\quad \dot{\bar y}_i=-A^T\bar y_i+Q[\bar x_i-(\Gamma\bar m_i+\eta)], \tag{80} \]
where $\bar x_i(0)=E\xi_i$, $\bar m_i(0)=m_0$, $\bar p_i(T)=H\bar x_i(T)$, and $\bar y_i(T)=-H\bar x_i(T)$. Since (36) subject to (79) has a unique solution, (80) has a solution in $(C([0,T];\mathbb R^n))^4$. If (80) had two different solutions, we would be able to construct two different solutions to (36) satisfying (79), a contradiction to Theorem 13. So (80) has a unique solution $(\bar x_i,\bar m_i,\bar p_i,\bar y_i)$.

Setting $(m_i^o,p_i^o)=(\bar m_i,\bar p_i)$ in the first and last equations of (77), we construct the new equations
\[ dx_i^o=(Ax_i^o+BR^{-1}B^Ty_i^o+G\bar m_i+\gamma\bar p_i)dt+DdW_i,\quad dy_i^o=\{-A^Ty_i^o+Q[x_i^o-(\Gamma\bar m_i+\eta)]\}dt+\zeta_i^odW_i, \tag{81} \]
where $x_i^o(0)=\xi_i$ and $y_i^o(T)=-Hx_i^o(T)$. Let $P$ be the solution of the Riccati equation (B.4) and take the transformation $y_i^o=-Px_i^o+\phi$. We obtain
\[ d\phi=\big[-(A-BR^{-1}B^TP)^T\phi+P(G\bar m_i+\gamma\bar p_i)-Q(\Gamma\bar m_i+\eta)\big]dt+(\zeta_i^o+PD)dW_i, \]
where $\phi(T)=0$. We solve $(\phi,\zeta_i^o)$ in $L^2_{\mathcal F^o}(0,T;\mathbb R^n)$, and further obtain $(x_i^o,y_i^o)$ in $L^2_{\mathcal F^o}(0,T;\mathbb R^n)$. Subsequently, we can show $Ex_i^o=\bar x_i$.
Hence $(x_i^o,m_i^o,p_i^o,y_i^o,\zeta_i^o)$ satisfies (77). By taking the variation of the first three equations of (77) and applying an optimal control interpretation as in the proof of Theorem 13, we can show that $(x_i^o,m_i^o,p_i^o,y_i^o,\zeta_i^o)$ is the unique solution.

(ii) By slightly modifying the proof of Theorem 21 and the associated lemmas, we can show
\[ \sup_{0\le t\le T}E|\hat u^{(N)}-\bar u^*|^2=O(1/N)+O(|Ex^{(N)}(0)-m_0|^2). \]
Next, we adapt the proofs of Lemmas 26--30, taking into account the random initial states satisfying (H0). This gives the desired estimate for $\varepsilon_N$. $\Box$

Conclusion

This paper introduces a class of mean field LQG games with drift uncertainty. By using the idea of robust optimization, the local strategy is designed by minimizing the worst case cost. When the decentralized strategies are implemented in a finite population, their performance is characterized as a robust $\varepsilon$-Nash equilibrium.

In this paper we only deal with drift uncertainty. If the Brownian motions are also subject to an uncertain coefficient process to model volatility uncertainty [38], the resulting optimal control problems will give a set of more complicated FBSDEs. It is also of potential interest to address model uncertainty of the mean field game in a different setup by considering measure uncertainty [16, 36, 48] in the robust optimization problem. This will necessitate the use of different techniques for analysis.

Appendix A

For proving Lemma 9, we first give another lemma. Consider an auxiliary optimal control problem with dynamics
\[ \dot z_i=Az_i+Bv_i+Gz+\gamma q,\quad \dot z=(A+G)z+\gamma q,\quad \dot q=-(A+G)^Tq-(I-\Gamma)^TQ(Ez_i-\Gamma z), \tag{A.1} \]
where $z_i(0)=z(0)=0$, $q(T)=HEz_i(T)$ and $v_i\in L^2_{\mathcal F}(0,T;\mathbb R^n)$. Following the argument in the proof of Lemma 8, under (H1) we can show the existence and uniqueness of a solution to (A.1).
The optimal control problem is

(P2b) minimize $\bar J_i^b(v_i)=E\int_0^T\{|z_i-\Gamma z|_Q^2+v_i^TRv_i-\gamma|q(t)|^2\}dt+E|z_i(T)|_H^2$.

Similarly, we may define positive definiteness of $\bar J_i^b$ as in Section 3.

Lemma A.1 $\bar J_i^a$ is positive semi-definite (resp., positive definite) if and only if $\bar J_i^b$ is positive semi-definite (resp., positive definite).

Proof. It suffices to show the "only if" part. Suppose that $\bar J_i^a$ is positive semi-definite. Consider any control $v_i\in L^2_{\mathcal F}(0,T;\mathbb R^n)$ for $\bar J_i^b$; this gives a unique solution $(z_i,z,q)$. We take expectations in (A.1) to obtain
\[ \dot{\bar z}_i=A\bar z_i+B\bar v_i+Gz+\gamma q,\quad \dot z=(A+G)z+\gamma q,\quad \dot q=-(A+G)^Tq-(I-\Gamma)^TQ(\bar z_i-\Gamma z), \]
where $\bar z_i=Ez_i$ and $\bar v_i=Ev_i$. It follows that
\[ \bar J_i^b(v_i)=\bar J_i^a(\bar v_i)+E\int_0^T\big[|z_i-Ez_i|_Q^2+|v_i-Ev_i|_R^2\big]dt+E|z_i(T)-Ez_i(T)|_H^2\ge\bar J_i^a(\bar v_i)\ge0. \]
On the other hand, $\bar J_i^a(0)=0$. This shows that $\bar J_i^b$ is positive semi-definite. The above reasoning is also valid for the positive definite case. This proves the "only if" part. $\Box$

Proof of Lemma 9. Let $(x_i,m_i,p_i)$ and $(x_i',m_i',p_i')$ be the state processes in (P2) corresponding to the controls $u_i$ and $u_i'$, respectively. Let $\lambda_1,\lambda_2\in[0,1]$ with $\lambda_1+\lambda_2=1$. We have
\[ \lambda_1\bar J_i(u_i)+\lambda_2\bar J_i(u_i')-\bar J_i(\lambda_1u_i+\lambda_2u_i')=\lambda_1\lambda_2E\int_0^T\big\{|x_i-x_i'-\Gamma(m_i-m_i')|_Q^2+|u_i-u_i'|_R^2-\gamma|p_i(t)-p_i'(t)|^2\big\}dt+\lambda_1\lambda_2E|x_i(T)-x_i'(T)|_H^2. \]
Denote $z_i=x_i-x_i'$, $z=m_i-m_i'$, $q=p_i-p_i'$ and $v_i=u_i-u_i'$. It is obvious that
\[ \lambda_1\bar J_i(u_i)+\lambda_2\bar J_i(u_i')-\bar J_i(\lambda_1u_i+\lambda_2u_i')=\lambda_1\lambda_2\bar J_i^b(v_i). \]
Recalling Lemma A.1, this completes the proof. $\Box$

Appendix B

We introduce the FBSDE
\[ dx_i=(Ax_i+BR^{-1}B^Ty_i+Gm_i+\gamma p_i)dt+DdW_i,\quad \dot m_i=(A+G)m_i+BR^{-1}B^TEy_i+\gamma p_i, \]
\[ \dot p_i=-(A+G)^Tp_i-(I-\Gamma)^TQ[m_i-(\Gamma m_i+\eta)],\quad dy_i=\{-A^Ty_i+Q[x_i-(\Gamma m_i+\eta)]\}dt+\zeta_idW_i, \tag{B.1} \]
where $x_i(0)=m_i(0)=m_0$, $p_i(T)=Hm_i(T)$, and $y_i(T)=-Hx_i(T)$. This FBSDE differs slightly from (43) in its third equation and in the condition on $p_i(T)$, and will be more convenient for analysis. The next lemma shows that the two equation systems (43) and (B.1) are equivalent. The proof is straightforward since $Ex_i$ and $m_i$ satisfy the same ODE with the same initial condition.

Lemma B.1 If $(x_i,m_i,p_i,y_i,\zeta_i)\in\mathcal S[0,T]$ satisfies one of (43) and (B.1), it also satisfies the other. $\Box$

Consider the ODE system
\[ \dot{\bar x}_i=A\bar x_i+BR^{-1}B^T\bar y_i+G\bar m_i+\gamma\bar p_i,\quad \dot{\bar m}_i=(A+G)\bar m_i+BR^{-1}B^T\bar y_i+\gamma\bar p_i, \]
\[ \dot{\bar p}_i=-(A+G)^T\bar p_i-(I-\Gamma)^TQ[\bar m_i-(\Gamma\bar m_i+\eta)],\quad \dot{\bar y}_i=-A^T\bar y_i+Q[\bar x_i-(\Gamma\bar m_i+\eta)], \tag{B.2} \]
where $\bar x_i(0)=\bar m_i(0)=m_0$, $\bar p_i(T)=H\bar m_i(T)$ and $\bar y_i(T)=-H\bar x_i(T)$.

Lemma B.2
The following two statements are equivalent: (i) the FBSDE (B.1) has a unique solution in $S[0,T]$; (ii) the ODE (B.2) has a unique solution in $C([0,T];\mathbb{R}^n)$.

Proof. Step 1. Suppose that (ii) holds and let the unique solution be denoted by $(\bar x_i, \bar m_i, \bar p_i, \bar y_i)$. Take $(m_i, p_i) = (\bar m_i, \bar p_i)$ on the right-hand side of the first and last equations of (B.1) to write
$$
\begin{cases}
dx_i = (A x_i + B R^{-1} B^T y_i + G \bar m_i + \gamma \bar p_i)\, dt + D\, dW_i,\\
dy_i = \big\{ -A^T y_i + Q[x_i - (\Gamma \bar m_i + \eta)] \big\}\, dt + \zeta_i\, dW_i,
\end{cases} \quad (B.3)
$$
where $y_i(T) = -H x_i(T)$. Consider the Riccati equation
$$\dot P + A^T P + P A - P B R^{-1} B^T P + Q = 0, \qquad P(T) = H, \quad (B.4)$$
which has a unique solution on $[0,T]$. Setting $y_i = -P x_i + \phi$ in (B.3), we obtain two decoupled equations for $(x_i, \phi)$, which are uniquely solvable. This further gives a unique solution $(x_i, y_i, \zeta_i) \in L^2_F(0,T;\mathbb{R}^n)$ for (B.3). Taking expectations on both sides of (B.3) yields
$$
\begin{cases}
\frac{d}{dt} E x_i = A E x_i + B R^{-1} B^T E y_i + G \bar m_i + \gamma \bar p_i,\\
\frac{d}{dt} E y_i = -A^T E y_i + Q[E x_i - (\Gamma \bar m_i + \eta)],
\end{cases} \quad (B.5)
$$
where $E y_i(T) = -H E x_i(T)$. By combining (B.5) with the first and fourth equations of (B.2), it is easy to show $E x_i = \bar x_i$ and $E y_i = \bar y_i$ for all $t \in [0,T]$. This implies
$$\dot{\bar m}_i = (A+G) \bar m_i + B R^{-1} B^T \bar y_i + \gamma \bar p_i = (A+G) \bar m_i + B R^{-1} B^T E y_i + \gamma \bar p_i,$$
so the second and third equations of (B.1) hold with $(m_i, p_i) = (\bar m_i, \bar p_i)$. Therefore, $(x_i, m_i, p_i, y_i, \zeta_i) := (x_i, \bar m_i, \bar p_i, y_i, \zeta_i)$ satisfies (B.1).

We continue to show that $(x_i, m_i, p_i, y_i, \zeta_i)$ above is the unique solution of (B.1). Suppose that $(x_i', m_i', p_i', y_i', \zeta_i')$ is another solution of (B.1). It is clear that $(E x_i', m_i', p_i', E y_i')$ is a solution of (B.2). Since (B.2) has the unique solution $(\bar x_i, \bar m_i, \bar p_i, \bar y_i)$, we have $(m_i', p_i') = (\bar m_i, \bar p_i)$. By using the first and fourth equations of (B.1), we derive the equations satisfied by $(x_i' - x_i, y_i' - y_i)$ and further infer $(x_i', y_i') = (x_i, y_i)$. We conclude that (i) holds.

Step 2.
Suppose that (i) holds with the unique solution denoted by $(x_i, m_i, p_i, y_i, \zeta_i)$. It is obvious that $(\bar x_i, \bar m_i, \bar p_i, \bar y_i) := (E x_i, m_i, p_i, E y_i)$ is a solution of (B.2). Suppose that $(\bar x_i', \bar m_i', \bar p_i', \bar y_i') \neq (\bar x_i, \bar m_i, \bar p_i, \bar y_i)$ is another solution of (B.2). Then $(x_i, m_i, p_i, y_i, \zeta_i) + (\bar x_i' - \bar x_i, \bar m_i' - \bar m_i, \bar p_i' - \bar p_i, \bar y_i' - \bar y_i, 0)$ would give another solution of (B.1), contradicting uniqueness. Hence (ii) holds. $\Box$

Lemma B.3 (i) If $(\bar x_i, \bar m_i, \bar p_i, \bar y_i)$ is a solution of (B.2), then $(m, p, y) := (\bar m_i, \bar p_i, \bar y_i)$ satisfies (42). (ii) If $(m, p, y)$ is a solution of (42), there exists $\bar x_i$ such that $(\bar x_i, \bar m_i, \bar p_i, \bar y_i) := (\bar x_i, m, p, y)$ satisfies (B.2). (iii) The
ODE (B.2) has a unique solution if and only if (42) has a unique solution.

Proof. (i) If $(\bar x_i, \bar m_i, \bar p_i, \bar y_i)$ is a solution of (B.2), then $\bar x_i = \bar m_i$ and therefore $\bar y_i(T) = -H \bar x_i(T) = -H \bar m_i(T)$. So $(m, p, y)$ defined above satisfies (42).

(ii) If $(m, p, y)$ is a solution of (42), we set $(\bar m_i, \bar p_i, \bar y_i) = (m, p, y)$ and define $\bar x_i$ by the ODE
$$\dot{\bar x}_i = A \bar x_i + B R^{-1} B^T \bar y_i + G \bar m_i + \gamma \bar p_i,$$
where $\bar x_i(0) = m$. It can be checked that $\bar m_i = \bar x_i$, which gives $\bar y_i(T) = -H \bar m_i(T) = -H \bar x_i(T)$. Hence, $(\bar x_i, \bar m_i, \bar p_i, \bar y_i)$ is a solution of (B.2).

(iii) Assume that (42) has a unique solution. Let $(\bar x_i, \bar m_i, \bar p_i, \bar y_i)$ and $(\bar x_i', \bar m_i', \bar p_i', \bar y_i')$ be two solutions of (B.2). By (i), $(\bar m_i, \bar p_i, \bar y_i)$ and $(\bar m_i', \bar p_i', \bar y_i')$ are two solutions of (42) and so must be equal, which further implies $\bar x_i = \bar x_i'$ by the first equation of (B.2). This shows that (B.2) has a unique solution. Next assume that (B.2) has a unique solution. Let $(m, p, y)$ and $(m', p', y')$ be two solutions of (42). By (ii), we must have $(m, p, y) = (m', p', y')$. Therefore, (42) has a unique solution. $\Box$
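The decoupling used in Step 1 can be made concrete numerically: substituting $y_i = -P x_i + \phi$ into (B.3) forces $P$ to satisfy the Riccati equation (B.4) with $P(T) = H$, $\phi(T) = 0$, and $\phi$ to satisfy a linear backward ODE. The sketch below integrates both backward for scalar data; the coefficient values and the frozen mean field terms $\bar m_i$, $\bar p_i$ (which would come from solving (B.2)) are illustrative constants chosen here, not values from the paper.

```python
import numpy as np

# Scalar sketch of the decoupling in Step 1: P solves (B.4),
#   P' + A P + P A - P B R^{-1} B P + Q = 0,  P(T) = H,
# and phi solves the backward linear ODE obtained by matching terms,
#   phi' = -(A - B R^{-1} B P) phi + P (G m_bar + gamma p_bar)
#          - Q (Gamma m_bar + eta),            phi(T) = 0.
A, B, R, Q, H = -1.0, 1.0, 1.0, 2.0, 1.0
G, gamma, Gamma, eta = 0.2, 0.3, 0.5, 0.1
m_bar, p_bar = 0.4, 0.1          # illustrative frozen mean field terms
T, n = 1.0, 2000
h = T / n

def f(w):
    """Right-hand side of the coupled backward system for (P, phi)."""
    P, phi = w
    dP = -(2*A*P - (B**2/R)*P**2 + Q)          # scalar form of (B.4)
    dphi = -(A - (B**2/R)*P)*phi + P*(G*m_bar + gamma*p_bar) \
           - Q*(Gamma*m_bar + eta)
    return np.array([dP, dphi])

# RK4 backward in time from t = T (data P(T) = H, phi(T) = 0) to t = 0.
w = np.array([H, 0.0])
path = [w]
for _ in range(n):
    k1 = f(w); k2 = f(w - h/2*k1); k3 = f(w - h/2*k2); k4 = f(w - h*k3)
    w = w - h/6*(k1 + 2*k2 + 2*k3 + k4)
    path.append(w)
path = path[::-1]                # path[k] approximates (P, phi) at t = k*h
P0, phi0 = path[0]
print(P0 > 0)                    # P stays positive since Q > 0 and H > 0
```

Once $(P, \phi)$ are known, the forward SDE for $x_i$ in (B.3) has a known affine drift, which is how Step 1 produces the unique solution $(x_i, y_i, \zeta_i)$.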
References

[1] S. Adlakha, R. Johari, and G. Y. Weintraub. Equilibria of dynamic games with many players: existence, approximation, and market structure. J. Econ. Theory, vol. 156, pp. 269-316, 2015.
[2] M. Aghassi and D. Bertsimas. Robust game theory. Math. Progr., vol. 107, pp. 231-273, 2006.
[3] D. Andersson and B. Djehiche. A maximum principle for stochastic control of SDEs of mean-field type. Appl. Math. Optim., vol. 63, pp. 341-356, 2010.
[4] M. Bardi. Explicit solutions of some linear-quadratic mean field games. Netw. Heterogeneous Media, vol. 7, no. 2, pp. 243-261, 2012.
[5] T. Basar and P. Bernhard. H∞-Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach. Birkhäuser, Boston, 1995.
[6] A. Bensoussan, J. Frehse, and P. Yam. Mean Field Games and Mean Field Type Control Theory. Springer, New York, 2013.
[7] A. Bensoussan, K. C. J. Sung, S. C. P. Yam, and S. P. Yung. Linear-quadratic mean-field games. J. Optim. Theory Appl., vol. 169, no. 2, pp. 496-529, 2016.
[8] A. Bensoussan, K. C. J. Sung, and S. C. P. Yam. Linear-quadratic time-inconsistent mean field games. Dynamic Games Appl., vol. 3, no. 4, pp. 537-552, 2013.
[9] R. Buckdahn, J. Li, and S. Peng. Nonlinear stochastic differential games involving a major player and a large number of collectively acting minor agents. SIAM J. Control Optim., vol. 52, no. 1, pp. 451-492, 2014.
[10] P. E. Caines. Mean field games. In Encyclopedia of Systems and Control, T. Samad and J. Baillieul, Eds., Berlin: Springer-Verlag, 2014.
[11] P. Cardaliaguet. Notes on mean field games, 2012.
[12] R. Carmona and F. Delarue. Probabilistic analysis of mean-field games. SIAM J. Control Optim., vol. 51, no. 4, pp. 2705-2734, 2013.
[13] R. Carmona, F. Delarue, and A. Lachapelle. Control of McKean-Vlasov dynamics versus mean field games. Math. Financ. Econ., vol. 7, no. 2, pp. 131-166, 2013.
[14] R. Carmona and D. Lacker. A probabilistic weak formulation of mean field games and applications. Ann. Appl. Probab., vol. 25, no. 3, pp. 1189-1231, 2015.
[15] P. Chan and R. Sircar. Bertrand and Cournot mean field games. Appl. Math. Optim., vol. 71, no. 3, pp. 533-569, 2015.
[16] C. D. Charalambous and F. Rezaei. Stochastic uncertain systems subject to relative entropy constraints: induced norms and monotonicity properties of minimax games. IEEE Transactions on Automatic Control, vol. 52, no. 4, pp. 647-663, April 2007.
[17] S. Chen, X. Li, and X. Y. Zhou. Stochastic linear quadratic regulators with indefinite control weight costs. SIAM J. Control Optim., vol. 36, no. 5, pp. 1685-1702, Sept. 1998.
[18] B. Djehiche and M. Huang. A characterization of sub-game perfect equilibria for SDEs of mean field type. Dynamic Games Appl., vol. 6, no. 1, pp. 55-81, 2016.
[19] J. Engwerda. A numerical algorithm to find soft-constrained Nash equilibria in scalar LQ-games. Int. J. Control, vol. 79, no. 6, pp. 592-603, 2006.
[20] M. Fischer. On the connection between symmetric N-player games and mean field games. arXiv preprint, 2014.
[21] D. A. Gomes and J. Saude. Mean field games models - a brief survey. Dynamic Games Appl., vol. 4, no. 2, pp. 110-154, 2014.
[22] M. Hu and M. Fukushima. Existence, uniqueness, and computation of robust Nash equilibria in a class of multi-leader-follower games. SIAM J. Optim., vol. 23, no. 2, pp. 894-916, 2013.
[23] Y. Hu and S. Peng. Solution of forward-backward stochastic differential equations. Probab. Theory Related Fields, vol. 103, pp. 273-283, 1995.
[24] M. Huang, P. E. Caines, and R. P. Malhamé. Individual and mass behaviour in large population stochastic wireless power control problems: centralized and Nash equilibrium solutions. Proc. 42nd IEEE CDC, Maui, HI, pp. 98-103, Dec. 2003.
[25] J. Huang and M. Huang. Mean field LQG games with model uncertainty. Proc. 52nd IEEE CDC, Florence, Italy, pp. 3103-3108, Dec. 2013.
[26] M. Huang. Large-population LQG games involving a major player: the Nash certainty equivalence principle. SIAM J. Control Optim., vol. 48, pp. 3318-3353, 2010.
[27] M. Huang, P. E. Caines, and R. P. Malhamé. Large-population cost-coupled LQG problems with non-uniform agents: individual-mass behavior and decentralized ε-Nash equilibria. IEEE Transactions on Automatic Control, vol. 52, pp. 1560-1571, 2007.
[28] M. Huang, R. P. Malhamé, and P. E. Caines. Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Communications in Information and Systems, vol. 6, pp. 221-251, 2006.
[29] M. Jimenez and A. Poznyak. ε-equilibrium in LQ differential games with bounded uncertain disturbances: robustness of standard strategies and new strategies with adaptation. Int. J. Control, vol. 79, no. 7, pp. 786-797, 2006.
[30] E. Kardes, F. Ordonez, and R. W. Hall. Discounted robust stochastic games and an application to queueing control. Operations Research, vol. 59, no. 2, pp. 365-382, 2011.
[31] A. C. Kizilkale and P. E. Caines. Mean field stochastic adaptive control. IEEE Transactions on Automatic Control, vol. 58, no. 4, pp. 905-920, April 2013.
[32] V. N. Kolokoltsov, J. Li, and W. Yang. Mean field games and nonlinear Markov processes. arXiv:1112.3744, preprint, 2011.
[33] A. J. Kurdila and M. Zabarankin. Convex Functional Analysis. Berlin: Birkhäuser, 2005.
[34] J. M. Lasry and P. L. Lions. Mean field games. Japan J. Math., vol. 2, pp. 229-260, 2007.
[35] T. Li and J. F. Zhang. Asymptotically optimal decentralized control for large population stochastic multiagent systems. IEEE Transactions on Automatic Control, vol. 53, no. 7, pp. 1643-1660, August 2008.
[36] A. Lim and J. Shanthikumar. Relative entropy, exponential utility, and robust dynamic pricing. Operations Research, vol. 55, pp. 198-214, 2007.
[37] A. Lim and X. Y. Zhou. Stochastic optimal LQR control with integral quadratic constraints and indefinite control weights. IEEE Transactions on Automatic Control, vol. 44, no. 7, pp. 1359-1369, July 1999.
[38] D. P. Looze, H. V. Poor, K. S. Vastola, and J. C. Darragh. Minimax control of linear stochastic systems with noise uncertainty. IEEE Transactions on Automatic Control, vol. AC-28, no. 9, pp. 882-888, Sept. 1983.
[39] D. G. Luenberger. Optimization by Vector Space Methods. New York: Wiley, 1969.
[40] J. Ma and J. Yong. Forward-Backward Stochastic Differential Equations and Their Applications. Lecture Notes in Math. 1702, Springer-Verlag, New York, 1999.
[41] S. L. Nguyen and M. Huang. Linear-quadratic-Gaussian mixed games with continuum-parametrized minor players. SIAM J. Control Optim., vol. 50, no. 5, pp. 2907-2937, 2012.
[42] M. Nourian and P. E. Caines. ε-Nash mean field game theory for nonlinear stochastic dynamical systems with major and minor agents. SIAM J. Control Optim., vol. 51, no. 4, pp. 3302-3331, 2013.
[43] M. Nourian, P. E. Caines, R. P. Malhamé, and M. Huang. Mean field control in leader-follower stochastic multi-agent systems: likelihood ratio based adaptation. IEEE Transactions on Automatic Control, vol. 57, no. 11, pp. 2801-2816, Nov. 2012.
[44] S. Peng and Z. Wu. Fully coupled forward-backward stochastic differential equations and applications to optimal control. SIAM J. Control Optim., vol. 37, no. 3, pp. 825-843, 1999.
[45] J. Sun, X. Li, and J. Yong. Open-loop and closed-loop solvabilities for stochastic linear quadratic optimal control problems. SIAM J. Control Optim., vol. 54, no. 5, pp. 2274-2308, 2016.
[46] H. Tembine, D. Bauso, and T. Basar. Robust linear quadratic mean-field games in crowd-seeking social networks. Proc. 52nd IEEE CDC, Florence, Italy, pp. 3134-3139, 2013.
[47] H. Tembine, Q. Zhu, and T. Basar. Risk-sensitive mean-field games. IEEE Transactions on Automatic Control, vol. 59, no. 4, pp. 835-850, April 2014.
[48] V. A. Ugrinovskii and I. R. Petersen. Minimax LQG control of stochastic partially observed uncertain systems. SIAM J. Control Optim., vol. 40, no. 4, pp. 1189-1226, 2001.
[49] W. A. van den Broek, J. C. Engwerda, and J. M. Schumacher. Robust equilibria in indefinite linear-quadratic differential games. J. Optim. Theory Appl., vol. 119, no. 3, pp. 565-595, 2003.
[50] B. C. Wang and J.-F. Zhang. Mean field games for large-population multiagent systems with Markov jump parameters. SIAM J. Control Optim., vol. 50, no. 4, pp. 2308-2334, 2013.
[51] J. C. Willems. Least squares stationary optimal control and the algebraic Riccati equation. IEEE Transactions on Automatic Control, vol. 16, no. 6, pp. 621-634, Dec. 1971.
[52] H. Yin, P. G. Mehta, S. P. Meyn, and U. V. Shanbhag. Synchronization of coupled oscillators is a game. IEEE Transactions on Automatic Control, vol. 57, no. 4, pp. 920-935, April 2012.
[53] J. Yong. Linear-quadratic optimal control problems for mean-field stochastic differential equations. SIAM J. Control Optim., vol. 51, no. 4, pp. 2809-2838, 2013.
[54] J. Yong and X. Y. Zhou. Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer-Verlag, New York, 1999.