Robust Mean Field Linear-Quadratic-Gaussian Games with Unknown L^2-Disturbance
arXiv preprint [math.OC]

Jianhui Huang† and Minyi Huang‡

Abstract
This paper considers a class of mean field linear-quadratic-Gaussian (LQG) games with model uncertainty. The drift term in the dynamics of the agents contains a common unknown function. We take a robust optimization approach where a representative agent in the limiting model views the drift uncertainty as an adversarial player. By including the mean field dynamics in an augmented state space, we solve two optimal control problems sequentially, which combined with consistent mean field approximations provides a solution to the robust game. A set of decentralized control strategies is derived by use of forward-backward stochastic differential equations (FBSDE) and shown to be a robust ε-Nash equilibrium.

1 Introduction

Mean field game theory provides an effective methodology for the analysis and strategy design in a large population of players which are individually insignificant but collectively have strong impact (see e.g. [24, 27, 28, 34]). A typical model analyzes a system of N players with mean field coupling in their dynamics or costs, or both. The linear-quadratic-Gaussian (LQG) framework is of particular interest since it allows an explicit solution procedure. Consider a large population of N agents. The dynamics of agent i are given by the stochastic differential equation (SDE)

  dx_i(t) = (A x_i(t) + B u_i(t) + G x^{(N)}(t)) dt + D dW_i(t),  t ≥ 0,  (1)

where x^{(N)} = (1/N) Σ_{i=1}^N x_i denotes the mean field coupling term. The cost of agent i is given by

  J_i(u_1, ..., u_N) = E[ ∫_0^T ( |x_i − Γx^{(N)} − η|_Q^2 + u_i^T R u_i ) dt + x_i^T(T) H x_i(T) ],  (2)

where we denote |z|_Q^2 = z^T Q z and the symmetric matrices Q ≥ 0, H ≥ 0, R > 0. The LQG modeling framework was first developed in [24, 27] to obtain a set of strategies (û_1, ..., û_N) such that each û_i only uses the local sample path information of x_i and some deterministic functions reflecting the collective behavior of the agents, and such that (û_1, ..., û_N) is an ε-Nash equilibrium. There exists a substantial body of literature adopting the LQG framework [4, 7, 31, 35, 43, 47].

∗ A compressed version of this paper without detailed proofs has been presented at the 2015 IEEE CDC.
† J. Huang is with the Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong ([email protected]). This author was supported by RGC Early Career Scheme (ECS) grant 502412P, 500613P.
‡ M. Huang is with the School of Mathematics and Statistics, Carleton University, Ottawa, ON K1S 5B6, Canada ([email protected]). This author was supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada under a Discovery Grant and a Discovery Accelerator Supplements Program. Part of this author's work was conducted at the Department of Applied Mathematics, The Hong Kong Polytechnic University during March-May 2014. Please address all correspondence to this author.

For an N-player static game with finite action spaces and an uncertain payoff matrix, a robust-optimization equilibrium is introduced in [2], where each player optimizes its worst case payoff with respect to the uncertainty set. A similar method is applied to hierarchical static games [22]. Robustness has been addressed in dynamic games as well. A linear-quadratic (LQ) game with system parameter uncertainties is presented in [29], and the deviation from the Nash equilibrium is estimated for a set of nominal strategies. Robust Nash equilibria are analyzed in [49] for an LQ game with an unknown time-varying disturbance signal as an adversarial player.
In the first case, a soft-constrained game is solved where the cost includes a quadratic penalty term for the disturbance. The second case introduces a hard constraint by specifying an L^2 bound on the disturbance function. The work [30] deals with stochastic games where the payoff and state transition probabilities contain uncertainty. The solution is developed by letting each player solve a robust Markov decision problem to optimize its worst case cost while the other players' strategies are fixed.

This paper aims to address model uncertainty in the mean field LQG game context. Specifically, we focus on drift uncertainty by adding to (1) a common unknown L^2-disturbance f. A practical motivation is that in many decision problems, a large number of agents can share a common uncertainty source fluctuating with time; examples include taxation, subsidy, interest rates, and so on. A direct consequence of our modeling is that this disturbance has a global influence on the population. To address robustness, each agent locally views the disturbance as an adversarial player, and for this purpose we incorporate into (2) an effort penalty term for the disturbance, which first maximizes the resulting cost; the agent then minimizes. The framework of letting the disturbance maximize while its effort is penalized is called the soft-constraint approach [5, 19, 49]. It has the advantage of analytical tractability. When a hard constraint is considered, the robust mean field game is more difficult to tackle; see some preliminary analysis in [25]. Regarding robustness in mean field games, a related work is [46] where each agent is paired with its local disturbance as an adversarial player. The resulting solution replaces the usual HJB equation by a Hamilton-Jacobi-Isaacs (HJI) equation.

To design the individual strategies it is necessary to build the dynamics of the mean field (i.e., the state average of the agents) evolving under the disturbance.
This technique shares its spirit with the state augmentation method in major player models [26, 41, 42]. The subsequent robust optimization problem, as a minimax control problem, leads to two optimal control problems with indefinite state weights [51]. They are different from the well known stochastic control problems with indefinite control weights [17, 37]. We will follow a convex optimization approach to solve the two control problems via variational analysis and forward-backward stochastic differential equations (FBSDE) [23, 40, 44]. Both the information structure and the solution procedure for our model are different from [46], where each player and its local disturbance have access to its state and so dynamic programming is applicable.

Our main contributions are summarized as follows:

• We formulate a class of mean field LQG games where the players face a common uncertainty source, and introduce the robust optimization approach to solve two convex optimal control problems.
• Decentralized strategies are obtained for the robust mean field game via a set of FBSDE.

• The performance of the decentralized strategies for the N players is characterized as a robust ε-Nash equilibrium.

The rest of this paper is organized as follows. Section 2 introduces the mean field LQG game with a common disturbance and defines the worst case cost for a player. Section 3 studies the limiting robust optimization problem, which leads to two optimal control problems solved sequentially by the disturbance and the representative player. The solution equation system of the mean field game is obtained in Section 4 based on consistent mean field approximations. A key error estimate of the mean field approximation is developed in Section 5. Section 6 characterizes the set of decentralized strategies as a robust ε-Nash equilibrium. An extension of the analysis to players with random initial states is presented in Section 7, and Section 8 concludes the paper.

2 The Mean Field LQG Game with a Common Disturbance

Consider a finite time horizon [0, T] for T > 0. Suppose that (Ω, F, {F_t}_{0≤t≤T}, P) is a complete filtered probability space. Throughout this paper, we denote by R^k the k-dimensional Euclidean space and by R^{n×k} the set of all n×k matrices. We use |·| to denote the norm of a Euclidean space, or the Frobenius norm for matrices. For a vector or matrix M, M^T denotes its transpose. Let L^2_F(0,T;R^k) denote the space of all R^k-valued F_t-progressively measurable processes x(·) satisfying E∫_0^T |x(t)|^2 dt < ∞; C([0,T];R^k) (resp., C^1([0,T];R^k)) is the space of all R^k-valued functions h(·) defined on [0,T] which are continuous (resp., continuously differentiable); L^2(0,T;R^k) is the space of all R^k-valued measurable functions h(·) on [0,T] satisfying ∫_0^T |h(t)|^2 dt < ∞, and we denote the norm ||h||_{L^2} = (∫_0^T |h(t)|^2 dt)^{1/2}. Throughout the paper, we use C (or C_1, C_2, ...) to denote a generic constant which does not depend on the population size N and may vary from place to place.

Consider N agents (or players) denoted by A_i, 1 ≤ i ≤ N. The state x_i of A_i is R^n-valued and satisfies the linear SDE

  dx_i(t) = (A x_i(t) + B u_i(t) + G x^{(N)}(t) + f(t)) dt + D dW_i(t),  1 ≤ i ≤ N,  (3)

where x^{(N)} = (1/N) Σ_{j=1}^N x_j. The control u_i takes its value in R^n. The R^n-valued standard Brownian motions {W_i(t), 1 ≤ i ≤ N} are independent. The initial states {x_i(0), 1 ≤ i ≤ N} are deterministic and their empirical mean has the limit lim_{N→∞} (1/N) Σ_{i=1}^N x_i(0) = m_0. We take {F_t}_{0≤t≤T} as the natural filtration generated by the Nn-dimensional Brownian motion (W_1(t), ..., W_N(t)), and F = F_T. The admissible control set U of A_i is

  U := { u_i(·) : u_i ∈ L^2_F(0,T;R^n) }.

Denote u = (u_1, ..., u_N) and u_{−i} = (u_1, ..., u_{i−1}, u_{i+1}, ..., u_N).

The function f ∈ L^2(0,T;R^n) is an unknown disturbance to characterize the model uncertainty, and represents an influence from the common environment for decision-making. A natural motivation for considering a deterministic disturbance is the following. Although each player A_i regards the disturbance as adversarial, it should not be excessively pessimistic by assuming that the latter will use the sample path information of W_i to play against it, and instead only considers a deterministic f. The cost functional of A_i is

  J_i(u_i, u_{−i}, f) = E[ ∫_0^T ( |x_i − (Γx^{(N)} + η)|_Q^2 + u_i^T R u_i − (1/γ^2)|f(t)|^2 ) dt + x_i^T(T) H x_i(T) ],  (4)

where the symmetric matrices Q ≥ 0, R > 0, H ≥ 0, and the scalar γ > 0. We assume uniform agents in the sense that they share the same parameter datum (A, B, G, D; Γ, η, Q, R, γ, H). Also, to simplify the analysis, we consider constant parameters.

Due to the unknown function f, A_i cannot evaluate its cost even if all control policies (u_1, ..., u_N) are known. To address this indeterminacy, we approach the game from a robust optimization point of view where each agent takes f as an adversarial player. Here a soft constraint [5, 19, 49] for the disturbance is adopted in that the penalty term −(1/γ^2)|f(t)|^2 is included in (4) while f attempts to maximize. For given (u_i, u_{−i}), define the worst case cost of A_i as

  J_i^{wo}(u_i, u_{−i}) = sup_{f ∈ L^2(0,T;R^n)} J_i(u_i, u_{−i}, f).

A set of strategies (û_1, ..., û_N) is a robust ε-Nash equilibrium for the N players if for ε ≥ 0,

  J_i^{wo}(û_i, û_{−i}) − ε ≤ inf_{u_i ∈ U} J_i^{wo}(u_i, û_{−i}) ≤ J_i^{wo}(û_i, û_{−i}).  (5)

Our central objective is to design decentralized strategies based on the above solution notion.

3 The Limiting Robust Optimization Problem

We start by making an appropriate approximation of the coupling term x^{(N)}. Adding up the N equations in (3) and normalizing by 1/N, we obtain

  dx^{(N)} = [(A + G)x^{(N)} + Bu^{(N)} + f] dt + D(1/N) Σ_{j=1}^N dW_j,

where u^{(N)} = (1/N) Σ_{j=1}^N u_j. Intuitively, from the point of view of A_i, u^{(N)} may be approximated by a deterministic function ū. Moreover, when N → ∞, (1/N) Σ_{j=1}^N dW_j vanishes due to the law of large numbers. In turn, a deterministic function m can be used to approximate x^{(N)}. The above reasoning suggests to introduce the limiting ordinary differential equation (ODE)

  dm/dt = (A + G)m + Bū + f,  m(0) = m_0.  (6)

Consider the optimization problem of a representative agent A_i:

  dx_i = (Ax_i + Bu_i + Gm_i + f) dt + D dW_i,
  dm_i/dt = (A + G)m_i + Bū + f,  (7)

where the second equation is motivated by (6) and m_i(0) = m_0. For the limiting model (7), (W_i, x_i(0)) is the same as in (3).
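The law-of-large-numbers step above, namely that the averaged noise term (1/N) Σ_j dW_j vanishes, can be illustrated numerically. The following sketch (scalar Brownian motions, illustrative horizon T = 1; the function names are mine, not the paper's) estimates Var[(1/N) Σ_{j=1}^N W_j(T)] by Monte Carlo; the theoretical value is T/N, which tends to 0 as N grows.

```python
import random

def averaged_brownian(N, T=1.0, steps=50, seed=0):
    """One sample of W^(N)(T) = (1/N) * sum_{j=1}^N W_j(T)."""
    rng = random.Random(seed)
    dt = T / steps
    total = 0.0
    for _ in range(N):
        w = 0.0
        for _ in range(steps):
            w += rng.gauss(0.0, dt ** 0.5)  # Brownian increment over dt
        total += w
    return total / N

def sample_var(N, trials=120):
    """Monte Carlo estimate of Var[W^(N)(T)]; theory gives T/N."""
    vals = [averaged_brownian(N, seed=k) for k in range(trials)]
    mean = sum(vals) / trials
    return sum((v - mean) ** 2 for v in vals) / trials

# sample_var(4) is close to 1/4, while sample_var(100) is close to 1/100,
# so the noise driving x^(N) is negligible for a large population.
```

This is only a numerical illustration of why the deterministic ODE (6) is a sensible limit; the rigorous error estimate is developed in Section 5.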
We reuse (x_i, A_i) to denote the state and the corresponding agent; this causes no risk of confusion. Since f will be determined in its worst case form depending on x_i(0), m_i is associated with the agent index i so that it is ready as an appropriate notation for the subsequent closed-loop dynamics. The cost functional is given by

  J̄_i(u_i, f) = E ∫_0^T ( |x_i − (Γm_i + η)|_Q^2 + u_i^T R u_i − (1/γ^2)|f(t)|^2 ) dt + E x_i^T(T) H x_i(T).
We aim to find a solution pair (f̂, û_i) such that

  J̄_i(û_i, f̂) = min_{u_i ∈ U} max_{f ∈ L^2(0,T;R^n)} J̄_i(u_i, f).  (8)

Finally, we need a consistency condition, i.e., (1/N) Σ_{i=1}^N û_i converges to ū in some sense (this will be made precise in Section 4), and we look for ū ∈ C([0,T];R^n); the feasibility of doing so will be clear from our solution procedure. The next part of our plan is to show that such strategies have the property in (5) when applied in the game of N agents. In the following, we solve the optimization problem (8) in two steps.

Let u_i ∈ U and ū ∈ C([0,T];R^n) be fixed. The optimal control problem is

  (P1)  maximize_{f ∈ L^2(0,T;R^n)} J̄_i(u_i, f).  (9)

Clearly (P1) is equivalent to the following problem:

  (P1a)  minimize_{f ∈ L^2(0,T;R^n)} J̄_i'(u_i, f) = E ∫_0^T ( −|x_i − (Γm_i + η)|_Q^2 + (1/γ^2)|f(t)|^2 ) dt − E x_i^T(T) H x_i(T).

(P1a) is an optimal control problem with negative semi-definite state weights. We are interested in the situation where (P1a) is a strictly convex problem with a coercivity property. This ensures that the worst case disturbance is uniquely determined by A_i. The procedure below to identify conditions ensuring convexity is similar to [37].

To study the convexity of J̄_i' in f, we construct a simpler auxiliary optimal control problem. Denote Q̂ = (I − Γ)^T Q(I − Γ). Consider the dynamics

  dz/dt = (A + G)z + g,  z(0) = 0,  (10)

where g ∈ L^2(0,T;R^n). The optimal control problem is

  (P1b)  minimize J̄_i''(g) = ∫_0^T ( −z^T Q̂ z + (1/γ^2)|g(t)|^2 ) dt − z^T(T) H z(T).

For any s ∈ R, we have J̄_i''(sg) = s^2 J̄_i''(g), and so we view J̄_i'' as a quadratic functional of g.

Definition 1
Let F(g) be a real-valued functional of g ∈ L^2(0,T;R^n). If F(g) ≥ 0 for all g, F is said to be positive semi-definite. If, furthermore, F(g) > 0 for all g ≠ 0, F is said to be positive definite.

Lemma 2 J̄_i'(u_i, f) is convex (resp., strictly convex) in f if and only if J̄_i''(g) is positive semi-definite (resp., positive definite).

Proof. Let (x_i, m_i) and (x_i', m_i') be the state processes of (7) corresponding to (u_i, f) and (u_i, f'), respectively. Take any λ ∈ [0,1] and denote λ̄ = 1 − λ. Then

  λ J̄_i'(u_i, f) + λ̄ J̄_i'(u_i, f') − J̄_i'(u_i, λf + λ̄f')
  = λλ̄ E ∫_0^T ( −|x_i − x_i' − Γ(m_i − m_i')|_Q^2 + (1/γ^2)|f(t) − f'(t)|^2 ) dt − λλ̄ E|x_i(T) − x_i'(T)|_H^2.

Set g = f − f' and z = x_i − x_i'. Therefore, z is deterministic and satisfies (10). In addition, m_i − m_i' = z for t ∈ [0,T]. Hence

  λ J̄_i'(u_i, f) + λ̄ J̄_i'(u_i, f') − J̄_i'(u_i, λf + λ̄f') = λλ̄ J̄_i''(g),

and the lemma follows. □

For our further existence analysis, we need to ensure that J̄_i'(u_i, f) is both strictly convex and coercive in f. For this purpose, we introduce the following assumption.

(H1) There exists a small ǫ_0 > 0 such that J̄_i''(g) − ǫ_0 ||g||_{L^2}^2 is positive semi-definite.

Note that (H1) is completely determined by the parameters (Q̂, γ, ǫ_0, H, T), and does not depend on u_i. Concerning (H1), we have the following result.

Proposition 3
The following statements are equivalent:

(i) (H1) holds true on [0, T].

(ii) The Riccati equation

  dP/dt + (A + G)^T P + P(A + G) − γ^2 P^2 − Q̂ = 0,  P(T) = −H  (11)

has a unique solution on [0, T].

(iii) For any t ∈ [0, T], det{(0, I) e^{At} (0, I)^T} > 0, where

  A = [ A + G + γ^2 H    −γ^2 I
        Q̆               −(A + G + γ^2 H)^T ]

and Q̆ = γ^2 H^2 + Q̂ + (A + G)^T H + H(A + G).

Proof.
In fact, (H1) is the uniform convexity condition proposed in [45], and the equivalence between (i) and (ii) is a corollary of Theorem 4.6 of [45]. Moreover, (iii) ⟹ (ii) is given in Theorem 4.3 of [40]. On the other hand, (ii) ⟹ (iii) is implied by Theorems 2.7 and 2.9 of [54]. □

For illustration of condition (ii), we give the following example.
Example 4
Consider system (3)-(4) with scalar parameters, where B = 1, Q = 1, H = 0, γ = 1, and A, G, Γ, R are chosen such that Â^2 > γ^2 Q̂, where Â = A + G and Q̂ = (1 − Γ)^2 Q. Solving (11) gives

  P(t) = −Q̂ (e^{α(t−T)} − e^{−α(t−T)}) / (λ_1 e^{α(t−T)} − λ_2 e^{−α(t−T)}),  (12)

where

  λ_1 = −Â − (Â^2 − γ^2 Q̂)^{1/2},  λ_2 = −Â + (Â^2 − γ^2 Q̂)^{1/2},  α = (Â^2 − γ^2 Q̂)^{1/2}.

If 0 < T < T_max = (1/(2α)) log(λ_1/λ_2), then P(t) given by (12) is well defined on [0, T]. By the local Lipschitz continuity property of the vector field in (11), P(t) is the unique solution.

Note that (11) is not a standard Riccati equation since the state weight matrix −Q̂ is not positive semi-definite. In general, the solvability of (11) cannot be ensured on an arbitrary time horizon. Condition (iii) enables us to determine the solvability of (11) on a given time horizon. Note that condition (iii) is equivalent to det{(0, I)e^{At}(0, I)^T} ≠ 0 for all t ∈ [0, T], by noting that det{(0, I)e^{At}(0, I)^T} = 1 at t = 0. Condition (iii) is more checkable, as illustrated by the following example.

Example 5
Consider system (3)-(4) with scalar parameters, where Q = 1, H = 0, γ = 1, and A, G, Γ are such that Â = A + G < 0 and α = (Â^2 − Q̂)^{1/2} > 0. With H = 0 we have Q̆ = Q̂ and

  A = [ Â    −1
        Q̂   −Â ].

Since A^2 = (Â^2 − Q̂)I = α^2 I, we obtain e^{At} = cosh(αt) I + (sinh(αt)/α) A, and hence

  det{(0, I)e^{At}(0, I)^T} = cosh(αt) − (Â/α) sinh(αt) = ((α − Â)/(2α)) e^{αt} + ((α + Â)/(2α)) e^{−αt} > 0,  ∀ t ≥ 0,  (13)

where positivity holds since the expression equals 1 at t = 0 and has derivative α sinh(αt) − Â cosh(αt) > 0 for t ≥ 0. Thus for any T > 0, (11) admits a unique solution on [0, T]. Therefore, (H1) holds true on [0, T].
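The closed form (12) in Example 4 can be cross-checked numerically. The sketch below uses hypothetical scalar data (Â = 1, Q̂ = 0.64, γ = 1, H = 0, so α = 0.6 and T_max ≈ 1.16; these values are illustrative only, not the examples' original data) and compares (12) against a backward RK4 integration of (11).

```python
import math

# Hypothetical scalar data (illustrative only): hatA = A + G, hatQ = (1-Gamma)^2 Q.
hatA, hatQ, gam, T, H = 1.0, 0.64, 1.0, 1.0, 0.0

alpha = math.sqrt(hatA ** 2 - gam ** 2 * hatQ)      # 0.6
lam1 = -hatA - alpha                                # -1.6
lam2 = -hatA + alpha                                # -0.4
Tmax = math.log(lam1 / lam2) / (2 * alpha)          # ~1.155, so T = 1 is admissible

def P_closed(t):
    """Closed-form solution (12) of the scalar Riccati equation (11)."""
    ep = math.exp(alpha * (t - T))
    em = math.exp(-alpha * (t - T))
    return -hatQ * (ep - em) / (lam1 * ep - lam2 * em)

def P_numeric(t, steps=20000):
    """Integrate dP/dt = -2*hatA*P + gam^2*P^2 + hatQ backward from P(T) = -H."""
    h = (T - t) / steps
    rhs = lambda P: -2 * hatA * P + gam ** 2 * P ** 2 + hatQ
    P = -H
    for _ in range(steps):        # RK4 with a negative time step, from T down to t
        k1 = rhs(P)
        k2 = rhs(P - 0.5 * h * k1)
        k3 = rhs(P - 0.5 * h * k2)
        k4 = rhs(P - h * k3)
        P -= h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return P

# P_closed and P_numeric agree on [0, T] as long as T < Tmax; the denominator
# of (12) vanishes (finite-time blow-up of P) as T approaches Tmax.
```

The same kind of scalar check applies to (13): with Â < 0 and α real, the quantity in condition (iii) is increasing from 1 and never vanishes.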
Assume (H1). Then J̄_i'(u_i, f) is strictly convex in f. Moreover, J̄_i'(u_i, f) is coercive in f and, in particular, there exists a constant C_{u_i,x_i(0)} depending on (u_i, x_i(0)) such that

  J̄_i'(u_i, f) ≥ (ǫ_0/2) ||f||_{L^2}^2 − C_{u_i,x_i(0)}.

Proof.
Since J̄_i''(g) − ǫ_0 ||g||_{L^2}^2 is positive semi-definite by (H1), J̄_i''(g) is positive definite. By Lemma 2, J̄_i'(u_i, f) is strictly convex in f. Following the method in proving Lemma 2, we can further show that χ(f) := J̄_i'(u_i, f) − ǫ_0 ||f||_{L^2}^2 is convex in f. By (7) and direct estimates, we can show

  sup_{||f||_{L^2} ≤ 1} |χ(f)| ≤ C_{1,u_i,x_i(0)},

where the constant C_{1,u_i,x_i(0)} depends on (u_i, x_i(0)). Now consider f with ||f||_{L^2} ≥ 1. Define f_1 = f/||f||_{L^2}. The convexity of χ implies

  χ(f_1) ≤ (1/||f||_{L^2}) χ(f) + (1 − 1/||f||_{L^2}) χ(0) ≤ (1/||f||_{L^2}) χ(f) + C_{1,u_i,x_i(0)}.  (14)

Consequently, for ||f||_{L^2} ≥ 1, (14) gives χ(f) ≥ −2C_{1,u_i,x_i(0)} ||f||_{L^2}. Hence for any f,

  χ(f) ≥ −C_{1,u_i,x_i(0)} (2||f||_{L^2} + 1).

It follows that

  J̄_i'(u_i, f) = χ(f) + ǫ_0 ||f||_{L^2}^2 ≥ ǫ_0 ||f||_{L^2}^2 − C_{1,u_i,x_i(0)}(2||f||_{L^2} + 1) ≥ (ǫ_0/2) ||f||_{L^2}^2 − C_{u_i,x_i(0)}

for some constant C_{u_i,x_i(0)}. □

Theorem 7
Suppose that (H1) holds and let u_i ∈ U and ū be fixed. Then

(i) J̄_i'(u_i, f) has a unique minimizer f̂, or equivalently, J̄_i(u_i, f) has a unique maximizer f̂;

(ii) there exists a unique solution (x_i, m_i, p_i) ∈ L^2_F(0,T;R^n) × C([0,T];R^n) × C([0,T];R^n) to the equation system

  dx_i = (Ax_i + Bu_i + Gm_i + γ^2 p_i) dt + D dW_i,
  dm_i/dt = (A + G)m_i + Bū + γ^2 p_i,
  dp_i/dt = −(A + G)^T p_i − (I − Γ)^T Q[E x_i − (Γm_i + η)],  (15)

where m_i(0) = m_0 and p_i(T) = H E x_i(T), and furthermore f̂ = γ^2 p_i.

Proof. (i) By Lemma 6, J̄_i' is strictly convex and coercive. In addition, J̄_i' is continuous in f. Hence there exists a unique f̂ such that J̄_i'(u_i, f̂) = inf_f J̄_i'(u_i, f) [33, Chap. 7], [39].

(ii) We start by establishing existence. Let the optimal state-control triple be denoted by (x_i, m_i, f̂), which is uniquely determined. We have

  dx_i = (Ax_i + Bu_i + Gm_i + f̂) dt + D dW_i,  (16)
  dm_i/dt = (A + G)m_i + Bū + f̂,  (17)

where m_i(0) = m_0. By using (x_i, m_i), we obtain a unique solution p_i from

  dp_i/dt = −(A + G)^T p_i − (I − Γ)^T Q[E x_i − (Γm_i + η)],  (18)

where p_i(T) = H E x_i(T).

Now we consider another control f = f̂ + f̃ ∈ L^2(0,T;R^n) in place of f̂. Let x̃_i and m̃_i be the first variations of x_i and m_i, respectively, which result from the variation f̃ of f̂. Then we have x̃_i = m̃_i for all t ∈ [0,T] and

  dx̃_i/dt = (A + G)x̃_i + f̃,  x̃_i(0) = 0.

Since J̄_i' has a minimum at (x_i, m_i, f̂), the first variation of the cost satisfies

  0 = δJ̄_i' = E ∫_0^T ( −[x_i − (Γm_i + η)]^T Q(I − Γ)x̃_i + (1/γ^2) f̂^T f̃ ) dt − E x_i^T(T) H x̃_i(T).  (19)

On the other hand,

  d(p_i^T x̃_i)/dt = x̃_i^T (dp_i/dt) + p_i^T (dx̃_i/dt) = −[E x_i − (Γm_i + η)]^T Q(I − Γ)x̃_i + p_i^T f̃.  (20)

Integrating both sides of (20) and invoking (19), we obtain

  p_i^T(T) x̃_i(T) = ∫_0^T ( p_i^T f̃ − (1/γ^2) f̂^T f̃ ) dt + E x_i^T(T) H x̃_i(T).  (21)

Recalling p_i(T) = H E x_i(T), since f̃ is arbitrary, it follows from (21) that

  f̂ = γ^2 p_i  for a.e. t ∈ [0,T].

Therefore, (x_i, m_i, p_i) determined by (16)-(18) is a solution to (15).

We proceed to show uniqueness. Suppose that (x_i', m_i', p_i') is another solution of (15). Set the control f' = γ^2 p_i'. It is straightforward to show that the first variation of J̄_i' at the state-control triple (x_i', m_i', f') is zero. Since J̄_i' is strictly convex, this implies that (x_i', m_i', f') is the unique optimal state-control triple and so coincides with (x_i, m_i, f̂), where (x_i, m_i) is the optimal state process determined from (16)-(18). This further implies p_i' = p_i. So uniqueness follows. The last part of (ii) is now obvious. □

3.3 The control problem of player A_i

Assume that (H1) holds. This will ensure that all the equation systems in this section have a well defined solution. The dynamics are given by

  dx_i = (Ax_i + Bu_i + Gm_i + γ^2 p_i) dt + D dW_i,
  dm_i/dt = (A + G)m_i + Bū + γ^2 p_i,
  dp_i/dt = −(A + G)^T p_i − (I − Γ)^T Q[E x_i − (Γm_i + η)],  (22)

where m_i(0) = m_0 and p_i(T) = H E x_i(T). The optimal control problem is

  (P2)  minimize_{u_i ∈ L^2_F(0,T;R^n)} J̄_i(u_i, f̂_{u_i}) = E ∫_0^T ( |x_i − (Γm_i + η)|_Q^2 + u_i^T R u_i − γ^2 |p_i(t)|^2 ) dt + E x_i^T(T) H x_i(T).

Here we have taken f̂_{u_i} = γ^2 p_i, which depends on u_i. We may simply write J̄_i(u_i). This is again a linear quadratic optimal control problem with indefinite weight for the state vector (x_i, m_i, p_i). Note that a perturbation in u_i will cause a change of the mean term E x_i.
So this is essentially a mean field type optimal control problem; see related work [3, 53].

We continue to identify conditions under which (P2) is strictly convex and coercive. These conditions will be characterized by using an auxiliary control problem with dynamics

  dz_i/dt = Az_i + Bν_i + Gz + γ^2 q,
  dz/dt = (A + G)z + γ^2 q,
  dq/dt = −(A + G)^T q − (I − Γ)^T Q(z_i − Γz),  (23)

where z_i(0) = z(0) = 0 and q(T) = Hz_i(T). The control ν_i ∈ L^2(0,T;R^n). The optimal control problem is

  (P2a)  minimize J̄_i^a(ν_i) = ∫_0^T ( |z_i − Γz|_Q^2 + ν_i^T R ν_i − γ^2 |q(t)|^2 ) dt + |z_i(T)|_H^2.  (24)

We may view this as a deterministic optimal control problem with two point boundary value conditions for the state trajectory. We say J̄_i^a is positive semi-definite if J̄_i^a(ν_i) ≥ 0 for all ν_i; if, furthermore, J̄_i^a(ν_i) > 0 for all ν_i ≠ 0, we say J̄_i^a is positive definite. In order to have a well defined optimal control problem, we need to show that (23) has a unique solution.

Lemma 8
Assume (H1). For each ν_i, there exists a unique solution (z_i, z, q) ∈ C([0,T];R^n) × C([0,T];R^n) × C([0,T];R^n) to (23).

Proof. Indeed, by taking u_i = 0 and u_i = ν_i ∈ L^2(0,T;R^n) in (22), we obtain two solutions (x_i^0, m_i^0, p_i^0) and (x_i^{ν_i}, m_i^{ν_i}, p_i^{ν_i}), respectively. It is easy to show that (z_i, z, q) := (x_i^{ν_i} − x_i^0, m_i^{ν_i} − m_i^0, p_i^{ν_i} − p_i^0) is a solution of (23), by observing that x_i^{ν_i} − x_i^0 is deterministic.

If there exist two different solutions to (23) for some ν_i, then we can construct two different solutions to (22) for a given u_i, which is a contradiction to Theorem 7. □

Lemma 9 J̄_i(u_i) is convex (resp., strictly convex) in u_i ∈ U if and only if J̄_i^a(ν_i) is positive semi-definite (resp., positive definite).

Proof. See Appendix A. □

We introduce the following assumption.

(H2)
There exists a small constant δ_0 > 0 such that J̄_i^a(ν_i) − δ_0 ||ν_i||_{L^2}^2 ≥ 0 for all ν_i ∈ L^2(0,T;R^n).

3.4 Representation of the quadratic functional

We intend to find an expression of J̄_i^a(ν_i) so that (H2) can be characterized in a more explicit form. A change of coordinates will make the computation more convenient. Define ž = z_i − z. Then (23) becomes

  dž/dt = Až + Bν_i,
  dz/dt = (A + G)z + γ^2 q,
  dq/dt = −Q̂z − (A + G)^T q − (I − Γ)^T Q ž,  (25)

where ž(0) = z(0) = 0 and q(T) = H(ž(T) + z(T)).

Define the Hamiltonian matrix

  H = [ A + G    γ^2 I
        −Q̂     −(A + G)^T ]

and the matrix ODE dΦ(t)/dt = HΦ(t), where Φ(0) = I. Denote the partition

  Φ(t) = [ Φ_11(t)  Φ_12(t)
           Φ_21(t)  Φ_22(t) ],

where each submatrix Φ_ij is an n×n matrix function. We have

  ž(t) = ∫_0^t e^{A(t−τ)} B ν_i(τ) dτ.  (26)

By solving (z, q) in (25), we obtain

  z(t) = Φ_12(t) q(0) − ∫_0^t Φ_12(t−s)(I − Γ)^T Q ž(s) ds,
  q(t) = Φ_22(t) q(0) − ∫_0^t Φ_22(t−s)(I − Γ)^T Q ž(s) ds,

where q(0) is to be determined. At the terminal time,

  z(T) = Φ_12(T) q(0) − ∫_0^T Φ_12(T−s)(I − Γ)^T Q ž(s) ds

and

  q(T) = Φ_22(T) q(0) − ∫_0^T Φ_22(T−s)(I − Γ)^T Q ž(s) ds
       = Hž(T) + HΦ_12(T) q(0) − H ∫_0^T Φ_12(T−s)(I − Γ)^T Q ž(s) ds,

where the second equality is due to the terminal condition of q. It follows that

  [Φ_22(T) − HΦ_12(T)] q(0) = Hž(T) + ∫_0^T [Φ_22(T−s) − HΦ_12(T−s)](I − Γ)^T Q ž(s) ds.  (27)

Proposition 10 If (H1) holds, Φ_22(T) − HΦ_12(T) is nonsingular.

Proof. Under (H1), (25) has a unique solution by Lemma 8, and accordingly, q(0) is uniquely determined. If Φ_22(T) − HΦ_12(T) were singular, we could find two different solutions of q(0) from (27), which would further give two different solutions to (25), leading to a contradiction. Hence, Φ_22(T) − HΦ_12(T) is nonsingular.
□

By solving q(0) from (27) and further eliminating ž, we write z and q as integrals depending on ν_i. Define the linear operator

  [L(ν_i)](t) = (ž(t), z(t), q(t)).

By standard estimates we can show that L is a linear and bounded operator from L^2(0,T;R^n) to L^2(0,T;R^{3n}). Let L^* be its adjoint operator. Define the operator

  L_T ν_i = ž(T) + z(T).

It can be shown that L_T is a linear and bounded operator from L^2(0,T;R^n) to R^n. Let L_T^* be its adjoint operator. Now J̄_i^a may be represented in terms of the inner product on L^2(0,T;R^n):

  J̄_i^a(ν_i) = ⟨Θν_i, ν_i⟩ + ⟨Rν_i, ν_i⟩ + ⟨Θ_T ν_i, ν_i⟩,  (28)

where

  Θν_i = L^* [ Q             Q(I − Γ)   0
               (I − Γ)^T Q   Q̂         0
               0             0          −γ^2 I ] L ν_i,
  Θ_T ν_i = L_T^* H L_T ν_i.

Proposition 11 (i) J̄_i(u_i) is convex in u_i ∈ U if and only if ⟨(Θ + Θ_T + R)ν_i, ν_i⟩ ≥ 0 for all ν_i ∈ L^2(0,T;R^n). (ii) (H2) holds if and only if there exists δ_0 > 0 such that ⟨(Θ + Θ_T + R)ν_i, ν_i⟩ ≥ δ_0 ||ν_i||_{L^2}^2 for all ν_i ∈ L^2(0,T;R^n).

Proof. (i) follows from Lemma 9 and the representation (28). (ii) follows from (28). □

The criterion in part (ii) of Proposition 11 still involves the operators Θ and Θ_T on an infinite dimensional space. Here we give a sufficient condition to ensure (H2) based on some more computable parameters. It is clear that

  ⟨(Θ + Θ_T + R)ν_i, ν_i⟩ ≥ ∫_0^T ( |ν_i(t)|_R^2 − γ^2 |q(t)|^2 ) dt.

For simplicity, we only consider the case H = 0, and simple computations lead to

  q(t) = Φ_22(t)Φ_22^{-1}(T) ∫_0^T Φ_22(T−s)(I − Γ)^T Q ∫_0^s e^{A(s−τ)}Bν_i(τ) dτ ds − ∫_0^t Φ_22(t−s)(I − Γ)^T Q ∫_0^s e^{A(s−τ)}Bν_i(τ) dτ ds
       =: q_1(t) − q_2(t).

Denote b_0 = sup_{0≤t≤T}|Φ_22(t)|, b_1 = sup_{0≤t≤T}|Φ_22(t)Φ_22^{-1}(T)|, b_2 = |Q(I − Γ)|, b_3 = ∫_0^T |e^{As}B| ds and b_4 = sup_{0≤t≤T}|e^{At}B|. By exchanging the order of integration in q_1 and q_2, it is easy to show

  |q_1(t)| ≤ (b_0 b_1 b_2 b_4) T ∫_0^T |ν_i(s)| ds,  |q_2(t)| ≤ b_0 b_2 b_4 ∫_0^t (t − τ)|ν_i(τ)| dτ,

which further gives

  ∫_0^T |q(t)|^2 dt ≤ C_q ∫_0^T |ν_i(t)|^2 dt,  (29)

where C_q = 2(b_0 b_1 b_2 b_4)^2 T^4 + (b_0 b_2 b_4)^2 T^4. For the case H = 0, (H2) holds whenever R > γ^2 C_q I.

3.5 The solution of (P2)

Let ū ∈ C([0,T];R^n) be fixed.

Lemma 12
Assume (H1)-(H2). Then (P2) has a unique optimal state-control pair of the form (x_i, m_i, p_i, û_i) satisfying

  dx_i = (Ax_i + Bû_i + Gm_i + γ^2 p_i) dt + D dW_i,
  dm_i/dt = (A + G)m_i + Bū + γ^2 p_i,
  dp_i/dt = −(A + G)^T p_i − (I − Γ)^T Q[E x_i − (Γm_i + η)],  (30)

where m_i(0) = m_0 and p_i(T) = H E x_i(T). Furthermore, the backward stochastic differential equation (BSDE)

  dy_i = { −A^T y_i + Q[x_i − (Γm_i + η)] } dt + ζ_i dW_i,  y_i(T) = −Hx_i(T)  (31)

has a unique solution (y_i, ζ_i) ∈ L^2_F(0,T;R^n) × L^2_F(0,T;R^{n×n}) and

  û_i = R^{-1} B^T y_i.  (32)

Proof.
Under (H2), by adapting Lemma 9 to the auxiliary control problem with cost functional J̄_i(u_i) − δ_0 E∫_0^T |u_i|^2 dt, we can show that J̄_i(u_i) − δ_0 E∫_0^T |u_i|^2 dt is convex in u_i. By the method in proving Lemma 6, we can further show that J̄_i is strictly convex and coercive in u_i. Hence (P2) has a unique optimal state-control pair (x_i, m_i, p_i, û_i) which minimizes J̄_i(u_i).

Given (x_i, m_i, p_i, û_i), (31) is a standard linear BSDE and so has a unique solution (y_i, ζ_i). Further define the BSDE

  dy = { −G^T y_i − (A + G)^T y − Γ^T Q[x_i − (Γm_i + η)] } dt + ζ dW_i,

where y(T) = 0. It also has a unique solution (y, ζ) ∈ L^2_F(0,T;R^n) × L^2_F(0,T;R^{n×n}). It can be checked that

  d/dt [E(y + y_i) + p_i] = −(A + G)^T [E(y + y_i) + p_i]

and E(y(T) + y_i(T)) + p_i(T) = 0. So

  E(y_i + y) + p_i = 0  (33)

for all t ∈ [0,T].

Let û_i be replaced by û_i + ũ_i ∈ L^2_F(0,T;R^n) in (30), and let the resulting solution be denoted by (x_i + x̃_i, m_i + m̃_i, p_i + p̃_i), which exists and is unique by Theorem 7. It follows that

  dx̃_i/dt = Ax̃_i + Bũ_i + Gm̃_i + γ^2 p̃_i,
  dm̃_i/dt = (A + G)m̃_i + γ^2 p̃_i,
  dp̃_i/dt = −(A + G)^T p̃_i − (I − Γ)^T Q(E x̃_i − Γm̃_i),

where x̃_i(0) = m̃_i(0) = 0 and p̃_i(T) = H E x̃_i(T). The first variation of J̄_i about û_i satisfies

  0 = δJ̄_i = E ∫_0^T { (x̃_i − Γm̃_i)^T Q[x_i − (Γm_i + η)] + ũ_i^T R û_i − γ^2 p̃_i^T p_i } dt + E x̃_i^T(T) H x_i(T).  (34)

By applying Ito's formula to x̃_i^T y_i, we obtain

  E x̃_i^T(T) y_i(T) − E x̃_i^T(0) y_i(0) = E ∫_0^T { x̃_i^T Q[x_i − (Γm_i + η)] + y_i^T(Bũ_i + Gm̃_i + γ^2 p̃_i) } dt.

Similarly,

  E m̃_i^T(T) y(T) − E m̃_i^T(0) y(0) = E ∫_0^T { γ^2 y^T p̃_i − m̃_i^T ( G^T y_i + Γ^T Q[x_i − (Γm_i + η)] ) } dt.

Therefore, adding up the two equations yields

  −E x̃_i^T(T) H x_i(T) = E ∫_0^T { (x̃_i − Γm̃_i)^T Q[x_i − (Γm_i + η)] + y_i^T Bũ_i + γ^2 (y + y_i)^T p̃_i } dt.  (35)

By (34) and (35),

  E ∫_0^T [ ũ_i^T R û_i − γ^2 p̃_i^T p_i − ũ_i^T B^T y_i − γ^2 p̃_i^T(y + y_i) ] dt = 0.

Note that by (33),

  E ∫_0^T p̃_i^T (p_i + y + y_i) dt = ∫_0^T p̃_i^T [p_i + E(y + y_i)] dt = 0.

Hence,

  E ∫_0^T ũ_i^T (R û_i − B^T y_i) dt = 0.

Since ũ_i ∈ L^2_F(0,T;R^n) is arbitrary, (32) follows. □

After substituting û_i = R^{-1}B^T y_i into (30), we form the equation system

  dx_i = (Ax_i + BR^{-1}B^T y_i + Gm_i + γ^2 p_i) dt + D dW_i,
  dm_i/dt = (A + G)m_i + Bū + γ^2 p_i,
  dp_i/dt = −(A + G)^T p_i − (I − Γ)^T Q[E x_i − (Γm_i + η)],
  dy_i = { −A^T y_i + Q[x_i − (Γm_i + η)] } dt + ζ_i dW_i,  (36)

where x_i(0) is given, m_i(0) = m_0, p_i(T) = H E x_i(T), and y_i(T) = −Hx_i(T). This equation system consists of 2 forward equations and 2 backward equations. It is clear that the solution of the optimal control problem (P2) satisfies the above FBSDE. A natural question is whether this FBSDE's solution completely determines the optimal control. This is answered by the next theorem. Denote

  S[0,T] = L^2_F(0,T;R^n) × C([0,T];R^n) × C([0,T];R^n) × L^2_F(0,T;R^n) × L^2_F(0,T;R^{n×n}).

Theorem 13
Assume (H1)--(H2). Then the FBSDE (36) has a unique solution $(x_i,m_i,p_i,y_i,\zeta_i)\in\mathcal S[0,T]$, and the optimal control for (P2) is given by $\hat u_i=R^{-1}B^Ty_i$.

Proof. We solve (P1) first and (P2) next to determine $\hat u_i$. By Lemma 12, we obtain $(x_i,m_i,p_i,y_i,\zeta_i)$ satisfying (30)--(31) and $\hat u_i=R^{-1}B^Ty_i$. Obviously, $(x_i,m_i,p_i,y_i,\zeta_i)$ satisfies (36).

We continue to show uniqueness. Suppose that $(x_i,m_i,p_i,y_i,\zeta_i)$ and $(x_i',m_i',p_i',y_i',\zeta_i')$ are two solutions of (36). Define $\check u_i=R^{-1}B^Ty_i$ and $u_i'=R^{-1}B^Ty_i'$, which are both well-defined elements of $L^2_{\mathcal F}(0,T;\mathbb R^n)$. In particular, we have
\[ dx_i=(Ax_i+B\check u_i+Gm_i+\gamma p_i)dt+DdW_i,\quad \dot m_i=(A+G)m_i+B\bar u+\gamma p_i, \]
\[ \dot p_i=-(A+G)^Tp_i-(I-\Gamma)^TQ[Ex_i-(\Gamma m_i+\eta)],\quad dy_i=\{-A^Ty_i+Q[x_i-(\Gamma m_i+\eta)]\}dt+\zeta_idW_i, \tag{37} \]
where $x_i(0)$ is given, $m_i(0)=m_0$, $p_i(T)=HEx_i(T)$, and $y_i(T)=-Hx_i(T)$.

As in the proof of Lemma 12, we evaluate the first variation of $\bar J_i(u_i)$ at $(x_i,m_i,p_i,\check u_i)$ and can show $\delta\bar J_i=0$. Since $\bar J_i$ is convex, this zero first variation condition implies that $\check u_i$ is an optimal control of (P2). By the same reasoning, $u_i'$ is also an optimal control. By strict convexity, we have $\check u_i=u_i'$. Subsequently, $(x_i,m_i,p_i)=(x_i',m_i',p_i')$ by Theorem 7, which further implies $(y_i,\zeta_i)=(y_i',\zeta_i')$. $\Box$

The Solution of the Robust Game
Note that Theorem 13 determines the strategy of a representative agent when $\bar u$ is fixed. Denote
\[ x^{(N)}=\frac1N\sum_{i=1}^Nx_i,\quad y^{(N)}=\frac1N\sum_{i=1}^Ny_i,\quad m^{(N)}=\frac1N\sum_{i=1}^Nm_i,\quad p^{(N)}=\frac1N\sum_{i=1}^Np_i. \tag{38} \]
By (36), we obtain
\[ dx^{(N)}=\big(Ax^{(N)}+BR^{-1}B^Ty^{(N)}+Gm^{(N)}+\gamma p^{(N)}\big)dt+\frac DN\sum_{i=1}^NdW_i,\quad \frac{dm^{(N)}}{dt}=(A+G)m^{(N)}+B\bar u+\gamma p^{(N)}, \]
\[ \frac{dp^{(N)}}{dt}=-(A+G)^Tp^{(N)}-(I-\Gamma)^TQ\big[Ex^{(N)}-(\Gamma m^{(N)}+\eta)\big],\quad dy^{(N)}=\big\{-A^Ty^{(N)}+Q[x^{(N)}-(\Gamma m^{(N)}+\eta)]\big\}dt+\frac1N\sum_{i=1}^N\zeta_idW_i, \tag{39} \]
where $x^{(N)}(0)=(1/N)\sum_{i=1}^Nx_i(0)$, $m^{(N)}(0)=m_0$, $p^{(N)}(T)=HEx^{(N)}(T)$, and $y^{(N)}(T)=-Hx^{(N)}(T)$. As an approximation to (39), we construct the following limiting system
\[ \dot{\mathbf x}=A\mathbf x+BR^{-1}B^T\mathbf y+G\mathbf m+\gamma\mathbf p,\quad \dot{\mathbf m}=(A+G)\mathbf m+B\bar u+\gamma\mathbf p, \]
\[ \dot{\mathbf p}=-(A+G)^T\mathbf p-(I-\Gamma)^TQ[\mathbf x-(\Gamma\mathbf m+\eta)],\quad \dot{\mathbf y}=-A^T\mathbf y+Q[\mathbf x-(\Gamma\mathbf m+\eta)], \tag{40} \]
where $\mathbf x(0)=\mathbf m(0)=m_0$, $\mathbf p(T)=H\mathbf x(T)$, and $\mathbf y(T)=-H\mathbf x(T)$. This is a two-point boundary value problem.

Note that $\mathbf y$ is intended as an approximation of $y^{(N)}$ as $N\to\infty$. The consistency requirement imposes
\[ \bar u=R^{-1}B^T\mathbf y. \tag{41} \]
Under the condition (41), the first two equations in (40) coincide, giving $\mathbf x=\mathbf m$ for all $t\in[0,T]$. Consequently, we eliminate the equation of $\mathbf x$ and introduce the new system
\[ \dot{\mathbf m}=(A+G)\mathbf m+BR^{-1}B^T\mathbf y+\gamma\mathbf p,\quad \dot{\mathbf p}=-(A+G)^T\mathbf p-(I-\Gamma)^TQ[\mathbf m-(\Gamma\mathbf m+\eta)],\quad \dot{\mathbf y}=-A^T\mathbf y+Q[\mathbf m-(\Gamma\mathbf m+\eta)], \tag{42} \]
where $\mathbf m(0)=m_0$, $\mathbf p(T)=H\mathbf m(T)$, and $\mathbf y(T)=-H\mathbf m(T)$. This is still a two-point boundary value problem. The next corollary follows from Theorem 13.

Corollary 14
Assume (H1)--(H2). Suppose that (42) has a unique solution $(\mathbf m,\mathbf p,\mathbf y)\in(C([0,T];\mathbb R^n))^3$ and take $\bar u=R^{-1}B^T\mathbf y$ in (36). Then (36) has a unique solution $(x_i,m_i,p_i,y_i,\zeta_i)\in\mathcal S[0,T]$. $\Box$

Consider the special case where all agents have the same initial condition $x_i(0)=m_0$ for all $i\ge1$. For each fixed $\bar u\in C([0,T];\mathbb R^n)$, solve (36) and define the map $\Lambda(\bar u)=R^{-1}B^TEy_i$. Clearly $R^{-1}B^TEy_i$ is a continuous $\mathbb R^n$-valued function of $t\in[0,T]$. By the consistency requirement $\bar u=\Lambda(\bar u)$, we set $\bar u=R^{-1}B^TEy_i$ in the second equation of (36) to obtain the equation system of the mean field game:
\[ dx_i=(Ax_i+BR^{-1}B^Ty_i+Gm_i+\gamma p_i)dt+DdW_i,\quad \dot m_i=(A+G)m_i+BR^{-1}B^TEy_i+\gamma p_i, \]
\[ \dot p_i=-(A+G)^Tp_i-(I-\Gamma)^TQ[Ex_i-(\Gamma m_i+\eta)],\quad dy_i=\{-A^Ty_i+Q[x_i-(\Gamma m_i+\eta)]\}dt+\zeta_idW_i, \tag{43} \]
where $x_i(0)=m_i(0)=m_0$, $p_i(T)=HEx_i(T)$, and $y_i(T)=-Hx_i(T)$.

An interesting fact is that the existence and uniqueness of a solution to (43) are completely determined by the ODE system (42), without further use of (H1)--(H2).

Theorem 15  (43) has a unique solution $(x_i,m_i,p_i,y_i,\zeta_i)\in\mathcal S[0,T]$ if and only if (42) has a unique solution.

Proof. By Lemma B.1, (43) has a unique solution if and only if the FBSDE (B.1) has a unique solution. By Lemma B.2 and Lemma B.3-(iii), the FBSDE (B.1) has a unique solution if and only if (42) has a unique solution. The theorem follows. $\Box$

Solvability of (42)
To study the existence and uniqueness of a solution to (42), we use a fixed point approach and introduce the equation system
\[ \dot{\mathbf m}=(A+G)\mathbf m+h+\gamma\mathbf p,\quad \dot{\mathbf p}=-(A+G)^T\mathbf p-\widehat Q\mathbf m+(I-\Gamma)^TQ\eta,\quad \dot{\mathbf y}=-A^T\mathbf y+Q[\mathbf m-(\Gamma\mathbf m+\eta)], \tag{44} \]
where $h\in C([0,T];\mathbb R^n)$, $\widehat Q=(I-\Gamma)^TQ(I-\Gamma)$, $\mathbf m(0)=m_0$, $\mathbf p(T)=H\mathbf m(T)$, and $\mathbf y(T)=-H\mathbf m(T)$. The next lemma identifies a sufficient condition for (44) to have a unique solution for any $h\in C([0,T];\mathbb R^n)$.

Lemma 16
Suppose that the Riccati equation
\[ \dot K+K(A+G)+(A+G)^TK-\gamma K^2-\widehat Q=0,\quad K(T)=-H, \tag{45} \]
has a unique solution on $[0,T]$. Then (44) defines a mapping from $C([0,T];\mathbb R^n)$ to itself: $\Lambda:h\mapsto BR^{-1}B^T\mathbf y$, where $(\mathbf m,\mathbf p,\mathbf y)$ solves (44) with input $h$.

Proof. We write $\mathbf p=-K\mathbf m+\phi$ for (44) and obtain the ODE
\[ \dot\phi=-(A+G-\gamma K)^T\phi+Kh+(I-\Gamma)^TQ\eta,\quad \phi(T)=0. \]
It follows that $\dot{\mathbf m}=(A+G-\gamma K)\mathbf m+h+\gamma\phi$. Let the fundamental solution matrices of the two ODEs
\[ \dot\varphi=(A+G-\gamma K)\varphi,\qquad \dot\psi=-(A+G-\gamma K)^T\psi \]
be $\Phi(t,s)$ and $\Psi(t,s)$, respectively, with $\Phi(s,s)=\Psi(s,s)=I$. Then $\Psi(t,s)=\Phi^T(s,t)$. We obtain
\[ \phi(t)=-\int_t^T\Psi(t,s)[K(s)h(s)+(I-\Gamma)^TQ\eta]ds, \]
\[ \mathbf m(t)=\Phi(t,0)m_0+\int_0^t\Phi(t,s)h(s)ds-\gamma\int_0^t\Phi(t,s)\int_s^T\Psi(s,s_1)[K(s_1)h(s_1)+(I-\Gamma)^TQ\eta]ds_1ds. \]
We further solve
\[ \mathbf y(t)=-\int_t^Te^{-A^T(t-s)}Q[(I-\Gamma)\mathbf m(s)-\eta]ds-e^{-A^T(t-T)}H\mathbf m(T), \]
which implies $\mathbf y\in C([0,T];\mathbb R^n)$. The lemma follows. $\Box$

To simplify the existence analysis for (42) in this section, we consider the case $H=0$. Below $\Upsilon_k$ denotes a continuous function of $t$ which does not depend on $h$ and can be easily determined. Consequently,
\[ \mathbf y(t)=-\int_t^Te^{-A^T(t-s)}Q[(I-\Gamma)\mathbf m(s)-\eta]ds=-\int_t^Te^{-A^T(t-s)}Q(I-\Gamma)\mathbf m(s)ds+\Upsilon_0(t) \]
\[ =-\int_t^Te^{-A^T(t-s)}Q(I-\Gamma)\int_0^s\Phi(s,s_1)h(s_1)ds_1ds+\gamma\int_t^Te^{-A^T(t-s)}Q(I-\Gamma)\int_0^s\Phi(s,s_1)\int_{s_1}^T\Psi(s_1,s_2)K(s_2)h(s_2)ds_2ds_1ds+\Upsilon_1(t). \]
Now we have
\[ \Lambda(h)(t)=BR^{-1}B^T\mathbf y(t)=-BR^{-1}B^T\int_t^Te^{-A^T(t-s)}Q(I-\Gamma)\int_0^s\Phi(s,s_1)h(s_1)ds_1ds \]
\[ +\gamma BR^{-1}B^T\int_t^Te^{-A^T(t-s)}Q(I-\Gamma)\int_0^s\Phi(s,s_1)\int_{s_1}^T\Psi(s_1,s_2)K(s_2)h(s_2)ds_2ds_1ds+BR^{-1}B^T\Upsilon_1(t)=:\Lambda_1(h)(t)+BR^{-1}B^T\Upsilon_1(t). \]
It is clear that $\Lambda_1$ maps $C([0,T];\mathbb R^n)$ to itself.
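The Riccati equation (45) is the computational core of the decoupling above. As a quick sanity check, one can integrate it backward in time and verify that a bounded solution exists on the whole interval; the sketch below does this for the scalar case with backward Euler steps. All parameter values are illustrative assumptions, not taken from the paper's examples.

```python
# Backward Euler integration of the scalar form of the Riccati equation (45):
#   K' + 2(A+G)K - gamma*K^2 - Q_hat = 0,  K(T) = -H.
# All parameter values below are illustrative assumptions.

def solve_riccati(A, G, gamma, Q_hat, H, T, steps=20000):
    """Integrate K from t = T back to t = 0; return (K(0), max_t |K(t)|)."""
    dt = T / steps
    K = -H                      # terminal condition K(T) = -H
    max_abs = abs(K)
    for _ in range(steps):
        # from (45): K' = -2(A+G)K + gamma*K^2 + Q_hat
        dK = -2.0 * (A + G) * K + gamma * K * K + Q_hat
        K -= dt * dK            # one backward step in time
        max_abs = max(max_abs, abs(K))
    return K, max_abs

K0, c0 = solve_riccati(A=0.5, G=0.2, gamma=0.1, Q_hat=1.0, H=0.0, T=1.0)
print(K0, c0)
```

For these assumed values $K$ stays bounded on $[0,1]$ (numerically $K(0)\approx-2.34$), so the hypothesis of Lemma 16 holds; for larger $\gamma$ or $\widehat Q$ the backward flow can blow up before reaching $t=0$, in which case (45) has no solution on the whole interval.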
Define the constants
\[ c_0=\max_{t\in[0,T]}|K(t)|,\quad c_1=\max_{0\le t,s\le T}|\Phi(t,s)|,\quad c_2=\max_{t\in[0,T]}\int_t^T|e^{A(s-t)}|s\,ds,\quad c_3=\max_{t\in[0,T]}\int_t^T|e^{A(s-t)}|\Big(Ts-\frac{s^2}2\Big)ds. \]
Note that
$Ts-\frac{s^2}2\ge0$ for $s\in[0,T]$. Denote $\|h\|_\infty=\max_{t\in[0,T]}|h(t)|$.

Theorem 17
Assume $H=0$. If
\[ c_1|BR^{-1}B^T|\cdot|Q(I-\Gamma)|\,(c_2+\gamma c_0c_1c_3)<1, \tag{46} \]
then (42) has a unique solution.

Proof. For each $t$,
\[ |\Lambda_1(h)(t)|\le c_1\|h\|_\infty|BR^{-1}B^T|\cdot|Q(I-\Gamma)|\int_t^T|e^{A(s-t)}|s\,ds+\gamma c_0c_1^2\|h\|_\infty|BR^{-1}B^T|\cdot|Q(I-\Gamma)|\int_t^T|e^{A(s-t)}|\Big(Ts-\frac{s^2}2\Big)ds \]
\[ \le c_1|BR^{-1}B^T|\cdot|Q(I-\Gamma)|\,(c_2+\gamma c_0c_1c_3)\,\|h\|_\infty. \]
Hence $\Lambda_1$, and therefore $\Lambda$ (whose $h$-dependence is through $\Lambda_1$), is a contraction, and $\Lambda$ has a unique fixed point. So (42) has a unique solution. $\Box$

The constants $c_0,\dots,c_3$ in (46) do not depend on $BR^{-1}B^T$. If $BR^{-1}B^T$ is suitably small, (46) can be ensured.

Example 18
Consider the system with parameters given by Example 4. Take $T=1$. In analogue to (12), we can solve $K(t)$ on $[0,T]$ for (45). It can be shown that $K(t)\le0$ for $t\in[0,T]$ and that $|K(t)|$ attains its maximum on $[0,T]$ at $t=0$, which gives $c_0=|K(0)|$. Furthermore, $c_1\le e^{(A+G+\gamma c_0)T}$, and
\[ c_2\le\int_0^Te^{As}s\,ds,\qquad c_3\le\int_0^Te^{As}\Big(Ts-\frac{s^2}2\Big)ds. \]
With the resulting numerical values, $c_1|BR^{-1}B^T|\cdot|Q(I-\Gamma)|(c_2+\gamma c_0c_1c_3)<1$. So (46) holds.

Remark 1
For the two-point boundary value problem, the contraction estimate in the fixed point method may be conservative and typically works on small time intervals for the solvability of (42) (see, e.g., [40, Ch. 1, Sec. 5]).

We continue to derive another condition under which (42) is solvable without restriction to a small time horizon. To this end, we first rewrite (42) in the following form:
\[ \begin{pmatrix}\dot{\mathbf m}\\ \dot{\mathbf p}\\ \dot{\mathbf y}\end{pmatrix}=\widetilde A\begin{pmatrix}\mathbf m\\ \mathbf p\\ \mathbf y\end{pmatrix}+\widetilde\eta,\qquad \mathbf m(0)=m_0,\ \mathbf p(T)=H\mathbf m(T),\ \mathbf y(T)=-H\mathbf m(T), \tag{47} \]
where
\[ \widetilde A=\begin{pmatrix}A+G&\gamma I&BR^{-1}B^T\\ -(I-\Gamma)^TQ(I-\Gamma)&-(A+G)^T&0\\ Q(I-\Gamma)&0&-A^T\end{pmatrix},\qquad \widetilde\eta=\begin{pmatrix}0\\ (I-\Gamma)^TQ\eta\\ -Q\eta\end{pmatrix}. \]
Then, by the variation of constants formula, we have
\[ \begin{pmatrix}\mathbf m(t)\\ \mathbf p(t)\\ \mathbf y(t)\end{pmatrix}=\Theta(t)\begin{pmatrix}m_0\\ \mu\\ \nu\end{pmatrix}+\Theta(t)\int_0^t\Theta^{-1}(s)\widetilde\eta\,ds, \tag{48} \]
where $\Theta(t)=e^{\widetilde At}$ and $\mathbf p,\mathbf y$ have the (unknown) initial conditions $\mathbf p(0)=\mu$, $\mathbf y(0)=\nu$. Noting the terminal condition in (47), we now present the following result.

Proposition 19 ([40, Ch. 2, Sec. 3]) If for given $T>0$,
\[ \det(\widetilde\Theta(T))\ne0,\qquad\text{where}\quad \widetilde\Theta(T)=\begin{pmatrix}-H&I&0\\ H&0&I\end{pmatrix}\Theta(T)\begin{pmatrix}0&0\\ I&0\\ 0&I\end{pmatrix}, \]
then (42) has a unique solution on $[0,T]$ for any initial value $m_0$.

For illustration, we give the following example.
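The determinant condition in Proposition 19 is straightforward to test numerically: integrate $\dot\Theta=\widetilde A\Theta$, $\Theta(0)=I$, and evaluate $\det\widetilde\Theta(T)$. The sketch below (a hedged illustration, not the paper's computation) treats a scalar case with $\Gamma=1$ and $H=0$ as in Example 20 below, where $\widetilde A$ is upper triangular and the determinant has the closed form $e^{-(2A+G)T}$; all parameter values, including `b2` standing in for $R^{-1}B^2$, are assumptions.

```python
import math

a, g, gamma, b2 = 0.4, 0.3, 0.2, 0.5   # illustrative scalars; b2 plays the role of R^{-1}B^2
T = 1.0

# tilde-A for the (m, p, y) system (47) when Gamma = 1, so Q(I - Gamma) = 0
At = [[a + g, gamma, b2],
      [0.0, -(a + g), 0.0],
      [0.0, 0.0, -a]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Integrate Theta' = At * Theta, Theta(0) = I, by explicit Euler
steps = 50000
h = T / steps
Theta = [[float(i == j) for j in range(3)] for i in range(3)]
for _ in range(steps):
    dTheta = mat_mul(At, Theta)
    Theta = [[Theta[i][j] + h * dTheta[i][j] for j in range(3)] for i in range(3)]

# With H = 0, tilde-Theta(T) is the lower-right 2x2 block of Theta(T)
det = Theta[1][1] * Theta[2][2] - Theta[1][2] * Theta[2][1]
expected = math.exp(-(2 * a + g) * T)   # closed form in this triangular case
print(det, expected)
```

In the general (non-triangular) case one would form the full $\widetilde\Theta(T)$ with the $H$-blocks and check that its determinant stays away from zero.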
Example 20
Consider system (3)--(4) with all parameters scalar-valued and $\Gamma=1$, $H=0$. We calculate
\[ \mathbf A=\begin{pmatrix}A+G&-\gamma\\ 0&-(A+G)\end{pmatrix},\qquad \widetilde A=\begin{pmatrix}A+G&\gamma&R^{-1}B^2\\ 0&-(A+G)&0\\ 0&0&-A\end{pmatrix}, \]
where $\mathbf A$ is defined in Proposition 3. By direct computation, we obtain
\[ \det\{(0,I)e^{\mathbf At}(0,I)^T\}=e^{-(A+G)t}>0\quad\text{for all }t\in[0,T], \]
which ensures (H1) by Proposition 3. Moreover, $b=0$ gives $C_q=0$ in (29), so that (H2) always holds for $R>0$. Finally, $\det(\widetilde\Theta(t))=e^{-(2A+G)t}>0$, and subsequently (42) has a unique solution on any interval $[0,T]$. To summarize, (H1), (H2) and the solvability of (42) are all satisfied by this system.

Note that the solvability of (42) in Example 20 does not depend on the value of $R^{-1}B^2$, which differs from the condition in Theorem 17.

We suppose that (42) has a unique solution $(\mathbf m,\mathbf p,\mathbf y)$ and accordingly take $\bar u$ in (36) as
\[ \bar u^*=R^{-1}B^T\mathbf y. \tag{49} \]
The FBSDE system (36) now becomes
\[ dx_i=(Ax_i+BR^{-1}B^Ty_i+Gm_i+\gamma p_i)dt+DdW_i,\quad \dot m_i=(A+G)m_i+B\bar u^*+\gamma p_i, \]
\[ \dot p_i=-(A+G)^Tp_i-(I-\Gamma)^TQ[Ex_i-(\Gamma m_i+\eta)],\quad dy_i=\{-A^Ty_i+Q[x_i-(\Gamma m_i+\eta)]\}dt+\zeta_idW_i, \tag{50} \]
where $x_i(0)$ is given, $m_i(0)=m_0$, $p_i(T)=HEx_i(T)$, and $y_i(T)=-Hx_i(T)$. By Corollary 14, this FBSDE has a unique solution. In the game of $N$ players, let $y_i$ be solved from (50) and denote the control for $\mathcal A_i$ by
\[ \hat u_i=R^{-1}B^Ty_i,\quad 1\le i\le N, \tag{51} \]
which is a well defined process in $L^2_{\mathcal F}(0,T;\mathbb R^n)$. For $\hat u^{(N)}=(1/N)\sum_{i=1}^N\hat u_i$, we aim to estimate $E|\hat u^{(N)}(t)-\bar u^*(t)|^2$.

Note that $\hat u_1,\dots,\hat u_N$ are independent, but not necessarily identically distributed, due to possibly different initial states of the agents. This fact will somewhat complicate our error estimate. The key result of this section is the following theorem.

Theorem 21 Assume that (H1)--(H2) hold and that (42) has a unique solution.
We have
\[ \sup_{0\le t\le T}E|\hat u^{(N)}-\bar u^*|^2=O(1/N)+O(|x^{(N)}(0)-m_0|^2), \]
where $x^{(N)}(0)=(1/N)\sum_{i=1}^Nx_i(0)$. $\Box$

The proof of Theorem 21 is provided in the remaining part of this section. To do this, we need to prove some lemmas under the assumptions of the theorem. Recalling (38), we take $\bar u=\bar u^*$ in (39) to write
\[ dx^{(N)}=\big(Ax^{(N)}+BR^{-1}B^Ty^{(N)}+Gm^{(N)}+\gamma p^{(N)}\big)dt+\frac DN\sum_{i=1}^NdW_i,\quad \frac{dm^{(N)}}{dt}=(A+G)m^{(N)}+B\bar u^*+\gamma p^{(N)}, \]
\[ \frac{dp^{(N)}}{dt}=-(A+G)^Tp^{(N)}-(I-\Gamma)^TQ\big[Ex^{(N)}-(\Gamma m^{(N)}+\eta)\big],\quad dy^{(N)}=\big\{-A^Ty^{(N)}+Q[x^{(N)}-(\Gamma m^{(N)}+\eta)]\big\}dt+\frac1N\sum_{i=1}^N\zeta_idW_i, \tag{52} \]
where $x^{(N)}(0)=(1/N)\sum_{i=1}^Nx_i(0)$, $m^{(N)}(0)=m_0$, $p^{(N)}(T)=HEx^{(N)}(T)$, and $y^{(N)}(T)=-Hx^{(N)}(T)$. Denote the ODE system
\[ \dot{\mathbf x}_N=A\mathbf x_N+BR^{-1}B^T\mathbf y_N+G\mathbf m_N+\gamma\mathbf p_N,\quad \dot{\mathbf m}_N=(A+G)\mathbf m_N+B\bar u^*+\gamma\mathbf p_N, \]
\[ \dot{\mathbf p}_N=-(A+G)^T\mathbf p_N-(I-\Gamma)^TQ[\mathbf x_N-(\Gamma\mathbf m_N+\eta)],\quad \dot{\mathbf y}_N=-A^T\mathbf y_N+Q[\mathbf x_N-(\Gamma\mathbf m_N+\eta)], \tag{53} \]
where $\mathbf x_N(0)=(1/N)\sum_{i=1}^Nx_i(0)$, $\mathbf m_N(0)=m_0$, $\mathbf p_N(T)=H\mathbf x_N(T)$, and $\mathbf y_N(T)=-H\mathbf x_N(T)$. The initial condition $\mathbf x_N(0)$ is different from that of (40).

Lemma 22 (53) has a unique solution, which can be written as $(\mathbf x_N,\mathbf m_N,\mathbf p_N,\mathbf y_N)=(Ex^{(N)},m^{(N)},p^{(N)},Ey^{(N)})$.

Proof.
Existence follows by taking expectations in (52). To show uniqueness, suppose that (53) has two different solutions $(\mathbf x_N,\mathbf m_N,\mathbf p_N,\mathbf y_N)$ and $(\mathbf x_N',\mathbf m_N',\mathbf p_N',\mathbf y_N')$. Then for any $\lambda\in\mathbb R$,
\[ (x_i,m_i,p_i,y_i,\zeta_i)+\lambda(\mathbf x_N-\mathbf x_N',\,\mathbf m_N-\mathbf m_N',\,\mathbf p_N-\mathbf p_N',\,\mathbf y_N-\mathbf y_N',\,0) \]
would solve (36), contradicting the uniqueness asserted in Corollary 14. $\Box$

Lemma 23
We have
\[ \sup_{0\le t\le T}\big(E|x^{(N)}-Ex^{(N)}|^2+E|y^{(N)}-Ey^{(N)}|^2\big)=O(1/N). \]

Proof.
Define $(\theta_1,\theta_2)=(x^{(N)}-Ex^{(N)},\,y^{(N)}-Ey^{(N)})$. By (52), (53) and Lemma 22,
\[ d\theta_1=(A\theta_1+BR^{-1}B^T\theta_2)dt+\frac DN\sum_{i=1}^NdW_i,\qquad d\theta_2=(-A^T\theta_2+Q\theta_1)dt+\frac1N\sum_{i=1}^N\zeta_idW_i, \]
where $\theta_1(0)=0$ and $\theta_2(T)=-H\theta_1(T)$. Let $P$ be the solution of the Riccati equation
\[ \dot P+A^TP+PA-PBR^{-1}B^TP+Q=0,\quad P(T)=H. \]
Denote $\theta_2=-P\theta_1+\psi$, where $\psi(T)=0$. This gives the equation
\[ d\psi=-(A-BR^{-1}B^TP)^T\psi\,dt+\frac1N\sum_{i=1}^N(PD+\zeta_i)dW_i, \]
where $\psi(T)=0$. There is a unique solution $\psi=0$ for $t\in[0,T]$. This implies
\[ d\theta_1=(A-BR^{-1}B^TP)\theta_1dt+\frac DN\sum_{i=1}^NdW_i. \]
Hence, $\sup_{0\le t\le T}E|\theta_1(t)|^2=O(1/N)$. The lemma follows since $\theta_2=-P\theta_1$. $\Box$

When $(\mathbf m,\mathbf p,\mathbf y)$ is the unique solution of (42), it can be shown that $(\mathbf x,\mathbf m,\mathbf y,\mathbf p):=(\mathbf m,\mathbf m,\mathbf y,\mathbf p)$ is the unique solution of (40) under the condition (41).

Lemma 24
We have
\[ \sup_{0\le t\le T}\big[|\mathbf x_N-\mathbf x|+|\mathbf m_N-\mathbf m|+|\mathbf p_N-\mathbf p|+|\mathbf y_N-\mathbf y|\big]=O(|x^{(N)}(0)-m_0|). \]

Proof.
Consider
\[ \dot h_1=Ah_1+BR^{-1}B^Th_4+Gh_2+\gamma h_3,\quad \dot h_2=(A+G)h_2+\gamma h_3, \]
\[ \dot h_3=-(A+G)^Th_3-(I-\Gamma)^TQ(h_1-\Gamma h_2),\quad \dot h_4=-A^Th_4+Q(h_1-\Gamma h_2), \tag{54} \]
where $h_1(0)$ is given, $h_2(0)=0$, $h_3(T)=Hh_1(T)$, and $h_4(T)=-Hh_1(T)$. It is constructed as a homogeneous version of (53). We claim that (54) has a unique solution for any given value of $h_1(0)$. If this were not true, there would exist $h_1(0)$ such that (54) has multiple solutions which, in turn, could be used to construct multiple solutions to (53). This would contradict Lemma 22.

It is clear that
\[ (\mathbf x_N-\mathbf x,\ \mathbf m_N-\mathbf m,\ \mathbf p_N-\mathbf p,\ \mathbf y_N-\mathbf y)=:(h_1,h_2,h_3,h_4) \]
is a solution of (54) with $h_1(0)=\mathbf x_N(0)-m_0$.

Let $e_1,\dots,e_n$ be the canonical basis of $\mathbb R^n$. For $h_1(0)=e_k$, we obtain a solution of (54), denoted by $h^k=(h_1^k,h_2^k,h_3^k,h_4^k)$. Let $(z)_k$ be the $k$th component of a vector $z$. We may uniquely write $(\mathbf x_N-\mathbf x,\mathbf m_N-\mathbf m,\mathbf p_N-\mathbf p,\mathbf y_N-\mathbf y)$ as a linear combination of $h^1,\dots,h^n$:
\[ (\mathbf x_N-\mathbf x,\ \mathbf m_N-\mathbf m,\ \mathbf p_N-\mathbf p,\ \mathbf y_N-\mathbf y)=\sum_{k=1}^n(\mathbf x_N(0)-m_0)_k\,(h_1^k,h_2^k,h_3^k,h_4^k). \]
The lemma follows readily. $\Box$
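The $O(1/N)$ rate in Lemma 23 ultimately comes from the averaged noise term $(D/N)\sum_idW_i$: the average of $N$ independent Brownian motions at time $T$ is Gaussian with variance $T/N$. A seeded Monte Carlo check of this elementary fact (sample sizes are illustrative assumptions):

```python
import random

random.seed(0)
T, M = 1.0, 5000            # horizon and number of Monte Carlo samples

def avg_bm_second_moment(N):
    """Estimate E |(1/N) sum_i W_i(T)|^2 from M samples; exact value is T/N."""
    total = 0.0
    for _ in range(M):
        s = sum(random.gauss(0.0, T ** 0.5) for _ in range(N)) / N
        total += s * s
    return total / M

v10, v100 = avg_bm_second_moment(10), avg_bm_second_moment(100)
print(v10, v100)            # close to T/10 = 0.1 and T/100 = 0.01
```

Increasing $N$ by a factor of 10 shrinks the second moment by the same factor, which is exactly the mechanism behind $\sup_tE|\theta_1(t)|^2=O(1/N)$.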
Proof of Theorem 21.
For $\bar u=\bar u^*$, we write $\hat u^{(N)}=R^{-1}B^T(1/N)\sum_{i=1}^Ny_i=R^{-1}B^Ty^{(N)}$. We have
\[ E|\hat u^{(N)}-\bar u^*|^2=E|R^{-1}B^T(y^{(N)}-\mathbf y)|^2\le CE|y^{(N)}-\mathbf y|^2=CE|y^{(N)}-Ey^{(N)}+Ey^{(N)}-\mathbf y|^2 \]
\[ \le C(1/N)+C|\mathbf y_N-\mathbf y|^2=O(1/N)+O(|x^{(N)}(0)-m_0|^2). \]
The second inequality follows from Lemmas 22 and 23, and the last step follows from Lemma 24. $\Box$

Robust Nash Equilibrium
Throughout this section, we assume that (42) has a unique solution and take $\bar u=\bar u^*$ determined by (49). For $f\in L^2(0,T;\mathbb R^n)$ and $u_i\in L^2_{\mathcal F}(0,T;\mathbb R^n)$, $1\le i\le N$, recall the worst case cost
\[ J_i^{wo}(u_i,u_{-i})=\sup_{f\in L^2(0,T;\mathbb R^n)}J_i(u_i,u_{-i},f). \]
It is clear that for each $i$ and any $(u_i,u_{-i})$, $\sup_fJ_i(u_i,u_{-i},f)\ge J_i(u_i,u_{-i},0)$. We consider the set of strategies $(\hat u_i,\hat u_{-i})$ given by (51) for a population of $N$ players with dynamics (3). It should be emphasized that we only use (50)--(51) to obtain a well defined process $\hat u_i$ in $L^2_{\mathcal F}(0,T;\mathbb R^n)$, which should not be understood as a feedback strategy. The main result of this section is the next theorem, which characterizes the performance of this set of strategies.

Theorem 25
Assume that (i) (H1)--(H2) hold; (ii) $\sup_{i\ge1}|x_i(0)|\le M$, where $M$ does not depend on $N$; (iii) (42) has a unique solution. Then the set of strategies $(\hat u_1,\dots,\hat u_N)$ given by (51) is a robust $\varepsilon_N$-Nash equilibrium for the $N$ players, i.e.,
\[ J_i^{wo}(\hat u_i,\hat u_{-i})-\varepsilon_N\le\inf_{u_i\in\mathcal U}J_i^{wo}(u_i,\hat u_{-i})\le J_i^{wo}(\hat u_i,\hat u_{-i}), \tag{55} \]
where $0\le\varepsilon_N=O(1/\sqrt N+|x^{(N)}(0)-m_0|)$ and $x^{(N)}(0)=(1/N)\sum_{j=1}^Nx_j(0)$. $\Box$

The rest of this section is devoted to the proof of Theorem 25. For any given $f\in L^2(0,T;\mathbb R^n)$, denote the state processes of (3) corresponding to $(\hat u_i,\hat u_{-i},f)$ by $\hat x_j$, $1\le j\le N$, and $\hat x^{(N)}=(1/N)\sum_{j=1}^N\hat x_j$. Denote
\[ \dot{\bar m}=(A+G)\bar m+B\bar u^*+f,\quad \bar m(0)=m_0. \tag{56} \]
All subsequent lemmas are proved under the assumptions of Theorem 25.

Lemma 26
We have $\sup_{0\le t\le T,\,f}E|\hat x^{(N)}-\bar m|^2\le C(1/N+|x^{(N)}(0)-m_0|^2)$.

Proof.
Note that
\[ d\hat x^{(N)}=[(A+G)\hat x^{(N)}+B\hat u^{(N)}+f]dt+\frac DN\sum_{i=1}^NdW_i. \]
Therefore,
\[ d(\hat x^{(N)}-\bar m)=[(A+G)(\hat x^{(N)}-\bar m)+B(\hat u^{(N)}-\bar u^*)]dt+\frac DN\sum_{i=1}^NdW_i. \]
By linear SDE estimates,
\[ E|\hat x^{(N)}(t)-\bar m(t)|^2\le C|x^{(N)}(0)-m_0|^2+C/N+CE\int_0^t|\hat u^{(N)}(\tau)-\bar u^*(\tau)|^2d\tau. \]
By Theorem 21, the lemma follows. $\Box$

Lemma 27 There exists a constant $\hat C_1$ independent of $N$ such that $\max_{1\le i\le N}\sup_fJ_i(\hat u_i,\hat u_{-i},f)\le\hat C_1$.

Proof.
Denote
\[ dx_i'=(Ax_i'+B\hat u_i+G\bar m+f)dt+DdW_i, \tag{57} \]
where $x_i'(0)=x_i(0)$. By Lemma 26, it is easy to show
\[ \sup_{0\le t\le T,\,f}E|\hat x_i(t)-x_i'(t)|^2\le C(1/N+|x^{(N)}(0)-m_0|^2). \]
We have
\[ J_i(\hat u_i,\hat u_{-i},f)\le\bar J_i(\hat u_i,f)+E\int_0^T|(\hat x_i-x_i')+\Gamma(\bar m-\hat x^{(N)})|_Q^2dt+E|\hat x_i(T)-x_i'(T)|_H^2 \]
\[ +2E\int_0^T[x_i'-(\Gamma\bar m+\eta)]^TQ[(\hat x_i-x_i')+\Gamma(\bar m-\hat x^{(N)})]dt+2E[x_i'^T(T)H(\hat x_i(T)-x_i'(T))]. \tag{58} \]
Combining Lemma 6 with condition (ii) in Theorem 25, we obtain
\[ \bar J_i(\hat u_i,f)\le C-\epsilon_0\|f\|_{L^2}^2 \tag{59} \]
for some $\epsilon_0>0$, where $C$ does not depend on $(i,N)$. Since neither $\hat x_i-x_i'$ nor $\bar m-\hat x^{(N)}$ depends on $f$, there exists a constant $C$ such that
\[ \Big|E\int_0^T[x_i'-(\Gamma\bar m+\eta)]^TQ[(\hat x_i-x_i')+\Gamma(\bar m-\hat x^{(N)})]dt\Big|\le C\Big(E\int_0^T|x_i'-(\Gamma\bar m+\eta)|_Q^2dt\Big)^{1/2}\le C(1+\|f\|_{L^2}^2)^{1/2}\le C+\frac{\epsilon_0}8\|f\|_{L^2}^2, \tag{60} \]
where the second inequality follows from elementary estimates based on the solutions of (56) and (57). Similarly,
\[ E[x_i'^T(T)H(\hat x_i(T)-x_i'(T))]\le C+\frac{\epsilon_0}8\|f\|_{L^2}^2. \tag{61} \]
Finally, combining (58)--(61) with Lemma 26 leads to
\[ J_i(\hat u_i,\hat u_{-i},f)\le C-\frac{\epsilon_0}2\|f\|_{L^2}^2. \]
The lemma follows. $\Box$
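The structure of the estimates above is worth isolating: every $f$-dependent term is bounded by an expression affine in $\|f\|_{L^2}$, while the cost itself pays $-\epsilon_0\|f\|_{L^2}^2$, so the worst-case cost is dominated by a concave quadratic in $\|f\|_{L^2}$ and its supremum is finite. A toy numerical check of this mechanism, with `C0`, `a`, `eps` as arbitrary assumed values:

```python
# A bound of the form C0 + a*x - eps*x^2, with x standing for ||f||_{L^2} >= 0.
# C0, a, eps are arbitrary assumptions for illustration.
C0, a, eps = 5.0, 2.0, 0.5

def upper_bound(x):
    return C0 + a * x - eps * x * x

# brute-force supremum over a grid vs. the closed form C0 + a^2/(4*eps)
grid_sup = max(upper_bound(i * 0.001) for i in range(20001))
closed_form = C0 + a * a / (4 * eps)
print(grid_sup, closed_form)
```

The maximizer $x^*=a/(2\epsilon)$ mirrors the fact that the worst-case disturbance has $L^2$-norm bounded uniformly in the control (cf. (72) below in the proof of Lemma 30).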
Consider the set of strategies $(u_i,\hat u_{-i})$ and the corresponding state processes
\[ dx_i=(Ax_i+Bu_i+Gx^{(N)}+f)dt+DdW_i, \tag{62} \]
\[ dx_j=(Ax_j+B\hat u_j+Gx^{(N)}+f)dt+DdW_j,\quad 1\le j\le N,\ j\ne i. \tag{63} \]

Lemma 28 If $u_i$ in (62) satisfies $\sup_fJ_i(u_i,\hat u_{-i},f)\le\hat C_1$, there exists $\hat C_2$ independent of $N$ such that
\[ E\int_0^T|u_i(t)|^2dt\le\hat C_2. \tag{64} \]

Proof.
Suppose $\sup_fJ_i(u_i,\hat u_{-i},f)\le\hat C_1$. Then for any $f$,
\[ E\int_0^T\Big(|x_i-(\Gamma x^{(N)}+\eta)|_Q^2+u_i^TRu_i-\gamma|f(t)|^2\Big)dt+E[x_i^T(T)Hx_i(T)]\le\hat C_1, \]
where $(x_1,\dots,x_N)$ is generated by $(u_i,\hat u_{-i})$ and $f$. Taking $f=0$, we obtain
\[ E\int_0^T\Big(|x_i-(\Gamma x^{(N)}+\eta)|_Q^2+u_i^TRu_i\Big)dt\le\hat C_1. \]
Therefore, (64) holds. $\Box$
Let $\mathcal U_{\hat C_2}$ denote the set of processes $u_i\in L^2_{\mathcal F}(0,T;\mathbb R^n)$ which satisfy (64). For (62)--(63), denote $x^{(N)}=(1/N)\sum_{j=1}^Nx_j$.

Lemma 29
Suppose $u_i\in\mathcal U_{\hat C_2}$ in (62). Then
\[ \sup_{0\le t\le T,\,f,\,u_i\in\mathcal U_{\hat C_2}}E|x^{(N)}(t)-\bar m(t)|^2=O(1/N+|x^{(N)}(0)-m_0|^2). \]

Proof.
Rewrite (62) in the form
\[ dx_i=[Ax_i+B\hat u_i+Gx^{(N)}+f]dt+B(u_i-\hat u_i)dt+DdW_i. \tag{65} \]
By (63) and (65),
\[ dx^{(N)}=[(A+G)x^{(N)}+B\hat u^{(N)}+f]dt+\frac BN(u_i-\hat u_i)dt+\frac DN\sum_{j=1}^NdW_j, \]
which combined with (56) gives
\[ d(x^{(N)}-\bar m)=[(A+G)(x^{(N)}-\bar m)+B(\hat u^{(N)}-\bar u^*)]dt+\frac BN(u_i-\hat u_i)dt+\frac DN\sum_{j=1}^NdW_j. \]
By Theorem 21 and the fact that $E\int_0^T|u_i-\hat u_i|^2dt\le C$ for all $u_i\in\mathcal U_{\hat C_2}$, where the constants $C$ do not depend on $(f,u_i)$, elementary SDE estimates lead to
\[ \sup_{0\le t\le T,\,f}E|x^{(N)}(t)-\bar m(t)|^2\le C(1/N+|x^{(N)}(0)-m_0|^2), \]
where $C$ does not depend on $u_i$. The lemma follows. $\Box$

Lemma 30
For each $u_i\in\mathcal U_{\hat C_2}$, $\sup_fJ_i(u_i,\hat u_{-i},f)$ is finite and attained by some $f$ depending on $u_i$, denoted by $f_{u_i}$. Moreover,
\[ \sup_{u_i\in\mathcal U_{\hat C_2}}\big|\sup_fJ_i(u_i,\hat u_{-i},f)-\bar J_i(u_i,\hat f_{u_i})\big|=O(1/\sqrt N+|x^{(N)}(0)-m_0|), \]
where $\hat f_{u_i}$ is determined by Theorem 7 for the given $u_i$.

Proof. Note that we have
\[ dx_i=[Ax_i+Bu_i+G\bar m+G(x^{(N)}-\bar m)+f]dt+DdW_i, \tag{66} \]
\[ \dot{\bar m}=(A+G)\bar m+B\bar u^*+f, \]
where $\bar m(0)=m_0$. Define the auxiliary process
\[ dx_i^\dagger=(Ax_i^\dagger+Bu_i+G\bar m+f)dt+DdW_i, \]
where $x_i^\dagger(0)=x_i(0)$ and $(u_i,f,W_i)$ is the same as in (66). By Lemma 29, it is easy to show
\[ \sup_{0\le t\le T,\,f}E|x_i(t)-x_i^\dagger(t)|^2=O(1/N+|x^{(N)}(0)-m_0|^2). \tag{67} \]
We have the relation
\[ |x_i-(\Gamma x^{(N)}+\eta)|_Q^2=|x_i^\dagger-(\Gamma\bar m+\eta)|_Q^2+|(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})|_Q^2+2[x_i^\dagger-(\Gamma\bar m+\eta)]^TQ[(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})]. \]
The cost can be rewritten as
\[ J_i(u_i,\hat u_{-i},f)=\bar J_i(u_i,f)+E\int_0^T|(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})|_Q^2dt+E|x_i(T)-x_i^\dagger(T)|_H^2 \]
\[ +2E\int_0^T[x_i^\dagger-(\Gamma\bar m+\eta)]^TQ[(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})]dt+2E[(x_i^\dagger(T))^TH(x_i(T)-x_i^\dagger(T))] \tag{68} \]
\[ \le\bar J_i(u_i,f)+C(1/N+|x^{(N)}(0)-m_0|^2)+2E\int_0^T[x_i^\dagger-(\Gamma\bar m+\eta)]^TQ[(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})]dt+2E[(x_i^\dagger(T))^TH(x_i(T)-x_i^\dagger(T))], \tag{69} \]
where the inequality follows from Lemma 29 and (67). Note that neither $x_i-x_i^\dagger$ nor $\bar m-x^{(N)}$ in (68) depends on $f$. The terms $x_i^\dagger$ and $x_i^\dagger-(\Gamma\bar m+\eta)$ are affine in $f$, and $-\bar J_i(u_i,f)$ is convex in $f$ by Lemma 6. Consequently, it follows from (68) that $-J_i(u_i,\hat u_{-i},f)$ is convex in $f$. For $u_i\in\mathcal U_{\hat C_2}$, in analogue to (59), we obtain
\[ \bar J_i(u_i,f)\le C-\epsilon_0\|f\|_{L^2}^2, \tag{70} \]
where $C$ does not depend on $u_i$.
We have
\[ \Big|E\int_0^T[x_i^\dagger-(\Gamma\bar m+\eta)]^TQ[(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})]dt\Big|\le\Big\{E\int_0^T|x_i^\dagger-(\Gamma\bar m+\eta)|_Q^2dt\Big\}^{1/2}\Big\{E\int_0^T|(x_i-x_i^\dagger)+\Gamma(\bar m-x^{(N)})|_Q^2dt\Big\}^{1/2} \]
\[ \le C\big(1/\sqrt N+|x^{(N)}(0)-m_0|\big)(1+\|f\|_{L^2}^2)^{1/2}\le C+\frac{\epsilon_0}8\|f\|_{L^2}^2. \]
Similarly,
\[ \big|E[(x_i^\dagger(T))^TH(x_i(T)-x_i^\dagger(T))]\big|\le C+\frac{\epsilon_0}8\|f\|_{L^2}^2. \]
Hence, (69) gives
\[ J_i(u_i,\hat u_{-i},f)\le C-\frac{\epsilon_0}2\|f\|_{L^2}^2, \tag{71} \]
where $C$ does not depend on $(N,u_i)$. So for given $u_i\in\mathcal U_{\hat C_2}$, $J_i(u_i,\hat u_{-i},f)$ attains a finite supremum at some $f_{u_i}$ since it is a continuous functional of $f$, and by (71) we may further find a constant $\hat C_3$ such that
\[ \sup_{u_i\in\mathcal U_{\hat C_2}}\|f_{u_i}\|_{L^2}\le\hat C_3. \tag{72} \]
By (69),
\[ J_i(u_i,\hat u_{-i},f)\le\bar J_i(u_i,f)+C(1/N+|x^{(N)}(0)-m_0|^2)+C\big(1/N+|x^{(N)}(0)-m_0|^2\big)^{1/2}\Big(E\int_0^T|x_i^\dagger-(\Gamma\bar m+\eta)|_Q^2dt\Big)^{1/2}+C\big(1/N+|x^{(N)}(0)-m_0|^2\big)^{1/2}\big(E|x_i^\dagger(T)|^2\big)^{1/2}. \tag{73} \]
Now for $u_i\in\mathcal U_{\hat C_2}$ and the resulting $f_{u_i}$ satisfying (72), we further obtain
\[ E|x_i^\dagger(T)|^2+E\int_0^T|x_i^\dagger-(\Gamma\bar m+\eta)|_Q^2dt\le C. \]
For $u_i\in\mathcal U_{\hat C_2}$, (73) gives
\[ \sup_fJ_i(u_i,\hat u_{-i},f)\le\bar J_i(u_i,f_{u_i})+C(1/\sqrt N+|x^{(N)}(0)-m_0|)\le\bar J_i(u_i,\hat f_{u_i})+C(1/\sqrt N+|x^{(N)}(0)-m_0|), \]
where $\hat f_{u_i}$ is determined by Theorem 7. Due to (70),
\[ \sup_{u_i\in\mathcal U_{\hat C_2}}\|\hat f_{u_i}\|_{L^2}\le C \tag{74} \]
for some constant $C$. By (74) and the method in (68), we similarly derive
\[ J_i(u_i,\hat u_{-i},\hat f_{u_i})\ge\bar J_i(u_i,\hat f_{u_i})-C(1/\sqrt N+|x^{(N)}(0)-m_0|). \]
Hence, for all $u_i\in\mathcal U_{\hat C_2}$,
\[ \sup_fJ_i(u_i,\hat u_{-i},f)\ge\bar J_i(u_i,\hat f_{u_i})-C(1/\sqrt N+|x^{(N)}(0)-m_0|). \]
The constant $C$ in the various places does not depend on $u_i$. The lemma follows. $\Box$

Proof of Theorem 25.
It suffices to show the first inequality by checking $u_i\in\mathcal U_{\hat C_2}$. By Lemma 30, we have
\[ \sup_fJ_i(u_i,\hat u_{-i},f)\ge\bar J_i(u_i,\hat f_{u_i})-C(1/\sqrt N+|x^{(N)}(0)-m_0|)\ge\bar J_i(\hat u_i,\hat f_{\hat u_i})-C(1/\sqrt N+|x^{(N)}(0)-m_0|). \tag{75} \]
On the other hand, by taking the particular control $\hat u_i$ in Lemma 30,
\[ \sup_fJ_i(\hat u_i,\hat u_{-i},f)\le\bar J_i(\hat u_i,\hat f_{\hat u_i})+C(1/\sqrt N+|x^{(N)}(0)-m_0|). \tag{76} \]
Subsequently, (75) and (76) imply
\[ \sup_fJ_i(u_i,\hat u_{-i},f)\ge\sup_fJ_i(\hat u_i,\hat u_{-i},f)-2C(1/\sqrt N+|x^{(N)}(0)-m_0|). \]
This completes the proof. $\Box$
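A recurring device in this paper, and the heart of Lemma A.1 in Appendix A below, is the mean--fluctuation decomposition $E|z|_Q^2=|Ez|_Q^2+E|z-Ez|_Q^2$. The same identity holds exactly for empirical averages, so it can be checked on arbitrary sample data (the scalar weight and numbers below are assumptions chosen for illustration):

```python
Q = 2.0                                   # a scalar weight, Q >= 0 (assumed)
samples = [0.3, -1.2, 0.7, 2.5, -0.4]     # arbitrary sample data

mean = sum(samples) / len(samples)
lhs = sum(Q * z * z for z in samples) / len(samples)
rhs = Q * mean * mean + sum(Q * (z - mean) ** 2 for z in samples) / len(samples)
print(lhs, rhs)   # equal up to floating-point rounding
```

The decomposition is what allows positive definiteness of a cost along deterministic (mean) trajectories to be transferred to the stochastic problem.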
This section extends the results to a more general model with random initial states. For agent $\mathcal A_i$, the dynamics are given by
\[ dx_i^o(t)=(Ax_i^o(t)+Bu_i(t)+Gx^{o(N)}(t)+f(t))dt+DdW_i(t),\quad 1\le i\le N, \]
where $x^{o(N)}=(1/N)\sum_{j=1}^Nx_j^o$. The initial states of the agents are given by $x_i^o(0)=\xi_i$. As in (4), we define $J_i(u_i,u_{-i},f)$ by using $x_j^o$ in place of $x_j$, $1\le j\le N$. Let $\{\mathcal F_t^o\}_{0\le t\le T}$ be the filtration generated by $\{\xi_i,W_i(t),1\le i\le N\}$, and $L^2_{\mathcal F^o}(0,T;\mathbb R^k)$ is defined accordingly.

(H0) The sequence $\{\xi_i,i\ge1\}$ consists of independent random variables which are also independent of the Brownian motions $\{W_i,i\ge1\}$. In addition, $\lim_{N\to\infty}(1/N)\sum_{i=1}^NE\xi_i=m_0$ and $\sup_iE|\xi_i|^2\le c$ for some constant $c$ independent of $N$.

For fixed $\bar u$, we consider the FBSDE
\[ dx_i^o=(Ax_i^o+BR^{-1}B^Ty_i^o+Gm_i^o+\gamma p_i^o)dt+DdW_i,\quad \dot m_i^o=(A+G)m_i^o+B\bar u+\gamma p_i^o, \]
\[ \dot p_i^o=-(A+G)^Tp_i^o-(I-\Gamma)^TQ[Ex_i^o-(\Gamma m_i^o+\eta)],\quad dy_i^o=\{-A^Ty_i^o+Q[x_i^o-(\Gamma m_i^o+\eta)]\}dt+\zeta_i^odW_i, \tag{77} \]
where $x_i^o(0)=\xi_i$, $m_i^o(0)=m_0$, $p_i^o(T)=HEx_i^o(T)$, and $y_i^o(T)=-Hx_i^o(T)$. Except for the random initial state, this FBSDE has the same form as (36).

For the current situation, where the filtration is not generated only by the Brownian motions, the proof of Lemma 12 is not applicable. The solution procedure of (P2) as presented in Section 3.5 is applied only heuristically to derive (77). Nevertheless, we can study (77) directly and use it to construct decentralized strategies. We still define $J_i^{wo}(u_i,u_{-i})=\sup_{f\in L^2(0,T;\mathbb R^n)}J_i(u_i,u_{-i},f)$. The next theorem subsumes Corollary 14 and Theorem 25.

Theorem 31
Assume that (H0)--(H2) hold and that (42) has a unique solution $(\mathbf m,\mathbf p,\mathbf y)$. We further take $\bar u=R^{-1}B^T\mathbf y$ in (77). Then the following two assertions hold. (i) (77) has a unique solution in $L^2_{\mathcal F^o}(0,T;\mathbb R^n)\times(C([0,T];\mathbb R^n))^2\times(L^2_{\mathcal F^o}(0,T;\mathbb R^n))^2$. (ii) For $\hat u_i=R^{-1}B^Ty_i^o$, $1\le i\le N$, we have
\[ J_i^{wo}(\hat u_i,\hat u_{-i})-\varepsilon_N\le\inf_{u_i\in\mathcal U}J_i^{wo}(u_i,\hat u_{-i})\le J_i^{wo}(\hat u_i,\hat u_{-i}), \tag{78} \]
where $0\le\varepsilon_N=O(1/\sqrt N+|(1/N)\sum_{j=1}^NE\xi_j-m_0|)$.

Proof. (i) Consider (36) by setting
\[ \bar u=R^{-1}B^T\mathbf y,\quad x_i(0)=E\xi_i. \tag{79} \]
Further construct the ODE system obtained by taking expectations in (36):
\[ \dot{\bar x}_i=A\bar x_i+BR^{-1}B^T\bar y_i+G\bar m_i+\gamma\bar p_i,\quad \dot{\bar m}_i=(A+G)\bar m_i+B\bar u+\gamma\bar p_i, \]
\[ \dot{\bar p}_i=-(A+G)^T\bar p_i-(I-\Gamma)^TQ[\bar x_i-(\Gamma\bar m_i+\eta)],\quad \dot{\bar y}_i=-A^T\bar y_i+Q[\bar x_i-(\Gamma\bar m_i+\eta)], \tag{80} \]
where $\bar x_i(0)=E\xi_i$, $\bar m_i(0)=m_0$, $\bar p_i(T)=H\bar x_i(T)$, and $\bar y_i(T)=-H\bar x_i(T)$. Since (36) subject to (79) has a unique solution, (80) has a solution in $(C([0,T];\mathbb R^n))^4$. If (80) had two different solutions, we would be able to construct two different solutions to (36) satisfying (79), a contradiction to Theorem 13. So (80) has a unique solution $(\bar x_i,\bar m_i,\bar p_i,\bar y_i)$.

Setting $(m_i^o,p_i^o)=(\bar m_i,\bar p_i)$ in the first and last equations of (77), we construct the new equations
\[ dx_i^o=(Ax_i^o+BR^{-1}B^Ty_i^o+G\bar m_i+\gamma\bar p_i)dt+DdW_i,\quad dy_i^o=\{-A^Ty_i^o+Q[x_i^o-(\Gamma\bar m_i+\eta)]\}dt+\zeta_i^odW_i, \tag{81} \]
where $x_i^o(0)=\xi_i$ and $y_i^o(T)=-Hx_i^o(T)$. Let $P$ be the solution of the Riccati equation (B.4) and take the transformation $y_i^o=-Px_i^o+\phi$. We obtain
\[ d\phi=\big[-(A-BR^{-1}B^TP)^T\phi+P(G\bar m_i+\gamma\bar p_i)-Q(\Gamma\bar m_i+\eta)\big]dt+(\zeta_i^o+PD)dW_i, \]
where $\phi(T)=0$. We solve $(\phi,\zeta_i^o)$ in $L^2_{\mathcal F^o}(0,T;\mathbb R^n)$, and further obtain $(x_i^o,y_i^o)$ in $L^2_{\mathcal F^o}(0,T;\mathbb R^n)$. Subsequently, we can show $Ex_i^o=\bar x_i$.
Hence $(x_i^o,m_i^o,p_i^o,y_i^o,\zeta_i^o)$ satisfies (77). By taking the variation of the first three equations of (77) and applying an optimal control interpretation as in the proof of Theorem 13, we can show that $(x_i^o,m_i^o,p_i^o,y_i^o,\zeta_i^o)$ is the unique solution.

(ii) By slightly modifying the proof of Theorem 21 and the associated lemmas, we can show
\[ \sup_{0\le t\le T}E|\hat u^{(N)}-\bar u^*|^2=O(1/N)+O(|Ex^{(N)}(0)-m_0|^2). \]
Next, we adapt the proofs of Lemmas 26--30, taking into account the random initial states satisfying (H0). This gives the desired estimate for $\varepsilon_N$. $\Box$

Conclusion

This paper introduces a class of mean field LQG games with drift uncertainty. By using the idea of robust optimization, the local strategy is designed by minimizing the worst case cost. When the decentralized strategies are implemented in a finite population, their performance is characterized as a robust $\varepsilon$-Nash equilibrium.

In this paper we only deal with drift uncertainty. If the Brownian motions are also subject to an uncertain coefficient process to model volatility uncertainty [38], the resulting optimal control problems will give a set of more complicated FBSDEs. It is also of potential interest to address model uncertainty of the mean field game in a different setup by considering measure uncertainty [16, 36, 48] in the robust optimization problem. This will necessitate the use of different techniques for analysis.

Appendix A

For proving Lemma 9, we first give another lemma. Consider an auxiliary optimal control problem with dynamics
\[ \dot z_i=Az_i+Bv_i+Gz+\gamma q,\quad \dot z=(A+G)z+\gamma q,\quad \dot q=-(A+G)^Tq-(I-\Gamma)^TQ(Ez_i-\Gamma z), \tag{A.1} \]
where $z_i(0)=z(0)=0$, $q(T)=HEz_i(T)$ and $v_i\in L^2_{\mathcal F}(0,T;\mathbb R^n)$. Following the argument in the proof of Lemma 8, under (H1) we can show the existence and uniqueness of a solution to (A.1).
The optimal control problem is

(P2b) minimize $\bar J_i^b(v_i)=E\int_0^T\{|z_i-\Gamma z|_Q^2+v_i^TRv_i-\gamma|q(t)|^2\}dt+E|z_i(T)|_H^2$.

Similarly, we may define positive definiteness of $\bar J_i^b$ as in Section 3.

Lemma A.1 $\bar J_i^a$ is positive semi-definite (resp., positive definite) if and only if $\bar J_i^b$ is positive semi-definite (resp., positive definite).

Proof. It suffices to show the "only if" part. Suppose that $\bar J_i^a$ is positive semi-definite. Consider any control $v_i\in L^2_{\mathcal F}(0,T;\mathbb R^n)$ for $\bar J_i^b$; this gives a unique solution $(z_i,z,q)$. We take expectations in (A.1) to obtain
\[ \dot{\bar z}_i=A\bar z_i+B\bar v_i+Gz+\gamma q,\quad \dot z=(A+G)z+\gamma q,\quad \dot q=-(A+G)^Tq-(I-\Gamma)^TQ(\bar z_i-\Gamma z), \]
where $\bar z_i=Ez_i$ and $\bar v_i=Ev_i$. It follows that
\[ \bar J_i^b(v_i)=\bar J_i^a(\bar v_i)+E\int_0^T\big[|z_i-Ez_i|_Q^2+|v_i-Ev_i|_R^2\big]dt+E|z_i(T)-Ez_i(T)|_H^2\ge\bar J_i^a(\bar v_i)\ge0. \]
On the other hand, $\bar J_i^a(0)=0$. This shows that $\bar J_i^b$ is positive semi-definite. The above reasoning is also valid for the positive definite case. This proves the "only if" part. $\Box$

Proof of Lemma 9. Let $(x_i,m_i,p_i)$ and $(x_i',m_i',p_i')$ be the state processes in (P2) corresponding to the controls $u_i$ and $u_i'$, respectively. Let $\lambda_1,\lambda_2\in[0,1]$ with $\lambda_1+\lambda_2=1$. We have
\[ \lambda_1\bar J_i(u_i)+\lambda_2\bar J_i(u_i')-\bar J_i(\lambda_1u_i+\lambda_2u_i')=\lambda_1\lambda_2E\int_0^T\big\{|x_i-x_i'-\Gamma(m_i-m_i')|_Q^2+|u_i-u_i'|_R^2-\gamma|p_i(t)-p_i'(t)|^2\big\}dt+\lambda_1\lambda_2E|x_i(T)-x_i'(T)|_H^2. \]
Denote $z_i=x_i-x_i'$, $z=m_i-m_i'$, $q=p_i-p_i'$ and $v_i=u_i-u_i'$. It is obvious that
\[ \lambda_1\bar J_i(u_i)+\lambda_2\bar J_i(u_i')-\bar J_i(\lambda_1u_i+\lambda_2u_i')=\lambda_1\lambda_2\bar J_i^b(v_i). \]
Recalling Lemma A.1, this completes the proof. $\Box$

Appendix B

We introduce the FBSDE
\[ dx_i=(Ax_i+BR^{-1}B^Ty_i+Gm_i+\gamma p_i)dt+DdW_i,\quad \dot m_i=(A+G)m_i+BR^{-1}B^TEy_i+\gamma p_i, \]
\[ \dot p_i=-(A+G)^Tp_i-(I-\Gamma)^TQ[m_i-(\Gamma m_i+\eta)],\quad dy_i=\{-A^Ty_i+Q[x_i-(\Gamma m_i+\eta)]\}dt+\zeta_idW_i, \tag{B.1} \]
where $x_i(0)=m_i(0)=m_0$, $p_i(T)=Hm_i(T)$, and $y_i(T)=-Hx_i(T)$. This FBSDE differs slightly from (43) in its third equation and in the condition on $p_i(T)$, and will be more convenient for analysis. The next lemma shows that the two equation systems (43) and (B.1) are equivalent. The proof is straightforward since $Ex_i$ and $m_i$ satisfy the same ODE with the same initial condition.

Lemma B.1 If $(x_i,m_i,p_i,y_i,\zeta_i)\in\mathcal S[0,T]$ satisfies one of (43) and (B.1), it also satisfies the other. $\Box$

Consider the ODE system
\[ \dot{\bar x}_i=A\bar x_i+BR^{-1}B^T\bar y_i+G\bar m_i+\gamma\bar p_i,\quad \dot{\bar m}_i=(A+G)\bar m_i+BR^{-1}B^T\bar y_i+\gamma\bar p_i, \]
\[ \dot{\bar p}_i=-(A+G)^T\bar p_i-(I-\Gamma)^TQ[\bar m_i-(\Gamma\bar m_i+\eta)],\quad \dot{\bar y}_i=-A^T\bar y_i+Q[\bar x_i-(\Gamma\bar m_i+\eta)], \tag{B.2} \]
where $\bar x_i(0)=\bar m_i(0)=m_0$, $\bar p_i(T)=H\bar m_i(T)$ and $\bar y_i(T)=-H\bar x_i(T)$.

Lemma B.2
The following two statements are equivalent: (i) the FBSDE (B.1) has a unique solution in $S[0,T]$; (ii) the ODE (B.2) has a unique solution in $C([0,T];\mathbb{R}^n)$.

Proof. Step 1. Suppose that (ii) holds and let the unique solution be denoted by $(\bar x_i, \bar m_i, \bar p_i, \bar y_i)$. Take $(m_i, p_i) = (\bar m_i, \bar p_i)$ on the right-hand side of the first and last equations of (B.1) to write
$$
\begin{cases}
dx_i = (A x_i + B R^{-1} B^T y_i + G \bar m_i + \gamma \bar p_i)\, dt + D\, dW_i,\\
dy_i = \big\{ -A^T y_i + Q[x_i - (\Gamma \bar m_i + \eta)] \big\}\, dt + \zeta_i\, dW_i,
\end{cases} \quad (B.3)
$$
where $y_i(T) = -H x_i(T)$. Consider the Riccati equation
$$\dot P + A^T P + P A - P B R^{-1} B^T P + Q = 0, \qquad P(T) = H, \quad (B.4)$$
which has a unique solution on $[0,T]$. Setting $y_i = -P x_i + \phi$ in (B.3), we obtain two decoupled equations for $(x_i, \phi)$, which are uniquely solvable. This further gives a unique solution $(x_i, y_i, \zeta_i) \in L^2_F(0,T;\mathbb{R}^n)$ for (B.3). Taking expectations on both sides of (B.3) yields
$$
\begin{cases}
\frac{d}{dt} E x_i = A E x_i + B R^{-1} B^T E y_i + G \bar m_i + \gamma \bar p_i,\\
\frac{d}{dt} E y_i = -A^T E y_i + Q[E x_i - (\Gamma \bar m_i + \eta)],
\end{cases} \quad (B.5)
$$
where $E y_i(T) = -H E x_i(T)$. By combining (B.5) with the first and fourth equations of (B.2), it is easy to show $E x_i = \bar x_i$ and $E y_i = \bar y_i$ for all $t \in [0,T]$. This implies
$$\dot{\bar m}_i = (A+G) \bar m_i + B R^{-1} B^T \bar y_i + \gamma \bar p_i = (A+G) \bar m_i + B R^{-1} B^T E y_i + \gamma \bar p_i,$$
so the second and third equations of (B.1) hold with $(m_i, p_i) = (\bar m_i, \bar p_i)$. Therefore, $(x_i, m_i, p_i, y_i, \zeta_i) := (x_i, \bar m_i, \bar p_i, y_i, \zeta_i)$ satisfies (B.1).

We continue to show that $(x_i, m_i, p_i, y_i, \zeta_i)$ above is the unique solution of (B.1). Suppose that $(x_i', m_i', p_i', y_i', \zeta_i')$ is another solution of (B.1). It is clear that $(E x_i', m_i', p_i', E y_i')$ is a solution of (B.2). Since (B.2) has the unique solution $(\bar x_i, \bar m_i, \bar p_i, \bar y_i)$, we have $(m_i', p_i') = (\bar m_i, \bar p_i)$. By using the first and fourth equations of (B.1), we derive the equations satisfied by $(x_i' - x_i, y_i' - y_i)$ and further infer $(x_i', y_i') = (x_i, y_i)$. We conclude that (i) holds.

Step 2.
Suppose that (i) holds with the unique solution denoted by $(x_i, m_i, p_i, y_i, \zeta_i)$. It is obvious that $(\bar x_i, \bar m_i, \bar p_i, \bar y_i) := (E x_i, m_i, p_i, E y_i)$ is a solution of (B.2). Suppose that $(\bar x_i', \bar m_i', \bar p_i', \bar y_i') \neq (\bar x_i, \bar m_i, \bar p_i, \bar y_i)$ is another solution of (B.2). Then $(x_i, m_i, p_i, y_i, \zeta_i) + (\bar x_i' - \bar x_i, \bar m_i' - \bar m_i, \bar p_i' - \bar p_i, \bar y_i' - \bar y_i, 0)$ would give another solution of (B.1), contradicting uniqueness. Hence (ii) holds. $\Box$

Lemma B.3 (i) If $(\bar x_i, \bar m_i, \bar p_i, \bar y_i)$ is a solution of (B.2), then $(m, p, y) := (\bar m_i, \bar p_i, \bar y_i)$ satisfies (42). (ii) If $(m, p, y)$ is a solution of (42), there exists $\bar x_i$ such that $(\bar x_i, \bar m_i, \bar p_i, \bar y_i) := (\bar x_i, m, p, y)$ satisfies (B.2). (iii) The
ODE (B.2) has a unique solution if and only if (42) has a unique solution.

Proof. (i) If $(\bar x_i, \bar m_i, \bar p_i, \bar y_i)$ is a solution of (B.2), then $\bar x_i = \bar m_i$ and therefore $\bar y_i(T) = -H \bar x_i(T) = -H \bar m_i(T)$. So $(m, p, y)$ defined above satisfies (42).

(ii) If $(m, p, y)$ is a solution of (42), we set $(\bar m_i, \bar p_i, \bar y_i) = (m, p, y)$ and define $\bar x_i$ by the ODE
$$\dot{\bar x}_i = A \bar x_i + B R^{-1} B^T \bar y_i + G \bar m_i + \gamma \bar p_i,$$
where $\bar x_i(0) = m$. It can be checked that $\bar m_i = \bar x_i$, which gives $\bar y_i(T) = -H \bar m_i(T) = -H \bar x_i(T)$. Hence, $(\bar x_i, \bar m_i, \bar p_i, \bar y_i)$ is a solution of (B.2).

(iii) Assume that (42) has a unique solution. Let $(\bar x_i, \bar m_i, \bar p_i, \bar y_i)$ and $(\bar x_i', \bar m_i', \bar p_i', \bar y_i')$ be two solutions of (B.2). By (i), $(\bar m_i, \bar p_i, \bar y_i)$ and $(\bar m_i', \bar p_i', \bar y_i')$ are two solutions of (42) and so must be equal, which further implies $\bar x_i = \bar x_i'$ by the first equation of (B.2). This shows that (B.2) has a unique solution. Next assume that (B.2) has a unique solution. Let $(m, p, y)$ and $(m', p', y')$ be two solutions of (42). By (ii), we must have $(m, p, y) = (m', p', y')$. Therefore, (42) has a unique solution. $\Box$
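The decoupling used in Step 1 can be made concrete numerically: substituting $y_i = -P x_i + \phi$ into (B.3) forces $P$ to satisfy the Riccati equation (B.4) with $P(T) = H$, $\phi(T) = 0$, and $\phi$ to satisfy a linear backward ODE. The sketch below integrates both backward for scalar data; the coefficient values and the frozen mean field terms $\bar m_i$, $\bar p_i$ (which would come from solving (B.2)) are illustrative constants chosen here, not values from the paper.

```python
import numpy as np

# Scalar sketch of the decoupling in Step 1: P solves (B.4),
#   P' + A P + P A - P B R^{-1} B P + Q = 0,  P(T) = H,
# and phi solves the backward linear ODE obtained by matching terms,
#   phi' = -(A - B R^{-1} B P) phi + P (G m_bar + gamma p_bar)
#          - Q (Gamma m_bar + eta),            phi(T) = 0.
A, B, R, Q, H = -1.0, 1.0, 1.0, 2.0, 1.0
G, gamma, Gamma, eta = 0.2, 0.3, 0.5, 0.1
m_bar, p_bar = 0.4, 0.1          # illustrative frozen mean field terms
T, n = 1.0, 2000
h = T / n

def f(w):
    """Right-hand side of the coupled backward system for (P, phi)."""
    P, phi = w
    dP = -(2*A*P - (B**2/R)*P**2 + Q)          # scalar form of (B.4)
    dphi = -(A - (B**2/R)*P)*phi + P*(G*m_bar + gamma*p_bar) \
           - Q*(Gamma*m_bar + eta)
    return np.array([dP, dphi])

# RK4 backward in time from t = T (data P(T) = H, phi(T) = 0) to t = 0.
w = np.array([H, 0.0])
path = [w]
for _ in range(n):
    k1 = f(w); k2 = f(w - h/2*k1); k3 = f(w - h/2*k2); k4 = f(w - h*k3)
    w = w - h/6*(k1 + 2*k2 + 2*k3 + k4)
    path.append(w)
path = path[::-1]                # path[k] approximates (P, phi) at t = k*h
P0, phi0 = path[0]
print(P0 > 0)                    # P stays positive since Q > 0 and H > 0
```

Once $(P, \phi)$ are known, the forward SDE for $x_i$ in (B.3) has a known affine drift, which is how Step 1 produces the unique solution $(x_i, y_i, \zeta_i)$.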
References

[1] S. Adlakha, R. Johari, and G. Y. Weintraub. Equilibria of dynamic games with many players: existence, approximation, and market structure. J. Econ. Theory, vol. 156, pp. 269-316, 2015.
[2] M. Aghassi and D. Bertsimas. Robust game theory. Math. Progr., vol. 107, pp. 231-273, 2006.
[3] D. Andersson and B. Djehiche. A maximum principle for stochastic control of SDEs of mean-field type. Appl. Math. Optim., vol. 63, pp. 341-356, 2010.
[4] M. Bardi. Explicit solutions of some linear-quadratic mean field games. Netw. Heterogeneous Media, vol. 7, no. 2, pp. 243-261, 2012.
[5] T. Basar and P. Bernhard. H∞-Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach. Birkhäuser, Boston, 1995.
[6] A. Bensoussan, J. Frehse, and P. Yam. Mean Field Games and Mean Field Type Control Theory. Springer, New York, 2013.
[7] A. Bensoussan, K. C. J. Sung, S. C. P. Yam, and S. P. Yung. Linear-quadratic mean-field games. J. Optim. Theory Appl., vol. 169, no. 2, pp. 496-529, 2016.
[8] A. Bensoussan, K. C. J. Sung, and S. C. P. Yam. Linear-quadratic time-inconsistent mean field games. Dynamic Games Appl., vol. 3, no. 4, pp. 537-552, 2013.
[9] R. Buckdahn, J. Li, and S. Peng. Nonlinear stochastic differential games involving a major player and a large number of collectively acting minor agents. SIAM J. Control Optim., vol. 52, no. 1, pp. 451-492, 2014.
[10] P. E. Caines. Mean field games. In Encyclopedia of Systems and Control, T. Samad and J. Baillieul, Eds., Berlin: Springer-Verlag, 2014.
[11] P. Cardaliaguet. Notes on mean field games, 2012.
[12] R. Carmona and F. Delarue. Probabilistic analysis of mean-field games. SIAM J. Control Optim., vol. 51, no. 4, pp. 2705-2734, 2013.
[13] R. Carmona, F. Delarue, and A. Lachapelle. Control of McKean-Vlasov dynamics versus mean field games. Math. Financ. Econ., vol. 7, no. 2, pp. 131-166, 2013.
[14] R. Carmona and D. Lacker. A probabilistic weak formulation of mean field games and applications. Ann. Appl. Probab., vol. 25, no. 3, pp. 1189-1231, 2015.
[15] P. Chan and R. Sircar. Bertrand and Cournot mean field games. Appl. Math. Optim., vol. 71, no. 3, pp. 533-569, 2015.
[16] C. D. Charalambous and F. Rezaei. Stochastic uncertain systems subject to relative entropy constraints: induced norms and monotonicity properties of minimax games. IEEE Transactions on Automatic Control, vol. 52, no. 4, pp. 647-663, April 2007.
[17] S. Chen, X. Li, and X. Y. Zhou. Stochastic linear quadratic regulators with indefinite control weight costs. SIAM J. Control Optim., vol. 36, no. 5, pp. 1685-1702, Sept. 1998.
[18] B. Djehiche and M. Huang. A characterization of sub-game perfect equilibria for SDEs of mean field type. Dynamic Games Appl., vol. 6, no. 1, pp. 55-81, 2016.
[19] J. Engwerda. A numerical algorithm to find soft-constrained Nash equilibria in scalar LQ-games. Int. J. Control, vol. 79, no. 6, pp. 592-603, 2006.
[20] M. Fischer. On the connection between symmetric N-player games and mean field games. arXiv preprint, 2014.
[21] D. A. Gomes and J. Saude. Mean field games models - a brief survey. Dynamic Games Appl., vol. 4, no. 2, pp. 110-154, 2014.
[22] M. Hu and M. Fukushima. Existence, uniqueness, and computation of robust Nash equilibria in a class of multi-leader-follower games. SIAM J. Optim., vol. 23, no. 2, pp. 894-916, 2013.
[23] Y. Hu and S. Peng. Solution of forward-backward stochastic differential equations. Probab. Theory Related Fields, vol. 103, pp. 273-283, 1995.
[24] M. Huang, P. E. Caines, and R. P. Malhamé. Individual and mass behaviour in large population stochastic wireless power control problems: centralized and Nash equilibrium solutions. Proc. 42nd IEEE CDC, Maui, HI, pp. 98-103, Dec. 2003.
[25] J. Huang and M. Huang. Mean field LQG games with model uncertainty. Proc. 52nd IEEE CDC, Florence, Italy, pp. 3103-3108, Dec. 2013.
[26] M. Huang. Large-population LQG games involving a major player: the Nash certainty equivalence principle. SIAM J. Control Optim., vol. 48, pp. 3318-3353, 2010.
[27] M. Huang, P. E. Caines, and R. P. Malhamé. Large-population cost-coupled LQG problems with non-uniform agents: individual-mass behavior and decentralized ε-Nash equilibria. IEEE Transactions on Automatic Control, vol. 52, pp. 1560-1571, 2007.
[28] M. Huang, R. P. Malhamé, and P. E. Caines. Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Communications in Information and Systems, vol. 6, pp. 221-251, 2006.
[29] M. Jimenez and A. Poznyak. ε-equilibrium in LQ differential games with bounded uncertain disturbances: robustness of standard strategies and new strategies with adaptation. Int. J. Control, vol. 79, no. 7, pp. 786-797, 2006.
[30] E. Kardes, F. Ordonez, and R. W. Hall. Discounted robust stochastic games and an application to queueing control. Operations Research, vol. 59, no. 2, pp. 365-382, 2011.
[31] A. C. Kizilkale and P. E. Caines. Mean field stochastic adaptive control. IEEE Transactions on Automatic Control, vol. 58, no. 4, pp. 905-920, April 2013.
[32] V. N. Kolokoltsov, J. Li, and W. Yang. Mean field games and nonlinear Markov processes. arXiv:1112.3744, preprint, 2011.
[33] A. J. Kurdila and M. Zabarankin. Convex Functional Analysis. Berlin: Birkhäuser, 2005.
[34] J. M. Lasry and P. L. Lions. Mean field games. Japan J. Math., vol. 2, pp. 229-260, 2007.
[35] T. Li and J. F. Zhang. Asymptotically optimal decentralized control for large population stochastic multiagent systems. IEEE Transactions on Automatic Control, vol. 53, no. 7, pp. 1643-1660, August 2008.
[36] A. Lim and J. Shanthikumar. Relative entropy, exponential utility, and robust dynamic pricing. Operations Research, vol. 55, pp. 198-214, 2007.
[37] A. Lim and X. Y. Zhou. Stochastic optimal LQR control with integral quadratic constraints and indefinite control weights. IEEE Transactions on Automatic Control, vol. 44, no. 7, pp. 1359-1369, July 1999.
[38] D. P. Looze, H. V. Poor, K. S. Vastola, and J. C. Darragh. Minimax control of linear stochastic systems with noise uncertainty. IEEE Transactions on Automatic Control, vol. AC-28, no. 9, pp. 882-888, Sept. 1983.
[39] D. G. Luenberger. Optimization by Vector Space Methods. New York: Wiley, 1969.
[40] J. Ma and J. Yong. Forward-Backward Stochastic Differential Equations and Their Applications. Lecture Notes in Math. 1702, Springer-Verlag, New York, 1999.
[41] S. L. Nguyen and M. Huang. Linear-quadratic-Gaussian mixed games with continuum-parametrized minor players. SIAM J. Control Optim., vol. 50, no. 5, pp. 2907-2937, 2012.
[42] M. Nourian and P. E. Caines. ε-Nash mean field game theory for nonlinear stochastic dynamical systems with major and minor agents. SIAM J. Control Optim., vol. 51, no. 4, pp. 3302-3331, 2013.
[43] M. Nourian, P. E. Caines, R. P. Malhamé, and M. Huang. Mean field control in leader-follower stochastic multi-agent systems: likelihood ratio based adaptation. IEEE Transactions on Automatic Control, vol. 57, no. 11, pp. 2801-2816, Nov. 2012.
[44] S. Peng and Z. Wu. Fully coupled forward-backward stochastic differential equations and applications to optimal control. SIAM J. Control Optim., vol. 37, no. 3, pp. 825-843, 1999.
[45] J. Sun, X. Li, and J. Yong. Open-loop and closed-loop solvabilities for stochastic linear quadratic optimal control problems. SIAM J. Control Optim., vol. 54, no. 5, pp. 2274-2308, 2016.
[46] H. Tembine, D. Bauso, and T. Basar. Robust linear quadratic mean-field games in crowd-seeking social networks. Proc. 52nd IEEE CDC, Florence, Italy, pp. 3134-3139, 2013.
[47] H. Tembine, Q. Zhu, and T. Basar. Risk-sensitive mean-field games. IEEE Transactions on Automatic Control, vol. 59, no. 4, pp. 835-850, April 2014.
[48] V. A. Ugrinovskii and I. R. Petersen. Minimax LQG control of stochastic partially observed uncertain systems. SIAM J. Control Optim., vol. 40, no. 4, pp. 1189-1226, 2001.
[49] W. A. van den Broek, J. C. Engwerda, and J. M. Schumacher. Robust equilibria in indefinite linear-quadratic differential games. J. Optim. Theory Appl., vol. 119, no. 3, pp. 565-595, 2003.
[50] B. C. Wang and J.-F. Zhang. Mean field games for large-population multiagent systems with Markov jump parameters. SIAM J. Control Optim., vol. 50, no. 4, pp. 2308-2334, 2013.
[51] J. C. Willems. Least squares stationary optimal control and the algebraic Riccati equation. IEEE Transactions on Automatic Control, vol. 16, no. 6, pp. 621-634, Dec. 1971.
[52] H. Yin, P. G. Mehta, S. P. Meyn, and U. V. Shanbhag. Synchronization of coupled oscillators is a game. IEEE Transactions on Automatic Control, vol. 57, no. 4, pp. 920-935, April 2012.
[53] J. Yong. Linear-quadratic optimal control problems for mean-field stochastic differential equations. SIAM J. Control Optim., vol. 51, no. 4, pp. 2809-2838, 2013.
[54] J. Yong and X. Y. Zhou. Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer-Verlag, New York, 1999.