Applications of Mean Field Games in Financial Engineering and Economic Theory
René Carmona

ABSTRACT. This is an expanded version of the lecture given at the AMS Short Course on Mean Field Games, on January 13, 2020 in Denver CO. The assignment was to discuss applications of Mean Field Games in finance and economics. I need to admit upfront that several of the examples reviewed in this chapter were already discussed in book form. Still, they are here accompanied with discussions of, and references to, works which appeared over the last three years. Moreover, several completely new sections are added to show how recent developments in financial engineering and economics can benefit from being viewed through the lens of the Mean Field Game paradigm. The new financial engineering applications deal with bitcoin mining and the energy markets, while the new economic applications concern models offering a smooth transition between macro-economics and finance, and contract theory.

CONTENTS
Introduction
Financial Engineering Applications
Games Models for Energy and the Environment
Macro-Economic Growth Models
From Macro to Finance
Moral Hazard & Contract Theory
1. Introduction
The goal of this chapter is to review applications in financial engineering and economics which can be cast and tackled within the framework of the Mean Field Game (often abbreviated as MFG in the sequel) paradigm.

Sections 2, 3 and 4 revisit some of the applications discussed in [ , Chapter 1]. Here, we try to give some context, and examine developments which occurred since the publication of the book. While we do not delve into proofs, we provide detailed references where the interested reader will be able to find complements on the subject matters. This is clearly the case for Section 2 reviewing applications to financial engineering.

The author was partially supported by NSF DMS-1716673 and ARO W911NF-17-1-0578.
Economic models often involve optimization over sets of actions, behaviors and strategies. Engineers and biologists facing similar optimization problems would be likely to restrict themselves to finite action sets, but many economists prefer continuous action sets in order to be able to write first order conditions of optimality in differential form, in the hope of resolving them with explicit solutions. Furthermore, they have no qualms about using models with a continuum of agents. They explain that this guarantees an environment with perfect competition and tractable equilibrium behavior. However, these formulations often raise eyebrows from colleagues, and especially mathematicians. Our discussion of macro-economic models is motivated in part by the desire to reconcile the approach to competitive equilibria with a continuum of players with the paradigm of mean field games. Additionally, while not all economists like to work with continuous time, it has been argued over the last two decades that models in macro-economics, contract theory, finance, ..., greatly benefit from the switch to continuous time from the more traditional discrete time models. Yuli Sannikov is certainly one of the most visible crusaders in this respect, and we shall follow his lead and consider only continuous time models in this review chapter.

As is often the case in the published literature in economics, economies are modeled as populations with a continuum of players, and to capture the fact that the influence of any individual member on the aggregate should be infinitesimally small, these individuals are modeled as elements of a measure space with a continuous measure, say the unit interval $[0,1]$ with its Lebesgue measure. This starting point is different from the typical set-up of mean field games, which usually starts from $N$ player games with $N$ finite, and considers the limit $N \to \infty$.
We shall use both modeling paradigms, and refer the reader to [ , Section 3.7] for a discussion of the links between the two.

None of the macro-economic papers which we review in this chapter mention the name Nash, or use the terminology Nash equilibrium. The authors of these papers are concerned with general equilibria, not Nash equilibria. They first assume that all the aggregate quantities in the economy are given, and they let the participants in the economy optimize their utilities independently of each other. Next, they identify the constraints to be satisfied according to economic theory, compile them in a set which they call the set of clearing conditions, and check that the results of the participants' optimizations are compatible with the clearing conditions. As long as the clearing conditions are satisfied while the participants in the economy optimize their expected utilities, a general equilibrium is said to take place. This procedure is very similar to the search for a Nash equilibrium. Identifying the best response function amounts to having the participants optimize their expected utilities given the strategies of the other players, and given the constraints of economic theory, the search for a fixed point of the best response function appears as an analog of the check that the clearing conditions are satisfied. The parallel is even more striking when the clearing conditions can be written in terms of aggregate quantities. Indeed these aggregates quantify the interactions between the economy participants, and since these aggregates are most often nothing more than empirical means of state variables, they spell out mean field interactions in the model. It is thus reasonable to cast these general equilibria as Nash equilibria for Mean Field Game models, and take full advantage of the technology already developed for the analysis of these game models to gain a better understanding of these general equilibria.
Acknowledgements:
I would like to thank Markus Brunnermeier for relentless attempts at educating me on some of the subtleties of the economic models discussed in Section 5. Also, I would like to acknowledge the useful comments and suggestions from an anonymous referee.
2. Financial Engineering Applications

2.1. Systemic Risk.
The study of systemic risk is concerned with the identification and analysis of an event, or a sequence of events, which could trigger severe instability, or even collapse, of the financial system and of the entire economy as a result. In finance, systemic risk is approached differently than the risk associated with any one individual institution or portfolio. In the US, it was first brought to the forefront by the Federal Reserve Bank of New York after the September 11, 2001 attack. See [ ]. The Central Bank spearheaded several collaborative research initiatives involving economists and academic researchers, including engineers working on the safety of the electric grid, mathematicians expert in graph theory and network analysis, population biologists, and others. Unfortunately, what brought this line of research into the limelight is the financial crisis of 2008, to which systemic risk was a major contributor. While the mean field game models we discuss below give identical importance to the various institutions involved in the model, it is clear that a realistic model of systemic risk in finance should differentiate the roles of those companies considered to be too big to fail, and which have been identified officially under the name of SIFI, for Systemically Important Financial Institutions. They are the banks, insurance companies, or other financial institutions that U.S. federal regulators think would pose a serious risk to the economy if they were to collapse. See [ ] for an account of the state of the art after the financial crisis. We shall briefly discuss extensions of the models reviewed below which could possibly account for the presence of such important players in the models.

2.1.1. A Toy Model of Systemic Risk. Our first example is borrowed from [ ]. We view its main merit as being pedagogical. It gives us the opportunity to review some of the main features of Linear Quadratic (LQ) finite player game models and, in so doing, explain how one can solve them.
Moreover, despite its very simple structure, it exhibits several very useful characteristics: for each integer $N \ge 1$, the $N$-player version of the game has a unique closed loop equilibrium and a unique open loop equilibrium; this open loop equilibrium is in closed loop form, but it is different from the closed loop equilibrium. However, when $N \to \infty$, both equilibria converge toward the unique equilibrium of the mean field game.

To describe the model, let us assume that the log-monetary reserves of $N$ banks, say $X^i_t$, $i = 1, \dots, N$, satisfy
\[
dX^i_t = \big[ a(\bar X_t - X^i_t) + \alpha^i_t \big]\, dt + \sigma \Big( \sqrt{1-\rho^2}\, dW^i_t + \rho\, dW^0_t \Big), \qquad i = 1, \dots, N,
\]
where $W^i_t$, $i = 0, 1, \dots, N$, are independent Brownian motions and $\sigma > 0$ is a constant. The notation $\bar X_t$ stands for the empirical mean of the $X^i_t$ for $i = 1, \dots, N$, $a$ is a constant regulating the mean reversion of $X^i_t$ toward the mean, and $\rho$ correlates the idiosyncratic shocks $dW^i_t$ and the common shock $dW^0_t$. So in this model, borrowing and lending is done through the drifts. In fact,
• If $X^i_t$ is small (relative to the empirical mean $\bar X_t$) then bank $i$ wants to borrow ($\alpha^i_t > 0$)
• If $X^i_t$ is large then bank $i$ will want to lend ($\alpha^i_t < 0$)
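To get a feel for the reserve dynamics before any optimization enters, the following sketch simulates the system with a plain Euler scheme and with the controls switched off ($\alpha^i_t = 0$). The function name `simulate_reserves`, the parameter values and the choice of zero controls are illustrative assumptions, not part of the model as published.

```python
import math
import random


def simulate_reserves(a=1.0, sigma=0.2, rho=0.5, horizon=1.0,
                      n_steps=200, x0=None, n_banks=10, seed=0):
    """Euler scheme for
        dX^i = a (Xbar - X^i) dt + sigma (sqrt(1 - rho^2) dW^i + rho dW^0).

    The controls alpha^i are set to zero (an assumption made purely for
    illustration). Returns the list of states at the final time.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    if x0 is None:
        x = [rng.gauss(0.0, 1.0) for _ in range(n_banks)]
    else:
        x = list(x0)
    idio_weight = math.sqrt(1.0 - rho ** 2)
    for _ in range(n_steps):
        mean = sum(x) / len(x)
        dw0 = rng.gauss(0.0, sqrt_dt)  # common shock, shared by all banks
        x = [
            xi + a * (mean - xi) * dt
            + sigma * (idio_weight * rng.gauss(0.0, sqrt_dt) + rho * dw0)
            for xi in x
        ]
    return x
```

With `sigma=0.0` the empirical mean is conserved and the spread between banks contracts at rate `a`, which is the mean reversion described above; with `sigma > 0` the common shock moves all reserves together.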
The adapted stochastic process $\alpha^i = (\alpha^i_t)_{t \ge 0}$ is the control strategy of bank $i$, which tries to minimize the quantity
\[
J^i(\alpha^1, \dots, \alpha^N) = \mathbb{E}\left\{ \int_0^T \left[ \frac12 (\alpha^i_t)^2 - q\,\alpha^i_t (\bar X_t - X^i_t) + \frac{\epsilon}{2} (\bar X_t - X^i_t)^2 \right] dt + \frac{c}{2} (\bar X_T - X^i_T)^2 \right\}. \tag{2.1}
\]
We could imagine that the quantity $q > 0$ is chosen by the regulator to control the cost of borrowing and lending.

This model is a simple example of an $N$-player stochastic differential game with Mean Field Interactions, since the interactions are through the empirical means of the $N$ states, and a common noise.

Explicit Solutions for Finite $N$. While it is usually very difficult to identify and compute Nash equilibria for finite player games, especially when the games are stochastic and dynamic, the very special Linear-Quadratic (LQ) nature of the model allows for explicit solutions.

Searching for an open loop Nash equilibrium $\alpha = (\alpha_t)_{0 \le t \le T}$ for the game, with $\alpha_t = (\alpha^1_t, \dots, \alpha^N_t)$ being adapted to the filtration generated by the Brownian motions, can be done using the Pontryagin stochastic maximum principle. In this particular model, one finds that the unique open loop Nash equilibrium is the strategy profile $\alpha$ defined by
\[
\alpha^i_t = \Big[ q + \Big(1 - \frac1N\Big) \eta_t \Big] (\bar X_t - X^i_t), \tag{2.2}
\]
where the deterministic function $t \mapsto \eta_t$ solves the Riccati equation
\[
\dot\eta_t = 2\Big[ a + q - \frac{1}{2N}\, q \Big] \eta_t + \Big(1 - \frac1N\Big) \eta_t^2 + q^2 - \epsilon \tag{2.3}
\]
with terminal condition $\eta_T = c$. This equation is uniquely solvable if we assume $\epsilon \ge q^2$. Notice that $\alpha_t$ is in closed loop / feedback form since it is of the form $\alpha^i_t = \phi^i_t(X_t)$ for the deterministic function
\[
(t, x) \mapsto \phi^i_t(x) = \Big[ q + \Big(1 - \frac1N\Big) \eta_t \Big] (\bar x - x^i), \qquad \text{with } \bar x = \frac1N \sum_{i=1}^N x^i, \tag{2.4}
\]
for $i = 1, \dots, N$. Note also that in equilibrium, the states of the banks satisfy
\[
dX^i_t = \Big[ a + q + \Big(1 - \frac1N\Big) \eta_t \Big] (\bar X_t - X^i_t)\, dt + \sigma\rho\, dW^0_t + \sigma \sqrt{1-\rho^2}\, dW^i_t, \tag{2.5}
\]
for $i = 1, \dots, N$. So the states evolve according to an Ornstein-Uhlenbeck like process, which is Gaussian if the initial conditions are Gaussian (or deterministic).

Searching for a closed loop Nash equilibrium $\beta = (\beta_t)_{0 \le t \le T}$ with $\beta_t = \psi_t(X_t)$ is usually done by deriving the Hamilton-Jacobi-Bellman equation for the system of value functions of the players, but in this particular instance, it can also be done using the Pontryagin stochastic maximum principle. In any case, one finds that there exists a unique Nash equilibrium and, as in the open loop case, the equilibrium strategy profile is given by feedback functions $\psi = (\psi^1, \dots, \psi^N)$ given by the same formula (2.4), except for the fact that the deterministic function $t \mapsto \eta_t$ now solves a slightly different Riccati equation, namely
\[
\dot\eta_t = 2(a + q)\, \eta_t + \Big(1 - \frac{1}{N^2}\Big) \eta_t^2 + q^2 - \epsilon, \tag{2.6}
\]
with the same terminal condition $\eta_T = c$. The same remarks apply to the Ornstein-Uhlenbeck nature of the state evolutions in equilibrium. However, what is remarkable is the fact that the open loop Nash equilibrium happens to be in closed loop form, and yet it is not a closed loop Nash equilibrium. This is very different from the situation for plain optimization. In fact, it reinforces the message that looking for a Nash equilibrium is a far cry from solving an optimization problem.

The second point we want to emphasize is that both Riccati equations, and hence both solutions, coincide in the limit $N \to \infty$. More on that below.

The interested reader can find complements and detailed proofs in [ , Section 2.5].

Relevance to Mean Field Games (MFGs). Mean Field Games have been touted as the appropriate limits of $N$ player games when $N \to \infty$. However, beyond appealing intuitive arguments and rigorous proofs that MFG solutions can be used to construct strategy profiles forming approximate Nash equilibria for $N$ player games (the larger $N$, the better the approximation), proving actual convergence is a difficult problem. See for example D. Lacker's chapter in this volume.

In the present situation, because of the explicit nature of the solutions of the finite player games, we can take the limit $N \to \infty$ in the dynamics of the states, in the Nash equilibrium controls (open and closed loop), and in the expected costs optimized by the players, all that despite the presence of the common noise. In fact, we can read off the impact of the common noise in the limit $N \to \infty$ where the open and closed loop models coincide.
This limit can formally be identified with the so-called Mean Field Game model, and we can even identify the Master Equation from the large $N$ behavior of the system of HJB equations.

MFGs as Models for Systemic Risk. Being set up in continuous time, the above model is multi-period in nature. This is in sharp contrast with most of the existing mathematical models of systemic risk, which are most often cast as static one-period models. Still, one of the most valuable features of the model presented above is that it is explicitly solvable.

On the shortcoming side, I need to admit that this is a very naive model of bank lending and borrowing. Among its undesirable features, the model does not have any provision for borrowers to repay their debts! Moreover, despite explicit solutions, it gives only a small jab at the stability properties of the system.

Still, the model raises interesting challenges, and it seems reasonable that it can be made more realistic and more useful with some mathematically tractable add-ons. We cite a few of them for the sake of illustration.
• The introduction of major and minor players in this model will make it possible to capture the crucial role played by the SIFIs discussed in the introduction to this section. Game models with major and minor players experienced a renewal of interest recently, and systemic risk seems to be a perfect testbed for their implementation.
• In order to palliate some of the unrealistic features of the original model, Carmona, Fouque, Moussavi and Sun suggested the introduction of time delays in the controls in [ ]. While increasing the technicalities of the proofs and precluding explicit solutions, this extension of the model includes provisions for borrowers to repay their debts in a fixed amount of time.
• As another example of the fact that systemic risk is a fertile ground for the introduction and analysis of new MFG models, we mention the recent paper by Benazzoli, Ciampi and Di Persio [ ] in which the authors study a simple illiquid inter-bank market model, where the banks can change their reserves only at the jump times of some exogenous Poisson processes with a common constant intensity.
• Also noteworthy is a recent paper [ ] of Nadtochiy and Shkolnikov, who study the mean field limit of systems of particles interacting through hitting times. Their model was motivated by the contagion of the times of default of financial institutions in periods of economic stress. It would be interesting to add the optimization component to their model and study the endogenously made decisions of the participants.
• Finally, the introduction of graphical constraints should be a good way to quantify the various levels of exchanges between the financial institutions. Introducing a weighted graph of interactions between the players of the game changes dramatically the mean field nature of the model, and new solution approaches will have to be worked out for any significant progress to be made in this direction. Still, despite the obvious challenges of this research program, it is under active investigation by many financial engineers.
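The claim made in Section 2.1.1 that the Riccati equations (2.3) (open loop) and (2.6) (closed loop) coincide in the limit $N \to \infty$ can be checked numerically. The sketch below integrates both equations backward from $\eta_T = c$ with a classical Runge-Kutta scheme and compares the values at time $0$. The helper names and the parameter values ($a$, $q$, $\epsilon$, $c$, $T$) are assumptions chosen for illustration, with $\epsilon \ge q^2$.

```python
def riccati_rhs(eta, n, a, q, eps, closed_loop):
    """Right-hand side of (2.3) (open loop) or (2.6) (closed loop)."""
    if closed_loop:
        return 2.0 * (a + q) * eta + (1.0 - 1.0 / n ** 2) * eta ** 2 + q ** 2 - eps
    return (2.0 * (a + q - q / (2.0 * n)) * eta
            + (1.0 - 1.0 / n) * eta ** 2 + q ** 2 - eps)


def eta_at_zero(n, closed_loop, a=1.0, q=1.0, eps=10.0, c=0.0,
                horizon=1.0, n_steps=2000):
    """Integrate backward in time from eta_T = c with RK4 and return eta_0."""
    h = -horizon / n_steps  # negative step: we move from T down to 0
    eta = c
    for _ in range(n_steps):
        k1 = riccati_rhs(eta, n, a, q, eps, closed_loop)
        k2 = riccati_rhs(eta + 0.5 * h * k1, n, a, q, eps, closed_loop)
        k3 = riccati_rhs(eta + 0.5 * h * k2, n, a, q, eps, closed_loop)
        k4 = riccati_rhs(eta + h * k3, n, a, q, eps, closed_loop)
        eta += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return eta


def open_closed_gap(n):
    """Absolute difference between the open and closed loop values of eta_0."""
    return abs(eta_at_zero(n, closed_loop=False) - eta_at_zero(n, closed_loop=True))
```

Calling `open_closed_gap(n)` for increasing `n` shows the gap shrinking, which is the numerical counterpart of the two equilibria merging in the mean field limit.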
High frequency markets offer another fertile ground for applications of financial engineering. Among other things, they highlighted the importance of price impact and optimal execution. The search for the best possible way to execute a given trade is an old problem, and the presence of price impact did not have to wait for the popularity of the high frequency markets.
A Model for Price Impact.
We briefly review an MFG model of price impact introduced by Carmona and Lacker in [ ]. There, it was solved in the weak formulation, but for the purpose of the present discussion the specific approach used to get to a solution will not really matter.

We start with a model for $N$ traders. We denote by $X^i_t$ the inventory (i.e. the number of shares owned) at time $t$ by trader $i$, and we assume that this inventory evolves as an Itô process according to
\[
dX^i_t = \alpha^i_t\, dt + \sigma^i\, dW^i_t, \tag{2.7}
\]
where $\alpha^i_t$ represents the rate of trading of trader $i$. This will be their control. $W^i = (W^i_t)_{t \ge 0}$ are independent Wiener processes for $i = 1, \dots, N$, and $\sigma^i$ represents an idiosyncratic volatility. We assume that it is independent of $i$ for simplicity. Note that in essentially all the papers on the subject this volatility is assumed to be $0$. In other words, inventories are assumed to be differentiable in time. Our decision to work with $\sigma^i = \sigma > 0$ is backed by empirical evidence, at least in the high frequency markets, as demonstrated by Carmona and Webster in [ ]. Next, we denote by $K^i_t$ the amount of cash held by trader $i$ at time $t$. We have:
\[
dK^i_t = -\big[ \alpha^i_t S_t + c(\alpha^i_t) \big]\, dt,
\]
where $S_t$ is the transaction price of one share at time $t$, and the function $\alpha \mapsto c(\alpha) \ge 0$ models the cost for trading at rate $\alpha$. As explained in [ ], this function $c$ should be thought of as the Legendre transform of the shape of the order book. So for a flat order book we should have:
\[
c(\alpha) = c\,\alpha^2.
\]
We model the time evolution of the price $S_t$ using the natural extension to the case of $N$ traders of the price impact formula of Almgren and Chriss [ ]:
\[
dS_t = \frac1N \sum_{i=1}^N h(\alpha^i_t)\, dt + \sigma_0\, dW^0_t
\]
for some non-negative increasing function $\alpha \mapsto h(\alpha)$ and a Wiener process $W^0 = (W^0_t)_{t \ge 0}$ independent of the other ones.
In this model, the wealth $V^i_t$ of trader $i$ at time $t$ is given by the sum of their holdings in cash and the value of their holdings in the stock, as marked at the current value of the stock: $V^i_t = K^i_t + X^i_t S_t$. Using the standard self-financing condition and Itô's formula we see that:
\[
dV^i_t = dK^i_t + X^i_t\, dS_t + S_t\, dX^i_t = \Big[ -c(\alpha^i_t) + X^i_t\, \frac1N \sum_{j=1}^N h(\alpha^j_t) \Big]\, dt + \sigma S_t\, dW^i_t + \sigma_0 X^i_t\, dW^0_t. \tag{2.8}
\]
Player $i$ minimizes their expected trading costs
\[
J^i(\alpha^1, \dots, \alpha^N) = \mathbb{E}\Big[ \int_0^T c_X(X^i_t)\, dt + g(X^i_T) - V^i_T \Big],
\]
where $x \mapsto c_X(x)$ represents the cost of holding an inventory of size $x$ and $g(x)$ a form of terminal inventory cost. Using (2.8), we can rewrite these expected costs as
\[
J^i(\alpha^1, \dots, \alpha^N) = \mathbb{E}\Big[ \int_0^T f(t, X^i_t, \bar\nu^N_t, \alpha^i_t)\, dt + g(X^i_T) \Big]
\]
if $\bar\nu^N_t$ denotes the empirical distribution of $\alpha^1_t, \dots, \alpha^N_t$, and the function $f$ is defined by $f(t, x, \nu, \alpha) = c(\alpha) + c_X(x) - x \int h\, d\nu$.

REMARK.
(1) The model above is an $N$-player stochastic differential game.
(2) The state equations are given by (2.7). They are driven by the $N$ independent idiosyncratic noise terms $dW^i_t$. The common noise $W^0$ disappeared from the optimization problem only because of the risk neutrality of the traders, namely the fact that they minimize the expectations of their costs. Should they choose to minimize a nonlinear utility function, the common noise would not disappear!
(3) Another remarkable property of this model, and one of the reasons its analysis was of great interest, is that it is one of the earliest MFG models for which the mean field interaction occurs naturally through the controls. An early analysis of these MFGs within the probabilistic approach was given in [ , Section 4.6].

While the model formulated above is for $N$ traders, the mean field formulation $N \to \infty$ is clear. A complete solution of this limit MFG in the weak formulation can be found in [ ] and in [ , Section 4.7] for the strong formulation.
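To make the step from the wealth dynamics (2.8) to the running cost $f$ explicit, here is a short worked computation. It assumes enough integrability for the two stochastic integrals in (2.8) to be true martingales, an assumption which is not stated in the text above.

```latex
\begin{align*}
V^i_T &= V^i_0
  + \int_0^T \Big[ -c(\alpha^i_t) + X^i_t\,\frac1N\sum_{j=1}^N h(\alpha^j_t) \Big]\,dt
  + \int_0^T \sigma S_t\,dW^i_t + \int_0^T \sigma_0 X^i_t\,dW^0_t, \\
\mathbb{E}\big[-V^i_T\big] &= -\mathbb{E}\big[V^i_0\big]
  + \mathbb{E}\int_0^T \Big[ c(\alpha^i_t) - X^i_t \int h\,d\bar\nu^N_t \Big]\,dt,
  \qquad\text{since}\quad \frac1N\sum_{j=1}^N h(\alpha^j_t) = \int h\,d\bar\nu^N_t .
\end{align*}
```

Adding the inventory costs $c_X$ and $g$ recovers the running cost $f(t, x, \nu, \alpha) = c(\alpha) + c_X(x) - x \int h\, d\nu$, up to the constant $-\mathbb{E}[V^i_0]$ which does not affect the minimization. Both Brownian integrals drop out in expectation, which is how the common noise leaves the optimization problem of a risk-neutral trader.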
Later on, it was revisited by Cardaliaguet and Lehalle in [ ], where the authors consider agents' possible heterogeneities, and the introduction of fictitious play which gives a learning twist to the model. It was also extended by Cartea and Jaimungal in [ ] to formulate an optimal execution problem for which the authors could still provide solution formulas in quasi explicit form. Game models for optimal execution in the presence of price impact did not wait for the theory of MFGs to catch the interest of financial economists and financial engineers. The interested reader may want to check the papers [ ], [ ], and [ ] for game models shedding light on predatory trading.

I came across the potential application of Mean Field Games to the important problem of bank runs by attending Jean Charles Rochet's lectures at the Vancouver Systemic Risk Summer School in July 2014, and the talk given by Olivier Gossner during the conference following the summer school. Both works [71, 46] are reported in detail in [21, 22]. Here, we only review the second one because it fits better in the class of continuous time models on which we concentrate in this chapter.

In the spirit of the discussion to follow, it is worth mentioning the works of Morris and Shin [ ] and He and Xiong [ ]. Like most economists, they use a model of the economy with a continuum of players based on an atomless measure space. Our goal is to recover their models starting with finitely many players, and then analyze the mean field limit.

A Continuous Time Model for Bank Runs. Following Gossner's talk mentioned earlier, we consider a group of $N$ depositors with individual initial deposits in the amount $D^i_0 = 1/N$ for $i = 1, \dots, N$. They are promised a rate of return $\bar r > r$, where $r$ is the current prevailing interest rate. We assume that the value $Y_t$ of the assets of the bank at time $t$ follows an Itô process and that $Y_0 \ge 1$. We also assume the existence of a deterministic function $y \mapsto L(y)$ giving the liquidation value of the bank assets when $Y_t = y$. One can imagine that the bank has a rate $r$ credit line of size $L(Y_t)$ at time $t$, and that the bank uses this credit line each time a depositor runs (withdraws their deposit).

Next, we assume that the assets mature at time $T$, and that no transaction takes place after that. If at that time $Y_T \ge 1$, everyone is paid in full, but if $Y_T < 1$ we treat this case as an exogenous default. We talk about an endogenous default at time $t$ if depositors try to withdraw more than $L(Y_t)$ at that time.

As time passes, each depositor has access to a private signal $X^i_t$ satisfying
\[
dX^i_t = dY_t + \sigma\, dW^i_t, \qquad i = 1, \dots, N,
\]
and at a time $\tau^i$ of their choosing, they can attempt to withdraw their deposit, de facto collecting the return $\bar r$ until time $\tau^i$, and trying to maximize
\[
J^i(\tau^1, \dots, \tau^N) = \mathbb{E}\big[ g(\tau^i, Y_{\tau^i}, \tau^{-i}) \big],
\]
where we use the standard notation $\tau^{-i} = (\tau^1, \dots, \tau^{i-1}, \tau^{i+1}, \dots, \tau^N)$, and for example
\[
g(t, Y_t, \tau^1, \dots, \tau^N) = e^{(\bar r - r)\, t \wedge \tau} + e^{-r\, t \wedge \tau} \big( L(Y_t) - N_t/N \big)^+ \wedge \frac1N,
\]
where $N_t = \#\{ i;\ \tau^i \le t \}$ is the number of withdrawals before $t$, and $\tau = \inf\{ t;\ L(Y_t) < N_t/N \}$ is the first time the bank cannot withstand the withdrawal requests.

First, let us try to derive some conclusions if the depositors had full information, in which case $Y_t$ would be public knowledge and $\sigma$ would be zero, i.e. $\sigma = 0$. If we also assume that the function $y \mapsto L(y)$ is known to the depositors, then it is easy to check that in any equilibrium:
\[
\tau^i = \inf\{ t;\ L(Y_t) \le 1 \}.
\]
So all the depositors withdraw at the same time (they all run on the bank simultaneously) and each depositor gets their deposit back: no one gets hurt! Unfortunately, this scenario is clearly highly unrealistic. We should expect that depositors wait longer before running on the bank, presumably because they only have imperfect information (i.e. noisy private signals) on the health of the bank.
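In the full-information case just described, the common run time is the first time the liquidation value of the assets drops to the level of the total deposits. A minimal sketch on a discretized path follows; the function name `run_time`, the linear liquidation function and the sample path are all hypothetical and only meant to illustrate the definition of the hitting time.

```python
def run_time(times, asset_path, liquidation_value):
    """First grid time t with L(Y_t) <= 1, or None if no run occurs on the grid.

    times             -- increasing list of observation dates
    asset_path        -- values Y_t of the bank assets on the same grid
    liquidation_value -- the function y -> L(y) (assumed known to depositors)
    """
    for t, y in zip(times, asset_path):
        if liquidation_value(y) <= 1.0:
            return t
    return None


# Hypothetical example: the bank recovers 80% of the asset value on liquidation.
example_times = [0.0, 0.25, 0.5, 0.75, 1.0]
example_path = [2.0, 1.8, 1.5, 1.2, 1.6]
example_run = run_time(example_times, example_path, lambda y: 0.8 * y)
```

On this path the liquidation values are 1.6, 1.44, 1.2 and then 0.96, so the run is triggered at the fourth grid date, even though the asset value itself is still above the total deposits at that time.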
Games of Timing. Let us consider a population of $N$ players with individual states $X^{N,i}_t$ at time $t$ satisfying equations of the form
\[
dX^{N,i}_t = b(t, X^{N,i}_t, \nu^N_t)\, dt + \sigma(t, X^{N,i}_t)\, dW^i_t, \qquad i = 1, \dots, N,
\]
coupled through their empirical distribution
\[
\nu^N_t = \frac1N \sum_{i=1}^N \delta_{X^{N,i}_t}.
\]
Each player chooses an $\mathbb{F}^{X^i}$-stopping time $\tau^i$ and tries to maximize
\[
J^i(\tau^1, \dots, \tau^N) = \mathbb{E}\big[ g(\tau^i, X_{\tau^i}, \mu^N([0, \tau^i])) \big],
\]
where $\mu^N = \frac1N \sum_{i=1}^N \delta_{\tau^i}$ is the empirical distribution of the $\tau^i$'s, and $g(t, x, p)$ is the reward to a player for exercising their timing decision at time $t$ when their private signal is $X^i_t = x$ and the proportion of players who already exercised their right is $p$. Taking formally the limit $N \to \infty$ in this set-up, we obtain the following formulation of a mean field game of timing.

Assuming that the drift is independent of the empirical distribution of the states for the sake of simplicity, i.e. $b(t, x, \nu) = b(t, x)$, the dynamics of the state of a generic player are given by an Itô equation of the form:
\[
dX_t = b(t, X_t)\, dt + \sigma(t, X_t)\, dW_t.
\]
We denote by $\mathbb{F}^X = (\mathcal{F}^X_t)_{0 \le t \le T}$ the information available to the agent at time $t$, and by $\mathcal{S}_X$ the set of $\mathbb{F}^X$-stopping times. The MFG of timing paradigm can then be formulated as follows:
(1) Best Response Optimization: for each fixed environment $\mu \in \mathcal{P}([0, T])$ solve
\[
\hat\theta \in \arg\sup_{\theta \in \mathcal{S}_X,\ \theta \le T} \mathbb{E}\big[ g(\theta, X_\theta, \mu([0, \theta])) \big]
\]
(2) Fixed Point Step: find $\mu$ so that $\mu([0, t]) = \mathbb{P}[\hat\theta \le t]$.
Here and throughout, we denote by $\mathcal{P}(A)$ the space of probability measures on $A$.

Existence with randomized stopping times.
In an unpublished PhD thesis, Geoffrey Zhu proposed an existence proof for randomized stopping times, providing an analog of Nash's original existence theorem for the existence of equilibria in mixed strategies.

Before we go any further, recall the sobering shortcoming of convergence in distribution which says that even if
\[
\begin{cases}
\lim_{n \to \infty} (X, Y_n) = (X, Y) \text{ in law} \\
Y_n \text{ is a function of } X
\end{cases}
\]
then $Y$ is not necessarily a function of $X$; in other words, $Y \in \sigma\{X\}$ may not hold.

For the purpose of this existence proof, let us assume that the reward function $g : [0, T] \times \mathbb{R} \times \mathcal{P}([0, T]) \ni (t, x, \mu) \mapsto g(t, x, \mu) \in \mathbb{R}$ is bounded, continuous in $(t, x)$ for $\mu$ fixed, and Lipschitz continuous in $\mu$ for $(t, x)$ fixed. Note that, unfortunately, this last assumption is not satisfied for functions $t \mapsto \mu([0, t])$, unless $t \in \mathcal{T} \subset [0, T]$ with $\mathcal{T}$ finite! This will prevent us from using this existence result in the model of bank run discussed earlier. In any case, under the present assumptions, the map
\[
\Pi : \mathcal{P}([0, T]) \times \mathcal{P}\big( C([0, T]) \times [0, T] \big) \to \mathbb{R}, \qquad (\mu, \xi) \mapsto \Pi(\mu, \xi) = \int_{C([0, T]) \times [0, T]} g(t, x_t, \mu)\, \xi(dx, dt)
\]
is continuous, and since the space $\tilde{\mathcal{S}}$ of randomized stopping times is compact because of an old result of Baxter and Chacon, Berge's maximum theorem implies that the multivalued function
\[
\mathcal{P}([0, T]) \ni \mu \mapsto \arg\sup_{\xi \in \tilde{\mathcal{S}}} \Pi(\mu, \xi)
\]
is upper hemi-continuous and compact-valued. Followed by the projection on the first marginal, it is still upper hemi-continuous and compact-valued, and Kakutani's fixed point theorem implies the desired existence result.

Existence with usual stopping times. Existence for standard stopping times can be shown to hold under a different set of assumptions, using for example the order structure of the space of stopping times instead of topological properties of this space.
For example, if we assume that the time increments of $g$ are monotone in $\mu$, then we can use the fact that the space of stopping times is a complete lattice, check that
\[
\tau \mapsto \arg\sup_{\tau' \in \mathcal{S}} \mathbb{E}\big[ g(\tau', X_{\tau'}, F_\tau(\tau')) \big]
\]
is monotone, and use Tarski's fixed point theorem. Here $F_\tau(t) = \mathbb{P}[\tau \le t]$ is the cumulative distribution function of $\tau$.

Unfortunately, once more, this existence result does not apply to the model of bank run discussed earlier.

Solution in the general case (including common noise). Beyond a simple example presented by M. Nutz in [ ], the solution in the general bank run set-up introduced earlier is much more difficult and technical than originally thought. A complete solution was given by Carmona, Delarue and Lacker in [ ]. See also [ ] by C. Bertucci for an approach relying purely on partial differential equations and quasi-variational inequalities, and [ ] by Bouveret, Dumitrescu and Tankov for more on the use of relaxed stopping times.
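The two-step formulation of the MFG of timing (best response, then fixed point) stated earlier can be illustrated on a toy discretization. The sketch below is not the method of any of the works cited above: it assumes a driftless state with constant volatility approximated by a binomial lattice, solves the optimal stopping problem by backward induction for a given distribution function of stopping times, propagates the law of the optimal stopping time forward, and applies a damped Picard iteration whose convergence is not guaranteed in general. The reward `example_reward`, all function names and all parameter values are hypothetical.

```python
import math


def best_response_cdf(cdf, reward, x0=1.5, vol=0.5, horizon=1.0):
    """Best response step: cdf[k] plays the role of mu([0, t_k]).

    Returns the distribution function of the optimal stopping time on the grid.
    """
    n = len(cdf) - 1
    dt = horizon / n
    step = vol * math.sqrt(dt)

    def payoff(k, j):
        # Node (k, j): date t_k, j up-moves out of k, environment value cdf[k].
        return reward(k * dt, x0 + (2 * j - k) * step, cdf[k])

    # Backward induction; stopping is forced at the terminal date T.
    value = [payoff(n, j) for j in range(n + 1)]
    stop = [[True] * (n + 1)]
    for k in range(n - 1, -1, -1):
        cont = [0.5 * (value[j] + value[j + 1]) for j in range(k + 1)]
        now = [payoff(k, j) for j in range(k + 1)]
        stop.append([now[j] >= cont[j] for j in range(k + 1)])
        value = [max(now[j], cont[j]) for j in range(k + 1)]
    stop.reverse()  # stop[k][j] is now indexed by date k = 0, ..., n

    # Forward pass: probability mass that stops at each date.
    alive = [1.0]
    out, total = [], 0.0
    for k in range(n + 1):
        nxt = [0.0] * (k + 2)
        for j, p in enumerate(alive):
            if stop[k][j]:
                total += p
            else:
                nxt[j] += 0.5 * p
                nxt[j + 1] += 0.5 * p
        out.append(min(total, 1.0))
        alive = nxt
    return out


def solve_timing_game(reward, n_steps=40, n_iter=50, damping=0.5):
    """Fixed point step, approximated by a damped iteration (a heuristic)."""
    cdf = [0.0] * n_steps + [1.0]  # initial guess: everybody waits until T
    for _ in range(n_iter):
        new = best_response_cdf(cdf, reward)
        cdf = [(1.0 - damping) * c + damping * m for c, m in zip(cdf, new)]
    return cdf


def example_reward(t, x, p):
    """Hypothetical reward: grows with time, shrinks as more players have left."""
    return math.exp(0.1 * t) * min(1.0, max(0.0, x - p))
```

A call such as `solve_timing_game(example_reward)` returns an approximate equilibrium distribution function on the time grid. Whether the iterates actually settle down depends on the reward, which is consistent with the delicate existence questions discussed above.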
Given the fact that the Bitcoin mania hits the financial markets on a recurrent basis, and because of the competitive nature of the mining process involving a large number of miners, it is no surprise that mean field game models have been proposed to analyze some of the features of the cryptocurrency space. Here we briefly review two very recent papers by Li, Reppen and Sircar [ ] and Bertucci, Bertucci, Lasry and Lions [ ] which use continuous time mean field games, though in very different ways, to analyze some of the features of cryptocurrency generation. Both papers envision a continuum of miners interacting through the aggregate computational power they allocate to mining the blockchain.

Even though it is not the only cryptocurrency, we shall only talk about Bitcoin because it is the one getting the most press. There are many reasons for that, wild price moves being definitely one of them. After briefly exceeding $12,000, it crashed down to the $10,000 range, before quickly moving up again to high levels.

The generation of bitcoins is based on blockchain technology. The latter was introduced for the purpose of record keeping in a decentralized ledger. Still, it is at the core of bitcoin generation. In bitcoin production, independent miners compete for the right to record the next transaction block on the blockchain. They follow a Proof-of-Work (PoW) protocol. Their goal is to solve mathematical puzzles designed to be solved by brute force only. Computations to solve puzzles (create a block and earn a reward) are otherwise totally useless as they are not applicable anywhere else. Once a miner obtains a solution, the corresponding block is added on top of the blockchain and the miner obtains their reward. This reward is paid out in cryptocurrency (a fixed number of bitcoins, currently . bitcoins for adding a block) while electricity and mining hardware need to be paid with traditional fiat currency (like the US Dollar).

The supply of bitcoins is constantly growing.
However, it is limited to 21 million, of which more than million are already in circulation. The security of the network is a serious issue. A major fear is a majority attack, also called a 51% attack, when a group of users controls the majority of mining power. These instances are rare, mostly because they are very difficult to realize due to their enormous costs. They are not considered in [65, 12].

The computing power devoted to mining is called the hash rate. It captures the number of trials per second trying to solve the mining puzzle. In order to maintain stability in the blockchain, the mining puzzle difficulty is dynamically adjusted so that, on average, the time between the creation of two consecutive blocks is constant, currently approximately 10 minutes. Therefore, as the hash rate increases, the difficulty increases so that more hashes need to be computed for a given block.

Miners control their hash rate to increase their share of the blockchain reward, all other things being equal, while reducing the share of the other miners. On the other hand, intensive computations consume a lot of energy and each miner faces significant electricity costs. In a nutshell, this dilemma is the core of the individual miner optimization problem. The aggregate hash rate in the system represents the total computational power devoted to block creation. In both papers this aggregate hash rate will be the source of the mean field interaction between the miners.

In [ ], Li, Reppen, and Sircar focus on the risk borne by risk-averse miners and study mining concentration. They use a jump process to represent the acquisition of the reward, the jump intensity being the control of the typical miner. In their model, the jump intensity reflects the computer power, or hash rate, devoted to the effort, and the mean of the controls of the individual miners is what creates the mean field interaction in the model. Given our simplistic description of how bitcoins are generated, it is natural to assume that the miner's probability of receiving the next mining reward is proportional to the ratio of their hash rate to that of the population. The number of rewards each miner receives is modeled by a counting process $N_t$ with jump intensity $\lambda_t > 0$.
If the number of miners is \(M+1\), this intensity is given by
\[ \lambda_t = \frac{\alpha_t D}{\alpha_t + M\bar\alpha_t} \]
where \(M\bar\alpha_t\) approximates the total hash rate of the other miners. The wealth of the miner is used as state variable. It evolves as an Itô process of the form
\[ dX_t = -c\,\alpha_t\,dt + P\,dN_t \]
where \(P\) is the bitcoin price, so the value of each reward is the product of its quantity by \(P\). As we mentioned earlier, . bitcoins are granted to a miner for adding a block successfully. The total number of rewards in the system as a whole is a Poisson process with a constant intensity \(D > 0\). This will play the role of a common noise in the model.

Given an adapted process \(\bar\alpha = (\bar\alpha_t)_{t\ge 0}\) representing the conditional mean of the controls given the common noise, the miner optimization problem is to maximize the expected utility of wealth at a fixed terminal time \(T\):
\[ v^{\bar\alpha}(t,x) = \sup_{\alpha\in[0,A(x)]} \mathbb{E}\big[U(X_T)\,\big|\,X_t = x\big] \]
where the controls \(\alpha\) are restricted to the interval \([0,A(x)]\) when the state is \(x\). The authors tackle the optimization problem by solving the HJB equation:
\[ \partial_t v^{\bar\alpha} + \sup_{\alpha\in[0,A(x)]}\Big(-c\,\alpha\,\partial_x v^{\bar\alpha} + \frac{\alpha D}{\alpha + M\bar\alpha_t}\,\Delta v^{\bar\alpha}\Big) = 0 \]
and the solution is completed by solving the fixed point equation for the average effort rate \(\bar\alpha\). The authors provide an explicit solution for exponential utility and no liquidity constraint. They go on to analyze the effect of liquidity constraints and more general utility functions. They provide robust numerical procedures to compute the equilibrium. In the case of the CRRA power utility function, they study the concentration of wealth among the miners. Their conclusion is that the richer the miner, the wealthier they will get.

The authors also introduce a model in which one special miner is singled out for having a significant cost advantage (e.g. benefitting from cheaper electricity prices) over the remaining field of miners.
Naturally this special miner is shown to contribute more to the hash rate.

Bertucci, Bertucci, Lasry and Lions use a different modeling approach. Very much in the spirit of the work [ ] proposing a mean field approach to the dynamics of an order book in the high frequency markets, they directly introduce the master equation, arguing that this is the best way to study mean field games. Notice that this is in sharp contrast with the usual approach, which starts from the agents' maximization problems.

Using their notation system, \(P\) is now the nominal hash rate (number of hashes per second). They define the real hash rate \(K_t = e^{-\delta t}P_t\), arguing that the rate \(\delta\) quantifies technological progress. The evolution of \(K_t\) is modeled in continuous time as miners continuously acquire computing power to compute hashes. As we explained earlier, the blockchain outputs a fixed number of coins per unit of time, so the miners compete against each other to earn a share of this fixed output. From the above description, it is reasonable to assume that the share they get is proportional to their relative share of the total computational power. The authors posit dynamics of the real hash rate in the form:
\[ dK_t = -\delta K_t\,dt + \lambda U_t(K_t)\,dt, \qquad K_0 = K, \]
where \(U_t\) represents the flow of entry of computing devices, or in other words, the value of a unit of real hash rate. They introduce the relationship:
\[ 0 = -(r+\delta)U + (-\delta K + \lambda U)\,\partial_K U + (K+\epsilon)^{-1} - c. \]
This Partial Differential Equation (PDE) should be viewed as the master equation of the competitive mean field game with finitely many states.
The players are the miners, and we should think of \(K\) as a measure of an aggregate of the population of miners responsible for the mean field interaction.

If we now assume that the reward is of the form \(g(P_t)/(K_t+\epsilon)\), where \(g\) is a smooth positive function of \(P_t\), and that \((P_t,K_t)\) evolves according to
\[ \begin{cases} dP_t = \alpha(P_t)\,dt + \sqrt{2\nu}\,dW_t, & P_0 = P,\\ dK_t = -\delta K_t\,dt + \lambda U_t(K_t,P_t)\,dt, & K_0 = K, \end{cases} \]
where the function \(\alpha\) is Lipschitz, \(\nu > 0\) and \(W = (W_t)_{t\ge 0}\) is a standard Wiener process, then
\[ U(K,P) = \mathbb{E}\bigg[\int_0^\infty e^{-(r+\delta)t}\Big(\frac{g(P_t)}{\epsilon + K_t} - c\Big)dt\bigg] \]
is the value function of one unit of real hash rate. In this model, \(P_t\) captures the exchange rate between the value of the cryptocurrency and fiat money, and \(W = (W_t)_{t\ge 0}\) is a form of common noise. In this case the master equation on \([0,\infty)\times\mathbb{R}\) reads:
\[ 0 = -(r+\delta)U + (-\delta K + \lambda U)\,\partial_K U + \alpha\,\partial_P U + \nu\,\partial_{PP} U + \frac{g(P)}{K+\epsilon} - c. \]
This is the master equation of an MFG with finite state space and common noise. Again, the monotone structure of the MFG plays a key role in the well-posedness of these models. Indeed, existence and uniqueness follow from monotonicity, and the existence of a stationary state is also proven and analyzed. This is proven in [ ] when \(g\) is bounded from above and below.

[ ] also discusses model security against attacks, and proposes extensions to several competing populations of miners facing different mining costs, and to a market where mining equipment can be bought and sold.
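To make the fixed point over the average hash rate in the model of Li, Reppen and Sircar more concrete, here is a minimal Python sketch under a strong simplifying assumption that is not in the papers reviewed above: the miner is taken to be risk neutral and myopic, maximizing the instantaneous expected profit rate \(P\lambda - c\alpha\) with \(\lambda = \alpha D/(\alpha + M\bar\alpha)\). The function names and the numerical values are hypothetical and chosen for illustration only.

```python
import math


def reward_intensity(alpha, alpha_bar, M, D):
    """Jump intensity alpha * D / (alpha + M * alpha_bar) of one miner's rewards."""
    total = alpha + M * alpha_bar
    return 0.0 if total == 0.0 else alpha * D / total


def best_response(alpha_bar, M, D, P, c):
    """Hash rate maximizing P * intensity - c * alpha (risk-neutral assumption).

    The first-order condition P*D*M*alpha_bar / (alpha + M*alpha_bar)**2 = c
    is solved for alpha, and the result is clipped at zero.
    """
    candidate = math.sqrt(P * D * M * alpha_bar / c) - M * alpha_bar
    return max(candidate, 0.0)


def symmetric_equilibrium(M, D, P, c):
    """Common hash rate solving alpha = best_response(alpha)."""
    return P * D * M / (c * (M + 1) ** 2)


# Hypothetical parameters: 100 miners, unit total reward intensity.
M, D, P, c = 99, 1.0, 10.0, 0.5
alpha_star = symmetric_equilibrium(M, D, P, c)
print(alpha_star, best_response(alpha_star, M, D, P, c))
```

In this simplified setting the equilibrium hash rate decreases with the electricity cost \(c\) and increases with the price \(P\), and the individual intensities add up to \(D\). The risk-averse, dynamic problem studied in the paper requires solving the HJB equation instead.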
3. Games Models for Energy and the Environment
In this section we review some of the MFG models which have been touted and used to revisit and extend earlier economic analyses of energy and environment markets. We first summarize the discussion given in [ , Section 1.4.4] of a first model proposed by Guéant, Lasry and Lions in [ ].

If we denote by \(x^1_0,\ldots,x^N_0\) the initial reserves of \(N\) oil producers who control their own rates of production, and if we denote by \(X^i_t\) the oil reserves of producer \(i\) at time \(t\), the changes in reserves should be given by equations of the form
\[ dX^i_t = -\alpha^i_t\,dt + \sigma X^i_t\,dW^i_t \tag{3.1} \]
where \(\sigma > 0\) is a volatility level common to all the producers, the non-negative adapted and square integrable processes \(\alpha^i = (\alpha^i_t)_{t\ge 0}\) being the controls exerted by the producers, and the \(W^i = (W^i_t)_{t\ge 0}\) independent standard Wiener processes. If we denote by \(P_t\) the price of one barrel of oil at time \(t\), and if we denote by \(C(\alpha) = \frac{b}{2}\alpha^2 + a\alpha\) the cost of producing \(\alpha\) barrels of oil, then producer \(i\) tries to maximize:
\[ J^i(\alpha^1,\ldots,\alpha^N) = \sup_{(\alpha_t)_{t\ge 0},\,\alpha_t\ge 0} \mathbb{E}\bigg[\int_0^\infty e^{-rt}\big[\alpha^i_t P_t - C(\alpha^i_t)\big]\,dt\bigg], \tag{3.2} \]
where \(r > 0\) is a discount factor. As we are about to explain, the price \(P_t\) is the source of coupling between the producers' strategies. The notion of general equilibrium is intended to characterize situations in which all the producers manage to maximize their profits simultaneously, and the market clears in the sense that demand matches supply. Let us denote by \(D(t,p)\) the demand at time \(t\) when the price is \(p\). The function \(D(t,p) = w\,e^{\rho t}p^{-\gamma}\) was used in [ ]. We use the obvious notation \(D^{-1}\) for the inverse demand function, i.e. \(q = D(t,p) \iff p = D^{-1}(t,q)\).

Mean Field Formulation.
In the present context, the MFG paradigm can be articulated easily. For each fixed deterministic flow \((\mu_t)_{t\ge 0}\) of probability measures, we compute the price \(P_t\) from the formula:
\[ P_t = D^{-1}\Big(t,\,-\frac{d}{dt}\int x\,\mu_t(dx)\Big), \]
and the best response to this flow of distributions is given by the solution of the discounted infinite horizon optimal control problem for the instantaneous cost function
\[ f(t,x,\mu,\alpha) = [\alpha p - C(\alpha)]\,e^{-rt} \]
under the dynamic constraint (3.1). The MFG will be solved if one can find a measure flow \((\mu_t)_{t\ge 0}\) such that the marginal distribution \(\mathcal{L}(X_t)\) of the solution of the control problem matches the flow we started from, namely \(\mu_t = \mathcal{L}(X_t)\) for all \(t\ge 0\). The analytic approach to MFGs based on the solution of a system of coupled HJB and Fokker-Planck-Kolmogorov equations is used in [ ] to give a solution to this problem. Numerical illustrations providing comparative statics of the solutions are also given.

Variations and Extensions. In [ ], Chan and Sircar propose to look at dynamics
\[ dX_s = -\alpha_s\,ds + \sigma\,dW_s, \qquad X_t = x > 0, \tag{3.3} \]
with a Dirichlet boundary condition at \(X_t = 0\) to guarantee that the reserves of a generic oil producer do not become negative. As before, \(\alpha_t\) represents the rate of production of a generic producer and \(X_t\) represents the remaining reserves. As in most models for Cournot games, the price experienced by the producer, call it \(P_t\) for the sake of definiteness, is given by a linear inverse demand function of the rates of production, and it is chosen to be of the form
\[ P_t = 1 - \alpha_t - \epsilon\,\bar\alpha_t \]
where \(\bar\alpha_t\) is the mean production rate for all the exhaustible resources, so that the cost function becomes
\[ f(t,x,\mu,\alpha) = \alpha p = \alpha\,[1 - \alpha - \epsilon\,\bar\alpha] \]
where \(\mu\) denotes the distribution of the control \(\alpha\) giving the rate of production, \(\bar\alpha\) the mean of this distribution, and \(p\) the price given by the above inverse demand function.
This is a typical extended MFG (because the mean field interaction is through the controls) with a boundary condition to guarantee that the remaining reserves do not become negative. In parallel, the authors propose a slightly modified model for producers of renewable energy and analyze the oil market in the presence of both populations of producers. In this paper, they also propose several variations on the above model. Their goal was to include several realistic features like the strategic blockading of the entrance of renewable producers, and the exploration and discovery of new reserves. While not always worrying about all the subtleties of mathematical existence theorems, they provide enlightening numerical illustrations of their conclusions. This prompted more mathematically inclined authors like Bensoussan and Graber to pursue in [ ] a complete existence analysis, based on partial differential equation techniques, of the models proposed by Sircar and Chan.

For the sake of completeness, we mention that plain models for a macro perspective on the behavior of a large population of oil producers were proposed by Giraud, Guéant, Lasry, and Lions. See for example [49, 53, 54]. More recently, Achdou, Giraud, Lasry and Lions revisited some of these models, including the presence of major and minor players. See [ ]. Also, note that game theoretical approaches, though not involving mean field games per se, were used by Ludkovski and Sircar in [ ] to analyze oil production.

General equilibrium models have a long history in the engineering literature on electricity pricing. See for example [ ] and the references therein. More recently, Djehiche, Barreiro-Gomez and Tembine proposed a mean field game model for pricing electricity in a smart grid. See [ ]. Still, to model individual decisions in a smart grid, Alasseur, Ben Tahar and Matoussi use in [ ] a game with mean field interactions through the controls as a framework to manage storage. In [ ], Aïd, Dumitrescu, and Tankov use one of the mean field game of timing models reviewed earlier to capture the time at which renewable producers choose to enter a new market, and when conventional producers using fossil fuels should exit the market. In a different context, Aïd, Basei and Pham investigate in [ ] a Stackelberg game model where the leader (an electricity producing firm) and the follower (a consumer) choose strategies possibly depending upon their distributions. So they solve optimal control problems of the McKean-Vlasov type. The main emphasis of the paper is to show that the Stackelberg equilibrium is not Pareto optimal, and to explain the economic consequences of this disparity.

Investigating the valuation of demand response contracts in a model with a continuum of consumers with mean field interactions and the presence of a common noise impacting their consumptions, Elie, Hubert, Mastrolia, and Possamaï formulate in [ ] the problem as a contract theory problem with moral hazard, like those we shall discuss in more detail in Section 6 below.
In their model, the Principal is an electricity producer who continuously observes the consumption of a continuum of risk averse consumers, and designs contracts in order to reduce the production costs. To be more specific, the producer incentivizes the consumers to reduce their average consumptions as well as their volatilities in different regimes, without observing the efforts they potentially make. This is exactly the type of model we shall investigate in Section 6.

The recent paper by Shrivats, Firoozi, and Jaimungal [ ], still in the context of the electricity markets, offers a smooth transition to the next discussion of the environment markets. Indeed, it proposes MFG models to derive the optimal behavior of electricity producers and an equilibrium price for Solar Renewable Energy Certificates (SRECs) in market-based systems designed to incentivize solar energy generation.

Early general equilibrium models aimed at understanding the effects and the control of externalities and taxes (in the spirit of Tobin taxes) were proposed by Golosov, Hassler, Krusell, and Tsyvinski in [ ]. General equilibrium models were also used in early works on the emissions markets by Bueler [ ] and Haurie [ ], and more significantly in the analysis of the European Union Emission Trading System (EU ETS) by Carmona, Fehr, Hinz and Porchet in [ ]. See also the references therein. We argue later on in Section 4 that many general equilibrium models can be recast as Mean Field Game models. More recently, ideas which first appeared in the treatment of MFGs were used by Bahn, Haurie and Malhamé to model negotiations related to environment policies.
See [ ].

Finally, we mention the recent work of Carmona, Dayanikli and Laurière [ ], who use MFG models with major and minor players, very much in the spirit of the contract theory models we review in Section 6, to derive equilibrium analyses of externalities and regulation on the one hand, and investments in renewables on the other, when dealing with electricity production.
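As an illustration of a mean field interaction through the controls, the linear inverse demand \(P = 1 - \alpha - \epsilon\bar\alpha\) of the Chan and Sircar model can be explored in a deliberately simplified static setting. The sketch below is an assumption-laden toy and not the model of the paper: it ignores the reserves dynamics and the Dirichlet boundary condition, so that each producer simply maximizes the instantaneous revenue \(\alpha(1 - \alpha - \epsilon\bar\alpha)\). The function names are hypothetical.

```python
def best_response(alpha_bar, eps):
    """Maximizer over alpha >= 0 of alpha * (1 - alpha - eps * alpha_bar)."""
    return max((1.0 - eps * alpha_bar) / 2.0, 0.0)


def mean_field_cournot(eps, alpha_bar=0.0, tol=1e-12, max_iter=1000):
    """Iterate alpha_bar -> best_response(alpha_bar) until it stabilizes.

    For 0 <= eps < 2 the map is a contraction with factor eps / 2, and the
    limit is the static mean field equilibrium 1 / (2 + eps).
    """
    for _ in range(max_iter):
        new_alpha_bar = best_response(alpha_bar, eps)
        if abs(new_alpha_bar - alpha_bar) < tol:
            return new_alpha_bar
        alpha_bar = new_alpha_bar
    raise RuntimeError("fixed-point iteration did not converge")


eps = 0.5  # hypothetical strength of the interaction
alpha_eq = mean_field_cournot(eps)
price_eq = 1.0 - alpha_eq - eps * alpha_eq
print(alpha_eq, price_eq)
```

The loop mirrors the two steps of the MFG paradigm: freeze the mean field \(\bar\alpha\), compute the best response, then require that the best response reproduces the mean field.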
4. Macro-Economic Growth Models
In this section, we review several general equilibrium economic growth models. We borrowed the first one from a paper by Guéant, Lasry and Lions on mean field games [ ]. We chose to present it here because, by cleverly adapting ideas from a model of Aghion and Howitt, the authors present a model with common noise which can be solved explicitly, all the way to the master equation.

Our choice of the second model was driven by a remarkable property: even though the original contribution [ ] of Krusell and Smith appeared long before the mean field game paradigm was articulated, the numerical algorithm proposed by the authors to approximate the equilibrium characteristics reads as if it had been designed for the computation of an MFG equilibrium. Indeed, it is eerie to see how closely the description of their numerical algorithm mimics, step by step, the mean field game strategy based on the alternate iteration of steps to approach the solution of the HJB equation and the Fokker-Planck-Kolmogorov equation.

We learned of the third example presented in this section from a private conversation with Benjamin Moll. We include it in this review because it can be solved completely, both analytically and numerically. Unfortunately, such examples are few and far between.

The interested reader may also want to consult [ ] by Achdou, Han, Lasry, Lions and Moll for another discussion of continuous time macro-economic models recast as MFGs.

We introduce directly the mean field formulation of the game without starting from the definition of the finite player game because the interaction between the players is local, in the sense that it is a function of the density of the statistical distribution of the states of the players. In the case of finitely many players, this distribution is the empirical distribution of the finitely many states and as such, it does not have a density per se.
So in order to avoid the introduction of smoothing of the empirical measures to define the costs to the players, we jump directly to the mean field game formulation, which can be done directly with densities without any need for mollification arguments.

In this model, there are no idiosyncratic shocks, just aggregate shocks common to all the players. They are given by the increments of a one dimensional Wiener process \(W = (W_t)_{t\ge 0}\). We denote by \(\mathbb{F} = (\mathcal{F}_t)_{t\ge 0}\) its filtration. We also assume that the volatility of the state of a generic player is linear, that is \(\sigma(x) = \sigma x\) for some positive constant \(\sigma\), and that each player controls the drift of their state, so that the dynamics of their state read:
\[ dX_t = \alpha_t\,dt + \sigma X_t\,dW_t. \tag{4.1} \]
We shall restrict ourselves to Markovian controls of the form \(\alpha_t = \alpha(t,X_t)\) for a deterministic function \((t,x)\mapsto\alpha(t,x)\), which will be assumed to be non-negative and Lipschitz in the variable \(x\). Under these conditions, \(X_t \ge 0\) at all times \(t > 0\) if \(X_0 \ge 0\). Note that if \(X_t\) and \(\tilde X_t\) are solutions of (4.1) for the same linear control \(\alpha(t,x) = \gamma_t x\) for some continuous path \([0,T]\ni t\mapsto\gamma_t\in[0,+\infty)\), with initial conditions \(X_0 \le \tilde X_0\), then
\[ \tilde X_t = X_t + (\tilde X_0 - X_0)\,e^{\int_0^t\gamma_s\,ds - (\sigma^2/2)t + \sigma W_t}. \tag{4.2} \]
We assume that \(k > 0\) is a fixed parameter, and we introduce a special notation for the family of one-sided scaled Pareto distributions with decay parameter \(k\). For any real number \(q > 0\), we denote by \(\mu^{(q)}\) the one-sided Pareto distribution on the interval \([q,\infty)\):
\[ \mu^{(q)}(dx) = k\,\frac{q^k}{x^{k+1}}\,\mathbf{1}_{[q,\infty)}(x)\,dx. \tag{4.3} \]
Notice that for any random variable \(X\), \(X\sim\mu^{(1)}\) is equivalent to \(qX\sim\mu^{(q)}\).

For each \(t\ge 0\) we define \(\mu_t(dx) = \mathbb{P}[X_t\in dx\,|\,\mathcal{F}_t]\). The flow \((\mu_t)_{t\ge 0}\) of probability measures is adapted to the filtration \(\mathbb{F}\) of the common noise.
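The scaling property of the Pareto family (4.3) and the exponential factor appearing in (4.2) are easy to check numerically. The following Python sketch assumes NumPy is available; the helper names, the time grid, and the parameter values are hypothetical choices made for illustration, and the integral of \(\gamma\) is approximated by a left-point Riemann sum.

```python
import numpy as np


def pareto_sample(q, k, uniforms):
    """Inverse-transform samples of mu^(q), i.e. P[X > x] = (q / x)**k for x >= q.

    `uniforms` must take values in (0, 1].
    """
    return q * uniforms ** (-1.0 / k)


def growth_factor(gamma, sigma, dt, dW):
    """exp(int_0^t gamma_s ds - sigma^2 t / 2 + sigma W_t) along a time grid.

    `gamma` holds the values of gamma on the grid and `dW` the Wiener increments.
    """
    t = dt * np.arange(1, len(dW) + 1)
    exponent = np.cumsum(gamma * dt) - 0.5 * sigma**2 * t + sigma * np.cumsum(dW)
    return np.exp(exponent)


rng = np.random.default_rng(0)
uniforms = 1.0 - rng.random(5)            # uniforms in (0, 1]
dt, n_steps = 0.01, 100
gamma = np.full(n_steps, 0.3)             # constant linear feedback gamma_t
dW = rng.normal(0.0, np.sqrt(dt), n_steps)

factor = growth_factor(gamma, sigma=0.2, dt=dt, dW=dW)
x0 = pareto_sample(1.0, 2.0, uniforms)    # initial states drawn from mu^(1)
x_t = x0 * factor[-1]                     # states under the control gamma_t * x
print(x_t.min() >= factor[-1])
```

Because the dynamics (4.1) are linear in the state under the control \(\gamma_t x\), every initial state is multiplied by the same random factor, so a sample from \(\mu^{(1)}\) is mapped to a sample from the Pareto distribution whose left endpoint is that factor.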
Recall that the MFG paradigm in the presence of a common noise is to solve, for each fixed \(\mathbb{F}\)-adapted flow of probability measures \((\mu_t)_{t\ge 0}\), the optimization problem of a generic player, and then solve the fixed point problem to guarantee that the flow \((\mu_t)_{t\ge 0}\) we started from is in fact the flow of conditional marginal laws of the solution of the optimization problem.

For this particular family of distributions, if \(\mu_0 = \mu^{(1)}\), then \(\mu_t = \mu^{(q_t)}\) where \(q_t = e^{\int_0^t\gamma_s\,ds - (\sigma^2/2)t + \sigma W_t}\). In other words, conditioned on the history of the common noise, the distribution of the states of the players remains Pareto with parameter \(k\) if it starts that way, and the left-hand point \(q_t\) of the distribution can be understood as a sufficient statistic characterizing the distribution \(\mu_t\). So if \(X_0\sim\mu^{(1)}\), then \(\mu_t = \mu^{(q_t)}\). This simple remark provides an explicit formula for the time evolution of the (conditional) marginal distributions of the states given the common noise. In general MFGs with a common noise, this time evolution is difficult to determine as it requires the solution of a forward Stochastic Partial Differential Equation (SPDE for short).

Using the same notation as in [ ], we define the running cost function \(f\) by
\[ f(x,\mu,\alpha) = c\,\frac{x^a}{[(d\mu/dx)(x)]^b} - \frac{E}{p}\,\frac{\alpha^p}{[\mu([x,\infty))]^b}, \]
for positive constants \(a\), \(b\), \(c\), \(E\) and \(p > 1\). The economic rationale for the form of this cost function and the meanings of the parameters are discussed in [ ]. By convention, the density appearing in this formula is the density of the absolutely continuous part of the Lebesgue decomposition of the measure \(\mu\), and it is set to \(0\) when the measure is singular. The argument of the optimization of the Hamiltonian is given by
\[ \hat\alpha(x,\mu,y) = \Big(\frac{y}{E}\,\big[\mu([x,\infty))\big]^b\Big)^{1/(p-1)}. \]
This formula can be used to write the master equation which, when restricted to one-sided Pareto distributions, can be reduced to a finite dimensional PDE because of the above remark. Accordingly, Nash equilibria can be identified in this family of Pareto distributions. The details, far too technical for this review, can be found in Section 4.5.2 of [ ].

One major difference with the growth model discussed in the previous subsection is the fact that, on top of the common noise affecting all the states, we also have idiosyncratic random shocks specific to each individual agent in the economy. In [ ] the shocks take only finitely many values. We suspect that this restrictive assumption was made for the purpose of numerical implementation. In the next subsection, we change the nature of the random shocks by introducing Wiener processes to recast the model in the framework of stochastic differential games.

Description of the Economy. While economists usually work with models comprising a continuum of players (this is indeed the case in [ ]), in order to avoid the discussion of measurability issues related to continuum families of independent random variables, we first discuss the model of an economy comprising \(N\) consumers. The random shocks are given by a set of \(N\) continuous time Markov chains \((z_t,\eta^i_t)_{t\ge 0}\) for \(i = 1,\ldots,N\). The common component \(z\) captures the health of the overall economy, like an aggregate productivity measure, so for some constant \(\Delta_z \ge 0\), \(z_t = 1+\Delta_z\) in good times, and \(z_t = 1-\Delta_z\) in bad times. The idiosyncratic component \(\eta\) is specific to the consumer: \(\eta^i_t = 1\) when consumer \(i\) is employed, and \(\eta^i_t = 0\) whenever they are unemployed. The case \(\Delta_z = 0\) corresponds to the absence of common noise.

The production technology is modeled by a Cobb-Douglas production function in the sense that the per-capita output is given by
\[ Y_t = z_t\,K_t^{\alpha}\,(\ell L_t)^{1-\alpha} \tag{4.4} \]
where \(K_t\) and \(L_t\) stand for per-capita capital and employment rates respectively.
The constant \(\ell\) can be interpreted as the number of units of labor produced by an employed individual. The power \(\alpha\in(0,1)\) is a constant of the model. In such a model, two quantities play an important role: the capital rent \(r_t\) and the wage rate \(w_t\). Economic theory says that in equilibrium, these marginal rates are defined as the partial derivatives of the per-capita output \(Y_t\) with respect to capital and employment rate respectively. So
\[ r_t = r(K_t,L_t,z_t) = \alpha\,z_t\Big(\frac{K_t}{L_t}\Big)^{\alpha-1} \tag{4.5} \]
and
\[ w_t = w(K_t,L_t,z_t) = (1-\alpha)\,z_t\Big(\frac{K_t}{L_t}\Big)^{\alpha}. \tag{4.6} \]

Consumer's Optimization Problem. Consumers control their capital consumption rate \(c^i_t\) at time \(t\), and maximize their expected utilities of consumption
\[ \mathbb{E}\bigg[\int_0^\infty e^{-\rho t}\,u(c^i_t)\,dt\bigg], \]
for some discount factor \(\rho > 0\). We use the power utility function
\[ u(c) = \frac{c^{1-\gamma}-1}{1-\gamma} \tag{4.7} \]
for some \(\gamma\in(0,1)\), also known as CRRA (short for Constant Relative Risk Aversion) utility function.

Consumers must choose their consumptions while making sure that their individual capitals \(k^i_t\) remain non-negative at all times. The individual capitals evolve according to the equation
\[ dk^i_t = \big[(r_t-\delta)\,k^i_t + [(1-\tau_t)\,\ell\,\eta^i_t + \mu(1-\eta^i_t)]\,w_t\big]\,dt - c^i_t\,dt. \]
Here, the constant \(\delta > 0\) represents a depreciation rate. The second term in the above right hand side represents the wages earned by the consumer. It is equal to \(\mu w_t\) when the consumer is unemployed, a quantity which should be understood as an unemployment benefit rate. On the other hand, it is equal to \((1-\tau_t)\,\ell\,w_t\), after adjustment for taxes, when they are employed. Here \(\tau_t = \mu u_t/(\ell L_t)\) where \(u_t = 1 - L_t\) is the unemployment rate.

MFG Formulation.
By de Finetti's law of large numbers, we expect that the empirical measures \(\mu^{k,N}_t\) of capital and \(\mu^{\eta,N}_t\) of labor:
\[ \mu^{k,N}_t = \frac{1}{N}\sum_{i=1}^N \delta_{k^i_t}, \qquad\text{and}\qquad \mu^{\eta,N}_t = \frac{1}{N}\sum_{i=1}^N \delta_{\eta^i_t}, \]
converge as \(N\to\infty\) toward limits which we denote by \(\mu^{k,z}_t\) and \(\mu^{\eta,z}_t\). These limits give the conditional distributions of capital and labor \(k_t\) and \(\eta_t\) given the state \(z_t = z\) of the economy. Since \(z_t\) can only take the two values \(1-\Delta_z\) and \(1+\Delta_z\), we only need the knowledge of deterministic flows of measures \((\mu^{k,d}_t)_{t\ge 0}\), \((\mu^{k,u}_t)_{t\ge 0}\), \((\mu^{\eta,d}_t)_{t\ge 0}\), and \((\mu^{\eta,u}_t)_{t\ge 0}\) corresponding to the two values of \(z\), say down and up, namely \(d = 1-\Delta_z\) and \(u = 1+\Delta_z\).

Once the flows of conditional measures are known, the computation of the best response of a representative agent reduces to the solution of the optimal control problem
\[ \max_{c}\ \mathbb{E}\bigg[\int_0^\infty e^{-\rho t}\,u(c_t)\,dt\bigg] \]
under the constraints \(k_t \ge 0\) and
\[ dk_t = \big[(r(K_t,L_t,z_t)-\delta)\,k_t + [(1-\tau_t)\,\ell\,\eta_t + \mu(1-\eta_t)]\,w(K_t,L_t,z_t)\big]\,dt - c_t\,dt. \]
Here, \((z_t,\eta_t)_{t\ge 0}\) is a continuous time Markov chain with the same law as any of the \((z_t,\eta^i_t)_{t\ge 0}\) introduced earlier, the rental rate function \(r\) and the wage level function \(w\) are as in (4.5) and (4.6), and \(K_t = k^{z_t}_t\) is the mean of the conditional measure \(\mu^{k,z_t}_t\), namely
\[ K_t = \int_{[0,\infty)} k\,\mu^{k,u}_t(dk) \ \text{ if } z_t = 1+\Delta_z, \qquad\text{and}\qquad K_t = \int_{[0,\infty)} k\,\mu^{k,d}_t(dk) \ \text{ if } z_t = 1-\Delta_z, \]
and where the aggregate labor \(L_t\) is defined similarly as the conditional mean of \(\mu^{\eta,z}_t\).
The aggregates \(K_t\) and \(L_t\) are the conditional means of the capital and labor given the common noise: they carry the mean field interactions in the model.

As we recalled during our discussion of the previous example, the MFG paradigm in the presence of a common noise is to solve, for each fixed flow of probability measures adapted to the filtration of the common noise, the optimization problem of a generic player, and then solve the fixed point problem to guarantee that the flow we started from is in fact the flow of conditional marginal laws of the solution of the optimization problem. Note that in the Krusell-Smith model, the common noise and the idiosyncratic noise are correlated, and that the labor state variable \(\eta_t\) (whose aggregate is \(L_t\)) is nothing but the idiosyncratic noise. So clearly, the MFG paradigm reduces to solving the individual optimization problem given the aggregate \(K_t\), and then solving for the fixed point. This is exactly what the numerical algorithm proposed in [ ] does. Time discretization is not needed in [ ] since the model is introduced in discrete time there. A form of dynamic programming is used to solve the optimization problem given the aggregate \(K_t\), then an update of \(K_t\) is done by Monte Carlo simulation, before going back to the solution of the optimization problem given the update of \(K_t\), and so on and so forth. While the authors realize that the entire distribution \(\mu^{k,z}_t\) should be updated, they argue that updating the mean is sufficient to get reasonable numerical results for a problem whose complexity should have been prohibitive.
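Each pass of such an iteration recomputes, from the current guess of the aggregates \(K_t\) and \(L_t\), the prices and the tax rate entering the budget constraint of the representative agent. The Python sketch below only illustrates this bookkeeping step, using formulas (4.5) and (4.6) and the tax rate \(\tau_t = \mu u_t/(\ell L_t)\) as written in the text; it is not an implementation of the Krusell-Smith algorithm, and the function names and parameter values are hypothetical.

```python
def factor_prices(K, L, z, alpha):
    """Capital rent and wage rate from (4.5) and (4.6)."""
    ratio = K / L
    r = alpha * z * ratio ** (alpha - 1.0)
    w = (1.0 - alpha) * z * ratio ** alpha
    return r, w


def tax_rate(L, mu, ell):
    """Tax rate tau = mu * u / (ell * L), with unemployment rate u = 1 - L."""
    return mu * (1.0 - L) / (ell * L)


def capital_drift(k, c, eta, K, L, z, alpha, delta, mu, ell):
    """Drift of individual capital for employment status eta in {0, 1}."""
    r, w = factor_prices(K, L, z, alpha)
    tau = tax_rate(L, mu, ell)
    income = ((1.0 - tau) * ell * eta + mu * (1.0 - eta)) * w
    return (r - delta) * k + income - c


# Hypothetical calibration, for illustration only.
K, L, z, alpha, mu, ell = 35.0, 0.9, 1.01, 0.36, 0.15, 1.0 / 0.9
r, w = factor_prices(K, L, z, alpha)
tau = tax_rate(L, mu, ell)
taxes_collected = tau * ell * L * w
benefits_paid = mu * (1.0 - L) * w
print(taxes_collected, benefits_paid)
```

With this tax rate the taxes collected from the employed fraction \(L_t\) equal the benefits paid to the unemployed fraction \(u_t\), so the transfer scheme is balanced at every time.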
As explained in the introduction, even though they never used the term Nash equilibrium, their numerical search for a recursive competitive equilibrium is exactly the algorithm based on the iteration of the numerical approximation of the solution of the HJB equation followed by the Fokker-Planck-Kolmogorov equation, an algorithm (re)introduced and used over 15 years later for the numerical solution of Mean Field Games.

As explained in the introduction to this section, we learned about the model presented in this subsection from a private conversation with Benjamin Moll. It is one of the models discussed in the review [ ] by Achdou, Buera, Lasry, Lions and Moll devoted to partial differential equation models in macroeconomics. As far as we know, the first, and most likely the only, complete mathematical solution of this model as a mean field game can be found in Chapter 3 of [ ].

We first describe the finite player form of the model. We shall solve it as a mean field game model later on. The \(N\) agents \(i\in\{1,\ldots,N\}\) are the workers comprising the economy. The private state at time \(t\) of agent \(i\) is a two-dimensional vector \(X^i_t = (Z^i_t,A^i_t)\). For the purpose of this model, \(A^i_t\) is the wealth at time \(t\) of worker \(i\), and \(Z^i_t\) their labor productivity. The time evolutions of the states are given by stochastic differential equations
\[ \begin{cases} dZ^i_t = \mu_Z(Z^i_t)\,dt + \sigma_Z(Z^i_t)\,dW^i_t\\ dA^i_t = [w^i_t Z^i_t + r_t A^i_t - c^i_t]\,dt. \end{cases} \tag{4.8} \]
The functions \(\mu_Z,\sigma_Z:\mathbb{R}\to\mathbb{R}\) are known. We shall specify them later on in the examples we treat theoretically and numerically. The random shocks are given by \(N\) independent Wiener processes \(W^i = (W^i_t)_{t\ge 0}\), for \(i = 1,\ldots,N\). Here \(r_t\) is the interest rate at time \(t\), \(w^i_t\) represents the wages of worker \(i\) at time \(t\), and the consumption process \(c^i = (c^i_t)_{t\ge 0}\) is the control of player \(i\).

REMARK \(A^i_t \ge a\) for some nonpositive constant \(a \le 0\).
Moreover, the labor productivity processes \(Z^i = (Z^i_t)_{t\ge 0}\) are also restricted by requiring that they are ergodic, or even restricted to an interval \([\underline z,\overline z]\) for some finite constants \(0\le\underline z<\overline z<\infty\). We do not know of an economic rationale for these constraints and we suspect that these assumptions are made for the sole benefit of the technical proofs.

In this model, given adapted processes \(r = (r_t)_{t\ge 0}\) and \(w^i = (w^i_t)_{t\ge 0}\) for \(i = 1,\ldots,N\), the workers choose their consumptions \(c^1,\ldots,c^N\) in order to maximize their expected discounted utilities:
\[ J^i(c^1,\ldots,c^N) = \mathbb{E}\int_0^\infty e^{-\rho t}\,u(c^i_t)\,dt. \tag{4.9} \]
As usual in economic applications, the model is set up in infinite horizon, and \(u\) is an increasing concave utility function, the same for all the workers. So far, it seems like the workers do not interact. Also, we need to explain how the interest rate and the wage processes appear in equilibrium. As in the Krusell-Smith model discussed earlier, we assume that the aggregate production in the economy is given by a production function \(Y = F(K,L)\), the total capital supplied in the economy at time \(t\), say \(K_t\), being given by the aggregate wealth
\[ K_t = \int a\,d\mu^N_{X_t}(dz,da) = \frac{1}{N}\sum_{i=1}^N A^i_t \tag{4.10} \]
while the total amount of labor \(L_t\) supplied in the economy at time \(t\) is normalized to \(1\). Here, we denote by \(\mu^N_{X_t}\) the empirical measure of the sample \(X^1_t,\ldots,X^N_t\). Note that only the \(A\)-marginal enters the definition of \(K_t\).

REMARK
\[ \begin{cases} r_t = [\partial_K F](K_t,L_t)\big|_{L_t=1} - \delta\\ w_t = [\partial_L F](K_t,L_t)\big|_{L_t=1} \end{cases} \]
where \(\delta\ge 0\) is the rate of capital depreciation. So in equilibrium, the interaction between the agents in the economy is through the mean \(K_t = \int a\,\mu^N_{X_t}(dz,da)\) of the empirical distribution of the workers' wealths \(A^i_t\).

Practical Solution.
We now specify the model further to solve it as a mean field game. We use the CRRA isoelastic utility function with constant relative risk aversion introduced above in (4.7). Note that
\[ u'(c) = c^{-\gamma} \qquad\text{and}\qquad (u')^{-1}(y) = y^{-1/\gamma}. \tag{4.11} \]
Next, we use the Cobb-Douglas production function
\[ F(K,L) = A\,K^{\alpha}L^{1-\alpha} \tag{4.12} \]
for some constants \(A > 0\) and \(\alpha\in(0,1)\). With this choice, in equilibrium,
\[ r_t = \alpha A\,K_t^{\alpha-1}L_t^{1-\alpha} - \delta \qquad\text{and}\qquad w_t = (1-\alpha)A\,K_t^{\alpha}L_t^{-\alpha} \]
and since we normalized the aggregate supply of labor to \(1\),
\[ r_t = \alpha A\,K_t^{\alpha-1} - \delta \qquad\text{and}\qquad w_t = (1-\alpha)A\,K_t^{\alpha}, \tag{4.13} \]
where \(K_t\) is given by (4.10) and provides the mean field interaction. Finally, we use an Ornstein-Uhlenbeck process for the mean reverting labor productivity process \(Z = (Z_t)_{t\ge 0}\) by choosing \(\mu_Z(z) = 1 - z\) and \(\sigma_Z\equiv 1\) for the sake of definiteness.

Moving to the mean field game formulation of the model, the state \(X_t = (A_t,Z_t)\) evolves according to
\[ \begin{cases} dZ_t = -(Z_t-1)\,dt + dW_t,\\ dA_t = \big[(1-\alpha)\bar\mu_t^{\alpha}Z_t + \big(\alpha\bar\mu_t^{\alpha-1}-\delta\big)A_t - c_t\big]\,dt, \end{cases} \qquad t\in[0,T], \]
where \((\bar\mu_t)_{0\le t\le T}\) denotes the flow of average wealths in the population in equilibrium. It is assumed to take (strictly) positive values. The set \(\mathbb{A}\) of admissible controls is the set \(\mathbb{H}^{2,1}\) of real valued square-integrable \(\mathbb{F}\)-adapted processes \(c = (c_t)_{0\le t\le T}\) with non-negative values, and the cost functional is defined by:
\[ J(c) = \mathbb{E}\bigg[\int_0^T(-u)(c_t)\,dt - \tilde u(A_T)\bigg], \]
for the CRRA utility function \(u\) given by (4.7) and \(\tilde u(a) = a\). Notice the additional minus signs due to the fact that we want to treat the optimization problem as a minimization problem. Here we chose to take \(0\) for the discount rate since we are working on a finite horizon. Throughout the analysis, we shall assume that \(A_0 > 0\) and \(Z_0 = 1\), so that \(\mathbb{E}[Z_t] = 1\) for any \(t\ge 0\).

In order to solve this MFG using the Pontryagin Maximum Principle, we introduce the Hamiltonian:
\[ H(t,z,a,\mu,y_z,y_a,c) = (1-z)\,y_z + \big(-c + (1-\alpha)\bar\mu^{\alpha}z + (\alpha\bar\mu^{\alpha-1}-\delta)\,a\big)\,y_a - u(c), \]
where \(\bar\mu = \int a\,d\mu(z,a)\) denotes the mean of the second marginal of the measure \(\mu\). The first adjoint equation reads
\[ dY_{z,t} = -\partial_z H(t,Z_t,A_t,Y_{z,t},Y_{a,t},c_t)\,dt + \tilde Z_{z,t}\,dW_t = Y_{z,t}\,dt + \tilde Z_{z,t}\,dW_t. \]
Its solution is Y z,t = 0 because its terminal condition is Y z,T = 0 . Since the variables z and y z do not play any role in the minimization of the Hamiltonian with respect to thecontrol variable c , we use the reduced Hamiltonian: H ( t, a, µ, y, c ) = (cid:0) − c + ( αµ α − − δ ) a (cid:1) y − u ( c ) , which is convex in ( a, c ) and strictly convex in c . The form (4.11) of the derivative of theutility function implies that the value of the control minimizing the Hamiltonian is ˆ c =( − u (cid:48) ) − ( y ) = ( − y ) − /γ . Therefore, the FBSDE derived from the Pontryagin stochasticmaximum principle reads(4.14) (cid:40) dA t = (cid:2) (1 − α ) µ αt Z t + [ αµ α − t − δ ] A t − ( − Y t ) − /γ (cid:3) dtdY t = − Y t [ αµ α − t − δ ] dt + Z (cid:48) t dW t , t ∈ [0 , T ] ; Y T = − , where we used the notation ( Z (cid:48) t ) ≤ t ≤ T to denote the integrand in the quadratic variationpart of the backward equation in order to distinguish it from the process ( Z t ) ≤ t ≤ T usedin the model as the first component of the state. Despite the fact that the utility functionhas a singularity at , it is not difficult to check that the proof of the sufficient part of thePontryagin principle goes through provided that the adjoint process ( Y t ) ≤ t ≤ T lives, withprobability , in a compact subset of ( −∞ , .We shall refrain from going through the gory details of the rest of the proof. We referthe interested reader to [ , Section 3.6.3]. The major insight is to notice that the backwardequation may be decoupled from the forward equation and that its solution is deterministicand is obtained by solving the backward ordinary differential equation: dY t = − Y t [ αµ α − t − δ ] dt, t ∈ [0 , T ] ; Y T = − . The remaining of the proof follows easily.
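To make the decoupling concrete, here is a minimal sketch of how the backward ordinary differential equation can be solved once a candidate flow of average wealths is given. It relies on the explicit solution $Y_t = -\exp\bigl(\int_t^T [\alpha \bar\mu_s^{\alpha-1} - \delta]\,ds\bigr)$, which follows from the equation and its terminal condition, and then evaluates $\hat c_t = (-Y_t)^{-1/\gamma}$. The function name, the time grid, the trapezoidal quadrature and the numerical values of $\alpha$, $\delta$, $\gamma$ and of the flow $(\bar\mu_t)$ are illustrative assumptions and are not taken from the text.

```python
import math

def adjoint_and_consumption(t_grid, mu_bar, alpha, delta, gamma):
    """Solve dY_t = -Y_t (alpha * mu_bar_t**(alpha - 1) - delta) dt with Y_T = -1.

    Uses the explicit solution Y_t = -exp(int_t^T rate_s ds), where the
    integral is approximated by the trapezoidal rule on t_grid.
    Returns the lists (Y, c_hat) with c_hat = (-Y)**(-1 / gamma).
    """
    rate = [alpha * m ** (alpha - 1.0) - delta for m in mu_bar]
    n = len(t_grid)
    tail = [0.0] * n  # tail[k] approximates the integral of rate over [t_k, T]
    for k in range(n - 2, -1, -1):
        dt = t_grid[k + 1] - t_grid[k]
        tail[k] = tail[k + 1] + 0.5 * (rate[k] + rate[k + 1]) * dt
    y = [-math.exp(s) for s in tail]
    c_hat = [(-v) ** (-1.0 / gamma) for v in y]
    return y, c_hat

# Hypothetical inputs: a constant flow of average wealths on [0, 1].
T, n_steps = 1.0, 100
t_grid = [T * k / n_steps for k in range(n_steps + 1)]
mu_bar = [1.0] * (n_steps + 1)
y, c_hat = adjoint_and_consumption(t_grid, mu_bar, alpha=0.3, delta=0.05, gamma=2.0)
```

In a full solution of the MFG, the flow $(\bar\mu_t)$ is not given but has to be matched with the mean of the optimally controlled wealth, and this sketch only covers the backward step for a fixed flow.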
5. From Macro to Finance
In this section, we review two recent works of M. Brunnermeier and Y. Sannikov [14, 15] in which the authors compare the historical evolutions of macro-economic and finance models, arguing that, properly framed, the analysis of continuous time stochastic models should provide a unifying thread for these sub-fields of economics which have so far developed in parallel. To make their point, the authors introduce models of the economy comprising households maximizing consumption as in classical macro-economic growth models, as well as experts trading in financial markets.

As explained in the introduction, those models lead to MFGs with a common noise. The importance of common shocks in macro-economics points to the need for a better mathematical understanding of MFGs with a common noise. The first model reviewed in this section fits in the class of MFGs with one population of individuals facing idiosyncratic noise terms as well as random shocks common to all. It was first introduced in [ ] in discrete time. While the second model does not have idiosyncratic noise terms, it involves two populations of agents. This gives us an opportunity to quickly review some of the features of MFGs with several populations, which are not discussed often enough in the mathematical literature on mean field games.

We consider a one-sector economy with a continuum of households with identical preferences (we shall use the logarithmic utility function $u(x) = \log x$) and different levels of wealth. We denote by $I$ the set of households. We choose $I = [0, 1]$ for the sake of definiteness. In this model, because each agent's influence on the economy is infinitesimal, we use a continuous probability measure $\lambda$ on $I$ to sample households. For practical purposes, we can think of $\lambda$ as the Lebesgue measure on $[0, 1]$.

REMARK 5.1. Using $[0, 1]$ equipped with its Lebesgue measure is a flagrant expediency.
Indeed, mathematically speaking, it does not pass the smell test: in order to manipulate a continuum of idiosyncratic shocks without having to face severe measurability issues, we would have to jump through several hoops, for example using rich Fubini extensions instead of the Lebesgue unit interval. See for example [ , Section 3.7] for a discussion of such a rigorous approach.

In this model, each household operates a firm and holds money. The capital stock of a generic household $h$ evolves according to the equation:
\[ \frac{dk^h_t}{k^h_t} = \bigl(\phi(\iota^h_t) - \delta\bigr)\,dt + \sigma^0\,dW^0_t + \sigma\,dW^h_t \tag{5.1} \]
where $\delta > 0$ is a depreciation rate, and the function $\phi$ reflects adjustment costs in capital stock. This function is assumed to satisfy $\phi(0) = 0$, $\phi'(0) = 1$, $\phi'(\cdot) > 0$ and $\phi''(\cdot) < 0$. Its concavity captures technological illiquidity. $\iota^h_t$ represents the investment rate of household $h$ in physical capital at time $t$. Essentially, it gives how many units of physical capital are used in order to produce new physical capital. $W^0 = (W^0_t)_{t \ge 0}$ and $W^h = (W^h_t)_{t \ge 0}$ are independent Wiener processes modeling random shocks. $dW^h_t$ represents an idiosyncratic shock specific to the household $h$, while $dW^0_t$ represents a shock common to all the households. We shall often call it the common noise. The volatilities $\sigma^0$ and $\sigma$ are positive constants.

Households hold money. We denote by $m^h_t$ the amount of money held at time $t$ by household $h$. They consume in the amount $c^h_t$. We denote by $\theta^h_t$ the fraction of the household wealth in money at time $t$. So the control of a household is the triple $(\iota^h_t, \theta^h_t, c^h_t)$. The goal of a household is to maximize its long-run discounted expected utility of consumption:
\[ J^h(\iota, \theta, c) = \mathbb{E}\Bigl[\int_0^\infty e^{-\rho t} u(c_t)\,dt\Bigr] \tag{5.2} \]
over the control strategies $(\iota, \theta, c) = (\iota_t, \theta_t, c_t)_{t \ge 0}$. The constant $\rho > 0$ is the discount rate.
We now derive the dynamic constraint under which this optimization is performed by each household. It is expressed in terms of the wealth $n^h_t$ of the household at time $t$.

We use capital letters to denote the aggregates (i.e. the empirical means) of each of the variables $k^h_t$, $m^h_t$ and $n^h_t$. In other words:
\[ K_t = \int_I k^h_t\,\lambda(dh), \qquad M_t = \int_I m^h_t\,\lambda(dh), \qquad N_t = \int_I n^h_t\,\lambda(dh). \tag{5.3} \]
We shall discover that in equilibrium, $(\iota^h_t)_{t \ge 0}$ is independent of the household and adapted to the filtration of the common noise (which is the case if $\iota^h_t$ depends only upon aggregate quantities at time $t$), so that all the households use the same investment strategy in physical capital. Anticipating this fact, we can integrate (5.1) over $h$ and find that
\[ \frac{dK_t}{K_t} = \bigl(\phi(\bar\iota_t) - \delta\bigr)\,dt + \sigma^0\,dW^0_t \tag{5.4} \]
which is a stochastic differential equation only driven by the common noise. The idiosyncratic shocks disappear because of a continuous form of the exact law of large numbers. See for example [ , Section 3.7]. We used the notation $\bar\iota_t$ to distinguish this aggregate investment rate in capital from the individual households' $\iota^h_t$.

We introduce two more constants: $q$ for the price of one unit of physical capital (so the real value of aggregate physical capital is $qK_t$), and $p$ for the real value of money normalized by the size of the economy as measured by $K_t$ (so $pK_t$ is the real value of total money supply). These could be Itô processes, say $(q_t)_{t \ge 0}$ and $(p_t)_{t \ge 0}$ driven by the common noise $W^0$, but for the sake of simplicity, we shall assume them to be deterministic constants for the purpose of this presentation. Given the definition of the constants $q$ and $p$, the total wealth in the economy is
\[ N_t = (p + q) K_t, \tag{5.5} \]
$qK_t$ representing the value of the physical capital and $pK_t$ the value of the nominal capital. We denote by $\vartheta$ the fraction of nominal wealth:
\[ \vartheta = \frac{p}{p + q}. \tag{5.6} \]
The quantity of money in the economy is controlled exogenously by a central bank. We assume that money supply follows the stochastic differential equation
\[ \frac{dM_t}{M_t} = \mu^M\,dt + \sigma^M\,dW^0_t \tag{5.7} \]
driven by the common noise.
Individual household optimization problem. We first derive the stochastic differential equation driving the dynamics of the wealth of a generic household, and then tackle the optimization of the expected utility of consumption by the Pontryagin stochastic maximum principle.

Changes in the wealth $n^h_t$ of household $h$ at time $t$ are the sums of three contributions. We have:
\[ dn^h_t = \theta^h_t n^h_t\,dr^M_t + (1 - \theta^h_t)\,n^h_t\,dr^{h,K}_t(\iota^h_t) - c^h_t\,dt \tag{5.8} \]
where $r^M_t$ denotes the rate of return on money, and $r^{h,K}_t(\iota^h_t)$ the rate of return on capital. If $\theta^h_t n^h_t$ is the amount the household holds in money, and if we denote by $p^m_t$ the value of one unit of money, namely
\[ p^m_t = \frac{pK_t}{M_t}, \tag{5.9} \]
then the return on this investment is
\[ dr^M_t = \frac{dp^m_t}{p^m_t} = \frac{d(K_t/M_t)}{K_t/M_t} \]
since we assume that $p$ is a constant. Using Itô's formula with (5.4) and (5.7) we get:
\[ dr^M_t = \Bigl[\phi(\bar\iota_t) - \delta - [\mu^M - \sigma^M(\sigma^M - \sigma^0)]\Bigr]\,dt + (\sigma^0 - \sigma^M)\,dW^0_t. \tag{5.10} \]
We now identify the time evolution of the rate of return on capital $r^{h,K}_t(\iota^h_t)$. It has three components: the return of investment in physical capital, the return on the household capital $qk^h_t$, and the seigniorage. Seigniorage is the amount of money which is transferred to money holders proportionally to their capital. Given the definition (5.9) of the value of one unit of money, we can easily understand the change in the seigniorage over a period $[t, t + \Delta t]$. It is given by:
\[ T_{t+\Delta t} - T_t = p^m_{t+\Delta t}(M_{t+\Delta t} - M_t) = p^m_t (M_{t+\Delta t} - M_t) + (p^m_{t+\Delta t} - p^m_t)(M_{t+\Delta t} - M_t). \]
So in continuous time, i.e. after taking the limit $\Delta t \searrow 0$:
\[ dT_t = p^m_t\,dM_t + d[p^m, M]_t = pK_t \Bigl[[\mu^M + (\sigma^0 - \sigma^M)\sigma^M]\,dt + \sigma^M\,dW^0_t\Bigr]. \tag{5.11} \]
Consequently:
\[ \begin{aligned} dr^{h,K}_t(\iota^h_t) &= \frac{a - \iota^h_t}{q}\,dt + \frac{d(qk^h_t)}{qk^h_t} + \frac{dT_t}{qK_t} \\ &= \Bigl[\frac{a - \iota^h_t}{q} + \phi(\iota^h_t) - \delta + \frac{p}{q}[\mu^M + (\sigma^0 - \sigma^M)\sigma^M]\Bigr]\,dt + \Bigl(\sigma^0 + \frac{p}{q}\sigma^M\Bigr)\,dW^0_t + \sigma\,dW^h_t. \end{aligned} \tag{5.12} \]
Putting together (5.8), (5.10) and (5.12) we get:
\[ \begin{aligned} dn^h_t &= \theta^h_t n^h_t\,dr^M_t + (1 - \theta^h_t)\,n^h_t\,dr^{h,K}_t(\iota^h_t) - c^h_t\,dt \\ &= \theta^h_t n^h_t \Bigl[\phi(\bar\iota_t) - \delta - [\mu^M + (\sigma^0 - \sigma^M)\sigma^M]\Bigr]\,dt + \theta^h_t n^h_t (\sigma^0 - \sigma^M)\,dW^0_t \\ &\quad + (1 - \theta^h_t)\,n^h_t \Bigl[\frac{a - \iota^h_t}{q} + \phi(\iota^h_t) - \delta + \frac{p}{q}[\mu^M + (\sigma^0 - \sigma^M)\sigma^M]\Bigr]\,dt \\ &\quad + (1 - \theta^h_t)\,n^h_t \Bigl(\sigma^0 + \frac{p}{q}\sigma^M\Bigr)\,dW^0_t + (1 - \theta^h_t)\,n^h_t\,\sigma\,dW^h_t - c^h_t\,dt \\ &= \Bigl(n^h_t \Bigl[\theta^h_t\bigl(\phi(\bar\iota_t) - \delta\bigr) + (1 - \theta^h_t)\Bigl[\frac{a - \iota^h_t}{q} + \phi(\iota^h_t) - \delta\Bigr] + [\mu^M + (\sigma^0 - \sigma^M)\sigma^M]\Bigl(\frac{p}{q} - \theta^h_t \frac{p+q}{q}\Bigr)\Bigr] - c^h_t\Bigr)\,dt \\ &\quad + n^h_t \Bigl[\sigma^0 + \Bigl(\frac{p}{q} - \theta^h_t \frac{p+q}{q}\Bigr)\sigma^M\Bigr]\,dW^0_t + (1 - \theta^h_t)\,n^h_t\,\sigma\,dW^h_t. \end{aligned} \tag{5.13} \]
The Hamiltonian of the optimization problem of a generic household reads:
\[ \begin{aligned} H(t, n, y, z^0, z, \iota, \theta, c) &= \Bigl(n \Bigl[\theta\bigl(\phi(\bar\iota_t) - \delta\bigr) + (1 - \theta)\Bigl[\frac{a - \iota}{q} + \phi(\iota) - \delta\Bigr] + [\mu^M + (\sigma^0 - \sigma^M)\sigma^M]\Bigl(\frac{p}{q} - \theta \frac{p+q}{q}\Bigr)\Bigr] - c\Bigr)\,y \\ &\quad + n \Bigl[\sigma^0 + \Bigl(\frac{p}{q} - \theta \frac{p+q}{q}\Bigr)\sigma^M\Bigr]\,z^0 + (1 - \theta)\,n\,\sigma\,z - e^{-\rho t} u(c) \end{aligned} \tag{5.14} \]
if we use the notations $y$, $z^0$ and $z$ for the adjoint variables (sometimes called co-states). The necessary part of the Pontryagin maximum principle suggests to minimize the Hamiltonian with respect to the control variables $\iota$, $\theta$ and $c$.

Moreover, since $(1 - \theta) \ge 0$ and the adjoint variable $y$ is restricted to be negative, we can isolate $\iota$: minimizing the Hamiltonian over $\iota$ amounts to maximizing $(a - \iota)/q + \phi(\iota)$, which gives Tobin's q equation:
\[ -\frac{1}{q} + \phi'(\iota) = 0 \quad \Leftrightarrow \quad \iota = (\phi')^{-1}\Bigl(\frac{1}{q}\Bigr), \]
and in the case of the function $\phi(\iota) = (1/\kappa)\log(1 + \kappa\iota)$ used in [ ], we get:
\[ \kappa\,\hat\iota_t = q - 1. \tag{5.15} \]
Notice that the optimal $\hat\iota_t$ is a constant independent of $t$. In general, if $q$ is an Itô process adapted to the filtration of the common noise, so is $\hat\iota$. But the fact to remember at this stage is that the optimal $\hat\iota_t$ is the same for all the households.
So from now on $\bar\iota_t = \hat\iota_t = \kappa^{-1}(q - 1)$.

Expecting that $\theta \in [0, 1]$, minimizing the Hamiltonian over $\theta$ could lead to $\partial_\theta H = 0$, i.e.
\[ n y \Bigl[\frac{a - \iota}{q} + \frac{p+q}{q}[\mu^M + (\sigma^0 - \sigma^M)\sigma^M]\Bigr] + n z^0\,\frac{p+q}{q}\,\sigma^M + n \sigma z = 0. \]
For obvious reasons we write the adjoint variables $z^0$ and $z$ in the form $z^0 = -y\zeta^0$ and $z = -y\zeta$ so we can rewrite the first order condition $\partial_\theta H = 0$ in equilibrium as:
\[ \frac{a - \iota}{q} + \frac{p+q}{q}[\mu^M + (\sigma^0 - \sigma^M)\sigma^M] = \frac{p+q}{q}\,\sigma^M \zeta^0 + \sigma\zeta. \tag{5.16} \]
This equation does not determine directly the optimal value of $\theta_t$. It is sometimes called the pricing equation because it can also be derived from the HJB equation of the optimization problem, offering a pricing interpretation.

Finally, since we restrict ourselves to $y < 0$, we can write the third First Order Condition (FOC) as:
\[ \partial_c H = 0 \quad \Leftrightarrow \quad -y - e^{-\rho t} u'(c) = 0 \quad \Leftrightarrow \quad c = (u')^{-1}\bigl(-e^{\rho t} y\bigr) \]
so that in the case of logarithmic utility, the optimal consumption rate should be the process $(\hat c_t)_{t \ge 0}$ given by
\[ \hat c^h_t = -\frac{e^{-\rho t}}{y^h_t} \]
where the adjoint process $(y_t)_{t \ge 0}$ is the first component of the solution of the adjoint equation, namely the Backward Stochastic Differential Equation (BSDE):
\[ dy^h_t = -\partial_n H(t, n^h_t, y^h_t, z^{0,h}_t, z^h_t, \hat\iota_t, \hat\theta_t, \hat c_t)\,dt + z^{0,h}_t\,dW^0_t + z^h_t\,dW^h_t. \tag{5.17} \]
Computing $\partial_n H$ from (5.14) and using (5.16) we get:
\[ \partial_n H = y \bigl[\phi(\iota) - \delta - [\mu^M + \sigma^M(\sigma^0 - \sigma^M)] - (\sigma^0 - \sigma^M)\zeta^0\bigr] = y\,r^h_t \tag{5.18} \]
if we define the individual household effective interest by:
\[ r^h_t = \phi(\iota) - \delta - [\mu^M + \sigma^M(\sigma^0 - \sigma^M)] - (\sigma^0 - \sigma^M)\zeta^{0,h}_t. \tag{5.19} \]
So the adjoint BSDE reads
\[ \frac{dy^h_t}{y^h_t} = -r^h_t\,dt - \zeta^{0,h}_t\,dW^0_t - \zeta^h_t\,dW^h_t, \tag{5.20} \]
hence the interpretation of $r^h_t$ as an individual household short interest rate and $y^h_t$ as an individual stochastic discount factor. Using Itô's formula with (5.8) and (5.20), and the definition (5.14) of the Hamiltonian we get:
\[ \frac{d(y^h_t n^h_t)}{y^h_t n^h_t} = -\frac{c^h_t}{n^h_t}\,dt + \Bigl(\Bigl[\sigma^0 + \Bigl(\frac{p}{q} - \theta^h_t \frac{p+q}{q}\Bigr)\sigma^M\Bigr] - \zeta^{0,h}_t\Bigr)\,dW^0_t + \Bigl((1 - \theta^h_t)\sigma - \zeta^h_t\Bigr)\,dW^h_t, \]
so if we choose:
\[ \zeta^{0,h}_t = \sigma^0 + \Bigl(\frac{p}{q} - \theta^h_t \frac{p+q}{q}\Bigr)\sigma^M \quad \text{and} \quad \zeta^h_t = \sigma(1 - \theta_t), \tag{5.21} \]
we have that $y^h_t n^h_t = -e^{-\rho t}/\rho$ and consequently:
\[ \hat c^h_t = \rho\,n^h_t. \tag{5.22} \]
NB: The fact that the optimal rate of consumption is proportional to the wealth is to be expected when using logarithmic utility.

Plugging the expressions (5.21) for $\zeta^{0,h}_t$ and $\zeta^h_t$ into the pricing equation (5.16), we find:
\[ \frac{a - \iota}{q} + \frac{p+q}{q}[\mu^M + (\sigma^0 - \sigma^M)\sigma^M] = \frac{p+q}{q}\,\sigma^M \Bigl(\sigma^0 + \Bigl(\frac{p}{q} - \theta^h_t \frac{p+q}{q}\Bigr)\sigma^M\Bigr) + \sigma^2 (1 - \theta_t) \]
from which we derive
\[ 1 - \theta^h_t = \frac{\dfrac{a - \iota}{q} + \dfrac{p+q}{q}\,\mu^M}{\Bigl(\dfrac{p+q}{q}\Bigr)^2 (\sigma^M)^2 + \sigma^2}. \tag{5.23} \]
Using the fraction of nominal wealth defined in (5.6) this gives:
\[ 1 - \theta^h_t = \frac{\dfrac{a - \iota}{q}(1 - \vartheta) + \mu^M}{(\sigma^M)^2 + \sigma^2 (1 - \vartheta)^2}\,(1 - \vartheta). \tag{5.24} \]
So not only is the optimal portfolio the same for all the households, but we also learn that it is a constant. Moreover:
\[ \zeta^{0,h}_t = \sigma^0 - \sigma^M + \sigma^M\,\frac{1 - \theta^h_t}{1 - \vartheta} = \sigma^0 - \sigma^M + \sigma^M\,\frac{\dfrac{a - \iota}{q}(1 - \vartheta) + \mu^M}{(\sigma^M)^2 + \sigma^2 (1 - \vartheta)^2}, \]
and inserting this value of $\zeta^{0,h}_t$ into the formula (5.19) we get:
\[ r^h_t = \phi(\iota) - \delta - [\mu^M + \sigma^M(\sigma^0 - \sigma^M)] - (\sigma^0 - \sigma^M)\Bigl(\sigma^0 - \sigma^M + \sigma^M\,\frac{\dfrac{a - \iota}{q}(1 - \vartheta) + \mu^M}{(\sigma^M)^2 + \sigma^2 (1 - \vartheta)^2}\Bigr), \tag{5.25} \]
which shows that the individual interest rate is in fact the same constant for all the households.

Clearing Conditions. The goods market clears if total output $aK_t$ equals the sum of investment $\iota_t K_t$ and consumption $C_t$. So the overall consumption $C_t = \int c^h_t\,\lambda(dh)$ should be equal to $(a - \iota_t) K_t$ since $aK_t$ represents the overall production and $\iota_t K_t$ represents the overall reinvestment in capital. Since we are using logarithmic utility, the optimal consumption is proportional to the wealth, so:
\[ C_t = \int c^h_t\,\lambda(dh) = \rho \int n^h_t\,\lambda(dh) = \rho N_t = \rho (p + q) K_t \]
so that the clearing condition amounts to $\rho(p + q) = a - \iota_t$ which gives
\[ \frac{a - \hat\iota}{q} = \frac{\rho}{1 - \vartheta}. \tag{5.26} \]
The capital market clears if aggregate capital demand equals capital supply $K_t$, in other words if:
\[ \frac{(1 - \theta_t) N_t}{q} = K_t \]
and using the fact that $N_t = (p + q) K_t$ we get:
\[ 1 - \hat\theta_t = \frac{q}{p + q} = 1 - \vartheta. \tag{5.27} \]
The money market clears by Walras law.

Using the clearing condition (5.26) and the optimal value of $\hat\iota_t$ (5.15) we get:
\[ q = (1 - \vartheta)\,\frac{1 + \kappa a}{1 - \vartheta + \kappa\rho}, \]
from which we derive:
\[ \hat\iota = \frac{(1 - \vartheta)\,a - \rho}{1 - \vartheta + \kappa\rho}, \quad \text{and} \quad p = \vartheta\,\frac{1 + \kappa a}{1 - \vartheta + \kappa\rho}. \]
Finally, injecting (5.27) and (5.26) into the pricing equation (5.16) we get:
\[ 1 - \vartheta = \frac{\sqrt{\rho + \mu^M - (\sigma^M)^2}}{\sigma}, \]
which shows that a stationary (meaning the processes $(p_t)_{t \ge 0}$ and $(q_t)_{t \ge 0}$ are deterministic and constant given by the real numbers $p$ and $q$) general equilibrium is possible if
\[ \rho + \mu^M - (\sigma^M)^2 > 0 \quad \text{and} \quad \sigma > \sqrt{\rho + \mu^M - (\sigma^M)^2}. \tag{5.28} \]

Does all this have anything to do with Mean Field Games?
Since the state variable of an individual household is its wealth $n^h_t$, the typical interaction one should expect if this general equilibrium can be recast as a mean field game should be the aggregate wealth $N_t$. So in the presence of the common noise $W^0$ one should fix the flow of conditional distributions of the wealth $n^h_t$ given the filtration of the common noise, and search for the best response of this household. In other words, given the knowledge of $(N_t)_{t \ge 0}$ which is a stochastic process adapted to the filtration $\mathbb{F}^0$ of $W^0$, the individual household should find the optimal investment rate in physical capital $(\hat\iota_t)_{t \ge 0}$, the optimal investment portfolio $(\hat\theta_t)_{t \ge 0}$, and the optimal consumption rate $(\hat c_t)_{t \ge 0}$, to maximize its long-run discounted expected utility of consumption (5.2). This is exactly what was done in the section dealing with the individual household optimization problem. The next step of the MFG paradigm is the fixed point step according to which one tries to identify a flow of conditional distributions which ends up being the flow of conditional distributions of the solution of the optimization problem underpinning the search for the best response.

In typical macro-economic general equilibrium problems, individual optimizations are performed assuming that the aggregates are known. If aggregates can be interpreted as means of some state variables, fixing the aggregates amounts to fixing the distributions of these state variables. In this example, assuming the knowledge of $(N_t)_{t \ge 0}$ is the same thing as assuming the knowledge of $(K_t)_{t \ge 0}$ since as we saw in (5.5), $N_t = (p + q) K_t$, which in turn, is assuming the knowledge of the process $(\bar\iota_t)_{t \ge 0}$ representing the aggregate investment rate in physical capital. This is the mean field interaction appearing explicitly in the dynamics (5.13) of the state of the individual household.
Since the individual household optimal investment rate in capital is constant as given by Tobin's q equation (5.15), a necessary condition for the fixed point step is that $\bar\iota_t = \hat\iota_t$. Added to the necessary conditions of optimality (which we derived from the Pontryagin stochastic maximum principle) and the capital market clearing condition, this fixed point step leads to the equilibrium solution under the conditions (5.28).

Because we chose to restrict ourselves to the search for a stationary general equilibrium in which the processes $(p_t)_{t \ge 0}$ and $(q_t)_{t \ge 0}$ are deterministic and constant given by the real numbers $p$ and $q$, the deterministic nature of most of the characteristics of the equilibria is rather anticlimactic, and the reformulation of the solution as the search for Nash equilibria in a mean field game is rather contrived. We chose to present this model because of the presence of both idiosyncratic and common shocks. We refer the interested reader to [14, 15] for extensions with deeper financial meaning. The next example will be more illustrative of the deep connection with the paradigm of mean field games. While it does not involve idiosyncratic shocks, it involves two populations and this will give us a chance to highlight the possible benefits of a mean field game reformulation of the model.
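Before turning to that example, the stationary equilibrium of the first model can be evaluated numerically. The sketch below simply chains the closed-form expressions obtained above: the conditions (5.28), the value of $1 - \vartheta$, then $q$, $p$ and $\hat\iota$, under the choice $\phi(\iota) = (1/\kappa)\log(1 + \kappa\iota)$. The function name, the dictionary keys and the parameter values are hypothetical and only serve as an illustration.

```python
import math

def stationary_equilibrium(a, rho, kappa, mu_M, sigma_M, sigma):
    """Stationary equilibrium of the one-population model with money.

    Returns vartheta = p / (p + q), the prices q and p, and the investment
    rate iota, assuming phi(iota) = log(1 + kappa * iota) / kappa.
    """
    gap = rho + mu_M - sigma_M ** 2
    # Conditions (5.28): they guarantee that vartheta lies in (0, 1).
    if gap <= 0.0 or sigma <= math.sqrt(gap):
        raise ValueError("conditions (5.28) are not satisfied")
    one_minus_vartheta = math.sqrt(gap) / sigma
    vartheta = 1.0 - one_minus_vartheta
    denom = one_minus_vartheta + kappa * rho
    q = one_minus_vartheta * (1.0 + kappa * a) / denom
    p = vartheta * (1.0 + kappa * a) / denom
    iota = (one_minus_vartheta * a - rho) / denom
    return {"vartheta": vartheta, "q": q, "p": p, "iota": iota}

# Hypothetical parameter values, chosen only so that (5.28) holds.
eq = stationary_equilibrium(a=0.2, rho=0.04, kappa=2.0,
                            mu_M=0.02, sigma_M=0.1, sigma=0.3)
for name, value in eq.items():
    print(f"{name} = {value:.4f}")
```

Note that the volatility $\sigma^0$ of the common noise does not enter these formulas. It only appears in the effective interest rate (5.25).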
We present the analysis of the model discussed in [ ] mutatis mutandis. We consider an economy with a continuum of households and experts. We denote by $I^h$ (resp. $I^e$) the space of households (resp. experts). Typically, we choose $I^h = I^e = [0, 1]$ which we assume to be equipped with its Borel $\sigma$-field. We shall also use continuous probability measures $\lambda^h$ and $\lambda^e$ on $I^h$ and $I^e$ respectively. Again, for practical purposes, and modulo the contents of Remark 5.1 at the beginning of the discussion of the previous model, we can think of them both as equal to the Lebesgue measure on $[0, 1]$.

In this economy, households consume and lend money to experts. On the other hand, experts borrow money from households, invest in the production of a single good, and consume. The goal of each agent is to maximize their long run expected utility. In this model, all agents use the logarithmic utility function $u(c) = \log c$. So if we denote by $c^e_t$ and $c^h_t$ the consumptions at time $t$ of expert $e$ and household $h$ respectively, the optimization problem is:
\[ \sup_{(c^i_t)_{t \ge 0}} \mathbb{E}\Bigl[\int_0^\infty e^{-\rho t} \log c^i_t\,dt\Bigr], \qquad i = e, h, \]
where $\rho > 0$ is a discount factor common to the two classes of agents. To be consistent with the computation done throughout this chapter, we shall in fact minimize the negative of the above expected utility of consumption.

Individual household optimization problem. If we denote by $n^h_t$ the wealth of household $h$ at time $t$, we have:
\[ dn^h_t = r_t n^h_t\,dt - c^h_t\,dt. \tag{5.29} \]
Here, the process $(r_t)_{t \ge 0}$ represents the interest rate common to all agents. It is one of the stochastic processes to be determined endogenously.

The Hamiltonian of the optimization problem of a generic household reads:
\[ H(t, n, \xi, c) = (r_t n - c)\,\xi - e^{-\rho t} u(c) \]
if we use the notation $\xi$ for the adjoint variable (sometimes called the co-state) which we shall restrict to be negative.
The necessary part of the Pontryagin maximum principle suggests to minimize the Hamiltonian with respect to the control variable $c$. This gives the First Order Condition (FOC):
\[ \partial_c H = 0 \quad \Leftrightarrow \quad -\xi - e^{-\rho t} u'(c) = 0 \quad \Leftrightarrow \quad c = (u')^{-1}\bigl(-e^{\rho t}\xi\bigr) \]
so that in the case of logarithmic utility, the optimal consumption rate is given by
\[ \hat c_t = -\frac{e^{-\rho t}}{\xi_t} \]
where the adjoint function $t \mapsto \xi_t$ solves the adjoint equation:
\[ d\xi_t = -\partial_n H\,dt = -r_t \xi_t\,dt. \]
The differentiation product rule gives:
\[ d(\xi_t n^h_t) = -c^h_t \xi_t\,dt = e^{-\rho t}\,dt \]
implying that $\xi_t n^h_t = -e^{-\rho t}/\rho$ and consequently:
\[ \hat c^h_t = \rho\,n^h_t. \tag{5.30} \]
As noted in the previous example, the fact that the optimal rate of consumption is proportional to the wealth (and is independent of the interest rate) is a well known property of the logarithmic utility function.
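As a sanity check of this property, the following sketch propagates the household wealth under the feedback rule $\hat c = \rho n$ together with the adjoint function, for an arbitrary deterministic interest rate path, and then recovers the consumption from the first order condition $\hat c_t = -e^{-\rho t}/\xi_t$. The interest rate path, the initial wealth and the step size are hypothetical choices. The initial value $\xi_0 = -1/(\rho n_0)$ is the one consistent with $\xi_t n^h_t = -e^{-\rho t}/\rho$, and the rate is treated as piecewise constant on each time step so that the updates are exact exponentials.

```python
import math

def household_path(n0, rho, rate, T, n_steps):
    """Propagate wealth n and adjoint xi for the log-utility household.

    Wealth follows dn = (r_t n - c) dt with c = rho * n, and the adjoint
    follows d(xi) = -r_t xi dt, the rate being frozen on each time step.
    Returns a list of tuples (t, n_t, xi_t).
    """
    dt = T / n_steps
    n = n0
    xi = -1.0 / (rho * n0)  # consistent with xi_t * n_t = -exp(-rho t) / rho
    rows = [(0.0, n, xi)]
    for k in range(n_steps):
        r = rate(k * dt)
        n *= math.exp((r - rho) * dt)
        xi *= math.exp(-r * dt)
        rows.append(((k + 1) * dt, n, xi))
    return rows

# Hypothetical interest rate path and parameters.
rho = 0.04
rows = household_path(n0=2.0, rho=rho,
                      rate=lambda t: 0.03 + 0.02 * math.sin(t),
                      T=10.0, n_steps=1000)
t_end, n_end, xi_end = rows[-1]
consumption_from_adjoint = -math.exp(-rho * t_end) / xi_end
print(consumption_from_adjoint, rho * n_end)  # the two values should agree
```

Whatever the interest rate path, the two printed numbers coincide up to rounding, which is the content of (5.30).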
Individual expert optimization problem.
If at time $t$ we denote by $n^e_t$ the wealth of expert $e$, by $\theta^e_t$ the proportion of self worth invested in bonds (i.e. borrowed from the households, so $\theta^e_t \le 0$) and by $\iota^e_t$ the investment in physical capital, we have:
\[ dn^e_t = \theta^e_t n^e_t r_t\,dt + (1 - \theta^e_t)\,n^e_t\,dr^k_t(\iota^e_t) - c^e_t\,dt \tag{5.31} \]
where $r^k_t(\iota^e_t)$ denotes the return from the investment $\iota^e_t$ in physical capital. The capital stock of a generic expert $e$ evolves according to the equation:
\[ \frac{dk^e_t}{k^e_t} = \bigl(\phi(\iota^e_t) - \delta\bigr)\,dt + \sigma\,dW_t \tag{5.32} \]
where $\delta > 0$ is a depreciation rate, and the function $\phi$ reflects adjustment costs in capital stock. It is assumed to satisfy $\phi(0) = 0$, $\phi'(0) = 1$, $\phi'(\cdot) > 0$ and $\phi''(\cdot) < 0$. Its concavity captures technological illiquidity. The volatility $\sigma > 0$ is a positive constant and $W = (W_t)_{t \ge 0}$ is a Wiener process modeling random shocks. Note that this is the same process for all the experts. This is an instance of what we call a common noise. There is no source of idiosyncratic noise in this model.

Let the price $q_t$ at time $t$ of one unit of capital be an Itô process satisfying
\[ \frac{dq_t}{q_t} = \mu^q_t\,dt + \sigma^q_t\,dW_t \tag{5.33} \]
for two processes $(\mu^q_t)_{t \ge 0}$ and $(\sigma^q_t)_{t \ge 0}$ adapted to the filtration $\mathbb{F}$ of the common noise, which will be specified later on.

The return on capital $r^k_t(\iota^e_t)$ is defined as:
\[ dr^k_t(\iota^e_t) = \frac{a - \iota^e_t}{q_t}\,dt + \frac{d(q_t k^e_t)}{q_t k^e_t}. \tag{5.34} \]
The first term in the right hand side represents the dividend yield while the second one gives the capital gain. Using the definitions (5.33) and (5.32) and Itô's formula for the differential of a product we get:
\[ dr^k_t(\iota^e_t) = \Bigl[\frac{a - \iota^e_t}{q_t} + \phi(\iota^e_t) - \delta + \mu^q_t + \sigma\sigma^q_t\Bigr]\,dt + (\sigma + \sigma^q_t)\,dW_t. \tag{5.35} \]
Plugging this formula in the dynamics (5.31) of the wealth of a generic expert we get:
\[ dn^e_t = \Bigl[\theta^e_t n^e_t r_t + (1 - \theta^e_t)\,n^e_t \Bigl(\frac{a - \iota^e_t}{q_t} + \phi(\iota^e_t) - \delta + \mu^q_t + \sigma\sigma^q_t\Bigr) - c^e_t\Bigr]\,dt + (1 - \theta^e_t)\,n^e_t\,(\sigma + \sigma^q_t)\,dW_t. \tag{5.36} \]
This equation should be viewed as giving the dynamics of the state variable $n^e_t$ as controlled by $(c^e_t, \iota^e_t, \theta^e_t)$. As before, we use the Pontryagin stochastic maximum principle to solve the optimization of the expected utility of consumption. For the sake of simplicity of notation, we skip the superscript $e$ throughout the remainder of this subsection. No confusion is possible since we are only dealing with the expert optimization problem. The Hamiltonian of this optimization problem reads:
\[ H(t, n, \xi, \zeta, c, \iota, \theta) = \Bigl[\theta n r_t + (1 - \theta)\,n \Bigl(\frac{a - \iota}{q_t} + \phi(\iota) - \delta + \mu^q_t + \sigma\sigma^q_t\Bigr) - c\Bigr]\,\xi - (1 - \theta)\,n\,(\sigma + \sigma^q_t)\,\zeta\xi - e^{-\rho t} u(c) \tag{5.37} \]
where for reasons which will become clear soon, we used the notations $\xi$ (which is assumed to be negative) and $-\xi\zeta$ for the adjoint variables. We now have three First Order Conditions.

Since $\xi \le 0$ and $(1 - \theta) \ge 0$ we can isolate the contribution of the control $\iota$. This leads to the maximization of the quantity $(a - \iota)/q_t + \phi(\iota)$ which leads to
\[ -\frac{1}{q_t} + \phi'(\iota) = 0 \quad \Leftrightarrow \quad \iota = (\phi')^{-1}\Bigl(\frac{1}{q_t}\Bigr). \]
In the case of the function $\phi(\iota) = (1/\kappa)\log(1 + \kappa\iota)$ used in [ ], we get:
\[ \hat\iota_t = \frac{1}{\kappa}(q_t - 1). \]
In any case, this value is the same for all the experts $e$, and as a control process, it is adapted to the filtration of the common noise.

As before:
\[ \partial_c H = 0 \quad \Leftrightarrow \quad -\xi - e^{-\rho t} u'(c) = 0 \quad \Leftrightarrow \quad c = (u')^{-1}\bigl(-e^{\rho t}\xi\bigr) \]
so that in the case of logarithmic utility, the optimal consumption rate is given by
\[ \hat c_t = -\frac{e^{-\rho t}}{\xi_t} \]
where the adjoint process $(\xi_t)_{t \ge 0}$ solves the adjoint equation:
\[ d\xi_t = -\partial_n H\,dt - \xi_t \zeta_t\,dW_t. \]
Notice that the FOC $\partial_\theta H = 0$ gives:
\[ r_t = \frac{a - \iota}{q_t} + \phi(\iota) - \delta + \mu^q_t + \sigma\sigma^q_t - \zeta_t (\sigma + \sigma^q_t) \tag{5.38} \]
and we shall see below that this formula will help us identify the individual expert optimal investment $\hat\theta^e_t$ in terms of the processes $(r_t)_{t \ge 0}$, $(q_t)_{t \ge 0}$, $(\mu^q_t)_{t \ge 0}$, $(\sigma^q_t)_{t \ge 0}$.
Computing $\partial_n H$ from (5.37) we get
\[ \partial_n H(t, n, \xi, \zeta, c, \iota, \theta) = \Bigl[\theta r_t + (1 - \theta)\Bigl(\frac{a - \iota}{q_t} + \phi(\iota) - \delta + \mu^q_t + \sigma\sigma^q_t\Bigr)\Bigr]\,\xi - (1 - \theta)(\sigma + \sigma^q_t)\,\zeta\xi \]
and using (5.38) we get
\[ \partial_n H(t, n, \xi, \zeta, c, \iota, \theta) = \xi\,r_t \]
and the adjoint equation rewrites:
\[ \frac{d\xi_t}{\xi_t} = -r_t\,dt - \zeta_t\,dW_t \tag{5.39} \]
which justifies our choice of the form of the second adjoint variable. Applying Itô's formula to (5.36) and (5.39) we get:
\[ \frac{d(\xi_t n_t)}{\xi_t n_t} = -\frac{c_t}{n_t}\,dt + \bigl[-\zeta_t + (1 - \theta_t)(\sigma + \sigma^q_t)\bigr]\,dW_t \]
so that, choosing
\[ \zeta_t = (1 - \theta_t)(\sigma + \sigma^q_t), \tag{5.40} \]
we find that as in the case of the computation of the optimal consumption rate of the households, $\xi_t n_t = -e^{-\rho t}/\rho$ and consequently:
\[ \hat c^e_t = \rho\,n^e_t. \tag{5.41} \]
So the fact that the optimal rate of consumption is proportional to the wealth was not affected by the presence of the random shocks. It is typical in the case of logarithmic utility. Plugging our choice (5.40) for $\zeta_t$ in (5.38) we find:
\[ r_t = \frac{a - \hat\iota_t}{q_t} + \phi(\hat\iota_t) - \delta + \mu^q_t + \sigma\sigma^q_t - (1 - \theta^e_t)(\sigma + \sigma^q_t)^2 \tag{5.42} \]
from which we can easily extract $\hat\theta^e_t$ as desired.

Clearing constraints and equilibrium.
The next step in the search for a general equilibrium for this macro-economic model is to articulate the constraints imposed by the need to have all the markets clear, and to show that one can identify processes $(r_t)_{t \ge 0}$, $(q_t)_{t \ge 0}$, $(\mu^q_t)_{t \ge 0}$, $(\sigma^q_t)_{t \ge 0}$ satisfying these constraints and allowing the simultaneous optimizations of all the agents. Clearing is best expressed in terms of aggregate quantities. For each $i \in \{h, e\}$, we denote by $C^i_t$ the aggregate consumption for the agents of type $i$. Formally we write:
\[ C^h_t = \int_{I^h} \hat c^h_t\,\lambda^h(dh), \quad \text{and} \quad C^e_t = \int_{I^e} \hat c^e_t\,\lambda^e(de). \]
Given (5.30) and (5.41) we see that $C^h_t = \rho N^h_t$ and $C^e_t = \rho N^e_t$ where $N^h_t$ and $N^e_t$ are the aggregate worths of the populations of households and experts respectively, i.e.
\[ N^h_t = \int_{I^h} \hat n^h_t\,\lambda^h(dh), \quad \text{and} \quad N^e_t = \int_{I^e} \hat n^e_t\,\lambda^e(de). \]
If we denote by $K_t$ the aggregate physical capital in the economy at time $t$, i.e.
\[ K_t = \int_{I^e} k^e_t\,\lambda^e(de), \]
the aggregate wealth in the economy is equal to $q_t K_t$.

Clearing of the loan market requires that at each time $t$, the aggregate debt of the experts, say $D^e_t$, be equal to the aggregate loans of the households, so that:
\[ D^e_t = \int_{I^h} n^h_t\,\lambda^h(dh) = N^h_t. \]
So $N^e_t = q_t K_t - D^e_t = q_t K_t - N^h_t$ which implies that
\[ q_t K_t = N^h_t + N^e_t. \tag{5.43} \]
It will be convenient to use the quantity:
\[ \eta_t = \frac{N^e_t}{N^h_t + N^e_t} = \frac{N^e_t}{q_t K_t} \tag{5.44} \]
representing the wealth share of the experts. Notice that all these aggregate quantities are random since they depend upon the common noise $W$ which does not average out in the computation of the aggregates because it is common to all the agents.

Clearing of consumption on the market for goods requires
\[ C_t = (a - \iota^e_t) K_t, \]
in other words $\rho q_t K_t = (a - \iota^e_t(q_t)) K_t$, which implies $\rho q_t = a - \iota^e_t(q_t)$, which in turn implies that the process $(q_t)_{t \ge 0}$ is in fact a positive constant, say $q$.
As a consequence, $\mu^q_t \equiv 0$ and $\sigma^q_t \equiv 0$, and if we use the function $\phi(\iota) = (1/\kappa)\log(1 + \kappa\iota)$ proposed in [ ] we get
\[ q = \frac{1 + \kappa a}{1 + \kappa\rho} \quad \text{and} \quad \iota^e = \frac{a - \rho}{1 + \kappa\rho}. \tag{5.45} \]
Capital market clearing yields:
\[ 1 - \theta^e_t = \frac{q_t K_t}{N^e_t} = \frac{1}{\eta_t}. \tag{5.46} \]
Knowing that $q_t$ has to be a deterministic constant, we can use the facts that
\[ \frac{dN^e_t}{N^e_t} = \frac{dn^e_t}{n^e_t} = \Bigl[r_t + (1 - \theta^e_t)^2 \sigma^2 - \frac{c^e_t}{n^e_t}\Bigr]\,dt + (1 - \theta^e_t)\,\sigma\,dW_t \tag{5.47} \]
and
\[ \frac{dK_t}{K_t} = \frac{dk^e_t}{k^e_t} = \bigl[r_t + (1 - \theta^e_t)\,\sigma^2 - \rho\bigr]\,dt + \sigma\,dW_t \tag{5.48} \]
to derive from Itô's formula that:
\[ \frac{d\eta_t}{\eta_t} = \frac{d(N^e_t/K_t)}{N^e_t/K_t} = \Bigl[-\frac{c^e_t}{n^e_t} + \rho + (\theta^e_t)^2 \sigma^2\Bigr]\,dt - \theta^e_t\,\sigma\,dW_t = (\theta^e_t)^2 \sigma^2\,dt - \theta^e_t\,\sigma\,dW_t \tag{5.49} \]
which we can rewrite as
\[ d\eta_t = \sigma^2\,\frac{(1 - \eta_t)^2}{\eta_t}\,dt + \sigma (1 - \eta_t)\,dW_t \tag{5.50} \]
if we use the capital market clearing condition (5.46).

Interpretation.
This is a stochastic differential equation on the open interval $(0,1)$. According to Feller's theory of one dimensional diffusions, the scale function $p(x)$ and the speed measure $m(dx)$ are given by:
$$p(x)=\frac12\Big(1-\frac{1}{2x}\Big)\qquad\text{and}\qquad m(dx)=\frac{8}{\sigma^2}\,\frac{x^2}{(1-x)^2}\,dx.$$
Feller's explosion test can be computed and it says that if started inside the interval $(0,1)$, the diffusion remains inside the interval for ever and in fact $\lim_{t\to\infty}\eta_t=1$ almost surely. Note also that the drift is always positive, and very large when $\eta_t$ is small, so up to the fluctuations due to the random shocks (whose sizes $\sigma(1-\eta_t)$ decrease as $\eta_t$ gets closer to $1$), one should expect that $\eta_t$ would grow quickly toward $1$ and become mostly flat when it gets close to $1$. This is illustrated in Figure 1.

FIGURE 1. Typical sample path of $\eta_t$.

From an economic point of view, this means that the proportion of the wealth held by the experts grows quickly toward a high value close to $1$, and eventually converges to this level, leaving the households helpless.

Finally, revisiting the constraint (5.42), we see that in equilibrium we must have:
$$\begin{aligned}
r_t&=\frac{a-\iota^e_t}{q_t}+\phi(\iota^e_t)-\delta+\mu^q_t+\sigma\sigma^q_t-(1-\theta^e_t)\sigma^2\\
&=\frac{a-\iota^e_t}{q_t}+\phi(\iota^e_t)-\delta-(1-\theta^e_t)\sigma^2\\
&=\frac{a-\iota^e_t}{q_t}+\phi(\iota^e_t)-\delta-\frac{\sigma^2}{\eta_t}\\
&=\rho+\frac1\kappa\log\Big(\frac{1+\kappa a}{1+\kappa\rho}\Big)-\delta-\frac{\sigma^2}{\eta_t}
\end{aligned}\tag{5.51}$$
if we use the function $\phi(\iota)=(1/\kappa)\log(1+\kappa\iota)$ proposed in [ ]. NB:
This interest rate is negative for small values of $\eta_t$.

Conclusion. Given the common random shock process $W$, we solve the stochastic differential equation (5.50) to find a process $(\eta_t)_{t\ge 0}$ which stays in $(0,1)$. Next we define the short interest rate process $(r_t)_{t\ge 0}$ by (5.51), and with the constant price of capital $q$ given by (5.45), all the agents can maximize their expected long run discounted utility of consumption simultaneously and all the markets clear. These are the elements of the desired equilibrium.

Two-Population, Infinite-Horizon Mean Field Game Formulation. The model described in this section is the epitome of an infinite-horizon, two-population Mean Field Game (MFG) with a common noise and no idiosyncratic noise. We make it explicit directly in the mean field limit without motivating it with the description of the finite player analogue. Because we do not know of examples of this type treated in the existing literature, we formulate a rigorous definition in the spirit and with the notations of [ ] and [ ], and we accommodate the possibility of idiosyncratic random shocks.

The sources of random shocks are three independent $\mathbb{R}^d$-valued Wiener processes $W^1=(W^1_t)_{t\ge 0}$, standing for the idiosyncratic noise for the players of the first population, $W^2=(W^2_t)_{t\ge 0}$, standing for the idiosyncratic noise for the players of the second population, and $W^0=(W^0_t)_{t\ge 0}$, standing for the noise common to all the players. For $i=0,1,2$, we denote by $\mathbb{F}^i=(\mathcal{F}^i_t)_{t\ge 0}$ the filtration generated by $W^i$.
The MFG problem can be formulated as the conjunction of the following two bullet points:

(1) For any two probability measures $\mu^1_0$ and $\mu^2_0$ on $\mathbb{R}^d$ and two stochastic flows of (random) probability measures $\mu^1=(\mu^1_t)_{t>0}$ and $\mu^2=(\mu^2_t)_{t>0}$ on $\mathbb{R}^d$, both adapted to the filtration $\mathbb{F}^0$ of the common noise, solve the two optimal control problems:
$$\sup_{\alpha^1}J^{1,\mu^1,\mu^2}(\alpha^1)\qquad\text{and}\qquad\sup_{\alpha^2}J^{2,\mu^1,\mu^2}(\alpha^2)$$
over $\mathbb{F}$-progressively measurable $\mathbb{R}^k$-valued processes $\alpha^1=(\alpha^1_t)_{t\ge 0}$ and $\mathbb{F}$-progressively measurable $\mathbb{R}^k$-valued processes $\alpha^2=(\alpha^2_t)_{t\ge 0}$, where
$$J^{1,\mu^1,\mu^2}(\alpha^1)=\mathbb{E}\Big[\int_0^\infty e^{-\rho t}f^1\big(t,X^1_t,\mu^1_t,\mu^2_t,\alpha^1_t\big)\,dt\Big],$$
$$J^{2,\mu^1,\mu^2}(\alpha^2)=\mathbb{E}\Big[\int_0^\infty e^{-\rho t}f^2\big(t,X^2_t,\mu^1_t,\mu^2_t,\alpha^2_t\big)\,dt\Big],$$
with
$$dX^1_t=b^1\big(t,X^1_t,\mu^1_t,\mu^2_t,\alpha^1_t\big)dt+\sigma^1\big(t,X^1_t,\mu^1_t,\mu^2_t,\alpha^1_t\big)dW^1_t+\sigma^{1,0}\big(t,X^1_t,\mu^1_t,\mu^2_t,\alpha^1_t\big)dW^0_t,$$
$$dX^2_t=b^2\big(t,X^2_t,\mu^1_t,\mu^2_t,\alpha^2_t\big)dt+\sigma^2\big(t,X^2_t,\mu^1_t,\mu^2_t,\alpha^2_t\big)dW^2_t+\sigma^{2,0}\big(t,X^2_t,\mu^1_t,\mu^2_t,\alpha^2_t\big)dW^0_t,$$
for $t>0$, and $\mathcal{L}(X^1_0)=\mu^1_0$ and $\mathcal{L}(X^2_0)=\mu^2_0$.

(2) Find $\mathbb{F}^0$-adapted random flows $\mu^1=(\mu^1_t)_{t>0}$ and $\mu^2=(\mu^2_t)_{t>0}$ such that, conditioned on the past of the common noise, almost surely, the marginal distributions of the solutions of the above stochastic control problems coincide with the elements of the probability flows we started from. In other words:
$$\forall t\in[0,T],\qquad\mu^1_t=\mathcal{L}\big(\hat X^{1,\mu^1,\mu^2}_t\,\big|\,\mathcal{F}^0_t\big),\qquad\mu^2_t=\mathcal{L}\big(\hat X^{2,\mu^1,\mu^2}_t\,\big|\,\mathcal{F}^0_t\big),$$
if we denote by $\hat X^{1,\mu^1,\mu^2}$ and $\hat X^{2,\mu^1,\mu^2}$ the solutions of the above optimal control problems.

In practice, specific assumptions are required on the coefficients $b^1$, $\sigma^1$, $\sigma^{1,0}$, $b^2$, $\sigma^2$, $\sigma^{2,0}$ and the running reward functions $f^1$ and $f^2$ for the stochastic differential equations determining the generic states $X^1_t$ and $X^2_t$ of the two populations to have solutions, and the expected costs to make sense.
Moreover, much more restrictive assumptions are required for the existence of a couple of measure flows satisfying the fixed point conditions (2).

Our goal is to explain how the macro-economic model presented in the previous subsection is an instance of such a Mean Field Game.
• The individuals in the first population are the households and the individuals of the second population are the experts.
• In the present situation, the idiosyncratic noises $W^1$ and $W^2$ are not present so we can take $\sigma^1=\sigma^2=0$.
• The generic states $X^1_t$ and $X^2_t$ are the wealths $n^h_t$ and $n^e_t$.
• As a result, the interpretation of the fixed point condition in item (2) of the above definition of a solution to the MFG is that in equilibrium, the (random) measures $\mu^1_t$ and $\mu^2_t$ should be the conditional distributions of the states $n^h_t$ and $n^e_t$ given the past $\{W_s;\ 0\le s\le t\}$ of the common noise.
• In fact, as we are about to see, the forms of the coefficients of the state equations as well as of the reward functions are such that the means (first moments) of the probability measures $\mu^1_t$ and $\mu^2_t$ are sufficient statistics. So instead of working with the full random measures $\mu^1_t$ and $\mu^2_t$, we can restrict ourselves to their means $\bar\mu^1_t=\int x\,\mu^1_t(dx)$ and $\bar\mu^2_t=\int x\,\mu^2_t(dx)$, which are still functions of the past of the common noise.
• According to item (1) of the above definition of a solution to the MFG, for each couple of stochastic flows of (random) probability measures $\mu^1=(\mu^1_t)_{t>0}$ and $\mu^2=(\mu^2_t)_{t>0}$ adapted to the filtration $\mathbb{F}^0$ of the common noise $W$, we need to be able to solve the two optimal control problems before we tackle the fixed point problem stated in item (2) of this definition. Let me argue that this is exactly what we did in the optimization steps of the construction of a general equilibrium for the macro-economic model studied earlier in this section. For the following discussion to be more transparent, we should think of the means $\bar\mu^1_t$ and $\bar\mu^2_t$ as the stochastic processes $(N^h_t)_{t\ge 0}$ and $(N^e_t)_{t\ge 0}$ of the aggregate wealths in the households and experts populations.
  – At each time $t\ge 0$, knowing the averages $\bar\mu^1_t$ and $\bar\mu^2_t$ we can compute $\eta_t=\bar\mu^2_t/(\bar\mu^1_t+\bar\mu^2_t)$ and then the quantity $r_t$ from formula (5.51).
  – Choosing the control $\alpha^1_t=c^h_t$, the drift function $b^1(t,n^h,\mu^1_t,\mu^2_t,\alpha^1_t)=r_tn^h_t-c^h_t$, the volatility $\sigma^{1,0}(t,n^h,\mu^1_t,\mu^2_t,\alpha^1_t)=0$, and the reward function $f^1(t,n^h,\mu^1_t,\mu^2_t,\alpha^1_t)=\log c^h_t$ for the utility of the household, we see that the first optimal control problem in item (1) is exactly what we called the optimization problem of the households, which we solved using the Pontryagin maximum principle. Notice that in the dynamics of the state, namely in the function $b^1$, the dependence upon the random probability measures $\mu^1_t$ and $\mu^2_t$ appears implicitly through the process $r_t$.
  – Choosing the control $\alpha^2_t=(\iota_t,\theta_t,c^e_t)$, the drift function $b^2(t,n^e,\mu^1_t,\mu^2_t,\alpha^2_t)=\theta_tr_tn^e_t-c^e_t+(1-\theta_t)n^e_t\big(\frac{a-\iota_t}{q}+\phi(\iota_t)-\delta\big)$, the volatility $\sigma^{2,0}(t,n^e,\mu^1_t,\mu^2_t,\alpha^2_t)=(1-\theta_t)n^e_t\sigma$, and the reward function $f^2(t,n^e,\mu^1_t,\mu^2_t,\alpha^2_t)=\log c^e_t$, we see that the second optimal control problem in item (1) is exactly what we called the optimization problem of the experts, which we solved using the Pontryagin maximum principle. As before, the dependence upon the random probability measures $\mu^1_t$ and $\mu^2_t$ appears implicitly through the process $r_t$.
• The fixed point condition in item (2) of the definition of the MFG guarantees that $\bar\mu^1_t=N^h_t$ and $\bar\mu^2_t=N^e_t$, so that the process $(\eta_t)_{t\ge 0}$ is indeed the wealth share of the experts, the process $(r_t)_{t\ge 0}$ is indeed the short interest rate process, and all the clearing conditions are satisfied.
So the Nash equilibrium of the two-population Mean Field Game coincides with the general equilibrium constructed in the previous subsections. But what have we gained? Aren't we making matters worse? Johann Wolfgang von Goethe once said "Mathematicians are like Frenchmen: whatever you say to them they translate into their own language and forthwith it is something entirely different." and I would add that it has to be even worse when the mathematician is French! Still, once the macro-economic general equilibrium model is reformulated as a Mean Field Game, the technology developed during the last years to analyze these models can be brought to bear to gain insight into these macro-economic models. Among these tools: 1) numerical methods to compute solutions and statics, 2) convergence of finite population models and quantification of the finite size effects, 3) analysis of the uniqueness of the equilibria, or lack thereof, 4) analysis of the centralized optimization counterpart and comparisons of the global welfares (e.g. computation of the price of anarchy), ...
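As a small illustration of the first item, the equilibrium of this section can be explored numerically. The following is only a sketch under stated assumptions: it uses a plain Euler-Maruyama discretization of (5.50), clamps the iterates to stay inside $(0,1)$ (a numerical safeguard that is not part of the model), and evaluates the constants (5.45) and the short rate (5.51). The function names and all parameter values used with them are hypothetical choices made for illustration.

```python
import math
import random


def equilibrium_constants(a, rho, kappa):
    """Constant price of capital q and investment rate iota from (5.45)."""
    q = (1.0 + kappa * a) / (1.0 + kappa * rho)
    iota = (a - rho) / (1.0 + kappa * rho)
    return q, iota


def short_rate(eta, a, rho, kappa, delta, sigma):
    """Short interest rate (5.51) as a function of the experts' wealth share eta."""
    q, _ = equilibrium_constants(a, rho, kappa)
    return rho + math.log(q) / kappa - delta - sigma ** 2 / eta


def simulate_eta(eta0, sigma, horizon, n_steps, seed=0):
    """Euler-Maruyama path of (5.50): d eta = sigma^2 (1-eta)^2/eta dt + sigma (1-eta) dW."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    eta = eta0
    path = [eta]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        eta += sigma ** 2 * (1.0 - eta) ** 2 / eta * dt + sigma * (1.0 - eta) * dw
        # Safeguard (assumption of this sketch): the discretized path may
        # leave (0, 1) even though the exact diffusion does not.
        eta = min(max(eta, 1e-8), 1.0 - 1e-12)
        path.append(eta)
    return path
```

For instance, `simulate_eta(0.1, 0.1, 50.0, 5000)` returns one discretized sample path of the wealth share, and applying `short_rate` along that path gives the corresponding interest rate path.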
6. Moral Hazard & Contract Theory
After reviewing some of the major historical developments in contract theory, and explaining the basics of the models, we concentrate on two recent papers using mean field games for the purpose of extending the reach of possible applications of the theory to large populations of agents. Because this topic was not discussed as a possible application of Mean Field Games in the books [21, 22], we spend a significant amount of time reviewing the underpinnings of the economic theory as well as some of the recent applications involving mean field models.

Put in layman's terms, the purpose of economic contract theory is to address questions of the type: a) How should a government control a flu outbreak by encouraging citizens to vaccinate? b) How should taxes be levied to influence people's consumption, saving and investment decisions? c) How should an employer incentivize and compensate their employees in order to boost productivity? While these are mundane objectives, they should shed some initial light on the type of problems contract theory can encompass. The economic lingo we shall use and try to elucidate includes terms like Agency Problem, Contract Theory, Moral Hazard, and Information Asymmetry.
• The agency problem refers to a conflict of interest between two parties when one of them is expected to behave in the other party's best interests.
• The purpose of contract theory is to study how an economic agent can design and structure a contractual agreement to incentivize another agent to behave in his or her best interest. This problem is most often set up when the information available to the two agents is not the same.
• In economics and in finance, moral hazard refers to a situation when an agent has an incentive to take excess risk because they do not bear the full consequences of that risk.
• In contract theory and economics, information asymmetry deals with the study of decisions in transactions where one party has more or better information than the other.
• The principal-agent problem occurs when one agent makes decisions on behalf of another person or entity called the principal.
The earliest contributions to this important field of economics concentrated on static one period models. They are attached to the names of Mirrlees [ ] and Holmström [58, 59]. The first dynamic models were introduced by Holmström and Milgrom [60, 61], and the field had to wait almost two decades for Sannikov's breakthrough [72, 73]. By focusing on Markov diffusion models in continuous time, and using the weak formulation of stochastic control problems to capture moral hazard, Sannikov created a new wave of interest, especially among financial mathematicians. The book of Cvitanic and Zhang focusing on the use of Backward Stochastic Differential Equations (BSDEs) is a case in point, and the more recent work of Cvitanic, Possamaï and Touzi [ ], highlighting the real nature of Sannikov's trick, will be instrumental in the developments we present in this section.

NB: Bengt Holmström and Oliver Hart were awarded the Nobel Memorial Prize in Economic Sciences in 2016 for their work on Contract Theory.
In a classical contract theory model, two parties are present:
(1) the principal, who devises a contract according to which incentives are given to, and/or penalties are imposed on,
(2) the agent, who may accept the contract and work for the principal.

The following are the major assumptions which are usually made implicitly or explicitly in a contract theory model.
• It is assumed that all the agents are rational in the sense that they behave optimally to maximize their own utilities, controlling the tradeoff between the rewards/penalties they receive and the efforts they put in.
• The principal designs the contract.
• After reviewing the terms of the contract, the agent may walk away. We assume that the agent has a threshold level (e.g. a minimum reward) below which they think the contract is not worth the effort. We call this threshold the reservation utility of the agent.
• The principal observes the agent's actions only partially. This is the source of information asymmetry and moral hazard in the problem.

From a mathematical point of view, the fact that the principal de facto does not see all of the agent's behavior forces a special formulation of the optimization problem faced by the principal. As far as we know, Sannikov was the first one to realize that this situation was not accommodated properly by the usual mathematical strong formulation of stochastic games and stochastic control. He proposed to use the weak formulation of stochastic control (also called the martingale approach to stochastic control) to set the principal's optimization problem. We now explain what we mean by weak formulation to accommodate moral hazard.
We denote by $X_t$ the state of the system at time $t$. We shall give detailed explanations of what the state is in each of the applications considered below. We assume that the system is controlled by one single controller who takes actions to implement their control. The actions of the controller affect the state of the system and the level of the cost / reward. This describes the set-up of a classical control problem. Games and stochastic games differ in the sense that several controllers can act on the same system, i.e. several controllers (players, agents) take actions. In this case, each player has a cost / reward to worry about, and the whole system is affected by the individual actions and their interactions. The weak formulation of these optimization problems, also called the martingale approach, is perfectly suited to information asymmetry and moral hazard. In this set-up the trajectories $t\hookrightarrow X_t(\omega)$ are not affected by the actions taken by the controllers. Only the likelihood of the scenarios given by the trajectories can be changed by the controls. In other words, only the distribution of the process $X=(X_t)_{0\le t\le T}$ is affected by controls. Put differently, the choice of a control is equivalent to the choice of a law for the state process.

Still, it may not be completely clear why the presence of moral hazard and the lack of symmetry of information suggest the use of the weak formulation. Given the terms of the contract, the Agent chooses controls which influence their own reward, as well as the rewards of the Principal. Let us denote by $\alpha_t$ the control (effort level) of the agent at time $t$. The agent sees the state $X_t$ as it is impacted by the control, and the agent optimizes their cost / reward according to the terms of the contract.

On the other hand, through the terms of the contract, the Principal chooses the remuneration conditions of the agent. But he or she does so without observing directly the effort level $\alpha_t$ of the agent, observing only partially the state $X_t$, and seeing the impact of the agent's effort only through the expected returns he or she is getting, de facto through the values of expected quantities.

Roughly speaking, the principal guesses the expected value of his or her returns through the distribution of the output of the agent. This is exactly what the weak formulation approach is trying to capture. See below for the mathematical details of this approach.

After this quick and informal review of moral hazard and classical contract theory, we now introduce the application we have in mind in this chapter. The main features of the new framework can be summarized as follows:
(1) ONE principal who devises one single contract, according to which incentives are given to, and/or penalties are imposed on,
(2) MULTIPLE agents who see the same contract and may accept it and work for the principal.

As before, the major assumptions can be captured in a small number of bullet points:
• The principal designs a contract hoping to maximize their own utility;
• The agents are rational so they will also try to maximize their own utilities;
• The agents have their reservation utilities and so they may decide to walk away;
• The agents are statistically identical in that their contracts are the same;
• They behave selfishly and maximize their utilities;
• We assume that they reach a Nash equilibrium;
• The Principal designs the contract anticipating that the agents will reach a Nash equilibrium.

In some sense, we can say that we are considering the problem of one principal contracting a field of agents. As before, the principal does not see (and cannot control) the individual actions taken by the agents. The principal only feels the overall expected value of the reward he or she gets from the actions of the agents. This information asymmetry creates the moral hazard which the model captures through the weak formulation of the optimization problem.

Below, we give mathematical details when the state space is the Euclidean space $\mathbb{R}^d$ (originally treated by Elie, Mastrolia and Possamaï in [ ]) and in the case of finite states treated by Carmona and Wang in [29, 28].
The weak formulation is best set up using the canonical representation of the state process by assuming that $\Omega$ is the space of continuous functions from $[0,T]$ to $E$ (typically $E=\mathbb{R}$ or $E=\mathbb{R}^d$), $W_t(\omega)=\omega(t)$ for $t\ge 0$ gives the coordinate process, $\mathbb{F}:=(\mathcal{F}_t)_{t\in[0,T]}$ is the natural filtration generated by the process $W=(W_t)_{t\ge 0}$, $\mu$ is a fixed probability measure on $E$ which serves as the initial distribution of the state, i.e. $X_0\sim\mu$, and $\mathcal{F}:=\mathcal{F}_T$ if we work on a finite time horizon $[0,T]$. $\mathbb{P}$ is the Wiener measure on $(\Omega,\mathbb{F},\mathcal{F})$ so that $W$ is a Wiener process, and $X_t=\xi+\int_0^t\sigma(s,X_\cdot)\,dW_s$ for some Lipschitz function $(s,x)\mapsto\sigma(s,x)$ which is assumed to be bounded from above and below away from $0$ uniformly in $s$ and $x$. In order to be consistent with the existing literature on the subject, we allow the coefficients to depend upon the past history of the states. We use the notation $X_\cdot$ and $x_\cdot$ to denote the whole trajectories of the state. Note also that:
$$dX_t=\sigma(t,X_\cdot)\,dW_t,\qquad\text{under }\mathbb{P}\tag{6.1}$$
irrespective of which control is chosen by the agent.

Next we introduce $\mathbb{A}$, the space of admissible control strategies (representing the agents' effort levels). The elements of $\mathbb{A}$ are adapted processes $\alpha=(\alpha_t)_{0\le t\le T}$ which may satisfy further properties to be specified later on. Next we introduce the drift function $b$ (the only part of the dynamics of the state controlled by the agent). We assume that the drift $(t,x,\alpha)\mapsto b(t,x_\cdot,\alpha)\in\mathbb{R}^d$ is bounded and progressively measurable. For each admissible control strategy $\alpha$, we denote by $\mathbb{P}^\alpha$ the state distribution when the effort level of the agent is $\alpha$. It is defined by its density with respect to the measure $\mathbb{P}$ given by:
$$\frac{d\mathbb{P}^\alpha}{d\mathbb{P}}=\mathcal{E}\Big[\int_0^T\sigma(t,X_\cdot)^{-1}b(t,X_\cdot,\alpha_t)\,dW_t\Big]$$
where $\mathcal{E}(M)=\exp\big[M_t-\tfrac12\langle M,M\rangle_t\big]$ denotes the Doléans exponential of the continuous square integrable martingale $M=(M_t)_{0\le t\le T}$. Girsanov's theorem implies that:
$$dX_t=b(t,X_\cdot,\alpha_t)\,dt+\sigma(t,X_\cdot)\,dW^\alpha_t,\qquad\text{under }\mathbb{P}^\alpha$$
where
$$W^\alpha_t=W_t-\int_0^t\sigma(s,X_\cdot)^{-1}b(s,X_\cdot,\alpha_s)\,ds$$
is a Brownian motion under the measure $\mathbb{P}^\alpha$. So the same state process $X$, constructed in (6.1) independently of the controls $\alpha$, now appears as the state of the process controlled by $\alpha$ if one looks at its evolution under the probability measure $\mathbb{P}^\alpha$. More on that remark below.

We now finalize the weak formulation of the problem by describing the behaviors of the agents in this set-up.

The principal offers a contract $(r,\xi)$ where
• $r=(r_t)_{0\le t\le T}$ is an adapted process representing the payment stream;
• $\xi$ is a random variable representing a terminal payment.

The agent decides whether or not to accept the contract and work for the principal, and if he or she does accept, chooses an effort level $\alpha=(\alpha_t)_{0\le t\le T}$ to maximize their expected overall reward:
$$J^{r,\xi}(\alpha,\mu)=\mathbb{E}^{\mathbb{P}^\alpha}\Big[U_A(\xi)+\int_0^T[u_A(r_t)-c(t,X_\cdot,\alpha_t)]\,dt\Big]$$
where $u_A$ is the agent's running utility, $U_A$ is the agent's terminal utility, and $c(t,x_\cdot,\alpha)$ is the cost for applying the effort level $\alpha$ at time $t$ when the history of the state is $x_{[0,t]}$.

Given this rational expected behavior of the agent, the optimization problem of the principal can be formulated in the following way. For each contract $(r,\xi)$, assuming knowledge of the utility and cost functions of the agent and assuming that the agent is rational, the Principal computes an optimal effort level $\alpha^*=(\alpha^*_t)_{0\le t\le T}$
$$\alpha^*\in\arg\sup_{\alpha\in\mathbb{A}}J^{r,\xi}(\alpha,\mu)$$
which the agent should choose, and then searches for an optimal contract $(r^*,\xi^*)$
$$(r^*,\xi^*)\in\arg\sup_{(r,\xi)}\mathbb{E}^{\mathbb{P}^{\alpha^*}}\Big[U_P\Big(X_T-\xi-\int_0^Tr_t\,dt\Big)\Big]$$
where $U_P$ is the (terminal) utility of the principal.

This is a typical instance of a Stackelberg game between the principal going first and the agent.
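The statement that a control only changes the law of the state, and not its trajectories, can be illustrated with a small Monte Carlo experiment. This is a sketch under simplifying assumptions which are ours and not part of the model above: the volatility $\sigma$ is a constant, the drift is $b(t,x,\alpha)=\alpha$ with a constant effort level $\alpha$, and the state starts from a deterministic point `x0`. The terminal state is sampled under $\mathbb{P}$, where it does not depend on $\alpha$, and expectations under $\mathbb{P}^\alpha$ are obtained by weighting each scenario with the Doléans exponential density. The function name is hypothetical.

```python
import math
import random


def weak_formulation_mean(x0, sigma, alpha, horizon, n_samples, seed=0):
    """Estimate E^{P^alpha}[X_T] from scenarios drawn under the reference measure P.

    Returns the weighted mean of X_T and the average weight (which should be
    close to 1, since the density integrates to 1 under P).
    """
    rng = random.Random(seed)
    theta = alpha / sigma  # sigma^{-1} b, constant under our assumptions
    sum_weights = 0.0
    sum_weighted_state = 0.0
    for _ in range(n_samples):
        w_T = rng.gauss(0.0, math.sqrt(horizon))
        # Under P the state has no drift: the scenario ignores alpha.
        x_T = x0 + sigma * w_T
        # Density dP^alpha/dP = exp(theta W_T - theta^2 T / 2).
        weight = math.exp(theta * w_T - 0.5 * theta ** 2 * horizon)
        sum_weights += weight
        sum_weighted_state += weight * x_T
    return sum_weighted_state / n_samples, sum_weights / n_samples
```

Under these assumptions $X_T=x_0+\alpha T+\sigma W^\alpha_T$ under $\mathbb{P}^\alpha$, so the weighted mean should be close to $x_0+\alpha T$ even though every simulated trajectory was generated without any drift.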
What Changes with a Large Number of Agents. We assume that the agents, while competing with each other, behave similarly (this is the form of symmetry assumption in force in mean field game models), and because of their large number, their individual influences on the aggregate quantities are negligible.

In these conditions, the optimization problem of the Principal can be formulated as before. Knowing the utility and cost functions of the agents, the Principal assumes that for each contract $(r,\xi)$, the agents settle in a Mean Field Nash Equilibrium, so for each $(r,\xi)$, the Principal
• solves the MFG of the agents,
• computes the effort level $\alpha^*=(\alpha^*_t)_{0\le t\le T}$ of the Nash Equilibria he or she can compute,
• then searches for an optimal contract $(r^*,\xi^*)$
$$(r^*,\xi^*)\in\arg\sup_{(r,\xi)}\mathbb{E}^{\mathbb{P}^{\alpha^*}}\Big[U_P\Big(X_T-\xi-\int_0^Tr_t\,dt\Big)\Big]$$
where $U_P$ is the (terminal) utility of the principal.

As before, this is a form of Stackelberg game between the principal going first, and the field of agents going next. But now, the dynamics of the state and the cost/reward functions depend upon the distribution of the state in the sense that:
$$b(t,X_t,\alpha_t,\mu_t)\qquad\text{and}\qquad c(t,X_t,\alpha_t,\mu_t)$$
where $\mu_t$ is the distribution at time $t$ of the state $X_t$ under $\mathbb{P}^\alpha$.

Details about the formulation of the problem and an example of a solvable model (essentially from the linear-quadratic family) can be found in [ ].

REMARK ] of Elie, Hubert, Mastrolia, and Possamaï for an attempt in this direction.
Next we consider the same contract theory model when the state space is finite. We go over a numerical application in detail to illustrate, with a few statics, the informational content of the equilibrium when we can actually compute it.
The framework of the above discussion is based on the theory of diffusion processes in continuous time and its application to problems of stochastic control. The states are living in Euclidean spaces, their dynamics are modeled by stochastic differential equations, and sophisticated tools from stochastic analysis, starting with Girsanov's theory of changes of measure, are brought to bear in order to formalize the asymmetry of information and the weak formulation appropriate for the optimization of the Principal. Stochastic dynamical systems taking values in finite state spaces are often used in applications for which numerical implementations are of crucial importance. Strangely enough, what seems like a simplification at first (after all, finite state spaces should be easier to handle than continuous spaces) may not always make the theoretical analysis easier. Here, we review recent works attempting to port the strategy outlined in the previous section to this case. In particular, we explain how to set up the weak formulation for Mean Field Games with finitely many states, and we implement the steps previously outlined in the diffusion case in the framework of finitely valued state processes.

REMARK , Section 7.2]. See also [ , Section 7.1.9] for a discussion of models with major and minor players, the analysis of which bears much resemblance to some of the steps taken to state and solve contract theory problems with a large number of agents. For the sake of completeness, we mention some of the works on finite state space mean field games which have appeared since then, and which we know of. The probabilistic approach to finite state mean field games is advocated by Cecchin and Fischer in [ ], Bayraktar and Cohen derived the equivalent of the master equation in [ ], and the convergence problem is studied in [ ] by Doncel, Gast and Gaujal, and in [ ] by Cecchin and Pelino. Finally, we note that these models can exhibit all sorts of behavior, as shown for example in [ ] where Cecchin, Dai Pra, Fischer and Pelino identify a two-state model without uniqueness.

The following is a review of results of Carmona and Wang borrowed from [29, 28].

The Canonical Process for the Discrete Case. In this section, we assume that the state space is the finite set $E=\{e_1,\dots,e_m\}$, where for the sake of mathematical convenience we shall assume that the $e_i$'s are the unit vectors of the canonical basis of $\mathbb{R}^m$. The state process $X=(X_t)_{0\le t\le T}$ will be a continuous-time Markov chain with $m$ states whose sample paths $t\to X_t$ are càdlàg, i.e. right continuous with left limits, and continuous at $T$ (i.e. $X_{T-}=X_T$). In analogy with the Euclidean state space case, we introduce the following canonical representation:
• $\Omega$ is the space of càdlàg functions from $[0,T]$ to $E$, continuous at $T$;
• $X_t(\omega):=\omega_t$ is the coordinate process;
• $\mathbb{F}:=(\mathcal{F}_t)_{t\in[0,T]}$ is the natural filtration generated by $X$;
• $p^\circ$ is a fixed probability on $E$;
• $\mathcal{F}:=\mathcal{F}_T$;
• $\mathbb{P}$ is the unique probability on $(\Omega,\mathbb{F},\mathcal{F})$ for which $X$ is a continuous-time Markov chain with initial distribution $p^\circ$ and transition rates between any two different states equal to $1$. So if $i\ne j$ and $\Delta t>0$,
$$\mathbb{P}[X_{t+\Delta t}=e_j\,|\,\mathcal{F}_t]=\mathbb{P}[X_{t+\Delta t}=e_j\,|\,X_t]\qquad\text{and}\qquad\mathbb{P}[X_{t+\Delta t}=e_j\,|\,X_t=e_i]=\Delta t+o(\Delta t).$$

Using the result of [38, 37] we see that the process $X$ has the representation:
$$X_t=X_0+\int_{(0,t]}Q^0\cdot X_{t-}\,dt+M_t,$$
where $Q^0$ is the square matrix whose entries are given by:
• $Q^0_{i,i}=-(m-1)$, $i=1,\dots,m$,
• $Q^0_{i,j}=1$ if $i\ne j$,
and $M=(M_t)_{t\ge 0}$ is an $\mathbb{R}^m$-valued $\mathbb{P}$-martingale. We sometimes use the symbol $\cdot$ to emphasize matrix multiplication. The predictable quadratic variation of the martingale $M$ under $\mathbb{P}$ is given by the formula:
$$\langle M,M\rangle_t=\int_0^t\psi_t\,dt,\tag{6.2}$$
where $\psi_t$ is given by:
$$\psi_t:=\mathrm{diag}(Q^0\cdot X_{t-})-Q^0\cdot\mathrm{diag}(X_{t-})-\mathrm{diag}(X_{t-})\cdot Q^0.\tag{6.3}$$

Players' Controls.
We assume that all the agents can take actions which are elements $\alpha$ of a closed convex subset $A$ of a Euclidean space $\mathbb{R}^k$. For any agent, the set $\mathbb{A}$ of admissible (control) strategies is the set of $A$-valued, $\mathbb{F}$-predictable processes $\alpha=(\alpha_t)_{0\le t\le T}$. The space of probability measures on the state space being the simplex
$$\mathcal{P}(E)=\mathcal{S}:=\Big\{p\in\mathbb{R}^m;\ \sum_{i=1}^mp_i=1,\ p_i\ge 0\Big\},$$
the controlled state processes will have dynamics determined by Q-matrices
$$Q(t,\alpha,p,\nu)=[q(t,i,j,\alpha,p,\nu)]_{1\le i,j\le m}$$
where $q$ is a function
$$[0,T]\times\{1,\dots,m\}^2\times A\times\mathcal{S}\times\mathcal{P}(A)\to q(t,i,j,\alpha,p,\nu)\in\mathbb{R}.$$
We shall assume that:
(i) $Q(t,\alpha,p,\nu)$ is a Q-matrix.
(ii) $0<C_1<q(t,i,j,\alpha,p,\nu)<C_2$.
(iii) For all $(t,i,j)\in[0,T]\times E^2$, $\alpha,\alpha'\in A$, $p,p'\in\mathcal{S}$ and $\nu,\nu'\in\mathcal{P}(A)$, we have:
$$|q(t,i,j,\alpha,p,\nu)-q(t,i,j,\alpha',p',\nu')|\le C\big(\|\alpha-\alpha'\|+\|p-p'\|+\mathcal{W}_1(\nu,\nu')\big)$$
where $\mathcal{W}_1$ is the 1-Wasserstein distance on $\mathcal{P}(A)$.

Assumption (i) is natural given that we start from a canonical process $X$ which is already a continuous time Markov chain. The strictly positive lower bound of assumption (ii) may appear to be restrictive at first, but if we understand that in fact, it is sufficient that it is satisfied for a given power of the matrix, this assumption guarantees that all states are attainable through appropriate actions, and this is a desirable feature for control problems to be solvable.
Finally, assumption (iii) is to be expected if one thinks of the mathematical analysis needed to study these models.

Now, given
• $\alpha=(\alpha_t)_{0\le t\le T}\in\mathbb{A}$,
• $p=(p_t)_{0\le t\le T}$ a flow of probability measures on $E$,
• $\nu=(\nu_t)_{0\le t\le T}$ a flow of probability measures on $A$,
we define the martingale $L^{(\alpha,p,\nu)}=(L^{(\alpha,p,\nu)}_t)_{0\le t\le T}$ by
$$L^{(\alpha,p,\nu)}_t:=\int_0^tX^*_{s-}\cdot\big(Q(s,\alpha_s,p_s,\nu_s)-Q^0\big)\cdot\psi^+_s\cdot dM_s.$$
Simple calculations show that
$$\Delta L^{(\alpha,p,\nu)}_t=X^*_{t-}\cdot\big(Q(t,\alpha_t,p_t,\nu_t)-Q^0\big)\cdot\psi^+_t\cdot\Delta X_t,$$
which is either $0$ when there is no jump at time $t$, or $q(t,i,j,\alpha_t,p_t,\nu_t)-1$ if the state jumps from state $i$ to state $j$ at time $t$. In any case, $\Delta L^{(\alpha,p,\nu)}_t\ge -1$. Also, the Doléans exponential $\mathcal{E}(L^{(\alpha,p,\nu)})$ is uniformly integrable, so we can apply the extension of Girsanov's theorem to processes with jumps, and define the probability measure $\mathbb{Q}^{(\alpha,p,\nu)}$ by its density with respect to $\mathbb{P}$:
$$\frac{d\mathbb{Q}^{(\alpha,p,\nu)}}{d\mathbb{P}}:=\mathcal{E}(L^{(\alpha,p,\nu)})_T,$$
which guarantees that the process $M^{(\alpha,p,\nu)}=(M^{(\alpha,p,\nu)}_t)_{0\le t\le T}$ defined as:
$$M^{(\alpha,p,\nu)}_t:=M_t-\int_0^t\big(Q^*(s,\alpha_s,p_s,\nu_s)-Q^0\big)\cdot X_{s-}\,ds,\tag{6.4}$$
is a $\mathbb{Q}^{(\alpha,p,\nu)}$-martingale, and the canonical decomposition of $X$ under $\mathbb{Q}^{(\alpha,p,\nu)}$ reads:
$$X_t=X_0+\int_0^tQ^*(s,\alpha_s,p_s,\nu_s)\cdot X_{s-}\,dt+M^{(\alpha,p,\nu)}_t,\tag{6.5}$$
showing that under $\mathbb{Q}^{(\alpha,p,\nu)}$, the stochastic intensity rate of $X$ is $Q(t,\alpha_t,p_t,\nu_t)$.
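The "simple calculations" behind the jump of $L^{(\alpha,p,\nu)}$ can be checked on a small example. The sketch below builds the reference matrix with off-diagonal entries $1$ and diagonal entries $-(m-1)$, evaluates $\psi$ from (6.3) when the chain sits in state $e_i$, and computes the jump for a transition from $i$ to $j$. Two points are assumptions of this sketch rather than statements from the text: $\psi^+$ is taken to be the Moore-Penrose pseudo-inverse, and its action on a jump, $\psi^+(e_j-e_i)=e_j-\frac1m\mathbf{1}$, is a closed form we derived by hand (the accompanying checks verify the two properties that characterize it, namely $\psi u=e_j-e_i$ and $u$ orthogonal to the kernel of $\psi$, which is spanned by the constant vector). All function names are hypothetical, and states are indexed from 0.

```python
def base_generator(m):
    """Reference Q-matrix: rate 1 between distinct states, -(m-1) on the diagonal."""
    return [[float(-(m - 1)) if i == j else 1.0 for j in range(m)] for i in range(m)]


def psi_at(Q0, i):
    """psi = diag(Q0 x) - Q0 diag(x) - diag(x) Q0 from (6.3), evaluated at x = e_i."""
    m = len(Q0)
    out = [[0.0] * m for _ in range(m)]
    for k in range(m):
        out[k][k] += Q0[k][i]  # diag(Q0 e_i)
        out[k][i] -= Q0[k][i]  # Q0 diag(e_i): only column i survives
        out[i][k] -= Q0[i][k]  # diag(e_i) Q0: only row i survives
    return out


def pinv_psi_times_jump(m, j):
    """Hand-derived psi^+ (e_j - e_i) = e_j - (1/m) * ones (it does not depend on i)."""
    return [(1.0 if k == j else 0.0) - 1.0 / m for k in range(m)]


def matvec(A, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def jump_of_L(Q, Q0, i, j):
    """Jump e_i^T (Q - Q0) psi^+ (e_j - e_i) of L when the state moves from i to j."""
    u = pinv_psi_times_jump(len(Q), j)
    return sum((Q[i][k] - Q0[i][k]) * u[k] for k in range(len(Q)))
```

For any Q-matrix `Q` (rows summing to zero), `jump_of_L(Q, Q0, i, j)` returns `Q[i][j] - 1`, which is bounded below by $-1$ as soon as the rates are non-negative.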
Noticethat X has still distribution p ◦ and if α t = φ ( t, X t − ) for some measurable function φ , X is a continuous-time Markov chain with jump rate intensity q ( t, i, j, φ ( t, i ) , p t , ν t ) underthe measure Q ( α , p , ν ) .So as explained earlier in our first mention of the weak formulation, the choice ofthe control of the agents does not affect the trajectories of the state process, but it doesinfluence the probability distribution, Q ( α , p , ν ) in the present case, which determines theexpected costs and rewards of the principal. Principal’s Optimization Problem . The reward of the Principal depends on the distri-bution of the agents’ states and the payments made to the agents. We use the notation • c : [0 , T ] × S → R for the running cost function • C : S → R for the terminal cost functiondefining the costs of the Principal. Now, assuming that all the agents choose α = ( α t ) ≤ t ≤ T as their control strategy, that the resulting flow of marginal distribution of the agents’ statesis p = ( p ( t )) t ∈ [0 ,T ] , and the contract offered by the principal is ( r , ξ ) , the principal’s ex-pected total cost is given by: J α , p ( r , ξ ) := E Q ( α , p ) (cid:34)(cid:90) T [ c ( t, p ( t )) + r t ] dt + C ( p ( T )) + ξ (cid:35) . Agents’ Mean Field Equilibria . We assume that, for a given contract ( r , ξ ) proposedby the principal, the agents reach a Nash equilibrium as defined rigorously in the followingstatement.D EFINITION ( ˆ α , ˆ p ) is a Nash equilibrium for the contract ( r , ξ ) , ( ˆ α , ˆ p ) ∈ N ( r , ξ ) in notation, if:(i) ˆ α is the best response to the behavior of the other agents, i.e. it minimizes the costwhen the agent is committed to the contract ( r , ξ ) and the flow of marginal distributionsof all the agents is given by the flow ˆ p : ˆ α = arg inf α ∈ A E Q ( α , ˆ p ) (cid:34)(cid:90) T [ c ( t, X t , α t , ˆ p ( t )) − u ( r t )] dt − U ( ξ ) (cid:35) . 
(ii) $(\hat{\boldsymbol{\alpha}},\hat{\mathbf{p}})$ satisfies the fixed point condition:
\[
\forall t \in [0,T], \qquad \hat p(t) = \mathbb{E}^{\mathbb{Q}^{(\hat{\boldsymbol{\alpha}},\hat{\mathbf{p}})}}[X_t]. \tag{6.6}
\]
Notice that this equation is equivalent to $\hat p_i(t) = \mathbb{Q}^{(\hat{\boldsymbol{\alpha}},\hat{\mathbf{p}})}[X_t = e_i]$ for all $t \in [0,T]$ and $i \in \{1,\dots,m\}$.

Principal's Optimal Contracting Problem. As we already explained, the Principal minimizes his or her total expected cost assuming the agents reach a Nash equilibrium. So we only consider contracts $(\mathbf{r},\xi)$ that result in at least one Nash equilibrium. We denote by $\mathcal{C}$ the set of all admissible contracts. To implement the participation constraint, we disregard the equilibria in which the agent's expected total cost is above a given threshold $\kappa$. This captures the take-it-or-leave-it behavior of the agents in contract theory: if their expected total cost exceeds the threshold, they should be able to turn down the contract. In summary, writing $J^{\mathbf{r},\xi}(\boldsymbol{\alpha},\mathbf{p})$ for the agent's expected total cost under the contract $(\mathbf{r},\xi)$, the optimization problem for the principal reads:
\[
V(\kappa) := \inf_{(\mathbf{r},\xi)\in\mathcal{C}} \;\; \inf_{\substack{(\boldsymbol{\alpha},\mathbf{p})\in\mathcal{N}(\mathbf{r},\xi)\\ J^{\mathbf{r},\xi}(\boldsymbol{\alpha},\mathbf{p})\le\kappa}} \mathbb{E}^{\mathbb{Q}^{(\boldsymbol{\alpha},\mathbf{p})}}\left[\int_0^T \bigl[c_0(t,p(t)) + r_t\bigr]\,dt + C_0(p(T)) + \xi\right].
\]

Solving the Individual Agent Optimization Problem. For the agent's optimization problem, we introduce the Hamiltonian $H : [0,T] \times E \times \mathbb{R}^m \times A \times \mathcal{S} \times \mathbb{R} \to \mathbb{R}$ defined by:
\[
H(t,x,z,\alpha,p,r) := c(t,x,\alpha,p) - r + x^*\bigl(Q(t,\alpha,p) - Q^0\bigr) z ,
\]
and $H_i(t,z,\alpha,p,r) = H(t,e_i,z,\alpha,p,r)$. We assume that there exists a unique minimizer $\hat\alpha_i(t,z,p)$ of $\alpha \mapsto H_i(t,z,\alpha,p,r)$ and that it is uniformly Lipschitz in $z$, and we use the notations:
\[
\hat H_i(t,z,p,r) = H_i(t,z,\hat\alpha_i(t,z,p),p,r) \qquad\text{and}\qquad \hat H(t,x,z,p,r) = \sum_{i=1}^m \hat H_i(t,z,p,r)\,\mathbf{1}_{x=e_i}
\]
for the minimized Hamiltonians.

BSDEs driven by Markov Chains. Following the strategy at the root of the weak formulation of stochastic control problems, we introduce the BSDEs:
\[
Y_t = -U(\xi) + \int_t^T H\bigl(s,X_{s-},Z_s,\alpha_s,p(s),u(r_s)\bigr)\,ds - \int_t^T Z_s^*\, d\mathcal{M}_s , \tag{6.7}
\]
\[
Y_t = -U(\xi) + \int_t^T \hat H\bigl(s,X_{s-},Z_s,p(s),u(r_s)\bigr)\,ds - \int_t^T Z_s^*\, d\mathcal{M}_s , \tag{6.8}
\]
and we prove the following representation theorems by inspection. Notice that in the present situation, the BSDEs are driven by continuous-time Markov chains.

LEMMA. For each fixed contract $(\mathbf{r},\xi)$, $\boldsymbol{\alpha} \in \mathbb{A}$ and measurable mapping $\mathbf{p} : [0,T] \to \mathcal{S}$,
(i) the BSDE (6.7) admits a unique solution $(Y,Z)$ and we have $J^{\mathbf{r},\xi}(\boldsymbol{\alpha},\mathbf{p}) = \mathbb{E}^{\mathbb{P}}[Y_0]$;
(ii) the BSDE (6.8) admits a unique solution $(Y,Z)$ and we have $\inf_{\boldsymbol{\alpha}\in\mathbb{A}} J^{\mathbf{r},\xi}(\boldsymbol{\alpha},\mathbf{p}) = \mathbb{E}^{\mathbb{P}}[Y_0]$. In addition, the optimal control of the agent is $\hat\alpha(t,X_{t-},Z_t,p(t))$.

Nash Equilibria as Solutions of BSDEs. Let $(Y,Z,\boldsymbol{\alpha},\mathbf{p},\mathbb{Q})$ be a solution to the McKean-Vlasov BSDE system:
\begin{align}
Y_t &= -U(\xi) + \int_t^T \hat H\bigl(s,X_{s-},Z_s,p(s),u(r_s)\bigr)\,ds - \int_t^T Z_s^*\, d\mathcal{M}_s , \tag{6.9}\\
\mathcal{E}_t &= 1 + \int_0^t \mathcal{E}_{s-}\, X^*_{s-}\bigl(Q(s,\alpha_s,p(s)) - Q^0\bigr)\psi^+_s\, d\mathcal{M}_s , \tag{6.10}\\
\alpha_t &= \hat\alpha(t,X_{t-},Z_t,p(t)) , \tag{6.11}\\
p(t) &= \mathbb{E}^{\mathbb{Q}}[X_t] , \qquad \frac{d\mathbb{Q}}{d\mathbb{P}} = \mathcal{E}_T , \tag{6.12}
\end{align}
where a solution is understood in the following sense:
• $Y$ is an adapted càdlàg process such that $\mathbb{E}^{\mathbb{P}}\bigl[\int_0^T Y_t^2\,dt\bigr] < +\infty$,
• $Z$ is an adapted square integrable left-continuous process,
• $\boldsymbol{\alpha} \in \mathbb{A}$, $\mathbf{p} : [0,T] \to \mathcal{S}$ is measurable, and $\mathbb{Q}$ is a probability on $\Omega$.
The following result links the solution of the McKean-Vlasov BSDE (6.9)-(6.12) to the Nash equilibria of the agents.

THEOREM. If the BSDE (6.9)-(6.12) admits a solution $(Y,Z,\boldsymbol{\alpha},\mathbf{p},\mathbb{Q})$, then $(\boldsymbol{\alpha},\mathbf{p})$ is a Nash equilibrium. Conversely, if $(\hat{\boldsymbol{\alpha}},\hat{\mathbf{p}})$ is a Nash equilibrium, then the BSDE (6.9)-(6.12) admits a solution $(Y,Z,\boldsymbol{\alpha},\mathbf{p},\mathbb{Q})$ such that $\boldsymbol{\alpha} = \hat{\boldsymbol{\alpha}}$, $d\mathbb{P}\otimes dt$-a.e., and $p(t) = \hat p(t)$, $dt$-a.e.

Principal's Optimal Contracting Problem. Recall the optimization problem for the principal:
\[
V(\kappa) := \inf_{(\mathbf{r},\xi)\in\mathcal{C}} \;\; \inf_{\substack{(\boldsymbol{\alpha},\mathbf{p})\in\mathcal{N}(\mathbf{r},\xi)\\ J^{\mathbf{r},\xi}(\boldsymbol{\alpha},\mathbf{p})\le\kappa}} \mathbb{E}^{\mathbb{Q}^{(\boldsymbol{\alpha},\mathbf{p})}}\left[\int_0^T \bigl[c_0(t,p(t)) + r_t\bigr]\,dt + C_0(p(T)) + \xi\right].
\]
Unfortunately, this problem is totally intractable as stated, so we transform it into a more familiar control problem. This is often called the Sannikov trick. Its nature was clearly elucidated by Cvitanic, Possamaï and Touzi in [ ]. We consider the following system of (forward) McKean-Vlasov SDEs:
\begin{align}
Y_t &= Y_0 - \int_0^t \hat H\bigl(s,X_{s-},Z_s,p(s),u(r_s)\bigr)\,ds + \int_0^t Z_s^*\, d\mathcal{M}_s , \tag{6.13}\\
\mathcal{E}_t &= 1 + \int_0^t \mathcal{E}_{s-}\, X^*_{s-}\bigl(Q(s,\alpha_s,p(s)) - Q^0\bigr)\psi^+_s\, d\mathcal{M}_s , \tag{6.14}\\
\alpha_t &= \hat\alpha(t,X_{t-},Z_t,p(t)) , \tag{6.15}\\
p(t) &= \mathbb{E}^{\mathbb{Q}}[X_t] , \qquad \frac{d\mathbb{Q}}{d\mathbb{P}} = \mathcal{E}_T . \tag{6.16}
\end{align}
This is the same type of equations as before, except that we write the dynamics of $Y$ in the forward direction of time. That makes the whole difference. Indeed, if we denote its solution by $(Y^{(Z,\mathbf{r},Y_0)}, Z^{(Z,\mathbf{r},Y_0)}, \boldsymbol{\alpha}^{(Z,\mathbf{r},Y_0)}, \mathbf{p}^{(Z,\mathbf{r},Y_0)}, \mathbb{P}^{(Z,\mathbf{r},Y_0)})$, with $\mathbb{P}^{(Z,\mathbf{r},Y_0)}$ standing for the measure $\mathbb{Q}$ of (6.16), the expectation under $\mathbb{P}^{(Z,\mathbf{r},Y_0)}$ by $\mathbb{E}^{(Z,\mathbf{r},Y_0)}$, and if we consider the optimal control problem:
\[
\tilde V(\kappa) := \inf_{\mathbb{E}^{\mathbb{P}}[Y_0]\le\kappa} \;\; \inf_{\substack{Z\in\mathcal{H}^2_X\\ \mathbf{r}\in\mathcal{R}}} \mathbb{E}^{(Z,\mathbf{r},Y_0)}\left[\int_0^T \bigl[c_0\bigl(t,p^{(Z,\mathbf{r},Y_0)}(t)\bigr) + r_t\bigr]\,dt + C_0\bigl(p^{(Z,\mathbf{r},Y_0)}(T)\bigr) + U^{-1}\bigl(-Y^{(Z,\mathbf{r},Y_0)}_T\bigr)\right],
\]
then, as a direct consequence of the previous theorem, we have $\tilde V(\kappa) = V(\kappa)$.

A Class of Solvable Models. While informative at the theoretical level, the above results remain of little practical value if they cannot be implemented in the solution of practical problems.
In this respect, it is rewarding to discover that under a reasonable set of assumptions, computable solutions can be identified. Here is an example. We fix $p^\circ \in \mathcal{S}$, we assume that the space of actions is a bounded interval, say $A := [\underline{\alpha}, \bar{\alpha}] \subset \mathbb{R}_+$, and that the transition rates are linear in the control in the sense that:
\begin{align*}
q(t,i,j,\alpha,p) &:= \bar q_{i,j}(t,p) + \lambda_{i,j}(\alpha - \underline{\alpha}), \qquad \text{for } i \ne j,\\
q(t,i,i,\alpha,p) &:= -\sum_{j\ne i} q(t,i,j,\alpha,p),
\end{align*}
where
• $\lambda_{i,j} \in \mathbb{R}_+$ for all $i \ne j$, and $\sum_{j\ne i}\lambda_{i,j} > 0$ for all $i$,
• $\bar q_{i,j} : [0,T] \times \mathcal{S} \to \mathbb{R}_+$ are continuous mappings for all $i \ne j$.
Furthermore, we assume that the agent running costs are of the form:
\[
c(t,e_i,\alpha,p) := c_1(t,e_i,p) + \frac{\gamma_i}{2}\,\alpha^2 ,
\]
where $\gamma_i > 0$, and the mapping $(t,p) \mapsto c_1(t,e_i,p)$ is continuous for all $i \in \{1,\dots,m\}$. Finally, we assume that the utility function $u$ of the continuous reward is continuous, concave and increasing, and that the utility of the terminal reward is linear, say $U(\xi) = \xi$. Under these conditions it is possible to show that the minimizer of the Hamiltonian is given by:
\[
\hat\alpha(t,e_i,z,p) = \hat\alpha(e_i,z) = b\Bigl(-\frac{1}{\gamma_i}\sum_{j\ne i}\lambda_{i,j}(z_j - z_i)\Bigr),
\]
for $i \in \{1,\dots,m\}$, where $b(z) := \min\{\max\{z,\underline{\alpha}\},\bar{\alpha}\}$. Under these assumptions, one can
• reduce the problem to the optimal control of a flow of probability measures,
• and construct an optimal contract!
See [ ] for details. We illustrate this result on a concrete example.

A Simple Model of Epidemic Containment. I feel compelled to offer a disclaimer before presenting the gory details of the model I propose to use as illustration. Peiqi Wang and I concocted this model over three years ago for the purpose of illustrating the inner workings of the theory and the analytic computations presented in [ ]. In the Spring of 2020, when the paper was accepted for publication in Management Science, the Editor in Chief asked if we could add a discussion to highlight the relevance of this kind of model to the understanding of the COVID-19 pandemic. We obliged, and while doing so, I realized the potential of these new tools to inform policy makers in the control of the spread of epidemics, and the localized re-opening of an economy after shut-down. Given the dire conditions in which we are finding ourselves at this very moment, Aurell, Dayanikli, Laurière and I embarked on a systematic investigation of what extensions of the model could bring to the understanding of the health and economic consequences of regulations. This effort resulted in [ ]. The reader interested in applications of a similar equilibrium view to epidemic control can also consult the recent work of Elie, Hubert and Turinici [ ].

Below, we present the model originally introduced in [ ], where many numerical illustrations show the influence of the various parameters of the model and, in particular, how the contract proposed by the regulator can influence the propensity of the agents to move from one city to another.

A regulator tries to control the spread of a virus over a time period $[0,T]$. The jurisdiction of the regulator consists of two cities, say $A$ and $B$. Each individual is either infected ($I$) or healthy ($H$), and lives in city $A$ or $B$.
So the state space of the model is $E = \{AI, AH, BI, BH\}$ and we denote by $\pi_{AI}, \pi_{AH}, \pi_{BI}, \pi_{BH}$ the proportions of individuals in each of these states.

To describe the time evolution of the state of each individual we introduce the following assumptions:
(1) The rate of contracting the virus depends on the proportion of infected individuals in the city, so the
 – transition rate from state $AH$ to state $AI$ is $\theta^-_A\bigl(\frac{\pi_{AI}}{\pi_{AI}+\pi_{AH}}\bigr)$,
 – transition rate from state $BH$ to state $BI$ is $\theta^-_B\bigl(\frac{\pi_{BI}}{\pi_{BI}+\pi_{BH}}\bigr)$.
(2) The rate of recovery is a function of the proportion of healthy individuals in the city, so the
 – transition rate from state $AI$ to state $AH$ is $\theta^+_A\bigl(\frac{\pi_{AH}}{\pi_{AI}+\pi_{AH}}\bigr)$,
 – transition rate from state $BI$ to state $BH$ is $\theta^+_B\bigl(\frac{\pi_{BH}}{\pi_{BI}+\pi_{BH}}\bigr)$.
(3) Each individual can try to move to the other city: we denote by $\nu_I\alpha$ the transition rates between the states $AI$ and $BI$, and by $\nu_H\alpha$ the transition rates between the states $AH$ and $BH$.
(4) The status of infection does not change when an individual moves between cities.

The non-negative functions $\theta^-_A$, $\theta^-_B$, $\theta^+_A$ and $\theta^+_B$ are increasing and differentiable on $[0,1]$. They characterize the quality of health care in the cities $A$ and $B$. So we can change their parameters to make it more or less attractive to individuals to move from one city to the other, or to stay put. In any case, with these simple prescriptions, the Q-matrix of the system reads:
\[
Q(t,\alpha,\pi) :=
\begin{array}{c|cccc}
 & AI & AH & BI & BH \\ \hline
AI & \cdots & \theta^+_A\bigl(\frac{\pi_{AH}}{\pi_{AI}+\pi_{AH}}\bigr) & \nu_I\alpha & 0 \\
AH & \theta^-_A\bigl(\frac{\pi_{AI}}{\pi_{AI}+\pi_{AH}}\bigr) & \cdots & 0 & \nu_H\alpha \\
BI & \nu_I\alpha & 0 & \cdots & \theta^+_B\bigl(\frac{\pi_{BH}}{\pi_{BI}+\pi_{BH}}\bigr) \\
BH & 0 & \nu_H\alpha & \theta^-_B\bigl(\frac{\pi_{BI}}{\pi_{BI}+\pi_{BH}}\bigr) & \cdots
\end{array}
\]
where each diagonal entry, shown as $\cdots$, equals minus the sum of the other entries of its row.
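To make the transition structure concrete, here is a minimal Python sketch that assembles this Q-matrix and pushes the proportions $\pi$ forward in time with an explicit Euler step for $\dot\pi_i = \sum_j \pi_j\, q(t,j,i,\alpha_j,\pi)$. Everything numerical in it is an assumption made for illustration only: the linear forms chosen for $\theta^\pm_A$ and $\theta^\pm_B$, the values of $\nu_I$ and $\nu_H$, the constant controls, the initial proportions, and the function names `q_matrix` and `evolve`. It does not compute an equilibrium or an optimal contract.

```python
STATES = ("AI", "AH", "BI", "BH")


def q_matrix(alpha, pi, theta_minus, theta_plus, nu_I, nu_H):
    """Q-matrix in the state order (AI, AH, BI, BH).

    alpha[i] is the control used by individuals currently in state i.
    theta_minus / theta_plus map a city name to its infection / recovery rate function.
    """
    pAI, pAH, pBI, pBH = pi
    inf_A = pAI / (pAI + pAH)          # infected proportion in city A
    inf_B = pBI / (pBI + pBH)          # infected proportion in city B
    Q = [[0.0] * 4 for _ in range(4)]
    Q[0][1] = theta_plus["A"](1.0 - inf_A)   # AI -> AH (recovery)
    Q[0][2] = nu_I * alpha[0]                # AI -> BI (move)
    Q[1][0] = theta_minus["A"](inf_A)        # AH -> AI (infection)
    Q[1][3] = nu_H * alpha[1]                # AH -> BH (move)
    Q[2][0] = nu_I * alpha[2]                # BI -> AI (move)
    Q[2][3] = theta_plus["B"](1.0 - inf_B)   # BI -> BH (recovery)
    Q[3][1] = nu_H * alpha[3]                # BH -> AH (move)
    Q[3][2] = theta_minus["B"](inf_B)        # BH -> BI (infection)
    for i in range(4):
        # Diagonal entry: minus the sum of the other entries of the row.
        Q[i][i] = -sum(Q[i][j] for j in range(4) if j != i)
    return Q


def evolve(pi0, alpha, T, n_steps, **model):
    """Explicit Euler scheme for d(pi_i)/dt = sum_j pi_j * Q[j][i]."""
    pi = list(pi0)
    dt = T / n_steps
    for _ in range(n_steps):
        Q = q_matrix(alpha, pi, **model)
        pi = [pi[i] + dt * sum(pi[j] * Q[j][i] for j in range(4)) for i in range(4)]
    return pi


# Hypothetical parameters: city B has milder infection and faster recovery.
model = dict(
    theta_minus={"A": lambda x: 0.8 * x, "B": lambda x: 0.4 * x},
    theta_plus={"A": lambda x: 0.2 + 0.3 * x, "B": lambda x: 0.3 + 0.5 * x},
    nu_I=0.5,
    nu_H=1.0,
)
pi0 = [0.10, 0.40, 0.05, 0.45]
pi_T = evolve(pi0, [0.2] * 4, 1.0, 1000, **model)
print(dict(zip(STATES, pi_T)))
```

Because every row of the Q-matrix sums to zero, the scheme preserves the total mass of $\pi$, and with all controls set to zero the population of each city stays constant, as assumption (4) suggests.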
We now introduce the costs, first for the agents:
\begin{align}
c_1(t,AI,\pi) = c_1(t,AH,\pi) &:= \phi_A\Bigl(\frac{\pi_{AI}}{\pi_{AI}+\pi_{AH}}\Bigr), \tag{6.17}\\
c_1(t,BI,\pi) = c_1(t,BH,\pi) &:= \phi_B\Bigl(\frac{\pi_{BI}}{\pi_{BI}+\pi_{BH}}\Bigr), \tag{6.18}\\
\gamma_{AI} = \gamma_{BI} := \gamma_I, &\qquad \gamma_{AH} = \gamma_{BH} := \gamma_H, \tag{6.19}
\end{align}
where $\phi_A$ and $\phi_B$ are two increasing functions, and next for the regulator (namely the Principal), for whom the running and terminal costs are given in the form:
\begin{align}
c_0(t,\pi) &= \exp(\sigma_A\pi_{AI} + \sigma_B\pi_{BI}), \tag{6.20}\\
C_0(\pi) &= \sigma_P \cdot (\pi_{AI} + \pi_{AH} - \pi^0_A)^2, \tag{6.21}
\end{align}
where $\pi^0_A$ is the population of city $A$ at time $0$. Choosing the values of the parameters $\sigma_A$, $\sigma_B$ and $\sigma_P$ allows the regulator to
• trade off the control of the epidemic against population planning,
• try to minimize the infection rate of both cities.
In fact, $\sigma_A$, $\sigma_B$ and $\sigma_P$ weigh the relative importance the regulator attributes to each of these objectives.

The analysis of this model reduces to the solution of an explicit forward-backward system of Ordinary Differential Equations (ODEs) which can easily be solved numerically, allowing for the computation of comparative statics of the model. Numerical illustrations are provided in [ ]. It is natural and enlightening to compare the equilibrium computed from the solution of the principal-agent problem to the Nash equilibrium of the mean field game reached by the individuals in the absence of the regulator. In its absence, the states of the individuals are still governed by the same transition rates, but the individuals' rewards or penalties from the authority are not present in the objective functions they optimize. In other words, they selfishly minimize their expected costs
\[
\mathbb{E}^{\mathbb{Q}^{(\boldsymbol{\alpha},\pi)}}\left[\int_0^T c(t,X_t,\alpha_t,\pi(t))\,dt\right], \tag{6.22}
\]
as part of a regular mean field game, and it is plain to compute some of the numerical characteristics of its Nash equilibrium.
Note that this formula for the expected costs of the agents does not contain the payment stream $\mathbf{r} = (r_t)_{0\le t\le T}$ and the terminal payment $\xi$, which enter the costs to the agents as part of the covenants between them and the principal. This comparison is very much in the spirit of the computation of the so-called Price of Anarchy in classical game theory.

Following the analytical approach to finite-state mean field games introduced in [ ], it is straightforward to derive the system of forward-backward ODEs characterizing the Nash equilibrium. See the system of ODEs (12)-(13) in [ ] or [ , Section 7.2]. In the particular case of the model discussed in this section, the exact form of this system of ODEs is given in the appendix of [ ].

References
1. Y. Achdou, F. Buera, J.M. Lasry, P.L. Lions, and B. Moll, Partial differential equation models in macroeconomics, Philosophical Transactions of the Royal Society (2014).
2. Y. Achdou, P.N. Giraud, J.M. Lasry, and P.L. Lions, A long-term mathematical model for mining industries, Applied Mathematics and Optimization (2016), 579–618.
3. Y. Achdou, J. Han, J.M. Lasry, P.L. Lions, and B. Moll, Income and wealth distribution in macroeconomics: A continuous-time approach.
4. A McKean-Vlasov approach to distributed electricity generation development, Mathematical Methods of Operations Research (2019), 269–310.
5. The entry and exit game in the electricity markets: a mean field game approach, Tech. report, arXiv:2004.14057, 2020.
6. C. Alasseur, I. Ben Tahar, and A. Matoussi, An extended mean field game for storage in smart grids, arXiv:1710.08991, 2017.
7. R. Almgren and N. Chriss, Optimal execution of portfolio transactions, Journal of Risk (2001), 5–39.
8. A. Aurell, R. Carmona, G. Dayanikli, and M. Laurière, Optimal incentives to mitigate epidemics: a Stackelberg mean field game approach, Tech. report, https://arxiv.org/abs/2011.03105, in preparation.
9. O. Bahn, A. Haurie, and R. Malhamé, Limit game models for climate change negotiations, Advances in Dynamic and Mean Field Games (J. Apaloo and B. Viscolani, eds.), Annals of the International Society of Dynamic Games, vol. 15, Birkhäuser, 2017, pp. 27–47.
10. E. Bayraktar and A. Cohen, Analysis of a finite state many player game using its master equation, Tech. report, arXiv:1707.02648, 2017.
11. C. Bertucci, Optimal stopping in mean field games, an obstacle problem approach, Tech. report, arXiv:1704.06553v2, 2017.
12. C. Bertucci, L. Bertucci, J.M. Lasry, and P.L. Lions, Mean field game approach to bitcoin mining, Tech. report, arXiv:2004.08167v1, 2020.
13. M. Brunnermeier and L. Pedersen, Predatory trading, Journal of Finance (2005), 1825–1863.
14. M. Brunnermeier and Y. Sannikov, On the optimal inflation rate, vol. 106, 2016, pp. 484–489.
15. M. Brunnermeier and Y. Sannikov, Macro, money and finance: A continuous time approach, (2017), 1497–1546.
16. B. Bueler, Solving an equilibrium model for trade of CO2 emission permits, European Journal of Operational Research (1997), no. 2, 393–403.
17. C. Benazzoli, L. Campi, and L. Di Persio, Mean field games with controlled jump-diffusion dynamics: Existence results and an illiquid interbank market model, Stochastic Processes and their Applications (2020).
18. P. Cardaliaguet and C.A. Lehalle, Mean field game of controls and an application to trade crowding, Tech. report, arXiv:1610.09904.
19. B. Carlin, M. Lobo, and S. Viswanathan, Episodic liquidity crises: Cooperative and predatory trading, Journal of Finance (2007), 2235–2274.
20. R. Carmona, G. Dayanikli, and M. Laurière, Mean field game models for renewable investment in the electricity markets, Tech. report, Princeton University, 2020.
21. R. Carmona and F. Delarue, Probabilistic theory of mean field games: Vol. I, Mean field FBSDEs, control, and games, Stochastic Analysis and Applications, Springer Verlag, 2017.
22. R. Carmona and F. Delarue, Probabilistic theory of mean field games: Vol. II, Mean field games with common noise and master equations, Stochastic Analysis and Applications, Springer Verlag, 2017.
23. R. Carmona, F. Delarue, and D. Lacker, Mean field games of timing and models for bank runs, Applied Mathematics and Optimization (2017), 217–260.
24. R. Carmona, M. Fehr, J. Hinz, and A. Porchet, Market design for emissions markets trading schemes, SIAM Review (2010), 403–452.
25. R. Carmona, J.P. Fouque, M. Moussavi, and L.H. Sun, Systemic risk and stochastic games with delay, Tech. report, 2018.
26. R. Carmona, J.P. Fouque, and L.H. Sun, Mean field games and systemic risk: a toy model, Communications in Mathematical Sciences (2015), 911–933.
27. R. Carmona and D. Lacker, A probabilistic weak formulation of mean field games and applications, Annals of Applied Probability (2015), 1189–1231.
28. R. Carmona and P. Wang, Finite-state contract theory with a principal and a field of agents, Management Science (2020).
29. R. Carmona and P. Wang, A probabilistic approach to extended finite state mean field games, Mathematics of Operations Research (2020).
30. R. Carmona and K. Webster, The self financing condition in high frequency markets, Finance & Stochastics (2019), 729–759.
31. R. Carmona and J. Yang, Predatory trading: a game on volatility and liquidity, Quantitative Finance (under revision) (2011).
32. A. Cartea, S. Jaimungal, and J. Penalva, Algorithmic and high-frequency trading, Mathematics, Finance and Risk, Cambridge University Press, 2015.
33. A. Cecchin and M. Fischer, Probabilistic approach to finite state mean field games, Applied Mathematics & Optimization (2018).
34. A. Cecchin and G. Pelino, Convergence, fluctuations and large deviations for finite state mean field games via the master equation, Tech. report, arXiv:1707.01819, 2018.
35. A. Cecchin, P. Dai Pra, M. Fischer, and G. Pelino, On the convergence problem in mean field games: A two state model without uniqueness, SIAM Journal on Control and Optimization (2020), 2443–2466.
36. P. Chan and R. Sircar, Fracking, renewables & mean field games, SIAM Review (2017), 588–615.
37. S. Cohen and R. Elliott, Comparisons for backward stochastic differential equations on Markov chains and related no-arbitrage conditions, The Annals of Applied Probability, no. 1, 267–311.
38. S. Cohen and R. Elliott, Solutions of backward stochastic differential equations on Markov chains, Communications in Stochastic Analysis (2008), no. 2, 251–262.
39. J. Cvitanic, D. Possamaï, and N. Touzi, Dynamic programming approach to principal-agent problems, Tech. report, September 2017.
40. J. Cvitanic, D. Possamaï, and N. Touzi, Dynamic programming approach to principal-agent problems, Finance and Stochastics (2018), 1–37.
41. B. Djehiche, J. Barreiro-Gomez, and H. Tembine, Electricity price dynamics in the smart grid: A mean-field-type game perspective, 23rd International Symposium on Mathematical Theory of Networks and Systems, Hong Kong University of Science and Technology, 2018.
42. J. Doncel, N. Gast, and B. Gaujal, Discrete mean field games: Existence of equilibria and convergence, Tech. report, arXiv:1909.01209, 2019.
43. R. Elie, E. Hubert, T. Mastrolia, and D. Possamaï, Mean-field moral hazard for optimal energy demand response management, Tech. report, arXiv:1902.10405, 2019.
44. R. Elie, E. Hubert, and G. Turinici, Contact rate epidemic control of COVID-19: an equilibrium approach, Mathematical Modelling of Natural Phenomena (2020).
45. R. Elie, T. Mastrolia, and D. Possamaï, A tale of a principal and many many agents, Mathematics of Operations Research (2019), 440–467.
46. K. Fong, O. Gossner, J. Hörner, and Y. Sannikov, Coordination under private monitoring: from bank runs to the prisoner's dilemma, Tech. report, Princeton University, 2014.
47. J.P. Fouque and J. Langsam (eds.), Handbook on systemic risk, Cambridge University Press, 2013.
48. G. Bouveret, R. Dumitrescu, and P. Tankov, Mean-field games of optimal stopping: a relaxed solution approach, (2020).
49. P.N. Giraud, O. Guéant, J.M. Lasry, and P.L. Lions, A mean field game model of oil production in presence of alternative energy producers, Tech. report, to appear.
50. M. Golosov, J. Hassler, P. Krusell, and A. Tsyvinski, Optimal taxes on fossil fuel in general equilibrium, Econometrica (2014), 41–88.
51. D.A. Gomes, J. Mohr, and R.R. Souza, Continuous time finite state mean field games, Applied Mathematics & Optimization (2013), 99–143.
52. P.J. Graber and A. Bensoussan, Existence and uniqueness of solutions for Bertrand and Cournot mean field games, Tech. report, arXiv:1508.05408v1, 2015.
53. O. Guéant, J.M. Lasry, and P.L. Lions, Mean field games and oil production, Finance and Sustainable Development: Seminar's lectures (2009).
54. O. Guéant, J.M. Lasry, and P.L. Lions, Mean field games and applications, Paris Princeton Lectures in Mathematical Finance IV (R. Carmona et al., ed.), Lecture Notes in Mathematics, vol. 2003, Springer Verlag, 2010.
55. A. Haurie and L. Viguier, A stochastic dynamic game of carbon emissions trading, Environmental Modeling and Assessment (2003), no. 3, 239–248.
56. Z. He and W. Xiong, Dynamic debt runs, Review of Financial Studies (2012), 1799–1843.
57. J. Hinz, An equilibrium model for electricity auctions, Appl. Math. (Warsaw) (2003), 243–249.
58. B. Holmstrom, Moral hazard and observability, The Bell Journal of Economics (1979), no. 1, 74–91.
59. B. Holmstrom, Moral hazard in teams, The Bell Journal of Economics (1982), no. 2, 324–340.
60. B. Holmstrom and P. Milgrom, Aggregation and linearity in the provision of inter-temporal incentives, Econometrica (1987), 303–328.
61. B. Holmstrom and P. Milgrom, Multitask principal-agent analyses: Incentive contracts, asset ownership, and job design, Journal of Law, Economics, & Organization (1991), 24–52.
62. J. Kambhu, S. Weidman, and N. Krishnan (eds.), New directions for understanding systemic risk: A report on a conference cosponsored by the Federal Reserve Bank of New York and the National Academy of Sciences, National Research Council, 2007.
63. P. Krusell and A. Smith, Jr., Income and wealth heterogeneity in the macroeconomy, Journal of Political Economy (1998), 867–896.
64. A. Lachapelle, J.M. Lasry, C.A. Lehalle, and P.L. Lions, Efficiency of the price formation process in presence of high frequency participants: a mean field games analysis, Mathematics and Financial Economics (2016), 223–262.
65. Z. Li, M. Reppen, and R. Sircar, A mean field games model for cryptocurrency mining, Tech. report, arXiv:1912.01952v1, 2019.
66. M. Ludkovski and R. Sircar, Game theoretic models for energy production, Commodities, Energy and Environmental Finance, Springer, 2015.
67. J. Mirrlees, The optimal structure of incentives and authority within an organization, The Bell Journal of Economics (1976), no. 1, 105–131.
68. S. Morris and H.S. Shin, Unique equilibrium in a model of self-fulfilling currency attacks, American Economic Review (1998), 587–597.
69. S. Nadtochiy and M. Shkolnikov, Mean field systems on networks, with singular interaction through hitting times, Annals of Probability (2020), 1520–1556.
70. M. Nutz, A mean field game of optimal stopping, Tech. report, 2018.
71. J.C. Rochet and X. Vives, Coordination failures and the lender of last resort, Journal of the European Economic Association (2004), 1116–1148.
72. Y. Sannikov, A continuous-time version of the principal-agent problem, The Review of Economic Studies (2008), no. 3, 957–984.
73. Y. Sannikov, Contracts: The theory of dynamic principal-agent relationships and the continuous-time approach, 10th World Congress of the Econometric Society, 2012.
74. A. Shrivats, D. Firoozi, and S. Jaimungal, A mean field game approach to equilibrium pricing, optimal generation, and trading in solar renewable energy certificate (SREC) markets, Tech. report, arXiv:2003.04938, 2020.

DEPARTMENT OF OPERATIONS RESEARCH & FINANCIAL ENGINEERING, BENDHEIM CENTER FOR FINANCE, PROGRAM IN APPLIED & COMPUTATIONAL MATHEMATICS, PRINCETON UNIVERSITY