[PDF] A mean-field game model for homogeneous flocking

Abstract

Empirically derived continuum models of collective behavior among large populations of dynamic agents are a subject of intense study in several fields, including biology, engineering and finance. We formulate and study a mean-field game model whose behavior mimics an empirically derived non-local homogeneous flocking model for agents with gradient self-propulsion dynamics. The mean-field game framework provides a non-cooperative optimal control description of the behavior of a population of agents in a distributed setting. In this description, each agent's state is driven by optimally controlled dynamics that result in a Nash equilibrium between itself and the population. The optimal control is computed by minimizing a cost that depends only on its own state, and a mean-field term. The agent distribution in phase space evolves under the optimal feedback control policy. We exploit the low-rank perturbative nature of the non-local term in the forward-backward system of equations governing the state and control distributions, and provide a linear stability analysis demonstrating that our model exhibits bifurcations similar to those found in the empirical model. The present work is a step towards developing a set of tools for systematic analysis, and eventually design, of collective behavior of non-cooperative dynamic agents via an inverse modeling approach.

Full PDF

AA mean-ﬁeld game model for homogeneous ﬂocking

Piyush Grover, Kaivalya Bakshi,

1, 2 and Evangelos A. Theodorou Mitsubishi Electric Research Labs, Cambridge, MA 02139,USA a)2)

Aerospace Engineering, Georgia Tech, Atlanta, GA 30332,USA (Dated: 30 May 2018)

Empirically derived continuum models of collective behavior among large popula-tions of dynamic agents are a subject of intense study in several ﬁelds, includingbiology, engineering and ﬁnance. We formulate and study a mean-ﬁeld game modelwhose behavior mimics an empirically derived nonlocal homogeneous ﬂocking modelfor agents with gradient self-propulsion dynamics. The mean-ﬁeld game frameworkprovides a non-cooperative optimal control description of the behavior of a popu-lation of agents in a distributed setting. In this description, each agent’s state isdriven by optimally controlled dynamics that result in a Nash equilibrium betweenitself and the population. The optimal control is computed by minimizing a costthat depends only on its own state, and a mean-ﬁeld term. The agent distributionin phase space evolves under the optimal feedback control policy. We exploit thelow-rank perturbative nature of the nonlocal term in the forward-backward systemof equations governing the state and control distributions, and provide a closed-looplinear stability analysis demonstrating that our model exhibits bifurcations similarto those found in the empirical model. The present work is a step towards developinga set of tools for systematic analysis, and eventually design , of collective behavior ofnon-cooperative dynamic agents via an inverse modeling approach. a) Electronic mail: [email protected] a r X i v : . [ n li n . AO ] M a y hile the analysis of emergent behavior in a large population of dynamic agentsis a classical topic, the design of desired macroscopic behavior in such systems isa grand engineering challenge. Such systems are often studied using continuummodels, involving empirically derived systems of nonlinear partial diﬀerentialequations that govern the distribution of agents in the phase space. The vari-ous terms in these equations represent intrinsic dynamics of the agents, mutualattraction and/or repulsion, and noise. An important class of such models con-cern ﬂocking, both in nature, and engineering applications such as bio-inspiredcontrol of multi-agent robotics, traﬃc modeling, power-grid synchronization etc.We take a mean-ﬁeld game approach to derive a control system that mimics thebehavior of one such class of models in the setting of non-cooperative agents. Amean-ﬁeld game is a coupled system of partial diﬀerential equations that governthe state and optimal control distributions of a representative agent in a Nashequilibrium with the population. Using a linear stability analysis, we recoverphase transitions that have been observed in the corresponding empirical model,as well as ﬁnd some new ones, as the control penalty is changed.I. INTRODUCTION Continuum models of large populations of interacting dynamic agents are popular inmathematical biology , and also have been employed in numerous applications such as multi-agent robotics , ﬁnance and traﬃc modeling . The aim of such models is to accuratelyrepresent the macroscopic dynamics of the population, and its dependence on parameters.Typically, such models are derived by starting with an empirical dynamical system for arepresentative agent. This system typically involves the intrinsic dynamics of the agent, acoupling function describing its interaction with the population, and noise. From this singleagent dynamical system, a continuum description is obtained by deriving a macroscopicequation for the distribution of agents in the phase space. We call this class of models uncontrolled .An alternative way of deriving continuum models of collective behavior is via a corre-sponding variational principle. In this approach, the dynamical system for a representative2 n FIG. 1: The MFG frameworkagent includes its intrinsic dynamics, a control term and noise. The unknown control termis obtained as a solution to an optimization problem. Within this variational (or optimiza-tion) framework for large populations, there are multiple classes of modeling strategies . Ifone takes a centralized global optimization viewpoint, the corresponding problem is that of mean-ﬁeld control , i.e. it is assumed that each agent is being controlled by a central entitywhose goal is to optimize a macroscopic cost function that includes interaction among thepopulation. In a distributed setting, there is no central entity, and the agents can either be cooperative or non-cooperative . In the former case, each agent choses its control to optimizea global sum of cost functions of the population.On the other hand, in the non-cooperative mean ﬁeld setting that we are interested in,each agent optimizes only its individual cost function. This cost function involves couplingwith the population solely via a mean-ﬁeld term. This is the setting of mean-ﬁeld games(MFG) . In this setting, a Hamilton-Jacobi-Bellman (HJB) equation (posed backwardin time) characterizes the optimal feedback control for a representative agent under the as-sumption that the (cost) coupling function depends only on its own state, and possibly time.A Fokker-Planck (FP) equation governs the evolution of agent density in phase space. Aconsistency principle requires that the coupling function used in the agent HJB equation isreproduced as its own average over the continuum of agents. Under fairly general conditions,solutions to MFG model can be shown to possess (cid:15) -Nash property, i.e., unilateral beneﬁt ofany deviation from the computed control policy by a single agent vanishes rapidly as thepopulation becomes large.The classical (uncontrolled) Cucker-Smale (CS) ﬂocking model describes a system of3nite population of coupled agents with trivial intrinsic dynamics, moving solely underthe inﬂuence of an alignment force, and noise. This was followed by several continuumdescriptions , and was recently generalized to a continuum model with self-propulsioneﬀects in the homogeneous case (i.e., assuming spatial homogeneity). This latter general-ization results in existence of non-zero mean velocity distribution resulting from symmetrybreaking, a wide range of ‘disordered’ states consisting of multiple ﬂocks, and other phasetransitions.A MFG model for a continuum of coupled Kuramoto oscillators was described in aseminal work that inﬂuences the development in the current paper. Building upon thiswork, a MFG model for the classical inhomogeneous CS was then proposed ; the stabilityanalysis was partially addressed. This was followed by a homogeneous ﬂocking MFG modelfor coupled agents with trivial intrinsic dynamics, along with linear and nonlinear stabilityanalysis . Also of interest is an approach where agents apply a gradient descent ratherthan solve an HJB equation, since the Nash equilibria of the MFG are recovered undercertain conditions using this approach.The contributions of this paper are as follows. We formulate a MFG model for homo-geneous ﬂocking of agents driven by self-propulsion and noise. In contrast to the earlierwork on homogeneous MFG model with trivial intrinsic dynamics , this model exhibitsphase transitions (bifurcations) that mimic those present in the corresponding uncontrolledmodel . We generalize the stability analysis developed in previous MFG models to agents with gradient nonlinear dynamics, and employ a method used to study reaction-diﬀusion equations to derive a semi-analytical stability criterion. Besides qualitativelyexplaining the phase transition phenomena, quantitative results useful in control designare obtained from the numerical analysis. Decreasing the control control penalty below athreshold causes the zero mean velocity steady state of the MFG model to lose stability viapitchfork bifurcation . This results in a pair of stable steady states with non-zero meanvelocity. If the control is made even cheaper, a new stable regime (nonexistent in the un-controlled model) emerges for zero mean velocity steady states in the small noise case via asubcritical pitchfork bifurcation. 4 I. UNCONTROLLED FORMULATION

We brieﬂy review here the uncontrolled formulation from Ref. 14 which provides a ho-mogeneous model for CS ﬂocking with self-propulsion. Consider a population of N agentsmoving in phase space (( q, p ) ∈ R ), where each agent is acted upon by a gradient self-propulsion term, a CS coupling force with localization kernel K in position space that alignsthe agents’ velocity with the neighbors, and noise. The dynamics for i th agent are dq i = p i dt,dp i = a ( p i ) dt + F ( q i , p i , q − i , p − i ) dt + σdω i , where a ( p i ) = − ∂ p U ( p i ), U ( p i ) = α ( p i − p i ), F ( q i , p i , q − i , p − i ) = 1 N (cid:80) Nj =1 K ( q i , q j )( p j − p i ) (cid:80) Nj =1 K ( q i , q j ) ,σ > α > K ( q, q (cid:48) ) = K ( q (cid:48) , q ) ≥ , and K ( q, q ) = 1, q − i = { q , . . . , q i − , q i +1 . . . } , p − i = { p , . . . , p i − , p i +1 . . . } .In the continuum limit ( N → ∞ ), the agent density f ( q, p, t ) in phase space is governed by ∂ t f + ∂ q ( pf ) + ∂ p ( a ( p ) f + F [ f ] f ) = σ ∂ pp f, where F [ f ]( q, p, t ) = (¯ p − p ) and,¯ p ( q, t ) = (cid:82) (cid:82) K ( q, q (cid:48) ) p f ( q (cid:48) , p, t ) dq (cid:48) dp (cid:82) (cid:82) K ( q, q (cid:48) ) f ( q (cid:48) , p, t ) dq (cid:48) dp . We denote the action of the operator F on a function f by F [ f ]( . ). From here onwards,we consider the homogeneous case by dropping dependence on q , and use x to denote thevelocity p . The uncontrolled dynamics for the velocity of agent i are dx i = a ( x i ) dt + 1 N N (cid:88) j =1 ( x j − x i ) dt + σdω i , (1)with corresponding density evolution ∂ t f = ∂ x ( α ( x − xf + ( x − ¯ x ) f ) + σ ∂ xx f, (2)where ¯ x ( t ) = (cid:82) xf ( x, t ) dx (cid:82) f ( x, t ) dx . A. Fixed Points and Stability Analysis

It is known that ﬁxed points of Eq. (2) are given by f ∞ ( x ; µ ) = 1 Z exp (cid:18) − σ (cid:20) α x − α ) x − µx (cid:21)(cid:19) , (3)5here µ ∈ R is the mean of the distribution, and Z is the normalization factor. For allpositive values of parameters ( σ, α ), the zero mean velocity solution f ∞ ( · ,

0) always exists.For a range of parameters, two additional stable non-zero mean velocity solutions are createdvia a supercritical bifurcation, resulting in loss of stability of the zero mean solution. In Ref.14, these stability properties were inferred numerically by a Monte-Carlo approach.We take a diﬀerent approach, and consider the spectral stability of steady state solutionsof Eq. (2). In addition to gaining additional insight into the properties of the uncontrolledsystem, this also sets the stage for stability analysis of the MFG system in the next section.We consider perturbations of the form f ( x, t ) = f ∞ ( x )(1 + (cid:15) ˜ f ( x, t )). Then, the linearizationof Eq. (2) is ∂ t ˜ f ( x, t ) = L [ ˜ f ]( x, t ) = L loc [ ˜ f ]( x, t ) + L nonloc [ ˜ f ]( x, t ) , where, ˆ U ( x ) = U ( x ) + x / − µx , L loc [ ˜ f ]( x, t ) = − ∂ x ˆ U ( x ) ∂ x ˜ f ( x, t ) + ( σ / ∂ xx ˜ f ( x, t )is a local linear operator, and L nonloc [ ˜ f ]( x, t ) = 2 σ ∂ x ˆ U ( x ) (cid:90) y ˜ f ( y, t ) f ∞ ( y ) dy is a nonlocal linear operator. An operator O is called nonlocal if O [ f ]( x ) depends on f ( x )(or the derivatives ∂ x f ( x ) , ∂ xx f ( x )) for some x (cid:54) = x , and local otherwise. Let q ( x ) ≡ σ ∂ x ˆ U ( x ). Then, ∂ x f ∞ ( x ) = − q ( x ) f ∞ ( x ). We deﬁne a Hilbert space H = L ( R , f ∞ dx ), i.e.,the f ∞ -weighted inner-product space of square-integrable functions on the real line. Thenwe can write a general form of the full linearized operator as L [ ˜ f ]( x, t ) = L loc [ ˜ f ]( x, t ) + s ( x ) (cid:104) g ( · ) , ˜ f ( · , t ) (cid:105) , (4)where s ( x ) = q ( x ) , g ( x ) = x for our case, and the inner product is understood to be (cid:104)· , ·(cid:105) H .We note that L loc is a self-adjoint operator on H which has a non-positive discrete realspectrum of the form 0 = λ > λ > λ . . . . It has a complete set of orthogonal eigen-functions { ξ i ( x ) } i ∈ N . The ﬁrst eigenfunction ξ , spanning the kernel of L loc , is a constantfunction. Following the approach presented in Refs. 21 and 25 for nonlocal eigenvalue prob-lems in reaction-diﬀusion equations (also see Ref. 26), we consider the following eigenvalue6roblem λw = L loc w + s ( x ) (cid:104) g , w (cid:105) = ⇒ L loc − λI ) w + s ( x ) (cid:104) g , w (cid:105) . (5)Note that an eigenfunction w of L satisfying (cid:104) w, g (cid:105) = 0 is also an eigenfunction of L loc ,i.e. w = v i for some i with eigenvalue λ = λ i . We search for eigenfunctions such that (cid:104) w, g (cid:105) is nonzero. The corresponding eigenvalues are called ‘moving’ eigenvalues in Ref. 25.Multiplying both sides of Eq. (5) with the resolvent R λ = ( L loc − λI ) − ,0 = w + R λ s ( x ) (cid:104) g , w (cid:105) . Taking the inner product of the above equation with g ,0 = (cid:104) g , w (cid:105) + (cid:104) R λ s ( x ) , g (cid:105)(cid:104) g , w (cid:105) . (6)For an arbitrary function z ( x ) , R λ z = (cid:80) ∞ i =1 (cid:104) ξ i , z (cid:105) λ i − λ ξ i . Evaluating the inner product inEqs. 6, (cid:104) R λ s ( x ) , g (cid:105) = (cid:104) R λ q ( x ) , x (cid:105) = ∞ (cid:88) i =2 (cid:104) ξ i , q ( x ) (cid:105) λ i − λ (cid:104) ξ i , x (cid:105) . (7)Using this result in Eq. (6), (cid:104) w, x (cid:105) (1 + ∞ (cid:88) i =2 (cid:104) ξ i , q ( x ) (cid:105) λ i − λ (cid:104) ξ i , x (cid:105) ) = 0 . Hence, either (cid:104) w, x (cid:105) = 0, or 1 + (cid:80) ∞ i =2 (cid:104) ξ i , q ( x ) (cid:105) λ i − λ (cid:104) ξ i , x (cid:105) = 0. But we are looking for movingeigenvalues, i.e. w s.t. (cid:104) w, x (cid:105) (cid:54) = 0, hence the eigenvalue equation reduces to: h ( λ ) ≡ ∞ (cid:88) i =2 (cid:104) ξ i , q ( x ) (cid:105) λ i − λ (cid:104) ξ i , x (cid:105) = 0 . (8)A suﬃcient condition for Eq. (8) to have only real roots is that the function h ( λ ) isHerglotz, or equivalently, the product (cid:104) ξ i , q ( x ) (cid:105)(cid:104) ξ i , x (cid:105) has the same sign for all i . Using inte-gration by parts on eigenvalue equation for L loc , one can show that (cid:104) ξ i , x (cid:105) = − σ λ i (cid:104) ξ i , q ( x ) (cid:105) .Thus the Herglotz condition is satisﬁed since λ i < i > a) (0,0) xxxx (b) (c) (d) FIG. 2:

Uncontrolled system. a) Stable (solid) and unstable (dashed) zero mean steady state solutionsfor α = 0 . α = 1 . L loc (o) and L ( × ) for atypical zero mean case. The ﬁrst eigenvalue (= 0) is omitted. Notice that alternating eigenvalues are samefor both operators. The arrows indicate the direction of motion of the other (‘moving’) eigenvalues of L as σ is reduced. The rightmost eigenvalue of L reaches 0 at σ = σ c ( α ) with non-zero speed. c) Non-zero meansolutions d) The µ > µ = 0 solution (dashed) via a supercritical pitchforkbifurcation as σ occurs is reduced below σ c ( α ). Numerical Results:

We use Chebfun to perform all computations. The non-zeromean steady state solutions to Eq. (2) are computed using a simple ﬁxed point iteration for µ . The solutions are shown in Figure 2. The supercritical pitchfork bifurcation that occursas σ is reduced below critical value σ c ( α ), is shown for a range of α values. To evaluate h ( λ )in Eq. (8), we compute the spectrum of L loc for µ = 0. The odd-numbered eigenfunctionsare even functions of x , and hence (cid:104) ξ k +1 , g (cid:105) = 0. Therefore, eigenvalues λ k +1 of L loc are also eigenvalues of L , and the eigenvalues λ k are moving eigenvalues. We ﬁnd that at σ = σ c ( α ), h (0) = 0. Hence, as σ is decreased below σ c ( α ), the least stable eigenvalue λ of L loc moves to the positive real axis due to the eﬀect of the nonlocal term, resulting ininstability of the zero mean solution. 8 II. MFG FORMULATION

In this section we describe a MFG formulation for homogeneous equation Eq. (2). Thevelocity of i th agent evolves via the following equation (compare with Eq (1)) dx i ( t ) = a ( x i ) dt + u i ( t ) dt + σdω i ( t ) , (9)where u i is the optimal control. Let F [ x i , x − i ]( t ) ≡ ( x i − N − (cid:80) j (cid:54) = i x j ) , and β = 1 rσ ,where r > x − i ≡ { x , x , . . . , x i − , x i +1 , . . . } . Then the i th agent is minimizing the following long time average cost J = lim sup T →∞ T (cid:20)(cid:90) T βF [ x i , x − i ]( t ) + 12 σ u i ( t ) (cid:21) dt, that depends on states of all other agents.To derive the MFG equations (recall Fig. 1), we rewrite the single-agent cost in terms ofˆ F ( x i , t ), the unknown coupling function with dependence on x i only J = lim sup T →∞ T (cid:20)(cid:90) T β ˆ F ( x i , t ) + 12 σ u i ( t ) (cid:21) dt. The resulting single agent HJB equation is ∂ t v i ( x, t ) = c − β ˆ F ( x, t ) − a ( x ) ∂ x v i ( x, t )+ σ ∂ x v i ( x, t )) − σ ∂ xx v i ( x, t ) , (10)where v i ( x, t ) is the single-agent relative value function, c is the minimum average cost, and u i ( x, t ) = − σ ∂ x v i ( x i , t ) given in feedback form. Note that the HJB equation is well-posedbackward in time. The self-consistency principle yields the expression for ˆ F in terms ofagent density f ( x, t ) (in the limit N → ∞ ):ˆ F [ f ]( x, t ) = (cid:20)(cid:90) ( x − y ) f ( y, t ) dy (cid:21) . (11)Hence, the following set of FP-HJB MFG equations govern the density and value functionevolution: ∂ t f ( x, t )+ ∂ x (cid:2)(cid:0) a ( x ) − σ ∂ x v ( x, t ) (cid:1) f ( x, t ) (cid:3) = σ ∂ xx f ( x, t ) , (12) ∂ t v ( x, t ) = c − β ˆ F [ f ]( x, t ) − a ( x ) ∂ x v ( x, t )+ σ ∂ x v ( x, t )) − σ ∂ xx v ( x, t ) . (13)9he unique invariant density satisfying Eq. (12) is f ∞ ( x ) = 1 Z exp( − σ ( U ( x ) + σ v ∞ ( x ))) . (14)Inserting this expression into Eq. (13), and using the Cole-Hopf transformation φ ( x ) =exp( − v ∞ ( x )), results in the following nonlinear nonlocal eigenvalue problem for φ ( x ) cφ ( x ) = β { x − (cid:90) y exp( − σ U ( y )) φ ( y ) dy } φ ( x ) − a ( x ) φ ( x ) − σ ∂ xx φ ( x ) , (15)with the constraint (cid:82) exp( − σ U ( y )) φ ( y ) dy = 1 to ensure normalization of f ∞ . The groundstate of this problem yields the desired steady state solutions, with corresponding eigenvaluebeing the minimum cost c . A. Stability Analysis

In this section we extend the resolvent based analysis from section II A to the MFG sys-tem, and ﬁnd conditions for closed-loop stability of an arbitrary steady state ( f ∞ ( x ) , v ∞ ( x ))to an initial perturbation in density. We consider mass preserving perturbations in density ofthe form f ( x, t ) = f ∞ ( x )(1+ (cid:15) ˜ f ( x, t )), i.e., the initial conditions satisfy (cid:82) f ∞ ( x ) ˜ f ( x, dx = 0.The perturbed value function is taken to be of the form v ( x, t ) = v ∞ ( x ) + (cid:15) ˜ v ( x, t ). A givensteady state is called linearly stable if any perturbation to the density decays to zero underthe action of the control, where both the density and control evolution are computed usinglinearized MFG equations.Linearization of MFG equations (12,13) yields the nonlocal system  ∂ t ˜ f ( x, t ) ∂ t ˜ v ( x, t )  = L F Bloc  ˜ f ( x, t )˜ v ( x, t )  , (16)where L F B = L F Bloc + L F Bnonloc , L F Bloc =  L loc L loc − L loc  , L F Bnonloc =  βs (cid:104) g , . (cid:105)  ,s ( x ) = x − µ, g ( x ) = x , L loc = − ∂ x ( ˆ U ) ∂ x + σ ∂ xx , with eigenvalue/eigenfunction pairsdenoted by { λ i , ξ i } , and ˆ U ( x ) = U ( x ) + σ v ∞ ( x ) in analogy with the deﬁnition of L loc inSection II. In addition to the Hilbert space H = L ( R , f ∞ dx ) and R λ as deﬁned earlier, wealso consider a subspace ¯ H = { f ∈ H |(cid:104) f, (cid:105) = 0 } .10 . Eigenspectrum of the linearized forward-backward operator We start oﬀ by noting that the characteristic equation of L F Bloc is ( L loc − λI )( L loc + λI ) = 0.Hence, its eigenvalues are ∪ i ∈ N {± λ i } . Now consider the eigenvalue problem for L F B witheigenvalue λ and eigenfunction [ w f ( x ) w v ( x )] T : λ  w f w v  =  L loc w f + 2 L loc w v βs (cid:104) g , w f (cid:105) − L loc w v  . (17)Assuming λ (cid:54)∈ ∪ i ∈ N {± λ i } , R λ and R − λ are well deﬁned. The second equation of Eq. (17)gives w v = 2 βR − λ s (cid:104) g , w f (cid:105) . Substituting this expression in the ﬁrst equation of Eq. (17), and re-arranging, w f = − βR λ L loc R − λ s (cid:104) g , w f (cid:105) . (18)Taking the inner product of the above equation with g , (cid:104) g , w f (cid:105) (1 + 4 β (cid:104) g , R λ L loc R − λ s (cid:105) ) = 0 . The eigenvalue equation for the (cid:104) g , w f (cid:105) (cid:54) = 0 case for moving eigenvalues (as in SectionII A) is h ( λ ) ≡ β (cid:104) g , R λ L loc R − λ s (cid:105) = 0 . (19)Using the deﬁnition of resolvent in Eq. (19), h ( λ ) = 1 + 4 β ∞ (cid:88) i =2 λ i (cid:104) g , ξ i (cid:105)(cid:104) s , ξ i (cid:105) λ i − λ , (20)and hence, h ( λ ) = 1 + 4 β ∞ (cid:88) i =2 λ i (cid:104) x, ξ i (cid:105) λ i − λ . (21)Since Eq. (21) is Herglotz in λ , this implies that the eigenvalues come in pairs, either realor purely imaginary. Let ω ≡ h (0) = 1 + 4 β (cid:80) ∞ i =2 (cid:104) x, ξ i (cid:105) λ i . Lemma III.1.

Consider the eigenvalue equation h ( λ ) = 0 for moving eigenvalues. i) If (cid:104) x, ξ i (cid:105) (cid:54) = 0 for all i ≥ , then there exists a pair of real roots ± δ i for each i ≥ ,such that λ i +1 < δ i < λ i .(ii) Recall that λ = 0 . If (cid:104) x, ξ (cid:105) (cid:54) = 0 and ω > , there exists a pair of real roots ± δ , suchthat λ < δ < .(iii) If (cid:104) x, ξ (cid:105) (cid:54) = 0 and ω < , there exists a pair of purely imaginary roots ± iγ .Proof. (i) Consider the interval I i = ( λ i +1 , λ i ). As λ → λ − i , h ( λ ) → ∞ , and as λ → λ + i +1 , h ( λ ) → −∞ . It is easy to check that h ( λ ) is monotonic in I i . By intermediate valuetheorem, a root δ i exists in I i , and by the monotonicity property, it is unique. Theresult for − δ i follows by symmetry.(ii) Consider the interval I = ( λ , λ → λ +2 , h ( λ ) → −∞ , and as λ → − , h ( λ ) → ω . Hence, if w >

0, arguments similar to those in part (i) yield the existenceof a real root δ between λ and 0.(iii) Consider the function h ( iγ ) for real γ >

0. Clearly, h is monotonic in this interval.Furthermore, as γ → ∞ , h ( iγ ) →

0, and as γ → + , h ( iγ ) → ω . By arguments similarto those in part (i), ω < iγ of h .

2. Contraction analysis of the linearized forward-backward operator

Since the MFG system has a forward-backward nature, spectral information alone isinsuﬃcient to derive conclusions about the stability of steady state solutions. A contractionanalysis is therefore adopted following Refs. 10 and 16. Consider the linear dynamicalsystem given by Eq. (16), with initial perturbation in density f ( x,

0) = f ∞ (1 + (cid:15) ˜ f ( x, v ( x, T ) → T → ∞ , the conditions for existence of a unique solutionsatisfying this assumption are derived. These conditions also provide a stability criterion.Integrating the ˜ v equation in Eq. (16) from t to T ,˜ v ( x, T ) = e − L loc ( T − t ) ˜ v ( x, t )+ 2 β (cid:90) Tt e − L loc ( T − s ) s ( x ) (cid:104) g , ˜ f ( ., s ) (cid:105) ds. Taking the limit T → ∞ , 12 v ( x, t ) = − βe − L loc t (cid:90) ∞ t e L loc s s ( x ) (cid:104) g ( . ) , ˜ f ( ., s ) (cid:105) ds. (22)Substituting above equation in the ˜ f equation, ∂ t ˜ f ( x, t ) = L loc ˜ f ( x, t ) − βL loc e − L loc t (cid:90) ∞ t e L loc s s ( x ) (cid:104) g ( . ) , ˜ f ( ., s ) (cid:105) ds. (23)Integrating from 0 to t yields the ﬁxed point equation,˜ f ( x, t ) = e L loc t ˜ f ( x,

0) + M ˜ f ( x, t ) , (24)where the operator M acting on ˜ f ( x, t ) is deﬁned as M ˜ f ( x, t ) = − βe L loc t (cid:90) r = tr =0 e − L loc r L loc e − L loc r (cid:90) s = ∞ s = r e L loc s s ( x ) (cid:104) g ( . ) , ˜ f ( ., s ) (cid:105) dsdr. (25)Applying the Laplace transform in time to Eq. (25),ˆ M ( λ ) = − βR λ L loc R − λ s (cid:104) g , . (cid:105) . (26)The operator norm (cid:107) M (cid:107) is given by (cid:107) M (cid:107) = sup λ ∈ I sup (cid:107) ˜ f (cid:107) =1 (cid:107) ˆ M ( λ ) ˜ f (cid:107) , (27)=4 β sup λ ∈ I sup (cid:107) ˜ f (cid:107) =1 (cid:107) ∞ (cid:88) i =2 λ i (cid:104) s , ξ i (cid:105)(cid:104) g , ˜ f (cid:105) λ i − λ ξ i (cid:107) , =4 β sup λ ∈ I sup (cid:107) ˜ f (cid:107) =1 (cid:118)(cid:117)(cid:117)(cid:116) ∞ (cid:88) i =2 (cid:34) λ i (cid:104) s , ξ i (cid:105)(cid:104) g , ˜ f (cid:105) λ i − λ (cid:35) , =4 β (cid:107) g (cid:107) (cid:118)(cid:117)(cid:117)(cid:116) ∞ (cid:88) i =2 (cid:104) s , ξ i (cid:105) λ i = 4 β (cid:107) x (cid:107) (cid:118)(cid:117)(cid:117)(cid:116) ∞ (cid:88) i =2 (cid:104) x, ξ i (cid:105) λ i . (28)Lemma III.2 proved next implies that (cid:107) M (cid:107) < f ∞ ( x ) , v ∞ ( x )) of the nonlinear MFG system Eqs. (12,13) to be linearly stable to densityperturbations. 13 a) xxx x x x x x (0,0) (b) (c) FIG. 3:

The MFG system with σ = 0 .

5. a) Zero mean MFG steady state densities for α = 0 . α = 1 . L F Bloc (o) and L F B ( × ) for atypical zero mean case. The twin zero eigenvalues of L F B are omitted. The arrows indicate the directionof motion of the ‘moving’ eigenvalues of L F B as r is reduced starting from r > r sup ( α, σ ). Note that theassumption in Lemma III.1(i) is violated in this particular case due to the symmetric nature of theself-propulsion term, and hence, only alternating eigenvalues are actually ‘moving’. The pair of eigenvaluesof L F B closest to imaginary axis, ± δ , reaches 0 at r = r sup , and moves up/down the imaginary axis for r < r sup . (Bottom) The µ > µ = 0 solution (dashed) via a supercriticalpitchfork bifurcation as r is reduced below r sup . c) Non-zero mean MFG steady state densities on thesupercritical branch. Lemma III.2.

Consider the initial value problem for the linearized system in Eqs. 16, withmass-preserving initial condition ˜ f ( x, i.e., (cid:82) f ∞ ( x ) ˜ f ( x, dx = 0 . If the operator M is acontraction (i.e., (cid:107) M (cid:107) < ), then the perturbation in density, ˜ f ( ., t ) , decays to as t → ∞ .Moreover, ˜ v ( ., t ) also decays to as t → ∞ .Proof. If M is a contraction, then we can (formally) invert the Eq. (24), and write theunique solution ˜ f ( x, t ) = ( I − M ) − e L loc t ˜ f ( x, I + M + M + . . . ) e L loc t ˜ f ( x, . (29)We note that mass conservation property is equivalent to (cid:104) ˜ f ( x, , (cid:105) = 0, i.e. ˜ f ( x, ∈ ¯ H .Recall that L loc restricted to ¯ H is a self-adjoint operator with negative eigenvalues λ i , i =14 a) (b) (c) FIG. 4:

The MFG system with σ = 0 .

5. a) The norm of operator M for zero mean steady state ascontrol penalty is varied, for various α . b). The bifurcation diagram for α = 1 .

5, showing supercritical andsubcritical (inset) bifurcations . Only the µ > α = 1 . , , . . . . Then, lim t →∞ (cid:107) e L loc t (cid:107) ¯ H = lim t →∞ e λ t = 0. This proves the decay of ˜ f ( ., t ). Thecorresponding result for ˜ v ( ., t ) is obtained by inserting the expression for ˜ f ( ., t ) into Eq.(22).Now consider a case where eigenvalue equation in Eq. (18) has a pair of purely imaginaryroots ± iγ ( (cid:54) = 0). Then there is a eigenfunction z f s.t. z f = − βR iγ L loc R − iγ s (cid:104) g , z f (cid:105) = ˆ M ( iγ ) z f , by noting Eq. (26). But this implies that norm of ˆ M is at least 1, hence it is not a contraction.This implies that a necessary condition for M to be a contraction is the absence of non-zerospectra of L F B on the imaginary axis.

B. Numerical Results

Recall that in the MFG problem described by Eqs. (12,13), the representative agent isminimizing a weighted sum of two costs: one penalizes deviation of its velocity from themean velocity of the agent population, and the other penalizes the control action. In thissection, we compute ﬁxed points, and identify phase transitions of this system of equationsas the problem parameters are varied. Rather than solving the resulting constrained non-linear eigenvalue problem 15 directly, we use an iterative algorithm to compute steady state15olutions of the MFG system.We note that the coupling term ˆ F [ f ]( x, t ) evaluated at any steady state density f ∞ is ˆ F [ f ∞ ]( x ) = ( x − µ ) , where µ = (cid:82) yf ∞ ( y ) dy . Again using Cole-Hopf transformation φ ( x ) = exp( − v ( x )) on the HJB equation leads to a linear eigenvalue problem in φ : cφ ( x ) = L [ φ ]( x ) , (30)where L [ φ ] = β ( x − µ ) φ ( x ) − a ( x ) ∂ x φ ( x ) − σ ∂ xx φ ( x ). We solve Eq. (30) iteratively alongwith Eq. (14) to ﬁnd zero mean MFG steady states ( f ∞ , v ∞ ) for a range of r , keeping σ and α ﬁxed (See Fig. 3).These solutions are stable (i.e., (cid:107) M (cid:107) <

1) for large r , implyingthat when control is expensive, the agents use minimal control action. The resulting steadystate distribution is bi-modal due to dominance of the self-propulsion force, and dispersionvia noise.These zero mean solutions lose stability (i.e., (cid:107) M (cid:107) >

1) via a supercritical bifurcationas r is reduced below a critical value r sup . The Eq. (21) for moving eigenvalues of L F B has a double zero root at r = r sup , and a pair of purely imaginary roots emerges as r isreduced below r sup . This implies that the pair of symmetric eigenvalues of L F B closest to theimaginary axis reaches 0 at the critical parameter, and then moves up/down the imaginaryaxis. The stable non-zero mean MFG steady state solutions on the supercritical branch arecomputed by combining ﬁxed point iteration in µ with a continuation step. This bifurcationprovides a MFG interpretation to the pitchfork bifurcation observed in the uncontrolledsystem, i.e., cheaper control makes it economical to compensate for noise. Hence, the agentsapply larger control action to ﬂock together (and reduce the cost of deviation from thepopulation mean), resulting in symmetry breaking non-zero mean solutions.When noise strength σ is ﬁxed below a critical value, the zero mean solution branchundergoes a subcritical bifurcation as control penalty r is further reduced, i.e, at r = r sub r sup . Hence, we conclude that for r < r sub , the control is cheapenough to counteract the intrinsic dynamics, and make zero mean uni-modal solution stable.16 V. CONCLUSIONS

We have presented a MFG formulation for homogeneous ﬂocking of agents with gradientnonlinearity in their intrinsic dynamics. We have employed tools from theory of reaction-diﬀusion equations, and exploited the low rank nature of the nonlocal coupling term to studythe linear stability of the MFG equations. The explicit formulae for verifying the stability ofsteady state solutions of the nonlocal forward-backward MFG system require relatively sim-ple numerical computation of spectra of the local self-adjoint Fokker-Planck operators. TheMFG system shows rich nonlinear behavior, such as supercritical and subcritical pitchforkbifurcations that result in wide range of collective behaviors, some of which are not presentin the uncontrolled model.Much of the analysis in the current work can be generalized to higher dimensional statespace for homogeneous ﬂocking with self-propulsion, similar in spirit to the generalization of one-dimensional uncontrolled ﬂocking model. Furthermore, the abstract results presentedin this work apply to models other than homogeneous ﬂocking, e.g. nonlocally coupled agentswith arbitrary ﬁrst order gradient dynamics. Extension to non-homogeneous ﬂocking wouldbe a natural next step; the resulting second-order dynamics could require more sophisticatedtools for stability analysis. Implementation of the MFG control laws in an engineered largepopulation system requires the control to be provided in a causal form. Algorithms thatcan learn the MFG laws can be used to convert the control laws obtained by solving theFP-HJB equations into an implementable form .The use of bifurcation and singularity theory to develop bio-inspired control and decisionmaking algorithms for multi-agent systems has been explored recently . Our work addsto the toolbox for systematic analysis of collective behavior of non-cooperative dynamicagents via an inverse modeling approach. The qualitative and quantitative insight providedby the stability analysis can be exploited in mechanism design , i.e., design of penaltiesor incentives to drive the population to asymptotic states with desirable characteristics.We believe that a systematic study of bifurcations in MFG models can lead to progressin tackling the grand challenge of designing or manipulating collective behavior of a largepopulation of non-cooperative dynamic agents.17 CKNOWLEDGMENTS

We wish to thank the anonymous reviewers for their constructive comments, especiallyregarding Lemma III.1 and its proof.

REFERENCES H. Resat, L. Petzold, and M. F. Pettigrew, “Kinetic modeling of biological systems,”(Springer, 2009) pp. 311–335. G. Beni, “From swarm intelligence to swarm robotics,” in

International Workshop onSwarm Robotics (Springer, 2004) pp. 1–9. E. Bonabeau, “Agent-based modeling: Methods and techniques for simulating humansystems,” Proceedings of the National Academy of Sciences , 7280–7287 (2002). G. Whitham, “On kinematic waves ii. a theory of traﬃc ﬂow on long crowded roads,” in

Proc. R. Soc. Lond. A , Vol. 229 (The Royal Society, 1955) pp. 317–345. T. Stankovski, T. Pereira, P. V. McClintock, and A. Stefanovska, “Coupling functions:universal insights into dynamical interaction mechanisms,” Reviews of Modern Physics ,045001 (2017). M. Nourian, P. E. Caines, R. P. Malhame, and M. Huang, “Nash, social and centralizedsolutions to consensus problems via mean ﬁeld control theory,” IEEE Transactions onAutomatic Control (2013). K. Elamvazhuthi and P. Grover, “Optimal transport over nonlinear systems via inﬁnitesi-mal generators on graphs,” arXiv preprint arXiv:1612.01193 (2016). P. E. Caines, “Mean ﬁeld games,” Encyclopedia of Systems and Control , 706–712 (2015). J.-M. Lasry and P.-L. Lions, “Mean ﬁeld games,” Japanese journal of mathematics ,229–260 (2007). M. Huang, P. E. Caines, and R. P. Malham´e, “Large-population cost-coupled LQG prob-lems with nonuniform agents: individual-mass behavior and decentralized (cid:15) -Nash equilib-ria,” IEEE transactions on automatic control , 1560–1571 (2007). F. Cucker and S. Smale, “Emergent behavior in ﬂocks,” IEEE Transactions on automaticcontrol , 852–862 (2007). S.-Y. Ha, J.-G. Liu, et al. , “A simple proof of the cucker-smale ﬂocking dynamics and18ean-ﬁeld limit,” Communications in Mathematical Sciences , 297–325 (2009). Y.-P. Choi, S.-Y. Ha, and Z. Li, “Emergent dynamics of the Cucker–Smale ﬂocking modeland its variants,” in

Active Particles, Volume 1 (Springer, 2017) pp. 299–331. A. B. Barbaro, J. A. Ca˜nizo, J. A. Carrillo, and P. Degond, “Phase transitions in a kineticﬂocking model of Cucker–Smale type,” Multiscale Modeling & Simulation , 1063–1088(2016). Y. Kuramoto, “Self-entrainment of a population of coupled non-linear oscillators,” in

In-ternational symposium on mathematical problems in theoretical physics (Springer, 1975)pp. 420–422. H. Yin, P. G. Mehta, S. P. Meyn, and U. V. Shanbhag, “Synchronization of coupledoscillators is a game,” IEEE Transactions on Automatic Control , 920–935 (2012). M. Nourian, P. E. Caines, and R. P. Malham´e, “Mean ﬁeld analysis of controlled cucker-smale type ﬂocking: Linear analysis and perturbation equations,” IFAC Proceedings Vol-umes , 4471–4476 (2011). M. Nourian, P. E. Caines, and R. P. Malhame, “A mean ﬁeld game synthesis of initialmean consensus problems: A continuum approach for non-gaussian behavior ,” IEEETransactions on Automatic Control , 449–455 (2014). P. Degond, J.-G. Liu, and C. Ringhofer, “Large-scale dynamics of mean-ﬁeld games drivenby local nash equilibria,” Journal of Nonlinear Science , 93–115 (2014). O. Gu´eant, “A reference case for mean ﬁeld games models,” Journal de math´ematiquespures et appliqu´ees , 276–294 (2009). T. J. Anastasio, A. K. Barreiro, and J. C. Bronski, “A geometric method for eigenvalueproblems with low-rank perturbations,” Royal Society open science , 170390 (2017). S. Wiggins,

Introduction to Applied Nonlinear Dynamical Systems and Chaos , Texts inApplied Mathematics Science, Vol. 2 (Springer-Verlag, Berlin, 1990). J. Tugaut, “Phase transitions of McKean–Vlasov processes in double-wells landscape,”Stochastics An International Journal of Probability and Stochastic Processes , 257–284(2014). G. A. Pavliotis,

Stochastic processes and applications: Diﬀusion processes, the Fokker-Planck and Langevin equations , Vol. 60 (Springer, 2014). P. Freitas, “Stability of stationary solutions for a scalar non-local reaction-diﬀusion equa-tion,” The Quarterly Journal of Mechanics and Applied Mathematics , 557–582 (1995).19 T. Kapitula and K. Promislow, “Stability indices for constrained self-adjoint operators,”Proceedings of the American Mathematical Society , 865–880 (2012). T. A. Driscoll, N. Hale, and L. N. Trefethen, “Chebfun guide,” (2014). V. S. Borkar, “Ergodic control of diﬀusion processes,” in

Proceedings ICM (2006). E. Todorov, “Eigenfunction approximation methods for linearly-solvable optimal controlproblems,” in

Adaptive Dynamic Programming and Reinforcement Learning, 2009. AD-PRL’09. IEEE Symposium on (IEEE, 2009) pp. 161–168. J. Alexander, R. Gardner, and C. Jones, “A topological invariant arising in the stabilityanalysis of travelling waves,” J. reine angew. Math , 143 (1990). P. Cardaliaguet and S. Hadikhanloo, “Learning in mean ﬁeld games: The ﬁctitious play,”ESAIM: Control, Optimisation and Calculus of Variations , 569–591 (2017). N. E. Leonard, “Multi-agent system dynamics: Bifurcation and behavior of animalgroups,” Annual Reviews in Control , 171–183 (2014). V. Srivastava and N. E. Leonard, “Bio-inspired decision-making and control: From hon-eybees and neurons to network design,” in

American Control Conference (ACC), 2017 (IEEE, 2017) pp. 2026–2039.34