A Passivity-Based Approach to Nash Equilibrium Seeking over Networks
Dian Gadjov and Lacra Pavel
Abstract—In this paper we consider the problem of distributed Nash equilibrium (NE) seeking over networks, a setting in which players have limited local information. We start from a continuous-time gradient-play dynamics that converges to an NE under strict monotonicity of the pseudo-gradient and assumes perfect information, i.e., instantaneous all-to-all player communication. We consider how to modify this gradient-play dynamics in the case of partial, or networked, information between players. We propose an augmented gradient-play dynamics with correction, in which players communicate locally only with their neighbours to compute an estimate of the other players' actions. We derive the new dynamics based on the reformulation as a multi-agent coordination problem over an undirected graph. We exploit incremental passivity properties and show that a synchronizing, distributed Laplacian feedback can be designed using relative estimates of the neighbours. Under a strict monotonicity property of the pseudo-gradient, we show that the augmented gradient-play dynamics converges to consensus on the NE of the game. We further discuss two cases that highlight the tradeoff between properties of the game and the communication graph.
I. INTRODUCTION
We consider distributed Nash equilibrium (NE) seeking over networks, where players have limited local information, over a communication network. This is a research topic of recent interest, [1], [2], [3], due to the many networked scenarios in which such problems arise, such as wireless communication [4], [5], [6], optical networks [7], [8], [9], distributed constrained convex optimization [10], [11], noncooperative flow control problems [12], [13], etc.

We propose a new continuous-time dynamics for a general class of N-player games and prove its convergence to an NE over a connected graph. Our scheme is derived by reformulating the problem as a multi-agent coordination problem between the players and leveraging passivity properties. Specifically, we endow each player (agent) with an auxiliary state variable that provides an estimate of all other players' actions. For each agent we combine its own gradient-type dynamics with an integrator-type auxiliary dynamics, driven by some control signal. We design the control signal for each individual player, based on the relative output feedback from its neighbours, such that these auxiliary state variables agree one with another. The resulting player's dynamics has two components: the action component, composed of a gradient term (enforcing the move towards minimizing its own cost) and a consensus term, and the estimate component, with a consensus term. We call this new dynamics an augmented gradient-play dynamics with correction and estimation. We prove it converges to consensus on the NE of the game, under a monotonicity property of the extended pseudo-gradient.

This work was supported by an NSERC Discovery Grant. D. Gadjov and L. Pavel are with the Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, M5S 3G4, Canada. [email protected], [email protected]

Literature review.
Our work is related to the literature on NE seeking in games over networks. Existing results are almost exclusively developed in discrete time. The problem of NE seeking under networked communication is considered in [14], [15], specifically for the special class of aggregative games, where each agent's cost function is coupled to other players' actions through a single, aggregative variable. In [16], this approach is generalized to a larger class of coupled games: a gossip-based discrete-time algorithm is proposed and convergence is shown for diminishing step-sizes. Very recently, discrete-time ADMM-type algorithms with constant step-sizes have been proposed and convergence proved under co-coercivity of the extended pseudo-gradient, [17], [18].

In continuous time, gradient-based dynamics for NE computation have been used since the work of Arrow, [19], [20], [21], [22]. Over networks, gradient-based algorithms are designed in [11] based on information from only a set of local neighbouring agents, for games with local utility functions (proved to be state-based potential games). Continuous-time distributed NE seeking dynamics are proposed for a two-network zero-sum game in [23]. Key assumptions are the additive decomposition of the common objective function, as well as other structural assumptions. Based on the max-min formulation, the dynamics takes the form of a saddle-point dynamics, [19], distributed over the agents of each of the two networks, inspired by the optimization framework of [10].

In this paper we consider a general class of N-player games, where players have limited local information about the others' actions over a communication network. Our work is also related to the distributed optimization framework in [10]. However, there are several differences between [10] or [23] and our work. Besides the summable structure of the common cost function, a critical structural assumption in [10] is the fact that each agent optimizes its cost function over the full argument.
Then, when an augmented (lifted) space of actions and estimates is considered in the networked communication case, a lifted cost function is obtained which can be decomposed as a sum of separable cost functions, individually convex in their full argument. This leads to distributed algorithms, under strict convexity of the individual cost functions with respect to the full argument. In a strategic game context, the individual convexity properties with respect to the full argument are too restrictive unless the game is separable to start with. While the game setting has an inherent distributed structure (since each player optimizes its own cost function), individual (player-by-player) optimization is over its own action. In contrast to distributed optimization, each player's individual action is only part of the full action profile, and its cost function is coupled to its opponents' actions, which are under their decision. This key differentiating structural aspect between games and distributed optimization presents technical challenges.

In this work, we also consider an augmented space of actions and estimates of others' actions, to deal with the networked communication case. However, the difficulty is that we do not have an additive decomposition to exploit, and each player only controls/optimizes a part of the full argument on which its own cost function depends. Our main approach is to highlight and exploit passivity properties of the game (pseudo-gradient), the gradient-based algorithm/dynamics, and the network (Laplacian). A typical assumption in games is not individual gradient monotonicity with respect to the full argument, but rather monotonicity of the pseudo-gradient. The corresponding game assumption we use in the networked communication case is monotonicity of the extended pseudo-gradient, which is equivalent to incremental passivity.

Contributions.
We consider a general class of N-player games and develop a new continuous-time dynamics for NE seeking under networked information. Our approach is based on reformulating the problem as a multi-agent coordination problem and exploiting basic incremental passivity properties of the pseudo-gradient map. To the best of our knowledge such an approach has not been proposed before. Our contributions are three-fold.
First, we show that under strict monotonicity of the extended pseudo-gradient, the proposed new dynamics converges over any connected graph, unlike [17], [18]. Our scheme is different from [16], due to an extra correction term on the actions' dynamics that arises naturally from the passivity-based design. This term is in fact critical to prove convergence on a single timescale. Essentially, players perform simultaneous consensus of estimates and player-by-player optimization.
Secondly, our passivity-based approach highlights the trade-off between properties of the game and those of the communication graph. Under a weaker Lipschitz continuity assumption on the extended pseudo-gradient, we show that the new dynamics converges over any sufficiently connected graph. Key is the fact that the Laplacian contribution (or excess passivity) can be used to balance the other terms, which depend on the game properties.
Thirdly, we relax the connectivity bound on the graph, based on a time-scale separation argument. This is achieved by modifying the dynamics of the estimates such that the system approaches the consensus subspace quickly.

The paper is organized as follows. Section II gives the preliminary background. Section III formulates the noncooperative game and basic assumptions. Section IV presents the distributed NE seeking dynamics and analyzes its equilibrium points. Section V analyzes the convergence of the proposed dynamics over a connected graph under various assumptions. Section VI considers the case of compact action spaces, where projection dynamics are proposed and analyzed. Numerical examples are given in Section VII and conclusions in Section VIII.

II. PRELIMINARIES
Notations. Given a vector x ∈ R^n, x^T denotes its transpose. Let x^T y denote the Euclidean inner product of x, y ∈ R^n and ‖x‖ the Euclidean norm. Let A ⊗ B denote the Kronecker product of matrices A and B. The all-ones vector is 1_n = [1, …, 1]^T ∈ R^n, and the all-zeros vector is 0_n = [0, …, 0]^T ∈ R^n. diag(A_1, …, A_N) denotes the block-diagonal matrix with A_i on its diagonal. Given a matrix M ∈ R^{p×q}, Null(M) = { x ∈ R^q | Mx = 0 } and Range(M) = { y ∈ R^p | (∃ x ∈ R^q) y = Mx }. A function Φ : R^n → R^n is monotone if (x − y)^T (Φ(x) − Φ(y)) ≥ 0, for all x, y ∈ R^n, and strictly monotone if the inequality is strict when x ≠ y. Φ is strongly monotone if there exists μ > 0 such that (x − y)^T (Φ(x) − Φ(y)) ≥ μ ‖x − y‖², for all x, y ∈ R^n. For a differentiable function V : R^n → R, ∇V(x) = ∂V/∂x (x) ∈ R^n denotes its gradient. V is convex (strictly convex, strongly convex) if and only if its gradient ∇V is monotone (strictly monotone, strongly monotone). Monotonicity properties play in variational inequalities the same role as convexity plays in optimization.

A. Projections
Given a closed, convex set Ω ⊂ R^n, let the interior, boundary and closure of Ω be denoted by int Ω, ∂Ω and Ω̄, respectively. The normal cone of Ω at a point x ∈ Ω is defined as N_Ω(x) = { y ∈ R^n | y^T (x′ − x) ≤ 0, ∀ x′ ∈ Ω }. The tangent cone of Ω at x ∈ Ω is given as T_Ω(x) = ⋃_{δ>0} (Ω − x)/δ. The projection operator of a point x ∈ R^n to the set Ω is given by the point P_Ω(x) ∈ Ω such that ‖x − P_Ω(x)‖ ≤ ‖x − x′‖, for all x′ ∈ Ω, or P_Ω(x) = argmin_{x′∈Ω} ‖x − x′‖. The projection operator of a vector v ∈ R^n at a point x ∈ Ω with respect to Ω is Π_Ω(x, v) = lim_{δ→0+} (P_Ω(x + δv) − x)/δ. Note that ‖Π_Ω(x, v)‖ ≤ ‖v‖. Given x ∈ ∂Ω, let n(x) denote the set of outward unit normals to Ω at x, n(x) = { y | y ∈ N_Ω(x), ‖y‖ = 1 }. By Lemma 2.1 in [24], if x ∈ int Ω, then Π_Ω(x, v) = v, while if x ∈ ∂Ω, then

Π_Ω(x, v) = v − β(x) n*(x)   (1)

where n*(x) = argmax_{n ∈ n(x)} v^T n and β(x) = max{ 0, v^T n*(x) }. Note that if v ∈ T_Ω(x) for some x ∈ ∂Ω, then sup_{n ∈ n(x)} v^T n ≤ 0, hence β(x) = 0 and no projection needs to be performed. The operator Π_Ω(x, v) is equivalent to the projection of the vector v onto the tangent cone T_Ω(x) at x, Π_Ω(x, v) = P_{T_Ω(x)}(v). A set C ⊆ R^n is a cone if for any c ∈ C, γc ∈ C for every γ > 0. The polar cone of a convex cone C is given by C° = { y ∈ R^n | y^T c ≤ 0, ∀ c ∈ C }.

Lemma 1 (Moreau's Decomposition Theorem III.3.2.5, [25]). Let C ⊆ R^n and C° ⊆ R^n be a closed convex cone and its polar cone, and let v ∈ R^n. Then the following are equivalent:
(i) v_C = P_C(v) and v_{C°} = P_{C°}(v).
(ii) v_C ∈ C, v_{C°} ∈ C°, v = v_C + v_{C°}, and v_C^T v_{C°} = 0.
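As a numerical illustration of the operators above (a sketch, assuming a box set Ω = [0, 1]² and hypothetical test points, not an example from the paper), the directional projection Π_Ω(x, v) can be approximated by its defining limit; it leaves v unchanged at an interior point and strips the outward normal component at a boundary point:

```python
import numpy as np

lo, hi = np.zeros(2), np.ones(2)                  # box Omega = [0, 1]^2

def proj(x):                                      # Euclidean projection P_Omega(x)
    return np.clip(x, lo, hi)

def dir_proj(x, v, delta=1e-8):                   # Pi_Omega(x, v) ~ (P_Omega(x + delta*v) - x) / delta
    return (proj(x + delta * v) - x) / delta

v = np.array([1.0, -2.0])

x_int = np.array([0.5, 0.5])                      # interior point: Pi_Omega(x, v) = v
print(np.allclose(dir_proj(x_int, v), v))

x_bd = np.array([1.0, 0.5])                       # boundary point on the face x_1 = 1;
w = dir_proj(x_bd, v)                             # outward unit normal n* = (1, 0), beta = v^T n* = 1
print(np.allclose(w, [0.0, -2.0]))                # Pi_Omega(x, v) = v - beta*n* = (0, -2)
```

Both checks print True; the boundary case matches (1), with the projection removing exactly the component β(x) n*(x) pointing out of Ω.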
Notice that N_Ω(x) is a convex cone and the tangent cone is its polar cone, i.e., N_Ω(x) = (T_Ω(x))°, (N_Ω(x))° = T_Ω(x). By Lemma 1, for any x ∈ Ω, any vector v ∈ R^n can be decomposed into tangent, v_{T_Ω} ∈ T_Ω(x), and normal, v_{N_Ω} ∈ N_Ω(x), components,

v = v_{T_Ω} + v_{N_Ω}   (2)

with v_{T_Ω} = P_{T_Ω(x)}(v) = Π_Ω(x, v), v_{N_Ω} = P_{N_Ω(x)}(v).

B. Graph theory
The following are from [26]. An undirected graph G_c is a pair G_c = (I, E), with I = {1, …, N} the vertex set and E ⊆ I × I the edge set such that for i, j ∈ I, if (i, j) ∈ E, then (j, i) ∈ E. The degree of vertex i, deg(i), is the number of edges connected to i. A path in a graph is a sequence of edges which connects a sequence of vertices. A graph is connected if there is a path between every pair of vertices. In this paper we associate a vertex with a player/agent. An edge between agents i, j ∈ I exists if agents i and j exchange information. Let N_i ⊂ I denote the set of neighbours of player i. The Laplacian matrix L ∈ R^{N×N} describes the connectivity of the graph G_c, with [L]_ij = |N_i| if i = j, [L]_ij = −1 if j ∈ N_i, and [L]_ij = 0 otherwise. When G_c is an undirected and connected graph, 0 is a simple eigenvalue of L, L 1_N = 0_N, and all other eigenvalues are positive. Let the eigenvalues of L in ascending order be 0 = λ_1 < λ_2 ≤ … ≤ λ_N; then x^T L x ≥ λ_2 ‖x‖² for all x ≠ 0 with 1_N^T x = 0, and x^T L x ≤ λ_N ‖x‖² for all x.

C. Equilibrium Independent and Incremental Passivity
The following are from [27], [28], [29], [30]. Consider
Σ : { ẋ = f(x, u),  y = h(x, u) }   (3)

with x ∈ R^n, u ∈ R^q and y ∈ R^q, f locally Lipschitz and h continuous. Consider a differentiable function V : R^n → R. The time derivative of V along solutions of (3) is denoted as V̇(x) = ∇^T V(x) f(x, u), or just V̇. Let ū, x̄, ȳ be an equilibrium condition, such that 0 = f(x̄, ū), ȳ = h(x̄, ū). Assume there exist a set U ⊂ R^q and a continuous function k_x(ū) such that for any constant ū ∈ U, f(k_x(ū), ū) = 0. The continuous function k_y(ū) = h(k_x(ū), ū) is the equilibrium input-output map. Equilibrium independent passivity (EIP) requires Σ to be passive independently of the equilibrium point.

Definition 1.
System Σ (3) is Equilibrium Independent Passive (EIP) if it is passive with respect to ū and ȳ; that is, for every ū ∈ U there exists a differentiable, positive semi-definite storage function V : R^n → R such that V(x̄) = 0 and, for all u ∈ R^q, x ∈ R^n,

V̇(x) ≤ (y − ȳ)^T (u − ū).

A slight refinement of the EIP definition can be made to handle the case where k_y(ū) is not a function but a map. An EIP system with a map k_y(ū) is called maximal EIP (MEIP) when k_y(ū) is maximally monotone, e.g. an integrator, [28]. The parallel interconnection and the feedback interconnection of EIP systems result in an EIP system. When passivity holds when comparing any two trajectories of Σ, the property is called incremental passivity, [30].

Definition 2.
System Σ (3) is incrementally passive if there exists a C¹, regular, positive semi-definite storage function V : R^n × R^n → R such that for any two inputs u, u′ and any two solutions x, x′ corresponding to these inputs, the respective outputs y, y′ satisfy

V̇(x, x′) ≤ (y − y′)^T (u − u′),

where V̇ = ∇_x^T V(x, x′) f(x, u) + ∇_{x′}^T V(x, x′) f(x′, u′).

When u′, x′, y′ are constant (equilibrium conditions), this recovers the EIP definition. When system Σ is just a static map, incremental passivity reduces to monotonicity. A static function y = Φ(u) is EIP if and only if it is incrementally passive, or, equivalently, it is monotone. Monotonicity plays an important role in optimization and variational inequalities, while passivity plays as critical a role in dynamical systems.

III. PROBLEM STATEMENT
A. Game Formulation
Consider a set I = {1, …, N} of N players (agents) involved in a game. The information sharing between them is described by an undirected graph G_c = (I, E).

Assumption 1.
The communication graph G_c is connected.

Each player i ∈ I controls its action x_i ∈ Ω_i, where Ω_i ⊆ R^{n_i}. The action set of all players is Ω = ∏_{i∈I} Ω_i ⊆ R^n, n = Σ_{i∈I} n_i. Let x = (x_i, x_{−i}) ∈ Ω denote all agents' action profile, or N-tuple, where x_{−i} ∈ Ω_{−i} = ∏_{j∈I∖{i}} Ω_j is the (N−1)-tuple of all agents' actions except agent i's. Alternatively, x is represented as a stacked vector x = [x_1^T … x_N^T]^T ∈ Ω ⊆ R^n. Each player (agent) i aims to minimize its own cost function J_i(x_i, x_{−i}), J_i : Ω → R, which possibly depends on all other players' actions. Let the game thus defined be denoted by G(I, J_i, Ω_i).

Definition 3.
Given a game G(I, J_i, Ω_i), an action profile x* = (x*_i, x*_{−i}) ∈ Ω is a Nash Equilibrium (NE) of G if

(∀ i ∈ I)(∀ y_i ∈ Ω_i)  J_i(x*_i, x*_{−i}) ≤ J_i(y_i, x*_{−i}).

At a Nash equilibrium, no agent has any incentive to unilaterally deviate from its action. In the following we use one of the following two basic convexity and smoothness assumptions, which ensure the existence of a pure NE.
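As a concrete illustration of Definition 3 (a numerical sketch using a hypothetical two-player quadratic game; the costs J_1, J_2 below are illustrative, not from the paper), one can solve each player's own first-order condition and then confirm that no unilateral deviation lowers either player's cost:

```python
import numpy as np

# Hypothetical two-player game, each cost strictly convex in the own action:
# J1(x1, x2) = x1^2 + x1*x2 - 2*x1,   J2(x1, x2) = x2^2 - x1*x2
J1 = lambda x1, x2: x1**2 + x1 * x2 - 2 * x1
J2 = lambda x1, x2: x2**2 - x1 * x2

# Stationarity of each player's own gradient: 2*x1 + x2 - 2 = 0, -x1 + 2*x2 = 0
x_star = np.linalg.solve(np.array([[2.0, 1.0], [-1.0, 2.0]]), np.array([2.0, 0.0]))
print(x_star)                                     # NE candidate, approx. [0.8, 0.4]

# Definition 3: no unilateral deviation lowers a player's cost
devs = np.linspace(-5.0, 5.0, 1001)
ok1 = all(J1(*x_star) <= J1(y, x_star[1]) + 1e-12 for y in devs)
ok2 = all(J2(*x_star) <= J2(x_star[0], y) + 1e-12 for y in devs)
print(ok1 and ok2)
```

Here the NE is found introspectively, with each player's cost and the opponent's action fully known; the point of the paper is precisely how to reach such a point when this information is not globally available.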
Assumption 2. (i)
For every i ∈ I, Ω_i = R^{n_i}, and the cost function J_i : Ω → R is C¹ in its arguments, strictly convex and radially unbounded in x_i, for every x_{−i} ∈ Ω_{−i}. (ii) For every i ∈ I, Ω_i is a non-empty, convex and compact subset of R^{n_i}, and the cost function J_i : Ω → R is C¹ in its arguments and (strictly) convex in x_i, for every x_{−i} ∈ Ω_{−i}.

Under Assumption 2(i), from Corollary 4.2 in [31] it follows that a pure NE x* exists. Moreover, an NE satisfies

∇_i J_i(x*_i, x*_{−i}) = 0, (∀ i ∈ I), or F(x*) = 0   (4)

where ∇_i J_i(x_i, x_{−i}) = ∂J_i/∂x_i (x_i, x_{−i}) ∈ R^{n_i} is the gradient of agent i's cost function J_i(x_i, x_{−i}) with respect to its own action x_i, and F : Ω → R^n is the pseudo-gradient defined by stacking all agents' partial gradients,

F(x) = [∇_1 J_1^T(x), …, ∇_N J_N^T(x)]^T   (5)

Under Assumption 2(ii) it follows from Theorem 4.3 in [31] that a pure NE exists, based on Brouwer's fixed point theorem. Under just convexity of J_i with respect to x_i, existence of an NE follows based on Kakutani's fixed point theorem. Moreover, a Nash equilibrium x* ∈ Ω satisfies the variational inequality (VI) (cf. Proposition 1.4.2, [32]),

(x − x*)^T F(x*) ≥ 0, ∀ x ∈ Ω   (6)

and projected gradient methods need to be used, [32]. Additionally, (6) can be written as −F(x*)^T (x − x*) ≤ 0 and, from the definition of the normal cone,

−F(x*) ∈ N_Ω(x*)   (7)

Next we state typical assumptions on the pseudo-gradient.

Assumption 3. (i)
The pseudo-gradient F : Ω → R^n is strictly monotone, (x − x′)^T (F(x) − F(x′)) > 0, ∀ x ≠ x′. (ii) The pseudo-gradient F : Ω → R^n is strongly monotone, (x − x′)^T (F(x) − F(x′)) ≥ μ ‖x − x′‖², ∀ x, x′ ∈ Ω, for some μ > 0, and Lipschitz continuous, ‖F(x) − F(x′)‖ ≤ θ ‖x − x′‖, ∀ x, x′ ∈ Ω, where θ > 0.

Under Assumption 3(i) or 3(ii), the game has a unique NE (cf. Theorem 3 in [33]).

The above setting refers to players' strategic interactions, but it does not specify what knowledge or information each player has. Since J_i depends on all players' actions, an introspective calculation of an NE requires complete information, where each player knows the cost functions and strategies of all other players, see Definition 3 and (4). A game with incomplete information refers to players not fully knowing the cost functions or strategies of the others, [34]. Throughout the paper, we assume J_i is known by player i only. In a game with incomplete but perfect information, each agent has knowledge of the actions of all other players, x_{−i}. We refer to the case when players are not able to observe the actions of all the other players as a game with incomplete and imperfect, or partial, information. This is the setting we consider in this paper: we assume players can communicate only locally, with their neighbours. Our goal is to derive a dynamics for seeking an NE in the incomplete, partial information case, over a communication graph G_c. We review first the case of perfect information, and treat the case of partial or networked information in the following sections. In the first part of the paper, for simplicity of the arguments, we consider Ω_i = R^{n_i} and Assumption 2(i). We consider compact action sets and treat the case of projected dynamics, under Assumption 2(ii), in Section VI.

B. Gradient Dynamics with Perfect Information
In a game of perfect information, under Assumption 2(i), a typical gradient-based dynamics, [20], [21], [22], [32], can be used for each player i,

P_i : ẋ_i = −∇_i J_i(x_i, x_{−i}), ∀ i ∈ I   (8)

or, P : ẋ = −F(x), the overall system of all the agents' dynamics stacked together. Assumption 2(i) ensures existence and uniqueness of solutions of (8). Note that (8) requires all-to-all instantaneous information exchange between players, or a complete communication graph. Convergence of (8) is typically shown under strict (strong) monotonicity of the pseudo-gradient F, [32], [21], or under strict diagonal dominance of its Jacobian evaluated at x*, [1]. We provide a passivity interpretation. The gradient dynamics P (8) is the negative feedback interconnection between a bank of N integrators Σ,

Σ : { ẋ = u,  y = x }

and the static pseudo-gradient map F(·), Figure 1.

[Fig. 1. Gradient dynamics (8) as the feedback interconnection of two EIP systems: the integrator bank Σ in negative feedback with F(·).]

Σ is
MEIP with storage function V(x) = ½ ‖x − x̄‖², while F(·) is static and, under Assumption 3(i), is incrementally passive (EIP). Hence their interconnection is also EIP, and asymptotic stability can be shown using the same storage function.

Lemma 2.
Consider a game G(I, J_i, Ω_i) in the perfect information case, under Assumption 2(i). Then the equilibrium point x̄ of the gradient dynamics (8) is the NE of the game, x*, and, under Assumption 3(i), is globally asymptotically stable. Alternatively, under Assumption 3(ii), x* is globally exponentially stable, hence the solutions of (8) converge exponentially to the NE of the game, x*.

Proof. At an equilibrium x̄ of (8), F(x̄) = 0, hence by (4), x̄ = x*, the NE of G(I, J_i, Ω_i). Consider the quadratic Lyapunov function V : Ω → R, V(x) = ½ ‖x − x̄‖². Along (8), using F(x̄) = 0, V̇(x) = −(x − x̄)^T (F(x) − F(x̄)) < 0, for all x ≠ x̄, by Assumption 3(i). Hence V̇(x) ≤ 0 and V̇(x) = 0 only if x = x̄ = x*. Since V is radially unbounded, the conclusion follows by LaSalle's theorem [35]. Under Assumption 3(ii), V̇(x) ≤ −μ ‖x − x̄‖², ∀ x, and global exponential stability follows immediately. ∎

IV. NE SEEKING DYNAMICS OVER A GRAPH
In this section we consider the following question: how can we modify the gradient dynamics (8) such that it converges to an NE in a networked information setting, over some connected communication graph G_c?

We propose a new augmented gradient dynamics, derived based on the reformulation as a multi-agent agreement problem between the players. We endow each player (agent) with an auxiliary state that provides an estimate of all other players' actions. We design a new signal for each player, based on the relative feedback from its neighbours, such that these estimates agree one with another.

Thus assume that player (agent) i maintains an estimate vector x^i = [(x^i_1)^T, …, (x^i_N)^T]^T ∈ Ω, where x^i_j is player i's estimate of player j's action and x^i_i = x_i is player i's actual action. x^i_{−i} represents player i's estimate vector without its own action, x^i_i. All agents' vectors are stacked into a single vector x = [(x^1)^T, …, (x^N)^T]^T ∈ ∏_{i∈I} Ω = Ω^N = R^{Nn}. Note that the state space is now Ω^N = ∏_{i∈I} Ω = R^{Nn}. In the enlarged space the estimate components will differ initially, but in the limit all players' estimate vectors should be in consensus. We modify the gradient dynamics such that player i updates x^i_i to reduce its own cost function and updates x^i_{−i} to reach a consensus with the other players. Let each player combine its gradient-type dynamics with an integrator-type auxiliary dynamics, driven by some control signal,

Σ̃_i : [ ẋ^i_i ; ẋ^i_{−i} ] = [ −∇_i J_i(x^i_i, x^i_{−i}) ; 0 ] + B_i u_i,   y_i = (B_i)^T x^i   (9)

where B_i is a full-rank n × n matrix. For each player, u_i ∈ R^n is to be designed based on the relative output feedback from its neighbours, such that x^i = x^j, for all i, j, and the estimates converge to the NE x*.

Thus we have reformulated the design of NE dynamics over G_c as a multi-agent agreement problem.
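To make the agreement reformulation concrete, the following is a minimal simulation sketch (a hypothetical two-player scalar quadratic game with J1 = x1² + x1·x2 − 2·x1, J2 = x2² − x1·x2 and NE (0.8, 0.4); not an example from the paper), anticipating the relative-feedback choice u = −Lx designed in Section IV-A. Each player integrates its partial gradient, evaluated on its own estimate vector, in its action slot, plus a Laplacian consensus term; both estimate vectors reach consensus on the NE profile:

```python
import numpy as np

# State x = [x^1; x^2] in R^4, where x^i = (x^i_1, x^i_2) is player i's
# estimate of the full action profile and x^i_i is its actual action.
def F_ext(x):                                     # extended pseudo-gradient, cf. (13)
    a1, b1, a2, b2 = x
    return np.array([2 * a1 + b1 - 2.0,           # grad_1 J1 at player 1's estimates
                     2 * b2 - a2])                # grad_2 J2 at player 2's estimates

RT = np.array([[1.0, 0.0],                        # R^T aligns each partial gradient
               [0.0, 0.0],                        # with the action slot in x^i
               [0.0, 0.0],
               [0.0, 1.0]])
L = np.kron(np.array([[1.0, -1.0],                # augmented Laplacian L (x) I_n
                      [-1.0, 1.0]]), np.eye(2))   # for the 2-node connected graph

x = np.array([4.0, -1.0, 0.0, 2.0])               # arbitrary initial estimates
dt = 0.005
for _ in range(40000):                            # forward Euler for xdot = -R^T F(x) - L x
    x = x + dt * (-RT @ F_ext(x) - L @ x)

print(np.allclose(x, [0.8, 0.4, 0.8, 0.4], atol=1e-5))   # consensus on the NE
```

The consensus (Laplacian) term alone would only synchronize the estimates; the gradient term in each player's action slot steers the agreed-upon point to the NE, which is the structure formalized in (17)-(19).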
We note that the agent dynamics (9) are heterogeneous and separable, but do not satisfy an individual passivity property, as typically assumed in the multi-agent literature, e.g. [29], [36]. We show next that a Laplacian-type feedback can be designed under strict incremental passivity of the pseudo-gradient.

To proceed, we first analyze properties of Σ̃_i and the overall agents' dynamics Σ̃. Write Σ̃_i (9) in the compact form

Σ̃_i : { ẋ^i = −R_i^T ∇_i J_i(x^i) + B_i u_i,  y_i = (B_i)^T x^i }   (10)

where

R_i = [ 0_{n_i × n_{<i}}  I_{n_i}  0_{n_i × n_{>i}} ]   (11)

with n_{<i} = Σ_{j<i, j∈I} n_j and n_{>i} = Σ_{j>i, j∈I} n_j. Thus R_i^T ∈ R^{n × n_i} aligns the gradient with the action component in ẋ^i. From (10), with x = [(x^1)^T, …, (x^N)^T]^T, u = [u_1^T, …, u_N^T]^T ∈ R^{Nn}, the overall agents' dynamics, denoted by Σ̃, can be written in stacked form as

Σ̃ : { ẋ = −R^T F(x) + B u,  y = B^T x }   (12)

where R = diag(R_1, …, R_N), B = diag(B_1, …, B_N) and

F(x) = [∇_1 J_1^T(x^1), …, ∇_N J_N^T(x^N)]^T   (13)

is the continuous extension of the pseudo-gradient F, (5), to the augmented space, F : Ω^N → R^n. Note that F(1_N ⊗ x) = F(x). In the rest of the paper we consider one of the following two assumptions on the extended F.

Assumption 4. (i)
The extended pseudo-gradient F is monotone, (x − x′)^T (F(x) − F(x′)) ≥ 0, ∀ x, x′ ∈ Ω^N. (ii) The extended pseudo-gradient F is Lipschitz continuous, ‖F(x) − F(x′)‖ ≤ θ ‖x − x′‖, ∀ x, x′ ∈ Ω^N, where θ > 0.

Remark 1.
We compare this assumption to similar ones used in distributed optimization and in multi-agent coordination control, respectively. First, note that Assumption 4(i) on the extended pseudo-gradient F holds under individual joint convexity of each J_i with respect to the full argument. In distributed optimization problems, each objective function is assumed to be strictly (strongly) jointly convex in the full vector x and its gradient to be Lipschitz continuous, e.g. [10]. Similarly, in multi-agent coordination control, it is standard to assume that individual agent dynamics are separable and strictly (strongly) incrementally passive, e.g. [29]. However, in a game context the individual joint convexity of J_i with respect to the full argument is too restrictive, unless we have a trivial game with separable cost functions. In general, J_i is coupled to other players' actions, while each player has under its control only its own action. This is a key difference versus distributed optimization or multi-agent coordination, one which introduces technical challenges. However, we show that under the monotonicity Assumption 4(i) on F, the overall Σ̃ (12) is incrementally passive, hence EIP. Based on this, we design a new dynamics which converges over any connected G_c (Theorem 1). Under the weaker Lipschitz Assumption 4(ii) on F and Assumption 3(ii) on F, we show that the new dynamics converges over any sufficiently connected G_c (Theorem 2). We also note that Assumption 4(i) is similar to those used in [17], [18], while Assumption 4(ii) is weaker. Assumption 4(i) is the extension of Assumption 3(i) to the augmented space, for local communication over the connected graph G_c. The weaker Assumption 4(ii) on F is the extension of the Lipschitz continuity of F in Assumption 3(ii). We also note that these assumptions could be relaxed to hold only locally around x*, in which case all results become local.

Lemma 3.
Under Assumption 4(i), the overall system Σ̃, (12), is incrementally passive, hence EIP.

Proof. Consider two inputs u, u′ and let x, x′, y, y′ be the trajectories and outputs of Σ̃ (12). Let the storage function be V(x, x′) = ½ ‖x − x′‖². Then, along solutions of (12),

V̇ = −(x − x′)^T R^T (F(x) − F(x′)) + (x − x′)^T B (u − u′)
  = −(x − x′)^T (F(x) − F(x′)) + (y − y′)^T (u − u′)   (14)

by using R = diag(R_1, …, R_N) and (11). Using Assumption 4(i), it follows that

V̇ ≤ (y − y′)^T (u − u′).

Thus, by Definition 2, Σ̃ is incrementally passive, hence EIP. ∎

A. Distributed feedback design
Given the agent dynamics Σ̃_i, (10), for each individual player we design u_i ∈ R^n based on the relative output feedback from its neighbours, such that the auxiliary state variables (estimates) agree one with another and converge to the NE x*. For simplicity, take B = I_{Nn}, so that y = x.

Let N_i denote the set of neighbours of player i in graph G_c and L denote the symmetric Laplacian matrix. Let L = L ⊗ I_n denote the augmented Laplacian matrix, which satisfies

Null(L) = Range(1_N ⊗ I_n)   (15)

and Range(L) = Null(1_N^T ⊗ I_n), based on L 1_N = 0_N. For any W ∈ R^{q×n} and any x ∈ R^{Nn}, using L 1_N = 0_N,

(1_N^T ⊗ W) L x = ((1_N^T L) ⊗ (W I_n)) x = 0_q   (16)

With respect to the overall dynamics Σ̃, (12), the objective is to design u such that x reaches consensus, i.e., x = 1_N ⊗ x for some x ∈ Ω, and x converges towards the NE x*. The consensus condition is written as L x = 0. Since Σ̃, (12), is incrementally passive by Lemma 3, and L is positive semi-definite, a passivity-based control design, e.g. [36], suggests taking u = −L x. The resulting closed-loop system, which represents the new overall system dynamics P̃, is given in stacked notation as

P̃ : ẋ = −R^T F(x) − L x   (17)

shown in Figure 2 as the feedback interconnection between Σ̃ and L. Local solutions of (17) exist by Assumption 2(i).

[Fig. 2. Augmented gradient dynamics (17) over G_c, as the feedback interconnection of Σ̃ : ẋ = −R^T F(x) + u, y = x, and L = L ⊗ I_n.]

The new individual player dynamics P̃_i are

P̃_i : ẋ^i = −R_i^T ∇_i J_i(x^i) − Σ_{j∈N_i} (x^i − x^j)   (18)

or, separating the action x^i_i = x_i and estimate x^i_{−i} dynamics,

P̃_i : [ ẋ^i_i ; ẋ^i_{−i} ] = [ −∇_i J_i(x^i_i, x^i_{−i}) − R_i Σ_{j∈N_i} (x^i − x^j) ; −S_i Σ_{j∈N_i} (x^i − x^j) ]   (19)

where

S_i = [ I_{n_{<i}}  0_{n_{<i} × n_i}  0_{n_{<i} × n_{>i}} ; 0_{n_{>i} × n_{<i}}  0_{n_{>i} × n_i}  I_{n_{>i}} ]   (20)

and S_i removes x^i_i = x_i, its own action component, from agent i's estimate vector, x^i.

For player i, P̃_i, (18) or (19), is clearly distributed over G_c. Its input is the relative difference between its estimate and its neighbours'. In standard consensus terms, agent i can use this information to move in the direction of the average value of its neighbours, while the gradient term enforces the move towards minimizing its own cost. Compared to the gossip-based algorithm in [16], the action part of (19) has an extra correction term. This term is instrumental in proving convergence on a single timescale, as shown in Section V. The next result shows that the equilibrium of (17) or (19) occurs when the agents are at consensus and at the NE.

Lemma 4.
Consider a game $G(\mathcal I, J_i, \Omega_i)$ over a communication graph $G_c$ under Assumptions 1, 2(i). Let the dynamics for each agent $\widetilde P_i$ be as in (18), (19), or overall $\widetilde{\mathbf P}$, (17). At an equilibrium point $\bar{\mathbf x} \in \Omega^N$ the estimate vectors of all players are equal, $\bar x^i = \bar x^j$, $\forall i,j \in \mathcal I$, and equal to the Nash equilibrium profile $x^*$; hence the action components of all players coincide with the optimal actions, $\bar x_i^i = x_i^*$, $\forall i \in \mathcal I$.

Proof. Let $\bar{\mathbf x}$ denote an equilibrium of (17),
$$\mathbf 0_{Nn} = -\mathcal R^T \mathbf F(\bar{\mathbf x}) - \mathbf L \bar{\mathbf x} \quad (21)$$
Pre-multiplying both sides by $(\mathbf 1_N^T \otimes I_n)$ yields $\mathbf 0_n = -(\mathbf 1_N^T \otimes I_n)\mathcal R^T \mathbf F(\bar{\mathbf x})$, where $(\mathbf 1_N^T \otimes I_n)\mathbf L\bar{\mathbf x} = \mathbf 0$ was used by (15). Using (11) and simplifying $(\mathbf 1_N^T \otimes I_n)\mathcal R^T$ gives $\mathbf 0_n = \mathbf F(\bar{\mathbf x})$, or
$$\nabla_i J_i(\bar x^i) = 0, \quad \forall i \in \mathcal I \quad (22)$$
by (13). Substituting (22) into (21) results in $\mathbf 0_{Nn} = -\mathbf L\bar{\mathbf x}$. From this it follows that $\bar x^i = \bar x^j$, $\forall i,j \in \mathcal I$, by Assumption 1 and (15). Therefore $\bar{\mathbf x} = \mathbf 1_N \otimes \bar x$, for some $\bar x \in \Omega$. Substituting this back into (22) yields $\mathbf 0_n = \mathbf F(\mathbf 1_N \otimes \bar x)$, or $\nabla_i J_i(\bar x) = 0$ for all $i \in \mathcal I$. Using (13), $\nabla_i J_i(\bar x_i, \bar x_{-i}) = 0$ for all $i \in \mathcal I$, or $\mathbf 0_n = F(\bar x)$. Therefore by (4) $\bar x = x^*$, hence $\bar{\mathbf x} = \mathbf 1_N \otimes x^*$ and for all $i,j \in \mathcal I$, $\bar x^i = \bar x^j = x^*$, the NE of the game. $\blacksquare$

V. CONVERGENCE ANALYSIS
In this section we analyze the convergence of the players' new dynamics $\widetilde P_i$, (18), (19), or overall $\widetilde{\mathbf P}$, (17), to the NE of the game over a connected graph $G_c$. We consider two cases. In Section V-A we analyze convergence of (19) on a single timescale: in Theorem 1 under Assumptions 1, 2(i), 3(i) and 4(i), and in Theorem 2 under Assumptions 1, 2(i), 3(ii) and 4(ii). We exploit the incremental passivity (EIP) property of $\widetilde\Sigma$, (12), (Lemma 3) and the diffusive properties of the Laplacian. In Section V-B we modify the estimate component of the dynamics (19) to be much faster and, in Theorem 3, prove convergence under Assumptions 1, 2(i), 3(ii) and 4(ii), using a two-timescale singular perturbation approach.

A. Single-Timescale Consensus and Player Optimization
Theorem 1 shows that, under Assumption 4(i), (19) converges to the NE of the game over any connected $G_c$.

Theorem 1.
Consider a game $G(\mathcal I, J_i, \Omega_i)$ over a communication graph $G_c$ under Assumptions 1, 2(i), 3(i) and 4(i). Let each player's dynamics $\widetilde P_i$ be as in (18), (19), or overall $\widetilde{\mathbf P}$, (17), as in Figure 2. Then, any solution of (17) is bounded and asymptotically converges to $\mathbf 1_N \otimes x^*$, and the action components converge to the NE of the game, $x^*$.

Proof. By Lemma 4, the equilibrium of $\widetilde{\mathbf P}$, (17), is $\bar{\mathbf x} = \mathbf 1_N \otimes x^*$. We consider the quadratic storage function $V(\mathbf x) = \frac12\|\mathbf x - \bar{\mathbf x}\|^2$, $\bar{\mathbf x} = \mathbf 1_N \otimes x^*$, as a Lyapunov function. As in (14) in the proof of Lemma 3, using $\mathbf u = -\mathbf L\mathbf x$ and (21), we obtain that along the solutions of (17),
$$\dot V = -(\mathbf x - \bar{\mathbf x})^T \mathcal R^T (\mathbf F(\mathbf x) - \mathbf F(\bar{\mathbf x})) - (\mathbf x - \bar{\mathbf x})^T \mathbf L (\mathbf x - \bar{\mathbf x}) \quad (23)$$
where $\bar{\mathbf x} = \mathbf 1_N \otimes x^*$ and $\mathcal R(\mathbf 1_N \otimes x) = x$. By Assumption 4(i), and since the augmented Laplacian $\mathbf L$ is positive semi-definite, it follows that $\dot V \le 0$ for all $\mathbf x \in \Omega^N$; hence all trajectories of (17) are bounded and $\bar{\mathbf x}$ is stable. To show convergence we resort to LaSalle's invariance principle, [35].

From (23), $\dot V = 0$ when both terms in (23) are zero, i.e., $(\mathbf x - \bar{\mathbf x})^T\mathcal R^T(\mathbf F(\mathbf x) - \mathbf F(\bar{\mathbf x})) = 0$ and $(\mathbf x - \bar{\mathbf x})^T\mathbf L(\mathbf x - \bar{\mathbf x}) = 0$. By Assumption 1 and (15), $(\mathbf x - \bar{\mathbf x})^T\mathbf L(\mathbf x - \bar{\mathbf x}) = 0$ is equivalent to $\mathbf x - \bar{\mathbf x} = \mathbf 1_N \otimes x$, for some $x \in \mathbb R^n$. Since at equilibrium $\bar{\mathbf x} = \mathbf 1_N \otimes x^*$, this implies that $\mathbf x = \mathbf 1_N \otimes x$, for some $x \in \mathbb R^n$. By (13), $\mathbf F(\mathbf 1_N \otimes x) = F(x)$. Using $\mathcal R$ in (11), the first term in (23) becomes
$$-(\mathbf 1_N \otimes x - \mathbf 1_N \otimes x^*)^T \mathcal R^T [F(x) - F(x^*)] = -(x - x^*)^T[F(x) - F(x^*)] < 0, \quad \forall x \ne x^* \quad (24)$$
where the strict inequality follows by Assumption 3(i). Therefore $\dot V = 0$ in (23) only if $x = x^*$, and hence $\mathbf x = \mathbf 1_N \otimes x^*$. Since $V$ is radially unbounded, the conclusion follows by LaSalle's invariance principle. $\blacksquare$

If $F(\cdot)$ is strongly monotone, exponential convergence can be shown over any connected $G_c$.
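The behaviour proved in Theorem 1 can be illustrated numerically. The sketch below simulates the augmented gradient-play dynamics (18) by forward-Euler integration for a hypothetical two-player scalar quadratic game $J_i(x) = (x_i - a_i)^2 + x_1 x_2$ (all numerical values are illustrative assumptions, not from the paper); its pseudo-gradient has Jacobian $\begin{bmatrix}2 & 1\\ 1 & 2\end{bmatrix} \succ 0$, so $F$ is strongly monotone. Each player keeps an estimate $x^i \in \mathbb R^2$ of the full action profile, and the two players are neighbours.

```python
import numpy as np

# Hypothetical 2-player quadratic game: J_i(x) = (x_i - a_i)^2 + x_1*x_2.
# Pseudo-gradient F(x) = [2(x_1-a_1) + x_2, 2(x_2-a_2) + x_1] is strongly
# monotone, since its Jacobian [[2,1],[1,2]] is positive definite.
a = np.array([2.0, 1.0])
# The NE solves F(x*) = 0, i.e. [[2,1],[1,2]] x* = 2a.
x_star = np.linalg.solve(np.array([[2.0, 1.0], [1.0, 2.0]]), 2 * a)

def grad_i(i, est):
    """grad_i J_i evaluated at player i's own estimate est = (x^i_1, x^i_2)."""
    return 2 * (est[i] - a[i]) + est[1 - i]

# Dynamics (18): x^i_dot = -R_i^T grad_i J_i(x^i) - sum_j (x^i - x^j),
# integrated with forward Euler over the 2-node graph (one edge).
x = np.array([[5.0, -3.0], [0.5, 4.0]])   # rows: estimates x^1, x^2
dt = 0.01
for _ in range(6000):
    u = -(x - x[::-1])                    # Laplacian (consensus) term
    for i in range(2):
        u[i, i] += -grad_i(i, x[i])       # gradient acts on own component only
    x = x + dt * u

# both estimates reach consensus on the NE profile x* = (2, 0)
assert np.allclose(x[0], x[1], atol=1e-3)
assert np.allclose(x[0], x_star, atol=1e-3)
```

The closed loop here is linear, so convergence can also be checked by inspecting the eigenvalues of the resulting system matrix, which are all in the open left half-plane for this game.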
Next we show that, under a weaker Lipschitz property of $\mathbf F$ (Assumption 4(ii)) and strong monotonicity of $F$ (Assumption 3(ii)), (19) converges over any sufficiently connected $G_c$.

Theorem 2.
Consider a game $G(\mathcal I, J_i, \Omega_i)$ over a communication graph $G_c$ under Assumptions 1, 2(i), 3(ii) and 4(ii). Let each player's dynamics $\widetilde P_i$ be as in (18), (19), or overall $\widetilde{\mathbf P}$, (17). Then, if $\lambda_2(L) > \frac{\theta^2}{\mu} + \theta$, any solution of (17) converges asymptotically to $\mathbf 1_N \otimes x^*$, and the action components converge to the NE of the game, $x^*$. If $\lambda_2(L) > \frac{N\theta^2}{\mu} + \theta$, then convergence is exponential.

Proof. We decompose $\mathbb R^{Nn}$ as $\mathbb R^{Nn} = \mathcal C^{nN} \oplus \mathcal E^{nN}$, into the consensus subspace $\mathcal C^{nN} = \{\mathbf 1_N \otimes x \mid x \in \mathbb R^n\}$ and its orthogonal complement $\mathcal E^{nN}$. Let two projection matrices be defined as
$$P_{\mathcal C} = \tfrac1N \mathbf 1_N\mathbf 1_N^T \otimes I_n, \qquad P_{\mathcal E} = I_{Nn} - \tfrac1N \mathbf 1_N\mathbf 1_N^T \otimes I_n$$
Then any $\mathbf x \in \mathbb R^{Nn}$ can be decomposed as $\mathbf x = \mathbf x^{\|} + \mathbf x^{\perp}$, where $\mathbf x^{\|} = P_{\mathcal C}\mathbf x \in \mathcal C^{nN}$ and $\mathbf x^{\perp} = P_{\mathcal E}\mathbf x \in \mathcal E^{nN}$, with $(\mathbf x^{\|})^T\mathbf x^{\perp} = 0$. Thus $\mathbf x^{\|} = \mathbf 1_N \otimes x$, for some $x \in \mathbb R^n$, so that $\mathbf L\mathbf x^{\|} = \mathbf 0$, and $\min_{\mathbf x^{\perp}\in\mathcal E^{nN}} (\mathbf x^{\perp})^T\mathbf L\mathbf x^{\perp} = \lambda_2(L)\|\mathbf x^{\perp}\|^2$, $\lambda_2(L) > 0$.

Consider $V(\mathbf x) = \frac12\|\mathbf x - \bar{\mathbf x}\|^2$, $\bar{\mathbf x} = \mathbf 1_N \otimes x^*$, which using $\mathbf x = \mathbf x^{\|} + \mathbf x^{\perp}$ can be written as $V(\mathbf x) = \frac12\|\mathbf x^{\perp}\|^2 + \frac12\|\mathbf x^{\|} - \bar{\mathbf x}\|^2$. Then, following the same steps as in Lemma 3 and replacing $\mathbf x$ with its decomposed components $\mathbf x^{\perp}, \mathbf x^{\|}$, a relation similar to (23) follows along (17), i.e.,
$$\dot V \le -(\mathbf x - \bar{\mathbf x})^T\mathcal R^T[\mathbf F(\mathbf x) - \mathbf F(\bar{\mathbf x})] - (\mathbf x - \bar{\mathbf x})^T\mathbf L(\mathbf x - \bar{\mathbf x})$$
Using $\mathbf x = \mathbf x^{\perp} + \mathbf x^{\|}$, $\bar{\mathbf x} = \mathbf 1_N\otimes x^*$, $\mathbf L\mathbf x^{\|} = \mathbf 0$, $\dot V$ can be written as
$$\begin{aligned}\dot V \le{}& -(\mathbf x^{\perp})^T\mathcal R^T\big[\mathbf F(\mathbf x) - \mathbf F(\mathbf x^{\|})\big] - (\mathbf x^{\perp})^T\mathcal R^T\big[\mathbf F(\mathbf x^{\|}) - \mathbf F(\bar{\mathbf x})\big] \\ &- (\mathbf x^{\|} - \mathbf 1_N\otimes x^*)^T\mathcal R^T\big[\mathbf F(\mathbf x) - \mathbf F(\mathbf x^{\|})\big] - (\mathbf x^{\|} - \mathbf 1_N\otimes x^*)^T\mathcal R^T\big[\mathbf F(\mathbf x^{\|}) - \mathbf F(\bar{\mathbf x})\big] - (\mathbf x^{\perp})^T\mathbf L\mathbf x^{\perp}\end{aligned}$$
Using $(\mathbf x^{\perp})^T\mathbf L\mathbf x^{\perp} \ge \lambda_2(L)\|\mathbf x^{\perp}\|^2$ and $\mathbf F(\mathbf x^{\|}) = F(x)$, $\mathbf F(\bar{\mathbf x}) = F(x^*)$, yields
$$\begin{aligned}\dot V \le{}& \|\mathbf x^{\perp}\|\,\|\mathbf F(\mathbf x) - \mathbf F(\mathbf x^{\|})\| - (\mathbf x^{\perp})^T\mathcal R^T[F(x) - F(x^*)] \\ &- (\mathbf x^{\|} - \mathbf 1_N\otimes x^*)^T\mathcal R^T\big[\mathbf F(\mathbf x) - \mathbf F(\mathbf x^{\|})\big] - (\mathbf x^{\|} - \mathbf 1_N\otimes x^*)^T\mathcal R^T[F(x) - F(x^*)] - \lambda_2(L)\|\mathbf x^{\perp}\|^2\end{aligned}$$
Under Assumption 4(ii), $\|\mathbf F(\mathbf x) - \mathbf F(\mathbf x^{\|})\| \le \theta\|\mathbf x^{\perp}\|$, so that, after simplifying $\mathcal R\mathbf x^{\|} = x$, $\mathcal R(\mathbf 1_N\otimes x^*) = x^*$,
$$\dot V \le -(\lambda_2(L) - \theta)\|\mathbf x^{\perp}\|^2 - (\mathbf x^{\perp})^T\mathcal R^T[F(x) - F(x^*)] - (x - x^*)^T\big[\mathbf F(\mathbf x) - \mathbf F(\mathbf x^{\|})\big] - (x - x^*)^T[F(x) - F(x^*)]$$
Using again Assumption 4(ii) in the 2nd and 3rd terms and Assumption 3(ii) in the 4th one, it can be shown that
$$\dot V \le -(\lambda_2(L) - \theta)\|\mathbf x^{\perp}\|^2 + 2\theta\|\mathbf x^{\perp}\|\,\|x - x^*\| - \mu\|x - x^*\|^2 \quad (25)$$
or $\dot V \le -\begin{bmatrix}\|x - x^*\| & \|\mathbf x^{\perp}\|\end{bmatrix}\Theta\begin{bmatrix}\|x - x^*\| & \|\mathbf x^{\perp}\|\end{bmatrix}^T$, where
$$\Theta = \begin{bmatrix}\mu & -\theta \\ -\theta & \lambda_2(L) - \theta\end{bmatrix}$$
Under the conditions in the statement, $\Theta$ is positive definite. Hence $\dot V \le 0$, and $\dot V = 0$ only if $\mathbf x^{\perp} = \mathbf 0$ and $x = x^*$, hence $\mathbf x = \mathbf 1_N \otimes x^*$. The conclusion follows by LaSalle's invariance principle.

Exponential convergence follows using $\|\mathbf x^{\|} - \bar{\mathbf x}\| = \sqrt N\,\|x - x^*\|$, under the stricter condition on $\lambda_2(L)$. Indeed,
$$\dot V \le -\begin{bmatrix}\|\mathbf x^{\|} - \bar{\mathbf x}\| & \|\mathbf x^{\perp}\|\end{bmatrix}\Theta_N\begin{bmatrix}\|\mathbf x^{\|} - \bar{\mathbf x}\| \\ \|\mathbf x^{\perp}\|\end{bmatrix}$$
where $\Theta_N = \begin{bmatrix}\mu/N & -\theta \\ -\theta & \lambda_2(L) - \theta\end{bmatrix}$ is positive definite if $\lambda_2(L) > \frac{N\theta^2}{\mu} + \theta$. This implies that $\dot V(\mathbf x(t)) \le -\eta V(\mathbf x(t))$, for some $\eta > 0$, so that $\mathbf x(t)$ converges exponentially to $\bar{\mathbf x}$. $\blacksquare$

Remark 2.
Since $\theta, \mu$ are related to the coupling in the players' cost functions and $\lambda_2(L)$ to the connectivity between players, Theorem 2 highlights the tradeoff between properties of the game and those of the communication graph $G_c$. Key is the fact that the Laplacian contribution can be used to balance the other terms in $\dot V$. Alternatively, $\mathbf L$ on the feedback path in Figure 2 has excess passivity, which compensates for the lack of passivity in the $\mathbf F$ terms, i.e., in $\widetilde\Sigma$ on the forward path.

Remark 3. We note that we can relax the monotonicity assumption to hold just at the NE $x^*$, recovering a strict-diagonal assumption used in [1]. However, since $x^*$ is unknown, such an assumption cannot be checked a priori except for special cases such as quadratic games (see Section VII). Local results follow if the assumptions for $F(\cdot)$ hold only locally around $x^*$, and those for $\mathbf F(\cdot)$ only locally around $\mathbf 1_N\otimes x^*$. We note that the class of quadratic games satisfies Assumption 3(ii) globally.

An alternative representation of the dynamics $\widetilde{\mathbf P}$, (17), reveals interesting connections to distributed optimization and passivity-based control, [10], [36]. To that end, we use two matrices to write a compact representation of $\widetilde{\mathbf P}$, (17). Let $\mathcal S = \operatorname{diag}(S_1, \dots, S_N) \in \mathbb R^{(Nn-n)\times Nn}$, with $S_i$ as in (20). Then $\mathcal S$ and $\mathcal R$, (11), satisfy
$$\mathcal S\mathcal R^T = 0 \quad\text{and}\quad \mathcal R^T\mathcal R + \mathcal S^T\mathcal S = I, \quad \mathcal R\mathcal R^T = I, \quad \mathcal R\mathcal S^T = 0, \quad \mathcal S\mathcal S^T = I \quad (26)$$
Using $\mathcal R$ and $\mathcal S$, the stacked actions are $[(x_1^1)^T, \dots, (x_N^N)^T]^T = \mathcal R\mathbf x$, while the stacked estimates are $[(x_{-1}^1)^T, \dots, (x_{-N}^N)^T]^T = \mathcal S\mathbf x$. Let $x = \mathcal R\mathbf x$ and $z = \mathcal S\mathbf x$; using the properties of $\mathcal R$, $\mathcal S$ in (26) yields $\mathbf x = \mathcal R^Tx + \mathcal S^Tz$. Thus the equivalent representation of $\widetilde{\mathbf P}$, (17), is:
$$\widetilde{\mathbf P}: \quad \begin{cases}\dot x = -\mathbf F(\mathcal R^Tx + \mathcal S^Tz) - \mathcal R\mathbf L[\mathcal R^Tx + \mathcal S^Tz] \\ \dot z = -\mathcal S\mathbf L[\mathcal R^Tx + \mathcal S^Tz]\end{cases} \quad (27)$$
which separates the stacked action and estimate components of the dynamics.
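The identities in (26) are purely structural and can be checked numerically. The sketch below builds the selection matrices $R_i$ (own action) and $S_i$ (others' actions) for hypothetical heterogeneous dimensions $n_1 = 1$, $n_2 = 2$, $n_3 = 1$ (illustrative values, not from the paper) and verifies (26).

```python
import numpy as np

# Build R_i (selects player i's own block of x^i) and S_i (removes it),
# for hypothetical dimensions n_1=1, n_2=2, n_3=1, and verify (26).
dims = [1, 2, 1]
n = sum(dims)
offs = np.cumsum([0] + dims)

def R_S(i):
    I = np.eye(n)
    own = slice(offs[i], offs[i + 1])
    mask = np.ones(n, dtype=bool)
    mask[own] = False
    return I[own, :], I[mask, :]          # rows of I_n: select / remove block i

def blkdiag(mats):
    rows = sum(m.shape[0] for m in mats)
    cols = sum(m.shape[1] for m in mats)
    out = np.zeros((rows, cols))
    r = c = 0
    for m in mats:
        out[r:r + m.shape[0], c:c + m.shape[1]] = m
        r += m.shape[0]
        c += m.shape[1]
    return out

R = blkdiag([R_S(i)[0] for i in range(len(dims))])   # R = diag(R_1,...,R_N)
S = blkdiag([R_S(i)[1] for i in range(len(dims))])   # S = diag(S_1,...,S_N)

# identities (26)
assert np.allclose(R @ R.T, np.eye(R.shape[0]))
assert np.allclose(S @ S.T, np.eye(S.shape[0]))
assert np.allclose(R @ S.T, 0) and np.allclose(S @ R.T, 0)
assert np.allclose(R.T @ R + S.T @ S, np.eye(R.shape[1]))
```

In particular, $\mathcal R^T\mathcal R + \mathcal S^T\mathcal S = I$ is the orthogonal split of each estimate vector into own-action and other-actions components used to pass between (17) and (27).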
Using $\mathbf L = L\otimes I_n$ and $L = QQ^T$, with $Q$ the incidence matrix, yields the interconnected block diagram in Figure 3. We note that, unlike distributed optimization (e.g. [10]) and passivity-based control (e.g. [36]), where the dynamics are decoupled, in Figure 3 the dynamics on the top path are coupled, due to the inherent coupling of the players' cost functions in the game $G(\mathcal I, J_i, \Omega_i)$. As another observation, recall that, given a matrix $Q$, pre-multiplication by $Q$ and post-multiplication by $Q^T$ preserve passivity of a system. Figure 3 shows that generalized dynamics can be designed by substituting the identity block on the feedback path with some other passive dynamics. One such dynamic generalization can be obtained by substituting the static feedback through $\mathbf L$ with an integrator or proportional-integral term through $\mathbf L$, which preserves passivity, as in [10]. Thus the passivity interpretation of the NE seeking dynamics design allows a systematic derivation of new dynamics/algorithms.

Note that if the dynamics of the estimates were modified such that the system approached the consensus subspace quickly, then convergence to the NE could be shown via a time-scale decomposition approach. This is explored in the next section.

B. Two-Timescale Singular Perturbation Analysis
In this section we relax the connectivity bound on $\lambda_2(L)$ in Theorem 2, based on a time-scale separation argument. The idea is to modify the dynamics of the estimates in $\widetilde{\mathbf P}$, (17) or (27), such that the system approaches the consensus subspace quickly.

[Fig. 3. Block diagram of the actions and estimates dynamics in (17), with the Laplacian factored as $(Q\otimes I_n)(Q^T\otimes I_n)$ and an identity block on the feedback path.]

Under Assumptions 3(ii) and 4(ii), we show convergence to the NE over a sufficiently connected $G_c$, based on a time-scale decomposition approach. Recall the equivalent representation of $\widetilde{\mathbf P}$ in (27) and modify the estimate component of the dynamics such that it is much faster than the action component,
$$\widetilde{\mathbf P}_\epsilon: \quad \begin{cases}\dot x = -\mathbf F(\mathcal R^Tx + \mathcal S^Tz) - \mathcal R\mathbf L[\mathcal R^Tx + \mathcal S^Tz] \\ \epsilon\,\dot z = -\mathcal S\mathbf L[\mathcal R^Tx + \mathcal S^Tz]\end{cases} \quad (28)$$
where $\epsilon > 0$. Thus player $i$'s dynamics is as follows:
$$\widetilde P_{i,\epsilon}: \quad \begin{bmatrix}\dot x_i^i \\ \dot x_{-i}^i\end{bmatrix} = \begin{bmatrix}-\nabla_iJ_i(x_i^i, x_{-i}^i) - R_i\sum_{j\in\mathcal N_i}(x^i - x^j) \\ -\frac1\epsilon S_i\sum_{j\in\mathcal N_i}(x^i - x^j)\end{bmatrix} \quad (29)$$
with the $\frac1\epsilon$ high gain on the estimate component. $\widetilde{\mathbf P}_\epsilon$, (28), is in the standard form of a singularly perturbed system, where the estimate dynamics and the action dynamics are the fast and the slow components, respectively.

Theorem 3.
Consider a game $G(\mathcal I, J_i, \Omega_i)$ over a communication graph $G_c$ under Assumptions 1, 2(i), 3(ii) and 4(ii). Let each player's dynamics $\widetilde P_{i,\epsilon}$ be as in (29), or overall $\widetilde{\mathbf P}_\epsilon$, (28), $\epsilon > 0$. Then, there exists $\epsilon^* > 0$ such that for all $0 < \epsilon < \epsilon^*$, $(x^*, \mathcal S(\mathbf 1_N\otimes x^*))$ is exponentially stable. Alternatively, $(x^*, \mathcal S(\mathbf 1_N\otimes x^*))$ is asymptotically stable for all $0 < \epsilon$ such that $\lambda_2(L) > \epsilon N\sqrt N\left(\tfrac\theta\mu + 1\right)(\theta + 2d^*)$.

Proof.
We analyze (28) by examining the reduced and the boundary-layer systems. First we find the roots of $\mathcal S\mathbf L[\mathcal R^Tx + \mathcal S^Tz] = \mathbf 0$, or $\mathcal S\mathbf L\mathbf x = \mathbf 0$. Note that, by (26), $\mathbf x \in Null(\mathcal S\mathbf L)$ if and only if $\mathbf L\mathbf x \in Null(\mathcal S)$, which is equivalent to $\mathbf L\mathbf x \in Range(\mathcal R^T)$. Thus $\mathbf x \in Null(\mathcal S\mathbf L)$ if and only if there exists $q \in \mathbb R^n$ such that $\mathbf L\mathbf x = \mathcal R^Tq$. Then for such $q \in \mathbb R^n$ and for all $w \in \mathbb R^n$, $(\mathbf 1_N^T\otimes w^T)\mathbf L\mathbf x = (\mathbf 1_N^T\otimes w^T)\mathcal R^Tq$. Using $(\mathbf 1_N^T\otimes w^T)\mathbf L\mathbf x = 0$ by (16) and $\mathcal R$ in (11), this means $w^Tq = 0$ for all $w \in \mathbb R^n$. Therefore $q = 0$ and $\mathbf L\mathbf x = \mathbf 0$. By (15), $\mathbf x = \mathbf 1_N\otimes x$, $x \in \Omega$. Hence the roots of $\mathcal S\mathbf L[\mathcal R^Tx + \mathcal S^Tz] = \mathbf 0$ occur when $\mathbf x = \mathbf 1_N\otimes x$, i.e., $z = \mathcal S(\mathbf 1_N\otimes x)$.

We use a change of coordinates $v = z - \mathcal S(\mathbf 1_N\otimes x)$ to shift the equilibrium of the boundary-layer system to the origin. First, we use $z = v + \mathcal S(\mathbf 1_N\otimes x)$ and $x = \mathcal R(\mathbf 1_N\otimes x)$ to rewrite the term $\mathcal R^Tx + \mathcal S^Tz$ that appears in (28) as follows,
$$\mathcal R^Tx + \mathcal S^Tz = \mathcal R^T\mathcal R(\mathbf 1_N\otimes x) + \mathcal S^T(v + \mathcal S(\mathbf 1_N\otimes x)) = (\mathcal R^T\mathcal R + \mathcal S^T\mathcal S)(\mathbf 1_N\otimes x) + \mathcal S^Tv = \mathbf 1_N\otimes x + \mathcal S^Tv$$
where (26) was used. Using this and the change of variables $v = z - \mathcal S(\mathbf 1_N\otimes x)$ in (28), with $\mathbf L(\mathbf 1_N\otimes x) = \mathbf 0$, yields
$$\widetilde{\mathbf P}_\epsilon: \quad \begin{cases}\dot x = -\mathbf F(\mathbf 1_N\otimes x + \mathcal S^Tv) - \mathcal R\mathbf L\mathcal S^Tv \\ \epsilon\,\dot v = -\mathcal S\mathbf L\mathcal S^Tv + \epsilon\,\mathcal S(\mathbf 1_N\otimes \mathcal R\mathbf L\mathcal S^Tv) + \epsilon\,\mathcal S(\mathbf 1_N\otimes \mathbf F(\mathbf 1_N\otimes x + \mathcal S^Tv))\end{cases} \quad (30)$$
Note that $v = \mathbf 0$ is the quasi-steady state of $\epsilon\dot v$, and substituting this into $\dot x$ gives the reduced system as
$$\dot x = -\mathbf F(\mathbf 1_N\otimes x) = -F(x) \quad (31)$$
which is exactly the gradient dynamics and has equilibrium $x^*$ at the NE. By Lemma 2, under Assumption 3(ii) the gradient dynamics (31) is exponentially stable. The boundary-layer system on the $\tau = t/\epsilon$ timescale is
$$\frac{dv}{d\tau} = -\mathcal S\mathbf L\mathcal S^Tv \quad (32)$$
It can be shown that the matrix $\mathcal S\mathbf L\mathcal S^T$ is positive definite, so that (32) is exponentially stable. To see this, note that $\mathcal S\mathbf L\mathcal S^Tv = \mathbf 0$ only if $\mathcal S^Tv \in Null(\mathcal S\mathbf L)$. Recall that $Null(\mathcal S\mathbf L) = Null(\mathbf L) = \{\mathbf 1_N\otimes w \mid w \in \Omega\}$. Note that to be in $Null(\mathbf L)$, $\mathbf y = [(y^1)^T \dots (y^N)^T]^T = \mathcal S^Tv$ has to have $y^i = y^j$, $\forall i,j \in \mathcal I$, but $y^i$ has a zero in component $y_i^i$, $\forall i \in \mathcal I$, due to the definition of $\mathcal S$. Therefore $y^i \ne y^j$ unless $y_i^j = y_i^i = 0$, for all $j$. Therefore $\mathbf y = \mathcal S^Tv$ has to be equal to $\mathbf 0$ to be in $Null(\mathbf L)$. Since $Null(\mathcal S^T) = \{\mathbf 0\}$, this implies $v = \mathbf 0$; hence $Null(\mathcal S\mathbf L\mathcal S^T) = \{\mathbf 0\}$ and $\mathcal S\mathbf L\mathcal S^T$ is positive definite. By Theorem 11.4 in [35] it follows that there exists $\epsilon^* > 0$, such that for all $\epsilon < \epsilon^*$, $(x^*, \mathbf 0)$ is exponentially stable for (30), or $(x^*, \mathcal S(\mathbf 1_N\otimes x^*))$ is exponentially stable for (28).

Alternatively, Theorem 11.3 in [35] can be applied to (30) to show asymptotic stability. The two Lyapunov functions are $V(x) = \frac12\|x - x^*\|^2$ and $W(v) = \frac12\|v\|^2$, and along the reduced and the boundary-layer systems, (31), (32), the following hold
$$-(x - x^*)^T\mathbf F(\mathbf 1_N\otimes x) \le -\mu\|x - x^*\|^2, \qquad -v^T\mathcal S\mathbf L\mathcal S^Tv \le -\frac{\lambda_2(L)}{N}\|v\|^2$$
so that (11.39) and (11.40) in [35] hold for $\alpha_1 = \mu$, $\psi_1(x) = \|x - x^*\|$, and $\alpha_2 = \lambda_2(L)/N$, $\psi_2(v) = \|v\|$. Note also that the following holds
$$(x - x^*)^T\left[-\mathbf F(\mathbf 1_N\otimes x + \mathcal S^Tv) - \mathcal R\mathbf L\mathcal S^Tv + \mathbf F(\mathbf 1_N\otimes x)\right] \le (\theta + \lambda_N(L))\,\|x - x^*\|\,\|v\|$$
so that (11.43) in [35] holds for $\beta_1 = \theta + \lambda_N(L)$. Similarly,
$$v^T\mathcal S(\mathbf 1_N\otimes \mathcal R\mathbf L\mathcal S^Tv) + v^T\mathcal S(\mathbf 1_N\otimes \mathbf F(\mathbf 1_N\otimes x + \mathcal S^Tv)) \le \sqrt N\,\theta\|x - x^*\|\,\|v\| + \sqrt N(\theta + \lambda_N(L))\|v\|^2$$
and (11.44) in [35] holds for $\beta_2 = \sqrt N\theta$, $\gamma = \sqrt N(\theta + \lambda_N(L))$.
Then, using Theorem 11.3, $\epsilon^* = \frac{\alpha_1\alpha_2}{\alpha_1\gamma + \beta_1\beta_2}$ is given as
$$\epsilon^* = \frac{\lambda_2(L)\,\mu}{N\sqrt N\,(\theta + \mu)(\theta + \lambda_N(L))}$$
and using $\lambda_N(L) \le 2d^*$, $d^* = \max_{i\in\mathcal I}|\mathcal N_i|$,
$$\epsilon^* \ge \frac{\lambda_2(L)\,\mu}{N\sqrt N\,(\theta + \mu)(\theta + 2d^*)}$$
Then, by Theorem 11.3 in [35], for any $0 < \epsilon$ such that
$$\lambda_2(L) > \epsilon N\sqrt N\left(\frac\theta\mu + 1\right)(\theta + 2d^*)$$
$(x^*, \mathbf 0)$ is asymptotically stable for (30); hence $(x^*, \mathcal S(\mathbf 1_N\otimes x^*))$ is asymptotically stable for (28). $\blacksquare$

Remark 4. In $\widetilde P_{i,\epsilon}$, (29), the estimate dynamics is made faster with the gain $1/\epsilon$. It can be shown that for sufficiently high $1/\epsilon$, i.e., $\frac1\epsilon > N\sqrt N\left(1 + \frac{2d^*}\theta\right)$, the bound on $\lambda_2(L)$ in Theorem 3 is lower than the bound on $\lambda_2(L)$ in Theorem 2. Alternatively, we can consider a gain parameter $\frac1\epsilon$ on the estimates in (17) to improve the lower bound to $\lambda_2(L) > \epsilon\left(\frac{\theta^2}\mu + \theta\right)$, as shown in the next section. Thus a higher $1/\epsilon$ can relax the connectivity bound on $\lambda_2(L)$, but $\epsilon$ is a global parameter. This highlights another aspect of the tradeoff between game properties (coupling), communication graph (consensus) properties and information.

VI. PROJECTED NE DYNAMICS FOR COMPACT ACTION SETS
In this section we treat the case of compact action sets $\Omega_i$, under Assumption 2(ii), using projected dynamics. We highlight the major steps of the approach and their differences compared to the unconstrained action set case.

In a game of perfect information with compact action sets $\Omega_i$, under Assumption 2(ii), each player $i \in \mathcal I$ runs the projected gradient-based dynamics, given as [21], [22],
$$P_i: \quad \dot x_i = \Pi_{\Omega_i}(x_i, -\nabla_iJ_i(x_i, x_{-i})), \quad x_i(0) \in \Omega_i \quad (33)$$
The overall system of all agents' projected dynamics in stacked notation is given by
$$P: \quad \dot x(t) = \Pi_\Omega(x(t), -F(x(t))), \quad x(0) \in \Omega \quad (34)$$
or, equivalently, $P: \dot x(t) = P_{T_\Omega(x(t))}[-F(x(t))]$, $x(0) \in \Omega$, where the equivalence follows by using Lemma 1 (Moreau's decomposition theorem, [25]), or directly by Proposition 1 and Corollary 1 in [37]. Furthermore, this is equivalent to the differential inclusion [38]
$$-F(x(t)) - \Pi_\Omega(x(t), -F(x(t))) \in N_\Omega(x(t)) \quad (35)$$
In all the above, the projection operator is discontinuous on the boundary of $\Omega$. We use the standard definition of a solution of a projected dynamical system (PDS), (Definition 2.5 in [24]). Thus we call $x: [0, +\infty) \to \Omega$ a solution of (34) if $x(\cdot)$ is an absolutely continuous function $t \mapsto x(t)$ and $\dot x(t) = \Pi_\Omega(x(t), -F(x(t)))$ holds almost everywhere (a.e.) with respect to $t$, i.e., except on a set of measure zero. The existence of a unique solution of (34) is guaranteed for any $x(0) \in \Omega$ under Lipschitz continuity of $F$ on $\Omega$, cf. Theorem 2.5 in [24]. Note that any solution must necessarily lie in $\Omega$ for almost every $t$. Alternatively, existence holds under continuity and (hypo)monotonicity of $F$, i.e., for some $\mu \le 0$,
$$(x - x')^T(F(x) - F(x')) \ge \mu\|x - x'\|^2, \quad \forall x, x' \in \Omega$$
(see Assumption 2.1 in [24], and also Theorem 1 in [37] for the extension to non-autonomous systems). This is similar to the QUAD relaxation in [39].
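For intuition, when $\Omega$ is a box the projected vector field $\Pi_\Omega(x, v)$ reduces to a coordinate-wise test: a component of $v$ is zeroed exactly when it points outside an active constraint. The sketch below (a minimal, assumed box-constrained setting, with an illustrative 1-d variational inequality not taken from the paper) implements this and integrates $\dot x = \Pi_\Omega(x, -F(x))$ with a projected Euler step.

```python
import numpy as np

# Projected vector field Pi_Omega(x, v) for a box Omega = prod [lo_i, hi_i]:
# by Moreau's decomposition, only outward-pointing components at active
# constraints are removed (coordinate-wise for a box).
def proj_field(x, v, lo, hi, tol=1e-12):
    v = v.copy()
    v[(x <= lo + tol) & (v < 0)] = 0.0    # outward at an active lower face
    v[(x >= hi - tol) & (v > 0)] = 0.0    # outward at an active upper face
    return v

# Illustrative 1-d example: F(x) = x - 2 on Omega = [0, 1]. The solution of
# VI(F, Omega) is x* = 1, a boundary point where -F(1) = 1 points outward.
lo, hi = np.array([0.0]), np.array([1.0])
F = lambda x: x - 2.0
x = np.array([0.1])
dt = 0.01
for _ in range(2000):
    # Euler step on the projected field; the clip keeps x feasible
    x = np.clip(x + dt * proj_field(x, -F(x), lo, hi), lo, hi)

assert (lo <= x).all() and (x <= hi).all()   # trajectory stays in Omega
assert np.allclose(x, 1.0)                    # converges to the VI solution
```

Note that the equilibrium here is not a zero of $F$; it is a point where $-F(\bar x)$ lies in the normal cone, exactly the situation described in (36) below.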
Hypomonotonicity means that $F(x) - \mu x$ is monotone, where $\mu \le 0$. Note that $F(x) + \eta x$ is strongly monotone for any $\eta > -\mu$, with monotonicity constant $\eta + \mu > 0$. When $\mu > 0$, in fact, we can take $\eta = 0$ and recover $\mu$-strong monotonicity of $F$. Thus, under Assumptions 2(ii), 3(i), for any $x(0) \in \Omega$ there exists a unique solution of (34), and moreover, a.e., $\dot x(t) \in T_\Omega(x(t))$ and $x(t) \in \Omega$.

Equilibrium points of (34) coincide with Nash equilibria, which are solutions of VI$(F, \Omega)$, by Theorem 2.4 in [24]. To see this, let $\bar x$ be an equilibrium point of (34), i.e., $\bar x \in \{x \in \Omega \mid \mathbf 0_n = \Pi_\Omega(x, -F(x))\}$. By Lemma 2.1 in [24] and (1), if $\bar x \in \operatorname{int}\Omega$, then $\mathbf 0_n = \Pi_\Omega(\bar x, -F(\bar x)) = -F(\bar x)$, while if $\bar x \in \partial\Omega$, then
$$\mathbf 0_n = \Pi_\Omega(\bar x, -F(\bar x)) = -F(\bar x) - \beta\,\hat n \quad (36)$$
for some $\beta \ge 0$ and unit normal $\hat n \in n(\bar x) \subset N_\Omega(\bar x)$. Equivalently, by (35), $-F(\bar x) \in N_\Omega(\bar x)$, and using the definition of $N_\Omega(\bar x)$ it follows that
$$-F(\bar x)^T(x - \bar x) \le 0, \quad \forall x \in \Omega$$
Comparing to (6), or (7), it follows that $\bar x = x^*$. Thus the equilibrium points of (34), $\{x^* \in \Omega \mid \mathbf 0_n = \Pi_\Omega(x^*, -F(x^*))\}$, coincide with Nash equilibria $x^*$.

Lemma 5.
Consider a game $G(\mathcal I, J_i, \Omega_i)$ in the perfect information case, under Assumptions 2(ii) and 3(i). Then, for any $x_i(0) \in \Omega_i$, the solution of (33), $i \in \mathcal I$, or (34), converges asymptotically to the NE of the game $x^*$. Under Assumption 3(ii) convergence is exponential.

Proof. The proof follows from Theorems 3.6 and 3.7 in [24]. Consider any $x(0) \in \Omega$ and $V(t, x) = \frac12\|x(t) - x^*\|^2$, where $x^*$ is the Nash equilibrium of the game and $x(t)$ is the solution of (34). Then the time derivative of $V$ along solutions of (34) is $\dot V = (x(t) - x^*)^T\Pi_\Omega(x(t), -F(x(t)))$. Since $T_\Omega(x(t)) = [N_\Omega(x(t))]^\circ$, by Moreau's decomposition theorem (Lemma 1), at any point $x(t) \in \Omega$ the pseudo-gradient $-F(x(t))$ can be decomposed as in (2) into normal and tangent components, in $N_\Omega(x(t))$ and $T_\Omega(x(t))$. Since $\Pi_\Omega(x(t), -F(x(t))) = P_{T_\Omega(x(t))}(-F(x(t)))$ is in the tangent cone $T_\Omega(x(t))$, it follows as in (35) that
$$-F(x(t)) - \Pi_\Omega(x(t), -F(x(t))) \in N_\Omega(x(t)) \quad (37)$$
From the definition of the normal cone $N_\Omega(x(t))$ this means
$$(x' - x(t))^T\big(-F(x(t)) - \Pi_\Omega(x(t), -F(x(t)))\big) \le 0$$
for all $x' \in \Omega$. Thus it follows that for $x' = x^*$ and $\forall x(t) \in \Omega$,
$$(x(t) - x^*)^T\Pi_\Omega(x(t), -F(x(t))) \le -(x(t) - x^*)^TF(x(t))$$
From (6), at the Nash equilibrium, $F(x^*)^T(x(t) - x^*) \ge 0$, $\forall x(t) \in \Omega$. Therefore, adding this to the right-hand side of the above and using $\dot V(t)$ yields that along solutions of (34), for all $t \ge 0$, $\dot V = (x(t) - x^*)^T\Pi_\Omega(x(t), -F(x(t))) \le -(x(t) - x^*)^T(F(x(t)) - F(x^*)) < 0$ when $x(t) \ne x^*$, where the strict inequality follows from Assumption 3(i). Hence $V(t)$ is monotonically decreasing and non-negative, and thus there exists $\lim_{t\to\infty} V(t) = \bar V$. As in Theorem 3.6 in [24], a contradiction argument can be used to show that $\bar V = 0$; hence for any $x(0) \in \Omega$, $\|x(t) - x^*\| \to 0$ as $t \to \infty$. Under Assumption 3(ii), for any $x(0) \in \Omega$, along solutions of (34), for all $t \ge 0$, $\dot V \le -\mu\|x(t) - x^*\|^2 = -2\mu V(t)$, $\mu > 0$, and exponential convergence follows immediately. $\blacksquare$

In the partial or networked information case, over graph $G_c$, we modify each player's $\widetilde\Sigma_i$ in (9), using projected dynamics for the action component on $\Omega_i$, as in
$$\widetilde\Sigma_i: \quad \begin{bmatrix}\dot x_i^i \\ \dot x_{-i}^i\end{bmatrix} = \begin{bmatrix}\Pi_{\Omega_i}\big(x_i^i, -\nabla_iJ_i(x_i^i, x_{-i}^i) + B_i^iu_i(t)\big) \\ B_{-i}^iu_i\end{bmatrix}, \qquad y_i = (B^i)^Tx^i \quad (38)$$
where $u_i(t) \in \mathbb R^n$ is a piecewise continuous function, to be designed based on the relative output feedback from player $i$'s neighbours, such that the estimates $x^i = x^j$, for all $i, j$, and converge towards the NE $x^*$. Write $\widetilde\Sigma_i$, (38), in a more compact form,
$$\widetilde\Sigma_i: \quad \begin{cases}\dot x^i = R_i^T\Pi_{\Omega_i}\big(x_i^i, -\nabla_iJ_i(x^i) + R_iB^iu_i\big) + S_i^TS_iB^iu_i \\ y_i = (B^i)^Tx^i\end{cases} \quad (39)$$
where $R_i$, $S_i$ are defined as in (11), (20). The overall dynamics of all players becomes, in stacked form,
$$\widetilde\Sigma: \quad \begin{cases}\dot{\mathbf x} = \mathcal R^T\Pi_\Omega(\mathcal R\mathbf x(t), -\mathbf F(\mathbf x(t)) + \mathcal R\mathbf B\mathbf u(t)) + \mathcal S^T\mathcal S\mathbf B\mathbf u(t) \\ \mathbf y = \mathbf B^T\mathbf x(t)\end{cases} \quad (40)$$
where $x = \mathcal R\mathbf x$, $\mathcal R = \operatorname{diag}(R_1, \dots, R_N)$, $\mathcal S = \operatorname{diag}(S_1, \dots, S_N)$, satisfying the properties in (26), and $\mathbf u(t) \in \mathbb R^{Nn}$ is piecewise continuous. This is similar to (12), except that the dynamics for the action components is projected on $\Omega$. For any $\mathcal R\mathbf x(0) \in \Omega$, existence of a unique solution of (40) is guaranteed under Assumption 4(i) or (ii) by Theorem 1 in [37]. Extending the incremental passivity and EIP concepts to projected dynamical systems leads to the following result.

Lemma 6.
Under Assumption 4(i), the overall system $\widetilde\Sigma$, (40), is incrementally passive, hence EIP.

Proof. Consider two inputs $\mathbf u(t), \mathbf u'(t)$ and let $\mathbf x(t), \mathbf x'(t)$, $\mathbf y(t), \mathbf y'(t)$ be the corresponding state trajectories and outputs of $\widetilde\Sigma$, (40). Let the storage function be $V(t, \mathbf x, \mathbf x') = \frac12\|\mathbf x(t) - \mathbf x'(t)\|^2$. Then, along solutions of (40),
$$\begin{aligned}\dot V ={}& (\mathbf x(t) - \mathbf x'(t))^T\mathcal R^T\big[\Pi_\Omega(\mathcal R\mathbf x(t), -\mathbf F(\mathbf x(t)) + \mathcal R\mathbf B\mathbf u(t)) - \Pi_\Omega(\mathcal R\mathbf x'(t), -\mathbf F(\mathbf x'(t)) + \mathcal R\mathbf B\mathbf u'(t))\big] \\ &+ (\mathbf x(t) - \mathbf x'(t))^T\mathcal S^T\mathcal S\mathbf B(\mathbf u(t) - \mathbf u'(t))\end{aligned} \quad (41)$$
Notice that, using (26), the following holds for (40),
$$\mathcal R\dot{\mathbf x}(t) = \Pi_\Omega(\mathcal R\mathbf x(t), -\mathbf F(\mathbf x(t)) + \mathcal R\mathbf B\mathbf u(t)) \in T_\Omega(\mathcal R\mathbf x(t))$$
Hence, as in (37), for any $\mathcal R\mathbf x(t) \in \Omega$,
$$-\mathbf F(\mathbf x(t)) + \mathcal R\mathbf B\mathbf u(t) - \Pi_\Omega(\mathcal R\mathbf x(t), -\mathbf F(\mathbf x(t)) + \mathcal R\mathbf B\mathbf u(t)) \in N_\Omega(\mathcal R\mathbf x(t))$$
and, using the definition of the normal cone $N_\Omega(\mathcal R\mathbf x(t))$,
$$(\mathcal R\mathbf x(t) - \mathcal R\mathbf x'(t))^T\Pi_\Omega(\mathcal R\mathbf x(t), -\mathbf F(\mathbf x(t)) + \mathcal R\mathbf B\mathbf u(t)) \le (\mathcal R\mathbf x(t) - \mathcal R\mathbf x'(t))^T(-\mathbf F(\mathbf x(t)) + \mathcal R\mathbf B\mathbf u(t)) \quad (42)$$
for all $\mathcal R\mathbf x'(t) \in \Omega$. Since $\mathcal R\mathbf x(t) \in \Omega$ and $\mathcal R\mathbf x'(t) \in \Omega$ are both arbitrary elements of $\Omega$, swapping them leads to
$$(\mathcal R\mathbf x'(t) - \mathcal R\mathbf x(t))^T\Pi_\Omega(\mathcal R\mathbf x'(t), -\mathbf F(\mathbf x'(t)) + \mathcal R\mathbf B\mathbf u'(t)) \le (\mathcal R\mathbf x'(t) - \mathcal R\mathbf x(t))^T(-\mathbf F(\mathbf x'(t)) + \mathcal R\mathbf B\mathbf u'(t)) \quad (43)$$
Adding (42) and (43) results in
$$\begin{aligned}&(\mathbf x(t) - \mathbf x'(t))^T\mathcal R^T\big[\Pi_\Omega(\mathcal R\mathbf x(t), -\mathbf F(\mathbf x(t)) + \mathcal R\mathbf B\mathbf u(t)) - \Pi_\Omega(\mathcal R\mathbf x'(t), -\mathbf F(\mathbf x'(t)) + \mathcal R\mathbf B\mathbf u'(t))\big] \\ &\quad\le -(\mathbf x(t) - \mathbf x'(t))^T\mathcal R^T(\mathbf F(\mathbf x(t)) - \mathbf F(\mathbf x'(t))) + (\mathbf x(t) - \mathbf x'(t))^T\mathcal R^T\mathcal R\mathbf B(\mathbf u(t) - \mathbf u'(t))\end{aligned}$$
Therefore, using this in (41) yields
$$\dot V \le -(\mathbf x(t) - \mathbf x'(t))^T\mathcal R^T(\mathbf F(\mathbf x(t)) - \mathbf F(\mathbf x'(t))) + (\mathbf x(t) - \mathbf x'(t))^T\mathcal R^T\mathcal R\mathbf B(\mathbf u(t) - \mathbf u'(t)) + (\mathbf x(t) - \mathbf x'(t))^T\mathcal S^T\mathcal S\mathbf B(\mathbf u(t) - \mathbf u'(t))$$
or, using $\mathcal R^T\mathcal R + \mathcal S^T\mathcal S = I$,
$$\dot V \le -(\mathbf x(t) - \mathbf x'(t))^T\mathcal R^T(\mathbf F(\mathbf x(t)) - \mathbf F(\mathbf x'(t))) + (\mathbf x(t) - \mathbf x'(t))^T\mathbf B(\mathbf u(t) - \mathbf u'(t)) \quad (44)$$
Finally, using Assumption 4(i), it follows that $\dot V \le (\mathbf y(t) - \mathbf y'(t))^T(\mathbf u(t) - \mathbf u'(t))$, and $\widetilde\Sigma$ is incrementally passive, hence EIP. $\blacksquare$

Since $\widetilde\Sigma$, (40), is incrementally passive by Lemma 6, and $\mathbf L$ is positive semi-definite, as in Section IV-A we consider the passivity-based control $\mathbf u(t) = -\mathbf L\mathbf x(t)$.
The resulting closed-loop system, which represents the new overall system dynamics $\widetilde{\mathbf P}$, is given in stacked notation as
$$\widetilde{\mathbf P}: \quad \dot{\mathbf x} = \mathcal R^T\Pi_\Omega(\mathcal R\mathbf x(t), -\mathbf F(\mathbf x(t)) - \mathcal R\mathbf L\mathbf x(t)) - \mathcal S^T\mathcal S\mathbf L\mathbf x(t) \quad (45)$$
Alternatively, using $x = \mathcal R\mathbf x$, $z = \mathcal S\mathbf x$, $\mathcal R\mathcal S^T = 0$, equivalently with actions and estimates separated as in (27),
$$\widetilde{\mathbf P}: \quad \begin{cases}\dot x = \Pi_\Omega\big(x, -\mathbf F(\mathcal R^Tx + \mathcal S^Tz) - \mathcal R\mathbf L[\mathcal R^Tx + \mathcal S^Tz]\big) \\ \dot z = -\mathcal S\mathbf L[\mathcal R^Tx + \mathcal S^Tz]\end{cases} \quad (46)$$
Existence of a unique solution of (45) or (46) is guaranteed under Assumption 4(i) or (ii) by Theorem 1 in [37]. From (45) or (46), after separating the action $x_i^i = x_i$ and estimate $x_{-i}^i$ dynamics, the new projected player dynamics $\widetilde P_i$ are
$$\widetilde P_i: \quad \begin{cases}\dot x_i^i = \Pi_{\Omega_i}\Big(x_i^i, -\nabla_iJ_i(x^i) - R_i\sum_{j\in\mathcal N_i}(x^i - x^j)\Big) \\ \dot x_{-i}^i = -S_i\sum_{j\in\mathcal N_i}(x^i - x^j)\end{cases} \quad (47)$$
Compared to (19), $\widetilde P_i$ in (47) has projected action components. The next result shows that the equilibrium of (45) or (47) occurs when the agents are at consensus and at the NE.

Lemma 7.
Consider a game $G(\mathcal I, J_i, \Omega_i)$ over a communication graph $G_c$ under Assumptions 1, 2(ii) and 4(i) or (ii). Let the dynamics for each agent $\widetilde P_i$ be as in (47), or overall $\widetilde{\mathbf P}$, (45). At an equilibrium point $\bar{\mathbf x}$ the estimate vectors of all players are equal, $\bar x^i = \bar x^j$, $\forall i,j \in \mathcal I$, and equal to the Nash equilibrium profile $x^*$; hence the action components of all players coincide with the optimal actions, $\bar x_i^i = x_i^*$, $\forall i \in \mathcal I$.

Proof. Let $\bar{\mathbf x}$ denote an equilibrium of (45),
$$\mathbf 0_{Nn} = \mathcal R^T\Pi_\Omega(\mathcal R\bar{\mathbf x}, -\mathbf F(\bar{\mathbf x}) - \mathcal R\mathbf L\bar{\mathbf x}) - \mathcal S^T\mathcal S\mathbf L\bar{\mathbf x} \quad (48)$$
Pre-multiplying both sides by $\mathcal R$ and using (26) simplifies this to
$$\mathbf 0_n = \Pi_\Omega(\mathcal R\bar{\mathbf x}, -\mathbf F(\bar{\mathbf x}) - \mathcal R\mathbf L\bar{\mathbf x}) \quad (49)$$
Substituting (49) into (48) results in $\mathbf 0_{Nn} = -\mathcal S^T\mathcal S\mathbf L\bar{\mathbf x}$, which implies $\bar{\mathbf x} \in Null(\mathbf L)$. From this it follows that $\bar x^i = \bar x^j$, $\forall i,j \in \mathcal I$, by Assumption 1 and (15). Therefore $\bar{\mathbf x} = \mathbf 1_N\otimes\bar x$, for some $\bar x \in \Omega$. Substituting this back into (49) yields $\mathbf 0_n = \Pi_\Omega(\mathcal R(\mathbf 1_N\otimes\bar x), -\mathbf F(\mathbf 1_N\otimes\bar x))$, or $\mathbf 0_n = \Pi_\Omega(\bar x, -F(\bar x))$ by using (13). Therefore, as in (36), it follows that $-F(\bar x) \in N_\Omega(\bar x)$; hence, by (7), $\bar x = x^*$, the NE. Thus $\bar{\mathbf x} = \mathbf 1_N\otimes x^*$ and for all $i,j \in \mathcal I$, $\bar x^i = \bar x^j = x^*$, the NE of the game. $\blacksquare$

The following results show single-timescale convergence to the NE of the game over a connected $G_c$, under Assumption 4(i) or Assumption 4(ii).

Theorem 4.
Consider a game $G(\mathcal I, J_i, \Omega_i)$ over a communication graph $G_c$ under Assumptions 1, 2(ii), 3(i) and 4(i). Let each player's dynamics $\widetilde P_i$ be as in (47), or overall $\widetilde{\mathbf P}$, (45). Then, for any $x(0) \in \Omega$ and any $z(0)$, the solution of (45) asymptotically converges to $\mathbf 1_N\otimes x^*$, and the action components converge to the NE of the game, $x^*$.

Proof. The proof is similar to the proof of Theorem 1 except that, instead of LaSalle's invariance principle, the argument is based on Barbalat's lemma, [35], since the system is time-varying. Let $V(t, \mathbf x) = \frac12\|\mathbf x(t) - \bar{\mathbf x}\|^2$, where by Lemma 7, $\bar{\mathbf x} = \mathbf 1_N\otimes x^*$. Using (44) in Lemma 6, for $\mathbf x'(t) = \bar{\mathbf x}$, $\mathbf u(t) = -\mathbf L\mathbf x(t)$, $\mathbf u'(t) = -\mathbf L\bar{\mathbf x}$, it follows that for any $x(0) \in \Omega$ and any $z(0)$, along (45),
$$\dot V \le -(\mathbf x(t) - \bar{\mathbf x})^T\mathcal R^T(\mathbf F(\mathbf x(t)) - \mathbf F(\bar{\mathbf x})) - (\mathbf x(t) - \bar{\mathbf x})^T\mathbf L(\mathbf x(t) - \bar{\mathbf x}) \quad (50)$$
Under Assumption 4(i), $\dot V \le 0$ for all $\mathcal R\mathbf x(t) \in \Omega$ and any $z(0)$. Thus $V(t, \mathbf x(t))$ is non-increasing and bounded from below by $0$; hence it converges as $t \to \infty$ to some $\bar V \ge 0$. Then, under Assumption 4(i), it follows that $\lim_{t\to\infty}\int_0^t(\mathbf x(\tau) - \bar{\mathbf x})^T\mathbf L(\mathbf x(\tau) - \bar{\mathbf x})\,d\tau$ exists and is finite. Since $\mathbf x(t)$ is absolutely continuous, hence uniformly continuous, from Barbalat's lemma in [35] it follows that $\mathbf L(\mathbf x(t) - \bar{\mathbf x}) \to \mathbf 0$ as $t \to \infty$. Since $\bar{\mathbf x} = \mathbf 1_N\otimes x^*$, this means that $\mathbf x(t) \to \mathbf 1_N\otimes\hat x$ as $t \to \infty$, for some $\hat x \in \Omega$. Then $V(t, \mathbf x(t)) = \frac12\|\mathbf x(t) - \bar{\mathbf x}\|^2 \to \frac12\|\mathbf 1_N\otimes(\hat x - x^*)\|^2 = \bar V$ as $t \to \infty$. If $\bar V = 0$ the proof is completed. Using the strict monotonicity Assumption 3(i), it can be shown by a contradiction argument that $\hat x = x^*$ and $\bar V = 0$. Assume that $\hat x \ne x^*$ and $\bar V > 0$. Then from (50) there exists a sequence $\{t_k\}$, $t_k \to \infty$ as $k \to \infty$, such that $\dot V(t_k) \to 0$ as $k \to \infty$. Suppose this claim is false. Then there exist a $d > 0$ and a $T > 0$ such that $\dot V(t) \le -d$ for all $t > T$, which contradicts $V(t) \ge 0$; hence the claim is true. Substituting $\{t_k\}$ into (50) yields
$$\dot V(t_k) \le -(\mathbf x(t_k) - \bar{\mathbf x})^T\mathcal R^T(\mathbf F(\mathbf x(t_k)) - \mathbf F(\bar{\mathbf x})) - (\mathbf x(t_k) - \bar{\mathbf x})^T\mathbf L(\mathbf x(t_k) - \bar{\mathbf x})$$
where the left-hand side converges to $0$ as $k \to \infty$. Hence,
$$0 \le -\lim_{k\to\infty}(\mathbf x(t_k) - \bar{\mathbf x})^T\mathcal R^T(\mathbf F(\mathbf x(t_k)) - \mathbf F(\bar{\mathbf x})) - \lim_{k\to\infty}(\mathbf x(t_k) - \bar{\mathbf x})^T\mathbf L(\mathbf x(t_k) - \bar{\mathbf x})$$
Using $\lim_{k\to\infty}\mathbf x(t_k) = \mathbf 1_N\otimes\hat x \in Null(\mathbf L)$, this leads to
$$0 \le -[\mathbf 1_N\otimes(\hat x - x^*)]^T\mathcal R^T(\mathbf F(\mathbf 1_N\otimes\hat x) - \mathbf F(\mathbf 1_N\otimes x^*))$$
or $0 \le -(\hat x - x^*)^T(F(\hat x) - F(x^*)) < 0$, by the strict monotonicity Assumption 3(i), since we assumed $\hat x \ne x^*$. This is a contradiction; hence $\hat x = x^*$ and $\bar V = 0$. $\blacksquare$

Theorem 5.
Consider a game $G(\mathcal I, J_i, \Omega_i)$ over a communication graph $G_c$ under Assumptions 1, 2(ii), 3(ii) and 4(ii). Let each player's dynamics $\widetilde P_i$ be as in (47), or overall $\widetilde{\mathbf P}$, (45). Then, if $\lambda_2(L) > \frac{\theta^2}\mu + \theta$, for any $x(0) \in \Omega$ and any $z(0)$, the solution of (45) asymptotically converges to $\mathbf 1_N\otimes x^*$, and the actions converge to the NE of the game, $x^*$. If $\lambda_2(L) > \frac{N\theta^2}\mu + \theta$, then convergence is exponential.

Proof. The proof is similar to the proof of Theorem 2. Based on Lemma 7, using $V(t, \mathbf x) = \frac12\|\mathbf x(t) - \bar{\mathbf x}\|^2$ and (44) in Lemma 6, for $\mathbf u(t) = -\mathbf L\mathbf x(t)$, $\mathbf u'(t) = -\mathbf L\bar{\mathbf x}$, one can obtain (50) along (45), for any $x(0) \in \Omega$ and any $z(0)$. Then further decomposing $\mathbf x(t)$ into $\mathbf x^\perp(t)$ and $\mathbf x^{\|}(t)$ components as in the proof of Theorem 2 leads to an inequality as (25), where $\Theta$ is positive definite under the conditions in the theorem. Then invoking Barbalat's lemma in [35] as in the proof of Theorem 4 leads to $x(t) \to x^*$ and $\mathbf x^\perp(t) \to \mathbf 0$ as $t \to \infty$. $\blacksquare$

Note also that we can consider a gain parameter $\frac1\epsilon > 0$ on the estimates in (45) to improve the lower bound on $\lambda_2(L)$. Consider
$$\widetilde{\mathbf P}_\epsilon: \quad \dot{\mathbf x} = \mathcal R^T\Pi_\Omega\big(\mathcal R\mathbf x, -\mathbf F(\mathbf x) - \tfrac1\epsilon\mathcal R\mathbf L\mathbf x\big) - \tfrac1\epsilon\mathcal S^T\mathcal S\mathbf L\mathbf x \quad (51)$$
Thus player $i$'s dynamics is as follows:
$$\widetilde P_{i,\epsilon}: \quad \begin{bmatrix}\dot x_i^i \\ \dot x_{-i}^i\end{bmatrix} = \begin{bmatrix}\Pi_{\Omega_i}\Big(x_i^i, -\nabla_iJ_i(x_i^i, x_{-i}^i) - \frac1\epsilon R_i\sum_{j\in\mathcal N_i}(x^i - x^j)\Big) \\ -\frac1\epsilon S_i\sum_{j\in\mathcal N_i}(x^i - x^j)\end{bmatrix} \quad (52)$$
Following the proof of Theorem 2, the matrix $\Theta$ becomes
$$\Theta = \begin{bmatrix}\mu & -\theta \\ -\theta & \frac1\epsilon\lambda_2(L) - \theta\end{bmatrix}$$
where the condition for the matrix to be positive definite is $\lambda_2(L) > \epsilon\left(\frac{\theta^2}\mu + \theta\right)$. As in the two-timescale analysis, this $\epsilon$ is a global parameter.

VII. NUMERICAL EXAMPLES
A. Unconstrained Ω and Dynamics

Example 1: Consider an N-player quadratic game from economics, where N = 20 firms are involved in the production of a homogeneous commodity. The quantity produced by firm $i$ is denoted by $x_i$. The overall cost function of firm $i$ is $J_i(x_i, x_{-i}) = c_i(x_i) - x_i f(x)$, where $c_i(x_i) = (20 + 10(i-1))x_i$ is the production cost and $f(x) = 2200 - \sum_{i\in\mathcal I} x_i$ is the demand price, as in [18]. We investigate the proposed dynamics (19) over a communication graph $G_c$ via simulation. The initial conditions are selected randomly from $[0, \cdot]$. Assumptions 3(i) and 4(i) hold, so by Theorem 1 the dynamics (19) converges even over a minimally connected graph. Figures 6 and 7 show the convergence of (19) over a randomly generated communication graph $G_c$ (Fig. 4) and over a cycle graph $G_c$ (Fig. 5), respectively.

Fig. 4. Random $G_c$, $\lambda = 4.$
Fig. 5. Cycle $G_c$ graph.
Fig. 6. (19) over random $G_c$.
Fig. 7. (19) over cycle $G_c$.

Example 2:
Consider a second example of an 8-player game with $J_i(x_i, x_{-i}) = c_i(x_i) - x_i f(x)$, $c_i(x_i) = (10 + 4(i-1))x_i$, $f(x) = 600 - \sum_{i\in\mathcal I} x_i$, as in [16]. Here Assumption 4(i) on $F$ does not hold globally, so we cannot apply Theorem 1, but Assumption 4(ii) holds locally. By Theorem 2, (19) converges depending on $\lambda(L)$. Figure 10 shows the convergence of (19) over a sufficiently connected, randomly generated communication graph $G_c$, as depicted in Fig. 8. Over a cycle graph $G_c$, (19) does not converge. Alternatively, by the two-time-scale analysis, a higher $1/\epsilon$ (time-scale decomposition) can balance the connectivity loss. Fig. 11 shows convergence for (29) with $1/\epsilon = 200$ over a cycle graph $G_c$, as shown in Fig. 9. The initial conditions are selected randomly from $[0, \cdot]$.

Fig. 8. Random $G_c$, $\lambda = 1.$
Fig. 9. Cycle $G_c$ graph.
Fig. 10. (19) over random $G_c$.
Fig. 11. (29) over cycle $G_c$, $1/\epsilon = 200$.

B. Compact Ω and Projected Dynamics

Example 1: N = 20 and this time $\Omega_i = [0, \cdot]$. We investigate the projected augmented gradient dynamics (47) over a graph $G_c$. The actions' initial conditions are selected randomly from $[0, \cdot]$, while the estimates are selected from $[-\cdot, \cdot]$. Assumptions 3(i) and 4(i) hold, so by Theorem 4 the dynamics (47) converges even over a minimally connected graph. Figures 14 and 15 show the convergence of (47) over a randomly generated communication graph $G_c$ (Fig. 12) and over a cycle graph $G_c$ (Fig. 13), respectively.
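Since each $\Omega_i$ here is a box, the projection in a forward-Euler discretization of such dynamics reduces to componentwise clipping. The following is a crude full-information projected-gradient sketch (not the distributed dynamics (47) itself) for the 20-firm game of Example 1. The bound $\Omega_i = [0, 100]$ is a hypothetical stand-in, since the actual bound in this example is garbled in the source, and the step size is an arbitrary small choice.

```python
import numpy as np

N = 20
c = 20.0 + 10.0 * np.arange(N)   # Example 1 cost slopes, c_i = 20 + 10(i-1) (reconstructed)
b = 2200.0 - c                   # pseudo-gradient of J_i: F_i(x) = x_i + sum(x) - b_i

lo, hi = 0.0, 100.0              # hypothetical box Omega_i = [0, 100] (stand-in bound)
alpha = 0.002                    # step size below 2*mu/theta^2 = 2/(N+1)^2 for this game

x = np.zeros(N)
for _ in range(30_000):
    F = x + x.sum() - b                  # pseudo-gradient at the current profile
    x = np.clip(x - alpha * F, lo, hi)   # Euler step, then clip back onto the box

print(x)
```

With these stand-in numbers the iterate settles on a boundary point of the constrained game: the first sixteen components sit at the upper bound and the remaining ones are interior, which mirrors the boundary-NE behaviour discussed in Example 3 below.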
Fig. 12. Random $G_c$, $\lambda = 3.$
Fig. 13. Cycle $G_c$ graph.
Example 2: N = 8 and $\Omega_i = [0, \cdot]$, as in [16]. Here Assumption 4(i) on $F$ does not hold globally, so we cannot apply Theorem 4. Under Assumptions 3(ii) and 4(ii), by Theorem 5, (47) converges depending on $\lambda(L)$. Figure 18 shows the convergence of (47) over a sufficiently connected, randomly generated communication graph $G_c$, as in Fig. 16. A higher $1/\epsilon$ on the estimates can balance the lack of connectivity: Fig. 19 shows results for (52) with $1/\epsilon = 200$ over a cycle graph $G_c$, as in Fig. 17. The actions' initial conditions are selected randomly from $[0, \cdot]$ and the estimates from $[-\cdot, \cdot]$.

Fig. 14. (47) over random $G_c$.
Fig. 15. (47) over cycle $G_c$.
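As a numerical sanity check of the high-gain remedy, the following forward-Euler sketch (a rough discretization, not the exact scheme (52)) runs estimate-consensus gradient play for the 8-player game of Example 2 over the cycle graph, with a consensus gain playing the role of $1/\epsilon = 200$. Each row `X[i, :]` is player $i$'s estimate of the full action profile; only the own entry `X[i, i]` additionally follows player $i$'s partial gradient, evaluated on its own estimate. The box constraint is ignored here (it is inactive at the resulting point if the bound is large enough); the step size and iteration count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
c = 10.0 + 4.0 * np.arange(N)    # Example 2 cost slopes, c_i = 10 + 4(i-1) (reconstructed)

# Laplacian of the cycle graph on N nodes (as in Fig. 17)
L = 2.0 * np.eye(N)
for i in range(N):
    L[i, (i + 1) % N] -= 1.0
    L[i, (i - 1) % N] -= 1.0

X = rng.uniform(0.0, 10.0, (N, N))   # row i = player i's estimate of all actions
k, dt = 200.0, 1e-3                  # consensus gain (role of 1/eps) and Euler step
idx = np.arange(N)
for _ in range(150_000):
    dX = -k * (L @ X)                # Laplacian feedback on the estimates
    # player i's partial gradient of J_i, evaluated on its own estimate X[i, :]
    dX[idx, idx] -= c - 600.0 + X.sum(axis=1) + X[idx, idx]
    X += dt * dX

# closed-form NE of the quadratic game: (I + 11^T) x* = 600 - c
x_star = np.linalg.solve(np.eye(N) + np.ones((N, N)), 600.0 - c)
print(np.max(np.abs(X - x_star)))    # every row agrees with x* up to numerical error
```

Despite the weakly connected cycle graph, the large consensus gain drives all estimate rows to agree on the NE, consistent with the role of $1/\epsilon$ in the examples above.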
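The graph-gain tradeoff can also be read off the $2\times 2$ matrix $\Theta$ used in the convergence analysis: $\Theta$ is positive definite exactly when $\lambda(L) > \epsilon(\theta^2/\mu + \theta)$. A quick check with illustrative constants ($\mu = 1$ and $\theta = N + 1$ are stand-ins; the theorem's $\theta$ is the Lipschitz constant of the extended pseudo-gradient, which is not computed here) shows the cycle graph failing the bound at $1/\epsilon = 1$ and passing it at $1/\epsilon = 200$:

```python
import numpy as np

# Laplacian of the undirected cycle on n nodes
n = 8
L = 2.0 * np.eye(n)
for i in range(n):
    L[i, (i + 1) % n] -= 1.0
    L[i, (i - 1) % n] -= 1.0
lam = np.sort(np.linalg.eigvalsh(L))[1]      # algebraic connectivity lambda(L)

mu, theta = 1.0, float(n + 1)                # illustrative stand-in game constants
for inv_eps in (1.0, 200.0):
    Theta = np.array([[mu, -theta],
                      [-theta, inv_eps * lam - theta]])
    pd = bool(np.all(np.linalg.eigvalsh(Theta) > 0))
    need = (theta**2 / mu + theta) / inv_eps
    print(f"1/eps = {inv_eps:g}: lambda = {lam:.3f}, need > {need:.3f}, Theta pd: {pd}")
```

The equivalence between positive definiteness of $\Theta$ and the eigenvalue bound is exact here: since $\mu > 0$, the Schur complement condition is $\tfrac{1}{\epsilon}\lambda(L) - \theta - \theta^2/\mu > 0$.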
Fig. 16. Random $G_c$, $\lambda = 0.$
Fig. 17. Cycle $G_c$ graph.
Fig. 18. (47) over random $G_c$.
Fig. 19. (52) over cycle $G_c$, $1/\epsilon = 200$.

Example 3:
Consider $J_i(x_i, x_{-i}) = c_i(x_i) - x_i f(x)$, where $c_i(x_i) = (20 + 40(i-1))x_i$ and $f(x) = 1200 - \sum_{i\in\mathcal I} x_i$, with $\Omega_i = [0, 200]$, for N = 20. The NE is on the boundary,
$$x^* = \begin{bmatrix} 200 & 200 & 183.3 & 143.3 & 103.3 & 63.3 & 23.3 & 0 & \cdots & 0 \end{bmatrix}.$$
Figure 22 shows the convergence of (47) over a randomly generated communication graph $G_c$, as in Fig. 20, while Fig. 23 gives similar results, this time over a cycle graph $G_c$, as depicted in Fig. 21.

VIII. CONCLUSION
In this paper, we studied distributed Nash equilibrium (NE)seeking over networks in continuous-time. We proposed anaugmented gradient-play dynamics with estimation in whichplayers communicate locally only with their neighbours tocompute an estimate of the other players’ actions. We derived
the new dynamics based on the reformulation as a multi-agent coordination problem over an undirected graph. We exploited incremental passivity properties and showed that a synchronizing, distributed Laplacian feedback can be designed using relative estimates of the neighbours. Under a strict monotonicity property of the pseudo-gradient, we proved that the new dynamics converges to the NE of the game. We discussed cases that highlight the tradeoff between properties of the game and the communication graph.

Fig. 20. Random $G_c$, $\lambda = 1.$
Fig. 21. Cycle $G_c$ graph.
Fig. 22. (47) over random $G_c$.
Fig. 23. (47) over cycle $G_c$.

REFERENCES

[1] P. Frihauf, M. Krstic, and T. Başar, "Nash Equilibrium Seeking in Noncooperative Games,"
IEEE Trans. on Automatic Control, vol. 57, no. 5, pp. 1192–1207, 2012.
[2] Y. Lou, Y. Hong, L. Xie, G. Shi, and K. Johansson, "Nash equilibrium computation in subnetwork zero-sum games with switching communications," IEEE Trans. on Automatic Control, vol. 61, no. 10, pp. 2920–2935, 2016.
[3] M. S. Stankovic, K. H. Johansson, and D. M. Stipanovic, "Distributed seeking of Nash equilibria with applications to mobile sensor networks," IEEE Trans. on Automatic Control, vol. 57, no. 4, pp. 904–919, 2012.
[4] H. Li and Z. Han, "Competitive Spectrum Access in Cognitive Radio Networks: Graphical Game and Learning," in , April 2010, pp. 1–6.
[5] X. Chen and J. Huang, "Spatial Spectrum Access Game: Nash Equilibria and Distributed Learning," in Proc. of the 13th ACM MobiHoc. New York, NY, USA: ACM, 2012, pp. 205–214.
[6] T. Alpcan and T. Başar, "A hybrid systems model for power control in multicell wireless data networks," Performance Evaluation, vol. 57, no. 4, pp. 477–495, 2004.
[7] L. Pavel, Game Theory for Control of Optical Networks. Birkhäuser-Springer Science, 2012.
[8] ——, "A noncooperative game approach to OSNR optimization in optical networks," IEEE Trans. on Automatic Control, vol. 51, no. 5, pp. 848–852, 2006.
[9] Y. Pan and L. Pavel, "Games with coupled propagated constraints in optical networks with multi-link topologies," Automatica, vol. 45, no. 4, pp. 871–880, 2009.
[10] J. Wang and N. Elia, "A control perspective for centralized and distributed convex optimization," in Proc. of the 50th IEEE CDC-ECC, Dec 2011, pp. 3800–3805.
[11] N. Li and J. R. Marden, "Designing games for distributed optimization," IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 2, pp. 230–242, 2013.
[12] H. Yin, U. Shanbhag, and P. G. Mehta, "Nash Equilibrium Problems With Scaled Congestion Costs and Shared Constraints," IEEE Trans. on Automatic Control, vol. 56, no. 7, pp. 1702–1708, 2011.
[13] T. Alpcan and T. Başar, Distributed Algorithms for Nash Equilibria of Flow Control Games. Birkhäuser Boston, 2005, pp. 473–498.
[14] J. Koshal, A. Nedic, and U. Shanbhag, "A gossip algorithm for aggregative games on graphs," in Proc. of the 51st IEEE CDC, Dec 2012, pp. 4840–4845.
[15] S. Grammatico, F. Parise, M. Colombino, and J. Lygeros, "Decentralized Convergence to Nash Equilibria in Constrained Deterministic Mean Field Control," IEEE Trans. on Automatic Control, vol. 61, no. 11, pp. 3315–3329, 2016.
[16] F. Salehisadaghiani and L. Pavel, "Distributed Nash equilibrium seeking: A gossip-based algorithm," Automatica, vol. 72, pp. 209–216, 2016.
[17] ——, "Distributed Nash Equilibrium seeking via the Alternating Direction Method of Multipliers," in the 20th IFAC World Congress, to appear, 2017.
[18] W. Shi and L. Pavel, "LANA: an ADMM-like Nash Equilibrium seeking algorithm in decentralized environment," in American Control Conference, to appear, 2017.
[19] K. Arrow, L. Hurwitz, and H. Uzawa, A Gradient Method for Approximating Saddle Points and Constrained Maxima. Rand Corporation, United States Army Air Forces, 1951.
[20] ——, Studies in Linear and Non-linear Programming. Stanford University Press, Stanford, California, 1958.
[21] S. Flåm, "Equilibrium, evolutionary stability and gradient dynamics," International Game Theory Review, vol. 4, no. 04, pp. 357–370, 2002.
[22] J. S. Shamma and G. Arslan, "Dynamic fictitious play, dynamic gradient play, and distributed convergence to Nash equilibria," IEEE Trans. on Automatic Control, vol. 50, no. 3, pp. 312–327, 2005.
[23] B. Gharesifard and J. Cortes, "Distributed convergence to Nash equilibria in two-network zero-sum games," Automatica, vol. 49, no. 6, pp. 1683–1692, 2013.
[24] A. Nagurney and D. Zhang, Projected Dynamical Systems and Variational Inequalities with Applications, ser. Innovations in Financial Markets and Institutions. Springer US, 1996.
[25] J.-B. Hiriart-Urruty and C. Lemaréchal, Fundamentals of Convex Analysis. Springer, 2001.
[26] C. Godsil and G. Royle, Algebraic Graph Theory, ser. Graduate Texts in Mathematics. Springer New York, 2001.
[27] G. Hines, M. Arcak, and A. Packard, "Equilibrium-independent passivity: A new definition and numerical certification," Automatica, vol. 47, no. 9, pp. 1949–1956, 2011.
[28] M. Bürger, D. Zelazo, and F. Allgöwer, "Duality and network theory in passivity-based cooperative control," Automatica, vol. 50, no. 8, pp. 2051–2061, 2014.
[29] M. Bürger and C. D. Persis, "Dynamic coupling design for nonlinear output agreement and time-varying flow control," Automatica, vol. 51, pp. 210–222, 2015.
[30] A. Pavlov and L. Marconi, "Incremental passivity and output regulation," Systems & Control Letters, vol. 57, pp. 400–409, 2008.
[31] T. Başar and G. Olsder, Dynamic Noncooperative Game Theory: Second Edition, ser. Classics in Applied Mathematics. SIAM, 1999.
[32] F. Facchinei and J. Pang, Finite-Dimensional Variational Inequalities and Complementarity Problems, ser. Springer Series in Operations Research and Financial Engineering. Springer New York, 2007.
[33] G. Scutari, F. Facchinei, J. S. Pang, and D. P. Palomar, "Real and Complex Monotone Communication Games," IEEE Trans. on Information Theory, vol. 60, no. 7, pp. 4197–4231, 2014.
[34] S. Li and T. Başar, "Distributed Algorithms for the Computation of Noncooperative Equilibria," Automatica, vol. 23, no. 4, pp. 523–533, 1987.
[35] H. Khalil, Nonlinear Systems. Prentice Hall, 2002.
[36] M. Arcak, "Passivity as a Design Tool for Group Coordination," IEEE Trans. on Automatic Control, vol. 52, no. 8, pp. 1380–1390, 2007.
[37] B. Brogliato, A. Daniilidis, C. Lemaréchal, and V. Acary, "On the equivalence between complementarity systems, projected systems and differential inclusions," Systems & Control Letters, vol. 55, pp. 45–51, 2006.
[38] J.-P. Aubin and A. Cellina, Differential Inclusions. Springer, Heidelberg, 1984.
[39] P. DeLellis, M. di Bernardo, and G. Russo, "On QUAD, Lipschitz, and contracting vector fields for consensus and synchronization of networks,"