Distributed Nash Equilibrium Seeking for Games in Systems with Unknown Control Directions
arXiv preprint [math.OC]
Maojiao Ye, Shengyuan Xu, Jizhao Yin
Abstract
Distributed Nash equilibrium seeking for games in uncertain networked systems without prior knowledge of the control directions is explored in this paper. More specifically, the dynamics of the players are supposed to be first-order or second-order systems in which the control directions are unknown and there are parametric uncertainties. To achieve Nash equilibrium seeking in a distributed way, Nussbaum function based strategies are proposed by separately designing an optimization module and a state regulation module. The optimization module generates a reference trajectory, which searches for the Nash equilibrium, for the state regulation module. The state regulator is designed to steer the players' actions to the reference trajectory. An adaptive law is included in the state regulation module to compensate for the uncertain parameters in the players' dynamics, and the Nussbaum function is included to address the unavailability of the control directions. Fully distributed implementations of the proposed algorithms are discussed and investigated. Based on Barbalat's lemma, we show analytically that the proposed seeking strategies drive the players' actions to the Nash equilibrium asymptotically without requiring homogeneity of the players' unknown control directions. A numerical example is given to support the theoretical analysis of the proposed algorithms.
Key words:
Unknown control directions; Nussbaum function; distributed Nash equilibrium seeking; parametric uncertainties
⋆ This work was supported by the National Natural Science Foundation of China (NSFC), No. 61803202, the Natural Science Foundation of Jiangsu Province, No. BK20180455, and the Fundamental Research Funds for the Central Universities, No. 30920032203. M. Ye, S. Xu and J. Yin are with the School of Automation, Nanjing University of Science and Technology, Nanjing 210094, P.R. China.

Games under distributed communication networks are receiving increasing attention due to their wide applications in numerous fields. For example, the connectivity control of mobile sensor networks was modeled as a game in which each sensor's objective function contains a local cost that models the sensor's local goal (e.g., source seeking) and a global cost that describes the sensor's willingness to keep connectivity with other sensors [1]. Inspired by the observation that practical engineering systems are usually afflicted with model uncertainties and disturbances, an extended-state-observer based robust Nash equilibrium seeking strategy was proposed in [1]. Energy consumption control can be formulated as an aggregative game, in which each user's cost function depends on the user's own energy consumption and the total energy consumption of all users [2]. Dynamic average consensus algorithms can be adapted as an aggregation estimator, based on which distributed Nash equilibrium seeking algorithms were constructed [2]. Congestion control problems in wireless sensor networks can be viewed as a semi-aggregative game, in which each data transmitter makes decisions on its data transmission to maximize its own profit [3]. Interference graphs can be introduced to describe the interactions among the data transmitters, based on which Nash equilibrium seeking algorithms were designed in [3]. Moreover, noncooperative games can be utilized to illustrate the interactions among groups of discrete-time and continuous-time agents under distributed communication networks [4]. Motivated by the broad applications of networked games, distributed Nash equilibrium seeking has attracted considerable interest in the past few years, and quite a few distributed schemes have been proposed. The existing works provide some interesting viewpoints to cope with Nash equilibrium seeking for games in which the players' actions can be freely designed (e.g., [20][21][22][23]), are governed by simple dynamics (see, e.g., [24][25]), or are possibly subject to disturbances and un-modeled dynamics (see, e.g., [1]). A common premise of the existing works is that the control directions are known to the players.

It should be noted that control directions determine the motion directions of a control system and are greatly important, as a control force with an incorrect direction may deteriorate the system and cause undesired control performance [13]. With the information on control directions, the controller design becomes much simpler. Nevertheless, in some practical circumstances, the control directions are unknown. For instance, due to inaccurate camera parameters and image depth, the manipulator trajectory tracking control of a visual servo system may need to address unknown control directions [6]. Affected by speed variations and loading conditions in a complex, varying environment, the model of ships contains large uncertainties and hence the autopilot design of time-varying ships requires the accommodation of unknown control directions [7]. It was recognized that the longitudinal dynamics of the air-breathing hypersonic vehicle suffer from unknown control directions as well [9]. Furthermore, the authors in [10] argued that in some situations, it is difficult to detect the control directions of quadrotor unmanned aerial vehicles.
Without the information on control directions, the controller design becomes much more challenging, especially for multi-agent systems.
Preprint submitted to Automatica, 29 September 2020

Many researchers have been dedicated to investigating systems with unknown control directions. Adaptive designs with Nussbaum-type functions, which can be traced back to [8], are shown to be effective in dealing with uncertainties in control directions. In [12], Nussbaum-type functions were adopted to achieve adaptive control of nonlinear systems with arbitrary dynamic order and parametric uncertainties. Extremum seekers with unknown control directions were proposed in [15]. Output feedback control for discrete-time systems without prior knowledge of the control directions was studied in [17], in which a discrete Nussbaum gain was utilized to achieve asymptotic output tracking. Nussbaum functions were discussed in [13] for systems with time-varying unknown control directions. With the development of multi-agent systems, cooperative control of multi-agent systems with unknown control directions has received increasing attention. For example, the authors in [11] considered consensus among a network of first-order integrator-type agents with unknown control directions. In [16], the authors supposed that some control directions are known, based on which consensus of multi-agent systems with partially unknown and non-identical control directions was addressed. Cooperative output consensus in heterogeneous multi-agent systems with non-identical control directions was considered in [18], where Nussbaum-type functions were adopted to achieve global cooperative output regulation. Distributed optimization among a network of high-order integrator-type agents was addressed in [19] without utilizing prior knowledge about control directions. Fully distributed consensus among high-order nonlinear systems in which the agents have heterogeneous unknown control directions was investigated in [29]. A new Nussbaum function was employed to deal with the unknown control directions, and it was shown that the agents' outputs can achieve asymptotic consensus [29].
Nevertheless, to the best of the authors' knowledge, distributed Nash equilibrium seeking for networked games in which the players are subject to unknown control directions and uncertain parameters still remains to be addressed.
Motivated by the above observations, this paper tries to shed some light on distributed Nash equilibrium seeking strategy design without utilizing control direction information.

In comparison with the existing works, the main contributions of this paper are summarized as follows.

(1) Different from the existing works that consider games with known control directions, the seeking strategies proposed in this paper do not require prior direction information. To the best of the authors' knowledge, this is the first work that addresses distributed Nash equilibrium seeking for games with unknown control directions. Besides, this paper also accommodates parametric uncertainties in the players' dynamics. Through a modular design, this paper proposes Nussbaum function based adaptive seeking strategies to achieve distributed Nash equilibrium seeking for games in both first-order and second-order systems with unknown control directions and parametric uncertainties.

(2) Based on Barbalat's lemma, it is theoretically shown that the players' actions can be steered to the Nash equilibrium while the other auxiliary variables stay bounded by utilizing the proposed algorithms.

(3) Discussions on fully distributed implementation of the proposed algorithms are provided. The explorations show that through adaptive parameter designs, the proposed fully distributed algorithms are effective.

We organize the remaining sections as follows. Some preliminaries are given in Section 2 and the considered problem is formulated in Section 3. Section 4 presents the main results of the paper, in which first-order and second-order systems with unknown control directions and parametric uncertainties are visited, successively. Discussions on fully distributed implementations of the proposed methods are provided in Section 5. Following the theoretical investigations of the developed methods, Section 6 provides numerical studies. In the end, conclusions are given in Section 7.
The following definitions and lemmas will be utilized in the rest of the paper.

Definition 1 [11] A continuously differentiable function $N(\cdot)$ is called a Nussbaum function if
$$\lim_{q\to\infty}\sup \frac{1}{q}\int_0^q N(s)\,ds = \infty, \qquad \lim_{q\to\infty}\inf \frac{1}{q}\int_0^q N(s)\,ds = -\infty. \tag{1}$$

Typical examples of Nussbaum functions include $k^2\cos(k)$ and $k^2\sin(k)$, to mention just a few. Interested readers are referred to [13] for more detailed discussions of Nussbaum functions. In this paper, we adopt $N(k) = k^2\sin(k)$.

Lemma 1 [12] Suppose that $V(\cdot)$ and $k(\cdot)$ are smooth functions defined on $[0, t_f)$, where $t_f$ is a positive constant and $V(t) \ge 0$, $\forall t \in [0, t_f)$. If, moreover,
$$V(t) \le \int_0^t \big(aN(k(\tau)) + 1\big)\dot{k}(\tau)\,d\tau + c, \quad \forall t \in [0, t_f), \tag{2}$$
where $a$ is a nonzero constant, $N$ is an even smooth Nussbaum function, and $c$ is a suitable constant, then $V(t)$, $k(t)$ and $\int_0^t (aN(k(\tau)) + 1)\dot{k}(\tau)\,d\tau$ are bounded on $[0, t_f)$.

Lemma 2 (Barbalat's Lemma [26]) Suppose that $g(t): \mathbb{R} \to \mathbb{R}$ is a uniformly continuous function. Then, $\lim_{t\to\infty} g(t) = 0$ given that $\lim_{t\to\infty}\int_0^t g(s)\,ds$ exists and is finite.

A graph $\mathcal{G}$ contains a node set $\mathcal{V} = \{1, 2, \cdots, M\}$ ($M \ge 2$) and an edge set $\mathcal{E}_d$. The elements of $\mathcal{E}_d$ are represented by $(i, j)$, which illustrates an edge from node $i$ to node $j$ and indicates that node $j$ can receive information from node $i$ but not necessarily vice versa. If $(i, j) \in \mathcal{E}_d$ implies that $(j, i) \in \mathcal{E}_d$ for all $i, j \in \mathcal{V}$, the network is undirected. A directed path from node $i_k$ to node $i_{k+l}$ is a sequence of ordered edges denoted by $(i_{k+j}, i_{k+j+1})$, $j = 0, 1, 2, \cdots, l-1$. A directed graph is said to be strongly connected if there is a directed path between any two distinct nodes. Similarly, an undirected graph is connected if there is a path between any two distinct nodes. The adjacency matrix $\mathcal{A}$ of a directed graph $\mathcal{G}$ is a matrix whose $(i, j)$th entry is $a_{ij}$, which is positive if $(j, i) \in \mathcal{E}_d$; otherwise, $a_{ij} = 0$. Moreover, $a_{ii} = 0$. The adjacency matrix of an undirected graph is similarly defined with the further requirement that $a_{ij} = a_{ji}$ for all $i \ne j$. Moreover, the Laplacian matrix of graph $\mathcal{G}$ is $\mathcal{L} = \mathcal{D} - \mathcal{A}$, in which $\mathcal{D}$ is a diagonal matrix whose $i$th diagonal entry is $\sum_{j=1}^{M} a_{ij}$ [5][30].

Consider a game with $N$ players in which the action and cost function of player $i$ are represented by $x_i \in \mathbb{R}$ and $f_i(\mathbf{x}): \mathbb{R}^N \to \mathbb{R}$, respectively, where $\mathbf{x} = [x_1, x_2, \cdots, x_N]^T$. Denote the player set as $\mathcal{N} = \{1, 2, \cdots, N\}$ and suppose that the players' actions are governed by
$$\dot{x}_i = b_i u_i + \phi_i(x_i)\theta_i, \quad \forall i \in \mathcal{N}, \tag{3}$$
or
$$\dot{x}_i = v_i, \qquad \dot{v}_i = b_i u_i + \phi_i(x_i)\theta_i, \quad \forall i \in \mathcal{N}. \tag{4}$$
Note that in (3) and (4), $u_i$ is the control input to be designed and $b_i \ne 0$ is an unknown constant. Moreover, $\phi_i(x_i)$ is a sufficiently smooth known function, $\theta_i$ is an unknown parameter, and $v_i \in \mathbb{R}$ is a state variable of player $i$.

Furthermore, second-order systems in which player $i$'s action is generated by
$$\dot{x}_i = b_{i1} v_i + \phi_{i1}(x_i)\theta_{i1}, \qquad \dot{v}_i = b_{i2} u_i + \phi_{i2}(x_i, v_i)\theta_{i2}, \quad \forall i \in \mathcal{N}, \tag{5}$$
where $\theta_{i1}, \theta_{i2}, b_{i1}, b_{i2}$ are unknown constants and $\phi_{i1}(x_i)$ and $\phi_{i2}(x_i, v_i)$ are smooth functions, will also be considered. Note that in (5), $b_{i1}$ and $b_{i2}$ are nonzero.

The paper aims to design distributed control strategies $u_i$ for the systems in (3), (4) and (5), successively, such that $\lim_{t\to\infty}\|\mathbf{x}(t) - \mathbf{x}^*\| = 0$, where $\mathbf{x}^*$ is the Nash equilibrium defined as follows.

Definition 2 An action profile $\mathbf{x}^* = (x_i^*, \mathbf{x}_{-i}^*)$ is a Nash equilibrium if for $i \in \mathcal{N}$,
$$f_i(x_i^*, \mathbf{x}_{-i}^*) \le f_i(x_i, \mathbf{x}_{-i}^*), \tag{6}$$
for $x_i \in \mathbb{R}$, where $\mathbf{x}_{-i} = [x_1, x_2, \cdots, x_{i-1}, x_{i+1}, \cdots, x_N]^T$ [5].

The rest of the paper is based on the following assumptions, which are widely adopted in related works.
Assumption 1 For each $i \in \mathcal{N}$, $f_i(\mathbf{x})$ is sufficiently smooth and $\frac{\partial f_i(\mathbf{x})}{\partial x_i}$ is globally Lipschitz with constant $l_i$.

Assumption 2 There exists a positive constant $m$ such that for $\mathbf{x}, \mathbf{z} \in \mathbb{R}^N$,
$$(\mathbf{x} - \mathbf{z})^T\big(P(\mathbf{x}) - P(\mathbf{z})\big) \ge m\|\mathbf{x} - \mathbf{z}\|^2, \tag{7}$$
where $P(\mathbf{x}) = \left[\frac{\partial f_1(\mathbf{x})}{\partial x_1}, \frac{\partial f_2(\mathbf{x})}{\partial x_2}, \cdots, \frac{\partial f_N(\mathbf{x})}{\partial x_N}\right]^T$.

Assumption 3 The players are equipped with an undirected and connected communication graph $\mathcal{G}$.

For the systems in (3) and (4), the nonlinear term should satisfy the following condition.

Assumption 4 For each $i \in \mathcal{N}$, $\phi_i(x_i)$ and $\frac{\partial \phi_i(x_i)}{\partial x_i}$ are bounded provided that $x_i$ is bounded.

Moreover, for the system in (5), the nonlinear terms should satisfy the following condition.

Assumption 5 For each $i \in \mathcal{N}$, $\phi_{i1}(x_i)$ and $\frac{\partial \phi_{i1}(x_i)}{\partial x_i}$ are bounded provided that $x_i$ is bounded. Moreover, $\phi_{i2}(x_i, v_i)$ is bounded if $x_i$ and $v_i$ are bounded.

Remark 1 Different from existing works on distributed Nash equilibrium seeking that consider the control directions to be known, we suppose that the control directions are unknown a priori, as $b_i$ (or $b_{i1}, b_{i2}$) for all $i \in \mathcal{N}$ are not known. Moreover, the players may have different control directions, as we do not enforce $\mathrm{sign}(b_i)$ (or $\mathrm{sign}(b_{i1})$, $\mathrm{sign}(b_{i2})$) for all $i \in \mathcal{N}$ to be the same. Note that in (3) and (4), $\theta_i$ is supposed to be unknown as well, indicating that the players are suffering from parametric uncertainties.

In this section, we will establish distributed Nash equilibrium seeking algorithms for games in which the players' actions are governed by (3), (4) and (5), successively. In the following, Nash equilibrium seekers that are able to accommodate the unknown control directions and parametric uncertainties will be proposed, followed by their corresponding convergence analyses.
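Since every strategy below relies on a Nussbaum gain to cope with the unknown signs discussed in Remark 1, it may help to check the defining property (1) numerically first. The following Python sketch does so under the assumption that the Nussbaum function adopted in this paper is read as $N(k) = k^2\sin(k)$. The function names and the sample points are illustrative choices, not part of the original paper.

```python
import numpy as np


def nussbaum(k):
    """Nussbaum gain N(k) = k^2 sin(k) (assumed reading of the paper's choice)."""
    return k**2 * np.sin(k)


def mean_integral_closed_form(q):
    """(1/q) * integral_0^q s^2 sin(s) ds.

    Uses the antiderivative 2 s sin(s) + (2 - s^2) cos(s), which equals 2 at s = 0.
    """
    return (2.0 * q * np.sin(q) + (2.0 - q**2) * np.cos(q) - 2.0) / q


def mean_integral_numeric(q, n=200_001):
    """Same quantity via a hand-written trapezoidal rule (sanity check only)."""
    s = np.linspace(0.0, q, n)
    f = nussbaum(s)
    h = s[1] - s[0]
    return h * (f.sum() - 0.5 * (f[0] + f[-1])) / q


if __name__ == "__main__":
    for n in range(1, 6):
        q_even = 2 * n * np.pi          # sin = 0, cos = +1
        q_odd = (2 * n + 1) * np.pi     # sin = 0, cos = -1
        print(f"q = {2 * n}*pi: {mean_integral_closed_form(q_even):9.3f}   "
              f"q = {2 * n + 1}*pi: {mean_integral_closed_form(q_odd):9.3f}")
```

Along $q = 2n\pi$ the running mean equals $-q$ and along $q = (2n+1)\pi$ it equals $q - 4/q$, so it is unbounded in both directions, which is what (1) requires. This sign-alternating, growing average is what allows the gain to eventually act with the correct sign regardless of the sign of $b_i$.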
In this section, we consider that the action of player $i$ is governed by
$$\dot{x}_i = b_i u_i + \phi_i(x_i)\theta_i, \quad \forall i \in \mathcal{N}. \tag{8}$$
In the following, method development and convergence analysis will be presented.

To achieve distributed Nash equilibrium seeking for systems with unknown control directions, let
$$u_i = N(k_i)\big(x_i - y_i + \phi_i(x_i)\hat{\theta}_i\big), \tag{9}$$
where $N(k_i) = k_i^2\sin(k_i)$ and
$$\dot{k}_i = (x_i - y_i)\big(x_i - y_i + \phi_i(x_i)\hat{\theta}_i\big), \qquad \dot{\hat{\theta}}_i = \phi_i(x_i)(x_i - y_i). \tag{10}$$
Moreover, $y_i$ is an auxiliary variable generated by
$$\dot{y}_i = -\nabla_i f_i(\mathbf{z}_i), \tag{11}$$
where $\nabla_i f_i(\mathbf{z}_i) = \frac{\partial f_i(\mathbf{x})}{\partial x_i}\big|_{\mathbf{x} = \mathbf{z}_i}$, $\mathbf{z}_i = [z_{i1}, z_{i2}, \cdots, z_{iN}]^T$ and
$$\dot{z}_{ij} = -\delta_{ij}\left(\sum_{k=1}^{N} a_{ik}(z_{ij} - z_{kj}) + a_{ij}(z_{ij} - y_j)\right), \tag{12}$$
in which $\delta_{ij} = \delta\bar{\delta}_{ij}$, $\delta$ is a positive constant to be determined and $\bar{\delta}_{ij}$ is a fixed positive constant.

Remark 2 The seeking strategy in (9)-(12) can be viewed as two modules. The subsystem in (9)-(10) is designed to drive $x_i$ to $y_i$. The Nussbaum function in (9) is employed to accommodate the unknown control directions and the second equation in (10) is utilized to compensate for the unknown parameter $\theta_i$. In addition, the subsystem in (11)-(12) is adapted from [5] to act as a reference generator that drives $\mathbf{y} = [y_1, y_2, \cdots, y_N]^T$ to the Nash equilibrium $\mathbf{x}^*$ [5]. The schematic outline of (9)-(12) is depicted in Fig. 1.

Fig. 1. The illustration of the information flows in the seeking strategy. [Diagram: for each player, an optimization module feeds a state regulator with uncertain parameter compensator; information is exchanged among the players.]

4.1.2 Convergence Analysis

In this section, we provide the convergence analysis for the seeking strategy proposed in (9)-(12). Before we proceed to present the convergence results, the following supportive lemma is given.
Lemma 3 Suppose that Assumptions 1-3 are satisfied. Then, there exists a positive constant $\delta^*$ such that for each $\delta \in (\delta^*, \infty)$, the following conclusions hold:
• For each $i, j \in \mathcal{N}$, $y_i(t)$ and $z_{ij}(t)$ are bounded for $t \in [0, \infty)$.
• For each $i \in \mathcal{N}$, $\dot{y}_i(t)$ globally exponentially decays to zero.
• For each $i \in \mathcal{N}$, $\dot{y}_i(t)$ is square integrable over $t \in [0, \infty)$, i.e., $\int_0^\infty \dot{y}_i^2(s)\,ds \le c_i$ for some positive constant $c_i$.

Proof: Following the results in [5], it can be obtained that there exists a positive constant $\delta^*$ such that for each $\delta \in (\delta^*, \infty)$, $\mathbf{y}$ and $\mathbf{z}$, where $\mathbf{y} = [y_1, y_2, \cdots, y_N]^T$ and $\mathbf{z} = [\mathbf{z}_1^T, \mathbf{z}_2^T, \cdots, \mathbf{z}_N^T]^T$, globally exponentially converge to $\mathbf{x}^*$ and $\mathbf{1}_N \otimes \mathbf{x}^*$, respectively [5]. Hence, the first conclusion directly follows from the results in [5]. The second conclusion can be reasoned as follows. As $\mathbf{y}$ and $\mathbf{z}$ globally exponentially converge to $\mathbf{x}^*$ and $\mathbf{1}_N \otimes \mathbf{x}^*$, respectively, there are positive constants $\eta_1, \eta_2$ such that
$$\big\|[(\mathbf{y} - \mathbf{x}^*)^T, (\mathbf{z} - \mathbf{1}_N \otimes \mathbf{x}^*)^T]^T\big\| \le \eta_1 e^{-\eta_2 t}. \tag{13}$$
For each $i \in \mathcal{N}$, we get that
$$\|\dot{y}_i\| = \|\nabla_i f_i(\mathbf{z}_i) - \nabla_i f_i(\mathbf{x}^*)\|, \tag{14}$$
by noticing that $\nabla_i f_i(\mathbf{x}^*) = 0$, $\forall i \in \mathcal{N}$, according to Assumption 2. By the Lipschitz condition of $\nabla_i f_i$ in Assumption 1, we get that
$$\|\dot{y}_i\| \le l_i\|\mathbf{z}_i - \mathbf{x}^*\| \le l_i\|\mathbf{z} - \mathbf{1}_N \otimes \mathbf{x}^*\| \le l_i\eta_1 e^{-\eta_2 t}, \tag{15}$$
thus arriving at the second conclusion. For the third conclusion,
$$\int_0^\infty \dot{y}_i^2(s)\,ds = \int_0^\infty \|\nabla_i f_i(\mathbf{z}_i(s))\|^2\,ds \le l_i^2\int_0^\infty \|\mathbf{z}_i(s) - \mathbf{x}^*\|^2\,ds \le l_i^2\eta_1^2\int_0^\infty e^{-2\eta_2 s}\,ds \le \frac{l_i^2\eta_1^2}{2\eta_2}, \tag{16}$$
thus arriving at the third conclusion with $c_i = \frac{l_i^2\eta_1^2}{2\eta_2}$. □

Note that by Lemma 3 and (11)-(12), $\dot{y}_i$ and $\dot{z}_{ij}$ are also bounded, as $y_i$ and $z_{ij}$ for all $i, j \in \mathcal{N}$ are bounded.

With the above results in mind, we are now ready to show that the players' actions $\mathbf{x}$ can be driven to the Nash equilibrium $\mathbf{x}^*$ by utilizing the proposed method.

Theorem 1
Suppose that Assumptions 1-4 are satisfied. Then, there exists a positive constant $\delta^*$ such that for each $\delta \in (\delta^*, \infty)$,
$$\lim_{t\to\infty}\|\mathbf{x}(t) - \mathbf{x}^*\| = 0, \tag{17}$$
and $k_i(t)$, $\hat{\theta}_i(t)$ for all $i \in \mathcal{N}$ stay bounded.

Proof: Define a sub-Lyapunov candidate function for player $i$ as
$$V_i = \frac{1}{2}(x_i - y_i)^2 + \frac{1}{2}(\theta_i - \hat{\theta}_i)^2. \tag{18}$$
Then, the time derivative of $V_i$ along the trajectory is
$$\begin{aligned}
\dot{V}_i &= (x_i - y_i)(\dot{x}_i - \dot{y}_i) + (\hat{\theta}_i - \theta_i)\dot{\hat{\theta}}_i \\
&= (x_i - y_i)\Big(N(k_i)b_i\big(x_i - y_i + \phi_i(x_i)\hat{\theta}_i\big) + \phi_i(x_i)\theta_i\Big) - (x_i - y_i)\dot{y}_i + (\hat{\theta}_i - \theta_i)\phi_i(x_i)(x_i - y_i) \\
&\le N(k_i)b_i\dot{k}_i - (x_i - y_i)\dot{y}_i + \hat{\theta}_i\phi_i(x_i)(x_i - y_i) \\
&\le -(x_i - y_i)^2 + (N(k_i)b_i + 1)\dot{k}_i - (x_i - y_i)\dot{y}_i \\
&\le -\left(1 - \frac{C_i}{2}\right)(x_i - y_i)^2 + (N(k_i)b_i + 1)\dot{k}_i + \frac{(\dot{y}_i)^2}{2C_i},
\end{aligned} \tag{19}$$
by noticing that $|(x_i - y_i)\dot{y}_i| \le \frac{C_i}{2}(x_i - y_i)^2 + \frac{(\dot{y}_i)^2}{2C_i}$, where $C_i$ is a positive constant that satisfies $C_i < 2$. Integrating both sides of (19), it can be obtained that
$$\begin{aligned}
\int_0^{t_f}\dot{V}_i(\tau)\,d\tau &\le -\int_0^{t_f}\left(1 - \frac{C_i}{2}\right)(x_i - y_i)^2\,d\tau + \int_0^{t_f}(N(k_i)b_i + 1)\dot{k}_i\,d\tau + \int_0^{t_f}\frac{(\dot{y}_i)^2}{2C_i}\,d\tau \\
&\le \int_0^{t_f}(N(k_i)b_i + 1)\dot{k}_i\,d\tau + \frac{c_i}{2C_i}.
\end{aligned} \tag{20}$$
Note that the last inequality is obtained by noticing that $\int_0^{t_f}\frac{(\dot{y}_i)^2}{2C_i}\,d\tau \le \int_0^{\infty}\frac{(\dot{y}_i)^2}{2C_i}\,d\tau \le \frac{c_i}{2C_i}$ according to Lemma 3.

Hence, $V_i(t)$ and $k_i(t)$ are bounded on $[0, t_f)$ by Lemma 1, which indicates that $x_i - y_i$ and $\hat{\theta}_i$ are bounded. Moreover, as $y_i$ is bounded by Lemma 3, we obtain that $x_i$ is bounded for $t \in [0, t_f)$, from which we can further obtain that $\dot{x}_i$, $\dot{k}_i$, $\dot{\hat{\theta}}_i$ are bounded over the time interval $[0, t_f)$. This implies that there is no finite-time escape for the closed-loop system and hence $t_f = \infty$.

Taking the time derivative of $\dot{k}_i$ gives
$$\ddot{k}_i = (\dot{x}_i - \dot{y}_i)\big(x_i - y_i + \phi_i(x_i)\hat{\theta}_i\big) + (x_i - y_i)\left(\dot{x}_i - \dot{y}_i + \frac{\partial\phi_i(x_i)}{\partial x_i}\dot{x}_i\hat{\theta}_i + \phi_i(x_i)\dot{\hat{\theta}}_i\right). \tag{21}$$
As $x_i$ is bounded for $t \in [0, \infty)$, $\frac{\partial\phi_i(x_i)}{\partial x_i}$ is bounded for $t \in [0, \infty)$ by Assumption 4. Moreover, noticing that $x_i, y_i, \phi_i(x_i), \hat{\theta}_i, \dot{x}_i, \dot{y}_i, \dot{\hat{\theta}}_i$ are all bounded, we get that $\ddot{k}_i$ is bounded. Hence, $\dot{k}_i(t)$ is uniformly continuous with respect to $t$. In addition,
$$\int_0^\infty \dot{k}_i(s)\,ds = k_i(\infty) - k_i(0) \le k_i^*, \tag{22}$$
where $k_i^*$ is a finite constant determined by the bounds of $k_i(t)$. Therefore, $(x_i(t) - y_i(t))\big(x_i(t) - y_i(t) + \phi_i(x_i(t))\hat{\theta}_i(t)\big)$ is integrable over $t \in [0, \infty)$. Hence,
$$\lim_{t\to\infty}(x_i(t) - y_i(t))\big(x_i(t) - y_i(t) + \phi_i(x_i(t))\hat{\theta}_i(t)\big) = 0, \tag{23}$$
by Lemma 2.

On the other hand, taking the time derivative of $\dot{\hat{\theta}}_i$ gives
$$\ddot{\hat{\theta}}_i = \frac{\partial\phi_i(x_i)}{\partial x_i}\dot{x}_i(x_i - y_i) + \phi_i(x_i)(\dot{x}_i - \dot{y}_i), \tag{24}$$
from which we see that $\ddot{\hat{\theta}}_i$ is bounded by noticing that $x_i, y_i, \dot{x}_i, \dot{y}_i, \frac{\partial\phi_i(x_i)}{\partial x_i}, \phi_i(x_i)$ are bounded. Therefore, $\dot{\hat{\theta}}_i$ is uniformly continuous with respect to $t$. Moreover,
$$\int_0^\infty \dot{\hat{\theta}}_i(t)\,dt = \hat{\theta}_i(\infty) - \hat{\theta}_i(0) \le \hat{\theta}_i^*, \tag{25}$$
where $\hat{\theta}_i^*$ is a constant determined by the bounds of $\hat{\theta}_i$. Hence, by Lemma 2, we can obtain that
$$\lim_{t\to\infty}\phi_i(x_i(t))(x_i(t) - y_i(t)) = 0. \tag{26}$$
By (23), we have $x_i(t) = y_i(t)$ or, alternatively, $x_i(t) - y_i(t) + \phi_i(x_i(t))\hat{\theta}_i(t) = 0$ for $t = \infty$. Moreover, by (26), we have $\phi_i(x_i(t)) = 0$ or $x_i(t) = y_i(t)$ for $t = \infty$. Suppose that $x_i(t) \ne y_i(t)$ for $t = \infty$; then $\phi_i(x_i(t)) = 0$ must be satisfied. If this is the case, $x_i(t) - y_i(t) + \phi_i(x_i(t))\hat{\theta}_i(t) = x_i(t) - y_i(t) \ne 0$ for $t = \infty$, indicating that (23) cannot be satisfied. Hence, we arrive at a contradiction and obtain that $x_i(t) = y_i(t)$ must be satisfied for $t = \infty$. Equivalently, expanding (23) as $(x_i - y_i)^2 + \hat{\theta}_i\phi_i(x_i)(x_i - y_i) \to 0$ and using (26) together with the boundedness of $\hat{\theta}_i$ gives $(x_i - y_i)^2 \to 0$. Recalling that $\mathbf{y}(t) \to \mathbf{x}^*$ as $t \to \infty$, which is proven in Lemma 3, we arrive at the conclusion that $\mathbf{x}(t) \to \mathbf{x}^*$ as $t \to \infty$, thus completing the proof.
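The behaviour established in Theorem 1 can be explored with a small simulation. The following Python sketch integrates the plant (8) together with the strategy (9)-(12) by forward Euler. Everything specific in it is an assumption made for illustration and does not come from the paper's numerical section: the two-player quadratic game (`H`, `c`), the plant values `b` and `theta`, the regressor $\phi_i = \sin$, the choice $\bar{\delta}_{ij} = 1$ with $\delta = 50$, the initial conditions, and the reading $N(k) = k^2\sin(k)$.

```python
import numpy as np


def nussbaum(k):
    # Assumed reading of the paper's Nussbaum gain: N(k) = k^2 sin(k).
    return k**2 * np.sin(k)


def simulate(T=60.0, dt=1e-3, delta=50.0):
    """Forward-Euler simulation of (8)-(12) for a hypothetical 2-player game."""
    # Hypothetical quadratic game: grad_i f_i(x) = H[i] @ x - c[i].
    H = np.array([[2.0, 0.5], [0.5, 2.0]])
    c = np.array([2.0, -2.0])
    x_star = np.linalg.solve(H, c)           # Nash equilibrium: P(x*) = 0

    A = np.array([[0.0, 1.0], [1.0, 0.0]])   # undirected, connected graph
    deg = A.sum(axis=1)

    # Plant data: used only to propagate the plant, never by the controller.
    b = np.array([1.0, -2.0])                # heterogeneous control directions
    theta = np.array([0.5, -0.5])            # unknown parameters
    phi = np.sin                             # known regressor phi_i(x_i)

    x = np.array([0.5, 0.5])                 # players' actions
    y = np.zeros(2)                          # reference trajectory
    z = np.zeros((2, 2))                     # z[i, j]: player i's estimate of y_j
    k = np.zeros(2)                          # Nussbaum arguments
    theta_hat = np.zeros(2)                  # adaptive parameters

    for _ in range(int(round(T / dt))):
        e = x - y
        s = e + phi(x) * theta_hat
        u = nussbaum(k) * s                                   # (9)
        k_dot = e * s                                         # (10), first part
        theta_hat_dot = phi(x) * e                            # (10), second part
        y_dot = -((H * z).sum(axis=1) - c)                    # (11)
        z_dot = -delta * (deg[:, None] * z - A @ z
                          + A * (z - y[None, :]))             # (12)
        x_dot = b * u + phi(x) * theta                        # plant (8)

        x = x + dt * x_dot
        y = y + dt * y_dot
        z = z + dt * z_dot
        k = k + dt * k_dot
        theta_hat = theta_hat + dt * theta_hat_dot

    return x, x_star, k, theta_hat


if __name__ == "__main__":
    x, x_star, k, theta_hat = simulate()
    print("x(T)      =", x)
    print("x*        =", x_star)
    print("k(T)      =", k)
    print("theta_hat =", theta_hat)
```

Note that the controller lines use only `x`, `y`, `z`, `k` and `theta_hat`, so neither the signs of `b` nor the values of `theta` are available to it. Also, `theta_hat` is not expected to converge to `theta`: the analysis only guarantees that it stays bounded.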
□

In this paper, we focus on the case in which the communication graph is undirected and connected for simplicity. However, it should be noted that the presented results are still valid for strongly connected digraphs. To highlight this point, the following corollary is given.
Corollary 1
Suppose that Assumptions 1-2, 4 are satisfied and the communication graph is strongly connected. Then, there exists a positive constant $\delta^*$ such that for each $\delta \in (\delta^*, \infty)$,
$$\lim_{t\to\infty}\|\mathbf{x}(t) - \mathbf{x}^*\| = 0, \tag{27}$$
and $k_i(t)$, $\hat{\theta}_i(t)$ for all $i \in \mathcal{N}$ stay bounded.

Proof: The proof follows that of Theorem 1 by noticing that the results in Lemma 3 are still valid for strongly connected digraphs. □

In Theorem 1, we consider that each player $i$'s action is subject to both an unknown control direction ($b_i$ is unknown) and an uncertain parameter $\theta_i$, i.e.,
$$\dot{x}_i = b_i u_i + \phi_i(x_i)\theta_i. \tag{28}$$
If there is no uncertain parameter and the players' actions are generated by
$$\dot{x}_i = b_i u_i, \tag{29}$$
then the proposed seeking strategy can be revised to be
$$u_i = N(k_i)(x_i - y_i), \tag{30}$$
where $N(k_i) = k_i^2\sin(k_i)$,
$$\dot{k}_i = (x_i - y_i)^2, \tag{31}$$
and $y_i$ is generated by (11)-(12). If this is the case, the following corollary can be obtained.

Corollary 2
Suppose that Assumptions 1-3 are satisfied. Then, there exists a positive constant $\delta^*$ such that for each $\delta \in (\delta^*, \infty)$,
$$\lim_{t\to\infty}\|\mathbf{x}(t) - \mathbf{x}^*\| = 0, \tag{32}$$
and $k_i(t)$ for all $i \in \mathcal{N}$ stay bounded.

4.2 Distributed Nash equilibrium seeking for second-order systems

In this section, we suppose that for each $i \in \mathcal{N}$, player $i$'s action $x_i$ is governed by
$$\dot{x}_i = v_i, \qquad \dot{v}_i = b_i u_i + \phi_i(x_i)\theta_i, \tag{33}$$
in which $v_i \in \mathbb{R}$ is a state of player $i$.

To achieve distributed Nash equilibrium seeking for games in which each player $i$'s dynamics is governed by (33), the control input $u_i$ is designed as
$$u_i = N(k_i)\big(x_i - y_i + v_i + \phi_i(x_i)\hat{\theta}_i + (\dot{x}_i - \dot{y}_i)\big), \tag{34}$$
where $N(k_i) = k_i^2\sin(k_i)$ and
$$\dot{k}_i = (x_i - y_i + v_i)\big(x_i - y_i + v_i + \phi_i(x_i)\hat{\theta}_i + (\dot{x}_i - \dot{y}_i)\big), \qquad \dot{\hat{\theta}}_i = \phi_i(x_i)(x_i - y_i + v_i). \tag{35}$$
Moreover, $y_i$ is an auxiliary variable generated by
$$\dot{y}_i = -\nabla_i f_i(\mathbf{z}_i), \tag{36}$$
where $\mathbf{z}_i = [z_{i1}, z_{i2}, \cdots, z_{iN}]^T$. Furthermore,
$$\dot{z}_{ij} = -\delta_{ij}\left(\sum_{k=1}^{N} a_{ik}(z_{ij} - z_{kj}) + a_{ij}(z_{ij} - y_j)\right), \tag{37}$$
where $\delta_{ij} = \delta\bar{\delta}_{ij}$, $\delta$ is a positive constant to be determined and $\bar{\delta}_{ij}$ is a fixed positive constant.

To establish the results for second-order systems, the following assumption is also needed.

Assumption 6 For each $i, j \in \mathcal{N}$, $\frac{\partial\nabla_i f_i(\mathbf{x})}{\partial x_j}$ is bounded given that $\mathbf{x}$ is bounded.

Remark 3
Comparing the strategy in (34)-(37) with that in (9)-(12), we see that the optimization modules are the same while the regulation modules are different. As the system in (33) is a second-order system, we further utilize $\dot{x}_i$ and $\dot{y}_i$ in the seeking strategy. Recalling the definitions of $\dot{x}_i$ and $\dot{y}_i$, it is clear that the communication in the proposed seeking strategy is still one-hop.

4.2.2 Convergence analysis

The following theorem illustrates the convergence result for the proposed method.
Theorem 2
Suppose that Assumptions 1-3, 5-6 are satisfied. Then, there exists a positive constant $\delta^*$ such that for each $\delta \in (\delta^*, \infty)$,
$$\lim_{t\to\infty}\|\mathbf{x}(t) - \mathbf{x}^*\| = 0. \tag{38}$$
Moreover, $k_i(t)$ and $\hat{\theta}_i(t)$ for all $i \in \mathcal{N}$ stay bounded.

Proof: For notational convenience, let $\xi_i = x_i - y_i + v_i$. Define the sub-Lyapunov candidate function for player $i$ as
$$V_i = \frac{1}{2}(x_i - y_i)^2 + \frac{1}{2}\xi_i^2 + \frac{1}{2}(\theta_i - \hat{\theta}_i)^2. \tag{39}$$
Then, the time derivative of $V_i$ is
$$\begin{aligned}
\dot{V}_i &= -(x_i - y_i)(x_i - y_i - \xi_i) - (x_i - y_i)\dot{y}_i + \xi_i\Big(b_i N(k_i)\big(\xi_i + \phi_i(x_i)\hat{\theta}_i + (\dot{x}_i - \dot{y}_i)\big)\Big) \\
&\quad + \xi_i\big(\phi_i(x_i)\theta_i + \dot{x}_i - \dot{y}_i\big) + (\hat{\theta}_i - \theta_i)\phi_i(x_i)\xi_i \\
&= -(x_i - y_i)^2 - \xi_i^2 + (b_i N(k_i) + 1)\dot{k}_i + (x_i - y_i)\xi_i - (x_i - y_i)\dot{y}_i \\
&\le -\left(\frac{1}{2} - \frac{C_i}{2}\right)(x_i - y_i)^2 - \frac{1}{2}\xi_i^2 + (b_i N(k_i) + 1)\dot{k}_i + \frac{(\dot{y}_i)^2}{2C_i},
\end{aligned} \tag{40}$$
where $C_i$ is a positive constant that satisfies $C_i < 1$, and the last step uses $(x_i - y_i)\xi_i \le \frac{1}{2}(x_i - y_i)^2 + \frac{1}{2}\xi_i^2$ and $|(x_i - y_i)\dot{y}_i| \le \frac{C_i}{2}(x_i - y_i)^2 + \frac{(\dot{y}_i)^2}{2C_i}$. Integrating both sides of (40) over $t \in [0, t_f)$ gives
$$\begin{aligned}
\int_0^{t_f}\dot{V}_i\,d\tau &\le -\int_0^{t_f}\left[\left(\frac{1}{2} - \frac{C_i}{2}\right)(x_i - y_i)^2 + \frac{1}{2}\xi_i^2\right]d\tau + \int_0^{t_f}(b_i N(k_i) + 1)\dot{k}_i\,d\tau + \int_0^{t_f}\frac{(\dot{y}_i)^2}{2C_i}\,d\tau \\
&\le \int_0^{t_f}(b_i N(k_i) + 1)\dot{k}_i\,d\tau + \frac{c_i}{2C_i}.
\end{aligned} \tag{41}$$
Hence, by Lemma 1, we get that $V_i$ and $k_i$ are bounded for $t \in [0, t_f)$, which further indicates that $x_i - y_i$, $\xi_i$, $\hat{\theta}_i$ are bounded for $t \in [0, t_f)$. Recalling that $y_i$ is bounded, we get that $x_i$ is bounded for $t \in [0, t_f)$. Hence, $v_i$ is bounded. Therefore, there is no finite-time escape for the closed-loop system, which indicates that $t_f = \infty$. Recalling (34)-(37), we can obtain that $\dot{x}_i, \dot{v}_i, \dot{y}_i, \dot{k}_i, \dot{\hat{\theta}}_i$ are all bounded.

Taking the time derivative of $\dot{k}_i(t)$ gives
$$\begin{aligned}
\ddot{k}_i &= (\dot{v}_i + \dot{x}_i - \dot{y}_i)\big(\xi_i + \phi_i(x_i)\hat{\theta}_i + (\dot{x}_i - \dot{y}_i)\big) \\
&\quad + \xi_i\left(\dot{x}_i - \dot{y}_i + \dot{v}_i + \frac{\partial\phi_i(x_i)}{\partial x_i}\dot{x}_i\hat{\theta}_i + \phi_i(x_i)\dot{\hat{\theta}}_i\right) + \xi_i(\ddot{x}_i - \ddot{y}_i).
\end{aligned} \tag{42}$$
Note that $\ddot{x}_i = \dot{v}_i$ is bounded and, by (36), $\ddot{y}_i = -\left(\frac{\partial\nabla_i f_i(\mathbf{x})}{\partial\mathbf{x}}\Big|_{\mathbf{x} = \mathbf{z}_i}\right)^T\dot{\mathbf{z}}_i$ is bounded, as $\mathbf{z}_i$, $\dot{\mathbf{z}}_i$ are bounded and $\frac{\partial\nabla_i f_i(\mathbf{x})}{\partial\mathbf{x}}\Big|_{\mathbf{x} = \mathbf{z}_i}$ is bounded for bounded $\mathbf{z}_i$ (by Assumption 6). It can thus be seen that $\ddot{k}_i$ is bounded. Hence, $\dot{k}_i(t)$ is uniformly continuous. Moreover,
$$\int_0^\infty \dot{k}_i(\tau)\,d\tau = k_i(\infty) - k_i(0) \le k_i^*, \tag{43}$$
where $k_i^*$ is a constant determined by the bounds of $k_i(t)$. Hence, by Lemma 2, we can obtain that $(v_i + x_i - y_i)\big(x_i - y_i + v_i + \phi_i(x_i)\hat{\theta}_i + (\dot{x}_i - \dot{y}_i)\big) \to 0$ as $t \to \infty$. Similarly,
$$\int_0^\infty \dot{\hat{\theta}}_i(\tau)\,d\tau = \hat{\theta}_i(\infty) - \hat{\theta}_i(0) \le \hat{\theta}_i^*, \tag{44}$$
where $\hat{\theta}_i^*$ is a constant determined by the bounds of $\hat{\theta}_i$. Hence, by Lemma 2, we can obtain that $\phi_i(x_i)(v_i + x_i - y_i) \to 0$ as $t \to \infty$.

Hence, for $t = \infty$, we have $\phi_i(x_i)(v_i + x_i - y_i) = 0$ and $(v_i + x_i - y_i)\big(x_i - y_i + v_i + \phi_i(x_i)\hat{\theta}_i + (\dot{x}_i - \dot{y}_i)\big) = 0$.

Case I: $\phi_i(x_i) = 0$ but $v_i + x_i - y_i \ne 0$ for $t = \infty$. In this case, $x_i - y_i + v_i + (\dot{x}_i - \dot{y}_i) = 0$. Recalling that $v_i = \dot{x}_i$ and that, as $t \to \infty$, $y_i \to x_i^*$ and $\dot{y}_i \to 0$, we get that
$$\dot{x}_i = -\frac{1}{2}(x_i - x_i^*), \tag{45}$$
from which it is clear that $\mathbf{x}(t) \to \mathbf{x}^*$ for $t \to \infty$.

Case II: $v_i + x_i - y_i = 0$ for $t = \infty$. If this is the case,
$$\dot{x}_i = -(x_i - y_i), \tag{46}$$
as $t \to \infty$. Recalling that $\lim_{t\to\infty}(y_i(t) - x_i^*) = 0$, we can obtain that $\lim_{t\to\infty}\|\mathbf{x}(t) - \mathbf{x}^*\| = 0$. To this end, the conclusion is obtained. □

Similar to Corollary 1, the following result can be obtained if the communication graph is strongly connected.
Corollary 3
Suppose that Assumptions 1-2, 4, 6 are satisfied and the communication graph is strongly connected. Then, there exists a positive constant $\delta^*$ such that for each $\delta \in (\delta^*, \infty)$,
$$\lim_{t\to\infty}\|\mathbf{x}(t) - \mathbf{x}^*\| = 0 \tag{47}$$
and $k_i(t)$, $\hat{\theta}_i(t)$ for all $i \in \mathcal{N}$ stay bounded.

4.3 Distributed Nash equilibrium seeking for more general second-order systems

In this section, we consider a game in which each player $i$'s action is governed by
$$\dot{x}_i = b_{i1}v_i + \phi_{i1}(x_i)\theta_{i1}, \qquad \dot{v}_i = b_{i2}u_i + \phi_{i2}(x_i, v_i)\theta_{i2}. \tag{48}$$

Remark 4 Note that compared with (33), the effect of $v_i$ on $x_i$ is also uncertain in (48), as $b_{i1}$ is also unknown. In addition, an uncertain nonlinear term $\phi_{i1}(x_i)\theta_{i1}$ is addressed as well. Hence, in this problem, $b_{i1}$ and $b_{i2}$ are both unknown directions that should be addressed. Moreover, both $\theta_{i1}$ and $\theta_{i2}$ result in uncertain nonlinearities that should be accommodated.

Motivated by [12], the Nash equilibrium seeking strategy is designed in the following process:
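Step 1: Generate a reference trajectory $y_i$ for $i \in \mathcal{N}$ that would converge to the Nash equilibrium according to
$$\dot{y}_i = -\nabla_i f_i(\mathbf{z}_i), \qquad \dot{z}_{ij} = -\delta_{ij}\left(\sum_{k=1}^{N} a_{ik}(z_{ij} - z_{kj}) + a_{ij}(z_{ij} - y_j)\right), \tag{49}$$
where $j \in \mathcal{N}$, $\mathbf{z}_i = [z_{i1}, z_{i2}, \cdots, z_{iN}]^T$, $\delta_{ij} = \delta\bar{\delta}_{ij}$, $\delta$ is a positive constant to be determined and $\bar{\delta}_{ij}$ is a fixed positive constant.

Step 2: Generate a reference trajectory $\alpha_i$ for $v_i$ as
$$\alpha_i = N(k_{i1})\big(x_i - y_i + \phi_{i1}(x_i)\hat{\theta}_i\big), \qquad \dot{k}_{i1} = (x_i - y_i)\big(x_i - y_i + \phi_{i1}(x_i)\hat{\theta}_i\big), \qquad \dot{\hat{\theta}}_i = \phi_{i1}(x_i)(x_i - y_i). \tag{50}$$

Step 3: Let $\beta_i = v_i - \alpha_i$. Then, through direct calculation, it can be obtained that
$$\dot{\beta}_i = \dot{v}_i - \dot{\alpha}_i = b_{i2}u_i + \phi_{i2}(x_i, v_i)\theta_{i2} + \Psi_{i1}(x_i, k_{i1}, \hat{\theta}_i)\theta_{i1} + \Psi_{i2}(k_{i1}, x_i, y_i, \hat{\theta}_i, \dot{y}_i) + \Psi_{i3}(k_{i1}, x_i, \hat{\theta}_i, v_i)b_{i1}, \tag{51}$$
where
$$\Psi_{i1} = -N(k_{i1})\left(\phi_{i1}(x_i) + \frac{\partial\phi_{i1}(x_i)}{\partial x_i}\hat{\theta}_i\phi_{i1}(x_i)\right),$$
$$\Psi_{i2} = -\big(2k_{i1}\sin(k_{i1}) + k_{i1}^2\cos(k_{i1})\big)(x_i - y_i)\big(x_i - y_i + \phi_{i1}(x_i)\hat{\theta}_i\big)^2 - N(k_{i1})\big(-\dot{y}_i + \phi_{i1}^2(x_i)(x_i - y_i)\big),$$
$$\Psi_{i3} = -N(k_{i1})\left(\frac{\partial\phi_{i1}(x_i)}{\partial x_i}\hat{\theta}_i + 1\right)v_i.$$
Accordingly, the control input $u_i$ is designed as
$$\begin{aligned}
u_i &= N(k_{i2})\big(\beta_i + \phi_{i2}\bar{\theta}_{i2} + \Psi_{i1}\bar{\theta}_{i1} + \Psi_{i2} + \Psi_{i3}\bar{b}_{i1}\big), \\
\dot{k}_{i2} &= \beta_i\big(\beta_i + \phi_{i2}\bar{\theta}_{i2} + \Psi_{i1}\bar{\theta}_{i1} + \Psi_{i2} + \Psi_{i3}\bar{b}_{i1}\big), \\
\dot{\bar{\theta}}_{i2} &= \beta_i\phi_{i2}, \qquad \dot{\bar{\theta}}_{i1} = \beta_i\Psi_{i1}, \qquad \dot{\bar{b}}_{i1} = \beta_i\Psi_{i3}.
\end{aligned} \tag{52}$$

Remark 5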
Note that (50) is designed to drive $x_i$ to $y_i$ and (52) is designed to drive $v_i$ to $\alpha_i$. The design of the control input in (52) is motivated by [12], which treats $v_i$ as a virtual control input for $x_i$. To deal with the unknown constants $b_{i1}$ and $b_{i2}$, two Nussbaum functions are included. To accommodate multiple Nussbaum functions, the idea is to design the control input such that $\beta_i$ is square integrable (see also [12]).

The following theorem establishes the stability of the Nash equilibrium under the control input designed in (52).
Theorem 3
Suppose that Assumptions 1-3, 5 are satisfied. Then, there exists a positive constant $\delta^*$ such that for each $\delta \in (\delta^*, \infty)$,
$$\lim_{t\to\infty} \|x(t) - x^*\| = 0, \tag{53}$$
and other variables stay bounded.

Proof: The proof is similar to those in [12] and Theorem 2. For the convenience of the readers, a sketch of the proof is given as follows.

Step 1: Show that $\beta_i$ is square integrable by defining the sub-Lyapunov candidate function as
$$V_{i2} = \frac{1}{2}\beta_i^2 + \frac{1}{2}(\bar{\theta}_{i2} - \theta_{i2})^2 + \frac{1}{2}(\bar{\theta}_{i1} - \theta_{i1})^2 + \frac{1}{2}(\bar{b}_{i1} - b_{i1})^2. \tag{54}$$
Then, following the proof of Theorem 2, it can be obtained that
$$\dot{V}_{i2} \le -\beta_i^2 + (b_{i2}N(k_{i2}) + 1)\dot{k}_{i2}. \tag{55}$$
Moreover, taking integrations on both sides of (55) over $[0, t_f)$, we get that
$$\int_0^{t_f} \dot{V}_{i2}\, d\tau \le -\int_0^{t_f} \beta_i^2\, d\tau + \int_0^{t_f} (b_{i2}N(k_{i2}) + 1)\dot{k}_{i2}\, d\tau, \tag{56}$$
from which it can be obtained that $V_{i2}$, $k_{i2}$ and $\int_0^{t_f} (b_{i2}N(k_{i2}) + 1)\dot{k}_{i2}\, d\tau$ are bounded by Lemma 1. Moreover, from (56), it is clear that
$$\int_0^{t_f} \beta_i^2\, d\tau \le V_{i2}(0) - V_{i2}(t_f) + \int_0^{t_f} (b_{i2}N(k_{i2}) + 1)\dot{k}_{i2}\, d\tau. \tag{57}$$
Hence, $\beta_i$ is square integrable for $t \in [0, t_f)$.

Step 2: Show that $x_i$ can be driven to $y_i$ by defining the other sub-Lyapunov function as
$$V_{i1} = \frac{1}{2}(x_i - y_i)^2 + \frac{1}{2}(\hat{\theta}_{i1} - \theta_{i1})^2. \tag{58}$$
Then, the time derivative of $V_{i1}$ is
$$\begin{aligned}
\dot{V}_{i1} &= (x_i - y_i)\big(b_{i1}\alpha_i + b_{i1}\beta_i + \phi_{i1}(x_i)\theta_{i1} - \dot{y}_i\big) + (\hat{\theta}_{i1} - \theta_{i1})\dot{\hat{\theta}}_{i1} \\
&= (x_i - y_i)\big(b_{i1}N(k_{i1})(x_i - y_i + \phi_{i1}(x_i)\hat{\theta}_{i1}) + b_{i1}\beta_i + \phi_{i1}(x_i)\theta_{i1} - \dot{y}_i\big) + (\hat{\theta}_{i1} - \theta_{i1})\dot{\hat{\theta}}_{i1} \\
&\le -(x_i - y_i)^2 + (b_{i1}N(k_{i1}) + 1)\dot{k}_{i1} + (x_i - y_i)b_{i1}\beta_i - (x_i - y_i)\dot{y}_i \\
&\le -\frac{1}{2}(x_i - y_i)^2 + (b_{i1}N(k_{i1}) + 1)\dot{k}_{i1} + C_{i1}b_{i1}^2\beta_i^2 + C_{i2}\dot{y}_i^2,
\end{aligned} \tag{59}$$
where $C_{i1}$, $C_{i2}$ are positive constants that satisfy $C_{i1} + C_{i2} \le 1$. Noticing that both $\beta_i$ and $\dot{y}_i$ are square integrable for $t \in [0, t_f)$, we obtain that $V_{i1}$, $k_{i1}$ are bounded for $t \in [0, t_f)$. Combining the above two steps, it can be seen that $x_i - y_i$, $k_{i1}$, $\hat{\theta}_{i1}$ as well as $\beta_i$, $k_{i2}$, $\bar{\theta}_{i1}$, $\bar{\theta}_{i2}$, $\bar{b}_{i1}$ are all bounded. Recalling the definition of $\alpha_i$, it can be obtained that $v_i$ is bounded. Furthermore, $x_i$ is bounded as $y_i$ is bounded by Lemma 3. To this end, we have shown that all the variables contained in the closed-loop system are bounded for $t \in [0, t_f)$, indicating that there is no finite-time escape and $t_f = \infty$.

Step 3: The rest of the analysis follows the proof of Theorem 2: take the time derivatives of $\dot{k}_{i1}$ and $\dot{\hat{\theta}}_{i1}$ to show that they are uniformly continuous. Then, integrate them over $[0, \infty)$ to prove that their integrals are bounded. With the above conclusions in mind, by Barbalat's lemma, $\lim_{t\to\infty} \dot{k}_{i1} = 0$ and $\lim_{t\to\infty} \dot{\hat{\theta}}_{i1} = 0$, from which it can be obtained that $\lim_{t\to\infty} (x_i(t) - y_i(t)) = 0$ by following the arguments in the proof of Theorem 1. ✷

Remark 6
The system dynamics considered in (48) is similar to the one in [12] and the state regulation part is motivated by [12]. However, different from [12], which regulates the state to zero, this paper needs to regulate the state to a time-varying reference trajectory ($y_i(t)$ for $i \in \mathcal{N}$) generated by the optimization module.

In Section 4, the proposed seeking strategies contain a centralized control gain $\delta$, which depends on the players' objective functions and the communication graph. In general, such centralized information can hardly be obtained. Actually, in [28], we proposed fully distributed Nash equilibrium seeking strategies by adaptively adjusting the control gains. In the following, we further prove that the adaptive algorithms in [28] can also be utilized in the proposed algorithms to achieve fully distributed Nash equilibrium seeking in the considered problem.

By the methods in [28], one can replace (11)-(12) in the proposed algorithms with
$$\begin{aligned}
\dot{y}_i &= -\nabla_i f_i(z_i), \\
\dot{z}_{ij} &= -\delta_{ij}\Big(\sum_{k=1}^{N} a_{ik}(z_{ij} - z_{kj}) + a_{ij}(z_{ij} - y_j)\Big), \\
\dot{\delta}_{ij} &= \Big(\sum_{k=1}^{N} a_{ik}(z_{ij} - z_{kj}) + a_{ij}(z_{ij} - y_j)\Big)^2,
\end{aligned} \tag{60}$$
for $i \in \mathcal{N}$. Then, the following result can be obtained.
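As a numerical illustration of the adaptive-gain reference generator in (60), the following minimal sketch integrates it with forward Euler. Everything specific in it is an assumption made for illustration and is not taken from the paper: the three-player quadratic game (`r`, `c`), the complete communication graph `a`, zero initial conditions, the step size, and the function names `partial_grad`, `consensus_error` and `run`. The gain derivative is taken here to be the squared consensus error, so the gains can only grow.

```python
import numpy as np

# Hypothetical 3-player quadratic game (an assumption, not the paper's example):
#   f_i(x) = (x_i - r_i)^2 + c * x_i * sum_{j != i} x_j
r = np.array([1.0, -0.5, 0.5])
c = 0.2
N = r.size

# Undirected, connected communication graph (complete graph on 3 nodes), a_ii = 0.
a = np.ones((N, N)) - np.eye(N)


def partial_grad(z):
    """grad_i f_i evaluated at player i's estimate vector z_i (row i of z)."""
    own = np.diag(z)
    return 2.0 * (own - r) + c * (z.sum(axis=1) - own)


def consensus_error(y, z):
    """s_ij = sum_k a_ik (z_ij - z_kj) + a_ij (z_ij - y_j)."""
    return a.sum(axis=1)[:, None] * z - a @ z + a * (z - y[None, :])


def run(t_end=40.0, dt=1e-3):
    """Forward-Euler integration of the three update laws in (60)."""
    y = np.zeros(N)            # reference trajectories y_i
    z = np.zeros((N, N))       # z[i, j]: player i's estimate of y_j
    delta = np.zeros((N, N))   # adaptive gains delta_ij
    for _ in range(int(t_end / dt)):
        s = consensus_error(y, z)
        y_dot = -partial_grad(z)
        z_dot = -delta * s
        delta_dot = s ** 2     # nonnegative, so the gains never decrease
        y = y + dt * y_dot
        z = z + dt * z_dot
        delta = delta + dt * delta_dot
    return y, z, delta


# Nash equilibrium of the hypothetical game: 2 (x_i - r_i) + c * sum_{j != i} x_j = 0.
A = (2.0 - c) * np.eye(N) + c * np.ones((N, N))
x_star = np.linalg.solve(A, 2.0 * r)

y, z, delta = run()
print("y(T)      =", y)
print("x*        =", x_star)
print("max gain  =", delta.max())
```

No gain has to be tuned from global information in this sketch: each `delta[i, j]` is driven only by the local error `s[i, j]`, which is the point of replacing the centralized gain.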
Lemma 4
Suppose that Assumptions 1-3 are satisfied. Then, with the strategy in (60), the following conclusions can be obtained:
• For each $i, j \in \mathcal{N}$, $y_i(t)$, $z_{ij}(t)$ and $\delta_{ij}(t)$ are bounded for $t \in [0, \infty)$.
• For each $i \in \mathcal{N}$, $\dot{y}_i(t)$ is square integrable over $t \in (0, \infty)$, i.e., $\int_0^{\infty} \dot{y}_i^2(s)\, ds \le c_i$ for some positive constant $c_i$.
Proof:
Following the proof of [28] to define V = e T M e + ( y − x ∗ ) T ( y − x ∗ )+ P Ni =1 P Nj =1 ( θ ij − θ ∗ ij ) , where θ ∗ ij > m || M ||√ N max i ∈V { l i } +(2 || M || N +max i ∈V { l i } ) mλ min ( MM ) , e =[ z − y , z − y , · · · , z N − y N , z − y , · · · , z NN − y N ] T , y = [ y , y , · · · , y N ] T , M = L ⊗ I N × N + A and A is adiagonal matrix with its elements being a ij . Then, it follows from [28] that˙ V ≤ − a || E || , (61)where a > E = [( y − x ∗ ) T , e T ] T , from which it can beobtained that for each i, j ∈ N , y i , z ij and δ ij are boundedfor t ∈ [0 , ∞ ) . Moreover, Z ∞ ˙ y i ( s ) ds ≤ Z ∞ |∇ i f i ( z i ( s )) − ∇ i f i ( x ∗ ) | ds ≤ l i Z ∞ || z i ( s ) − x ∗ || ds. (62)Taking integration on both sides of (61), we obtain that Z ∞ ˙ V ( s ) ds ≤ − a Z ∞ || E ( s ) || ds, (63)by which V ( ∞ ) + a Z ∞ || E || ds ≤ V (0) . (64)By further noticing that Z ∞ || z i ( s ) − x ∗ || ds ≤ Z ∞ || E || ds, (65)we obtain that V ( ∞ ) + a Z ∞ || z i ( s ) − x ∗ || ds ≤ V (0) . (66)Hence Z ∞ ˙ y i ( s ) ds ≤ ( V (0) − V ( ∞ )) l i a , (67)thus arriving at the second conclusion. ✷ With the results in Lemma 4, we can achieve the fullydistributed implementations of the proposed algorithms,which is stated in the following theorem.
Theorem 4
Suppose that Assumptions 1-4 are satisfied. Then, for the system considered in (3) with the control input in (9)-(10), where $y_i$ is generated by (60),
$$\lim_{t\to\infty} \|x(t) - x^*\| = 0, \tag{68}$$
and all the other variables stay bounded.

It is worth mentioning that for the systems considered in (4)/(5) and the control inputs designed for them, one can replace $y_i$ therein with the one generated by (60) to achieve fully distributed implementations of the proposed algorithms. Note that we only present the results for the system (3) and omit the rest to avoid repetition.
Remark 7
In this section, we only provide an example to illustrate the fully distributed implementations of the proposed algorithms. However, it is worth noting that the proposed algorithms actually provide a general framework to deal with games in systems with unknown control directions. That is, one may utilize other alternative approaches that result in square integrable $\dot{y}_i$ and bounded state variables as well as their time derivatives to achieve fully distributed Nash equilibrium seeking for systems with unknown control directions.
Fig. 2. The communication graph among the players.
Remark 8
The modular design in this paper is motivated by [25], [31]. Though it was required that the controls should be bounded in [25], the control directions were supposed to be known. Moreover, [31] designed an extremum seeker through robust state regulation and numerical optimization, in which the control directions are also considered to be known. Different from [19], which considered distributed optimization problems with unknown control directions, this paper addresses Nash equilibrium seeking problems with both unknown control directions and parametric uncertainties. In particular, the existence of multiple unknown control directions and uncertain parameters is addressed. Though we only investigate first-order and second-order systems analytically in this paper, we believe that under the proposed framework, it is not challenging to extend the current results to high-order systems by backstepping techniques.
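The role of a Nussbaum gain under an unknown control direction can be seen on a scalar toy problem. The sketch below is only an illustration under stated assumptions and is not one of the algorithms of this paper: it assumes the plant $\dot{x} = bu$ with an unknown nonzero constant $b$, the common choice $N(k) = k^2\sin(k)$, the law $u = N(k)x$ with $\dot{k} = x^2$, and forward-Euler integration. The reference trajectory, the uncertain nonlinearity and the game itself are left out, and the names `nussbaum` and `simulate` are hypothetical.

```python
import math


def nussbaum(k):
    """Nussbaum-type gain N(k) = k^2 sin(k); its sign keeps switching as k grows."""
    return k * k * math.sin(k)


def simulate(b, x0=1.0, k0=0.0, dt=1e-3, t_end=20.0):
    """Regulate x to zero for x_dot = b * u without using the sign of b."""
    x, k = x0, k0
    for _ in range(int(t_end / dt)):
        u = nussbaum(k) * x   # the controller never reads b
        x_dot = b * u         # toy plant with unknown control direction
        k_dot = x * x         # k grows only while x is away from zero
        x += dt * x_dot
        k += dt * k_dot
    return x, k


for b in (1.0, -1.0):
    x_T, k_T = simulate(b)
    print(f"b = {b:+.1f}: x(T) = {x_T:.2e}, k(T) = {k_T:.3f}")
```

The same controller is run for both signs of `b`: the variable `k` keeps increasing until `N(k)` takes the sign that makes the loop contractive, after which `x` decays and `k` settles at a finite value.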
Though for presentation simplicity, we suppose that $x_i \in \mathbb{R}$, it should be noted that the presented results can be directly adapted to deal with games in which the players are of multiple heterogeneous dimensions. In the subsequent section, an example in which $x_i \in \mathbb{R}^2$ for $i \in \mathcal{N}$ will be numerically studied.

In this section, we consider the connectivity control game among a network of 7 mobile sensors considered in [1]. The objective function of player $i$ engaged in the game is defined as
$$F_i(x) = h_i(x_i) + l_i(x), \tag{69}$$
where $x_i = [x_{i1}, x_{i2}]^T \in \mathbb{R}^2$ and
$$h_i(x_i) = x_i^T m_{ii} x_i + x_i^T m_i + i, \tag{70}$$
in which $m_{ii} = \mathrm{diag}\{i, i\}$, $m_i = [i, i]^T$. Moreover, l ( x ) = k x − x k , l ( x ) = k x − x k , l ( x ) = k x − x k , l ( x ) = k x − x k , l ( x ) = k x − x k + k x − x k , l ( x ) = k x − x k + k x − x k and l ( x ) = k x − x k . It can be calculated that the Nash equilibrium of the game is x ∗ i = − and x ∗ i = − $i \in \{1, 2, \cdots, 7\}$. In the simulation, the undirected and connected communication graph is plotted in Fig. 2. In the following, games with dynamics in (3), (4) and (5) will be numerically explored, successively.

In this section, we simulate first-order systems in (3), where the control input is designed in (9)-(12). Note that as $x_i \in \mathbb{R}^2$, $b_i \in \mathbb{R}^{2\times 2}$. In the simulation, b = diag { , } , b = diag { , } , b = diag {− , − } , b = diag { , } , b = diag {− , − } , b = diag {− , − } and b = diag { , } . Moreover, $\phi_i = i x_i$. Let x (0) = [ − , , − , − , , , , − , − , , , , , T , and the initial values for all the other variables in (9)-(12) be zero. Then, generated by (9)-(12), the players' action trajectories $x_i(t)$ for $i \in \{1, 2, \cdots, 7\}$ are plotted in Fig. 3, from which it is clear that the players' action trajectories converge to the Nash equilibrium asymptotically. Moreover, Figs. 4-5 illustrate the trajectories of $k_{ij}(t)$ and $\hat{\theta}_{ij}(t)$ for all $i \in \{1, 2, \cdots, 7\}$, $j \in \{1, 2\}$, respectively. From Figs. 4-5, it can be seen that these variables stay bounded. Therefore, Theorem 1 is numerically validated.

Fig. 3. The trajectories of $x_i(t)$ for $i \in \{1, 2, \cdots, 7\}$ generated by (9)-(12).
Fig. 4. The trajectories of $k_{ij}$ for $i \in \{1, 2, \cdots, 7\}$, $j \in \{1, 2\}$ generated by (9)-(12).
Fig. 5. The trajectories of $\hat{\theta}_{ij}$ generated by (9)-(12) for $i \in \{1, 2, \cdots, 7\}$, $j \in \{1, 2\}$.
In this section, we simulate the system in (4), where the control input is designed in (34)-(37). In the simulation, $b_i$, $\phi_i(x_i)$ and $x(0)$ follow those in Section 6.1 and the initial values for all the other variables are zero. The players' action trajectories $x_i(t)$ for $i \in \{1, 2, \cdots, 7\}$ generated by (34)-(37) are depicted in Fig. 6, from which it can be seen that the players' actions converge to the actual Nash equilibrium of the game. In addition, $k_{ij}(t)$ and $\hat{\theta}_{ij}(t)$ for all $i \in \{1, 2, \cdots, 7\}$, $j \in \{1, 2\}$ are given in Figs. 7-8. From Figs. 7-8, we can conclude that $k_{ij}(t)$ and $\hat{\theta}_{ij}(t)$ for all $i \in \{1, 2, \cdots, 7\}$, $j \in \{1, 2\}$ stay bounded. Furthermore, Fig. 9 demonstrates that $v_i(t)$ for all $i \in \{1, 2, \cdots, 7\}$ decay to zero, which is aligned with the results in Theorem 2. To this end, the conclusions in Theorem 2 have been numerically verified.

Fig. 6. The trajectories of $x_i(t)$ for $i \in \{1, 2, \cdots, 7\}$ generated by (34)-(37).
Fig. 7. The trajectories of $k_{ij}$ for $i \in \{1, 2, \cdots, 7\}$, $j \in \{1, 2\}$ generated by (34)-(37).
Fig. 8. The trajectories of $\hat{\theta}_{ij}$ for $i \in \{1, 2, \cdots, 7\}$, $j \in \{1, 2\}$ generated by (34)-(37).
Fig. 9. The trajectories of $v_i(t)$ for $i \in \{1, 2, \cdots, 7\}$ generated by (34)-(37).

In this section, we numerically verify the control input designed for uncertain nonlinear systems in (5). To illustrate the case, we suppose that the action of player 7 is governed by (5), and its control input is given by (49)-(52), while all the other players' actions are governed by (4) with their control inputs being (34)-(37). For players 1-6, $b_i$ and $\phi_i(x_i)$ are chosen to be the same as those in Section 6.2. For player 7, b = b = diag { , } , φ ( x i ) =7 x and φ ( x i , v i ) = [7 x , v ] T . In addition, x (0) = [ − , , − , − , , , , − , − , , , , , T and the initial conditions for all the other variables are zero. Generated by the proposed methods, the players' actions $x_i(t)$ for $i \in \{1, 2, \cdots, 7\}$ are shown in Fig. 10, from which we see that the players' actions converge to the Nash equilibrium. Moreover, $k_i(t)$ and $\hat{\theta}_i(t)$ for $i \in \{1, 2, \cdots, 7\}$ are plotted in Figs. 11-12, which show that $k_i(t)$ and $\hat{\theta}_i(t)$ stay bounded. The evolution of $v_i(t)$ is shown in Fig. 13, which shows that $v_i(t)$ for all $i \in \{1, 2, \cdots, 7\}$ are also bounded. Hence, the effectiveness of the method in (49)-(52) is also verified.

Fig. 10. The trajectories of $x_i(t)$ for $i \in \{1, 2, \cdots, 7\}$ with player 7's control strategy being (49)-(52) and the rest of the players' control strategy being (34)-(37).
Fig. 11. The trajectories of $k_i(t)$ for $i \in \{1, 2, \cdots, 7\}$ with player 7's control strategy being (49)-(52) and the rest of the players' control strategy being (34)-(37).
Fig. 12. The trajectories of $\hat{\theta}_i(t)$ for $i \in \{1, 2, \cdots, 7\}$ with player 7's control strategy being (49)-(52) and the rest of the players' control strategy being (34)-(37).
Fig. 13. The trajectories of $v_i(t)$ for $i \in \{1, 2, \cdots, 7\}$ with player 7's control input being (49)-(52) and the rest of the players' control input being (34)-(37).

To verify the fully distributed implementations of the proposed methods, we take first-order systems as an example. The simulation setting of this section follows that of Section 6.1 and $\delta_{ij}(0) = 0$ for all $i \in \{1, 2, \cdots, 7\}$, $j \in \{1, 2\}$. The simulation results for the system considered in (3) with the control input in (9)-(10), where $y_i$ is generated by (60), are given in Figs. 14-17. Fig. 14 plots the players' actions, from which we see that the players' actions converge to the Nash equilibrium. Moreover, Figs. 15-17 plot $k_i(t)$, $\hat{\theta}_i(t)$ and $\delta_{ij}(t)$, respectively, from which it is clear that they stay bounded. Therefore, the results in Theorem 4 are numerically verified. Note that, compared with the simulation in Section 6.1, there is no centralized control gain in (60), thus verifying the effectiveness of the distributively implemented algorithms.

Fig. 14. The trajectories of $x_i(t)$ for $i \in \{1, 2, \cdots, 7\}$ for the system considered in (3) with the control input in (9)-(10), where $y_i$ is generated by (60).
Fig. 15. The trajectories of $k_i(t)$ for $i \in \{1, 2, \cdots, 7\}$ for the system considered in (3) with the control input in (9)-(10), where $y_i$ is generated by (60).

This paper considers distributed Nash equilibrium seeking for games in which the players' actions are subject to both unknown control directions and parametric uncertainties. First-order systems and second-order systems are addressed successively. To cope with the unavailability of control directions, a Nussbaum function is adopted. Moreover, the parametric uncertainties are addressed by adaptive laws. Together with an optimization module, a state regulation module is included in the seeking strategy. Based on Barbalat's lemma, it is proven that the players' actions can be driven to the Nash equilibrium. Lastly, the fully distributed implementations of the proposed algorithms are discussed. It is shown that the adaptive techniques can be employed to achieve equilibrium seeking in a fully distributed way.
Fig. 16. The trajectories of $\hat{\theta}_{ij}(t)$ for $i \in \{1, 2, \cdots, 7\}$ for the system considered in (3) with the control input in (9)-(10), where $y_i$ is generated by (60).
Fig. 17. The trajectories of $\delta_{ij}(t)$ for $i \in \{1, 2, \cdots, 7\}$, $j \in \{1, 2\}$ for the system considered in (3) with the control input in (9)-(10), where $y_i$ is generated by (60).

References
[1] M. Ye, "Distributed robust seeking of Nash equilibrium for networked games: an extended state observer-based approach," IEEE Transactions on Cybernetics, accepted, published online, DOI: 10.1109/TCYB.2020.2989755.
[2] M. Ye and G. Hu, "Game design and analysis for price-based demand response: an aggregate game approach," IEEE Transactions on Cybernetics, vol. 47, no. 3, pp. 720-730, 2017.
[3] M. Ye, G. Hu, F. Lewis, L. Xie, "A unified strategy for solution seeking in graphical N-coalition noncooperative games," IEEE Transactions on Automatic Control, vol. 64, no. 11, pp. 4645-4652, 2019.
[4] J. Ma, M. Ye, Y. Zheng, Y. Zhu, "Consensus analysis of hybrid multiagent systems: a game-theoretic approach," International Journal of Robust and Nonlinear Control, vol. 29, pp. 1840-1853, 2019.
[5] M. Ye and G. Hu, "Distributed Nash equilibrium seeking by a consensus based approach," IEEE Transactions on Automatic Control, vol. 62, no. 9, pp. 4811-4818, 2017.
[6] P. Jiang, P. Woo, R. Unbehauen, "Iterative learning control for manipulator trajectory tracking without any control singularity," Robotica, vol. 20, no. 2, pp. 149-158, 2002.
[7] J. Du, C. Guo, S. Yu, Y. Zhao, "Adaptive autopilot design of time-varying uncertain ships with completely unknown control coefficients," IEEE Journal of Oceanic Engineering, vol. 32, no. 2, pp. 346-352, 2007.
[8] R. Nussbaum, "Some remarks on a conjecture in parameter adaptive control," Systems and Control Letters, vol. 3, no. 5, pp. 243-246, 1983.
[9] X. Bu, D. Wei, X. Wu, J. Huang, "Guaranteeing preselected tracking quality for air-breathing hypersonic non-affine models with an unknown control direction via concise neural control," Journal of the Franklin Institute, vol. 353, no. 13, pp. 3207-3232, 2016.
[10] L. Wang, W. Deng, J. Liu and R. Mei, "Adaptive sliding mode trajectory tracking control of quadrotor UAV with unknown control direction," in: R. Wang, Z. Chen, W. Zhang, Q. Zhu (eds), Proceedings of the 11th International Conference on Modelling, Identification and Control, Lecture Notes in Electrical Engineering, vol. 582, Springer, Singapore.
[11] J. Peng, X. Ye, "Cooperative control of multiple heterogeneous agents with unknown high-frequency-gain signs," Systems and Control Letters, vol. 68, pp. 51-56, 2014.
[12] X. Ye and J. Jiang, "Adaptive nonlinear design without a priori knowledge of control directions," IEEE Transactions on Automatic Control, vol. 43, no. 11, pp. 1617-1621, 1998.
[13] Z. Chen, "Nussbaum functions in adaptive control with time-varying unknown control coefficients," Automatica, vol. 102, pp. 72-79, 2019.
[14] H. Khalil, Nonlinear Systems, Upper Saddle River, NJ: Prentice Hall, 2002.
[15] A. Scheinker, M. Krstic, "Minimum-seeking for CLFs: universal semiglobally stabilizing feedback under unknown control directions," IEEE Transactions on Automatic Control, vol. 58, no. 5, pp. 1107-1122, 2013.
[16] C. Chen, C. Wen, Z. Liu, K. Xie, Y. Zhang, C. Chen, "Adaptive consensus of nonlinear multi-agent systems with non-identical partially unknown control directions and bounded modelling errors," IEEE Transactions on Automatic Control, vol. 62, no. 9, pp. 4654-4659, 2017.
[17] C. Yang, S. Ge, T. Lee, "Output feedback adaptive control of a class of nonlinear discrete-time systems with unknown control directions," Automatica, vol. 45, no. 1, pp. 270-276, 2008.
[18] M. Guo, D. Xu, L. Liu, "Cooperative output regulation of heterogeneous nonlinear multi-agent systems with unknown control directions," IEEE Transactions on Automatic Control, vol. 62, no. 6, pp. 3039-3045, 2017.
[19] Y. Tang, "Multi-agent optimal consensus with unknown control directions," arXiv:2005.10492.
[20] J. Koshal, A. Nedic and U. Shanbhag, "Distributed algorithms for aggregative games on graphs," Operations Research, vol. 64, pp. 680-704, 2016.
[21] F. Salehisadaghiani and L. Pavel, "Distributed Nash equilibrium seeking: A gossip-based algorithm," Automatica, vol. 72, pp. 209-216, 2016.
[22] M. Ye, G. Hu and S. Xu, "An extremum seeking-based approach for Nash equilibrium seeking in N-cluster noncooperative games," Automatica, vol. 114, 108815, 2020.
[23] M. Ye, G. Hu, and F. L. Lewis, "Nash equilibrium seeking for N-coalition non-cooperative games," Automatica, vol. 95, pp. 266-272, 2018.
[24] A. Ibrahim, T. Hayakawa, "Nash equilibrium seeking with second-order dynamic agents," IEEE Conference on Decision and Control, pp. 2514-2518, 2018.
[25] M. Ye, "Distributed Nash equilibrium seeking for games in systems with bounded control inputs," submitted to IEEE Transactions on Automatic Control, revised, available online: arXiv:1901.09333, 2019.
[26] J. Slotine, W. Li, Applied Nonlinear Control, Prentice Hall, Englewood Cliffs, 1991.
[27] M. Ye, G. Hu, "Distributed Nash equilibrium seeking in multi-agent games under switching communication topologies," IEEE Transactions on Cybernetics, vol. 48, no. 11, pp. 3208-3217, 2018.
[28] M. Ye, G. Hu, "Adaptive approaches for fully distributed Nash equilibrium seeking in networked games," submitted to Automatica, revised, available online: arXiv:1912.00415.
[29] J. Huang, Y. Song, W. Wang, C. Wen and G. Li, "Fully distributed adaptive consensus control of a class of high-order nonlinear systems with a directed topology and unknown control directions," IEEE Transactions on Cybernetics, vol. 48, no. 8, pp. 2349-2356, 2018.
[30] Z. Li, Z. Duan, Cooperative Control of Multi-agent Systems: A Consensus Region Approach, Taylor and Francis/CRC Press, Boca Raton, FL, 2014. ISBN: 978-1-4665-6994-2.
[31] M. Ye, G. Hu, "A robust extremum seeking scheme for dynamic systems with uncertainties and disturbances," Automatica, vol. 66, pp. 172-178, 2016.