Evolution of Cooperation among Mobile Agents
Zhuo Chen∗, Jianxi Gao, Yunze Cai and Xiaoming Xu

Shanghai Jiao Tong University, Shanghai, China
University of Shanghai for Science and Technology, Shanghai, China
Shanghai Academy of Systems Science, Shanghai, China

∗jeffchen [email protected]

25, 2018

Abstract
We study the effects of mobility on the evolution of cooperation among mobile players, which imitate the collective motion of biological flocks and interact with neighbors within a prescribed radius R. Adopting the prisoner's dilemma game and the snowdrift game as metaphors, we find that cooperation can be maintained and even enhanced for low velocities and small payoff parameters, when compared with the case in which all agents do not move. But such enhancement of cooperation is largely determined by the value of R, and for modest values of R there is an optimal value of the velocity that induces the maximum cooperation level. Besides, we find that intermediate values of R or of the initial population density are most favorable for cooperation when the velocity is fixed. Depending on the payoff parameters, the system can reach an absorbing state of cooperation when the snowdrift game is played. Our findings may help in understanding the relations between individual mobility and cooperative behavior in social systems.

Keywords: cooperation, flocks, evolutionary games, prisoner's dilemma, snowdrift game, mobility
Cooperation is commonly observed throughout biological systems, animal kingdoms and human societies. But from a Darwinian viewpoint, cooperators are at a disadvantage in natural selection, because they increase the fitness of others at the cost of their own survival and reproduction [1]. In a broad range of disciplines, understanding the emergence of cooperation is a fundamental problem, which is often studied within the framework of evolutionary game theory.

The prisoner's dilemma (PD) game and the snowdrift game (SD) are commonly used two-person games with two strategies, cooperation (C) and defection (D). Mutual cooperation pays each player a reward R, while mutual defection brings each a punishment P. When a defector meets a cooperator, the former gains the temptation T while the latter obtains the sucker's payoff S. The PD is defined by the payoff ordering T > R > P > S together with 2R > S + T. In a single round of the PD, though the individual interest is maximized by defection, the collective payoff achieves its maximum only when both players cooperate. Hence the dilemma arises. As an alternative model for studying cooperative behavior, the SD is produced when T > R > S > P. In contrast with the PD, the best strategy in the SD depends on the co-player: defect if the opponent cooperates, but cooperate if the opponent defects. Under replicator dynamics in well-mixed populations, defection is the only evolutionarily stable strategy in the PD, while cooperators may coexist with defectors in the SD. Note that in the SD, the average population payoff at evolutionary equilibrium is smaller than when everyone plays C [2]. Thus the SD is still a social dilemma.

One possible mechanism accounting for the establishment of cooperation is so-called network reciprocity [3]. Discarding the well-mixed assumption for populations, this theory focuses on how spatial structure affects the evolution of cooperation. Axelrod first suggested locating individuals on a two-dimensional array, where interactions happen only within local neighborhoods. Nowak and May later developed this idea, showing that unconditional cooperators can survive by forming clusters [4]. These pioneering studies have triggered an intensive investigation of spatial games, yielding numerous combinations of evolutionary rules, graphs and game models. In Ref. [5], the effect of noise is incorporated in the strategy adoption, and Darwinian selection of the noise level favors a specific parameter value that induces the highest level of cooperation [6, 7]. Diversity is another factor facilitating cooperation, which takes various forms such as heterogeneous graphs [8], preferential imitations [9], reproduction probabilities [10], individual rationality [11], fitness [12] or behavioral preferences [13].
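The well-mixed baseline stated above — defection takes over in the PD, while cooperators and defectors coexist in the SD — can be illustrated numerically. The following sketch (our own illustration; the payoff values are arbitrary examples satisfying the two orderings, not taken from this paper) iterates the replicator equation dx/dt = x(f_C − f̄):

```python
def replicator_equilibrium(R, S, T, P, x0=0.5, dt=0.01, steps=100000):
    """Iterate replicator dynamics for a symmetric 2-strategy game.

    x is the cooperator fraction in an infinite well-mixed population;
    f_c, f_d are the expected payoffs of a cooperator and a defector.
    """
    x = x0
    for _ in range(steps):
        f_c = R * x + S * (1 - x)          # cooperator's expected payoff
        f_d = T * x + P * (1 - x)          # defector's expected payoff
        f_bar = x * f_c + (1 - x) * f_d    # population mean payoff
        x += dt * x * (f_c - f_bar)        # dx/dt = x (f_c - f_bar)
    return x

# Prisoner's dilemma ordering T > R > P > S: defection fixates.
print(replicator_equilibrium(R=3, S=0, T=5, P=1))  # -> essentially 0
# Snowdrift ordering T > R > S > P: stable interior coexistence at
# x* = (S - P) / ((S - P) + (T - R)) = 1/3 for these example values.
print(replicator_equilibrium(R=3, S=1, T=5, P=0))
```

The interior fixed point of the SD follows from equating f_c and f_d, which is why the coexistence level depends only on the payoff differences.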
Since connectivity structures in the real world are far richer than regular lattices, there has been much interest in the impact of complex topologies on cooperative behavior [14, 15, 16, 17, 18, 19]. The co-evolution of strategies and individual traits, such as teaching activities [20, 21], learning rules [22, 23] and social ties [24, 25, 26, 27, 28], constitutes a key mechanism for the sustainability of cooperation. Interestingly, cooperators can benefit from the continuous supply of new players [29, 30], and the strategy-independent evolution of networks can evoke powerful mechanisms to promote cooperation [31, 32]. More details about spatial evolutionary games can be found in Refs. [2, 3, 33, 34] and references therein.

Mobility of individuals is responsible for various spatiotemporal dynamics on geographical scales, such as the spread of infectious diseases and wireless viruses [35], and the statistical properties of human motion have attracted much interest in recent years [36, 37, 38]. Indeed, the motion of individuals is an important characteristic of social networks [39]. Though often neglected, the effects of mobility on the evolution of cooperation vary with movement forms and population structures. Vainstein et al. [40] considered a random diffusive process in a population of agents with pure strategies, where each agent can jump to a nearest empty site with a certain probability. It was found that cooperation can be enhanced by the movement of players, provided that the mobility parameter is kept within a certain range. The weak form of the PD adopted in Ref. [40] was later extended to other games [41, 42], and it was found that cooperation in the SD is not inhibited as often as reported in Ref. [43]. Besides, the movement of players may take a form that adapts to payoffs or neighbors, and such contingent mobility is often expected to enhance cooperation.
Aktipis [44] proposed a walk-away strategy to avoid repeated interactions with defectors, which outperforms complex strategies under a number of conditions. Helbing and Yu introduced success-driven migration, in which players determine destinations through fictitious play [45]. Besides, individuals can decide when to move based on the number of neighboring defectors [46].

The synchronised motion of animal groups, such as fish schools and bird flocks, is an intriguing phenomenon, which can be modeled by systems of self-driven agents [47, 48, 49]. Recently, the model by Vicsek et al. has gained much attention for its minimalist style and rich dynamics [47]. Here we combine the Vicsek model with evolutionary games, focusing on the effect of mobility on the evolution of cooperation. We retain well-known elements like direction alignment and circular neighborhoods, but ignore the influence of angular noise on the update of velocity. We also drop the periodic boundary conditions for simplicity, since they can strongly affect the system behavior in the large-velocity regime [48]. Thus when players move, the system is split into disconnected groups, within which agents move in the same direction. Note that in some social systems, individuals do divide into groups according to race, wealth, age, and so on. We think that the aggregation of individuals partly reflects the community structure in social networks. In Ref. [50], we investigated an evolutionary PD game in a Vicsek-like model, where each agent plays with a constant number of neighbors. We found that cooperation can be maintained and even enhanced by the motion of players, provided that certain conditions are fulfilled. In the current work, we check the robustness of those conclusions when each agent plays the PD game with the individuals within a certain distance. Besides, we study how mobility affects the outcome of the SD game.
We consider a system of N autonomous agents, which have positions x_i(t) and move synchronously with velocities V_i(t) in a two-dimensional plane. The velocity V_i(t) of agent i is characterized by a fixed absolute velocity v and an angle θ_i(t) indicating the direction of motion. When t = 0, all agents are randomly distributed in an L × L square without boundary restrictions. Rather than being confined to a periodic domain, individuals can cross the border of the square when t > 0 and move in the whole plane. The square only represents the initial distribution of individuals, with a density ρ = N/L². Besides, the initial moving directions of the agents, θ_i(0), are uniformly distributed in the interval [0, π). At each time step, the i-th agent updates its position according to

x_i(t + 1) = x_i(t) + V_i(t) Δt,   (1)

where Δt is set to 1 between two updates of the positions.

To simulate the process of direction alignment in flocks, the angle θ_i(t) of agent i is updated according to the average direction of nearby neighbors [47]:

θ_i(t + 1) = arctan[ (sin θ_i(t) + Σ_{j∈W_i(t)} sin θ_j(t)) / (cos θ_i(t) + Σ_{j∈W_i(t)} cos θ_j(t)) ],   (2)

where W_i(t) denotes the neighbor set of agent i at time t. Here W_i(t) is defined as the set of agents in the circular neighborhood of radius R centered on agent i,

W_i(t) = { j : |x_j − x_i| < R, j = 1, …, N, j ≠ i },   (3)

where |x_j − x_i| denotes the Euclidean distance between j and i in the two-dimensional plane. We assume that every agent has the same radius.

The equations given above characterize the motion of agents. While moving in the plane, the agents also play games in pairs. For the PD, we take the rescaled form suggested by Nowak et al. [4],

A = ( 1  0
      b  0 ),   (4)

where b denotes the temptation to defect, and 1 < b < 2. For the SD, we take a simplified form,

A = ( 1      1 − r
      1 + r  0 ),   (5)

where r denotes the cost-to-benefit ratio of mutual cooperation, and 0 < r < 1. The strategy s_i of agent i, cooperation or defection, can be denoted by a unit vector, (1, 0)^T or (0, 1)^T respectively. During the evolution of strategies, a normalized payoff is calculated to exclude the effect coming from the different degrees of players,

P_i = (1/|W_i(t)|) Σ_{j∈W_i(t)} s_i^T A s_j,   (6)

where |W_i(t)| represents the size of W_i(t), and A is the payoff matrix. Afterwards, every agent compares its income with that of its neighbors, following the strategy which owns the highest payoff among its neighbors and itself [4].

The system begins with an equal percentage of cooperators and defectors. At each step, all agents collect payoffs and update strategies, and next, they modify positions and directions. The time scale that characterizes the evolution of strategies is the same as the time scale that represents the motion of players. This process is repeated until the system reaches equilibrium.

Fig. 1(a) illustrates the segregation of players at equilibrium, and players in the same group move coherently, as shown in Fig. 1(b). During the process of direction alignment, the movement of players may lead to time-varying neighborhoods. The total number of new neighbors appearing at time t can be calculated as

n(t) = Σ_{i=1}^{N} |W_i(t) \ W_i(t − 1)|,   (7)

where |·| here represents the set size. Fig. 1(c) shows a typical evolution of n(t), which is divided by N for normalization. Besides, we also plot the evolution of the cooperator frequency f_c and the average normalized velocity V_a [47] for comparison. After a transient, n(t) decreases to 0 and V_a reaches a steady value. These findings imply that a static interaction network has been constructed, with fixed neighborhoods and velocities of the players. From then on, f_c fluctuates stably.
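The update cycle described above can be condensed into a short sketch. This is a simplified illustration, not the authors' code: the helper names (`neighbors`, `step`) are ours, and the exact ordering of the move/align sub-steps within one cycle is our assumption where the text leaves it open.

```python
import numpy as np

def neighbors(pos, R):
    """Adjacency of Eq. (3): j is a neighbor of i if |x_j - x_i| < R, j != i."""
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    adj = d < R
    np.fill_diagonal(adj, False)
    return adj

def step(pos, theta, strat, v, R, A):
    """One synchronous update: play games, imitate the best neighbor, align, move.

    strat[i] indexes the payoff matrix: 0 = cooperator, 1 = defector,
    so A[strat[i], strat[j]] is the payoff of i against j.
    """
    N = len(pos)
    adj = neighbors(pos, R)
    deg = adj.sum(axis=1)

    # Normalized payoff, Eq. (6); isolated agents simply score 0.
    pay = np.zeros(N)
    for i in range(N):
        if deg[i] > 0:
            pay[i] = A[strat[i], strat[adj[i]]].sum() / deg[i]

    # Imitate the highest-payoff strategy among self and neighbors.
    new_strat = strat.copy()
    for i in range(N):
        group = np.append(np.where(adj[i])[0], i)
        new_strat[i] = strat[group[np.argmax(pay[group])]]

    # Noise-free direction alignment, Eq. (2); arctan2 resolves the quadrant.
    adjf = adj.astype(float)
    new_theta = np.arctan2(np.sin(theta) + adjf @ np.sin(theta),
                           np.cos(theta) + adjf @ np.cos(theta))

    # Position update, Eq. (1), with the current heading and dt = 1.
    new_pos = pos + v * np.stack([np.cos(theta), np.sin(theta)], axis=1)
    return new_pos, new_theta, new_strat

# Hypothetical example: three mutually neighboring agents, one defector,
# under the rescaled PD matrix of Eq. (4) with b = 1.5.
A_pd = np.array([[1.0, 0.0], [1.5, 0.0]])
pos = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]])
theta = np.zeros(3)
strat = np.array([0, 0, 1])
_, _, s1 = step(pos, theta, strat, v=0.05, R=0.5, A=A_pd)
print(s1)  # the defector earns the most, so all three imitate it: [1 1 1]
```

In the small example the defector's normalized payoff (1.5) exceeds that of each cooperator (0.5), so imitate-the-best spreads defection in one step, which is exactly the local pressure that cooperator clusters must withstand.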
Then the equilibrium frequency of cooperators can be obtained by averaging over a long period.

Fig. 1. (a) Snapshot of an equilibrium configuration in the PD. (b) Players in the same group move coherently. (c) Evolution of f_c, n(t)/N and V_a for the PD.

Simulations are carried out in a system with N = 1000 and L = 10. To ensure a fixed topology of the interaction network, the evolution of n(t) is monitored over the time steps; the duration of the transient depends on the values of R, v and L. If n(t) ≤ 1 and this condition holds for q = 1000 time steps, the network is treated as a static one. Then the equilibrium frequency of cooperators is evaluated by averaging over the last 1000 generations. All data points shown in each figure are acquired by averaging over 200 realizations with independent initial states.

Fig. 2 shows the fraction of cooperators f_c as a function of the payoff parameter, b for the PD and r for the SD, under different values of v when R is fixed. Clearly, the cooperation level decreases with b and r in both games, no matter what v is. Compared with the static case (v = 0), it is worth noting that cooperation is greatly enhanced over a large region of b (r) when players are allowed to move with a low velocity (for example, v = 0.01): the cooperation level then exceeds that for v = 0 when b < 1.17 (or for sufficiently small r), and f_c can even approach 1 in the SD. But such an enhancement of cooperation can only be observed for small values of b (r) as the velocity increases further. Indeed, a rapid decrease of the cooperator frequency can be observed in both games when v = 0.15.
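The static-network stopping criterion used in the simulations (n(t) ≤ 1 holding for q = 1000 consecutive steps) can be sketched as follows. The helper names and the per-step list-of-sets representation are our own illustration:

```python
def new_neighbors(W_now, W_prev):
    """n(t) of Eq. (7): total number of neighbors newly appearing at time t.
    W_now and W_prev are lists holding one Python set of neighbor ids per agent."""
    return sum(len(W_now[i] - W_prev[i]) for i in range(len(W_now)))

def network_is_static(history, q=1000):
    """True once n(t) <= 1 has held for q consecutive steps, after which the
    interaction network is treated as frozen and equilibrium averaging begins."""
    if len(history) <= q:
        return False
    return all(new_neighbors(history[t], history[t - 1]) <= 1
               for t in range(len(history) - q, len(history)))

# Hypothetical two-agent run whose neighborhoods never change:
frozen = [{1}, {0}]
history = [frozen] * 1001
print(network_is_static(history, q=1000))  # True
```

Because only newly appearing links are counted, agents that merely lose neighbors (groups drifting apart) do not reset the counter, which matches the intent of detecting a frozen contact pattern.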
Fig. 2. The cooperator frequency f_c versus the payoff parameters b (PD) and r (SD) for different values of v, with R = 0.3.

To clarify the effects of v on the cooperator frequency, Fig. 3 presents the dependence of the cooperation level f_c on the absolute velocity v for different values of R. It displays that the fraction of cooperators at the lowest velocities is very close to that for v = 0, irrespective of the value of R. Meanwhile, whether the movement of players promotes cooperation is largely determined by the value of R. As shown in Fig. 3(a) and Fig. 3(d), the movement of players fails to promote cooperation for small R: the maximum of f_c for R = 0.1 occurs at the lowest velocities in both games, and the cooperation level is lower than that for v = 0 over the entire range of v. For large R, the enhancement of cooperation resulting from the movement of players is quite limited or even disappears. As illustrated in Fig. 3(c) and Fig. 3(f), at low velocities the curve of f_c nearly coincides with the result for v = 0 in the PD, and only a tiny increase of f_c can be observed in the SD. When R = 0.4, however, the cooperation level exceeds that for v = 0 over the whole region of v, and there is a maximum of f_c at an intermediate velocity. This resonance-like phenomenon also implies that for a fixed b (r), decreasing the value of v cannot always promote cooperation.

Before moving forward, we would like to add some remarks about the above results. Previous work has shown that cooperation is not only possible but may even be enhanced by the non-contingent movement of players when compared with the static case [40, 51, 52, 53], and our result provides another example that helps in understanding the universal role of mobility in the evolution of cooperation. In particular, Meloni and colleagues [52] investigated how cooperation emerges in a population of PD players moving in a square plane with periodic boundary conditions. The final outcome of that system has only two possibilities, all-C or all-D, when the neighborhood of each player is defined by a fixed radius R. The authors claimed that the movement of players can promote cooperation when the temptation to defect and the velocity of players are small, and that the probability of achieving an all-C state monotonically decreases with the velocity for a fixed payoff parameter. In our model, however, the maximum cooperation level does not occur in the limit v → 0 when the movement of players promotes cooperation for modest values of R, and a non-monotonic dependence of the cooperator frequency on the velocity of players can be observed in Fig. 3(b) and Fig. 3(e). Compared with the result in Ref. [52], this phenomenon can be explained by the difference between the rules of movement. In Ref. [52], the network of contacts is continuously changing, because individual directions are controlled by N independent random variables; as a result, randomness among partnerships is preserved all the time. But in the present work, a static network of interactions gradually develops as individuals align themselves with their neighbors. Such fixed partnerships are maintained by the removal of periodic boundary conditions, which allows the coexistence of cooperators and defectors. As shown in Fig. 1(c), the range of the alignment interaction fluctuates only during an initial transient, and the larger the value of v, the higher the probability for each individual to encounter different neighbors. Unlike in Ref. [52], the value of v also determines how long random partnerships persist: in our work, for a fixed R, the larger the value of v, the sooner the system is expected to achieve static neighborhoods. It is not easy to describe how cooperation is promoted by small values of v; a heuristic explanation is that a low degree of migration can trigger the expansion of cooperator clusters, as suggested in Ref. [53]. In our model, the cooperation level at the lowest velocities is close to that for v = 0, while for large values of v cooperative clusters may be destroyed by the movement of players, making cooperators vulnerable to defectors. Thus, similar to the so-called evolutionary coherence resonance [54, 55], the maximum level of cooperation can be induced by an optimal amount of randomness, which is determined by the absolute velocity v of the players. In addition, the results in Fig. 3 also show that the movement of players can inhibit cooperation for small (or large) values of R.
Next, we discuss the role of R in the evolution of cooperation.
Fig. 3. The frequency of cooperators f_c is plotted against the absolute velocity v for different values of R, while the corresponding value of f_c for v = 0 is shown by a dashed horizontal line in each panel. The temptation b is set to 1.15 (PD), and the cost-to-benefit ratio r is fixed (SD). Note the logarithmic scale of the X axis.

In Fig. 4(a) and Fig. 4(b), we show how the cooperator frequency f_c varies with the radius R for different values of v. It displays that the proportion of cooperators monotonously decreases with R until the radius exceeds a certain value, and for R < 0.2 the maximum of f_c appears at v = 0. Note that in the current work, interaction neighborhoods are determined by the radius R at each time step. For near-zero values of R, there are few links among individuals in the instantaneous network. As pairwise interactions increase with R, it is hard for isolated cooperators to resist the invasion of defectors, and when players are allowed to move, defectors have more chances to exploit cooperators. Hence for small R, defection becomes dominant in the population, and the movement of players inhibits the evolution of cooperation. But for larger values of R, cooperators are able to aggregate, and the introduction of mobility causes a more rapid increase of f_c. As shown in Fig. 4(a) and Fig. 4(b), the curves of f_c for v > 0 reach a minimum around R = 0.09 and then begin increasing with R in both games. For each value of v, the increase of R induces a resonance-like phenomenon, and the cooperator frequency f_c reaches a maximum around R = 0.6. This finding indicates that intermediate values of R are most favorable for cooperation, since the system approaches a fully connected network in the stationary state for extremely large R. It also helps explain why the movement of players fails to yield an evident enhancement of the cooperation level at large values of R. Indeed, for large R, the network of interactions is almost static, since individuals can quickly align themselves with their neighbors. As a result, though the maximum of f_c decreases with v, the curves of f_c for different values of v gradually merge at large values of R. In Fig. 4(c) and Fig. 4(d), we show the dependence of the cooperation level on the radius R for different values of b (r). It displays that the resonance-like phenomenon is greatly influenced by the payoff parameter, and for a fixed R, the maximum of f_c decreases with b (r). For large b (r), the cooperator frequency f_c monotonously decreases with R, and the maximum level of cooperation appears in the limit R → 0. But for r = 0.2, the system can reach an absorbing state of all cooperators. This is because the SD is more favorable for cooperators than the PD.

In Fig. 5, we plot the cooperator frequency f_c against the initial density ρ for fixed payoff parameters when v = 0.05 and R is held fixed. One can see that the behavior of f_c under variation of ρ is similar to that shown in Fig. 4. This is because both R and ρ influence the size of the neighborhood: for instance, when t = 0, the average degree of the interaction network can be written as ⟨k⟩ = NπR²/L² = πρR². When the players are located on the vertices of a fixed network, previous results have shown resonant behavior of the cooperator frequency around certain values of the average degree [56], and our work can be viewed as an extension to the dynamic interaction network that appears during the movement of players. For small ρ, all agents are widely dispersed in the plane, and cooperators cannot get enough support from cooperative neighbors. Though the chance of forming cooperative clusters increases with ρ, the proportion of cooperators monotonously decreases until ρ exceeds 0.13, as shown in the inset of Fig. 5(a). Large values of ρ are also harmful to cooperators, because the mean-field situation is recovered for large ρ, in which interactions take place between almost every pair of players. Between these two limits, the cooperation level reaches its maximum for moderate values of ρ. It has been reported that the cooperation level can reach a peak at certain values of ρ when the players are moving in a square with periodic boundary conditions [52].
Fig. 4. The cooperator frequency f_c versus the radius R for different values of b (PD), r (SD) and v.

Our results indicate the importance of the initial density ρ for the evolution of cooperation, even when the boundary restrictions are removed. In the SD, one finds a phenomenon similar to that observed in the PD: when r = 0.6, the cooperator frequency f_c first decreases for small ρ, then increases with ρ until reaching its maximum, and finally decreases for large ρ. Note that when r = 0.2, the cooperator frequency f_c monotonously increases with ρ, and the system eventually reaches an absorbing state of full cooperation.

Fig. 5. The cooperator frequency f_c versus the initial density ρ for the PD (b = 1.17) and the SD (r = 0.2 and 0.6).
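As a cross-check of the average-degree formula quoted above: for the simulation parameters N = 1000 and L = 10 with, for example, R = 0.4, the formula gives ⟨k⟩ = πρR² ≈ 5.03. The Monte-Carlo sketch below (our own illustration, not part of the original study) estimates the empirical t = 0 mean degree; the formula ignores agents near the edge of the initial square, so the empirical value tends to sit slightly below it:

```python
import numpy as np

def initial_mean_degree(N=1000, L=10.0, R=0.4, seed=0):
    """Empirical mean degree of the t = 0 network: N agents placed uniformly
    in an L x L square, linked whenever their distance is below R."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(0.0, L, size=(N, 2))
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    adj = d < R
    np.fill_diagonal(adj, False)          # no self-links, as in Eq. (3)
    return adj.sum() / N

rho = 1000 / 10.0 ** 2                    # rho = N / L^2 = 10
print(np.pi * rho * 0.4 ** 2)             # theory: <k> = pi * rho * R^2 ~ 5.03
print(initial_mean_degree())              # empirical estimate, edge effects included
```

Since both R and ρ enter ⟨k⟩ only through the product ρR², this also makes quantitative the observation that varying R and varying ρ shape f_c in similar ways.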