Aspiration can promote cooperation in well-mixed populations as in regular graphs

Abstract

Classical studies on aspiration-based dynamics suggest that a dissatisfied individual changes strategy without taking into account the success of others, which promotes the spread of defection. Imitation-based dynamics allow individuals to imitate successful strategies without taking into account their own satisfaction. In this article, we study a dynamic based on aspiration that takes into account imitation of successful strategies by dissatisfied individuals, which helps cooperators to resist. Individuals compare their success to their desired satisfaction level before deciding whether to update their strategies. This mechanism helps individuals with a minimum of self-satisfaction to maintain their strategies. A dissatisfied individual learns from others by choosing successful strategies. We derive an exact expression of the fixation probability in well-mixed populations as in structured populations on networks. As a result, we show that selection may favor cooperation more than defection in well-mixed populations as in populations arranged on a regular graph. We show that the best scenario is a graph with small connectivity.


arXiv [q-bio.PE], Jun

Aspiration can promote cooperation in well-mixed populations as in regular graphs

Dhaker Kroumi
Department of Mathematics and Statistics, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia

Abstract

Classical studies on aspiration-based dynamics suggest that dissatisfied individuals switch their strategies without taking into account the success of others. Imitation-based dynamics allow individuals to imitate successful strategies without taking into account their own satisfaction. In this article, we study a dynamic based on aspiration that takes into account imitation of successful strategies by dissatisfied individuals. Individuals compare their success to their aspired levels. This mechanism helps individuals with a minimum of self-satisfaction to maintain their strategies. Dissatisfied individuals learn from their neighbors by choosing the successful strategies. We derive an exact expression of the fixation probability in well-mixed populations as in graph-structured populations. As a result, we show that selection favors the evolution of cooperation if the difference in aspired level exceeds some crucial value. Increasing the aspired level of cooperation opposes cooperative behavior, while increasing the aspired level of defection promotes cooperative behavior. We show that the cooperation level decreases as the connectivity increases. The best scenario for the cooperative evolution is a graph with a small connectivity, while the worst scenario is a well-mixed population.

Keywords and phrases: Fixation probability; Evolutionary game dynamics; Pair approximation; Cooperation; Imitation; Aspiration

Mathematics Subject Classification (2010): Primary 92D25; Secondary 60J70

1. Introduction

Evolutionary game theory is the framework in which the frequency of a strategy depends on the fitnesses of the different individuals in the population (Maynard Smith and Price [27], Maynard Smith [26], Hofbauer and Sigmund [12], Weibull [51], Samuelson [39], Cressman [3], Vincent and Brown [50], Nowak [29]). Individuals interact and gain payoffs, which are seen as biological fitness or reproductive rates.

Author for correspondence: [email protected]

Preprint submitted to Dynamic Games and Applications, June 9, 2020

The standard model, called the replicator equation, was formulated in an infinitely large well-mixed population where any two individuals have the same probability to interact (Taylor and Jonker [44], Zeeman [53], Hofbauer and Sigmund [13, 14]). Suppose that there are $n$ strategies $\{S_1, S_2, \ldots, S_n\}$. The game is described by a payoff matrix $A = (a_{i,j})_{i,j=1,\ldots,n}$, where $a_{i,j}$ is the payoff of an $S_i$-player if its partner is an $S_j$-player. Let $x_i$ be the frequency of $S_i$-players in the population. The dynamic is

$$\frac{dx_i}{dt} = x_i \left( f_i - \bar{f} \right), \tag{1}$$

where $f_i = \sum_{j=1}^{n} x_j a_{i,j}$ and $\bar{f} = \sum_{i=1}^{n} x_i f_i$ refer to the expected payoff of an $S_i$-player and the average payoff in the population, respectively.

Real populations are finite, and deterministic approaches cannot capture this finiteness. Recently, a stochastic approach has been introduced that models the population by a Markov chain with a finite state space. In the absence of mutation, the Markov chain has absorbing states, each corresponding to a population of a unique type. A strategy is said to be favored by selection if its fixation probability is greater than what it would be under neutrality (Nowak et al. [31], Imhof and Nowak [15]). In the presence of symmetric mutation, this Markov chain is irreducible and, as a result, has a stationary state. Interest then lies in the abundance of a given strategy in this equilibrium state. In this case, a strategy is said to be favored by selection if its average frequency in the stationary state is greater than what it would be under neutrality (absence of selection) (Antal et al. [1]).
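As a small illustration of the replicator equation (1), its right-hand side can be evaluated directly from a payoff matrix. The sketch below is not taken from the paper: the function name `replicator_rhs` and the example matrix are assumptions chosen only for illustration.

```python
def replicator_rhs(x, A):
    """Right-hand side of dx_i/dt = x_i * (f_i - f_bar).

    x: list of strategy frequencies (non-negative, summing to 1).
    A: payoff matrix as a list of rows, A[i][j] = payoff of S_i against S_j.
    """
    n = len(x)
    # Expected payoff of each strategy against the current population.
    f = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Average payoff in the population.
    f_bar = sum(x[i] * f[i] for i in range(n))
    return [x[i] * (f[i] - f_bar) for i in range(n)]


# Hypothetical 2x2 game; the numbers are illustrative only.
example_matrix = [[3.0, 0.0], [5.0, 1.0]]
example_rhs = replicator_rhs([0.5, 0.5], example_matrix)
```

Because the frequencies stay on the simplex, the components of the right-hand side always sum to zero, which is a quick sanity check for any implementation.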
Both models, without mutation and with mutation, share the same favored strategy if the mutation rate is small enough (Rousset and Billiard [36], Rousset [37], Fudenberg and Imhof [8]).

Further advances in evolutionary game theory study structured populations. The traditional setting is the island model, where individuals are structured into isolated islands (Ladret and Lessard [20]; Lessard [21]). Interactions occur only within islands. After reproduction, individuals can migrate or stay on the parent's island. The case of isolation by distance, called the stepping stone model, is considered in Rousset and Billiard [36] and Rousset [38]. Islands are numbered $1, 2, \ldots, d$ and the migration rates are $m_{i,i+1} = m_{i,i-1} = m/2$, $m_{i,i} = 1 - m$, and 0 otherwise.

In these structured models, individuals share the same neighborhood if they belong to the same group, or they do not have any common neighbor if they belong to two different groups. Evolutionary graph theory is a natural extension that takes into account that individuals can share only some of their neighbors (Nowak and May [30], Hauert and Doebeli [10], Lieberman et al. [22], Ohtsuki et al. [32], Taylor et al. [43]). It is a powerful framework that includes social networks in the evolutionary process. $N$ individuals occupy $N$ nodes. Each node is linked to $k$ nodes by edges. Each edge indicates who can interact with whom.

For a graph of degree $k = 2$, the evolutionary process is described in many studies (Ohtsuki and Nowak [34], van Valen and Nowak [49]). The population state is described completely by the frequency of each strategy. A condition for one strategy to be favored over another in a finite population can be derived as in well-mixed populations.

For general degree $k$, the frequencies of the different strategies are not enough to describe the evolutionary process. To reduce the complexity of such a graph, the technique of pair approximation (Matsuda et al. [25], Nakamura et al. [28], Keeling [17], Van Baalen [47]) was introduced to study the evolutionary process in regular graphs (Ohtsuki et al. [33, 32, 35]). Assume that each individual can choose a strategy among $\{A, B\}$. Pair approximation is a framework that studies the stochastic dynamics by considering not only the global frequencies $p_X$ for $X \in \{A, B\}$, but also $q_{X|Y}$ for $X, Y \in \{A, B\}$, the probability that a neighbor of a $Y$-player is of type $X$. This method assumes that a two-step adjacent player does not affect the focal site directly, that is, $q_{X|YZ} = q_{X|Y}$. This technique is limited to a very large population such that $k \ll N$.

Besides, update rules, in which individuals correct their strategies following a selection mechanism, are of great importance because of their confirmed impact on the evolutionary process. For this reason, one of the main open questions is how individuals update their strategies based on their knowledge of themselves and of others.

Many update rules have been proposed. The most used are based on two representative models: the imitation-based rule (Szabó and Tőke [40], Ohtsuki et al. [32], Traulsen et al. [45]) and the aspiration-based rule (Chen and Wang [2], Matjaz and Zhen [24], Du et al. [5, 6], Liu et al. [23]). Under the imitation-based rule, individuals update their strategies based on their knowledge of others. An individual compares its payoff with its neighbors' payoffs. If its payoff is lower, it imitates its neighbors with a higher probability. Under the aspiration-based rule, individuals update their strategies based on their knowledge about themselves. An individual compares its payoff to an aspired level, which represents its tolerance with its current strategy. If its payoff is lower, it switches its strategy with a higher probability.

All these studies suggest that individuals correct their strategies according to only one of the following conceptions: their knowledge of others or their knowledge of themselves. However, real biological species can change their strategies using both kinds of information, owing to the influence of environmental factors and the complexity of their knowledge. In search of food, ant foragers use chemical pheromone trails to guide other ants to the food sources. Experienced ants choose to follow the route of their previous trips (Matjaz and Zhen [24], Grüter et al. [9]). Non-experienced ants imitate their neighbors. This suggests that if a strategy gives the player a certain level of self-satisfaction, then it will be maintained. If individuals do not reach their aspired levels, they imitate their neighbors. The same conclusion was inferred in experiments on stickleback fish (van Bergen et al. [48]).

In light of this conclusion, this paper studies the effect of a mixed update rule on the evolution of cooperation in different topologies. The update rule is composed of two rounds. In the first round, individuals compare their payoffs to a personal tolerance index. Satisfied individuals keep their current strategy with higher probability. Dissatisfied individuals observe the success of their neighbors to make a decision. More precisely, a selected individual $I$ compares its payoff $\Pi_I$ to its aspired level $\alpha_I$. It maintains its current strategy with a probability that increases with its satisfaction, measured by $\Pi_I - \alpha_I$. Otherwise, it imitates a neighbor's strategy: it selects a neighbor, say $J$, with probability proportional to the fitness of individual $J$. We analyze this model in both well-mixed and graph-structured populations.

This model is equivalent to the following death-birth update. At each time step, a randomly chosen individual, say $I$, survives with a probability that increases with $\Pi_I - \alpha_I$. Otherwise, it dies. In this case, a competition between its neighbors arises. A neighbor is chosen with probability proportional to its fitness to produce a copy, which occupies the vacant position. It is similar to the death-birth update rule (Ohtsuki et al. [32, 33]), where the death event occurs with probability 1.

For a finite well-mixed population (Appendix C) and a finite graph-structured population of degree $k = 2$ (Appendix B), an exact calculation technique is used to measure the success of cooperation. We use a property of a discrete Markov chain with two absorbing states to derive the fixation probabilities of cooperation and defection. For a graph of degree $k \ge 3$, however, it is not possible to study the evolutionary process analytically in a finite population. Instead, for a large population, we use a pair approximation technique and then a diffusion approximation to derive the fixation probabilities of cooperation and defection (Appendix A).

The remainder of this paper is divided into four sections. In Section 2, we describe our model. In Section 3, we test the success of cooperation and defection in well-mixed and graph-structured populations. We apply our results to the simplified additive Prisoner's Dilemma in Section 4. We finish this article with a discussion in Section 5.

2. Model

Consider a finite population composed of $N$ individuals distributed over the $N$ nodes of a graph. Each node is connected by edges to $k$ other nodes. The number $k$, called the graph degree or the connectivity, is the same for all individuals (see Figure 1). Each edge indicates who interacts with whom. Any two individuals who are connected by an edge are called neighbors. Suppose that the graph is connected in the sense that any two nodes are linked by a finite number of edges. Each individual can adopt a strategy among $\{C, D\}$: $C$ for cooperation and $D$ for defection.

Figure 1: Each individual is related to exactly $k$ neighbors. $k$ is the same for all individuals and is called the graph degree. Red nodes are occupied by defectors while blue nodes are occupied by cooperators. Panels: (a) $k = 2$, (b) $k = 3$, (c) $k = 4$.

At each time step, each individual interacts with its neighbors through the game matrix

$$\begin{array}{c|cc} & C & D \\ \hline C & R & S \\ D & T & P \end{array} \tag{2}$$

Two cooperators receive a reward, $R$, whereas two defectors receive a punishment, $P$. If they interact, a cooperator receives a sucker's payoff, $S$, while a defector receives a temptation, $T$. After interactions with its neighbors, any individual accumulates a payoff denoted by $\Pi$. Then, a randomly chosen individual $I$ compares its payoff $\Pi_I$ to its satisfaction index $\alpha_I$, which represents its tolerance with its current strategy. Here, we study the simplest case where the satisfaction level is a random variable that depends on the strategy of the individual and does not depend on time. In addition, we assume that $\alpha_I$ is bounded, with

$$E[\alpha_I] = \tilde{\alpha}_I. \tag{3}$$

This assumption makes sense because individuals in the population are generally heterogeneous. Therefore, there is a heterogeneity of the aspired level.

Individual $I$ keeps its current strategy with probability

$$g\big(\delta(\Pi_I - \alpha_I)\big), \tag{4}$$

where $\delta$ is a non-negative constant called the selection intensity. It updates its current strategy with the complementary probability

$$1 - g\big(\delta(\Pi_I - \alpha_I)\big). \tag{5}$$

In this case, it adopts the strategy of one of its neighbors, say $J$, chosen with probability proportional to its fitness $f_J = 1 + \delta \Pi_J$. More precisely, let individuals $J_1, J_2, \ldots, J_k$ be the neighbors of individual $I$. Individual $I$ adopts the strategy of neighbor $J_i$ with probability $f_{J_i} / \sum_{l=1}^{k} f_{J_l}$, for $i = 1, \ldots, k$. This mechanism helps individuals to learn from their neighbors by selecting the most successful strategies.

Here, $g$ is a function such that

• $g(0) = 1/2$: for $\delta = 0$, updating and maintaining occur with the same probability, that is, $1/2$.

• $g'(0) > 0$: for $\delta > 0$, the probability of maintaining the current strategy increases with the satisfaction of individual $I$, which is measured by $\Pi_I - \alpha_I$, since we have

$$g\big(\delta(\Pi_I - \alpha_I)\big) \approx \frac{1}{2} + \delta \cdot g'(0)(\Pi_I - \alpha_I).$$

If $\Pi_I > \alpha_I$, individuals maintain their strategies with a probability higher than $1/2$. If $\Pi_I < \alpha_I$, individuals maintain their strategies with a probability lower than $1/2$. If $\Pi_I = \alpha_I$, individuals maintain their strategy with probability $1/2$. The most used function is the Fermi rule

$$g(x) = \frac{1}{1 + e^{-x}}. \tag{6}$$

In the remainder, we use the Fermi rule (6), where $g'(0) = 1/4 > 0$. The neutral case corresponds to $\delta = 0$. The case of weak selection corresponds to $\delta > 0$ small (Traulsen et al. [45, 46], Wu et al. [52]). In this case, the effect of payoff differences on the evolutionary process is small. Weak selection is a reasonable assumption for two reasons:

• It is a standard case in which many analytic results can be derived that are not available for arbitrary selection intensity, and these results remain a good approximation for other selection intensities.

• In real biological populations, the fitness of an individual depends on many competitions (games), so each game makes a small contribution, and here we are interested in a single game.

In the remainder, we are interested in the effect of weak selection on the evolutionary process.
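The update rule of this section can be sketched as a single simulation step. The code below is only an illustrative sketch under stated assumptions, not the paper's implementation: payoffs are accumulated over neighbors as described in the text, the aspired level of each strategy is taken to be a constant equal to its mean (the paper allows a bounded random variable), the graph is a ring lattice with even degree, and all names (`fermi`, `ring_neighbors`, `payoff`, `update_step`) are hypothetical.

```python
import math
import random


def fermi(x):
    """Fermi rule g(x) = 1 / (1 + exp(-x)), so g(0) = 1/2 and g'(0) = 1/4."""
    return 1.0 / (1.0 + math.exp(-x))


def ring_neighbors(n, k=2):
    """Neighbor lists of a ring lattice of n nodes with even degree k (assumption)."""
    half = k // 2
    return [[(i + d) % n for d in range(-half, half + 1) if d != 0]
            for i in range(n)]


def payoff(i, strategies, neighbors, game):
    """Payoff accumulated by individual i against all of its neighbors."""
    return sum(game[(strategies[i], strategies[j])] for j in neighbors[i])


def update_step(strategies, neighbors, game, aspiration, delta, rng):
    """One step: keep the strategy with probability g(delta * (payoff - aspiration)),
    otherwise copy a neighbor chosen proportionally to its fitness 1 + delta * payoff.

    delta is assumed small enough that every fitness stays positive.
    """
    i = rng.randrange(len(strategies))
    satisfaction = payoff(i, strategies, neighbors, game) - aspiration[strategies[i]]
    if rng.random() < fermi(delta * satisfaction):
        return i  # satisfied enough: the strategy is maintained
    nbrs = neighbors[i]
    weights = [1.0 + delta * payoff(j, strategies, neighbors, game) for j in nbrs]
    chosen = rng.choices(nbrs, weights=weights, k=1)[0]
    strategies[i] = strategies[chosen]
    return i
```

A usage sketch: build `neighbors = ring_neighbors(10)`, a dictionary `game` keyed by strategy pairs such as `("C", "D")`, and a dictionary `aspiration` keyed by `"C"` and `"D"`, then call `update_step` repeatedly with a seeded `random.Random` until the population is homogeneous. Repeating this from a single cooperator gives a Monte Carlo estimate of the fixation probability.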

3. Fixation probabilities

Suppose that a single cooperator is introduced into an all-defecting population. There are two possibilities for the evolutionary dynamics. In the first scenario, this individual produces a lineage that eventually invades the entire population (extinction of defection). In the second scenario, this individual dies before reproducing or generates a lineage that disappears after some time (extinction of cooperation). The probability of the first scenario, denoted by $\rho_C(\delta)$, is called the fixation probability of cooperation. Similarly, the fixation probability of defection is the probability that a single defector introduced into an all-cooperating population produces a lineage that takes over the population. This probability is denoted by $\rho_D(\delta)$.

3.1. $\rho_C(\delta) > \rho_C(0)$

A first criterion for weak selection to favor the emergence and stabilization of cooperation is the comparison of the fixation probability under weak selection, $\rho_C(\delta)$, to what it would be under neutrality, $\rho_C(0) = N^{-1}$ (Rousset and Billiard [36], Nowak et al. [31], Taylor et al. [42]). We say that selection favors the evolution of cooperation if $\rho_C(\delta) > N^{-1}$. Otherwise, that is, if $\rho_C(\delta) < N^{-1}$, we say that selection disfavors the evolution of cooperation.

3.1.1. Case 1: $k \ge 3$

For a graph of degree $k \ge 3$, we use the pair approximation technique, which follows the frequency of cooperators together with the local frequency $q_{C|C}$. This technique is valid only for a large population such that $3 \le k \ll N$. See Appendix A for more details.

Using Eq. (73) in Appendix A, we have

$$\rho_C(\delta) = \frac{1}{N} + \delta \cdot c_1(N,k) \left[ \Gamma_1 + 3\Gamma_2 + 3k^2 \Delta\alpha \right] + O(\delta^2), \tag{7}$$

where $c_1(N,k) > 0$ is a prefactor that depends only on $N$ and $k$, and

$$\begin{aligned} \Delta\alpha &= \tilde{\alpha}_D - \tilde{\alpha}_C, \\ \Gamma_1 &= (3k+2)(k-2)(R - S - T + P), \\ \Gamma_2 &= (3k+2)R + (3k^2 - 3k - 2)S - (k+2)T - (3k+2)(k-1)P. \end{aligned} \tag{8}$$

According to Eq. (7), weak selection favors the evolution of cooperation if $\Gamma_1 + 3\Gamma_2 + 3k^2\Delta\alpha > 0$, which is equivalent to

$$\frac{3k^2 + 5k + 2}{3k^2} R + \frac{6k^2 - 5k - 2}{3k^2} S - \frac{3k^2 - k + 2}{3k^2} T - \frac{6k^2 + k - 2}{3k^2} P + \Delta\alpha > 0. \tag{9}$$

This condition is valid for $k \ge 3$.

3.1.2. Case 2: $k = 2$

For the circular model, $k = 2$, we use an exact calculation technique that is valid for any finite population of size $N \ge 3$. See Appendix B for details. Using Eq. (91) in Appendix B, the fixation probability of cooperation is of the form

$$\rho_C(\delta) = \frac{1}{N} + \delta \cdot c_2(N) \left[ a_N R + b_N S - t_N T - d_N P + \Delta\alpha \right] + O(\delta^2), \tag{10}$$

where $c_2(N) > 0$ and the coefficients $a_N$, $b_N$, $t_N$ and $d_N$ are the rational functions of $N$ obtained in Appendix B. Therefore, weak selection favors the evolution of cooperation if

$$a_N R + b_N S - t_N T - d_N P + \Delta\alpha > 0. \tag{11}$$

For a large population, $N \to \infty$, the coefficients tend to 2, 1, 1 and 2, respectively, and this condition reduces to

$$2R + S - T - 2P + \Delta\alpha > 0, \tag{12}$$

which extends Eq. (9) to $k = 2$.

3.1.3. Case 3: Well-mixed populations

In a well-mixed population, each individual is connected to all other individuals. By an exact calculation technique, we derive the expression of the fixation probability of cooperation for any finite population of size $N \ge 2$. See Appendix C for more details.

Using Eq. (98) in Appendix C, we have

$$\rho_C(\delta) = \frac{1}{N} + \delta \cdot c_3(N) \left[ \frac{3N-5}{3(N-1)^2} \Big( (N-2)R + (2N-1)S - (N+1)T - (2N-4)P \Big) + \Delta\alpha \right] + O(\delta^2), \tag{13}$$

where $c_3(N) > 0$ depends only on $N$. As a consequence, weak selection favors the evolution of cooperation if

$$\frac{3N-5}{3(N-1)^2} \Big( (N-2)R + (2N-1)S - (N+1)T - (2N-4)P \Big) + \Delta\alpha > 0. \tag{14}$$

For a large population, $N \to \infty$, this condition becomes

$$R + 2S - T - 2P + \Delta\alpha > 0. \tag{15}$$

Note that this condition is exactly the limit of condition (9) as $k \to \infty$.

Conclusion: For a large structured population in a regular graph of degree $k \ge 2$, weak selection favors the evolution of cooperation if

$$\frac{3k^2 + 5k + 2}{3k^2} R + \frac{6k^2 - 5k - 2}{3k^2} S - \frac{3k^2 - k + 2}{3k^2} T - \frac{6k^2 + k - 2}{3k^2} P + \Delta\alpha > 0. \tag{16}$$

This can be extended to well-mixed populations by taking $k \to \infty$. If inequality (16) is reversed, weak selection disfavors the evolution of cooperation.

3.2. $\rho_C(\delta) > \rho_D(\delta)$

It is possible that weak selection favors the fixation of both cooperation and defection, or disfavors the fixation of both. As a result, comparing the fixation probability to what it would be under neutrality does not give a complete view of the success of a strategy. A second criterion is therefore introduced (Nowak et al. [31]), based on the comparison of the two fixation probabilities, to identify the most successful strategy. If $\rho_C(\delta) > \rho_D(\delta)$, then the invasion of a single cooperator in an all-defecting population is more likely than the invasion of a single defector in an all-cooperating population. In such a case, we say that weak selection favors the evolution of cooperation more than the evolution of defection.

3.2.1. Case 1: $k \ge 3$

Using the results of Appendix A for $p = N^{-1}$, the ratio of the fixation probabilities reduces to

$$\frac{\rho_C(\delta)}{\rho_D(\delta)} = 1 + \delta \cdot c_4(N,k) \left[ \Gamma_1 + 2\Gamma_2 + 2k^2 \Delta\alpha \right] + O(\delta^2), \tag{18}$$

where $c_4(N,k) > 0$ depends only on $N$ and $k$. Accordingly, we have $\rho_C(\delta) > \rho_D(\delta)$ if $\Gamma_1 + 2\Gamma_2 + 2k^2\Delta\alpha > 0$, which is equivalent to

$$\frac{3k+2}{2k}(R - P) + \frac{3k-2}{2k}(S - T) + \Delta\alpha > 0. \tag{19}$$

This condition is valid for $3 \le k \ll N$.

3.2.2. Case 2: $k = 2$

For $k = 2$ and by using Eq. (92) in Appendix B, we have

$$\frac{\rho_C(\delta)}{\rho_D(\delta)} = 1 + \delta \cdot c_5 \left[ (2N - n_0)(R - P) + N(S - T) + N\Delta\alpha \right] + O(\delta^2), \tag{20}$$

where $c_5 > 0$ and $n_0$ are the constants obtained in Appendix B. Accordingly, weak selection favors the evolution of cooperation more than the evolution of defection if

$$(2N - n_0)(R - P) + N(S - T) + N\Delta\alpha > 0. \tag{21}$$

For a large population size, $N \to \infty$, this condition is equivalent to

$$2R + S - T - 2P + \Delta\alpha > 0, \tag{22}$$

which extends condition (19) to $k = 2$.

3.2.3. Case 3: Well-mixed populations

Another extension of condition (19), to well-mixed populations, is the following. For a fully connected population, we derive the expression of the ratio $\rho_C/\rho_D$ in Eq. (99) in the appendix. We have

$$\frac{\rho_C(\delta)}{\rho_D(\delta)} = 1 + \delta \cdot c_6(N) \left[ \frac{3N-5}{2(N-1)^2} \Big( (N-2)R + NS - NT - (N-2)P \Big) + \Delta\alpha \right] + O(\delta^2), \tag{23}$$

where $c_6(N) > 0$ depends only on $N$. Therefore, weak selection favors the evolution of cooperation more than the evolution of defection if

$$\frac{3N-5}{2(N-1)^2} \Big( (N-2)R + NS - NT - (N-2)P \Big) + \Delta\alpha > 0. \tag{24}$$

For a large population, this condition reduces to

$$\frac{3}{2}(R + S - T - P) + \Delta\alpha > 0. \tag{25}$$

This extends condition (19) to $k \to \infty$.

Conclusion: For a large structured population in a regular graph of degree $k \ge 2$, weak selection favors the evolution of cooperation more than the evolution of defection if

$$\frac{3k+2}{2k}(R - P) + \frac{3k-2}{2k}(S - T) + \Delta\alpha > 0. \tag{26}$$

This can be extended to well-mixed populations by taking $k \to \infty$.

For symmetric aspiration, $\tilde{\alpha}_D = \tilde{\alpha}_C$, condition (26) for weak selection to favor the evolution of cooperation more than the evolution of defection can be written as

$$\sigma R + S > T + \sigma P, \tag{27}$$

where $\sigma = \frac{3k+2}{3k-2}$ is the structure coefficient (Tarnita et al. [41]). Here, $\sigma$ describes the effects of the structure and of the update rule on the evolutionary process. It does not depend on the game matrix. It quantifies the degree to which individuals of the same type are more likely to meet than individuals of different types. If we select two neighbors, they are of different types with probability $1/(1+\sigma)$, or of the same type with probability $\sigma/(1+\sigma)$.

The game is equivalent to a game in a well-mixed population without structure, where each individual can interact with any other individual through the effective game matrix (Lessard [21]), given by

$$A_{\mathrm{eff}} = \begin{pmatrix} \sigma R & S \\ T & \sigma P \end{pmatrix}. \tag{28}$$

Note that $\sigma$ converges to 1 as $k \to \infty$. Therefore, the normal payoff matrix (2) is obtained in the limit where each individual interacts with any other individual. This describes exactly a well-mixed population, and the limit of condition (27) is

$$R + S > T + P. \tag{29}$$

This is exactly the limit of the condition for weak selection to favor the evolution of cooperation more than the evolution of defection for a well-mixed population that follows a Moran procedure (Taylor et al. [42]). In a well-mixed population, the update rule has no effect on the evolutionary process.
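The large-population conditions of this section are easy to evaluate numerically. The sketch below is an illustration, not the paper's code: it assumes that condition (16) has the coefficients $(3k^2+5k+2)/(3k^2)$, $(6k^2-5k-2)/(3k^2)$, $(3k^2-k+2)/(3k^2)$ and $(6k^2+k-2)/(3k^2)$ on $R$, $S$, $T$ and $P$, and that condition (26) has the coefficients $(3k+2)/(2k)$ and $(3k-2)/(2k)$ on $R-P$ and $S-T$. The function names are hypothetical, and exact rational arithmetic is used so that boundary cases are not blurred by rounding.

```python
from fractions import Fraction


def structure_coefficient(k):
    """sigma = (3k + 2) / (3k - 2) for a regular graph of degree k."""
    return Fraction(3 * k + 2, 3 * k - 2)


def lhs_condition_16(R, S, T, P, delta_alpha, k):
    """Left-hand side of condition (16); cooperation is favored when it is positive."""
    k = Fraction(k)
    denom = 3 * k * k
    payoff_part = ((3 * k * k + 5 * k + 2) * R
                   + (6 * k * k - 5 * k - 2) * S
                   - (3 * k * k - k + 2) * T
                   - (6 * k * k + k - 2) * P) / denom
    return payoff_part + delta_alpha


def lhs_condition_26(R, S, T, P, delta_alpha, k):
    """Left-hand side of condition (26); rho_C > rho_D when it is positive."""
    k = Fraction(k)
    return ((3 * k + 2) / (2 * k) * (R - P)
            + (3 * k - 2) / (2 * k) * (S - T)
            + delta_alpha)
```

For payoffs of the form $R = b - c$, $S = -c$, $T = b$, $P = 0$, both functions return $\Delta\alpha + 2b/k - 3c$, which is one way to check an implementation against the text.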

4. Example: the simplified additive Prisoner's Dilemma

Consider the simplified additive Prisoner's Dilemma given by the matrix

$$\begin{array}{c|cc} & C & D \\ \hline C & b - c & -c \\ D & b & 0 \end{array} \tag{30}$$

A cooperator pays a cost $c > 0$ and receives a benefit $b > c$ if its partner cooperates. A defector benefits by receiving $b$ if its partner cooperates. This is one of the most important social dilemmas, which can be used to quantify the effectiveness of cooperation via the benefit-to-cost ratio $b/c$. This ratio is an indicator of the performance of cooperation in structured populations as in well-mixed populations.

Using condition (16) with the new entries, weak selection favors the evolution of cooperation, $\rho_C(\delta) > N^{-1}$, if

$$\Delta\alpha + \frac{2}{k} b > 3c. \tag{31}$$

Note that this condition is exactly condition (17) for weak selection to disfavor the evolution of defection, $\rho_D(\delta) < N^{-1}$, and condition (26) to favor the evolution of cooperation more than the evolution of defection, $\rho_C(\delta) > \rho_D(\delta)$. Therefore, we cannot have $\rho_C(\delta) > N^{-1}$ and $\rho_D(\delta) > N^{-1}$, or $\rho_C(\delta) < N^{-1}$ and $\rho_D(\delta) < N^{-1}$. The difference in aspired level, $\Delta\alpha = \tilde{\alpha}_D - \tilde{\alpha}_C$, is a form of compensation to cooperators for their behavior. Weak selection fully favors the evolution of cooperation, that is, $\rho_C(\delta) > N^{-1} > \rho_D(\delta)$, if the compensation $\Delta\alpha$ exceeds the difference in payoff $3c - \frac{2}{k} b$.

Otherwise, that is, if

$$\Delta\alpha + \frac{2}{k} b < 3c, \tag{32}$$

weak selection fully favors the evolution of defection, that is, $\rho_C(\delta) < N^{-1} < \rho_D(\delta)$. In this case, the difference in aspired level, $\Delta\alpha$, is not enough to compensate cooperators so that they evolve and take over the population. Selection then opposes cooperative behavior.

With large values of $\Delta\alpha$, a cooperator is more satisfied than a defector. This allows cooperators to maintain their strategy more frequently than defectors and increases the updating frequency of defectors until they end up accepting cooperation.

The weight of the benefit $b$ in condition (31) depends on the connectivity $k$. For a graph with the smallest connectivity, $k = 2$, weak selection fully favors the evolution of cooperation if

$$\Delta\alpha + b > 3c. \tag{33}$$

For a graph with the largest connectivity, $k \to \infty$ (well-mixed populations), weak selection fully favors the evolution of cooperation if

$$\Delta\alpha > 3c. \tag{34}$$

For any other connectivity, the condition lies between (33) and (34). The first condition is the least stringent one and the second condition the most stringent one. This suggests that the best scenario for the cooperative evolution is a graph with a small connectivity. Increasing the connectivity reduces the cooperation level.

Consider the case where each type aspires on average to the maximum payoff that it can receive, $\tilde{\alpha}_C = b - c$ and $\tilde{\alpha}_D = b$. Then, the difference in aspired level is $\Delta\alpha = c$. Condition (31) for selection to fully favor the evolution of cooperation becomes

$$\frac{b}{c} > k. \tag{35}$$

This is exactly the condition derived by Ohtsuki et al. [32] for death-birth updating, where at each time step an individual is selected to die. Then, a neighbor is selected with probability proportional to its fitness to give birth to a copy, which takes the vacant position. Note that, for a graph with a large connectivity, weak selection fully favors the evolution of defection whatever the benefit $b$ and the cost $c$.

If both strategies aspire on average to the same level, that is, $\Delta\alpha = 0$, then weak selection fully favors the evolution of cooperation if

$$\frac{b}{c} > \left(\frac{b}{c}\right)^{*} = \frac{\sigma + 1}{\sigma - 1} = \frac{3k}{2}. \tag{36}$$

Decreasing the connectivity $k$ decreases the crucial ratio $\left(\frac{b}{c}\right)^{*}$ that $b/c$ should exceed for weak selection to favor the evolution of cooperation. These results reveal that the larger the connectivity $k$, the larger $\left(\frac{b}{c}\right)^{*}$ must be for selection to favor the evolution of cooperation in any sense. For a very large connectivity, $k \to \infty$, weak selection fully favors the evolution of defection.
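The thresholds of this section all follow from the single inequality $\Delta\alpha + \frac{2}{k} b > 3c$: dividing by $c > 0$ and solving for $b/c$ gives $b/c > \frac{k}{2}\,(3 - \Delta\alpha/c)$. The sketch below, with the hypothetical name `critical_benefit_cost_ratio`, simply evaluates this threshold; it is an illustration under that rearrangement, not code from the paper.

```python
from fractions import Fraction


def critical_benefit_cost_ratio(k, delta_alpha_over_c):
    """Threshold that b/c must exceed for delta_alpha + (2/k) * b > 3c to hold.

    k: graph degree (connectivity).
    delta_alpha_over_c: the difference in aspired level divided by the cost c.
    A non-positive return value means the inequality holds for every b > 0.
    """
    x = Fraction(delta_alpha_over_c)
    return Fraction(k) * (3 - x) / 2
```

With $\Delta\alpha = c$ the threshold equals $k$, and with $\Delta\alpha = 0$ it equals $3k/2$, which matches the two special cases discussed in the text; in both cases the threshold grows linearly with the connectivity.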
Conclusion : In the case of simpliﬁed additive Prisoner’s Dilemma and under weak selection,the condition ρ C ( δ ) > N − is suﬃcient for ρ C ( δ ) > N − > ρ D ( δ ) . This condition is ∆ α + 2 k b > c. (37) Increasing the connectivity reduces the cooperation level. . Discussion Strategy update rule, in which individuals correct their strategies following a selection dynamic,is a microscopic mechanism that can serve to explain the cooperative evolution in diﬀerent topolo-gies. Two fundamentals update rules are the most used: aspiration-based mechanisms, whichare based on the knowledge of individuals about themselves (self-learning), and imitation-basedmechanisms, which are based on the knowledge of individuals about their neighborhood. To date,studies on evolutionary dynamics have focused on one of these mechanisms.In this paper, we have established a mixed update rule, in which individuals test their successwith their aspired levels to decide whether or not to imitate their neighbors. Individuals in thepopulation are generally heterogeneous. Consequently, we have considered along with this papera heterogeneity of aspired level, which is a bounded random variable with a mean depends on thestrategy. Satisﬁed individuals will maintain their strategies with a higher probability. Dissatisﬁedindividuals will imitate one of their neighbors with a higher probability. In this case, a strategy isselected with probability proportional to its ﬁtness.For a general game, we have derived the ﬁxation probabilities of cooperation and defection.For the particular cases, circular model and well-mixed population, we have the exact values ofthe ﬁxation probabilities for ﬁnite population. For a general graph, we have an approximation ofthe ﬁxation probabilities for a large population.We applied these results to test the success of cooperation. 
We have shown that weak selectionfavors the evolution of cooperation more than the evolution of defection if3 k + 22 k ( R − P ) + 3 k − k ( S − T ) + ∆ α > , (38)where k is the graph degree. This condition can be extended to large well-mixed populations bytending k → ∞ . k +22 k ( R − P ) + k − k ( S − T ) quantiﬁes the eﬀect of the payoﬀ diﬀerence betweencooperation and defection. ∆ α quantiﬁes the eﬀect of the diﬀerence in aspired level betweencooperation and defection.We have shown that an increase in the mean of the aspired level of defection, or a decrease in themean of the aspired level of cooperation, makes it easier for ρ C > N − , ρ C > ρ D and ρ D < N − to hold. The conclusion is that these conditions tend to promote the evolution of cooperation.This is true in well-mixed populations as in graph-structured populations. On the other hand, adecrease in the mean of the aspired level of defection, or an increase in the mean of the aspiredlevel of cooperation tends to oppose the evolution of cooperation.For symmetric aspiration ˜ α C = ˜ α D , weak selection favors the evolution of cooperation morethan the evolution of defection if σR + S > T + σP, (39)13here σ = k +23 k − is the coeﬃcient introduced by Tarnita et al. [41]. It describes the eﬀect of thestructure and the update rule on the evolutionary process. It does not depend on the game matrix.For a well-mixed population, we have σ = 1 and then condition (39) becomes R + S > T + P ,which is exactly the risk dominance condition in a coordination game (Harsanyi and Selten [11]).A coordination game is the case where R > T and

P > S .Of further interest is the eﬀect of the mixed dynamic in the additive simpliﬁed Prisoner’sDilemma. The condition for weak selection to favor the evolution of cooperation, ρ C ( δ ) > N − ,is suﬃcient for weak selection to disfavor the evolution of defection, ρ D ( δ ) < N − , and favor theevolution of cooperation more than the evolution of defection, ρ C ( δ ) > ρ D ( δ ). This is true forwell-mixed populations as in graph-structured populations. This conclusion is in agreement withthe conclusion obtained for other update rules as Birth-death, death-birth, imitation and pairwisecomparison (Nowak et al. [31]; Nowak [29]). Under our mixed model, this condition is∆ α + 2 k b > c. (40)The best scenario for the evolution of cooperation is the circular model, k = 2, where the beneﬁt b has a major eﬀect on the favored strategy. In this case, the condition becomes ∆ α + b > c .It requires that ∆ α > c since b > c . Increasing the connectivity k reduces the cooperation levelsince it reduces the weight of the beneﬁt b on condition (40). For a large well-mixed population,which corresponds to a graph with a large connectivity k → ∞ , the beneﬁt b does not come intoplay in this condition. In this case, the condition is ∆ α > c . It is clear that the condition for agraph in a large connectivity, k → ∞ , is the most stringent one and the condition in the circularmodel, k = 2, the least stringent one.For the particular case, when each type aspires in average the maximum payoﬀ that can receiveit, ˜ α C = b − c and ˜ α D = b , weak selection favors the evolution of cooperation in any sense if bc > k, (41)which is typically the condition derived by Ohtsuki et al. [32] for death-birth updating. If bothstrategies aspire in average the same level, that is ∆ α = 0, then weak selection fully favors theevolution of cooperation if bc > k. 
(42)

In both cases, it is possible for weak selection to favor the evolution of cooperation if the connectivity $k$ is finite and $b$ is sufficiently large. In such a case, cooperators form clusters that emerge over the graph. However, for large connectivity, weak selection always disfavors the evolution of cooperation.

Acknowledgments

This research was funded by the Deanship of Scientific Research (DSR) at King Fahd University of Petroleum and Minerals (Grant No. SR).

References

[1] Antal, T.; Nowak, M.A.; Traulsen, A. Strategy abundance in 2 × 2 games for arbitrary mutation rates, J. Theor. Biol., 340–344.
[2] Chen, X.; Wang, L. Promotion of cooperation induced by appropriate payoff aspirations in a small-world networked game, Phys. Rev. E, 01713.
[3] Cressman, R. Evolutionary Games and Extensive Form Games; MIT Press: Cambridge, 2003.
[4] Crow, J.F.; Kimura, M. An Introduction to Population Genetics Theory; Harper and Row: New York, 1970.
[5] Du, J.; Wu, B.; Wang, L. Aspiration dynamics in structured populations acts as if in a well-mixed one, Sci. Rep., 8014.
[6] Du, J.; Wu, B.; Altrock, P.M.; Wang, L. Aspiration dynamics of multiplayer games in finite populations, J. R. Soc. Interface, 20140077.
[7] Ewens, W.J. Mathematical Population Genetics I. Theoretical Introduction; Springer: New York, 2004.
[8] Fudenberg, D.; Imhof, L. Imitation processes with small mutations, J. Econ. Theory, 251–262.
[9] Gruter, C.; Czaczkes, T.J.; Ratnieks, F.L.W. Decision making in ant foragers (Lasius niger) facing conflicting private and social information, Behav. Ecol. Sociobiol., 141–148.
[10] Hauert, C.; Doebeli, M. Spatial structure often inhibits the evolution of cooperation in the snowdrift game, Nature, 643–646.
[11] Harsanyi, J.C.; Selten, R. A General Theory of Equilibrium Selection in Games; MIT Press: Cambridge, 1988.
[12] Hofbauer, J.; Sigmund, K. The Theory of Evolution and Dynamical Systems; Cambridge University Press: Cambridge, 1988.
[13] Hofbauer, J.; Sigmund, K. Evolutionary Games and Population Dynamics; Cambridge University Press: Cambridge, 1998.
[14] Hofbauer, J.; Sigmund, K. Evolutionary game dynamics, Bull. Am. Math. Soc., 479–519.
[15] Imhof, L.; Nowak, M.A. Evolutionary game dynamics in a Wright-Fisher process, J. Math. Biol., 667–681.
[16] Karlin, S.; Taylor, P. A First Course in Stochastic Processes, 2nd edn; Academic Press: New York, 1975.
[17] Keeling, M.J. The effects of local spatial structure on epidemiological invasions, Proc. R. Soc. Lond. B, 859–869.
[18] Kimura, M. On the probability of fixation of mutant genes in a population, Genetics, 713–719.
[19] Kimura, M. The Neutral Theory of Molecular Evolution; Cambridge University Press: Cambridge, 1983.
[20] Ladret, V.; Lessard, S. Fixation probability for a beneficial allele and a mutant strategy in a linear game under weak selection in a finite island model, Theor. Pop. Biol., 409–425.
[21] Lessard, S. Effective game matrix and inclusive payoff in group-structured populations, Dyn. Games Appl., 301–318.
[22] Lieberman, E.; Hauert, C.; Nowak, M.A. Evolutionary dynamics on graphs, Nature, 312–316.
[23] Liu, X.; He, M.; Kang, Y.; Pan, Q. Aspiration promotes cooperation in the prisoner's dilemma game with the imitation rule, Phys. Rev. E, 012124.
[24] Matjaz, P.; Zhen, W. Heterogeneous aspirations promote cooperation in the prisoner's dilemma game, PLoS ONE, 515117.
[25] Matsuda, H.; Ogita, N.; Sasaki, A.; Sato, K. Statistical mechanics of population: The lattice Lotka-Volterra model, Progress of Theoretical Physics, 1035–1049.
[26] Maynard Smith, J. Evolution and the Theory of Games; Cambridge University Press: Cambridge, 1982.
[27] Maynard Smith, J.; Price, G.R. The logic of animal conflict, Nature, 15–18.
[28] Nakamaru, M.; Matsuda, H.; Iwasa, Y. The evolution of cooperation in a lattice structured population, J. Theor. Biol., 65–81.
[29] Nowak, M.A. Evolutionary Dynamics; Harvard University Press: Cambridge, 2006.
[30] Nowak, M.A.; May, R. Evolutionary games and spatial chaos, Nature, 826–829.
[31] Nowak, M.A.; Sasaki, A.; Taylor, C.; Fudenberg, D. Emergence of cooperation and evolutionary stability in finite populations, Nature, 646–650.
[32] Ohtsuki, H.; Hauert, C.; Lieberman, E.; Nowak, M.A. A simple rule for the evolution of cooperation on graphs and social networks, Nature, 502–505.
[33] Ohtsuki, H.; Nowak, M.A. The replicator equation on graphs, J. Theor. Biol., 86–97.
[34] Ohtsuki, H.; Nowak, M.A. Evolutionary games on cycles, Proc. R. Soc. B, 2249–2256.
[35] Ohtsuki, H.; Pacheco, J.; Nowak, M.A. Evolutionary graph theory: breaking the symmetry between interaction and replacement, J. Theor. Biol., 681–694.
[36] Rousset, F.; Billiard, D. A theoretical basis for measures of kin selection in subdivided populations: finite populations and localized dispersal, J. Evol. Biol., 814–825.
[37] Rousset, F. A minimal derivation of convergence stability measures, J. Theor. Biol., 665–668.
[38] Rousset, F. Separation of time scales, fixation probabilities and convergence to evolutionarily stable states under isolation by distance, Theor. Pop. Biol., 165–179.
[39] Samuelson, L. Evolutionary Games and Equilibrium Selection; MIT Press: Cambridge, 1997.
[40] Szabó, G.; Tőke, C. Evolutionary prisoner's dilemma game on a square lattice, Phys. Rev. E, 69.
[41] Tarnita, C.; Ohtsuki, H.; Antal, T.; Fu, F.; Nowak, M.A. Strategy selection in structured populations, J. Theor. Biol., 570–581.
[42] Taylor, C.; Fudenberg, D.; Sasaki, A.; Nowak, M.A. Evolutionary game dynamics in finite populations, Bull. Math. Biol., 1621–1644.
[43] Taylor, P.; Day, T.; Wild, G. Evolution of cooperation in a finite homogeneous graph, Nature, 469–472.
[44] Taylor, P.; Jonker, L. Evolutionary stable strategies and game dynamics, Math. Biosc., 145–156.
[45] Traulsen, A.; Pacheco, J.; Nowak, M.A. Pairwise comparison and selection temperature in evolutionary game dynamics, J. Theor. Biol., 522–529.
[46] Traulsen, A.; Semmann, D.; Sommerfeld, R.; Krambeck, H.; Milinski, M. Human strategy updating in evolutionary games, Proc. Nat. Acad. Sci. USA, 2962–2966.
[47] van Baalen, M. Pair Approximations for Different Spatial Geometries. In The Geometry of Ecological Interactions: Simplifying Spatial Complexity; Dieckmann, U., Law, R., Metz, J. Eds.; Cambridge University Press: Cambridge, 2000; pp. 359–387.
[48] van Bergen, Y.; Coolen, I.; Laland, K.N. Nine-spined sticklebacks exploit the most reliable source when public and private information conflict, Proc. R. Soc. London. Ser. B, 957–962.
[49] van Veelen, M.; Nowak, M.A. Multi-player games on the cycle, J. Theor. Biol., 116–228.
[50] Vincent, T.L.; Brown, J.S. Evolutionary Game Theory, Natural Selection, and Darwinian Dynamics; Cambridge University Press: Cambridge, 2005.
[51] Weibull, J.W. Evolutionary Game Theory; MIT Press: Cambridge, 1995.
[52] Wu, B.; Altrock, P.; Wang, L.; Traulsen, A. Universality of weak selection, Phys. Rev. E, 046106.
[53] Zeeman, R.C. Populations dynamics from game theory. In Global Theory of Dynamical Systems; Nitecki, Z.H., Robinson, R.C. Eds.; Springer: New York, 1980.

6. Appendix A: General case $k \geq 3$

Define $p_X$ and $p_{XY}$ as the frequencies of strategy $X$ and of pairs of type $XY$, respectively, for $X, Y \in \{C, D\}$. Denote by $q_{Y|X} = p_{XY}/p_X$ the probability that a given neighbor of an $X$-strategist is a $Y$-strategist. By basic probability properties, these quantities satisfy

$$p_C + p_D = 1, \qquad p_{XY} = q_{X|Y}\,p_Y = q_{Y|X}\,p_X = p_{YX}, \qquad q_{C|Y} + q_{D|Y} = 1. \qquad (43)$$

Using these identities, we can express all these probabilities in terms of $p_C$ and $q_{C|C}$:

$$\begin{aligned}
p_D &= 1 - p_C,\\
p_{CC} &= q_{C|C}\,p_C,\\
q_{D|C} &= 1 - q_{C|C},\\
p_{CD} &= q_{D|C}\,p_C = (1 - q_{C|C})\,p_C,\\
q_{C|D} &= \frac{p_{CD}}{p_D} = \frac{(1 - q_{C|C})\,p_C}{1 - p_C},\\
q_{D|D} &= 1 - q_{C|D} = 1 - \frac{(1 - q_{C|C})\,p_C}{1 - p_C},\\
p_{DD} &= q_{D|D}\,p_D = 1 - p_C - (1 - q_{C|C})\,p_C.
\end{aligned} \qquad (44)$$

Based on the above identities, the evolutionary process is completely described through $p_C$ and $q_{C|C}$. The next step is to derive the changes in one time step of $p_C$ and $q_{C|C}$, respectively, to characterize the evolutionary process of our model.

Assume that the selected individual, $I$, is an $X$-strategist, and that its neighborhood is formed by $k_C$ cooperators and $k_D = k - k_C$ defectors. Therefore, its expected payoff is

$$\Pi_X = \begin{cases} \dfrac{k_C R + k_D S}{k} & \text{if } X = C,\\[2mm] \dfrac{k_C T + k_D P}{k} & \text{if } X = D. \end{cases} \qquad (45)$$

In the second round, individual $I$ will adopt a neighbor's strategy, so we must consider the payoffs in the neighborhood of $I$. Let individual $J$, a $Y$-player, be a random neighbor of individual $I$, and let $\Pi_{Y|X}$ be the expected payoff of individual $J$. Reasoning on the strategies of individuals $I$ and $J$, we have

$$\begin{aligned}
\Pi_{C|C} &= \frac{[1 + (k-1)q_{C|C}]\,R + (k-1)q_{D|C}\,S}{k},\\
\Pi_{C|D} &= \frac{(k-1)q_{C|C}\,R + [1 + (k-1)q_{D|C}]\,S}{k},\\
\Pi_{D|C} &= \frac{[1 + (k-1)q_{C|D}]\,T + (k-1)q_{D|D}\,P}{k},\\
\Pi_{D|D} &= \frac{(k-1)q_{C|D}\,T + [1 + (k-1)q_{D|D}]\,P}{k}.
\end{aligned} \qquad (46)$$

Proof.

Start with the first payoff in Eq. (46). Assume that $I$ and $J$ are two cooperators. In addition to $I$, individual $J$ has $k - 1$ other neighbors, each of which is of type $C$ with probability $q_{C|C}$, or of type $D$ with probability $q_{D|C} = 1 - q_{C|C}$. On average, the neighborhood of individual $J$ is composed of $1 + (k-1)q_{C|C}$ cooperators and $(k-1)q_{D|C}$ defectors. This explains the form of the expected payoff. The other payoffs in Eq. (46) are obtained similarly.

The frequency of $C$, $p_C$, increases if a defector becomes a cooperator. A defector is selected to update its strategy with probability $p_D$. Its neighborhood is formed by $k_C$ cooperators and $k_D = k - k_C$ defectors with probability $\binom{k}{k_C} q_{C|D}^{k_C} q_{D|D}^{k_D}$, for $k_C = 0, 1, \ldots, k$. It will choose to update its strategy with probability

$$E\Big[\frac{1}{1 + e^{\delta(\Pi_D - \alpha_D)}}\Big] = E\Big[\frac{1}{2} + \delta\cdot\frac{\alpha_D - \Pi_D}{4} + O(\delta^2)\Big] = \frac{1}{2} + \delta\cdot\frac{\tilde{\alpha}_D - \Pi_D}{4} + O(\delta^2). \qquad (47)$$

Here $O(\delta^n)$ means that the error is of order $\delta^n$ for $n \in \mathbb{N}$. Finally, it becomes a cooperator with probability

$$\frac{k_C(1 + \delta\Pi_{C|D})}{k_C(1 + \delta\Pi_{C|D}) + k_D(1 + \delta\Pi_{D|D})} = \frac{k_C}{k} + \delta\,\frac{k_C k_D}{k^2}\,(\Pi_{C|D} - \Pi_{D|D}) + O(\delta^2). \qquad (48)$$

In this case, the change is $\Delta p_C = \frac{1}{N}$. This event is summarized in the following probability:

$$\begin{aligned}
P\Big(\Delta p_C = \frac{1}{N}\Big) &= \underbrace{p_D}_{\text{select a defector}} \sum_{k_C=0}^{k} \underbrace{\binom{k}{k_C} q_{C|D}^{k_C} q_{D|D}^{k_D}}_{\text{its neighborhood}}\; \underbrace{E\Big[\frac{1}{1 + e^{\delta(\Pi_D - \alpha_D)}}\Big]}_{\text{it updates its strategy}}\; \underbrace{\frac{k_C(1 + \delta\Pi_{C|D})}{k_C(1 + \delta\Pi_{C|D}) + k_D(1 + \delta\Pi_{D|D})}}_{\text{it becomes a cooperator}}\\
&= \frac{p_D}{2k}\sum_{k_C=0}^{k} k_C \binom{k}{k_C} q_{C|D}^{k_C} q_{D|D}^{k_D} + \delta\,p_D \sum_{k_C=0}^{k} \frac{k_C}{4k}\binom{k}{k_C} q_{C|D}^{k_C} q_{D|D}^{k_D} \Big[\frac{2k_D}{k}(\Pi_{C|D} - \Pi_{D|D}) + \tilde{\alpha}_D - \Pi_D\Big] + O(\delta^2).
\end{aligned} \qquad (49)$$

Using Eq. (46) and the first two moments of the binomial distribution,

$$\begin{aligned}
\sum_{k_C=0}^{k} k_C \binom{k}{k_C} q_{C|D}^{k_C} q_{D|D}^{k_D} &= \sum_{k_C=0}^{k} k_C \binom{k}{k_C} q_{C|D}^{k_C} (1 - q_{C|D})^{k_D} = k\,q_{C|D},\\
\sum_{k_C=0}^{k} k_C^2 \binom{k}{k_C} q_{C|D}^{k_C} q_{D|D}^{k_D} &= \sum_{k_C=0}^{k} k_C^2 \binom{k}{k_C} q_{C|D}^{k_C} (1 - q_{C|D})^{k_D} = k\,q_{C|D} + k(k-1)\,q_{C|D}^2,
\end{aligned} \qquad (50)$$

we obtain

$$\begin{aligned}
P\Big(\Delta p_C = \frac{1}{N}\Big) &= \frac{p_D\,q_{C|D}}{2} + \frac{\delta\,p_D}{4k}\Big[2(k-1)\,q_{C|D}\,q_{D|D}\,(\Pi_{C|D} - \Pi_{D|D}) + k\,q_{C|D}\,\tilde{\alpha}_D - \big(1 + (k-1)q_{C|D}\big)q_{C|D}\,T - (k-1)\,q_{C|D}\,q_{D|D}\,P\Big] + O(\delta^2)\\
&= \frac{p_{CD}}{2} + \frac{\delta\,p_{CD}}{4k}\Big[I_R^+ R + I_S^+ S - I_T^+ T - I_P^+ P + k\,\tilde{\alpha}_D\Big] + O(\delta^2),
\end{aligned} \qquad (51)$$

where

$$\begin{aligned}
I_R^+ &= \frac{2(k-1)^2}{k}\,q_{C|C}\,q_{D|D},\\
I_S^+ &= \frac{2(k-1)}{k}\,q_{D|D}\Big(1 + (k-1)\,q_{D|C}\Big),\\
I_T^+ &= 1 + (k-1)\,q_{C|D} + \frac{2(k-1)^2}{k}\,q_{D|D}\,q_{C|D},\\
I_P^+ &= \frac{k-1}{k}\,q_{D|D}\Big(k + 2 + 2(k-1)\,q_{D|D}\Big).
\end{aligned} \qquad (52)$$

The frequency of $C$, $p_C$, decreases if a cooperator becomes a defector. In this case, the change is $\Delta p_C = -\frac{1}{N}$. This happens with probability

$$\begin{aligned}
P\Big(\Delta p_C = -\frac{1}{N}\Big) &= \underbrace{p_C}_{\text{select a cooperator}} \sum_{k_C=0}^{k} \underbrace{\binom{k}{k_C} q_{C|C}^{k_C} q_{D|C}^{k_D}}_{\text{its neighborhood}}\; \underbrace{E\Big[\frac{1}{1 + e^{\delta(\Pi_C - \alpha_C)}}\Big]}_{\text{it updates its strategy}}\; \underbrace{\frac{k_D(1 + \delta\Pi_{D|C})}{k_C(1 + \delta\Pi_{C|C}) + k_D(1 + \delta\Pi_{D|C})}}_{\text{it becomes a defector}}\\
&= \frac{p_C}{2k}\sum_{k_C=0}^{k} k_D \binom{k}{k_C} q_{C|C}^{k_C} q_{D|C}^{k_D} + \frac{\delta\,p_C}{4k} \sum_{k_C=0}^{k} k_D \binom{k}{k_C} q_{C|C}^{k_C} q_{D|C}^{k_D} \Big(\frac{2k_C}{k}(\Pi_{D|C} - \Pi_{C|C}) + \tilde{\alpha}_C - \Pi_C\Big) + O(\delta^2)\\
&= \frac{p_{CD}}{2} + \frac{\delta\,p_{CD}}{4k}\Big(-I_R^- R - I_S^- S + I_T^- T + I_P^- P + k\,\tilde{\alpha}_C\Big) + O(\delta^2),
\end{aligned} \qquad (53)$$

where

$$\begin{aligned}
I_R^- &= \frac{k-1}{k}\,q_{C|C}\Big(k + 2 + 2(k-1)\,q_{C|C}\Big),\\
I_S^- &= 1 + (k-1)\,q_{D|C} + \frac{2(k-1)^2}{k}\,q_{C|C}\,q_{D|C},\\
I_T^- &= \frac{2(k-1)}{k}\,q_{C|C}\Big(1 + (k-1)\,q_{C|D}\Big),\\
I_P^- &= \frac{2(k-1)^2}{k}\,q_{C|C}\,q_{D|D}.
\end{aligned} \qquad (54)$$

Denote by $\dot{p}_C$ the rate of change of $p_C$ in one time step. Using Eqs (51) and (53), we obtain

$$\dot{p}_C = \frac{1}{N}\,P\Big(\Delta p_C = \frac{1}{N}\Big) - \frac{1}{N}\,P\Big(\Delta p_C = -\frac{1}{N}\Big) = \frac{\delta\,p_{CD}}{4Nk}\Big[I_R R + I_S S - I_T T - I_P P + k\,\Delta\alpha\Big] + O(\delta^2), \qquad (55)$$

where

$$\begin{aligned}
I_R &= I_R^+ + I_R^- = \Big[k + 2 + 2(k-1)(q_{C|C} + q_{D|D})\Big]\frac{k-1}{k}\,q_{C|C},\\
I_S &= I_S^+ + I_S^- = 1 + (k-1)\,q_{D|C} + \frac{2(k-1)^2}{k}(q_{C|C} + q_{D|D})\,q_{D|C} + \frac{2(k-1)\,q_{D|D}}{k},\\
I_T &= I_T^+ + I_T^- = 1 + (k-1)\,q_{C|D} + \frac{2(k-1)^2}{k}(q_{C|C} + q_{D|D})\,q_{C|D} + \frac{2(k-1)\,q_{C|C}}{k},\\
I_P &= I_P^+ + I_P^- = \Big[k + 2 + 2(k-1)(q_{C|C} + q_{D|D})\Big]\frac{k-1}{k}\,q_{D|D},
\end{aligned} \qquad (56)$$

and $\Delta\alpha = \tilde{\alpha}_D - \tilde{\alpha}_C$ is the difference in aspired level between a defector and a cooperator.

Since $q_{C|C} = p_{CC}/p_C$, we must start with the rate of change of $p_{CC}$, the frequency of $CC$-pairs. Note that the total number of pairs is $kN/2$, since each individual has $k$ neighbors. $p_{CC}$ changes if a defector becomes a cooperator or a cooperator becomes a defector. Let $I$ be the selected individual and assume that its neighborhood is formed by $k_C$ cooperators and $k_D = k - k_C$ defectors.

The number of $CC$ pairs increases by $k_C$ if a defector becomes a cooperator. This occurs with probability

$$\begin{aligned}
P\Big(\Delta p_{CC} = \frac{2k_C}{kN}\Big) &= p_D \binom{k}{k_C} q_{C|D}^{k_C} q_{D|D}^{k_D} \times E\Big[\frac{1}{1 + e^{\delta(\Pi_D - \alpha_D)}}\Big] \times \frac{k_C(1 + \delta\Pi_{C|D})}{k_C(1 + \delta\Pi_{C|D}) + k_D(1 + \delta\Pi_{D|D})}\\
&= \frac{k_C\,p_D}{2k}\binom{k}{k_C} q_{C|D}^{k_C} q_{D|D}^{k_D} + O(\delta).
\end{aligned} \qquad (57)$$

The number of $CC$ pairs decreases by $k_C$ if a cooperator becomes a defector. This occurs with probability

$$\begin{aligned}
P\Big(\Delta p_{CC} = -\frac{2k_C}{kN}\Big) &= p_C \binom{k}{k_C} q_{C|C}^{k_C} q_{D|C}^{k_D} \times E\Big[\frac{1}{1 + e^{\delta(\Pi_C - \alpha_C)}}\Big] \times \frac{k_D(1 + \delta\Pi_{D|C})}{k_C(1 + \delta\Pi_{C|C}) + k_D(1 + \delta\Pi_{D|C})}\\
&= \frac{p_C\,k_D}{2k}\binom{k}{k_C} q_{C|C}^{k_C} q_{D|C}^{k_D} + O(\delta).
\end{aligned} \qquad (58)$$

Let $\dot{p}_{CC}$ be the rate of change of $p_{CC}$ in one time step. Combining Eqs (50), (57) and (58) yields

$$\begin{aligned}
\dot{p}_{CC} &= \sum_{k_C=0}^{k} \frac{2k_C}{kN}\Big[P\Big(\Delta p_{CC} = \frac{2k_C}{kN}\Big) - P\Big(\Delta p_{CC} = -\frac{2k_C}{kN}\Big)\Big]\\
&= \sum_{k_C=0}^{k} \frac{2k_C}{kN}\Big[\frac{k_C\,p_D}{2k}\binom{k}{k_C} q_{C|D}^{k_C} q_{D|D}^{k_D} - \frac{k_D\,p_C}{2k}\binom{k}{k_C} q_{C|C}^{k_C} q_{D|C}^{k_D}\Big] + O(\delta)\\
&= \frac{p_D}{k^2 N}\sum_{k_C=0}^{k} k_C^2 \binom{k}{k_C} q_{C|D}^{k_C} q_{D|D}^{k_D} - \frac{p_C}{k^2 N}\sum_{k_C=0}^{k} k_C k_D \binom{k}{k_C} q_{C|C}^{k_C} q_{D|C}^{k_D} + O(\delta)\\
&= \frac{p_D}{k^2 N}\Big(k\,q_{C|D} + k(k-1)\,q_{C|D}^2\Big) - \frac{p_C}{k^2 N}\Big(k^2 q_{C|C} - k\,q_{C|C} - k(k-1)\,q_{C|C}^2\Big) + O(\delta)\\
&= \frac{p_{CD}}{kN}\Big(1 + (k-1)(q_{C|D} - q_{C|C})\Big) + O(\delta).
\end{aligned} \qquad (59)$$

As a result, the rate of change of $q_{C|C}$ in one time step is

$$\dot{q}_{C|C} = \frac{\dot{p}_{CC}}{p_C} = \frac{p_{CD}}{kN\,p_C}\Big(1 + (k-1)(q_{C|D} - q_{C|C})\Big) + O(\delta) = \frac{1 - q_{C|C}}{kN}\Big\{1 + (k-1)\,\frac{p_C - q_{C|C}}{1 - p_C}\Big\} + O(\delta). \qquad (60)$$

In the last step, we have used Eq (44).

Under weak selection, the global frequency $p_C$ changes at a rate of order $\delta$ (see Eq. (55)), which is very small, while the local frequency $q_{C|C}$ changes at a rate of order 1 (see Eq. (60)). As a consequence, the local density $q_{C|C}$ equilibrates much more quickly than the global density $p_C$ (Ohtsuki and Nowak [34]). Therefore, the dynamical system rapidly converges onto a quasi-steady state, defined by $\dot{q}_{C|C} = 0$, or more explicitly,

$$q_{C|C} = \frac{1}{k-1} + \frac{k-2}{k-1}\,p_C. \qquad (61)$$

This key relationship is obtained in many studies of structured populations on regular graphs (Ohtsuki et al. [32, 34, 35]).

Instead of studying a diffusion process in terms of two variables, $p_C$ and $q_{C|C}$, this relation lets us describe the system by a one-dimensional diffusion process in terms of $p_C$ only.
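As a quick numerical sanity check of the time-scale argument, the following minimal Python sketch evaluates the leading-order rate in Eq. (60) at the quasi-steady-state value of Eq. (61) and recovers the other conditional frequencies through Eq. (44). The function names are illustrative choices, not part of the original derivation.

```python
def q_cc_steady(p_c: float, k: int) -> float:
    """Quasi-steady-state local frequency q_{C|C} of Eq. (61)."""
    return 1.0 / (k - 1) + (k - 2) / (k - 1) * p_c


def q_cc_rate(p_c: float, q_cc: float, k: int, n: int) -> float:
    """Leading-order (delta = 0) rate of change of q_{C|C}, Eq. (60)."""
    return (1.0 - q_cc) / (k * n) * (1.0 + (k - 1) * (p_c - q_cc) / (1.0 - p_c))


def local_frequencies(p_c: float, k: int) -> dict:
    """Conditional neighbor frequencies of Eq. (44) at the quasi-steady state."""
    q_cc = q_cc_steady(p_c, k)
    q_dc = 1.0 - q_cc
    q_cd = q_dc * p_c / (1.0 - p_c)
    return {"q_cc": q_cc, "q_dc": q_dc, "q_cd": q_cd, "q_dd": 1.0 - q_cd}


if __name__ == "__main__":
    k, n, p_c = 4, 100, 0.3  # hypothetical degree, population size and frequency
    # The rate vanishes (up to rounding) at the quasi-steady state.
    print(q_cc_rate(p_c, q_cc_steady(p_c, k), k, n))
    print(local_frequencies(p_c, k))
```

Away from the quasi-steady state the same function returns a nonzero rate of order $1/(kN)$, independent of $\delta$, which is the separation of time scales invoked above.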
In fact, by using Eq (61), we express the different probabilities of Eq (44) in terms of $p_C$ as

$$\begin{aligned}
p_D &= 1 - p_C,\\
q_{D|C} &= 1 - q_{C|C} = \frac{k-2}{k-1}(1 - p_C),\\
q_{C|D} &= \frac{q_{D|C}\,p_C}{1 - p_C} = \frac{k-2}{k-1}\,p_C,\\
q_{D|D} &= 1 - q_{C|D} = 1 - \frac{k-2}{k-1}\,p_C,\\
p_{CD} &= q_{D|C}\,p_C = \frac{k-2}{k-1}(1 - p_C)\,p_C.
\end{aligned} \qquad (62)$$

Inserting these equations in Eq (56), we have

$$\begin{aligned}
I_R &= \frac{3k + 2 + (3k+2)(k-2)\,p_C}{k},\\
I_S &= \frac{3k^2 - 3k - 2 - (k-2)(3k+2)\,p_C}{k},\\
I_T &= \frac{(k+2) + (3k+2)(k-2)\,p_C}{k},\\
I_P &= \frac{(3k+2)(k-1) - (3k+2)(k-2)\,p_C}{k}.
\end{aligned} \qquad (63)$$

With a short interval $\Delta t$ and by using Eqs (55) and (63), we have

$$E\big[\Delta p_C \,\big|\, p_C(0) = p\big] = \dot{p}_C\,\Delta t \equiv \mu(p)\,\Delta t. \qquad (64)$$

Here $\mu$ is given, to first order in $\delta$, by

$$\mu(p) = \frac{\delta\,(k-2)}{4Nk^2(k-1)}\,p(1-p)\Big(\Gamma_0 + k^2\Delta\alpha + \Gamma_1\,p\Big), \qquad (65)$$

where

$$\begin{aligned}
\Gamma_0 &= (3k+2)R + (3k^2 - 3k - 2)S - (k+2)T - (3k+2)(k-1)P,\\
\Gamma_1 &= (3k+2)(k-2)(R - S - T + P).
\end{aligned} \qquad (66)$$

For the variance, we have

$$\begin{aligned}
\mathrm{var}\big[\Delta p_C \,\big|\, p_C(0) = p\big] &= E\big[(\Delta p_C)^2 \,\big|\, p_C(0) = p\big] - \Big(E\big[\Delta p_C \,\big|\, p_C(0) = p\big]\Big)^2\\
&= \frac{1}{N^2}\Big(P\Big(\Delta p_C = -\frac{1}{N}\Big) + P\Big(\Delta p_C = \frac{1}{N}\Big)\Big)\Delta t + O(\delta)\,\Delta t\\
&\simeq \frac{k-2}{N^2(k-1)}\,p(1-p)\,\Delta t \equiv \nu(p)\,\Delta t.
\end{aligned} \qquad (67)$$

Conditions (65) and (67) ensure the diffusion approximation with drift function $\mu(x)$ and diffusion function $\nu(x)$.

Suppose that a proportion $p$ of cooperators appears in a population of defectors, with $p \in (0, 1)$. Two scenarios are possible. In the first, the cooperators generate a lineage that takes over the population (fixation of cooperators, $x = 1$). In the second, this proportion dies before reproducing or generates a lineage that disappears after some time (extinction of cooperators, $x = 0$). Then, $x = 0$ and $x = 1$ are absorbing states of the diffusion process.

Let $\varphi_C^\delta(p, t)$ be the probability that absorption has occurred at $x = 1$ at or before time $t$.
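The backward Kolmogorov equation (Kimura [18], Crow and Kimura [4], Ewens [7]) can be written as

$$\frac{\partial \varphi_C^\delta(p, t)}{\partial t} = \mu(p)\,\frac{\partial \varphi_C^\delta(p, t)}{\partial p} + \frac{\nu(p)}{2}\,\frac{\partial^2 \varphi_C^\delta(p, t)}{\partial p^2} \qquad (68)$$

with boundary conditions $\varphi_C^\delta(0, t) = 0$ and $\varphi_C^\delta(1, t) = 1$. By letting $t \to \infty$, the limit

$$\varphi_C^\delta(p) = \lim_{t \to \infty} \varphi_C^\delta(p, t) \qquad (69)$$

represents the fixation probability of cooperators given an initial frequency $p$. As $t \to \infty$, the left-hand side of (68) tends to 0, since $\varphi_C^\delta(p, t)$ tends to a constant. Therefore, Eq. (68) becomes

$$\mu(x)\,\frac{d\varphi_C^\delta}{dx}(x) + \frac{\nu(x)}{2}\,\frac{d^2\varphi_C^\delta}{dx^2}(x) = 0, \qquad (70)$$

with the boundary conditions $\varphi_C^\delta(0) = 0$ and $\varphi_C^\delta(1) = 1$. The solution of Eq (70) is

$$\varphi_C^\delta(p) = \frac{\displaystyle\int_0^p \exp\Big\{-2\int_0^x \frac{\mu(y)}{\nu(y)}\,dy\Big\}\,dx}{\displaystyle\int_0^1 \exp\Big\{-2\int_0^x \frac{\mu(y)}{\nu(y)}\,dy\Big\}\,dx}. \qquad (71)$$

Using Eqs (65) and (67), we have

$$\begin{aligned}
\exp\Big\{-2\int_0^x \frac{\mu(y)}{\nu(y)}\,dy\Big\} &= \exp\Big\{-\int_0^x \frac{\delta N}{2k^2}\big(\Gamma_1 y + \Gamma_0 + k^2\Delta\alpha\big)\,dy\Big\}\\
&= \exp\Big\{-\frac{\delta N}{4k^2}\big(\Gamma_1 x^2 + 2\Gamma_0 x + 2k^2\Delta\alpha\,x\big)\Big\}\\
&= 1 - \delta\cdot\frac{N}{4k^2}\Big[\Gamma_1 x^2 + 2\Gamma_0 x + 2k^2\Delta\alpha\,x\Big] + O(\delta^2).
\end{aligned} \qquad (72)$$

Integrating Eq (72), we have the key approximation

$$\begin{aligned}
\varphi_C^\delta(p) &= \frac{\displaystyle\int_0^p \Big(1 - \delta\cdot\frac{N}{4k^2}\big(\Gamma_1 x^2 + 2\Gamma_0 x + 2k^2\Delta\alpha\,x\big)\Big)dx + O(\delta^2)}{\displaystyle\int_0^1 \Big(1 - \delta\cdot\frac{N}{4k^2}\big(\Gamma_1 x^2 + 2\Gamma_0 x + 2k^2\Delta\alpha\,x\big)\Big)dx + O(\delta^2)}\\
&= \frac{p - \delta\cdot\dfrac{N}{4k^2}\Big[\dfrac{\Gamma_1}{3}p^3 + \Gamma_0 p^2 + k^2\Delta\alpha\,p^2\Big] + O(\delta^2)}{1 - \delta\cdot\dfrac{N}{4k^2}\Big[\dfrac{\Gamma_1}{3} + \Gamma_0 + k^2\Delta\alpha\Big] + O(\delta^2)}\\
&= \Big[p - \delta\cdot\frac{N}{4k^2}\Big(\frac{\Gamma_1}{3}p^3 + \Gamma_0 p^2 + k^2\Delta\alpha\,p^2\Big)\Big] \times \Big[1 + \delta\cdot\frac{N}{4k^2}\Big(\frac{\Gamma_1}{3} + \Gamma_0 + k^2\Delta\alpha\Big)\Big] + O(\delta^2)\\
&= p + \delta\cdot\frac{N\,p(1-p)}{12k^2}\Big[\Gamma_1 + 3\Gamma_0 + 3k^2\Delta\alpha + \Gamma_1 p\Big] + O(\delta^2).
\end{aligned} \qquad (73)$$

Similarly, let $\varphi_D^\delta(p)$ be the probability that a proportion $p$ of defectors takes over a population of cooperators. Since there is ultimate fixation of cooperation or defection with probability 1, we have

$$\varphi_D^\delta(p) = 1 - \varphi_C^\delta(1 - p) = p - \delta\cdot\frac{N\,p(1-p)}{12k^2}\Big[2\Gamma_1 + 3\Gamma_0 + 3k^2\Delta\alpha - \Gamma_1 p\Big] + O(\delta^2). \qquad (74)$$

Note that the above calculation is valid only for $k \geq 3$. The ratio of the two fixation probabilities is

$$\frac{\varphi_C^\delta(p)}{\varphi_D^\delta(p)} = 1 + \delta\cdot\frac{N(1-p)}{4k^2}\Big[\Gamma_1 + 2\Gamma_0 + 2k^2\Delta\alpha\Big] + O(\delta^2). \qquad (75)$$

7. Appendix B: Circular model ($k = 2$)

Suppose that we have $N$ sites over a circle numbered $1, \ldots, N$. Each site is occupied by an individual.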
The individual located at site $l$ can interact with its neighbors located at sites $l - 1$ and $l + 1$ through game matrix (2). The same graph is used as the replacement graph: dissatisfied individuals imitate their direct neighbors.

At each time step, each individual interacts with its direct neighbors. Then, an individual $I$ is chosen at random. It updates its strategy with probability (5). In this case, it imitates the strategy of a direct neighbor $J$ with probability proportional to its fitness $f_J = 1 + \delta\,\Pi_J$. Otherwise, individual $I$ keeps its current strategy.

The population initially consists entirely of defectors. A new cooperator is introduced at a particular site. We have two scenarios. In the first, this cooperator generates a lineage of cooperators occupying adjacent sites, which takes over the population. In this case, the population ends with only cooperators (extinction of defectors). In the second, this individual dies before reproducing or generates a lineage that disappears (extinction of cooperators). Let $\rho_C(\delta)$ be the probability of the first scenario. Likewise, $\rho_D(\delta)$ is the probability that a single defector placed in a population of cooperators generates a lineage that takes over the population. Using a recursive argument (Karlin and Taylor [16]), we have

$$\rho_C(\delta) = \frac{1}{1 + \sum_{i=1}^{N-1}\prod_{j=1}^{i}\dfrac{T_j^-}{T_j^+}}, \qquad \frac{\rho_C(\delta)}{\rho_D(\delta)} = \prod_{i=1}^{N-1}\frac{T_i^+}{T_i^-}, \qquad (76)$$

where $T_i^+$ (resp. $T_i^-$) is the probability of the transition "the number of cooperators increases from $i$ to $i + 1$ in one time step" (resp. "the number of cooperators decreases from $i$ to $i - 1$ in one time step").

Without loss of generality, suppose that sites $l + 1, \ldots, l + i$ are occupied by cooperators, while the other sites are occupied by defectors. Changes in the composition of the population take place at the boundary between the two clusters: the cooperators' cluster formed by sites $l + 1, \ldots, l + i$ and the defectors' cluster formed by the other sites.
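The recursion in Eq. (76) is straightforward to evaluate once the transition probabilities are known. The sketch below is a minimal Python illustration; the function names and the list-based interface are assumptions made here, not part of the paper.

```python
from math import prod


def fixation_probability(t_plus, t_minus):
    """rho_C of Eq. (76).

    t_plus[i - 1] and t_minus[i - 1] hold T_i^+ and T_i^- for i = 1..N-1.
    """
    total = 1.0      # the leading 1 in the denominator
    running = 1.0    # running product of T_j^- / T_j^+ for j = 1..i
    for up, down in zip(t_plus, t_minus):
        running *= down / up
        total += running
    return 1.0 / total


def fixation_ratio(t_plus, t_minus):
    """rho_C / rho_D of Eq. (76)."""
    return prod(up / down for up, down in zip(t_plus, t_minus))


if __name__ == "__main__":
    # Neutral case with N = 5: every ratio equals 1, so rho_C = 1/N.
    neutral = [0.1] * 4
    print(fixation_probability(neutral, neutral))
    print(fixation_ratio(neutral, neutral))
```

Only the ratios $T_i^-/T_i^+$ enter the formula, which is why the derivation that follows concentrates on those ratios rather than on the individual transition probabilities.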
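Changes in one time step may happen at sites $l$, $l + 1$, $l + i$, $l + i + 1$.

To find the transition probabilities $T_i^+$ and $T_i^-$, the payoffs of the individuals around the boundary should be known. The payoff of an individual depends on the number of its neighbors of each type. We have the following types of payoffs:

$$\begin{aligned}
\Pi_{C,(1,1)} &= \frac{R + S}{2}, & \Pi_{D,(1,1)} &= \frac{T + P}{2},\\
\Pi_{C,(2,0)} &= R, & \Pi_{C,(0,2)} &= S,\\
\Pi_{D,(2,0)} &= T, & \Pi_{D,(0,2)} &= P,
\end{aligned} \qquad (77)$$

where $\Pi_{X,(l,j)}$ refers to the payoff of an $X$-player who has $l$ cooperators and $j$ defectors as neighbors, for $X \in \{C, D\}$, and $l + j = 2$ is the graph degree.

The transition $i \to i + 1$ takes place only if a defector located at the boundary becomes a cooperator. This occurs with probability

$$\begin{aligned}
T_i^+ &= \frac{2}{N} \times E\Big[\frac{1}{1 + e^{\delta(\Pi_{D,(1,1)} - \alpha_D)}}\Big] \times \frac{f_{C,(1,1)}}{f_{C,(1,1)} + f_{D,(0,2)}}\\
&= \frac{1}{2N} + \frac{\delta}{4N}\Big[\Pi_{C,(1,1)} + \tilde{\alpha}_D - \Pi_{D,(0,2)} - \Pi_{D,(1,1)}\Big] + O(\delta^2).
\end{aligned} \qquad (78)$$

The transition $i \to i - 1$ takes place only if a cooperator located at the boundary becomes a defector. This occurs with probability

$$\begin{aligned}
T_i^- &= \frac{2}{N} \times E\Big[\frac{1}{1 + e^{\delta(\Pi_{C,(1,1)} - \alpha_C)}}\Big] \times \frac{f_{D,(1,1)}}{f_{C,(2,0)} + f_{D,(1,1)}}\\
&= \frac{1}{2N} + \frac{\delta}{4N}\Big[\Pi_{D,(1,1)} + \tilde{\alpha}_C - \Pi_{C,(2,0)} - \Pi_{C,(1,1)}\Big] + O(\delta^2).
\end{aligned} \qquad (79)$$

Dividing Eq (78) by Eq (79), we obtain

$$\frac{T_i^+}{T_i^-} = 1 + \frac{\delta}{2}\Big[2\big(\Pi_{C,(1,1)} - \Pi_{D,(1,1)}\big) + \Pi_{C,(2,0)} - \Pi_{D,(0,2)} + \Delta\alpha\Big] + O(\delta^2). \qquad (80)$$

Note that Eq (80) is valid for $i = 3, \ldots, N - 3$.

For $i = 2$, only two cooperators are present in the population. Their payoffs are of type $\Pi_{C,(1,1)}$. As a result, we have

$$\begin{aligned}
T_2^- &= \frac{2}{N} \times E\Big[\frac{1}{1 + e^{\delta(\Pi_{C,(1,1)} - \alpha_C)}}\Big] \times \frac{f_{D,(1,1)}}{f_{C,(1,1)} + f_{D,(1,1)}}\\
&= \frac{1}{2N} + \frac{\delta}{4N}\Big[\Pi_{D,(1,1)} - 2\,\Pi_{C,(1,1)} + \tilde{\alpha}_C\Big] + O(\delta^2).
\end{aligned} \qquad (81)$$

The transition probability $T_2^+$ is the same as in Eq (78). Then, the ratio becomes

$$\frac{T_2^+}{T_2^-} = 1 + \frac{\delta}{2}\Big[2\big(\Pi_{C,(1,1)} - \Pi_{D,(1,1)}\big) + \Pi_{C,(1,1)} - \Pi_{D,(0,2)} + \Delta\alpha\Big] + O(\delta^2). \qquad (82)$$

Likewise, for $i = N - 2$, we have

$$\frac{T_{N-2}^+}{T_{N-2}^-} = 1 + \frac{\delta}{2}\Big[2\big(\Pi_{C,(1,1)} - \Pi_{D,(1,1)}\big) + \Pi_{C,(2,0)} - \Pi_{D,(1,1)} + \Delta\alpha\Big] + O(\delta^2). \qquad (83)$$

Finally, for $i = 1$, only one cooperator is in competition with $N - 1$ defectors. Then

$$\begin{aligned}
T_1^- &= \frac{1}{N} \times E\Big[\frac{1}{1 + e^{\delta(\Pi_{C,(0,2)} - \alpha_C)}}\Big] \qquad &(84)\\
&= \frac{1}{2N} + \frac{\delta}{4N}\Big[\tilde{\alpha}_C - \Pi_{C,(0,2)}\Big] + O(\delta^2), \qquad &(85)
\end{aligned}$$

whereas

$$\begin{aligned}
T_1^+ &= \frac{2}{N} \times E\Big[\frac{1}{1 + e^{\delta(\Pi_{D,(1,1)} - \alpha_D)}}\Big] \times \frac{f_{C,(0,2)}}{f_{C,(0,2)} + f_{D,(0,2)}}\\
&= \frac{1}{2N} + \frac{\delta}{4N}\Big[\Pi_{C,(0,2)} - \Pi_{D,(0,2)} + \tilde{\alpha}_D - \Pi_{D,(1,1)}\Big] + O(\delta^2).
\end{aligned} \qquad (86)$$

Accordingly, the ratio becomes

$$\frac{T_1^+}{T_1^-} = 1 + \frac{\delta}{2}\Big[2\,\Pi_{C,(0,2)} - \Pi_{D,(0,2)} - \Pi_{D,(1,1)} + \Delta\alpha\Big] + O(\delta^2). \qquad (87)$$

Likewise, for $i = N - 1$, we have

$$\frac{T_{N-1}^+}{T_{N-1}^-} = 1 + \frac{\delta}{2}\Big[\Pi_{C,(2,0)} + \Pi_{C,(1,1)} - 2\,\Pi_{D,(2,0)} + \Delta\alpha\Big] + O(\delta^2). \qquad (88)$$

We expand $\prod_{j=1}^{i} T_j^-/T_j^+$ up to first order in $\delta$:

$$\prod_{j=1}^{i}\frac{T_j^-}{T_j^+} = \prod_{j=1}^{i}\Big[1 + \delta\cdot\frac{d}{d\delta}\Big(\frac{T_j^-}{T_j^+}\Big)\Big|_{\delta=0} + O(\delta^2)\Big] = 1 + \delta\cdot\sum_{j=1}^{i}\frac{d}{d\delta}\Big(\frac{T_j^-}{T_j^+}\Big)\Big|_{\delta=0} + O(\delta^2). \qquad (89)$$

Accordingly, we have

$$\begin{aligned}
\rho_C(\delta) &= \frac{1}{N + \delta\cdot\sum_{i=1}^{N-1}\sum_{j=1}^{i}\dfrac{d}{d\delta}\Big(\dfrac{T_j^-}{T_j^+}\Big)\Big|_{\delta=0} + O(\delta^2)}\\
&= \frac{1}{N} + \frac{\delta}{N^2}\sum_{i=1}^{N-1}\sum_{j=1}^{i}\frac{d}{d\delta}\Big(\frac{T_j^+}{T_j^-}\Big)\Big|_{\delta=0} + O(\delta^2)\\
&= \frac{1}{N} + \frac{\delta}{N^2}\sum_{j=1}^{N-1}(N - j)\,\frac{d}{d\delta}\Big(\frac{T_j^+}{T_j^-}\Big)\Big|_{\delta=0} + O(\delta^2),\\
\frac{\rho_C}{\rho_D}(\delta) &= 1 + \delta\cdot\sum_{j=1}^{N-1}\frac{d}{d\delta}\Big(\frac{T_j^+}{T_j^-}\Big)\Big|_{\delta=0} + O(\delta^2).
\end{aligned} \qquad (90)$$

Substituting Eqs (80), (82), (83), (87) and (88) in Eq (90) yields

$$\rho_C(\delta) = \frac{1}{N} + \frac{\delta}{4N^2}\Big[(2N^2 - 7N + 5)R + (N^2 + 2N - 5)S - (N^2 - 2N + 5)T - (2N^2 - 3N - 5)P + N(N-1)\Delta\alpha\Big] + O(\delta^2), \qquad (91)$$

and

$$\frac{\rho_C}{\rho_D}(\delta) = 1 + \frac{\delta}{2}\Big[(2N - 5)(R - P) + N(S - T) + (N - 1)\Delta\alpha\Big] + O(\delta^2). \qquad (92)$$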
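The bookkeeping behind the last substitution can be checked numerically. The following Python sketch collects the first-order coefficients of the ratios $T_j^+/T_j^-$ from Eqs (80), (82), (83), (87) and (88) and forms the two sums that appear in Eq (90). It assumes $N \geq 5$ so that the five cases do not overlap; the function names and the example payoff values are illustrative only.

```python
def cycle_ratio_coefficients(n, R, S, T, P, delta_alpha):
    """Coefficients d_j with T_j^+/T_j^- = 1 + delta * d_j + O(delta^2), j = 1..n-1."""
    if n < 5:
        raise ValueError("this sketch assumes n >= 5")
    c11, d11 = (R + S) / 2.0, (T + P) / 2.0   # payoffs with one neighbor of each type
    c20, c02, d20, d02 = R, S, T, P           # payoffs with two like neighbors, Eq. (77)
    bulk = 0.5 * (2 * (c11 - d11) + c20 - d02 + delta_alpha)            # Eq. (80)
    coeffs = {j: bulk for j in range(3, n - 2)}                          # i = 3..N-3
    coeffs[1] = 0.5 * (2 * c02 - d02 - d11 + delta_alpha)                # Eq. (87)
    coeffs[2] = 0.5 * (2 * (c11 - d11) + c11 - d02 + delta_alpha)        # Eq. (82)
    coeffs[n - 2] = 0.5 * (2 * (c11 - d11) + c20 - d11 + delta_alpha)    # Eq. (83)
    coeffs[n - 1] = 0.5 * (c20 + c11 - 2 * d20 + delta_alpha)            # Eq. (88)
    return [coeffs[j] for j in range(1, n)]


def ratio_first_order(n, R, S, T, P, delta_alpha):
    """Coefficient of delta in rho_C / rho_D, last line of Eq. (90)."""
    return sum(cycle_ratio_coefficients(n, R, S, T, P, delta_alpha))


def rho_c_first_order(n, R, S, T, P, delta_alpha):
    """Coefficient of delta in rho_C, third line of Eq. (90)."""
    d = cycle_ratio_coefficients(n, R, S, T, P, delta_alpha)
    return sum((n - j) * d[j - 1] for j in range(1, n)) / n**2


if __name__ == "__main__":
    # Hypothetical payoffs and aspiration gap, chosen only for illustration.
    print(ratio_first_order(10, 3.0, 0.0, 5.0, 1.0, 0.7))
    print(rho_c_first_order(10, 3.0, 0.0, 5.0, 1.0, 0.7))
```

When all four payoffs are equal, every coefficient reduces to $\Delta\alpha/2$, which gives a simple consistency check on the payoff terms of any closed-form expression obtained from these sums.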

8. Appendix C: Well-mixed population

Consider a well-mixed population of size $N$, where each individual can interact with any other individual with the same probability through game matrix (2). At any time step, all individuals interact in pairs to accumulate payoffs. Then, an individual $I$ is chosen at random to update its strategy. It updates its strategy with probability (5). In this case, it imitates individual $J$, one of the other individuals, with probability proportional to its fitness $f_J = 1 + \delta\,\Pi_J$. Otherwise, the current strategy of individual $I$ is maintained. As in Appendix B, we have

$$\rho_C(\delta) = \frac{1}{1 + \sum_{i=1}^{N-1}\prod_{j=1}^{i}\dfrac{T_j^-}{T_j^+}}, \qquad \frac{\rho_C(\delta)}{\rho_D(\delta)} = \prod_{i=1}^{N-1}\frac{T_i^+}{T_i^-}, \qquad (93)$$

where $T_i^+$ (resp. $T_i^-$) is the probability of the transition $i \to i + 1$ (resp. $i \to i - 1$).

Suppose that the population is composed of $i$ cooperators and $N - i$ defectors. Then, the payoffs of a cooperator and a defector are given, respectively, by

$$\Pi_{C,i} = \frac{(i-1)R + (N-i)S}{N-1}, \qquad \Pi_{D,i} = \frac{iT + (N-i-1)P}{N-1}. \qquad (94)$$

$T_i^+$ is the probability that a defector, chosen to update its strategy, becomes a cooperator. This occurs with probability

$$\begin{aligned}
T_i^+ &= \frac{N-i}{N} \times E\Big[\frac{1}{1 + e^{\delta(\Pi_{D,i} - \alpha_D)}}\Big] \times \frac{i\,f_{C,i}}{i\,f_{C,i} + (N-i-1)\,f_{D,i}}\\
&= \frac{(N-i)\,i}{2N(N-1)} + \delta\cdot\frac{(N-i)\,i}{4N(N-1)} \times \Big[\tilde{\alpha}_D - \Pi_{D,i} + \frac{2(N-i-1)}{N-1}\big(\Pi_{C,i} - \Pi_{D,i}\big)\Big] + O(\delta^2).
\end{aligned} \qquad (95)$$

$T_i^-$ is the probability that a cooperator, chosen to update its strategy, becomes a defector. This occurs with probability

$$\begin{aligned}
T_i^- &= \frac{i}{N} \times E\Big[\frac{1}{1 + e^{\delta(\Pi_{C,i} - \alpha_C)}}\Big] \times \frac{(N-i)\,f_{D,i}}{(i-1)\,f_{C,i} + (N-i)\,f_{D,i}}\\
&= \frac{(N-i)\,i}{2N(N-1)} + \delta\cdot\frac{(N-i)\,i}{4N(N-1)}\Big[\tilde{\alpha}_C - \Pi_{C,i} + \frac{2(i-1)}{N-1}\big(\Pi_{D,i} - \Pi_{C,i}\big)\Big] + O(\delta^2).
\end{aligned} \qquad (96)$$

Therefore, the ratio of transition probabilities is

$$\frac{T_i^+}{T_i^-} = 1 + \frac{\delta}{2}\Big[\frac{3N-5}{N-1}\big(\Pi_{C,i} - \Pi_{D,i}\big) + \Delta\alpha\Big] + O(\delta^2). \qquad (97)$$

Inserting Eq (97) in Eq (93), after expanding up to first order in $\delta$, we have

$$\begin{aligned}
\rho_C(\delta) &= \frac{1}{N} + \frac{\delta}{N^2}\sum_{i=1}^{N-1}(N-i)\,\frac{d}{d\delta}\Big(\frac{T_i^+}{T_i^-}\Big)\Big|_{\delta=0} + O(\delta^2)\\
&= \frac{1}{N} + \delta\cdot\frac{N-1}{4N}\Big[\frac{3N-5}{3(N-1)^2}\Big((N-2)R + (2N-1)S - (N+1)T - (2N-4)P\Big) + \Delta\alpha\Big] + O(\delta^2)
\end{aligned} \qquad (98)$$

and

$$\begin{aligned}
\frac{\rho_C}{\rho_D}(\delta) &= 1 + \delta\cdot\sum_{i=1}^{N-1}\frac{d}{d\delta}\Big(\frac{T_i^+}{T_i^-}\Big)\Big|_{\delta=0} + O(\delta^2)\\
&= 1 + \delta\cdot\frac{N-1}{2}\Big[\frac{3N-5}{2(N-1)^2}\Big((N-2)R + NS - NT - (N-2)P\Big) + \Delta\alpha\Big] + O(\delta^2).
\end{aligned} \qquad (99)$$

Note that Eqs (98) and (99) are valid for any finite population size $N \geq$