[PDF] Evolutionary dynamics, intrinsic noise and cycles of co-operation

Abstract

We use analytical techniques based on an expansion in the inverse system size to study the stochastic evolutionary dynamics of finite populations of players interacting in a repeated prisoner's dilemma game. We show that a mechanism of amplification of demographic noise can give rise to coherent oscillations in parameter regimes where deterministic descriptions converge to fixed points with complex eigenvalues. These quasi-cycles between co-operation and defection have previously been observed in computer simulations; here we provide a systematic and comprehensive analytical characterization of their properties. We are able to predict their power spectra as a function of the mutation rate and other model parameters, and to compare the relative magnitude of the cycles induced by different types of underlying microscopic dynamics. We also extend our analysis to the iterated prisoner's dilemma game with a win-stay lose-shift strategy, appropriate in situations where players are subject to errors of the trembling-hand type.

Full PDF

aa r X i v : . [ q - b i o . P E ] J un Evolutionary dynamics, intrinsic noise and cycles of co-operation

Alex J. Bladon, ∗ Tobias Galla, † and Alan J. McKane ‡ Theoretical Physics, School of Physics and Astronomy,University of Manchester, Manchester M13 9PL, United Kingdom (Dated: October 23, 2018)We use analytical techniques based on an expansion in the inverse system size to study thestochastic evolutionary dynamics of ﬁnite populations of players interacting in a repeated prisoner’sdilemma game. We show that a mechanism of ampliﬁcation of demographic noise can give rise tocoherent oscillations in parameter regimes where deterministic descriptions converge to ﬁxed pointswith complex eigenvalues. These quasi-cycles between co-operation and defection have previouslybeen observed in computer simulations; here we provide a systematic and comprehensive analyticalcharacterization of their properties. We are able to predict their power spectra as a function of themutation rate and other model parameters, and to compare the relative magnitude of the cyclesinduced by diﬀerent types of underlying microscopic dynamics. We also extend our analysis to theiterated prisoner’s dilemma game with a win-stay lose-shift strategy, appropriate in situations whereplayers are subject to errors of the trembling-hand type.

PACS numbers: 02.50.Le, 05.10.Gg, 02.50.Ey, 87.23.Kg

I. INTRODUCTION

Traditionally, modellers in biology and related disci-plines use deterministic ordinary or partial diﬀerentialequations to capture the quantitative behavior of dynam-ical systems in those ﬁelds. Such an approach is validand accurate only if stochastic eﬀects induced by exter-nal or intrinsic ﬂuctuations can be neglected. Externalnoise might result from environmental factors or as anattempt to include the eﬀects of numerous, but weak,external eﬀects. Intrinsic ﬂuctuations arise from the dy-namics of the system itself. One of the most commonsources of such stochasticity in biology is discretizationnoise in systems composed of a ﬁnite number of interact-ing individuals. While deterministic descriptions can bederived, and shown to be exact in the limit of inﬁnite sys-tem size, ﬁnite systems retain an intrinsic randomness,sometimes referred to as demographic noise [1]. Suchﬂuctuations can invalidate conclusions based on the anal-ysis of the deterministic dynamics, turning deterministicﬁxed points into stochastic quasi-cycles, inducing helicalmotion about limit cycles [2], or giving rise to Turingpatterns induced by intrinsic noise [3, 4]. The existenceof stochastic quasi-cycles has been known for a numberof decades in the context of predator-prey-like systems,and methods have been devised to distinguish them fromnoisy limit cycles [5] .Only very recently have systematic methods, based ona system-size expansion of the master equation describ-ing the microscopic stochastic processes, been devised tostudy them analytically [6]. These methods use an ex-pansion in the inverse system size [7], and are now be- ∗ Electronic address: [email protected] † Electronic address: [email protected] ‡ Electronic address: [email protected] ing applied to a number of ﬁelds in which quasi-cycleshave been reported, including epidemiology [8–10], bio-chemical reactions [11], gene regulation [12], and morerecently learning algorithms of interacting agents [13].The purpose of the present work is to apply these ideasto problems in evolutionary game theory, and to providean analytical characterization of stochastic quasi-cyclesfound in computer simulations of populations of inter-acting players [14].Evolutionary dynamics in this context is a mathemat-ical framework describing co-evolving populations. It isthe main tool-kit used in attempts to reconcile the evolu-tion of co-operation with Darwinian natural selection —a problem which was listed as one of the 25 most press-ing scientiﬁc challenges in Science magazine in 2005 [15].The problem of how mutual co-operation is sustained ina population subject to selection pressure favoring self-ish behavior is most commonly modeled using the pris-oner’s dilemma (PD) game [16, 17]. The PD is a classicgame-theory problem in which two players have to si-multaneously choose whether to co-operate or to defect.Although the payoﬀ for mutual co-operation is higherthan that for mutual defection, the payoﬀ for defectingwhen the other player co-operates is higher still. De-fection then forms the Nash equilibrium of the game, i.e.the outcome one may expect if the interacting players arefully rational. A number of experiments have been per-formed in behavioral game theory (examples are [18, 19])and biological realizations of the PD include the studyof competitive interaction among viruses, see e.g. [20].An extension of the basic PD game considers repeatedinteraction of a given pair of players. The space of avail-able strategies then becomes too large to allow for anexhaustive analysis. Most studies therefore focus on aselected set of strategies, such as always defect (AllD),always co-operate (AllC), tit-for-tat (TFT) or win-staylose-shift (WSLS). AllC players always co-operate in anyiteration, and similarly AllD players always defect. TFTco-operates in the ﬁrst round and then copies its op-ponent’s previous move. This strategy emerged as thewinner in a computer tournament run by Axelrod in1981 [16]. Since then TFT has been the subject of a largebody of work [14, 21–23]. Even though TFT won a subse-quent second competition as well, TFT is not perfect. Inmore realistic situations where players can make mistakesTFT can become locked into patterns of alternative co-operation and defection with another TFT player [24]. Itis also vulnerable to invasion from co-operators via neu-tral drift. Nowak and Sigmund [25] then proposed WSLS;this strategy has none of the above disadvantages. WSLSco-operates in the ﬁrst round and then keeps playing thesame action (co-operate or defect) if it receives a favor-able payoﬀ, and switches from one action to the otherif it does not. It can resist neutral drift by co-operatorsand can correct mistakes, avoiding disadvantageous cy-cles. There is evidence to suggest that some animalsemploy these strategies, for example, in their behavior inthe presence of predators [26, 27].Historically, the analysis of evolutionary dynamics hasmostly been based on deterministic replicator dynam-ics [28], explicitly excluding stochastic eﬀects. More re-cently, methods from statistical physics and the theoryof stochastic processes have been used to study gamesin ﬁnite populations. In the absence of mutation, a ﬁ-nite population will always ﬁx on a given strategy due tostochastic ﬂuctuations. The resulting ﬁxation probabili-ties and average ﬁxation times can be calculated [29–31].Further quantities of interest are stationary distributionsof the underlying stochastic processes [23, 32–34], andthe phenomenon of dynamic drift [35].In the context of these studies of stochastic processes ingame theory, cyclic behavior has been reported [36–38] inthe rock-papers-scissors game, and in [14], where stochas-tic quasi-cycles between co-operation and defection havebeen observed in ﬁnite populations of agents playing theiterated PD. In the present work we will focus on thelatter game, and provide an analytical theory which al-lows one to compute properties such as power spectra,or equivalently the correlation functions of these quasi-cycles, to a good approximation in the limit of large,but ﬁnite populations. Based on this analytical approachwe are able to identify regions in parameter space wherestochastic quasi-cycles would be expected to occur, andwe compare the amplitude of cycles arising from diﬀerenttypes of microscopic update dynamics.The paper is organized as follows. In Sec. II we out-line the iterated PD and deﬁne the diﬀerent microscopicprocesses. We ﬁrst focus on the case of only three purestrategies, AllC, AllD, and TFT. The deterministic anal-ysis for this model is presented in Sec. III with a classi-ﬁcation of the ﬁxed points and an exploration of the pa-rameter space. We move from a deterministic descriptionto a stochastic formulation in Sec. IV and consider eﬀectsarising in ﬁnite populations. In particular, we carry outa system-size expansion of the master equation allowingus to classify the periodic stochastic deviations from the deterministic limit. In Sec. V we extend the analysis toinclude WSLS as a fourth strategy. Finally, in Sec. VI,we summarize our ﬁndings and outline avenues of futureresearch.

II. MODEL AND DEFINITIONSA. Iterated PD

We will mostly follow the setup of Imhof et al [14]. Anexception will be when we discuss the extension of themodel in Sec. V. As such, we will consider a popula-tion of N players with each player carrying out one ofthree pure strategies: AllC, AllD, or TFT. The respec-tive payoﬀs resulting from an encounter of two players ischaracterized by the following payoﬀ matrix:  AllC AllD T F TAllC Rm Sm RmAllD T m P m T + P ( m − T F T Rm − c S + P ( m − − c Rm − c  , (1)where m is the number of rounds played when two play-ers meet. The parameters T, R, P, and S are the payoﬀsof the basic PD game (in which players meet only once): T is the temptation to defect, i.e. the payoﬀ a defec-tor receives when playing a co-operator, R is the rewardfor mutual cooperation, P is the punishment for mutualdefection, and S is the sucker’s payoﬀ for co-operatingwith a defector. The so-called complexity cost, c , is im-posed on the TFT strategy and represents the alloca-tion of resources used to remember an opponent’s lastmove [14, 39]. For the dilemma to be present we requirethat the parameters satisfy T > R > P > S and also that

R > ( T + S ) /

2, to prevent mutual alternate co-operationand defection being more proﬁtable that of mutual coop-eration [16]. Throughout this paper we use the speciﬁcparameter values T = 5, R = 3, P = 1, S = 0 .

1, and m = 10 [14]. In the terminology of game theory, theiterated PD as deﬁned by the above payoﬀ matrix is anon-cooperative symmetric game.In the following we will label the strategies AllC, AllD,and TFT by i = 1 , , i will be denoted by n i , and we require that n + n + n = N . The expectedpayoﬀ, or ﬁtness, of a player of type i is then given by π i = P j a ij n j − a ii N − , (2)where a ij are the elements of the payoﬀ matrix, Eq. (1),e.g. a = Rm , a = Sm , etc. In using the deﬁni-tion (2) we follow the choices of [17] and exclude inter-actions of one individual with itself. Deﬁnitions withself-interaction are possible, the diﬀerences do not aﬀectthe results to the order in inverse system size we will beworking at.The so-called reproductive ﬁtness of an agent carryingpure strategy i , f i , is deﬁned as [17] f i = 1 − w + wπ i , (3)where w is a selection strength that determines the im-pact that the game has on the agent’s overall ﬁtness. If w = 0, then f i = 1 for all i , and one recovers the limit ofneutral selection. For w > φ = X i n i N f i , (4)and the average payoﬀ is π = P i ( n i /N ) π i . In order tocomplete the model we need to specify the microscopicdynamics of the system, i.e. we need to deﬁne the rulesby which the composition of the population changes overtime. There are several such microscopic processes whichhave been studied in the literature, and we will deﬁnesome of these in Sec. II C. Before we do so, it will howeverbe helpful to discuss the standard replicator-mutator dy-namics commonly considered in the literature. Providedthe microscopic dynamics are chosen appropriately, theseequations are a suitable description in the deterministiclimit, valid for inﬁnite populations. It is however im-portant to stress that the replicator-mutator dynamicsare not the limiting deterministic dynamics for all micro-scopic processes, as pointed out in [32, 40], and as we willdiscuss in more detail below. B. Canonical replicator-mutator equation

Within the standard replicator dynamics of evolution-ary game theory the time evolution of the concentrationof a strategy i is given by [28]˙ x i = x i ( f ∞ i − φ ∞ ) , (5)where x i denotes the concentration of strategy i in thepopulation in the deterministic limit: x i = lim N →∞ n i /N .Similarly, we will write f ∞ i = lim N →∞ f i = 1 − w + w X j a ij x j , (6)and φ ∞ = lim N →∞ φ = X j x j f ∞ j , (7)where the superscripts indicate that Eq. (5) are, for suit-ably chosen microscopic dynamics, valid only in the de-terministic limit of inﬁnite populations. The basic as-sumption underlying these dynamics is that individualsreproduce asexually in proportion to their reproductiveﬁtness, and that oﬀspring inherit the strategy of theirparent. If one introduces mutation, so that there is a ﬁnitechance that a player will produce an oﬀspring which doesnot use the same strategy as their parent, the above dy-namics needs to be modiﬁed, and the description is thenin terms of so-called replicator-mutator equations [41].Focusing on the case of M pure strategies we will as-sume that in a reproduction event a player produces anexact copy of itself with probability 1 − ( M − u anda mutant which plays one of the other M − u . The parameter u is conﬁned tothe physically meaningful range 0 < u ≤ /M for thecase of M pure strategies. For u = 1 /M an oﬀspring willbe of any of the M types with equal probability 1 /M . Itis convenient to introduce a mutation matrix q ij = (cid:26) − ( M − u if i = ju if i = j . (8)The replicator-mutator equation is then given by [41]˙ x i = X j x j f ∞ j q ji − x i φ ∞ . (9)In the limit of zero mutation Eq. (9) reduces to the stan-dard replicator equation, Eq. (5). C. Microscopic dynamics

We will now deﬁne the diﬀerent microscopic processeswe will consider. We will restrict ourselves to dynamicsconserving the total number of players in the population.To specify a process it is then suﬃcient to deﬁne the‘conversion’ rates T i → j , corresponding to events in whicha player of type i is replaced by one of type j . For thegeneral case with M pure strategies i, j = 1 , . . . , M . Wewill limit the discussion to processes of the general form T i → j = X k n k N n i N g ki ( f ) q kj , (10)where f = ( f , . . . , f M ). The form (10) is found by,at each time step, selecting two players, one for poten-tial reproduction and one for potential removal, from thepopulation. The player selected for potential removal isassumed to be of type i , and each term in the sum cor-responds to selecting a player of type k for potential re-production. A given combination ( i, k ) thus occurs withprobability ( n i n k ) /N . Here we use sampling with re-placement. Dynamics without replacement of an alreadychosen player are equally possible, leading to, for exam-ple, factors of N ( N −

1) in the denominator instead of N . The diﬀerences amount to eﬀects of order N − , anddo not aﬀect results to the order in the inverse systemsize we will be working at. For a given pair of selectedplayers reproduction and death actually only occur at arate proportional to g ki ( f ), which here we assume to bea function of the reproductive ﬁtnesses (implying a possi-ble dependence on the average ﬁtness φ ). The factor q kj accounts for potential mutation events. The four kindsof microscopic dynamics we will consider in the followingdiﬀer in the details of the function g , which we will de-scribe below. Choices in which g ki does not depend on f correspond to neutral selection.Before we deﬁne the details of the diﬀerent microscopicdynamics some general statements are appropriate. Forsimplicity, we will focus on the case of a game with M = 3pure strategies and in particular the iterated PD gamewith strategies AllC, AllD and TFT as introduced above;generalization to an arbitrary number of strategies M is however straightforward. For any choice of g ki , thestate of the N -player population is deﬁned by the num-ber of individuals using the AllC and AllD strategies: n = ( n , n ), the number of TFT players is then givenby n = N − n − n . Furthermore the reproductive ﬁt-nesses f and the average reproductive ﬁtness, φ , are fullydetermined by the state n of the system (see Eqs. (2)-(4)). It follows that the transition rates T i → j can bewritten as functions of n , and the microscopic stochasticprocess is described by the following master equation forthe probability, P ( n , t ), of the system being in state n : dP ( n , t ) dt = ( b E − T → ( n ) P ( n , t ) (11)+ ( b E − T → ( n ) P ( n , t )+ ( b E b E − − T → ( n ) P ( n , t )+ ( b E b E − − T → ( n ) P ( n , t )+ ( b E − − T → ( n ) P ( n , t )+ ( b E − − T → ( n ) P ( n , t ) . Here we have introduced shift operators b E i , where i =1 ,

2, acting on functions of the state of the system, ψ ( n , n ), as follows: b E ψ ( n , n ) = ψ ( n + 1 , n ) , b E − ψ ( n , n ) = ψ ( n − , n ) . (12)Similar deﬁnitions apply for b E and b E − . Multiplyingboth sides of Eq. (11) by n and summing over all possiblestates and using a decoupling approximation, valid for N → ∞ , we can then write the rate of change of theconcentration of strategy i as˙ x i = X k = i [ T k → i ( x ) − T i → k ( x )] , (13)after a re-scaling of time by a factor N . Clearly this equa-tion is not restricted to the case M = 3, and holds whenan arbitrary number of strategies are present. Transitionrates in Eq. (13) are found from Eq. (10) using the sub-stitution n i /N → x i and f i → f ∞ i , see Appendix A forfurther details. Substituting these limits into Eq. (13)one ﬁnds that the deterministic evolution of the concen-trations of strategies is given by˙ x i = X k = i X j x j [ x k g jk q ji − x i g ji q jk ] . (14) For diﬀerent update rules this equation diﬀers only in thespeciﬁc form of g used.We will now proceed to give the speciﬁc form of thefunction g ki ( f ) for a set of diﬀerent update rules whichhave previously been proposed: the Moran process, alinear Moran process, a local process and the Fermi pro-cess [32, 35].

1. The Moran process

In the Moran process [42], once a player of type k hasbeen chosen for potential reproduction and a player oftype i for potential removal, the reproduction event oc-curs at a rate proportional to f k /φ , speciﬁcally we willchoose g Mki ( f ) = f k φ . (15)The arbitrary pre-factor of 1 /

2, equivalent to choosinga time scale, has been introduced to allow better com-parison with other update rules [35]. By substitutingEq. (15) into Eq. (14) and using Eq. (7), P k x k = 1, and P k q jk = 1, one ﬁnds speciﬁcally for the Moran processthat ˙ x i = P j x j f ∞ j q ji − x i φ ∞ φ ∞ . (16)It is important to stress that the average reproductive ﬁt-ness φ ∞ is a function of the concentration vector x , andso φ ∞ is a time-dependent quantity. While Eq. (16) issimilar to the standard replicator-mutator dynamics, thepre-factor (2 φ ∞ ) − corresponds to a dynamic re-scalingof time, and so may aﬀect the transient dynamics. Thelocation of ﬁxed points and their local stability are how-ever not aﬀected, as a straightforward calculation shows.

2. The linear Moran process

The linear Moran process is deﬁned by the followingchoice [35] g LMki ( f ) = 12 (1 + λ ( f k − φ )) , (17)where λ > T i → j ≥

0. Notice also that onehas g LMki ( f ) = (1 + λw ( π k − π )) /

2. A common choice,which we will adopt in the following, is λ = 1 / ∆ f max ,where ∆ f max is the maximum possible diﬀerence between f i and φ [35], i.e. ∆ f max = max k, n | f k ( n ) − φ ( n ) | . Inthe absence of mutation ( u = 0) the deterministic limitresults in the following dynamics˙ x i = x i ( f ∞ i − φ ∞ )2∆ f max . (18)Therefore, up to a re-scaling of time by the constant fac-tor (2∆ f max ) − , the linear Moran process without mu-tation is described by the standard replicator dynamicsin the limit of inﬁnite population size. However for u = 0one does not recover the standard replicator-mutatorequations, Eq. (9), from the linear Moran process.In both Eq. (18) and (for u = 0) from the result of sub-stituting Eq. (17) into Eq. (14), the reproductive ﬁtnessonly enters in diﬀerences of the form f k − φ or f k − f i and is normalized by ∆ f max . Since both, ﬁtness diﬀer-ences and ∆ f max , scale linearly in w , the deterministicdynamics is independent of the selection strength w forthe linear Moran process. Finally, the linear Moran pro-cess can be obtained from Moran process Eq. (15) in theweak selection limit, w ≪

1. Using Eq. (3) one has g Mki ( f ) = 1 − w + wπ k − w + wπ )= 12 (1 + w ( π k − π )) + O ( w ) , (19)so that to linear order one recovers Eqs. (17) with thechoice λ = 1.

3. The local process

The so-called local process was ﬁrst proposed byTraulsen et al [40] and is based on a pairwise comparisonof one agent’s ﬁtness with another in order to determinewhether or not reproduction occurs. This process hasthe advantage that no knowledge or computation of theaverage ﬁtness of the population is required to carry outa microscopic step. The local process is deﬁned by g Lki ( f ) = 12 (cid:18) f k − f i ∆ f max (cid:19) , (20)where ∆ f max is again required for normalization andﬁxed at the beginning, and then remains unchanged asthe dynamics proceeds. As opposed to the case of thelinear Moran process, ∆ f max is now the maximum possi-ble absolute diﬀerence between any two ﬁtnesses f i and f k : ∆ f max = max i,k, n | f i ( n ) − f k ( n ) | . As with the lin-ear Moran process the local process does, up to a con-stant factor which can be absorbed in the deﬁnition oftime, reproduce the standard replicator equation (5) inthe deterministic limit if mutation is absent [32, 40]. Atﬁnite mutation rates one does not however recover thereplicator-mutator equation, Eq. (9) [32].

4. The Fermi process

Finally, the so-called Fermi process is an alternativepairwise comparison process which uses the Fermi-Diracdistribution instead of the linear functional dependenceon ﬁtness diﬀerences as in the local process. It is deﬁned by [32, 43] g Fki ( f ) = 12 (cid:20) (cid:18)

12 ( f k − f i ) (cid:19)(cid:21) . (21)Unlike the other update rules, the Fermi process does notreproduce either the standard replicator or replicator-mutator equations in the deterministic limit. FromEq. (21) one has g Fki ( f ) = 12 (cid:16) w π k − π i ) (cid:17) + O ( w ) , (22)in the limit of weak selection. This is of the same func-tional form as Eq. (20). III. RESULTS OF THE DETERMINISTICANALYSIS

As an initial step towards characterizing the outcomeof the iterated PD, we compute the ﬁxed-point structurein the deterministic limit of the four diﬀerent dynamicsdeﬁned above, as function of the complexity cost c andthe mutation rate u . We ﬁx the selection strength to w = 1 throughout. A. General ﬁxed point structure

The qualitative picture one ﬁnds is similar for any ofthe four dynamics; two diﬀerent threshold values of themutation rate can be identiﬁed, we will refer to theseas u (1) c and u (2) c . For 0 < u < u (1) c , one typically ﬁndsthree ﬁxed points: a locally stable attractor near AllD, asaddle point also near AllD, and an unstable ﬁxed point,located close to the AllD/TFT edge of the strategy sim-plex, see Fig. 1(a). Following [14] we will refer to thislatter ﬁxed point as the ‘mixed ﬁxed point’. At u = u (1) c the mixed ﬁxed point becomes stable, as shown in panel(b) of Fig. 1. At u = u (2) c , the two ﬁxed points near AllDannihilate, leaving the mixed ﬁxed point as the only at-tractor for u > u (2) c , see Fig. 1(c). At u = 1 / u an individual of any type produces an oﬀspring ofany of the three diﬀerent strategies with equal probabil-ity. While this qualitative picture is the same for all fourdynamics considered here, the numerical values of u (1) c and u (2) c will in general be diﬀerent for the diﬀerent dy-namics, and they may also depend on the choice of thecomplexity cost, c . The overall picture is consistent withthe results of [14], where the standard replicator dynam-ics were studied and where similar qualitative behaviorwas found. Our analysis thus demonstrates that the ﬁnd-ings of [14] generalize to a broader class of dynamics. Theonly diﬀerence between our ﬁndings compared to those ofearlier analyses, lies in the saddle point described above, FIG. 1: (Color online) Fixed-point structure and ﬂow ﬁeldsof the standard replicator-mutator equations for the iteratedPD at c = 0 . w = 1. Black symbols are stable ﬁxedpoints, white symbols are unstable and gray symbols denotesaddle points. The AllD ﬁxed point and saddle point canbe seen in the bottom left-hand corner of the simplices. Themixed ﬁxed point (triangle) changes stability at u (1) c ≈ . u (2) c ≈ . || ˙ x || . Theseimages were produced using a modiﬁed version of the Dynamopackage [44]. which was not reported in [14], presumably because itis not an attractor of the dynamics. Although for com-pleteness we have given a general account of the ﬁxedpoint structure, the mixed ﬁxed point will be the focusof our analysis in the following sections, as it is this ﬁxedpoint which gives rise to coherent oscillations induced bydemographic noise. B. Limit of small mutation rates

Further analytical progress is possible in the limit ofsmall mutation rates, u ≪

1. For all four cases consideredhere the deterministic dynamics, derived from Eqs. (13),are of the form ˙ x = h (0) ( x ) + u h (1) ( x ) , (23)with the mutation rate entering linearly in the resultingdiﬀerential equations. We will now make the followingansatz for a ﬁxed point x ∗ : x ∗ = x ∗ (0) + u x ∗ (1) , (24) -8 -6 -4 -2 u x A ll D x AllC

AllDSaddleMixed (a) (b)

FIG. 2: (Color online) Panel (a) shows the x AllD componentsof the three ﬁxed points of the Moran dynamics at c = 0 . u . Circles are the results fromnumerically solving the ﬁxed-point equations (25), lines arefrom Eq. (24). Panel (b) shows the paths of the AllD ﬁxedpoint and saddle point in phase space as u is varied. Theﬁxed points meet and annihilate at u ≈ . u largerthan this value neither ﬁxed point is present. in the limit of small mutation rates u . Here x ∗ (0) is aﬁxed point of Eq. (13) at u = 0 and x ∗ (1) captures theeﬀect of non-zero mutation rates at next-to-leading order.Inserting Eq. (24) into the ﬁxed-point condition h (0) ( x ∗ ) + u h (1) ( x ∗ ) = 0 , (25)and collecting terms in linear in u , one then ﬁnds x ∗ (1) = − J − h (1) ( x ∗ (0) ) , (26)where J is the Jacobian of h (0) evaluated at x ∗ = x ∗ (0) .Since x ∗ (0) can be found in closed form from Eqs. (13)with u = 0, substituting, Eq. (26) into Eq. (24) gives ananalytical prediction of the location of the ﬁxed point atsmall u .In Fig. 2 we compare the outcome of the above linearexpansion with results from a direct numerical evaluationof the ﬁxed points of Eq. (13) obtained using a Newton-Raphson procedure. The expansion is seen to be a goodapproximation for the location of the ﬁxed point for val-ues of u up to u ≈ .

01. From the ﬁgure we see that theAllD and saddle ﬁxed points annihilate at u (2) c ≈ . c . This annihilation is consistent with thedisappearance of the AllD ﬁxed point at large u reportedby Imhof et al [14]. C. Mixed ﬁxed point and phase diagram

For suitable choices of the model parameters c and u ,the mixed ﬁxed point can be a stable attractor with com-plex eigenvalues of the corresponding Jacobian. One can c -8 -6 -4 -2 u Stable spiralsLimit cyclesStable nodes Unstable spirals FIG. 3: A phase diagram showing the regions in the ( c, u )plane for which the mixed ﬁxed point of Eq. (13) for theMoran process is a stable node, stable spiral, unstable spiralor is orbited by a limit cycle. The solid line at u = 1 / x = (1 / , / thus expect coherent stochastic oscillations to arise inﬁnite populations at those model parameters. We there-fore focus our attention on the mixed ﬁxed point, andidentify the regions in the ( c, u )-plane where such com-plex eigenvalues are found. More generally we will deter-mine the nature of the mixed ﬁxed point as a functionof u and c . The result of numerically solving for ﬁxedpoints of the deterministic dynamics corresponding tothe Moran process is shown in Fig. 3. We will denoteﬁxed points with purely real eigenvalues as ‘nodes’ andthose with complex eigenvalues as ‘spirals’. At low valuesof c we observe a re-entry phenomenon, where the mixedﬁxed point goes from a stable spiral to a stable node andback to a stable spiral as u is decreased.Numerically we also observe a region where the dy-namics converges onto a limit cycle. We are at this pointunable to provide a proof for the existence of limit cyclesor to analytically determine the position of the borderbetween the limit cycle and unstable spiral regions. Wetherefore determine the presence of limit cycles by nu-merically integrating the deterministic dynamics usingan Euler forward method, starting from the center of thesimplex, allowing for a period of equilibration and thenapplying a suitable threshold criterion to detect closedtrajectories. The unstable spiral region is identiﬁed asthe region where we do not ﬁnd limit cycles numerically.In situations where there is more than one attractor (e.g.a limit cycle and a stable ﬁxed point near AllD) initialconditions will determine the stationary state of the dy-namics. At u = 1 / c -8 -6 -4 -2 u Stable spiralsLimit cyclesUnstable spiralsStable nodes FIG. 4: Phase diagram for the Fermi process. We again seethe same qualitative structure as for the other update rules,with a diﬀerence only in the positions of the boundaries. same qualitative features as the Moran process, andhence their phase diagrams are structurally similar tothat shown in Fig. 3, except that the mixed ﬁxed pointdoes not become a stable node at x = (1 / , /

3) at u = 1 / c, u ) plane maydiﬀer for each update rule. For example, Fig. 4 shows thestability map for the Fermi process. Here the re-entry re-gion persists for larger values of c and the region in whichthe mixed ﬁxed point is unstable is also much larger. IV. STOCHASTIC EFFECTS ANDSYSTEM-SIZE EXPANSION

Until now we have focused on the dynamics of inﬁ-nite populations. In this section we investigate eﬀectsarising in ﬁnite populations, especially stochastic oscil-lations arising via a coherent ampliﬁcation of intrinsicﬂuctuations. Such oscillations have been found in a va-riety of systems as described in the introduction. Thesequasi-cycles typically arise in regions of the phase dia-gram where the deterministic dynamics approach a ﬁxedpoint, and so the range of parameters in which systems ofﬁnite populations display oscillations is generally widerthan the region in which the deterministic system allowsfor periodic solutions. Fig. 5 indeed conﬁrms that this isalso the case for the evolutionary dynamics of the iteratedPD. In the ﬁgure we choose model parameters such thatnone of the four deterministic dynamics approach peri-odic attractors, but instead have stable ﬁxed points withcomplex eigenvalues (stable spirals). As seen in the ﬁgurethe dynamics in ﬁnite populations still generate oscilla-tory behavior, induced by intrinsic ﬂuctuations. This os- -0.0500.05-0.0500.05-0.0500.051000 1100 1200 1300 1400 1500-0.0500.05

MLMLF x - < x > t FIG. 5: (Color online) The results for the concentration of theAllC strategy from one run of a Gillespie simulation [45, 46]for each of the four update rules at N = 10000, c = 0 .

8, and u = 0 .

05. The time averaged concentration of each run hasbeen subtracted from the data to give the deviation from thedeterministic ﬁxed point. cillatory behavior is similar to that reported in [14]. Thefour panels demonstrate that the quality and frequency ofthese stochastic oscillations can vary over a wide rangedepending on the details of the microscopic dynamics,and so we will now go on to characterize their propertiesin more detail in order to obtain a more comprehensivepicture of this phenomenon.The analytical approach we will use to characterizethese ﬂuctuations is based on an expansion of the mas-ter equation in the inverse system size [7]. This methodis now standard in the analysis of interacting-agent sys-tems, and we will therefore not present the full details ofthe calculation, but instead restrict ourselves to giving afew of the intermediate steps and the ﬁnal results. Thestarting point of the system-size expansion is an ansatzof the type n i N = x i ( t ) + 1 √ N ξ i ( t ) , (27)amounting to a separation of deterministic and stochas-tic contributions to the number, n i , of individuals of type i in the population. The ﬁrst term on the right, x i ( t ), isthe deterministic trajectory, and ξ i ( t ) captures ﬂuctua-tions about this trajectory; the magnitude of these ﬂuc-tuations is expected to be of order N − / , as reﬂected bythe above ansatz. One proceeds by inserting this ansatzinto the master equation (11) and focuses on the prob-ability distribution of ξ , rather than that of n , so thatone sets P ( n , t ) = Π( ξ , t ). Expanding the resulting mas-ter equation for Π( ξ , t ) in powers of N − / one recov-ers, at leading order, the generalized replicator-mutatorequation, Eq. (13). At next-to-leading order in N − / a Fokker-Planck equation of the form ∂ Π ∂t = − X i ∂∂ξ i ( C i Π) + 12 X i,j B ij ∂ Π ∂ξ i ∂ξ j , (28)is found [7], where C i = P k J ik ξ k . Here J is the Jacobianof Eq. (13) and B is a symmetric, 2 × ξ = J ξ + η , (29)where η is Gaussian white noise with correlations h η i ( t ) η j ( t ′ ) i = B ij δ ( t − t ′ ) . (30)In contrast to the Langevin equations derived using theKramers-Moyal expansion [32], Eq. (29) contains addi-tive, rather than multiplicative noise. In the applicationwe are considering here, we are interested in ﬂuctuationsabout the stationary state and so the matrices J and B are evaluated at the ﬁxed point of the deterministicdynamics.Given the linearity of Eq. (29), it is straightforward tocompute the power spectra of the ﬂuctuations ξ aboutthe deterministic ﬁxed point. Following the steps of [11],one obtains P i ( ω ) = D | e ξ i ( ω ) | E = X j X k Φ − ij B jk (Φ † ) − ki , (31)where Φ = iω I − J and I is the 2 × c = 0 . w = 1), u = 0 . N = 12000. Theoretical predictions from the vanKampen expansion and numerical simulations are in nearperfect agreement. The spectrum shows a pronouncedpeak at a frequency of approximately ω = 0 .

05, indicat-ing the existence of ampliﬁed oscillations with that char-acteristic frequency. The amplitude of these oscillationsis proportional to N − / , see Eq. (27), the proportional-ity constant is determined by the area under the powerspectrum. Depending on the choice of parameter valuesone can then expect the amplitude of the quasi-cycle willbe of order one up to system sizes of 10 or so, i.e. com-parable to the species concentrations at the deterministicﬁxed point. Even for very large populations the oscilla-tions can therefore be signiﬁcant. If the trajectory ofthe system is monitored over a time scale much smallerthan the oscillation period, then this may lead to inter-vals in time in which the concentration of AllC is foundto be consistently higher than that of TFT or AllD, thatis, to intermediate periods where co-operation dominatesthe population. Such eﬀects may for example be relevantwhen evolutionary time scales are much longer than timewindows over which measurements can be made. ω P i ( ω ) AllCAllDTFT

FIG. 6: (Color online) The power spectra for oscillations inthe concentrations of the three strategies with Moran updat-ing for c = 0 . u = 0 .

01. Symbols are the results of aGillespie simulation with N = 12000 and approximately 10 runs. Solid lines are theoretical predictions obtained fromEq. (31). Simulation results show excellent agreement withthe theory. ω -2 -1 P i ( ω ) MoranLinear MoranLocalFermi

FIG. 7: (Color online) A comparison of the power spectra foroscillations in the AllC strategy concentration at c = 0 . u = 0 . N ranges from 10 to 10 and the number of runsfor each simulation is of order 10 . Having shown that the analytical approach capturesthe properties of quasi-cycles accurately, we can nowcompare the magnitude of the stochastic oscillations forthe diﬀerent update processes at the same values of c and u . The power spectra of the ﬂuctuations in the AllCconcentration are shown in Fig. 7 for the four updaterules at one ﬁxed mutation rate and for a speciﬁc choiceof the complexity cost. Results indicate that the Fermiprocess produces demographic oscillations of a higher fre-quency than the other update rules, in line with the timeseries shown in Fig. 5. Even though the power spectrafor the Moran and linear Moran update rules are seem-ingly indistinguishable in Fig. 7, they are not analyticallyequivalent. The magnitude of the peak in the power spectra isa good proxy for the amplitude of the stochastic quasi-cycles, and the height of the peak is in turn largely deter-mined by the inverse of the real part of the relevant eigen-value of the deterministic dynamics at the ﬁxed point. Inthe deterministic system, perturbations about the ﬁxedpoint decay with a time constant proportional to the in-verse of this real part, and it is intuitively easy to seethat the magnitude of stochastic oscillations diverges asthe real part of the largest eigenvalue tends to zero. Morespeciﬁcally, as shown in [48], the magnitude of the peakin the spectra diverges as the system approaches a Hopfbifurcation, where the stable spiral becomes an unsta-ble one. The resulting delta-function peak in the powerspectrum indicates that a limit cycle is born in the un-stable phase. This can also be seen from Eq. (31) and thedeﬁnition of the matrix Φ. At the Hopf bifurcation therelevant eigenvalue of J is purely imaginary, and when ω becomes equal to the imaginary part of this eigenvalue,the matrix Φ becomes singular, such that the expressionon the right-hand side of Eq. (31), involving the inverseof Φ, diverges.In order to compare the relative magnitude of stochas-tic oscillations in the four diﬀerent dynamics at ﬁxed val-ues of u and c , it is therefore useful to determine how faror near to the instability the pair ( u, c ) places the re-spective dynamics. In Fig. 8 we plot the instability linesindicating the occurrence of a Hopf bifurcation in the( u, c ) plane for the four diﬀerent types of dynamics. Forany ﬁxed c one ﬁnds that u (1) c,F > u (1) c,L > u (1) c,LM > u (1) c,M and that accordingly for any u suﬃciently large to placeall four dynamics in the stable regime, the Fermi processis much closer to the limit-cycle regime than the othertypes of dynamics, and would therefore be expected tohave a larger peak in the power spectra. As discussedabove we furthermore ﬁnd that the Fermi process, withits alternative form of the pairwise comparison process,produces demographic oscillations of a higher frequencythan the other update rules, see Fig. 7. V. ITERATED PRISONER’S DILEMMA WITHERRORS

In this section we study an extended version of theiterated PD game, allowing for a fourth pure strategy,win-stay lose-shift (WSLS). It is appropriate to includethis strategy in the discussion when so-called ‘trembling-hand’ errors are considered [23]. Trembling-hand errorsintroduce the possibility of a player making a mistakeafter they have decided what to play, that is, a player co-operating when they meant to defect, or defecting whenthe intention was to co-operate. We will assume that thetwo players make errors of this type independently with asmall probability ǫ > c -6 -4 -2 u FermiLocalLinear MoranMoran FIG. 8: (Color online) The borders between the stable spi-ral region (above the respective lines) and limit cycle region(below) for the diﬀerent update processes. The black dot in-dicates the point ( c, u ) = (0 . , .

05) used in Fig. 7. outperformed by WSLS [17]. WSLS co-operates initiallyand then keeps using its strategy (co-operation or defec-tion) whenever it receives payoﬀ T or R and switches itsstrategy (from co-operation to defection or vice versa) ifit receives P or S . WSLS does not become locked in suchcycles when playing against TFT or WSLS. We includethe WSLS strategy in our game, extending the dynamicsto three degrees of freedom.In the presence of trembling-hand errors the outcomeof a PD game between two ﬁxed players and iteratedfor a ﬁnite number of rounds will generally be stochas-tic and depend on the timing at which errors occur inthe interaction sequence. In order to simplify matterswe will therefore follow [23] and restrict the discussionto cases in which an interaction between two playersconsists of an inﬁnite number of iterations of the PDgame. It is then appropriate to use the expected pay-oﬀs per round, i.e. for two ﬁxed players, say of types i, j ∈ { AllC, AllD, TFT, WSLS } , one formally consid-ers an inﬁnite sequence of PD interactions, and uses themean payoﬀ per round to deﬁne the payoﬀ matrix ele-ments a ij . The payoﬀ matrix can then be worked out forsmall error rates, and is given in [23] and reproduced inAppendix B for convenience. The complexity cost, c , isno longer a relevant parameter now that the number ofrounds is inﬁnite.Previous work on this game has shown that in the lim-its of zero mutation and weak selection the populationcan either ﬁx on WSLS or AllD depending on the valuesused in the payoﬀ matrix [23]. We continue to use theparameter values given in Sec. II A and explore how thedynamics of the game depend on mutation and error ratesand identify and classify demographic oscillations. Ana-lyzing the four update rules given in Sec. II C we againﬁnd a mixed ﬁxed point on which we focus our analysis— since demographic oscillations may occur about thisﬁxed point when it is a stable spiral. We use Eq. (13) -4 -3 -2 -1 u x i AllDWSLSTFTAllC

FIG. 9: (Color online) The AllC, AllD, TFT and WSLS com-ponents of the mixed ﬁxed point for the inﬁnitely iterated PDfor changing u at ǫ = 10 − . Microscopic dynamics are of theMoran type. Dashed lines are analytical results from a linearexpansion in u , circles are from a numerical evaluation of theﬁxed point. The vertical solid black line indicates the loca-tion of u (1) c , i.e. the value of u where the mixed ﬁxed pointchanges stability. -6 -4 -2 ε -6 -4 -2 u Limit cycles SpiralsaddlesStable spiralsStable spirals Stable nodes FIG. 10: The stability of the mixed ﬁxed point for Moranupdating in the inﬁnitely iterated PD game in the ( ǫ, u ) plane.Note that the border between the limit cycle and spiral saddleregions is approximate. with four strategies to track the location and stabilityproperties of the mixed ﬁxed point as u is varied. Thepath of the mixed ﬁxed point at constant ǫ and changing u for the Moran process, is shown in Fig. 9. The dashedlines are the result of a similar perturbative expansionto that carried out in Sec. III, where again we see goodagreement with numerical results for changes in u up to u ≈ . ǫ and u . The classiﬁ-cation of the nature of the ﬁxed points is more involved1 ω P i ( ω ) AllCAllDTFTWSLS

FIG. 11: (Color online) The power spectra for the four strate-gies with Moran updating at u = 0 .

02 and ǫ = 0 .

01. Symbolsare results from numerical simulations at N = 12000, solidlines are the predictions of Eq. (31). for the four-strategy game, however, as we are analyz-ing a three-dimensional dynamical system. Stable spiralsare now ﬁxed points with one pair of complex-conjugateeigenvalues with a negative real part and an additionalreal-valued negative eigenvalue. If the sign of the realpart of the pair of complex-conjugate eigenvalues is op-posite to that of the real-valued eigenvalue we will referto the ﬁxed point as a spiral saddle [49]. Fixed pointswith three real-valued negative eigenvalues of the Jaco-bian are referred to as stable nodes. The resulting phasediagram for the Moran dynamics is shown in Fig. 10. Theother three types of microscopic dynamics give qualita-tively similar phase diagrams, but the exact quantitativepositions of the various phase lines will generally be dif-ferent.When the mixed ﬁxed point is a spiral saddle the deter-ministic dynamics can either converge to a limit cycle orto the attractor at AllD, depending on initial conditions.For locations in the parameter space where the mixedﬁxed point of the deterministic dynamics is a stable spi-ral, we again observe demographic oscillations, and theycan be characterized analytically in a manner similar tothat discussed in the previous section. We depict the re-sulting power spectra for the Moran process in Fig. 11.As seen in the ﬁgure WSLS and AllD in particular un-dergo strong demographic oscillations, with a comparablemagnitude between the two strategies.As with the iterated PD considered earlier, the am-plitude of quasi-cycles resulting from an ampliﬁcation ofintrinsic ﬂuctuations can be expected to be large when aﬁxed point of the stable spiral type is located close to theborder between the stable-spiral and limit-cycle phases.There can then again be periods in time when WSLSis the most prevalent strategy, despite AllD dominatingthe ﬁxed point. The power spectra for oscillations in theconcentration of the TFT strategy resulting from the fourdiﬀerent update rules are compared in Fig. 12. We again ω P i ( ω ) MoranLinear MoranLocalFermi

FIG. 12: (Color online) The oscillations in the TFT strategyconcentration for the four update rules at u = 0 .

05 and ǫ =10 − . Symbols are from simulations, lines from the theory.System sizes in the range N = 10 to 10 are used. Forthis particular choice of parameters, the spectra for the localand Fermi processes overlap, although this is not the case ingeneral. observe that the Fermi process exhibits oscillations of ahigher frequency and with a larger amplitude than theother rules. Although spectra for the local process andthe Fermi process overlap in the ﬁgure, this is coinciden-tal at this point in parameter space, and will not be thecase in general. VI. CONCLUSION AND OUTLOOK

In this paper we have used analytical approaches basedon van Kampen’s system-size expansion to study theemergence of quasi-cycles in evolutionary games in ﬁnitepopulations. Most existing studies of such eﬀects are ofa numerical nature; we have complemented these givinga systematic account of the formalism, and derived theresulting eﬀective Langevin equations which describe thestatistics and correlations of ﬂuctuations in large, but ﬁ-nite populations. This approach and our general formu-lae are applicable to a large class of microscopic updaterules, and to games with an arbitrary number of purestrategies. They are, in principle, also valid for arbitrarymutation matrices. The results of this paper hence allowone to predict the regions in parameter space in whichcoherent quasi-cycles are to be expected, and to computetheir spectral properties. In particular coherent cycles,such as reported in game dynamical systems e.g. in [14],can be understood as a consequence of the combinationof intrinsic noise and the existence of a stable ﬁxed pointwith complex eigenvalues in the corresponding determin-istic system obtained in the limit of inﬁnite populations.In absence of noise the deterministic system approachessuch ﬁxed points in an oscillatory manner, with oscilla-tions dying away at a rate proportional to the inverse2real part of the relevant eigenvalue. In ﬁnite systems,discretization noise leads to persistent stochastic correc-tions perturbing the system at all times. In the limit oflarge, but ﬁnite system sizes these ﬂuctuations (the noise η in Eq. (29), together with the pre-factor N − / ) can beseen as a small perturbation to the deterministic dynam-ics, driving the system away from the ﬁxed point. Theattracting ﬁxed point and the oscillatory approach to iton the deterministic level on the one hand and the per-sistent intrinsic noise on the other then conspire to givecoherent and sustained stochastic cycles, with an ampli-tude largely determined by the inverse real part of theleast stable complex eigenvalue.We have applied the van Kampen formalism to the spe-ciﬁc example of the iterated PD game, where stochasticoscillations have been reported in the earlier numericalstudy [14]. We have worked out detailed phase diagramsdepicting the nature of the limiting deterministic dynam-ics and we have studied systematically how the muta-tion rate and complexity cost, the two main model pa-rameters, aﬀect the outcome of the deterministic system.Based on this analysis we are able to predict the parame-ter regimes in which stochastic oscillations occur. In par-ticular we ﬁnd that oscillation amplitudes become maxi-mal when the Hopf bifurcation line in the phase diagramis approached from within the stable phase. At the bi-furcation line the oscillation amplitude formally diverges,with the power spectrum turning into a delta-function,and as the instability line is crossed a limit cycle is born.We have also carried out a detailed comparison of fourdiﬀerent microscopic update rules; results indicate thattheir respective phase diagrams are qualitatively similar.The analysis shows that, at ﬁxed values of the model pa-rameters, the Fermi process tends to produce stochasticcycles with larger amplitudes and frequencies than theother update rules. We have extended our study to aversion of the iterated PD game in which errors of thetrembling-hand type occur with a small, but non-zerorate. The so-called win-stay lose-shift strategy has herebeen seen to out compete tit-for-tat, and accordingly we have considered a four-strategy space (always defect, al-ways co-operate, tit-for-tat and win-stay lose-shift), andhave identiﬁed the regions in parameter space where co-herent cycles are most likely to occur. Analytical resultsfor the resulting power spectra of these quasi-cycles areconﬁrmed convincingly in numerical simulations.Mathematical techniques of the type we have usedhere, most notably the master equation formalism andsystem-size expansions, were ﬁrst devised in statisticalphysics, but they are becoming increasingly more popu-lar in the game theory literature. This extends to equiv-alent approaches based on Kramers-Moyal expansions.We attribute this popularity to the generality with whichthese methods are applicable and to the fact that theyallow one to obtain an exhaustive account of the prop-erties of ﬁrst-order stochastic corrections to the limitingdeterministic dynamics. Exact analytical results can bederived for large, but ﬁnite populations, and hence thesetechniques make simulations on the microscopic level re-dundant (at least in principle). We expect this to bethe case for games with more complicated strategy struc-tures, or with interaction between more than two playerssuch as for example in public goods games. The analyt-ical approach can also be expected to be applicable toother, potentially more intricate types of human error.For such games it may be diﬃcult or time consumingto carry out reliable simulations and to perform exhaus-tive parameter searches. Analytical characterizations ofstochastic eﬀects such as the ones discussed in this papermay then be particularly welcome. Acknowledgments

We would like to thank Sven van Segbroek for makinghis modiﬁcation of the Dynamo package for Mathemat-ica 6 available to us. TG would like to thank RCUKfor support (RCUK reference EP/E500048/1). AJB ac-knowledges an EPSRC studentship. [1] R. M. Nisbet and W. S. C. Gurney

Modelling FluctuatingPopulations (John Wiley, New York, 1982).[2] R. P. Boland, T. Galla, and A. J. McKane, Phys. Rev.E. , 051131 (2009).[3] T. Butler and N. Goldenfeld, Phys. Rev. E. ,030902(R) (2009).[4] T. Biancalani, D. Fanelli, and F. Di Patti, pre-print(2009), arxiv:0910.4984v1.[5] M. Pineda-Krch, H. J. Blok, U. Dieckmann and M. Doe-beli, Oikos , 53 (2007).[6] A. J. McKane and T. J. Newman, Phys. Rev. Lett. ,218102 (2005).[7] N. G. van Kampen, Stochastic Processes in Physics andChemistry (Elsevier Science, Amsterdam, 2007).[8] D. Alonso, A. J. McKane, and M. Pascual, J. R. Soc.Interface , 575 (2007). [9] A. J. Black, A. J. McKane, A. Nunes, and A. Parisi,Phys. Rev. E. , 021922 (2009).[10] G. Rozhnova and A. Nunes, Phys. Rev. E. , 041922(2009).[11] A. J. McKane, J. D. Nagy, T. J. Newman, and M. O.Stephanini, J. Stat. Phys. , 165 (2007).[12] T. Galla, Phys. Rev. E. , 021909 (2009).[13] T. Galla, Phys. Rev. Lett. , 198702 (2009).[14] L. A. Imhof, D. Fudenberg, and M. A. Nowak, Proc. Natl.Acad. Sci. USA , 10797 (2005).[15] E. Pennisi, Science , 93 (2005).[16] R. Axelrod and W. D. Hamilton, Science , 1390(1981).[17] M. A. Nowak, Evolutionary Dynamics (Harvard Univer-sity Press, Cambridge, MA, 2006).[18] D. Semmann, H. J. Krambeck and M. Milinski, Nature , 390 (2003).[19] A. Traulsen, D. Semmann, R. D. Sommerfeld, H. J.Krambeck and M. Milinski, Proc. Natl. Acad. Sci. USA , 2962 (2010).[20] P. E. Turner and L. Chao, Nature , 441 (1999).[21] R. Axelrod and D. Dion, Science , 1385 (1988).[22] M. A. Nowak, A. Sasaki, C. Taylor, and D. Fudenberg,Nature , 646 (2004).[23] L. A. Imhof, D. Fudenberg, and M. A. Nowak, J. Theor.Biol. , 574 (2007).[24] M. A. Nowak and K. Sigmund, Nature , 250 (1992).[25] M. A. Nowak and K. Sigmund, Nature , 56 (1993).[26] M. Milinski, Nature , 433 (1987).[27] M. P. Lombardo, Science , 1363 (1985).[28] P. D. Taylor and L. B. Jonker, Math. Biosci. , 145(1978).[29] P. M. Altrock and A. Traulsen, New J. Phys. , 013012(2009).[30] A. Traulsen, J. M. Pacheco, and M. A. Nowak, J. Theor.Biol. , 522 (2007).[31] P. M. Altrock and A. Traulsen, Phys. Rev. E. , 011909(2009).[32] A. Traulsen, J. C. Claussen, and C. Hauert, Phys. Rev.E. , 011901 (2006).[33] J. C. Claussen and A. Traulsen, Phys. Rev. E. ,025101(R) (2005).[34] D. Fudenberg, M. A. Nowak, C. Taylor, and L. A. Imhof,Theor. Popul. Biol. , 352 (2006).[35] J. C. Claussen, Eur. Phys. J. B , 391 (2007).[36] T. Reichenbach, M. Mobilia, and E. Frey, Nature ,1046 (2007).[37] T. Reichenbach, M. Mobilia, and E. Frey, Phys. Rev. E. , 051907 (2006).[38] M. Mobilia, pre-print (2009), arxiv:0912.5179v1.[39] K. G. Binmore and L. Samuelson, J. Econ. Theory ,278 (1992).[40] A. Traulsen, J. C. Claussen, and C. Hauert, Phys. Rev.Lett. , 238701 (2005).[41] M. A. Nowak, N. L. Komarova, and P. Niyogi, Science , 114 (2001).[42] P. A. P. Moran, Math. Proc. Cam. Phil. Soc. , 60(1958).[43] J. C. Claussen and A. Traulsen, Phys. Rev. Lett. ,058104 (2008).[44] W. H. Sandholm and E. Dokumaci, Dynamo: Phase di-agrams for evolutionary dynamics (2007), software suite,URL .[45] D. T. Gillespie, J. Comput. Phys. , 403 (1976).[46] D. T. Gillespie, J. Phys. Chem. , 2340 (1977).[47] C. W. Gardiner, Handbook of Stochastic Methods forPhysics, Chemistry and the Natural Sciences (Springer,New York, 2009), 4th ed.[48] R. P. Boland, T. Galla, and A. J. McKane, J. Stat. Mech.P09001 (2008).[49] J. C. Sprott,

Chaos and Time-Series Analysis (OxfordUniversity Press, New York, 2003).

Appendix A: System-size expansion of the masterequation

The van Kampen system-size expansion has been ex-tensively discussed elsewhere, together with explicit ex- amples [6–8], and so here we will only brieﬂy summarizethe general idea and give the ﬁnal results of the calcula-tion that are relevant for this paper.The starting point is the substitution of the ansatz (27)into the master equation (11) — or its generalization tomore than three strategies. This yields an expansion inpowers of N − / , after a re-scaling of time by a factorof N . To leading order ( N − / ) the deterministic equa-tion (13) is obtained. To next-to-leading order ( N − )the Fokker-Planck equation (28) is found. This is de-ﬁned in terms of two quantities: C i = P k J ik ξ k , where J is the Jacobian of the dynamics given by Eq. (13), and B a symmetric matrix. Since we are interested in ﬂuc-tuations about stationary states, both J ij and B ij aretime-independent.The Jacobian can be obtained in a straightforwardfashion once the dynamics (13) is known. The elementsof the matrix B are found from the N − terms in thesystem-size expansion to be B ij =  P k = i [ T i → k ( x ) + T k → i ( x )] , if i = j − [ T i → j ( x ) + T j → i ( x )] , if i = j . (A1)Therefore the deterministic and stochastic dynamics tothe order we are working are entirely determined by T i → j ( x ). This can be found by making the substitu-tions ( n i /N ) → x i , f i → f ∞ i and φ → φ ∞ in Eq. (10),to obtain T i → j ( x ) = X k x k x i g ki ( f ∞ ) q kj . (A2)This explicitly shows how to construct T i → j ( x ), once theprocess (deﬁned by g ki ( f )) and the mutation matrix ( q ij )have been given. Appendix B: Payoﬀ matrix for a four strategy,inﬁnitely repeated PD with trembling hand errors

When two players meet and play the PD game overmultiple rounds their state in round ℓ is deﬁned by theiractions in that round, e.g. player 1 co-operates, player2 defects. The actions of the pair of players then deter-mines the payoﬀ they each receive. The payoﬀ matrixfor the inﬁnitely repeated PD game with trembling handerrors is constructed by considering the stationary distri-butions of the state of each pair of players, to ﬁrst orderin ǫ , the probability of a ‘trembling hand’ error occur-ring [23]. It is given by4  AllC AllD T F T W SLSAllC R − ǫ (2 R − S − T ) S + ǫ ( R + P − S ) R − ǫ (3 R − T − S ) ( R + S ) / ǫ/ αAllD T − ǫ (2 T − R − P ) P + ǫ ( S + T − P ) P + ǫ ( S + 2 T − P ) ( P + T ) / − ( ǫ/ αT F T R + ǫ (2 T + S − R ) P + ǫ ( T + 2 S − P ) γ γW SLS ( R + T ) / ǫ/ β ( P + S ) / − ( ǫ/ β γ R + ǫ ( T + 2 P + S − R )  . (B1)where α = ( T + P − R − S ), β = ( S + P − R − T ) and γ = ( T + R + P + S ) //