Chaos and unpredictability in evolution of cooperation in continuous time
Taekho You, Minji Kwon, Hang-Hyun Jo, Woo-Sung Jung, Seung Ki Baek
aa r X i v : . [ q - b i o . P E ] D ec Chaos and unpredictability in evolution of cooperation in continuous time
Taekho You, Minji Kwon, Hang-Hyun Jo,
2, 3, 4, ∗ Woo-Sung Jung,
1, 2, 3, † and Seung Ki Baek ‡ Department of Industrial and Management Engineering,Pohang University of Science and Technology, Pohang 37673, Korea Asia Pacific Center for Theoretical Physics, Pohang 37673, Korea Department of Physics, Pohang University of Science and Technology, Pohang 37673, Korea Department of Computer Science, Aalto University, Espoo FI-00076, Finland Department of Physics, Pukyong National University, Busan 48513, Korea
Cooperators benefit others with paying costs. Evolution of cooperation crucially depends on thecost-benefit ratio of cooperation, denoted as c . In this work, we investigate the infinitely repeatedprisoner’s dilemma for various values of c with four of the representative memory-one strategies,i.e., unconditional cooperation, unconditional defection, tit-for-tat, and win-stay-lose-shift. Weconsider replicator dynamics which deterministically describes how the fraction of each strategyevolves over time in an infinite-sized well-mixed population in the presence of implementation errorand mutation among the four strategies. Our finding is that this three-dimensional continuous-time dynamics exhibits chaos through a bifurcation sequence similar to that of a logistic map as c varies. If mutation occurs with rate µ ≪
1, the position of the bifurcation sequence on the c axisis numerically found to scale as µ . , and such sensitivity to µ suggests that mutation may havenon-perturbative effects on evolutionary paths. It demonstrates how the microscopic randomness ofthe mutation process can be amplified to macroscopic unpredictability by evolutionary dynamics. I. INTRODUCTION
The notion of “evolutionary progress” has been de-bated ever since Darwin [1], and most evolutionary biolo-gists dismiss the idea that evolution is directional changetoward the better [2]. A counterexample to progression-ism is self-extinction caused by individual adaptation [3],which is directional change towards the worse by anymeasure. For example, transgenic males of Japanesemedaka fish
Oryzias latipes have larger body sizes thanthe wild-type counterparts and thus enjoy advantages inmating, but they tend to decrease the population sizebecause their offspring have lower fecundity [4]. Here,individual interests contradict with the collective inter-est of the population [5], as is often modeled by the pris-oner’s dilemma (PD) game. The PD game thus providesus with an analytic model in which selection works inan undesirable direction. However, one could even askwhether evolution is directional after all, and progres-sionism will lose big ground if chaos proves ubiquitous inevolution, i.e., marking natural history as unpredictablealternations of progression and retrogression. In fact, arecent study indicates that chaos becomes more likely asthe dimensionality of the phenotype space increases [6],and the possibility is greatly enhanced by discrete-timedynamics, especially when one considers a wide rangeof parameters [7]. For example, chaos in the iteratedversion of the PD game has been reported among 10different strategies under frequency-dependent selectionin discrete time [8]. On the other hand, chaos in low-dimensional continuous dynamics is a more challenging ∗ [email protected] † [email protected] ‡ [email protected] issue, considering that chaos is impossible when the di-mensionality of the strategy space is less than three [9].In this work, we report chaoticity in three-dimensional(3D) continuous-time mutation-selection dynamics of theiterated PD game with varying the cost-benefit ratioof cooperation. By choosing the PD game as a micro-scopic foundation, we put our problem into the con-text of conflict between individual and collective inter-ests in a biological population. In particular, comparedwith the generic model in Ref. [6], we will see that ourapproach directly relates unpredictability to mutation,whereas self-extinction can be driven solely by selection,so that it naturally incorporates another key argumentof non-progressionism, i.e., historical contingency of mu-tation [10].This work is organized as follows: We explain the basicformulation of our dynamical system in the next section.Section III shows numerical results, which will be dis-cussed in Sec. IV from the viewpoint of time reversal.We will then summarize this work in Sec. V. II. METHOD
The PD game is a symmetric two-person game definedby the following payoff table for the row player: C DC b − c − cD b , (1)where C and D denote two possible moves, i.e., cooper-ation and defection, respectively. Note that the payofftable is parametrized by b and c which we assume to sat-isfy b > c >
0. If you choose to cooperate, it implies thatit benefits your coplayer by an amount of b at the expenseof your own cost c . This is the reason that your payoffbecomes − c whereas the coplayer earns + b when you co-operate and the coplayer defects [Eq. (1)]. Even if webegin with a group of cooperating individuals, a defect-ing trait will invade and take over the population as soonas it comes into existence through mutation, which iscomparable to the scenario of self-extinction [3]. The sit-uation changes when the PD game is iterated between thepair of players so that an individual may adopt a strat-egy according to which cooperation is conditioned on thepast interaction with the coplayer. A famous conditionalcooperator of this sort is tit-for-tat (TFT), which coop-erates at the first round and then copies the coplayer’sprevious move at each subsequent round [11]. Anotherimportant strategy is win-stay-lose-shift (WSLS), whichattempts a different move from the previous one if it didnot earn a positive payoff [12]. Both TFT and WSLS be-long to a class of memory-one strategies in the sense thatthey refer to only the previous round in making decisions.These conditional cooperators achieve a high level of co-operation based on reciprocity when the cost of cooper-ation is low enough [13]. Some animal societies seem tohave developed such behavior [14]. To contrast those con-ditional strategies with unconditional ones, we denote astrategy of unconditional cooperation (defection) as AllC(AllD) henceforth. Although the unconditional strategiesare memoryless, they are also members of the memory-one strategy class with trivial dependence on the pastmemory. These are the most extensively studied strate-gies, so the following strategy set will be considered inthis work: S ≡ {
AllC, AllD, TFT, WSLS } .Let us consider the case when a player with strategy i ∈ S plays the iterated PD game against the coplayerwith strategy j ∈ S . At each round, they can occupyone of the following states: ( C, C ), (
C, D ), (
D, C ), and(
D, D ). Suppose that the result is ( α, β ), or in short αβ ,at a current round ( α, β ∈ { C, D } ). Based on this result,each player’s strategy prescribes cooperation with certainprobability, which is either zero or one. The probabilitythat the player (coplayer) with strategy i ( j ) cooperatesat the next round is denoted by q αβ ( r βα ). For example,for a player with AllC, q αβ = 1 for all states of αβ , whilefor a player with TFT, q CC = q DC = 1 and q CD = q DD =0 (for details, see Appendix A). The values of r βα canbe similarly assigned. In the presence of implementationerror with probability ǫ ∈ (0 , q ′ αβ = (1 − ǫ ) q αβ + ǫ (1 − q αβ ) and r ′ βα = (1 − ǫ ) r βα + ǫ (1 − r βα ), respectively. By defining¯ q ′ αβ ≡ − q ′ αβ and ¯ r ′ βα ≡ − r ′ βα , the transition matrixbetween the current and next states is written as M = q ′ CC r ′ CC q ′ CD r ′ DC q ′ DC r ′ CD q ′ DD r ′ DD q ′ CC ¯ r ′ CC q ′ CD ¯ r ′ DC q ′ DC ¯ r ′ CD q ′ DD ¯ r ′ DD ¯ q ′ CC r ′ CC ¯ q ′ CD r ′ DC ¯ q ′ DC r ′ CD ¯ q ′ DD r ′ DD ¯ q ′ CC ¯ r ′ CC ¯ q ′ CD ¯ r ′ DC ¯ q ′ DC ¯ r ′ CD ¯ q ′ DD ¯ r ′ DD , (2)which admits a unique right eigenvector ~v =( v CC , v CD , v DC , v DD ) with eigenvalue one according to thePerron-Frobenius theorem. We normalize ~v by requiring v CC + v CD + v DC + v DD = 1 to take it as stationary prob- ability distribution over the four states. Note that theexistence of error with ǫ > i and j in S . By taking the in-ner product between the resulting ~v with a payoff vector( b − c, − c, b, a ij thatstrategy i obtains against j (see Appendix A for details).Let x i ( t ) denote the fraction of strategy i ∈ S at time t . If the population is infinitely large, x i can be regardedas a real variable. The payoff of strategy i , denoted by p i ( t ), is given as p i ( t ) = P j a ij x j ( t ) if the population iswell-mixed. The population average payoff will thus be h p i ( t ) ≡ P i p i ( t ) x i ( t ) = P ij a ij x i ( t ) x j ( t ). We describethe mutation-selection dynamics of the population usingthe replicator dynamics (RD) [15, 16] as follows: dx i dt = f i ≡ (1 − µ ) p i x i − h p i x i + µ |S| − X j = i p j x j , (3)where µ denotes the rate of mutation. We have only threedegrees of freedom because |S| = 4 and P i ∈S x i ( t ) =1 all the time. The system described by Eq. (3)is therefore understood as a deterministic dynamicalsystem inside a 3D region, defined as Ω = { x ≡ ( x AllD , x
TFT , x
WSLS ) | x AllD ≥ x TFT ≥ x WSLS ≥
0, and x AllD + x TFT + x WSLS ≤ } . Note that the time scale gov-erning the RD has been assumed to be much larger thanthat for calculating the “long-term” payoffs. Our mainfinding is that the dynamics of Eq. (3) exhibits chaos inthis Ω when µ >
0. Our system is minimal to be chaotic,and this minimal system helps us understand the role ofmutation in analytic terms.
III. RESULT
Numerically, we integrate this set of ordinary differen-tial equations for every i ∈ S by means of the fourth-order Runge-Kutta method, with enforcing the normal-ization condition P i ∈S x i ( t ) = 1 at any time t . We set b = 1 without loss of generality, and take ǫ = 10 − as theprobability of error. We choose ∆ t = 10 − as the incre-mental time interval for numerical integration through-out this work. Note that the RD depends on the initialcondition at t = 0 due to its deterministic nature. It istherefore important to probe many different initial con-ditions to obtain the full picture. In Fig. 1(a), we assume µ = 10 − and plot the time average of x i ,¯ x i ≡ t − t Z t t x i ( t ) dt, (4)which removes transient dynamics for 0 ≤ t < t with t ≫ O (1) and considers the history up to t > t .For each value of c , we have checked roughly 2 × different initial conditions for x = ( x AllD , x
TFT , x
WSLS )in I δ ≡ { (0 , , , , δ ), (0 , , δ ) , . . . , (1 , , } with δ = . If we take c = 0 .
2, for example, we find from
010 10.5x
WSLS (b) x
AllD x TFT (c) x
WSLS =0/48 2/48 4/48 6/48
Cycle AllD WSLS
01 0 0.5 1(a) a v e r a g e o f x i c AllCAllDTFTWSLS FIG. 1. (Color online) (a) Fractions of strategies averagedover long time [Eq. (4)], started from various initial condi-tions (see the main text). The vertical dotted line indicates c = , and the two oblique dotted lines represent x TFT ≈ c and x AllD ≈ − c , respectively. (b) Attractor for c = 0 . x TFT , x
AllD ) plane. The small circles rep-resent local maxima of x WSLS along the trajectory, and thecross indicates a mutation-free fixed point (FP) written as x ∗ = (cid:16) − ǫ − c − ǫ , c − ǫ , (cid:17) . (c) Basins of attraction for c = 0 . x AllD and x TFT , respectively. These are cross-sectional views at dif-ferent values of x WSLS from to . The basin of attractionfor a cycle as shown in (b) is denoted in red, AllD FP with x ∗ = (1 , ,
0) in green, and WSLS FP with x ∗ = (0 , ,
1) inblue.
Fig. 1(a) that the system has two different attractors,characterized by x ∗ ≈ (1 , ,
0) and x ∗ ≈ (0 , , c roughly exceeds ,and this is readily explained by the FP analysis: When µ = 0, Eq. (3) has 11 different FP’s, one of which is x ∗ = (0 , , c liesbelow a certain threshold c th = (1 − ǫ ) − ǫ +4 ǫ ≈ − ǫ , whereasone of them becomes positive for c > c th (see Appendix Bfor details). In terms of evolutionary biology, we can alsosay that the WSLS-dominant FP is an evolutionarily sta-ble state (ESS) for c < c th , noting that an ESS consti-tutes an asymptotically stable FP in RD [15]. Figure 1(a)also shows that the distribution of attractors changesqualitatively as c increases. Roughly speaking, if c > ∼ . x = (¯ x AllD , ¯ x TFT , ¯ x WSLS ) ≈ (cid:0) − c , c, η (cid:1) with a small positive number η . This attractor actuallycorresponds to a cycle, and Fig. 1(b) shows an examplefor c = 0 .
46. By a cycle, we do not only mean an ordinarylimit cycle of finite periodicity but also a quasi-periodicor strange attractor of an infinite period. Figure 1(c)provides tomographic views of the basins of attractionwith different values of x WSLS ( t = 0). The basins havevery complicated boundaries on the ( x AllD , x
TFT ) planeand gradually merge as we increase x WSLS in the initial ×10 -3 (a) 0.0 0.2 0.42 0.44 0.46 0.48 0.5 m a x i m a o f x W S L S cn th ( n + ) t h -6 -5 -4 -3 -2 (c) c µLeft endRight endµ x W S L S t - t λ FIG. 2. (Color online) (a) Largest Lyapunov exponent λ when it is positive, and local maxima of x WSLS ( t ) [see thecircles in Fig. 1(b)] at various values of c . We check manydifferent cycles to estimate the average and standard error of λ . The big horizontal arrow shows the size of the main bi-furcation structure, and the long vertical rectangle indicates c = 0 . n th and ( n +1)th maxima of x WSLS at c = 0 . x WSLS at c = 0 .
475 (upper) and c = 0 . t ≫ O (1).The two lines of the upper panel are obtained with differentinitial conditions. (c) Left and right ends of the main bifur-cation structure [see the horizontal arrow in (a)] as a functionof µ on the log-log scale. We have additionally plotted µ . for comparison. condition. Therefore, although WSLS can stabilize coop-eration for c < c th , it depends on initial conditions, andone cannot easily choose a correct one if beginning withlow x WSLS .The complicated geometry of the basins of attrac-tion suggests the possibility of chaos [17]. To quan-tify chaoticity, we calculate Lyapunov exponents withLyapOde4 [18], which is based on the QR decompositionmethod [19]. In Fig. 2(a), we plot the largest Lyapunovexponent λ when it is estimated as positive (see Ap-pendix C for details). The result supports the existenceof chaos. Furthermore, we can explicitly construct a bi-furcation diagram [Fig. 2(a)] by plotting local maximaof x WSLS [the circles in Fig. 1(b)] at various values of c .The bifurcation structure is reminiscent of that of the lo-gistic map, and the similarity can be made more preciseby plotting a return map between the n th and ( n + 1)thmaxima when the system is chaotic [inset of Fig. 2(a)].We also note that Fig. 2(a) shows many other lines thanthe bifurcation diagram because two different cycles cancoexist at the same c depending on their initial condi-tions. For example, the upper panel of Fig. 2(b) depictstwo time series of x WSLS at c = 0 . c decreases, the former cycle undergoes perioddoubling to chaos [the lower panel in Fig. 2(b)], but thelatter one still remains stable with finite periodicity. Al-though unpredictable, the dynamics exhibits a variety ofcorrelations that we can make use of. For example, themaximum of x WSLS provides a precursor for that of x AllD because it is correlated with how close the trajectory getsto the state of x AllD = 1 before turning toward the stateof x TFT = 1 [see Fig. 1(b)].The position of the main bifurcation structure is ex-tremely sensitive to the variation of µ : Fig. 2(c) showsthe left and right ends of the main bifurcation structure,represented by the horizontal arrow in Fig. 2(a). Ac-cording to our numerical observation, they are roughlyproportional to µ . . If the power-law behavior holds for µ →
0, the bifurcation structure will shrink and eventu-ally disappear as mutation occurs less and less frequently.The exponent is unexpectedly small, and this sensitivityto µ is consistent with the following observation: Sup-pose that we have a FP x ∗ when mutation is absent,i.e., µ = 0. The corresponding FP x ∗ in the presenceof small µ > f ( x ) ≡ ( f AllD ( x ) , f TFT ( x ) , f WSLS ( x )), we obtain x ∗ ≈ x ∗ − J ( x ∗ ) − · f ( x ∗ ) , (5)if the Jacobian matrix J = n ∂f i ∂x j o has an inverse at x = x ∗ (see Appendix D for details). The last termon the right-hand side of Eq. (5) describes displacementof the FP due to mutation and it is expected to be ofan order of µ in this perturbative approach. However,this is not the case if we look at a mutation-free FP x ∗ = (cid:16) − ǫ − c − ǫ , c − ǫ , (cid:17) , which is close to ¯ x ≈ (cid:0) − c , c, η (cid:1) ,the time average of the cycle in Fig. 1(b): We rather findthat ∂f WSLS ∂x i is only of an order of µ for every i ∈ S , sothat the naive perturbative analysis of Eq. (5) yields dis-placement of O (1). Note that it does not vanish in thelimit of µ →
0, contrary to our expectation. The failurein the perturbative estimate of x ∗ nevertheless suggeststhat mutation can greatly affect the FP structure nearthe limit cycle. IV. DISCUSSION
Whenever an ESS exists, it is an asymptotically stableFP in the mutation-free RD, i.e., Eq. (3) with µ = 0,and a local Lyapunov function can be constructed in thevicinity of this FP [15, 16]. However, RD may have differ-ent types of attractors such as a limit cycle and a strange attractor, and we then have no systematic way to regardthe dynamic evolution as optimization of a certain targetfunction. In this context, it is worth noting that RD hasemergent symmetry [20]: Let us denote dx AllC /dt ≡ f AllC ( x AllC , x
AllD ) (6) dx AllD /dt ≡ f AllD ( x AllC , x
AllD ) (7)with setting x WSLS = µ = 0. Then, the following equalityholds: f AllC ( x AllC , x
AllD ) + f AllD ( x AllD , x
AllC ) = 0 . (8)Therefore, if we redefine X AllC ≡ x AllD , X AllD ≡ x AllC ,and τ ≡ − t , we find from Eq. (8) that dX AllC /dτ = f AllC ( X AllC , X
AllD ) (9) dX AllD /dτ = f AllD ( X AllC , X
AllD ) , (10)recovering the original dynamics in Eqs. (6) and (7)(see Appendix E for more detailed derivation). In otherwords, if both x WSLS and µ are strictly zero, the dynam-ics is dual to itself under time reversal and exchange be-tween AllC and AllD. The duality implies that our two-dimensional dynamics cannot have a global Lyapunovfunction: If a function V ( x AllC = x , x AllD = x ) has apositive time derivative, V ( x , x ) must be a decreasingfunction of time. Loosely speaking, therefore, if WSLSwas absent, it would be impossible to define the arrowof time for the whole system based solely on selection.We have two ways for directionality at this point: Oneis to incorporate x WSLS > se-lection . The other is to keep x WSLS at zero but intro-duce µ > mutation . For example, the FP x ∗ , modified from x ∗ = (cid:16) − ǫ − c − ǫ , c − ǫ , (cid:17) by mutation, has eigenvalueswith a negative real part proportional to µ [20]. It im-plies that mutation mixes up the populations of differentstrategies so that the system always ends up with thesame polymorphic state. In either case, the initial condi-tion eventually becomes irrelevant. Our finding impliesthat if both x WSLS and µ are turned on, the symmetry un-der time reversal and AllC-AllD exchange can break in amore nontrivial way. In particular, we have seen that mu-tation, induced by the inherent microscopic randomnessof the environment, produces unpredictability on evolu-tionary scales. One might say that this is not unexpectedbecause mutation per se is a random event which makesthe evolutionary path deviate from the one determined bythe initial condition. However, we are dealing with mu-tation in a fully deterministic manner in Eq. (3), and theeffect of mutation is not just small perturbation addedto a deterministic path, but the FP structure itself seemsto experience non-perturbative changes. The chaotic dy-namics is directional in a rather subtle sense that thepath reveals more and more information of the initialcondition as time goes by. Differently from the abovecases, this additional information is not a function of thecurrent state exclusively, but something evaluated withreference to history [21]. Historical contingency shouldbe understood in this respect, considering that not allhistory-dependent systems are chaotic. V. SUMMARY
In summary, we have presented numerical evidencethat chaos exists in the 3D continuous-time RD with mu-tation among the four representative strategies of the PDgame, i.e., AllC, AllD, TFT, and WSLS. Even if one con-siders a more general strategy space, it will contain thesefour in most cases, and our findings will remain valid inthe corresponding subspace. The model, originally moti-vated by selection against the better for the population,provides a simple analytic picture for non-progressionism,showing various facets of mutation: It generates unpre-dictability with exerting non-perturbative effects on evo-lutionary paths and breaks symmetry of a subsystem un-der time reversal. The strategic interaction consideredhere is one of the most intensively studied subjects inevolutionary game theory, so our finding suggests thatchaos can be more prevalent than previously known.
ACKNOWLEDGMENTS
H.-H.J. acknowledges financial support by Basic Sci-ence Research Program through the National ResearchFoundation of Korea (NRF) grant funded by the Min-istry of Education (Grant No. 2015R1D1A1A01058958).W.-S.J. was supported by Basic Science Research Pro-gram through the National Research Foundation of Ko-rea (NRF) funded by the Ministry of Education (GrantNo. NRF-2016R1D1A1B03932590). S.K.B. was sup-ported by Basic Science Research Program through theNational Research Foundation of Korea (NRF) funded bythe Ministry of Science, ICT and Future Planning (GrantNo. NRF-2017R1A1A1A05001482).
Appendix A: Derivation of long-term payoff a ij As mentioned in the main text, we denote by q αβ ( r βα )the probability that the player (coplayer) with strategy i ( j ) cooperates at the next round, given that the stateof player’s and coplayer’s moves is ( α, β ) at the currentround. Here α, β ∈ { C, D } . We summarize the values of q αβ for each strategy i ∈ S = { AllC, AllD, TFT, WSLS } as follows: ( α, β ) AllC AllD TFT WSLS( C, C ) 1 0 1 1(
C, D ) 1 0 0 0(
D, C ) 1 0 1 0(
D, D ) 1 0 0 1 The values of r βα for the coplayer with strategy j can besimilarly assigned.In the presence of implementation error with proba-bility ǫ ∈ (0 , q ′ αβ = (1 − ǫ ) q αβ + ǫ (1 − q αβ ) and r ′ βα = (1 − ǫ ) r βα + ǫ (1 − r βα ), respectively. By defining¯ q ′ αβ ≡ − q ′ αβ and ¯ r ′ βα ≡ − r ′ βα , the transition matrixbetween the current and next states is written as M = q ′ CC r ′ CC q ′ CD r ′ DC q ′ DC r ′ CD q ′ DD r ′ DD q ′ CC ¯ r ′ CC q ′ CD ¯ r ′ DC q ′ DC ¯ r ′ CD q ′ DD ¯ r ′ DD ¯ q ′ CC r ′ CC ¯ q ′ CD r ′ DC ¯ q ′ DC r ′ CD ¯ q ′ DD r ′ DD ¯ q ′ CC ¯ r ′ CC ¯ q ′ CD ¯ r ′ DC ¯ q ′ DC ¯ r ′ CD ¯ q ′ DD ¯ r ′ DD . (A1)Then, from the equation of M~v = ~v , we obtain the sta-tionary solution as ~v = ( v CC , v CD , v DC , v DD ), which is aunique right eigenvector of the matrix M . Here we im-pose the normalization for ~v using v CC + v CD + v DC + v DD = 1. Then v αβ can be interpreted as the stationaryprobability of finding the state ( α, β ) when players withstrategies i and j play the game. The results of ~v forall combinations of i and j are summarized in Table A.1.Once ~v is obtained, we calculate the long-term payoff a ij that the strategy i obtains against j , by taking the innerproduct between ~v and the payoff vector such that a ij = ~v · ( b − c, − c, b, . (A2)The long-term payoffs a ij for all possible pairs of i and j are presented in Table A.2. Note that with ǫ >
0, initialstates are irrelevant in the long run.
Appendix B: FP analysis of the replicator dynamicswhen µ = 0 In order to better understand the RD in Eq. (3) inthe main text, we first analyze the mutation-free case,i.e., when µ = 0. Let us define x ≡ ( x AllD , x
TFT , x
WSLS ),where x i denotes the fraction of strategy i . From the setof equations0 = dx i dt = f i ( µ = 0) = p i x i − h p i x i (B1) TABLE A.1. Stationary probability distributions v αβ of finding the state ( α, β ) when strategy i plays against j , where ( α, β )can be one of the four states: ( C, C ), (
C, D ), (
D, C ), and (
D, D ). i j v CC v CD v DC v DD AllC AllC (1 − ǫ ) ǫ (1 − ǫ ) ǫ (1 − ǫ ) ǫ AllD ǫ (1 − ǫ ) (1 − ǫ ) ǫ ǫ (1 − ǫ )TFT 1 − ǫ + 4 ǫ − ǫ ǫ (1 − ǫ ) ǫ (1 − ǫ − ǫ ) 2 ǫ (1 − ǫ )WSLS (1 − ǫ ) / − ǫ ) / ǫ/ ǫ/ ǫ (1 − ǫ ) ǫ (1 − ǫ ) ǫ (1 − ǫ )AllD ǫ ǫ (1 − ǫ ) ǫ (1 − ǫ ) (1 − ǫ ) TFT 2 ǫ (1 − ǫ ) ǫ (1 − ǫ − ǫ ) 2 ǫ (1 − ǫ ) − ǫ + 4 ǫ − ǫ WSLS ǫ/ ǫ/ − ǫ ) / − ǫ ) / − ǫ + 4 ǫ − ǫ ǫ (1 − ǫ + 2 ǫ ) 2 ǫ (1 − ǫ ) ǫ (1 − ǫ )AllD 2 ǫ (1 − ǫ ) 2 ǫ (1 − ǫ ) ǫ (1 − ǫ + 2 ǫ ) 1 − ǫ + 4 ǫ − ǫ TFT 1 / / / / / / / / − ǫ ) / ǫ/ − ǫ ) / ǫ/ ǫ/ − ǫ ) / ǫ/ − ǫ ) / / / / / − ǫ + 7 ǫ − ǫ ǫ (1 − ǫ ) ǫ (1 − ǫ ) ǫ (2 − ǫ + 4 ǫ )TABLE A.2. Long-term payoff a ij that strategy i (leftmostcolumn) obtains against j (top row).AllC AllD TFT WSLSAllC ( b − c )(1 − ǫ ) bǫ − c (1 − ǫ ) b (1 − ǫ +2 ǫ ) − c (1 − ǫ ) b − c (1 − ǫ )AllD b (1 − ǫ ) − cǫ ( b − c ) ǫ (2 b (1 − ǫ ) − c ) ǫ ( b − cǫ )TFT b (1 − ǫ ) − c (1 − ǫ − ǫ ) ( b − c (1 − ǫ )) ǫ b − c b − c WSLS b (1 − ǫ ) − c bǫ − c b − c ( b − c )(1 − ǫ + 6 ǫ − ǫ ) for all i ∈ S , we analytically find 11 different FP’s x ∗ asfollows: x ∗ = (1 , , , , , , , , , , − c (1 − ǫ ) ( b − c ) )(0 , cǫ ( b − c )(1 − ǫ ) , b (1 − ǫ ) − c b (1 − ǫ ) , cb (1 − ǫ ) , b (1 − ǫ ) − c ( b − c )(1 − ǫ ) , cǫ ( b − c )(1 − ǫ ) , ( b − c )(1 − ǫ ) − c ( b − c )(1 − ǫ ) , , c ( b − c )(1 − ǫ ) )(0 , c [2( b − c ) ǫ (1 − ǫ ) − c ](1 − ǫ )[( b − c ) (1 − ǫ ) − bc ] , c [ b (2 ǫ − c ](1 − ǫ )[( b − c ) (1 − ǫ )+ bc ] )( ( b − c )[ b (1 − ǫ ) − c ]( b − c ) (1 − ǫ )+ bc , c [2( b − c ) ǫ (1 − ǫ )+ c ](1 − ǫ )[( b − c ) (1 − ǫ )+ bc ] , c [ b (1 − ǫ ) − c ](1 − ǫ )[( b − c ) (1 − ǫ ) − bc ] ) . . (B2)By the linear stability analysis, we find that only oneFP x ∗ = (1 , , x AllD = 1, is stable for the -1 0 1 0 0.5 1(a) λ c 0 0.5 1(b) c FIG. B.1. (Color online) (a) Eigenvalues for the first FP ofEq. (B2), x ∗ = ( x ∗ AllD , x ∗ TFT , x ∗ WSLS ) = (1 , , c . We set b = 1 without loss of generality and choose ǫ = 10 − . (b)Eigenvalues for another FP (0 , , c < c th . The threshold value c th is analyticallyobtained in Eq. (B4). entire range of c , confirmed by the negativity of all eigen-values, as shown in Fig. B.1(a). This result is somehowconsistent with the fact that this FP is numerically ob-served for the entire range of c even when µ >
0, asdepicted in Fig. 1(a) in the main text. We are also inter-ested in another FP x ∗ = (0 , , x WSLS = 1. It isfound that the largest eigenvalue is negative for c < c th with some threshold c th , implying that the FP is stable.However, it becomes positive for c > c th , implying unsta-bility of the FP. See Fig. B.1(b). This largest eigenvalueis calculated as λ = (cid:18) ǫ − (cid:19) (cid:8) b (1 − ǫ ) − c [2( ǫ − ǫ + 1] (cid:9) , (B3)from which we can define c th as follows: c th = b (1 − ǫ ) − ǫ + 2 ǫ ) ≈ b (cid:18) − ǫ (cid:19) . (B4)All the other FP’s turn out to be unstable. Appendix C: Positive Lyapunov exponents
For chaotic dynamics, the largest Lyapunov exponentis positive. The problem arises when its value is so smallthat the numerical error of the algorithm has the sameorder of magnitude. Suppose that we have a continu-ous orbit in a bounded three-dimensional region and itsLyapunov exponents are sorted in descending order, i.e., λ > λ > λ . It is well-known that one of the Lya-punov exponents must be zero for a continuous orbit.We can also be sure that λ must be negative for the fol-lowing reason: the information dimension of an attractoris bounded from above by the Kaplan-Yorke formula, D KY = j + P ji =1 λ i | λ j +1 | , (C1)where j satisfies P ji =1 λ i > P j +1 i =1 λ i <
0. In ourcase, the dimension cannot be greater than three, whichmeans that j ≤
2. From P i =1 λ i <
0, it follows that λ < λ > λ is zero. Therefore, we have the following in-equality, λ + λ > . (C2)On the other hand, if λ is actually zero but numericallyestimated as a tiny positive number, λ + λ is likely tobe negative because of λ . The point is that Eq. (C2)provides a sharp criterion to check the positivity of λ ,which can sometimes be obscure if we observe λ only.The Lyapunov exponents depicted in Fig. 1 of the maintext are estimated as positive in this way. Appendix D: Perturbative analysis for small positive µ Based on the understanding of the mutation-free casein Appendix B, we now analyze a more general case with µ >
0. As it is not straightforward to obtain the FP’sof the RD in Eq. (3) in the main text, we take the per-turbative approach. More specifically, we apply New-ton’s method in which each FP x ∗ for µ = 0 serves as atrial solution. The first order Taylor expansion aroundthe FP x ∗ = ( x ∗ , AllD , x ∗ , TFT , x ∗ , WSLS ) yields the followingequation: = f AllD ( x ∗ ) f TFT ( x ∗ ) f WSLS ( x ∗ ) = f AllD ( x ∗ ) f TFT ( x ∗ ) f WSLS ( x ∗ ) + ∂f AllD ∂x AllD ∂f AllD ∂x TFT ∂f AllD ∂x WSLS ∂f TFT ∂x AllD ∂f TFT ∂x TFT ∂f TFT ∂x WSLS ∂f WSLS ∂x AllD ∂f WSLS ∂x TFT ∂f WSLS ∂x WSLS x ∗ AllD − x ∗ , AllD x ∗ TFT − x ∗ , TFT x ∗ WSLS − x ∗ , WSLS , (D1)where x ∗ = ( x ∗ AllD , x ∗ TFT , x ∗ WSLS ) denotes the correspond-ing perturbative solution. The matrix in the above equa-tion is indeed Jacobian matrix J = n ∂f i ∂x j o calculated at x = x ∗ . If J ( x ∗ ) has an inverse, by arranging Eq. (D1),we get x ∗ AllD x ∗ TFT x ∗ WSLS ≈ x ∗ , AllD x ∗ , TFT x ∗ , WSLS − J − f AllD ( x ∗ ) f TFT ( x ∗ ) f WSLS ( x ∗ ) . (D2)Here, we would like to note that even when J ( x ∗ ) has noinverse, one can first calculate J ( x ) − , then substitute x by x ∗ to obtain the result x ∗ in Eq. (5b). Then, based onnumerical observations, we pick up three FP’s for µ = 0,i.e., (1 , , , , b (1 − ǫ ) − c b (1 − ǫ ) , cb (1 − ǫ ) , x ≈ ( − c , c, η ) with small positive η , when the timeaverage is taken over the cycle (see the main text). Wewill denote this last FP as ˆ x ∗ . By applying Newton’smethod to these FP’s, we obtain the following perturba-tive solutions: x ∗ ≈ + µ (3 ǫ +1)( b − c )3 c − cǫ − ( b − c )3 c − cǫ ǫ ( b − c )3 c (2 ǫ − + µ − ( ( ǫ − ( ǫ − ǫ +1 ) ( b − c ) ) ǫ − b (1 − ǫ ) − c (2( ǫ − ǫ +1)))13 (cid:16) ǫ − − (cid:17) ǫ − ( ǫ − ǫ +1 )( b (1 − ǫ ) − bc (1 − ǫ ) +2 c (12( ǫ − ǫ (2( ǫ − ǫ +1)+1) ) ǫ − ( b (1 − ǫ ) − c ( ǫ − ǫ )( b (1 − ǫ ) − c (2( ǫ − ǫ +1)) b (1 − ǫ ) − c b (1 − ǫ ) cb (1 − ǫ ) + µ ( b − c )(2 ǫ ( b + c ) − b − c )( b (2 ǫ − c )24 c (2 ǫ − b (2 ǫ − c ) − (( b − c )( b + c )( b (2 ǫ − c ))12( c ( b (2 ǫ − c )) + − . (D3)We find that a term of O (1) appears for ˆ x ∗ becauseof µ ≪
1, implying that the perturbative approach failshere. The result can be understood by directly calculat-ing J ( x ∗ ) for µ → J (cid:18) b (1 − ǫ ) − c b (1 − ǫ ) , cb (1 − ǫ ) , (cid:19) = ( c − b (1 − ǫ ))4 b c c + b (1 − ǫ ) c − c − c − c . (D4)This Jacobian matrix has zeros in the bottom row be-cause ∇ f WSLS → f WSLS becomes a con-stant function in this limit for every x such that x TFT =1 − x AllD and x WSLS = 0. The FP ˆ x ∗ also satisfies thiscondition, so every derivative vanishes there along thedirection of (1 , − , H ≡ n ∂ f WSLS ∂x i ∂x j o for the second-derivative test is obtained as H (ˆ x ∗ ) = c ( − ǫ ) b c ( − ǫ ) bc ( − ǫ ) b c ( − ǫ )2 b c − b (1 − ǫ ) − bc (1 − ǫ ) b (D5)when µ →
0. It is straightforward to see that it has aneigenvector (1 , − ,
0) associated with the eigenvalue zero,which makes the test inconclusive.
Appendix E: Symmetry of replicator dynamics
Recall that RD in this work is defined as dx i dt = f i ≡ (1 − µ ) p i x i − h p i x i + µ |S| − X j = i p j x j , (E1)where p i = P i a ij x j and h p i ≡ P i p i x i = P ij a ij x i x j . The long-term payoff a ij is given in Ta-ble A.2. We explicitly write the functional dependence as f i ( x AllC , x
AllD , x
WSLS ) by choosing three independentvariables. Note that we have just chosen x AllC as an in-dependent variable instead of x TFT as we have done inother parts of the paper. It is because this set of inde-pendent variables show the symmetry most clearly. Aftersome algebra, we can readily see the following equality: f AllC ( x , x ,x ) + f AllD ( x , x , x )= b − c x )[ µ (1 + x ) − x (4 µ + 3 x )] . (E2)It is important to note that x and x exchange the po-sitions in evaluating f AllD . Let us set µ and x to zeroso that the right-hand side vanishes altogether. We maysuppress the dependence on x which is fixed, so the re-sult is f AllC ( x , x ) = − f AllD ( x , x ) . (E3)By construction, we already have dx AllC dt = f AllC ( x AllC , x
AllD ) (E4) dx AllD dt = f AllD ( x AllC , x
AllD ) . (E5)We combine Eqs. (E3) to (E5) to obtain dx AllC dt = − f AllD ( x AllD , x
AllC ) (E6) dx AllD dt = − f AllC ( x AllD , x
AllC ) . (E7)Now, let us define τ ≡ − t to rewrite the equations as dx AllC dτ = f AllD ( x AllD , x
AllC ) (E8) dx AllD dτ = f AllC ( x AllD , x
AllC ) . (E9)Finally, the variables are relabeled as X AlC ≡ x AllD and X AlD ≡ x AllC , and we end up with the following set ofequations, dX AllD dτ = f AllD ( X AllC , X
AllD ) (E10) dX AllC dτ = f AllC ( X AllC , X
AllD ) , (E11)which are formally identical to the original ones[Eqs. (E4) and (E5)]. [1] S. J. Gould, Wonderful Life (W. W. Norton & Company,New York, 1989).[2] T. Shanahan,
The Evolution of Darwinism: Selection,Adaptation and Progress in Evolutionary Biology (Cam-bridge University Press, Cambridge, 2004).[3] H. Matsuda and P. A. Abrams, Evolution , 1764(1994); M. Gyllenberg and K. Parvinen, Bull. Math.Biol. , 981 (2001); M. Gyllenberg, K. Parvinen, andU. Dieckmann, J. Math. Biol. , 79 (2002); D. J. Rankinand A. L´opez-Sepulcre, Oikos , 616 (2005).[4] W. M. Muir and R. D. Howard, Proc. Natl. Acad. Sci.USA , 13853 (1999).[5] D. J. Rankin, K. Bargum, and H. Kokko, Trends Ecol.Evol. , 643 (2007).[6] M. Doebeli and I. Ispolatov, Evolution , 1365 (2014).[7] D. Vilone, A. Robledo, and A. S´anchez, Phys. Rev. Lett. , 038101 (2011); T. Galla and J. D. Farmer, Proc.Natl. Acad. Sci. USA , 1232 (2013).[8] M. Nowak and K. Sigmund, Proc. Natl. Acad. Sci. USA , 5091 (1993).[9] B. Skyrms, J. Logic Lang. Inform. , 111 (1992); S. H.Strogatz, Nonlinear Dynamics and Chaos: With Appli-cations to Physics, Biology, Chemistry, and Engineering (Westview Press, Boulder, CO, 2001).[10] M. Travisano, J. A. Mongold, A. F. Bennett, R. E.Lenski, et al. , Science , 87 (1995); Z. D. Blount,C. Z. Borland, and R. E. Lenski, Proc. Natl. Acad. Sci.USA , 7899 (2008); P. Shah, D. M. McCandlish, andJ. B. Plotkin, ibid . , E3226 (2015).[11] R. Axelrod, The Evolution of Cooperation (Basic Books,New York, 1984); M. A. Nowak and K. Sigmund, Nature(London) , 250 (1992); L. A. Imhof, D. Fudenberg, and M. A. Nowak, Proc. Natl. Acad. Sci. USA , 10797(2005).[12] D. Kraines and V. Kraines, Theory Decis. , 47 (1989);M. A. Nowak and K. Sigmund, Nature (London) ,56 (1993); M. Posch, J. Theor. Biol. , 183 (1999);A. J. Bladon, T. Galla, and A. J. McKane, Phys. Rev.E , 066122 (2010); C. Hilbe, L. A. Martinez-Vaquero,K. Chatterjee, and M. A. Nowak, Proc. Natl. Acad. Sci.USA , 4715 (2017).[13] L. A. Imhof, D. Fudenberg, and M. A. Nowak, J.Theor. Biol. , 574 (2007); C. Hilbe, M. A. Nowak,and K. Sigmund, Proc. Natl. Acad. Sci. USA , 6913(2013); S. K. Baek, H.-C. Jeong, C. Hilbe, and M. A.Nowak, Sci. Rep. , 25676 (2016).[14] G. S. Wilkinson, Nature (London) , 181 (1984);M. Milinski, ibid . , 433 (1987).[15] J. W. Weibull, Evolutionary Game Theory (MIT Press,Cambridge, 1995).[16] J. Hofbauer and K. Sigmund,
Evolutionary Gamesand Population Dynamics (Cambridge University Press,Cambridge, 1998).[17] S. W. McDonald, C. Grebogi, E. Ott, and J. A. Yorke,Physica D , 125 (1985).[18] P. H. Bryant, Available athttp://biocircuits.ucsd.edu/pbryant (accessed 2017Mar. 23).[19] J.-P. Eckmann and D. Ruelle, Rev. Mod. Phys. , 617(1985); H. D. Abarbanel, R. Brown, and M. B. Kennel,Int. J. Mod. Phys. B , 1347 (1991).[20] S. K. Baek, S. D. Yi, and H.-C. Jeong, J. Theor. Biol. , 215 (2017).[21] R. Wackerbauer, A. Witt, H. Atmanspacher, J. Kurths,and H. Scheingraber, Chaos Soliton. Fract.4