Triggers for cooperative behavior in the thermodynamic limit: a case study in Public goods game
Colin Benjamin and Shubhayan Sarkar
School of Physical Sciences, National Institute of Science Education and Research, HBNI, Jatni-752050, India
April 9, 2019
Abstract
In this work, we aim to answer the question: what triggers cooperative behaviour in the thermodynamic limit? We address it by taking recourse to the Public goods game. Using the idea of mapping the 1D Ising model Hamiltonian with nearest-neighbor coupling to payoffs in game theory, we calculate the magnetisation of the game in the thermodynamic limit. We see a phase transition in the thermodynamic limit of the two-player Public goods game. We observe that punishment acts as an external field for the two-player Public goods game, triggering cooperation (the provide strategy), while cost can be a trigger for suppressing cooperation, i.e., for free riding. Finally, reward also acts as a trigger for providing, while the role of inverse temperature (fluctuations in choices) is to introduce randomness in strategic choices.
Keywords:
Nash equilibrium; Public goods game; Ising model
In the context of evolution, it is observed that cooperation among individuals exists even when defection should be the choice for every player. In this work, we figure out what triggers this cooperative behavior in the thermodynamic limit by considering the Public goods game both with and without punishment.

arXiv [physics.soc-ph]

In recent years there have been some attempts to explain why individuals in a population cooperate even when defection would be a better choice. There has been a previous attempt by Adami and Hintze in Ref. [5] to answer this question using the Ising model. In this work, we first point out the errors in their approach and then give the correct approach to solve this problem using an exact mapping to the 1D Ising model. We identify the parameters which trigger cooperative behavior among individuals for the Public goods game both with and without punishment. We find that reward and punishment are the strongest triggers for promoting cooperative (provide) behavior. In contrast, the cost of the resource acts as a suppressor of cooperation, i.e., it promotes defection or free riding. Inverse temperature (fluctuation in choices) introduces randomness in strategic choices.

Game theory aims to find an equilibrium strategy where both of the players are at the maximum benefit or the least loss. This is known as the Nash equilibrium [1]. In games such as the Prisoner's dilemma, defection is the Nash equilibrium for both players. Under certain conditions like kin selection or reciprocal altruism [10, 3], cooperation becomes the preferred choice in the Prisoner's dilemma game. However, in Hawk-Dove, frequently used to model cooperation among humans and animals, the Nash equilibrium is for one player to defect while the other cooperates. In the real world, however, we see that organisms do cooperate with each other, for example by sharing resources.
Thus, in the thermodynamic limit cooperation is indeed a choice in the long run; otherwise the population as a whole would not survive. An account of the connections between evolution and game theory can be found in Ref. [4]. In this paper we investigate what triggers cooperation in the thermodynamic limit of a generic two-player game like the Public goods game.
What happens in the thermodynamic limit for games like the Prisoner's dilemma, Hawk-Dove, etc. is an outstanding problem of game theory, since it is only in this limit that games can mimic the large populations of humans or animals. In this context and using the Ising model, Ref. [5] makes an attempt to analytically approach the thermodynamic limit of the two-player Prisoner's dilemma and the three-player Public goods game. However, we have shown in a previous work [6] that this approach to the thermodynamic limit of games has some inconsistencies. In Ref. [6], we extended the idea of mapping the 1D Ising model Hamiltonian to the payoffs in a game [7] to rectify the mismatch between expected outcomes and the calculated results of Ref. [5]. In this paper we first approach the thermodynamic limit using the method of Ref. [5] for the two-player Public goods game with and without punishment, and show the errors in that approach. Further, we analytically approach the thermodynamic limit of the Public goods game with and without punishment by using an exact mapping of the 1D Ising model to the Public goods game. We calculate the game magnetisation, i.e., the difference between the fractions of the population choosing a particular strategy, say provide over free riding, in the thermodynamic limit of a Public goods game, and analyse it for triggers which lead to a phase transition or cooperative behaviour. We unravel a cost-dependent phase transition along with triggers for cooperation such as reward and punishment. The most important difference between our approach and the traditional approach is that we tackle the problem analytically, i.e., we give a very simple analytical formula for calculating the distribution of the population over the strategies. The traditional methods, see Ref. [3], are numerical and involve solving differential equations (the replicator approach), whereas in our approach, if the payoff matrix of the game is known, the equilibrium condition can be calculated directly.
Further, the traditional methods are dynamical, i.e., they involve time. Our approach is not dynamical. The main attraction of our work is not just what happens at the thermodynamic limit but that providers (or cooperators) do emerge at the thermodynamic limit (a finite minority of providers/cooperators exists). This was previously believed not to happen, a belief that can also be inferred from Nowak's paper, see Ref. [3].

This paper is organized as follows. First we review the 1D Ising model and the analogy of Ref. [7] between a general two-player, two-strategy payoff matrix and the two-spin Ising Hamiltonian. In section III, we deal with the two-player Public goods game in the thermodynamic limit. We see a mismatch between expected outcomes and the calculated results when we approach the problem using the method of Ref. [5]. This mismatch is resolved by employing the procedure adopted in Ref. [6] to tackle the thermodynamic limit of games. In section IV, we first apply the method of Ref. [5] to calculate the Nash equilibrium for the Public goods game with punishment in the thermodynamic limit. As before we again see a contradiction, which is resolved by taking recourse to the method of Ref. [6] in the thermodynamic limit. We observe an additional feature: cost can also act as the external magnetic field in the same two-player Public goods game in the thermodynamic limit. We end with conclusions.
The Ising model [8] consists of discrete variables that represent magnetic dipole moments of atomic spins that can be in one of two states, $+1$ ($\uparrow$) or $-1$ ($\downarrow$). In the 1D Ising model the spins at each site talk to their nearest neighbors only. The Hamiltonian of the 1D Ising model for $N$ sites is given as

$$H = -J \sum_{k=1}^{N} \sigma_k \sigma_{k+1} - h \sum_{k=1}^{N} \sigma_k. \tag{1}$$

Herein, $h$ defines the external magnetic field while $J$ denotes the exchange interaction between the spins, which are denoted by the $\sigma$'s. The partition function $Z$ encodes the statistical properties of a system in thermodynamic equilibrium. For the 1D Ising model, $Z$ is a function of temperature and of parameters such as the spins, the coupling between spins and the external magnetic field. Thus the partition function can be defined as the statistical distribution of a system at thermal equilibrium, and it follows Boltzmann statistics. The partition function for the 1D Ising model is then

$$Z = \sum_{\sigma_1} \dots \sum_{\sigma_N} e^{\beta \left( J \sum_{k=1}^{N} \sigma_k \sigma_{k+1} + \frac{h}{2} \sum_{k=1}^{N} (\sigma_k + \sigma_{k+1}) \right)}, \tag{2}$$

where $\beta = 1/(k_B T)$ and $\sigma_k$ denotes the spin of the $k$th site, which can be either up ($+1$) or down ($-1$). The above sum is carried out by defining the transfer matrix $T$ with the following elements:

$$\langle \sigma | T | \sigma' \rangle = e^{\beta \left( J \sigma \sigma' + \frac{h}{2} (\sigma + \sigma') \right)}.$$

By using the completeness relation and the transfer matrix, the partition function from Eq. (2) in the thermodynamic limit can be written as

$$Z = e^{N \beta J} \left( \cosh(\beta h) \pm \sqrt{\sinh^2(\beta h) + e^{-4 \beta J}} \right)^{N}. \tag{3}$$

The free energy is then $F = -k_B T \ln Z$, or equivalently $Z = e^{-\beta F}$. The free energy allows us to study the state of equilibrium of a system: the state of equilibrium corresponds to the one that minimizes the free energy $F$.
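As a sanity check on the transfer-matrix result, the sketch below compares a brute-force sum over all spin configurations of a short periodic chain with the sum of the $N$th powers of the two transfer-matrix eigenvalues appearing in Eq. (3). It is only an illustration under stated assumptions: the function names are ours, periodic boundary conditions ($\sigma_{N+1} = \sigma_1$) are assumed, and only the Python standard library is used.

```python
import itertools
import math


def partition_brute_force(n_sites, beta, coupling, field):
    """Sum exp(-beta * H) over all 2**n_sites configurations of a periodic chain."""
    total = 0.0
    for spins in itertools.product((+1, -1), repeat=n_sites):
        energy = 0.0
        for k in range(n_sites):
            nxt = spins[(k + 1) % n_sites]  # periodic boundary: site N+1 is site 1
            energy += -coupling * spins[k] * nxt - field * spins[k]
        total += math.exp(-beta * energy)
    return total


def partition_transfer_matrix(n_sites, beta, coupling, field):
    """Z = lambda_plus**N + lambda_minus**N from the 2x2 transfer matrix."""
    root = math.sqrt(math.sinh(beta * field) ** 2 + math.exp(-4.0 * beta * coupling))
    prefactor = math.exp(beta * coupling)
    lam_plus = prefactor * (math.cosh(beta * field) + root)
    lam_minus = prefactor * (math.cosh(beta * field) - root)
    return lam_plus ** n_sites + lam_minus ** n_sites
```

For large $N$ the larger eigenvalue dominates, which is why only the `+` branch matters in the thermodynamic limit.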
The net magnetization is found by averaging the total magnetization over all the allowed energy levels, which is the same as the partial derivative of the free energy with respect to the external magnetic field. The magnetization for the 1D Ising model is then

$$M = -\frac{\partial f}{\partial h} = \frac{\sinh(\beta h)}{\sqrt{\sinh^2(\beta h) + e^{-4 \beta J}}}, \tag{4}$$

where $f = F/N$ is the free energy per site. A plot of the magnetization vs. external magnetic field $h$ is shown in Fig. 1 for different $\beta$, the inverse temperature.

Figure 1: Magnetization vs. external magnetic field $h$ for the 1D Ising model, at fixed $J$, for different values of $\beta$.

The payoff matrix of a general two-player, two-strategy game is

$$U = \begin{array}{c|cc} & s_1 & s_2 \\ \hline s_1 & x, x' & y, y' \\ s_2 & z, z' & w, w' \end{array}, \tag{5}$$

where $U(s_i, s_j)$ is the payoff function with $x, y, z, w$ as the payoffs for the row player and $x', y', z', w'$ as the payoffs for the column player; $s_1$ and $s_2$ denote the choices available to the players. In our analysis we consider symmetric games where $x = x'$, $y = z'$, $z = y'$ and $w = w'$. Thus, knowing the payoff of the row player, the column player's payoff can be inferred. In Ref. [7] a transformation is made via addition of a factor $\lambda$ to the $s_1$ column and $\mu$ to the $s_2$ column. Thus,

$$U = \begin{array}{c|cc} & s_1 & s_2 \\ \hline s_1 & x + \lambda & y + \mu \\ s_2 & z + \lambda & w + \mu \end{array}. \tag{6}$$

As shown in Appendix 7.3, the Nash equilibrium of the game, Eq. (5), remains unchanged under such transformations. Following Ref. [7], we choose the transformations as $\lambda = -\frac{x + z}{2}$ and $\mu = -\frac{y + w}{2}$. The transformed matrix becomes

$$U = \begin{array}{c|cc} & s_1 & s_2 \\ \hline s_1 & \frac{x - z}{2} & \frac{y - w}{2} \\ s_2 & \frac{z - x}{2} & \frac{w - y}{2} \end{array}. \tag{7}$$

Since our aim is to find the Nash equilibrium in the thermodynamic limit, we start by identifying the connection between the 1D Ising model Hamiltonian for two spins and the transformed payoff matrix of Eq. (7). The Hamiltonian in Eq. (1) for $N = 2$ is given as

$$H = -J (\sigma_1 \sigma_2 + \sigma_2 \sigma_1) - h (\sigma_1 + \sigma_2).$$

The individual energies of the two spins can be inferred as

$$E_1 = -J \sigma_1 \sigma_2 - h \sigma_1, \qquad E_2 = -J \sigma_2 \sigma_1 - h \sigma_2. \tag{8}$$

In the Ising model, the equilibrium condition implies that the energies of the spins are minimized. For symmetric coupling as in Eq. (2), the Hamiltonian $H$ is minimized with respect to the spins $\sigma_1, \sigma_2$. This is equivalent to maximizing $-H$ with respect to $\sigma_1, \sigma_2$. Game theory aims to search for an equilibrium strategy (Nash equilibrium), which can be achieved by maximizing the payoff function $U(s_i, s_j)$, Eqs. (5)-(7), with respect to the choices $s_i, s_j$. For the two-player Ising model case this is the same as maximizing $-E_i$ in Eq. (8) with respect to the spins $\sigma_i, \sigma_j$. Thus, the Ising game matrix for the row player (see Ref. [7] for the derivation of Eq. (9)) is

$$U_{\text{Ising}} = \begin{array}{c|cc} & s_2 = +1 & s_2 = -1 \\ \hline s_1 = +1 & J + h & -J + h \\ s_1 = -1 & -J - h & J - h \end{array}. \tag{9}$$

We compare the Ising game matrix Eq. (9) to the matrix elements of Eq. (7) to determine the values of $J$ and $h$. Thus, $J = \frac{x - z + w - y}{4}$ and $h = \frac{x - z + y - w}{4}$, and therefore the magnetization from Eq. (4) can be written in terms of the payoff matrix elements of Eq. (5) as

$$M = \frac{\sinh\left(\beta \frac{x - z + y - w}{4}\right)}{\sqrt{\sinh^2\left(\beta \frac{x - z + y - w}{4}\right) + e^{-\beta (x - z + w - y)}}}, \tag{10}$$

which is identified as the "game magnetization" in the thermodynamic limit of the game.

In the Ising model, magnetization refers to the difference between the number of spins pointing up and the number pointing down. For games we define an analogous quantity, the game magnetization of Eq. (10), which refers to the difference between the fraction of players opting for a particular strategy, say provide, and the fraction opting for free ride in the case of the Public goods game. For the Ising model, the exchange coupling $J$ is in Joules and the applied magnetic field $h$ is also in Joules, while $\beta = 1/(k_B T)$ is in units of Joule$^{-1}$. In the case of games, payoffs are unitless and $\beta$ is also unitless and acts as a randomizing parameter. It should be noted that $\beta$ is the inverse temperature ($1/k_B T$) in the Ising model. Decreasing $\beta$, or increasing temperature, randomizes the spins, i.e., makes them more disordered. As discussed above, two-player games have an analogy with the two-spin Ising Hamiltonian, where spins are analogous to strategies.
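The mapping from payoffs to $(J, h)$ and on to the game magnetization of Eq. (10) is mechanical, so it can be written as a small helper. The following is a minimal sketch; the function names and the example payoffs used in the check are hypothetical and are not taken from the paper.

```python
import math


def ising_parameters(x, y, z, w):
    """Map the row player's payoffs (x, y; z, w) to the Ising pair (J, h)."""
    coupling = (x - z + w - y) / 4.0
    field = (x - z + y - w) / 4.0
    return coupling, field


def game_magnetization(x, y, z, w, beta):
    """Game magnetization of Eq. (10) for a symmetric two-strategy game."""
    coupling, field = ising_parameters(x, y, z, w)
    s = math.sinh(beta * field)
    return s / math.sqrt(s * s + math.exp(-4.0 * beta * coupling))
```

A positive value means that the majority plays the first strategy $s_1$, a negative value that the majority plays $s_2$; at $\beta = 0$ the function returns zero, reflecting completely random choices.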
Since spins are analogous to strategies, decreasing $\beta$ randomizes the strategic choices available to each player, which effectively implies that $\beta$ may trigger either cooperation or defection depending on the other parameters of the problem. This completes the connection of the 1D Ising model to a general two-player game. In the following sections we apply this to the Public goods game both with and without punishment and analyze it in the thermodynamic limit.

Public goods game
The Public goods game, otherwise known as the “free rider problem”, is a social dilemma game akin to the Prisoner's dilemma game. A “public good” is a perfectly shareable resource which, once produced, can be utilized by all in a community. In the two-player version of the Public goods game, this “public good” can be produced by either player alone by paying the full cost of the service, or it can be jointly produced if each pays for half of the service. The payoffs for the cooperators (providers) and defectors (free riders) [9] are given by

$$P_D = \frac{k\, n_c\, c}{N}, \qquad P_C = P_D - c, \tag{11}$$

where $c$ is the cost of the service, $k$ denotes the multiplication factor of the “public good”, $N$ denotes the number of players in the group (in the two-player Public goods game, $N = 2$), and $n_c$ denotes the number of cooperators in the group. When both players provide, $P_C = kc - c = 2r$; when both free ride, $P_D = 0$; and when one free rides and the other provides, the free rider receives $kc/2 = r + c/2$ while the provider receives $kc/2 - c = r - c/2$. Thus, the payoff matrix for the two-player Public goods game can be written as

$$U = \begin{array}{c|cc} & \text{provide} & \text{free ride} \\ \hline \text{provide} & 2r,\ 2r & r - \frac{c}{2},\ r + \frac{c}{2} \\ \text{free ride} & r + \frac{c}{2},\ r - \frac{c}{2} & 0,\ 0 \end{array}, \tag{12}$$

where $r, c > 0$. As we can see from the payoff matrix, (free ride, free ride) is the stable strategy or the Nash equilibrium for $r < c/2$. However, when $r > c/2$, (provide, provide) is the Nash equilibrium. We first calculate the game magnetization in the thermodynamic limit for this two-player Public goods game using the approach of Ref. [5], bringing out the imperfections in that approach, and then do it correctly using our approach as elucidated in section 4.2.
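The Nash equilibria quoted above can be checked by enumerating the four strategy pairs and testing whether either player gains by deviating unilaterally. The sketch below does this for the payoff matrix of Eq. (12); the function names and the encoding `0 = provide`, `1 = free ride` are our own illustrative choices.

```python
def public_goods_payoffs(r, c):
    """Payoffs of Eq. (12): {(row, col): (row_payoff, col_payoff)}.

    Strategy 0 is provide, strategy 1 is free ride.
    """
    return {
        (0, 0): (2 * r, 2 * r),
        (0, 1): (r - c / 2, r + c / 2),
        (1, 0): (r + c / 2, r - c / 2),
        (1, 1): (0.0, 0.0),
    }


def pure_nash_equilibria(payoffs):
    """Return all strategy pairs from which no player gains by deviating alone."""
    equilibria = []
    for row in (0, 1):
        for col in (0, 1):
            row_pay, col_pay = payoffs[(row, col)]
            row_ok = row_pay >= payoffs[(1 - row, col)][0]  # row player deviates
            col_ok = col_pay >= payoffs[(row, 1 - col)][1]  # column player deviates
            if row_ok and col_ok:
                equilibria.append((row, col))
    return equilibria
```

With cost $c = 4$, a reward above $c/2 = 2$ yields (provide, provide) and a reward below it yields (free ride, free ride), in line with the discussion of Eq. (12).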
Now, we extend the approach of Ref. [5] to the two-player Public goods game without punishment. As described above, in the Public goods game the provide strategy can be written as cooperation and the free ride strategy as defection. Thus, the provide strategy or cooperation is represented as spin up ($\uparrow$) and the free ride strategy or defection as spin down ($\downarrow$), which are represented as vectors, i.e., kets $|C\rangle, |D\rangle$ in bra-ket notation:

$$|C\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |D\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{13}$$

In matrix representation the ket vectors are $|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, while the bra vectors are their transposes, $\langle 0| = \begin{pmatrix} 1 & 0 \end{pmatrix}$ and $\langle 1| = \begin{pmatrix} 0 & 1 \end{pmatrix}$. Similar to the Ising model, the Hamiltonian of the system can be written using the payoff matrix $U$ and the projectors $P_0^{(i)} = P_C = |0\rangle\langle 0|$ and $P_1^{(i)} = P_D = |1\rangle\langle 1|$. The projectors are defined as outer products of the bra-ket vectors:

$$|0\rangle\langle 0| = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad |1\rangle\langle 1| = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$

The Hamiltonian is then given by

$$H = \sum_{i=1}^{N} \sum_{m,n=0,1} U_{mn}\, P_m^{(i)} \otimes P_n^{(i+1)}, \tag{14}$$

where 0 and 1 denote spin-up ($\uparrow$) and spin-down ($\downarrow$) sites, and the elements of the payoff matrix $U$ are $U_{00} = 2r$, $U_{01} = r - c/2$, $U_{10} = r + c/2$ and $U_{11} = 0$. $N$ denotes the total number of players. The Kronecker product of two matrices $M_1 = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$ and $M_2 = \begin{pmatrix} R & S \\ T & P \end{pmatrix}$ is $K = M_1 \otimes M_2$, and is given by

$$K = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \otimes \begin{pmatrix} R & S \\ T & P \end{pmatrix} = \begin{pmatrix} A \begin{pmatrix} R & S \\ T & P \end{pmatrix} & B \begin{pmatrix} R & S \\ T & P \end{pmatrix} \\ C \begin{pmatrix} R & S \\ T & P \end{pmatrix} & D \begin{pmatrix} R & S \\ T & P \end{pmatrix} \end{pmatrix} = \begin{pmatrix} AR & AS & BR & BS \\ AT & AP & BT & BP \\ CR & CS & DR & DS \\ CT & CP & DT & DP \end{pmatrix}.$$
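To make the projector construction concrete, the sketch below implements the Kronecker product for matrices stored as lists of rows and assembles a single nearest-neighbour term $\sum_{m,n} U_{mn} P_m \otimes P_n$ of Eq. (14). The helper names are hypothetical; the point is only that this term is diagonal in the basis $|00\rangle, |01\rangle, |10\rangle, |11\rangle$, with the payoffs $U_{00}, U_{01}, U_{10}, U_{11}$ on the diagonal.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_val * b_val for a_val in a_row for b_val in b_row]
        for a_row in a
        for b_row in b
    ]


# Projectors onto spin up (provide, index 0) and spin down (free ride, index 1).
PROJECTORS = {0: [[1, 0], [0, 0]], 1: [[0, 0], [0, 1]]}


def two_site_term(u):
    """Sum over m, n of u[m][n] * (P_m kron P_n) for a 2x2 payoff matrix u."""
    total = [[0.0] * 4 for _ in range(4)]
    for m in (0, 1):
        for n in (0, 1):
            block = kron(PROJECTORS[m], PROJECTORS[n])
            for i in range(4):
                for j in range(4):
                    total[i][j] += u[m][n] * block[i][j]
    return total
```

For example, `two_site_term([[2 * r, r - c / 2], [r + c / 2, 0.0]])` gives the two-site payoff operator for the Public goods game without punishment.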
We get the game magnetization in the thermodynamic limit using the approach of Ref. [5] to be

$$M = \frac{e^{-2 \beta r} - 1}{1 + e^{-2 \beta r}} = -\tanh(\beta r). \tag{15}$$

For a detailed calculation of the game magnetization using the approach of Ref. [5], see Appendix 7.1.

Figure 2: Game magnetization vs. reward $r$ for the Public goods game for different values of $\beta$ using the method of Ref. [5]. Note that the game magnetization is independent of $c$. Taking any value of $c$, say $c = 2$, we see that for $r > c/2$ the game magnetization is still negative, which implies that free riding is the Nash equilibrium; this contradicts the solution of the two-player Public goods game. Further, when $r = 0$, the game magnetization is 0, which again is a contradiction.

As we can see from Fig. 2, the game magnetization using the approach of Ref. [5] is always negative for $r > 0$. However, from the payoff matrix Eq. (12), when $r$ is less than $c/2$ free riding is the Nash equilibrium, and when $r$ is greater than $c/2$ provide or cooperation is the Nash equilibrium. So it is expected that a phase transition should occur at $r = c/2$. Further, when $r = 0$, from the payoff matrix Eq. (12) we see that the Nash equilibrium is still defect. However, with the approach of Ref. [5] we see that at $r = 0$ the game magnetization is also 0, meaning that there are equal numbers of providers and free riders, which is again a contradiction. In the next section, we show that with our approach to the problem the issues with the method of Ref. [5] are resolved.

Following the calculations in section 3 and using the method of Ref. [6], we make the correct connection between the Public goods game payoff matrix, Eq. (12), and the Ising game matrix, Eq. (9). As in Eq. (6), we add $\lambda = -\frac{x + z}{2} = -\frac{6r + c}{4}$ to column 1 and $\mu = -\frac{y + w}{2} = -\frac{2r - c}{4}$ to column 2 of the payoff matrix, Eq. (12). Thus the Public goods game payoff matrix Eq. (12) reduces to

$$U = \begin{array}{c|cc} & \text{provide} & \text{free ride} \\ \hline \text{provide} & \frac{2r - c}{4} & \frac{2r - c}{4} \\ \text{free ride} & -\frac{2r - c}{4} & -\frac{2r - c}{4} \end{array}. \tag{16}$$

Comparing this to the Ising game matrix Eq. (9), we have $J + h = \frac{2r - c}{4}$ and $J - h = -\frac{2r - c}{4}$. Solving these simultaneous equations, we get $J = 0$ and $h = \frac{2r - c}{4}$. The game magnetization in the thermodynamic limit is then

$$M = \frac{\sinh(\beta h)}{\sqrt{\sinh^2(\beta h) + e^{-4 \beta J}}} = \tanh\left(\beta\, \frac{2r - c}{4}\right). \tag{17}$$

As we see in Fig. 3, there is a phase transition which occurs at $r = c/2$, i.e., at $r = 2$ for $c = 4$. For $r < c/2$ the game magnetization is negative, i.e., the Nash equilibrium is the free ride strategy, while for $r > c/2$ the game magnetization is positive, i.e., the Nash equilibrium is the provide strategy. $\beta$, which defines the randomness in the strategic choices available to the players, has no bearing on the critical point of the phase transition. As $\beta \to 0$ the game magnetization tends to zero, i.e., the strategic choices become completely random. For $r < 2$, as we decrease $\beta$ from 5 to 1, almost one-fourth of the population starts cooperating. In contrast to $\beta$, the cost ($c$), as in Fig. 4, has a bearing on the critical point of the phase transition. Herein, we see that increasing cost for a fixed $\beta$ ($\beta = 5$) propels the critical point to higher values of reward ($r$). This is because when the cost of the “public good” increases with the reward remaining constant, fewer players will pay the cost. Further, when the reward ($r$) increases keeping the cost ($c$) constant, the number of cooperators increases.
Figure 3: Game magnetization vs. reward $r$ for the Public goods game for different values of $\beta$ and cost $c = 4$. For the same cost of the “public good” ($c$), the critical point doesn't change.

For two-player games, both players would choose the Nash equilibrium strategy. A natural extension from the two-player case to $N$ players would be that all the players go for the Nash equilibrium strategy. However, in contrast to the two-player case, this is not what we observe. It is true that in the thermodynamic limit the Nash equilibrium strategy is chosen by the majority of the players, but there are exceptions. For example, in the Public goods game, when the reward increases but is still less than half of the cost of the “public good”, the number of cooperators increases even though the Nash equilibrium strategy is defection. There is always a small fraction of players who choose to cooperate in the thermodynamic limit, and that fraction increases as the reward $r$ increases. Further, we see that when the cost becomes very high, there are individuals in the population who pay for the “public good” even when defection would be the best choice. The game magnetization we calculate is defined as the net difference between the fraction of players opting for a particular strategy, say provide in the Public goods game, and the fraction opting for free riding. Both these strategic choices are akin to phases of the Ising model. A phase transition occurs in the game when the majority of the population opts to change its strategy, as at $r = c/2$. For $r < c/2$ the majority free rides, while for $r > c/2$ the majority provides.

Figure 4: Game magnetization vs. reward $r$ for the Public goods game for different values of the cost $c$ with $\beta = 5$. As the cost of the “public good” ($c$) increases, for the same reward more players choose to free ride or defect.
In the Public goods game, the punishment $p$ is introduced such that whenever a player defects or free rides he has an additional negative payoff given by $-p$. Thus, the modified payoff matrix, from Eq. (12), is

$$U = \begin{array}{c|cc} & \text{provide} & \text{free ride} \\ \hline \text{provide} & 2r,\ 2r & r - \frac{c}{2},\ r + \frac{c}{2} - p \\ \text{free ride} & r + \frac{c}{2} - p,\ r - \frac{c}{2} & -p,\ -p \end{array}, \tag{18}$$

where $r, c, p > 0$. As we can see from the payoff matrix Eq. (18), when $r > c/2 - p$ cooperation or provide is the Nash equilibrium, but when $r < c/2 - p$ defection or free riding is the Nash equilibrium. We first calculate the game magnetization in the thermodynamic limit for this two-player Public goods game with punishment using the approach of Ref. [5] and point out the imperfections, and then do it correctly using our approach as alluded to in sections 3 and 4.2.
Figure 5: Game magnetization vs. reward $r$ for the Public goods game with punishment for different values of the punishment $p$ with $\beta = 1$, using the method of Ref. [5]. We see that there is no phase transition. Also, on increasing the punishment $p$ the game magnetization becomes more negative, which is incorrect.

As described earlier, in the Public goods game provide can be written as cooperation and free ride as defection. Similar to the 1D Ising model, the Hamiltonian of the system using the payoff matrix $U$ of Eq. (18) and the projectors $P_0^{(i)} = P_C = |0\rangle\langle 0|$ and $P_1^{(i)} = P_D = |1\rangle\langle 1|$ is given by

$$H = \sum_{i=1}^{N} \sum_{m,n=0,1} U_{mn}\, P_m^{(i)} \otimes P_n^{(i+1)}, \tag{19}$$

where 0 denotes spin-up ($\uparrow$) and 1 denotes spin-down ($\downarrow$) sites, with payoff matrix elements $U_{00} = 2r$, $U_{01} = r - c/2$, $U_{10} = r + c/2 - p$ and $U_{11} = -p$. Similar to that shown in section 4.1, we find the game magnetization in the thermodynamic limit using the approach of Ref. [5] to be

$$M = \frac{e^{-2 \beta r} - e^{\beta p}}{e^{\beta p} + e^{-2 \beta r}} = -\tanh\left(\beta \left(r + \frac{p}{2}\right)\right). \tag{20}$$

As we can see from Fig. 5, the game magnetization is always negative for $r > 0$ and $p > 0$, which means that free ride is always the Nash equilibrium in the thermodynamic limit. Further, when the punishment increases, a higher fraction of players chooses to defect, which is an incorrect conclusion. However, when we analyze the situation from the payoff matrix Eq. (18), if the punishment $p$ is such that the reward $r > c/2 - p$, then the players should choose provide. Further, the game magnetization is independent of the cost, which is unexpected. In the next section we resolve these issues and find the correct game magnetization.

Following the calculations in sections 3 and 4.2 and using the method of Ref. [6], we make the correct connection between the payoff matrix of the Public goods game with punishment, Eq. (18), and the Ising game matrix, Eq. (9). In our approach, as elucidated in Eq. (6), we add $\lambda = -\frac{x + z}{2} = -\frac{6r + c - 2p}{4}$ to column 1 and $\mu = -\frac{y + w}{2} = -\frac{2r - c - 2p}{4}$ to column 2 of the payoff matrix Eq. (18) to make the mapping between the payoff matrix of the game and the two-spin Ising game matrix, Eq. (9), exact. Thus, the transformed Public goods game payoff matrix Eq. (18) reduces to

$$U = \begin{array}{c|cc} & \text{provide} & \text{free ride} \\ \hline \text{provide} & \frac{2r - c + 2p}{4} & \frac{2r - c + 2p}{4} \\ \text{free ride} & -\frac{2r - c + 2p}{4} & -\frac{2r - c + 2p}{4} \end{array}. \tag{21}$$

Comparing this to the Ising game matrix, Eq. (9), we have $J + h = \frac{2r - c + 2p}{4}$ and $J - h = -\frac{2r - c + 2p}{4}$. Solving these simultaneous equations we get $J = 0$ and $h = \frac{2r - c + 2p}{4}$. Thus, the game magnetization for the Public goods game with punishment, from Eq. (10), is

$$M = \frac{\sinh(\beta h)}{\sqrt{\sinh^2(\beta h) + e^{-4 \beta J}}} = \tanh\left(\beta\, \frac{2r - c + 2p}{4}\right). \tag{22}$$

Figure 6: Game magnetization vs. the reward $r$ for the classical Public goods game with punishment for different values of the punishment $p$, with cost $c = 4$ and $\beta = 5$. As the punishment increases, for the same value of reward more players choose to provide or cooperate.

In Fig. 6 we plot the game magnetization Eq. (22) versus the reward $r$ for different values of punishment. We see that as the punishment $p$ increases, with $\beta$ fixed ($\beta = 5$), the critical point of the phase transition decreases to a lower value of reward ($r$). This is because when the punishment for defecting increases, with the reward remaining constant, more players would provide, as the penalty for defecting is high. In contrast to the cost of the “public good”, increasing the punishment for defection increases the number of cooperators.

Similar to the Public goods game without punishment, we see that not all players go for the Nash equilibrium strategy in the infinite-player or thermodynamic limit of the Public goods game with punishment. For example, when the punishment is low compared to the reward, a higher fraction of players chooses to free ride or defect, but there exists a finite fraction of players who cooperate or provide. When the punishment $p$ increases from 0 to 1, the fraction of cooperators at $r = 1$ increases by almost 50%. $\beta$ has no role in determining the critical point of the phase transition, although, depending on other triggers like cost, reward or punishment, changing $\beta$ may lead to an increase of cooperators in certain situations.

We have studied the connection between the Ising model and game theory to find the triggers for cooperative behavior in the Public goods game with and without punishment. We contrasted the results from our approach with those of Adami and Hintze in Ref. [5], and we unravel some inconsistencies in the method of Ref. [5]. In the Appendix sections 7.1 and 7.2, a detailed discussion of the errors in Adami-Hintze's method [5] is presented.
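The contrast between the two sets of results can be reproduced numerically from the closed forms alone. The sketch below evaluates Eq. (20) (which reduces to Eq. (15) at $p = 0$) and Eq. (22) (which reduces to Eq. (17) at $p = 0$), and locates the critical reward at which Eq. (22) changes sign. The function names are ours, and `provider_fraction` assumes that the game magnetization is the difference between the provider and free-rider fractions, with the two fractions summing to one.

```python
import math


def magnetization_ref5(r, p, beta):
    """Eq. (20): result of the method of Ref. [5]; p = 0 gives Eq. (15)."""
    return -math.tanh(beta * (r + p / 2.0))


def magnetization_exact_mapping(r, c, p, beta):
    """Eq. (22): result of the exact Ising mapping; p = 0 gives Eq. (17)."""
    return math.tanh(beta * (2.0 * r - c + 2.0 * p) / 4.0)


def critical_reward(c, p):
    """Reward at which Eq. (22) changes sign, i.e. 2r - c + 2p = 0."""
    return c / 2.0 - p


def provider_fraction(magnetization):
    """Provider fraction, assuming M = (providers - free riders) / total."""
    return (1.0 + magnetization) / 2.0
```

With $c = 4$ and $\beta = 5$, raising $p$ from 0 to 1 moves the critical reward from 2 to 1 and lifts the provider fraction at $r = 1$ from below one percent to one half, while Eq. (20) stays negative for every positive reward.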
Further, using the correct approach to the problem as dealt with in sections 3, 4.2 and 5.2, we see that in the Public goods game cost plays a non-trivial role in determining the critical point of the phase transition between providing and free riding. The thermodynamic limit of the Public goods game both with and without punishment shows that reward and punishment are the strongest triggers for providing. Cost invariably suppresses cooperation (provide), while the role of $\beta$ (fluctuation in choices) is to randomize strategic choices.

Following the method of Ref. [5], the Hamiltonian is given by

$$H = \sum_{i=1}^{N} \sum_{m,n=0,1} U_{mn}\, P_m^{(i)} \otimes P_n^{(i+1)}, \tag{23}$$

where $U_{mn}$ denotes the matrix elements of the payoff matrix Eq. (12). Thus, the partition function is

$$Z = \sum_{x} \langle x | e^{-\beta H} | x \rangle = \sum_{m_1, m_2, \dots, m_N} e^{-\beta \left( U_{m_1 m_2} + U_{m_2 m_3} + \dots + U_{m_N m_1} \right)} = \mathrm{Tr}(E^N),$$

where $|x\rangle = |m_1 m_2 \dots m_N\rangle$ is the state of the system and the $ij$th element of the matrix $E$ is $e^{-\beta U_{ij}}$, with the $E$ matrix given as

$$E = \begin{pmatrix} e^{-2 \beta r} & e^{-\beta (r - \frac{c}{2})} \\ e^{-\beta (r + \frac{c}{2})} & 1 \end{pmatrix}. \tag{24}$$

Using the above expression, the partition function becomes

$$Z = \mathrm{Tr}(e^{-\beta H}) = \mathrm{Tr}(E^N) = (1 + e^{-2 \beta r})^N. \tag{25}$$

As in Ref. [5], the average value of choosing a particular strategy $m$ is the expectation value of $P_m^{(i)}$, with $m$ being the spin at site $i$. Thus,

$$\langle P_m^{(i)} \rangle = \frac{\sum_x \langle x | P_m^{(i)} e^{-\beta H} | x \rangle}{Z N}. \tag{26}$$

The average value $\langle P_C \rangle = \langle P_0^{(i)} \rangle$ for spin up ($\uparrow$), i.e., $m = 0$ or the provide strategy (C), is

$$\langle P_C \rangle = \frac{\sum_x \langle x | P_C e^{-\beta H} | x \rangle}{Z N} = \frac{N\, \mathrm{Tr}(E_C E^{N-1})}{Z N}, \quad \text{where } E_C = \begin{pmatrix} e^{-2 \beta r} & e^{-\beta (r - \frac{c}{2})} \\ 0 & 0 \end{pmatrix}. \tag{27}$$

Thus, $\mathrm{Tr}(E_C E^{N-1}) = \mathrm{Tr}(E_C)\, \mathrm{Tr}(E^{N-1}) = e^{-2 \beta r} (1 + e^{-2 \beta r})^{N-1}$, implying

$$\langle P_C \rangle = \frac{e^{-2 \beta r}}{1 + e^{-2 \beta r}}.$$
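The trace identities used in this derivation are easy to verify numerically for a finite number of players. The sketch below builds the matrix $E$ of Eq. (24) and checks Eq. (25) and the resulting $\langle P_C \rangle$ with plain 2x2 matrix arithmetic; all helper names are illustrative and the check is not part of Ref. [5].

```python
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mat_pow(a, n):
    """n-th power of a 2x2 matrix by repeated multiplication (n >= 0)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = mat_mul(result, a)
    return result


def trace(a):
    return a[0][0] + a[1][1]


def e_matrix(r, c, beta):
    """Matrix E of Eq. (24) with elements exp(-beta * U_ij)."""
    return [
        [math.exp(-2 * beta * r), math.exp(-beta * (r - c / 2))],
        [math.exp(-beta * (r + c / 2)), 1.0],
    ]


def provider_average(r, c, beta, n_players):
    """<P_C> = Tr(E_C E^(N-1)) / Tr(E^N), with E_C the 'provide' row of E."""
    e = e_matrix(r, c, beta)
    e_c = [e[0], [0.0, 0.0]]
    z = trace(mat_pow(e, n_players))
    return trace(mat_mul(e_c, mat_pow(e, n_players - 1))) / z
```

The result does not depend on $c$ or on the number of players, which is the cost independence of Eq. (15) that the main text criticizes.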
Similarly, we can calculate the average value $\langle P_D \rangle = \langle P_1^{(i)} \rangle$ for spin down ($\downarrow$), i.e., $m = 1$ or the free ride strategy (D):

$$\langle P_D \rangle = \frac{\sum_x \langle x | P_D e^{-\beta H} | x \rangle}{Z N} = \frac{N\, \mathrm{Tr}(E_D E^{N-1})}{Z N}, \quad \text{where } E_D = \begin{pmatrix} 0 & 0 \\ e^{-\beta (r + \frac{c}{2})} & 1 \end{pmatrix}. \tag{28}$$

Thus, $\mathrm{Tr}(E_D E^{N-1}) = \mathrm{Tr}(E_D)\, \mathrm{Tr}(E^{N-1}) = (1 + e^{-2 \beta r})^{N-1}$, implying

$$\langle P_D \rangle = \frac{1}{1 + e^{-2 \beta r}}.$$

The game magnetization $M$, i.e., the difference in the average fraction of players choosing cooperation over defection, using the approach of Ref. [5] is then

$$M = \frac{e^{-2 \beta r} - 1}{1 + e^{-2 \beta r}} = -\tanh(\beta r). \tag{29}$$

A similar calculation can be done for the two-player Public goods game with punishment, where we use the payoff matrix $U$ as in Eq. (18).

7.2 The error in Adami-Hintze's approach, Ref. [5]

Herein we first analyze the two-player analog of Adami-Hintze's approach in order to expose the mistakes in their method.
Adami and Hintze’s approach
We start with a two-spin system and then calculate the average payoff. The Hamiltonian for a two-spin system defined using the Adami-Hintze approach is

$$H = \sum_{m,n=1,2} U_{mn}\, P_m^{(1)} \otimes P_n^{(2)} + \sum_{m,n=1,2} U_{mn}\, P_m^{(2)} \otimes P_n^{(1)}. \tag{30}$$

Equation (30) is Eq. (2) of Ref. [5] for $N = 2$. $U_{mn}$ represents the element of the payoff matrix $U$ in the $m$th row and $n$th column. The operator $P_m^{(1)}$ is the projector at site 1, with $m = 1$ meaning $P_1^{(1)} = |0\rangle\langle 0|$ and $m = 2$ meaning $P_2^{(1)} = |1\rangle\langle 1|$. Similarly, $P_m^{(2)}$ signifies the projector at site 2 with $m$ taking values 1 or 2. The indices $m, n$ label the strategies: 1 denotes cooperation while 2 denotes defection. Expanding the Hamiltonian Eq. (30),

$$\begin{aligned} H = {} & U_{11} P_1^{(1)} \otimes P_1^{(2)} + U_{12} P_1^{(1)} \otimes P_2^{(2)} + U_{21} P_2^{(1)} \otimes P_1^{(2)} + U_{22} P_2^{(1)} \otimes P_2^{(2)} \\ & + U_{11} P_1^{(2)} \otimes P_1^{(1)} + U_{12} P_1^{(2)} \otimes P_2^{(1)} + U_{21} P_2^{(2)} \otimes P_1^{(1)} + U_{22} P_2^{(2)} \otimes P_2^{(1)}. \end{aligned} \tag{31}$$

The state of the system can then be written as $|x\rangle = |m_1\rangle \otimes |m_2\rangle = |m_1 m_2\rangle$, i.e., $|x\rangle$ is the tensor product of the states of the individual sites, where $m_i \in \{1, 2\}$ is the strategy index at site $i$. The partition function for the two-spin system is then

$$\begin{aligned} Z = \mathrm{Tr}(e^{-\beta H}) = \sum_x \langle x | e^{-\beta H} | x \rangle & = \sum_{m_1, m_2 = 1,2} \langle m_1 m_2 | \left( 1 - \beta H + \frac{(\beta H)^2}{2!} - \dots \right) | m_1 m_2 \rangle \\ & = \sum_{m_1, m_2 = 1,2} \left[ 1 - \beta \langle m_1 m_2 | H | m_1 m_2 \rangle + \frac{\beta^2}{2!} \langle m_1 m_2 | H^2 | m_1 m_2 \rangle - \dots \right]. \end{aligned} \tag{32}$$

Since $P_{m_1}^{(1)} = |m_1\rangle\langle m_1|$ and $P_{m_2}^{(2)} = |m_2\rangle\langle m_2|$, we have $H | m_1 m_2 \rangle = U_{m_1 m_2} | m_1 m_2 \rangle + U_{m_2 m_1} | m_1 m_2 \rangle$. Thus,

$$\langle m_1 m_2 | H | m_1 m_2 \rangle = U_{m_1 m_2} + U_{m_2 m_1}, \tag{33}$$

and

$$\langle m_1 m_2 | H^2 | m_1 m_2 \rangle = (U_{m_1 m_2} + U_{m_2 m_1}) \langle m_1 m_2 | H | m_1 m_2 \rangle = (U_{m_1 m_2} + U_{m_2 m_1})^2. \tag{34}$$

Therefore, from Eqs. (32)-(34) we have

$$Z = \sum_{m_1, m_2 = 1,2} \left[ 1 - \beta (U_{m_1 m_2} + U_{m_2 m_1}) + \frac{\beta^2}{2!} (U_{m_1 m_2} + U_{m_2 m_1})^2 - \dots \right],$$

or,

$$Z = \sum_{m_1, m_2 = 1,2} e^{-\beta (U_{m_1 m_2} + U_{m_2 m_1})} = e^{-2 \beta U_{11}} + 2 e^{-\beta (U_{12} + U_{21})} + e^{-2 \beta U_{22}}. \tag{35}$$

In Ref. [5], the condition $U_{11} + U_{22} = U_{12} + U_{21}$ leads to

$$Z = \left( e^{-\beta U_{11}} + e^{-\beta U_{22}} \right)^2. \tag{36}$$

Now for the Public goods game the payoff matrix is

$$U = \begin{array}{c|cc} & s_1 & s_2 \\ \hline s_1 & 2r,\ 2r & r - \frac{c}{2},\ r + \frac{c}{2} \\ s_2 & r + \frac{c}{2},\ r - \frac{c}{2} & 0,\ 0 \end{array},$$

wherein $U_{11} = 2r$, $U_{12} = r - \frac{c}{2}$, $U_{21} = r + \frac{c}{2}$ and $U_{22} = 0$, so that the condition $U_{11} + U_{22} = U_{12} + U_{21}$ is satisfied. Thus, we have

$$Z = \left( e^{-\beta U_{11}} + e^{-\beta U_{22}} \right)^2 = \mathrm{Tr}(E)^2 = \left( e^{-\beta w} + e^{-\beta x} \right)^2.$$

From section 3 of our main manuscript, the energies in the Ising model are equivalent to the payoffs of game theory. Thus, we try to find the average payoff (or the average energy). However, while energies are minimized to get the equilibrium (the point of lowest energy or ground state), the payoffs in a game are maximized. The average payoff using Adami and Hintze's approach is then

$$\langle E \rangle = -\frac{\partial \ln Z}{\partial \beta} = -\frac{\partial \ln \left( e^{-\beta U_{11}} + e^{-\beta U_{22}} \right)^2}{\partial \beta} = 2\, \frac{U_{11} e^{-\beta U_{11}} + U_{22} e^{-\beta U_{22}}}{e^{-\beta U_{11}} + e^{-\beta U_{22}}}. \tag{37}$$

Now, in the limit $\beta \to \infty$ there is no randomization of strategies, so in this limit we should get back the results of the two-player case. Imposing the limit $\beta \to \infty$, the payoff for each player (dividing the average $\langle E \rangle$ by 2) becomes

$$\lim_{\beta \to \infty} \frac{\langle E \rangle}{2} = \lim_{\beta \to \infty} \frac{U_{11} e^{-\beta U_{11}} + U_{22} e^{-\beta U_{22}}}{e^{-\beta U_{11}} + e^{-\beta U_{22}}} = \lim_{\beta \to \infty} \frac{2r\, e^{-2 \beta r}}{1 + e^{-2 \beta r}}. \tag{38}$$

Let us take two cases and see whether Eq. (38) gives the correct average payoff.

1. For $r > c/2$, the Nash equilibrium is the strategy (provide, provide). However, from Eq. (38), we get

$$\lim_{\beta \to \infty} \frac{\langle E \rangle}{2} = \lim_{\beta \to \infty} \frac{2r\, e^{-2 \beta r}}{1 + e^{-2 \beta r}} = 0, \tag{39}$$

which is the payoff of the strategy (free ride, free ride). In this case we arrive at a wrong conclusion.

2. For $r < c/2$, the Nash equilibrium is the strategy (free ride, free ride). From Eq. (38), we get

$$\lim_{\beta \to \infty} \frac{\langle E \rangle}{2} = \lim_{\beta \to \infty} \frac{2r\, e^{-2 \beta r}}{1 + e^{-2 \beta r}} = 0, \tag{40}$$

which is the payoff of the strategy (free ride, free ride), the correct payoff for this case.

Thus, we have shown that even in the two-player case, the approach of Ref. [5] gives incorrect average payoffs for the case $r > c/2$ while it gives correct payoffs for $r < c/2$. Now let us analyze the two-player case using our approach.

Using our approach
For the two-spin case, the Hamiltonian from Eq. (1) of this manuscript is

\[ H = -J (\sigma_1 \sigma_2 + \sigma_2 \sigma_1) - \frac{h}{2} (\sigma_1 + \sigma_2), \tag{41} \]

wherein $\sigma_1$ represents the spin at the first site, which can be in one of two states: up or down. Similarly, $\sigma_2$ represents the spin at the second site. The partition function of this two-site system is

\[ Z = \sum_{\sigma_1} \sum_{\sigma_2} e^{\beta \left( J (\sigma_1 \sigma_2 + \sigma_2 \sigma_1) + \frac{h}{2} (\sigma_1 + \sigma_2) \right)} = e^{\beta (2J + h)} + 2 e^{-\beta (2J)} + e^{\beta (2J - h)} . \tag{42} \]

For the public goods game, see Eq. (16), we get $J = 0$ and $h = \frac{2r - c}{4}$. Thus, we have $Z = e^{\beta h} + 2 + e^{-\beta h}$. The energy in the Ising model is the equivalent of the payoffs of game theory, with the caveat that the payoffs are maximized whereas the energies are minimized. The average energy for the two-player case is

\[ \langle E \rangle = -\frac{\partial \ln Z}{\partial \beta} = -\frac{\partial \ln \left( e^{\beta h} + 2 + e^{-\beta h} \right)}{\partial \beta} = \frac{-h e^{\beta h} + h e^{-\beta h}}{e^{\beta h} + 2 + e^{-\beta h}} . \tag{43} \]

However, as was mentioned in section 3 of the main manuscript, the payoffs should be taken as $-\langle E \rangle$, since in a game we intend to maximize the payoffs (whereas in solving the Ising model our aim is to minimize the energies, so as to determine the ground state). Now, let us consider the two cases and see whether our approach gives the correct average payoff.

1. For $r > \frac{c}{2}$ we have $h > 0$, and the Nash equilibrium is the strategy (provide, provide). Using our approach, from Eq. (43), we get

\[ \lim_{\beta \to \infty} -\langle E \rangle = \lim_{\beta \to \infty} \frac{h e^{\beta h} - h e^{-\beta h}}{e^{\beta h} + 2 + e^{-\beta h}} = h = \frac{2r - c}{4}, \tag{44} \]

which is the payoff of the strategy (provide, provide) of the transformed payoff matrix, see Eq. (16) of the main manuscript.

2. For $r < \frac{c}{2}$ we have $h < 0$, and the Nash equilibrium is the strategy (freeride, freeride). Using our approach, from Eq. (43), we get

\[ \lim_{\beta \to \infty} -\langle E \rangle = \lim_{\beta \to \infty} \frac{h e^{\beta h} - h e^{-\beta h}}{e^{\beta h} + 2 + e^{-\beta h}} = -h = -\frac{2r - c}{4}, \tag{45} \]

which is the payoff of the strategy (freeride, freeride) of the transformed payoff matrix, see Eq. (16) of the main manuscript.

Thus, our approach gives the correct payoffs corresponding to the Nash equilibrium strategies.

Ref. [5]: C. Adami and A. Hintze, Thermodynamics of Evolutionary Games, Phys. Rev. E 97, 062136 (2018).

An essay by P. Ralegankar, Understanding Emergence of Cooperation using tools from Thermodynamics, available at http://guava.physics.uiuc.edu/~nigel/courses/569/Essays_Spring2018/Files/Ralegankar.pdf, also comes to similar conclusions.

The payoff matrix for the two-player public goods game can be written as

\[ U = \begin{array}{c|cc} & \text{provide} & \text{freeride} \\ \hline \text{provide} & 2r,\; 2r & r - \frac{c}{2},\; r + \frac{c}{2} \\ \text{freeride} & r + \frac{c}{2},\; r - \frac{c}{2} & 0,\; 0 \end{array} \tag{46} \]

Transforming the elements of the payoff matrix by adding a factor $\lambda$ to column 1 and $\mu$ to column 2 for the row player's payoffs, and adding a factor $\lambda'$ to row 1 and $\mu'$ to row 2 for the column player's payoffs, we get

\[ U = \begin{array}{c|cc} & \text{provide} & \text{freeride} \\ \hline \text{provide} & 2r + \lambda,\; 2r + \lambda' & r - \frac{c}{2} + \mu,\; r + \frac{c}{2} + \lambda' \\ \text{freeride} & r + \frac{c}{2} + \lambda,\; r - \frac{c}{2} + \mu' & \mu,\; \mu' \end{array} \tag{47} \]

Making the transformations $\lambda = \lambda' = -\frac{6r + c}{4}$ and $\mu = \mu' = -\frac{2r - c}{4}$, such that the mapping of the two-player public goods game to the two-spin Ising energy matrix is exact, as shown for the general case of Eq. (7), we get

\[ U = \begin{array}{c|cc} & \text{provide} & \text{freeride} \\ \hline \text{provide} & \frac{2r - c}{4},\; \frac{2r - c}{4} & \frac{2r - c}{4},\; -\frac{2r - c}{4} \\ \text{freeride} & -\frac{2r - c}{4},\; \frac{2r - c}{4} & -\frac{2r - c}{4},\; -\frac{2r - c}{4} \end{array} \tag{48} \]

Our claim is that the Nash equilibrium is identical for Eqs. (46) and (48). Below, a simple proof of this claim using fixed point analysis is provided. A fixed point of a function is a point that the function maps to itself. In a two-dimensional coordinate space, a fixed point of a function $f(x, y)$ is mathematically defined as a point $(x, y)$ such that [10]

\[ f(x, y) = (x, y). \tag{49} \]

From Brouwer's fixed point theorem it is known that a 2D triangle $\Delta$ has the fixed point property. This implies that any continuous function mapping a 2D triangle into itself has a fixed point (for a detailed proof of this theorem refer to [10]). The probabilities for choosing a strategy represent points inside a square of side length 1. Thus $S = (x, y)$ with $0 < x < 1$ and $0 < y < 1$, where $x$ represents the probability of choosing a strategy by the row player and $y$ represents the probability of choosing a strategy by the column player. It can be shown that a triangle and a square are topologically equivalent [10], which implies that if a triangle has the fixed point property, so does a square. A function can be defined on all the points inside the square. To define such a function [10], let us consider a vector with coordinates $(u_1, u_2)$ and another vector with coordinates $(v_1, v_2)$, given as follows:

\[ \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = A \begin{pmatrix} y \\ 1 - y \end{pmatrix}, \tag{50} \]

and

\[ \begin{pmatrix} v_1 & v_2 \end{pmatrix} = \begin{pmatrix} x & 1 - x \end{pmatrix} B, \tag{51} \]

where $x$ and $y$ are the probabilities to choose a particular strategy. $A$ and $B$ denote the respective payoff matrices for the row player and the column player. Using this, the fixed point function from Eq. (49) is given by

\[ f(x, y) = \left( \frac{x + (u_1 - u_2)^+}{1 + |u_1 - u_2|},\; \frac{y + (v_1 - v_2)^+}{1 + |v_1 - v_2|} \right), \tag{52} \]

where $(u_1 - u_2)^+ = \frac{u_1 - u_2 + |u_1 - u_2|}{2}$ and $(v_1 - v_2)^+ = \frac{v_1 - v_2 + |v_1 - v_2|}{2}$. We determine $u_1$, $u_2$, $v_1$ and $v_2$ for the payoff matrix as in Eq. (46) and then for the transformed one as in Eq. (48). For the payoff matrix of Eq. (46) we get the coordinates ($u_i$ and $v_i$ for $i = 1, 2$) as

\[ \begin{aligned} u_1 &= 2r\, y + \left( r - \tfrac{c}{2} \right)(1 - y), & u_2 &= \left( r + \tfrac{c}{2} \right) y + 0\,(1 - y), \\ v_1 &= 2r\, x + \left( r - \tfrac{c}{2} \right)(1 - x), & v_2 &= \left( r + \tfrac{c}{2} \right) x + 0\,(1 - x). \end{aligned} \tag{53} \]

Now, for the transformed payoff matrix as in Eq. (48), the fixed point function is given by

\[ f^t(x, y) = \left( \frac{x + (u^t_1 - u^t_2)^+}{1 + |u^t_1 - u^t_2|},\; \frac{y + (v^t_1 - v^t_2)^+}{1 + |v^t_1 - v^t_2|} \right). \tag{54} \]

Again we determine the coordinates ($u^t_i$ and $v^t_i$ for $i = 1, 2$) as

\[ \begin{aligned} u^t_1 &= \tfrac{2r - c}{4}\, y + \tfrac{2r - c}{4}\,(1 - y), & u^t_2 &= -\tfrac{2r - c}{4}\, y - \tfrac{2r - c}{4}\,(1 - y), \\ v^t_1 &= \tfrac{2r - c}{4}\, x + \tfrac{2r - c}{4}\,(1 - x), & v^t_2 &= -\tfrac{2r - c}{4}\, x - \tfrac{2r - c}{4}\,(1 - x). \end{aligned} \tag{55} \]

From Eq. (53) and Eq. (55), $u_1 - u_2 = u^t_1 - u^t_2 = r - \frac{c}{2}$ and $v_1 - v_2 = v^t_1 - v^t_2 = r - \frac{c}{2}$. Thus, $f^t(x, y) = f(x, y)$, which implies that the Nash equilibrium remains unchanged under the transformation of Eq. (46) into Eq. (48).
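The equality of the two fixed point functions can also be checked numerically. The following is a minimal Python sketch, not part of the original derivation: it evaluates the map of Eq. (52) on a grid for the payoff matrix of Eq. (46) and for the transformed matrix of Eq. (48), using exact rational arithmetic. The function names (`payoff_matrices`, `transformed_matrices`, `nash_map`, `maps_agree`) and the sample values of r and c are illustrative assumptions, and x, y are taken to be the probabilities of choosing provide.

```python
from fractions import Fraction


def payoff_matrices(r, c):
    """Row-player matrix A and column-player matrix B read off Eq. (46)."""
    r, c = Fraction(r), Fraction(c)
    A = [[2 * r, r - c / 2], [r + c / 2, Fraction(0)]]
    B = [[2 * r, r + c / 2], [r - c / 2, Fraction(0)]]
    return A, B


def transformed_matrices(r, c):
    """Row-player and column-player matrices read off Eq. (48)."""
    r, c = Fraction(r), Fraction(c)
    a = (2 * r - c) / 4
    A_t = [[a, a], [-a, -a]]
    B_t = [[a, -a], [a, -a]]
    return A_t, B_t


def positive_part(d):
    """(d)^+ = (d + |d|) / 2, as defined below Eq. (52)."""
    return (d + abs(d)) / 2


def nash_map(A, B, x, y):
    """Fixed point function of Eq. (52), with u from Eq. (50) and v from Eq. (51)."""
    u1 = A[0][0] * y + A[0][1] * (1 - y)
    u2 = A[1][0] * y + A[1][1] * (1 - y)
    v1 = x * B[0][0] + (1 - x) * B[1][0]
    v2 = x * B[0][1] + (1 - x) * B[1][1]
    fx = (x + positive_part(u1 - u2)) / (1 + abs(u1 - u2))
    fy = (y + positive_part(v1 - v2)) / (1 + abs(v1 - v2))
    return fx, fy


def maps_agree(r, c, steps=10):
    """True if f and f^t coincide at every point of a (steps+1) x (steps+1) grid."""
    A, B = payoff_matrices(r, c)
    A_t, B_t = transformed_matrices(r, c)
    grid = [Fraction(i, steps) for i in range(steps + 1)]
    return all(
        nash_map(A, B, x, y) == nash_map(A_t, B_t, x, y)
        for x in grid
        for y in grid
    )


if __name__ == "__main__":
    # Sample parameters chosen only for illustration: one case with r > c/2
    # and one with r < c/2.
    for r, c in [(3, 2), (1, 4)]:
        print(r, c, maps_agree(r, c))
```

In this sketch the corner (1, 1) is left fixed by the map when r > c/2 and the corner (0, 0) when r < c/2, in line with the Nash equilibria (provide, provide) and (freeride, freeride) discussed above.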
This work was supported by the grants: 1. "Non-local correlations in nanoscale systems: Role of decoherence, interactions, disorder and pairing symmetry" from Science & Engineering Research Board, New Delhi, Government of India, Grant No. EMR/2015/001836, Principal Investigator: Dr. Colin Benjamin, National Institute of Science Education and Research, Bhubaneswar, India, and 2. "Nash equilibrium versus Pareto optimality in N-Player games", SERB MATRICS Grant No. MTR/2018/000070, Principal Investigator: Dr. Colin Benjamin, National Institute of Science Education and Research, Bhubaneswar, India.