Imitation of Success Leads to Cost of Living Mediated Fairness in the Ultimatum Game

Abstract

The mechanism behind the emergence of cooperation in both biological and social systems is currently not understood. In particular, human behavior in the Ultimatum Game is almost always irrational, preferring mutualistic sharing strategies, while chimpanzees act rationally and selfishly. However, human behavior varies with geographic and cultural differences, leading to distinct behaviors. In this paper, we analyze a social imitation model that incorporates internal energy caches (e.g., food/money savings), cost of living, death, and reproduction. We show that when imitation (and death) occurs, a natural correlation between selfishness and cost of living emerges. However, in all societies that do not collapse, non-Nash sharing strategies emerge as the de facto result of imitation. We explain these results by constructing a mean-field approximation of the internal energy cache informed by time-varying distributions extracted from experimental data. Results from a meta-analysis of geographically diverse Ultimatum Game studies in humans show that the proposed model captures some of the qualitative aspects of the real-world data and suggest further experimentation.


Yunong Chen ∗, Andrew Belmonte ∗†, Christopher Griffin ∗‡

Preprint - November 25, 2020


∗ Dept. of Mathematics, The Pennsylvania State University, University Park, PA 16802
† Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802
‡ Applied Research Laboratory, The Pennsylvania State University, University Park, PA 16802
arXiv preprint [physics.soc-ph]

Cooperation is critical for the emergence of societies (e.g., ants, cetaceans, humans, etc.). However, cooperation is frequently an irrational response to an environment with a cost of living. Consequently, understanding and modeling the mechanism behind the emergence of cooperation and fairness is still an active area of research in social and biological theory [1–12].

The Ultimatum Game (UG) is an archetypal game illustrating the difficulties in modeling concepts of both fairness and cooperation. In the game, one player is given a sum of money which she must divide in some proportion between herself and a second player. The second player may then accept the offer, in which case the pot is divided accordingly, or reject the offer, in which case each player receives nothing [13]. This is like a continuous variation of the Stag-Hunt game, in which individual gain competes against mutual benefit. The notional money can act as a stand-in for a cooperative hunt, business venture, etc. Here we introduce an additional UG variable, individual wealth, which drives the dynamic "imitate the successful".

A considerable amount of theoretical and experimental research has been done on the ultimatum game (see e.g., [14–24]). Classical game theory asserts that the most rational, sub-game perfect solution is for the dividing player to keep as much of the prize as possible, while the deciding player accepts any offer. However, almost all experiments with humans (but not chimpanzees [19]) show that individuals will offer far more than the minimum quantity and that deciding players will frequently reject offers at the expense of their own well-being (presumably as an act of punishment for unfair or non-cooperative behavior). In particular, Oosterbeek et al. conducted a meta-analysis of 37 papers with 75 results from various countries [16], and concluded that there is not a significant difference in proposers' behavior, but there is a difference in responders' behaviors across (geographic) regions. Any model of UG dynamics should include elements observed in their meta-study: (i) it must produce a diversity of results that can be tuned to explain geographic diversity; (ii) it should explain why offers show less variation than rejection rates; (iii) it should be generally consistent with human behavior.

Mathematical models by Nowak et al. approach the Ultimatum Game from an evolutionary game theory perspective, by including the reputation of agents as part of the offer making process [25] and in a one-shot game context [26]. Gale et al. construct a discrete strategy evolutionary game representation [13], and show that evolutionarily stable strategies exist in this game. In addition to this, a substantial amount of work has been done on spatial ultimatum games [27–30].

Our proposed model is an agent-based simulation following the spirit of [25, 26], similar to the approach taken in [31, 32]. We introduce a dynamic wealth variable [33] for each player, as an integrated measure of success. In our model, agents interact randomly and each interaction is an instance of UG with a possible prize $P$. Agents are chosen at random to be the offerer or decider. The state of agent $i$ is specified by internal variables $(\lambda_i, \theta_i, B_i)$, where $\lambda_i \in [0, 1]$ is the fairness (cooperation) demanded by Agent $i$, $\theta_i \in [0, 1]$ is the offer made by Agent $i$, and $B_i > 0$ is its energy cache. The cost of living is $\kappa$, which is subtracted at each time step from the energy cache of each player.

If Agents $i$ and $j$ interact and $i$ is the offerer, then Agent $j$ rejects the offer whenever $\theta_i < \lambda_j$. In the case of acceptance, Agent $i$ keeps $(1 - \theta_i)P$ and Agent $j$ keeps $\theta_i P$. When $P = 1$, all parameters can be expressed as ratios of $P$. Let $\chi_{ij}(t)$ be an indicator function that is 1 at time $t$ exactly when $i$ and $j$ interact. The discrete time agent-based model dynamics are given by:

$$B_i(t + \epsilon) = B_i(t) + \epsilon \sum_{j \neq i} \chi_{ij}(t)\, P\, (1 - \theta_i)\, U(\theta_i - \lambda_j) + \epsilon \sum_{j \neq i} \chi_{ij}(t)\, P\, \theta_j\, U(\theta_j - \lambda_i) - \kappa, \tag{1}$$

where

$$U(x) = \begin{cases} 1 & x > 0 \\ 0 & \text{otherwise.} \end{cases}$$

Here $U$ is the unit step function, and we take $\kappa \in [0, 1/2]$ and $\epsilon = 1$. Taking the expected value of these equations, a mean-field approximation of the agent energy dynamics can be derived:

$$\Delta \hat{B}_i(t) = \frac{\epsilon}{q} \sum_{j \neq i} \left[ (1 - \theta_i)\, U(\theta_i - \lambda_j) + \theta_j\, U(\theta_j - \lambda_i) \right] - \kappa. \tag{2}$$

The normalizing value $q$ is given by:

$$q = \begin{cases} 2n & n \text{ is odd} \\ 2(n - 1) & \text{otherwise,} \end{cases}$$

which models the random choice of two agents from a completely connected population of $n$ agents. We next propose dynamics that drive the population towards a statistical equilibrium $(\theta_i, \lambda_i) \to (\theta^*, \lambda^*)$. However, independent of any game dynamics for the population, we can already derive certain relations that characterize the dynamics of the energy cache $B$ using Eq. (2). If $\lambda^* > \theta^*$, then $U(\theta^* - \lambda^*) = 0$ and $\Delta \hat{B} < 0$. If $\lambda^* < \theta^*$, then as $n \to \infty$:

$$\Delta \hat{B}(t) = \epsilon \left( \tfrac{1}{2} - \kappa \right), \tag{3}$$

which also holds in general for even $n$. Thus, if $\kappa > \tfrac{1}{2}$, the population will collapse in the mean. For $\kappa = \tfrac{1}{2}$, the population energy caches will stabilize in the mean, and for $\kappa < \tfrac{1}{2}$, the population energy caches will increase without bound.

In discrete time, the dynamics of $(\theta, \lambda)$ are given by:

$$\lambda_i(t + \epsilon) = \lambda_i(t) + \epsilon \sum_{j \neq i} (\lambda_j - \lambda_i)\, p_{ij} \tag{4}$$

$$\theta_i(t + \epsilon) = \theta_i(t) + \epsilon \sum_{j \neq i} (\theta_j - \theta_i)\, p_{ij}, \tag{5}$$

where $p_{ij}$ are imitation probabilities. Let:

$$Q_i = \sum_j U(B_j - B_i); \tag{6}$$

this is the cumulative difference in energy values for all agents $j$ with $B_j > B_i$. For the discrete time simulation, we set:

$$p_{ij} = \begin{cases} \dfrac{(B_j - B_i)\, U(B_j - B_i)}{\sum_h (B_h - B_i)\, U(B_h - B_i)} & \text{if } Q_i > 0 \\ 0 & \text{otherwise.} \end{cases} \tag{7}$$
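With $\epsilon = 1$, Eqs. (4), (5) and (7) move each agent to an energy-weighted average of the agents that outperform it. The following is a minimal Python sketch of that step, assuming a synchronous update in which all agents read the same pre-update values. The function names `imitation_probabilities` and `imitation_step` are illustrative and are not taken from the authors' implementation.

```python
def imitation_probabilities(B, i):
    """Eq. (7): imitation weights of agent i over all agents.

    Only agents with a larger energy cache get positive weight; if no
    agent outperforms agent i, every weight is zero.
    """
    # (B_j - B_i) U(B_j - B_i) for every j
    excess = [max(b - B[i], 0.0) for b in B]
    total = sum(excess)
    if total <= 0.0:
        return [0.0] * len(B)
    return [e / total for e in excess]


def imitation_step(values, B, eps=1.0):
    """Eqs. (4)-(5) applied to one trait vector (theta or lambda).

    Assumption: the update is synchronous, so the new values are
    computed from the old ones for every agent.
    """
    new_values = []
    for i, v_i in enumerate(values):
        p = imitation_probabilities(B, i)
        drift = sum((v_j - v_i) * p_ij for v_j, p_ij in zip(values, p))
        new_values.append(v_i + eps * drift)
    return new_values


if __name__ == "__main__":
    theta = [0.2, 0.4, 0.6]
    B = [1.0, 2.0, 4.0]
    print(imitation_step(theta, B))
```

In this sketch the agent with the largest energy cache never changes its values, which is the fixed-leader situation discussed in the text.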

Eqs. (4) and (5) are imitation dynamics in which agents imitate those who outperform them. Thus Agent $j$ does not rationally choose $(\lambda_j, \theta_j)$, but adjusts these values based on observations weighted towards other, more successful agents. Our model is based on recent research showing that children will imitate higher status individuals more selectively than lower status individuals [34]. Additionally, children will infer status based on observing imitation in adults [35]. Lastly, [36] shows that in strategic settings humans will imitate behavior based on pay-off inequality.

For imitation systems like Eqs. (4) and (5), Griffin et al. proved that a sufficient condition for convergence is the emergence of a fixed leader $i^*$ imitated (directly or indirectly) by all agents [37], which readily occurs in this system as a result of the total ordering of $B_i$. As $\epsilon \to 0$, Eqs. (4) and (5) become the continuous time consensus equations as surveyed in [38], but with state-varying coefficients. The proof of convergence in [37] for discrete time updates suggests that exact values of $p_{ij}$ are irrelevant, as long as Agent $i$ is imitating those agents who outperform it.

Whether in continuous or discrete dynamics, these systems have an infinite set of fixed points $\theta_k = \theta^*$, $\lambda_k = \lambda^*$ for $(\theta^*, \lambda^*) \in [0, 1] \times [0, 1]$. However, any fixed point with $\lambda^* > \theta^*$ would lead to population collapse for any cost of living $\kappa > 0$. Therefore, the distributions of long-run behavior in these systems should provide insights into the emergence of cooperative or fair behaviors.

We assume agents are initialized with $\theta_k$ and $\lambda_k$ uniformly distributed in $[0, 1]$ and let $n \to \infty$. From Eq. (2), the expected per-round energy increase near $t = 0$ for an agent with parameters $(\theta, \lambda)$ is:

$$\Delta B(\theta, \lambda, \kappa) = \frac{1}{2} \int_0^{\theta} (1 - \theta)\, ds + \frac{1}{2} \int_{\lambda}^{1} s\, ds - \kappa = \frac{1}{2} (1 - \theta)\, \theta + \frac{1}{2} \left( \frac{1}{2} - \frac{\lambda^2}{2} \right) - \kappa. \tag{8}$$

Maximizing this expression subject to the constraints $0 \le \theta, \lambda \le 1$ suggests the optimal fairness demand is $\lambda^+ = 0$, while the best offer is $\theta^+ = \tfrac{1}{2}$. This is consistent with the classical Nash equilibrium ($\lambda^+ = 0$) but also consistent with fairness considerations ($\theta^+ = \tfrac{1}{2}$), since an agent can never be certain whether she will interact with an agent with high or low $\lambda$. If the players were perfectly rational, then a true Nash equilibrium would be $\theta_{NE} = \lambda_{NE} = 0$, since rational players realizing $\lambda^+ = \lambda_{NE} = 0$ would make $\theta_{NE} = 0$. Our empirical results show that this equilibrium does not result from imitation.

From Eq. (8), when $\kappa > \tfrac{3}{8}$, the expected increase for even an optimal player is negative. This will lead to a mean decrease in energy caches until imitation leads to higher success rates in UG. Let $\chi_{\Delta B}(\theta, \lambda, \kappa)$ be an indicator function that is 1 just in case Eq. (8) is positive. Numerical evaluation shows that when $\kappa^* \approx 0.26$,

$$\int_0^1 \int_0^1 \chi_{\Delta B}(\theta, \lambda, \kappa^*)\, d\lambda\, d\theta = \frac{1}{2}.$$

For $\kappa > \kappa^*$, the median energy cache value will decrease in early interactions before imitation can contract the strategy space. Individuals whose energy cache reaches zero are assumed dead and can no longer interact in the system. Reproduction or replacement of players is used to maintain a constant population, and the specific rule we use is described in the simulation details below.

We simulate a population with $N$ agents. Agents are initialized with an energy cache value $B_i$, and uniformly randomly assigned values $\theta_i$ and $\lambda_i$. Agents enter a game loop, where each agent plays UG with another randomly selected agent. Once all agents have played, energy caches are updated accordingly. In the agent-based simulation, we introduce a reproductive step into the mimicking process to account for agents with non-positive energy cache and to identify population collapse prior to convergence. If all agents have $B_i < 0$, the population has collapsed; otherwise, the agents with $B_i > 0$ enter the mimic/reproduce loop. If all agents have survived, agents return to the game loop.
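One pass of the game loop, up to the survival check, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: agents are matched into random disjoint pairs (with one agent sitting out when the population is odd), roles within a pair are assigned by a fair coin, $P = 1$, and every agent pays $\kappa$ once per round. The names `game_round` and `survivors` are hypothetical.

```python
import random


def game_round(B, theta, lam, kappa, rng):
    """One round of Ultimatum Games followed by the cost of living.

    Assumptions: random disjoint pairing, prize P = 1, and roles
    (offerer or decider) chosen with probability 1/2 each.
    """
    order = list(range(len(B)))
    rng.shuffle(order)
    new_B = list(B)
    # Pair consecutive agents in the shuffled order; with an odd
    # population the last agent does not play this round.
    for a, b in zip(order[0::2], order[1::2]):
        offerer, decider = (a, b) if rng.random() < 0.5 else (b, a)
        # The offer is accepted only when U(theta - lambda) = 1.
        if theta[offerer] - lam[decider] > 0:
            new_B[offerer] += 1.0 - theta[offerer]
            new_B[decider] += theta[offerer]
    # Every agent pays the cost of living, whether or not it played.
    return [x - kappa for x in new_B]


def survivors(B):
    """Indices of agents whose energy cache is still positive."""
    return [i for i, b in enumerate(B) if b > 0.0]


if __name__ == "__main__":
    rng = random.Random(0)
    B = game_round([1.0] * 10, [0.5] * 10, [0.1] * 10, 0.1, rng)
    print(sum(B) / len(B), survivors(B))
```

When every offer is accepted and the population is even, the mean energy gain per agent per round in this sketch is $\tfrac{1}{2} - \kappa$, in line with Eq. (3).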
Otherwise (that is, when some agents have died), agents are randomly chosen to reproduce with probability proportional to their energy cache; i.e., the fittest reproduce with higher probability. Reproduction continues until the population reaches $N$. If the population never collapses, the process is terminated after $T$ rounds. The size of $T$ is chosen to ensure convergence. To ensure numerical validity, the model was implemented both in Python and Mathematica, and results were compared to ensure statistical consistency.

Fig. 1 shows simulation results for $N = 150$ players and running time $T = 300$. All agent energy caches are initially set to 1. We used 100 realizations (replications). Distribution plots for $B$, $\theta$ and $\lambda$ are shown, with cost of living $\kappa$ ranging from 0.05 to 0.5. Density plots showing the joint converged $(\theta, \lambda)$ distributions are shown in Fig. 3. The convergence of $\theta_i(t)$ and $\lambda_i(t)$ is illustrated in Fig. 2 for 300 agents, $T = 500$ and $\kappa = 0.1$. To create this figure, 100 replications were constructed and $\theta_i(t)$ and $\lambda_i(t)$ were sorted at each round. These sorted lists were then averaged (over replications) to obtain $\bar{\theta}_{[i]}(t)$ and $\bar{\lambda}_{[i]}(t)$, where $\bar{\theta}_{[i]}(t)$ is the mean offer value of the agent with the $i$-th smallest offer value. The quantity $\bar{\lambda}_{[i]}(t)$ is defined analogously.

[Figure 1: three panels plotted against cost of living; legend entries: Simulation Mean, Empirical Estimator.]

Figure 1: Simulation results using 150 agents, 300 rounds of play, and 100 replications: (top) there is a well defined negative correlation between cost of living and offer; (middle) fairness demands are stable across cost of living values; (bottom) energy cache values follow an empirical trend derivable from the model.

[Figure 2: histograms of Offer ($\theta$) and Fairness Demand ($\lambda$) against Proportion at successive times.]

Figure 2: (top) Convergence of the distribution of $\theta_i$ from a uniform distribution to a delta distribution. (bottom) Convergence of the distribution of $\lambda_i$ from a uniform distribution to a delta distribution. (both) $\kappa = 0.1$, 300 agents are simulated. Times go from $t = 0$ to $t = 500$ in increments of $\Delta t = 100$.

[Figure 3: density plots over $(\theta, \lambda)$, one panel per value of $\kappa$.]

Figure 3: Density plots show the distributions of $(\theta^*, \lambda^*)$ over multiple replications with varying costs of living.

The simulation shows downward pressure on the offer value correlated with the energy cost of living $\kappa$, with consistent values of $\lambda^*$ between (approximately) […] and 0.4. As is expected, the value of $\hat{B}_i(t)$, the mean energy store value, decreases as a function of cost of living. We derive an empirical linear approximation for the mean, which we discuss in the sequel. Understanding the origin of this relationship is complicated by the fact that there is no convenient closed form expression for $\lambda_i(t)$ or $\theta_i(t)$. To remedy this, we use a combination of empirical distribution modeling and closed form analysis of $\Delta \hat{B}_i$ to explain the observed behavior.

At arbitrary time $t$, when the distributions of fairness demands and offers are given by probability density functions $f^t_\lambda(s)$ and $f^t_\theta(s)$ respectively, Eq. (8) can be generalized as:

$$\Delta \hat{B}(t; \theta, \lambda, \kappa) = \frac{1}{2} \left( \int_0^{\theta} (1 - \theta)\, f^t_\lambda(s)\, ds + \int_{\lambda}^{1} s\, f^t_\theta(s)\, ds \right) - \kappa. \tag{9}$$

This expression cannot be computed without the time-varying distributions in question, which cannot be computed without an appropriate Fokker-Planck equation, which is difficult to construct. To compensate, we can fit distributions to the data $\bar{\lambda}_{[i]}(t)$ and $\bar{\theta}_{[i]}(t)$ to obtain estimators $\hat{f}^t_\lambda(s)$ and $\hat{f}^t_\theta(s)$, which can be used in Eq. (9). These empirically estimated distributions stand in for the mean-field distributions. Note: All distributions were estimated using Mathematica's FindDistribution function. We can then compute:

$$\hat{B}(t; \theta, \lambda, \kappa) = \begin{cases} B_0 & \text{if } t = 0 \\ \sum_{s \le t} \Delta \hat{B}(t; \theta, \lambda, \kappa)\, U[\hat{B}(t - 1; \theta, \lambda, \kappa)] & \text{otherwise,} \end{cases} \tag{10}$$

where the factor $U[\hat{B}(t - 1; \theta, \lambda, \kappa)]$ sets $\hat{B}(t; \theta, \lambda, \kappa) = 0$ if $\hat{B}(t - 1; \theta, \lambda, \kappa) = 0$. That is, it models the death of a test agent with parameters $(\theta, \lambda)$. The imitation dynamics defined by Eqs. (4), (5) and (7) imply that the larger $\hat{B}(t; \theta, \lambda, \kappa)$, the more likely an agent with parameters $(\theta, \lambda)$ will be imitated. Thus, we can use $\hat{B}(t; \theta, \lambda, \kappa)$ at an appropriately large time (e.g., $t = 100$) to estimate which agents are most likely to be imitated for a given $\kappa$. We show this estimation for $\kappa =$ […] and $\kappa = 0.4$ in Fig. 4.

[Figure 4: two panels with axes Offer ($\theta$) and Fairness Demand ($\lambda$).]

Figure 4: (top) […] (bottom) […]

We use the top 5% of computed values of $\hat{B}(t, \theta, \lambda, \kappa)$ to compute estimated intervals on the values of $\theta^*$ and $\lambda^*$ for $\kappa =$ […] and $\kappa = 0.4$. We compare these intervals with the 5%–95% intervals computed from the experimental results shown in Fig. 1. This is shown in Table 1. These results are both consistent with and predictive of the distributions seen in Fig. 1; i.e., they explain both the downward slope of $\theta$ as a function of $\kappa$ in Fig. 1 (top) and the relatively constant behavior of $\lambda$ as a function of $\kappa$. We stress that estimations in Fig. 4 and Table 1 are generated by a model (Eq. (9) and Eq. (10)) with distribution constants determined empirically. Thus an area of future work is to replace these empirically determined distributions with modeled distributions.

Offer Estimate
κ       Model Est. Interval    Computed Interval
[…]     […, 0.64]              [0.42, 0.54]
0.4     […, 0.41]              [0.28, 0.41]

Fairness Demand Estimate
κ       Model Est. Interval    Computed Interval
[…]     […, 0.33]              [0.15, 0.31]
0.4     […, 0.26]              [0.13, 0.28]

Table 1: Comparison of estimated and computed intervals on $\theta^*$ and $\lambda^*$ using information from Eq. (10).

The dynamics of the energy cache values can be modeled asymptotically. As $t \to \infty$, $f^t_\lambda(s) \approx \delta(s - \lambda^*)$ and $f^t_\theta(s) \approx \delta(s - \theta^*)$, where $(\theta^*, \lambda^*)$ is the fixed point of $(\theta(t), \lambda(t))$. This is illustrated in Fig. 2. As $t \to \infty$, the energy cache of each agent asymptotically approaches:

$$B_i(t) = \left( \tfrac{1}{2} - \kappa \right) t.$$

This model is shown in Fig. 1 (bottom). It over-estimates the long-run energy cache value because of the initial time taken to converge. We can approximate the trend seen in Fig. 1 (bottom) by noting that the time for $\lambda_i(t)$ and $\theta_i(t)$ to converge, so that most UG interactions are successful, is approximately 80 rounds (out of the 300 rounds simulated). Assuming that prior to convergence only half of all interactions result in a successful UG, we obtain a thermodynamic-type relationship between the mean wealth of the population and the cost of living:

$$\tilde{B}(\kappa) = 260 \left( \tfrac{1}{2} - \kappa \right), \tag{11}$$

which explains the linear decrease with $\kappa$ shown in Fig. 1 (bottom), where we show the fit of Eq. (11). The factor 260 is consistent with counting the 220 post-convergence rounds in full and the 80 pre-convergence rounds at half weight, $220 + 80/2 = 260$.

Figure 5 shows a log-plot of mean deaths per capita per simulation with a notional cubic fit. As expected, there is a non-linear jump for $\kappa =$ […]. In addition, we note that the zero-crossing (corresponding to one death per round) occurs roughly at $\kappa = \kappa^*$, representing a transition in the population to a more rapidly increasing per capita death rate with cost of living $\kappa$.
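The threshold $\kappa^*$ referred to here is defined through Eq. (8) as the cost of living at which exactly half of the initial strategy square has positive expected energy change. The Python sketch below checks this numerically, assuming Eq. (8) has the form $\Delta B = \tfrac{1}{2}\theta(1 - \theta) + \tfrac{1}{4}(1 - \lambda^2) - \kappa$ with $\theta$ and $\lambda$ uniform on $[0, 1]$. The grid resolution, the bisection tolerance, and the function names are illustrative choices, not details from the paper.

```python
def delta_B(theta, lam, kappa):
    """Early-time expected per-round energy change, assumed form of Eq. (8)."""
    return 0.5 * theta * (1.0 - theta) + 0.25 * (1.0 - lam ** 2) - kappa


def positive_fraction(kappa, m=200):
    """Fraction of an m-by-m midpoint grid on [0,1]^2 where delta_B > 0."""
    pts = [(k + 0.5) / m for k in range(m)]
    hits = sum(1 for th in pts for la in pts if delta_B(th, la, kappa) > 0.0)
    return hits / (m * m)


def median_threshold(tol=1e-4):
    """Bisect on kappa until half of the strategy square has positive drift."""
    # At kappa = 0 every grid point is positive; at kappa = 3/8 (the
    # maximum of the kappa-free part of Eq. (8)) none is.
    lo, hi = 0.0, 0.375
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if positive_fraction(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("max of Eq. (8) at kappa = 0:", delta_B(0.5, 0.0, 0.0))
    print("estimated kappa*:", round(median_threshold(), 3))
```

Under the same assumed form, the positive region is a quarter ellipse, which gives the closed form $\kappa^* = (3 - 2\sqrt{2}/\pi)/8 \approx 0.262$ as a cross-check on the bisection.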
The transition at $\kappa^*$ also correlates with more than 50% of the agents having initially decreasing energy caches, thus increasing the per capita death rate.

[Figure 5: horizontal axis, cost of living ($\kappa$); vertical axis, Log[Deaths Per Capita Per Replication].]

Figure 5: Log-Log plot of per capita death rate in the simulation as a function of energy cost of living.

The global dynamics displayed in Fig. 1 are robust to changes in the speed of the underlying dynamics. In particular, we tested models in which (i) we replaced the discrete dynamics with continuous time differential equations (by letting $\epsilon \to 0$) […] $B_k < 0$. All models used 100 replications except when reproduction was eliminated, in which case 200 replications were used to ensure statistically significant sample sizes. (Samples with population collapse were discarded.)

Results from robustness experiments are shown in Fig. 6 (top), where we show the mean values $\bar{\theta}^*$. The envelopes are $1\sigma$. Similar tests were run for $\bar{\lambda}^*$; see Fig. 6 (middle). For all cases, $\bar{\theta}$ is decreasing in $\kappa$. The mean fit line has negative slope as a function of $\kappa$ ($p = 5.$[…]$\times 10^{-[\ldots]}$) and adjusted $r^2$ of 0.95, consistent with prior results and theoretical analysis. There is a difference in the behavior of $\bar{\lambda}^*$ for the discrete time simulations and the continuous time (hybrid) variations. In the case of the hybrid ODE models (with or without Euler step approximations), $\bar{\lambda}^*$ increases as a function of $\kappa$ ($p <$ […]). […] $\bar{\lambda}^*$ decreases as a function of $\kappa$ ($p <$ […]). […] $\bar{\lambda}^*$ increases as a function of $\kappa$ but with $p <$ […].

[Figure 6 panels: (a) Offer ($\theta$) versus $\kappa$; (b) Fairness Demand ($\lambda$) versus $\kappa$. Legend entries: Hybrid ODE Model, Euler Model, Fast Time Scale Euler Model, Simulation, Simulation - No Reproduction, Linear Fit of Means, Mean Fit (Slow Time Scale), Mean Fit (Fast Time Scale), Combined 1σ Confidence Interval.]

Figure 6: (top) Model variations show the same trend in offer as a function of cost of living. (middle) Model variations show differing trend in fairness demand depending on imitation speed. (bottom) Scaling GDP as a proxy for cost of living in real-world data shows good correlation with the proposed model (number in parentheses denotes the number of countries averaged in each data point [16]).

The proposed model provides behavior consistent with observations in the meta-study by Oosterbeek et al. [16] insofar as a diversity of offer proportions and rejection rates are shown to be possible as a result of random interaction and imitation of multiple agents. Over all simulations, the grand mean is $\langle \theta \rangle = 0.$[…] $\pm$ […], to be compared with the […].41% offer rate observed in [16]. The grand mean is $\langle \lambda \rangle = 0.$[…] $\pm$ […], and the values of $\lambda^*$ range from approximately […] to 0.4, adequately modeling the large variation in rejection rates. Despite these similarities, we cannot fully validate the model empirically because neither [16] nor its constituent studies include a variable like cost of living. Their study does regress against GDP and reward as a percent of per capita GDP (which spans 2 orders of magnitude), but this is not an accurate measurement of intrinsic cost of living, especially in geographically diverse areas like the United States. We note that in [16], regression of offer against GDP shows an insignificant negative correlation, which is consistent with Fig. 1 and Fig. 6 (top), but these studies were not designed to measure this relationship. The results of this study suggest potential experimental analysis that could be done in controlled laboratory settings.

Game Theory finds application in biological and social sciences, yet well-known occurrences like cooperation and altruism remain challenging within its rational self-interest assumptions. Our paper presents a novel approach to the canonical Ultimatum Game (UG), introducing an additional savings variable (energy) along with a cost of living. In our nonlinear agent-based model, energy represents success and drives imitation. Agents evolve toward fair sharing, but are more selfish with higher costs of living, with consistently lower fairness demands of others. This behavior is explained and predicted using a model with empirically determined distribution parameters. The model reproduces some empirical data of human UG performance across cultures, providing a new theoretical framework for heterogeneous cooperation among humans. In future work, we will explore these dynamics further to determine whether the exact structure of the distributions on $\theta$ and $\lambda$ can be determined. This would remove the need to fit the distributions as a part of the modeling process and provide a complete mean-field dynamics for this system.

Acknowledgement

CG and AB were supported in part by the National Science Foundation under grant DMS-1814876. The authors would like to thank S. Rajtmajer for her feedback on earlier drafts. CG thanks R. Bailey (USN), who ran the very first simulation of this phenomenon in MATLAB while at the United States Naval Academy.

References

[1] Robert Axelrod. The emergence of cooperation among egoists. American Political Science Review, 75(2):306–318, 1981.
[2] Damien Challet and Y-C Zhang. Emergence of cooperation and organization in an evolutionary game. Physica A: Statistical Mechanics and its Applications, 246(3-4):407–418, 1997.
[3] Sanjay Jain and Sandeep Krishna. A model for the emergence of cooperation, interdependence, and structure in evolving networks. Proceedings of the National Academy of Sciences, 98(2):543–547, 2001.
[4] Martin A Nowak, Akira Sasaki, Christine Taylor, and Drew Fudenberg. Emergence of cooperation and evolutionary stability in finite populations. Nature, 428(6983):646–650, 2004.
[5] Francisco C Santos and Jorge M Pacheco. Scale-free networks provide a unifying framework for the emergence of cooperation. Physical Review Letters, 95(9):098104, 2005.
[6] Daniel J Hruschka and Joseph Henrich. Friendship, cliquishness, and the emergence of cooperation. Journal of theoretical biology, 239(1):1–15, 2006.
[7] Francisco C Santos, Marta D Santos, and Jorge M Pacheco. Social diversity promotes the emergence of cooperation in public goods games. Nature, 454(7201):213–216, 2008.
[8] Shun Kurokawa and Yasuo Ihara. Emergence of cooperation in public goods games. Proceedings of the Royal Society B: Biological Sciences, 276(1660):1379–1384, 2009.
[9] Sven Van Segbroeck, Jorge M Pacheco, Tom Lenaerts, and Francisco C Santos. Emergence of fairness in repeated group interactions. Physical Review Letters, 108(15):158104, 2012.
[10] Elisabeth Paulson and Christopher Griffin. Cooperation can emerge in prisoner's dilemma from a multi-species predator prey replicator dynamic. Mathematical biosciences, 278:56–62, 2016.
[11] Pengbi Cui, Zhi-Xi Wu, Tao Zhou, Xiaojie Chen, et al. Cooperator-driven and defector-driven punishments: How do they influence cooperation? Physical Review E, 100(5):052304, 2019.
[12] Shiping Gao, Jinming Du, and Jinling Liang. Evolution of cooperation under punishment. Physical Review E, 101(6):062419, 2020.
[13] John Gale, Kenneth G. Binmore, and Larry Samuelson. Learning to be imperfect: The ultimatum game. Games and Economic Behavior, 8(1):56–90, 1995.
[14] Gary Bornstein and Ilan Yaniv. Individual and group behavior in the ultimatum game: Are groups more "rational" players? Experimental Economics, 1(1):101–108, Jun 1998.
[15] Alan G. Sanfey, James K. Rilling, Jessica A. Aronson, Leigh E. Nystrom, and Jonathan D. Cohen. The neural basis of economic decision-making in the ultimatum game. Science, 300(5626):1755–1758, 2003.
[16] H. Oosterbeek, R. Sloof, and G. Van De Kuilen. Cultural differences in ultimatum game experiments: Evidence from a meta-analysis. Experimental Economics, 7:171–188, 2004.
[17] Kevin J. Haley and Daniel M.T. Fessler. Nobody's watching?: Subtle cues affect generosity in an anonymous economic game. Evolution and Human Behavior, 26(3):245–256, 2005.
[18] Joseph Henrich, Richard McElreath, Abigail Barr, Jean Ensminger, Clark Barrett, Alexander Bolyanatz, Juan Camilo Cardenas, Michael Gurven, Edwins Gwako, Natalie Henrich, Carolyn Lesorogol, Frank Marlowe, David Tracer, and John Ziker. Costly punishment across human societies. Science, 312(5781):1767–1770, 2006.
[19] Keith Jensen, Josep Call, and Michael Tomasello. Chimpanzees are rational maximizers in an ultimatum game. Science, 318(5847):107–109, 2007.
[20] Toshio Yamagishi, Yutaka Horita, Haruto Takagishi, Mizuho Shinada, Shigehito Tanida, and Karen S. Cook. The private rejection of unfair offers and emotional commitment. Proceedings of the National Academy of Sciences, 106(28):11520–11523, 2009.
[21] Terence Burnham. Gender, Punishment, and Cooperation: Men Hurt Others to Advance Their Interests. Available at SSRN: https://papers.ssrn.com/soL3/papers.cfm?abstract_id=2966376, 2017.
[22] Björn Wallace, David Cesarini, Paul Lichtenstein, and Magnus Johannesson. Heritability of ultimatum game responder behavior. Proceedings of the National Academy of Sciences, 104(40):15631–15634, 2007.
[23] David Cesarini, Christopher T. Dawes, James H. Fowler, Magnus Johannesson, Paul Lichtenstein, and Björn Wallace. Heritability of cooperative behavior in the trust game. Proceedings of the National Academy of Sciences, 105(10):3721–3726, 2008.
[24] Katherine A. Cronin, Daniel J. Acheson, Penélope Hernández, and Angel Sánchez. Hierarchy is detrimental for human cooperation. Scientific Reports, 5:18634, 2015.
[25] Martin A. Nowak, Karen M. Page, and Karl Sigmund. Fairness versus reason in the ultimatum game. Science, 289(5485):1773–1775, 2000.
[26] David G. Rand, Corina E. Tarnita, Hisashi Ohtsuki, and Martin A. Nowak. Evolution of fairness in the one-shot anonymous ultimatum game. Proceedings of the National Academy of Sciences, 110(7):2581–2586, 2013.
[27] Karen M Page, Martin A Nowak, and Karl Sigmund. The spatial ultimatum game. Proceedings of the Royal Society of London B: Biological Sciences, 267(1458):2177–2182, 2000.
[28] Karen M. Page and Martin A. Nowak. A generalized adaptive dynamics framework can describe the evolutionary ultimatum game. Journal of Theoretical Biology, 209(2):173–179, 2001.
[29] M. N. Kuperman and S. Risau-Gusman. The effect of the topology on the spatial ultimatum game. The European Physical Journal B, 62(2):233–238, Mar 2008.
[30] Jaime Iranzo, Javier Román, and Angel Sánchez. The spatial ultimatum game revisited. Journal of Theoretical Biology, 278(1):1–10, 2011.
[31] Q. Zhu, S. Rajtmajer, and A. Belmonte. The emergence of fairness in an agent-based ultimatum game. Working paper, 2016.
[32] Sarah Rajtmajer, Anna Squicciarini, Jose M Such, Justin Semonsen, and Andrew Belmonte. An ultimatum game model for the evolution of privacy in jointly managed content. In International Conference on Decision and Game Theory for Security, pages 112–130. Springer, 2017.
[33] Bruce M Boghosian. Kinetics of wealth and the pareto law. Physical Review E, 89(4):042804, 2014.
[34] Nicola McGuigan. The influence of model status on the tendency of young children to over-imitate. Journal of experimental child psychology, 116(4):962–969, 2013.
[35] Harriet Over and Malinda Carpenter. Children infer affiliative and status relations from watching others imitate. Developmental science, 18(6):917–925, 2015.
[36] Jelena Grujić and Tom Lenaerts. Do people imitate when making decisions? Evidence from a spatial prisoner's dilemma experiment. Royal Society Open Science, 7(7):200618, 2020.
[37] Christopher Griffin, Sarah Rajtmajer, Anna Squicciarini, and Andrew Belmonte. Consensus and information cascades in game-theoretic imitation dynamics with static and dynamic network topologies. SIAM Journal on Applied Dynamical Systems, 18(2):597–628, 2019.
[38] S. Motsch and E. Tadmor. Heterophilious Dynamics Enhances Consensus.