Evolution of Coordination in Pairwise and Multi-player Interactions via Prior Commitments

Abstract

Upon starting a collective endeavour, it is important to understand your partners' preferences and how strongly they commit to a common goal. Establishing a prior commitment or agreement in terms of posterior benefits and consequences from those engaging in it provides an important mechanism for securing cooperation. Resorting to methods from Evolutionary Game Theory (EGT), here we analyse how prior commitments can also be adopted as a tool for enhancing coordination when its outcomes exhibit an asymmetric payoff structure, in both pairwise and multiparty interactions. Arguably, coordination is more complex to achieve than cooperation since there might be several desirable collective outcomes in a coordination problem (compared to mutual cooperation, the only desirable collective outcome in cooperation dilemmas). Our analysis, both analytically and via numerical simulations, shows that whether prior commitment would be a viable evolutionary mechanism for enhancing coordination and the overall population social welfare strongly depends on the collective benefit and severity of competition, and more importantly, how asymmetric benefits are resolved in a commitment deal. Moreover, in multiparty interactions, prior commitments prove to be crucial when a high level of group diversity is required for optimal coordination. The results are robust for different selection intensities. Overall, our analysis provides new insights into the complexity and beauty of behavioral evolution driven by humans' capacity for commitment, as well as for the design of self-organised and distributed multi-agent systems for ensuring coordination among autonomous agents.


Evolution of Coordination in Pairwise and Multi-player Interactions via Prior Commitments

Ogbo Ndidi Bianca, Aiman Elgarig, The Anh Han*

School of Computing, Engineering and Digital Technologies, Teesside University

Corresponding author: The Anh Han ([email protected])

Abstract

Upon starting a collective endeavour, it is important to understand your partners' preferences and how strongly they commit to a common goal. Establishing a prior commitment or agreement in terms of posterior benefits and consequences from those engaging in it provides an important mechanism for securing cooperation in both pairwise and multiparty cooperation dilemmas. Resorting to methods from Evolutionary Game Theory (EGT), here we analyse how prior commitments can also be adopted as a tool for enhancing coordination when its outcomes exhibit an asymmetric payoff structure, in both pairwise and multiparty interactions. Arguably, coordination is more complex to achieve than cooperation since there might be several desirable collective outcomes in a coordination problem (compared to mutual cooperation, the only desirable collective outcome in cooperation dilemmas), especially when these outcomes entail asymmetric benefits for those involved. Our analysis, both analytically and via numerical simulations, shows that whether prior commitment would be a viable evolutionary mechanism for enhancing coordination and the overall population social welfare strongly depends on the collective benefit and severity of competition, and more importantly, how asymmetric benefits are resolved in a commitment deal. Moreover, in multiparty interactions, prior commitments prove to be crucial when a high level of group diversity is required for optimal coordination. Our results are shown to be robust for different selection intensities. We frame our model within the context of the technology adoption decision making, but the obtained results are applicable to other coordination problems.

Keywords:

Commitment, Evolutionary Game Theory, Coordination, Technology Adoption.

Preprint: arXiv [cs.MA], Sep.

Achieving a collective endeavour among individuals with their own personal interests is an important social and economic challenge in various societies (Hardin, 1968; Ostrom, 1990; Pitt et al., 2012; Barrett, 2016; Sigmund, 2010). From coordinating individuals in the workplace to maintaining cooperative and trust-based relationships among organisations and nations, its success is often jeopardised by individual self-interest (Barrett et al., 2007; Perc et al., 2017). The study of mechanisms that support the evolution of such collective behaviours has been of great interest in many disciplines, including Evolutionary Biology, Economics, Physics and Computer Science (Nowak, 2006; Sigmund, 2010; West et al., 2007; Han, 2013; Perc et al., 2017; Andras et al., 2018; Kumar et al., 2020). Several mechanisms responsible for the emergence and stability of collective behaviours among such individuals have been proposed, including kin and group selection, direct and indirect reciprocity, spatial networks, and reward and punishment (Nowak, 2006; West et al., 2007; Perc et al., 2017; Okada, 2020; Skyrms, 1996).

Recently, establishing prior commitments has been proposed as an evolutionarily viable strategy inducing cooperative behaviour in the context of pairwise and multi-player cooperation dilemmas (Nesse, 2001; Frank, 1988; Han et al., 2017, 2015a; Sasaki et al., 2015; Arvanitis et al., 2019; Ohtsuki, 2018); namely, the Prisoner's Dilemma (PD) (Han et al., 2013; Hasan and Raja, 2013) and the Public Goods Game (PGG) (Han et al., 2015a, 2017; Kurzban et al., 2001).
It provides an enhancement to different forms of punishment against inappropriate behaviours and of rewards to stimulate the appropriate ones (Chen et al., 2014; Martinez-Vaquero et al., 2015, 2017; Sasaki et al., 2015; Powers et al., 2012; Szolnoki and Perc, 2012; Wang et al., 2019), allowing one to efficiently avoid free-riders (Han and Lenaerts, 2016; Han et al., 2015b) and to resolve the antisocial punishment problem (Han, 2016). These works have primarily focused on modelling prior commitments for improving mutual cooperation among self-interested agents. In the context of cooperation dilemma games (i.e. PD and PGG), mutual cooperation is the only desirable collective outcome to which all parties are required to commit if an agreement is to be formed. In other contexts such as coordination problems, this is no longer the case, since there might be multiple optimal or desirable collective outcomes and players might have distinct, incompatible preferences regarding which outcome a mutual agreement should aim to achieve (e.g. due to asymmetric benefits). Such coordination problems are abundant in nature, ranging from collective hunting and foraging to international climate change actions and multi-sector coordination (Santos and Pacheco, 2011; Ostrom, 1990; Barrett, 2016; Ohtsuki, 2018; Bianca and Han, 2019; Skyrms, 1996; Santos et al., 2016).

Hence, we explore how arranging a prior agreement or commitment can be used as a mechanism for enhancing coordination and the population social welfare in this type of coordination problem, in both pairwise and multi-player interaction settings. Before individuals embark on a joint venture, a pre-agreement makes the motives and intentions of all parties involved more transparent, thereby enabling an easier coordination of personal interests (Nesse, 2001; Cohen and Levesque, 1990; Han, 2013; Han et al., 2015b).
Although our approach is applicable to a wide range of coordination problems (e.g. single market product investments as described above), we will frame our models within the technology investment strategic decision making problem, allowing us to describe the models clearly. Namely, we describe technology adoption games capturing the competitive market and decision-making process among firms adopting new technologies (Zhu and Weyant, 2003; Bardhan et al., 2004), with a key parameter $\alpha$ representing how competitive the market is (thus describing how important coordination is). Similar to previous commitment models, we will perform theoretical analysis and numerical simulations resorting to stochastic methods from Evolutionary Game Theory (EGT) (Hofbauer and Sigmund, 1998; Sigmund et al., 2010).

We will start by modelling pairwise technology adoption decision making, where two investment firms (or players) competing within the same product market need to make a strategic decision on which technology to adopt (Zhu and Weyant, 2003; Chevalier-Roignant et al., 2011), a low-benefit (L) or a high-benefit (H) technology. Individually, adopting H would lead to a larger benefit. However, if both firms invest in H they would end up competing with each other, leading to a smaller accumulated benefit than if they could coordinate with each other to choose different technologies. Yet, given the asymmetry in the benefits in such an outcome, clearly no firm would want to commit to the outcome where its option is L, unless some form of compensation from the one selecting H can be ensured.

We then extend and generalize the pairwise model to a multi-player one, capturing the strategic interaction between more than two investment firms. In the multi-player model, a key parameter $\mu$ is ascribed to the market demand of high technology, i.e. what is the optimal fraction of the firms in a group to adopt H.
We analytically examine how players can be coordinated when there is a market demand for a particular technology. We show that, differently from the two-player game, the newly defined parameter $\mu$ leads to a new kind of complexity when trying to achieve group coordination. When there is a high level of diversity in demand (i.e. intermediate values of $\mu$), as can be seen in different technology adoption contexts (Beede and Young, 1998; Schewe and Stuart, 2015), introducing prior commitment can lead to significant improvement in the levels of coordination and population social welfare.

The next section discusses related work, which is followed by a description of our models and details of the EGT methods for analysing them. Results of the analysis and a final discussion will then follow.

The problem of explaining the emergence and stability of collective behaviours has been actively addressed in different disciplines (Nowak, 2006; Sigmund, 2010). Among other mechanisms, such as reciprocity and costly punishment, closely related to our present model is the study of cooperative behaviours and pre-commitment in cooperation dilemmas, for both two-player and multiplayer games (Han et al., 2013, 2017; Sasaki et al., 2015; Hasan and Raja, 2013). It has been shown, both by means of theoretical analysis and of behavioural experiments, that to enhance cooperation commitments need to be sufficiently enforced and the cost of setting up the commitments must be justified with respect to the benefit derived from the interactions (Ostrom, 1990; Cherry and McEvoy, 2013; Kurzban et al., 2001; Chen and Komorita, 1994; Arvanitis et al., 2019). Our results show that this same observation holds for coordination problems.
However, arranging commitments for enhancing coordination is more complex, exhibiting a larger behavioural space, and furthermore, their outcomes strongly depend on new factors only appearing in coordination problems; namely, a successful commitment deal needs to take into account the fact that multiple desirable collective outcomes exist for which players have incompatible preferences; and thus how benefits can be shared through compensations in order to resolve the issue of asymmetric benefits is crucially important (Bianca and Han, 2019).

We move further by expanding the two-player game of our previous work to a multi-player model, whose outcome is shown to be more complex as there are more players involved. We again investigate how coordination and cooperation can be improved using a prior commitment deal when there are multiple players involved and also when there is a particular market demand (Bianca and Han, 2019). How implementing prior commitments enhances cooperation in dilemmas has also been investigated by previous researchers (Chen and Komorita, 1994). A good level of cooperation was seen in a Public Goods Game experiment when there was a binding agreement made during the prior communication stage among members of the group. They hypothesized that if members of a group are allowed to make a pledge (a degree of binding/commitment) before their actual decisions, they will be able to communicate their intentions and this will increase the overall cooperation rate in the population. As predicted, their results clearly demonstrate that making a pledge improves cooperation, although the degree of commitment required in the pledge differentially affected the cooperation rate (Chen and Komorita, 1994; Cherry and McEvoy, 2013; Kurzban et al., 2001).

There have been several other works studying the evolution of coordination, using the so-called Stag Hunt game, see e.g. (Skyrms, 2003; Pacheco et al., 2009; Santos et al., 2006; Sigmund, 2010).
However, to the best of our knowledge there has been no work studying how prior commitments can be modelled and used for enhancing the outcome of the evolution of coordination. As our results below show, significant enhancement of coordination and population welfare can be achieved via the arrangement of suitable commitment deals.

Furthermore, it is noteworthy that commitments have been studied extensively in the Artificial Intelligence and Multi-agent Systems literature, see e.g. (Castelfranchi and Falcone, 2010; Chopra and Singh, 2009; Rzadca et al., 2015). Differently from our work, these studies utilise commitments for the purpose of regulating individual and collective behaviours, formalising different aspects of commitments (such as norms and conventions) in multi-agent systems. However, our results and approach provide important new insights into the design of such systems, as these require commitments to ensure high levels of efficient collaboration and coordination within a group or team of agents. For example, by providing suitable agreement deals agents can improve the chance that a desirable collective outcome (which is best for the system as a whole) is reached even when benefits provided by the outcome are different for the parties involved.

In the following we first describe a two-player technology adoption game, then extend it with the option of arranging prior commitments before playing the game. We then present a multi-player version of the model, with and without commitments. Finally, we describe the methods, based on Evolutionary Game Theory for finite populations, which will be used to analyse the resulting models.

We consider the scenario in which two firms (players) compete for the same product market and need to make a (strategic) decision on which technology to invest in, a low-benefit (L) or a high-benefit (H) technology. The outcome of the interaction can be described in terms of costs and benefits of investments by the following payoff matrix (for the row player):

$$
\begin{array}{c|cc}
 & H & L \\ \hline
H & \alpha b_H - c_H & b_H - c_H \\
L & b_L - c_L & \alpha b_L - c_L
\end{array}
\;=\;
\begin{array}{c|cc}
 & H & L \\ \hline
H & a & b \\
L & c & d
\end{array},
\tag{1}
$$

where $c_L$, $c_H$ and $b_L$, $b_H$ ($b_L \le b_H$) represent the costs and benefits of investing in L and H, respectively; $\alpha \in (0, 1)$ indicates the competitive level of the product market: the firms receive only a partial benefit if they both choose to invest in the same technology. Collectively, the smaller $\alpha$ is (i.e. the higher the market competitiveness), the more important it is that the firms coordinate to choose different technologies. For simplicity, the entries of the payoff matrix are denoted by $a, b, c, d$, as above. We have $b > a$ and $c > d$. Without loss of generality, we assume that H would generate a greater net benefit, i.e. $c = b_L - c_L < b_H - c_H = b$.

Note that although we describe our model in terms of technology adoption decision making, it is generally applicable to many other coordination problems, for instance wherever there are strategic investment decisions to make (in competitive markets of any products) (Zhu and Weyant, 2003; Chevalier-Roignant et al., 2011).

We now extend the model, allowing players to have the option to arrange a prior commitment before a TD interaction. A commitment proposal asks the co-player to adopt a different technology. That is, a strategist intending to use H (resp., L) would ask the co-player to adopt L (resp., H). We denote these commitment proposing strategies as HP and LP, respectively. Similarly to previous models of commitments (for PD and PGG) (Han et al., 2013, 2015a), to make the commitment deal reliable, a proposer pays an arrangement cost $\epsilon$. If the co-player agrees with the deal, then the proposer assumes that the opponent will adopt the agreed choice, yet there is no guarantee that this will actually be the case. Whenever a co-player refuses to commit, HP and LP would play H in the game.
When the co-player accepts the commitment but later does not honour it, she has to compensate the honouring co-player at a personal cost $\delta$.

Differently from previous models on PD and PGG, where an agreed outcome leads to the same payoff for all parties in the agreement (the mutual cooperation benefit), in the current model such an outcome would lead to different payoffs for those involved. Therefore, as part of the agreement, HP would, after the game, compensate an amount $\theta_1$ to an accepting player who honours the agreement, while LP would request a compensation $\theta_2$ from such an accepting co-player.

Besides HP and LP, we consider a minimal model with the following (basic) strategies in this commitment version:

• Non-proposing acceptors, HC and LC, who always commit when being proposed a commitment deal, wherein they are willing to adopt any technology proposed (even when it is different from their intended choice), honour the adopted agreement, but do not propose a commitment themselves. They play their intended choice, i.e. H and L, respectively, when there is no agreement in place;

• Non-acceptors, HN and LN, who do not accept commitments, play their intended choice during the game, and do not propose commitments;

• Fake committers, HF and LF, who accept a commitment proposal yet play the choice opposite to what has been agreed whenever the game takes place. These players assume that they can exploit the commitment proposing players without suffering the consequences.

Note that, similar to the commitment models for the PD game (Han et al., 2013), some possible strategies have been excluded from the analysis since they are dominated by at least one of the strategies in any configuration of the game: they can be omitted without changing the outcome of the analysis. For example, those who propose a commitment (i.e.
paying a cost $\epsilon$) but then do not honour it (and thus have to pay the compensation when facing an honouring acceptor) would be dominated by the corresponding non-proposers.

Together the model consists of eight strategies that define the following payoff matrix, capturing the average payoffs that each strategy will receive upon interaction with one of the other seven strategies (where we denote $\lambda_1 = \theta_1 + \theta_2$, $\lambda_2 = b - \epsilon - \theta_1$, $\lambda_3 = c - \epsilon + \theta_2$, $\lambda_4 = a - \epsilon + \delta$ and $\lambda_5 = d - \epsilon + \delta$, just for the sake of clear representation):

$$
\begin{array}{c|cccccccc}
 & HP & LP & HN & LN & HC & LC & HF & LF \\ \hline
HP & \frac{b + c - \epsilon}{2} & b - \frac{\epsilon + \lambda_1}{2} & a & b & \lambda_2 & \lambda_2 & \lambda_4 & \lambda_4 \\
LP & c - \frac{\epsilon - \lambda_1}{2} & \frac{b + c - \epsilon}{2} & a & b & \lambda_3 & \lambda_3 & \lambda_5 & \lambda_5 \\
HN & a & a & a & b & a & b & a & b \\
LN & c & c & c & d & c & d & c & d \\
HC & c + \theta_1 & b - \theta_2 & a & b & a & b & a & b \\
LC & c + \theta_1 & b - \theta_2 & c & d & c & d & c & d \\
HF & a - \delta & d - \delta & a & b & a & b & a & b \\
LF & a - \delta & d - \delta & c & d & c & d & c & d
\end{array}
\tag{2}
$$

Note that when two commitment proposers interact, only one of them will need to pay the cost of setting up the commitment. Yet, as either one of them can take this action, they pay this cost only half of the time (on average). In addition, the average payoff of HP when interacting with LP is given by $\frac{1}{2}(b - \epsilon - \theta_1 + b - \theta_2) = \frac{1}{2}(2b - \epsilon - \theta_1 - \theta_2)$. When two HP players interact, each receives $\frac{1}{2}(b - \epsilon - \theta_1 + c + \theta_1) = \frac{1}{2}(b + c - \epsilon)$.

Note on fake strategies: compared to cooperation dilemmas such as PD and PGG, fake strategies make less sense in the context of coordination games, since they would not earn the temptation payoff by adopting a different choice from what was agreed. Moreover, in the presence of an agreement, players obtain an additional compensation when adopting the disadvantageous choice (i.e. L). We keep the fake strategies in the analysis of pairwise games to confirm these intuitions but exclude them from multi-player settings for simplicity, without being detrimental to the results.
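To see how the entries of the eight-strategy payoff matrix fit together, the matrix can be assembled programmatically. The following is a minimal Python sketch, not the authors' code: the function names `fair_thetas` and `commitment_payoff_matrix` and all numerical values are assumptions made for illustration, and `fair_thetas` uses the fair-agreement compensations that the text introduces immediately after this point.

```python
STRATEGIES = ["HP", "LP", "HN", "LN", "HC", "LC", "HF", "LF"]


def fair_thetas(b, c, eps):
    """Compensations that equalise both parties' payoffs in an honoured deal."""
    return (b - c - eps) / 2, (b - c + eps) / 2


def commitment_payoff_matrix(a, b, c, d, eps, delta, theta1, theta2):
    """Row-player payoffs for the eight strategies, ordered as in STRATEGIES."""
    lam1 = theta1 + theta2
    lam2 = b - eps - theta1     # HP facing an honouring acceptor
    lam3 = c - eps + theta2     # LP facing an honouring acceptor
    lam4 = a - eps + delta      # HP facing a fake committer
    lam5 = d - eps + delta      # LP facing a fake committer
    shared = (b + c - eps) / 2  # same-type proposers: each pays eps half the time
    return [
        [shared, b - (eps + lam1) / 2, a, b, lam2, lam2, lam4, lam4],  # HP
        [c - (eps - lam1) / 2, shared, a, b, lam3, lam3, lam5, lam5],  # LP
        [a, a, a, b, a, b, a, b],                                      # HN
        [c, c, c, d, c, d, c, d],                                      # LN
        [c + theta1, b - theta2, a, b, a, b, a, b],                    # HC
        [c + theta1, b - theta2, c, d, c, d, c, d],                    # LC
        [a - delta, d - delta, a, b, a, b, a, b],                      # HF
        [a - delta, d - delta, c, d, c, d, c, d],                      # LF
    ]


# Illustrative (assumed) values: b_H = 6, c_H = 1, b_L = 2, c_L = 1, alpha = 0.5,
# which give a = 2, b = 5, c = 1, d = 0.
a, b, c, d = 2.0, 5.0, 1.0, 0.0
eps, delta = 1.0, 6.0
theta1, theta2 = fair_thetas(b, c, eps)
M = commitment_payoff_matrix(a, b, c, d, eps, delta, theta1, theta2)
```

With these assumed values, a proposer and an honouring acceptor receive the same payoff (for instance `M[0][4]` for HP against HC and `M[4][0]` for HC against HP), which is the property a fair agreement is meant to have.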

We say that an agreement is fair if both parties obtain the same benefit when they honour it (after having taken into account the cost of setting up the agreement). For that, we can show that $\theta_1$ and $\theta_2$ must satisfy $\theta_1 = \frac{b - c - \epsilon}{2}$ and $\theta_2 = \frac{b - c + \epsilon}{2}$, and thus both parties obtain $\frac{b + c - \epsilon}{2}$. Indeed, the first condition is obtained by comparing the payoffs of HP and HC when they interact, i.e. $b - \epsilon - \theta_1 = c + \theta_1$; solving this equation gives $\theta_1 = \frac{b - c - \epsilon}{2}$. Similarly, comparing the payoffs of LP and HC, $c - \epsilon + \theta_2 = b - \theta_2$, gives $\theta_2 = \frac{b - c + \epsilon}{2}$.

These conditions also ensure that the payoffs of HP and LP when interacting with each other are equal. Our analysis below will first focus on whether and when fair agreements can lead to improvement in terms of coordination and the overall social welfare (i.e. average population payoff). We will discuss how different kinds of agreements (varying $\theta_1$ and $\theta_2$) affect the outcome, with additional results provided in the Appendix.

We now describe an $N$-player ($N > 2$) version of the TD model. As before, we introduce the model in the context of technology investment market decision making. In a group (of size $N$) with $k$ players of type H (i.e., $N - k$ players of type L), the expected payoffs of playing H and L can be written as follows

$$
\Pi_H(k) = \alpha_H(k)\, b_H - c_H, \qquad \Pi_L(k) = \alpha_L(k)\, b_L - c_L, \tag{3}
$$

where $\alpha_H(k)$ and $\alpha_L(k)$ represent the fraction of the benefit obtained by H and L players, respectively, which depend on the composition of the group, $k$. For two-player TD, both are equal to $\alpha$. To generalize to $N$-player TD interactions, they should also depend on the demand for high technology (H) in the group, describing the maximal number of players in the group that can adopt H without reducing their benefit due to competition. Let us denote this number by $\mu$ (where $1 \le \mu \le N$). For example, intermediate values of $\mu$ indicate that a high level of group diversity is needed for optimal coordination. When $\mu = N$, there is a significant market demand for the high-benefit technology, so that all firms can adopt it without leading to competition.

Hence, we define

$$
\alpha_H(k) =
\begin{cases}
1, & \text{if } k \le \mu, \\
\dfrac{\alpha_1 \mu}{k}, & \text{otherwise},
\end{cases}
\tag{4}
$$

$$
\alpha_L(k) =
\begin{cases}
1, & \text{if } k \ge \mu, \\
\dfrac{\alpha_2 (N - \mu)}{N - k}, & \text{otherwise}.
\end{cases}
\tag{5}
$$

The rationale of these definitions is that whenever $k \le \mu$, full benefits from adopting H can be obtained, and moreover, if $k > \mu$, the larger $k$ the stronger the competition is among H adopters. The same reasoning applies to L adopters. The parameters $\alpha_1$ and $\alpha_2$ stand for the intensities of competition for investing in H and in L, respectively. For simplicity we assume in this paper $\alpha_1 = \alpha_2 = \alpha$. Note that for $N = 2$ we recover the two-player model given in Equation (1).

The optimal group payoff is achieved when there are exactly $\mu$ players adopting H and the rest adopting L, leading to an average payoff for each member given by

$$
A := \frac{\mu\,(b_H - c_H) + (N - \mu)\,(b_L - c_L)}{N}.
$$
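The definitions in Equations (3) to (5) and the optimal average payoff $A$ translate directly into code. The sketch below is a hedged Python illustration rather than the authors' implementation: the function names and the parameter values ($N = 5$, $\mu = 2$, $\alpha = 0.5$, and the costs and benefits) are assumptions, and it uses the simplification $\alpha_1 = \alpha_2 = \alpha$ stated above.

```python
def alpha_H(k, mu, alpha1):
    """Fraction of b_H obtained by each H adopter when k players adopt H."""
    return 1.0 if k <= mu else alpha1 * mu / k


def alpha_L(k, N, mu, alpha2):
    """Fraction of b_L obtained by each L adopter when k players adopt H."""
    return 1.0 if k >= mu else alpha2 * (N - mu) / (N - k)


def payoff_H(k, mu, alpha1, b_H, c_H):
    """Payoff of an H adopter, Equation (3)."""
    return alpha_H(k, mu, alpha1) * b_H - c_H


def payoff_L(k, N, mu, alpha2, b_L, c_L):
    """Payoff of an L adopter, Equation (3)."""
    return alpha_L(k, N, mu, alpha2) * b_L - c_L


def group_average(k, N, mu, alpha, b_H, c_H, b_L, c_L):
    """Mean payoff per member in a group with k H adopters (alpha1 = alpha2)."""
    total = 0.0
    if k > 0:
        total += k * payoff_H(k, mu, alpha, b_H, c_H)
    if k < N:
        total += (N - k) * payoff_L(k, N, mu, alpha, b_L, c_L)
    return total / N


def optimal_average(N, mu, b_H, c_H, b_L, c_L):
    """The quantity A: exactly mu players adopt H and the rest adopt L."""
    return (mu * (b_H - c_H) + (N - mu) * (b_L - c_L)) / N


# Illustrative (assumed) parameters.
N, mu, alpha = 5, 2, 0.5
b_H, c_H, b_L, c_L = 6.0, 1.0, 2.0, 1.0
averages = [group_average(k, N, mu, alpha, b_H, c_H, b_L, c_L) for k in range(N + 1)]
best_k = max(range(N + 1), key=lambda k: averages[k])
```

For these assumed parameters the group average peaks at `best_k == mu`, where it coincides with `optimal_average(...)`; any other composition leaves either the H adopters or the L adopters competing among themselves.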

We can define the $N$-player game version with prior commitments in a similar fashion as in the two-player game. Commitment proposing strategists (i.e. HP and LP players) will propose before an interaction that the group will play the optimal arrangement (so that every player obtains an average payoff $A$). For simplicity, we assume that the committed players adopt the fair agreement, i.e. every member will obtain the same payoff after compensation is made to those adopting L. As such, we do not need to consider who will adopt H or L, as all would receive the same payoff at the end. Moreover, whenever a player in the group refuses to commit, commitment proposers will adopt H. Details of the payoff calculation will be provided in the Results section (cf. Table 1).

In this work, we will perform theoretical analysis and numerical simulations (see next section) using EGT methods for finite populations (Nowak et al., 2004; Imhof et al., 2005). Let $Z$ be the size of the population. In such a setting, individuals' payoff represents their fitness or social success, and evolutionary dynamics is shaped by social learning (Hofbauer and Sigmund, 1998; Sigmund, 2010), whereby the most successful individuals will tend to be imitated more often by the other individuals. In the current work, social learning is modelled using the so-called pairwise comparison rule (Traulsen et al., 2006), a standard approach in EGT, assuming that an individual $A$ with fitness $f_A$ adopts the strategy of another individual $B$ with fitness $f_B$ with probability $p$ given by the Fermi function,

$$
p_{A,B} = \left(1 + e^{-\beta (f_B - f_A)}\right)^{-1}.
$$

The parameter $\beta$ represents the 'imitation strength' or 'intensity of selection', i.e., how strongly the individuals base their decision to imitate on the fitness difference between themselves and the opponents. For $\beta = 0$, we obtain the limit of neutral drift, in which the imitation decision is random.
For large $\beta$, imitation becomes increasingly deterministic.

In the absence of mutations or exploration, the end states of evolution are inevitably monomorphic: once such a state is reached, it cannot be escaped through imitation. We thus further assume that, with a certain mutation probability, an individual switches randomly to a different strategy without imitating another individual. In the limit of small mutation rates, the dynamics will proceed with, at most, two strategies in the population, such that the behavioural dynamics can be conveniently described by a Markov Chain, where each state represents a monomorphic population, whereas the transition probabilities are given by the fixation probability of a single mutant (Imhof et al., 2005; Nowak et al., 2004). The resulting Markov Chain has a stationary distribution, which characterises the average time the population spends in each of these monomorphic end states.

Before describing how to calculate this stationary distribution, we need to show how payoffs are calculated, which differ for two-player and $N$-player settings, as below.

• Average Payoff for the Two Player Game

Let $\pi_{i,j}$ represent the payoff obtained by strategist $i$ in each pairwise interaction with strategist $j$, as defined in the payoff matrices in Equations (1) and (2). Suppose there are at most two strategies in the population, say, $k$ individuals using $i$ ($0 \le k \le Z$) and $(Z - k)$ individuals using $j$. Thus the average payoffs of an individual that uses $i$ or $j$ can be written respectively as follows

$$
\Pi_i(k) = \frac{(k - 1)\,\pi_{i,i} + (Z - k)\,\pi_{i,j}}{Z - 1}, \qquad
\Pi_j(k) = \frac{k\,\pi_{j,i} + (Z - k - 1)\,\pi_{j,j}}{Z - 1}. \tag{6}
$$

• Expected Payoff in the Multiplayer Game

In the case of $N$-player interactions, suppose the population includes $x$ individuals of type $i$ and $Z - x$ individuals of type $j$. The probability to select $k$ individuals of type $i$ and $N - k$ individuals of type $j$, in $N$ trials, is given by the hypergeometric distribution as follows (Sigmund, 2010; Gokhale and Traulsen, 2010)

$$
H(k, N, x, Z) = \frac{\binom{x}{k}\binom{Z - x}{N - k}}{\binom{Z}{N}}.
$$

Hence, in a population of $x$ $i$-strategists and $(Z - x)$ $j$-strategists, the average payoffs of $i$ and $j$ are given by

$$
\begin{aligned}
\Pi_{ij}(x) &= \sum_{k=0}^{N-1} H(k, N - 1, x - 1, Z - 1)\, \pi_{ij}(k + 1)
 = \sum_{k=0}^{N-1} \frac{\binom{x - 1}{k}\binom{Z - x}{N - 1 - k}}{\binom{Z - 1}{N - 1}}\, \pi_{ij}(k + 1), \\
\Pi_{ji}(x) &= \sum_{k=0}^{N-1} H(k, N - 1, x, Z - 1)\, \pi_{ji}(k)
 = \sum_{k=0}^{N-1} \frac{\binom{x}{k}\binom{Z - 1 - x}{N - 1 - k}}{\binom{Z - 1}{N - 1}}\, \pi_{ji}(k),
\end{aligned}
\tag{7}
$$

where $\pi_{ij}(k)$ and $\pi_{ji}(k)$ denote the payoffs of an $i$-strategist and of a $j$-strategist, respectively, in a group containing $k$ $i$-strategists and $N - k$ $j$-strategists.

Now, for both two-player and $N$-player settings, the probability to change the number $k$ of individuals using strategy $i$ by $\pm 1$ in each time step can be written as (Traulsen et al., 2006)

$$
T^{\pm}(k) = \frac{Z - k}{Z}\,\frac{k}{Z}\,\left[1 + e^{\mp\beta\,[\Pi_i(k) - \Pi_j(k)]}\right]^{-1}, \tag{8}
$$

where $\Pi_i(k)$ and $\Pi_j(k)$ are the average payoffs of $i$ and $j$ (Equation (6) or (7)) when $k$ individuals use strategy $i$.

The fixation probability of a single mutant with a strategy $i$ in a population of $(Z - 1)$ individuals using $j$ is given by (Traulsen et al., 2006; Nowak et al., 2004)

$$
\rho_{j,i} = \left(1 + \sum_{m=1}^{Z-1} \prod_{k=1}^{m} \frac{T^{-}(k)}{T^{+}(k)}\right)^{-1}. \tag{9}
$$

Considering a set $\{1, \dots, q\}$ of different strategies, these fixation probabilities determine a transition matrix $M = \{T_{ij}\}_{i,j=1}^{q}$, with $T_{ij,\, j \ne i} = \rho_{ji}/(q - 1)$ and $T_{ii} = 1 - \sum_{j=1,\, j \ne i}^{q} T_{ij}$, of a Markov Chain. The normalised eigenvector associated with the eigenvalue 1 of the transpose of $M$ provides the stationary distribution described above (Imhof et al., 2005), describing the relative time the population spends adopting each of the strategies.

Risk-dominance

An important measure to determine the evolutionary dynamics of a given strategy is its risk-dominance against others. For two strategies $i$ and $j$, risk-dominance is a criterion which determines which selection direction is more probable: an $i$ mutant fixating in a homogeneous population of individuals using $j$, or a $j$ mutant fixating in a homogeneous population of individuals playing $i$. When the former is more probable than the latter, we say that $i$ is risk-dominant against $j$ (Nowak et al., 2004; Sigmund, 2010), which holds for any intensity of selection and in the limit of large population size $Z$ when

$$
\sum_{k=1}^{N} \Pi_{i,j}(k) \ge \sum_{k=0}^{N-1} \Pi_{j,i}(k), \tag{10}
$$

where $\Pi_{i,j}(k)$ and $\Pi_{j,i}(k)$ are the payoffs of an $i$-strategist and of a $j$-strategist, respectively, in a group containing $k$ $i$-strategists. This condition is applicable to both two-player games ($N = 2$) and $N$-player games with $N > 2$.
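The small-mutation procedure described in this section (average payoffs, fixation probabilities, transition matrix, stationary distribution), together with the two-player case of the risk-dominance condition, can be sketched in a few lines of Python. This is an illustrative sketch for pairwise games only, not the authors' code. All function names, the population size `Z = 50`, `beta = 0.1`, and the payoff values are assumptions. The code indexes `T[i][j]` as the transition from a monomorphic `i` population to a monomorphic `j` population, and it finds the stationary distribution by solving the linear system directly with Gaussian elimination rather than through an eigenvector routine.

```python
import math


def fixation_probability(p_mm, p_mr, p_rm, p_rr, Z, beta):
    """Probability that one mutant takes over Z - 1 residents.

    p_mm, p_mr: mutant's payoff against a mutant / a resident;
    p_rm, p_rr: resident's payoff against a mutant / a resident.
    """
    total, product = 1.0, 1.0
    for k in range(1, Z):  # k = current number of mutants
        pi_m = ((k - 1) * p_mm + (Z - k) * p_mr) / (Z - 1)
        pi_r = (k * p_rm + (Z - k - 1) * p_rr) / (Z - 1)
        # For the Fermi rule, T^-(k) / T^+(k) = exp(-beta * (pi_m - pi_r)).
        product *= math.exp(-beta * (pi_m - pi_r))
        total += product
    return 1.0 / total


def transition_matrix(P, Z, beta):
    """T[i][j]: chance that a monomorphic i population becomes monomorphic j."""
    q = len(P)
    T = [[0.0] * q for _ in range(q)]
    for i in range(q):      # resident strategy
        for j in range(q):  # mutant strategy
            if i != j:
                rho = fixation_probability(P[j][j], P[j][i], P[i][j], P[i][i], Z, beta)
                T[i][j] = rho / (q - 1)
        T[i][i] = 1.0 - sum(T[i])
    return T


def stationary_distribution(T):
    """Solve p T = p with sum(p) = 1 by Gaussian elimination."""
    q = len(T)
    A = [[T[j][i] - (1.0 if i == j else 0.0) for j in range(q)] for i in range(q)]
    A[-1] = [1.0] * q  # replace one balance equation by the normalisation
    rhs = [0.0] * (q - 1) + [1.0]
    for col in range(q):
        piv = max(range(col, q), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        for r in range(col + 1, q):
            f = A[r][col] / A[col][col]
            rhs[r] -= f * rhs[col]
            A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    p = [0.0] * q
    for r in range(q - 1, -1, -1):
        p[r] = (rhs[r] - sum(A[r][m] * p[m] for m in range(r + 1, q))) / A[r][r]
    return p


def is_risk_dominant(P, i, j):
    """Two-player case (N = 2) of the risk-dominance condition."""
    return P[i][i] + P[i][j] >= P[j][j] + P[j][i]


# Illustrative (assumed) base game between H and L: a = 2, b = 5, c = 1, d = 0.
P = [[2.0, 5.0], [1.0, 0.0]]
T = transition_matrix(P, Z=50, beta=0.1)
p = stationary_distribution(T)
```

The same functions accept any square pairwise payoff matrix, so an eight-strategy commitment game could be passed in place of `P`. For very strong selection the fixation probabilities can become extremely small, in which case a more careful numerical treatment than this sketch would be advisable.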

We will first describe results for two-player games, then proceed to those for the $N$-player version.

To begin with, using the condition given in Equation 10, we obtain that if $\theta_1 + \theta_2 < b - c$ then HP is risk-dominant (see Methods) against LP. Otherwise, LP is risk-dominant against HP.

Similarly, we derive the conditions regarding the commitment parameters for which HP and LP are evolutionarily viable strategies, i.e. when they are risk-dominant against all other non-proposing ones. Indeed, HP and LP are risk-dominant against all other six non-proposing strategies, respectively, if and only if

$$
\begin{aligned}
\epsilon &< \min\left\{ b + c - 2a,\; 3b - c - 2d,\; \frac{3b - c - 2a - 4\theta_1}{3},\; \frac{3b - c - 2d - 4\theta_1}{3},\; \frac{b + c - 2a + 4\delta}{3},\; \frac{b + c - 2d + 4\delta}{3} \right\}, \\
\epsilon &< \min\left\{ b + c - 2a,\; 3b - c - 2d,\; \frac{3c - b - 2a + 4\theta_2}{3},\; \frac{3c - b - 2d + 4\theta_2}{3},\; \frac{b + c - 2a + 4\delta}{3},\; \frac{b + c - 2d + 4\delta}{3} \right\}.
\end{aligned}
\tag{11}
$$

Note that each element in the min expressions above corresponds to the condition for one of the six non-proposing strategies HN, LN, HC, LC, HF, LF, respectively.

Thus, we can derive the conditions for $\theta_1$, $\theta_2$ and $\delta$:

$$
\begin{aligned}
\theta_1 &< \tfrac{1}{4}\left(3b - c - 3\epsilon - 2\max\{a, d\}\right), \\
\theta_2 &> \tfrac{1}{4}\left(b - 3c + 3\epsilon + 2\max\{a, d\}\right), \\
\delta &> \tfrac{1}{4}\left(3\epsilon - b - c + 2\max\{a, d\}\right).
\end{aligned}
\tag{12}
$$

In particular, for fair agreements, i.e. $\theta_1 = (b - c - \epsilon)/2$ and $\theta_2 = (b - c + \epsilon)/2$, we obtain

$$
\begin{aligned}
\epsilon &< b + c - 2\max\{a, d\}, \\
\delta &> \tfrac{1}{4}\left(3\epsilon - b - c + 2\max\{a, d\}\right).
\end{aligned}
\tag{13}
$$

The condition against LN is then implied, because $3b - c - 2d > b + c - 2\max\{a, d\}$, which is due to $b > c$ and $\max\{a, d\} \ge d$.

In general, these conditions indicate that for commitments to be a viable option for improving coordination, the cost of arrangement $\epsilon$ must be sufficiently small while the compensation associated with the contract needs to be sufficiently large (see already Figure 2 for numerical validation). Furthermore, for the first condition to hold, it is necessary that $b + c > 2\max\{a, d\}$. It means that the total payoff of two players when playing the TD game is always greater when they can coordinate to choose different technologies than when they both choose the same technology.

Moreover, the conditions in Equation 13 can be expressed in terms of $\alpha$ and the costs and benefits of investment, as follows (see again the payoff matrix in Equation 1)

$$
\begin{aligned}
\alpha &< \frac{1}{2} + \min\left\{ \frac{c_H + b_L - c_L - \epsilon}{2 b_H},\; \frac{c_L + b_H - c_H - \epsilon}{2 b_L} \right\}, \\
\alpha &< \frac{1}{2} + \min\left\{ \frac{c_H + b_L - c_L - 3\epsilon + 4\delta}{2 b_H},\; \frac{c_L + b_H - c_H - 3\epsilon + 4\delta}{2 b_L} \right\},
\end{aligned}
$$

which can be rewritten as

$$
\alpha < \frac{1}{2} + \min\left\{ \frac{c_H + b_L - c_L - \max\{\epsilon,\, 3\epsilon - 4\delta\}}{2 b_H},\; \frac{c_L + b_H - c_H - \max\{\epsilon,\, 3\epsilon - 4\delta\}}{2 b_L} \right\}. \tag{14}
$$

This condition indicates under what conditions of market competitiveness, and of the costs and benefits of investing in the available technologies, commitments can be an evolutionarily viable mechanism. Intuitively, for given costs and benefits of investment (i.e. fixing $c_L$, $c_H$, $b_L$, $b_H$), a larger cost of arranging a (reliable) agreement, $\epsilon$, leads to a smaller threshold of $\alpha$ where commitment is viable. Moreover, given a commitment system (i.e.
fixing $\epsilon$ and $\delta$) and assuming similar investment costs for the two technologies, a larger ratio of the benefits obtained from the two technologies, $b_H/b_L$, leads to a smaller upper bound on $\alpha$ for which commitment is viable.

Remarkably, our numerical analysis below (see Figure 1) shows that the condition in Equation 14 accurately predicts the threshold of $\alpha$ below which the commitment proposing strategies (i.e. HP and LP) are highly abundant in the population, leading to an improvement in the average population payoff compared to when commitment is absent (Figure 3). On the other hand, when $\alpha$ is sufficiently large, little improvement can be achieved, especially when $b_H/b_L$ is large (which is in accordance with the analytical results above).

We calculate the stationary distribution in a population of eight strategies, HP, LP, HN, LN, HC, LC, HF and LF, using the methods described above. In Figure 1, we show the frequency of these strategies as a function of $\alpha$, for different values of $\epsilon$ and game configurations. In general, the commitment proposing strategies HP and LP dominate the population when $\alpha$ is small, while HN and HC dominate when $\alpha$ is sufficiently large; this holds for all values of $\beta$ considered. That is, commitment proposing strategies are viable and successful whenever market competitiveness is high, so that efficient coordination among the competing players/firms is needed to ensure high benefits. Notably, the thresholds of $\alpha$ below which HP and LP are dominant closely corroborate the analytical condition in Equation 14, in all cases. This observation is also robust for different values of the intensity of selection, $\beta$.

The observation is likewise robust for varying commitment parameters, i.e. the cost of arranging commitment, $\epsilon$, and the compensation cost associated with commitment, $\delta$ (see Figure 2). Namely, we show the total frequency of commitment strategies (i.e.
sum of the frequencies of HP and LP) for varying values of these parameters and for different values of $\alpha$. It can be seen that, in general, the commitment strategies dominate the population whenever $\epsilon$ is sufficiently small and $\delta$ is sufficiently large. This observation is in accordance with previous commitment modelling works for cooperation dilemma games (Han et al., 2013, 2015a, 2017). In addition, we observe that in the current coordination problem, the smaller $\alpha$ is, the wider the range of $\epsilon$ and $\delta$ for which these commitment strategies dominate the population. Our additional results show that these observations are robust with respect to other game configurations. Furthermore, increasing $\beta$ has a noticeable effect only for large $\alpha$, where the commitment frequency increases sharply once $\delta$ is sufficiently large.

Now, in order to determine whether and when commitments can actually lead to meaningful improvement, in Figure 3 we compare the average population payoff (social welfare) when commitment is present and when it is absent. In general, when $\alpha$ is sufficiently small (below a threshold), the smaller $\epsilon$ is, the greater the improvement obtained. When $\alpha$ is sufficiently large, commitment leads to no improvement or might even be detrimental for social welfare, especially when $b_H/b_L$ is large (which is in accordance with the analytical results above). The detriment is further increased when $\beta$ is small. The threshold below which a notable improvement can be achieved is the same as the one for the viability of HP and LP (i.e. as described in Equation 14).

[Figure 1 plot not reproduced. Legend: HP, LP, HN, LN, HC, LC, HF, LF; panels a to i for $b = 5$, $c = 1$, with ($\theta_1$, $\theta_2$) = (1.95, 2.05), (1.5, 2.5) and (1, 3).]

Figure 1: Frequency of the eight strategies, HP, LP, HN, LN, HC, LC, HF and LF, as a function of $\alpha$, for different values of $\epsilon$ and $\beta$. In general, the commitment proposing strategies HP and LP dominate the population when $\alpha$ is small, while HN and HC dominate when $\alpha$ is sufficiently large (i.e. as market competition decreases); this is robust for different values of the intensity of selection, $\beta$. Larger values of $\beta$ increase the difference between the strategies' frequencies but do not change the outcomes in general. Parameters: in all panels $c_H = 1$, $c_L = 1$, $b_L = 2$ (i.e. $c = 1$), $b_H = 6$ (i.e. $b = 5$). Other parameters: $\delta = 6$; $\beta = 0.01$, $0.1$ and $1$; $Z = 100$. Fair agreements are used, where $\theta_1 = (b - c - \epsilon)/2$ and $\theta_2 = (b - c + \epsilon)/2$.

[Figure 2 plot not reproduced. Panels a to i: $\alpha = 0.1$, $0.5$, $0.9$ (columns) and $\beta = 0.01$, $0.1$, $1$ (rows).]

Figure 2: Total frequency of commitment strategies (i.e. sum of the frequencies of HP and LP), as a function of $\epsilon$ and $\delta$, for different values of $\alpha$ and $\beta$. Primarily, the commitment proposing strategies dominate the population whenever $\epsilon$ is sufficiently small and $\delta$ is sufficiently large. Furthermore, the smaller $\alpha$ is, the wider the range of $\epsilon$ and $\delta$ for which these commitment strategies dominate. Increasing $\beta$ has a noticeable effect only for large $\alpha$, where the commitment frequency increases sharply once $\delta$ is sufficiently large. Parameters: in all panels $c_H = 1$, $c_L = 1$, $b_L = 2$ (i.e. $c = 1$), and $b_H = 6$ (i.e. $b = 5$). Other parameters: $\beta = 0.01$ in the first, $\beta = 0.1$ in the second and $\beta = 1$ in the third row; population size $Z = 100$. Fair agreements are used, where $\theta_1 = (b - c - \epsilon)/2$ and $\theta_2 = (b - c + \epsilon)/2$.

[Figure 3 plot not reproduced. Legend: With Commitment ($\epsilon = 0.1$, $1$, $2$) and Without Commitment; panels a to c: $b = 5$, $c = 1$; panels d to f: $b = 2$, $c = 1$; $\beta = 0.01$, $0.1$, $1$ within each row.]

Figure 3: Average population payoff as a function of $\alpha$, when commitment is absent and when it is present, for different values of $\epsilon$ and $\beta$. When $\alpha$ is small, significant improvement in terms of the average population payoff can be achieved through prior commitment. When $\alpha$ is sufficiently large, commitment leads to no improvement or might even be detrimental for social welfare, especially when $\beta$ is small. That is, at $\alpha = 0.$[…] $\alpha = 0.$[…] Parameters: in all panels $c_H = 1$, $c_L = 1$, $b_L = 2$ (i.e. $c = 1$); in panels a, b and c) $b_H = 6$ (i.e. $b = 5$) with $\beta = 0.01$, $0.1$ and $1$; in panels d, e and f) $b_H = 3$ (i.e. $b = 2$) with $\beta = 0.01$, $0.1$ and $1$; $\delta = 6$; population size $Z = 100$. Fair agreements are used, where $\theta_1 = (b - c - \epsilon)/2$ and $\theta_2 = (b - c + \epsilon)/2$.

N-player TD game

As mentioned above, compared to cooperation dilemmas such as PD and PGG, fake strategies make less sense in the context of coordination games, since they would not earn the temptation payoff by adopting a choice different from what was agreed. To focus on the group effect and the effect of the newly introduced parameter $\mu$, we consider a population consisting of HP, LP, HN, LN, HC and LC (i.e. excluding fake strategies). As shown in the two-player game analysis, the fake strategies (i.e. HF and LF) are not viable options in TD games and can be ignored. This is equivalent to considering the full set of strategies with a sufficiently large $\delta$.

First of all, we derive the payoffs received by each strategy when encountering specific other strategies (see a summary in Table 1).
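The comparisons in this section all rest on the same risk-dominance test: in a group of size $N$, strategy $i$ is risk-dominant against strategy $j$ when $\sum_{k=1}^{N}\Pi_{ij}(k) \ge \sum_{k=0}^{N-1}\Pi_{ji}(k)$, with $k$ counting the $i$-players in the group. A minimal Python sketch of that test, applied to HP against HC and HP against HN, might look as follows. The function names are hypothetical, and the numeric values used for `A` and `pi_H_N` (standing in for $A$ and $\Pi_H(N)$) are illustrative assumptions, not values from the paper.

```python
from typing import Callable


def harmonic(n: int) -> float:
    """H_N = sum_{k=1}^{N} 1/k."""
    return sum(1.0 / k for k in range(1, n + 1))


def risk_dominant(pi_ij: Callable[[int], float],
                  pi_ji: Callable[[int], float],
                  n: int) -> bool:
    """True if sum_{k=1}^{N} pi_ij(k) >= sum_{k=0}^{N-1} pi_ji(k),
    where k is the number of i-players in the group."""
    lhs = sum(pi_ij(k) for k in range(1, n + 1))
    rhs = sum(pi_ji(k) for k in range(0, n))
    return lhs >= rhs


def hp_beats_hc(A: float, pi_H_N: float, eps: float, n: int) -> bool:
    # HP among k proposers (rest HC): the k proposers share the cost eps.
    pi_hp = lambda k: A - eps / k
    # HC gets A once at least one proposer is present, else Pi_H(N).
    pi_hc = lambda k: A if k >= 1 else pi_H_N
    return risk_dominant(pi_hp, pi_hc, n)


def hp_beats_hn(A: float, pi_H_N: float, eps: float, n: int) -> bool:
    # A commitment forms only when all N players are proposers (k == N).
    pi_hp = lambda k: A - eps / n if k == n else pi_H_N
    # HN never commits and always receives Pi_H(N).
    pi_hn = lambda k: pi_H_N
    return risk_dominant(pi_hp, pi_hn, n)


# Illustrative (assumed) values, not taken from the paper.
A, pi_H_N, N = 3.0, 1.0, 5
eps_max_hc = (A - pi_H_N) / harmonic(N)  # closed-form bound against HC
eps_max_hn = N * (A - pi_H_N)            # closed-form bound against HN
```

Under these assumptions, the direct sums and the closed-form bounds agree: HP is risk-dominant against HC only for `eps` up to `eps_max_hc`, and against HN only for `eps` up to `eps_max_hn`.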
Here, $\Pi_{ij}(k)$ and $\Pi_{ji}(k)$ denote the payoffs of a strategist of type $i$ and $j$, respectively, in a group consisting of $k$ players of type $i$ and $N - k$ players of type $j$. The first column of the table lists all possible strategies that can be used by player $i$ (the focal player), whereas the second column shows the strategies of the co-players (opponents). The third column shows the payoffs of the focal players.

Table 1: Average payoffs of focal strategy $i$ when facing strategy $j$, in a group of $k$ former and $N - k$ latter strategists.

Focal player ($i$) | Opponent ($j$) | $\Pi_{i,j}(k)$
HP, LP | HP, LP | $A - \epsilon/N$
HP, LP | HC, LC | $A - \epsilon/k$
HP, LP | HN | $\Pi_H(N)$ (for $k < N$)
HP, LP | LN | $\Pi_H(k)$ (for $k < N$)
HN | HP, LP, HN, HC | $\Pi_H(N)$
HN | LN, LC | $\Pi_H(k)$
LN | HP, HN, HC | $\Pi_L(k)$
LN | LN, LC | $\Pi_L(N)$
LN | LP | $\Pi_L(k)$
HC, LC | HP, LP | $A$ (for $k < N$)
HC | HN, HC | $\Pi_H(N)$
HC | LN, LC | $\Pi_H(k)$
LC | HN, HC | $\Pi_L(k)$
LC | LN, LC | $\Pi_L(N)$

We now derive the conditions under which HP is risk-dominant against the rest of the strategies. Since we assume fair agreements, the conditions for LP are equivalent to those for HP in terms of risk-dominance.

First, HP is risk-dominant against HC if

$$\sum_{k=1}^{N} \Pi_{HP,HC}(k) \ge \sum_{k=0}^{N-1} \Pi_{HC,HP}(k),$$

which can be written as

$$\sum_{k=1}^{N} \left(A - \frac{\epsilon}{k}\right) \ge \Pi_H(N) + \sum_{k=1}^{N-1} A.$$

Hence we obtain

$$\epsilon \le \frac{A - \Pi_H(N)}{H_N}, \tag{15}$$

where $H_N = \sum_{k=1}^{N} \frac{1}{k}$. Similarly, HP is risk-dominant against LC if

$$\epsilon \le \frac{A - \Pi_L(N)}{H_N}. \tag{16}$$

For risk-dominance of HP against HN,

$$\sum_{k=1}^{N} \Pi_{HP,HN}(k) \ge \sum_{k=0}^{N-1} \Pi_{HN,HP}(k),$$

which can equivalently be written as $A - \frac{\epsilon}{N} \ge \Pi_H(N)$, or

$$\epsilon \le N\left(A - \Pi_H(N)\right). \tag{17}$$

Finally, HP is risk-dominant against LN if

$$\sum_{k=1}^{N} \Pi_{HP,LN}(k) \ge \sum_{k=0}^{N-1} \Pi_{LN,HP}(k),$$

which can be rewritten as

$$A - \frac{\epsilon}{N} + \sum_{k=1}^{N-1} \Pi_H(k) \ge \sum_{k=0}^{N-1} \Pi_L(k).$$

Further simplification leads to

$$\epsilon \le N\left[ A + \left( \mu b_H + \alpha\mu b_H \sum_{k=\mu+1}^{N-1}\frac{1}{k} - (N-1)c_H \right) - \left( \alpha (N-\mu) b_L \sum_{k=0}^{\mu}\frac{1}{N-k} + (N-\mu-1) b_L - N c_L \right) \right]. \tag{18}$$

In short, for commitment proposers to be risk-dominant against all other strategies, $\epsilon$ must be sufficiently small, namely smaller than the minimum of the right-hand sides of Equations (15)-(18).

[Figure 4 plot not reproduced. Legend: HP, LP, HN, LN, HC, LC; panels a to c: $b = 5$, $c = 1$ with $\mu = 1$, $2$, $5$; panels d to f: $b = 2$, $c = 1$ with $\mu = 1$, $2$, $5$.]

Figure 4:

Frequency of the six strategies HP, LP, HN, LN, HC and LC, as a function of $\epsilon$ in an $N$-player game with commitment, for different values of $\mu$. In the $N$-player game, the new parameter $\mu$ describes the market demand for a high technology, which was set to 1 in the pairwise game. HP and LP have a high frequency for sufficiently small $\epsilon$ when $\mu = 2$ in both games, and also when $\mu = 1$ in the first, easier coordination situation (first row). When $\mu = 5$, i.e. when all players can adopt H without benefit reduction, HC always dominates and commitment strategies are not successful. This means that when a diversity of technology adoption is needed, initiating prior commitments to enhance coordination is important. Parameters: in panels a, b and c) $b_H = 6$ (i.e. $b = 5$) with $\mu = 1$, $2$, $5$; in panels d, e and f) $b_H = 3$ (i.e. $b = 2$) with $\mu = 1$, $2$, $5$. Other parameters: $N = 5$, $\beta = 0.$[…], $\alpha = 0.$[…], $c_H = 1$, $c_L = 1$, $b_L = 2$ (i.e. $c = 1$).

We compute stationary distributions in a population of six strategies HP, LP, HN, LN, HC and LC, for the $N$-player TD game, using the payoffs in Table 1 and the Methods described above. To begin with, in Figure 4 (see also Figure 9 in the Appendix), we provide numerical validation for the analytical conditions obtained in the previous section regarding when commitment proposing strategies are evolutionarily viable (being risk-dominant against the others). Similar to the pairwise TD game, we observe that there is a threshold for $\epsilon$ below which this is the case. Moreover, Figure 5 shows that the frequencies of these strategies (HP and LP) decrease for increasing $\alpha$. They dominate the population whenever $\epsilon$ is sufficiently small (e.g. $\epsilon = 0.$[…]) and the market competition is harsher (i.e. small $\alpha$). These results are robust for different intensities of selection (see Figure 10 in the Appendix). In general, our results confirm, for the effects of $\epsilon$ and $\alpha$ on the evolutionary outcomes, observations similar to those obtained in the pairwise game above.

[Figure 5 plot not reproduced. Legend: HP, LP, HN, LN, HC, LC; panels a to f with $\epsilon = 0.1$, $1$, $2$ for the two game configurations ($b = 5$, $c = 1$ and $b = 2$, $c = 1$).]

Figure 5: Frequency of the six strategies HP, LP, HN, LN, HC and LC, as a function of $\alpha$ in a multiplayer game with commitment, for different values of $\epsilon$ and two different game configurations. In general, the commitment proposing strategies (HP and LP) decrease in frequency for increasing $\alpha$. They dominate over other strategies for sufficiently small $\alpha$ and $\epsilon$. That is, it is more beneficial to engage in a prior commitment deal when market competition is fierce and the cost of arranging the commitment is small. Parameters: in all panels $c_H = 1$, $c_L = 1$, $b_L = 2$ (i.e. $c = 1$); in panels a, b and c) $b_H = 6$ (i.e. $b = 5$) with $\epsilon = 0.1$, $1$, $2$; in panels d, e and f) $b_H = 3$ (i.e. $b = 2$) with $\epsilon = 0.1$, $1$, $2$. Other parameters: $N = 5$, $\beta = 0.$[…], $\mu = 2$.

[Figure 6 plot not reproduced. Panel a: $b = 5$, $c = 1$; panel b: $b = 2$, $c = 1$.]

Figure 6: Total frequency of commitment proposing strategies HP and LP as a function of $\mu$ and $\epsilon$. In general, the commitment proposing strategies are most successful for intermediate values of $\mu$, especially for a sufficiently small cost of arranging prior commitment $\epsilon$. Parameters: in all panels, $c_H = 1$, $c_L = 1$ (i.e. $c = 1$), $b_L = 2$. In panel a) $b_H = 6$ (i.e. $b = 5$) and in panel b) $b_H = 3$ (i.e. $b = 2$). Other parameters: $N = 5$, $\beta = 0.$[…], $\alpha = 0.$[…].

We now focus on understanding the effect of the new parameter in the $N$-player game, $\mu$, on the evolutionary outcomes. Recall that $\mu$ indicates the demand for high technology (H) in the group, i.e. the maximal number of players in the group that can adopt H without reducing their benefit due to competition. Figure 4 shows the effect of different values of $\mu$ on the frequency (evolutionary success) of all strategies as a function of $\epsilon$. When $\mu$ is small to intermediate and the cost of arranging prior commitment is also small, the commitment proposing strategies are dominant. This suggests that arranging prior commitments is more beneficial in such instances, and that $\mu$ is essential in determining when commitment should be initiated. The greater the need for a group mixture or market diversity of technologies, indicating a more difficult coordination situation, the greater the need for commitment to enhance coordination among players. This observation is even more evident in Figure 6, where we examine the success of commitment for varying $\mu$ and $\epsilon$, for two different game configurations.
It can be observed that an intermediate value of $\mu$ leads to the highest frequency of commitment strategies, especially in the more difficult coordination situation (i.e. the right panel).

We now closely examine the gain in terms of social welfare improvement when using prior commitments. As shown in Figure 7, whenever $\mu < N$ ($N = 5$), i.e. there is a need to coordinate among the group players […]

[Figure 7 plot not reproduced. Legend: With Commitment ($\epsilon = 0.1$, $1$, $2$) and Without Commitment.]

Figure 7:

Average population payoff (social welfare) as a function of $\mu$ with different values of $\epsilon$, comparing when commitment is absent against when it is present. We compare results for different values of $\beta$ in two game configurations. We observe that whenever $\mu <$ […] $b_H = 6$ (i.e. $b = 5$); in panels d, e and f) $b_H = 3$ (i.e. $b = 2$). Other parameters: $N = 5$, $\alpha = 0.$[…], $c_H = 1$, $c_L = 1$, $b_L = 2$.

[…] $\mu$ and higher values of the intensity of selection, $\beta$.

We have described in this paper novel evolutionary game theory models showing how prior commitments can be adopted as an efficient mechanism for enhancing coordination, in both pairwise and multi-player interactions. For that, we described technology adoption (TD) games where technology investment firms achieve the best collective outcome if they can coordinate with each other to adopt a mixture of different technologies. To this end, a parameter $\alpha$ was used to capture the competitiveness level of the product market and how beneficial it is to achieve coordination, while another parameter $\mu$ captures the optimal coordination mixture or diversity of technology adopters in a group (in the pairwise case, we assume the optimal mixture is where the two firms adopt different technologies to avoid conflict).

In coordination settings, there are multiple desirable outcomes and players have distinct preferences over which outcome should be agreed upon, thus leading to a larger behavioural space than in the context of cooperation dilemmas (Han et al., 2013, 2017, 2015a; Sasaki et al., 2015; Hasan and Raja, 2013). We have shown that whether commitment is a viable mechanism for promoting the evolution of coordination strongly depends on $\alpha$: when $\alpha$ is sufficiently small, prior commitment is highly abundant, leading to significant improvement in terms of social welfare (i.e. population average payoff) compared to when commitment is absent.
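To make the role of $\alpha$ concrete in the pairwise case, the sketch below evaluates the fair-agreement viability conditions numerically. It assumes the conditions $\epsilon < b + c - 2\max\{a, d\}$ and $\delta > \frac{1}{4}(3\epsilon - b - c + 2\max\{a, d\})$, and it further assumes the payoff entries $a = \alpha b_H - c_H$, $b = b_H - c_H$, $c = b_L - c_L$ and $d = \alpha b_L - c_L$. The forms of $a$ and $d$ in particular are an assumption about the payoff matrix, which is not restated in this section, and the function names are hypothetical.

```python
def payoffs(alpha, b_H, c_H, b_L, c_L):
    """Pairwise TD payoff entries.
    ASSUMPTION: a and d scale the benefit by alpha when both firms
    choose the same technology; b and c match the figure captions."""
    a = alpha * b_H - c_H
    b = b_H - c_H
    c = b_L - c_L
    d = alpha * b_L - c_L
    return a, b, c, d


def commitment_viable(alpha, eps, delta, b_H=6.0, c_H=1.0, b_L=2.0, c_L=1.0):
    """Check the two fair-agreement conditions directly."""
    a, b, c, d = payoffs(alpha, b_H, c_H, b_L, c_L)
    m = max(a, d)
    cost_ok = eps < b + c - 2 * m
    compensation_ok = delta > (3 * eps - b - c + 2 * m) / 4
    return cost_ok and compensation_ok


def alpha_threshold(eps, delta, b_H=6.0, c_H=1.0, b_L=2.0, c_L=1.0):
    """Closed-form upper bound on alpha obtained by solving
    b + c - 2*max(a, d) > max(eps, 3*eps - 4*delta) for alpha."""
    e = max(eps, 3 * eps - 4 * delta)
    bound_H = (b_H + c_H + b_L - c_L - e) / (2 * b_H)  # from the entry a
    bound_L = (b_L + c_L + b_H - c_H - e) / (2 * b_L)  # from the entry d
    return min(bound_H, bound_L)


# Parameters as in the figure captions: c_H = c_L = 1, b_L = 2, b_H = 6, delta = 6.
threshold = alpha_threshold(eps=0.1, delta=6.0)
```

With these parameters and $\epsilon = 0.1$, the bound evaluates to $7.9/12 \approx 0.658$ under the stated assumptions, so `commitment_viable` holds at $\alpha = 0.6$ and fails at $\alpha = 0.7$.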
Importantly, we have derived the analytical condition for the threshold of $\alpha$ below which the success of commitments is guaranteed, for both pairwise and multi-player TD games. Furthermore, moving from a pairwise to a multi-player setting, we showed that $\mu$ plays an important role for the success of commitment strategies as well. In general, when $\mu$ is intermediate, equivalent to a high level of diversity in group choices, arranging prior commitments proved to be highly important. It led to significant improvement in terms of social welfare, especially in a harsher coordination situation.

In the main text, we have considered that a fair agreement is arranged. In the Appendix (Figure 8), we have shown that when commitment proposers are allowed to freely choose which deal to propose to their co-players, they should be strict (i.e. sharing less of the benefit) in a highly competitive market (i.e. small $\alpha$), and more generous when the market is less competitive.

In short, our analysis has demonstrated that commitment is a viable tool for promoting the evolution of diverse collective behaviours among self-interested individuals, beyond the context of cooperation dilemmas where there is only one desirable collective outcome (Nesse, 2001; Skyrms, 1996; Barrett et al., 2007). In future work, we will consider how commitments can solve more complex collective problems, e.g. in a technological innovation race (Han et al., 2019), bargaining games (Zisis et al., 2015; Rand et al., 2013), climate change actions (Barrett et al., 2007) and cross-sector coordination (Santos et al., 2016), where there might be a large number of desirable outcomes or equilibria, especially when the number of players in an interaction increases (Duong and Han, 2015; Gokhale and Traulsen, 2010; Han et al., 2012).

T.A.H and A.E. are supported by Future of Life Institute (grant RFP2-154).

$\theta_1$ and $\theta_2$

In the main text, we assume that a fair agreement is always arranged. We consider here what would happen if HP and LP can personalise the commitment deal they want to propose, i.e. any $\theta_1$ and $\theta_2$ can be proposed (instead of always being fair). Namely, Figure 8 shows the average population payoff for varying values of these parameters, for different values of $\alpha$. We observe that when $\alpha$ is small, the highest average payoff is achieved when $\theta_1$ is sufficiently small and $\theta_2$ is sufficiently large, while for large $\alpha$ the reverse holds for the two parameters. That is, in a highly competitive market (i.e. small $\alpha$), commitment proposers should be strict (HP keeps sufficient benefit while LP requests sufficient payment from their commitment partners), while when the market is less competitive (i.e. large $\alpha$), commitment proposers should be more generous (HP proposes to give a larger benefit while LP requests a smaller payment from their commitment partners). Our results confirm that this observation is robust for different values of $\epsilon$, $\delta$ and $\beta$.

See Figure 9 for numerical results confirming the risk-dominance conditions in the $N$-player game in the main text.

[Figure 8 plot not reproduced. Panels a to i: $\alpha = 0.1$, $0.5$, $0.9$ (columns) and $\beta = 0.01$, $0.1$, $1$ (rows).]

Figure 8:

Average population payoff as a function of $\theta_1$ and $\theta_2$, for different values of $\alpha$ and $\beta$. When $\alpha$ is small (panels a and b), the highest average payoff is achieved when $\theta_1$ is sufficiently small and $\theta_2$ is sufficiently large, while for large $\alpha$ (panel c), it is achieved when $\theta_1$ is sufficiently large and $\theta_2$ is sufficiently small. The figure also shows that for a small value of $\beta$, the highest average payoff is achieved when $\alpha$ is very small, compared to the other panels with higher values of $\beta$ (compare panels a, d and g). Parameters: in all panels $c_H = 1$, $c_L = 1$, $b_L = 2$ (i.e. $c = 1$), and $b_H = 6$ (i.e. $b = 5$). Other parameters: $\delta = 4$, $\epsilon = 1$; $\beta = 0.01$, $0.1$ and $1$; $Z = 100$.

[Figure 9 plot not reproduced. Panel title: $b = 5$, $c = 1$, $\mu = 2$, $\alpha = 0.5$.]

Figure 9: Validation for the analytical conditions under which HP is risk-dominant against strategies HC, LC, HN and LN (see main text). In all cases, with a small value of $\epsilon$, the HP strategy dominates the other strategies. The result in this figure is in accordance with the conditions derived above. Parameters: in all panels, $c_H = 1$, $c_L = 1$, $b_H = 6$ (i.e. $b = 5$), $\mu = 2$ and $\alpha = 0.5$.

[Figure 10 plot not reproduced. Legend: HP, LP, HN, LN, HC, LC; panels a to i for $b = 5$, $c = 1$, with $\epsilon = 0.1$, $1$, $2$ and $\beta = 0.01$, $0.1$, $1$.]

Figure 10: Frequency of the six strategies HP, LP, HN, LN, HC and LC, as a function of $\alpha$ and for different values of $\epsilon$ and $\beta$. The commitment proposing strategies HP and LP dominate the population when the values of $\alpha$ and $\epsilon$ are sufficiently small, for all values of $\beta$. Furthermore, as the value of $\epsilon$ increases, the non-proposing strategies dominate the population. Parameters: in all panels $c_H = 1$, $c_L = 1$, $b_L = 2$ (i.e. $c = 1$), $b_H = 6$ (i.e. $b = 5$). Other parameters: $\epsilon = 0.1$, $1$, $2$; $\beta = 0.01$, $0.1$, $1$.

References

Andras, P., Esterle, L., Guckert, M., Han, T. A., Lewis, P. R., Milanovic, K., Payne, T., Perret, C., Pitt, J., Powers, S. T., et al. (2018). Trusting intelligent machines: Deepening trust within socio-technical systems. IEEE Technology and Society Magazine, 37(4):76–83.

Arvanitis, A., Papadatou-Pastou, M., and Hantzi, A. (2019). Agreement in the ultimatum game: An analysis of interpersonal and intergroup context on the basis of the consensualistic approach to negotiation. New Ideas in Psychology, 54:15–26.

Bardhan, I., Sougstad, R., and Sougstad, R. (2004). Prioritizing a portfolio of information technology investment projects. Journal of Management Information Systems, 21(2):33–60.

Barrett, S. (2016). Coordination vs. voluntarism and enforcement in sustaining international environmental cooperation. Proceedings of the National Academy of Sciences, 113(51):14515–14522.

Barrett, S. et al. (2007). Why cooperate?: the incentive to supply global public goods. Oxford University Press on Demand.

Beede, D. N. and Young, K. H. (1998). Patterns of advanced technology adoption and manufacturing performance. Business Economics, pages 43–48.

Bianca, O. N. and Han, T. A. (2019). Emergence of coordination with asymmetric benefits via prior commitment. In Artificial Life Conference Proceedings, pages 163–170. MIT Press.

Castelfranchi, C. and Falcone, R. (2010). Trust Theory: A Socio-Cognitive and Computational Model (Wiley Series in Agent Technology). Wiley.

Chen, X., Szolnoki, A., and Perc, M. (2014). Probabilistic sharing solves the problem of costly punishment. New Journal of Physics, 16(8):083016.

Chen, X.-P. and Komorita, S. S. (1994). The effects of communication and commitment in a public goods social dilemma. Organizational Behavior and Human Decision Processes, 60(3):367–386.

Cherry, T. L. and McEvoy, D. M. (2013). Enforcing compliance with environmental agreements in the absence of strong institutions: An experimental analysis. Environmental and Resource Economics, 54(1):63–77.

Chevalier-Roignant, B., Flath, C. M., Huchzermeier, A., and Trigeorgis, L. (2011). Strategic investment under uncertainty: A synthesis. European Journal of Operational Research, 215(3):639–650.

Chopra, A. K. and Singh, M. P. (2009). Multiagent commitment alignment. In AAMAS'2009, pages 937–944.

Cohen, P. R. and Levesque, H. J. (1990). Intention is Choice with Commitment. Artificial Intelligence, 42(2-3):213–261.

Duong, M. H. and Han, T. A. (2015). On the expected number of equilibria in a multi-player multi-strategy evolutionary game. Dynamic Games and Applications, pages 1–23.

Frank, R. H. (1988). Passions Within Reason: The Strategic Role of the Emotions. Norton and Company.

Gokhale, C. S. and Traulsen, A. (2010). Evolutionary games in the multiverse. Proc. Natl. Acad. Sci. U.S.A., 107(12):5500–5504.

Han, T. A. (2013). Intention Recognition, Commitments and Their Roles in the Evolution of Cooperation: From Artificial Intelligence Techniques to Evolutionary Game Theory Models, volume 9. Springer SAPERE series.

Han, T. A. (2016). Emergence of social punishment and cooperation through prior commitments. In AAAI'2016, pages 2494–2500, Phoenix, Arizona, USA.

Han, T. A. and Lenaerts, T. (2016). A synergy of costly punishment and commitment in cooperation dilemmas. Adaptive Behavior, 24(4):237–248.

Han, T. A., Pereira, L. M., and Lenaerts, T. (2015a). Avoiding or Restricting Defectors in Public Goods Games? J. Royal Soc Interface, 12(103):20141203.

Han, T. A., Pereira, L. M., and Lenaerts, T. (2017). Evolution of commitment and level of participation in public goods games. Autonomous Agents and Multi-Agent Systems, 31(3):561–583.

Han, T. A., Pereira, L. M., and Lenaerts, T. (2019). Modelling and Influencing the AI Bidding War: A Research Agenda. In AAAI/ACM conference AI, Ethics and Society.

Han, T. A., Pereira, L. M., Santos, F. C., and Lenaerts, T. (2013). Good agreements make good friends. Scientific reports, 3(2695).

Han, T. A., Santos, F. C., Lenaerts, T., and Pereira, L. M. (2015b). Synergy between intention recognition and commitments in cooperation dilemmas. Scientific reports, 5(9312).

Han, T. A., Traulsen, A., and Gokhale, C. S. (2012). On equilibrium properties of evolutionary multiplayer games with random payoff matrices. Theoretical Population Biology, 81(4):264–272.

Hardin, G. (1968). The tragedy of the commons. Science, 162:1243–1248.

Hasan, M. R. and Raja, A. (2013). Emergence of cooperation using commitments and complex network dynamics. In IEEE/WIC/ACM Intl Joint Conferences on Web Intelligence and Intelligent Agent Technologies, pages 345–352.

Hofbauer, J. and Sigmund, K. (1998). Evolutionary Games and Population Dynamics. Cambridge University Press.

Imhof, L. A., Fudenberg, D., and Nowak, M. A. (2005). Evolutionary cycles of cooperation and defection. Proc. Natl. Acad. Sci. U.S.A., 102:10797–10800.

Kumar, A., Capraro, V., and Perc, M. (2020). The evolution of trust and trustworthiness. Journal of the Royal Society Interface, 17(169):20200491.

Kurzban, R., McCabe, K., Smith, V. L., and Wilson, B. J. (2001). Incremental commitment and reciprocity in a real-time public goods game. Personality and Social Psychology Bulletin, 27(12):1662–1673.

Martinez-Vaquero, L. A., Han, T. A., Pereira, L. M., and Lenaerts, T. (2015). Apology and forgiveness evolve to resolve failures in cooperative agreements. Scientific reports, 5(10639).

Martinez-Vaquero, L. A., Han, T. A., Pereira, L. M., and Lenaerts, T. (2017). When agreement-accepting free-riders are a necessary evil for the evolution of cooperation. Scientific reports, 7(1):1–9.

Nesse, R. M. (2001). Natural selection and the capacity for subjective commitment. In Nesse, R. M., editor, Evolution and the capacity for commitment, pages 1–44.

Nowak, M. A. (2006). Five rules for the evolution of cooperation. Science, 314(5805):1560.

Nowak, M. A., Sasaki, A., Taylor, C., and Fudenberg, D. (2004). Emergence of cooperation and evolutionary stability in finite populations. Nature, 428:646–650.

Ohtsuki, H. (2018). Evolutionary dynamics of coordinated cooperation. Frontiers in Ecology and Evolution, 6:62.

Okada, I. (2020). A review of theoretical studies on indirect reciprocity. Games, 11(3):27.

Ostrom, E. (1990). Governing the commons: The evolution of institutions for collective action. Cambridge university press.

Pacheco, J. M., Santos, F. C., Souza, M. O., and Skyrms, B. (2009). Evolutionary dynamics of collective action in n-person stag hunt dilemmas. Proc. R. Soc. B, 276:315–321.

Perc, M., Jordan, J. J., Rand, D. G., Wang, Z., Boccaletti, S., and Szolnoki, A. (2017). Statistical physics of human cooperation. Physics Reports, 687:1–51.

Pitt, J., Schaumeier, J., and Artikis, A. (2012). Axiomatization of socio-economic principles for self-organizing institutions: Concepts, experiments and challenges. ACM Transactions on Autonomous and Adaptive Systems (TAAS), 7(4):39.

Powers, S. T., Taylor, D. J., and Bryson, J. J. (2012). Punishment can promote defection in group-structured populations. Journal of theoretical biology, 311:107–116.

Rand, D. G., Tarnita, C. E., Ohtsuki, H., and Nowak, M. A. (2013). Evolution of fairness in the one-shot anonymous ultimatum game. Proc. Natl. Acad. Sci. USA, 110:2581–2586.

Rzadca, K., Datta, A., Kreitz, G., and Buchegger, S. (2015). Game-theoretic mechanisms to increase data availability in decentralized storage systems. ACM Transactions on Autonomous and Adaptive Systems (TAAS), 10(3):14.

Santos, F. C. and Pacheco, J. M. (2011). Risk of collective failure provides an escape from the tragedy of the commons. Proceedings of the National Academy of Sciences of the United States of America, 108(26):10421–10425.

Santos, F. C., Pacheco, J. M., and Lenaerts, T. (2006). Evolutionary dynamics of social dilemmas in structured heterogeneous populations. Proceedings of the National Academy of Sciences of the United States of America, 103:3490–3494.

Santos, F. P., Encarnação, S., Santos, F. C., Portugali, J., and Pacheco, J. M. (2016). An evolutionary game theoretic approach to multi-sector coordination and self-organization. Entropy, 18(4):152.

Sasaki, T., Okada, I., Uchida, S., and Chen, X. (2015). Commitment to cooperation and peer punishment: Its evolution. Games, 6(4):574–587.

Schewe, R. L. and Stuart, D. (2015). Diversity in agricultural technology adoption: How are automatic milking systems used and to what end? Agriculture and human values, 32(2):199–213.

Sigmund, K. (2010). The Calculus of Selfishness. Princeton University Press.

Sigmund, K., Silva, H. D., Traulsen, A., and Hauert, C. (2010). Social learning promotes institutions for governing the commons. Nature, 466:7308.

Skyrms, B. (1996). Evolution of the Social Contract. Cambridge University Press.

Skyrms, B. (2003). The Stag Hunt and the Evolution of Social Structure. Cambridge University Press.

Szolnoki, A. and Perc, M. (2012). Evolutionary advantages of adaptive rewarding. New Journal of Physics, 14(9):093016.

Traulsen, A., Nowak, M. A., and Pacheco, J. M. (2006). Stochastic dynamics of invasion and fixation. Phys. Rev. E, 74:11909.

Wang, S., Chen, X., and Szolnoki, A. (2019). Exploring optimal institutional incentives for public cooperation. Communications in Nonlinear Science and Numerical Simulation, 79:104914.

West, S., Griffin, A., and Gardner, A. (2007). Evolutionary explanations for cooperation. Current Biology, 17:R661–R672.

Zhu, K. and Weyant, J. P. (2003). Strategic decisions of new technology adoption under asymmetric information: a game-theoretic model. Decision sciences, 34(4):643–675.

Zisis, I., Guida, S. D., Han, T. A., Kirchsteiger, G., and Lenaerts, T. (2015). Generosity motivated by acceptance - evolutionary analysis of an anticipation games.