Relatedness and synergies of kind and scale in the evolution of helping

Abstract

Relatedness and synergy affect the selection pressure on cooperation and altruism. Although early work investigated the effect of these factors independently of each other, recent efforts have been aimed at exploring their interplay. Here, we contribute to this ongoing synthesis in two distinct but complementary ways. First, we integrate models of $n$-player matrix games into the direct fitness approach of inclusive fitness theory, hence providing a framework to consider synergistic social interactions between relatives in family and spatially structured populations. Second, we illustrate the usefulness of this framework by delineating three distinct types of helping traits (“whole-group”, “nonexpresser-only” and “expresser-only”), which are characterized by different synergies of kind (arising from differential fitness effects on individuals expressing or not expressing helping) and can be subjected to different synergies of scale (arising from economies or diseconomies of scale). We find that relatedness and synergies of kind and scale can interact to generate nontrivial evolutionary dynamics, such as cases of bistable coexistence featuring both a stable equilibrium with a positive level of helping and an unstable helping threshold. This broadens the qualitative effects of relatedness (or spatial structure) on the evolution of helping.


Jorge Peña*, Georg Nöldeke, Laurent Lehmann

April 4, 2019

Department of Evolutionary Theory, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str. 2, 24306 Plön, Germany, e-mail: [email protected]
Faculty of Business and Economics, University of Basel, Peter Merian-Weg 6, CH-4002 Basel, Switzerland, e-mail: [email protected]
Department of Ecology and Evolution, University of Lausanne, Le Biophore, CH-1015 Lausanne, Switzerland, e-mail: [email protected]

* Corresponding author.

Keywords. evolution of helping, relatedness, synergy, inclusive fitness, evolutionary games

Introduction

Explaining the evolution of helping (cooperation and altruism) has been a main focus of research in evolutionary biology over the last fifty years (e.g., Sachs et al., 2004; West et al., 2007). In this context, Hamilton’s seminal papers established the importance of relatedness (genetic assortment between individuals) by showing that an allele for helping can be favored by natural selection as long as $-c + rb > 0$ holds, where $c$ is the fitness cost to an average carrier from expressing the allele, $b$ is the fitness benefit to such a carrier stemming from a social partner expressing the allele, and $r$ is the relatedness between social partners (Hamilton, 1964a,b, 1970). Additional factors, including different forms of reciprocity (i.e., conditional behaviors and responsiveness under multimove interactions, e.g., Trivers, 1971; Axelrod and Hamilton, 1981) and synergy (i.e., nonadditive effects of social behaviors on material payoffs, either positive or negative, e.g., Queller, 1985; Sumpter, 2010), modify the fitness costs and benefits in Hamilton’s rule (Axelrod and Hamilton, 1981; Day and Taylor, 1997; Lehmann and Keller, 2006a; Gardner et al., 2011; Van Cleve and Akçay, 2014) and hence fundamentally influence the evolutionary dynamics of helping.

Because of their ubiquity, relatedness and synergy occupy a central role among the factors affecting the selection pressure on helping. Both are clearly present in the cooperative enterprises of most organisms. First, real populations are characterized by limited gene flow at least until the stage of offspring dispersal (Clobert et al., 2001), with the consequence that most social interactions necessarily occur between relatives of varying degree.
Second, social exchanges often feature at least one of two different forms of synergy, which we call in this article “synergies of kind” and “synergies of scale”.

Synergies of kind (implicit in what Queller, 2011 calls “kind selection”) arise when the expression of a social trait benefits recipients in different ways, depending on whether or not (or more generally, to which extent) recipients express the social trait themselves. A classical example of a positive synergy of kind is collective hunting (Packer and Ruttan, 1988), where the benefits of a successful hunt go to cooperators (hunters) but not to defectors (solitary individuals). Examples of negative synergies of kind are eusociality in Hymenoptera, by which sterile workers help queens to reproduce (Bourke and Franks, 1995), and self-destructive cooperation in bacteria, where expressers lyse while releasing virulence factors that benefit nonexpressers (Fröhlich and Madeo, 2000; Ackermann et al., 2008).

Synergies of scale (Corning, 2002) result from economies or diseconomies of scale in the production of a social good, so that the net effect of several individuals behaving socially can be more or less than the sum of individual effects. For instance, enzyme production in microbial cooperation is likely to be nonlinear, as in the cases of invertase hydrolyzing disaccharides into glucose in the budding yeast Saccharomyces cerevisiae (Gore et al., 2009) or virulence factors triggering gut inflammation (and hence removal of competitors) in the pathogen Salmonella typhimurium (Ackermann et al., 2008). In the former case, the relationship between growth rate and glucose concentration in yeast has been reported to be sublinear, i.e., invertase production has diminishing returns or negative synergies of scale (Gore et al., 2009, fig. 3.c); in the latter case, the relationship between the level of expression of virulence factors and inflammation intensity appears to be superlinear, i.e., it exhibits increasing returns or positive synergies of scale (Ackermann et al., 2008, fig. 2.d).

Previous theoretical work has investigated the effects of relatedness and synergy on the evolution of helping either independently of each other or by means of simplified models that neglect crucial interactions between the two factors. For instance, the effects of demography on relatedness and the scale of competition in family and spatially structured populations have often been explored under the assumption of additive payoff effects (e.g., Taylor, 1992; Taylor and Irwin, 2000; Lehmann et al., 2006; Gardner and West, 2006), while synergistic interactions have usually been investigated under the assumption that individuals are unrelated (e.g., Motro, 1991; Leimar and Tuomi, 1998; Hauert et al., 2006). In the cases where relatedness and synergy have been considered to operate in conjunction, it has been customary to model social interactions by means of a two-player Prisoner’s Dilemma, modified by adding a synergy parameter $D$ to the payoff of mutual cooperation (Grafen, 1979; Queller, 1984, 1985, 1992; Fletcher and Zwick, 2006; Lehmann and Keller, 2006a,b; Ohtsuki, 2010; Gardner et al., 2011; Ohtsuki, 2012; Taylor and Maciejewski, 2012; Van Cleve and Akçay, 2014). In this framework, $D > 0$ […] $D < 0$ […] S. typhimurium features both negative synergies of kind and positive synergies of scale (Ackermann et al., 2008). Models of two-player matrix games between relatives miss these patterns of synergy (and possible interactions between relatedness and synergy) because such games are linear, and only nonlinear games (which necessarily involve at least three-party interactions) can accommodate both negative and positive synergies without conflating them into a single parameter. Although previous work has explored instances of $n$-player games between relatives (e.g., Boyd and Richerson, 1988; Eshel and Motro, 1988; Archetti, 2009; Van Cleve and Lehmann, 2013; Marshall, 2014), this has been done only for specific population or payoff structures, and hence not in a comprehensive manner.

In this article, we study the interplay between relatedness and synergies of kind and scale in models of $n$-player social interactions between relatives. In order to do so, we first present a general framework that integrates $n$-player matrix games (e.g., Kurokawa and Ihara, 2009; Gokhale and Traulsen, 2010) into the “direct fitness” approach (Taylor and Frank, 1996; Rousset, 2004) of social evolution theory. This framework allows us to deliver a tractable expression for the selection gradient (or gain function) determining the evolutionary dynamics, which differs from the corresponding expression for $n$-player games between unrelated individuals only in that “inclusive gains from switching” rather than solely “direct gains from switching” must be taken into account.

We then use the theoretical framework to investigate the interaction between relatedness, synergies of kind, and synergies of scale in the evolution of helping.
We show the importance of distinguishing between three different kinds of helping traits (which we call “whole-group”, “nonexpresser-only” and “expresser-only”), which are characterized by different types of synergies of kind (none for “whole-group”, negative for “nonexpresser-only”, positive for “expresser-only”), and can be subjected to different synergies of scale. Our analysis demonstrates that the interplay between relatedness and synergy can lead to patterns of frequency dependence, evolutionary dynamics, and bifurcations that cannot arise when considering synergistic interactions between unrelated individuals. Thereby, our approach illustrates how relatedness and synergy combine nontrivially to affect the evolution of social behaviors.

We consider a homogeneous haploid population subdivided into a finite and constant number of groups, each with a constant number $N \geq$ […] $N$ individuals reach adulthood in each group. Dispersal between groups may follow a variety of schemes, including the island model of dispersal (Wright, 1931; Taylor, 1992), isolation by distance (Malécot, 1975; Rousset, 2004), hierarchical migration (Sawyer and Felsenstein, 1983; Lehmann and Rousset, 2012), a model where groups split into daughter groups and compete against each other (Gardner and West, 2006; Lehmann et al., 2006; Traulsen and Nowak, 2006), and several variants of the haystack model (e.g., Matessi and Jayakar, 1976; Godfrey-Smith and Kerr, 2009). We leave the exact details of the life history unspecified, but assume that they fall within the scope of models of spatially homogeneous populations with constant population size (see Rousset, 2004, ch. 6).

Each demographic time period, individuals interact socially by participating in a game between $n$ players. Interactions can occur among all adults in a group ($n = N$), among a subset of such individuals ($n < N$) or among offspring before dispersal ($n > N$).
Individuals may either express a social behavior (e.g., cooperate in a Prisoner’s Dilemma) or not (e.g., defect in a Prisoner’s Dilemma). We denote these two possible actions by A (“cooperation”) and B (“defection”) and also refer to A-players as “expressers” and to B-players as “nonexpressers”. The game is symmetric so that, from the point of view of a focal individual, any two co-players playing the same action are exchangeable. We denote by $a_k$ the material payoff to an A-player when $k = 0, 1, \ldots, n-1$ co-players choose A (and hence $n-1-k$ co-players choose B). Likewise, we denote by $b_k$ the material payoff to a B-player when $k$ co-players choose A.

We assume that individuals implement mixed strategies, i.e., they play A with probability $z$ (and hence play B with probability $1 - z$). The set of available strategies is then the interval $z \in [0, 1]$. […] residents who play A with probability $z$ and mutants who play A with probability $z + \delta$. Let us denote by $z_\bullet$ the strategy (either $z$ or $z + \delta$) of a focal individual, and by $z_{\ell(\bullet)}$ the strategy of the $\ell$-th co-player of such focal. The expected payoff $\pi$ to the focal is then

$$\pi\left(z_\bullet, z_{1(\bullet)}, z_{2(\bullet)}, \ldots, z_{n-1(\bullet)}\right) = \sum_{k=0}^{n-1} \phi_k\left(z_{1(\bullet)}, z_{2(\bullet)}, \ldots, z_{n-1(\bullet)}\right) \left[z_\bullet a_k + (1 - z_\bullet) b_k\right], \tag{1}$$

where $\phi_k$ is the probability that exactly $k$ co-players play action A. A first-order Taylor-series expansion about the average strategy $z_\circ = \sum_{\ell=1}^{n-1} z_{\ell(\bullet)} / (n-1)$ of co-players shows that, to first order in $\delta$, the probability $\phi_k$ is given by a binomial distribution with parameters $n-1$ and $z_\circ$, i.e.,

$$\phi_k\left(z_{1(\bullet)}, z_{2(\bullet)}, \ldots, z_{n-1(\bullet)}\right) = \binom{n-1}{k} z_\circ^k (1 - z_\circ)^{n-1-k} + O(\delta^2). \tag{2}$$

Substituting (2) into (1) and discarding second and higher order terms, we obtain

$$\pi(z_\bullet, z_\circ) = \sum_{k=0}^{n-1} \binom{n-1}{k} z_\circ^k (1 - z_\circ)^{n-1-k} \left[z_\bullet a_k + (1 - z_\bullet) b_k\right] \tag{3}$$

for the payoff of a focal individual as a function of the focal’s strategy $z_\bullet$ and the average strategy $z_\circ$ of the focal’s co-players (see also Rousset, 2004, p. 95 and Van Cleve and Lehmann, 2013, p. 85).

Gain function and convergence stability

Consider a population of residents playing $z$ in which a single mutant $z + \delta$ appears due to mutation, and denote by $\rho$ the fixation probability of the mutant. We take the phenotypic selection gradient $S = (\mathrm{d}\rho/\mathrm{d}\delta)_{\delta=0}$ as a measure of evolutionary success of the mutant (Rousset and Billiard, 2000, p. 819; Van Cleve, 2014, p. 17); indeed, $S > 0$ […] $|\delta| \ll 1$ […] $S$ is proportional to the “gain function” given by

$$G(z) = \underbrace{\left.\frac{\partial \pi(z_\bullet, z_\circ)}{\partial z_\bullet}\right|_{z_\bullet = z_\circ = z}}_{\text{“direct” effect, } -\mathcal{C}(z)} + \kappa \underbrace{\left.\frac{\partial \pi(z_\bullet, z_\circ)}{\partial z_\circ}\right|_{z_\bullet = z_\circ = z}}_{\text{“indirect” effect, } \mathcal{B}(z)} = -\mathcal{C}(z) + \kappa\,\mathcal{B}(z) \tag{4}$$

(see, e.g., Van Cleve and Lehmann, 2013, eq. 7).

Equation (4) shows that the gain function $G(z)$ is determined by three components. First, the “direct” effect $-\mathcal{C}(z)$, which describes the change in average payoff of the focal resulting from the focal infinitesimally changing its own strategy. Second, the “indirect” effect $\mathcal{B}(z)$, which describes the change in average payoff of the focal resulting from the focal’s co-players changing their strategy infinitesimally.
Third, the indirect effect is weighted by the “scaled relatedness coefficient” $\kappa$, which is a measure of relatedness between the focal individual and its neighbors, demographically scaled so as to capture the effects of local competition on selection (Queller, 1994; Lehmann and Rousset, 2010; Akçay and Van Cleve, 2012). We discuss these three components of the gain function in more detail in the following section.

Knowledge of equation (4) is sufficient to characterize convergent stable strategies (Eshel and Motro, 1981; Eshel, 1983; Taylor, 1989; Christiansen, 1991; Geritz et al., 1998; Rousset, 2004). In our context, candidate convergent stable strategies are either “singular points” (i.e., values $z^* \in (0, 1)$ such that $G(z^*) = 0$), or the two pure strategies $z = 1$ (always play A) and $z = 0$ (always play B). In particular, a singular point $z^*$ is convergent stable if $\mathrm{d}G(z)/\mathrm{d}z\,|_{z = z^*} < 0$. Regarding the endpoints, $z = 1$ (resp. $z = 0$) is convergent stable if $G(1) > 0$ (resp. $G(0) < 0$).

From equation (4), the condition for a mutant to be favored by selection can be written as $-\mathcal{C}(z) + \kappa\,\mathcal{B}(z) > 0$. This can be understood as a scaled form of the marginal version of Hamilton’s rule (Lehmann and Rousset, 2010) with $\mathcal{C}(z)$ corresponding to the marginal direct costs and $\mathcal{B}(z)$ to the marginal indirect benefits of expressing an increased probability of playing action A. These marginal costs and benefits are not measured in terms of actual fitness (number of adult offspring, which are the units of measurement of $b$ and $c$ in Hamilton’s rule as given in the introduction, see e.g., Rousset, 2004, p. 113), but in terms of fecundity via payoffs in a game. The scaled relatedness coefficient $\kappa$ is also not equal to the regression definition of relatedness present in the standard Hamilton’s rule, except for special cases where competition is completely global (Queller, 1994).

The coefficient $\kappa$ is a function of demographic parameters such as migration rate, group size, and vital rates of individuals or groups, but is independent of the evolving trait $z$ (Van Cleve and Lehmann, 2013). For instance, in the island model with overlapping generations, $\kappa = 2s(1-m) / (N[2 - m(1-s)] + 2(1-m)s)$, where $m$ is the migration rate and $s$ is the probability of surviving to the next generation (Taylor and Irwin, 2000, eq. A10; Akçay and Van Cleve, 2012, app. A2). In broad terms, we have (i) $\kappa > 0$ […] (ii) $\kappa = 0$ for infinitely large panmictic populations or for viscous populations with local competition exactly compensating for increased assortment of strategies (Taylor, 1992), and (iii) $\kappa < 0$ […] $\kappa$ under different variants of the haystack model).

In contrast to $\kappa$, which depends only on population structure, the other two components of the gain function are solely determined by the payoff structure of the social interaction. In the following, we show how $-\mathcal{C}(z)$ and $\mathcal{B}(z)$ can be expressed in terms of the payoffs $a_k$ and $b_k$ of the game. Doing so delivers an expression for $G(z)$ that can be analyzed with the same techniques applicable for games between unrelated individuals.
This expression provides the foundation for our subsequent analysis.

Imagine a focal individual playing B in a group where $k$ of its co-players play A. Suppose that this focal individual unilaterally switches its action to A while its co-players hold fixed their actions, thus changing its payoff from $b_k$ to $a_k$. As a consequence, the focal experiences a “direct gain from switching” given by

$$d_k = a_k - b_k, \quad k = 0, 1, \ldots, n-1. \tag{5}$$

At the same time, each of the focal’s co-players playing A experiences a change in payoff given by $\Delta a_{k-1} = a_k - a_{k-1}$ and each of the focal’s co-players playing B experiences a change in payoff given by $\Delta b_k = b_{k+1} - b_k$. Hence, taken as a block, the co-players of the focal experience a change in payoff given by

$$e_k = k\,\Delta a_{k-1} + (n-1-k)\,\Delta b_k, \quad k = 0, 1, \ldots, n-1, \tag{6}$$

where we let $a_{-1} = b_{n+1} = 0$ for mathematical convenience. From the perspective of the focal, this change in payoffs represents an “indirect gain from switching” the focal obtains if co-players are related.

In appendix B, we show that the partial derivatives appearing in (4) can be expressed as expected values of the direct and indirect gains from switching, so that the direct and indirect effects are respectively given by

$$-\mathcal{C}(z) = \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} d_k, \tag{7}$$

and

$$\mathcal{B}(z) = \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} e_k. \tag{8}$$

Hence, defining the “inclusive gains from switching” as

$$f_k = d_k + \kappa e_k, \quad k = 0, 1, \ldots, n-1, \tag{9}$$

the gain function can be written as the expected value of the inclusive gains from switching:

$$G(z) = \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} f_k. \tag{10}$$

An immediate consequence of equation (10) is that matrix games between relatives are mathematically equivalent to “transformed” games between unrelated individuals, where “inclusive payoffs” take the place of standard, or personal, payoffs. Indeed, consider a game in which a focal playing A (resp. B) obtains payoffs

$$a'_k = a_k + \kappa\left[k a_k + (n-1-k) b_{k+1}\right], \quad k = 0, 1, \ldots, n-1, \tag{11}$$

$$b'_k = b_k + \kappa\left[k a_{k-1} + (n-1-k) b_k\right], \quad k = 0, 1, \ldots, n-1, \tag{12}$$

when $k$ of its co-players play A. Using equations (5)–(6) we can rewrite equation (9) as $f_k = a'_k - b'_k$, so that the inclusive gains from switching are identical to the direct gains from switching in a game with payoff structure given by equations (11)–(12). The payoffs $a'_k$ (resp. $b'_k$) can be understood as inclusive payoffs consisting of the payoff obtained by a focal playing A (resp. B) plus $\kappa$ times the sum of the payoffs obtained by its co-players.

This observation has two relevant consequences. First, the results developed in Peña et al. (2014) for nonlinear $n$-player matrix games between unrelated individuals, which are based on the observation that the right side of (10) is a polynomial in Bernstein form (Farouki, 2012), also apply here, provided that (i) the inclusive gains from switching $f_k$ are used instead of the standard (direct) gains from switching $d_k$ in the formula for the gain function, and (ii) the concept of evolutionary stability is read as meaning convergence stability. For a large class of games, these results allow one to identify convergence stable points from a direct inspection of the sign pattern of the inclusive gains from switching $f_k$. Second, we may interpret the effect of relatedness on selection as inducing the payoff transformation $a_k \to a'_k$, $b_k \to b'_k$. For $n = 2$, this payoff transformation is the one hinted at by Hamilton (1971) and later often discussed in the theoretical literature (Grafen, 1979; Hines and Maynard Smith, 1979; Day and Taylor, 1998), namely

$$\begin{pmatrix} a'_0 & a'_1 \\ b'_0 & b'_1 \end{pmatrix} = \begin{pmatrix} a_0 + \kappa b_1 & (1+\kappa) a_1 \\ (1+\kappa) b_0 & b_1 + \kappa a_0 \end{pmatrix},$$

where the payoff of the focal is augmented by adding $\kappa$ times the payoff of the co-player.
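The bookkeeping in equations (5)–(12) can be checked numerically. The following is a minimal sketch, not the authors' code: the function names, the 3-player payoff vectors and the value of $\kappa$ are hypothetical, chosen only for illustration. Payoffs with an index outside $0, \ldots, n-1$ are set to zero, which is harmless because they are always multiplied by a zero coefficient.

```python
from math import comb


def gains_from_switching(a, b, kappa):
    """Direct (5), indirect (6) and inclusive (9) gains from switching.

    a[k], b[k]: payoffs to an A-player / B-player when k of the n - 1
    co-players play A, for k = 0, ..., n - 1.
    """
    n = len(a)

    def at(payoffs, k):
        # Out-of-range payoffs are set to 0; they only ever appear
        # multiplied by a zero coefficient.
        return payoffs[k] if 0 <= k < n else 0.0

    d = [a[k] - b[k] for k in range(n)]
    e = [k * (at(a, k) - at(a, k - 1))
         + (n - 1 - k) * (at(b, k + 1) - at(b, k)) for k in range(n)]
    f = [d[k] + kappa * e[k] for k in range(n)]
    return d, e, f


def inclusive_payoffs(a, b, kappa):
    """Transformed payoffs a'_k and b'_k of equations (11)-(12)."""
    n = len(a)

    def at(payoffs, k):
        return payoffs[k] if 0 <= k < n else 0.0

    a_inc = [a[k] + kappa * (k * a[k] + (n - 1 - k) * at(b, k + 1))
             for k in range(n)]
    b_inc = [b[k] + kappa * (k * at(a, k - 1) + (n - 1 - k) * b[k])
             for k in range(n)]
    return a_inc, b_inc


def gain_function(f, z):
    """Equation (10): expected inclusive gain from switching at strategy z."""
    n = len(f)
    return sum(comb(n - 1, k) * z**k * (1 - z)**(n - 1 - k) * f[k]
               for k in range(n))


# Hypothetical 3-player payoffs and scaled relatedness (illustration only).
a, b, kappa = [1.0, 2.0, 4.0], [0.0, 3.0, 5.0], 0.25
d, e, f = gains_from_switching(a, b, kappa)
a_inc, b_inc = inclusive_payoffs(a, b, kappa)
```

In this example the inclusive gains `f` coincide with the differences `a_inc[k] - b_inc[k]`, and the gain function evaluated at the endpoints returns `f[0]` and `f[-1]`, as expected for a polynomial in Bernstein form.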
Throughout the following we assume that each A-player incurs a payoff cost $\gamma > 0$ […] $\beta_j > 0$ […] $j > 0$ […] $\beta_0 = 0$). The benefit is increasing in the number of expressers, that is, the “incremental benefit” $\Delta\beta_j = \beta_{j+1} - \beta_j$ is positive ($\Delta\beta_j > 0$). In the absence of synergies of scale, $\Delta\beta_j$ is constant, implying that $\beta_j$ is linear in $j$. With negative synergies of scale, $\Delta\beta_j$ is decreasing in $j$, whereas positive synergies of scale arise when $\Delta\beta_j$ is increasing in $j$. To illustrate the effects of synergies of scale on the evolutionary dynamics of the social trait, we will consider the special case in which incremental benefits are given by the geometric sequence $\Delta\beta_j = \beta\lambda^j$ for some $\beta > 0$ and $\lambda > 0$, so that benefits are given by

$$\beta_j = \beta \sum_{\ell=0}^{j-1} \lambda^\ell. \tag{13}$$

With geometric benefits, synergies of scale are absent when $\lambda = 1$, negative when $\lambda < 1$, and positive when $\lambda > 1$.

[…] (i) “whole-group” (benefits accrue to all individuals, fig. 1.a), (ii) “nonexpresser-only” (benefits accrue only to nonexpressers, fig. 1.b), and (iii) “expresser-only” (benefits accrue only to expressers, fig. 1.c). For whole-group traits there are no synergies of kind: benefits accrue to all individuals irrespective of their kind, i.e., whether they are expressers or nonexpressers. In contrast, nonexpresser-only traits feature negative synergies of kind, whereas expresser-only traits feature positive synergies of kind. These differences are reflected in different payoff structures for the corresponding $n$-player games, resulting in different direct, indirect, and inclusive gains from switching (see table 2).

A classical example of a whole-group trait is the voluntary provision of public goods (Samuelson, 1954). In this case, the expressed social behavior consists in the production of a good available to others and hence exploitable by nonproducing cheats (nonexpressers). Well-known instances of public-goods cooperation are sentinel behavior in animals (Maynard Smith, 1965; Clutton-Brock et al., 1999), and the secretion of extracellular products (Velicer, 2003; West et al., 2007), such as sucrose-digestive enzymes (Greig and Travisano, 2004; Gore et al., 2009), in social bacteria.

The most prominent social behavior matching our definition of a nonexpresser-only trait is altruistic self-sacrifice, which happens when individuals expressing the social behavior sacrifice themselves (or their reproduction) to benefit nonexpressers (Frank, 2006; West et al., 2006).
Sterile castes in eusocial insects (Bourke and Franks, 1995), and bacteria lysing while releasing toxins (Fröhlich and Madeo, 2000) or virulence factors (Ackermann et al., 2008) that benefit other bacteria provide some examples of altruistic self-sacrifice in nature.

Expresser-only traits have been discussed under the rubrics of “synergistic” (Queller, 1984, 1985; Leimar and Tuomi, 1998) and “greenbeard” (Guilford, 1985; Gardner and West, 2010; Queller, 2011) effects, and conceptualized as involving “rowing” (Maynard Smith and Szathmáry, 1995, p. 261-262) or “stag hunt” (Skyrms, 2004) games. Often cited examples include collective hunting (Packer and Ruttan, 1988), foundresses cooperating in colony establishment (Bernasconi and Strassmann, 1999), aposematic (warning) coloration (Queller, 1984, 1985; Guilford, 1988), and the Ti plasmid in the bacterial pathogen Agrobacterium tumefaciens, which induces its plant host to produce opines, a food source that can be exploited only by bacteria bearing the plasmid (Dawkins, 1999, p. 218; White and Winans, 2007). In each of these examples, the social good accrues only to partners expressing the trait, either because of a greater tendency to group and interact or because of the action of an emergent recognition system discriminating expressers from nonexpressers.

For all three kinds of social traits, the indirect gains from switching are always nonnegative ($e_k \geq 0$ for all $k$) and hence the indirect effect $\mathcal{B}(z)$ is nonnegative for all $z$. This implies that we deal with helping traits at the level of payoffs and that increasing $\kappa$ never leads to less selection for expressing the social behavior. Due to their different ways of defining recipients, however, each social trait is characterized by a social dilemma with structurally different payoffs, direct gains, and indirect gains from switching.
For nonexpresser-only traits, the direct gains from switching are always negative ($d_k < 0$ for all $k$) and thus expressing the social behavior is also payoff altruistic ($-\mathcal{C}(z) < 0$ and $\mathcal{B}(z) \geq 0$ for all $z$). For whole-group and expresser-only traits, expressing the social behavior is not necessarily altruistic, depending on how the cost $\gamma$ compares to benefits ($\beta_{k+1}$, expresser-only traits) or incremental benefits ($\Delta\beta_k$, whole-group traits).

Before turning to the analysis, we note that a fourth class of social traits is sometimes also distinguished in the literature, namely “other-only” traits where the benefits accrue to all other individuals in the group, but not to the focal expresser itself (Pepper, 2000). Other-only traits, like whole-group traits, lack synergies of kind, and hence the effects of relatedness on the evolutionary dynamics are qualitatively similar for whole-group and other-only traits. We discuss this latter case in more detail in section 3.5 and relegate the formal analysis, which is similar to the one for whole-group traits, to appendix D.

To isolate the effects of synergies of kind, we begin our analysis with the case in which synergies of scale are absent, that is, benefits take the linear form $\beta_j = \beta j$ ($\lambda = 1$ in eq. (13)). The resulting expressions for the inclusive gains from switching and the gain functions for the three different social traits are shown in table 3. In each case, the gain function can be written as

$$G(z) = (n-1)\left[-C + \kappa B + (1+\kappa) D z\right],$$

where the parameter $C > 0$ […] $C = \gamma/(n-1)$ when a focal expresser is not among the recipients (nonexpresser-only traits) and $C = (\gamma - \beta)/(n-1)$ otherwise (whole-group and expresser-only traits). The parameter $B \geq 0$ […] $B = 0$ for expresser-only traits and $B = \beta$ otherwise. Finally, $D$ measures synergies of kind and is thus null for whole-group traits ($D = 0$), negative for nonexpresser-only traits ($D = -\beta$) and positive for expresser-only traits ($D = \beta$).

In the absence of synergies of kind ($D = 0$, whole-group traits) selection is frequency independent and defection dominates cooperation ($z = 0$ is the only convergence stable strategy) if $-C + \kappa B < 0$, whereas cooperation dominates defection ($z = 1$ is the only convergence stable strategy) if $-C + \kappa B > 0$. With negative synergies of kind ($D < 0$), defection dominates cooperation if $-C + \kappa B \leq 0$ and cooperation dominates defection if $-C + \kappa B + (1+\kappa) D \geq 0$. If $-C + \kappa B + (1+\kappa) D < 0 < -C + \kappa B$ holds, both $z = 0$ and $z = 1$ are unstable and the singular point

$$z^* = \frac{C - \kappa B}{(1+\kappa) D} \tag{14}$$

is stable.

With positive synergies of kind ($D > 0$), defection dominates cooperation if $-C + \kappa B + (1+\kappa) D \leq 0$ and cooperation dominates defection if $-C + \kappa B \geq 0$. If $-C + \kappa B < 0 < -C + \kappa B + (1+\kappa) D$, there is bistability: both $z = 0$ and $z = 1$ are stable and $z^*$ is unstable.

This analysis reveals three important points. First, in the absence of synergies of scale the gain function is linear in $z$, which allows for a straightforward analysis of the evolutionary dynamics for all three kinds of social traits. Second, because of the linearity of the gain function, the evolutionary dynamics of such games fall into one of the four classical dynamical regimes arising from $2 \times 2$ games […]

For whole-group traits there are no synergies of kind, but either positive or negative synergies of scale may arise. How do such synergies of scale change the evolutionary dynamics of whole-group helping? Substituting the inclusive gains from switching given in table 2 into equation (10) shows that the gain function for whole-group traits is given by

$$G(z) = \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} \left\{-\gamma + [1 + \kappa(n-1)]\,\Delta\beta_k\right\}. \tag{15}$$

Since the incremental benefit satisfies $\Delta\beta_k > 0$ for all $k$, the gain function (15) is negative for $\kappa \leq -1/(n-1)$, so that $z = 0$ is the only stable point. Hence, we consider the case $\kappa > -1/(n-1)$ throughout the following.

If synergies of scale are negative ($\Delta\beta_k$ decreasing in $k$), the direct gains ($d_k$), indirect gains ($e_k$) and inclusive gains ($f_k$) from switching are all decreasing in $k$. This implies that $-\mathcal{C}(z)$, $\mathcal{B}(z)$ and $G(z)$ are all decreasing in $z$ (cf. Peña et al., 2014, remark 3). Similarly, if synergies of scale are positive ($\Delta\beta_k$ increasing in $k$), $d_k$, $e_k$ and $f_k$ are all increasing in $k$ and hence $-\mathcal{C}(z)$, $\mathcal{B}(z)$ and $G(z)$ are all increasing in $z$. In both cases the evolutionary dynamics are easily characterized by applying the results for public goods games with constant costs from Peña et al. (2014, section 4.3): with negative synergies of scale, defection dominates cooperation (so that $z = 0$ is the only convergent stable strategy) if $\gamma \geq [1 + \kappa(n-1)]\,\Delta\beta_0$, whereas cooperation dominates defection if $\gamma \leq [1 + \kappa(n-1)]\,\Delta\beta_{n-1}$ holds. If $[1 + \kappa(n-1)]\,\Delta\beta_{n-1} < \gamma < [1 + \kappa(n-1)]\,\Delta\beta_0$ holds, there is coexistence: both $z = 0$ and $z = 1$ are unstable and there is a unique stable interior point $z^*$. With positive synergies of scale, defection dominates cooperation if $\gamma \geq [1 + \kappa(n-1)]\,\Delta\beta_{n-1}$, whereas cooperation dominates defection if $\gamma \leq [1 + \kappa(n-1)]\,\Delta\beta_0$. If $[1 + \kappa(n-1)]\,\Delta\beta_0 < \gamma < [1 + \kappa(n-1)]\,\Delta\beta_{n-1}$ holds, there is bistability: both $z = 0$ and $z = 1$ are stable and there is a unique, unstable interior point $z^*$ separating the basins of attraction of these two stable strategies. These results resemble those for the cases in which there are no synergies of scale (section 3.1), but negative, resp. positive synergies of kind are present.
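The threshold conditions for whole-group traits can be turned into a small classifier. The sketch below is illustrative only: the function names and parameter values are hypothetical, the incremental benefits are assumed to be monotone in $k$ (or constant), and $\kappa > -1/(n-1)$ is assumed throughout. The interior singular point is located by bisection, which is adequate here because the gain function (15) is monotone in $z$ under these assumptions.

```python
from math import comb


def whole_group_gain(z, gamma, kappa, dbeta):
    """Gain function (15); dbeta[k] is the incremental benefit, k = 0..n-1."""
    n = len(dbeta)
    scale = 1 + kappa * (n - 1)
    return sum(comb(n - 1, k) * z**k * (1 - z)**(n - 1 - k)
               * (-gamma + scale * dbeta[k]) for k in range(n))


def classify_whole_group(gamma, kappa, dbeta):
    """Dynamical regime, assuming monotone (or constant) incremental benefits."""
    n = len(dbeta)
    scale = 1 + kappa * (n - 1)  # assumed positive, i.e. kappa > -1/(n-1)
    low = scale * min(dbeta[0], dbeta[-1])
    high = scale * max(dbeta[0], dbeta[-1])
    if gamma >= high:
        return "defection dominates"
    if gamma <= low:
        return "cooperation dominates"
    # Intermediate costs: decreasing increments give coexistence,
    # increasing increments give bistability.
    return "coexistence" if dbeta[0] > dbeta[-1] else "bistability"


def interior_singular_point(gamma, kappa, dbeta, tol=1e-12):
    """Bisection for the interior zero of the (monotone) gain function."""
    lo, hi = 0.0, 1.0
    sign_lo = whole_group_gain(lo, gamma, kappa, dbeta) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (whole_group_gain(mid, gamma, kappa, dbeta) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical parameters: n = 5, geometric increments beta * lam**k, beta = 1.
n, kappa = 5, 0.25
negative_scale = [0.5**k for k in range(n)]  # lam = 0.5 < 1
positive_scale = [2.0**k for k in range(n)]  # lam = 2 > 1
z_star = interior_singular_point(1.0, kappa, negative_scale)
```

With these values, a cost of 1 falls between the two thresholds for `negative_scale` (coexistence), while a cost of 10 falls between them for `positive_scale` (bistability). Calling `interior_singular_point` only makes sense for intermediate costs, where the gain function changes sign on $[0, 1]$.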
In particular, it is again the case that the evolutionary dynamics fall into one of the four classical dynamical regimes arising from $2 \times 2$ games. […] $\kappa(n-1)$ […]

$$G(z) = [1 + \kappa(n-1)] \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} \left(-\tilde{\gamma} + \Delta\beta_k\right), \tag{16}$$

where $\tilde{\gamma} = \gamma / [1 + \kappa(n-1)]$ […] $\tilde{\gamma}$ for producing the public good, which has been analyzed under different assumptions on the shape of the benefit sequence (Motro, 1991; Bach et al., 2006; Hauert et al., 2006; Peña et al., 2014). Hence, relatedness can be conceptualized as affecting only the cost of cooperation, while leaving synergies of scale and patterns of frequency dependence unchanged.

As a concrete example, consider the case of geometric benefits (13) with $\lambda \neq 1$ (see table 4 for a summary of the results and app. C for a derivation). We find that there are two critical cost-to-benefit ratios

$$\varepsilon = \min\left(1 + \kappa(n-1),\; \lambda^{n-1}[1 + \kappa(n-1)]\right) \quad \text{and} \quad \vartheta = \max\left(1 + \kappa(n-1),\; \lambda^{n-1}[1 + \kappa(n-1)]\right), \tag{17}$$

such that for small costs ($\gamma/\beta \leq \varepsilon$) cooperation dominates defection ($z = 1$ is the only stable point) and for large costs ($\gamma/\beta \geq \vartheta$) defection dominates cooperation ($z = 0$ is the only stable point). For intermediate costs ($\varepsilon < \gamma/\beta < \vartheta$), there is a singular point given by

$$z^* = \frac{1}{1 - \lambda}\left[1 - \left(\frac{\gamma}{\beta[1 + \kappa(n-1)]}\right)^{\frac{1}{n-1}}\right], \tag{18}$$

such that the evolutionary dynamics are characterized by coexistence if synergies of scale are negative ($\lambda < 1$) and by bistability if synergies of scale are positive ($\lambda > 1$). […] $\gamma/\beta$, increasing relatedness enlarges (resp. shrinks) the region in the parameter space where cooperation (resp. defection) dominates. Moreover, and from equation (18), $z^*$ is an increasing (resp. decreasing) function of $\kappa$ when $\lambda < 1$ (resp. $\lambda > 1$). […] $\kappa$ (see fig. 2.a and 2.d).

For nonexpresser-only traits, synergies of kind are negative. In the absence of synergies of scale, and as discussed in section 3.1, this implies negative frequency dependence. To investigate how positive or negative synergies of scale change this baseline scenario, we focus on the case in which relatedness is nonnegative ($\kappa \geq 0$). […] $d_k$ and $e_k$ given in table 2, it is clear that, independently of any synergies of scale, the direct gains from switching $d_k$ are decreasing in $k$. Hence, the direct effect $-\mathcal{C}(z)$ is negative frequency-dependent. When synergies of scale are negative, the indirect gains from switching $e_k$ are also decreasing in $k$, implying that the indirect effect $\mathcal{B}(z)$ is also negative frequency-dependent and that the same is true for the gain function $G(z) = -\mathcal{C}(z) + \kappa\,\mathcal{B}(z)$. Hence, negative synergies of scale lead to evolutionary dynamics that are qualitatively identical to those arising when synergies of scale are absent: for low relatedness, defection dominates cooperation, and for sufficiently high relatedness, a unique interior stable equilibrium appears (see app. E.1 and fig. 2.b).

When synergies of scale are positive, the indirect gains from switching $e_k$ may still be decreasing in $k$ because the incremental gain $\Delta\beta_k$ accrues to a smaller number of recipients ($n-1-k$) as $k$ increases. In such a scenario, always applicable when $n = 2$, the evolutionary dynamics are again qualitatively identical to those arising when synergies of scale are absent.
A different picture can emerge if $n > 2$. In this case, the indirect gains from switching $e_k$ can be unimodal in $k$, implying (Peña et al., 2014) that the indirect benefit $\mathcal{B}(z)$ is similarly unimodal, featuring positive frequency dependence for small $z$ and negative frequency dependence for large $z$. Depending on the value of relatedness, which modulates how the frequency dependence of $\mathcal{B}(z)$ interacts with that of $\mathcal{C}(z)$, this can give rise to evolutionary dynamics different from those possible without synergies of scale, discussed in section 3.1.

For a concrete example of such evolutionary dynamics, consider the case of geometric benefits (13) with $\lambda > 1$ (see fig. 2.e for an illustration). In this case, the evolutionary dynamics for $\kappa \geq 0$ and $n > 2$ depend on the critical value
\[
\varrho = \frac{1 + \kappa(n-1)}{\kappa(n-2)}, \tag{19}
\]
and on the two critical cost-to-benefit ratios
\[
\zeta = \kappa(n-1) \quad \text{and} \quad \eta = \frac{1}{\lambda - 1} \left[ 1 + \lambda\kappa \left( \frac{(n-2)\lambda\kappa}{1 + \kappa(n-1)} \right)^{n-2} \right], \tag{20}
\]
which satisfy $\varrho > 1$ and $\zeta < \eta$.

With these definitions our results can be stated as follows. For $\lambda \leq \varrho$ the dynamical outcome depends on how the cost-to-benefit ratio $\gamma/\beta$ compares to $\zeta$. If $\gamma/\beta \geq \zeta$ (high costs), defection dominates cooperation, while if $\gamma/\beta < \zeta$ (low costs), there is coexistence. For $\lambda > \varrho$, the dynamical outcome also depends on how the cost-to-benefit ratio $\gamma/\beta$ compares to $\eta$. If $\gamma/\beta \geq \eta$ (high costs), defection dominates cooperation. If $\gamma/\beta \leq \zeta$ (low costs), we have coexistence, with the stable singular point $z^*$ satisfying $z^* > \hat{z}$, where
\[
\hat{z} = \frac{\kappa[(n-2)\lambda - (n-1)] - 1}{[1 + \kappa(n-1)](\lambda - 1)}. \tag{21}
\]
In the remaining case ($\zeta < \gamma/\beta < \eta$, intermediate costs) the dynamics are characterized by bistable coexistence, with $z = 0$ stable, $z = 1$ unstable, and two singular points $z_L$ (unstable) and $z_R$ (stable) satisfying $0 < z_L < \hat{z} < z_R < 1$. Numerical values for $z_L$ (resp. $z_R$) can be obtained by searching for roots of $\mathcal{G}(z)$ in the interval $(0, \hat{z})$ (resp. $(\hat{z}, 1)$); see fig. 2.e.

It is evident from the dependence of $\varrho$, $\zeta$, and $\eta$ on $\kappa$ that relatedness plays an important role in determining the stable level(s) of expression of helping. As $\kappa$ increases, the regions of the parameter space where some non-zero level of expression of helping is stable expand at the expense of the region of dominant non-expression. This is so because $\zeta$ and $\eta$ are increasing functions of $\kappa$ and $\varrho$ is a decreasing function of $\kappa$. Moreover, inside these regions the stable non-zero probability of expressing helping increases with $\kappa$ (see fig. 2.b and 2.e). Three cases can, however, be distinguished regarding the effects of increasing $\kappa$ when starting from a point in the parameter space where $z = 0$ is the only stable point. First, $z = 0$ can remain stable irrespective of the value of relatedness, which characterizes high cost-to-benefit ratios. Second, the system can undergo a transcritical bifurcation as $\kappa$ increases, destabilizing $z = 0$ and leading to the appearance of a unique stable interior point (fig. 2.b). This happens when $\lambda$ and $\gamma/\beta$ are relatively small. Third, there is a range of intermediate cost-to-benefit ratios such that, for sufficiently large values of $\lambda$, the system undergoes a saddle-node bifurcation, whereby two singular points ($z_L$, unstable, and $z_R$, stable) appear (fig. 2.e). In this latter case, positive synergies of scale are strong enough to interact with negative synergies of kind and relatedness in a nontrivial way.

For expresser-only traits, and independently of any synergies of scale, the direct gains from switching $d_k$ (cf. table 2) are increasing in $k$, implying that the direct effect $-\mathcal{C}(z)$ is positive frequency-dependent. When synergies of scale are positive, the indirect gains from switching $e_k$ are also increasing in $k$, so that the indirect effect $\mathcal{B}(z)$ is also positive frequency-dependent.
Focusing on the case of nonnegative relatedness ($\kappa \geq 0$), this ensures that, just as when synergies of scale are absent, the gain function $\mathcal{G}(z)$ is positive frequency-dependent. Hence, the evolutionary dynamics are qualitatively identical to those arising from linear benefits: for low relatedness, defection dominates cooperation, and for high relatedness, there is bistability, with the basins of attraction of the two pure equilibria $z = 0$ and $z = 1$ being separated by a unique interior unstable point (see app. F.1 and fig. 2.f).

When synergies of scale are negative, the indirect gains from switching $e_k$ may still be increasing in $k$ because the incremental gain $\Delta\beta_k$ accrues to a larger number of recipients as $k$ increases. In such a scenario, always applicable when $n = 2$, the evolutionary dynamics are again qualitatively identical to those arising when synergies of scale are absent. A different picture can emerge if $n > 2$. In this case, $\mathcal{B}(z)$ can be negative frequency-dependent for some $z$, and hence (for sufficiently high values of $\kappa$) also $\mathcal{G}(z)$. Similarly to the case of nonexpresser-only traits with positive synergies of scale, this can give rise to patterns of frequency dependence that go beyond the scope of helping without synergies of scale.

To illustrate this, consider the case of geometric benefits (13) with $\lambda < 1$, $\kappa \geq 0$, and $n > 2$ (see fig. 2.c for an illustration). Defining the critical value
\[
\xi = \frac{\kappa(n-2)}{1 + \kappa(n-1)}, \tag{22}
\]
and the two critical cost-to-benefit ratios
\[
\varsigma = \frac{1 - \lambda^n}{1 - \lambda} + \kappa(n-1)\lambda^{n-1} \quad \text{and} \quad \tau = \frac{1}{1-\lambda} \left[ 1 + \lambda\kappa \left( \frac{(n-2)\kappa}{1 + \kappa(n-1)} \right)^{n-2} \right], \tag{23}
\]
which satisfy $\xi < \varsigma < \tau$, our result can be stated as follows. For $\lambda \geq \xi$ the evolutionary dynamics depend on how the cost-to-benefit ratio $\gamma/\beta$ compares to $1$ and to $\varsigma$. If $\gamma/\beta \leq 1$ (low costs), cooperation dominates defection. If $\gamma/\beta \geq \varsigma$ (high costs), defection dominates cooperation. If $1 < \gamma/\beta < \varsigma$ (intermediate costs), the dynamics are bistable. For $\lambda < \xi$, the classification of possible evolutionary dynamics is as in the case $\lambda \geq \xi$, except that, if $\varsigma < \gamma/\beta < \tau$, the dynamics are characterized by bistable coexistence, with $z = 0$ stable, $z_L \in (0, \hat{z})$ unstable, $z_R \in (\hat{z}, 1)$ stable, and $z = 1$ unstable, where
\[
\hat{z} = \frac{1 + \kappa}{[1 + \kappa(n-1)](1 - \lambda)}. \tag{24}
\]

For $\kappa \geq 0$, the critical values $\xi$, $\varsigma$, and $\tau$ are all increasing functions of $\kappa$. Hence, as relatedness $\kappa$ increases, the regions of the parameter space where some level of expression of helping is stable expand at the expense of the region of dominant nonexpression. Moreover, inside these regions the stable positive probability of expressing helping increases with $\kappa$ (fig. 2.c). When synergies of scale are "sufficiently" negative ($\lambda < \xi$) and for intermediate cost-to-benefit ratios ($\varsigma < \gamma/\beta < \tau$), relatedness and synergies interact in a nontrivial way, leading to saddle-node bifurcations as $\kappa$ increases (fig. 2.c).

Our model without synergies of scale, for which the gain function $\mathcal{G}(z)$ is linear in $z$ (section 3.1), extends classical two-player matrix games between relatives (e.g., Grafen, 1979; Frank, 1998, ch. 5-6) to the more general case of $n$-player linear games between relatives. Indeed, for $n = 2$, identifying scaled relatedness $\kappa$ with relatedness $r$, and up to normalization of the payoff matrices, equation (14) recovers Grafen (1979, eq. 9) and Frank (1998, eq. 5.6). Interestingly, Frank (1998, p. 98) considers a two-player model of helping with two pure strategies ("nesting" or expressing a queen phenotype, and "helping" or expressing a sterile worker phenotype), which is a particular case of our model of nonexpresser-only traits.

Our results on whole-group traits with geometric returns (section 3.2 and app. C) extend the model studied by Hauert et al. (2006, p. 198) from the particular case of interactions between unrelated individuals ($\kappa = 0$) to the more general case of interactions between relatives, including the limit $\lambda \to 0$, in which the game is also called a "volunteer's dilemma" (Diekmann, 1985). Although we restricted our attention to the cases of constant, decreasing, and increasing incremental benefits, it is clear that equation (16) applies to benefits $\beta_j$ of any shape. Hence, general results about the stability of equilibria in public goods games (Peña et al., 2014) with sigmoid benefits (Bach et al., 2006; Archetti and Scheuring, 2011) carry over to games between relatives.

For their model of "self-destructive cooperation" in bacteria, Ackermann et al. (2008) assumed a nonexpresser-only trait with no synergies of scale, and a haystack model of population structure implying $\kappa = (N_o - N)/(N(N_o - 1))$, where $n = N_o \geq N$ is the number of offspring among which the game is played (see eq. (A.4)). Identifying our $\gamma$ and $\beta$ with the corresponding parameters of their model, the main result of Ackermann et al. (2008) (eq. 7 in their supplementary material) is recovered as a particular case of our result that the unique convergent stable strategy for this case is given by $z^* = [\kappa(n-1)\beta - \gamma]/[(1 + \kappa)(n-1)\beta]$ (eq. (14)). The fact that in this example $\kappa$ is a probability of coalescence within groups shows that social interactions effectively occur between family members, and hence that kin selection is crucial to the understanding of self-destructive cooperation (Gardner and Kümmerli, 2008).

As mentioned before, the analysis of other-only traits follows closely that of whole-group traits (see app. D). The model of altruistic helping in Eshel and Motro (1988) considers such an other-only trait. In their model, one individual in the group needs help, which can be provided (action A) or denied (action B) by its $n - 1$ co-players; Eshel and Motro (1988) consider the case $n = 3$. Suppose that the cost for each helper is a constant $\varepsilon > 0$ (denoted by $c$ in their paper) and that the benefit for the individual in need when $k$ co-players offer help is given by $v_k$ (Eshel and Motro (1988)'s "gain function", denoted by $b_k$ in their paper).
Then, if individuals need help at random, the payoffs for helping (A) and not helping (B) are given by $a_k = -\varepsilon(n-1)/n + v_k/n$ and $b_k = v_k/n$. Defining $\gamma = \varepsilon(n-1)/n$ and $\beta_k = v_k/n$, we have $a_k = -\gamma + \beta_k$ and $b_k = \beta_k$. Comparing these with the payoffs for whole-group traits in table 2, it is apparent that the key difference between other-only traits and whole-group traits is that an expresser is not among the recipients of its own helping behavior. As we show in appendix D, our results for whole-group traits carry over to such other-only traits. In particular, our results for whole-group traits with geometric benefits can be used to recover results 1, 2, and 3 of Eshel and Motro (1988) and to extend them from family-structured to spatially structured populations.

Finally, Van Cleve and Lehmann (2013) discuss an $n$-player coordination game. They assume payoffs given by $a_k = 1 + S(R/S)^{k/(n-1)}$ and $b_k = 1 + P(T/P)^{k/(n-1)}$, for positive $R$, $S$, $T$, and $P$ satisfying $R > T$, $P > S$, and $P > T$. It is easy to see that both the direct effect $-\mathcal{C}(z)$ and the indirect effect $\mathcal{B}(z)$ are strictly increasing functions of $z$ having exactly one sign change. This implies that, for $\kappa \geq 0$, the evolutionary dynamics are characterized by bistability, with the basins of attraction of the two equilibria $z = 0$ and $z = 1$ being divided by the interior unstable equilibrium $z^*$. Importantly, and in contrast to the social traits analyzed in this article, expressing the payoff dominant action A does not always qualify as a helping trait, as $\mathcal{B}(z)$ is negative for some interval $z \in [0, \hat{z})$. As a result, increasing scaled relatedness $\kappa$ can have mixed effects on the location of $z^*$. Both of these predictions are well supported by the numerical results reported by Van Cleve and Lehmann (2013), where increasing $\kappa$ leads to a steady increase in $z^*$ for one payoff parameter combination ($R = 2$, $T = 0.25$) and to a steady decrease in $z^*$ for another ($R = 2$, $T = 1.25$); see their figure 5. This illustrates that relatedness (and thus spatial structure) plays an important role not only in the specific context of helping games but also in the more general context of nonlinear multiplayer games.
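As a numerical cross-check of the results for whole-group traits with geometric benefits, the following Python sketch evaluates the gain function both as a polynomial in Bernstein form with coefficients $f_k = -\gamma + [1 + \kappa(n-1)]\beta\lambda^k$ and through the closed form (C.1), and classifies the dynamics using the critical ratios (17) and the singular point (18). The function names and the parameter values are hypothetical and chosen only for illustration; the sketch assumes $\lambda \neq 1$ and $1 + \kappa(n-1) > 0$.

```python
from math import comb


def inclusive_gains(n, kappa, gamma, beta, lam):
    """Inclusive gains f_k = -gamma + [1 + kappa (n - 1)] beta lam**k, k = 0..n-1."""
    scale = 1 + kappa * (n - 1)
    return [-gamma + scale * beta * lam**k for k in range(n)]


def gain_bernstein(z, coeffs):
    """Evaluate a polynomial in Bernstein form with coefficients f_0..f_m at z."""
    m = len(coeffs) - 1
    return sum(comb(m, k) * z**k * (1 - z) ** (m - k) * f
               for k, f in enumerate(coeffs))


def gain_closed_form(z, n, kappa, gamma, beta, lam):
    """Closed form (C.1) of the gain function for geometric benefits."""
    return -gamma + (1 + kappa * (n - 1)) * beta * (1 - z + lam * z) ** (n - 1)


def classify(n, kappa, gamma, beta, lam):
    """Return (regime, z_star) following eqs. (17)-(18).

    Assumes lam != 1 and 1 + kappa (n - 1) > 0.
    """
    scale = 1 + kappa * (n - 1)
    eps = min(scale, lam ** (n - 1) * scale)    # lower critical ratio, eq. (17)
    theta = max(scale, lam ** (n - 1) * scale)  # upper critical ratio, eq. (17)
    ratio = gamma / beta
    if ratio <= eps:
        return "cooperation dominates", None
    if ratio >= theta:
        return "defection dominates", None
    # Interior singular point, eq. (18).
    z_star = (1 - (ratio / scale) ** (1 / (n - 1))) / (1 - lam)
    return ("coexistence" if lam < 1 else "bistability"), z_star


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    for lam, gamma in [(0.7, 1.0), (1.25, 3.0)]:
        regime, z_star = classify(n=5, kappa=0.25, gamma=gamma, beta=1.0, lam=lam)
        print(lam, regime, z_star)
```

Because the Bernstein-form evaluation only needs the coefficient sequence, the same `gain_bernstein` helper could in principle be reused with the inclusive gain sequences of the other trait types, with interior roots then located numerically rather than in closed form.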

Discussion

We have shown that, when phenotypic differences are small, the selection gradient on a mixed strategy of a symmetric two-strategy $n$-player matrix game is proportional to the average inclusive payoff gain to an individual switching strategies, and that this can be written as a polynomial in Bernstein form (eq. (10)). As a result, convergence stability of strategies in spatially structured populations can be determined from the shape of the inclusive gain sequence (eq. (9)) and the mathematical properties of polynomials in Bernstein form (Farouki, 2012; Peña et al., 2014). We applied these results to the evolution of helping under synergies of scale and kind, and unified and extended previous analyses. The most important conclusion we reach is that, although an increase in (scaled) relatedness $\kappa$ always tempers the social dilemma faced by cooperative individuals in a helping game, how the social dilemma is relaxed crucially depends on the synergies of kind and scale involved.

The simplest case is the one of whole-group traits (fig. 1.a). Since there are no synergies of kind, only synergies of scale can introduce frequency-dependent selection. For $\kappa \geq 0$, negative (resp. positive) synergies of scale induce negative (resp. positive) frequency-dependent selection. Moreover, increasing relatedness can transform a game in which defection is dominant (Prisoner's Dilemma) into a game in which cooperation and defection coexist (Snowdrift or anti-coordination game) when synergies of scale are negative (fig. 2.a), or into a game in which both cooperation and defection are stable (Stag Hunt or coordination game) when synergies of scale are positive (fig. 2.d).

More complex interactions between relatedness and frequency dependence can arise when there are both synergies of kind and scale. For nonexpresser-only traits (fig. 1.b), synergies of kind are negative and helping is altruistic, so that in the absence of relatedness defection dominates cooperation (as in a Prisoner's Dilemma). When synergies of scale are absent (linear benefits) or negative (diminishing incremental benefits), both the direct and the indirect effect are decreasing in $z$ and selection is negative frequency-dependent. In this case, increasing relatedness might turn the game into a Snowdrift or an anti-coordination game, where the probability of cooperating is an increasing function of relatedness (fig. 2.b). Contrastingly, when synergies of scale are positive (increasing incremental benefits) the indirect effect may become unimodal in $z$. This paves the way for new patterns of evolutionary dynamics and bifurcations. For the particular case of geometric benefits, we find that there is a range of cost-to-benefit ratios such that, for sufficiently strong positive synergies of scale, increasing relatedness induces a saddle-node bifurcation whereby two internal equilibria appear, the leftmost unstable and the rightmost stable (fig. 2.e).
After the bifurcation occurs, the evolutionary dynamics are characterized by bistable coexistence, where the first stable equilibrium is pure defection ($z = 0$) and the second is a mixed equilibrium in which individuals help with a positive probability ($z_R$).

For expresser-only traits (where synergies of kind are positive, fig. 1.c) a similar interaction between relatedness and synergies occurs. When synergies of scale are absent or positive, $\mathcal{G}(z)$ is increasing in $z$ for $\kappa \geq 0$. In this case, increasing relatedness might turn a scenario reminiscent of the Prisoner's Dilemma into a Stag Hunt or coordination game, where the size of the basin of attraction of the cooperative equilibrium is an increasing function of relatedness (fig. 2.f). Contrastingly, if synergies of scale are negative, relatedness may interact nontrivially with synergies to produce a dynamical outcome which is qualitatively identical to that arising from nonexpresser-only traits with positive synergies of scale, namely, bistable coexistence (fig. 2.c).

The three kinds of helping traits we considered also differ in the conditions they impose on the origin and the maintenance of helping. To see this, consider a payoff cost $\gamma$ so large that the direct gain sequence $d_k$ is negative. For the case of unrelated individuals ($\kappa = 0$) this implies that B dominates A so that $z = 0$ is the only stable strategy. We ask what happens when $\kappa$ is increased and focus on the stability of the end-points $z = 0$ and $z = 1$.

For whole-group traits, the indirect gains from switching when co-players are all defectors ($e_0$) and when co-players are all helpers ($e_{n-1}$) are both positive. This opens up the opportunity for both (i) $z = 0$ to be destabilized if $\kappa > -d_0/e_0$, and (ii) $z = 1$ to be stabilized if $\kappa > -d_{n-1}/e_{n-1}$, which underlies the classical effect that increasing relatedness can destabilize defection and stabilize helping.

In contrast, one of these two scenarios is missing for nonexpresser-only and expresser-only traits. For nonexpresser-only traits, we have $e_0 > 0$ and $e_{n-1} = 0$ irrespective of the shape of the benefit sequence. Hence, although $z = 0$ can be destabilized by increasing $\kappa$ (allowing for some level of helping to be evolutionarily accessible from $z = 0$), $z = 1$ can never be stabilized and so full helping is never an evolutionary (convergent) stable point. Exactly the opposite happens for expresser-only traits, where $e_{n-1} > 0$ and $e_0 = 0$.
As a result, $z = 1$ can become stable (if $\kappa > -d_{n-1}/e_{n-1}$) but $z = 0$ can never be destabilized by increasing $\kappa$. This implies that, under our assumptions, an expresser-only trait with high costs ($\gamma > \beta_n$) can never evolve from a monomorphic population of nonexpressers ($z = 0$), whatever the value of $\kappa$.

The kind of social trait also has a big impact on the amount of (scaled) relatedness required to make some level of helping stable. This quantitative effect is illustrated in figure 2. When synergies of scale are negative ($\lambda = 0.7$) and the cost-to-benefit ratio is relatively low, a relatively small amount of relatedness is sufficient for whole-group and nonexpresser-only traits ($\kappa \approx 0.132$ for whole-group traits; fig. 2.a and 2.b). In contrast, a comparatively large amount of relatedness is required for expresser-only traits (fig. 2.c). In the case of positive synergies of scale ($\lambda = 1.25$) and relatively high cost-to-benefit ratio ($\gamma/\beta = 15$), full expression of helping is stable already with $\kappa = 0$ for whole-group and expresser-only traits (fig. 2.d and 2.f). Contrastingly, for nonexpresser-only traits, a positive probability of expressing helping is stable only for large values of relatedness ($\kappa > \kappa^*$; fig. 2.e).

We modeled social interactions by assuming that actions implemented by players are discrete. This is in contrast to many kin-selection models of games between relatives, which assume a continuum of pure actions in the form of continuous amounts of effort devoted to some social activity (e.g., Frank 1994; Johnstone et al. 1999; Reuter and Keller 2001; Wenseleers et al. 2010). Such continuous-action models have the advantage that the "fitness function" or "payoff function" (the counterpart to our eq. (3)) usually takes a simple form that facilitates mathematical analysis. On the other hand, there are situations where individuals can express only a few behavioral alternatives or morphs, such as worker and queen in the eusocial Hymenoptera (Wheeler, 1986), different behavioral tactics in foraging (e.g., "producers" and "scroungers" in house sparrows Passer domesticus; Barnard and Sibly, 1981) and hunting (e.g., lionesses positioned as "wings" and others positioned as "centres" in collective hunts; Stander, 1992), or distinct phenotypic states (e.g., capsulated and non-capsulated cells in Pseudomonas fluorescens; Beaumont et al., 2009). These situations are more conveniently modeled by means of a discrete-action model like the one presented here, but we expect that our qualitative results about the interaction between synergy and relatedness carry over to continuous-action models.

Synergistic interactions are likely to be much more common in nature than additive interactions where both synergies of scale and kind are absent. Given the local demographic structure of biological populations, interactions between relatives are also likely to be the rule rather than the exception. Empirical work should thus aim not only at measuring the genetic relatedness of interactants and the fitness costs and benefits of particular actions, but also at identifying the occurrences of positive and negative synergies of kind and scale, as it is the interaction between synergies and relatedness which determines the qualitative outcomes of the evolutionary dynamics of helping (fig. 2).

Acknowledgements

This work was partly supported by Swiss NSF Grants PBLAP3-145860 (to JP) and PP00P3-123344 (to LL).

A The haystack model

Many models of social interactions have assumed different versions of the haystack model (e.g., Matessi and Jayakar, 1976; Ackermann et al., 2008), where several rounds of unregulated reproduction can occur within groups before a round of complete dispersal (Maynard Smith, 1964), so that competition is effectively global. In these cases, as we will see below, $\kappa$ takes the simpler interpretation of the coalescence probability of the gene lineages of two interacting individuals in their group. Here, we calculate $\kappa$ for different variants of the haystack model.

The haystack model can be seen as a special case of the island model where dispersal is complete and where dispersing progeny compete globally. In this context, the fecundity of an adult is the number of its offspring reaching the stage of global density-dependent competition. The conception of offspring may occur in a single or over multiple rounds of reproduction, so that a growth phase within patches is possible. The number $N$ of "adults" is then better thought of as the number of founding individuals (or lineages, or seeds) on a patch.

Two cases need to be distinguished when it comes to social interactions. First, the game can be played between the adult individuals (founders), in which case
\[
\kappa = 0, \tag{A.1}
\]
since relatedness is zero among founders on a patch and there is no local competition. Alternatively, the game is played between offspring after reproduction and right before their dispersal. In this case two individuals can be related since they can descend from the same founder. Since there is no local competition, $\kappa$ is directly the relatedness between two interacting offspring and is obtained as the probability that the two ancestral lineages of two randomly sampled offspring coalesce in the same founding individual (relatedness in the island model is defined as the cumulative coalescence probability over several generations, see e.g., Rousset, 2004, but owing to complete dispersal gene lineages can only coalesce in founders).

In order to evaluate $\kappa$ for the second case, we assume that, after growth, exactly $N_o$ offspring are produced and that the game is played between them ($n = N_o$). Founding individuals, however, may contribute a variable number of offspring. Let us denote by $O_i$ the random number of offspring descending from the "adult" individual $i = 1, 2, \ldots, N$ on a representative patch after reproduction, i.e., $O_i$ is the size of lineage $i$. Owing to our assumption that the total number of offspring is fixed, we have $N_o = O_1 + O_2 + \ldots + O_N$, where the $O_i$'s are exchangeable random variables (i.e., neutral process, $\delta = 0$). The coalescence probability $\kappa$ can then be computed as the expectation of the ratio of the total number of ways of sampling two offspring from the same founding parent to the total number of ways of sampling two offspring:
\[
\kappa = \mathrm{E}\left[ \sum_{i=1}^{N} \frac{O_i (O_i - 1)}{N_o (N_o - 1)} \right] = N \left( \frac{\sigma^2 + \mu^2 - \mu}{N_o (N_o - 1)} \right), \tag{A.2}
\]
where the second equality follows from exchangeability, $\mu = \mathrm{E}[O_i]$ is the expected number of offspring descending from any individual $i$, and $\sigma^2 = \mathrm{E}\left[ (O_i - \mu)^2 \right]$ is the corresponding variance. Because the total number of offspring is fixed, we also necessarily have $\mu = N_o/N$ (i.e., $N_o = \mathrm{E}[N_o] = \mathrm{E}[O_1 + O_2 + \ldots + O_N] = N\mu$), whereby
\[
\kappa = \frac{N_o - N}{N (N_o - 1)} + \frac{\sigma^2 N}{N_o (N_o - 1)}, \tag{A.3}
\]
which holds for any neutral growth process.

We now consider different cases:

(i) Suppose that there is no variation in offspring production between founding individuals, as in the life cycle described by Ackermann et al. (2008). Then $\sigma^2 = 0$, and equation (A.3) simplifies to
\[
\kappa = \frac{N_o - N}{N (N_o - 1)}. \tag{A.4}
\]

(ii) Suppose that each of the $N_o$ offspring has an equal chance of descending from any founding individual, so that each offspring is the result of a sampling event (with replacement) from a parent among the $N$ founding individuals. Then, the offspring number distribution is binomial with parameters $N_o$ and $1/N$, whereby $\sigma^2 = (1 - 1/N) N_o/N$. Substituting into equation (A.3) produces
\[
\kappa = \frac{1}{N}. \tag{A.5}
\]
In more biological terms, this case results from a situation where individuals produce offspring according to a Poisson process and where exactly $N_o$ individuals are kept for interactions (i.e., the conditional branching process of population genetics; Ewens, 2004).

(iii) Suppose that the offspring distribution follows a beta-binomial distribution, with number of trials $N_o$ and shape parameters $\alpha > 0$ and $\beta = \alpha(N - 1)$. In this case, $\mu = N_o/N$ and
\[
\sigma^2 = \frac{N_o (N - 1)(\alpha N + N_o)}{N^2 (1 + \alpha N)},
\]
which yields
\[
\kappa = \frac{1 + \alpha}{1 + \alpha N}. \tag{A.6}
\]
In more biological terms, this reproductive scheme results from a situation where individuals produce offspring according to a negative binomial distribution (larger variance than Poisson, which is recovered when $\alpha \to \infty$), and where exactly $N_o$ individuals are kept for interactions.

B Gains from switching and the gain function

In the following we establish the expressions for $\mathcal{C}(z)$ and $\mathcal{B}(z)$ given in equations (7)–(8); equation (10) is then immediate from the definition of $f_k$ (9) and the identity $\mathcal{G}(z) = -\mathcal{C}(z) + \kappa \mathcal{B}(z)$.

Recalling the definitions of $\mathcal{C}(z)$ and $\mathcal{B}(z)$ from equation (4) as well as the definitions of $d_k$ and $e_k$ from equations (5)–(6), we need to show
\[
\left. \frac{\partial \pi(z_\bullet, z_\circ)}{\partial z_\bullet} \right|_{z_\bullet = z_\circ = z} = \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} \left[ a_k - b_k \right], \tag{B.1}
\]
\[
\left. \frac{\partial \pi(z_\bullet, z_\circ)}{\partial z_\circ} \right|_{z_\bullet = z_\circ = z} = \sum_{k=0}^{n-1} \binom{n-1}{k} z^k (1-z)^{n-1-k} \left[ k \Delta a_{k-1} + (n-1-k) \Delta b_k \right], \tag{B.2}
\]
where the function $\pi$ has been defined in equation (3). Equation (B.1) follows directly by taking the partial derivative of $\pi$ with respect to $z_\bullet$ and evaluating at $z_\bullet = z_\circ = z$, so it remains to establish equation (B.2).

Our derivation of equation (B.2) uses properties of polynomials in Bernstein form (Farouki, 2012). Such polynomials, which in general can be written as $\sum_{k=0}^{m} \binom{m}{k} x^k (1-x)^{m-k} c_k$, where $x \in [0, 1]$, satisfy
\[
\frac{\mathrm{d}}{\mathrm{d}x} \sum_{k=0}^{m} \binom{m}{k} x^k (1-x)^{m-k} c_k = m \sum_{k=0}^{m-1} \binom{m-1}{k} x^k (1-x)^{m-1-k} \Delta c_k .
\]
Applying this property to equation (3) and evaluating the resulting partial derivative at $z_\bullet = z_\circ = z$ yields
\[
\left. \frac{\partial \pi(z_\bullet, z_\circ)}{\partial z_\circ} \right|_{z_\bullet = z_\circ = z} = (n-1) z \sum_{k=0}^{n-2} \binom{n-2}{k} z^k (1-z)^{n-2-k} \Delta a_k + (n-1)(1-z) \sum_{k=0}^{n-2} \binom{n-2}{k} z^k (1-z)^{n-2-k} \Delta b_k . \tag{B.3}
\]
In order to obtain equation (B.2) from equation (B.3) it then suffices to establish
\[
x \sum_{k=0}^{m-1} \binom{m-1}{k} x^k (1-x)^{m-1-k} c_k = \sum_{k=0}^{m} \binom{m}{k} x^k (1-x)^{m-k} \frac{k c_{k-1}}{m} \tag{B.4}
\]
and
\[
(1-x) \sum_{k=0}^{m-1} \binom{m-1}{k} x^k (1-x)^{m-1-k} c_k = \sum_{k=0}^{m} \binom{m}{k} x^k (1-x)^{m-k} \frac{(m-k) c_k}{m}, \tag{B.5}
\]
as applying these identities (with $m = n - 1$) to the terms on the right side of equation (B.3) yields the right side of equation (B.2).

Let us prove equation (B.4) (eq. (B.5) is proven in a similar way). Starting from the left side of equation (B.4), we multiply and divide by $m/(k+1)$ and distribute $x$ to obtain
\[
x \sum_{k=0}^{m-1} \binom{m-1}{k} x^k (1-x)^{m-1-k} c_k = \sum_{k=0}^{m-1} \frac{m}{k+1} \binom{m-1}{k} x^{k+1} (1-x)^{m-(k+1)} c_k \frac{k+1}{m} .
\]
Applying the identity $\binom{r}{k} = \frac{r}{k} \binom{r-1}{k-1}$ and shifting the index of summation by one (replacing $k + 1$ by $k$), we get
\[
x \sum_{k=0}^{m-1} \binom{m-1}{k} x^k (1-x)^{m-1-k} c_k = \sum_{k=1}^{m} \binom{m}{k} x^k (1-x)^{m-k} \frac{k c_{k-1}}{m} .
\]
Finally, changing the lower index of the sum by noting that the summand is zero when $k = 0$ gives equation (B.4).

C Whole-group traits with geometric benefits

With geometric benefits, we have $\Delta\beta_k = \beta\lambda^k$, so that the inclusive gains from switching for whole-group traits are given by $f_k = -\gamma + [1 + \kappa(n-1)]\beta\lambda^k$. Using the formula for the probability generating function of a binomial random variable, equation (10) can be written as
\[
\mathcal{G}(z) = -\gamma + [1 + \kappa(n-1)] \beta (1 - z + \lambda z)^{n-1} . \tag{C.1}
\]
As $\mathcal{G}(z)$ is either decreasing ($\lambda < 1$) or increasing ($\lambda > 1$) in $z$, A (resp. B) is a dominant strategy if and only if $\min[\mathcal{G}(0), \mathcal{G}(1)] \geq 0$ (resp. $\max[\mathcal{G}(0), \mathcal{G}(1)] \leq 0$). Calculating $\mathcal{G}(0) = -\gamma + [1 + \kappa(n-1)]\beta$ and $\mathcal{G}(1) = -\gamma + [1 + \kappa(n-1)]\beta\lambda^{n-1}$, and rewriting these conditions in terms of $\gamma/\beta$, then yields the critical cost-to-benefit ratios $\varepsilon$ and $\vartheta$ given in equation (17). The value of $z^*$ given in equation (18) is obtained by solving $\mathcal{G}(z^*) = 0$.

D Other-only traits

In contrast to what happens in whole-group traits, individuals expressing an other-only trait are automatically excluded from the consumption of the good they create, although they can still reap the benefits of goods created by other expressers in their group. Payoffs for such other-only traits are given by $a_k = -\gamma + \beta_k$ and $b_k = \beta_k$, so that the inclusive gains from switching are given by $f_k = -\gamma + \kappa[k \Delta\beta_{k-1} + (n-1-k)\Delta\beta_k]$. For this payoff constellation, it is straightforward to obtain the indirect benefits $\mathcal{B}(z)$ from equation (B.3) in appendix B. Observing that $\Delta a_k = \Delta b_k = \Delta\beta_k$ holds for all $k$, we have
\[
\mathcal{B}(z) = \left. \frac{\partial \pi(z_\bullet, z_\circ)}{\partial z_\circ} \right|_{z_\bullet = z_\circ = z} = \sum_{k=0}^{n-2} \binom{n-2}{k} z^k (1-z)^{n-2-k} (n-1) \Delta\beta_k .
\]
Using equation (7) and the equality $a_k - b_k = -\gamma$, we have that the direct benefit is given by $-\mathcal{C}(z) = -\gamma$. Substituting these expressions for $\mathcal{C}(z)$ and $\mathcal{B}(z)$ into equation (4), we obtain
\[
\mathcal{G}(z) = \sum_{k=0}^{n-2} \binom{n-2}{k} z^k (1-z)^{n-2-k} \left[ -\gamma + \kappa(n-1)\Delta\beta_k \right] . \tag{D.1}
\]
If $\kappa \leq 0$, our assumption that the benefit sequence is increasing implies that $\mathcal{G}(z)$ is always negative, so that $z = 0$ is the only stable point (defection dominates cooperation).

To analyze the case where $\kappa > 0$, it is convenient to observe that equation (D.1) is of a similar form as equation (15). The only differences are that the summation in equation (D.1) extends from $0$ to $n - 2$ (rather than to $n - 1$) and that the term multiplying the incremental benefit $\Delta\beta_k$ is given by $\kappa(n-1)$ (rather than $1 + \kappa(n-1)$). In the absence of synergies of scale ($\Delta\beta_k = \beta$ for all $k$), the gain function is constant, with defection dominating cooperation if $\kappa < \gamma/[(n-1)\beta]$ and cooperation dominating defection if $\kappa > \gamma/[(n-1)\beta]$. With negative synergies of scale, the gain function is decreasing in $z$ (negative frequency dependence). Defection dominates cooperation (so that $z = 0$ is the only convergent stable strategy) if $\gamma \geq \kappa(n-1)\Delta\beta_0$, whereas cooperation dominates defection if $\gamma \leq \kappa(n-1)\Delta\beta_{n-2}$. If $\kappa(n-1)\Delta\beta_{n-2} < \gamma < \kappa(n-1)\Delta\beta_0$, the unique convergent stable strategy $z^*$ features coexistence. With positive synergies of scale, the gain function is increasing in $z$ (positive frequency dependence). Defection dominates cooperation if $\gamma \geq \kappa(n-1)\Delta\beta_{n-2}$, whereas cooperation dominates defection if $\gamma \leq \kappa(n-1)\Delta\beta_0$. If $\kappa(n-1)\Delta\beta_0 < \gamma < \kappa(n-1)\Delta\beta_{n-2}$, there is bistability: both $z = 0$ and $z = 1$ are stable and there is a unique, unstable interior point $z^*$ separating the basins of attraction of these two stable strategies.

When benefits are geometric (13), the gain function is given by
\[
\mathcal{G}(z) = -\gamma + \kappa(n-1)\beta(1 - z + \lambda z)^{n-2},
\]
so that, for $\lambda \neq 1$, the evolutionary dynamics are similar to the case of whole-group traits after redefining the critical cost-to-benefit ratios as
\[
\varepsilon = \min\left( \kappa(n-1),\ \lambda^{n-2}\kappa(n-1) \right) \quad \text{and} \quad \vartheta = \max\left( \kappa(n-1),\ \lambda^{n-2}\kappa(n-1) \right),
\]
and the singular point as
\[
z^* = \frac{1}{1-\lambda} \left[ 1 - \left( \frac{\gamma}{\beta\kappa(n-1)} \right)^{\frac{1}{n-2}} \right] .
\]

E Nonexpresser-only traits

For nonexpresser-only traits, the inclusive gains from switching are given by

$$f_k = -\gamma - \beta_k + \kappa(n-1-k)\Delta\beta_k. \tag{E.1}$$

E.1 Negative synergies of scale
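The arguments in this appendix rest on the sign pattern (initial sign and number of sign changes) of the inclusive gain sequence $f_k$ from equation (E.1). A minimal sketch for exploring that pattern numerically is given here. The helper names and the concave benefit sequence are assumptions chosen for illustration, not part of the article.

```python
def nonexpresser_gains(benefits, gamma, kappa):
    """Inclusive gains (E.1): f_k = -gamma - beta_k + kappa (n-1-k) (beta_{k+1} - beta_k).

    `benefits` holds beta_0, ..., beta_{n-1}; for k = n-1 the indirect term
    vanishes because its factor (n-1-k) is zero.
    """
    n = len(benefits)
    gains = []
    for k in range(n):
        delta = benefits[k + 1] - benefits[k] if k < n - 1 else 0.0
        gains.append(-gamma - benefits[k] + kappa * (n - 1 - k) * delta)
    return gains

def sign_pattern(seq):
    """Return (initial sign, number of sign changes), ignoring exact zeros."""
    signs = [1 if x > 0 else -1 for x in seq if x != 0]
    changes = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return (signs[0] if signs else 0), changes

# Assumed example: beta_0 = 0, increasing benefits with decreasing increments
# (negative synergies of scale), n = 5 players.
benefits = [0.0, 1.0, 1.5, 1.75, 1.875]
low_cost = sign_pattern(nonexpresser_gains(benefits, gamma=1.0, kappa=0.5))
high_cost = sign_pattern(nonexpresser_gains(benefits, gamma=3.0, kappa=0.5))
```

In this example $\kappa(n-1)\Delta\beta_0 = 2$, so the low-cost case starts positive with a single sign change, while the high-cost case is negative throughout.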

When synergies of scale are negative, we have the following general result.

Result 1 (Nonexpresser-only traits with negative synergies of scale). Let $f_k$ be given by equation (E.1) with $\beta_0 = 0$, $\beta_k$ increasing and $\Delta\beta_k$ decreasing in $k$, and let $\kappa \geq 0$ (the case $\kappa < 0$ is trivial). Then

1. If $\gamma \geq \kappa(n-1)\Delta\beta_0$, $z = 0$ is the only stable point (B dominates A).
2. If $\gamma < \kappa(n-1)\Delta\beta_0$, both $z = 0$ and $z = 1$ are unstable and there is a unique internal stable point $z^* \in (0, 1)$ (coexistence).

To prove this result, we start by observing that the assumptions in the statement imply that $f_k$ is decreasing in $k$. In particular, we have $f_{n-1} < f_0$. Consequently, if $f_0 \leq 0$ (which holds if and only if $\gamma \geq \kappa(n-1)\Delta\beta_0$), the inclusive gain sequence has no sign changes and its initial sign is negative. Observing that $f_{n-1} = -\gamma - \beta_{n-1} < 0$ always holds true, $f_0 > 0$ (which holds if and only if $\gamma < \kappa(n-1)\Delta\beta_0$) implies that the decreasing sequence $f_k$ has one sign change and that its initial sign is positive. Result 1 is then obtained by an application of Peña et al. (2014, result 3).

E.2 Geometric benefits
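For nonexpresser-only traits with geometric benefits, the gain function can be written in closed form, and its mode and maximal value then follow explicitly. A hedged numerical sketch of these quantities is given here. It assumes $\beta_k = \beta(\lambda^k - 1)/(\lambda - 1)$ (so that $\beta_0 = 0$ and $\Delta\beta_k = \beta\lambda^k$). The function names and parameter values are illustrative assumptions.

```python
from math import comb

def gain_sum(z, n, kappa, beta, lam, gamma):
    """G(z) as the binomial sum of f_k = -gamma - beta_k + kappa (n-1-k) beta lam^k."""
    total = 0.0
    for k in range(n):
        beta_k = beta * (lam**k - 1) / (lam - 1)
        f_k = -gamma - beta_k + kappa * (n - 1 - k) * beta * lam**k
        total += comb(n - 1, k) * z**k * (1 - z)**(n - 1 - k) * f_k
    return total

def gain_closed(z, n, kappa, beta, lam, gamma):
    """Closed form of G(z) for geometric benefits."""
    brace = kappa * (n - 1) - 1 / (lam - 1) - (1 + kappa * (n - 1)) * z
    return -gamma + beta / (lam - 1) + beta * brace * (1 - z + lam * z)**(n - 2)

def mode(n, kappa, lam):
    """Solution of G'(z) = 0; interior only for sufficiently large lam."""
    return (kappa * ((n - 2) * lam - (n - 1)) - 1) / ((1 + kappa * (n - 1)) * (lam - 1))

def gain_at_mode(n, kappa, beta, lam, gamma):
    """Maximal value of the gain function when the mode is interior."""
    base = (n - 2) * kappa * lam / (1 + kappa * (n - 1))
    return -gamma + beta / (lam - 1) * (1 + kappa * lam * base**(n - 2))

# Assumed parameters for which the mode lies in (0, 1).
p = dict(n=20, kappa=0.5, beta=1.0, lam=1.25, gamma=10.0)
z_hat = mode(p["n"], p["kappa"], p["lam"])
```

Comparing `gain_sum` with `gain_closed` on a grid of `z` values is a quick way to catch sign or index slips when transcribing the closed form.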

For geometric beneﬁts, we obtain the following result.

Result 2 (Nonexpresser-only traits with geometric benefits). Let $f_k$ be given by equation (E.1) with $\beta_k$ given by equation (13) and let $\kappa \geq 0$ and $n > 2$ (the cases $\kappa < 0$ or $n = 2$ are trivial). Moreover, let $\varrho$, $\zeta$ and $\eta$ be defined by equations (19) and (20). Then

1. If $\lambda \leq \varrho$, $G(z)$ is nonincreasing in $z$. Furthermore:
(a) If $\gamma/\beta < \zeta$, both $z = 0$ and $z = 1$ are unstable and there is a unique internal stable point $z^* \in (0, 1)$ (coexistence).
(b) If $\gamma/\beta \geq \zeta$, $z = 0$ is the only stable point (B dominates A).
2. If $\lambda > \varrho$, $G(z)$ is unimodal in $z$ with mode given by
$$\hat{z} = \frac{\kappa\left[(n-2)\lambda - (n-1)\right] - 1}{\left[1 + \kappa(n-1)\right](\lambda - 1)}.$$
Furthermore:
(a) If $\gamma/\beta \leq \zeta$, both $z = 0$ and $z = 1$ are unstable and there is a unique internal stable point $z^* > \hat{z}$ (coexistence).
(b) If $\zeta < \gamma/\beta < \eta$, there are two interior singular points $z_L$ and $z_R$ satisfying $z_L < \hat{z} < z_R$. The points $z = 0$ and $z_R$ are stable, whereas $z_L$ and $z = 1$ are unstable (bistable coexistence).
(c) If $\gamma/\beta \geq \eta$, then $z = 0$ is the only stable point (B dominates A).

Since $\varrho > 1$ holds for $\kappa \geq 0$ and the case $\lambda = 1$ (no synergies of scale) is trivial, we can prove this result by considering three cases: (i) $\lambda < 1$, (ii) $1 < \lambda \leq \varrho$, and (iii) $\varrho < \lambda$.

For $\lambda < 1$, we have negative synergies of scale and hence result 1 applies with $\Delta\beta_0 = \beta$. Recalling the definition of $\zeta = \kappa(n-1)$ from equation (20) and rearranging, this yields result 2.1 for the case $\lambda < 1 < \varrho$.

To obtain the result for the remaining two cases, we calculate the first and second forward differences of the benefit sequence (13) and substitute them into

$$\Delta f_k = -(1 + \kappa)\Delta\beta_k + \kappa(n-2-k)\Delta^2\beta_k, \quad k = 0, 1, \ldots, n-2,$$

to obtain

$$\Delta f_k = \beta\lambda^k \left\{\kappa\left[(n-2)\lambda - (n-1)\right] - 1 - \kappa(\lambda - 1)k\right\}, \quad k = 0, 1, \ldots, n-2.$$

For $\lambda > 1$, the factor in braces is nonincreasing in $k$, so the sequence $\Delta f_k$ can have at most one sign change. Moreover, since $\Delta f_{n-2} = -\beta\lambda^{n-2}(1 + \kappa) < 0$ always holds true, whether or not the sequence $\Delta f_k$ has a sign change depends exclusively on how $\Delta f_0 = \beta\left\{\kappa\left[(n-2)\lambda - (n-1)\right] - 1\right\}$ compares to zero. Observe, too, that $f_{n-1} < 0$ always holds true and that the sign of $f_0$ is identical to the sign of $\zeta - \gamma/\beta$.

Consider the case $1 < \lambda \leq \varrho$. Recalling the definition of $\varrho$ (eq. (19)), we then have $\Delta f_0 \leq 0$, implying that $\Delta f_k$ has no sign changes and that its initial sign is negative, i.e., $f_k$ is nonincreasing. Hence, if $f_0 \leq 0$ (which holds if and only if $\gamma/\beta \geq \zeta$), the inclusive gain sequence has no sign changes and its initial sign is negative. Otherwise, that is, if $\gamma/\beta < \zeta$ holds, we have $f_0 > 0 > f_{n-1}$, so that the inclusive gain sequence has one sign change and its initial sign is positive. Result 2.1 then follows from Peña et al. (2014, result 3).

For $\lambda > \varrho$ we have $\Delta f_0 > 0$, implying that $\Delta f_k$ has one sign change from $+$ to $-$, i.e., $f_k$ is unimodal. This implies that the gain function $G(z)$ is also unimodal with its mode $\hat{z}$ being determined by $G'(\hat{z}) = 0$ (Peña et al., 2014, section 3.4.3). Using the assumption of geometric benefits, we can express $G(z)$ in closed form as

$$G(z) = -\gamma + \frac{\beta}{\lambda - 1} + \beta\left\{\kappa(n-1) - \frac{1}{\lambda - 1} - \left[1 + \kappa(n-1)\right]z\right\}(1 - z + \lambda z)^{n-2}$$

with corresponding derivative

$$G'(z) = (n-1)\beta(\lambda - 1)(1 - z + \lambda z)^{n-3}\left\{\kappa(n-2) - \frac{1 + \kappa}{\lambda - 1} - \left[1 + \kappa(n-1)\right]z\right\}.$$

Solving $G'(\hat{z}) = 0$ then yields $\hat{z}$ as given in equation (21). The corresponding maximal value of the gain function is

$$G(\hat{z}) = -\gamma + \frac{\beta}{\lambda - 1}\left[1 + \kappa\lambda\left(\frac{(n-2)\kappa\lambda}{1 + \kappa(n-1)}\right)^{n-2}\right].$$

Result 2.2 follows from an application of Peña et al. (2014, result 5) upon noticing that $f_0 \geq 0$ (precluding the stability of $z = 0$ and ensuring $G(\hat{z}) > 0$) holds if and only if $\gamma/\beta \leq \zeta$ and that $G(\hat{z}) \leq 0$ holds if and only if $\gamma/\beta \geq \eta$. (We note that the range of cost-to-benefit ratios $\gamma/\beta$ for which bistable coexistence occurs is nonempty, that is, $\eta > \zeta$ holds. Otherwise there would exist $\gamma/\beta$ satisfying both $\gamma/\beta \leq \zeta$ and $\gamma/\beta \geq \eta$, which in light of result 2.2.(a) and result 2.2.(c) is clearly impossible.)

F Expresser-only traits

For expresser-only traits, the inclusive gains from switching are given by

$$f_k = -\gamma + \beta_{k+1} + \kappa k \Delta\beta_k. \tag{F.1}$$

F.1 Positive synergies of scale
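With positive synergies of scale, the sequence (F.1) is increasing in $k$, so the dynamics are determined by the signs of its two endpoints $f_0$ and $f_{n-1}$. A minimal sketch of that endpoint test is given here. The function names and the convex benefit sequence $\beta_j = j^2$ are assumptions made only for illustration.

```python
def expresser_gains(benefits, gamma, kappa):
    """Inclusive gains (F.1): f_k = -gamma + beta_{k+1} + kappa k (beta_{k+1} - beta_k).

    `benefits` holds beta_0, ..., beta_n, so the group size is n = len(benefits) - 1.
    """
    n = len(benefits) - 1
    return [
        -gamma + benefits[k + 1] + kappa * k * (benefits[k + 1] - benefits[k])
        for k in range(n)
    ]

def classify_by_endpoints(benefits, gamma, kappa):
    """Classify the dynamics from the signs of f_0 and f_{n-1} (increasing f_k assumed)."""
    n = len(benefits) - 1
    lower = benefits[1]  # f_0 >= 0 iff gamma <= beta_1
    upper = benefits[n] + kappa * (n - 1) * (benefits[n] - benefits[n - 1])
    if gamma <= lower:
        return "A dominates B"
    if gamma >= upper:
        return "B dominates A"
    return "bistability"

# Assumed example: beta_j = j^2 has increasing increments; n = 4 players.
benefits = [0.0, 1.0, 4.0, 9.0, 16.0]
labels = [classify_by_endpoints(benefits, g, kappa=0.5) for g in (0.5, 10.0, 30.0)]
```

Here the two thresholds are $\beta_1 = 1$ and $\beta_4 + \kappa \cdot 3 \cdot \Delta\beta_3 = 26.5$, so the three assumed costs fall into the three regimes in turn.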

In the case of positive synergies of scale, we have the following general result.

Result 3 (Expresser-only traits with positive synergies of scale). Let $f_k$ be given by equation (F.1) with $\beta_k$ and $\Delta\beta_k$ increasing in $k$ and let $\kappa \geq 0$. Then

1. If $\gamma \leq \beta_1$, $z = 1$ is the only stable point (A dominates B).
2. If $\beta_1 < \gamma < \beta_n + \kappa(n-1)\Delta\beta_{n-1}$, both $z = 0$ and $z = 1$ are stable and there is a unique internal unstable point $z^* \in (0, 1)$ (bistability).
3. If $\gamma \geq \beta_n + \kappa(n-1)\Delta\beta_{n-1}$, $z = 0$ is the only stable point (B dominates A).

The arguments used for deriving this result are analogous to those used for deriving the results for the case of nonexpresser-only traits and negative synergies of scale (result 1 in app. E). The assumptions in the statement of the result imply that $f_k$ is increasing in $k$. In particular, we have $f_0 < f_{n-1}$. The sign pattern of the inclusive gain sequence thus depends on the values of its endpoints in the following way. If $f_0 \geq 0$ (which holds if and only if $\gamma \leq \beta_1$), $f_k$ has no sign changes and a positive initial sign. If $f_{n-1} \leq 0$ (which holds if and only if $\gamma \geq \beta_n + \kappa(n-1)\Delta\beta_{n-1}$), $f_k$ has no sign changes and a negative initial sign. If $f_0 < 0 < f_{n-1}$ (which holds if and only if $\beta_1 < \gamma < \beta_n + \kappa(n-1)\Delta\beta_{n-1}$), $f_k$ has one sign change and a negative initial sign. Result 3 follows from these observations upon applying Peña et al. (2014, result 3).

F.2 Geometric benefits
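For expresser-only traits with geometric benefits and $\lambda < 1$, the gain function can be hump-shaped, and the dynamics then depend on how $\gamma/\beta$ compares with two upper thresholds. The sketch below evaluates these numerically. It assumes $\beta_j = \beta(1 - \lambda^j)/(1 - \lambda)$, and it reconstructs the two thresholds from the conditions $f_{n-1} = 0$ and $G(\hat{z}) = 0$ rather than copying them from equations (22) and (23). All names and parameter values are illustrative assumptions.

```python
from math import comb

def expresser_gain(z, n, kappa, beta, lam, gamma):
    """G(z) as the binomial sum of f_k = -gamma + beta_{k+1} + kappa k beta lam^k."""
    total = 0.0
    for k in range(n):
        beta_next = beta * (1 - lam**(k + 1)) / (1 - lam)
        f_k = -gamma + beta_next + kappa * k * beta * lam**k
        total += comb(n - 1, k) * z**k * (1 - z)**(n - 1 - k) * f_k
    return total

def mode(n, kappa, lam):
    """Solution of G'(z) = 0; lies in (0, 1) only for sufficiently small lam."""
    return (1 + kappa) / ((1 + kappa * (n - 1)) * (1 - lam))

def upper_thresholds(n, kappa, lam):
    """Cost-to-benefit ratios at which f_{n-1} = 0 and G(z_hat) = 0, respectively."""
    at_endpoint = (1 - lam**n) / (1 - lam) + kappa * (n - 1) * lam**(n - 1)
    base = (n - 2) * kappa / (1 + kappa * (n - 1))
    at_mode = (1 + lam * kappa * base**(n - 2)) / (1 - lam)
    return at_endpoint, at_mode

# Assumed parameters with an interior mode.
n, kappa, lam = 20, 0.5, 0.7
z_hat = mode(n, kappa, lam)
varsigma, tau = upper_thresholds(n, kappa, lam)
```

For a cost-to-benefit ratio strictly between the two returned thresholds, the gain function is negative at both endpoints and positive at its mode, which is the sign pattern behind bistable coexistence.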

For geometric beneﬁts, we obtain the following result.

Result 4 (Expresser-only traits with geometric benefits). Let $f_k$ be given by equation (F.1) with $\beta_k$ given by equation (13) and let $\kappa \geq 0$ and $n > 2$ (the cases $\kappa < 0$ or $n = 2$ are trivial). Moreover, let $\xi$, $\varsigma$ and $\tau$ be defined by equations (22) and (23). Then

1. If $\lambda \geq \xi$, $G(z)$ is nondecreasing in $z$. Furthermore:
(a) If $\gamma/\beta \leq 1$, $z = 1$ is the only stable point (A dominates B).
(b) If $1 < \gamma/\beta < \varsigma$, both $z = 0$ and $z = 1$ are stable and there is a unique internal unstable point $z^* \in (0, 1)$ (bistability).
(c) If $\gamma/\beta \geq \varsigma$, $z = 0$ is the only stable point (B dominates A).
2. If $\lambda < \xi$, $G(z)$ is unimodal in $z$, with mode given by
$$\hat{z} = \frac{1 + \kappa}{\left[1 + \kappa(n-1)\right](1 - \lambda)}.$$
Furthermore:
(a) If $\gamma/\beta \leq 1$, $z = 1$ is the only stable point (A dominates B).
(b) If $1 < \gamma/\beta \leq \varsigma$, both $z = 0$ and $z = 1$ are stable and there is a unique internal unstable point $z^* \in (0, \hat{z})$ (bistability).
(c) If $\varsigma < \gamma/\beta < \tau$, there are two interior singular points $z_L$ and $z_R$ satisfying $z_L < \hat{z} < z_R$. The points $z = 0$ and $z_R$ are stable, whereas $z_L$ and $z = 1$ are unstable (bistable coexistence).
(d) If $\gamma/\beta \geq \tau$, $z = 0$ is the only stable point (B dominates A).

Since $\xi < 1$ holds and the case $\lambda = 1$ is trivial, there are three cases to consider: (i) $\lambda > 1$, (ii) $1 > \lambda \geq \xi$, and (iii) $\xi > \lambda$.

For $\lambda > 1$ we have positive synergies of scale and hence result 3 applies with $\beta_1 = \beta$ and $\beta_n + \kappa(n-1)\Delta\beta_{n-1} = \beta\varsigma$. This yields result 4.1 for the case $\lambda > 1$.

To obtain the result for the remaining two cases, we calculate the first and second forward differences of the benefit sequence (13) and substitute them into

$$\Delta f_k = \Delta\beta_{k+1} + \kappa\left\{(k+1)\Delta^2\beta_k + \Delta\beta_k\right\}, \quad k = 0, 1, \ldots, n-2,$$

to obtain

$$\Delta f_k = \beta\lambda^k\left[\lambda(1 + \kappa) + \kappa(\lambda - 1)k\right], \quad k = 0, 1, \ldots, n-2.$$

For $\lambda < 1$, the factor in brackets is nonincreasing in $k$, so the sequence $\Delta f_k$ can have at most one sign change. Moreover, as $\Delta f_0 = \beta\lambda(1 + \kappa) > 0$ always holds true, the initial sign of $\Delta f_k$ is positive and whether or not the sequence $\Delta f_k$ has a sign change depends solely on how $\Delta f_{n-2}$ compares to zero. Observe, too, that for $\lambda < 1$ we have $\varsigma > 1$, as $\lambda^n < \lambda$ holds.

Consider the case $\xi \leq \lambda < 1$. By the definition of $\xi$ (eq. (22)) this implies $\Delta f_{n-2} \geq 0$. In this case $\Delta f_k$ has no sign changes and $f_k$ is nondecreasing. The sign pattern of the inclusive gain sequence can then be determined by looking at how the signs of its endpoints depend on the cost-to-benefit ratio $\gamma/\beta$. If $\gamma/\beta \leq 1$, then $f_0 \geq 0$, implying that $f_k$ has no sign changes and its initial sign is positive. If $\gamma/\beta \geq \varsigma$, then $f_{n-1} \leq 0$, implying that $f_k$ has no sign changes and its initial sign is negative. If $1 < \gamma/\beta < \varsigma$, then $f_0 < 0 < f_{n-1}$, i.e., $f_k$ has one sign change and its initial sign is negative. Result 4.1 then follows from an application of Peña et al. (2014, result 3).

For $\lambda < \xi$ we have $\Delta f_{n-2} < 0$, implying that $\Delta f_k$ has one sign change from $+$ to $-$, i.e., $f_k$ is unimodal. Hence, the gain function $G(z)$ is also unimodal (Peña et al., 2014, section 3.4.3) with mode $\hat{z}$ determined by $G'(\hat{z}) = 0$. Using the assumption of geometric benefits, we can express $G(z)$ in closed form as

$$G(z) = -\gamma + \frac{\beta}{1 - \lambda} + \beta\lambda\left\{\left[1 + \kappa(n-1)\right]z - \frac{1}{1 - \lambda}\right\}(1 - z + \lambda z)^{n-2},$$

with corresponding derivative

$$G'(z) = (n-1)\beta\lambda\left\{1 + \kappa - (1 - \lambda)\left[1 + \kappa(n-1)\right]z\right\}(1 - z + \lambda z)^{n-3}.$$

Solving $G'(\hat{z}) = 0$ then yields $\hat{z}$ as given in equation (24). The corresponding maximal value of the gain function is

$$G(\hat{z}) = -\gamma + \frac{\beta}{1 - \lambda}\left[1 + \lambda\kappa\left(\frac{(n-2)\kappa}{1 + \kappa(n-1)}\right)^{n-2}\right].$$

Result 4.2 then follows from applying Peña et al. (2014, result 5). In particular, if $\gamma/\beta \leq 1$, we also have $\gamma/\beta < \varsigma$, ensuring that $f_0 \geq 0$ and $f_{n-1} > 0$ hold. If $1 < \gamma/\beta \leq \varsigma$, we have $f_0 < 0$ and $f_{n-1} \geq 0$, which implies $G(\hat{z}) > 0$. If $\varsigma < \gamma/\beta$, we have $f_0 < 0$ and $f_{n-1} < 0$. Upon noticing that $G(\hat{z}) \leq 0$ holds if and only if $\gamma/\beta \geq \tau$, this yields the final two cases in result 4.2.

References

Ackermann, M., Stecher, B., Freed, N. E., Songhet, P., Hardt, W.-D., Doebeli, M., 2008. Self-destructive cooperation mediated by phenotypic noise. Nature 454 (7207), 987–990.
Akçay, E., Van Cleve, J., 2012. Behavioral responses in structured populations pave the way to group optimality. American Naturalist 179 (2), 257–269.
Archetti, M., 2009. The volunteer’s dilemma and the optimal size of a social group. Journal of Theoretical Biology 261 (3), 475–480.
Archetti, M., Scheuring, I., 2011. Coexistence of cooperation and defection in public goods games. Evolution 65 (4), 1140–1148.
Axelrod, R., Hamilton, W., 1981. The evolution of cooperation. Science 211 (4489), 1390–1396.
Bach, L., Helvik, T., Christiansen, F., 2006. The evolution of n-player cooperation–threshold games and ESS bifurcations. Journal of Theoretical Biology 238 (2), 426–434.
Barnard, C., Sibly, R., 1981. Producers and scroungers: a general model and its application to captive flocks of house sparrows. Animal Behaviour 29 (2), 543–550.
Beaumont, H. J. E., Gallie, J., Kost, C., Ferguson, G. C., Rainey, P. B., 2009. Experimental evolution of bet hedging. Nature 462 (7269), 90–93.
Bernasconi, G., Strassmann, J. E., 1999. Cooperation among unrelated individuals: the ant foundress case. Trends in Ecology & Evolution 14 (12), 477–482.
Bourke, A., Franks, N., 1995. Social Evolution in Ants. Princeton University Press, Princeton, NJ.
Boyd, R., Richerson, P. J., 1988. The evolution of reciprocity in sizable groups. Journal of Theoretical Biology 132 (3), 337–356.
Champagnat, N., Ferrière, R., Méléard, S., 2006. Unifying evolutionary dynamics: from individual stochastic processes to macroscopic models. Theoretical Population Biology 69 (3), 297–321.
Christiansen, F. B., 1991. On conditions for evolutionary stability for a continuously varying character. American Naturalist 138 (1), 37–50.
Clobert, J., Danchin, E., Dhondt, A.
A., Nichols, J. D. (Eds.), 2001. Dispersal. Oxford University Press.
Clutton-Brock, T. H., O’Riain, M. J., Brotherton, P. N. M., Gaynor, D., Kansky, R., Griffin, A. S., Manser, M., 1999. Selfish sentinels in cooperative mammals. Science 284 (5420), 1640–1644.
Corning, P. A., 2002. The re-emergence of “emergence”: a venerable concept in search of a theory. Complexity 7 (6), 18–30.
Cressman, R., 2003. Evolutionary dynamics and extensive form games. MIT Press, Cambridge, MA.
Dawkins, R., 1999. The extended phenotype: the long reach of the gene. Oxford University Press, Oxford, UK.
Day, T., Taylor, P. D., 1997. Hamilton’s rule meets the Hamiltonian: kin selection on dynamic characters. Proceedings of the Royal Society B: Biological Sciences 264 (1382), 639–644.
Day, T., Taylor, P. D., 1998. Unifying genetic and game theoretic models of kin selection for continuous traits. Journal of Theoretical Biology 194 (3), 391–407.
Diekmann, A., 1985. Volunteer’s dilemma. Journal of Conflict Resolution 29 (4), 605–610.
Eshel, I., 1983. Evolutionary and continuous stability. Journal of Theoretical Biology 103 (1), 99–111.
Eshel, I., Motro, U., 1981. Kin selection and strong evolutionary stability of mutual help. Theoretical Population Biology 19 (3), 420–433.
Eshel, I., Motro, U., 1988. The three brothers’ problem: kin selection with more than one potential helper. 1. The case of immediate help. American Naturalist 132 (4), 550–566.
Ewens, W. J., 2004. Mathematical Population Genetics. Springer-Verlag, New York, NY.
Farouki, R. T., 2012. The Bernstein polynomial basis: a centennial retrospective. Computer Aided Geometric Design 29 (6), 379–419.
Fletcher, J. A., Zwick, M., 2006. Unifying the theories of inclusive fitness and reciprocal altruism. American Naturalist 168 (2), 252–262.
Frank, S. A., 1994. Kin selection and virulence in the evolution of protocells and parasites. Proceedings of the Royal Society B: Biological Sciences 258 (1352), 153–161.
Frank, S. A., 1998. Foundations of social evolution.
Princeton University Press, Princeton, NJ.
Frank, S. A., 2006. Social selection. Oxford University Press, Oxford, UK, Ch. 23, pp. 350–363.
Fröhlich, K.-U., Madeo, F., 2000. Apoptosis in yeast – a monocellular organism exhibits altruistic behaviour. FEBS Letters 473 (1), 6–9.
Gardner, A., Kümmerli, R., 2008. Social evolution: this microbe will self-destruct. Current Biology 18 (21), R1021–R1023.
Gardner, A., West, S. A., 2006. Demography, altruism, and the benefits of budding. Journal of Evolutionary Biology 19, 1707–1716.
Gardner, A., West, S. A., 2010. Greenbeards. Evolution 64 (1), 25–38.
Gardner, A., West, S. A., Wild, G., 2011. The genetical theory of kin selection. Journal of Evolutionary Biology 24 (5), 1020–1043.
Geritz, S. A. H., Kisdi, E., Meszéna, G., Metz, J. A. J., 1998. Evolutionarily singular strategies and the adaptive growth and branching of the evolutionary tree. Evolutionary Ecology 12, 35–57.
Gillespie, J. H., 1991. The causes of molecular evolution. Oxford University Press, New York, NY.
Godfrey-Smith, P., Kerr, B., 2009. Selection in ephemeral networks. American Naturalist 174 (6), 906–911.
Gokhale, C. S., Traulsen, A., 2010. Evolutionary games in the multiverse. Proceedings of the National Academy of Sciences 107 (12), 5500–5504.
Gore, J., Youk, H., van Oudenaarden, A., 2009. Snowdrift game dynamics and facultative cheating in yeast. Nature 459 (7244), 253–256.
Grafen, A., 1979. The hawk-dove game played between relatives. Animal Behaviour 27, Part 3 (0), 905–907.
Greig, D., Travisano, M., 2004. The prisoner’s dilemma and polymorphism in yeast suc genes. Proceedings of the Royal Society B: Biological Sciences 271 (Suppl 3), S25–S26.
Guilford, T., 1985. Is kin selection involved in the evolution of warning coloration? Oikos 45 (1), 31–36.
Guilford, T., 1988. The evolution of conspicuous coloration. American Naturalist 131, S7–S21.
Hamilton, W., 1964a. The genetical evolution of social behaviour. II. Journal of Theoretical Biology 7 (1), 17–52.
Hamilton, W. D., 1964b.
The genetical evolution of social behaviour. I. Journal of Theoretical Biology 7 (1), 1–16.
Hamilton, W. D., 1970. Selfish and spiteful behaviour in an evolutionary model. Nature 228 (5277), 1218–1220.
Hamilton, W. D., 1971. Selection of selfish and altruistic behavior in some extreme models. In: Eisenberg, J. F., Dillon, W. S. (Eds.), Man and Beast: Comparative Social Behavior. Smithsonian Press, Washington DC, pp. 57–91.
Hauert, C., Michor, F., Nowak, M. A., Doebeli, M., 2006. Synergy and discounting of cooperation in social dilemmas. Journal of Theoretical Biology 239 (2), 195–202.
Hines, W., Maynard Smith, J., 1979. Games between relatives. Journal of Theoretical Biology 79 (1), 19–30.
Johnstone, R. A., Woodroffe, R., Cant, M., Wright, J., 1999. Reproductive skew in multimember groups. American Naturalist 153, 315–331.
Kurokawa, S., Ihara, Y., 2009. Emergence of cooperation in public goods games. Proceedings of the Royal Society B: Biological Sciences 276 (1660), 1379–1384.
Lehmann, L., Keller, L., 2006a. The evolution of cooperation and altruism – a general framework and a classification of models. Journal of Evolutionary Biology 19, 1365–1376.
Lehmann, L., Keller, L., 2006b. Synergy, partner choice and frequency dependence: their integration into inclusive fitness theory and their interpretation in terms of direct and indirect fitness effects. Journal of Evolutionary Biology 19 (5), 1426–1436.
Lehmann, L., Perrin, N., Rousset, F., 2006. Population demography and the evolution of helping behaviors. Evolution 60, 1137–1151.
Lehmann, L., Rousset, F., 2010. How life history and demography promote or inhibit the evolution of helping behaviours. Philosophical Transactions of the Royal Society B: Biological Sciences 365 (1553), 2599–2617.
Lehmann, L., Rousset, F., 2012. The evolution of social discounting in hierarchically clustered populations. Molecular Ecology 21 (3), 447–471.
Leimar, O., Tuomi, J., 1998. Synergistic selection and graded traits.
Evolutionary Ecology 12 (1), 59–71.
Malécot, G., 1975. Heterozygosity and relationship in regularly subdivided populations. Theoretical Population Biology 8 (2), 212–241.
Marshall, J. A. R., 2014. Generalisations of Hamilton’s rule applied to non-additive public goods games with random group size. Frontiers in Ecology and Evolution 2, 40.
Matessi, C., Jayakar, S. D., 1976. Conditions for the evolution of altruism under Darwinian selection. Theoretical Population Biology 9 (3), 360–387.
Maynard Smith, J., 1964. Group selection and kin selection. Nature 201, 1145–1147.
Maynard Smith, J., 1965. The evolution of alarm calls. American Naturalist 99 (904), 59–63.
Maynard Smith, J., Szathmáry, E., 1995. The major transitions in evolution. Oxford University Press, Oxford, UK.
Metz, J., Geritz, S., Meszéna, G., Jacobs, F., van Heerwarden, J., 1996. Stochastic and spatial structures of dynamical systems. North Holland, Amsterdam, Netherlands, Ch. Adaptive dynamics: a geometrical study of the consequences of nearly faithful replication, pp. 183–231.
Motro, U., 1991. Co-operation and defection: playing the field and the ESS. Journal of Theoretical Biology 151 (2), 145–154.
Ohtsuki, H., 2010. Evolutionary games in Wright’s island model: kin selection meets evolutionary game theory. Evolution 64 (12), 3344–3353.
Ohtsuki, H., 2012. Does synergy rescue the evolution of cooperation? An analysis for homogeneous populations with non-overlapping generations. Journal of Theoretical Biology 307 (0), 20–28.
Packer, C., Ruttan, L., 1988. The evolution of cooperative hunting. American Naturalist 132 (2), 159–198.
Peña, J., Lehmann, L., Nöldeke, G., 2014. Gains from switching and evolutionary stability in multi-player matrix games. Journal of Theoretical Biology 346 (0), 23–33.
Pepper, J. W., 2000. Relatedness in trait group models of social evolution. Journal of Theoretical Biology 206 (3), 355–368.
Queller, D. C., 1984. Kin selection and frequency dependence: a game theoretic approach.
Biological Journal of the Linnean Society 23 (2-3), 133–143.
Queller, D. C., 1985. Kinship, reciprocity and synergism in the evolution of social behaviour. Nature 318 (6044), 366–367.
Queller, D. C., 1992. A general model for kin selection. Evolution 46 (2), 376–380.
Queller, D. C., 1994. Genetic relatedness in viscous populations. Evolutionary Ecology 8 (1), 70–73.
Queller, D. C., 2011. Expanded social fitness and Hamilton’s rule for kin, kith, and kind. Proceedings of the National Academy of Sciences 108 (Supplement 2), 10792–10799.
Reuter, M., Keller, L., 2001. Sex ratio conflict and worker production in eusocial Hymenoptera. American Naturalist 158 (2), 166–177.
Rousset, F., 2004. Genetic Structure and Selection in Subdivided Populations. Princeton University Press, Princeton, NJ.
Rousset, F., Billiard, S., 2000. A theoretical basis for measures of kin selection in subdivided populations: finite populations and localized dispersal. Journal of Evolutionary Biology 13, 814–825.
Sachs, J. L., Mueller, U. G., Wilcox, T. P., Bull, J. J., 2004. The evolution of cooperation. The Quarterly Review of Biology 79 (2), 135–160.
Samuelson, P. A., 1954. The pure theory of public expenditure. The Review of Economics and Statistics 36 (4), 387–389.
Sawyer, S., Felsenstein, J., 1983. Isolation by distance in a hierarchically clustered population. Journal of Applied Probability 20 (1), 1–10.
Skyrms, B., 2004. The Stag Hunt and the Evolution of Social Structure. Cambridge University Press, Cambridge, UK.
Stander, P., 1992. Cooperative hunting in lions: the role of the individual. Behavioral Ecology and Sociobiology 29 (6), 445–454.
Sumpter, D. J. T., 2010. Collective Animal Behavior. Princeton University Press, Princeton, NJ.
Taylor, P., 1992. Altruism in viscous populations – an inclusive fitness model. Evolutionary Ecology 6 (4), 352–356.
Taylor, P., Maciejewski, W., 2012. An inclusive fitness analysis of synergistic interactions in structured populations.
Proceedings of the Royal Society B: Biological Sciences 279, 4596–4603.
Taylor, P. D., 1989. Evolutionary stability in one-parameter models under weak selection. Theoretical Population Biology 36 (2), 125–143.
Taylor, P. D., Frank, S. A., 1996. How to make a kin selection model. Journal of Theoretical Biology 180 (1), 27–37.
Taylor, P. D., Irwin, A. J., 2000. Overlapping generations can promote altruistic behavior. Evolution 54 (4), 1135–1141.
Traulsen, A., Nowak, M. A., 2006. Evolution of cooperation by multilevel selection. Proceedings of the National Academy of Sciences 103 (29), 10952–10955.
Trivers, R. L., 1971. The evolution of reciprocal altruism. The Quarterly Review of Biology 46 (1), 35–57.
Van Cleve, J., 2014. Social evolution and genetic interactions in the short and long term. bioRxiv.
Van Cleve, J., Akçay, E., 2014. Pathways to social evolution: reciprocity, relatedness, and synergy. Evolution 68 (8), 2245–2258.
Van Cleve, J., Lehmann, L., 2013. Stochastic stability and the evolution of coordination in spatially structured populations. Theoretical Population Biology 89 (0), 75–87.
Velicer, G. J., 2003. Social strife in the microbial world. Trends in Microbiology 11 (7), 330–337.
Wenseleers, T., Gardner, A., Foster, K. R., 2010. Social evolution theory: a review of methods and approaches. In: Szekely, T., Moore, A., Komdeur, J. (Eds.), Social Behaviour: Genes, Ecology and Evolution. Cambridge University Press, Cambridge, pp. 132–158.
West, S. A., Diggle, S. P., Buckling, A., Gardner, A., Griffin, A. S., 2007. The social lives of microbes. Annu. Rev. Ecol. Evol. Syst. 38, 53–77.
West, S. A., Griffin, A. S., Gardner, A., Diggle, S. P., 2006. Social evolution theory for microorganisms. Nat Rev Micro 4 (8), 597–607.
Wheeler, D. E., 1986. Developmental and physiological determinants of caste in social Hymenoptera: evolutionary implications. American Naturalist 128 (1), 13–34.
White, C. E., Winans, S. C., 2007. Cell–cell communication in the plant pathogen Agrobacterium tumefaciens.
Philosophical Transactions of the Royal Society B: Biological Sciences 362 (1483), 1135–1148.
Wright, S., 1931. Evolution in Mendelian populations. Genetics 16, 97–159.


Figure 1: Three kinds of social traits. Expressers (As) provide a social good at a personal cost, nonexpressers (Bs) do not. The set of recipients (filled circles) of the social good depends on the particular kind of social interaction. a, Whole-group traits. b, Nonexpresser-only traits. c, Expresser-only traits.

[Figure 2 plot: panels (a) to (f); columns whole-group, nonexpresser-only, expresser-only; horizontal axis: scaled relatedness, $\kappa$; vertical axis: probability of playing A, $z$.]

Figure 2: Bifurcation diagrams for whole-group (a, d), nonexpresser-only (b, e), and expresser-only (c, f) traits with geometric benefits, plotting the probability of playing A, $z$, against the scaled relatedness coefficient $\kappa \geq 0$. a, b, c, Negative synergies of scale ($\lambda = 0.7$) and low cost-to-benefit ratio ($\gamma/\beta = 3.\ldots$). d, e, f, Positive synergies of scale ($\lambda = 1.25$) and high cost-to-benefit ratio ($\gamma/\beta = 15$). In all plots, $n = 20$.

Symbol | Definition
A | first of two pure strategies (e.g., “help”)
$a_k$ | payoff to an A-player matched with $k$ A-players and $n-1-k$ B-players
B | second of two pure strategies (e.g., “do not help”)
$B$ | payoff benefit parameter in games without synergies of scale
$\mathcal{B}(z)$ | (fecundity) indirect effect
$b$ | fitness benefit of carrying a social allele
$b_k$ | payoff to a B-player matched with $k$ A-players and $n-1-k$ B-players
$C$ | payoff cost parameter in games without synergies of scale
$c$ | fitness cost of carrying a social allele
$-\mathcal{C}(z)$ | (fecundity) direct effect
$d_k$ | direct gain from switching to a focal matched with $k$ A-players and $n-1-k$ B-players
$D$ | synergy parameter in games without synergies of scale
$e_k$ | indirect gain from switching to a focal matched with $k$ A-players and $n-1-k$ B-players
$f_k$ | inclusive gain from switching to a focal matched with $k$ A-players and $n-1-k$ B-players
$G(z)$ | gain function ($= -\mathcal{C}(z) + \kappa\mathcal{B}(z)$)
$j$ | total number of A-players in a group
$k$ | number of A co-players of a focal individual
$N$ | number of adult individuals in a group
$n$ | number of players (usually $= N$)
$r$ | relatedness coefficient
$\beta_j$ | benefit from the social good when $j$ individuals express helping (play A)
$\beta$ | parameter of the geometric benefits
$z$ | resident strategy (phenotype)
$z_\bullet$ | strategy (phenotype) of a focal individual
$z_{\ell(\bullet)}$ | strategy (phenotype) of the $\ell$-th co-player of the focal individual
$z_\circ$ | average strategy (phenotype) of the neighbors of a focal individual
$\Delta$ | first forward difference operator ($\Delta c_k = c_{k+1} - c_k$)
$\gamma$ | payoff cost of expressing helping
$\lambda$ | parameter of the geometric benefits
$\kappa$ | scaled relatedness coefficient
$\pi$ | average payoff to a focal individual

Table 1: Symbols used in this article.

Social trait | $a_k$ | $b_k$ | $d_k$ | $e_k$ | $f_k$
whole-group | $-\gamma + \beta_{k+1}$ | $\beta_k$ | $-\gamma + \Delta\beta_k$ | $(n-1)\Delta\beta_k$ | $-\gamma + (1 + \kappa(n-1))\Delta\beta_k$
nonexpresser-only | $-\gamma$ | $\beta_k$ | $-\gamma - \beta_k$ | $(n-1-k)\Delta\beta_k$ | $-\gamma - \beta_k + \kappa(n-1-k)\Delta\beta_k$
expresser-only | $-\gamma + \beta_{k+1}$ | $0$ | $-\gamma + \beta_{k+1}$ | $k\Delta\beta_k$ | $-\gamma + \beta_{k+1} + \kappa k\Delta\beta_k$

Table 2: Payoff structures and gains from switching for three kinds of social traits. In each case expressers (A-players) incur a cost $\gamma > 0$, and recipients obtain a benefit $\beta_j \geq 0$ that depends on the number of expressers $j$ they experience. The number of expressers experienced by a focal is $j = k$ if the focal is a nonexpresser, otherwise it is $j = k + 1$. Direct gains ($d_k$) and indirect gains ($e_k$) are calculated by substituting the expressions for $a_k$ and $b_k$ into equations (5) and (6). Inclusive gains from switching ($f_k$) are then obtained from equation (9).

Social trait | $f_k$ | $G(z)$
whole-group | $-\gamma + \beta + \kappa(n-1)\beta$ | $-\gamma + \beta + \kappa(n-1)\beta$
nonexpresser-only | $-\gamma + \kappa(n-1)\beta - (1+\kappa)\beta k$ | $-\gamma + \kappa(n-1)\beta - (1+\kappa)\beta(n-1)z$
expresser-only | $-\gamma + \beta + (1+\kappa)\beta k$ | $-\gamma + \beta + (1+\kappa)\beta(n-1)z$

Table 3: Inclusive gains from switching ($f_k$) and gain functions ($G(z)$) for the case of linear benefits (no synergies of scale). The inclusive gains from switching are obtained by substituting $\beta_j = \beta j$ and $\Delta\beta_j = \beta$ into the corresponding expression from table 2. The gain function is then obtained from equation (10).

Social trait | Synergies of scale | Cost-to-benefit ratio | Convergence stable strategies
whole-group | $\lambda < 1$ | $\gamma/\beta \leq \varepsilon$ | $z = 1$
whole-group | $\lambda < 1$ | $\varepsilon < \gamma/\beta < \vartheta$ | $z^* \in (0, 1)$
whole-group | $\lambda < 1$ | $\gamma/\beta \geq \vartheta$ | $z = 0$
whole-group | $\lambda > 1$ | $\gamma/\beta \leq \varepsilon$ | $z = 1$
whole-group | $\lambda > 1$ | $\varepsilon < \gamma/\beta < \vartheta$ | $z = 0$, $z = 1$
whole-group | $\lambda > 1$ | $\gamma/\beta \geq \vartheta$ | $z = 0$
nonexpresser-only | $\lambda \leq \varrho$ | $\gamma/\beta < \zeta$ | $z^* \in (0, 1)$
nonexpresser-only | $\lambda \leq \varrho$ | $\gamma/\beta \geq \zeta$ | $z = 0$
nonexpresser-only | $\lambda > \varrho$ | $\gamma/\beta < \zeta$ | $z^* \in (\hat{z}, 1)$
nonexpresser-only | $\lambda > \varrho$ | $\zeta \leq \gamma/\beta < \eta$ | $z = 0$, $z_R \in (\hat{z}, 1)$
nonexpresser-only | $\lambda > \varrho$ | $\gamma/\beta \geq \eta$ | $z = 0$
expresser-only | $\lambda < \xi$ | $\gamma/\beta \leq 1$ | $z = 1$
expresser-only | $\lambda < \xi$ | $1 < \gamma/\beta < \varsigma$ | $z = 0$, $z = 1$
expresser-only | $\lambda < \xi$ | $\varsigma \leq \gamma/\beta < \tau$ | $z = 0$, $z_R \in (\hat{z}, 1)$
expresser-only | $\lambda < \xi$ | $\gamma/\beta \geq \tau$ | $z = 0$
expresser-only | $\lambda \geq \xi$ | $\gamma/\beta \leq 1$ | $z = 1$
expresser-only | $\lambda \geq \xi$ | $1 < \gamma/\beta < \varsigma$ | $z = 0$, $z = 1$
expresser-only | $\lambda \geq \xi$ | $\gamma/\beta \geq \varsigma$ | $z = 0$

Table 4: Convergence stable strategies for the three kinds of social traits with geometric benefits. The results hold for whole-group traits if $\kappa > -1/(n-1)$ and for nonexpresser-only and expresser-only traits if $\kappa \geq 0$. For whole-group traits, $\varepsilon$ and $\vartheta$ are given by equation (17). For nonexpresser-only traits, $\varrho$, $\zeta$, $\eta$, and $\hat{z}$ are functions of $\kappa$, $n$, and $\lambda$ (see eq. (19), (20), and (21)). For expresser-only traits, $\xi$, $\varsigma$, $\tau$, and $\hat{z}$ are functions of $\kappa$, $n$, and $\lambda$