Network impact on persistence in a finite population dynamic diffusion model: application to an emergent seed exchange network
Pierre Barbillon, Mathieu Thomas, Isabelle Goldringer, Frédéric Hospital, Stéphane Robin
NNetwork impact on persistence in a finite population dynamicdiffusion model: application to an emergent seed exchangenetwork
Pierre Barbillon ∗† AgroParisTech / UMR INRA MIA,F-75005 Paris, FranceINRA, UMR 518,F-75005 Paris, France Mathieu Thomas ∗ AgroParisTech / UMR INRA MIA,F-75005 Paris, FranceINRA, UMR 518,F-75005 Paris, FranceIsabelle GoldringerINRA, UMR 0320 / UMR 8120G´en´etique V´eg´etale,F-91190 Gif-sur-Yvette, France Fr´ed´eric HospitalINRA, UMR 1313G´en´etique Animale et Biologie Int´egrative,F-78352 Jouy-en-Josas, FranceSt´ephane RobinAgroParisTech / UMR INRA MIA,F-75005 Paris, FranceINRA, UMR 518,F-75005 Paris, France
Abstract
Dynamic extinction colonisation models (also called contact processes) are widelystudied in epidemiology and in metapopulation theory. Contacts are usually assumedto be possible only through a network of connected patches. This network accountsfor a spatial landscape or a social organisation of interactions. Thanks to social net-work literature, heterogeneous networks of contacts can be considered. A major issueis to assess the influence of the network in the dynamic model. Most work withthis common purpose uses deterministic models or an approximation of a stochas-tic Extinction-Colonisation model (sEC) which are relevant only for large networks.When working with a limited size network, the induced stochasticity is essential andhas to be taken into account in the conclusions. Here, a rigorous framework is pro-posed for limited size networks and the limitations of the deterministic approximation ∗ These authors contributed equally to this work. † Corresponding author: 16 rue Claude Bernard, 75231 Paris Cedex 05, [email protected] a r X i v : . [ s t a t . O T ] A p r re exhibited. This framework allows exact computations when the number of patchesis small. Otherwise, simulations are used and enhanced by adapted simulation tech-niques when necessary. A sensitivity analysis was conducted to compare four maintopologies of networks in contrasting settings to determine the role of the network.A challenging case was studied in this context: seed exchange of crop species in theR´eseau Semences Paysannes (RSP), an emergent French farmers’ organisation. Astochastic Extinction-Colonisation model was used to characterize the consequencesof substantial changes in terms of RSP’s social organisation on the ability of the systemto maintain crop varieties. Keywords: metapopulation; social network; finite-population model; sensitivity anal-ysis; seed exchange network.
To deal with the persistence of a metapopulation in a dynamic extinction-colonisationmodel, several studies have used deterministic models where the evolution is describedby differential equations (see Levins, 1969; Hanski and Ovaskainen, 2000; Sol´e and Bas-compte, 2006). These models are grounded on an asymptotic approximation in the numberof patches. The same models are used in epidemiology (SIS: Susceptible Infected Suscepti-ble model). More recently, some studies have dealt with the stochastic effect due to a finiteand limited number of patches/actors. Chakrabarti et al. (2008) have proposed an approx-imation in the stochastic model which leads to conclusions similar to the ones obtainedwith deterministic models. Gilarranz and Bascompte (2012) have shown by simulationsthe impact of stochasticity due to a limited number of patches and they have underscoredthe differences with the results obtained with deterministic models when comparing theability of different networks to conserve a metapopulation. However, their results dependonly on the ratio of the extinction rate to the colonisation rate which is not relevant in astochastic model. Indeed, the same ratio values with different values of the extinction andcolonisation rates can lead to very different situations for the dynamic of the metapopu-lation.In this paper, we study the same stochastic model as Gilarranz and Bascompte (2012).The patches can be in only two states: occupied or empty. The dynamic consists in asuccession of extinction events followed by colonisation events. We provide a rigoroustheoretical basis to this model which explains the different behaviours observed in thesimulations. Indeed, the stochastic model is a Markov chain and its transition matrixcan be constructed as done by Day and Possingham (1995). From this, we deduce thatthere is a unique possible equilibrium which is the absorbing state when all patches areempty. Moreover, the steady state which can be observed where the number of occupiedpatches seems to have reached an equilibrium corresponds to a so-called quasi-stationary2istribution of the Markov chain (Darroch and Seneta, 1965; M´el´eard and Villemonais,2012). On this basis, in order to assess the persistence of a metapopulation, we proposecriteria which are adapted to the stochastic context. In particular, since the metapopula-tion will become extinct in any case, we decide to fix a limited time-horizon and to provideconclusions relying on this time-horizon. We also show the limitations of the asymptoticapproximation. Furthermore, this approach leads to exact computations provided thatthe number of patches is not too large. Otherwise, simulations can be conducted andenhanced by modified simulation techniques when necessary.The goal of this study is to measure the impact of the interaction network whichdescribes the relationships between patches (during colonisation events) on the behaviourof the dynamic model. Following the metapopulation model, the network used to accountfor heterogeneous spatial organisation (Gilarranz and Bascompte, 2012) can also be usedto account for a social organisation (Read et al., 2008). Indeed, this work was designedin the context of social networks of farmers who exchange seeds, an important socialprocess in the diffusion and maintenance of crop biodiversity (reviewed in Thomas et al.,2011). We assume that seeds spread through farmers’ relationships like an epidemiologicalprocess as suggested by Pautasso et al. (2013). Relying on the study of the R´eseauSemences Paysannes (RSP) which is a French network of farmers involved in seed exchangeof heirloom crop species (Demeulenaere et al., 2008; Demeulenaere and Bonneuil, 2011;Thomas et al., 2012), we compare different scenarios of social organisation and attest theireffects on the persistence of one crop variety.In the following, the network is assumed to be non-oriented and is denoted by G . Inthe deterministic work (Hanski and Ovaskainen, 2000) or in approximation of the stochas-tic model (Chakrabarti et al., 2008), the leading eigenvalue of the adjacency matrix of G is sufficient to describe the impact of the network on persistence. We show that this isno longer true in a stochastic model. We propose to study four main network topologieswhich represent really distinct organisations. These topologies are determined by genera-tive models: an Erd˝os-R´enyi model (Erd˝os and R´enyi, 1959), a community model (whereconnection inside a community is more likely than between two patches from differentcommunities), a preferential attachment model (Albert and Barab´asi, 2002) and a “lat-tice” model where nodes all have approximately the same degree.In section 2, a full description and an analysis of the sEC model are provided togetherwith the algorithms used in simulations. The limits of the deterministic approximationare presented. The topologies for networks are detailed in section 3. We conducted asensitivity analysis to measure the impact of the topology in contrasting settings. Theresults are presented in section 4. A motivating application of this work in section 5studied the persistence of one crop variety in a farmers’ network of seed exchange.3 Model
The following notations are used: n number of patches (farms) G interaction network between patches p density of the network n edges number of edges e extinction rate c colonisation rate n gen number of studied generations Z t state of the system Z t number of occupied patches for state Z t P ( Z t > t E ( Z t ) Expected number of occupied patches at generations t The model describes the presence or absence of a crop variety on n different farms (patchesaccording to metapopulation vocabulary) during a discrete time evolution process. Thismetapopulation is identified with a network G with n nodes (farms) and adjacency matrix A = [ a ij ] i,j were a ij = 1 if patches i and j are connected ( i ∼ j ) and 0 otherwise. Thismatrix is symmetric which means that a relation between two patches is reciprocal. Wefurther denote by Z i,t the occupancy of patch i ( i = 1 . . . n ) at time t , namely Z i,t = 1if patch i is occupied at time t and 0 otherwise. The vector Z t = [ Z i,t ] i depicts thecomposition of the whole metapopulation at time t .A time step corresponds to a generation of culture. Between two generations, twoevents can occur: extinction and colonisation with respective rates e and c . Within eachtime step, extinction events first take place and occur in occupied patches independently ofthe others, with a probability e , supposed to be constant over patches and time. Colonisa-tions events then take place and are only possible between patches linked according to thestatic relational network G . An empty patch can be colonised by an occupied patch witha probability c . This probability is also assumed constant over linked patches and timesteps. Thus, the probability that the patch i , if empty at generation t , is not colonisedbetween generations t and t + 1 is equal to (1 − c ) o i,t where o i,t is the number of occupiedpatches at generation t linked to the patch i : o i,t = (cid:80) j a ij Z j,t . This model is similar to theone proposed in Gilarranz and Bascompte (2012) and also to the epidemic model used inChakrabarti et al. (2008). It can also be seen as a particular case of the models discussedin Adler and Nuernberger (1994); Day and Possingham (1995); Hanski and Ovaskainen42000). As recalled in Day and Possingham (1995), the stochastic process ( Z t ) t ∈ N is a discretetime Markov chain with 2 n possible states. The matrices describing the colonisation C and the extinction E can be constructed and the transition matrix of ( Z t ) t ∈ N is obtainedas the product of these two matrices: M = E · C . We assume here that Z t is irreducibleand aperiodic which is ensured if the adjacency matrix A of the social network has onlyone connected component. In the sequel, we denote by λ B,k the k th eigenvalue of anymatrix B . Indeed, the leading eigenvalue of M is λ M, = 1, its multiplicity is 1 and thecorresponding eigenvector is the stationary distribution. If e >
0, this unique stationarydistribution consists of being stuck in one state, denoted by 0 and called absorbing stateor coffin state. The coffin state corresponds to all patches empty which means that thevariety is extinct. Thus, if Z t = 0, for any s > t , Z s = 0. We denote by T the extinctiontime: T = inf { t > , Z t = 0 } . Since the number of states is finite, P z ( T < ∞ ) = 1 for any initial state z ( P z denotesthe probability measure associated with the chain Z t and initial state Z = z ). The secondeigenvalue λ M, governs the rate of convergence toward the absorbing state, i.e. P z ( T > t ) = P z ( Z t >
0) = O ( λ tM, ) . (1)The smaller this eigenvalue is, the faster the convergence is. Hence, we can study theprobability of extinction in a given number of generations or the mean time to extinctionfor an initial condition on occupancies at generation 0, a network and a set of parameters. Although extinction is almost sure, the probability of reaching extinction in a realisticnumber of generations can still be small. In that case, we aim to study the behaviour ofthis dynamic before extinction. In some cases, the Markov chain Z t conditioned to non-extinction { T > t } converges toward a so-called quasi-stationary distribution (Darrochand Seneta, 1965; M´el´eard and Villemonais, 2012). This quasi-stationary distribution ex-ists and is unique provided that Z t is irreducible and aperiodic. Note that quasi-stationarydistribution may also exist in reducible chains (van Doorn and Pollett, 2009). The transi-tion matrix on the transient states, denoted by R , has dimension (2 n − × (2 n −
1) and isdefined as a sub-matrix of M by deleting its first row and its first column, correspondingto the coffin state. If it exists, the quasi-stationary distribution is given by normalizingthe eigenvector of the reduced matrix R associated with its leading eigenvalue λ R, . We5enote by α this distribution over the transient states. It can be noticed that λ R, = λ M, .As stated in Darroch and Seneta (1965); M´el´eard and Villemonais (2012),sup z,z (cid:48) transient states | P z ( Z t = z (cid:48) | T > t ) − α z (cid:48) | = O (cid:32)(cid:18) | λ R, | λ R, (cid:19) t (cid:33) . (2)Therefore, the quasi-stationary distribution is met in practice if the Markov chain con-verges faster toward it than toward the absorbing state which corresponds to | λ R, | /λ R, <<λ R, .Building the transition matrix allows an exact study of the dynamic of the varietypersistence. However, due to its large size: 2 n × n , building such a matrix and seeking itseigenvalues is not possible for n >
10. Therefore, for bigger n , we have to run simulations. Another solution is to use an approximate version of the model as proposed by Chakrabartiet al. (2008). They describe the recurrence relation between the probabilities of occupan-cies at generation t + 1 and these probabilities at generation t . In the computation ofthe recurrence relation, they consider the occupancies of the patches at generation t asindependent of each other. Thus, their relation involves only the n patches and not all ofthe 2 n possible configurations. This approximation leads to the following relation: p i,t +1 = 1 − ζ i,t +1 p i,t e − ζ i,t +1 (1 − p i,t ) , (3)where p i,t is the probability of occupancy of patch i at generation t and ζ i,t is the probabilitythat patch i is not colonised at generation t . The following equation derives from theindependence approximation: ζ i,t +1 = (cid:89) j ∼ i (1 − cp j,t ) . From this approximation, they derive a frontier between a pure extinction and an equi-librium phase depending on e , c and λ A, the leading eigenvalue of the adjacency matrix A of the network. If e/c is above λ A, , a pure extinction shall take place, if it is below, thepatch occupancy shall reach an equilibrium where the number of occupied patches variesaround a constant number. More specifically, if e/c > λ A, , the occupancy probabilities( p i,t ) tend to 0 (0 is a stable fixed point). Moreover, in the case where e/ [ c (1 − e )] > λ A, ,the decay over time of the p i,t is exponential, p i,t = O ((1 − e + c (1 − e ) λ ,A ) t ) for any1 ≤ i ≤ n . Otherwise, if e/c < λ A, , there exists a fixed point with non-zero probabilitiesof occupancies. This non-zero equilibrium clashes with the almost sure convergence of theMarkov chain toward the coffin state. 6igure 1: Number of occupied patches for replications from the dynamic model over 100generations, network fixed and parameters fixed at c = 0 .
05 and e = 0 .
25 (black solidlines) or e = 0 .
05 (grey broken lines). The initial state was chosen such that all patchesare occupied.The frontier e/c = λ A, is also found to be a relevant threshold for persistence in deter-ministic models such as the Levins model (Levins, 1969) and its spatially realistic versions(Hanski and Ovaskainen, 2000; Sol´e and Bascompte, 2006). In a stochastic model, extinc-tion eventually takes place since there is an absorbing state. From the previous statementson the quasi-stationary distribution, observing an equilibrium phase on simulations as inGilarranz and Bascompte (2012) actually corresponds to a phase where the Markov chainrelaxes in its quasi-stationary distribution and does not reach the absorbing state duringthe finite number of generations. As an example, for a network with 100 patches, wepresent two typical cases in Figure 1: when extinction is likely in 100 generations (replica-tions in solid black lines) and when a quasi-equilibrium is reached (replications in brokengrey lines). If the simulations are run long enough, the quasi-equilibrium will be left andthe system will converge to the coffin state. In a stochastic model, an extinction threshold does not make sense. We advocate focusingon quantities such as the extinction probability in a given realistic number of generationsand the mean number of occupied patches at this generation. We aim to study the impact7f the network on persistence through its impact on these two quantities. Moreover,the impact of e and c must also be taken into account and not only through the ratio e/c . Indeed, two settings with the same ratio e/c lead to very different results in astochastic model according to the order of magnitude of e and c . For a fixed network with10 nodes and for a fixed network with 100 nodes, we computed (exactly with 10 nodes,estimated with 100 nodes) the probability of extinction in 100 generations P z ( T ≤ P z ( Z = 0) for different values of e and c . Here, the initial state z is chosen such thatall patches are occupied. The color maps of these probabilities are displayed in Figure 2 A and 2 B . A B
Figure 2: Probabilities of extinction in 100 generations for varying values of e and c : ( A )10 nodes,( B ) 100 nodes.The white line corresponds to the level e/c = λ A, which is the frontier obtained whenusing the large network approximation. As observed in this case of a finite network, thisline fails to separate cases with a high probability of extinction from the others. Further-more, as we want to take the stochasticity of the model due to a finite number of patchesinto account, a threshold would not be relevant. Actually, there exists a fuzzy band wherethere is very little confidence in the behavior of the system.From these remarks, we decided to conduct finite horizon analyses in the followingsections. The time horizon was chosen with respect to the application. Both, the proba-bility of persistence and the mean number of occupied patches were studied to quantifythe impact of the network topology in different settings depending on the values of the pa-rameters e and c . The next section presents methods for simulation when the probabilityof persistence is hard to compute. 8 .4 Methods for simulations Since the model is a Markov chain in a finite state space, simulating is quite easy. Hence,the probability of persistence after 100 generations P ( T > th generation E ( Z ) can be estimated. However, incases where persistence is very likely or very unlikely, a large number of simulations arenecessary to achieve precision in the estimate of the persistence probability. Indeed, wecan face two kinds of rare events: rare extinction, or rare persistence. Some techniquesrelated to the estimation of probabilities of rare events were used. They are based onimportance sampling and interacting particle systems. A very simple interacting particle system (Del Moral and Doucet, 2009) is efficient in thiscase. The idea is to consider simultaneous trajectories (particles) and regenerate the oneswhich have been trapped in the coffin state (extinction) among the surviving particles.
Algorithm 1 • Initialisation: N particles set at Z i = (1 , . . . ,
1) for any i = 1 , . . . , N . • Iterations: t = 1 , . . . , – Mutation
Each particle evolves independently according to the Markov model(obtaining ˜ Z it from Z it − by simulation). – Selection/Regeneration:
If ˜ Z it = 0, then Z it is randomly chosen among thesurviving particles ˜ Z jt (cid:54) = 0. Otherwise Z it = ˜ Z it .Compute E t = (cid:80) Ni =1 I ( ˜ Z it = 0) /N .Note that the product (cid:81) t =1 E t is then an unbiased estimator of P ( T ≤ N must be chosen to ensure that not all the particles dieduring a mutation step. If the probability of persistence is high, a lot of simulations are necessary to observe atleast one extinction. If no extinction is observed, then the estimate of the extinctionprobability is zero. To improve the estimator, we have to make extinction more likely inthe simulation and to apply a correction in the final estimator such that the estimator isstill unbiased. Two methods are proposed to achieve this goal: an importance samplingmethod and a splitting method (Rubino and Tuffin, 2009).9he importance sampling is applied on the extinction phase by increasing the extinc-tion rate e in the simulations. Indeed, since extinction occurs independently in patches,the ratios due to the change in distribution is tractable. Algorithm 2 • Initialisation: Z = (1 , . . . , e IS , . . . , e IS ) with size the number ofgenerations of twisted extinction rate (the twisted rate is not necessarily the samethroughout generations) is chosen. • Iterations: t = 1 , . . . , – Extinction
Extinction is simulated with the corresponding twisted extinctionrate e ISt and the ratio is computed as r t = (cid:18) ee ISt (cid:19) d t · (cid:18) − e − e ISt (cid:19) Z t − − d t , where d t is the number of extinction events which occur at generation t and Z t − − d t gives the number of occupied patches which do not become extinctat generation t . – Colonisation:
Colonisation is applied according to the model.Hence, the unbiased estimator of P ( T ≤ N simulations obtained accordingto the previous algorithm (( Z it ) t ∈{ , } with ratios ( r it ) t ∈{ , } , i = 1 , . . . , N ) is1 N N (cid:88) i =1 100 (cid:89) t =1 r it × I ( Z it = 0) . Since the simulations / particles do not interact, the computation can be done in parallel.Although the variance of this estimator is not tractable in a closed form, it can still beshown that the variance is smaller if the vector ( e IS , . . . , e IS ) is chosen such that e ISt increases with t .Another solution is to use a splitting technique. The rare event, which is extinctionhere, is split into intermediate less rare events. The extinction corresponds to zero occupiedpatches at generation 100. An intermediate rare event is the number of occupied patchesbeing less than a given threshold S at any generation between 0 and 100. A sequence ofthresholds S ≥ S · · · ≥ S p is fixed and the probability of extinction in 100 generationsreads as P ( Z = 0) = P ( ∃ t, Z t = 0)= P ( ∃ t, Z t ≤ S ) × P ( ∃ t, Z t ≤ S |∃ t, Z t ≤ S ) × · · · × P ( ∃ t, Z t = 0 |∃ t, Z t ≤ S p ) , ∃ t means implicitly ∃ t ≤
100 and Z t ≤ S p means Z t ≤ S p .The algorithm will keep the trajectories that have crossed the first threshold (the tra-jectories for which there is at least one state with a number of occupied patches below S ). From these successful trajectories, offspring are generated from the time of the firstcrossing and then are kept if they cross the second threshold and so on. The ratio of thesuccessful trajectories over the total number of simulated trajectories between threshold S m − and S m is used to estimate the probabilities P ( ∃ t, Z t ≤ S m |∃ t, Z t ≤ S m − ). Thesplitting algorithm we use is in a fixed success setting that is to say the algorithm waits fora given number of regenerated trajectories to cross each threshold before moving to thenext threshold. Hence, this setting prevents degeneracy of the trajectories (no trajectorymanages to cross a threshold) and the precision is controlled in spite of the computationaleffort (Amrein and K¨unsch, 2011). Algorithm 3 • Initialisation: N particles set to Z i = (1 , . . . ,
1) for any i = 1 , . . . , N . Choose thesequence of decreasing thresholds S , . . . S p and the number of successes n success . Byconvention, S p +1 = 0. Set the beginning level of trajectories L i = 0 and startingstate Z i = (1 , . . . ,
1) for i = 1 , . . . , n success . • For each threshold S m , 1 ≤ m ≤ p + 1 , set s = 0 and k m = 0 and repeat until s = n succes : – Do k m = k m + 1. – Choose uniformly i ∈ { , . . . , n success } . – Simulate a trajectory from generation L im − at state Z im − : ( Z t ) L im − ≤ t ≤ . – If there exists t such that Z t ≤ S m , do1. s = s + 1,2. L sm = inf { t, Z t ≤ S m } ,3. Z sm = Z L sm .The unbiased estimator of the extinction probability is then: p +1 (cid:89) m =1 n succes − k m − . The fixed success setting ensures the non-degeneracy of the trajectories. However, thereis no control on the complexity of the algorithm. As a by-product, this algorithm alsoprovides estimations of the probabilities that the trajectories cross the intermediate thresh-olds. 11he case of rare extinction is more difficult since there is no obvious method for effi-ciently computing the probability unlike the case of rare persistence. In the two algorithmspresented above, the efficiency relies on the tuning of parameters, namely the twisted ex-tinction rates in Algorithm 2 and the sequence of thresholds in Algorithm 3. In the presentstudy, these parameters have been set manually; the definition of a general tuning strategyis out of the scope of this article.
In the following we assume that the topology of a network accounts for a kind of socialorganisation among patches. The main features of a topology are emphasised in orderto make the differences appear clearly. The topologies we compare are well known inthe literature, but we adapt the simulation models in order to limit the variability bycontrolling the number of edges. Once a number of edges is set (denoted by n edges ), atopology consists in a way to distribute edges. To describe the topology of a network, thedistribution of the degrees of nodes is pertinent. We always work under the constraint ofa network with a single component. The package igraph (Csardi and Nepusz, 2006) in R was used for simulating networks for certain topologies and for plotting. We recall thatwe assume that the network G is non-oriented. This random graph model was introduced by Erd˝os and R´enyi (1959) and is defined asfollows: for a chosen number of edges n edges , the network is constructed by choosinguniformly among all the possible edges (cid:18) n (cid:19) .When the number of nodes is large, the distribution of the degrees of nodes is close toa Poisson distribution (Albert and Barab´asi, 2002). The community model was used to take into account cases where networks are organisedthrough communities. Inside a community, the nodes are connected with a high probabilitywhereas the connection probability is weak between two nodes belonging to two differentcommunities. The spirit of this model is drawn from Stochastic Block Model (Nowicki andSnijders, 2001). The community sizes are set to be equal, the intra-community and theinter-community connection probabilities are the same in order to reduce the number ofparameters for defining such a network. This model is then tuned by the number of edges,the number of communities and the ratio of the intra connection probability over the interconnection probability (this ratio is greater than 1 in order to favour intra connection).12igure 3: Simulation of networks with 100 nodes and 247 edges according to Erd˝os-R´enyimodel ( A ), community model ( B ), lattice model ( C ), preferential attachment model withpower 1 ( D ) and power 3 ( E ). The size of a node is proportional to its degree.13 .3 Lattice model We made an abusive use of the word lattice, by considering graphs with quasi-equaldegrees. The lattice model stands for organizations built on the basis of the spatialneighborhood. We propose a simulation model which is flexible since it works for anynumber of nodes and edges. The main idea is to fix the smallest upper bound on thedegree of a node for given numbers of nodes and edges. This upper bound is computedas the floor integer number of 2 n edges /n . First, a one dimension lattice is created in orderto ensure a single component graph. Then, edges are added sequentially with a uniformdistribution between nodes which have not reached the bound on their degree. Hence, insuch a graph, all nodes have quite the same role and importance. This version of the preferential attachment model was proposed by Barab´asi and Albert(1999). It was designed to model growing networks and to capture the power-law tail ofthe degree distribution which was noticed in real networks in many fields of application.The nodes are added sequentially. In each step, a single node is added and is connectedto the nodes already in the network with probability P (connection to node N k ) ∝ degree ( N k ) b , where the power b is chosen in order to tune the strength of the preferential attachment.This generative model tends to create nodes with a high degree which have a central rolein the network. It is clearly opposed to the lattice model which makes the degrees quasihomogeneous.To be able to fix the number of nodes and the number of edges independently, we drawuniformly a sequence of edges added for each new node with the constraint that there isat least one edge (ensuring the single component network) and that the number of edgesto be added is less than the number of nodes already in the network at a given step. The aim was to conduct a sensitivity analysis based on the chosen outputs of the stochasticmodel (mean number of occupied patches and the persistence probability at generation100: E ( Z ) and P ( Z > e , the colonisation rate c and the graph G . The graph is parameterised by its density d which is equivalent to the number of edges or the mean degree and its topology i.e.the distribution of the edges given a total number of edges. The variation range of theparameters was fixed in order to induce a context of weak, middle and strong extinction14ince the aim is to study the impact of the network topology in contrasted situations (seeTable 1 for the values used for the full factorial design of experiments). The analyseswere done for two cases for n = 10 patches and n = 100 patches. For n = 10, exactcomputations were still achievable while for n = 100, the simulation methods presented in2.4 were used. We ensured that the simulations had reached a sufficient degree of precisionto consider that the part of variability in the outputs due to the estimation method wasnegligible in comparison with the variation due to the input parameter variation. Fivekinds of networks were compared: an Erd˝os-R´enyi network, a community network, a“lattice” network, and two preferential attachment networks with powers 1 and 3. For thecommunity model, only one community setting was studied and the ratio of intra versusinter connection probability was set at 100. When n = 10, the patches were equally splitinto two communities. When n = 100, the patches were equally split into five communities.For each network structure, ten replicate networks were built with randomly generatededges. 10 patches 100 patchese { . , . , . } { . , . , . } c { . , . , . } { . , . , . } d { , , } { , , } Table 1: Values for exploration of the model with 10 patches or 100 patchesThe sensitivity analyses were based on an analysis of variance. The influence of theparameters and their interaction on P ( Z >
0) (actually the logit of this probability)and on E ( Z ) were assessed. The only source of variability was the randomness in thegraph generation. We recall that for n = 10 the computations are exact and for n = 100,the estimates are precise enough to ensure the significance. All these linear models hada coefficient of determination R greater than 99 . e , c and d were by far the most important since a large range of variation was explored foreach of these parameters and since any of these parameters could drive the system toextreme situations where extinction is likely or rare. Nevertheless, the topology was stillsignificant. As suggested by the significance of high order interactions, especially the onesinvolving topology, a topology was not found to be uniformly (whatever the values of e , c or d ) better (according to P ( Z >
0) or to E ( Z )) than the others.The comparison based on E ( Z ) has shown an inversion in the ranking of thetopologies which was similar to the one noticed by Gilarranz and Bascompte (2012). Thisinversion appeared with both n = 10 and n = 100 patches. When the combination ofvalues of e , c and d ensured persistence with a high probability, the best topologies werethose with a better balance in degree distribution such as the lattice, ER and community15opologies. However, although the difference in mean was found significant, the order ofmagnitude of this difference was only of a few patches ( ≈
5) for n = 100 patches. Onthe other hand, the topologies leading to some very connected patches (hub) such as thepreferential attachment topologies (especially when the power parameter is set at 3) max-imized the number of occupied patches when the persistence in the system is threatenedin 100 generations. In that case, the differences were more contrasted between topologies.When the topologies were compared according to P ( Z > n = 10 patches. However, the balanced topologies (lattice, ERand community) were found to be better in settings where the probability of persistence isgreater than 99 . n = 100 patches, we have only observedthe better resistance of preferential attachment topologies and also their crucial role incritical situations where persistence and extinction had pretty much the same probabilityof occurring. However, we were not able to obtain settings where the persistence proba-bility was greater for any of the balanced topologies than for the preferential attachmenttopology, even though we have computed it with Algorithms 2 and 3 for settings wherethe order of magnitude of the persistence probability is 1 − − .Figure 4 A illustrates the crucial role of the topology in the chosen setting. The pref-erential attachment topology with power 3 ensured the persistence probability in 100 gen-erations to be greater than 0 . .
05. Figure 4 B displays the mean number ofoccupied patches conditionally to persistence. This figure shows that the quasi-stationarydistributions for the preferential attachment topologies have led to mean numbers of oc-cupied patches which were close to 15 while these conditional mean numbers were closeto 5 for the other three topologies. Since the number of occupied patches was too close to0 in this quasi-equilibrium, the system had a very low probability of relaxing for a whilein its quasi-stationary distribution. Conclusion
From this study, we have determined that the role of the topology is notalways crucial. But in some settings, it has a key impact on persistence probability andthus on the mean number of occupied patches. Thanks to sharp computations of thepersistence probabilities, the differences between the topologies were also highlighted onthe basis of rare events (rare persistence or rare extinction). The preferential attachmenttopology is generally more resistant according to this probability especially when thepersistence is jeopardized. Nevertheless, concerning the occupancies, balanced topologiescan perform a little better even though they still have a smaller probability of persistencethan do the preferential attachment topologies. As an example, Figure 5 A and 5 B display16 B Figure 4: ( A ) Probability of persistence and ( B ) mean number of occupied patches, invarying t generations (based on 20 replications of the network for a given topology) for n = 100, c = 0 . e = 0 .
25 and d = 30%. COM: community network, ER: Erd˝os-R´enyinetwork, LAT: Lattice network, PA1: preferential attachment network with power 1, PA3:preferential attachment with power 3. P ( Z >
0) and E ( Z ) in a particular setting. Each of these two criteria may rankthe settings including the topology in different orders.The community topology may be a little more sensitive to extinction than are ER orlattice topologies but they are globally equivalent for this dynamic model. Even if it isquite obvious, we mention that it was noticed that the role of the topology is enhancedwhen the density is higher, when c is greater and when the number of patches is greater. Our first application of the sEC model was to describe an emergent farmers’ movementinvolved in seed exchange of crops and vegetables in France. From the beginning ofthe 1990’s in Europe, new farmers’ organisations have emerged with the aim of sharingpractices and seeds (Bocci and Chable, 2008). In a preliminary study, Demeulenaere andBonneuil identified the global social dynamics in the context of the “R´eseau Semences17 B Figure 5: Boxplots of the probabilities of persistence over 100 generations ( A ) and thenumber of occupied patches at generation 100 ( B ), computed with 10 replications of eachnetwork topology. COM: community network, ER: Erd˝os-R´enyi network, LAT: Latticenetwork, PA1: preferential attachment network with power 1, PA3: preferential attach-ment with power 3.Paysannes” (RSP), a French national farmers’ organisation created in 2003 (Demeulenaereand Bonneuil, 2011). They described this social movement highlighting emergent rulesand giving a semi-quantitative picture of the dynamics of this social organisation. Theyfocused their study on one of the RSP’s subgroups specialized in bread wheat ( Triticimaestivum ). Based on informants of the RSP, they identified key actors. They performed10 exhaustive interviews to collect data on which varieties were present in the fields ofthe farmers and from whom they were obtained. Additional information such as to whomfarmers provided varieties was less informed. They completed data collection with 8additional semi-directive interviews and 7 phone interviews. They collected 778 distinctrecords of seed exchange events among 160 actors between 1970 and 2005. These seedexchanges involved 175 different varieties of bread wheat. After pooling all the informationcollected between 1970 and 2005, a directed seed circulation network was drawn where anedge connects two farmers who have carried out at least one seed diffusion event duringthis period. Three connected components were identified: one giant component (152nodes, Fig. 6 A ) and two small ones (5 and 3 nodes respectively, not shown). The averagecolonisation rate ( c ) was estimated as the number of diffusion events per variety per farmerper year. The number ranged over time from 0.03 to 0.66. In the context of an emergent self-organized system, a crucial question is to what extent dochanges in social organisation impact the global ability of the system to maintain varieties?Relying on our knowledge about RSP evolution, three network topologies and two net-work sizes were simulated to represent evolution of this social organisation, assuming thatseed exchange networks were embedded in more complex social networks. Five scenarioswere defined to provide a framework for studying the impact of such social change. They18 B Figure 6: ( A ) Summary network of bread wheat seed circulation among 152 farmersdrawn from data collected based on 10 interviews covering a period from 1970 to 2005.( B ) Subgraph of the reliable seed circulation events from 1970 to 2005 based on the 10interviews and used to estimate ˆ p . Interviewed people are in dark grey and mentionedpeople in light grey.are described in the next section. Then, the sEC model was used to represent the dynamicprocess of seed circulation and extinction for each network topology. A dedicated sensi-tivity analysis was designed to explore specific ranges of e and c in a short time window.Working at this time scale was motivated by the rapid change in social organisation ofsuch systems. The probability of persistence and the expected number of occupied patcheswas assessed for each scenario to compare the ability of the system to maintain the varietycirculating in the network.Using the sEC model in the context of seed systems assumed that farmers alwayswanted to recover the variety after losing it. In addition, we assumed that all farmers hadthe same behaviour, having the same ability: 1) to host a seed lot of a variety throughthe seed diffusion process (uniform colonisation probability, c ) and 2) to lose it through astochastic process of extinction (uniform extinction probability, e ). This assumption wasmade to highlight the position in the network independently of individual characteristics. Rapid evolution of the social organisation can be qualitatively depicted through threemain phases. This description relies on a participant observation study (Demeulenaereand Bonneuil, 2011) during farmers’ meetings between 2003 and 2012. During the first19hase, several dozen farmers were invited to participate in national meetings. At thebeginning, only a few exchanged seed with a limited knowledge of each other. For thisreason, we assumed random seed exchanges among farmers using an Erd˝os-Renyi network(ER). After several meetings, a few farmers became more popular and more central inseed diffusion. We modelled this stage using a preferential attachment (PA) algorithm fordesigning the network topology and accounting for a more important role of a few farmers.Eventually, the number of farmers exchanging seeds highly increased. At the same time,a change was observed from national meetings to more local events thanks to the creationof local associations involved in seed exchange (from 0 to 17 between 2003 and 2012).We considered the community model (COM) following the stochastic block model as anappropriate network model to mimic this new organisation with most seed exchange at thelocal scale (within groups) and rare events at the global scale (long distance and amonggroups). Based on these observations, five scenarios were defined for analysing the impactof change in social organisation on the ability of the self-organised system to maintainvarieties (Table 2): •
1: random seed exchanges among few farmers (ER:50) •
2: scale-free seed exchanges among few farmers (PA:50) •
3: community-based seed exchanges among many farmers (COM:500) •
4: random seed exchanges among many farmers (ER:500) •
5: scale-free seed exchanges among many farmers (PA:500)First, we compared the results obtained for scenarios 1 and 2 to study the evolutionof the system capacity to maintain varieties after a change in social organisation in thecontext of small size population. Then, scenario 3 was compared to scenarios 4 and 5 tounderstand the consequence of a new social configuration in maintaining crop diversityafter an increase in the network size.
This additional sensitivity analysis is required to draw conclusions about the scenarios inthe context of the RSP study (different number of patches, ranges of e and c and a shortertime horizon). To initialize the simulations, each actor owned only one and the samevariety. We defined three levels of event frequency to mimic different global behavioursin terms of seed circulation and maintenance: low frequency with e = 0 .
1, intermediatefrequency with e = 0 . e = 0 .
8. It was chosen to investigate twovariety statuses: popular and rare varieties. We considered that rare varieties were lessdiffused with an e/c ratio of 5 compared to 1 for the popular ones. We fitted the density20f the small network ( n = 50) to the density of the observed network (6 B ) which givesˆ p = 0 . − p =ˆ p / (500 /
50) = 0 . P ( Z > E ( Z ). The choice of 30 generationscorresponded to the time scale of observed seed exchange. In addition, we considered thatsuch social organisation in the context of emergent social movements is rapidly evolvingwithout reaching a real equilibrium. Longer simulations seemed to be less informative forunderstanding properties of this self-organised system. Scenario Comparison n vertex n edge topo e ratio e/c
1a 1 50 263 ER { .
1; 0 .
5; 0 . }
11b 1 50 263 ER { .
1; 0 .
5; 0 . }
52a 1 50 263 PA { .
1; 0 .
5; 0 . }
12b 1 50 263 PA { .
1; 0 .
5; 0 . }
53a 2 500 2682 COM ∗ { .
1; 0 .
5; 0 . }
13b 2 500 2682 COM ∗ { .
1; 0 .
5; 0 . }
54a 2 500 2682 ER { .
1; 0 .
5; 0 . }
14b 2 500 2682 ER { .
1; 0 .
5; 0 . }
55a 2 500 2682 PA { .
1; 0 .
5; 0 . }
15b 2 500 2682 PA { .
1; 0 .
5; 0 . } Table 2: Description of the 5 scenarios ( ∗ :the COM model is defined for 10 groups of 50farmers with a probability of connecting people from the same community 10 times higherthan the probability of connecting two people from different communities). Small networks: scenarios 1 and 2
The change in network topology was the maindifference between scenarios 1 and 2. For the popular varieties, ( e/c = 1), PA only showeda lower probability of persisting ( P ( Z > E ( Z )) for the higher values of e and c compared to ER (Table 3). For rarevarieties, more susceptible to extinction ( e/c = 5), an inversion between ER and PA wasobserved in terms of ability to maintain the resource ( P ( Z > e and c , before decreasing to zero for the highest values (Table 3). This trendwas confirmed by the expected number of occupied patches.21 P ( Z > E ( Z ) e/c = 1 0.1 ER = P A = 1 ER ∼ P A = 440.5 ER = P A = 1 ER (cid:38) P A = 440.8 ER = 0 . > P A = 0 . ER = 37 > P A = 25 e/c = 5 0.1 ER = P A = 1
P A (cid:38) ER = 250.5 P A = 0 . (cid:29) ER = 0 . P A = 13 (cid:29) ER = 30.8 P A = ER = 0 P A = ER = 0 Table 3: Summary results of persistence probability ( P ( Z > E ( Z )) for 50 farmers, with ∼ : difference lowerthan 2%, > : difference between 10 − (cid:29) : difference higher than 50%. e P ( Z > E ( Z ) e/c = 1 0.1 P A = ER = COM = 1 ER ∼ COM (cid:38)
P A = 4250.5
P A = ER = COM = 1 ER ∼ COM (cid:38)
P A = 4270.8
P A ∼ ER = COM = 1 ER ∼ COM = 382 > P A = 314 e/c = 5 0.1
P A = ER = COM = 1 ER ∼ COM ∼ P A = 2490.5 ER ∼ COM ∼ P A = 1
P A = 193 (cid:29) ER (cid:29) COM = 400.8
P A = 0 . (cid:29) ER = COM = 0
P A = 43 > ER = COM = 0
Table 4: Summary results of persistence probability ( P ( Z > E ( Z )) for 500 farmers, with ∼ : difference lowerthan 2%, (cid:38) : difference between 2 − > : difference between 10 − (cid:29) : differencehigher than 50%.In both cases ( e/c = 1 and e/c = 5), the results are consistent with section 4: thebalanced distribution of degree in ER networks conferred a higher persistence probabilityand a higher relative occupancy compared with a more hierarchical organisation (PA)in the context of safe situations. It is the heterogeneity of the degree distribution thatconferred more persistence in the context of critical extinction. Larger networks: scenarios 3, 4 and 5
They are characterized by the larger numberof actors. COM configuration was compared to initial topologies: ER and PA networks.Simulation results showed an equivalent P ( Z >
0) = 1, whatever the frequency of event(low or high c and e ) and the network topology (Table 4) for popular varieties ( e/c = 1).Thus, with such parameter values it was not likely that a variety disappeared whatever thetopology. Increasing the number of actors from 50 to 500 induced a substantially higherexpected number of occupied farms for ER and COM topologies compared to PA (Table4). The opposite behavior was observed for rare varieties ( e/c = 5). We noticed that ERand COM provided similar results whatever the conditions.22 .6 Role of the social network topology on variety persistence and rec-ommendations Role of the social network topology
Different factors influenced the distributionand persistence of varieties. The number of farmers as well as e and c parameters wereobviously the most important ones. The increase in the number of participating farmersfrom the creation of the RSP has substantially improved the probability of maintainingrare and popular varieties within the system. The network topology did not always havean incidence on persistence. When it was the case, it was not always the same topologythat outperformed the others depending on the situation. Such behaviour depended onthe status of the variety under consideration. Popular varieties were better maintainedwith ER or COM topologies because of the balanced degree that avoids local extinctions,whereas rare varieties persisted better with PA topology. In the case of rare varieties, PAtopology with few farmers as hubs allowed the variety to be quickly redistributed throughthe network after local extinctions.Relying on simulation results, we showed that the self-organized trajectory of the RSPfrom small ER, then to small PA to large COM improved the efficiency of the systemat maintaining popular varieties compared with rare varieties. The COM network modelseems to be a realistic topology for large networks since local meetings with a subset of thefarmers are easier to organize. In this context, a community is driven by the local meetingsand farmers participating in the same meeting are likely to be connected. In COM, veryfew farmers are linked to farmers from other communities. Nevertheless, we observed thatit led to the same ability for persistence and occupancy as did the ER network. Thesefindings illustrated the independence between the ability to maintain a variety and theconnection within community and across communities in the context of popular varieties.A detailed sensitivity analysis of the COM parameters would allow one to assess whethersome COM configurations depart from ER behaviour. This sensitivity analysis could beextended to the study of a mixture model with COM and PA topologies. Such topologieswould allow one to model an even more realistic situation accounting for a persistentlyhigher degree of a few farmers within and across communities.It was not possible to forecast the behaviour of the model using only the e/c ratio dueto complex interactions with the size of the network and the type of topology. Recommendations for future studies on seed systems
Sensitivity analysis on theextinction-colonisation model confirmed that e and c parameters were the most contribut-ing factors to the ability of the system to maintain a variety. Topological parameters likethe density of the network also looked important. Unfortunately, such data have not yetbeen collected. One of the reasons is that seed systems are rapidly changing, often infor-mal or even illegal depending on the country legislation. Nevertheless, our work showedthat a good knowledge of event frequency (extinction rate and seed exchange rate), of thestatus of the variety (rare or popular) in addition to the density of the social network could23rovide important clues to the health of the seed system. Thus, particular attention hasto be paid to these particular quantities in future studies to strengthen our understandingof the sustainability and health of seed systems. This study aimed to investigate the role of the network topology in a dynamic extinctioncolonisation model. Obviously, the number of edges was the most important feature ofthe network. The topology (distribution of the edges) impact was less important but notnegligible and its impact depended on the other parameters (extinction rate e , colonisationrate c and number of edges). In section 2, we have highlighted the limits of describinga stochastic dynamic extinction colonisation model only by the ratio e/c . As noticedin section 4, if the relevant parameters led to a probable extinction, networks with highdegree nodes (PA) were more resistant than networks with balanced degrees (LAT, ERor COM). On the contrary, if persistence was quite certain, more patches were occupiedin balanced networks than in the PA networks. The community structure (COM) andER showed similar properties with respect to persistence and occupancy. These resultsobtained for small networks and after a short time period were consistent with thoseobtained for large networks after reaching a quasi-equilibrium state as shown by Gilarranzand Bascompte (2012). Nevertheless, the necessity to properly estimate the persistenceprobability and the expected occupancy in strongly stochastic conditions was pointed outand a specific procedure was provided. Franc (2004); Peyrard et al. (2008) proposed moreaccurate approximations of the sEC model to determine the behaviour of the system closeto critical situations. They demonstrated the importance of particular geometrical featuresof the network such as the clustering coefficient and the square clustering coefficient fordescribing the impact of the network in the evolution of the system. Further studies shouldbe conducted to determine the role of these features in a limited network size.Such work could contribute to feeding the thoughts for further discussions with farmerorganisations and community seed systems, particularly on the way to monitor popularand rare varieties circulating in the system. This study improved our understanding of therole of the social organisation in maintaining crop diversity in such emergent self-organisedsystems. The authors of the paper want to thank the farmers of the R´eseau Semences Paysanneswho accepted to devote a part of their time discussing the questions of seed exchange.The authors also thank the participants of the NetSeed and MIRES consortia for theirfruitful discussions. NetSeed is a project funded by the Fondation pour la Recherche surla Biodiversit´e (FRB), Paris, France and hosted by the Centre de Synth`ese et d’Analyse24ur la Biodiversit´e (CESAB), Aix-en-Provence, France. MIRES is a working group onmodelling seed exchanges funded by the R´eseau National des Syst`emes Complexes, Paris,France. Mathieu Thomas is a postdoctoral fellow who obtained the grant “Contrat JeunesScientiques” from the Institut National de la Recherche Agronomique, France.25 eferenceseferences