Refined Mean Field Analysis of the Gossip Shuffle Protocol -- extended version --
Nicolas Gast, Diego Latella, and Mieke Massink
INRIA, France
Consiglio Nazionale delle Ricerche - Istituto di Scienza e Tecnologie dell’Informazione ‘A. Faedo’, CNR, Italy
Abstract.
Gossip protocols form the basis of many smart collective adaptive systems. They are a class of fully decentralised, simple but robust protocols for the distribution of information throughout large scale networks with hundreds or thousands of nodes. Mean field analysis methods have made it possible to approximate and analyse performance aspects of such large scale protocols in an efficient way. Taking the gossip shuffle protocol as a benchmark, we evaluate a recently developed refined mean field approach. We illustrate the gain in accuracy this can provide for the analysis of medium size models, analysing two key performance measures. We also show that refined mean field analysis requires special attention to correctly capture the coordination aspects of the gossip shuffle protocol.
Keywords:
Refined Mean Field; Collective Adaptive Systems; Discrete Time Markov Chains; Gossip protocols; Self-organisation.
Many collective adaptive systems rely on the decentralised distribution of information. Gossip protocols (also known as epidemic or random walk protocols) have been proposed as a paradigm that can provide a stable and reliable method for such decentralised spreading of information [23,6,3,17,9,8,4,2,22]. Gossip protocols are able to scale up to the very large environments that collective adaptive systems are envisioned for. The basic mechanism of information spreading followed by a gossip protocol is that nodes exchange part of the data they keep in their cache with randomly selected peers in pairwise synchronous communications on a regular basis.

Interesting performance aspects of such gossip protocols are the diffusion or replication of a newly inserted fresh data element in a network and the dynamics of network coverage. Diffusion or replication of a data element occurs when nodes exchange the data element in pairwise communication. Two relevant measures are of interest in this case. One is the fraction of the population that has the data element in its cache at a certain point in time (replication). The other concerns network coverage (coverage), i.e. the fraction of the population of network nodes that have “seen” the data element since its introduction into the network, even if they may no longer have it in their cache due to further exchanges with other peers.

Traditionally, these performance measures have been studied based on simulation models. However, when large populations of nodes are involved, such simulations may be very resource consuming. Recently these protocols have been studied using classic mean field approximation techniques [2,1]. In that classic approach the full stochastic model of a gossip network, i.e. one in which each node is modelled individually, is replaced by a much simpler model in which the pairwise synchronous interactions between individual nodes are replaced by the average effect that all those interactions have on a single node, and then the model of this single node is studied in the context of the overall average network behaviour. Of course, the average effects may change over time as nodes may change their local states. This is taken into account in a mean field model by letting the probabilities of interactions possibly depend on the fraction of nodes that are in a particular local state. Compared to traditional simulation methods, mean field approximation techniques scale very well to large populations because these techniques are independent of the exact population size. This method of derivation of a mean field model from a large population of interacting objects relies on what is known as the assumption of “propagation of chaos” (also called “statistical independence” or “decoupling of joint probabilities”) [20,7,10,18]. The assumption is based on the fact that when the number of interacting nodes becomes very large, their interactions tend to behave as if they were statistically independent.

However, in reality, we are not always dealing with huge systems, but rather with medium size ones. These are still resource intensive when analysed using simulation and, unfortunately, the classical mean field approximation is less accurate for such medium size systems. For example, in Fig. 1 the results of classical mean field approximation are shown together with a Java-based simulation of the protocol for a medium size gossip system with 2500 nodes where initially one node has a new data element that will spread over the network by gossiping.
It is easy to see that there is a discrepancy between simulation and classic mean field approximation, both for replication of the data element and for the coverage, even in this not so small system.

Note that mean field techniques are independent of the exact population size only as long as this size is large enough to obtain a sufficiently accurate approximation. The computational complexity of these techniques does depend on the number of local states of an object in a population.

In this paper we revisit an analysis of the gossip shuffle protocol using a refined mean field approximation for discrete time population models that we developed in [12,13], and which was in turn inspired by an earlier result for continuous time population models in [11]. The gossip shuffle protocol was analysed in detail by Bakhshi et al. in [4,2,1] both analytically and by using classic mean field approximation in [2,1] and, more recently, by using on-the-fly mean field discrete time model checking techniques in [19]. The present paper is an extended version of the short paper [14].

[Figure 1: number of replicas (left) and number of nodes (right) against time; curves: protocol simulation, classic mf]
Fig. 1. Replication (left) and Coverage (right) for one new data element in a network with N = 2500. Average of 500 simulation runs of the Java simulator [1]. Vertical bars show standard deviation for the simulation.

Contributions
The main contribution of this paper is a novel benchmark (clock-synchronous) DTMC population model of the gossip shuffle protocol analysed using our refined mean field analysis [12,13]. In particular:
– We show that with refined mean field approximation better accuracy can be obtained compared to classical mean field approximation for medium size populations for this gossip protocol, but that this requires a novel model that reflects the synchronisation effects of the pairwise interaction of the original protocol.
– The developed model is parametric in $G_{max}$, i.e. the number of steps a node remains passive in between active interactions with peer gossip nodes.
– The results we obtained are very close both to those of independent Java-based simulation from the literature in [2] (taken as “ground truth”) and to those of the event simulation of the model itself, but with the advantage that the refined mean field approximation is several orders of magnitude faster to obtain and independent of the system size.
– Development of a proof-of-concept implementation in F [...] the analysis time is independent of the population size.

The analysis is orders of magnitude faster than discrete event based simulation. Therefore it is an interesting candidate for being integrated with other analysis approaches such as (on-the-fly) mean field model checking, which is planned in future work. The current study aims at providing further insight into the feasibility of applying the refined mean field approach, which involves symbolic differentiation, to larger benchmark examples, and into the possible complications of such an analysis that need to be taken into consideration.
The outline of the paper is as follows. The relevant aspects of the gossip shuffle protocol are briefly recalled in Section 2. The refined mean field approach used in this paper applies to the classical population model of [20,10,18] and is briefly recalled in Section 3. Section 4 presents full and aggregated classical mean field models of the protocol which form the starting point for the novel gossip model suitable for refined mean field approximation presented and analysed in Section 5. Section 6 presents conclusions.
We briefly recall the main aspects of the gossip shuffle protocol described in [15,1,2] that serves as our benchmark. This particular version has been extensively studied by Bakhshi et al., leading to an analytical model of the gossip protocol [3], a classical mean field model [2] and a Java implementation of a simulator for the protocol [2], which makes it a very suitable real-world application against which new results can be compared. In the following we recall some main aspects of the shuffle gossip protocol and the Java simulator. Further details can be found in [2,1].
The gossip shuffle protocol distributes data items throughout a network of small devices. Such networks typically consist of a very large collection of nodes. Each node has a limited amount of storage space (called its cache) for the data items. At any instant, gossip nodes are divided into two classes: active and passive nodes. Active nodes can initiate a shuffle, i.e. an exchange of data between two peers, by contacting a passive neighbouring node and exchanging part of their data. Such a passive node is selected through an underlying layer that keeps track of which nodes are active or passive. This layer is not explicitly modelled; for example, in wireless environments such passive peers may be determined by the radio connectivity between nodes.

Each gossip node maintains a finite list of data items in its cache. Both the active node and its passive partner exchange a random subset from their local caches in one atomic peer-to-peer communication session. Given the limited size of the cache, a node may have to discard some items it receives. This is done in such a way that no information is lost in the network, i.e. a node discards items selected among those that it has just sent to its peer and does not discard new items it has just received from the peer. Fig. 2 recalls the pseudo code of a generic shuffle protocol (adapted from [1]).

    (a) An active node A
        while true do
            wait(Δt time units)
            B := randomPeer()
            s_A := itemsToSend(c_A);
            send s_A to B;
            s_B := receive(·);
            c_A := itemKeep(c_A \ (s_A \ s_B), s_B \ c_A);

    (b) A passive contacted node B
        while true do
            s_A := receive(·);
            s_B := itemsToSend(c_B);
            send s_B to sender(s_A);
            c_B := itemKeep(c_B \ (s_B \ s_A), s_A \ c_B);

Fig. 2. Pseudo code of a generic shuffle protocol (adapted from [1]). $c_A$ and $s_A$ denote the cache and selection of active node A. Similarly, $c_B$ and $s_B$ denote those of passive node B. $\Delta t = G_{max}$. The operation ‘itemsToSend($c_i$)’ selects the items to be sent from the cache $c_i$. The operation ‘itemKeep(c,s)’ in node A decides which items to keep in the cache (c), removing from the cache those selected for sending ($s_A$) except those that were received from B ($s_B$), and adding to those the elements from $s_B$ that were not yet in the cache of A. Similarly for the operation in node B.

Two key measures that are of interest for this protocol are the transient aspects of the replication of a newly introduced element in the network and that of the coverage of the network, i.e. the fraction of network nodes that have seen the new data element as time passes. These measures depend on a number of characteristics of the network. In the following we use $N$ to denote the size of the network, i.e. the number of gossiping nodes, $n$ to denote the number of different data items in the network, $c$ to denote the size of the cache and $s$ to denote the size of the selected items from the cache to be exchanged with a neighbour. In the context of this work, and for comparison with the results presented in [1], the network is assumed to be fully connected. We consider a discrete time variant of the protocol with a maximal delay between two subsequent active data-exchanges of a node denoted by $G_{max}$.

To assess the quality of classic mean field approximation results, Bakhshi et al. developed a Java-based implementation of a simulator for the shuffle protocol with which networks of various sizes can be simulated on a single processor [2]. In this paper we also adopt the results produced by this simulator, the source code of which was generously shared with us by the developers, as the “ground truth” with which to compare our own results. This simulator works as follows. It takes the network size $N$, the size of the storage $c$, the number of messages exchanged in each shuffle $s$, and the total number of different data elements in the network $n$. It divides all network nodes into $G_{max} + 1$ different groups, each representing a different value of the gossip delay.
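The exchange of Fig. 2 between an active node A and a passive node B can be illustrated with a small Python sketch. This is only an illustration under assumptions, not the Java simulator of [2]: the function names `items_to_send`, `item_keep` and `shuffle` are ours, caches are modelled as sets, both caches are assumed to hold at least s items, and free slots are refilled with items that were just sent (our reading of the rule that a node only discards items it has just sent).

```python
import random


def items_to_send(cache, s, rng):
    """Select s random items from the cache (assumes len(cache) >= s)."""
    return set(rng.sample(sorted(cache), s))


def item_keep(cache, sent, received, capacity, rng):
    """Cache update after a shuffle.

    Keeps everything that was not sent plus everything received, which equals
    (cache \\ (sent \\ received)) united with (received \\ cache) as in Fig. 2.
    Assumption: remaining free slots are refilled with items that were sent.
    """
    kept = (cache - sent) | received
    free = capacity - len(kept)
    candidates = sorted(sent - kept)
    refill = rng.sample(candidates, min(free, len(candidates)))
    return kept | set(refill)


def shuffle(cache_a, cache_b, s, capacity, rng):
    """One atomic pairwise exchange between active node A and passive node B."""
    s_a = items_to_send(cache_a, s, rng)
    s_b = items_to_send(cache_b, s, rng)
    new_a = item_keep(cache_a, s_a, s_b, capacity, rng)
    new_b = item_keep(cache_b, s_b, s_a, capacity, rng)
    return new_a, new_b


rng = random.Random(1)
cache_a = set(range(0, 10))    # items 0..9
cache_b = set(range(5, 15))    # items 5..14
new_a, new_b = shuffle(cache_a, cache_b, 4, 10, rng)
```

For two full caches the sketch keeps both caches full and preserves the union of their contents, which corresponds to the property that no information is lost in the network.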
Recall that the maximal period between two consecutive contact initiations of any particular network node is $G_{max}$. The nodes in the group with gossip delay equal to zero are the active nodes, i.e. those that initiate contact with their peers in the current round uniformly at random. If an active node contacts a node that is already in contact with another node, the interaction between all three nodes fails, leading to a collision. At the start of the simulation, a new data item is introduced in the network (i.e. one different from the $n$ types of data elements that are already present in the network and that are assumed to be uniformly distributed over the local caches of all network nodes). After each round, the total number of copies of the new data element in the network (replication) and the number of nodes that have seen the data element (coverage) are measured.

In the sequel we use theoretical results on discrete time mean field approximation [20,7,12]. We briefly recall the notation and main results in the following. We consider a population model of a system composed of $N \in \mathbb{N}$, $N > 0$, identical interacting objects, i.e. a (model of a) system of size $N$. We assume that the set $\{0, \ldots, n-1\}$ of local states of each object is finite; we refer to [12] for a discussion on how to deal with infinite dimensional models. Time is discrete and the behaviour of the system is characterised by a (time homogeneous) discrete time Markov chain (DTMC) $X^{(N)}(t) = (X^{(N)}_1(t), \ldots, X^{(N)}_N(t))$, where $X^{(N)}_i(t)$ is the state of object $i$ at time $t$, for $i = 1, \ldots, N$.

The occupancy measure vector at time $t$ of the model is the row-vector DTMC $M^{(N)}(t) = (M^{(N)}_0(t), \ldots, M^{(N)}_{n-1}(t))$ where, for $j = 0, \ldots, n-1$, the stochastic variable $M^{(N)}_j(t)$ denotes the fraction of objects in state $j$ at time $t$, over the total population of $N$ objects:

$$M^{(N)}_j(t) = \frac{1}{N} \sum_{i=1}^{N} 1_{\{X^{(N)}_i(t) = j\}}$$

and $1_{\{x = j\}}$ is equal to 1 if $x = j$ and 0 otherwise. At each time step $t \in \mathbb{N}$ each object performs a local transition, possibly changing its state. The transitions of any two objects are assumed to be independent from each other, while the transition probabilities of an object may depend also on $M^{(N)}(t)$. Thus, for large $N$, the probabilistic behaviour of an object is characterised by the one-step transition probability $n \times n$ matrix $K(m)$, where $K_{ij}(m)$ is the probability for the object to jump from state $i$ to state $j$ when the occupancy measure vector is $m \in U^n$; $U^n$ is the unit simplex of $\mathbb{R}^n_{\geq 0}$, that is $U^n = \{ m \in [0,1]^n \mid \sum_{i=1}^{n} m_i = 1 \}$. In this paper, for simplicity, we assume $K(m)$ to be a continuous function of $m$ that does not depend on $N$. In the sequel, for reasons of presentation, we provide a graphical specification of the relevant models. The computation of matrix $K(m)$ from such a model specification is straightforward.

Below we recall Theorem 4.1 of [20] on classic mean field approximation, under the simplifying assumptions mentioned above:
Theorem 4.1 of [20] (Convergence to Mean Field)
Assume that the initial occupancy measure $M^{(N)}(0)$ converges almost surely to the deterministic limit $\mu(0)$. Define $\mu(t)$ iteratively by (for $t \geq 0$):

$$\mu(t+1) = \mu(t)\, K(\mu(t)). \qquad (1)$$

Then for any fixed time $t$, almost surely, $\lim_{N \to \infty} M^{(N)}(t) = \mu(t)$.

The above result thus allows one to use, for large $N$, a deterministic approximation $\mu$ of the average behaviour of a discrete population model.

In [12] we proposed a refined mean field method for discrete time population models that has been shown to provide a considerably better approximation than classic mean field in the case of population models with a medium population size $N$. This work was inspired by the development of a refined mean field approximation for continuous time population models in [11]. Before recalling the theoretical results for the refined mean field approximation technique for discrete time models we introduce some further basic notation.

$\mathbb{R}^n_{\geq 0}$ denotes the set of $n$-tuples (i.e. $1 \times n$ matrices) of non-negative real numbers. For an $n \times m$ matrix $A$ we let $A^T$ denote its $m \times n$ transposed matrix. For function $f : \mathbb{R}^n \to \mathbb{R}^p$ continuous and twice differentiable, let the $p \times n$ (function) matrix $Df(m)$ and the $p \times n \times n$ tensor $D^2f(m)$ denote its first and second derivatives, respectively:

$$(Df(m))_{ij} = \frac{\partial f_i(m)}{\partial m_j} \quad\text{and}\quad (D^2f(m))_{ijk} = \frac{\partial^2 f_i(m)}{\partial m_j\, \partial m_k}.$$

Let function $\Phi : \mathbb{N} \to U^n \to U^n$ be defined as follows:

$$\Phi_0(m) = m; \qquad (\Phi_1(m))_j = \sum_{i=0}^{n-1} m_i K_{ij}(m); \qquad \Phi_{t+1}(m) = \Phi_1(\Phi_t(m)).$$

Note that $\Phi_1(\Phi_t(m)) = \Phi_t(\Phi_1(m))$ and that, for $\mu(t)$ defined as in Equation (1), we have $\mu(t+1) = \Phi_1(\mu(t)) = \Phi_{t+1}(\mu(0))$; so, function $\Phi$ makes explicit the dependence of $\mu(t)$ on the initial occupancy measure vector $m$.
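Equation (1) can be iterated directly. The following Python sketch is a minimal illustration of the classic mean field iteration; the two-state kernel `k_example` and its parameters are hypothetical and are not taken from the gossip model, and the function names `phi` and `mean_field` are ours.

```python
def phi(m, kernel):
    """One mean field step: (Phi_1(m))_j = sum_i m_i * K_ij(m)."""
    k = kernel(m)
    n = len(m)
    return [sum(m[i] * k[i][j] for i in range(n)) for j in range(n)]


def mean_field(mu0, kernel, steps):
    """Return [mu(0), ..., mu(steps)] with mu(t+1) = mu(t) K(mu(t))."""
    trajectory = [list(mu0)]
    for _ in range(steps):
        trajectory.append(phi(trajectory[-1], kernel))
    return trajectory


def k_example(m):
    """Hypothetical 2-state kernel: state 0 = uninformed, state 1 = informed."""
    p = 0.5 * m[1]   # probability to become informed grows with m[1]
    q = 0.1          # constant probability to forget
    return [[1.0 - p, p],
            [q, 1.0 - q]]


trajectory = mean_field([0.9, 0.1], k_example, 50)
```

Each row of $K(m)$ sums to one, so every vector in `trajectory` remains in the unit simplex; the cost of the iteration depends on the number of local states and time steps, not on the population size $N$.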
Suppose function $h : U^n \to \mathbb{R}^p_{\geq 0}$ models a measure of interest over the occupancy measure vectors. Below we recall Theorem 1 we proved in [12] on refined mean field approximation:

Theorem 1 of [12] (Refined Mean Field)
Assume that function $\Phi_1$ is twice differentiable with continuous second derivative and that $M^{(N)}(0)$ converges weakly to $\mu(0)$. Let $A_t$ and $B_t$ be respectively the $n \times n$ matrix $A_t = (D\Phi_1)(\mu(t))$ and the $n \times n \times n$ tensor $B_t = (D^2\Phi_1)(\mu(t))$. Then for any continuous and twice differentiable function with continuous second derivative $h : U^n \to \mathbb{R}^p_{\geq 0}$ we have:

$$\lim_{N \to \infty} N\, \mathbb{E}\left[ h(M^{(N)}(t)) - h(\Phi_t(M^{(N)}(0))) \right] = Dh(\mu(t))\, V_t + \frac{1}{2} D^2h(\mu(t)) \cdot W_t,$$

where $V_t$ is an $n \times 1$ vector and $W_t$ is an $n \times n$ matrix, defined as follows:

$$V_{t+1} = A_t V_t + \frac{1}{2} B_t \cdot W_t \quad\text{and}\quad W_{t+1} = \Gamma(\mu(t)) + A_t W_t A_t^T,$$

with $V_0 = 0$, $W_0 = 0$ and $\Gamma(m)$ the following $n \times n$ matrix:

$$\Gamma_{jj}(m) = \sum_{i=0}^{n-1} m_i K_{ij}(m)\,(1 - K_{ij}(m)), \qquad \Gamma_{jk}(m) = -\sum_{i=0}^{n-1} m_i K_{ij}(m) K_{ik}(m) \quad (j \neq k).$$

The following corollary illustrates the relationship between the refined mean field result and the classic convergence theorem:

Corollary 1(i) of [12]
Under the assumptions of Theorem 1 of [12], it holds that for any coordinate $i$ and any time-step $t \in \mathbb{N}$

$$\mathbb{E}\left[ M^{(N)}_i(t) \right] = \mu_i(t) + \frac{(V_t)_i}{N} + o\left(\frac{1}{N}\right).$$

In other words, the expected value of the fraction of the objects in local state $i$ of the full stochastic model with population size $N$ at time $t$ is equal to the classic limit mean field value $\mu_i(t)$ plus a term that is a constant $(V_t)_i$, calculated as shown in Theorem 1, divided by the population size $N$, plus a residual amount of order $o(1/N)$. It is easy to see that the larger $N$ is, the smaller this additional term gets. Essentially, the refined mean field takes not only the first moment (the mean) but also the second moment (variance) into consideration in the approximation.

In [12] we have applied this discrete time refined mean field approximation on a number of examples ranging from the well-known epidemic model SEIR to wireless networks. It was shown that the approach works well under the assumption that the models have a unique fixed point and exponentially stable behaviour, i.e. possible oscillations in the behaviour of the system, due to a finite input, will die out at an exponential rate. Here we investigate its application to the more complex gossip shuffle protocol.

A proof-of-concept implementation of both the classical and the refined mean field techniques and a discrete event simulator has been developed by one of the authors of the present paper in F [...]

Following the classic discrete time mean field approximation technique [20,1,2] the behaviour of an individual gossip node can be described based on its local state and the current occupancy measure vector. This exploits what is known as the “decoupling principle”, i.e. in the limit for $N$ going to infinity, the evolution of each individual object is assumed to be stochastically independent from other specific objects (except through dependence on the global occupancy measure), even in the presence of explicit cooperation (i.e.
synchronisation) between objects [20,7,16]. Such a model of an individual node can then be used to analyse global properties of the network such as the replication and coverage measures that are relevant in this case study.

Without going into full detail (more details can be found in the Appendix, which will not be part of this paper), the mean field models proposed in the work by Bakhshi et al. [2] consider a gossip network as consisting of active and passive nodes that possess, or do not possess, the specific data element in their cache. This is illustrated in Fig. 3 (left), where the local states of a single node are shown. States in which the node actively looks for a gossip peer are red, those in which it passively receives requests are blue. States in which the node has the data element in its cache are labelled by $D_i$, those in which it does not are labelled by $O_i$. Transitions between states occur with certain probabilities, which depend on the global occupancy measure and the conditional probabilities of pairwise node interaction, shown in Fig. 3 (right), under the assumption of a uniform distribution of data items over the local storages of all nodes (see [1,2] for further details on these pairwise communication probabilities). $P(A'B' \mid AB)$ denotes the conditional probability of an active-passive pair in state $AB$ to have state $A'B'$ after their interaction, where $A, B, A', B' \in \{O, D\}$.

[Figure 3: left, states D0 to D3 and O0 to O3 with transitions labelled dks, ons, dls, ogs, onr, dkr, ogr, dlr; right, pair states OO, OD, DD, DO with the conditional probabilities P(A'B' | AB) as transition labels]
Fig. 3. Left: Push-pull gossip model of individual gossip node with rounds of length 3 (i.e. $G_{max} = 3$). Active states are red, passive ones blue. Model for replication. Right: Transition diagram of conditional probabilities of pairwise interaction between gossip nodes. D: data element in cache; O: data element not in cache.

The conditional probabilities can be expressed in terms of $n$ (number of different data elements), $c$ (size of the cache) and $s$ (number of selected elements for exchange), as follows:
$$P(OD \mid DO) = P(DO \mid OD) = \frac{s}{c} \cdot \frac{n-c}{n-s}$$
$$P(OD \mid OD) = P(DO \mid DO) = \frac{c-s}{c}$$
$$P(DD \mid OD) = P(DD \mid DO) = \frac{s}{c} \cdot \frac{c-s}{n-s}$$
$$P(OD \mid DD) = P(DO \mid DD) = \frac{s}{c} \cdot \frac{c-s}{c} \cdot \frac{n-c}{n-s}$$
$$P(DD \mid DD) = 1 - 2 \cdot \frac{s}{c} \cdot \frac{c-s}{c} \cdot \frac{n-c}{n-s}$$
$$P(OO \mid OO) = 1$$

[...] $G_{max}$, by aggregating the O-states and the D-states, respectively. This uses the experimental observation that when, in the initial state, the O-states all have the same occupancy measure, and all the D-states have the same occupancy measure, this situation remains so when time evolves. This observation is illustrated in Fig. 4 for a network with 2500 nodes and $G_{max} = 3$, with 10 nodes in each O-state and 615 nodes in each D-state initially.

[Figure 4: left panel, diffusion against time for D0, D1, D2, D3; right panel, d not in cache against time for O0, O1, O2, O3]
Fig. 4.
Diffusion of d-element in the network for N = 2500 with initially 615 nodes in each O-state and 10 nodes in each D-state. Occupancy measure of D-states (left) and of O-states (right).

Note that we do not assume that the occupancy measure of an O-state is equal to that of a D-state.

The simplified aggregated mean field model is shown in Fig. 5 (left) for the analysis of the replication, and on the right for the analysis of coverage. The latter shows an additional state (I). This models the state in which the node does not have the data element in its cache currently and also has never had it before. The O-state in this model represents the fact that the node does not have the data element at the moment, but that it has seen it previously (i.e. the node is already covered).

[Figure 5: left, two-state model with states D and O; right, three-state model with states D, O and I; transitions labelled get, loose, 1-get, 1-loose]
Fig. 5. Two-state (left) and three-state (right) aggregate push-pull gossip model of an individual gossip node with rounds of length $G_{max}$.

The transition probability functions in the three-state model of Fig. 5, with states O, D and I, are defined as follows, for $m = (m_O, m_D, m_I) \in U^3$:
– from O to D: $get(m) = \frac{G_{max}}{G_{max}+1}\, ogs'(m) + \frac{1}{G_{max}+1}\, ogr'(m)$
– from D to O: $loose(m) = \frac{G_{max}}{G_{max}+1}\, dls'(m) + \frac{1}{G_{max}+1}\, dlr'(m)$
– from I to D: $get$ as above
where

$$ogs'(m) = \frac{1}{G_{max}+1}\, m_D \left( P(OD \mid DO) + P(DD \mid DO) \right) noc$$
$$dls'(m) = \frac{1}{G_{max}+1} \left( (m_O + m_I)\, P(DO \mid OD) + m_D\, P(DO \mid DD) \right) noc$$
$$ogr'(m) = \frac{G_{max}}{G_{max}+1}\, m_D \left( P(DO \mid OD) + P(DD \mid OD) \right) noc$$
$$dlr'(m) = \frac{G_{max}}{G_{max}+1} \left( (m_O + m_I)\, P(OD \mid DO) + m_D\, P(DO \mid DD) \right) noc$$

where $noc$ is the no-collision probability, which, in the aggregated models, is equal to $e^{-2 \cdot (1/(G_{max}+1))}$; note that $1/(G_{max}+1)$ is the fraction of active nodes in the network at any time instant. This is derived from [1], where it is shown that in the limit for $N$ to infinity, the probability of no collision is given by $e^{-2 \cdot (frc(O0) + frc(D0))}$, where the sum $frc(O0) + frc(D0)$ denotes the fraction of active nodes in the network at any time. In the aggregated model this amounts to $1/(G_{max}+1)$. In this model the number of replications of the data element in the network corresponds to the number of nodes that are in state D. The coverage of the network is given by the number of nodes that are in state D or state O. The definitions of the transition probabilities for the two-state model are similar, but with $m_I$ equal to zero. With the two-state model only the number of replicas can be analysed.

For very large systems both models show a surprisingly good correspondence between the Java simulation results and the classic mean field approximation. For N=25,000 the curves for both measures essentially overlap (see [1,2]). For N=2,500, with initially one node in state D and all other nodes in state I, for $G_{max} = 9$, the results for replication and coverage are shown in Fig. 1. For that system size some differences can already be observed and, even though they are not huge, there is a considerable difference in the time at which network coverage seems to be reached. The Java simulation (average of 500 runs) indicates that this happens close to time 1500, whereas the mean field indicates a time well before that, just before time 1000, even though the mean field approximation is still just within the standard deviation of the simulation runs. In the next section we illustrate what results can be obtained with the refined mean field approximation and we also motivate why, in the general case, this requires a more detailed mean field model.

The mean field models of the gossip shuffle protocol in the previous section were based on the principle of decoupling of joint probabilities [20,7] based on a careful study of the pairwise probabilities of the various possible outcomes of a shuffle between two gossip nodes (as in [1]).
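The aggregated three-state model can be iterated numerically as a plausibility check. The Python sketch below is an illustration under assumptions, not the implementation used for the results in this paper: the function names are ours, the values n = 500, c = 100 and s = 50 are example values, the conditional probabilities follow the expressions given for the pairwise interaction, and the factor 2 in the exponent of noc is our reading of the no-collision formula.

```python
import math


def pair_probs(n, c, s):
    """Conditional pairwise probabilities needed by the aggregated model."""
    return {
        "OD|DO": (s / c) * (n - c) / (n - s),                  # = P(DO|OD)
        "DD|DO": (s / c) * (c - s) / (n - s),                  # = P(DD|OD)
        "DO|DD": (s / c) * ((c - s) / c) * (n - c) / (n - s),  # = P(OD|DD)
    }


def step(m, n, c, s, gmax):
    """One classic mean field step for m = (m_O, m_D, m_I)."""
    m_o, m_d, m_i = m
    p = pair_probs(n, c, s)
    act = 1.0 / (gmax + 1)         # fraction of active nodes
    pas = gmax / (gmax + 1)        # fraction of passive nodes
    noc = math.exp(-2.0 * act)     # assumed no-collision probability
    gain = m_d * (p["OD|DO"] + p["DD|DO"]) * noc
    loss = ((m_o + m_i) * p["OD|DO"] + m_d * p["DO|DD"]) * noc
    ogs, ogr = act * gain, pas * gain
    dls, dlr = act * loss, pas * loss
    get = pas * ogs + act * ogr
    loose = pas * dls + act * dlr
    return (m_o * (1 - get) + m_d * loose,          # O: covered, not in cache
            m_d * (1 - loose) + (m_o + m_i) * get,  # D: in cache
            m_i * (1 - get))                        # I: never seen


def run(N, n, c, s, gmax, steps):
    """Start with one node in D, all others in I; return (replicas, covered)."""
    m = (0.0, 1.0 / N, 1.0 - 1.0 / N)
    out = []
    for _ in range(steps):
        m = step(m, n, c, s, gmax)
        out.append((N * m[1], N * (m[0] + m[1])))
    return out


curve = run(2500, 500, 100, 50, 9, 300)
```

Each entry of `curve` holds the mean field estimate of the number of replicas ($N m_D$) and of the number of covered nodes ($N (m_O + m_D)$) after one more round.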
In our previous work on refined mean field approximation we have shown for a number of other models that this approximation technique can provide an increased accuracy w.r.t. classical mean field and that there is also a close correspondence between the simulation of the mean field model and the refined approximation [12,13]. However, simulation of the mean field model of Fig. 3 for a small network of size N=120, with $G_{max} = 3$, with initially 29 nodes in each O-state and one in each D-state, shows that in many simulation runs the system completely loses the introduced data element. In other words, no gossip node in the network has the element in its cache at a certain point in time. This is clearly in contrast with the properties of the gossip protocol itself. The refined mean field approximation is also sensitive to this aspect of the model behaviour, as can be observed in Fig. 6. Similar observations can be made for the aggregated 3-state model of Fig. 5.

[Figure 6: number of replicas against time; curves: refined mf, classic mf, model simulation]
Fig. 6.
Replication of data element in the network for N = 120, $G_{max} = 3$, with initially 29 nodes in each O-state and 1 node in each D-state, showing a single simulation trace of the model of Fig. 3 (left) in which the data element gets lost from the network. The figure also shows the classic mean field (blue) and refined mean field (green) results.

Note that the simulation referred to here is really the simulation of the model, and not the Java simulation of the protocol.

In the following we propose a more detailed mean field model in which (1) the system can never completely lose the inserted data element and (2) the model reflects more explicitly the effects of the pairwise interaction and synchronisation between nodes. Note the emphasis on effects of node synchronisation, because we are still aiming at a model that respects the decoupling principle for its use in a mean field setting. What we really aim at is to distinguish the effects of a node getting a data element through exchanging it with another node (in which case the total number of replicas of the data element in the system remains the same) or through replication, i.e. the other node retains its copy of the data element and the global number of copies of the data element in the system increases by one.

With reference to Fig. 7, for what concerns point (1) above, we introduce a specific state, PD, to the model representing that there always is a gossip node in the network that possesses the data element.

To address point (2), we introduce two more states, FD and LD, to distinguish between the effects of interactions between gossip nodes. State FD represents the fact that the gossip node received the data element for the first time via an exchange of the data element with another node. State LD also represents the fact that the node received the data element via an exchange, but that it had already seen the data element in the past. So in both cases, the data element is simply exchanged, i.e. one node gives it to the other, and the total number of gossip nodes that possess the data element is not changed by such an interaction. Note that modelling the effect of an exchange of the data element between two nodes in this way also means that we can retrieve the total number of gossip nodes in the system that do not possess the data element as the sum of the nodes that are in states FD, LD, I and O. This is so because we know that for each node in state FD (LD, resp.) there is a node in the network that just lost its data element in the synchronous shuffle with our current node. We will make use of this in the probability functions associated with the transitions between states.

A gossip node can also get involved in an interaction in which the data element is replicated, i.e. a node gives it to another one but also retains a copy itself. Note that this can happen both in case the node that receives the data element does not possess the data element and when it does possess it. This situation is modelled by state D and represents the fact that the interaction has the effect that the total number of nodes in the network that possess the data element increases (by one).

A third case exists where two nodes, both possessing the data element, interact and one of them loses its copy. In that case the overall number of copies of the data element in the network is reduced by one. Note that the gossip protocol does not allow that both copies get lost in such an interaction. Moreover, if there is only a single node left in the network with a copy of the data element this copy cannot get lost because this node cannot interact with another node having the data element.

To distinguish the various kinds of interactions mentioned above we refine the transition probability functions introduced on page 10.
In particular, we split the probability functions $\mathit{get}$ and $\mathit{loose}$ into two distinct parts, $\mathit{get\_rep}$ and $\mathit{get\_exc}$ for the $\mathit{get}$ function to model data element replication and exchange, respectively, and likewise for the $\mathit{loose}$ function as follows, where the appropriate conditional probabilities are used:

$$
\begin{aligned}
\mathit{get\_exc}(m) &= \frac{2\,G_{max}}{(G_{max}+1)^2}\,(m_D + m_{PD})\;P(OD \mid DO)\;\mathit{noc}\\
\mathit{get\_rep}(m) &= \frac{2\,G_{max}}{(G_{max}+1)^2}\,(m_D + m_{PD})\;P(DD \mid DO)\;\mathit{noc}\\
\mathit{loose\_exc}(m) &= \frac{2\,G_{max}}{(G_{max}+1)^2}\,(m_O + m_I + m_{LD} + m_{FD})\;P(OD \mid DO)\;\mathit{noc}\\
\mathit{loose\_rep}(m) &= \frac{2\,G_{max}}{(G_{max}+1)^2}\,(m_D + m_{PD})\;P(DO \mid DD)\;\mathit{noc}
\end{aligned}
$$
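The four probability functions above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the prefactor is read as $2\,G_{max}/(G_{max}+1)^2$, the conditional probabilities and the no-collision probability are passed in as plain numbers, and the names `Params`, `prefactor`, `example_params` and `example_m` as well as all numeric values are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    g_max: int      # G_max, the maximal gossip delay
    p_od_do: float  # P(OD|DO): the element is handed over (exchange)
    p_dd_do: float  # P(DD|DO): the element is copied (replication)
    p_do_dd: float  # P(DO|DD): one of two copies is dropped
    noc: float      # probability of no collision


def prefactor(p: Params) -> float:
    # Assumed reading of the coefficient: 2 * G_max / (G_max + 1)^2.
    return 2.0 * p.g_max / (p.g_max + 1) ** 2


def get_exc(m: dict, p: Params) -> float:
    # A node without the element receives it and the partner gives it up.
    return prefactor(p) * (m["D"] + m["PD"]) * p.p_od_do * p.noc


def get_rep(m: dict, p: Params) -> float:
    # A node receives the element and the partner keeps its own copy.
    return prefactor(p) * (m["D"] + m["PD"]) * p.p_dd_do * p.noc


def loose_exc(m: dict, p: Params) -> float:
    # A node hands the element over to a partner counted as not having it.
    return prefactor(p) * (m["O"] + m["I"] + m["LD"] + m["FD"]) * p.p_od_do * p.noc


def loose_rep(m: dict, p: Params) -> float:
    # Two holders interact and one of the two copies disappears.
    return prefactor(p) * (m["D"] + m["PD"]) * p.p_do_dd * p.noc


# Hypothetical placeholder values, chosen only to exercise the functions.
example_params = Params(g_max=3, p_od_do=0.25, p_dd_do=0.5, p_do_dd=0.25, noc=0.6)
example_m = {"O": 0.0, "D": 0.0, "I": 0.99, "FD": 0.0, "PD": 0.01, "LD": 0.0}
```

Passing the occupancy measure as a dictionary keyed by state name keeps the code close to the notation $m_D$, $m_{PD}$ and so on; a vector indexed by state number would work equally well.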
Fig. 7. Six-state model of an individual gossip node with rounds of length $G_{max}$.

Fig. 8 shows the replication as the sum of the number of nodes in states D and PD and the coverage as the sum of the number of nodes in D, PD, FD, LD and O for a network with $N = 100$, $n = 500$, $c = 100$ and $s = 50$ with initially one node in state PD and all the others in state I. (Footnote: For the refined mean field this means the application of Thm. 1 with $h(m) = m_D + m_{PD}$ (replication) and $h(m) = m_D + m_{PD} + m_{FD} + m_{LD} + m_O$ (coverage), respectively.) Besides the classic and refined mean field approximations for the model in Fig. 7 and the Java simulation results of the actual shuffle protocol, Fig. 8 also shows the average of the model simulation. In particular, note the good approximation of the simulation results (both the Java simulation and the model simulation) by the refined mean field even in this very small network. This holds both for the diffusion of the replicas and for the coverage. Similarly good results have been found for a system with $N = 2500$, shown in Fig. 9, also in the case in which there is only a single data element in the system initially. An indication of the (non-optimised) performance of the analysis for producing the results in Fig. 9 is: 0.543s (classic mean field); 25.459s (refined mean field); 7m 1.389s (fast model simulation [20], 500 runs); 3h 42m 41.459s (Java simulation, 500 runs) on a MacBook Pro, Intel i7, 16GB. Recall that the mean field and refined mean field analysis times are independent of the size of the system and, as can be seen, several orders of magnitude faster than traditional event simulation approaches.

Fig. 8. Replication (left) and network coverage (right) of the data element in the network for $N = 100$ with initially 99 nodes in the I-state and 1 node in the PD-state for $G_{max} = 3$. Average of 500 simulation runs of both the model and Java simulations. Vertical bars show standard deviation for the Java simulation. [Plots: number of replicas (left) and number of nodes (right) against time; curves: protocol simulation, refined mf, classic mf, model simulation.]

Fig. 9. Replication (left) and network coverage (right) of the data element for $N = 2500$ with initially 2499 nodes in I and 1 in PD, for $G_{max} = 9$. Average of 500 simulation runs for both model and Java simulations. Vertical bars show standard deviation for the Java simulation. [Plots: number of replicas (left) and number of nodes (right) against time; curves: protocol simulation, refined mf, classic mf, model simulation.]

Gossip protocols play an important role in the design of collective adaptive systems, providing a basic, but robust and scalable, mechanism of information spreading in very large networks. Therefore they also form an interesting benchmark application for the analysis of scalable verification techniques. We have developed a new mean field model for the shuffle gossip protocol with which more accurate approximations for medium size gossip protocols can be obtained via refined mean field approximation techniques. This model respects key aspects of the protocol such as the effects of different kinds of interactions and the fact that a new data element cannot be lost by the system as a whole.

Good approximation of medium size systems is of interest for several reasons. First of all, many practical systems consist of many, but not a huge number, of components. However, even in the case of medium size systems, simulation is still a resource consuming effort and in that case a refined mean field approximation can provide fast but accurate approximations. Furthermore, we expect that refined mean field approximation can also be of use when analysing systems in which objects are mobile and move through physical space. The uneven distribution of objects over partitions of such a space requires a mean field approximation that is accurate also for those partitions with relatively few objects.

We wish to thank Rena Bakhshi for sharing with us her Java simulator software for the reproduction of the gossip protocol simulations.

This research has been partially supported by the MIUR project PRIN 2017FTXR7S “IT-MaTTerS” (Methods and Tools for Trustworthy Smart Systems).
References
1. Bakhshi, R.: Gossiping Models – Formal Analysis of Epidemic Protocols. Ph.D. thesis, Vrije Universiteit Amsterdam (January 2011)
2. Bakhshi, R., Cloth, L., Fokkink, W., Haverkort, B.R.: Mean-field framework for performance evaluation of push-pull gossip protocols. Perform. Eval. 68(2), 157–179 (2011), https://doi.org/10.1016/j.peva.2010.08.025
3. Bakhshi, R., Gavidia, D., Fokkink, W., van Steen, M.: An analytical model of information dissemination for a gossip-based protocol. Computer Networks 53(13), 2288–2303 (2009), https://doi.org/10.1016/j.comnet.2009.03.017
4. Bakhshi, R., Gavidia, D., Fokkink, W., van Steen, M.: A modeling framework for gossip-based information spread. In: Eighth International Conference on Quantitative Evaluation of Systems, QEST 2011, Aachen, Germany, 5-8 September, 2011. pp. 245–254. IEEE Computer Society (2011), https://doi.org/10.1109/QEST.2011.39
5. Baydin, A.G., Pearlmutter, B.A., Radul, A.A., Siskind, J.M.: Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research 18, 153:1–153:43 (2018), http://jmlr.org/papers/v18/17-468.html
6. Birman, K.: The promise, and limitations, of gossip protocols. Operating Systems Review 41(5), 8–13 (2007), https://doi.org/10.1145/1317379.1317382
7. Bortolussi, L., Hillston, J., Latella, D., Massink, M.: Continuous approximation of collective system behaviour: A tutorial. Perform. Eval. 70(5), 317–349 (2013)
8. Frei, R., Serugendo, G.D.M.: Advances in complexity engineering. International Journal of Bio-Inspired Computation 3(4), 199–212 (2011), https://doi.org/10.1504/IJBIC.2011.041144
9. Frei, R., Serugendo, G.D.M.: Concepts in complexity engineering. International Journal of Bio-Inspired Computation 3(2), 123–139 (2011), https://doi.org/10.1504/IJBIC.2011.039911
10. Gast, N., Gaujal, B.: A mean field approach for optimization in discrete time. Discrete Event Dynamic Systems 21(1), 63–101 (2011), https://doi.org/10.1007/s10626-010-0094-3
11. Gast, N., Houdt, B.V.: A refined mean field approximation. Proceedings of the ACM on Measurement and Analysis of Computing Systems 1(2), 33:1–33:28 (2017), https://doi.org/10.1145/3154491
12. Gast, N., Latella, D., Massink, M.: A refined mean field approximation of synchronous discrete-time population models. Perform. Eval. 126, 1–21 (2018), https://doi.org/10.1016/j.peva.2018.05.002
13. Gast, N., Latella, D., Massink, M.: A refined mean field approximation for synchronous population processes. In: Workshop on MAthematical performance Modeling and Analysis (MAMA 2018). pp. 30–32. ACM SIGMETRICS Performance Evaluation Review, ACM (2019)
14. Gast, N., Latella, D., Massink, M.: Refined mean field analysis: the gossip shuffle protocol revisited. In: Bocchi, L., Bliudze, S. (eds.) Coordination Models and Languages - 22nd IFIP WG 6.1 International Conference, COORDINATION 2020. LNCS, Springer (2020), short paper, to appear.
15. Gavidia, D., Voulgaris, S., van Steen, M.: A gossip-based distributed news service for wireless mesh networks. In: Conf. on Wireless On demand Network Systems and Services (WONS). pp. 59–67. IEEE Computer Society (2006)
16. Gottlieb, A.D.: Markov Transitions and the Propagation of Chaos. Office of Scientific and Technical Information (OSTI), U.S. Department of Energy (2013), also available at: https://arxiv.org/abs/math/0001076
17. Jelasity, M.: Gossip. In: Serugendo, G.D.M., Gleizes, M.P., Karageorgos, A. (eds.) Self-organising Software - From Natural to Artificial Adaptation, pp. 139–162. Natural Computing Series, Springer (2011), https://doi.org/10.1007/978-3-642-17348-6_7
18. Latella, D., Loreti, M., Massink, M.: On-the-fly PCTL fast mean-field approximated model-checking for self-organising coordination. Sci. Comput. Program. 110, 23–50 (2015), https://doi.org/10.1016/j.scico.2015.06.009
19. https://doi.org/10.1007/978-3-662-54580-5_18
20. Le Boudec, J., McDonald, D.D., Mundinger, J.: A generic mean field convergence result for systems of interacting objects. In: Fourth International Conference on the Quantitative Evaluation of Systems (QEST 2007), 17-19 September 2007, Edinburgh, Scotland, UK. pp. 3–18. IEEE Computer Society (2007)
21. Massink, M.: Refined mean field F https://github.com/mimass/RefinedMF
22. Pianini, D., Beal, J., Viroli, M.: Improving gossip dynamics through overlapping replicates. In: Lluch-Lafuente, A., Proença, J. (eds.) Coordination Models and Languages - 18th IFIP WG 6.1 International Conference, COORDINATION 2016, Held as Part of the 11th International Federated Conference on Distributed Computing Techniques, DisCoTec 2016, Heraklion, Crete, Greece, June 6-9, 2016, Proceedings. Lecture Notes in Computer Science, vol. 9686, pp. 192–207. Springer (2016), https://doi.org/10.1007/978-3-319-39519-7_12
23. Voulgaris, S., Jelasity, M., van Steen, M.: A robust and scalable peer-to-peer gossiping protocol. In: Moro, G., Sartori, C., Singh, M.P. (eds.) Agents and Peer-to-Peer Computing, Second International Workshop, AP2PC 2003, Melbourne, Australia, July 14, 2003, Revised and Invited Papers. Lecture Notes in Computer Science, vol. 2872, pp. 47–58. Springer (2003), https://doi.org/10.1007/978-3-540-25840-7_6
Fig. 3 (left) shows the states and transitions of a single gossip node where $G_{max} = 3$. The red states, D0 and O0, denote states in which the gossip node is active, i.e. it can initiate an exchange of local information with a passive node; in D0 (resp. O0) the node has (resp. does not have) the data element in its local cache. The blue states denote states in which the node is passive and is available for data exchange with an active node when contacted by the latter. The number in the node-labels denotes the value, ranging from 0 to 3, of the current gossip delay $g$ before the node becomes active again. The D/O convention w.r.t. having the data element applies also to the states where the node is passive. The transition labels in Fig. 3 (left) are shorthands for transition probability functions. The latter depend on the occupancy measure vector. Their definition makes use of the conditional probabilities shown in Fig. 3 (right). Furthermore, we recall from [1] that there is a small probability that a collision occurs in the communication between two nodes. This happens when a gossip partner is selected that is already involved in a shuffle with another node. In the limit for $N$ to infinity, the value of the probability of no collision is given by $e^{-2\,(\mathit{frc}(O0) + \mathit{frc}(D0))}$, where $\mathit{frc}(O0) + \mathit{frc}(D0)$ denotes the fraction of active nodes in the network at any time, i.e. summing active nodes that have the d-element and those that do not.

In the definition of the transition probability functions, we also make use of the following observation that greatly simplifies the definitions. Note that this gossip model is a clock-synchronous model and that each node gets active every $G_{max}$ time steps. This means that in every step, if the node is in state D$i$ (or O$i$) then in the next step it leaves this state with probability 1.0 to move one step closer towards the active state D0, or, if it was active, it moves to D3 (or O3), modelling a reset of the time-to-activation; similarly for the O$i$ states. In other words, for all time steps $t$ it holds that:

$$\mathit{dks}(\mu(t)) + \mathit{dls}(\mu(t)) = 1 \quad\text{and}\quad \mathit{ons}(\mu(t)) + \mathit{ogs}(\mu(t)) = 1$$

Similarly for the reset probabilities. The proofs can be found in Sect. 8.7 of this Appendix.

The transition probability functions that concern the D-states are defined as shown below. Note that here and in the remainder of the Appendix we also use Currying notation for notational simplicity. In the sequel $m$ denotes the occupancy measure vector. Its components are indicated by $m_{O0}$, $m_{O1}$ and so on.

$$
\begin{aligned}
\mathit{d\_loss\_reset}\ m &= (m_{O1} + m_{O2} + m_{O3}) * P(OD \mid DO) * (\mathit{noc}) + (m_{D1} + m_{D2} + m_{D3}) * P(OD \mid DD) * (\mathit{noc})\\
\mathit{d\_loss\_step}\ m &= m_{O0} * P(DO \mid OD) * (\mathit{noc}) + m_{D0} * P(DO \mid DD) * (\mathit{noc})\\
\mathit{d\_keep\_reset}\ m &= 1.0 - (\mathit{d\_loss\_reset}\ m)\\
\mathit{d\_keep\_step}\ m &= 1.0 - (\mathit{d\_loss\_step}\ m)
\end{aligned}
$$

Function $\mathit{d\_loss\_reset}$ corresponds to the transition dlr in Fig. 3, and so on. The transition probability functions concerning the O-states are:

$$
\begin{aligned}
\mathit{o\_getd\_reset}\ m &= (m_{D1} + m_{D2} + m_{D3}) * (P(DO \mid OD) + P(DD \mid OD)) * (\mathit{noc})\\
\mathit{o\_getd\_step}\ m &= m_{D0} * (P(OD \mid DO) + P(DD \mid DO)) * (\mathit{noc})\\
\mathit{o\_nod\_reset}\ m &= 1.0 - (\mathit{o\_getd\_reset}\ m)\\
\mathit{o\_nod\_step}\ m &= 1.0 - (\mathit{o\_getd\_step}\ m)
\end{aligned}
$$

Defining an $8 \times 8$ matrix $K$ with indexes $i, j$ in $\{0, \cdots, 7\}$, where the states are numbered O0=0, O1=1, O2=2, O3=3, D0=4, D1=5, D2=6, D3=7 (the numbering used for the model of Fig. 10 later in this Appendix), we can define the gossip model as follows for the non-zero elements of $K$:

$$
\begin{aligned}
K_{0,3} &= 1.0 - (\mathit{o\_getd\_reset}\ m)\\
K_{0,7} &= \mathit{o\_getd\_reset}\ m\\
K_{1,0} = K_{2,1} = K_{3,2} &= 1.0 - (\mathit{o\_getd\_step}\ m)\\
K_{1,4} = K_{2,5} = K_{3,6} &= \mathit{o\_getd\_step}\ m\\
K_{4,3} &= \mathit{d\_loss\_reset}\ m\\
K_{4,7} &= 1.0 - \mathit{d\_loss\_reset}\ m\\
K_{5,0} = K_{6,1} = K_{7,2} &= \mathit{d\_loss\_step}\ m\\
K_{5,4} = K_{6,5} = K_{7,6} &= 1.0 - \mathit{d\_loss\_step}\ m
\end{aligned}
$$

This leads to a set of eight difference equations for the model, recalling that the occupancy vector $m$ at time $t + 1$ is $m(t + 1) = m(t) * K(m(t))$. These equations are shown in full in the next section.
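The construction of $K$ and the update $m(t+1) = m(t) * K(m(t))$ can be illustrated with a small Python sketch. It rests on several assumptions that are not taken from the paper: the state numbering O0=0, ..., O3=3, D0=4, ..., D3=7, the constant 2 in the exponent of the no-collision probability, and in particular the values in the dictionary `P`, which are hypothetical placeholders for the conditional probabilities of Fig. 3 (right). The names `kernel`, `step` and `example_m0` are likewise illustrative.

```python
import math

# Assumed state numbering: O0..O3 -> 0..3, D0..D3 -> 4..7.
# Hypothetical placeholder values for P(after | before); not from the paper.
P = {
    "OD|DO": 0.2, "DD|DO": 0.5,
    "DO|OD": 0.2, "DD|OD": 0.5,
    "OD|DD": 0.1, "DO|DD": 0.1,
}


def noc(m):
    # No-collision probability; the factor 2 is an assumed reading.
    return math.exp(-2.0 * (m[0] + m[4]))


def kernel(m):
    """Build the 8x8 matrix K(m) from the transition probability functions."""
    n = noc(m)
    passive_o = m[1] + m[2] + m[3]
    passive_d = m[5] + m[6] + m[7]
    d_loss_reset = (passive_o * P["OD|DO"] + passive_d * P["OD|DD"]) * n
    d_loss_step = (m[0] * P["DO|OD"] + m[4] * P["DO|DD"]) * n
    o_getd_reset = passive_d * (P["DO|OD"] + P["DD|OD"]) * n
    o_getd_step = m[4] * (P["OD|DO"] + P["DD|DO"]) * n

    K = [[0.0] * 8 for _ in range(8)]
    K[0][3] = 1.0 - o_getd_reset      # O0 -> O3
    K[0][7] = o_getd_reset            # O0 -> D3
    for i in (1, 2, 3):               # O1, O2, O3
        K[i][i - 1] = 1.0 - o_getd_step
        K[i][i + 3] = o_getd_step
    K[4][3] = d_loss_reset            # D0 -> O3
    K[4][7] = 1.0 - d_loss_reset      # D0 -> D3
    for i in (5, 6, 7):               # D1, D2, D3
        K[i][i - 5] = d_loss_step
        K[i][i - 1] = 1.0 - d_loss_step
    return K


def step(m):
    """One mean field step: m(t+1) = m(t) * K(m(t))."""
    K = kernel(m)
    return [sum(m[i] * K[i][j] for i in range(8)) for j in range(8)]


# Hypothetical initial occupancy measure: 1% of the nodes hold the element.
example_m0 = [0.2475] * 4 + [0.0025] * 4
```

Each row of `kernel(m)` sums to one, so `step` preserves the total mass, and the mass in a given clock phase moves as a block to the next phase, which reflects the clock-synchronous structure described above.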
We obtain the following set of difference equations for the O-states and the D-states of the model, representing the occupancy measure of state O$i$ by $m_{Oi}$, and D$i$ by $m_{Di}$, for $i \in \{0, \cdots, 3\}$. (Footnote: On the left of each equation is the new value of the occupancy measure vector at time step $t + 1$; on the right the values of $m$ are intended to be those at time $t$. For notational simplicity $t + 1$ and $t$ have been omitted in the equations below.)

$$
\begin{aligned}
m_{O0} &= m_{O1} - m_{O1} * (\mathit{o\_getd\_step}\ m) + m_{D1} * (\mathit{d\_loss\_step}\ m)\\
m_{O1} &= m_{O2} - m_{O2} * (\mathit{o\_getd\_step}\ m) + m_{D2} * (\mathit{d\_loss\_step}\ m)\\
m_{O2} &= m_{O3} - m_{O3} * (\mathit{o\_getd\_step}\ m) + m_{D3} * (\mathit{d\_loss\_step}\ m)\\
m_{O3} &= m_{O0} - m_{O0} * (\mathit{o\_getd\_reset}\ m) + m_{D0} * (\mathit{d\_loss\_reset}\ m)\\
m_{D0} &= m_{O1} * (\mathit{o\_getd\_step}\ m) + m_{D1} - m_{D1} * (\mathit{d\_loss\_step}\ m)\\
m_{D1} &= m_{O2} * (\mathit{o\_getd\_step}\ m) + m_{D2} - m_{D2} * (\mathit{d\_loss\_step}\ m)\\
m_{D2} &= m_{O3} * (\mathit{o\_getd\_step}\ m) + m_{D3} - m_{D3} * (\mathit{d\_loss\_step}\ m)\\
m_{D3} &= m_{O0} * (\mathit{o\_getd\_reset}\ m) + m_{D0} - m_{D0} * (\mathit{d\_loss\_reset}\ m)
\end{aligned}
$$

Network coverage at time $t$ denotes the fraction of the gossip nodes that have seen the data element at any point in time $t'$, with $t_0 \le t' \le t$, where $t_0$ is the time the data element was introduced in the network. To analyse network coverage we extend the model of an individual node in Fig. 10 with four more states. These states are I0, I1, I2 and I3. A gossip node is in state I$i$ if the data element is not in its cache and it has never seen the data element since it was introduced in the network. The latter is the case initially for most nodes, hence the name I: Initial O-state. If a node is in one of the other O-states this means that it does not have the data element in its cache currently, but it was in its cache at an earlier point in time, so the node has seen the data element since it was introduced for the first time.

The probability functions for the outgoing transitions of the I$i$ nodes are the same as for their companion O-nodes. Also the probability functions of the incoming transitions, when they come from I-states, are the same. There are no incoming transitions from D-nodes of course, since passing by a D-node would mean that the data element has been in the cache of that node. Similarly, there is no transition from an O-state to an I-state.
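Because the I-states have no incoming transitions from O-states or D-states, their total mass can only shrink, and the coverage is the complement of that mass. A minimal Python sketch of this observation follows; the function names `step_i_states` and `coverage` are hypothetical, and the two probabilities are passed in as fixed numbers purely for illustration, whereas in the model they are functions of the occupancy measure vector.

```python
def step_i_states(m_i, o_getd_step, o_getd_reset):
    """Advance [m_I0, m_I1, m_I2, m_I3] by one time step.

    I1..I3 move one step closer to the active state I0 unless the node
    obtains the data element; the active state I0 resets to I3 unless
    the node obtains the data element.
    """
    keep_step = 1.0 - o_getd_step
    keep_reset = 1.0 - o_getd_reset
    return [
        m_i[1] * keep_step,   # I1 -> I0
        m_i[2] * keep_step,   # I2 -> I1
        m_i[3] * keep_step,   # I3 -> I2
        m_i[0] * keep_reset,  # I0 -> I3
    ]


def coverage(m_i):
    # Fraction of nodes that have seen the data element at least once.
    return 1.0 - sum(m_i)
```

Since both factors `keep_step` and `keep_reset` lie between 0 and 1, `coverage` never decreases from one step to the next.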
Fig. 10.
Extended push-pull gossip model of an individual gossip node with rounds of length 3 (i.e. $G_{max} = 3$) for the analysis of network coverage. Active states are red, passive ones blue. [Diagram: states D0 to D3, O0 to O3 and I0 to I3; transition labels dks, dls, dkr, dlr, ons, ogs, onr, ogr.]

The probability functions have to be updated slightly to take the two versions of the O-states into account.

$$
\begin{aligned}
\mathit{d\_loss\_reset}\ m &= (m_{O1} + m_{O2} + m_{O3}) * P(OD \mid DO) * (\mathit{noc})\\
&\quad + (m_{I1} + m_{I2} + m_{I3}) * P(OD \mid DO) * (\mathit{noc})\\
&\quad + (m_{D1} + m_{D2} + m_{D3}) * P(OD \mid DD) * (\mathit{noc})\\
\mathit{d\_loss\_step}\ m &= (m_{O0} + m_{I0}) * P(DO \mid OD) * (\mathit{noc}) + m_{D0} * P(DO \mid DD) * (\mathit{noc})
\end{aligned}
$$

and their dual probabilities, and

$$
\begin{aligned}
\mathit{o\_getd\_reset}\ m &= (m_{D1} + m_{D2} + m_{D3}) * (P(DO \mid OD) + P(DD \mid OD)) * (\mathit{noc})\\
\mathit{o\_getd\_step}\ m &= m_{D0} * (P(OD \mid DO) + P(DD \mid DO)) * (\mathit{noc}).
\end{aligned}
$$

Similarly to the gossip model for replications we can define a $12 \times 12$ matrix $K$ with indexes $i, j$ in $\{0, \cdots, 11\}$. The definition of the non-zero elements of this matrix can be found in the next section, as well as the additional set of difference equations that can be obtained from it.

Numbering the states in the model of Fig. 10 as O0=0, O1=1, O2=2, O3=3, D0=4, D1=5, D2=6, D3=7, I0=8, I1=9, I2=10 and I3=11, the non-empty elements of the $K$ matrix are given by:

$$
\begin{aligned}
K_{0,3} = K_{8,11} &= 1.0 - (\mathit{o\_getd\_reset}\ m)\\
K_{0,7} = K_{8,7} &= \mathit{o\_getd\_reset}\ m\\
K_{1,0} = K_{2,1} = K_{3,2} &= 1.0 - (\mathit{o\_getd\_step}\ m)\\
K_{1,4} = K_{2,5} = K_{3,6} &= \mathit{o\_getd\_step}\ m\\
K_{4,3} &= \mathit{d\_loss\_reset}\ m\\
K_{4,7} &= 1.0 - \mathit{d\_loss\_reset}\ m\\
K_{5,0} = K_{6,1} = K_{7,2} &= \mathit{d\_loss\_step}\ m\\
K_{5,4} = K_{6,5} = K_{7,6} &= 1.0 - \mathit{d\_loss\_step}\ m\\
K_{9,4} = K_{10,5} = K_{11,6} &= \mathit{o\_getd\_step}\ m\\
K_{9,8} = K_{10,9} = K_{11,10} &= 1.0 - (\mathit{o\_getd\_step}\ m)
\end{aligned}
$$

We also obtain the following set of four additional difference equations for the I-states of the model, representing the occupancy measure of state I0 by $m_{I0}$, I1 by $m_{I1}$, I2 by $m_{I2}$ and I3 by $m_{I3}$:

$$
\begin{aligned}
m_{I0} &= m_{I1} - m_{I1} * (\mathit{o\_getd\_step}\ m)\\
m_{I1} &= m_{I2} - m_{I2} * (\mathit{o\_getd\_step}\ m)\\
m_{I2} &= m_{I3} - m_{I3} * (\mathit{o\_getd\_step}\ m)\\
m_{I3} &= m_{I0} - m_{I0} * (\mathit{o\_getd\_reset}\ m)
\end{aligned}
$$

Numbering the states in the model of Fig. 5 (right) as O=0, D=1, I=2, the non-empty elements of the $K$ matrix are given by:

$$
\begin{aligned}
K_{0,1} = K_{2,1} &= \mathit{get}\ m\\
K_{1,0} &= \mathit{loose}\ m\\
K_{0,0} = K_{2,2} &= 1.0 - (\mathit{get}\ m)\\
K_{1,1} &= 1.0 - (\mathit{loose}\ m)
\end{aligned}
$$

Representing the occupancy measure of state O by $m_O$, D by $m_D$, I by $m_I$ we obtain the following set of difference equations for the model in Fig. 5 (right):

$$
\begin{aligned}
m_O &= m_O - m_O * (\mathit{get}\ m) + m_D * (\mathit{loose}\ m)\\
m_D &= m_D + (m_O + m_I) * (\mathit{get}\ m) - m_D * (\mathit{loose}\ m)\\
m_I &= m_I - m_I * (\mathit{get}\ m)
\end{aligned}
$$

Numbering the states in the model of Fig. 7 as O=0, D=1, I=2, FD=3, PD=4, LD=5, the non-empty elements of the $K$ matrix are given by:

$$
\begin{aligned}
K_{0,1} = K_{2,1} = K_{3,1} = K_{5,1} &= \mathit{get\_rep}\ m\\
K_{1,0} &= \mathit{loose\_rep}\ m\\
K_{0,0} = K_{2,2} &= 1.0 - (\mathit{get\_rep}\ m) - (\mathit{get\_exc}\ m)\\
K_{0,5} = K_{2,3} &= \mathit{get\_exc}\ m\\
K_{1,1} &= 1.0 - (\mathit{loose\_rep}\ m)\\
K_{3,0} = K_{5,0} &= \mathit{loose\_exc}\ m\\
K_{3,3} = K_{5,5} &= 1.0 - (\mathit{loose\_exc}\ m) - (\mathit{get\_rep}\ m)\\
K_{4,4} &= 1.0
\end{aligned}
$$
Representing the occupancy measure of state O by $m_O$, D by $m_D$, I by $m_I$, FD by $m_{FD}$, PD by $m_{PD}$ and LD by $m_{LD}$, we also obtain the following set of difference equations for the model in Fig. 7:

$$
\begin{aligned}
m_O &= m_O - m_O * ((\mathit{get\_rep}\ m) + (\mathit{get\_exc}\ m)) + m_D * (\mathit{loose\_rep}\ m) + (m_{FD} + m_{LD}) * (\mathit{loose\_exc}\ m)\\
m_D &= (m_O + m_I + m_{FD} + m_{LD}) * (\mathit{get\_rep}\ m) + m_D - m_D * (\mathit{loose\_rep}\ m)\\
m_I &= m_I - m_I * (\mathit{get\_rep}\ m + \mathit{get\_exc}\ m)\\
m_{FD} &= m_{FD} - m_{FD} * (\mathit{get\_rep}\ m) + m_I * (\mathit{get\_exc}\ m) - m_{FD} * (\mathit{loose\_exc}\ m)\\
m_{PD} &= m_{PD}\\
m_{LD} &= m_{LD} + m_O * (\mathit{get\_exc}\ m) - m_{LD} * ((\mathit{loose\_exc}\ m) + (\mathit{get\_rep}\ m))
\end{aligned}
$$

To prove: $\mathit{dks}(\mu(t)) + \mathit{dls}(\mu(t)) = 1$ for all $t$. In the following $\mathit{frc}(X)$ denotes the fraction of nodes in state $X$.

$\mathit{dks}(\mu(t)) + \mathit{dls}(\mu(t))$

= { By Defs. of dks and dls }

$$
\begin{aligned}
&(\mathit{frc}(O0) + \mathit{frc}(D0)) * (1 - e^{-2(\mathit{frc}(O0) + \mathit{frc}(D0))})\\
&+ (1 - (\mathit{frc}(O0) + \mathit{frc}(D0)))\\
&+ \mathit{frc}(O0) * (P(OD \mid OD) + P(DD \mid OD)) * e^{-2(\mathit{frc}(O0) + \mathit{frc}(D0))}\\
&+ \mathit{frc}(D0) * (P(OD \mid DD) + P(DD \mid DD)) * e^{-2(\mathit{frc}(O0) + \mathit{frc}(D0))}\\
&+ \mathit{frc}(O0) * P(DO \mid OD) * e^{-2(\mathit{frc}(O0) + \mathit{frc}(D0))}\\
&+ \mathit{frc}(D0) * P(DO \mid DD) * e^{-2(\mathit{frc}(O0) + \mathit{frc}(D0))}
\end{aligned}
$$

= { Use that $\mathit{frc}(D_i)(t) = \frac{\mathit{frc}(D)(t)}{gmax+1}$ and $\mathit{frc}(O_i)(t) = \frac{\mathit{frc}(O)(t)}{gmax+1}$ }

$$
\begin{aligned}
&\left(\tfrac{\mathit{frc}(O)(t)}{gmax+1} + \tfrac{\mathit{frc}(D)(t)}{gmax+1}\right) * (1 - e^{-2/(gmax+1)})\\
&+ \left(1 - \left(\tfrac{\mathit{frc}(O)(t)}{gmax+1} + \tfrac{\mathit{frc}(D)(t)}{gmax+1}\right)\right)\\
&+ \tfrac{\mathit{frc}(O)(t)}{gmax+1} * (P(OD \mid OD) + P(DD \mid OD)) * e^{-2/(gmax+1)}\\
&+ \tfrac{\mathit{frc}(D)(t)}{gmax+1} * (P(OD \mid DD) + P(DD \mid DD)) * e^{-2/(gmax+1)}\\
&+ \tfrac{\mathit{frc}(O)(t)}{gmax+1} * P(DO \mid OD) * e^{-2/(gmax+1)}\\
&+ \tfrac{\mathit{frc}(D)(t)}{gmax+1} * P(DO \mid DD) * e^{-2/(gmax+1)}
\end{aligned}
$$

= { Simplify }

$$
\begin{aligned}
&\left(\tfrac{\mathit{frc}(O)(t)}{gmax+1} + \tfrac{\mathit{frc}(D)(t)}{gmax+1}\right) * \left(1 - e^{-2(\mathit{frc}(O)(t) + \mathit{frc}(D)(t))/(gmax+1)}\right)\\
&+ \left(1 - \left(\tfrac{\mathit{frc}(O)(t)}{gmax+1} + \tfrac{\mathit{frc}(D)(t)}{gmax+1}\right)\right)\\
&+ \tfrac{\mathit{frc}(O)(t)}{gmax+1} * (P(OD \mid OD) + P(DD \mid OD) + P(DO \mid OD)) * e^{-2/(gmax+1)}\\
&+ \tfrac{\mathit{frc}(D)(t)}{gmax+1} * (P(OD \mid DD) + P(DD \mid DD) + P(DO \mid DD)) * e^{-2/(gmax+1)}
\end{aligned}
$$

= { Use $P(OD \mid OD) + P(DD \mid OD) + P(DO \mid OD) = 1$ and $P(OD \mid DD) + P(DD \mid DD) + P(DO \mid DD) = 1$ }

$$
\begin{aligned}
&\left(\tfrac{\mathit{frc}(O)(t)}{gmax+1} + \tfrac{\mathit{frc}(D)(t)}{gmax+1}\right) * \left(1 - e^{-2(\mathit{frc}(O)(t) + \mathit{frc}(D)(t))/(gmax+1)}\right)\\
&+ \left(1 - \left(\tfrac{\mathit{frc}(O)(t)}{gmax+1} + \tfrac{\mathit{frc}(D)(t)}{gmax+1}\right)\right)\\
&+ \tfrac{\mathit{frc}(O)(t)}{gmax+1} * e^{-2/(gmax+1)}\\
&+ \tfrac{\mathit{frc}(D)(t)}{gmax+1} * e^{-2/(gmax+1)}
\end{aligned}
$$

= { Simplify }

= { Defs. of