A refined mean field approximation of synchronous discrete-time population models
AA REFINED MEAN FIELD APPROXIMATION OFSYNCHRONOUS DISCRETE-TIME POPULATION MODELS
NICOLAS GAST (INRIA), DIEGO LATELLA (CNR-ISTI),AND MIEKE MASSINK (CNR-ISTI)
Abstract.
Mean field approximation is a popular method to study the be-haviour of stochastic models composed of a large number of interacting objects.When the objects are asynchronous, the mean field approximation of a popu-lation model can be expressed as an ordinary differential equation. When theobjects are (clock-) synchronous the mean field approximation is a discretetime dynamical system. We focus on the latter.We study the accuracy of mean field approximation when this approxima-tion is a discrete-time dynamical system. We extend a result that was shownfor the continuous time case and we prove that expected performance indica-tors estimated by mean field approximation are O (1 /N )-accurate. We providesimple expressions to effectively compute the asymptotic error of mean fieldapproximation, for finite time-horizon and steady-state, and we use this com-puted error to propose what we call a refined mean field approximation. Weshow, by using a few numerical examples, that this technique improves thequality of approximation compared to the classical mean field approximation,especially for relatively small population sizes. Introduction
Stochastic models are often used to model and analyse the performance of com-puter (and many other) systems. A particularly rich and popular class of modelsis given by stochastic population models. These have been used, for instance,to model biological systems [26], epidemic spreading [1] or queuing networks [25].These systems are composed of a set of homogeneous objects interacting with oneanother. These models have a high expressive power, but an exact analysis of anysuch a model is often computationally prohibitive when the number of objects ofthe system grows. This results in the need for approximation techniques.A popular technique is to use mean field approximation. The idea behind meanfield approximation is to replace the study of the original stochastic system by theone of a, much simpler, deterministic dynamical system. The success of mean fieldapproximation can be explained by multiple factors : (a) it is fast – many modelscan be solved in closed form [25, 19, 23, 18] or easily solved numerically [17, 10, 24]– (b) it is proven to be asymptotically optimal as the number of objects in thesystem goes to infinity [15, 5, 2, 12, 4]; and (c) it is often very accurate also forsystems of moderate size, composed of N ≈
100 objects.The mean field approximation of a given model is constructed by considering thelimit of the original stochastic model as the number of objects N goes to infinity.There can be two types of limits. The first type arises when the dynamics of This paper and the simulations it contains are fully reproducible : https://github.com/ngast/RefinedMeanField_SynchronousPopulation . a r X i v : . [ c s . PF ] J u l NICOLAS GAST (INRIA), DIEGO LATELLA (CNR-ISTI), AND MIEKE MASSINK (CNR-ISTI) the objects are asynchronous. In this case the mean field approximation is givenby a continuous time dynamical system (often a system of ordinary differentialequations) – this is the most studied case e.g. [15, 2, 4]. The second type ariseswhen the objects are synchronous. In this case the mean field approximation is adiscrete time dynamical system [5, 11, 22]. We focus on the latter.Contributions. Our main contribution is an extension to (synchronous) DTMC pop-ulation models of the results proposed in [13] for (asynchronous) CTMC populationmodels, thus providing a new approximation technique that is significantly moreaccurate than classical mean field approximation, especially for relatively small sys-tems. Our results apply to the classical model of [5, 11, 16]. We prove our resultfor the transient and the steady state dynamics. Moreover, it retains an interestingfeature of mean field approximation by being computationally non-intensive.More precisely, if M ( N ) i ( t ) denotes the proportion of objects in a state i at time t , then the classical result of [5] states that, as N grows large, if the vector M ( N ) (0)converges almost surely to m , for some vector m , then the vector M ( N ) ( t ) convergesalmost surely to a deterministic quantity µ ( t ) that satisfies a recurrence equationof the form µ ( t + 1) = µ ( t ) K ( µ ( t )) with µ (0) = m . We show that, for any twicedifferentiable function h , there exists a constant V t,h such thatlim N →∞ N ( E (cid:104) h ( M ( N ) t ) (cid:105) − h ( µ ( t ))) = V t,h . (1)We provide an algorithm to compute the constant V t,h by a linear dynamical systemthat involves the first and second derivative of the functions m (cid:55)→ m K ( m ) and h .We also show that if the function m (cid:55)→ m K ( m ) has a unique fixed point µ ( ∞ ) thatis globally exponentially stable, then the same result holds for the steady-state :in this case, V ∞ ,h = lim t →∞ V t,h exists and can be expressed as the solution ofa discrete-time Lyapunov equation that involves the first and second derivative of m (cid:55)→ m K ( m ) and h evaluated at the point µ ( ∞ ).By using these results, we define a quantity h ( µ ( t )) + V t,h /N that we call the refined mean field approximation. As opposed to the classical mean field approxi-mation, this approximation depends on the system size N . We illustrate our the-oretical results with four different examples. While these examples all show thatour refined model is clearly more accurate than the classical approximations, theyillustrate different characteristics. The first two examples are cases where the dy-namical system has a unique exponentially stable attractor. In these examples,refined mean field provides performance estimates that are extremely accurate (thetypical error between M ( N ) and µ is less than 1% for N = 10). The third exampleis different as it is a case when the stochastic system has two absorbing states. 
Inthis case, the refined mean field is still more accurate than the classical mean fieldapproximation but remains far from the exact values for N = 10. It is only forlarger values of N that the refined mean field provides a very accurate estimate.Finally, the fourth example is a case where the mean field approximation has aunique attractor that is not exponentially stable. We observe that in this case therefined approximation provides an accurate approximation of E [ M ( N ) ( t )] for smallvalues of t but fails to predict correctly what happens when t is large compared to N . In fact in this case, one cannot refine the steady-state expectation by a term in O (1 /N ) because the convergence is only in O (1 / √ N ) in this case.This suggests that, when using a mean field or refined mean field approximation,one has to be careful : the approximations of a system with more than one stable REFINED MEAN FIELD APPROXIMATION OF SYNCHRONOUS DISCRETE-TIME POPULATION MODELS3 equilibrium or a unique but non-exponentially stable equilibrium is likely to beinaccurate for small values of N , even when one focuses on the transient behaviour.Related work. Our results extend the recent results of [13]. The authors of [13]study the steady-state of stochastic models that have a continuous-time mean fieldapproximation. They show that Equation (1) is true in this case and providea numerical algorithm to compute the constant. Our paper has two theoreticalcontributions with respect to [13] : First we show that the results also hold formodels that have a discrete-time mean field approximation, and second we showhow to derive these equations for the transient and steady state regimes of suchsystems. This means that our results remain in the realm of discrete-time modelswhereas in [13] it is shown how some discrete-time models can be transformed intodensity-dependent (continuous time) population models by replacing time stepswith steps that last for a random time that is exponentially distributed with mean1 /N , where N is the population size. The resulting continuous time model canthen be analysed using the approximation techniques for CTMC population modelsdiscussed in [13].The results of [13] and the one of the current paper follow from a series of recentresults concerning the rate of convergence of stochastic models to their mean fieldapproximation [9, 27, 28, 14]. The key idea behind these works is to study theconvergence of the generator of the stochastic processes to the one of its mean fieldapproximation and to use this convergence rate to obtain a bound on Equation (1).For the steady-state regime, this is made possible by using Stein’s methods [21, 7, 6].Note that the approach taken in the current paper is fundamentally different fromthe one that is usually used to obtain convergence rates, like [12, 3, 11] in which theauthors focus on sample path convergence and obtain bounds on the convergence ofthe expected distance E [ (cid:107) M ( N ) − µ (cid:107) ] between the stochastic system and its meanfield approximation. When focusing on sample path convergence, the refinementof the mean field approximation would be to consider an additive term of 1 / √ N times a Gaussian noise as for example in [12] and not a 1 /N term as in this paper.Outline. The rest of the paper is organised as follows. In Section 2, we introducethe model that we study. In Section 3 we provide the main results and in particularTheorem 1. 
In Sections 4, 5, 6 and 7, we provide a few numerical examples thatdemonstrate the accuracy of the refined mean field approximation and its limits.Finally, we conclude in Section 8.2. Preliminaries
In this section we introduce some terminology and notation as well as somepreliminary definitions, setting the context for the rest of the paper.2.1.
Notations.
We let IN denote the set of natural numbers and IR n ≥ the set of n -tuples of non-negative real numbers; we conventionally see any such an n -tuple m = ( m , . . . , m n ) as a row-vector, i.e. a 1 × n matrix. We let U n ⊂ IR n ≥ be theunit simplex of IR n ≥ , that is U n = { m ∈ [0 , n | m + . . . + m n = 1 } .For function f : IR n → IR p continuous and twice differentiable, with f ( m ) =( f ( m ) , . . . , f p ( m )) , we denote by Df and D f its first and second derivatives,respectively. Df ( m ) is the p × n (function) matrix such that ( Df ( m )) ij = ∂f i ( m ) ∂m j . D f ( m ) is the p × n × n tensor such that ( D f ( m )) ijk = ∂ f i ( m ) ∂m j ∂m k . NICOLAS GAST (INRIA), DIEGO LATELLA (CNR-ISTI), AND MIEKE MASSINK (CNR-ISTI)
Moreover, for a p × n × n tensor P and a n × n matrix Q , we let P · Q be the(row) vector in IR p such that ( P · Q ) i = (cid:80) nj,k =1 ( P ) ijk ( Q ) jk , for i = 1 , . . . , p . Inaddition, for p × n matrix A and n × m matrix B we use the notation AB forstandard matrix product: ( AB ) ij = (cid:80) nk =1 ( A ) ik ( B ) kj ; obviously this includes thecase of vector inner product u T v for u and v being n × A T is the transpose of A , i.e. ( A T ) ij = ( A ) ji ; the vector outer product u ⊗ v ,sometimes also denoted by u v T , is the matrix such that ( u ⊗ v ) i,j = u i v j .Finally, for any vector v , we let (cid:107) v (cid:107) denote the norm of v and (cid:107) v (cid:107) denote thesquare of (cid:107) v (cid:107) . Note that the choice of the specific norm is left unspecified becausethe results presented in the present paper hold for any norm . Note that in theproofs, what we denote by E, o ( (cid:107) E (cid:107) ) , E [ o ( (cid:107) E (cid:107) )] are p -dimensional vectors whiletheir norm (cid:107) E (cid:107) , (cid:107) E (cid:107) , (cid:107) E [ o ( (cid:107) E (cid:107) )] (cid:107) ∈ IR are real non-negative numbers.2.2.
Synchronous Mean Field Model.
We consider a system of 0 < N ∈ INidentical interacting objects; N is called the size of the system. We assumethat the number of local states of each object is finite , say n ; for the sake ofsimplicity, in the sequel we let the set of states of local objects be { , . . . , n } ,when not specified otherwise. Time is discrete and the behaviour of the systemis characterised by a (time homogeneous) discrete time Markov chain (DTMC) X ( N ) ( t ) = ( X ( N )1 ( t ) , . . . , X ( N ) N ( t )), where X ( N ) i ( t ) is the state of object i at time t ,for i = 1 , . . . , N .We define the occupancy measure at time t as the row-vector M ( N ) ( t ) = ( M ( N )1 ( t ) , . . . , M ( N ) n ( t ))where, for j = 1 , . . . , n , M ( N ) j ( t ) is the fraction of objects in state j at time t , overthe total population of N objects: M ( N ) j ( t ) = 1 N N (cid:88) i =1 { X ( N ) i ( t )= j } where 1 { x = j } is equal to 1 if x = j and 0 otherwise.At each time step t ∈ IN each object performs a local transition, that may alsobe a self-loop to its current state. The transition probabilities of an object statedepend on the current local state of the object and may depend also on M ( t ).We let K ( m ) denote the one-step transition probability n × n matrix of an objectin the system: K ij ( m ) is the probability for the object to jump from state i to state j in the system when the occupancy measure vector is m . We assume that, giventhe occupancy measure, the transitions made by two objects are independent. Ourmodel is identical to the one of [5] up to the fact that the authors of [5] add acontinuous resource to the model and allow object transition matrix K to dependalso on the size N of the system (in which case they assume that the sequenceof transition matrices K N converges to a function K as N goes to infinity). To Of course, technical details in the proofs may depend on the specific norm (see e.g. footnote 4in the proof of Lemma 4) It is worth pointing out here that the requirement that the N objects be identical can berelaxed, since a system with different classes of identical objects can easily be modelled by con-sidering an equivalent system with instances of an object whose set of states is the union of thoseof the original objects and similarly for the set of its transitions, as shown in the example inSection 5. In fact, the same theoretical results could be derived for infinite dimensional models with twoadditional assumptions to cope with the fact that the simplex U ∞ is not compact : (i) imposingall functions to be uniformly continuous and (ii) Imposing tightness assumptions. REFINED MEAN FIELD APPROXIMATION OF SYNCHRONOUS DISCRETE-TIME POPULATION MODELS5 simplify the exposition, in this paper we consider a case without resource and weassume K ( m ) is a continuous function of m that does not depend on N . The resultspresented in this paper could be extended to the more general case where K ( m ) isalso a function of N and could be modified to incorporate a resource, essentially byreplacing our equation for the variance Γ by the one presented in [11, Equation (7)].Below we recall Theorem 4.1 of [5] on classical mean field approximation, underthe simplifying assumptions mentioned above: Theorem 4.1 of [5] (Convergence to Mean Field)
Assumethat the initial occupancy measure M ( N ) (0) converges almost surelyto the deterministic limit µ (0) . Define µ ( t ) iteratively by (for t ≥ ): µ ( t + 1) = µ ( t ) K ( µ ( t )) . (2) Then for any fixed time t , almost surely: lim N →∞ M ( N ) ( t ) = µ ( t ) . In the sequel, we will write M ( t ) or simply M instead of M ( N ) ( t ), leaving N and t implicit, when this does not cause confusion.3. Refined Deterministic Approximation Theorem
In this section, we present our main results (Theorem 1 and Theorem 2) andprovide their proofs.3.1.
First Main Result : Transient Behaviour.
The iterative procedure ofTheorem 4.1 of [5] can be formalised as a t -indexed family of functions Φ t from U n to U n , where the functions Φ t : U n → U n are defined as follows:Φ ( m ) = m ; (Φ ( m )) j = n (cid:88) i =1 m i K ij ( m ); Φ t +1 ( m ) = Φ (Φ t ( m ))This implies that for all m ∈ U n and t ∈ IN, we have Φ (Φ t ( m )) = Φ t (Φ ( m )).In the following we assume that Φ is continuous and twice differentiable withrespect to m and that its second derivative is continuous (note that as U n is com-pact, this implies that Φ and its first two derivative are uniformly continuous).Moreover, in what follows, we assume that M ( N ) (0) converges to µ (0) (a determin-istic value) as N goes to infinity and we let µ ( t ) be defined as in Equation (2), or,equivalently, µ ( t + 1) = Φ ( µ ( t )) = Φ t +1 ( µ (0)).Our main theorem can be stated as follows. Theorem 1.
Assume that the function Φ is twice differentiable with continu-ous second derivative and that M ( N ) (0) converges weakly to µ (0) . Let A t and B t be respectively the n × n matrix A t = ( D Φ )( µ ( t )) and the n × n × n tensor B t = ( D Φ )( µ ( t )) . Then for any continuous and twice differentiable function withcontinuous second derivative h : U n → IR p ≥ we have: lim N →∞ N E (cid:104) h ( M ( N ) ( t )) − h (Φ t ( M ( N ) (0))) (cid:105) = Dh ( µ ( t )) V t + 12 D h ( µ ( t )) · W t , NICOLAS GAST (INRIA), DIEGO LATELLA (CNR-ISTI), AND MIEKE MASSINK (CNR-ISTI) where V t is an n × vector and W t is an n × n matrix, defined as follows: V t +1 = A t V t + B t · W t W t +1 = Γ( µ ( t )) + A t W t A Tt , with V = 0 , W = 0 and Γ( m ) is the following n × n matrix: Γ jj ( m ) = (cid:80) ni =1 m i K ij ( m )(1 − K ij ( m ))Γ jk ( m ) = − (cid:80) ni =1 m i K ij ( m ) K ik ( m )The key idea of the proof is to use a Taylor expansion of E [ h (Φ ( m ))] aroundΦ ( m ). We postpone the proof to Section 3.3.1.One of the main consequences of Theorem 1 is that it allows us to computeprecisely a development in O (1 /N ) of the mean and the covariance of the vector M ( N ) ( t ). This first order development is what we call the refined mean field approx-imation . In our numerical simulations, we will show that this refined approximationcan greatly improve the accuracy of the original mean field approximation whenthe number of entities N is relatively small. Corollary 1.
Let t ∈ IN . Under the assumptions of Theorem 1, and denoting µ ( t ) = Φ t ( m ) , it holds that (i) For any coordinate i and any time-step E (cid:104) M ( N ) i ( t ) (cid:105) = µ i ( t ) + ( V t ) i N + o (cid:18) N (cid:19) . (ii) For any pair of coordinates i, j , the co-variance satisfies cov (cid:16) M ( N ) i ( t ) , M ( N ) j ( t ) (cid:17) = 1 N ( W t ) i,j + o (cid:18) N (cid:19) . Second Main Result : Steady-State.
Mean field approximation can alsobe used to characterise the steady-state behaviour of a population model. It hasbeen shown that, in the case of continuous time or discrete-time mean field approx-imation, if this approximation has a unique attractor, then the stationary distri-bution of the system of size N concentrates on this attractor. In this section, werefine this result by computing the rate of convergence and by defining the refinedapproximation in this case.We say that a point µ ( ∞ ) is an exponentially stable attractor if • For any m ∈ U n : lim t →∞ Φ t ( m ) = µ ( ∞ ) ( i.e. it is a global attractor). • There exists an open neighbourhood V of µ ( ∞ ) and two constants a, b such that all m ∈ V : (cid:107) Φ t ( m ) − µ ( ∞ ) (cid:107) ≤ ae − bt (cid:107) m − µ ( ∞ ) (cid:107) ( i.e. it isexponentially stable). Theorem 2.
Assume that M ( N ) has a unique stationary distribution (for each N ), that the function Φ is twice differentiable and that the flow has a uniqueexponentially stable attractor µ ( ∞ ) . Then there exists a n × vector V ∞ and a n × n matrix W ∞ such that the constants V t and W t defined in Theorem 1 satisfy: lim t →∞ V t = V ∞ and lim t →∞ W t = W ∞ Moreover
REFINED MEAN FIELD APPROXIMATION OF SYNCHRONOUS DISCRETE-TIME POPULATION MODELS7 (i) W ∞ is the unique solution of the discrete-time Lyapunov equation: A ∞ W A T ∞ − W + Γ( µ ( ∞ )) = 0 and V ∞ is uniquely determined by V ∞ = 12 ( I − A ∞ ) − B ∞ W ∞ , where A ∞ = D Φ ( µ ( ∞ )) , B ∞ = D Φ ( µ ( ∞ )) and I is the identity matrix. (ii) for any twice differentiable function h , we can exchange the limits : lim N →∞ lim t →∞ N (cid:0) E [ h ( M ( t ))] − h (Φ t ( M ( N ) (0))) (cid:1) = lim t →∞ lim N →∞ N (cid:0) E [ h ( M ( t ))] − h (Φ t ( M ( N ) (0))) (cid:1) = Dh ( µ ( ∞ )) V ∞ + 12 D h ( µ ( ∞ )) · W ∞ , This result is interesting for at least two reasons. First, it generalises Theorem 1to the case of stationary distribution. Second, it is also the first result that providesa rate of convergence for the steady-state distribution of a model that has a discrete-time mean field approximation (to the best of our knowledge, the rate of convergencehad only been obtained for the finite-time horizon in [11]). We postpone the proofto Section 3.3.2.3.3.
Proofs.
To ease notation, in all the proofs we denote M N (0) by m , unlessspecified otherwise. In particular, when we write E [ h ( M ( t ))] we formally mean E [ h ( M ( N ) ( t )) | M ( N ) (0) = m ].3.3.1. Proof of Theorem 1.
One of the key ingredients to prove our result is to studywhat happens for t = 1. This is what we do in Lemma 1. Then Theorem 1 willfollow by using an induction on t . Note that one of the main technicalities in allthese lemmas is to prove that the convergence is uniform in the initial condition.This is why we make use of various functions ε ( N ) or ε t,g ( N ) that control the uniform convergence to 0. Lemma 1.
Let h : U n → IR p ≥ (for p ≥ ), be a twice differentiable func-tion such that D h is continuous. Then, there exists a function ε ( N ) such that lim N →∞ ε ( N ) = 0 and for all M ( N ) (0) = m ∈ U n , the following holds: (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) N (cid:0) E [ h ( M (1))] − h (Φ ( m )) (cid:1) −
12 ( D h )(Φ ( m )) · Γ( m ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ ε ( N ) . where Γ( m ) is defined in Theorem 1.Proof. The key idea of this proof is to consider the Taylor expansion of h for M (1)in the neighbourhood of Φ ( m ). Let E = M (1) − Φ ( m ). We get the following : h ( M (1)) − h (Φ ( m )) = ( Dh )(Φ ( m )) E + 12 ( D h )(Φ ( m )) · ( E ⊗ E ) + o ( || E || )Taking the expectation on both sides, we get : E [ h ( M (1))] − h (Φ ( m ))= ( Dh )(Φ ( m )) E [ E ] + 12 ( D h )(Φ ( m )) · E [( E ⊗ E )] + E [ o ( || E || )]The result follows by using Lemma 3, that establishes that E [ E ] = 0 and N E [( E ⊗ E )] = Γ( m ) and Lemma 4 that shows || N E [ o ( || E || )] || ≤ ε ( N ). (cid:3) NICOLAS GAST (INRIA), DIEGO LATELLA (CNR-ISTI), AND MIEKE MASSINK (CNR-ISTI)
The following lemma is a direct generalisation of Lemma 1 and it will be usedin the proof of Theorem 1. The proof is essentially the same as that of Lemma 1and exploits the time-homogeneity of the Markov chain.
Lemma 2.
Let h : U n → IR p ≥ be a twice differentiable function whose second deriv-ative is continuous. Then there exists a function ε ( N ) such that lim N →∞ ε ( N ) = 0 and for all t ∈ IN , N ∈ IN > and M ( N ) ( t ) = m (cid:48) ∈ U n , the following holds: (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) N (cid:0) E [ h ( M ( t + 1)) | M ( t ) = m (cid:48) ] − h (Φ ( m (cid:48) )) (cid:1) −
12 ( D h )(Φ ( m (cid:48) )) · Γ( m (cid:48) ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ ε ( N ) Lemma 3.
Under the assumptions of Lemma 1, we obtain that E = E [ M ( N ) (1) − Φ ( m ) | M ( N ) (0) = m ] satisfies E [ E ] = 0 and E [ E ⊗ E ] =Γ( m ) /N .Proof. We observe that by definition of our model, M j (1) is the following randomvariable:(3) M j (1) = 1 N n (cid:88) i =1 (cid:98) B ij where ( (cid:98) B i,. ) is a random vector with multinomial distribution, with parameters N m i and ( K i, · ( m )). The variables are independent for different values of i (inparticular, if i (cid:54) = i (cid:48) we have cov (cid:16) (cid:98) B ij , (cid:98) B i (cid:48) k (cid:17) = 0). Moreover, for all i and all j (cid:54) = k : E (cid:104) (cid:98) B ij (cid:105) = N m i K ij ( m )var (cid:16) (cid:98) B ij (cid:17) = N m i K ij ( m )(1 − K ij ( m ))cov (cid:16) (cid:98) B ij , (cid:98) B ik (cid:17) = N m i K ij ( m ) K ik ( m ) . This implies that E [ M j (1)] = E (cid:34) N n (cid:88) i =1 (cid:98) B ij (cid:35) = 1 N n (cid:88) i =1 E (cid:104) (cid:98) B ij (cid:105) = Φ ( m ) j . (4)The case of E ⊗ E makes again use of Equation (3). Note that by (4), E j = M j (1) − Φ ( m ) j = M j (1) − E [ M j (1)]. This shows that E [( E ⊗ E ) jk ] = cov ( M j (1) , M k (1)), i.e. the covariance of M j (1) and M k (1). We consider the case k = j and the case k (cid:54) = j separately. Case k = j . N E [( E ⊗ E ) jj ] = N var ( M j (1))= N var (cid:32) N n (cid:88) i =1 (cid:98) B ij (cid:33) = 1 N n (cid:88) i =1 var (cid:16) (cid:98) B ij (cid:17) = n (cid:88) i =1 m i K ij ( m )(1 − K ij ( m )) , where the one but last equality comes from the independence of the variables (cid:98) B ij for i ∈ { . . . n } . REFINED MEAN FIELD APPROXIMATION OF SYNCHRONOUS DISCRETE-TIME POPULATION MODELS9
Case k (cid:54) = j . This case is similar. N E [( E ⊗ E ) jk ] = N cov ( M j (1) , M k (1))= N n (cid:88) i =1 n (cid:88) i (cid:48) =1 N cov (cid:16) (cid:98) B ij , (cid:98) B i (cid:48) k (cid:17) = − n (cid:88) i =1 m i K ij ( m ) K ik ( m ) , where in the double sum, only the terms i = i (cid:48) are non-zero because (cid:98) B ij and (cid:98) B i (cid:48) k are independent when i (cid:54) = i (cid:48) . (cid:3) Lemma 4.
Under the assumptions of Lemma 1 and using the notations of Lemma 1,there exists a function ε ( N ) such that lim N →∞ ε ( N ) = 0 and that || N E [ o ( || E || )] || ≤ ε ( N ) .Proof. First of all we note that, as D h is continuous, it is uniformly continu-ous (because U is compact). Hence, the term o ( || E || ) ∈ IR p is uniform in m and || E || , i.e. , there exist a function δ : IR n → IR and a constant γ > || e ||→ δ ( e ) = 0, δ ( e ) ≤ γ for all e ∈ IR n , and || o ( || E || ) || ≤ || E || δ ( E ).This implies E [ || o ( || E || ) || ] ≤ E [ || E || δ ( E )]. The proof proceeds with the followingderivation:First, note that lim || e ||→ δ ( e ) = 0 implies that for all (cid:15) >
0, there exists a (cid:15) > δ ( e ) ≤ (cid:15) for all e such that || e || ≤ a (cid:15) . Therefore E [ || E || δ ( E )] ≤ E [ || E || δ ( E ) || E ||≥ a (cid:15) ] + E [ || E || (cid:15) || E || ( N η γ ne − Na (cid:15) /n + (cid:15) ). (cid:3) Lemma 5.
Under the assumptions of Lemma 1 and that M ( N ) (0) converges weaklyto µ (0) as N goes to infinity, for any continuous function g : U n → IR p ≥ and all t ,there exists a function ε t,g ( N ) such that lim N →∞ ε t,g ( N ) = 0 and (cid:107) E [ g ( M ( t ))] − g ( µ ( t )) (cid:107) ≤ ε t,g ( N ) . The specific value of η depends on the norm used; for instance η = 1 for the infinity norm. Proof.
We proceed by induction on t . The lemma holds for t = 0 because M ( N ) (0)converges weakly to µ (0). As g is continuous, there exists a function δ : IR + → IR + such that (cid:107) g ( m ) − g ( m (cid:48) ) (cid:107) ≤ δ ( (cid:107) m − m (cid:48) (cid:107) ) and lim r → δ ( r ) = 0. Moreover, as g andΦ t are continuous, g ◦ Φ t is also uniformly continuous (and the continuity is uniformsince U n is compact). Hence E [ (cid:107) g ( M ( t + 1)) − g ( µ ( t + 1)) (cid:107) ]= E [ (cid:107) g ( M ( t + 1)) − g (Φ ( M ( t ))) + g (Φ ( M ( t ))) − g ( µ ( t + 1)) (cid:107) ] ≤ E [ δ ( (cid:107) M ( t + 1) − Φ ( M ( t )) (cid:107) )] + E [ g ◦ Φ ( M ( t )) − g ◦ Φ ( µ ( t ))] ≤ E [ E [ δ ( (cid:107) E (cid:107) ) | M ( t )]] + ε t,g ◦ Φ , where E = M ( t +1) − Φ ( M ( t )) converges to 0 (uniformly in M(t)) by Lemma 3. (cid:3) Of the main theorem.
We proceed by induction on t . The theorem clearly holdsfor t = 0 because Φ ( M ( N ) (0)) = M ( N ) (0) by definition of Φ . Assume that thetheorem now holds for some t ≥
0. We have : N ( E [ h ( M ( t + 1))] − h ( µ ( t + 1))) = N E [ h ( M ( t + 1)) − h (Φ ( M ( t )))]+ N ( E [ h (Φ ( M ( t )))] − h ( µ ( t + 1))) . We will analyse the two lines separately. For the first line, the idea is to useLemma 1. Indeed this line is equal to E [ N E [ h ( M ( t + 1)) − h (Φ ( M ( t ))) | M ( t )]]= E (cid:20)
12 ( D h )(Φ ( M ( t ))) · Γ( M ( t )) (cid:21) + Θ( N ) , where by Lemma 1, Θ( N ) is such that (cid:107) Θ( N ) (cid:107) ≤ ε ( N ). By Lemma 5 with g =( D h )(Φ ) · Γ, as N goes to infinity, this quantity converges to12 ( D h )(Φ ( µ ( t ))) · Γ( µ ( t )) = 12 ( D h )( µ ( t + 1)) · Γ( µ ( t )) . (5)For the second line, the idea is to apply the induction hypothesis to h ◦ Φ whichcan be done because the h ◦ Φ is twice differentiable (because both h and Φ are).This shows that N ( E [ h (Φ ( M ( t )))] − h ( µ ( t + 1)))= N ( E [ h (Φ ( M ( t )))] − h (Φ ( µ ( t ))))= D ( h ◦ Φ )( µ ( t )) V t + 12 D ( h ◦ Φ )( µ ( t )) · W t + ε t,h ◦ Φ ( N )The first term can be dealt with by applying the chain rule D ( h ◦ Φ ) = ( Dh )(Φ )( D Φ )which shows that: D ( h ◦ Φ )( µ ( t )) V t = ( Dh )(Φ ( µ ( t )))( D Φ )( µ ( t )) V t = ( Dh )( µ ( t + 1))( D Φ )( µ ( t )) V t = ( Dh )( µ ( t + 1)) A t V t . (6) REFINED MEAN FIELD APPROXIMATION OF SYNCHRONOUS DISCRETE-TIME POPULATION MODELS11
For the second term, we apply the product rule and again the chain rule:12 D ( h ◦ Φ ) · W t = 12 D (cid:0) ( Dh )(Φ )( D Φ ) (cid:1) · W t = (cid:16) D (cid:0) ( Dh )(Φ ) (cid:1) · ( D Φ ) + ( Dh )(Φ ) · D ( D Φ ) (cid:17) · W t = (cid:16) ( D h )(Φ ) · ( D Φ )( D Φ ) T + ( Dh )(Φ )( D Φ ) (cid:17) · W t By applying the last function at the point µ ( t ) we get that:12 D ( h ◦ Φ )( µ ( t )) · W t =( D h )(Φ ( µ ( t ))) · ( D Φ )( µ ( t )) 12 W t ( D Φ ) T ( µ ( t ))+ ( Dh )(Φ ( µ ( t )))( D Φ ( µ ( t ))) · W t , which, using the definition of µ ( t + 1) = Φ ( µ ( t )) and the assumptions A t =( D Φ )( µ ( t )) and B t = ( D Φ ( µ ( t ))), is the same as:12 ( D h )( µ ( t + 1)) · A t W t A Tt + 12 ( Dh )( µ ( t + 1))( B t · W t ) . (7)The theorem holds by combining Equations (5), (6) and (7). (cid:3) Proof of Theorem 2.Proof.
The proof is inspired by the proof of [13, Theorem 3.1] and uses ideas ofStein’s method. Because many details are similar to the proof of Theorem 1, weskip some details of computation in this proof.Let h be a twice-differentiable function and let G h be the function defined forall m by: G h ( m ) = ∞ (cid:88) t =0 [ h (Φ t ( m )) − h ( µ ( ∞ ))] .G h ( m ) is well defined because µ ( ∞ ) is exponentially stable attractor.By construction, for any m we have G h ( m ) = h ( m ) − h ( µ ( ∞ )) + ∞ (cid:88) t =1 [ h (Φ t ( m )) − h ( µ ( ∞ ))]= h ( m ) − h ( µ ( ∞ )) + G h (Φ ( m ))The above equation is a discrete time Poisson equation and implies that for any m : h ( m ) − h ( µ ( ∞ )) = G h ( m ) − G h (Φ ( m ))(8)Assume that at time 0, the initial state M (0) is distributed according to thestationary distribution of the system of size N . By the definition of stationarity, attime 1, M (1) is also distributed according to the same stationary distribution andwe have E [ G h ( M (0))] = E [ G h ( M (1))].By using (8) and then the above equation, we get: E [ M (0)] − h ( µ ( ∞ )) = E [ M (0) − h ( µ ( ∞ ))]= E [ G h ( M (0)) − G h (Φ ( M (0)))]= E [ G h ( M (1)) − G h (Φ ( M (0)))] Figure 1.
SEIR model of individual objectBy Lemma 1, this shows that : N ( E [ M ( ∞ )] − h ( µ ( ∞ ))) = E [ G h ( M (1)) − G h (Φ ( M (0)))]= E (cid:20) D ( G h )(Φ ( M (0))) · Γ( M (0)) (cid:21) + o (1)= 12 D ( G h )(Φ ( µ ( ∞ ))) · Γ( µ ( ∞ )) + o (1) , where the last equality comes from the fact that the stationary distribution of thesystem of size N converges weakly to a Dirac measure in µ ( ∞ ) as N goes to infinity(see [11, Corollary 14]).To conclude the proof, the only remaining step is to compute the second differ-ential of G h which can be expressed as the infinite sum: D ( G h )( µ ( ∞ )) = ∞ (cid:88) t =0 D ( h ◦ Φ t )( µ ( ∞ )) . The expressions for V ∞ and W ∞ come from plugging the above equations intoEquation (5), (6) and (7). The uniqueness of the solution of the Lyapunov equationis due to the fact that the fixed point µ ( ∞ ) is exponentially stable and thereforealso linearly stable. (cid:3) Refined Mean Field Model for SEIR
In this section we provide a simple example that illustrates the results for therefined mean field model of the simple computer epidemic SEIR example presentedin [4]. Each object in the model consists of four local states: Susceptible (S),Exposed (E), Infected (I) (and active) and Recovered (R). The four-state SEIRmodel of an individual object is shown in Figure 1.Its discrete time evolution is given by the following probability transition matrix K in which m S , m E , m I and m R denote the fraction of objects in the system thatare in local state S, E, I and R, respectively: K ( m S , m E , m I , m R ) = − ( α e + α i m I ) α e + α i m I − α a α a
00 0 1 − α r α r α l − α l REFINED MEAN FIELD APPROXIMATION OF SYNCHRONOUS DISCRETE-TIME POPULATION MODELS13
In other words, a susceptible becomes exposed with probability ( α e + α i m I ) – i.e. , α e denotes the external and α i the internal infection probability –; An exposed nodeactivates his infection with probability α a ; An infected recovers with probability α r ; and α l is the probability to loose the protection against infection.4.1. Computation of A , B and Γ . We illustrate how to apply Theorem 1 inits simplified form, when h is the identity function, as in Corollary 1– (i) . Thefirst step is to compute the Jacobian and the Hessian of the function Φ for ageneric occupancy measure vector m at time step t . Written as a column vector,the function Φ ( m ) = mK ( m ) is given byΦ ( m ) = m S (1 − α e − α i m I ) + α l m R m S ( α e + α i m I ) + (1 − α a ) m E m E α a + (1 − α r ) m I α r m I + (1 − α l ) m R Hence, the Jacobian is the following 4 × D (Φ )( m S , m E , m I , m R ) = − ( α e + α i m I ) 0 − α i m S α l α e + α i m I − α a α i m S α a − α r
00 0 α r − α l The Hessian is a 4 × × ×
4, onefor each function (Φ ) j , where j ∈ { S, E, I, R } : D ((Φ ) S )( m S , m E , m I , m R ) = − α i
00 0 0 0 − α i D ((Φ ) E )( m S , m E , m I , m R ) = α i
00 0 0 0 α i The matrices for I and R are two 4 × × K and the occupancy measure m as defined in Lemma 1. The refined mean fieldapproximation of the occupancy measure vector is thus given by E [ M ( N ) ( t )] ≈ µ ( t ) + V t /N , where V t is computed recursively, according to Theorem 1.4.2. Dynamics of the SEIR Model and its Approximations.
We considera model with the following parameter values for the local transition probabil-ities: α e = 0 . , α i = 0 . , α r = 0 . , α l = 0 .
01 and α a = 0 .
04. Initially, M (0) = (0 . , . , . , . N = 10, N = 20, N = 50 and N = 100, respectively; time t rangesfrom 0 to 500 time units.We observe that, as exepcted, the gap between the classical mean field approx-imation and the simulation is relatively small and decreases with N . Still, for N = 10, we observe a clear difference between the classical mean field approxi-mation and the simulation, whereas the refined mean field provides a much closer x s ( t ) (mean field) E [ X S [ t ]] (Simulation, N =10) S ( t )+ V ( t )/ N (refined mean field) 0 100 200 300 400 5000.150.160.170.180.190.20 x s ( t ) (mean field) E [ X S [ t ]] (Simulation, N =20) S ( t )+ V ( t )/ N (refined mean field) N = 10 N = 20 x s ( t ) (mean field) E [ X S [ t ]] (Simulation, N =50) S ( t )+ V ( t )/ N (refined mean field) 0 100 200 300 400 5000.150.160.170.180.190.20 x s ( t ) (mean field) E [ X S [ t ]] (Simulation, N =100) S ( t )+ V ( t )/ N (refined mean field) N = 50 N = 100 Figure 2.
Evolution of the fraction of objects in state S forpopulation sizes N = 10, N = 20, N = 50 and N = 100. Thefigures compare the classical mean field approximation (obtainedwith python – numpy ) with the refined one and with the average of10,000 simulation runs of the system.approximation (in this case, the graphs overlap almost everywhere). With the in-crease of the population size N both approximations converge to the same value,as well as the value obtained by simulation: for N ≥
50, the curves are almostindistinguishable.To highlight the differences, we plot in Figure 3 the difference between the twoapproximations with respect to the simulation : On the left panel, we plot a functionof time for the quantities E [ M ( t )] − µ ( t ); On the right we plot E [ M ( t )]+ V t /N − µ ( t )(in both cases for N = 10). We observe that the refined mean field approximation(right panel) is an order of magnitude closer to the value obtained by simulation:while the error of the classical mean field approximation can be larger than 0 . . N . To go further, we study the steady-state distribution in Table 1 in which wedisplay the average proportion of objects in states S , E , I or R estimated by simula-tion, refined mean field approximation and classical mean field approximation. Thisillustrates the approximation accuracy for the steady state of the SEIR example foreach local state of an object. As for the two previous figures, this table illustratesthat the refined mean field approximation provides very accurate estimates of thetrue stationary distribution even for very small values of N , which shows that theasymptotic results presented in Theorem 2 are also useful for small values of N . REFINED MEAN FIELD APPROXIMATION OF SYNCHRONOUS DISCRETE-TIME POPULATION MODELS15 t D i ff e r e n c e S i m u l a t i o n - ( m e a n f i e l d ) S mf ErrorE mf ErrorI mf ErrorR mf Error 0 100 200 300 400 500Time t D i ff e r e n c e S i m u l a t i o n - ( r e f i n e d m e a n f i e l d ) S rmf ErrorE rmf ErrorI rmf ErrorR rmf Error
Error of the mean field approximation Error of the refined mean field approx.
Figure 3.
SEIR model: Quantification of the difference (error)between the simulation results and the classical mean field approx-imation (left) and the refined mean field result (right), respectively,for N = 10. The results show simulation value minus mean fieldvalue.These results are in line with the results presented in [13] for continuous time meanfield models. State S E I R
Simulation ( N = 10) 0.191 0.115 0.231 0.462Refined mean field ( N = 10) 0.189 0.116 0.232 0.464Mean field ( N = 10) 0.164 0.119 0.239 0.478 Table 1.
SEIR model: Comparison of the accuracy of the meanfield and refined mean field approximation. The columns show theaverage proportion of objects in the states susceptible (S), exposed(E), infected (I) and recovered (R), respectively. Each item in thetable was computed by measuring the occupancy measure for times t = 1000, i.e. when the systems’ occupancy measure has reacheda sufficiently stable value. Simulation values are averages over100 ,
000 simulations.5.
Refined Mean Field Model for WSN
The next example concerns a simple model of a wireless sensor network [3]. Suchnetworks are composed of wireless sensor nodes and gateways. This example servestwo purposes. First it shows that the assumption of homogeneous objects is notrestrictive: In this example, there are two classes of objects, which is representedby having a block-diagonal matrix K . Second, we use it to consider a function h that is not just the projection on one coordinate.Wireless sensor nodes have three local states. In the initial state, e , a sensornode waits for detecting an event of interest and collects data for that event. Afterthat, the node moves to state c to communicate its data to an available gateway.The communication attempt may timeout if no gateway is available. In that case cd e ab ληγ βm a βm c α Figure 4.
WSN model of individual objects: Sensor Node (left)and Gateway (right)the sensor node moves to state d , introducing some delay before moving back tostate c for a further communication attempt.Gateway nodes have two states. Initially they are in state a and available toreceive data from a sensor node. Upon connection to a sensor node they moveto state b during which they are busy processing the data. When in state b theyare temporarily unavailable for communication with other sensor nodes. Afterprocessing the batch of data they move back to state a .We consider a model where objects have five local states { a, b, c, d, e } , where a and b are states of a gateway node and c, d and e are states of a sensor node, i.e.each object in the model can behave either as a gateway or as a sensor, but itcannot change its behaviour from that of a gateway to that of a sensor, or vice-versa. A system is then composed of N (syntactical) homogeneous objects with afixed fraction m G = m a + m b of gateway nodes and a fraction m W = m c + m d + m e of wireless sensor nodes, such that m W = 1 − m G . To keep the model simple forthe purpose of illustrating the refined mean field approach, we do not considerinterference due to collision in the communication between nodes and gateways.The probability transition matrix is given below: K ( m ) = − βm c βm c α − α − γ − βm a γ βm a η − η
00 0 λ − λ , where α denotes the probability of the gateway to get again available, β the prob-ability of data communication between the gateway and a sensor node, λ the prob-ability that a sensor node is ready to send data, γ the probability that a sensornode performs a time-out and η the probability that a delayed sensor node tries tocommunicate again.In the example we will use the following values for the above parameters: α =0 . , β = 0 . , λ = 0 . , γ = 0 . , and η = 0 .
01, and let M ( N ) ( t ) denote, as usual,the occupancy measure process of the WSN model (leaving N and t implicit forthe sake of notation simplicity). We are interested in the average response time ofa sensor node, i.e. the time a sensor node needs to wait to be able to communicateits data to the gateway. This expected response time can be defined as the fractionbetween the sensor nodes that are already waiting to communicate their data, i.e.the sensor nodes in state c and state d , and the new sensor nodes that became REFINED MEAN FIELD APPROXIMATION OF SYNCHRONOUS DISCRETE-TIME POPULATION MODELS17 ready in the current time step, i.e. λ times the nodes in local state e : E [ R ] = E (cid:20) ( M c + M d ) λM e (cid:21) , With reference to Theorem 1, we define h ( x , x , x , x , x ) = ( x + x ) λx .5.1. Computation of A , B and Γ for the WSN model. In the sequel, we makereference to m = ( m a , m b , m c , m d , m e ) ∈ U . The Jacobian of function Φ : D (Φ )( m ) = − βm c α − βm a βm c − α βm a − βm c − γ − βm a η λ γ − η βm c βm a − λ The Hessian of the function Φ satisfies D ((Φ ) d )( m ) = 0: D ((Φ ) a )( m ) = D ((Φ ) c )( m ) = − β − β D ((Φ ) b )( m ) = D ((Φ ) e )( m ) = β β The Jacobian of function h is D ( h )( m ) = (0 , , λm e , λm e , − m c + m d λm e ) and its Hessianis D ( h )( m ) = − λm e − λm e − λm e − λm e ∗ ( m c + m d ) λ ∗ m e . The 5 × V t and the 5 × W t are computed recursively accordingto Theorem 1, using the new Jacobian and Hessian for function Φ; thus the refinedmean field approximation of the measure of interest is given by E [ h ( M ( N ) ( t ))] ≈ h (Φ t ( m )) + ( D ( h ) · V t + ∗ ( D ( h ) · W t )) /N .5.2. Results.
In Figure 5 various approximations of the expected response timefor the WSN model are shown, for time values t ranging from 0 to 400 time units.We consider a relatively small system with 15 nodes (10 sensor and 5 gatewaynodes). Recall that the function h is defined by : h ( x , x , x , x , x ) = ( x + x ) λx .We compare five curves :(1) The blue curve labelled Classic Mean Field (1) (obtained with Octave)shows the expected response time when this is approximated by defining E [ R ] as in [3]: E [ R ] = ( m c + m d ) λm e = h ( m ) , where m c , m d and m e denote the classical mean field approximation valuesfor the fractions of sensor nodes being in state c , d and e , respectively.(2) The red curve (2) is the expectation of h ( M ), computed by stochasticsimulation: E [ h ( M )] = min( E (cid:20) ( M c + M d ) λM e (cid:21) , M e = 0 occasionally. This is why we defined E [ h ( M )] as the minimumbetween the actual value and 100. The latter is the value one obtainswhen one but all nodes are waiting and the last node is getting ready forcommunication too: M c + M d = 9 and M e = 1, i.e. 9/(0.09*1) = 100.(3) The orange curve (3) shows the expected response time approximated usingthe refined mean field approximation of Theorem 1 with the function h .For comparison, we also compute two other quantities :4. The purple curve (4) shows the response time approximated as follows,using the refined mean field approximation for the fraction of sensor nodesin each state (i.e. we use Theorem 1 with the identity function and thenapply h ): h ( rmf ) = ( rmf c + rmf d ) λ rmf e where rmf c = µ c + V c /N denotes the refined mean field approximation ofthe fraction of sensor nodes in state c , and similarly for rmf d and rmf e .5. Finally, the green curve (5) shows the expected response time defined as: h ( E [ M ]) = ( E [ M c ] + E [ M d ]) λ E [ M e ]where E [ M c ] is the average fraction of sensor nodes in state c obtained viathe average of 100,000 individual simulation runs of the model. E [ M d ] and E [ M e ] are obtained in a similar way.In Figure 5(a), we plot these various curves for 15 nodes in total. We make twoobservations. First, in this case the value obtained by simulation (red curve (2))is almost 50% larger than the classic mean field approximation (1) whereas therefined approximation (3) is much closer. Second, the purple curve (4) is close tothe green curve (5) but quite far away from the red (2) and orange (3) curves. Thisshows that when applying Theorem 1, computing a refined model for E [ h ( M )] andfor h ( E [ M ]) might lead to very different results.Of course, the larger N gets (in an otherwise equal model), the closer the orange(3) and red (2) curves will get to the blue curve (1), i.e. the classic mean fieldapproximation, as illustrated in Figure 5(b). In both cases, all curves collapse intoto a single curve.6. Refined Mean Field Model for Majority Rule Decision-making
The example in this section concerns a model for collective decision-making. Themodel is inspired by the work of Montes de Oca et al. (see [8, 20] and referencestherein). Collective decision-making is a process whereby the members of a groupdecide on a course of action by consensus. Such collective decision-making processeshave also been applied in swarm robotics. In particular, in that context the robots
REFINED MEAN FIELD APPROXIMATION OF SYNCHRONOUS DISCRETE-TIME POPULATION MODELS19 t t (a) N = 15 : 10 sensors; 5 gateways (b) N = 1500 : 1000 sensors; 500 gateways Figure 5.
Expected response time E [ R ] for a sensor node tocommunicate its data to a GW for a WSN model with N nodesof which 2 N/ N/ N = 15 and1000 for N = 1500.where asked to choose between two actions that have the same effect but differ intheir execution times [8].One strategy of collective decision making is the use of the majority rule. Inthis strategy the agents in a population are initially divided into two groups. Onein which all members have opinion A and one where all members have opinionB. In every step three agents are selected randomly from the total population toform a temporary team. The team applies the majority rule such that all itsmembers adopt the opinion held by the majority (i.e. at least two) of the members,after which they return to the total population until the population has reached aconsensus on one of the two opinions.In the majority rule strategy extended with differential latency the agents in thepopulation are not all the time available for team formation. Both types of agentsare assumed to perform an action with a certain duration during which they cannotparticipate in team formation. For example, agents with opinion B perform suchactions taking (on average) relatively more time than those with opinion A. In [8]such latency periods for agents with opinion A and B are modelled by randomvariables with exponential distributions with rate λ A and λ B respectively. Forsimplicity, it can also be assumed that the A-type actions take 1 time unit onaverage (i.e. λ a = 1) and that B-type actions take 1 /λ time units on average,where λ takes a value in (0 , LA (latent A), NA (non-latentA), LB (latent B), NB (non-latent B). It is assumed that if an agent is latent itcannot be selected for team formation. The four state MRDL-DT model of anindividual object of the population is shown in Figure 6. The name of the statesindicate in which class the object is. LB NBLA NA keepBactBkeepAactAchangeBAchangeAB
Figure 6.
Majority rule differential latency model of an individ-ual object. Latent states are red, non-latent ones blue.The behaviour of an individual object is as follows. Initially the object is latent,and, assuming it has opinion A (state LA ), it finishes its job and becomes availablefor team-formation (transition actA ) moving to state NA with probability 1 /q , forappropriate q . When in NA it gets selected in a team with two other members.If the two other members have opinion B it changes its opinion into B and movesto state LB . This can happen with a probability 3 m NB where factor 3 modelsthe fact that we abstract from the exact order in which the members of the teamare selected, which can happen in 3 different ways. In alternative the two othermembers can have both opinion A, or one opinion A and the other opinion B. Inthat case the opinion of the object does not change and the object moves back to LA with probability q ( m NA + m NA m NB ).If the object is in state LB it becomes available for team formation, moving tostate NB with a probability λ/q , where λ is a value in (0 , NB is similar to that in state NA , except that now the opinion may change from B toA. The discrete time evolution of the model is given by probability transitionmatrix K in which m LA , m NA , m LB and m NB denote the fraction of objectsin the system that are in local state LA , NA , LB and NB , respectively and m = ( m LA , m NA , m LB , m NB ): K ( m ) = − q q q ( m NA + m NA m NB ) NA ( m ) q m NB
00 0 1 − λq λq q m NA q ( m NB + m NA m NB ) NB ( m ) where NA ( m ) = 1 − q ( m NB + m NA + m NA m NB )and NB ( m ) = 1 − q ( m NA + m NB + m NA m NB ) . REFINED MEAN FIELD APPROXIMATION OF SYNCHRONOUS DISCRETE-TIME POPULATION MODELS21
Since we are dealing with clock-synchronous discrete systems, we also introducedthe discretisation factor q = 10 so that only a fraction of the population is movingfrom the latent to the non-latent state at any time.In the example we use the values q = 10 and λ taking values 1.0, 0.5 and 0.25 inthe various analyses, modelling that task B takes the same time as task A, or twiceas much time or four times as much time as task A, on average , respectively. Weare interested in the evolution of the consensus, C A , on opinion A as a function ofthe initial values and the differential latency λ . Let E [ C A ] = E [ M LA + M NA ] = E [ h ( M LA , M NA , M LB , M NB )]where, with reference to Theorem 1, h ( x , x , x , x ) = x + x .6.1. Computation of A , B and Γ for the MRDL-DT model. As before, wefirst need to compute the Jacobian and the Hessian of the function Φ for a genericoccupancy measure vector m at time step t. The Jacobian of function Φ is: D (Φ )( m ) = − /q q m NA + q m NA m NB q m NA /q JNA ( m ) 0 − q m NA − q m NA m NB q m NB − λq q m NB m NA + q m NB − q m NB − q m NA m NB λq JNB ( m ) where JNA ( m ) = 1 − ( 9 q m NA + 6 q m NA m NB + 3 q m NB )and JNB ( m ) = 1 − ( 3 q m NA + 9 q m NB + 6 q m NA m NB ) . The Hessian of function Φ is: D ((Φ ) LA )( m ) = q m NA + q m NB q m NA q m NA D ((Φ ) NA )( m ) = − q m NA − q m NB − q m NA − q m NB − q m NA − q m NB − q m NA D ((Φ ) LB )( m ) = q m NB q m NB q m NB + q m NA D ((Φ ) NB )( m ) = − q m NB − q m NA − q m NB − q m NA − q m NB − q m NB − q m NA Results for the MRDL-DT example.
We first show some results for amedium size population of N = 160. Figure 7 shows the dynamics of the fractionsof the population having opinion A and B, for both the latent and non-latentobjects, for the first 200 time units. Similarly to the previous examples, a goodcorrespondence can be observed between the results of the mean of 1000 simulationruns and the mean field approximations. Also in this case the refined mean fieldprovides a better approximation than the classical mean field approximation forthe period ranging approximately from 50 to 150 time units. (a) Latent and non-latent (b) Dynamics of opinion A Figure 7.
Dynamics of opinion A, latent and non-latent. Clas-sical mean field (plain lines), refined mean field (dashed lines) andsimulation results (dotted lines) of MRDL model with 160 objects, λ = 1 .
0, q=10 and initially 0 . ∗
160 = 96 have opinion A and thepopulation is initially latent.A similar improved correspondence can be observed when considering the ag-gregated populations with opinion A or opinion B, respectively, as shown in Fig-ure 7(b).However, if a much smaller population is considered, e.g. N = 32, the meanfield approximation differs considerably from the simulation results and the refinedapproximation does not really improve the accuracy of the approximation. Thisis what can be observed in Figure 8, where the results for N = 32 are shown fora model without differential latency (i.e. λ = 1 . t ∈ [0 , N [20], where the critical density is given by m A = λ (1+ λ ) , where m A is the initial fraction of the population with opinion A . Forlarge N , if m A > λ (1+ λ ) the system almost surely reaches consensus on A, whereasfor m A < λ (1+ λ ) it almost surely reaches consensus on B. This explains why for REFINED MEAN FIELD APPROXIMATION OF SYNCHRONOUS DISCRETE-TIME POPULATION MODELS23
Figure 8.
Classical mean field (plain lines), refined mean field(dashed lines) and simulation results (dotted lines) of MRDL modelwith 32 objects, λ = 1 .
0, q=10 and initially (cid:98) . ∗ (cid:99) = 9 haveopinion A and the population is initially latent.larger populations the mean field approximations become increasingly accurate asshown in Figure 7.This example shows a limit of the refined mean field approximation: when asystem has multiple equilibrium point (and in particular when there are multipleabsorbing states as in this example), the dynamics of the mean field approximationdepends on the initial state of the system: For a given initial state the mean field willalways follow the same trajectory (it is a deterministic system). When the systemis large, the random fluctuations will remain small and the corresponding stochasticsystem will stay in the same basin of attraction. In the case of a small population,however, the dynamics will be greatly affected by the random fluctuations. Thesefluctuations can lead the system to another basin of attraction than the originalone.7. Non-Exponentially Stable Equilibrium : Accuracy versus Time
In the previous sections, the dynamical system m = Φ ( m ) has either one ex-ponentially stable attractor (Section 4) or multiple exponentially stable attractors(Section 5). When the attractor is unique and exponentially stable, the accuracyof the mean field or refined mean field approximation is uniform in time (Theo-rem 2(ii)). In this section, we study the case of a system that has a unique attractorbut that is not exponentially stable. We show that, in this case, the accuracy ofthe (refined) mean field approximation is no longer uniform in time.We consider a system with N objects in which each object is in state 0 or 1. Anobject in state 1 goes to state 0 with probability 1 and an object in state 0 goes to1 with probability αm , where α ∈ (0 ,
1) is a parameter. The transition matrix K is K ( m ) = (cid:20) − αm αm (cid:21) The function Φ : m (cid:55)→ mK ( m ) has a unique fixed point whose first component is µ ( ∞ ) = ( √ α − / (2 α ). This fixed point is exponentially stable if and onlyif α < . m ( t ) Refined mean-field approx.Mean-field approx.Exact value of E[M(t)] 0 5 10 15 20 25 30Time0.6850.6900.6950.7000.705 m ( t ) Refined mean-field approx.Mean-field approx.Exact value of E[M(t)] (a) N = 10 (b) N = 30 Figure 9.
Exponentially stable case ( α = 0 . m ( t ) Refined mean-field approx.Mean-field approx.Exact value of E[M(t)] 0 5 10 15 20 25 30Time0.500.550.600.650.700.750.80 m ( t ) Refined mean-field approx.Mean-field approx.Exact value of E[M(t)] (a) N = 10 (b) N = 30 Figure 10.
Non-exponentially stable case ( α = 0 . Transient regime and accuracy for large t . In Figure 9 and Figure 10, weplot the first component of the mean field µ ( t ) and refined mean field approximation µ ( t )+ V ( t ) /N as well an exact value of E [ M ( t )] for N = 10 and N = 30. The initialvalue is m = (0 . , . E [ M ( t )] was computed by a numericalmethod that uses the fact that the system with N objects can be described by aMarkov chain with N + 1 states. REFINED MEAN FIELD APPROXIMATION OF SYNCHRONOUS DISCRETE-TIME POPULATION MODELS25
These figures show that the refined approximation always improves the accuracy compared to the classical mean field approximation for small values of t, both for α = 0.6 and α = 0.75. The situation for large values of t is quite different. On the one hand, when the fixed point is exponentially stable (α = 0.6, Figure 9), the refined approximation is very accurate for all values of t. On the other hand, when the fixed point is not exponentially stable (α = 0.75, Figure 10), the refined approximation seems to be unstable and is not a good approximation of E[M(t)] for values of t that are too large compared to N (for instance t > 12 for N = 30, and even smaller values of t for N = 10).

7.2. Steady-state convergence.
To explore how the non-exponentially stable case affects the accuracy of mean field approximation, we now study in more detail the steady-state convergence when α = 0.75. It is known (see for example [11, Corollary 14]) that when the dynamical system $\mu(t+1) = \Phi(\mu(t))$ has a unique attractor µ(∞), the steady-state expectation $E[M^{(N)}]$ converges to µ(∞) as N goes to infinity. Theorem 2 shows that if, in addition, the attractor is exponentially stable, then $E[M^{(N)}] \approx \mu(\infty) + V/N$.

In Figure 11, we show that the latter no longer holds when the mean field system has a unique attractor that is not exponentially stable. We consider the same model with α = 0.75, for which µ(∞) = 2/3. We plot in Figure 11 the quantity $\sqrt{N}\,(E[M^{(N)}] - \mu(\infty))$, where we computed $E[M^{(N)}]$ by inverting the transition matrix of the system of size N. This figure shows that $\sqrt{N}\,(E[M^{(N)}] - \mu(\infty))$ does not converge to 0 as N goes to infinity but seems to converge to approximately −0.0975, which suggests that
$$E\left[M^{(N)}\right] \approx \mu(\infty) - \frac{0.0975}{\sqrt{N}} + \frac{0.14}{N}. \qquad (9)$$

Figure 11. Non-exponentially stable case (α = 0.75): convergence of $E[M^{(N)}]$ to µ(∞); the plotted quantity $\sqrt{N}\,(E[M^{(N)}] - \mu(\infty))$ is compared with the fitted curve $-0.0975 + 0.14/\sqrt{N}$.
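The following sketch shows one way to carry out this computation, reusing the hypothetical `transition_matrix` helper from the sketch in Section 7.1. The stationary distribution solves πP = π together with Σ_k π_k = 1, which is the linear system behind the "inverting the transition matrix" step; the constants of Equation (9) can then be recovered by a least-squares fit. The range of N below is an arbitrary choice mimicking the x-axis of Figure 11.

```python
import numpy as np
from scipy.optimize import curve_fit

def steady_state_mean(N, alpha):
    """Steady-state E[M^(N)]: solve pi P = pi together with sum(pi) = 1."""
    P = transition_matrix(N, alpha)  # helper from the Section 7.1 sketch
    A = np.vstack([P.T - np.eye(N + 1), np.ones((1, N + 1))])
    b = np.zeros(N + 2)
    b[-1] = 1.0
    pi = np.linalg.lstsq(A, b, rcond=None)[0]  # stationary distribution
    return pi @ (np.arange(N + 1) / N)

alpha, mu_inf = 0.75, 2 / 3
Ns = np.arange(25, 200, 25)
ys = [np.sqrt(N) * (steady_state_mean(N, alpha) - mu_inf) for N in Ns]

# Fit sqrt(N) * (E[M^(N)] - mu(inf)) with a curve of the form a + b/sqrt(N).
(a, b), _ = curve_fit(lambda n, a, b: a + b / np.sqrt(n), Ns, ys)
print(a, b)  # per Equation (9), values close to -0.0975 and 0.14 are expected
```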
Note that the constants −0.0975 and 0.14 were obtained by a purely numerical method that consists in finding the best curve of the form $a + b/\sqrt{N}$ that fits $\sqrt{N}\,(E[M^{(N)}] - \mu(\infty))$. For now, we do not know whether there exists a systematic way to obtain these values for another model that also has a non-exponentially stable equilibrium point. We leave this question for future work.

We remark that the convergence of $E[M^{(N)}]$ to µ(∞) observed in Equation (9) is in $O(1/\sqrt{N})$ and not in $O(1/N)$. This model satisfies all the assumptions of Theorem 2 but one: the attractor µ(∞) is not exponentially stable. This suggests that having an exponentially stable attractor is needed to obtain a convergence in 1/N.

8. Conclusion
In this paper we studied population models composed of (clock-)synchronous objects. A classical method to study such systems is to consider the mean field approximation. By studying the accuracy of this deterministic approximation, we developed a new approximation, which we call the refined mean field approximation. We illustrated on a few examples that this approximation can greatly improve the accuracy of the classical mean field limit, also for systems of relatively small size (10–20 objects). Yet, this refined approximation has some limitations when the deterministic approximation has multiple basins of attraction or has a unique attractor that is not exponentially stable.
The proposed refined approximation is given by a set of linear equations whose size scales as the square of the dimension of the model (but does not depend on the system size). For now, we have limited our study to relatively small models, for which the Jacobian and Hessian can be computed in closed form. We are currently investigating ways to make this computation automatic, which will allow us to study large-scale examples.

References

[1] H. Andersson and T. Britton. Stochastic Epidemic Models and Their Statistical Analysis. Springer-Verlag, 2000.
[2] Michel Benaïm and Jean-Yves Le Boudec. A class of mean field interaction models for computer and communication systems. Perform. Eval., 65(11-12):823–838, 2008.
[3] Luca Bortolussi and Richard A. Hayden. Bounds on the deviation of discrete-time Markov chains from their mean-field model. Perform. Eval., 70(10):736–749, 2013.
[4] Luca Bortolussi, Jane Hillston, Diego Latella, and Mieke Massink. Continuous approximation of collective system behaviour: A tutorial. Perform. Eval., 70(5):317–349, 2013.
[5] Jean-Yves Le Boudec, David D. McDonald, and Jochen Mundinger. A generic mean field convergence result for systems of interacting objects. In Fourth International Conference on the Quantitative Evaluation of Systems (QEST 2007), 17-19 September 2007, Edinburgh, Scotland, UK, pages 3–18. IEEE Computer Society, 2007.
[6] Anton Braverman, J. G. Dai, and Jiekun Feng. Stein's method for steady-state diffusion approximations: An introduction through the Erlang-A and Erlang-C models. Stochastic Systems, 6(2):301–366, 2016.
[7] Anton Braverman and Jim Dai. Stein's method for steady-state diffusion approximations of M/Ph/n+M systems. The Annals of Applied Probability, 27(1):550–581, 2017.
[8] Marco Antonio Montes de Oca, Eliseo Ferrante, Alexander Scheidler, Carlo Pinciroli, Mauro Birattari, and Marco Dorigo. Majority-rule opinion dynamics with differential latency: a mechanism for self-organized collective decision-making. Swarm Intelligence, 5(3-4):305–327, 2011.
[9] Nicolas Gast. Expected values estimated via mean-field approximation are 1/N-accurate: Extended abstract. In Bruce E. Hajek, Sewoong Oh, Augustin Chaintreau, Leana Golubchik, and Zhi-Li Zhang, editors, Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, Urbana-Champaign, IL, USA, June 05-09, 2017, page 50. ACM, 2017.
[10] Nicolas Gast and Bruno Gaujal. A mean field model of work stealing in large-scale systems. SIGMETRICS Perform. Eval. Rev., 38(1):13–24, June 2010.
[11] Nicolas Gast and Bruno Gaujal. A mean field approach for optimization in discrete time. Discrete Event Dynamic Systems, 21(1):63–101, 2011.
[12] Nicolas Gast and Bruno Gaujal. Markov chains with discontinuous drifts have differential inclusion limits. Perform. Eval., 69(12):623–642, 2012.
[13] Nicolas Gast and Benny Van Houdt. A refined mean field approximation. Proc. ACM Meas. Anal. Comput. Syst., 1(2):33:1–33:28, December 2017.
[14] V. N. Kolokoltsov, J. Li, and W. Yang. Mean field games and nonlinear Markov processes. ArXiv e-prints, December 2011.
[15] Thomas G. Kurtz. Solutions of ordinary differential equations as limits of pure jump Markov processes. Journal of Applied Probability, 7:49–58, 1970.
[16] Diego Latella, Michele Loreti, and Mieke Massink. On-the-fly PCTL fast mean-field approximated model-checking for self-organising coordination. Sci. Comput. Program., 110:23–50, 2015.
[17] L. Massoulié and M. Vojnović. Coupon replication systems. SIGMETRICS Perform. Eval. Rev., 33(1):2–13, June 2005.
[18] Wouter Minnebo and Benny Van Houdt. A fair comparison of pull and push strategies in large distributed networks. IEEE/ACM Trans. Netw., 22(3):996–1006, 2014.
[19] Michael Mitzenmacher. The power of two choices in randomized load balancing. IEEE Trans. Parallel Distrib. Syst., 12(10):1094–1104, 2001.
[20] Alexander Scheidler. Dynamics of majority rule with differential latencies. Phys. Rev. E, 83:031116-1–031116-4, March 2011.
[21] Charles Stein. Approximate computation of expectations. Lecture Notes–Monograph Series, 7:i–164, 1986.
[22] Peerapol Tinnakornsrisuphap and Armand M. Makowski. Limit behavior of ECN/RED gateways under a large number of TCP flows. In Proceedings IEEE INFOCOM 2003, The 22nd Annual Joint Conference of the IEEE Computer and Communications Societies, San Francisco, CA, USA, March 30 - April 3, 2003, pages 873–883. IEEE, 2003.
[23] John N. Tsitsiklis and Kuang Xu. On the power of (even a little) centralization in distributed processing. In Arif Merchant, Kimberly Keeton, and Dan Rubenstein, editors, SIGMETRICS 2011, Proceedings of the 2011 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, San Jose, CA, USA, 07-11 June 2011 (co-located with FCRC 2011), pages 161–172. ACM, 2011.
[24] Benny Van Houdt. A mean field model for a class of garbage collection algorithms in flash-based solid state drives. In Mor Harchol-Balter, John R. Douceur, and Jun Xu, editors, ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS '13, Pittsburgh, PA, USA, June 17-21, 2013, pages 191–202. ACM, 2013.
[25] Nikita Dmitrievna Vvedenskaya, Roland L'vovich Dobrushin, and Fridrikh Izrailevich Karpelevich. Queueing system with selection of the shortest of two queues: An asymptotic approach. Problemy Peredachi Informatsii, 32(1):20–34, 1996.
[26] D. J. Wilkinson. Stochastic Modelling for Systems Biology. Chapman & Hall, 2006.
[27] Lei Ying. On the approximation error of mean-field models. In Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Science, SIGMETRICS '16, pages 285–297, New York, NY, USA, 2016. ACM.
[28] Lei Ying. Stein's method for mean field approximations in light and heavy traffic regimes.