Rare event simulation for stochastic dynamics in continuous time
Letizia Angeli, Stefan Grosskinsky, Adam M. Johansen, Andrea Pizzoferrato
June 12, 2019
Abstract
Large deviations for additive path functionals of stochastic dynamics and related numerical approaches have attracted significant recent research interest. We focus on the question of convergence properties for cloning algorithms in continuous time, and establish connections to the literature of particle filters and sequential Monte Carlo methods. This enables us to derive rigorous convergence bounds for cloning algorithms which we report in this paper, with details of proofs given in a further publication. The tilted generator characterizing the large deviation rate function can be associated to non-linear processes which give rise to several representations of the dynamics and additional freedom for associated numerical approximations. We discuss these choices in detail, and combine insights from the filtering literature and cloning algorithms to compare different approaches and improve efficiency.
1 Introduction
Large deviation simulation techniques based on classical ideas of evolutionary algorithms [1, 2] have been proposed under the name of 'cloning algorithms' in [3] for discrete and in [4] for continuous time processes, in order to study rare events of dynamic observables of interacting lattice gases. This approach has subsequently been applied in a wide variety of contexts (see e.g. [5, 6, 7, 8] and references therein), and more recently, the convergence properties of the algorithm have become a subject of interest. Analytical approaches so far are based on a branching process interpretation of the algorithm in discrete time [9], with limited and mostly numerical results in continuous time [10]. Systematic errors arise from the correlation structure of the cloning ensemble which can be large in practice, and several variants of the approach have been proposed to address those, including e.g. a multicanonical feedback control [7], adaptive sampling methods [11] or systematic resampling [12]. A recent survey of these issues and different variants of cloning algorithms in discrete and continuous time can be found in [13], Section III.

In this paper we provide a novel perspective on the underlying structure of the cloning algorithm, which is in fact well established in the statistics and applied probability literature on Feynman-Kac models and particle filters [14, 15, 16]. The framework we develop here can be used to generalize rigorous convergence results in [17] to the setting of continuous-time cloning algorithms as introduced in [4]. Full mathematical details of this work are published in [18], and here we focus on describing the underlying approach and report the main convergence results. A second motivation is to use different McKean interpretations of Feynman-Kac semigroups (see Section 2.2) to highlight several degrees of freedom in the design of cloning-type algorithms that can be used to improve performance.
We illustrate this with the example of current large deviations for the inclusion process (originally introduced in [19]), aspects of which have previously been studied [20]. Current fluctuations in stochastic lattice gases have attracted significant recent research interest (see e.g. [21, 22, 23] and references therein), and are one of the main application areas of cloning algorithms which are particularly challenging. In contrast to previous work in the context of cloning algorithms [9, 10], our mathematical approach does not require a time discretization and works in a very general setting of a jump Markov process on a compact state space. This covers in particular any finite state Markov chain or stochastic lattice gas on a finite lattice.

The paper is organized as follows. In Section 2 we introduce notation, the Feynman-Kac semigroup and several representations of the associated non-linear process. In Section 3 we describe different particle approximations including the cloning algorithm, and summarize results published in [18] on convergence properties of estimators based on the latter. In Section 4 we describe a modification of the cloning algorithm for a particular class of stochastic lattice gases and apply it to the inclusion process as an example.

We consider a continuous-time Markov jump process $(X(t) : t \geq 0)$ on a compact state space $E$. To fix ideas we can think of a finite state Markov chain, such as a stochastic lattice gas on a finite lattice $\Lambda$ with a fixed number of particles $M$. Here $E$ is of the form $S^\Lambda$ with a finite set $S$ of local states (e.g. $S = \{0, 1\}$ or $\{0, \dots, M\}$), but continuous settings with compact $E \subset \mathbb{R}^d$ for any $d \geq 1$ are also included. One can in principle also generalize to separable and locally compact state spaces, including countable Markov chains and lattice gases on finite lattices with open boundaries.
But this would require more effort and complicate not only the proof but also the presentation of the main results for technical reasons which we want to avoid here (see [18] for a more detailed discussion).

Jump rates are given by the kernel $W(x, dy)$, such that for all $x \in E$ and measurable subsets $A \subset E$
\[
\mathbb{P}\big[X(t + \Delta t) \in A \,\big|\, X(t) = x\big] = \Delta t \int_A W(x, dy) + o(\Delta t) \quad\text{as } \Delta t \to 0, \tag{2.1}
\]
where $o(\Delta t)/\Delta t \to 0$. We use the standard notation $\mathbb{P}$ and $\mathbb{E}$ for the distribution and the corresponding expectation on the usual path space for jump processes
\[
\Omega = \big\{\omega : [0, \infty) \to E \text{ right continuous with left limits}\big\}.
\]
If we want to stress a particular initial condition $x \in E$ of the process we write $\mathbb{P}_x$ and $\mathbb{E}_x$. The process can be characterized by the infinitesimal generator
\[
\mathcal{L} f(x) = \int_E W(x, dy)\,[f(y) - f(x)], \quad \forall f \in C_b(E),\ x \in E, \tag{2.2}
\]
acting on all continuous bounded functions $f \in C_b(E)$ on the state space. The adjoint $\mathcal{L}^\dagger$ of this operator acts on probability distributions $\mu$ on $E$, and determines their time evolution via
\[
\frac{d}{dt}\mu_t(dy) = \int_{x \in E} \mu_t(dx)\, W(x, dy) - \int_{x \in E} \mu_t(dy)\, W(y, dx), \tag{2.3}
\]
where $\mathbb{P}[X_t \in A] = \int_A \mu_t(dy)$ for any regular $A \subset E$ characterizes the distribution of the process at time $t \geq 0$. In case of countable $E$, (2.3) is simply the usual master equation of the process for $\mu_t(y) = \mathbb{P}[X_t = y]$, but we focus our presentation on the equivalent description via the generator (2.2), which leads to a more compact notation and applies in the general setting. As a technical assumption we require that the total exit rate of the process is uniformly bounded,
\[
w(x) := \int_E W(x, dy) \leq \bar{w} < \infty \quad\text{for all } x \in E. \tag{2.4}
\]
We are interested in the large deviations of additive path space observables $A_T : \Omega \to \mathbb{R}$ of the general form
\[
A_T(\omega) := \sum_{\substack{t \leq T \\ \omega(t-) \neq \omega(t)}} g\big(\omega(t-), \omega(t)\big) + \int_0^T h\big(\omega(t)\big)\, dt, \tag{2.5}
\]
where $g \in C_b(E \times E)$ and $h \in C_b(E)$. Note that $A_T$ is well defined since the bound on $w(x)$ implies that the sum in the first term almost surely contains only finitely many terms for any $T > 0$. The above functional, which recently appeared in this form in [24], assigns a weight via the function $g$ to jumps of the process, as well as to the local time via the function $h$. Dynamics conditioned on such a functional have been studied in many contexts [5], including driven diffusions on periodic continuous spaces $E$ [25].

As mentioned before, the simplest examples covered by our setting are Markov chains with finite state space $E$. This includes stochastic particle systems on a finite lattice with periodic or closed boundary conditions such as zero-range or inclusion processes [23, 20, 26], and also processes with open boundaries and bounded local state space such as the exclusion process [3]. Choosing $g$ appropriately and $h \equiv 0$, the functional $A_T$ can, for example, measure the empirical particle current across a bond of the lattice or within the whole system up to time $T$.

We assume that $A_T$ admits a large deviation rate function, which is a lower semi-continuous function $I : \mathbb{R} \to [0, \infty]$ such that
\[
\lim_{T \to \infty} -\frac{1}{T} \log \mathbb{P}[A_T/T \in U] = \inf_{a \in U} I(a) \tag{2.6}
\]
for all regular intervals $U \subset \mathbb{R}$ (see e.g. [27, 28] for a more general discussion). Based on the graphical construction of the jump process and the contraction principle, existence and convexity of $I$ can be established in a very general setting for countable state space (see e.g. [29] and references therein).
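For a finite state space, the process (2.1) and the observable (2.5) can be sampled directly with a standard Gillespie-type scheme. The following is a minimal sketch (all names and the example chain are ours, purely for illustration, not from the paper): a two-state chain with rates $W(0,1) = 1$, $W(1,0) = 2$, with $g$ counting the jumps $0 \to 1$ and $h \equiv 0$.

```python
import numpy as np

def sample_path(W, g, x0, T, rng):
    """Gillespie sampling of the jump process (2.1) on {0,...,d-1} up to time T,
    accumulating the additive observable A_T of (2.5) with h = 0."""
    t, x, A = 0.0, x0, 0.0
    times, states = [0.0], [x0]
    while True:
        w = W[x].sum()                        # total exit rate w(x), cf. (2.4)
        t += rng.exponential(1.0 / w)         # exponential holding time
        if t > T:
            break
        y = rng.choice(len(W), p=W[x] / w)    # jump target distributed as W(x,.)/w(x)
        A += g[x, y]                          # weight of the jump, first term of (2.5)
        x = y
        times.append(t)
        states.append(x)
    return np.array(times), np.array(states), A

W = np.array([[0.0, 1.0], [2.0, 0.0]])        # jump rates
g = np.array([[0.0, 1.0], [0.0, 0.0]])        # count 0 -> 1 jumps
rng = np.random.default_rng(1)
T = 2000.0
times, states, A_T = sample_path(W, g, 0, T, rng)

# occupation time in state 0; the stationary measure of this chain is (2/3, 1/3)
durations = np.append(np.diff(times), T - times[-1])
frac0 = durations[states == 0].sum() / T
```

The empirical current $A_T/T$ concentrates on its typical value as $T \to \infty$; deviations from it are exponentially unlikely by (2.6), which is exactly what the algorithms discussed below are designed to sample.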
If $I$ is convex, it is characterized by the scaled cumulant generating function (SCGF)
\[
\lambda_k := \lim_{T \to \infty} \frac{1}{T} \log \mathbb{E}\big[e^{k A_T}\big] \tag{2.7}
\]
via the Legendre transform
\[
I(a) = \sup_{k \in \mathbb{R}} \big(ka - \lambda_k\big). \tag{2.8}
\]
It is well known (see e.g. [26, 24]) that $\lambda_k$ can be characterized as the principal eigenvalue of a tilted version of the generator (2.2),
\[
\mathcal{L}_k f(x) := \int_E W(x, dy)\big[e^{k g(x,y)} f(y) - f(x)\big] + k h(x) f(x) = \underbrace{\int_E W_k(x, dy)\big[f(y) - f(x)\big]}_{=:\, \widehat{\mathcal{L}}_k f(x)} + V_k(x) f(x) \tag{2.9}
\]
with modified rates for the jump part $\widehat{\mathcal{L}}_k$,
\[
W_k(x, dy) := W(x, dy)\, e^{k g(x,y)}, \tag{2.10}
\]
and potential for the diagonal part
\[
V_k(x) := \int_E W(x, dy)\big[e^{k g(x,y)} - 1\big] + k h(x). \tag{2.11}
\]
By (2.4) and the boundedness of $g$ and $h$, for each $k \in \mathbb{R}$ there exist constants such that we have uniform bounds
\[
\int_E W_k(x, dy) \leq \bar{w}_k \quad\text{and}\quad V_k(x) \leq \bar{v}_k \quad\text{for all } x \in E. \tag{2.12}
\]
Note also that $\mathcal{L}_0 = \mathcal{L}$, but for any $k \neq 0$ the diagonal part of the operator does not vanish and
\[
\mathcal{L}_k 1(x) = V_k(x) \quad\text{for all } x \in E. \tag{2.13}
\]
Still, it generates a Feynman-Kac semigroup (see e.g. [17, 24] for details), defined as
\[
P^k_t f(x) = \big(e^{t \mathcal{L}_k}\big) f(x) := \mathbb{E}_x\big[f(X(t))\, e^{k A_t}\big], \tag{2.14}
\]
which is the unique solution to the backward equation
\[
\frac{d}{dt} P^k_t f = P^k_t (\mathcal{L}_k f) \quad\text{with } P^k_0 f = f. \tag{2.15}
\]
Due to the diagonal part of $\mathcal{L}_k$ this does not conserve probability, i.e. for the constant function $f \equiv 1$ we get
\[
P^k_t 1(x) = \mathbb{E}_x\big[e^{k A_t}\big] \neq 1 \quad\text{for all } k \neq 0,\ t > 0. \tag{2.16}
\]
The associated logarithmic growth rate
\[
\lambda_k(t) := \frac{1}{t} \log P^k_t 1(x) \tag{2.17}
\]
provides a finite-time approximation of the SCGF $\lambda_k$, which depends on the initial condition $x \in E$. We require convergence of this approximation as $t \to \infty$ as an asymptotic stability property of the process, discussed in detail in [18] and references therein.
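On a finite state space the principal-eigenvalue characterization of $\lambda_k$ can be checked directly: the tilted generator (2.9) is just a matrix with off-diagonal entries $W_k(x,y)$ from (2.10) and diagonal entries $-w(x) + k h(x)$. A minimal sketch (function names and the example chain are ours, not from the paper):

```python
import numpy as np

def tilted_generator(W, g, h, k):
    """Matrix of L_k in (2.9): off-diagonal W_k(x,y) = W(x,y) e^{k g(x,y)},
    diagonal -w(x) + k h(x), acting on functions f as column vectors."""
    Lk = W * np.exp(k * g)                     # tilted rates (2.10)
    np.fill_diagonal(Lk, -W.sum(axis=1) + k * h)
    return Lk

# two-state chain: W(0,1) = 1, W(1,0) = 2, g counts 0 -> 1 jumps, h = 0
W = np.array([[0.0, 1.0], [2.0, 0.0]])
g = np.array([[0.0, 1.0], [0.0, 0.0]])
h = np.zeros(2)

def scgf(k):
    """Principal (largest real part) eigenvalue of L_k, i.e. the SCGF (2.7)."""
    return np.linalg.eigvals(tilted_generator(W, g, h, k)).real.max()

# k = 0 recovers the probability-conserving generator (2.2), so lambda_0 = 0;
# for this 2x2 example lambda_k solves (lam + 1)(lam + 2) = 2 e^k explicitly
lam_half = scgf(0.5)
```

Using the exact quadratic as a cross-check makes the eigenvalue routine easy to validate before moving to larger examples.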
Under exponential mixing assumptions, which are mild in our contexts of interest, it can be shown that for some constant $C > 0$
\[
\big|\lambda_k(t) - \lambda_k\big| \leq C/t \quad\text{as } t \to \infty. \tag{2.18}
\]
This is for example the case if $E$ is finite and the process necessarily has a spectral gap, as is the case for all finite state lattice gases mentioned earlier. Furthermore, exponential mixing implies that the modified finite-time approximation
\[
\lambda_k(at, t) := \frac{1}{(1-a)t} \log \frac{P^k_t 1(x)}{P^k_{at} 1(x)} \quad\text{with } a \in (0, 1), \tag{2.19}
\]
with a 'burn-in' time period of length $at$, significantly improves the convergence in (2.18) to
\[
\big|\lambda_k(at, t) - \lambda_k\big| \leq C \rho^{at}/t \quad\text{as } t \to \infty \tag{2.20}
\]
for some $\rho \in (0, 1)$. This is of course routinely used in Monte-Carlo sampling where systems are allowed to relax towards stationarity before measuring. These intrinsic properties of the process and the related finite-time errors in the estimation of $\lambda_k$ are not the main subject of this paper. In the following we simply assume asymptotic stability and (2.18), and focus on the efficient numerical estimation of $\lambda_k(t)$ for any given $t \geq 0$. $\lambda_k(at, t)$ can be treated completely analogously, which is discussed in more detail in [18].

As usual, for a given initial distribution $\nu^k_0$ on $E$ the semigroup (2.14) determines a measure $\nu^k_t$ at times $t \geq 0$ on $E$, which can be characterized weakly through integrals of bounded test functions $f \in C_b(E)$ as
\[
\nu^k_t(f) := \int_E f(x)\, \nu^k_t(dx) = \int_E P^k_t f(x)\, \nu^k_0(dx). \tag{2.21}
\]
Here and in the following we use the common short notation $\nu^k_t(f)$ for the integral of the function $f$ under the measure $\nu^k_t$ to simplify notation. Note that we can write (2.17) as
\[
\lambda_k(t) = \frac{1}{t} \log \nu^k_t(1), \tag{2.22}
\]
with a more general initial condition $\nu^k_0$.
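The effect of the burn-in period in (2.19) is easy to see numerically on a finite state space, where $P^k_t = e^{t \mathcal{L}_k}$ can be evaluated by diagonalization. A sketch (the two-state example and all names are ours, not from the paper):

```python
import numpy as np

# tilted generator L_k for a two-state chain: W(0,1) = 1, W(1,0) = 2,
# g counting 0 -> 1 jumps, h = 0, at tilt parameter k = 0.5
k = 0.5
Lk = np.array([[-1.0, np.exp(k)], [2.0, -2.0]])
lam_exact = np.linalg.eigvals(Lk).real.max()          # SCGF as principal eigenvalue

d, P = np.linalg.eig(Lk)                              # diagonalize once
Pinv = np.linalg.inv(P)

def semigroup_1(t, x=0):
    """(P_t^k 1)(x) = (e^{t L_k} 1)(x), cf. (2.14), via diagonalization."""
    return ((P @ np.diag(np.exp(t * d)) @ Pinv).real @ np.ones(2))[x]

def lam_t(t):
    """Finite-time approximation (2.17)."""
    return np.log(semigroup_1(t)) / t

def lam_burn(a, t):
    """Burn-in version (2.19) with burn-in period of length a*t."""
    return np.log(semigroup_1(t) / semigroup_1(a * t)) / ((1 - a) * t)

err10, err20 = abs(lam_t(10.0) - lam_exact), abs(lam_t(20.0) - lam_exact)
err_burn = abs(lam_burn(0.5, 20.0) - lam_exact)
```

Here `err10` and `err20` decay like $C/t$ as in (2.18), while `err_burn` is smaller by a factor of order $\rho^{at}$ as in (2.20), since the spectral gap of this example is large.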
But since $P^k_t$ does not conserve probability, $\nu^k_t$ is not a normalized probability measure and it is consequently impossible to sample from it. With (2.16), $\nu^k_t(1) > 0$ for all $t \geq 0$, and we can define normalized versions of the measures via
\[
\mu^k_t(f) := \nu^k_t(f)/\nu^k_t(1). \tag{2.23}
\]
Using (2.9), (2.13) and (2.15) one can derive the evolution equation
\[
\frac{d}{dt}\mu^k_t(f) = \frac{\frac{d}{dt}\nu^k_t(f)}{\nu^k_t(1)} = \frac{\nu^k_t(\mathcal{L}_k f)}{\nu^k_t(1)} - \frac{\nu^k_t(f)}{\nu^k_t(1)} \frac{\nu^k_t(\mathcal{L}_k 1)}{\nu^k_t(1)} = \mu^k_t(\widehat{\mathcal{L}}_k f) + \mu^k_t(V_k f) - \mu^k_t(f)\,\mu^k_t(V_k) \tag{2.24}
\]
with initial condition $\mu^k_0 = \nu^k_0$. It can be shown by similar direct computation of $\frac{d}{dt}\log \nu^k_t(1)$, using (2.13) and (2.22), that
\[
\lambda_k(t) = \frac{1}{t}\int_0^t \mu^k_s(V_k)\, ds. \tag{2.25}
\]
So the finite-time approximation of $\lambda_k$ is given by an ergodic average with respect to the distribution $\mu^k_t$, depending on the initial distribution $\mu^k_0$, with an obvious modification for (2.19). The asymptotic stability of the original process implies that $\mu^k_t \to \mu^k_\infty$ converges to a unique stationary distribution $\mu^k_\infty$ on $E$, so that the SCGF (2.7) can be written as the stationary expectation of the potential,
\[
\lambda_k = \mu^k_\infty(V_k).
\]
Due to the non-linear nature of (2.24), $\mu^k_\infty$ is characterized by stationarity as the solution of the non-linear equation
\[
\mu^k_\infty(\widehat{\mathcal{L}}_k f) = \mu^k_\infty(f)\,\mu^k_\infty(V_k) - \mu^k_\infty(V_k f) \quad\text{for all } f \in C_b(E).
\]
Usually $\mu^k_\infty$ cannot be evaluated explicitly, but from (2.24) it is possible to define a generic process $(X_k(t) : t \geq 0)$ with time-marginals $\mu^k_t$, and then use Monte Carlo sampling techniques. The first term of (2.24) already corresponds to a jump process with generator $\widehat{\mathcal{L}}_k$, and we have to rewrite the second non-linear part to be of the form of a generator.
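On a finite state space the non-linear evolution (2.24) can be integrated directly, which illustrates the convergence $\mu^k_t \to \mu^k_\infty$ and the identity $\lambda_k = \mu^k_\infty(V_k)$. Writing $\mu_t$ as a row vector, (2.24) reads $\dot{\mu}_t = \mu_t \mathcal{L}_k - \mu_t(V_k)\,\mu_t$. A sketch with a forward Euler step (the two-state example is ours, not from the paper):

```python
import numpy as np

# tilted generator L_k for a two-state chain (W(0,1) = 1, W(1,0) = 2,
# g counting 0 -> 1 jumps) at k = 0.5
k = 0.5
Lk = np.array([[-1.0, np.exp(k)], [2.0, -2.0]])
Vk = Lk.sum(axis=1)            # V_k(x) = (L_k 1)(x), cf. (2.13)

mu = np.array([0.5, 0.5])      # initial distribution mu_0
dt = 0.01
for _ in range(10000):         # integrate (2.24) up to t = 100, far past relaxation
    mu = mu + dt * (mu @ Lk - (mu @ Vk) * mu)

lam_est = mu @ Vk              # stationary expectation mu_inf(V_k)
lam_exact = np.linalg.eigvals(Lk).real.max()
```

Note that the Euler map preserves normalization exactly here, since $\frac{d}{dt}\mu_t(1) = \mu_t(V_k)\,(1 - \mu_t(1)) = 0$ when $\mu_0(1) = 1$, and its fixed points are exactly the stationary points of (2.24).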
There is some freedom at this stage, and we report three common choices from the applied probability literature on particle approximations [30, 17], one of which corresponds to the approach in [3, 4]; to the best of the authors' knowledge the other two have not been considered in the computational physics literature so far.

For every probability distribution $\mu$ on $E$ we can write
\[
\mu(V_k f) - \mu(f)\,\mu(V_k) = \mu\big(\mathcal{L}^-_{k,\mu,c} f + \mathcal{L}^+_{k,\mu,c} f\big), \tag{2.26}
\]
where
\[
\mathcal{L}^-_{k,\mu,c} f(x) := \big(V_k(x) - c\big)_- \int_E \big(f(y) - f(x)\big)\, \mu(dy) \tag{2.27}
\]
and
\[
\mathcal{L}^+_{k,\mu,c} f(x) := \int_E \big(V_k(y) - c\big)_+ \big(f(y) - f(x)\big)\, \mu(dy), \tag{2.28}
\]
using the standard notation $a_+ = \max\{0, a\}$ and $a_- = \max\{0, -a\}$ for positive and negative part of $a \in \mathbb{R}$. We have the freedom to introduce an arbitrary constant $c \in \mathbb{R}$, possibly depending also on the measure $\mu$ (but not the state $x \in E$), since the left-hand side of (2.26) is invariant under renormalization of the potential $V_k(x) \to V_k(x) - c$. The generators $\mathcal{L}^-_{k,\mu,c}$ and $\mathcal{L}^+_{k,\mu,c}$ describe jump processes on $E$ with rates depending on the probability measure $\mu$. $V_k(x)$ can be interpreted as a fitness potential for the process, and plays exactly that role in the particle approximation of this process based on population dynamics, which is presented in Section 3. Generic choices are:

• $c = 0$ is the default and simplest choice, but is usually not optimal as discussed in Section 4.

• $c = \mu^k_t(V_k)$ corresponding to the average potential: If the system in state $x$ is less fit than $c$ it jumps to state $y$ chosen from the distribution $\mu^k_t(dy)$ according to (2.27), and independently, the system jumps to states fitter than $c$ irrespective of its current state according to (2.28).
• $c = \sup_{x \in E} V_k(x)$ or $\inf_{x \in E} V_k(x)$, so that $\mathcal{L}^+_{k,\mu,c}(f)(x) \equiv 0$ or $\mathcal{L}^-_{k,\mu,c}(f)(x) \equiv 0$, respectively, and only one of the two processes has to be implemented in a simulation.

Another representation of the non-linear part in (2.24) is (see e.g. [31], Section 5.3.1)
\[
\mathcal{L}^V_{k,\mu} f(x) := \int_E \big(V_k(y) - V_k(x)\big)_+ \big(f(y) - f(x)\big)\, \mu(dy), \tag{2.29}
\]
which is particularly interesting for implementing efficient selection dynamics as discussed in Section 4. Here every jump from this part of the generator strictly increases the fitness of the process, which is a stronger version of the previous idea where the process on average increased its fitness above level $c$. The rate depends on departure state $x$ and target state $y$, which is in general computationally more expensive to implement than rates in (2.27) and (2.28), but can still be feasible due to simplifications in many concrete examples as demonstrated in Section 4. A further improvement of that idea is given by
\[
\mathcal{L}^V_{k,\mu} f(x) := \big(V_k(x) - \mu(V_k)\big)_- \int_E \frac{\big(V_k(y) - \mu(V_k)\big)_+}{\mu\big((V_k - \mu(V_k))_+\big)} \big(f(y) - f(x)\big)\, \mu(dy), \tag{2.30}
\]
which resembles a continuous-time version of selection processes which are known under the names of stochastic remainder sampling [32] or residual sampling [33] in discrete time. Here selection events change the process from states $x$ of less than average fitness $\mu(V_k)$ to states $y$ fitter than average, but we will see in Section 4 that this variant is harder to implement than (2.29) in our area of interest, and offers only limited extra gain on selection efficiency.

In summary, the evolution equation (2.24) for $\mu^k_t$ can be written as
\[
\frac{d}{dt}\mu^k_t(f) = \mu^k_t(\widehat{\mathcal{L}}_k f) + \mu^k_t(\mathcal{L}^V_{k,\mu^k_t} f), \tag{2.31}
\]
where the first choice with (2.27) and (2.28) is included by defining $\mathcal{L}^V_{k,\mu} = \mathcal{L}^-_{k,\mu,c} + \mathcal{L}^+_{k,\mu,c}$ in that case.
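That all these selection mechanisms represent the same non-linear term is easy to verify numerically on a finite state space: for any probability vector $\mu$, potential $V_k$ and test function $f$, the expressions $\mu(\mathcal{L}^- f + \mathcal{L}^+ f)$ from (2.27)-(2.28) (for any $c$), $\mu(\mathcal{L}^V f)$ from (2.29), and the variant (2.30) all equal $\mu(V_k f) - \mu(f)\mu(V_k)$. A sketch (all names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
mu = rng.random(n); mu /= mu.sum()        # probability vector on a 6-point space
V = rng.normal(size=n)                    # fitness potential V_k
f = rng.normal(size=n)                    # test function

pos = lambda a: np.maximum(a, 0.0)        # a_+ ; note a_- = pos(-a)
lhs = mu @ (V * f) - (mu @ f) * (mu @ V)  # non-linear term in (2.24)

def selection(rate):
    """mu( integral of rate(x,y) (f(y) - f(x)) mu(dy) ), for a pairwise rate."""
    return sum(mu[x] * mu[y] * rate(x, y) * (f[y] - f[x])
               for x in range(n) for y in range(n))

errs = []
for c in (0.0, mu @ V, V.max(), V.min()):                # choices of c listed above
    sel = selection(lambda x, y: pos(-(V[x] - c)) + pos(V[y] - c))   # (2.27) + (2.28)
    errs.append(abs(sel - lhs))

errs.append(abs(selection(lambda x, y: pos(V[y] - V[x])) - lhs))     # (2.29)

m, Z = mu @ V, mu @ pos(V - mu @ V)
errs.append(abs(selection(lambda x, y: pos(-(V[x] - m)) * pos(V[y] - m) / Z) - lhs))  # (2.30)
```

All six errors vanish up to floating-point precision, confirming the invariance of (2.26) under the choice of representation and of $c$.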
This defines a Markov process $(X_k(t) : t \geq 0)$ on the state space $E$ with generator
\[
\mathcal{L}_{k,\mu^k_t} f(x) := \widehat{\mathcal{L}}_k f(x) + \mathcal{L}^V_{k,\mu^k_t} f(x). \tag{2.32}
\]
The process is non-linear since the transition rates in the generator $\mathcal{L}_{k,\mu^k_t}$ depend on the distribution $\mu^k_t$ of the process at time $t$, and in particular the process is also time-inhomogeneous. While the generator is still a linear operator acting on test functions $f$, the adjoint $\mathcal{L}^\dagger_{k,\mu^k_t}$ is a non-linear operator acting on measures $\mu^k_t$, generating their time evolution via
\[
\frac{d}{dt}\mu^k_t(dy) = \mathcal{L}^\dagger_{k,\mu^k_t}\, \mu^k_t, \tag{2.33}
\]
which is equivalent to (2.31). This microscopic mass transport description consistent with the macroscopic description provided by the Feynman-Kac semigroup $P^k_t$ is also called a McKean representation [16, 31]. It is well known that particle approximations of different McKean representations can have very different properties. The first part is similar to the original dynamics with modified rates $W_k$ (2.10), and the second non-linear part depending on the distribution $\mu^k_t$ arises from normalizing the measures $\nu^k_t$. Note that $\mu^k_t$ and therefore the finite-time approximation $\lambda_k(t)$ in (2.25) are uniquely determined by (2.24), and thus independent of the different McKean representations, as are of course the limiting quantities $\mu^k_\infty$ and $\lambda_k$. Also, these interpretations do not make use of concepts from population dynamics such as branching, which will only come into play when using particle approximations of the measures $\mu^k_t$ as explained in the next section.

The rates of the non-linear process $(X_k(t) : t \geq 0)$ (2.32) depend on the distribution $\mu^k_t$, which is not known a priori in the cases in which we are interested. The natural framework to sample such non-linear processes approximately is a particle approximation, see e.g. [30]. Here an ensemble $\big(\mathbf{X}_k(t) : t \geq 0\big)$ of $N$ processes (also called particles or clones) $X^i_k(t)$, $i = 1, \dots, N$, is run in parallel on the state space $E^N$, and $\mu^k_t$ is approximated by the empirical distribution $\mu^N(\mathbf{X}_k(t))$ of the realizations, where for any $\mathbf{x} \in E^N$ we define
\[
\mu^N(\mathbf{x})(dy) := \frac{1}{N} \sum_{i=1}^N \delta_{x^i}(dy) \quad\text{as a distribution on } E. \tag{3.1}
\]
Since $\mu^N(\mathbf{X}_k(t))$ is fully determined by the state of the ensemble at time $t$, the particle approximation is a standard (linear) Markov process on $E^N$. This leads to an estimator for the SCGF using (2.25) given by
\[
\Lambda^N_k(t) := \frac{1}{t} \int_0^t \mu^N(\mathbf{X}_k(s))(V_k)\, ds = \frac{1}{t} \int_0^t \frac{1}{N} \sum_{i=1}^N V_k(X^i_k(s))\, ds, \tag{3.2}
\]
which is a random object depending on the realization of the particle approximation. The full dynamics can be set up in various different ways such that $\mu^N(\mathbf{X}_k(t)) \to \mu^k_t$ converges as $N \to \infty$ for any $t \geq 0$. A generic version, directly related to the above McKean representations, has been studied in the applied probability literature in great detail
[30, 17], providing quantitative control on error bounds for convergence. After describing this approach, we present a different approach known in the theoretical physics literature under the name of cloning algorithms [5, 13], which provides some computational advantages but lacks general rigorous error control so far [9, 10]. We will then set up a framework to identify common aspects of both approaches, which can be used to generalize existing convergence results to obtain rigorous error bounds for cloning algorithms as described in detail in [18], and to compare the computational efficiency of both approaches.
The most basic particle approximation is simply to run the McKean dynamics (2.32) in parallel on each of the particles, replacing the dependence on $\mu^k_t$ by $\mu^N(\mathbf{X}_k(t))$ in the jump rates. Mathematically, denoting by $\mathcal{L}^N_k$ the generator of the full $N$-particle process $\big(\mathbf{X}_k(t) : t \geq 0\big)$ acting on functions $F : E^N \to \mathbb{R}$, this corresponds to
\[
\mathcal{L}^N_k F(\mathbf{x}) := \sum_{i=1}^N \mathcal{L}^i_{k,\mu^N(\mathbf{x})} F(\mathbf{x}). \tag{3.3}
\]
Here $\mathcal{L}^i_{k,\mu^N(\mathbf{x})}$ is equivalent to (2.32) acting on particle $i$ only, i.e. on the function $x^i \mapsto F(\mathbf{x})$ while $x^j$, $j \neq i$, remain fixed. The linear part $\widehat{\mathcal{L}}_k$ of (2.32) does not depend on $\mu^k_t$ and follows the original dynamics for each particle, referred to as 'mutation' events in the standard population dynamics interpretation. In this context, the non-linear parts (2.27) and (2.28) can be interpreted as 'selection' events leading to mean-field interactions between the particles. Using the definition (3.1) of the empirical measures, we can write for the part (2.27)
\[
\mathcal{L}^{-,i}_{k,\mu^N(\mathbf{x}),c} F(\mathbf{x}) = \big(V_k(x^i) - c\big)_- \int_E \big(F(\mathbf{x}^{i,y}) - F(\mathbf{x})\big)\, \mu^N(\mathbf{x})(dy) = \big(V_k(x^i) - c\big)_- \frac{1}{N} \sum_{j=1}^N \big(F(\mathbf{x}^{i,x^j}) - F(\mathbf{x})\big) \tag{3.4}
\]
with notation $\mathbf{x}^{i,y} = (x^1, \dots, x^{i-1}, y, x^{i+1}, \dots, x^N)$. So with a rate depending on the fitness of particle $i$, it is 'killed' and replaced by a copy of a particle $j$ uniformly chosen from all particles. Analogously, we have for (2.28)
\[
\mathcal{L}^{+,i}_{k,\mu^N(\mathbf{x}),c} F(\mathbf{x}) = \frac{1}{N} \sum_{j=1}^N \big(V_k(x^j) - c\big)_+ \big(F(\mathbf{x}^{i,x^j}) - F(\mathbf{x})\big), \tag{3.5}
\]
which leads to the same transition $\mathbf{x} \to \mathbf{x}^{i,x^j}$, but with a different interpretation. Each particle $j$ in the system reproduces independently with a rate depending on its fitness (cloning event), and its offspring replaces a uniformly chosen particle, which is equal to $i$ with probability $1/N$.
The different nature of killing and cloning events becomes clearer when we write out the full generator (3.3) and switch summation indices for the cloning part (3.5) in the second line,
\[
\mathcal{L}^N_k F(\mathbf{x}) = \sum_{i=1}^N \int_E W_k(x^i, dy)\, \big(F(\mathbf{x}^{i,y}) - F(\mathbf{x})\big) + \sum_{i=1}^N \big(V_k(x^i) - c\big)_+ \frac{1}{N} \sum_{j=1}^N \big(F(\mathbf{x}^{j,x^i}) - F(\mathbf{x})\big) + \sum_{i=1}^N \big(V_k(x^i) - c\big)_- \frac{1}{N} \sum_{j=1}^N \big(F(\mathbf{x}^{i,x^j}) - F(\mathbf{x})\big). \tag{3.6}
\]
Analogously, the McKean representations (2.29) and (2.30) lead to basic $N$-particle systems with generators
\[
\mathcal{L}^N_k F(\mathbf{x}) = \sum_{i=1}^N \int_E W_k(x^i, dy)\, \big(F(\mathbf{x}^{i,y}) - F(\mathbf{x})\big) + \frac{1}{N} \sum_{i,j=1}^N \big(V_k(x^j) - V_k(x^i)\big)_+ \big(F(\mathbf{x}^{i,x^j}) - F(\mathbf{x})\big) \tag{3.7}
\]
and
\[
\mathcal{L}^N_k F(\mathbf{x}) = \sum_{i=1}^N \int_E W_k(x^i, dy)\, \big(F(\mathbf{x}^{i,y}) - F(\mathbf{x})\big) + \frac{1}{N} \sum_{i,j=1}^N \frac{\big(V_k(x^i) - \mu(V_k)\big)_- \big(V_k(x^j) - \mu(V_k)\big)_+}{\mu\big((V_k - \mu(V_k))_+\big)} \big(F(\mathbf{x}^{i,x^j}) - F(\mathbf{x})\big), \tag{3.8}
\]
with $\mu = \mu^N(\mathbf{x})$. Here a particle $i$ is replaced by a particle $j$ with higher fitness, combining killing and cloning into a single event. In the case of (3.8), particle $i$ is furthermore less fit and $j$ is fitter than average. Note that these approximating systems can be seen as particle systems with mean-field or averaged pairwise interaction given by the selection dynamics.

Following established results in [30, 17, 31], the (random) quantity $\Lambda^N_k(t)$ is an asymptotically unbiased estimator of $\lambda_k(t)$ with a systematic error bounded by
\[
\sup_{t \geq 0} \Big| \mathbb{E}^N\big[\Lambda^N_k(t)\big] - \lambda_k(t) \Big| \leq \frac{C}{N} \quad\text{for all } N \geq 1, \tag{3.9}
\]
along with several rigorous convergence results.
These include an estimate on the random error in $L^p$ norm for any $p \geq 1$,
\[
\sup_{t \geq 0} \mathbb{E}^N\big[|\Lambda^N_k(t) - \lambda_k(t)|^p\big]^{1/p} \leq \frac{C_p}{\sqrt{N}} \quad\text{for all } N \geq 1, \tag{3.10}
\]
as well as other formulations including almost sure convergence. Note that these estimates are uniform in $t \geq 0$, so are not affected by the choice of simulation time. The use of a finite simulation time, $t$, leads to an additional systematic error to the estimate of the SCGF $\lambda_k$, of order $1/t$ as in (2.18) or $\rho^{at}/t$ as in (2.20). The bound (3.10) for $p = 2$ implies for the variance
\[
\sup_{t \geq 0} \mathbb{E}^N\Big[\big|\Lambda^N_k(t) - \mathbb{E}^N\big[\Lambda^N_k(t)\big]\big|^2\Big] \leq \frac{C}{N} \quad\text{for all } N \geq 1, \tag{3.11}
\]
since we have $\mathrm{Var}(Y) = \inf_{a \in \mathbb{R}} \mathbb{E}\big[(Y - a)^2\big]$ for any real-valued random variable $Y$. Therefore, error bars based on standard deviations are of the usual Monte Carlo order of $1/\sqrt{N}$, and the random error dominates the systematic bias (3.9) for $N$ large enough. Further remarks on possible unbiased estimators can be found at the end of the next subsection.

Following the standard martingale characterization of Feller-type Markov processes (see e.g. [34], Chapter 3), we know that for every bounded, continuous $F \in C_b(E^N)$
\[
\mathcal{M}^N_F(t) := F\big(\mathbf{X}_k(t)\big) - F\big(\mathbf{X}_k(0)\big) - \int_0^t \mathcal{L}^N_k F\big(\mathbf{X}_k(s)\big)\, ds \tag{3.12}
\]
is a martingale on $\mathbb{R}$ with (predictable) quadratic variation
\[
\langle \mathcal{M}^N_F \rangle(t) = \int_0^t \Gamma^N_k F\big(\mathbf{X}_k(s)\big)\, ds, \tag{3.13}
\]
where the associated carré du champ operator $\Gamma^N_k$ is given by
\[
\Gamma^N_k F(\mathbf{x}) := \mathcal{L}^N_k F^2(\mathbf{x}) - 2 F(\mathbf{x})\, \mathcal{L}^N_k F(\mathbf{x}). \tag{3.14}
\]
In analogy to the decomposition of a random variable into its mean and a centred fluctuating part, the martingale (3.12) describes the fluctuations of the process $t \mapsto F\big(\mathbf{X}_k(t)\big)$. The strength of the noise depends on time and is given by the increasing process (3.13), whose time evolution is generated by the carré du champ operator (3.14).
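For a concrete feel for (3.14): for a pure jump generator on a finite space, expanding the definition gives $\Gamma f(x) = \int_E W(x, dy)\,(f(y) - f(x))^2 \geq 0$, i.e. the carré du champ measures the local fluctuation strength through the squared jump sizes. A sketch verifying this identity (the example and names are ours):

```python
import numpy as np

def carre_du_champ(L, f):
    """Gamma f = L(f^2) - 2 f L f, cf. (3.14), for a generator matrix L."""
    return L @ (f * f) - 2.0 * f * (L @ f)

W = np.array([[0.0, 1.0, 0.5],
              [2.0, 0.0, 1.0],
              [0.3, 0.7, 0.0]])              # jump rates on a 3-point space
L = W - np.diag(W.sum(axis=1))               # generator matrix, cf. (2.2)
f = np.array([0.2, -1.0, 0.7])

gamma = carre_du_champ(L, f)
# expanding the definition: Gamma f(x) = sum_y W(x,y) (f(y) - f(x))^2 >= 0
direct = np.array([sum(W[x, y] * (f[y] - f[x]) ** 2 for y in range(3))
                   for x in range(3)])
```

The quadratic nature of $\Gamma$ is what distinguishes it from the (linear) generator, and why it picks up the correlation structure of averaged observables below.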
In contrast to the generator $\mathcal{L}^N_k$, this is a quadratic (non-linear) operator and it is the main tool for studying the fluctuations of a process. Elementary computations for approximations (3.6), (3.7) and (3.8) show that for marginal test functions $F(\mathbf{x}) = f(x^i)$ depending only on a single particle, we have
\[
\mathcal{L}^N_k F(\mathbf{x}) = \mathcal{L}_{k,\mu^N(\mathbf{x})} f(x^i) \quad\text{and}\quad \Gamma^N_k F(\mathbf{x}) = \Gamma_{k,\mu^N(\mathbf{x})} f(x^i).
\]
So generator and carré du champ both coincide with the corresponding operators $\mathcal{L}_{k,\mu^N(\mathbf{x})}$ and $\Gamma_{k,\mu^N(\mathbf{x})}$ for the McKean dynamics (2.32). This means that for large enough $N$ and $\mu^N\big(\mathbf{X}_k(t)\big)$ close to $\mu^k_t$, each marginal process $t \mapsto X^i_k(t)$ has essentially the same distribution as the corresponding McKean process $t \mapsto X_k(t)$. Note that due to selection events these (auxiliary) dynamics do not coincide with the original process conditioned on a large deviation event, and they are also not unique since there are various choices for McKean representations of Feynman-Kac semigroups, as discussed earlier. Trajectories in a particle approximation always correspond to the trajectories of the particular McKean interpretation they are based on, which is usually (2.26) in the context of cloning algorithms. Due to asymptotic stability the particle approximation converges as $t \to \infty$ for fixed $N$ to a unique stationary distribution $\mu^{N,k}_\infty$, and the single-particle marginals of this distribution converge to $\mu^k_\infty$ as $N \to \infty$. While the marginal processes for a given particle approximation are identically distributed, they are not independent, and $\mu^{N,k}_\infty$ exhibits non-trivial correlations between particles resulting from selection events, which we discuss again in Section 4.

Now consider averaged observables of the form
\[
F(\mathbf{x}) = \frac{1}{N} \sum_{i=1}^N f(x^i) = \mu^N(\mathbf{x})(f)
\]
as they appear in the eigenvalue estimator (3.2).
Since the generator $\mathcal{L}^N_k$ is a linear operator in $F$, we have the same identity as above for the generator,
\[
\mathcal{L}^N_k\, \mu^N(\mathbf{x})(f) = \mu^N(\mathbf{x})\big(\mathcal{L}_{k,\mu^N(\mathbf{x})} f\big). \tag{3.15}
\]
The carré du champ, on the other hand, is non-linear in $F$ and the dependence between particles is captured by this operator. Since for all approximations (3.6), (3.7) and (3.8) in every selection event only a single particle is affected, another elementary, slightly more cumbersome computation shows (see [18] for details)
\[
\Gamma^N_k\, \mu^N(\mathbf{x})(f) = \frac{1}{N}\, \mu^N(\mathbf{x})\big(\Gamma_{k,\mu^N(\mathbf{x})} f\big). \tag{3.16}
\]
The factor $1/N$ results from a self-averaging property of the mean-field interaction through selection dynamics, which is expected from results on other mean-field particle systems (see e.g. [35, 36] and references therein), and is fully analogous to the central-limit type scaling of the empirical variance for the sum of $N$ independent random variables. While this scaling remains the same for more general particle approximations with more than one particle being affected by selection events, the simple identity (3.16) does not hold exactly for any $N \geq 1$, as we see in the next subsection.

Recall that the estimator (3.2) for the principal eigenvalue (2.7) is given by an ergodic integral of the average observable $F(\mathbf{x}) = \mu^N(\mathbf{x})(V_k)$. With (2.12), $V_k \in C_b(E)$ and rates are bounded, so $\mu^N(\mathbf{x})\big(\Gamma_{k,\mu^N(\mathbf{x})} V_k\big)$ is also bounded and the carré du champ (3.16) vanishes as $N \to \infty$. Therefore the martingale $\mathcal{M}^N_F(t)$ also vanishes for all $t \geq 0$ (in $L^p$-sense for any $p \geq 1$, following with the Burkholder-Davis-Gundy inequality, see e.g. [37], Section 11), leading to a convergence of the measures $\mu^N(\mathbf{X}_k(t)) \to \mu^k_t$ and also of finite-time approximations $\Lambda^N_k(t) \to \lambda_k(t)$ as reported in the previous subsection. Due to the time-normalization in (3.2) and the assumed ergodicity, corresponding error bounds hold uniformly in $t \geq 0$.
In summary, bounds on the carré du champ are the main ingredient for the proof of convergence results, as explained in detail in [18] and references therein. All above properties up to and including (3.15) are generic requirements for any particle approximation. These particle approximations can differ in their correlation structures, and this freedom can be used to construct numerically more efficient particle approximations as discussed in the next subsection. To optimize sampling, particles should ideally evolve in as uncorrelated a fashion as possible; it is not possible to achieve completely independent evolution due to the non-linearity of the underlying McKean process and resulting selection events and mean-field interactions.

Remarks on unbiased estimators.
Estimators based on expectations w.r.t. the empirical measures $\mu^N_t = \mu^N(\mathbf{X}_k(t))$ usually have a bias, i.e. $\mathbb{E}\big[\mu^N_t(f)\big] \neq \mu^k_t(f)$ for $f \in C_b(E)$, which vanishes only asymptotically (3.9). This originates from the non-linear time evolution of $\mu^k_t$ (2.24) and associated McKean processes. It is straightforward to derive estimators based on the unnormalized measures $\nu^k_t$ (2.21) that are unbiased. Based on (2.25) and (3.2), we obtain an estimate of the normalization $\nu^k_t(1)$,
\[
\nu^N_t(1) := \exp\Big(\int_0^t \mu^N_s(V_k)\, ds\Big), \tag{3.17}
\]
and then introduce unnormalized empirical measures on $E$ at time $t$ based on the particle approximation,
\[
\nu^N_t(f) := \nu^N_t(1)\, \mu^N_t(f) \quad\text{for all } f \in C_b(E).
\]
The expected time evolution of observables $f$ is then given by
\[
\frac{d}{dt}\mathbb{E}\big[\nu^N_t(f)\big] = \mathbb{E}\Big[\nu^N_t(f)\, \mu^N_t(V_k) + \nu^N_t(1)\, \mathcal{L}^N_k \mu^N_t(f)\Big]. \tag{3.18}
\]
Now with (3.15) and the decomposition (2.32) of $\mathcal{L}_{k,\mu^N_t}$ into mutation and selection part, we have
\[
\mathcal{L}^N_k \mu^N_t(f) = \mu^N_t\big(\widehat{\mathcal{L}}_k f + \mathcal{L}^V_{k,\mu^N_t} f\big),
\]
and with the general construction of McKean representations (2.26),
\[
\mu^N_t(\mathcal{L}^V_{k,\mu^N_t} f) = \mu^N_t(V_k f) - \mu^N_t(f)\, \mu^N_t(V_k).
\]
Inserting into (3.18), this simplifies to
\[
\frac{d}{dt}\mathbb{E}\big[\nu^N_t(f)\big] = \mathbb{E}\big[\nu^N_t(\widehat{\mathcal{L}}_k f) + \nu^N_t(V_k f)\big].
\]
Since with (2.15) $\mathcal{L}_k = \widehat{\mathcal{L}}_k + V_k$ also generates the time evolution of $\nu^k_t(f)$, a simple Gronwall argument with $\mathbb{E}\big[\nu^N_0(f)\big] = \nu^k_0(f)$ gives
\[
\mathbb{E}\big[\nu^N_t(f)\big] = \nu^k_t(f) \quad\text{for all } t \geq 0 \text{ and } N \geq 1. \tag{3.19}
\]
Note that choosing $f \equiv 1$ implies that the normalization (3.17) is an unbiased estimator of $e^{t \lambda_k(t)}$, which we will see again in Section 3.4 in the context of cloning algorithms. However, in practice the random error dominates the accuracy of estimates of $\lambda_k(t)$, so $N$ has to be chosen large and the bias of the estimator $\Lambda^N_k(t)$ (3.2) is negligible.
Cloning algorithms proposed in the theoretical physics literature [3, 4] are similar to the particle approximation (3.6), using the same tilted rates $W_k$ for mutations, but combining the cloning and mutation parts of the generator. We focus the following exposition on the algorithm proposed in [4], but other continuous-time versions can be analysed analogously. The idea is simply to sample the cloning process for each particle $i$ together with the mutation process, at the same rate

$$w_k(x_i) := \int_E W_k(x_i, dy) = \int_E w(x_i, dy)\, e^{k g(x_i, y)}.$$

In each combined event, a random number of clones is generated with a distribution $p^N_{k,x_i}(n)$ such that its expectation is

$$m^N_k(x_i) := \sum_{n=0}^N n\, p^N_{k,x_i}(n) = \big(V_k(x_i) - c\big)^+ \big/ w_k(x_i).\qquad(3.20)$$

These clones then replace $n$ particles chosen uniformly at random (in the sense that all subsets of size $n$ are equally probable) from the ensemble. In this way, the rate at which a clone of particle $i$ replaces any given particle $j$ is

$$w_k(x_i)\,\frac{m^N_k(x_i)}{N} = \frac{1}{N}\big(V_k(x_i) - c\big)^+,$$

as required for $\mathcal{L}^N_k$ in (3.6). The only additional assumption on $p^N_{k,x_i}(n)$ is that the range of possible values of $n$ has to be bounded by $N$ for the cloning event to be well defined. Since its mean is bounded by $\max_{x\in E}\big(V_k(x)-c\big)^+/w_k(x) < \infty$ independently of $N$, any distribution with the correct mean and finite range will lead to a valid algorithm for sufficiently large $N$.

The above cloning process is described by the generator

$$\mathcal{L}^{N,\mathrm{mc}}_k F(x) := \sum_{i=1}^N \sum_{n=0}^N \binom{N}{n}^{-1} \sum_{\substack{A\subseteq\{1,\dots,N\}\\ |A|=n}} \int_E W_k(x_i, dy)\, p^N_{k,x_i}(n)\, \big(F(x^{A,x_i;i,y}) - F(x)\big).\qquad(3.21)$$

Here we have used the notation $x^{A,w;i,y}$ for the vector $z \in E^N$ with

$$z_j := \begin{cases} x_j, & j \notin A \cup \{i\},\\ w, & j \in A \setminus \{i\},\\ y, & j = i, \end{cases}$$

for $j \in \{1,\dots,N\}$ and $w, y \in E$.
(3.21) combines cloning of $x_i$ into a uniformly chosen subset $A$ of size $n$ with a subsequent mutation event, in which the state of particle $i$ changes to $y$. If we write $\mathcal{L}^{N,-}_k$ for the killing part (third line in (3.6)), which remains unchanged, the full generator of this cloning algorithm is given by

$$\mathcal{L}^{N,\mathrm{clone}}_k := \mathcal{L}^{N,\mathrm{mc}}_k + \mathcal{L}^{N,-}_k.\qquad(3.22)$$

It can be shown by direct computation that for marginal test functions of the form $F(x) = f(x_i)$ this is equivalent to the generator (3.6),

$$\mathcal{L}^N_k F(x) = \mathcal{L}^{N,\mathrm{clone}}_k F(x),\qquad(3.23)$$

and by linearity of generators the same holds for all averaged functions of the form $F(x) = \mu^N(x)(f)$. One can also show that for marginal test functions the carré du champ operators coincide, so the cloning algorithm produces marginal processes, i.e. particle trajectories, with the same distribution as the simple particle approximation (3.6). For averaged test functions, the change in the correlation structure between particles is picked up by the carré du champ operator. Instead of (3.16) one can derive the following estimate for the mutation and cloning part,

$$\Gamma^{N,\mathrm{mc}}_k F(x) \leq \frac{1}{N}\,\mu^N(x)\Big(\widehat\Gamma_k f + \mathbb{1}(V_k > c)\,\frac{q^N_k}{m^N_k}\,\Gamma^+_{k,\mu^N(x),c} f\Big);$$

see [18], Theorem 3.2, for a proof. Here $q^N_k(x_i) := \sum_{n=0}^N n^2\, p^N_{k,x_i}(n)$ denotes the second moment of the number of clones for particle $i$, and we use the decomposition (2.32), where $\widehat\Gamma_k f$ is the carré du champ corresponding to the mutation dynamics $\widehat L_k$, and $\Gamma^+_{k,\mu^N(x),c}$ the one corresponding to the cloning part (2.28). This estimate holds, of course, only for $N$ large enough that the cloning event is well defined (see the discussion above).
Note also that $\Gamma^+_{k,\mu^N(x),c} f(x)$ is proportional to $(V_k(x) - c)^+$, and by (3.20), $(V_k(x) - c)^+ = 0$ implies $m^N_k(x) = 0$ for the expectation of the distribution $p^N_{k,x}$, leading to the indicator function $\mathbb{1}(V_k > c) \in \{0,1\}$.

This is sufficient to carry out the full proof of the convergence results mentioned in Section 3, based on results in [17]. This is done in [18] in full detail, and here we only report the main result of that work. Recall the bounds (2.12) on $V_k$ and on the total modified exit rate $w_k$.

Theorem.
Denote by $\Lambda^N_k(t)$ the eigenvalue estimator (3.2) corresponding to the cloning algorithm (3.22). Then there exist constants $\alpha, \gamma > 0$ and $\alpha_p, \gamma_p > 0$ such that for all $N$ large enough

$$\sup_{t\geq 0}\Big|\,\mathbb{E}^N\big[\Lambda^N_k(t)\big] - \lambda_k(t)\,\Big| \leq \frac{\alpha}{N}\,\bar v_k \bar w_k\,\big(\gamma + \|q^N_k\|_\infty\big),\qquad(3.24)$$

and for all $p > 1$ and $t \geq 0$,

$$\mathbb{E}^N\big[\,|\Lambda^N_k(t) - \lambda_k(t)|^p\,\big]^{1/p} \leq \frac{\alpha_p}{\sqrt{N}}\,\bar v_k \bar w_k\,\big(\gamma_p + \|q^N_k\|_\infty\big)^{1/2}.\qquad(3.25)$$

Remarks.

• Choosing the normalization of the potential $c < \inf_{x\in E} V_k(x)$, the killing rate in (3.6) vanishes and (3.21) describes the full generator $\mathcal{L}^{N,\mathrm{clone}}_k$ of the cloning algorithm. This is computationally cheaper and simpler to implement, since only the mutation process has to be sampled independently for all particles, and cloning events happen simultaneously. However, as discussed in Section 4, this choice in general reduces the accuracy of the estimator.

• A common choice in the physics literature for the distribution $p^N_{k,x_i}$ of the clone size (see e.g. [3, 13]) is

$$p^N_{k,x_i}(n) = \begin{cases} m^N_k(x_i) - \lfloor m^N_k(x_i)\rfloor, & n = \lfloor m^N_k(x_i)\rfloor + 1,\\[1mm] \lfloor m^N_k(x_i)\rfloor + 1 - m^N_k(x_i), & n = \lfloor m^N_k(x_i)\rfloor,\\[1mm] 0, & \text{otherwise.}\end{cases}\qquad(3.26)$$

So the two integers adjacent to the mean are chosen with the appropriate probabilities, which minimizes the variance of the distribution for a given mean. This choice therefore minimizes the contribution of the second moment $q^N_k$ to the error bounds (3.24) and (3.25), and is also simple to implement in practice.
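The minimal-variance choice (3.26) amounts to a Bernoulli split between the two integers adjacent to the mean, and can be sampled in a few lines. A Python sketch (the function name is ours):

```python
import math
import random

def sample_clone_number(m, rng=random):
    """Sample n from the minimal-variance distribution (3.26): only the two
    integers adjacent to the mean m occur, with probabilities chosen so
    that E[n] = m."""
    floor_m = math.floor(m)
    # with probability m - floor(m) return floor(m) + 1, else floor(m)
    if rng.random() < m - floor_m:
        return floor_m + 1
    return floor_m
```

For example, with mean $m = 2.3$ only the values $2$ and $3$ occur, with empirical average converging to $2.3$.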
• Due to (3.23), trajectories of individual particles follow the same law as in the simple particle approximation (3.6), and therefore the same McKean process as explained in Section 2.2. The cloning approach can introduce additional correlations between particles due to large cloning events, which is quantified by the second moment $q^N_k$ entering the error bounds (3.24) and (3.25).

In the physics literature an alternative estimator to $\Lambda^N_k(t)$ (3.2) is often used, based on a concept called the 'cloning factor' (see e.g. [3, 5, 13]). This is essentially a continuous-time jump process $(C^N_k(t) : t \geq 0)$ on $\mathbb{R}^+$ with $C^N_k(0) = 1$, where at each selection event of size $n \geq -1$ at a given time $t$ the value is updated as

$$C^N_k(t) = C^N_k(t^-)\,(1 + n/N).\qquad(3.27)$$

Here $n = -1$ indicates a killing event and $n \geq 1$ a cloning event, according to the two parts of the generator (3.22). This idea comes from a branching process interpretation of the cloning ensemble related to the unnormalized measure $\nu_t^k$, since with (2.17) we have that

$$\nu_t^k(1) \approx e^{\lambda_k t}\quad\text{as } t \to \infty,$$

so $\lambda_k$ corresponds to the volume expansion factor of the clone ensemble due to the selection dynamics. In our setting, the dynamics of $C^N_k(t)$ can be defined jointly with the cloning process via an extension of the generator (3.22),

$$\bar{\mathcal{L}}^{N,\mathrm{clone}}_k F(x,\zeta) = \bar{\mathcal{L}}^{N,\mathrm{mc}}_k F(x,\zeta) + \bar{\mathcal{L}}^{N,-}_k F(x,\zeta),$$

acting on functions that depend on the state $x \in E^N$ and the cloning factor $\zeta \in \mathbb{R}^+$. With (3.21) we have for cloning events

$$\bar{\mathcal{L}}^{N,\mathrm{mc}}_k F(x,\zeta) := \sum_{i=1}^N \sum_{n=0}^N \binom{N}{n}^{-1} \sum_{\substack{A\subseteq\{1,\dots,N\}\\ |A|=n}} \int_E W_k(x_i,dy)\, p^N_{k,x_i}(n)\, \Big(F\big(x^{A,x_i;i,y},\, \zeta(1+n/N)\big) - F(x,\zeta)\Big),$$

and with the third line of (3.6) for killing events

$$\bar{\mathcal{L}}^{N,-}_k F(x,\zeta) = \sum_{i=1}^N \big(V_k(x_i) - c\big)^- \frac{1}{N}\sum_{j=1}^N \Big(F\big(x^{i,x_j},\, \zeta(1 - 1/N)\big) - F(x,\zeta)\Big).$$
So the joint process $\big((X^N_k(t), C^N_k(t)) : t \geq 0\big)$ is Markov, and observing only the cloning factor with the simple test function $G(x,\zeta) = \zeta$, we get

$$\bar{\mathcal{L}}^{N,\mathrm{mc}}_k G(x,\zeta) = \sum_{i=1}^N \sum_{n=0}^N \binom{N}{n}^{-1} \sum_{\substack{A\subseteq\{1,\dots,N\}\\ |A|=n}} \int_E W_k(x_i,dy)\, p^N_{k,x_i}(n)\,\zeta\,\frac{n}{N} = \frac{\zeta}{N}\sum_{i=1}^N \big(V_k(x_i) - c\big)^+.\qquad(3.28)$$

In the last equality we have used (3.20), and in a similar fashion we get for killing events

$$\bar{\mathcal{L}}^{N,-}_k G(x,\zeta) = -\frac{\zeta}{N}\sum_{i=1}^N \big(V_k(x_i) - c\big)^-.\qquad(3.29)$$

Therefore

$$\bar{\mathcal{L}}^{N,\mathrm{clone}}_k G(x,\zeta) = \frac{\zeta}{N}\sum_{i=1}^N \big(V_k(x_i) - c\big) = \zeta\,\mu^N(x)(V_k) - \zeta c,$$

and analogously to (3.18), the expected time evolution of $C^N_k(t)$ is then given by

$$\frac{d}{dt}\,\mathbb{E}\big[C^N_k(t)\big] = \mathbb{E}\big[C^N_k(t)\cdot \mu_t^N(V_k - c)\big].$$

This is also the evolution of $e^{-tc}\,\nu_t^N(1)$, since

$$\frac{d}{dt}\,\mathbb{E}\big[e^{-tc}\,\nu_t^N(1)\big] = \mathbb{E}\big[\mu_t^N(V_k)\, e^{-tc}\,\nu_t^N(1) - c\, e^{-tc}\,\nu_t^N(1)\big] = \mathbb{E}\big[e^{-tc}\,\nu_t^N(1)\cdot \mu_t^N(V_k - c)\big].$$

With initial conditions $C^N_k(0) = 1 = \nu_0^N(1)$, a Gronwall argument analogous to (3.19) gives

$$\mathbb{E}\big[e^{tc}\, C^N_k(t)\big] = \mathbb{E}\big[\nu_t^N(1)\big] = \nu_t(1)\quad\text{for all } t \geq 0 \text{ and } N \geq 1.$$

So $e^{tc}\, C^N_k(t)$ is an unbiased estimator of $\nu_t(1)$, which also leads to an alternative estimator for $\lambda_k(t)$ (2.17), given by

$$\bar\Lambda^N_k(t) := \frac{1}{t}\log C^N_k(t) + c.\qquad(3.30)$$

Note that this estimator is not itself unbiased, as a consequence of the nonlinear transformation involving the logarithm. Since $C^N_k(t)$ is defined as a product (3.27), we can use another simple test function, $G(x,\zeta) = \log\zeta$, to analyze the convergence behaviour of $\bar\Lambda^N_k(t)$. Analogously to (3.28) and (3.29), we get

$$\bar{\mathcal{L}}^{N,\mathrm{clone}}_k G(x,\zeta) = \frac{1}{N}\sum_{i=1}^N \big(V_k(x_i) - c\big) + O\Big(\frac{1}{N}\Big) = \mu^N(x)(V_k) - c + O\Big(\frac{1}{N}\Big),$$

where we have also used (3.20) and assumed that the support of $p^N_{k,x_i}$ is bounded independently of $N$ (which is the case for common choices in the literature, such as (3.26)).
This allows us to approximate $\log(1 + n/N) = n/N + O(1/N^2)$ as $N \to \infty$, leading to error terms of order $1/N$. Then, analogously to (3.12), we get with $\log C^N_k(0) = 0$ that

$$\mathcal{M}^N_C(t) := \log C^N_k(t) - \int_0^t \bar{\mathcal{L}}^{N,\mathrm{clone}}_k G\big(X_k(s), C^N_k(s)\big)\,ds = \log C^N_k(t) - t\big(\Lambda^N_k(t) - c\big) + t\, O\Big(\frac{1}{N}\Big)$$

is a martingale. For the carré du champ we obtain from a straightforward computation that

$$\bar\Gamma^{N,\mathrm{clone}}_k G(x,\zeta) = \frac{1}{N}\sum_{i=1}^N \big|V_k(x_i) - c\big| + O\Big(\frac{1}{N}\Big),$$

and since the potential $V_k$ is bounded (2.12), the quadratic variation of the martingale is bounded by

$$\langle\mathcal{M}^N_C\rangle(t) \leq \frac{t}{N}\big(\bar v_k + |c|\big) + O\Big(\frac{t}{N^2}\Big).$$

Therefore the estimator (3.30) based on the cloning factor,

$$\bar\Lambda^N_k(t) = \frac{1}{t}\log C^N_k(t) + c = \Lambda^N_k(t) + O\Big(\frac{1}{N}\Big) + \frac{1}{t}\mathcal{M}^N_C(t),$$

is asymptotically equal to the basic estimator $\Lambda^N_k(t)$ (3.2), with corrections that vanish as $1/N$ in the $L^p$-norm as $N \to \infty$, uniformly in $t \geq 0$, analogously to the discussion in Section 3.2. Therefore, the same convergence results as stated in the Theorem apply to $\bar\Lambda^N_k(t)$. Similar convergence results can be shown to hold for $e^{tc} C^N_k(t)$ as an estimator of $\nu_t(1)$ for fixed $t > 0$, but naturally these cannot hold uniformly in time. Since the object of interest is usually the long-time limit $\lambda_k$ (2.17), the practical relevance of this is limited, in addition to the general point that random errors dominate the convergence, as mentioned in Section 3.2. In practice, the basic ergodic average $\Lambda^N_k(t)$ (3.2) is more useful than the cloning factor in the application areas we have in mind. In particular, for alternative particle approximations such as (3.7) or (3.8), where cloning and killing events are effectively combined, it is not clear how to define a cloning factor, whereas $\Lambda^N_k(t)$ is always easily accessible.
Selection events (cloning or killing) in a particle approximation increase the correlations among the particles in the ensemble, thereby decreasing the resolution of the empirical distribution $\mu_t^N = \mu^N(X_k(t))$, and ultimately the quality of the sample average in the estimator (3.2). It is therefore desirable to minimize the total rate $S_k(x)$ of selection events of a particle approximation. For algorithm (3.6) this is given by

$$S_k(x) = \sum_{i=1}^N \big|V_k(x_i) - c\big|,\qquad(4.1)$$

and the same holds for the cloning algorithm (3.22), since the change in the cloning rate is compensated exactly by the average number of clones created, leading to the same overall rate. It is easy to see that, for a given state $x$ of the clone ensemble, there is an optimal choice of $c$ minimizing this expression, given by the median of the fitness distribution $V_k(x) := \{V_k(x_i) : i = 1,\dots,N\}$. If the distribution of $X_k(t)$ is unimodal with light enough tails, the median can be well approximated by the mean $\mu_t^N(V_k)$. Since both quantities can be computed with similar computational effort (or well approximated at reduced cost using only a subset of the ensemble), choosing

$$c = c(t) = \mathrm{median}\big(V_k(X_k(t))\big)$$

should be computationally optimal. In particular, the simplest choice $c = 0$ in the cloning algorithm is in general far from optimal, as is choosing $c = \inf_{x\in E} V_k(x)$ to get rid of the killing part of the dynamics (see the first remark in Section 3.3). Intuitively, algorithms (3.7) and (3.8) should lead to even lower total selection rates, since there every selection event increases the fitness potential, while in algorithms based on (3.6) it increases only on average and may also decrease as a result of selection events.
Indeed, for (3.7) we have

$$\tilde S_k(x) = \frac{1}{N}\sum_{i,j=1}^N \big(V_k(x_i) - V_k(x_j)\big)^+ = \frac{1}{2N}\sum_{i,j=1}^N \big|V_k(x_i) - V_k(x_j)\big| \leq \frac{1}{2N}\sum_{i,j=1}^N \Big(\big|V_k(x_i) - c\big| + \big|c - V_k(x_j)\big|\Big) = S_k(x),\qquad(4.2)$$

by symmetry of the summations and the triangle inequality. The inequality is strict except in degenerate cases, e.g. if $V_k(x_i)$ takes only two values and $c$ lies between the two. In practice, in the scenarios which we have investigated, it turns out that unless the distribution of $X_k(t)$ is seriously skewed, $\tilde S_k$ is strictly smaller than $S_k$ by a sizeable amount, as illustrated later in Figure 2 for the inclusion process. Algorithm (3.8) provides a further improvement, with

$$\hat S_k(x) = \frac{1}{N}\sum_{i,j=1}^N \frac{\big(V_k(x_i) - \mu^N(x)(V_k)\big)^-\,\big(V_k(x_j) - \mu^N(x)(V_k)\big)^+}{\mu^N(x)\big((V_k - \mu^N(x)(V_k))^+\big)} = \frac{1}{2}\sum_{i=1}^N \big|V_k(x_i) - \mu^N(x)(V_k)\big| \leq \tilde S_k(x).\qquad(4.3)$$

Here we have used $\sum_{i=1}^N\big(V_k(x_i) - \mu^N(x)(V_k)\big) = 0$, and Jensen's inequality to compare with $\tilde S_k(x)$, since $v \mapsto |a - v|$ is convex for all $a \in \mathbb{R}$. Note that the rate of change of the mean fitness $\mu_t^N(V_k)$ is given by the same expression in all of the above particle approximations,

$$\mu_t^N(\widehat L_k V_k) + \mu_t^N(V_k^2) - \mu_t^N(V_k)^2.\qquad(4.4)$$

The first term, due to the mutation dynamics $\widehat L_k$, can have either sign and is identical for all algorithms, while the second, due to selection, is positive and given by the empirical variance of $V_k$. This follows from direct computations using the averaged test function $F(x) = \mu^N(x)(V_k)$ in (3.6), (3.7), (3.8) and (3.22), and is consistent with the evolution equation (2.24).
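The ordering of the selection rates (4.3) $\leq$ (4.2) $\leq$ (4.1), and the fact that the median minimizes (4.1) over $c$, are easy to check numerically. A Python sketch with hypothetical fitness values (the function names are ours):

```python
import statistics

def S(V, c):
    """Total selection rate (4.1) of algorithm (3.6) with normalization c."""
    return sum(abs(v - c) for v in V)

def S_pair(V):
    """Selection rate (4.2) of algorithm (3.7), from pairwise differences."""
    return sum(max(vi - vj, 0.0) for vi in V for vj in V) / len(V)

def S_mean(V):
    """Selection rate (4.3) of algorithm (3.8), from deviations of the mean."""
    m = statistics.mean(V)
    return 0.5 * sum(abs(v - m) for v in V)

V = [0.3, 0.8, 1.1, 2.5, 4.0]  # hypothetical values V_k(x_i)

# ordering (4.3) <= (4.2) <= (4.1), here even with the optimal c = median:
assert S_mean(V) <= S_pair(V) <= S(V, statistics.median(V))

# the median minimizes c -> S(V, c):
c_med = statistics.median(V)
assert all(S(V, c_med) <= S(V, c)
           for c in [0.0, min(V), statistics.mean(V), max(V)])
```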
So the mean fitness evolves until a mutation-selection balance is reached and the rate of change (4.4) vanishes, characterizing the stationary state of the particle approximation process. Note that this basic mechanism is identical in all particle approximations discussed here, so we expect the mean fitness to show very similar behaviour in all of them. While finite-size effects can lead to deviations also in the mean, the main difference between the algorithms is found on the level of variances and time correlations, which can be significantly reduced using (3.7) or (3.8), as illustrated in the next subsections. Since our main observable of interest $\Lambda^N_k(t)$ is an ergodic time average of $\mu_t^N(V_k)$, this can lead to significant improvements in the accuracy of the estimator (3.2).

The correlations introduced by selection are counteracted by the mutation dynamics, which occur independently for each particle and decorrelate the ensemble. The dynamics of correlation structures in cloning algorithms has been discussed in some detail recently in [38, 7, 8, 13], and can be understood in terms of ancestry in the generic population dynamics interpretation. Those works also discuss important non-ergodicity effects in the measurement of path properties and the interpretation of particle trajectories, which were already pointed out in [3] and are also a subject of recent research [39]. This poses interesting questions for rigorous mathematical investigation, which are left to future work. Here we simply conclude with a numerical test in the next subsections, which supports the intuition that approximation (3.7), with minimal selection rates, leads to variance reduction in the relevant estimators compared to the cloning algorithm. Since the selection rate in (3.7) depends on potential differences between pairs, implementation is in general more involved than for algorithms based on (3.6).
While the scaling $tN\log N$ of the computational complexity with the size $N$ of the clone ensemble is the same, the prefactor and computational cost in practice may be higher, and this has to be traded off against gains in accuracy on a case-by-case basis. For the examples studied below we find a computationally efficient implementation of (3.7) providing a clear improvement over the standard cloning algorithm, which is the main contribution of this paper in this context. Algorithm (3.8), on the other hand, provides only marginal improvement over (3.7), but cannot be implemented as efficiently in our area of interest.

In the following we consider one-dimensional stochastic lattice gases with periodic boundary conditions on the discrete torus $\mathbb{T}_L$ with $L$ sites and a fixed number of particles $M$. Within our general framework, they are simply Markov chains on the finite state space $E$ of all particle configurations, and they have been of recent research interest in the context of current fluctuations. We denote configurations by $\eta = (\eta_x : x \in \mathbb{T}_L)$, where $\eta_x \in \mathbb{N}$ is interpreted as the mass (or number of monomers) at site $x$, and the process is denoted by $(\eta(t) : t \geq 0)$. In order to use standard notation for lattice gases, in this and the following subsection we change notation; in particular, the use of $x, y \in \mathbb{T}_L$ differs from that in previous sections, where these symbols denoted states in $E$. Monomers jump to nearest-neighbour sites with rates $u(\eta_x, \eta_y) \geq 0$ for $y = x \pm 1$, depending on the occupation numbers of the departure and target sites, multiplied by a spatial bias $p = 1 - q \in [0,1]$. The generator is of the form

$$\mathcal{L}f(\eta) = \sum_{x\in\mathbb{T}_L}\Big[p\,u(\eta_x,\eta_{x+1})\big(f(\sigma^{x,x+1}\eta) - f(\eta)\big) + q\,u(\eta_x,\eta_{x-1})\big(f(\sigma^{x,x-1}\eta) - f(\eta)\big)\Big],\qquad(4.5)$$

where $\sigma^{x,y}\eta$ results from the configuration $\eta$ after moving one particle from $x$ to $y$.
The number of particles $M = \sum_{x\in\mathbb{T}_L}\eta_x$ is a conserved quantity, but otherwise we assume the process to be irreducible for any fixed $M$, which is ensured, for example, by positivity of the rates, i.e. for all $k, l \geq 0$,

$$u(k,l) = 0 \iff k = 0.$$

This class includes various models that have been studied in the literature, for example the inclusion process introduced in [19], where

$$u(k,l) = k(d + l)\quad\text{for all } k, l \geq 0,\qquad(4.6)$$

with a positive parameter $d > 0$. Particles perform independent jumps with rate $d$ and are in addition attracted by each particle on the target site with rate $1$, giving rise to the 'inclusion' interaction. This model has attracted recent attention due to the presence of condensation phenomena [40, 41] and in the context of large deviations of the particle current [20], and we will use it as an example in Section 4.3. Other well-studied models covered by our set-up are the exclusion process, with state space $E \subset \{0,1\}^{\mathbb{T}_L}$ and $u(\eta_x,\eta_y) = \eta_x(1-\eta_y)$, and zero-range processes, with $E \subset \mathbb{N}^{\mathbb{T}_L}$ and rates $u(\eta_x,\eta_y) = u(\eta_x)$ depending only on the occupation number of the departure site.

In terms of previous notation, the jump rates for a lattice gas of type (4.5) between any two configurations $\eta$ and $\zeta$ are given by

$$W(\eta,\zeta) = \sum_{x\in\mathbb{T}_L}\Big(p\,u(\eta_x,\eta_{x+1})\,\delta_{\zeta,\sigma^{x,x+1}\eta} + q\,u(\eta_x,\eta_{x-1})\,\delta_{\zeta,\sigma^{x,x-1}\eta}\Big).\qquad(4.7)$$

In the following we focus on lattice gases where $\sum_x u(\eta_x,\eta_{x+1}) = \sum_x u(\eta_x,\eta_{x-1})$ for all configurations $\eta$. While this is not true in general for models of type (4.5), it holds for many examples, including the inclusion, exclusion and zero-range processes mentioned above. With $p + q = 1$, the total exit rate out of configuration $\eta$ is then simply given by

$$w(\eta) = \sum_{x\in\mathbb{T}_L}\Big(p\,u(\eta_x,\eta_{x+1}) + q\,u(\eta_x,\eta_{x-1})\Big) = \sum_{x\in\mathbb{T}_L} u(\eta_x,\eta_{x+1}).\qquad(4.8)$$
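For the inclusion rates (4.6), the left/right symmetry assumed above and the resulting exit rate (4.8) are easy to verify numerically. A Python sketch (function names ours, with a small hypothetical configuration):

```python
def u_inclusion(k, l, d=1.0):
    """Inclusion process jump rates (4.6): u(k, l) = k (d + l)."""
    return k * (d + l)

def exit_rate(eta, d=1.0):
    """Total exit rate w(eta) = sum_x u(eta_x, eta_{x+1}) on the torus (4.8)."""
    L = len(eta)
    return sum(u_inclusion(eta[x], eta[(x + 1) % L], d) for x in range(L))

eta = [2, 0, 1, 3]  # hypothetical configuration on L = 4 sites
L = len(eta)
right = sum(u_inclusion(eta[x], eta[(x + 1) % L]) for x in range(L))
left = sum(u_inclusion(eta[x], eta[(x - 1) % L]) for x in range(L))
# the symmetry sum_x u(eta_x, eta_{x+1}) = sum_x u(eta_x, eta_{x-1}) holds:
assert right == left == exit_rate(eta)
```

The symmetry follows since both sums equal $dM + \sum_x \eta_x\eta_{x+1}$ for rates (4.6), which is also the formula used to initialize $w(\eta^i)$ in the pseudocode of Section 4.3.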
We are interested in an observable $A_T$ measuring the total particle current up to time $T$, which is achieved by choosing $h(\eta) \equiv 0$ in (2.5) and

$$g(\eta,\zeta) = \pm 1\ \text{ if } \zeta = \sigma^{x,x\pm 1}\eta,\qquad g(\eta,\zeta) = 0\ \text{ otherwise}.$$

Using (4.8), we see by direct computation that the potential (2.11) takes the simple form

$$V_k(\eta) = (Q_k - 1)\,w(\eta),\quad\text{where } Q_k := p e^k + q e^{-k}.\qquad(4.9)$$

The modified mutation rates $W_k(\eta,\zeta)$ are given by (4.7) with $(p,q)$ replaced by $(pe^k, qe^{-k})$, leading to modified total exit rates

$$w_k(\eta) = Q_k \sum_{x\in\mathbb{T}_L} u(\eta_x,\eta_{x+1}) = Q_k\, w(\eta).\qquad(4.10)$$

The similarity of $V_k$ and $w_k$ for lattice gases (4.5) that obey (4.8) provides a direct relation between mutation and selection rates, and allows us to set up an efficient rejection-based implementation of a particle approximation $(\eta_k(t) : t \geq 0)$ based on the efficient algorithm (3.7). In the following we omit the subscript $k$ for configurations and write $\eta(t) = (\eta^i(t),\ i = 1,\dots,N)$ to simplify notation. For given parameters $p$, $q = 1 - p$ and fixed $k \in \mathbb{R}$, we distinguish two cases.

$Q_k < 1$. We sample the ensemble of $N$ clones at a total rate of $W(\eta) := \sum_{i=1}^N w(\eta^i)$, and pick a clone $i$ with probability $w(\eta^i)/W(\eta)$ for the next event. With probability $Q_k \in (0,1)$ this is a simple mutation within clone $i$, and we then replace $\eta^i$ by $\zeta$, with $\zeta$ chosen with probability $W_k(\eta^i,\zeta)/w_k(\eta^i)$. Otherwise, with probability $1 - Q_k$, we perform a selection event following the second line in (3.7): pick a clone $j$ uniformly at random (including $i$). If $V_k(\eta^j) > V_k(\eta^i)$, or equivalently $w(\eta^j) < w(\eta^i)$ (with (4.9) and since $Q_k < 1$), replace $\eta^i$ by $\eta^j$ with probability $\big(w(\eta^i) - w(\eta^j)\big)/w(\eta^i)$. This procedure ensures that mutation and selection events are sampled with the correct rates, as required in (3.7).

$Q_k > 1$.
We sample the ensemble of $N$ clones at a total rate of $Q_k\,W(\eta)$, and pick a clone $i$ with probability $w(\eta^i)/W(\eta)$ and a clone $j$ uniformly at random. If $w(\eta^j) < w(\eta^i)$, we replace $\eta^j$ by $\eta^i$ with probability

$$\frac{Q_k - 1}{Q_k}\cdot\frac{w(\eta^i) - w(\eta^j)}{w(\eta^i)}.$$

Then we mutate clone $i$ as above, combining the mutation and selection event as in the cloning algorithm.

Remarks.
Note that $Q_k = 1$ is equivalent to $k = 0$, which corresponds to the original process with $\lambda_0 = 0$ and does not require any estimation. For $Q_k > 1$ we perform mutation and selection events simultaneously, in analogy to the cloning procedure explained in Section 3.3, but can use the efficient algorithm (3.7). For $Q_k < 1$, no mutation or selection event occurs with probability $(1 - Q_k)\,\frac{w(\eta^j)}{w(\eta^i)}\,\mathbb{1}\big(w(\eta^j) < w(\eta^i)\big)$, and a high rate of such rejections is not desirable for computational efficiency. But even for very small values of $Q_k$ the second factor is usually significantly smaller than $1$ (or simply $0$), since clone $i$ was picked with probability proportional to $w(\eta^i)$ and $j$ uniformly at random.

Note also that if the cloning algorithm (3.22) is implemented with the common choice $c = 0$ for a lattice gas of the type discussed here, then due to (4.9) and (4.10) the average number of clones per event (3.20) is

$$m^N_k(\eta^i) = \big(V_k(\eta^i)\big)^+\big/ w_k(\eta^i) = \frac{Q_k - 1}{Q_k} \in (0,1)\quad\text{if } Q_k > 1,$$

and $0$ for $Q_k < 1$, where only killing occurs. In particular, this is independent of the state $\eta$ of the clone ensemble, and the standard distribution of the form (3.26) is a simple Bernoulli random variable.

Figure 1: Illustration of $Q_k$ (left) as given in (4.9), and the drift $(pe^k - qe^{-k})/Q_k$ of the modified dynamics (right), as functions of $k$ for different values of the asymmetry $p = 1 - q$. The minimum of $Q_k$ is $2\sqrt{pq}$, attained at $k = \frac{1}{2}\log\frac{q}{p} \in [-\infty,\infty]$, which is also where the modified drift vanishes.

While with (4.10) the total mutation rate is $Q_k W(\eta)$, the selection rates (4.1), (4.2) and (4.3) can be written as

$$S_k(\eta) = \sum_{i=1}^N \big|(Q_k - 1)\,w(\eta^i) - c\big|\Big|_{c=0} = |Q_k - 1|\,W(\eta),$$
$$\tilde S_k(\eta) = \frac{|Q_k - 1|}{2N}\sum_{i,j=1}^N \big|w(\eta^i) - w(\eta^j)\big|,$$
$$\hat S_k(\eta) = \frac{|Q_k - 1|}{2}\sum_{i=1}^N \big|w(\eta^i) - \mu^N(\eta)(w)\big|.\qquad(4.11)$$
So for very small values of $Q_k$ (close to $0$) the mutation rate can become very small in comparison to selection, which means that significant computation time is devoted to re-weighting by selection rather than to advancing the dynamics via mutation events. This effect is typically much stronger for the standard cloning algorithm with $c = 0$, and occurs for example for totally asymmetric lattice gases with $p = 1$ and negative $k$, conditioning on low currents. In Figure 1 we include a sketch of $Q_k$ for different values of the asymmetry, together with the drift of the modified dynamics, which can be reversed in partially asymmetric systems.

In Figure 2 we compare the cloning algorithm to algorithms (3.7) and (3.8) for an inclusion process with $d = 1$, $L = 64$, $M = 128$ and asymmetry $p = 0.$ It is known [20] that the SCGF $\lambda_k$ scales linearly with the system size $L$, and outside the convergent regime $k \in \big[-\ln\frac{1-p}{p},\, 0\big]$ the rescaled SCGF $\lambda_k/L$ diverges as $L \to \infty$ (divergent regime). We compare estimates $\Lambda^N_k(t)$ for the cloning algorithm (3.22) with $c = 0$ and algorithm (3.7) in the convergent regime. We use initial conditions where the $M$ particles are distributed on the $L$ lattice sites uniformly at random, and a burn-in time of $10\,L = 640$, as discussed in (2.19) and (2.20). This leads to an obvious adaptation of the integration interval in the estimator $\Lambda^N_k(t)$ (3.2), but we do not alter the notation here to keep it simple.

Figure 2: Inclusion process (4.6) with $d = 1$, system size $L = 64$, $M = 128$ particles, asymmetry $p = 0.$ and $N = 2$ clones, at time $t = 42000$. (Top) The rescaled estimator $\Lambda^N_k(t)/L$ as a function of $k$ in the convergent regime, comparing the cloning algorithm (3.22) with $c = 0$ (orange) and algorithm (3.7) (blue). Error bars indicate standard deviations, which are bounded by the size of the symbols for (3.7). (Bottom) Illustration of the relationship between $S_k$ as a function of $c$, and $\tilde S_k$ and $\hat S_k$ (4.11), for $k = -0.$ (left) and $k = 0.$ (right), based on the state $\eta(t)$ of the clone ensemble.

Figure 3: Inclusion process (4.6) with $d = 1$, system size $L = 64$, $M = 128$ particles, asymmetry $p = 0.$ and $N = 2$ clones. Time series of the mean fitness $\mu^N(\eta(t))(V_k)/L$ for the cloning algorithm (red dots) and algorithm (3.7) (blue crosses), with time averages indicated by full lines. (Left) In the convergent regime for $k = -0.$ we see a clear variance reduction using (3.7) with a similar time average. (Right) In the divergent regime for $k = 0.$ we have similar variance, but (3.7) improves on the time average.
Both algorithms perform very well and agree with a simple theoretical estimate based on bias reversal, which is not the main concern of this paper; we refer the reader to [20]. Enlarged error bars indicating standard deviations reveal that (3.7) is significantly more accurate than (3.22). This is due to the lower total selection rates, illustrated at the bottom of Figure 2 in the convergent and divergent regimes. While $\tilde S_k$ for (3.7) is much lower than $S_k$ with $c = 0$, $\hat S_k$ does not offer significant further improvement. Since the efficient rejection-based implementation of (3.7) explained above does not work for (3.8), we focus on (3.7) in our context. The much higher selection rate of the cloning algorithm with $c = 0$ leads to a significantly higher time variation of the average potential in the convergent regime compared to algorithm (3.7), as illustrated in Figure 3. So in comparison to standard cloning, algorithm (3.7) leads to reduced finite-size effects and/or a significant variance reduction in this example, and a significant improvement in the convergence of the estimator (3.2). We have checked that this also holds for zero-range processes with bounded rates. These promising first numerical results pose interesting questions for a systematic study of the practical properties of the algorithms and associated time correlations in future work, also in comparison with various recent results on improvements of cloning algorithms [7, 12, 11].

We summarize the procedure outlined in the previous subsection for the inclusion process with rates (4.6) on the torus $\mathbb{T}_L$ with $M$ particles in the pseudo-code given below. Besides fixing the model parameter $d > 0$ and the tilt $k \in \mathbb{R}$, the specific parameters of the estimator are the ensemble size $N$ and the total simulation time $t$, which lead to the estimator $\Lambda^N_k(t)$ as given in (3.2). For simplicity we do not include any burn-in times in this description, which would obviously be used in practice.
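For concreteness, the pseudocode below can be turned into runnable code. The following Python sketch implements the $Q_k < 1$ case for the inclusion process, under the simplification discussed next (deterministic time steps $\Delta t = 1/W(\eta)$ in place of exponential waiting times); all function names and bookkeeping choices are ours, not the authors' reference implementation:

```python
import math
import random

def estimate_scgf(L, M, d, p, k, N, t_max, seed=0):
    """Sketch of Algorithm 1 (case Q_k < 1) for the inclusion process (4.6)
    on the torus T_L: tilted mutation moves combined with pairwise selection
    events as in (3.7), with deterministic time steps 1/W(eta)."""
    rng = random.Random(seed)
    q = 1.0 - p
    Qk = p * math.exp(k) + q * math.exp(-k)
    assert Qk < 1.0, "this sketch covers only the case Q_k < 1"

    def w(eta):
        # total exit rate (4.8) for inclusion rates: d*M + sum_x eta_x eta_{x+1}
        return d * M + sum(eta[x] * eta[(x + 1) % L] for x in range(L))

    # initialize N clones with M particles placed uniformly at random
    etas = []
    for _ in range(N):
        eta = [0] * L
        for _ in range(M):
            eta[rng.randrange(L)] += 1
        etas.append(eta)
    ws = [w(eta) for eta in etas]

    s, counter = 0.0, 0
    while s < t_max:
        W = sum(ws)
        # pick clone i with probability w(eta^i)/W
        r, i = rng.random() * W, 0
        while r > ws[i]:
            r -= ws[i]
            i += 1
        if rng.random() < Qk:
            # mutation: move one particle in clone i, right/left with tilted bias
            eta = etas[i]
            moves = [(x, +1, p * math.exp(k) * eta[x] * (d + eta[(x + 1) % L]))
                     for x in range(L)] + \
                    [(x, -1, q * math.exp(-k) * eta[x] * (d + eta[(x - 1) % L]))
                     for x in range(L)]
            r2 = rng.random() * sum(rate for _, _, rate in moves)
            for x, step, rate in moves:
                if r2 < rate:
                    eta[x] -= 1
                    eta[(x + step) % L] += 1
                    break
                r2 -= rate
            ws[i] = w(eta)
        else:
            # selection following (3.7): clone j may replace clone i
            j = rng.randrange(N)
            if ws[j] < ws[i] and rng.random() < (ws[i] - ws[j]) / ws[i]:
                etas[i] = etas[j][:]
                ws[i] = ws[j]
        s += 1.0 / W
        counter += 1
    # counter-based evaluation of (3.2), cf. the Output line of Algorithm 1
    return (Qk - 1.0) * counter / (N * t_max)
```

For $Q_k < 1$ the returned estimate is necessarily negative, since each event contributes the state-independent increment $(Q_k - 1)/N$ to the time integral.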
In this implementation we make a further simplification, which is very common for continuous-time jump dynamics of large systems: we replace exponentially distributed random time increments by their expectation, given by $\Delta t = 1/W(\eta)$ for $Q_k < 1$ and $\Delta t = 1/(Q_k W(\eta))$ for $Q_k > 1$. Since with (4.9)

$$\frac{1}{N}\sum_{i=1}^N V_k(\eta^i(s)) = \frac{Q_k - 1}{N}\,W(\eta),$$

we get for the increments in the evaluation of the ergodic time integral in (3.2)

$$\Delta t\,\frac{1}{N}\sum_{i=1}^N V_k(\eta^i(s)) = \begin{cases}\dfrac{Q_k - 1}{N}, & Q_k < 1,\\[2mm] \dfrac{Q_k - 1}{Q_k N}, & Q_k > 1.\end{cases}$$

These are independent of the actual state $\eta$ of the clones, so the evaluation of $\Lambda^N_k(t)$ in (3.2) can be achieved with a simple integer counter $\hat\Lambda^N_k$, as explained in the pseudocode of Algorithms 1 and 2. While this counter may appear similar to the cloning factor explained in Section 3.4 at first glance, we stress that here finer increments of $+1$ are added after every event (not only after selections).

Algorithm 1:
IP (4.5) with rates (4.6) and $Q_k < 1$ (4.9)

Parameters: $N$ number of clones; $t$ simulation time.
Initialize: configurations $\eta^i$, $i = 1,\dots,N$, with mass $M$ each; $w(\eta^i) = dM + \sum_x \eta^i_x\eta^i_{x+1}$; $W(\eta) = \sum_j w(\eta^j)$; $s = 1/W(\eta)$; $\hat\Lambda^N_k = 1$.
while $s < t$ do
    pick clone $i$ with probability $w(\eta^i)/W(\eta)$;
    if $R \sim U(0,1) < Q_k$ then
        $\eta^i \leftarrow \zeta$, with $\zeta$ chosen with probability $W_k(\eta^i,\zeta)/(Q_k\,w(\eta^i))$ (mutation);
    else
        pick clone $j$ uniformly at random;
        if $w(\eta^j) < w(\eta^i)$ then
            $\eta^i \leftarrow \eta^j$ with probability $\frac{w(\eta^i) - w(\eta^j)}{w(\eta^i)}$ (selection);
        end
    end
    $s \leftarrow s + 1/W(\eta)$; $\hat\Lambda^N_k \leftarrow \hat\Lambda^N_k + 1$;
end
Output: $\Lambda^N_k(t) = \frac{Q_k - 1}{Nt}\,\hat\Lambda^N_k$.

Algorithm 2: IP (4.5) with rates (4.6) and $Q_k > 1$ (4.9)

Parameters: $N$ number of clones; $t$ simulation time.
Initialize: configurations $\eta^i$, $i = 1,\dots,N$, with mass $M$ each; $w(\eta^i) = dM + \sum_x \eta^i_x\eta^i_{x+1}$; $W(\eta) = \sum_j w(\eta^j)$; $s = 1/(Q_k W(\eta))$; $\hat\Lambda^N_k = 1$.
while $s < t$ do
    pick clone $i$ with probability $w(\eta^i)/W(\eta)$; pick clone $j$ uniformly at random;
    if $w(\eta^j) < w(\eta^i)$ then
        $\eta^j \leftarrow \eta^i$ with probability $\frac{Q_k - 1}{Q_k}\,\frac{w(\eta^i) - w(\eta^j)}{w(\eta^i)}$ (selection);
    end
    $\eta^i \leftarrow \zeta$, with $\zeta$ chosen with probability $W_k(\eta^i,\zeta)/(Q_k\,w(\eta^i))$ (mutation);
    $s \leftarrow s + 1/(Q_k W(\eta))$; $\hat\Lambda^N_k \leftarrow \hat\Lambda^N_k + 1$;
end
Output: $\Lambda^N_k(t) = \frac{Q_k - 1}{Q_k Nt}\,\hat\Lambda^N_k$.

We have presented an analytical approach to cloning algorithms based on McKean interpretations of Feynman-Kac semigroups, which have been introduced in the applied probability literature. This allows us to establish rigorous error bounds for the cloning algorithm in continuous time, and to suggest a more efficient variant of the algorithm which can be implemented effectively for current large deviations in stochastic lattice gases. The latter is based on minimizing the selection rate in a standard population dynamics interpretation of particle approximations of non-linear processes.
We include a first application of this idea in the context of inclusion processes, but its full potential will be explored in future, more systematic studies of the optimization of cloning-type algorithms. The rigorous results fully reported in [18] apply under very general conditions, requiring bounded jump rates and the existence of a spectral gap for the underlying jump process. These impose no restriction for lattice gases with a fixed number of particles, which are essentially finite-state Markov chains. We anticipate that these techniques can also be applied to more general processes, including diffusive, piecewise deterministic, or possibly non-Markovian dynamics (see [42] for first heuristic results in this direction). Another interesting direction would be a rigorous analysis of the detailed ergodic properties of trajectories in the clone ensemble, based on recent results in [7, 8, 39].

Acknowledgements

This work was supported by The Alan Turing Institute under the EPSRC grant EP/N510129/1 and The Alan Turing Institute–Lloyds Register Foundation Programme on Data-centric Engineering. AP acknowledges support by the National Group of Mathematical Physics (GNFM-INdAM), and by Imperial College together with the Data Science Institute and Thomson-Reuters Grant No. 4500902397-3408.
References

[1] James B. Anderson. A random-walk simulation of the Schrödinger equation: H$_3^+$. Journal of Chemical Physics, 63:1499, 1975.
[2] Peter Grassberger. Go with the winners: a general Monte Carlo strategy. Computer Physics Communications, 147(1):64–70, 2002.
[3] Cristian Giardina, Jorge Kurchan, and Luca Peliti. Direct evaluation of large-deviation functions. Physical Review Letters, 96(12):120603, 2006.
[4] Vivien Lecomte and Julien Tailleur. A numerical approach to large deviations in continuous time. Journal of Statistical Mechanics: Theory and Experiment, 2007(03):P03004, 2007.
[5] Cristian Giardina, Jorge Kurchan, Vivien Lecomte, and Julien Tailleur. Simulating rare events in dynamical processes. Journal of Statistical Physics, 145(4):787–811, 2011.
[6] Robert L. Jack and Peter Sollich. Large deviations and ensembles of trajectories in stochastic models. Progress of Theoretical Physics Supplement, 184:304–317, 2010.
[7] Takahiro Nemoto, Freddy Bouchet, Robert L. Jack, and Vivien Lecomte. Population-dynamics method with a multicanonical feedback control. Physical Review E, 93:062123, 2016.
[8] Esteban Guevara Hidalgo. Cloning Algorithms: from Large Deviations to Population Dynamics. PhD thesis, Université Sorbonne Paris Cité – Université Paris Diderot 7, 2018.
[9] Takahiro Nemoto, Esteban Guevara Hidalgo, and Vivien Lecomte. Finite-time and finite-size scalings in the evaluation of large-deviation functions: Analytical study using a birth-death process. Physical Review E, 95:012102, 2017.
[10] Esteban Guevara Hidalgo, Takahiro Nemoto, and Vivien Lecomte. Finite-time and finite-size scalings in the evaluation of large-deviation functions: Numerical approach in continuous time. Physical Review E, 95:062134, 2017.
[11] Grégoire Ferré and Hugo Touchette. Adaptive sampling of large deviations. Journal of Statistical Physics, 172(6):1525–1544, 2018.
[12] Tobias Brewer, Stephen R. Clark, Russell Bradford, and Robert L. Jack. Efficient characterisation of large deviations using population dynamics. Journal of Statistical Mechanics: Theory and Experiment, 2018(5):053204, 2018.
[13] Carlos Pérez-Espigares and Pablo I. Hurtado. Sampling rare events across dynamical phase transitions. arXiv:1902.01276.
[14] Pierre Del Moral and Laurent Miclo. A Moran particle system approximation of Feynman-Kac formulae. Stochastic Processes and their Applications, 86(2):193–216, 2000.
[15] Pierre Del Moral and Laurent Miclo. Branching and interacting particle systems approximations of Feynman-Kac formulae with applications to non-linear filtering. In Séminaire de Probabilités XXXIV, pages 1–145. Springer, 2000.
[16] Pierre Del Moral. Feynman-Kac Formulae. Springer, 2004.
[17] Mathias Rousset. On the control of an interacting particle estimation of Schrödinger ground states. SIAM Journal on Mathematical Analysis, 38(3):824–844, 2006.
[18] Letizia Angeli, Stefan Grosskinsky, and Adam M. Johansen. Limit theorems for cloning algorithms. Under review (arXiv:1902.00509).
[19] Cristian Giardinà, Jorge Kurchan, Frank Redig, and Kiamars Vafayi. Duality and hidden symmetries in interacting particle systems. Journal of Statistical Physics, 135(1):25–55, 2009.
[20] Paul Chleboun, Stefan Grosskinsky, and Andrea Pizzoferrato. Current large deviations for partially asymmetric particle systems on a ring. Journal of Physics A: Mathematical and Theoretical, 51(40):405001, 2018.
[21] Alexandre Lazarescu. The physicist's companion to current fluctuations: one-dimensional bulk-driven lattice gases. Journal of Physics A: Mathematical and Theoretical, 48(50):503001, 2015.
[22] Pablo I. Hurtado, Carlos P. Espigares, Jesús J. del Pozo, and Pedro L. Garrido. Thermodynamics of currents in nonequilibrium diffusive systems: Theory and simulation. Journal of Statistical Physics, 154(1):214–264, 2014.
[23] Paul Chleboun, Stefan Grosskinsky, and Andrea Pizzoferrato. Lower current large deviations for zero-range processes on a ring. Journal of Statistical Physics, 167(1):64–89, 2017.
[24] Raphaël Chetrite and Hugo Touchette. Nonequilibrium Markov processes conditioned on large deviations. Annales Henri Poincaré, 16:2005–2057, 2015.
[25] Pelerine Tsobgni Nyawo and Hugo Touchette. Large deviations of the current for driven periodic diffusions. Physical Review E, 94:032101, 2016.
[26] Rosemary J. Harris and Gunter M. Schütz. Fluctuation theorems for stochastic dynamics. Journal of Statistical Mechanics: Theory and Experiment, 2007(07):P07020, 2007.
[27] Frank den Hollander. Large Deviations, volume 14 of Fields Institute Monographs. American Mathematical Society, 2008.
[28] Amir Dembo and Ofer Zeitouni. Large Deviations Techniques and Applications, volume 38. Springer Science & Business Media, 2009.
[29] Lorenzo Bertini, Alessandra Faggionato, and Davide Gabrielli. Large deviations of the empirical flow for continuous time Markov chains. Annales de l'Institut Henri Poincaré, 51(3):867–900, 2015.
[30] Pierre Del Moral and Laurent Miclo. Particle approximations of Lyapunov exponents connected to Schrödinger operators and Feynman-Kac semigroups. ESAIM: Probability and Statistics, 7:171–208, 2003.
[31] Pierre Del Moral. Mean Field Simulation for Monte Carlo Integration. CRC Press, 2013.
[32] James E. Baker. Adaptive selection methods for genetic algorithms. In Proceedings of an International Conference on Genetic Algorithms and their Applications, pages 101–111, 1985.
[33] Augustine Kong, Jun S. Liu, and Wing Hung Wong. Sequential imputations and Bayesian missing data problems. Journal of the American Statistical Association, 89(425):278–288, 1994.
[34] Thomas Milton Liggett. Continuous Time Markov Processes: An Introduction, volume 113 of Graduate Studies in Mathematics. American Mathematical Society, 2010.
[35] Stefan Grosskinsky and Watthanan Jatuviriyapornchai. Derivation of mean-field equations for stochastic particle systems. Stochastic Processes and their Applications.
[37] Yuan Shih Chow and Henry Teicher. Probability Theory: Independence, Interchangeability, Martingales. Springer, 3rd edition, 1998.
[38] Juan P. Garrahan, Robert L. Jack, Vivien Lecomte, Estelle Pitard, Kristina van Duijvendijk, and Frédéric van Wijland. First-order dynamical phase transition in models of glasses: an approach based on ensembles of histories. Journal of Physics A: Mathematical and Theoretical, 42(7):075007, 2009.
[39] Ushnish Ray, Garnet Kin-Lic Chan, and David T. Limmer. Importance sampling large deviations in nonequilibrium steady states. I. The Journal of Chemical Physics, 148(12):124120, 2018.
[40] Stefan Grosskinsky, Frank Redig, and Kiamars Vafayi. Dynamics of condensation in the symmetric inclusion process. Electronic Journal of Probability, 18(66):1–23, 2013.
[41] Alessandra Bianchi, Sander Dommers, and Cristian Giardinà. Metastability in the reversible inclusion process. Electronic Journal of Probability, 22(70):1–34, 2017.
[42] Massimo Cavallaro and Rosemary J. Harris. A framework for the direct evaluation of large deviations in non-Markovian processes. Journal of Physics A: Mathematical and Theoretical, 49(47):47LT02, 2016.