A pseudo-marginal sequential Monte Carlo online smoothing algorithm
Online pseudo-marginal sequential Monte Carlo smoother for general state spaces. Application to recursive maximum likelihood estimation of stochastic differential equations.
Pierre Gloaguen (AgroParisTech, UMR MIA 518), Sylvain Le Corff (Samovar, Télécom SudParis, Institut Polytechnique de Paris), and Jimmy Olsson (Department of Mathematics, KTH Royal Institute of Technology, Stockholm).
Abstract
This paper focuses on the estimation of smoothing distributions in general state space models where the transition density of the hidden Markov chain or the conditional likelihood of the observations given the latent state cannot be evaluated pointwise. The consistency and asymptotic normality of a pseudo-marginal online algorithm to estimate smoothed expectations of additive functionals, when these quantities are replaced by unbiased estimators, are established. A recursive maximum likelihood estimation procedure is also introduced by combining this online algorithm with an estimation of the gradient of the filtering distributions, also known as the tangent filters, when the model is driven by unknown parameters. The performance of this estimator is assessed in the case of a partially observed stochastic differential equation.
The data considered in this paper originate from general state space models, usually defined as bivariate stochastic processes $\{(X_k, Y_k)\}_{0 \le k \le n}$, where $\{Y_k\}_{0 \le k \le n}$ are the observations and $\{X_k\}_{0 \le k \le n}$ are the latent states, commonly assumed to be a Markov chain. When both processes take values in general spaces, the estimation of the conditional distribution of a sequence of hidden states given a fixed observation record is a challenging task, required for instance to perform maximum likelihood inference. Markov chain Monte Carlo (MCMC) and sequential Monte Carlo (SMC) methods (also known as particle filters or smoothers) are widespread solutions to propose consistent estimators of such distributions. This paper sets the focus on the special case where the conditional likelihood of an observation given the corresponding latent state (also known as the emission distribution) or the transition density of the hidden Markov chain cannot be evaluated pointwise, although they are pivotal tools of both MCMC and SMC approaches. The first objective of this paper is to prove that conditional expectations of additive functionals of the hidden states may still be estimated online with a consistent and asymptotically normal SMC algorithm. A recursive maximum likelihood estimation procedure based on this algorithm and using an approximation of the gradient of the filtering distributions, referred to as the tangent filters, is then introduced.

The use of latent data models is ubiquitous in time series analysis across a wide range of applied science and engineering domains such as signal processing [6], genomics [36, 35], target tracking [33], enhancement and segmentation of speech and audio signals [31]; see also [32, 14, 37] and the numerous references therein. Statistical inference for such models is likely to require the computation of conditional expectations of sequences of hidden states given observations. In this Bayesian setting, one of the most challenging problems is the approximation of expectations under the joint smoothing distribution, i.e. the posterior distribution of the sequence of states $(X_0, \ldots, X_n)$ given the observations $(Y_1, \ldots, Y_n)$ for some $n \ge 1$.
This computation is not tractable in the framework of this paper, where it is assumed that the transition density of the hidden process or the conditional likelihood of observations given states cannot be computed. This circumstance is common, for instance, in the case of partially observed stochastic differential equations (SDE), or in models where the emission distribution relies on a computationally prohibitive black-box routine.

Following [18, 21], this paper concentrates on SMC methods to approximate smoothing distributions with a random set of states, the particles, associated with importance weights, by combining importance sampling and resampling steps. This allows to solve the filtering problem by combining an auxiliary particle filter with an unbiased estimate of the unknown densities. Then, the online smoother of [21] extends the particle-based rapid incremental smoother (PaRIS) of [28] to approximate, processing the data stream online, smoothed expectations of additive functionals when the unknown densities are replaced by unbiased estimates. This approach is an online version of the forward filtering backward simulation algorithm [11] specifically designed to approximate smoothed additive functionals. The crucial feature which makes the PaRIS algorithm appealing is the acceptance-rejection step, which benefits from the unbiased estimation. The extension of the usual alternative, the forward filtering backward smoothing algorithm [15], is more sensitive as it involves ratios of these unknown quantities. Other smoothing algorithms, such as two-filter based approaches [2, 19, 25], could be extended similarly, but they are intrinsically not online procedures as they require the time horizon and all observations to be available to initialize a backward information filter.

In [21], the only theoretical guarantee is that the accept-reject mechanism of the PaRIS algorithm is still correct when the transition densities are replaced by unbiased estimates. In this paper, the consistency of the algorithm as well as a central limit theorem (CLT) are established (see Proposition 4.2 and Proposition 4.3 in Section 4.2). This makes this pseudo-marginal smoother the first algorithm to approximate such expectations in the general setting of this paper with theoretical guarantees and an explicit expression of the asymptotic variance. As a byproduct, the proofs of these results require to establish exponential deviation inequalities and a CLT for the PaRIS algorithm based on the auxiliary particle filter, see Section 4.1. This extends the results of [28], written only in the case of the bootstrap filter of [22]. This also extends the theoretical guarantees obtained for online sequential Monte Carlo smoothers given in [11, 9, 17, 20].

The second part of the paper is devoted to recursive maximum likelihood estimation when the emission distributions or the transition densities depend on an unknown parameter, see Section 5. Following the filter sensitivity approach of [5, Section 10.2.4], the pseudo-marginal smoother is used to estimate online the gradient of the one-step predictive likelihood of an observation given past observations. This procedure allows to perform online estimation in complex frameworks and is applied in Section 6 to partially observed SDEs.
Let $n$ be a positive integer and $\mathsf{X}$ and $\mathsf{Y}$ two general state spaces. Consider a distribution $\chi$ on $\mathcal{B}(\mathsf{X})$ and the Markov transition kernels $(Q_k)_{0\le k\le n-1}$ on $\mathsf{X}\times\mathcal{B}(\mathsf{X})$ and $(G_k)_{0\le k\le n-1}$ on $\mathsf{X}\times\mathsf{X}\times\mathcal{B}(\mathsf{Y})$. Throughout this paper, for all $0\le k\le n-1$, $G_k$ has a density $g_k$ with respect to a reference measure $\nu$ on $\mathcal{B}(\mathsf{Y})$. In the following, $\mathcal{F}(\mathsf{Z})$ denotes the set of real-valued measurable functions defined on the set $\mathsf{Z}$. Let $(Y_k)_{1\le k\le n}$ be a sequence of observations in $\mathsf{Y}$ and define the joint smoothing distributions, for any $0\le k_1\le k_2\le n$ and any function $h\in\mathcal{F}(\mathsf{X}^{k_2-k_1+1})$, by
\[
\phi_{k_1:k_2|n}[h] := L_n^{-1}(Y_{1:n}) \int \chi(\mathrm{d}x_0) \prod_{k=0}^{n-1} Q_k(x_k,\mathrm{d}x_{k+1})\, g_k(x_k,x_{k+1},Y_{k+1})\, h(x_{k_1:k_2}), \tag{1}
\]
where $a_{u:v}$ is a short-hand notation for $(a_u,\ldots,a_v)$ and
\[
L_n(Y_{1:n}) = \int \chi(\mathrm{d}x_0) \prod_{k=0}^{n-1} Q_k(x_k,\mathrm{d}x_{k+1})\, g_k(x_k,x_{k+1},Y_{k+1}) \tag{2}
\]
is the observed data likelihood. For all $0\le k\le n-1$, $Q_k$ has a density $q_k$ with respect to a reference measure $\mu$ on $\mathcal{B}(\mathsf{X})$. The initial measure $\chi$ is also assumed to have a density with respect to $\mu$, which is also referred to as $\chi$. For all $0\le k\le n$, $\phi_k = \phi_{k:k|k}$ are the filtering distributions, $\pi_{k+1} = \phi_{k+1:k+1|k}$ are the one-step predictive distributions, and $\phi_{k|n} = \phi_{k:k|n}$ are the marginal smoothing distributions.

Consider a latent Markov chain $(X_k)_{0\le k\le n}$ with initial distribution $\chi$ and Markov transition kernels $(Q_k)_{0\le k\le n-1}$. The states $(X_k)_{0\le k\le n}$ are not available, so that any statistical inference procedure is performed using the sequence of observations $(Y_k)_{1\le k\le n}$ only. The observations are assumed to be independent conditionally on $(X_k)_{0\le k\le n}$ and such that for all $1\le\ell\le n$ the distribution of $Y_\ell$ given $(X_k)_{0\le k\le n}$ is $G_{\ell-1}((X_{\ell-1},X_\ell),\cdot)$. In this case, (1) may be interpreted as
\[
\phi_{k_1:k_2|n}[h] = \mathbb{E}\left[ h(X_{k_1:k_2}) \,\middle|\, Y_{1:n} \right].
\]

Figure 1: Graphical model of the general state space hidden Markov model (states $X_{k-1}$, $X_k$, $X_{k+1}$ linked by the transition densities $q_{k-1}(X_{k-1},\cdot)$ and $q_k(X_k,\cdot)$; observations $Y_k$ and $Y_{k+1}$ with emission densities $g_{k-1}(X_{k-1},X_k,\cdot)$ and $g_k(X_k,X_{k+1},\cdot)$).

Figure 1 displays the graphical model associated with (2). Note that, when for all $0\le k\le n-1$, $g_k$ only depends on its last two arguments, (2) is the likelihood of a standard hidden Markov model. In such models, computing (1) allows to solve classical problems such as:

i) path reconstruction, i.e. the reconstruction of the hidden states given the observations;

ii) parameter inference, i.e., when $q_k$ and $g_k$ depend on some unknown parameter $\theta$, the design of a consistent estimator of $\theta$ from the observations.

As (1) is, in general, not available explicitly, this paper focuses on a sequential Monte Carlo based approximation specifically designed for cases where $q_k$ and/or $g_k$ cannot be evaluated pointwise. Partially observed diffusion processes (POD) [27], where the latent process is the solution to a stochastic differential equation, are widespread examples where $q_k$ is not tractable.

Recursive formulation of (1) for additive functionals.
For all $0\le k\le n-1$, define
\[
r_k(x_k, x_{k+1}) = q_k(x_k,x_{k+1})\, g_k(x_k,x_{k+1},Y_{k+1}). \tag{3}
\]
For all $0\le k\le n-1$, define also the kernel $L_k$ on $\mathsf{X}\times\mathcal{B}(\mathsf{X})$, for all $x\in\mathsf{X}$ and all $f\in\mathcal{F}(\mathsf{X})$, by
\[
L_k f(x) = \int r_k(x,y)\, f(y)\,\mathrm{d}y.
\]
In the following, $\mathbf{1}$ denotes the constant function which equals 1 for all $x\in\mathsf{X}$, so that
\[
L_k \mathbf{1}(x) = \int r_k(x,y)\,\mathrm{d}y.
\]
Following for instance [4], the joint smoothing distribution $\phi_{0:n|n}$ may be decomposed using the backward Markov kernels defined, for all $0\le k\le n-1$, all $x_{k+1}\in\mathsf{X}$ and all $f\in\mathcal{F}(\mathsf{X})$, by
\[
\overleftarrow{Q}_{\phi_k} f(x_{k+1}) := \frac{\int f(x_k)\, r_k(x_k,x_{k+1})\, \phi_k(\mathrm{d}x_k)}{\int r_k(x'_k,x_{k+1})\, \phi_k(\mathrm{d}x'_k)}. \tag{4}
\]
Consequently, the joint smoothing distribution $\phi_{0:n|n}$ may be expressed, for all $h\in\mathcal{F}(\mathsf{X}^{n+1})$, as
\[
\phi_{0:n|n}[h] = \phi_n[T_n h], \tag{5}
\]
where
\[
T_n := \begin{cases} \overleftarrow{Q}_{\phi_{n-1}} \otimes \overleftarrow{Q}_{\phi_{n-2}} \otimes \cdots \otimes \overleftarrow{Q}_{\phi_0} & \text{for } n > 0, \\ \mathrm{id} & \text{for } n = 0, \end{cases} \tag{6}
\]
and where, for all Markov kernels $K_1$, $K_2$ on $\mathsf{X}\times\mathcal{B}(\mathsf{X})$, all $f\in\mathcal{F}(\mathsf{X}^2)$ and all $x\in\mathsf{X}$,
\[
(K_1\otimes K_2)\, f(x) = \int f(y,z)\, K_1(x,\mathrm{d}y)\, K_2(y,\mathrm{d}z).
\]
In this paper, the focus is set on additive functionals of the form
\[
h_n(x_{0:n}) = \sum_{k=0}^{n-1} \tilde{h}_k(x_k, x_{k+1}), \tag{7}
\]
with, for all $0\le k\le n-1$, $\tilde{h}_k : \mathsf{X}\times\mathsf{X} \to \mathbb{R}^p$ for some $p\ge 1$. The additive form of the function $h_n$ defined in (7) allows to update the backward statistics $(T_k h_k)_{k\ge 0}$ recursively, see [3, 9]. For all $k\ge 0$,
\[
T_{k+1} h_{k+1}(x_{k+1}) = \int \left\{ T_k h_k(x_k) + \tilde{h}_k(x_{k:k+1}) \right\} \overleftarrow{Q}_{\phi_k}(x_{k+1}, \mathrm{d}x_k). \tag{8}
\]
By (5) and (8), the smoothed additive functional (5) can be updated recursively each time a new observation is available. However, its exact computation is not possible in general state spaces. In this paper, we propose to approximate $\phi_{0:n|n}[h_n]$ using SMC methods: $\phi_n$ in (5) and $\overleftarrow{Q}_{\phi_k}$ in (8) are replaced by a set of random samples associated with nonnegative importance weights. These particle filter and smoother approximations combine sequential importance sampling steps to update $\phi_n$ recursively and importance resampling steps to duplicate or discard particles according to their importance weights.
Sequential Monte Carlo for additive functionals. Let $(\xi_0^\ell)_{\ell=1}^N$ be independent and identically distributed according to the instrumental proposal density $\rho_0$ on $\mathsf{X}$ and define the importance weights $\omega_0^\ell := \chi(\xi_0^\ell)/\rho_0(\xi_0^\ell)$. For any $f\in\mathcal{F}(\mathsf{X})$,
\[
\phi_0^N[f] := \Omega_0^{-1} \sum_{\ell=1}^N \omega_0^\ell\, f(\xi_0^\ell), \quad \text{where } \Omega_0 := \sum_{\ell=1}^N \omega_0^\ell,
\]
is a consistent estimator of $\phi_0[f]$, see for instance [8]. Then, for all $k\ge 1$, once the observation $Y_k$ is available, the weighted particle sample $\{(\omega_{k-1}^\ell, \xi_{k-1}^\ell)\}_{\ell=1}^N$ is transformed into a new weighted particle sample approximating $\phi_k$. This update step is carried through in two steps, selection and mutation, using the auxiliary sampler introduced in [29]. New indices and particles $\{(I_k^\ell, \xi_k^\ell)\}_{\ell=1}^N$ are simulated independently from the instrumental distribution with density on $\{1,\ldots,N\}\times\mathsf{X}$:
\[
\upsilon_k(\ell, x) \propto \omega_{k-1}^\ell\, \vartheta_{k-1}(\xi_{k-1}^\ell)\, p_{k-1}(\xi_{k-1}^\ell, x), \tag{9}
\]
where $\vartheta_{k-1}$ is an adjustment multiplier weight function and $p_{k-1}$ a Markovian transition density. For any $\ell\in\{1,\ldots,N\}$, $\xi_k^\ell$ is associated with the importance weight defined by
\[
\omega_k^\ell := \frac{r_{k-1}(\xi_{k-1}^{I_k^\ell}, \xi_k^\ell)}{\vartheta_{k-1}(\xi_{k-1}^{I_k^\ell})\, p_{k-1}(\xi_{k-1}^{I_k^\ell}, \xi_k^\ell)} \tag{10}
\]
to produce the following approximation of $\phi_k[f]$:
\[
\phi_k^N[f] := \Omega_k^{-1} \sum_{\ell=1}^N \omega_k^\ell\, f(\xi_k^\ell), \quad \text{where } \Omega_k := \sum_{\ell=1}^N \omega_k^\ell.
\]
For all $k\ge 0$ and $(x,f)\in\mathsf{X}\times\mathcal{F}(\mathsf{X})$, replacing $\phi_k$ by $\phi_k^N$ in (4), $\overleftarrow{Q}_{\phi_k} f(x)$ is approximated by
\[
\overleftarrow{Q}_{\phi_k^N} f(x) = \sum_{i=1}^N \frac{\omega_k^i\, r_k(\xi_k^i, x)}{\sum_{\ell=1}^N \omega_k^\ell\, r_k(\xi_k^\ell, x)}\, f(\xi_k^i). \tag{11}
\]
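To fix ideas, a minimal NumPy sketch of the selection/mutation step (9)-(10) is given below. The callables theta_fn (adjustment weights $\vartheta_k$), propose and p_dens (sampler and density of $p_k$), and r_fn (the density $r_k$) are hypothetical placeholders for model-specific components, assumed vectorized over particle arrays.

```python
import numpy as np

def apf_propagate(xi_prev, w_prev, theta_fn, propose, p_dens, r_fn, rng):
    """One auxiliary particle filter step, cf. (9)-(10): draw ancestor indices
    with probabilities proportional to w * theta(xi), move the selected
    particles through the proposal, and reweight by r / (theta * p)."""
    N = len(xi_prev)
    sel = w_prev * theta_fn(xi_prev)
    I = rng.choice(N, size=N, p=sel / sel.sum())   # selection, cf. (9)
    xi_new = propose(xi_prev[I], rng)              # mutation
    w_new = r_fn(xi_prev[I], xi_new) / (theta_fn(xi_prev[I]) * p_dens(xi_prev[I], xi_new))
    return I, xi_new, w_new                        # importance weights, cf. (10)
```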
The forward filtering backward smoothing (FFBS) algorithm proposed in [9] consists in replacing, in (8), $\overleftarrow{Q}_{\phi_k}$ by the approximation $\overleftarrow{Q}_{\phi_k^N}$. Proceeding recursively, this produces a sequence of estimates $(\tilde\tau_k^i)_{i=1}^N$ of $(T_k h_k(\xi_k^i))_{i=1}^N$ for $0\le k\le n$. Starting with $\tilde\tau_0^i = 0$ for all $1\le i\le N$, this yields, for all $0\le k\le n-1$,
\[
\tilde\tau_{k+1}^i = \sum_{j=1}^N \frac{\omega_k^j\, r_k(\xi_k^j, \xi_{k+1}^i)}{\sum_{\ell=1}^N \omega_k^\ell\, r_k(\xi_k^\ell, \xi_{k+1}^i)} \left( \tilde\tau_k^j + \tilde{h}_k(\xi_k^j, \xi_{k+1}^i) \right). \tag{12}
\]
Then, at each iteration $0\le k\le n-1$, $\phi_{0:k+1|k}[h_{k+1}]$ and $\phi_{0:k+1|k+1}[h_{k+1}]$ are approximated by
\[
\phi_{0:k+1|k}^{N,\mathrm{FFBS}}[h_{k+1}] := \frac{1}{N}\sum_{i=1}^N \tilde\tau_{k+1}^i \quad\text{and}\quad \phi_{0:k+1|k+1}^{N,\mathrm{FFBS}}[h_{k+1}] := \sum_{i=1}^N \frac{\omega_{k+1}^i}{\Omega_{k+1}}\, \tilde\tau_{k+1}^i.
\]
The computational complexity of the update (12) grows quadratically with the number of particles $N$. This computational cost can be reduced following [28] by first replacing (12) by the Monte Carlo estimate
\[
\tau_{k+1}^i = \frac{1}{\tilde N} \sum_{j=1}^{\tilde N} \left( \tau_k^{J_{k+1}^{(i,j)}} + \tilde{h}_k(\xi_k^{J_{k+1}^{(i,j)}}, \xi_{k+1}^i) \right), \tag{13}
\]
where $\tilde N \ge 1$ is a fixed sample size (in practice much smaller than $N$) and $(J_{k+1}^{(i,j)})_{j=1}^{\tilde N}$ are i.i.d. samples in $\{1,\ldots,N\}$ with probabilities proportional to $(\omega_k^\ell\, r_k(\xi_k^\ell, \xi_{k+1}^i))_{\ell=1}^N$. In the resulting particle-based rapid incremental smoother (PaRIS), the updated $(\tau_{k+1}^i)_{i=1}^N$ provide estimates of $\phi_{0:k+1|k}[h_{k+1}] = \pi_{k+1}[T_{k+1} h_{k+1}]$ and $\phi_{0:k+1|k+1}[h_{k+1}]$:
\[
\phi_{0:k+1|k}^{N,\mathrm{PaRIS}}[h_{k+1}] := \frac{1}{N}\sum_{i=1}^N \tau_{k+1}^i \quad\text{and}\quad \phi_{0:k+1|k+1}^{N,\mathrm{PaRIS}}[h_{k+1}] := \sum_{i=1}^N \frac{\omega_{k+1}^i}{\Omega_{k+1}}\, \tau_{k+1}^i.
\]
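As an illustration, a sketch of the PaRIS update (13) under the same assumptions (vectorized, hypothetical callables r_k and $\tilde h_k$) could read as follows; it uses plain multinomial sampling of the backward indices, so each update still costs $O(N)$ per particle.

```python
import numpy as np

def paris_update(xi_prev, w_prev, tau_prev, xi_new, r_k, h_tilde, N_tilde, rng):
    """PaRIS update (13): for each new particle, draw N_tilde backward indices
    J with probabilities proportional to w_prev[j] * r_k(xi_prev[j], xi_new[i])
    and average the propagated backward statistics."""
    N = len(xi_new)
    tau_new = np.zeros(N)
    for i in range(N):
        bw = w_prev * r_k(xi_prev, xi_new[i])      # unnormalised backward weights
        J = rng.choice(N, size=N_tilde, p=bw / bw.sum())
        tau_new[i] = np.mean(tau_prev[J] + h_tilde(xi_prev[J], xi_new[i]))
    return tau_new
```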
Acceptance-rejection procedure. The computational complexity of the described approach is still of order $N^2$, since computing the normalising constant $\sum_{\ell=1}^N \omega_k^\ell\, r_k(\xi_k^\ell, \xi_{k+1}^i)$ to sample $(J_{k+1}^{(i,j)})_{j=1}^{\tilde N}$ is required for each particle $\xi_{k+1}^i$, $1\le i\le N$. A faster algorithm is obtained by applying the accept-reject sampling approach proposed in [11] and illustrated in [16], which presupposes that there exists a constant $M > 0$ such that $r_k(x,x')\le M$ for all $(x,x')\in\mathsf{X}\times\mathsf{X}$. Then, in order to sample from $(\omega_k^\ell\, r_k(\xi_k^\ell, \xi_{k+1}^i))_{\ell=1}^N$, a candidate $J^* \sim (\omega_k^i)_{i=1}^N$ is accepted with probability
\[
\Upsilon_k^M(J^*, i) := r_k(\xi_k^{J^*}, \xi_{k+1}^i)/M. \tag{14}
\]
This procedure is repeated until acceptance. Under strong mixing assumptions it can be shown, see for instance [11, Proposition 2] and [28, Theorem 10], that the expected number of trials needed for this approach to update $(\tau_k^i)_{i=1}^N$ to $(\tau_{k+1}^i)_{i=1}^N$ is $O(\tilde N N)$.
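A sketch of this rejection sampler, under the assumption that a bound $M$ with $r_k \le M$ is available, is given below; r_k is again a hypothetical model callable.

```python
import numpy as np

def sample_backward_index_ar(w, xi_prev, x_new, r_k, M, rng):
    """Accept-reject draw of one backward index, cf. (14): propose J* from the
    filter weights and accept with probability r_k(xi_prev[J*], x_new) / M,
    avoiding the O(N) normalising constant of the backward weights."""
    p = w / w.sum()
    while True:
        j = rng.choice(len(w), p=p)
        if rng.uniform() < r_k(xi_prev[j], x_new) / M:
            return j
```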
In many applications, sequential Monte Carlo methods cannot be used, as the transition densities $q_k$ or $g_k$, $0\le k\le n-1$, are unknown. The following crucial steps, which rely on $r_k$, are not tractable:

(a) computation of the importance weights $\omega_k^\ell$ in (10);

(b) computation of the acceptance ratio (14).

To overcome these issues, following [21], consider the following algorithm.

Initialization.
At time $k = 0$, set for all $1\le\ell\le N$, $\hat\omega_0^\ell = \omega_0^\ell$, $\hat I_0^\ell = 0$ and $\hat\tau_0^\ell = \tau_0^\ell = 0$.

Propagation.
Starting with weighted samples $\{(\xi_k^\ell, \hat\omega_k^\ell)\}_{\ell=1}^N$, define
\[
\tilde{\mathcal{F}}_k^N = \sigma\big\{ (\xi_u^\ell, \hat\omega_u^\ell, \hat\tau_u^\ell);\ 1\le\ell\le N,\ 0\le u\le k \big\} \quad\text{and}\quad \tilde{\mathcal{G}}_k^N = \sigma\big\{ (\hat I_k^\ell, \xi_k^\ell);\ 1\le\ell\le N \big\}.
\]
New indices and particles $\{(\hat I_{k+1}^\ell, \xi_{k+1}^\ell)\}_{\ell=1}^N$ are simulated independently from the instrumental distribution with density on $\{1,\ldots,N\}\times\mathsf{X}$:
\[
\upsilon_{k+1}(\ell, x) \propto \hat\omega_k^\ell\, \vartheta_k(\xi_k^\ell)\, p_k(\xi_k^\ell, x). \tag{15}
\]
Following [18, 27], the weight update can be approximated by replacing $r_k(\xi_k^\ell, \xi_{k+1}^i)$ by an unbiased estimator.

H1 There exist a Markov kernel $R_k$ on $(\mathsf{X}\times\mathsf{X}, \mathcal{B}(\mathsf{Z}))$, where $(\mathsf{Z}, \mathcal{B}(\mathsf{Z}))$ is a general state space, and a positive mapping $\hat r_k$ on $\mathsf{X}\times\mathsf{X}\times\mathsf{Z}$ such that, for all $(x,x')\in\mathsf{X}^2$,
\[
\int R_k(x, x'; \mathrm{d}z)\, \hat r_k(x, x'; z) = r_k(x, x').
\]

Then, under H1, if conditionally on $\tilde{\mathcal{F}}_k^N \vee \tilde{\mathcal{G}}_{k+1}^N$, $\zeta_k^\ell$ has distribution $R_k(\xi_k^{\hat I_{k+1}^\ell}, \xi_{k+1}^\ell; \cdot)$, then
\[
\mathbb{E}\left[ \hat r_k(\xi_k^{\hat I_{k+1}^\ell}, \xi_{k+1}^\ell; \zeta_k^\ell) \,\middle|\, \tilde{\mathcal{F}}_k^N \vee \tilde{\mathcal{G}}_{k+1}^N \right] = r_k(\xi_k^{\hat I_{k+1}^\ell}, \xi_{k+1}^\ell).
\]
The filtering weights then become
\[
\hat\omega_{k+1}^\ell := \frac{\hat r_k(\xi_k^{\hat I_{k+1}^\ell}, \xi_{k+1}^\ell; \zeta_k^\ell)}{\vartheta_k(\xi_k^{\hat I_{k+1}^\ell})\, p_k(\xi_k^{\hat I_{k+1}^\ell}, \xi_{k+1}^\ell)}. \tag{16}
\]
For all $f\in\mathcal{F}(\mathsf{X})$ and all $0\le k\le n$, $\phi_k[f]$ is approximated by
\[
\hat\phi_k^N[f] := \sum_{i=1}^N \frac{\hat\omega_k^i}{\hat\Omega_k}\, f(\xi_k^i), \quad \hat\Omega_k = \sum_{i=1}^N \hat\omega_k^i.
\]
To solve issue (b), [21] ensured that, under several assumptions, the acceptance-rejection mechanism introduced to implement the PaRIS algorithm is still valid for stochastic differential equations. Consider the following assumption.

H2 For all $0\le k\le n$, there exists a random variable $M_k$, measurable with respect to $\tilde{\mathcal{G}}_{k+1}^N$, such that $\sup_{x,y,\zeta}\, \hat r_k(x,y;\zeta) \le M_k$.

If this assumption holds, the accept-reject mechanism of the PaRIS algorithm is replaced by the following steps. For all $1\le i\le N$ and all $1\le j\le \tilde N$, a candidate $J^*$ is sampled in $\{1,\ldots,N\}$ with probabilities proportional to $(\hat\omega_k^i)_{i=1}^N$ and is accepted with probability $\hat r_k(\xi_k^{J^*}, \xi_{k+1}^i; \zeta)/M_k$, where $\zeta$ has distribution $R_k(\xi_k^{J^*}, \xi_{k+1}^i; \cdot)$. Then, set $\hat J_{k+1}^{(i,j)} = J^*$ and
\[
\hat\tau_{k+1}^i = \frac{1}{\tilde N} \sum_{j=1}^{\tilde N} \left( \hat\tau_k^{\hat J_{k+1}^{(i,j)}} + \tilde{h}_k\big( \xi_k^{\hat J_{k+1}^{(i,j)}}, \xi_{k+1}^i \big) \right). \tag{17}
\]
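Under H1-H2, a sketch of the resulting pseudo-marginal accept-reject step could be as follows, where sample_zeta draws $\zeta \sim R_k(x, x'; \cdot)$ and r_hat evaluates $\hat r_k$; both are hypothetical stand-ins for model-specific routines (such as the GPE used in Section 6). Lemma 3.1 below shows that the returned index has exactly the desired conditional distribution.

```python
import numpy as np

def pm_sample_backward_index(w_hat, xi_prev, x_new, r_hat, sample_zeta, M_k, rng):
    """Pseudo-marginal accept-reject backward sampling: the intractable
    r_k(x, x') is replaced in the acceptance ratio by the unbiased estimate
    r_hat(x, x'; zeta), bounded by M_k (assumption H2)."""
    p = w_hat / w_hat.sum()
    while True:
        j = rng.choice(len(w_hat), p=p)
        zeta = sample_zeta(xi_prev[j], x_new)        # zeta ~ R_k(xi_prev[j], x_new; .)
        if rng.uniform() < r_hat(xi_prev[j], x_new, zeta) / M_k:
            return j
```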
Lemma 3.1. Assume that H1 and H2 hold. Then, for all $0\le k\le n-1$ and all $1\le i\le N$, $(\hat J_{k+1}^{(i,j)})_{1\le j\le \tilde N}$ are i.i.d. and independent of $\hat\omega_{k+1}^i$ given $\tilde{\mathcal{F}}_k^N \vee \tilde{\mathcal{G}}_{k+1}^N$, and such that for all $1\le\ell\le N$,
\[
\mathbb{P}\left( \hat J_{k+1}^{(i,j)} = \ell \,\middle|\, \tilde{\mathcal{F}}_k^N \vee \tilde{\mathcal{G}}_{k+1}^N \right) = \frac{\hat\omega_k^\ell\, r_k(\xi_k^\ell, \xi_{k+1}^i)}{\sum_{m=1}^N \hat\omega_k^m\, r_k(\xi_k^m, \xi_{k+1}^i)},
\]
where $\hat\omega_k^\ell$ is defined by (16).

Proof. The proof follows the same lines as [21, Lemma 1].

The proposed algorithm therefore leads to an estimator of the expectation (1) in the general setting of this paper. The following section provides consistency and asymptotic normality results for this estimator.
In [26], the authors established the consistency and asymptotic normality of the PaRIS algorithm for the bootstrap filter, i.e. in the simple case where, for all $0\le k\le n-1$, $\vartheta_k$ is the constant function which equals 1 and $p_k = q_k$. This section extends these convergence results to the general auxiliary particle filter based PaRIS algorithm, as such filters are required for the pseudo-marginal smoother. Consider the following assumptions.

H3 For all $0\le k\le n-1$, $g_k$ is a positive function such that $\|g_k\|_\infty < \infty$. For all $0\le k\le n-1$, $\|q_k\|_\infty < \infty$, $\|\vartheta_k\|_\infty < \infty$ and $\|\bar\omega_{k+1}\|_\infty < \infty$, where for all $(x,y)\in\mathsf{X}\times\mathsf{X}$,
\[
\bar\omega_{k+1}(x,y) := \frac{r_k(x,y)}{\vartheta_k(x)\, p_k(x,y)}. \tag{18}
\]

Lemma 4.1.
Assume that H3 holds. Then, for all $0\le k\le n$, $(f_k,\tilde f_k)\in\mathcal{F}(\mathsf{X})^2$ and $\tilde N\ge 1$, there exist $(c_k,\tilde c_k)\in(\mathbb{R}_+^*)^2$ such that for all $N\in\mathbb{N}^*$ and all $\varepsilon\in\mathbb{R}_+^*$,
\[
\mathbb{P}\left( \left| \sum_{i=1}^N \frac{\omega_k^i}{\Omega_k}\left\{ \tau_k^i f_k(\xi_k^i) + \tilde f_k(\xi_k^i) \right\} - \phi_k[T_k h_k f_k + \tilde f_k] \right| > \varepsilon \right) \le c_k\, \mathrm{e}^{-\tilde c_k N\varepsilon^2}.
\]

Proof.
The proof follows the same lines as the proof of [26, Theorem 1].
Lemma 4.2.
Assume that H3 holds. Then, for all $0\le k\le n$, $f_k\in\mathcal{F}(\mathsf{X})$ and $\tilde N\ge 1$,
\[
\sum_{i=1}^N \frac{\omega_k^i}{\Omega_k}\, (\tau_k^i)^2 f_k(\xi_k^i) \xrightarrow[N\to\infty]{\mathbb{P}} \eta_k[f_k] + \phi_k[(T_k h_k)^2 f_k],
\]
where $\eta_0[f_0] = 0$ and, for all $0\le k\le n-1$,
\[
\eta_{k+1}[f_{k+1}] = \frac{\eta_k[L_k f_{k+1}] + \phi_k\big[ L_k\big\{ \overleftarrow{Q}_{\phi_k}\big( \{T_k h_k + \tilde h_k - T_{k+1} h_{k+1}\}^2 \big)\, f_{k+1} \big\} \big]}{\tilde N\, \phi_k[L_k \mathbf{1}]}.
\]

Proof.
The proof is postponed to Section B.1.

Following [26, Lemma 13], for all $0\le k\le n$ and $f_k\in\mathcal{F}(\mathsf{X})$, the recursion given in Lemma 4.2 may also be expressed as
\[
\eta_k[f_k] = \sum_{\ell=0}^{k-1} \frac{\phi_\ell\big[ L_\ell\big\{ \overleftarrow{Q}_{\phi_\ell}\big( \{T_\ell h_\ell + \tilde h_\ell - T_{\ell+1} h_{\ell+1}\}^2 \big)\, L_{\ell+1}\cdots L_{k-1} f_k \big\} \big]}{\tilde N^{\,k-\ell}\, \phi_\ell[L_\ell \cdots L_{k-1}\mathbf{1}]}. \tag{19}
\]
Establishing a central limit theorem for PaRIS algorithms requires to introduce the retro-prospective kernels, defined, for all $0\le k\le m\le n$, $x_k\in\mathsf{X}$ and $h\in\mathcal{F}(\mathsf{X}^{m+1})$, by
\[
D_{k,m} h(x_k) := \int h(x_{0:m})\, T_k(x_k, \mathrm{d}x_{0:k-1})\, L_k \cdots L_{m-1}(x_k, \mathrm{d}x_{k+1:m}),
\]
\[
\tilde D_{k,m} h(x_k) := D_{k,m}\big\{ h - \phi_{0:m|m}[h] \big\}(x_k).
\]

Proposition 4.1.
Assume that H3 holds. Then, for all $0\le k\le n$ and $(f_k,\tilde f_k)\in\mathcal{F}(\mathsf{X})^2$,
\[
\sqrt{N}\left( \sum_{i=1}^N \frac{\omega_k^i}{\Omega_k}\left\{ \tau_k^i f_k(\xi_k^i) + \tilde f_k(\xi_k^i) \right\} - \phi_k[T_k h_k f_k + \tilde f_k] \right) \xrightarrow[N\to\infty]{\mathcal{D}} \sigma_k[f_k;\tilde f_k]\, Z,
\]
where $Z$ is a standard Gaussian random variable and, for all $0\le k\le n-1$,
\[
\sigma_k^2[f_k;\tilde f_k] = \sum_{s=0}^{k-1} \frac{\phi_s[\vartheta_s]\, \phi_s\big[ L_s\big\{ \bar\omega_s\, \big( \tilde D_{s+1,k}(h_k f_k + \tilde f_k) \big)^2 \big\} \big]}{\phi_s[L_s \cdots L_{k-1}\mathbf{1}]^2} + \sum_{s=0}^{k-1}\sum_{\ell=0}^{s} \frac{\phi_s[\vartheta_s]\, \phi_\ell\big[ L_\ell\big\{ \overleftarrow{Q}_{\phi_\ell}\big( \{T_\ell h_\ell + \tilde h_\ell - T_{\ell+1} h_{\ell+1}\}^2 \big)\, L_{\ell+1}\cdots L_s\big( \overleftarrow{Q}_{\phi_s}\bar\omega_s\, \{L_{s+1}\cdots L_{k-1} f_k\}^2 \big) \big\} \big]}{\tilde N^{\,s+1-\ell}\, \phi_\ell[L_\ell \cdots L_{s-1}\mathbf{1}]\, \phi_s[L_s \cdots L_{k-1}\mathbf{1}]^2}.
\]

Proof.
The proof is postponed to Section B.2.
Corollary 4.1.
Assume that H3 holds. Then, for all $0\le k\le n$,
\[
\sqrt{N}\left( \sum_{i=1}^N \frac{\omega_k^i}{\Omega_k}\, \tau_k^i - \phi_k[T_k h_k] \right) \xrightarrow[N\to\infty]{\mathcal{D}} \sigma_k(h_k)\, Z,
\]
where $Z$ is a standard Gaussian random variable and
\[
\sigma_k^2(h_k) = \sum_{s=0}^{k-1} \frac{\phi_s[\vartheta_s]\, \phi_s\big[ L_s\big\{ \bar\omega_s\, (\tilde D_{s+1,k} h_k)^2 \big\} \big]}{\phi_s[L_s \cdots L_{k-1}\mathbf{1}]^2} + \sum_{s=0}^{k-1}\sum_{\ell=0}^{s} \frac{\phi_s[\vartheta_s]\, \phi_\ell\big[ L_\ell\big\{ \overleftarrow{Q}_{\phi_\ell}\big( \{T_\ell h_\ell + \tilde h_\ell - T_{\ell+1} h_{\ell+1}\}^2 \big)\, L_{\ell+1}\cdots L_s\big( \overleftarrow{Q}_{\phi_s}\bar\omega_s\, \{L_{s+1}\cdots L_{k-1}\mathbf{1}\}^2 \big) \big\} \big]}{\tilde N^{\,s+1-\ell}\, \phi_\ell[L_\ell \cdots L_{s-1}\mathbf{1}]\, \phi_s[L_s \cdots L_{k-1}\mathbf{1}]^2}.
\]

Consider the following assumption.

H4 For all $0\le k\le n-1$, $\|\hat\omega_{k+1}\|_\infty < \infty$, where for all $(x,y,z)\in\mathsf{X}\times\mathsf{X}\times\mathsf{Z}$,
\[
\hat\omega_{k+1}(x,y;z) := \frac{\hat r_k(x,y;z)}{\vartheta_k(x)\, p_k(x,y)}. \tag{20}
\]

Proposition 4.2.
Assume that H1, H2 and H4 hold. Then, for all $0\le k\le n$, $(f_k,\tilde f_k)\in\mathcal{F}(\mathsf{X})^2$ and $\tilde N\ge 1$, there exist $(c_k,\tilde c_k)\in(\mathbb{R}_+^*)^2$ such that for all $N\in\mathbb{N}^*$ and all $\varepsilon\in\mathbb{R}_+^*$,
\[
\mathbb{P}\left( \left| \sum_{i=1}^N \frac{\hat\omega_k^i}{\hat\Omega_k}\left\{ \hat\tau_k^i f_k(\xi_k^i) + \tilde f_k(\xi_k^i) \right\} - \phi_k[T_k h_k f_k + \tilde f_k] \right| > \varepsilon \right) \le c_k\, \mathrm{e}^{-\tilde c_k N\varepsilon^2}.
\]

Proof.
The proof follows the same lines as the proof of [26, Theorem 1].
Lemma 4.3.
Assume that H1, H2 and H4 hold. Then, for all $0\le k\le n$, $f_k\in\mathcal{F}(\mathsf{X})$ and $\tilde N\ge 1$,
\[
\sum_{i=1}^N \frac{\hat\omega_k^i}{\hat\Omega_k}\, (\hat\tau_k^i)^2 f_k(\xi_k^i) \xrightarrow[N\to\infty]{\mathbb{P}} \eta_k[f_k] + \phi_k[(T_k h_k)^2 f_k],
\]
where for all $0\le k\le n$, $\eta_k[f_k]$ is defined in (19).

Proof. The proof is postponed to Section C.1.
Proposition 4.3.
Assume that H1, H2 and H4 hold. Then, for all $0\le k\le n$ and $(f_k,\tilde f_k)\in\mathcal{F}(\mathsf{X})^2$,
\[
\sqrt{N}\left( \sum_{i=1}^N \frac{\hat\omega_k^i}{\hat\Omega_k}\left\{ \hat\tau_k^i f_k(\xi_k^i) + \tilde f_k(\xi_k^i) \right\} - \phi_k[T_k h_k f_k + \tilde f_k] \right) \xrightarrow[N\to\infty]{\mathcal{D}} \bar\sigma_k[f_k;\tilde f_k]\, Z,
\]
where $Z$ is a standard Gaussian random variable and, for all $0\le k\le n-1$, $\bar\sigma_{k+1}[f_{k+1};\tilde f_{k+1}]$ can be computed using an explicit recursive formula given in Appendix C.2.

Proof.
The proof is postponed to Section C.2.
Corollary 4.2.
Assume that H1, H2 and H4 hold. Then, for all $0\le k\le n$,
\[
\sqrt{N}\left( \sum_{i=1}^N \frac{\hat\omega_k^i}{\hat\Omega_k}\, \hat\tau_k^i - \phi_k[T_k h_k] \right) \xrightarrow[N\to\infty]{\mathcal{D}} \bar\sigma_k(h_k)\, Z,
\]
where $Z$ is a standard Gaussian random variable and $\bar\sigma_k(h_k)$ can be computed using an explicit recursive formula given in Appendix C.2.

Let $\Theta$ be a parameter space. This section considers a family of transition kernels $(Q_{k;\theta})_{\theta\in\Theta;\,0\le k\le n-1}$ on $\mathsf{X}\times\mathcal{B}(\mathsf{X})$ and $(G_{k;\theta})_{\theta\in\Theta;\,1\le k\le n}$ on $\mathsf{X}\times\mathcal{B}(\mathsf{Y})$, associated with densities $q_{k;\theta}$ and $g_{k;\theta}$ with respect to $\mu$ and $\nu$. The joint smoothing distributions are then defined, for any $\theta\in\Theta$, $0\le k_1\le k_2\le n$ and any function $h\in\mathcal{F}(\mathsf{X}^{k_2-k_1+1})$, by
\[
\phi_{k_1:k_2;\theta|n}[h] := L_{n;\theta}^{-1}(Y_{1:n}) \int \chi(\mathrm{d}x_0) \prod_{k=0}^{n-1} Q_{k;\theta}(x_k, \mathrm{d}x_{k+1})\, g_{k+1;\theta}(x_{k+1}, Y_{k+1})\, h(x_{k_1:k_2}),
\]
where
\[
L_{n;\theta}(Y_{1:n}) = \int \chi(\mathrm{d}x_0) \prod_{k=0}^{n-1} Q_{k;\theta}(x_k, \mathrm{d}x_{k+1})\, g_{k+1;\theta}(x_{k+1}, Y_{k+1}).
\]
As noted for instance in [10, Section 2] and [26], for all $\theta\in\Theta$ and all $f_n\in\mathcal{F}(\mathsf{X}^{n+1})$,
\[
\nabla_\theta\, \phi_{0:n;\theta|n-1}[f_n] = \phi_{0:n;\theta|n-1}[h_n f_n] - \phi_{0:n;\theta|n-1}[f_n] \times \phi_{0:n;\theta|n-1}[h_n],
\]
where
\[
h_n(x_{0:n}) = \sum_{k=0}^{n-1} \tilde h_{k;\theta}(x_k, x_{k+1}),
\]
with, for all $0\le k\le n-1$,
\[
\tilde h_{k;\theta}(x_k, x_{k+1}) = \nabla_\theta \log g_{k+1;\theta}(x_{k+1}, Y_{k+1}) + \nabla_\theta \log q_{k;\theta}(x_k, x_{k+1}).
\]
Considering an objective function $f_n\in\mathcal{F}(\mathsf{X})$ which depends on the last state $x_n$ only, the tangent filter $\eta_{n;\theta}$ is defined as the following signed measure:
\[
\eta_{n;\theta}[f_n] := \nabla_\theta\, \pi_{n;\theta}[f_n] = \phi_{0:n;\theta|n-1}[h_{n;\theta} f_n] - \pi_{n;\theta}[f_n] \times \phi_{0:n;\theta|n-1}[h_{n;\theta}],
\]
where $\pi_n = \phi_{n:n|n-1}$ is the predictive measure. The particle-based estimator of $\pi_n[f]$ is given by
\[
\pi_n^N[f] = \frac{1}{N} \sum_{\ell=1}^N f(\xi_n^\ell).
\]
Using the tower property, (4) and the backward decomposition (6),
\[
\eta_{n;\theta}[f_n] = \pi_{n;\theta}\big[ (T_n h_n - \pi_{n;\theta}[T_n h_n])\, f_n \big]. \tag{21}
\]
Therefore, the tangent filter (21) can be approximated on-the-fly using the statistics $(\tilde\tau_n^i)_{i=1}^N$ and the weighted particles $\{(\xi_n^i, \omega_n^i)\}_{i=1}^N$:
\[
\eta_{n;\theta}^{N,\mathrm{FFBS}}[f_n] = \frac{1}{N}\sum_{i=1}^N \tilde\tau_n^i\, f_n(\xi_n^i) - \left( \frac{1}{N}\sum_{i=1}^N \tilde\tau_n^i \right)\left( \frac{1}{N}\sum_{i=1}^N f_n(\xi_n^i) \right). \tag{22}
\]
In cases where $r_k$, $0\le k\le n-1$, is unknown and replaced by an unbiased estimate, the associated pseudo-marginal particle-based approximation of the tangent filter is given by
\[
\hat\eta_{n;\theta}^N[f_n] = \frac{1}{N}\sum_{i=1}^N \hat\tau_n^i\, f_n(\xi_n^i) - \left( \frac{1}{N}\sum_{i=1}^N \hat\tau_n^i \right)\left( \frac{1}{N}\sum_{i=1}^N f_n(\xi_n^i) \right). \tag{23}
\]
Given a set of observations $Y_{1:n}$, maximum likelihood estimation amounts to obtaining a parameter $\hat\theta_n\in\Theta$ such that $\hat\theta_n = \arg\max_{\theta\in\Theta} \ell_{\theta;n}(Y_{1:n})$, where $\ell_{\theta;n}(Y_{1:n}) = \log L_{n;\theta}(Y_{1:n})$ is the logarithm of the likelihood given in (2). There are many different approaches to compute an estimator of $\hat\theta_n$, see for instance [4, Chapter 10]. Following [12], under strong mixing assumptions, for all $\theta\in\Theta$, the extended process $\{(X_n, Y_n, \pi_n, \eta_n)\}_{n\ge 0}$ is an ergodic Markov chain and the normalized score $\nabla_\theta \ell_\theta(Y_{1:n})/n$ of the observations may be shown to converge, where
\[
\frac{1}{n}\, \nabla_\theta \ell_\theta(Y_{1:n}) = \frac{1}{n} \sum_{k=1}^n \nabla_\theta \ell_\theta(Y_k \mid Y_{1:k-1}) = \frac{1}{n} \sum_{k=0}^n \frac{\pi_{k;\theta}[\nabla_\theta g_{k;\theta}] + \eta_{k;\theta}[g_{k;\theta}]}{\pi_{k;\theta}[g_{k;\theta}]}.
\]
Assuming that the observations $Y_{1:n}$ are generated by a model driven by a true parameter $\theta_\star$, for all $\theta\in\Theta$ this normalized score converges almost surely to a limiting quantity $\lambda(\theta, \theta_\star)$ such that, under identifiability constraints, $\lambda(\theta_\star, \theta_\star) = 0$. A gradient ascent algorithm cannot be designed, as the limiting function $\theta\mapsto\lambda(\theta,\theta_\star)$ is not available explicitly. Solving the equation $\lambda(\theta, \theta_\star) = 0$ may be cast into the framework of stochastic approximation to produce parameter estimates using the Robbins-Monro algorithm
\[
\theta_{n+1} = \theta_n + \gamma_{n+1}\, \zeta_{n+1}, \quad n\ge 0, \tag{24}
\]
where $\zeta_{n+1}$ is a noisy observation of $\lambda(\theta_n, \theta_\star)$. Obtaining such an observation is not possible in practice and, following [26], this noisy observation is approximated by
\[
\zeta_{n+1} := \frac{\zeta_{n+1}^1 + \zeta_{n+1}^2}{\zeta_{n+1}^3}, \tag{25}
\]
where
\[
\zeta_{n+1}^1 := \pi_{n+1;\theta_n}\left[ (\nabla_\theta g_{n+1;\theta})|_{\theta=\theta_n} \right], \quad \zeta_{n+1}^2 := \eta_{n+1;\theta_n}[g_{n+1;\theta_n}] \quad\text{and}\quad \zeta_{n+1}^3 := \pi_{n+1;\theta_n}[g_{n+1;\theta_n}]. \tag{26}
\]
In (26), the measures $\pi_{n+1;\theta_n}$ and $\eta_{n+1;\theta_n}$ depend on all the past parameter values. In the case of a finite state space $\mathsf{X}$, the algorithm was studied in [24], which also provides assumptions under which the sequence $\{\theta_n\}_{n\ge 0}$ converges towards the parameter $\theta_\star$ (see also [34] for refinements). In more general cases, these measures may be estimated online using the pseudo-marginal smoother presented in this paper.
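For concreteness, a minimal sketch of one Robbins-Monro update (24)-(26) is given below; grad_g and g are hypothetical vectorized evaluations of $\nabla_\theta g_{n+1;\theta}(\cdot, Y_{n+1})$ and $g_{n+1;\theta}(\cdot, Y_{n+1})$, particles_pred approximates $\pi_{n+1;\theta_n}$, and tau_hat carries the pseudo-marginal backward statistics of (23). A scalar parameter is assumed, as in the experiment of Section 6.

```python
import numpy as np

def recursive_ml_step(theta, gamma, particles_pred, tau_hat, grad_g, g):
    """One step of (24): theta <- theta + gamma * (zeta1 + zeta2) / zeta3,
    with the three terms of (26) estimated from the predictive particles."""
    gvals = g(theta, particles_pred)
    zeta1 = np.mean(grad_g(theta, particles_pred))                        # pi[grad g]
    zeta2 = np.mean(tau_hat * gvals) - np.mean(tau_hat) * np.mean(gvals)  # eta[g], cf. (23)
    zeta3 = np.mean(gvals)                                                # pi[g]
    return theta + gamma * (zeta1 + zeta2) / zeta3
```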
Let $(X_t)_{t\ge 0}$ be defined as a weak solution to the following stochastic differential equation (SDE) in $\mathbb{R}^d$:
\[
X_0 = x_0 \quad\text{and}\quad \mathrm{d}X_t = \alpha_\theta(X_t)\,\mathrm{d}t + \mathrm{d}W_t, \tag{27}
\]
where $(W_t)_{t\ge 0}$ is a standard Brownian motion and $\alpha_\theta : \mathsf{X}\to\mathsf{X}$ is the drift function. The inference procedure presented in this paper is applied in the case where the solution to (27) is supposed to be partially observed at times $t_0 = 0, \ldots, t_n$, for a given $n\ge 1$, through an observation process $(Y_k)_{0\le k\le n}$ taking values in $\mathbb{R}^m$. For all $0\le k\le n$, the distribution of $Y_k$ given $(X_t)_{t\ge 0}$ depends on $X_k = X_{t_k}$ only and has density $g_{k;\theta}$ with respect to $\nu$. The distribution of $X_0$ has density $\chi$ with respect to $\mu$ and, for all $0\le k\le n-1$,
the conditional distribution of $X_{k+1}$ given $(X_t)_{0\le t\le t_k}$ has density $q_{k+1;\theta}(X_k, \cdot)$ with respect to $\mu$. This unknown density can be expressed as an expectation of a Brownian bridge functional [7]. Let $\omega = (\omega_s)_{0\le s\le \Delta}$ be the realization of a Brownian bridge starting at $x$ at time 0 and ending at $y$ at time $\Delta$. The distribution of $\omega$ is denoted by $\mathbb{W}_x^{\Delta,y}$. Moreover, suppose that for all $\theta\in\Theta$, $\alpha_\theta$ is of gradient form, $\alpha_\theta = \nabla_x A_\theta$, where $A_\theta : \mathsf{X}\to\mathbb{R}$ is a twice continuously differentiable function. Denoting $\psi_\theta : x\mapsto \psi_\theta(x) = (\|\alpha_\theta(x)\|^2 + \Delta A_\theta(x))/2$, where $\Delta$ here denotes the Laplacian operator,
by Girsanov's theorem, for all $(x,y)\in\mathbb{R}^d\times\mathbb{R}^d$,
\[
q_{k+1;\theta}(x,y) = \phi_{\Delta_k}(x-y)\, \exp\left( A_\theta(y) - A_\theta(x) \right)\, \mathbb{E}_{\mathbb{W}_x^{\Delta_k,y}}\left[ \exp\left( -\int_0^{\Delta_k} \psi_\theta(\omega_s)\,\mathrm{d}s \right) \right], \tag{28}
\]
where $\Delta_k = t_{k+1} - t_k$ and, for all $a > 0$, $\phi_a$ is the probability density function of a centered Gaussian random variable with variance $a$. The transition density then cannot be computed, as it involves an integration over the whole path between $x$ and $y$. To perform the algorithm proposed in this paper, we therefore have to design a positive and unbiased estimator of $q_{k+1;\theta}(x,y)$. Moreover, maximum likelihood estimation of $\theta$ requires an unbiased estimator of $\nabla_\theta \log q_{k+1;\theta}(x,y)$. These two estimators can be obtained using the general Poisson estimator (GPE, [18]).

Unbiased GPE estimator of $q_{k+1;\theta}(x,y)$. Assume that there exist random variables $m_\theta$ and $\overline{m}_\theta$ such that for all $0\le s\le \Delta_k$, $m_\theta \le \psi_\theta(\omega_s) \le \overline{m}_\theta$. Let $\kappa$ be a random variable taking values in $\mathbb{N}$ with distribution $\mu$ (the estimator below is unbiased when $\mu$ is the Poisson distribution with mean $(\overline{m}_\theta - m_\theta)\Delta_k$), $\omega = (\omega_s)_{0\le s\le \Delta_k}$ be the realization of a Brownian bridge, $(U_j)_{1\le j\le\kappa}$ be independent uniform random variables on $(0, \Delta_k)$, and $\zeta = (\kappa, \omega, U_1, \ldots, U_\kappa)$. As shown in [18], equation (28) leads to a positive unbiased estimator given by
\[
\hat q_{k+1;\theta}(x,y;\zeta) = \phi_{\Delta_k}(x-y)\, \exp\left( A_\theta(y) - A_\theta(x) - m_\theta\Delta_k \right) \prod_{j=1}^{\kappa} \frac{\overline{m}_\theta - \psi_\theta(\omega_{U_j})}{\overline{m}_\theta - m_\theta}.
\]

Unbiased GPE estimator of $\nabla_\theta \log q_{k+1;\theta}(x,y)$. Denote by $\varphi_\theta : x\mapsto \psi_\theta(x) - m_\theta$. By (28),
\[
\nabla_\theta \log q_{k+1;\theta}(x,y) = \nabla_\theta A_\theta(y) - \nabla_\theta A_\theta(x) - \nabla_\theta(m_\theta)\,\Delta_k - \frac{\mathbb{E}_{\mathbb{W}_x^{\Delta_k,y}}\left[ \left( \int_0^{\Delta_k} \nabla_\theta \varphi_\theta(\omega_s)\,\mathrm{d}s \right) \exp\left( -\int_0^{\Delta_k} \varphi_\theta(\omega_s)\,\mathrm{d}s \right) \right]}{\mathbb{E}_{\mathbb{W}_x^{\Delta_k,y}}\left[ \exp\left( -\int_0^{\Delta_k} \varphi_\theta(\omega_s)\,\mathrm{d}s \right) \right]}.
\]
The law $\mathbb{S}_{\theta,x}^{\Delta_k,y}$ of the bridge associated with the SDE (27) is absolutely continuous with respect to $\mathbb{W}_x^{\Delta_k,y}$, with Radon-Nikodym derivative given by
\[
\frac{\mathrm{d}\mathbb{S}_{\theta,x}^{\Delta_k,y}}{\mathrm{d}\mathbb{W}_x^{\Delta_k,y}}(\omega) = [q_{k+1;\theta}(x,y)]^{-1}\, \phi_{\Delta_k}(x-y)\, \exp\left( A_\theta(y) - A_\theta(x) - m_\theta\Delta_k - \int_0^{\Delta_k} \varphi_\theta(\omega_s)\,\mathrm{d}s \right) = \mathbb{E}_{\mathbb{W}_x^{\Delta_k,y}}\left[ \exp\left( -\int_0^{\Delta_k} \varphi_\theta(\omega_s)\,\mathrm{d}s \right) \right]^{-1} \exp\left( -\int_0^{\Delta_k} \varphi_\theta(\omega_s)\,\mathrm{d}s \right).
\]
This yields
\[
\nabla_\theta \log q_{k+1;\theta}(x,y) = \left( \nabla_\theta A_\theta(y) - \nabla_\theta A_\theta(x) - \nabla_\theta(m_\theta)\,\Delta_k \right) - \mathbb{E}_{\mathbb{S}_{\theta,x}^{\Delta_k,y}}\left[ \int_0^{\Delta_k} \nabla_\theta \varphi_\theta(\omega_s)\,\mathrm{d}s \right],
\]
and an unbiased estimator of $\nabla_\theta \log q_{k+1;\theta}(x,y)$ is given by
\[
l_{k+1;\theta}(x, y, s^{\theta,x,y,\Delta_k}_U) = \left( \nabla_\theta A_\theta(y) - \nabla_\theta A_\theta(x) - \nabla_\theta(m_\theta)\,\Delta_k \right) - \Delta_k\, \nabla_\theta \varphi_\theta(s^{\theta,x,y,\Delta_k}_U),
\]
where $U$ is uniform on $(0,1)$
and independent of $s^{\theta,x,y,\Delta_k} \sim \mathbb{S}_{\theta,x}^{\Delta_k,y}$. In the context of the GPE, $s^{\theta,x,y,\Delta_k}$ can be simulated exactly using the exact algorithms for diffusion processes proposed in [1].
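A sketch of the GPE estimate of (28) is given below, assuming (as needed for unbiasedness of the displayed product form) that $\kappa$ is Poisson with mean $(\overline{m}_\theta - m_\theta)\Delta_k$ and that bridge_eval returns the Brownian bridge from x to y evaluated at the requested times; A and psi are hypothetical model callables, the bounds are taken deterministic and the state one-dimensional for simplicity.

```python
import numpy as np

def gpe_transition_estimate(x, y, delta, A, psi, m_lo, m_hi, bridge_eval, rng):
    """Positive unbiased GPE estimate of q_{k+1;theta}(x, y), cf. (28): the
    bridge only needs to be revealed at kappa Poisson-distributed times."""
    kappa = rng.poisson((m_hi - m_lo) * delta)
    gauss = np.exp(-(x - y) ** 2 / (2.0 * delta)) / np.sqrt(2.0 * np.pi * delta)
    prod = 1.0
    if kappa > 0:
        U = rng.uniform(0.0, delta, size=kappa)    # Poisson times on (0, delta)
        prod = np.prod((m_hi - psi(bridge_eval(U))) / (m_hi - m_lo))
    return gauss * np.exp(A(y) - A(x) - m_lo * delta) * prod
```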
Experiments. Online recursive maximum likelihood using the pseudo-marginal SMC smoother is illustrated when (27) has the specific form
\[
X_0 = x_0 \quad\text{and}\quad \mathrm{d}X_t = \sin(X_t - \theta)\,\mathrm{d}t + \mathrm{d}W_t, \tag{29}
\]
where $\theta$ is an unknown parameter ranging between 0 and $2\pi$. For these numerical experiments, we suppose that a realization of (29) is only observed at times $t_k = k$ for $0\le k\le n$, with $n = 5000$, through a noisy observation process $(Y_k)_{0\le k\le n}$ such that for all $0\le k\le n$, $Y_k = X_{t_k} + \varepsilon_k$, where $(\varepsilon_k)_{0\le k\le n}$ are i.i.d. standard Gaussian random variables, independent of $(W_t)_{t\ge 0}$. In this case, $\alpha_\theta : x\mapsto \sin(x-\theta)$ and $\inf_{x\in\mathbb{R}}\, (\|\alpha_\theta(x)\|^2 + \Delta A_\theta(x))/2 \ge -1/2$, so that for all $x\in\mathbb{R}$, $0 \le \varphi_\theta(x) = (\|\alpha_\theta(x)\|^2 + \Delta A_\theta(x))/2 + 1/2 \le 9/8$. The data are simulated with the true parameter $\theta^* = \pi/4$.
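For the SINE model (29), the quantities entering the GPE can be written explicitly; the sketch below (with hypothetical naming) collects them, using $A_\theta(x) = -\cos(x-\theta)$, so that $\nabla_x A_\theta = \alpha_\theta$ and $\psi_\theta$ satisfies the stated bounds.

```python
import numpy as np

def sine_model(theta):
    """Model quantities for (29): drift sin(x - theta) = d/dx A_theta(x) with
    A_theta(x) = -cos(x - theta); psi_theta = (alpha^2 + A'')/2 takes values
    in [-1/2, 5/8], hence phi_theta = psi_theta + 1/2 lies in [0, 9/8]."""
    A = lambda x: -np.cos(x - theta)
    dA_dtheta = lambda x: -np.sin(x - theta)       # enters the grad-log-q estimator
    psi = lambda x: (np.sin(x - theta) ** 2 + np.cos(x - theta)) / 2.0
    return A, dA_dtheta, psi, -0.5, 5.0 / 8.0      # (A, dA/dtheta, psi, m_lo, m_hi)
```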
The solution to (29) is sampled at times $(t_k)_{0\le k\le n}$ using the exact algorithm of [1]. For all $0\le k\le n-1$, $\hat q_{k;\theta}$ and the GPE unbiased estimator of $\nabla_\theta \log q_{k;\theta}(x,y)$ are estimated using $M = 30$ independent Monte Carlo replications of the general Poisson estimator. The estimations of $\theta^*$ are given for 50 independent runs started at random locations $\theta_0$, with $N = 100$ particles and $\tilde N = 2$ backward samples. Following [21], the proposal distribution of the particle filter is obtained using an approximation of the fully adapted particle filter, where $q_{k;\theta}$ is replaced by its Euler scheme approximation.

Sensitivity to the starting point $\hat\theta_0$. The inference procedure was performed on the same data set from 50 different starting points uniformly chosen in $(0, \pi)$. The gradient step size $\gamma_k$ of equation (24) was chosen constant (and equal to 0.5) for the first 300 time steps, and then polynomially decreasing with $k$. Results are given in Figure 3. There is no sensitivity to the starting point of the algorithm: after a couple of hundred observations, the estimates all concentrate around the true value. As the gradient step size decreases, the estimates stay around the true value, following autocorrelated patterns that are common to all trajectories.

Asymptotic normality.
The inference procedure was performed on 50 different data sets simulated with the same $\theta^*$. The 50 estimates were obtained from the same starting point (fixed to $\theta^*$, as Figure 3 shows no sensitivity to the starting point). Figure 4 shows the results for the raw and the averaged estimates. The averaged estimates $(\tilde\theta_k)_{k\ge 0}$ consist in averaging the values produced by the estimation procedure after a burn-in phase of $n_0$ time steps (here $n_0 = 300$ time steps). This procedure allows to obtain an estimator whose convergence rate does not depend on the step sizes chosen by the user, see [30, 23]. For all $0\le k\le n_0$, $\tilde\theta_k = \hat\theta_k$ and for all $k > n_0$,
\[
\tilde\theta_k = \frac{1}{k - n_0} \sum_{j=n_0+1}^{k} \hat\theta_j.
\]
As expected, the estimated distribution of the final estimates tends to be Gaussian, centered around the true value.
Step size influence.
To illustrate the influence of the gradient step sizes, different settings are considered. In each scenario, the sequence $(\gamma_k)_{k\ge 0}$ is given by
\[
\gamma_k = \gamma_0\, \mathbb{1}_{\{k\le n_0\}} + \gamma_0\, (k - n_0)^{-\kappa}\, \mathbb{1}_{\{k>n_0\}},
\]
where $\gamma_0 = 0.5$.
In this experiment, six values of $\kappa$ are considered. The results are shown in Figure 5. As expected, the raw estimator shows different rates of convergence depending on $\kappa$, whereas the averaged estimator has the same behavior in all cases.

References

[1] A. Beskos, O. Papaspiliopoulos, and G.O. Roberts. Retrospective exact simulation of diffusion sample paths with applications.
Bernoulli, 12(6):1077–1098, 2006.

[2] M. Briers, A. Doucet, and S. Maskell. Smoothing algorithms for state-space models. Annals of the Institute of Statistical Mathematics, 62(1):61, 2010.

[3] O. Cappé. Online EM algorithm for hidden Markov models. Journal of Computational and Graphical Statistics, 20(3):728–749, 2011.

[4] O. Cappé, É. Moulines, and T. Rydén. Inference in Hidden Markov Models. Springer, 2005.

[5] O. Cappé, É. Moulines, and T. Rydén.
Inference in Hidden Markov Models (Springer Series in Statistics). Springer-Verlag, Berlin, Heidelberg, 2005.
Figure 2: Data set simulated according to the SINE process, observed with noise at discrete time steps (x-axis Time, y-axis Value; legend: observations and true state).
Figure 3: (Left) Online estimation of $\theta$ for the data set presented in Figure 2; the algorithm is performed from 50 starting points. (Right) The gradient step sizes (defined in equation (24)). (Plots: x-axis Time; y-axes Estimate and Gradient step.)
Figure 4: (Left) Online estimation of $\theta$ for 50 different simulated data sets as presented in Figure 2; the algorithm is performed from one starting point with the gradient step size shown in Figure 3. (Center) Averaged estimator, where $\hat\theta$ is averaged after a burn-in phase of 300 time steps. (Right) Empirical distribution of $\hat\theta$; the red line is the value of $\theta^*$.
[Figure 5 omitted: two panels; x-axis "Time", y-axis "Estimate", one curve per value of $\kappa$.]

Figure 5: (Left) Online estimation of $\theta$ for the data set presented in Figure 2, with different decreasing-rate values $\kappa$. (Right) Averaged estimator, where $\hat{\theta}$ is averaged after a burn-in phase of 300 time steps.

[6] M.S. Crouse, R.D. Nowak, and R.G. Baraniuk. Wavelet-based statistical signal processing using hidden Markov models. IEEE Transactions on Signal Processing, 46(4):886–902, 1998.
[7] D. Dacunha-Castelle and D. Florens-Zmirou. Estimation of the coefficients of a diffusion from discrete observations. Stochastics: An International Journal of Probability and Stochastic Processes, 19(4):263–284, 1986.
[8] P. Del Moral. Feynman-Kac Formulae. Genealogical and Interacting Particle Systems with Applications. Springer, 2004.
[9] P. Del Moral, A. Doucet, and S. Singh. A backward particle interpretation of Feynman-Kac formulae. ESAIM M2AN, 44(5):947–975, 2010.
[10] P. Del Moral, A. Doucet, and S.S. Singh. Uniform stability of a particle approximation of the optimal filter derivative. SIAM Journal on Control and Optimization, 53:1278–1304, 2015.
[11] R. Douc, A. Garivier, É. Moulines, and J. Olsson. Sequential Monte Carlo smoothing for general state space hidden Markov models. Ann. Appl. Probab., 21(6):2109–2145, 2011.
[12] R. Douc and C. Matias. Asymptotics of the maximum likelihood estimator for general hidden Markov models. Bernoulli, 7:381–420, 2001.
[13] R. Douc and É. Moulines. Limit theorems for weighted samples with applications to sequential Monte Carlo methods. Ann. Statist., 36(5):2344–2376, 2008.
[14] R. Douc, É. Moulines, and D. Stoffer. Nonlinear time series: theory, methods and applications with R examples. CRC Press, 2014.
[15] A. Doucet, S. Godsill, and C. Andrieu. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10(3):197–208, 2000.
[16] C. Dubarry and S. Le Corff. Fast computation of smoothed additive functionals in general state-space models. Pages 197–200, 2011.
[17] C. Dubarry and S. Le Corff. Non-asymptotic deviation inequalities for smoothed additive functionals in nonlinear state-space models. Bernoulli, 19(5B):2222–2249, 2013.
[18] P. Fearnhead, O. Papaspiliopoulos, and G.O. Roberts. Particle filters for partially observed diffusions. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(4):755–777, 2008.
[19] P. Fearnhead, D. Wyncoll, and J. Tawn. A sequential smoothing algorithm with linear computational cost. Biometrika, 97(2):447–464, 2010.
[20] M. Gerber and N. Chopin. Convergence of sequential quasi-Monte Carlo smoothing algorithms. Bernoulli, 23(4B):2951–2987, 2017.
[21] P. Gloaguen, M.-P. Étienne, and S. Le Corff. Online sequential Monte Carlo smoother for partially observed stochastic differential equations. EURASIP Journal on Advances in Signal Processing, 2018.
[22] N.J. Gordon, D.J. Salmond, and A.F.M. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. In IEE Proceedings F (Radar and Signal Processing), volume 140, pages 107–113. IET, 1993.
[23] H.J. Kushner and G.G. Yin. Stochastic Approximation Algorithms and Applications. Springer, 1997.
[24] F. Le Gland and L. Mevel. Recursive estimation in HMMs. In Proc. IEEE Conf. Decis. Control, pages 3468–3473, 1997.
[25] T.N.M. Nguyen, S. Le Corff, and É. Moulines. On the two-filter approximations of marginal smoothing distributions in general state-space models. Advances in Applied Probability, 50(1):154–177, 2017.
[26] J. Olsson and J. Westerborn. Particle-based, online estimation of tangent filters with application to parameter estimation in nonlinear state-space models. ArXiv:1712.08466, 2017.
[27] J. Olsson and J. Strojby. Particle-based likelihood inference in partially observed diffusion processes using generalised Poisson estimators. Electron. J. Statist., 5:1090–1122, 2011.
[28] J. Olsson and J. Westerborn. Efficient particle-based online smoothing in general hidden Markov models: the PaRIS algorithm. Bernoulli, 23(3):1951–1996, 2017.
[29] M.K. Pitt and N. Shephard. Filtering via simulation: auxiliary particle filters. J. Am. Statist. Assoc., 94(446):590–599, 1999.
[30] B.T. Polyak and A.B. Juditsky. Acceleration of stochastic approximation by averaging. SIAM J. Control Optim., 30(4):838–855, 1992.
[31] L.R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. In Proceedings of the IEEE, pages 257–286, 1989.
[32] S. Särkkä. Bayesian Filtering and Smoothing. Cambridge University Press, New York, NY, USA, 2013.
[33] S. Särkkä, A. Vehtari, and J. Lampinen. Rao-Blackwellized particle filter for multiple target tracking. Information Fusion, 8(1):2–15, 2007.
[34] V.B. Tadić. Analyticity, convergence, and convergence rate of recursive maximum-likelihood estimation in hidden Markov models. IEEE Transactions on Information Theory, 56:6406–6432, 2010.
[35] X. Wang, E. Lebarbier, J. Aubert, and S. Robin. Variational inference for coupled hidden Markov models applied to the joint detection of copy number variations. The International Journal of Biostatistics, 15, 2017.
[36] C. Yau, O. Papaspiliopoulos, G.O. Roberts, and C. Holmes. Bayesian non-parametric hidden Markov models with applications in genomics. 73:1–21, 2011.
[37] W. Zucchini, I.L. MacDonald, and R. Langrock. Hidden Markov models for time series: an introduction using R. CRC Press, 2016.
A Additional technical results
The proof of Lemma A.1 is given in [11].
Lemma A.1.
Assume that $a_N$, $b_N$, and $b$ are random variables defined on the same probability space such that there exist positive constants $\beta$, $B$, $C$, and $M$ satisfying:

(i) $|a_N / b_N| \leq M$, $\mathbb{P}$-a.s., and $b \geq \beta$, $\mathbb{P}$-a.s.;
(ii) for all $\varepsilon > 0$ and all $N \geq 1$, $\mathbb{P}(|b_N - b| > \varepsilon) \leq B \mathrm{e}^{-C N \varepsilon^2}$;
(iii) for all $\varepsilon > 0$ and all $N \geq 1$, $\mathbb{P}(|a_N| > \varepsilon) \leq B \mathrm{e}^{-C N (\varepsilon / M)^2}$.

Then,
$$\mathbb{P}\left(\left|\frac{a_N}{b_N}\right| > \varepsilon\right) \leq B \exp\left(-C N \left(\frac{\varepsilon \beta}{M}\right)^2\right).$$
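To illustrate the mechanism of Lemma A.1 (exponential deviation bounds on a numerator and a denominator transfer to their ratio), the following minimal Python sketch simulates ratios of bounded empirical means and shows the deviation probability vanishing as $N$ grows; the distributions and all variable names are illustrative assumptions, not objects from the paper.

```python
import numpy as np

# a_N: empirical mean of centred variables bounded by 1 (so (iii) holds by Hoeffding);
# b_N: empirical mean of positive variables with limit b = 1 >= beta = 0.5 (so (ii) holds).
# Lemma A.1 then yields an exponential bound on P(|a_N / b_N| > eps).
rng = np.random.default_rng(0)
eps, n_rep = 0.05, 5000
for N in [100, 400, 1600]:
    a_N = rng.uniform(-1.0, 1.0, size=(n_rep, N)).mean(axis=1)
    b_N = rng.uniform(0.5, 1.5, size=(n_rep, N)).mean(axis=1)
    p_hat = np.mean(np.abs(a_N / b_N) > eps)
    print(f"N = {N:5d}   P(|a_N/b_N| > {eps}) ~ {p_hat:.4f}")
```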
The proof of Theorem A.1 is given in [13, Theorem A.3].

Theorem A.1.
Let $N$ be a positive integer, $(U_{N,i})_{1 \leq i \leq N}$ be random variables on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and $(\mathcal{F}_{N,i})_{0 \leq i \leq N}$ be a filtration on $(\Omega, \mathcal{F}, \mathbb{P})$. Assume that for all $1 \leq i \leq N$ the random variable $U_{N,i}$ is such that $\mathbb{E}[U_{N,i}^2 \mid \mathcal{F}_{N,i-1}] < \infty$. Assume also that the two following conditions hold.

(i) There exists $\sigma^2 > 0$ such that
$$\sum_{i=1}^{N} \left(\mathbb{E}[U_{N,i}^2 \mid \mathcal{F}_{N,i-1}] - \mathbb{E}[U_{N,i} \mid \mathcal{F}_{N,i-1}]^2\right) \xrightarrow[N\to\infty]{\mathbb{P}} \sigma^2.$$
(ii) For all $\varepsilon > 0$,
$$\sum_{i=1}^{N} \mathbb{E}\big[U_{N,i}^2 \mathbb{1}_{\{|U_{N,i}| \geq \varepsilon\}} \mid \mathcal{F}_{N,i-1}\big] \xrightarrow[N\to\infty]{\mathbb{P}} 0.$$

Then, for all $u > 0$,
$$\mathbb{E}\left[\exp\left(\mathrm{i} u \sum_{i=1}^{N} \big\{U_{N,i} - \mathbb{E}[U_{N,i} \mid \mathcal{F}_{N,i-1}]\big\}\right) \,\middle|\, \mathcal{F}_{N,0}\right] \xrightarrow[N\to\infty]{\mathbb{P}} \mathrm{e}^{-u^2 \sigma^2 / 2}.$$
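The following short simulation illustrates Theorem A.1 in the simplest triangular-array setting; the choice $U_{N,i} = V_i / \sqrt{N}$ with bounded i.i.d. $V_i$ is an illustrative assumption (condition (i) holds with $\sigma^2 = \mathrm{Var}(V_1)$ and the Lindeberg condition (ii) is immediate since $|U_{N,i}| \leq \|V\|_\infty / \sqrt{N}$).

```python
import numpy as np

# Centred and scaled row sums of a triangular array should be close to N(0, sigma^2),
# with sigma^2 = Var(U(-1, 1)) = 1/3 for this illustrative choice.
rng = np.random.default_rng(1)
N, n_rep = 2000, 20000
V = rng.uniform(-1.0, 1.0, size=(n_rep, N))
S = V.sum(axis=1) / np.sqrt(N)  # sum_i (U_{N,i} - E[U_{N,i}]) since E[V_1] = 0
print("empirical std:", S.std(), "  theoretical sigma:", np.sqrt(1.0 / 3.0))
```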
The proof of Lemma A.2 follows the same lines as [26, Lemma 14].

Lemma A.2.
Assume that H3 holds. Let $K$ be a transition kernel on $(\mathsf{X}, \mathcal{B}(\mathsf{X}))$ with transition density $k \in \mathsf{F}(\mathsf{X} \times \mathsf{X})$ with respect to the reference measure $\mu$. Assume that $(\varphi_N)_{N \geq 1}$ is a sequence of functions in $\mathsf{F}(\mathsf{X})$ such that:

(i) there exists $\varphi \in \mathsf{F}(\mathsf{X})$ such that, for all $x \in \mathsf{X}$, $\varphi_N(x) \xrightarrow[N\to\infty]{\mathbb{P}\text{-a.s.}} \varphi(x)$;
(ii) there exists $0 < c_\infty < \infty$ such that, for all $N \geq 1$, $\|\varphi_N\|_\infty \leq c_\infty$.

Then, for all $0 \leq k \leq n$,
$$\phi_k^N[K \varphi_N] \xrightarrow[N\to\infty]{\mathbb{P}} \phi_k[K \varphi] \quad \text{and} \quad \hat{\phi}_k^N[K \varphi_N] \xrightarrow[N\to\infty]{\mathbb{P}} \phi_k[K \varphi].$$

B Convergence results for PaRIS algorithms
For all $0 \leq k \leq n$, define the following $\sigma$-fields:
$$\mathcal{F}_k^N := \sigma\big\{(\xi_u^\ell, \omega_u^\ell, \tau_u^\ell)\,;\ 1 \leq \ell \leq N,\ 0 \leq u \leq k\big\} \quad \text{and} \quad \mathcal{G}_k^N := \sigma\big\{(\xi_k^\ell, \omega_k^\ell)\,;\ 1 \leq \ell \leq N\big\}.$$

Lemma B.1.
For all $0 \leq k \leq n-1$, $(f_{k+1}, \tilde{f}_{k+1}) \in \mathsf{F}(\mathsf{X})^2$ and $N, \tilde{N} \geq 1$, the random variables $\{\omega_{k+1}^i (\tau_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i))\}_{i=1}^N$ are i.i.d. conditionally on $\mathcal{F}_k^N$ with
$$\mathbb{E}\Big[\omega_{k+1}^i \big(\tau_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i)\big) \,\Big|\, \mathcal{F}_k^N\Big] = \big(\phi_k^N[\vartheta_k]\big)^{-1} \sum_{\ell=1}^{N} \frac{\omega_k^\ell}{\Omega_k} \Big\{\tau_k^\ell L_k f_{k+1}(\xi_k^\ell) + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})(\xi_k^\ell)\Big\}.$$

Proof.
The proof follows the same lines as [26, Lemma 12].
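Before turning to the proofs, the following minimal Python sketch may help fix ideas about the quantities $(\xi_k^i, \omega_k^i, \tau_k^i)$ and the backward indices $J_{k+1}^{(i,j)}$ used throughout this appendix: each new particle draws $\tilde{N}$ backward indices with probabilities proportional to $\omega_k^\ell\, q(\xi_k^\ell, \xi_{k+1}^i)$ and updates its statistic by averaging. The toy dynamics, the functions `q` and `h_tilde`, and all names are illustrative assumptions; this is a PaRIS-type update in the spirit of [28], not the exact algorithm of the paper.

```python
import numpy as np

def paris_step(xi, omega, tau, q, h_tilde, N_tilde, rng):
    """One PaRIS-type update: mutate particles, then refresh the statistics tau
    with N_tilde backward draws per particle (toy model, bootstrap proposal)."""
    N = xi.shape[0]
    anc = rng.choice(N, size=N, p=omega / omega.sum())    # multinomial resampling
    xi_new = 0.9 * xi[anc] + rng.standard_normal(N)       # placeholder AR(1) mutation
    omega_new = np.ones(N)                                # placeholder weights
    tau_new = np.empty(N)
    for i in range(N):
        bw = omega * q(xi, xi_new[i])                     # backward weights, unnormalised
        J = rng.choice(N, size=N_tilde, p=bw / bw.sum())  # indices J^{(i,j)}
        tau_new[i] = np.mean(tau[J] + h_tilde(xi[J], xi_new[i]))
    return xi_new, omega_new, tau_new

rng = np.random.default_rng(2)
q = lambda x, y: np.exp(-0.5 * (y - 0.9 * x) ** 2)        # toy transition density
h_tilde = lambda x, y: x * y                              # one additive-functional increment
xi, omega, tau = rng.standard_normal(500), np.ones(500), np.zeros(500)
for _ in range(10):
    xi, omega, tau = paris_step(xi, omega, tau, q, h_tilde, N_tilde=2, rng=rng)
print("online estimate of the smoothed additive functional:", np.average(tau, weights=omega))
```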
B.1 Proof of Lemma 4.2
Proof.
The proof proceeds by induction. The case $k = 0$ is a direct consequence of the fact that $T_0 h_0 = 0$ and $\tau_0^i = 0$ for all $1 \leq i \leq N$. Assume that the result holds for some $0 \leq k \leq n-1$ and write
$$\sum_{i=1}^{N} \frac{\omega_{k+1}^i}{\Omega_{k+1}} (\tau_{k+1}^i)^2 f_{k+1}(\xi_{k+1}^i) = \frac{a_N}{b_N}, \quad \text{where} \quad a_N = \frac{1}{N} \sum_{i=1}^{N} \omega_{k+1}^i (\tau_{k+1}^i)^2 f_{k+1}(\xi_{k+1}^i) \quad \text{and} \quad b_N = \frac{1}{N} \sum_{i=1}^{N} \omega_{k+1}^i.$$
Then, using that $(\omega_{k+1}^i)_{1 \leq i \leq N}$ are i.i.d. conditionally on $\mathcal{F}_k^N$ with
$$\mathbb{E}\big[\omega_{k+1}^1 \,\big|\, \mathcal{F}_k^N\big] = \frac{\phi_k^N[L_k]}{\phi_k^N[\vartheta_k]},$$
by Hoeffding's inequality, since $0 \leq \omega_{k+1}^i \leq \|\bar{\omega}_{k+1}\|_\infty$ for all $1 \leq i \leq N$,
$$\mathbb{P}\left(\left|b_N - \frac{\phi_k^N[L_k]}{\phi_k^N[\vartheta_k]}\right| > \varepsilon\right) = \mathbb{E}\left[\mathbb{P}\left(\left|b_N - \frac{\phi_k^N[L_k]}{\phi_k^N[\vartheta_k]}\right| > \varepsilon \,\middle|\, \mathcal{F}_k^N\right)\right] \leq 2 \mathrm{e}^{-2N\varepsilon^2 / \|\bar{\omega}_{k+1}\|_\infty^2}.$$
Therefore, by Lemma 4.1,
$$b_N \xrightarrow[N\to\infty]{\mathbb{P}\text{-a.s.}} \frac{\phi_k[L_k]}{\phi_k[\vartheta_k]}.$$
Since $\phi_k[L_k] > 0$, it is enough to study the limit of $(a_N)_{N \geq 1}$. On the other hand, by Hoeffding's inequality, using that $|\omega_{k+1}^i (\tau_{k+1}^i)^2 f_{k+1}(\xi_{k+1}^i)| \leq \|\bar{\omega}_{k+1}\|_\infty \|h_{k+1}\|_\infty^2 \|f_{k+1}\|_\infty$ for all $1 \leq i \leq N$,
$$\mathbb{P}\big(\big|a_N - \mathbb{E}[a_N \mid \mathcal{F}_k^N]\big| > \varepsilon\big) \leq 2 \mathrm{e}^{-N\varepsilon^2 / (2 \|\bar{\omega}_{k+1}\|_\infty^2 \|h_{k+1}\|_\infty^4 \|f_{k+1}\|_\infty^2)},$$
so that it is enough to compute the limit of $\mathbb{E}[a_N \mid \mathcal{F}_k^N]$ as $N$ grows to infinity. Then, write
$$\mathbb{E}[a_N \mid \mathcal{F}_k^N] = \mathbb{E}\big[\omega_{k+1}^1 (\tau_{k+1}^1)^2 f_{k+1}(\xi_{k+1}^1) \,\big|\, \mathcal{F}_k^N\big] = \tilde{a}_N^1 + \tilde{a}_N^2,$$
where
$$\tilde{a}_N^1 = \tilde{N}^{-1}\, \mathbb{E}\left[\omega_{k+1}^1 f_{k+1}(\xi_{k+1}^1)\, \mathbb{E}\left[\left(\tau_k^{J_{k+1}^{(1,1)}} + \tilde{h}_k\big(\xi_k^{J_{k+1}^{(1,1)}}, \xi_{k+1}^1\big)\right)^2 \,\middle|\, \mathcal{F}_k^N \vee \mathcal{G}_{k+1}^N\right] \,\middle|\, \mathcal{F}_k^N\right],$$
$$\tilde{a}_N^2 = (\tilde{N}-1)\tilde{N}^{-1}\, \mathbb{E}\left[\omega_{k+1}^1 f_{k+1}(\xi_{k+1}^1)\, \mathbb{E}\left[\tau_k^{J_{k+1}^{(1,1)}} + \tilde{h}_k\big(\xi_k^{J_{k+1}^{(1,1)}}, \xi_{k+1}^1\big) \,\middle|\, \mathcal{F}_k^N \vee \mathcal{G}_{k+1}^N\right]^2 \,\middle|\, \mathcal{F}_k^N\right].$$
The first term is given by
$$\tilde{a}_N^1 = \tilde{N}^{-1} \sum_{j=1}^{N} \int \frac{\omega_k^j \vartheta_k(\xi_k^j) p_k(\xi_k^j, x)}{\sum_{m=1}^{N} \omega_k^m \vartheta_k(\xi_k^m)} \, \frac{r_k(\xi_k^j, x)}{\vartheta_k(\xi_k^j) p_k(\xi_k^j, x)} \, f_{k+1}(x) \sum_{\ell=1}^{N} \frac{\omega_k^\ell r_k(\xi_k^\ell, x)}{\sum_{m=1}^{N} \omega_k^m r_k(\xi_k^m, x)} \big(\tau_k^\ell + \tilde{h}_k(\xi_k^\ell, x)\big)^2 \mu(\mathrm{d}x)$$
$$= \tilde{N}^{-1} \big(\phi_k^N[\vartheta_k]\big)^{-1} \int f_{k+1}(x) \sum_{\ell=1}^{N} \frac{\omega_k^\ell}{\Omega_k}\, r_k(\xi_k^\ell, x) \big(\tau_k^\ell + \tilde{h}_k(\xi_k^\ell, x)\big)^2 \mu(\mathrm{d}x).$$
By the induction hypothesis and Lemma 4.1,
$$\tilde{a}_N^1 \xrightarrow[N\to\infty]{\mathbb{P}} \big(\tilde{N} \phi_k[\vartheta_k]\big)^{-1} \Big\{\eta_k[L_k f_{k+1}] + \phi_k\big[(T_k h_k)^2 L_k f_{k+1}\big] + \phi_k\big[L_k(f_{k+1} \tilde{h}_k^2)\big] + 2\phi_k\big[T_k h_k L_k(f_{k+1} \tilde{h}_k)\big]\Big\},$$
which yields
$$\tilde{a}_N^1 \xrightarrow[N\to\infty]{\mathbb{P}} \big(\tilde{N} \phi_k[\vartheta_k]\big)^{-1} \Big\{\eta_k[L_k f_{k+1}] + \phi_k\big[L_k\{(T_k h_k + \tilde{h}_k)^2 f_{k+1}\}\big]\Big\}.$$
The second term is given by
$$\tilde{a}_N^2 = (\tilde{N}-1)\tilde{N}^{-1} \sum_{j=1}^{N} \int \frac{\omega_k^j \vartheta_k(\xi_k^j) p_k(\xi_k^j, x)}{\sum_{m=1}^{N} \omega_k^m \vartheta_k(\xi_k^m)} \, \frac{r_k(\xi_k^j, x)}{\vartheta_k(\xi_k^j) p_k(\xi_k^j, x)} \, f_{k+1}(x) \left(\sum_{\ell=1}^{N} \frac{\omega_k^\ell r_k(\xi_k^\ell, x)}{\sum_{m=1}^{N} \omega_k^m r_k(\xi_k^m, x)} \big\{\tau_k^\ell + \tilde{h}_k(\xi_k^\ell, x)\big\}\right)^2 \mu(\mathrm{d}x)$$
$$= (\tilde{N}-1)\tilde{N}^{-1} \big(\phi_k^N[\vartheta_k]\big)^{-1} \phi_k^N[L_k \varphi_N],$$
with, for all $x \in \mathsf{X}$,
$$\varphi_N(x) = f_{k+1}(x) \left(\sum_{\ell=1}^{N} \frac{\omega_k^\ell r_k(\xi_k^\ell, x)}{\sum_{m=1}^{N} \omega_k^m r_k(\xi_k^m, x)} \big\{\tau_k^\ell + \tilde{h}_k(\xi_k^\ell, x)\big\}\right)^2.$$
For all $x \in \mathsf{X}$, by Lemma 4.1,
$$\varphi_N(x) \xrightarrow[N\to\infty]{\mathbb{P}\text{-a.s.}} f_{k+1}(x) \left(\frac{\phi_k\big[T_k h_k\, r_k(\cdot,x) + r_k(\cdot,x) \tilde{h}_k(\cdot,x)\big]}{\phi_k[r_k(\cdot,x)]}\right)^2.$$
In addition, for all $x \in \mathsf{X}$, by (8),
$$\frac{\phi_k\big[T_k h_k\, r_k(\cdot,x) + r_k(\cdot,x) \tilde{h}_k(\cdot,x)\big]}{\phi_k[r_k(\cdot,x)]} = \overleftarrow{Q}_{\phi_k}\big(T_k h_k + \tilde{h}_k\big)(x) = T_{k+1} h_{k+1}(x),$$
so that $\varphi_N(x) \to f_{k+1}(x) (T_{k+1} h_{k+1}(x))^2$, $\mathbb{P}$-a.s. Therefore, as $\|\varphi_N\|_\infty \leq \|f_{k+1}\|_\infty \|h_{k+1}\|_\infty^2$, by the generalized Lebesgue dominated convergence theorem (see Lemma A.2),
$$\tilde{a}_N^2 \xrightarrow[N\to\infty]{\mathbb{P}} (\tilde{N}-1)\tilde{N}^{-1} \big(\phi_k[\vartheta_k]\big)^{-1} \phi_k\big[L_k\{f_{k+1} (T_{k+1} h_{k+1})^2\}\big].$$
Using that
$$\frac{\phi_k\big[L_k\, f_{k+1} (T_{k+1} h_{k+1})^2\big]}{\phi_k[L_k]} = \phi_{k+1}\big[f_{k+1} (T_{k+1} h_{k+1})^2\big]$$
yields
$$\frac{a_N}{b_N} \xrightarrow[N\to\infty]{\mathbb{P}} \phi_{k+1}\big[f_{k+1} (T_{k+1} h_{k+1})^2\big] + \frac{\eta_k[L_k f_{k+1}]}{\tilde{N} \phi_k[L_k]} + \frac{\phi_k\big[L_k\{(T_k h_k + \tilde{h}_k)^2 f_{k+1}\}\big] - \phi_k\big[L_k\, f_{k+1} (T_{k+1} h_{k+1})^2\big]}{\tilde{N} \phi_k[L_k]}.$$
The proof is concluded upon noting that
$$\phi_k\big[L_k\{(T_k h_k + \tilde{h}_k)^2 f_{k+1}\}\big] - \phi_k\big[L_k\, f_{k+1} (T_{k+1} h_{k+1})^2\big] = \phi_k\Big[L_k\Big\{\overleftarrow{Q}_{\phi_k}\big((T_k h_k + \tilde{h}_k - T_{k+1} h_{k+1})^2\big)\, f_{k+1}\Big\}\Big].$$
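The decomposition of the self-normalized statistic into $a_N / b_N$, with Hoeffding bounds on each part glued together by Lemma A.1, is the recurring device of this appendix. The short Python sketch below illustrates it on a plain self-normalized importance-sampling average with bounded weights; the weight and test functions are illustrative assumptions.

```python
import numpy as np

# a_N = N^{-1} sum w(X_i) f(X_i) and b_N = N^{-1} sum w(X_i) are both bounded means,
# so each concentrates exponentially fast; the ratio a_N / b_N then concentrates too.
rng = np.random.default_rng(3)
f = lambda x: np.tanh(x)            # bounded, odd: E[w(X) f(X)] = 0 by symmetry
w = lambda x: 1.0 / (1.0 + x ** 2)  # bounded, positive weight
for N in [100, 400, 1600]:
    xs = rng.standard_normal((5000, N))
    ratio = (w(xs) * f(xs)).mean(axis=1) / w(xs).mean(axis=1)
    print(f"N = {N:5d}   P(|a_N/b_N| > 0.05) ~ {np.mean(np.abs(ratio) > 0.05):.4f}")
```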
B.2 Proof of Proposition 4.1

Proof.
The result is proved by induction on $k$. It holds for $k = 0$ as $\tau_0^i = 0$ for all $1 \leq i \leq N$. Assume now that the result holds for some $0 \leq k \leq n-1$ and let $(f_{k+1}, \tilde{f}_{k+1})$ be such that $\phi_{k+1}[T_{k+1} h_{k+1} f_{k+1} + \tilde{f}_{k+1}] = 0$. Write
$$\sqrt{N} \sum_{i=1}^{N} \frac{\omega_{k+1}^i}{\Omega_{k+1}} \big\{\tau_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i)\big\} = N \Omega_{k+1}^{-1} \Delta_{k+1}^N,$$
where $\Delta_{k+1}^N = N^{-1/2} \sum_{i=1}^{N} \omega_{k+1}^i \{\tau_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i)\}$ is decomposed as $\Delta_{k+1}^N = \Delta_{k+1,1}^N + \Delta_{k+1,2}^N$, where
$$\Delta_{k+1,1}^N = N^{-1/2} \sum_{i=1}^{N} \mathbb{E}\big[\omega_{k+1}^i (\tau_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i)) \,\big|\, \mathcal{F}_k^N\big],$$
$$\Delta_{k+1,2}^N = N^{-1/2} \sum_{i=1}^{N} \Big\{\omega_{k+1}^i \big(\tau_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i)\big) - \mathbb{E}\big[\omega_{k+1}^i (\tau_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i)) \,\big|\, \mathcal{F}_k^N\big]\Big\}.$$
By Lemma B.1,
$$N \Omega_{k+1}^{-1} \Delta_{k+1,1}^N = \frac{N}{\Omega_{k+1}} \big(\phi_k^N[\vartheta_k]\big)^{-1} \sqrt{N} \sum_{\ell=1}^{N} \frac{\omega_k^\ell}{\Omega_k} \Big\{\tau_k^\ell L_k f_{k+1}(\xi_k^\ell) + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})(\xi_k^\ell)\Big\}.$$
As $\phi_{k+1}[T_{k+1} h_{k+1} f_{k+1} + \tilde{f}_{k+1}] = 0$,
$$\phi_k\big[T_k h_k L_k f_{k+1} + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})\big] = 0.$$
Therefore, using the induction hypothesis, Slutsky's lemma and $\frac{N}{\Omega_{k+1}} (\phi_k^N[\vartheta_k])^{-1} \xrightarrow[N\to\infty]{\mathbb{P}} (\phi_k[L_k])^{-1}$ yields
$$N \Omega_{k+1}^{-1} \Delta_{k+1,1}^N \xrightarrow[N\to\infty]{\mathcal{D}} \frac{\sigma_k\big\langle L_k f_{k+1};\, L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})\big\rangle}{\phi_k[L_k]}\, Z,$$
where $Z$ is a standard Gaussian random variable. By Lemma B.1,
$$N \Omega_{k+1}^{-1} \Delta_{k+1,2}^N = \frac{N}{\Omega_{k+1}} \sum_{i=1}^{N} \upsilon_N^i,$$
where, for all $1 \leq i, j \leq N$ and all $x \in \mathsf{X}$,
$$\upsilon_N^i = \frac{1}{\sqrt{N}\, \tilde{N}} \sum_{j=1}^{\tilde{N}} \tilde{\upsilon}_N\big(I_{k+1}^i, J_{k+1}^{(i,j)}, \xi_{k+1}^i\big),$$
$$\tilde{\upsilon}_N(i, j, x) = \frac{r_k(\xi_k^i, x)}{\vartheta_k(\xi_k^i) p_k(\xi_k^i, x)} \Big\{\big(\tau_k^j + \tilde{h}_k(\xi_k^j, x)\big) f_{k+1}(x) + \tilde{f}_{k+1}(x)\Big\} - \big(\phi_k^N[\vartheta_k]\big)^{-1} \sum_{\ell=1}^{N} \frac{\omega_k^\ell}{\Omega_k} \Big\{\tau_k^\ell L_k f_{k+1}(\xi_k^\ell) + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})(\xi_k^\ell)\Big\}.$$
First, by Lemma 4.1,
$$\frac{N}{\Omega_{k+1}} \xrightarrow[N\to\infty]{\mathbb{P}} \frac{\phi_k[\vartheta_k]}{\phi_k[L_k]}.$$
The limit of $\sum_{i=1}^N \upsilon_N^i$ is then obtained by applying Theorem A.1 to $(\upsilon_N^i)_{1 \leq i \leq N}$. By construction $\mathbb{E}[\upsilon_N^i \mid \mathcal{F}_k^N] = 0$, so that the proof of (i) is based on
$$\sum_{i=1}^{N} \mathbb{E}\big[(\upsilon_N^i)^2 \,\big|\, \mathcal{F}_k^N\big] = \tilde{N}^{-1}\, \mathbb{E}\Big[\mathbb{E}\big[\big(\tilde{\upsilon}_N(I_{k+1}^1, J_{k+1}^{(1,1)}, \xi_{k+1}^1)\big)^2 \,\big|\, \mathcal{F}_k^N \vee \mathcal{G}_{k+1}^N\big] \,\Big|\, \mathcal{F}_k^N\Big] + (\tilde{N}-1)\tilde{N}^{-1}\, \mathbb{E}\Big[\mathbb{E}\big[\tilde{\upsilon}_N(I_{k+1}^1, J_{k+1}^{(1,1)}, \xi_{k+1}^1) \,\big|\, \mathcal{F}_k^N \vee \mathcal{G}_{k+1}^N\big]^2 \,\Big|\, \mathcal{F}_k^N\Big]. \tag{30}$$
The first term of (30) is given by
$$\mathbb{E}\Big[\mathbb{E}\big[\big(\tilde{\upsilon}_N(I_{k+1}^1, J_{k+1}^{(1,1)}, \xi_{k+1}^1)\big)^2 \,\big|\, \mathcal{F}_k^N \vee \mathcal{G}_{k+1}^N\big] \,\Big|\, \mathcal{F}_k^N\Big] = \mathbb{E}\Bigg[\sum_{\ell=1}^{N} \frac{\omega_k^\ell r_k(\xi_k^\ell, \xi_{k+1}^1)}{\sum_{m=1}^{N} \omega_k^m r_k(\xi_k^m, \xi_{k+1}^1)}\, \tilde{\upsilon}_N\big(I_{k+1}^1, \ell, \xi_{k+1}^1\big)^2 \,\Bigg|\, \mathcal{F}_k^N\Bigg]$$
$$= \sum_{j=1}^{N} \int \frac{\omega_k^j \vartheta_k(\xi_k^j) p_k(\xi_k^j, x)}{\sum_{m=1}^{N} \omega_k^m \vartheta_k(\xi_k^m)} \sum_{\ell=1}^{N} \frac{\omega_k^\ell r_k(\xi_k^\ell, x)}{\sum_{m=1}^{N} \omega_k^m r_k(\xi_k^m, x)}\, \tilde{\upsilon}_N(j, \ell, x)^2\, \mu(\mathrm{d}x)$$
$$= \big(\phi_k^N[\vartheta_k]\big)^{-1} \int A_N(x) B_N(x)\, \mu(\mathrm{d}x) - \Bigg(\big(\phi_k^N[\vartheta_k]\big)^{-1} \sum_{\ell=1}^{N} \frac{\omega_k^\ell}{\Omega_k} \Big\{\tau_k^\ell L_k f_{k+1}(\xi_k^\ell) + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})(\xi_k^\ell)\Big\}\Bigg)^2,$$
where
$$A_N(x) = \sum_{j=1}^{N} \frac{\omega_k^j}{\Omega_k}\, \vartheta_k(\xi_k^j) p_k(\xi_k^j, x) \left(\frac{r_k(\xi_k^j, x)}{\vartheta_k(\xi_k^j) p_k(\xi_k^j, x)}\right)^2 = \sum_{j=1}^{N} \frac{\omega_k^j}{\Omega_k}\, r_k(\xi_k^j, x)\, \bar{\omega}_k(\xi_k^j, x),$$
$$B_N(x) = \sum_{\ell=1}^{N} \frac{\omega_k^\ell r_k(\xi_k^\ell, x)}{\sum_{m=1}^{N} \omega_k^m r_k(\xi_k^m, x)} \Big\{\big(\tau_k^\ell + \tilde{h}_k(\xi_k^\ell, x)\big) f_{k+1}(x) + \tilde{f}_{k+1}(x)\Big\}^2.$$
By Lemma 4.1,
$$\Bigg(\big(\phi_k^N[\vartheta_k]\big)^{-1} \sum_{\ell=1}^{N} \frac{\omega_k^\ell}{\Omega_k} \Big\{\tau_k^\ell L_k f_{k+1}(\xi_k^\ell) + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})(\xi_k^\ell)\Big\}\Bigg)^2 \xrightarrow[N\to\infty]{\mathbb{P}} \Big(\big(\phi_k[\vartheta_k]\big)^{-1} \phi_k\big[T_k h_k L_k f_{k+1} + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})\big]\Big)^2 = 0,$$
since by assumption and [26, Lemma 11], $\phi_k[T_k h_k L_k f_{k+1} + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})] = 0$. Then, write
$$\int A_N(x) B_N(x)\, \mu(\mathrm{d}x) = \tilde{a}_N^1 + \tilde{a}_N^2 + \tilde{a}_N^3,$$
where
$$\varphi_N : x \mapsto \sum_{j=1}^{N} \frac{\omega_k^j}{\Omega_k}\, r_k(\xi_k^j, x)\, \bar{\omega}_k(\xi_k^j, x) \left(\sum_{m=1}^{N} \frac{\omega_k^m}{\Omega_k}\, r_k(\xi_k^m, x)\right)^{-1} = \overleftarrow{Q}_{\phi_k^N} \bar{\omega}_k(x),$$
$$\tilde{a}_N^1 = \sum_{\ell=1}^{N} \frac{\omega_k^\ell}{\Omega_k} (\tau_k^\ell)^2 \int r_k(\xi_k^\ell, x)\, f_{k+1}^2(x)\, \varphi_N(x)\, \mu(\mathrm{d}x),$$
$$\tilde{a}_N^2 = \sum_{j=1}^{N} \frac{\omega_k^j}{\Omega_k} \int r_k(\xi_k^j, x)\, \bar{\omega}_k(\xi_k^j, x) \sum_{\ell=1}^{N} \frac{\omega_k^\ell r_k(\xi_k^\ell, x)}{\sum_{m=1}^{N} \omega_k^m r_k(\xi_k^m, x)} \big(\tilde{h}_k(\xi_k^\ell, x) f_{k+1}(x) + \tilde{f}_{k+1}(x)\big)^2\, \mu(\mathrm{d}x),$$
$$\tilde{a}_N^3 = 2 \sum_{\ell=1}^{N} \frac{\omega_k^\ell}{\Omega_k}\, \tau_k^\ell \int r_k(\xi_k^\ell, x)\, \varphi_N(x)\, f_{k+1}(x) \big(\tilde{h}_k(\xi_k^\ell, x) f_{k+1}(x) + \tilde{f}_{k+1}(x)\big)\, \mu(\mathrm{d}x).$$
By Lemma 4.1, for all $x \in \mathsf{X}$, $|\varphi_N(x) - \overleftarrow{Q}_{\phi_k} \bar{\omega}_k(x)| \xrightarrow[N\to\infty]{\mathbb{P}\text{-a.s.}} 0$, and note that
$$\left|\tilde{a}_N^1 - \sum_{\ell=1}^{N} \frac{\omega_k^\ell}{\Omega_k} (\tau_k^\ell)^2 \int r_k(\xi_k^\ell, x)\, f_{k+1}^2(x)\, \overleftarrow{Q}_{\phi_k} \bar{\omega}_k(x)\, \mu(\mathrm{d}x)\right| \leq \|h_k\|_\infty^2 \sum_{\ell=1}^{N} \frac{\omega_k^\ell}{\Omega_k} \int r_k(\xi_k^\ell, x)\, f_{k+1}^2(x)\, \big|\varphi_N(x) - \overleftarrow{Q}_{\phi_k} \bar{\omega}_k(x)\big|\, \mu(\mathrm{d}x).$$
Since $\|f_{k+1}^2 |\varphi_N - \overleftarrow{Q}_{\phi_k} \bar{\omega}_k|\|_\infty \leq 2 \|\bar{\omega}_k\|_\infty \|f_{k+1}\|_\infty^2 < \infty$, by the generalized Lebesgue dominated convergence theorem (see also Lemma A.2), this upper bound vanishes in probability, and, by the induction hypothesis,
$$\tilde{a}_N^1 \xrightarrow[N\to\infty]{\mathbb{P}} \eta_k\big[L_k\{f_{k+1}^2\, \overleftarrow{Q}_{\phi_k} \bar{\omega}_k\}\big] + \phi_k\big[(T_k h_k)^2 L_k\{f_{k+1}^2\, \overleftarrow{Q}_{\phi_k} \bar{\omega}_k\}\big].$$
On the other hand, by Lemma A.2 applied to
$$\psi_N : x \mapsto \sum_{\ell=1}^{N} \frac{\omega_k^\ell r_k(\xi_k^\ell, x)}{\sum_{m=1}^{N} \omega_k^m r_k(\xi_k^m, x)} \big(\tilde{h}_k(\xi_k^\ell, x) f_{k+1}(x) + \tilde{f}_{k+1}(x)\big)^2,$$
which is such that $\|\psi_N\|_\infty \leq (\|\tilde{h}_k f_{k+1}\|_\infty + \|\tilde{f}_{k+1}\|_\infty)^2$,
$$\tilde{a}_N^2 \xrightarrow[N\to\infty]{\mathbb{P}} \int \phi_k\big[r_k(\cdot,x)\, \bar{\omega}_k(\cdot,x)\big]\, \phi_k\big[r_k(\cdot,x) \big(\tilde{h}_k(\cdot,x) f_{k+1}(x) + \tilde{f}_{k+1}(x)\big)^2\big] \big(\phi_k[r_k(\cdot,x)]\big)^{-1}\, \mu(\mathrm{d}x),$$
which yields
$$\tilde{a}_N^2 \xrightarrow[N\to\infty]{\mathbb{P}} \phi_k\Big[L_k\Big\{\big(\overleftarrow{Q}_{\phi_k} \bar{\omega}_k\big) \big(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1}\big)^2\Big\}\Big].$$
Similarly,
$$\left|\tilde{a}_N^3 - 2\sum_{\ell=1}^{N} \frac{\omega_k^\ell}{\Omega_k}\, \tau_k^\ell \int r_k(\xi_k^\ell, x)\, f_{k+1}(x)\, \overleftarrow{Q}_{\phi_k} \bar{\omega}_k(x) \big(\tilde{h}_k(\xi_k^\ell, x) f_{k+1}(x) + \tilde{f}_{k+1}(x)\big)\, \mu(\mathrm{d}x)\right| \leq 2\|h_k\|_\infty \big(\|f_{k+1}\|_\infty \|\tilde{h}_k\|_\infty + \|\tilde{f}_{k+1}\|_\infty\big) \sum_{\ell=1}^{N} \frac{\omega_k^\ell}{\Omega_k} \int r_k(\xi_k^\ell, x)\, f_{k+1}(x)\, \big|\varphi_N(x) - \overleftarrow{Q}_{\phi_k} \bar{\omega}_k(x)\big|\, \mu(\mathrm{d}x),$$
so that, using again Lemma A.2 and Lemma 4.1,
$$\tilde{a}_N^3 \xrightarrow[N\to\infty]{\mathbb{P}} 2\phi_k\Big[T_k h_k L_k\Big\{\big(\overleftarrow{Q}_{\phi_k} \bar{\omega}_k\big) f_{k+1} \big(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1}\big)\Big\}\Big].$$
Therefore, the first term of (30) satisfies
$$\tilde{N}^{-1}\, \mathbb{E}\Big[\mathbb{E}\big[\big(\tilde{\upsilon}_N(I_{k+1}^1, J_{k+1}^{(1,1)}, \xi_{k+1}^1)\big)^2 \,\big|\, \mathcal{F}_k^N \vee \mathcal{G}_{k+1}^N\big] \,\Big|\, \mathcal{F}_k^N\Big] \xrightarrow[N\to\infty]{\mathbb{P}} \frac{\eta_k\big[L_k\{f_{k+1}^2\, \overleftarrow{Q}_{\phi_k} \bar{\omega}_k\}\big]}{\tilde{N} \phi_k[\vartheta_k]} + \frac{\phi_k\Big[L_k\Big\{\big(\overleftarrow{Q}_{\phi_k} \bar{\omega}_k\big) \big[(T_k h_k + \tilde{h}_k) f_{k+1} + \tilde{f}_{k+1}\big]^2\Big\}\Big]}{\tilde{N} \phi_k[\vartheta_k]},$$
which concludes the proof for the first term of (30). The second term of (30) is given by
$$\mathbb{E}\Big[\mathbb{E}\big[\tilde{\upsilon}_N(I_{k+1}^1, J_{k+1}^{(1,1)}, \xi_{k+1}^1) \,\big|\, \mathcal{F}_k^N \vee \mathcal{G}_{k+1}^N\big]^2 \,\Big|\, \mathcal{F}_k^N\Big] = \big(\phi_k^N[\vartheta_k]\big)^{-1} \phi_k^N[\vartheta_k \varphi_k^N],$$
where, for all $x \in \mathsf{X}$,
$$\varphi_k^N(x) = \int p_k(x, z) \Bigg(\frac{r_k(x, z)}{\vartheta_k(x) p_k(x, z)}\, f_{k+1}(z) \sum_{\ell=1}^{N} \frac{\omega_k^\ell r_k(\xi_k^\ell, z)}{\sum_{m=1}^{N} \omega_k^m r_k(\xi_k^m, z)} \big(\tau_k^\ell + \tilde{h}_k(\xi_k^\ell, z)\big) + \frac{r_k(x, z)}{\vartheta_k(x) p_k(x, z)}\, \tilde{f}_{k+1}(z) - \big(\phi_k^N[\vartheta_k]\big)^{-1} \sum_{\ell=1}^{N} \frac{\omega_k^\ell}{\Omega_k} \Big\{\tau_k^\ell L_k f_{k+1}(\xi_k^\ell) + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})(\xi_k^\ell)\Big\}\Bigg)^2 \mu(\mathrm{d}z).$$
By assumption and [26, Lemma 11], $\phi_k[T_k h_k L_k f_{k+1} + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})] = 0$ so that, by Lemma A.2, $\phi_k^N[\vartheta_k \varphi_k^N] \xrightarrow[N\to\infty]{\mathbb{P}} \phi_k[\vartheta_k \varphi_k]$, where
$$\varphi_k(x) = \int p_k(x, z) \left(\frac{r_k(x, z)}{\vartheta_k(x) p_k(x, z)}\right)^2 \Big(f_{k+1}(z)\, \overleftarrow{Q}_{\phi_k}\big(T_k h_k + \tilde{h}_k\big)(z) + \tilde{f}_{k+1}(z)\Big)^2 \mu(\mathrm{d}z) = \int p_k(x, z) \left(\frac{r_k(x, z)}{\vartheta_k(x) p_k(x, z)}\right)^2 \Big(f_{k+1}(z)\, T_{k+1} h_{k+1}(z) + \tilde{f}_{k+1}(z)\Big)^2 \mu(\mathrm{d}z).$$
Therefore,
$$(\tilde{N}-1)\tilde{N}^{-1}\, \mathbb{E}\Big[\mathbb{E}\big[\tilde{\upsilon}_N(I_{k+1}^1, J_{k+1}^{(1,1)}, \xi_{k+1}^1) \,\big|\, \mathcal{F}_k^N \vee \mathcal{G}_{k+1}^N\big]^2 \,\Big|\, \mathcal{F}_k^N\Big] \xrightarrow[N\to\infty]{\mathbb{P}} \big(1 - \tilde{N}^{-1}\big) \big(\phi_k[\vartheta_k]\big)^{-1} \phi_k\Big[L_k\Big\{\bar{\omega}_k \big(f_{k+1} T_{k+1} h_{k+1} + \bar{f}_{k+1}\big)^2\Big\}\Big].$$
The proof of (ii) is an immediate consequence of H3 since, for all $1 \leq i \leq N$,
$$|\upsilon_N^i| \leq \|\bar{\omega}_{k+1}\|_\infty \big(\|h_{k+1}\|_\infty \|f_{k+1}\|_\infty + \|\tilde{f}_{k+1}\|_\infty\big) N^{-1/2}.$$
Then, defining $c_k = 2\|\bar{\omega}_{k+1}\|_\infty (\|h_{k+1}\|_\infty \|f_{k+1}\|_\infty + \|\tilde{f}_{k+1}\|_\infty)$, for all $\varepsilon > 0$,
$$\sum_{i=1}^{N} \mathbb{E}\big[(\upsilon_N^i)^2 \mathbb{1}_{\{|\upsilon_N^i| \geq \varepsilon\}} \,\big|\, \mathcal{F}_k^N\big] \leq c_k^2\, \mathbb{1}_{\{c_k \geq \varepsilon \sqrt{N}\}} \xrightarrow[N\to\infty]{\mathbb{P}} 0.$$
Writing $\bar{f}_{k+1} = \tilde{f}_{k+1} - \phi_{k+1}[T_{k+1} h_{k+1} f_{k+1} + \tilde{f}_{k+1}]$ yields
$$\sigma_{k+1}^2\big\langle f_{k+1}; \tilde{f}_{k+1}\big\rangle = \frac{\sigma_k^2\big\langle L_k f_{k+1};\, L_k\{\tilde{h}_k f_{k+1} + \bar{f}_{k+1}\}\big\rangle}{\phi_k[L_k]^2} + \frac{\phi_k[\vartheta_k]\, \eta_k\big[L_k\{f_{k+1}^2\, \overleftarrow{Q}_{\phi_k} \bar{\omega}_k\}\big]}{\tilde{N} \phi_k[L_k]^2} + \frac{\phi_k[\vartheta_k]\, \phi_k\Big[L_k\Big\{\bar{\omega}_k \big(f_{k+1} T_{k+1} h_{k+1} + \bar{f}_{k+1}\big)^2\Big\}\Big]}{\phi_k[L_k]^2} + \frac{\phi_k[\vartheta_k]\, \phi_k\Big[L_k\Big\{\bar{\omega}_k f_{k+1}^2\, \overleftarrow{Q}_{\phi_k}\big((T_k h_k + \tilde{h}_k - T_{k+1} h_{k+1})^2\big)\Big\}\Big]}{\tilde{N} \phi_k[L_k]^2},$$
and then, by (19),
$$\sigma_{k+1}^2\big\langle f_{k+1}; \tilde{f}_{k+1}\big\rangle = \frac{\sigma_k^2\big\langle L_k f_{k+1};\, L_k\{\tilde{h}_k f_{k+1} + \bar{f}_{k+1}\}\big\rangle}{\phi_k[L_k]^2} + \frac{\phi_k[\vartheta_k]\, \phi_k\Big[L_k\Big\{\bar{\omega}_k \big(f_{k+1} T_{k+1} h_{k+1} + \bar{f}_{k+1}\big)^2\Big\}\Big]}{\phi_k[L_k]^2} + \frac{\phi_k[\vartheta_k]}{\phi_k[L_k]^2} \sum_{\ell=0}^{k} \frac{\phi_\ell\Big[L_\ell\Big\{\overleftarrow{Q}_{\phi_\ell}\big((T_\ell h_\ell + \tilde{h}_\ell - T_{\ell+1} h_{\ell+1})^2\big)\, L_{\ell+1} \cdots L_k\{f_{k+1}^2\, \overleftarrow{Q}_{\phi_k} \bar{\omega}_k\}\Big\}\Big]}{\tilde{N}^{k+1-\ell}\, \phi_\ell[L_\ell \cdots L_{k-1}]}.$$
By definition of the kernel $\tilde{D}_{k+1,k+1}$,
$$\phi_k\Big[L_k\Big\{\bar{\omega}_k \big(f_{k+1} T_{k+1} h_{k+1} + \bar{f}_{k+1}\big)^2\Big\}\Big] = \phi_k\Big[L_k\Big\{\bar{\omega}_k \big(\tilde{D}_{k+1,k+1}\{h_{k+1} f_{k+1} + \tilde{f}_{k+1}\}\big)^2\Big\}\Big].$$
It remains to derive the explicit expression of $\sigma_{k+1}^2\langle f_{k+1}; \tilde{f}_{k+1}\rangle$ from this recursion formula. First, following the proof of [26, Theorem 3], for all $0 \leq s < k$,
$$\tilde{D}_{s+1,k}\big(L_k f_{k+1} + L_k\{\tilde{h}_k f_{k+1} + \bar{f}_{k+1}\}\big) = \tilde{D}_{s+1,k+1}\big(h_{k+1} f_{k+1} + \tilde{f}_{k+1}\big),$$
and, for all $0 \leq s < k$,
$$\phi_k[L_k] = \frac{\phi_s[L_s \cdots L_k]}{\phi_s[L_s \cdots L_{k-1}]},$$
which concludes the proof.

C Convergence results for pseudo-marginal PaRIS algorithms
Lemma C.1.
Assume that H1 and H2 hold. Then, for all $0 \leq k \leq n-1$, $(f_{k+1}, \tilde{f}_{k+1}) \in \mathsf{F}(\mathsf{X})^2$ and $N, \tilde{N} \geq 1$, the random variables $\{\hat{\omega}_{k+1}^i (\hat{\tau}_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i))\}_{i=1}^N$ are i.i.d. conditionally on $\widetilde{\mathcal{F}}_k^N$ with
$$\mathbb{E}\Big[\hat{\omega}_{k+1}^i \big(\hat{\tau}_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i)\big) \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big] = \big(\hat{\phi}_k^N[\vartheta_k]\big)^{-1} \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell}{\hat{\Omega}_k} \Big\{\hat{\tau}_k^\ell L_k f_{k+1}(\xi_k^\ell) + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})(\xi_k^\ell)\Big\}.$$

Proof.
The proof follows the same lines as [21, Lemma 2]. Note first that
$$\mathbb{E}\big[\hat{\omega}_{k+1}^1 \,\big|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\big] = \bar{\omega}_{k+1}\big(\xi_k^{\hat{I}_{k+1}^1}, \xi_{k+1}^1\big), \qquad \mathbb{E}\big[\hat{\tau}_{k+1}^1 \,\big|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\big] = \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell r_k(\xi_k^\ell, \xi_{k+1}^1)}{\sum_{m=1}^{N} \hat{\omega}_k^m r_k(\xi_k^m, \xi_{k+1}^1)} \big(\hat{\tau}_k^\ell + \tilde{h}_k(\xi_k^\ell, \xi_{k+1}^1)\big),$$
where $\bar{\omega}_k$ is defined by (18) in H3. Then, since conditionally on $\widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N$, $\hat{\tau}_{k+1}^1$ is independent of $\hat{\omega}_{k+1}^1$,
$$\mathbb{E}\big[\hat{\omega}_{k+1}^1 (\hat{\tau}_{k+1}^1 f_{k+1}(\xi_{k+1}^1) + \tilde{f}_{k+1}(\xi_{k+1}^1)) \,\big|\, \widetilde{\mathcal{F}}_k^N\big] = \mathbb{E}\Big[\mathbb{E}\big[\hat{\omega}_{k+1}^1 \,\big|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\big]\, \mathbb{E}\big[\hat{\tau}_{k+1}^1 \,\big|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\big]\, f_{k+1}(\xi_{k+1}^1) \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big] + \mathbb{E}\Big[\mathbb{E}\big[\hat{\omega}_{k+1}^1 \,\big|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\big]\, \tilde{f}_{k+1}(\xi_{k+1}^1) \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big]$$
$$= \mathbb{E}\Bigg[\bar{\omega}_{k+1}\big(\xi_k^{\hat{I}_{k+1}^1}, \xi_{k+1}^1\big) \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell r_k(\xi_k^\ell, \xi_{k+1}^1)}{\sum_{m=1}^{N} \hat{\omega}_k^m r_k(\xi_k^m, \xi_{k+1}^1)} \big(\hat{\tau}_k^\ell + \tilde{h}_k(\xi_k^\ell, \xi_{k+1}^1)\big)\, f_{k+1}(\xi_{k+1}^1) \,\Bigg|\, \widetilde{\mathcal{F}}_k^N\Bigg] + \mathbb{E}\Big[\bar{\omega}_{k+1}\big(\xi_k^{\hat{I}_{k+1}^1}, \xi_{k+1}^1\big)\, \tilde{f}_{k+1}(\xi_{k+1}^1) \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big]$$
$$= \big(\hat{\phi}_k^N[\vartheta_k]\big)^{-1} \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell}{\hat{\Omega}_k} \Big\{\hat{\tau}_k^\ell L_k f_{k+1}(\xi_k^\ell) + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})(\xi_k^\ell)\Big\},$$
which concludes the proof.
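The pseudo-marginal substitution behind this section replaces every evaluation of an intractable transition density by a positive unbiased estimator. The following minimal Python sketch shows such an estimator for a density given by an intractable integral; the Gaussian toy model, the number of inner draws `m`, and all names are illustrative assumptions, not the estimators used in the paper.

```python
import numpy as np

rng = np.random.default_rng(4)

def q_hat(x, y, m=8):
    """Unbiased Monte Carlo estimate of q(x, y) = int phi(y - x - u) phi(u) du,
    where phi is the standard normal density (toy intractable transition)."""
    u = rng.standard_normal(m)
    return np.mean(np.exp(-0.5 * (y - x - u) ** 2) / np.sqrt(2.0 * np.pi))

x_prev, y_new = 0.3, 1.1
draws = np.array([q_hat(x_prev, y_new) for _ in range(20000)])
exact = np.exp(-0.25 * (y_new - x_prev) ** 2) / np.sqrt(4.0 * np.pi)  # N(0, 2) density
print("mean of q_hat:", draws.mean(), "  exact q(x, y):", exact)
```

Each estimate remains positive and unbiased, which is what Lemma C.1 exploits: taking conditional expectations of the estimated weights recovers the same identities as in the exact PaRIS recursion.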
C.1 Proof of Lemma 4.3

Proof.
The proof proceeds by induction and follows the same lines as [26, Lemma 13]. The case $k = 0$ is a direct consequence of the fact that $T_0 h_0 = 0$ and $\hat{\tau}_0^i = 0$ for all $1 \leq i \leq N$. Assume that the result holds for some $0 \leq k \leq n-1$ and write
$$\sum_{i=1}^{N} \frac{\hat{\omega}_{k+1}^i}{\hat{\Omega}_{k+1}} (\hat{\tau}_{k+1}^i)^2 f_{k+1}(\xi_{k+1}^i) = \frac{a_N}{b_N}, \quad \text{where} \quad a_N = \frac{1}{N} \sum_{i=1}^{N} \hat{\omega}_{k+1}^i (\hat{\tau}_{k+1}^i)^2 f_{k+1}(\xi_{k+1}^i) \quad \text{and} \quad b_N = \frac{1}{N} \sum_{i=1}^{N} \hat{\omega}_{k+1}^i.$$
The random variables $(\hat{\omega}_{k+1}^i)_{1 \leq i \leq N}$ are i.i.d. conditionally on $\widetilde{\mathcal{F}}_k^N$ with
$$\mathbb{E}\big[\hat{\omega}_{k+1}^1 \,\big|\, \widetilde{\mathcal{F}}_k^N\big] = \mathbb{E}\Big[\mathbb{E}\big[\hat{\omega}_{k+1}^1 \,\big|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\big] \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big] = \mathbb{E}\Big[\bar{\omega}_{k+1}\big(\xi_k^{\hat{I}_{k+1}^1}, \xi_{k+1}^1\big) \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big],$$
where $\bar{\omega}_k$ is defined by (18) in H3. Noting that, by H4, $|\hat{\omega}_{k+1}^i| \leq \|\hat{\omega}_{k+1}\|_\infty$ for all $1 \leq i \leq N$ and
$$\mathbb{E}\Big[\bar{\omega}_{k+1}\big(\xi_k^{\hat{I}_{k+1}^1}, \xi_{k+1}^1\big) \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big] = \frac{\sum_{i=1}^{N} \hat{\omega}_k^i L_k(\xi_k^i)}{\sum_{i=1}^{N} \hat{\omega}_k^i \vartheta_k(\xi_k^i)} = \frac{\hat{\phi}_k^N[L_k]}{\hat{\phi}_k^N[\vartheta_k]},$$
by Hoeffding's inequality, there exist positive constants $c_k$ and $\tilde{c}_k$ such that
$$\mathbb{P}\Bigg(\Bigg|b_N - \frac{\hat{\phi}_k^N[L_k]}{\hat{\phi}_k^N[\vartheta_k]}\Bigg| > \varepsilon\Bigg) \leq c_k \mathrm{e}^{-\tilde{c}_k N \varepsilon^2}.$$
Therefore, by Proposition 4.2 and Lemma A.1,
$$\mathbb{P}\left(\left|b_N - \frac{\phi_k[L_k]}{\phi_k[\vartheta_k]}\right| > \varepsilon\right) \leq c_k \mathrm{e}^{-\tilde{c}_k N \varepsilon^2},$$
so that
$$b_N \xrightarrow[N\to\infty]{\mathbb{P}\text{-a.s.}} \frac{\phi_k[L_k]}{\phi_k[\vartheta_k]}.$$
Since $\phi_k[L_k] > 0$, it is enough to study the limit of $(a_N)_{N \geq 1}$. On the other hand, by Hoeffding's inequality, using that $|\hat{\omega}_{k+1}^i (\hat{\tau}_{k+1}^i)^2 f_{k+1}(\xi_{k+1}^i)| \leq \|\hat{\omega}_{k+1}\|_\infty \|h_{k+1}\|_\infty^2 \|f_{k+1}\|_\infty$ for all $1 \leq i \leq N$,
$$\mathbb{P}\big(\big|a_N - \mathbb{E}[a_N \mid \widetilde{\mathcal{F}}_k^N]\big| > \varepsilon\big) \leq c_k \mathrm{e}^{-\tilde{c}_k N \varepsilon^2}.$$
Then, write
$$\mathbb{E}[a_N \mid \widetilde{\mathcal{F}}_k^N] = \mathbb{E}\big[\hat{\omega}_{k+1}^1 (\hat{\tau}_{k+1}^1)^2 f_{k+1}(\xi_{k+1}^1) \,\big|\, \widetilde{\mathcal{F}}_k^N\big] = \mathbb{E}\Big[\bar{\omega}_{k+1}\big(\xi_k^{\hat{I}_{k+1}^1}, \xi_{k+1}^1\big)\, \mathbb{E}\big[(\hat{\tau}_{k+1}^1)^2 \,\big|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\big]\, f_{k+1}(\xi_{k+1}^1) \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big] = \tilde{a}_N^1 + \tilde{a}_N^2,$$
where
$$\tilde{a}_N^1 = \tilde{N}^{-1}\, \mathbb{E}\Bigg[\bar{\omega}_{k+1}\big(\xi_k^{\hat{I}_{k+1}^1}, \xi_{k+1}^1\big)\, f_{k+1}(\xi_{k+1}^1)\, \mathbb{E}\bigg[\Big(\hat{\tau}_k^{\hat{J}_{k+1}^{(1,1)}} + \tilde{h}_k\big(\xi_k^{\hat{J}_{k+1}^{(1,1)}}, \xi_{k+1}^1\big)\Big)^2 \,\bigg|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\bigg] \,\Bigg|\, \widetilde{\mathcal{F}}_k^N\Bigg],$$
$$\tilde{a}_N^2 = (\tilde{N}-1)\tilde{N}^{-1}\, \mathbb{E}\Bigg[\bar{\omega}_{k+1}\big(\xi_k^{\hat{I}_{k+1}^1}, \xi_{k+1}^1\big)\, f_{k+1}(\xi_{k+1}^1)\, \mathbb{E}\bigg[\hat{\tau}_k^{\hat{J}_{k+1}^{(1,1)}} + \tilde{h}_k\big(\xi_k^{\hat{J}_{k+1}^{(1,1)}}, \xi_{k+1}^1\big) \,\bigg|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\bigg]^2 \,\Bigg|\, \widetilde{\mathcal{F}}_k^N\Bigg].$$
By Lemma 3.1, the first term is given by
$$\tilde{a}_N^1 = \tilde{N}^{-1} \sum_{j=1}^{N} \int \frac{\hat{\omega}_k^j \vartheta_k(\xi_k^j) p_k(\xi_k^j, x)}{\sum_{m=1}^{N} \hat{\omega}_k^m \vartheta_k(\xi_k^m)} \, \frac{r_k(\xi_k^j, x)}{\vartheta_k(\xi_k^j) p_k(\xi_k^j, x)} \, f_{k+1}(x) \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell r_k(\xi_k^\ell, x)}{\sum_{m=1}^{N} \hat{\omega}_k^m r_k(\xi_k^m, x)} \big(\hat{\tau}_k^\ell + \tilde{h}_k(\xi_k^\ell, x)\big)^2 \mu(\mathrm{d}x)$$
$$= \tilde{N}^{-1} \big(\hat{\phi}_k^N[\vartheta_k]\big)^{-1} \int f_{k+1}(x) \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell}{\hat{\Omega}_k}\, r_k(\xi_k^\ell, x) \big(\hat{\tau}_k^\ell + \tilde{h}_k(\xi_k^\ell, x)\big)^2 \mu(\mathrm{d}x).$$
By the induction hypothesis and Proposition 4.2,
$$\tilde{a}_N^1 \xrightarrow[N\to\infty]{\mathbb{P}} \big(\tilde{N} \phi_k[\vartheta_k]\big)^{-1} \Big\{\eta_k[L_k f_{k+1}] + \phi_k\big[(T_k h_k)^2 L_k f_{k+1}\big] + \phi_k\big[L_k(f_{k+1} \tilde{h}_k^2)\big] + 2\phi_k\big[T_k h_k L_k(f_{k+1} \tilde{h}_k)\big]\Big\},$$
which yields
$$\tilde{a}_N^1 \xrightarrow[N\to\infty]{\mathbb{P}} \big(\tilde{N} \phi_k[\vartheta_k]\big)^{-1} \Big\{\eta_k[L_k f_{k+1}] + \phi_k\big[L_k\{(T_k h_k + \tilde{h}_k)^2 f_{k+1}\}\big]\Big\}.$$
The second term is given by
$$\tilde{a}_N^2 = (\tilde{N}-1)\tilde{N}^{-1} \sum_{j=1}^{N} \int \frac{\hat{\omega}_k^j \vartheta_k(\xi_k^j) p_k(\xi_k^j, x)}{\sum_{m=1}^{N} \hat{\omega}_k^m \vartheta_k(\xi_k^m)} \, \frac{r_k(\xi_k^j, x)}{\vartheta_k(\xi_k^j) p_k(\xi_k^j, x)} \, f_{k+1}(x) \left(\sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell r_k(\xi_k^\ell, x)}{\sum_{m=1}^{N} \hat{\omega}_k^m r_k(\xi_k^m, x)} \big\{\hat{\tau}_k^\ell + \tilde{h}_k(\xi_k^\ell, x)\big\}\right)^2 \mu(\mathrm{d}x)$$
$$= (\tilde{N}-1)\tilde{N}^{-1} \big(\hat{\phi}_k^N[\vartheta_k]\big)^{-1} \hat{\phi}_k^N[L_k \varphi_N],$$
with, for all $x \in \mathsf{X}$,
$$\varphi_N(x) = f_{k+1}(x) \left(\sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell r_k(\xi_k^\ell, x)}{\sum_{m=1}^{N} \hat{\omega}_k^m r_k(\xi_k^m, x)} \big\{\hat{\tau}_k^\ell + \tilde{h}_k(\xi_k^\ell, x)\big\}\right)^2.$$
For all $x \in \mathsf{X}$, by Proposition 4.2,
$$\varphi_N(x) \xrightarrow[N\to\infty]{\mathbb{P}\text{-a.s.}} f_{k+1}(x) \left(\frac{\phi_k\big[T_k h_k\, r_k(\cdot,x) + r_k(\cdot,x) \tilde{h}_k(\cdot,x)\big]}{\phi_k[r_k(\cdot,x)]}\right)^2.$$
Since $\|\varphi_N\|_\infty \leq \|f_{k+1}\|_\infty \|h_{k+1}\|_\infty^2$, by the generalized Lebesgue dominated convergence theorem (see Lemma A.2),
$$\tilde{a}_N^2 \xrightarrow[N\to\infty]{\mathbb{P}} (\tilde{N}-1)\tilde{N}^{-1} \big(\phi_k[\vartheta_k]\big)^{-1} \phi_k\big[L_k\{f_{k+1} (T_{k+1} h_{k+1})^2\}\big].$$
This concludes the proof following the same steps as in the proof of Lemma 4.2.
C.2 Proof of Proposition 4.3
Proof.
The result is proved by induction on $k$. It holds for $k = 0$ as $\hat{\tau}_0^i = 0$ for all $1 \leq i \leq N$. Assume now that the result holds for some $0 \leq k \leq n-1$ and let $(f_{k+1}, \tilde{f}_{k+1})$ be such that $\phi_{k+1}[T_{k+1} h_{k+1} f_{k+1} + \tilde{f}_{k+1}] = 0$. Write
$$\sqrt{N} \sum_{i=1}^{N} \frac{\hat{\omega}_{k+1}^i}{\hat{\Omega}_{k+1}} \big\{\hat{\tau}_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i)\big\} = N \hat{\Omega}_{k+1}^{-1} \Delta_{k+1}^N,$$
where $\Delta_{k+1}^N = N^{-1/2} \sum_{i=1}^{N} \hat{\omega}_{k+1}^i \{\hat{\tau}_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i)\}$ is decomposed as $\Delta_{k+1}^N = \Delta_{k+1,1}^N + \Delta_{k+1,2}^N$, where
$$\Delta_{k+1,1}^N = N^{-1/2} \sum_{i=1}^{N} \mathbb{E}\big[\hat{\omega}_{k+1}^i (\hat{\tau}_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i)) \,\big|\, \widetilde{\mathcal{F}}_k^N\big],$$
$$\Delta_{k+1,2}^N = N^{-1/2} \sum_{i=1}^{N} \Big\{\hat{\omega}_{k+1}^i \big(\hat{\tau}_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i)\big) - \mathbb{E}\big[\hat{\omega}_{k+1}^i (\hat{\tau}_{k+1}^i f_{k+1}(\xi_{k+1}^i) + \tilde{f}_{k+1}(\xi_{k+1}^i)) \,\big|\, \widetilde{\mathcal{F}}_k^N\big]\Big\}.$$
By Lemma C.1,
$$N \hat{\Omega}_{k+1}^{-1} \Delta_{k+1,1}^N = \frac{N}{\hat{\Omega}_{k+1}} \big(\hat{\phi}_k^N[\vartheta_k]\big)^{-1} \sqrt{N} \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell}{\hat{\Omega}_k} \Big\{\hat{\tau}_k^\ell L_k f_{k+1}(\xi_k^\ell) + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})(\xi_k^\ell)\Big\}.$$
As $\phi_{k+1}[T_{k+1} h_{k+1} f_{k+1} + \tilde{f}_{k+1}] = 0$,
$$\phi_k\big[T_k h_k L_k f_{k+1} + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})\big] = 0.$$
Therefore, using the induction hypothesis, Slutsky's lemma and $\frac{N}{\hat{\Omega}_{k+1}} (\hat{\phi}_k^N[\vartheta_k])^{-1} \xrightarrow[N\to\infty]{\mathbb{P}} (\phi_k[L_k])^{-1}$ yields
$$N \hat{\Omega}_{k+1}^{-1} \Delta_{k+1,1}^N \xrightarrow[N\to\infty]{\mathcal{D}} \frac{\bar{\sigma}_k\big\langle L_k f_{k+1};\, L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})\big\rangle}{\phi_k[L_k]}\, Z,$$
where $Z$ is a standard Gaussian random variable. By Lemma C.1,
$$N \hat{\Omega}_{k+1}^{-1} \Delta_{k+1,2}^N = \frac{N}{\hat{\Omega}_{k+1}} \sum_{i=1}^{N} \upsilon_N^i,$$
where, for all $1 \leq i, j \leq N$ and all $x \in \mathsf{X}$,
$$\upsilon_N^i = \frac{1}{\sqrt{N}\, \tilde{N}} \sum_{j=1}^{\tilde{N}} \tilde{\upsilon}_N\big(\hat{I}_{k+1}^i, \hat{J}_{k+1}^{(i,j)}, \xi_{k+1}^i\big),$$
$$\tilde{\upsilon}_N(i, j, x) = \frac{\hat{r}_k(\xi_k^i, x; \zeta_k^i)}{\vartheta_k(\xi_k^i) p_k(\xi_k^i, x)} \Big\{\big(\hat{\tau}_k^j + \tilde{h}_k(\xi_k^j, x)\big) f_{k+1}(x) + \tilde{f}_{k+1}(x)\Big\} - \big(\hat{\phi}_k^N[\vartheta_k]\big)^{-1} \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell}{\hat{\Omega}_k} \Big\{\hat{\tau}_k^\ell L_k f_{k+1}(\xi_k^\ell) + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})(\xi_k^\ell)\Big\}.$$
First,
$$\frac{N}{\hat{\Omega}_{k+1}} \xrightarrow[N\to\infty]{\mathbb{P}} \frac{\phi_k[\vartheta_k]}{\phi_k[L_k]}.$$
Then, by construction, $\mathbb{E}[\upsilon_N^i \mid \widetilde{\mathcal{F}}_k^N] = 0$ and
$$\sum_{i=1}^{N} \mathbb{E}\big[(\upsilon_N^i)^2 \,\big|\, \widetilde{\mathcal{F}}_k^N\big] = \tilde{N}^{-1}\, \mathbb{E}\Big[\mathbb{E}\big[\big(\tilde{\upsilon}_N(\hat{I}_{k+1}^1, \hat{J}_{k+1}^{(1,1)}, \xi_{k+1}^1)\big)^2 \,\big|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\big] \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big] + (\tilde{N}-1)\tilde{N}^{-1}\, \mathbb{E}\Big[\mathbb{E}\big[\tilde{\upsilon}_N(\hat{I}_{k+1}^1, \hat{J}_{k+1}^{(1,1)}, \xi_{k+1}^1) \,\big|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\big]^2 \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big]. \tag{31}$$
The first term of (31) is given by
$$\mathbb{E}\Big[\mathbb{E}\big[\big(\tilde{\upsilon}_N(\hat{I}_{k+1}^1, \hat{J}_{k+1}^{(1,1)}, \xi_{k+1}^1)\big)^2 \,\big|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\big] \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big] = \big(\hat{\phi}_k^N[\vartheta_k]\big)^{-1} \int A_N(x) B_N(x)\, \mu(\mathrm{d}x) - \Bigg(\big(\hat{\phi}_k^N[\vartheta_k]\big)^{-1} \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell}{\hat{\Omega}_k} \Big\{\hat{\tau}_k^\ell L_k f_{k+1}(\xi_k^\ell) + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})(\xi_k^\ell)\Big\}\Bigg)^2,$$
where, for all $(x, y) \in \mathsf{X} \times \mathsf{X}$,
$$\varpi_k(x, y) = \int \hat{r}_k(x, y; z)\, \hat{\omega}_{k+1}(x, y; z)\, R_k(x, y, \mathrm{d}z),$$
$$A_N(x) = \sum_{j=1}^{N} \frac{\hat{\omega}_k^j}{\hat{\Omega}_k}\, \frac{\int \hat{r}_k(\xi_k^j, x; u)^2\, R_k(\xi_k^j, x; \mathrm{d}u)}{\vartheta_k(\xi_k^j) p_k(\xi_k^j, x)} = \sum_{j=1}^{N} \frac{\hat{\omega}_k^j}{\hat{\Omega}_k}\, \varpi_k(\xi_k^j, x),$$
$$B_N(x) = \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell r_k(\xi_k^\ell, x)}{\sum_{m=1}^{N} \hat{\omega}_k^m r_k(\xi_k^m, x)} \Big\{\big(\hat{\tau}_k^\ell + \tilde{h}_k(\xi_k^\ell, x)\big) f_{k+1}(x) + \tilde{f}_{k+1}(x)\Big\}^2.$$
By Proposition 4.2,
$$\Bigg(\big(\hat{\phi}_k^N[\vartheta_k]\big)^{-1} \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell}{\hat{\Omega}_k} \Big\{\hat{\tau}_k^\ell L_k f_{k+1}(\xi_k^\ell) + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})(\xi_k^\ell)\Big\}\Bigg)^2 \xrightarrow[N\to\infty]{\mathbb{P}} \Big(\big(\phi_k[\vartheta_k]\big)^{-1} \phi_k\big[T_k h_k L_k f_{k+1} + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})\big]\Big)^2 = 0,$$
where, by assumption, $\phi_{k+1}[T_{k+1} h_{k+1} f_{k+1} + \tilde{f}_{k+1}] = 0$, so that $\phi_k[T_k h_k L_k f_{k+1} + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})] = 0$. Then, write
$$\int A_N(x) B_N(x)\, \mu(\mathrm{d}x) = \tilde{a}_N^1 + \tilde{a}_N^2 + \tilde{a}_N^3,$$
where
$$\varphi_N : x \mapsto \sum_{j=1}^{N} \frac{\hat{\omega}_k^j}{\hat{\Omega}_k}\, \varpi_k(\xi_k^j, x) \left(\sum_{m=1}^{N} \frac{\hat{\omega}_k^m}{\hat{\Omega}_k}\, r_k(\xi_k^m, x)\right)^{-1},$$
$$\tilde{a}_N^1 = \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell}{\hat{\Omega}_k} (\hat{\tau}_k^\ell)^2 \int r_k(\xi_k^\ell, x)\, f_{k+1}^2(x)\, \varphi_N(x)\, \mu(\mathrm{d}x),$$
$$\tilde{a}_N^2 = \sum_{j=1}^{N} \frac{\hat{\omega}_k^j}{\hat{\Omega}_k} \int \varpi_k(\xi_k^j, x) \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell r_k(\xi_k^\ell, x)}{\sum_{m=1}^{N} \hat{\omega}_k^m r_k(\xi_k^m, x)} \big(\tilde{h}_k(\xi_k^\ell, x) f_{k+1}(x) + \tilde{f}_{k+1}(x)\big)^2\, \mu(\mathrm{d}x),$$
$$\tilde{a}_N^3 = 2 \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell}{\hat{\Omega}_k}\, \hat{\tau}_k^\ell \int r_k(\xi_k^\ell, x)\, \varphi_N(x)\, f_{k+1}(x) \big(\tilde{h}_k(\xi_k^\ell, x) f_{k+1}(x) + \tilde{f}_{k+1}(x)\big)\, \mu(\mathrm{d}x).$$
Following the same steps as in the proof of Proposition 4.1,
$$\tilde{a}_N^1 \xrightarrow[N\to\infty]{\mathbb{P}} \eta_k\big[L_k\{f_{k+1}^2\, \hat{Q}_{\phi_k} \varpi_k\}\big] + \phi_k\big[(T_k h_k)^2 L_k\{f_{k+1}^2\, \hat{Q}_{\phi_k} \varpi_k\}\big],$$
$$\tilde{a}_N^2 \xrightarrow[N\to\infty]{\mathbb{P}} \int \phi_k[\varpi_k(\cdot, x)]\, \phi_k\big[r_k(\cdot,x) \big(\tilde{h}_k(\cdot,x) f_{k+1}(x) + \tilde{f}_{k+1}(x)\big)^2\big] \big(\phi_k[r_k(\cdot,x)]\big)^{-1}\, \mu(\mathrm{d}x),$$
$$\tilde{a}_N^3 \xrightarrow[N\to\infty]{\mathbb{P}} 2\phi_k\Big[T_k h_k L_k\Big\{\big(\hat{Q}_{\phi_k} \varpi_k\big) f_{k+1} \big(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1}\big)\Big\}\Big],$$
where $\hat{Q}_{\phi_k} \varpi_k : x \mapsto \phi_k[\varpi_k(\cdot, x)] / \phi_k[r_k(\cdot, x)]$. Therefore, the first term of (31) satisfies
$$\tilde{N}^{-1}\, \mathbb{E}\Big[\mathbb{E}\big[\big(\tilde{\upsilon}_N(\hat{I}_{k+1}^1, \hat{J}_{k+1}^{(1,1)}, \xi_{k+1}^1)\big)^2 \,\big|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\big] \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big] \xrightarrow[N\to\infty]{\mathbb{P}} \int \frac{\eta_k\big[f_{k+1}^2(x)\, r_k(\cdot, x)\big]\, \phi_k[\varpi_k(\cdot, x)]}{\tilde{N} \phi_k[\vartheta_k]\, \phi_k[r_k(\cdot, x)]}\, \mu(\mathrm{d}x) + \int \frac{\phi_k\Big[r_k(\cdot, x) \big\{\big((T_k h_k + \tilde{h}_k(\cdot, x)) f_{k+1}(x) + \tilde{f}_{k+1}(x)\big)^2\big\}\Big]\, \phi_k[\varpi_k(\cdot, x)]}{\tilde{N} \phi_k[\vartheta_k]\, \phi_k[r_k(\cdot, x)]}\, \mu(\mathrm{d}x),$$
which concludes the proof for the first term of (31). The second term of (31) is given by
$$\mathbb{E}\Big[\mathbb{E}\big[\tilde{\upsilon}_N(\hat{I}_{k+1}^1, \hat{J}_{k+1}^{(1,1)}, \xi_{k+1}^1) \,\big|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\big]^2 \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big] = \big(\hat{\phi}_k^N[\vartheta_k]\big)^{-1} \hat{\phi}_k^N[\vartheta_k \varphi_k^N],$$
where, for all $x \in \mathsf{X}$,
$$\varphi_k^N(x) = \int p_k(x, z) \Bigg(\frac{r_k(x, z)}{\vartheta_k(x) p_k(x, z)}\, f_{k+1}(z) \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell r_k(\xi_k^\ell, z)}{\sum_{m=1}^{N} \hat{\omega}_k^m r_k(\xi_k^m, z)} \big(\hat{\tau}_k^\ell + \tilde{h}_k(\xi_k^\ell, z)\big) + \frac{r_k(x, z)}{\vartheta_k(x) p_k(x, z)}\, \tilde{f}_{k+1}(z) - \big(\hat{\phi}_k^N[\vartheta_k]\big)^{-1} \sum_{\ell=1}^{N} \frac{\hat{\omega}_k^\ell}{\hat{\Omega}_k} \Big\{\hat{\tau}_k^\ell L_k f_{k+1}(\xi_k^\ell) + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})(\xi_k^\ell)\Big\}\Bigg)^2 \mu(\mathrm{d}z).$$
By assumption, $\phi_k[T_k h_k L_k f_{k+1} + L_k(\tilde{h}_k f_{k+1} + \tilde{f}_{k+1})] = 0$ so that, by Lemma A.2, $\hat{\phi}_k^N[\vartheta_k \varphi_k^N] \xrightarrow[N\to\infty]{\mathbb{P}} \phi_k[\vartheta_k \varphi_k]$, where
$$\varphi_k(x) = \int p_k(x, z) \left(\frac{r_k(x, z)}{\vartheta_k(x) p_k(x, z)}\right)^2 \Big(f_{k+1}(z)\, \overleftarrow{Q}_{\phi_k}\big(T_k h_k + \tilde{h}_k\big)(z) + \tilde{f}_{k+1}(z)\Big)^2 \mu(\mathrm{d}z) = \int p_k(x, z) \left(\frac{r_k(x, z)}{\vartheta_k(x) p_k(x, z)}\right)^2 \Big(f_{k+1}(z)\, T_{k+1} h_{k+1}(z) + \tilde{f}_{k+1}(z)\Big)^2 \mu(\mathrm{d}z).$$
Therefore,
$$(\tilde{N}-1)\tilde{N}^{-1}\, \mathbb{E}\Big[\mathbb{E}\big[\tilde{\upsilon}_N(\hat{I}_{k+1}^1, \hat{J}_{k+1}^{(1,1)}, \xi_{k+1}^1) \,\big|\, \widetilde{\mathcal{F}}_k^N \vee \widetilde{\mathcal{G}}_{k+1}^N\big]^2 \,\Big|\, \widetilde{\mathcal{F}}_k^N\Big] \xrightarrow[N\to\infty]{\mathbb{P}} \big(1 - \tilde{N}^{-1}\big) \big(\phi_k[\vartheta_k]\big)^{-1} \phi_k\bigg[\int r_k(\cdot, z)\, \bar{\omega}_k(\cdot, z) \big(f_{k+1}(z)\, T_{k+1} h_{k+1}(z) + \tilde{f}_{k+1}(z)\big)^2\, \mu(\mathrm{d}z)\bigg].$$
The proof of (ii) is an immediate consequence of H4 since, for all $1 \leq i \leq N$,
$$|\upsilon_N^i| \leq \|\hat{\omega}_{k+1}\|_\infty \big(\|h_{k+1}\|_\infty \|f_{k+1}\|_\infty + \|\tilde{f}_{k+1}\|_\infty\big) N^{-1/2}.$$
Then, defining $c_k = 2\|\hat{\omega}_{k+1}\|_\infty (\|h_{k+1}\|_\infty \|f_{k+1}\|_\infty + \|\tilde{f}_{k+1}\|_\infty)$, for all $\varepsilon > 0$,
$$\sum_{i=1}^{N} \mathbb{E}\big[(\upsilon_N^i)^2 \mathbb{1}_{\{|\upsilon_N^i| \geq \varepsilon\}} \,\big|\, \widetilde{\mathcal{F}}_k^N\big] \leq c_k^2\, \mathbb{1}_{\{c_k \geq \varepsilon \sqrt{N}\}} \xrightarrow[N\to\infty]{\mathbb{P}} 0.$$
Writing $\bar{f}_{k+1} = \tilde{f}_{k+1} - \phi_{k+1}[T_{k+1} h_{k+1} f_{k+1} + \tilde{f}_{k+1}]$ yields
$$\bar{\sigma}_{k+1}^2\big\langle f_{k+1}; \tilde{f}_{k+1}\big\rangle = \frac{\bar{\sigma}_k^2\big\langle L_k f_{k+1};\, L_k\{\tilde{h}_k f_{k+1} + \bar{f}_{k+1}\}\big\rangle}{\phi_k[L_k]^2} + \frac{\phi_k[\vartheta_k] \int \eta_k[r_k(\cdot, x)]\, f_{k+1}^2(x)\, \hat{Q}_{\phi_k} \varpi_k(x)\, \mu(\mathrm{d}x)}{\tilde{N} \phi_k[L_k]^2} + \frac{\phi_k[\vartheta_k]\, \phi_k\Big[\int r_k(\cdot, z)\, \bar{\omega}_k(\cdot, z) \big(f_{k+1}(z)\, T_{k+1} h_{k+1}(z) + \bar{f}_{k+1}(z)\big)^2\, \mu(\mathrm{d}z)\Big]}{\phi_k[L_k]^2} + \frac{\phi_k[\vartheta_k]\, \phi_k\Big[\int \varpi_k(\cdot, z)\, f_{k+1}^2(z)\, \overleftarrow{Q}_{\phi_k}\big((T_k h_k + \tilde{h}_k - T_{k+1} h_{k+1})^2\big)(z)\, \mu(\mathrm{d}z)\Big]}{\tilde{N} \phi_k[L_k]^2} + \frac{\phi_k[\vartheta_k]\, \phi_k\Big[\int \mathrm{Cov}\big(\hat{r}_k(\cdot, z; \zeta_k),\, \hat{\omega}_k(\cdot, z; \zeta_k)\big) \big(f_{k+1}(z)\, T_{k+1} h_{k+1}(z) + \bar{f}_{k+1}(z)\big)^2\, \mu(\mathrm{d}z)\Big]}{\tilde{N} \phi_k[L_k]^2}.$$