More on the long time stability of Feynman-Kac semigroups
Grégoire Ferré, Mathias Rousset and Gabriel Stoltz
Université Paris-Est, CERMICS (ENPC), Inria, F-77455 Marne-la-Vallée, France
INRIA Rennes – Bretagne Atlantique & IRMAR Université Rennes 1, France
November 2, 2018
Abstract
Feynman–Kac semigroups appear in various areas of mathematics: non-linear filtering, large deviations theory, and spectral analysis of Schrödinger operators, among others. Their long time behavior provides important information, for example in terms of ground state energy of Schrödinger operators, or scaled cumulant generating function in large deviations theory. In this paper, we propose a simple and natural extension of the stability of Markov chains for these non-linear evolutions. Like other classical ergodicity results, it relies on two assumptions: a Lyapunov condition that induces some compactness, and a minorization condition ensuring some mixing. We show that these conditions are satisfied in a variety of situations. We also use our technique to provide uniform in the time step convergence estimates for discretizations of stochastic differential equations.
Feynman–Kac semigroups have a long history in physics and mathematics. One of their traditional applications as a probabilistic representation of Schrödinger semigroups [39] is the computation of ground state energies through Diffusion Monte Carlo algorithms [30, 1, 7, 27]. They have also become a significant tool in non-linear filtering and genealogical models [14, 16, 12], as well as in large deviations theory [20, 43, 28, 58]. In all these contexts, the dynamics is evolved and its paths are weighted depending on some cost function. This function is typically a potential energy, a likelihood, or a function whose fluctuations are of interest.

As for Markov chains, the long time behavior of such dynamics is important. However, the long-time analysis is made difficult by the non-linear character of the evolution, so the methods used for the stability of Markov chains [46, 35] cannot be straightforwardly adapted in this context. A series of papers [13, 15, 12] rely on the powerful Dobrushin ergodic coefficient [18, 19]. While this tool makes it possible to deal with the nonlinearity and to consider time-inhomogeneous processes, the conditions imposed on the dynamics are not realistic for unbounded domains.

The purpose of this paper is to propose a new scheme of proof for the ergodicity of Feynman–Kac dynamics, suitable for cases where the state space is unbounded. It is based on the principal eigenvalue problem associated to a weighted evolution operator. It then relies on studying an $h$-transformed version of the dynamics [21], where $h$ is the eigenvector associated to the eigenproblem. This turns the non-linear dynamics into a linear Markov evolution, which can then be studied with standard techniques [46, 35]. However, the spectral properties of the generator fall out of the typical regime of self-adjoint operators, since the dynamics is in general non-reversible.
A striking feature of our results is that, under Lyapunov and minorization conditions similar to those of [35] stated for non-probabilistic kernels, we perform a non self-adjoint spectral analysis that recasts the Feynman–Kac problem into the Markov chain framework studied in [35].

The works of Kontoyiannis and Meyn [40, 43] provide partial answers concerning the spectral properties of the evolution operator, and rely on a nonlinear Lyapunov condition and a regularity in terms of hitting times. While the latter Lyapunov condition is natural in terms of optimal stochastic control [26], we propose instead proofs based on linear conditions. Our generalized linear Lyapunov condition is inspired by [50], and comes together with a minorization condition and a local strong Feller assumption. We will see that these conditions apply to a variety of situations, with natural interpretations. From a broader perspective, our approach appears as a natural extension of previous works on the stability of Markov chains [35] to evolution kernels that do not conserve probability. To that extent, our work resonates with recent works on Quasi-Stationary Distributions (QSD) [29, 9, 8, 4]. However, our scope and assumptions being different, we leave the comparison for future studies. Let us also mention that our framework applies to both discrete and continuous time processes. This is interesting since one motivation for this work is to understand the behavior of time discretizations of continuous Feynman–Kac dynamics, as in [25].

Let us outline our main results in an informal way. The quantities we are interested in typically correspond to Markov chains $(x_k)_{k \geq 0}$ over a state space $\mathcal{X}$, whose trajectories are weighted by a function $f : \mathcal{X} \to \mathbb{R}$.
This corresponds to semigroups of the form
\[
\Phi_k(\mu)(\varphi) = \frac{\mathbb{E}\left[ \varphi(x_k)\, \mathrm{e}^{\sum_{i=0}^{k-1} f(x_i)} \,\middle|\, x_0 \sim \mu \right]}{\mathbb{E}\left[ \mathrm{e}^{\sum_{i=0}^{k-1} f(x_i)} \,\middle|\, x_0 \sim \mu \right]}, \tag{1}
\]
where $\mu$ is an initial probability distribution, and $\varphi$ is a test function. We show that, for more general semigroups and under some assumptions on $(x_k)_{k \geq 0}$ and $f$, there exists a measure $\mu_f^*$ such that for any initial measure $\mu$ and any $\varphi$ belonging to a particular class of test functions,
\[
\Phi_k(\mu)(\varphi) \xrightarrow[k \to +\infty]{} \mu_f^*(\varphi), \tag{2}
\]
at an exponential rate. As a corollary of this result, we show that the principal eigenvalue $\Lambda$ of the generator of the dynamics $(\Phi_k)_{k \geq 0}$ can be obtained as the following limit, for any initial measure $\mu$ and suitable functions $f$:
\[
\log(\Lambda) = \lim_{k \to +\infty} \frac{1}{k} \log \mathbb{E}\left[ \mathrm{e}^{\sum_{i=0}^{k-1} f(x_i)} \,\middle|\, x_0 \sim \mu \right],
\]
a quantity sometimes called scaled cumulant generating function in large deviations theory [17, 43]. Another natural situation corresponds to continuous semigroups of the form
\[
\Theta_t(\mu)(\varphi) = \frac{\mathbb{E}\left[ \varphi(X_t)\, \mathrm{e}^{\int_0^t f(X_s)\, ds} \,\middle|\, X_0 \sim \mu \right]}{\mathbb{E}\left[ \mathrm{e}^{\int_0^t f(X_s)\, ds} \,\middle|\, X_0 \sim \mu \right]}, \tag{3}
\]
where $(X_t)_{t \geq 0}$ is typically a diffusion process. Similar results are then derived for this continuous dynamics. We will see that ergodic properties such as (2) are proved under natural extensions of Lyapunov and minorization conditions, which should be reminiscent of the corresponding theory for Markov chains [35, 50], with additional regularity conditions.

The paper is organized as follows. In Section 2, we present our main results on the stability of Feynman–Kac semigroups. Section 2.2 is devoted to discrete time results, while Section 2.3 is concerned with the continuous time case. Section 3 presents a number of natural applications of the method. In particular, Section 3.3 provides uniform in the time step convergence estimates.
Section 4 discusses some links with related works and possible further directions.

In this section, we present our main convergence results for generalizations of the dynamics (1).

The state space $\mathcal{X}$ is assumed to be a Polish space, and for a measurable set $A \subset \mathcal{X}$, we denote by $A^c$ its complement, and $\mathbb{1}_A$ its indicator function. For a Banach space $E$, we denote by $\mathcal{B}(E)$ the space of bounded linear operators over $E$, with associated norm
\[
\| T \|_{\mathcal{B}(E)} = \sup \left\{ \| T u \|_E, \ \| u \|_E \leq 1 \right\}.
\]
The Banach space of continuous functions is called $C(\mathcal{X})$, and the Banach space of measurable functions $\varphi$ such that
\[
\| \varphi \|_{B^\infty} := \sup_{x \in \mathcal{X}} | \varphi(x) | < +\infty
\]
is referred to as $B^\infty(\mathcal{X})$. Given a measure $\mu$ over $\mathcal{X}$ with finite mass, we use the notation
\[
\mu(\varphi) = \int_{\mathcal{X}} \varphi(x)\, \mu(dx)
\]
for $\varphi \in B^\infty(\mathcal{X})$. The spaces of positive measures and probability measures over $\mathcal{X}$ are denoted respectively by $\mathcal{M}(\mathcal{X})$ and $\mathcal{P}(\mathcal{X})$. When we consider Markov chains $(x_k)_{k \in \mathbb{N}}$ over $\mathcal{X}$, we write $\mathbb{E}_\mu$ for the expectation over all the realizations of the Markov chain with initial condition distributed according to the probability measure $\mu$. Appendix A is devoted to reminders on the ergodicity of Markov chains extracted from [35], while Appendix B recalls some useful definitions and theorems used in the proofs of the results of this section.

We consider general kernel operators $Q^f$ over $\mathcal{X}$, i.e. such that for any $x \in \mathcal{X}$, $Q^f(x, \cdot)$ is a positive measure with finite mass (i.e. $(Q^f \mathbb{1})(x) < +\infty$), and for any measurable set $A \subset \mathcal{X}$, $Q^f(\cdot, A)$ is a measurable function. Such a kernel is referred to as Markov (also probabilistic or conserving) when $Q^f \mathbb{1} = \mathbb{1}$. The notation $Q^f$ instead of $Q$ emphasizes that in general $Q^f \mathbb{1} \neq \mathbb{1}$ depends on a measurable function $f : \mathcal{X} \to \mathbb{R}$. For $\varphi \in B^\infty(\mathcal{X})$, we denote by $Q^f \varphi = \int_{\mathcal{X}} \varphi(y)\, Q^f(\cdot, dy)$ the action of $Q^f$ on test functions, and by $\mu Q^f = \int_{\mathcal{X}} \mu(dx)\, Q^f(x, \cdot)$ its action on finite measures $\mu$.
We call Feynman–Kac semigroups the dynamics $(\Phi_k)_{k \geq 0}$ defined as follows:
\[
\forall\, k \geq 0, \quad \forall\, \mu \in \mathcal{P}(\mathcal{X}), \quad \forall\, \varphi \in B^\infty(\mathcal{X}), \qquad \Phi_k(\mu)(\varphi) = \frac{\mu\big( (Q^f)^k \varphi \big)}{\mu\big( (Q^f)^k \mathbb{1} \big)}. \tag{4}
\]
Note that $\Phi_k = \Phi \circ \ldots \circ \Phi$, where $\Phi$ is the one step evolution operator $\Phi : \mathcal{P}(\mathcal{X}) \to \mathcal{P}(\mathcal{X})$:
\[
\forall\, \mu \in \mathcal{P}(\mathcal{X}), \quad \forall\, \varphi \in B^\infty(\mathcal{X}), \qquad \Phi(\mu)(\varphi) = \frac{\mu\big( Q^f \varphi \big)}{\mu\big( Q^f \mathbb{1} \big)}, \tag{5}
\]
which is well-defined as soon as $\mu(Q^f \mathbb{1}) > 0$ for $\mu \in \mathcal{P}(\mathcal{X})$. Lemma 1 below proves that (5) is indeed well-defined under the assumptions presented in Section 2.2.

Although $Q^f$ is not probabilistic, the normalizing factor in (5) ensures that $\Phi$ evolves a positive measure of finite mass into a probability measure. An important motivation for studying the general dynamics (5) is that (1) can be written in the form (4) with $Q^f = \mathrm{e}^f Q$, where $Q$ is the transition operator of the Markov chain $(x_k)_{k \in \mathbb{N}}$. In this typical setting, $Q^f \mathbb{1} = \mathrm{e}^f$. Even when $Q^f$ is not defined in this way (see for instance the continuous time situation (30) considered in Section 2.3), we keep the notation to emphasize that $Q^f$ typically corresponds to a Markov dynamics whose trajectories are weighted by a function $f$.

We now introduce the assumptions ensuring the well-posedness and ergodicity of the semigroup (4), which should be reminiscent of the ones used in [35, 50] for showing the ergodicity of Markov chains. The first step of the proof is the existence of a principal eigenvector $h$ for $Q^f$, as shown in Lemma 2. This eigenvector is used in Lemma 3 to study an $h$-transformed version of $Q^f$, which leads to our main result, Theorem 1. Note that, in practice, we have in mind the situation $\mathcal{X} = \mathbb{R}^d$ for $d \in \mathbb{N}^*$, but discrete spaces like $\mathcal{X} = \mathbb{Z}^d$ can also be considered, in which case the framework may be simplified.

The first assumption is that a generalized Lyapunov condition holds. We will see in Section 3 that it is satisfied for a large class of processes.
Throughout this section, we consider an increasing sequence of compact sets $(K_n)_{n \geq 1}$ such that, for any compact $K \subset \mathcal{X}$, there exists $m \geq 1$ such that $K \subset K_m$.

Assumption 1 (Lyapunov condition). There exist a function $W : \mathcal{X} \to [1, +\infty)$ bounded on compact sets, and positive sequences $(\gamma_n)_{n \geq 1}$, $(b_n)_{n \geq 1}$ with $\gamma_n \to 0$ as $n \to +\infty$ such that, for all $n \geq 1$,
\[
Q^f W \leq \gamma_n W + b_n \mathbb{1}_{K_n}. \tag{6}
\]

Let us mention that, in many situations, the function $W$ has compact level sets, so that a natural choice of compact sets is $K_n = \{ x \in \mathcal{X} \,|\, W(x) \leq n \}$. When a Lyapunov function $W$ exists, it is natural [35] to consider the following functional space
\[
B^\infty_W(\mathcal{X}) = \left\{ \varphi \ \text{measurable}, \ \left\| \frac{\varphi}{W} \right\|_{B^\infty} < +\infty \right\}. \tag{7}
\]
In particular, Assumption 1 implies that $Q^f$ is a bounded operator on $B^\infty_W(\mathcal{X})$, since one can show that
\[
\forall\, n \geq 1, \qquad \| Q^f \|_{\mathcal{B}(B^\infty_W)} \leq \gamma_n + b_n.
\]
We next assume that the following minorization condition holds.
Assumption 2 (Minorization and irreducibility). For any $n \geq 1$, there exist $\eta_n \in \mathcal{P}(\mathcal{X})$ and $\alpha_n > 0$ such that
\[
\inf_{x \in K_n} Q^f(x, \cdot) \geq \alpha_n \eta_n(\cdot). \tag{8}
\]
In addition, for any $n_0 \geq 1$ and any $\varphi \in B^\infty_W(\mathcal{X})$ with $\varphi \geq 0$,
\[
\eta_n(\varphi) = 0, \ \forall\, n \geq n_0 \quad \Longrightarrow \quad \big( Q^f \varphi \big)(x) = 0, \ \forall\, x \in \mathcal{X}. \tag{9}
\]

Note that (9) expresses some form of irreducibility with respect to the minorizing measures. It can be reformulated in the following way: for any $n_0 \geq 1$ and $x \in \mathcal{X}$, $Q^f(x, \cdot)$ is absolutely continuous with respect to the measure
\[
\sum_{n \geq n_0} 2^{-n} \eta_n.
\]
The typical situation for $\mathcal{X} = \mathbb{R}^d$ is to choose $\eta_n(dx) = \mathbb{1}_{K_n}(x)\, dx / |K_n|$, where $|K_n|$ denotes the Lebesgue measure of $K_n$. We also mention that, although we will consider the previous minorization measures $\eta_n$ in our examples in Section 3, the first part of Assumption 2 can be obtained using irreducibility together with a strong Feller property, see [32], or through the Stroock-Varadhan support theorem [56] with some regularity property, see the discussion in [50]. In our context, we also need some local regularity for the operator $Q^f$.

Assumption 3 (Local regularity). The operator $Q^f$ is strong Feller on the compact sets $K_n$, i.e. for any $n \geq 1$ and any measurable function $\varphi$ bounded on $K_n$, $Q^f(\varphi \mathbb{1}_{K_n})$ is continuous over $K_n$.

From these assumptions we first state the following preliminary lemma, whose proof can be found in Appendix C.
Lemma 1. Let $Q^f$ satisfy Assumptions 1 and 2. Then, for any $\mu \in \mathcal{P}(\mathcal{X})$ with $\mu(W) < +\infty$, one has
\[
0 < \mu(Q^f \mathbb{1}) < +\infty. \tag{10}
\]
Moreover, for any $n \geq 1$, it holds $\eta_n(W) < +\infty$, and there exist infinitely many indices $\bar{n} \geq 1$ such that
\[
\eta_{\bar{n}}(K_{\bar{n}}) > 0. \tag{11}
\]

The lower bound in (10) implies in particular that the dynamics (4) is well-defined. The inequality (11) means that, for infinitely many minorization conditions, some mass of the minorizing measure remains in the associated compact set. It is used in the proof of Lemma 2 to show that $Q^f$ has a positive spectral radius. Since (11) is satisfied for infinitely many indices, we could consider that it holds for any $n \geq 1$, upon extracting a subsequence and, in the situations considered in Section 3, we can actually check that $\eta_n(K_n) > 0$ for all $n \geq 1$.

We now turn to the spectral properties of $Q^f$, which are a key ingredient for our analysis. Let us recall that the spectral radius of $Q^f$ on $B^\infty_W(\mathcal{X})$, denoted by $\Lambda := \Lambda(Q^f)$, is given by the Gelfand formula [48]:
\[
\Lambda = \lim_{k \to +\infty} \big\| (Q^f)^k \big\|^{\frac{1}{k}}_{\mathcal{B}(B^\infty_W)}, \tag{12}
\]
and that the essential spectral radius of $Q^f$, denoted by $\theta(Q^f)$, reads (see Appendix B):
\[
\theta(Q^f) = \lim_{k \to +\infty} \Big( \inf \big\{ \big\| (Q^f)^k - T \big\|_{\mathcal{B}(E)}, \ T \ \text{compact} \big\} \Big)^{\frac{1}{k}}.
\]

Lemma 2. Under Assumptions 1, 2 and 3, the operator $Q^f$ considered on $B^\infty_W(\mathcal{X})$ has a zero essential spectral radius, admits its spectral radius $\Lambda > 0$ as a largest eigenvalue (in modulus), and has an associated eigenfunction $h \in B^\infty_W(\mathcal{X})$, normalized so that $\| h \|_{B^\infty_W} = 1$, and which satisfies
\[
\forall\, x \in \mathcal{X}, \qquad 0 < h(x) < +\infty. \tag{13}
\]
In particular, $0 < \eta_n(h) < +\infty$ for all $n \geq 1$.

Note that the eigenspace associated with $\Lambda$ is a priori not of dimension one. We prove Lemma 2 in Appendix D by using arguments inspired by [50, Theorem 8.9] to show that the essential spectral radius of $Q^f$ is zero, and then relying on the theory of positive operators [11]. Some useful elements of operator theory are recalled in Appendix B for the reader's convenience. Our result is close to those obtained in [43], and the control of the essential spectral radius under Lyapunov and topological conditions has already been studied in [59, 31]. However, our proof uses different techniques based on different assumptions.

Once such a principal eigenvector $h$ is available, the geometric ergodicity of the Feynman–Kac dynamics (4) is derived from that of an $h$-transformed kernel, as made clear in the proof of Theorem 1 below. This is the purpose of the next lemma whose proof is postponed to Appendix E.

Lemma 3.
Suppose that Assumptions 1, 2 and 3 hold, and consider an eigenvector $h$ associated with $\Lambda$ as given by Lemma 2. Since $h > 0$ we can define the corresponding $h$-transformed operator $Q_h$ as
\[
Q_h \phi = \Lambda^{-1} h^{-1} Q^f(h \phi). \tag{14}
\]
Then $Q_h$ is a Markov operator with Lyapunov function $W h^{-1} : \mathcal{X} \to [1, +\infty)$. Moreover, there exist a unique $\mu_h \in \mathcal{P}(\mathcal{X})$, which satisfies $\mu_h(W h^{-1}) < +\infty$, and constants $c > 0$, $\bar{\alpha} \in (0, 1)$ such that, for any $\phi \in B^\infty_{W h^{-1}}(\mathcal{X})$ and any $k \geq 0$,
\[
\big\| Q_h^k \phi - \mu_h(\phi) \big\|_{B^\infty_{W h^{-1}}} \leq c\, \bar{\alpha}^k\, \| \phi - \mu_h(\phi) \|_{B^\infty_{W h^{-1}}}. \tag{15}
\]

Although this is not obvious at first glance, the operator $Q_h$ is in fact independent of the choice of $h$ in Lemma 2, and so is the invariant measure $\mu_h$. Actually, Lemma 3 makes it possible to show that the eigenspace associated with $\Lambda$ has geometric dimension one, i.e. $\mathrm{Ker}\big( Q^f - \Lambda\, \mathrm{Id} \big) = \mathrm{Span}\{ h \}$. Indeed, if $\tilde{h} \in B^\infty_W(\mathcal{X})$ is another eigenvector associated with $\Lambda$ (which may not be of constant sign), it holds, since $h(x) > 0$ for all $x \in \mathcal{X}$ by (13):
\[
Q_h \left( \frac{\tilde{h}}{h} \right) = \Lambda^{-1} h^{-1} Q^f \tilde{h} = \frac{\tilde{h}}{h} \in B^\infty_{W h^{-1}}(\mathcal{X}).
\]
From (15), we obtain
\[
\frac{\tilde{h}}{h} = \mu_h \left( \frac{\tilde{h}}{h} \right),
\]
hence $\tilde{h}$ is proportional to $h$. It may be possible to directly obtain this uniqueness result from stronger Krein-Rutman theorems, like [11, Theorem 19.3], using the irreducibility condition (9) in Assumption 2.

We are now in position to state our main theorem.

Theorem 1.
Consider a kernel operator $Q^f$ satisfying Assumptions 1, 2 and 3 and the associated dynamics (4) with one step evolution operator $\Phi : \mathcal{P}(\mathcal{X}) \to \mathcal{P}(\mathcal{X})$. Then $\Phi$ admits a unique fixed point $\mu_f^* \in \mathcal{P}(\mathcal{X})$, that is, a probability measure such that
\[
\Phi(\mu_f^*) = \mu_f^*, \tag{16}
\]
and this measure satisfies $\mu_f^*(W) < +\infty$. Moreover, there exists $\bar{\alpha} \in (0, 1)$ such that, for any $\mu \in \mathcal{P}(\mathcal{X})$ satisfying $\mu(W) < +\infty$, there is $C_\mu > 0$ for which
\[
\forall\, \varphi \in B^\infty_W(\mathcal{X}), \quad \forall\, k \geq 0, \qquad \big| \Phi_k(\mu)(\varphi) - \mu_f^*(\varphi) \big| \leq C_\mu\, \bar{\alpha}^k\, \| \varphi \|_{B^\infty_W}. \tag{17}
\]

We call $\mu_f^*$ the invariant measure of $Q^f$, in analogy with Markov chains. Note that Theorem 1 also implies the convergence of $\Phi_k(\mu)$ towards $\mu_f^*$ in the weighted total variation distance (Wasserstein distance [57, 35]) defined, for $\mu, \nu \in \mathcal{P}(\mathcal{X})$ with $\mu(W) < +\infty$, $\nu(W) < +\infty$, by
\[
\rho_W(\mu, \nu) = \sup_{\| \varphi \|_{B^\infty_W} \leq 1} \int_{\mathcal{X}} \varphi(x)\, (\mu - \nu)(dx). \tag{18}
\]

Proof. The key idea of the proof is to reformulate the dynamics (4) using the $h$-transformed operator $Q_h = \Lambda^{-1} h^{-1} Q^f h$ of Lemma 3. Using the notation of Lemmas 2 and 3, we rewrite (4) as
\[
\Phi_k(\mu)(\varphi) = \frac{\mu\big( (Q^f)^k \varphi \big) \Lambda^{-k}}{\mu\big( (Q^f)^k \mathbb{1} \big) \Lambda^{-k}} = \frac{\mu\Big( h \big( \Lambda^{-1} h^{-1} Q^f h \big)^k (h^{-1} \varphi) \Big)}{\mu\Big( h \big( \Lambda^{-1} h^{-1} Q^f h \big)^k h^{-1} \Big)} = \frac{\mu\big( h\, (Q_h)^k (h^{-1} \varphi) \big)}{\mu\big( h\, (Q_h)^k h^{-1} \big)}.
\]
The dynamics (4) is therefore reformulated as the ratio of long time expectations of the Markov chains induced by $Q_h$, applied to the functions $h^{-1} \varphi$ and $h^{-1}$. It is then possible to resort to the convergence results given by Lemma 3.

We first construct a probability measure $\mu_f^*$ for which (17) is satisfied, namely
\[
\mu_f^*(\varphi) = \frac{\mu_h\big( h^{-1} \varphi \big)}{\mu_h(h^{-1})}, \tag{19}
\]
where $\mu_h$ is the probability measure introduced in Lemma 3. Note that $\mu_f^*$ is well-defined for $\varphi \in B^\infty_W(\mathcal{X})$. Indeed, for $\varphi \in B^\infty_W(\mathcal{X})$, it holds $h^{-1} \varphi \in B^\infty_{W h^{-1}}(\mathcal{X})$. Second, we show that
\[
\mu_h(h^{-1}) > 0. \tag{20}
\]
Indeed, since $\| h \|_{B^\infty_W} = 1$, it holds $h^{-1} \geq W^{-1}$, and since $W$ is upper bounded on any compact set, $W^{-1}$ is lower bounded by a positive constant on any compact set. As $\mu_h \in \mathcal{P}(\mathcal{X})$, we can use Lemma 5 in Appendix B to conclude that $\mu_h(h^{-1}) > 0$. Moreover, $\mu_f^*$ does not depend on the choice of normalization for $h$. Finally, $\mu_f^*(W) < +\infty$ since $\mu_h(W h^{-1}) < +\infty$.

From Lemma 3, for any $\varphi \in B^\infty_W(\mathcal{X})$, it holds $Q_h^k(h^{-1} \varphi) = \mu_h(h^{-1} \varphi) + a_k$ and $Q_h^k(h^{-1}) = \mu_h(h^{-1}) + b_k$ with
\[
\| a_k \|_{B^\infty_{W h^{-1}}} \leq c\, \bar{\alpha}^k\, \| h^{-1} \varphi - \mu_h(h^{-1} \varphi) \|_{B^\infty_{W h^{-1}}}, \qquad \| b_k \|_{B^\infty_{W h^{-1}}} \leq c\, \bar{\alpha}^k\, \| h^{-1} - \mu_h(h^{-1}) \|_{B^\infty_{W h^{-1}}}.
\]
Since $\varphi \in B^\infty_W(\mathcal{X})$, we have in particular (using also $\| h \|_{B^\infty_W} = 1$),
\[
\big\| h^{-1} \varphi - \mu_h(h^{-1} \varphi) \big\|_{B^\infty_{W h^{-1}}} \leq \| h^{-1} \varphi \|_{B^\infty_{W h^{-1}}} + \mu_h(h^{-1} |\varphi|)\, \| h \|_{B^\infty_W} \leq \big( 1 + \mu_h(W h^{-1}) \big) \| \varphi \|_{B^\infty_W} < +\infty.
\]
Since $\mu_h(W h^{-1}) < +\infty$, we can set $c' = c \big( 1 + \mu_h(W h^{-1}) \big)$ so that
\[
\| a_k \|_{B^\infty_{W h^{-1}}} \leq c'\, \bar{\alpha}^k\, \| \varphi \|_{B^\infty_W}. \tag{21}
\]
A similar estimate holds for the sequence $(b_k)_{k \geq 0}$ by taking $\varphi \equiv \mathbb{1}$. This leads to, for any $\varphi \in B^\infty_W(\mathcal{X})$,
\[
\begin{aligned}
\big| \Phi_k(\mu)(\varphi) - \mu_f^*(\varphi) \big| &= \left| \frac{\mu\big( h\, (Q_h)^k (h^{-1} \varphi) \big)}{\mu\big( h\, (Q_h)^k h^{-1} \big)} - \mu_f^*(\varphi) \right| = \left| \frac{\mu\big( h\, ( \mu_h(h^{-1} \varphi) + a_k ) \big)}{\mu\big( h\, ( \mu_h(h^{-1}) + b_k ) \big)} - \mu_f^*(\varphi) \right| \\
&= \left| \frac{\mu(h)\, \mu_h(h^{-1} \varphi) + \mu(h a_k)}{\mu(h)\, \mu_h(h^{-1}) + \mu(h b_k)} - \mu_f^*(\varphi) \right| = \left| \frac{\mu_f^*(\varphi) + c_{h,\mu}\, \mu(h a_k)}{1 + c_{h,\mu}\, \mu(h b_k)} - \mu_f^*(\varphi) \right|,
\end{aligned}
\]
where we introduced
\[
c_{h,\mu} = \frac{1}{\mu(h)\, \mu_h(h^{-1})}. \tag{22}
\]
It holds $0 < c_{h,\mu} < +\infty$ because:
• Lemma 2 shows that for any $\mu \in \mathcal{P}(\mathcal{X})$ with $\mu(W) < +\infty$, it holds $0 < \mu(h) < +\infty$;
• we know that $\mu_h(h^{-1}) < +\infty$ from Lemma 3;
• $\mu_h(h^{-1}) > 0$, as shown in (20).

Now, since $|b_k| \leq \| b_k \|_{B^\infty_{W h^{-1}}} W h^{-1}$ and (21) holds for $b_k$ with $\varphi \equiv \mathbb{1}$, we have
\[
1 + c_{h,\mu}\, \mu(h b_k) \geq 1 - c_{h,\mu}\, \mu(h |b_k|) \geq 1 - c_{h,\mu}\, \mu(W)\, \| b_k \|_{B^\infty_{W h^{-1}}} \geq 1 - \bar{\alpha}^k\, c'\, c_{h,\mu}\, \mu(W).
\]
Therefore, the choice
\[
k \geq - \frac{\log\big( 2\, c'\, c_{h,\mu}\, \mu(W) \big)}{\log(\bar{\alpha})}
\]
ensures that
\[
1 + c_{h,\mu}\, \mu(h b_k) \geq \frac{1}{2}.
\]
As a result, for $k$ large enough, using $|a_k| \leq \| a_k \|_{B^\infty_{W h^{-1}}} W h^{-1}$ and recalling (21),
\[
\big| \Phi_k(\mu)(\varphi) - \mu_f^*(\varphi) \big| \leq \frac{c_{h,\mu} \big( \mu(h |a_k|) + \mu_f^*(|\varphi|)\, \mu(h |b_k|) \big)}{1 + c_{h,\mu}\, \mu(h b_k)} \leq C_\mu\, \| \varphi \|_{B^\infty_W}\, \bar{\alpha}^k, \tag{23}
\]
with
\[
C_\mu = 2\, c_{h,\mu}\, c'\, \mu(W) \big( 1 + \mu_f^*(W) \big) = \frac{2c}{\mu_h(h^{-1})} \big( 1 + \mu_h(W h^{-1}) \big) \big( 1 + \mu_f^*(W) \big) \frac{\mu(W)}{\mu(h)}. \tag{24}
\]
We therefore obtain (17) from (23) with the constant defined in (24). Note that $C_\mu$ depends on the initial measure $\mu$ only through the ratio $\mu(W) / \mu(h)$.

Taking the supremum over $\varphi \in B^\infty_W(\mathcal{X})$ such that $\| \varphi \|_{B^\infty_W} \leq 1$, (23) rewrites, with (18):
\[
\rho_W\big( \Phi_k(\mu), \mu_f^* \big) \leq C_\mu\, \bar{\alpha}^k.
\]
Choosing $\mu = \Phi(\mu_f^*)$ and using the semigroup property we obtain
\[
\rho_W\big( \Phi(\Phi_k(\mu_f^*)), \mu_f^* \big) \leq C_{\mu_f^*}\, \bar{\alpha}^k.
\]
Taking the limit $k \to +\infty$ shows that $\Phi(\mu_f^*) = \mu_f^*$, so $\mu_f^*$ is a fixed point of $\Phi$.

We have shown the existence of an invariant measure of the form (19), which is a fixed point of $\Phi$ and integrates $W$. We now turn to uniqueness, which follows by a standard fixed point argument. Assume that we have two probability measures $\mu_1$ and $\mu_2$ satisfying (17) and such that $\mu_1(W) < +\infty$, $\mu_2(W) < +\infty$, which are therefore fixed points of $\Phi$. Then, there exists $\bar{\alpha} \in (0, 1)$ such that, for any measure $\mu \in \mathcal{P}(\mathcal{X})$ with $\mu(W) < +\infty$, there is a constant $C_\mu$ for which
\[
\forall\, k \geq 0, \qquad \rho_W\big( \Phi_k(\mu), \mu_1 \big) \leq C_\mu\, \bar{\alpha}^k.
\]
Choosing $\mu = \mu_2$ and using the invariance by $\Phi$ leads to
\[
\rho_W(\mu_2, \mu_1) \leq C_{\mu_2}\, \bar{\alpha}^k.
\]
Taking the limit $k \to +\infty$ shows that $\mu_1 = \mu_2$, so the invariant measure is unique.

Theorem 1 also leads to alternative representations of the spectral radius $\Lambda$ as a scaled cumulant generating function [43] and as the average rate of creation of probability of the dynamics. This is the purpose of the following result.

Theorem 2.
Let $Q^f$ be as in Theorem 1 and define $\lambda = \log(\Lambda)$. Then, for any $\mu \in \mathcal{P}(\mathcal{X})$ with $\mu(W) < +\infty$,
\[
\lambda = \lim_{k \to +\infty} \frac{1}{k} \log \Big( \mu\big[ (Q^f)^k \mathbb{1} \big] \Big). \tag{25}
\]
Moreover,
\[
\Lambda = \mu_f^*\big( Q^f \mathbb{1} \big). \tag{26}
\]

Proof. Considering the operator $Q_h$ introduced in Lemma 3, we have for any $\mu \in \mathcal{P}(\mathcal{X})$ with $\mu(W) < +\infty$,
\[
\mu\big[ (Q^f)^k \mathbb{1} \big] = \mu\big( \Lambda^k\, h\, Q_h^k h^{-1} \big).
\]
Taking the logarithm and dividing by $k$ leads to
\[
\frac{1}{k} \log \mu\big[ (Q^f)^k \mathbb{1} \big] = \log(\Lambda) + \frac{1}{k} \log \mu\big( h\, Q_h^k h^{-1} \big).
\]
Lemma 3 shows that $\mu\big( h\, Q_h^k h^{-1} \big)$ converges to $c_{h,\mu}^{-1}$, where $c_{h,\mu}$ is defined in (22). Taking the limit $k \to +\infty$ then leads to (25).

In order to prove (26), we use that $\mu_f^*$ is a fixed point of $\Phi$, i.e. for any $\varphi \in B^\infty_W(\mathcal{X})$,
\[
\mu_f^*(\varphi) = \frac{\mu_f^*(Q^f \varphi)}{\mu_f^*(Q^f \mathbb{1})}.
\]
Taking $\varphi = h \in B^\infty_W(\mathcal{X})$ and using $Q^f h = \Lambda h$ we obtain
\[
\mu_f^*(h) = \frac{\mu_f^*(\Lambda h)}{\mu_f^*(Q^f \mathbb{1})},
\]
so that $\Lambda = \mu_f^*(Q^f \mathbb{1})$, as claimed.

Although stated in an abstract setting, Theorem 2 has a natural interpretation. If $Q^f = \mathrm{e}^f Q$ where $Q$ is the evolution operator of a Markov chain $(x_k)_{k \in \mathbb{N}}$ with $x_0 \sim \mu$, then (25) rewrites
\[
\lambda = \lim_{k \to +\infty} \frac{1}{k} \log \mathbb{E}_\mu \left[ \mathrm{e}^{\sum_{i=0}^{k-1} f(x_i)} \right],
\]
which is a standard formula for the scaled cumulant generating function (SCGF, or logarithmic spectral radius) in large deviations theory [17, 43]. Recall that $\mathbb{E}_\mu$ stands for the expectation with respect to all trajectories with initial condition distributed according to $\mu$. On the other hand, (26) means that this SCGF can be expressed as the average rate of creation of probability of the process under the invariant measure. In particular, if $Q^f = Q$ is the evolution operator of a Markov chain, $\Lambda = 1$ since there is no creation of probability. Formula (26) does not seem typical in the large deviations literature, but was used in [25] to quantify the bias arising from discretizing a continuous Feynman–Kac dynamics.

Remark 1.
It should be clear from the proofs that Assumptions 1 to 3 can be adapted or relaxed depending on the context. In particular, we typically consider situations in which the state space $\mathcal{X}$ is (a subset of) $\mathbb{R}^d$, and the transition kernel $Q^f$ has a transition density $p^f(x, y) > 0$ jointly continuous in $x, y$. In this case, Assumptions 2 and 3 are immediately fulfilled by setting $\eta_n(dx) = \mathbb{1}_{K_n}(x)\, dx / |K_n|$ for each compact $K_n$, as we will see in Section 2.3.

Similarly, the assumption that $W \geq 1$ can be weakened into: $W$ is lower bounded by a positive constant on each compact set.

Another remark of interest is that the regularity condition (Assumption 3) is not satisfied by Metropolis type kernels [51], which are therefore not covered by our analysis.

Let us mention that, in Assumption 1, it seems sufficient to suppose that $\gamma_n < \Lambda$ for some $n \geq 1$ in order to obtain that $\theta(Q^f) < \Lambda(Q^f)$ in the proof of Lemma 2. This is sufficient to use the Krein-Rutman theorem, and to obtain a Lyapunov condition for $Q_h$ (see Remark 4 in Appendix E).

It is also possible to keep track of the constants in the proofs of Lemma 3 and Theorem 1, like in [35], and observe that they depend on the assumptions through the coefficients $\gamma_n$, $b_n$, $\alpha_n$, the measures $\eta_n$ and the function $W$. More precisely, the constants deteriorate when $\gamma_n$, $\alpha_n$ and $\eta_n(h)$ become small, and $b_n$ and $\sup_{K_n} W$ get large. Therefore, although the term $\eta_n(h)$ cannot be controlled more explicitly under our assumptions, it seems possible to optimize the final constants in Lemma 3 (and thus in Theorem 1) with respect to the choice of $n$.

In order to sketch the role of each assumption in the proofs of the results, we display in Figure 1 a schematic representation of the arguments. We hope this will help adapt our framework to situations where our assumptions are not fulfilled as such.
[Figure 1 diagram not reproduced. Recoverable node labels: Minorization condition; Lyapunov condition; Local regularity; Existence of $\bar{n}$ such that $\eta_{\bar{n}}(K_{\bar{n}}) > 0$; Zero essential radius; Total cone stability; Positive spectral radius; Existence of $h > 0$; $Q_h$; Theorem 1.]

Figure 1: Schematic representation of the arguments used for the proofs of Lemmas 1, 2, 3, and Theorem 1. The plain lines correspond to Assumptions 1-3 pointing towards Lemma 1 and the key ingredients for the proof of Lemma 2. The dashed lines correspond to the actual proof of Lemma 2. The dotted lines correspond to the elements needed for the proof of Lemma 3 and its consequences.
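The discrete-time results above can be checked numerically in the simplest possible setting, a finite state space, where $Q^f = \mathrm{e}^f Q$ is a matrix with positive entries and the assumptions can be verified directly with $W \equiv 1$. The following Python sketch is only an illustration under these assumptions: the kernel `Q`, the weight `f`, the size `n` and the helper names `perron` and `phi_step` are hypothetical choices that do not come from the paper. It builds the $h$-transformed kernel (14), iterates the one step map (5), and compares the outcome with the fixed point (19) and with the two representations (25)-(26) of $\Lambda$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # size of the (hypothetical) finite state space

# Hypothetical Markov kernel Q with positive entries (rows sum to one) and weight f.
Q = rng.random((n, n)) + 0.1
Q /= Q.sum(axis=1, keepdims=True)
f = rng.normal(size=n)
Qf = np.exp(f)[:, None] * Q  # (Q^f phi)(x) = e^{f(x)} (Q phi)(x)


def perron(M):
    """Dominant eigenvalue of a positive matrix M and its positive
    right eigenvector, normalized to sum to one."""
    vals, vecs = np.linalg.eig(M)
    i = np.argmax(np.abs(vals))
    v = np.abs(vecs[:, i].real)
    return vals[i].real, v / v.sum()


def phi_step(mu, Qf):
    """One step map Phi of (5): mu -> mu Q^f / mu(Q^f 1)."""
    nu = mu @ Qf
    return nu / nu.sum()


Lam, h = perron(Qf)        # right eigenvector: Q^f h = Lam h
_, mu_star = perron(Qf.T)  # left eigenvector, i.e. the fixed point of Phi

# h-transformed kernel (14): Q_h(x, y) = Q^f(x, y) h(y) / (Lam h(x)).
# It does not depend on the normalization chosen for h.
Qh = Qf * h[None, :] / (Lam * h[:, None])
_, mu_h = perron(Qh.T)     # invariant probability measure of Q_h

# Iterate Phi from the uniform measure, accumulating log mu((Q^f)^k 1)
# as the sum of the logarithms of the successive normalizing factors.
k = 2000
mu_k = np.full(n, 1.0 / n)
log_mass = 0.0
for _ in range(k):
    log_mass += np.log((mu_k @ Qf).sum())
    mu_k = phi_step(mu_k, Qf)

mu_from_h = (mu_h / h) / (mu_h / h).sum()  # formula (19)

print("Q_h is Markov:", np.allclose(Qh.sum(axis=1), 1.0))
print("Phi_k(mu) close to mu*:", np.allclose(mu_k, mu_star, atol=1e-10))
print("mu* matches (19):", np.allclose(mu_star, mu_from_h))
print("Lambda vs mu*(Q^f 1):", Lam, mu_star @ Qf.sum(axis=1))
print("log(Lambda) vs (1/k) log mu((Q^f)^k 1):", np.log(Lam), log_mass / k)
```

In this sketch the eigen-elements are computed by a dense eigenvalue solver, which is only reasonable for small matrices. The finite-$k$ estimate of $\log(\Lambda)$ carries an error of order $1/k$, in line with the term $\frac{1}{k} \log \mu(h Q_h^k h^{-1})$ appearing in the proof of Theorem 2.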
Our analysis carries over to time continuous processes, in particular diffusions. In this case, it is possible to rephrase Assumption 1 in terms of the associated infinitesimal generator. In order to avoid the technical difficulty of dealing with an infinite dimensional process, we consider a diffusion $(X_t)_{t \geq 0}$ over $\mathcal{X} = \mathbb{R}^d$ for some integer $d \geq 1$, satisfying the SDE
\[
dX_t = b(X_t)\, dt + \sigma(X_t)\, dW_t, \tag{27}
\]
where $b : \mathcal{X} \to \mathbb{R}^d$, $\sigma : \mathcal{X} \to \mathbb{R}^{d \times m}$ and $(W_t)_{t \geq 0}$ is an $m$-dimensional Brownian motion (for some integer $m \geq 1$). The infinitesimal generator of (27) reads
\[
\mathcal{L} = b \cdot \nabla + \frac{1}{2} \sigma \sigma^T : \nabla^2 = \sum_{i=1}^d b_i\, \partial_{x_i} + \frac{1}{2} \sum_{i,j=1}^d (\sigma \sigma^T)_{ij}\, \partial_{x_i} \partial_{x_j}. \tag{28}
\]
We also consider a function $f : \mathcal{X} \to \mathbb{R}$ and the corresponding continuous Feynman–Kac semigroup that reads, for all $t \geq 0$ and $\mu \in \mathcal{P}(\mathcal{X})$,
\[
\Theta_t(\mu)(\varphi) = \frac{\mathbb{E}_\mu \left( \varphi(X_t)\, \mathrm{e}^{\int_0^t f(X_s)\, ds} \right)}{\mathbb{E}_\mu \left( \mathrm{e}^{\int_0^t f(X_s)\, ds} \right)}. \tag{29}
\]
In this setting, we define the operator
\[
\big( P_t^f \varphi \big)(x) = \mathbb{E}_x \left( \varphi(X_t)\, \mathrm{e}^{\int_0^t f(X_s)\, ds} \right),
\]
so that (29) is the natural continuous counterpart of (4) where, for a fixed time $t > 0$, we formally have
\[
Q^f := P_t^f = \mathrm{e}^{t(\mathcal{L} + f)}. \tag{30}
\]
As a result, $\Theta_t$ satisfies a semigroup property, like the discrete time evolution through (5). In this case, the generator of the weighted evolution operator $P_t^f$ is $\mathcal{L} + f$. As for the discrete semigroup (4), we are interested in the long time behavior of quantities such as (29).

When $b = 0$ and $\sigma = \sqrt{2}$, the convergence in $L^2(\mathcal{X})$ towards a well-defined limit can be obtained by considering the spectral properties of the Schrödinger type operator $-\Delta - f$, as in [52]. When $\sigma = \sqrt{2}$ and $b = -\nabla U$ is the gradient of a potential energy, the operator $\mathcal{L} + f$ is self-adjoint in $L^2(\mathrm{e}^{-U})$ (see for instance [3]), and the unitary transform $\varphi \mapsto \varphi\, \mathrm{e}^{-U/2}$ leads to an analysis similar to the Schrödinger case. More precisely, $\mathcal{L} + f$ is unitarily equivalent to
\[
\Delta - \frac{1}{4} |\nabla U|^2 + \frac{1}{2} \Delta U + f,
\]
which can be studied by the theory of symmetric operators [37]. In both cases, the operator $\mathcal{L} + f$ is self-adjoint on a suitable Hilbert space, so that the Rayleigh formula can be used. It is also possible to study the spectral properties of $P_t^f$ when $b \neq -\nabla U$ and $\mathcal{X}$ is bounded through the Krein-Rutman theorem (see e.g. [25, Proposition 1]). To the best of our knowledge, the case $b \neq -\nabla U$ in an unbounded space $\mathcal{X}$ remains open in general.

Our analysis provides a practical criterion to study the long time behavior of (29) through the Lyapunov function techniques developed in Section 2. The continuous counterpart of Assumption 1 can be stated in the following simple form.

Assumption 4.
Let $(X_t)_{t \geq 0}$ be the dynamics (27) with generator (28). There exists a $C^2(\mathcal{X})$ function $W : \mathcal{X} \to [1, +\infty)$ going to infinity at infinity such that
\[
W^{-1} (\mathcal{L} + f) W \xrightarrow[|x| \to +\infty]{} -\infty. \tag{31}
\]
In addition, there exist a $C^2(\mathcal{X})$ function $\mathscr{W} : \mathcal{X} \to [1, +\infty)$ and a constant $c > 0$ such that
\[
\varepsilon(x) := \frac{\mathscr{W}(x)}{W(x)} \xrightarrow[|x| \to +\infty]{} 0, \qquad \mathscr{W}^{-1} (\mathcal{L} + f) \mathscr{W} \leq c. \tag{32}
\]

Condition (31) can be checked by direct computations, as shown on some examples in Section 3.2. Finding a function $\mathscr{W}$ such that (32) holds is generally straightforward, since we build Lyapunov functions in an exponential form. More precisely, we consider in general $W(x) = \mathrm{e}^{a U(x)}$ for some function $U : \mathcal{X} \to \mathbb{R}$ and $a > 0$, and $\mathscr{W}(x) = \mathrm{e}^{a' U(x)}$ for $0 < a' < a$. In the proof of Theorem 3, (31)-(32) are used to control $P_t^f$ thanks to a Grönwall lemma. It is also important to remark that, in the case $f = 0$, we are exactly back to typical conditions for the ergodicity of SDEs and compactness of the evolution operator $P_t$, see [50, Theorem 8.9]. As in Section 2.2, some regularity of the transition kernel is required. A natural condition in the context of diffusions reads as follows [50, Section 7].

Assumption 5.
The functions $f$ and $\sigma$ are continuous and, for any $t > 0$, the transition kernel $P_t^f$ has a continuous density $p_t^f$ with respect to the Lebesgue measure, that is
\[
\forall\, x, y \in \mathcal{X}, \qquad P_t^f(x, dy) = p_t^f(x, y)\, dy.
\]
Moreover, it holds
\[
\forall\, x, y \in \mathcal{X}, \qquad p_t^f(x, y) > 0.
\]

This assumption is standard for diffusion processes and, as shown in the proof of Theorem 3, it implies Assumptions 2 and 3 in Section 2.2. It holds true in particular for elliptic diffusions with regular coefficients and additive noise ($b \in C^\infty(\mathcal{X})$ and $\sigma = \mathrm{Id}$). For degenerate diffusions, possibly with multiplicative noise, this result can be obtained through hypoelliptic conditions and controllability. We refer to [56, 50, 58] and the references therein for more details.

We now state the continuous version of Theorem 1.

Theorem 3.
Consider the dynamics (29) induced by the SDE (27) and suppose that Assumptions 4 and 5 hold. Then, there exist a unique invariant measure $\mu_f^*$ and $\kappa > 0$ such that, for any initial measure $\mu \in \mathcal{P}(\mathcal{X})$ with $\mu(W) < +\infty$, there is $C_\mu > 0$ for which
\[
\forall\, \varphi \in B^\infty_W(\mathcal{X}), \quad \forall\, t \geq 0, \qquad \big| \Theta_t(\mu)(\varphi) - \mu_f^*(\varphi) \big| \leq C_\mu\, \mathrm{e}^{-\kappa t}\, \| \varphi \|_{B^\infty_W}. \tag{33}
\]
Moreover, the invariant measure satisfies $\mu_f^*(W) < +\infty$.

Proof. The idea of the proof is to show that, for any $t > 0$, the evolution operator
\[
\big( P_t^f \varphi \big)(x) = \mathbb{E}_x \left[ \varphi(X_t)\, \mathrm{e}^{\int_0^t f(X_s)\, ds} \right]
\]
satisfies the assumptions of Theorem 1.

Step 1: Minorization and regularity. We first show that, by Assumption 5, $P_t^f$ satisfies Assumptions 2 and 3. A first remark is that, since $P_t^f$ is assumed to have a continuous density with respect to the Lebesgue measure, Assumption 3 immediately holds.

It is enough to prove the minorization condition (Assumption 2) for measurable subsets of $\mathcal{X} = \mathbb{R}^d$. Consider the compact sets $K_n = B(0, n)$, i.e. the balls centered at $0$ with radius $n \geq 1$. For a measurable set $S \subset \mathbb{R}^d$ and $n \geq 1$, we have, for all $x \in K_n$,
\[
\big( P_t^f \mathbb{1}_S \big)(x) = \int_S p_t^f(x, y)\, dy \geq \int_{S \cap K_n} p_t^f(x, y)\, dy \geq \Big( \inf_{x, y \in K_n} p_t^f(x, y) \Big)\, | S \cap K_n |, \tag{34}
\]
where we denote by $|A|$ the Lebesgue measure of a measurable set $A \subset \mathbb{R}^d$. As a result, (8) holds for all $n \geq 1$ with
\[
\eta_n(S) = \frac{| S \cap K_n |}{| K_n |}, \qquad \alpha_n = | K_n | \Big( \inf_{x, y \in K_n} p_t^f(x, y) \Big) > 0.
\]
Finally, let us check that (9) is satisfied. Take $\varphi \in B^\infty_W(\mathcal{X})$ with $\varphi \geq 0$ such that
\[
\eta_n(\varphi) = \frac{1}{| K_n |} \int_{K_n} \varphi(x)\, dx = 0,
\]
for any $n \geq n_0$, for an arbitrary $n_0 \geq 1$. Since for any compact set $K \subset \mathcal{X}$ there exists $m \geq 1$ such that $K \subset K_m$, this implies that $\varphi = 0$ almost everywhere, so $Q^f \varphi = 0$ everywhere since $Q^f$ has a continuous density with respect to the Lebesgue measure. Therefore, Assumption 2 is satisfied.

Step 2: Lyapunov condition.
Let us now show that Assumption 1 holds for any fixed $t>0$. First, Assumption 4 is equivalent to the existence of positive sequences $(a_n)_{n\in\mathbb{N}}$, $(b_n)_{n\in\mathbb{N}}$ such that
$$ (\mathcal{L}+f)W \leq -a_n W + b_n, \tag{35} $$
with $a_n\to+\infty$ as $n\to+\infty$. We then compute, for any $t\geq 0$ and $n\in\mathbb{N}$,
$$ \frac{d}{dt}\Big(e^{a_n t}P^f_t W\Big) = e^{a_n t}P^f_t\big(a_n W + (\mathcal{L}+f)W\big) \leq b_n\, e^{a_n t}\,P^f_t\mathbb{1}. \tag{36} $$
We can now bound the right hand side of the latter expression using (32). Since $\mathscr{W}\geq 1$,
$$ \big(P^f_t\mathbb{1}\big)(x) = \mathbb{E}_x\Big[e^{\int_0^t f(X_s)\,ds}\Big] \leq \mathbb{E}_x\Big[\mathscr{W}(X_t)\,e^{\int_0^t f(X_s)\,ds}\Big]. \tag{37} $$
From the second condition in (32), (37) becomes
$$ \big(P^f_t\mathbb{1}\big)(x) \leq e^{ct}\,\mathbb{E}_x\Big[\mathscr{W}(X_t)\,e^{-\int_0^t \frac{\mathcal{L}\mathscr{W}}{\mathscr{W}}(X_s)\,ds}\Big]. $$
Inspired by a similar calculation in [58], we see that the process inside the expectation on the right hand side is a supermartingale. Indeed, introducing
$$ M_t = \mathscr{W}(X_t)\,e^{-\int_0^t \frac{\mathcal{L}\mathscr{W}}{\mathscr{W}}(X_s)\,ds}, $$
Itô formula shows that
$$ dM_t = e^{-\int_0^t \frac{\mathcal{L}\mathscr{W}}{\mathscr{W}}(X_s)\,ds}\,\nabla\mathscr{W}^T(X_t)\,\sigma(X_t)\,dB_t, $$
so that $M_t$ is a local martingale (see [39, Proposition 2.24]). Since $M_t$ is nonnegative, it is a supermartingale by Fatou's lemma. As a result, $\mathbb{E}_x[M_t]\leq M_0 = \mathscr{W}(x)$. The inequality (37) then becomes
$$ \big(P^f_t\mathbb{1}\big)(x) \leq e^{ct}\,\mathbb{E}_x[M_t] \leq e^{ct}\,\mathscr{W}(x). $$
Coming back to (36), we obtain
$$ \frac{d}{dt}\Big(e^{a_n t}P^f_t W\Big) \leq b_n\, e^{(a_n+c)t}\,\mathscr{W}. $$
Integrating in time,
$$ \big(e^{a_n t}P^f_t W - W\big)(x) \leq \frac{b_n\, e^{(a_n+c)t}}{a_n+c}\,\mathscr{W}(x). $$
As a result
$$ P^f_t W(x) \leq \widetilde{\gamma}_n W(x) + c_n\mathscr{W}(x), \tag{38} $$
with
$$ \widetilde{\gamma}_n = e^{-a_n t}, \qquad c_n = \frac{b_n\, e^{ct}}{a_n+c} > 0. $$
At this stage, (6) holds with the indicator function replaced by the function $\mathscr{W}$. However, using the first condition in (32), we can find a compact set $K_n$ such that $c_n\varepsilon(x)\leq\widetilde{\gamma}_n$ outside $K_n$. Using this set and $\mathscr{W}=\varepsilon W$, (38) becomes
$$ P^f_t W(x) \leq \widetilde{\gamma}_n W(x) + c_n\mathbb{1}_{K_n}(x)\mathscr{W}(x) + c_n\varepsilon(x)W(x)\mathbb{1}_{K_n^c}(x) \leq 2\widetilde{\gamma}_n W(x) + c_n\Big(\sup_{K_n}\mathscr{W}\Big)\mathbb{1}_{K_n}(x). $$
Setting $\gamma_n = 2\widetilde{\gamma}_n$ and (redefining) $b_n = c_n\sup_{K_n}\mathscr{W}$, we see that
$$ P^f_t W \leq \gamma_n W + b_n\mathbb{1}_{K_n}, \tag{39} $$
with $\gamma_n\to 0$ as $n\to+\infty$. This means that $P^f_t$ satisfies Assumption 1, and hence fulfills all the assumptions of Theorem 1.

Step 3: using Theorem 1.
We now use that, for any fixed time, the evolution operator satisfies the assumptions of Theorem 1 to conclude the proof. Fix $t_0>0$. There exist a unique measure $\mu^*_{f,t_0}$ and a constant $\kappa_{t_0}>0$ such that, for any $\mu\in\mathcal{P}(\mathcal{X})$ with $\mu(W)<+\infty$, it holds (with the constant $C_\mu>0$ given in (24))
$$ \forall\,\varphi\in B^\infty_W(\mathcal{X}),\ \forall\, k\geq 0, \quad \left|\frac{\mu\big((P^f_{t_0})^k\varphi\big)}{\mu\big((P^f_{t_0})^k\mathbb{1}\big)} - \mu^*_{f,t_0}(\varphi)\right| \leq C_\mu\, e^{-k\kappa_{t_0}}\,\|\varphi\|_{B^\infty_W}. $$
We next show that (33) can be obtained for any $t\geq 0$ (and not only multiples of $t_0$) and that the invariant measure $\mu^*_{f,t_0}$ actually does not depend on $t_0$. This follows by a standard time decomposition argument [45, 38]. Indeed, for any $t\geq 0$, we set $t = kt_0 + r$ with $r\in[0,t_0)$, and we use the semigroup property to obtain
$$ \Theta_t(\mu)(\varphi) = \Theta_{kt_0}\left(\frac{\mu P^f_r}{\mu(P^f_r\mathbb{1})}\right)(\varphi) = \frac{\mu_r\big((P^f_{t_0})^k\varphi\big)}{\mu_r\big((P^f_{t_0})^k\mathbb{1}\big)}, $$
where we defined $\mu_r$ as
$$ \mu_r(\varphi) = \frac{\mu(P^f_r\varphi)}{\mu(P^f_r\mathbb{1})}. $$
We then only need to control the family of initial distributions $(\mu_r)_{r\in[0,t_0)}$. Step 1 in the proof shows that $\mu(P^f_r\mathbb{1})>0$, while $P^f_r$ maps $B^\infty_W(\mathcal{X})$ to $B^\infty_W(\mathcal{X})$ for any $r\geq 0$, so $\mu_r(W)<+\infty$ and thus $\mu_r$ defines an admissible initial condition in Theorem 1. This leads to:
$$ \forall\,\varphi\in B^\infty_W(\mathcal{X}),\ \forall\, t\geq 0, \quad \big|\Theta_t(\mu)(\varphi) - \mu^*_{f,t_0}(\varphi)\big| \leq \Big(\sup_{r\in[0,t_0)} C_{\mu_r}\Big)\, e^{-\kappa_{t_0}\frac{t}{t_0}}\,\|\varphi\|_{B^\infty_W}, \tag{40} $$
where the constant $C_{\mu_r}$ is given in (24). In view of (24), it remains to bound
$$ \sup_{r\in[0,t_0)}\frac{\mu_r(W)}{\mu_r(h)} = \sup_{r\in[0,t_0)}\frac{\mu(P^f_r W)}{\mu(P^f_r h_{t_0})}, \tag{41} $$
where $h_{t_0}$ is the principal eigenvector associated to $P^f_{t_0}$ with eigenvalue $\Lambda_{t_0}$ (using Lemma 2). The numerator in the latter expression is easily bounded uniformly in $r$ using (39). Standard semigroup analysis shows that $h_{t_0} = h$ does not depend on $t_0$ and $\Lambda_{t_0} = e^{t_0\alpha}$ for some $\alpha\in\mathbb{R}$. Therefore, for any $r\in[0,t_0)$, $P^f_r h_{t_0} = e^{r\alpha}h$, and the denominator in (41) is bounded away from 0 independently of $r$.

We finally prove that the invariant measure $\mu^*_{f,t_0}$ does not depend on $t_0$. Following the same procedure for another time $t_1>0$ leads to an invariant measure $\mu^*_{f,t_1}$. Then, for any $\varphi\in B^\infty_W(\mathcal{X})$, $\mu\in\mathcal{P}(\mathcal{X})$ with $\mu(W)<+\infty$ and $t\geq 0$,
$$ \begin{aligned} \big|\mu^*_{f,t_0}(\varphi) - \mu^*_{f,t_1}(\varphi)\big| &\leq \big|\Theta_t(\mu)(\varphi) - \mu^*_{f,t_0}(\varphi)\big| + \big|\Theta_t(\mu)(\varphi) - \mu^*_{f,t_1}(\varphi)\big| \\ &\leq \Big(\sup_{r\in[0,t_0)} C_{\mu_r}\Big)\, e^{-\kappa_{t_0}\frac{t}{t_0}}\,\|\varphi\|_{B^\infty_W} + \Big(\sup_{r\in[0,t_1)} C_{\mu_r}\Big)\, e^{-\kappa_{t_1}\frac{t}{t_1}}\,\|\varphi\|_{B^\infty_W}. \end{aligned} $$
Taking the limit $t\to+\infty$ on the right hand side shows that $\mu^*_{f,t_0} = \mu^*_{f,t_1}$, so the invariant measure is independent of the arbitrary time $t_0$. This concludes the proof of Theorem 3.

We close this section by mentioning that, under the assumptions of Theorem 3, it is also possible to define the logarithmic spectral radius of the dynamics as in Theorem 2, which reads in this case
$$ \lambda = \lim_{t\to+\infty}\frac{1}{t}\log\mathbb{E}_\mu\Big[e^{\int_0^t f(X_s)\,ds}\Big], $$
for any initial measure $\mu$ that satisfies $\mu(W)<+\infty$. We do not reproduce the proof of this result, which is similar to that of Theorem 2.
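The independence of the logarithmic spectral radius from the initial measure can be illustrated on a toy model. The following Python sketch is only a discrete-time, finite-state analogue (in the spirit of Theorem 2), not the diffusion setting of Theorem 3: it estimates $\frac{1}{k}\log\mu\big((Q^f)^k\mathbb{1}\big)$ for a two-state weighted kernel $Q^f = e^f Q$ and compares it with the logarithm of the largest eigenvalue of the corresponding $2\times 2$ matrix. The chain, the weights and all function names are illustrative assumptions.

```python
import math


def weighted_kernel(Q, f):
    """Matrix of Q^f = e^f Q on a finite state space (illustrative helper)."""
    n = len(f)
    return [[math.exp(f[i]) * Q[i][j] for j in range(n)] for i in range(n)]


def log_growth_rate(Q, f, mu, k):
    """Return (1/k) log mu((Q^f)^k 1), renormalizing at each step to avoid overflow."""
    Qf = weighted_kernel(Q, f)
    n = len(f)
    v = [1.0] * n      # the constant function 1
    log_scale = 0.0    # invariant: (Q^f)^j 1 = exp(log_scale) * v after j steps
    for _ in range(k):
        v = [sum(Qf[i][j] * v[j] for j in range(n)) for i in range(n)]
        s = max(v)
        v = [x / s for x in v]
        log_scale += math.log(s)
    return (log_scale + math.log(sum(m * x for m, x in zip(mu, v)))) / k


def log_principal_eigenvalue_2x2(M):
    """Logarithm of the largest eigenvalue of a 2x2 matrix with positive entries."""
    (a, b), (c, d) = M
    return math.log(0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + b * c))


if __name__ == "__main__":
    Q = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical two-state Markov kernel
    f = [0.5, -1.0]                # hypothetical weight function
    exact = log_principal_eigenvalue_2x2(weighted_kernel(Q, f))
    for mu in ([1.0, 0.0], [0.0, 1.0]):
        print(mu, log_growth_rate(Q, f, mu, 5000), exact)
```

Both initial measures give the same limit up to an error of order $1/k$, which comes from the prefactor involving the principal eigenvector.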
Since our study was first motivated by practical situations, we provide in this section a number of finite dimensional examples where our framework provides simple criteria for proving convergence of the Feynman–Kac semigroup towards an invariant measure. Sections 3.1 and 3.2 are concerned with discrete and continuous time applications respectively. Section 3.3 presents a convergence result for numerical discretizations of (29), where convergence rates are uniform in the time step.
In this section, we provide two typical examples of Markov chains for which our results apply. First of all, let us consider the Diffusion Monte Carlo case where $f=-V$ and $V$ stands for a Schrödinger potential.

Proposition 1. Consider a weighted evolution operator $Q^V = e^{-V}Q$ in $\mathcal{X}=\mathbb{R}^d$ with Gaussian increments
$$ Q(x,dy) = (2\pi\sigma^2)^{-\frac{d}{2}}\, e^{-\frac{(x-y)^2}{2\sigma^2}}\,dy, $$
and where $V$ is a continuous function. Then, if $V(x)\to+\infty$ when $|x|\to+\infty$, $W(x)=\mathbb{1}$ is a Lyapunov function for $Q^V$ in the sense of Assumption 1. Moreover, if there exist constants $a>0$ and $c\in\mathbb{R}$ such that
$$ V(x) \geq a|x|^2 - c, \tag{42} $$
then $W(x) = e^{\beta x^2}$ is a Lyapunov function for
$$ 0 < \beta < \frac{a}{2}\left(\sqrt{1+\frac{2}{a\sigma^2}} - 1\right). $$
Finally, Assumptions 2 and 3 hold true, so that Theorem 1 applies for these choices of Lyapunov function.
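As a sanity check on the admissible range of $\beta$, the following Python sketch (not part of the original analysis) works in dimension $d=1$. It compares the closed form $QW(x) = (1-2\beta\sigma^2)^{-1/2}e^{\beta x^2/(1-2\beta\sigma^2)}$ for $W(x)=e^{\beta x^2}$ with a quadrature, and checks that the upper bound on $\beta$ is the threshold at which the coefficient $-a + 2\beta^2\sigma^2/(1-2\beta\sigma^2)$ changes sign. The function names and the values $a=\sigma=1$ are illustrative assumptions.

```python
import math


def beta_max(a: float, sigma: float) -> float:
    """Upper end of the admissible range of beta in Proposition 1."""
    return 0.5 * a * (math.sqrt(1.0 + 2.0 / (a * sigma**2)) - 1.0)


def decay_coefficient(a: float, sigma: float, beta: float) -> float:
    """Coefficient of x^2 in the exponent bounding Q^V W / W (needs beta < 1/(2 sigma^2))."""
    return -a + 2.0 * beta**2 * sigma**2 / (1.0 - 2.0 * beta * sigma**2)


def qw_closed_form(x: float, sigma: float, beta: float) -> float:
    """(QW)(x) for W(x) = exp(beta x^2) and a Gaussian step of variance sigma^2, d = 1."""
    s = 1.0 - 2.0 * beta * sigma**2
    return math.exp(beta * x**2 / s) / math.sqrt(s)


def qw_quadrature(x, sigma, beta, half_width=12.0, n=24000):
    """Trapezoidal approximation of E[W(x + sigma G)] with G a standard Gaussian."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        g = -half_width + i * h
        weight = 0.5 if i in (0, n) else 1.0
        density = math.exp(-0.5 * g * g) / math.sqrt(2.0 * math.pi)
        total += weight * math.exp(beta * (x + sigma * g) ** 2) * density
    return total * h


if __name__ == "__main__":
    a, sigma = 1.0, 1.0            # hypothetical parameter values
    beta = 0.5 * beta_max(a, sigma)
    print("beta_max:", beta_max(a, sigma))
    print("coefficient at beta:", decay_coefficient(a, sigma, beta))
    print("closed form vs quadrature:",
          qw_closed_form(1.0, sigma, beta), qw_quadrature(1.0, sigma, beta))
```

The quadrature is truncated to a finite window, so it is only reliable when $\beta$ stays well below $1/(2\sigma^2)$, where the integrand decays quickly.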
The interpretation of this result is the following. In the Diffusion Monte Carlo setting, the confinement cannot be provided by the dynamics, since it is a Gaussian random walk over $\mathbb{R}^d$. However, the external potential $V$ gives a small weight to the trajectories going to infinity, which makes the dynamics stable. If more information is available on the growth of $V$, we obtain better integrability results for the invariant measure $\mu^*_V$ through Lyapunov functions growing faster at infinity.

Proof.
Let us first check that $W=\mathbb{1}$ is a Lyapunov function when $V$ goes to infinity at infinity. Note that, for any compact set $K\subset\mathbb{R}^d$,
$$ \big(Q^V\mathbb{1}\big)(x) = e^{-V(x)} = \mathbb{1}_{K^c}(x)\,e^{-V(x)} + \mathbb{1}_K(x)\,e^{-V(x)}. $$
Taking an increasing sequence of compact sets $K_n$ (in the sense of inclusion) and setting
$$ \gamma_n = \sup_{K_n^c} e^{-V}, \qquad b_n = \sup_{K_n} e^{-V} < +\infty, $$
we obtain
$$ Q^V\mathbb{1} \leq \gamma_n\mathbb{1} + b_n\mathbb{1}_{K_n}, $$
which proves the first assertion since $\gamma_n\to 0$ as $n\to+\infty$.

Let us now assume that (42) holds. Setting $W(x) = e^{\beta x^2}$, under the condition
$$ \beta < \frac{1}{2\sigma^2}, \tag{43} $$
an easy computation shows that
$$ QW(x) = \frac{e^{\frac{\beta}{1-2\beta\sigma^2}x^2}}{(1-2\beta\sigma^2)^{\frac{d}{2}}}. $$
We remark that $W$ is not a Lyapunov function for $Q$ since $1-2\beta\sigma^2<1$. However, setting $C_d = (1-2\beta\sigma^2)^{-\frac{d}{2}}$,
$$ Q^V W(x) = C_d\, e^{-V(x)+\frac{\beta}{1-2\beta\sigma^2}x^2} \leq C_d\, e^{c-ax^2+\frac{\beta}{1-2\beta\sigma^2}x^2-\beta x^2}\,W(x) = C'_d\, e^{-ax^2+\frac{2\beta^2\sigma^2}{1-2\beta\sigma^2}x^2}\,W(x), $$
with $C'_d = C_d\,e^c$. One can then check that the choice
$$ 0 < \beta < \frac{a}{2}\left(\sqrt{1+\frac{2}{a\sigma^2}} - 1\right) \tag{44} $$
leads to
$$ -a + \frac{2\beta^2\sigma^2}{1-2\beta\sigma^2} < 0. $$
Note that, since
$$ \frac{a}{2}\left(\sqrt{1+\frac{2}{a\sigma^2}} - 1\right) < \frac{1}{2\sigma^2}, $$
the condition (43) is automatically satisfied when $\beta$ is chosen according to (44). Next, when $\beta$ satisfies (44), the function
$$ \varepsilon(x) = C'_d\, e^{-ax^2+\frac{2\beta^2\sigma^2}{1-2\beta\sigma^2}x^2} $$
tends to zero at infinity. Therefore, taking increasing compact sets $K_n$ (such as balls of increasing radii),
$$ (Q^V W)(x) \leq \mathbb{1}_{K_n^c}(x)\,\varepsilon(x)W(x) + \mathbb{1}_{K_n}(x)\,\varepsilon(x)W(x) \leq \gamma_n W(x) + b_n\mathbb{1}_{K_n}(x), $$
with $\gamma_n = \sup_{K_n^c}\varepsilon\to 0$ as $n\to+\infty$ and $b_n = \sup_{K_n}\varepsilon W<+\infty$. Hence $W$ is a Lyapunov function for $Q^V$ for this choice of $\beta$, i.e. Assumption 1 is satisfied.

Assumption 3 is easily seen to hold. It therefore suffices to prove the minorization condition (Assumption 2). Take a compact set $K$ with non zero Lebesgue measure, and let us first show that the condition of Assumption 2 holds for $Q$. It is enough to prove the condition for the indicator function of any Borel set $S\subset\mathcal{X}$. Denoting by $D_K = \sup\{|x-y|,\ x\in K,\ y\in K\}$ the diameter of $K$, we compute for any $x\in K$
$$ (Q\mathbb{1}_S)(x) = Q(x,S) = (2\pi\sigma^2)^{-\frac{d}{2}}\int_S e^{-\frac{(x-y)^2}{2\sigma^2}}\,dy \geq (2\pi\sigma^2)^{-\frac{d}{2}}\int_{S\cap K} e^{-\frac{(x-y)^2}{2\sigma^2}}\,dy \geq (2\pi\sigma^2)^{-\frac{d}{2}}\, e^{-\frac{D_K^2}{2\sigma^2}}\,|S\cap K|, $$
where we denote again by $|A|$ the Lebesgue measure of a measurable set $A\subset\mathbb{R}^d$, and where the normalization constant of the Gaussian kernel is kept explicitly. This motivates defining
$$ \alpha_K = (2\pi\sigma^2)^{-\frac{d}{2}}\, e^{-\frac{D_K^2}{2\sigma^2}}\,|K| > 0, \qquad \eta_K(S) = \frac{|S\cap K|}{|K|}. $$
Note also that, since $|K|\in(0,+\infty)$, $\eta_K$ is a probability measure. Finally, since $V$ is continuous,
$$ \forall\, x\in K, \quad Q^V(x,\cdot) \geq \alpha_V\,\eta_K(\cdot), $$
with $\alpha_V = \alpha_K\, e^{-\sup_K V}>0$. Choosing $K_n = B(0,n)$ the centered balls of radius $n$, we see that (9) holds using arguments similar to the ones used for the proof of Theorem 3, hence $Q^V$ satisfies Assumption 2.

We now provide an example where the dynamics $Q$ admits a Lyapunov function $W$ in the sense of the condition (58) recalled in Appendix A, and this function is also a Lyapunov function for $Q^f$ when $f$ does not grow too fast.

Proposition 2.
Consider the dynamics corresponding to a discrete Ornstein-Uhlenbeck process in $\mathbb{R}^d$, namely
$$ x_{k+1} = \rho x_k + \sigma G_k, $$
where $\rho\in(-1,1)$, $\sigma\in\mathbb{R}$ and $(G_k)_{k\geq 0}$ is a family of independent standard $d$-dimensional Gaussian random variables. Define the operator $Q^f = e^f Q$ with $f$ a continuous function such that there exist constants $a>0$, $c>0$, $p<2$ for which $f(x)\leq a|x|^p + c$. Then, the Feynman–Kac dynamics associated to $Q^f$ satisfies the assumptions of Theorem 1 with Lyapunov function $W(x) = e^{\beta x^2}$ when
$$ 0 < \beta < \frac{1-\rho^2}{2\sigma^2}. $$

The interpretation of this result is quite different from the interpretation of Proposition 1. Here, the confinement is provided by the dynamics itself, and the weight $f$ has to be controlled by the Lyapunov function of the dynamics. In that case it is important to find a «strong enough» Lyapunov function in order for this control to be possible. Quite typically, if $f$ is unbounded, $W(x) = x^2$ is a Lyapunov function for $Q$, but not for $Q^f$. On the other hand, if $f$ is bounded above, the result is straightforward.

Proof. We set $W(x) = e^{\beta x^2}$ and first compute
$$ QW(x) = \mathbb{E}\big[W(x_{k+1})\,\big|\,x_k = x\big] = \mathbb{E}_G\Big[e^{\beta|\rho x+\sigma G|^2}\Big] = e^{\beta\rho^2x^2}\,\mathbb{E}\Big[e^{\beta(2\sigma\rho\, x\cdot G + \sigma^2G^2)}\Big]. $$
For $\beta<1/(2\sigma^2)$, an easy computation similar to that of Proposition 1 leads to
$$ QW(x) = \frac{1}{(1-2\beta\sigma^2)^{\frac{d}{2}}}\, e^{\frac{\rho^2}{1-2\beta\sigma^2}\beta x^2}. $$
Define now
$$ \delta_\beta = \frac{\rho^2}{1-2\beta\sigma^2}. $$
Then $\delta_\beta\in(0,1)$ and $1-2\beta\sigma^2>0$ for $\beta\in\left(0,\frac{1-\rho^2}{2\sigma^2}\right)$. This leads to
$$ e^{f(x)}QW(x) = \frac{1}{(1-2\beta\sigma^2)^{\frac{d}{2}}}\, e^{f(x)+(\delta_\beta-1)\beta x^2}\,W(x) \leq \frac{1}{(1-2\beta\sigma^2)^{\frac{d}{2}}}\, e^{a|x|^p+c+(\delta_\beta-1)\beta x^2}\,W(x) = \varepsilon(x)W(x), $$
with $\varepsilon(x)\to 0$ when $|x|\to+\infty$. Therefore, by considering again $K_n = B(0,n)$, we see that
$$ \big(Q^fW\big)(x) \leq \mathbb{1}_{K_n^c}(x)\,\varepsilon(x)W(x) + \mathbb{1}_{K_n}(x)\,\varepsilon(x)W(x) \leq \gamma_n W(x) + \mathbb{1}_{K_n}(x)\,b_n, $$
where $\gamma_n = \sup_{K_n^c}\varepsilon\to 0$ as $n\to+\infty$, and $b_n = \sup_{K_n}\varepsilon W<+\infty$. This shows that Assumption 1 is satisfied. Assumptions 2 and 3 follow by arguments similar to those used in the proof of Proposition 1.

The latter examples do not intend to form a complete overview of the possible practical cases. However, they seem characteristic of two typical situations: one where the confinement arises from the dynamics, and another where it comes from the potential $V=-f$. These two strategies correspond respectively to a Large Deviations context [43] and a Diffusion Monte Carlo context [36]. They are both encoded in the condition (6).

We now provide some examples where the conditions of Section 2.3 are met. Our main concern is the Lyapunov condition, Assumption 4, so we assume $f$ and the coefficients of the SDE (27) to be regular enough for Assumption 5 to be satisfied. Let us start with a reversible diffusion.

Proposition 3.
Consider a diffusion process $(X_t)_{t\geq 0}$ over $\mathbb{R}^d$ satisfying (27) with $\sigma=\sqrt{2}$, and assume that the drift is given by $b=-\nabla U$, where $U:\mathcal{X}\to\mathbb{R}$ is a smooth potential such that $U(x)\to+\infty$ as $|x|\to+\infty$. Assume moreover that $U$ satisfies
$$ \lim_{|x|\to+\infty}\frac{|\nabla U(x)|^2}{|\Delta U(x)|} = +\infty, \tag{45} $$
and there exists $1/2<\beta<1$ such that
$$ \lim_{|x|\to+\infty}\Big(-\beta(1-\beta)|\nabla U|^2 + \beta\Delta U + f\Big) = -\infty. \tag{46} $$
Then Assumption 4 holds for the Lyapunov function $W(x) = e^{\beta U(x)}$.

Proof. The proof follows by simple computations. Indeed, it holds
$$ \mathcal{L}W = -\beta\nabla U\cdot(\nabla U)W + \beta\nabla\cdot[(\nabla U)W] = -\beta|\nabla U|^2W + \beta W\Delta U + \beta^2|\nabla U|^2W, $$
so that
$$ (\mathcal{L}+f)W = \Big(-\beta(1-\beta)|\nabla U|^2 + \beta\Delta U + f\Big)W, \tag{47} $$
hence (31) in Assumption 4 is satisfied. The conditions in (32) are obtained setting
$$ \mathscr{W}(x) = e^{\theta U(x)}, $$
for some $\theta\in(1/2,\beta)$. It is clear that $\mathscr{W}/W$ goes to zero at infinity, so the first condition in (32) holds true. The key remark is then to note that for our choice of $\theta$, $\beta$, we have
$$ \beta(1-\beta) \leq \theta(1-\theta). $$
Therefore, (45) and (46) show that there exist $c,c'>0$ such that
$$ f \leq \beta(1-\beta)|\nabla U|^2 - \beta\Delta U + c \leq \theta(1-\theta)|\nabla U|^2 - \theta\Delta U + c' = -\frac{\mathcal{L}\mathscr{W}}{\mathscr{W}} + c'. $$
This proves that the second condition in (32) holds, which concludes the proof.

Let us mention that the conditions in Proposition 3 are similar to conditions appearing in works on Poincaré inequalities (see [2] and references therein), and correspond to the case where the confinement comes from the potential $U$, $f$ being a perturbation that should not go too fast to $+\infty$ with respect to $U$.

Remark 2.
Proposition 3 is also related to confinement conditions for Schrödinger operators. Indeed, using the parameters of Proposition 3, the dynamics is reversible with respect to the measure $e^{-U}$ and, as noted in Section 2.3, it is possible to turn the diffusion operator $\mathcal{L}$ into a Schrödinger operator using the unitary transform:
$$ \mathcal{L} \to e^{-\frac{U}{2}}\,\mathcal{L}\,e^{\frac{U}{2}}. $$
Using this transformation, $\mathcal{L}+f$ is unitarily equivalent [45] to the following Schrödinger operator:
$$ \Delta - \frac{1}{4}|\nabla U|^2 + \frac{1}{2}\Delta U + f. $$
We then notice that the confinement condition for this Schrödinger operator is precisely (46) for the limit value $\beta=1/2$. This shows that our Lyapunov condition (31) is a natural extension of this condition for non-reversible dynamics. As a side product, it shows that a slightly modified confinement condition for a Schrödinger operator does not only provide convergence in $L^2$-norm, but also in a weighted uniform norm, which does not seem to be a standard result.

In the non-reversible setting one cannot hope for a Schrödinger representation, and the Lyapunov function framework shows its usefulness. Let us present such an application, drawn from [22], where the drift behaves polynomially at infinity.
Proposition 4.
Let $(X_t)_{t\geq 0}$ satisfy the SDE (27) with $\sigma=\sqrt{2}$ and where the drift $b$ is such that there exist $q>1$, $\delta>0$, $R>0$ for which
$$ \forall\,|x|\geq R, \quad b(x)\cdot x \leq -\delta|x|^{q}. \tag{48} $$
Assume also that $f$ is smooth and satisfies $f(x)\leq a|x|^p$ for $|x|\geq R$ and some $p<2q-2$. Then, Assumption 4 holds for the Lyapunov function $W(x) = e^{\beta|x|^q}$, with
$$ 0 < \beta < \frac{\delta}{q}. \tag{49} $$

Proof.
Setting $W(x) = e^{\beta|x|^q}$, a simple computation shows that
$$ \begin{aligned} \mathcal{L}W(x) &= \beta q\, b(x)\cdot x\,|x|^{q-2}W(x) + \beta q\,\nabla\cdot\big(x|x|^{q-2}W(x)\big) \\ &= \beta q\, b(x)\cdot x\,|x|^{q-2}W(x) + \beta q d\,|x|^{q-2}W(x) + \beta q(q-2)|x|^{q-2}W(x) + \beta^2q^2|x|^{2q-2}W(x), \end{aligned} \tag{50} $$
so
$$ \frac{\mathcal{L}W}{W}(x) = \beta q\, b(x)\cdot x\,|x|^{q-2} + \beta q(q+d-2)|x|^{q-2} + \beta^2q^2|x|^{2q-2}. $$
Using (48) and the bound on $f$ leads to, for $|x|\geq R$,
$$ \frac{\mathcal{L}W}{W}(x) + f(x) \leq -\beta q(\delta-\beta q)|x|^{2q-2} + \beta q(q+d-2)|x|^{q-2} + a|x|^p. \tag{51} $$
Since $p<2q-2$, (31) is readily satisfied when $0<\beta<\delta/q$.

We end the proof by showing that (32) holds. Similarly to the proof of Proposition 3, we consider
$$ \mathscr{W}(x) = e^{\theta|x|^q}, \quad \text{with } 0<\theta<\beta, $$
which satisfies the first condition in (32). Repeating the calculations leading to (51), since $\theta<\delta/q$ and $p<2q-2$, there exists $c>0$ such that
$$ \frac{\mathcal{L}\mathscr{W}}{\mathscr{W}}(x) + f(x) \leq -\theta q(\delta-\theta q)|x|^{2q-2} + \theta q(q+d-2)|x|^{q-2} + a|x|^p \leq c, $$
so the second condition in (32) holds true, and Assumption 4 is satisfied.

When one considers continuous semigroups as in Section 2.3, it is natural in practical applications to discretize (29), for example with
$$ \Phi_k(\mu)(\varphi) = \frac{\mathbb{E}_\mu\Big[\varphi(x_k)\,e^{\Delta t\sum_{i=0}^{k-1}f(x_i)}\Big]}{\mathbb{E}_\mu\Big[e^{\Delta t\sum_{i=0}^{k-1}f(x_i)}\Big]}, \tag{52} $$
where $(x_k)_{k\in\mathbb{N}}$ is a discretization of the SDE (27) with time step $\Delta t>0$, i.e. $x_k$ is an approximation of $X_{k\Delta t}$. As mentioned in [25], the stability of the discretization schemes for unbounded state spaces was an open question. Our framework covers this situation, as shown by the examples provided in Section 3.1.

Another interesting consequence of our analysis is that we are able to obtain convergence estimates uniform in the time step $\Delta t$, in the sense that the decay rate in fact depends on $k\Delta t$, the physical time of the system, with a prefactor independent of $\Delta t$. It has been the purpose of several works to develop such uniform in $\Delta t$ estimates for time convergence, in particular in the context of Metropolized discretizations of overdamped Langevin dynamics [5, 23], discretization of the Langevin dynamics [45, 44], and other discretizations of SDEs [10, 41, 42]. Our goal is to show that similar results can be obtained for Feynman–Kac semigroups.

For the remainder of this section, we assume that $\mathcal{X}=\mathbb{T}^d$ is the $d$-dimensional torus, the function $\sigma$ in (27) is a positive real constant, and we denote by $\lceil a\rceil$ the upper integer part of $a$ for $a\in\mathbb{R}$.
Considering an unbounded state space $\mathcal{X}$ is also possible but, as noted in [25], this leads to serious technical difficulties; we therefore postpone this case to future works.

We consider here a simplified version of the framework extensively developed in [25]. We say that a kernel operator $Q^f_{\Delta t}$ defines a consistent discretization of the semigroup (29) if it satisfies Assumption 3 and there exist $\Delta t^*>0$, $C>0$, $p\in\mathbb{N}$, and an operator $\mathcal{R}_{\Delta t}: C^\infty(\mathcal{X})\to C^\infty(\mathcal{X})$ (which encodes remainder terms) such that, for any $\varphi\in C^\infty(\mathcal{X})$,
$$ Q^f_{\Delta t}\varphi = \varphi + \Delta t\,(\mathcal{L}+f)\varphi + \Delta t^2\,\mathcal{R}_{\Delta t}\varphi, $$
where, for all $\Delta t\in(0,\Delta t^*]$,
$$ \|\mathcal{R}_{\Delta t}\varphi\|_{B^\infty} \leq C\sup_{m\in\mathbb{N}^d,\ |m|\leq p}\|\partial^m\varphi\|_{B^\infty}, $$
using the notation $\partial^m = \partial^{m_1}_{x_1}\cdots\partial^{m_d}_{x_d}$ for $m=(m_1,\ldots,m_d)\in\mathbb{N}^d$. The dynamics (29) is then approximated by the discrete semigroup
$$ \forall\, k\geq 0,\ \forall\,\mu\in\mathcal{P}(\mathcal{X}),\ \forall\,\varphi\in B^\infty(\mathcal{X}), \quad \Phi_k(\mu)(\varphi) = \frac{\mu\big((Q^f_{\Delta t})^k\varphi\big)}{\mu\big((Q^f_{\Delta t})^k\mathbb{1}\big)}. \tag{53} $$
The latter definition encompasses many numerical schemes; we refer the interested reader to [25] for a justification of this framework and the subsequent numerical analysis. In order to obtain uniform in the time step estimates, we now assume a uniform minorization and boundedness condition of the following form.

Assumption 6.
Fix a time $T>0$. There exist $\Delta t^*>0$, $\eta\in\mathcal{P}(\mathcal{X})$ and $\alpha\in(0,1)$ such that, for any $\Delta t\in(0,\Delta t^*]$, the operator $Q^f_{\Delta t}$ is strong Feller and for any $\varphi\in B^\infty(\mathcal{X})$ with $\varphi\geq 0$,
$$ \forall\, x\in\mathcal{X}, \quad \alpha\,\eta(\varphi) \leq \Big(\big(Q^f_{\Delta t}\big)^{\lceil\frac{T}{\Delta t}\rceil}\varphi\Big)(x) \leq \alpha^{-1}\,\eta(\varphi). \tag{54} $$

The lower bound in (54) corresponds to a minorization condition with respect to a physical time $T>0$.
Remark 3.
Although Assumption 6 holds in many situations when $\mathcal{X}$ is compact, the requirement that the upper bound in (54) holds may not seem natural in view of the results of Section 2.2. Indeed, our framework shows that this upper bound is not necessary to prove the ergodicity of Feynman–Kac semigroups, as opposed to previous works [13, 15, 12, 25]. A careful look at the proof of Theorem 4 shows that this upper bound is only used to show the uniform boundedness of $h_{\Delta t}$ in (56). However, controlling $h_{\Delta t}$ as $\Delta t\to 0$ does not seem to be an easy task in general. We therefore stick to this assumption here.

Before stating our uniform in $\Delta t$ convergence result, we need the following estimate deduced from [25, Lemma 5], whose proof can be found in Appendix F.

Lemma 4.
Consider the process $(X_t)_{t\geq 0}$ solution to (27) with $\sigma=\mathrm{Id}$, $b\in C^\infty(\mathcal{X})$, and a function $f\in C^\infty(\mathcal{X})$. Then the operator $\mathcal{L}+f$ admits a real isolated largest (in modulus) eigenvalue $\lambda$ with eigenvector $h\in C^\infty(\mathcal{X})$ and associated eigenspace of dimension one, which satisfies
$$ (\mathcal{L}+f)h = \lambda h, \quad\text{and}\quad P^f_t h = e^{t\lambda}h, \quad \forall\, t\geq 0. $$
If $Q^f_{\Delta t}$ is a consistent discretization of (29) satisfying Assumption 6, then for any $\Delta t>0$, the operator $Q^f_{\Delta t}$ has a largest (in modulus) eigenvalue $\Lambda_{\Delta t}\in\mathbb{R}$, which is non-degenerate. The associated eigenvector $h_{\Delta t}$, such that
$$ Q^f_{\Delta t}h_{\Delta t} = \Lambda_{\Delta t}h_{\Delta t}, $$
is normalized as $\eta(h_{\Delta t})=1$. Finally, there exist $\Delta t^*>0$, $C>0$, $\varepsilon>0$ such that for all $\Delta t\in(0,\Delta t^*]$, there is $c_{\Delta t}\in\mathbb{R}$ for which
$$ \Lambda_{\Delta t} = e^{\Delta t\,\lambda+\Delta t^2c_{\Delta t}}, \tag{55} $$
with $|c_{\Delta t}|\leq C$ and
$$ \forall\, x\in\mathcal{X},\ \forall\,\Delta t\in(0,\Delta t^*], \quad \varepsilon \leq h_{\Delta t}(x) \leq \varepsilon^{-1}. \tag{56} $$

This result means that the evolution operator associated with a consistent discretization has a principal eigenvalue approximating the principal eigenvalue of the continuous dynamics, and that its associated principal eigenvector remains uniformly bounded from below and above if $\Delta t$ is sufficiently small. We will see in Proposition 5 that Assumption 6 is naturally satisfied if a similar condition holds for $Q_{\Delta t}$ and the evolution operator reads $Q^f_{\Delta t} = e^{\Delta t f}Q_{\Delta t}$ (which corresponds to the discretization (52)). Let us now state the uniform in $\Delta t$ version of Theorem 1.

Theorem 4.
Consider a consistent discretization $Q^f_{\Delta t}$ of the dynamics (29) satisfying Assumption 6. Then, there exists $\Delta t^*>0$ such that, for any $\Delta t\in(0,\Delta t^*]$, the dynamics (53) admits a unique invariant measure $\mu^*_{f,\Delta t}\in\mathcal{P}(\mathcal{X})$. Moreover, there exist $\kappa>0$, $C>0$ such that for any $\varphi\in B^\infty(\mathcal{X})$, $\mu\in\mathcal{P}(\mathcal{X})$, and $\Delta t\in(0,\Delta t^*]$,
$$ \forall\, k\geq 0, \quad \big|\Phi_k(\mu)(\varphi) - \mu^*_{f,\Delta t}(\varphi)\big| \leq C\,e^{-\kappa k\Delta t}\,\|\varphi\|_{B^\infty}. $$

Let us note that the uniformity of the prefactor $C$ in the initial condition is a consequence of the boundedness of $\mathcal{X}$. Indeed, in this case, we can choose $W\equiv\mathbb{1}$, and the constant $C_\mu$ in (24) can be uniformly bounded using (56). Such a uniformity does not hold for Theorem 1 since in that case $\mathcal{X}$ was not assumed to be bounded. The important part of the theorem is the control of $C$ and $\kappa$ in the time step, which provides convergence with respect to the physical time $k\Delta t$.

Proof. The proof essentially relies on the fact that if $Q^f_{\Delta t}$ satisfies Assumption 6, then $Q_{h,\Delta t}$ defined as in Lemma 3 satisfies a uniform minorization condition. For controlling the dependencies in the time step, we rely on Lemma 4, and use the same notation.

We want to prove a uniform minorization condition (in the sense of [45, Lemma 3.4]) for the operator defined by
$$ Q_{h,\Delta t} = \Lambda_{\Delta t}^{-1}\,h_{\Delta t}^{-1}\,Q^f_{\Delta t}\,h_{\Delta t}, $$
and apply [45, Corollary 3.5]. Fix $T>0$. From (54) and (56) we have, for any $\varphi\geq 0$ and $x\in\mathcal{X}$,
$$ Q_{h,\Delta t}^{\lceil\frac{T}{\Delta t}\rceil}\varphi(x) = \Lambda_{\Delta t}^{-\lceil\frac{T}{\Delta t}\rceil}\,h_{\Delta t}^{-1}(x)\,\big(Q^f_{\Delta t}\big)^{\lceil\frac{T}{\Delta t}\rceil}(h_{\Delta t}\varphi)(x) \geq \Lambda_{\Delta t}^{-\lceil\frac{T}{\Delta t}\rceil}\,\varepsilon^2\alpha\,\eta(\varphi). \tag{57} $$
Moreover, from (55),
$$ \Lambda_{\Delta t}^{-\lceil\frac{T}{\Delta t}\rceil} = e^{-\Delta t(\lambda+\Delta t\,c_{\Delta t})\lceil\frac{T}{\Delta t}\rceil} \geq e^{-2|\lambda|T} > 0, $$
upon possibly reducing $\Delta t^*$. Then, (57) becomes
$$ \forall\, x\in\mathcal{X}, \quad Q_{h,\Delta t}^{\lceil\frac{T}{\Delta t}\rceil}(x,\cdot) \geq \alpha\varepsilon^2\,e^{-2|\lambda|T}\,\eta(\cdot). $$
As a result, $Q_{h,\Delta t}$ satisfies the assumptions of [45, Corollary 3.5]: there exist a unique measure $\mu_{h,\Delta t}\in\mathcal{P}(\mathcal{X})$, $C>0$ and $\kappa>0$ such that for any $\phi\in B^\infty(\mathcal{X})$, $k\in\mathbb{N}$ and $\Delta t\in(0,\Delta t^*]$,
$$ \big\|Q^k_{h,\Delta t}\phi - \mu_{h,\Delta t}(\phi)\big\|_{B^\infty} \leq C\,e^{-\kappa k\Delta t}\,\|\phi\|_{B^\infty}. $$
This is a version of Lemma 3 uniform with respect to $\Delta t$. The result then follows by rewriting the proof of Theorem 1, with $\bar{\alpha}^k$ replaced by $e^{-\kappa k\Delta t}$.

It only remains to study the constant $C_{\mu,\Delta t}$ arising in Theorem 1 (see (24)), which now also depends on $\Delta t$ through the eigenvector $h_{\Delta t}$ and the invariant measure $\mu_{h,\Delta t}$. Since $\mathcal{X}$ is bounded, we can actually choose a constant Lyapunov function, i.e. $W=\mathbb{1}$. Next, (56) shows that, for any $\Delta t\in(0,\Delta t^*]$ and any $\mu\in\mathcal{P}(\mathcal{X})$, the quantities $\mu_{h,\Delta t}(h_{\Delta t}^{-1})$ and $\mu(h_{\Delta t})$ entering the expression (24) of $C_{\mu,\Delta t}$ satisfy $\mu_{h,\Delta t}(h_{\Delta t}^{-1})\leq\varepsilon^{-1}$ and $\mu(h_{\Delta t})\geq\varepsilon$, so that $C_{\mu,\Delta t}$ is bounded by a constant depending only on $\varepsilon$. This provides a uniform bound on $C_{\mu,\Delta t}$, which concludes the proof.

We now show that the setting of Theorem 4 is natural, since Assumption 6 can be deduced from a similar assumption on the Markov dynamics $Q_{\Delta t}$ when the evolution operator is $Q^f_{\Delta t} = e^{\Delta t f}Q_{\Delta t}$, which corresponds to the discretization (52). For proving the condition on $Q_{\Delta t}$, we refer to [45] and the references therein.

Proposition 5.
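Assume that $\mathcal{X}$ is bounded, $f\in C(\mathcal{X})$, and the SDE (27) is discretized for a given time step $\Delta t>0$ with a Markov chain $(x_k)_{k\in\mathbb{N}}$ whose evolution operator $Q_{\Delta t}$ is strong Feller and satisfies the following uniform minorization and boundedness condition: for a fixed $T>0$, there exist $\Delta t^*>0$, $\eta\in\mathcal{P}(\mathcal{X})$ and $\alpha\in(0,1)$ such that, for any $\Delta t\in(0,\Delta t^*]$ and $\varphi\in B^\infty(\mathcal{X})$ with $\varphi\geq 0$,
$$ \forall\, x\in\mathcal{X}, \quad \alpha\,\eta(\varphi) \leq \big(Q_{\Delta t}\big)^{\lceil\frac{T}{\Delta t}\rceil}\varphi(x) \leq \alpha^{-1}\,\eta(\varphi). $$
Then, the transition operator $Q^f_{\Delta t}$ defined as $Q^f_{\Delta t} = e^{\Delta t f}Q_{\Delta t}$ satisfies Assumption 6.

Proof. Since $Q_{\Delta t}$ is strong Feller and $f$ is continuous, $Q^f_{\Delta t}$ is strong Feller. Then, for any $k\in\mathbb{N}$ and $\varphi\in B^\infty(\mathcal{X})$ with $\varphi\geq 0$,
$$ (Q^f_{\Delta t})^k\varphi(x) = \mathbb{E}_x\Big[\varphi(x_k)\,e^{\Delta t\sum_{i=0}^{k-1}f(x_i)}\Big] \geq e^{-k\Delta t\|f\|_{B^\infty}}\,\mathbb{E}_x[\varphi(x_k)] = e^{-k\Delta t\|f\|_{B^\infty}}\big((Q_{\Delta t})^k\varphi\big)(x). $$
Taking $k=\lceil T/\Delta t\rceil$ with $0<\Delta t\leq\Delta t^*$, so that $k\Delta t\leq T+\Delta t\leq 2T$ upon possibly reducing $\Delta t^*$, then shows that
$$ (Q^f_{\Delta t})^{\lceil\frac{T}{\Delta t}\rceil}\varphi(x) \geq e^{-2T\|f\|_{B^\infty}}\Big(Q_{\Delta t}^{\lceil\frac{T}{\Delta t}\rceil}\varphi\Big)(x) \geq e^{-2T\|f\|_{B^\infty}}\,\alpha\,\eta(\varphi). $$
A similar computation for the upper bound allows us to conclude the proof.

Discussion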
The ideas developed in this work concerning the ergodicity of Feynman–Kac semigroups solve several problems for which, to the best of our knowledge, no solution was available. They are closely related to previous works and we want to highlight two important connections.

First, as we mentioned in the introduction, our framework can be considered as an extension of ergodic theory for Markov chains [46], when the evolution operator of the dynamics does not conserve probability. For this reason, we tried to formulate our assumptions in the flavour of [35]. However, the spectral theory on which we crucially rely in our study requires stronger conditions. This leaves open a few questions, such as the convergence of Feynman–Kac dynamics based on Metropolis type kernels, which lack regularity, or the case of non-Polish spaces, which may arise for stochastic partial differential equations. Finally, another interesting feature of our framework is that we can prove ergodicity for Feynman–Kac dynamics for which the underlying Markov chain is not ergodic, a case we called Diffusion Monte Carlo (DMC) in analogy with quantum physics models (see Proposition 1).

The other clear connection concerns Large Deviations theory. Indeed, one motivation for studying Feynman–Kac dynamics is to prove large deviations principles for additive functionals of Markov chains [20, 17, 58, 43], which can be achieved by proving the existence of formulas such as (25). It is then no surprise that the spectral theory we develop, although based on [50], is reminiscent of [43], and requires stronger assumptions than the ones needed for proving ergodicity in [35]. However, the tools we use seem new in this context, and more adapted to the situation at hand, for instance the Krein-Rutman theorem based on the minorization condition. In particular, [43] (like [24]) makes use of nonlinear generators related to an optimal control problem. This actually does not seem necessary to obtain the desired spectral properties. It seems interesting to investigate the links of our work with [43] in order to prove large deviations principles in the so-called «$\tau^W$-topology», which seems the most adapted to this situation.

Acknowledgements
The authors are grateful to Jonathan C. Mattingly for interesting discussions at a preliminary stage of this work. The authors also warmly thank Nicolas Champagnat and Denis Villemonais for pointing out a gap in one argument in the first version of the manuscript. The PhD of Grégoire Ferré is supported by the Labex Bézout. The work of Gabriel Stoltz was funded in part by the Agence Nationale de la Recherche, under grant ANR-14-CE23-0012 (COSMOS). Gabriel Stoltz and Mathias Rousset are supported by the European Research Council under the European Union's Seventh Framework Programme (FP/2007-2013)/ERC Grant Agreement number 614492. The work of Mathias Rousset is supported by the INRIA Rennes and the IRMAR. We also benefited from the scientific environment of the Laboratoire International Associé between the Centre National de la Recherche Scientifique and the University of Illinois at Urbana-Champaign.
A Stability of Markov chains
In this section, we recall the results presented in [35]. We consider a measurable space $\mathcal{X}$ and a Markov chain $(x_k)_{k\geq 0}$ with transition kernel $Q$ on $\mathcal{X}$. By transition kernel, we mean that (i) for all $x\in\mathcal{X}$, $Q(x,\cdot)$ is a positive measure on $\mathcal{X}$, (ii) for any measurable set $A\subset\mathcal{X}$, $Q(\cdot,A)$ is measurable, and (iii) $Q\mathbb{1}=\mathbb{1}$. In the notation of Section 2, $Q$ is a kernel operator (i.e. (i) and (ii) are satisfied) such that $Q\mathbb{1}=\mathbb{1}$.

The stability of Markov dynamics can be obtained from minorization and Lyapunov conditions [46, 50, 35].

Assumption 7. There exist a function $\mathcal{W}:\mathcal{X}\to[1,+\infty)$ and constants $C>0$, $\gamma\in(0,1)$ such that
$$ \forall\, x\in\mathcal{X}, \quad (Q\mathcal{W})(x) \leq \gamma\,\mathcal{W}(x) + C. \tag{58} $$

Given such a Lyapunov function, we consider the associated functional space as in (7). A second key ingredient in the ergodicity of $Q$ is the following minorization condition. (Compared to [35], we replace $\mathcal{W}$ by $\mathcal{W}+1$; this is for notational convenience only.)

Assumption 8. There exist $\alpha\in(0,1)$ and $\eta\in\mathcal{P}(\mathcal{X})$ such that
$$ \inf_{x\in\mathcal{C}} Q(x,\cdot) \geq \alpha\,\eta(\cdot), \tag{59} $$
where $\mathcal{C} = \{x\in\mathcal{X}\ |\ \mathcal{W}(x)\leq R+1\}$ for some $R>2C/(1-\gamma)$, and $\gamma$, $C$ are the constants from Assumption 7.

The following result holds under these conditions (see [35, Theorem 1.2]).
Theorem 5.
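Let Assumptions 7 and 8 hold. Then, $Q$ has a unique invariant measure $\mu^*$, which is such that $\mu^*(\mathcal{W})<+\infty$. Moreover, there exist $C>0$ and $\bar{\alpha}\in(0,1)$ such that, for any $\varphi\in B^\infty_{\mathcal{W}}(\mathcal{X})$,
$$ \forall\, k\geq 0, \quad \big\|Q^k\varphi - \mu^*(\varphi)\big\|_{B^\infty_{\mathcal{W}}} \leq C\,\bar{\alpha}^k\,\big\|\varphi - \mu^*(\varphi)\big\|_{B^\infty_{\mathcal{W}}}. $$

B Useful theorems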
We recall here some definitions and results around the Krein-Rutman theorem, as well as some basic results from analysis. Let us start with some operator-theoretic definitions from [47, 49, 11, 50].
Definition 1. For a Banach space $E$ and an operator $T \in \mathcal{B}(E)$, we denote by $\Lambda(T)$ its spectral radius, defined by
\[
\Lambda(T) = \lim_{k \to +\infty} \left\| T^k \right\|_{\mathcal{B}(E)}^{1/k} = \inf_{k \geq 1} \left\| T^k \right\|_{\mathcal{B}(E)}^{1/k}.
\]
We denote by $\theta(T)$ the essential spectral radius of $T$, defined by (see [48, Eq. (1.14)] and [47, Theorem 1]):
\[
\theta(T) = \lim_{k \to +\infty} \Big( \inf \big\{ \| T^k - Q \|_{\mathcal{B}(E)},\ Q \text{ compact} \big\} \Big)^{1/k} = \inf_{k \geq 1} \Big( \inf \big\{ \| T^k - Q \|_{\mathcal{B}(E)},\ Q \text{ compact} \big\} \Big)^{1/k}.
\]
An operator $T \in \mathcal{B}(E)$ is said to be compact if it maps bounded sets into precompact sets. In other words, $T$ is compact if, for any bounded sequence $(u_n)_{n \in \mathbb{N}}$ in $E$, there is a subsequence $(n_k)_{k \in \mathbb{N}}$ such that $(T u_{n_k})_{k \in \mathbb{N}}$ converges in $E$, see [49].

In order to recall the Krein-Rutman theorem, let us first give some definitions for cones in Banach spaces.
Definition 2. Let $E$ be a Banach space. A closed convex set $K \subset E$ is said to be a cone if $K \cap (-K) = \{0\}$ and, for all $u \in K$ and $\alpha \in \mathbb{R}_+$, it holds $\alpha u \in K$. A cone is total if the norm closure of $K - K$ is equal to $E$.

We now recall a weak version of the Krein-Rutman theorem, which can be found in [48, Theorem 1.1]. Interesting remarks and comments are also available in [11, Section 19.8].
Theorem 6. Let $E$ be a Banach space, $K \subset E$ a total cone, and $T \in \mathcal{B}(E)$ be such that $\theta(T) < \Lambda(T)$ and $TK \subset K$. Then $\Lambda(T)$ is an eigenvalue of $T$ with an eigenvector in $K$.

In Theorem 6, there is no uniqueness of the eigenvector. Non-degeneracy can be obtained under stronger positivity conditions on the operator $T$, as made precise in [11, Theorems 19.3 and 19.5]. In order to control the essential spectral radius and apply the Krein-Rutman theorem, we will need the following classical results, see [54, Theorem 11.28] and [55, Theorem 2.7.19].

Theorem 7 (Ascoli). Let $(\mathcal{Y}, d_{\mathcal{Y}})$ be a compact metric space and $C(\mathcal{Y})$ be the space of continuous functions over $\mathcal{Y}$ endowed with the uniform norm $\| f \|_{C} = \sup_{y \in \mathcal{Y}} |f(y)|$. Consider a uniformly bounded and equicontinuous sequence $(f_n)_{n \in \mathbb{N}}$, i.e. a sequence for which there exists $M > 0$ such that $\| f_n \|_{C} \leq M$ for all $n \geq 0$, and for any $\varepsilon > 0$ there exists $\delta > 0$ such that $d_{\mathcal{Y}}(x,y) \leq \delta$ implies $|f_n(x) - f_n(y)| \leq \varepsilon$ for all $n \geq 0$. Then $(f_n)_{n \in \mathbb{N}}$ converges in the uniform norm to some limit $f$ up to extraction.

Theorem 8 (Heine-Cantor). Let $f : E \to F$ where $(E, d_E)$ and $(F, d_F)$ are two metric spaces and $E$ is compact. Then, if $f$ is continuous, it is uniformly continuous: for any $\varepsilon > 0$, there is $\delta > 0$ such that for any $x, x' \in E$ with $d_E(x,x') \leq \delta$, it holds $d_F(f(x), f(x')) \leq \varepsilon$.

We close this section with some results in probability theory. The next lemma can be found in [33, Lemma 4.14].
Lemma 5. If $\mathcal{X}$ is a Polish space and $\mu \in \mathcal{P}(\mathcal{X})$, then the family consisting of the single measure $\mu$ is tight, i.e. for any $\varepsilon > 0$, there exists a compact set $K \subset \mathcal{X}$ such that $\mu(K) \geq 1 - \varepsilon$.

We finally present results concerning ultra-Feller operators, extending the ones of [34, Appendix A]. Recall that the total variation distance between two positive measures $\mu, \nu \in \mathcal{M}(\mathcal{X})$ is defined by the norm:
\[
\| \mu - \nu \|_{\mathrm{TV}} = \sup_{\substack{\varphi \in B^\infty(\mathcal{X}) \\ \| \varphi \|_{B^\infty} \leq 1}} \left( \int_{\mathcal{X}} \varphi \, d\mu - \int_{\mathcal{X}} \varphi \, d\nu \right). \tag{60}
\]

Definition 3 (Ultra-Feller). A kernel operator $Q$ is ultra-Feller if the mapping $x \mapsto Q(x,\cdot) \in \mathcal{M}(\mathcal{X})$ is continuous in the total variation distance (60).

The next lemma, used to show that an operator is ultra-Feller, is adapted from [34, Appendix A].
Lemma 6. Suppose that $P$ and $Q$ are two kernel operators over a Polish space $\mathcal{X}$ that satisfy the following properties:
• for all $\varphi \in B^\infty(\mathcal{X})$, $Q\varphi$ is continuous and finite;
• for all $\psi$ such that $|\psi| \leq Q\mathbf{1}$, $P\psi$ is continuous and finite.
Then $PQ$ is ultra-Feller.
We recall some elements of the proof from [34, Theorem 1.6.6], which is based on the Banach-Alaoglu theorem. The details are left to the reader.
Proof. A first element to prove Lemma 6 is to show that, if $Q$ is strong Feller, then there exists a reference probability measure $\zeta \in \mathcal{P}(\mathcal{X})$ such that for any $x \in \mathcal{X}$, $Q(x,\cdot)$ is absolutely continuous with respect to $\zeta$. This is shown in [34, Lemma 1.6.4] for operators $Q$ such that $Q\mathbf{1} = \mathbf{1}$. Even for a non-probabilistic $Q$, we can consider the normalized probabilities
\[
\frac{Q(x,\cdot)}{(Q\mathbf{1})(x)},
\]
for $x$ in the open set $\widetilde{\mathcal{X}} := \{ x \in \mathcal{X} \mid (Q\mathbf{1})(x) > 0 \}$. We can apply [34, Lemma 1.6.4] to these probabilities defined over the set $\widetilde{\mathcal{X}}$, so there exists a measure $\zeta$ such that, for any $x \in \widetilde{\mathcal{X}}$, $Q(x,\cdot)$ is absolutely continuous with respect to $\zeta$. If $x \in \mathcal{X} \setminus \widetilde{\mathcal{X}}$, $Q(x,\cdot) = 0$, which is also absolutely continuous with respect to $\zeta$, so that $Q(x,\cdot)$ is absolutely continuous with respect to $\zeta$ for any $x \in \mathcal{X}$.

Once this is done, one can write the kernel $Q$ as $Q(y,dz) = k(y,z)\,\zeta(dz)$ with $k(y,\cdot) \in L^1(\mathcal{X},\zeta)$ for all $y \in \mathcal{X}$. If one supposes by contradiction that $PQ$ is not ultra-Feller, then Definition 3 shows that there exist a sequence of functions $(g_n)_{n \in \mathbb{N}}$ with $\| g_n \|_{B^\infty} \leq 1$ and a sequence $(x_n)_{n \in \mathbb{N}}$ converging to an element $x \in \mathcal{X}$ such that, for some $\delta > 0$,
\[
\forall\, n \in \mathbb{N}, \qquad PQg_n(x_n) - PQg_n(x) > \delta. \tag{61}
\]
Since the sequence $(g_n)_{n \in \mathbb{N}}$ is bounded, it possesses a weak-$*$ converging subsequence in $L^\infty(\mathcal{X},\zeta)$ (the space of $\zeta$-essentially bounded functions) to an element $g \in B^\infty(\mathcal{X},\zeta)$. In particular it holds (upon extracting a subsequence), for any $y \in \mathcal{X}$,
\[
\lim_{n \to +\infty} Qg_n(y) = \lim_{n \to +\infty} \int_{\mathcal{X}} k(y,z)\, g_n(z)\, \zeta(dz) = \int_{\mathcal{X}} k(y,z)\, g(z)\, \zeta(dz) = Qg(y).
\]
Defining $f_n = Qg_n$, the latter limit shows that $f_n$ converges pointwise to $f = Qg$. Since $(g_n)_{n \in \mathbb{N}}$ is bounded in $B^\infty(\mathcal{X})$, the second condition in Lemma 6 ensures that $Pf_n(x) \to Pf(x)$ for all $x \in \mathcal{X}$, by the dominated convergence theorem. This is the main difference compared to the proof in [34, Theorem 1.6.6]. The contradiction follows similarly.

Indeed, defining the positive decreasing function $h_n = \sup_{m \geq n} |f_m - f|$ we have, for any $m \in \mathbb{N}$,
\[
\lim_{n \to +\infty} Ph_n(x_n) \leq \lim_{n \to +\infty} Ph_m(x_n) = Ph_m(x),
\]
so that $Ph_n(x_n) \to 0$ as $n \to +\infty$. In the end,
\[
\lim_{n \to +\infty} \big( Pf_n(x_n) - Pf(x) \big) \leq \lim_{n \to +\infty} \left| Pf_n(x_n) - Pf(x_n) \right| + \lim_{n \to +\infty} \left| Pf(x_n) - Pf(x) \right| = 0,
\]
which comes in contradiction with (61) and concludes the proof.

C Proof of Lemma 1
Let us show that $\mu(Q^f\mathbf{1}) > 0$ for any $\mu \in \mathcal{P}(\mathcal{X})$. First, Lemma 5 in Appendix B ensures that, for any $\varepsilon > 0$, there exists a compact set $K \subset \mathcal{X}$ such that $\mu(K) \geq 1 - \varepsilon$. Consider next a compact set $K_n$ of Assumption 1 such that $K \subset K_n$. Then, with the corresponding $\alpha_n > 0$ and $\eta_n \in \mathcal{P}(\mathcal{X})$ defined in Assumption 2, we have
\[
\forall\, x \in K_n, \qquad (Q^f\mathbf{1})(x) \geq \alpha_n \eta_n(\mathbf{1}) = \alpha_n > 0.
\]
Integrating with respect to $\mu$ leads to
\[
\int_{\mathcal{X}} (Q^f\mathbf{1})(x)\, \mu(dx) \geq \int_{K_n} (Q^f\mathbf{1})(x)\, \mu(dx) \geq \alpha_n \int_{K_n} \mu(dx) = \alpha_n \mu(K_n) \geq \alpha_n (1 - \varepsilon) > 0,
\]
since $K \subset K_n$, which proves the statement. Moreover $W \geq 1$, so (6) implies that $\mu(Q^f\mathbf{1}) \leq \mu(Q^f W) < +\infty$ if $\mu(W) < +\infty$.

Since $W \geq 1$, we immediately have that $\eta_n(W) \geq 1 > 0$ for any $n \geq 1$. Now, for any $n \geq 1$ and $x \in K_n$, Assumptions 1 and 2 lead to
\[
\alpha_n \eta_n(W) \leq Q^f W(x) \leq \gamma_n W(x) + b_n \mathbf{1}_{K_n}(x) < +\infty,
\]
since $W$ is finite. Moreover, $\alpha_n > 0$, so that $\eta_n(W) < +\infty$ for any $n \geq 1$.

Let us finally prove the last claim of Lemma 1 by contradiction. Assume that there exists $n_0 \geq 1$ such that, for all $n \geq n_0$, we have $\eta_n(K_n) = 0$. Consider $m \geq n_0$ and $\varphi_m = \mathbf{1}_{K_m} \geq 0$. Then, using (8) with $n = n_0$,
\[
\forall\, x \in K_{n_0}, \qquad \big( Q^f \varphi_m \big)(x) \geq \alpha_{n_0} \eta_{n_0}(K_m).
\]
Using again Lemma 5 in Appendix B, we see that for $m$ large enough, $\big( Q^f \varphi_m \big)(x) > 0$ for $x \in K_{n_0}$ and so $Q^f \varphi_m \neq 0$. However, for $n \geq m$, we have, using that $K_m \subset K_n$ (since the sets are increasing):
\[
0 \leq \eta_n(\varphi_m) = \eta_n(K_m) \leq \eta_n(K_n) = 0,
\]
since we assumed $\eta_n(K_n) = 0$ for $n \geq n_0$. The contradiction with (9) shows that there exists $\bar n \geq n_0$ such that $\eta_{\bar n}(K_{\bar n}) > 0$. Since $n_0$ is arbitrary, $\bar n$ can be chosen arbitrarily large, and this concludes the proof of Lemma 1.

D Proof of Lemma 2
The proof is decomposed into four steps. First we show that the essential spectral radius of the operator $Q^f$ considered over $B^\infty_W(\mathcal{X})$ is zero. We next prove that the spectral radius $\Lambda$ of $Q^f$ is positive. We then use the Krein-Rutman theorem to obtain that $\Lambda$ is an eigenvalue of $Q^f$ with largest modulus, and finally show that the associated eigenvector is positive.

Step 1: $Q^f$ has zero essential spectral radius

We first perform the following decomposition, for any $n \geq 1$:
\[
(Q^f)^2 = \mathbf{1}_{K_n} Q^f \mathbf{1}_{K_n} Q^f + \mathbf{1}_{K_n^c} (Q^f)^2 + \mathbf{1}_{K_n} Q^f \mathbf{1}_{K_n^c} Q^f,
\]
where $K_n \subset \mathcal{X}$ are the compact sets from Section 2.2. Applying again $Q^f$ leads to
\[
(Q^f)^3 = \big( \mathbf{1}_{K_n} Q^f \mathbf{1}_{K_n} \big)^2 Q^f + \mathbf{1}_{K_n^c} Q^f \big( \mathbf{1}_{K_n} Q^f \big)^2 + Q^f \mathbf{1}_{K_n^c} (Q^f)^2 + Q^f \mathbf{1}_{K_n} Q^f \mathbf{1}_{K_n^c} Q^f. \tag{62}
\]
We will show that $Q^f_n := \mathbf{1}_{K_n} Q^f \mathbf{1}_{K_n}$ is such that $(Q^f_n)^2$ is compact on $B^\infty_W(\mathcal{X})$, while $\mathbf{1}_{K_n^c} Q^f$ tends to zero in norm. This will prove that $(Q^f)^3$ is compact as a limit of compact operators in operator norm, so the essential spectral radius of $Q^f$ in $B^\infty_W(\mathcal{X})$, denoted by $\theta(Q^f)$, is equal to zero.

Let us first prove that $(Q^f_n)^2$ is compact on $B^\infty_W(\mathcal{X})$ for any $n \in \mathbb{N}$. For this, we use the ultra-Feller property proved in Lemma 6 (see Appendix B) to apply the Ascoli theorem. Consider a sequence $(\varphi_k)_{k \in \mathbb{N}}$ in $B^\infty_W(\mathcal{X})$ such that $\| \varphi_k \|_{B^\infty_W} \leq M$ for some $M > 0$. By Assumption 3, the operator $Q^f_n$ is strong Feller over the compact set $K_n$. In particular, for $\varphi \in B^\infty_W(\mathcal{X})$, $\varphi \mathbf{1}_{K_n} \in B^\infty(\mathcal{X})$, so $Q^f_n \varphi$ is continuous over $K_n$ and finite, so that Lemma 6 in Appendix B applies. Indeed, the second condition in the lemma is easy to check since $Q^f_n \mathbf{1}$ is equal to zero outside the compact $K_n$. Therefore, $(Q^f_n)^2$ is ultra-Feller by Lemma 6. By Definition 3, the application $x \in K_n \mapsto (Q^f_n)^2(x,\cdot) \in \mathcal{M}(\mathcal{X})$ is continuous in total variation norm. Since $K_n$ is compact in the metric space $\mathcal{X}$ and $\mathcal{P}(\mathcal{X})$ is a metric space, the Heine-Cantor theorem (Theorem 8 in Appendix B) ensures that this application is uniformly continuous over $K_n$. This means that, for any $\varepsilon > 0$, there exists $\delta > 0$ such that, for any $x, x' \in K_n$ with $|x - x'| \leq \delta$, it holds
\[
\sup_{\| \varphi \|_{B^\infty} \leq 1} \left| \big( (Q^f_n)^2 \varphi \big)(x) - \big( (Q^f_n)^2 \varphi \big)(x') \right| \leq \varepsilon. \tag{63}
\]
Noting that Assumption 1 implies that $1 \leq \sup_{K_n} W < +\infty$, it holds $M_n = (\sup_{K_n} W)^{-1} \in (0,1]$ for any $n \geq 1$, and
\[
\big\{ \varphi \text{ measurable} \,\big|\, \| \mathbf{1}_{K_n} \varphi \|_{B^\infty} \leq 1 \big\} \supset \big\{ \varphi \text{ measurable} \,\big|\, \| \mathbf{1}_{K_n} \varphi \|_{B^\infty_W} \leq M_n \big\}. \tag{64}
\]
Since $Q^f_n = \mathbf{1}_{K_n} Q^f \mathbf{1}_{K_n}$, (64) shows that (63) becomes
\[
\sup_{\| \varphi \|_{B^\infty_W} \leq M_n} \left| \big( (Q^f_n)^2 \varphi \big)(x) - \big( (Q^f_n)^2 \varphi \big)(x') \right| \leq \varepsilon.
\]
As a consequence, if $(\varphi_k)_{k \in \mathbb{N}}$ is such that $\| \varphi_k \|_{B^\infty_W} \leq M$, we see that $\big( (Q^f_n)^2 \varphi_k \big)_{k \in \mathbb{N}}$ is equicontinuous. By the Ascoli theorem, it therefore converges uniformly, up to extraction, to a continuous limit on $K_n$ (since the function is supported on $K_n$, we extend it by 0 on $\mathcal{X}$ outside $K_n$). Since $W \geq 1$, it also converges as a function in $B^\infty_W(\mathcal{X})$, showing that $(Q^f_n)^2$ is a compact operator on $B^\infty_W(\mathcal{X})$. Since $Q^f$ is bounded over $B^\infty_W(\mathcal{X})$ and the space of compact operators is stable by composition with bounded operators [49], $(Q^f_n)^2 Q^f$ is also compact.

We now show that the second, third and fourth operators on the right hand side of (62) tend to 0 in the operator norm of $B^\infty_W(\mathcal{X})$. For any $\varphi \in B^\infty_W(\mathcal{X})$,
\[
\left\| \mathbf{1}_{K_n^c} Q^f \varphi \right\|_{B^\infty_W} = \left\| \frac{\mathbf{1}_{K_n^c} Q^f \varphi}{W} \right\|_{B^\infty} \leq \| \varphi \|_{B^\infty_W} \left\| \frac{\mathbf{1}_{K_n^c} Q^f W}{W} \right\|_{B^\infty} \leq \gamma_n \| \varphi \|_{B^\infty_W}.
\]
Taking the supremum over $\varphi \in B^\infty_W(\mathcal{X})$ and using $\gamma_n \to 0$ as $n \to +\infty$, we obtain:
\[
\left\| \mathbf{1}_{K_n^c} Q^f \right\|_{\mathcal{B}(B^\infty_W)} \xrightarrow[n \to +\infty]{} 0. \tag{65}
\]
Since $Q^f$ is bounded in $\mathcal{B}(B^\infty_W)$, the second, third and fourth operators on the right hand side of (62) vanish in norm as $n \to +\infty$. As a result, $(Q^f)^3$ is the norm-limit of the compact operators $(Q^f_n)^2 Q^f$ as $n \to +\infty$ in $\mathcal{B}(B^\infty_W)$. Since the set of compact operators over $B^\infty_W(\mathcal{X})$ is closed in the Banach space $\mathcal{B}(B^\infty_W)$, $(Q^f)^3$ is compact, see e.g. [49, Theorem VI.12]. Using Definition 1, we conclude that $\theta(Q^f) = 0$. In this procedure, we see that working in the weighted space $B^\infty_W(\mathcal{X})$ as opposed to $B^\infty(\mathcal{X})$ is crucial in order to obtain the compactness of $(Q^f)^3$ from the control (65) provided by the Lyapunov condition (6).

Step 2: The spectral radius is positive
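As a finite-dimensional sanity check of the lower bound established in this step, the following Python sketch treats a small nonnegative matrix as a stand-in for $Q^f$ on a three-point state space, with $W = \mathbf{1}$ and a single compact set equal to the whole space, so that $\eta(K) = 1$ and the bound reads $\Lambda \geq \alpha$. The matrix `Q_F` and all helper names are hypothetical and chosen for illustration only. The minorization constant is taken as $\alpha = \sum_y \min_x Q^f(x,y)$, which is the largest constant compatible with a condition of the form (8) in this toy setting, and the spectral radius is approached from above through the quantities $\| (Q^f)^k \|^{1/k}$ of Definition 1.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def sup_operator_norm(a):
    """Operator norm on bounded functions (W = 1): largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in a)


def gelfand_estimates(q, k_max):
    """Return ||q^k||^(1/k) for k = 1..k_max; each value bounds the spectral radius from above."""
    power, estimates = q, []
    for k in range(1, k_max + 1):
        estimates.append(sup_operator_norm(power) ** (1.0 / k))
        power = matmul(power, q)
    return estimates


def minorization_constant(q):
    """Largest alpha with q[x][y] >= alpha * eta[y] for all x, eta a probability vector."""
    n = len(q)
    return sum(min(q[x][y] for x in range(n)) for y in range(n))


# Hypothetical weighted kernel on a three-point state space (illustration only).
Q_F = [[0.5, 0.3, 0.1],
       [0.2, 0.6, 0.4],
       [0.3, 0.1, 0.9]]

ALPHA = minorization_constant(Q_F)
ESTIMATES = gelfand_estimates(Q_F, 40)

if __name__ == "__main__":
    print("alpha =", ALPHA)
    print("||Q^k||^(1/k) for k = 1, 10, 40:",
          ESTIMATES[0], ESTIMATES[9], ESTIMATES[39])
```

In this toy case every estimate stays above $\alpha$, in line with the inequality $\Lambda \geq \alpha_{\bar n}\eta_{\bar n}(K_{\bar n})$; the general proof of course does not rely on finite dimension.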
We now show that the spectral radius $\Lambda$ of $Q^f$ defined in (12) is positive, in order to use Theorem 6. Given the definition of the operator norm, choosing some arbitrary non negative function $\phi \in B^\infty_W(\mathcal{X})$ with $\| \phi \|_{B^\infty_W} \leq 1$ leads to
\[
\left\| Q^f \right\|_{\mathcal{B}(B^\infty_W)} \geq \left\| \frac{Q^f \phi}{W} \right\|_{B^\infty} \geq \frac{\big( Q^f \phi \big)(x)}{W(x)},
\]
where $x \in \mathcal{X}$ is arbitrary. We now consider a compact set corresponding to some $n = \bar n$ as defined in Lemma 1, which satisfies $\eta_{\bar n}(K_{\bar n}) > 0$, and take $x \in K_{\bar n}$. For any non negative function $\phi \in B^\infty_W(\mathcal{X})$ with $\| \phi \|_{B^\infty_W} \leq 1$, it holds
\[
\eta_{\bar n}\big( Q^f \phi \big) = \left( \int_{K_{\bar n}} (Q^f \phi)(x)\, \eta_{\bar n}(dx) + \int_{\mathcal{X} \setminus K_{\bar n}} (Q^f \phi)(x)\, \eta_{\bar n}(dx) \right) \geq \int_{K_{\bar n}} \alpha_{\bar n} \eta_{\bar n}(\phi)\, \eta_{\bar n}(dx) \geq \alpha_{\bar n} \eta_{\bar n}(\phi) \eta_{\bar n}(K_{\bar n}), \tag{66}
\]
where we used (8) with $n = \bar n$. Iterating the inequality shows that
\[
\forall\, k \geq 1, \qquad \eta_{\bar n}\big( (Q^f)^k \phi \big) \geq \alpha_{\bar n}^k\, \eta_{\bar n}(K_{\bar n})^k\, \eta_{\bar n}(\phi).
\]
This leads to the following lower bound on the operator norm of $(Q^f)^k$:
\[
\left\| (Q^f)^k \right\|_{\mathcal{B}(B^\infty_W)} \geq \frac{\big( (Q^f)^k \phi \big)(x)}{W(x)} = \frac{\big( Q^f ((Q^f)^{k-1} \phi) \big)(x)}{W(x)} \geq \frac{\alpha_{\bar n}\, \eta_{\bar n}\big( (Q^f)^{k-1} \phi \big)}{W(x)} \geq \frac{\alpha_{\bar n}^k\, \eta_{\bar n}(K_{\bar n})^{k-1}}{W(x)}\, \eta_{\bar n}(\phi).
\]
Taking the power $1/k$ and the limit $k \to +\infty$, together with the choice $\phi = \mathbf{1} \in B^\infty_W(\mathcal{X})$, leads to
\[
\Lambda \geq \alpha_{\bar n}\, \eta_{\bar n}(K_{\bar n}).
\]
From Lemma 1, it holds $\eta_{\bar n}(K_{\bar n}) > 0$, hence $\Lambda > 0$ and $Q^f$ has a positive spectral radius. Note that the existence of $\bar n \geq 1$ such that $\eta_{\bar n}(K_{\bar n}) > 0$ is all that is needed from Lemma 1 in this step.

Step 3: Existence of a principal eigenvector
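In finite dimension, the existence of a nonnegative principal eigenvector reduces to the Perron-Frobenius theorem, and the eigenpair can be approximated by power iteration. The sketch below only illustrates this step under explicit assumptions: the matrix `Q_F` is a hypothetical stand-in for $Q^f$ on a three-point state space, $W = \mathbf{1}$ so that the normalization $\| h \|_{B^\infty_W} = 1$ becomes $\max_x h(x) = 1$, and power iteration is a numerical device that plays no role in the proof.

```python
def apply_kernel(q, phi):
    """Compute (q phi)(x) = sum_y q[x][y] * phi[y]."""
    return [sum(q_xy * phi_y for q_xy, phi_y in zip(row, phi)) for row in q]


def principal_eigenpair(q, n_iter=200):
    """Power iteration started from the constant function 1.

    Returns (lam, h) with max(h) == 1 and q h approximately equal to lam * h.
    """
    h = [1.0] * len(q)
    lam = 0.0
    for _ in range(n_iter):
        qh = apply_kernel(q, h)
        lam = max(qh)              # sup norm of q h, since q h >= 0
        h = [v / lam for v in qh]  # renormalise so that max(h) = 1
    return lam, h


# Hypothetical weighted kernel on a three-point state space (illustration only).
Q_F = [[0.5, 0.3, 0.1],
       [0.2, 0.6, 0.4],
       [0.3, 0.1, 0.9]]

LAMBDA, H = principal_eigenpair(Q_F)

# Residual of the eigenvalue equation Q^f h = Lambda h in the sup norm.
RESIDUAL = max(abs(a - LAMBDA * b) for a, b in zip(apply_kernel(Q_F, H), H))

if __name__ == "__main__":
    print("Lambda =", LAMBDA)
    print("h =", H)
    print("residual =", RESIDUAL)
```

For this matrix all entries of the computed `H` are positive, which is the finite-dimensional counterpart of the positivity statement proved in Step 4.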
In order to use Theorem 6, we introduce the closed cone:
\[
K_W = \big\{ u \in B^\infty_W(\mathcal{X}) \,\big|\, u \geq 0 \big\}.
\]
This cone is total, and the positiveness of $Q^f \in \mathcal{B}(B^\infty_W(\mathcal{X}))$ shows that $Q^f K_W \subset K_W$. At this stage, Theorem 6 in Appendix B ensures that the spectral radius $\Lambda$ is an eigenvalue of $Q^f$ of largest modulus with an associated eigenvector $h \in K_W \setminus \{0\}$.

Step 4: Positivity
We now use the irreducibility condition (9) to show that, for the eigenvector $h$ obtained in Step 3, it holds $h(x) > 0$ for all $x \in \mathcal{X}$ and hence $\eta_n(h) > 0$ for all $n \geq 1$. Assume by contradiction that there exists $x \in \mathcal{X}$ such that $h(x) = 0$. Since the sets $K_n$ are increasing, there exists $n_0$ such that for all $n \geq n_0$ it holds $x \in K_n$ so that, by (8),
\[
\forall\, n \geq n_0, \qquad \big( Q^f h \big)(x) \geq \alpha_n \eta_n(h).
\]
Since $Q^f h = \Lambda h$ with $\Lambda > 0$, this leads to
\[
0 \geq \eta_n(h),
\]
and so $\eta_n(h) = 0$ for $n \geq n_0$. By the irreducibility assumption (9), we therefore have $(Q^f h)(x) = 0$ for all $x \in \mathcal{X}$. Using again $Q^f h = \Lambda h$, this shows that $h = 0$, which is in contradiction with the fact that $h$ is an eigenvector associated with $\Lambda$.

The second property follows from $h(x) > 0$ for all $x \in \mathcal{X}$ and $\eta_n \in \mathcal{P}(\mathcal{X})$ for all $n \geq 1$. Indeed,
\[
\mathcal{X} = \bigcup_{k \geq 1} h^{-1}\Big( \Big[ \frac{1}{k}, +\infty \Big) \Big), \tag{67}
\]
where $h^{-1}$ denotes here the pre-image of $h$. Therefore, for a given $n \geq 1$,
\[
\eta_n(\mathcal{X}) = \eta_n\big( h^{-1}[1,+\infty) \big) + \sum_{k \geq 1} \eta_n\Big( h^{-1}\Big[ \frac{1}{k+1}, \frac{1}{k} \Big) \Big) = 1.
\]
Thus, there exists $N \geq 1$ such that
\[
\eta_n\Big( h^{-1}\Big[ \frac{1}{N}, +\infty \Big) \Big) > 0,
\]
so
\[
\eta_n(h) \geq \eta_n\big( h \mathbf{1}_{\{h \geq 1/N\}} \big) \geq \frac{1}{N}\, \eta_n\Big( h \geq \frac{1}{N} \Big) > 0.
\]
Since $n \geq 1$ is arbitrary, we obtain $\eta_n(h) > 0$ for all $n \geq 1$.

E Proof of Lemma 3
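To fix ideas before the proof, the $h$-transform $Q_h \varphi = \Lambda^{-1} h^{-1} Q^f(h\varphi)$ can be written out on a finite state space, where it reads $Q_h(x,y) = Q^f(x,y)\, h(y) / (\Lambda h(x))$. The following Python sketch is an illustration under stated assumptions: the weighted kernel is taken of the form $Q^f(x,y) = \mathrm{e}^{f(x)} P(x,y)$ with a hypothetical stochastic matrix `P` and hypothetical weights `F`, and the eigenpair $(\Lambda, h)$ is approximated by power iteration. It checks that the rows of $Q_h$ sum to one and that $Q_h^k \varphi$ flattens to a constant, as expected from Theorem 5.

```python
import math


def apply_kernel(q, phi):
    """Compute (q phi)(x) = sum_y q[x][y] * phi[y]."""
    return [sum(q_xy * phi_y for q_xy, phi_y in zip(row, phi)) for row in q]


def principal_eigenpair(q, n_iter=500):
    """Power iteration from the constant function 1; returns (lam, h) with max(h) == 1."""
    h = [1.0] * len(q)
    lam = 0.0
    for _ in range(n_iter):
        qh = apply_kernel(q, h)
        lam = max(qh)
        h = [v / lam for v in qh]
    return lam, h


def h_transform(q, lam, h):
    """Matrix of Q_h: entries q[x][y] * h[y] / (lam * h[x])."""
    n = len(q)
    return [[q[x][y] * h[y] / (lam * h[x]) for y in range(n)] for x in range(n)]


# Hypothetical Markov matrix and weight function (illustration only).
P = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.4, 0.5]]
F = [0.3, -0.2, 0.1]
Q_F = [[math.exp(F[x]) * P[x][y] for y in range(3)] for x in range(3)]

LAMBDA, H = principal_eigenpair(Q_F)
Q_H = h_transform(Q_F, LAMBDA, H)
ROW_SUMS = [sum(row) for row in Q_H]

# Iterate Q_h on a test function: the result should become constant in x.
PHI = [1.0, 0.0, -2.0]
for _ in range(200):
    PHI = apply_kernel(Q_H, PHI)
SPREAD = max(PHI) - min(PHI)

if __name__ == "__main__":
    print("row sums of Q_h:", ROW_SUMS)
    print("Q_h^200 phi:", PHI)
```

The common limiting value of `PHI` plays the role of $\mu_h(\varphi)$ for the invariant measure $\mu_h$ of $Q_h$; on an unbounded state space, obtaining this convergence is precisely what requires the Lyapunov and minorization conditions verified in the proof.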
A first important remark is that $Q_h$ is a Markov operator. Indeed, it is a well-defined kernel operator (since $0 < h(x) < +\infty$ for all $x \in \mathcal{X}$), and $Q_h \mathbf{1} = \Lambda^{-1} h^{-1} Q^f h = \Lambda^{-1} h^{-1} \Lambda h = \mathbf{1}$. Our goal is therefore to show that the Markov operator $Q_h$ fits the framework recalled in Appendix A, in particular that it satisfies Assumptions 7 and 8.

Let us show that this operator satisfies Assumption 7 in Appendix A with Lyapunov function $W h^{-1}$. We first note that the normalization $\| h \|_{B^\infty_W} = 1$ implies that $W h^{-1} \geq 1$. Using Assumption 1, we obtain
\[
Q_h\big( W h^{-1} \big) = \Lambda^{-1} h^{-1} Q^f W \leq \Lambda^{-1} h^{-1} \big( \gamma_n W + b_n \mathbf{1}_{K_n} \big) \leq \frac{\gamma_n}{\Lambda} W h^{-1} + \frac{b_n}{\Lambda h} \mathbf{1}_{K_n}.
\]
Noting that, for all $x \in K_n$,
\[
\Lambda h(x) = (Q^f h)(x) \geq \alpha_n \eta_n(h),
\]
with $\eta_n(h) > 0$, we obtain
\[
Q_h\big( W h^{-1} \big) \leq \frac{\gamma_n}{\Lambda} W h^{-1} + \frac{b_n}{\alpha_n \eta_n(h)} \mathbf{1}_{K_n}. \tag{68}
\]
Since $\gamma_n$ can be taken arbitrarily small and $\eta_n(h) > 0$ for any $n \geq 1$, we deduce that $W h^{-1}$ is a Lyapunov function for $Q_h$ in the sense of Assumption 7 in Appendix A.

Remark 4.
Let us mention that, in order for (68) to define a Lyapunov condition in the sense of Assumption 7, it is not necessary to have $\gamma_n \to 0$ as $n \to +\infty$. The existence of $n \geq 1$ such that $\gamma_n < \Lambda$ is sufficient.

We will now prove that: (i) $W h^{-1}$ has compact level sets, and (ii) $Q_h$ satisfies the minorization condition of Assumption 8 in Appendix A on any compact set $K_n$, that is $\inf_{K_n} Q_h$ is lower bounded by some probability measure. First, choosing $x_n \notin K_n$ in Assumption 1 leads to
\[
\Lambda h(x_n) = (Q^f h)(x_n) \leq \gamma_n W(x_n),
\]
so that
\[
\frac{W(x_n)}{h(x_n)} \geq \frac{\Lambda}{\gamma_n}. \tag{69}
\]
Since $\gamma_n \to 0$ as $n \to +\infty$, the function $W h^{-1}$ diverges outside the compact sets $K_n$ defined in Assumption 1. In other words, $W h^{-1}$ has compact level sets, which shows (i).

Next, for $n \geq 1$, consider $\alpha_n > 0$ and $\eta_n \in \mathcal{P}(\mathcal{X})$ as in Assumption 2, so that, for any bounded measurable function $\varphi \geq 0$ and $x \in K_n$,
\[
Q_h \varphi(x) = \Lambda^{-1} \frac{Q^f(h\varphi)(x)}{h(x)} \geq \frac{1}{\Lambda \sup_{K_n} h}\, \alpha_n \eta_n(h\varphi) \geq \widetilde{\alpha}_n \widetilde{\eta}_n(\varphi),
\]
with
\[
\widetilde{\alpha}_n = \frac{\alpha_n \eta_n(h)}{\Lambda \sup_{K_n} h} > 0, \qquad \widetilde{\eta}_n(\varphi) = \frac{\eta_n(h\varphi)}{\eta_n(h)}, \qquad \widetilde{\eta}_n \in \mathcal{P}(\mathcal{X}).
\]
The latter expression is well-defined because, from Lemma 2, we know that $0 < \eta_n(h) < +\infty$ for any $n \geq 1$. Moreover, $0 < \sup_{K_n} h < +\infty$ (since $h \in B^\infty_W(\mathcal{X})$ and $\sup_{K_n} W < +\infty$ by Assumption 1), and this yields precisely (ii). Finally, (i) and (ii) show that $Q_h$ satisfies Assumption 8, so that $Q_h$ satisfies the assumptions of Theorem 5. As a result there exist a unique $\mu_h \in \mathcal{P}(\mathcal{X})$ and constants $c > 0$, $\bar\alpha \in (0,1)$ such that for any $\phi \in B^\infty_{W h^{-1}}(\mathcal{X})$,
\[
\forall\, k \geq 0, \qquad \left\| Q_h^k \phi - \mu_h(\phi) \right\|_{B^\infty_{W h^{-1}}} \leq c\, \bar\alpha^k \left\| \phi - \mu_h(\phi) \right\|_{B^\infty_{W h^{-1}}}.
\]
Moreover, the measure $\mu_h$ satisfies $\mu_h(W h^{-1}) < +\infty$.

F Proof of Lemma 4
From [25, Proposition 1], we obtain that $\mathcal{L} + f$ has a largest (in modulus) eigenvalue $\lambda$ with associated smooth eigenvector $h$. Similarly, Lemma 2 shows that for any $\Delta t \in (0, \Delta t^*]$ the operator $Q^f_{\Delta t}$ has a largest (in modulus) eigenvalue $\Lambda_{\Delta t}$ with continuous eigenvector $h_{\Delta t}$ (since $Q^f_{\Delta t}$ is assumed to be strong Feller). Moreover, there is no restriction of generality in normalizing $h_{\Delta t}$ so that $\eta(h_{\Delta t}) = 1$.

We now turn to the estimate (55) on the spectral radius. In the notation of [25], we have $\Lambda_{\Delta t} = \mathrm{e}^{\Delta t \lambda_{\Delta t}}$. A direct application of [25, Theorem 3] then shows that there exist $\Delta t^* > 0$ and $C > 0$ such that $\lambda_{\Delta t} = \lambda + \Delta t\, c_{\Delta t}$ with $|c_{\Delta t}| \leq C$ for $\Delta t \in (0, \Delta t^*]$, which is the desired result.

Finally, since $Q^f_{\Delta t} h_{\Delta t} = \Lambda_{\Delta t} h_{\Delta t}$, the lower bound (54) applied to $\varphi = h_{\Delta t} \geq 0$ leads to
\[
\forall\, x \in \mathcal{X}, \qquad \big( Q^f_{\Delta t} \big)^{\lceil T/\Delta t \rceil} h_{\Delta t}(x) = \Lambda_{\Delta t}^{\lceil T/\Delta t \rceil} h_{\Delta t}(x) \geq \alpha\, \eta(h_{\Delta t}).
\]
Using the estimate on $\Lambda_{\Delta t}$ and the normalization $\eta(h_{\Delta t}) = 1$ we obtain, for $\Delta t \in (0, \Delta t^*]$ (possibly upon decreasing $\Delta t^*$) and $x \in \mathcal{X}$,
\[
h_{\Delta t}(x) \geq \Lambda_{\Delta t}^{-\lceil T/\Delta t \rceil} \alpha\, \eta(h_{\Delta t}) \geq \alpha\, \mathrm{e}^{-\Delta t (\lambda + \Delta t\, c_{\Delta t}) \lceil T/\Delta t \rceil} \geq \frac{\alpha}{2}\, \mathrm{e}^{-T |\lambda|}.
\]
A similar computation leads to an analogous upper bound, which shows (56).
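The expansion $\lambda_{\Delta t} = \lambda + \Delta t\, c_{\Delta t}$ used above can be observed on a toy model. The Python sketch below is only an analogue of the setting of Lemma 4 and rests on several assumptions that are not taken from the paper: the dynamics is a hypothetical two-state jump process with generator $L$ (rates `A` and `B`) instead of a diffusion, its discretization is the kernel $I + \Delta t\, L$, and the weighted kernel is $\mathrm{e}^{\Delta t f(x)} (I + \Delta t\, L)(x,y)$ for a hypothetical weight `F`. In this setting $\lambda$ is the largest eigenvalue of $L + \mathrm{diag}(f)$ and $\lambda_{\Delta t} = \log(\Lambda_{\Delta t}) / \Delta t$.

```python
import math


def largest_eigenvalue_2x2(m):
    """Largest eigenvalue of a 2x2 matrix with positive off-diagonal entries (real spectrum)."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return 0.5 * (trace + math.sqrt(trace * trace - 4.0 * det))


# Hypothetical two-state jump process: rate A for 0 -> 1, rate B for 1 -> 0.
A, B = 1.0, 2.0
# Hypothetical weight function f on the two states.
F = (0.5, -0.3)


def continuous_eigenvalue():
    """Principal eigenvalue lambda of the matrix L + diag(F)."""
    return largest_eigenvalue_2x2([[F[0] - A, A], [B, F[1] - B]])


def discrete_eigenvalue(dt):
    """lambda_dt = log(Lambda_dt) / dt for the kernel exp(dt * F[x]) * (I + dt * L)[x][y]."""
    w0, w1 = math.exp(dt * F[0]), math.exp(dt * F[1])
    q = [[w0 * (1.0 - dt * A), w0 * dt * A],
         [w1 * dt * B, w1 * (1.0 - dt * B)]]
    return math.log(largest_eigenvalue_2x2(q)) / dt


LAMBDA = continuous_eigenvalue()
STEPS = [0.1, 0.05, 0.025]
ERRORS = [abs(discrete_eigenvalue(dt) - LAMBDA) for dt in STEPS]

if __name__ == "__main__":
    for dt, err in zip(STEPS, ERRORS):
        print("dt =", dt, " |lambda_dt - lambda| =", err)
```

Halving the time step roughly halves the error in this example, which is the first-order behaviour expressed by the bound $|c_{\Delta t}| \leq C$.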
References

[1] J. B. Anderson. A random-walk simulation of the Schrödinger equation: $\mathrm{H}_3^+$. J. Chem. Phys., 63(4):1499–1503, 1975.
[2] D. Bakry, F. Barthe, P. Cattiaux, and A. Guillin. A simple proof of the Poincaré inequality for a large class of probability measures. Electron. Commun. Probab., 13:60–66, 2008.
[3] D. Bakry, I. Gentil, and M. Ledoux. Analysis and Geometry of Markov Diffusion Operators, volume 348 of Grundlehren der mathematischen Wissenschaften. Springer Science & Business Media, 2013.
[4] V. Bansaye, B. Cloez, and P. Gabriel. Ergodic behavior of non-conservative semigroups via generalized Doeblin's conditions. arXiv:1710.05584, 2017.
[5] N. Bou-Rabee and M. Hairer. Nonasymptotic mixing of the MALA algorithm. IMA J. Numer. Anal., 33(1):80–110, 2012.
[6] A. Brunel and B. Revuz. Quelques applications probabilistes de la quasi-compacité. Annales de l'Institut Henri Poincaré, Section B, 10(3):301–337, 1974.
[7] D. M. Ceperley and B. Alder. Ground state of the electron gas by a stochastic method. Phys. Rev. Lett., 45(7):566, 1980.
[8] N. Champagnat and D. Villemonais. General criteria for the study of quasi-stationarity. arXiv:1712.08092, 2017.
[9] N. Champagnat and D. Villemonais. Lyapunov criteria for uniform convergence of conditional distributions of absorbed Markov processes. arXiv:1704.01928, 2017.
[10] A. Debussche and E. Faou. Weak backward error analysis for SDEs. SIAM J. Numer. Anal., 50(3):1735–1752, 2012.
[11] K. Deimling. Nonlinear Functional Analysis. Courier Corporation, 2010.
[12] P. Del Moral. Feynman-Kac Formulae. Springer, 2004.
[13] P. Del Moral and A. Guionnet. On the stability of interacting processes with applications to filtering and genetic algorithms. Annales de l'IHP Probabilités et statistiques, 37(2):155–194, 2001.
[14] P. Del Moral and L. Miclo. Branching and interacting particle systems approximations of Feynman-Kac formulae with applications to non-linear filtering. In Séminaire de probabilités XXXIV, pages 1–145. Springer, 2000.
[15] P. Del Moral and L. Miclo. On the stability of nonlinear Feynman-Kac semigroups. Annales de la Faculté des Sciences Toulouse Mathematiques, 11:135–175, 2002.
[16] P. Del Moral and L. Miclo. Particle approximations of Lyapunov exponents connected to Schrödinger operators and Feynman–Kac semigroups. ESAIM: Probab. Stat., 7:171–208, 2003.
[17] A. Dembo and O. Zeitouni. Large Deviations Techniques and Applications, volume 38 of Stochastic Modelling and Applied Probability. Springer-Verlag, Berlin, 2010.
[18] R. L. Dobrushin. Central limit theorem for nonstationary Markov chains. I. Theory of Probability & Its Applications, 1(1):65–80, 1956.
[19] R. L. Dobrushin. Central limit theorem for nonstationary Markov chains. II. Theory of Probability & Its Applications, 1(4):329–383, 1956.
[20] M. D. Donsker and S. R. S. Varadhan. On a variational formula for the principal eigenvalue for operators with maximum principle. Proc. Natl. Acad. Sci., 72(3):780–783, 1975.
[21] J. L. Doob. Conditional Brownian motion and the boundary limits of harmonic functions. Bulletin de la Société Mathématique de France, 85:431–458, 1957.
[22] A. Eberle, A. Guillin, and R. Zimmer. Quantitative Harris type theorems for diffusions and McKean-Vlasov processes. arXiv:1606.06012, 2016.
[23] M. Fathi and G. Stoltz. Improving dynamical properties of metropolized discretizations of overdamped Langevin dynamics. Numer. Math., 136(2):1–58, 2015.
[24] J. Feng and T. G. Kurtz. Large Deviations for Stochastic Processes, volume 131 of Mathematical Surveys and Monographs. American Mathematical Soc., 2006.
[25] G. Ferré and G. Stoltz. Error estimates on ergodic properties of Feynman–Kac semigroups. arXiv:1712.04013, 2017.
[26] W. H. Fleming. Exit probabilities and optimal stochastic control. Appl. Math. Optim., 4(1):329–346, 1977.
[27] W. Foulkes, L. Mitas, R. Needs, and G. Rajagopal. Quantum Monte Carlo simulations of solids. Rev. Mod. Phys., 73(1):33, 2001.
[28] C. Giardina, J. Kurchan, and L. Peliti. Direct evaluation of large-deviation functions. Phys. Rev. Lett., 96(12):120603, 2006.
[29] F. Gosselin. Asymptotic behavior of absorbing Markov chains conditional on nonabsorption for applications in conservation biology. Ann. Appl. Probab., 11:261–284, 2001.
[30] R. Grimm and R. Storer. Monte-Carlo solution of Schrödinger's equation. J. Comput. Phys., 7(1):134–156, 1971.
[31] D. Guibourg, L. Hervé, and J. Ledoux. Quasi-compactness of Markov kernels on weighted-supremum spaces and geometrical ergodicity. arXiv:1110.3240, 2011.
[32] M. Hairer. Exponential mixing for a stochastic PDE driven by degenerate noise. arXiv:math-ph/0103039, 2001.
[33] M. Hairer. Ergodic properties of Markov processes. Lecture notes, 2006.
[34] M. Hairer. Ergodic properties of a class of Non-Markovian processes. Trends in Stochastic Analysis, 353:65–102, 2009.
[35] M. Hairer and J. C. Mattingly. Yet another look at Harris' ergodic theorem for Markov chains. In Seminar on Stochastic Analysis, Random Fields and Applications VI, pages 109–117. Springer, 2011.
[36] M. Hairer and J. Weare. Improved diffusion Monte Carlo. Comm. Pure Appl. Math., 67(12):1995–2021, 2014.
[37] B. Helffer. Spectral Theory and its Applications, volume 139 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, 2013.
[38] D. P. Herzog and J. C. Mattingly. Ergodicity and Lyapunov functions for Langevin dynamics with singular potentials. arXiv:1711.02250, 2017.
[39] I. Karatzas and S. Shreve. Brownian Motion and Stochastic Calculus, volume 113 of Graduate Texts in Mathematics. Springer Science & Business Media, 2012.
[40] I. Kontoyiannis and S. P. Meyn. Spectral theory and limit theorems for geometrically ergodic Markov processes. Ann. Appl. Probab., 13(1):304–362, 2003.
[41] M. Kopec. Weak backward error analysis for overdamped Langevin processes. IMA J. Numer. Anal., 35(2):583–614, 2014.
[42] M. Kopec. Weak backward error analysis for Langevin process. BIT Numer. Math., 55(4):1057–1103, 2015.
[43] I. Kontoyiannis and S. P. Meyn. Large deviations asymptotics and the spectral theory of multiplicatively regular Markov processes. Electron. J. Probab., 10(3):61–123, 2005.
[44] B. Leimkuhler, C. Matthews, and G. Stoltz. The computation of averages from equilibrium and nonequilibrium Langevin molecular dynamics. IMA J. Numer. Anal., 36(1):13–79, 2016.
[45] T. Lelièvre and G. Stoltz. Partial differential equations and stochastic methods in molecular dynamics. Acta Numerica, 25:681–880, 2016.
[46] S. P. Meyn and R. L. Tweedie. Markov Chains and Stochastic Stability. Springer Science & Business Media, 2012.
[47] R. D. Nussbaum. The radius of the essential spectrum. Duke Math. J., 37(3):473–478, 1970.
[48] R. D. Nussbaum. Eigenvectors of order-preserving linear operators. J. London Math. Soc., 58(2):480–496, 1998.
[49] M. Reed and B. Simon. Methods of Modern Mathematical Physics I: Functional Analysis. Academic Press, San Diego, 1980.
[50] L. Rey-Bellet. Ergodic properties of Markov processes, volume 1881 of Lecture Notes in Mathematics, pages 1–39. Springer, 2006.
[51] C. P. Robert. Monte Carlo Methods. Wiley Online Library, 2004.
[52] M. Rousset. On the control of an interacting particle estimation of Schrödinger ground states. SIAM J. Math. Anal., 38(3):824–844, 2006.
[53] W. Rudin. Functional Analysis. McGraw-Hill, New York, 1991.
[54] W. Rudin. Real and Complex Analysis. McGraw-Hill, New York, 2006.
[55] L. Schwartz. Analyse I. Hermann, Paris, 1991.
[56] D. W. Stroock and S. R. S. Varadhan. On the support of diffusion processes with applications to the strong maximum principle. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, (3):333–359. 1972.
[57] C. Villani. Topics in Optimal Transportation, volume 58 of Graduate Studies in Mathematics. American Mathematical Society, 2003.
[58] L. Wu. Large and moderate deviations and exponential convergence for stochastic damping Hamiltonian systems. Stoch. Process. Appl., 91(2):205–238, 2001.
[59] L. Wu. Essential spectral radius for Markov semigroups (I): discrete time case. Probab. Theory Relat. Fields, 128(2):255–321, 2004.