Large deviations for Markov jump processes with uniformly diminishing rates

Abstract

We prove a large-deviation principle (LDP) for the sample paths of jump Markov processes in the small noise limit when, possibly, all the jump rates vanish uniformly, but slowly enough, in a region of the state space. We further discuss the optimality of our assumptions on the decay of the jump rates. As a direct application of this work we relax the assumptions needed for the application of LDPs to, e.g., Chemical Reaction Network dynamics, where vanishing reaction rates arise naturally, particularly in the context of mass action kinetics.


LARGE DEVIATIONS FOR MARKOV JUMP PROCESSES WITH UNIFORMLY DIMINISHING RATES

ANDREA AGAZZI, LUISA ANDREIS, ROBERT I. A. PATTERSON, AND D. R. MICHIEL RENGER


1. INTRODUCTION

1.1. Large deviations of Markov jump processes.

We study a family of $d$-dimensional Markov jump processes $\{X^v\}_{v\in\mathbb{N}}$ with state space $(v^{-1}\mathbb{Z})^d$, deterministic initial condition $X^v(0) = x_0^v \in (v^{-1}\mathbb{Z})^d$ and generator
$$(\mathcal{L}^v f)(x) := v \sum_{r\in\mathcal{R}} \Lambda^v_r(x)\left[ f(x + v^{-1}\gamma_r) - f(x) \right]. \tag{1}$$
Here $\mathcal{R}$ is the finite set of possible jumps, $\gamma_r \in \mathbb{Z}^d$ are the fixed jump vectors, and $v\Lambda^v_r : (v^{-1}\mathbb{Z})^d \to [0,\infty)$ are the associated jump rates. The parameter $v$ controls the noise in the system, and the scaling is chosen so that $\Lambda^v_r(x)$ converges as $v\to\infty$. Under this scaling it is known that the paths $X^v$ concentrate on solutions of the fluid limit ODE [17]:
$$\frac{\mathrm{d}}{\mathrm{d}t} x(t) = \sum_{r\in\mathcal{R}} \lambda_r(x(t))\,\gamma_r, \tag{2}$$
where the continuous rates $\lambda_r : \mathbb{R}^d \to [0,\infty)$ are the limits of $\Lambda^v_r(x)$ as $v\to\infty$.

The process (1) and the ODE (2) are used as microscopic and macroscopic models for a wide range of applications. For example, in the context of chemical reactions $X^v$ denotes the concentrations (number of molecules per unit volume) of $d$ species being transformed by a set of reactions $\mathcal{R}$. Here, for each reaction $r\in\mathcal{R}$, the vectors $\gamma_r$ encode which species are removed and created when a reaction $r$ occurs, and $\lambda_r(x)$ are the reaction rates [18]. In that setting $v$ can be interpreted as the volume size over which the concentrations are averaged. Other typical applications using similar models include biological systems involving predator-prey interaction, birth/cell division and death, biological fitness models, as well as epidemiological models. In these settings, large-deviation techniques are often applied to simplify the dynamical landscape of the complex, high dimensional microscopic model (1) while retrieving quantitative information about the random fluctuations around the mean (2).
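To make the relation between the microscopic model (1) and the fluid limit (2) concrete, the following is a minimal simulation sketch based on the standard Gillespie algorithm. It is not taken from the paper: the function name `simulate_jump_process`, its arguments, and the test reaction (pure decay $A \to \emptyset$ with $\Lambda^v(x) = \lambda(x) = x$) are assumptions chosen for illustration only.

```python
import math
import random


def simulate_jump_process(n0, jumps, rates, v, t_end, rng):
    """Simulate X^v = n / v up to time t_end with the Gillespie algorithm.

    n0    : initial particle numbers (tuple of ints), so that x_0^v = n0 / v
    jumps : list of integer jump vectors gamma_r
    rates : function x -> list of Lambda_r(x), one entry per jump
    """
    n = list(n0)
    t = 0.0
    while True:
        x = [ni / v for ni in n]
        lam = rates(x)
        total = v * sum(lam)          # total jump rate v * sum_r Lambda_r(x)
        if total <= 0.0:              # all rates vanish: the process is stuck
            break
        t += rng.expovariate(total)   # exponential waiting time
        if t > t_end:
            break
        # pick jump r with probability Lambda_r(x) / sum_r Lambda_r(x)
        r = rng.choices(range(len(jumps)), weights=lam)[0]
        n = [ni + gi for ni, gi in zip(n, jumps[r])]
    return [ni / v for ni in n]


# Hypothetical test case: decay A -> 0 with Lambda(x) = lambda(x) = x,
# whose fluid limit dx/dt = -x has the solution x(t) = x(0) * exp(-t).
rng = random.Random(0)
v = 10_000
x_end = simulate_jump_process((v,), [(-1,)], lambda x: [x[0]], v, 1.0, rng)
print(x_end[0], math.exp(-1.0))
```

For large $v$ the simulated end point should lie close to the ODE value $e^{-1}$, with fluctuations of order $v^{-1/2}$. The state is stored as integer particle numbers so that the lattice $(v^{-1}\mathbb{Z})^d$ is represented exactly.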
The quantitative information obtained in this way can be used, for example, to study non-equilibrium thermodynamics [19, 22], to speed up simulations of rare events [7, 14, 29], or to study spontaneous transitions between metastable states [13], also in the multi-scale setting [26].

The classical proof of the large-deviation principle (LDP) uses a tilting technique, also called a change-of-measure technique. The main challenge there is that the tilting can only be performed around sufficiently regular paths, whereas the large-deviation principle needs to be proven for any non-typical path. Therefore, the large-deviation lower bound requires an approximation argument, either for the random process or for the rate functional. A particular challenge in either case is to approximate a path without changing its starting point, which is required when proving the large-deviation principle under a deterministic initial condition. This becomes more difficult if the jump rates vanish in some regions of the state space, which is however an inherent property of the models used in many application domains. For example, in the context of chemical kinetics, it is natural to assume that the rate of a chemical reaction vanishes when the concentration of one or more of the reactants approaches $0$. Similarly, in the context of infectious disease models, the rate of spread of a virus is usually modelled as a linear function of the infected population.

Date: February 26, 2021. arXiv [math.PR].

Example 1.1. The problem is nicely illustrated by the simplest model for autocatalysis or cell division, in chemical notation: $A \to 2A$. In this case there is only one species and one jump, so we may write the linear jump rate as $\Lambda^v(x) = \lambda(x) = x$, starting from a concentration with one particle, i.e. $X^v(0) = x_0^v = 1/v$. Clearly, the process converges to the solution of $\dot x(t) = x(t)$ with initial condition $x(0) = x_0 = 0$, that is $x(t) \equiv 0$.
In other words, the process is expected to stick to the degenerate set $\partial\mathcal{S} = \{x = 0\}$, which corresponds to the boundary of the state space $\mathbb{R}_+$. However, it can be calculated (as an application of [16, Lemma 2.1]) that
$$\lim_{v\to\infty} v^{-1} \log \mathbb{P}\left( X^v(t) \ge \delta \right) = \delta \log(1 - e^{-t})$$
for any fixed time $t > 0$ and $\delta > 0$. Although this is a large-deviation result about the marginal $X^v(t)$, it suggests that the paths can also "escape" from the boundary with finite large-deviation cost. On the other hand, we note that choosing $x_0^v \equiv 0$ implies that $X^v(t) \equiv 0$ almost surely. Hence, whether or not escape is possible depends strongly on the initial condition. This is a similar principle to what is sometimes called a "well-prepared initial condition" in $\Gamma$-convergence theory [21].

For more general models of the type (1), one expects that as long as the jump rates $\lambda_r(x)$ do not vanish too rapidly approaching the degenerate points, and when starting at a well-chosen initial condition near such points, the process will be able to escape with finite large-deviation cost, and the large-deviation principle should still hold. In this paper, we show that this is indeed the case. More specifically, denoting by $\partial\mathcal{S}$ the set where some (or possibly all) of the reaction rates $\lambda$ vanish, we prove that when the rates near $\partial\mathcal{S}$ decrease slower than $\exp(-1/\operatorname{dist}(x,\partial\mathcal{S})^\alpha)$, i.e.,
$$\operatorname{dist}(x,\partial\mathcal{S})^\alpha \log\lambda(x) \to 0 \quad\text{as } \operatorname{dist}(x,\partial\mathcal{S}) \to 0, \tag{3}$$
the large-deviation principle holds under the assumption $\alpha\in[0,1)$. Furthermore, we show that these conditions on the decay of the rates close to the degenerate set are, in some sense, necessary and not just sufficient.

1.2. Literature and approach.

Early papers [12, 20] establishing a sample-path large-deviation principle for jump Markov processes mimicked the Dawson–Gärtner approach [6], where one first derives an abstract large-deviation result for the empirical measure on paths, and then contracts it to obtain a large-deviation principle for the path of the empirical measure. Another approach, now considered the classic tilting technique, was first used in [27] assuming that all the jump rates are uniformly bounded away from zero in the domain of interest. In [28], the authors relaxed this condition by assuming the existence of a subset of jumps with rates that are uniformly bounded away from zero and that "push" the process away from the degeneracy region. This assumption is further relaxed in [1, 2, 4, 10], requiring only the existence of a sequence of jumps that sequentially transport the process away from the problematic region.

Recent works have taken steps to generalise these assumptions to the uniformly vanishing case (when all rates can vanish in some region of state space), in the context of chemical kinetics [25] and in the context of infectious disease models [23, 24]. These papers give sufficient conditions on the models at hand to bypass the technical difficulties encountered in the proof of the LDP lower bound when some of the jump rates are not bounded away from zero. We mention that the work [25] assumes "sufficiently random" initial conditions to bypass the problem, but we shall focus on a deterministic initial condition.

The problem of vanishing rates is addressed more completely in [23, 24], where the authors obtain large-deviation estimates for vanishing rates when the microscopic initial condition allows escaping the degenerate set with positive (although vanishing in $v$) probability. Their approach is based on a careful adaptation of the standard tilting argument to obtain the LDP lower bound for processes. In particular, to bypass the problem of jump rates vanishing in some regions of state space, the change of measure performed by the authors depends on the large-deviation scaling parameter $v$, which is inversely proportional to the jump size. This replaces the problem of escaping to an $O(1)$ distance from these degeneracy regions uniformly in $v$ with the one of escaping to $O(1/\sqrt{v})$. Their result allows for jump rates that behave as in (3) for $\alpha\in[0,1/2)$.

In this paper we bypass the change-of-measure technique altogether, establishing more direct and concise bounds on the process level. The main challenge in realizing this strategy is to identify a set of paths occurring with sufficient probability to recover the LDP lower bound while allowing for simple estimation of such probability. The core of the approach consists of showing separate estimates on the total number of jumps and on the types of jumps for paths in such a set. This allows us to extend the assumption (3) to any $\alpha\in[0,1)$ while covering a larger family of processes than the existing literature. In addition, we provide a counterexample showing that our upper bound for the exponent $\alpha$ is optimal: if the rates of the process decay as in (3) for $\alpha \ge 1$ with sufficient uniformity, as we make precise below, the process will no longer be able to escape the degenerate set with finite large-deviation cost.

1.3. Outline.

The paper is structured as follows. In Section 2 we introduce our notation, list our assumptions and state our main result, Theorem 1, namely the LDP. We also illustrate the generality of our result with some examples. The proof of Theorem 1 is split into two sections: in Section 3 we prove the LDP lower bound, while Section 4 deals with the LDP upper bound. Finally, in Section 5, we discuss the optimality of our assumptions on the decay of the rates and in Section 6 we summarize the key quantities of the proof.

2. NOTATION AND RESULTS

We start by giving a concrete example of the properties of systems we aim to generalize in this paper, introducing some important quantities in an intuitive way:

Example 2.1 (Mass action kinetics). In the context of chemical kinetics, one indexes the dimension of state space with a set of species $\{S_i\}$ representing the chemical compounds in the system of interest. To describe interactions between different compounds one defines reactions $r\in\mathcal{R}$ via $\gamma^{r,\mathrm{in}}, \gamma^{r,\mathrm{out}} \in \mathbb{N}^d$ and
$$\sum_{i=1}^d \gamma^{r,\mathrm{in}}_i S_i \longrightarrow \sum_{i=1}^d \gamma^{r,\mathrm{out}}_i S_i.$$
This is to be understood as saying that $\gamma^{r,\mathrm{in}}_i$ copies of species $S_i$ are consumed in each $r$-reaction while $\gamma^{r,\mathrm{out}}_i$ copies are produced. The reaction rate $\Lambda^v_r$ is specified via a rate constant $k_r \ge 0$ as
$$\Lambda^v_r(x) := \frac{k_r}{v^{\sum_i |\gamma^{r,\mathrm{in}}_i|}} \prod_{i=1}^d \binom{v x_i}{\gamma^{r,\mathrm{in}}_i} \gamma^{r,\mathrm{in}}_i! \qquad \forall x \in (v^{-1}\mathbb{N})^d,$$
where $\binom{\cdot}{\cdot}$ denotes the binomial coefficient. The jump vector is $\gamma_r := \gamma^{r,\mathrm{out}} - \gamma^{r,\mathrm{in}}$. These rates are bounded from above on compact sets, and they converge to $\lambda_r(x) = k_r \prod_{i=1}^d x_i^{\gamma^{r,\mathrm{in}}_i}$ as $v\to\infty$. It is easy to see that $\Lambda^v_r(x) = 0$ whenever $v x_i < \gamma^{r,\mathrm{in}}_i$ for some $i$, so that $X^v \in (v^{-1}\mathbb{N})^d$ almost surely. Consequently, the degenerate set of the limiting dynamics is a subset of the boundary $\partial\mathbb{R}^d_{\ge 0}$. As we shall illustrate in examples below, different choices of reactions result in $X^v$ being confined to subsets of $(v^{-1}\mathbb{N})^d$.

We start by defining the set of reachable points of the process. Throughout, we fix a sequence of deterministic initial conditions $\{x_0^v\}$. Because of the potentially degenerate character of the stochastic dynamics at hand, we reduce the state space $(v^{-1}\mathbb{Z})^d$ to the set of reachable points of the process with that initial condition $x_0^v \in (v^{-1}\mathbb{Z})^d$:
$$\mathcal{S}^v := \left\{ x \in (v^{-1}\mathbb{Z})^d : \mathbb{P}\left[ \exists\, t \ge 0,\ X^v(t) = x \mid X^v(0) = x_0^v \right] > 0 \right\}.$$
Note that, by definition, $\Lambda^v_r(x) = 0$ whenever $x + v^{-1}\gamma_r \notin \mathcal{S}^v$ for any $x\in\mathcal{S}^v$.
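As a numerical illustration of the mass action rates in Example 2.1, the sketch below evaluates $\Lambda^v_r(x)$ and its limit $\lambda_r(x)$. The function names `mass_action_rate` and `limit_rate` are hypothetical, and the code assumes that $v x_i$ is an integer particle number (it is rounded to the nearest integer to absorb floating-point error). It uses the identity $\binom{n}{\gamma}\gamma! = n(n-1)\cdots(n-\gamma+1)$, which is what `math.perm` computes.

```python
import math


def mass_action_rate(x, gamma_in, k, v):
    """Microscopic rate Lambda^v_r(x) for x in (v^{-1} N)^d."""
    rate = k / v ** sum(gamma_in)
    for xi, gi in zip(x, gamma_in):
        n = round(v * xi)             # particle number of species i
        rate *= math.perm(n, gi)      # binom(n, gi) * gi! = n (n-1) ... (n-gi+1)
    return rate


def limit_rate(x, gamma_in, k):
    """Macroscopic rate lambda_r(x) = k * prod_i x_i ** gamma_in_i."""
    return k * math.prod(xi ** gi for xi, gi in zip(x, gamma_in))


# Illustrative reaction consuming two copies of a single species, with k = 1.
for v in (10, 100, 1000):
    print(v, mass_action_rate((0.5,), (2,), 1.0, v), limit_rate((0.5,), (2,), 1.0))
```

Since `math.perm(n, g)` returns 0 when `g > n`, the sketch also reproduces the observation that the rate vanishes whenever fewer than $\gamma^{r,\mathrm{in}}_i$ particles of some species are present.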
Assuming that in the limit $v\to\infty$ the initial values $x_0^v \in (v^{-1}\mathbb{Z})^d$ converge to $x_0 \in \mathbb{R}^d$, we write
$$\mathcal{S} = \bigcap_{n\in\mathbb{N}} \overline{\bigcup_{v\ge n} \mathcal{S}^v},$$
where the overline indicates topological closure. We assume throughout that $\mathcal{S}$ is compact, but discuss how to relax this assumption in Remark 2.6. We associate to $\mathcal{S}$ the set of jumps
$$\mathcal{R}_{\ge} := \{ r\in\mathcal{R} : \exists\, x\in\mathcal{S},\ \lambda_r(x) > 0 \}.$$
Notice that, depending on the sequence of initial conditions, the same Markov process may have a different state space $\mathcal{S}^v$ and a different set of jumps $\mathcal{R}_{\ge}$. We refer to Example 2.3 for a situation where $\mathcal{R}_{\ge} \ne \mathcal{R}$. However, by abuse of notation, we will drop the index $\ge$ and refer to this set simply as $\mathcal{R}$.

Finally, we define the degenerate set, also referred to as "boundary" from its topological characterization in many application domains, as
$$\partial\mathcal{S} := \{ x\in\mathcal{S} : \exists\, r\in\mathcal{R},\ \lambda_r(x) = 0 \}.$$
This represents the set of points where the limiting process is degenerate, i.e., where the classical proof of the large-deviation principle will not immediately apply. Observe that this is a slight abuse of both notation and terminology, since this degenerate set $\partial\mathcal{S}$ may be different from the actual topological boundary of the set $\mathcal{S}$.

The following example clarifies the role of the sequence of initial conditions on the resulting state space.

Example 2.2.

The mass action kinetics model $A \leftrightarrow B$ (see Example 2.1 for the definition of the rates and jump vectors) with initial conditions $x_0^v = (0, (v+1)/v)$ results in $\mathcal{S}^v = \{ x \in (v^{-1}\mathbb{Z}_{\ge 0})^2 : x_1 + x_2 = 1 + 1/v \}$ and $\mathcal{S} = \{ x \in \mathbb{R}^2_{\ge 0} : x_1 + x_2 = 1 \}$.

Example 2.3.

The nontrivial effect of different sequences of initial conditions is captured by the system B ↔ B A ↔ + B , with mass action kinetics (see Example 2.1). For this model, the sequence x v = (1 /v, results in S v = ( v − Z ≥ ) \{ } and S = R ≥ . However, if x v = (0 , /v ) we have R ≥ = { B → B, B → B } and the dynamics are restricted to S v = { } × ( v − Z ≥ ) resulting in S = { x ∈ R ≥ : x = 0 } .

2.1. Assumptions.

To ensure existence of the limit, we require the reaction rates to satisfy some conditions.

Assumption 1 (Convergence and regularity of rates). We assume the following.
a) There exists a collection of non-negative functions $\{\lambda_r\}_{r\in\mathcal{R}}$, Lipschitz continuous on a neighborhood of $\mathcal{S}$ in $\mathbb{R}^d$, such that
$$\lim_{v\to\infty} \sup_{x\in\mathcal{S}^v} \sum_{r\in\mathcal{R}} |\Lambda^v_r(x) - \lambda_r(x)| = 0.$$
b) There exists $\aleph > 0$ such that for all $r\in\mathcal{R}$, $v > 0$ and $x\in\mathcal{S}^v$ with $\Lambda^v_r(x) > 0$, we have
$$\frac{\Lambda^v_r(x)}{\lambda_r(x)} \ge \aleph.$$

As we outline in Section 3.1, the proof of our main theorem is based on the construction of short linear paths moving the process away from the boundary $\partial\mathcal{S}$. We now introduce notation to decompose the state space into subsets, in each of which the linear path will be fixed. More precisely, following a standard approach first presented in [28], we cover the state space $\mathcal{S}$ with the relative interior of finitely many convex, compact sets $\{\mathcal{A}_i\}_{i\in\mathcal{I}}$ with $\mathcal{A}_i \subseteq \mathcal{S}$ for all $i\in\mathcal{I}$. We then define $\partial\mathcal{A}_i := \partial\mathcal{S}\cap\mathcal{A}_i$ and let $\mathcal{I}_{\mathrm{bd}} \subseteq \mathcal{I}$ be the subset of indices for which $\partial\mathcal{A}_i \ne \emptyset$.

We now present the assumptions for the lower bound. We assume that, whenever the process starts from an initial condition close to $\partial\mathcal{S}$ (where possibly all the rates are zero), one can identify a finite sequence of favorable jumps, which we call the escape sequence, that push the process away from the boundary. We further crucially assume that the rates of such favorable jumps do not decay too fast as we approach the boundary. The need for this assumption is illustrated by the following counterexample.

Example 2.4.

Consider the family of Markov jump processes $\{X^v\}$ with generator
$$\mathcal{L}^v f(x) := v e^{-k/x} \left( f(x + v^{-1}) - f(x) \right) \quad\text{for } f : v^{-1}\mathbb{N} \to \mathbb{R}, \tag{4}$$
for any $k > 0$. The above process, which for small $x$ is a time-changed version of the autocatalytic process introduced in Example 1.1, has only one possible jump in the positive direction with rate $\Lambda^v(x) = e^{-k/x}$ such that
$$\lim_{\varrho\to 0} \varrho \left( \inf_{x \,:\, x\ge\varrho} \log\lambda(x) \right) = -k \ne 0.$$
For the sequence of initial conditions $x_0^v = 1/v$, we have $\mathcal{S}^v = v^{-1}\mathbb{N}$ and $\mathcal{S} = \mathbb{R}_{\ge 0}$. Then, for any $w > 0$ and $\varepsilon \in (0, w/\ )$ the probability of observing a realization of $X^v$ in an $\varepsilon$-neighborhood of the path $z(s) = sw$ on the interval $s\in[0,1]$ can be trivially estimated as
$$\mathbb{P}\left[ \sup_{t\in[0,1]} |X^v_t - z(t)| \le \varepsilon \right] \le \mathbb{P}\left[ X^v_1 \ge w - \varepsilon \right] \le \mathbb{P}\left[ X^v_1 \ge \tilde w \right]$$
for $\tilde w = \min(k, w-\varepsilon)/\ $. Denoting by $\tau_i$ the waiting time between the $(i-1)$-th and $i$-th jump of the Poisson process $X^v_t$ at $x\in\mathcal{S}^v$, we further have
$$\mathbb{P}\left[ X^v_1 \ge \tilde w \right] \le \prod_{i=1}^{\lfloor v\tilde w\rfloor} \mathbb{P}[\tau_i \le 1] = \prod_{i=1}^{\lfloor v\tilde w\rfloor} \left( 1 - \exp\left[ -\left( v e^{-kv/i} \right) \right] \right) \le \exp\left( \sum_{i=1}^{\lfloor v\tilde w\rfloor} (\log v - kv/i) \right).$$
The rough estimate above yields
$$\frac{1}{v} \log \mathbb{P}\left[ X^v_1 \ge \tilde w \right] \le \frac{1}{v} \lfloor v\tilde w\rfloor \log v - k \sum_{i=1}^{\lfloor v\tilde w\rfloor} \frac{1}{i} < (\tilde w - k)\log v - k(1 + \log\tilde w), \tag{5}$$
which approaches $-\infty$ as $v\to\infty$.

As the example above shows, sufficiently fast decay of the rates of the process $X^v$ implies the divergence of the large-deviation cost of any nontrivial path starting on the boundary $\partial\mathcal{S}$. We now proceed to give sufficient assumptions guaranteeing that this does not happen in the general setting. In particular, to capture the idea of escaping a boundary in the higher dimensional setting, for each $\mathcal{A}_i$ we define directions $w_i$ with some structural properties (Assumption 2 a)) that allow us to construct linear paths leaving such boundaries. We assume that these paths can be realized as a sequence of jumps $E_i$ whose rates do not decay too fast (Assumption 2 b)-c)), so that the realization of such a path does not have an infinite large-deviation cost. Denoting throughout by $B_\varrho(x)$ the Euclidean ball of radius $\varrho$ in $\mathbb{R}^d$ and by $|A|$ the Lebesgue measure of the set $A$, we summarize these assumptions below:

Assumption 2 (Escape). There exist constants $\varepsilon, \varepsilon', \varepsilon'' > 0$ such that for each $j\in\mathcal{I}$, the following holds:
a) If $j\in\mathcal{I}_{\mathrm{bd}}$ there is a $w_j\in\mathbb{R}^d$ with $\|w_j\| = 1$ and $\kappa_j\in(0,1)$ such that whenever $x\in\mathcal{A}_j$ and $\inf_{y\in\partial\mathcal{S}} \|x - y\| < \varepsilon'$ and $t\in(0,\varepsilon)$
   i) $t \mapsto \inf_{y\in\partial\mathcal{S}} \|x + t w_j - y\|$ is increasing, and
   ii) $B_{t\kappa_j}(x + t w_j) \cap \partial\mathcal{S} = \emptyset$.
   We write $\kappa_- = \min_{j\in\mathcal{I}_{\mathrm{bd}}} \{\kappa_j\}$. If $j\in\mathcal{I}\setminus\mathcal{I}_{\mathrm{bd}}$ we choose $w_j = 0$.
b) There exists a finite sequence $E_j := (r_1, \dots, r_{n_j})$ of jumps in $\mathcal{R}$ with
$$\limsup_{v\to\infty} \frac{1}{v} \left| \log \Lambda^v_{r_k}\left( x_0^v + v^{-1} \sum_{i=1}^{k-1} \gamma_{r_i} \right) \right| = 0, \qquad k = 1, \dots, n_j,$$
   and $\sum_{i=1}^{n_j} \gamma_{r_i} = \alpha_j w_j$ for some $\alpha_j > 0$.
c) Defining $Z^v := \{ x_0^v + v^{-1} \sum_{i=1}^{k} \gamma_{r_i} : k \in (0, \dots, n_j - 1) \}$, for all $r\in E_j$ and $T > 0$
$$\lim_{\varrho\to 0} \sup_{x\in\mathcal{A}_j \cup \bigcup_v Z^v} \int_0^\varrho \left| \log\lambda_r(x + s w_j) \right| \mathrm{d}s = 0.$$
d) Let $W_{j,\kappa''} := \{ w_j + y : \|y\| < \kappa'' \}$. There exists $\kappa'' < \kappa_-/\ $ such that for any $x\in\mathcal{A}_j \cup \bigcup_v Z^v$ we have that for all $r\in\mathcal{R}$ with $\lambda_r(x) < \varepsilon''$ the rates $\lambda_r(\cdot)$ are nondecreasing along paths $t\mapsto x + tw$ for any $w\in W_{j,\kappa''}$, for $t\in(0,\varepsilon)$.

It is readily verified that Assumptions 1 and 2 are satisfied by mass action kinetics rates on a convex domain [1].
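To illustrate the integrability requirement in Assumption 2 c) and its relation to Example 2.4, consider the one-dimensional situation with boundary point $x = 0$ and direction $w_j = 1$, which is a simplification assumed here for illustration. For a polynomial rate $\lambda(s) = s^m$ one has $\int_0^\varrho |\log s^m|\,\mathrm{d}s = m\varrho(1 - \log\varrho)$ for $\varrho \le 1$, which vanishes as $\varrho\to 0$. For $\lambda(s) = e^{-k/s^\alpha}$ the integral equals $k\varrho^{1-\alpha}/(1-\alpha)$ when $\alpha < 1$ and diverges when $\alpha \ge 1$, which is the case of Example 2.4. The helper names below are hypothetical.

```python
import math


def log_integral_power(rho, m):
    """int_0^rho |log(s**m)| ds for 0 < rho <= 1, in closed form."""
    return m * rho * (1.0 - math.log(rho))


def log_integral_exp(rho, k, alpha):
    """int_0^rho |log(exp(-k / s**alpha))| ds = int_0^rho k * s**(-alpha) ds."""
    if alpha >= 1.0:
        return math.inf               # non-integrable singularity at s = 0
    return k * rho ** (1.0 - alpha) / (1.0 - alpha)


# The first two columns shrink with rho, the last one stays infinite.
for rho in (1e-1, 1e-2, 1e-3):
    print(rho,
          log_integral_power(rho, 2),
          log_integral_exp(rho, 1.0, 0.5),
          log_integral_exp(rho, 1.0, 1.0))
```

The closed forms are used instead of numerical quadrature because the integrands are singular at $s = 0$, where a naive quadrature rule could hide the divergence for $\alpha \ge 1$.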

Remark 2.5. While Assumption 2 c) is natural in terms of our proof, we note that it is automatically satisfied whenever there exists $\alpha\in[0,1)$ such that
$$\lim_{\varrho\to 0} \varrho^\alpha \left( \inf_{x\in\mathcal{A}_j \cup \bigcup_v Z^v \,:\, d(x,\partial\mathcal{A}_j) > \varrho} \log\lambda_r(x) \right) = 0 \quad\text{for all } r\in E_j, \tag{6}$$
as we mentioned in (3), where $d(A,B) := \inf_{x\in A, y\in B} \|x - y\|$. In particular, the above decay condition implies the results of [28] and [23]. These papers make the stronger assumptions that (6) holds with $\alpha = 0$ and $\alpha\in[0,1/2)$ respectively.

2.2. The large-deviation principle.

For a parameter $T > 0$ fixed throughout the paper, we denote by $D_u(0,T;\mathbb{R}^d)$ (resp. $D_s(0,T;\mathbb{R}^d)$) the space of càdlàg functions with values in $\mathbb{R}^d$ endowed with the topology of uniform convergence (resp. Skorohod topology). Furthermore we define $B_{[0,T]}(\varrho, z)$ to be the ball of radius $\varrho$ in $D_u(0,T;\mathbb{R}^d)$. Finally, for $z : [0,T] \to \mathbb{R}^d$ in the set $AC(0,T;\mathbb{R}^d)$ of absolutely continuous functions, we denote by $\dot z$ its time derivative and we will say that $z\in AC(0,T;\mathcal{S})$ whenever $z(t)\in\mathcal{S}$ for a.e. $t\in[0,T]$.

To define the standard rate function for the LDP of Markov jump processes in the small noise limit [27] we introduce the action
$$I_{[0,T]}(z) := \begin{cases} \displaystyle\int_0^T \inf_{\mu\in\mathbb{R}^{\mathcal{R}} \,:\, \sum_{r\in\mathcal{R}} \mu_r\gamma_r = \dot z(t)} \mathcal{H}(\mu \mid \lambda(z(t)))\,\mathrm{d}t, & \text{if } z\in AC(0,T;\mathcal{S}), \\ +\infty & \text{otherwise,} \end{cases} \tag{7}$$
$$\mathcal{H}(\mu \mid \lambda) := \sum_{r\in\mathcal{R}} \lambda_r - \mu_r + \mu_r \log\frac{\mu_r}{\lambda_r}, \tag{8}$$
where $(\lambda(x))_r := \lambda_r(x)$. We can now state the main result of this paper:

Theorem 1.

Consider the sequence of Markov jump processes $\{X^v\}_{v\in\mathbb{N}}$, fixing a sequence of deterministic initial conditions $\{x_0^v\}_{v\in\mathbb{N}}$ with $x_0^v\in(v^{-1}\mathbb{Z})^d$ such that $X^v(0) = x_0^v \to x_0 \in\mathbb{R}^d$. Furthermore, let Assumptions 1 and 2 hold. Then the sequence $\{X^v\}_{v\in\mathbb{N}}$ satisfies an LDP in $D_u(0,T;\mathbb{R}^d)$ (resp. $D_s(0,T;\mathbb{R}^d)$) with good rate function
$$I^{x_0}_{[0,T]}(z) := \begin{cases} I_{[0,T]}(z) & \text{if } z(0) = x_0, \\ +\infty & \text{otherwise,} \end{cases} \tag{9}$$
that is, $I^{x_0}_{[0,T]}$ has compact sublevel sets in $D_u(0,T;\mathbb{R}^d)$ (resp. $D_s(0,T;\mathbb{R}^d)$), and for all measurable $\Gamma\subseteq D_s(0,T;\mathbb{R}^d)$,
$$\limsup_{v\to\infty} \frac{1}{v} \log \mathbb{P}_{x_0^v}[X^v\in\Gamma] \le -\inf_{x\in\bar\Gamma} I^{x_0}_{[0,T]}(x), \tag{10}$$
$$\liminf_{v\to\infty} \frac{1}{v} \log \mathbb{P}_{x_0^v}[X^v\in\Gamma] \ge -\inf_{x\in\Gamma^o} I^{x_0}_{[0,T]}(x), \tag{11}$$
where $\mathbb{P}_{x_0^v}[\,\cdot\,]$ denotes the conditional probability given $X^v(0) = x_0^v$.

The compactness of $\mathcal{S}$ and Lipschitz continuity of the rates $\{\lambda_r\}$ directly imply that the rates are bounded and Lipschitz, which in turn implies that the process is exponentially tight. Therefore, the compactness of sublevel sets of $I^{x_0}_{[0,T]}$ comes for free, and the upper bound only needs to be proven for compact sets [8, Lem. 1.2.18].

A few considerations are now in order.

Remark 2.6.

The boundedness of the rates and compactness of $\mathcal{S}$ can be relaxed. Indeed, exponential tightness can be obtained by other means: either by Lipschitz continuity of the jump rates [11, 27], or by stability estimates [1, 3]. Once exponential tightness is guaranteed, one can restrict the analysis to trajectories that do not leave a large enough compact set [11, Theorem 4.4], effectively reducing the problem to the one with compact state space, which we discuss above. We further note that the seemingly restrictive assumption of a deterministic initial condition also covers the case when such an initial condition is random. This can be done, given the probability conditioned on a fixed initial state from Theorem 1, by integrating with respect to the probability distribution $\nu^v\in\mathcal{M}((v^{-1}\mathbb{Z})^d)$ of the initial condition, provided that the measure $\nu^v$ satisfies some weak regularity and tightness assumptions. In this case, however, one must check that the conditions in Theorem 1 hold uniformly on a set of positive measure with respect to $\nu^v$. For a detailed discussion of this procedure when $\nu$ also satisfies an LDP at the same rate we refer to [5].

3. PROOF OF LDP LOWER BOUND

The general strategy adopted to prove the LDP lower bound result is mainly standard. Without loss of generality, we may assume that $\Gamma$ is open, so that for any path $z\in\Gamma$ one can find a $\delta > 0$ such that $B_{[0,T]}(\delta, z)\subset\Gamma$, and hence $\mathbb{P}_{x_0}[X^v\in\Gamma] \ge \mathbb{P}_{x_0}\left[ X^v\in B_{[0,T]}(\delta, z) \right]$ for $\delta > 0$ small enough. Hence, it is sufficient to prove that, for any path $z\in\Gamma$, the probability that the process $X^v$ stays in a neighborhood $B_{[0,T]}(\delta, z)$ for any $\delta > 0$ is approximately $\exp[-v I^{x_0}_{[0,T]}(z)]$. Applying such an estimate to a sequence $\{z^{(n)}\}_{n=1}^\infty$ of paths converging to the minimizer of $I^{x_0}_{[0,T]}$ in $\Gamma$ with small enough $\delta^{(n)}$ proves the desired result. This shows that for the lower bound (11) it is sufficient to prove the following.

Proposition 3.1.

Fix a path $z : [0,T]\to\mathcal{S}$ with a fixed initial condition $z(0) = x_0\in\mathcal{S}$ such that $I^{x_0}_{[0,T]}(z) = K < \infty$. Then, for a sequence of initial conditions $x_0^v\in(v^{-1}\mathbb{Z})^d$ converging to $x_0$, under Assumption 1 and Assumption 2,
$$\lim_{\delta\to 0} \liminf_{v\to\infty} \frac{1}{v} \log \mathbb{P}_{x_0^v}\left[ X^v\in B_{[0,T]}(\delta, z) \right] \ge -I^{x_0}_{[0,T]}(z). \tag{12}$$

The remainder of this section concentrates on proving this estimate. Our approach to the proof of the above result mimics the one from [28]. Throughout this section, we fix a path $z\in AC([0,T],\mathcal{S})$ starting from $z(0) = x_0$ and we approximate $z$ with another path $z^\delta$ obtained by perturbing $z$, shifting it uniformly away from the regions where the rates are degenerate by a quantity controlled by $\delta$. We then proceed to prove, on the one hand, that the probability of $X^v$ approximately following $z^\delta$ is accurately described by the rate function $I^{x_0}_{[0,T]}(z^\delta)$, and on the other, that the large-deviation cost of the process following the shifted path converges towards the one of the original path as $\delta\to 0$. The main difficulty in establishing the former claim arises from the necessity of keeping the microscopic initial condition of the path fixed, and estimating the probability of the process reaching, in a small time interval, the origin of the shifted path $z^\delta$, which is macroscopically bounded away from the boundary. On the other hand, to establish the latter convergence property of the rate functional we have to guarantee sufficient regularity of such functional as some of the jump rates decrease to $0$ with $\delta\to 0$. The remainder of the section is devoted to the realization of this program. In Section 3.1 we give the explicit construction of the path $z^\delta$ and detail its role in the proof of Proposition 3.1, in Section 3.2 we estimate the probability of the process reaching the origin of the shifted path from its fixed initial condition $x_0$, while in Section 3.3 we prove sufficient regularity of the rate functional $I^{x_0}_{[0,T]}$.
The proof of Proposition 3.1 is finally concluded in Section 3.4.

FIGURE 1. Schematic representation of shifted path.

3.1. Construction of the path $z^\delta$.

We now construct a macroscopic path that perturbs the original path $z\in AC(0,T;\mathcal{S})$ at a negligible cost and that can only intersect $\partial\mathcal{S}$ at its initial point. To do so we recall the covering $\{\mathcal{A}_i\}$ defined in Section 2.1, allowing us to identify, for each $\mathcal{A}_i$, directions $w_i$ to move away from the boundary $\partial\mathcal{S}$. More specifically, this covering allows us to partition the path $z$ as it enters different regions $\mathcal{A}_i$ and to shift it in the corresponding direction $w_i$, thereby guaranteeing that the shifted path avoids $\partial\mathcal{S}$, as we detail below and as depicted in Fig. 1.

To construct $z^\delta$ we introduce the sequence of times $\{\tau_k\}_k$ so that $\{z(t) : t\in[\tau_k,\tau_{k+1}]\}\subset\mathcal{A}_{i_k}$ for all $k$ for a corresponding sequence $\{i_k\}_k$ of indices in $\mathcal{I}$. Then, for fixed $x_0\in\mathcal{S}$, we consider $i_0\in\mathcal{I}$ such that $x_0\in\mathcal{A}_{i_0}$, so that $z(t)\in\mathcal{A}_{i_0}$ for all $t\in[0,\tau_1]$. In this interval we define the shifted path
$$z^\delta(t) := \begin{cases} x_0 + t w_j & \text{for } t\in[0,t_\delta], \\ z(t - t_\delta) - x_0 + t_\delta w_j & \text{for } t\in(t_\delta, \tau_1 + t_\delta], \end{cases} \tag{13}$$
for $t_\delta = \xi\min(\delta, \omega_z^{-1}(\delta))$, where $\omega_z$ denotes the modulus of continuity of the path $z$ and $\xi > 0$ is defined below (see Lemma 3.2). We then continue defining the path $z^\delta$ by shifting the original path $z$ infinitesimally on every interval $[\tau_k,\tau_{k+1}]$, sequentially moving it away from $\partial\mathcal{A}_{i_k}$ with the corresponding $w_{i_k}$. More precisely, setting the length of the $k$-th shift time for the perturbed path as $\beta^k t_\delta$ for $\beta > 0$ to be chosen later (see Lemma 3.2) and denoting by $\Delta^\beta_k := t_\delta\sum_{\ell=0}^{k-1}\beta^\ell$ the cumulative shift time up to transition $k$, we define the $i$-th shift as
$$z^\delta(s) := \begin{cases} z^\delta(t) + (s - \tau_i - \Delta^\beta_i)\,w_i & \text{for } s\in(\tau_i + \Delta^\beta_i,\ \tau_i + \Delta^\beta_{i+1}], \\ z^\delta(\tau_i + \Delta^\beta_i) + \beta^k t_\delta w_i + z(s - \Delta^\beta_{i+1}) - z(\tau_i) & \text{for } s\in(\tau_i + \Delta^\beta_{i+1},\ \tau_{i+1} + \Delta^\beta_{i+1}]. \end{cases} \tag{14}$$
We now establish some structural properties of the newly constructed path around the original $z$, which we recall is fixed throughout this section.
This lemma extends [28, Lemma 3.4].

Lemma 3.2. Let Assumptions 1 and 2 hold and set $\beta := 3/\kappa''$, recalling that $\kappa'' < \kappa_- = \min_{j\in\mathcal{I}}\kappa_j$ from Assumption 2 a). Then, for any $K > 0$ there is a $J > 0$ such that if $I_{[0,T]}(z)\le K$, there are $\tau_0 < \tau_1 < \dots < \tau_J = T$ and $\{i_k\}$ with $z(t)\in\mathcal{A}_{i_k}$ for $\tau_{k-1}\le t\le\tau_k$. Furthermore, setting $\xi := \min(1, (\kappa''/\ )^{J+1}/\ , \varepsilon)$ there exists $\delta_z > 0$ such that for all $\delta < \delta_z$ the path $z^\delta$ from (13) and (14) satisfies $\sup_{[0,T]}\|z - z^\delta\| < \delta/\ $.
Finally, the path $z^\delta$ satisfies $\bigcup_{t\in[t_\delta,T]} B_{\kappa_- t_\delta}(z^\delta(t))\cap\partial\mathcal{S} = \emptyset$ and for every $i\in(1,\dots,J)$ and $a\in\mathbb{R}^d\cap\operatorname{span}_{r\in\mathcal{R}}(\gamma_r)$ with $\|a\| < t_\delta\kappa''/\ $ there exists $w\in W_{i_k,\kappa''}$ such that $z^\delta(\tau_k + \Delta^\beta_{k+1}) + a = z(\tau_k) + \beta^k t_\delta w$.

We defer the proof of this lemma to the end of the section and proceed to present the central estimate allowing us to bound the probability in (12) from below, in the sense of large deviations. To do so, defining throughout $\beta := 3/\kappa''$ and $\xi := \min(1, (\kappa''/\ )^{J+1}/\ , \varepsilon)$ so that Lemma 3.2 holds, by the triangle inequality it is sufficient to consider the event $\sup_{t\in[0,T]}\|z^\delta(t) - X^v(t)\| < \delta/\ $. Furthermore, $z^\delta\in\mathcal{S}$ and for any $\delta''/\ \le\delta'\le\delta/\ $ we can further bound the event of interest from below as follows:
$$\begin{aligned} \mathbb{P}_{x_0^v}\left[ X^v\in B_{[0,T]}(\delta, z) \right] &\ge \mathbb{P}_{x_0^v}\left[ X^v\in B_{[0,T]}(\delta/\ , z^\delta) \right] \\ &\ge \mathbb{P}_{x_0^v}\left[ \{X^v\in B_{[0,t_\delta]}(\delta/\ , z^\delta)\} \cap \{X^v(t_\delta)\in B_{\delta''/\ }(x_0 + t_\delta w_j)\} \cap \{X^v\in B_{[t_\delta,T]}(\delta', z^\delta)\} \right] \\ &\ge \mathbb{P}_{x_0^v}\left[ \{X^v\in B_{[0,t_\delta]}(\delta/\ , z^\delta)\} \cap \{X^v(t_\delta)\in B_{\delta''/\ }(x_0 + t_\delta w_j)\} \right] \\ &\qquad\times \inf_{y\in B_{\delta''/\ }(x_0 + t_\delta w_j)} \mathbb{P}\left[ X^v\in B_{[t_\delta,T]}(\delta', z^\delta) \mid X^v(t_\delta) = y \right], \end{aligned} \tag{15}$$
where in the last inequality we have used the Markov property.
In the remainder of the paper we set $\delta' := \kappa_- t_\delta/\ $ and $\delta'' := t_\delta\kappa'' < \delta'$ where, recalling that $\kappa_- = \min_{j\in\mathcal{I}_{\mathrm{bd}}}\{\kappa_j\} < 1$ and that $\xi < 1$, we must have $\delta'\le\delta/\ $. We note that this choice is compatible with the definition of $\kappa''$ from Assumption 2.

Remark 3.3.

We pause briefly to motivate our choice of $\delta'$ and $\delta''$: these small parameters are chosen in such a way as to guarantee that the event in the second term in the last line of (15) only contains paths that are uniformly bounded away from $\partial\mathcal{S}$, as captured by Lemma 3.2 and depicted in Fig. 2.

The desired result is obtained by showing that
$$\lim_{\delta\to 0} \liminf_{v\to\infty} \frac{1}{v} \log \mathbb{P}_{x_0^v}\left[ \{X^v\in B_{[0,t_\delta]}(\delta/\ , z^\delta)\} \cap \{X^v(t_\delta)\in B_{\delta''/\ }(x_0 + w_j t_\delta)\} \right] = 0 \quad\text{and} \tag{16}$$
$$\lim_{\delta\to 0} \liminf_{v\to\infty} \frac{1}{v} \log \inf_{y\in B_{\delta''/\ }(x_0 + t_\delta w)} \mathbb{P}\left[ X^v\in B_{[t_\delta,T]}(\delta', z^\delta) \mid X^v(t_\delta) = y \right] \ge -I^{x_0}_{[0,T]}(z). \tag{17}$$
The term in (16) is bounded from below in Section 3.2, while in Section 3.3 and Section 3.4 we formulate and combine the estimates in different $\mathcal{A}_j$ to bound (17), thereby proving the desired LDP lower bound.

3.2. Jump bounds.

We now proceed to bound from below the ﬁrst term on the last line of (15). To do this we considera convenient subset of outcomes obtained by ﬁxing a precise sequence of jumps (but not the times of the jumps) thatthe process undergoes in the interval (0 , t δ ) . To deﬁne such an event, we recall the deﬁnition of the sequence E j of n j jumps leading away from ∂ A j and we denote the event of repeating the sequence of jumps in E j n times by Ξ j ( n, v ) := n − (cid:92) m =0 n j (cid:92) i =1 (cid:8) X v ( σ mn j + i ) − X v ( σ mn j + i − ) = v − γ r i (cid:9) , (18)where, for all k ∈ N , σ k is the time of the k -th jump of the Markov process X v . Furthermore, we note that by ourchoice of t δ < / δ we must have { x + tw j : t ∈ (0 , t δ ) } ⊂ (cid:84) t ∈ (0 ,t δ ) B δ/ ( x + w j t ) , as depicted in Fig. 2. Thusfor n v + := (cid:98) vα j t δ (cid:99) and n v − := (cid:100) vα j ( t δ − δ (cid:48) ) (cid:101) we have for all v large enough that n v − < n v + , { X v ∈ B [0 ,t δ ] ( δ/ , z δ ) } ⊇ Ξ j ( n v + , v ) ∩ { σ n v + n j > t δ } , (19)and also { X v ( t δ ) ∈ B δ (cid:48) ( x + w j t δ ) } ⊇ Ξ j ( n v + , v ) ∩ { σ n v + n j > t δ } ∩ { σ n v − n j ≤ t δ } . Note that, as v and n increase the paths in Ξ j ( n, v ) have ranges concentrating on a straight line segment in R d However, there is no information about the speed at which they move along this line segment. This degree offreedom will be sufﬁcient to establish the

LDP lower bound, as we shall see next. In preparation for the next result, observe that by Assumption 1a) we have $\limsup_{v\to\infty}\sup\big\{\sum_{r\in\mathcal R}\Lambda^v_r(x) : x\in\mathcal B_{t_\delta}(x_0)\big\}<\infty$, and define $\bar t_\delta(\alpha,\varepsilon''):=\min_j\big\{\alpha_j/n_j,\ \varepsilon''/\max_{r\in\mathcal R}\mathrm{Lip}(\lambda_r)\big\}$.

Lemma 3.4.

Suppose x ∈ A j and let Assumption 1 and 2 hold and that δ is small enough that t δ < ¯ t δ ( α, ε (cid:48)(cid:48) ) , thenfor ¯ λ > max { , lim sup v →∞ sup (cid:8)(cid:80) r ∈R Λ vr ( x ) : x ∈ B t δ ( x ) (cid:9) } we have lim inf v →∞ v log P (cid:104) X v ( t δ ) ∈ B δ (cid:48)(cid:48) / ( x + t δ w j ) , Ξ j ( n v , v ) , σ n v + n j > t δ (cid:12)(cid:12)(cid:12) X v (0) = x v (cid:105) ≥ − t δ (cid:18) n j α j log (cid:18) n j α j λ (cid:19) − n j α j + λ (cid:19) + n j (cid:88) i =1 (cid:90) t δ /α j log ( λ r i ( x + sα j w j )) d s + t δ n j α j log ℵ . δ/ ( x ) B δ/ ( x + t δ w j ) B δ (cid:48) ( x + t δ w i ) z z δ δ/ x F IGURE

2. Schematic representation of the desired effect for the choice of parameters δ, δ (cid:48) , δ (cid:48)(cid:48) summarized in Lemma 3.2. For a ﬁxed path z and δ > by our choice of ξ > and consequently t δ we have that a neighborhood of the path x + w i t (blue arrow) is contained in (cid:84) t ∈ (0 ,t δ ) B δ/ ( x + tw i ) (the intersection of the two dotted balls). By our choice of δ (cid:48) ( ξ, κ − ) > , we ﬁnd that a δ (cid:48) -neighborhood (dashed blue region) of the path z δ (blue line) on [ t δ , T ) never intersects ∂S while B δ (cid:48) ( x + w i t δ ) ⊂ (cid:84) t ∈ (0 ,t δ ) B δ ( z ( t )) . The shaded blue region represents B δ (cid:48)(cid:48) / ( x + t δ w j ) .To prove the above result, deﬁning throughout ˜ γ ( i ) := (cid:80) i − k =1 γ r k we introduce the following lemma relating theRiemann sum of log λ r along the escape sequence deﬁning Ξ j to the corresponding integral. Lemma 3.5.

Suppose $x_0\in\mathcal A_j$ and let Assumptions 1 and 2 hold. Then for all $\delta$ such that $t_\delta<\bar t_\delta(\alpha,\varepsilon'')$ and all $r_i\in E_j$ we have
\[
\liminf_{v\to\infty}\frac1v\sum_{m=0}^{n^v_+}\log\Big(\lambda_{r_i}\Big(x_0^v+\frac mv\,\alpha_jw_j+v^{-1}\tilde\gamma(i)\Big)\Big)\ \ge\ \int_0^{t_\delta/\alpha_j}\log\big(\lambda_{r_i}(x_0+s\alpha_jw_j)\big)\,\mathrm ds. \tag{20}
\]

Proof of Lemma 3.4.

When x (cid:54)∈ ∂ A j all the rates are strictly positive by deﬁnition, so the result follows by standardlarge-deviation estimates [27]. Therefore, for the rest of the proof we assume that x ∈ ∂ A j . Denoting by { x ∈ A } the indicator function on the set A , we introduce a jump r (cid:63) with Λ vr (cid:63) ( x ) := (cid:32) ¯ λ − (cid:88) r ∈R Λ vr ( x ) (cid:33) { x ∈ B t δ ( x ) } and γ r (cid:63) = 0 , and expand the set of jumps R (cid:63) := R ∪ { r (cid:63) } . We then deﬁne a new family of processes ¯ X v on the extended set ofjumps R (cid:63) and corresponding jump rates. By independence of the jump processes we trivially couple the underlyingPoisson processes for jumps in R to the ones of the process X v , so that ¯ X v ( t ) = X v ( t ) a.s. and proceed to establishthe desired result for ¯ X v . In the rest of the proof, by abuse of notation we will denote by ¯Ξ j the set deﬁned in (18) for ¯ X v instead of X v . We then have P (cid:104) ¯ X v ( t δ ) ∈ B δ (cid:48)(cid:48) / ( x + t δ w j ) , σ n v + n j > t δ , ¯Ξ j ( n v + , v ) (cid:12)(cid:12)(cid:12) ¯ X v (0) = x v (cid:105) ≥ P (cid:104) σ n v − n j ≤ t δ < σ n v + n j (cid:12)(cid:12)(cid:12) ¯Ξ j ( n v + , v ) , ¯ X v (0) = x v (cid:105) × P (cid:2) ¯Ξ j ( n v + , v ) (cid:12)(cid:12) ¯ X v (0) = x v (cid:3) . On the event ¯Ξ j ( n v + , v ) ∩ (cid:110) σ n v + n j ≥ t (cid:111) one has, for v large enough, sup s from Assumption 1 we have for (22) lim inf v →∞ v log P (cid:2) ¯Ξ j ( n v , v ) (cid:12)(cid:12) ¯ X v (0) = x v (cid:3) = n j (cid:88) i =1 (cid:90) t δ /α j log (cid:16) λ r (cid:96)i ( x + sα j w j ) (cid:17) d s − n j t δ log (cid:0) λ/ ℵ (cid:1) /α j . 
(23)Also since the waiting times between jumps are independent of which type of jump actually occurs P (cid:104) ¯ X v ( t δ ) ∈ B δ (cid:48)(cid:48) / ( x + t δ w j ) , σ n v + n j > t δ (cid:12)(cid:12)(cid:12) ¯Ξ j ( n v , v ) , ¯ X v (0) = x v (cid:105) = P (cid:2) n v − n j ≤ Y v < n v + n j (cid:3) , where Y v is Poisson distributed with mean t δ vλ . By deﬁnition, ¯ λ ≥ so that it is in particular greater than n v + n j /v forour choice of t δ , and we have lim v →∞ v log P (cid:2) n v − n j ≤ Y v < n v + n j (cid:3) = − t δ (cid:18) n j α j log (cid:18) n j α j ¯ λ (cid:19) − n j α j + ¯ λ (cid:19) (24)The result now follows by combining (24) and (23). (cid:3) Proof of Lemma 3.5.

We note that λ r is Lipschitz so whenever λ r ( x ) > then λ r ( x + sw j ) is uniformly boundedaway from 0 on a sufﬁciently small time interval and the result follows immediately by a dominated convergenceargument. We therefore assume throughout that λ r ( x ) = 0 . Introducing ˜ n v := inf (cid:26) m ∈ N : x v + mv − α j w j + v − ˜ γ ( i ) − x (cid:107) x v + mv − α j w j + v − ˜ γ ( i ) − x (cid:107) ∈ W j,κ (cid:48)(cid:48) (cid:27) , (25)we split the sum in the statement of the Lemma into the terms m = 0 , m ∈ (0 , . . . , ˜ n v ) and m ∈ (˜ n v + 1 , . . . , n v + ) andproceed to bound their contribution separately. The term m = 0 is automatically bounded by Assumption 2b). For thesecond term we observe since log (cid:0) λ r i (cid:0) x v + mv − α j w j + v − ˜ γ ( i ) (cid:1)(cid:1) is increasing in m by Assumption 2d) that lim inf v →∞ v ˜ n v (cid:88) m =1 log (cid:16) λ r i (cid:16) x v + mv − α j w j + v − ˜ γ ( i ) (cid:17)(cid:17) ≥ lim inf v →∞ v (cid:90) ˜ n v /v log (cid:16) λ r i (cid:16) x v + v − ˜ γ ( i ) + tα j w j (cid:17)(cid:17) d t = 0 , where the ﬁnal equality arises since lim v →∞ x v = x implies lim v →∞ v − ˜ n v = 0 , and we have the integral estimatefrom Assumption 2c), which is uniform in the starting points x v + v − ˜ γ ( i ) ∈ Z v .Similarly for the terms m ∈ (˜ n v + 1 , . . . 
, n v + ) , deﬁning m (cid:48) := m − ˜ n v ≥ we can write x v + mv − α j w j + v − ˜ γ ( i ) = x + m (cid:48) v − α j w j + (cid:16) x v + ˜ n v v − α j w j + v − ˜ γ ( i ) − x (cid:17) , (26)and note that by (25) the vector in brackets, once renormalized, is in W j,κ (cid:48)(cid:48) .We also have lim v →∞ (cid:13)(cid:13) x v + ˜ n v v − α j w j + v − ˜ γ ( i ) − x (cid:13)(cid:13) = 0 so provided that δ is small enough that x + m (cid:48) v − α j w j ∈ A j and λ r i ( x + m (cid:48) v − α j w j ) < ε (cid:48)(cid:48) we may apply Assumption 2d) to see that for each m (cid:48) ≥ λ r i (cid:16) x v + mv − α j w j + v − ˜ γ ( i ) (cid:17) ≥ λ r i (cid:0) x + m (cid:48) v − α j w j (cid:1) . A second application of Assumption 2d) implies lim inf v →∞ v n v + − ˜ n v (cid:88) m (cid:48) =1 log (cid:0) λ r i (cid:0) x + mv − α j w j (cid:1)(cid:1) ≥ lim inf v →∞ (cid:90) ( n v + − ˜ n v ) /v log ( λ r i ( x + tα j w j )) d t. (cid:3) .3. Approximation of the rate functional.

After the process has left the boundary $\partial\mathcal S$, we estimate the cost of a given path $z$ by approximating $z$ with another path that is uniformly bounded away from $\partial\mathcal S$. This implies that a standard LDP holds for such a shifted path, which can then be used to bound the rate function of the original path $z$. We start by proving an adaptation of [28, Lemma 4.1] to the present setting, recalling that $\omega_z$ is the modulus of continuity of $z$.

Lemma 3.6.

Under Assumption 2, for every $x\in\mathcal A_j$ with $j\in\mathcal I$, recalling that $t_\delta:=\xi\min(\omega_z^{-1}(\delta),\delta)$, the cost of the path $z_\delta(t)=x+tw_j$ satisfies $\lim_{\delta\to0}I_{[0,t_\delta]}(z_\delta)=0$.

Proof. The claim follows immediately from Assumption 2, which yields the integrability of the rate functional $I_{[0,t_\delta]}$ along the chosen trajectories. □

We define throughout
\[
\ell(x,y):=\sup_{\vartheta\in\mathbb R^d}\Big[\vartheta\cdot y-\sum_{r\in\mathcal R}\lambda_r(x)\big(\exp(\vartheta\cdot\gamma_r)-1\big)\Big], \tag{27}
\]
and recall that, by convex duality, for any $x,y\in\mathbb R^d$ we have
\[
\ell(x,y)=\inf\Big\{H(\mu\,|\,\lambda(x))\ :\ \mu\in\mathbb R^{|\mathcal R|}_{\ge0},\ \sum_{r\in\mathcal R}\mu_r\gamma_r=y\Big\}
\]
for $H(\mu\,|\,\lambda)$ defined in (8), as proven, e.g., in [27]. Consequently we can express $I_{[0,T]}(z)=\int_0^T\ell(z(s),z'(s))\,\mathrm ds$. This allows us to prove the following adaptation of [28, Lemma 5.1].

Lemma 3.7.

Let Assumptions 1 and 2 hold. Fix $i\in\mathcal I$ and $\tau>0$, and let the path $z$ take values in $\mathcal A_i$ for $t\in[0,\tau]$ and satisfy $I_{[0,\tau]}(z)<K$ for some $K<\infty$. Then for any $w\in\mathcal W_{i,\kappa''}$ and $C_\beta>0$ the shifted path $z_\delta(\cdot)=z(\cdot)+C_\beta t_\delta w$ satisfies
\[
\limsup_{\delta\to0}I_{[0,\tau]}(z_\delta)\le I_{[0,\tau]}(z).
\]

Proof.

We define $\ell_0(t)=\ell(z(t),z'(t))$ and $\ell_\delta(t)=\ell(z_\delta(t),z_\delta'(t))$, and denote by $(\mu^*_r(t))_{r\in\mathcal R}$ the optimizing set of jumps in (7) for the path $z$. This minimizer exists because the sublevel sets of $\mu\mapsto H(\mu\,|\,\lambda)$, for $H$ from (8), are compact. Since the shift is constant in time, $z_\delta'=z'$ and the fluxes $\mu^*(t)$ remain admissible for $z_\delta$, so that $\ell_\delta(t)\le H(\mu^*(t)\,|\,\lambda(z_\delta(t)))$. On the other hand, by continuity of the asymptotic rates $\lambda_r$ there exists a function $K_\lambda(\delta)$ with $\lim_{\delta\to0}K_\lambda(\delta)=0$ for which $|\lambda_r(z(t))-\lambda_r(z_\delta(t))|<K_\lambda(\delta)$ for all $r\in\mathcal R$, so that
\[
\ell_\delta(t)-\ell_0(t)\ \le\ \sum_{r\in\mathcal R}\Big[\lambda_r(z_\delta(t))-\lambda_r(z(t))+\mu^*_r(t)\log\frac{\lambda_r(z(t))}{\lambda_r(z_\delta(t))}\Big]\ \le\ |\mathcal R|\,K_\lambda(\delta)+\sum_{r\in\mathcal R}\mu^*_r(t)\log\frac{\lambda_r(z(t))}{\lambda_r(z_\delta(t))}.
\]
We now bound the second term on the RHS from above, depending on whether $\lambda_r(z(t))>\varepsilon''$ for $\varepsilon''$ from Assumption 2. If $\lambda_r(z(t))\le\varepsilon''$, by the assumed increasing property of $\lambda_r(\cdot)$ along $x+sw$, we have $\log\big(\lambda_r(z(t))/\lambda_r(z_\delta(t))\big)\le0$. On the other hand, if $\lambda_r(z(t))>\varepsilon''$ then for $\delta$ small enough
\[
\log\frac{\lambda_r(z(t))}{\lambda_r(z_\delta(t))}\ \le\ \log\frac{\lambda_r(z(t))}{\lambda_r(z(t))-K_\lambda(\delta)}\ \le\ \log\frac{\varepsilon''}{\varepsilon''-K_\lambda(\delta)}\ \le\ \frac{2K_\lambda(\delta)}{\varepsilon''},
\]
where in the last inequality we used $\log\big((1-x)^{-1}\big)<2x$ for $x>0$ small enough.

It remains to show that the contribution of the term $\mu^*_r(t)$ is bounded from above on the paths of interest. This result is obtained in the proof of [28, Lemma 5.1] from the convexity and asymptotic growth of the Lagrangian $\ell(x,y)$ in its second argument, proven in [28, Lemma 5.1], [27, Lemma 5.17] leveraging only the boundedness of the rates $\lambda_r$. Following the same argument we bound $\mu^*_r(t)\le C_0(1+\ell_0(t))$ and we finally obtain
\[
\ell_\delta(t)\le\ell_0(t)+K_\lambda(\delta)\big(C_1+C_2\,\ell_0(t)\big)
\]
for sufficiently large, positive constants $C_0,C_1,C_2$. By the assumed boundedness of $I_{[0,\tau]}(z)$, this gives the desired result by integration. □

We now combine the above estimates, established in each region $\mathcal A_i$ separately, to obtain convergence of the rate functional $I_{[0,T]}(z_\delta)$ to $I_{[0,T]}(z)$ as $\delta\to0$. While the idea of the proof is the same as in the original reference, we have to reproduce the process more closely: in our case we cannot, in general, bound the rate function of the shifts linearly in $\delta$ as done in [28, Lemma 4.1], and we only have the limiting result of Lemma 3.6. To bypass this issue, we leverage exponential tightness (discussed directly below the statement of Theorem 1) to show that the number of transitions between different $\mathcal A_j$ made by the path of interest is bounded uniformly on sublevel sets of the rate functional.
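The convex duality behind (27) can be checked numerically in a toy case. The following is a minimal sketch, assuming a one-dimensional system with two jumps $\gamma=\pm1$ and constant rates, and assuming that $H$ from (8) is the relative-entropy functional $H(\mu\,|\,\lambda)=\sum_r[\mu_r\log(\mu_r/\lambda_r)-\mu_r+\lambda_r]$ (the form that appears in the integrands later in this section). All function names are hypothetical and chosen only for illustration.

```python
import math


def relative_entropy(mu, lam):
    """Assumed form of H(mu | lam): sum_r [mu_r log(mu_r/lam_r) - mu_r + lam_r], with 0 log 0 = 0."""
    total = 0.0
    for m, l in zip(mu, lam):
        total += l - m
        if m > 0.0:
            total += m * math.log(m / l)
    return total


def lagrangian_sup(y, lam_up, lam_down):
    """Dual form: sup over theta of theta*y - lam_up(e^theta - 1) - lam_down(e^-theta - 1)."""
    # Stationarity reads y = lam_up e^theta - lam_down e^-theta,
    # a quadratic in e^theta whose positive root is taken here.
    e_theta = (y + math.sqrt(y * y + 4.0 * lam_up * lam_down)) / (2.0 * lam_up)
    theta = math.log(e_theta)
    return theta * y - lam_up * (e_theta - 1.0) - lam_down * (1.0 / e_theta - 1.0)


def lagrangian_inf(y, lam_up, lam_down, iters=200):
    """Primal form: inf of H over fluxes with mu_up - mu_down = y (ternary search in mu_down)."""
    lo = max(0.0, -y)  # keeps mu_up = y + mu_down non-negative
    hi = lo + 10.0 * (lam_up + lam_down + abs(y)) + 1.0

    def cost(mu_down):
        return relative_entropy((y + mu_down, mu_down), (lam_up, lam_down))

    for _ in range(iters):  # the cost is convex in mu_down, so ternary search converges
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if cost(a) < cost(b):
            hi = b
        else:
            lo = a
    return cost(0.5 * (lo + hi))
```

For instance, `lagrangian_sup(0.7, 2.0, 0.5)` and `lagrangian_inf(0.7, 2.0, 0.5)` should agree up to numerical tolerance, and both vanish at the fluid-limit velocity $y=\lambda_{\rm up}-\lambda_{\rm down}$, in line with the zero-cost property of solutions of (2).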
For any $a\in\mathbb R^d$, we extend the definition of $z_\delta$ to the interval $[0,T]$ as
\[
\tilde z^a_\delta(t)=\begin{cases} z_\delta(t_\delta)+a & \text{for } t\in[0,t_\delta),\\ z_\delta(t)+a & \text{for } t\in[t_\delta,T].\end{cases}
\]

Lemma 3.8. Let Assumptions 1 and 2 hold, and let the path $z$ satisfy $I_{[0,T]}(z)<\infty$. Then the path $\tilde z^a_\delta$ satisfies
\[
\limsup_{\delta\to0}\ \sup_{a\in\mathrm{span}_{r\in\mathcal R}(\gamma_r)\,:\,\|a\|<\kappa''t_\delta} I_{[0,T]}(\tilde z^a_\delta)\ \le\ I_{[0,T]}(z).
\]

Proof.

Recall the deﬁnition of times τ . . . , τ J from Lemma 3.2, separating [0 , T ] in a ﬁnite number of intervals wherethe path z is contained in a set A j . We now express the rate function as the sum of the cost of the shifted path and thecost of the shifts: For ∆ i := t δ (cid:80) ik =0 (3 /κ (cid:48)(cid:48) ) k (reﬂecting the choice of β in Lemma 3.2) we write I [0 ,T ] (˜ z aδ ) = I [0 ,t δ ) (˜ z aδ ) + J (cid:88) i =1 I [ τ i − +∆ i ,τ i +∆ i ] (˜ z aδ ) + J (cid:88) i =1 I [ τ i +∆ i ,τ i +∆ i +1 ] (˜ z aδ ) , (28)and proceed to bound the terms on the RHS separately. For the ﬁrst term we can trivially choose the optimizing set ofﬂuxes in (8) as µ ∗ = 0 , so that by boundedness of the rates λ r ( x ) < C we have I [0 ,t δ ) (˜ z aδ ) ≤ |R| Ct δ , which vanisheswith δ → .We proceed to bound the summands in the second term. Recalling by Lemma 3.2 that for each time interval [ τ i − + ∆ i , τ i + ∆ i ] the trajectory of ˜ z aδ corresponds to the one of z + wt δ for w ∈ W j i ,κ (cid:48)(cid:48) we see that for each suchinterval we can apply Lemma 3.7. Combining this result with the time-translation invariance of the rate functional weobtain that lim sup δ → J (cid:88) i =1 I [ τ i − +∆ i ,τ i +∆ i ] (˜ z aδ ) ≤ J (cid:88) i =1 I [ τ i − ,τ i ] ( z ) . (29)We then bound the third term of (28) by Lemma 3.6, recalling that by Lemma 3.2 the path ˜ z aδ is in S . We start bywriting I [ τ i +∆ i ,τ i +∆ i +1 ] (˜ z aδ ) ≤ (cid:88) r ∈R (cid:90) τ i +∆ i +1 τ i +∆ i (cid:104) λ r (˜ z aδ ( s )) − µ ∗ r + µ ∗ r log µ ∗ r λ r (˜ z aδ ( s )) (cid:105) d s . where µ ∗ r is given by the multiplicity of reaction r in E j . We then further divide the sum on R based on whether thejump rates λ r ( z ( t )) are bounded from below by ε (cid:48)(cid:48) from Assumption 2 on the time interval of interest. 
We denote thejumps whose rates do not satisfy this lower bound by R (0) ( j i ) and write I [ τ i +∆ i ,τ i +∆ i +1 ] (˜ z aδ ) ≤ (cid:88) r ∈R (0) ( j i ) (cid:90) τ i +∆ i +1 τ i +∆ i (cid:104) λ r (˜ z aδ ( s )) − µ ∗ r + µ ∗ r log µ ∗ r λ r (˜ z aδ ( s )) (cid:105) d s + (cid:88) r ∈R\R (0) ( j i ) (cid:90) τ i +∆ i +1 τ i +∆ i (cid:104) λ r (˜ z aδ ( s )) − µ ∗ r + µ ∗ r log µ ∗ r λ r (˜ z aδ ( s )) (cid:105) d s . (30)Then, by compactness given by I ( z ) < K there exists C (cid:48) ( K ) > such that the second term is bounded from above by (cid:88) r ∈R\R (0) ( j i ) (cid:90) τ i +∆ i +1 τ i +∆ i (cid:104) λ r (˜ z aδ ( s )) − µ ∗ r + µ ∗ r log µ ∗ r λ r (˜ z aδ ( s )) (cid:105) d s < |R| C (cid:48) (∆ i +1 − ∆ i ) . On the other hand, for the ﬁrst term in (30) we have (cid:88) r ∈R (0) ( j i ) (cid:90) τ i +∆ i +1 τ i +∆ i (cid:104) λ r (˜ z aδ ( s )) − µ ∗ r + µ ∗ r log µ ∗ r λ r (˜ z aδ ( s )) (cid:105) d s < |R| C (cid:48) (∆ i +1 − ∆ i )+ |R| C (cid:48)(cid:48) (cid:90) τ i +∆ i +1 τ i +∆ i log λ r (˜ z aδ ( s )) d s , and using Assumption 2 b) and c) we have for some x ∈ A j i lim δ → (cid:90) τ i +∆ i +1 τ i +∆ i log λ r (˜ z aδ ( s )) d s ≤ lim δ → (cid:90) β i t δ log λ r ( x + sw j i ) d s = 0 . ombining the above upper bounds for each transition we have lim δ → J (cid:88) i =0 I [ τ i +∆ i ,τ i +∆ i +1 ] (˜ z aδ ) ≤ J lim δ → sup i ∈ (0 ,...,J ) I [ τ i +∆ i ,τ i +∆ i +1 ] (˜ z aδ ) = 0 . (31)Finally, combining (28) with (29) and (31) we obtain the desired result. (cid:3) Proof of LDP in path space.

Proof of Proposition 3.1.

We conclude the proof by bounding the terms in (15). For the ﬁrst one we have that (16)holds by combining (19) and Lemma 3.4, for which we have that lim δ → lim inf v →∞ v log P (cid:104) X v ( t ) ∈ B δ (cid:48)(cid:48) / ( t δ w j ) , Ξ j ( n v + , v ) , σ n v + ,n j > t δ (cid:12)(cid:12)(cid:12) X v (0) = x v (cid:105) = 0 , (32)It remains to show that the second term is bounded by the rate function as in (16). We ﬁrst bound this term as inf y ∈B δ (cid:48)(cid:48) / ( x + t δ w ) P (cid:2) X v ∈ B [ t δ ,T ] ( δ (cid:48) , z δ ) | X v ( t δ ) = y (cid:3) ≥ inf a ∈B δ (cid:48)(cid:48) / (0) P (cid:2) X v ∈ B [ t δ ,T ] ( δ (cid:48) / , z δ + a ) | X v ( t δ ) = z δ ( t δ ) + a (cid:3) , where we shift the path z δ of a , but the lower bound is preserved since B [ t δ ,T ] ( δ (cid:48) / , z δ + a ) ⊆ B [ t δ ,T ] ( δ (cid:48) , z δ ) forall a ∈ B δ (cid:48)(cid:48) / (0) . Since paths in the RHS above are uniformly bounded away from ∂ S by Lemma 3.2, rates areuniformly bounded away from on the paths of interest and standard large-deviation bounds (which hold uniformly on y ∈ B δ (cid:48)(cid:48) / ( t δ w i ) ) can be applied. Therefore, deﬁning N δ := B δ (cid:48)(cid:48) / (0) ∩ span r ∈R ( γ r ) we bound the second term of(15) by lim inf v →∞ v log inf a ∈N δ P (cid:2) X v ∈ B [ t δ ,T ] ( δ (cid:48) / , z δ + a ) | X v ( t δ ) = z δ ( t δ ) + a (cid:3) ≥ inf a ∈N δ ( − inf z ∈ B [ tδ,T ] ( δ (cid:48) / , z δ + a ) I [ t δ ,T ] ( z )) ≥ inf a ∈N δ ( − I [ t δ ,T ] ( z δ + a )) ≥ inf a ∈N δ ( − I [0 ,T ] (˜ z aδ )) . (33)Finally, combining Lemma 3.8 with the bound obtained above we obtain that lim inf δ → inf a ∈N δ ( − I [0 ,T ] (˜ z aδ )) ≥ − I [0 ,T ] ( z ) = − I x [0 ,T ] ( z ) . (34) (cid:3) We conclude this section by proving Lemma 3.2

Proof of Lemma 3.2.

The ﬁniteness of J results from [28, Lemma 3.5]. In particular, one can choose α small enoughand [ τ i − , τ i ] so that the set {B α ( z ( t )) , t ∈ [ τ i − , τ i ] } is contained in A j i for all i ∈ (1 , . . . , J ) . By exponentialtightness we can apply [28, Lemma 3.4] to obtain absolute continuity of z on the set I [0 ,T ] ( z ) < K , so that there exists τ − > with inf { z : I [0 ,T ] ( z ) τ − . Consequently J = T /τ − is ﬁnite.The bound sup [0 ,T ] (cid:107) z − z δ (cid:107) < δ/ follows from the construction (13) and our choice of ξ := min(1 , ( κ (cid:48)(cid:48) / J +1 / , ε ) .Indeed for the time interval [0 , t δ ] we have sup t ∈ [0 ,t δ ] (cid:107) x + w j t − z ( t ) (cid:107) ≤ t δ + ω z ( t δ ) ≤ δ , where ω z is the (subadditive) modulus of continuity of z . To extend this estimate beyond t δ we note that sup t ∈ [0 ,T ] (cid:107) z δ ( t ) − z ( t ) (cid:107) < J (cid:88) k =0 β k t δ + ω z ( β k t δ ) ≤ ξδ J (cid:88) k =0 β k . (35)Then, by our choice β = 3 /κ (cid:48)(cid:48) and since k (cid:88) l =0 (cid:18) κ (cid:48)(cid:48) (cid:19) l = 1 − (3 /κ (cid:48)(cid:48) ) k +1 − /κ (cid:48)(cid:48) ≤ κ (cid:48)(cid:48) /κ (cid:48)(cid:48) ) k +1 . (36)we see that by boundedness of k ≤ J and by the deﬁnition of ξ we have sup t ∈ [0 ,T ] (cid:107) z δ ( t ) − z ( t ) (cid:107) < δ/ . e now prove that (cid:83) t ∈ [ t δ ,T ] B κ − t δ ( z δ ( t )) ∩ ∂ S = ∅ by induction on k . For k = 0 the claim follows directly byAssumption 2a) for δ < ε (cid:48) / . Then, by (36) as the k + 1 -th shift is of length t δ (3 /κ (cid:48)(cid:48) ) k +1 we must have that z δ ( t ) is atleast at distance t δ κ − (1 − κ (cid:48)(cid:48) / /κ (cid:48)(cid:48) ) k +1 > t δ κ − / /κ (cid:48)(cid:48) ) k +1 > κ − t δ from ∂ S for t ∈ [ τ k +1 +∆ βk +1 , τ k +2 +∆ βk +1 ] .Since the initial point of the shift satisﬁes the required condition by assumption, and that this property is conserved on [ τ k + ∆ βk , τ k + ∆ βk +1 ] ( i.e. 
, during a shift) by Assumption 2a) we obtain the desired result.Finally, we show that for every k ∈ (1 , . . . , J ) and a ∈ R d with (cid:107) a (cid:107) < t δ κ (cid:48)(cid:48) / there exists ˜ w ∈ B κ (cid:48)(cid:48) (0) such that z δ ( τ k + ∆ βk +1 ) + a = z ( τ k ) + β k t δ ( w i k + ˜ w ) . This follows immediately from (36), since we have (cid:107) z δ ( τ k + ∆ βk +1 ) + a − z ( τ k ) − β k t δ w i k (cid:107) = (cid:107) k − (cid:88) l =0 (cid:18) κ (cid:48)(cid:48) (cid:19) l t δ w i l + a (cid:107) ≤ k − (cid:88) l =0 (cid:18) κ (cid:48)(cid:48) (cid:19) l t δ + (cid:107) a (cid:107)≤ t δ κ (cid:48)(cid:48) (cid:0) /κ (cid:48)(cid:48) ) k (cid:1) ≤ κ (cid:48)(cid:48) (cid:107) (3 /κ (cid:48)(cid:48) ) k t δ w i k (cid:107) , concluding the proof of the lemma. (cid:3)
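The geometric-series estimate (36) that drives the choice $\beta=3/\kappa''$ can be sanity-checked numerically. The sketch below assumes that the bound reads $\sum_{l=0}^{k}(3/\kappa'')^l\le\frac{\kappa''}{2}(3/\kappa'')^{k+1}$ for $\kappa''\in(0,1]$, which is what the elementary computation $(q^{k+1}-1)/(q-1)\le q^{k+1}/(q-1)$ with $q-1=(3-\kappa'')/\kappa''\ge2/\kappa''$ gives; the constant $1/2$ is a reconstruction under that assumption, and the function names are hypothetical.

```python
def geometric_sum(kappa, k):
    """Left-hand side: sum_{l=0}^{k} (3/kappa)^l, the total length of the first k+1 shifts in units of t_delta."""
    beta = 3.0 / kappa
    return sum(beta ** l for l in range(k + 1))


def geometric_bound(kappa, k):
    """Assumed right-hand side: (kappa/2) * (3/kappa)^(k+1)."""
    beta = 3.0 / kappa
    return 0.5 * kappa * beta ** (k + 1)


def bound_holds(kappa, k):
    """True if the accumulated shifts are dominated by the assumed bound."""
    return geometric_sum(kappa, k) <= geometric_bound(kappa, k)
```

The point of the estimate is that the accumulated earlier shifts stay a fixed fraction of the next shift, so each new shift dominates everything that came before it.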

4. LDP UPPER BOUND

Similar results under slightly more restrictive assumptions, with jump rates bounded away from 0, are well known; see, e.g., [9, 27]. A sufficiently general result is available in [25], but under assumptions on the initial condition that are not satisfied here. We will sketch the application of the ideas from [25] to the setting of this paper.

In order to prove the upper bound, we temporarily enlarge the state space to include the integrated flux of each reaction, i.e., we consider the process $(X^v(t),W^v(t))\in\mathbb R^d\times\mathbb R^{|\mathcal R|}_{\ge0}$ with initial condition $(x^v_0,0)$ and generator
\[
Q^vf(x,w)=\sum_{r\in\mathcal R}v\Lambda^v_r(x)\big(f(x+v^{-1}\gamma_r,\,w+v^{-1}\delta^r)-f(x,w)\big),
\]
where $\delta^r\in\mathbb R^{|\mathcal R|}$ has components $\delta^r_r=1$ and $\delta^r_s=0$ for all $s\ne r$. It is clear that the marginal distribution of the $X^v$-coordinate is the distribution of our original process. In the following proposition, we prove a large-deviation upper bound for this process. To shorten notation, let us define for any $w\in\mathbb R^{|\mathcal R|}$ the vector $\Gamma w:=\sum_{r\in\mathcal R}\gamma_rw_r\in\mathbb R^d$.

Let $C^1_0([0,T);\mathbb R^{|\mathcal R|})$ be the space of continuously differentiable, compactly supported functions from $[0,T)$ to $\mathbb R^{|\mathcal R|}$. For $x\in D_u(0,T;\mathcal S)$, $w\in D_u(0,T;\mathbb R^{|\mathcal R|}_{\ge0})$ and $\zeta\in C^1_0([0,T);\mathbb R^{|\mathcal R|})$ we set
\[
G(x,w,\zeta):=-\int_0^T\sum_{r\in\mathcal R}\Big(\dot\zeta_r(t)\,w_r(t)+\big[e^{\zeta_r(t)}-1\big]\lambda_r(x(t))\Big)\,\mathrm dt
\]
and use this to define a partial rate function
\[
\tilde J_{\mathcal S}(x,w):=\begin{cases}\sup_{\zeta\in C^1_0([0,T);\mathbb R^{|\mathcal R|})}G(x,w,\zeta) & \text{if } x(t)=x_0+\Gamma w(t),\ x(t)\in\mathcal S\ \ \forall t\in[0,T),\\ +\infty & \text{otherwise.}\end{cases}
\]

Proposition 4.1.

Let $\zeta\in C^1_0([0,T);\mathbb R^{|\mathcal R|})$ and suppose Assumption 1 holds. Then $(x,w)\mapsto G(x,w,\zeta)$ is continuous from $D_u(0,T;\mathbb R^d\times\mathbb R^{|\mathcal R|}_{\ge0})$ to $\mathbb R$.

Proof. We have
\[
|G(x,w,\zeta)-G(x',w',\zeta)|\ \le\ \|\dot\zeta\|_\infty\,\|w-w'\|_\infty\,T+\big(e^{\|\zeta\|_\infty}+1\big)\int_0^T\sum_{r\in\mathcal R}|\lambda_r(x(t))-\lambda_r(x'(t))|\,\mathrm dt.
\]
We can use the continuity of the $\lambda_r$ from Assumption 1a), along with the boundedness given by the compactness of $\mathcal S$, and apply dominated convergence when $x'\to x$ to see that the second term of our estimate vanishes, as does the first term when $w'\to w$. □

Proposition 4.2.

Let K be a closed subset of the space of c`adl`ag paths D u (0 , T ; R d × R |R|≥ ) , then under Assumption 1 lim sup v →∞ v log P (( X v , W v ) ∈ K ) ≤ − inf ( x,w ) ∈K (cid:101) J S ( x, w ) . roof. Assumption 1 implies exponential tightness – see discussion below the statement of Theorem 1 – therefore wemay assume that K is compact.Fix ε ∈ (0 , , then for every ( x, w ) ∈ D u (0 , T ; S × R |R|≥ ) satisfying x = x + Γ w one can ﬁnd ζ [ x, w ] ∈ C (cid:0) [0 , T ); R R (cid:1) such that G ( x, w, ζ [ x, w ]) ≥ min (cid:16) (cid:101) J ( x, w ) , ε − (cid:17) − ε and deﬁne neighbourhoods in path space G ε ( x, w ) := { ( x (cid:48) , w (cid:48) ) : G ( x (cid:48) , w (cid:48) , ζ [ x, w ]) ≥ G ( x, w, ζ [ x, w ]) − ε } . We may use Proposition 4.1 to see that the G ε ( x, w ) are open and so can ﬁnd a ﬁnite cover G ε ( x i , w i ) i = 1 , . . . , n for K . Now following [25, Thm A.3] and using the fact that the jump rates are bounded over S (because of Assumption 1a)and compactness of S ), we deﬁne tilted measures P ζ via mean 1 non-negative martingales so that v log d P ζ ◦ ( X v , W v ) − d P ◦ ( X v , W v ) − ( x, w ) = − (cid:90) T (cid:88) r ∈R w r ( t ) ˙ ζ r ( t ) + Λ vr ( x ( t )) (cid:16) e ζ r ( t ) − (cid:17) d t := G v ( x, w, ζ ) . Slightly adapting and simplifying the argument of [25, Lemma 4.7] v log P (( X v , W v ) ∈ G ε ( x, w )) ≤ v log P ζ [ x,w ] (( X v , W v ) ∈ G ε ( x, w )) − inf ( x (cid:48) ,w (cid:48) ) ∈G ε ( x,w ) G v ( x (cid:48) , w (cid:48) , ζ [ x, w ]) ≤ − inf ( x (cid:48) ,w (cid:48) ) ∈G ε ( x,w ) | G v ( x (cid:48) , w (cid:48) , ζ [ x, w ]) − G ( x (cid:48) , w (cid:48) , ζ [ x, w ]) | − inf ( x (cid:48) ,w (cid:48) ) ∈G ε ( x,w ) G ( x (cid:48) , w (cid:48) , ζ [ x, w ]) . (37)The ﬁrst term on the ﬁnal line vanishes as v → ∞ by the uniform convergence from Assumption 1b). Thus for i = 1 , . . . 
, n v log P (cid:0) ( X v , W v ) ∈ G ε ( x i , w i ) (cid:1) ≤ − min (cid:16) (cid:101) J ( x i , w i ) , ε − (cid:17) − ε and by the Laplace principle v log P (( X v , W v ) ∈ K ) ≤ − min i =1 ,...,n min (cid:16) (cid:101) J ( x i , w i ) , ε − (cid:17) − ε ≤ inf ( x (cid:48) ,w (cid:48) ) ∈K min (cid:0) J ( x (cid:48) , w (cid:48) ) , ε − (cid:1) − ε, which completes the proof as ε can be taken arbitrarily small.If (cid:64) ( x, w ) ∈ K satisfying x = x + Γ w , then lim sup v →∞ v log P (( X v , W v ) ∈ K ) ≤ −∞ , since, by deﬁnition X v ( t ) = X v (0) + Γ W v ( t ) a.s. for all t ∈ [0 , T ] . (cid:3) Proposition 4.3.

If Assumption 1 holds then J = (cid:101) J S , where J ( x, w ) := (cid:40)(cid:80) r ∈R (cid:82) T H ( ˙ w ( t ) | λ ( x ( t ))) d t, ( x, w ) ∈ AC (0 , T ; S × R |R| ) , ˙ x = Γ ˙ w, x (0) = x ∞ , otherwise . where H is deﬁned in (8) .Proof. The proof follows [25, Prop 3.5]; we assume that ˙ x = Γ ˙ w and x ( t ) ∈ S a.e. t ∈ [0 , T ) throughout. Supposethat ( x, w ) / ∈ AC (0 , T ; S × R |R|≥ ) , then one can ﬁnd a sequence ζ n ∈ C (cid:0) [0 , T ); R |R| (cid:1) with sup n,z,t | ζ nz ( t ) | ≤ but lim n →∞ (cid:82) T (cid:80) r ∈R ˙ ζ nz ( t ) w ( t )d t = −∞ and thus J ( x, w ) ≥ lim n →∞ G ( c, w, ζ n ) = + ∞ , that is, J = ˜ J S if thepath is not absolutely continuous.One then shows that J ( x, w ) = (cid:101) J S ( x, w ) for any ( x, w ) ∈ AC (0 , T ; S × R |R| ) using approximation arguments. (cid:3) Corollary 4.4.

If Assumption 1 holds then the large-deviation upper bound holds with good rate functional: inf w ∈ W , (0 ,T ; R |R|≥ )˙ x =Γ ˙ w J ( x, w ) = I x [0 ,T ] ( x ) = (cid:90) T sup ϑ ∈ R d (cid:104) ϑ · ˙ x ( t ) − (cid:88) r ∈R λ r ( x ( t )) (cid:0) exp( ϑ · γ r ) − (cid:1)(cid:105) d t. roof. The large-deviation upper bound with the rate functional on the left-hand side follows from Propositions 4.2, 4.3and the contraction principle. For non-absolutely continuous paths both the left-hand and right-hand sides will diverge:The left-hand side since the inﬁmum will be taken over an empty set, and the right-hand by a similar argument as inProposition 4.3.Now take an arbitrary x ∈ AC (0 , T ; R d ) . Then inf w ∈ W , (0 ,T ; R |R|≥ )˙ x =Γ ˙ w J ( x, w ) ≥ (cid:90) T inf j ∈ R |R|≥ : ˙ x ( t )=Γ j H (cid:0) j ( t ) | λ ( x ( t )) (cid:1) d t (cid:124) (cid:123)(cid:122) (cid:125) = I x ,T ] ( x ) = (cid:90) T sup ϑ ∈ R d (cid:104) ϑ · ˙ x ( t ) − (cid:88) r ∈R λ r ( x ( t )) (cid:0) exp( ϑ · γ r ) − (cid:1)(cid:105) d t, where the equality follows from convex duality, pointwise in t . To show that the inequality is in fact an equality, wemay assume that the left-hand side is ﬁnite. Hence from now on we may assume that x ∈ W , (0 , T ; S ) . By Jensen’sinequality, any path j : (0 , T ) → R |R|≥ for which (cid:82) T H (cid:0) j ( t ) | λ ( x ( t )) (cid:1) d t < ∞ is bounded in L (0 , T ; R |R| ) , whichshows the ﬁrst equality. (cid:3)
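As an illustration of the pointwise convex duality invoked in the proof above, consider the simplest special case of a single jump with $\gamma=1$ and rate $\lambda>0$ in dimension $d=1$. This special case is chosen here only for illustration, and we assume that $H$ from (8) is the relative-entropy functional $H(\mu\,|\,\lambda)=\mu\log(\mu/\lambda)-\mu+\lambda$. For a velocity $y>0$ the only admissible flux is $\mu=y$, and the supremum over $\vartheta$ can be computed by hand:

```latex
\begin{align*}
  \frac{\mathrm d}{\mathrm d\vartheta}\Big[\vartheta y-\lambda\big(e^{\vartheta}-1\big)\Big]=0
  \quad&\Longleftrightarrow\quad y=\lambda e^{\vartheta^*}
  \quad\Longleftrightarrow\quad \vartheta^*=\log\frac{y}{\lambda},\\[2pt]
  \sup_{\vartheta\in\mathbb R}\Big[\vartheta y-\lambda\big(e^{\vartheta}-1\big)\Big]
  &=y\log\frac{y}{\lambda}-\lambda\Big(\frac{y}{\lambda}-1\Big)
  =y\log\frac{y}{\lambda}-y+\lambda
  =H(y\,|\,\lambda).
\end{align*}
```

In particular the cost vanishes exactly at $y=\lambda$, the velocity of the fluid limit, and grows without bound for fixed $y>0$ as $\lambda\to0$, which is the mechanism exploited in the discussion of vanishing rates.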

5. OPTIMALITY OF DECAY RATE IN (6)

We recall from Remark 2.5 that the integrability of the rates, which is necessary to establish the lower bound estimates in Section 3 but not the upper bound estimates in Section 4, is directly implied by a sufficiently slow decay of the rates (6). In this section (and more specifically in Proposition 5.4) we make precise our claim that the range of exponents $\alpha$ given in Remark 2.5 is maximal. In particular, we show that whenever the rates of the jumps necessary to escape the degenerate set decay too fast (satisfying a condition similar to (6) for $\alpha\ge1$), the rate function for the upper bound diverges for any $y\in AC(0,T;\mathcal S)$ with $y(0)\in\partial\mathcal S$ and $y(t)\in\mathcal S\setminus\partial\mathcal S$ for some $t>0$. We start the discussion of this problem with some examples, so as to capture the idea of our strategy in a simple setting. Recall from Corollary 4.4 that, under Assumption 1, for any $y\in AC(0,T;\mathcal S)$ and any $\delta>0$:
\[
\limsup_{v\to\infty}\frac1v\log\mathbb P_{x^v_0}\big(X^v\in B_{[0,T]}(\delta,y)\big)\ \le\ -\inf_{x\in B_{[0,T]}(\delta,y)}\int_0^T\ell(x(s),\dot x(s))\,\mathrm ds, \tag{38}
\]
where we recall that $\ell$ is defined as
\[
\ell(x,y)=\sup_{\vartheta\in\mathbb R^d}\Big[\vartheta\cdot y-\sum_{r\in\mathcal R}\lambda_r(x)\big(\exp(\vartheta\cdot\gamma_r)-1\big)\Big].
\]

Example 5.1.

Recall Example 2.4, where (6) does not hold and the upper bound is easily seen to diverge. Given thegenerator in (4) , the integrand on the right-hand side of the

LDP upper bound in (38) reads (cid:96) ( x, y ) = sup ϑ ∈ R d (cid:8) ϑy − e − /x ( e ϑ − (cid:9) ≥ y log ye /x − ( y − e − /x ) , where in the inequality we have chosen ϑ ( x, y ) = log ye /x . Then, the LDP rate function can be bounded as follows I ,T ] ( z ) ≥ (cid:90) T z (cid:48) ( t ) log( z (cid:48) ( t ) e /z ( t ) ) − ( z (cid:48) ( t ) − e − /z ( t ) )d t ≥ (cid:90) T z (cid:48) ( t ) log z (cid:48) ( t ) + z (cid:48) ( t ) /z ( t ) − z ( t ) d t, for any z ∈ AC (0 , T ; R ≥ ) with z (0) = 0 . Using that x log x > − is continuous at 0 we have, for every boundedpath z with bounded derivative, that I ,T ] ( z ) ≥ − − z (1) + (cid:90) z (cid:48) ( t ) /z ( t ) d t ≥ − − z ( T ) + (cid:90) z ( T )0 x − d x = + ∞ . In order to proceed and generalize the example above, we discuss two further examples highlighting the appropriateway to negate the assumptions in Remark 2.5. xample 5.2. Consider a system deﬁned on S v = ( v − N ) with two jumps, γ = (0 , , γ = (1 , and correspond-ing rates Λ v ( x ) = λ ( x ) = { x ≥ } ( x − , Λ v ( x ) = λ ( x ) = 1 . For the initial condition x = 0 the law of largenumbers [18] shows that the paths of X v concentrate around ( y ∗ , y ∗ )( t ) = ( t, { t > } ( t − / . In particular,this implies the existence of paths y ( t ) ∈ AC (0 , T ; S ) with y (0) ∈ ∂ S and y ( t ) ∈ S \ ∂ S but having a ﬁnite largedeviations cost for a system violating (6). To discuss the optimality of the interval for the parameter α governing the local decay of rates in (6) we avoidconsidering macroscopic behaviors like the one highlighted in the example above and we restrict our attention to pathsthat do not leave the set A j in the time interval of interest, keeping j ﬁxed throughout this section. For the same reason,to negate (6) we consider jumps whose rates decay faster than exp[ − k · dist( x, ∂ A j ) − ] uniformly in x in A j for a k > . 
These jumps belong to the set
\[
\mathrm{FAST}_{\mathcal R,j}:=\Big\{r\in\mathcal R\ :\ \lim_{\varrho\to0}\varrho\Big(\sup_{z\in\mathcal A_j\,:\,\mathrm{dist}(z,\partial\mathcal A_j)<\varrho}\log\lambda_r(z)\Big)<0\Big\}, \tag{39}
\]
with $\mathrm{dist}(z,\partial\mathcal A_j)=\inf_{x\in\partial\mathcal A_j}\|z-x\|$. While this set is not, in general, the complement of the set of jumps whose rates satisfy (6), it allows us to capture, at least locally, those whose decay is more rapid (in terms of $\alpha$) than (6).

We further notice that the existence of a single reaction $r\in\mathrm{FAST}_{\mathcal R,j}$ may still result in a finite cost for paths escaping $\partial\mathcal A_j$, as the following example shows.

Example 5.3.

Consider a system deﬁned on S v = ( v − N ) with two jumps, γ = ( − , and γ = (1 , andcorresponding rates λ ( x ) = x e − /x , λ ( x ) = 1 . It is clear that this system satisﬁes a LDP with any sequence ofinitial conditions, see [28]. However, approaching the set { x ∈ R : x = 0 } , upon choosing w j = (1 , we see thatthis system does not satisfy (6). We are, however, able to choose w j = (1 , so that with such choice of w j (6) holdsfor all r ∈ E j . In light of the above example, the statement we seek to negate is the existence of vectors w j and corresponding E j stated in Assumption 2 such that E j ∩ FAST R ,j = ∅ . To do so, we ﬁx a (non-empty) S and a limiting point x ∈ ∂ A j ,for j ∈ I bd , assuming throughout that ∂ A j is a subset of a ( d − -dimensional hyperplane to simplify the notation ofthe proof. In this way, we deﬁne T x A j = { y ∈ span r ∈R ( γ r ) : y · n x ≥ } , with n x inward normal to the boundary ∂ A j in x . Assuming that there is no vector w j ∈ span r ∈R ( γ r ) that is a sum of jumps with rates decaying slow enoughmeans that ∀ x ∈ ∂ A j Co x ( { γ r : r ∈ ( R \

FAST R ,j ) } ) ∩ T x A j = ∅ , (40)where Co x ( A ) is the convex cone deﬁned by the set of vectors A with origin x .Note that in this way we are building a class of processes where the jumps r pointing in the interior of the domain(and therefore useful to escape the boundary) necessarily belong to FAST R ,j . Proposition 5.4.

Assume that (40) holds. Then for every $y \in AC(0,T;\mathcal{S})$ with $y(0) = x \in \partial\mathcal{A}_j$ and such that there exists $t_1 \in (0,T)$ with $y(t_1) \in \mathcal{A}_j \setminus \partial\mathcal{A}_j$ and $\inf\{t \in (0,T) : y(t) \notin \mathcal{A}_j\} > t_1$, it holds that $I^x_{[0,T]}(y) = \infty$.

Proof. Recalling the structure in (38) of the LDP upper bound, we notice that, in order to show that the rate function is infinite for any path $y \in AC(0,T;\mathcal{S})$ as above, it is sufficient to find a $\vartheta(t,y)$ such that
$$\int_0^T \left[ \vartheta(s,y(s)) \cdot \dot{y}(s) - \sum_{r \in \mathcal{R}} \lambda_r(y(s)) \left( \exp\left[\vartheta(s,y(s)) \cdot \gamma_r\right] - 1 \right) \right] \mathrm{d}s = +\infty. \tag{41}$$
By assumption, there exist $t_0 < t_1 \in [0,T]$ such that $y(t_0) \in \partial\mathcal{A}_j$ and $y(t) \notin \partial\mathcal{A}_j$ for all $t \in (t_0, t_1)$. We now aim to express $\int_{t_0}^{t_1} \vartheta(t,y(t)) \cdot \dot{y}(t) \,\mathrm{d}t$ as an exact integral of some potential $\Phi$ for which $\Phi(x) = -\infty$, while we choose $\vartheta(t,y(t)) = 0$ for $t \in [0,t_0] \cup [t_1,T]$, so that the integrand in (41) vanishes on those intervals. More specifically, following for example [15], for a $\kappa > 0$ we take $\vartheta = \kappa \nabla\Phi(y)$ so that
$$I(y) \ge \kappa\Phi(y(t_1)) - \kappa\Phi(y(t_0)) - \sum_{r \in \mathcal{R}} \int_{t_0}^{t_1} \lambda_r(y(t)) e^{\kappa \gamma_r \cdot \nabla\Phi(y)} \,\mathrm{d}t + \underbrace{\sum_{r \in \mathcal{R}} \int_{t_0}^{t_1} \lambda_r(y(t)) \,\mathrm{d}t}_{\ge 0}.$$
The missing step is therefore to choose $\Phi$ and tune $\kappa$ such that $\Phi(y(t_0)) = -\infty$ and $\sum_{r \in \mathcal{R}} \int_{t_0}^{t_1} \lambda_r(y(t)) e^{\kappa \gamma_r \cdot \nabla\Phi(y)} \,\mathrm{d}t$ is bounded. For the choice $\Phi(y) := \log(n_x \cdot (y - x))$ we have $\nabla\Phi(y) = n_x / (n_x \cdot (y - x))$ and hence
$$\sum_{r \in \mathcal{R}} \int_{t_0}^{t_1} \lambda_r(y(t)) \exp\left[\kappa \gamma_r \cdot \nabla\Phi(y)\right] \mathrm{d}t = \sum_{r \in \mathcal{R}} \int_{t_0}^{t_1} \exp\left[ \log\lambda_r(y(t)) + \frac{\kappa (n_x \cdot \gamma_r)}{n_x \cdot (y(t) - x)} \right] \mathrm{d}t.$$
With this choice, since $y \in AC(0,T;\mathcal{S})$, we have that $n_x \cdot (y(t) - x) \ge 0$ for $t \in (t_0, t_1)$. We can split the jumps into the set $\mathcal{R}_+ := \{r \in \mathcal{R} : \gamma_r \in T_x\mathcal{A}_j\}$ and $\mathcal{R}_- = \mathcal{R} \setminus \mathcal{R}_+$. For the latter class of jumps we have $n_x \cdot \gamma_r \le 0$ and
$$\sum_{r \in \mathcal{R}_-} \int_{t_0}^{t_1} \lambda_r(y(t)) \exp\left[ \frac{\kappa\, n_x \cdot \gamma_r}{n_x \cdot (y(t) - x)} \right] \mathrm{d}t < +\infty,$$
since the integrand of each integral is bounded on $(t_0, t_1)$. On the other hand, we handle the terms coming from the former class of jumps $\mathcal{R}_+$, the ones pushing the process into the interior of $\mathcal{S}$, using that $\mathcal{R}_+ \subseteq \mathrm{FAST}_{\mathcal{R},j}$. Therefore we can tune $\kappa$ such that
$$\lim_{t \to t_0} \left( n_x \cdot (y(t) - x) \right) \log\lambda_r(y(t)) + \kappa (n_x \cdot \gamma_r) \le 0 \qquad \forall r \in \mathcal{R}_+,$$
ensuring that
$$\sum_{r \in \mathcal{R}_+} \int_{t_0}^{t_1} \lambda_r(y(t)) \exp\left[ \frac{\kappa\, n_x \cdot \gamma_r}{n_x \cdot (y(t) - x)} \right] \mathrm{d}t < +\infty.$$
This proves (41). □
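The mechanism of the proof can be illustrated numerically in the simplest one-dimensional setting. The sketch below is only an illustration under stated assumptions, not the authors' construction: it takes the hypothetical data $x = 0$, $n_x = 1$, a single jump $\gamma = 1$ with rate $\lambda(y) = e^{-1/y}$, the path $y(t) = t$, and $\Phi(y) = \log y$, so that $\vartheta = \kappa / t$. The integrand of (41) then becomes $\kappa/t - e^{-1/t}(e^{\kappa/t} - 1)$, and the function name `lower_bound` is likewise an assumption. For $\kappa \le 1$ the jump term stays bounded, while the potential term $\kappa \log(T/\varepsilon)$ grows without bound as the lower integration limit $\varepsilon$ tends to $0$.

```python
import math


def lower_bound(eps, kappa=1.0, T=1.0, n=20000):
    """Midpoint-rule value of the integral over [eps, T] of
        kappa / t - exp(-1/t) * (exp(kappa / t) - 1),
    computed with the substitution t = exp(u) to resolve small t.
    Assumes kappa <= 1 so that every exponent below is non-positive."""
    a, b = math.log(eps), math.log(T)
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = math.exp(a + (i + 0.5) * h)
        # exp(-1/t) * (exp(kappa/t) - 1), rewritten to avoid overflow for small t
        jump_term = math.exp((kappa - 1.0) / t) - math.exp(-1.0 / t)
        total += (kappa / t - jump_term) * t * h  # dt = t du
    return total


if __name__ == "__main__":
    for eps in (1e-2, 1e-4, 1e-6):
        print(f"eps = {eps:g}: lower bound ~ {lower_bound(eps):.4f}")
```

With $\kappa = 1$, each reduction of $\varepsilon$ by a factor of 100 increases the computed bound by roughly $\log 100$, consistent with the divergence asserted in (41).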

6. LIST OF SYMBOLS

| Symbol | Meaning | Reference |
| --- | --- | --- |
| $\mathcal{R}$ | finite set of jumps/reactions | Subsec. 1.1 |
| $\Lambda^v_r$, $\lambda_r$ | microscopic and macroscopic jump rates | Subsec. 1.1 & Ass. 1, 2 |
| $\gamma_r \in \mathbb{R}^d$ | jump vectors | Subsec. 1.1 & Ass. 2 |
| $\mathcal{S}^v, \mathcal{S}, \partial\mathcal{S} \subset \mathbb{R}^d$ | reachable points and boundary/degenerate set | Sec. 2 |
| $\mathcal{A}_i, \partial\mathcal{A}_i \subset \mathbb{R}^d$ | covering of $\mathcal{S}$ | Sec. 2 |
| $I$, $I_{\mathrm{bd}}$ | index sets for covering | Sec. 2 |
| $B_\varrho(x) \subset \mathbb{R}^d$ | Euclidean ball of radius $\varrho$ and center $x$ | Subsec. 2.1 |
| $D_u(0,T;\mathbb{R}^d)$ $\left(D_s(0,T;\mathbb{R}^d)\right)$ | càdlàg functions with uniform (Skorohod) topology | Subsec. 2.2 |
| $AC(0,T;\mathbb{R}^d)$ $\left(AC(0,T;\mathcal{S})\right)$ | absolutely continuous functions (restricted to $\mathcal{S}$) | Subsec. 2.2 |
| $B_{[0,T]}(\varrho, z)$ | ball of radius $\varrho$ and center $z$ in $D_u(0,T;\mathbb{R}^d)$ | Subsec. 2.2 |
| $I_{[0,T]}$, $I^x_{[0,T]}$ | large-deviation action and rate functional | Eqns. (7), (9) |
| $H(\mu \mid \lambda)$, $\ell(x,y)$ | relative entropy and Lagrangian | Eqns. (8), (27) |
| $J$, $\tilde{J}_S$ | flux large-deviation functional and dual form | Section 4 |
| $E_j \subset \mathcal{R}$, $w_j \in \mathbb{R}^d$, $\alpha_j > 0$ | escape sequence, vector and normalisation | Ass. 2 a), b) |
| $\varepsilon, \varepsilon', \kappa_j, \kappa_- > 0$ | escape parameters | Ass. 2 a) |
| $\varepsilon'', \kappa'' > 0$ | monotonicity range | Ass. 2 d) |
| $z$, $z_\delta$ | target and approximated path | Subsec. 3.1 |
| $t_\delta, \xi, \beta > 0$, $\omega_z$ | path shift parameters, modulus of continuity | Eqns. (13), (14) |
| $\delta', \delta'' > 0$ | neighborhood parameters of shifted path | Subsec. 3.1 |

ACKNOWLEDGEMENTS

AA was partially supported by the Swiss National Science Foundation grant P2GEP2-175015 and by the NSF grant DMS-1613337. He further acknowledges the hospitality of the Weierstrass Institute for Applied Analysis and Stochastics. LA acknowledges the hospitality of the Hausdorff Institute in Bonn, during the Junior trimester program “Randomness, PDEs and Nonlinear Fluctuation”, where she partially worked on this project. RP and MR received support from the Deutsche Forschungsgemeinschaft (DFG) through grant CRC 1114 “Scaling Cascades in Complex Systems”, Project C08.

REFERENCES

[1] A. Agazzi, A. Dembo, and J.-P. Eckmann. Large deviations theory for Markov jump models of chemical reaction networks. The Annals of Applied Probability (2018), 1821–1855.
[2] A. Agazzi, A. Dembo, and J.-P. Eckmann. On the geometry of chemical reaction networks: Lyapunov function and large deviations. Journal of Statistical Physics (2018), 321–352.
[3] A. Agazzi and J. C. Mattingly. Seemingly stable chemical kinetics can be stable, marginally stable, or unstable. Communications in Mathematical Sciences (2020), 1605–1642.
[4] D. F. Anderson, D. Cappelletti, J. Kim, and T. D. Nguyen. Tier structure of strongly endotactic reaction networks. Stochastic Processes and their Applications (2020), 7218–7259.
[5] J. Biggins. Large deviations for mixtures. Electronic Communications in Probability (2004), 60–71.
[6] D. Dawson and J. Gärtner. Large deviations from the McKean-Vlasov limit for weakly interacting diffusions. Stochastics (1987), 247–308.
[7] P. Del Moral. Feynman-Kac formulae. In: Feynman-Kac Formulae (Springer, 2004), pp. 47–93.
[8] A. Dembo and O. Zeitouni. Large deviations techniques and applications, volume 38 of Stochastic modelling and applied probability (New York, NY, USA: Springer, 1987), 2nd edition.
[9] P. Dupuis, R. S. Ellis, and A. Weiss. Large deviations for Markov processes with discontinuous statistics, I: General upper bounds. Ann. Probab. (1991), 1280–1297.
[10] P. Dupuis, K. Ramanan, and W. Wu. Large deviation principle for finite-state mean field interacting particle systems. Technical Report 1601.06219, arXiv (2016).
[11] J. Feng and T. Kurtz. Large deviations for stochastic processes, volume 131 of Mathematical surveys and monographs (Providence, RI, USA: American Mathematical Society, 2006).
[12] S. Feng. Large deviations for empirical process of mean-field interacting particle system with unbounded jumps. The Annals of Probability (1994), 1679–2274.
[13] M. Freidlin and A. Wentzell. Random Perturbations of Dynamical Systems. Grundlehren der mathematischen Wissenschaften (Springer, 2012).
[14] T. Grafke and E. Vanden-Eijnden. Numerical computation of rare events via large deviation theory. Chaos: An Interdisciplinary Journal of Nonlinear Science (2019), 063118.
[15] B. Hilder, M. A. Peletier, U. Sharma, and O. Tse. An inequality connecting entropy distance, Fisher information and large deviations. Stochastic Processes and their Applications.
[16] W. Kordecki. Reliability bounds for multistage structures with independent components. Statistics & Probability Letters (1997), 43–51.
[17] T. Kurtz. Solutions of ordinary differential equations as limits of pure jump processes. Journal of Applied Probability (1970), 49–58.
[18] T. G. Kurtz. The relationship between stochastic and deterministic models for chemical reactions. The Journal of Chemical Physics (1972), 2976–2978.
[19] A. Lazarescu, T. Cossetto, G. Falasco, and M. Esposito. Large deviations and dynamical phase transitions in stochastic chemical networks. The Journal of Chemical Physics (2019), 064117.
[20] C. Léonard. Large deviations for long range interacting particle systems with jumps. Annales de l’Institut Henri Poincaré, section B (1995), 289–323.
[21] A. Mielke. On Evolutionary Γ-Convergence for Gradient Systems (Cham, Swiss: Springer International Publishing, 2016), pp. 187–249.
[22] A. Mielke, R. I. A. Patterson, M. A. Peletier, and D. R. M. Renger. Non-equilibrium thermodynamic principles for nonlinear chemical reactions and systems with coagulation and fragmentation. WIAS Preprint.
[23] E. Pardoux and B. Samegni-Kepgnou. Large deviation principle for Poisson driven SDEs in epidemic models. arXiv preprint arXiv:1606.01619.
[24] E. Pardoux and B. Samegni-Kepgnou. Large deviation principle for epidemic models. Journal of Applied Probability (2017), 905–920.
[25] R. I. A. Patterson and D. R. M. Renger. Large Deviations of Jump Process Fluxes. Mathematical Physics, Analysis and Geometry (2019), 21.
[26] L. Popovic. Large deviations of Markov chains with multiple time-scales. Stochastic Processes and their Applications (2019), 3319–3359.
[27] A. Shwartz and A. Weiss. Large deviations for performance analysis. Stochastic Modeling Series (Chapman & Hall, London, 1995). Queues, communications, and computing, With an appendix by Robert J. Vanderbei.
[28] A. Shwartz and A. Weiss. Large deviations with diminishing rates. Math. Oper. Res. (2005), 281–310.
[29] E. Weinan, W. Ren, and E. Vanden-Eijnden. Minimum action method for the study of rare events. Communications on pure and applied mathematics (2004), 637–656.

DEPARTMENT OF MATHEMATICS, DUKE UNIVERSITY, 120 SCIENCE DR, DURHAM, NC 27710, USA
Email address: [email protected]

WEIERSTRASS-INSTITUT FÜR ANGEWANDTE ANALYSIS UND STOCHASTIK, MOHRENSTRASSE 39, 10117 BERLIN, GERMANY
Email address: { andreis, patterson, renger } @[email protected]