A weak solution theory for stochastic Volterra equations of convolution type
Eduardo Abi Jaber, Christa Cuchiero, Martin Larsson, Sergio Pulido
September 4, 2019
Abstract
We obtain general weak existence and stability results for stochastic convolution equations with jumps under mild regularity assumptions, allowing for non-Lipschitz coefficients and singular kernels. Our approach relies on weak convergence in $L^p$ spaces. The main tools are new a priori estimates on Sobolev–Slobodeckij norms of the solution, as well as a novel martingale problem that is equivalent to the original equation. This leads to generic approximation and stability theorems in the spirit of classical martingale problem theory. We also prove uniqueness and path regularity of solutions under additional hypotheses. To illustrate the applicability of our results, we consider scaling limits of nonlinear Hawkes processes and approximations of stochastic Volterra processes by Markovian semimartingales.

Acknowledgments. The work of Eduardo Abi Jaber was supported by grants from Région Ile-de-France. Christa Cuchiero gratefully acknowledges financial support by the Vienna Science and Technology Fund (WWTF) under grant MA16-021. The research of Sergio Pulido benefited from the support of the Chair Markets in Transition (Fédération Bancaire Française) and the project ANR 11-LABX-0019.

Affiliations. Eduardo Abi Jaber: Ecole Polytechnique. Christa Cuchiero: Vienna University of Economics and Business. Martin Larsson: Carnegie Mellon University. Sergio Pulido: ENSIIE & Université Paris-Saclay.
A stochastic Volterra equation of convolution type is a stochastic equation of the form

\[ X_t = g_0(t) + \int_{[0,t)} K(t-s)\,dZ_s, \tag{1.1} \]

where $X$ is the $d$-dimensional process to be solved for, $g_0$ is a given function, $K$ is a given $d \times k$ matrix-valued convolution kernel, and $Z$ is a $k$-dimensional Itô semimartingale whose differential characteristics are given functions of $X$. The solution concept is described in detail below. In particular, conditions are needed to ensure that the stochastic integral on the right-hand side of (1.1) is well-defined.

This type of equation appears in multiple applications, for example turbulence (Barndorff-Nielsen and Schmiegel, 2008), energy markets (Barndorff-Nielsen et al., 2013), and rough volatility modeling in finance (El Euch and Rosenbaum, 2019; Gatheral et al., 2018). In the latter context the kernel is singular, $K(t) = t^{\gamma-1}$ with $\gamma \in (\tfrac12, 1)$ [...] $X$ (Basse and Pedersen, 2009; Marquardt, 2006).

A more complex example is the intensity $\lambda$ of a Hawkes process $N$. Here the driving semimartingale is the Hawkes process itself, which is a counting process, and the intensity satisfies

\[ \lambda_t = g_0(t) + \int_{[0,t)} K(t-s)\,dN_s. \]

[...] $L^p$ estimates for solutions of (1.1), combined with a novel "Volterra" martingale problem in $\mathbb{R}^d$ that allows us to pass to weak limits in (1.1). In view of the irregular path behavior that occurs, in particular, in the presence of jumps, this identifies $L^p$ spaces as a natural environment for the weak convergence analysis.
With this approach we obtain

• existence of weak solutions for singular kernels, non-Lipschitz coefficients and general jump behavior;
• strong existence and pathwise uniqueness under Lipschitz conditions (but still singular kernels and jumps);
• convergence and stability theorems in the spirit of classical martingale problem theory, allowing for instance to study scaling limits of nonlinear Hawkes processes and to approximate stochastic Volterra processes by Markovian semimartingales;
• path regularity under certain additional conditions on the kernel and the characteristics.

Let us now describe the solution concept for (1.1). For $p \in [2,\infty)$ we denote by $L^p_{\mathrm{loc}} = L^p_{\mathrm{loc}}(\mathbb{R}_+, \mathbb{R}^n)$ the space of locally $p$-integrable functions from $\mathbb{R}_+$ to $\mathbb{R}^n$, where the dimension $n$ of the image space will depend on the context. Let $d, k \in \mathbb{N}$ and consider the following data:

(D1) an initial condition $g_0 \colon \mathbb{R}_+ \to \mathbb{R}^d$ in $L^p_{\mathrm{loc}}$,
(D2) a convolution kernel $K \colon \mathbb{R}_+ \to \mathbb{R}^{d \times k}$ in $L^p_{\mathrm{loc}}$,
(D3) a characteristic triplet $(b, a, \nu)$ of measurable maps $b \colon \mathbb{R}^d \to \mathbb{R}^k$ and $a \colon \mathbb{R}^d \to \mathbb{S}^k_+$ as well as a kernel $\nu(x, d\zeta)$ from $\mathbb{R}^d$ into $\mathbb{R}^k$ such that $\nu(x, \{0\}) = 0$ for all $x \in \mathbb{R}^d$ and, for some $c \in \mathbb{R}_+$,

\[ |b(x)| + |a(x)| + \int_{\mathbb{R}^k} \big(1 \wedge |\zeta|^2\big)\,\nu(x, d\zeta) \le c\,(1 + |x|^p), \qquad x \in \mathbb{R}^d. \tag{1.2} \]

Given this data, we can now state the following key definition.

Definition 1.1. A weak $L^p$ solution of (1.1) for the data $(g_0, K, b, a, \nu)$ is an $\mathbb{R}^d$-valued predictable process $X$, defined on some filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, that has trajectories in $L^p_{\mathrm{loc}}$ and satisfies

\[ X_t = g_0(t) + \int_{[0,t)} K(t-s)\,dZ_s \qquad \mathbb{P} \otimes dt\text{-a.e.} \tag{1.3} \]

for some $\mathbb{R}^k$-valued Itô semimartingale $Z$ with $Z_0 = 0$ whose differential characteristics (with respect to some given truncation function) are $b(X)$, $a(X)$, $\nu(X, d\zeta)$. For convenience we often refer to the pair $(X, Z)$ as a weak $L^p$ solution.
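To make (1.3) concrete, the following is a minimal sketch of the simplest situation, in which $Z$ is piecewise constant with finitely many jumps, so that the integral over $[0,t)$ reduces to a finite sum over the jumps strictly before $t$. The scalar setting ($d = k = 1$), the function names, the power kernel with $\gamma = 0.75$, and the jump times and sizes are all illustrative assumptions, not taken from the paper.

```python
def volterra_path_value(t, g0, kernel, jump_times, jump_sizes):
    """Evaluate g0(t) + sum of K(t - T_i) * dZ_i over jumps with T_i < t (scalar case)."""
    total = g0(t)
    for T_i, dz in zip(jump_times, jump_sizes):
        # The integral in (1.3) is over [0, t), so a jump exactly at time t is excluded.
        if T_i < t:
            total += kernel(t - T_i) * dz
    return total


def power_kernel(gamma):
    """Hypothetical singular kernel K(u) = u^(gamma - 1), used here only for u > 0."""
    return lambda u: u ** (gamma - 1.0)


# Illustrative data (assumptions): constant g0, two jumps of the driving path Z.
g0 = lambda t: 1.0
K = power_kernel(0.75)
jump_times = [0.5, 1.0]
jump_sizes = [1.0, -2.0]

for t in (0.5, 1.0, 1.5):
    print(t, volterra_path_value(t, g0, K, jump_times, jump_sizes))
```

The strict inequality `T_i < t` mirrors the half-open integration domain; with a singular kernel this also avoids evaluating $K$ at zero.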
Due to condition (1.2), the stochastic integral in (1.3) is well-defined for almost every $t \in \mathbb{R}_+$, confirming that the definition of $L^p$ solution makes sense. This is shown in Lemma A.3.

Throughout this section we assume $\int_{\mathbb{R}^k} |\zeta|^2\,\nu(x, d\zeta) < \infty$ for all $x \in \mathbb{R}^d$ so we can use the "truncation function" $\chi(\zeta) = \zeta$. The characteristics of $Z$ are therefore understood with respect to this function. We can now state our main result on existence of weak $L^p$ solutions.

Theorem 1.2.
Let $d, k \in \mathbb{N}$, $p \in [2,\infty)$, and consider data $(g_0, K, b, a, \nu)$ as in (D1)–(D3). Assume $b$ and $a$ are continuous, and $x \mapsto |\zeta|^2\,\nu(x, d\zeta)$ is continuous from $\mathbb{R}^d$ into the finite positive measures on $\mathbb{R}^k$ with the topology of weak convergence. In addition, assume there exist a constant $\eta \in (0,1)$, a locally bounded function $c_K \colon \mathbb{R}_+ \to \mathbb{R}_+$, and a constant $c_{LG}$ such that

\[ \int_0^T \frac{|K(t)|^p}{t^{\eta p}}\,dt + \int_0^T\!\!\int_0^T \frac{|K(t) - K(s)|^p}{|t-s|^{1+\eta p}}\,ds\,dt \le c_K(T), \qquad T \ge 0, \tag{1.4} \]

and

\[ |b(x)|^2 + |a(x)| + \int_{\mathbb{R}^k} |\zeta|^2\,\nu(x, d\zeta) + \Big( \int_{\mathbb{R}^k} |\zeta|^p\,\nu(x, d\zeta) \Big)^{2/p} \le c_{LG}\,(1 + |x|^2), \qquad x \in \mathbb{R}^d. \tag{1.5} \]

Then there is a weak $L^p$ solution $(X, Z)$ of (1.1) for the data $(g_0, K, b, a, \nu)$.
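Condition (1.4) can be probed numerically for a given scalar kernel. The sketch below is only a rough illustration: it approximates the two integrals by midpoint sums, assuming that the second integrand carries the Sobolev–Slobodeckij weight $|t-s|^{-(1+\eta p)}$. The function name, the grid size, and the choice $K(t) = e^{-t}$ with $T = 1$, $p = 2$, $\eta = 0.25$ are assumptions made for this example, and a midpoint rule is not a substitute for the analytic bounds.

```python
import math


def condition_14_lhs(kernel, T, p, eta, n=400):
    """Midpoint-rule approximation of the two integrals in (1.4) for a scalar kernel."""
    h = T / n
    mid = [(i + 0.5) * h for i in range(n)]
    vals = [kernel(t) for t in mid]

    # First term: integral over (0, T) of |K(t)|^p / t^(eta * p).
    first = sum(abs(v) ** p / t ** (eta * p) for v, t in zip(vals, mid)) * h

    # Second term: double integral of |K(t) - K(s)|^p / |t - s|^(1 + eta * p).
    # Only pairs with s < t are summed; symmetry gives the factor 2, and the
    # diagonal cells are skipped.
    second = 0.0
    for i in range(n):
        for j in range(i):
            second += abs(vals[i] - vals[j]) ** p / (mid[i] - mid[j]) ** (1 + eta * p)
    second *= 2 * h * h
    return first, second


# Illustrative smooth kernel (an assumption): K(t) = exp(-t), Lipschitz constant 1 on [0, 1].
first, second = condition_14_lhs(lambda t: math.exp(-t), T=1.0, p=2, eta=0.25)
print(first, second)
```

For this kernel one has $|K| \le 1$ and Lipschitz constant $1$ on $[0,1]$, so the two terms are bounded analytically by $T^{1-\eta p}/(1-\eta p) = 2$ and $\int_0^1\!\int_0^1 |t-s|^{p-1-\eta p}\,ds\,dt = 8/15$ respectively, which gives a sanity check on the computed values.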
An overview of the proof of Theorem 1.2 is given below, and the formal argument is in Section 4. However, let us first mention several kernels of interest that satisfy (1.4).
Example 1.3. (i) Consider the kernel $K(t) = t^{\gamma-1}$ with $\gamma > \tfrac12$, which is singular when $\gamma < 1$. Then with $\eta \in (0, (\gamma - \tfrac12) \wedge 1)$ one has $2\gamma - 2\eta - 1 > 0$, and therefore

\[ \int_0^T |K(t)|^2\,t^{-2\eta}\,dt = \frac{T^{2\gamma - 2\eta - 1}}{2\gamma - 2\eta - 1}, \]
\[ \int_0^T\!\!\int_0^T \frac{|K(t) - K(s)|^2}{|t-s|^{1+2\eta}}\,ds\,dt = \frac{2\,T^{2\gamma - 2\eta - 1}}{2\gamma - 2\eta - 1} \int_0^1 \frac{(u^{\gamma-1} - 1)^2}{(1-u)^{1+2\eta}}\,du. \]

These expressions are locally bounded in $T$, so (1.4) holds with $p = 2$.

(ii) Consider a locally Lipschitz kernel $K$ with optimal Lipschitz constant $L_T$ over $[0,T]$. Let $p \in [2,\infty)$ and choose $\eta < 1/p$. Then

\[ \int_0^T |K(t)|^p\,t^{-\eta p}\,dt \le \max_{t \in [0,T]} |K(t)|^p\,\frac{T^{1-\eta p}}{1 - \eta p} \]

and

\[ \int_0^T\!\!\int_0^T \frac{|K(t) - K(s)|^p}{|t-s|^{1+\eta p}}\,ds\,dt \le L_T^p \int_0^T\!\!\int_0^T |t-s|^{p - 1 - \eta p}\,ds\,dt. \]

Since $1 - \eta p > 0$ and $p - \eta p > 0$, these expressions are locally bounded in $T$. Thus (1.4) holds.

(iii) Consider two kernels $K_1$ and $K_2$. Suppose $K_1 \in L^p_{\mathrm{loc}}$ satisfies (1.4) for some $p \in [2,\infty)$ and $\eta \in (0,1)$, and $K_2$ is locally Lipschitz. Then it is not hard to check that the product $K = K_1 K_2$ satisfies (1.4) with the same $p$ and $\eta$ as $K_1$. An example of this kind is the exponentially dampened singular kernel $K(t) = t^{\gamma-1} e^{-\beta t}$ with $\gamma \in (\tfrac12, 1)$ and $\beta \ge 0$. For this kernel one can take $p = 2$ and any $\eta \in (0, \gamma - \tfrac12)$.

The proof of Theorem 1.2 is based on approximation and weak convergence of laws on suitable function spaces. The semimartingale $Z$ has trajectories in the Skorokhod space $D = D(\mathbb{R}_+, \mathbb{R}^k)$ of càdlàg functions. Weak convergence in $D$ is a classical tool used, for example, to obtain weak solutions of stochastic differential equations with jumps (see, e.g., Ethier and Kurtz (2005)). However, as explained in Section 6, the trajectories of $X$ need not be càdlàg, only locally $p$-integrable. Thus it is natural to regard $X$ as a random element of the Polish space $L^p_{\mathrm{loc}} = L^p_{\mathrm{loc}}(\mathbb{R}_+, \mathbb{R}^d)$. It is in this space, or rather the product space $L^p_{\mathrm{loc}} \times D$, that our weak convergence analysis takes place.

Relative compactness in $L^p$ is characterized by the Kolmogorov–Riesz–Fréchet theorem; see e.g. Brezis (2010, Theorem 4.26). A more convenient criterion in our context uses the Sobolev–Slobodeckij norms, defined for any measurable function $f \colon \mathbb{R}_+ \to \mathbb{R}^d$ by

\[ \|f\|_{W^{\eta,p}(0,T)} = \Big( \int_0^T |f(t)|^p\,dt + \int_0^T\!\!\int_0^T \frac{|f(t) - f(s)|^p}{|t-s|^{1+\eta p}}\,ds\,dt \Big)^{1/p}, \]

where $p \ge 1$, $\eta \in (0,1)$, $T \ge 0$. The relation between these norms and $L^p$ spaces is somewhat analogous to the relation between Hölder norms and spaces of continuous functions. In particular, balls with respect to $\|\cdot\|_{W^{\eta,p}(0,T)}$ are relatively compact in $L^p(0,T)$; see e.g. Flandoli and Gatarek (1995, Theorem 2.1). The following a priori estimate clarifies the role of the conditions (1.4) and (1.5) in Theorem 1.2, and is the key tool that allows us to obtain convergent sequences of approximate $L^p$ solutions. The proof is given in Section 2.

Theorem 1.4.
Let $d, k \in \mathbb{N}$, $p \in [2,\infty)$, and consider data $(g_0, K, b, a, \nu)$ as in (D1)–(D3). Assume there exists a constant $c_{LG}$ such that (1.5) holds. Then any weak $L^p$ solution $X$ of (1.1) for the data $(g_0, K, b, a, \nu)$ satisfies

\[ \mathbb{E}\big[ \|X\|^p_{L^p(0,T)} \big] \le c, \tag{1.6} \]

where $c < \infty$ only depends on $d, k, p, c_{LG}, T, \|g_0\|_{L^p(0,T)}$, and, $L^p$-continuously, on $K|_{[0,T]}$. If in addition there exist a constant $\eta \in (0,1)$ and a locally bounded function $c_K \colon \mathbb{R}_+ \to \mathbb{R}_+$ such that (1.4) holds, then

\[ \mathbb{E}\big[ \|X - g_0\|^p_{W^{\eta,p}(0,T)} \big] \le c, \tag{1.7} \]

where $c < \infty$ only depends on $d, k, p, \eta, c_K, c_{LG}, T$.

An immediate corollary is the following tightness result.
Corollary 1.5.
Fix $d, k, p, \eta, c_K, c_{LG}$ as in Theorem 1.4, and let $G_0 \subset L^p_{\mathrm{loc}}$ be relatively compact. Let $\mathcal{X}$ be the set of all weak $L^p$ solutions $X$ of (1.1) as $g_0$ ranges through $G_0$, $K$ ranges through all kernels that satisfy (1.4) with the given $\eta$ and $c_K$, and $(b, a, \nu)$ ranges through all characteristic triplets that satisfy (1.5) with the given $c_{LG}$. Then $\mathcal{X}$ is tight, in the sense that the family $\{\mathrm{Law}(X) \colon X \in \mathcal{X}\}$ is tight in $\mathcal{P}(L^p_{\mathrm{loc}})$.

Proof. Fix $T \in \mathbb{R}_+$ and let $c$ be the constant in (1.7). For any $m > 0$, Markov's inequality gives

\[ \sup_{X \in \mathcal{X}} \mathbb{P}\big( \|X - g_0\|_{W^{\eta,p}(0,T)} > m \big) \le \frac{c}{m^p}. \]

The balls $\{f \colon \|f\|_{W^{\eta,p}(0,T)} \le m\}$ are relatively compact in $L^p(0,T)$, so the above estimate implies that the family $\{(X - g_0)|_{[0,T]} \colon X \in \mathcal{X}\}$ is tight in $L^p(0,T)$. Since $T$ was arbitrary, it follows that $\mathcal{X}_0 = \{X - g_0 \colon X \in \mathcal{X}\}$ is tight in $L^p_{\mathrm{loc}}$. Since $G_0$ is relatively compact, $G_0 + \mathcal{X}_0$ is tight as well, and it contains $\mathcal{X}$. Thus $\mathcal{X}$ is tight.

The second main ingredient in the proof of Theorem 1.2 relies on a reformulation of (1.1) as a certain martingale problem. This martingale problem is introduced in Section 3, and it is shown in Lemma 3.3 that weak $L^p$ solutions of (1.1) can equivalently be understood as solutions of the martingale problem. This point of view is useful because it leads to the following stability result, which under appropriate conditions asserts that the weak limit of a sequence of solutions is again a solution. The proof is given at the end of Section 3. Recall that $D$ denotes the Skorokhod space of càdlàg functions from $\mathbb{R}_+$ to $\mathbb{R}^k$.

Theorem 1.6.
Let $d, k \in \mathbb{N}$, $p \in [2,\infty)$. For each $n \in \mathbb{N}$, let $(X^n, Z^n)$ be a weak $L^p$ solution of (1.1) given data $(g_0^n, K^n, b^n, a^n, \nu^n)$ as in (D1)–(D3). Assume the triplets $(b^n, a^n, \nu^n)$ all satisfy (1.5) with a common constant $c_{LG}$. Assume also, for some $(g_0, K, b, a, \nu)$ and limiting process $(X, Z)$, that

• $g_0^n \to g_0$ in $L^p_{\mathrm{loc}}$,
• $K^n \to K$ in $L^p_{\mathrm{loc}}$,
• $(b^n, a^n, \nu^n) \to (b, a, \nu)$ in the sense that $A^n f \to Af$ locally uniformly on $\mathbb{R}^d \times \mathbb{R}^k$ for every $f \in C^2_c(\mathbb{R}^k)$, where $Af$ is defined in terms of the characteristic triplet by

\[ Af(x,z) = b(x)^\top \nabla f(z) + \tfrac12 \operatorname{tr}\big( a(x) \nabla^2 f(z) \big) + \int_{\mathbb{R}^k} \big( f(z + \zeta) - f(z) - \zeta^\top \nabla f(z) \big)\,\nu(x, d\zeta), \]

and $A^n f$ is defined analogously,
• $(X^n, Z^n) \Rightarrow (X, Z)$ in $L^p_{\mathrm{loc}} \times D$.

Then $(X, Z)$ is a weak $L^p$ solution of (1.1) for the data $(g_0, K, b, a, \nu)$.

It is important to appreciate that no pointwise convergence of characteristic triplets is required in Theorem 1.6. For example, it may happen that $a^n = 0$ for all $n$, but the limiting triplet has $a \ne 0$. This is because diffusion can be approximated by small jumps, and we indeed make use of this in a crucial manner.

By combining the tightness and stability results with an approximation scheme for the characteristic triplet, we reduce the existence question to the pure jump case where $Z$ is piecewise constant with finite activity jumps. A solution $X$ can then be constructed directly. The details are given in Section 4.

At this point it is natural to ask about uniqueness of solutions to (1.1). Standard counterexamples for SDEs reveal that no reasonable uniqueness statement will hold at the level of generality of Theorem 1.2. Additional assumptions are needed. In Section 5 we prove a pathwise uniqueness theorem under suitable Lipschitz conditions; see Theorem 5.3. This in turn yields uniqueness in law via the abstract machinery of Kurtz (2014) and, as a by-product, strong existence. As for SDEs, uniqueness in the non-Lipschitz case is more delicate and not treated here. In certain situations, uniqueness in law can still be established; see for instance Abi Jaber et al. (2017) for the case of affine characteristics and continuous trajectories.

In Section 6 we turn to path regularity of solutions $X$ of (1.1). Basic examples show that $X$ can be as irregular as the kernel $K$ itself. However, often additional information is available that allows one to assert better path regularity. Criteria of this kind are collected in Theorem 6.1.

At this stage let us mention various path regularity results for stochastic convolutions that already exist in the literature. For one-dimensional continuous kernels $K$, stochastic convolutions $\int_0^t K(t-s)\,dW_s$ with $W$ a standard Brownian motion may fail to be locally bounded in $t$ (Brzezniak et al., 2001, Theorem 1). However, under appropriate conditions on $K$, allowing in particular for certain singular kernels, a version with Hölder sample paths exists (Abi Jaber et al., 2017, Lemma 2.4). If $W$ is replaced by a pure jump process, Rosinski (1989, Theorem 4) showed that the stochastic convolution fails to be locally bounded whenever the kernel is singular. Similar results appear in infinite dimensions, see Brzeźniak and Zabczyk (2010, Theorem 7.1). Under additional regularity of the kernel, existence of Hölder continuous versions for fractional Lévy processes has been established by Marquardt (2006) and Mytnik and Neuman (2011).

Finally, in Section 7 we sketch how our results can be applied to scaling limits of Hawkes processes (Subsection 7.1) and approximations of solutions of (1.1) by means of finite-dimensional systems of Markovian SDEs (Subsection 7.2). Some basic auxiliary results are gathered in the appendix.

This section is devoted to the proof of Theorem 1.4. We will need the following inequality, taken from Marinelli and Röckner (2014, Theorem 1). It first appeared in Novikov (1975, Theorem 1), but is also known as the Bichteler–Jacod inequality or Kunita estimate. We refer to Marinelli and Röckner (2014) for a historical survey of these maximal inequalities.
Lemma 2.1.
Let $\mu$ be a random measure with compensator $\nu$, and define $\bar\mu = \mu - \nu$. For any $T \in \mathbb{R}_+$ and $g$ such that the integral

\[ M_t = \int_{[0,t) \times \mathbb{R}^k} g(s, \zeta)\,\bar\mu(ds, d\zeta) \]

is well-defined for all $t \in [0,T]$, one has the inequality

\[ \mathbb{E}\Big[ \sup_{t \le T} |M_t|^p \Big] \le C(p,T)\,\mathbb{E}\Big[ \int_{[0,T) \times \mathbb{R}^k} |g(s,\zeta)|^p\,\nu(ds, d\zeta) + \Big( \int_{[0,T) \times \mathbb{R}^k} |g(s,\zeta)|^2\,\nu(ds, d\zeta) \Big)^{p/2} \Big], \]

for any $p \ge 2$, where $C(p,T)$ only depends on $p$ and $T$.

We now proceed to the proof of Theorem 1.4. Let therefore $d, k \in \mathbb{N}$, $p \in [2,\infty)$, and consider $(g_0, K, b, a, \nu)$ as in (D1)–(D3). We assume there exists a constant $c_{LG}$ such that (1.5) holds, and let $(X, Z)$ be a weak $L^p$ solution of (1.1) for the data $(g_0, K, b, a, \nu)$.

Proof of (1.6). Observe that $Z$ admits the representation

\[ Z_t = \int_0^t b(X_s)\,ds + M^c_t + M^d_t, \qquad t \ge 0, \]

where $M^c$ is a continuous local martingale with quadratic variation $\langle M^c \rangle = \int_0^\cdot a(X_s)\,ds$ and $M^d$ is a purely discontinuous local martingale with compensator $\int_{\mathbb{R}^k} \zeta\,\nu(X, d\zeta)$. Define $\tau_n = \inf\{t \colon \int_0^t |X_s|^p\,ds \ge n\} \wedge T$. Since $X$ is predictable with sample paths in $L^p_{\mathrm{loc}}$, the process $\int_0^\cdot |X_s|^p\,ds$ is continuous, adapted, and increasing. Thus $\tau_n$ is a stopping time for every $n$, and $\tau_n \to \infty$. Define the process $X^n$ by $X^n_t = X_t \mathbf{1}_{\{t < \tau_n\}}$. We then have

\[ \|X^n\|^p_{L^p(0,T)} \le 4^{p-1} \Big( \|g_0\|^p_{L^p(0,T)} + \int_0^T \Big| \int_{[0,t)} K(t-s)\,b(X^n_s)\,ds \Big|^p dt \Big) + 4^{p-1} \Big( \int_0^T \Big| \int_{[0,t)} K(t-s)\,dM^{c,n}_s \Big|^p dt + \int_0^T \Big| \int_{[0,t)} K(t-s)\,dM^{d,n}_s \Big|^p dt \Big) \]
\[ = 4^{p-1} \Big( \|g_0\|^p_{L^p(0,T)} + \int_0^T \big( \mathrm{I}_t + \mathrm{II}_t + \mathrm{III}_t \big)\,dt \Big), \]

where $M^{c,n}$ has quadratic variation equal to $\int_0^\cdot a(X^n_s)\,ds$, and the jump measure of $M^{d,n}$ has compensator $\nu(X^n, d\zeta)$.
An application of the Jensen and BDG inequalities combined with Fubini's theorem and (1.5) leads to

\[ \mathbb{E}[\mathrm{I}_t] + \mathbb{E}[\mathrm{II}_t] \le C(c_{LG}, p, T) \int_0^t |K(t-s)|^p \big( 1 + \mathbb{E}[|X^n_s|^p] \big)\,ds, \]

for every $t \le T$. Thanks to Novikov's inequality, see Lemma 2.1, we have

\[ \mathbb{E}[\mathrm{III}_t] \le \mathbb{E}\Big[ \sup_{r \le t} \Big| \int_{[0,r)} K(t-s)\,dM^{d,n}_s \Big|^p \Big] \]
\[ \le C(p,t) \int_0^t |K(t-s)|^p\,\mathbb{E}\Big[ \int_{\mathbb{R}^k} |\zeta|^p\,\nu(X^n_s, d\zeta) + \Big( \int_{\mathbb{R}^k} |\zeta|^2\,\nu(X^n_s, d\zeta) \Big)^{p/2} \Big]\,ds \]
\[ \le C(c_{LG}, p, t) \int_0^t |K(t-s)|^p \big( 1 + \mathbb{E}[|X^n_s|^p] \big)\,ds, \qquad t \le T, \]

where the last inequality follows from (1.5). Combining the above yields

\[ \mathbb{E}\big[ \|X^n\|^p_{L^p(0,T)} \big] \le C(c_{LG}, p, T) \Big( \|g_0\|^p_{L^p(0,T)} + \int_0^T\!\!\int_0^t |K(t-s)|^p \big( 1 + \mathbb{E}[|X^n_s|^p] \big)\,ds\,dt \Big). \]

Multiple changes of variables and applications of Tonelli's theorem yield

\[ \int_0^T\!\!\int_0^t |K(t-s)|^p \big( 1 + \mathbb{E}[|X^n_s|^p] \big)\,ds\,dt = \int_0^T |K(s)|^p \int_0^{T-s} \big( 1 + \mathbb{E}[|X^n_t|^p] \big)\,dt\,ds \le T\,\|K\|^p_{L^p(0,T)} + \int_0^T |K(T-s)|^p\,\mathbb{E}\big[ \|X^n\|^p_{L^p(0,s)} \big]\,ds. \]

We deduce that the function $f_n(t) = \mathbb{E}[\|X^n\|^p_{L^p(0,t)}]$ satisfies the convolution inequality

\[ f_n(t) \le C(c_{LG}, p, T) \Big( \|g_0\|^p_{L^p(0,T)} + T\,\|K\|^p_{L^p(0,T)} \Big) - (\widehat{K} * f_n)(t), \]

where $\widehat{K} = -C(c_{LG}, p, T)\,|K|^p$ lies in $L^1(0,T)$. The resolvent $\widehat{R}$ of $\widehat{K}$ is nonpositive and lies in $L^1(0,T)$; see Gripenberg et al. (1990, Theorem 2.3.1 and its proof). Moreover, $f_n \le n$ by construction. Thus the Gronwall lemma for convolution inequalities applies; see Lemma A.2. In particular, we have

\[ f_n(T) \le C(c_{LG}, p, T) \Big( \|g_0\|^p_{L^p(0,T)} + T\,\|K\|^p_{L^p(0,T)} \Big) \Big( 1 + \|\widehat{R}\|_{L^1(0,T)} \Big). \]

As $n \to \infty$ we have $\tau_n \to \infty$, and hence $f_n(T) \to \mathbb{E}[\|X\|^p_{L^p(0,T)}]$ by monotone convergence. We deduce (1.6), as desired. Finally, the continuous dependence on $K|_{[0,T]}$ follows from Gripenberg et al. (1990, Theorem 2.3.1), which implies that the map from $L^p(0,T)$ to $\mathbb{R}$ that takes $K|_{[0,T]}$ to $\|\widehat{R}\|_{L^1(0,T)}$ is continuous.

For the proof of the second part of Theorem 1.4, namely (1.7), we will need the following estimate.

Lemma 2.2.
Let $K \colon \mathbb{R}_+ \to \mathbb{R}^{d \times k}$ be measurable. For any $\eta > 0$, $T \in \mathbb{R}_+$, $p \ge 2$ and nonnegative measurable function $f$, one has

\[ \int_0^T\!\!\int_0^T\!\!\int_{s \wedge t}^{s \vee t} \frac{|K(s \vee t - u)|^p}{|t-s|^{1+p\eta}}\,f(u)\,du\,ds\,dt \le \|f\|_{L^1(0,T)}\,\frac{1}{\eta} \int_0^T |K(t)|^p\,t^{-\eta p}\,dt \tag{2.1} \]

and

\[ \int_0^T\!\!\int_0^T\!\!\int_0^{s \wedge t} \frac{|K(t-u) - K(s-u)|^p}{|t-s|^{1+\eta p}}\,f(u)\,du\,ds\,dt \le \|f\|_{L^1(0,T)} \int_0^T\!\!\int_0^T \frac{|K(t) - K(s)|^p}{|t-s|^{1+\eta p}}\,ds\,dt. \tag{2.2} \]

Proof.
We first prove (2.1). [...]

[...] a constant $\eta \in (0,1)$ and a locally bounded function $c_K \colon \mathbb{R}_+ \to \mathbb{R}_+$ such that (1.4) holds.

Proof of (1.7). Set $\bar X = X - g_0$ and observe that

\[ |\bar X_t - \bar X_s| \le \Big| \int_{[0, s \wedge t)} \big( K(t-u) - K(s-u) \big)\,dZ_u \Big| + \Big| \int_{[s \wedge t, s \vee t)} K(s \vee t - u)\,dZ_u \Big|, \qquad \mathbb{P} \otimes dt \otimes ds\text{-a.e.} \]

A similar argument as in the proof of (1.6) shows that $\mathbb{E}\big[ \int_0^T\!\int_0^T \frac{|\bar X_t - \bar X_s|^p}{|t-s|^{1+\eta p}}\,ds\,dt \big]$ is bounded above by

\[ C(c_{LG}, p, T) \Big( \int_0^T\!\!\int_0^T\!\!\int_0^{s \wedge t} \frac{|K(t-u) - K(s-u)|^p \big( 1 + \mathbb{E}[|X_u|^p] \big)}{|t-s|^{1+\eta p}}\,du\,ds\,dt + \int_0^T\!\!\int_0^T\!\!\int_{s \wedge t}^{s \vee t} \frac{|K(s \vee t - u)|^p \big( 1 + \mathbb{E}[|X_u|^p] \big)}{|t-s|^{1+\eta p}}\,du\,ds\,dt \Big). \]

Applying (1.6), as well as Lemma 2.2 with $f(u) = 1 + \mathbb{E}[|X_u|^p]$, we obtain the bound (1.7) with a constant $c < \infty$ that depends on $d, k, p, \eta, c_K, c_{LG}, T$ as well as, $L^p$-continuously, on $K|_{[0,T]}$. Note that the set of restrictions $K|_{[0,T]}$ of kernels that satisfy (1.4) with the given $c_K$ is relatively compact in $L^p(0,T)$. By maximizing the bound over all such $K$, we obtain a bound that only depends on $d, k, p, \eta, c_K, c_{LG}, T$.

We consider initial conditions $g_0$ and convolution kernels $K$ as in (D1)–(D2) of Section 1, as well as linear operators $A$ that map functions $f \in C^2_c(\mathbb{R}^k)$ to measurable functions $Af \colon \mathbb{R}^d \times \mathbb{R}^k \to \mathbb{R}$, and satisfy the following growth bound for some $p \in [1,\infty)$:

For every $f \in C^2_c(\mathbb{R}^k)$ there is a finite constant $c_f$ such that
\[ |Af(x,z)| \le c_f\,(1 + |x|^p) \quad \text{for all } (x,z) \in \mathbb{R}^d \times \mathbb{R}^k. \tag{3.1} \]

Note that (3.1) ensures that $Af(x, z) \in L^1_{\mathrm{loc}}(\mathbb{R}_+, \mathbb{R})$ for any pair of functions $(x, z) \in L^p_{\mathrm{loc}} \times D$.

Definition 3.1.
Let $p \in [1,\infty)$. A solution of the local martingale problem for $(g_0, K, A)$ is a pair $(X, Z)$ of processes with trajectories in $L^p_{\mathrm{loc}} \times D$, defined on a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, such that $X$ is predictable, $Z$ is adapted with $Z_0 = 0$, the process

\[ M^f_t = f(Z_t) - \int_0^t Af(X_s, Z_s)\,ds, \qquad t \ge 0, \tag{3.2} \]

is a local martingale for every $f \in C^2_c(\mathbb{R}^k)$, and one has the equality

\[ \int_0^t X_s\,ds = \int_0^t g_0(s)\,ds + \int_0^t K(t-s)\,Z_s\,ds, \qquad t \ge 0. \tag{3.3} \]

Note that both the left- and right-hand sides of (3.3) are continuous in $t$ and equal to zero for $t = 0$. For the convolution $\int_0^t K(t-s)\,Z_s\,ds$, this follows because $K$ is in $L^1_{\mathrm{loc}}$ and the trajectories of $Z$ are in $L^\infty_{\mathrm{loc}}$; see Gripenberg et al. (1990, Corollary 2.2.3).

Our first goal is to establish the equivalence between weak $L^p$ solutions of (1.1) and solutions of the local martingale problem. The relevant operator $A$ is given by

\[ Af(x,z) = b(x)^\top \nabla f(z) + \tfrac12 \operatorname{tr}\big( a(x) \nabla^2 f(z) \big) + \int_{\mathbb{R}^k} \Big( f(z + \zeta) - f(z) - \chi(\zeta)^\top \nabla f(z) \Big)\,\nu(x, d\zeta), \tag{3.4} \]

where $(b, a, \nu)$ is the given characteristic triplet and $\chi$ is the truncation function. This equivalence will allow us to establish Theorem 1.6 by proving a stability theorem for solutions of local martingale problems; see Theorem 3.4 below. The latter is easier, because the conditions (3.2) and (3.3) are more easily shown to be closed with respect to suitable perturbations of $X$, $Z$, $g_0$, $K$ and $A$.

Lemma 3.2.
Let $p \in [2,\infty)$. Consider a kernel $K \in L^1_{\mathrm{loc}}$ and a characteristic triplet $(b, a, \nu)$ satisfying (1.2). Let $X$ be a predictable process with trajectories in $L^p_{\mathrm{loc}}$ and let $Z$ be an Itô semimartingale whose differential characteristics with respect to some given truncation function $\chi$ are $b(X)$, $a(X)$, $\nu(X, d\zeta)$. Then $\int_{[0,t)} K(t-s)\,dZ_s$ is well-defined for almost every $t \in \mathbb{R}_+$, and

\[ \int_0^t \Big( \int_{[0,s)} K(s-u)\,dZ_u \Big)\,ds = \int_0^t K(t-s)\,Z_s\,ds, \qquad t \ge 0. \]

Proof.
The stochastic integral $\int_{[0,t)} K(t-s)\,dZ_s$ is well-defined for a.e. $t \in \mathbb{R}_+$ by Lemma A.3. Define $\kappa(x) = |b(x)| + |a(x)| + \int_{\mathbb{R}^k} (1 \wedge |\zeta|^2)\,\nu(x, d\zeta)$. The bound (1.2) and a change of variables yield [...]
Lemma 3.3. Let $p \in [2,\infty)$ and consider data $(g_0, K, b, a, \nu)$ as in (D1)–(D3) and a truncation function $\chi$. A pair $(X, Z)$ is a weak $L^p$ solution of (1.1) if and only if it is a solution of the local martingale problem for $(g_0, K, A)$, where $A$ is given by (3.4).

Proof. Suppose first $(X, Z)$ is a weak $L^p$ solution of (1.1). Itô's formula applied to $Z$ shows that the process $M^f$ in (3.2) is a local martingale for every $f \in C^2_c(\mathbb{R}^k)$; see Jacod and Shiryaev (2003, Theorem II.2.42 (a) ⇒ (c)). Furthermore, integrating both sides of (1.3) and invoking Lemma 3.2 yields (3.3). Thus $(X, Z)$ is a solution of the local martingale problem.

Conversely, suppose $(X, Z)$ is a solution of the local martingale problem for $(g_0, K, A)$. Lemma 3.2 and (3.3) yield

\[ \int_0^T X_t\,dt = \int_0^T \Big( g_0(t) + \int_{[0,t)} K(t-s)\,dZ_s \Big)\,dt \]

for any $T > 0$. This implies (1.3). It remains to check that $Z$ is a semimartingale with differential characteristics $b(X)$, $a(X)$, $\nu(X, d\zeta)$ with respect to $\chi$. This will follow from Jacod and Shiryaev (2003, Theorem II.2.42 (c) ⇒ (a)), once we prove that $M^f$ given in (3.2) is a local martingale not only for all $f \in C^2_c(\mathbb{R}^k)$, but for all $f \in C^2_b(\mathbb{R}^k)$, i.e. bounded functions which are twice continuously differentiable. Observe that $M^f$ remains well-defined thanks to (1.2). We adapt the proof of Cheridito et al. (2005, Proposition 3.2). Consider the stopping times

\[ T_m = \inf\Big\{ t \ge 0 \colon \int_0^t (1 + |X_s|^p)\,ds \ge m \Big\}, \]
\[ S_m = \inf\{ t \ge 0 \colon |Z_{t-}| \ge m \text{ or } |Z_t| \ge m \}, \]
\[ \tau_m = T_m \wedge S_m, \]

$m \ge 1$. It is clear that $\tau_m \to \infty$ as $m \to \infty$. Fix any function $f \in C^2_b(\mathbb{R}^k)$. Fix also functions $\varphi_n \in C^2_c(\mathbb{R}^k)$ taking values in $[0,1]$ and equal to one on the centered ball $B(0,n)$ of radius $n$. Then $f\varphi_n \in C^2_c(\mathbb{R}^k)$, so that $M^{f\varphi_n}$ defined as in (3.2) is a local martingale for each $n$. Write $M^{n,m}_t = M^{f\varphi_n}_{t \wedge \tau_m}$. We then have for $n, m \in \mathbb{N}$

\[ |M^{n,m}_t| \le \|f\|_\infty + m\,c_n, \qquad t \ge 0, \]

where the constant $c_n$ comes from (1.2) and depends on $n$. Hence, $M^{n,m}$ is a true martingale for each $m, n \in \mathbb{N}$. Fix $m \in \mathbb{N}$ and set $M^m_t = M^f_{t \wedge \tau_m}$. For all $n > m$, by definition of $T_m$ and the fact that $\varphi_n = 1$ on $B(0,n)$ we have

\[ M^m_t - M^{n,m}_t = \int_{(0, t \wedge \tau_m] \times \mathbb{R}^k} \big( f(Z_s + \zeta) - (f\varphi_n)(Z_s + \zeta) \big)\,\nu(X_s, d\zeta)\,ds. \]

Thus

\[ |M^m_t - M^{n,m}_t| \le \|f\|_\infty \int_{(0, t \wedge \tau_m] \times \mathbb{R}^k} \mathbf{1}_{\{|\zeta| \ge n - m\}}\,\nu(X_s, d\zeta)\,ds. \]

As $n \to \infty$, the right-hand side tends to zero in $L^1(\mathbb{P})$, by virtue of the dominated convergence theorem. Indeed, $1 \wedge |\zeta|^2 \ge \mathbf{1}_{\{|\zeta| \ge n - m\}} \to 0$ as $n \to \infty$, and it follows from (1.2) that

\[ \int_{(0, t \wedge \tau_m] \times \mathbb{R}^k} (1 \wedge |\zeta|^2)\,\nu(X_s, d\zeta)\,ds \le c \int_0^{t \wedge \tau_m} (1 + |X_s|^p)\,ds \le c\,m. \]

We conclude that $\mathbb{E}[|M^{n,m}_t - M^m_t|] \to 0$ as $n \to \infty$. Thus $M^m_t = M^f_{t \wedge \tau_m}$ is a martingale, being an $L^1(\mathbb{P})$-limit of martingales. Thus $M^f$ is a local martingale, as required.

The following is our main result on stability for solutions of local martingale problems. Together with Lemma 3.3, it will imply Theorem 1.6. We let $x = (x(t))_{t \ge 0}$ and $z = (z(t))_{t \ge 0}$ denote generic elements of $L^p_{\mathrm{loc}}$ and $D$, respectively.

Theorem 3.4.
Let $d, k \in \mathbb{N}$, $p \in (1,\infty)$. Consider data $(g_0^n, K^n, A^n)$ for $n \in \mathbb{N}$ and $(g_0, K, A)$, and assume that the $A^n$ satisfy (3.1) with constants $c_f$ that do not depend on $n$. For each $n$, let $(X^n, Z^n)$ be a solution of the local martingale problem for $(g_0^n, K^n, A^n)$. Assume that

• $g_0^n \to g_0$ in $L^p_{\mathrm{loc}}$,
• $K^n \to K$ in $L^p_{\mathrm{loc}}$,
• $A^n f \to Af$ locally uniformly on $\mathbb{R}^d \times \mathbb{R}^k$ for every $f \in C^2_c(\mathbb{R}^k)$,
• $(X^n, Z^n) \Rightarrow (X, Z)$ in $L^p_{\mathrm{loc}} \times D$ for some limiting process $(X, Z)$.

Then $(X, Z)$ is a solution of the local martingale problem for $(g_0, K, A)$.

Proof. Let $(\Omega^n, \mathcal{F}^n, (\mathcal{F}^n_t)_{t \ge 0}, \mathbb{P}^n)$ be the filtered probability space where $(X^n, Z^n)$ is defined. We may assume without loss of generality that this space supports an $\mathcal{F}^n_0$-measurable standard uniform random variable $U^n$ that is independent of $(X^n, Z^n)$. We then have $(U^n, X^n, Z^n) \Rightarrow (U, X, Z)$ in $[0,1] \times L^p_{\mathrm{loc}} \times D$, where $U$ is standard uniform and independent of $(X, Z)$. The standard uniform random variable $U$ will be used below as a randomization device to avoid the jumps of $Z$.

Fix $f \in C^2_c(\mathbb{R}^k)$ and $m \in \mathbb{N}$. For any $(u, x) \in [0,1] \times L^p_{\mathrm{loc}}$, define

\[ \tau(u, x) = \inf\Big\{ t \ge 0 \colon \int_0^t \big( u + 1 + |x(s)|^p \big)\,ds \ge m \Big\}. \]

Then $\tau(U^n, X^n)$ is a stopping time in $(\mathcal{F}^n_t)_{t \ge 0}$, and the growth bound (3.1) yields

\[ \int_0^{t \wedge \tau(U^n, X^n)} |A^n f(X^n_s, Z^n_s)|\,ds \le c_f \int_0^{t \wedge \tau(U^n, X^n)} (1 + |X^n_s|^p)\,ds \le m\,c_f. \]

Thus the local martingale

\[ M^n_t = f\big( Z^n_{t \wedge \tau(U^n, X^n)} \big) - \int_0^{t \wedge \tau(U^n, X^n)} A^n f(X^n_s, Z^n_s)\,ds, \qquad t \ge 0, \]

satisfies

\[ |M^n_t| \le \|f\|_\infty + m\,c_f, \qquad t \ge 0. \tag{3.5} \]

In particular it is a true martingale, so for any time points $0 \le t_1 < \cdots < t_k \le s < t$, and functions $h \in C([0,1])$, $g_i \in C_b(\mathbb{R}^d \times \mathbb{R}^k)$, $i = 1, \ldots, k$, we have

\[ \mathbb{E}\Big[ (M^n_t - M^n_s)\,h(U^n) \prod_{i=1}^k g_i\Big( \int_0^{t_i} X^n_r\,dr,\; Z^n_{t_i} \Big) \Big] = 0, \tag{3.6} \]

where $\mathbb{E}$ is understood as expectation under $\mathbb{P}^n$.

Next, by Skorokhod's representation theorem (see Billingsley (1999, Theorem 6.7)), we may assume that all the triplets $(U^n, X^n, Z^n)$ and $(U, X, Z)$ are defined on a common probability space $(\Omega, \mathcal{F}, \mathbb{P})$, that $(U^n, X^n, Z^n) \to (U, X, Z)$ in $[0,1] \times L^p_{\mathrm{loc}} \times D$ almost surely, and that each triplet has the same law under $\mathbb{P}$ as it did under $\mathbb{P}^n$. (We can however not assume that the filtrations $(\mathcal{F}^n_t)_{t \ge 0}$ are the same.) In particular, (3.6) still holds, now with $\mathbb{E}$ understood as expectation under $\mathbb{P}$.

We now prepare to pass to the limit in (3.6). One easily checks that the map $(u, x) \mapsto \tau(u, x)$ is continuous. Combined with Lemma 3.6 below, it follows that

\[ \int_0^{t \wedge \tau(U^n, X^n)} A^n f(X^n_r, Z^n_r)\,dr \to \int_0^{t \wedge \tau(U, X)} Af(X_r, Z_r)\,dr \]

for all $t \ge 0$. Moreover, $Z$ is continuous at $\tau(U, X)$, almost surely. To see this, let $\{T_i(z) \colon i \in \mathbb{N}\}$ denote an enumeration of the countably many jump times of the function $z \in D$. We choose $T_i(z)$ measurable in $z$. Since $U$ and $(X, Z)$ are independent, and since for any $x \in L^p_{\mathrm{loc}}$ the law of $\tau(U, x)$ has no atoms, we get

\[ \mathbb{P}\big( \tau(U, X) = T_i(Z) \big) = \mathbb{E}\Big[ \mathbb{P}\big( \tau(U, x) = T_i(z) \big)\big|_{(x,z) = (X,Z)} \Big] = 0. \]

Thus

\[ \mathbb{P}\big( \tau(U, X) \in \{T_i(Z) \colon i \in \mathbb{N}\} \big) \le \sum_{i \in \mathbb{N}} \mathbb{P}\big( \tau(U, X) = T_i(Z) \big) = 0, \]

showing that $Z$ is indeed continuous at $\tau(U, X)$, almost surely. We conclude that $M^n_t \to M_t$ almost surely for any $t \in C(Z) = \{ r \in \mathbb{R}_+ \colon \mathbb{P}(Z_r = Z_{r-}) = 1 \}$, where we define

\[ M_t = f\big( Z_{t \wedge \tau(U, X)} \big) - \int_0^{t \wedge \tau(U, X)} Af(X_s, Z_s)\,ds, \qquad t \ge 0. \]

Selecting $0 \le t_1 < \ldots < t_k \le s < t$ from $C(Z)$, we may thus use the bounded convergence theorem, justified by (3.5), to pass to the limit in (3.6) to obtain

\[ \mathbb{E}\Big[ (M_t - M_s)\,h(U) \prod_{i=1}^k g_i\Big( \int_0^{t_i} X_r\,dr,\; Z_{t_i} \Big) \Big] = 0. \tag{3.7} \]

By Ethier and Kurtz (2005, Theorem 3.7.7), $C(Z)$ is dense in $\mathbb{R}_+$. Along with right-continuity of $M$ and $Z$, this implies that (3.7) actually holds for any choice of time points $0 \le t_1 < \ldots < t_k \le s < t$. Thus $M$ is a martingale with respect to the filtration given by

\[ \mathcal{F}_t = \sigma(U) \vee \sigma\Big( \int_0^s X_r\,dr,\; Z_s \colon s \le t \Big), \qquad t \ge 0. \]

Since $\tau(U, X)$ is a stopping time for this filtration, and since the constant $m$ in the definition of $\tau(U, X)$ was arbitrary, the process $M^f$ in (3.2) is a local martingale.

We must also verify (3.3). This is immediate from $L^p$ convergence of $g_0^n$ and $X^n$ as well as Lemma 3.5 below. This lets us pass to the limit in the identity $\int_0^t X^n_s\,ds = \int_0^t g_0^n(s)\,ds + \int_0^t K^n(t-s)\,Z^n_s\,ds$, which is valid by assumption.

It only remains to ensure that $Z$ is adapted and $X$ is predictable. Adaptedness of $Z$ holds by definition of the filtration. It is however not clear that $X$ is predictable. Therefore, we replace $X$ by the process $\widetilde{X} = \liminf_{h \downarrow 0} \widetilde{X}^h$, where for each $h > 0$,

\[ \widetilde{X}^h_t = \frac{1}{h} \int_{(t-h) \vee 0}^t X_s\,ds, \qquad t \ge 0. \]

The process $\widetilde{X}$ is predictable, being the pointwise liminf of the continuous and adapted processes $\widetilde{X}^h$. Moreover, for every fixed $\omega$, the trajectory $\widetilde{X}(\omega)$ coincides with $X(\omega)$ almost everywhere by Lebesgue's differentiation theorem. Replacing $X$ by $\widetilde{X}$ therefore does not affect either (3.3) or the local martingale property in (3.2).

The following two lemmas were used in the proof of Theorem 3.4. The first one uses the convolution notation $(f * g)(t) = \int_0^t f(t-s)\,g(s)\,ds$.

Lemma 3.5.
Fix $p \in (1, \infty)$. If $K^n \to K$ in $L^p_{\rm loc}(\mathbb{R}_+, \mathbb{R}^{d \times k})$ and $z^n \to z$ in $D$, then $K^n * z^n \to K * z$ locally uniformly.

Proof. Fix any $T \in \mathbb{R}_+$ and let $q \in (1, \infty)$ satisfy $p^{-1} + q^{-1} = 1$. The triangle inequality and Young's inequality, see Lemma A.1 with $r = \infty$, give
$$\|K * z - K^n * z^n\|_{L^\infty(0,T)} \le \|K * (z - z^n)\|_{L^\infty(0,T)} + \|(K - K^n) * z^n\|_{L^\infty(0,T)} \le \|K\|_{L^p(0,T)} \|z - z^n\|_{L^q(0,T)} + \|K - K^n\|_{L^p(0,T)} \|z^n\|_{L^q(0,T)}.$$
Since $z^n \to z$ in $D$, we have $\sup_n \|z - z^n\|_{L^\infty(0,T)} < \infty$ and $z^n(t) \to z(t)$ for almost every $t \in [0,T]$. Hence $z^n \to z$ in $L^q(0,T)$ by the dominated convergence theorem. Since $K * z$ and $K^n * z^n$ are continuous functions due to Gripenberg et al. (1990, Corollary 2.2.3), the $L^\infty(0,T)$ norm coincides with the supremum norm on $[0,T]$. The result follows.

Lemma 3.6.
Fix $d, k \in \mathbb{N}$, $p \in [1, \infty)$. Let $g^n : \mathbb{R}^d \times \mathbb{R}^k \to \mathbb{R}$ be continuous functions satisfying the following polynomial growth condition: for every compact subset $Q \subset \mathbb{R}^k$, there exists a constant $c_Q \in \mathbb{R}_+$ such that
$$|g^n(x,z)| \le c_Q (1 + |x|^p), \quad (n, x, z) \in \mathbb{N} \times \mathbb{R}^d \times Q. \tag{3.8}$$
Assume that $g^n \to g$ locally uniformly for some function $g : \mathbb{R}^d \times \mathbb{R}^k \to \mathbb{R}$. Then, whenever $(x^n, z^n) \to (x, z)$ in $L^p_{\rm loc} \times D$, we have
$$\int_0^t g^n(x^n(s), z^n(s))\,ds \to \int_0^t g(x(s), z(s))\,ds$$
locally uniformly in $t \in \mathbb{R}_+$.

Proof. Suppose $(x^n, z^n) \to (x, z)$ in $L^p_{\rm loc} \times D$. Fix $T \in \mathbb{R}_+$, let $Q \subset \mathbb{R}^k$ be a compact set that contains the values attained by $z$ and $z^n$, $n \in \mathbb{N}$, over $[0,T]$, and let $c_Q$ be the associated constant in (3.8). Let $R \in [1, \infty)$ be an arbitrary constant, and write
$$\int_0^T |g^n(x^n(s), z^n(s)) - g(x(s), z(s))|\,ds \le I_n + II_n + III_n + IV_n + V_n,$$
where
$$I_n = \int_0^T |g^n(x^n(s), z^n(s)) - g(x^n(s), z^n(s))|\, 1_{\{|x^n(s)| \le R\}}\,ds,$$
$$II_n = \int_0^T |g^n(x^n(s), z^n(s)) - g(x^n(s), z^n(s))|\, 1_{\{|x^n(s)| > R\}}\,ds,$$
$$III_n = \int_0^T |g(x^n(s), z^n(s)) - g(x(s), z^n(s))|\, 1_{\{|x^n(s)| \vee |x(s)| \le R\}}\,ds,$$
$$IV_n = \int_0^T |g(x^n(s), z^n(s)) - g(x(s), z^n(s))|\, 1_{\{|x^n(s)| \vee |x(s)| > R\}}\,ds,$$
$$V_n = \int_0^T |g(x(s), z^n(s)) - g(x(s), z(s))|\,ds.$$
We bound these terms individually. First, defining the compact set $Q_R = B(0,R) \times Q$, where $B(0,R) = \{x \in \mathbb{R}^d : |x| \le R\}$ is the centered closed ball of radius $R$, we have
$$I_n \le T \sup_{(x,z) \in Q_R} |g^n(x,z) - g(x,z)| \to 0 \quad (n \to \infty).$$
Next, consider the restrictions $x^n|_{[0,T]}$, again denoted by $x^n$ for simplicity; they are convergent in $L^p(0,T)$. The Vitali convergence theorem implies that $\{|x^n|^p : n \in \mathbb{N}\}$ is uniformly integrable. Since $g$ satisfies the same polynomial growth condition (3.8) as the $g^n$ and since $R \ge 1$, we then get
$$II_n \le 4 c_Q \int_0^T |x^n(s)|^p\, 1_{\{|x^n(s)| > R\}}\,ds \le \varphi_{II}(R^p),$$
where $\varphi_{II}(R^p) = 4 c_Q \sup_n \int_0^T |x^n(s)|^p\, 1_{\{|x^n(s)|^p > R^p\}}\,ds$ converges to zero as $R \to \infty$ by the definition of uniform integrability. In a similar manner, we get
$$IV_n \le 4 c_Q \int_0^T (|x^n(s)| \vee |x(s)|)^p\, 1_{\{|x^n(s)| \vee |x(s)| > R\}}\,ds \le \varphi_{IV}(R^p),$$
where $\varphi_{IV}(R^p) = 4 c_Q \sup_n \int_0^T (|x^n(s)| \vee |x(s)|)^p\, 1_{\{|x^n(s)| \vee |x(s)| > R\}}\,ds$ also converges to zero as $R \to \infty$.

We now turn to $III_n$. Let $\omega_R : \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous strictly increasing concave function with $\omega_R(0) = 0$ such that
$$\sup_{z \in Q} |g(x,z) - g(y,z)| \le \omega_R(|x - y|), \quad x, y \in B(0,R).$$
Such a function exists since $g$ is uniformly continuous on the compact set $Q_R$. Its inverse $\omega_R^{-1}$ exists and is convex, so by using Jensen's inequality we get
$$III_n \le \int_0^T \omega_R(|x^n(s) - x(s)|)\,ds = T\, \omega_R \circ \omega_R^{-1}\Big( \int_0^T \omega_R(|x^n(s) - x(s)|)\,\frac{ds}{T} \Big) \le T\, \omega_R\Big( \int_0^T |x^n(s) - x(s)|\,\frac{ds}{T} \Big) \to 0 \quad (n \to \infty).$$
Finally, consider $V_n$. Since $z^n \to z$ in $D$, we have $z^n(s) \to z(s)$ for almost every $s \in \mathbb{R}_+$. Thus the integrand in $V_n$ converges to zero for almost every $s \in \mathbb{R}_+$. Moreover, the polynomial growth condition (3.8) implies that the integrand is bounded by $2 c_Q (1 + |x(s)|^p)$, which has finite $L^1(0,T)$ norm. The dominated convergence theorem now shows that $V_n \to 0$ as $n \to \infty$.

Combining the above bounds, we obtain
$$\limsup_{n \to \infty} \int_0^T |g^n(x^n(s), z^n(s)) - g(x(s), z(s))|\,ds \le \varphi_{II}(R^p) + \varphi_{IV}(R^p).$$
Sending $R$ to infinity shows that the left-hand side is actually equal to zero. This completes the proof.

The proof of Theorem 1.6 is now straightforward.

Proof of Theorem 1.6.
This is a consequence of Lemma 3.3 and Theorem 3.4. We only need to observe that the "truncation function" $\chi(\zeta) = \zeta$ can be used under the stronger integrability condition (1.5), and that the $A^n$ satisfy (3.1) with constants $c_f$ that do not depend on $n$. To see this, observe that any $f \in C_c^2(\mathbb{R}^k)$ satisfies
$$|f(z + \zeta) - f(z) - \zeta^\top \nabla f(z)| \le \tfrac{1}{2} \|\nabla^2 f\|_\infty |\zeta|^2. \tag{3.9}$$
Therefore,
$$|A^n f(x,z)| \le \Big( \|\nabla f\|_\infty + \tfrac{1}{2} \|\nabla^2 f\|_\infty \Big) \times \Big( |b^n(x)| + |a^n(x)| + \int_{\mathbb{R}^k} |\zeta|^2 \nu^n(x, d\zeta) \Big).$$
Since $(b^n, a^n, \nu^n)$ satisfy (1.5) with a common constant $c_{LG}$, and due to the bounds $|b^n(x)| \le 1 + |b^n(x)|^2$ and $|x|^2 \le 1 + |x|^p$, we deduce that $|A^n f(x,z)| \le c_f (1 + |x|^p)$ holds with
$$c_f = 2 (1 + c_{LG}) \Big( \|\nabla f\|_\infty + \tfrac{1}{2} \|\nabla^2 f\|_\infty \Big). \tag{3.10}$$
This does not depend on $n$, as required. The proof is complete.

4 Existence of weak $L^p$ solutions

This section is devoted to the proof of Theorem 1.2. We first give an elementary existence result for the simple pure jump case where the diffusion part of the characteristic triplet vanishes, and the jump kernel is uniformly bounded.
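When the total jump rate is bounded, a solution can be built one jump at a time: wait until the integrated jump rate reaches an independent standard exponential level, draw a jump size from the normalized jump kernel, and add the kernel-weighted jump to the state. The following Python snippet is a minimal sketch of that idea, not a definitive implementation. It assumes $d = k = 1$, replaces the exact hitting time by a right-endpoint Riemann sum on a time grid of step `dt`, and all names (`simulate_pure_jump_volterra`, `rate`, `jump`) are hypothetical.

```python
import random


def simulate_pure_jump_volterra(g, K, rate, jump, horizon, dt=1e-3, seed=0):
    """Jump-by-jump construction of a pure jump Volterra process (d = k = 1).

    g, K : callables, the input curve and the convolution kernel
    rate : x -> total jump rate nu(x, R), assumed bounded
    jump : (x, u) -> jump size, with u uniform on [0, 1)
    Returns the lists of jump times and jump sizes on (0, horizon].
    """
    rng = random.Random(seed)
    times, sizes = [], []

    def X(t):
        # X_t = g(t) + sum over jumps strictly before t of K(t - T_n) * J_n
        return g(t) + sum(K(t - T) * J for T, J in zip(times, sizes) if t > T)

    t = 0.0
    while True:
        level = rng.expovariate(1.0)  # standard exponential level E_n
        acc = 0.0
        # Advance time until the integrated jump rate since the last jump
        # reaches the level (discretized version of the hitting time).
        while acc < level:
            t += dt
            if t > horizon:
                return times, sizes
            acc += rate(X(t)) * dt
        # X(t) only sees earlier jumps, so the jump size depends on the
        # state just before the new jump.
        J = jump(X(t), rng.random())
        times.append(t)
        sizes.append(J)


# Example: Hawkes-type dynamics with a singular kernel and a capped rate.
times, sizes = simulate_pure_jump_volterra(
    g=lambda t: 1.0,
    K=lambda t: t ** -0.4,
    rate=lambda x: min(abs(x), 5.0),
    jump=lambda x, u: 1.0,
    horizon=2.0,
)
print(len(times), "jumps on (0, 2]")
```

Because the kernel is only evaluated at strictly positive arguments, a singularity of `K` at zero causes no difficulty here. The cap in `rate` plays the role of the boundedness assumption on the jump kernel.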
Lemma 4.1.
Let $K : \mathbb{R}_+ \to \mathbb{R}^{d \times k}$ and $g_0 : \mathbb{R}_+ \to \mathbb{R}^d$ be measurable functions. Let $\nu(x, d\zeta)$ be a bounded kernel from $\mathbb{R}^d$ into $\mathbb{R}^k$, meaning that $\sup_{x \in \mathbb{R}^d} \nu(x, \mathbb{R}^k) < \infty$. Then there exists a filtered probability space with a predictable process $X$ and a càdlàg piecewise constant semimartingale $Z$ such that
$$X_t = g_0(t) + \int_{[0,t)} K(t-s)\,dZ_s, \quad t \ge 0,$$
and the differential characteristics of $Z$ are $b(X) = \int_{\mathbb{R}^k} \zeta\, \nu(X, d\zeta)$, $a(X) = 0$, $\nu(X, d\zeta)$.

Proof. Let $\{(U_n, E_n) : n \in \mathbb{N}\}$ be a collection of independent random variables on a probability space $(\Omega, \mathcal{F}, P)$, with $U_n$ standard uniform and $E_n$ standard exponential. Define
$$T_0 = 0, \quad X^0_t = g_0(t), \quad Z^0_t = 0, \quad t \ge 0.$$
We now construct processes $X^n$, $Z^n$ and random times $T_n$ recursively as follows. For each $n \in \mathbb{N}$, if $X^{n-1}$ and $Z^{n-1}$ have already been constructed, define a jump time $T_n$ and jump size $J_n$ as follows. First set
$$T_n = \inf\Big\{ t > T_{n-1} : \int_{T_{n-1}}^t \nu(X^{n-1}_s, \mathbb{R}^k)\,ds \ge E_n \Big\},$$
and note that $T_n > T_{n-1}$ since the kernel $\nu(x, d\zeta)$ is bounded. Then let $F : \mathbb{R}^d \times [0,1] \to \mathbb{R}^k$ be a measurable function with the following property: if $U$ is standard uniform, then $F(x, U)$ has distribution $\nu(x, \cdot)/\nu(x, \mathbb{R}^k)$ if $\nu(x, \mathbb{R}^k) > 0$, and $F(x, U) = 0$ otherwise. Set $J_n = F(X^{n-1}_{T_n}, U_n)$. We can now define
$$X^n_t = X^{n-1}_t + K(t - T_n) J_n 1_{\{t > T_n\}}, \qquad Z^n_t = Z^{n-1}_t + J_n 1_{\{t \ge T_n\}}$$
for $t \ge 0$. Note that $(X^n, Z^n)$ coincides with $(X^{n-1}, Z^{n-1})$ on $[0, T_n)$.

Since the kernel $\nu(x, d\zeta)$ is bounded, we have $\sup_{x \in \mathbb{R}^d} \nu(x, \mathbb{R}^k) \le c$ for some constant $c$, and thus $T_n - T_{n-1} \ge \inf\{t > 0 : ct \ge E_n\} = E_n / c$. It follows from the Borel–Cantelli lemma that $\lim_{n \to \infty} T_n = \sum_{n \in \mathbb{N}} (T_n - T_{n-1}) = \infty$. We can thus define $(X_t, Z_t)$ for all $t \ge 0$ by setting $(X_t, Z_t) = (X^n_t, Z^n_t)$ for $t < T_n$. It follows from the construction that $Z$ is càdlàg and piecewise constant, and that
$$X_t = g_0(t) + \sum_{n : t > T_n} K(t - T_n) \Delta Z_{T_n}, \quad t \ge 0.$$
This is the desired convolution equation.

Let $(\mathcal{F}_t)_{t \ge 0}$ be the filtration generated by $Z$, so that in particular $Z$ is a semimartingale. It follows from the construction of $Z$ that its jump characteristic is $\nu(X_t, d\zeta)\,dt$, provided $X$ is predictable. We now show that this is the case. Indeed, any process of the form $f(t) g(T_n, J_n) 1_{\{t > T_n\}}$ is predictable, so by a monotone class argument the same is true for $K(t - T_n) J_n 1_{\{t > T_n\}}$. Since $X^0 = g_0$ is predictable, it follows by induction that $X^n$ is predictable for each $n$. Thus $X$ is predictable, and the proof is complete.

We now proceed with the proof of Theorem 1.2. Throughout the rest of this section, we therefore consider $d, k \in \mathbb{N}$, $p \in [2, \infty)$, and $(g_0, K, b, a, \nu)$ as in (D1)–(D3). We assume that $b$ and $a$ are continuous, and that $x \mapsto |\zeta|^2 \nu(x, d\zeta)$ is continuous from $\mathbb{R}^d$ to $M_+(\mathbb{R}^k)$, the finite positive measures on $\mathbb{R}^k$ with the topology of weak convergence. We also assume there exist a constant $\eta \in (0,1)$, a locally bounded function $c_K : \mathbb{R}_+ \to \mathbb{R}_+$, and a constant $c_{LG}$ such that (1.4) and (1.5) hold.

Lemma 3.3 connects (1.1) to the local martingale problem for $(g_0, K, A)$, where the operator $A$ is given by
$$Af(x,z) = b(x)^\top \nabla f(z) + \tfrac{1}{2} \operatorname{tr}(a(x) \nabla^2 f(z)) + \int_{\mathbb{R}^k} \big( f(z + \zeta) - f(z) - \zeta^\top \nabla f(z) \big)\, \nu(x, d\zeta). \tag{4.1}$$
By the same arguments as in the proof of Theorem 1.6, the inequality (3.9) and the growth bound (1.5), $A$ satisfies (3.1) with the constants $c_f$ given by (3.10). In the following lemma, we construct approximations of $A$.

Lemma 4.2.
Let $A$ be as in (4.1). Then there exist kernels $\nu^n(x, d\zeta)$ from $\mathbb{R}^d$ into $\mathbb{R}^k$ with the following properties.

(i) boundedness and compact support: $\sup_{x \in \mathbb{R}^d} \nu^n(x, \mathbb{R}^k) < \infty$, and $\nu^n(x, \cdot)$ is compactly supported for every $x \in \mathbb{R}^d$,

(ii) linear growth uniformly in $n$: with $b^n(x) = \int_{\mathbb{R}^k} \zeta\, \nu^n(x, d\zeta)$, one has
$$|b^n(x)|^2 + \int_{\mathbb{R}^k} |\zeta|^2 \nu^n(x, d\zeta) + \Big( \int_{\mathbb{R}^k} |\zeta|^p \nu^n(x, d\zeta) \Big)^{2/p} \le c'_{LG} (1 + |x|^2), \tag{4.2}$$
where $c'_{LG} = (5 + 2\sqrt{d})\, c_{LG}$,

(iii) locally uniform approximation: for every $f \in C_c^2(\mathbb{R}^k)$, defining
$$A^n f(x,z) = \int_{\mathbb{R}^k} \big( f(z + \zeta) - f(z) \big)\, \nu^n(x, d\zeta),$$
we have $A^n f \in C(\mathbb{R}^d \times \mathbb{R}^k)$ and $A^n f \to Af$ locally uniformly.

Proof. Multiplying by a continuous cutoff function if necessary, we may assume that $b(x)$, $a(x)$, and $\nu(x, d\zeta)$ are zero for all $x$ outside some compact set $Q$. Moreover, we can approximate the $b$, $a$, and $\nu$ parts separately and then add up the approximations (observing that the left-hand side of (4.2) is subadditive in $(b^n, \nu^n)$, so that we may simply add up the corresponding constants $c'_{LG}$).

Suppose first that $a$ and $\nu$ are zero, and let
$$\nu^n(x, d\zeta) = \frac{1}{\varepsilon}\, \delta_{\varepsilon b(x)}(d\zeta)\, 1_{\{\zeta \ne 0\}},$$
where $\varepsilon = n^{-1}$. Clearly (i) holds. Moreover, $A^n f(x,z) = \varepsilon^{-1} ( f(z + \varepsilon b(x)) - f(z) )$ lies in $C(\mathbb{R}^d \times \mathbb{R}^k)$, and converges to $Af(x,z) = b(x)^\top \nabla f(z)$. The convergence is locally uniform, since the difference quotients converge locally uniformly for $f \in C^1(\mathbb{R}^k)$. Thus (iii) holds. Finally, note that $b^n(x) = b(x)$, and that $\int_{\mathbb{R}^k} |\zeta|^q \nu^n(x, d\zeta) = \varepsilon^{q-1} |b(x)|^q$ for any $q \ge 1$. It follows from (1.5) that (4.2) holds with $c'_{LG} = 3 c_{LG}$.

Suppose instead that $b$ and $\nu$ are zero. Write $\sigma(x) = a(x)^{1/2}$ using the positive semidefinite square root. Then $x \mapsto \sigma(x)$ is again continuous and compactly supported. So are its columns, denoted by $\sigma_1(x), \dots, \sigma_d(x)$. Let
$$\nu^n(x, d\zeta) = \frac{1}{2 \varepsilon^2} \sum_{i=1}^d \big( \delta_{\varepsilon \sigma_i(x)}(d\zeta) + \delta_{-\varepsilon \sigma_i(x)}(d\zeta) \big)\, 1_{\{\zeta \ne 0\}},$$
where again $\varepsilon = n^{-1}$. As before, (i) holds. Moreover,
$$A^n f(x,z) = \frac{1}{2} \sum_{i=1}^d \frac{f(z + \varepsilon \sigma_i(x)) - 2 f(z) + f(z - \varepsilon \sigma_i(x))}{\varepsilon^2} \to \frac{1}{2} \sum_{i=1}^d \sigma_i(x)^\top \nabla^2 f(z)\, \sigma_i(x) = \frac{1}{2} \operatorname{tr}(a(x) \nabla^2 f(z)).$$
Again, $A^n f$ lies in $C(\mathbb{R}^d \times \mathbb{R}^k)$ and the convergence is locally uniform since $f$ is $C^2$ and the $\sigma_i$ are continuous. This gives (iii). Next, we have $b^n(x) = 0$. Also, writing $\sigma_i^j(x)$ for the $j$th component of $\sigma_i(x)$, we have
$$\int_{\mathbb{R}^k} |\zeta|^q \nu^n(x, d\zeta) = \varepsilon^{q-2} \sum_{i=1}^d |\sigma_i(x)|^q \le \Big( \sum_{i,j=1}^d |\sigma_i^j(x)|^2 \Big)^{q/2} = \operatorname{tr}(a(x))^{q/2}$$
for any $q \ge 2$. Since also $\operatorname{tr}(a(x)) \le \sqrt{d}\, |a(x)|$, it follows from (1.5) that (4.2) holds with $c'_{LG} = 2 \sqrt{d}\, c_{LG}$.

Finally, suppose that $b$ and $a$ are zero. Let $\varphi_n$ be a continuous cutoff function supported on $[n^{-1}, n]$ and equal to one on $[2 n^{-1}, n/2]$, such that $\varphi_{n+1} \ge \varphi_n$ for all $n$. Let
$$\nu^n(x, B) = \int_{\mathbb{R}^k} \Big( \delta_\zeta(B) + \frac{1}{\varepsilon}\, \delta_{-\varepsilon \zeta}(B) \Big) \varphi_n(|\zeta|)\, \nu(x, d\zeta),$$
where again $\varepsilon = n^{-1}$. Clearly $\nu^n(x, \cdot)$ has compact support. Moreover,
$$\nu^n(x, \mathbb{R}^k) \le \Big( 1 + \frac{1}{\varepsilon} \Big) \int_{\mathbb{R}^k} n^2 |\zeta|^2 \nu(x, d\zeta) \le c_{LG} (1 + n)\, n^2 \sup_{x \in Q} (1 + |x|^2) < \infty,$$
due to the growth bound (1.5) and recalling that we assumed $\nu(x, d\zeta) = 0$ for all $x$ outside some compact set $Q$. We deduce that (i) holds. Next, we have
$$b^n(x) = \int_{\mathbb{R}^k} \Big( \zeta + \frac{1}{\varepsilon} (-\varepsilon \zeta) \Big) \varphi_n(|\zeta|)\, \nu(x, d\zeta) = 0$$
and, since $\varepsilon \le 1$,
$$\int_{\mathbb{R}^k} |\zeta|^q \nu^n(x, d\zeta) \le 2 \int_{\mathbb{R}^k} |\zeta|^q \varphi_n(|\zeta|)\, \nu(x, d\zeta) \le 2 \int_{\mathbb{R}^k} |\zeta|^q \nu(x, d\zeta).$$
Thus it follows from (1.5) that (4.2) holds with $c'_{LG} = 2 c_{LG}$. It remains to show that $A^n f \to Af$ locally uniformly. Write
$$Af(x,z) - A^n f(x,z) = \int_{\mathbb{R}^k} \big( f(z + \zeta) - f(z) - \zeta^\top \nabla f(z) \big) (1 - \varphi_n(|\zeta|))\, \nu(x, d\zeta) + \int_{\mathbb{R}^k} \frac{1}{\varepsilon} \big( f(z) - f(z - \varepsilon \zeta) - \varepsilon \zeta^\top \nabla f(z) \big) \varphi_n(|\zeta|)\, \nu(x, d\zeta).$$
Due to (3.9) and the bound $|f(z) - f(z - \varepsilon \zeta) - \varepsilon \zeta^\top \nabla f(z)| \le \tfrac{1}{2} \varepsilon^2 \|\nabla^2 f\|_\infty |\zeta|^2$, we obtain
$$|Af(x,z) - A^n f(x,z)| \le c \int_{\mathbb{R}^k} (1 - \varphi_n(|\zeta|))\, \tilde\nu(x, d\zeta) + \frac{c}{n}\, \tilde\nu(x, \mathbb{R}^k) \tag{4.3}$$
for the constant $c = \tfrac{1}{2} \|\nabla^2 f\|_\infty$ and the finite kernel $\tilde\nu(x, d\zeta) = |\zeta|^2 \nu(x, d\zeta)$. Since $\nu(x, d\zeta) = 0$ for all $x$ outside a compact set $Q$, we have $\int_{\mathbb{R}^k} |\zeta|^2 \nu(x, d\zeta) \le c_{LG} \sup_{x \in Q} (1 + |x|^2) < \infty$. Thus the second term on the right-hand side of (4.3) tends to zero uniformly as $n \to \infty$. To bound the first term, write
$$\int_{\mathbb{R}^k} (1 - \varphi_n(|\zeta|))\, \tilde\nu(x, d\zeta) \le \int_{\mathbb{R}^k} \psi_n(|\zeta|)\, \tilde\nu(x, d\zeta) + \int_{\mathbb{R}^k} 1_{\{|\zeta| \ge n/2\}}\, \tilde\nu(x, d\zeta), \tag{4.4}$$
where $\psi_n = (1 - \varphi_n) 1_{[0, 2 n^{-1}]}$ is continuous and supported on $[0, 2 n^{-1}]$.
We bound the two terms on the right-hand side of (4.4) separately.

First, by assumption, $x \mapsto \tilde\nu(x, d\zeta)$ is continuous from $\mathbb{R}^d$ to $M_+(\mathbb{R}^k)$. Moreover, $\tilde\nu(x, d\zeta)$ is zero for $x$ outside a compact set $Q$. Thus the set $\mathcal{P} = \{\tilde\nu(x, d\zeta) : x \in \mathbb{R}^d\} = \{\tilde\nu(x, d\zeta) : x \in Q\}$ is a compact subset of $M_+(\mathbb{R}^k)$, being a continuous image of a compact set. Therefore $\mathcal{P}$ is tight, so that
$$\sup_{x \in \mathbb{R}^d} \int_{\mathbb{R}^k} 1_{\{|\zeta| \ge n/2\}}\, \tilde\nu(x, d\zeta) = \sup_{\mu \in \mathcal{P}} \mu\big( B(0, n/2)^c \big) \to 0, \quad n \to \infty. \tag{4.5}$$
Next, we claim that
$$\limsup_{n \to \infty} \sup_{x \in \mathbb{R}^d} \int_{\mathbb{R}^k} \psi_n(|\zeta|)\, \tilde\nu(x, d\zeta) = 0. \tag{4.6}$$
Let $v$ denote the limsup in (4.6). For each $n$, $x \mapsto \int_{\mathbb{R}^k} \psi_n(|\zeta|)\, \tilde\nu(x, d\zeta)$ is continuous and supported on $Q$, hence maximized at some $x_n \in Q$. After passing to a subsequence, we have $x_n \to \bar x$ for some $\bar x \in Q$, and $\int_{\mathbb{R}^k} \psi_n(|\zeta|)\, \tilde\nu(x_n, d\zeta) \to v$. By the choice of $\varphi_n$, we have $\psi_{n+1} \le \psi_n$ for all $n$. As a result, for each fixed $m$,
$$v \le \lim_{n \to \infty} \int_{\mathbb{R}^k} \psi_m(|\zeta|)\, \tilde\nu(x_n, d\zeta) = \int_{\mathbb{R}^k} \psi_m(|\zeta|)\, \tilde\nu(\bar x, d\zeta).$$
This tends to zero as $m \to \infty$ by dominated convergence, since $\tilde\nu(\bar x, \{0\}) = 0$. Thus $v = 0$, that is, (4.6) holds. Combining (4.4), (4.5), and (4.6), it follows that also the first term on the right-hand side of (4.3) tends to zero uniformly as $n \to \infty$. This gives (iii) and completes the proof of the lemma.

We can now complete the proof of existence of weak $L^p$ solutions.

Proof of Theorem 1.2.
Consider the kernels $\nu^n(x, d\zeta)$ and corresponding triplets $(b^n, 0, \nu^n)$ given by Lemma 4.2. Apply the basic existence result Lemma 4.1 with each kernel $\nu^n(x, d\zeta)$ and the given $g_0$ and $K$ to obtain processes $(X^n, Z^n)$. Note that the differential characteristics of $Z^n$ with respect to the "truncation function" $\chi(\zeta) = \zeta$ are $b^n(X^n)$, $a^n(X^n) = 0$, $\nu^n(X^n, d\zeta)$. Thus $(X^n, Z^n)$ is a weak $L^p$ solution of (1.1) for the data $(g_0, K, b^n, 0, \nu^n)$.

The triplets $(b^n, 0, \nu^n)$ satisfy the growth bound in Lemma 4.2(ii) with a common constant $c'_{LG}$. Corollary 1.5 thus implies that the sequence $\{X^n\}_{n \in \mathbb{N}}$ is tight in $L^p_{\rm loc}$. By passing to a subsequence, we assume that $X^n \Rightarrow X$ in $L^p_{\rm loc}$ for some limiting process $X$.

We claim that the sequence $\{Z^n\}_{n \in \mathbb{N}}$ is tight in $D$. To prove this, first note that for any $T \in \mathbb{R}_+$, $m > 0$ and $\varepsilon > 0$, we have
$$P\Big( \int_0^T \int_{\mathbb{R}^k} 1_{\{|\zeta| > m\}}\, \nu^n(X^n_t, d\zeta)\,dt > \varepsilon \Big) \le \frac{1}{m^2 \varepsilon}\, E\Big[ \int_0^T \int_{\mathbb{R}^k} |\zeta|^2 \nu^n(X^n_t, d\zeta)\,dt \Big] \le \frac{1}{m^2 \varepsilon}\, c'_{LG} \Big( T + E\big[ \|X^n\|^2_{L^2(0,T)} \big] \Big).$$
Theorem 1.4 shows that the expectation on the right-hand side is bounded by a constant that does not depend on $n$. Therefore,
$$\lim_{m \to \infty} \sup_{n \in \mathbb{N}} P\Big( \int_0^T \int_{\mathbb{R}^k} 1_{\{|\zeta| > m\}}\, \nu^n(X^n_t, d\zeta)\,dt > \varepsilon \Big) = 0.$$
Furthermore, the increasing process
$$\int_0^t \Big( |b^n(X^n_s)|^2 + \int_{\mathbb{R}^k} |\zeta|^2 \nu^n(X^n_s, d\zeta) \Big)\,ds, \quad t \ge 0, \tag{4.7}$$
is strongly majorized by $c'_{LG} \int_0^\cdot (1 + |X^n_s|^2)\,ds$ in the sense that the difference of the two is increasing; see Jacod and Shiryaev (2003, Definition VI.3.34). The latter process converges weakly to the continuous increasing process $c'_{LG} \int_0^\cdot (1 + |X_s|^2)\,ds$. Thus (4.7) is tight with only continuous limit points; see Jacod and Shiryaev (2003, Proposition VI.3.35). With these observations we may now apply Jacod and Shiryaev (2003, Theorem VI.4.18 and Remark VI.4.20(2)) to conclude that $\{Z^n\}_{n \in \mathbb{N}}$ is tight in $D$.

Finally, by passing to a further subsequence, we now have $(X^n, Z^n) \Rightarrow (X, Z)$ in $L^p_{\rm loc} \times D$ for some limiting process $(X, Z)$. An application of Theorem 1.6 then shows that $(X, Z)$ is a weak $L^p$ solution of (1.1) for the data $(g_0, K, b, a, \nu)$, as desired. The proof of Theorem 1.2 is complete.

5 Uniqueness of weak $L^p$ solutions

We now turn to pathwise uniqueness and uniqueness in law under suitable Lipschitz conditions.

Let $(X, Z)$ be a weak $L^p$ solution of (1.1) for the data $(g_0, K, b, a, \nu)$, where $\int_{\mathbb{R}^k} |\zeta|^2 \nu(x, d\zeta) < \infty$. The characteristics are understood with respect to the "truncation function" $\chi(\zeta) = \zeta$. Standard representation theorems for semimartingales allow us to express $Z$ as a stochastic integral with respect to time, Brownian motion, and a compensated Poisson random measure; see Jacod and Protter (2011, Theorem 2.1.2) and El Karoui and Lepeltier (1977); Lepeltier and Marchal (1976). It follows that $X$ satisfies a $d$-dimensional stochastic Volterra equation of the form
$$X_t = g_0(t) + \int_0^t K(t-s)\, b(X_s)\,ds + \int_0^t K(t-s)\, \sigma(X_s)\,dW_s + \int_{[0,t) \times \mathbb{R}^m} K(t-s)\, \gamma(X_s, \xi)\, \big( \mu(ds, d\xi) - F(d\xi)\,ds \big), \quad P \otimes dt\text{-a.e.} \tag{5.1}$$
for some $d'$-dimensional Brownian motion $W$, Poisson random measure $\mu$ on $\mathbb{R}_+ \times \mathbb{R}^m$ with compensator $dt \otimes F(d\xi)$, and some measurable functions $\sigma : \mathbb{R}^d \to \mathbb{R}^{k \times d'}$ and $\gamma : \mathbb{R}^d \times \mathbb{R}^m \to \mathbb{R}^k$ such that
$$a(x) = \sigma(x) \sigma(x)^\top \quad \text{and} \quad \nu(x, B) = \int_{\mathbb{R}^m} 1_B(\gamma(x, \xi))\, F(d\xi).$$
Both $W$ and $\mu$ are defined on some extension $(\Omega, \mathcal{F}, \mathbb{F}, P)$ of the filtered probability space where $X$ and $Z$ are defined.

Conversely, given $(g_0, K, b, \sigma, \gamma, F)$ along with a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, P)$ equipped with a $d'$-dimensional Brownian motion $W$ and Poisson random measure $\mu$ on $\mathbb{R}_+ \times \mathbb{R}^m$ with compensator $dt \otimes F(d\xi)$, a solution of (5.1) is any predictable process $X$ on $(\Omega, \mathcal{F}, \mathbb{F}, P)$ with trajectories in $L^p_{\rm loc}$ such that (5.1) holds. We are now in position to define pathwise uniqueness for such solutions.

Definition 5.1.
Fix $(g_0, K, b, \sigma, \gamma, F)$ as above. We say that pathwise uniqueness holds for (5.1) if for any $(\Omega, \mathcal{F}, \mathbb{F}, P)$, $W$, $\mu$ as above and any two solutions $X$ and $Y$ of (5.1), we have $X = Y$, $P \otimes dt$-a.e.

The powerful abstract machinery of Kurtz (2014) can be used in this setting to relate pathwise uniqueness and weak existence to strong existence and uniqueness in law. A strong solution of (5.1) in the sense of Kurtz (2014, Definition 1.2) is a weak $L^p$ solution $X$ which is $P \otimes dt$-a.e. equal to a Borel measurable function of $W$ and $N = \int_0^\cdot \xi\, ( \mu(d\xi, ds) - F(d\xi)\,ds )$ from (5.1).

Theorem 5.2.
The following are equivalent:

(i) There exists a weak $L^p$ solution of (1.1), and pathwise uniqueness holds for (5.1).

(ii) There exists a strong solution of (5.1), and joint uniqueness in law of $(X, W, N)$ holds.

Proof. Let $S_1 = L^p_{\rm loc}$ and $S_2 = D \times D$. Then the statement follows from Kurtz (2014, Theorem 1.5 and Lemma 2.10). Indeed, Kurtz (2014, Lemma 2.10) clarifies that our notion of pathwise uniqueness is equivalent to the one used in Kurtz (2014, Theorem 1.5). Note that the definitions in Kurtz (2014, Definition 1.4, Definition 2.9) have to be adapted to replace $P$-a.s. assertions by $P \otimes dt$-a.e. assertions.

As for standard SDEs, pathwise uniqueness holds under Lipschitz conditions on the coefficients.

Theorem 5.3.
Let $K \in L^2_{\rm loc}$ and suppose there exists a constant $c_{\rm Lip}$ such that $b, \sigma, \gamma, F$ in (5.1) satisfy
$$|b(x) - b(y)|^2 + |\sigma(x) - \sigma(y)|^2 + \int_{\mathbb{R}^m} |\gamma(x, \xi) - \gamma(y, \xi)|^2 F(d\xi) \le c_{\rm Lip} |x - y|^2$$
for all $x, y \in \mathbb{R}^d$. Then pathwise uniqueness holds for (5.1), and hence also uniqueness in law of weak $L^p$ solutions of (1.1).

Proof. The argument is similar to the proof of (1.6), so we only give a sketch. Let $X$ and $Y$ be two solutions of (5.1) with trajectories in $L^2_{\rm loc}$. Define $\tau_n = \inf\{t : \int_0^t (|X_s|^2 + |Y_s|^2)\,ds \ge n\} \wedge T$ as well as $X^n_t = X_t 1_{\{t < \tau_n\}}$ and $Y^n_t = Y_t 1_{\{t < \tau_n\}}$. As in the proof of (1.6), but relying on the Lipschitz assumption rather than linear growth, one shows that
$$E\big[ \|X^n - Y^n\|^2_{L^2(0,T)} \big] \le c \int_0^T \int_0^t |K(t-s)|^2\, E\big[ |X^n_s - Y^n_s|^2 \big]\,ds\,dt$$
for all $T \ge 0$ and some constant $c = c(T, c_{\rm Lip}) < \infty$ that depends continuously on $T$ and $c_{\rm Lip}$. Multiple changes of variables and applications of Tonelli's theorem then show that $f_n(t) = E[ \|X^n - Y^n\|^2_{L^2(0,t)} ]$ satisfies the convolution inequality $f_n(t) \le -(\hat K * f_n)(t)$ on $[0,T]$ with $\hat K = -c(T, c_{\rm Lip}) |K|^2$. The Gronwall lemma for convolution inequalities (see Lemma A.2) yields $f_n(T) \le 0$, and monotone convergence gives $f_n(T) \to E[ \|X - Y\|^2_{L^2(0,T)} ]$. Thus $E[ \|X - Y\|^2_{L^2(0,T)} ] = 0$, which implies pathwise uniqueness in the sense of Definition 5.1. Uniqueness in law now follows from Theorem 5.2.

6 Path regularity

Solutions $X$ of (1.1) can be very irregular. Consider for example the simple case
$$X_t = \int_{[0,t)} K(t-s)\,dN_s = \sum_{n : t > T_n} K(t - T_n),$$
where $N$ is a standard Poisson process with jump times $T_n$, $n \in \mathbb{N}$. Without further information about $K$, nothing can be said about the path regularity of $X$ beyond measurability. Even with singular but otherwise "nice" kernels such as those in Example 1.3(i), $X$ fails to have càdlàg or even làdlàg trajectories. This is why $L^p$ spaces are useful for the solution theory. Nonetheless, one frequently does have additional information that implies better path regularity.

The following result yields Hölder continuity in many cases, also when the driving semimartingale has jumps. The result relies on a combination of the estimates (1.6)-(1.7) with Sobolev embedding theorems. For any $T > 0$ and $\eta > 0$, we denote by $C^\eta(0,T)$ the space of Hölder continuous functions of order $\eta$ on $[0,T]$. Thus $f \in C^\eta(0,T)$ if
$$\|f\|_{C^\eta(0,T)} = \|f\|_{L^\infty(0,T)} + \sup_{t, s \in [0,T],\ t \ne s} \frac{|f(t) - f(s)|}{|t - s|^\eta} < \infty.$$

Theorem 6.1.
Let $d, k \in \mathbb{N}$, $p \in [2, \infty)$, and consider data $(g_0, K, b, a, \nu)$ as in (D1)–(D3). Assume there exist a constant $\eta \in (0,1)$, a locally bounded function $c_K : \mathbb{R}_+ \to \mathbb{R}_+$, and a constant $c_{LG}$ such that (1.4) and (1.5) hold. Then for any weak $L^p$ solution $X$ of (1.1) the following statements hold:

(i) if $\eta p > 1$, then $X - g_0$ admits a version whose sample paths lie in $C^{(\eta p - 1)/p}(0,T)$ almost surely.

(ii) if $p = 2$ and $\nu \equiv 0$, then $X - g_0$ admits a version whose sample paths lie in $C^\beta(0,T)$ for all $\beta < \eta$ almost surely.

(iii) if $K(0) < \infty$ and if $K - K(0)$ (instead of $K$) satisfies (1.4) with $\eta p > 1$, then $X - g_0$ admits a version with càglàd sample paths.

(iv) without assuming (1.4) and (1.5), but rather that $K$ is differentiable with derivative $K' \in L^2_{\rm loc}$, we have that $X - g_0$ is a semimartingale and thus admits a version with càglàd sample paths.

Proof.
Assertion (i) follows from (1.7) and the Sobolev embedding theorem, see Di Nezza et al. (2012, Theorem 8.2). To prove (ii), one can adapt the proof of Theorem 1.4 to get that (1.6)-(1.7) hold for all $p \ge 2$. Applying Di Nezza et al. (2012, Theorem 8.2) for sufficiently large values of $p$ yields the claimed statement. For (iii), we write
$$X_t - g_0(t) = K(0) Z_{t-} + \int_{[0,t)} \big( K(t-s) - K(0) \big)\,dZ_s.$$
The claimed regularity follows on observing that the first term on the right-hand side is càglàd and that, similarly to (i), the second term admits a version with continuous sample paths. For (iv) one applies a Fubini theorem, see Lemma 3.2, to get that
$$X_t - g_0(t) = K(0) Z_{t-} + \int_0^t \Big( \int_{[0,s)} K'(s-u)\,dZ_u \Big)\,ds.$$
Note that (1.4) is implied by the given assumption on $K$, for any $\eta < 1/p$. This completes the proof.

7 Applications
In this section, we illustrate our results with two applications: scaling limits of Hawkes processes and approximation of stochastic Volterra equations by Markovian semimartingales.
Fix $d, k \in \mathbb{N}$ along with functions $g_0 : \mathbb{R}_+ \to \mathbb{R}^d$, $b : \mathbb{R}^d \to \mathbb{R}^k$, $\Lambda : \mathbb{R}^d \to \mathbb{R}^k_+$, and a kernel $K : \mathbb{R}_+ \to \mathbb{R}^{d \times k}$. We fix $p \ge 2$ and assume that $g_0$ and $K$ lie in $L^p_{\rm loc}$, that $K$ satisfies (1.4) for some $\eta \in (0,1)$ and locally bounded function $c_K$, and that $b$ and $\Lambda$ are continuous and satisfy the linear growth condition
$$|b(y)| + |\Lambda(y)| \le c (1 + |y|), \quad y \in \mathbb{R}^d, \tag{7.1}$$
for some constant $c \in \mathbb{R}_+$. Consider a $k$-dimensional counting process $N$ with no simultaneous jumps, whose intensity vector is given by $\Lambda(Y)$ with $Y$ a $d$-dimensional predictable process with trajectories in $L^p_{\rm loc}$ that satisfies
$$Y_t = g_0(t) + \int_0^t K(t-s)\, b(Y_s)\,ds + \int_{[0,t)} K(t-s)\,dN_s, \quad P \otimes dt\text{-a.e.} \tag{7.2}$$
We call such a process $N$ a generalized nonlinear Hawkes process. The existence of $Y$ and $N$ follows immediately from Theorem 1.2. Indeed, (7.2) is a stochastic Volterra equation of the form (1.1) whose driving semimartingale $Z$ has differential characteristics $b(Y)$, $a(Y) = 0$, and $\nu(Y, d\zeta) = \sum_{i=1}^d \Lambda_i(Y)\, \delta_{e_i}(d\zeta)$, where $e_1, \dots, e_d$ are the canonical basis vectors in $\mathbb{R}^d$.

Example 7.1.
For $k = d$, $b = 0$, and $\Lambda(y) = y$ we obtain a multivariate Hawkes process, and nonlinear Hawkes processes for more general $\Lambda$; see Brémaud and Massoulié (1996); Daley and Vere-Jones (2003); Delattre et al. (2016) and the references there.

We now establish convergence of rescaled generalized nonlinear Hawkes processes toward stochastic Volterra equations with no jump part, as those studied by Abi Jaber et al. (2017). In the following theorem we consider given inputs $g_0$, $K$ as well as $g_0^n$, $K^n$ indexed by $n \in \mathbb{N}$, that satisfy the assumptions described in the beginning of this subsection. We consider a fixed function $\Lambda = (\Lambda_1, \dots, \Lambda_d)$ as above and take $b = -\Lambda$. We continue to assume (7.1) (with $b = -\Lambda$). For each $n$, denote the corresponding generalized nonlinear Hawkes process by $N^n$. Its intensity vector is $\Lambda(Y^n)$, where $Y^n$ satisfies
$$Y^n_t = g_0^n(t) + \int_{[0,t)} K^n(t-s)\,dM^n_s, \qquad M^n_t = N^n_t - \int_0^t \Lambda(Y^n_s)\,ds.$$

Theorem 7.2. For each $n \in \mathbb{N}$, consider a diagonal matrix of rescaling parameters, $\varepsilon^n = \operatorname{diag}(\varepsilon^n_1, \dots, \varepsilon^n_d) \in \mathbb{R}^{d \times d}$. Assume for all $i$ that
$$n (\varepsilon^n_i)^2 \Lambda_i\big( (\varepsilon^n)^{-1} x \big) \le c_i (1 + |x|), \quad x \in \mathbb{R}^d, \tag{7.3}$$
for some constant $c_i > 0$ independent of $n$, and that
$$n (\varepsilon^n_i)^2 \Lambda_i\big( (\varepsilon^n)^{-1} x \big) \to \bar\Lambda_i(x) \tag{7.4}$$
locally uniformly in $x$ for some function $\bar\Lambda : \mathbb{R}^d \to \mathbb{R}^d$. Assume also that

(i) $\varepsilon^n g_0^n(n\,\cdot) \to g_0$ in $L^p_{\rm loc}$,

(ii) $\varepsilon^n K^n(n\,\cdot) (\varepsilon^n)^{-1} \to K$ in $L^p_{\rm loc}$,

(iii) $\varepsilon^n K^n(n\,\cdot) (\varepsilon^n)^{-1}$ satisfy (1.4) with the same $\eta$ and $c_K$ as $K$.

Then the rescaled sequence $(X^n, Z^n)$ given by $X^n_t = \varepsilon^n Y^n_{nt}$, $Z^n_t = \varepsilon^n M^n_{nt}$ is tight in $L^p_{\rm loc} \times D$, and every limit point $(X, Z)$ is a weak $L^p$ solution of
$$X_t = g_0(t) + \int_0^t K(t-s)\,dZ_s, \tag{7.5}$$
where $Z$ admits the representation $Z_t = \int_0^t \sqrt{\operatorname{diag}(\bar\Lambda(X_s))}\,dW_s$ for some $d$-dimensional Brownian motion $W$.

Proof.
One verifies that the rescaled intensity $X^n$ satisfies the equation
$$X^n_t = \varepsilon^n g_0^n(nt) + \int_0^t \varepsilon^n K^n(n(t-s)) (\varepsilon^n)^{-1}\,dZ^n_s,$$
where $Z^n$ has differential characteristics $b^n(X^n) = 0$, $a^n(X^n) = 0$, $\nu^n(X^n, d\zeta)$ with jump kernel given by $\nu^n(x, d\zeta) = \sum_{i=1}^d n \Lambda_i\big( (\varepsilon^n)^{-1} x \big)\, \delta_{\varepsilon^n_i e_i}(d\zeta)$. Here $e_1, \dots, e_d$ are the canonical basis vectors in $\mathbb{R}^d$. The associated operator is given by
$$A^n f(x,z) = \sum_{i=1}^d n \Lambda_i\big( (\varepsilon^n)^{-1} x \big) \Big( f(z + \varepsilon^n_i e_i) - f(z) - \varepsilon^n_i \nabla f(z)^\top e_i \Big),$$
which converges locally uniformly to $\tfrac{1}{2} \operatorname{tr}\big( \operatorname{diag}(\bar\Lambda(x)) \nabla^2 f(z) \big)$ due to (7.4). Consequently, provided $(X^n, Z^n)$ is tight, Theorem 3.4 shows that every limit point $(X, Z)$ is a weak $L^p$ solution of (7.5), where $Z$ has differential characteristics $b(X) = 0$, $a(X) = \operatorname{diag}(\bar\Lambda(X))$, $\nu(X, d\zeta) = 0$. The representation of $Z$ in terms of a Brownian motion is standard. It remains to prove tightness. First, by virtue of (7.3), we have $\int_{\mathbb{R}^d} |\zeta|^2 \nu^n(x, d\zeta) \le c (1 + |x|)$ for all $x \in \mathbb{R}^d$ and some constant $c$. Thus, (1.5) is satisfied uniformly in $n$. Recalling (i) and (iii), Corollary 1.5 yields tightness of $(X^n)_{n \ge 1}$. Tightness of $(Z^n)_{n \ge 1}$ in $D$ is then obtained by reiterating the arguments in the proof of Theorem 1.2 at the end of Section 4. Since marginal tightness implies joint tightness the proof is complete.

Example 7.3. Let $K, g_0$ be as described in the beginning of this subsection and let $\varepsilon^n = \operatorname{diag}(\varepsilon^n_1, \dots, \varepsilon^n_d) \in \mathbb{R}^{d \times d}$ as above. Then the functions $g_0^n$ and $K^n$ given by
$$g_0^n(t) = (\varepsilon^n)^{-1} g_0\Big( \frac{t}{n} \Big), \qquad K^n(t) = (\varepsilon^n)^{-1} K\Big( \frac{t}{n} \Big) \varepsilon^n$$
satisfy (i)–(iii). There are other ways of constructing such kernels, as illustrated in Jaisson and Rosenbaum (2015, 2016) for linear Hawkes processes.

Theorem 7.2 is in the same spirit as the results of Erny et al. (2019), who obtain square-root type processes as limits of mean field interactions of multi-dimensional nonlinear Hawkes processes. The following example provides a concrete specification for the special case of fractional powers, extending results in Jaisson and Rosenbaum (2015, 2016) to nonlinear Hawkes processes.

Example 7.4.
Let β i ∈ (0 , i = 1 , . . . , d , and take Λ( y ) = ( y β , . . . , y β d d ). Let ε n =diag( ε n , . . . , ε nd ) satisfy n ( ε ni ) − β i → ν i for some constants ν i ≥
0. Then (7.3)–(7.4) are satisfied with $\bar\Lambda = \Lambda$. The limiting process $(X, Z)$ produced by Theorem 7.2 takes the form
$$X_t = g_0(t) + \int_0^t K(t-s) \sqrt{\operatorname{diag}\big( \nu_1 |X^1_s|^{\beta_1}, \dots, \nu_d |X^d_s|^{\beta_d} \big)}\,dW_s,$$
where $W$ is a $d$-dimensional Brownian motion.

It is sometimes useful, for example for numerical purposes, to replace a singular kernel with a smooth approximation. Theorem 3.4 can be used to analyze this procedure; see also the stability result of Abi Jaber and El Euch (2019a, Theorem 3.6) for the case without jumps. An approximation scheme that is useful in practice is to consider weighted sums of exponentials.
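To make the idea of a weighted sum of exponentials concrete, the following Python sketch approximates the scalar fractional kernel $t^{\gamma - 1}/\Gamma(\gamma)$, $\gamma \in (0,1)$, using its Laplace representation $t^{\gamma-1}/\Gamma(\gamma) = \int_0^\infty e^{-\lambda t} \mu(d\lambda)$ with $\mu(d\lambda) = \lambda^{-\gamma}\,d\lambda / (\Gamma(\gamma)\Gamma(1-\gamma))$. The quadrature rule (bin masses as weights, bin means as rates, on a geometric grid) is one simple choice made here for illustration and is an assumption of this sketch; the function names and the grid bounds `lam_min`, `lam_max` are hypothetical.

```python
import math


def exponential_sum_weights(gamma, n, lam_min=1e-4, lam_max=1e4):
    """Weights c_i and rates lambda_i (n >= 2 of each) such that
    sum_i c_i * exp(-lambda_i * t) approximates t**(gamma - 1) / Gamma(gamma).
    """
    # Geometric grid 0 = eta_0 < eta_1 = lam_min < ... < eta_n = lam_max.
    edges = [0.0] + [lam_min * (lam_max / lam_min) ** (i / (n - 1)) for i in range(n)]
    norm = math.gamma(gamma) * math.gamma(1.0 - gamma)
    weights, rates = [], []
    for lo, hi in zip(edges, edges[1:]):
        # Integral of lambda**(-gamma) over the bin (unnormalized mass of mu).
        mass = (hi ** (1.0 - gamma) - lo ** (1.0 - gamma)) / (1.0 - gamma)
        # Integral of lambda**(1 - gamma) over the bin (unnormalized first moment).
        first = (hi ** (2.0 - gamma) - lo ** (2.0 - gamma)) / (2.0 - gamma)
        weights.append(mass / norm)   # c_i = mu(bin)
        rates.append(first / mass)    # lambda_i = mean of mu over the bin
    return weights, rates


def kernel_approx(t, weights, rates):
    """Evaluate the sum of exponentials at time t > 0."""
    return sum(c * math.exp(-lam * t) for c, lam in zip(weights, rates))


def fractional_kernel(t, gamma):
    """Exact fractional kernel t**(gamma - 1) / Gamma(gamma)."""
    return t ** (gamma - 1.0) / math.gamma(gamma)


weights, rates = exponential_sum_weights(gamma=0.6, n=80)
for t in (0.1, 1.0, 10.0):
    print(t, kernel_approx(t, weights, rates), fractional_kernel(t, 0.6))
```

By Jensen's inequality each bin contributes at most its exact share, so this particular rule approximates the kernel from below. The approximation degrades for $t$ much smaller than $1/\texttt{lam\_max}$, where the singularity at zero cannot be captured by finitely many exponentials.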
Theorem 7.5.
Fix $d, k \in \mathbb{N}$, $p \ge 2$ and $(g_0, K, b, a, \nu)$ as in (D1)–(D3), and assume (1.5) holds. For each $n \in \mathbb{N}$, let $g_0^n \in L^p_{\rm loc}$ and consider the kernel
$$K^n(t) = \sum_{i=1}^n c^n_i e^{-\lambda^n_i t}$$
for some $c^n_i \in \mathbb{R}^{d \times k}$ and $\lambda^n_i \ge 0$, $i = 1, \dots, n$. By Example 1.3(ii) and Theorem 1.2 there exists a weak $L^p$ solution $(X^n, Z^n)$ for the data $(g_0^n, K^n, b, a, \nu)$. Moreover, $X^n$ admits the representation
$$X^n_t = g_0^n(t) + \sum_{i=1}^n c^n_i Y^{n,i}_t, \qquad dY^{n,i}_t = -\lambda^n_i Y^{n,i}_t\,dt + dZ^n_t, \quad Y^{n,i}_0 = 0, \quad i = 1, \dots, n.$$
Assume in addition that

(i) $g_0^n \to g_0$ in $L^p_{\rm loc}$,

(ii) $K^n \to K$ in $L^p_{\rm loc}$,

(iii) $K^n$ satisfy (1.4) with the same $\eta$ and $c_K$ as $K$.

Then $(X^n, Z^n)_{n \ge 1}$ is tight in $L^p_{\rm loc} \times D$, and every limit point $(X, Z)$ is a weak $L^p$ solution of (1.1) for the data $(g_0, K, b, a, \nu)$.

Proof. Defining $Y^{n,i}_t = \int_0^t e^{-\lambda^n_i (t-s)}\,dZ^n_s$, the representation of $X^n$ follows from Itô's formula. Corollary 1.5 yields tightness of $(X^n)_{n \ge 1}$. Tightness of $(Z^n)_{n \ge 1}$ in $D$ is then obtained by reiterating the arguments in the proof of Theorem 1.2 at the end of Section 4. The claimed convergence follows from Theorem 3.4.

Remark 7.6. If $K$ is the Laplace transform of a $\mathbb{R}^{d \times d}$-valued measure $\mu$,
$$K(t) = \int_{\mathbb{R}_+} e^{-\lambda t} \mu(d\lambda), \quad t > 0,$$
then $K$ can indeed be approximated by weighted sums of exponentials. Constructions of such weighted sums are given by Abi Jaber and El Euch (2019a).

A Auxiliary results
We occasionally use the following version of Young's inequality on subintervals. It uses the convolution notation $(f * g)(t) = \int_0^t f(t-s) g(s)\,ds$.

Lemma A.1.
Fix $T \in \mathbb{R}_+$ and $p, q, r \in [1, \infty]$ with $p^{-1} + q^{-1} = r^{-1} + 1$. For any matrix-valued measurable functions $f, g$ on $[0,T]$ of compatible size, one has the Young type inequality
$$\|f * g\|_{L^r(0,T)} \le \|f\|_{L^p(0,T)} \|g\|_{L^q(0,T)}.$$

Proof. This follows from the Young inequality for convolutions on the whole real line applied to the functions $|f| 1_{[0,T]}$ and $|g| 1_{[0,T]}$ that equal $|f(t)|$ and $|g(t)|$ for $t \in [0,T]$ and zero elsewhere.

For ease of reference, we give the following well-known Gronwall type lemma for convolution inequalities; see Gripenberg et al. (1990, Lemma 9.8.2) for the case of non-convolution kernels.

Lemma A.2.
Let $T \in \mathbb{R}_+$ and suppose $f, g, k \in L^1(0, T)$. Assume $k$ has a nonpositive resolvent $r \le 0$. If $f \le g - k * f$, then $f \le g - r * g$.

Proof. Write $f + k * f = g - h$ for $h \ge 0$. By the definition of resolvent, one then has
\[
f = (g - h) - r * (g - h) = g - r * g - h + r * h \le g - r * g,
\]
where the inequality holds because $h \ge 0$ and $r \le 0$, so that $r * h \le 0$.

Lemma A.3. Let $p \in [2, \infty)$. Consider a convolution kernel $K \in L^p_{\mathrm{loc}}$ and a characteristic triplet $(b, a, \nu)$ satisfying (1.2). Let $X$ be a predictable process with trajectories in $L^p_{\mathrm{loc}}$, and let $Z$ be an Itô semimartingale whose differential characteristics (with respect to some given truncation function $\chi$) are $b(X)$, $a(X)$, $\nu(X, d\zeta)$. Then for almost every $t \in \mathbb{R}_+$, the stochastic integral $\int_{[0,t)} K(t-s) \, dZ_s$ is well-defined.

Proof. Define $\kappa(x) = |b(x)| + |a(x)| + \int_{\mathbb{R}^k} (1 \wedge |\zeta|^2) \, \nu(x, d\zeta)$ and set $\tau_n = \inf\{ t : \int_0^t |X_s|^p \, ds > n \}$. Due to the bound (1.2) and the definition of $\tau_n$, we have $\int_0^{T \wedge \tau_n} \kappa(X_s) \, ds \le c(T + n)$. Thus, for any $T \in \mathbb{R}_+$, Young's inequality, see Lemma A.1, gives
\[
\int_0^T \left( \int_0^{t \wedge \tau_n} |K(t-s)|^2 \kappa(X_s) \, ds \right)^{p/2} dt
\le \left( \int_0^T |K(t)|^p \, dt \right) \left( \int_0^{T \wedge \tau_n} \kappa(X_s) \, ds \right)^{p/2}
\le \left( \int_0^T |K(t)|^p \, dt \right) \bigl( c(T + n) \bigr)^{p/2}.
\]
The right-hand side is deterministic; call it $c_n$. Taking expectations and using Tonelli's theorem yields
\[
\int_0^T \mathbb{E}\left[ \left( \int_0^{t \wedge \tau_n} |K(t-s)|^2 \kappa(X_s) \, ds \right)^{p/2} \right] dt \le c_n.
\]
Therefore, for each $n$, there is a nullset $N_n \subset [0, T]$ such that the expectation is finite for all $t \in [0, T] \setminus N_n$. The union $N = \bigcup_n N_n$ is still a nullset, and for each $t \in [0, T] \setminus N$,
\[
\int_0^{t \wedge \tau_n} |K(t-s)|^2 \kappa(X_s) \, ds < \infty \quad \text{for all } n, \ \mathbb{P}\text{-a.s.}
\]
Since $X$ has trajectories in $L^p_{\mathrm{loc}}$, we have $\tau_n \to \infty$. We infer that, for each $t \in [0, T] \setminus N$, $\int_0^t |K(t-s)|^2 \kappa(X_s) \, ds < \infty$, $\mathbb{P}$-a.s. This implies that the random variable $\int_{[0,t)} K(t-s) \, dZ_s$ is well-defined.

References
Eduardo Abi Jaber and Omar El Euch. Multifactor approximation of rough volatility models. SIAM Journal on Financial Mathematics, 10(2):309–349, 2019a.
Eduardo Abi Jaber and Omar El Euch. Markovian structure of the Volterra Heston model. Statistics & Probability Letters, 149:63–72, 2019b.
Eduardo Abi Jaber, Martin Larsson, and Sergio Pulido. Affine Volterra processes. Annals of Applied Probability, to appear, 2017.
Ole E Barndorff-Nielsen and Jürgen Schmiegel. Time change, volatility, and turbulence. In Mathematical Control Theory and Finance, pages 29–53. Springer, 2008.
Ole E Barndorff-Nielsen, Fred Espen Benth, Almut ED Veraart, et al. Modelling energy spot prices by volatility modulated Lévy-driven Volterra processes. Bernoulli, 19(3):803–845, 2013.
Andreas Basse and Jan Pedersen. Lévy driven moving averages and semimartingales. Stochastic Processes and their Applications, 119(9):2970–2991, 2009.
Fred Espen Benth, Nils Detering, and Paul Kruehner. Stochastic Volterra integral equations and a class of first order stochastic partial differential equations. arXiv preprint arXiv:1903.05045, 2019.
Marc A. Berger and Victor J. Mizel. Volterra equations with Itô integrals. I. J. Integral Equations, 2(3):187–245, 1980. ISSN 0163-5549.
Patrick Billingsley. Convergence of probability measures. Wiley Series in Probability and Statistics: Probability and Statistics. John Wiley & Sons, Inc., New York, second edition, 1999. ISBN 0-471-19745-9. doi: 10.1002/9780470316962. URL http://dx.doi.org/10.1002/9780470316962. A Wiley-Interscience Publication.
Pierre Brémaud and Laurent Massoulié. Stability of nonlinear Hawkes processes. The Annals of Probability, pages 1563–1588, 1996.
Haim Brezis. Functional analysis, Sobolev spaces and partial differential equations. Springer Science & Business Media, 2010.
Zdzisław Brzeźniak and Jerzy Zabczyk. Regularity of Ornstein–Uhlenbeck processes driven by a Lévy white noise. Potential Analysis, 32(2):153–188, 2010.
Zdzisław Brzeźniak, Szymon Peszat, and Jerzy Zabczyk. Continuity of stochastic convolutions. Czechoslovak Mathematical Journal, 51(4):679–684, 2001.
Patrick Cheridito, Damir Filipović, and Marc Yor. Equivalent and absolutely continuous measure changes for jump-diffusion processes. Annals of Applied Probability, pages 1713–1732, 2005.
Laure Coutin and Laurent Decreusefond. Stochastic Volterra equations with singular kernels. In Stochastic Analysis and Mathematical Physics, volume 50 of Progr. Probab., pages 39–50. Birkhäuser Boston, Boston, MA, 2001.
Christa Cuchiero and Josef Teichmann. Generalized Feller processes and Markovian lifts of stochastic Volterra processes: the affine case. arXiv preprint arXiv:1804.10450, 2018.
Christa Cuchiero and Josef Teichmann. Markovian lifts of positive semidefinite affine Volterra type processes. arXiv preprint arXiv:1907.01917, 2019.
Daryl J Daley and David Vere-Jones. An introduction to the theory of point processes. Vol. I. Probability and its Applications, 2003.
Sylvain Delattre, Nicolas Fournier, Marc Hoffmann, et al. Hawkes processes on large networks. The Annals of Applied Probability, 26(1):216–261, 2016.
Eleonora Di Nezza, Giampiero Palatucci, and Enrico Valdinoci. Hitchhiker's guide to the fractional Sobolev spaces. Bulletin des Sciences Mathématiques, 136(5):521–573, 2012.
Omar El Euch and Mathieu Rosenbaum. The characteristic function of rough Heston models. Mathematical Finance, 29(1):3–38, 2019. doi: 10.1111/mafi.12173. URL https://doi.org/10.1111/mafi.12173.
Nicole El Karoui and Jean-Pierre Lepeltier. Représentation des processus ponctuels multivariés à l'aide d'un processus de Poisson. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 39(2):111–133, 1977.
Xavier Erny, Eva Löcherbach, and Dasha Loukianova. Mean field limits for interacting Hawkes processes in a diffusive regime. arXiv preprint arXiv:1904.06985, 2019.
Stewart N Ethier and Thomas G Kurtz. Markov Processes: Characterization and Convergence. Wiley Series in Probability and Statistics. Wiley, 2005.
Franco Flandoli and Dariusz Gatarek. Martingale and stationary solutions for stochastic Navier–Stokes equations. Probability Theory and Related Fields, 102(3):367–391, 1995.
Jim Gatheral and Martin Keller-Ressel. Affine forward variance models. Finance and Stochastics, 23(3):501–533, 2019.
Jim Gatheral, Thibault Jaisson, and Mathieu Rosenbaum. Volatility is rough. Quantitative Finance, 18(6):933–949, 2018. doi: 10.1080/14697688.2017.1393551. URL https://doi.org/10.1080/14697688.2017.1393551.
Gustaf Gripenberg, Stig-Olof Londen, and Olof Staffans. Volterra integral and functional equations, volume 34 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1990. ISBN 0-521-37289-5. doi: 10.1017/CBO9780511662805. URL http://dx.doi.org/10.1017/CBO9780511662805.
Jean Jacod and Philip Protter. Discretization of processes, volume 67. Springer Science & Business Media, 2011.
Jean Jacod and Albert N. Shiryaev. Limit Theorems for Stochastic Processes, volume 288 of Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, Berlin, second edition, 2003.
Thibault Jaisson and Mathieu Rosenbaum. Limit theorems for nearly unstable Hawkes processes. The Annals of Applied Probability, 25(2):600–631, 2015.
Thibault Jaisson and Mathieu Rosenbaum. Rough fractional diffusions as scaling limits of nearly unstable heavy tailed Hawkes processes. The Annals of Applied Probability, 26(5):2860–2882, 2016.
Thomas Kurtz. Weak and strong solutions of general stochastic models. Electronic Communications in Probability, 19(58):1–16, 2014.
Jean-Pierre Lepeltier and Bernard Marchal. Problème des martingales et équations différentielles stochastiques associées à un opérateur intégro-différentiel. In Annales de l'IHP Probabilités et statistiques, volume 12, pages 43–103, 1976.
Carlo Marinelli and Michael Röckner. On maximal inequalities for purely discontinuous martingales in infinite dimensions. In Séminaire de Probabilités XLVI, pages 293–315. Springer, 2014.
Tina Marquardt. Fractional Lévy processes with an application to long memory moving average processes. Bernoulli, 12(6):1099–1126, 2006.
Leonid Mytnik and Eyal Neuman. Sample path properties of Volterra processes. arXiv preprint arXiv:1101.4969, 2011.
Leonid Mytnik and Thomas S. Salisbury. Uniqueness for Volterra-type stochastic integral equations. arXiv preprint arXiv:1502.05513, 2015.
Aleksandr Aleksandrovich Novikov. On discontinuous martingales. Theory of Probability & Its Applications, 20(1):11–26, 1975.
Philip Protter. Volterra equations driven by semimartingales. Ann. Probab., 13(2):519–530, 1985. ISSN 0091-1798. URL http://links.jstor.org/sici?sici=0091-1798(198505)13:2<519:VEDBS>2.0.CO;2-3&origin=MSN.
Philip E. Protter. Stochastic Integration and Differential Equations, volume 21 of Stochastic Modelling and Applied Probability. Springer-Verlag, Berlin, 2005. Second edition. Version 2.1, Corrected third printing.
Jan Rosinski. On path properties of certain infinitely divisible processes. Stochastic Processes and their Applications, 33(1):73–87, 1989.
Zhidong Wang. Existence and uniqueness of solutions to stochastic Volterra equations with singular kernels and non-Lipschitz coefficients. Statistics & Probability Letters, 78(9):1062–1071, 2008. doi: 10.1016/j.spl.2007.10.007.
Xicheng Zhang. Stochastic Volterra equations in Banach spaces and stochastic partial differential equation. J. Funct. Anal., 258(4):1361–1425, 2010. ISSN 0022-1236. doi: 10.1016/j.jfa.2009.11.006. URL http://dx.doi.org/10.1016/j.jfa.2009.11.006.