Rough semimartingales and $p$-variation estimates for martingale transforms

Abstract

We establish a new scale of $p$-variation estimates for martingale paraproducts, martingale transforms, and Itô integrals, of relevance in rough paths theory, stochastic, and harmonic analysis. As an application, we introduce rough semimartingales, a common generalization of classical semimartingales and (controlled) rough paths, and their integration theory.


PETER FRIZ AND PAVEL ZORIN-KRANICH

arXiv preprint [math.PR]

Contents

1. Statement of main results
1.1. Itô integral
1.2. Rough integrators
1.3. Rough semimartingales
1.4. Differential equations
2. Vector-valued estimates in discrete time
2.1. Davis decomposition
2.2. Vector-valued BDG inequality
2.3. Vector-valued maximal paraproduct estimate
2.4. Branched rough paths
3. Variational estimates in discrete time
3.1. Stopping time construction
3.2. Sewing lemma
3.3. Discrete sums corresponding to Itô integrals
3.4. Discrete sums arising in Itô integration of branched rough paths
4. Estimates for the Itô integral
4.1. Itô integral
4.2. Mesh convergence
5. Quadratic covariation of a controlled process and a martingale
5.1. Variation norm estimate
5.2. Discretization of quadratic covariation
5.3. Integration by parts
5.4. Quadratic covariation of two martingales
6. Consistency of rough and stochastic integration
Appendix A. Hölder estimates for martingale transforms
References

1. Statement of main results

Let $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ be a filtered probability space. An adapted partition $\pi$ is an increasing sequence of stopping times $(\pi_n)_{n \in \mathbb{N}}$ such that $\pi_0 = 0$ and $\lim_{n \to \infty} \pi_n = \infty$. The set of adapted partitions is a directed set with respect to the inclusion relation

$\pi' \subseteq \pi :\iff \{ \pi'_n \mid n \in \mathbb{N} \} \subseteq \{ \pi_n \mid n \in \mathbb{N} \}.$

For a two-parameter process $\Pi = (\Pi_{t,t'})_{0 \le t \le t' < \infty}$ and $p \in (0,\infty)$, the $p$-variation is defined by

(1.1) $V^p \Pi := \sup_{l_{\max},\, u_0 \le \dots \le u_{l_{\max}}} \Big( \sum_{l=1}^{l_{\max}} |\Pi_{u_{l-1},u_l}|^p \Big)^{1/p},$

with the $\ell^p$ norm replaced by the $\ell^\infty$ norm in the case $p = \infty$. For a one-parameter process $f = (f_t)_{t \ge 0}$, the $p$-variation is defined by

$V^p f := V^p(\delta f), \quad (\delta f)_{t,t'} := f_{t'} - f_t.$

The $p$-variation is a monotonically decreasing function of $p$. A classical result about $p$-variation is Lépingle’s inequality [Lep76], which states that, for a càdlàg martingale $g = (g_t)_{t \ge 0}$, we have

(1.2) $\|V^p g\|_{L^q(\Omega)} \lesssim \|V^\infty g\|_{L^q(\Omega)}, \quad 2 < p \le \infty, \quad 1 \le q < \infty.$

Here, $V^\infty g = \sup_{0 \le t < t' < \infty} |g_{t'} - g_t|$ [...] $p_1 > 0$, and $1/p_0 + 1/p_1 > 1$. If $g$ is a martingale, then $V^{p_0} g < \infty$ (locally in time) for any $2 < p_0$ by Lépingle’s inequality (1.2), and so Young’s condition becomes $0 < p_1 < 2$. Under this condition, for $1/r = 1/p_0 + 1/p_1$, we have

(1.7) $\sup_\pi V^r \Pi^\pi(f,g) \lesssim (V^{p_1} f)(V^{p_0} g),$

and this estimate passes to the limit.

For continuous martingales, (1.2) holds for any $0 < q < \infty$, but this will play no rôle for us.

As before, partitions (adaptedness is irrelevant here) form a directed set under $\subseteq$. In fact, since $f, g$ are càdlàg, so that $f_-$ and $g$ have no common discontinuities, one can get convergence also in the stronger mesh sense:

$\pi' \preceq \pi :\iff \operatorname{mesh}(\pi) \le \operatorname{mesh}(\pi'), \quad \operatorname{mesh}(\pi) := \sup_j \|\pi_j - \pi_{j-1}\|_\infty.$
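The definition (1.1) can be made concrete for a path sampled at finitely many times. The following is a minimal sketch, assuming a real-valued path given as a list and $\Pi = \delta x$; the function names `p_variation` and `sup_variation` are illustrative and not taken from the paper. It evaluates the supremum over all sub-partitions of the sample points by dynamic programming.

```python
from typing import Sequence


def p_variation(path: Sequence[float], p: float) -> float:
    """V^p of a finite real-valued path, following (1.1) with Pi = delta x.

    best[j] holds the largest value of sum_l |x_{u_l} - x_{u_{l-1}}|^p over
    all index chains u_0 < ... < u_l = j, so the total cost is O(n^2).
    """
    if p <= 0:
        raise ValueError("p must be positive")
    best = [0.0] * len(path)
    for j in range(1, len(path)):
        best[j] = max(best[i] + abs(path[j] - path[i]) ** p for i in range(j))
    return max(best, default=0.0) ** (1.0 / p)


def sup_variation(path: Sequence[float]) -> float:
    """V^infty: the largest single increment |x_t' - x_t| with t <= t'."""
    return max((abs(b - a) for i, a in enumerate(path) for b in path[i:]), default=0.0)


# A hypothetical zigzag path; the values of V^p decrease as p grows.
zigzag = [0.0, 1.0, 0.0, 1.0]
values = [p_variation(zigzag, p) for p in (1.0, 2.0, 3.0)]
```

For $p > 1$ the optimal sub-partition need not use every sample point (for the path `[0, 1, 2]` and $p = 2$ the single jump from 0 to 2 is optimal), which is why the sketch searches over all chains rather than summing consecutive increments.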

1.1. Itô integral.

Our first main result extends the estimate (1.7) to the case of Itô integrals with integrands whose variation exponent is $p_1 \ge 2$. The pathwise estimate (1.7) becomes false in this regime, and we have to substitute it with a moment estimate (which follows directly from (1.7), Hölder’s, and Lépingle’s inequalities in the case $p_1 < 2$). Moreover, we replace the increment process $\delta f$ by a general two-parameter process $F$; the motivation for doing so is explained below.

Theorem 1.1.

Let < q ≤ ∞ , ≤ q < ∞ , and < r, p ≤ ∞ . Suppose (1.8) /r < /p + 1 / /p i, + 1 /p i, , /q = 1 /q + 1 /q . Let ( F s,t ) s ≤ t be a càdlàg adapted process and ( g t ) a càdlàg martingale. Suppose thatthere exist càdlàg adapted processes F i , ˜ F i , i ∈ { , . . . , i max } , i max ∈ N , such that (1.9) F s,u − F t,u = i max X i =1 F is,t ˜ F it,u , s ≤ t ≤ u. Then, the following holds. (1)

For every adapted partition π , we have the estimate (cid:13)(cid:13) V r Π π ( F, g ) (cid:13)(cid:13) L q . k V p F ( π ) k L q k V ∞ g k L q + X i k V p i, F i, ( π ) · V p i, Π π ( ˜ F i , g ) k L q . (1.10)(2) For every i , let /q = 1 /q i, + 1 /q i, , and suppose that F i = lim π F i, ( π ) in L q i, ( V p i, ) , (1.11) Π( ˜ F i , g ) = lim π Π π ( ˜ F i , g ) exists in L q i, ( V p i, ) , (1.12) and ˜ F i ∈ L q ( V ∞ ) . Suppose that the right-hand side of (1.14) is ﬁnite. Then (1.13) Π( F, g ) := lim π Π π ( F, g ) exists as the limit of a Cauchy net in L q (Ω , V r ) , satisﬁes the bound (cid:13)(cid:13) V r Π( F, g ) (cid:13)(cid:13) L q . k V p F k L q k V ∞ g k L q + X i k V p i, F i · V p i, Π( ˜ F i , g ) k L q , (1.14) and, for any ≤ t ≤ t ′ ≤ t ′′ < ∞ , Chen’s relation(1.15) Π( F, g ) t,t ′′ = Π( F, g ) t,t ′ + Π( F, g ) t ′ ,t ′′ + X i F it,t ′ Π( ˜ F i , g ) t ′ ,t ′′ . The limit (1.13) is the Itô integral, which can also be denoted by(1.16) Π( F, g ) t,t ′ = Z ( t,t ′ ] F t,u − dg u . The hypothesis (1.11) is easily verﬁed if F i satisﬁes a structural hypothesis similar to(1.9) for F , see Lemma 4.1. The hypothesis (1.12) is typically obtained by recursiveapplication of Theorem 1.1.1.1.1. Relation to previous works.

In the case $F \equiv 1$, we have $\Pi^\pi(F, g) = \delta g$ for any adapted partition $\pi$. Moreover, the right-hand side of (1.9) is an empty sum in this case, so that Theorem 1.1 boils down to Lépingle’s inequality (1.2). Our argument has its roots in the approach to Lépingle’s inequality given in [Bou89; PX88]; we also refer to [Zor20] for a short self-contained exposition of this case.

If $F = \delta f$ are the differences of a càdlàg process $f$, then $F_{s,u} - F_{t,u} = (\delta f)_{s,t} = F^1_{s,t} \cdot \tilde F^1_{t,u}$ with $F^1 = \delta f$ and $\tilde F^1_{s,t} \equiv 1$. The convergence hypotheses (1.11) and (1.12) are witnessed by the stopping construction in Lemma 4.1. Since $\Pi(\tilde F^1, g) = \delta g$ and by Lépingle’s inequality (1.2) for $g$, the estimate (1.14) becomes

(1.17) $\|V^r \Pi(\delta f, g)\|_{L^q} \lesssim \|V^{p_1}(\delta f)\|_{L^{q_1}} \|V^\infty g\|_{L^{q_0}}.$

If $f$ is also a martingale, $1 \le q_1 < \infty$, and $r > 1$, then, taking $p_1 = 2+$ and using Lépingle’s inequality (1.2) for $f$, the estimate (1.17) implies

(1.18) $\|V^r \Pi(\delta f, g)\|_{L^q} \lesssim \|V^\infty f\|_{L^{q_1}} \|V^\infty g\|_{L^{q_0}}.$

In this case, the object $\Pi(\delta f, g)$ is analogous to so-called paraproducts in harmonic analysis. For paraproducts, an estimate of the form (1.18) was first proved in [DMT12], motivated by an application of rough path theory in time-frequency analysis [DMT17, Corollary 1.2].

The estimate (1.18) is of interest because it shows that, for a (multidimensional) martingale $X$, the pair $(X, \Pi(X, X))$ is almost surely a rough path. For continuous martingales, the estimate (1.18) was proved in [FV06] (in the diagonal case $q_0 = q_1$). For càdlàg martingales, the estimate (1.18) was proved in [CF19] (in the diagonal case $q_0 = q_1$) and in [KZ19] (for general $q_0, q_1 > 1$).

For non-martingale integrands $f$, the estimate (1.17) is new. One of the motivations for considering this case is the construction of joint rough path lifts of rough paths and martingales, see Theorem 1.2 below, which underlies our notion of rough semimartingale. Another motivation, see e.g. [CL05] and [FV10a, Ch.14], is the analytic stability of Itô integrals of the form $\int \varphi(f)\,dg$, with sufficiently regular $\varphi$, as a function of $f$. A weaker version of the estimate (1.17), which does not respect the Hölder scaling condition on $q$, was proved in the case $q_0 = q_1 = 2$ in [DOP19, Proposition 3.13] and used to establish invariance principles of random walks in random environments in rough path topology.

Although of no direct interest in rough paths, we note that the case $p_1 = \infty$, $r = 2+$ of (1.17) is a consequence of Lépingle’s inequality applied to the martingales $(\int_0^t f_{u-}\,dg_u)_t$ and $g$.
However, the approach via Theorem 1.1 is still preferable inthis case, since it provides a construction of the Itô integral R f u − d g u that natu-rally comes with variation norm estimates. We further elaborate on this point ofview in Section 4.2, where we deduce the classical convergence results for discreteapproximations to the Itô integral with respect to càdlàg local martingales ( M loc )from Theorem 1.1. At this point, the ability to take q = 1 , missing in [KZ19], isimportant, see Lemma 4.4.The estimate (1.14) for processes F that are not of the increment form is usefulfor the construction of Itô branched rough paths , see Section 3.4. For instance, if f ∈ L q ( V p ) with p ≥ , then the information R δf − d g is not suﬃcient for roughpath theory, and more stochastic building blocks have to be included. Theorem 1.1shows, for instance, that R ( δf − ) d g has variational exponent r = 1 / (2 /p + 1 / − .Note that one can choose r < iﬀ p < which, in that case, reﬂects redundancy of R ( δf − ) d g from a rough integration perspective. In harmonic analysis, analogues ofsuch integrals are known as multilinear paraproducts , see e.g. [MTT02; Mus14].Another setting in which two-parameter integrands F are useful is that of con-trolled rough integration, introduced in [Gub04]. The easiest situation is as follows.Let X, Y, Y ′ be càdlàg adapted processes and g a càdlàg martingale. We interpret Y ′ as the Gubinelli derivative of Y with respect to X , so that the remainder term isgiven by(1.19) R ≡ δY − Y ′ δX : ⇐⇒ R s,t ≡ δY s,t − Y ′ s δX s,t . Then(1.20) R s,u − R t,u = δY ′ s,t δX t,u + R s,t · , and Theorem 1.1 implies the estimate (cid:13)(cid:13) V r Π( R, g ) (cid:13)(cid:13) q . (cid:13)(cid:13) V r Y ′ · V / (1 /r +1 / Π( δX, g ) (cid:13)(cid:13) q + (cid:13)(cid:13) V / (1 /r +1 /r ) R (cid:13)(cid:13) q k Sg k q . 
When the $\ell^r$ norm implicit in the left-hand side of this estimate is computed for a given partition $\pi$, this estimate can be interpreted as a bound for the error in a discrete approximation of the controlled integral $\int Y\,dg$.

Such integrands also appear in stochastic numerics, see e.g. [KP92, Ch.5], [GL97, Lem.4.2], or [KN07].
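The algebraic identities used above, namely the remainder identity (1.20) and Chen’s relation (1.15), already hold exactly for discrete left-point sums, so they can be checked numerically. The sketch below is an illustration under stated assumptions: the paths `f`, `g`, `X`, `Y`, `Yp` are arbitrary random sequences on the integer grid (hypothetical data, not from the paper), and `paraproduct` is an illustrative name for the left-point sum $\sum_{s \le j < t} F_{s,j}\,(g_{j+1} - g_j)$.

```python
import random

random.seed(0)
n = 12
# Five hypothetical discrete paths indexed by 0..n (illustrative data only).
f, g, X, Y, Yp = ([random.gauss(0.0, 1.0) for _ in range(n + 1)] for _ in range(5))


def paraproduct(F, s, t):
    """Left-point sum  sum_{s <= j < t} F(s, j) * (g[j+1] - g[j])."""
    return sum(F(s, j) * (g[j + 1] - g[j]) for j in range(s, t))


def delta(path):
    """Increment process (delta path)_{s,t} = path[t] - path[s]."""
    return lambda s, t: path[t] - path[s]


def R(s, t):
    """Remainder (1.19): R_{s,t} = dY_{s,t} - Y'_s dX_{s,t}."""
    return Y[t] - Y[s] - Yp[s] * (X[t] - X[s])


s, t, u = 2, 5, 11

# (1.20): R_{s,u} - R_{t,u} = dY'_{s,t} dX_{t,u} + R_{s,t}
err_remainder = abs(R(s, u) - R(t, u) - ((Yp[t] - Yp[s]) * (X[u] - X[t]) + R(s, t)))

# Chen's relation for F = delta f, where the correction term is df_{s,t} dg_{t,u}
err_chen_increment = abs(
    paraproduct(delta(f), s, u)
    - paraproduct(delta(f), s, t)
    - paraproduct(delta(f), t, u)
    - (f[t] - f[s]) * (g[u] - g[t])
)

# Chen's relation for F = R, with the two correction terms coming from (1.20)
err_chen_remainder = abs(
    paraproduct(R, s, u)
    - paraproduct(R, s, t)
    - paraproduct(R, t, u)
    - (Yp[t] - Yp[s]) * paraproduct(delta(X), t, u)
    - R(s, t) * (g[u] - g[t])
)
```

All three errors vanish up to floating-point rounding. No probabilistic structure is used here; the martingale property of $g$ only enters in the norm estimates, not in these identities.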


Further variants.

Theorem 1.1 continues to hold with all processes being Hilbert space valued, upon replacing all products by tensor products, and the bounds do not depend on the dimensions of the Hilbert spaces.

The limiting variational estimate (1.14) has a precise analogue in Hölder topology, given in Appendix A, which extends and quantifies some previous constructions, notably Diehl et al. [DOR15] and [FH20, Ch.13] (with $g$ taken as Brownian motion). To wit, in these references the Hölder regularity is obtained by some variation of Kolmogorov’s criterion (or Besov-Hölder embedding); the resulting $(1/q)+$-loss on the Hölder exponent (integrability parameter $q$) is avoided in Theorem A.1.

1.2. Rough integrators.

Second main result concerns integrals formally given by Π( g, Y) t,t ′ ≡ Z ( t,t ′ ] ( δg ) t,u − dY u , where g is a martingale and Y is a suitable (rough) càdlàg process. When Y hasﬁnite p -variation sample paths, for p < , use Young’s inequality pathwise, with p > such that /p + 1 /p > max(1 , /r ) , followed by Hölder’s inequality (with q, q , q as in Theorem 1.1) and Lépingle’s estimate (applied to || V p g || L q ) to see(1.21) (cid:13)(cid:13) V r Π( g, Y) (cid:13)(cid:13) L q (Ω) . k V p Y k L q (Ω) k V ∞ g k L q (Ω) . When p ≥ , pathwise arguments fail, and this includes the case when Y is anothercàdlàg martingale (hence dealt with by Theorem 1.1), which requires stochastic ar-guments. For any partition π , of [0 , T ] say, integration by parts for sums gives ( Y T − Y )( g T − g ) − Π π ( Y, g ) ,T = Π π ( g, Y ) ,T + X π j

Y, g ] π . We give an example where [ Y, g ] , hence Π( g, Y) , does not exist. Example . Let g = B , a standard Brownian motion, and Y t := R t ( t − s ) H − / d B ,so that Y = B H is a fractional Brownian motion (fBm) of Hurst parameter H ,of ﬁnite p variation, any p > /H (and no better). Take T = 1 and compute k P π j < ( B Hπ j +1 − − B Hπ j )( B π j +1 − B π j ) k L (Ω) ∼ mesh( π ) H +1 / − , as seen by Itô isom-etry, hence divergent in the rough regime H ∈ (0 , / . (This implies that theItô integral R B H d B has inﬁnite Itô-Stratonovich correction, cf. [FH20, Ch.14,15]for discussion of this example from KPZ type renormalisation perspective.) As aconsequence, R B d B H = lim π Π π ( B, B H ) does not exist.The problem in this example is correlation , and taking Y = X deterministic (orindependent of g ) is a way of ruling out such situations. (This example explains whyindependence of components is a standard assumption for Gaussian rough paths[FV10b]. ) A ﬂexible structural assumption to overcome this problem is to assumefor the (adapted) process Y to be (analytically) close to a deterministic referencepath X . Theorem 1.2.

Let q, q , q be as in Theorem 1.1, < r ≤ ∞ , and < ˆ p < ≤ p ≤ ∞ with /r < / /p . Let X be a deterministic càdlàg path, Y = (

Y, Y ′ ) acàdlàg adapted process, and g a càdlàg martingale. Assume that V ∞ g ∈ L q , M Y ′ := sup t | Y ′ t | ∈ L q , X ∈ V p , V ˆ p R Y ∈ L q , where (1.22) R Y s,t := R Y ,Xs,t := Y t − Y s − Y ′ s ( X t − X s ) , ≤ s ≤ t < ∞ . Then, there exists a process (Π( g, Y) t,t ′ ) ≤ t ≤ t ′ < ∞ with the following properties. For an independent Brownian B ⊥ , existence of R B ⊥ d B H = lim Π( B ⊥ , B H ) π holds in L (Ω) . P. FRIZ AND P. ZORIN-KRANICH (1)

Along deterministic partitions π , we have existence of (1.23) Π( g, Y) ,T = u . c . p . -lim mesh( π ) → Π π ( g, Y ) ,T =: Z T ( δg ) ,t − dY t . (2) We have Chen’s relation (1.24) Π( g, Y) t,t ′′ = Π( g, Y) t,t ′ + Π( g, Y) t ′ ,t ′′ + ( g t ′ − g t )( Y t ′′ − Y t ′ ) . (3) We have the bound (1.25) (cid:13)(cid:13) V r Π( g, Y) (cid:13)(cid:13) L q (Ω) . (cid:16) V p X k M Y ′ k L q (Ω) + k V ˆ p R Y k L q (Ω) (cid:17) k V ∞ g k L q (Ω) . Theorem 1.2 is proved in Section 5.3. The construction of Π( g, Y) is based onthe aforementioned integration by parts identity in combination with constructingquadratic covariation, given as (u.c.p.) limit of [ Y, g ] π (see Deﬁnition 5.2), for everylocal martingale g , identiﬁed explicitly in Theorem 5.4 as(1.26) X s ≤ t ∆ X s Y ′ s − ∆ g s + X s ≤ t ∆ R Y s ∆ g s =: [Y , g ] t , were our notation tracks Y = (

Y, Y ′ ) ↔ ( Y ′ , R Y ) , with X ﬁxed. (Note that, ingeneral, [ Y, Y ] π does not converge.) Again, several remarks are in order. • The exponent p quantiﬁes the variational regularity of both X and Y . Theassumption p ≥ is not essential. Indeed, as noted above, when p < onecan use (pathwise) Young, Hölder, and Lépingle to get the estimate (1.21),from which (1.25), if so desired, is an easy consequence. • The assumption ˆ p < reﬂects the “length” of the expansion Y t ≈ Y s + Y ′ s ( X t − X s ) , familiar from controlled rough path theory (think: ˆ p = p / )although we do not need to control any variation norm of Y ′ here: Theo-rem 1.2 is a stochastic result, and not based on pathwise (sewing) arguments.It is then clear that the condition on ˆ p could be relaxed by suitable higherorder “controllness” assumptions, but we have not pursed this further. • The special case of deterministic X corresponds to ( Y, Y ′ ) = ( X, , R Y = 0 .Take q = ∞ and ≤ q = q < ∞ , so that (1.25) simpliﬁes to(1.27) (cid:13)(cid:13) V r Π( g, X ) (cid:13)(cid:13) L q (Ω) . ( V p X ) k V ∞ g k L q (Ω) . In case of random X , but independent of g , this estimate can be used uponconditioning on X , and immediately gives (cid:13)(cid:13) V r Π( g, X ) (cid:13)(cid:13) L q (Ω) . k V p X k L q (Ω) k V ∞ g k L q (Ω) . The better integrability of the left-hand side, compared to (1.25), is a conse-quence of independence. • U.c.p. convergence as mesh( π ) → in (1.23) fails in general for the two-parameter processes Π π ( g, Y) t,t ′ . In fact, it already fails in the simpler situa-tion of Corollary 4.5, which deals with mesh convergence of discrete approx-imations to Itô integrals.1.3. Rough semimartingales.

Recall that a classical semimartingale $Z = g + Y$, possibly vector valued, is the sum of a càdlàg local martingale $g$ and a càdlàg adapted process $Y \in V^1$. This was generalised, at least in the continuous setting, to Dirichlet processes [Föl81], where the finite variation condition on $Y$ is replaced by vanishing quadratic variation. In a similar spirit, we can define Young semimartingales (YSM) as processes $Z = g + Y$, as above, but now with $Y \in V^{2-}_{\mathrm{loc}}$, meaning $V^p_{\mathrm{loc}}$ for some $p \in [1, 2)$. Although this decomposition need not be unique, the paraproduct $\Pi(Z, \bar Z)_{t,t'} = \int (\delta Z)_{t,u-}\,d\bar Z_u$ is easily seen to be well-defined, essentially as a consequence of Itô and Young integration, with pathwise estimates obtained by combining Young and Lépingle, exactly as was done for (1.21). Examples of suitable $V^{2-}_{\mathrm{loc}}$ processes include fractional Brownian motion with Hurst parameter $H > 1/2$ and $\alpha$-stable Lévy processes, $\alpha < 2$, see [JM83; Man04] for some general results.

Both Dirichlet processes and Young semimartingales face a seemingly fundamental barrier at $p = 2$. Yet, Theorems 1.1 and 1.2 provide us with a way of going beyond: the key idea is to postulate a deterministic reference path $X$. (This assumption appears naturally, e.g. under partial conditioning of driving noise, cf. Corollary 1.9.)

Definition 1.3.

Let $p \in [2, 3)$. Let $X$ be a càdlàg adapted process, with values in some Hilbert space $\tilde H$ and $X \in V^p_{\mathrm{loc}}$ almost surely. We call a pair of càdlàg adapted processes $\mathbf{Y} = (Y, Y')$ with values in some Hilbert space $H$ and in the operator space $L(\tilde H, H)$, respectively, an $X$-controlled $p$-rough process if $Y, Y' \in V^p_{\mathrm{loc}}$ and $R^{\mathbf{Y},X} \in V^{p/2}$, almost surely.

Definition 1.4. Let $p \in [2, 3)$ and $X \in V^p_{\mathrm{loc}}$ be a càdlàg deterministic path. We define an $X$-controlled $p$-rough semimartingale (RSM) to be a càdlàg adapted process of the form

$(g + Y, Y') : \Omega \times [0, \infty) \to H \oplus L(\tilde H, H),$

where $g$ is a càdlàg local martingale and $\mathbf{Y} = (Y, Y')$ is an $X$-controlled $p$-rough càdlàg adapted process.

A trivial example of an $X$-controlled $p$-RSM is given by $(g + X, \mathrm{Id})$ for some deterministic càdlàg path $X \in V^p_{\mathrm{loc}}$, $p < 3$, as may be supplied by a typical realization of another martingale. The following can be seen as an RSM version of the Doob–Meyer decomposition for special semimartingales.

Theorem 1.5.

Let ( g + Y, Y ) be a RSM. Assume Y = Y ( t, ω ) is previsible, Y (0 , ω ) =0 . Then the decomposition is unique.Proof. From (1.26), using crucially the existence of the reference path X , the qua-dratic covariation, given as (u.c.p.) limit of [ Y, ¯ g ] π , exists and vanishes for every continuous local martingale ¯ g . (This shows that g + Y is a weak Dirichlet process inthe sense of [ER03; Coq+06]). Consider now two decompositions g + Y = g + Y with Y i previsible. Then Y − Y =: ¯ g is a previsible local martingale, hence acontinuous local martingale. But then [ Y − Y , Y − Y ] π = [ Y , ¯ g ] π − [ Y , ¯ g ] π and both terms on the right-hand side vanish upon reﬁnement of π . This shows that Y − Y is a continuous martingale with vanishing quadratic variation, starting atzero, hence identically equal to zero. (cid:3) Similar to controlled rough paths, the notion of RSM is most fruitful when pairedwith rough paths . Recall [Lyo98; FS17], see also [Wil01] and [Che+19] for a recentreview (with applications to homogenization), that a càdlàg p -rough path with p ∈ (2 , can be viewed as a pair of càdlàg processes X = ( X, X ) = (( X t ) , ( X s,t )) with val-ues in a Banach space B and a tensor product space B ⊗ B , with V p X, V p/ X (locallyin time) ﬁnite and subject to Chen relation X t,t ′′ = X t,t ′ + X t ′ ,t ′′ + ( δX ) t,t ′ ( δX ) t ′ ,t ′′ .Recall further that càdlàg X -controlled p -rough paths can be integrated against X and, more generally, other càdlàg X -controlled p -rough paths, Z (0 ,T ] δ Y d ¯Y = lim mesh( π ) → Π π (Y , ¯Y) ,T , Π π (Y , ¯Y) T,T ′ = X π j ≤ T δY ,π j δ ¯ Y π j ,π j +1 ∧ T + Y ′ π j ¯ Y ′ π j X π j ,π j +1 ∧ T . (1.28)The statement with mesh convergence above is from [FZ18, Proposition 2.6]; theproof in fact also shows that the convergence is locally uniform in T . 
Convergence of càdlàg rough integrals in the net sense was proved in [FS17, Theorem 34] (with $\bar Y = X$, $\bar Y' = 1$; see [FH20, Remark 4.12] for the general case), extending the Hölder continuous case in [Gub04].
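The role of the second-level term in the compensated sums of (1.28) can be seen in a toy case. The sketch below is an illustration only and makes several assumptions not in the text: a scalar smooth path sampled on a uniform grid, the choice $Y = \bar Y = X$ with $Y' = \bar Y' = 1$, and the geometric second level $\mathbb{X}_{s,t} = \tfrac12 (X_t - X_s)^2$. The names `second_level` and `compensated_sum` are hypothetical.

```python
import math


def second_level(x_s: float, x_t: float) -> float:
    """Assumed scalar geometric lift: XX_{s,t} = (X_t - X_s)^2 / 2."""
    return 0.5 * (x_t - x_s) ** 2


def compensated_sum(X, Y, Yp, Ybar, Ybarp):
    """Sum over grid intervals of  dY_{0,j} * dYbar_{j,j+1} + Y'_j * Ybar'_j * XX_{j,j+1}."""
    total = 0.0
    for j in range(len(X) - 1):
        total += (Y[j] - Y[0]) * (Ybar[j + 1] - Ybar[j])
        total += Yp[j] * Ybarp[j] * second_level(X[j], X[j + 1])
    return total


grid = [k / 50 for k in range(51)]
X = [math.sin(7 * t) + t for t in grid]
ones = [1.0] * len(X)

rough = compensated_sum(X, X, ones, X, ones)
# Plain left-point sum without the second-level correction, for comparison.
left_point = sum((X[j] - X[0]) * (X[j + 1] - X[j]) for j in range(len(X) - 1))
exact = 0.5 * (X[-1] - X[0]) ** 2
```

In this special case each compensated summand equals $\tfrac12\big((X_{j+1} - X_0)^2 - (X_j - X_0)^2\big)$, so the sum telescopes to `exact` on every grid, whereas the uncompensated left-point sum falls short by half the discrete quadratic variation.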

Theorem 1.6.

Let p ∈ [2 , , X = ( X, X ) be a càdlàg p -rough path and W =( g + Y, Y ′ ) , ¯W = (¯ g + ¯ Y , ¯ Y ′ ) be two rough semimartingales. Then the paraproduct Π(W , ¯W) t,t ′ := Z ( t,t ′ ] δ ( g + Y ) t,u − d¯ g u + Z ( t,t ′ ] ( δg ) t,u − d ¯Y u + Z ( t,t ′ ] ( δ Y) t,u − d ¯Y u is well-deﬁned, as sum of Itô integral, then R δg d ¯Y := Π( g, ¯Y) , and at the far righta rough integral, with quantitative estimates provided respectively by Theorem 1.1,Theorem 1.2 and (càdlàg) rough integration theory. The enhanced paraproduct (Π(W , ¯W) ,t , δ ( g + Y ) ,t Y ′ t ) deﬁnes another rough semimartingale, with local martingale component given by theItô integral R (0 ,t ] δ ( g + Y ) ,u − d¯ g u . Furthermore, a càdlàg p -rough path is given by W s,t := ( δ ( g + Y ) s,t , Π(W , W) s,t ) . Proof.

By Corollary 4.5, Theorem 5.4, and (1.28), we have

Π(W , ¯W) , · = u . c . p . -lim mesh( π ) → (cid:16) X π j

Let X = ( X, X ) be a càdlàg p -rough path over R m , p ∈ (2 , , and g an R n -valued martingale with V ∞ g ∈ L q , for some ≤ q < ∞ . Then, a.s., themap (1.29) J : ( X , g ( ω )) (cid:18)(cid:18) Xg (cid:19) , (cid:18) X Π( g, X )Π( X, g ) Π( g, g ) (cid:19)(cid:19) = ( X g ( ω ) , X g ( ω )) . takes values in the space of càdlàg p -rough paths over R m + n , with q -integrable ho-mogeneous rough path norm, given by V p hom X g := V p X g + ( V p/ X g ) / ∈ L q . Moreover, J is locally Lipschitz continuous in the sense that (cid:13)(cid:13) V p ( X g − X g ) (cid:13)(cid:13) L q . V p ( X − X ) + k V ∞ ( g − g ) k L q , OUGH SEMIMARTINGALES 9 and k V p/ (Π( X , g ) − Π( X , g )) k L q + k V p/ (Π( g , X ) − Π( g , X )) k L q . ( V p X ) k V ∞ ( g − g ) k L q + V p ( X − X ) k V ∞ g k L q , k V p/ (Π( g , g ) − Π( g , g )) k L q / . ( k V ∞ g k L q + k V ∞ g k L q ) k V ∞ ( g − g ) k L q In particular, the map ( X , g )

7→ J ( X , g ) =: ¯ X = ( ¯ X, ¯ X ) is continuous (and uni-formly so on bounded sets), with respect to homogeneous L q rough paths metric k V p hom ( ¯ X − ¯ X ) k L q ≍ k V p ( ¯ X − ¯ X ) k L q + k ( V p/ ( ¯ X − ¯ X )) / k L q . Diﬀerential equations.

In Theorem 1.6, we gave a canonical construction ofa (random) p -rough path W associated to any rough semimartingale W = ( g + Y, Y ′ ) in sense of Deﬁnition 1.4. The parameter p ∈ (2 , and the reference path X arekept ﬁxed. In particular, rough semimartingales can drive diﬀerential equations,(1.30) dZ = σ ( Z − ) dW : ⇐⇒ dZ = σ ( Z − ) d W , understood for a.e. realization of W = W ( ω ) as rough diﬀerential equation (bynature, multidimensional). This should be contrasted with SDEs driven by weakDirichlet processes [CR07], essentially restricted to scalar drivers. Standard resultsin (deterministic) rough path theory provide a unique solution Z = Z ( W , Z ) of theinitial value problem for (1.30) provided that σ ∈ Lip p + . The construction assuresthat Z t = Z t ( W ( ω ) , Z ( ω )) deﬁnes an adapted (càdlàg) process provided that theinitial datum Z is F -measurable. When ( Y, Y ′ ) = 0 , W is nothing but the Itôrough path lift of the càdlàg local martingale g , as previously constructed in [CF19],and yields (a robust version of) the classical Itô solution, as found in textbooks onstochastic diﬀerential equations. We ﬁnd it instructive to replace σ by ( σ, µ ) andspecialise to(1.31) dZ = σ ( Z − ) d X + µ ( Z − ) d g : ⇐⇒ dZ = ( σ, µ )( Z − ) d J ( X , g ) . Many authors (e.g. [GN08]) have consider the situation where g = B , a multidi-mensional Brownian motion, and X replaced by an independent fractional Brownian B H motion with H > / . In this case, the left-hand side of (1.31) makes sensein mixed Young Itô sense (and could accordingly be phrased in terms of Youngsemimartingales). From the perspective of [FV10b], it suﬃces to construct ( B H , B ) jointly as Gaussian rough paths, which is possible for H > / . Equation (1.31),in case when g is a Brownian motion B and X a geometric Hölder rough path, wastreated in [Cri+13] as ﬂow transformed Itô SDE, in [DOR15; DFS17], in the right-hand side sense of (1.31). 
(In absence of jumps, the situations is much simpliﬁed inthat that ( X , B ) is construction by a Kolmogorov type criterion for rough paths; see[FH20, Ch.12] for a review.) Still in a continuous setting, forthcoming work [FHL20]employes stochastic sewing arguments.Back to the case of càdlàg g ∈ M loc , with càdlàg p -rough X , the (formal) left-handof (1.31) suggests that Z is a rough semimartingale with local martingale componentgiven by the (well-deﬁned) Itô integral R µ ( Z − ) d g . However, from a rough pathperspective, Z is constructed as an ( X, g ) -controlled rough path. Knowing only X ,this is insuﬃcient to deﬁne Y ? = R σ ( Z − ) d X by (purely analytic) rough integration.The next theorem shows that the left-hand side of (1.31) has, thanks to stochasticcancellations, a bona-ﬁde integral meaning after all. Theorem 1.8 (cf. Theorem 6.3) . Let σ, µ ∈ Lip p + , so that (1.31) admits a uniquesolution process in RDE sense, given by (1.32) Z t ( ω ) := Z t ( X , Z ( ω ); ω ) := Z t ( J ( X , g )( ω ) , Z ( ω )) , This restriction is easy to understand since every deterministic continuous path is a weakDirichlet process. In general, this is not suﬃcient to drive a diﬀerential equations (the raison d’êtreof rough path theory). adapted for F -measurable Z . Then ( Z, σ ( Z )) is a rough semimartingale with de-composition Z = M + Y with local martingale component M = R · µ ( Z − ) d g and Y given by Y t = u . c . p . -lim d-mesh( π ) → X j : π j

RDE theory yields a solution ( Z, ( σ, µ )( Z )) as ( X, g ) -controlled p -rough pro-cess. By Theorem 6.3, we see that ( Z, σ ( Z )) is an X -controlled p -RSM, as is ( σ ( Z ) , Dσ ( Z ) ◦ σ ( Z )) by Corollary 6.4. To see the stated decomposition into lo-cal martingal and rough drift part, we write the RDE solution as integral equation,obtained as mesh-limit of local approximations given by δZ s,t ∼ = f ( Z s )( δX ) s,t + f ( Z s ) X s,t + f ( Z s )( δg ) s,t + f ( Z s )Π( X, g ) s,t + f ( Z s )Π( g, X ) s,t + f ( Z s )Π( g, g ) s,t where f = σ, f = µ, f = Dσ ◦ σ and so on. (Our assumptions on σ, µ imply thatall the f ij ’s are bounded.) It follows from Lemma 6.1 and 6.2 that converges stilltakes place when f , f , f are set to zero, provided we restrict ourselves to themesh limit of deterministc partitions. What remains are Itô left-point sums, with f -terms, and u.c.p. Itô limit M = R µ ( Z − ) d g . All these entails convergence ofsum with the remaining terms ( f and f ), as given in the statement. Alternatively,though equivalently, we can see as R σ ( Z − ) d X as integral of a rough semimartingaleagainst (0 + X, Id) , trivially another X -controlled rough semimartingale, hence relyon Theorem 1.6. (cid:3) The next result asserts, loosely speaking, that an Itô SDE solution, conditioned on(an independent) part of the driving noise, is a.s. a rough semimartingale. (This canbe seen as major extension of the rather trivial fact B ( ω ) + X is a rough semimartin-gale (in ω ) for a.e. typical realization of X = B ⊥ ( ω ′ ) , for independent Brownianmotions B, B ⊥ .) Corollary 1.9.

Assume g = g ( ω ) and X = X ( ω ′ ) are independent local martingales,deﬁned on some ﬁltered product space ( ¯Ω , ¯ F ) = (Ω , F ) × (Ω ′ , F ′ ) . Let σ, µ be as inTheorem 1.8 and write ˜ Z ( Z ; ω, ω ′ ) for the unique ¯ F -adapted solution of the Itô SDE (1.33) d ˜ Z = σ ( ˜ Z − ) d X + µ ( ˜ Z − ) d g with ¯ F -measurable initial data Z = Z ( ω, ω ′ ) . With the Itô rough path lift of X , X ( ω ′ ) = ( X, X )( ω ′ ) = ( X ( ω ′ ) , Π( X, X )( ω ′ ) and rough semimartingale Z as in (1.32) we have, for a.e. ω and a.e. ω ′ , (1.34) ˜ Z ( Z ; ω, ω ′ ) = Z ( X ( ω ′ ) , Z ( ω, ω ′ ); ω ) . Proof.

In view of uniqueness of the Itô solution, it suﬃces to show that the right-hand side of (1.34) is an Itô solution of (1.33). By Theorem 1.8, it suﬃces to showthat u . c . p . -lim d-mesh( π ) → X j : π j

Z, σ(Z̃)) given X is expressed in terms of the distribution of the rough semimartingale (Z, σ(Z)).

2. Vector-valued estimates in discrete time

The main result of this section, Theorem 2.6, is a bound for discrete time versions of the Itô integral. Its main advantage over the previous result [KZ19, Proposition 3.1] is that the integrands $F^{(k)}$ are allowed to be arbitrary two-parameter processes, rather than martingale differences. The connection of Theorem 2.6 with variation norm estimates will be established in Corollary 3.4.

We begin this section by recalling several known results. We abbreviate $\|\cdot\|_q := \|\cdot\|_{L^q(\Omega)}$.

2.1. Davis decomposition.

For a scalar-valued process $(f_n)$, we denote the martingale maximal function and its stopped version by

$M f := \sup_n |f_n|, \quad M_t f := \sup_{n \le t} |f_n|,$

and the martingale square function and its stopped version by

$S f := \ell^2_n |df_n|, \quad S_t f := \ell^2_{n \le t} |df_n|.$

Here and later, $dg_j := g_j - g_{j-1}$. We denote $\ell^p$ norms by

$\ell^p_k a_k := \Big( \sum_{k \in \mathbb{N}} |a_k|^p \Big)^{1/p}.$

In order to simplify notation, we only consider martingales $g$ with $g_0 = 0$.

Theorem 2.1 (Davis decomposition [Dav70], see e.g. [Hyt+16, Theorem 3.4.3]). Let $(f_n)_{n=0}^\infty$ be a martingale with values in a Banach space $X$. Suppose that $f_0 = 0$ and $f_n \in L^1(\Omega \to X, \mathcal{F}_n)$ for all $n$. Then there is a decomposition $f_n = f^{\mathrm{pred}}_n + f^{\mathrm{bv}}_n$ into martingales adapted to the same filtration with $f^{\mathrm{pred}}_0 = 0$ such that the differences of $f^{\mathrm{pred}}$ have predictable majorants:

(2.1) $\|df^{\mathrm{pred}}_n\|_X \le \dots$

Let ≤ q < ∞ , X be a Banach function space, elements of which are R -valued maps x ( · ) , and ( f n ) a martingale with values in X . Then for f pred givenby Theorem 2.1 we have kk Sf pred k X k L q . q kk Sf k X k L q , where the square function is given by k Sf k X := k ℓ n ( df n ( · )) k X Remark . We will apply this with X = ℓ r , i.e. r -summable series, viewed as mapsfrom N → R , with the usual Banach structure. Proof.

Using (2.2) we estimate kk Sf pred k X k L q ≤ kk Sf k X k L q + kk Sf bv k X k L q ≤ kk Sf k X k L q + kk X n | d n f bv |k X k L q ≤ kk Sf k X k L q + k X n k d n f bv k X k L q ≤ kk Sf k X k L q + C k sup n k d n f k X k L q . kk Sf k X k L q . (cid:3) Vector-valued BDG inequality.

We recall the weighted Burkholder–Davis–Gundy inequality.

Lemma 2.3 ([Os¸e17]) . Let ( f n ) be a martingale and ( w ) a positive random variable.Then E ( M f · w ) ≤ √ E ( Sf · M w ) , where M w = sup n E ( w | F n ) .Remark . The proof of Lemma 2.3 given in [Os¸e17] also works for martingales withvalues in a real Hilbert space.
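To make the quantities in Lemma 2.3 concrete, the following sketch computes the maximal function $M f$ and the square function $S f$ for a simple symmetric random walk, in the unweighted case $w \equiv 1$ (so that $M w = 1$). It is an illustration under stated assumptions: the walk, the horizon `n = 10`, and all function names are hypothetical choices, expectations are computed by exhaustive enumeration of sign patterns, and the constant in the lemma is not checked.

```python
import itertools
import math


def maximal_function(f):
    """M f = sup_n |f_n| for a finite path f."""
    return max(abs(v) for v in f)


def square_function(f):
    """S f = (sum_n |f_n - f_{n-1}|^2)^(1/2) for a finite path f."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(f, f[1:])))


def walk(steps):
    """Partial sums of the given increments, started at f_0 = 0."""
    path = [0.0]
    for step in steps:
        path.append(path[-1] + step)
    return path


n = 10
# All 2^n equally likely paths of the simple symmetric random walk.
paths = [walk(signs) for signs in itertools.product((-1.0, 1.0), repeat=n)]
mean_M = sum(maximal_function(p) for p in paths) / len(paths)
mean_S = sum(square_function(p) for p in paths) / len(paths)
mean_abs_end = sum(abs(p[-1]) for p in paths) / len(paths)
```

For this walk every path has the same square function, $S f = \sqrt{n}$, while $M f$ varies from path to path and always dominates $|f_n|$; comparing `mean_M` with `mean_S` gives a feel for the size of the ratio that the inequality controls.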

Lemma 2.4. Let $h^{(k)}$ be martingales with respect to some fixed filtration. Let $1 \le q < \infty$ and $1 \le r < \infty$. Then we have

(2.3) $\|M h^{(k)}\|_{L^q(\ell^r_k)} \lesssim_{q,r} \|S h^{(k)}\|_{L^q(\ell^r_k)}.$

Proof. First we consider the case $1 < q < \infty$. Take positive functions with $\|w^{(k)}\|_{L^{q'}(\ell^{r'}_k)} = 1$. Then, by Lemma 2.3, we have

$E\Big( \sum_k (M h^{(k)}) w^{(k)} \Big) \lesssim \sum_k E\big( S h^{(k)} M w^{(k)} \big) \le \|S h^{(k)}\|_{L^q(\ell^r_k)} \|M w^{(k)}\|_{L^{q'}(\ell^{r'}_k)}.$

By the vector-valued Doob’s inequality [Hyt+16, Theorem 3.2.7], we have

$\|M w^{(k)}\|_{L^{q'}(\ell^{r'}_k)} \lesssim \|w^{(k)}\|_{L^{q'}(\ell^{r'}_k)} = 1.$

Taking the supremum over $w^{(k)}$, we obtain the claim.

Now we consider $q = 1$. The case $r = 1$ follows from the usual BDG inequality, so we may assume $1 < r < \infty$. Decompose $\vec h = \vec h^{\mathrm{pred}} + \vec h^{\mathrm{bv}}$ as in Theorem 2.1 with $X = \ell^r$. For $\lambda > 0$, define the stopping time

$\tau := \inf\{ t \mid \|S_t h^{\mathrm{pred}}\|_{\ell^r} > \lambda \text{ or } \|S_t h\|_{\ell^r} > \lambda \}.$

We claim that

(2.4) $\|S h^{\mathrm{pred}}_\tau\|_{\ell^r} \le \|S h^{\mathrm{pred}}\|_{\ell^r} \wedge C\lambda.$

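Indeed, the first bound is trivial, and the second bound is only non-void if $0 < \tau < \infty$. In the latter case, by (2.1) and since $\sup_{n' < \tau} \| h_{n'} - h_{n'-1} \|_{\ell^r} \le \| S_{\tau-1} h \|_{\ell^r} \le \lambda$, we have
$$ \| S_\tau h^{\mathrm{pred}} \|_{\ell^r} \le \| S_{\tau-1} h^{\mathrm{pred}} \|_{\ell^r} + \| h^{\mathrm{pred}}_\tau - h^{\mathrm{pred}}_{\tau-1} \|_{\ell^r} \le \lambda + 4 \sup_{n' < \tau} \| h_{n'} - h_{n'-1} \|_{\ell^r} \le 5 \lambda . $$
Also,
$$\begin{aligned}
\{ \| M h^{\mathrm{pred}} \|_{\ell^r} > \lambda \}
&\subseteq \{ \| M_\tau h^{\mathrm{pred}} \|_{\ell^r} > \lambda \} \cup \{ \tau < \infty \} \\
&\subseteq \{ \| M_\tau h^{\mathrm{pred}} \|_{\ell^r} > \lambda \} \cup \{ \| S h \|_{\ell^r} > \lambda \} \cup \{ \| S h^{\mathrm{pred}} \|_{\ell^r} > \lambda \} .
\end{aligned}$$
By the layer cake formula,
$$\begin{aligned}
\| M h^{\mathrm{pred}} \|_{L^1(\ell^r)}
&= \int_0^\infty \mathbb{P} \{ \| M h^{\mathrm{pred}} \|_{\ell^r} > \lambda \} \, d\lambda \\
&\le \int_0^\infty \mathbb{P} \{ \| M_\tau h^{\mathrm{pred}} \|_{\ell^r} > \lambda \} \, d\lambda
 + \int_0^\infty \mathbb{P} \{ \| S h^{\mathrm{pred}} \|_{\ell^r} > \lambda \} \, d\lambda
 + \int_0^\infty \mathbb{P} \{ \| S h \|_{\ell^r} > \lambda \} \, d\lambda \\
&=: I + II + III .
\end{aligned}$$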
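The layer cake formula $\mathbb{E} X = \int_0^\infty \mathbb{P}\{ X > \lambda \} \, d\lambda$ for a nonnegative random variable $X$ is used repeatedly in this section. As a minimal sketch, the following Python function evaluates the right-hand side exactly for the empirical distribution of a finite sample; the function name and the sample values are hypothetical and serve only as an illustration.

```python
def layer_cake_mean(samples):
    """Integrate lambda -> P{X > lambda} over [0, inf) for the empirical law.

    Assumes all samples are nonnegative.
    """
    xs = sorted(samples)
    n = len(xs)
    total, prev = 0.0, 0.0
    for i, x in enumerate(xs):
        # For lambda in [prev, x), exactly n - i samples exceed lambda.
        total += (x - prev) * (n - i) / n
        prev = x
    return total


def plain_mean(samples):
    """Ordinary sample mean, for comparison."""
    return sum(samples) / len(samples)
```

Both functions agree up to rounding on any nonnegative sample, which is the discrete counterpart of the identity used to define the terms $I$, $II$, and $III$.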

The term $III$ is the claimed right-hand side of the estimate (2.3), again by the layer cake formula. By Lemma 2.2, we have
$$ II = \bigl\| \| S h^{\mathrm{pred}} \|_{\ell^r} \bigr\|_{L^1} \lesssim \bigl\| \| S h \|_{\ell^r} \bigr\|_{L^1} . $$

Using the already known $L^r(\ell^r)$ case of Lemma 2.4 and (2.4), we bound the first term by
$$\begin{aligned}
I &\lesssim \int_0^\infty \lambda^{-r} \| S_\tau h^{\mathrm{pred}} \|_{L^r(\ell^r)}^r \, d\lambda
 \le \int_0^\infty \lambda^{-r} \bigl\| \| S h^{\mathrm{pred}} \|_{\ell^r} \wedge C \lambda \bigr\|_{L^r}^r \, d\lambda \\
&= \mathbb{E} \int_0^\infty \min \bigl( \lambda^{-r} \| S h^{\mathrm{pred}} \|_{\ell^r}^r , C^r \bigr) \, d\lambda
 \lesssim \mathbb{E} \| S h^{\mathrm{pred}} \|_{\ell^r} = II ,
\end{aligned}$$
and we reuse the previously established estimate for $II$. $\square$

Remark. Lépingle's inequality (1.2) can be obtained from Lemma 2.4 and Corollary 3.2. In fact, Corollary 3.2 simplifies for processes $\Pi$ that are of difference form, see [Zor20, Corollary 2.4], so that the vector-valued bound (2.3) is not necessary to show (1.2).

2.3. Vector-valued maximal paraproduct estimate.

We call a two-parameter process $(F_{s,t})_{s \le t}$ adapted if $F_{s,t}$ is $\mathcal{F}_t$-measurable for every $s \le t$. For an adapted process $(F_{s,t})$ and a martingale $(g_n)$, we define
(2.5) $\Pi_{s,t}(F, g) := \sum_{s < j \le t} F_{s,j-1} \, dg_j .$

Let < q, q ≤ ∞ , ≤ q , r, r < ∞ , ≤ r ≤ ∞ . Assume /q = 1 /q + 1 /q and /r = 1 /r + 1 /r . Then, for any martingales ( g ( k ) n ) n , anyadapted sequences ( F ( k ) s,t ) s ≤ t , and any stopping times τ ′ k ≤ τ k with k ∈ Z , we have (2.6) (cid:13)(cid:13) ℓ rk sup τ ′ k ≤ t ≤ τ k | Π( F ( k ) , g ( k ) ) τ ′ k ,t | (cid:13)(cid:13) q ≤ C q ,q ,r ,r (cid:13)(cid:13) ℓ r k sup τ ′ k ≤ t<τ k | F ( k ) τ ′ k ,t | (cid:13)(cid:13) q k ℓ r k Sg ( k ) τ ′ k ,τ k k q , where Sg s,t := (cid:0)P tj = s +1 | dg j | (cid:1) / .Proof of Proposition 2.5. We may replace each g ( k ) by the martingale(2.7) ˜ g ( k ) n := g ( k ) n ∧ τ k − g ( k ) n ∧ τ ′ k without changing the value of either side of (2.6).Consider ﬁrst q ≥ . For each k , the sequence h ( k ) t := ( , t < τ ′ k , Π( F ( k ) , g ( k ) ) τ ′ k ,t , t ≥ τ ′ k , is a martingale. We may also assume F τ ′ k ,t = 0 if t [ τ ′ k , τ k ) . By Lemma 2.4, we canestimate LHS (2.6) . (cid:13)(cid:13) ℓ rk | Sh ( k ) | (cid:13)(cid:13) q = (cid:13)(cid:13) ℓ rk ℓ j | F ( k ) τ ′ k ,j − dg ( k ) j | (cid:13)(cid:13) q ≤ (cid:13)(cid:13) ℓ rk M F ( k ) ℓ j | dg ( k ) j | (cid:13)(cid:13) q ≤ k ℓ r k M F ( k ) k q (cid:13)(cid:13) ℓ r k Sg ( k ) (cid:13)(cid:13) q . Here and later, we abbreviate

M F := sup j | F τ ′ k ,j | .Consider now q < . By homogeneity, we may assume(2.8) (cid:13)(cid:13) ℓ r k M F ( k ) (cid:13)(cid:13) q = (cid:13)(cid:13) ℓ r k Sg ( k ) (cid:13)(cid:13) q = 1 , and we have to show (cid:13)(cid:13) ℓ rk sup τ ′ k ≤ t ≤ τ k | Π( F ( k ) , g ( k ) ) τ ′ k ,t | (cid:13)(cid:13) q . . We use the Davis decomposition g = g pred + g bv (Theorem 2.1 with X = ℓ r ). Thecontribution of the bounded variation part is estimated as follows: k ℓ rk sup τ ′ k ≤ t ≤ τ k | Π( F ( k ) , g ( k ) , bv ) τ ′ k ,t |k q ≤ k ℓ rk X j | F ( k ) τ ′ k ,j − | · | dg ( k ) , bv j |k q ≤ k ℓ r k M F ( k ) k q k ℓ r k (cid:16)X j | dg ( k ) , bv j | (cid:17) k q ≤ k ℓ r k M F ( k ) k q k X j ℓ r k | dg ( k ) , bv j |k q . k ℓ r k M F ( k ) k q k sup j ℓ r k | dg ( k ) j |k q ≤ k ℓ r k M F ( k ) k q k ℓ r k Sg ( k ) k q , where we used (2.2) in the penultimate step.It remains to consider the part g pred with predictable bounds for jumps. By thelayer cake formula, we have(2.9) (cid:13)(cid:13) ℓ rk sup τ ′ k ≤ t ≤ τ k | Π( F ( k ) , g ( k ) , pred ) τ ′ k ,t | (cid:13)(cid:13) qq = Z ∞ P { ℓ rk sup τ ′ k ≤ t ≤ τ k | Π( F ( k ) , g ( k ) , pred ) τ ′ k ,t | > λ /q } d λ. Fix some λ > and deﬁne a stopping time(2.10) τ := inf n t (cid:12)(cid:12)(cid:12) ℓ r k Sg ( k ) t > cλ /q or ℓ r k Sg ( k ) , pred t > cλ /q or ℓ r k sup λ /q o . Deﬁne stopped martingales ˜ g ( k ) t := g ( k ) , pred t ∧ τ and adapted processes ˜ F ( k ) t,t ′ := F ( k ) t,t ′ ∧ τ − . Then, on the set { τ = ∞} , we have Π( F ( k ) , g ( k ) , pred ) τ ′ k ,t = Π( ˜ F ( k ) , ˜ g ( k ) ) τ ′ k ,t for all k, t. Hence, { ℓ rk sup τ ′ k ≤ t ≤ τ k | Π( F ( k ) , g ( k ) , pred ) τ ′ k ,t | > λ /q }⊂{ ℓ rk sup τ ′ k ≤ t ≤ τ k | Π( ˜ F ( k ) , ˜ g ( k ) ) τ ′ k ,t | > λ /q }∪ { ℓ r k Sg ( k ) > λ /q } ∪ { ℓ r k Sg ( k ) , pred > λ /q }∪ { ℓ r k M F ( k ) > λ /q } (2.11)The contributions of the latter three terms to (2.9) are . 
by (2.8) and Lemma 2.2.It remains to handle the ﬁrst term.By construction, we have ℓ r k M ˜ F ( k ) ≤ λ /q , and due to (2.1) we also have ℓ r k S ˜ g ( k ) ≤ λ /q , provided that the absolute constant c in (2.10) is small enough.Choose an arbitrary exponent ˜ q with q < ˜ q < ∞ . By the already known case of theProposition with ( q , q ) replaced by (˜ q, ∞ ) , we obtain P { ℓ rk sup τ ′ k ≤ t ≤ τ k | Π( ˜ F ( k ) , ˜ g ( k ) ) τ ′ k ,t | > λ /q }≤ λ − ˜ q/q k ℓ rk sup τ ′ k ≤ t ≤ τ k | Π( ˜ F ( k ) , ˜ g ( k ) ) τ ′ k ,t |k ˜ q ˜ q . ˜ q λ − ˜ q/q k ℓ r k M ˜ F ( k ) k ˜ q ∞ k ℓ r k S ˜ g ( k ) k ˜ q ˜ q ≤ λ − ˜ q/q k ℓ r k Sg ( k ) , pred ∧ λ /q k ˜ q ˜ q . (2.12) OUGH SEMIMARTINGALES 15

This estimate no longer depends on the stopping time τ . Integrating the right-handside of (2.12) in λ , we obtain Z ∞ λ − ˜ q/q k ℓ r k Sg ( k ) , pred ∧ λ /q k ˜ q ˜ q d λ = E Z ∞ (cid:0) λ − ˜ q/q ( ℓ r k Sg ( k ) , pred ) ˜ q ∧ (cid:1) d λ ∼ E ( ℓ r k Sg ( k ) , pred ) q ∼ , where we used ˜ q > q , Lemma 2.2 with X = ℓ r , and the assumption (2.8). (cid:3) Next, we deduce a version of Proposition 2.5 that involves a two-parameter supre-mum of the kind that appears in Corollary 3.2. Recall the deﬁnition of second orderincrements of a two-parameter process ( F s,t ) :(2.13) ( δF ) s,t,u := F s,u − F s,t − F t,u , s < t < u. For a ﬁxed s , we deﬁne(2.14) ( δ s F ) t,u := F s,u − F t,u , s < t < u. Theorem 2.6.

In the situation of Proposition 2.5, we have (cid:13)(cid:13) ℓ rk sup τ ′ k ≤ s

Let q, q , q , r, r be as in Proposition 2.5 with r = 2 . Let ( F s,t ) bean adapted process such that (2.17) δ s F t,u = X i F is,t ˜ F it,u with adapted processes F i , ˜ F i , g a martingale, and ( τ k ) an adapted partition. Then,we have (cid:13)(cid:13) ℓ rk sup τ k − ≤ s

In this section, we iterate Corollary 2.7 by applyingit recursively to each term

Π( ˜ F i , g ) on the right-hand side of (2.18). The algebraicframework for this iteration is provided by the theory of branched rough paths in-troduced in [Gub10], see also [HK15]. We recall the relevant notation from [Gub10].We ﬁx a ﬁnite set of label L . The set of trees with vertices labeled by the elementsof L is denoted by T L . A forest is a ﬁnite unordered tuple of trees in T L , in whichrepetition is allowed. The set of all forests is denoted by F L . The free commutative R -algebra generated by the trees T L is denoted by AT L . It can be identiﬁed with thefree R -vector space generated by F L .A branched rough path is an algebra homomorphism F : AT L → C , where C is the algebra of càdlàg functions on the simplex { ( s, t ) | s < t } , that satisﬁesthe generalized Chen relation(2.19) δF f = F ∆( f ) − ⊗ f − f ⊗ , f ∈ AT L . On the right-hand side, we use the extension of F to an algebra homomorphism AT L ⊗ AT L → C deﬁned by F f ⊗ f ′ = F f F f ′ , where we use the product C × C → C given by ( F G ) stu = F st G tu . The coproduct ∆ : AT L → AT L ⊗ AT L is an algebrahomomorphism acting on forests by(2.20) ∆( f ) = X ( b , r ) ∈ Cut f b ⊗ r , where the sum goes over the multiset of all admissible cuts , that is, partitions of treesin the forest f into (possibly empty) initial trees collected in the forest r (for “roots”)and ﬁnal trees collected in the forest b (for “branches”). Our convention for cuts isdiﬀerent from [Gub10, eq. (3)], in that we allow roots and branches to be empty. Theorem 2.8.

Let q ∈ (0 , ∞ ) , q ∈ [1 , ∞ ) , and, for each tree t ∈ T L , let q t ∈ (0 , ∞ ] .Let r ∈ [1 , ∞ ) and, for each tree t ∈ T L , let r t ∈ [1 , ∞ ] . Let f ∈ F L be a forest andlet F be the set of all forests f ′ that are the disjoint unions of arbitrary partitions oftrees in f into subtrees. Assume that, for each f ′ ∈ F , we have /q = 1 /q + X t ∈ f ′ /q t , /r = 1 / X t ∈ f ′ /r t . Let F be an adapted family of branched rough paths and g a martingale. Then, wehave (2.21) (cid:13)(cid:13) ℓ rk sup τ k − ≤ s

We induct on the degree of the forest f , that is, the total number of verticesin its trees. Let f be given and suppose that the claim is known for all forests withstrictly smaller degree. By the generalized Chen relation (2.19) and the deﬁnition ofthe coproduct (2.20), we have(2.22) δ s F f t,u = X ( b , r ) ∈ Cut( f ) , b =0 F b s,t F r t,u . We apply Corollary 2.7 with r = r f , q = q f , where /r f = P t ∈ f /r t and /q f = P t ∈ f /q t . Then the second term on the right-hand side of (2.18) corresponds to thesummand f ′ = f in (2.21).It remains to estimate the ﬁrst term on the right-hand side of (2.18), for a ﬁxedcut ( b , r ) , we have (cid:13)(cid:13) ℓ rk sup τ k − ≤ s

OUGH SEMIMARTINGALES 17 where / ˜ q = 1 /q − X t ′ ∈ b /q t ′ , / ˜ r = 1 /r − X t ′ ∈ b /r t ′ . The latter norm can be estimated by the inductive hypothesis, since deg r < deg f . (cid:3) Example . The vector-valued BDG inequality 2.4is the case of the empty forest f in Theorem 2.8. In this case, we have F f ≡ , sothat Π( F f , g ) = δg. Therefore, the estimate (2.21) becomes (2.3).
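The two simplest cases of the paraproduct can be checked by hand on integer data. The sketch below assumes that the definition (2.5) reads $\Pi_{s,t}(F, g) = \sum_{s < j \le t} F_{s,j-1} \, dg_j$, which is what the square function computation in the proof of Proposition 2.5 suggests. It illustrates that $F \equiv 1$ gives $\Pi(F, g) = \delta g$, and that for $F = \delta f$ the second order increment of $\Pi$ equals $\delta f_{s,t} \, \delta g_{t,u}$. All names and sample paths are hypothetical.

```python
def paraproduct(F, g, s, t):
    """Pi_{s,t}(F, g) = sum over s < j <= t of F(s, j - 1) * (g[j] - g[j - 1])."""
    return sum(F(s, j - 1) * (g[j] - g[j - 1]) for j in range(s + 1, t + 1))


def second_order_increment(F, g, s, t, u):
    """(delta Pi)_{s,t,u} = Pi_{s,u} - Pi_{s,t} - Pi_{t,u}."""
    return (paraproduct(F, g, s, u)
            - paraproduct(F, g, s, t)
            - paraproduct(F, g, t, u))


# Illustrative integer paths, so that all identities hold exactly.
f = [0, 1, 3, 2, 5]
g = [0, 2, 1, 4, 3]


def constant_one(s, t):
    """Integrand F identically equal to 1 (the empty forest case)."""
    return 1


def increment_of_f(s, t):
    """Integrand F = delta f, that is, F_{s,t} = f_t - f_s."""
    return f[t] - f[s]
```

With these definitions, `paraproduct(constant_one, g, s, t)` equals `g[t] - g[s]`, and `second_order_increment(increment_of_f, g, s, t, u)` equals `(f[t] - f[s]) * (g[u] - g[t])` for all `s < t < u`.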

Example . Suppose that F = δf . This corresponds to the forest f consisting of the single tree a . In this case, F = { f } , and Theorem 2.8 gives (cid:13)(cid:13) ℓ rk sup τ k − ≤ s

In this section, we will estimate $V^r \Pi(F, g)$ in open ranges $r > \rho$. There is a dichotomy depending on the value of the threshold $\rho$. For $\rho < 1$, we will use the sewing lemma, see Section 3.2. The main new results of this article are in the range $\rho \ge 1$. In this range, pathwise estimates are insufficient, and we have to rely on the cancellation provided by the martingale $g$. By the construction in Section 3.1, variation norm estimates in this range follow directly from the vector-valued estimates in Section 2.

3.1. Stopping time construction.

In this section, we will bound r -variation bysquare function-like objects. For Lépingle’s inequality, this idea was introduced in[Bou89; PX88]. It was ﬁrst applied to a (real variable) paraproduct in [DMT12].The stopping time argument in [Bou89; PX88] involves a real interpolation step thatwas made increasingly more explicit in [JSW08; MSZ20]. We use diﬀerent stoppingtimes, which better capture the structure of the process at hand and avoid the realinterpolation step. For Lépingle’s inequality, similar stopping times were introducedin [Zor20]. One of the advantages of the present construction is that it allows us toremove a restriction on the integrability parameters ( q > ) from [KZ19].For an adapted process (Π s,t ) s ≤ t , let Π ∗ n ′′ := sup ≤ n

For any discrete time adapted process (Π s,t ) s

For m ∈ N , deﬁne stopping times τ ( m )0 := 0 , and then, for j ≥ , allowing a priori values in N ∪ {∞} ,(3.2) τ ( m ) j +1 := min n t > τ ( m ) j (cid:12)(cid:12)(cid:12) sup τ ( m ) j ≤ t ′ for all l ≥ , since otherwisethe corresponding terms vanish. Consider < ρ < r < ∞ and split(3.3) l max X l =1 | Π u l − ,u l | r = ∞ X m =0 X l ∈ L ( m ) | Π u l − ,u l | r , where(3.4) L ( m ) := (cid:8) l ∈ { , . . . , l max } (cid:12)(cid:12) − m − < | Π u l − ,u l | / Π ∗ u l ≤ − m (cid:9) . In (3.3), we only omitted vanishing summands, since | Π u l − ,u l | ≤ Π ∗ u l . Let also L ′ ( m ) := L ( m ) \ { max L ( m ) } . Using (3.4), we obtain(3.5) l max X l =1 | Π u l − ,u l | r ≤ ∞ X m =0 (2 − m Π ∗ ) r − ρ X l ∈ L ′ ( m ) | Π u l − ,u l | ρ + ∞ X m =0 (2 − m Π ∗ ) r . Claim . For every l ∈ L ( m ) , there exists j s.t. τ ( m ) j ∈ ( u l − , u l ] . Proof of the claim.

Let j be maximal with τ ( m ) j ≤ u l − . Since l ∈ L ( m ) , by deﬁnition(3.4), we have | Π u l − ,u l | > − m − Π ∗ u l . By the deﬁnition of stopping times (3.2), we obtain τ ( m ) j +1 ≤ u l . (cid:3) Fix m . For each l ∈ L ′ ( m ) , let j ( l ) be the largest j such that τ ( m ) j ∈ ( u l − , u l ] .Then all j ( l ) are distinct, and, since l = max L ( m ) , the claim shows that τ ( m ) j ( l )+1 < ∞ .Furthermore, by (3.4), the monotonicity of t Π ∗ t , and the deﬁnition (3.2) ofstopping times, we have(3.6) | Π u l − ,u l | ≤ − m Π ∗ u l ≤ − m Π ∗ τ ( m ) j ( l )+1 ≤ τ ( m ) j ( l ) ≤ t ′ <τ ( m ) j ( l )+1 | Π t ′ ,τ ( m ) j ( l )+1 | OUGH SEMIMARTINGALES 19 by the deﬁnition of τ ( m ) j ( l ) . Since all j ( l ) are distinct, this implies X l ∈ L ′ ( m ) | Π u l − ,u l | ρ ≤ ρ ∞ X j =1 sup τ ( m ) j − ≤ t ′ <τ ( m ) j | Π t ′ ,τ ( m ) j | ρ . Substituting this into (3.5), we conclude the proof of Lemma 3.1. (cid:3)
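For a process on a finite time grid, the $r$-variation can be computed exactly by dynamic programming over the right endpoint of the last interval. The sketch below assumes that $V^r \Pi$ denotes the supremum over finite increasing sequences $u_0 < \dots < u_{l_{\max}}$ of $\bigl( \sum_l | \Pi_{u_{l-1},u_l} |^r \bigr)^{1/r}$, as the sums in (3.3) suggest. The function name and the calling convention `Pi(s, t)` are choices made for this illustration.

```python
def r_variation(Pi, n, r):
    """V^r of a two-parameter process Pi(s, t) on the time grid 0, 1, ..., n.

    best[t] is the largest value of sum_l |Pi(u_{l-1}, u_l)|^r over
    increasing sequences whose last point is t (0 for the empty sequence).
    """
    best = [0.0] * (n + 1)
    for t in range(1, n + 1):
        best[t] = max(best[s] + abs(Pi(s, t)) ** r for s in range(t))
    return max(best) ** (1.0 / r)


# Illustrative paths: a zigzag and a monotone path.
zigzag = [0.0, 1.0, 0.0, 1.0]


def zigzag_increment(s, t):
    return zigzag[t] - zigzag[s]


def linear_increment(s, t):
    return float(t - s)
```

For the zigzag path, the finest partition is optimal and the $2$-variation is $\sqrt{3}$; for the monotone path, a single interval is optimal and the $2$-variation is $3$. The cost of the recursion is quadratic in the number of grid points.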

Corollary 3.2.

Let (Π s,t ) s ≤ t be an adapted process with Π t,t = 0 for all t . Then,for every < ρ < r < ∞ and q ∈ (0 , ∞ ] , we have (3.7) k V r Π k L q . sup τ (cid:13)(cid:13)(cid:13)(cid:16) ∞ X j =1 (cid:0) sup τ j − ≤ t

In this section, we apply the sewing lemma to the processes Π( F, g ) . Lemma 3.3.

Let F s,t be a two-parameter process such that F s,s = 0 and g t a one-parameter process. Suppose that (2.17) holds. Let ρ < and /ρ = 1 /p i, + 1 /p i, for every i . Then, we have (3.8) V ρ Π( F, g ) . X i V p i, F i · V p i, Π( ˜ F i , g ) . Proof.

We will use the sewing lemma [FZ18, Theorem 2.5] with Ξ s,t := Π( F, g ) s,t . By deﬁnition (2.5) and the hypothesis F s,s = 0 , we have Ξ j,j +1 = 0 , so that Π( F, g ) s,t = Ξ s,t − t − X j = s Ξ j,j +1 . Moreover, from Chen’s relation (2.16), we obtain ( δ Ξ) s,t,u = X t ≤ j

3.3. Discrete sums corresponding to Itô integrals.

Here, we combine the results in Sections 3.1 and 3.2 into a statement that holds for arbitrary variational exponents $r$.

Corollary 3.4.

Let < q ≤ ∞ , ≤ q < ∞ , and < r, p ≤ ∞ . Let /q =1 /q + 1 /q and assume /r < /p + 1 / . Let ( F s,t ) be an adapted process suchthat (2.17) holds, g a martingale, and ( τ k ) an adapted partition. Assume that /r < /p i, + 1 /p i, for every i . Then, we have (cid:13)(cid:13) V r Π( F, g ) (cid:13)(cid:13) q . X i (cid:13)(cid:13) V p i, F i · V p i, Π( ˜ F i , g ) (cid:13)(cid:13) q + (cid:13)(cid:13) V p F (cid:13)(cid:13) q k Sg k q . (3.9) Proof.

Deﬁne ρ by /ρ = 1 /p + 1 / . Consider ﬁrst the case ρ ≥ . By Corollary 3.2with ≤ ρ < r < ∞ , it suﬃces to estimate the terms k ℓ ρj sup τ j − ≤ t

One can obtain estimates for $\Pi(F, g)$, with $F$ being a component of a branched rough path, by iterating Corollary 3.4. However, this would involve potentially applying Corollary 3.2 at every step of the iteration, resulting in unnecessary losses. It is in fact more efficient to iterate vector-valued, rather than variational, estimates, which we have already done in Theorem 2.8. Here, we indicate the consequences that Theorem 2.8 has for variation norm estimates.

Corollary 3.5.

Let q ∈ (0 , ∞ ) , q ∈ [1 , ∞ ) , and, for each tree t ∈ T L , let q t ∈ (0 , ∞ ] .Let ρ ∈ (0 , ∞ ) and, for each tree t ∈ T L , let r t ∈ [1 , ∞ ] . Let f ∈ F L be a forest andlet F be the set of all forests f ′ that are the disjoint unions of arbitrary partitions oftrees in f into subtrees. Assume that, for each f ′ ∈ F , we have /q = 1 /q + X t ∈ f ′ /q t , /ρ = 1 / X t ∈ f ′ /r t . Let F be an adapted family of branched rough paths and g a martingale. Then, forevery r > ρ , we have (3.10) (cid:13)(cid:13) V r Π( F f , g ) (cid:13)(cid:13) q . X f ′ ∈ F (cid:16)Y t ∈ f ′ (cid:13)(cid:13) V r t F t (cid:13)(cid:13) q t (cid:17) k Sg k q . Proof.

Consider ﬁrst the case ρ ≥ . By Corollary 3.2, it suﬃces to estimate(3.11) k ℓ ρk sup τ k − ≤ t

4.1. Itô integral.

Proof of Theorem 1.1, part 1.

Since Π π ( F, g ) t,t ′ is càdlàg in both t and t ′ , we have V r Π π ( F, g ) = lim n →∞ sup l max ,u < ···

Let

F, F i , ˜ F i be càdlàg processes such that (1.9) holds and F it,t = 0 for all i, t . Suppose that V p F ∈ L q and V ∞ ˜ F i ∈ L q for every i . Then, for every ˜ p ∈ ( p , ∞ ) ∪ {∞} , we have lim π k V ˜ p ( F − F ( π ) ) k L q = 0 . Proof.

Since V p F ( π ) ≤ V p F and by Hölder’s inequality, it suﬃces to consider ˜ p = ∞ .Let ǫ > and deﬁne a sequence of stopping times recursively, starting with π := 0 ,by π j +1 := min n t > π j (cid:12)(cid:12)(cid:12) sup s ≤ π j | F s,t − F s,π j | ≥ ǫ or sup π j ≤ s ′ ≤ t max i | F is ′ ,t | ≥ ǫ o . Then, by (1.9), for any adapted partition π ′ ⊇ π and s ≤ t , we have | F s,t − F ( π ′ ) s,t | ≤ | F s,t − F ⌊ s,π ′ ⌋ ,t | + | F ⌊ s,π ′ ⌋ ,t − F ⌊ s,π ′ ⌋ , ⌊ t,π ′ ⌋ |≤ X i | F i ⌊ s,π ′ ⌋ ,s || ˜ F is,t | + | F ⌊ s,π ′ ⌋ ,t − F ⌊ s,π ′ ⌋ , ⌊ t,π ′ ⌋ | . ≤ X i ǫ · V ∞ ˜ F i + ǫ. (cid:3) Remark . Some structural condition on the two-parameter process F is necessaryin Lemma 4.1. Even if F is deterministic, continuous, and vanishes on the diagonal, F ( π ) does not necessarily converge to F uniformly. To see this, let φ : R → [0 , bea smooth function such that φ = 0 on ( −∞ , and φ = 1 on (1 , ∞ ) . Let F ( s, t ) := φ ( st ) φ ( t − s ) . Then, for any partition π , for s < π , we have F ( s, t ) − F (0 , t ) → as t → ∞ .In the above example, F is not uniformly continuous. Convergence can also fail foruniformly continuous in time processes if their samples are not equicontinuous. Tosee this, let Ω = (0 , with the Lebesgue measure, F t the trivial σ -algebra for t < / and the Lebesgue σ -algebra for t ≥ / . Let F ( s, t ) := φ (2 sφ (3 t − /ω ) φ (3( t − s )) ,where ω ∈ Ω and ≤ s ≤ t ≤ . For any ≤ s ≤ t ≤ / , we have F ( s, t ) = 0 ,so this process is indeed measurable with respect to the given ﬁltration. For anyadapted partition π , there is an < s ≤ / such that s ≤ π ( ω ) for a.e. ω ∈ Ω .Let < s < s and t ≥ / . Then F ( s, t ) − F (0 , t ) = φ (2 s/ω ) − φ (0) = 1 for ω < s, so that k V ∞ ( F − F ( π ) ) k L ∞ = 1 . Proof of Theorem 1.1, part 2.

By definition of a Cauchy net, the existence of the limit (1.13) will follow if we can show that
(4.3) $\lim_{\pi} \sup_{\pi' \supseteq \pi} \bigl\| V^r \bigl( \Pi^{\pi}(F, g) - \Pi^{\pi'}(F, g) \bigr) \bigr\|_{L^q} = 0 .$
However, by (4.1), we have
$$ \Pi^{\pi}(F, g) - \Pi^{\pi'}(F, g) = \Pi^{\pi'}( F^{(\pi)} - F^{(\pi')} , g ) . $$
It follows from (4.2) that
$$\begin{aligned}
( F^{(\pi)}_{s,u} - F^{(\pi')}_{s,u} ) - ( F^{(\pi)}_{t,u} - F^{(\pi')}_{t,u} )
&= \sum_i F^{i,(\pi)}_{s,t} \tilde{F}^{i,(\pi)}_{t,u} - \sum_i F^{i,(\pi')}_{s,t} \tilde{F}^{i,(\pi')}_{t,u} \\
&= \sum_i ( F^{i,(\pi)}_{s,t} - F^{i,(\pi')}_{s,t} ) \tilde{F}^{i,(\pi)}_{t,u} + \sum_i F^{i,(\pi')}_{s,t} ( \tilde{F}^{i,(\pi)}_{t,u} - \tilde{F}^{i,(\pi')}_{t,u} ) .
\end{aligned}$$
Therefore, we can estimate the norm in (4.3) using part 1 of Theorem 1.1 with some $\tilde{p} \in (p_1, \infty) \cup \{ p_1 \}$ in place of $p_1$, which is possible because (1.8) is an open condition. The bound that we obtain converges to $0$ by the hypothesis and Lemma 4.1.

The Chen relation (1.15) follows from the corresponding relation (2.16) for the discrete paraproduct. $\square$

4.2. Mesh convergence.

Theorem 1.1 can be used to recover the classical results about discrete approximations to the Itô integral. We begin with the simpler case of continuous integrands.

Corollary 4.2.

In the situation of part 2 of Theorem 1.1, suppose that F = δf , q , q < ∞ , and the process f has a.s. continuous paths. Then convergence in (1.13) holds in the stronger sense that (4.4) Π( δf, g ) = lim mesh( π ) → Π π ( δf, g ) in L q ( V p ) , where π ranges over adapted partitions. OUGH SEMIMARTINGALES 23

Proof.

In view of the uniform bound in part 1 of Theorem 1.1, it suffices to consider a bounded time interval. On such an interval, the paths of $f$ are uniformly continuous. Therefore, $F^{(\pi)} \to F$ uniformly as $\mathrm{mesh}(\pi) \to 0$. Since the $F^{(\pi)}$ are also uniformly bounded in $L^{q_1}(V^{p_1})$, we have $F^{(\pi)} \to F$ in $L^{q_1}(V^{\tilde{p}})$ for any $\tilde{p} \in (p_1, \infty) \cup \{ \infty \}$. We can choose $\tilde{p}$ such that $1/r < 1/\tilde{p} + 1/2$. It remains to apply the estimate (1.14) with $p_1$ replaced by $\tilde{p}$ to
$$ \Pi(F, g) - \Pi^{\pi}(F, g) = \Pi( F - F^{(\pi)} , g ) . \qquad \square $$

Next, we recover the convergence result for discrete approximations to the Itô integral in the presence of jumps. First, let us recall the sense in which the Itô integral is usually defined.

Deﬁnition 4.3.

Suppose that, for every adapted partition $\pi$, we are given a one-parameter process $(f^{\pi}_t)_t$. We say that the family $f^{\pi}$ converges to a process $(f_t)_t$ in the mesh u.c.p. (uniform on compacts in probability) sense if
(4.5) $(\forall T > 0) \, (\forall \epsilon > 0) \, (\exists \delta > 0) \, (\forall \pi : \mathrm{mesh}(\pi) < \delta) \quad \mathbb{P} \bigl\{ \sup_{0 \le t' \le T} | f^{\pi}_{t'} - f_{t'} | > \epsilon \bigr\} < \epsilon .$
We denote this mode of convergence by
(4.6) $\operatorname*{u.c.p.-lim}_{\mathrm{mesh}(\pi) \to 0} f^{\pi} = f .$
If $\pi$ is only allowed to range over deterministic partitions, we denote this by $\text{d-mesh}(\pi) \to 0$.

Lemma 4.4.

Let g be a càdlàg local martingale. Then there exists a localizingsequence ( τ k ) such that, for every k , we have g ( τ k ) ∈ L ( V ∞ ) .Proof. Without loss of generality, g = 0 . Let (˜ τ k ) be a localizing sequence such that g (˜ τ k ) t ∈ L for each k, t and ( g (˜ τ k ) t ) t is a martingale for each k . Deﬁne τ k := ˜ τ k ∧ k ∧ inf { t | | g t | ≥ k } . Then V ∞ g ( τ k ) ≤ k + | g τ k | . The ﬁrst summand is in L ∞ ⊂ L . For the second summand, we have E | g τ k | = E | g (˜ τ k ) τ k | ≤ E | g (˜ τ k ) k | < ∞ . (cid:3) Now, we can recover the existence of Itô integrals.

Corollary 4.5.

Let f be a càdlàg adapted process and g a càdlàg local martingale.Then, there exists the limit (4.7) Π( f, g ) , · = u . c . p . -lim mesh( π ) → Π π ( f, g ) , · . Note that the two-parameter supremum sup ≤ t ≤ t ′ ≤ T | Π π ( f, g ) t,t ′ − Π( f, g ) t,t ′ | does not converge to if f has jumps. Indeed by Chen’s relation, it is bounded belowby a multiple of sup ≤ t ≤ T | δ ( f − f ( π ) ) ,t δg t,T | = sup ≤ t ≤ T | ( f t − f ⌊ t,π ⌋ ) δg t,T | , and the diﬀerence ( f t − f ⌊ t,π ⌋ ) does not converge to if f has jumps. Proof of Corollary 4.5.

We may assume without loss of generality that f = 0 and g = 0 . Let (˜ τ k ) be a localizing sequence for g given by Lemma 4.4. Then τ k := ˜ τ k ∧ inf { t | | f t | > k } is also a localizing sequence. Fix T > and ǫ > . For a suﬃciently large k , we willhave P { τ k ≤ T } < ǫ/ . Replacing g by ( g t ∧ τ k ) t and f by ( f t ∧ τ k − ) t , we may assume that g ∈ L ( V ∞ ) and f ∈ L ∞ ( V ∞ ) .By part 2 of Theorem 1.1 with q = 1 and any r > , there exists an adaptedpartition π ◦ such that, for every adapted partition π ′ ⊇ π ◦ , we have (cid:13)(cid:13)(cid:13) V r (Π π ′ ( f, g ) − Π( f, g )) (cid:13)(cid:13)(cid:13) L q (Ω) < ( ǫ/ /q . In particular, for every adapted partition π ′ ⊇ π ◦ , we have P Ω π ′ < ǫ/ , Ω π ′ := { sup ≤ t ≤ T | Π π ′ ( f, g ) ,t − Π( f, g ) ,t | > ǫ/ } . Since V ∞ f is ﬁnite a.s., there exists A < ∞ such that P Ω < ǫ/ , Ω := { sup t ≤ T | f t | > A } < ǫ/ . Since lim j →∞ π ◦ j = ∞ a.s., there exists J ∈ N such that P Ω < ǫ/ , Ω := { π ◦ J < T } . Since g t is right continuous in t and measurable on Ω , there exists δ > such that P Ω < ǫ/ , Ω := { sup j ≤ J sup ≤ s ≤ δ | g π ◦ j + s − g π ◦ j | > ǫ/ (10 AJ ) } and P Ω < ǫ/ , Ω := { min j ≤ J | π ◦ j +1 − π ◦ j | ≤ δ } . We will show that this δ works for (4.7).Let π be an adapted partition with mesh( π ) < δ . Let π ′ := π ∪ π ◦ , this is anotheradapted partition. For every π ′ l ∈ π ◦ \ π and π ′ l < t ′ , we will use the identity(4.8) f π ′ l − ( g π ′ l ∧ t ′ − g π ′ l − ) + f π ′ l ( g π ′ l +1 ∧ t ′ − g π ′ l )= f π ′ l − ( g π ′ l +1 ∧ t ′ − g π ′ l − ) + ( f π ′ l − f π ′ l − )( g π ′ l +1 ∧ t ′ − g π ′ l ) . Now, if ω ∈ Ω \ Ω , then π ′ l − , π ′ l +1 π ◦ in the situation of (4.8). Therefore, the ﬁrstterm on the right-hand side of (4.8) appears in Π π . Therefore, for every t ′ ≤ T , wehave | Π π ′ ( f, g ) ,t ′ − Π π ( f, g ) ,t ′ | = (cid:12)(cid:12)(cid:12) X l : π ′ l ∈ π ◦ \ π and π ′ l

5.1. Variation norm estimate.

The main diﬃculty in deﬁning [ Y, g ] for an X -controlled process Y and a martingale g is to handle the contribution of the jumpsof X . This is done by the following result. Recall OUGH SEMIMARTINGALES 25

Theorem 5.1.

Let < q, q ≤ ∞ , ≤ q < ∞ with /q = 1 /q + 1 /q . Let ( g t ) t ≥ be a càdlàg martingale and ( Y ′ ) t ≥ a càdlàg adapted process. Let I ⊂ (0 , ∞ ) be acountable subset and (∆ t ) t ∈ I a (deterministic) sequence. Consider the process (5.1) B t,t ′ := X j ∈ I ∩ ( t,t ′ ] Y ′ j − ∆ j δg j − ,j . Then, for every p ∈ [2 , ∞ ] and /r < / /p , with M Y ′ = sup t | Y ′ t | , (5.2) k V r B k L q . k M Y ′ k L q (cid:0)X j ∈ I | ∆ j | p (cid:1) /p k (cid:0)X j ∈ I | δg j − ,j | (cid:1) / k L q . Proof.

We will ﬁrst show that the estimate (5.2) holds for ﬁnite sets I . This willimmediately imply that the series (5.1) converges unconditionally in L q ( V r ) and thatits limit also satisﬁes the estimate (5.2).When I is ﬁnite, we may assume that we are in discrete time, which correspondsto the case I = { , . . . , N } and Y ′ , g being constant on intervals [ n, n + 1) for n ∈ N .By Corollary 3.2, it suﬃces to estimate the L q norm of(5.3) (cid:13)(cid:13)(cid:13) sup τ k − ≤ t

Discretization of quadratic covariation.Deﬁnition 5.2.

Let g = ( g t ) t ≥ be a càdlàg local martingale. For adapted càdlàgprocesses Y, Z and a deterministic partition π , deﬁne (5.6) Z • [ Y, g ] πT := X π j

Let ≤ ˆ p , p ≤ ∞ . Let X ∈ V p loc be a deterministic càdlàg path. Let Y = (

Y, Y ′ ) be a càdlàg adapted process such that Y ∈ V p loc and R Y ,X ∈ V ˆ p loc almostsurely and Y ′ ∈ L ∞ . Then, there exists a localizing sequence ( τ k ) such that, for every k , the process ˜Y = ( ˜ Y , ˜ Y ′ ) , deﬁned by ˜ Y t = Y t ∧ τ j − , ˜ Y ′ t = ( Y ′ t if t < τ j , if t ≥ τ j , , satisﬁes ˜ Y ∈ L ∞ ( V p ) , M Y ′ ∈ L ∞ , and R ˜Y , ˜ X ∈ L ∞ ( V ˆ p ) , where ˜ X t := X t ∧ k .Proof. Without loss of generality, | Y ′ | ≤ / . Let τ k := k ∧ min { t | max( V p [0 ,t ] Y, sup s ∈ [0 ,t ] | Y ′ s | , V ˆ p [0 ,t ] R Y ,X ) ≥ k } . At this point, we have used the fact that the functions t V p [0 ,t ] Y and t V ˆ p [0 ,t ] R Y ,X is right continuous if X, Y, Y ′ are càdlàg, so that the above minimum in fact exists.For the former function, this is veriﬁed e.g. in [FZ18, Lemma 7.1]; the argument forthe latter function is similar.Then, for any t ≤ t ′ , we have(5.7) R ˜Y , ˜ Xt,t ′ =  R Y ,Xt,t ′ if t ≤ t ′ < τ k , if τ k ≤ t ≤ t ′ ,δY t,τ k − − Y ′ t δX t,t ′ ∧ k , if t < τ k ≤ t ′ . The latter case can only appear once in any ℓ ˆ p norm in the deﬁnition of V ˆ p R ˜Y , ˜ X .Therefore, V ˆ p R ˜Y , ˜ X ≤ V ˆ p [0 ,τ k ) R Y ,X + 2 k + kV ∞ [0 ,k ] X is a bounded function. (cid:3) Theorem 5.4.

Let ˆ p < ≤ p and X ∈ V p loc a deterministic càdlàg path. Supposethat Y = (

Y, Y ′ ) and Z are càdlàg adapted processes, g a càdlàg local martingale,and R Y ,X ∈ V ˆ p loc almost surely. Then (5.8) Z • [Y , g ] := u . c . p . -lim d-mesh( π ) → Z • [ Y, g ] π exists, and we have (5.9) Z • [Y , g ] t = X s ≤ t Z s − ∆ X s Y ′ s − ∆ g s + X s ≤ t Z s − ∆ R Y s ∆ g s , where ∆ g s := δg s − ,s and ∆ R Y s := R Y s − ,s . Moreover, for any /r < / /p , wehave Z • [Y , g ] ∈ V r loc .Remark . The case needed for the construction of the square bracket in Theorem 1.2is Z ≡ . General processes Z are needed in the consisteny result, Theorem 6.5. Proof.

Since (5.6) and (5.9) are linear in Y , we may assume | Y ′ | ≤ upon replacing Y by Y / max(1 , | Y ′ | ) . Similarly, we may assume | Z | ≤ .Using the localizing sequence τ k = min { t | | Z t | > k } and replacing Z by ( Z t ∧ τ k − ) t ,we may assume that Z is uniformly bounded. Using the localizing sequence givenby Lemma 4.4, we may assume g ∈ L ( V ∞ ) . Using the localizing sequence given byLemma 5.3, we may assume that X ∈ V p , Y ∈ L ∞ ( V p ) , and R Y ,X ∈ L ∞ ( V ˆ p ) .Overall, we may assume(5.10) g ∈ L V ∞ , X ∈ V p , M Y ′ , M Z ∈ L ∞ , R Y ,X ∈ L ∞ ( V ˆ p ) . Assuming (5.10), the ﬁrst sum in (5.9) now makes sense by Theorem 5.1 and is in V r loc for any /r < / /p . The second sum in (5.9) almost surely converges OUGH SEMIMARTINGALES 27 absolutely for every t , and in particular deﬁnes a process with almost surely V paths.Now, still assuming (5.10), we will show that the limit (5.8) exists and coincideswith (5.9).Fix T > . Let A ≥ be such that sup t ≤ T | X t | < A and the set Ω := { sup t ≤ T ( | Y t | ∨ | Y ′ t | ∨ | g t | ∨ | Z t | ) < A } has probability ≥ − ǫ .Let J X := { s | | ∆ X s | > ǫ/ (2 A ) } and J Y ( ω ) := { s | | ∆ Y s | > ǫ/ } . Let N < ∞ besuch that | J X | ≤ N and Ω := {| J Y | < N } has probability ≥ − ǫ .Let δ be such that sup t ≤ t ′ ≤ T : | t ′ − t |≤ δ, ( t,t ′ ] ∩ J X = ∅ | δX t,t ′ | < ǫ/A, sup t ∈ ( J X ∪ J Y ) ∩ [0 ,T ] sup δ } , Ω := { sup t ≤ t ′ ≤ T : | t ′ − t |≤ δ, ( t,t ′ ] ∩ J Y = ∅ | δY t,t ′ | < ǫ } , have probability ≥ − ǫ . Let π be a deterministic partition with mesh( π ) < δ .The basic idea to handle the main term is the following. Suppose ω ∈ Ω ∩ · · · ∩ Ω and s ∈ J X ∪ J Y ( ω ) . Suppose π j < s ≤ π j +1 ∧ T . 
Then (cid:12)(cid:12)(cid:12) Z π j δY π j ,π j +1 ∧ T δg π j ,π j +1 ∧ T − Z s − ∆ Y s ∆ g s (cid:12)(cid:12)(cid:12) ≤ | Z π j − Z s − | · | δY π j ,π j +1 ∧ T δg π j ,π j +1 ∧ T | + | Z s − | · | δY π j ,π j +1 ∧ T − ∆ Y s | · | δg π j ,π j +1 ∧ T | + | Z s − ∆ Y s | · | δg π j ,π j +1 ∧ T − ∆ g s | = | Z π j − Z s − | · | δY π j ,π j +1 ∧ T δg π j ,π j +1 ∧ T | + | Z s − | · | δY π j ,s − + δY s,π j +1 ∧ T | · | δg π j ,π j +1 ∧ T | + | Z s − ∆ Y s | · | δg π j ,s − + δg s,π j +1 ∧ T |≤ · (2 A ) · ǫ/ (100 A N ) ≤ ǫ/ (4 N ) . In case s ∈ J Y ( ω ) \ J X , we similarly estimate | Z s − Y ′ s − ∆ X s ∆ g s − Z π j Y ′ π j δX π j ,π j +1 ∧ T δg π j ,π j +1 ∧ T |≤ | Z s − − Z π j | · | Y ′ s − ∆ X s ∆ g s | + | Z π j | · | Y ′ s − − Y ′ π j | · | ∆ X s ∆ g s | + | Z π j Y ′ π j | · | ∆ X s − δX π j ,π j +1 ∧ T | · | ∆ g s | + | Z π j Y ′ π j δX π j ,π j +1 ∧ T | · | ∆ g s − δg π j ,π j +1 ∧ T | . ǫ/N. Since ω ∈ Ω , these errors contribute O ( ǫ ) to the sum over j . Hence, we obtain | X π j

These estimates are uniform in T , so we obtain sup T ≤ T | X π j by Theorem 5.1, since | ∆ X s | = O ( ǫ ) and δX π j ,π j +1 = O ( ǫ ) in all summands. Thecontribution of the supremum involving Y ′ is easy to bound, again because δX = O ( ǫ ) there.The contribution of the sums involving R is bounded by ( X j | R ... | ) / ( X j | δg ... | ) / ≤ (sup j | R ... | ) − ˆ p / ( V ˆ p R ) ˆ p / ( X j | δg ... | ) / . Using that | R ... | = O ( ǫ ) in all these terms and the BDG inequality to estimate thesquare function of g , we see that the contribution of these terms is O ( ǫ − ˆ p / ) in L q . (cid:3) Integration by parts.

The following estimate will be used for boundary terms.

Lemma 5.5.

Let < q , q ≤ ∞ and /q = 1 /q + 1 /q . Let < p , p ≤ ∞ and /r < /p + 1 /p . Let f, g be càdlàg adapted processes. Then (cid:13)(cid:13)(cid:13) V r (cid:0) δf t,t ′ δg t,t ′ (cid:1)(cid:13)(cid:13)(cid:13) L q ≤ sup τ k sup τ k − ≤ t<τ k | f t − f τ k |k L q ( ℓ p ) k sup τ k − ≤ t<τ k | g t − g τ k |k L q ( ℓ p ) . where the supremum is taken over adapted partitions τ .Proof. This is a direct consequence of Corollary 3.2 with /r < /ρ = 1 /p + 1 /p and Hölder’s inequality. (cid:3) Corollary 5.6.

Let ≤ q < ∞ , < q ≤ ∞ , and /q = 1 /q + 1 /q . Let < p ≤ ∞ and /r < / /p . Let f be a càdlàg adapted process and g a càdlàgmartingale. Then (5.11) (cid:13)(cid:13)(cid:13) V r (cid:0) δf t,t ′ δg t,t ′ (cid:1)(cid:13)(cid:13)(cid:13) L q . k V p f k L q k V ∞ g k L q . Proof.

We apply Lemma 5.5 with p = 2 . The resulting L q ( ℓ ) norm can be esti-mated, after discretization, using ﬁrst the vector-valued and then the scalar-valuedBDG inequality. (cid:3) Proof of Theorem 1.2.

For any adapted partition $\pi$ and any càdlàg processes $f, g$, we have the summation by parts identity
\[
\Pi^{\pi}(f,g)_{0,T} + \Pi^{\pi}(g,f)_{0,T} + [f,g]^{\pi}_{T} = (f_T - f_0)(g_T - g_0).
\]
Define
\[
\Pi(g, \mathbf{Y}) := \delta g \, \delta Y - \Pi(Y, g) - \delta [\mathbf{Y}, g]. \tag{5.12}
\]
Convergence (1.23) then follows from Corollary 4.5 and Theorem 5.4.

Chen’s relation (1.24) follows from Chen’s relation (1.15) for $\Pi(Y, g)$.

The variation norm bound (1.25) follows from Corollary 5.6, part 2 of Theorem 1.1, and Theorem 5.1 applied to the respective terms. □
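The summation by parts identity can be checked numerically on sampled paths. The sketch below is only an illustration under assumed conventions that this section does not restate: the discrete paraproduct is taken to be the left-point sum $\sum_j (f_{\pi_j} - f_0)(g_{\pi_{j+1}} - g_{\pi_j})$ and the discrete bracket the sum of products of increments. The function names are hypothetical and the partition is simply the index set of the sampled path.

```python
import random


def discrete_paraproduct(f, g):
    """Left-point sum: sum_j (f[j] - f[0]) * (g[j+1] - g[j]) (assumed convention)."""
    return sum((f[j] - f[0]) * (g[j + 1] - g[j]) for j in range(len(f) - 1))


def discrete_bracket(f, g):
    """Discrete bracket: sum_j (f[j+1] - f[j]) * (g[j+1] - g[j])."""
    return sum((f[j + 1] - f[j]) * (g[j + 1] - g[j]) for j in range(len(f) - 1))


def random_walk(n_steps, rng):
    """Gaussian random walk with a random (nonzero) starting value."""
    path = [rng.gauss(0.0, 1.0)]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, 1.0))
    return path


rng = random.Random(0)
f = random_walk(50, rng)
g = random_walk(50, rng)

# Left-hand side: both paraproducts plus the discrete bracket.
lhs = discrete_paraproduct(f, g) + discrete_paraproduct(g, f) + discrete_bracket(f, g)
# Right-hand side: product of the total increments.
rhs = (f[-1] - f[0]) * (g[-1] - g[0])
identity_gap = abs(lhs - rhs)
```

The identity is purely algebraic (a telescoping sum), so `identity_gap` is zero up to floating-point rounding for any pair of sampled paths, martingale or not.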

5.4. Quadratic covariation of two martingales.

In this section, we recall a few facts about quadratic covariation needed in Section 6 and explain how they fit into the approach to Itô integration provided by Theorem 1.1.

Let $f, g$ be càdlàg martingales. The quadratic covariation process of $f, g$ is defined by
\[
[f,g]_t := \delta f_{0,t} \, \delta g_{0,t} - \Pi(f,g)_{0,t} - \Pi(g,f)_{0,t}.
\]
One can verify that the discrete brackets introduced in (5.6) satisfy
\[
\delta f_{0,t} \, \delta g_{0,t} - \Pi^{\pi}(f,g)_{0,t} - \Pi^{\pi}(g,f)_{0,t} = [f,g]^{\pi}_{0,t}.
\]
Therefore, Corollary 4.5 recovers the existence of the limit that is usually used to define the quadratic covariation:
\[
[f,g]_t = \operatorname*{u.c.p.-lim}_{\operatorname{mesh}(\pi) \to 0} \Bigl( \delta f_{0,t} \, \delta g_{0,t} - \Pi^{\pi}(f,g)_{0,t} - \Pi^{\pi}(g,f)_{0,t} \Bigr).
\]
In particular, in the case $g = f$, the function $t \mapsto [g]_t := [g,g]_t$ is a.s. monotonically increasing and locally bounded. Passing to the limit in the vector-valued BDG inequality, Lemma 2.4, we obtain the estimate
\[
\bigl\| V^{\infty} h^{(k)} \bigr\|_{L^q(\ell^r_k)} \lesssim_{q,r} \bigl\| [h^{(k)}]^{1/2} \bigr\|_{L^q(\ell^r_k)}, \tag{5.13}
\]
where $h^{(k)}$ are càdlàg martingales, $[h] = [h]_{\infty} = \lim_{t \to \infty} [h,h]_t$, and the hypotheses on the exponents $q, r$ are the same as in Lemma 2.4.

Finally, we recall the (almost sure, pathwise) Itô isometry
\[
[\Pi(f,g)_{s,\cdot}]_t = \int_{(s,t]} |f_{u-} - f_s|^2 \, \mathrm{d}[g]_u, \tag{5.14}
\]
where the integral is taken in the Riemann–Stieltjes sense.

6. Consistency of rough and stochastic integration

Let $g$ be a càdlàg local martingale and $\mathbf{g} = (g, \Pi(g,g))$ the $p$-rough path lift (with $p \in (2,3)$) provided by Theorem 1.1 with $F = \delta g$. It is well known that, for any $\mathbf{g}$-controlled $p$-rough adapted process $\mathbf{A} = (A, A')$, the Itô integral and the rough integral coincide almost surely:
\[
\int \mathbf{A}_{u-} \, \mathrm{d}\mathbf{g}_u = \int A_{u-} \, \mathrm{d}g_u, \tag{6.1}
\]
see e.g. [FH20, Proposition 5.1] for the case of Brownian motion and references given there for historical information. We begin with a generalization of this fact, in which one of the copies of $g$ is replaced by a further process $Y$ and $Z$ plays the role of $A'$.

Lemma 6.1.

Let g be a càdlàg local martingale and Y, Z càdlàg adapted processes.Then, along adapted partitions π , we have (6.2) u . c . p . -lim mesh( π ) → (cid:16) X π j
Y, Z , is the main ingredient in showing consistency results such as (6.1). Indeed, the difference between the discrete approximations of the two sides of (6.1) is precisely the sum in (6.2). More generally, one can replace the rough lift $\mathbf{g}$ by a rough semimartingale $g + \tilde{g}$, where $\tilde{g}$ is independent from $g$, and the controlled process $A$ by another process that is a $g$-controlled rough semimartingale conditionally on each path of $g$.
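The vanishing of the difference between rough and Itô Riemann sums can be illustrated by simulation. The following sketch is a hedged example, not the setting of Lemma 6.1 in full generality. It assumes that $g$ is a standard Brownian motion on $[0,1]$, that the partitions are deterministic and uniform, that $A = \sin(g)$ with Gubinelli derivative $A' = \cos(g)$, and that the one-step iterated Itô integral is $\Pi(g,g)_{s,t} = ((g_t - g_s)^2 - (t-s))/2$. Under these assumptions the compensated (rough) and plain (Itô) Riemann sums differ by $\sum_j A'_{\pi_j} \Pi(g,g)_{\pi_j,\pi_{j+1}}$, which is the quantity computed below. All function names are hypothetical.

```python
import math
import random


def compensator_sum(n_steps, rng, horizon=1.0):
    """Sum of A'_{t_j} * Pi(g, g)_{t_j, t_{j+1}} along a uniform partition.

    Assumes g is a simulated Brownian motion and A' = cos(g).
    """
    dt = horizon / n_steps
    g = 0.0
    total = 0.0
    for _ in range(n_steps):
        dg = rng.gauss(0.0, math.sqrt(dt))
        iterated = 0.5 * (dg * dg - dt)  # one-step iterated Ito integral of BM
        total += math.cos(g) * iterated  # left-point evaluation of A'
        g += dg
    return total


def rms(n_steps, n_paths, seed):
    """Root mean square of the compensator sum over independent paths."""
    rng = random.Random(seed)
    squares = (compensator_sum(n_steps, rng) ** 2 for _ in range(n_paths))
    return math.sqrt(sum(squares) / n_paths)


rms_coarse = rms(16, 200, seed=1)
rms_fine = rms(1024, 200, seed=1)
```

In this Brownian example the second moment of the sum is of order the mesh size, so `rms_fine` should be markedly smaller than `rms_coarse`, in line with the convergence to zero asserted along refining partitions.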

Proof of Lemma 6.1.

Without loss of generality, Y = 0 . Multiplying Z by an F -measurable time-independent function, we may also assume | Z | ≤ . Similarly to(5.10), we may assume g ∈ L V ∞ , M Y, M Z ∈ L ∞ . By the BDG inequality and Itô isometry (5.14), we have E sup T (cid:12)(cid:12)(cid:12) X π j be arbitrary. By the càdlàg propertyof Y , there are ﬁnitely many points ( s k ) such that | ∆ Y s k | ≥ ǫ , and there exists δ > such that V ∞ Y | ( s k − ǫ,s k ) < ǫ , V ∞ Y | [ s k ,s k + ǫ ] < ǫ , and for every interval J such that s k J for all k we have V ∞ Y | J < ǫ . It follows that, for every partition ( π ) with mesh( π ) < δ , we have Z (0 ,T ] | δY ⌊ u − ,π ⌋ ,u − | d[ g ] u . ǫ Z (0 ,T ] d[ g ] u + X k | ∆ Y s k | Z ( s k ,s k + δ ) d[ g ] u . ≤ ǫ Z (0 ,T ] d[ g ] u + X k | ∆ Y s k | δ [ g ] s k + ,s k + δ . (6.4)The ﬁrst term is clearly arbitrarily small, and the second term also becomes arbi-trarily small as δ decreases because the sum is ﬁnite and u [ g ] u is monotonic. (cid:3) Lemma 6.2.

Let ˆ p < ≤ p . Let X ∈ V p loc be a deterministic càdlàg path. Supposethat Y = (

Y, Y ′ ) is a càdlàg adapted process, Z a càdlàg adapted process, g a càdlàglocal martingale, R Y ,X ∈ V ˆ p loc a.s.. Then u . c . p . -lim d-mesh( π ) → (cid:16) X π j
By deﬁnition (5.12), we have X π j
The first term on the right-hand side is, by Definition 5.2, equal to $Z \bullet [Y, g]^{\pi}$. By Theorem 5.4, it converges to $Z \bullet [\mathbf{Y}, g]$.

The middle term equals $Z^{(\pi)} \bullet [\mathbf{Y}, g]$. This also converges to $Z \bullet [\mathbf{Y}, g]$ as $\operatorname{mesh}(\pi) \to 0$ by an argument similar to (6.4). □

If $(g + Y, Y')$ is an $X$-controlled $p$-RSM, $p \in (2,3)$, then $\mathbf{Z} = (Z, Z')$ with
\[
Z = g + Y, \qquad Z'_t(\delta X, \delta g) = Y'_t \, \delta X + \delta g \tag{6.5}
\]
is easily seen to be an $(X, g)$-controlled $p$-rough process. Indeed, $g \in V^p_{\mathrm{loc}}$ almost surely by Lemma 4.4 and Lépingle’s inequality (1.2). It remains to observe that
\[
R^{Z}_{s,t} = \delta Z_{s,t} - Z'_t(\delta X_{s,t}, \delta g_{s,t})
= \delta g_{s,t} + \delta Y_{s,t} - Y'_t \, \delta X_{s,t} - \delta g_{s,t}
= R^{Y}_{s,t}.
\]
The converse implication is more subtle, because the $g$ component of the Gubinelli derivative of an $(X, g)$-controlled process need not be the identity.

Theorem 6.3.

Let $p \in (2,3)$ and $X \in V^p_{\mathrm{loc}}$ be a deterministic càdlàg path. Let $g$ be a càdlàg local martingale. Let $\mathbf{Z} = (Z, Z')$ be an adapted càdlàg $(X, g)$-controlled $p$-rough process. Then $(Z, Z'(\cdot, 0))$ is an $X$-controlled $p$-RSM: $(Z, Z'(\cdot, 0)) = (\tilde{g} + \tilde{Y}, \tilde{Y}')$, with the local martingale part given by
\[
\tilde{g}_T := \Pi(Z'(0, \cdot), g)_{0,T} \tag{6.6}
\]
and Gubinelli derivative
\[
\tilde{Y}'_T := Z'_T(\cdot, 0). \tag{6.7}
\]

Proof of Theorem 6.3.

With the local martingale component defined by (6.6), the controlled rough component will be defined by
\[
\tilde{Y}_T := Z_T - \tilde{g}_T.
\]
It follows from Lépingle’s inequality (1.2) and localization, Lemma 4.4, that $\tilde{Y} \in V^p_{\mathrm{loc}}$ almost surely. It remains to show that $R^{\tilde{\mathbf{Y}},X} \in V^{p/2}$ almost surely. To this end, with $s < t$, we write
\begin{align*}
R^{\tilde{\mathbf{Y}},X}_{s,t}
&= \tilde{Y}_t - \tilde{Y}_s - Z'_s(X_t - X_s, 0) \\
&= Z_t - Z_s - \Pi(Z'(0,\cdot), g)_{0,t} + \Pi(Z'(0,\cdot), g)_{0,s} - Z'_s(X_t - X_s, 0) \\
&= \bigl( Z_t - Z_s - Z'_s(X_t - X_s, g_t - g_s) \bigr) - \Pi(Z'(0,\cdot), g)_{0,t} + \Pi(Z'(0,\cdot), g)_{0,s} + Z'_s(0, g_t - g_s) \\
&= R^{\mathbf{Z},(X,g)}_{s,t} - \Pi(Z'(0,\cdot), g)_{s,t}. \tag{6.8}
\end{align*}
The former term is in $V^{p/2}$ by the hypothesis. The latter term is in $V^{p/2}$ by Theorem 1.1 and localization similar to Lemma 5.3. □

Corollary 6.4.

Let p ∈ (2 , If ( g + Y, Y ′ ) is an X -controlled, p -rough semimartin-gale and σ ∈ C , then ( σ ( g + Y ) , Dσ ◦ Y ′ ) is also an X -controlled p -rough semi-martingale.Proof. By (6.5), g + Y can be lifted to an ( X, g ) -controlled p -rough process. Thecomposition of this process with σ is again an ( X, g ) -controlled p -rough path, seee.g. [FZ18, Remark 4.15], to which we can apply Theorem 6.3. (cid:3) Remark . Theorem 6.3 has an analog for classical semimartingales. Let g be a càdlàglocal martingale and Z = (

Z, Z ′ ) a càdlàg adapted process such that R Z ,g ∈ V and Z ′ ∈ V . Then Z must be a semimartingale. Indeed, let ˜ g T := Π( Z ′ , g ) T , Y T := Z T − ˜ g T , Y ′ T := 0 .

Then, by the same calculation as in (6.8), we have δY s,t = R Y , s,t = − Π( Z ′ , g ) s,t . It follows from the ℓ -valued estimate in Corollary 2.7 that Y ∈ V , so that Z is asemimartingale. Theorem 6.5.

Let $p \in (2,3)$ and $\mathbf{X} = (X, \mathbb{X})$ be a deterministic càdlàg $p$-rough path. Let $g$ be a càdlàg local martingale. Let $\mathbf{Z} = (Z, Z')$ be an adapted càdlàg $(X, g)$-controlled $p$-rough process. Then
\[
\int \mathbf{Z} \, \mathrm{d}J(\mathbf{X}, g) = \Pi(Z, (X, g)),
\]
where the left-hand side is the pathwise rough integral and the right-hand side is the RSM integral.

Proof. The right-hand side makes sense by Theorem 6.3. Expanding the definitions, we see that the difference between the two sides vanishes by Lemma 6.1 and Lemma 6.2. □

Appendix A. Hölder estimates for martingale transforms

For a two-parameter process

Π = (Π t,t ′ ) ≤ t
In the situation of Theorem 1.1, part 2, suppose that all processeshave a.s. continuous paths and restrict the time parameter to a ﬁnite interval, t ∈ [0 , . Let ≤ γ < α + β = α i + β i with α, β, α i , β i ≥ . Then, we have (cid:13)(cid:13) H γ Π( F, g ) (cid:13)(cid:13) L q . k H β f k L q k H α ( Sg ) k L q + X i (cid:13)(cid:13) H α i F i · H β i Π( ˜ F i , g ) (cid:13)(cid:13) L q Proof.

We abbreviate X := Π( F, g ) .Consider the deterministic partitions τ ( n ) = 2 − n N , ˜ τ ( n ) = { , } ∪ (2 − n N + 2 − n − ) .Let K n := sup j ∈ N sup τ ( n ) j − ≤ t ≤ t ′ ≤ τ ( n ) j | X t,t ′ | , and deﬁne ˜ K n analogously with ˜ τ ( n ) in place of τ ( n ) . Then, we have sup | t − t ′ |≤ − n − | X t,t ′ | ≤ K n + ˜ K n , sup | t − t ′ |≤ | X t,t ′ | ≤ K . It follows that sup | t − t ′ |≤ − n − | t − t ′ | − γ | X t,t ′ | . nγ K n + 2 nγ ˜ K n . Therefore, H γ X . max n ∈ N γn ( K n + ˜ K n ) . It follows that k H γ X k qL q . ∞ X n =0 (cid:0) γn k K n k L q (cid:1) q + ∞ X n =0 (cid:0) γn k ˜ K n k L q (cid:1) q . The two sums are similar, so we only consider the ﬁrst one. Let < r < ∞ be suchthat γ + 1 /r < α + β . By Theorem 2.6, which passes to the continuous time case,we have γn k K n k L q ≤ γn k ℓ rj sup τ ( n ) j − ≤ t ≤ t ′ ≤ τ ( n ) j | X t,t ′ |k L q . γn X i (cid:13)(cid:13) ℓ rk (cid:0) sup τ ( n ) k − ≤ s
Fundamentals of stochastic filtering. Vol. 60. Stochastic Modelling and Applied Probability. Springer, New York, 2009, pp. xiv+390.

[Bou89] J. Bourgain. “Pointwise ergodic theorems for arithmetic sets”. In: Inst. Hautes Études Sci. Publ. Math. 69 (1989). With an appendix by the author, Harry Furstenberg, Yitzhak Katznelson and Donald S. Ornstein, pp. 5–45.

[CF19] I. Chevyrev and P. K. Friz. “Canonical RDEs and general semimartingales as rough paths”. In: Ann. Probab.

[Che+19] I. Chevyrev, P. K. Friz, A. Korepanov, I. Melbourne, and H. Zhang. “Multiscale systems, homogenization, and rough paths”. In: Probability and analysis in interacting physical systems, In Honor of S.R.S. Varadhan, Berlin, August, 2016. Vol. 283. Springer Proc. Math. Stat. Springer, Cham, 2019, pp. 17–48.

[CL05] L. Coutin and A. Lejay. “Semi-martingales and rough paths theory”. In: Electron. J. Probab. 10 (2005), no. 23, 761–785.

[Coq+06] F. Coquet, A. Jakubowski, J. Mémin, and L. Słomiński. “Natural decomposition of processes and weak Dirichlet processes”. In: In memoriam Paul-André Meyer: Séminaire de Probabilités XXXIX. Vol. 1874. Lecture Notes in Math. Springer, Berlin, 2006, pp. 81–116. arXiv: math/0403461.

[CR07] R. Coviello and F. Russo. “Nonsemimartingales: stochastic differential equations and weak Dirichlet processes”. In: Ann. Probab. arXiv: math/0602384.

[Cri+13] D. Crisan, J. Diehl, P. K. Friz, and H. Oberhauser. “Robust filtering: correlated noise and multidimensional observation”. In: Ann. Appl. Probab.

[Dav11] M. H. A. Davis. “Pathwise nonlinear filtering with correlated noise”. In: The Oxford handbook of nonlinear filtering. Oxford Univ. Press, Oxford, 2011, pp. 403–424.

[Dav70] B. Davis. “On the integrability of the martingale square function”. In: Israel J. Math.

[DFS17] J. Diehl, P. K. Friz, and W. Stannat. “Stochastic partial differential equations: a rough paths view on weak solutions via Feynman-Kac”. In: Ann. Fac. Sci. Toulouse Math. (6).

[DMT12] Y. Do, C. Muscalu, and C. Thiele. “Variational estimates for paraproducts”. In: Rev. Mat. Iberoam.

[DMT17] Y. Do, C. Muscalu, and C. Thiele. “Variational estimates for the bilinear iterated Fourier integral”. In: J. Funct. Anal.

[DOP19] J.-D. Deuschel, T. Orenshtein, and N. Perkowski. “Additive functionals as rough paths”. 2019.

[DOR15] J. Diehl, H. Oberhauser, and S. Riedel. “A Lévy area between Brownian motion and rough paths with applications to robust nonlinear filtering and rough partial differential equations”. In: Stochastic Process. Appl.

[ER03] M. Errami and F. Russo. “n-covariation, generalized Dirichlet processes and calculus with respect to finite cubic variation processes”. In: Stochastic Process. Appl.

[FH20] P. K. Friz and M. Hairer. A course on rough paths. With an introduction to regularity structures. 2nd ed. Universitext. Springer, 2020.

[FHL20] P. Friz, A. Hocquet, and K. Lê. “Rough Markov diffusions and stochastic differential equations”. In preparation. 2020.

[Föl81] H. Föllmer. “Dirichlet processes”. In: Stochastic integrals (Proc. Sympos., Univ. Durham, Durham, 1980). Vol. 851. Lecture Notes in Math. Springer, Berlin, 1981, pp. 476–478.

[FS17] P. K. Friz and A. Shekhar. “General rough integration, Lévy rough paths and a Lévy-Kintchine-type formula”. In: Ann. Probab.

[FV06] P. Friz and N. Victoir. “The Burkholder-Davis-Gundy inequality for enhanced martingales”. In: Séminaire de probabilités XLI. Vol. 1934. Lecture Notes in Math. Springer, Berlin, 2006, pp. 421–438. arXiv: math/0608783.

[FV10a] P. K. Friz and N. B. Victoir. Multidimensional stochastic processes as rough paths. Vol. 120. Cambridge Studies in Advanced Mathematics. Theory and applications. Cambridge University Press, Cambridge, 2010, pp. xiv+656.

[FV10b] P. Friz and N. Victoir. “Differential equations driven by Gaussian signals”. In: Ann. Inst. Henri Poincaré Probab. Stat.

[FZ18] P. K. Friz and H. Zhang. “Differential equations driven by rough paths with jumps”. In: J. Differential Equations.

[GL97] J. G. Gaines and T. J. Lyons. “Variable step size control in the numerical solution of stochastic differential equations”. In: SIAM J. Appl. Math.

[GN08] J. Guerra and D. Nualart. “Stochastic differential equations driven by fractional Brownian motion and standard Brownian motion”. In: Stoch. Anal. Appl.

[Gub04] M. Gubinelli. “Controlling rough paths”. In: J. Funct. Anal. arXiv: math/0306433.

[Gub10] M. Gubinelli. “Ramification of rough paths”. In: J. Differential Equations. arXiv: math/0610300.

[HK15] M. Hairer and D. Kelly. “Geometric versus non-geometric rough paths”. In: Ann. Inst. Henri Poincaré Probab. Stat.

[Hyt+16] T. Hytönen, J. van Neerven, M. Veraar, and L. Weis. Analysis in Banach spaces. Vol. I: Martingales and Littlewood-Paley theory. Cham: Springer, 2016, pp. xvi+614.

[JM83] N. C. Jain and D. Monrad. “Gaussian measures in B_p”. In: Ann. Probab.

[JSW08] R. L. Jones, A. Seeger, and J. Wright. “Strong variational and jump inequalities in harmonic analysis”. In: Trans. Amer. Math. Soc.

[KN07] P. E. Kloeden and A. Neuenkirch. “The pathwise convergence of approximation schemes for stochastic differential equations”. In: LMS J. Comput. Math. 10 (2007), pp. 235–253.

[KP92] P. E. Kloeden and E. Platen. Numerical solution of stochastic differential equations. Vol. 23. Applications of Mathematics (New York). Springer-Verlag, Berlin, 1992, pp. xxxvi+632.

[KZ19] V. Kovač and P. Zorin-Kranich. “Variational estimates for martingale paraproducts”. In: Electron. Commun. Probab. 24 (2019), Paper No. 48, 14.

[Lep76] D. Lepingle. “La variation d’ordre p des semi-martingales”. In: Z. Wahrscheinlichkeitstheorie und Verw. Gebiete.

[Lyo98] T. J. Lyons. “Differential equations driven by rough signals”. In: Rev. Mat. Iberoamericana.

[Man04] M. Manstavičius. “p-variation of strong Markov processes”. In: Ann. Probab. arXiv: math/0410106.

[MSZ20] M. Mirek, E. M. Stein, and P. Zorin-Kranich. “Jump inequalities via real interpolation”. In: Math. Ann.

[MTT02] C. Muscalu, T. Tao, and C. Thiele. “Uniform estimates on paraproducts”. In: J. Anal. Math. 87 (2002). Dedicated to the memory of Thomas H. Wolff, pp. 369–384. arXiv: math/0106092.

[Mus14] C. Muscalu. “Calderón commutators and the Cauchy integral on Lipschitz curves revisited II. The Cauchy integral and its generalizations”. In: Rev. Mat. Iberoam.

[Osę17] A. Osękowski. “A Fefferman-Stein inequality for the martingale square and maximal functions”. In: Statist. Probab. Lett. 129 (2017), pp. 81–85.

[PX88] G. Pisier and Q. H. Xu. “The strong p-variation of martingales and orthogonal series”. In: Probab. Theory Related Fields.

[Wil01] D. R. E. Williams. “Path-wise solutions of stochastic differential equations driven by Lévy processes”. In: Rev. Mat. Iberoamericana.

[You36] L. C. Young. “An inequality of the Hölder type, connected with Stieltjes integration”. In: Acta Math.

[Zor20] P. Zorin-Kranich. “Weighted Lépingle inequality”. In: Bernoulli.

(PF) Institut für Mathematik, TU Berlin

(PF) Weierstraß–Institut für Angewandte Analysis und Stochastik

E-mail address: [email protected]

(PZK) Mathematical Institute, University of Bonn

E-mail address:

Submitted on 20 Aug 2020 Updated
