A regularity method for lower bounds on the Lyapunov exponent for stochastic differential equations
arXiv [math.DS]
Jacob Bedrossian ∗ Alex Blumenthal † Sam Punshon-Smith ‡ August 3, 2020
Abstract
We put forward a new method for obtaining quantitative lower bounds on the top Lyapunov exponent of stochastic differential equations (SDEs). Our method combines (i) an (apparently new) identity connecting the top Lyapunov exponent to a Fisher information-like functional of the stationary density of the Markov process tracking tangent directions with (ii) a novel, quantitative version of Hörmander's hypoelliptic regularity theory in an $L^1$ framework which estimates this (degenerate) Fisher information from below by a $W^{s,1}_{\mathrm{loc}}$ Sobolev norm. This method is applicable to a wide range of systems beyond the reach of currently existing mathematically rigorous methods. As an initial application, we prove the positivity of the top Lyapunov exponent for a class of weakly-dissipative, weakly forced SDE; in this paper we prove that this class includes the Lorenz 96 model in any dimension, provided the additive stochastic driving is applied to any consecutive pair of modes.
Contents
1.3 Context within prior work
3.4 Positive $X_0$ regularity from negative $X_0$ and positive $X_j$ regularity
3.5 Regularization: Lemma 3.10
4.3 Projective spanning for Lorenz 96
A Qualitative properties of the projective stationary measure
B Proof of Proposition 4.2
References

Mathematics subject classification. Primary: 37H15, 35H10. Secondary: 37D25, 58J65, 35B65.

∗ Department of Mathematics, University of Maryland, College Park, MD 20742, USA. J.B. was supported by National Science Foundation CAREER grant DMS-1552826 and National Science Foundation RNMS
† School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332, USA. A.B. was supported by National Science Foundation grant DMS-2009431.
‡ Division of Applied Mathematics, Brown University, Providence, RI 02906, USA. This material was based upon work supported by the National Science Foundation under Award No. DMS-1803481.
Many nonlinear systems of physical origin exhibit chaotic behavior when subjected to external forcing and weak damping. Here, "chaos" refers to sensitivity with respect to the initial conditions, and is often measured by the Lyapunov exponent, a measure of the asymptotic exponential rate at which nearby trajectories diverge; positivity of the Lyapunov exponent is a well-known hallmark of chaos. Despite the ubiquity of chaos in systems of physical interest, and in contrast with the rather well-developed abstract theory for the description of chaotic states and associated statistical properties, it is notoriously challenging to verify, for a given system, whether or not chaotic behavior is actually present in the above sense.

The purpose of this paper is to put forward a method for providing (at least "preliminary") quantitative lower bounds for the Lyapunov exponents of weakly-damped, weakly-driven SDE. Our method combines two new ingredients: (i) an apparently new identity connecting the largest Lyapunov exponent to a Fisher information-type quantity on the stationary statistics of tangent directions; and (ii) a quantitative hypoellipticity argument showing that this Fisher information-type quantity uniformly controls Sobolev regularity of the tangent-direction stationary statistics, so that regularity provides a lower bound on the Lyapunov exponent. Our methods can potentially be interpreted as the beginning of a quantitative and more robust à la Furstenberg theory for SDEs. See Section 1.3 for a more in-depth discussion of the previously existing work and how ours fits in.

As a first application of our methods, we study a class of high-dimensional SDE commonly used as finite dimensional models in fluid mechanics and other fields.
In [45], Lorenz put forward the following model, now referred to as Lorenz-96 (L96), for a system of $J$ periodically coupled oscillators $u = (u_1, \dots, u_J) \in \mathbb{R}^J$, written here with a small damping parameter $\hat\epsilon > 0$ and subjected to stochastic forcing:

$$\mathrm{d}u_m = \big( (u_{m+1} - u_{m-2})\, u_{m-1} - \hat\epsilon\, u_m \big)\, \mathrm{d}t + q_m\, \mathrm{d}\hat W^m_t, \qquad 1 \le m \le J, \tag{1.1}$$

where $\{\hat W^m_t\}$ is a collection of one-dimensional independent standard Brownian motions, $\{q_m\}$ are fixed parameters, and the $u_m$ are $J$-periodic in $m$, i.e., $u_{m+kJ} := u_m$. Since its introduction, L96 has come to be recognized as a prototypical model of chaotic behavior in spatially extended systems such as those arising in fluid mechanics and similar fields, and serves as a remarkably common benchmark for numerical methods adapted to the analysis of chaotic systems (see, e.g., [31, 46, 48–50] and references therein). As the nonlinearity is bilinear, by rescaling $u$ by $\sqrt{\hat\epsilon}\, u(\sqrt{\hat\epsilon}\, t)$ and a re-definition of $\epsilon := \hat\epsilon^{3/2}$, (1.1) is equivalent to the following system (see Remark 1.2),

$$\mathrm{d}u_m = \big( (u_{m+1} - u_{m-2})\, u_{m-1} - \epsilon\, u_m \big)\, \mathrm{d}t + \sqrt{\epsilon}\, q_m\, \mathrm{d}W^m_t, \qquad 1 \le m \le J, \tag{1.2}$$

where $\{W^m_t\}$ are equal in law to $\{\hat W^m_t\}$. In this form, the damping and driving are balanced in the sense that the stationary measures of (1.2) are tight as $\epsilon \to 0$ (see Appendix A) and these stationary measures converge to (absolutely continuous) invariant measures of the deterministic $\epsilon = 0$ problem. There is little known (with mathematical rigor) regarding the dynamics of the $\epsilon = 0$ problem, nor can one make a perturbative treatment from the existing à la Furstenberg methods for random dynamics; see Section 1.3 for more discussion.

Footnote: "Preliminary" in the sense that the lower bounds we provide are expected to be sub-optimal for most systems of physical interest.

Footnote: More precisely, we provide a lower bound on $n\lambda_1 - 2\lambda_\Sigma$, where $\lambda_\Sigma$ is the sum of the Lyapunov exponents.

Footnote: Note that L96 is distinct from the "butterfly attractor" model, an ODE on $\mathbb{R}^3$, put forward in Lorenz's seminal 1963 work [44]. For this simple 3D model, positivity of Lyapunov exponents for typical initial conditions in a certain parameter regime follows from the well-known computer-assisted proof carried out in [67].

For (1.2), our methods yield the following theorem.

Theorem 1.1.
Let $\Phi^t_\omega : \mathbb{R}^J \to \mathbb{R}^J$ denote the stochastic flow of diffeomorphisms solving (1.2) for almost every random sample path $\omega$. Assume $J \geq$ [...] and that $q_1, q_2 \neq 0$. Then, for every $\epsilon > 0$ sufficiently small, the top Lyapunov exponent

$$\lambda_1^\epsilon = \lim_{t \to \infty} \frac{1}{t} \log |D\Phi^t_\omega(u)|$$

exists, is constant over Leb.-a.e. $u \in \mathbb{R}^J$ and a.e. sample path $\omega$, and satisfies

$$\frac{\lambda_1^\epsilon}{\epsilon} \to \infty \quad \text{as } \epsilon \to 0;$$

in particular, $\lambda_1^\epsilon > 0$ for all $\epsilon$ sufficiently small.

Remarkably, the problem of proving $\lambda_1^\epsilon > 0$ was previously open in spite of overwhelming numerical evidence to support this [19, 31, 50, 57, 59]. In fact, our results apply, in principle, to any model in a wide class including not only Lorenz-96, but also Galerkin truncations of the Navier-Stokes equations, subject to a suitable hypoellipticity condition (Theorem 1.12 below) which currently remains open for Galerkin NSE; this will be the subject of future work. On the other hand, we remark that the scaling $\epsilon^{-1} \lambda_1^\epsilon \to \infty$ is likely to be sub-optimal.

Remark 1.2.
Without the Brownian term and when $\hat\epsilon = 0$, equation (1.1) preserves volume on $\mathbb{R}^J$, and so $0 < \hat\epsilon \ll 1$ can be thought of as a weakly dissipative regime (a property shared, e.g., by Galerkin truncations of NSE). However, as we are taking $t \to \infty$, it is not possible to treat (1.1) directly using a perturbation argument in $\hat\epsilon$, since at $\hat\epsilon = 0$ all trajectories diverge as $t \to \infty$.

On the other hand, we can relate the stochastic flow of diffeomorphisms $\hat\Phi^t_{\hat\omega}$ solving the SDE (1.1) with the stochastic flow $\Phi^t_\omega$ solving (1.2) by

$$\Phi^t_\omega(u) = \sqrt{\hat\epsilon}\; \hat\Phi^{\sqrt{\hat\epsilon}\, t}_{\hat\omega}\big( u / \sqrt{\hat\epsilon} \big)$$

(where $\omega_t = \hat\epsilon^{-1/4}\, \hat\omega_{\sqrt{\hat\epsilon}\, t}$ is a Brownian self-similar rescaling of the noise path $\hat\omega$, so equality of the two flows is interpreted as equality in probabilistic law). Thus, the Lyapunov exponent $\hat\lambda_1^{\hat\epsilon}$ of the stochastic flow $\hat\Phi^t_{\hat\omega}$ satisfies $\hat\epsilon^{-1} \hat\lambda_1^{\hat\epsilon} = \epsilon^{-1} \lambda_1^\epsilon$, and in particular $\hat\lambda_1^{\hat\epsilon} > 0$ if and only if $\lambda_1^\epsilon > 0$.

We first provide our main results relating regularity to the lower bounds on the top Lyapunov exponents for general SDEs. Let $(M, g)$ be a smooth, connected, $n$-dimensional, orientable Riemannian manifold (not necessarily bounded) with no boundary, and consider the stochastic process $x_t \in M$, $t \geq 0$, defined by the (Stratonovich) SDE

$$\mathrm{d}x_t = X_0(x_t)\, \mathrm{d}t + \sum_{k=1}^r X_k(x_t) \circ \mathrm{d}W^k_t, \tag{1.3}$$

where $\{X_k\}_{k=0}^r$ are a family of smooth vector fields (potentially unbounded) on $M$ and $\{W^k\}_{k=1}^r$ are independent standard Wiener processes with respect to a canonical stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t), \mathbf{P})$. Let us list the first of several (relatively mild) standing assumptions to be imposed throughout the paper.

Assumption 1. (i) For each initial data $x \in M$, equation (1.3) has a unique global solution $(x_t)$ with probability 1. The (random) solution maps $x \mapsto x_t =: \Phi^t_\omega(x)$, $t \geq 0$, comprise a (stochastic) flow of $C^r$ diffeomorphisms $(\Phi^t_\omega)$ on $M$, $r \geq 1$.
Moreover, (ii) $(x_t)$ admits a unique, absolutely continuous stationary probability measure $\mu$ on $M$ for which (iii) we have the integrability condition

$$\mathbf{E} \int_M \big[ \log^+ |D\Phi^t(x)| + \log^+ |D\Phi^t(x)^{-1}| \big]\, \mathrm{d}\mu(x) < \infty.$$

Assumption 1(i) is well-studied and follows from mild conditions on (1.3); see, e.g., [5, 37]. When $M$ is compact, item (ii) follows from a parabolic Hörmander condition on the vector fields $\{X_0, \dots, X_r\}$ (see Definition 1.7 below). If $M$ is not compact then some additional constraints are needed to avoid drift to infinity. Given (i) and (ii), item (iii) is standard; see, e.g., [33]. Additional discussion and details are given in Section 2.1.

The following standard result is a corollary of the Kingman subadditive ergodic theorem [35] as well as some basic ergodic theory for random dynamical systems [34]. It provides mathematically rigorous justification for the existence of Lyapunov exponents.

Theorem 1.3.
Assume (1.3) satisfies Assumption 1. Then there exist deterministic constants $\lambda_1$ and $\lambda_\Sigma$, independent of both the random sample $\omega$ as well as $x \in M$, such that for $\mathbf{P} \otimes \mu$ almost every $(\omega, x) \in \Omega \times M$ the following limits hold:

$$\lambda_1 = \lim_{t \to \infty} \frac{1}{t} \log |D\Phi^t_\omega(x)|, \qquad \lambda_\Sigma = \lim_{t \to \infty} \frac{1}{t} \log |\det D\Phi^t_\omega(x)|.$$

The value $\lambda_1$ is the top Lyapunov exponent; the condition $\lambda_1 > 0$ implies sensitivity with respect to initial conditions as well as local moving-frame saddle-type behavior for (random) trajectories for $\mu$-typical initial $x$ for a.e. random sample $\omega \in \Omega$ (see [10, 73]; see also [5, 34, 43] for emphasis on random dynamics). This abstract smooth ergodic theory leans on the Multiplicative Ergodic Theorem [56, 62, 69]; in brief, this result provides a decomposition of the tangent bundle $TM$ into (random) sub-bundles along which various exponential growth rates (a.k.a. Lyapunov exponents) are realized. Similarly, the value $\lambda_\Sigma$ is the sum Lyapunov exponent and describes the asymptotic exponential rate at which Lebesgue volume is contracted/expanded by the dynamics. For more information, see, e.g., the expositions [72, 74].

The purpose of this paper is to put forward a new method for obtaining lower bounds on $\lambda_1$. Results are framed in terms of the augmented Markov process $(x_t, v_t)$ tracking a trajectory in phase space $(x_t)$ and the tangent direction

$$v_t := \frac{D\Phi^t(x)\, v}{|D\Phi^t(x)\, v|}. \tag{1.4}$$

It is straightforward to see that the top Lyapunov exponent $\lambda_1$ is connected to Birkhoff sums of the observable $g_\omega(x, v) := \log \| D\Phi^1_\omega(x)\, v \|$ on the sphere bundle $SM$ (consisting of fibers $S_x M = S^{n-1}(T_x M)$), and so there is a clear connection between $\lambda_1$ and the "augmented" tangent-direction process $(x_t, v_t)$. We refer to $(w_t) = (x_t, v_t)$ as the projective process on $SM$.

Footnote: Here, for $a > 0$ we write $\log^+ a := \max\{\log a, 0\}$ for the positive part of $\log$.
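To make the role of the tangent-direction process (1.4) concrete, the following is a minimal numerical sketch for the stochastic L96 system (1.2). It is not taken from the paper: the function names `l96_drift`, `l96_tangent` and `estimate_top_exponent`, the Euler-Maruyama time stepping, and all parameter values are illustrative assumptions. A tangent vector is pushed forward by the linearized dynamics and renormalized at each step, and the accumulated logarithmic growth gives a finite-time estimate of $\lambda_1^\epsilon$.

```python
import numpy as np

def l96_drift(u, eps):
    """Drift of (1.2): (u_{m+1} - u_{m-2}) u_{m-1} - eps u_m, indices periodic."""
    return (np.roll(u, -1) - np.roll(u, 2)) * np.roll(u, 1) - eps * u

def l96_tangent(u, v, eps):
    """Jacobian of l96_drift at u applied to a tangent vector v."""
    return ((np.roll(v, -1) - np.roll(v, 2)) * np.roll(u, 1)
            + (np.roll(u, -1) - np.roll(u, 2)) * np.roll(v, 1)
            - eps * v)

def estimate_top_exponent(J=8, eps=0.05, dt=1e-3, n_steps=20000, seed=0):
    """Finite-time estimate of the top Lyapunov exponent (illustrative only).

    Assumptions: forcing on the first two modes (q_1 = q_2 = 1, others 0),
    Euler-Maruyama for u, explicit Euler plus renormalization for v as in (1.4).
    """
    rng = np.random.default_rng(seed)
    q = np.zeros(J)
    q[0] = q[1] = 1.0
    u = rng.standard_normal(J)
    v = rng.standard_normal(J)
    v /= np.linalg.norm(v)
    log_growth = 0.0
    for _ in range(n_steps):
        # The noise is additive, so the linearized equation for v has no noise term.
        v = v + dt * l96_tangent(u, v, eps)
        norm = np.linalg.norm(v)
        log_growth += np.log(norm)
        v /= norm  # keep v on the unit sphere, cf. (1.4)
        dW = np.sqrt(dt) * rng.standard_normal(J)
        u = u + dt * l96_drift(u, eps) + np.sqrt(eps) * q * dW
    return log_growth / (n_steps * dt)

if __name__ == "__main__":
    print(estimate_top_exponent())
```

Such an estimate is noisy and depends on the time horizon and step size; it illustrates the definition and is in no way a substitute for the rigorous lower bounds discussed in the text.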
It is not hard to check that $v_t$ solves the SDE

$$\mathrm{d}v_t = V_{\nabla X_0(x_t)}(v_t)\, \mathrm{d}t + \sum_{k=1}^r V_{\nabla X_k(x_t)}(v_t) \circ \mathrm{d}W^k_t,$$

where $\nabla$ denotes the covariant derivative and, for $x \in M$ and $A : T_x M \to T_x M$ linear, the vector field $V_A$ on $S_x M$ is defined by

$$V_A(v) := Av - \langle v, Av \rangle v =: \Pi_v A v.$$
Here, and everywhere below unless specified otherwise, we use the notation $\langle a, b \rangle = g(a, b)$ for $a, b \in T_x M$. The full projective process $(w_t)$ evolves according to

$$\mathrm{d}w_t = \tilde X_0(w_t)\, \mathrm{d}t + \sum_{k=1}^r \tilde X_k(w_t) \circ \mathrm{d}W^k_t. \tag{1.5}$$

Here, for $w = (x, v)$ we regard $T_w SM = T_x M \oplus T_v(S_x M)$ (see Section 2.2) and define $\{\tilde X_k\}_{k=0}^r$ by

$$\tilde X_k(x, v) := \begin{pmatrix} X_k(x) \\ V_{\nabla X_k(x)}(v) \end{pmatrix}.$$

Throughout, we make the following assumption regarding $(w_t)$.

Assumption 2.
The SDE (1.5) defining the process $(w_t)$ satisfies Assumptions 1(i) and (ii). That is, the SDE defining $(w_t)$ is globally well-posed for a.e. random sample and every initial data; and the Markov process $(w_t)$ admits a unique, absolutely continuous stationary measure $\nu$ on $SM$.

Let $\mathrm{d}q$ denote Lebesgue measure on $SM$, let $\nu$ be the stationary measure for the projective process $(x_t, v_t)$, and let $f = \frac{\mathrm{d}\nu}{\mathrm{d}q}$ denote the stationary density; similarly, let $\mu$ be the stationary measure for $(x_t)$ with density $\rho = \frac{\mathrm{d}\mu}{\mathrm{d}x}$. Our first main result is a new formula (to our knowledge) connecting the stationary density $f$ to the exponent $\lambda_1$ through a partial Fisher information-type quantity $FI(f)$ defined by

$$FI(f) := \frac{1}{2} \sum_{k=1}^r \int_{SM} \frac{|\tilde X_k^* f|^2}{f}\, \mathrm{d}q.$$

Here, the vector fields $\tilde X_k$ are regarded as first order differential operators, and $\tilde X_k^*$ denotes the formal dual in $L^2(\mathrm{d}q)$.

Proposition 1.4 (Fisher Information Identity). Let Assumptions 1 and 2 hold. Moreover, assume that (a) the stationary density $f$ satisfies $f \log f \in L^1(\mathrm{d}q)$ and (b) $Q \in L^1(\mu)$ and $\tilde Q \in L^1(\nu)$, where $Q, \tilde Q$ are defined by

$$Q(x) := \operatorname{div} X_0(x) + \frac{1}{2} \sum_{k=1}^r X_k \operatorname{div} X_k(x), \qquad \tilde Q(w) := \operatorname{div} \tilde X_0(w) + \frac{1}{2} \sum_{k=1}^r \tilde X_k \operatorname{div} \tilde X_k(w).$$

Then the following identities hold:

$$FI(\rho) = -\int_M Q\, \mathrm{d}\mu = -\lambda_\Sigma, \qquad FI(f) = -\int_{SM} \tilde Q\, \mathrm{d}\nu = n\lambda_1 - 2\lambda_\Sigma. \tag{1.6}$$

Equivalently, writing $h_x(v) = f(x, v)/\rho(x)$ for the conditional densities on $S_x M$ of $v$ with respect to $x$, we have

$$FI(f) - FI(\rho) = \frac{1}{2} \sum_{k=1}^r \int_M \left( \int_{S_x M} \frac{|(X_k - V^*_{\nabla X_k(x)})\, h_x(v)|^2}{h_x(v)}\, \mathrm{d}v \right) \mathrm{d}\mu(x) = n\lambda_1 - \lambda_\Sigma. \tag{1.7}$$

Footnote: The distinction between $v_t$ and $-v_t$ is irrelevant for Lyapunov exponents, and so morally we regard $(w_t)$ as evolving on the projective bundle $PM$ consisting of fibers $P_x M = P^{n-1}(T_x M)$, the projectivization of the tangent space $T_x M$. However, in practice we will regard $(w_t)$ as a process on the sphere bundle $SM$.

Remark 1.5.
In each line of (1.6), the second equality is a version of the famous Furstenberg-Khasminskii formula (see, e.g., [5]) for the Lyapunov exponents of an SDE satisfying Assumptions 1 and 2 (see Lemma 2.4 for more details). What is new here are the first equalities concerning $FI(\rho)$, $FI(f)$. Equation (1.7) is an equivalent formulation highlighting the relation to the natural quantity $n\lambda_1 - \lambda_\Sigma$ and criteria à la Furstenberg for the Lyapunov exponents of stochastic systems. In particular, note that $FI(f) - FI(\rho) \geq 0$ and $n\lambda_1 > \lambda_\Sigma$ if and only if $FI(f) - FI(\rho) > 0$. See Section 1.3 for more information.

Proposition 1.4 is derived in Section 2. In fact, we give two proofs: the first is a combination of the Furstenberg-Khasminskii formula [5, 32] with the Kolmogorov equation

$$\tilde L^* f = \tilde X_0^* f + \frac{1}{2} \sum_{j=1}^r (\tilde X_j^*)^2 f = 0 \tag{1.8}$$

for $f$; the second proof (which we merely sketch, leaving details to the interested reader) connects $FI(f)$ to a certain relative entropy formula for Lyapunov exponents [11, 26] (see also [41]) intimately connected with the Furstenberg-style approach to Lyapunov exponents of random systems. See Sections 1.3 and 2 for additional discussion.

Remark 1.6.
The Fisher information is a fundamental quantity in the theory of statistical inference and information geometry (see [2]). It is typically used to measure the amount of information that a parametrized family of laws (e.g., the law of one variable conditioned on the value of another) carries about the inference parameter. In this case, $(\nu_x)_{x \in M}$ is the family of laws, so that the Fisher information in (1.7) signifies, on average, how much information about $x$ can be inferred by making observations only in the projective variable $v$.

The identity (1.6) will be most useful in a quantitative sense when studying the small-noise limit. Hence we define, for $\epsilon \in (0, 1]$,

$$\mathrm{d}x^\epsilon_t = X_0^\epsilon(x^\epsilon_t)\, \mathrm{d}t + \sqrt{\epsilon} \sum_{k=1}^r X_k^\epsilon(x^\epsilon_t) \circ \mathrm{d}W^k_t, \tag{1.9}$$

where note that we also allow $X^\epsilon_j$ to be parameterized by $\epsilon$; below we assume natural uniformity properties on this dependence. In this case, (1.6) becomes (now parameterizing everything by $\epsilon$):

$$FI(f^\epsilon) := \frac{1}{2} \sum_{k=1}^r \int_{SM} \frac{|(\tilde X^\epsilon_k)^* f^\epsilon|^2}{f^\epsilon}\, \mathrm{d}q = \frac{n\lambda_1^\epsilon - 2\lambda_\Sigma^\epsilon}{\epsilon}.$$

If $\{\tilde X^\epsilon_1, \dots, \tilde X^\epsilon_r\}$ were to span the tangent space of $SM$ everywhere, then the identity (1.6) would imply that $n\lambda_1^\epsilon - 2\lambda_\Sigma^\epsilon$ is related in a straightforward manner to the regularity of $f^\epsilon$ in the sense of distributional derivatives, i.e., Sobolev norms.

However, in nearly all cases of interest (and especially in the settings we are interested in, such as (1.2)), the collection $\{\tilde X^\epsilon_1, \dots, \tilde X^\epsilon_r\}$ fails to span the tangent space of $SM$. Our second main result, Theorem 1.9 below, overcomes this complication by adapting ideas from Hörmander's hypoelliptic regularity theory to show that, in fact, the partial Fisher information $FI(f)$ actually does control at least some Sobolev regularity of $f$.

In [29], Hörmander isolated the general conditions that guarantee the regularity of solutions to Kolmogorov equations, such as the PDE (1.8) satisfied by $f$, when the forcing directions do not span the tangent space. We recall the classical parabolic Hörmander condition, as it is directly important for the next main result. For vector fields $X, Y$, we write $[X, Y]$ for the usual Lie bracket of $X$ and $Y$.

Definition 1.7.
Given a collection of vector fields $Z_0, Z_1, \dots, Z_r$ on a manifold $M$, we define collections of vector fields $\mathcal{X}_0 \subseteq \mathcal{X}_1 \subseteq \dots$ recursively by

$$\mathcal{X}_0 = \{ Z_j : j \geq 1 \}, \qquad \mathcal{X}_{k+1} = \mathcal{X}_k \cup \{ [Z_j, Z] : Z \in \mathcal{X}_k,\ j \geq 0 \}.$$

We say that $\{Z_i\}_{i=0}^r$ satisfies the parabolic Hörmander condition if there exists $k$ such that for all $w \in M$,

$$\operatorname{span}\{ Z(w) : Z \in \mathcal{X}_k \} = T_w M.$$

Assumption 3 (Projective spanning condition). The vector fields $\{\tilde X_0^\epsilon, \tilde X_1^\epsilon, \dots, \tilde X_r^\epsilon\}$ satisfy the parabolic Hörmander condition on $SM$, uniformly in $\epsilon \in (0, 1]$ on bounded sets (see Definition 3.1 below for the precise statement).

Remark 1.8.
This condition appears routinely in the random dynamics literature: see, for example, [11, 24]. For SDE systems (1.9) it is the primary sufficient condition used to ensure that $(w^\epsilon_t)$ will have at most one absolutely continuous stationary measure as in Assumption 2; indeed, in most practical examples, one will use Assumption 3 to deduce Assumption 2. We also note that Assumption 3 can be shown to imply that $\{X_0^\epsilon, \dots, X_r^\epsilon\}$ satisfies the parabolic Hörmander condition on $M$; see Section 4.

We are now positioned to state our second result, which provides a quantitative hypoelliptic regularity estimate turning the partial information $FI(f^\epsilon)$ of $f^\epsilon$ into a uniform-in-$\epsilon$ estimate of Sobolev regularity in all directions.

Theorem 1.9.
Assume that $\{\tilde X_0^\epsilon, \dots, \tilde X_r^\epsilon\}$ are uniformly bounded in $C^k_{\mathrm{loc}}$ for all $k$ and that Assumptions 1, 2 and 3 hold. Then, there exists $s_* \in (0, 1)$ such that for any bounded, open set $U \subset SM$, there exists $C = C_U > 0$ such that for all $\epsilon \in (0, 1]$,

$$\| f^\epsilon \|_{W^{s_*, 1}(U)} \leq C \left( 1 + \sqrt{FI(f^\epsilon)} \right).$$

Remark 1.10.
It might be possible that there is a slightly more refined version of Theorem 1.9 which replaces $FI$ with $\epsilon^\delta FI$ for some $\delta \in (0, s_*)$ in the statement, which would lead to a more precise lower bound on the Lyapunov exponents in the example below. Such a scaling would be more consistent with the results of [14, 61].

Remark 1.11.
Above, the value $s_*$ is determined exclusively by the number of 'generations' of brackets needed to satisfy Assumption 3 (though note this will generally depend on the dimension of the manifold $M$ itself).

Theorem 1.9 is proved in Section 3. The result is a key aspect of our work and the proof requires some significant effort. It essentially amounts to a quantitative version of Hörmander's a priori estimate for hypoelliptic regularity in an $L^1$ framework; in contrast, Hörmander's original work [29] is based in $L^2$ for fundamental reasons.

One of Hörmander's original insights is that, given regularity in the forcing directions, the PDE (1.8) implies a matching, negative regularity-type estimate defined by duality on $\tilde X_0$ (see also discussions in [4]). Using a delicate regularization procedure, the regularity in $\{\tilde X_j\}_{j=1}^r$ and the negative regularity in $\tilde X_0$ are combined in a suitable manner to obtain regularity in all directions. In order to exploit the negative regularity dual to the regularity provided by $FI$, the regularization procedure we must perform is even more delicate than Hörmander's. Of course, there is a large literature of works extending Hörmander's theory in various ways, e.g., to handle rough coefficients: we refer the reader to, e.g., [1, 3, 16, 28, 36, 38, 53] and the references therein. However, as far as the authors are aware, there are no previous works that fundamentally rework the theory into $L^1$.

As our application, we apply Proposition 1.4 and Theorem 1.9 to a concrete class of dynamical systems posed on $\mathbb{R}^n$ of which L96 in (1.1) and Galerkin truncations of many PDE are special cases. The general class of systems we consider on $\mathbb{R}^n$ are of the following form, modeling a volume-preserving nonlinearity with a weak linear damping and weak noise:

$$\mathrm{d}x^\epsilon_t = F(x^\epsilon_t)\, \mathrm{d}t + \epsilon A x^\epsilon_t\, \mathrm{d}t + \sqrt{\epsilon} \sum_{k=1}^r X_k\, \mathrm{d}W^k_t. \tag{1.10}$$

Here $W^k_t$ are independent standard Brownian motions, the forcing directions $\{X_k\}_{k=1}^r$ are assumed for simplicity to be constant vector fields (a.k.a. "additive noise"), while the matrix $A \in \mathbb{R}^{n \times n}$ is negative definite, contributing volume dissipation to the overall system. We will primarily consider drift terms $F$ of the following form:

$$F(x) = B(x, x) \text{ for } B : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n \text{ bilinear, and moreover, } \operatorname{div} F \equiv 0 \text{ and } x \cdot F \equiv 0. \tag{1.11}$$

The divergence-free condition implies preservation of Liouville measure (Lebesgue measure on $\mathbb{R}^n$), while the condition $x \cdot F(x) \equiv 0$ ensures that $\dot x = F(x)$ preserves the "energy shells" $S_E = \{ x \in \mathbb{R}^n : \|x\| = E \}$. Systems of this form include the Lorenz-96 model (1.1) as well as Galerkin truncations of several well-known PDE of interest such as the Navier-Stokes equations. A more general class of models for which these methods apply is discussed in Remark 1.16 below.

Regarding our standing assumptions, it is straightforward [5, 37] to show that (1.10) with drift term as in (1.11) generates a unique stochastic flow of diffeomorphisms $\Phi^t_\omega : \mathbb{R}^n \to \mathbb{R}^n$ for a.e. Brownian path $\omega$, and so Assumption 1(i) always holds for any $\epsilon > 0$. By standard hypoellipticity theory, Assumption 1(ii) regarding a unique stationary density is valid when $\{F + \epsilon A, X_1, \dots, X_r\}$ satisfies the parabolic Hörmander condition on $\mathbb{R}^n$. If this holds, then Assumption 1(iii) is essentially automatic and follows from a combination of results in Appendix A and [33]. As a result, the exponents $\lambda_1^\epsilon, \lambda_\Sigma^\epsilon$ as in Theorem 1.3 exist for (1.10) for any $\epsilon > 0$.

For systems of the form (1.10), it is particularly easy to lift vector fields to $SM \simeq \mathbb{R}^n \times S^{n-1}$ via

$$\tilde X_0^\epsilon := \begin{pmatrix} F(x) + \epsilon A x \\ \Pi_v (\nabla F(x) + \epsilon A) v \end{pmatrix}, \qquad \tilde X_k := \begin{pmatrix} X_k \\ 0 \end{pmatrix}. \tag{1.12}$$

Footnote: Note that since the vector fields $X_k$ are constant, the Itô and Stratonovich formulations are identical, which is why we use the Itô notation above.
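The structural conditions (1.11) are easy to check numerically for a concrete drift. The sketch below does this for the L96 nonlinearity $F(u)_m = (u_{m+1} - u_{m-2}) u_{m-1}$ from (1.1); the helper names `l96_nonlinearity`, `divergence_fd` and `polarization`, the dimension 8, and the finite-difference step are illustrative choices, not part of the paper. The bilinear form $B$ is recovered from $F$ by the polarization identity $B(x, y) = \tfrac12 (F(x+y) - F(x) - F(y))$, which is the symmetric choice of $B$ with $B(x,x) = F(x)$.

```python
import numpy as np

def l96_nonlinearity(u):
    """L96 drift without damping: F(u)_m = (u_{m+1} - u_{m-2}) u_{m-1}, periodic in m."""
    return (np.roll(u, -1) - np.roll(u, 2)) * np.roll(u, 1)

def divergence_fd(F, x, h=1e-3):
    """Central-difference approximation of div F at x (exact for quadratic F up to rounding)."""
    div = 0.0
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        div += (F(x + e)[i] - F(x - e)[i]) / (2 * h)
    return div

def polarization(F, x, y):
    """Symmetric bilinear form B with B(x, x) = F(x), assuming F is a quadratic form."""
    return 0.5 * (F(x + y) - F(x) - F(y))

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
y = rng.standard_normal(8)

div_value = divergence_fd(l96_nonlinearity, x)    # div F(x), expected to vanish
energy_rate = float(x @ l96_nonlinearity(x))      # x . F(x), expected to vanish
bilinear_gap = np.max(np.abs(polarization(l96_nonlinearity, 2 * x, y)
                             - 2 * polarization(l96_nonlinearity, x, y)))
```

All three quantities should be zero up to floating-point rounding at any test point, reflecting the two conditions in (1.11) and the bilinearity of $B$.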
For the projective process $(w^\epsilon_t) = (x^\epsilon_t, v^\epsilon_t)$ evolving on $\mathbb{R}^n \times S^{n-1}$, well-posedness as in Assumption 2 is standard, while the existence and uniqueness of a stationary density $f^\epsilon$ follows from the parabolic Hörmander condition for $\{\tilde X_0^\epsilon, \tilde X_1, \dots, \tilde X_r\}$ in Assumption 3 and the drift condition provided by the damping (see Appendix A). We emphasize that Assumption 3 generally requires work to check: see the discussion below.

For the class of Euler-like SDE above, our main result is as follows, which shows that Assumption 3 is sufficient to deduce $\epsilon^{-1} \lambda_1^\epsilon \to \infty$.

Theorem 1.12.
Consider the SDE (1.10) where $F(x) = B(x, x)$ is as in (1.11) and $B(x, x)$ is not identically 0. If $\{\tilde X_0^\epsilon, \tilde X_1, \dots, \tilde X_r\}$ as in (1.12) satisfies the parabolic Hörmander condition as in Assumption 3, then the top Lyapunov exponent $\lambda_1^\epsilon$ for (1.10) satisfies

$$\frac{\lambda_1^\epsilon}{\epsilon} \to \infty \quad \text{as } \epsilon \to 0.$$

Before presenting the proof of Theorem 1.12, let us briefly comment on the verification of Assumption 3. For many systems of interest, it can be significantly harder to verify spanning for $\{\tilde X_k\}$ on $SM$ than to verify spanning for the vector fields $\{X_k\}$ on $M$: this is already the case for the L96 model with additive noise. As discussed above, the verification of Assumption 3 is the only remaining task to apply Theorem 1.12 to Galerkin truncations of the Navier-Stokes equations. This is being undertaken in ongoing work.

Nevertheless, we emphasize that for a given model of the form (1.10) with fixed dimension and parameters, Assumption 3 is (at least in principle) checkable using, e.g., computer algebra software. In Section 4 we prove the following, which reduces the question of projective spanning to a combination of (i) the spanning condition for $\{X_0^\epsilon, X_1, \dots, X_r\}$ on $\mathbb{R}^n$ and (ii) the purely linear condition that $\mathfrak{sl}(\mathbb{R}^n)$, the space of traceless real matrices, is generated by a collection $\{H^i\}$ of constant-valued $n \times n$ real matrices (defined explicitly in terms of $B(x, x)$) under the standard matrix Lie bracket.

Lemma 1.13.
Let $\{X_0^\epsilon, X_1, \dots, X_r\}$ be defined by the SDE (4.3) and suppose that the constant vector fields $\{\partial_{x_k}\}_{k=1}^n$ belong to the parabolic Lie algebra $\mathrm{Lie}(X_0^\epsilon; X_1, \dots, X_r)$. Define for each $k = 1, \dots, n$ the following constant matrices,

$$H^k := \partial_{x_k} \nabla F \in \mathfrak{sl}(\mathbb{R}^n),$$

and let $\mathrm{Lie}(H^1, \dots, H^n)$ be the matrix Lie sub-algebra of $\mathfrak{sl}(\mathbb{R}^n)$ generated by $H^1, \dots, H^n$. Then the projective vector fields $\{\tilde X_0^\epsilon, \tilde X_1, \dots, \tilde X_r\}$ satisfy Assumption 3 if

$$\mathrm{Lie}(H^1, \dots, H^n) = \mathfrak{sl}(\mathbb{R}^n). \tag{1.13}$$

Lemma 1.13 is used to prove projective spanning for L96 with additive forcing in Section 4.3. Using Theorem 1.12, Theorem 1.1 above follows as a corollary.

We note that the proof presented there for L96 heavily relies on the "local" coupling of unknowns in the nonlinearity, which greatly simplifies the application of Lemma 1.13 in this case. However, for models which are the Galerkin truncations of PDEs, coupling between unknowns has a more 'global' character, and verifying the hypotheses of Lemma 1.13 remains open.

Throughout, we assume the setting of Theorem 1.12, and in particular, that the collection of projective vector fields $\{\tilde X_0^\epsilon, \tilde X_1, \dots, \tilde X_r\}$ as in (1.12) satisfies the parabolic Hörmander condition in Assumption 3.

Let us begin by articulating the Fisher information identity (Proposition 1.4) and the hypoelliptic regularity estimate (Theorem 1.9) in the context of Euler-like models. Writing $\lambda_1^\epsilon, \lambda_\Sigma^\epsilon$ for the top and summed Lyapunov exponents as in Theorem 1.3, the partial Fisher information identity (1.6) reads as follows:

$$\frac{n \lambda_1^\epsilon}{\epsilon} - 2 \operatorname{tr} A = \frac{1}{2} \sum_{k=1}^r \int_{\mathbb{R}^n \times S^{n-1}} \frac{|\tilde X_k f^\epsilon|^2}{f^\epsilon}\, \mathrm{d}x\, \mathrm{d}v =: FI(f^\epsilon). \tag{1.14}$$

This is immediate from Proposition 1.4 on noting that (i) $\lambda_\Sigma^\epsilon = \epsilon \operatorname{tr} A$ by Theorem 1.3 and (1.11), while (ii) the condition $f^\epsilon \log f^\epsilon \in L^1(\mathrm{d}q)$, with $\mathrm{d}q$ the volume element for $S\mathbb{R}^n$, follows from the estimates in Appendix A.
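Returning to the purely linear condition (1.13) of Lemma 1.13: for a model of fixed dimension it can be explored with a few lines of numerical linear algebra. The sketch below is an illustration under stated assumptions, not the paper's proof: it builds the matrices $H^k = \partial_{x_k} \nabla F$ for the L96 nonlinearity and computes the dimension of the matrix Lie algebra they generate by repeatedly bracketing the generators against a growing orthonormal basis. The names `l96_H` and `lie_algebra_dimension`, the choice $J = 7$, and the tolerance are hypothetical, and a floating-point rank computation is only a heuristic check, not a certificate.

```python
import numpy as np

def l96_H(k, J):
    """H^k = d/du_k of the Jacobian of F(u)_m = (u_{m+1} - u_{m-2}) u_{m-1} (0-based k)."""
    H = np.zeros((J, J))
    for m in range(J):
        # Jacobian entries: [m, m+1] = u_{m-1}, [m, m-2] = -u_{m-1},
        #                   [m, m-1] = u_{m+1} - u_{m-2}
        if (m - 1) % J == k:
            H[m, (m + 1) % J] += 1.0
            H[m, (m - 2) % J] -= 1.0
        if (m + 1) % J == k:
            H[m, (m - 1) % J] += 1.0
        if (m - 2) % J == k:
            H[m, (m - 1) % J] -= 1.0
    return H

def lie_algebra_dimension(generators, tol=1e-9):
    """Dimension of the matrix Lie algebra generated by the given square matrices."""
    n = generators[0].shape[0]
    basis = []  # orthonormal basis (flattened matrices) of the span found so far

    def try_add(M):
        v = M.ravel().astype(float)
        scale = np.linalg.norm(v)
        if scale < tol:
            return
        v = v / scale
        for _ in range(2):  # two Gram-Schmidt passes for numerical stability
            for b in basis:
                v = v - (b @ v) * b
        r = np.linalg.norm(v)
        if r > tol:
            basis.append(v / r)

    for G in generators:
        try_add(G)
    i = 0
    while i < len(basis):  # bracket every basis element with every generator
        B = basis[i].reshape(n, n)
        for G in generators:
            try_add(G @ B - B @ G)
        i += 1
    return len(basis)

J = 7
H = [l96_H(k, J) for k in range(J)]
dim = lie_algebra_dimension(H)
```

Bracketing only against the generators suffices because the generated Lie algebra is spanned by nested brackets of generators. If condition (1.13) holds for the chosen $J$, the computed dimension equals $\dim \mathfrak{sl}(\mathbb{R}^J) = J^2 - 1$.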
Turning to the hypoelliptic regularity estimate: Theorem 1.9 implies

$$\| f^\epsilon \|_{W^{s,1}(U \times S^{n-1})} \leq C \left( 1 + \sqrt{FI(f^\epsilon)} \right), \tag{1.15}$$

for any $U \subset \mathbb{R}^n$ bounded, where $s \in (0, 1)$ and $C = C_U$ are constants independent of $\epsilon$.

In view of the form of (1.14) and (1.15) we see that if $\epsilon^{-1} \lambda_1^\epsilon$ were to remain bounded, then $f^\epsilon$ would be bounded in $W^{s,1}$ uniformly in $\epsilon$. This observation leads naturally to the following alternative.

Proposition 1.14.
At least one of the following holds:
(a) $\lim_{\epsilon \to 0} \frac{\lambda_1^\epsilon}{\epsilon} = \infty$; or
(b) the zero-noise flow $(x_t, v_t)$ admits a stationary density $f \in L^1(\mathbb{R}^n \times S^{n-1})$ (and moreover $f \in W^{s,1}_{\mathrm{loc}}$ on bounded sets).

Proof. Suppose that (a) fails, i.e., $\liminf_{\epsilon \to 0} \frac{\lambda_1^\epsilon}{\epsilon} < \infty$. In this case, (1.14) implies that $\liminf_{\epsilon \to 0} FI(f^\epsilon) < \infty$, and the hypoelliptic regularity estimate (1.15) provides regularity in the missing directions, i.e., $\liminf_{\epsilon \to 0} \| f^\epsilon \|_{W^{s,1}(U)} < \infty$ for all open, bounded sets $U$. Combined with the uniform tightness of $\{f^\epsilon\}_{\epsilon > 0}$ in (A.1) (coming from the energy identity $x \cdot B(x, x) = 0$ and the fact that $A$ is negative definite), this yields compactness in $L^1$ for $\{f^\epsilon\}_{\epsilon \in (0,1)}$ as $\epsilon \to 0$. Extracting a subsequence $\{\epsilon_j\}$, we see that there exists $f \in L^1$ such that $f^{\epsilon_j} \to f$ in $L^1$. Furthermore, passing to the limit $\epsilon_j \to 0$ pathwise in the SDE and in the Kolmogorov equation (1.8), we see that $f\, \mathrm{d}q$ is an invariant measure of the deterministic flow $(x_t, v_t)$ and hence (b) holds.

A crucial feature of our approach is that alternative (b) in Proposition 1.14 is quite rigid and can be ruled out in many cases, even for systems with very complicated deterministic dynamics for which we have access to little information. In our setting, alternative (b) is ruled out by the following proposition, proved in Section 5; this is enough to complete the proof of Theorem 1.12.

Below, we write $\hat\Phi^t : S\mathbb{R}^n \to S\mathbb{R}^n$ for the (deterministic) flow corresponding to the $\epsilon = 0$ process $(x_t, v_t)$, while $\Phi^t : \mathbb{R}^n \to \mathbb{R}^n$ is the flow corresponding to $(x_t)$ on $\mathbb{R}^n$.

Proposition 1.15.
Assume that the bilinear mapping $B$ is not identically 0. Let $\nu$ be any invariant probability measure for $\hat\Phi^t$ with the property that $\nu(A \times S^{n-1}) = \mu(A)$, where $\mu \ll \mathrm{Leb}_{\mathbb{R}^n}$. Then, $\nu$ is singular with respect to Lebesgue measure $\mathrm{Leb}_{S\mathbb{R}^n}$ on $S\mathbb{R}^n$.

Proposition 1.15 is proved in Section 5, using ideas inspired by the classification of invariant projective measures for general linear cocycles [6]. Roughly speaking, this theory implies that if $\nu \ll \mathrm{Leb}_{S\mathbb{R}^n}$ were to hold, then the $\epsilon = 0$ flow must be an isometry with respect to a potentially 'rough' (i.e., measurably-varying) Riemannian metric on $\mathbb{R}^n$. For systems of the form (1.10) satisfying (1.11), this can be ruled out relatively easily, using the fact that at $\epsilon = 0$, the dynamics of (1.10) induces shearing between successive "energy shells" $S_E = \{ \|x\| = E \}$, $E > 0$. To wit,

$$D\Phi^t(x)\, x = \Phi^t(x) + t\, B(\Phi^t(x), \Phi^t(x)) \tag{1.16}$$

(see Lemma 5.1). In particular, at a point $x \in \mathbb{R}^n \setminus \{0\}$, an infinitesimal perturbation in the "radial" direction $x$ will grow indefinitely at a linear rate $t$, except at times when $B(\Phi^t(x), \Phi^t(x))$ is very small. Thus, Proposition 1.14(b) can be ruled out by a simple Poincaré recurrence argument, using only the assumption that $B$ is not identically 0 (see Section 5 for details).

Remark 1.16.
The above arguments apply, in principle, to a broader class of drift terms $F(x)$ than those given in (1.11). For instance, provided that we start with the weak-damping, constant forcing regime

$$\mathrm{d}x^\epsilon_t = F(x^\epsilon_t)\, \mathrm{d}t + \epsilon A x^\epsilon_t\, \mathrm{d}t + \sum_{k=1}^r X_k\, \mathrm{d}W^k_t,$$

our methods easily extend to treat multilinear $F(x) = \sum_{j=0}^P B_j(x, \dots, x)$ for $B_j$ multilinear of degree $p_j$ with $p_j \geq 2$ for at least one $j$. Reordering so that $p_0 > p_1 > \dots > p_P$, a rescaling of $x, t$ and a re-definition of $\epsilon$ provides the analogue of (1.10):

$$\mathrm{d}x^\epsilon_t = \sum_{j=0}^P \epsilon^{\frac{p_0 - p_j}{p_0 + 1}} B_j(x^\epsilon_t, \dots, x^\epsilon_t)\, \mathrm{d}t + \epsilon A x^\epsilon_t\, \mathrm{d}t + \sqrt{\epsilon} \sum_{k=1}^r X_k\, \mathrm{d}W^k_t.$$

Hence, as $\epsilon \to 0$, the leading order nonlinearity dominates and the problem essentially reduces to the homogeneous case, provided of course that the leading nonzero nonlinearity term $B_0$ satisfies $x \cdot B_0(x, \dots, x) \equiv 0$ and $\operatorname{div} B_0(x, \dots, x) \equiv 0$, analogously to (1.11) (and of course, we require Assumption 3).

Remark 1.17.
Without much additional work, Proposition 1.14 generalizes to a large class of zero-noise limits of volume-preserving systems on a compact manifold. Of particular interest are parabolic flows, e.g., 'typical' completely integrable flows, for which Proposition 1.14(b) is usually impossible due to shearing between invariant tori (analogous to (1.16)). This suggests that the scaling $\epsilon^{-1}\lambda_1^\epsilon \to \infty$ ought to be fairly common among zero-noise limits, even for a large class of decidedly non-chaotic zero-noise dynamics.

Remarkably, when $\epsilon \ll 1$, many Lyapunov times $O((\lambda_1^\epsilon)^{-1})$ elapse before the $O(\epsilon^{-1})$ timescale when the effects of noise become apparent. On the other hand, how long a Lyapunov time actually takes (that is, how long it typically takes a tangent vector to double in length) depends crucially on the rate at which Lyapunov exponents are realized, itself a large-deviations problem. This will be the subject of future work.

As remarked earlier, for a given system it can be extremely challenging to estimate its Lyapunov exponents and provide a mathematically rigorous account of its time-asymptotic behavior. Indeed, in principle Lyapunov exponents require infinitely precise information on infinite trajectories, and in practice the convergence of Lyapunov exponents to their 'true' values can exhibit long stretches of intermittent behavior. This is especially so for deterministic systems in the absence of stochastic driving, for which one anticipates that "chaotic" and "orderly" regimes coexist in a convoluted way in both phase space as well as 'parameter space', i.e., as the underlying dynamical system is varied: we refer the interested reader to, e.g., work on Newhouse phenomena in dissipative systems [54, 55]; the proliferation of elliptic islands in volume-preserving systems [25]; known coexistence of chaotic and ordered regimes for the quadratic map family [47]; and $C^1$ generic dichotomies [17, 18].
For more background on this rich topic, see, e.g., [23, 60, 72, 74].

Although it still presents significant challenges, the situation for Lyapunov exponents of stochastically forced systems is notably more tractable. To start, let us first address the body of work à la Furstenberg which describes necessary conditions for 'degeneracy' of the Lyapunov exponents of a random dynamical system. Consider a stochastic flow of diffeomorphisms $\Phi^t_\omega$ on an $n$-dimensional manifold $M$ arising from an SDE satisfying Assumption 1, and let $\lambda_1, \lambda_\Sigma$ be as in Theorem 1.3. Let $\mu$ be the (unique) stationary measure for $x_t := \Phi^t_\omega(x)$. Note that unconditionally we have $n\lambda_1 - \lambda_\Sigma \geq 0$. In this context, and brushing aside technical details, the criterion à la Furstenberg is due to a variety of authors (e.g., [21, 41, 65, 68]), and can be stated as follows: if $\nu \in \mathcal{P}(SM)$ is a stationary measure for the projective process and $d\nu(x, v) = d\nu_x(v)\,d\mu(x)$ the disintegration of $\nu$, then for all $t > 0$ there holds
$$\mathbf{E}\int_M H\big(D\Phi^t(x)_*\nu_x \,\big|\, \nu_{\Phi^t(x)}\big)\,d\mu(x) \leq t\,(n\lambda_1 - \lambda_\Sigma), \tag{1.17}$$
where $H$ denotes the relative entropy, defined for two measures $\eta \ll \lambda$ by
$$H(\eta \,|\, \lambda) := \int \log\left(\frac{d\eta}{d\lambda}\right) d\eta.$$
From this we see that either
$$n\lambda_1 - \lambda_\Sigma > 0, \tag{1.18}$$
or the probabilistic law governing the stochastic flow admits a strong 'degeneracy' in the sense that
$$(D_x\Phi^t_\omega)_*\nu_x = \nu_{\Phi^t_\omega(x)} \tag{1.19}$$
with probability 1 for all $t \geq 0$ and $\mu$-typical $x$. That this situation is very 'degenerate' follows from the fact that for fixed $x$ and $t$, the above right-hand side depends only on the time-$t$ position $\Phi^t_\omega(x)$, while the left-hand side depends additionally on the entire noise path $\omega|_{[0,t]}$.

Observe that in the weakly-damped, weakly-driven setting of (1.10), $\lambda^\epsilon_\Sigma = \epsilon \operatorname{tr} A < 0$ and so (1.18) is totally agnostic as to whether $\lambda_1^\epsilon > 0$ or not. Indeed, the techniques in the above-mentioned works are "soft" as the identity (1.19) is non-quantitative in the parameters of the underlying system.
Although (1.17) does at least provide some kind of formula for $n\lambda_1 - \lambda_\Sigma$, it is unclear how to glean useful quantitative information directly from (1.17).

Interestingly, our Fisher-information identity in Proposition 1.4, specifically (1.7), is in fact essentially the time-infinitesimal analogue of (1.17), as we show below in Section 2.4. Hence, like (1.17), our Proposition 1.4 admits an interpretation in terms of the rate at which the degeneracy (1.19) fails to hold for the stochastic flow $\Phi^t_\omega$. However, Proposition 1.4 recasts the information in terms of the generator of $(w_t)$, which is more amenable to the use of hypoelliptic PDE methods such as those employed in Theorem 1.9. This motivates the claim that the methods in this paper constitute a first step towards a quantitative à la Furstenberg theory. We remark that Fisher information-type quantities also commonly appear as the time derivatives of the relative entropy in the study of gradient flows and logarithmic Sobolev inequalities (see e.g. [9, 39, 63, 66]).

Beyond à la Furstenberg, there is by now a large literature on the Lyapunov exponents of particular models for which we cannot do justice in this space. Instead, we will focus on a class of results most closely related to ours (Theorems 1.1 and 1.12): small-noise expansions of Lyapunov exponents for weakly-driven stochastic systems. To frame the discussion, consider the abstract linear SDE
$$dV_t = A_t^\epsilon V_t\,dt + \sqrt{\epsilon}\sum_{k=1}^r B_t^k V_t \circ dW_t^k, \tag{1.20}$$
where $A_t^\epsilon, B_t^k$ are, in general, time-varying and/or themselves randomly driven, and $A_t^\epsilon$ may or may not exhibit some vanishingly weak damping as $\epsilon \to 0$. There are many works studying the scaling behavior of the Lyapunov exponent $\lambda^\epsilon := \lim_{t\to\infty} \frac{1}{t}\log|V_t|$ of such systems, e.g., [8, 30, 52, 58, 61] in the constant coefficient case, and [7, 13, 14] when $A_t, B_t^k$ are coupled to some other stochastic flow.
To the authors' best knowledge, however, all of these results are restricted to settings where the $\epsilon = 0$ dynamics are relatively simple and essentially completely known. In comparison, our results are indifferent to any detailed description of the zero-noise dynamics. On the other hand, the sacrifice for our level of generality is that our estimate $\lambda_1^\epsilon/\epsilon \to \infty$ is far weaker than an asymptotic expansion, and is likely to be suboptimal for many models of interest.

Of particular interest is that among models of the form (1.20), scaling laws of the form $\lambda^\epsilon \sim \epsilon^\gamma$, $\gamma \geq 1$, tend to be associated with zero-noise dynamics which are rigid isometries (exhibiting no shearing) [8, 12, 13, 58]: note that such projectivized zero-noise dynamics preserve an invariant density, namely, Lebesgue measure, cf. alternative (b) in Proposition 1.14. Meanwhile, laws of the form $\lambda^\epsilon \sim \epsilon^\gamma$, $\gamma < 1$, are associated with zero-noise dynamics exhibiting some shearing mechanism (cf. shearing between energy shells as in (1.16)). By way of example, [8, 61] derive such scaling laws when $A_t$ as above is given by
$$A_t \equiv \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad B_t \equiv \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},$$
corresponding to the constant application of a horizontal shear in conjunction with a small, stochastically driven vertical shear.
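The shear example just described can be explored numerically. The following is a minimal sketch, not taken from the cited works: it assumes the concrete nilpotent pair $A = \begin{pmatrix} 0 & 1 \\ 0 & 0\end{pmatrix}$, $B = \begin{pmatrix} 0 & 0 \\ 1 & 0\end{pmatrix}$ in (1.20), and estimates the top Lyapunov exponent by an Euler-Maruyama scheme with renormalization at every step. The function name `shear_lyapunov` and all step-size and horizon parameters are illustrative choices. Since $B^2 = 0$ under this assumption, the Stratonovich and Itô forms of the equation coincide, so no correction term is needed.

```python
import math
import random


def shear_lyapunov(eps, T=2000.0, dt=0.01, seed=0):
    """Monte Carlo estimate of the top Lyapunov exponent of
        dV1 = V2 dt,    dV2 = sqrt(eps) * V1 dW,
    i.e. (1.20) with the assumed matrices A = [[0,1],[0,0]], B = [[0,0],[1,0]].
    """
    rng = random.Random(seed)
    v1, v2 = 1.0, 0.0                 # unit initial tangent vector
    noise_scale = math.sqrt(eps * dt)  # standard deviation of sqrt(eps) * dW
    n_steps = int(T / dt)
    log_growth = 0.0
    for _ in range(n_steps):
        xi = rng.gauss(0.0, 1.0)
        # One Euler-Maruyama step (Ito = Stratonovich here because B*B = 0).
        v1, v2 = v1 + v2 * dt, v2 + noise_scale * xi * v1
        # Renormalize and accumulate the logarithmic growth.
        norm = math.hypot(v1, v2)
        log_growth += math.log(norm)
        v1, v2 = v1 / norm, v2 / norm
    return log_growth / (n_steps * dt)
```

Calling `shear_lyapunov(eps)` for a few decreasing values of `eps` and comparing the ratio `shear_lyapunov(eps) / eps` gives a rough, purely empirical feel for how the exponent behaves relative to $\epsilon$; the estimate is noisy and carries a discretization bias, so it should not be read as a substitute for the asymptotic analysis in the references.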
This analysis was extended to the setting of fluctuation-dissipation zero-noise limits of certain 2d completely integrable Hamiltonian systems in the work [14].

Although the Fisher information identity in Proposition 1.4 and our Proposition 1.14 for zero-noise limits make no direct reference to a specific dynamical motif or behavior, it is clear from our application to the class of Euler-like models (1.10) with bilinear nonlinearity as in (1.11) that shearing in the zero-noise, zero-damping dynamics is a very natural way to rule out the rigid isometry alternative in Proposition 1.14(b). Of course, shearing has long been regarded as a potential mechanism for the generation of chaotic behavior. As early as the late 70's it was realized that chaotic attractors could arise from time-periodic driving of a system undergoing a Hopf bifurcation [75], while subsequent mathematically rigorous work has confirmed this mechanism (see, e.g., [71] for an overview of this program). We also point out the work [42], which provides a mix of heuristics, numerics and mathematical analysis demonstrating the shearing mechanism as a source of chaotic behavior.

This section is devoted to a proof of the Fisher information identity Proposition 1.4 for the top Lyapunov exponent in a general setting. We present two proofs: the first, via the Furstenberg-Khasminskii formula [5], is carried out in Section 2.2, while the second, via a relative entropy formula related more directly to Furstenberg's work on Lyapunov exponents, is presented in Section 2.4.
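Before the general proof, the base-process half of the identity, $FI(\rho) = -\lambda_\Sigma$, can be sanity-checked on a toy case. The sketch below uses the one-dimensional Ornstein-Uhlenbeck process $dx_t = -a x_t\,dt + \sigma\,dW_t$, which is our own illustrative choice and not an example from the paper. Here $X_0 = -a x\,\partial_x$ and $X_1 = \sigma\,\partial_x$, so $\lambda_\Sigma = \operatorname{div} X_0 = -a$, while the stationary density is Gaussian with variance $\sigma^2/(2a)$ and $FI(\rho) = \frac{1}{2}\int |X_1^*\rho|^2/\rho\,dx = \frac{\sigma^2}{2}\int (\rho')^2/\rho\,dx$. The function name and quadrature parameters are hypothetical.

```python
import math


def ou_fisher_information(a, sigma, half_width=12.0, n=20001):
    """Trapezoid-rule value of FI(rho) = (sigma^2 / 2) * int (rho')^2 / rho dx
    for the stationary Gaussian density of dx = -a x dt + sigma dW."""
    var = sigma ** 2 / (2.0 * a)          # stationary variance
    length = half_width * math.sqrt(var)  # integrate over [-length, length]
    h = 2.0 * length / (n - 1)
    total = 0.0
    for i in range(n):
        x = -length + i * h
        rho = math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        drho = -x / var * rho             # derivative of the Gaussian density
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * drho * drho / rho
    return 0.5 * sigma ** 2 * total * h
```

For any $a, \sigma > 0$ the returned value should agree with $a = -\lambda_\Sigma$ up to quadrature error, which matches the closed-form computation $\frac{\sigma^2}{2}\cdot\frac{2a}{\sigma^2} = a$.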
Let $(M, g)$ be a smooth connected Riemannian manifold, and as in (1.9), consider the SDE
$$dx_t = X_0(x_t)\,dt + \sum_{k=1}^r X_k(x_t) \circ dW_t^k, \tag{2.1}$$
where $\{X_k\}_{k=0}^r$ are a family of smooth vector fields (potentially unbounded) on $M$, $\{W^k\}_{k=1}^r$ are independent standard Wiener processes and the product is taken in the Stratonovich sense. Recall that (2.1) is defined so that for each $\varphi \in C_c^\infty(M)$ the following $\mathbb{R}$-valued Stratonovich equation holds for each $t \in \mathbb{R}_+$:
$$\varphi(x_t) = \varphi(x_0) + \int_0^t X_0\varphi(x_s)\,ds + \sum_{k=1}^r \int_0^t X_k\varphi(x_s) \circ dW_s^k. \tag{2.2}$$
The generator of the Markov semigroup is the following second order differential operator written in Hörmander form
$$\mathcal{L} := X_0 + \frac{1}{2}\sum_{k=1}^r X_k^2,$$
in terms of which (2.2) takes the Itô form
$$\varphi(x_t) = \varphi(x_0) + \int_0^t \mathcal{L}\varphi(x_s)\,ds + \sum_{k=1}^r \int_0^t X_k\varphi(x_s)\,dW_s^k.$$
Recall that a stationary probability $\mu \in \mathcal{P}(M)$ for the SDE (1.9) is any probability measure $\mu$ satisfying
$$\int_M \mathcal{L}\varphi\,d\mu = 0 \quad \text{for all } \varphi \in C_c^\infty(M).$$
We will only be interested in cases when equation (1.9) gives rise to a unique Markov process $(x_t)$ and a global-in-time stochastic flow of diffeomorphisms $\Phi^t$ and has a unique stationary measure $\mu$.

Remark 2.1.
Obtaining the existence and uniqueness of a global stochastic flow of diffeomorphisms and the existence of a stationary probability measure $\mu$ is not automatic when the manifold $M$ is not compact, due to the potential unboundedness of the vector fields $\{X_k\}_{k=0}^r$ and the loss of tightness as $t \to \infty$ of $\mathrm{Law}(x_t)$. In general, one must obtain a suitable Lyapunov function (also called a drift condition) to control the growth of the process $(x_t)$ (see e.g. [51]) to obtain global solutions and existence of a stationary probability measure.

In order to deduce our Fisher information identity, we must first derive a formula for the top Lyapunov exponent, commonly referred to as the Furstenberg-Khasminskii formula (see, e.g., [5, 32]).

As in Section 1.1, we consider the projective process $(x_t, v_t)$ defined on $PM$ as in (1.4) and in particular (1.5). As is commonly done, we will often conflate the projective bundle
$PM$ with the unit sphere bundle $SM$ whose fibers are the spheres $S_xM = S^{n-1}(T_xM)$ canonically embedded in $T_xM$, and are universal double covers of $P_xM$.

Using the Riemannian structure on $M$ and the Levi-Civita connection $\nabla$, we equip $SM$ with a canonical Riemannian metric $\tilde g$ (the Sasaki metric), so that the bundle projection $\pi : SM \to M$ is a Riemannian submersion. This means that for each $w = (x, v) \in SM$ we can decompose $T_wSM$ into a horizontal subspace $H_wSM$ of directions transverse to the fibers and a vertical subspace $V_wSM$ of directions along the fibers, which can be identified with the spaces $T_xM$ and $T_v(S_xM)$ respectively. Moreover these spaces are orthogonal with respect to the metric $\tilde g$, giving the orthogonal decomposition
$$T_wSM = T_xM \oplus T_v(S_xM),$$
which allows us to write the vector fields $\{\tilde X_k\}_{k=0}^r$ as
$$\tilde X_k(x, v) := \begin{pmatrix} X_k(x) \\ V_{\nabla X_k(x)}(v) \end{pmatrix},$$
where $\nabla X_k(x)$ is the covariant derivative of the vector field $X_k$, which for each $x \in M$ we view as a linear endomorphism
$$\nabla X_k(x) : T_xM \to T_xM,$$
so that for each $v \in T_xM$, $\nabla X_k(x)v := \nabla_v X_k(x)$. Recall that the divergence $\operatorname{div} X$ of a vector field $X$ on a Riemannian manifold $M$ is given by the trace of its covariant derivative (using the Levi-Civita connection)
$$\operatorname{div} X := \operatorname{tr}(\nabla X).$$
The following identity, relating the divergence of $\tilde X_k$ to that of $X_k$, will be useful.

Lemma 2.2. The following identity holds for each $k = 0, \ldots, r$ and $v \in S_xM$:
$$\operatorname{div} \tilde X_k(x, v) = 2\operatorname{div} X_k(x) - n\langle v, \nabla X_k(x)v\rangle. \tag{2.3}$$

Proof.
First we note that in light of the orthogonal splitting $T_wSM = T_xM \oplus T_v(S_xM)$ we have
$$\operatorname{div} \tilde X_k(x, v) = \operatorname{div} X_k(x) + \operatorname{div} V_{\nabla X_k(x)}(v),$$
where for a fixed $x \in M$ the divergence $\operatorname{div} V_{\nabla X_k(x)}(v)$ is the divergence of $V_{\nabla X_k(x)}(v)$ treated as a vector field on the sphere $S^{n-1}(T_xM)$. Since $T_xM$ is isomorphic to $\mathbb{R}^n$, it suffices to show that for any linear operator $A : \mathbb{R}^n \to \mathbb{R}^n$ the following identity holds:
$$\operatorname{div} V_A(v) = \operatorname{tr}(A) - n\langle v, Av\rangle. \tag{2.4}$$
To show this, we first compute the covariant derivative $\nabla V_A$ using the embedding of $S^{n-1}$ in $\mathbb{R}^n$. Specifically, we use that $\nabla V_A$ is related to the Euclidean differential $DV_A(v) : T_vS^{n-1} \to \mathbb{R}^n$ by projecting its range onto $T_vS^{n-1}$ via $\Pi_v = I - v \otimes v^\sharp$. Recalling that $V_A(v) = \Pi_v Av$, a simple calculation shows that for each $v \in S^{n-1}$, the Euclidean differential is
$$DV_A(v) = \Pi_v A - \langle v, Av\rangle I - v \otimes (Av)^\sharp.$$
Projecting onto $T_vS^{n-1}$ eliminates the normal term $v \otimes (Av)^\sharp$, giving the following formula for the covariant derivative
$$\nabla V_A(v) = \Pi_v A - \langle v, Av\rangle I,$$
which implies
$$\operatorname{div} V_A(v) = \operatorname{tr}(\nabla V_A) = \operatorname{tr}_{T_vS^{n-1}}(A) - (n-1)\langle v, Av\rangle, \tag{2.5}$$
where $\operatorname{tr}_{T_vS^{n-1}}(A)$ denotes the trace of $A$ restricted to the $(n-1)$-dimensional subspace $T_vS^{n-1}$. To compute this, fix $v \in S^{n-1}$, let $\{e_1, e_2, \ldots, e_{n-1}\}$ be an orthonormal basis for $T_vS^{n-1} = v^\perp$ and note that $\{e_1, e_2, \ldots, e_{n-1}, v\}$ is an orthonormal basis for $\mathbb{R}^n$. Therefore we have
$$\operatorname{tr}_{T_vS^{n-1}}(A) = \sum_{i=1}^{n-1}\langle e_i, Ae_i\rangle = \operatorname{tr}(A) - \langle v, Av\rangle.$$
Upon substituting this expression into (2.5), we obtain (2.4).

We will need the following enhancement of the multiplicative ergodic theorem, which says that under some ergodicity assumptions on the projective process $w_t$, one achieves $\lambda_1$ exponential growth in every tangent direction with probability 1. For each $(t, x) \in \mathbb{R}_+ \times M$, let $D\Phi^t(x) : T_xM \to T_{x_t}M$ be the Jacobian of the stochastic flow $\Phi^t$ at $x \in M$.
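Returning briefly to identity (2.4) from the proof of Lemma 2.2, it can be checked numerically for a random matrix. The sketch below is an illustration only: it approximates the intrinsic divergence of $V_A(v) = Av - \langle v, Av\rangle v$ on the unit sphere by central differences along an orthonormal basis of $v^\perp$ (built by Gram-Schmidt), and compares it with $\operatorname{tr}(A) - n\langle v, Av\rangle$. All function names are hypothetical helpers introduced for this check.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    r = math.sqrt(dot(u, u))
    return [a / r for a in u]


def matvec(A, v):
    return [dot(row, v) for row in A]


def projected_field(A, u):
    """V_A(u) = A u - <u, A u> u, evaluated at a unit vector u."""
    Au = matvec(A, u)
    c = dot(u, Au)
    return [a - c * b for a, b in zip(Au, u)]


def tangent_basis(v):
    """Orthonormal basis of v^perp by Gram-Schmidt on the coordinate vectors."""
    n = len(v)
    basis = []
    for i in range(n):
        e = [1.0 if j == i else 0.0 for j in range(n)]
        for b in [v] + basis:
            c = dot(e, b)
            e = [x - c * y for x, y in zip(e, b)]
        if math.sqrt(dot(e, e)) > 1e-6:
            basis.append(normalize(e))
        if len(basis) == n - 1:
            break
    return basis


def sphere_divergence(A, v, h=1e-5):
    """Central-difference approximation of div V_A at v on the unit sphere."""
    v = normalize(v)
    total = 0.0
    for e in tangent_basis(v):
        plus = projected_field(A, normalize([a + h * b for a, b in zip(v, e)]))
        minus = projected_field(A, normalize([a - h * b for a, b in zip(v, e)]))
        deriv = [(p - m) / (2.0 * h) for p, m in zip(plus, minus)]
        total += dot(e, deriv)  # tangential component of the derivative along e
    return total


def exact_divergence(A, v):
    """Right-hand side of (2.4): tr(A) - n <v, A v> at the normalized v."""
    v = normalize(v)
    n = len(v)
    return sum(A[i][i] for i in range(n)) - n * dot(v, matvec(A, v))
```

The two functions should agree up to finite-difference error for any square matrix `A` and nonzero vector `v`. The check relies on the standard fact that the covariant derivative on an embedded sphere is the tangential projection of the Euclidean derivative, which is exactly the step used in the proof.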
The following is a corollary of, e.g., Theorem III.1.2 in [34]. Theorem 2.3.
Suppose that Assumptions 1 and 2 hold. Let $\nu$ be the unique stationary measure for $(w_t)$. Then, for $\nu$-almost every $w = (x, v) \in SM$ we have
$$\lambda_1 = \lim_{t\to\infty}\frac{1}{t}\log|D\Phi^t(x)v| \quad \text{with probability 1}.$$

We are now ready to prove the Furstenberg-Khasminskii formula for (2.1). A sketch of its proof is included for completeness.

Proposition 2.4 (Furstenberg-Khasminskii). Define for each $x \in M$
$$Q(x) := \operatorname{div} X_0(x) + \frac{1}{2}\sum_{k=1}^r X_k \operatorname{div} X_k(x)$$
and each $w \in SM$
$$\tilde Q(w) := \operatorname{div} \tilde X_0(w) + \frac{1}{2}\sum_{k=1}^r \tilde X_k \operatorname{div} \tilde X_k(w).$$
Suppose that $(w_t)$ has a unique stationary probability measure $\nu$ on $SM$ that projects to $\mu$ on $M$, and that $Q \in L^1(\mu)$ and $\tilde Q \in L^1(\nu)$. Then the following formulas hold:
$$\lambda_\Sigma = \int_M Q\,d\mu, \tag{2.6}$$
$$n\lambda_1 - 2\lambda_\Sigma = -\int_{SM} \tilde Q\,d\nu. \tag{2.7}$$

Proof.
We begin by proving (2.6). A standard calculation relating determinants to traces gives
$$d\log|\det D\Phi^t(x)| = \operatorname{tr}\nabla X_0(x_t)\,dt + \sum_{k=1}^r \operatorname{tr}\nabla X_k(x_t) \circ dW_t^k = \operatorname{div} X_0(x_t)\,dt + \sum_{k=1}^r \operatorname{div} X_k(x_t) \circ dW_t^k.$$
Upon converting to Itô and integrating in time, we obtain
$$\frac{1}{t}\log|\det D\Phi^t(x)| = \frac{1}{t}\int_0^t \operatorname{div} X_0(x_s)\,ds + \frac{1}{2}\sum_{k=1}^r \frac{1}{t}\int_0^t X_k \operatorname{div} X_k(x_s)\,ds + \frac{1}{t}M_t = \frac{1}{t}\int_0^t Q(x_s)\,ds + \frac{1}{t}M_t,$$
where $M_t$ is a mean-zero martingale arising from the Itô integral. We now take $t \to \infty$: the LHS converges to $\lambda_\Sigma$ by Theorem 1.3, while the first term on the RHS converges to $\int Q\,d\mu$ by the ergodic theorem. In particular, $\frac{1}{t}M_t$ must also converge, both pointwise and in $L^1(\mathbf{P} \times \mu)$, hence $\frac{1}{t}M_t \to 0$ and (2.6) follows.

Likewise, to prove (2.7), a straightforward computation and formula (2.3) yield
$$d\log|D\Phi^t(x)v| = \langle v_t, \nabla X_0(x_t)v_t\rangle\,dt + \sum_{k=1}^r \langle v_t, \nabla X_k(x_t)v_t\rangle \circ dW_t^k$$
$$= \frac{1}{n}\Big(2\operatorname{div} X_0(x_t) - \operatorname{div} \tilde X_0(w_t)\Big)dt + \frac{1}{n}\sum_{k=1}^r \Big(2\operatorname{div} X_k(x_t) - \operatorname{div} \tilde X_k(w_t)\Big) \circ dW_t^k.$$
Converting to Itô gives
$$\frac{1}{t}\log|D\Phi^t(x)v| = \frac{1}{nt}\int_0^t \Big(2\operatorname{div} X_0(x_s) - \operatorname{div} \tilde X_0(w_s)\Big)ds + \frac{1}{nt}\int_0^t \sum_{k=1}^r\Big(X_k \operatorname{div} X_k(x_s) - \frac{1}{2}\tilde X_k \operatorname{div} \tilde X_k(w_s)\Big)ds + \frac{1}{t}\tilde M_t$$
$$= \frac{2}{nt}\int_0^t Q(x_s)\,ds - \frac{1}{nt}\int_0^t \tilde Q(w_s)\,ds + \frac{1}{t}\tilde M_t,$$
with $\tilde M_t$ another mean-zero martingale. The proof is complete on sending $t \to \infty$ and applying the ergodic theorem, this time using Theorem 2.3 to ensure the LHS converges to $\lambda_1$.

2.3 Fisher information identity

In this section, we prove Proposition 1.4. As discussed in Section 1.1, the Markov process $(w_t)$ on the sphere bundle $SM$ has the following generator in Hörmander form
$$\tilde{\mathcal{L}} = \tilde X_0 + \frac{1}{2}\sum_k \tilde X_k^2.$$
As discussed in the intro, we will be working in the setting where the process $(w_t)$ admits a unique stationary probability measure $\nu$ on $SM$ with smooth density $f(w)$ with respect to the volume measure $dq$ on $SM$, satisfying $\int f\,dq = 1$. The stationary density $f$ solves the following PDE
$$\tilde{\mathcal{L}}^* f = \tilde X_0^* f + \frac{1}{2}\sum_{k=1}^r (\tilde X_k^*)^2 f = 0, \tag{2.8}$$
where for a given vector field $\tilde X$ on $SM$, $\tilde X^*$ denotes the formal adjoint operator with respect to $L^2(dq)$. Note that the differential operator $\tilde X^*$ can be related to $\tilde X$ and $\operatorname{div} \tilde X$ through the following relation
$$\tilde X^* h = -\tilde X h - (\operatorname{div} \tilde X)h, \quad h \in C_c^\infty(SM). \tag{2.9}$$
We are now ready to prove Proposition 1.4.

Proof of Proposition 1.4. Formally, the argument is straightforward. Consider first the second equality in (1.6). Pairing (2.8) with $\log f$ and integrating gives
$$-\frac{1}{2}\sum_{k=1}^r \int_{SM} (\log f)(\tilde X_k^*)^2 f\,dq = \int_{SM} (\log f)\tilde X_0^* f\,dq.$$
Ignoring, for the moment, that $f$ is not compactly supported, integrating by parts a few times and using (2.9) gives for the left hand side
$$-\frac{1}{2}\sum_{k=1}^r \int_{SM} (\log f)(\tilde X_k^*)^2 f\,dq = -\frac{1}{2}\sum_{k=1}^r \int_{SM} \frac{(\tilde X_k f)(\tilde X_k^* f)}{f}\,dq = FI(f) + \frac{1}{2}\sum_{k=1}^r \int_{SM} (\tilde X_k \operatorname{div} \tilde X_k) f\,dq,$$
whereas for the right hand side we have
$$\int (\log f)\tilde X_0^* f\,dq = \int \tilde X_0 f\,dq = -\int (\operatorname{div} \tilde X_0) f\,dq.$$
Putting these two identities together yields
$FI(f) = -\int \tilde Q\,d\nu$ and therefore (1.6). The formula for $FI(\rho)$, with $\rho = \frac{d\mu}{dx}$ the stationary density for $(x_t)$, follows from an identical argument, omitted for brevity, once one observes that $\rho$ solves the Kolmogorov equation
$$X_0^*\rho + \frac{1}{2}\sum_{j=1}^r (X_j^*)^2\rho = 0.$$

To rigorously justify the above formal calculation, we need to be a little more careful with integration by parts and make use of the $f\log f$ integrability assumption. Let $\chi_R \in C_c^\infty(B(0, R))$ satisfy $0 \leq \chi_R \leq 1$ with $\chi_R(x) = 1$ in $B(0, R/2)$, where $B(0, R)$ is the geodesic ball of radius $R$ on $M$. Multiplying both sides of (2.8) by $(\log f)\chi_R$ and following the above procedure gives
$$\frac{1}{2}\sum_k \int_{SM} \frac{|\tilde X_k^* f|^2}{f}\chi_R\,dq = -\int_{SM} \tilde Q f\chi_R\,dq + \int_{SM} (\mathcal{L}\chi_R)(f\log f - f)\,dq.$$
Using the fact that $\chi_R \to 1$, $|\mathcal{L}\chi_R| \lesssim 1$ uniformly in $R$, and $\mathcal{L}\chi_R \to 0$ pointwise as $R \to \infty$, and the fact that $f\log f - f \in L^1$, we apply the dominated convergence theorem to pass to the limit as $R \to \infty$.

Turn next to (1.7). We give only the formal proof; the rigorous proof by the dominated convergence theorem is analogous given the regularity provided by (1.6). For this we observe that (denoting by $dv$ Lebesgue measure on $S_xM$)
$$\tilde X_k^*(h\rho) = \big(V^*_{\nabla X_k}h - \tilde X_k h\big)\rho + (X_k^*\rho)h,$$
and therefore, since $\int_{S_xM} h_x(v)\,dv = 1$, we find
$$FI(f) - FI(\rho) = \frac{1}{2}\sum_{k=1}^r \int_{SM} \frac{\big|\big(V^*_{\nabla X_k}h - \tilde X_k h\big)\rho + (X_k^*\rho)h\big|^2}{h\rho}\,dq - \frac{1}{2}\sum_{k=1}^r \int_{SM} \frac{|X_k^*\rho|^2}{\rho}h\,dq$$
$$= \frac{1}{2}\sum_{k=1}^r \int_M \left(\int_{S_xM} \frac{\big|\big(X_k - V^*_{\nabla X_k(x)}\big)h_x(v)\big|^2}{h_x(v)}\,dv\right)d\mu(x) + \sum_{k=1}^r \int_{SM} \big(V^*_{\nabla X_k}h - X_k h\big)(X_k^*\rho)\,dq.$$
However,
$$\int_{SM} \big(V^*_{\nabla X_k}h - X_k h\big)(X_k^*\rho)\,dq = -\int_{SM} X_k h\,(X_k^*\rho)\,dq = -\int_{SM} h\,(X_k^*)^2\rho\,dq = 0.$$

In this section we give a formal argument for the Fisher information identity using the proper analogue of the relative entropy formula (1.17), measuring the degree to which the degeneracy (1.19) fails to hold.
Wealready have given a complete proof above, this section is simply a way to get some additional intuitionregarding the meaning behind Proposition 1.4. Hence, in this section we do not endeavor to give a completeproof, furthermore, for technical simplicity in this section we only consider the case in which M is compact(still with no boundary).In preparation, recall that given two measures λ, η on S M , η ≪ λ , we define the relative entropy of η with respect to λ by H ( η | λ ) := ˆ S M log (cid:18) d η d λ (cid:19) d η . Since we work frequently with absolutely continuous measures, we abuse notation somewhat and also write H ( f | g ) for the relative entropy of f d q with respect to g d q . Recall that H ( f | g ) = 0 if and only if f ≡ g .In what follows, we let ˆΦ t be the stochastic flow of diffeomorphisms on S M induced by the SDEgoverning the projective process ( w t ) . Given a smooth density f ∈ L ( S M ) , let f t := ( ˆΦ t ) ∗ f = f ◦ ˆΦ − t | det D ˆΦ − t |
18e the pushforward of f as a measure on S M . The density f t can readily be seen to solve the stochasticcontinuity equation d f t = ˜ L ∗ f t d t + r X k =1 ˜ X ∗ k f t d W kt , and satisfies f t → f locally uniformly on S M .In [11], Baxendale derived the following formula (inspired by one of Furstenberg [26] in the context ofIID compositions of random matrices): Theorem 2.5 (Baxendale [11]) . Under Assumptions 1 & 2, writing f = dνdq for the stationary density of ( w t ) and ρ = dµdx for that of ( x t ) and denoting f t = ( ˆΦ t ) ∗ f, ρ t = (Φ t ) ∗ ρ, one has the following: E H ( ρ t | ρ ) = − tλ Σ , E ( H ( f t | f ) − H ( ρ t | ρ )) = t ( nλ − λ Σ ) . (2.10)The second line can be rewritten (using, e.g., Lemma 3.2 in [11]) in the following highly suggestive form.Let ( ν x ) denote the disintegration measures of ν (as integrands, d ν ( x, v ) = d ν x ( v )d µ ( x ) ; see Section 5.2),one has E ˆ H (cid:0) D Φ t ( x ) ∗ ν x | ν Φ t ( x ) (cid:1) d µ ( x ) = nλ − λ Σ . (2.11)This form directly encodes the Furstenberg criterion (1.19) for nλ − λ Σ = 0 : naturally, if nλ − λ Σ = 0 then one must have ( D x Φ) ∗ ν x = ν x t for all t . Indeed, one might hope to extract quantitative informationabout gap nλ − λ Σ using (2.11), although to our best knowledge this has not been done.On the other hand, our Fisher information identity can be thought of as the time-infinitesimal analogueof (2.10), as the following shows. Lemma 2.6.
$$FI(\rho) = \lim_{t\to 0}\frac{1}{t}\mathbf{E}\,H(\rho_t \,|\, \rho), \qquad FI(f) = \lim_{t\to 0}\frac{1}{t}\mathbf{E}\,H(f_t \,|\, f).$$
Consequently, $FI(f) = n\lambda_1 - 2\lambda_\Sigma$.

In view of (2.10), we see that $FI(f) - FI(\rho)$ measures the rate at which the projective dynamics $\hat\Phi^t$ distorts the stationary fiber measures $(\nu_x)_{x\in M}$.

Proof.
We include a sketch of the proof, ignoring technical details related to localization and convergence of $\rho_t \to \rho$, $f_t \to f$. We present here the proof for $f_t$; the proof for $\rho_t$ largely follows the same lines and is omitted.

Using the formula for $f_t$, we can apply Itô's lemma to obtain the following stochastic equation that holds pointwise on $SM$:
$$d\left[f_t\log\left(\frac{f_t}{f}\right)\right] = \frac{1}{2}\sum_{k=1}^r \frac{|\tilde X_k^* f_t|^2}{f_t}\,dt + \left[\log\left(\frac{f_t}{f}\right) - 1\right](df_t).$$
Integrating in time and substituting the equation for $f_t$ gives
$$f_t\log\left(\frac{f_t}{f}\right) = \frac{1}{2}\sum_{k=1}^r \int_0^t \frac{|\tilde X_k^* f_s|^2}{f_s}\,ds + \int_0^t (\tilde{\mathcal{L}}^* f_s)\left[\log\left(\frac{f_s}{f}\right) - 1\right]ds + M_t,$$
where $M_t$ is a mean-zero martingale whose exact form is not important. Integrating over $SM$, using Fubini, dividing by $t$ and averaging with respect to $\mathbf{E}$ gives
$$\frac{1}{t}\mathbf{E}\,H(f_t \,|\, f) = \frac{1}{t}\mathbf{E}\int_0^t FI(f_s)\,ds + \frac{1}{t}\mathbf{E}\int_0^t\int_{SM} (\tilde{\mathcal{L}}^* f_s)\log\left(\frac{f_s}{f}\right)dq\,ds,$$
where we used the fact that $\int_{SM}\tilde{\mathcal{L}}^* f_s\,dq = 0$. Sending $t \to 0$ and assuming that we can pass to the limit $f_t \to f$ in both terms on the right-hand side gives the result, since $\log\left(\frac{f_t}{f}\right) \to 0$.

This section is dedicated to the proof of Theorem 1.9. We start with some notation and conventions to set up our main result, the statement of the quantitative hypoelliptic regularity estimate in Theorem 3.2.

The proof we present has little directly to do with $SM$, and so throughout Section 3 we replace $SM$ with an arbitrary, connected, orientable Riemannian manifold $(\mathcal{M}, g)$ with volume element $dq$. Some notation: in what follows we denote $d = \dim\mathcal{M}$, and write $\mathfrak{X}(\mathcal{M})$ for the set of smooth vector fields on $\mathcal{M}$. Elements $X \in \mathfrak{X}(\mathcal{M})$ are regarded in the usual way as first-order differential operators acting on observables $w : \mathcal{M} \to \mathbb{R}$.

Throughout, $\{X_0^\epsilon, X_1^\epsilon, \ldots, X_r^\epsilon\} \subset \mathfrak{X}(\mathcal{M})$ is a fixed collection of smooth vector fields (note that since this section is not specific to $SM$, we will drop the tildes for notational simplicity).
We are interested in studying the regularity of the family $\{f^\epsilon\,dq\} \subset \mathcal{P}(\mathcal{M})$ of smooth, absolutely continuous probability densities solving the stationary forward Kolmogorov equation
$$(X_0^\epsilon)^* f^\epsilon + \frac{\epsilon}{2}\sum_{j=1}^r \big((X_j^\epsilon)^*\big)^2 f^\epsilon = 0. \tag{3.1}$$
In what follows we will drop the $\epsilon$ superscript on the vector fields for notational simplicity.

Regularity is estimated using the following 'fractional' norms, which arise naturally in our analysis. To define these, let $\{x_j\}$ be a countable family of smooth injective mappings $x_j : B_{2\delta}(0) \to \mathcal{M}$, $B_{2\delta}(0) \subset \mathbb{R}^d$, such that both $\tilde U_j := x_j(B_{2\delta}(0))$ and $U_j := x_j(B_\delta(0))$ are covers of $\mathcal{M}$ for which $\operatorname{diam}\tilde U_j < \infty$ and every $q \in \mathcal{M}$ is in at most finitely many $\tilde U_j$. Let $\{\chi_j\}$ be a smooth partition of unity on $\mathcal{M}$ subordinate to the cover $\{U_j\}$, i.e., (i) $0 \leq \chi_j \leq 1$ everywhere, (ii) $\chi_j|_{U_j} \equiv 1$, and (iii) $\chi_j$ is supported in $\tilde U_j$.

Fractional $L^1$ Sobolev spaces $W^{s,1}$ with $s \in (0, 1)$ are defined by
$$\|w\|_{W^{s,1}} = \|w\|_{L^1} + \sum_j \int_{|h|<\delta}\int_{\mathbb{R}^d} \frac{|\tilde w_j(x+h) - \tilde w_j(x)|}{|h|^{d+s}}\,J_j(x)\,dx\,dh. \tag{3.2}$$
In practice, though, it is easier to work with the following $L^1$ Hölder-type regularity class (essentially the Besov space $B^s_{1,\infty}$): for $s \in (0, 1)$,
$$\|w\|_{\Lambda^s} = \|w\|_{L^1} + \sup_{h\in\mathbb{R}^d : |h|<\delta}\sum_j \int_{\mathbb{R}^d} \frac{|\tilde w_j(x+h) - \tilde w_j(x)|}{|h|^s}\,J_j(x)\,dx, \tag{3.3}$$
where $\tilde w_j = (\chi_j w)\circ x_j$ and $J = J_j : B_{2\delta}(0) \to \mathbb{R}_{\geq 0}$ is the coordinate representation of the volume element in the chart $(U_j, x_j)$. The following embedding is clear: for all $0 < s < s' < 1$ and any $w \in C_c^\infty(U)$ with $U \subset \mathcal{M}$ open and bounded, we have
$$\|w\|_{W^{s,1}} \lesssim \|w\|_{\Lambda^{s'}}.$$

Given
$X, Y \in \mathfrak{X}(\mathcal{M})$, the adjoint action of $X$ on $Y$ is defined through the Lie bracket
$$\operatorname{ad}(X)Y = [X, Y].$$
For a multi-index $I = (i_1, \ldots, i_k)$, $i_j \in \{0, \cdots, r\}$ for each $1 \leq j \leq k$, we denote
$$X_I = \operatorname{ad}(X_{i_1})\cdots\operatorname{ad}(X_{i_{k-1}})X_{i_k}.$$
In what follows, set $s_0 = \frac{1}{2}$, $s_j = 1$ for $1 \leq j \leq r$, and for a multi-index $I = (i_1, \ldots, i_k)$ we write
$$m(I) := \frac{1}{s(I)} := \sum_{j=1}^k \frac{1}{s_{i_j}}.$$
Note that $m(I)$ provides a measure of how "deep" a bracket is (i.e. the larger $m(I)$, the more brackets that were taken), weighted in a way that will be consistent with available regularity.

We denote by $\mathfrak{X}_s(\mathcal{M}) \subset \mathfrak{X}(\mathcal{M})$ the $C^\infty(\mathcal{M})$-submodule of vector fields generated from successive brackets with $s \leq s(I)$, that is,
$$\mathfrak{X}_s(\mathcal{M}) = \Big\{Z \in \mathfrak{X}(\mathcal{M}) : Z = \sum_{j=1}^N h_j X_{I_j},\ s(I_j) \geq s,\ h_j \in C^\infty(\mathcal{M})\Big\}.$$
Recall that $\{X_j\}_{j=0}^r = \{X_j^\epsilon\}_{j=0}^r \subset \mathfrak{X}(\mathcal{M})$ depend in a general manner on a parameter $\epsilon \in (0, 1]$, hence $\mathfrak{X}_s(\mathcal{M})$ also depends on $\epsilon$. This dependence is constrained only by the following 'uniform' version of the parabolic Hörmander condition:

Definition 3.1 (Uniform parabolic Hörmander). Let $\{Z_0^\epsilon, Z_1^\epsilon, \ldots, Z_r^\epsilon\} \subset \mathfrak{X}(\mathcal{M})$ be a set of vector fields parameterized by $\epsilon \in (0, 1]$. With $\mathscr{X}_k$ defined as in Definition 1.7, we say $\{Z_0^\epsilon, Z_1^\epsilon, \ldots, Z_r^\epsilon\}$ satisfies the uniform parabolic Hörmander condition on $\mathcal{M}$ if $\exists k \in \mathbb{N}$ such that for any open, bounded set $U \subseteq \mathcal{M}$ there exist constants $\{K_n\}_{n=0}^\infty$ such that for all $\epsilon \in (0, 1]$ and all $x \in U$, there is a subset $V(x) \subset \mathscr{X}_k$ such that $\forall \xi \in \mathbb{R}^d$
$$|\xi| \leq K_0\sum_{Z\in V(x)} |Z(x)\cdot\xi|, \qquad \sum_{Z\in V(x)} \|Z\|_{C^n} \leq K_n.$$

Assuming, as we do, that $\{X_j^\epsilon\}$ satisfies the uniform parabolic Hörmander condition, a simple consequence is that $\exists s > 0$ such that $\forall \epsilon \in (0, 1]$, $\mathfrak{X}_s(\mathcal{M}) = \mathfrak{X}(\mathcal{M})$. Once and for all, fix $s_* > 0$ so that $\mathfrak{X}_{s_*}(\mathcal{M}) = \mathfrak{X}(\mathcal{M})$.

We now prove the following variant of Theorem 1.9.
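Before stating the result, the bracket bookkeeping above can be illustrated on a toy pair of vector fields. The sketch below is our own illustration, not an example from the paper: it takes the Kolmogorov-type fields $X_0 = x\,\partial_y$ and $X_1 = \partial_x$ on $\mathbb{R}^2$, computes $X_I = \operatorname{ad}(X_1)X_0 = [X_1, X_0]$ for $I = (1, 0)$ by central differences using the coordinate formula $[X, Y] = DY\cdot X - DX\cdot Y$, and evaluates the weight $s(I)$ assuming the convention $s_0 = \frac{1}{2}$, $s_1 = 1$ with $\frac{1}{s(I)} = \sum_j \frac{1}{s_{i_j}}$. All function names are hypothetical.

```python
from fractions import Fraction


def bracket(X, Y, h=1e-4):
    """Return the vector field [X, Y] = DY.X - DX.Y via central differences."""
    def Z(p):
        def directional(F, direction):
            # Derivative of the field F at p along the given direction.
            plus = F([a + h * b for a, b in zip(p, direction)])
            minus = F([a - h * b for a, b in zip(p, direction)])
            return [(u - w) / (2.0 * h) for u, w in zip(plus, minus)]
        dY_X = directional(Y, X(p))
        dX_Y = directional(X, Y(p))
        return [u - w for u, w in zip(dY_X, dX_Y)]
    return Z


def X0(p):
    """Illustrative drift field x * d/dy."""
    return [0.0, p[0]]


def X1(p):
    """Illustrative noise direction d/dx."""
    return [1.0, 0.0]


# Assumed weights: s_0 = 1/2 for the drift, s_1 = 1 for the noise direction.
WEIGHTS = {0: Fraction(1, 2), 1: Fraction(1)}


def s_of(I):
    """s(I) defined through 1/s(I) = sum over j of 1/s_{i_j}."""
    return 1 / sum(1 / WEIGHTS[i] for i in I)


X10 = bracket(X1, X0)  # X_I for the multi-index I = (1, 0)
```

In this toy case `X10` evaluates to the constant field $\partial_y$, so $X_1$ and $[X_1, X_0]$ span $\mathbb{R}^2$ at every point even where $X_0$ vanishes, and `s_of((1, 0))` returns `Fraction(1, 3)`. Finite differences are exact here only because the fields are affine; for general fields a symbolic computation would be the safer choice.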
(Footnote: Indeed, if $\{Z_1, \ldots, Z_m\}$ span $T_w\mathcal{M}$ then $\exists\delta > 0$ such that $\forall v \in B_\delta(w) = \{v \in \mathcal{M} : d(w, v) < \delta\}$ the same vector fields span, and so for $V \in \mathfrak{X}(\mathcal{M})$, $\exists c_j \in C^\infty$ such that $V = \sum c_j Z_j$ on $B_\delta$. The result then follows by a suitable partition of unity.)

Theorem 3.2. We assume that for all $\epsilon \in (0, 1]$ the PDE (3.1) admits a unique, smooth probability measure solution which satisfies $f^\epsilon\log f^\epsilon \in L^1(dq)$. Assume $\{X_0^\epsilon, X_1^\epsilon, \ldots, X_r^\epsilon\}$ satisfies the uniform parabolic Hörmander condition as in Definition 3.1. Then, $\forall U \subset \mathcal{M}$ open, bounded, $\exists C > 0$ such that for all $\epsilon \in (0, 1]$ there holds
$$\|f^\epsilon\|_{\Lambda^{s_*}(U)} \leq C\Big(1 + \sqrt{FI(f^\epsilon)}\Big).$$
Moreover, the constant $C$ can be chosen to depend only on $U$, $d$ and the constants $k$ and $\{K_n\}_{n=0}^J$ (for a $J$ depending only on $k$ and $d$) in Definition 3.1.

Remark 3.3.
One can check from the proof that k < s ≤ k where k is as in Definition 3.1.The remainder of this section is devoted to proving Theorem 3.2. The following is a brief outline ofwhat is to come in the remainder of Section 3. Outline of the proof of Theorem 3.2
Crucial to both Hörmander's original approach and our own is the ability to measure partial regularity of a function along some given set of directions. To make sense of this, for a vector field $Y \in \mathfrak{X}(\mathcal{M})$ and $s > 0$, we define below the norm $|\cdot|_{Y,s}$ which measures $L^1$ Hölder regularity along the direction $Y$.

Let us make this more precise. Throughout, we fix an open bounded set $U \subset \mathcal{M}$. Given $Y \in \mathfrak{X}(\mathcal{M})$, let $Y^*$ denote its formal adjoint and let $e^{tY^*}$ denote the linear propagator solving the partial differential equation $\partial_t - Y^* = 0$ (this makes sense as long as $t > 0$ is taken sufficiently small depending on $Y$ and $U$). For $s > 0$, we define the family of 'partial' $L^1$ Hölder seminorms
$$|w|_{Y,s} = \sup_{|t|\leq\delta_0} |t|^{-s}\,\big\|e^{-tY^*}w - w\big\|_{L^1}.$$
Note the dependence on the parameter $\delta_0 > 0$: in practice, given $U$, this parameter is fixed and depends only on the regularity of $\{X_I\}$ as $I$ ranges over the multi-indices with $s(I) \geq s_*$. We may choose this parameter thus as the vector fields in the proof vary in a uniformly bounded set in $C^J$ for a $J$ depending only on the constants in Definition 3.1.

(Footnote: Note that these seminorms are slightly different from those used in [29], where the linear propagator $e^{tY}$ solving $\partial_t - Y = 0$ is used directly. Note, though, that the regularity defined is essentially the same in the sense that
$$\|w\|_{L^1} + \sup_{|t|\leq\delta_0} |t|^{-s}\,\big\|e^{-tY^*}w - w\big\|_{L^1} \approx_{\delta_0,U,H} \|w\|_{L^1} + \sup_{|t|\leq\delta_0} |t|^{-s}\,\big\|e^{tY}w - w\big\|_{L^1}.)$$

Turning back to the proof of Theorem 3.2: ultimately, for $f = f^\epsilon$ solving the Kolmogorov equation (3.1), we seek to control $\|f^\epsilon\|_{\Lambda^{s_*}(U)}$ from above in terms of $FI(f^\epsilon)$. Starting from the latter, it is straightforward (Lemma 3.4) to obtain the general functional inequality
$$\sum_{j=1}^r |w|_{X_j,1} \lesssim \|w\|_{L^1}^{1/2}\sqrt{FI(w)} \tag{3.4}$$
for any $w \in C^\infty(U)$. Hence, for all intents and purposes it suffices to control the regularity $\|f\|_{\Lambda^{s_*}}$ from above in terms of $\sum_{j\geq 1} |f|_{X_j,1}$.

For this, we turn to the ideas laid out by Hörmander. First, the spanning condition $\mathfrak{X}_{s_*} = \mathfrak{X}$ allows us to "fill in" the missing directions not spanned by the original $\{X_0, \cdots, X_r\}$, leading to the general functional inequality
$$\|w\|_{\Lambda^{s_*}} \lesssim_U \|w\|_{L^1} + \sum_{j=0}^r |w|_{X_j,s_j} \tag{3.5}$$
for any $w \in C^\infty(U)$. This is a straightforward adaptation of [Section 4; [29]]; see Lemma 3.7 below.

While (3.4) controls $|w|_{X_j,1}$, $1 \leq j \leq r$, in terms of the Fisher information $FI(w)$, it remains (as in [29]) to obtain an upper estimate on $|f^\epsilon|_{X_0,1/2}$. The starting point is the derivation of an a priori estimate on $f^\epsilon$ from (3.1). In [29], Hörmander observed that one naturally obtains an a priori regularity estimate on $X_0 f$ in a negative regularity $L^2$ space in terms of $X_j f \in L^2$ (see also discussions in [4]). In our case, we cannot work in $L^2$, and instead have to work in a negative-type regularity which is essentially the dual to that in (3.4); this is the only a priori estimate available that will be useful. Pairing (3.1) with a test function $v \in C^\infty$ we obtain the following, which is essentially the $W^{-1,\infty}$ norm with respect to the $X_j^*$ directions:
$$\mathcal{D}_\epsilon(f^\epsilon) := \sup_{v\in C^\infty :\ \|v\|_{L^\infty} + \sum_{j=1}^r \|X_j v\|_{L^\infty} \leq 1}\left|\int f^\epsilon X_0^\epsilon v\right| \leq \frac{\epsilon}{2}\sum_{j=1}^r \big\|X_j^* f^\epsilon\big\|_{L^1} \lesssim \epsilon\sqrt{FI}. \tag{3.6}$$
Using this, the missing $X_0$ regularity is recovered by the following, which is the main difficulty in the proof: for any $0 < \sigma < s_*$, $U \subset \mathcal{M}$ a bounded, open set and $w \in C_c^\infty(U)$, we show that
$$|w|_{X_0,1/2} \lesssim_U \sum_{j=1}^r |w|_{X_j,1} + \mathcal{D}_\epsilon(w) + \|w\|_{\Lambda^\sigma}. \tag{3.7}$$
That is, we recover the $|w|_{X_0,1/2}$ regularity by a combination of the negative $\mathcal{D}_\epsilon$ regularity in conjunction with the positive $|w|_{X_j,1}$ regularity, accruing only a remainder term $\|w\|_{\Lambda^\sigma}$. Combining with (3.5) (along with interpolation of $\Lambda^\sigma$ between $\Lambda^{s_*}$ and $L^1$), we obtain the following: $\forall U \subset \mathcal{M}$ open, bounded, $\exists C > 0$ such that for all $w \in C_c^\infty(U)$, there holds
$$\|w\|_{\Lambda^{s_*}(U)} \leq C\Big(\|w\|_{L^1} + \mathcal{D}_\epsilon(w) + \sum_{j=1}^r |w|_{X_j,1}\Big). \tag{3.8}$$
From here, our estimate on $\|f^\epsilon\|_{\Lambda^{s_*}(U)}$ in Theorem 3.2 is an easy consequence of the functional inequality (3.4) and the a priori estimate in (3.6).

In Section 3.2 we review the available a priori estimates and basic functional inequalities that are used in the proof. In Section 3.3 we briefly recall (3.5) and a closely related inequality which are straightforward adaptations of estimates in [Section 4; [29]]. In Section 3.4 we give the proof of (3.7), leaving the main lemma to be proved in Section 3.5. As in the corresponding step in [29], (3.7) is based on a careful regularization procedure, though it is more subtle to perform this procedure in the $W^{-1,\infty}$-type framework we work with here. Section 3.5 is dedicated to the details of this regularization.

To start, we record some useful estimates for the $L^1$ Hölder-type seminorms $|\cdot|_{Y,s}$. Let $Y \in \mathfrak{X}(\mathcal{M})$ and let $e^{tY}$ be the linear propagator of the partial differential operator $\partial_t - Y$. By the method of characteristics, the smooth family of diffeomorphisms $h_Y(t) : \mathcal{M} \to \mathcal{M}$ solving the initial value problem $\dot x = Y(x)$ satisfies the identity
$$e^{tY}w = w\circ h_Y(t).$$
With $Y^*$ the formal adjoint of $Y$, again by the method of characteristics there is a smooth family of strictly positive densities $H_Y(t) : \mathcal{M} \to (0, \infty)$ such that
$$e^{-tY^*}w = H_Y(t)\,w\circ h_Y(t) = H_Y(t)\,e^{tY}w. \tag{3.9}$$
In particular, for $|t| \lesssim 1$,
$$|H_Y - 1| \lesssim |t|, \tag{3.10}$$
with similar estimates on higher derivatives.

Next, we prove (3.4): that $\|X_j^* w\|_{L^1}$ controls one derivative in the $L^1$-Hölder norms.

Lemma 3.4.
Let $U \subset M$ be a bounded, open set. Then, $\forall w \in C_c^\infty(U)$ there holds

$$|w|_{X_j,1} \lesssim_U \|X_j^* w\|_{L^1} \lesssim \|w\|_{L^1}^{1/2} \sqrt{FI(w)}.$$

Proof. Let $v \in L^\infty$. Then

$$\left| \int_M v \left( e^{-tX_j^*} w - w \right) \mathrm{d}q \right| = \left| \int_0^t \int_M v\, e^{-sX_j^*} X_j^* w \,\mathrm{d}q\, \mathrm{d}s \right| \le |t|\, \|v\|_{L^\infty} \|X_j^* w\|_{L^1}.$$

Taking the supremum over $\|v\|_{L^\infty} \le 1$ and dividing by $|t|$ gives the first inequality, whereas the second follows by Cauchy-Schwarz.

Lastly, we record the simple observation that the negative regularity $D_\epsilon$ can be localized.

Lemma 3.5. Let $U \subseteq M$ be an open, bounded set and $\chi \in C_c^\infty(U)$. Then, for any $h \in L^1(M)$, we have

$$D_\epsilon(\chi h) \lesssim_U \|h\|_{L^1} + D_\epsilon(h).$$

Proof. Set $w = \chi h$. For test functions $v \in C_0^\infty(U)$, we estimate

$$\left| \int (X_0 v)\, w \,\mathrm{d}q \right| \le \left| \int v\, (X_0 \chi)\, h \right| + \left| \int X_0(\chi v)\, h \right| \lesssim \|h\|_{L^1} + D_\epsilon(h).$$

Note that the estimate is uniform in $\|X_0\|_{C(U)}$.

$\Lambda^s$ with $|\cdot|_{X_j,s_j}$

The first steps to Theorem 3.2 are several lemmas that are nearly the same as those in [Section 4; [29]], except (A) we need them in $L^1$, (B) we need them uniform in the parameter $\epsilon$ hidden in $X_0$, (C) we need to generalize the proof to compact manifolds $(M, g)$. However, these small changes are straightforward on a careful reading of [29] and are omitted for the sake of brevity; see [16] for more discussion on the uniformity.

Lemma 3.6.
Let $U$ be an open, bounded set and $0 < \sigma < s_*$ with $\mathfrak{X}_{s_*}(M) = \mathfrak{X}(M)$. For all $\delta > 0$, $\exists C_\delta > 0$ such that for all multi-indices $I$ in which at least one index is zero, the following holds $\forall w \in C_c^\infty(U)$:

$$|w|_{X_I,s(I)} \le \delta\, |w|_{X_0,1/2} + C_\delta \Big( \sum_{j=1}^r |w|_{X_j,1} + \|w\|_{\Lambda^\sigma} \Big).$$

Moreover, $C_\delta$ depends on $\{X_0, X_1, \dots, X_r\}$ only in the manner stated in Theorem 3.2.

The next lemma shows that one can control regularity in $\Lambda^s$ by controlling the original vector fields.

Lemma 3.7. Let $U$ be an open, bounded set and $s_*$ be such that $\mathfrak{X}_{s_*}(M) = \mathfrak{X}(M)$. Then, for $w \in C_c^\infty(U)$ there holds

$$\|w\|_{\Lambda^{s_*}} \lesssim_U \|w\|_{L^1} + \sum_{j=0}^r |w|_{X_j,s_j}.$$

Moreover, the implicit constant depends on $\{X_0, X_1, \dots, X_r\}$ only in the manner stated in Theorem 3.2.

3.4 Positive $X_0$ regularity from negative $X_0$ and positive $X_j$ regularity

In this subsection, we prove the a priori estimate (3.7) and then use it to complete the proof of Theorem 3.2. Fix $0 < \sigma < s_*$ arbitrary. Having fixed $U$ we may, by rescaling $\{X_0, X_1, \dots, X_r\}$, assume that $e^{tX_I}$ (and hence $e^{-tX_I^*}$) is well-posed for $w \in C_0^\infty(U)$ for $t \in [-1,1]$ and $\sigma \le s(I)$ (and hence we may choose $\delta_0 = 1$).

Analogous to [Section 5; [29]], the primary intermediate step is to first deduce the estimate assuming the natural control on essentially all other vector fields in $\mathfrak{X}_\sigma$.

Definition 3.8.
Denote by $\mathcal{J}$ the set of all multi-indices $I$ with $\sigma \le s(I)$, except for the singleton $\{0\}$. Note this definition is slightly different from that in [29]. Define the following semi-norm:

$$|w|_M := \sum_{I \in \mathcal{J}} |w|_{X_I,s(I)}.$$

The main step in the proof of (3.7) (and hence Theorem 3.2 as a whole) is to prove the following.
Lemma 3.9.
For any bounded, open set $U \subset M$ and $w \in C_0^\infty(U)$, the following holds uniformly in $\epsilon$:

$$|w|_{X_0,1/2} \lesssim_U |w|_M + \|w\|_{\Lambda^\sigma} + D_\epsilon(w).$$

As in the corresponding [Section 5; [29]] (and in [4]), we use an approach based on a carefully selected regularization, but our choice is even a little more delicate than that of [29]. As the regularization procedure is quite technically subtle, we first give the proof of Lemma 3.9 assuming the existence of a regularizer satisfying the desired properties.
Lemma 3.10.
There exists a family of uniformly bounded smoothing operators $S_\tau : L^p \to L^p$ for $\tau \in (0,1)$ and $p \in [1,\infty]$ with the following properties: for all $w \in C_0^\infty(U)$,

$$\|S_\tau^* w - w\|_{L^1} \lesssim \tau\, |w|_M,$$

$$\sum_{j=1}^r \|X_j S_\tau w\|_{L^\infty} \lesssim \tau^{-1} \|w\|_{L^\infty},$$

$$\|[X_0, S_\tau]^* w\|_{L^1} \lesssim \tau^{-1} \left( |w|_M + \|w\|_{\Lambda^\sigma} \right).$$

Assuming this lemma for now, we proceed.
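Before the proof, it may help to preview how the three bounds of Lemma 3.10 balance against each other. The following is only a sketch of the bookkeeping: the shorthand $A := |w|_M + \|w\|_{\Lambda^\sigma} + D_\epsilon(w)$ is introduced here for convenience and is not notation from the paper, and the powers $\tau$ and $\tau^{-1}$ are those of Lemma 3.10, with $0 < |t| < 1$ so that the optimal $\tau$ lies in $(0,1)$.

```latex
\begin{aligned}
\|e^{-tX_0^*} w - w\|_{L^1}
  &\lesssim \underbrace{\|S_\tau^* w - w\|_{L^1}}_{\lesssim\ \tau A}
   \;+\; \underbrace{\|e^{-tX_0^*} S_\tau^* w - S_\tau^* w\|_{L^1}}_{\lesssim\ |t|\,\tau^{-1} A} \\
  &\lesssim \bigl(\tau + |t|\,\tau^{-1}\bigr)\, A .
\end{aligned}

% By the arithmetic-geometric mean inequality the bracket is minimized
% when the two terms agree, \tau = |t|\,\tau^{-1}, i.e. \tau = |t|^{1/2}:
\min_{\tau > 0} \bigl(\tau + |t|\,\tau^{-1}\bigr) = 2\,|t|^{1/2}
\quad\Longrightarrow\quad
|t|^{-1/2}\, \|e^{-tX_0^*} w - w\|_{L^1} \lesssim A .
```

Read this way, the exponent $1/2$ in $|w|_{X_0,1/2}$ is exactly what the trade-off between the approximation error (of order $\tau$) and the cost of applying $X_0$ after smoothing (of order $\tau^{-1}$) permits.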
Proof of Lemma 3.9 assuming Lemma 3.10.
We will first obtain regularity estimates by evaluating the fractional time derivative of $e^{tX_0^*} w$ (omitting the $\epsilon$ for notational simplicity). Observe that for any $t, \tau > 0$,

$$\left\| e^{-tX_0^*} w - w \right\|_{L^1} \le \left\| e^{-tX_0^*} (S_\tau^* w - w) \right\|_{L^1} + \|S_\tau^* w - w\|_{L^1} + \left\| e^{-tX_0^*} S_\tau^* w - S_\tau^* w \right\|_{L^1}. \tag{3.11}$$

Therefore, by Lemma 3.10 and $L^1$ boundedness of the group $e^{-tX_0^*}$ on $U$,

$$\sup_{|t| \le 1} \left\| e^{-tX_0^*} (S_\tau^* w - w) \right\|_{L^1} \lesssim \|S_\tau^* w - w\|_{L^1} \lesssim \tau\, |w|_M. \tag{3.12}$$

This will suffice for the first two terms in (3.11). Next, we estimate the last term in (3.11). We will do this using the fact that

$$\left\| e^{-tX_0^*} S_\tau^* w - S_\tau^* w \right\|_{L^1} \le \sup_{\|v\|_{L^\infty} \le 1} \left| \int_0^t \int_M (e^{sX_0} v)\, X_0^* S_\tau^* w \,\mathrm{d}q\, \mathrm{d}s \right|. \tag{3.13}$$

For a fixed $v \in L^\infty$, we find that

$$\begin{aligned}
\left| \int_M (e^{sX_0} v)\, X_0^* S_\tau^* w \,\mathrm{d}q \right|
&\le \left| \int_M (e^{sX_0} v)\, [X_0, S_\tau]^* w \,\mathrm{d}q \right| + \left| \int_M (S_\tau e^{sX_0} v)\, X_0^* w \,\mathrm{d}q \right| \\
&\le \|e^{sX_0} v\|_{L^\infty} \|[X_0, S_\tau]^* w\|_{L^1} + \Big( \|S_\tau e^{sX_0} v\|_{L^\infty} + \sum_{j=1}^r \|X_j S_\tau e^{sX_0} v\|_{L^\infty} \Big) D(w).
\end{aligned}$$

Using Lemma 3.10 and the boundedness of $e^{tX_0}$ in $L^\infty(U)$, we conclude that

$$\left| \int_M (e^{sX_0} v)\, X_0^* S_\tau^* w \,\mathrm{d}q \right| \lesssim_U \tau^{-1} \|v\|_{L^\infty} \left( |w|_M + \|w\|_{\Lambda^\sigma} + D(w) \right),$$

and from (3.13) we deduce

$$\left\| e^{-tX_0^*} S_\tau^* w - S_\tau^* w \right\|_{L^1} \lesssim_U |t|\, \tau^{-1} \left( |w|_M + \|w\|_{\Lambda^\sigma} + D(w) \right).$$

Therefore, setting $\tau = \sqrt{|t|}$ and using (3.12) implies

$$\left\| e^{-tX_0^*} w - w \right\|_{L^1} \lesssim \sqrt{|t|} \left( |w|_M + \|w\|_{\Lambda^\sigma} + D(w) \right).$$

By (3.9), (3.10), and the boundedness of $U$, this implies the desired result.

To complete the section, we explain in more detail how Lemma 3.9 implies Theorem 3.2.

Proof of Theorem 3.2.
Lemma 3.9, followed by Lemma 3.6 with $\delta$ chosen sufficiently small to absorb the effect of the higher order brackets, implies (3.7); that is, for any $w \in C_c^\infty(U)$,

$$|w|_{X_0,1/2} \lesssim \|w\|_{\Lambda^\sigma} + \sum_{j=1}^r |w|_{X_j,1} + D_\epsilon(w).$$

Applying Lemma 3.7 then implies

$$\|w\|_{\Lambda^{s_*}} \lesssim \sum_{j=1}^r |w|_{X_j,1} + \|w\|_{\Lambda^\sigma} + D_\epsilon(w). \tag{3.14}$$

Next, note the interpolation (from Hölder's inequality and Definition 3.3): $\forall \sigma \in (0, s_*)$ and all $\delta > 0$, $\exists C_\delta$ such that

$$\|w\|_{\Lambda^\sigma} \le \delta\, \|w\|_{\Lambda^{s_*}} + C_\delta \|w\|_{L^1},$$

which by (3.14) implies the Hörmander inequality (3.8). Let $U \subset\subset U' \subset M$, where $U'$ is another open and bounded set, and let $\chi \in C_c^\infty(U')$ with $\chi(x) = 1$ for all $x \in U$. Then, Lemma 3.5 implies

$$\|\chi f^\epsilon\|_{\Lambda^{s_*}} \lesssim \sum_{j=1}^r |\chi f^\epsilon|_{X_j,1} + D_\epsilon(f^\epsilon). \tag{3.15}$$

Putting Lemma 3.4 together with (3.15) and (3.6) completes the proof of Theorem 3.2.

3.5 Regularization: Lemma 3.10

In this subsection we prove Lemma 3.10. First, we define a suitable "isotropic" mollifier via the parameterization. Let $\varphi \in C_c^\infty((-1,1))$ with $\varphi \ge 0$, $\int_{-1}^{1} \varphi(t)\,\mathrm{d}t = 1$, and $\varphi(-t) = \varphi(t)$. Denote $\tilde{w}_j = \chi_j w \circ x_j$, and for each $x \in \mathbb{R}^d$ let $\phi_\tau(x) = \tau^{-d} \phi(|x|/\tau)$. We define the regularization of $\chi_j w$ as follows for $|\tau| \le \delta_0$:

$$\Phi^{(j)}_\tau w \circ x_j = \int_{\mathbb{R}^d} \phi_\tau(|x - y|)\, \tilde{w}_j(y)\, J_j(y) \,\mathrm{d}y,$$

where as above we write $J_j = (\det \tilde{g})^{1/2}$, the volume element on $M$ in local coordinates. We write

$$\Phi_\tau w(q) = \sum_{j \,:\, q \in \tilde{U}_j} \Phi^{(j)}_\tau w(q). \tag{3.16}$$

Note that by definition, $\Phi_\tau = \Phi_\tau^*$ for the adjoint in $L^2(\mathrm{d}q)$. The basic properties of these kinds of mollifiers are classical; however, we include sketches for completeness. Due to the compatibility between definitions (3.3) and (3.16), and the fact that the properties we are interested in are purely local, the results follow from the corresponding statements on $\mathbb{R}^d$. We sketch the details of this in the first lemma for the readers' convenience.

Lemma 3.11.
For all σ ∈ [0 , , U ⊂ M open and bounded, there holds the following uniformly in τ ∈ (0 , δ ) and uniformly in C bounded sets of Y ∈ X ( U ) , for all w ∈ C ∞ ( U ) (identifying Λ = L ), || [ Y, Φ τ ] w || L . U τ σ || w || Λ σ . Proof.
It suffices to show that the lemma holds for all Φ ( j ) . By the definition of the parameterization wehave, writing a k ( x ) ∂ kx (using Einstein notation summation) as the parameterization of the vector field Y , (cid:12)(cid:12)(cid:12)(cid:12) [ Y, Φ jτ ] w (cid:12)(cid:12)(cid:12)(cid:12) L = ˆ R d (cid:12)(cid:12)(cid:12)(cid:12) ˆ R d J j ( y ) a k ( x ) ∂ kx φ τ ( | x − y | ) ˜ w j ( y ) − φ τ ( | x − y | ) a k ( y ) ∂ ky ˜ w j ( y )d y (cid:12)(cid:12)(cid:12)(cid:12) J j ( x )d x. Integrating by parts and using the average zero property, we obtain (cid:12)(cid:12)(cid:12)(cid:12) [ Y, Φ jτ ] w (cid:12)(cid:12)(cid:12)(cid:12) L . ˆ R d (cid:12)(cid:12)(cid:12)(cid:12) ˆ R d (cid:16) J j ( y ) a k ( x ) ∂ kx φ τ ( | x − y | ) + ∂ ky ( J j ( y ) φ τ ( | x − y | ) a k ( y )) (cid:17) ( ˜ w j ( y ) − ˜ w j ( x )) d y (cid:12)(cid:12)(cid:12)(cid:12) J j ( x )d x. Using that | a k ( x ) − a k ( y ) | . | x − y | , and ∂ ky ( J j ( y ) a k ( y )) ≤ gives (cid:12)(cid:12)(cid:12)(cid:12) [ Y, Φ jτ ] w (cid:12)(cid:12)(cid:12)(cid:12) L . ˆ R d ˆ R d (cid:12)(cid:12)(cid:12)(cid:12) τ d φ (cid:18) | x − y | τ (cid:19) + | x − y | τ d +1 φ ′ (cid:18) | x − y | τ (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) | ˜ w j ( x ) − ˜ w j ( y ) | d yJ j ( x )d x. Then, making the change of variables y = x + h , we obtain from (3.3), (cid:12)(cid:12)(cid:12)(cid:12) [ Y, Φ jτ ] w (cid:12)(cid:12)(cid:12)(cid:12) L . τ σ || w || Λ σ . Next we prove the following regularization estimate.27 emma 3.12.
For all σ ∈ [0 , , for all U ⊂ M open and bounded, there holds uniformly over τ ∈ (0 , and uniformly over bounded C sets of Y ∈ X ( K ) , for all w ∈ C ∞ c ( U ) and p ∈ [1 , ∞ ] , || τ Y Φ τ w || L p . U || w || L p (3.17) || τ Y Φ τ w || L . U τ σ || w || Λ σ . (3.18) || Φ τ τ Y w || L . U τ σ || w || Λ σ . (3.19) Proof.
We proceed with a proof similar to that used in Lemma 3.11. We consider only (3.18); the proofs of(3.17) and (3.19) follow from similar arguments. As above, we have, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) τ Y Φ ( j ) τ w j (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L . ˆ R d (cid:12)(cid:12)(cid:12)(cid:12) ˆ R d τ a k ∂ kx φ τ ( | x − y | ) ( ˜ w j ( y ) − ˜ w j ( x )) d y (cid:12)(cid:12)(cid:12)(cid:12) d x . τ σ || w || Λ σ . Next, we introduce directional regularizations with respect to a given vector field Y ∈ X , as done in[Section 5; [29]]. Accordingly, for each ϕ ∈ C ∞ c ([ − , and τ ∈ (0 , define ϕ τY w := ˆ R ( e tY w ) ϕ τ ( t )d t, where ϕ τ ( t ) := τ ϕ ( τ − t ) . Note that, ( ϕ τY ) ∗ w = ϕ − τY ∗ w = ˆ R ( e − tY ∗ w ) ϕ τ ( t )d t, a property that will be used repeatedly in the sequel.First we record the basic property that these regularizers are bounded on L p . The proof is straightforwardand is omitted for brevity. Lemma 3.13.
For any $Y \in \mathfrak{X}$, any open bounded set $U \subset M$, and $\varphi \in C_c^\infty([-1,1])$, there holds for all $p \in [1,\infty]$ and $w \in C_c^\infty(U)$,

$$\|(\varphi_{\tau Y})^* w\|_{L^p} \lesssim \|w\|_{L^p}, \qquad \|\Phi_\tau w\|_{L^p} \lesssim \|w\|_{L^p}.$$

Next, we note that the regularizations, the adjoint regularizations, and vector field exponentials are bounded in the $\Lambda^s$ space.

Lemma 3.14.
For | t | ≤ , τ ∈ (0 , and σ ∈ [0 , , for all open, bounded sets U ⊂ M and w ∈ C ∞ c ( U ) ,there holds (cid:12)(cid:12)(cid:12)(cid:12) e tY w (cid:12)(cid:12)(cid:12)(cid:12) Λ σ . || w || Λ σ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) e tY ∗ w (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Λ σ . || w || Λ σ || ( ϕ τY ) ∗ w || Λ σ . || w || Λ σ . || Φ τ w || Λ σ . || w || Λ σ . Proof.
The latter three estimates follow easily from the first estimate. After applying parameterization toreduce to the case of R d , the first estimate follows from a straightforward L adaptation of [Lemma 4.2;[29]]. The details are omitted for the sake of brevity.28n a similar vein, the chain rule implies the following estimates. Lemma 3.15.
For all open, bounded U ⊂ M , for all | τ | ≤ and ∀ k ≥ , the following holds ∀ w ∈ C ∞ c ( U ) , sup Z ∈ X : || Z || Ck ≤ (cid:12)(cid:12)(cid:12)(cid:12) Ze τY w (cid:12)(cid:12)(cid:12)(cid:12) L ∞ . sup Z ∈ X : || Z || Ck ≤ || Zw || L ∞ (3.20) sup Z ∈ X : || Z || Ck ≤ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Z ∗ e τY ∗ w (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L . sup Z ∈ X : || Z || Ck ≤ || Z ∗ w || L (3.21) sup Z ∈ X : || Z || Ck ≤ || Z ∗ ( ϕ τY ) ∗ w || L . sup Z ∈ X : || Z || Ck ≤ || Z ∗ w || L . (3.22) Proof.
Estimates (3.20), (3.21) follow from the chain rule and (3.22) then follows from the definition of theregularizers.The next lemma characterizes the regularization property of the regularizers.
Lemma 3.16.
For all open, bounded sets U ⊂ M and w ∈ C ∞ c ( U ) , || ( Y ϕ τY ) ∗ w || L . sup | t |≤ τ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) e − tY ∗ w − w (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L . Proof.
We have ( ϕ τY ) ∗ Y ∗ w = ˆ R ( e − tY ∗ Y ∗ w ) ϕ τ ( t ) d t = − ˆ R dd t (cid:16) e − tY ∗ w − w (cid:17) ϕ τ ( t )d t = ˆ R (cid:16) e − tY ∗ w − w (cid:17) ϕ ′ τ ( t )d t. The result then follows by Minkowski’s inequality.We will also need the L ∞ regularization property. Lemma 3.17.
For all open bounded sets U ⊂ M and w ∈ C ∞ c ( U ) , || Y ϕ τY w || L ∞ . τ || w || L ∞ . Proof.
This follows by a straightforward variant of the proof of Lemma 3.16.Next, we show that the H ¨older-type regularity classes are natural for controlling convergence of theoperators. It is natural to specialize to the specific form in which we are using it.
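In the simplest model case the convergence bound of the next lemma can be checked by hand, and a small numerical sketch may make it concrete. The assumptions here are purely illustrative and are not taken from the paper: $M$ is the unit circle, $Y = \mathrm{d}/\mathrm{d}x$ (so $Y^* = -Y$, $e^{-tY^*} w(x) = w(x+t)$, and $(\varphi_{\tau Y})^* = \varphi_{\tau Y}$ for symmetric $\varphi$), and integrals are replaced by Riemann sums on a uniform grid. The helper names `bump_weights`, `l1_norm` and `compare` are hypothetical. Since the discrete weights are nonnegative and sum to one, the inequality holds on the grid for the same reason as in the continuum (Minkowski's inequality).

```python
import numpy as np


def bump_weights(K):
    """Nonnegative, symmetric weights on the shifts k = -K..K, summing to 1.

    They discretize phi_tau(t) = tau^{-1} phi(t / tau) for the standard bump
    phi(s) = exp(-1 / (1 - s^2)) supported in (-1, 1).
    """
    s = np.arange(-K, K + 1) / (K + 1.0)  # sample points strictly inside (-1, 1)
    phi = np.exp(-1.0 / (1.0 - s**2))
    return phi / phi.sum()


def l1_norm(f):
    """L^1 norm on the unit circle, approximated by a Riemann sum."""
    return np.mean(np.abs(f))


def compare(w, tau):
    """Return (||phi_{tau Y} w - w||_1, sup_{|t| <= tau} ||e^{tY} w - w||_1)."""
    N = len(w)
    K = max(1, int(tau * N))  # largest shift, in grid units
    c = bump_weights(K)
    # e^{tY} w(x) = w(x + t); np.roll(w, -k) samples w at x + k / N.
    translates = [np.roll(w, -k) for k in range(-K, K + 1)]
    smoothed = sum(ck * wk for ck, wk in zip(c, translates))
    lhs = l1_norm(smoothed - w)
    rhs = max(l1_norm(wk - w) for wk in translates)
    return lhs, rhs


if __name__ == "__main__":
    N = 1024
    x = np.arange(N) / N
    # A function with jumps: only L^1-Hoelder regular, not differentiable.
    w = np.where(x < 0.3, 1.0, 0.0) + 0.5 * np.sin(2 * np.pi * x)
    for tau in (0.2, 0.1, 0.05, 0.025):
        lhs, rhs = compare(w, tau)
        print(f"tau={tau:<6} smoothing error={lhs:.4f}  translation modulus={rhs:.4f}")
```

The printed translation modulus shrinks with $\tau$ even though `w` has jump discontinuities, which is the sense in which the $L^1$ Hölder seminorms $|\cdot|_{Y,s}$ are the natural quantity controlling convergence of the regularizers.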
Lemma 3.18.
For all open bounded sets $U \subset M$ and $w \in C_c^\infty(U)$, there holds for $\varphi \in C_c^\infty([-1,1])$, $\varphi \ge 0$ and $\int_{\mathbb{R}} \varphi(t)\,\mathrm{d}t = 1$,

$$\|(\varphi_{\tau X_I})^* w - w\|_{L^1} \lesssim \sup_{|t| \le \tau} \left\| e^{-tX_I^*} w - w \right\|_{L^1}.$$

Proof. By Minkowski's inequality,

$$\|(\varphi_{\tau X_I})^* w - w\|_{L^1} \le \int_{\mathbb{R}} \left\| e^{-tX_I^*} w - w \right\|_{L^1} \varphi_\tau(t)\,\mathrm{d}t \le \sup_{|t| \le \tau} \left\| e^{-tX_I^*} w - w \right\|_{L^1}.$$

$\varphi_{\tau X_I}$ with respect to $X_J$

Lemma 3.19.
For all open bounded sets $U \subset M$ and $w \in C_c^\infty(U)$, for $I, J \in \mathcal{J}$, there holds

$$\sup_{|t| \le \tau^{m(J)}} \left\| e^{-tX_J^*} (\varphi_{\tau^{m(I)} X_I})^* w - (\varphi_{\tau^{m(I)} X_I})^* w \right\|_{L^1} \le \sup_{|t| \le \tau^{m(J)}} \left\| e^{-tX_J^*} w - w \right\|_{L^1} + \sup_{|t| \le \tau^{m(I)}} \left\| e^{-tX_I^*} w - w \right\|_{L^1}.$$

Now, we are ready to define the regularizer $S_\tau$. Let us give $\mathcal{J}$ a total ordering so that $m(I)$ is an increasing function of $I \in \mathcal{J}$, and denote $\mathcal{J}_\infty = \mathcal{J} \cup \{\infty\}$. We define $S_\tau$ in terms of an ascending, ordered composition of regularizing operators:

$$S_\tau w := \Big( \prod_{I \in \mathcal{J}} \varphi_{\tau^{m(I)} X_I} \Big) \Phi_{\tau^{1/\sigma}} w.$$

This regularizer is similar to, but not quite the same as, that defined in [29], due to the inclusion of more regularization operators. We will ultimately use $S_\tau^*$ as the regularizer, which is a little more subtle to work with. Analogous to [29], we also define the truncated regularizer, for all $J \in \mathcal{J}$,

$$S^J_\tau := \Big( \prod_{I \in \mathcal{J} \,:\, I \ge J} \varphi_{\tau^{m(I)} X_I} \Big) \Phi_{\tau^{1/\sigma}}.$$

The remainder of the subsection is dedicated to proving Lemma 3.10. The first step is to obtain the $L^1$ convergence.

Lemma 3.20.
For all open bounded sets $U \subset M$ and $w \in C_c^\infty(U)$,

$$\|S_\tau^* w - w\|_{L^1} \lesssim \tau\, |w|_M.$$

Proof. For any finite family of $L^1 \to L^1$ bounded linear operators $Z_1, Z_2, \dots, Z_k$ we have

$$\|Z_1 Z_2 \cdots Z_k w - w\|_{L^1} \le \sum_{j=1}^k \left\| (Z_1 \cdots Z_{j-1})(Z_j w - w) \right\|_{L^1} \lesssim \sum_{j=1}^k \|Z_j w - w\|_{L^1}.$$

The result then follows from Lemma 3.18.

The next lemma is crucial for characterizing the regularization properties of $(S^J_\tau)^*$ in $L^1$. This is the adjoint analogue of [Lemma 5.2; [29]], which is a little more technical.

Lemma 3.21.
For all open bounded sets $U \subset M$ and $w \in C_c^\infty(U)$, there holds for any multi-indices $J \le I$,

$$\left\| (\tau^{1/\sigma}\, Y S^J_\tau)^* w \right\|_{L^1} \lesssim \tau\, \|w\|_{\Lambda^\sigma}, \tag{3.23}$$

$$\left\| (\tau^{m(I)} X_I S^J_\tau)^* w \right\|_{L^1} \lesssim \sum_{I' \in \mathcal{J} \,:\, I' \ge J} \sup_{|t| \le \tau^{m(I')}} \left\| e^{-tX_{I'}^*} w - w \right\|_{L^1} + \tau\, \|w\|_{\Lambda^\sigma}. \tag{3.24}$$

Before we continue, we define for two vector fields $X$ and $Y$

$$e^{t\,\mathrm{ad}(X)} Y := e^{tX}\, Y\, e^{-tX},$$

which is the adjoint representation of $e^{tX}$ on the Lie algebra of vector fields. It will be useful to expand $e^{t\,\mathrm{ad}(X)} Y$ in a Taylor expansion.

Lemma 3.22. For two smooth vector fields
X, Y , t ∈ [ − , and N ∈ N , there exists a smooth boundedvector field Y N,t locally uniformly bounded in C k ( ∀ k ) on t ∈ [ − , , ( e t ad ( X ) Y ) = X ≤ k The adjoint representation gives the following commutator representation for the smoothing operators Y ϕ τX w = ˆ R (cid:16) e tX ( e − t ad ( X ) Y ) w (cid:17) ϕ τ ( t )d t. Lemma 3.22 then gives the following formula for Y ϕ τX (used also in [29]). Lemma 3.23. For each ϕ ∈ C ∞ c (( − , , k ∈ N and vector field X define ( ˆ ϕ kτX ) g := ˆ R ( e tX g ) ˆ ϕ kτ ( t )d t, where ˆ ϕ k ( t ) := t k k ! ϕ ( t ) ∈ C ∞ c (( − , . (3.25) For two smooth vector fields X, Y , τ ∈ (0 , and N ∈ N , the following holds Y ϕ τX = X ≤ k Proof of (3.23). We proceed by induction. For J = ∞ the result follows from (3.19).Hence, we next assume that the result holds for all J ′ with J ′ > J and prove that it also holds for J . Webegin with the decomposition ( S Jτ ) ∗ = ( S J ′ τ ) ∗ ( ϕ τ m ( J ) X J ) ∗ . By a trivial application of Lemma 3.23 with N = 1 and X I = Y , there exists a smooth bounded vector field Y ,t such that (recall Definition (3.26)), ( τ /σ Y S Jτ ) ∗ = ( τ /σ Y S J ′ τ ) ∗ ( ϕ τ m ( J ) X J ) ∗ + τ m ( J )+1 /σ ( S J ′ τ ) ∗ ( R τ m ( J ) X ) ∗ . By the induction hypothesis and Lemma 3.14 we have for the first term above k ( τ /σ Y S J ′ τ ) ∗ ( ϕ τ m ( J ) X J ) ∗ g k L . τ k ( ϕ τ m ( J ) X J ) ∗ g k Λ σ . τ k g k Λ σ . A similar estimate holds for the second term using Minkowski’s inequality k ( τ /σ S J ′ τ ) ∗ ( R τ m ( J ) X J ) ∗ g k L ≤ ˆ R k ( τ /σ Y ,t S J ′ τ ) ∗ e − tX ∗ J g k L ˆ ϕ τ m ( J ) ( t )d t . τ k e − tX ∗ J g k Λ σ . τ k g k Λ σ and the estimate (3.23) now follows. 31 roof of (3.24). First we note that if I = J then we have for J ′ the smallest element such that J ′ > J by Lemma 3.16 and the L boundedness of ( S Jτ ) ∗ k ( X J S Jτ ) ∗ g k L = k ( S J ′ τ ) ∗ ( X J ϕ τ m ( J ) X J ) ∗ g k L . sup | t |≤ τ m ( J ) k e − tX ∗ J g − g k L . When I > J , we proceed by induction. 
First of all, the result follows by definition of (3.16) if J = ∞ .Hence, we next assume that the result holds for all J ′ with J ′ > J and prove that it also holds for J thelargest element less than J ′ . Again writing ( S Jτ ) ∗ = ( S J ′ τ )( ϕ τ m ( J ) X J ) ∗ and using Lemma 3.23 we obtain, ∀ N ≥ , ( τ m ( I ) X I S Jτ ) ∗ = X ≤ k For all J ∈ J , U ⊂ M open and bounded, there holds ∀ w ∈ C ∞ c ( U ) and τ ∈ ( − , , (cid:12)(cid:12)(cid:12)(cid:12) [ τ X , S Jτ ] ∗ w (cid:12)(cid:12)(cid:12)(cid:12) L . X I ∈J : I ≥ J sup | t |≤ τ m ( I ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) e − tX ∗ I w − w (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L + τ || w || Λ s . Proof. As in the proof of (3.24) above, we proceed by induction. Firstly, the estimate holds for J = ∞ due to the commutator estimate Lemma 3.11. As above, assume the result holds for all J ′ with J ′ > J andprove that it also holds for J , writing ( S Jt ) ∗ = ( S J ′ t ) ∗ ( ϕ t m ( J ) X J ) ∗ . [ τ X , ( S Jτ )] ∗ = [ τ X , S J ′ τ ] ∗ ( ϕ τ m ( J ) X J ) ∗ + ( S J ′ τ ) ∗ [ τ X , ϕ τ m ( J ) X J ] ∗ . (3.27)By the inductive hypothesis and Lemmas 3.19 and Lemmas 3.13 we have (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) [ τ X , S J ′ τ ] ∗ ( ϕ τ m ( J ) X J ) ∗ w (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L . X I ′ ≥ J ′ sup | t |≤ τ m ( I ′ ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) e − tX ∗ I ′ ( ϕ τ m ( J ) X J ) ∗ w − ( ϕ τ m ( J ) X J ) ∗ w (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L + τ (cid:12)(cid:12)(cid:12)(cid:12) ϕ τ m ( J ) X J w (cid:12)(cid:12)(cid:12)(cid:12) Λ σ . X I ′ ≥ J sup | t |≤ τ m ( I ′ ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) e − tX ∗ I ′ w − w (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L + τ k w k Λ σ . 
This term in (3.27) is consistent with the desired result.For the second term in (3.27), by Lemma 3.23 we have (note the cancellation which eliminates the k = 0 term) ( S J ′ τ ) ∗ [ τ X , ϕ τ m ( J ) X J ] ∗ = X N m ( J ) ≥ σ .We omit the repetitive details for the sake of brevity.Finally, we prove the required L ∞ regularization estimate. Lemma 3.25. Let U ⊂ M be open and bounded. Let I be any multi-index and J ≤ I . Then, ∀ w ∈ C ∞ c ( U ) , (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) t m ( I ) X I S Jt w (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L ∞ . U || w || L ∞ Proof. This is done by induction as in previous lemmas. The case J = ∞ follows from (3.17). Assume thatthe result holds for all J ′ with J ′ > J and write S Jt = ϕ t m ( J ) X J S J ′ t . Case 1: I > J . We apply the Taylor expansion of Lemma 3.23 (recalling definitions (3.25) and (3.26)) τ m ( I ) X I S Jτ = X ≤ k Projective spanning In this section, we discuss tools for verifying the parabolic H ¨ormander conditions on the projective process ( w t ) on the sphere bundle S M in general (Section 4.1) as well as specific criteria for the class of Euler-likemodels introduced in Section 1.2 (Section 4.2). This projective spanning condition is proved for the Lorenz96 model in Section 4.3 We work in this section at the same level of generality as we did in Section 2. As before, let M be a smooth,connected, orientable Riemannian manifold without boundary. 
Given a vector field X on M we denote theassociated “lifted” vector field ˜ X on the sphere bundle S M by ˜ X ( x, v ) := (cid:18) X ( x ) V ∇ X ( x ) ( v ) (cid:19) where each of the components in the block vector above are with respect to the orthogonal splitting T w S M = T x M ⊕ T v S x M , V ∇ X ( x ) ( v ) = Π v ∇ X ( x ) v is the projective vector field on S x M and ∇ X ( x ) denotes thetotal covariant derivative with respect to the Levi-Civita connections, viewed as a linear endomorphism on T x M .Here we give general necessary and sufficient conditions on a collection of vector fields { X k } rk =0 sothat their lifts { ˜ X k } rk =0 satisfy the parabolic H ¨ormander condition on S M . Definition 4.1. Let { X k } rk =0 be a collection of smooth vector fields on a connected manifold M , and let X k ⊆ X ( M ) be as in Definition 1.7. Define the parabolic Lie algebra generated by { X k } rk =0 to be theLie-algebra of vector fields spanned by these collections Lie( X ; X , . . . , X r ) := ( Z ∈ X ( M ) : Z = N X i =1 c i Z i , c i ∈ R , { Z i } ⊂ [ k ∈ N X k ) . It follows that { X k } rk =0 satisfies the parabolic H¨ormander condition if for each x ∈ M { X ( x ) : X ∈ Lie( X ; X , . . . , X r ) } = T x M, and similarly for { ˜ X k } rk =0 .Since the vector fields { X k } rk =0 may not be volume preserving, it is convenient to define for each X ∈ X ( M ) the following traceless linear operator on T x M : M X ( x ) := ∇ X ( x ) − n div X ( x ) I , which we view as an element of the Lie algebra sl ( T x M ) (the space of traceless linear operators on T x M ).Since the projective vector field V ∇ X ( v ) includes a projection orthogonal to v , we always have V ∇ X ≡ V M X .An important role will be played by the following Lie sub-algebra of sl ( T x M ) : m x ( X ; X , . . . , X r ) := { M X ( x ) : X ∈ Lie( X ; X , . . . , X r ) , X ( x ) = 0 } . (4.1)It is a simple matter to check that m x ( X ; X , . . . 
, X r ) is indeed a Lie sub-algebra of sl ( T x M ) with respectto the matrix commutator [ A, B ] = AB − BA. It was observed by Baxendale in [11] that the parabolicH ¨ormander condition on lifted vector fields { ˜ X k } rk =0 on S M can be related to properties of the matrix Liealgebra m x ( X ; X , . . . , X r ) . However, as there does not seem to be a proof of this fact anywhere in theliterature, a self-contained proof is included in the Appendix B.34 roposition 4.2. Let { X k } rk =0 be a collection of smooth vector fields on M . Their lifts { ˜ X k } rk =0 satisfy theparabolic H¨ormander condition on S M if and only if { X k } rk =0 satisfy the parabolic H¨ormander conditionon M and for each ( x, v ) ∈ S M we have { V A ( v ) : A ∈ m x ( X ; X , . . . , X r ) } = T v S x M. (4.2) Remark 4.3. The theory of Lie algebra actions on manifolds, the condition (4.2) means that m x acts tran-sitively on S x M through the Lie algebra action A V A . It is straight forward to show that the Lie algebra so ( T x M ) of skew-symmetric linear operators (depending on the metric) acts transitively on S x M and there-fore a sufficient condition for transitive action of m x ( X ; X , . . . , X r ) on S x M is so ( T x M ) ⊆ m x ( X ; X , . . . , X r ) . R n In this section we introduce useful sufficient conditions for verifying the parabolic H ¨ormander condition forthe projective process arising from a certain class of SDE on M = R n of the form d x ǫt = ( F ( x ǫt ) + ǫAx ǫt ) d t + √ ǫ r X k =1 X k d W kt , (4.3)where { X k } rk =1 are assumed for simplicity to be constant vector fields while the matrix A ∈ R n × n isnegative definite, contributing volume dissipation to the overall system, and F ( x ) = B ( x, x ) is bilinear asin (1.11).As previously, ǫ > denotes a parameter that should be thought of as small. Our goal will be to verifythat ˜ X ǫ , ˜ X , . . . 
, ˜ X r satisfies the parabolic H ¨ormander condition uniformly in ǫ on bounded sets in the senseof Definition 3.1,On R n , the sphere bundle trivializes to SR n ≃ R n × S n − and the lifts { ˜ X ǫ , ˜ X , . . . , ˜ X r } to R n × S n − are given by ˜ X ǫ ( x, v ) = (cid:18) X ǫ V ∇ X ǫ ( v ) (cid:19) , ˜ X k = (cid:18) X k (cid:19) , where k = 1 , . . . r and X ǫ ( x ) = F ( x ) + ǫAx .By Proposition 4.2, we know that verifying the parabolic H ¨ormander condition for { ˜ X ǫ , ˜ X , . . . , ˜ X r } on R n × S n − is equivalent to checking that { X ǫ , X , . . . , X r } satisfies the parabolic H ¨ormander conditionon R n and that the Lie algebra m x ( X ǫ ; X , . . . , X r ) defined by (4.1) satisfies { V A ( v ) : A ∈ m x ( X ǫ ; X , . . . , X r ) } = T v S n − for each ( x, v ) ∈ R n × S n − . In general it is a challenge to directly work with m x ( X ǫ ; X , . . . , X r ) as it isnot a simple task to classify all vector fields in Lie( X ǫ ; X , . . . X r ) that vanish at each x ∈ R n . However, in R n it is often the case that the parabolic Lie algebra generated by { X k } rk =0 contains a spanning collectionof constant vector fields. In this case m x can be described more explicitly. Lemma 4.4. Let { X k } rk =0 ⊆ X ( R n ) and suppose that Lie( X ; X , . . . , X r ) contains the constant vectorfields { ∂ x k } nk =1 . Then m x ( X ; X , . . . , X r ) = { M X ( x ) : X ∈ Lie( X ; X , . . . , X r ) } . Proof. Our hypothesis { ∂ x k } nk =1 ⊆ Lie( X ; X , . . . , X r ) implies that for each X ∈ Lie( X ; X , . . . , X r ) and x ∈ R n , the vector field ˆ X = X − X ( x ) also belongs to Lie( X ; X , . . . , X r ) and satisfies ˆ X ( x ) = 0 .Since ∇ ˆ X = ∇ X , we have M ˆ X ( x ) = M X ( x ) , hence M X ( x ) ∈ m x .35urthermore, the assumptions that F ( x ) = B ( x, x ) is bilinear and Ax is linear allow us to deduce aconvenient sufficient condition for verifying (4.2) on the vector fields X ǫ , X , . . . , X r uniformly in ǫ . 
Tomake this precise, we define for each k = 1 , . . . , n the linear operator H k := ∇ [ ∂ x k , X ǫ ] = ∂ x k ∇ F ∈ sl ( R n ) . Note that H k is independent of both x ∈ M and ǫ . Below, Lie( H , . . . , H n ) denotes the matrix Liesubalgebra of sl ( R n ) generated by (cid:8) H , · · · , H n (cid:9) . Lemma 4.5. Assume (i) { ∂ x k } nk =1 ⊆ Lie( X ǫ ; X , . . . , X r ) and (ii) that Lie( H , . . . , H n ) = sl ( R n ) . (4.4) Then, ˜ X ǫ , ˜ X , . . . , ˜ X r satisfies the uniform parabolic H¨ormander condition in the sense of Definition 3.1 as ǫ is varied in (0 , . Remark 4.6. By Remark 4.3, we can replace (4.4) with the weaker condition so ( R n ) ⊆ Lie( H , . . . , H n ) . Remark 4.7. Let us comment briefly on how one might verify (4.4). Since sl ( R n ) is n − dimensional,it is clear that one must use commutators that go several generations deep if one has any hope of generating sl ( R n ) . However, it can simplify things to instead look to build a suitable generating set for sl ( R n ) out ofbrackets of H i ’s. A particularly useful generating set for sl ( R n ) is the collection of elementary matrices E , , E , , . . . , E n, , where E i,j is the matrix with in ( i, j ) entry and elsewhere. For these, we have the commutation relation [ E i,j , E k,ℓ ] = E i,ℓ δ j,k − E k,j δ ℓ,i , so that, e.g., [ E , , E , ] = E , and [ E , , E , ] = E , − E , . Continuing like this allows to generate the off-diagonal matrices { E i,j } i = j as well as the directions E , − E , , . . . E n,n − E n − ,n − needed to complete a basis for sl ( R n ) . Therefore, (cid:8) E , , E , , . . . , E n, (cid:9) gen-erates sl ( R n ) . Now we turn to verifying the uniform projective spanning for stochastically forced Lorentz 96 (1.1). Recallthe stochastic Lorenz 96 is an SDE on R J defined by d u ℓ = ( u ℓ +1 − u ℓ − ) u ℓ − d t − ǫu ℓ d t + √ ǫq ℓ d W ℓt . (4.5)Here, we assume a periodic ensemble of coupled oscillators, i.e., u i + kJ := u i . 
Naturally we can write (4.5) in the general form (4.3), by defining
\[
F_\ell(u) = u_{\ell+1}u_{\ell-1} - u_{\ell-2}u_{\ell-1}, \qquad (Au)_\ell = -u_\ell, \qquad X_\ell(u) = q_\ell\,\partial_{u_\ell},
\]
where $F(u)$ satisfies assumption 1.11.

First, we verify uniform hypoellipticity of $(u_t)$.

Lemma 4.8. Let $J < \infty$ be arbitrary and suppose that at least $q_1, q_2 \neq 0$. Then $\mathrm{Lie}(F, q_1\partial_{u_1}, q_2\partial_{u_2})$ contains $\{\partial_{u_j}\}_{j=1}^J$ and spans $\mathbb{R}^J$ uniformly in $\epsilon$ on compact sets.

Proof. Since the nonlinearity is bilinear, we readily observe that
\[
[\partial_{u_2}, [\partial_{u_1}, F]] = -\partial_{u_3}.
\]
Iterating this observation generates all brackets of the form $[\partial_{u_{i+1}}, [\partial_{u_i}, F]] = -\partial_{u_{i+2}}$.

In order to prove uniform projective spanning we first observe that
\[
(\nabla F(u))_{\ell m} = DF_\ell(u)_m = u_{\ell-1}\delta_{m=\ell+1} + u_{\ell+1}\delta_{m=\ell-1} - u_{\ell-1}\delta_{m=\ell-2} - u_{\ell-2}\delta_{m=\ell-1},
\]
hence it follows that for each $k \in \{1, \ldots, J\}$ we have
\[
H^k = \partial_{u_k} DF(u) = E_{k+1,k+2} + E_{k-1,k-2} - E_{k+1,k-1} - E_{k+2,k+1}.
\]
The following lemma now implies projective spanning for Lorenz 96 when combined with Lemma 4.5.

Lemma 4.9. The following holds:
\[
\mathrm{Lie}(H^1, \ldots, H^J) = \mathfrak{sl}(\mathbb{R}^J).
\]

Proof. Throughout, we regard the indices in $E_{i,j}$ modulo $J$, so that $E_{i+kJ,\,j+\ell J} = E_{i,j}$ for all $i, j, k, \ell$. Let $\mathfrak{g}$ denote the smallest Lie algebra containing $\{H^k\}$. To start, let $1 \le k \le J$. We compute
\[
[H^k, H^{k+4}] = E_{k+3,k+1},
\]
hence $E_{k,k-2} \in \mathfrak{g}$ for all $1 \le k \le J$. Continuing,
\[
[H^k, E_{k-2,k-4}] = E_{k-1,k-4},
\]
hence $E_{k,k-3} \in \mathfrak{g}$ for all $k$. Inductively, assuming $E_{k,k-\ell} \in \mathfrak{g}$, we have that
\[
[H^k, E_{k-2,\,k-(\ell+2)}] = E_{k-1,\,k-1-(\ell+1)}, \tag{4.6}
\]
hence $E_{k,k-(\ell+1)} \in \mathfrak{g}$ for all $k$. The induction step in (4.6) continues to hold as long as $k-(\ell+2)$ is disjoint from $\{k-1, k+1, k+2\}$ modulo $J$, which is assured so long as $\ell < J-4$.

Fix $\ell \in \{2, 3, \cdots, J-4\}$ so that $J - \ell$ is co-prime to $J$. In particular, $\{-\ell, -2\ell, \cdots, -(J-1)\ell\}$ coincides with the complete set of residue classes $\{1, 2, \cdots, J-1\}$ in $\mathbb{Z}/J\mathbb{Z}$.
Since
\[
\left\{ E_{0,-\ell},\; E_{-\ell,-2\ell},\; \cdots,\; E_{-(J-2)\ell,\,-(J-1)\ell},\; E_{-(J-1)\ell,\,0} \right\} \subset \mathfrak{g}
\]
is really just a re-ordering of the generating set identified in Remark 4.7, we conclude $\mathfrak{g} = \mathfrak{sl}(\mathbb{R}^J)$.

The goal of this section is to complete the proof of Proposition 1.15 described in Section 1. Denote by $\Phi^t : \mathbb{R}^n \to \mathbb{R}^n$, $t \ge 0$, the (deterministic) flow of diffeomorphisms solving the Euler-like initial value problem
\[
\dot x_t = B(x_t, x_t), \qquad x_0 = x \in \mathbb{R}^n, \tag{5.1}
\]
where $B : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is a bilinear mapping for which $x \cdot B(x,x) = 0$ and $\operatorname{div} B(x,x) = 0$. As in Section 1, let $\hat\Phi^t : S\mathbb{R}^n \to S\mathbb{R}^n$ be the associated projective flow defined by
\[
\hat\Phi^t(x,v) = \left( \Phi^t(x),\; \frac{D_x\Phi^t v}{|D_x\Phi^t v|} \right).
\]
To start, in Section 5.1 we collect preliminaries regarding the Euler-like class (5.1). In Section 5.2 we then recall some general linear cocycle theory ruling out the existence of absolutely continuous invariant probabilities for generalized projective actions. Finally, Section 5.3 completes the proof of Proposition 1.15.

5.1 Preliminaries for Euler-like systems

For our use, we record below the following simple consequences of the special Euler-like structure imposed by (5.1). Some notation: for $E > 0$ let us write $S_E := \{x \in \mathbb{R}^d : |x|^2 = E\}$ for the "energy shells" preserved by the flow $\Phi^t$, i.e., $\Phi^t(S_E) = S_E$ for all $t \ge 0$, $E > 0$. Write $E(x) = |x|^2$.

Lemma 5.1. Let $x \in \mathbb{R}^d$. Then the following identity holds:
\[
D\Phi^t(x)\,x = \Phi^t(x) + t\,B(\Phi^t(x), \Phi^t(x)). \tag{5.2}
\]
Moreover, for each $x \in \mathbb{R}^d$ and $t \ge 0$, we have that
\[
|D\Phi^t(x)| \ge \frac{t\,|B(\Phi^t(x), \Phi^t(x))|}{|x|}. \tag{5.3}
\]

Proof of Lemma. For a given $\alpha > 0$, note that the rescaled flow $\alpha\Phi^{\alpha t}(x)$ also solves (5.1) with initial data $\alpha x$. Therefore by uniqueness, we have
\[
\Phi^t(\alpha x) = \alpha\,\Phi^{\alpha t}(x). \tag{5.4}
\]
Taking the derivative with respect to $\alpha$ on both sides of (5.4) yields
\[
D\Phi^t(\alpha x)\,x = \Phi^{\alpha t}(x) + \alpha t\,B(\Phi^{\alpha t}(x), \Phi^{\alpha t}(x)).
\]
Setting $\alpha = 1$ gives (5.2). Inequality (5.3) follows from (5.2) and the fact that $\Phi^t(x) \cdot B(\Phi^t(x), \Phi^t(x)) = 0$ for all $x$, by assumption: the two terms on the right-hand side of (5.2) are orthogonal, so $|D\Phi^t(x)|\,|x| \ge |D\Phi^t(x)\,x| \ge t\,|B(\Phi^t(x), \Phi^t(x))|$.

The following emphasizes the shearing between energy surfaces used in the sequel:

In this section, we will state everything in the following abstract linear cocycle setting. Throughout, $T : (X, \mathcal{B}, m) \circlearrowleft$ is a (discrete-time) continuous transformation of a compact metric space $X$, with $\mathcal{B}$ the Borel $\sigma$-algebra. Let $A : (X, \mathcal{B}) \to SL_d(\mathbb{R})$, $x \mapsto A_x$, be a measurable mapping. (Here, $SL_d(\mathbb{R})$ is the group of $d \times d$ real matrices of determinant 1.) This generates the cocycle of linear operators $\mathcal{A} : X \times \mathbb{Z}_{\ge 0} \to SL_d(\mathbb{R})$ defined by
\[
\mathcal{A}(n, x) = A^n_x := A_{T^{n-1}x} A_{T^{n-2}x} \cdots A_{Tx} A_x.
\]
Note that $\mathcal{A}$ satisfies the cocycle identity $A^{m+n}_x = A^m_{T^n x} A^n_x$ for all $m, n \ge 0$, $x \in X$. (When $T$ is a smooth mapping of a manifold and $A^n_x := D_x(T^n)$ is the so-called derivative cocycle, the cocycle identity is merely the chain rule for $T^n$.) Associated to $T, A$ is the projective action $\hat T : X \times S^{d-1} \circlearrowleft$ defined by
\[
(x, v) \mapsto \left( Tx,\; \frac{A_x v}{|A_x v|} \right), \qquad x \in X,\ v \in S^{d-1},
\]
which we regard as a dynamical system on $X \times S^{d-1}$ in its own right.

Let $\hat m$ be any $\hat T$-invariant measure on $X \times S^{d-1}$ projecting to $m$ (i.e., $\hat m(K \times S^{d-1}) = m(K)$ for all measurable $K \subset X$), and consider its disintegration
\[
\mathrm{d}\hat m(x, v) = \mathrm{d}\hat m_x(v)\,\mathrm{d}m(x).
\]
In this context, it is well-known [22, 64] that disintegrations $(\hat m_x)_{x \in X}$ exist and are essentially unique (up to $m$-measure zero modifications) and that $x \mapsto \hat m_x$ is weak-* measurably varying. Note that invariance of $\hat m$ implies that
\[
(A_x)_* \hat m_x = \hat m_{Tx} \quad \text{for } m\text{-a.e. } x \in X,
\]
where for a $d \times d$ matrix $A$ we write $A_*$ for the action of $A$ on probability measures on $S^{d-1}$.

The following result (more-or-less Theorem 3.23 of [6], up to a technical issue; see below) involves the rigidity of absolute continuity of the disintegration measures $\hat m_x$ with respect to $\mathrm{Leb}_{S^{d-1}}$.

Theorem 5.2.
Assume that $\hat m_x \ll \mathrm{Leb}_{S^{d-1}}$ for $m$-almost every $x \in X$. Then, there exists a measurable family of inner products $X \ni x \mapsto g_x(\cdot, \cdot)$ on $\mathbb{R}^d$ and a $T$-invariant set $\Gamma \subset X$ of full $m$-measure such that for all $x \in \Gamma$ and $v, w \in \mathbb{R}^d$, we have that
\[
g_{Tx}(A_x v, A_x w) = g_x(v, w).
\]
That is, $A_x : (\mathbb{R}^d, g_x) \to (\mathbb{R}^d, g_{Tx})$ is an isometry.

This is slightly different from the form in Theorem 3.23 of [6]: there, it is supposed that $\hat m_x \sim \mathrm{Leb}_{S^{d-1}}$, whereas for our purposes we need the version with "$\ll$". For this reason, as well as for the sake of completeness, we sketch the proof of Theorem 5.2 here.

Proof sketch. To start, let us assume for now that $T : (X, \mathcal{B}, m) \circlearrowleft$ is ergodic (note that we do not assume $\hat m$ is ergodic). We require the following lemma.

Lemma 5.3 (Corollary 3.7 in [6]; Lemma 6.2 in [27]). Assume $(X, \mathcal{B}, m, T)$ is ergodic. Then, there is a full $m$-measure set of $x_0 \in X$ with the following property: there exists a measurable mapping $G : X \to SL_d(\mathbb{R})$, depending on the choice of $x_0$, such that
\[
G(x)_* \hat m_{x_0} = \hat m_x
\]
for $m$-almost every $x \in X$.

This version is slightly different from those appearing in [6, 27], so we briefly recall the proof below.

Proof sketch of Lemma. Let $\mathcal{P}(S^{d-1})$ denote the space of probability measures on $S^{d-1}$ with the weak$^*$ topology. Consider the quotient $\mathcal{P}(S^{d-1})/SL_d(\mathbb{R})$, i.e., for $\xi, \eta \in \mathcal{P}(S^{d-1})$ we set $\xi \sim \eta$ iff there exists $B \in SL_d(\mathbb{R})$ so that $B_*\xi = \eta$. Writing $[\eta]$ for the equivalence class of $\eta \in \mathcal{P}(S^{d-1})$, note that $[\hat m_x] = [\hat m_{T^k x}]$ for all $k$, i.e., $x \mapsto [\hat m_x]$ is constant along orbits. By Corollary 3.2.12 in [76], the Borel $\sigma$-algebra on the quotient space $\mathcal{P}(S^{d-1})/SL_d(\mathbb{R})$ is countably generated. Using this along with the fact that $T : (X, m) \circlearrowleft$ is ergodic, it follows from a standard argument that $[\hat m_x]$ is constant $m$-almost surely. In particular, for $m$-a.e. $x_0, x \in X$, the measures $\hat m_{x_0}$ and $\hat m_x$ are related by the application of a matrix in $SL_d(\mathbb{R})$.
It is now straightforward to construct a measurable selection $G : X \to SL_d(\mathbb{R})$ as above.

Fix $x_0$ so that $\hat m_{x_0} \ll \mathrm{Leb}_{S^{d-1}}$ and let $G$ be as in Lemma 5.3. Observe that for any $n \ge 0$ and $m$-a.e. $x \in X$ we have that
\[
G(T^n x)^{-1} A^n_x\, G(x) \in H_{x_0}, \quad \text{where} \quad H_{x_0} := \{ B \in SL_d(\mathbb{R}) : B_* \hat m_{x_0} = \hat m_{x_0} \}.
\]
Observe that $H_{x_0}$ is a subgroup of $SL_d(\mathbb{R})$, which we claim to be compact. If not, then a lemma of Furstenberg (see, e.g., Claim 4.8 in [15]) would imply the existence of proper subspaces $V_1, V_2 \subset \mathbb{R}^d$ and a sequence $\{B_n\} \subset H_{x_0}$ so that $\mathrm{dist}(B_n v, V_2) \to 0$ for all $v \notin V_1$, which would contradict $\hat m_{x_0} \ll \mathrm{Leb}_{S^{d-1}}$. Since $H_{x_0}$ is compact, there exists an inner product $\langle \cdot, \cdot \rangle$ on $\mathbb{R}^d$ with respect to which all members of $H_{x_0}$ are isometries (Lemma 4.6 in [15]). The proof is complete on defining $g_x$ through
\[
g_x(v, w) = \langle G(x)^{-1} v,\; G(x)^{-1} w \rangle. \tag{5.5}
\]
Indeed, writing $h = G(Tx)^{-1} A_x G(x) \in H_{x_0}$, we have $g_{Tx}(A_x v, A_x w) = \langle h\,G(x)^{-1} v,\; h\,G(x)^{-1} w \rangle = g_x(v, w)$.

To handle the case when $m$ is not ergodic, we use the ergodic decomposition [70]
\[
m(\cdot) = \int_{\xi \in \mathcal{E}_T(X)} \xi(\cdot)\,\mathrm{d}\tau_m(\xi),
\]
where $\mathcal{E}_T(X)$ is the space of $T$-ergodic measures on $X$ and $\tau_m$ is a Borel measure (w.r.t. the weak$^*$ topology) on $\mathcal{E}_T(X)$. For each component $\xi$, we define $\hat\xi$ through the formula $\mathrm{d}\hat\xi(x, v) = \mathrm{d}\hat m_x(v)\,\mathrm{d}\xi(x)$, noting that $\hat m_x \ll \mathrm{Leb}_{S^{d-1}}$ for $\xi$-a.e. $x \in X$ and $\tau_m$-a.e. $\xi \in \mathcal{E}_T(X)$. The proof now goes through the same as before, the only difference being that the measurable inner product (5.5) is defined along each $\xi \in \mathcal{E}_T(X)$ one at a time.

To start, let $\nu$ be $\hat\Phi^t$-invariant, projecting to an absolutely continuous measure $\mu$ on $\mathbb{R}^n$, and assume that $\nu = \nu_{ac} + \nu_\perp$ where $\nu_{ac} \ll \mathrm{Leb}_{S\mathbb{R}^n}$ is not identically zero (our contradiction hypothesis), while $\nu_\perp$ is singular. Since $\hat\Phi^t$ sends absolutely continuous measures to absolutely continuous measures and singular measures to singular measures, it follows that $\nu_{ac}$ is $\hat\Phi^t$-invariant. Since $\nu_{ac} \le \nu$, the measure $\mu_{ac}(K) := \nu_{ac}(K \times S^{n-1})$ satisfies $\mu_{ac} \ll \mu \ll \mathrm{Leb}_{\mathbb{R}^n}$ and is likewise $\Phi^t$-invariant.
On replacing $\nu$ with the normalization of $\nu_{ac}$, going forward we may assume without loss that $\nu \ll \mathrm{Leb}_{S\mathbb{R}^n}$. Finally, since the energy shells $S_E = \{|x|^2 = E\}$ are invariant, we may replace $\nu$ with the normalization of its restriction to $B(0,R) \times S^{d-1}$ for some large, fixed $R > 0$.

Continuing, let $\mathrm{d}\nu(x, v) = \mathrm{d}\nu_x(v)\,\mathrm{d}\mu(x)$ denote the disintegration measures of $\nu$ and note that $\nu_x \ll \mathrm{Leb}_{S^{n-1}}$ for $\mu$-a.e. $x$. By Theorem 5.2, there exists a measurable family of inner products $g_x$, $x \in \mathbb{R}^n$, so that
\[
D\Phi^1(x) : (\mathbb{R}^n, g_x) \to (\mathbb{R}^n, g_{\Phi^1 x}) \tag{5.6}
\]
is an isometry for $\mu$-a.e. $x$. By a standard procedure, we may assume that (5.6) holds for $x \in \Gamma$, where $\Gamma \subset \mathbb{R}^n$ satisfies $\mu(\Gamma) = 1$ and $\Phi^1(\Gamma) = \Gamma$.

For $L > 0$, define
\[
\Gamma_L = \left\{ x \in \Gamma : L^{-1} \le \frac{\sqrt{g_x(v, v)}}{|v|} \le L \text{ for all } v \in S^{n-1} \right\} \cap \left\{ x \in \Gamma : |B(x,x)| \ge L^{-1} \right\},
\]
and note that if $x, \Phi^n x \in \Gamma_L$ for some $n \ge 1$, then $|D_x\Phi^n| \le L^2$ must hold by (5.6). Moreover, we have that $\mu(\Gamma_L) \nearrow \mu(\Gamma) = 1$ as $L \to \infty$. Observe that this relies on the assumption that $B$ is not identically $0$, hence $|B(x,x)| > 0$ Lebesgue-a.e. (here, we use the standard fact that $\{B(x,x) = 0\}$ is a proper variety in $\mathbb{R}^n$, hence must have zero volume).

Fix $L$ such that $\mu(\Gamma_L) \ge 1/2 > 0$. By the Poincaré Recurrence Theorem, $\mu$-a.e. $x \in \Gamma_L$ visits $\Gamma_L$ infinitely many times. Fix such an $x \in \Gamma_L \setminus \{0\}$ and let $n_1 < n_2 < n_3 < \cdots$, $\lim_{\ell \to \infty} n_\ell = \infty$, be such that $\Phi^{n_\ell}(x) \in \Gamma_L$ for all $n_\ell$, hence $|D\Phi^{n_\ell}(x)| \le L^2$ for all such $n_\ell$. On the other hand, (5.3) implies
\[
|D\Phi^{n_\ell}(x)| \ge \frac{n_\ell}{L\,|x|} \to \infty \quad \text{as } n_\ell \to \infty,
\]
a contradiction. This completes the proof of Proposition 1.15.

A Qualitative properties of the projective stationary measure

In this section we record basic properties of the SDE (1.10).

Theorem A.1. Suppose that $\{\tilde X_0, \tilde X_1, \ldots, \tilde X_r\}$ satisfies the uniform parabolic Hörmander condition on $S\mathbb{R}^n$ as in Definition 3.1. For all $\epsilon > 0$, the SDE (1.10) satisfies Assumptions 1 and 2.
Moreover, the stationary measure $f^\epsilon$ of the $(w_t)$ process has a smooth density with respect to Lebesgue measure, $f^\epsilon \in L^1 \cap L^2 \cap C^\infty$ with $f^\epsilon \log f^\epsilon \in L^1$, the marginal $\int_{S^{n-1}} f^\epsilon(x, v)\,\mathrm{d}v = \rho^\epsilon(x)$, and there exist $C, \gamma > 0$ such that for all $\epsilon \in (0,1)$,
\[
\int_{SM} f^\epsilon e^{\gamma |x|^2}\,\mathrm{d}q < C. \tag{A.1}
\]
Furthermore, the estimate in Assumption 1 (iii) holds for all $\epsilon > 0$.

Proof of Theorem A.1. Claims (i) and (ii) of Assumption 1 are standard or proved in [16]. The proof of Assumption 1 (iii) follows by providing suitable moment estimates on $\log|\det D\Phi^t|$ and $\log|D\Phi^t|$ using the SDE derived in the proof of Proposition 2.4. The results of Assumption 2 follow from similar methods (including (A.1), though see below).

However, we are not aware of any proof of $f^\epsilon \in L^2$ or $f^\epsilon \log f^\epsilon \in L^1$ in the literature and we therefore include the proof. For this we use some ideas that appear in [16]. As in [16], a convenient method to justify many formal calculations begins by first regularizing the problem by adding elliptic Brownian motions. Recall that the generator $\tilde{\mathcal{L}}$ for the projective process $(w_t)$ is given by
\[
\tilde{\mathcal{L}} = \tilde X_0 + \frac{1}{2} \sum_{k=1}^r X_k^2.
\]
We then regularize this by perturbing the generator,
\[
\tilde{\mathcal{L}}_\delta = \tilde{\mathcal{L}} + \frac{\delta}{2} \Delta_{x,v},
\]
where $\Delta_{x,v} = \Delta_x + \Delta_v$ with $\Delta_x$ the usual Laplacian on $\mathbb{R}^n$ and $\Delta_v$ the Laplace-Beltrami operator on $S^{n-1}$. This corresponds to perturbing the SDE (1.10) by a non-degenerate $\sqrt{\delta}$ Brownian motion on $S\mathbb{R}^n$. It is not hard to show that $\tilde{\mathcal{L}}_\delta$ satisfies a drift condition
\[
\tilde{\mathcal{L}}_\delta e^{\gamma |x|^2} \le -\alpha e^{\gamma |x|^2} + K
\]
for some $\alpha \in (0,1)$, $K \ge 0$ (uniformly in $\epsilon, \delta$). This gives rise to a globally defined Markov process $(w^\delta_t)$. Moreover, for a given initial density $f_0 \in C_c^\infty(S\mathbb{R}^d)$ with $\int f_0\,\mathrm{d}q = 1$ and $f_0 \ge 0$, such that $\mathrm{Law}(w^\delta_0) = f_0$, we denote $f_t = \mathrm{Law}(w^\delta_t)$, which solves the forward Kolmogorov equation $\partial_t f_t = \tilde{\mathcal{L}}^* f_t + \frac{1}{2}\delta\,\Delta_{x,v} f_t$.
(A.2)From the drift condition we have, ∀ γ sufficiently small, ∃ α ∈ (0 , such that (uniformly in ǫ, δ ), ˆ SR d f t e γ | x | d q . e − αt ˆ SR d f e γ | x | d q. (A.3)41et ¯ χ ∈ C ∞ c ( B (0 , with ≤ ¯ χ ≤ , and ¯ χ = 1 for x ≤ / and define χ ( x ) = χ ( x/ − χ ( x ) . Define χ j = χ (2 − j x ) , which defines the partition of unity χ + P ∞ j =0 χ j ( x ) . From energy estimates on (A.2)we have the following, dd t || f t || L + δ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ( − ∆ x,v ) / f t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L . || (1 + | x | ) f || L . || ¯ χf || L + ∞ X j =1 j || χ j f || L , (in order to justify such estimates one may apply smooth, v -independent radially symmetric cut-offs to thenonlinearity and pass to the limit). By the Gagliardo-Nirenberg-Sobolev inequality, ∃ θ ∈ (0 , (that suchan inequality holds is verified through the local coordinates and the fact that the estimates on the metric areuniform over the manifold), dd t || f t || L + δ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ( − ∆ x,v ) / f t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L . || ¯ χf t || L + ∞ X j =1 j || χ j f t || L . (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ( − ∆ x,v ) / ¯ χf t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) θL || ¯ χf t || − θL + ∞ X j =1 j (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ( − ∆ x,v ) / χ j f t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) θL || χ j f t || − θL . (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ( − ∆ x,v ) / f t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) θL + ||∇ x ¯ χf t || θL + ∞ X j =1 j (cid:18)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ( − ∆ x,v ) / f t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) θL + ||∇ x χ j f t || θL (cid:19) || χ j f t || − θL , and hence dd t || f t || L + δ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ( − ∆ x,v ) / f t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L . 
δ || f t || − θL (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ( − ∆ x,v ) / f t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) θL + ˆ SR d f t e γ | x | d q. Hence, from (A.3), there holds for some q > , t ˆ t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ( − ∆ x,v ) / f τ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L d τ . δ − q ˆ SR d f e γ | x | d q. Combined with the uniform drift condition, this allows to pass to the limit t → ∞ and conclude that theunique stationary measure, denoted below as f ǫ,δ is in H ( SR d ) ; we note that f ǫ,δ is a smooth solution ofthe Kolmogorov equation (cid:18) ˜ L ∗ + δ x,v (cid:19) f ǫ,δ = 0 . (A.4)Next, we obtain an L estimate that is uniform in δ in order to pass to the δ → limit. For this, we clearlyneed to depend on hypoelliptic regularity. Define the regularized H ¨ormander norm pair (see discussions in[4, 16, 29] for motivations), || w || H δ := || w || L + r X k =1 || X k w || L + δ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ( − ∆ x,v ) / w (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L || w || H ∗ δ := sup ϕ : || ϕ || H δ ≤ (cid:12)(cid:12)(cid:12)(cid:12) ˆ SR n ( ˜ X ϕ ) w d q (cid:12)(cid:12)(cid:12)(cid:12) The proof is similar to [Lemma 2.3; [16]] provided we have the following quantification of H ¨ormander’sinequality. 42 emma A.2 (Quantitative H ¨ormander inequality for the projective process) . Suppose that { ˜ X , ˜ X , ..., ˜ X r } satisfies the uniform parabolic H¨ormander condition on B (0 , × S n − as in Definition 3.1. There exists s > and q > , such that for any R ≥ , w ∈ C ∞ ( B R × S n − ) and δ ∈ [0 , there holds k w k H s . R q ( || w || H δ + || w || H ∗ δ ) , (A.5) where both s > and the implicit constant do not depend on ǫ , δ , or R , where here we denote in analogywith (3.2) (for dim SR d = m ), for ˜ w j = χ j w ◦ x j as defined therein), || w || H s = || w || L + X j ˆ | h | <δ ˆ R n × S n − | ˜ w j ( q + h ) − ˜ w j ( q ) | | h | m +2 s J j ( q )d q d h / Proof. 
The proof begins with a re-scaling as in [Lemma 3.2; [16]]. Define h ( x, v ) = w ( Rx, v ) which solvesa PDE of the following form for suitable vector fields N , V , Y , (denoting ∆ x,v as the Laplace-Beltramioperator, which note is invariant under this scaling) ǫδ ∆ x,v h + 12 r X j =1 ǫ ( ˜ X ∗ j ) h − N h + R − V ∗ h − ǫR Y ∗ h = 0 . where N ( x ) = B ( x, x ) , Y ( x ) = Ax and V ( x, v ) = Π v ∇ F ( x ) v , and their action on h is interpreted asa differential operator. We see that the proof here is more subtle than in the corresponding [Lemma 3.2;[16]] as R − V is required to span the directions in projective space. From Proposition 4.2, we see that thespanning in x and v can be considered essentially separately, first choosing brackets to span in x and thencorrecting by choosing suitable brackets in m x ( X ; X , . . . , X r ) to span in v . Using this structure we seethat given a vector field Z ∈ X ( SR n ) and q ∈ B (0 , × S n − , there exists p j < ... < p < p ≤ k (with k as in Definition 3.1) such that for q in a neighborhood of q , there are finitely many smooth coefficients c j and vectors Z j ∈ X k with Z ( q ) = X j R p j c j ( q ) Z j ( q ) , where if Z varies in a bounded set in C m , then { c j } j varies in a similarly bounded set as well. A carefulreading of [29] shows that this introduces powers of R matching the powers of t into all of the estimates in[Sections 4 and 5; [29]], the maximal power arising being R k . In particular, the error estimates come in theform O ( R k/σ ) , provided that R k t < and < σ < s ∗ as in [29]. This restriction on t in the estimatesfurther introduces only polynomial dependence on R , as for any Z ∈ X ( SR n ) , sup | t |≤ | t | − σ (cid:12)(cid:12)(cid:12)(cid:12) e tZ g − g (cid:12)(cid:12)(cid:12)(cid:12) L . R kσ || g || L + sup | t |≤ R − k | t | − σ (cid:12)(cid:12)(cid:12)(cid:12) e tZ g − g (cid:12)(cid:12)(cid:12)(cid:12) L . 
Combining the above observations with those of [29] implies that the constant in (A.5) remains polynomialin R (exponential would also be sufficient for our purposes, as we only use that the constant is boundedabove by e ηR for η < γ ).Once one has Lemma A.2, the proof of Theorem A.1 follows easily, given that we are not seeking ǫ -independent bounds, as these such bounds will be false for all but the most degenerate models (see [Lemma2.4; [16]] for the corresponding argument on ρ ǫ , which does yield ǫ -independent estimates). Let ¯ χ ∈ C ∞ c ( B (0 , with ≤ ¯ χ ≤ , and ¯ χ = 1 for x ≤ / and define χ ( x ) = χ ( x/ − χ ( x ) . Define χ j = χ (2 − j x ) , which defines the partition of unity χ + P ∞ j =0 χ j ( x ) .43or s as in Lemma A.2, let θ ∈ (0 , be such that for any g ∈ C ∞ c (that such an inequality holds on SR d is verified again through the local coordinates and the fact that the estimates on the metric are uniformover the manifold), || g || L . || g || θL || g || − θH s . We now obtain a uniform-in- δ L estimate. By Lemma A.2, (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) f ǫ,δ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L . (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ¯ χf ǫ,δ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) − θH Hyp ,δ + ∞ X j =1 jq (1 − θ ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) χ j f ǫ,δ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) − θH Hyp ,δ , where we have denoted k · k H Hyp ,δ = k · k H δ + k · k H ∗ δ . Pairing (A.4) with ¯ χf ǫ,δ and χ j f ǫ,δ followed bystandard manipulations gives (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ¯ χf ǫ,δ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) H Hyp ,δ + sup j (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) χ j f ǫ,δ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) H Hyp ,δ . ǫ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) f ǫ,δ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) L . 
Therefore, we have
\[
\| f^{\epsilon,\delta} \|_{L^2}^{\theta} \lesssim \sum_{j=1}^\infty 2^{jq(1-\theta)} \| \chi_j f^{\epsilon,\delta} \|_{L^1}^{\theta} \lesssim \| e^{|x|} f^{\epsilon,\delta} \|_{L^1}^{\theta} \lesssim 1.
\]
Hence, we have a uniform-in-$\delta$ estimate on the $L^2$ norm. Note that the estimate still depends badly on $\epsilon$. Passing to the $\delta \to 0$ limit shows that $f^\epsilon \in L^2$ for each $\epsilon > 0$. Finally, observe that $f^\epsilon \log f^\epsilon \in L^1$; indeed,
\[
\int_{S\mathbb{R}^n} f^\epsilon |\log f^\epsilon|\,\mathrm{d}q \lesssim \int_{S\mathbb{R}^n} \sqrt{f^\epsilon} + (f^\epsilon)^2\,\mathrm{d}q \lesssim \| f^\epsilon e^{\gamma |x|^2} \|_{L^1}^{1/2} + \| f^\epsilon \|_{L^2}^2.
\]
This completes the proof of Theorem A.1.

B Proof of Proposition 4.2

Before we prove Proposition 4.2, we will need some preliminary results. As we will be taking commutators of the above vector fields it is important to record how projective vector fields behave under the Lie bracket.

Lemma B.1. Let $A, B \in \mathfrak{sl}(\mathbb{R}^n)$. Then the following identity holds:
\[
[V_A, V_B](v) = -V_{[A,B]}(v),
\]
where $[A, B] := AB - BA$ denotes the usual commutator on linear operators.

Proof. Let $\nabla$ denote the Levi-Civita connection on $S^{n-1}$. Since $\nabla$ is torsion-free, we have the following formula for the Lie bracket in terms of the covariant derivative:
\[
[V_A, V_B] = \nabla_{V_A} V_B - \nabla_{V_B} V_A.
\]
Recall from the proof of Lemma 2.3 that, using the embedding of $S^{n-1}$ into $\mathbb{R}^n$, we have the following formula for the total covariant derivative of $V_A$ (viewed as a linear operator on $T_v S^{n-1}$):
\[
\nabla V_A(v) = \Pi_v A - \langle v, Av \rangle I.
\]
It follows that
\[
[V_A, V_B](v) = \nabla V_B(v) V_A(v) - \nabla V_A(v) V_B(v) = \Pi_v B \Pi_v A v - \Pi_v A \Pi_v B v - \langle v, Bv \rangle V_A(v) + \langle v, Av \rangle V_B(v).
\]
Using $\Pi_v u = u - \langle u, v \rangle v$ for $u \in T_x M$, we find
\[
\Pi_v B \Pi_v A v + \langle v, Av \rangle V_B(v) = \Pi_v BAv \quad \text{and} \quad \Pi_v A \Pi_v B v + \langle v, Bv \rangle V_A(v) = \Pi_v ABv,
\]
hence
\[
[V_A, V_B] = V_{BA} - V_{AB} = -V_{[A,B]}.
\]

Of fundamental importance is the following observation for the lifting operation $X \mapsto \tilde X$.

Lemma B.2. Any two vector fields $X, Y \in \mathfrak{X}(M)$ satisfy the identity
\[
[\tilde X, \tilde Y] = \widetilde{[X, Y]}.
\]
Thus the lifting operation $X \mapsto \tilde X$ is a Lie algebra isomorphism onto its range.

Proof. Denote by
\[
\hat X(x, v) = \begin{pmatrix} X(x) \\ 0 \end{pmatrix}, \qquad \hat V_{\nabla X}(x, v) = \begin{pmatrix} 0 \\ V_{\nabla X(x)}(v) \end{pmatrix}
\]
the extensions of the vector fields $X$ and $V_{\nabla X}$ to vector fields on the sphere bundle $SM$, and let
\[
U(x, v) = \begin{pmatrix} v \\ 0 \end{pmatrix}
\]
be the 'canonical' vector field on $SM$. Note that $U$ is parallel to $\hat X$ in the sense that $\tilde\nabla_{\hat X} U = 0$. Let $\tilde\nabla$ denote the Levi-Civita connection on $SM$ induced by the Sasaki metric $\tilde g$, and define a projection $\tilde\Pi$ on $T_{(x,v)} SM = T_x M \oplus T_v S_x M$ by
\[
\tilde\Pi_{(x,v)} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} 0 \\ \Pi_v u_1 \end{pmatrix}.
\]
Note that for any "horizontal" vector field $\hat X$, $\tilde\nabla_{\hat X} \tilde\Pi = 0$ holds since $\nabla$ preserves the metric $g$.

Using the above notation, we can write $\hat V_{\nabla X} = \tilde\Pi \tilde\nabla_U \hat X$. Since we can now split any lifted vector field $\tilde X$ as $\tilde X = \hat X + \hat V_{\nabla X}$, the commutator of $\tilde X$ and $\tilde Y$ is
\[
[\tilde X, \tilde Y] = [\hat X, \hat Y] + [\hat X, \hat V_{\nabla Y}] - [\hat Y, \hat V_{\nabla X}] + [\hat V_{\nabla X}, \hat V_{\nabla Y}]. \tag{B.1}
\]
Naturally we find that
\[
[\hat X, \hat Y] = \begin{pmatrix} [X, Y] \\ 0 \end{pmatrix} = \widehat{[X, Y]}.
\]
Likewise, Lemma B.1 implies
\[
[\hat V_{\nabla X}, \hat V_{\nabla Y}] = -\tilde\Pi [\tilde\nabla \hat X, \tilde\nabla \hat Y] U = \tilde\Pi \left( \tilde\nabla_{[U, \hat X]} \hat Y - \tilde\nabla_{[U, \hat Y]} \hat X \right),
\]
where above $[\tilde\nabla \hat X, \tilde\nabla \hat Y]$ denotes the commutator of $\tilde\nabla \hat X, \tilde\nabla \hat Y$ as linear endomorphisms on a fixed tangent space $T_{(x,v)} SM$.
The remaining terms in (B.1) can be computed as
\[
[\hat X, \hat V_{\nabla Y}] - [\hat Y, \hat V_{\nabla X}] = \tilde\nabla_{\hat X} \hat V_{\nabla Y} - \tilde\nabla_{\hat Y} \hat V_{\nabla X} = \tilde\Pi \left( \tilde\nabla_{\hat X} \tilde\nabla_U \hat Y - \tilde\nabla_{\hat Y} \tilde\nabla_U \hat X \right).
\]
Collecting terms,
\[
[\tilde X, \tilde Y] = \widehat{[X, Y]} + \tilde\Pi \left( \tilde\nabla_{\hat X} \tilde\nabla_U \hat Y - \tilde\nabla_{\hat Y} \tilde\nabla_U \hat X + \tilde\nabla_{[U, \hat X]} \hat Y - \tilde\nabla_{[U, \hat Y]} \hat X \right).
\]
The proof will be complete once we show the identity
\[
\tilde\nabla_{\hat X} \tilde\nabla_U \hat Y - \tilde\nabla_{\hat Y} \tilde\nabla_U \hat X + \tilde\nabla_{[U, \hat X]} \hat Y - \tilde\nabla_{[U, \hat Y]} \hat X = \tilde\nabla_U \widehat{[X, Y]}. \tag{B.2}
\]
For this, we can use the Riemann curvature tensor on $SM$,
\[
\tilde R(X, Y) Z := \tilde\nabla_X \tilde\nabla_Y Z - \tilde\nabla_Y \tilde\nabla_X Z - \tilde\nabla_{[X, Y]} Z,
\]
to change the order of covariant derivatives, giving
\[
\tilde\nabla_{\hat X} \tilde\nabla_U \hat Y - \tilde\nabla_{\hat Y} \tilde\nabla_U \hat X + \tilde\nabla_{[U, \hat X]} \hat Y - \tilde\nabla_{[U, \hat Y]} \hat X = \tilde R(\hat X, U) \hat Y - \tilde R(\hat Y, U) \hat X + \tilde\nabla_U \tilde\nabla_{\hat X} \hat Y - \tilde\nabla_U \tilde\nabla_{\hat Y} \hat X = \tilde R(\hat X, U) \hat Y + \tilde R(U, \hat Y) \hat X + \tilde\nabla_U [\hat X, \hat Y].
\]
The first Bianchi identity implies that
\[
\tilde R(\hat X, U) \hat Y + \tilde R(U, \hat Y) \hat X = \tilde R(\hat X, \hat Y) U,
\]
and therefore identity (B.2) follows from the fact that $\tilde R(\hat X, \hat Y) U = 0$ since, for any vector field $Z \in \mathfrak{X}(M)$, we have that $\tilde\nabla_{\hat Z} U = 0$.

We are now ready to prove Proposition 4.2.

Proof of Proposition 4.2. A simple consequence of Lemma B.2 is that for any collection of vector fields $\{X_k\}_{k=0}^r$ on $M$ we have the following identification:
\[
\mathrm{Lie}(\tilde X_0; \tilde X_1, \ldots, \tilde X_r) = \left\{ \tilde X : X \in \mathrm{Lie}(X_0; X_1, \ldots, X_r) \right\}.
\]
Therefore the parabolic Hörmander condition for $\{\tilde X_k\}_{k=0}^r$ is equivalent to
\[
\left\{ \begin{pmatrix} X(x) \\ V_{M_X(x)}(v) \end{pmatrix} : X \in \mathrm{Lie}(X_0; X_1, \ldots, X_r) \right\} = T_x M \oplus T_v S_x M.
\]
Clearly if the above condition is satisfied then $\{X_k\}_{k=0}^r$ satisfies the parabolic Hörmander condition and (4.2) holds. The converse follows from the fact that, by (4.2),
\[
\left\{ \begin{pmatrix} X(x) \\ V_{M_X(x)}(v) \end{pmatrix} : X \in \mathrm{Lie}(X_0; X_1, \ldots, X_r),\ X(x) = 0 \right\} = \{0\} \oplus T_v S_x M
\]
and
\[
\{ X(x) : X \in \mathrm{Lie}(X_0; X_1, \ldots, X_r),\ X(x) \neq 0 \} = T_x M \setminus \{0\}.
\]

References

[1] F. Abedin and G.
Tralli, Harnack inequality for a class of Kolmogorov–Fokker–Planck equations in non-divergence form ,Archive for Rational Mechanics and Analysis (2019), no. 2, 867–900.[2] S. Amari and H. Nagaoka, Methods of information geometry , American Mathematical Society, 2000 (en).[3] F. Anceschi, S. Polidoro, and M. A. Ragusa, Moser’s estimates for degenerate Kolmogorov equations with non-negativedivergence lower order coefficients , Nonlinear Analysis (2019), 111568.[4] S. Armstrong and J.-C. Mourrat, Variational methods for the kinetic Fokker-Planck equation , arXiv preprint arXiv:1902.04037(2019).[5] L. Arnold, Random dynamical systems , Dynamical systems, 1995, pp. 1–43.[6] L. Arnold, D. C. Nguyen, and V. Oseledets, Jordan normal form for linear cocycles , Random Operators and StochasticEquations (1999), no. 4, 303–358.[7] L. Arnold, G. Papanicolaou, and V. Wihstutz, Asymptotic analysis of the Lyapunov exponent and rotation number of therandom oscillator and applications , SIAM Journal on Applied Mathematics (1986), no. 3, 427–450.[8] E. I. Auslender and G. N. Milstein, Asymptotic expansion of Lyapunov exponent for linear stochastic systems with smallnoises , Prikl. Mat. i Mekh. (1982), 358–365 (In Russ.)[9] D. Bakry and M. ´Emery, Diffusions hypercontractives , S´eminaire de probabilit´es xix 1983/84, 1985, pp. 177–206.[10] L. Barreira and Y. Pesin, Smooth ergodic theory and nonuniformly hyperbolic dynamics , Handbook of dynamical systems (2006), 57–263.[11] P. H Baxendale, Lyapunov exponents and relative entropy for a stochastic flow of diffeomorphisms , Probability Theory andRelated Fields (1989), no. 4, 521–554.[12] P. H Baxendale, Lyapunov exponents and stability for the stochastic Duffing-van der Pol oscillator , Iutam symposium onnonlinear stochastic dynamics, 2003, pp. 125–135.[13] P. H Baxendale, Stochastic averaging and asymptotic behavior of the stochastic duffing–van der pol equation , Stochasticprocesses and their applications (2004), no. 2, 235–272.[14] P. 