A practical criterion for positivity of transition densities
DAVID P. HERZOG AND JONATHAN C. MATTINGLY
Abstract.
We establish a simple criterion for locating points where the transition density of a degenerate diffusion is strictly positive. Throughout, we assume that the diffusion satisfies a stochastic differential equation (SDE) on R^d with additive noise and polynomial drift. In this setting, we will see that it is often the case that local information of the flow, e.g. the Lie algebra generated by the vector fields defining the SDE at a point x ∈ R^d, determines where the transition density is strictly positive. This is surprising in that positivity is a more global property of the diffusion. This work primarily builds on and combines the ideas of Ben Arous and Léandre [2] and Jurdjevic and Kupka [6].

1. Introduction
The goal of this paper is to develop an easily applicable framework for locating points where the probability density of a degenerate diffusion is strictly positive. We will focus on the setting where the diffusion satisfies a stochastic differential equation (SDE) on R^d where each component of the drift is a polynomial in the standard Euclidean coordinates and the noise is additive. Our methods reduce finding points of positivity to computing a certain collection of constant vector fields generated by taking iterated commutators of the vector fields defining the SDE. This is convenient since a similar computation is typically used to show that the diffusion has a smooth probability density function p_t(x, y) with respect to Lebesgue measure dy. While the existence of a smooth density is decided locally, we show that in some settings the bracket computation also determines the more global property of where the density is strictly positive. Additionally, uncovering sufficiently large regions of positivity is useful for proving unique ergodicity.

While methods already exist for proving positivity of transition densities, most require knowledge of attainable sets via controls. Here we have structured our assumptions to require as little global control information as possible. In particular, our results prove smoothness of the densities, the needed control statements, and positivity, all with one set of primarily local assumptions.

Although our general framework is limited to SDEs with polynomial drift and additive noise, working within such boundaries is reasonable in many applications. In particular, to illustrate the utility of our results, we will apply them to a collection of examples, each with quite different structure. Moreover, for the equations considered, either new results will be obtained or existing results will be improved upon.

The ideas used in this note build on a number of existing works.
Beyond the now classical theory of Hörmander [4] on hypoelliptic operators in "sum of squares" form, we use the associated probabilistic techniques of Malliavin calculus [12]. We also use a number of ideas from geometric control theory [7]. Moreover, we modify the idea that odd-powered polynomial vector fields are "good" (due to their time-reversal properties) and even-powered polynomial vector fields are "bad" [6]. Similar ideas were critical in the work of Romito [14]. We also integrate into our results the powerful ideas of Ben Arous and Léandre [2] for proving positivity of densities of random variables over a Wiener space. Our hope is that by bringing these ideas together and adapting them to our specific context, we will provide a useful tool for many applied equations.

The layout of this paper is as follows. In Section 2, we introduce notation and terminology and state the main general results of the paper. In Section 3, we apply our results to specific examples. Section 4 contains heuristic discussions of why the main results hold and are natural. We also include a "non-example", that is, an example where the main results fail to apply yet the corresponding density has regions of positivity (in space and time), and illustrate how to adapt the general theory in such cases. Additionally, Section 4 contains the proofs of the main results as stated in Section 2.

Acknowledgements
The authors would like to thank Avanti Athreya, Richard Durrett, Tiffany Kolba, James Nolen, and Jan Wehr for helpful conversations on the topic of this paper. DPH would also like to thank Martin Hairer for suggesting the paper [6], from which his understanding of these ideas began and which led to the current collaboration. We would also like to acknowledge partial support of the NSF through grant DMS-08-54879 and the Duke University Dean's office.

2. Notation, Terminology and Main Results
Throughout, we study stochastic differential equations on R^d of the following form:

dx_t = X_0(x_t) dt + Σ_{j=1}^r X_j dW_t^j    (2.1)

where X_0 is a polynomial vector field; that is, X_0 = Σ_{j=1}^d X_0^j(x) ∂_{x_j} is such that each map x ↦ X_0^j(x) is a polynomial in the standard Euclidean coordinates; X_1, …, X_r are constant vector fields, that is, they do not depend on the base point; and W_t^1, W_t^2, …, W_t^r are standard independent real Wiener processes defined on a probability space (Ω, F, P).

To deal with the issue of finite-time explosion in (2.1), we will need to stop the process x_t prior to the time of explosion. Thus for n ∈ N, let B_n(0) denote the open ball of radius n centered at the origin in R^d, and define the stopping times

τ_n = inf{ t > 0 : x_t ∉ B_n(0) }  and  τ_∞ = lim_{n↑∞} τ_n.

Our results will be stated for the stopped processes x_{t∧τ_n}, n ∈ N. Of course, x_{t∧τ_n} coincides with x_t for all times t ≤ τ_n.

For vector fields V = Σ_{j=1}^d V^j(x) ∂/∂x_j and W = Σ_{j=1}^d W^j(x) ∂/∂x_j, let

ad_V^0(W) = W,  ad_V(W) = [V, W] := Σ_{j=1}^d ( Σ_{k=1}^d V^k(x) ∂W^j(x)/∂x_k − W^k(x) ∂V^j(x)/∂x_k ) ∂/∂x_j.
Inductively, for m ≥ 1, ad_V^m(W) = ad_V(ad_V^{m−1}(W)). For a set of vector fields G on R^d, span(G) denotes the R-linear span of G and

cone_{≥0}(G) = { Σ_{i=1}^j λ_i V_i : j ∈ N, λ_i ≥ 0, V_i ∈ G }.

We call x ∈ R^d an equilibrium point of a set of vector fields G if V(x) = 0 for some V ∈ G. If V is a constant vector field with constant value v ∈ R^d and W is a polynomial vector field, then we may define a map from R into R^d given by λ ↦ (W^j(λv))_{j=1}^d. Note that since W is a polynomial vector field, (W^j(λv))_{j=1}^d is a vector of polynomials in λ. Let n(V, W) be the maximal degree among these polynomials (for purposes below, we assume that the zero polynomial has neither even nor odd degree). We call n(V, W) the relative degree of V and W.

We now introduce the set of constant vector fields C which will play a fundamental role throughout the paper. It will be defined as the subset of constant vector fields in a larger set of vector fields which we now introduce. To initialize the inductive procedure, let G_0 = span{X_1, …, X_r} and

G_1^o = G_0 ∪ { ad_V^{n(V,X_0)}(X_0) : V ∈ G_0, n(V, X_0) odd },
G_1^e = { ad_V^{n(V,X_0)}(X_0) : V ∈ G_0, n(V, X_0) even },
G_1 = span(G_1^o) + cone_{≥0}(G_1^e).

For j ≥ 1, we define G_{j+1}^o, G_{j+1}^e, G_{j+1} inductively as

G_{j+1}^o = G_j^o ∪ { ad_V^{n(V,W)}(W) : V ∈ G_j^o constant, W ∈ G_j, n(V, W) odd },
G_{j+1}^e = G_j^e ∪ { ad_V^{n(V,W)}(W) : V ∈ G_j^o constant, W ∈ G_j, n(V, W) even },
G_{j+1} = span(G_{j+1}^o) + cone_{≥0}(G_{j+1}^e).

Let C^o denote the set of constant vector fields in ∪_j G_j^o and let C^e denote the set of constant vector fields in ∪_j G_j^e. Finally, define

C = span(C^o) + cone_{≥0}(C^e).    (2.2)

Remark 2.3.
Throughout, we will often identify a constant vector field on R^d with the vector in R^d which defines it. For example, depending on the context, C^o will be used to denote either the set of vector fields C^o defined above or the set of vectors v ∈ R^d such that v = V(x) for some V ∈ C^o.

Remark 2.4.
The primary assumption we will make is that C is d-dimensional. This is equivalent to assuming that C spans the entire tangent space at every point x ∈ R^d, as C contains only constant vector fields. Since C is contained in the Lie algebra generated by X_1, …, X_r, [X_1, X_0], …, [X_r, X_0], it follows by Hörmander's hypoellipticity theorem [4] that for every n ≥ 1, x ∈ B_n(0) and every Borel set A ⊂ B_n(0),

P_x{ x_{t∧τ_n} ∈ A } = ∫_A p_t^n(x, y) dy

for some nonnegative function p_t^n(x, y) which is defined and smooth on (0, ∞) × B_n(0) × B_n(0). Here we recall that B_n(0) is the open ball of radius n centered at the origin in R^d. Certainly, the transition kernel of x_{t∧τ_n} contains a singular component concentrated on the boundary of B_n(0). However, this is invisible to sets contained in B_n(0) since B_n(0) is open.
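The inductive construction of C above is mechanical, and its two basic ingredients — the relative degree n(V, W) and the iterated bracket ad_V^m(W) — are easy to automate with a computer algebra system. The following sketch does this with sympy for a toy planar polynomial drift of our own choosing; the helper names are ours, not the paper's.

```python
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam')
coords = [x1, x2]

def lie_bracket(V, W):
    # [V, W]^j = sum_k V^k dW^j/dx_k - W^k dV^j/dx_k
    return [sp.expand(sum(V[k]*sp.diff(W[j], coords[k])
                          - W[k]*sp.diff(V[j], coords[k])
                          for k in range(len(coords))))
            for j in range(len(coords))]

def relative_degree(v, W):
    # n(V, W): maximal degree in lam among the polynomials W^j(lam*v),
    # ignoring components that are identically zero
    subs = {coords[j]: lam*v[j] for j in range(len(coords))}
    polys = [sp.expand(Wj.subs(subs)) for Wj in W]
    degs = [sp.Poly(p, lam).degree() for p in polys if p != 0]
    return max(degs) if degs else None

def iterated_bracket(V, W, m):
    # ad_V^m(W)
    for _ in range(m):
        W = lie_bracket(V, W)
    return W

# Toy polynomial drift X0 = (x2^2 - x1^2) d/dx1 - x1*x2 d/dx2 and the
# constant field X1 = d/dx2, written componentwise:
X0 = [x2**2 - x1**2, -x1*x2]
X1 = [sp.Integer(0), sp.Integer(1)]

n = relative_degree([0, 1], X0)
print(n)                            # 2, so n(X1, X0) is even
print(iterated_bracket(X1, X0, n))  # [2, 0], i.e. the constant field 2 d/dx1
```

Since the relative degree here is even, the resulting constant field 2∂_{x_1} lands in the cone part C^e of C, so it contributes only nonnegative multiples to C.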
We now state the main general result of the paper.
Theorem 2.5.
Suppose that C is d-dimensional and let {y_1, …, y_d} ⊂ C be a basis of C such that {y_1, …, y_k} ⊂ C^o and {y_{k+1}, …, y_d} ⊂ C^e. For x ∈ R^d, define the set

D(x) = {x} + { Σ_{i=1}^k α_i y_i + Σ_{j=k+1}^d λ_j y_j : α_i ∈ R, λ_j > 0 },

and suppose that x, z ∈ R^d are such that z ∈ D(x).

(a) For all T > 0 there exist t ∈ (0, T) and N ∈ N such that p_t^n(x, z) > 0 for all n ≥ N.

(b) If there exists an equilibrium point y ∈ R^d of G = { X_0 + Σ_{j=1}^r u_j X_j : u_j ∈ R } such that y ∈ D(x) and z ∈ D(y), then for all T > 0 there exists N ∈ N such that p_t^n(x, z) > 0 for all t ≥ T, n ≥ N.

Remark 2.6.
Suppose that C is d-dimensional and that x_t is non-explosive; that is, for every x ∈ R^d,

P_x{ τ_∞ < ∞ } = 0.

Then x_t has a probability density function p_t(x, y) with respect to Lebesgue measure dy which is smooth on (0, ∞) × R^d × R^d. Moreover, all conclusions of Theorem 2.5 hold with p_t^n(x, z) replaced by p_t(x, z).

Remark 2.7.
Even if C is d-dimensional, it is still possible that the set D(x) cannot be chosen to be the entire space R^d. See Example 3.4 in Section 3.

Remark 2.8.
It is worth emphasizing that y ∈ R^d can be an equilibrium point of G without being an equilibrium point of the drift vector field X_0. For example, if X_0(y_1, y_2) = ( g(y_1, y_2)(1 − y_2), f(y_1, y_2) ) for some scalar functions f, g and X_1 = (0, 1), then all points of the form (y_1, 1) are equilibrium points, since X_0(y_1, 1) + uX_1 = (0, 0) if u = −f(y_1, 1).

Theorem 2.9.
Suppose that C is d-dimensional and x_t is non-explosive. Let D(x) be as in the statement of Theorem 2.5. Then there is at most one invariant probability measure corresponding to the Markov process x_t defined by (2.1). Moreover, if such an invariant probability measure µ exists, then µ(dx) = m(x) dx for some smooth, non-negative function m, and if x ∈ supp(µ) then m(z) > 0 for all z ∈ D(x).

3. Examples
Before proving the main results, we apply them to specific examples to show their utility. A "non-example", that is, an example where Theorem 2.5 is not applicable, is given in the next section in Remark 4.11, as it fits in better with the discussion there.
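Before turning to the examples, it can help to see what positivity of a density looks like in simulation. The following Euler–Maruyama sketch integrates a Langevin-type SDE of the form (2.1); the double-well potential and all parameter values are illustrative choices of ours, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Euler-Maruyama for dx = (-gamma*x - U'(y)) dt + sigma dW, dy = x dt,
# with the illustrative double-well potential U(y) = y^4 - y^2.
gamma, sigma, dt, n_steps = 0.5, 1.0, 1e-3, 200_000
x, y = 0.0, 0.0
traj = np.empty((n_steps, 2))
for i in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt))
    x, y = (x + (-gamma*x - (4*y**3 - 2*y))*dt + sigma*dW,
            y + x*dt)
    traj[i] = (x, y)

# A strictly positive transition density is consistent with the path
# eventually entering any open set; here the y-component should explore
# both wells near y = +-1/sqrt(2).
print(traj[:, 1].min(), traj[:, 1].max())
```

Of course a simulation only suggests where the density is positive; the point of the results above is to certify it from the brackets alone.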
Example 3.1.
As a first example, we consider the Langevin dynamics on R^{2d}, d ≥ 1:

dx_t = [ −γ x_t − ∇F(y_t) ] dt + Σ_{j=1}^d σ_j dW_t^j    (3.2)
dy_t = x_t dt

where x_t, y_t ∈ R^d, γ > 0, F ∈ C^∞(R^d : R), σ_j ∈ R^d and the W_t^j are independent standard Wiener processes. So that solutions to (3.2) do not explode in finite time, we assume that F satisfies the one-sided Lipschitz condition and the concavity and growth assumptions of Condition 3.1 of [9]. A prototypical example of a potential which satisfies these assumptions is F(y) = |y|^4 − |y|^2.

As a consequence of Theorem 2.5, we now prove:

Corollary 3.3. If span{σ_1, …, σ_d} = R^d, then for all (x, y), (x', y') ∈ R^{2d} and t > 0,

p_t((x, y), (x', y')) > 0.

Proof.
Let 0 = (0, 0, …, 0) ∈ R^d and let G = { X_0 + Σ_{j=1}^d u_j X_j : u_j ∈ R } where

X_0(x, y) = ( −γx − ∇F(y), x )  and  X_j(x, y) = ( σ_j, 0 ).

We begin by computing C (defined in Section 2) corresponding to equation (3.2). Since n(X_j, X_0) = 1 for all j, we see that G_1^o ⊃ { [X_j, X_0] : j = 1, 2, …, d } and

[X_j, X_0](x, y) = ( −γσ_j, σ_j ).

Hence, in particular,

C ⊃ { X_j, [X_j, X_0] : j = 1, 2, …, d }.

Since the vectors σ_1, …, σ_d are linearly independent, it follows that C contains a basis of R^{2d}. Additionally, since C^o ⊃ { X_j, [X_j, X_0] : j = 1, 2, …, d }, we can choose a basis so that D(x, y) = R^{2d} for all (x, y) ∈ R^{2d}. To finish proving the result, we claim that the origin (0, 0) ∈ R^{2d} is an equilibrium point of G. Indeed, since

X_0(0, 0) + Σ_{j=1}^d u_j X_j(0, 0) = ( −∇F(0), 0 ) + ( Σ_{j=1}^d u_j σ_j, 0 )

and the σ_j form a basis of R^d, we may choose real numbers u_j ∈ R such that

X_0(0, 0) + Σ_{j=1}^d u_j X_j(0, 0) = ( 0, 0 ).

In light of Remark 2.6, applying Theorem 2.5(b) finishes the proof of Corollary 3.3. □
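The commutator used in this proof is simple enough to check symbolically. The sketch below verifies, in the case d = 1 (so the state space is R²) and for a generic smooth potential F, that [X_j, X_0](x, y) = (−γσ_j, σ_j); the setup and names are ours.

```python
import sympy as sp

x, y, gamma, sigma = sp.symbols('x y gamma sigma')
F = sp.Function('F')       # generic smooth potential, kept symbolic
coords = [x, y]

def lie_bracket(V, W):
    # [V, W]^j = sum_k V^k dW^j/dx_k - W^k dV^j/dx_k
    return [sp.expand(sum(V[k]*sp.diff(W[j], coords[k])
                          - W[k]*sp.diff(V[j], coords[k]) for k in range(2)))
            for j in range(2)]

# Langevin fields with d = 1: X0 is the drift, X1 the constant noise direction
X0 = [-gamma*x - sp.diff(F(y), y), x]
X1 = [sigma, sp.Integer(0)]

print(lie_bracket(X1, X0))   # [-gamma*sigma, sigma]
```

Note that the F-dependent term differentiates away because X_1 is constant and F' does not depend on x, which is why the bracket is a constant vector field.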
Example 3.4.
Let a_1, a_2 ∈ R, α_1 > 0, α_2 > 0, and ε > 0. With motivations from turbulent transport of inertial particles, the stochastic differential equation on R² given by

dx_t = ( a_1 x_t − α_1 x_t² + y_t² ) dt    (3.5)
dy_t = ( a_2 y_t − α_2 x_t y_t ) dt + ε dW_t

is considered in [3]. Here, we strengthen the results of Section 4 of that work. A more hands-on application of some of the ideas of this note was given for a specific case of this example in Section 11 of [1]. Applying Theorem 2.1 of [3], we first note that (x_t, y_t) is non-explosive.

We now prove:

Corollary 3.6.
Suppose that (x, y) ∈ R² satisfies x < (a_1 − |a_1|)/(2α_1) or x ≥ (a_1 + |a_1|)/(2α_1). Then for all t > 0 and (x', y') ∈ R² with x' > x,

p_t((x, y), (x', y')) > 0.

Otherwise, if (x, y) ∈ R² satisfies (a_1 − |a_1|)/(2α_1) ≤ x ≤ (a_1 + |a_1|)/(2α_1), then for all t > 0 and (x', y') ∈ R² with x' > (a_1 + |a_1|)/(2α_1),

p_t((x, y), (x', y')) > 0.

Remark 3.7.
It is important to point out that Corollary 3.6 is not sharp. For example, if a_1 = a_2 = 0, α_1 = 1 and α_2 = 2, it was shown in Section 11 of [1] that, in addition to the result above, p_t((x, y), (x', y')) > 0 for all (x, y), (x', y') ∈ R² with x' > 0 and all t > 0. There, however, the argument makes detailed use of the structure of the drift X_0, whereas here we have avoided such specifics in favor of making general statements for any positive time. However, Corollary 3.6 is more than sufficient to prove unique ergodicity of equation (3.5). Nevertheless, it is not hard to bootstrap from Corollary 3.6 to obtain the full (sharp) result proved in [1].

Proof.
As in the previous example, we begin by computing the set C corresponding to equation (3.5). Let G = { X_0 + uX_1 : u ∈ R } where

X_0 = ( a_1 x − α_1 x² + y² ) ∂_x + ( a_2 y − α_2 xy ) ∂_y  and  X_1 = ∂_y.

Since n(X_1, X_0) = 2, we find that ad_{X_1}²(X_0) = 2∂_x ∈ G_1^e. Let

D(x, y) = { (x, y) + u(0, 1) + λ(1, 0) : u ∈ R, λ > 0 }.

As opposed to the previous example, the set D(x, y) is not the entire space. Hence we must make sure we have enough equilibrium points in the right locations. Consider the polynomial equations

a_1 x − α_1 x² + y² = 0
a_2 y − α_2 xy + u = 0

where u ∈ R. Clearly, any pair (x, y) ∈ R² satisfying the above equations for some u ∈ R is an equilibrium point of G. In particular, we may solve a_1 x − α_1 x² + y² = 0, producing

x = ( a_1 ± √(a_1² + 4α_1 y²) ) / (2α_1).

Since √(a_1² + 4α_1 y²) ≥ |a_1| and we may always pick u = α_2 xy − a_2 y, we therefore deduce that for every x with either x ≥ (a_1 + |a_1|)/(2α_1) or x ≤ (a_1 − |a_1|)/(2α_1) there is an equilibrium point of the control system G with first coordinate x. Hence Remark 2.6 now implies Corollary 3.6. □

Example 3.8.
Let ν > 0 and consider the stochastically forced Burgers equation

∂_t u(x, t) + ( u(x, t) · ∇_x ) u(x, t) = ν Δ_x u(x, t) + ξ(x, t)    (3.9)

with periodic boundary conditions on the torus T² = [0, 2π]². Here, we assume that there is no mean flow and that ξ is a Gaussian process which is white in time and colored in space. To emphasize: we do not require the divergence-free condition ∇ · u = 0; hence, (3.9) is not the 2D Navier–Stokes equation. Moreover, we do not restrict ourselves to gradient solutions, as is often done when considering the multidimensional Burgers equation. In the dynamics (3.9), we are precisely interested in how the divergence-free forcing spreads to the non-divergence-free (gradient-like) directions. Since one does not have global solutions in this setting, here we must make use of the stopped processes.

Let us now be more precise. Writing

u(x, t) = Σ_{k ∈ Z²} u_k(t) e^{−i⟨k, x⟩},

where ⟨·, ·⟩ denotes the dot product, and fixing a positive integer N ≥ 2, we consider the following stochastic differential equation on C^{4N(N+1)}:

du_k = [ iF_k^N(u) − ν|k|² u_k ] dt + (k^⊥/|k|)( σ_k dB_t^{k,(1)} + iσ'_k dB_t^{k,(2)} )    (3.10)
        + (k/|k|)( γ_k dW_t^{k,(1)} + iγ'_k dW_t^{k,(2)} )

where

• u_k ∈ C²;
• the equation is over all indices k ∈ H_N = { k ∈ Z² \ {(0, 0)} : ‖k‖_∞ ≤ N };
• F_k^N(u) = Σ_{l, k−l ∈ H_N} ⟨u_l, k − l⟩ u_{k−l};
• σ_k, σ'_k, γ_k, γ'_k ∈ R;
• k^⊥ = (k_1, k_2)^⊥ = (−k_2, k_1);
• { B_t^{k,(1)}, B_t^{k,(2)}, W_t^{k,(1)}, W_t^{k,(2)} }_{k ∈ H_N} is a set of independent Brownian motions.
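The index bookkeeping behind (3.10) can be sanity-checked numerically. The snippet below (helper names ours) verifies the count |H_N| = (2N+1)² − 1 = 4N(N+1), that the low-mode set {k ∈ H_N : ‖k‖_∞ = 1} appearing in the forcing assumption below has exactly 8 elements, and the two k^⊥ inner-product identities used later to symmetrize the nonlinearity.

```python
import itertools

def H(N):
    # H_N = { k in Z^2 \ {(0,0)} : ||k||_inf <= N }
    return [(k1, k2) for k1, k2 in itertools.product(range(-N, N + 1), repeat=2)
            if (k1, k2) != (0, 0)]

def perp(k):
    # k_perp = (-k2, k1)
    return (-k[1], k[0])

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1]

for N in range(1, 6):
    assert len(H(N)) == (2*N + 1)**2 - 1 == 4*N*(N + 1)

# the forced low modes { k in H_N : ||k||_inf = 1 }
low = [k for k in H(3) if max(abs(k[0]), abs(k[1])) == 1]
assert len(low) == 8

# relations used below to symmetrize the nonlinearity:
for k in H(2):
    for l in H(2):
        assert dot(perp(k), l) == -dot(k, perp(l))
        assert dot(perp(k), perp(l)) == dot(k, l)
print("index checks passed")
```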
To further illuminate the discussion, we first split the equation into incompressible and compressible directions. To this end, write

u_k = w_k k^⊥/|k| + q_k k/|k|,
F_k^N(u) = F_k^⊥(w, q) k^⊥/|k| + F_k^∥(w, q) k/|k|,

where w_k, q_k ∈ C. In particular, equation (3.10) now becomes

dw_k = [ −ν|k|² w_k + iF_k^⊥(w, q) ] dt + σ_k dB_t^{k,(1)} + iσ'_k dB_t^{k,(2)}    (3.11)
dq_k = [ −ν|k|² q_k + iF_k^∥(w, q) ] dt + γ_k dW_t^{k,(1)} + iγ'_k dW_t^{k,(2)}

for some F_k^⊥, F_k^∥ to be computed in a moment. Note that (3.11) evolves on C^{4N(N+1)} for all t < τ_∞.

We will now use Theorem 2.5 to prove the following result:

Theorem 3.12.
Suppose that

{ k ∈ H_N : σ_k ≠ 0, σ'_k ≠ 0 } ⊃ { k ∈ H_N : ‖k‖_∞ = 1 }.

Then for all (w, q), (w', q') ∈ C^{4N(N+1)} and T > 0, there exists N_0 ∈ N large enough so that

p_t^n((w, q), (w', q')) > 0 for all t ≥ T, n ≥ N_0.

Remark 3.13.
It is interesting to note that, even if the process (w_t, q_t) is assumed to be incompressible initially, that is, (w_0, q_0) = (w, 0) ∈ C^{4N(N+1)}, a small amount of low-mode forcing ensures that any mixture of incompressible and compressible states becomes instantaneously possible. As we will see in the proof below, this cannot happen if we do not force the incompressible directions. In particular, if we assume that the process (w_t, q_t) is initially compressible, that is, (w_0, q_0) = (0, q), and σ_k = σ'_k = 0 for all k ∈ H_N, then w_t ≡ 0 for all t ≥ 0.

Proof of Theorem 3.12.
We will first write out and symmetrize the nonlinear terms F_k^⊥ and F_k^∥. Using the relations ⟨k^⊥, l⟩ = −⟨k, l^⊥⟩ and ⟨k^⊥, l^⊥⟩ = ⟨k, l⟩, we find that

F_k^⊥(w, q) = Σ_{l, k−l ∈ H_N} [ w_l w_{k−l} ⟨l^⊥, k⟩⟨k−l, k⟩/(|l||k−l|) + w_l q_{k−l} ⟨l^⊥, k⟩²/(|l||k−l|) ]
    + Σ_{l, k−l ∈ H_N} [ q_l w_{k−l} ⟨l, k−l⟩⟨k−l, k⟩/(|l||k−l|) − q_l q_{k−l} ⟨l, k−l⟩⟨l, k^⊥⟩/(|l||k−l|) ]

and

F_k^∥(w, q) = Σ_{l, k−l ∈ H_N} [ −w_l w_{k−l} ⟨l^⊥, k⟩²/(|l||k−l|) + w_l q_{k−l} ⟨l^⊥, k⟩⟨k−l, k⟩/(|l||k−l|) ]
    + Σ_{l, k−l ∈ H_N} [ −q_l w_{k−l} ⟨l, k−l⟩⟨l^⊥, k⟩/(|l||k−l|) + q_l q_{k−l} ⟨l, k−l⟩⟨k−l, k⟩/(|l||k−l|) ].
After considering the effect of the mapping (l, k−l) ↦ (k−l, l) on each of the terms above, we may write

F_k^⊥(w, q) = Σ_{l, k−l ∈ H_N} [ w_l w_{k−l} ⟨l^⊥, k⟩ ( 1/|l|² − 1/|k−l|² ) + w_l q_{k−l} ⟨k−l, k⟩/|k−l|² ],

F_k^∥(w, q) = Σ_{l, k−l ∈ H_N} [ −w_l w_{k−l} ⟨l^⊥, k⟩²/(|l||k−l|) + w_l q_{k−l} ⟨l^⊥, k⟩⟨k−l, k+l⟩/(|l||k−l|) ]
    + Σ_{l, k−l ∈ H_N} q_l q_{k−l} ⟨l, k−l⟩ |k|²/(|l||k−l|).

The assertion made in the previous remark now follows easily from these expressions, since if σ_k = σ'_k = 0 for all k ∈ H_N and w_0 = 0, then w_t = (w_k(t))_{k ∈ H_N} ≡ 0 for all t ≥ 0.

To prove Theorem 3.12, we do as in the previous two examples and start by computing C corresponding to (3.11). Define

G = { X_0 + Σ_{k ∈ H_N : σ_k ≠ 0, σ'_k ≠ 0} ( u_k X_k + v_k Y_k ) : u_k, v_k ∈ R }

where

X_0 = Σ_{k ∈ H_N} [ −ν|k|² w_k + iF_k^⊥(w, q) ] ∂/∂w_k + [ −ν|k|² q_k + iF_k^∥(w, q) ] ∂/∂q_k
    + Σ_{k ∈ H_N} [ −ν|k|² w̄_k − iF_k^⊥(w̄, q̄) ] ∂/∂w̄_k + [ −ν|k|² q̄_k − iF_k^∥(w̄, q̄) ] ∂/∂q̄_k

and

X_k = ∂/∂w_k + ∂/∂w̄_k,  Y_k = i ∂/∂w_k − i ∂/∂w̄_k.

Notice that n(X_j, X_0) = 1 for all j ∈ { k ∈ H_N : σ_k ≠ 0, σ'_k ≠ 0 } since there are no diagonal terms in the nonlinear part of X_0. In particular,

[X_j, X_0] ∈ G_1^o for all j ∈ { k ∈ H_N : σ_k ≠ 0, σ'_k ≠ 0 }.

Moreover, one can compute these commutators to see that

[X_j, X_0] = −ν|j|² ∂/∂w_j − ν|j|² ∂/∂w̄_j
  + i Σ_{k ∈ H_N} [ w_{k−j} ⟨j^⊥, k⟩ ( 1/|j|² − 1/|k−j|² ) + q_{k−j} ⟨k−j, k⟩/|k−j|² ] ∂/∂w_k
  − i Σ_{k ∈ H_N} [ w̄_{k−j} ⟨j^⊥, k⟩ ( 1/|j|² − 1/|k−j|² ) + q̄_{k−j} ⟨k−j, k⟩/|k−j|² ] ∂/∂w̄_k
  + i Σ_{k ∈ H_N} [ −w_{k−j} ⟨j^⊥, k⟩²/(|j||k−j|) + q_{k−j} ⟨j^⊥, k⟩⟨k−j, k+j⟩/(|j||k−j|) ] ∂/∂q_k
  − i Σ_{k ∈ H_N} [ −w̄_{k−j} ⟨j^⊥, k⟩²/(|j||k−j|) + q̄_{k−j} ⟨j^⊥, k⟩⟨k−j, k+j⟩/(|j||k−j|) ] ∂/∂q̄_k.
Note also that for all j, m ∈ { k ∈ H_N : σ_k ≠ 0, σ'_k ≠ 0 } such that j + m ∈ H_N,

n(X_m, [X_j, X_0]) = n(Y_m, [X_j, X_0]) = 1.

Hence for all j, m ∈ { k ∈ H_N : σ_k ≠ 0, σ'_k ≠ 0 } with j + m ∈ H_N, [X_m, [X_j, X_0]] ∈ G_2^o and [Y_m, [X_j, X_0]] ∈ G_2^o. Computing these commutators, we find that

[X_m, [X_j, X_0]] = ⟨j^⊥, m⟩ ( 1/|j|² − 1/|m|² ) Y_{j+m} − ⟨j^⊥, m⟩²/(|j||m|) Ỹ_{j+m}    (3.14)

and

[Y_m, [X_j, X_0]] = −⟨j^⊥, m⟩ ( 1/|j|² − 1/|m|² ) X_{j+m} + 2⟨j^⊥, m⟩²/(|j||m|) X̃_{j+m}    (3.15)

where

X̃_k = ∂/∂q_k + ∂/∂q̄_k,  Ỹ_k = i ∂/∂q_k − i ∂/∂q̄_k.

We will now use the above computations to prove that

{ X_j, Y_j, X̃_j, Ỹ_j : j ∈ H_N, ‖j‖_∞ ≤ k } ⊂ C^o

for all k = 1, 2, …, N by induction on k. It will then follow that C^o spans the tangent space, and so we may pick D(w, q) = C^{4N(N+1)} for all (w, q) ∈ C^{4N(N+1)}.

To prove the claim when k = 1, first substitute

(j, m) = ((1, 0), (0, 1)), ((1, 0), (0, −1)), ((−1, 0), (0, −1)), ((−1, 0), (0, 1))

into (3.14) and (3.15) to see that

X̃_{(1,1)}, Ỹ_{(1,1)}, X̃_{(1,−1)}, Ỹ_{(1,−1)}, X̃_{(−1,−1)}, Ỹ_{(−1,−1)}, X̃_{(−1,1)}, Ỹ_{(−1,1)} ∈ C^o.

Substituting

(j, m) = ((1, 1), (0, −1)), ((1, 1), (−1, 0)), ((−1, 1), (0, −1)), ((−1, −1), (1, 0))

into (3.14) and (3.15), and using that X_k, Y_k ∈ C^o for any ‖k‖_∞ = 1, we find by taking linear combinations that

X̃_{(1,0)}, Ỹ_{(1,0)}, X̃_{(0,1)}, Ỹ_{(0,1)}, X̃_{(−1,0)}, Ỹ_{(−1,0)}, X̃_{(0,−1)}, Ỹ_{(0,−1)} ∈ C^o.

This proves the initial statement in the inductive argument. Suppose now that for some 1 ≤ k < N,

{ X_j, Y_j, X̃_j, Ỹ_j : j ∈ H_N, ‖j‖_∞ ≤ k } ⊂ C^o.

Note that if m, j ∈ H_N are such that ‖m‖_∞ ≤ k and ‖j‖_∞ = 1, then [X̃_m, [X_j, X_0]] ∈ C^o and [Ỹ_m, [X_j, X_0]] ∈ C^o. Note moreover that

[X̃_m, [X_j, X_0]] = ⟨m, j+m⟩/|m|² Y_{j+m} + ⟨j^⊥, m⟩⟨m, m+2j⟩/(|j||m|) Ỹ_{j+m}    (3.16)

and

[Ỹ_m, [X_j, X_0]] = −⟨m, j+m⟩/|m|² X_{j+m} − ⟨j^⊥, m⟩⟨m, m+2j⟩/(|j||m|) X̃_{j+m}.    (3.17)
We claim that if m, j ∈ H_N are such that |j| ≠ |m| and ⟨j^⊥, m⟩ ≠ 0, then the pairs (3.14) and (3.16), and (3.15) and (3.17), are linearly independent. Indeed, if they are dependent under these assumptions, then

|j|² ⟨m, m+j⟩ = ½ ( |j|² − |m|² ) ⟨m, m+2j⟩,

which is true if and only if

|j|² + |m|² + 2⟨m, j⟩ = 0.

Since the left-hand side equals |j + m|², this equality would force j = −m, hence |j| = |m|, which is impossible. Therefore, to finish the inductive argument, it suffices to show that for all k ∈ H_N with ‖k‖_∞ = k + 1, there exist m, j ∈ H_N such that

• m + j = k;
• ‖m‖_∞ = k, ‖j‖_∞ = 1, |m| ≠ |j|, and ⟨j^⊥, m⟩ ≠ 0.

For those k away from the axes and the lines |y| = |x| in the (x, y)-plane, take j ∈ H_N to be the unique member of the set {(1, 0), (0, 1), (−1, 0), (0, −1)} such that ‖k − j‖_∞ = k. Thus define m = k − j and note that j and m have different Euclidean lengths and ⟨j^⊥, m⟩ ≠ 0. Now suppose k is on one of the axes or the lines |y| = |x|. Then there exists j ∈ {(1, 0), (0, 1), (−1, 0), (0, −1)} such that m = k − j belongs to the set of indices generated up to this point of sup-norm length k + 1. It is easy to check that, again, j and m have different Euclidean lengths and ⟨j^⊥, m⟩ ≠ 0. This finishes the inductive argument.

Now note that we may choose a basis of C such that D(w, q) = C^{4N(N+1)} for all (w, q) ∈ C^{4N(N+1)}. Moreover, the origin is clearly an equilibrium point of G. Because the issue of explosion is still present, Theorem 2.5 implies that for every (w, q), (w', q') ∈ C^{4N(N+1)} and T > 0, there exists N_0 ∈ N large enough such that

p_t^n((w, q), (w', q')) > 0 for all t ≥ T, n ≥ N_0. □

4. Proof of Main Results
The goal of this section is to prove Theorem 2.5 and Theorem 2.9. Theorem 2.9 will be a relatively straightforward consequence of Theorem 2.5, so we focus our attention first on proving Theorem 2.5.

To prove Theorem 2.5, we will use a slight modification of the condition for positivity of the density given by Ben Arous and Léandre [2] (see also [12]). The slight modification is necessary to remove the global Lipschitz and boundedness conditions often assumed of the coefficients in the SDE.

To set up the statement of our slight modification, let H_· = ∫_0^· h_s ds, h ∈ L²([0, ∞) : R^r), and let Φ_·^x(H) denote the maximally-defined solution (in time) of the equation

Φ_s^x(H) = x + ∫_0^s X_0(Φ_u^x(H)) du + Σ_{j=1}^r X_j ∫_0^s h_u^j du.    (4.1)

J_{s,t}^x(H) denotes the maximally-defined d × d matrix-valued solution of

J_{s,t}^x(H) = Id_{d×d} + ∫_s^t DX_0(Φ_u^x(H)) J_{s,u}^x(H) du    (4.2)

where Id_{d×d} is the identity matrix and D is the Jacobian. Define the Gramian matrix M_t^x(H) by

( M_t^x(H) )_{nk} = Σ_{m=1}^r ∫_0^t ( J_{s,t}^x(H) X_m )_n ( J_{s,t}^x(H) X_m )_k ds.    (4.3)

Remark 4.4.
Sometimes M_t^x(H) is called the deterministic Malliavin covariance matrix. Formally replacing H with a Brownian motion W yields the standard (stochastic) Malliavin covariance matrix.

Lemma 4.5.
Fix x, z ∈ R^d and t > 0, and suppose that H_· = ∫_0^· h_s ds, h ∈ L²([0, ∞) : R^r), is such that Φ_s^x(H) is defined for all times s ∈ [0, t] and Φ_t^x(H) = z. If M_t^x(H) is invertible, then p_t^n(x, z) > 0 for any integer n ≥ 1 such that Φ_s^x(H) ∈ B_n(0) for all s ∈ [0, t].

We defer the proof of Lemma 4.5 to the Appendix, and focus our efforts in this section on exhibiting a control H_· = ∫_0^· h_s ds, h ∈ L²([0, ∞) : R^r), so that Φ_·^x(H) has all of the properties stated in Lemma 4.5. The proof of the existence of such a control splits into two parts. First, in Section 4.1 we will use the enlargement techniques of Jurdjevic and Kupka [5, 6, 7] to see which directions can be flowed along in small times by Φ_s^x(H) over the class of controls H defined above. Second, we will see that there are enough directions that we can construct a sufficiently "twisty" control H, ensuring that M_t^x(H) is invertible. The existence of an equilibrium point y ∈ R^d as in the statement of Theorem 2.5 allows us control over the time parameter.

4.1. A Primer on Geometric Control Theory.
For x ∈ R^d and t > 0, let A(x, ≤ t) be the set of points z ∈ R^d such that for some time t_0 ∈ (0, t] there exists H_· = ∫_0^· h_s ds, h ∈ L²([0, ∞) : R^r), for which Φ_s^x(H) is defined for all s ∈ [0, t_0] and Φ_{t_0}^x(H) = z. Recalling the set C defined in Section 2, here we will use the techniques of [5, 6, 7] to prove the following result:

Lemma 4.6.
For all x ∈ R^d and all t > 0,

{x} + C ⊂ A(x, ≤ t).

We start by making some heuristic observations, arguing intuitively why we should expect Lemma 4.6 to be true. To make notation more legible, for any C^∞ vector field V on R^d, let exp(tV)(x) denote the maximally-defined integral curve of V passing through x at t = 0.

We first see why we should expect the following containment to hold:

{x} + span{X_1, …, X_r} ⊂ A(x, ≤ t)    (4.7)

for all x ∈ R^d, t > 0. Let x ∈ R^d, α ∈ R \ {0} and j ∈ {1, …, r} be given. The key is to realize that for λ > 0 large and t > 0 small,

exp( t(X_0 + αλX_j) )(x) ≈ exp( tαλX_j )(x).

This is because the behavior of the flow along X_0 + αλX_j is initially dominated for small times by the flow along αλX_j, since λ is large. More precisely, taking t = t'/λ for some t' > 0,

exp( t(X_0 + αλX_j) )(x) = exp( (t'/λ)(X_0 + αλX_j) )(x) → exp( t'αX_j )(x)  as λ → ∞.

Since x ∈ R^d, α ∈ R \ {0} and j ∈ {1, 2, …, r} were assumed to be arbitrary, we now see why one should believe the containment (4.7), as one could repeat the same argument with αX_j replaced by an arbitrary linear combination of X_1, …, X_r.

To see how some of the commutators in the definition of C arise, we start by "tweaking" the directions X_1, …, X_r obtained in the previous step by X_0; that is, we will first flow along X_j for αλ units of time and then flow along X_0 for t > 0 units of time. Again let x ∈ R^d, α ∈ R \ {0} and j ∈ {1, …, r} be given. If x_j ∈ R^d is the constant value of X_j, we notice that for t > 0,

exp(tX_0) ∘ exp(αλX_j)(x) = exp(tX_0)(x + αλx_j)    (4.8)
  = x + αλx_j + ∫_0^t X_0( x + αλx_j + O(s) ) ds.

Letting t = t'/λ^{n(X_j,X_0)}, it follows that as λ → ∞,

∫_0^t X_0( x + αλx_j + O(s) ) ds → ( t' α^{n(X_j,X_0)} / n(X_j, X_0)! ) ad_{X_j}^{n(X_j,X_0)}(X_0)(x).    (4.9)

As much as we would like to obtain this potentially new direction by taking λ → ∞ in (4.8), we cannot, as αλx_j blows up as λ → ∞. To rid ourselves of this problem, we need to flow backwards along X_j for αλ units of time, producing the relation

exp(−αλX_j) ∘ exp(tX_0) ∘ exp(αλX_j)(x) = x + ∫_0^t X_0( x + αλx_j + O(s) ) ds.

Using the same scaling of time t = t'/λ^{n(X_j,X_0)}, we now see how the commutator on the right-hand side of (4.9), and hence in the definition of G_1^e and G_1^o, arises.

Remark 4.10.
Note that this computation explains why the separation of C into C^o and C^e is needed. If n(X_j, X_0) is even and ad_{X_j}^{n(X_j,X_0)}(X_0) is constant, then, since α^{n(X_j,X_0)} > 0 regardless of the sign of α, relation (4.9) implies that we may only flow along ad_{X_j}^{n(X_j,X_0)}(X_0) for positive times. Additionally, in the subsequent iteration of this method we cannot necessarily flow backwards along this vector field to produce yet another direction.

Remark 4.11.
Following these observations, it is evident where and why Theorem 2.5 will fail to either produce optimal results or be applicable at all. The failure is precisely due to the fact that the set C only includes those constant vector fields which can be flowed along in small positive times. In particular, Theorem 2.5 does not account for cases where there is an unavoidable time delay needed to access certain points in space (as in the example highlighted in Remark 3.7), usually due to the need to employ the drift vector field X_0. Moreover, Theorem 2.5 will not even apply in situations where there is a more serious absence of time reversibility preventing C from being d-dimensional. As an example, consider the following SDE on R³:

dx_t = −x_t y_t dt + dB_t    (4.12)
dy_t = ( x_t² − y_t z_t ) dt
dz_t = ( y_t² − z_t ) dt.

For this system, it is not hard to check that Hörmander's bracket condition is satisfied globally but

C = { α∂_x + λ∂_y : α ∈ R, λ ≥ 0 }.

Hence, Theorem 2.5 does not apply since C has dimension 2 < 3 = d. Nevertheless, C is still useful in that Lemma 4.6 is true regardless of whether C is d-dimensional. If C is not d-dimensional, one can proceed to find more points in the set A(x, ≤ t) by using C and the specific nature of the drift vector field X_0. Then, given the existence of H_· = ∫_0^· h_s ds, h ∈ L²([0, ∞) : R^r), such that Φ_t^x(H) = z, positivity of the transition density p_t^n(x, z) for n large enough can be shown by following a similar line of reasoning to Lemma 4.22 or Remark 4.27.

We now turn the previous heuristics into a proof of Lemma 4.6. Our proof will employ results from the reference [7], so we will first introduce some further notation and terminology to connect with the setup there.

We recall that for any C^∞ vector field V on R^d, exp(tV)(x) denotes the maximally-defined integral curve of V passing through x at time t = 0. Let H be any set of C^∞ vector fields on R^d.
For x ∈ R^d and t > 0, A_H(x, ≤ t) denotes the set of z ∈ R^d such that there exist positive times t_1, …, t_k and corresponding vector fields V_1, …, V_k ∈ H such that t_1 + ⋯ + t_k ≤ t and

exp(t_k V_k) ∘ exp(t_{k−1} V_{k−1}) ∘ ⋯ ∘ exp(t_1 V_1)(x) = z.

Because there will be many different sets of vector fields, here we will absolutely need to emphasize the dependence of these sets on H.

Two sets of C^∞ vector fields on R^d, H and I, are called equivalent, denoted by H ∼ I, if A_H(x, ≤ t) = A_I(x, ≤ t) for all x ∈ R^d and all t > 0. One can show, see [7], that if H ∼ I and H ∼ J, then H ∼ I ∪ J. In particular, if we define

sat(H) = ∪_{I ∼ H} I,

then it also follows that sat(H) ∼ H. sat(H) is called the saturate of H.

Remark 4.13.
It is often the case that sat(H) contains more vector fields than H itself. Moreover, the saturate maintains identical accessibility properties in the sense (∼) described above. This is convenient in that it allows one to use simpler vector fields to determine accessibility properties of the original set of vector fields H. For example, even though the constant vector field X_j, j ≥ 1, does not belong to G = {X_0 + Σ_{j=1}^r u_j X_j : u_j ∈ R}, we used it above to generate more directions in A(x, ≤ t), as done in the arguments following equation (4.8). Using a limiting procedure, however, one can justify that this is indeed permissible.

In the next two lemmas, we list operations which allow us to expand (up to equivalence) a set of vector fields H.

Lemma 4.14. H is equivalent to the closed convex hull of the set

{λV : λ ∈ [0, 1], V ∈ H}.

Here the closure is taken in the topology of uniform convergence with all derivatives on compact subsets of R^d.

Proof. Apply Theorem 5 and Theorem 6 in Chapter 2 of [7]. □
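The closure and convexity operations in Lemma 4.14 are what license limits such as αX_j = lim_{λ→∞} λ^{−1}(X_0 + αλX_j) ∈ sat(G), used below in the proof of Lemma 4.6 (note the scaling factor λ^{−1} ∈ (0, 1] for λ ≥ 1). A quick symbolic check of that limit with an illustrative one-dimensional polynomial drift (our example, not from the paper):

```python
import sympy as sp

x, alpha, lam = sp.symbols('x alpha lam', positive=True)
X0 = sp.Matrix([x**2])   # an illustrative polynomial drift component
X1 = sp.Matrix([1])      # constant noise direction

# lam^{-1} * (X0 + alpha*lam*X1) lies in the set of Lemma 4.14 for lam >= 1
scaled = (X0 + alpha * lam * X1) / lam
limit = scaled.applyfunc(lambda e: sp.limit(e, lam, sp.oo))
print(limit)  # Matrix([[alpha]]), i.e. alpha * X1
```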
To state the next lemma, let ψ : R^d → R^d be a diffeomorphism. For any V ∈ H, we may define a vector field ψ_∗(V) by

ψ_∗(V)(x) = Dψ(ψ^{−1}(x)) V(ψ^{−1}(x)),

where Dψ is the Jacobian of ψ. A diffeomorphism ψ : R^d → R^d is called a normalizer of H if ψ(x), ψ^{−1}(x) ∈ A_H(x, ≤ t) for all x ∈ R^d and all t > 0. The set of normalizers of H is denoted by Norm(H).
Lemma 4.15.
H ∼ ∪_{ψ ∈ Norm(H)} {ψ_∗(V) : V ∈ H}.

Proof. Notice that by the lemma immediately after Definition 5 of Chapter 2 of [7], if ψ is a normalizer of H using our definition, then it is also a normalizer using the definition given in [7]. The result then follows after applying Theorem 9 in Chapter 2 of [7] and using the fact that the identity map is a normalizer. □

Remark 4.16.
We will see in the proof of Lemma 4.6 that the limiting procedure used in our heuristic calculations is exactly of the type covered by Lemma 4.14. We will also see that the use of normalizers is very much in line with one's ability to flow along a constant vector field for positive or negative times (hence the ψ and ψ^{−1} in the definition of a normalizer).

Using repeated applications of Lemma 4.14 and Lemma 4.15, we now prove Lemma 4.6.

Proof of Lemma 4.6.
Let G = {X_0 + Σ_{j=1}^r u_j X_j : u_j ∈ R}. First note that it suffices to show that if V ∈ C_o and W ∈ C_e, then αV, λW ∈ sat(G) for all α ∈ R and all λ ≥ 0. The result would then follow by Lemma 4.14 since if V_1, V_2, …, V_k ∈ C_o and W_1, W_2, …, W_j ∈ C_e, then

Σ_{l=1}^k α_l V_l + Σ_{i=1}^j λ_i W_i ∈ sat(G)

for all α_l ∈ R and all λ_i ≥ 0.

We first show that αX_j ∈ sat(G) for all α ∈ R and j ∈ {1, …, r}. Indeed, by Lemma 4.14 we have

αX_j = lim_{λ→∞} λ^{−1}(X_0 + αλX_j) ∈ sat(G).

By induction, it is then enough to show that if V is a constant vector field with αV ∈ sat(G) for all α ∈ R and W ∈ sat(G) is a polynomial vector field, then

(α^{n(V,W)}/n(V,W)!) ad^{n(V,W)}_V(W) ∈ sat(G)

for all α ∈ R. To prove this result, we seek to apply Lemma 4.15. Since V is a constant vector field, let v = V(x) ∈ R^d denote its constant value. For α ∈ R, define a map ψ_α : R^d → R^d by ψ_α(x) = x − αv. Note that, for each α ∈ R, ψ_α is a normalizer for G. Hence, for each α ∈ R, Lemma 4.15 implies that (ψ_α)_∗(W) ∈ sat(G). Since Dψ_α is the identity matrix, notice that

(ψ_α)_∗(W)(x) = W(x + αv).

Applying Lemma 4.14, we thus find that for all α ∈ R,

V_α W := lim_{λ→∞} λ^{−n(V,W)} (ψ_{λα})_∗(W) ∈ sat(G).

To finish the proof, all we must see is that

V_α W = (α^{n(V,W)}/n(V,W)!) ad^{n(V,W)}_V(W).

Recalling that v ∈ R^d denotes the constant value of V, for x ∈ R^d fixed consider the function F : R → R^d defined by F(α) = W(x + αv). By induction, for j ≥ 0,

F^{(j)}(α) = ad^j_V(W)(x + αv),

where F^{(j)} is the j-th derivative of F with respect to α. Hence we obtain the formula

(ψ_α)_∗(W)(x) = F(α) = Σ_{j=0}^{n(V,W)} (α^j/j!) F^{(j)}(0) = Σ_{j=0}^{n(V,W)} (α^j/j!) ad^j_V(W)(x),

since each component of F(α) is a polynomial in α with degree ≤ n(V, W). Hence we now see that

V_α W = lim_{λ→∞} λ^{−n(V,W)} (ψ_{λα})_∗(W) = (α^{n(V,W)}/n(V,W)!) ad^{n(V,W)}_V(W),

completing the proof. □

Before proceeding to the second part of the argument, we state the following lemma, which we will need later.
Lemma 4.17.
Suppose that, for some x ∈ R^d, the Lie algebra generated by H evaluated at x spans the tangent space. Then for all t, ε > 0,

A_H(x, ≤ t + ε) ⊃ interior(closure(A_H(x, ≤ t))).

Proof. See Theorem 2 of Chapter 3 in [7]. □
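Returning to the proof of Lemma 4.6: the finite Taylor expansion of (ψ_α)_∗(W)(x) = W(x + αv) in iterated brackets can also be verified symbolically. A sketch; the fields V, W in R² below are illustrative choices of ours, not from the paper:

```python
import sympy as sp

x1, x2, a = sp.symbols('x1 x2 a')
V = sp.Matrix([1, 0])        # constant field with value v = e1
W = sp.Matrix([0, x1**2])    # polynomial field, so n(V, W) = 2

def bracket(Vf, Wf, coords=(x1, x2)):
    """Lie bracket [V, W] = DW·V - DV·W."""
    return sp.expand(Wf.jacobian(coords) * Vf - Vf.jacobian(coords) * Wf)

ad1 = bracket(V, W)          # ad_V(W)  = 2*x1 d/dx2
ad2 = bracket(V, ad1)        # ad_V^2(W) = 2 d/dx2 (constant)
shifted = W.subs(x1, x1 + a)           # (psi_a)_* W (x) = W(x + a*v)
taylor = W + a * ad1 + a**2 / 2 * ad2  # sum_{j<=2} a^j/j! ad_V^j(W)(x)
assert sp.expand(shifted - taylor) == sp.zeros(2, 1)
```

The top-order term a²/2 · ad²_V(W) is exactly the part that survives the rescaled limit defining V_a W.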
Strict Positivity.
The next two lemmas will operate as an easy-to-check criterion assuring that, for a given control H, M^x_t(H) is invertible. Though not necessary (see Remark 4.27), these results use the fact that G contains only polynomial vector fields. In particular, the special structure of zero sets of polynomials is employed in the following lemma.

Lemma 4.18.
Suppose that C is d-dimensional and let H = ∪_{m=1}^r {X_m, [X_0, X_m]}. Then for any non-empty open A ⊂ R^d, the set of points in R^d given by

(4.19) ∪_{x ∈ A} {V(x) : V ∈ H}

is d-dimensional.

Proof. Suppose that the subspace spanned by the set in (4.19) has dimension l ≤ d and choose a basis v_1, v_2, …, v_l ∈ R^d for this subspace. The goal is to show that l = d. Let V_1, V_2, …, V_l be the constant vector fields with constant values v_1, v_2, …, v_l, respectively. Notice that every vector field V in the span of H is a polynomial vector field and satisfies the following equality on the open set A:

(4.20) V = p_1 V_1 + p_2 V_2 + ⋯ + p_l V_l

for some polynomials p_1, …, p_l. Since A is open and V is a polynomial vector field, (4.20) is valid everywhere on R^d. Moreover, since vector fields of the form (4.20) are closed under commutators and linear combinations, we see that

span(C) ⊂ span{v_1, v_2, …, v_l}.

Note that this finishes the proof since C is d-dimensional. □
To set up the statement of the next result, define K^x_t(H) ⊂ R^d as follows:

(4.21) K^x_t(H) = ∪_{m=1}^r { X_m(Φ^x_s(H)), [X_0, X_m](Φ^x_s(H)) : s ∈ (0, t) }.

Lemma 4.22. Suppose that K^x_t(H) is d-dimensional. Then the associated matrix M^x_t(H) is invertible.

Proof. It suffices to show that M^x_t(H) is positive definite. Assume, to the contrary, that M^x_t(H) is not positive definite and let ⟨·, ·⟩ denote the inner product on R^d. Then there exists y ∈ R^d \ {0} such that

0 = ⟨M^x_t(H) y, y⟩ = Σ_{m=1}^r ∫_0^t ⟨J^x_{s,t}(H) X_m, y⟩² ds.

To get a contradiction, we seek to obtain a positive lower bound on ⟨M^x_t(H) y, y⟩ using the equality above. To derive such a bound, first observe that for 0 ≤ s ≤ u ≤ t, J^x_{s,t}(H) = J^x_{u,t}(H) J^x_{s,u}(H), and that the matrix J^x_{s,t}(H) is invertible. Using these two facts, it is not hard to check that for 0 ≤ s ≤ t,

(4.23) ∂_s J^x_{s,t}(H) = −J^x_{s,t}(H) DX_0(Φ^x_s(H)), J^x_{t,t}(H) = Id_{d×d}.

Letting |·| denote the Euclidean norm on R^d, we then see that for all m ∈ {1, …, r}, u ∈ (0, t) and ε ∈ (0, min(u, t − u)),

(4.24) 0 = ⟨M^x_t(H) y, y⟩ ≥ ∫_0^t ⟨J^x_{s,t}(H) X_m, y⟩² ds ≥ ∫_{u−ε}^{u+ε} ⟨J^x_{s,t}(H) X_m, y⟩² ds = ∫_{u−ε}^{u+ε} ⟨J^x_{s,u}(H) X_m, (J^x_{u,t}(H))* y⟩² ds ≥ |(J^x_{u,t}(H))* y|² inf_{y′ : |y′|=1} ∫_{u−ε}^{u+ε} ⟨J^x_{s,u}(H) X_m, y′⟩² ds.

Since |(J^x_{u,t}(H))* y| > 0 for every nonzero y ∈ R^d, it suffices to show that for all nonzero y ∈ R^d there exist m ∈ {1, …, r}, u ∈ (0, t) and ε ∈ (0, min(u, t − u)) such that

(4.25) ∫_{u−ε}^{u+ε} ⟨J^x_{s,u}(H) X_m, y⟩² ds > 0.

Thus let y ∈ R^d, y ≠ 0, be arbitrary. By hypothesis, either ⟨X_m, y⟩ ≠ 0 for some m ∈ {1, …, r} or ⟨[X_0, X_m](Φ^x_{t_0}(H)), y⟩ ≠ 0 for some m ∈ {1, …, r} and t_0 ∈ (0, t). Clearly, if ⟨X_m, y⟩ ≠ 0 for some m, then there is nothing to show by continuity and (4.25). Thus suppose that ⟨X_m, y⟩ = 0 for all m = 1, 2, …, r and pick t_0 ∈ (0, t) and m ∈ {1, …, r} such that

⟨z, y⟩ := ⟨[X_m, X_0](Φ^x_{t_0}(H)), y⟩ ≠ 0.

Since ⟨X_m, y⟩ = 0, using the definition of J^x_{s,t}(H) twice we see that

⟨J^x_{s,t_0}(H) X_m, y⟩ = ∫_s^{t_0} ⟨DX_0(Φ^x_u(H)) J^x_{s,u}(H) X_m, y⟩ du
= ∫_s^{t_0} ⟨[X_m, X_0](Φ^x_u(H)), y⟩ du + ∫_s^{t_0} ⟨ DX_0(Φ^x_u(H)) ∫_s^u DX_0(Φ^x_v(H)) J^x_{s,v}(H) X_m dv, y ⟩ du.

Therefore, for s sufficiently close to t_0, ⟨J^x_{s,t_0}(H) X_m, y⟩ ≠ 0. Hence continuity then implies that for ε ∈ (0, t_0) small enough,

∫_{t_0−ε}^{t_0} ⟨J^x_{s,t_0}(H) X_m, y⟩² ds > 0,

finishing the proof. □

We now use the previous two results and Lemma 4.6 to prove Theorem 2.5.
Proof of Theorem 2.5.
We first prove Theorem 2.5 part (b) and then show how part (a) follows by a similar argument. Therefore suppose that y ∈ R^d is an equilibrium point of G and that x, z ∈ R^d are such that y ∈ D(x) and z ∈ D(y). By Lemma 4.5, our goal is to exhibit H_· = ∫_0^· h_s ds, h ∈ L²([0, t] : R^r), such that Φ^x_t(H) = z and M^x_t(H) is invertible. To ensure that M^x_t(H) is invertible, we will build H_· in such a way so as to "twist" the path of Φ^x_·(H) from x to z.

We first claim that there exist countably many non-empty disjoint open subsets U_l, l ≥ 0, with the property that

(4.26) U_{l+1} ⊂ ∪_{w ∈ U_l} D(w)

for all l ≥ 0. Suppose first that D(x) = R^d. Then it follows that D(x′) = R^d for all x′ ∈ R^d, so in this case we may simply take the U_l to be any countable collection of disjoint non-empty open subsets of R^d. If D(x) ≠ R^d, then since y ∈ D(x), write y = x + Σ_{j=1}^k α_j y_j + Σ_{j=k+1}^d λ_j y_j for some α_j ∈ R and λ_j > 0. Let λ = min_j λ_j > 0, α_0 = 0 and α_l = Σ_{k=1}^l 2^{−k} for l ≥ 1. Note that for l ≥ 0, the sets

U_l = x + span{y_1, …, y_k} + { μ_{k+1} y_{k+1} + ⋯ + μ_d y_d : μ_j ∈ (α_l λ, α_{l+1} λ) }

are disjoint, open and satisfy (4.26). This finishes the proof of the claim.

By construction of the sets U_l, l ≥ 0, and Lemma 4.18, there exist points x_{r+l} ∈ U_l, l = 1, …, j, such that

∪_{m=1}^r { x_1, …, x_r, [X_m, X_0](x_{r+1}), …, [X_m, X_0](x_{r+j}) }

is d-dimensional. Here, recall that x_1, …, x_r are the constant values of X_1, …, X_r, respectively. Moreover, x_{r+1} ∈ D(x), y ∈ D(x_{j+r}) and x_{l+1+r} ∈ D(x_{l+r}) for all l = 1, …, j − 1.

We now show that we can build H_· so that the path Φ^x_·(H) passes through each of these points prior to time t > 0 and satisfies Φ^x_t(H) = z. Observe that Lemma 4.6 and Lemma 4.17 imply that A(w, ≤ s) ⊃ D(w) for all w ∈ R^d and all s > 0. Hence by definition of A(w, ≤ s), there exist positive times t_1, t_2, …, t_{j+1} with Σ_{l=1}^{j+1} t_l < t and corresponding H_l(·) = ∫_0^· h_l(s) ds, h_l ∈ L²([0, t_l] : R^r), such that Φ^x_{t_1}(H_1) = x_{r+1}, Φ^{x_{r+l}}_{t_{l+1}}(H_{l+1}) = x_{r+l+1} for l = 1, …, j − 1, and Φ^{x_{r+j}}_{t_{j+1}}(H_{j+1}) = y. By piecing together the H_l's, this now gives us the path from x to y. For the rest of the path, we may also pick a positive time t_{j+3} < t and H_{j+3}(·) = ∫_0^· h_{j+3}(s) ds, h_{j+3} ∈ L²([0, t_{j+3}] : R^r), such that Φ^y_{t_{j+3}}(H_{j+3}) = z. Moreover, since y is an equilibrium point of G, letting t_{j+2} = t − (t_1 + ⋯ + t_{j+1} + t_{j+3}) > 0, we may choose H_{j+2}(·) = ∫_0^· h_{j+2}(s) ds, h_{j+2} ∈ L²([0, t_{j+2}] : R^r), such that Φ^y_{t_{j+2}}(H_{j+2}) = y. By Lemma 4.22, we now obtain the conclusion in part (b).

To prove part (a), simply let z = y in the first argument and, for an arbitrary T > 0, choose t < T. Note that this now finishes the proof of Theorem 2.5. □
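The role of Lemma 4.22 can be illustrated numerically. The following sketch uses a toy linear system of our own choosing (not from the paper): drift X_0(x) = (x_2, −x_1), a single constant noise field X_1 = e_1, and control H ≡ 0, so that J^x_{s,t} = e^{A(t−s)} with A = DX_0. Since [X_1, X_0] = A e_1 ≠ 0, the set K^x_t spans R², and the covariance matrix M^x_t should come out positive definite:

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # DX0 for the linear drift X0(x) = (x2, -x1)
X1 = np.array([1.0, 0.0])                 # constant noise direction e1
t, N = 1.0, 4000
dt = t / N

M = np.zeros((2, 2))
for k in range(N):
    tau = t - k * dt                      # tau = t - s
    J = np.array([[np.cos(tau), np.sin(tau)],
                  [-np.sin(tau), np.cos(tau)]])  # J_{s,t} = e^{A*tau} (a rotation)
    v = J @ X1
    M += np.outer(v, v) * dt              # Riemann sum for (M^x_t)_{lm}

eigs = np.linalg.eigvalsh(M)
print(eigs.min() > 0.0)  # True: M^x_t is positive definite, hence invertible
```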
Remark 4.27.
Without using the special structure of polynomial vector fields, one can prove Theorem 2.5 alternatively by choosing the path from x to y differently, as follows. Define

D(x, y) = D(x) \ D(y) if D(x) ≠ R^d, and D(x, y) = R^d otherwise,

and let y′ ∈ D(x, y) be arbitrary. Since D(x, y) is open, let δ > 0 be such that B_δ(y′) ⊂ D(x, y). By the support theorems [15, 16], there exists s ∈ (0, t/4) such that for all n large enough,

P_x{ s < τ_n, x_s ∈ B_δ(y′) } > 0.

Now recall that W_s = (W^1_s, …, W^r_s) is an r-dimensional standard Wiener process defined on the probability space (Ω, F, P). In this remark, we identify the set Ω with the space of continuous paths C([0, ∞) : R^r). Letting M^x_t(W(ω)) denote the matrix M^x_t(H) when H_s = (W^1_s(ω), …, W^r_s(ω)), we note that by Malliavin's proof of Hörmander's theorem [8, 11],

P_x{ s < τ_n, x_s ∈ B_δ(y′), M^x_s(W) invertible } = P_x{ s < τ_n, x_s ∈ B_δ(y′) } > 0

for n sufficiently large. Therefore, fix ω ∈ { s < τ_n, x_s ∈ B_δ(y′), M^x_s(W(ω)) invertible } and define H_s = (W^1_s(ω), …, W^r_s(ω)) on the time interval [0, s]. Hence Φ^x_s(H) ∈ B_δ(y′). Since

y ∈ ∩_{w ∈ B_δ(y′)} D(w),

we may pick H̃ such that Φ^{Φ^x_s(H)}_{s′}(H̃) = y for some s′ < t. We can complete our path from y to z in exactly the same way as in the proof of Theorem 2.5. Invertibility of the covariance matrix for our chosen control at time t follows immediately since M^x_s(W(ω)) is invertible. See Theorem 8.1 in [10] for a similar argument.

Remark 4.28.
Yet another way to prove Theorem 2.5 is to use a Feynman–Kac representation of the probability density function p^n_t(x, z). Indeed, fixing n ∈ N and x ∈ B_n(0), observe that the time-reversed density q^n_s(x, z) = p^n_{t−s}(x, z) solves the following PDE:

∂q^n_s/∂s = −L*_z q^n_s on [0, t) × B_n(0),

where L*_z is the formal adjoint (in the z variable) of the Markov generator L corresponding to the diffusion x_t. Now consider the process y_t solving

dy_t = −X_0(y_t) dt − Σ_{j=1}^r X_j dW^j_t

and let T_n = inf{ t > 0 : |y_t| ≥ n }. It then follows that we may write p^n_t(x, z) as

p^n_t(x, z) = q^n_0(x, z) = E_z [ e^{∫_0^{s∧T_n} f(y_u) du} q^n_{s∧T_n}(x, y_{s∧T_n}) ]

for some f ∈ C^∞(R^d : R). One can now use the expression above coupled with the support theorems [15, 16] applied to the time-reversed process y_t to bound p^n_t(x, z) from below by a positive quantity.

We finish this section by proving Theorem 2.9 as a consequence of Theorem 2.5 (a).

Proof of Theorem 2.9.
Let μ be an invariant probability measure for the Markov process x_t defined by (2.1). Again, since C is contained in the Lie algebra generated by X_1, …, X_r, [X_1, X_0], …, [X_r, X_0] and C is d-dimensional, it follows by Hörmander's theorem [4] that μ(dx) = m(x) dx for some nonnegative function m ∈ C^∞(R^d). Recall also that, for the same reasons, the Markov process x_t defined by (2.1) has a probability density function p_t(x, y) with respect to Lebesgue measure on R^d which is smooth for (t, x, y) ∈ (0, ∞) × R^d × R^d. Since μ is an invariant probability measure, we have the following relation for almost every z ∈ R^d and every t > 0:

m(z) = ∫_{R^d} m(y) p_t(y, z) dy.

We now use this relation to prove the positivity assertion. Let x ∈ supp(μ), so that μ(B_δ(x)) > 0 for every δ > 0. By smoothness of the density m, for each δ > 0 there exists x_0 = x_0(δ) ∈ B_δ(x) such that m(x_0) > 0. Since m is smooth, in particular continuous, there exist γ > 0 and ε > 0 such that B_γ(x_0) ⊂ B_δ(x) and m(y) ≥ ε > 0 for all y ∈ B_γ(x_0). Hence for almost every z ∈ R^d we have

m(z) ≥ ∫_{B_γ(x_0)} m(y) p_t(y, z) dy ≥ ε ∫_{B_γ(x_0)} p_t(y, z) dy.

To bound p_t(y, z) from below, there are two cases. First suppose that D(x_0) = R^d. Then by definition of D(x_0), we have that C_o is d-dimensional, and hence D(y) = R^d for all y ∈ R^d. Theorem 2.5 (a) implies that for any y ∈ B_γ(x_0) and z ∈ D(x_0) there exists t > 0 such that p_t(x_0, z) > 0. Since the transition density is a continuous function in all of its arguments, there exists an open neighborhood U of (t, x_0, z) in (0, ∞) × B_γ(x_0) × R^d such that p_s(x′, z′) ≥ c > 0 for all (s, x′, z′) ∈ U. In particular, for almost every y in an open ball centered at z,

m(y) ≥ εc > 0.

Since m is continuous, it follows that m(z) ≥ εc > 0. For the second case, suppose that D(x_0) ≠ R^d. In particular, this implies that C_o has dimension l < d and x_0 ∉ D(x_0). Take z ∈ D(x_0) and decrease δ > 0 if necessary so that z ∈ D(y) for all y ∈ B_δ(x_0). Following now in the same way as in the previous case, we finish the proof of the result. □

Appendix
Here we prove Lemma 4.5. We recall that this result is a slight modification of the criterion for positivity of the density given by Ben Arous and Léandre [2], which was applied without proof in Section 4. Such an extension is needed in this paper since the drift vector field X_0 was not assumed to be globally Lipschitz and its derivatives were not assumed to be globally bounded.

The proof of Lemma 4.5 is almost identical to (and in some parts simpler than) the proof of Proposition 4.2.2 of [12]. The basic difference needed to remove these assumptions on X_0 is that we need to compare the stopped process x_{t∧τ_n} with another process x^{(n)}_t such that x^{(n)}_t solves an SDE whose coefficients satisfy the required Lipschitz and boundedness conditions and x_{t∧τ_n} = x^{(n)}_{t∧τ_n} for all t ≥ 0. This localization procedure is relatively standard but we include the details for completeness.

To do such a comparison, for any integer n ≥ 1 let X^{(n)}_0 be a C^∞ vector field on R^d satisfying

X^{(n)}_0(x) = X_0(x) for |x| ≤ n, X^{(n)}_0(x) = 0 for |x| ≥ n + 1.

For x ∈ R^d, n ∈ N, t > 0 and H = (H^j) ∈ C([0, t] : R^r), let Φ^{x,n}_t(H) denote the solution of the equation

Φ^{x,n}_t(H) = x + ∫_0^t X^{(n)}_0(Φ^{x,n}_s(H)) ds + Σ_{j=1}^r X_j H^j_t.

Let J^{x,n}_{s,t} = J^{x,n}_{s,t}(H) denote the d × d matrix-valued solution of the equation

J^{x,n}_{s,t} = Id_{d×d} + ∫_s^t DX^{(n)}_0(Φ^{x,n}_u(H)) J^{x,n}_{s,u} du,

and let M^{x,n}_t(H) denote the matrix

(M^{x,n}_t(H))_{lm} = Σ_{j=1}^r ∫_0^t (J^{x,n}_{s,t}(H) X_j)_l (J^{x,n}_{s,t}(H) X_j)_m ds.

Proof of Lemma 4.5.
As in [12], our goal is to use Malliavin calculus to bound p^n_t(x, z) from below by a quantity which is positive if the covariance matrix M^{x,n}_t(H) is invertible. For brevity of notation during this proof, we will write the functional Φ^{x,n}_t(·) simply as Φ(·). Let H_· = ∫_0^· h_u du, h ∈ L²([0, ∞) : R^r), be as in the statement of the lemma and let k_l(s) denote the l-th row of the matrix k_{lj}(s) = (J^{x,n}_{s,t}(H) X_j)_l. For y ∈ R^d, let

(T_y W)(t) = W(t) + Σ_{l=1}^d y_l ∫_0^t k_l(s) ds

and g(y, W) = Φ(T_y W) − Φ(W), where W(t) = (W^1(t), …, W^r(t)) denotes the standard r-dimensional Wiener process on (Ω, F, P). For β > 1, define cutoff functions K_β, α_β ∈ C^∞(R : [0, 1]) with

K_β(x) = 0 for |x| ≥ β, K_β(x) = 1 for |x| ≤ β − 1, α_β(x) = 0 for |x| ≤ (2β)^{−1}, α_β(x) = 1 for |x| ≥ β^{−1},

and set

H_β = K_β(‖g(·, W)‖_{C²(B_1(0) : R^d)}) α_β(|det ∂_j g_i(0)|).

Under our assumptions, one can check (see [13], Example 1.2.1, Theorem 2.2.2 and surrounding text) that g(·, W(ω)) ∈ C^∞(R^d) for a.e. ω ∈ Ω.

Now let f : R^d → [0, ∞) be bounded and measurable, and let ρ : R^d → (0, ∞) be a measurable function satisfying ∫_{R^d} ρ(y) dy = 1. Observe that

E_x f(x_{t∧τ_n}) = ∫_{R^d} E_x f(x_{t∧τ_n}) ρ(y) dy = ∫_{R^d} E[ f(Φ(W)) 1_{‖Φ(W)‖_t ≤ n} ] ρ(y) dy,

where

{‖Φ(W)‖_t ≤ n} = { ω ∈ Ω : sup_{s ∈ [0,t]} |Φ^{x,n}_s(W(ω))| ≤ n }.

Girsanov's theorem then gives

∫_{R^d} E[ f(Φ(W)) 1_{‖Φ(W)‖_t ≤ n} ] ρ(y) dy = ∫_{R^d} E[ f(Φ(T_y W)) 1_{‖Φ(T_y W)‖_t ≤ n} G(y) ] ρ(y) dy,

where G(y) > 0 denotes the corresponding Girsanov density. Hence, with c_β > 0 as below,

E_x f(x_{t∧τ_n}) ≥ ∫_{R^d} E[ f(Φ(T_y W)) 1_{‖Φ(T_y W)‖_t ≤ n} G(y) ] ρ(y) dy
≥ E[ H_β ∫_{|y| ≤ c_β} f(g(y) + Φ(W)) 1_{‖Φ(T_y W)‖_t ≤ n} G(y) ρ(y) dy ]
≥ E[ H_β 1_{ sup_{|y| ≤ c_β} ‖Φ(T_y W)‖_t ≤ n } ∫_{|y| ≤ c_β} f(g(y) + Φ(W)) G(y) ρ(y) dy ].

Let A_β = { sup_{|y| ≤ c_β} ‖Φ(T_y W)‖_t ≤ n }. By Lemma 4.2.1 of [12], for any β > 1 there exist c_β ∈ (0, β^{−1}) and δ_β > 0 such that any G : B_1(0) → R^d with G(0) = 0, ‖G‖_{C²(B_1(0))} ≤ β and |det ∂_j G_i(0)| ≥ β^{−1} is a diffeomorphism from B_{c_β}(0) ⊂ R^d onto a neighborhood of B_{δ_β}(0) ⊂ R^d. In particular, we find that after changing variables twice,

E_x f(x_{t∧τ_n}) ≥ E[ H_β 1_{A_β} ∫_{|y| ≤ c_β} f(g(y) + Φ(W)) G(y) ρ(y) dy ]
≥ E[ H_β 1_{A_β} ∫_{|z| ≤ δ_β} f(z + Φ(W)) G(g^{−1}(z)) ρ(g^{−1}(z)) |det ∂_j g_i(g^{−1}(z))|^{−1} dz ]
= E[ H_β 1_{A_β} ∫_{|z − Φ(W)| ≤ δ_β} f(z) G(g^{−1}(z − Φ(W))) ρ(g^{−1}(z − Φ(W))) |det ∂_j g_i(g^{−1}(z − Φ(W)))|^{−1} dz ].
Therefore we deduce the following inequality:

p^n_t(x, z) ≥ E[ H_β 1_{A_β} 1_{|z − Φ(W)| ≤ δ_β} G(g^{−1}(z − Φ(W))) ρ(g^{−1}(z − Φ(W))) |det ∂_j g_i(g^{−1}(z − Φ(W)))|^{−1} ].

By construction, if H_β ≠ 0 and |z − Φ(W)| ≤ δ_β, then

G(g^{−1}(z − Φ(W))) ρ(g^{−1}(z − Φ(W))) |det ∂_j g_i(g^{−1}(z − Φ(W)))|^{−1} > 0.

Thus it remains to prove that for some β > 1 the event

A_β ∩ { |z − Φ(W)| ≤ δ_β, |det ∂_j g_i(0)| ≥ β^{−1}, ‖g(·, W)‖_{C²(B_1(0))} ≤ β − 1 }

has positive probability. Note that this can be shown by following exactly the same line of reasoning starting in the last paragraph of p. 1777 of [10]. □

References

[1] Avanti Athreya, Tiffany Kolba, and Jonathan C. Mattingly. Propagating Lyapunov functions to prove noise-induced stability. arXiv:1111.1755, pages 1–41, 2011.
[2] G. Ben Arous and R. Léandre. Décroissance exponentielle du noyau de la chaleur sur la diagonale. II.
Probab. Theory Related Fields, 90(3):377–402, 1991.
[3] Jeremiah Birrell, David P. Herzog, and Jan Wehr. The transition from ergodic to explosive behavior in a family of stochastic differential equations. Stochastic Processes and their Applications, 122(4):1519–1539, 2012.
[4] Lars Hörmander. Hypoelliptic second order differential equations. Acta Math., 119:147–171, 1967.
[5] V. Jurdjevic and I. Kupka. Control systems on semisimple Lie groups and their homogeneous spaces. Ann. Inst. Fourier (Grenoble), 31(4):vi, 151–179, 1981.
[6] V. Jurdjevic and I. Kupka. Polynomial control systems. Math. Ann., 272(3):361–368, 1985.
[7] Velimir Jurdjevic. Geometric control theory, volume 52 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1997.
[8] S. Kusuoka and D. Stroock. Applications of the Malliavin calculus. II. J. Fac. Sci. Univ. Tokyo Sect. IA Math., 32(1):1–76, 1985.
[9] J. C. Mattingly, A. M. Stuart, and D. J. Higham. Ergodicity for SDEs and approximations: locally Lipschitz vector fields and degenerate noise. Stochastic Process. Appl., 101(2):185–232, 2002.
[10] Jonathan C. Mattingly and Étienne Pardoux. Malliavin calculus for the stochastic 2D Navier-Stokes equation. Comm. Pure Appl. Math., 59(12):1742–1790, 2006.
[11] James Norris. Simplified Malliavin calculus. In Séminaire de Probabilités, XX, 1984/85, volume 1204 of Lecture Notes in Math., pages 101–130. Springer, Berlin, 1986.
[12] David Nualart. Analysis on Wiener space and anticipating stochastic calculus. In Lectures on probability theory and statistics (Saint-Flour, 1995), volume 1690 of Lecture Notes in Math., pages 123–227. Springer, Berlin, 1998.
[13] David Nualart. The Malliavin calculus and related topics. Probability and its Applications (New York). Springer-Verlag, Berlin, second edition, 2006.
[14] Marco Romito. Ergodicity of the finite dimensional approximation of the 3D Navier-Stokes equations forced by a degenerate noise. J. Statist. Phys., 114(1-2):155–177, 2004.
[15] D. Stroock and S. R. S. Varadhan. On degenerate elliptic-parabolic operators of second order and their associated diffusions. Comm. Pure Appl. Math., 25:651–713, 1972.
[16] Daniel W. Stroock and S. R. S. Varadhan. On the support of diffusion processes with applications to the strong maximum principle. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Vol. III: Probability theory, pages 333–359. Univ. California Press, Berkeley, Calif., 1972.