A practical criterion for positivity of transition densities
DAVID P. HERZOG AND JONATHAN C. MATTINGLY
Abstract.
We establish a simple criterion for locating points where the transition density of a degenerate diffusion is strictly positive. Throughout, we assume that the diffusion satisfies a stochastic differential equation (SDE) on R^d with additive noise and polynomial drift. In this setting, we will see that it is often the case that local information of the flow, e.g. the Lie algebra generated by the vector fields defining the SDE at a point x ∈ R^d, determines where the transition density is strictly positive. This is surprising in that positivity is a more global property of the diffusion. This work primarily builds on and combines the ideas of Ben Arous and Léandre [2] and Jurdjevic and Kupka [6].

1. Introduction
The goal of this paper is to develop an easily applicable framework for locating points where the probability density of a degenerate diffusion is strictly positive. We will focus on the setting where the diffusion satisfies a stochastic differential equation (SDE) on R^d where each component of the drift is a polynomial in the standard Euclidean coordinates and the noise is additive. Our methods reduce finding points of positivity to computing a certain collection of constant vector fields generated by taking iterated commutators of the vector fields defining the SDE. This is convenient since a similar computation is typically used to show that the diffusion has a smooth probability density function p_t(x, y) with respect to Lebesgue measure dy. While the existence of a smooth density is decided locally, we show that in some settings the bracket computation also determines the more global property of where the density is strictly positive. Additionally, uncovering sufficiently large regions of positivity is useful for proving unique ergodicity.

While methods already exist for proving positivity of transition densities, most require knowledge of attainable sets via controls. Here we have structured our assumptions to require as little global control information as possible. In particular, our results prove smoothness of the densities, the needed control statements, and positivity, all with one set of primarily local assumptions.

Although our general framework is limited to SDEs with polynomial drift and additive noise, working within such boundaries is reasonable in many applications. In particular, to illustrate the utility of our results, we will apply them to a collection of examples, each with quite different structure. Moreover, for the equations considered, either new results will be obtained or existing results will be improved upon.

The ideas used in this note build on a number of existing works.
Beyond the now classical theory of Hörmander [4] on hypoelliptic operators in "sum of squares" form, we use the associated probabilistic techniques of Malliavin calculus [12]. We also use a number of ideas from geometric control theory [7]. Moreover, we modify the idea that odd-powered polynomial vector fields are "good" (due to their time-reversal properties) and even-powered polynomial vector fields are "bad" [6]. Similar ideas were critical in the work of Romito [14]. We also integrate into our results the powerful ideas of Ben Arous and Léandre [2] for proving positivity of densities of random variables over a Wiener space. Our hope is that by bringing these ideas together and adapting them to our specific context, we will provide a useful tool for many applied equations.

The layout of this paper is as follows. In Section 2, we introduce notation and terminology and state the main general results of the paper. In Section 3, we apply our results to specific examples. Section 4 contains heuristic discussions of why the main results hold and are natural. We also include a "non-example", that is, an example where the main results fail to apply yet the corresponding density has regions of positivity (in space and time), and illustrate how to adapt the general theory in such cases. Additionally, Section 4 contains the proofs of the main results as stated in Section 2.

Acknowledgements
The authors would like to thank Avanti Athreya, Richard Durrett, Tiffany Kolba, James Nolen, and Jan Wehr for helpful conversations on the topic of this paper. DPH would also like to thank Martin Hairer for suggesting the paper [6], from which his understanding of these ideas began and which led to the current collaboration. We would also like to acknowledge partial support of the NSF through grant DMS-08-54879 and the Duke University Dean's office.

2. Notation, Terminology and Main Results
Throughout, we study stochastic differential equations on R^d of the following form:

dx_t = X_0(x_t) dt + Σ_{j=1}^r X_j dW_t^j    (2.1)

where X_0 is a polynomial vector field; that is, X_0 = Σ_{j=1}^d X_0^j(x) ∂_{x_j} is such that each map x ↦ X_0^j(x) is a polynomial in the standard Euclidean coordinates; X_1, …, X_r are constant vector fields, that is, they do not depend on the base point; and W_t^1, W_t^2, …, W_t^r are standard independent real Wiener processes defined on a probability space (Ω, F, P).

To deal with the issue of finite-time explosion in (2.1), we will need to stop the process x_t prior to the time of explosion. Thus for n ∈ N, let B_n(0) denote the open ball of radius n centered at the origin in R^d, and define the stopping times

τ_n = inf{ t > 0 : x_t ∉ B_n(0) }  and  τ_∞ = lim_{n↑∞} τ_n.

Our results will be stated for the stopped processes x_{t∧τ_n}, n ∈ N. Of course, x_{t∧τ_n} coincides with x_t for all times t ≤ τ_n.

For vector fields V = Σ_{j=1}^d V^j(x) ∂/∂x_j and W = Σ_{j=1}^d W^j(x) ∂/∂x_j, let

ad_V^0(W) = W,  ad_V(W) = [V, W] := Σ_{j=1}^d ( Σ_{k=1}^d V^k(x) ∂W^j(x)/∂x_k − W^k(x) ∂V^j(x)/∂x_k ) ∂/∂x_j.
Inductively, for m ≥ 1, ad_V^m(W) = ad_V(ad_V^{m−1}(W)). For a set of vector fields G on R^d, span(G) denotes the R-linear span of G and

cone_{≥0}(G) = { Σ_{i=1}^j λ_i V_i : j ∈ N, λ_i ≥ 0, V_i ∈ G }.

We call x ∈ R^d an equilibrium point of a set of vector fields G if V(x) = 0 for some V ∈ G. If V is a constant vector field with constant value v ∈ R^d and W is a polynomial vector field, then we may define a map from R into R^d given by λ ↦ (W^j(λv))_{j=1}^d. Note that since W is a polynomial vector field, (W^j(λv))_{j=1}^d is a vector of polynomials in λ. Let n(V, W) be the maximal degree among these polynomials (for purposes below, we assume that the zero polynomial has neither even nor odd degree). We call n(V, W) the relative degree of V and W.

We now introduce the set of constant vector fields C which will play a fundamental role throughout the paper. It will be defined as the subset of constant vector fields in a larger set of vector fields which we now introduce. To initialize the inductive procedure, let G_0 = span{X_1, …, X_r} and

G_1^o = G_0 ∪ { ad_V^{n(V,X_0)}(X_0) : V ∈ G_0, n(V, X_0) odd },
G_1^e = { ad_V^{n(V,X_0)}(X_0) : V ∈ G_0, n(V, X_0) even },
G_1 = span(G_1^o) + cone_{≥0}(G_1^e).

For j ≥ 1, we define G_{j+1}^o, G_{j+1}^e, G_{j+1} inductively as

G_{j+1}^o = G_j^o ∪ { ad_V^{n(V,W)}(W) : V ∈ G_j^o constant, W ∈ G_j, n(V, W) odd },
G_{j+1}^e = G_j^e ∪ { ad_V^{n(V,W)}(W) : V ∈ G_j^o constant, W ∈ G_j, n(V, W) even },
G_{j+1} = span(G_{j+1}^o) + cone_{≥0}(G_{j+1}^e).

Let C^o denote the set of constant vector fields in ∪_j G_j^o and let C^e denote the set of constant vector fields in ∪_j G_j^e. Finally, define

C = span(C^o) + cone_{≥0}(C^e).    (2.2)

Remark 2.3.
Throughout, we will often identify a constant vector field on R^d with the vector in R^d which defines it. For example, depending on the context, C^o will be used to denote either the set of vector fields C^o defined above or the set of vectors v ∈ R^d such that v = V(x) for some V ∈ C^o.

Remark 2.4.
The primary assumption we will make is that C is d-dimensional. This is equivalent to assuming that C spans the entire tangent space at every point x ∈ R^d, as C contains only constant vector fields. Since C is contained in the Lie algebra generated by X_1, …, X_r, [X_1, X_0], …, [X_r, X_0], it follows by Hörmander's hypoellipticity theorem [4] that for every n ≥ 1, x ∈ B_n(0) and every Borel set A ⊂ B_n(0),

P_x{ x_{t∧τ_n} ∈ A } = ∫_A p_t^n(x, y) dy

for some nonnegative function p_t^n(x, y) which is defined and smooth on (0, ∞) × B_n(0) × B_n(0). Here we recall that B_n(0) is the open ball of radius n centered at the origin in R^d. Certainly, the transition kernel of x_{t∧τ_n} contains a singular component concentrated on the boundary of B_n(0). However, this is invisible to sets contained in B_n(0) since B_n(0) is open.
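The inductive construction of C above is mechanical, and its two basic ingredients — the relative degree n(V, W) and the iterated bracket ad_V^m(W) — are easy to automate with a computer algebra system. The following sketch does this with sympy for a toy planar polynomial drift of our own choosing; the helper names are ours, not the paper's.

```python
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam')
coords = [x1, x2]

def lie_bracket(V, W):
    # [V, W]^j = sum_k V^k dW^j/dx_k - W^k dV^j/dx_k
    return [sp.expand(sum(V[k]*sp.diff(W[j], coords[k])
                          - W[k]*sp.diff(V[j], coords[k])
                          for k in range(len(coords))))
            for j in range(len(coords))]

def relative_degree(v, W):
    # n(V, W): maximal degree in lam among the polynomials W^j(lam*v),
    # ignoring components that are identically zero
    subs = {coords[j]: lam*v[j] for j in range(len(coords))}
    polys = [sp.expand(Wj.subs(subs)) for Wj in W]
    degs = [sp.Poly(p, lam).degree() for p in polys if p != 0]
    return max(degs) if degs else None

def iterated_bracket(V, W, m):
    # ad_V^m(W)
    for _ in range(m):
        W = lie_bracket(V, W)
    return W

# Toy polynomial drift X0 = (x2^2 - x1^2) d/dx1 - x1*x2 d/dx2 and the
# constant field X1 = d/dx2, written componentwise:
X0 = [x2**2 - x1**2, -x1*x2]
X1 = [sp.Integer(0), sp.Integer(1)]

n = relative_degree([0, 1], X0)
print(n)                            # 2, so n(X1, X0) is even
print(iterated_bracket(X1, X0, n))  # [2, 0], i.e. the constant field 2 d/dx1
```

Since the relative degree here is even, the resulting constant field 2∂_{x_1} lands in the cone part C^e of C, so it contributes only nonnegative multiples to C.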
We now state the main general result of the paper.
Theorem 2.5.
Suppose that C is d-dimensional and let {y_1, …, y_d} ⊂ C be a basis of C such that {y_1, …, y_k} ⊂ C^o and {y_{k+1}, …, y_d} ⊂ C^e. For x ∈ R^d, define the set

D(x) = {x} + { Σ_{i=1}^k α_i y_i + Σ_{j=k+1}^d λ_j y_j : α_i ∈ R, λ_j > 0 },

and suppose that x, z ∈ R^d are such that z ∈ D(x).

(a) For all T > 0 there exist t ∈ (0, T) and N ∈ N such that p_t^n(x, z) > 0 for all n ≥ N.

(b) If there exists an equilibrium point y ∈ R^d of G = { X_0 + Σ_{j=1}^r u_j X_j : u_j ∈ R } such that y ∈ D(x) and z ∈ D(y), then for all T > 0 there exists N ∈ N such that p_t^n(x, z) > 0 for all t ≥ T, n ≥ N.

Remark 2.6.
Suppose that C is d-dimensional and that x_t is non-explosive; that is, for every x ∈ R^d,

P_x{ τ_∞ < ∞ } = 0.

Then x_t has a probability density function p_t(x, y) with respect to Lebesgue measure dy which is smooth on (0, ∞) × R^d × R^d. Moreover, all conclusions of Theorem 2.5 hold with p_t^n(x, z) replaced by p_t(x, z).

Remark 2.7.
Even if C is d-dimensional, it is still possible that the set D(x) cannot be chosen to be the entire space R^d. See Example 3.4 in Section 3.

Remark 2.8.
It is worth emphasizing that y ∈ R^d can be an equilibrium point of G without being an equilibrium point of the drift vector field X_0. For example, if X_0(y_1, y_2) = ( g(y_1, y_2)(1 − y_2), f(y_1, y_2) ) for some scalar functions f, g and X_1 = (0, 1), then all points of the form (y_1, 1) are equilibrium points, since X_0(y_1, 1) + uX_1 = (0, 0) if u = −f(y_1, 1).

Theorem 2.9.
Suppose that C is d-dimensional and x_t is non-explosive. Let D(x) be as in the statement of Theorem 2.5. Then there is at most one invariant probability measure corresponding to the Markov process x_t defined by (2.1). Moreover, if such an invariant probability measure µ exists, then µ(dx) = m(x) dx for some smooth, non-negative function m, and if x ∈ supp(µ) then m(z) > 0 for all z ∈ D(x).

3. Examples
Before proving the main results, we apply them to specific examples to show their utility. A "non-example", that is, an example where Theorem 2.5 is not applicable, is given in the next section in Remark 4.11, as it fits in better with the discussion there.
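Before turning to the examples, it can help to see what positivity of a density looks like in simulation. The following Euler–Maruyama sketch integrates a Langevin-type SDE of the form (2.1); the double-well potential and all parameter values are illustrative choices of ours, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Euler-Maruyama for dx = (-gamma*x - U'(y)) dt + sigma dW, dy = x dt,
# with the illustrative double-well potential U(y) = y^4 - y^2.
gamma, sigma, dt, n_steps = 0.5, 1.0, 1e-3, 200_000
x, y = 0.0, 0.0
traj = np.empty((n_steps, 2))
for i in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt))
    x, y = (x + (-gamma*x - (4*y**3 - 2*y))*dt + sigma*dW,
            y + x*dt)
    traj[i] = (x, y)

# A strictly positive transition density is consistent with the path
# eventually entering any open set; here the y-component should explore
# both wells near y = +-1/sqrt(2).
print(traj[:, 1].min(), traj[:, 1].max())
```

Of course a simulation only suggests where the density is positive; the point of the results above is to certify it from the brackets alone.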
Example 3.1.
As a first example, we consider the Langevin dynamics on R^{2d}, d ≥ 1:

dx_t = [ −γ x_t − ∇F(y_t) ] dt + Σ_{j=1}^d σ_j dW_t^j    (3.2)
dy_t = x_t dt

where x_t, y_t ∈ R^d, γ > 0, F ∈ C^∞(R^d : R), σ_j ∈ R^d and the W_t^j are independent standard Wiener processes. So that solutions to (3.2) do not explode in finite time, we assume that F satisfies the one-sided Lipschitz condition and the concavity and growth assumptions of Condition 3.1 of [9]. A prototypical example of a potential which satisfies these assumptions is F(y) = |y|^4 − |y|^2.

As a consequence of Theorem 2.5, we now prove:

Corollary 3.3. If span{σ_1, …, σ_d} = R^d, then for all (x, y), (x', y') ∈ R^{2d} and t > 0,

p_t((x, y), (x', y')) > 0.

Proof.
Let 0 = (0, 0, …, 0) ∈ R^d and let G = { X_0 + Σ_{j=1}^d u_j X_j : u_j ∈ R } where

X_0(x, y) = ( −γx − ∇F(y), x )  and  X_j(x, y) = ( σ_j, 0 ).

We begin by computing C (defined in Section 2) corresponding to equation (3.2). Since n(X_j, X_0) = 1 for all j, we see that G_1^o ⊃ { [X_j, X_0] : j = 1, 2, …, d } and

[X_j, X_0](x, y) = ( −γσ_j, σ_j ).

Hence, in particular,

C ⊃ { X_j, [X_j, X_0] : j = 1, 2, …, d }.

Since the vectors σ_1, …, σ_d are linearly independent, it follows that C contains a basis of R^{2d}. Additionally, since C^o ⊃ { X_j, [X_j, X_0] : j = 1, 2, …, d }, we can choose a basis so that D(x, y) = R^{2d} for all (x, y) ∈ R^{2d}. To finish proving the result, we claim that the origin (0, 0) ∈ R^{2d} is an equilibrium point of G. Indeed, since

X_0(0, 0) + Σ_{j=1}^d u_j X_j(0, 0) = ( −∇F(0), 0 ) + ( Σ_{j=1}^d u_j σ_j, 0 )

and the σ_j form a basis of R^d, we may choose real numbers u_j ∈ R such that

X_0(0, 0) + Σ_{j=1}^d u_j X_j(0, 0) = ( 0, 0 ).

In light of Remark 2.6, applying Theorem 2.5(b) finishes the proof of Corollary 3.3. □
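The commutator used in this proof is simple enough to check symbolically. The sketch below verifies, in the case d = 1 (so the state space is R²) and for a generic smooth potential F, that [X_j, X_0](x, y) = (−γσ_j, σ_j); the setup and names are ours.

```python
import sympy as sp

x, y, gamma, sigma = sp.symbols('x y gamma sigma')
F = sp.Function('F')       # generic smooth potential, kept symbolic
coords = [x, y]

def lie_bracket(V, W):
    # [V, W]^j = sum_k V^k dW^j/dx_k - W^k dV^j/dx_k
    return [sp.expand(sum(V[k]*sp.diff(W[j], coords[k])
                          - W[k]*sp.diff(V[j], coords[k]) for k in range(2)))
            for j in range(2)]

# Langevin fields with d = 1: X0 is the drift, X1 the constant noise direction
X0 = [-gamma*x - sp.diff(F(y), y), x]
X1 = [sigma, sp.Integer(0)]

print(lie_bracket(X1, X0))   # [-gamma*sigma, sigma]
```

Note that the F-dependent term differentiates away because X_1 is constant and F' does not depend on x, which is why the bracket is a constant vector field.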
Example 3.4.
Let a_1, a_2 ∈ R, α_1 > 0, α_2 > 0, and ε > 0. With motivations from turbulent transport of inertial particles, the stochastic differential equation on R² given by

dx_t = ( a_1 x_t − α_1 x_t² + y_t² ) dt    (3.5)
dy_t = ( a_2 y_t − α_2 x_t y_t ) dt + ε dW_t

is considered in [3]. Here, we strengthen the results of Section 4 of that work. A more hands-on application of some of the ideas of this note was given for a specific case of this example in Section 11 of [1]. Applying Theorem 2.1 of [3], we first note that (x_t, y_t) is non-explosive.

We now prove:

Corollary 3.6.
Suppose that (x, y) ∈ R² satisfies x < (a_1 − |a_1|)/(2α_1) or x ≥ (a_1 + |a_1|)/(2α_1). Then for all t > 0 and (x', y') ∈ R² with x' > x,

p_t((x, y), (x', y')) > 0.

Otherwise, if (x, y) ∈ R² satisfies (a_1 − |a_1|)/(2α_1) ≤ x ≤ (a_1 + |a_1|)/(2α_1), then for all t > 0 and (x', y') ∈ R² with x' > (a_1 + |a_1|)/(2α_1),

p_t((x, y), (x', y')) > 0.

Remark 3.7.
It is important to point out that Corollary 3.6 is not sharp. For example, if a_1 = a_2 = 0, α_1 = 1 and α_2 = 2, it was shown in Section 11 of [1] that, in addition to the result above, p_t((x, y), (x', y')) > 0 for all (x, y), (x', y') ∈ R² with x' > 0 and all t > 0. There, however, the argument makes detailed use of the structure of the drift X_0, whereas here we have avoided such specifics in favor of making general statements for any positive time. However, Corollary 3.6 is more than sufficient to prove unique ergodicity of equation (3.5). Nevertheless, it is not hard to bootstrap from Corollary 3.6 to obtain the full (sharp) result proved in [1].

Proof.
As in the previous example, we begin by computing the set C corresponding to equation (3.5). Let G = { X_0 + uX_1 : u ∈ R } where

X_0 = ( a_1 x − α_1 x² + y² ) ∂_x + ( a_2 y − α_2 xy ) ∂_y  and  X_1 = ∂_y.

Since n(X_1, X_0) = 2, we find that ad_{X_1}²(X_0) = 2∂_x ∈ G_1^e. Let

D(x, y) = { (x, y) + u(0, 1) + λ(1, 0) : u ∈ R, λ > 0 }.

As opposed to the previous example, the set D(x, y) is not the entire space. Hence we must make sure we have enough equilibrium points in the right locations. Consider the polynomial equations

a_1 x − α_1 x² + y² = 0
a_2 y − α_2 xy + u = 0

where u ∈ R. Clearly, any pair (x, y) ∈ R² satisfying the above equations for some u ∈ R is an equilibrium point of G. In particular, we may solve a_1 x − α_1 x² + y² = 0, producing

x = ( a_1 ± √(a_1² + 4α_1 y²) ) / (2α_1).

Since √(a_1² + 4α_1 y²) ≥ |a_1| and we may always pick u = α_2 xy − a_2 y, we therefore deduce that for every x with either x ≥ (a_1 + |a_1|)/(2α_1) or x ≤ (a_1 − |a_1|)/(2α_1) there is an equilibrium point of the control system G with first coordinate x. Hence Remark 2.6 now implies Corollary 3.6. □

Example 3.8.
Let ν > 0 and consider the stochastically forced Burgers equation

∂_t u(x, t) + ( u(x, t) · ∇_x ) u(x, t) = ν Δ_x u(x, t) + ξ(x, t)    (3.9)

with periodic boundary conditions on the torus T² = [0, 2π]². Here, we assume that there is no mean flow and that ξ is a Gaussian process which is white in time and colored in space. To emphasize: we do not require the divergence-free condition ∇ · u = 0; hence, (3.9) is not the 2D Navier–Stokes equation. Moreover, we do not restrict ourselves to gradient solutions, as is often done when considering the multidimensional Burgers equation. In the dynamics (3.9), we are precisely interested in how the divergence-free forcing spreads to the non-divergence-free (gradient-like) directions. Since one does not have global solutions in this setting, here we must make use of the stopped processes.

Let us now be more precise. Writing

u(x, t) = Σ_{k ∈ Z²} u_k(t) e^{−i⟨k, x⟩},

where ⟨·, ·⟩ denotes the dot product, and fixing a positive integer N ≥ 2, we consider the following stochastic differential equation on C^{4N(N+1)}:

du_k = [ iF_k^N(u) − ν|k|² u_k ] dt + (k^⊥/|k|)( σ_k dB_t^{k,(1)} + iσ'_k dB_t^{k,(2)} )    (3.10)
        + (k/|k|)( γ_k dW_t^{k,(1)} + iγ'_k dW_t^{k,(2)} )

where

• u_k ∈ C²;
• the equation is over all indices k ∈ H_N = { k ∈ Z² \ {(0, 0)} : ‖k‖_∞ ≤ N };
• F_k^N(u) = Σ_{l, k−l ∈ H_N} ⟨u_l, k − l⟩ u_{k−l};
• σ_k, σ'_k, γ_k, γ'_k ∈ R;
• k^⊥ = (k_1, k_2)^⊥ = (−k_2, k_1);
• { B_t^{k,(1)}, B_t^{k,(2)}, W_t^{k,(1)}, W_t^{k,(2)} }_{k ∈ H_N} is a set of independent Brownian motions.
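The index bookkeeping behind (3.10) can be sanity-checked numerically. The snippet below (helper names ours) verifies the count |H_N| = (2N+1)² − 1 = 4N(N+1), that the low-mode set {k ∈ H_N : ‖k‖_∞ = 1} appearing in the forcing assumption below has exactly 8 elements, and the two k^⊥ inner-product identities used later to symmetrize the nonlinearity.

```python
import itertools

def H(N):
    # H_N = { k in Z^2 \ {(0,0)} : ||k||_inf <= N }
    return [(k1, k2) for k1, k2 in itertools.product(range(-N, N + 1), repeat=2)
            if (k1, k2) != (0, 0)]

def perp(k):
    # k_perp = (-k2, k1)
    return (-k[1], k[0])

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1]

for N in range(1, 6):
    assert len(H(N)) == (2*N + 1)**2 - 1 == 4*N*(N + 1)

# the forced low modes { k in H_N : ||k||_inf = 1 }
low = [k for k in H(3) if max(abs(k[0]), abs(k[1])) == 1]
assert len(low) == 8

# relations used below to symmetrize the nonlinearity:
for k in H(2):
    for l in H(2):
        assert dot(perp(k), l) == -dot(k, perp(l))
        assert dot(perp(k), perp(l)) == dot(k, l)
print("index checks passed")
```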
To further illuminate the discussion, we first split the equation into incompressible and compressible directions. To this end, write

u_k = w_k k^⊥/|k| + q_k k/|k|,
F_k^N(u) = F_k^⊥(w, q) k^⊥/|k| + F_k^∥(w, q) k/|k|,

where w_k, q_k ∈ C. In particular, equation (3.10) now becomes

dw_k = [ −ν|k|² w_k + iF_k^⊥(w, q) ] dt + σ_k dB_t^{k,(1)} + iσ'_k dB_t^{k,(2)}    (3.11)
dq_k = [ −ν|k|² q_k + iF_k^∥(w, q) ] dt + γ_k dW_t^{k,(1)} + iγ'_k dW_t^{k,(2)}

for some F_k^⊥, F_k^∥ to be computed in a moment. Note that (3.11) evolves on C^{4N(N+1)} for all t < τ_∞.

We will now use Theorem 2.5 to prove the following result:

Theorem 3.12.
Suppose that

{ k ∈ H_N : σ_k ≠ 0, σ'_k ≠ 0 } ⊃ { k ∈ H_N : ‖k‖_∞ = 1 }.

Then for all (w, q), (w', q') ∈ C^{4N(N+1)} and T > 0, there exists N_0 ∈ N large enough so that

p_t^n((w, q), (w', q')) > 0 for all t ≥ T, n ≥ N_0.

Remark 3.13.
It is interesting to note that, even if the process (w_t, q_t) is assumed to be incompressible initially, that is, (w_0, q_0) = (w, 0) ∈ C^{4N(N+1)}, a small amount of low-mode forcing ensures that any mixture of incompressible and compressible states becomes instantaneously possible. As we will see in the proof below, this cannot happen if we do not force the incompressible directions. In particular, if we assume that the process (w_t, q_t) is initially compressible, that is, (w_0, q_0) = (0, q), and σ_k = σ'_k = 0 for all k ∈ H_N, then w_t ≡ 0 for all t ≥ 0.

Proof of Theorem 3.12.
We will first write out and symmetrize the nonlinear terms F_k^⊥ and F_k^∥. Using the relations ⟨k^⊥, l⟩ = −⟨k, l^⊥⟩ and ⟨k^⊥, l^⊥⟩ = ⟨k, l⟩, we find that

F_k^⊥(w, q) = Σ_{l, k−l ∈ H_N} [ w_l w_{k−l} ⟨l^⊥, k⟩⟨k−l, k⟩/(|l||k−l|) + w_l q_{k−l} ⟨l^⊥, k⟩²/(|l||k−l|) ]
    + Σ_{l, k−l ∈ H_N} [ q_l w_{k−l} ⟨l, k−l⟩⟨k−l, k⟩/(|l||k−l|) − q_l q_{k−l} ⟨l, k−l⟩⟨l, k^⊥⟩/(|l||k−l|) ]

and

F_k^∥(w, q) = Σ_{l, k−l ∈ H_N} [ −w_l w_{k−l} ⟨l^⊥, k⟩²/(|l||k−l|) + w_l q_{k−l} ⟨l^⊥, k⟩⟨k−l, k⟩/(|l||k−l|) ]
    + Σ_{l, k−l ∈ H_N} [ −q_l w_{k−l} ⟨l, k−l⟩⟨l^⊥, k⟩/(|l||k−l|) + q_l q_{k−l} ⟨l, k−l⟩⟨k−l, k⟩/(|l||k−l|) ].
After considering the effect of the mapping (l, k−l) ↦ (k−l, l) on each of the terms above, we may write

F_k^⊥(w, q) = Σ_{l, k−l ∈ H_N} [ w_l w_{k−l} ⟨l^⊥, k⟩ ( 1/|l|² − 1/|k−l|² ) + w_l q_{k−l} ⟨k−l, k⟩/|k−l|² ],

F_k^∥(w, q) = Σ_{l, k−l ∈ H_N} [ −w_l w_{k−l} ⟨l^⊥, k⟩²/(|l||k−l|) + w_l q_{k−l} ⟨l^⊥, k⟩⟨k−l, k+l⟩/(|l||k−l|) ]
    + Σ_{l, k−l ∈ H_N} q_l q_{k−l} ⟨l, k−l⟩ |k|²/(|l||k−l|).

The assertion made in the previous remark now follows easily from these expressions, since if σ_k = σ'_k = 0 for all k ∈ H_N and w_0 = 0, then w_t = (w_k(t))_{k ∈ H_N} ≡ 0 for all t ≥ 0.

To prove Theorem 3.12, we do as in the previous two examples and start by computing C corresponding to (3.11). Define

G = { X_0 + Σ_{k ∈ H_N : σ_k ≠ 0, σ'_k ≠ 0} ( u_k X_k + v_k Y_k ) : u_k, v_k ∈ R }

where

X_0 = Σ_{k ∈ H_N} [ −ν|k|² w_k + iF_k^⊥(w, q) ] ∂/∂w_k + [ −ν|k|² q_k + iF_k^∥(w, q) ] ∂/∂q_k
    + Σ_{k ∈ H_N} [ −ν|k|² w̄_k − iF_k^⊥(w̄, q̄) ] ∂/∂w̄_k + [ −ν|k|² q̄_k − iF_k^∥(w̄, q̄) ] ∂/∂q̄_k

and

X_k = ∂/∂w_k + ∂/∂w̄_k,  Y_k = i ∂/∂w_k − i ∂/∂w̄_k.

Notice that n(X_j, X_0) = 1 for all j ∈ { k ∈ H_N : σ_k ≠ 0, σ'_k ≠ 0 } since there are no diagonal terms in the nonlinear part of X_0. In particular,

[X_j, X_0] ∈ G_1^o for all j ∈ { k ∈ H_N : σ_k ≠ 0, σ'_k ≠ 0 }.

Moreover, one can compute these commutators to see that

[X_j, X_0] = −ν|j|² ∂/∂w_j − ν|j|² ∂/∂w̄_j
  + i Σ_{k ∈ H_N} [ w_{k−j} ⟨j^⊥, k⟩ ( 1/|j|² − 1/|k−j|² ) + q_{k−j} ⟨k−j, k⟩/|k−j|² ] ∂/∂w_k
  − i Σ_{k ∈ H_N} [ w̄_{k−j} ⟨j^⊥, k⟩ ( 1/|j|² − 1/|k−j|² ) + q̄_{k−j} ⟨k−j, k⟩/|k−j|² ] ∂/∂w̄_k
  + i Σ_{k ∈ H_N} [ −w_{k−j} ⟨j^⊥, k⟩²/(|j||k−j|) + q_{k−j} ⟨j^⊥, k⟩⟨k−j, k+j⟩/(|j||k−j|) ] ∂/∂q_k
  − i Σ_{k ∈ H_N} [ −w̄_{k−j} ⟨j^⊥, k⟩²/(|j||k−j|) + q̄_{k−j} ⟨j^⊥, k⟩⟨k−j, k+j⟩/(|j||k−j|) ] ∂/∂q̄_k.
Note also that for all j, m ∈ { k ∈ H_N : σ_k ≠ 0, σ'_k ≠ 0 } such that j + m ∈ H_N,

n(X_m, [X_j, X_0]) = n(Y_m, [X_j, X_0]) = 1.

Hence for all j, m ∈ { k ∈ H_N : σ_k ≠ 0, σ'_k ≠ 0 } with j + m ∈ H_N, [X_m, [X_j, X_0]] ∈ G_2^o and [Y_m, [X_j, X_0]] ∈ G_2^o. Computing these commutators, we find that

[X_m, [X_j, X_0]] = ⟨j^⊥, m⟩ ( 1/|j|² − 1/|m|² ) Y_{j+m} − ⟨j^⊥, m⟩²/(|j||m|) Ỹ_{j+m}    (3.14)

and

[Y_m, [X_j, X_0]] = −⟨j^⊥, m⟩ ( 1/|j|² − 1/|m|² ) X_{j+m} + 2⟨j^⊥, m⟩²/(|j||m|) X̃_{j+m}    (3.15)

where

X̃_k = ∂/∂q_k + ∂/∂q̄_k,  Ỹ_k = i ∂/∂q_k − i ∂/∂q̄_k.

We will now use the above computations to prove that

{ X_j, Y_j, X̃_j, Ỹ_j : j ∈ H_N, ‖j‖_∞ ≤ k } ⊂ C^o

for all k = 1, 2, …, N by induction on k. It will then follow that C^o spans the tangent space, and so we may pick D(w, q) = C^{4N(N+1)} for all (w, q) ∈ C^{4N(N+1)}.

To prove the claim when k = 1, first substitute

(j, m) = ((1, 0), (0, 1)), ((1, 0), (0, −1)), ((−1, 0), (0, −1)), ((−1, 0), (0, 1))

into (3.14) and (3.15) to see that

X̃_{(1,1)}, Ỹ_{(1,1)}, X̃_{(1,−1)}, Ỹ_{(1,−1)}, X̃_{(−1,−1)}, Ỹ_{(−1,−1)}, X̃_{(−1,1)}, Ỹ_{(−1,1)} ∈ C^o.

Substituting

(j, m) = ((1, 1), (0, −1)), ((1, 1), (−1, 0)), ((−1, 1), (0, −1)), ((−1, −1), (1, 0))

into (3.14) and (3.15), and using that X_k, Y_k ∈ C^o for any ‖k‖_∞ = 1, we find by taking linear combinations that

X̃_{(1,0)}, Ỹ_{(1,0)}, X̃_{(0,1)}, Ỹ_{(0,1)}, X̃_{(−1,0)}, Ỹ_{(−1,0)}, X̃_{(0,−1)}, Ỹ_{(0,−1)} ∈ C^o.

This proves the initial statement in the inductive argument. Suppose now that for some 1 ≤ k < N,

{ X_j, Y_j, X̃_j, Ỹ_j : j ∈ H_N, ‖j‖_∞ ≤ k } ⊂ C^o.

Note that if m, j ∈ H_N are such that ‖m‖_∞ ≤ k and ‖j‖_∞ = 1, then [X̃_m, [X_j, X_0]] ∈ C^o and [Ỹ_m, [X_j, X_0]] ∈ C^o. Note moreover that

[X̃_m, [X_j, X_0]] = ⟨m, j+m⟩/|m|² Y_{j+m} + ⟨j^⊥, m⟩⟨m, m+2j⟩/(|j||m|) Ỹ_{j+m}    (3.16)

and

[Ỹ_m, [X_j, X_0]] = −⟨m, j+m⟩/|m|² X_{j+m} − ⟨j^⊥, m⟩⟨m, m+2j⟩/(|j||m|) X̃_{j+m}.    (3.17)
We claim that if m, j ∈ H_N are such that |j| ≠ |m| and ⟨j^⊥, m⟩ ≠ 0, then the pairs (3.14) and (3.16), and (3.15) and (3.17), are linearly independent. Indeed, if they are dependent under these assumptions, then

|j|² ⟨m, m+j⟩ = ½ ( |j|² − |m|² ) ⟨m, m+2j⟩,

which is true if and only if

|j|² + |m|² + 2⟨m, j⟩ = 0.

Since the left-hand side equals |j + m|², this equality would force j = −m, hence |j| = |m|, which is impossible. Therefore, to finish the inductive argument, it suffices to show that for all k ∈ H_N with ‖k‖_∞ = k + 1, there exist m, j ∈ H_N such that

• m + j = k;
• ‖m‖_∞ = k, ‖j‖_∞ = 1, |m| ≠ |j|, and ⟨j^⊥, m⟩ ≠ 0.

For those k away from the axes and the lines |y| = |x| in the (x, y)-plane, take j ∈ H_N to be the unique member of the set {(1, 0), (0, 1), (−1, 0), (0, −1)} such that ‖k − j‖_∞ = k. Thus define m = k − j and note that j and m have different Euclidean lengths and ⟨j^⊥, m⟩ ≠ 0. Now suppose k is on one of the axes or the lines |y| = |x|. Then there exists j ∈ {(1, 0), (0, 1), (−1, 0), (0, −1)} such that m = k − j belongs to the set of indices generated up to this point of sup-norm length k + 1. It is easy to check that, again, j and m have different Euclidean lengths and ⟨j^⊥, m⟩ ≠ 0. This finishes the inductive argument.

Now note that we may choose a basis of C such that D(w, q) = C^{4N(N+1)} for all (w, q) ∈ C^{4N(N+1)}. Moreover, the origin is clearly an equilibrium point of G. Because the issue of explosion is still present, Theorem 2.5 implies that for every (w, q), (w', q') ∈ C^{4N(N+1)} and T > 0, there exists N_0 ∈ N large enough such that

p_t^n((w, q), (w', q')) > 0 for all t ≥ T, n ≥ N_0. □

4. Proof of Main Results
The goal of this section is to prove Theorem 2.5 and Theorem 2.9. Theorem 2.9 will be a relatively straightforward consequence of Theorem 2.5, so we focus our attention first on proving Theorem 2.5.

To prove Theorem 2.5, we will use a slight modification of the condition for positivity of the density given by Ben Arous and Léandre [2] (see also [12]). The slight modification is necessary to remove the global Lipschitz and boundedness conditions often assumed of the coefficients in the SDE.

To set up the statement of our slight modification, let H_· = ∫_0^· h_s ds, h ∈ L²([0, ∞) : R^r), and let Φ_·^x(H) denote the maximally-defined solution (in time) of the equation

Φ_s^x(H) = x + ∫_0^s X_0(Φ_u^x(H)) du + Σ_{j=1}^r X_j ∫_0^s h_u^j du.    (4.1)

J_{s,t}^x(H) denotes the maximally-defined d × d matrix-valued solution of

J_{s,t}^x(H) = Id_{d×d} + ∫_s^t DX_0(Φ_u^x(H)) J_{s,u}^x(H) du    (4.2)

where Id_{d×d} is the identity matrix and D is the Jacobian. Define the Gramian matrix M_t^x(H) by

( M_t^x(H) )_{nk} = Σ_{m=1}^r ∫_0^t ( J_{s,t}^x(H) X_m )_n ( J_{s,t}^x(H) X_m )_k ds.    (4.3)

Remark 4.4.
Sometimes M_t^x(H) is called the deterministic Malliavin covariance matrix. Formally replacing H with a Brownian motion W yields the standard (stochastic) Malliavin covariance matrix.

Lemma 4.5.
Fix x, z ∈ R^d and t > 0, and suppose that H_· = ∫_0^· h_s ds, h ∈ L²([0, ∞) : R^r), is such that Φ_s^x(H) is defined for all times s ∈ [0, t] and Φ_t^x(H) = z. If M_t^x(H) is invertible, then p_t^n(x, z) > 0 for any integer n ≥ 1 such that Φ_s^x(H) ∈ B_n(0) for all s ∈ [0, t].

We defer the proof of Lemma 4.5 to the Appendix, and focus our efforts in this section on exhibiting a control H_· = ∫_0^· h_s ds, h ∈ L²([0, ∞) : R^r), so that Φ_·^x(H) has all of the properties stated in Lemma 4.5. The proof of the existence of such a control splits into two parts. First, in Section 4.1 we will use the enlargement techniques of Jurdjevic and Kupka [5, 6, 7] to see which directions can be flowed along in small times by Φ_s^x(H) over the class of controls H defined above. Second, we will see that there are enough directions that we can construct a sufficiently "twisty" control H, ensuring that M_t^x(H) is invertible. The existence of an equilibrium point y ∈ R^d as in the statement of Theorem 2.5 allows us control over the time parameter.

4.1. A Primer on Geometric Control Theory.
For x ∈ R^d and t > 0, let A(x, ≤ t) be the set of points z ∈ R^d such that for some time t_0 ∈ (0, t] there exists H_· = ∫_0^· h_s ds, h ∈ L²([0, ∞) : R^r), for which Φ_s^x(H) is defined for all s ∈ [0, t_0] and Φ_{t_0}^x(H) = z. Recalling the set C defined in Section 2, here we will use the techniques of [5, 6, 7] to prove the following result:

Lemma 4.6.
For all x ∈ R^d and all t > 0,

{x} + C ⊂ A(x, ≤ t).

We start by making some heuristic observations, arguing intuitively why we should expect Lemma 4.6 to be true. To make notation more legible, for any C^∞ vector field V on R^d, let exp(tV)(x) denote the maximally-defined integral curve of V passing through x at t = 0.

We first see why we should expect the following containment to hold:

{x} + span{X_1, …, X_r} ⊂ A(x, ≤ t)    (4.7)

for all x ∈ R^d, t > 0. Let x ∈ R^d, α ∈ R \ {0} and j ∈ {1, …, r} be given. The key is to realize that for λ > 0 large and t > 0 small,

exp( t(X_0 + αλX_j) )(x) ≈ exp( tαλX_j )(x).

This is because the behavior of the flow along X_0 + αλX_j is initially dominated for small times by the flow along αλX_j, since λ is large. More precisely, taking t = t'/λ for some t' > 0,

exp( t(X_0 + αλX_j) )(x) = exp( (t'/λ)(X_0 + αλX_j) )(x) → exp( t'αX_j )(x)  as λ → ∞.

Since x ∈ R^d, α ∈ R \ {0} and j ∈ {1, 2, …, r} were assumed to be arbitrary, we now see why one should believe the containment (4.7), as one could repeat the same argument with αX_j replaced by an arbitrary linear combination of X_1, …, X_r.

To see how some of the commutators in the definition of C arise, we start by "tweaking" the directions X_1, …, X_r obtained in the previous step by X_0; that is, we will first flow along X_j for αλ units of time and then flow along X_0 for t > 0 units of time. Again let x ∈ R^d, α ∈ R \ {0} and j ∈ {1, …, r} be given. If x_j ∈ R^d is the constant value of X_j, we notice that for t > 0,

exp(tX_0) ∘ exp(αλX_j)(x) = exp(tX_0)(x + αλx_j)    (4.8)
  = x + αλx_j + ∫_0^t X_0( x + αλx_j + O(s) ) ds.

Letting t = t'/λ^{n(X_j,X_0)}, it follows that as λ → ∞,

∫_0^t X_0( x + αλx_j + O(s) ) ds → ( t' α^{n(X_j,X_0)} / n(X_j, X_0)! ) ad_{X_j}^{n(X_j,X_0)}(X_0)(x).    (4.9)

As much as we would like to obtain this potentially new direction by taking λ → ∞ in (4.8), we cannot, as αλx_j blows up as λ → ∞. To rid ourselves of this problem, we need to flow backwards along X_j for αλ units of time, producing the relation

exp(−αλX_j) ∘ exp(tX_0) ∘ exp(αλX_j)(x) = x + ∫_0^t X_0( x + αλx_j + O(s) ) ds.

Using the same scaling of time t = t'/λ^{n(X_j,X_0)}, we now see how the commutator on the right-hand side of (4.9), and hence in the definition of G_1^e and G_1^o, arises.

Remark 4.10.
Note that this computation explains why the separation of C into C^o and C^e is needed. If n(X_j, X_0) is even and ad_{X_j}^{n(X_j,X_0)}(X_0) is constant, then, since α^{n(X_j,X_0)} > 0 regardless of the sign of α, relation (4.9) implies that we may only flow along ad_{X_j}^{n(X_j,X_0)}(X_0) for positive times. Additionally, in the subsequent iteration of this method we cannot necessarily flow backwards along this vector field to produce yet another direction.

Remark 4.11.
Following these observations, it is evident where and why Theorem 2.5 will fail to either produce optimal results or be applicable at all. The failure is precisely due to the fact that the set C only includes those constant vector fields which can be flowed along in small positive times. In particular, Theorem 2.5 does not account for cases where there is an unavoidable time delay needed to access certain points in space (as in the example highlighted in Remark 3.7), usually due to the need to employ the drift vector field X_0. Moreover, Theorem 2.5 will not even apply in situations where there is a more serious absence of time reversibility preventing C from being d-dimensional. As an example, consider the following SDE on R³:

dx_t = −x_t y_t dt + dB_t    (4.12)
dy_t = ( x_t² − y_t z_t ) dt
dz_t = ( y_t² − z_t ) dt.

For this system, it is not hard to check that Hörmander's bracket condition is satisfied globally but

C = { α∂_x + λ∂_y : α ∈ R, λ ≥ 0 }.

Hence, Theorem 2.5 does not apply since C has dimension 2 < 3 = d. Nevertheless, C is still useful in that Lemma 4.6 is true regardless of whether C is d-dimensional. If C is not d-dimensional, one can proceed to find more points in the set A(x, ≤ t) by using C and the specific nature of the drift vector field X_0. Then, given the existence of H_· = ∫_0^· h_s ds, h ∈ L²([0, ∞) : R^r), such that Φ_t^x(H) = z, positivity of the transition density p_t^n(x, z) for n large enough can be shown by following a similar line of reasoning to Lemma 4.22 or Remark 4.27.

We now turn the previous heuristics into a proof of Lemma 4.6. Our proof will employ results from the reference [7], so we will first introduce some further notation and terminology to connect with the setup there.

We recall that for any C^∞ vector field V on R^d, exp(tV)(x) denotes the maximally-defined integral curve of V passing through x at time t = 0. Let H be any set of C^∞ vector fields on R^d.
For x ∈ R^d and t > 0, A_H(x, ≤ t) denotes the set of z ∈ R^d such that there exist positive times t_1, …, t_k and corresponding vector fields V_1, …, V_k ∈ H such that t_1 + ⋯ + t_k ≤ t and

exp(t_k V_k) ∘ exp(t_{k−1} V_{k−1}) ∘ ⋯ ∘ exp(t_1 V_1)(x) = z.

Because there will be many different sets of vector fields, here we will absolutely need to emphasize the dependence of these sets on H.

Two sets of C^∞ vector fields on R^d, H and I, are called equivalent, denoted by H ∼ I, if A_H(x, ≤ t) = A_I(x, ≤ t) for all x ∈ R^d and all t > 0. One can show, see [7], that if H ∼ I and H ∼ J, then H ∼ I ∪ J. In particular, if we define

sat(H) = ∪_{I ∼ H} I,

then it also follows that sat(H) ∼ H. sat(H) is called the saturate of H.

Remark 4.13.
It is often the case that sat(H) contains more vector fields than H itself. Moreover, the saturate maintains identical accessibility properties in the sense (∼) described above. This is convenient in that it allows one to use simpler vector fields to determine accessibility properties of the original set of vector fields H. For example, even though the constant vector field X_j, j ≥ 1, does not belong to G = {X_0 + Σ_{j=1}^r u_j X_j : u_j ∈ R}, we used it above to generate more directions in A(x, ≤ t), as done in the arguments following equation (4.8). Using a limiting procedure, however, one can justify that this is indeed permissible.

In the next two lemmas, we list operations which allow us to expand (up to equivalence) a set of vector fields H.

Lemma 4.14. H is equivalent to the closed convex hull of the set

{λV : λ ∈ [0, 1], V ∈ H}.

Here the closure is taken in the topology of uniform convergence with all derivatives on compact subsets of R^d.

Proof. Apply Theorem 5 and Theorem 6 in Chapter 2 of [7]. □
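The closure and convexity operations in Lemma 4.14 are what license limits such as αX_j = lim_{λ→∞} λ^{−1}(X_0 + αλX_j) ∈ sat(G), used below in the proof of Lemma 4.6 (note the scaling factor λ^{−1} ∈ (0, 1] for λ ≥ 1). A quick symbolic check of that limit with an illustrative one-dimensional polynomial drift (our example, not from the paper):

```python
import sympy as sp

x, alpha, lam = sp.symbols('x alpha lam', positive=True)
X0 = sp.Matrix([x**2])   # an illustrative polynomial drift component
X1 = sp.Matrix([1])      # constant noise direction

# lam^{-1} * (X0 + alpha*lam*X1) lies in the set of Lemma 4.14 for lam >= 1
scaled = (X0 + alpha * lam * X1) / lam
limit = scaled.applyfunc(lambda e: sp.limit(e, lam, sp.oo))
print(limit)  # Matrix([[alpha]]), i.e. alpha * X1
```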
To state the next lemma, let ψ : R^d → R^d be a diffeomorphism. For any V ∈ H, we may define a vector field ψ_∗(V) by

ψ_∗(V)(x) = Dψ(ψ^{−1}(x)) V(ψ^{−1}(x)),

where Dψ is the Jacobian of ψ. A diffeomorphism ψ : R^d → R^d is called a normalizer of H if ψ(x), ψ^{−1}(x) ∈ A_H(x, ≤ t) for all x ∈ R^d and all t > 0. The set of normalizers of H is denoted by Norm(H).
Lemma 4.15.
H ∼ ∪_{ψ ∈ Norm(H)} {ψ_∗(V) : V ∈ H}.

Proof. Notice that by the lemma immediately after Definition 5 of Chapter 2 of [7], if ψ is a normalizer of H using our definition, then it is also a normalizer using the definition given in [7]. The result then follows after applying Theorem 9 in Chapter 2 of [7] and using the fact that the identity map is a normalizer. □

Remark 4.16.
We will see in the proof of Lemma 4.6 that the limiting procedure used in our heuristic calculations is exactly of the type covered by Lemma 4.14. We will also see that the use of normalizers is very much in line with one's ability to flow along a constant vector field for positive or negative times (hence the ψ and ψ^{−1} in the definition of a normalizer).

Using repeated applications of Lemma 4.14 and Lemma 4.15, we now prove Lemma 4.6.

Proof of Lemma 4.6.
Let G = {X_0 + Σ_{j=1}^r u_j X_j : u_j ∈ R}. First note that it suffices to show that if V ∈ C_o and W ∈ C_e, then αV, λW ∈ sat(G) for all α ∈ R and all λ ≥ 0. The result would then follow by Lemma 4.14 since if V_1, V_2, …, V_k ∈ C_o and W_1, W_2, …, W_j ∈ C_e, then

Σ_{l=1}^k α_l V_l + Σ_{i=1}^j λ_i W_i ∈ sat(G)

for all α_l ∈ R and all λ_i ≥ 0.

We first show that αX_j ∈ sat(G) for all α ∈ R and j ∈ {1, …, r}. Indeed, by Lemma 4.14 we have

αX_j = lim_{λ→∞} λ^{−1}(X_0 + αλX_j) ∈ sat(G).

By induction, it is then enough to show that if V is a constant vector field with αV ∈ sat(G) for all α ∈ R and W ∈ sat(G) is a polynomial vector field, then

(α^{n(V,W)}/n(V,W)!) ad^{n(V,W)}_V(W) ∈ sat(G)

for all α ∈ R. To prove this result, we seek to apply Lemma 4.15. Since V is a constant vector field, let v = V(x) ∈ R^d denote its constant value. For α ∈ R, define a map ψ_α : R^d → R^d by ψ_α(x) = x − αv. Note that, for each α ∈ R, ψ_α is a normalizer for G. Hence, for each α ∈ R, Lemma 4.15 implies that (ψ_α)_∗(W) ∈ sat(G). Since Dψ_α is the identity matrix, notice that

(ψ_α)_∗(W)(x) = W(x + αv).

Applying Lemma 4.14, we thus find that for all α ∈ R,

V_α W := lim_{λ→∞} λ^{−n(V,W)} (ψ_{λα})_∗(W) ∈ sat(G).

To finish the proof, all we must see is that

V_α W = (α^{n(V,W)}/n(V,W)!) ad^{n(V,W)}_V(W).

Recalling that v ∈ R^d denotes the constant value of V, for x ∈ R^d fixed consider the function F : R → R^d defined by F(α) = W(x + αv). By induction, for j ≥ 0,

F^{(j)}(α) = ad^j_V(W)(x + αv),

where F^{(j)} is the j-th derivative of F with respect to α. Hence we obtain the formula

(ψ_α)_∗(W)(x) = F(α) = Σ_{j=0}^{n(V,W)} (α^j/j!) F^{(j)}(0) = Σ_{j=0}^{n(V,W)} (α^j/j!) ad^j_V(W)(x),

since each component of F(α) is a polynomial in α with degree ≤ n(V, W). Hence we now see that

V_α W = lim_{λ→∞} λ^{−n(V,W)} (ψ_{λα})_∗(W) = (α^{n(V,W)}/n(V,W)!) ad^{n(V,W)}_V(W),

completing the proof. □

Before proceeding to the second part of the argument, we state the following lemma, which we will need later.
Lemma 4.17.
Suppose that, for some x ∈ R^d, the Lie algebra generated by H evaluated at x spans the tangent space. Then for all t, ε > 0,

A_H(x, ≤ t + ε) ⊃ interior(closure(A_H(x, ≤ t))).

Proof. See Theorem 2 of Chapter 3 in [7]. □
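Returning to the proof of Lemma 4.6: the finite Taylor expansion of (ψ_α)_∗(W)(x) = W(x + αv) in iterated brackets can also be verified symbolically. A sketch; the fields V, W in R² below are illustrative choices of ours, not from the paper:

```python
import sympy as sp

x1, x2, a = sp.symbols('x1 x2 a')
V = sp.Matrix([1, 0])        # constant field with value v = e1
W = sp.Matrix([0, x1**2])    # polynomial field, so n(V, W) = 2

def bracket(Vf, Wf, coords=(x1, x2)):
    """Lie bracket [V, W] = DW·V - DV·W."""
    return sp.expand(Wf.jacobian(coords) * Vf - Vf.jacobian(coords) * Wf)

ad1 = bracket(V, W)          # ad_V(W)  = 2*x1 d/dx2
ad2 = bracket(V, ad1)        # ad_V^2(W) = 2 d/dx2 (constant)
shifted = W.subs(x1, x1 + a)           # (psi_a)_* W (x) = W(x + a*v)
taylor = W + a * ad1 + a**2 / 2 * ad2  # sum_{j<=2} a^j/j! ad_V^j(W)(x)
assert sp.expand(shifted - taylor) == sp.zeros(2, 1)
```

The top-order term a²/2 · ad²_V(W) is exactly the part that survives the rescaled limit defining V_a W.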
Strict Positivity.
The next two lemmas will operate as an easy-to-check criterion assuring that, for a given control H, M^x_t(H) is invertible. Though not necessary (see Remark 4.27), these results use the fact that G contains only polynomial vector fields. In particular, the special structure of zero sets of polynomials is employed in the following lemma.

Lemma 4.18.
Suppose that C is d-dimensional and let H = ∪_{m=1}^r {X_m, [X_0, X_m]}. Then for any non-empty open A ⊂ R^d, the set of points in R^d given by

(4.19) ∪_{x ∈ A} {V(x) : V ∈ H}

is d-dimensional.

Proof. Suppose that the subspace spanned by the set in (4.19) has dimension l ≤ d and choose a basis v_1, v_2, …, v_l ∈ R^d for this subspace. The goal is to show that l = d. Let V_1, V_2, …, V_l be the constant vector fields with constant values v_1, v_2, …, v_l, respectively. Notice that every vector field V in the span of H is a polynomial vector field and satisfies the following equality on the open set A:

(4.20) V = p_1 V_1 + p_2 V_2 + ⋯ + p_l V_l

for some polynomials p_1, …, p_l. Since A is open and V is a polynomial vector field, (4.20) is valid everywhere on R^d. Moreover, since vector fields of the form (4.20) are closed under commutators and linear combinations, we see that

span(C) ⊂ span{v_1, v_2, …, v_l}.

Note that this finishes the proof since C is d-dimensional. □
To set up the statement of the next result, define K^x_t(H) ⊂ R^d as follows:

(4.21) K^x_t(H) = ∪_{m=1}^r { X_m(Φ^x_s(H)), [X_0, X_m](Φ^x_s(H)) : s ∈ (0, t) }.

Lemma 4.22. Suppose that K^x_t(H) is d-dimensional. Then the associated matrix M^x_t(H) is invertible.

Proof. It suffices to show that M^x_t(H) is positive definite. Assume, to the contrary, that M^x_t(H) is not positive definite and let ⟨·, ·⟩ denote the inner product on R^d. Then there exists y ∈ R^d \ {0} such that

0 = ⟨M^x_t(H) y, y⟩ = Σ_{m=1}^r ∫_0^t ⟨J^x_{s,t}(H) X_m, y⟩² ds.

To get a contradiction, we seek to obtain a positive lower bound on ⟨M^x_t(H) y, y⟩ using the equality above. To derive such a bound, first observe that for 0 ≤ s ≤ u ≤ t, J^x_{s,t}(H) = J^x_{u,t}(H) J^x_{s,u}(H), and that the matrix J^x_{s,t}(H) is invertible. Using these two facts, it is not hard to check that for 0 ≤ s ≤ t,

(4.23) ∂_s J^x_{s,t}(H) = −J^x_{s,t}(H) DX_0(Φ^x_s(H)), J^x_{t,t}(H) = Id_{d×d}.

Letting |·| denote the Euclidean norm on R^d, we then see that for all m ∈ {1, …, r}, u ∈ (0, t) and ε ∈ (0, min(u, t − u)),

(4.24) 0 = ⟨M^x_t(H) y, y⟩ ≥ ∫_0^t ⟨J^x_{s,t}(H) X_m, y⟩² ds ≥ ∫_{u−ε}^{u+ε} ⟨J^x_{s,t}(H) X_m, y⟩² ds = ∫_{u−ε}^{u+ε} ⟨J^x_{s,u}(H) X_m, (J^x_{u,t}(H))* y⟩² ds ≥ |(J^x_{u,t}(H))* y|² inf_{y′ : |y′|=1} ∫_{u−ε}^{u+ε} ⟨J^x_{s,u}(H) X_m, y′⟩² ds.

Since |(J^x_{u,t}(H))* y| > 0 for every nonzero y ∈ R^d, it suffices to show that for all nonzero y ∈ R^d there exist m ∈ {1, …, r}, u ∈ (0, t) and ε ∈ (0, min(u, t − u)) such that

(4.25) ∫_{u−ε}^{u+ε} ⟨J^x_{s,u}(H) X_m, y⟩² ds > 0.

Thus let y ∈ R^d, y ≠ 0, be arbitrary. By hypothesis, either ⟨X_m, y⟩ ≠ 0 for some m ∈ {1, …, r} or ⟨[X_0, X_m](Φ^x_{t_0}(H)), y⟩ ≠ 0 for some m ∈ {1, …, r} and t_0 ∈ (0, t). Clearly, if ⟨X_m, y⟩ ≠ 0 for some m, then there is nothing to show by continuity and (4.25). Thus suppose that ⟨X_m, y⟩ = 0 for all m = 1, 2, …, r and pick t_0 ∈ (0, t) and m ∈ {1, …, r} such that

⟨z, y⟩ := ⟨[X_m, X_0](Φ^x_{t_0}(H)), y⟩ ≠ 0.

Since ⟨X_m, y⟩ = 0, using the definition of J^x_{s,t}(H) twice we see that

⟨J^x_{s,t_0}(H) X_m, y⟩ = ∫_s^{t_0} ⟨DX_0(Φ^x_u(H)) J^x_{s,u}(H) X_m, y⟩ du
= ∫_s^{t_0} ⟨[X_m, X_0](Φ^x_u(H)), y⟩ du + ∫_s^{t_0} ⟨ DX_0(Φ^x_u(H)) ∫_s^u DX_0(Φ^x_v(H)) J^x_{s,v}(H) X_m dv, y ⟩ du.

Therefore, for s sufficiently close to t_0, ⟨J^x_{s,t_0}(H) X_m, y⟩ ≠ 0. Hence continuity then implies that for ε ∈ (0, t_0) small enough,

∫_{t_0−ε}^{t_0} ⟨J^x_{s,t_0}(H) X_m, y⟩² ds > 0,

finishing the proof. □

We now use the previous two results and Lemma 4.6 to prove Theorem 2.5.
Proof of Theorem 2.5.
We first prove Theorem 2.5 part (b) and then show how part (a) follows by a similar argument. Therefore suppose that y ∈ R^d is an equilibrium point of G and that x, z ∈ R^d are such that y ∈ D(x) and z ∈ D(y). By Lemma 4.5, our goal is to exhibit H_· = ∫_0^· h_s ds, h ∈ L²([0, t] : R^r), such that Φ^x_t(H) = z and M^x_t(H) is invertible. To ensure that M^x_t(H) is invertible, we will build H_· in such a way so as to "twist" the path of Φ^x_·(H) from x to z.

We first claim that there exist countably many non-empty disjoint open subsets U_l, l ≥ 0, with the property that

(4.26) U_{l+1} ⊂ ∪_{w ∈ U_l} D(w)

for all l ≥ 0. Suppose first that D(x) = R^d. Then it follows that D(x′) = R^d for all x′ ∈ R^d, so in this case we may simply take the U_l to be any countable collection of disjoint non-empty open subsets of R^d. If D(x) ≠ R^d, then since y ∈ D(x), write y = x + Σ_{j=1}^k α_j y_j + Σ_{j=k+1}^d λ_j y_j for some α_j ∈ R and λ_j > 0. Let λ = min_j λ_j > 0, α_0 = 0 and α_l = Σ_{k=1}^l 2^{−k} for l ≥ 1. Note that for l ≥ 0, the sets

U_l = x + span{y_1, …, y_k} + { μ_{k+1} y_{k+1} + ⋯ + μ_d y_d : μ_j ∈ (α_l λ, α_{l+1} λ) }

are disjoint, open and satisfy (4.26). This finishes the proof of the claim.

By construction of the sets U_l, l ≥ 0, and Lemma 4.18, there exist points x_{r+l} ∈ U_l, l = 1, …, j, such that

∪_{m=1}^r { x_1, …, x_r, [X_m, X_0](x_{r+1}), …, [X_m, X_0](x_{r+j}) }

is d-dimensional. Here, recall that x_1, …, x_r are the constant values of X_1, …, X_r, respectively. Moreover, x_{r+1} ∈ D(x), y ∈ D(x_{j+r}) and x_{l+1+r} ∈ D(x_{l+r}) for all l = 1, …, j − 1.

We now show that we can build H_· so that the path Φ^x_·(H) passes through each of these points prior to time t > 0 and satisfies Φ^x_t(H) = z. Observe that Lemma 4.6 and Lemma 4.17 imply that A(w, ≤ s) ⊃ D(w) for all w ∈ R^d and all s > 0. Hence by definition of A(w, ≤ s), there exist positive times t_1, t_2, …, t_{j+1} with Σ_{l=1}^{j+1} t_l < t and corresponding H_l(·) = ∫_0^· h_l(s) ds, h_l ∈ L²([0, t_l] : R^r), such that Φ^x_{t_1}(H_1) = x_{r+1}, Φ^{x_{r+l}}_{t_{l+1}}(H_{l+1}) = x_{r+l+1} for l = 1, …, j − 1, and Φ^{x_{r+j}}_{t_{j+1}}(H_{j+1}) = y. By piecing together the H_l's, this now gives us the path from x to y. For the rest of the path, we may also pick a positive time t_{j+3} < t and H_{j+3}(·) = ∫_0^· h_{j+3}(s) ds, h_{j+3} ∈ L²([0, t_{j+3}] : R^r), such that Φ^y_{t_{j+3}}(H_{j+3}) = z. Moreover, since y is an equilibrium point of G, letting t_{j+2} = t − (t_1 + ⋯ + t_{j+1} + t_{j+3}) > 0, we may choose H_{j+2}(·) = ∫_0^· h_{j+2}(s) ds, h_{j+2} ∈ L²([0, t_{j+2}] : R^r), such that Φ^y_{t_{j+2}}(H_{j+2}) = y. By Lemma 4.22, we now obtain the conclusion in part (b).

To prove part (a), simply let z = y in the first argument and, for an arbitrary T > 0, choose t < T. Note that this now finishes the proof of Theorem 2.5. □
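The role of Lemma 4.22 can be illustrated numerically. The following sketch uses a toy linear system of our own choosing (not from the paper): drift X_0(x) = (x_2, −x_1), a single constant noise field X_1 = e_1, and control H ≡ 0, so that J^x_{s,t} = e^{A(t−s)} with A = DX_0. Since [X_1, X_0] = A e_1 ≠ 0, the set K^x_t spans R², and the covariance matrix M^x_t should come out positive definite:

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # DX0 for the linear drift X0(x) = (x2, -x1)
X1 = np.array([1.0, 0.0])                 # constant noise direction e1
t, N = 1.0, 4000
dt = t / N

M = np.zeros((2, 2))
for k in range(N):
    tau = t - k * dt                      # tau = t - s
    J = np.array([[np.cos(tau), np.sin(tau)],
                  [-np.sin(tau), np.cos(tau)]])  # J_{s,t} = e^{A*tau} (a rotation)
    v = J @ X1
    M += np.outer(v, v) * dt              # Riemann sum for (M^x_t)_{lm}

eigs = np.linalg.eigvalsh(M)
print(eigs.min() > 0.0)  # True: M^x_t is positive definite, hence invertible
```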
Remark 4.27.
Without using the special structure of polynomial vector fields, one can prove Theorem 2.5 alternatively by choosing the path from x to y differently, as follows. Define

D(x, y) = D(x) \ D(y) if D(x) ≠ R^d, and D(x, y) = R^d otherwise,

and let y′ ∈ D(x, y) be arbitrary. Since D(x, y) is open, let δ > 0 be such that B_δ(y′) ⊂ D(x, y). By the support theorems [15, 16], there exists s ∈ (0, t/4) such that for all n large enough,

P_x{ s < τ_n, x_s ∈ B_δ(y′) } > 0.

Now recall that W_s = (W^1_s, …, W^r_s) is an r-dimensional standard Wiener process defined on the probability space (Ω, F, P). In this remark, we identify the set Ω with the space of continuous paths C([0, ∞) : R^r). Letting M^x_t(W(ω)) denote the matrix M^x_t(H) when H_s = (W^1_s(ω), …, W^r_s(ω)), we note that by Malliavin's proof of Hörmander's theorem [8, 11],

P_x{ s < τ_n, x_s ∈ B_δ(y′), M^x_s(W) invertible } = P_x{ s < τ_n, x_s ∈ B_δ(y′) } > 0

for n sufficiently large. Therefore, fix ω ∈ { s < τ_n, x_s ∈ B_δ(y′), M^x_s(W(ω)) invertible } and define H_s = (W^1_s(ω), …, W^r_s(ω)) on the time interval [0, s]. Hence Φ^x_s(H) ∈ B_δ(y′). Since

y ∈ ∩_{w ∈ B_δ(y′)} D(w),

we may pick H̃ such that Φ^{Φ^x_s(H)}_{s′}(H̃) = y for some s′ < t. We can complete our path from y to z in exactly the same way as in the proof of Theorem 2.5. Invertibility of the covariance matrix for our chosen control at time t follows immediately since M^x_s(W(ω)) is invertible. See Theorem 8.1 in [10] for a similar argument.

Remark 4.28.
Yet another way to prove Theorem 2.5 is to use a Feynman–Kac representation of the probability density function p^n_t(x, z). Indeed, fixing n ∈ N and x ∈ B_n(0), observe that the time-reversed density q^n_s(x, z) = p^n_{t−s}(x, z) solves the following PDE:

∂q^n_s/∂s = −L*_z q^n_s on [0, t) × B_n(0),

where L*_z is the formal adjoint (in the z variable) of the Markov generator L corresponding to the diffusion x_t. Now consider the process y_t solving

dy_t = −X_0(y_t) dt − Σ_{j=1}^r X_j dW^j_t

and let T_n = inf{ t > 0 : |y_t| ≥ n }. It then follows that we may write p^n_t(x, z) as

p^n_t(x, z) = q^n_0(x, z) = E_z [ e^{∫_0^{s∧T_n} f(y_u) du} q^n_{s∧T_n}(x, y_{s∧T_n}) ]

for some f ∈ C^∞(R^d : R). One can now use the expression above coupled with the support theorems [15, 16] applied to the time-reversed process y_t to bound p^n_t(x, z) from below by a positive quantity.

We finish this section by proving Theorem 2.9 as a consequence of Theorem 2.5 (a).

Proof of Theorem 2.9.
Let μ be an invariant probability measure for the Markov process x_t defined by (2.1). Again, since C is contained in the Lie algebra generated by X_1, …, X_r, [X_1, X_0], …, [X_r, X_0] and C is d-dimensional, it follows by Hörmander's theorem [4] that μ(dx) = m(x) dx for some nonnegative function m ∈ C^∞(R^d). Recall also that, for the same reasons, the Markov process x_t defined by (2.1) has a probability density function p_t(x, y) with respect to Lebesgue measure on R^d which is smooth for (t, x, y) ∈ (0, ∞) × R^d × R^d. Since μ is an invariant probability measure, we have the following relation for almost every z ∈ R^d and every t > 0:

m(z) = ∫_{R^d} m(y) p_t(y, z) dy.

We now use this relation to prove the positivity assertion. Let x ∈ supp(μ), so that μ(B_δ(x)) > 0 for every δ > 0. By smoothness of the density m, for each δ > 0 there exists x_0 = x_0(δ) ∈ B_δ(x) such that m(x_0) > 0. Since m is smooth, in particular continuous, there exist γ > 0 and ε > 0 such that B_γ(x_0) ⊂ B_δ(x) and m(y) ≥ ε > 0 for all y ∈ B_γ(x_0). Hence for almost every z ∈ R^d we have

m(z) ≥ ∫_{B_γ(x_0)} m(y) p_t(y, z) dy ≥ ε ∫_{B_γ(x_0)} p_t(y, z) dy.

To bound p_t(y, z) from below, there are two cases. First suppose that D(x_0) = R^d. Then by definition of D(x_0), we have that C_o is d-dimensional, and hence D(y) = R^d for all y ∈ R^d. Theorem 2.5 (a) implies that for any y ∈ B_γ(x_0) and z ∈ D(x_0) there exists t > 0 such that p_t(x_0, z) > 0. Since the transition density is a continuous function in all of its arguments, there exists an open neighborhood U of (t, x_0, z) in (0, ∞) × B_γ(x_0) × R^d such that p_s(x′, z′) ≥ c > 0 for all (s, x′, z′) ∈ U. In particular, for almost every y in an open ball centered at z,

m(y) ≥ εc > 0.

Since m is continuous, it follows that m(z) ≥ εc > 0. For the second case, suppose that D(x_0) ≠ R^d. In particular, this implies that C_o has dimension l < d and x_0 ∉ D(x_0). Take z ∈ D(x_0) and decrease δ > 0 if necessary so that z ∈ D(y) for all y ∈ B_δ(x_0). Following now in the same way as in the previous case, we finish the proof of the result. □

Appendix
Here we prove Lemma 4.5. We recall that this result is a slight modification of the criterion for positivity of the density given by Ben Arous and Léandre [2], which was applied without proof in Section 4. Such an extension is needed in this paper since the drift vector field X_0 was not assumed to be globally Lipschitz and its derivatives were not assumed to be globally bounded.

The proof of Lemma 4.5 is almost identical to (and in some parts simpler than) the proof of Proposition 4.2.2 of [12]. The basic difference needed to remove these assumptions on X_0 is that we need to compare the stopped process x_{t∧τ_n} with another process x^{(n)}_t such that x^{(n)}_t solves an SDE whose coefficients satisfy the required Lipschitz and boundedness conditions and x_{t∧τ_n} = x^{(n)}_{t∧τ_n} for all t ≥ 0. This localization procedure is relatively standard but we include the details for completeness.

To do such a comparison, for any integer n ≥ 1 let X^{(n)}_0 be a C^∞ vector field on R^d satisfying

X^{(n)}_0(x) = X_0(x) for |x| ≤ n, X^{(n)}_0(x) = 0 for |x| ≥ n + 1.

For x ∈ R^d, n ∈ N, t > 0 and H = (H^j) ∈ C([0, t] : R^r), let Φ^{x,n}_t(H) denote the solution of the equation

Φ^{x,n}_t(H) = x + ∫_0^t X^{(n)}_0(Φ^{x,n}_s(H)) ds + Σ_{j=1}^r X_j H^j_t.

Let J^{x,n}_{s,t} = J^{x,n}_{s,t}(H) denote the d × d matrix-valued solution of the equation

J^{x,n}_{s,t} = Id_{d×d} + ∫_s^t DX^{(n)}_0(Φ^{x,n}_u(H)) J^{x,n}_{s,u} du,

and let M^{x,n}_t(H) denote the matrix

(M^{x,n}_t(H))_{lm} = Σ_{j=1}^r ∫_0^t (J^{x,n}_{s,t}(H) X_j)_l (J^{x,n}_{s,t}(H) X_j)_m ds.

Proof of Lemma 4.5.
As in [12], our goal is to use Malliavin calculus to bound p^n_t(x, z) from below by a quantity which is positive if the covariance matrix M^{x,n}_t(H) is invertible. For brevity of notation during this proof, we will write the functional Φ^{x,n}_t(·) simply as Φ(·). Let H_· = ∫_0^· h_u du, h ∈ L²([0, ∞) : R^r), be as in the statement of the lemma and let k_l(s) denote the l-th row of the matrix k_{lj}(s) = (J^{x,n}_{s,t}(H) X_j)_l. For y ∈ R^d, let

(T_y W)(t) = W(t) + Σ_{l=1}^d y_l ∫_0^t k_l(s) ds

and g(y, W) = Φ(T_y W) − Φ(W), where W(t) = (W^1(t), …, W^r(t)) denotes the standard r-dimensional Wiener process on (Ω, F, P). For β > 1, define cutoff functions K_β, α_β ∈ C^∞(R : [0, 1]) with

K_β(x) = 0 for |x| ≥ β, K_β(x) = 1 for |x| ≤ β − 1, α_β(x) = 0 for |x| ≤ (2β)^{−1}, α_β(x) = 1 for |x| ≥ β^{−1},

and set

H_β = K_β(‖g(·, W)‖_{C²(B_1(0) : R^d)}) α_β(|det ∂_j g_i(0)|).

Under our assumptions, one can check (see [13], Example 1.2.1, Theorem 2.2.2 and surrounding text) that g(·, W(ω)) ∈ C^∞(R^d) for a.e. ω ∈ Ω.

Now let f : R^d → [0, ∞) be bounded and measurable, and let ρ : R^d → (0, ∞) be a measurable function satisfying ∫_{R^d} ρ(y) dy = 1. Observe that

E_x f(x_{t∧τ_n}) = ∫_{R^d} E_x f(x_{t∧τ_n}) ρ(y) dy = ∫_{R^d} E[ f(Φ(W)) 1_{‖Φ(W)‖_t ≤ n} ] ρ(y) dy,

where

{‖Φ(W)‖_t ≤ n} = { ω ∈ Ω : sup_{s ∈ [0,t]} |Φ^{x,n}_s(W(ω))| ≤ n }.

Girsanov's theorem then gives

∫_{R^d} E[ f(Φ(W)) 1_{‖Φ(W)‖_t ≤ n} ] ρ(y) dy = ∫_{R^d} E[ f(Φ(T_y W)) 1_{‖Φ(T_y W)‖_t ≤ n} G(y) ] ρ(y) dy,

where G(y) > 0 denotes the corresponding Girsanov density. Hence, with c_β > 0 as below,

E_x f(x_{t∧τ_n}) ≥ ∫_{R^d} E[ f(Φ(T_y W)) 1_{‖Φ(T_y W)‖_t ≤ n} G(y) ] ρ(y) dy
≥ E[ H_β ∫_{|y| ≤ c_β} f(g(y) + Φ(W)) 1_{‖Φ(T_y W)‖_t ≤ n} G(y) ρ(y) dy ]
≥ E[ H_β 1_{ sup_{|y| ≤ c_β} ‖Φ(T_y W)‖_t ≤ n } ∫_{|y| ≤ c_β} f(g(y) + Φ(W)) G(y) ρ(y) dy ].

Let A_β = { sup_{|y| ≤ c_β} ‖Φ(T_y W)‖_t ≤ n }. By Lemma 4.2.1 of [12], for any β > 1 there exist c_β ∈ (0, β^{−1}) and δ_β > 0 such that any G : B_1(0) → R^d with G(0) = 0, ‖G‖_{C²(B_1(0))} ≤ β and |det ∂_j G_i(0)| ≥ β^{−1} is a diffeomorphism from B_{c_β}(0) ⊂ R^d onto a neighborhood of B_{δ_β}(0) ⊂ R^d. In particular, we find that after changing variables twice,

E_x f(x_{t∧τ_n}) ≥ E[ H_β 1_{A_β} ∫_{|y| ≤ c_β} f(g(y) + Φ(W)) G(y) ρ(y) dy ]
≥ E[ H_β 1_{A_β} ∫_{|z| ≤ δ_β} f(z + Φ(W)) G(g^{−1}(z)) ρ(g^{−1}(z)) |det ∂_j g_i(g^{−1}(z))|^{−1} dz ]
= E[ H_β 1_{A_β} ∫_{|z − Φ(W)| ≤ δ_β} f(z) G(g^{−1}(z − Φ(W))) ρ(g^{−1}(z − Φ(W))) |det ∂_j g_i(g^{−1}(z − Φ(W)))|^{−1} dz ].
Therefore we deduce the following inequality:

p^n_t(x, z) ≥ E[ H_β 1_{A_β} 1_{|z − Φ(W)| ≤ δ_β} G(g^{−1}(z − Φ(W))) ρ(g^{−1}(z − Φ(W))) |det ∂_j g_i(g^{−1}(z − Φ(W)))|^{−1} ].

By construction, if H_β ≠ 0 and |z − Φ(W)| ≤ δ_β, then

G(g^{−1}(z − Φ(W))) ρ(g^{−1}(z − Φ(W))) |det ∂_j g_i(g^{−1}(z − Φ(W)))|^{−1} > 0.

Thus it remains to prove that for some β > 1 the event

A_β ∩ { |z − Φ(W)| ≤ δ_β, |det ∂_j g_i(0)| ≥ β^{−1}, ‖g(·, W)‖_{C²(B_1(0))} ≤ β − 1 }

has positive probability. Note that this can be shown by following exactly the same line of reasoning starting in the last paragraph of p. 1777 of [10]. □

References

[1] Avanti Athreya, Tiffany Kolba, and Jonathan C. Mattingly. Propagating Lyapunov functions to prove noise-induced stability. arXiv:1111.1755, pages 1–41, 2011.
[2] G. Ben Arous and R. Léandre. Décroissance exponentielle du noyau de la chaleur sur la diagonale. II.
Probab. Theory Related Fields, 90(3):377–402, 1991.
[3] Jeremiah Birrell, David P. Herzog, and Jan Wehr. The transition from ergodic to explosive behavior in a family of stochastic differential equations. Stochastic Processes and their Applications, 122(4):1519–1539, 2012.
[4] Lars Hörmander. Hypoelliptic second order differential equations. Acta Math., 119:147–171, 1967.
[5] V. Jurdjevic and I. Kupka. Control systems on semisimple Lie groups and their homogeneous spaces. Ann. Inst. Fourier (Grenoble), 31(4):vi, 151–179, 1981.
[6] V. Jurdjevic and I. Kupka. Polynomial control systems. Math. Ann., 272(3):361–368, 1985.
[7] Velimir Jurdjevic. Geometric control theory, volume 52 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1997.
[8] S. Kusuoka and D. Stroock. Applications of the Malliavin calculus. II. J. Fac. Sci. Univ. Tokyo Sect. IA Math., 32(1):1–76, 1985.
[9] J. C. Mattingly, A. M. Stuart, and D. J. Higham. Ergodicity for SDEs and approximations: locally Lipschitz vector fields and degenerate noise. Stochastic Process. Appl., 101(2):185–232, 2002.
[10] Jonathan C. Mattingly and Étienne Pardoux. Malliavin calculus for the stochastic 2D Navier-Stokes equation. Comm. Pure Appl. Math., 59(12):1742–1790, 2006.
[11] James Norris. Simplified Malliavin calculus. In Séminaire de Probabilités, XX, 1984/85, volume 1204 of Lecture Notes in Math., pages 101–130. Springer, Berlin, 1986.
[12] David Nualart. Analysis on Wiener space and anticipating stochastic calculus. In Lectures on probability theory and statistics (Saint-Flour, 1995), volume 1690 of Lecture Notes in Math., pages 123–227. Springer, Berlin, 1998.
[13] David Nualart. The Malliavin calculus and related topics. Probability and its Applications (New York). Springer-Verlag, Berlin, second edition, 2006.
[14] Marco Romito. Ergodicity of the finite dimensional approximation of the 3D Navier-Stokes equations forced by a degenerate noise. J. Statist. Phys., 114(1-2):155–177, 2004.
[15] D. Stroock and S. R. S. Varadhan. On degenerate elliptic-parabolic operators of second order and their associated diffusions. Comm. Pure Appl. Math., 25:651–713, 1972.
[16] Daniel W. Stroock and S. R. S. Varadhan. On the support of diffusion processes with applications to the strong maximum principle. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Vol. III: Probability theory, pages 333–359. Univ. California Press, Berkeley, Calif., 1972.