Improved spectral gap bounds on positively curved manifolds
arXiv [math.PR]
Laurent Veysseire

December 7, 2018
Abstract
A coupling method and an analytic one allow us to prove new lower bounds for the spectral gap of reversible diffusions on compact manifolds. Those bounds are based on a notion of curvature of the diffusion, like the coarse Ricci curvature or the Bakry–Émery curvature-dimension inequalities. We show that when this curvature is nonnegative, its harmonic mean is a lower bound for the spectral gap.
Introduction
The study of the spectrum of the Laplace operator on Riemannian manifolds has many applications in various domains of mathematics. A whole chapter of [5] is devoted to this issue. In this article, we take the convention $\Delta = g^{ij}\nabla_i\nabla_j$ for the Laplace operator. The spectral gap of $\Delta$ is the opposite of the greatest non-zero eigenvalue of $\Delta$ (the spectrum of $\Delta$ is discrete and non-positive). One way to estimate this spectral gap is to use the Ricci curvature, as we see in the Lichnerowicz theorem (see [10]).

Theorem 1 (Lichnerowicz)
Let $M$ be an $n$-dimensional Riemannian manifold. If there exists $K > 0$ such that for each $x \in M$ and each $u \in T_xM$ we have $\mathrm{Ric}_x(u,u) \geq K g_x(u,u)$, then the spectral gap $\lambda$ of $\Delta$ satisfies
$$\lambda \geq \frac{n}{n-1}K.$$
Here we denote by
$\mathrm{Ric}$ the Ricci curvature of $M$. Chen and Wang improved this result in [7], using the diameter of the manifold in their estimates:
Theorem 2
Let $M$ be a compact connected $n$-dimensional Riemannian manifold, $K$ be the infimum of the Ricci curvature on $M$ and $D$ be the diameter of $M$. Then if $K \geq 0$, we have the following bounds:
$$\lambda \geq \frac{\pi^2}{D^2} + \max\left(\frac{\pi^2}{4n},\, 1 - \frac{2}{\pi}\right)K$$
and if $n > 1$,
$$\lambda \geq \frac{nK}{(n-1)\left(1 - \cos^n\left(\frac{D}{2}\sqrt{\frac{K}{n-1}}\right)\right)}.$$
And if $K \leq 0$, we have the following bounds:
$$\lambda \geq \frac{\pi^2}{D^2} + \left(\frac{\pi^2}{2} - 1\right)K$$
and if $n > 1$,
$$\lambda \geq \frac{\pi^2\sqrt{1 - \frac{2D^2K}{\pi^4}}}{D^2\cosh^{n-1}\left(\frac{D}{2}\sqrt{\frac{-K}{n-1}}\right)}.$$
In [2], E. Aubry gives a lower bound for $\lambda$ when the curvature
$$\mathrm{Ric}(x) := \inf_{u \in T_xM}\frac{\mathrm{Ric}_x(u,u)}{\|u\|^2}$$
is close to a positive constant in the sense of the $L^p$ norm with $p$ large enough:

Theorem 3
Let $M$ be a complete $n$-dimensional Riemannian manifold, $p > n/2$ and $K > 0$, such that
$$\int_M \left((\mathrm{Ric} - K)_-\right)^p < +\infty.$$
Then $M$ has a finite volume and the spectral gap of $\Delta$ on $M$ satisfies
$$\lambda \geq \frac{n}{n-1}K\left(1 - \frac{C(p,n)}{K}\left\|(\mathrm{Ric}-K)_-\right\|_p\right)$$
where $C(p,n)$ is a constant only depending on $p$ and $n$, and $\|f\|_p = \left(\int_M \frac{|f|^p}{\mathrm{vol}(M)}\right)^{1/p}$.

This allows a little negative curvature, which is not the case for our results. This article recapitulates and extends the results already stated in [13], and presents a coupling method, more adapted to discrete spaces than the analytic one. We show by a coupling method that another lower bound for $\lambda$ is the harmonic mean of the Ricci curvature.

Theorem 4
Let $M$ be a compact Riemannian manifold with positive Ricci curvature. Then we have
$$\frac{1}{\lambda} \leq \int_M \frac{\mathrm{d}\mu(x)}{\mathrm{Ric}(x)},$$
with $\mathrm{d}\mu = \frac{\mathrm{d}\,\mathrm{vol}}{\mathrm{vol}(M)}$, where $\mathrm{vol}$ is the Riemannian volume measure on $M$.

This bound is often better than the Lichnerowicz one because the harmonic mean is better (and can be much better) than the infimum. But unfortunately we lose the $\frac{n}{n-1}$ factor. Merging the proof of Theorem 1 and an analytic proof of Theorem 4 gives us the following improvement:

Theorem 5 Let $M$ be a Riemannian manifold with positive Ricci curvature and $K = \inf_{x \in M}\mathrm{Ric}(x)$. Then for every $0 \leq c \leq K$, we have:
$$\lambda \geq \frac{n}{n-1}c + \frac{1}{\int_M \frac{\mathrm{d}\mu(x)}{\mathrm{Ric}(x) - c}}.$$
Taking $c = K$ gives us the Lichnerowicz bound or even better, while $c = 0$ gives us Theorem 4. Our coupling approach is based on a notion of coarse Ricci curvature, introduced by Yann Ollivier in [12], which uses the Wasserstein distance $W_1$. A major step in our proof is the use of the coupling given by the following theorem:

Theorem 6
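As a numerical illustration of why the harmonic mean bound of Theorem 4 can be much better than an infimum-based bound, here is a minimal sketch (the curvature profile and the volume weights are invented for the example): a small region of low curvature destroys the infimum but barely affects the harmonic mean.

```python
import numpy as np

# Hypothetical curvature profile: Ric = 1 on 99% of the volume (measure mu)
# and Ric = 0.1 on the remaining 1%.  Illustrative numbers only.
ric = np.array([1.0, 0.1])
mu = np.array([0.99, 0.01])              # normalized volume weights

K = ric.min()                            # infimum used in Lichnerowicz-type bounds
harmonic_mean = 1.0 / np.sum(mu / ric)   # lower bound of Theorem 4

print(K, harmonic_mean)
```

Here the infimum is $0.1$ while the harmonic mean is $1/1.09 \approx 0.917$, so the Theorem 4 bound is almost an order of magnitude stronger in this toy profile.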
Let $M$ be a smooth Riemannian manifold, and $F^i$ be a smooth vector field on $M$. Assume that there exists a diffusion process associated with the generator $Lf = \frac12\Delta f + F^i\nabla_i f$. Let $\kappa(x,y)$ be the coarse Ricci curvature of the diffusion between $x$ and $y$ (see Definition 10). Then for any two distinct points $x$ and $y$ of $M$, there exists a coupling $(x(t), y(t))$ between the paths of the diffusion process starting at $x$ and $y$ which satisfies:
$$d(x(t), y(t)) = d(x,y)\,\mathrm{e}^{-\int_0^t \kappa(x(s),y(s))\,\mathrm{d}s}$$
on the event that for any $s \in [0,t]$, $(x(s), y(s))$ does not belong to the cut-locus of $M$.

The contraction rate $\kappa(x,y)$ of this coupling behaves like the one of the coupling derived from the diffusion in $C^1$ path space defined by M. Arnaudon, K. A. Coulibaly and A. Thalmaier in [1] when $x$ and $y$ are close. We have a cut-locus problem that we will avoid by making a compactness assumption, which was anyway necessary to replace $\kappa(x,y)$ by its limit when $x$ and $y$ are infinitely close.

The coupling method and the analytic one keep working when we add a drift to the Brownian motion, provided the diffusion is reversible. In this case, the generator takes the following form:
$$L = \frac12 g^{ij}\left(\nabla_i\nabla_j - (\nabla_j\varphi)\nabla_i\right)$$
with $\varphi$ a smooth function on $M$, and $\mathrm{e}^{-\varphi}\,\mathrm{d}\,\mathrm{vol}$ is a reversible measure. We then have the following generalization of Theorem 5:

Theorem 7
Let $M$ be a compact Riemannian manifold and $L = \frac12 g^{ij}(\nabla_i\nabla_j - (\nabla_j\varphi)\nabla_i)$ be the operator associated with a reversible diffusion process on $M$. Suppose that we have a curvature-dimension inequality in the sense of Bakry–Émery (see [3] or [4]) with a positive curvature $\rho$ and a constant and positive dimension $n'$, which is
$$\Gamma_2(f)(x) \geq \rho(x)\Gamma(f)(x) + \frac{1}{n'}\left(L(f)(x)\right)^2.$$
Let $R$ be the infimum of $\rho$. Then for every $0 \leq c < R$, we have
$$\lambda(L) \geq \frac{n'}{n'-1}c + \frac{1}{\int_M \frac{\mathrm{d}\pi(x)}{\rho(x) - c}}$$
with $\mathrm{d}\pi = \frac{\mathrm{e}^{-\varphi}\,\mathrm{d}\,\mathrm{vol}}{\int_M \mathrm{e}^{-\varphi}\,\mathrm{d}\,\mathrm{vol}}$ the reversible probability measure.

We try to generalize our coupling method to diffusions which are not adapted to the metric $g$, that is, whose generator takes the more general form
$$L = \frac12 A^{ij}\nabla_i\nabla_j + F^i\nabla_i,$$
without necessarily having $A^{ij} = g^{ij}$ anymore. We have a generalization of Theorem 6 only under the very restrictive condition
$$(H) \Leftrightarrow \forall u \in TM,\ u^i g_{jk}u^j g_{lm}u^l \nabla_i A^{km} = 0 \Leftrightarrow g^{il}\nabla_l A^{jk} + g^{jl}\nabla_l A^{ki} + g^{kl}\nabla_l A^{ij} = 0,$$
and with a lower $\tilde\kappa(x,y)$ instead of $\kappa(x,y)$. Note that $(H)$ is true for $A^{ij} = g^{ij}$, in which case we have $\tilde\kappa = \kappa$. We have the following generalization of Theorem 4:

Theorem 8
Consider a diffusion process on a compact Riemannian manifold $M$ which is reversible and satisfies $(H)$. For every $x$ in $M$, we set $\tilde\kappa(x) = \inf_{u \in T_xM}\tilde\kappa(x,u)$. If we have $\tilde\kappa(x) \geq \varepsilon > 0$, then the spectral gap of $L$ is at least the harmonic mean of $\tilde\kappa$ (with respect to the reversible probability measure $\pi$):
$$\frac{1}{\lambda(L)} \leq \int_M \frac{\mathrm{d}\pi(x)}{\tilde\kappa(x)}.$$
In Section 1, we present a short argument which shows how we can derive the harmonic mean bound from Theorem 6. In Section 2, we define the coarse Ricci curvature for diffusions and construct our couplings, so it is where Theorem 6 is proved. In Section 3, we present the proofs, using the couplings and purely analytical ones, of the harmonic mean bounds for the spectral gap.
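Before turning to the proofs, the pathwise contraction of Theorem 6 can be seen concretely in the flat case. The following sketch is an invented toy (it is not the manifold construction of the paper): two copies of a linear-drift diffusion in $\mathbb{R}^2$ are driven by the same Brownian increments, the flat analogue of the parallel-transport coupling. The noise cancels in the difference, so the distance decays at the deterministic rate $\kappa = a$, as the theorem predicts.

```python
import numpy as np

rng = np.random.default_rng(2)
a, dt, nsteps = 0.8, 1e-4, 10000     # drift rate, time step, horizon T = 1
x = np.array([1.0, 0.0])
y = np.array([-1.0, 0.5])
d0 = np.linalg.norm(x - y)

for _ in range(nsteps):
    dB = np.sqrt(dt) * rng.standard_normal(2)   # SAME increment for both paths
    x = x - a * x * dt + dB                     # dX = -a X dt + dB
    y = y - a * y * dt + dB

# d(x(t), y(t)) = d(x(0), y(0)) e^{-a t}, path by path,
# up to the Euler discretization error.
print(np.linalg.norm(x - y), d0 * np.exp(-a))
```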
The result and its proof presented in this section are a shortcut found by Yann Ollivier to obtain a harmonic mean bound from Theorem 6. Using a classical method, we will prove, thanks to Theorem 6, the following result, which is a weaker version of Theorem 4:
Theorem 9
Let $M$ be a compact Riemannian manifold with positive Ricci curvature, and $f$ be any $1$-Lipschitz function on $M$. Then the variance of $f$ is at most the average of $\frac{1}{\mathrm{Ric}}$.

Indeed, the Poincaré inequality states that $\mathrm{Var}_\mu(f) \leq \frac{1}{\lambda}\int \|\nabla f\|^2\,\mathrm{d}\mu$, and the integral on the right-hand side is at most $1$ for $1$-Lipschitz functions. In [11], E. Milman shows that the converse is true, i.e. a control on the variance of Lipschitz functions (and even on the $L^1$ norm of $0$-mean Lipschitz functions) implies a Poincaré inequality, with a universal loss in the constants, under the hypothesis of a Bakry–Émery $CD(0,\infty)$ curvature-dimension inequality.

Proof of Theorem 9:
We only have to prove the result for $f$ regular enough, and use a density argument to get the result for non-regular $f$. We consider the semigroup $P_t$ generated by the Laplace operator. The limit of $P_t$ when $t$ tends to infinity is the operator which associates to $f$ the constant function equal to the mean of $f$ (with respect to the normalized Riemannian volume measure). So the variance of $f$ is the limit of the mean of $P_t(f^2) - (P_t(f))^2$ when $t$ tends to infinity. We have
$$P_t(f^2) - (P_t(f))^2 = \int_0^t \frac{\mathrm{d}}{\mathrm{d}s}\left(P_s\left((P_{t-s}(f))^2\right)\right)\mathrm{d}s = \int_0^t P_s\left(2\|\nabla(P_{t-s}(f))\|^2\right)\mathrm{d}s.$$
Integrating over $M$ yields
$$\int_M \left(P_t(f^2)(x) - (P_t(f)(x))^2\right)\mathrm{d}\,\mathrm{vol}(x) = 2\int_M \int_0^t \|\nabla(P_{t-s}(f))(x)\|^2\,\mathrm{d}s\,\mathrm{d}\,\mathrm{vol}(x).$$
Thanks to Theorem 6, by taking $y$ very close to $x$, we have $\|\nabla(P_{t-s}(f))(x)\| \leq \mathbb{E}_{\mathbb{P}_x}\left[\mathrm{e}^{-\int_0^{t-s}\mathrm{Ric}(X_u)\,\mathrm{d}u}\|\nabla f(X_{t-s})\|\right]$, where the right-hand side is the expectation of the term inside the brackets when $X$ has the law $\mathbb{P}_x$ of the twice-accelerated Brownian motion on $M$ starting at $x$. Using the convexity of the exponential function, and the fact that $f$ is $1$-Lipschitz, we then get
$$\begin{aligned}
\int_M \left(P_t(f^2)(x) - (P_t(f)(x))^2\right)\mathrm{d}\,\mathrm{vol}(x)
&\leq 2\int_M\int_0^t \left(\mathbb{E}_{\mathbb{P}_x}\left[\mathrm{e}^{-\int_0^{t-s}\mathrm{Ric}(X_u)\,\mathrm{d}u}\right]\right)^2\mathrm{d}s\,\mathrm{d}\,\mathrm{vol}(x)\\
&\leq 2\int_M\int_0^t \mathbb{E}_{\mathbb{P}_x}\left[\mathrm{e}^{-2\int_0^{t-s}\mathrm{Ric}(X_u)\,\mathrm{d}u}\right]\mathrm{d}s\,\mathrm{d}\,\mathrm{vol}(x)\\
&\leq 2\int_0^t\int_M \mathbb{E}_{\mathbb{P}_x}\left[\int_0^1 \mathrm{e}^{-2(t-s)\,\mathrm{Ric}(X_{(t-s)u})}\,\mathrm{d}u\right]\mathrm{d}\,\mathrm{vol}(x)\,\mathrm{d}s\\
&= 2\int_0^t\int_M \mathrm{e}^{-2(t-s)\,\mathrm{Ric}(x)}\,\mathrm{d}\,\mathrm{vol}(x)\,\mathrm{d}s\\
&= \int_M \frac{1 - \mathrm{e}^{-2t\,\mathrm{Ric}(x)}}{\mathrm{Ric}(x)}\,\mathrm{d}\,\mathrm{vol}(x).
\end{aligned}$$
We just have to take the limit when $t$ tends to infinity and divide by $\int_M \mathrm{d}\,\mathrm{vol}(x)$ to get the theorem. $\Box$

In this section, we introduce the coarse Ricci curvature $\kappa$ for general diffusions and give an explicit formula. Then we construct the coupling of Theorem 6, we show why the $(H)$ condition is needed, and we define $\tilde\kappa$ when it is satisfied.
Following what is done in [12] for Markov chains, we define the coarse Ricci curvature of diffusions as the rate of decay of the Wasserstein distance $W_1$ between the measures associated with the diffusion starting at two different points:

Definition 10
Let $M$ be a Riemannian manifold and $P_t$ be the semigroup of a diffusion on $M$. The coarse Ricci curvature between two different points $x$ and $y$ is the following quantity:
$$\kappa(x,y) = \lim_{t\to 0}\frac{d(x,y) - W_1(\delta_x.P_t, \delta_y.P_t)}{t\,d(x,y)}.$$
The Wasserstein distance $W_1$ between two measures is the infimum over all couplings of the expectation of the distance. Our coupling will be constructed thanks to optimal ones. To get an expression of this curvature depending only on the coefficients of the generator of the diffusion, we need to make sure that the diffusion does not move far away too fast.
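The limit in Definition 10 can be checked by hand on a one-dimensional toy diffusion (my example, not the paper's): for the Ornstein–Uhlenbeck process $\mathrm{d}X_t = -aX_t\,\mathrm{d}t + \mathrm{d}B_t$ on $\mathbb{R}$, the law $\delta_x.P_t$ is Gaussian with mean $x\mathrm{e}^{-at}$ and a variance independent of the starting point, so the comonotone coupling is optimal, $W_1(\delta_x.P_t, \delta_y.P_t) = |x-y|\mathrm{e}^{-at}$, and the limit is the drift rate $a$.

```python
import numpy as np

a, x, y = 0.7, 1.0, 3.0   # hypothetical drift rate and starting points

def kappa_t(t):
    # delta_x.P_t = N(x e^{-at}, s_t^2) with the same s_t for every start,
    # so W1(delta_x.P_t, delta_y.P_t) = |x - y| e^{-at}.
    d = abs(x - y)
    w1 = d * np.exp(-a * t)
    return (d - w1) / (t * d)

# kappa(x, y) = lim_{t -> 0} kappa_t(t) = a
print(kappa_t(1e-6))
```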
Definition 11

A diffusion on $M$ is said to be locally uniformly $L$-bounded at $x$ if there exist $M_0 > 0$ and $\eta > 0$ such that for every $y \in M$ with $d(x,y) < \eta$ and every $0 < t < \eta$, $\int d(x,z)\,\mathrm{d}(\delta_y.P_t)(z) \leq M_0$.

Theorem 13

Take two distinct points $x$ and $y$ in $M$, such that $A$ and $F$ are continuous at $x$ and $y$, and that the diffusion is locally uniformly $L$-bounded at $x$ and $y$. Assume that the distance between two points in the neighborhoods of $x$ and $y$ admits the following second-order Taylor expansion:
$$d(\exp_x(\varepsilon v), \exp_y(\varepsilon w)) = d(x,y)\left(1 + \varepsilon\left(l^{(1)}_i v^i + l^{(2)}_j w^j\right) + \frac{\varepsilon^2}{2}\left(q^{(1)}_{ii'}v^iv^{i'} + q^{(2)}_{jj'}w^jw^{j'} + 2q^{(12)}_{ij}v^iw^j\right) + o\left(\varepsilon^2|\ln(\varepsilon)|\right)\right).$$
Then the coarse Ricci curvature between $x$ and $y$ is:
$$\kappa(x,y) = -l^{(1)}_iF^i(x) - l^{(2)}_jF^j(y) - \frac{q^{(1)}_{ii'}A^{ii'}(x) + q^{(2)}_{jj'}A^{jj'}(y)}{2} + \mathrm{tr}\sqrt{A^{ii'}(x)\,q^{(12)}_{i'j}\,A^{jj'}(y)\,q^{(12)}_{j'i}}.$$
Here the matrix $S = A(x)q^{(12)}A(y)q^{(12)T}$ is diagonalizable with non-negative eigenvalues, since it is the product of two symmetric non-negative matrices, so $S$ admits a unique diagonalizable square root $R$ with non-negative eigenvalues, and the last term of the formula is simply $\mathrm{tr}(R)$.

Remark 14 We do not assume here that $d$ is the usual geodesic distance on the manifold $M$, but only that it admits a nice second-order Taylor expansion. For example, we can take for $d$ the Euclidean distance on the sphere $S^n$ embedded in $\mathbb{R}^{n+1}$.

Proof: The idea is to approximate the distributions $\delta_x.P_t$ and $\delta_y.P_t$ for small $t$ by Gaussian distributions in the tangent spaces $T_xM$ and $T_yM$, and to approximate the distance by its second-order Taylor expansion.
We can describe the process $x(t)$ starting at $x$ in the exponential map by the equation:
$$\mathrm{d}X^i(t) = B^i_\alpha(X(t))\,\mathrm{d}W^\alpha(t) + F'^i(X(t))\,\mathrm{d}t$$
where $W(t)$ is a Brownian motion in $\mathbb{R}^n$, $B^i_\alpha(0)\delta^{\alpha\alpha'}B^{i'}_{\alpha'}(0) = A^{ii'}(x)$ and $F'^i(0) = F^i(x)$, and $B$ and $F'$ are continuous (because of the continuity of $A$ and $F$) and defined in a neighborhood of $0$. Keep in mind that $X(t)$ may not be defined for every $t > 0$, but we have $x(t) = \exp_x(X(t))$ when it is. We will approximate $X^i(t)$ by
$$X^{(0)i}(t) = B^i_\alpha(0)W^\alpha(t) + tF^i(x),$$
which has the Gaussian law $\mathcal{N}(tF(x), tA(x))$. For small $t$, the ball $K_t$ of radius $\sqrt{(2A^{ii'}(x)g_{ii'}(x) + 2)\,t|\ln(t)|}$ of $T_xM$ is included in the definition domain of $B$ and $F'$. We will show that $X(s)$ remains in $K_t$ for $0 \leq s \leq t$ with probability $1 - o(t)$. Let $T_t$ be the exit time of $X(s)$ from $K_t$, and
$$X_t(s) = \begin{cases} X(s) & \text{if } s \leq T_t\\ X^{(0)}(s) - X^{(0)}(T_t) + X(T_t) & \text{if } s > T_t.\end{cases}$$
We want to prove that $\|X_t|_{[0,t]}\|_\infty = \sup_{s\in[0,t]}\|X_t(s)\| \leq \sqrt{(2A^{ii'}(x)g_{ii'}(x)+2)\,t|\ln(t)|}$ with probability $1 - o(t)$. We first prove that $\|(X_t - X^{(0)})|_{[0,t]}\|_\infty = o(\sqrt{t|\ln(t)|})$ with probability $1 - o(t)$, starting from $\mathrm{d}(X_t - X^{(0)})(s) = (B(X(s)) - B(0))\,\mathrm{d}W(s) + (F'(X(s)) - F'(0))\,\mathrm{d}s$ for $s \leq T_t$ and using the continuity of $B$ and $F'$ at $0$. The same analysis at $y$, the approximation of the two marginals by the Gaussian laws $\mathcal{N}(tF(x), tA(x))$ and $\mathcal{N}(tF(y), tA(y))$, and Lemma 15 below applied with $D = q^{(12)}$ lead to
$$W_1(\delta_x.P_t, \delta_y.P_t) = d(x,y)\left(1 + t\left(l^{(1)}_iF^i(x) + l^{(2)}_jF^j(y) + \frac12\left(q^{(1)}_{ii'}A^{ii'}(x) + q^{(2)}_{jj'}A^{jj'}(y)\right) - \mathrm{tr}\sqrt{A(x)q^{(12)}A(y)q^{(12)T}}\right) + o(t)\right),$$
which gives the announced formula for $\kappa(x,y)$. $\Box$

Lemma 15 Let $A^{ii'}$ and $B^{jj'}$ be two symmetric non-negative tensors belonging to $E_1\otimes E_1$ and $E_2\otimes E_2$, with $E_1$ and $E_2$ two finite-dimensional $\mathbb{R}$-vector spaces, not necessarily of the same dimension. Let $D_{ij}$ be a tensor belonging to $E_1^*\otimes E_2^*$. Then the minimum of $\mathbb{E}[D_{ij}X^iY^j]$ over all couplings between $X$ of law $\mathcal{N}(0,A)$ and $Y$ of law $\mathcal{N}(0,B)$ is
$$-\mathrm{tr}\sqrt{A^{ii'}D_{i'j}B^{jj'}D_{j'i}}.$$
Proof: The quantity to be minimized only depends on the covariance $C^{ij} = \mathbb{E}[X^iY^j]$ between $X$ and $Y$ (this quantity is $C^{ij}D_{ij}$).
So our problem is equivalent to minimizing $C^{ij}D_{ij}$ over the set of all possible $C$ such that there exists a coupling between $X$ and $Y$ whose covariance is $C$. Since $X$ and $Y$ are Gaussian, $C$ is the covariance of a coupling between $X$ and $Y$ if and only if
$$\begin{pmatrix} A & C\\ C^T & B\end{pmatrix}$$
is a symmetric non-negative matrix (because there exists a Gaussian coupling having this covariance). This condition is equivalent to: $\forall (X^*_i, Y^*_j) \in E_1^*\times E_2^*$, $X^*_iA^{ii'}X^*_{i'} + Y^*_jB^{jj'}Y^*_{j'} + 2X^*_iC^{ij}Y^*_j \geq 0$, which is equivalent to $\forall (X^*_i, Y^*_j)$, $|X^*_iC^{ij}Y^*_j| \leq \sqrt{X^*_iA^{ii'}X^*_{i'}\,Y^*_jB^{jj'}Y^*_{j'}}$. In particular, this implies $C \in \mathrm{Im}(A)\otimes\mathrm{Im}(B)$ (just take $X^* \in \mathrm{Ker}(A)$ or $Y^* \in \mathrm{Ker}(B)$ and remember $\mathrm{Im}(A^T) = (\mathrm{Ker}(A))^\perp$).

Let $n_1 = \mathrm{rk}(A)$ and $n_2 = \mathrm{rk}(B)$ be the ranks of $A$ and $B$. Using suitable bases of $\mathrm{Im}(A)$ and $\mathrm{Im}(B)$, we find "square roots" $A'^i_\alpha$ and $B'^j_\beta$ of $A$ and $B$, in the sense that $A'I^{(n_1)}A'^T = A$ and $B'I^{(n_2)}B'^T = B$ (with $I^{(n_1)}$ the scalar product on a canonical $n_1$-dimensional Euclidean space and $I^{(n_1)*}$ the associated scalar product on the dual of this space). Then $A'$ and $B'$ admit left inverses $A'^-$ and $B'^-$ (here we do not necessarily have unicity, we just choose two left inverses). We set $C' = A'^-CB'^{-T}$. We have
$$\begin{pmatrix} A & C\\ C^T & B\end{pmatrix} \geq 0 \Leftrightarrow \begin{pmatrix} A'^- & 0\\ 0 & B'^-\end{pmatrix}\begin{pmatrix} A & C\\ C^T & B\end{pmatrix}\begin{pmatrix} A'^{-T} & 0\\ 0 & B'^{-T}\end{pmatrix} \geq 0 \Leftrightarrow \begin{pmatrix} I^{(n_1)} & C'\\ C'^T & I^{(n_2)}\end{pmatrix} \geq 0$$
($A'A'^-$ restricted to $\mathrm{Im}(A)$ is the identity, and likewise for $B'B'^-$). So we have reduced the problem to the case where $E_1'$ and $E_2'$ are Euclidean spaces of dimensions $n_1$ and $n_2$, with $A = I^{(n_1)}$, $B = I^{(n_2)}$ and $D' = A'^TDB'$ instead of $D$. There exist two nice orthogonal bases such that the matrix of $D'$ in the associated dual bases has the following form:
$$D' = \begin{pmatrix}\mathrm{diag}(\lambda_1,\ldots,\lambda_r) & 0\\ 0 & 0\end{pmatrix}$$
with $\mathrm{diag}(\lambda_1,\ldots,\lambda_r)$ the diagonal matrix with coefficients
$\lambda_1,\ldots,\lambda_r$, where $\lambda_k > 0$ and $r = \mathrm{rk}(D') = \mathrm{rk}(ADB)$; furthermore, we have unicity of the coefficients $\lambda_k$. This result can be proved thanks to the polar decomposition. We have
$$\begin{pmatrix} I^{(n_1)} & C'\\ C'^T & I^{(n_2)}\end{pmatrix} \geq 0 \Leftrightarrow \|C'\|_{\mathrm{op}} \leq 1$$
with $\|C'\|_{\mathrm{op}}$ the operator norm of $C'$ associated with the Euclidean norms, hence the coefficients of $C'$ are greater than or equal to $-1$. The minimum of $C'^{\alpha\beta}D'_{\alpha\beta}$ is then $-\sum_{k=1}^r\lambda_k$, and it is attained when the matrix of $C'$ in the nice bases is
$$C' = \begin{pmatrix} -I_r & 0\\ 0 & C''\end{pmatrix}$$
with $\|C''\|_{\mathrm{op}} \leq 1$, and only for those $C'$.

The endomorphism $I^{(n_1)}D'I^{(n_2)}D'^T$ has the eigenvalues $\lambda_k^2$ and $0$ with multiplicity $n_1 - r$ (its matrix in the nice basis is $\mathrm{diag}(\lambda_1^2,\ldots,\lambda_r^2,0,\ldots,0)$), and
$$I^{(n_1)}D'I^{(n_2)}D'^T = I^{(n_1)}A'^TDB'I^{(n_2)}B'^TD^TA' = I^{(n_1)}A'^TDBD^TA'.$$
For any two matrices $M, N$ of sizes $p\times q$ and $q\times p$, we have for every $m \in \mathbb{N}^*$, $\mathrm{tr}((MN)^m) = \mathrm{tr}((NM)^m)$, so $MN$ and $NM$ have the same eigenvalues with the same multiplicities, except for the eigenvalue $0$, where the difference of the multiplicities is $|p - q|$. So the matrix $A'I^{(n_1)}A'^TDBD^T = ADBD^T$ also has the eigenvalues $\lambda_k^2$ and $0$ with some multiplicity. The $\lambda_k$ are then the non-zero eigenvalues of $\sqrt{ADBD^T}$. So the minimum we were looking for is $-\sum_{k=1}^r\lambda_k = -\mathrm{tr}(\sqrt{ADBD^T})$ ($= -\mathrm{tr}(\sqrt{BD^TAD})$, so the symmetry between $A$ and $B$ is respected, which was not straightforward by looking at the formula). $\Box$

The two following remarks provide a good understanding of what the set of solutions of our minimization problem looks like.

Remark 16 The set of all possible covariances is convex and compact, and the quantity to minimize is linear, so the minimum is attained at an extremal point of this convex set. Suppose $n_1 \geq n_2$; then in the case of an extremal covariance, the coupling between $X$ and $Y$ has the form $Y^j = M^j_iX^i$.
Indeed, $C \mapsto A'^-CB'^{-T}$ restricted to $\mathrm{Im}(A)\otimes\mathrm{Im}(B)$ is linear and bijective, so $C$ is an extremal covariance if and only if $C'$ is an extremal tensor of operator norm smaller than or equal to $1$. We know that for any tensor $C'$ there exist two orthogonal bases in which the matrix of $C'$ can be written:
$$C' = \begin{pmatrix}\mathrm{diag}(\mu_1,\ldots,\mu_{n_2})\\ 0\end{pmatrix}$$
with $\mu_k \geq 0$. The operator norm of $C'$ is then $\max_{1\leq k\leq n_2}|\mu_k|$. So $C'$ is an extremal tensor of norm at most $1$ if and only if $\mu_k = 1$ for every $k$. Indeed, if at least one $\mu_k$ is strictly smaller than $1$, $C'$ is a non-trivial convex combination of the tensors whose matrices in the same bases are
$$\begin{pmatrix}\mathrm{diag}(\varepsilon_1,\ldots,\varepsilon_{n_2})\\ 0\end{pmatrix}$$
with $\varepsilon_i = \pm 1$, and each of these tensors has operator norm $1$. And conversely, if $\mu_k = 1$ for every $k$, then $C'$ is an extremal tensor of norm at most $1$. Assume that $C' = tC^{(1)} + (1-t)C^{(2)}$ with $t \in\,]0,1[$, and $C^{(1)}$ and $C^{(2)}$ have an operator norm smaller than or equal to $1$. Then the matrices of $C^{(1)}$ and $C^{(2)}$ have coefficients smaller than or equal to $1$, so their coefficients on the "diagonal" must be $1$. The coefficients outside the "diagonal" are $0$ because the sum of the squared coefficients on each row and each column is less than $1$. So $C^{(1)} = C^{(2)} = C'$.

If $C$ is an extremal covariance, we have $C'^TI^{(n_1)}C' = I^{(n_2)}$ (just do the product of the matrices in the nice bases). We set then $M = C^TA'^{-T}I^{(n_1)}A'^- = B'C'^TI^{(n_1)}A'^-$. The covariance of $Y - MX$ is
$$B - MC - C^TM^T + MAM^T = B - [B'C'^TI^{(n_1)}A'^-][A'C'B'^T] - [B'C'^TA'^T][A'^{-T}I^{(n_1)}C'B'^T] + [B'C'^TI^{(n_1)}A'^-][A'I^{(n_1)*}A'^T][A'^{-T}I^{(n_1)}C'B'^T] = B - B'C'^TI^{(n_1)}C'B'^T = 0.$$
So $Y = MX$, as previously said.

Remark 17 For any solution $C$ of our minimization problem, we have
$$CD^TC = A'C'B'^TD^TA'C'B'^T = A'C'D'^TC'B'^T = A'I^{(n_1)}D'I^{(n_2)}B'^T = A'I^{(n_1)}A'^TDB'I^{(n_2)}B'^T = ADB.$$
In particular, we have $(CD^T)^2 = ADBD^T$ and $(D^TC)^2 = D^TADB$. If we take $C_0$ the solution which corresponds to $C'' = 0$, $C_0$ is the unique solution with minimal rank (hence the optimal coupling with "the least correlation" between $X$ and $Y$). We have $\mathrm{rk}(C_0) = \mathrm{rk}(ADB) = \mathrm{rk}(ADBD^T)$, so $\mathrm{rk}(C_0D^T) \leq \mathrm{rk}(ADBD^T)$, and furthermore $\mathrm{tr}(C_0D^T) = -\mathrm{tr}(\sqrt{ADBD^T})$, hence we have $C_0D^T = -\sqrt{ADBD^T}$, and in a similar way $D^TC_0 = -\sqrt{D^TADB}$. Since $C_0D^TC_0 = ADB$, we have $\mathrm{Im}(C_0) \supset \mathrm{Im}(ADB)$ and $\mathrm{Im}(C_0^T) \supset \mathrm{Im}(BD^TA)$, and we have in fact equalities because these matrices have the same rank. As $ADB$, $ADBD^T$ and $D^TADB$ have the same rank, there exist $E$ and $F$ (which will play the role of $D^{-T}$) such that $ED^TADB = ADB = ADBD^TF$, and then $C_0$ is given by the formula
$$C_0 = -\sqrt{ADBD^T}\,F = -E\sqrt{D^TADB}.$$
For the other solutions, we have
$$C - C_0 = A'\begin{pmatrix}0 & 0\\ 0 & C''\end{pmatrix}B'^T.$$
The condition $\|C''\|_{\mathrm{op}} \leq 1$ is equivalent to the positivity of
$$\begin{pmatrix}I_{n_1-r} & C''\\ C''^T & I_{n_2-r}\end{pmatrix},$$
which is equivalent to the positivity of
$$\begin{pmatrix}A - C_0B^-C_0^T & C - C_0\\ (C - C_0)^T & B - C_0^TA^-C_0\end{pmatrix}$$
where $A^- = A'^{-T}I^{(n_1)}A'^-$ and $B^- = B'^{-T}I^{(n_2)}B'^-$. But we would find the same results for the products $C_0B^-C_0^T$ and $C_0^TA^-C_0$ by taking any $A^-$ and $B^-$ such that $AA^-A = A$ and $BB^-B = B$, so this does not depend on the choice of $A'$, $A'^-$, $B'$ or $B'^-$.

We can split $\mathrm{Im}(A)$ as the direct sum of $\mathrm{Im}(ADB)$ and the orthogonal (for the quadratic form induced by $A$ on $\mathrm{Im}(A)$) of this space (which can be written as $\mathrm{Im}(A)\cap\mathrm{Ker}(BD^T)$). The two matrices $C_0B^-C_0^T$ and $A - C_0B^-C_0^T$ correspond to the decomposition of $A$ on these two subspaces. A similar remark is valid for the matrices $C_0^TA^-C_0$ and $B - C_0^TA^-C_0$ with respect to the decomposition of $\mathrm{Im}(B)$ as the sum of $\mathrm{Im}(BD^TA)$ and its orthogonal. An optimal coupling is then any coupling of $X$ and $Y$ such that the covariance between the orthogonal projections (with respect to $A$ and $B$) of $X$ and $Y$ on $\mathrm{Im}(ADB)$ and $\mathrm{Im}(BD^TA)$ is $C_0$.
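Lemma 15 and Remark 17 lend themselves to a numerical check. The sketch below (random PSD matrices in NumPy/SciPy; the pseudo-inverse stands in for the $F$ of Remark 17, an implementation choice of mine) builds the minimal-rank covariance $C_0 = -\sqrt{ADBD^T}F$ and verifies that it is admissible and attains $-\mathrm{tr}\sqrt{ADBD^T}$:

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(0)
A = (lambda m: m @ m.T)(rng.standard_normal((3, 3)))   # covariance of X (rank 3)
B = (lambda m: m @ m.T)(rng.standard_normal((2, 2)))   # covariance of Y (rank 2)
D = rng.standard_normal((3, 2))                        # cost tensor: E[D_ij X^i Y^j]

# Closed form of Lemma 15.
S = A @ D @ B @ D.T
minimum = -np.trace(sqrtm(S)).real

# Minimal-rank optimal covariance of Remark 17: C0 = -sqrt(ADBD^T) F with
# (ADBD^T) F = ADB; here F is realized by a pseudo-inverse.
C0 = -sqrtm(S).real @ np.linalg.pinv(S) @ (A @ D @ B)

# C0 attains the minimum: sum_ij C0_ij D_ij = -tr sqrt(ADBD^T) ...
assert abs(np.sum(C0 * D) - minimum) < 1e-6
# ... and is admissible: the joint Gaussian block matrix is PSD.
J = np.block([[A, C0], [C0.T, B]])
assert np.linalg.eigvalsh(J).min() > -1e-8
```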
κ(x, y) when x and y are close

Let us look at what the formula given by Theorem 13 for $\kappa(x,y)$ becomes when we take $y = \exp_x(\delta u)$, $d$ the usual geodesic distance on Riemannian manifolds, and let $\delta$ tend to $0$. We have the following result, which gives the second-order Taylor expansion of the geodesic distance on Riemannian manifolds.

Lemma 18 Let $x \in M$, $(u,v,w) \in (T_xM)^3$ such that $g_{ij}u^iu^j = 1$, $y = \exp_x(\delta u)$, and $w' \in T_yM$ obtained from $w$ by parallel transport along the geodesic $t \mapsto \exp_x(\delta tu)$. Then we have, for fixed small enough $\delta$, the following Taylor expansion in $\varepsilon$:
$$d(\exp_x(\varepsilon v), \exp_y(\varepsilon w')) = \delta\left(1 + \frac{\varepsilon}{\delta}u^ig_{ij}(w^j - v^j) + \frac{\varepsilon^2}{2\delta^2}\left(r^{(1)}_{ij}v^iv^j + r^{(2)}_{ij}w^iw^j + 2r^{(12)}_{ij}v^iw^j\right) + O(\varepsilon^3)\right)$$
with
$$r^{(1)}_{ij} = g_{ij} - g_{ik}u^ku^lg_{lj} - \frac{\delta^2}{3}R_{kilj}u^ku^l + o(\delta^2)$$
$$r^{(2)}_{ij} = g_{ij} - g_{ik}u^ku^lg_{lj} - \frac{\delta^2}{3}R_{kilj}u^ku^l + o(\delta^2)$$
$$r^{(12)}_{ij} = -g_{ij} + g_{ik}u^ku^lg_{lj} - \frac{\delta^2}{6}R_{kilj}u^ku^l + o(\delta^2),$$
where $R_{kilj}$ is the Riemann tensor of the manifold, and $r^{(1)}_{ij}u^iv^j = r^{(2)}_{ij}u^iv^j = r^{(12)}_{ij}u^iv^j = r^{(12)}_{ij}u^jv^i = 0$ (and not only $o(\delta^2)$).

Proof: We will take $\delta$ small enough that $(x,y)$ does not belong to the cut-locus. Then the Riemannian distance is smooth on a neighborhood of $(x,y)$. For the term in $\varepsilon$, the well-known fact that the sphere of center $x$ and radius $\delta$ is orthogonal at $y$ to the geodesic joining $x$ to $y$ gives us that the part of this term depending on $w$ is proportional to $g_{ij}u^iw^j$. A similar argument holds for the term in $\varepsilon$ depending on $v$. Taking $v$ and $w$ proportional to $u$ gives the two constants, so we have the term in $\varepsilon$. For the term in $\varepsilon^2$, we only show that it depends only on the orthogonal projections of $v$ and $w$ on the orthogonal of $u$, the proof of the behaviour in $\delta$ being based on tedious calculations.
We define $\Sigma_x$ as the image by the exponential map at $x$ of a small ball of the orthogonal of $u$, and $\Sigma_y$ as the image by the exponential map at $y$ of a small ball of the orthogonal of $u'$. For $\varepsilon$ small enough, the geodesic between $x_1 = \exp_x(\varepsilon v)$ and $y_1 = \exp_y(\varepsilon w')$ intersects $\Sigma_x$ and $\Sigma_y$ at $x_2 = \exp_x(\varepsilon v_2)$ and $y_2 = \exp_y(\varepsilon w'_2)$ (we may have to extend the geodesic of $O(\varepsilon^2)$ beyond $x_1$ and $y_1$). We have $d(x_1,y_1) = d(x_1,x_2) + d(x_2,y_2) + d(y_2,y_1)$, with $d(x_1,x_2)$ counted negatively if we needed to extend the geodesic beyond $x_1$, and the same for $d(y_2,y_1)$. We also have $v_2 = v_\perp + O(\varepsilon)$ and $w_2 = w_\perp + O(\varepsilon)$, where $v_\perp = v - \langle u,v\rangle u$ and $w_\perp = w - \langle u,w\rangle u$, and the $O(\varepsilon)$ are orthogonal to $u$. Since in the exponential map the variation of the metric is of order $2$, we have $d(x_1,x_2) = \varepsilon\|v - v_2\|(1 + O(\varepsilon^2))$ and $d(y_1,y_2) = \varepsilon\|w - w_2\|(1 + O(\varepsilon^2))$. So we get $d(x_1,x_2) = -\varepsilon\langle u,v\rangle + O(\varepsilon^2)$ and $d(y_2,y_1) = \varepsilon\langle u,w\rangle + O(\varepsilon^2)$, so we find the terms in $\varepsilon$ we expected, and no terms in $\varepsilon^2$. As $v_2$ and $w_2$ are orthogonal to $u$, we get $d(x_2,y_2) = d(\exp_x(\varepsilon v_\perp), \exp_y(\varepsilon w'_\perp)) + O(\varepsilon^3)$. So the $\varepsilon^2$ term depends only on $v_\perp$ and $w_\perp$, as wanted. $\Box$

From Theorem 13 and Lemma 18, we get:

Theorem 19 Suppose we have a diffusion process on a manifold $M$ such that $A$ and $F$ are $C^1$, $\mathrm{rk}(A) = n$ everywhere, and the diffusion is locally uniformly $L$-bounded. Then $\kappa(x, \exp_x(\delta u))$ converges to
$$\kappa(x,u) := -u^ig_{ij}u^k\nabla_kF^j + \frac12 R_{kilj}A^{ij}u^ku^l - \frac14\,\overline{u^i\nabla_iA}^{\alpha\beta}\left(\left(\bar g^{-1}\otimes\bar A + \bar A\otimes\bar g^{-1}\right)^{-1}\right)_{\alpha\gamma\delta\beta}\overline{u^j\nabla_jA}^{\gamma\delta}$$
when $\delta$ tends to $0$. Here, for any $M \in T_xM\otimes T_xM$, we denote by $\bar M$ the canonical projection of $M$ to $(T_xM/\mathrm{Vect}(u))\otimes(T_xM/\mathrm{Vect}(u))$, and the tensor $T_{ijkl} = \left(\left(\bar g^{-1}\otimes\bar A + \bar A\otimes\bar g^{-1}\right)^{-1}\right)_{ijkl}$ is uniquely defined by the relationship:
$$T_{ijkl}\left(\bar g^{-1\,jm}\bar A^{kn} + \bar A^{jm}\bar g^{-1\,kn}\right) = \delta_i^m\delta_l^n.$$
The contraction $T_{ijkl}M^{jk}$ is the unique matrix $N_{il}$ such that $\bar AN\bar g^{-1} + \bar g^{-1}N\bar A = M$.
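The defining relation for the contraction with $T$ is a Sylvester-type equation. A small numerical sketch (flat metric $g = I$ and hypothetical matrices, so not the curved setting of the theorem) solves $AN + NA = M$ and checks the characteristic $1/(a_i + a_j)$ denominators in an eigenbasis of $A$:

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(3)
# Flat-metric toy case g = I: the relation A N g^{-1} + g^{-1} N A = M
# becomes the Sylvester equation A N + N A = M.
A = (lambda m: m @ m.T + np.eye(3))(rng.standard_normal((3, 3)))  # SPD diffusion matrix
M = (lambda m: m + m.T)(rng.standard_normal((3, 3)))              # symmetric right-hand side

N = solve_sylvester(A, A, M)          # unique N with A N + N A = M
assert np.allclose(A @ N + N @ A, M)

# In an eigenbasis of A the solution is entrywise M_ij / (a_i + a_j).
a, P = np.linalg.eigh(A)
N2 = P @ ((P.T @ M @ P) / (a[:, None] + a[None, :])) @ P.T
assert np.allclose(N, N2)
```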
Remark 20 In the special case $A^{ij} = g^{ij}$, we find the usual curvature of the Bakry–Émery theory:
$$\kappa(x,u) = -\langle u, \nabla_uF\rangle + \frac12\mathrm{Ric}(u,u).$$
Proof: The hypothesis that $A$ and $F$ are $C^1$ gives us that the parallel transports of $A(y)$ and $F(y)$ along the geodesic are $A^{ij}(x) + \delta u^k\nabla_kA^{ij}(x) + o(\delta) = A^{ij}(x) + \delta E^{ij}(\delta)$, where $E^{ij}(\delta)$ tends to $u^k\nabla_kA^{ij}(x)$ when $\delta$ tends to $0$, and $F^i(x) + \delta u^k\nabla_kF^i(x) + o(\delta)$. The application of Theorem 13 and Lemma 18 leads to
$$\kappa(x,\exp_x(\delta u)) = \frac{u^ig_{ij}}{\delta}\left(F^j - \left(F^j + \delta u^k\nabla_kF^j + o(\delta)\right)\right) + \frac{1}{\delta^2}\left[-\left(A^{ij} + \frac{\delta}{2}E^{ij}(\delta)\right)\left(g_{ij} - g_{ik}u^ku^lg_{lj} - \frac{\delta^2}{3}R_{kilj}u^ku^l\right) + \mathrm{tr}\left(\sqrt{Ar^{(12)}(\delta)(A + \delta E(\delta))r^{(12)T}(\delta)}\right)\right] + o(1).$$
The difficult point is to understand the behaviour of the square root when $\delta$ tends to $0$. The quantity under the square root tends to $(A^{ik}(g_{kj} - g_{kl}u^ku^lg_{lj}))^2$, which is of rank $n-1$ (the rank of the projection on the orthogonal of $u$). The square root of matrices is an analytic function in a neighborhood of matrices with positive eigenvalues. This is why we quotient the space $T_xM$ by $\mathrm{Vect}(u)$ (thanks to Lemma 18, we know that $r^{(12)} \in u^\perp\otimes u^\perp$).

We need the second-order Taylor expansion of $\mathrm{tr}(\sqrt{M + \varepsilon N})$ with $M$ a diagonalizable matrix with positive eigenvalues. We have $\sqrt{M + \varepsilon N} = M^{1/2} + \varepsilon H + \varepsilon^2K + O(\varepsilon^3)$, so we have $HM^{1/2} + M^{1/2}H = N$ and $H^2 + KM^{1/2} + M^{1/2}K = 0$. If we work in a diagonalization basis of $M$ (with $\lambda^{(i)}$ the eigenvalues of $M$), we get:
$$H_{ij} = \frac{N_{ij}}{\lambda^{(i)1/2} + \lambda^{(j)1/2}}\quad\text{and}\quad K_{ij} = -\frac{1}{\lambda^{(i)1/2} + \lambda^{(j)1/2}}\sum_k \frac{N_{ik}}{\lambda^{(i)1/2} + \lambda^{(k)1/2}}\cdot\frac{N_{kj}}{\lambda^{(k)1/2} + \lambda^{(j)1/2}}.$$
So we have:
$$\mathrm{tr}(H) = \sum_i \frac{N_{ii}}{2\lambda^{(i)1/2}} = \frac12\mathrm{tr}(M^{-1/2}N)$$
and
$$\mathrm{tr}(K) = -\sum_{i,j}\frac{N_{ij}N_{ji}}{2\lambda^{(i)1/2}\left(\lambda^{(i)1/2} + \lambda^{(j)1/2}\right)^2} = -\frac14\sum_{i,j}\frac{N_{ij}N_{ji}}{\lambda^{(i)1/2}\lambda^{(j)1/2}\left(\lambda^{(i)1/2} + \lambda^{(j)1/2}\right)} = -\frac14\mathrm{tr}\left(M^{-1/2}N\left(\left(I\otimes M^{1/2} + M^{1/2}\otimes I\right)^{-1}\left(M^{-1/2}N\right)\right)\right).$$
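Both trace formulas above can be verified numerically by finite differences on $\mathrm{tr}\sqrt{M + \varepsilon N}$ (random symmetric matrices, SciPy's `sqrtm` and `solve_sylvester`; a sanity sketch of mine, not part of the proof):

```python
import numpy as np
from scipy.linalg import sqrtm, solve_sylvester

rng = np.random.default_rng(1)
M = (lambda m: m @ m.T + 3 * np.eye(4))(rng.standard_normal((4, 4)))  # positive eigenvalues
N = (lambda m: m + m.T)(rng.standard_normal((4, 4)))                  # symmetric perturbation
f = lambda e: np.trace(sqrtm(M + e * N)).real
Mh = sqrtm(M).real                                                    # M^{1/2}

# First order: tr(H) = (1/2) tr(M^{-1/2} N).
tr_H = 0.5 * np.trace(np.linalg.inv(Mh) @ N)
eps = 1e-5
assert abs((f(eps) - f(-eps)) / (2 * eps) - tr_H) < 1e-6

# Second order: tr(K) = -(1/4) tr(P X) with P = M^{-1/2} N and
# X solving the Sylvester equation M^{1/2} X + X M^{1/2} = P.
P = np.linalg.inv(Mh) @ N
X = solve_sylvester(Mh, Mh, P)
tr_K = -0.25 * np.trace(P @ X)
eps = 1e-3
assert abs((f(eps) + f(-eps) - 2 * f(0)) / eps**2 - 2 * tr_K) < 1e-4
```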
We only have to apply these results with $M = A(g - guu^Tg)$, $\varepsilon = \delta$ and
$$N = A(g - guu^Tg)E(\delta)(g - guu^Tg) + \frac{\delta}{6}\left(AR(u)A(g - guu^Tg) + A(g - guu^Tg)AR(u)\right) + o(\delta),$$
where $R_{ij}(u) = R_{kijl}u^ku^l \in u^\perp\otimes u^\perp$. We obtain:
$$\mathrm{tr}\left(\sqrt{Ar^{(12)}(\delta)(A + \delta E(\delta))r^{(12)T}(\delta)}\right) = \mathrm{tr}(A(g - guu^Tg)) + \frac{\delta}{2}\mathrm{tr}\left(\left(E(\delta) + \frac{\delta}{3}AR(u)\right)(g - guu^Tg)\right) - \frac{\delta^2}{4}\mathrm{tr}\left(\nabla_uA(g - guu^Tg)\left(\left(I\otimes M + M\otimes I\right)^{-1}\left(\nabla_uA(g - guu^Tg)\right)\right)\right) + o(\delta^2).$$
We have $\mathrm{tr}(AR(u)(g - guu^Tg)) = \mathrm{tr}(AR(u))$, and the last term can be written
$$-\frac{\delta^2}{4}\mathrm{tr}\left(\overline{\nabla_uA}\left(\left(\bar A\otimes\bar g^{-1} + \bar g^{-1}\otimes\bar A\right)^{-1}\overline{\nabla_uA}\right)\right)$$
because the inverse of $g - guu^Tg$ (acting on $T_xM/\mathrm{Vect}(u)$) is $\bar g^{-1}$. Replacing this expression of the trace of the square root in the expression of $\kappa(x,\exp_x(\delta u))$ cancels the terms of order $\delta^{-2}$ and $\delta^{-1}$, and we get the announced result. $\Box$

Remark 21 The dependency on $u$ of the last term of the formula for the curvature is generally not quadratic (because of the complicated dependency on $u$ of the tensor $(\bar A\otimes\bar g^{-1} + \bar g^{-1}\otimes\bar A)^{-1}$), but it is always non-positive and greater than or equal to the same expression without the bars (which we would have obtained by using the $W_2$ distance instead of the $W_1$ in the definition of $\kappa$), and this expression without the bars depends on $u$ in a quadratic way.

Now we will construct a coupling between the paths of the diffusion process thanks to the optimal coupling in the tangent spaces. In the case when $A$ is invertible everywhere on $M$, we have $\mathrm{rk}(A(x)q^{(12)}(x,y)A(y)) = n - 1$.
According to Remarks 16 and 17, we have two extremal covariances $C_+(x,y)$ and $C_-(x,y)$ in the set of the covariances of optimal couplings, given by the formulas:
$$C_+(x,y) = -\sqrt{A(x)q^{(12)}(x,y)A(y)q^{(12)T}(x,y)}\,p(x,y) + \frac{uu'^T}{\sqrt{(u^TA(x)^{-1}u)(u'^TA(y)^{-1}u')}}$$
$$C_-(x,y) = -\sqrt{A(x)q^{(12)}(x,y)A(y)q^{(12)T}(x,y)}\,p(x,y) - \frac{uu'^T}{\sqrt{(u^TA(x)^{-1}u)(u'^TA(y)^{-1}u')}}$$
with $y = \exp_x(\delta u)$, $\delta$ small enough, $u'$ the parallel transport of $u$, and $p(x,y)$ any matrix such that
$$A(x)q^{(12)}(x,y)A(y)q^{(12)T}(x,y)\,p(x,y) = A(x)q^{(12)}(x,y)A(y).$$
The extremal covariance $C_+(x,y)$ tends to $A(x)$ when $y$ tends to $x$, whereas $C_-(x,y)$ tends to $A(x) - \frac{2uu^T}{u^TA(x)^{-1}u}$ when $u$ stays fixed and $\delta$ tends to $0$, so the coupling with $C_+(x,y)$ generalizes the coupling by parallel transport, whereas the one with $C_-(x,y)$ generalizes the coupling by reflection introduced by Kendall in [9]. Here we will use $C_+$ to construct our coupling for Theorem 6, because the behaviour of $C_-$ when $\delta$ tends to $0$ is irregular.

So we can construct a coupling between the paths as a diffusion process on $M\times M$ (at least in a neighborhood of the diagonal), whose generator is defined by:
$$L_+(f)(x,y) = \frac12\left[A(x)^{ij}\nabla^{(11)}_{ij}f(x,y) + A(y)^{ij}\nabla^{(22)}_{ij}f(x,y) + 2C_+^{ij}(x,y)\nabla^{(12)}_{ij}f(x,y)\right] + F^i(x)\nabla^{(1)}_if(x,y) + F^i(y)\nabla^{(2)}_if(x,y).$$
The coupling above in the case $A = g^{-1}$ is the one of Theorem 6.

Proof of Theorem 6: Let us consider the diffusion process of infinitesimal generator $L_+$, which is well defined outside the cut-locus of $M$. In the special case of compact Riemannian manifolds, this is true when $d(x,y)$ is strictly smaller than the injectivity radius. To get the infinitesimal variation of $d(x(t),y(t))$, we have to compute $L_+(f)$ where $f$ has the special form $f(x,y) = \varphi(d(x,y))$ with $\varphi$ regular enough ($C^2$).
We have:
$$\nabla^{(1)}_if(x,y) = \varphi'(d(x,y))\nabla^{(1)}_id(x,y)$$
$$\nabla^{(2)}_if(x,y) = \varphi'(d(x,y))\nabla^{(2)}_id(x,y)$$
$$\nabla^{(11)}_{ij}f(x,y) = \varphi'(d(x,y))\nabla^{(11)}_{ij}d(x,y) + \varphi''(d(x,y))\nabla^{(1)}_id(x,y)\nabla^{(1)}_jd(x,y)$$
$$\nabla^{(12)}_{ij}f(x,y) = \varphi'(d(x,y))\nabla^{(12)}_{ij}d(x,y) + \varphi''(d(x,y))\nabla^{(1)}_id(x,y)\nabla^{(2)}_jd(x,y)$$
$$\nabla^{(22)}_{ij}f(x,y) = \varphi'(d(x,y))\nabla^{(22)}_{ij}d(x,y) + \varphi''(d(x,y))\nabla^{(2)}_id(x,y)\nabla^{(2)}_jd(x,y)$$
with, according to Lemma 18:
$$\nabla^{(1)}_id(x,y) = -g_{ij}(x)u^j(x,y)$$
$$\nabla^{(2)}_id(x,y) = -g_{ij}(y)u^j(y,x) = g_{ij}(y)u'^j(x,y)$$
$$\nabla^{(11)}_{ij}d(x,y) = d(x,y)\,q^{(1)}_{ij}(x,y)$$
$$\nabla^{(12)}_{ij}d(x,y) = d(x,y)\,q^{(12)}_{ij}(x,y)$$
$$\nabla^{(22)}_{ij}d(x,y) = d(x,y)\,q^{(2)}_{ij}(x,y).$$
Thus we get:
$$L_+f(x,y) = \frac12\varphi''(d(x,y))\left[A^{ij}(x)g_{ik}(x)u^k(x,y)g_{jl}(x)u^l(x,y) + A^{ij}(y)g_{ik}(y)u^k(y,x)g_{jl}(y)u^l(y,x) + 2C_+^{ij}(x,y)g_{ik}(x)u^k(x,y)g_{jl}(y)u^l(y,x)\right] - d(x,y)\varphi'(d(x,y))\kappa(x,y).$$
Since we have $A = g^{-1}$, we get $C_+(x,y) = C(x,y) - u(x,y)u^T(y,x)$, with $C \in (g(x)u(x,y))^\perp\otimes(g(y)u(y,x))^\perp$. So the term containing $\varphi''(d(x,y))$ is $0$, which means that the variance of $d(x(t),y(t))$ is $o(t)$ when $t$ tends to $0$. So $\mathrm{d}\,d(x(t),y(t)) = -d(x(t),y(t))\kappa(x(t),y(t))\,\mathrm{d}t$. Then by integration of this equality, we get:
$$d(x(t),y(t)) = d(x(0),y(0))\,\mathrm{e}^{-\int_0^t\kappa(x(s),y(s))\,\mathrm{d}s}.\quad\Box$$

κ̃

The variance term of this optimal coupling is generally not $0$ in the case when $A \neq g^{-1}$ (nor a multiple of $g^{-1}$).
So we can try to use another coupling, by replacing $C^+(x,y)$ with $\tilde C(x,y)$, which is the optimal covariance (for the distance) among the covariances which cancel the variance term of $d$ (if this set is non-empty). We will prove this set is non-empty if and only if the condition
\[
(H) \iff \forall u\in T\mathcal M,\quad u^ig_{jk}u^jg_{lm}u^l\nabla_iA^{km}=0
\]
is satisfied.

Indeed, the variance term is always nonnegative, so it can vanish if and only if its minimum is $0$. This is equivalent, according to Lemma 15, to
\[
2\,\mathrm{tr}\sqrt{A(x)g(x)u(x,y)u^T(y,x)g(y)A(y)g(y)u(y,x)u^T(x,y)g(x)} = u^T(x,y)g(x)A(x)g(x)u(x,y) + u^T(y,x)g(y)A(y)g(y)u(y,x),
\]
which is equivalent to $u^T(x,y)g(x)A(x)g(x)u(x,y) = u^T(y,x)g(y)A(y)g(y)u(y,x)$ (this is the equality case in the inequality between the arithmetic and geometric means). Differentiating this condition with respect to $y$ along the geodesic starting at $x$ in the direction $u$ gives the condition $(H)$, and of course the converse implication is obtained by integration.

The hypothesis $(H)$ is a very strong hypothesis: for a given metric, the set of the possible $A$ which are nonnegative and satisfy $(H)$ is a convex cone of finite dimension. Indeed, $(H)$ is equivalent to: for every geodesic $\gamma(t)$, $A(\gamma(t))\big((g\dot\gamma(t))^{\otimes2}\big)$ is constant. We choose $x\in\mathcal M$, and we take a family of vectors $u^{(k)}$, $k=1,\dots,\frac{n(n+1)}2$, such that $\{u^{(k)\otimes2}\}$ is a basis of the symmetric tensors of $T_x\mathcal M$. Then we take $x^{(k)}=\exp_x(\varepsilon u^{(k)})$, with $\varepsilon$ small enough to have $\|\varepsilon u^{(k)}\|<r$, with $r$ the injectivity radius of $\mathcal M$. Then there exists a ball $B$ centered at $x$ such that for every $y\in B$ and every $k$, there exists a unique minimal geodesic joining $y$ and $x^{(k)}$, with velocity $v^{(k)}$ at $y$, and $\{v^{(k)\otimes2}\}$ is a basis of the symmetric tensors of $T_y\mathcal M$. The knowledge of $A$ at the points $x^{(k)}$ is sufficient to uniquely determine $A$ on the ball $B$.
For any $z\in\mathcal M$, we have $x=\exp_z(v)$ for some $v\in T_z\mathcal M$. We can find a family of vectors $v^{(k)}\in T_z\mathcal M$ in a neighborhood of $v$ such that $\{v^{(k)\otimes2}\}$ is a basis of the symmetric tensors of $T_z\mathcal M$, and $\exp_z(v^{(k)})\in B$. Then the knowledge of $A$ at the points $x^{(k)}$ uniquely determines $A$ on $\mathcal M$. This argument also shows that $A$ is smooth, and that the second-order Taylor expansion of $A$ in the neighborhood of a single point is sufficient to determine $A$ on the whole manifold. The condition $(H)$, and the equations obtained by differentiating it twice, show that this Taylor expansion must belong to a subspace of dimension $\frac{n(n+1)^2(n+2)}{12}$.

The following examples give the set of the possible $A$ in the cases when $\mathcal M$ is a Euclidean space of dimension $n$, the sphere of dimension $n$ or the hyperbolic space of dimension $n$, providing examples where $(H)$ is satisfied without having $A^{ij}=g^{ij}$.

Example 22: In all three cases mentioned below, $\mathcal M$ can be considered as a submanifold of $E=\mathbb R^{n+1}$ such that the geodesics are the intersections of $\mathcal M$ with two-dimensional vector subspaces of $E$. Let $(e_1,\dots,e_{n+1})$ be the canonical basis of $E$ and $(e_1^*,\dots,e_{n+1}^*)$ be the corresponding dual basis.
• We take $\mathcal M$ equal to the affine hyperplane of equation $e_{n+1}^*(x)=1$, equipped with the Euclidean metric $\sum_{i=1}^n(e_i^*)^2$, in the first case.
• We put the scalar product $s=\sum_{i=1}^{n+1}(e_i^*)^2$ on $E$, and we take $\mathcal M$ equal to the sphere $s(x,x)=1$, equipped with the metric induced by $s$, in the second case.
• We put the quadratic form $q=\sum_{i=1}^n(e_i^*)^2-(e_{n+1}^*)^2$ on $E$, and we take $\mathcal M=\{x \mid q(x,x)=-1 \text{ and } e_{n+1}^*(x)>0\}$, equipped with the metric induced by $q$, in the third case.
Then we take $T\in E^{*\otimes4}$ any tensor with the same symmetries as a Riemann tensor, that is, $T$ must satisfy $T_{ijkl}=-T_{jikl}=-T_{ijlk}=T_{klij}$ and the Bianchi identity $T_{ijkl}+T_{jkil}+T_{kijl}=0$.
We construct the tensor field $A$ on $\mathcal M$ in the following way: for $(x,v)\in T\mathcal M$, we want to have
\[
A(x)\big((gv)^{\otimes2}\big) = T_{ijkl}\,x^iv^jx^kv^l,
\]
where the right-hand side is understood by considering $x$ and $v$ as elements of $E$. The right-hand side is a quadratic form in $v$, so $A$ is well defined by the previous equation. Let us consider a unit-speed geodesic on $\mathcal M$ joining two distinct points $x$ and $y$, and let $v$ and $w$ be the velocity vectors of the geodesic at the points $x$ and $y$. As said above, the geodesic is included in a two-dimensional subspace of $E$, so $(x,v)$ and $(y,w)$ are two bases of this subspace. Thus there exists a matrix $M=\begin{pmatrix}a&b\\c&d\end{pmatrix}$ such that $y=ax+bv$ and $w=cx+dv$. Then we have $T(y,w,y,w)=\det(M)^2\,T(x,v,x,v)$ (that is a classical property of the Riemann tensor). If $l$ is the length of the geodesic, we have:
\[
M=\begin{pmatrix}1&l\\0&1\end{pmatrix}\text{ for the Euclidean space},\quad M=\begin{pmatrix}\cos l&\sin l\\-\sin l&\cos l\end{pmatrix}\text{ for the sphere},\quad M=\begin{pmatrix}\cosh l&\sinh l\\\sinh l&\cosh l\end{pmatrix}\text{ for the hyperbolic space}.
\]
In each of the three cases, we have $\det(M)=1$. Thus we have $A(x)\big((gv)^{\otimes2}\big)=A(y)\big((gw)^{\otimes2}\big)$ as wanted.

The linear application $T\mapsto A$ is injective, so the dimension of its image is $\frac{n(n+1)^2(n+2)}{12}$, which is the maximal dimension of the vector space of symmetric tensor fields on $\mathcal M$ satisfying the hypothesis $(H)$. Thus this image is exactly this vector space. But the tensor fields $A$ which interest us are nonnegative on $\mathcal M$, and this implies some restrictions on $T$. In the cases of the Euclidean space and the sphere, nonnegativity holds if and only if the "sectional curvature" associated with $T$ is nonnegative, whereas in the case of the hyperbolic space, it holds if and only if this "sectional curvature" is nonnegative on the planes whose intersection with the cone $q(x,x)=0$ is not $\{0\}$.
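The determinant identity used above can be checked numerically. As a convenient instance (not the general $T$) we build a tensor with the Riemann symmetries as $T_{ijkl}=B_{ik}B_{jl}-B_{il}B_{jk}$ for a symmetric $B$, then verify that $T(y,w,y,w)=\det(M)^2\,T(x,v,x,v)$ under an arbitrary change of basis of the plane spanned by $x$ and $v$:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4  # ambient dimension n + 1

# a tensor with the symmetries of a Riemann tensor:
#   T_ijkl = B_ik B_jl - B_il B_jk, with B symmetric
G = rng.standard_normal((N, N)); B = (G + G.T) / 2
T = np.einsum('ik,jl->ijkl', B, B) - np.einsum('il,jk->ijkl', B, B)

def T4(p, q, r, s):
    # the quadrilinear form T(p, q, r, s)
    return np.einsum('ijkl,i,j,k,l->', T, p, q, r, s)

x = rng.standard_normal(N)
v = rng.standard_normal(N)
a, b, c, d = rng.standard_normal(4)
y = a * x + b * v          # new basis of the plane span(x, v)
w = c * x + d * v
detM = a * d - b * c
```

With the three matrices $M$ of the example (all of determinant $1$), the same computation gives $T(y,w,y,w)=T(x,v,x,v)$, which is exactly the constancy of $A(\gamma(t))\big((g\dot\gamma)^{\otimes2}\big)$ required by $(H)$.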
In the case when $(H)$ is satisfied, the covariances which cancel the variance term of $d$ take the form $C(x,y)=\tilde C(x,y)+C'(x,y)$, with
\[
\tilde C(x,y) = -\frac{A(x)g(x)u(x,y)\,u^T(y,x)g(y)A(y)}{u^T(x,y)g(x)A(x)g(x)u(x,y)} = -\frac{A(x)g(x)u(x,y)\,u^T(y,x)g(y)A(y)}{u^T(y,x)g(y)A(y)g(y)u(y,x)},
\]
and $C'(x,y)$ such that the big matrix
\[
\begin{pmatrix}A'(x,y)&C'(x,y)\\C'^T(x,y)&A'(y,x)\end{pmatrix}
\]
is nonnegative, with $A'(x,y)=A(x)-\frac{A(x)g(x)u(x,y)\,u^T(x,y)g(x)A(x)}{u^T(x,y)g(x)A(x)g(x)u(x,y)}$.

Using Lemma 15 again gives us the following expression of $\tilde\kappa(x,y)$:
\[
\tilde\kappa(x,y) = \frac1\delta\Big[-F(y)\cdot g(y)u(y,x) - F(x)\cdot g(x)u(x,y) - \tfrac12\big(\mathrm{tr}(A(x)q^{(1)}(x,y)) + \mathrm{tr}(A(y)q^{(2)}(x,y))\big) - \mathrm{tr}\big(\tilde C(x,y)q^{(12)T}(x,y)\big) + \mathrm{tr}\sqrt{A'(x,y)q^{(12)}(x,y)A'(y,x)q^{(12)T}(x,y)}\Big].
\]
We can define $\tilde\kappa(x,u)$ as the limit when $\delta$ tends to $0$ of $\tilde\kappa(x,\exp_x(\delta u))$. Then we have:
\[
\tilde\kappa(x,u) = -u^ig_{ij}u^k\nabla_kF^j + \tfrac12A^{ij}R_{ikjl}u^ku^l - \frac{u^ig_{ij}u^k\nabla_kA^{jl}g_{lm}u^n\nabla_nA^{mo}g_{op}u^p}{u^ig_{ij}A^{jk}g_{kl}u^l} - B^{ij}\big[\big(A'\otimes(g^{-1}-uu^T)+(g^{-1}-uu^T)\otimes A'\big)^{-1}\big]_{ikjl}B^{kl}
\]
with
\[
A' = A - \frac{Ag\,uu^T\,gA}{u^TgAgu},\qquad B = \nabla_uA - \frac{\nabla_uA\,g\,uu^T\,gA + Ag\,uu^T\,g\,\nabla_uA}{u^TgAgu},
\]
and, as $A'$, $B$ and $g^{-1}-uu^T$ belong to $(gu)^\perp\otimes(gu)^\perp$, we take $\big(A'\otimes(g^{-1}-uu^T)+(g^{-1}-uu^T)\otimes A'\big)^{-1}$ to be the unique inverse of $A'\otimes(g^{-1}-uu^T)+(g^{-1}-uu^T)\otimes A'$ in $(T^*_x\mathcal M/\mathrm{Vect}(gu))^{\otimes2}$.

And we have the equivalent of Theorem 6:

Lemma 23: If the hypothesis $(H)$ is satisfied, then there exists a coupling between paths such that
\[
d(X(t),Y(t)) = d(X(0),Y(0))\,\mathrm{e}^{-\int_0^t\tilde\kappa(X(s),Y(s))\,\mathrm{d}s}
\]
almost surely on the event that for every $0\le s\le t$, $d(x',y')$ is smooth in a neighborhood of $(X(s),Y(s))$.
The idea to prove Theorems 4 and 8 is to look at the exponential decay of the Lipschitz norm of $P_tf$ when $f$ is Lipschitz with mean $0$ (with respect to the reversible probability measure). Then we use the reversibility assumption to conclude that the variance of $P_tf$ also decreases exponentially fast with the same rate, which is hence a lower bound for the spectral gap.

Proof of Theorem 8: Let $x$ and $y$ be two points of $\mathcal M$ such that $d(x,y)<r_i$, where $r_i$ is the injectivity radius of $\mathcal M$. We have $P_tf(y)-P_tf(x)=\mathbb E[f(Y(t))-f(X(t))]$ for any coupling between paths. If $f$ is $1$-Lipschitz, then $|f(Y(t))-f(X(t))|\le d(Y(t),X(t))$, so Lemma 23 tells us that
\[
|P_tf(y)-P_tf(x)| \le d(x,y)\,\mathbb E\big[\mathrm{e}^{-\int_0^t\tilde\kappa(X(s),Y(s))\,\mathrm{d}s}\big].
\]
For any $\varepsilon>0$ and $\eta>0$, there exists $\delta>0$ such that for every $(x',y')$ with $d(x',y')\le\delta$, we have $\tilde\kappa(x',y')\ge\tilde\kappa(x')-\eta$, where $\tilde\kappa(x')=\inf_{u\in T_{x'}\mathcal M}\tilde\kappa(x',u)$. Taking $T=\frac{\ln(r_i/\delta)}{\varepsilon}$, we have $d(X(T),Y(T))\le\delta$, and then for $t\ge T$ we have
\[
\mathbb E\big[\mathrm{e}^{-\int_0^t\tilde\kappa(X(s),Y(s))\,\mathrm{d}s}\big] \le \frac{\delta}{r_i}\,\mathbb E\big[\mathrm{e}^{-\int_T^t(\tilde\kappa(X(s))-\eta)\,\mathrm{d}s}\big].
\]
Following what was done in [8], we use the Feynman–Kac semigroup $F_t$ generated by $K$, with
\[
Kf(x) = \tfrac12A^{ij}(x)\nabla_{ij}f(x) + F^i(x)\nabla_if(x) - (\tilde\kappa(x)-\eta)f(x).
\]
Indeed we have $\mathbb E_x\big[\mathrm{e}^{-\int_0^t(\tilde\kappa(X(s))-\eta)\,\mathrm{d}s}\big]=F_t\mathbf 1(x)$. The Lipschitz norm of $P_tf$ is at most $\sup_{x\in\mathcal M}\frac{\delta}{r_i}\mathbb E\big[\mathrm{e}^{-\int_T^t(\tilde\kappa(X(s))-\eta)\,\mathrm{d}s}\big]$, for every $t\ge T$. This quantity is $\sup_{x\in\mathcal M}\frac{\delta}{r_i}\,(\delta_x.P_T)F_{t-T}\mathbf 1$, so it is smaller than or equal to
\[
\frac{\delta}{r_i}\sup_{x\in\mathcal M}\Big\|\frac{\mathrm d(\delta_x.P_T)}{\mathrm d\pi}\Big\|_{L^2(\pi)}\,\|F_{t-T}\mathbf 1\|_{L^2(\pi)}.
\]
Then the Lipschitz norm of $P_tf$ decreases exponentially fast with a rate at least the exponential decay rate of the $L^2(\pi)$-norm of $F_t$, which is
\[
\inf_{h\,\mid\,\int h^2\mathrm d\pi=1}\int -hKh\,\mathrm d\pi \;\ge\; \inf_{h\,\mid\,\int h^2\mathrm d\pi=1}\lambda\,\mathrm{Var}_\pi(h)+\int(\tilde\kappa(x)-\eta)h^2(x)\,\mathrm d\pi(x) \;=\; \lambda+\inf_{h\,\mid\,\int h^2\mathrm d\pi=1}\int(\tilde\kappa(x)-\eta)h^2(x)\,\mathrm d\pi(x)-\lambda\Big(\int h\,\mathrm d\pi\Big)^2.
\]
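The Feynman–Kac identity $\mathbb E_x\big[\mathrm e^{-\int_0^t(\tilde\kappa(X(s))-\eta)\mathrm ds}\big]=F_t\mathbf 1(x)$ can be illustrated on a finite-state stand-in, where the diffusion is replaced by a continuous-time Markov chain with generator matrix $Q$ and the potential $\tilde\kappa-\eta$ by a nonnegative vector $V$ (a sketch; $Q$ and $V$ below are arbitrary, and the manifold setting is only analogous):

```python
import numpy as np

rng = np.random.default_rng(3)
m = 5

# generator of a continuous-time Markov chain:
# nonnegative off-diagonal entries, rows summing to zero
Q = rng.random((m, m))
np.fill_diagonal(Q, 0.0)
np.fill_diagonal(Q, -Q.sum(axis=1))

V = rng.random(m)          # stand-in for the killing potential kappa~ - eta
K = Q - np.diag(V)         # generator of the Feynman-Kac semigroup F_t
t = 1.3

def expm(M, terms=80):
    # truncated exponential series (adequate for small matrices)
    out, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

# F_t 1 via the semigroup ...
ft_one = expm(K * t) @ np.ones(m)

# ... and via a fine time-discretization of E_x[exp(-int_0^t V(X_s) ds)]
steps = 2 ** 18
P = np.eye(m) + (t / steps) * K
approx = np.linalg.matrix_power(P, steps) @ np.ones(m)
```

Since $V\ge0$, the semigroup only removes mass, so $0<F_t\mathbf 1\le\mathbf 1$, and the decay rate of $\|F_t\mathbf 1\|$ is governed by the bottom of the spectrum of $-K$, exactly as in the variational bound above.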
The method of Lagrange multipliers suggests to take $h(x)=\frac{c}{\tilde\kappa(x)-\eta+\alpha}$, with $\alpha$ such that
\[
\frac1\lambda = \int\frac{\mathrm d\pi(x)}{\tilde\kappa(x)-\eta+\alpha} \qquad\text{and}\qquad c = \Big(\int\frac{\mathrm d\pi(x)}{(\tilde\kappa(x)-\eta+\alpha)^2}\Big)^{-1/2}.
\]
With this $h$, we have
\[
\int(\tilde\kappa(x)-\eta)h^2(x)\,\mathrm d\pi(x) - \lambda\Big(\int h\,\mathrm d\pi\Big)^2 = -\alpha.
\]
This is indeed the minimizing $h$ when $\lambda$ is at least the harmonic mean $\lambda_0$ of $\tilde\kappa-\inf(\tilde\kappa)$. We can see it by using Cauchy–Schwarz:
\[
\int(\tilde\kappa(x)-\eta)h^2(x)\,\mathrm d\pi(x) - \lambda\Big(\int h\,\mathrm d\pi\Big)^2 \ge \int(\tilde\kappa(x)-\eta)h^2\,\mathrm d\pi(x) - \lambda\Big(\int(\tilde\kappa(x)-\eta+\alpha)h^2\,\mathrm d\pi(x)\Big)\Big(\int\frac{\mathrm d\pi(x)}{\tilde\kappa(x)-\eta+\alpha}\Big) = -\alpha.
\]
In the case where $\lambda<\lambda_0$, we take $\alpha=\eta-\inf(\tilde\kappa)$. This time we get
\[
\int(\tilde\kappa(x)-\eta)h^2\,\mathrm d\pi(x) - \lambda\Big(\int h\,\mathrm d\pi\Big)^2 \ge \int(\tilde\kappa(x)-\eta)h^2\,\mathrm d\pi(x) - \lambda_0\Big(\int(\tilde\kappa(x)-\eta+\alpha)h^2\,\mathrm d\pi(x)\Big)\Big(\int\frac{\mathrm d\pi(x)}{\tilde\kappa(x)-\eta+\alpha}\Big) + (\lambda_0-\lambda)\Big(\int h\,\mathrm d\pi\Big)^2 \ge -\alpha.
\]
A minimizing sequence $h_i(x)$ can be, for example, a sequence such that $h_i^2$ tends to a Dirac mass at a point where the minimum of $\tilde\kappa$ is reached.

In both cases, the exponential decay rate for zero-mean Lipschitz functions is at least $\lambda-\alpha$; then by density of the Lipschitz functions in $L^2(\pi)$ and by the reversibility assumption, the exponential decay rate for zero-mean $L^2(\pi)$ functions (which is equal to $\lambda$) is also at least $\lambda-\alpha$. Thus $\alpha$ is nonnegative, which means that $\lambda$ is at least the harmonic mean of $\tilde\kappa-\eta$, so letting $\eta$ tend to $0$ yields the result. $\square$

Purely analytical methods can also be used to prove this result in the case $A^{ij}=g^{ij}$, and they also work when $\inf_{x\in\mathcal M}\kappa(x)=0$.

Lemma 24: Let $f$ be a regular enough ($C^3$) function from $\mathcal M$ to $\mathbb R$. Then we have
\[
\frac{\mathrm d}{\mathrm dt}\Big|_{t=0}\|\nabla P_tf\|^2 = h\big(2L(h) + u_kg^{kl}\nabla_lA^{ij}\nabla_ih\,u_j\big) + h^2\Big(u_kg^{kl}\nabla_lA^{ij}\nabla_iu_j - A^{ij}g^{kl}\nabla_iu_k\nabla_ju_l + A^{ij}R_{lij\alpha}g^{\alpha\beta}u_\beta g^{kl}u_k + 2u_kg^{kl}\nabla_lF^iu_i\Big)
\]
where $h=\|\nabla f\|$ and $\nabla_kf=hu_k$.
Proof: We have $\frac{\mathrm d}{\mathrm dt}\big|_{t=0}\|\nabla P_tf\|^2 = 2\nabla_kf\,g^{kl}\nabla_l(Lf)$, and
\[
\nabla_l(Lf) = \tfrac12\big(\nabla_lA^{ij}\nabla_{ij}f + A^{ij}\nabla_{lij}f\big) + \nabla_lF^i\nabla_if + F^i\nabla_{li}f = \tfrac12\big(\nabla_lA^{ij}\nabla_{ij}f + A^{ij}(\nabla_{ijl}f + R_{lij\alpha}g^{\alpha\beta}\nabla_\beta f)\big) + \nabla_lF^i\nabla_if + F^i\nabla_{il}f.
\]
Differentiating $\nabla_if=hu_i$, we get $\nabla_{ij}f=\nabla_ih\,u_j+h\nabla_iu_j$, and $\nabla_{ijl}f=\nabla_{ij}h\,u_l+\nabla_jh\nabla_iu_l+\nabla_ih\nabla_ju_l+h\nabla_{ij}u_l$. So we get:
\[
\frac{\mathrm d}{\mathrm dt}\Big|_{t=0}\|\nabla P_tf\|^2 = hg^{kl}u_k\Big[\nabla_lA^{ij}(\nabla_ih\,u_j+h\nabla_iu_j) + A^{ij}\big(\nabla_{ij}h\,u_l + \nabla_jh\nabla_iu_l + \nabla_ih\nabla_ju_l + h\nabla_{ij}u_l + hR_{lij\alpha}g^{\alpha\beta}u_\beta\big) + 2h\nabla_lF^iu_i + 2F^i(\nabla_ih\,u_l + h\nabla_iu_l)\Big].
\]
Differentiating $g^{kl}u_ku_l=1$ gives $g^{kl}u_k\nabla_ju_l=0$ and $g^{kl}(\nabla_iu_k\nabla_ju_l+u_k\nabla_{ij}u_l)=0$, so using these relationships, the above expression can be simplified to get the formula given in Lemma 24. $\square$

Proof of Theorem 7: We first prove the theorem in the case $n'=\infty$ and $c=0$, in which case we get the result of Theorem 8. Indeed, in this case, the optimal $\rho(x)$ is nothing but $\kappa(x)=\inf_{u\in T_x\mathcal M}\kappa(x,u)$.

Let $f$ be an eigenfunction of $L$ for the eigenvalue $-\lambda$. With the previous notation for $h$ and $u$, we have:
\[
-2\lambda\|h\|^2_{L^2(\pi)} = \frac{\mathrm d}{\mathrm dt}\Big|_{t=0}\|\nabla P_tf\|^2_{L^2(\pi)} = \int 2h(x)L(h)(x) - 2h^2(x)\kappa(x,g^{-1}u(x)) - h^2(x)A^{ij}(x)g_{kl}(x)\nabla_iu^k(x)\nabla_ju^l(x)\,\mathrm d\pi(x) \le -2\lambda\Big(\int h^2(x)\,\mathrm d\pi(x) - \Big(\int h(x)\,\mathrm d\pi(x)\Big)^2\Big) - 2\int\kappa(x)h^2(x)\,\mathrm d\pi(x) + 0,
\]
where the inequality $\int hL(h)\,\mathrm d\pi\le-\lambda\,\mathrm{Var}_\pi(h)$ is due to the reversibility assumption. By Cauchy–Schwarz, we have
\[
\Big(\int h(x)\,\mathrm d\pi(x)\Big)^2 \le \int\frac{\mathrm d\pi(x)}{\kappa(x)}\int\kappa(x)h^2(x)\,\mathrm d\pi(x).
\]
Finally we get:
\[
\int\kappa(x)h^2(x)\,\mathrm d\pi(x)\Big(\lambda\int\frac{\mathrm d\pi(x)}{\kappa(x)}-1\Big) \ge 0,
\]
and if $\int\frac{\mathrm d\pi(x)}{\kappa(x)}<+\infty$, then $\int\kappa(x)h^2(x)\,\mathrm d\pi(x)>0$, because $f$ is nonconstant, so $h$ can't be $0$ almost everywhere. So we have $\lambda\ge\big(\int_{\mathcal M}\frac{\mathrm d\pi(x)}{\kappa(x)}\big)^{-1}$.
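Both proofs rest on the same variational computation. On a discrete stand-in (all quantities below are arbitrary stand-ins for $\pi$, $\tilde\kappa$, $\eta$ and $\lambda$, with $\lambda$ above the harmonic mean of $\tilde\kappa-\eta$ so that $\alpha\ge0$; a sketch, not the manifold setting) one can check the three ingredients: the harmonic mean of $\kappa$ lies between $\inf\kappa$ and the arithmetic mean, the Cauchy–Schwarz step, and the Lagrange-multiplier identity $\int(\tilde\kappa-\eta)h^2\,\mathrm d\pi-\lambda(\int h\,\mathrm d\pi)^2=-\alpha$:

```python
import numpy as np

rng = np.random.default_rng(7)
m = 8
pi = rng.random(m); pi /= pi.sum()   # stand-in for the measure pi
kappa = 1.0 + rng.random(m)          # stand-in for a positive curvature profile
eta = 0.05
lam = 2.5  # chosen above the harmonic mean of kappa - eta, so alpha >= 0

# the harmonic mean sits between inf kappa and the arithmetic mean
harmonic = 1.0 / np.sum(pi / kappa)
arithmetic = np.sum(pi * kappa)

# Cauchy-Schwarz step: (int h dpi)^2 <= int dpi/kappa * int kappa h^2 dpi
h0 = rng.random(m)
cs_lhs = np.sum(h0 * pi) ** 2
cs_rhs = np.sum(pi / kappa) * np.sum(kappa * h0**2 * pi)

# Lagrange-multiplier step: solve 1/lam = sum pi/(kappa - eta + alpha)
# by bisection (the right-hand side is decreasing in alpha)
def gap(alpha):
    return np.sum(pi / (kappa - eta + alpha)) - 1.0 / lam

lo, hi = 0.0, 1e3
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if gap(mid) > 0 else (lo, mid)
alpha = 0.5 * (lo + hi)

c = 1.0 / np.sqrt(np.sum(pi / (kappa - eta + alpha) ** 2))
h = c / (kappa - eta + alpha)   # the candidate minimizer, normalized in L2(pi)
value = np.sum((kappa - eta) * h**2 * pi) - lam * np.sum(h * pi) ** 2
```

The last assertion of the test below checks the minimality claim: any other normalized $h$ gives a value of the functional at least $-\alpha$.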
In the general case, we have $n'\ge n$ and the optimal $\rho$ is given by:
\[
\rho(x) = \frac12\inf_{u\in T_x\mathcal M,\,\|u\|=1}\mathrm{Ric}(u,u) + \nabla^2_{u,u}\varphi - \frac{(\nabla_u\varphi)^2}{n'-n}.
\]
So we have $\rho\le\kappa$. Then, with the previous notation, we have
\[
\lambda\Big(\int_{\mathcal M}h(x)\,\mathrm d\pi(x)\Big)^2 - \int_{\mathcal M}\rho(x)h^2(x)\,\mathrm d\pi(x) \ge 0,
\]
the previous computation remaining valid with $\rho$ instead of $\kappa$. We also have
\[
\int_{\mathcal M}\Gamma_2(f)(x)\,\mathrm d\pi(x) = \int_{\mathcal M}\Big[\tfrac12L(\Gamma(f)) - \langle\nabla f,\nabla(Lf)\rangle\Big]\,\mathrm d\pi = 0 + \lambda\int_{\mathcal M}h^2\,\mathrm d\pi \ge \int_{\mathcal M}\rho\,\Gamma(f) + \frac1{n'}(L(f))^2\,\mathrm d\pi = \int_{\mathcal M}\rho h^2\,\mathrm d\pi + \frac\lambda{n'}\int_{\mathcal M}h^2\,\mathrm d\pi.
\]
Thus for any $\theta\in[0,1]$ we have:
\[
(1-\theta)\lambda\Big(\int_{\mathcal M}h\,\mathrm d\pi\Big)^2 - \int_{\mathcal M}\Big(\rho - \frac{\theta\lambda(n'-1)}{n'}\Big)h^2\,\mathrm d\pi \ge 0.
\]
For $\theta=1$, we have $0\le\int_{\mathcal M}\big(\frac{\lambda(n'-1)}{n'}-\rho\big)h^2\,\mathrm d\pi\le\big(\frac{\lambda(n'-1)}{n'}-\inf(\rho)\big)\int_{\mathcal M}h^2\,\mathrm d\pi$; this proves the Bakry–Émery bound:
\[
\lambda \ge \frac{n'}{n'-1}\inf(\rho).
\]
So for any $c\in[0,\inf(\rho)]$, we take $\theta=\frac{n'c}{(n'-1)\lambda}\in[0,1]$, and Cauchy–Schwarz gives
\[
(1-\theta)\lambda\Big(\int\frac{\mathrm d\pi}{\rho-c}\Big)\int(\rho-c)h^2\,\mathrm d\pi - \int(\rho-c)h^2\,\mathrm d\pi \ge 0,\qquad\text{hence}\qquad \Big(\lambda-\frac{n'c}{n'-1}\Big)\int\frac{\mathrm d\pi}{\rho-c}-1 \ge 0,
\]
which leads to the desired result. $\square$

References

[1] M. Arnaudon, K. A. Coulibaly, A. Thalmaier, Horizontal diffusion in C¹ path space, Séminaire de Probabilités XLIII, Lecture Notes in Math., vol. 2006, p. 73–94, Springer, 2011.
[2] E. Aubry, Finiteness of π₁ and geometric inequalities in almost positive Ricci curvature, Ann. Sci. Éc. Norm. Sup., vol. 40, July–August 2007.
[3] D. Bakry, M. Émery, Hypercontractivité de semi-groupes de diffusion, C. R. Acad. Sci. Paris Sér. I Math. (1984).
[4] D. Bakry, M. Émery, Diffusions hypercontractives, Séminaire de Probabilités XIX, 1983/84, Lecture Notes in Math., Springer, Berlin (1985).
[5] M. Berger, A panoramic view of Riemannian geometry, Springer-Verlag, 2003.
[6] M. F. Chen, From Markov chains to non-equilibrium particle systems, Singapore: World Scientific, 1992.
[7] M. F. Chen, F. Y. Wang, General formula for lower bound of the first eigenvalue on Riemannian manifolds, Sci. Sin., 1997.
[8] A. Guillin, C. Léonard, L. Wu, Transportation-information inequalities for Markov processes, Probability Theory and Related Fields, vol. 144, p. 669–695, Springer, 2009.
[9] W. S. Kendall, Nonnegative Ricci curvature and the Brownian coupling property, Stochastics, vol. 19, 1986, p. 111–129.
[10] A. Lichnerowicz, Géométrie des groupes de transformations, Dunod, 1958.
[11] E. Milman, On the role of convexity in isoperimetry, spectral gap and concentration, Inventiones Mathematicae, 2009, Springer.
[12] Y. Ollivier, Ricci curvature of Markov chains on metric spaces, J. Funct. Anal. 256 (2009).
[13] L. Veysseire, A harmonic mean bound for the spectral gap of the Laplacian on Riemannian manifolds, Comptes Rendus Mathématique, 2010, vol. 348, p. 1319–1322.