Estimates on the Markov Convexity of Carnot Groups and Quantitative Nonembeddability
CHRIS GARTLAND
Abstract.
We show that every graded nilpotent Lie group $G$ of step $r$, equipped with a left invariant metric homogeneous with respect to the dilations induced by the grading (this includes all Carnot groups with Carnot-Caratheodory metric), is Markov $p$-convex for all $p \in [2r, \infty)$. We also show that this is sharp whenever $G$ is a Carnot group with $r \leq 3$, a free Carnot group, or a jet space group; such groups are not Markov $p$-convex for any $p \in (0, 2r)$. This continues a line of research started by Li, who proved this sharp result when $G$ is the Heisenberg group. As corollaries, we obtain new estimates on the non-biLipschitz embeddability of some finitely generated nilpotent groups into nilpotent Lie groups of lower step. Sharp estimates of this type are known when the domain is the Heisenberg group and the target is a uniformly convex Banach space or $L_1$, but not when the target is a nonabelian nilpotent group.
Contents
1. Introduction
1.1. Background
1.2. Summary of Results
2. Discussion of Proof Methods
2.1. Discussion of Proof of Theorem 4.19
2.2. Discussion of Proof of Theorem 5.6
3. Preliminaries
3.1. Graded Nilpotent and Stratified Lie Algebras and their Lie Groups
3.2. Norms and Metrics
3.3. Model Filiform Groups and Jet Spaces over $\mathbb{R}$
[...]
5.1. Directed Graphs and Random Walks
5.2. Mapping the Graphs into $J^{r-1}(\mathbb{R})$
References
1. Introduction
1.1. Background.
(Thanks to Jeremy Tyson for helpful comments in the preparation of this article and to Assaf Naor for suggesting the (non)embeddability corollaries of the main theorems.)
arXiv [math.MG]
In [Rib76], Ribe showed that if two Banach spaces $E, F$ are uniformly homeomorphic, then they are mutually finitely representable: there exists a $\lambda < \infty$ such that for any finite-dimensional subspace $E_0$ of $E$, there is a subspace $F_0$ of $F$ whose Banach-Mazur distance from $E_0$ is at most $\lambda$. Properties of Banach spaces that are preserved under mutual finite representability are called local, and many classical properties such as type, cotype, superreflexivity, and $p$-convexity are local. Recall that a Banach space is said to be $p$-convex for some $p \geq 2$ if there exist an equivalent norm $\|\cdot\|$ and $K < \infty$ such that for every $\epsilon \in [0, 2]$,
$$\sup\left\{ \left\| \frac{x+y}{2} \right\| : \|x\|, \|y\| \leq 1,\ \|x - y\| \geq \epsilon \right\} \leq 1 - \frac{\epsilon^p}{K}.$$
Ribe's theorem implies that these properties are really metric properties, suggesting that each should have a reformulation that involves only the metric structure of the Banach space and not the linear structure. The research program concerned with finding these reformulations is known as the
Ribe program. The program was initiated by Bourgain in [Bou86], in which he made the first substantial contribution by characterizing superreflexive Banach spaces as those which do not admit biLipschitz embeddings of the binary trees of depth $k$ with uniform control on the biLipschitz distortion. We record here that the biLipschitz distortion (or just distortion) of a map $f : X \to Y$ between metric spaces $(X, d_X)$, $(Y, d_Y)$ is the least value of $L$ for which there exists $0 < D < \infty$ so that
$$d_X(x, y) \leq D\, d_Y(f(x), f(y)) \leq L\, d_X(x, y)$$
for all $x, y \in X$, that $f$ is a biLipschitz embedding if its distortion is finite, and that $f$ is a biLipschitz equivalence if it is a biLipschitz embedding and surjective. The biLipschitz distortion of $X$ into $Y$ is the infimal distortion of all maps from $X$ into $Y$.
Another major contribution to the Ribe program is a purely metric reformulation of $p$-convexity. The metric property Markov $p$-convexity was originally defined by Lee-Naor-Peres in [LNP09] and proved by Mendel-Naor in [MN13] to be a reformulation of $p$-convexity. Here are the specifics:
Definition 1.1 (Definition 1.2, [MN13]). Let $\{X_t\}_{t \in \mathbb{Z}}$ be a Markov chain on a state space $\Omega$. Given an integer $k \geq 0$, we denote by $\{\tilde{X}_t(k)\}_{t \in \mathbb{Z}}$ the process which equals $X_t$ for time $t \leq k$ and evolves independently (with respect to the same transition probabilities) for time $t > k$. Fix $p > 0$. A metric space $(M, d)$ is called Markov $p$-convex if there is $\Pi < \infty$ so that for every Markov chain $\{X_t\}_{t \in \mathbb{Z}}$ on a state space $\Omega$, and for every $f : \Omega \to M$,
$$\sum_{k=0}^{\infty} \sum_{t \in \mathbb{Z}} \frac{\mathbb{E}\left[ d\left( f(X_t), f(\tilde{X}_t(t - 2^k)) \right)^p \right]}{2^{kp}} \leq \Pi^p \sum_{t \in \mathbb{Z}} \mathbb{E}\left[ d(f(X_{t+1}), f(X_t))^p \right].$$
Set $\Pi_p(M)$ equal to the least value of $\Pi$ so that the above inequality holds (whenever it exists). $\Pi_p(M)$ is called the Markov $p$-convexity constant of $M$.
Theorem 1.2 (Theorem 1.3, [MN13]). A Banach space is $p$-convex if and only if it is Markov $p$-convex.
Observe the following fact: if there is a map $f : X \to Y$ with biLipschitz distortion $L$, then $\Pi_p(X) \leq L\,\Pi_p(Y)$. Thus, Markov convexity can be used to answer quantitative questions about metric spaces in the Lipschitz category.
We present two such applications, the first on the impossibility of dimension reduction in trace class operators, $S_1$. (From page 2 of [NPS18]): A Banach space $(X, \|\cdot\|_X)$ admits metric dimension reduction if there exists $\alpha < \infty$ such that every $n$-point subset of $X$ biLipschitz embeds with distortion $\alpha$ into a linear subspace of $X$ with dimension $n^{o(1)}$. This definition is inspired by the famous Johnson-Lindenstrauss Lemma ([JL84]), which implies that Hilbert space admits metric dimension reduction. In [NPS18], Naor, Pisier, and Schechtman showed that there is an infinite sequence of $n$-point subsets of $S_1$ whose Markov 2-convexity constant is bounded below by a universal constant times $\sqrt{\ln(n)}$, and that the Markov 2-convexity constant of any $d$-dimensional linear subspace of $S_1$ is bounded above by a universal constant times $\sqrt{\ln(d)}$. Together these imply their main result (Theorem 1, [NPS18]): $S_1$ does not admit metric dimension reduction. For more on the Ribe program and dimension reduction, see the surveys [Nao12] and [Nao18].
Here is a second application of Markov convexity.
In the spirit of the Ribe program, Ostrovskii found a purely metric characterization of the Radon-Nikodym property (RNP) of Banach spaces by showing that a Banach space has the RNP if and only if it does not contain a biLipschitz copy of a thick family of geodesics (Corollary 1.5 [Ost14a]). He asked a natural follow-up question: if a geodesic metric space does not biLipschitz embed into any RNP space, must it contain a biLipschitz copy of a thick family of geodesics? The Heisenberg group is a geodesic metric space that does not biLipschitz embed into any RNP space (see Section 1.2 of [LN06] or Theorem 6.1 of [CK06]), and Ostrovskii showed that in fact it does not contain a biLipschitz copy of a thick family of geodesics, thus negatively answering the question. He accomplished this by proving that any metric space containing a biLipschitz copy of a thick family of geodesics cannot be Markov $p$-convex for any $p > 0$ [...]
Theorem 1.3.
Proposition 7.2 and Theorem 7.4, [Li14]: Every graded nilpotent Lie group of step $r$ is Markov [... $r!$)]-convex.
Theorem 1.1 and Corollary 1.3, [Li16]: The set of $p$ for which the Heisenberg group is Markov $p$-convex is exactly $[4, \infty)$.
1.2. Summary of Results.
This article continues the line of research started by Theorem 1.3. Our main results are:
Theorem 4.19.
Every graded nilpotent Lie group of step $r$, equipped with a left invariant metric homogeneous with respect to the dilations induced by the grading, is Markov $p$-convex for every $p \in [2r, \infty)$.
Theorem 5.6.
For every $p > 0$, $r \geq [\ldots]$, coarsely dense set $N \subseteq J^{r-1}(\mathbb{R})$, and $R \geq 3$, let $B_N(R) := \{x \in N : d_{CC}(0, x) \leq R\}$. Then
$$\Pi_p(B_N(R)) \gtrsim \frac{\ln(R)^{\frac{1}{p} - \frac{1}{2r}}}{\ln(\ln(R))^{\frac{1}{p} + \frac{1}{2r}}},$$
where the implicit constant can depend on $r, p$ but not on $N, R$.
Recall that a subset $N$ of a metric space $(X, d_X)$ is coarsely dense if there exists $C < \infty$ such that $X = \bigcup_{x' \in N} \{x \in X : d_X(x, x') \leq C\}$. See Section 3 for the definition of $J^{r-1}(\mathbb{R})$. Theorem 4.19 is restated and proved at the end of Section 4.2, and similarly for Theorem 5.6 at the end of Section 5.2.
We can extend this result to other groups using the notion of subquotients. Recall that a surjective map $f : X \to Y$ between metric spaces $(X, d_X)$, $(Y, d_Y)$ is a Lipschitz quotient map with constant $C < \infty$ if there exists $0 < D < \infty$ such that for all $x \in X$ and $R > 0$,
$$B_R(f(x)) \subseteq f(B_{DR}(x)) \subseteq B_{CR}(f(x)).$$
If such a map $f$ exists, we say $Y$ is a Lipschitz quotient of $X$. $X$ is a Lipschitz subquotient of $Y$ with constant $C$ if there is a metric space $Z$ such that $Z$ embeds isometrically into $Y$ and $X$ is a Lipschitz quotient of $Z$ with constant $C$, or, equivalently, there is a metric space $Z$ such that $Z$ is a Lipschitz quotient of $Y$ with constant $C$ and $X$ isometrically embeds into $Z$. It follows from Proposition 4.1 of [MN13] that if $X$ is a Lipschitz subquotient of $Y$ with constant $C$, then $\Pi_p(X) \leq C\,\Pi_p(Y)$.
Every free Carnot group of step $r \geq [\ldots]$ has $J^{r-1}(\mathbb{R})$ (in fact every graded nilpotent Lie group of step $r$ with 2-dimensional horizontal layer) as a graded quotient group, and the projection map $\mathbb{R}^k \twoheadrightarrow \mathbb{R}$ dualizes to a graded embedding $J^{r-1}(\mathbb{R}) \hookrightarrow J^{r-1}(\mathbb{R}^k)$. See Chapter 14 of [BLU07] for background on free Carnot groups and [War05] for background on the jet space groups $J^{r-1}(\mathbb{R}^k)$.
Corollary 1.4.
Let $G$ be a Carnot group of step $r$ that has $J^{r-1}(\mathbb{R})$ as a graded subquotient group; for example, $G$ may be a free Carnot group, $J^{r-1}(\mathbb{R}^k)$, or any Carnot group if $r \leq 3$. The set of $p > 0$ for which $G$ is Markov $p$-convex is exactly $[2r, \infty)$.
Proof. This follows from Theorems 4.19 and 5.6 and the preceding discussion. □
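To spell out how the preceding discussion combines with the two theorems, here is a sketch of the non-convexity half of the argument. It assumes that the bound of Theorem 5.6 reads $\Pi_p(B_N(R)) \gtrsim \ln(R)^{1/p - 1/(2r)} / \ln(\ln(R))^{1/p + 1/(2r)}$, takes $N = J^{r-1}(\mathbb{R})$ (which is trivially coarsely dense in itself), and writes $C$ for the Lipschitz subquotient constant; these choices are ours for illustration.

```latex
% Sketch: why G is not Markov p-convex for p < 2r.
\begin{align*}
\Pi_p(G)
  &\ge C^{-1}\,\Pi_p\big(J^{r-1}(\mathbb{R})\big)
  && \text{(Markov convexity passes to Lipschitz subquotients)} \\
  &\ge C^{-1}\,\Pi_p\big(B_N(R)\big)
  && \text{(}B_N(R)\text{ sits isometrically inside } J^{r-1}(\mathbb{R})\text{)} \\
  &\gtrsim \frac{\ln(R)^{\frac{1}{p}-\frac{1}{2r}}}{\ln(\ln(R))^{\frac{1}{p}+\frac{1}{2r}}}
  \xrightarrow[R \to \infty]{} \infty
  && \text{whenever } \tfrac{1}{p} - \tfrac{1}{2r} > 0, \text{ i.e. } p < 2r.
\end{align*}
```

For $p \geq 2r$ the exponent of $\ln(R)$ is nonpositive, so the lower bound carries no information, which is consistent with Theorem 4.19.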
Recall that a subgroup $\Gamma \leq G$ of a Lie group $G$ is a lattice if the subspace topology on $\Gamma$ is discrete and $G/\Gamma$ carries a $G$-invariant Borel probability measure.
Corollary 1.5.
Let $G$ be a Carnot group of step $r$ that has $J^{r-1}(\mathbb{R})$ as a graded subquotient group; for example, $G$ may be a free Carnot group, $J^{r-1}(\mathbb{R}^k)$, or any Carnot group if $r \leq 3$ (by Lemma 3.3). Let $\Gamma \leq G$ be a lattice equipped with the word metric with respect to a finite generating set (which exists by Theorem 2.21 of [Rag72]), and let $B_\Gamma(R)$ denote the ball of radius $R$ in $\Gamma$ centered at the identity. Then for any $p > 0$,
$$\Pi_p(B_\Gamma(R)) \gtrsim \frac{\ln(R)^{\frac{1}{p} - \frac{1}{2r}}}{\ln(\ln(R))^{\frac{1}{p} + \frac{1}{2r}}}.$$
Proof.
Let $G, \Gamma, p$ be as above. The inclusion $\Gamma \hookrightarrow G$ is a biLipschitz embedding onto a coarsely dense subset when $\Gamma$ is equipped with the word metric with respect to a finite generating set (this can be proven using Mostow's theorem that lattices in nilpotent Lie groups are cocompact ([Mos62]) and applying the fundamental theorem of geometric group theory). Thus it suffices to prove the conclusion for any coarsely dense $N'' \subseteq G$. Let $N''$ be such a subset. By assumption, there is a Carnot group $G'$ and a graded quotient homomorphism $q : G \to G'$ such that $J^{r-1}(\mathbb{R})$ is a graded subgroup of $G'$. Then $q$ is a Lipschitz quotient map, so there is a constant $C < \infty$ such that for any $R \geq 3$,
$$\Pi_p(B_{N''}(R)) \gtrsim \Pi_p(B_{q(N'')}(R/C)).$$
Thus it suffices to prove the conclusion for any coarsely dense subset $N' \subseteq G'$. Let $N'$ be such a subset. Fix a sufficiently large $B$, and let $N \subseteq J^{r-1}(\mathbb{R})$ be a coarsely dense, $B$-separated subset (each pair of distinct points in $N$ is separated by a distance at least $B$; such sets always exist by Zorn's Lemma). Then since $J^{r-1}(\mathbb{R})$ is a graded subgroup of $G'$, there is a biLipschitz embedding $N \to G'$. If $B$ is chosen large enough, we may postcompose with a nearest neighbor map $G' \to N'$ to obtain another biLipschitz embedding $N \to N'$. Then the conclusion follows from Theorem 5.6. □
The following quantitative nonembeddability estimate follows from the previous corollary and Theorem 4.19.
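The deduction can be sketched as follows. We take $p = 2r'$, where $r'$ is the step of the target group $G'$, and write $c$ for the biLipschitz distortion of $B_\Gamma(R)$ in $G'$. The exponents below come from substituting $p = 2r'$ into the bound of the previous corollary, under the assumption that it reads $\ln(R)^{1/p - 1/(2r)} / \ln(\ln(R))^{1/p + 1/(2r)}$.

```latex
% Sketch: from Markov convexity constants to a distortion lower bound.
\begin{align*}
\Pi_{2r'}\big(B_\Gamma(R)\big) &\le c\,\Pi_{2r'}(G')
  && \text{(since } \Pi_p(X) \le L\,\Pi_p(Y) \text{ for maps of distortion } L\text{)} \\
\Pi_{2r'}(G') &< \infty
  && \text{(Theorem 4.19 applied to } G'\text{, which has step } r'\text{)} \\
\Pi_{2r'}\big(B_\Gamma(R)\big) &\gtrsim
  \frac{\ln(R)^{\frac{1}{2r'}-\frac{1}{2r}}}{\ln(\ln(R))^{\frac{1}{2r'}+\frac{1}{2r}}}
  && \text{(previous corollary with } p = 2r'\text{)} \\
\Longrightarrow\quad c &\gtrsim
  \frac{\ln(R)^{\frac{1}{2r'}-\frac{1}{2r}}}{\ln(\ln(R))^{\frac{1}{2r'}+\frac{1}{2r}}}.
\end{align*}
```

The bound is nontrivial exactly when $r' < r$, since only then is the exponent of $\ln(R)$ positive.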
Corollary 1.6.
Let $G$ be a Carnot group of step $r$ that has $J^{r-1}(\mathbb{R})$ as a graded subquotient group; for example, $G$ may be a free Carnot group, $J^{r-1}(\mathbb{R}^k)$, or any Carnot group if $r \leq 3$. Let $\Gamma \leq G$ be a lattice equipped with the word metric with respect to a finite generating set, and let $B_\Gamma(R)$ denote the ball of radius $R$ in $\Gamma$ centered at the identity. Let $G'$ be any graded nilpotent Lie group of step $r' < r$. Then we have the following estimate for $c_{G'}(B_\Gamma(R))$, the biLipschitz distortion of $B_\Gamma(R)$ in $G'$:
$$c_{G'}(B_\Gamma(R)) \gtrsim \frac{\ln(R)^{\frac{1}{2r'} - \frac{1}{2r}}}{\ln(\ln(R))^{\frac{1}{2r'} + \frac{1}{2r}}},$$
where the implicit constant depends on $G$ and $G'$ but not on $R$.
Such quantitative nonembeddability estimates have been the subject of much attention for embeddings of Heisenberg groups into certain Banach spaces; see [ANT13] and [LN14] for uniformly convex Banach space targets and [NY18] for $L_1$ targets. In particular, it can be deduced from [ANT13] and [Ass83] that the biLipschitz distortion of the ball of radius $R$ in a lattice in the Heisenberg group into Hilbert space equals, up to universal factors, $\sqrt{\ln(R)}$. Thus, our estimates in the previous corollary cannot be sharp when $r = 2$ and $r' = 1$. However, these estimates seem to be the first of their type when the target is allowed to be a nilpotent group of step larger than 1. Other quantitative nonembeddability estimates between Carnot groups were obtained in [Li14], but they are of a different flavor. Since our estimates are not sharp for $r = 2$, $r' = 1$, we speculate that they are not sharp for larger values of $r, r'$ either.
Next, we obtain new results on the nonexistence of Lipschitz subquotient maps.
Corollary 1.7.
Let $G$ be a Carnot group of step $r$ that has $J^{r-1}(\mathbb{R})$ as a graded subquotient group; for example, $G$ may be a free Carnot group, $J^{r-1}(\mathbb{R}^k)$, or any Carnot group if $r \leq 3$. Let $G'$ be any graded nilpotent Lie group of step $r'$.
(1) $G$ is not a Lipschitz subquotient of $L_p$ (or any $p$-convex space) for any $p \in (1, 2r)$.
(2) If $r > r'$, $G$ is not a Lipschitz subquotient of $G'$.
Proof. These follow from the previous corollary, the fact that Markov $p$-convexity is preserved under Lipschitz subquotients, Theorem 1.2, and the classical fact that $L_p$ is $\max(2, p)$-convex for $p > 1$. □
Essentially all of the previously known results of this flavor are proved via Pansu differentiation ([Pan89]), which applies when the domain is a (finite dimensional) Carnot group and the target is an RNP Banach space or (finite dimensional) Carnot group (Section 1.2 of [LN06] or Theorem 6.1 of [CK06], which are stated for biLipschitz maps on the Heisenberg group, but also apply to biLipschitz or Lipschitz quotient maps on any Carnot group of step at least 2). There is also a recent differentiation theorem of Le Donne-Li-Moisala ([LDLM18]) which applies when the domain is a “scalable” group filtrated by (finite dimensional) Carnot groups and the target is an RNP space. However, there does not seem to be a clear way to deduce Corollary 1.7 in full generality from any of these methods.
We may use Markov convexity again to prove nonexistence of subquotient maps onto some “infinite step” graded Lie groups. See Section 3.4 for the definitions of inverse limits, $J^\infty(\mathbb{R}^k)$, and the free Carnot group on $k$ generators, $F_k^\infty$.
Corollary 1.8.
Let $G_1 \leftarrow G_2 \leftarrow \cdots$ be an inverse system of graded nilpotent Lie groups such that for every $r$, there is an $i$ with $J^{r-1}(\mathbb{R})$ a graded subquotient of $G_i$, and let $G_\infty$ be the inverse limit group. For example, $G_\infty$ may be $J^\infty(\mathbb{R}^k)$ or $F_k^\infty$. Then $G_\infty$ is not a Lipschitz subquotient of any superreflexive space.
Proof. Pisier's renorming theorem, Theorem 11.37 of [Pis16], states that any superreflexive Banach space is $p$-convex for some $p \in [2, \infty)$. Thus it suffices to show that $G_\infty$ is not Markov $p$-convex for any $p \in (0, \infty)$. For every $r \geq [\ldots]$, $J^{r-1}(\mathbb{R})$ is a Lipschitz subquotient of $G_\infty$, so since Markov $p$-convexity is preserved under Lipschitz subquotients, the conclusion follows from Corollary 1.4. □
Finally, we provide a positive result on the existence of embeddings using one of the main results of [LNP09]. A metric tree is the vertex set of a weighted graph-theoretical tree equipped with the shortest path metric.
Theorem 1.9 (Theorem 4.1, [LNP09]). If $T$ is a metric tree and $T$ is Markov $p$-convex, then $T$ biLipschitz embeds into $L_p$.
Corollary 1.10.
If a metric tree $T$ is a Lipschitz subquotient of a graded nilpotent Lie group $G$ of step $r$, then $T$ biLipschitz embeds into $L_p$ for every $p \geq 2r$.
Proof. This follows from Theorem 4.19, the fact that Markov convexity is inherited by Lipschitz subquotients, and Theorem 1.9. □
We conclude this introduction with the obvious conjecture that Theorems 4.19 and 5.6 lead to, and another somewhat less obvious conjecture.
Conjecture 1.11.
No Carnot group of step $r$ is Markov $p$-convex for any $p \in (0, 2r)$.
Conjecture 1.12.
For each graded nilpotent Lie group $G$, the set of $p$ for which $G$ is Markov $p$-convex is the same as that of the largest Carnot subgroup of $G$.
2. Discussion of Proof Methods
We engage here in an informal discussion of the proofs of Theorems 4.19 and 5.6. This discussion is intended to give a brief overview of the proofs for readers with a sufficient background in the relevant topics. For Theorem 4.19, the relevant topics are graded nilpotent Lie algebras, the group structure they inherit via the Baker-Campbell-Hausdorff formula, and their graded-homogeneous group quasi-norms. For Theorem 5.6, the relevant topics are Markov convexity of diamond-type graphs, jet space Carnot groups, and Khintchine's inequality. Readers unfamiliar with these topics may not find this section useful.
2.1. Discussion of Proof of Theorem 4.19.
The method employed by Mendel-Naor to prove that $p$-convexity of Banach spaces implies Markov $p$-convexity is to:
(1) Invoke the well-known result that $p$-convex Banach spaces have equivalent norms $\|\cdot\|$ satisfying the parallelogram inequality
$$\frac{\|x\|^p + \|x - y\|^p}{2} - \left\| \frac{y}{2} \right\|^p \gtrsim \left\| x - \frac{y}{2} \right\|^p.$$
(2) Prove the 4-point inequality
$$\frac{2d(y, x)^p + d(z, y)^p + d(y, w)^p}{2} - \left( \frac{d(x, w)}{2} \right)^p - \left( \frac{d(x, z)}{2} \right)^p \gtrsim d(z, w)^p,$$
where $d(x, y) = \|x - y\|$.
(3) Prove the Markov $p$-convexity inequality, Definition 1.1.
We prove the analogous inequalities for graded nilpotent Lie groups:
(1) Lemma 4.17. Construct a group quasi-norm $N$ satisfying
$$\frac{N(x)^p + N(y^{-1}x)^p}{2} - \left( \frac{N(y)}{2} \right)^p \gtrsim N(\delta_{1/2}(y)^{-1}x)^p.$$
(2) Lemma 4.18. Prove the 4-point inequality
$$\frac{2d(y, x)^p + d(z, y)^p + d(y, w)^p}{2} - \left( \frac{d(x, w)}{2} \right)^p - \left( \frac{d(x, z)}{2} \right)^p \gtrsim d(z, w)^p,$$
where $d(x, y) = N(y^{-1}x)$.
(3) Theorem 4.19. Prove the Markov $p$-convexity inequality.
The passage from (1) to (2) and from (2) to (3) is exactly the same as in the Banach space case. To prove (1), we recursively construct a sequence of homogeneous quasi-norms on the group, and prove that they satisfy (1) inductively. Actually, the following stronger version of (1) (with $p = 2s$; the case $p \geq 2s$ is taken care of later) is needed for the induction to close; this is Lemma 4.16:
$$\frac{N_s(x)^{2s} + N_s(y^{-1}x)^{2s}}{2} - \left( \frac{N_s(y)}{2} \right)^{2s} \gtrsim SN_s(x, y)^{2s} + D_s(x, y) + N_s(\delta_{1/2}(y)^{-1}x)^{2s}.$$
There are two extra terms that appear in this inequality, $SN_s(x, y)$ and $D_s(x, y)$, defined in Definitions 4.8 and 4.14. $D_s(x, y)$ is designed to bound (up to constants) the square of any BCH polynomial of degree $s$ (see Definition 4.1), so one may guess how it would be useful to prove (1). $SN_s(x, y)$ is nearly a positive definite quasi-norm of $(x_1, \ldots, x_s, y_1, \ldots, y_s)$ (the name $SN$ is meant to suggest that it is a seminorm instead of a norm, since it is not positive definite), but not quite, as it vanishes when $x_1 = y_1/2$ and $x_i = y_i = 0$ for $i \geq 2$. However, this is not an issue, as we will have an extra $\|y_1\|$ term in the induction, so that $\|y_1\| + SN_s(x, y)$ is genuinely a quasi-norm of $(x_1, \ldots, x_s, y_1, \ldots, y_s)$. Here are $D_s$ and $SN_s$ for some small $s$:
$$D_3(x, y) = \|(x_3, y_3)\|^2 + \|(x_1, y_1)\|^2 \|(x_2, y_2)\|^2 + \|(x_1, y_1)\|^2 \tau(x, y)$$
$$D_4(x, y) = \|(x_4, y_4)\|^2 + \|(x_1, y_1)\|^2 \|(x_3, y_3)\|^2 + \|(x_2, y_2)\|^4 + \|(x_1, y_1)\|^4 \|(x_2, y_2)\|^2 + \|(x_1, y_1)\|^4 \tau(x, y) + \|(x_2, y_2)\|^2 \tau(x, y)$$
$$SN_3(x, y) = \max\left( \|x_1 - y_1/2\|,\ \|(x_2, y_2)\|^{1/2},\ \|(x_3, y_3)\|^{1/3} \right)$$
The polynomial $\tau(x, y)$ is designed to bound the squares of terms coming from the bracket between two vectors from the horizontal layer. For example, in the second Heisenberg group,
$$\tau(x, y) = (x_a y_b - x_b y_a)^2 + (x_c y_d - x_d y_c)^2,$$
where $(a, b)$ and $(c, d)$ index the two pairs of horizontal coordinates that bracket nontrivially.
We recursively construct the quasi-norms $N_{s+1}$ given all the previous quasi-norms by defining $N_{s+1}(x)$ to be an $\ell^{2(s+1)}$ sum of $\lambda_{s+1}\|x_{s+1}\|^{1/(s+1)}$ and the top half of the previously defined quasi-norms, where $\lambda_{s+1}$ is a positive constant chosen small enough (depending on the product structure of the group in question) to make the inequality of Lemma 4.16(1) hold. Specifically, from (4.1),
$$N_2(x) = \sqrt[4]{\|x_1\|^4 + \lambda_2 \|x_2\|^2}$$
$$N_{s+1}(x) = \sqrt[2(s+1)]{\lambda_{s+1} \|x_{s+1}\|^2 + \sum_{s' = \lceil (s+1)/2 \rceil}^{s} N_{s'}(x)^{2(s+1)}}$$
The reason why we add the top half of the previously defined norms, and the reason for the inclusion of the $SN_s(x, y)$ term in the inequality, is to help pass from $D_s(x, y)$ to $D_{s+1}(x, y)$ during the proof of the inductive step.
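As a worked instance of the recursion, suppose it reads $N_{s+1}(x)^{2(s+1)} = \lambda_{s+1}\|x_{s+1}\|^2 + \sum_{s' = \lceil (s+1)/2 \rceil}^{s} N_{s'}(x)^{2(s+1)}$ with $N_2(x)^4 = \|x_1\|^4 + \lambda_2\|x_2\|^2$. Under that reading, which is our assumption and should be checked against (4.1), the next two quasi-norms would be as follows. The last line records why each $N_{s+1}$ is homogeneous of degree 1, assuming the dilation $\delta_t$ scales the $i$th layer by $t^i$.

```latex
% Unrolling the recursion for s + 1 = 3 and s + 1 = 4 (illustrative only).
\begin{align*}
N_3(x)^{6} &= \lambda_3 \|x_3\|^2 + N_2(x)^{6},
  && s' \in \{2\}, \\
N_4(x)^{8} &= \lambda_4 \|x_4\|^2 + N_2(x)^{8} + N_3(x)^{8},
  && s' \in \{2, 3\}, \\
N_{s+1}(\delta_t x)^{2(s+1)}
  &= \lambda_{s+1}\, t^{2(s+1)} \|x_{s+1}\|^2 + \sum_{s'} t^{2(s+1)} N_{s'}(x)^{2(s+1)}
   = t^{2(s+1)} N_{s+1}(x)^{2(s+1)}.
\end{align*}
```

Note that every layer $1, \ldots, s+1$ still enters $N_{s+1}$: the lower layers are carried along inside the quasi-norms $N_{s'}$ from the top half.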
When proving the inductive step, we have terms like $(SN_{s'}(x, y)^{2s'} + D_{s'}(x, y))^{(s+1)/s'}$, $s' \leq s$, appearing, to which we apply Lemma 3.6 and obtain a term like $SN_{s'}(x, y)^{2(s+1-s')} D_{s'}(x, y)$. This term bounds $\|(x_{s+1-s'}, y_{s+1-s'})\|^2 D_{s'}(x, y)$ exactly when $\lceil (s+1)/2 \rceil \leq s' \leq s$. Then summing $\|(x_{s+1-s'}, y_{s+1-s'})\|^2 D_{s'}(x, y)$ over this range of $s'$ accounts for all the terms in $D_{s+1}(x, y)$, except for the top-layer term $\|(x_{s+1}, y_{s+1})\|^2$ (since any other term in $D_{s+1}(x, y)$ contains as a factor a variable from one of the lower half layers, see Lemma 4.9 for details), which is accounted for later.
2.2. Discussion of Proof of Theorem 5.6.
We recursively construct a sequence of directed graphs $\Gamma_m$ and maps from them into the jet space of step $r$ ($J^{r-1}(\mathbb{R})$) to show that it is not Markov $p$-convex for any $p < 2r$. The Markov processes we use are standard directed random walks on the graphs. This is very similar to the method used in [Li16], where something akin to the Laakso-Lang-Plaut diamond graphs was used. The main feature of those graphs $G_m$ is that $G_{m+1}$ is obtained from $G_1$ by replacing each edge of $G_1$ with a copy of $G_m$. Roughly speaking, Li recursively maps $G_{m+1}$ into $\mathbb{R}^{[\ldots]}$ by replacing each edge of a distorted image of $G_1$ by a rotated, distorted copy of the image of $G_m$. The distortion is done in such a way that the coLipschitz constant (the Lipschitz constant of the inverse map) is on the order of $\sqrt[4]{m}\,\sqrt{\ln(m+1)}$, and the fact that rotations are isometries of the Heisenberg group affords one uniform control on the Lipschitz constants. One can conclude from this that the Heisenberg group is not Markov $p$-convex for $p < 4$ [...].
Our graphs differ from those in [Li16] in that, to obtain $\Gamma_{m+1}$ from $\Gamma_m$, we first glue together many copies of $\Gamma_m$, together with a small number of copies of a single edge $I$, in series to get a new graph $\Gamma'_{m+1}$, and then replace each edge of $\Gamma_1$ with a copy of $\Gamma'_{m+1}$ (this isn't exactly how our construction is defined, but is close enough to get the main idea). See Definition 5.1 for the full details. We will explain the reasoning for this after describing our maps of $\Gamma_m$ into $J^{r-1}(\mathbb{R})$.
Our maps differ from those in [Li16] in that we do not rotate the image of $\Gamma_m$ before using it to replace the edges of the image of $\Gamma_1$, as rotations are not Lipschitz maps in higher step groups like they are in the Heisenberg group. Refer to Figure 2 throughout this discussion to get an idea of the construction of these maps. Instead of rotating, we simply add (many copies of) the image of $\Gamma_m$ to a distorted copy of the image of $\Gamma_1$ to obtain the mapping of $\Gamma_{m+1}$ into $\mathbb{R}^{[\ldots]}$.
More specifically, we map each directed path $\gamma$ in $\Gamma_{m+1}$ to the jet of a function $\phi_\gamma$, a horizontal curve in $J^{r-1}(\mathbb{R})$. The Lipschitz constant of this map is controlled by $\left\| \frac{d^r}{dx^r} \phi_\gamma \right\|_\infty$. We still distort the graphs $\Gamma_m$ with the same asymptotics as in [Li16], so that the coLipschitz constant is on the order of $\sqrt[2r]{m}\,\sqrt[r]{\ln(m+1)}$ (at least on the pairs of random walks $(X^m_t, \tilde{X}^m_t(t - 2^k))$). That we get the $2r$th root of $m$ instead of the fourth root of $m$ comes from the fact that $J^{r-1}(\mathbb{R})$ is of step $r$ and the Heisenberg group is of step 2. One potential problem is that the absence of isometric rotations and the fact that $(\sqrt{m}\ln(m))^{-1}$ isn't summable mean that $\left\| \frac{d^r}{dx^r} \phi_\gamma \right\|_\infty$ blows up along some paths, and thus we do not have uniform control on the Lipschitz constant of the map, unlike in [Li16]. However, $(\sqrt{m}\ln(m))^{-1}$ is square-summable, and together with the nature of the image of the random walk $X^m_t$ in $J^{r-1}(\mathbb{R})$, this allows us to control $\mathbb{E}[d_{CC}(X^m_{t+1}, X^m_t)^p]$ uniformly in $m, t$. Loosely, along the random walk in the horizontal layer (which has $x$- and $u_{r-1}$-coordinates), every time one is confronted with a choice of direction to walk in, the choice is to walk 1 unit in the $x$-direction and $+(\sqrt{i}\ln(i+1))^{-1}$ units in the $u_{r-1}$-direction with probability 1/2, or 1 unit in the $x$-direction and $-(\sqrt{i}\ln(i+1))^{-1}$ units in the $u_{r-1}$-direction with probability 1/2 (for some $i$ depending on how far one has walked). Thus, one might expect $d_{CC}(X^m_{t+1}, X^m_t)$ to be bounded by a random variable distributed like $1 + \left| \sum_{i=1}^{t} \epsilon_i (\sqrt{i}\ln(i+1))^{-1} \right|$, where $\{\epsilon_i\}_i$ are iid Rademachers, and then Khintchine's inequality implies we should have a uniform bound on $\mathbb{E}[d_{CC}(X^m_{t+1}, X^m_t)^p]$ (which is the real quantity of interest, recall Definition 1.1).
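The summability dichotomy behind this heuristic can be checked numerically. The sketch below illustrates only the heuristic, not the actual random walk: it takes the weights $a_i = (\sqrt{i}\ln(i+1))^{-1}$, compares the partial sums of $a_i$ (which grow without bound) with those of $a_i^2$ (which stay bounded), and evaluates the exact fourth moment of the Rademacher sum $\sum_i \epsilon_i a_i$. All function names are ours and purely illustrative.

```python
import math
from itertools import product


def weights(t):
    """Return a_i = 1 / (sqrt(i) * ln(i + 1)) for i = 1..t."""
    return [1.0 / (math.sqrt(i) * math.log(i + 1)) for i in range(1, t + 1)]


def l1_sum(t):
    """Partial sum of a_i; this diverges (slowly) as t grows."""
    return sum(weights(t))


def l2_sum(t):
    """Partial sum of a_i^2; equals E[(sum_i eps_i a_i)^2] and stays bounded."""
    return sum(a * a for a in weights(t))


def fourth_moment(a):
    """Exact E[(sum_i eps_i a_i)^4] for iid Rademacher signs eps_i."""
    s2 = sum(x * x for x in a)
    s4 = sum(x ** 4 for x in a)
    return 3.0 * s2 * s2 - 2.0 * s4


def fourth_moment_bruteforce(a):
    """Same quantity, averaged over all 2^len(a) sign patterns (small inputs only)."""
    total = 0.0
    for signs in product((-1.0, 1.0), repeat=len(a)):
        total += sum(e * x for e, x in zip(signs, a)) ** 4
    return total / 2 ** len(a)


if __name__ == "__main__":
    for t in (10**2, 10**3, 10**4, 10**5):
        a = weights(t)
        print(t, round(l1_sum(t), 3), round(l2_sum(t), 4), round(fourth_moment(a), 4))
```

The second moment equals `l2_sum(t)` exactly and the fourth moment is at most `3 * l2_sum(t) ** 2`, so both stay bounded in `t` even though `l1_sum(t)` does not; Khintchine's inequality gives the analogous bound for every exponent $p$.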
Of course, the random walk is not distributed like this, but it turns out that this intuition is correct nonetheless; see Lemmas 3.9 and 5.5(4) for the specifics.
Finally, the reason we use many copies of $\Gamma_m$ in creating $\Gamma_{m+1}$ is so that, compared to the diameter of $\Gamma_{m+1}$, the diameter of the copies of $\Gamma_m$ is very small, and thus those that replaced opposite edges of $\Gamma_1$ don't get too close together, which would ruin the coLipschitz constant. Morally, this “decouples” any interaction between different scales in $\Gamma_{m+1}$.
3. Preliminaries
The next two subsections don't follow any particular reference, but ones we recommend are [BLU07] for Carnot groups and [LD17] for graded nilpotent groups. We mostly follow [War05] for the subsection on jet spaces.
3.1. Graded Nilpotent and Stratified Lie Algebras and their Lie Groups.
A graded nilpotent Lie algebra $(\mathfrak{g}, [\cdot, \cdot])$ of step $r$ is a Lie algebra equipped with a grading $\mathfrak{g} = \bigoplus_{i=1}^{r} \mathfrak{g}_i$, meaning $\mathfrak{g}_r \neq 0$, $[\mathfrak{g}_i, \mathfrak{g}_j] \subseteq \mathfrak{g}_{i+j}$ if $i + j \leq r$, and $[\mathfrak{g}_i, \mathfrak{g}_j] = 0$ if $i + j > r$. A stratified Lie algebra $(\mathfrak{g}, [\cdot, \cdot])$ of step $r$ is a graded nilpotent Lie algebra of step $r$ such that the Lie subalgebra generated by $\mathfrak{g}_1$ is all of $\mathfrak{g}$. The grading is called a stratification, $\mathfrak{g}_1$ is often called the horizontal layer (or stratum), and $\mathfrak{g}$ is said to be horizontally generated. Whenever a Lie algebra $\mathfrak{g}$ (not presumed to be equipped with a grading) admits a stratification, it is unique (Lemma 2.16, [LD17]). A graded nilpotent Lie group of step $r$ is a simply connected Lie group whose Lie algebra is graded nilpotent of step $r$. A graded nilpotent Lie group whose Lie algebra is stratified is a Carnot group. A graded homomorphism or map is a Lie group homomorphism between graded nilpotent Lie groups whose derivative is a graded Lie algebra homomorphism. One graded nilpotent Lie group $G'$ is a graded subgroup of another graded nilpotent Lie group $G$ if there is an injective graded homomorphism from $G'$ into $G$. One graded nilpotent Lie group $G'$ is a graded quotient group of another graded nilpotent Lie group $G$ if there is a surjective graded homomorphism from $G$ onto $G'$.
One graded nilpotent Lie group $G'$ is a graded subquotient group of another graded nilpotent Lie group $G$ if there is another graded nilpotent Lie group $G''$ such that $G''$ is a graded subgroup of $G$ and $G'$ is a graded quotient group of $G''$, or, equivalently, there is another graded nilpotent Lie group $G''$ such that $G''$ is a graded quotient group of $G$ and $G'$ is a graded subgroup of $G''$.
Given a graded nilpotent Lie group $G$ and its Lie algebra $\mathfrak{g}$, since $\mathfrak{g}$ is nilpotent and $G$ is simply connected, the exponential map is a diffeomorphism, and thus we can use it to equip $\mathfrak{g}$ with a graded nilpotent Lie group structure such that it becomes graded isomorphic to $G$. The Baker-Campbell-Hausdorff formula provides a formula for the group product on $\mathfrak{g}$ in terms of the Lie algebra structure (Section 2, [War05]):
$$xy = \sum_{n > 0} \frac{(-1)^{n+1}}{n} \sum [\ldots]$$
[...]-action on $G$. It can be deduced that a Lie group homomorphism $\theta$ between graded nilpotent Lie groups is a graded homomorphism if and only if it is $\delta_t$-equivariant, that is, $\theta(\delta_t(x)) = \delta_t(\theta(x))$, where we've abused notation (and will continue to do so) and written $\delta_t$ for the dilation on both the domain and codomain.
3.2. Norms and Metrics.
Let $G$ be a graded nilpotent Lie group. A homogeneous quasi-norm on $G$ is a continuous function $N : G \to \mathbb{R}$ such that for all $x \in G$ and $t \in \mathbb{R}_{>0}$,
• $N(x) \geq 0$
• $N(x^{-1}) = N(x)$ (symmetry)
• $N(\delta_t(x)) = tN(x)$ (homogeneity)
If additionally $N(x) = 0$ implies $x = 0$, then $N$ is a positive definite homogeneous quasi-norm, and if $N(xy) \leq N(x) + N(y)$ for all $x, y \in G$ (triangle inequality), $N$ is a homogeneous norm. For any two positive definite homogeneous quasi-norms $N, N'$ on $G$, the continuity, homogeneity, and positive definiteness of $N, N'$, together with the compactness of the unit sphere in $\bigoplus_{i=1}^{r} \mathbb{R}^{\dim(\mathfrak{g}_i)}$, imply that $N$ and $N'$ are biLipschitz equivalent, that is, there is a constant $0 < C < \infty$ such that
$$C^{-1} N(x) \leq N'(x) \leq C N(x)$$
for all $x \in G$.
Positive definite homogeneous norms always exist, most famously those considered in [HS90]. Thus any positive definite homogeneous quasi-norm $N$ satisfies the quasi-triangle inequality: there is a $0 < C < \infty$ such that for all $x, y \in G$,
$$N(xy) \leq C(N(x) + N(y)).$$
Typically one requires that every homogeneous quasi-norm $N$ satisfies the quasi-triangle inequality. Although it turns out that the quasi-norms we consider in this article do satisfy the quasi-triangle inequality, we only need to know this for positive definite quasi-norms and thus do not explicitly make this requirement.
There is a bijective correspondence between homogeneous, positive definite quasi-norms $N$ on $G$ and left-invariant, homogeneous quasi-metrics $d_N$ on $G$ via $N \mapsto d_N$ defined by
$$d_N(x, y) := N(y^{-1}x).$$
Positive definiteness of $N$ implies positive definiteness of $d_N$, symmetry of $N$ implies symmetry of $d_N$, homogeneity of $N$ implies the homogeneity of $d_N$ (meaning $d_N(\delta_t(x), \delta_t(y)) = t\,d_N(x, y)$), and the quasi-triangle inequality of $N$ implies the quasi-triangle inequality of $d_N$. The left-invariance of $d_N$ is automatic from the definition.
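As a concrete illustration of these definitions, the following sketch models the Heisenberg group (step 2) on $\mathbb{R}^3$ with the Baker-Campbell-Hausdorff product. It assumes that the dilation acts by $\delta_t(a, b, c) = (ta, tb, t^2c)$ and that the quasi-norm is $N(a, b, c) = ((a^2 + b^2)^2 + \lambda c^2)^{1/4}$ with $\lambda = 1$. These choices, and all function names, are ours for illustration and are not taken from the article.

```python
import math


def mul(x, y):
    """BCH product on the Heisenberg algebra R^3 (step 2, so the series stops at the bracket)."""
    a, b, c = x
    a2, b2, c2 = y
    return (a + a2, b + b2, c + c2 + 0.5 * (a * b2 - a2 * b))


def inv(x):
    """Group inverse; in exponential coordinates this is just negation."""
    return (-x[0], -x[1], -x[2])


def dilate(t, x):
    """Assumed dilation: weight 1 on the horizontal layer, weight 2 on the center."""
    return (t * x[0], t * x[1], t * t * x[2])


def quasi_norm(x, lam=1.0):
    """Assumed homogeneous quasi-norm N(a, b, c) = ((a^2 + b^2)^2 + lam * c^2)^(1/4)."""
    a, b, c = x
    return ((a * a + b * b) ** 2 + lam * c * c) ** 0.25


def dist(x, y, lam=1.0):
    """Left-invariant quasi-metric d_N(x, y) = N(y^{-1} x)."""
    return quasi_norm(mul(inv(y), x), lam)


if __name__ == "__main__":
    x, y, g, t = (1.0, 2.0, 0.5), (-0.3, 0.7, 1.1), (2.0, -1.0, 3.0), 2.5
    print("symmetry     :", math.isclose(quasi_norm(inv(x)), quasi_norm(x)))
    print("homogeneity  :", math.isclose(quasi_norm(dilate(t, x)), t * quasi_norm(x)))
    print("left-invariant:", math.isclose(dist(mul(g, x), mul(g, y)), dist(x, y)))
```

Left-invariance here needs no property of $N$ at all: $(gy)^{-1}(gx) = y^{-1}x$ holds in any group, which is the sense in which it is automatic from the definition.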
$N$ satisfies the triangle inequality if and only if $d_N$ does. The inverse of $N \mapsto d_N$ is $d \mapsto N_d$, where $N_d(x) := d(0, x)$. In addition to those determined by the homogeneous, positive definite norms from [HS90], there are canonical left-invariant, homogeneous metrics on Carnot groups called Carnot-Caratheodory metrics, denoted $d_{CC}$. These metrics are also geodesic. See [BLU07] or [LD17] for further information.

In what follows, whenever dealing with a graded nilpotent Lie group, we will automatically assume it is equipped with a left-invariant, homogeneous quasi-metric. By the preceding discussion, this quasi-metric is well-defined up to biLipschitz equivalence, so we may attribute any biLipschitz-invariant property of metric spaces to a graded nilpotent Lie group $G$ knowing only the algebraic structure of its graded Lie algebra. The $\delta_t$-equivariance of graded group maps implies that any graded map between graded nilpotent Lie groups is Lipschitz, and thus graded group embeddings are biLipschitz embeddings, graded quotient maps are Lipschitz quotient maps, and graded group isomorphisms are biLipschitz equivalences.

3.3. Model Filiform Groups and Jet Spaces over $\mathbb{R}$.

We follow [War05] (especially Example 4.3) throughout this subsection. The model filiform group of step $r \geq 2$ is the Carnot group whose stratified Lie algebra is $\mathfrak{g} = (\mathbb{R}X \oplus \mathbb{R}Y_1) \oplus \bigoplus_{i=2}^r \mathbb{R}Y_i$, where $X, Y_1$ is a basis for $\mathfrak{g}_1$ and $Y_i$ is a basis for $\mathfrak{g}_i$ for $2 \leq i \leq r$, and the nontrivial bracket relations are given by $[X, Y_i] = Y_{i+1}$ for $1 \leq i \leq r - 1$. For $s \geq r$, there is a canonical Carnot group quotient map from the model filiform group of step $s$ to that of step $r$. The model filiform group of step 2 is frequently called the Heisenberg group, and the one of step 3 the
Engel group. The corresponding Lie algebras are the Heisenberg algebra and Engel algebra.

The jet space over $\mathbb{R}$ of step $r \geq 0$, denoted $J^{r-1}(\mathbb{R})$, is a certain Carnot group of step $r$ graded isomorphic to the model filiform group of step $r$. There are also jet space groups $J^{r-1}(\mathbb{R}^k)$ over higher dimensional Euclidean space, but we will focus on $k = 1$ in this discussion. As a set, $J^{r-1}(\mathbb{R})$ consists of equivalence classes of pairs $(x, f)$ where $x \in \mathbb{R}$ and $f \in C^{r-1}(\mathbb{R})$. Two pairs $(x, f), (y, g)$ are equivalent if $x = y$ and $f^{(k)}(x) = g^{(k)}(y)$ for all $0 \leq k \leq r - 1$. We define maps $\pi_x, \pi_i : J^{r-1}(\mathbb{R}) \to \mathbb{R}$, 0 ≤ i ≤ r −
1, by π x ([( y, g )]) = y and π i ([( y, g )]) = g ( i ) ( y ). Thesemaps are obviously well-defined and the direct sum map π x ⊕ r − i =0 π r − − i : J r − ( R ) → R × R r isa bijection. For v ∈ J r − ( R ), the quantity π x ( v ) is referred to as the x -coordinate and π i ( v ) asthe u i -coordinate . We equip J r − ( R ) with a topological vector space structure so that this mapis a linear homeomorphism, and from this point on will represent elements of J r − ( R ) using thesecoordinates. We will especially represent elements as pairs ( y, v ) ∈ J r − ( R ) = R × R r so that y ∈ R , v ∈ R r , and π x (( y, v )) = y . Although we won’t explicitly use it, the group operation on J r − ( R ) isgiven by π x (( x, u r − , . . . u ) ∗ ( y, v r − , . . . v )) = x + yπ i (( x, u r − , . . . u ) ∗ ( y, v r − , . . . v )) = u i + v i + r − (cid:88) j = i +1 u j y j − i ( j − i )!Given y ∈ R and g ∈ C r − ( R ), we get an element [ j r − ( y )]( g ) ∈ J r − ( R ) defined by π x ([ j r − ( y )]( g )) = yπ i ([ j r − ( y )]( g )) = g ( i ) ( y )called the jet of g at y . The following two Lemmas are essentially all we need to know about jetspaces. The first is a special case of [RW10]. Although their lemma is stated for C r functions, theproof works the same in the case of C r − , functions. Lemma 3.1 (pages 4-5, [RW10]) . For any [ a, b ] ⊆ R and φ ∈ C r − , ([ a, b ]) , d CC ([ j r − ( b )]( φ ) , [ j r − ( a )]( φ )) ≤ (cid:18) (cid:13)(cid:13)(cid:13) φ ( r ) (cid:13)(cid:13)(cid:13) L ∞ ([ a,b ]) (cid:19) | b − a | emma 3.2. There is a constant c > such that for all ( x, u ) , ( x, v ) ∈ J r − ( R ) , d CC (( x, u ) , ( x, v )) ≥ c | π ( u − v ) | r Proof.
By left invariance of $d_{CC}$ and the ball-box theorem (see Corollary 2.2 of [Jun19]), there is a constant $c > 0$ such that for all $(x, u), (x, v) \in J^{r-1}(\mathbb{R})$,
$$d_{CC}((x, u), (x, v)) \geq c\,\left|\pi_0\big((x, v)^{-1}(x, u)\big)\right|^{1/r}$$
and by Lemma 3.1 from [Jun17],
$$\pi_0\big((x, v)^{-1}(x, u)\big) = \pi_0(u - v). \qquad \square$$

The following lemma will be used to obtain lower bounds on the Markov convexity of Carnot groups of step 2 or 3.
Lemma 3.3.
Every Carnot group of step 2 or 3 contains the model filiform group of the corresponding step (the Heisenberg or Engel group) as a graded subquotient group.

Proof.
Let G be a Carnot group of step 2 with stratified Lie algebra g = g ⊕ g . Since g has step2, there is a nonzero V ∈ g . Since g is horizontally generated, there exist U, V ∈ g such that[ U, V ] = V . Recall that the Heisenberg algebra has first layer generated by linearly independentvectors X, Y , second layer generated by Y (cid:54) = 0, and nontrivial bracket relation [ X, Y ] = Y . Thenit easily follows that X (cid:55)→ U , Y (cid:55)→ V , Y (cid:55)→ V is a graded algebra embedding into g . This provesthat the Heisenberg group is a graded subgroup of G .Now assume G is of step 3 with stratified Lie algebra g = g ⊕ g ⊕ g . By the grading property,any subspace of g is an ideal, and thus there is a graded algebra quotient map onto another step 3stratified Lie algebra whose third layer is one dimensional. Thus we may assume g = R W , W (cid:54) = 0,and prove that the Engel algebra embeds into g . Since g is horizontally generated, W = [ U , [ U , U ]]for some U , U , U ∈ g . First we claim that there is a 2-dimensional subspace of the span of U , U , U that generates a Lie subalgebra of step 3. After proving the claim, we’ll show that thissubalgebra must be graded algebra-isomorphic to the Engel algebra. To prove the claim, we’ll showthat at least one of the following is nonzero:(1) [ U , [ U , U ]](2) [ U , [ U , U ]](3) [ U , [ U , U ]](4) [ U , [ U , U ]](5) [ U + U , [ U + U , U ]](6) [ U + U , [ U + U , U ]]Assume that all terms are 0. First let’s see that [ U , [ U , U ]] = W .0 (5) = [ U + U , [ U + U , U ]] = [ U , [ U , U ]] + [ U , [ U , U ]] + [ U , [ U , U ]] + [ U , [ U , U ]] (2) , (3) = W + [ U , [ U , U ]] = W − [ U , [ U , U ]]Using (6) , (1) , (4) in place of (5) , (2) , (3) shows [ U , [ U , U ]] = W . Putting these together yields:[ U , [ U , U ]] + [ U , [ U , U ]] + [ U , [ U , U ]] = 3 W (cid:54) = 0in violation of the Jacobi identity. 
This proves the claim.So now the situation is that there are Z , Z ∈ g with [ Z , [ Z , Z ]] = zW for some z (cid:54) = 0.Recall that the Engel algebra has first layer spanned by X, Y , second layer by Y , and third layerby Y with nontrivial bracket relations [ X, Y ] = Y and [ X, Y ] = Y . Let z (cid:48) ∈ R such that Z , [ Z , Z ]] = z (cid:48) W . Then since [ Z , [ Z , Z ]] = zW (cid:54) = 0, the map from the Engel algebra into g defined by X (cid:55)→ Z , Y (cid:55)→ Z − z (cid:48) z Z , Y (cid:55)→ [ Z , Z ] , Y (cid:55)→ zW is a graded algebra embedding. (cid:3) Remark . The analogue of Lemma 3.3 is false for groups of step larger than 3. Let g be thestratified Lie algebra g = ⊕ i =1 g i with g = R X ⊕ R X , g = R X , g = R X ⊕ R X , g = R X and nontrivial brackets [ X , X ] = X , [ X , X ] = X , [ X , X ] = X , [ X , X ] = X ,[ X , X ] = X . The only graded quotient maps from g onto another step 4 stratified Lie algebraor graded embeddings into g from another step 4 stratified Lie algebra are isomorphisms.3.4. Infinite Step Carnot groups.
Given an inverse system of graded nilpotent Lie groups $G_1 \xleftarrow{\rho_1} G_2 \xleftarrow{\rho_2} \cdots$, where each $\rho_i$ is a graded quotient map, we define the inverse limit metric group, $G_\infty$, to be the subgroup of $(\oplus_{i=1}^\infty G_i)_\infty$ consisting of those sequences $(x_i)_{i=1}^\infty$ for which $\rho_i(x_{i+1}) = x_i$ for all i ≥
1, where ( ⊕ ∞ i =1 G i ) ∞ is the (cid:96) ∞ -sum of the pointed metric spaces ( G i , d CC , G ∞ inheritsa left-invariant homogeneous metric from ( ⊕ ∞ i =1 G i ) ∞ (where the dilations δ t are defined on G ∞ inthe obvious way), and each G i is a Lipschitz quotient of G ∞ . Definition 3.5. J ∞ ( R k ) is the inverse limit metric group, equipped with the induced δ t -action,associated to the natural inverse system formed by the jet space groups, J ( R k ) ρ ← J ( R k ) ρ ← . . . .See [War05] for background on jet space groups. Similarly, F ∞ k is the inverse limit metric group,equipped with the induced δ t -action, associated to the natural inverse system formed by the freeCarnot groups on k generators, F k ρ ← F k ρ ← . . . . See Chapter 14 of [BLU07] for background onfree Carnot groups.3.5. Probabilistic and Convexity Inequalities.
In this article, we will often justify an inequality with the phrase "by convexity" or "by the parallelogram law". The convexity inequalities we refer to are almost always of the form
$$\frac{a^p + b^p}{2} \geq \left(\frac{a + b}{2}\right)^p \quad \text{or} \quad a^p + b^p \leq (a + b)^p$$
for p ≥ 1 and a, b ≥
0. The form of the parallelogram law we most often use is (cid:107) u (cid:107) + (cid:107) u − v (cid:107) (cid:107) v/ (cid:107) + (cid:107) u − v/ (cid:107) for u, v in a Hilbert space, which implies the inequality (cid:107) u (cid:107) + (cid:107) u − v (cid:107) ≥ (cid:107) v (cid:107) L p -normsof random variables. Lemma 3.6.
For all $a, b \geq 0$ and $q \geq 1$,
$$(a + b)^q \geq a^q + q a^{q-1} b.$$

Proof.
Let $a, b, q$ be as above. The inequality is obviously true if $a = 0$. Then if a >
0, after dividingeach side by a q and replacing b/a with t , it suffices to prove (1 + t ) q ≥ qt . This inequality istrue since the right hand is the linearization of the left hand side at t = 0, and the left hand sideis a convex function of t . (cid:3) emma 3.7. For each p > and k ≥ , k (cid:88) t =1 (2 t ) p > k p +1 / Proof.
Let p > k ≥
1. Since the function t (cid:55)→ (2 t ) p is increasing, k (cid:88) t =1 (2 t ) p > ˆ k (2 t ) p dt = 2 p p + 1 k p +1 ≥ k p +1 / (cid:3) The following two lemmas are frequently used in tandem to prove Khintchine’s inequality (forexample, Proposition 4.5 of [Wol03]). We will need them for a similar inequality used in Section5.2.
Lemma 3.8.
For all y ∈ R , cosh( y ) ≤ exp( y / .Proof. Let y ∈ R .cosh( y ) = e y + e − y ∞ (cid:88) k =0 y k + ( − y ) k k ! = ∞ (cid:88) k =0 y k (2 k )! ≤ ∞ (cid:88) k =0 ( y / k k ! = exp (cid:18) y (cid:19) (cid:3) Lemma 3.9.
For each p ≥ and < A, B < ∞ , there is a constant C = C ( p, A, B ) < ∞ such thatany real-valued random variable Y satisfying the moment generating function subgaussian bound E [exp( yY )] ≤ Ae By also satisfies the L p -norm bound E [ | Y | p ] ≤ C Proof.
This is a standard result from the theory of subgaussian random variables whose proofappears in any text on measure concentration. For the sake of completeness we’ll include the proof,roughly following the proof of Proposition 4.5 from [Wol03]. Let p , A , B , Y be as above. For any t >
0, Markov’s inequality and our assumption imply P ( Y ≥ t ) = P (cid:18) exp (cid:18) t B Y (cid:19) ≥ exp (cid:18) t B (cid:19)(cid:19) ≤ exp (cid:18) − t B (cid:19) E (cid:20) exp (cid:18) t B Y (cid:19)(cid:21) ≤ A exp (cid:18) − t B + t B (cid:19) = A exp (cid:18) − t B (cid:19) Likewise, P ( Y ≤ − t ) ≤ A exp (cid:18) − t B (cid:19) giving us P ( | Y | ≥ t ) ≤ A exp (cid:18) − t B (cid:19) We then use the layer cake principle to calculate E [ | Y | p ]: E [ | Y | p ] = p ˆ ∞ t p − P ( | Y | ≥ t ) dt ≤ p ˆ ∞ t p − A exp (cid:18) − t B (cid:19) dt = C ( p, A, B ) < ∞ (cid:3) . Upper Bound on Markov Convexity of Graded Nilpotent Lie Groups
Throughout this section, fix a graded nilpotent Lie algebra ( g , [ · , · ]) of step r ≥ ⊕ ri =1 g i and dim( g i ) = k i . Choose an ordered basis U i, , . . . U i,k i for each g i and equip g with aHilbert norm (cid:107) · (cid:107) such that these vectors form an orthonormal basis. We also use (cid:107) · (cid:107) to denotethe Euclidean norm on any R n . Given x ∈ g , let x i ∈ g i denote its g i -component. Given x i ∈ g i ,let x i,j ∈ R denote its U i,j -component. Thus, (cid:107) x (cid:107) = r (cid:88) i =1 (cid:107) x i (cid:107) and (cid:107) x i (cid:107) = k i (cid:88) j =1 | x i,j | Consider g as a graded nilpotent Lie group as in Section 3. It’s easy to see that 0 is the groupidentity element and x − = − x . Whenever u, v ∈ g or u, v ∈ R n , we use the notation (cid:107) ( u, v ) (cid:107) tomean (cid:107) u (cid:107) + (cid:107) v (cid:107) .4.1. BCH Polynomials.Definition 4.1.
For $s \geq 0$, a function $P : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$ that is a monomial (polynomial) in the variables $x_{n,m}, y_{n,m}$ is a graded-homogeneous monomial (polynomial) of degree $s$ if
$$P(\delta_t(x), \delta_t(y)) = t^s P(x, y)$$
for all $x, y \in \mathfrak{g}$ and $t \in \mathbb{R}_{>0}$. Clearly, any graded-homogeneous polynomial of degree $s$ must be a sum of graded-homogeneous monomials of degree $s$.

In this section, a multiset is a finite sequence of positive integers modulo permutations. Disjoint unions $I_1 \sqcup I_2$ of multisets are defined in the obvious way. Given a multiset $I$, $\|I\|_1$ denotes the sum of the elements and $\|I\|_\infty$ the maximum of the elements. Given a nonzero graded-homogeneous monomial $M$ of degree $s$, we associate to it a multiset $I(M)$ defined recursively on the number of variables in the monomial by $I(M) = \{i\} \sqcup I(M')$ if $M(x, y) = x_{i,n} M'(x, y)$ or $M(x, y) = y_{i,n} M'(x, y)$ for some $n \leq k_i$ and graded-homogeneous monomial $M'$ of degree $s - i$ (the base case is $I(1) = \emptyset$). By the homogeneity property, it must hold that if $M$ is nonzero and graded-homogeneous of degree $s$, then $\|I(M)\|_1 = s$.

For $s \geq 1$, let $\mathbf{1}_s$ denote the unique multiset with $\|\mathbf{1}_s\|_1 = s$ and $\|\mathbf{1}_s\|_\infty = 1$ (and $\mathbf{1}_0 = \emptyset$). For each $n, m \leq k_1$, let $\tau_{n,m}(x, y) := x_{1,n} y_{1,m} - x_{1,m} y_{1,n}$. A graded-homogeneous polynomial $P$ of degree $s \geq 2$ is of $\tau$-type if $P(x, y) = \tau_{n,m}(x, y) M'(x, y)$ for some $n, m \leq k_1$ and graded-homogeneous monomial $M'$ with $I(M') = \mathbf{1}_{s-2}$.

A graded-homogeneous polynomial of degree $s \geq 2$ of the form $\sum_j Q_j$ (the sum is finite), where each $Q_j$ is of $\tau$-type or a graded-homogeneous monomial of degree $s$ with $1 < \|I(Q_j)\|_\infty < s$, is called a BCH polynomial of degree $s$.

Remark 4.2. Obviously a sum of BCH polynomials of degree $s$ is another such polynomial. If $P$ is a BCH polynomial of degree $s$, $1 \leq i \leq r$, and $1 \leq j \leq k_i$, then $x_{i,j} P(x, y)$ and $y_{i,j} P(x, y)$ are BCH polynomials of degree $s + i$. If $P(x, y)$ is a BCH polynomial of degree $s$, then so is $P(x, \delta_t(y))$ for any $t \in \mathbb{R}_{>0}$.

Example 4.3.
Let M ( x, y ) = 6 x , x , y , , P ( x, y ) = − y , ( x , y , − x , y , ), Q ( x, y ) = x , y , ,and R ( x, y ) = y , . M is a graded-homogeneous monomial of degree 7 with I ( M ) = { , , , } , P is a graded homogeneous polynomial of degree 3 of τ -type, Q is a graded-homogeneous monomial ofdegree 2 with I ( Q ) = { , } , and R is a graded homogeneous monomial of degree 3 with I ( r ) = { } . M and P are BCH polynomials, but Q and R are not because they are monomials with (cid:107) I ( Q ) (cid:107) ∞ = 1and (cid:107) I ( R ) (cid:107) ∞ = (cid:107) I ( R ) (cid:107) .We now arrive at a key structural lemma for the group product on graded nilpotent Lie algebras.The rest of this subsection is dedicated to its proof. emma 4.4. For all x, y ∈ g and ≤ s ≤ r , (1) ( y − x ) = x − y (2) ( y − x ) s = x s − y s + (cid:80) k s j =1 P s,j ( x, y ) U s,j where each P s,j is a BCH polynomial of degree s . A trusting reader familiar with the group structure of graded nilpotent Lie algebras induced bythe Baker-Campbell-Hausdorff formula may safely skip the rest of this subsection. Before provingthe lemma, we need to set some useful notation that allows us to work with nested Lie brackets,and then prove a lemma about these brackets.
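As a concrete illustration of Lemma 4.4, consider the simplest nonabelian case, the Heisenberg algebra (step 2, with $k_1 = 2$ and $k_2 = 1$). The sketch below is only a sanity check under stated assumptions: it assumes exponential coordinates in which the Baker-Campbell-Hausdorff product is $x * y = x + y + \frac{1}{2}[x, y]$ with the normalization $[U_{1,1}, U_{1,2}] = U_{2,1}$, and the function names (`bch_product`, `inverse`, `dilate`, `tau`, `second_layer_remainder`, `check`) are ours and purely illustrative. Under these assumptions it checks numerically that $(y^{-1}x)_2 - (x_2 - y_2) = \frac{1}{2}\tau_{1,2}(x, y)$, a polynomial of $\tau$-type, and that this polynomial is graded-homogeneous of degree 2.

```python
import random


def bch_product(x, y):
    """Step-2 BCH product x * y = x + y + [x, y] / 2 (assumed normalization).

    Elements are triples (a, b, c): (a, b) are the first-layer coordinates,
    c is the second-layer coordinate, and we assume [U_11, U_12] = U_21.
    """
    a1, b1, c1 = x
    a2, b2, c2 = y
    return (a1 + a2, b1 + b2, c1 + c2 + 0.5 * (a1 * b2 - b1 * a2))


def inverse(x):
    """Group inverse; in exponential coordinates this is just -x."""
    return (-x[0], -x[1], -x[2])


def dilate(t, x):
    """Dilation delta_t: scale the i-th layer by t**i."""
    return (t * x[0], t * x[1], t * t * x[2])


def tau(x, y):
    """tau_{1,2}(x, y) = x_{1,1} y_{1,2} - x_{1,2} y_{1,1}."""
    return x[0] * y[1] - x[1] * y[0]


def second_layer_remainder(x, y):
    """(y^{-1} x)_2 - (x_2 - y_2), the candidate BCH polynomial P_{2,1}(x, y)."""
    z = bch_product(inverse(y), x)
    return z[2] - (x[2] - y[2])


def check(trials=1000, seed=0):
    """Compare the remainder with tau / 2 and test degree-2 homogeneity."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
        y = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
        t = rng.uniform(0.1, 3.0)
        p = second_layer_remainder(x, y)
        if abs(p - 0.5 * tau(x, y)) > 1e-9:
            return False
        scaled = second_layer_remainder(dilate(t, x), dilate(t, y))
        if abs(scaled - t * t * p) > 1e-9:
            return False
    return True


if __name__ == "__main__":
    print(check())
```

In higher step the remainder also involves monomials with $1 < \|I(\cdot)\|_\infty < s$, which this step-2 sketch cannot exhibit; it is meant only to make the shape of the statement tangible.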
Definition 4.5.
Given $x, y \in \mathfrak{g}$, $i \geq 1$, and $\epsilon \in \{1, 2\}^i$, we recursively define $(x, y)^\epsilon$ as follows: for $i = 1$, $(x, y)^\epsilon := x$ if $\epsilon = 1$ and $(x, y)^\epsilon := y$ if $\epsilon = 2$. Assume $(x, y)^\epsilon$ has been defined for all $\epsilon \in \{1, 2\}^i$ for some $i \geq 1$. Let $\epsilon \in \{1, 2\}^{i+1}$. Then $\epsilon$ equals $(1, \epsilon')$ or $(2, \epsilon')$ for some $\epsilon' \in \{1, 2\}^i$. We define $(x, y)^\epsilon := [x, (x, y)^{\epsilon'}]$ if $\epsilon = (1, \epsilon')$ and $(x, y)^\epsilon := [y, (x, y)^{\epsilon'}]$ if $\epsilon = (2, \epsilon')$.

Example 4.6. $(x, y)^{(1,2,2,1)} = [x, [y, [y, x]]]$. The 1 or 2 in the superscript should be thought of as indicating the first or second component of $(x, y)$ in the nested Lie bracket.

Lemma 4.7.
For all x, y ∈ g , ≤ i , i ≤ r , and (cid:15) ∈ { , } i , (( x, y ) (cid:15) ) i = k i (cid:88) j =1 Q i ,j ( x, y ) U i ,j where each Q i ,j is a BCH polynomial of degree i if i ≤ i i > i .Proof. Let x, y ∈ g . By the grading property, (( x, y ) (cid:15) ) i = 0 if (cid:15) ∈ { , } i and i > i . We’ll provethe remaining case by induction on i .Proof of base case. The base case is i = 2. Let (cid:15) ∈ { , } . Then (cid:15) equals (1 , , , , x, y ) (1 , = ( x, y ) (2 , = 0 and ( x, y ) (2 , = − ( x, y ) (1 , , it suffices to only consider (cid:15) = (1 , x, y ) (cid:15) = [ x, y ]. Let i ≥
2. We treat the two cases i = 2 and i >
2. Firstassume i = 2. Then we have[ x, y ] = [ x , y ] = k (cid:88) j =1 x ,j U ,j , k (cid:88) j (cid:48) =1 y ,j (cid:48) U ,j (cid:48) = k (cid:88) j =1 k (cid:88) j (cid:48) =1 x ,j y ,j (cid:48) (cid:2) U ,j , U ,j (cid:48) (cid:3) = 12 k (cid:88) n,m =1 x ,n y ,m [ U ,n , U ,m ] + k (cid:88) n,m =1 x ,m y ,n [ U ,m , U ,n ] = 12 k (cid:88) n,m =1 ( x ,n y ,m − x ,m y ,n ) [ U ,n , U ,m ] = 12 k (cid:88) n,m =1 τ n,m ( x, y ) [ U ,n , U ,m ]= 12 k (cid:88) n,m =1 τ n,m ( x, y ) k (cid:88) j =1 c j,n,m U ,j = k (cid:88) j =1 k (cid:88) n,m =1 c j,n,m τ n,m ( x, y ) U ,j for some c j,n,m ∈ R . The inner sum is a sum of polynomials of degree 2 of τ -type, and thus a BCHpolynomial of degree i .Now we consider the case i > x, y ] i = i − (cid:88) n =1 [ x n , y i − n ] = i − (cid:88) n =1 k n (cid:88) j =1 x n,j U n,j , k i − n (cid:88) j (cid:48) =1 y i − n,j (cid:48) U i − n,j (cid:48) i − (cid:88) n =1 k n (cid:88) j =1 k i − n (cid:88) j (cid:48) =1 x n,j y i − n,j (cid:48) (cid:2) U n,j , U i − n,j (cid:48) (cid:3) = i − (cid:88) n =1 k n (cid:88) j =1 k i − n (cid:88) j (cid:48) =1 x n,j y i − n,j (cid:48) k i (cid:88) m =1 c m,n,j,j (cid:48) U i ,m = k i (cid:88) m =1 i − (cid:88) n =1 k n (cid:88) j =1 k i − n (cid:88) j (cid:48) =1 c m,n,j,j (cid:48) x n,j y i − n,j (cid:48) U i ,m for some c m,n,j,j (cid:48) ∈ R . Notice that, for each n, j, j (cid:48) , I ( x n,j y i − n,j (cid:48) ) = { n, i − n } , and so since i > ≤ n ≤ i −
1, 1 < (cid:107) I ( x n,j y i − n,j (cid:48) ) (cid:107) ∞ < i , and thus x n,j , y i − n,j (cid:48) is a BCH polynomial ofdegree i . This completes the proof of the base case.Proof of inductive step. Now assume the lemma holds for some 2 ≤ i < r . Let (cid:15) ∈ { , } i +1 .Then (cid:15) equals (1 , (cid:15) (cid:48) ) or (2 , (cid:15) (cid:48) ) for some (cid:15) (cid:48) ∈ { , } i . Without loss of generality, assume (cid:15) = (1 , (cid:15) (cid:48) ).Let i ≥ i + 1. Then (( x, y ) (cid:15) ) i = [ x, ( x, y ) (cid:15) (cid:48) ] i = i − (cid:88) n =1 [ x n , (( x, y ) (cid:15) (cid:48) ) i − n ] ind hyp = i − (cid:88) n =1 k n (cid:88) j =1 x n,j U n,j , k i − n (cid:88) j (cid:48) =1 P i − n,j (cid:48) ( x, y ) U i − n,j (cid:48) = i − (cid:88) n =1 k n (cid:88) j =1 k i − n (cid:88) j (cid:48) =1 x n,j P i − n,j (cid:48) ( x, y ) (cid:2) U n,j , U i − n,j (cid:48) (cid:3) = i − (cid:88) n =1 k n (cid:88) j =1 k i − n (cid:88) j (cid:48) =1 x n,j P i − n,j (cid:48) ( x, y ) k i (cid:88) m =1 c m,n,j,j (cid:48) U i ,m = k i (cid:88) m =1 i (cid:88) n =1 k n (cid:88) j =1 k i − n (cid:88) j (cid:48) =1 c m,n,j,j (cid:48) x n,j P i − n,j (cid:48) ( x, y ) U i ,m for some c m,n,j,j (cid:48) ∈ R and BCH polynomials P i − n,j (cid:48) ,(cid:96) of degree i − n . This implies x n,j P i − n,j (cid:48) ( x, y )is a BCH polynomial of degree i , as desired. (cid:3) Proof of Lemma 4.4.
The Baker-Campbell-Hausdorff formula, (3.1), implies that there are constants (many can be taken to be 0) $\{\alpha_\epsilon\}_{\epsilon \in \cup_{i=2}^r \{1,2\}^i} \subseteq \mathbb{R}$ such that
$$y^{-1}x = x - y + \sum_{i=2}^r \sum_{\epsilon \in \{1,2\}^i} \alpha_\epsilon\, (x, y)^\epsilon.$$
Since, for each $1 \leq s \leq r$,
$$(y^{-1}x)_s = x_s - y_s + \sum_{i=2}^r \sum_{\epsilon \in \{1,2\}^i} \alpha_\epsilon\, \big((x, y)^\epsilon\big)_s,$$
the desired conclusion follows by appealing to Lemma 4.7. $\square$

4.2. Convex Metrics.
The goal of this subsection is to prove Theorem 4.19. To do so, we constructa left invariant homogeneous quasi-metric on g that satisfies a certain 4-point inequality. This isthe content of Lemma 4.18. All the lemmas and definitions preceding Lemma 4.18 exist to proveit. We next define a graded-homogeneous polynomial of degree 2 s that dominates the square ofany BCH polynomial of degree s , Lemma 4.9. As a consequence of this we get two dominationinequalities involving norms of group products, Lemmas 4.11 and 4.12. These types of dominationare what will ultimately allow us to prove Lemma 4.16, the key lemma used in the proof of Lemma4.18. Definition 4.8.
Let
$$\tau(x, y) := \sqrt{\sum_{n,m \leq k_1} \tau_{n,m}(x, y)^2}$$
so that $\tau(x, y)^2 \geq \tau_{n,m}(x, y)^2$ for every $n$ and $m$. For each $2 \leq s \leq r$, define $D_s : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}_{\geq 0}$ recursively by
$$D_2(x, y) := \tau(x, y)^2 + \|(x_2, y_2)\|^2,$$
$$D_{s+1}(x, y) := \|(x_{s+1}, y_{s+1})\|^2 + \sum_{s'=1}^{\lfloor (s+1)/2 \rfloor} \|(x_{s'}, y_{s'})\|^2\, D_{s+1-s'}(x, y).$$

Lemma 4.9.
For any ≤ s ≤ r and BCH polynomial P of degree s , there exists < c ≤ suchthat for all x, y ∈ g , D s ( x, y ) − (cid:107) ( x s , y s ) (cid:107) ≥ cP ( x, y ) Proof.
The proof is by induction on s . The base case s = 2 is clear from the definition of D andBCH polynomial of degree 2. Assume the inequality holds for all s ≤ s for some s < r . Let P be a BCH polynomial of degree s + 1. By definition of BCH polynomial, it suffices to prove theinequality assuming P is a monomial with 1 < (cid:107) I ( P ) (cid:107) ∞ < s + 1 or P is of τ -type. First assume P is a monomial with 1 < (cid:107) I ( P ) (cid:107) ∞ < s + 1. There are two subcases to consider: 1 ∈ I ( P ) and1 / ∈ I ( P ). Assume the first subcase holds. Then P = x ,n M ( x, y ) or P = y ,n M ( x, y ) for some n ≤ k and monomial M of degree s with 1 < (cid:107) I ( M ) (cid:107) ∞ < s + 1. Then D s +1 ( x, y ) − (cid:107) ( x s +1 , y s +1 ) (cid:107) = (cid:98) ( s +1) / (cid:99) (cid:88) s (cid:48) =1 (cid:107) ( x s (cid:48) , y s (cid:48) ) (cid:107) D s +1 − s (cid:48) ( x, y ) ≥ (cid:107) ( x , y ) (cid:107) D s ( x, y ) ind hyp ≥ c (cid:107) ( x , y ) (cid:107) M ( x, y ) ≥ cP ( x, y )Now assume the second subcase holds. Then P ( x, y ) = x i,j M ( x, y ) or P ( x, y ) = y i,j M ( x, y ) forsome 1 < i ≤ (cid:98) ( s + 1) / (cid:99) , j ≤ k i , and monomial M of degree s + 1 − i with 1 < (cid:107) I ( M ) (cid:107) ∞ < s + 1.Then D s +1 ( x, y ) − (cid:107) ( x s +1 , y s +1 ) (cid:107) = (cid:98) ( s +1) / (cid:99) (cid:88) s (cid:48) =1 (cid:107) ( x s (cid:48) , y s (cid:48) ) (cid:107) D s +1 − s (cid:48) ( x, y ) ≥ (cid:107) ( x i , y i ) (cid:107) D s +1 − i ( x, y ) ind hyp ≥ c (cid:107) ( x i , y i ) (cid:107) M ( x, y ) ≥ cP ( x, y )Now assume P is of τ -type. By definition, since P has degree s + 1, this means P ( x, y ) = τ n,m ( x, y ) M (cid:48) ( x, y ) for some n, m ≤ k and graded-homogeneous monomial M (cid:48) with I ( M (cid:48) ) = s − . his implies P ( x, y ) = x ,(cid:96) P (cid:48) ( x, y ) or P ( x, y ) = y ,(cid:96) P (cid:48) ( x, y ) for some (cid:96) ≤ k , and degree s polyno-mial P (cid:48) of τ -type. 
Then D s +1 ( x, y ) − (cid:107) ( x s +1 , y s +1 ) (cid:107) = (cid:98) ( s +1) / (cid:99) (cid:88) s (cid:48) =1 (cid:107) ( x s (cid:48) , y s (cid:48) ) (cid:107) D s +1 − s (cid:48) ( x, y ) ≥ (cid:107) ( x , y ) (cid:107) D s ( x, y ) ind hyp ≥ c (cid:107) ( x , y ) (cid:107) ( P (cid:48) ) ( x, y ) ≥ cP ( x, y ) (cid:3) Lemma 4.10.
Let $2 \leq s \leq r$. For any $t > 0$, there is a constant $c > 0$ such that for all $x, y \in \mathfrak{g}$,
$$D_s(x, y) - \|(x_s, y_s)\|^2 \geq c\, \|(\delta_t(y)^{-1}x)_s - (x_s - t^s y_s)\|^2.$$

Proof.
Let $t > 0$. By Lemma 4.4,
$$\|(\delta_t(y)^{-1}x)_s - (x_s - t^s y_s)\|^2 = \sum_j |P_{s,j}(x, \delta_t(y))|^2 = \sum_j |P'_{s,j,t}(x, y)|^2$$
where each $P_{s,j}$ is a BCH polynomial of degree $s$, and by Remark 4.2, each $P'_{s,j,t}$ is a BCH polynomial of degree $s$. Then the desired inequality follows from Lemma 4.9. $\square$

Lemma 4.11.
Let $2 \leq s \leq r$ and $c > 0$. For all sufficiently small $\lambda > 0$ (depending on $c$), for all $x, y \in \mathfrak{g}$,
$$c\left(D_s(x, y) - \|(x_s, y_s)\|^2\right) + \lambda \|(y^{-1}x)_s\|^2 \geq \frac{\lambda}{2} \|x_s - y_s\|^2.$$

Proof.
Let $\lambda > 0$. By Lemma 4.10, there is a constant $c' > 0$ (independent of $x, y$) such that
$$c\left(D_s(x, y) - \|(x_s, y_s)\|^2\right) + \lambda \|(y^{-1}x)_s\|^2 \geq c' \|(y^{-1}x)_s - (x_s - y_s)\|^2 + \lambda \|(y^{-1}x)_s\|^2 =: (*)$$
Thus, if $\lambda \leq c'$,
$$(*) \geq \lambda \|(y^{-1}x)_s - (x_s - y_s)\|^2 + \lambda \|(y^{-1}x)_s\|^2 \geq \frac{\lambda}{2} \|x_s - y_s\|^2$$
where the last inequality follows from the parallelogram law. $\square$

Lemma 4.12.
Let ≤ s ≤ r . There is a constant c > such that for all x, y ∈ g , D s ( x, y ) ≥ c (cid:107) ( δ / ( y ) − x ) s (cid:107) Proof.
By Lemma 4.10, it suffices to show (cid:107) ( δ / ( y ) − x ) s − ( x s − − s y s ) (cid:107) + (cid:107) ( x s , y s ) (cid:107) ≥ (cid:107) ( δ / ( y ) − x ) s (cid:107) Since (cid:107) x s − − s y s (cid:107) ≤ (cid:107) ( x s , y s ) (cid:107) it suffices to show (cid:107) ( δ / ( y ) − x ) s − ( x s − − s y s ) (cid:107) + 14 (cid:107) x s − − s y s (cid:107) ≥ (cid:107) ( δ / ( y ) − x ) s (cid:107) This inequality is true by the parallelogram law. (cid:3)
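The parallelogram-law step invoked in the two proofs above rests on the elementary Hilbert-space identity $\|u\|^2 + \|u - v\|^2 = 2\|v/2\|^2 + 2\|u - v/2\|^2$, which in particular gives $\|u\|^2 + \|u - v\|^2 \geq \|v\|^2/2$. The following is a minimal numerical sketch of both facts on random vectors in $\mathbb{R}^n$; the helper names (`sq_norm`, `parallelogram_sides`, `check`) are hypothetical and not part of the article.

```python
import random


def sq_norm(u):
    """Squared Euclidean norm of a vector given as a list of floats."""
    return sum(t * t for t in u)


def parallelogram_sides(u, v):
    """Return both sides of ||u||^2 + ||u - v||^2 = 2||v/2||^2 + 2||u - v/2||^2."""
    lhs = sq_norm(u) + sq_norm([a - b for a, b in zip(u, v)])
    rhs = 2 * sq_norm([b / 2 for b in v]) + 2 * sq_norm(
        [a - b / 2 for a, b in zip(u, v)]
    )
    return lhs, rhs


def check(trials=1000, dim=5, seed=0):
    """Test the identity and the consequence lhs >= ||v||^2 / 2 on random input."""
    rng = random.Random(seed)
    for _ in range(trials):
        u = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        lhs, rhs = parallelogram_sides(u, v)
        if abs(lhs - rhs) > 1e-9:
            return False
        if lhs < sq_norm(v) / 2 - 1e-12:
            return False
    return True


if __name__ == "__main__":
    print(check())
```

Equality in the consequence holds exactly when $u = v/2$, which is why the term $\|u - v/2\|^2$ (here, $\|x_1 - y_1/2\|$ and its analogues) keeps appearing as the gain in the inequalities of this section.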
Lemma 4.13.
There is a constant $c > 0$ such that for all $x, y \in \mathfrak{g}$,
$$\|y_1\|\, \|x_1 - y_1/2\| \geq c\, \tau(x, y).$$

Proof. It suffices to show, for each fixed $n, m \leq k_1$, $\|y_1\|\, \|x_1 - y_1/2\| \geq |\tau_{n,m}(x, y)|$. By Cauchy-Schwarz,
$$\|y_1\|\, \|x_1 - y_1/2\| \geq \|(y_{1,m}, -y_{1,n})\|\, \|(x_{1,n}, x_{1,m}) - (y_{1,n}, y_{1,m})/2\|$$
$$\overset{\text{C-S}}{\geq} |y_{1,m}(x_{1,n} - y_{1,n}/2) - y_{1,n}(x_{1,m} - y_{1,m}/2)| = |x_{1,n} y_{1,m} - x_{1,m} y_{1,n}| = |\tau_{n,m}(x, y)|. \qquad \square$$

Definition 4.14.
For $2 \leq s \leq r$, define $SN_s : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$ by
$$SN_s(x, y) := \max\left\{\|x_1 - y_1/2\|,\ \|(x_2, y_2)\|^{1/2},\ \|(x_3, y_3)\|^{1/3},\ \ldots,\ \|(x_s, y_s)\|^{1/s}\right\}.$$

Remark 4.15. Using the maximum of the terms is not important here; it could be replaced by any $\ell_p$-sum or other such norm. If a different choice of norm were used, the rest of the section would proceed in exactly the same way, except with possibly different values of the constants (still independent of $x, y$).

Lemma 4.16.
For each ≤ s ≤ r , there exists a homogeneous quasi-norm N s and a constant c > such that for all x, y ∈ g ,(1) ( N s ( x ) s + N s ( y − x ) s ) / − ( N s ( y ) / s ≥ cSN ss ( x, y ) + cD s ( x, y ) + cN s ( δ / ( y ) − x ) s (2) N s ( y ) ≥ (cid:107) y (cid:107) for all s ≥ . Consequently, N s ( y ) + SN s ( x, y ) ≥ b (cid:107) ( x s (cid:48) , y s (cid:48) ) (cid:107) /s (cid:48) for some b > and all ≤ s (cid:48) ≤ s .(3) If N s ( y ) = 0 , y i = 0 for all ≤ i ≤ s . In particular, N r is a positive definite homogeneousquasi-norm.Proof. The proof is by induction on s . The functions N s we construct will clearly be homogeneousquasi-norms and satisfy (2) and (3), so we will only concern ourselves with proving (1).Proof of base case: The base case is s = 2. Throughout the proof of the base case, c (cid:48) , c (cid:48)(cid:48) , c (cid:48)(cid:48)(cid:48) denote (small) positive constants that depend on g but not on x, y . Each of the constants maydepend on the ones previously appearing, but of course this is compatible with the fact that theyare all independent of x, y . Define N ( x ) := (cid:112) (cid:107) x (cid:107) + λ (cid:107) x (cid:107) where λ > SN ( x, y ) = max( (cid:107) x − y / (cid:107) , (cid:107) ( x , y ) (cid:107) ) ≤(cid:107) x − y / (cid:107) + (cid:107) ( x , y ) (cid:107) and D ( x, y ) = τ ( x, y ) + (cid:107) ( x , y ) (cid:107) , we need to show( N ( x ) + N ( y − x ) ) / ≥ ( N ( y ) / + c (cid:107) x − y / (cid:107) + c τ ( x, y ) + c (cid:107) ( x , y ) (cid:107) + cN ( δ / ( y ) − x ) for some λ, c >
0. First let’s write out the definitions of some of the terms in the inequality. N ( x ) = (cid:107) x (cid:107) + λ (cid:107) x (cid:107) N ( y − x ) = (cid:107) x − y (cid:107) + λ (cid:107) ( y − x ) (cid:107) N ( y ) = (cid:107) y (cid:107) + λ (cid:107) y (cid:107) By convexity, parallelogram law, and Lemma 4.13,( (cid:107) x (cid:107) + (cid:107) x − y (cid:107) ) / ≥ (( (cid:107) x (cid:107) + (cid:107) x − y (cid:107) ) / = ( (cid:107) y / (cid:107) + (cid:107) x − y / (cid:107) ) = ( (cid:107) y (cid:107) / + 2 (cid:107) y / (cid:107) (cid:107) x − y / (cid:107) + (cid:107) x − y / (cid:107) . ≥ ( (cid:107) y (cid:107) / + c (cid:48) τ ( x, y ) + (cid:107) x − y / (cid:107) or some c (cid:48) >
0. Thus, it suffices to show that for sufficiently small λ, c > c (cid:48) τ ( x, y ) + λ (cid:107) x (cid:107) + λ (cid:107) ( y − x ) (cid:107) + 12 (cid:107) x − y / (cid:107) ≥ − λ (cid:107) y (cid:107) + c τ ( x, y ) + c (cid:107) ( x , y ) (cid:107) + cN ( δ / ( y ) − x ) By Lemma 4.11, the following inequality is true for sufficiently small λ > c (cid:48) τ ( x, y ) + λ (cid:107) ( y − x ) (cid:107) = c (cid:48) D ( x, y ) − (cid:107) ( x , y ) (cid:107)| ) + λ (cid:107) ( y − x ) (cid:107) . ≥ λ (cid:107) x − y (cid:107) Thus it suffices for the following inequality to hold for λ, c > c (cid:48) τ ( x, y ) + λ (cid:107) x (cid:107) + λ (cid:107) x − y (cid:107) + 12 (cid:107) x − y / (cid:107) ≥ − λ (cid:107) y (cid:107) + c (cid:107) ( x , y ) (cid:107) + cN ( δ / ( y ) − x ) We have λ (cid:107) x (cid:107) + λ (cid:107) x − y (cid:107) ≥ λ (cid:107) x (cid:107) + (cid:107) x − y (cid:107) )= λ (cid:107) x (cid:107) + (cid:107) x − y (cid:107) ) + λ (cid:107) x (cid:107) + (cid:107) x − y (cid:107) ) ≥ − λ (cid:107) y (cid:107) + c (cid:48)(cid:48) (cid:107) ( x , y ) (cid:107) Thus it remains to show c (cid:48) τ ( x, y ) + c (cid:48)(cid:48) (cid:107) ( x , y ) (cid:107) + 12 (cid:107) x − y / (cid:107) ≥ cN ( δ / ( y ) − x ) for c > c (cid:48) τ ( x, y ) + c (cid:48)(cid:48) (cid:107) ( x , y ) (cid:107) ≥ c (cid:48)(cid:48)(cid:48) ( τ ( x, y ) + (cid:107) ( x , y ) (cid:107) ) = c (cid:48)(cid:48)(cid:48) D s ( x, y ) Lem 4 . ≥ cλ (cid:107) ( δ / ( y ) − x ) (cid:107) c > cλ (cid:107) ( δ / ( y ) − x ) (cid:107) + 12 (cid:107) x − y / (cid:107) ≥ cN ( δ / ( y ) − x ) This is true by definition of N . 
This completes the proof of the base case.Proof of inductive step: Now assume the statement holds for all 2 ≤ s (cid:48) ≤ s some 2 ≤ s ≤ r − N s +1 by N s +1 ( x ) := s +1) (cid:118)(cid:117)(cid:117)(cid:116) λ (cid:107) x s +1 (cid:107) + s (cid:88) s (cid:48) = (cid:100) ( s +1) / (cid:101) N s +1) s (cid:48) ( x ) (4.1)where λ is a (small) positive constant (different λ than in the base case) to be chosen later (indepen-dent of x, y ). Throughout the remainder of the proof, c − c denote (small) positive constants thatdepend on g but not on x, y . Each of the constants may depend on the ones previously appearing,but of course this is compatible with the fact that they are all independent of x, y . The constant λ will end up depending on c (which in turn depends on c ), and the subsequent constants willdepend on λ .We now prove the inductive step. In what follows, we adopt some conventions to help makethe proof more readable. There are two types of equalities/inequalities we use relating each ofthe expressions below. The first type is simply using a lemma, definition, inductive hypothesis, orconvexity or trivial numerical inequality. Whenever an equality/inequality of this type is used, theparticular terms in the expression that change from one to the next are bolded . No other termschange, except for the bolded ones to which the particular lemma, definition, inductive hypothesis,or convexity or trivial numerical inequality apply. Apart from the trivial numerical inequalities,the name of the lemma or definition, “ind hyp”, or “convexity” decorates the equality/inequalitysymbol. The second type of equality/inequality used is always an equality and the equality symbol s decorated with the word “rearrange”. This means we use trivialities like commutivity of addi-tion or multiplication, reindexing of a sum, or no symbolic changes at all. 
Importantly, we alsouse equalities decorated with “rearrange” to change which terms are bolded in the expression, inpreparation for the use of another equality/inequality of the first type. ( N s +1 ( x ) s +1) + N s +1 ( y − x ) s +1) ) / (4.1) = (cid:16) λ (cid:107) x s +1 (cid:107) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) N s +1) s (cid:48) ( x ) + (cid:107) ( y − x ) s +1 (cid:107) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) N s +1) s (cid:48) ( y − x ) (cid:17) / rearrange = λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) ) + (cid:16)(cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) N s +1) s (cid:48) ( x ) + N s +1) s (cid:48) ( y − x ) (cid:17) / convexity ≥ λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) ) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) (cid:32) N s (cid:48) s (cid:48) ( x ) + N s (cid:48) s (cid:48) ( y − x )2 (cid:33) s +1 s (cid:48) ind hyp (1) ≥ λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) )+ (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) (( N s (cid:48) ( y ) / s (cid:48) + c SN s (cid:48) s (cid:48) ( x, y )+ c D s (cid:48) ( x, y ) + c N s (cid:48) ( δ / ( y ) − x ) s (cid:48) ) s +1 s (cid:48) Lem 3 . 
≥ λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) )+ (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) (( N s (cid:48) ( y ) / s (cid:48) + c SN s (cid:48) s (cid:48) ( x, y ) + c N s (cid:48) ( δ / ( y ) − x ) s (cid:48) ) s +1 s (cid:48) +(( N s (cid:48) ( y ) / s (cid:48) + c SN s (cid:48) s (cid:48) ( x, y )) s +1 − s (cid:48) s (cid:48) c D s (cid:48) ( x, y ) rearrange = λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) )+ (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) (( N s (cid:48) ( y ) / s (cid:48) + c SN s (cid:48) s (cid:48) ( x, y ) + c N s (cid:48) ( δ / ( y ) − x ) s (cid:48) ) s +1 s (cid:48) +(( N s (cid:48) ( y ) / s (cid:48) + c SN s (cid:48) s (cid:48) ( x, y )) s +1 − s (cid:48) s (cid:48) c D s (cid:48) ( x, y ) convexity ≥ λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) )+ (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) ( N s (cid:48) ( y ) / s +1) + c SN s +1) s (cid:48) ( x, y ) + c N s (cid:48) ( δ / ( y ) − x ) s +1) +(( N s (cid:48) ( y ) / s (cid:48) + c SN s (cid:48) s (cid:48) ( x, y )) s +1 − s (cid:48) s (cid:48) c D s (cid:48) ( x, y ) rearrange = λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) )+ (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) ( N s (cid:48) ( y ) / s +1) + c SN s +1) s (cid:48) ( x, y ) + c N s (cid:48) ( δ / ( y ) − x ) s +1) + (( N s (cid:48) ( y ) / s (cid:48) + c SN s (cid:48) s (cid:48) ( x, y )) s +1 − s (cid:48) s (cid:48) c D s (cid:48) ( x, y ) ≥ λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) )+ (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) ( N s (cid:48) ( y ) / s +1) + c SN s +1) s (cid:48) ( x, y ) + c N s (cid:48) ( δ / ( y ) − x ) s +1) + c (cid:107) ( x s +1 − s (cid:48) , y s +1 − s (cid:48) ) (cid:107) D s (cid:48) ( x, y ) rearrange = λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) )+ (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) ( N s (cid:48) ( y ) / s 
+1) + c SN s +1) s (cid:48) ( x, y ) + c N s (cid:48) ( δ / ( y ) − x ) s +1) + (cid:80) (cid:98) ( s +1) / (cid:99) s (cid:48) =1 c (cid:107) ( x s (cid:48) , y s (cid:48) ) (cid:107) D s +1 − s (cid:48) ( x, y ) ≥ λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) )+ c SN s +1) s ( x, y ) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) ( N s (cid:48) ( y ) / s +1) + c N s (cid:48) ( δ / ( y ) − x ) s +1) + (cid:80) (cid:98) ( s +1) / (cid:99) s (cid:48) =1 c (cid:107) ( x s (cid:48) , y s (cid:48) ) (cid:107) D s +1 − s (cid:48) ( x, y ) rearrange = λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) )+ c SN s +1) s ( x, y ) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) ( N s (cid:48) ( y ) / s +1) + c N s (cid:48) ( δ / ( y ) − x ) s +1) + (cid:80) (cid:98) ( s +1) / (cid:99) s (cid:48) =1 c (cid:107) ( x s (cid:48) , y s (cid:48) ) (cid:107) D s +1 − s (cid:48) ( x, y ) Def 4 . = λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) )+ c SN s +1) s ( x, y ) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) ( N s (cid:48) ( y ) / s +1) + c N s (cid:48) ( δ / ( y ) − x ) s +1) + c ( D s +1 ( x, y ) − (cid:107) ( x s +1 , y s +1 ) (cid:107) ) rearrange = c SN s +1) s ( x, y ) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) ( N s (cid:48) ( y ) / s +1) + c N s (cid:48) ( δ / ( y ) − x ) s +1) + c D s +1 ( x, y ) − (cid:107) ( x s +1 , y s +1 ) (cid:107) ) c D s +1 ( x, y ) − (cid:107) ( x s +1 , y s +1 ) (cid:107) ) + λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) ) =: ( ∗ )By Lemma 4.11, we can choose λ > c D s +1 ( x, y ) − (cid:107) ( x s +1 , y s +1 ) (cid:107) ) + λ (cid:107) x s +1 (cid:107) + (cid:107) ( y − x ) s +1 (cid:107) ) Lem 4 . 
≥ λ (cid:107) x s +1 (cid:107) + (cid:107) x s +1 − y s +1 (cid:107) )= λ (cid:107) x s +1 (cid:107) + (cid:107) x s +1 − y s +1 (cid:107) ) + λ (cid:107) x s +1 (cid:107) + (cid:107) x s +1 − y s +1 (cid:107) ) ≥ λ (cid:107) y s +1 (cid:107) + c (cid:107) ( x s +1 , y s +1 ) (cid:107) ≥ − ( s +1) λ (cid:107) y s +1 (cid:107) + c (cid:107) ( x s +1 , y s +1 ) (cid:107) And thus we get( ∗ ) ≥ c SN s +1) s ( x, y ) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) ( N s (cid:48) ( y ) / s +1) + c N s (cid:48) ( δ / ( y ) − x ) s +1) + c D s +1 ( x, y ) − (cid:107) ( x s +1 , y s +1 ) (cid:107) ) − ( s +1) λ (cid:107) y s +1 (cid:107) + c (cid:107) ( x s +1 , y s +1 ) (cid:107) = c SN s +1) s ( x, y ) + − ( s +1) λ (cid:107) y s +1 (cid:107) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) ( N s (cid:48) ( y ) / s +1) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) c N s (cid:48) ( δ / ( y ) − x ) s +1) + c D s +1 ( x, y ) − (cid:107) ( x s +1 , y s +1 ) (cid:107) )+ c (cid:107) ( x s +1 , y s +1 ) (cid:107) = c SN s +1) s ( x, y ) + ( N s +1 ( y ) / s +1) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) c N s (cid:48) ( δ / ( y ) − x ) s +1) + c D s +1 ( x, y ) − (cid:107) ( x s +1 , y s +1 ) (cid:107) )+ c (cid:107) ( x s +1 , y s +1 ) (cid:107) = c SN s +1) s ( x, y ) + c (cid:107) ( x s +1 , y s +1 ) (cid:107) + ( N s +1 ( y ) / s +1) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) c N s (cid:48) ( δ / ( y ) − x ) s +1) + c D s +1 ( x, y ) − (cid:107) ( x s +1 , y s +1 ) (cid:107) ) + c (cid:107) ( x s +1 , y s +1 ) (cid:107) . 
≥ c SN s +1) s +1 ( x, y ) + ( N s +1 ( y ) / s +1) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) c N s (cid:48) ( δ / ( y ) − x ) s +1) + c D s +1 ( x, y ) − (cid:107) ( x s +1 , y s +1 ) (cid:107) ) + c (cid:107) ( x s +1 , y s +1 ) (cid:107) = ( N s +1 ( y ) / s +1) + c SN s +1) s +1 ( x, y ) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) c N s (cid:48) ( δ / ( y ) − x ) s +1) + c D s +1 ( x, y ) − (cid:107) ( x s +1 , y s +1 ) (cid:107) ) + c (cid:107) ( x s +1 , y s +1 ) (cid:107) ≥ ( N s +1 ( y ) / s +1) + c SN s +1) s +1 ( x, y ) + (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) c N s (cid:48) ( δ / ( y ) − x ) s +1) + c D s +1 ( x, y ) rearrange = ( N s +1 ( y ) / s +1) + c SN s +1) s +1 ( x, y ) + c D s +1 ( x, y )+ (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) c N s (cid:48) ( δ / ( y ) − x ) s +1) + c D s +1 ( x, y ) Lem 4 . ≥ ( N s +1 ( y ) / s +1) + c SN s +1) s +1 ( x, y ) + c D s +1 ( x, y )+ (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) c N s (cid:48) ( δ / ( y ) − x ) s +1) + c (cid:107) ( δ / ( y ) − x ) s +1 (cid:107) rearrange = ( N s +1 ( y ) / s +1) + c SN s +1) s +1 ( x, y ) + c D s +1 ( x, y )+ (cid:80) ss (cid:48) = (cid:100) ( s +1) / (cid:101) c N s (cid:48) ( δ / ( y ) − x ) s +1) + c (cid:107) ( δ / ( y ) − x ) s +1 (cid:107) (4.1) ≥ ( N s +1 ( y ) / s +1) + c SN s +1) s +1 ( x, y ) + c D s +1 ( x, y )+ c N s +1 ( δ / ( y ) − x ) s +1) (cid:3) Lemma 4.17.
There exists a positive definite homogeneous quasi-norm $N_r$ on $\mathfrak{g}$ and a constant $c > 0$ (depending on $\mathfrak{g}$ but not on $x, y$) such that for all $p \ge r$ and all $x, y \in \mathfrak{g}$,
\[
\frac{N_r(x)^{2p} + N_r(y^{-1}x)^{2p}}{2} - \left(\frac{N_r(y)}{2}\right)^{2p} \ge c^{p/r}\, N_r(\delta_{1/2}(y)^{-1}x)^{2p}.
\]

Proof. Let $N_r, c$ be as in the conclusion of Lemma 4.16. Let $p \ge r$. Then by convexity and that lemma,
\begin{align*}
\frac{N_r(x)^{2p} + N_r(y^{-1}x)^{2p}}{2}
&\ge \left(\frac{N_r(x)^{2r} + N_r(y^{-1}x)^{2r}}{2}\right)^{p/r} \\
&\overset{\text{Lem 4.16}}{\ge} \left(\left(\frac{N_r(y)}{2}\right)^{2r} + c\, N_r(\delta_{1/2}(y)^{-1}x)^{2r}\right)^{p/r} \\
&\ge \left(\frac{N_r(y)}{2}\right)^{2p} + c^{p/r}\, N_r(\delta_{1/2}(y)^{-1}x)^{2p}.
\end{align*}
□

Lemma 4.18.
There exists a left invariant, homogeneous, positive definite quasi-metric $d_{N_r}$ on $\mathfrak{g}$ and a constant $c' > 0$ (depending on $\mathfrak{g}$ but not on $w, x, y, z$) such that for all $p \ge r$ and $w, x, y, z \in \mathfrak{g}$,
\[
\frac{2 d_{N_r}(y,x)^{2p} + d_{N_r}(y,w)^{2p} + d_{N_r}(y,z)^{2p}}{2} - \left(\frac{d_{N_r}(x,w)}{2}\right)^{2p} - \left(\frac{d_{N_r}(x,z)}{2}\right)^{2p} \ge c'\, d_{N_r}(w,z)^{2p}.
\]

Proof.
Let $N_r, c$ be as in the previous lemma. Let $d_{N_r}$ be the quasi-metric derived from $N_r$: $d_{N_r}(x,y) := N_r(y^{-1}x)$. By left invariance of the metric, we may assume $x = 0$. Then by applying the previous lemma to each of the pairs $(y,w)$ and $(y,z)$, we obtain
\begin{align*}
\frac{d_{N_r}(y,0)^{2p} + d_{N_r}(y,w)^{2p}}{2} - \left(\frac{d_{N_r}(0,w)}{2}\right)^{2p} &\ge c^{p/r}\, d_{N_r}(\delta_{1/2}(w), y)^{2p}, \\
\frac{d_{N_r}(y,0)^{2p} + d_{N_r}(y,z)^{2p}}{2} - \left(\frac{d_{N_r}(0,z)}{2}\right)^{2p} &\ge c^{p/r}\, d_{N_r}(\delta_{1/2}(z), y)^{2p}.
\end{align*}
Adding these and then using Hölder, the quasi-triangle inequality, and homogeneity gives
\begin{align*}
\frac{2 d_{N_r}(y,0)^{2p} + d_{N_r}(y,w)^{2p} + d_{N_r}(y,z)^{2p}}{2} &- \left(\frac{d_{N_r}(0,w)}{2}\right)^{2p} - \left(\frac{d_{N_r}(0,z)}{2}\right)^{2p} \\
&\ge c^{p/r}\left(d_{N_r}(\delta_{1/2}(w), y)^{2p} + d_{N_r}(\delta_{1/2}(z), y)^{2p}\right) \\
&\ge 2^{-2p+1} c^{p/r}\left(d_{N_r}(\delta_{1/2}(w), y) + d_{N_r}(\delta_{1/2}(z), y)\right)^{2p} \\
&\ge c'\, d_{N_r}(\delta_{1/2}(w), \delta_{1/2}(z))^{2p} = 2^{-2p} c'\, d_{N_r}(w,z)^{2p}
\end{align*}
for some $c' > 0$. □

Theorem 4.19.
Every graded nilpotent Lie group of step $r$, equipped with a left invariant metric homogeneous with respect to the dilations induced by the grading, is Markov $p$-convex for every $p \in [2r, \infty)$.

Proof. Markov $p$-convexity is invariant under biLipschitz equivalence. Thus, we need only show $(\mathfrak{g}, d_{N_r})$ is Markov $2p$-convex for all $p \ge r$, where $d_{N_r}$ is the quasi-metric from Lemma 4.18. The Markov convexity of $d_{N_r}$ follows from the 4-point inequality of Lemma 4.18 and the proof of Proposition 2.1 in [MN13]. □

5. Lower Bound on Markov Convexity of $J^{r-1}(\mathbb{R})$

The goal of this section is to prove Theorem 5.6, which occurs at the conclusion. The strategy is to construct a sequence of directed graphs (see Definition 5.1) with bad Markov convexity properties. These bad properties are manifested by the dispersive nature of random walks on the graphs. This is the content of Lemma 5.3. We then map these graphs into $J^{r-1}(\mathbb{R})$ with sufficient control over the distortion (Lemma 5.5) to prove Theorem 5.6.

5.1. Directed Graphs and Random Walks.
Let ( N m ) ∞ m =0 be any sequence of integers with N = 0 and N m +1 ≥ max(1 , N m + (cid:100) ( m + 1) (cid:101) ). We’ll define a sequence of directed graphs(Γ m ) ∞ m =0 . The graphs will be directed from unique source vertex to unique and sink vertex, whichwe will denote by 0 m and 1 m , respectively. Let diam(Γ m ) be the number of edges in a directededge path from 0 to 1, which is also equal to the diameter of Γ m with respect to the shortest pathmetric. The construction will be such that diam(Γ m ) = 2 N m . efinition 5.1. We’ll perform the construction and also prove that diam(Γ m ) = 2 N m by induction.Let Γ be the interval I , that is, a graph with two vertices 0 , m has been constructed for some m ≥
0. We define anintermediate graph Γ (cid:48) m +1 by gluing together a := 2 N m +1 −(cid:100) ( m +1) (cid:101)− copies of I , then A :=2 N m +1 − N m − N m +1 − N m −(cid:100) ( m +1) (cid:101) = 2 − N m (2 N m +1 − a ) = 2 N m +1 − N m (1 − −(cid:100) ( m +1) (cid:101) ) copiesof Γ m , then a more copies of I again together in series. The source vertex of this graph is thesource vertex of the first copy of I , and the sink vertex is the sink vertex of the last copy of I . Thediameter of this graph is a · diam( I ) + A · diam(Γ m ) + a · diam( I ) ind hyp = 2 a + 2 N m A = 2 N m +1 We then define Γ m +1 to be two copies of Γ (cid:48) m +1 , denoted +Γ (cid:48) m +1 and − Γ (cid:48) m +1 , glued together inparallel. Denote the common source vertex 0 m and sink vertex 1 m . The diameter of Γ m +1 is thesame as the diameter of Γ (cid:48) m +1 . We note that each copy of Γ m in Γ m +1 is isometrically embedded;any shortest path between two points in a copy of Γ m ⊆ Γ m +1 completely belongs to Γ m .By swapping +Γ (cid:48) m +1 and − Γ (cid:48) m +1 in Γ m +1 , we obtain a directed graph involution ι : Γ m +1 → Γ m +1 .For q , q ∈ Γ m , ( q , q ) is called a vertical pair if d m ( q , m ) = d m ( q , m ).For each m ≥
0, let ( X mt ) Nm t =0 be the standard directed random walk on Γ m . Let d m denote theshortest path metric on Γ m . With full probability, d ( X mt , m ) = t for 0 ≤ t ≤ N m .See the two right-hand graphs of Figure 2 for what Γ and Γ look like when N = 0, N = 2,and N = 4. The graphs are drawn in such a way that the direction is from left to right, +Γ (cid:48) m liesabove the x -axis, and − Γ (cid:48) m lies below the x -axis. The source vertices 0 m are both drawn at (0 , are both drawn at (1 , Lemma 5.2.
For all p > and m ≥ , N m (cid:88) k =0 2 Nm (cid:88) t =1 E [ d m ( X mt , ˜ X mt ( t − k )) p ]2 kp ≥ m N m Π m − i =1 (1 − ( i + 1) − ) Proof.
Let p ≥
1. The proof is by induction on m . The base case m = 0 is trivially true. Assumethe inequality holds for some m ≥
0. Now we consider the standard random walk X m +1 t on Γ m +1 .Consider k and t in the range a + 1 ≤ t ≤ N m +1 − a , 0 ≤ k ≤ N m , where a = 2 N m +1 −(cid:100) ( m +1) (cid:101)− .Then t − k ≥ N m +1 −(cid:100) ( m +1) (cid:101)− + 1 − N m ≥
1, so X m +11 and ˜ X m +11 ( t − k ) agree. Thenfor all subsequent times, with full probability, X m +1 t and ˜ X m +1 t ( t − k ) belong to the same copyof Γ (cid:48) m +1 in Γ m +1 . Then, after recalling the construction of Γ (cid:48) m +1 as a number of copies of Γ m and I glued together, it can be seen that for the range of t in interest, X m +1 t and ˜ X m +1 t ( t − k )are standard random walks across A = 2 N m +1 − N m (1 − −(cid:100) ( m +1) (cid:101) ) consecutive copies of Γ m ,which we denote as A · X m +1 t and A · ˜ X m +1 t ( t − k ). Thus, under our assumptions on k and t , d m +1 ( X m +1 t , ˜ X m +1 t ( t − k )) has the same distribution as d m ( A · X mt , A · ˜ X mt ( t − k )). Hence weobtain by the inductive hypothesis N m (cid:88) k =0 2 Nm +1 − a (cid:88) t = a +1 E [ d m +1 ( X m +1 t , ˜ X m +1 t ( t − k )) p ]2 kp = N m (cid:88) k =0 2 Nm +1 − a (cid:88) t = a +1 E [ d m ( A · X mt , A · ˜ X mt ( t − k )) p ]2 kp = N m (cid:88) k =0 A (cid:88) T =1 a + T Nm (cid:88) t = a +( T − Nm +1 E [ d m ( A · X mt , A · ˜ X mt ( t − k )) p ]2 kp N m (cid:88) k =0 A (cid:88) T =1 2 Nm (cid:88) t =1 E [ d m ( X mt , ˜ X mt ( t − k )) p ]2 kp ind hyp ≥ A (cid:88) T =1 m N m Π m − i =1 (1 − ( i + 1) − )= 2 N m A m m − i =1 (1 − ( i + 1) − ) = 2 N m +1 (1 − −(cid:100) ( m +1) (cid:101) ) m m − i =1 (1 − ( i + 1) − ) ≥ N m +1 (1 − ( m + 1) − ) m m − i =1 (1 − ( i + 1) − ) = m N m +1 Π mi =1 (1 − ( i + 1) − )In summary, N m (cid:88) k =0 2 Nm +1 − a (cid:88) t = a +1 E [ d m +1 ( X m +1 t , ˜ X m +1 t ( t − k )) p ]2 kp ≥ m N m +1 Π mi =1 (1 − ( i + 1) − ) (5.1)Now consider k and t in the range 0 ≤ k ≤ N m +1 −
1, 1 ≤ t ≤ k , so that t − k ≤
0. Notethat this means this range is disjoint from the one previously considered. Since t − k ≤
0, therandom walks X m +1 and ˜ X m +1 ( t − k ) evolved independently immediately. Thus, with probability1/2, X m +1 and ˜ X m +1 ( t − k ) belong to different copies of Γ (cid:48) m +1 in Γ m +1 . This implies that, withprobability 1/2, d m +1 ( X m +1 t , ˜ X m +1 t ( t − k )) = 2 t . Thus, N m +1 − (cid:88) k =0 2 k (cid:88) t =1 E [ d m +1 ( X m +1 t , ˜ X m +1 t ( t − k )) p ]2 kp ≥ N m +1 − (cid:88) k =0 2 k (cid:88) t =1 (2 t ) p kp +1Lem 3 . > N m +1 − (cid:88) k =0 k ( p +1) kp +2 = N m +1 − (cid:88) k =0 k − = 2 N m +1 − − ≥
18 2 N m +1 In summary, N m +1 − (cid:88) k =0 2 k (cid:88) t =1 E [ d m +1 ( X m +1 t , ˜ X m +1 t ( t − k )) p ]2 kp >
18 2 N m +1 (5.2)Again, notice that in (5.1) and (5.2), the range of t , k we consider are disjoint from each otherand are subsets of the range 0 ≤ k ≤ N m +1 , 1 ≤ t ≤ N m +1 . Thus, by adding (5.1) and (5.2), weobtain N m +1 (cid:88) k =0 2 Nm +1 (cid:88) t =1 E [ d m ( X mt , ˜ X mt ( t − k )) p ]2 kp > m N m +1 Π mi =1 (1 − ( i + 1) − ) + 18 2 N m +1 > (cid:18) m + 18 (cid:19) N m +1 Π mi =1 (1 − ( i + 1) − )completing the inductive step. (cid:3) Lemma 5.3. ∞ (cid:88) k =0 2 Nm (cid:88) t =1 E [ d m ( X mt , ˜ X mt ( t − k )) p ]2 kp (cid:38) m N m for all p > .Proof. This follows from Lemma 5.2 and the fact that Π m − i =1 (1 − ( i + 1) − ) > Π ∞ i =1 (1 − ( i + 1) − ) > m ≥ (cid:3) .2. Mapping the Graphs into J r − ( R ) .Lemma 5.4. There exists φ ∈ C r − , ([0 , such that(1) φ is symmetric across the line x = , that is, φ ( x ) = φ (1 − x ) for all x ∈ [0 , ] .(2) φ ( x ) ≥ (2 x ) r for all x ∈ [0 , ] .(3) [ j r − (0)]( φ ) = (0 , , and thus by (1), [ j r − (1)]( φ ) = (1 , .(4) For every integer ≤ i < r and every x ∈ [ i − r , ( i + 1)2 − r ) , φ ( r ) ( x ) = φ ( r ) ( i − r ) (so φ ( r ) is constant on intervals of this form).Since φ ∈ C r − , ([0 , , φ ( r ) ∈ L ∞ ([0 , . We also remark here that whenever dealing with L ∞ functions, we choose representatives that are everywhere (not just almost everywhere) bounded bytheir norm.Proof. The proof is by induction on r . For the base case r = 1, define φ ( x ) := (cid:26) x x ∈ [0 , ]2 − x x ∈ [ , φ satisfies (1) - (4).Now suppose such a function φ exists for some r ≥
1. We'll construct a function $\psi$ that satisfies (1)-(4) for $r+1$. Define $\bar\phi \in C^{r-1,1}([0,1])$ by
\[
\bar\phi(x) := \begin{cases} \phi(2x) & x \in [0, \tfrac12] \\ -\phi(2-2x) & x \in [\tfrac12, 1] \end{cases}
\]
and $\Phi \in C^{r,1}([0,1])$ by
\[
\Phi(x) := \int_0^x \bar\phi(\xi)\, d\xi.
\]
$\Phi$ satisfies (1), (3), and (4) by the inductive hypothesis. Note that the inductive hypothesis applied to (2) implies $\bar\phi(x) \ge 2^r (2x)^r$ for every $x \in [0, \tfrac14]$, and hence
\[
\Phi(x) \ge \frac{2^{r-1}}{r+1}(2x)^{r+1} \ge \frac12 (2x)^{r+1}.
\]
Also, since $\phi \ge 0$ (which follows from the inductive hypothesis applied to (1) and (2)),
\[
\Phi(x) \ge \Phi\left(\tfrac14\right) \ge \left(\tfrac12\right)^{r+2}
\]
for all $x \in [\tfrac14, \tfrac12]$. Together, these two inequalities imply
\[
\psi(x) := 2^{r+2}\,\Phi(x) \ge (2x)^{r+1}
\]
for all $x \in [0, \tfrac12]$. Thus, $\psi$ satisfies (1)-(4), completing the inductive step. □

See Figure 1 for graphs of $\phi$ and its first two derivatives when $r = 3$. Note that these graphs are not on the same scale.

Lemma 5.5.
Let φ be the function from Lemma 5.4. Set N = 0 , and for m ≥ , set N m := (cid:100) Cm log ( m + 1) (cid:101) , where C is a sufficiently large constant to be chosen later, so that N m ≥ r and N m +1 ≥ max(1 , N m + (cid:100) ( m + 1) (cid:101) ) . Then there exists a sequence of maps F m : Γ m → J r − ( R ) such that, for all m ≥ and all directed paths γ from m to m in Γ m , there is a function φ γ ∈ C r − , ([0 , N m ]) such that(1) [ j r − (0)]( φ γ ) = (0 , and [ j r − (2 N m )]( φ γ ) = (2 N m , .(2) After isometrically identifying γ with [0 , N m ] via q (cid:55)→ d m ( q, m ) , F m restricted to γ equalsthe jet of φ γ ; F m ( t ) = [ j r − ( t )]( φ γ ) . ϕ x ϕ ′ x ϕ ′′ Figure 1.
Graphs of the function φ from Lemma 5.4 and its first two derivativeswhen r = 3. Note that these are not shown to the same scale. (3) For all vertical pairs ( q , q ) ∈ Γ m × Γ m , √ m ln( m + 1) | π ( F m ( q )) − F m ( q )) | ≥ d m ( q , q ) r (4) Let γ ( X m ) denote the directed path followed by the random walk X m (so γ ( X m ) is itself apath-valued random variable). For all y ∈ R , and ≤ t < N m , E (cid:34) exp (cid:32) y (cid:32) sup [ t,t +1] φ ( r ) γ ( X m ) (cid:33)(cid:33)(cid:35) ≤ exp (cid:32) y (cid:13)(cid:13)(cid:13) φ ( r ) (cid:13)(cid:13)(cid:13) ∞ m (cid:88) n =1 n ln( n + 1) (cid:33) and E (cid:20) exp (cid:18) y (cid:18) inf [ t,t +1] φ ( r ) γ ( X m ) (cid:19)(cid:19)(cid:21) ≤ exp (cid:32) y (cid:13)(cid:13)(cid:13) φ ( r ) (cid:13)(cid:13)(cid:13) ∞ m (cid:88) n =1 n ln( n + 1) (cid:33) and thus there exists a constant B < ∞ (not depending on y , t , or m ) such that E (cid:20) exp (cid:18) y (cid:13)(cid:13)(cid:13) φ ( r ) γ ( X m ) (cid:13)(cid:13)(cid:13) L ∞ [ t,t +1] (cid:19)(cid:21) ≤ e By (5) (cid:107) φ ( r ) γ (cid:107) ∞ ≤ √ m (cid:107) φ ( r ) (cid:107) ∞ .(6) (cid:107) π ◦ F m (cid:107) ∞ ≤ r ( m + 1) Crm +1 (cid:107) φ (cid:107) ∞ .Proof. The proof is by induction on m . The base case m = 0 is easy, we simply define F to be thejet of the 0 function on Γ = I . Then (1) - (6) hold. Assume such a sequence of maps F , . . . F m exist for some m ≥
0. Set K := (cid:107) π ◦ F m (cid:107) ∞ (5.3)Since N m +1 ≥ C ( m + 1) log ( m + 2), we may (and do) choose C sufficiently large so that K ind hyp (6) ≤ (cid:107) φ (cid:107) ∞ r ( m + 1) Crm +1 ≤ r ( N m +1 −(cid:100) ( m +1) (cid:101)− − √ m + 1 ln( m + 2) (5.4)Define ˜ φ ∈ C r − , ([0 , N m +1 ]) by˜ φ ( x ) := 2 rN m +1 √ m + 1 ln( m + 2) φ (2 − N m +1 x )Note that since N m +1 ≥ r , Lemma 5.4(4) tells us:˜ φ ( r ) ( x ) = ˜ φ ( r ) ( i ) (5.5)for every integer 0 ≤ i < N m and every x ∈ [ i, i + 1). We also have by the chain rule (cid:13)(cid:13)(cid:13) ˜ φ ( r ) (cid:13)(cid:13)(cid:13) ∞ = (cid:13)(cid:13) φ ( r ) (cid:13)(cid:13) ∞ √ m + 1 ln( m + 2) (5.6) nd additionally (cid:13)(cid:13)(cid:13) ˜ φ (cid:13)(cid:13)(cid:13) ∞ ≤ rN m +1 (cid:107) φ (cid:107) ∞ ≤ r ( C ( m +1) log ( m +2)+1) (cid:107) φ (cid:107) ∞ = 2 r ( m + 2) Cr ( m +1) (cid:107) φ (cid:107) ∞ (5.7)We will now define the function F m +1 on Γ m +1 = +Γ (cid:48) m +1 ∪ − Γ (cid:48) m +1 . Let us first work with+Γ (cid:48) m +1 . Let γ be a directed path from 0 m to 1 m in +Γ (cid:48) m +1 . Then by definition of +Γ (cid:48) m +1 , γ consists of a = 2 N m +1 −(cid:100) ( m +1) (cid:101)− copies of I , then A = 2 − N m (2 N m +1 − a ) copies of differentdirected paths γ i , 1 ≤ i ≤ A , each belonging to Γ m and connecting 0 m to 1 m , then a more copies of I glued together in series. Identify γ isometrically with [0 , N m +1 ] via q (cid:55)→ d m +1 ( q, m +1 ). Underthis identification, the first set of copies of I gets identified with the subinterval [0 , a ], each γ i getsidentified with the subinterval [ a +( i − N m , a + i N m ], and the last set of copies of I gets identifiedwith the subinterval [2 N m +1 − a, N m +1 ]. 
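The identification just described is pure bookkeeping, and it can be checked mechanically. The following minimal sketch verifies that the first block of $a$ copies of $I$, the $A$ paths $\gamma_i$ of length $2^{N_m}$, and the last block of $a$ copies of $I$ tile $[0, 2^{N_{m+1}}]$. It is only an illustration under stated assumptions: the function names are hypothetical, and the parameter `L` stands for the nonnegative integer ceiling term appearing in the recursion for $N_{m+1}$, supplied by the caller rather than computed from $m$, so that $a = 2^{N_{m+1}-L-1}$ and $A = 2^{-N_m}(2^{N_{m+1}} - 2a)$.

```python
def segment_counts(n_m: int, n_next: int, L: int) -> tuple[int, int]:
    """Return (a, A) for one directed path in +Gamma'_{m+1}.

    n_m and n_next play the roles of N_m and N_{m+1}. L is the ceiling
    term from the recursion (an assumption: passed in, not derived from m).
    a = number of copies of I at each end, A = number of copies of Gamma_m.
    """
    if L < 0 or n_next - L - 1 < 0 or n_next < n_m + L:
        raise ValueError("need L >= 0, N_{m+1} >= L + 1 and N_{m+1} >= N_m + L")
    a = 2 ** (n_next - L - 1)
    # Same as 2^{-N_m} * (2^{N_{m+1}} - 2a), written with integer powers only.
    A = 2 ** (n_next - n_m) - 2 ** (n_next - n_m - L)
    return a, A


def breakpoints(n_m: int, n_next: int, L: int) -> list[int]:
    """Endpoints of the consecutive subintervals [0, a], the A blocks
    [a + (i-1) 2^{N_m}, a + i 2^{N_m}], and the final block of length a."""
    a, A = segment_counts(n_m, n_next, L)
    pts = [0, a]
    pts += [a + i * 2 ** n_m for i in range(1, A + 1)]
    pts.append(pts[-1] + a)
    return pts


if __name__ == "__main__":
    # Illustrative values only: (N_m, N_{m+1}, L) = (0, 2, 0) and (2, 4, 2).
    for n_m, n_next, L in [(0, 2, 0), (2, 4, 2)]:
        pts = breakpoints(n_m, n_next, L)
        assert pts[-1] == 2 ** n_next  # the blocks exactly fill [0, 2^{N_{m+1}}]
        print(n_m, n_next, L, pts)
```

When `L = 0` the middle block is empty ($A = 0$) and the path consists of $2a = 2^{N_{m+1}}$ copies of $I$, which is why the recursion also needs $N_{m+1} \ge 1$.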
We then define φ γ := ˜ φ + f γ (5.8)where f γ is defined as follows: f γ is identically 0 on [0 , a ] ∪ [2 N m +1 − a, N m +1 ], and f γ ( x ) = φ γ i ( x − a − ( i − N m ) on [ a + ( i − N m , a + i N m ] ( φ γ i is given to us by the inductive hypothesis).By the inductive hypothesis applied to (1) and Lemma 5.4(3), φ γ ∈ C r − , ([0 , N m +1 ]) and satisfies(1). It is also clear from this definition, (5.6), and the inductive hypothesis applied to (5) that (cid:13)(cid:13)(cid:13) φ ( r ) γ (cid:13)(cid:13)(cid:13) ∞ (5.8) ≤ (cid:13)(cid:13)(cid:13) ˜ φ ( r ) (cid:13)(cid:13)(cid:13) ∞ + max ≤ i ≤ A (cid:13)(cid:13)(cid:13) φ ( r ) γ i (cid:13)(cid:13)(cid:13) ∞ (5.6) ≤ (cid:13)(cid:13) φ ( r ) (cid:13)(cid:13) ∞ √ m + 1 ln( m + 2) + max ≤ i ≤ A (cid:13)(cid:13)(cid:13) φ ( r ) γ i (cid:13)(cid:13)(cid:13) ∞ ind hyp (5) ≤ (cid:13)(cid:13) φ ( r ) (cid:13)(cid:13) ∞ √ m + 1 ln( m + 2) + 2 √ m (cid:13)(cid:13)(cid:13) φ ( r ) (cid:13)(cid:13)(cid:13) ∞ ≤ √ m + 1 (cid:13)(cid:13)(cid:13) φ ( r ) (cid:13)(cid:13)(cid:13) ∞ verifying (5). We can finally define F m +1 on +Γ (cid:48) m +1 by declaring it to be the jet of φ γ on γ .We need to check that F m +1 is well-defined. Since every point of +Γ m +1 is contained in somedirected path from 0 m to 1 m , we only need to check what happens when one point belongs to twodifferent paths. Let q ∈ +Γ (cid:48) m +1 and suppose q ∈ γ ∩ γ (cid:48) for some directed paths γ, γ (cid:48) from 0 m +1 to 1 m +1 in +Γ (cid:48) m +1 . Set t := d ( q, m +1 ). There are two cases: t ∈ [0 , a ] ∪ [2 N m +1 − a, N m +1 ] or t ∈ [ a + ( i − N m , a + i N m ] for some i . Assume the first case holds. Then our definition of F m +1 ( q ) based on either q ∈ γ or q ∈ γ (cid:48) is F m +1 ( q ) = [ j r − ( t )]( ˜ φ )so well-definedness holds in this case. 
In the other case, our definition of F m +1 ( q ) based on q ∈ γ is, by the inductive hypothesis applied to (2), F m +1 ( q ) = [ j r − ( t )]( ˜ φ ) + ([ j r − ( t − a − ( i − N m )]( φ γ i ) + ( a + ( i − N m − t, ind hyp = [ j r − ( t )]( ˜ φ ) + F m ( q ) + ( a + ( i − N m − t, q ∈ γ (cid:48) , F m +1 ( q ) = [ j r − ( t )]( ˜ φ ) + ([ j r − ( t − a − ( i − N m )]( φ γ (cid:48) i ) + ( a + ( i − N m − t, ind hyp = [ j r − ( t )]( ˜ φ ) + F m ( q ) + ( a + ( i − N m − t, a + ( i − N m − t,
0) is present so that the x -coordinate of the entire expressionwill be t , and that we identify q as belonging to a copy of Γ m so that F m ( q ) makes sense) so well-definedness holds in this case as well. Thus F m +1 is well-defined on +Γ (cid:48) m +1 . We define F m +1 on − Γ m +1 by F m +1 ( q ) = − F m +1 ( ι ( q )), where ι : +Γ (cid:48) m +1 → − Γ (cid:48) m +1 is the involution. It follows fromthis that if γ is a directed 0 m +1 -1 m +1 path in − Γ (cid:48) m +1 , then φ γ = − φ ι ( γ ) . Thus, (1) and (2) are u xu xu xu Figure 2.
Above, the image of Γ , and below, the image of Γ , based on N = 0, N = 2, N = 4, in J ( R ) under the map F . J ( R ) is identified with R via thecoordinates x, u , u . These are not drawn to the same scale. The two images onthe right are respectively graph isomorphic to Γ and Γ .satisfied. It remains to show (3), (4), and (6). Before doing so, let us summarize the discussion on F m +1 of this paragraph: for q ∈ Γ m +1 and t = d m +1 ( q, m +1 ), F m +1 ( q ) = [ j r − ( t )]( ˜ φ ) t ∈ [0 , a ] ∪ [2 N m +1 − a, N m +1 ] q ∈ +Γ (cid:48) m +1 [ j r − ( t )]( ˜ φ )+ F m ( q ) + ( a + ( i − N m − t, t ∈ [ a + ( i − N m , a + i N m ] q ∈ +Γ (cid:48) m +1 [ j r − ( t )]( − ˜ φ ) t ∈ [0 , a ] ∪ [2 N m +1 − a, N m +1 ] q ∈ − Γ (cid:48) m +1 [ j r − ( t )]( − ˜ φ ) − F m ( q ) − ( a + ( i − N m − t, t ∈ [ a + ( i − N m , a + i N m ] q ∈ − Γ (cid:48) m +1 (5.9) ee Figure 2 for the images of Γ and Γ , based on N = 0, N = 2, N = 4, in J ( R ). Using(5.9), we can quickly verify (6): (cid:107) π ◦ F m +1 (cid:107) ∞ (5.9) ≤ (cid:13)(cid:13)(cid:13) ˜ φ (cid:13)(cid:13)(cid:13) ∞ + (cid:107) π ◦ F m (cid:107) ∞ ind hyp (6) ≤ (cid:13)(cid:13)(cid:13) ˜ φ (cid:13)(cid:13)(cid:13) ∞ + 2 r ( m + 1) Crm +1 (cid:107) φ (cid:107) ∞ (5.7) ≤ r ( m + 2) Cr ( m +1) (cid:107) φ (cid:107) ∞ + 2 r ( m + 1) Crm +1 (cid:107) φ (cid:107) ∞ ≤ r ( m + 2) Cr ( m +1)+1 (cid:107) φ (cid:107) ∞ (3) and (4) require more involved arguments.Proof of (3). Let ( q , q ) ∈ Γ m +1 × Γ m +1 be a vertical pair. By definition of vertical pair, d m +1 ( q , m +1 ) = d m +1 ( q , m +1 ). Let t denote this common value. There are two cases, q , q belong to the same copy of Γ (cid:48) m +1 , or they belong to different copies. First assume they belongto the same copy. Without loss of generality say +Γ (cid:48) m +1 . Then there are two subcases for t : t ∈ [0 , a ] ∪ [2 N m +1 − a, N m +1 ] or t ∈ [ a + ( i − N m , a + i N m ] for some 1 ≤ i ≤ A . Assume the firstsubcase holds. 
Then by construction of +Γ (cid:48) m +1 , q , q belong to a copy of I , and thus the equality d m +1 ( q , m +1 ) = d m +1 ( q , m +1 ) implies q = q , so (3) trivially holds. Assume the second subcasefor t . Then | π ( F m +1 ( q ) − F m +1 ( q )) | (5.9) = | π ( F m ( q ) − F m ( q )) | and so (3) holds by the inductive hypothesis.Now assume we are in the second case where q , q belong to different copies of Γ (cid:48) m +1 . Withoutloss of generality, assume q ∈ +Γ (cid:48) m +1 and q ∈ − Γ (cid:48) m +1 . Observe that under this assumption, d m +1 ( q , q ) = 2 t if t ≤ N m +1 − and d m +1 ( q , q ) = 2(2 N m +1 − t ) if t ≥ N m +1 − . Because of thesymmetry of ˜ φ about the line x = 2 N m +1 − , it suffices to only check the case t ≤ N m +1 − . Let usfirst record the following inequality: π ([ j r − ( t )]( ˜ φ )) ≥ (2 t ) r √ m + 1 ln( m + 2) (5.10)which can be proven by π ([ j r − ( t )]( ˜ φ )) = ˜ φ ( t ) = 2 rN m +1 √ m + 1 ln( m + 2) φ (2 − N m +1 t ) Lem 5 . ≥ (2 t ) r √ m + 1 ln( m + 2)Again split into two subcases: t ∈ [0 , a ] or t ∈ [ a, N m +1 − ]. In the first subcase we have π ( F m +1 ( q )) (5.9) = π ([ j r − ( t )]( ˜ φ )) (5.10) ≥ (2 t ) r √ m + 1 ln( m + 2)and π ( F m +1 ( q )) (5.9) = π ([ j r − ( t )]( − ˜ φ )) (5.10) ≤ − (2 t ) r √ m + 1 ln( m + 2)and thus | π ( F m +1 ( q ) − F m +1 ( q )) | ≥ t ) r √ m + 1 ln( m + 2) = 2 d m +1 ( q , q ) r √ m + 1 ln( m + 2)proving (3) in this subcase.Now assume the second subcase, t ∈ [ a, N m +1 − ]. 
Then π ( F m +1 ( q )) (5.9) = π ([ j r − ( t )]( ˜ φ ) + F m ( q ) + ( a + ( i − N m − t, π ([ j r − ( t )]( ˜ φ )) + π ( F m ( q )) (5.3) ≥ π ([ j r − ( t )]( ˜ φ )) − K (5.4) ≥ π ([ j r − ( t )]( ˜ φ )) − r ( N m +1 −(cid:100) ( m +1) (cid:101) ) − √ m + 1 ln( m + 2) (5.10) ≥ (2 t ) r − r ( N m +1 −(cid:100) ( m +1) (cid:101) ) − √ m + 1 ln( m + 2) = (2 t ) r − (2 a ) r / √ m + 1 ln( m + 2) ≥ (2 t ) r − (2 t ) r / √ m + 1 ln( m + 2) = (2 t ) r √ m + 1 ln( m + 2) imilarly, π ( F m +1 ( q )) ≤ − (2 t ) r √ m + 1 ln( m + 2)and thus | π ( F m +1 ( q ) − F m +1 ( q )) | ≥ (2 t ) r √ m + 1 ln( m + 2) = d m +1 ( q , q ) r √ m + 1 ln( m + 2)proving (3) in this final subcase.Proof of (4). Let 0 ≤ t < N m +1 be an arbitrary integer. Again we consider two cases for t : t ∈ [0 , a ) ∪ [2 N m +1 − a, N m +1 ) or t ∈ [ a, N m +1 − a ). Assume the first case holds. There are twosubcases to consider for γ ( X m +1 ): γ ( X m +1 ) belongs to +Γ (cid:48) m +1 or γ ( X m +1 ) belongs to − Γ (cid:48) m +1 .These are complementary events each occuring with probability 1/2. Restricted to the first event,for every x ∈ [ t, t + 1], φ ( r ) γ ( X m +1 ) ( x ) (5.8) = ˜ φ ( r ) ( x ) + f γ ( X m +1 ) ( x ) = ˜ φ ( r ) ( x ) (5.5) = ˜ φ ( r ) ( t )where the second equality holds by the definition of f succeeding (5.8). 
Thus,sup [ t,t +1] φ ( r ) γ ( X m +1 ) = inf [ t,t +1] φ ( r ) γ ( X m +1 ) = ˜ φ ( r ) ( t )Likewise, for the second subcase where we restrict to the event that γ ( X m +1 ) belongs to − Γ (cid:48) m +1 ,sup [ t,t +1] φ ( r ) γ ( X m +1 ) = inf [ t,t +1] φ ( r ) γ ( X m +1 ) = − ˜ φ ( r ) ( t )Combining these yields E (cid:34) exp (cid:32) y (cid:32) sup [ t,t +1] φ ( r ) γ ( X m +1 ) (cid:33)(cid:33)(cid:35) = 12 (cid:16) exp (cid:16) y ˜ φ ( r ) ( t ) (cid:17) + exp (cid:16) − y ˜ φ ( r ) ( t ) (cid:17)(cid:17) = cosh (cid:16) y ˜ φ ( r ) ( t ) (cid:17) ≤ cosh (cid:16) y (cid:13)(cid:13)(cid:13) ˜ φ ( r ) (cid:13)(cid:13)(cid:13) ∞ (cid:17) (5.6) = cosh (cid:18) y (cid:13)(cid:13)(cid:13) φ ( r ) (cid:13)(cid:13)(cid:13) ∞ √ m + 1 ln( m + 2) (cid:19) Lem 3 . ≤ exp (cid:18) y (cid:13)(cid:13)(cid:13) φ ( r ) (cid:13)(cid:13)(cid:13) ∞ m + 1) ln( m + 2) (cid:19) and the same estimate holds for the essential infimum, verifying (4) in this case.Now consider the second case, t ∈ [ a + ( i − N m , a + i N m ] for some 1 ≤ i ≤ A . Again, thereare two subcases to consider for γ ( X m +1 ): γ ( X m +1 ) belongs to +Γ (cid:48) m +1 or γ ( X m +1 ) belongs to − Γ (cid:48) m +1 . Restricted to the first event, and for the range of t under consideration, X m +1 is equal indistribution to a copy of X m (after an appropriate shift in the time parameter), by definition of+Γ (cid:48) m +1 . Thus, for every x ∈ [ t, t + 1], φ ( r ) γ ( X m +1 ) ( x ) (5.8) = ˜ φ ( r ) ( x ) + f γ ( X m +1 ) ( x ) = ˜ φ ( r ) ( x ) + φ γ ( X m ) ( x (cid:48) ) (5.5) = ˜ φ ( r ) ( t ) + φ γ ( X m ) ( x (cid:48) )where x (cid:48) = x − a − ( i − N m , and the second equality holds by the definition of f succeeding (5.8).Thus, sup [ t,t +1] φ ( r ) γ ( X m +1 ) = ˜ φ ( r ) ( t ) + sup [ t (cid:48) ,t (cid:48) +1] φ γ ( X m ) inf [ t,t +1] φ ( r ) γ ( X m +1 ) = ˜ φ ( r ) ( t ) + inf [ t (cid:48) ,t (cid:48) +1] φ γ ( X m )32 here t (cid:48) = t − a − ( i − N m . 
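The first case above closes by averaging $e^{\pm y v}$ over the two equally likely copies of $\Gamma'_{m+1}$ and then bounding the resulting hyperbolic cosine by an exponential. The following minimal sketch checks both elementary facts numerically. It assumes that the Section 3 lemma cited in that step is the standard bound $\cosh(x) \le e^{x^2/2}$ (its statement is not reproduced in this section, so this is an assumption), and the helper names are hypothetical.

```python
import math


def symmetric_mgf(y: float, v: float) -> float:
    """E[exp(y * Z)] for Z taking the values +v and -v with probability 1/2 each."""
    return 0.5 * (math.exp(y * v) + math.exp(-y * v))


def cosh_bound_holds(x: float, rel_tol: float = 1e-12) -> bool:
    """Check cosh(x) <= exp(x^2 / 2), allowing a tiny relative slack for
    floating-point rounding when the two sides nearly coincide."""
    return math.cosh(x) <= math.exp(x * x / 2.0) * (1.0 + rel_tol)


if __name__ == "__main__":
    # The symmetric average is exactly a hyperbolic cosine.
    assert math.isclose(symmetric_mgf(0.7, 1.3), math.cosh(0.7 * 1.3))
    # The assumed bound on a grid of arguments in [-5, 5].
    assert all(cosh_bound_holds(k / 10.0) for k in range(-50, 51))
    print("checks passed")
```

Taking $x = y v$ gives $E[e^{yZ}] \le e^{y^2 v^2/2}$ for a sign-symmetric $Z = \pm v$, which is the shape of the estimate that the induction accumulates one level at a time.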
Likewise, for the second subcase where we restrict to the event that $\gamma(X^{m+1})$ belongs to $-\Gamma'_{m+1}$,
$$\sup_{[t,t+1]} \phi^{(r)}_{\gamma(X^{m+1})} = -\tilde{\phi}^{(r)}(t) - \inf_{[t',t'+1]} \phi^{(r)}_{\gamma(X^m)}, \qquad \inf_{[t,t+1]} \phi^{(r)}_{\gamma(X^{m+1})} = -\tilde{\phi}^{(r)}(t) - \sup_{[t',t'+1]} \phi^{(r)}_{\gamma(X^m)}.$$
Combining these and using the inductive hypothesis applied to (4) and some basic monotonicity and symmetry properties of cosh yields
$$\begin{aligned}
\mathbb{E}\left[\exp\left(y \sup_{[t,t+1]} \phi^{(r)}_{\gamma(X^{m+1})}\right)\right]
&= \frac{1}{2}\exp\left(y\tilde{\phi}^{(r)}(t)\right)\mathbb{E}\left[\exp\left(y \sup_{[t',t'+1]} \phi^{(r)}_{\gamma(X^m)}\right)\right] + \frac{1}{2}\exp\left(-y\tilde{\phi}^{(r)}(t)\right)\mathbb{E}\left[\exp\left(-y \inf_{[t',t'+1]} \phi^{(r)}_{\gamma(X^m)}\right)\right] \\
&\overset{\text{ind hyp}}{\le} \frac{1}{2}\exp\left(y\tilde{\phi}^{(r)}(t)\right)\exp\left(y^2\left\|\phi^{(r)}\right\|_\infty^2 \sum_{n=1}^{m} \frac{1}{2n\ln(n+1)^2}\right) + \frac{1}{2}\exp\left(-y\tilde{\phi}^{(r)}(t)\right)\exp\left((-y)^2\left\|\phi^{(r)}\right\|_\infty^2 \sum_{n=1}^{m} \frac{1}{2n\ln(n+1)^2}\right) \\
&= \cosh\left(y\tilde{\phi}^{(r)}(t)\right)\exp\left(y^2\left\|\phi^{(r)}\right\|_\infty^2 \sum_{n=1}^{m} \frac{1}{2n\ln(n+1)^2}\right) \\
&\le \cosh\left(y\left\|\tilde{\phi}^{(r)}\right\|_\infty\right)\exp\left(y^2\left\|\phi^{(r)}\right\|_\infty^2 \sum_{n=1}^{m} \frac{1}{2n\ln(n+1)^2}\right) \\
&\overset{(5.6)}{=} \cosh\left(\frac{y\left\|\phi^{(r)}\right\|_\infty}{\sqrt{m+1}\,\ln(m+2)}\right)\exp\left(y^2\left\|\phi^{(r)}\right\|_\infty^2 \sum_{n=1}^{m} \frac{1}{2n\ln(n+1)^2}\right) \\
&\overset{\text{Lem 3.}}{\le} \exp\left(\frac{y^2\left\|\phi^{(r)}\right\|_\infty^2}{2(m+1)\ln(m+2)^2}\right)\exp\left(y^2\left\|\phi^{(r)}\right\|_\infty^2 \sum_{n=1}^{m} \frac{1}{2n\ln(n+1)^2}\right) \\
&= \exp\left(y^2\left\|\phi^{(r)}\right\|_\infty^2 \sum_{n=1}^{m+1} \frac{1}{2n\ln(n+1)^2}\right)
\end{aligned}$$
and the same estimate holds for the infimum, verifying (4) in this case. This completes the inductive step and the proof of the lemma. □

Theorem 5.6.
For every $p$ > , $r$ ≥ , coarsely dense set $N \subseteq J^{r-1}(\mathbb{R})$, and $R$ ≥ , let $B_N(R) := \{x \in N : d_{CC}(0,x) \le R\}$. Then
$$\Pi_p(B_N(R)) \gtrsim \frac{\ln(R)^{\frac{1}{p}-\frac{1}{2r}}}{\ln(\ln(R))^{\frac{1}{p}+\frac{1}{2r}}}$$
where the implicit constant can depend on $r, p$ but not on $N, R$.

Proof. Let $p, r, N$ be as above. Since the Markov convexity constant $\Pi_p$ is scale-invariant, by applying a dilation we may assume without loss of generality that every point of $J^{r-1}(\mathbb{R})$ is at a distance of at most 1 away from a point of $N$. Let $F_m : \Gamma_m \to J^{r-1}(\mathbb{R})$ be the sequence of maps from Lemma 5.5. Extend the domain of $t$ for the random walks on $\Gamma_m$ by $X^m_t := X^m_0$ if $t \le 0$, and $X^m_t := X^m_{2^{N_m}}$ if $t \ge 2^{N_m}$. Each $\{X^m_t\}_{t \in \mathbb{Z}}$ is a Markov process on the state space $\Gamma_m$. With full probability, $d_m(X^m_t, 0_m) = \min(\max(0,t), 2^{N_m})$. Since $\tilde{X}^m_t(t-2^k)$ equals $X^m_t$ in distribution, $(X^m_t, \tilde{X}^m_t(t-2^k))$ is a vertical pair with full probability. Then Lemma 5.5(3) applies, and we get the following lower bound for the left hand side of the Markov convexity inequality in Definition 1.1:
$$\begin{aligned}
\sum_{k=0}^{\infty}\sum_{t\in\mathbb{Z}} \frac{\mathbb{E}\left[d_{CC}\left(F_m(X^m_t), F_m(\tilde{X}^m_t(t-2^k))\right)^p\right]}{2^{kp}}
&\overset{\text{Lem 3.}}{\ge} \sum_{k=0}^{\infty}\sum_{t\in\mathbb{Z}} \frac{\mathbb{E}\left[\left|\pi\left(F_m(X^m_t) - F_m(\tilde{X}^m_t(t-2^k))\right)\right|^{p/r}\right]}{2^{kp}} \\
&\overset{\text{Lem 5.}}{\ge} m^{-\frac{p}{2r}}\ln(m+1)^{-\frac{p}{r}} \sum_{k=0}^{\infty}\sum_{t\in\mathbb{Z}} \frac{\mathbb{E}\left[d_m\left(X^m_t, \tilde{X}^m_t(t-2^k)\right)^p\right]}{2^{kp}} \\
&\overset{\text{Lem 5.}}{\gtrsim} m^{-\frac{p}{2r}}\ln(m+1)^{-\frac{p}{r}}\, m\, 2^{N_m} = \frac{m^{1-\frac{p}{2r}}\, 2^{N_m}}{\ln(m+1)^{\frac{p}{r}}}
\end{aligned}$$
In summary,
$$\sum_{k=0}^{\infty}\sum_{t\in\mathbb{Z}} \frac{\mathbb{E}\left[d_{CC}\left(F_m(X^m_t), F_m(\tilde{X}^m_t(t-2^k))\right)^p\right]}{2^{kp}} \gtrsim \frac{m^{1-\frac{p}{2r}}\, 2^{N_m}}{\ln(m+1)^{\frac{p}{r}}} \tag{5.11}$$
Now we upper bound the right hand side of the Markov convexity inequality. Since $d_{CC}(F_m(X^m_{t+1}), F_m(X^m_t)) = 0$ whenever $t \le -1$ or $t \ge 2^{N_m}$,
$$\sum_{t\in\mathbb{Z}} \mathbb{E}\left[d_{CC}\left(F_m(X^m_{t+1}), F_m(X^m_t)\right)^p\right] = \sum_{t=0}^{2^{N_m}-1} \mathbb{E}\left[d_{CC}\left(F_m(X^m_{t+1}), F_m(X^m_t)\right)^p\right] =: (*) \tag{5.12}$$
Then
$$\begin{aligned}
(*) &\overset{\text{Lem 5.}}{=} \sum_{t=0}^{2^{N_m}-1} \mathbb{E}\left[d_{CC}\left([j^{r-1}(t+1)](\phi_{\gamma(X^m)}),\ [j^{r-1}(t)](\phi_{\gamma(X^m)})\right)^p\right] \\
&\overset{\text{Lem 3.}}{\le} \sum_{t=0}^{2^{N_m}-1} \mathbb{E}\left[\left(\left\|\phi^{(r)}_{\gamma(X^m)}\right\|_{L^\infty[t,t+1]}\right)^p\right]
\lesssim \sum_{t=0}^{2^{N_m}-1} \mathbb{E}\left[\left\|\phi^{(r)}_{\gamma(X^m)}\right\|^p_{L^\infty[t,t+1]}\right] \\
&\overset{\text{Lems 3., .}}{\lesssim} \sum_{t=0}^{2^{N_m}-1} 1 = 2^{N_m}
\end{aligned}$$
In summary,
$$\sum_{t\in\mathbb{Z}} \mathbb{E}\left[d_{CC}\left(F_m(X^m_{t+1}), F_m(X^m_t)\right)^p\right] \lesssim 2^{N_m} \tag{5.13}$$
Let $\pi_N : J^{r-1}(\mathbb{R}) \to N$ be any map so that
$$d_{CC}(x, \pi_N(x)) \le 1 \tag{5.14}$$
for every $x \in J^{r-1}(\mathbb{R})$. We use $\pi_N$ to transfer inequalities (5.11) and (5.13) to corresponding inequalities on $N$.
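Dividing (5.11) by (5.13) and taking $p$-th roots suggests that the Markov $p$-convexity constant of the images $F_m(\Gamma_m)$ is at least of order $m^{1/p - 1/(2r)} / \ln(m+1)^{1/r}$. The sketch below tabulates this quantity for a few values of $p$ and $m$. It assumes that the exponents in (5.11) read $1 - p/(2r)$ and $p/r$, which is what the surrounding derivation indicates, and it drops all implicit constants; the function name is hypothetical.

```python
import math


def ratio_lower_bound(m: int, p: float, r: int) -> float:
    """p-th root of the quotient (5.11)/(5.13), with implicit constants dropped.

    Assumed form: m^(1/p - 1/(2r)) / ln(m + 1)^(1/r).
    """
    return m ** (1.0 / p - 1.0 / (2.0 * r)) / math.log(m + 1) ** (1.0 / r)


if __name__ == "__main__":
    r = 2  # illustrative step; any integer r >= 1 can be substituted
    for p in (3.0, 4.0, 5.0):
        values = [ratio_lower_bound(10 ** k, p, r) for k in (1, 6, 12, 24)]
        print(f"p={p}: " + ", ".join(f"{v:.4f}" for v in values))
```

Under these assumptions the tabulated quantity grows without bound exactly when $p < 2r$, and tends to zero when $p \ge 2r$, which matches the threshold $p = 2r$ stated in the abstract.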
Consider the maps ¯ F m : Γ m → N defined by ¯ F m := π N ◦ δ m ◦ F m .By Lemma 5.5(3), d CC ( δ m ( F m ( q )) , δ m ( F m ( q ))) ≥ d m ( q , q ) ≥ q , q ) ∈ Γ m × Γ m . Combining this with (5.14) yields d CC ( ¯ F m ( q ) , ¯ F m ( q )) (5.14) ≥ d CC ( δ m ( F m ( q )) , δ m ( F m ( q ))) − ≥ d CC ( δ m ( F m ( q )) , δ m ( F m ( q ))) = md CC ( F m ( q ) , F m ( q )) or any vertical pair ( q , q ). Combining this with (5.11) yields ∞ (cid:88) k =0 (cid:88) t ∈ Z E [ d CC ( ¯ F m ( X mt ) , ¯ F m ( ˜ X mt ( t − k ))) p ]2 kp (cid:38) m p +1 − p r N m ln( m + 1) pr (5.15)Next, d CC ( ¯ F m ( X mt +1 ) , ¯ F m ( X mt )) (5.14) ≤ d CC ( δ m ( F m ( X mt +1 )) , δ m ( F m ( X mt ))) + 2= 2 md CC ( F m ( X mt +1 ) , F m ( X mt )) + 2Combining this with (5.13) and (5.12) yields (cid:88) t ∈ Z E [ d CC ( ¯ F m ( X mt +1 ) , ¯ F m ( X mt )) p ] (cid:46) m p N m (5.16)For each R ≥
1, let m ( R ) denote the largest m so that ¯ F m ( R ) (Γ m ( R ) ) ⊆ B N ( R ). Then (5.15) and(5.16) imply Π p ( B N ( R )) (cid:38) m ( R ) p − r ln( m ( R ) + 1) r (5.17)Now we wish to estimate the quantity m ( R ). Let m ≥ m are connected by a geodesic that is a piecewise directed path, the Lipschitz constant of any mapon Γ m is the maximum of the Lipschitz constants of the map restricted to directed paths. Thus,by Lemmas 5.5(2), 5.5(5), and 3.1, Lip( F m ) (cid:46) √ m . Since diam(Γ m ) = 2 N m ≤ Cm log ( m +1)+1 and F m (0 m ) = 0, this implies F m (Γ m ) ⊆ B J r − ( R ) ( R (cid:48) ) with R (cid:48) (cid:46) ( m + 1) Cm + . Then δ m ( F m (Γ m )) ⊆ B J r − ( R ) ( R (cid:48)(cid:48) ) with R (cid:48)(cid:48) (cid:46) ( m + 1) Cm + . Then ¯ F m (Γ m ) = π N ( δ m ( F m (Γ m ))) ⊆ B J r − ( R ) ( R (cid:48)(cid:48) + 1).This implies, for any R ≥ R (cid:46) ( m ( R ) + 1) Cm ( R )+ , where the implied constant is independentof R . This implies m ( R ) (cid:38) ln( R )ln(ln( R )) for R ≥
3. Plugging this into (5.17) yields
$$\Pi_p(B_N(R)) \gtrsim \frac{\ln(R)^{\frac{1}{p}-\frac{1}{2r}}}{\ln(\ln(R))^{\frac{1}{p}+\frac{1}{2r}}}$$
□

References

[ANT13] Tim Austin, Assaf Naor, and Romain Tessera, Sharp quantitative nonembeddability of the Heisenberg group into superreflexive Banach spaces, Groups Geom. Dyn. (2013), no. 3, 497–522. MR 3095705
[Ass83] Patrice Assouad, Plongements lipschitziens dans $\mathbb{R}^n$, Bull. Soc. Math. France (1983), no. 4, 429–448. MR 763553
[BLU07] A. Bonfiglioli, E. Lanconelli, and F. Uguzzoni, Stratified Lie groups and potential theory for their sub-Laplacians, Springer Monographs in Mathematics, Springer, Berlin, 2007. MR 2363343
[Bou86] J. Bourgain, The metrical interpretation of superreflexivity in Banach spaces, Israel J. Math. (1986), no. 2, 222–230. MR 880292
[CK06] Jeff Cheeger and Bruce Kleiner, On the differentiability of Lipschitz maps from metric measure spaces to Banach spaces, Inspired by S. S. Chern, Nankai Tracts Math., vol. 11, World Sci. Publ., Hackensack, NJ, 2006, pp. 129–152. MR 2313333
[HS90] Waldemar Hebisch and Adam Sikora, A smooth subadditive homogeneous norm on a homogeneous group, Studia Math. (1990), no. 3, 231–236. MR 1067309
[JL84] William B. Johnson and Joram Lindenstrauss, Extensions of Lipschitz mappings into a Hilbert space, Conference in modern analysis and probability (New Haven, Conn., 1982), Contemp. Math., vol. 26, Amer. Math. Soc., Providence, RI, 1984, pp. 189–206. MR 737400
[Jun17] Derek Jung, A variant of Gromov's problem on Hölder equivalence of Carnot groups, J. Math. Anal. Appl. (2017), no. 1, 251–273. MR 3680967
[Jun19] Derek Jung, Bilipschitz embeddings of spheres into jet space Carnot groups not admitting Lipschitz extensions, Ann. Acad. Sci. Fenn. Math. (2019), no. 1, 261–280. MR 3919136
[LD17] Enrico Le Donne, A primer on Carnot groups: homogenous groups, Carnot-Carathéodory spaces, and regularity of their isometries, Anal. Geom. Metr. Spaces (2017), no. 1, 116–137. MR 3742567
[LDLM18] Enrico Le Donne, Sean Li, and Terhi Moisala, Gâteaux differentiability on infinite-dimensional Carnot groups, arXiv (2018).
[Li14] Sean Li, Coarse differentiation and quantitative nonembeddability for Carnot groups, J. Funct. Anal. (2014), no. 7, 4616–4704. MR 3170215
[Li16] Sean Li, Markov convexity and nonembeddability of the Heisenberg group, Ann. Inst. Fourier (Grenoble) (2016), no. 4, 1615–1651. MR 3494180
[LN06] J. R. Lee and A. Naor, $L_p$ metrics on the Heisenberg group and the Goemans-Linial conjecture, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06), Oct 2006, pp. 99–108.
[LN14] Vincent Lafforgue and Assaf Naor, Vertical versus horizontal Poincaré inequalities on the Heisenberg group, Israel J. Math. (2014), no. 1, 309–339. MR 3273443
[LNP09] James R. Lee, Assaf Naor, and Yuval Peres, Trees and Markov convexity, Geom. Funct. Anal. (2009), no. 5, 1609–1659. MR 2481738
[MN13] Manor Mendel and Assaf Naor, Markov convexity and local rigidity of distorted metrics, J. Eur. Math. Soc. (JEMS) (2013), no. 1, 287–337. MR 2998836
[Mos62] G. D. Mostow, Homogeneous spaces with finite invariant measure, Ann. of Math. (2) (1962), 17–37. MR 145007
[Nao12] Assaf Naor, An introduction to the Ribe program, Jpn. J. Math. (2012), no. 2, 167–233. MR 2995229
[Nao18] Assaf Naor, Metric dimension reduction: A snapshot of the Ribe program, Proc. Int. Cong. of Math., vol. 1, 2018, pp. 759–838.
[NPS18] Assaf Naor, Gilles Pisier, and Gideon Schechtman, Impossibility of dimension reduction in the nuclear norm [extended abstract], Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, Philadelphia, PA, 2018, pp. 1345–1352. MR 3775876
[NY18] Assaf Naor and Robert Young, Vertical perimeter versus horizontal perimeter, Ann. of Math. (2) (2018), no. 1, 171–279. MR 3815462
[Ost14a] Mikhail Ostrovskii, Radon-Nikodým property and thick families of geodesics, J. Math. Anal. Appl. (2014), no. 2, 906–910. MR 3103207
[Ost14b] Mikhail I. Ostrovskii, Metric spaces nonembeddable into Banach spaces with the Radon-Nikodým property and thick families of geodesics, Fund. Math. (2014), no. 1, 85–96. MR 3247034
[Pan89] Pierre Pansu, Métriques de Carnot-Carathéodory et quasiisométries des espaces symétriques de rang un, Ann. of Math. (2) (1989), no. 1, 1–60. MR 979599
[Pis16] Gilles Pisier, Martingales in Banach spaces, Cambridge Studies in Advanced Mathematics, vol. 155, Cambridge University Press, Cambridge, 2016. MR 3617459
[Rag72] M. S. Raghunathan, Discrete subgroups of Lie groups, Springer-Verlag, New York-Heidelberg, 1972, Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 68. MR 0507234
[Rib76] M. Ribe, On uniformly homeomorphic normed spaces, Ark. Mat. (1976), no. 2, 237–244. MR 0440340
[RW10] Séverine Rigot and Stefan Wenger, Lipschitz non-extension theorems into jet space Carnot groups, Int. Math. Res. Not. IMRN (2010), no. 18, 3633–3648. MR 2725507
[War05] Ben Warhurst, Jet spaces as nonrigid Carnot groups, J. Lie Theory (2005), no. 1, 341–356. MR 2115247
[Wol03] Thomas H. Wolff, Lectures on harmonic analysis, University Lecture Series, vol. 29, American Mathematical Society, Providence, RI, 2003, With a foreword by Charles Fefferman and a preface by Izabella Łaba, Edited by Łaba and Carol Shubin. MR 2003254