An Algebra Containing the Two-Sided Convolution Operators

Abstract

We present an intrinsically defined algebra of operators containing the right and left invariant Calderón-Zygmund operators on a stratified group. The operators in our algebra are pseudolocal and bounded on L^p (1<p<\infty). This algebra provides an example of an algebra of singular integrals that falls outside of the classical Calderón-Zygmund theory.


arXiv [math.CA], Feb

Brian Street

Let $G$ be a stratified Lie group. That is, $G$ is connected, simply connected, and its Lie algebra $\mathfrak{g}$ may be decomposed $\mathfrak{g} = V_1 \oplus \cdots \oplus V_m$, where $[V_1, V_k] = V_{k+1}$ for $1 \le k < m$ and $[V_1, V_m] = 0$. The Calderón-Zygmund theory for left (or right) invariant convolution operators on $G$ is well-known (see [Ste93], and Section 3 for a review). Given a distribution kernel $K$ as in Definition 3.2 one obtains two "Calderón-Zygmund singular integral operators":

\[ \mathrm{Op}_L(K) f := f * K, \qquad \mathrm{Op}_R(K) f := K * f. \]

The operators of the form $\mathrm{Op}_L(K)$ form an algebra ($\mathrm{Op}_L(K_1)\,\mathrm{Op}_L(K_2) = \mathrm{Op}_L(K_2 * K_1)$), are bounded on $L^p$ ($1 < p < \infty$), and are pseudolocal. The same is true for operators of the form $\mathrm{Op}_R(K)$. Also, if we consider:

\[ \mathrm{Op}_L(K_1)\,\mathrm{Op}_R(K_2) f = (K_2 * f) * K_1 = K_2 * (f * K_1) = \mathrm{Op}_R(K_2)\,\mathrm{Op}_L(K_1) f \]

we see that $\mathrm{Op}_L(K_1)$ and $\mathrm{Op}_R(K_2)$ commute. Hence, it follows that:

\[ \mathrm{Op}_L(K_1)\,\mathrm{Op}_R(K_2)\,\mathrm{Op}_L(K_3)\,\mathrm{Op}_R(K_4) = \mathrm{Op}_L(K_3 * K_1)\,\mathrm{Op}_R(K_2 * K_4) \]

and so operators of the form $\mathrm{Op}_L(K_1)\,\mathrm{Op}_R(K_2)$ are closed under composition. It is also evident that they are bounded on $L^p$ ($1 < p < \infty$) and are pseudolocal.

The main goal of this paper is to present an algebra of operators, which contains operators of the form $\mathrm{Op}_L(K_1)\,\mathrm{Op}_R(K_2)$, and such that the operators in this algebra are bounded on $L^p$ ($1 < p < \infty$), and are pseudolocal. Moreover, the algebra will contain the so-called two-sided convolution operators, of the form:

\[ T f(x) = \int K(y, z)\, f\left(z^{-1} x y^{-1}\right) dy\, dz \tag{1} \]

where $K$ is a product kernel (see Section 3.2).
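The composition rules above all reduce to associativity of convolution. The following worked derivation is a sketch under stated assumptions: the labels $K_1, \dots, K_4$ are introduced here only to tell the kernels apart, and the convolution convention in the first comment line is assumed.

```latex
% Assumed convention: (f * g)(x) = \int_G f(x y^{-1}) g(y) \, dy.
% K_1, ..., K_4 are illustrative labels for four kernels as in Definition 3.2.
\begin{align*}
\mathrm{Op}_L(K_1)\,\mathrm{Op}_L(K_3) f
  &= (f * K_3) * K_1 = f * (K_3 * K_1) = \mathrm{Op}_L(K_3 * K_1) f, \\
\mathrm{Op}_R(K_2)\,\mathrm{Op}_R(K_4) f
  &= K_2 * (K_4 * f) = (K_2 * K_4) * f = \mathrm{Op}_R(K_2 * K_4) f, \\
\mathrm{Op}_L(K_1)\,\mathrm{Op}_R(K_2)\,\mathrm{Op}_L(K_3)\,\mathrm{Op}_R(K_4)
  &= \mathrm{Op}_L(K_1)\,\mathrm{Op}_L(K_3)\,\mathrm{Op}_R(K_2)\,\mathrm{Op}_R(K_4) \\
  &= \mathrm{Op}_L(K_3 * K_1)\,\mathrm{Op}_R(K_2 * K_4).
\end{align*}
```

The third line uses only that $\mathrm{Op}_R(K_2)$ and $\mathrm{Op}_L(K_3)$ commute. Note that the order of the kernels reverses on the left-invariant side and is preserved on the right-invariant side.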
This algebra provides a naturally occurring example that falls outside of the classical Calderón-Zygmund paradigm.

Operators that fall outside of the classical Calderón-Zygmund paradigm often arise in the construction of parametrices of hypoelliptic operators which are not maximally subelliptic. In fact, one of the original motivations for the present paper was the form of the parametrix constructed in [Str07] for Kohn's example of a sum of squares of complex vector fields, whose commutators span the tangent space at each point, and such that the sum of squares is hypoelliptic but not subelliptic ([Koh05]). The parametrix is constructed from compositions of left and right convolution operators on the three dimensional Heisenberg group, and is therefore closely related to the algebra discussed in this paper. It is our hope that the work in this paper will help to motivate the proper algebras to use in other problems, where the Calderón-Zygmund theory is no longer applicable (for instance, as in [NS06]).

Acknowledgements

This project began as a collaboration with Eli Stein. During its early stages, we shared many very interesting conversations on this subject. Even as the project progressed, he continued to provide me with suggestions and encouragement. Finally, virtually everything I know about Calderón-Zygmund theory, I learned studying under him during my years as a graduate student. On all these counts, I am indebted to him. I would also like to thank Alex Nagel, with whom I shared some interesting conversations in the early stages of this project.

Recall, $G$ is a stratified group (for a background on such groups see [Fol75]), and as such, the Lie algebra $\mathfrak{g} = V_1 \oplus \cdots \oplus V_m$, with the $V_j$ satisfying the relations in the introduction.

Fix a basis $X^{(1)}, \dots, X^{(n)}$ for $V_1$, thought of as elements of the tangent space to the identity in $G$. Then we can think of each $X^{(j)}$ as either a right invariant or a left invariant vector field, call them $X^{(j)}_R$ and $X^{(j)}_L$ respectively. From here, we get the left and right gradients:

\[ \nabla_L = \left( X^{(1)}_L, \dots, X^{(n)}_L \right), \qquad \nabla_R = \left( X^{(1)}_R, \dots, X^{(n)}_R \right). \]

Our definitions will be in terms of $\nabla_L$ and $\nabla_R$, but will not depend in any essential way on the specific choice of basis of $V_1$. Throughout this paper, we will use ordered multi-index notation. Thus, for a finite sequence $s$ of numbers in $\{1, \dots, n\}$, we define $|s|$ to be the length of the sequence, and $\nabla_L^s$, $\nabla_R^s$ in the usual way. So that, for instance:

\[ \nabla_L^{(i,j,k)} = X^{(i)}_L X^{(j)}_L X^{(k)}_L \]

and $|(i,j,k)| = 3$.

For $\epsilon \ge 0$, let $\rho^L_\epsilon$ denote the Carnot-Carathéodory distance on $G$ associated to the vector fields $\{\nabla_L, \epsilon \nabla_R\}$ and $\rho^R_\epsilon$ the one associated to the vector fields $\{\nabla_R, \epsilon \nabla_L\}$ (see Section 4 and references therein for background on such metrics). Let $B^L_\epsilon(x, \delta)$ denote the ball centered at $x$ of radius $\delta$ in the $\rho^L_\epsilon$ metric, and $V^L_\epsilon(x, \delta)$ its volume. Similarly, we define $B^R_\epsilon$ and $V^R_\epsilon$ in terms of $\rho^R_\epsilon$.

Definition 2.1.

For $r_R \ge r_L > 0$, we say $\phi \in C_0^\infty(G)$ is a normalized $r_L, r_R$ bump function of order $M$ centered at $x \in G$ if $\phi$ is supported in $B^L_{r_L/r_R}\left(x, \frac{1}{r_L}\right)$, and for all $|\alpha| + |\beta| \le M$,

\[ \left| \nabla_L^\alpha \nabla_R^\beta \phi \right| \le \frac{r_L^{|\alpha|}\, r_R^{|\beta|}}{V^L_{r_L/r_R}\left(x, \frac{1}{r_L}\right)}. \]

When $r_L \ge r_R$, we replace $V^L_{r_L/r_R}\left(x, \frac{1}{r_L}\right)$ with $V^R_{r_R/r_L}\left(x, \frac{1}{r_R}\right)$.

We define (for $0 < r_L \le r_R$)

\[ B(r_L, r_R, N_L, N_R, m, x, y) = \frac{r_L^{N_L}\, r_R^{N_R}\, \left(1 + r_L\, \rho^L_{r_L/r_R}(x,y)\right)^{-m}}{V^L_{r_L/r_R}\left(x, \frac{1}{r_L} + \rho^L_{r_L/r_R}(x,y)\right)} \]

and when $r_L \ge r_R$, we reverse the roles of $r_L$ and $r_R$ and of the left and right vector fields.

Before we define our algebra rigorously, let us write the definition while being a little loose with quantifiers. We say that $T \in \mathcal{A}$ if for every $m \ge 0$, every $\phi^x_{r^{(1)}_L, r^{(1)}_R}$ normalized $r^{(1)}_L, r^{(1)}_R$ bump function centered at $x$ and every $\phi^y_{r^{(2)}_L, r^{(2)}_R}$ normalized $r^{(2)}_L, r^{(2)}_R$ bump function centered at $y$ (we suppress the order for the moment), we have:

\[ \left| \left\langle \phi^x_{r^{(1)}_L, r^{(1)}_R},\ \nabla_L^{\alpha_1} \nabla_R^{\beta_1}\, T\, \nabla_L^{\alpha_2} \nabla_R^{\beta_2}\, \phi^y_{r^{(2)}_L, r^{(2)}_R} \right\rangle_{L^2} \right| \le C\, B\left( r^{(1)}_L \wedge r^{(2)}_L,\ r^{(1)}_R \wedge r^{(2)}_R,\ |\alpha_1| + |\alpha_2|,\ |\beta_1| + |\beta_2|,\ m,\ x,\ y \right) \]

where $|\alpha_1| + |\alpha_2|$ and $|\beta_1| + |\beta_2|$ must be sufficiently large depending on $m$, and $C$ is uniform in the choice of normalized bump function, $r^{(1)}_L$, $r^{(1)}_R$, $r^{(2)}_L$, $r^{(2)}_R$, and in $x, y$. Here, and in the rest of the paper, $a \wedge b$ denotes the minimum of $a$ and $b$, while $a \vee b$ denotes the maximum. Rigorously, our definition is:

Definition 2.2.

We define $\mathcal{A}$ to be the set of those operators $T : \mathcal{S}(G) \to \mathcal{S}(G)'$, such that for all $m \ge 0$, there exists $N$, such that for all $N_L, N_R \ge N$ there exist $C > 0$, $M \in \mathbb{N}$ such that for all $x, y \in G$, all $r^{(1)}_L, r^{(2)}_L, r^{(1)}_R, r^{(2)}_R > 0$, all $\phi^x_{r^{(1)}_L, r^{(1)}_R}$ normalized $r^{(1)}_L, r^{(1)}_R$ bump functions centered at $x$ of order $M$, all $\phi^y_{r^{(2)}_L, r^{(2)}_R}$ normalized $r^{(2)}_L, r^{(2)}_R$ bump functions centered at $y$ of order $M$, and all $|\alpha_1| + |\alpha_2| = N_L$, $|\beta_1| + |\beta_2| = N_R$, we have:

\[ \left| \left\langle \phi^x_{r^{(1)}_L, r^{(1)}_R},\ \nabla_L^{\alpha_1} \nabla_R^{\beta_1}\, T\, \nabla_L^{\alpha_2} \nabla_R^{\beta_2}\, \phi^y_{r^{(2)}_L, r^{(2)}_R} \right\rangle_{L^2} \right| \le C\, B\left( r^{(1)}_L \wedge r^{(2)}_L,\ r^{(1)}_R \wedge r^{(2)}_R,\ N_L,\ N_R,\ m,\ x,\ y \right) \tag{2} \]

Remark 2.3. We will see a posteriori that $N = Q + m + 1$ will work. See Remark 7.2.

We will show:

1. The operators in $\mathcal{A}$ are the same as those in $\mathcal{A}'$ (defined below; see Section 7).
2. The operators in $\mathcal{A}'$ extend uniquely to bounded operators on $L^p$, $1 < p < \infty$ (Section 8).
3. If $T \in \mathcal{A}$, then $T^* \in \mathcal{A}$, where $T^*$ denotes the $L^2$ adjoint of $T$ (Remark 2.8).
4. The operators in $\mathcal{A}'$ form an algebra (Remark 2.7).
5. The operators in $\mathcal{A}'$ are pseudolocal (Section 9).
6. Two-sided convolution operators (and therefore the right and left Calderón-Zygmund operators) are in $\mathcal{A}'$ (Corollary 6.7).

Our main technical result is that the operators in $\mathcal{A}$ are the same as those in $\mathcal{A}'$. To define $\mathcal{A}'$, we need a preliminary definition.

Definition 2.4.

We say that $\phi(x, z) \in C^\infty(G \times G)$ is an $r_L, r_R$ elementary kernel if, for every $m$ and every $\alpha_1, \beta_1, \alpha_2, \beta_2$, there exists a $C = C(m, \alpha_1, \alpha_2, \beta_1, \beta_2)$ such that

\[ \left| \nabla_{L,x}^{\alpha_1} \nabla_{L,z}^{\alpha_2} \nabla_{R,x}^{\beta_1} \nabla_{R,z}^{\beta_2}\, \phi(x, z) \right| \le C\, B\left(r_L, r_R, |\alpha_1| + |\alpha_2|, |\beta_1| + |\beta_2|, m, x, z\right) \tag{3} \]

and, for every $N_1, N_2, N_3, N_4 \in \mathbb{N}$, and every $|\alpha_1| = N_1$, $|\alpha_2| = N_2$, $|\beta_1| = N_3$, $|\beta_2| = N_4$, there exist functions $\psi_{\alpha_1, \alpha_2, \beta_1, \beta_2} \in C^\infty(G \times G)$ such that

\[ \phi = r_L^{-N_1 - N_2}\, r_R^{-N_3 - N_4} \sum_{\alpha_1, \alpha_2, \beta_1, \beta_2} \nabla_{L,x}^{\alpha_1} \nabla_{L,z}^{\alpha_2} \nabla_{R,x}^{\beta_1} \nabla_{R,z}^{\beta_2}\, \psi_{\alpha_1, \alpha_2, \beta_1, \beta_2} \]

and the $\psi$ satisfy (3) with different constants. Finally, we say $E$ is an $r_L, r_R$ elementary operator if the Schwartz kernel of $E$ is an elementary kernel.

For each $r_L, r_R$, Definition 2.4 implicitly defines a family of seminorms of the elementary kernels (i.e., the least possible $C$ in (3), and the least possible $C$ obtained from all choices of $\psi$, etc.). If $L : C^\infty(G \times G) \to C^\infty(G \times G)$ is a linear map that takes $r_L, r_R$ elementary kernels to $r_L, r_R$ elementary kernels, continuously, it makes sense to ask if it does so uniformly in $r_L, r_R$, since we may order the semi-norms consistently as $r_L$ and $r_R$ vary.

Definition 2.5.

We define $\mathcal{A}'$ to be those operators $T : \mathcal{S}_0(G) \to \mathcal{S}_0(G)$ such that for each $r_L, r_R$, and every $E$ an $r_L, r_R$ elementary operator, $TE$ is an $r_L, r_R$ elementary operator, and this map is uniformly continuous in $r_L, r_R$. Here $\mathcal{S}_0$ is the set of Schwartz functions, all of whose moments vanish.

Remark 2.6. The operators in $\mathcal{A}'$ are a priori defined only on $\mathcal{S}_0$. To see that they are the same as those in $\mathcal{A}$, we first extend them as bounded operators on $L^2$, and then prove that the extended operator is in $\mathcal{A}$.

Remark 2.7. It is evident that if $T_1, T_2 \in \mathcal{A}'$ then $T_1 T_2 \in \mathcal{A}'$. We will show that the operators in $\mathcal{A}'$ are the same as those in $\mathcal{A}$, and therefore $\mathcal{A}$ forms an algebra.

The operators in $\mathcal{A}$ may be thought of as "smoothing of order 0." In Section 10 we define the analogous concept of operators which are smoothing of other orders. In Section 10 we also discuss an alternative to Definition 2.2, and why a definition like Definition 2.2 seems to be necessary.

Remark 2.8. Definitions 2.2 and 2.4 may not seem to be symmetric (e.g., if $T \in \mathcal{A}$, is $T^* \in \mathcal{A}$? and if $E$ is an $r_L, r_R$ elementary kernel, is $E^*$?), however they are. Indeed, despite the fact that $B(\cdot, \cdot, \cdot, \cdot, \cdot, x, y)$ is not symmetric in $x$ and $y$, it follows from the results in Section 4 that there exists a $C > 0$ such that

\[ \frac{1}{C}\, B(\cdot, \cdot, \cdot, \cdot, \cdot, y, x) \le B(\cdot, \cdot, \cdot, \cdot, \cdot, x, y) \le C\, B(\cdot, \cdot, \cdot, \cdot, \cdot, y, x). \]

Some words on notation. When we refer to the "unit ball", we are always referring to the set $\{x : \|x\| < 1\}$, where $\|\cdot\|$ is defined in Section 3. $A \lesssim B$ will always mean $A \le CB$, where $C$ is some constant, independent of any relevant parameters, and $A \approx B$ means $A \lesssim B$ and $B \lesssim A$. Sometimes we will have a sum of positive numbers of the form

\[ \sum_{n \ge 0} a_n \]

and we will have

\[ \sum_{n \ge 0} a_n \lesssim \sum_{n \ge 0} r^n a_0 \]

for some $r$, $0 < r < 1$. In this case we will say the series $\sum_{n \ge 0} a_n$ falls off geometrically or even "is geometric," and we will use the fact that in this case $\sum_{n \ge 0} a_n \approx a_0$.

3 Calderón-Zygmund Operators

In this section, we will remind the reader of the standard theory of Calderón-Zygmund convolution operators on $G$. Our goal is three-fold: first to fix notation, second to present these concepts in a few different ways, each of which will be useful in understanding our more complicated algebra, and finally we will need these characterizations to show that these Calderón-Zygmund operators are in our algebra.

Recall, $G$ is a stratified group, and so, as in the introduction, the Lie algebra $\mathfrak{g} = V_1 \oplus \cdots \oplus V_m$, where the $V_j$ satisfy the relations in the introduction. The exponential map $\exp : \mathfrak{g} \to G$ is a diffeomorphism. We define dilations of $\mathfrak{g}$: for $r > 0$, $r \cdot X = r^j X$ for $X \in V_j$. These dilations induce automorphisms of $G$ by $r e^X = e^{r \cdot X}$. If we identify $G$ with $\mathfrak{g}$ via the exponential map, Lebesgue measure becomes Haar measure for $G$, and $d(rx) = r^Q\, dx$ for some $Q \in \mathbb{N}$. We call $Q$ the "homogeneous dimension" of $G$. For a function $\phi : G \to \mathbb{C}$ and $r > 0$, we define $\phi^{(r)}(x) = r^Q \phi(rx)$. Let $\|\cdot\| : G \to \mathbb{R}^+$ be a smooth homogeneous norm. See [Fol75] for a more in-depth discussion.

For a background on the material presented here, see [Ste93] and [NRS01]. Indeed, we will be following the presentation of "product kernels" from [NRS01] later in this section.

Definition 3.1. A $k$-normalized bump function on $G$ is a $C^k$ function supported on the unit ball with $C^k$ norm bounded by 1. The definitions that follow turn out to not depend in any essential way on $k$, and so we shall speak of normalized bump functions, thereby suppressing the dependence on $k$.

Definition 3.2.

A Calderón-Zygmund kernel on $G$ is a distribution $K$ on $G$, which coincides with a $C^\infty$ function away from 0, and satisfies:

1. (Differential inequalities) For each ordered multi-index $\alpha$, there is a constant $C_\alpha$ so that
\[ \left| \nabla_L^\alpha K(x) \right| \le C_\alpha \|x\|^{-Q - |\alpha|}; \]
one may, equivalently, use $\nabla_R$ in place of $\nabla_L$.
2. (Cancellation conditions) Given any normalized bump function $\phi$, and any $r > 0$,
\[ \int K(x)\, \phi(rx)\, dx \]
is bounded independent of $\phi$ and $r$.

Proposition 3.3.

Let $K$ be a distribution on $G$. Then, $K$ is a Calderón-Zygmund kernel if and only if there exists a sequence $\{\phi_j\}_{j \in \mathbb{Z}} \subset \mathcal{S}_0$, forming a bounded subset of $\mathcal{S}_0$, such that

\[ K = \sum_{j \in \mathbb{Z}} \phi_j^{(2^j)} \]

where this sum is taken in distribution (any such sum converges in distribution). In this case,

\[ \mathrm{Op}_L(K) = \sum_{j \in \mathbb{Z}} \mathrm{Op}_L\left( \phi_j^{(2^j)} \right) \tag{4} \]

where this sum is taken in the strong operator topology as bounded operators on $L^p$ ($1 < p < \infty$). In particular, $\mathrm{Op}_L(K)$ is a bounded operator on $L^p$ ($1 < p < \infty$). In addition, (4) converges in the topology of bounded convergence as operators $\mathcal{S}_0 \to \mathcal{S}_0$. All of the above can be done uniformly over a bounded subset of Calderón-Zygmund kernels. All of the above holds for $\mathrm{Op}_R(K)$ as well.

Proof. This result is essentially contained in the proofs of Theorem 2.2.1, Theorem 2.6.1, and Proposition 2.7.1 of [NRS01]. The only part not appearing in that paper is the convergence in the topology of bounded convergence. This follows in a manner completely analogous to Theorem 6.6. We leave the details to the interested reader.

(Footnote to Definition 3.2: We will abuse notation and write the pairing between distributions and test functions as an integral.)
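As a small complement to Proposition 3.3, the sketch below shows why each dyadic piece is bounded on $L^p$ with a bound that does not depend on $j$. It assumes Young's inequality for convolution on $G$ (valid because $G$ is unimodular) together with the scaling rule $d(rx) = r^Q\,dx$; it is not a substitute for the cited proofs.

```latex
% Sketch: each dyadic piece is bounded on L^p, uniformly in j.
% Assumes Young's inequality on the unimodular group G and d(rx) = r^Q dx.
\begin{align*}
\|\phi^{(r)}\|_{L^1}
  &= \int_G r^Q\, |\phi(rx)| \, dx = \int_G |\phi(u)| \, du = \|\phi\|_{L^1}
  \qquad (u = rx,\ du = r^Q\, dx), \\
\bigl\| \mathrm{Op}_L\bigl(\phi_j^{(2^j)}\bigr) f \bigr\|_{L^p}
  &= \bigl\| f * \phi_j^{(2^j)} \bigr\|_{L^p}
  \le \|f\|_{L^p}\, \bigl\| \phi_j^{(2^j)} \bigr\|_{L^1}
  = \|f\|_{L^p}\, \|\phi_j\|_{L^1}.
\end{align*}
```

This controls the individual terms only. Summing over $j$ requires the cancellation built into the $\phi_j$ (vanishing moments), which is the substance of the results cited in the proof.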

Theorem 3.4.

Let $T : \mathcal{S}_0(G) \to C_0^\infty(G)'$. Then, $T = \mathrm{Op}_L(K)$ (when restricted to $\mathcal{S}_0$), where $K$ is a Calderón-Zygmund kernel, if and only if for every $\phi \in \mathcal{S}_0$ and every $r > 0$,

\[ T\, \mathrm{Op}_L\left( \phi^{(r)} \right) = \mathrm{Op}_L\left( \psi_r^{(r)} \right) \]

where $\psi_r \in \mathcal{S}_0$, and as $\phi$ ranges over a bounded set in $\mathcal{S}_0$, and $r$ ranges over $r > 0$, we have that $\psi_r$ ranges over a bounded set in $\mathcal{S}_0$.

We defer the proof to Section 3.1. Theorem 3.4 should be interpreted in the following way: we think of operators of the form $\mathrm{Op}_L\left(\phi^{(r)}\right)$, with $\phi \in \mathcal{S}_0$, as our "$r$ elementary operators" in analogy with Definition 2.4. Theorem 3.4 simply says that $T$ is a Calderón-Zygmund operator if and only if composition with $T$ takes $r$ elementary operators to $r$ elementary operators uniformly, in analogy with Definition 2.5.

We now turn to an equivalent way of considering Calderón-Zygmund operators that is analogous to Definition 2.2. Let $K$ be a Calderón-Zygmund kernel, let $T = \mathrm{Op}_L(K)$, and let $\phi, \psi$ be normalized bump functions. Define:

\[ \phi^x_r(y) = r^Q\, \phi\left( r \left( y^{-1} x \right) \right) \]

and similarly for $\psi^x_r$. The cancellation condition of Definition 3.2 shows that $\left| T \phi^{x_1}_r \right| \lesssim r^Q$ on $\left\| x_2^{-1} x_1 \right\| \le r^{-1}$. Combining this with the growth condition, one sees:

\[ \left| T \phi^{x_1}_r (x_2) \right| \lesssim \begin{cases} r^Q & \text{if } \left\| x_2^{-1} x_1 \right\| \le r^{-1}, \\ \left\| x_2^{-1} x_1 \right\|^{-Q} & \text{if } \left\| x_2^{-1} x_1 \right\| \ge r^{-1}. \end{cases} \tag{5} \]

Conversely, it is clear that Equation (5) implies the cancellation condition of Definition 3.2. To see that it also implies the growth condition (where there are no derivatives involved), merely choose $\phi$ so that as $r \to \infty$, $\phi^{x_1}_r \to \delta_{x_1}$.

Now suppose $s > r$. We see from (5),

\[ \left| \left\langle \psi^{x_2}_s, T \phi^{x_1}_r \right\rangle_{L^2} \right| \lesssim \begin{cases} r^Q & \text{if } \left\| x_2^{-1} x_1 \right\| \le r^{-1}, \\ \left\| x_2^{-1} x_1 \right\|^{-Q} & \text{if } \left\| x_2^{-1} x_1 \right\| \ge r^{-1} \end{cases} \ \lesssim\ \frac{1}{\left( \frac{1}{r} + \left\| x_2^{-1} x_1 \right\| \right)^Q} \ \lesssim\ \frac{1}{V\left( x_1, \frac{1}{r} + \left\| x_2^{-1} x_1 \right\| \right)} \tag{6} \]

where $V(x, \delta)$ denotes the volume of $\left\{ y : \left\| y^{-1} x \right\| < \delta \right\}$. So we see that (6) is implied by (5). The converse is true as well, as can be seen by taking $\psi$ such that $\psi^{x_2}_s \to \delta_{x_2}$ as $s \to \infty$. From these considerations, the following theorem follows easily:

Theorem 3.5.

Suppose $T : \mathcal{S} \to C_0^\infty(G)'$, and is left invariant. Then, as operators on $\mathcal{S}_0$, $T = \mathrm{Op}_L(K)$ where $K$ is a Calderón-Zygmund kernel if and only if for all $m$, and all $\phi$ and $\psi$ normalized bump functions, and all $\alpha, \beta$ ordered multi-indices such that $|\alpha| + |\beta|$ is sufficiently large depending on $m$, we have:

\[ \left| \left\langle \psi^{x_2}_s, \nabla_L^\alpha\, T\, \nabla_L^\beta\, \phi^{x_1}_r \right\rangle_{L^2} \right| \lesssim \frac{\left( 1 + t \left\| x_2^{-1} x_1 \right\| \right)^{-m}\, t^{|\alpha| + |\beta|}}{V\left( x_1, \frac{1}{t} + \left\| x_2^{-1} x_1 \right\| \right)} \tag{7} \]

where $t = s \wedge r$, and the bound is uniform in $s$, $r$, $x_1$, $x_2$, and choices of normalized bump functions. This is analogous to Definition 2.2.

Remark 3.6. We will see later that using the cancellation condition on both sides simultaneously as in (7) seems to be necessary in our situation. See Section 10.
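The passage from the two-case estimate (5) to the single expression used in (6) and (7) is elementary, and the sketch below spells it out. It assumes the reading of (5) in which the threshold between the two cases is $r^{-1}$, and writes $u$ (a symbol introduced here) for the group element whose norm appears in (5).

```latex
% Sketch: the two-case bound in (5) is dominated by (r^{-1} + \|u\|)^{-Q}.
% u is a placeholder for the group element whose norm appears in (5).
\begin{align*}
\|u\| \le r^{-1} &\implies r^{-1} + \|u\| \le 2 r^{-1}
   \implies r^{Q} \le 2^{Q} \bigl( r^{-1} + \|u\| \bigr)^{-Q}, \\
\|u\| \ge r^{-1} &\implies r^{-1} + \|u\| \le 2 \|u\|
   \implies \|u\|^{-Q} \le 2^{Q} \bigl( r^{-1} + \|u\| \bigr)^{-Q}.
\end{align*}
% By homogeneity and invariance of Haar measure, V(x, \delta) = \delta^{Q} V(x, 1),
% and V(x, 1) does not depend on x, so (r^{-1} + \|u\|)^{-Q} is a constant
% multiple of 1 / V(x, r^{-1} + \|u\|).
```

In other words, on a stratified group the volume factor is just a power of the radius; the formulation in terms of $V$ is what carries over to the metrics $\rho^L_\epsilon$, $\rho^R_\epsilon$.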

Lemma 3.7.

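Given $\phi \in \mathcal{S}_0$, there exists $\psi \in \mathcal{S}_0^n$ such that

\[ \phi = \nabla_L \cdot \psi \tag{8} \]

Moreover, for each continuous semi-norm $|\cdot|$ on $\mathcal{S}_0$, the infimum over all such $\psi$ of $\sum_j |\psi_j|$ is a continuous semi-norm on $\mathcal{S}_0$.

For $\phi \in \mathcal{S}$, we say $\phi \in \mathcal{S}_{(m)}$ if $\phi$ can be written as in (8) with $\psi \in \mathcal{S}^n_{(m-1)}$; where $\mathcal{S}_{(0)} = \mathcal{S}$. Then, $\mathcal{S}_0 = \cap_m \mathcal{S}_{(m)}$.

Proof. This lemma is well known.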
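For orientation, here is the simplest instance of Lemma 3.7, offered as an illustration rather than as the general proof: the abelian group $G = \mathbb{R}$, where $n = 1$ and $\nabla_L = d/dx$. The Fourier transform normalization in the first comment line is an assumption of this sketch.

```latex
% Illustration on G = \mathbb{R}; assumed normalization
%   \hat f(\xi) = \int f(x) e^{-2\pi i x \xi} \, dx.
% \phi \in \mathcal{S}_0 means every moment of \phi vanishes, i.e. \hat\phi
% vanishes to infinite order at \xi = 0.
\begin{align*}
\hat\psi(\xi) &:= \frac{\hat\phi(\xi)}{2\pi i\, \xi}
   \quad \text{(smooth at } 0 \text{, Schwartz, and again vanishing to infinite order at } \xi = 0 \text{)}, \\
\widehat{\psi'}(\xi) &= 2\pi i\, \xi\, \hat\psi(\xi) = \hat\phi(\xi)
   \quad \Longrightarrow \quad \phi = \psi' = \nabla_L \cdot \psi, \qquad \psi \in \mathcal{S}_0 .
\end{align*}
```

Division by $\xi$ is harmless precisely because $\hat\phi$ vanishes to infinite order at the origin, which is why the lemma is stated for functions all of whose moments vanish.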

Lemma 3.8.

Suppose $\phi_1, \phi_2 \in \mathcal{S}_0$, then

\[ \phi_1^{(2^j)} * \phi_2^{(2^k)} = 2^{-|j-k|}\, \psi^{(2^l)} \]

where $l$ can be either $j \wedge k$ or $j \vee k$, $\psi \in \mathcal{S}_0$, and when $\phi_1, \phi_2$ vary over a bounded set of $\mathcal{S}_0$, and $j, k$ vary over $\mathbb{Z}$, $\psi$ varies over a bounded set of $\mathcal{S}_0$.

Proof. We prove the result first in the case $l = j \wedge k$ and $j \le k$; the case $k < j$ is similar. Note that:

\[ \phi_1^{(2^j)} * \phi_2^{(2^k)}(x) = 2^{(j+k)Q} \int \phi_1\left( 2^j \left( x y^{-1} \right) \right) \phi_2\left( 2^k y \right) dy \]

and so replacing $x$ with $2^{-j} x$ and doing a change of variables $u = 2^j y^{-1}$, we see that we can just prove the lemma for $j = 0$, $k \ge 0$. Writing $\phi_2 = \nabla_R \cdot \phi_2'$ as in Lemma 3.7 (using right invariant vector fields, instead of left), we see that

\[ \phi_1 * \phi_2^{(2^k)} = 2^{kQ - k} \int \phi_1\left( x y^{-1} \right) \nabla_R \cdot \phi_2'\left( 2^k y \right) dy = 2^{kQ - k} \int \left( \nabla_R \phi_1 \right)\left( x y^{-1} \right) \cdot \phi_2'\left( 2^k y \right) dy. \]

Repeating this process $Q$ more times, we see:

\[ \phi_1 * \phi_2^{(2^k)} = 2^{-k} \sum_m \int \psi_{1,m}\left( x y^{-1} \right) \psi_{2,m}\left( 2^k y \right) dy \tag{9} \]

where $m$ ranges over a finite set, and the $\psi_{1,m}, \psi_{2,m} \in \mathcal{S}$ (and range over a bounded set as $\phi_1, \phi_2$ do).

To see that $\psi_{1,m} * \left( 2^{-kQ} \psi_{2,m}^{(2^k)} \right)$ is rapidly decreasing (independent of $k$), we now need to merely apply the fact that if $f_1$ and $f_2$ are two bounded rapidly decreasing functions, then $f_1 * f_2$ is rapidly decreasing (note that $\psi_{2,m}\left( 2^k y \right)$ decreases faster than $\psi_{2,m}(y)$).

Since $\nabla_R \left( \phi_1 * \phi_2^{(2^k)} \right) = \left( \nabla_R \phi_1 \right) * \phi_2^{(2^k)}$, we see that $\phi_1 * \phi_2^{(2^k)} \in \mathcal{S}$. To see it is really in $\mathcal{S}_0$, we use the fact that $\phi_1 = \nabla_R \cdot \phi_1'$, and therefore,

\[ \phi_1 * \phi_2^{(2^k)} = \left( \nabla_R \cdot \phi_1' \right) * \phi_2^{(2^k)} = \nabla_R \cdot \left( \phi_1' * \phi_2^{(2^k)} \right); \]

repeating this process and applying Lemma 3.7 completes the proof of the case $l = j \wedge k$.

Turning to the case when $l = j \vee k$, we again assume $j = 0$ and now assume $k > 0$, the other cases being similar. A computation similar to the one leading up to (9) shows that

\[ \phi_1 * \phi_2^{(2^k)} = 2^{-Nk} \sum_m \int \psi_{1,m}\left( x y^{-1} \right) \psi_{2,m}\left( 2^k y \right) dy \tag{10} \]

for any fixed $N$, where the $\psi_{1,m}, \psi_{2,m}$ are as above. Thus, if one wishes to show that

\[ \left\| 2^k x \right\|^N\, \phi_1 * \phi_2^{(2^k)} \tag{11} \]

is bounded by $C\, 2^{(Q-1)k}$, one merely needs to apply (10) and use the fact that

\[ \int \psi_{1,m}\left( x y^{-1} \right) \psi_{2,m}\left( 2^k y \right) dy \]

is rapidly decreasing, as shown above. In fact, we get the stronger result that (11) is bounded independent of $k$. Derivatives work as before, yielding the result.

Proof of Theorem 3.4.

First, suppose that $T = \mathrm{Op}_L(K)$ where $K$ is a Calderón-Zygmund kernel. We prove the result for $r = 2^l$ for some $l \in \mathbb{Z}$. The more general result follows from this by moving $r$ to the closest such $2^l$, via replacing $\phi$ by $\phi\left( \frac{r}{2^l} x \right)$.

Applying Proposition 3.3, we may write:

\[ \mathrm{Op}_L(K) = \sum_{k \in \mathbb{Z}} \mathrm{Op}_L\left( \psi_k^{(2^k)} \right) \]

with the $\psi_k \in \mathcal{S}_0$ uniformly in $k$ (even as $j$ varies). We now apply Lemma 3.8 to see:

\[ \mathrm{Op}_L(K)\, \mathrm{Op}_L\left( \phi^{(2^j)} \right) = \sum_{k \in \mathbb{Z}} \mathrm{Op}_L\left( \psi_k^{(2^k)} \right) \mathrm{Op}_L\left( \phi^{(2^j)} \right) = \sum_{k \in \mathbb{Z}} \mathrm{Op}_L\left( \phi^{(2^j)} * \psi_k^{(2^k)} \right) = \sum_{k \in \mathbb{Z}} 2^{-|k-j|}\, \mathrm{Op}_L\left( \widetilde{\phi}_k^{(2^j)} \right) =: \mathrm{Op}_L\left( \widetilde{\phi}^{(2^j)} \right) \]

where $\widetilde{\phi} \in \mathcal{S}_0$ ranges over a bounded set as the relevant variables change.

Conversely, suppose $T$ satisfies the conditions of the theorem. We wish to show that $T = \mathrm{Op}_L(K)$, where $K$ is a Calderón-Zygmund kernel. We know that $I$ (the identity) is a Calderón-Zygmund operator, and therefore, by Proposition 3.3:

\[ I = \sum_{k \in \mathbb{Z}} \mathrm{Op}_L\left( \phi_k^{(2^k)} \right) \]

with the convergence in the topology of bounded convergence as operators $\mathcal{S}_0 \to \mathcal{S}_0$. Hence,

\[ T = T I = \sum_{k \in \mathbb{Z}} T\, \mathrm{Op}_L\left( \phi_k^{(2^k)} \right) = \sum_{k \in \mathbb{Z}} \mathrm{Op}_L\left( \psi_k^{(2^k)} \right) \]

with the $\psi_k$ forming a bounded subset of $\mathcal{S}_0$. Thus, by Proposition 3.3, $T$ is a Calderón-Zygmund operator.

In this section, we turn to the definition of so-called "product kernels," and use that to define the relevant "two-sided convolution operators" that we will be studying (see (1)). Our main reference for product kernels is [NRS01], and we refer the reader there for any further reading.

Deﬁnition 3.9.

A product kernel on $G \times G$ is a distribution $K(x, y)$ on $G \times G$, which coincides with a $C^\infty$ function away from $\{x = 0\} \cup \{y = 0\}$ and which satisfies:

1. (Differential inequalities) For each pair of ordered multi-indices $\alpha_1, \alpha_2$, there is a constant $C = C(\alpha_1, \alpha_2)$ such that:
\[ \left| \nabla_{L,x}^{\alpha_1} \nabla_{L,y}^{\alpha_2} K(x, y) \right| \le C\, |x|^{-Q - |\alpha_1|}\, |y|^{-Q - |\alpha_2|}; \]
the definition remains unchanged if we replace $\nabla_L$ by $\nabla_R$.
2. (Cancellation conditions) Given any normalized bump function $\phi$ on $G$, and any $R > 0$, the distributions:
\[ K_{\phi,R}(x) = \int K(x, y)\, \phi(Ry)\, dy, \qquad K_{\phi,R}(y) = \int K(x, y)\, \phi(Rx)\, dx \]
are Calderón-Zygmund kernels, uniformly in $\phi$ and $R$.

For such a kernel we define $\mathrm{Op}_L(K)$ and $\mathrm{Op}_R(K)$, acting on functions in $C_0^\infty(G \times G)$, to be convolution with $K$ on the right and left over the group $G \times G$, respectively.

For a function $f : G \times G \to \mathbb{C}$ and $r_1, r_2 > 0$, we define:

\[ f^{(r_1, r_2)}(x, y) = r_1^Q\, r_2^Q\, f(r_1 x, r_2 y). \]

Definition 3.10.

Let $\mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$ denote the set of those functions $f \in \mathcal{S}(G \times G)$ such that for every multi-index $\alpha$,

\[ \int x^\alpha f(x, y)\, dx = 0, \qquad \int y^\alpha f(x, y)\, dy = 0. \]

(Footnote to Definition 3.9: $\int K(x, y)\, \phi(x)\, dx$ denotes that distribution which, when paired with the test function $\psi(y)$, equals $\int K(x, y)\, \phi(x)\, \psi(y)\, dx\, dy$.)

Remark 3.11. We note that $\mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$ is nothing more than the tensor product of the nuclear space $\mathcal{S}_0$ with itself. This explains our notation. See [Trè67] for a background on tensor products. We will not use any deep results about tensor products, however they will provide us with one small convenience. Since the above tensor product agrees with the projective tensor product, when we wish to prove a result about $f \in \mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$, it will often suffice to prove the result for $f(x, y) = \phi_1(x)\, \phi_2(y)$, $\phi_1, \phi_2 \in \mathcal{S}_0$. We will use this fact freely in the sequel.

In particular, each $f(x, y) \in \mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$ can be decomposed:

\[ f = \nabla_{L,x} \cdot \nabla_{R,y} \cdot g \tag{12} \]

where $g \in \left( \mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0 \right)^{n^2}$; this can be easily seen for elementary tensor products by Lemma 3.7, and therefore holds for all elements of $\mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$.

Proposition 3.12.

Let $K$ be a distribution on $G \times G$. Then, $K$ is a product kernel if and only if there exists a sequence $\{\phi_{j,k}\}_{j,k \in \mathbb{Z}} \subset \mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$, forming a bounded subset of $\mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$, such that:

\[ K = \sum_{j,k \in \mathbb{Z}} \phi_{j,k}^{(2^j, 2^k)} \]

where this sum is taken in distribution (any such sum converges in distribution). In this case,

\[ \mathrm{Op}_L(K) = \sum_{j,k \in \mathbb{Z}} \mathrm{Op}_L\left( \phi_{j,k}^{(2^j, 2^k)} \right) \]

where this sum converges in the strong operator topology as maps $L^p(G \times G) \to L^p(G \times G)$, for $1 < p < \infty$. In particular, $\mathrm{Op}_L(K)$ extends to a bounded operator on $L^p(G \times G)$ ($1 < p < \infty$). All of the above can be done uniformly for kernels forming a bounded subset of the product kernels. A similar result holds for $\mathrm{Op}_R(K)$.

Proof. This is essentially contained in the proofs of Theorem 2.2.1, Theorem 2.6.1, and Proposition 2.7.1 of [NRS01]. We leave the details to the interested reader.

Given a product kernel $K$ on $G \times G$, we may define the two-sided convolution operator $\mathrm{Op}_T(K)$ as:

\[ \mathrm{Op}_T(K) f(x) = \int K(y, z)\, f\left( z^{-1} x y^{-1} \right) dy\, dz. \]

To see that this makes sense, for $x$ fixed, after the $z$ integration, the integrand left over is $O\left( \|y\|^{-Q} \right)$ and so converges absolutely for $y$ large. For $y$ small, this makes sense using the usual pairing of distributions and test functions.
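A useful consistency check on this definition is the case of a tensor-product kernel $K(y, z) = K_1(y) K_2(z)$. The sketch below assumes the convention $(f * g)(x) = \int f(x y^{-1}) g(y)\, dy$ and uses that Haar measure on $G$ is invariant under translations and inversion; the names $K_1$, $K_2$ and $g$ are introduced here for illustration only.

```latex
% Sketch: a tensor-product kernel gives a left convolution composed with a right one.
% Assumed convention: (f * g)(x) = \int_G f(x y^{-1}) g(y) \, dy.
\begin{align*}
g(w) &:= (f * K_1)(w) = \int f(w y^{-1})\, K_1(y)\, dy, \\
(K_2 * g)(x) &= \int K_2(x v^{-1})\, g(v)\, dv
   = \int K_2(z)\, g(z^{-1} x)\, dz \qquad (z = x v^{-1}), \\
 &= \iint K_1(y)\, K_2(z)\, f\bigl( z^{-1} x y^{-1} \bigr)\, dy\, dz .
\end{align*}
% Hence Op_R(K_2) Op_L(K_1) f is the two-sided convolution of f with K_1(y) K_2(z):
% the first variable of K acts by convolution on the right, the second on the left.
```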
Note that,

\[ \mathrm{Op}_T\left( K_1(x)\, K_2(y) \right) = \mathrm{Op}_L(K_1)\, \mathrm{Op}_R(K_2) \]

and so two-sided convolution operators contain the right and left invariant Calderón-Zygmund operators as special cases.

Since our algebra $\mathcal{A}$ will contain the two-sided convolution operators corresponding to product kernels, the $L^p$ boundedness ($1 < p < \infty$) of two-sided convolution operators will follow from the $L^p$ boundedness of operators in $\mathcal{A}$. However, we will actually need the $L^p$ boundedness of two-sided convolution operators to prove the $L^p$ boundedness of operators in $\mathcal{A}$, and therefore we now turn to proving the $L^p$ boundedness of two-sided convolution operators.

We now introduce a formal trick that allows us to consider two-sided convolution operators on $G$ as convolution operators on $G \times G$. This may be found in [Kis95]. For a function $f : G \to \mathbb{C}$, we may define a new function $Ef : G \times G \to \mathbb{C}$ by

\[ (Ef)(x, y) = f\left( y^{-1} x \right). \]

Then, a simple computation yields that, whenever it makes sense,

\[ E\, \mathrm{Op}_T(K) f = \mathrm{Op}_L\left( \widetilde{K} \right) Ef \]

where $\widetilde{K}(x, y) = K\left( x, y^{-1} \right)$. Note that $\widetilde{K}$ is a product kernel if and only if $K$ is. Note also that,

\[ E\, \mathrm{Op}_T(K_1)\, \mathrm{Op}_T(K_2) = \mathrm{Op}_L\left( \widetilde{K}_1 \right) \mathrm{Op}_L\left( \widetilde{K}_2 \right) E = \mathrm{Op}_L\left( \widetilde{K}_2 * \widetilde{K}_1 \right) E = E\, \mathrm{Op}_T\left( \widetilde{\widetilde{K}_2 * \widetilde{K}_1} \right) \tag{13} \]

and so two-sided convolution operators form an algebra (since product kernels form an algebra under convolution, see [NRS01]).

Lemma 3.13.

Suppose $K$ is a product kernel and $K$ has compact support. Then, $\mathrm{Op}_T(K)$ extends to a bounded operator on $L^p(G)$, and moreover,

\[ \left\| \mathrm{Op}_T(K) \right\|_{L^p(G) \to L^p(G)} \le \left\| \mathrm{Op}_L\left( \widetilde{K} \right) \right\|_{L^p(G \times G) \to L^p(G \times G)}. \]

Proof.

The proof uses transference. We follow the outline of a similar argument on page 483 of [Ste93], where the proof falls under the heading "method of descent." We suppose that $K(x, y)$ is supported on $\|x\|, \|y\| \le M$. Consider, for $R > 0$:

\[
\begin{aligned}
\left\| \mathrm{Op}_T(K) f \right\|^p_{L^p(G)} &= \left\| \left( E\, \mathrm{Op}_T(K) f \right)(\cdot, x_2) \right\|^p_{L^p(G)}, \quad \forall x_2 \in G \\
&= \left\| \left( \mathrm{Op}_L\left( \widetilde{K} \right) Ef \right)(\cdot, x_2) \right\|^p_{L^p(G)}, \quad \forall x_2 \in G \\
&= \frac{1}{R^Q} \int_{\|x_2\| \le R} \left\| \left( \mathrm{Op}_L\left( \widetilde{K} \right) Ef \right)(\cdot, x_2) \right\|^p_{L^p(G)} dx_2
\end{aligned}
\tag{14}
\]

But,

\[ \left( \mathrm{Op}_L\left( \widetilde{K} \right) Ef \right)(x_1, x_2) = \int \widetilde{K}\left( y^{-1} x_1, z^{-1} x_2 \right) Ef(y, z)\, dy\, dz \]

and since in (14), we are only considering $\|x_2\| \le R$, and by the support of $\widetilde{K}$ only considering $\left\| z^{-1} x_2 \right\| \le M$, we see that we are only integrating over $\|z\| \le M + R$. Hence, in (14) we may replace $Ef(y, z)$ with

\[ F(y, z) := \left( Ef(y, z) \right) \chi_{\{\|z\| \le M + R\}}. \]

Thus,

\[
\begin{aligned}
\left\| \mathrm{Op}_T(K) f \right\|^p_{L^p(G)} &= \frac{1}{R^Q} \int_{\|x_2\| \le R} \left\| \left( \mathrm{Op}_L\left( \widetilde{K} \right) F \right)(\cdot, x_2) \right\|^p_{L^p(G)} dx_2 \\
&\le \frac{1}{R^Q} \left\| \mathrm{Op}_L\left( \widetilde{K} \right) F \right\|^p_{L^p(G \times G)} \\
&\le \frac{1}{R^Q} \left\| \mathrm{Op}_L\left( \widetilde{K} \right) \right\|^p_{L^p(G \times G) \to L^p(G \times G)} \left\| F \right\|^p_{L^p(G \times G)} \\
&\le \frac{1}{R^Q} \left\| \mathrm{Op}_L\left( \widetilde{K} \right) \right\|^p_{L^p(G \times G) \to L^p(G \times G)} \int \left| (Ef)(y, z)\, \chi_{\{\|z\| \le M + R\}} \right|^p dy\, dz \\
&= \frac{1}{R^Q} \left\| \mathrm{Op}_L\left( \widetilde{K} \right) \right\|^p_{L^p(G \times G) \to L^p(G \times G)} (M + R)^Q \left\| f \right\|^p_{L^p(G)} \\
&\xrightarrow[R \to \infty]{} \left\| \mathrm{Op}_L\left( \widetilde{K} \right) \right\|^p_{L^p(G \times G) \to L^p(G \times G)} \left\| f \right\|^p_{L^p(G)}
\end{aligned}
\]

Completing the proof.
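The proof rests on the intertwining identity $E\, \mathrm{Op}_T(K) f = \mathrm{Op}_L(\widetilde{K}) Ef$ stated before the lemma as a "simple computation." The following sketch unpacks it, assuming that $\mathrm{Op}_L$ on $G \times G$ is right convolution with the convention written in the comment lines; the integration variables $u, v, z$ are introduced here.

```latex
% Assumed conventions: (Ef)(x,y) = f(y^{-1} x), \tilde K(x,y) = K(x, y^{-1}),
% and Op_L(\tilde K) F = F * \tilde K on G \times G, where
% (F * H)(p) = \int F(p q^{-1}) H(q) \, dq for p, q \in G \times G.
\begin{align*}
\bigl( \mathrm{Op}_L(\tilde K)\, Ef \bigr)(x, y)
 &= \iint (Ef)\bigl( x u^{-1},\, y v^{-1} \bigr)\, \tilde K(u, v)\, du\, dv \\
 &= \iint f\bigl( v\, y^{-1} x\, u^{-1} \bigr)\, K\bigl( u, v^{-1} \bigr)\, du\, dv \\
 &= \iint f\bigl( z^{-1} (y^{-1} x)\, u^{-1} \bigr)\, K(u, z)\, du\, dz \qquad (z = v^{-1}) \\
 &= \bigl( \mathrm{Op}_T(K) f \bigr)\bigl( y^{-1} x \bigr)
  = \bigl( E\, \mathrm{Op}_T(K) f \bigr)(x, y).
\end{align*}
```

The first equality in (14) then follows because, for fixed second argument, $(E g)(\cdot, x_2)$ is a left translate of $g$, and left translation preserves the $L^p(G)$ norm.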

Corollary 3.14.

Let $K$ be a product kernel, then $\mathrm{Op}_T(K)$ extends uniquely to a bounded operator on $L^p(G)$, $1 < p < \infty$, and

\[ \left\| \mathrm{Op}_T(K) \right\|_{L^p(G) \to L^p(G)} \le \left\| \mathrm{Op}_L\left( \widetilde{K} \right) \right\|_{L^p(G \times G) \to L^p(G \times G)}. \]

Proof.

It suffices to show that for $f \in C_0^\infty(G)$,

\[ \left\| \mathrm{Op}_T(K) f \right\|_{L^p(G)} \le \left\| \mathrm{Op}_L\left( \widetilde{K} \right) \right\|_{L^p(G \times G) \to L^p(G \times G)} \left\| f \right\|_{L^p(G)} \]

and this follows from Lemma 3.13 and a simple limiting argument.

Proposition 3.15.

Suppose $K$ is a product kernel, and suppose that $\{\phi_{j,k}\} \subset \mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$ is a bounded subset such that

\[ K = \sum_{j,k \in \mathbb{Z}} \phi_{j,k}^{(2^j, 2^k)} \]

where this sum is taken in distribution. Then,

\[ \mathrm{Op}_T(K) = \sum_{j,k \in \mathbb{Z}} \mathrm{Op}_T\left( \phi_{j,k}^{(2^j, 2^k)} \right) \]

where this sum converges in the strong operator topology on $L^p(G)$, $1 < p < \infty$.

Proof. Corollary 3.14 tells us that the operators:

\[ \sum_{|j|, |k| \le N} \mathrm{Op}_T\left( \phi_{j,k}^{(2^j, 2^k)} \right) = \mathrm{Op}_T\left( \sum_{|j|, |k| \le N} \phi_{j,k}^{(2^j, 2^k)} \right) \]

are uniformly bounded on $L^p$. It is easy to see that for $f \in C_0^\infty$,

\[ \sum_{j,k \in \mathbb{Z}} \mathrm{Op}_T\left( \phi_{j,k}^{(2^j, 2^k)} \right) f = \mathrm{Op}_T(K) f \]

where this sum is taken in distribution. Putting these two facts together, and using the fact that functions of the form $f = X_L X_R g$, $g \in C_0^\infty$, $X_L = \nabla_L^\alpha$, $|\alpha| = 1$, $X_R = \nabla_R^\beta$, $|\beta| = 1$, span a dense subset of $L^p$, it suffices to show that:

\[ \sum_{j,k \in \mathbb{Z}} \mathrm{Op}_T\left( \phi_{j,k}^{(2^j, 2^k)} \right) f \tag{15} \]

converges in $L^p$, for all such $f$. We separate (15) into four sums, and we first consider:

\[
\begin{aligned}
\sum_{j \ge 0,\, k \ge 0} \mathrm{Op}_T\left( \phi_{j,k}^{(2^j, 2^k)} \right) f &= \sum_{|\alpha| = 1,\, |\beta| = 1}\ \sum_{j \ge 0,\, k \ge 0} \mathrm{Op}_T\left( \left( \nabla_{L,x}^\alpha \nabla_{R,y}^\beta\, \psi_{j,k} \right)^{(2^j, 2^k)} \right) f \\
&= \sum_{\alpha, \beta}\ \sum_{j \ge 0,\, k \ge 0} 2^{-j-k}\, \mathrm{Op}_T\left( \psi_{j,k}^{(2^j, 2^k)} \right) \nabla_L^\alpha \nabla_R^\beta f
\end{aligned}
\tag{16}
\]

here we have applied (12), and $\psi_{j,k}$ also depends on $\alpha$, $\beta$ and ranges over a bounded subset of $\mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$ as $\alpha$, $\beta$, $j$, and $k$ vary. Thus, Corollary 3.14 tells us that (16) converges in $L^p$. We now consider:

\[
\begin{aligned}
\sum_{j < 0,\, k < 0} \mathrm{Op}_T\left( \phi_{j,k}^{(2^j, 2^k)} \right) f &= \sum_{j < 0,\, k < 0} \mathrm{Op}_T\left( \phi_{j,k}^{(2^j, 2^k)} \right) X_L X_R\, g \\
&= \sum_{j < 0,\, k < 0} \mathrm{Op}_T\left( X_L X_R\, \phi_{j,k}^{(2^j, 2^k)} \right) g \\
&= \sum_{j < 0,\, k < 0} 2^{j+k}\, \mathrm{Op}_T\left( \left( X_L X_R\, \phi_{j,k} \right)^{(2^j, 2^k)} \right) g
\end{aligned}
\tag{17}
\]

and since $X_L X_R\, \phi_{j,k}$ ranges over a bounded subset of $\mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$, we again have by Corollary 3.14 that (17) converges in $L^p$. Finally, the sums where $j < 0$, $k \ge 0$ and where $k < 0$, $j \ge 0$ are handled by combining the two arguments above.

Theorem 3.16.

Suppose $T : \mathcal{S}_0 \to C_0^\infty(G)'$. Then, $T = \mathrm{Op}_T(K)$ (when restricted to $\mathcal{S}_0$), where $K$ is a product kernel, if and only if for every $\phi \in \mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$, and every $r_L, r_R > 0$,
$$T\, \mathrm{Op}_T\left(\phi^{(r_L, r_R)}\right) = \mathrm{Op}_T\left(\psi_{r_L, r_R}^{(r_L, r_R)}\right)$$
where $\psi_{r_L, r_R} \in \mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$, and ranges over a bounded subset of $\mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$ as $\phi$ ranges over a bounded subset and $r_L$ and $r_R$ vary.

In this section, we review the metrics defined naturally in terms of a given family of vector fields (often called Carnot-Carathéodory metrics, or sub-Riemannian metrics). Our main references for this section are [NSW85, NS01]; however, we will need to restate many of the results from those papers in a slightly stronger way, though no new proofs will be required. The expert in these topics may skip this section (except, perhaps, for Section 4.1), given the understanding that all the facts we will use about such distances are true uniformly for $\rho^L_\epsilon$, $\rho^R_\epsilon$ for $\epsilon \in [0,1]$, where $\rho^L_\epsilon$, $\rho^R_\epsilon$ were defined in the introduction.

Let $\Omega \subset \mathbb{R}^N$ be a connected open set, and let $Y_1, Y_2, \ldots, Y_q$ be a list (possibly with repetitions) of real $C^\infty$ vector fields on $\Omega$. Associate to each $Y_j$ an integer $d_j \ge 1$, called the formal degree of $Y_j$. Following [NSW85, NS01], we define:

Definition 4.1 (Definition 2.1.1 in [NS01]). The list of vector fields and associated formal degrees $\{(Y_j, d_j)\}$ is said to be of finite homogeneous type on $\Omega$ if:

1. For all $1 \le j, k \le q$,
$$[Y_j, Y_k] = \sum_{d_l \le d_j + d_k} c^l_{j,k}(x)\, Y_l$$
where $c^l_{j,k} \in C^\infty(\Omega)$.
2. At each point $x \in \Omega$, $\{Y_1(x), \ldots, Y_q(x)\}$ spans the tangent space at $x$.

A fundamental example of Definition 4.1 (and the only one we will use) is given by a set of vector fields $X_1, \ldots, X_n$ on $\Omega$ such that all the iterated commutators of length at most $m$ span the tangent space at each point. We take $Y_1, \ldots, Y_q$ to be a list of all these commutators, with the degree of $Y_j$ being the length of the commutator from which it arises.

Definition 4.2 (Definition 2.1.2 of [NS01]). Let $Y = \{(Y_1, d_1), \ldots, (Y_q, d_q)\}$ be a list of vector fields and formal degrees which are of finite homogeneous type on $\Omega$. For each $\delta > 0$, let $C(\delta, Y)$ denote the set of absolutely continuous curves $\phi : [0,1] \to \Omega$ which satisfy:
$$\phi'(t) = \sum_{j=1}^q a_j(t)\, Y_j(\phi(t)) \quad \text{with } |a_j(t)| \le \delta^{d_j}$$
for almost all $t \in [0,1]$. For $x, y \in \Omega$, set
$$\rho_Y(x,y) = \inf\left\{ \delta > 0 \;\middle|\; \left(\exists \phi \in C(\delta, Y)\right)\left(\phi(0) = x,\ \phi(1) = y\right) \right\}$$
The function $\rho_Y$ is called the control metric on $\Omega$, generated by $Y$.

Remark 4.3. If we take $Y$ to be the list of vector fields generated by the left invariant vector fields of order 1 (i.e., we take $\nabla_L^\alpha$, $|\alpha| = 1$, to be the vector fields whose iterated commutators up to some order span the tangent space, and use these to generate $Y$ as discussed above), then the induced metric $\rho_Y(x,y)$ is equivalent to $\|y^{-1}x\|$.
Indeed, it is easy to see that they are both left invariant, and both homogeneous of order 1 with respect to the dilations on the group, and the equivalence then follows from a simple compactness argument.

If $Y = \{(Y_1, d_1), \ldots, (Y_q, d_q)\}$ is of finite homogeneous type, and if $I = (i_1, \ldots, i_N)$ is an ordered $N$-tuple of integers, with each $1 \le i_j \le q$, we define:
$$\lambda^Y_I(x) = \det\left(Y_{i_1}, \ldots, Y_{i_N}\right)(x)$$
where we regard each $Y_i$ as an $N$-tuple of smooth functions, and $\lambda^Y_I$ is then the determinant of the corresponding $N \times N$ matrix. We also set:
$$d(I) = d_{i_1} + \cdots + d_{i_N}$$
and define:
$$\Lambda_Y(x, \delta) = \sum_I \left|\lambda^Y_I(x)\right| \delta^{d(I)}$$

Definition 4.4.

Let $\mathcal{S}$ be a set of lists of vector fields $Y = \{(Y_1, d_1), \ldots, (Y_q, d_q)\}$ of homogeneous type, where $q$ may vary with $Y$. We say $\mathcal{S}$ is bounded if there is a uniform bound for $q$ for all $Y \in \mathcal{S}$, and the following hold:

1. There is an $M$ such that $d_j \le M$ for all formal degrees associated to some $Y \in \mathcal{S}$.
2. The set of vector fields listed in some $Y \in \mathcal{S}$, thought of as sections of $T\Omega$, forms a bounded set in the usual topology of smooth sections of $T\Omega$.
3. The $c^l_{j,k}$ from Definition 4.1 may be chosen from a bounded subset of $C^\infty$ uniformly for $Y \in \mathcal{S}$.
4. For every compact set $K \subset \Omega$, there exists a $c > 0$ such that for all $Y \in \mathcal{S}$ we have $\Lambda_Y(x, 1) \ge c$ for $x \in K$.

The relevance of such bounded sets $\mathcal{S}$ is that many of the results in [NSW85, NS01] hold uniformly for $Y \in \mathcal{S}$, with no changes to the proof. We shall need some of these results and state them below. We remark that these bounded sets are the precompact sets in a natural topology on the set of families of vector fields of homogeneous type, though we do not expound on this further, as it will be of no use to us in the sequel. Fix, for the remainder of this section, such a bounded set $\mathcal{S}$. We will remind the reader of the results from [NSW85, NS01] that we will use, and make explicit their uniformity in $\mathcal{S}$. All of the results in this section follow by merely keeping track of the constants' dependence on $Y$ in [NSW85, NS01].

Remark 4.5. The reader wishing to prove the results in this section may find it useful to recall that the inverse function theorem remains true uniformly for compact subsets of $C^\infty$. I.e., if $R \subset C^\infty$ is a compact set, and if $x_0$ is a point such that for all $f \in R$, the Jacobian determinant of $f$ at $x_0$ is non-zero (and hence, has absolute value bounded below, independent of $f$), then there exists an open neighborhood $U$ (independent of $f$) containing $x_0$ such that for all $f \in R$, $f : U \to f(U)$ is a diffeomorphism.
The essential point here is actually that $R$ is a compact subset of $C^1$.

For a list of vector fields and formal degrees $Y = \{(Y_1, d_1), \ldots, (Y_q, d_q)\}$ of homogeneous type, we define:
$$B_Y(x, \delta) = \{ y : \rho_Y(x,y) < \delta \}$$

Definition 4.6.

We say that two functions ρ , ρ : Ω × Ω → [0 , ∞ ] are locallyequivalent if for every x ∈ Ω there exists an open set U containing x such thatfor every compact set K ⊂⊂ U there is a constant C such that if x , x ∈ K ,1 C ρ ( x , x ) ≤ ρ ( x , x ) ≤ Cρ ( x , x )[NSW85] deﬁnes other pseudo-distances that are locally equivalent to ρ Y ,but can be easier to work with. We remind the reader of two of them that weshall use. The deﬁnition of the ﬁrst is similar to that of ρ Y , but only allowsconstant linear combinations of the vectors Y , . . . , Y q . For δ > C ( δ, Y )denote the class of smooth curves φ : [0 , → Ω such that: φ ′ ( t ) = q X j =1 a j Y j ( φ ( t ))with | a j | < δ d j . Deﬁne: ρ Y, ( x, y ) = inf (cid:26) δ > (cid:18) ∃ φ ∈ C ( δ, Y ) (cid:19)(cid:18) φ (0) = x, φ (1) = y (cid:19)(cid:27) heorem 4.7 (Theorem 2 from [NSW85]) . ρ ,Y is locally equivalent to ρ Y ,with constants that can be chosen uniformly for Y ∈ S . The deﬁnition of the second locally equivalent metric allows us to singleout N of the vector ﬁelds Y i , . . . , Y i N . For each N -tuple I = ( i , . . . , i N ), let C ( δ, I, Y ) denote the class of smooth curves φ : [0 , → Ω such that: φ ′ ( t ) = N X j =1 a j Y i j ( φ ( t ))with | a j | < δ d ( Y ij ). We deﬁne ρ Y, ( x, y ) = inf (cid:26) δ > (cid:18) ∃ I ∃ φ ∈ C ( δ, I, Y ) (cid:19)(cid:18) φ (0) = x, φ (1) = y (cid:19)(cid:27) Theorem 4.8 (Theorem 3 from [NSW85]) . ρ ,Y is locally equivalent to ρ Y ,with constants that can be chosen uniformly for Y ∈ S . 
Moreover, for every x ∈ Ω , there exists an open set U containing x such that for every compactset K ⊂⊂ U we have the following for all Y ∈ S : if for a ﬁxed x ∈ K we have: δ d ( I ) (cid:12)(cid:12) λ YI ( x ) (cid:12)(cid:12) ≥ ǫ Λ Y ( x, δ ) then, there exists a C depending on ǫ and K , but not on Y , such that for every y ∈ K , ρ ,Y ( x, y ) ≥ C inf (cid:26) δ > (cid:18) ∃ φ ∈ C ( δ, I, Y ) (cid:19)(cid:18) φ (0) = x, φ (1) = y (cid:19)(cid:27) Theorem 4.9 (Theorem 1 from [NSW85]) . For every compact set K ⊂⊂ Ω ,there are constants C , C such that for all x ∈ K and all Y ∈ S , < C ≤ | B Y ( x, δ ) | Λ Y ( x, δ ) ≤ C where here, and in the rest of the paper, | E | denotes the Lebesgue measure of E . Corollary 4.10 (Corollary of Theorem 4.9) . For every compact set K ⊂⊂ Ω there is a constant C such that for all Y ∈ S and all x ∈ K , | B ( x, δ ) | ≤ C | B ( x, δ ) | Theorem 4.11 (Lemma 3.1.1 from [NS01]) . Let E ⊂⊂ Ω be compact. Thereexist constants δ , ǫ , σ > such that for each x ∈ E , each < δ < δ , andeach Y ∈ S , there is a function φ = φ x,δ,Y ∈ C ∞ (Ω) such that:1. For all y ∈ Ω , ≤ φ ( y ) ≤ .2. φ ( y ) = 0 when ρ ( x, y ) > σ δ , φ ( y ) = 1 when ρ ( x, y ) < ǫ δ . . For every ordered multi-index α , sup y ∈ Ω |▽ αY φ ( y ) | ≤ C α δ − d ( α ) where ▽ Y = ( Y , . . . , Y q ) , and d ( α ) is the formal degree of ▽ αY with each Y j having formal degree d j .Remark . Actually, the stronger results given by Theorem 3.3.1 and The-orem 3.3.2 of [NS01] are true uniformly for Y ∈ S , but we will not need theseresults. Given two metrics, ρ and ρ , one may deﬁne a third function ρ ◦ ρ : Ω × Ω → [0 , ∞ ] deﬁned by: ρ ◦ ρ ( x, y ) = inf { δ > ∃ z ∈ Ω , ρ ( x, z ) ≤ δ, ρ ( z, y ) ≤ δ } Suppose X = { ( X , c ) , . . . , ( X p , c p ) } and Y = { ( Y , d ) , . . . , ( Y q , d q ) } are twolist of vector ﬁelds that are of ﬁnite homogeneous type. 
Suppose also that $[X_j, Y_k] = 0$ for every $1 \le j \le p$, $1 \le k \le q$.

We will see (under an assumption) that
$$\rho_Y \circ \rho_X = \rho_{X \cup Y}$$
In fact, even without our assumption, our proof works locally. However, the condition that the $X_j$ commute with the $Y_k$ is so restrictive that this is a moot point (see Remark 4.15).

Before we speak about our assumption, a word of notation. If $(Z, d)$ appears in both $X$ and $Y$ we count it as appearing twice in $X \cup Y$; equivalently, we replace $(Z, d)$ with $(2Z, d)$ in $X \cup Y$.

Our assumption is as follows: for every $\delta > 0$, $|a_j(t)| \le \delta$, $1 \le j \le p$, $a_j$ measurable, and every $x \in \Omega$, there exists a unique solution in $\Omega$ to:
$$\phi(0) = x, \qquad \phi'(t) = \sum_{j=1}^p a_j(t)\, X_j(\phi(t))$$
and similarly for the $Y_k$. We denote this solution by the time-ordered exponential (also known as the product integral):
$$\phi(t) = \text{T-exp}\left(\int_0^t \sum_{j=1}^p a_j(s) X_j \, ds\right) x$$
See [DF79, GAV89] for background on product integration.

If $Z(s)$ is a family of vector fields (and is locally integrable), then:
$$\text{T-exp}\left(\int_0^t Z(s)\, ds\right) x = \lim_{N \to \infty} \exp\left(\int_{\frac{N-1}{N}t}^{t} Z(s)\, ds\right) \cdots \exp\left(\int_0^{\frac{t}{N}} Z(s)\, ds\right) x$$
From here we see that if $Z_1(s)$ and $Z_2(r)$ commute for every $s$ and $r$, then:
$$\text{T-exp}\left(\int_0^t Z_1(s) + Z_2(s)\, ds\right) x = \text{T-exp}\left(\int_0^t Z_1(s)\, ds\right) \text{T-exp}\left(\int_0^t Z_2(s)\, ds\right) x \qquad (18)$$

Theorem 4.13.

Under the setup above, we have:
$$\rho_Y \circ \rho_X = \rho_{X \cup Y}$$

Proof. Suppose that $\rho_Y \circ \rho_X(x,y) < \delta$, so that there exists a $z$ with $\rho_X(x,z) < \delta$ and $\rho_Y(z,y) < \delta$. Let $\phi_X, \phi_Y : [0,1] \to \Omega$ be absolutely continuous curves such that $\phi_X(0) = x$, $\phi_X(1) = z$, $\phi_Y(0) = z$, and $\phi_Y(1) = y$, and such that
$$\phi_X(t) = \text{T-exp}\left(\int_0^t \sum_{j=1}^p a_j(s) X_j\, ds\right) x, \qquad \phi_Y(t) = \text{T-exp}\left(\int_0^t \sum_{k=1}^q b_k(s) Y_k\, ds\right) z$$
with $|a_j| < \delta^{c_j}$, $|b_k| < \delta^{d_k}$. But then,
$$\gamma(t) = \text{T-exp}\left(\int_0^t \sum_{j=1}^p a_j(s) X_j + \sum_{k=1}^q b_k(s) Y_k\, ds\right) x$$
is a path from $x$ to $y$. Indeed, by (18),
$$\gamma(1) = \text{T-exp}\left(\int_0^1 \sum_{j=1}^p a_j(s) X_j + \sum_{k=1}^q b_k(s) Y_k\, ds\right) x = \text{T-exp}\left(\int_0^1 \sum_{k=1}^q b_k(s) Y_k\, ds\right) \text{T-exp}\left(\int_0^1 \sum_{j=1}^p a_j(s) X_j\, ds\right) x = \text{T-exp}\left(\int_0^1 \sum_{k=1}^q b_k(s) Y_k\, ds\right) z = y$$
But we also have that:
$$\gamma'(t) = \sum_{j=1}^p a_j(t) X_j(\gamma(t)) + \sum_{k=1}^q b_k(t) Y_k(\gamma(t))$$
and since $|a_j| < \delta^{c_j}$ and $|b_k| < \delta^{d_k}$ we see that $\rho_{X \cup Y}(x,y) < \delta$.

Conversely, suppose $\rho_{X \cup Y}(x,y) < \delta$. Then there is a path of the form:
$$\phi(t) = \text{T-exp}\left(\int_0^t \sum_{j=1}^p a_j(s) X_j + \sum_{k=1}^q b_k(s) Y_k\, ds\right) x$$
with $\phi(1) = y$, $|a_j| < \delta^{c_j}$, $|b_k| < \delta^{d_k}$. Define
$$\phi_X(t) = \text{T-exp}\left(\int_0^t \sum_{j=1}^p a_j(s) X_j\, ds\right) x$$
and let $z = \phi_X(1)$. Define:
$$\phi_Y(t) = \text{T-exp}\left(\int_0^t \sum_{k=1}^q b_k(s) Y_k\, ds\right) z$$
Note that $\phi_Y(1) = \phi(1) = y$ by (18). It is easy to see that $\phi_X \in C(\delta, X)$, $\phi_Y \in C(\delta, Y)$, and it then follows that $\rho_X(x,z) < \delta$ and $\rho_Y(z,y) < \delta$, showing that $\rho_Y \circ \rho_X(x,y) < \delta$ and completing the proof.

Remark 4.14. If one wished to show only that $\rho_Y \circ \rho_X$ is locally equivalent to $\rho_{X \cup Y}$ in Theorem 4.13 (which would be sufficient for our purposes), then the proof is a bit easier. Indeed, Theorem 4.7 would allow us to replace the exponentials with variable coefficients with ones with constant coefficients. Then the same proof yields the result, without any need for time-ordered exponentials, nor the need for our assumption. In spite of this, we believe that the proof of Theorem 4.13 helps to elucidate the situation.

Remark 4.15.
One example of such $X_j$ and $Y_k$ is as follows: let $X_j$ be a spanning set of the right invariant vector fields on some Lie group, and let $Y_k$ be a spanning set of the left invariant vector fields (and we may even restrict them to a small connected open set). It is not hard to see that this is the only example.

$\rho^L_\epsilon$ and $\rho^R_\epsilon$

Given a finite set $F$ of vector fields such that $F$, along with the commutators of all orders of elements of $F$ up to some fixed order $m$:
$$[X_1, [X_2, \ldots, [X_{n-1}, X_n] \ldots ]], \quad n \le m,\ X_j \in F$$
span the tangent space at each point (it is often said that such a set $F$ satisfies Hörmander's condition), we associate a list of vector fields of finite homogeneous type as in Section 4, by taking the list of all commutators up to order $m$ and associating to a commutator of length $k$ degree $k$. That is to say, the elements of $F$ are given degree 1, elements of the form $[X_1, X_2]$ are given degree 2, and so forth. Call this list of vector fields $\mathcal{L}(F)$. From this list of vector fields of finite homogeneous type, we get a metric $\rho_{\mathcal{L}(F)}$.

We define, as in the introduction, for $\epsilon \in [0,1]$:
$$\rho^L_\epsilon = \rho_{\mathcal{L}(\{\nabla_L, \epsilon \nabla_R\})}, \qquad \rho^R_\epsilon = \rho_{\mathcal{L}(\{\nabla_R, \epsilon \nabla_L\})}$$
Here, our set $\Omega$ from Section 4 is the entire group $G$. Now it is easy to see that
$$\left\{ \mathcal{L}(\{\nabla_L, \epsilon \nabla_R\}),\ \mathcal{L}(\{\nabla_R, \epsilon \nabla_L\}) \;\middle|\; \epsilon \in [0,1] \right\}$$
is a bounded set as in Section 4. Thus, all of the theorems from that section hold uniformly for $\epsilon \in [0,1]$. We define $B^L_\epsilon(x, \delta)$ to be the ball of radius $\delta$ centered at $x$ in the $\rho^L_\epsilon$ metric, and we define $V^L_\epsilon(x, \delta)$ to be its volume. Similarly, we define $B^R_\epsilon(x, \delta)$ and $V^R_\epsilon(x, \delta)$. Note that all of the relevant quantities from Section 4 are homogeneous of an appropriate degree. For instance,
$$\rho^L_\epsilon(rx, ry) = r \rho^L_\epsilon(x,y), \qquad V^L_\epsilon(rx, \delta) = r^Q V^L_\epsilon\left(x, \frac{\delta}{r}\right)$$
From such considerations, it is easy to see that all of the results from Section 4 hold globally, instead of locally. That is, many of the results are true on any fixed compact set $E$. Take that compact set $E$ to be the closed unit ball.
Then, to see that the result holds globally, merely scale everything down until it fits into the unit ball, and apply the result on the unit ball. As all the quantities are homogeneous of the proper degrees, this extends the results. In the same manner, we may even take $\delta_0 = \infty$ in Theorem 4.11.

Remark 5.1. We have the following scaling properties of the distances $\rho^L_\epsilon$ and $\rho^R_\epsilon$:
$$r \rho^L_\epsilon(x,y) = \rho^L_\epsilon(rx, ry) = \rho_{\mathcal{L}\left(\left\{\frac{1}{r}\nabla_L,\ \frac{1}{r}\epsilon\nabla_R\right\}\right)}(x,y)$$
and similarly for $\rho^R_\epsilon$. The first equality just follows by homogeneity of the vector fields and was discussed above. To see that the first term equals the last term, note that:
$$C\left(\delta, \mathcal{L}(\nabla_L, \epsilon \nabla_R)\right) = C\left(r\delta, \mathcal{L}\left(\tfrac{1}{r}\nabla_L, \tfrac{1}{r}\epsilon \nabla_R\right)\right)$$
as can be seen directly from the definition.

In this section, we investigate the relationship between $\rho^L_\epsilon$, $\rho^R_\epsilon$ and two-sided convolution operators on $G$. We will use one simplifying piece of notation. For an operator $T$, we write $\mathrm{Ker}(T)(x,y)$ for the Schwartz kernel of $T$ when mapping from the $y$ variable to the $x$ variable.

Define $B := \{x : \|x\| < 1\}$. With $\chi_B$ denoting the characteristic function of $B$, set (for $r_L, r_R > 0$)
$$K_{r_L, r_R}(x,y) = \mathrm{Ker}\left(\mathrm{Op}_L\left(\chi_B^{(r_L)}\right) \mathrm{Op}_R\left(\chi_B^{(r_R)}\right)\right)(x,y) = r_L^Q r_R^Q \int \chi_{\frac{1}{r_L}B}\left(z^{-1}x\right) \chi_{\frac{1}{r_R}B}\left(zy^{-1}\right) dz = r_L^Q r_R^Q \left|\left\{ z : \left\|z^{-1}x\right\| < \frac{1}{r_L},\ \left\|zy^{-1}\right\| < \frac{1}{r_R} \right\}\right| \qquad (19)$$
Recall that $\|z^{-1}x\| \approx \rho^L_0(x,z)$ (see Remark 4.3) and, similarly, $\|zy^{-1}\| \approx \rho^R_0(y,z)$.
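Using that $r_L \rho^L_0(x,y) = \rho_{\mathcal{L}\left(\frac{1}{r_L}\nabla_L\right)}(x,y)$ (and similarly for $r_R \rho^R_0$) (see Remark 5.1), and using (19), we see that there exists a constant $C$ (independent of $r_L$, $r_R$) such that:
$$\left\{ (x,y) : \rho_{\mathcal{L}\left(\frac{1}{r_R}\nabla_R\right)} \circ \rho_{\mathcal{L}\left(\frac{1}{r_L}\nabla_L\right)}(x,y) \le C^{-1} \right\} \subseteq \left\{ (x,y) : K_{r_L, r_R}(x,y) \ne 0 \right\} \subseteq \left\{ (x,y) : \rho_{\mathcal{L}\left(\frac{1}{r_R}\nabla_R\right)} \circ \rho_{\mathcal{L}\left(\frac{1}{r_L}\nabla_L\right)}(x,y) \le C \right\} \qquad (20)$$
Section 4.1 tells us that:
$$\rho_{\mathcal{L}\left(\frac{1}{r_R}\nabla_R\right)} \circ \rho_{\mathcal{L}\left(\frac{1}{r_L}\nabla_L\right)} = \rho_{\mathcal{L}\left(\frac{1}{r_R}\nabla_R\right) \cup \mathcal{L}\left(\frac{1}{r_L}\nabla_L\right)} = \rho_{\mathcal{L}\left(\frac{1}{r_R}\nabla_R,\ \frac{1}{r_L}\nabla_L\right)}$$
while Remark 5.1 tells us that (when $r_L \le r_R$):
$$\rho_{\mathcal{L}\left(\frac{1}{r_R}\nabla_R,\ \frac{1}{r_L}\nabla_L\right)} = r_L\, \rho_{\mathcal{L}\left(\frac{r_L}{r_R}\nabla_R,\ \nabla_L\right)} = r_L\, \rho^L_{\frac{r_L}{r_R}}$$
with a similar result when $r_R \le r_L$. For the remainder of this section, we restrict our attention to the case $r_L \le r_R$, with the understanding that the case $r_R \le r_L$ follows in the same way with completely symmetric arguments.

Putting all of this together, we see that:
$$B^L_{\frac{r_L}{r_R}}\left(x, \frac{1}{C r_L}\right) \subseteq \left\{ y : K_{r_L, r_R}(x,y) \ne 0 \right\} \subseteq B^L_{\frac{r_L}{r_R}}\left(x, \frac{C}{r_L}\right) \qquad (21)$$
Applying Corollary 4.10, we see that:
$$\left|\left\{ y : K_{r_L, r_R}(x,y) \ne 0 \right\}\right| \approx V^L_{\frac{r_L}{r_R}}\left(x, \frac{1}{r_L}\right)$$
Define:
$$M(x, r_L, r_R) = \sup_y K_{r_L, r_R}(x,y)$$
The main result of this section is the following theorem and its corollary:

Theorem 5.2.
$$M(x, r_L, r_R) \approx \frac{1}{V^L_{\frac{r_L}{r_R}}\left(x, \frac{1}{r_L}\right)}$$
Moreover, there is a $\delta_0 > 0$ (independent of $r_L$, $r_R$) such that for all $x$, there exists a $y_0$ with:
$$K_{r_L, r_R}(x,y) \approx \frac{1}{V^L_{\frac{r_L}{r_R}}\left(x, \frac{1}{r_L}\right)}$$
for all $y \in B^L_{\frac{r_L}{r_R}}\left(y_0, \frac{\delta_0}{r_L}\right)$.

Corollary 5.3.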

We may take $y_0 = x$ in Theorem 5.2.

Remark 5.4. One of our main uses for Corollary 5.3 is as follows. Take $\delta_0$ as in the corollary, and set $\chi_0 = \chi_{\frac{1}{\delta_0} B}$. Then, it is easy to see that
$$\phi_0(y) := \mathrm{Ker}\left(\mathrm{Op}_L\left(\chi_0^{(r_L)}\right) \mathrm{Op}_R\left(\chi_0^{(r_R)}\right)\right)(x,y) \gtrsim \frac{1}{V^L_{\frac{r_L}{r_R}}\left(x, \frac{1}{r_L}\right)}$$
for $y \in B^L_{\frac{r_L}{r_R}}\left(x, \frac{1}{r_L}\right)$. Thus, when we wish to bound a function $\phi$ supported on $B^L_{\frac{r_L}{r_R}}\left(x, \frac{1}{r_L}\right)$ and which is $\lesssim 1 / V^L_{\frac{r_L}{r_R}}\left(x, \frac{1}{r_L}\right)$, it suffices to instead bound $\phi_0$.

Lemma 5.5.
$$M(x, r_L, r_R) \approx M\left(x, \frac{r_L}{2}, \frac{r_R}{2}\right)$$

Proof.

It is clear that
$$M(x, r_L, r_R) \le 2^{2Q} M\left(x, \frac{r_L}{2}, \frac{r_R}{2}\right)$$
and so we focus only on the reverse inequality. For the proof of this lemma alone, we drop the assumption that $r_L \le r_R$. Then, it suffices to show that:
$$M\left(x, r_L, \frac{r_R}{2}\right) \lesssim M(x, r_L, r_R)$$
and the remainder of the result will follow by symmetry.

Define $g = \chi_B * \chi_{2B}$. Note that there exists a $c > 0$ such that $g(x) > c$ for $x \in 2B$. Hence, we have:
$$\begin{aligned}
K_{r_L, \frac{r_R}{2}}(x,y) &= 2^{-Q}\, \mathrm{Ker}\left(\mathrm{Op}_L\left(\chi_B^{(r_L)}\right) \mathrm{Op}_R\left(\chi_{2B}^{(r_R)}\right)\right)(x,y) \\
&\lesssim \mathrm{Ker}\left(\mathrm{Op}_L\left(\chi_B^{(r_L)}\right) \mathrm{Op}_R\left(g^{(r_R)}\right)\right)(x,y) \\
&= \mathrm{Ker}\left(\mathrm{Op}_L\left(\chi_B^{(r_L)}\right) \mathrm{Op}_R\left(\chi_B^{(r_R)} * \chi_{2B}^{(r_R)}\right)\right)(x,y) \\
&= \mathrm{Ker}\left(\mathrm{Op}_L\left(\chi_B^{(r_L)}\right) \mathrm{Op}_R\left(\chi_B^{(r_R)}\right) \mathrm{Op}_R\left(\chi_{2B}^{(r_R)}\right)\right)(x,y) \\
&= \int K_{r_L, r_R}(x,z)\, \chi_{2B}^{(r_R)}\left(y^{-1}z\right) dz \\
&\le \int M(x, r_L, r_R)\, \chi_{2B}^{(r_R)}\left(y^{-1}z\right) dz \\
&= M(x, r_L, r_R) \int \chi_{2B}(z)\, dz \approx M(x, r_L, r_R)
\end{aligned}$$
completing the proof.

Let $\phi \in C_0^\infty(G)$ be $\ge 0$, $\le 1$, supported on $2B$, and $= 1$ on $B$. Define:
$$\widetilde{K}_{r_L, r_R} = \mathrm{Ker}\left(\mathrm{Op}_L\left(\phi^{(r_L)}\right) \mathrm{Op}_R\left(\phi^{(r_R)}\right)\right), \qquad \widetilde{M}(x, r_L, r_R) = \sup_y \widetilde{K}_{r_L, r_R}(x,y)$$
so that the definition of $K$ shows:
$$K_{r_L, r_R}(x,y) \le \widetilde{K}_{r_L, r_R}(x,y) \le 2^{2Q} K_{\frac{r_L}{2}, \frac{r_R}{2}}(x,y)$$
and Lemma 5.5 then tells us:
$$M(x, r_L, r_R) \approx \widetilde{M}(x, r_L, r_R)$$

Lemma 5.6.

There exists C > (independent of r L , r R ) such that for any ≥ δ > , and any γ ∈ C (cid:16) δr L , L (cid:16) ▽ L , r L r R ▽ R (cid:17)(cid:17) , we have: (cid:12)(cid:12)(cid:12) e K r L ,r R ( x, γ (1)) − e K r L ,r R ( x, γ (0)) (cid:12)(cid:12)(cid:12) ≤ C δ f M ( x, r L , r R )Before we prove Lemma 5.6, let’s see how it ﬁnishes the proof of Theorem5.2. Fix x and let y be such that: e K r L ,r R ( x, y ) = f M ( x, r L , r R )Then if we take δ = δ = min n C , o in Lemma 5.6, we see that for y ∈ B L rLrR ( y , δ ), e K r L ,r R ( x, y ) ≥ f M ( x, r L , r R )26uppose, for contradiction that f M ( x, r L , r R ) >> V LrLrR “ x, rL ” (here >> justmeans not . ). Then, Z e K r L ,r R ( x, y ) dy ≥ Z y ∈ B LrLrR “ y , δ rL ” e K r L ,r R ( x, y ) dy ≥ Z y ∈ B LrLrR “ y , δ rL ” f M ( x, r L , r R ) dy = 12 f M ( x, r L , r R ) V L rLrR (cid:18) y , δ r L (cid:19) >> V L rLrR (cid:16) x, r L (cid:17) V L rLrR (cid:18) x, r L (cid:19) = 1Here we used that V L rLrR (cid:18) y , δ r L (cid:19) ≈ V L rLrR (cid:18) x, r L (cid:19) (22)as can be seen by the fact that since y is in the support of e K r L ,r R ( x, y ) wemust have ρ L rLrR ( x, y ) . r L , and thus, B L rLrR (cid:18) x, r L (cid:19) ⊂ B L rLrR (cid:18) y , C δ r L (cid:19) B L rLrR (cid:18) y , δ r L (cid:19) ⊂ B L rLrR (cid:18) x, C r L (cid:19) for some C .But we also have, Z e K r L ,r R ( x, y ) dy = Z φ ( x ) dx Z φ ( y ) dy ≈ f M ( x, r L , r R ) . V LrLrR “ x, rL ” . But, f M ( x, r L , r R ) ≈ M ( x, r L , r R ),27nd applying (20) we see: M ( x, r L , r R ) ≥ V L rLrr (cid:16) x, Cr L (cid:17) Z y ∈ B LrLrR “ x, CrL ” K r L ,r R ( x, y ) dy = 1 V L rLrR (cid:16) x, Cr L (cid:17) Z K r L ,r R ( x, y ) dy = 1 V L rLrR (cid:16) x, Cr L (cid:17) ≈ V L rLrR (cid:16) x, r L (cid:17) And so we see that: M ( x, r L , r R ) ≈ f M ( x, r L , r R ) ≈ V L rLrR (cid:16) x, r L (cid:17) proving the ﬁrst part of Theorem 5.2. 
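Moreover, we therefore have for all $y \in B^L_{\frac{r_L}{r_R}}\left(y_0, \frac{\delta_0}{r_L}\right)$,
$$\widetilde{K}_{r_L, r_R}(x,y) \approx \frac{1}{V^L_{\frac{r_L}{r_R}}\left(x, \frac{1}{r_L}\right)}$$
and since:
$$\widetilde{K}_{r_L, r_R}(x,y) \lesssim K_{\frac{r_L}{2}, \frac{r_R}{2}}(x,y)$$
we have for all $y \in B^L_{\frac{r_L}{r_R}}\left(y_0, \frac{\delta_0}{r_L}\right)$,
$$K_{\frac{r_L}{2}, \frac{r_R}{2}}(x,y) \approx \frac{1}{V^L_{\frac{r_L}{r_R}}\left(x, \frac{1}{r_L}\right)} \approx \frac{1}{V^L_{\frac{r_L}{r_R}}\left(x, \frac{2}{r_L}\right)}$$
Dividing $\delta_0$ by 2, this proves the second part of Theorem 5.2 for $K_{\frac{r_L}{2}, \frac{r_R}{2}}$. Now merely replace $r_L$ and $r_R$ with $2r_L$ and $2r_R$ to complete the proof.

Proof of Lemma 5.6.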

We again use time-ordered exponentials, as in Section 4.1;and we again remark that their use is unnecessary, given Theorem 4.7 (seeRemark 4.14), however we shall use them as we believe it adds to the clarity ofthe exposition.Let n(cid:16) Y ( L ) j , d j (cid:17)o = L ( ▽ L ) and let n(cid:16) Y ( R ) j , d j (cid:17)o = L ( ▽ R ) so that n(cid:16) Y ( L ) j , d j (cid:17) , (cid:16) ǫ d j Y ( R ) j , d j (cid:17)o = L ( ▽ L , ǫ ▽ R )We remark that Y ( L ) j f ( rx ) = r d j (cid:16) Y ( L ) j f (cid:17) ( rx ), and similarly for Y ( R ) j . We alsoremark that, if we do our enumeration consistently between right and left, we28ave Y ( R ) j f (cid:0) x − (cid:1) = ( − d j (cid:16) Y ( L ) j f (cid:17) (cid:0) x − (cid:1) . Suppose γ ∈ C (cid:16) δr L , L (cid:16) ▽ L , r L r R ▽ R (cid:17)(cid:17) with γ (0) = z , γ (1) = z ; so that γ ( t ) = T-exp Z t X j a j ( s ) Y ( L ) j + (cid:18) r L r R (cid:19) d j b j ( s ) Y ( R ) j ds  z with | b j | , | a j | < (cid:16) δr L (cid:17) d j .As in the proof of Theorem 4.13, we deﬁne: γ L ( t ) = T-exp Z t X j a j ( s ) Y ( L ) j ds  z γ R ( t ) = T-exp Z t X j (cid:18) r L r R (cid:19) d j b j ( s ) Y ( R ) j ds  γ L (1)so that γ R (1) = z . Consider, ddt e K r L ,r R ( x, γ R ( t )) = r QL r QR ddt Z φ (cid:0) r L (cid:0) y − x (cid:1)(cid:1) φ (cid:16) r R (cid:16) yγ R ( t ) − (cid:17)(cid:17) dy = r QL r QR Z φ (cid:0) r L (cid:0) y − x (cid:1)(cid:1) X j (cid:18) − r L r R (cid:19) d j r d j R b j ( t ) (cid:16) Y ( L ) j φ (cid:17) (cid:16) r R (cid:16) yγ R ( t ) − (cid:17)(cid:17) dy Hence, using that | b j | < (cid:16) δr L (cid:17) d j , and using that φ is a ﬁxed C ∞ functionsupported on 2 B , we see that: (cid:12)(cid:12)(cid:12)(cid:12) ddt e K r L ,r R ( x, γ R ( t )) (cid:12)(cid:12)(cid:12)(cid:12) . r QL r QR Z χ B (cid:0) r L (cid:0) y − (cid:1)(cid:1) δχ B (cid:16) r R (cid:16) yγ R ( t ) − (cid:17)(cid:17) dy . 
δM (cid:16) x, r L , r R (cid:17) ≈ δM ( x, r L , r R )Now, consider: e K r L ,r R ( x, γ L ( t )) = Z φ ( r L ) (cid:0) y − x (cid:1) φ ( r R ) (cid:16) yγ L ( t ) − (cid:17) dy = Z φ ( r L ) (cid:16) γ L ( t ) − y (cid:17) φ ( r R ) (cid:0) xy − (cid:1) dy and then a similar proof to the one above shows that: (cid:12)(cid:12)(cid:12)(cid:12) ddt e K r L ,r R ( x, γ L ( t )) (cid:12)(cid:12)(cid:12)(cid:12) . δM ( x, r L , r R )Completing the proof. 29 roof of Corollary 5.3. It is easy to see thatOp L (cid:16) χ ( r L ) B (cid:17) Op R (cid:16) χ ( r R ) B (cid:17) is a self-adjoint operator, and thus, K r L ,r R ( x, y ) = K r L ,r R ( y, x )If we take y as in Theorem 5.2, then we see:Ker (cid:16) Op L (cid:16) ( χ B ∗ χ B ) ( r L ) (cid:17) Op R (cid:16) ( χ B ∗ χ B ) ( r R ) (cid:17)(cid:17) ( x, x )= Ker (cid:18)h Op L (cid:16) χ ( r L ) B (cid:17) Op R (cid:16) χ ( r R ) B (cid:17)i (cid:19) ( x, x )= Z K r L ,r R ( x, y ) K r L ,r R ( x, y ) dy & Z y ∈ B LrLrR “ y , δ rL ”  V L rLrR (cid:16) x, r L (cid:17)  dy ≈ Z y ∈ B LrLrR “ y , δ rL ” V L rLrR (cid:16) y , δ r L (cid:17) V L rLrR (cid:16) x, r L (cid:17) dy = 1 V L rLrR (cid:16) x, r L (cid:17) where we have applied (22) to get the second to last line.Since χ B ∗ χ B ≤ χ B , we have shown: K rL , rR ( x, x ) & V L rLrR (cid:16) x, r L (cid:17) ≈ V L rLrR (cid:16) x, r L (cid:17) ≈ M (cid:16) x, r L , r R (cid:17) Replacing r L , r R with 4 r L , r R , we see that: e K r L ,r R ( x, x ) ≥ K r L ,r R ( x, x ) ≈ M ( x, r L , r R ) ≈ f M ( x, r L , r R )Lemma 5.6 then tells us that there is a δ > y ∈ B L rLrR ( x, δ ), K rL , rR ( x, y ) ≥ e K r L ,r R ( x, y ) & f M ( x, r L , r R ) ≈ M (cid:16) x, r L , r R (cid:17) proving the result for r L , r R in place of r L , r R .30e close this section with some simple inequalities that will be of use in thesequel. Lemma 5.7.

For r L , r R , r L , r R > , K r L ,r R ( x, z ) ≤ (cid:16) r L r R r L r R (cid:17) Q K r L , r LrRrL ( x, z ) if r L r R ≥ r L r R (cid:16) r R r L r R r L (cid:17) Q K r RrLrR ,r R ( x, z ) if r L r R ≤ r L r R Proof.

This follows directly from the deﬁnition.
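
The monotonicity behind this step can be sketched as follows. The computation assumes the normalization of $K_{r_L,r_R}$ as a constant multiple of a measure, as read off from (19); the auxiliary parameter $s$ and the integration variable $w$ are labels introduced here only for illustration.

```latex
% Sketch: decreasing r_R enlarges the set in (19), at the cost of a factor (r_R/s)^Q.
% Assumption: 0 < s \le r_R, so that 1/r_R \le 1/s.
\begin{align*}
K_{r_L,r_R}(x,z)
  &= r_L^Q r_R^Q \left|\left\{ w : \|w^{-1}x\| < \tfrac{1}{r_L},\ \|wz^{-1}\| < \tfrac{1}{r_R} \right\}\right| \\
  &\le r_L^Q r_R^Q \left|\left\{ w : \|w^{-1}x\| < \tfrac{1}{r_L},\ \|wz^{-1}\| < \tfrac{1}{s} \right\}\right| \\
  &= \left(\frac{r_R}{s}\right)^{Q} K_{r_L,s}(x,z).
\end{align*}
```

The same argument applies in the left parameter. Choosing $s$ so that the ratio $r_L/s$ takes a prescribed value yields a comparison of the type stated in Lemma 5.7.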

Corollary 5.8.

Suppose r L ≤ r R and r L ≤ r R . Then, we have: V Lx (cid:16) r L r R , δ (cid:17) . (cid:16) r L r R r L r R (cid:17) Q V LrLrR ( x,δ ) if r L r R ≥ r L r R (cid:16) r L r R r L r R (cid:17) Q V LrLrR „ x, rRr LrLr R δ « if r L r R ≤ r L r R In the case when r L ≤ r R but r L ≥ r R , we have V Rx (cid:16) r R r L , δ (cid:17) . (cid:16) r L r R r L r R (cid:17) Q V LrLrR „ x, r Rr L δ « if r L r R ≥ r L r R (cid:16) r L r R r L r R (cid:17) Q V LrLrR “ x, rRrL δ ” if r L r R ≤ r L r R (23) Proof.

This follows by combining Corollary 4.10 with Lemma 5.7.
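
For orientation, here is a sketch of how a doubling estimate of the kind used above follows from Theorem 4.9. It assumes that Corollary 4.10 is the doubling bound $|B_Y(x, 2\delta)| \le C |B_Y(x, \delta)|$, and it writes $C_1$, $C_2$ for the two constants of Theorem 4.9 and $M$ for the bound on formal degrees from Definition 4.4; these labels are ours and may differ from the original.

```latex
% Each N-tuple I satisfies d(I) = d_{i_1} + ... + d_{i_N} \le N M,
% since every formal degree is at most M (Definition 4.4).
\begin{align*}
\Lambda_Y(x, 2\delta)
  &= \sum_I \left|\lambda^Y_I(x)\right| (2\delta)^{d(I)}
   \le 2^{NM} \sum_I \left|\lambda^Y_I(x)\right| \delta^{d(I)}
   = 2^{NM} \Lambda_Y(x, \delta), \\
|B_Y(x, 2\delta)|
  &\le C_2\, \Lambda_Y(x, 2\delta)
   \le C_2\, 2^{NM} \Lambda_Y(x, \delta)
   \le \frac{C_2}{C_1}\, 2^{NM}\, |B_Y(x, \delta)|.
\end{align*}
```

Since $C_1$, $C_2$, and $M$ do not depend on $Y \in \mathcal{S}$, neither does the resulting doubling constant.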

We are now in a position to better understand the normalized bump functions from Definition 2.1 and the elementary kernels from Definition 2.4. The intuition for the normalized bump functions is easy to understand. Indeed, if $\phi$ and $\psi$ are two $k$-normalized bump functions in the sense of Definition 3.1, then
$$\mathrm{Ker}\left(\mathrm{Op}_L\left(\phi^{(r_L)}\right) \mathrm{Op}_R\left(\psi^{(r_R)}\right)\right)(x, \cdot)$$
is essentially an $r_L$, $r_R$ normalized bump function centered at $x$ of some order, dependent on $k$. This follows from (21) and Theorem 5.2. Following this analogy, we have:

Lemma 6.1.

Suppose φ xr L ,r R is an r L , r R normalized bump function centeredat x , and ψ is a C ∞ function supported in the unit ball B . Then, Op L (cid:16) φ ( r ′ L ) (cid:17) φ xr L ,r R s a constant times a r L ∧ r ′ L , r R normalized bump function, except perhaps withsupport on a ball with a constant times the radius of the support of a normalizedbump function. The order of Op L (cid:16) φ ( r ′ L ) (cid:17) φ xr L ,r R will depend on the order of φ xr L ,r R in a way implicit in the proof.Proof. The support and bounds of Op L (cid:16) φ ( r ′ L ) (cid:17) φ xr L ,r R are easy to see. Indeed,ﬁxing χ as in Remark 5.4, we see that (cid:12)(cid:12) φ xr L ,r R ( z ) (cid:12)(cid:12) . Ker (cid:16) Op L (cid:16) χ ( r L ) (cid:17) Op R (cid:16) χ ( r R ) (cid:17)(cid:17) ( z, x )Thus, (cid:12)(cid:12)(cid:12) Op L (cid:16) φ ( r ′ L ) (cid:17) φ xr L ,r R ( z ) (cid:12)(cid:12)(cid:12) . Ker (cid:16) Op L (cid:16) φ ( r ′ L ) (cid:17) Op L (cid:16) χ ( r L ) (cid:17) Op R (cid:16) χ ( r R ) (cid:17)(cid:17) ( x, z ) . Ker (cid:16) Op L (cid:16)e χ ( r L ∧ r ′ L ) (cid:17) Op R (cid:16) χ ( r L ) (cid:17)(cid:17) ( x, z )where e χ has some ﬁxed bound and is supported in some ﬁxed ball. Thus thebounds for Op L (cid:16) φ ( r ′ L ) (cid:17) φ xr L ,r R ( z ) follow from (21) and Theorem 5.2. It onlyremains to bound the derivatives of Op L (cid:16) φ ( r ′ L ) (cid:17) φ xr L ,r R ( z ).For ▽ R derivatives, this is easy. Indeed, ▽ αR Op L (cid:16) φ ( r ′ L ) (cid:17) φ xr L ,r R ( z ) = Op L (cid:16) φ ( r ′ L ) (cid:17) ▽ αR φ xr L ,r R ( z )and then the result follows from the deﬁnition of a normalized bump functionand our previous bounds. Similarly, if r ′ L ≤ r L , we have: ▽ αL Op L (cid:16) φ ( r ′ L ) (cid:17) φ xr L ,r R ( z ) = r ′| α | L Op L (cid:16) ▽ αL φ ( r ′ L ) (cid:17) φ xr L ,r R ( z )The only problem that remains is when r L ≤ r ′ L . 
In that case we use thefollowing result: ▽ αL Op L (cid:16) φ ( r ′ L ) (cid:17) = X | β | = | α | Op L (cid:16) φ ( r L ) β (cid:17) ▽ βL (24)where φ β is of the same form as φ . From (24), ▽ L derivatives follow much in thesame way as ▽ R derivatives. We leave the details to the reader. When one takes ▽ L and ▽ R derivatives simultaneously, the result follows from a combinationof the above two methods.(24) is well known, and so to save space and not introduce too much extranotation, we prove it only in the case of the Heisenberg group (see Section 9.1 forthe notation used here). Indeed, suppose ▽ L = ( X L , Y L ) and ▽ R = ( X R , Y R ),32ith [ X L , Y L ] = 4 ∂ t = − [ X R , Y R ], and X L = X R + 4 y∂ t . Then, X L Op L (cid:16) φ ( r ′ L ) (cid:17) = ( X R + 4 yT ) Op L (cid:16) φ ( r ′ L ) (cid:17) = Op L (cid:16) φ ( r ′ L ) (cid:17) X L + 4 1 r ′ L Op L (cid:16) yφ ( r ′ L ) (cid:17) ∂ t = Op L (cid:16) φ ( r ′ L ) (cid:17) X L + 1 r ′ L Op L (cid:16) yφ ( r ′ L ) (cid:17) [ X L , Y L ]= Op L (cid:16) φ ( r ′ L ) (cid:17) X L + Op L (cid:16) X L yφ ( r ′ L ) (cid:17) Y L − Op L (cid:16) Y L yφ ( r ′ L ) (cid:17) X L The proof of the more general result follows in a similar fashion.
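
The scaling identity used in the case $r'_L \le r_L$ can be sketched as follows. The sketch assumes the conventions $\phi^{(r)}(x) = r^Q \phi(rx)$ and $\mathrm{Op}_L(K) f = f * K$ with $(f * K)(x) = \int f(z) K(z^{-1}x)\, dz$, and it writes $X$ for a left invariant vector field of degree 1; the name $X$ and the explicit convolution formula are our assumptions.

```latex
% Step 1: homogeneity of X, i.e. X[\phi(r\,\cdot)](x) = r\,(X\phi)(rx) for degree 1.
% Step 2: left invariance of X lets it pass onto the kernel of the convolution.
\begin{align*}
X\left(\phi^{(r)}\right)(x)
  &= r^Q\, X\left[\phi(r\,\cdot)\right](x)
   = r^Q\, r\, (X\phi)(rx)
   = r\, (X\phi)^{(r)}(x), \\
X\, \mathrm{Op}_L\left(\phi^{(r)}\right) f(x)
  &= \int f(z)\, \left(X \phi^{(r)}\right)\left(z^{-1}x\right) dz
   = r\, \mathrm{Op}_L\left((X\phi)^{(r)}\right) f(x).
\end{align*}
```

Iterating over the components of $\nabla_L^\alpha$, each of degree 1, produces the factor $r^{|\alpha|}$.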

Theorem 6.2.

Suppose φ ∈ S b ⊗S , then Op T (cid:0) φ ( r L ,r R ) (cid:1) is an r L , r R elemen-tary operator. Indeed, this is true uniformly as φ varies over a bounded set, and r L , r R > vary.Proof. By Remark 3.11 it suﬃces to prove the result for elementary tensorproducts. Ie, we replace φ ( x, y ) with φ ( x ) ψ ( y ), where φ, ψ ∈ S . Thus we areconcerned with showing: Op L (cid:16) φ ( r L ) (cid:17) Op R (cid:16) ψ ( r R ) (cid:17) is an r L , r R elementary operator.Since φ, ψ ∈ S , we may apply Lemma 3.7, for every N , N , N , N ∈ N , wemay write, φ = X | α | = N X | α | = N ▽ α L ▽ α R φ α ,α ψ = X | β | = N X | β | = N ▽ β R ▽ β L ψ β ,β with φ α ,α , ψ β ,β ∈ S . Therefore,Op L (cid:16) φ ( r L ) (cid:17) Op R (cid:16) φ ( r R ) (cid:17) = r − N − N L r − N − N R × X | α | = N , | α | = N | β | = N , | β | = N ( − | α | + | β | ▽ α L ▽ β R Op L (cid:16) φ ( r L ) α ,α (cid:17) Op L (cid:16) ψ ( r R ) β ,β (cid:17) ▽ α L ▽ β R and so to show that Op L (cid:0) φ ( r L ) (cid:1) Op R (cid:0) φ ( r R ) (cid:1) is an elementary operator, it suf-ﬁces to show that: Ker (cid:16) Op L (cid:16) φ ( r L ) (cid:17) Op R (cid:16) φ ( r R ) (cid:17)(cid:17) satisﬁes the estimates of (3). For this purpose, it will suﬃce to just assume φ, ψ ∈ S . Moreover, it is easy to reduce the problem to the case when | α | = | α | = | β | = | β | = 0, and so we prove it only in this case, leaving the details tothe reader. Henceforth, we will only need that φ and ψ are rapidly decreasing.33t is easy to see that it suﬃces to prove (3) for r L = 2 j , r R = 2 k for j , k ∈ Z (a completely unnecessary reduction, but it makes notation a little easier).We also assume k ≥ j , the other situation being similar. 
Let χ ( x ) = χ B (2 x )( B is k x k < χ ( x ) = χ ( x ) − χ (2 x ), so that: χ ( x ) + ∞ X j =1 χ (cid:0) − j x (cid:1) = 1Deﬁne φ j by the equation:2 ( − j ) Q φ j (cid:0) − j x (cid:1) = ( χ (cid:0) − j x (cid:1) φ ( x ) if j > χ ( x ) φ ( x ) if j = 0and deﬁne ψ k in a similar manner; so that: ∞ X j =0 φ ( − j ) j = φ ∞ X k =0 ψ ( − k ) k = ψ and φ j ( x ) , ψ k ( x ) are supported where k x k ≤

1. Using that φ is rapidly de-creasing, we see that for any N ∈ N : (cid:12)(cid:12)(cid:12)(cid:12) φ ( j − j ) j ( x ) (cid:12)(cid:12)(cid:12)(cid:12) . j Q (cid:0) j (cid:1) − N χ {k x k≤ j − j } ( x ) . (cid:0) j (cid:1) − N + Q ( j − j ) Q χ B (cid:0) j − j x (cid:1) = (cid:0) j (cid:1) − N ′ χ ( j − j ) B ( x )and similarly, (cid:12)(cid:12)(cid:12)(cid:12) ψ ( k − k ) k ( x ) (cid:12)(cid:12)(cid:12)(cid:12) . (cid:0) k (cid:1) − N ′ χ ( k − k ) B As in Section 5.1, we will use the notation: K r L ,r R = Ker (cid:16) Op L (cid:16) χ ( r L ) B (cid:17) Op R (cid:16) χ ( r R ) B (cid:17)(cid:17) We are ready to compute our main bound (we use (21), Lemma 5.7, and Theo-34em 5.2 freely below):Ker (cid:16) Op L (cid:16) φ ( j ) (cid:17) Op R (cid:16) ψ ( k ) (cid:17)(cid:17) ( x, z )= X j ≥ k ≥ Ker (cid:18) Op L (cid:18) φ ( j − k ) j (cid:19) Op R (cid:18) ψ ( k − k ) k (cid:19)(cid:19) ( x, z ) . X j ≥ k ≥ (cid:0) j (cid:1) − N (cid:0) k (cid:1) − N Ker (cid:18) Op L (cid:18) χ ( j − j ) B (cid:19) Op R (cid:18) χ ( k − k ) B (cid:19)(cid:19) ( x, z ) . X j ≥ k ≥ (cid:0) j (cid:1) − N (cid:0) k (cid:1) − N ( ( j − k ) Q K j − j , k − j ( x, z ) if j ≥ k ( k − j ) Q K j − k , k − k ( x, z ) if k ≥ j . X j ≥ k ≥ (cid:0) j (cid:1) − N (cid:0) k (cid:1) − N  ( j − k ) Q χ  ρL j − k x,z ) . j − j ﬀ V L j − k ( x, j − j ) if j ≥ k ( k − j ) Q χ  ρL j − k x,z ) . k − j ﬀ V L j − k ( x, k − j ) if k ≥ j Both the sum when j ≥ k and the sum when k ≥ j fall oﬀ faster than a geometricseries (for N and N chosen suﬃciently large), and therefore are bounded bytheir ﬁrst term. For the sum when j ≥ k , the ﬁrst term is when k is zero andwhen 2 j − j ≈ ρ L j − k ( x, z ) (or when j = 0 if such a j is less than 0), with asimilar result when k ≥ j . Hence, we see that:Ker (cid:16) Op L (cid:16) φ ( j ) (cid:17) Op R (cid:16) ψ ( k ) (cid:17)(cid:17) ( x, z ) . (cid:0) j ρ L j − k ( x, z ) (cid:1) N V L j − k (cid:0) x, − j + ρ L j − k ( x, z ) (cid:1) for any N ≥

0, completing the proof.

Theorem 6.2 along with Proposition 3.15 shows that every two-sided convolution operator can be decomposed as a sum of elementary kernels. In fact, this will be true for every operator in $\mathcal{A}'$; moreover, this will characterize $\mathcal{A}'$. We devote the rest of this section to proving these facts. Our first step is an analog of Lemma 3.8.

Lemma 6.3.

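Suppose $E_{2^{j_1},2^{k_1}}$ is a $2^{j_1}, 2^{k_1}$ elementary operator, and $E_{2^{j_2},2^{k_2}}$ is a $2^{j_2}, 2^{k_2}$ elementary operator. Then,
\[ E_{2^{j_1},2^{k_1}} E_{2^{j_2},2^{k_2}} = 2^{-|j_1-j_2|-|k_1-k_2|} E_{2^{j_3},2^{k_3}} \]
where $j_3$ can be either $j_1$ or $j_2$ and $k_3$ can be either $k_1$ or $k_2$, and $E_{2^{j_3},2^{k_3}}$ is a $2^{j_3}, 2^{k_3}$ elementary operator uniformly as $j_1, j_2, k_1, k_2$ vary over $\mathbb{Z}$, with constants only depending on the constants for $E_{2^{j_1},2^{k_1}}$ and $E_{2^{j_2},2^{k_2}}$.

Proof. Let $\phi_{j_i,k_i} = \mathrm{Ker}\left(E_{2^{j_i},2^{k_i}}\right)$, $i = 1$,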

2. Thus, we are interested in thefunction φ j ,k ( x, z ) = Z φ j ,k ( x, y ) φ j ,k ( y, z ) dy Suppose, for a moment, that j ≥ j . Then we see, from Deﬁnition 2.4, that Z φ j ,k ( x, z ) = X | α | = N − Nj Z φ j ,j ( x, y ) ▽ αL,y ψ j ,k ,α ( y, z ) dy = X | α | = N N ( j − j ) Z ψ j ,k ,α ( x, y ) φ j ,k ,α ( y, z ) dy where the ψ j i ,k i ,α are uniformly 2 j i , k elementary kernels. Hence, it suﬃcesto consider only terms of the form:2 N ( j − j ) Z ψ j ,k ( x, y ) ψ j ,k ( y, z ) dy where N is any ﬁxed large integer (which may depend on the semi-norm wewish to estimate). Doing the same argument for k , k , we see that it suﬃcesto consider only terms of the form:2 − N | j − j |− N | k − k | Z ψ j ,k ( x, y ) ψ j ,k ( y, z ) dy We proceed in the case when k ≥ j and k ≥ j . The three other casesfollow with only minor changes to the proof, and we leave those details to theinterested reader. Deﬁne χ l ( r ) =  l > l − ≤ r < l ,0 if l > r ≥ l or r < l − ,1 if l = 0 and r < l = 0 and r ≥ N , N can be any two ﬁxed large integers we choose, and we36et χ be as in Remark 5.4: (cid:12)(cid:12)(cid:12)(cid:12)Z ψ j ,k ( x, y ) ψ j ,k ( y, z ) dy (cid:12)(cid:12)(cid:12)(cid:12) ≤ X l ≥ l ≥ (cid:12)(cid:12)(cid:12)(cid:12)Z χ l (cid:0) j ρ L j − k ( x, y ) (cid:1) ψ j ,k ( x, y ) χ l (cid:0) ρ L j − k ( y, z ) (cid:1) ψ j ,k ( y, z ) dy (cid:12)(cid:12)(cid:12)(cid:12) . X Z χ n ρ L j − k ( x,y ) ≤ l − j o χ n ρ L j − k ( y,z ) ≤ l − j o N l V L j − k ( x, l − j )2 N l V L j − k ( x, l − j ) dy . 
X − N l − N l Z Ker (cid:16) Op L (cid:16) χ ( j − l ) (cid:17) Op R (cid:16) χ ( k − l ) (cid:17)(cid:17) ( x, y ) × Ker (cid:16) Op L (cid:16) χ ( j − l ) (cid:17) Op R (cid:16) χ ( k − l ) (cid:17)(cid:17) ( y, z ) dy = X − N l − N l Ker (cid:16) Op L (cid:16) χ ( j − l ) ∗ χ ( j − l ) (cid:17) Op R (cid:16) χ ( k − l ) ∗ χ ( k − l ) (cid:17)(cid:17) ( x, z )= X − N l − N l Ker (cid:18) Op L (cid:16)e χ ( ( j − l ∧ ( j − l ) (cid:17) Op R (cid:18)ee χ ( ( k − l ∧ ( k − l ) (cid:19)(cid:19) ( x, z )(25)where e χ and ee χ are non-negative bounded functions with support in a ﬁxedbounded set, with these bounds independent of j , j , k , k , l , l . Note that wecould have achieved the same left hand side for (25) in the three other caseswhere we allow k < j or k < j or both. We will now drop our assumption k ≥ j , k ≥ j , though we will return to it at the end.Let e B be a large ﬁxed ball containing the support of e χ and ee χ , and deﬁne: e K r ,r ( x, z ) = Ker (cid:16) Op L (cid:16) χ ( r ) e B (cid:17) Op R (cid:16) χ ( r ) e B (cid:17)(cid:17) ( x, z )so that we have:2 − N | j − j |− N | k − k | (cid:12)(cid:12)(cid:12)(cid:12)Z ψ j ,k ( x, y ) ψ j ,k ( y, z ) dy (cid:12)(cid:12)(cid:12)(cid:12) . − N | j − j |− N | k − k | X l ≥ l ≥ − N l − N l e K ( j − l ∧ ( j − l , ( k − l ∧ ( k − l ( x, z )(26)We now proceed in proving the lemma in the case when j = j and k = k .The case when j = j and k = k is completely symmetric. The remainingtwo cases follow by similar arguments, and we leave those proofs to the reader.We also work in the case when k ≥ j , the other case being symmetric.We separate the RHS of (26) into 4 sums: depending on whether j − l ≤ j − l and whether k − l ≤ k − l . The ﬁrst case we deal with is the sum overthose l and l such that j − l ≤ j − l and k − l ≤ k − l . In this case,we need only take N = 1. 
In what follows, we will use (21), Lemma 5.7, and37heorem 5.2 freely (indeed, we will use their analogs for e K which follow fromthe methods in Section 5.1; note the e K we use here diﬀers slightly from the onein Section 5.1). We have:2 −| j − j |−| k − k | X l ,l − N l − N l e K ( j − l , ( k − l ( x, z ) . −| j − j |−| k − k | X l ≥ − N l e K ( j − l , ( k − l ( x, z ) . −| j − j |−| k − k | X l ≥ − N l χ n ρ L j − k ( x,z ) . l − j o V L j − k ( x, l − j )This sum falls oﬀ geometrically, and is therefore bounded by a multiple of itsﬁrst term, which occurs when ρ L j − k ( x, z ) ≈ l − j , or when l = 0 (whichever l is greater). Thus, we have that this sum is: . −| j − j |−| k − k | (cid:0) j ρ L j − k ( x, z ) (cid:1) N V L k − j (cid:0) x, − j + ρ L j − k ( x, z ) (cid:1) which is 2 −| j − j |−| k − k | times the bound for a 2 j , k elementary kernel.We now turn to the case when j − l ≤ j − l and k − l ≤ k − l . As weestimate this case, we will use the fact that we may choose N to be large andthis will allow us to absorb some terms by changing N . When we do this, wewill replace N by N ′ and then by N ′′ , etc. X l ,l − N | j − j |− N | k − k |− N l − N l e K j − l , k − l ( x, z ) . X l − N | j − j |− N | k − k |− N l e K j − l , k − l ( x, z ) . X l − N | j − j |− N | k − k |− N l × ( ( j − k + k − j ) Q e K j − l , j − l k − j ( x, z ) if j − k ≥ j − k ( j − k + k − j ) Q e K k − l j − k , k − l ( x, z ) if j − k ≤ j − k . X l − N ′ | j − j |− N ′ | k − k |− N l ( e K j − l , j − l k − j ( x, z ) e K k − l j − k , k − l ( x, z ) . X l − N ′ | j − j |− N ′ | k − k |− N l  χ  ρL j − k x,z ) . l − j ﬀ V L j − k ( x, l − j ) χ  ρL j − k x,z ) . k − j l − k ﬀ V L j − k ( x, k − j l − k )This sum falls oﬀ geometrically, and is therefore bounded by its ﬁrst term. Thus,38e have: X l ,l − N | j − j |− N | k − k |− N l − N l e K j − l , k − l ( x, z ) . 
− N ′ | j − j |− N ′ | k − k | ×  “ j ρ L j − k ( x,z ) ” N V L j − k “ x, − j + ρ L j − k ( x,z ) ” “ k j − k ρ L j − k ( x,z ) ” N V L j − k “ x, k − j − k + ρ L j − k ( x,z ) ” . − N ′′ | j − j |− N ′′ | k − k | (cid:0) j ρ L j − k ( x, z ) (cid:1) N V L j − k (cid:0) x, − j + ρ L j − k ( x, z ) (cid:1) thereby completing the bound in this case.We now turn to the case when j − l ≤ j − l and k − l ≥ k − l . X l ,l − N | j − j |− N | k − k |− N l − N l e K j − l , k − l ( x, z ) . X l ,l − N | j − j |− N | k − k |− N l − N l ( k − k + l − l ) Q e K j − k k − l , k − l ( x, z )where, when we applied Lemma 5.7, only the latter case ( k − l ≥ k − l )applies. Continuing our bound, we have: . X l − N ′ | j − j |− N ′ | k − k |− N ′ l e K j − k k − l , k − l ( x, z )but this is just the lower case for our computation when k − l ≥ k − l and j − l ≥ j − l . Thus, we have: X l ,l − N | j − j |− N | k − k |− N l − N l e K j − l , k − l ( x, z ) . − N ′′ | j − j |− N ′′ | k − k | (cid:0) j ρ L j − k ( x, z ) (cid:1) N ′ V L j − k (cid:0) x, − j + ρ L j − k ( x, z ) (cid:1) Finally, when j − l ≥ j − l and k − l ≤ k − l , the proof proceeds as inthe previous case, but now one ends up with the upper case for our computationwhen k − l ≥ k − l and j − l ≥ j − l . Putting all of this together, wehave: (cid:12)(cid:12)(cid:12)(cid:12)Z φ j ,k ( x, y ) φ j ,k ( y, z ) dy (cid:12)(cid:12)(cid:12)(cid:12) . −| j − j |−| k − k | (cid:0) j ρ L j − k ( x, z ) (cid:1) N V L j − k (cid:0) x, − j + ρ L j − k ( x, z ) (cid:1) for any N we choose. 39ow let’s turn to derivatives. 
Fix ordered multi-indicies α , β , α , β , andconsider: (cid:12)(cid:12)(cid:12)(cid:12) ▽ α L,x ▽ α L,z ▽ β R,x ▽ β R,z Z φ j ,k ( x, y ) φ j ,k ( y, z ) dy (cid:12)(cid:12)(cid:12)(cid:12) ≤ X (cid:12)(cid:12)(cid:12)(cid:12) ▽ α L,x ▽ α L,z ▽ β R,x ▽ β R,z −| α || k − k |−| β || j − j | Z ψ j ,k ( x, y ) ψ j ,k ( y, z ) dy (cid:12)(cid:12)(cid:12)(cid:12) where this is some ﬁnite sum and the ψ j i ,k i are elementary kernels, as we sawat the start of the proof. Applying the deﬁnition of elementary kernels, we see: (cid:12)(cid:12)(cid:12)(cid:12) ▽ α L,x ▽ α L,z ▽ β R,x ▽ β R,z −| α || k − k |−| β || j − j | Z ψ j ,k ( x, y ) ψ j ,k ( y, z ) dy (cid:12)(cid:12)(cid:12)(cid:12) = 2 j ( | α | + | α | )+ k ( | β | + | β | ) (cid:12)(cid:12)(cid:12)(cid:12)Z e ψ j ,k ( x, y ) e ψ j ,k ( y, z ) dy (cid:12)(cid:12)(cid:12)(cid:12) where the e ψ are also elementary kernels. Hence, our bound for the compositionof two elementary kernels proved above, applied to: (cid:12)(cid:12)(cid:12)(cid:12)Z e ψ j ,k ( x, y ) e ψ j ,k ( y, z ) dy (cid:12)(cid:12)(cid:12)(cid:12) gives the proper bound from the deﬁnition of 2 j , k elementary kernels for: (cid:12)(cid:12)(cid:12)(cid:12) ▽ α L,x ▽ α L,z ▽ β R,x ▽ β R,z Z φ j ,k ( x, y ) φ j ,k ( y, z ) dy (cid:12)(cid:12)(cid:12)(cid:12) Finally, we need to see that we may “pull out” derivatives as in Deﬁnition2.4. Pulling out x derivatives, works easily: Z φ j ,k ( x, y ) φ j ,k ( y, z ) dy = X | α | = N | β | = N − j N − k N ▽ αL,x ▽ βR,x Z ψ j ,k ,α,β ( x, y ) φ j ,k ( y, z ) dy and is therefore 2 − j N − k N times a sum of terms of the same form.Pulling out z derivatives takes one more step. Indeed, ﬁx N and N andsuppose we wish to pull out z left derivatives of order N and right derivativesof order N (as in Deﬁnition 2.4). 
Then consider, Z φ j ,k ( x, y ) φ j ,k ( y, z ) dy = X − N | j − j |− N | k − k | Z ψ j ,k ( x, y ) ψ j ,k ( y, z ) dy = X X | α | = N | β | = N ▽ αL,z ▽ βR,z − N j − N k Z ψ j ,k ( x, y ) ψ j ,k ,α,β ( y, z ) dy Lemma 6.4.

Suppose $\phi$ is an $r_L, r_R$ elementary kernel, $r_L \le r_R$. Then if we define $\psi^{(r_L)}(x) = \phi(x, 0)$, we have that $\psi \in \mathcal{S}$, uniformly for $\phi$ which are uniformly $r_L, r_R$ elementary kernels, with constants independent of $r_L, r_R$. When $r_R \le r_L$, we have: $\psi^{(r_R)}(x) = \phi(x, 0)$ yields $\psi$ uniformly in $\mathcal{S}$.

Proof. This is a simple consequence of the definitions.
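The rescaling in Lemma 6.4 can be illustrated on a toy example. The sketch below is not the setting of the lemma: it assumes the abelian group $(\mathbb{R}, +)$ with $Q = 1$, the dilation convention $f^{(r)}(x) = r^Q f(rx)$ (suggested by the formula $2^{(-j)Q}\phi_j(2^{-j}x)$ used earlier in this section), and a hypothetical convolution-type kernel $\phi(x,z) = r^Q g(r(x-z))$ with $r_L = r_R = r$. The profile `g` and the helper names are illustrative assumptions. The point is only that solving $\psi^{(r)}(x) = \phi(x,0)$ for $\psi$ returns the same profile at every scale, which is the sense of "uniformly" in the lemma.

```python
import math

Q = 1  # homogeneous dimension of the toy group (R, +); an assumption for illustration


def g(x):
    # Hypothetical fixed profile: (x^2 - 1) exp(-x^2 / 2), a Schwartz function with mean zero.
    return (x * x - 1.0) * math.exp(-x * x / 2.0)


def toy_kernel(r):
    # Toy kernel at scale r: phi(x, z) = r^Q * g(r * (x - z)).
    return lambda x, z: (r ** Q) * g(r * (x - z))


def undo_dilation(phi, r):
    # Solve psi^{(r)}(x) = r^Q * psi(r * x) = phi(x, 0) for psi.
    return lambda x: (r ** -Q) * phi(x / r, 0.0)


points = [-2.0, -0.5, 0.0, 0.7, 3.0]
scales = (0.25, 1.0, 8.0)

# For each scale r, sample the recovered profile psi at the same points.
profiles = {
    r: [undo_dilation(toy_kernel(r), r)(x) for x in points]
    for r in scales
}
```

In this toy case every entry of `profiles` agrees with `g` sampled at `points` up to rounding, so the recovered $\psi$ does not depend on $r$.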

Corollary 6.5.

Suppose $\phi \in \mathcal{S}_0$, $l \in \mathbb{Z}$, and $E_{2^j,2^k}$ is a $2^j, 2^k$ elementary kernel. Then,
\[ E_{2^j,2^k}\phi^{(2^l)} = 2^{-|l-j|-|l-k|}\psi^{(2^l)} \tag{27} \]
where $\psi \in \mathcal{S}_0$. As $\phi$ and $E_{2^j,2^k}$ range over bounded sets, so does $\psi$. Moreover, this is true uniformly in $j, k, l$.

Proof. Define $\widetilde{\psi}^{(2^l)} = E_{2^j,2^k}\phi^{(2^l)}$. First, let us see that it will suffice to show that $\widetilde{\psi}$ is rapidly decreasing (uniformly in the relevant parameters). Indeed, suppose we have that it is rapidly decreasing. First, let us see how to obtain the factor $2^{-|l-j|-|l-k|}$ in (27). Consider, in the case $l \le j$,
\[ E_{2^j,2^k}\phi^{(2^l)} = \sum_{|\alpha|=1} 2^{-j}\widetilde{E}_{2^j,2^k,\alpha}\bigtriangledown^{\alpha}_L\phi^{(2^l)} = 2^{l-j}\sum_{|\alpha|=1}\widetilde{E}_{2^j,2^k,\alpha}\widetilde{\phi}^{(2^l)}_{\alpha} \]
and so it is a finite sum of terms of the same form, but now with a factor of $2^{l-j}$ out front.

On the other hand, if $j \le l$, we may apply Lemma 3.7 to see:
\[ E_{2^j,2^k}\phi^{(2^l)} = \sum_{|\alpha|=1} 2^{-l}E_{2^j,2^k}\bigtriangledown^{\alpha}_L\widetilde{\phi}^{(2^l)}_{\alpha} = \sum_{|\alpha|=1} 2^{j-l}\widetilde{E}_{2^j,2^k,\alpha}\widetilde{\phi}^{(2^l)}_{\alpha} \]
and so it is a finite sum of terms of the same form, but now with a factor of $2^{j-l}$ out front. In a similar manner we may obtain a factor of $2^{-|k-l|}$ out front.

Thus we have seen that, given that $\widetilde{\psi}$ is rapidly decreasing, the function $\psi$ in the statement of the corollary is rapidly decreasing, uniformly in the relevant parameters. Let us turn to derivatives of $\psi$. It is easy to see from Definition 2.4 that if $|\alpha| = 1$,
\[ \bigtriangledown^{\alpha}_L E_{2^j,2^k} = \sum_{|\beta|=1}\widetilde{E}_{2^j,2^k}\bigtriangledown^{\beta}_L \tag{28} \]
Hence,
\[ \bigtriangledown^{\alpha}_L E_{2^j,2^k}\phi^{(2^l)} = \sum_{|\beta|=1}\widetilde{E}_{2^j,2^k}\bigtriangledown^{\beta}_L\phi^{(2^l)} = 2^{l}\sum_{|\beta|=1}\widetilde{E}_{2^j,2^k}\widetilde{\phi}^{(2^l)}_{\beta} \]
and so it is $2^l$ times a finite sum of terms of the same form. Thus, $\psi$ behaves properly under derivatives, and we have shown that $\psi$ is uniformly in $\mathcal{S}$.

To see $\psi$ is uniformly in $\mathcal{S}_0$, we need to "pull out" derivatives. However, we merely use the other direction of (28) to see:
\[ E_{2^j,2^k}\phi^{(2^l)} = 2^{-l}\sum_{|\alpha|=1} E_{2^j,2^k}\bigtriangledown^{\alpha}_L\widetilde{\phi}^{(2^l)}_{\alpha} = \sum_{|\alpha|=1}\sum_{|\beta|=1} 2^{-l}\bigtriangledown^{\beta}_L\widetilde{E}_{2^j,2^k,\beta}\widetilde{\phi}^{(2^l)}_{\alpha} \]
and so one can "pull out" derivatives.

Thus, we turn to proving that $\widetilde{\psi}$ is rapidly decreasing.
In fact, by theargument earlier in this proof, it suﬃces to show that for each M , there existsan N such that:2 −| l − j | N −| l − k | N (cid:12)(cid:12)(cid:12) e ψ ( l ) ( x ) (cid:12)(cid:12)(cid:12) . lQ (cid:0) l | x | (cid:1) − M And therefore, it suﬃces to show that if l = min { l, j, k } , then (cid:12)(cid:12) E j , k φ ( x ) (cid:12)(cid:12) . l Q (cid:0) l | x | (cid:1) − M to do this, we will show that if we redeﬁne e ψ to be: e ψ ( l ) = E j , k φ ( l )then we have that e ψ ∈ S uniformly in the relevant parameters. We proceed inthe cases when l = j or l = l . The case when l = k is similar to that when l = j .We will next prove that E j , k Op L (cid:16) φ ( l ) (cid:17) = 2 −| j − l | E l , k where E l , k is a 2 l , k elementary kernel. Then, the result will follow fromthe fact that: E j , k φ ( l ) = Ker (cid:16) E j , k Op L (cid:16) φ ( l ) (cid:17)(cid:17) ( · , I as a right convolution operator. Then,we may apply Proposition 3.3 to see that: I = X k Op R (cid:18) ψ ( k ) k (cid:19) with this sum converging strongly in L . It is easy to see that everything we’redealing with in this proof is continuous on L , hence, E j , k Op L (cid:16) φ ( l ) (cid:17) = E j , k I Op L (cid:16) φ ( l ) (cid:17) = X k E j , k Op R (cid:18) ψ ( k ) k (cid:19) Op L (cid:16) φ ( l ) (cid:17) = X k E j , k E l , k = X k −| j − l |−| k − k | E l , k = 2 −| j − l | E l , k where we have used Lemma 6.3 and Theorem 6.2, completing the proof. Theorem 6.6.

Suppose for each $j, k \in \mathbb{Z}$ we have $E_{2^j,2^k}$ a $2^j, 2^k$ elementary operator, uniformly in $j, k$. Then,
\[ T = \sum_{j,k} E_{2^j,2^k} \tag{29} \]
converges in the topology of bounded convergence as operators $\mathcal{S}_0 \to \mathcal{S}_0$, and also converges in the strong operator topology as bounded operators $L^2 \to L^2$. Moreover, $T \in \mathcal{A}'$. Conversely, every operator in $\mathcal{A}'$ can be decomposed as in (29).

Proof. The convergence of the sum
\[ T = \sum_{j,k} E_{2^j,2^k} \]
in the topology of bounded convergence as operators $\mathcal{S}_0 \to \mathcal{S}_0$ follows directly from Corollary 6.5, thinking of a fixed element $\phi \in \mathcal{S}_0$ as $\phi = \phi^{(2^0)}$. To see that the sum (29) converges in the strong operator topology $L^2 \to L^2$, we apply the Cotlar-Stein lemma. Indeed, the adjoint of a $2^j, 2^k$ elementary operator is again a $2^j, 2^k$ elementary operator (see Remark 2.8), and therefore,
\[ E^{*}_{2^{j_1},2^{k_1}} E_{2^{j_2},2^{k_2}} = 2^{-|j_1-j_2|-|k_1-k_2|}\widetilde{E}_{2^{j_1},2^{k_1}} \]
by Lemma 6.3. Thus, to see that the sum converges in the strong operator topology $L^2 \to L^2$, it suffices to show that the operators $\widetilde{E}_{2^j,2^k}$ are uniformly bounded $L^2 \to L^2$. This follows easily from Lemma 8.1. Alternatively, it is easy to see that the $\widetilde{E}_{2^j,2^k}$ are uniformly NIS operators corresponding to the metrics $\rho^L_{2^{j-k}}$ (or $\rho^R_{2^{k-j}}$ if $k \le j$), which in turn correspond to spaces which are uniformly spaces of homogeneous type in the sense of [DJS85], by the remarks in Section 4. Thus the uniform $L^2$ boundedness for $\widetilde{E}_{2^j,2^k}$ follows by the usual proofs that NIS operators are bounded on $L^2$ (see [Koe02, NRS01]).

To see $T \in \mathcal{A}'$, note that:
\[ T E_{2^{j_0},2^{k_0}} = \sum_{j,k} E_{2^j,2^k} E_{2^{j_0},2^{k_0}} = \sum_{j,k} 2^{-|j-j_0|-|k-k_0|} E_{2^{j_0},2^{k_0}} = E_{2^{j_0},2^{k_0}} \]
where we have applied Lemma 6.3, and where each $E_{2^{j_0},2^{k_0}}$ on the right denotes a (possibly different) $2^{j_0}, 2^{k_0}$ elementary operator. For more general $r_L, r_R$ elementary operators, merely think of an $r_L, r_R$ elementary operator as a $2^{j_0}, 2^{k_0}$ elementary operator, where $j_0, k_0$ are chosen to minimize $\left|2^{j_0} - r_L\right| + \left|2^{k_0} - r_R\right|$.

For the converse, suppose $T \in \mathcal{A}'$.
Thinking of the identity $I$ as a two-sided convolution operator, we may apply Proposition 3.15 to see that:
\[ I = \sum_{j,k \in \mathbb{Z}} \mathrm{Op}_T\left(\phi_{j,k}^{(2^j,2^k)}\right) \]
where $\{\phi_{j,k}\} \subset \mathcal{S}_0 \widehat{\otimes} \mathcal{S}_0$ is a bounded set. Applying Theorem 6.2, we see:
\[ I = \sum_{j,k \in \mathbb{Z}} E_{2^j,2^k} \]
(where this sum converges strongly in $L^2$, and as we have seen earlier in the proof, in the topology of bounded convergence $\mathcal{S}_0 \to \mathcal{S}_0$). Hence, we see:
\[ T = TI = \sum_{j,k \in \mathbb{Z}} T E_{2^j,2^k} = \sum_{j,k \in \mathbb{Z}} \widetilde{E}_{2^j,2^k} \]
completing the proof.

Corollary 6.7.

Let $K$ be a product kernel. Then, $\mathrm{Op}_T(K) \in \mathcal{A}'$.

Proof. This is a combination of Proposition 3.15 and Theorems 6.2 and 6.6.
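The almost-orthogonality step in the proof of Theorem 6.6 can be made quantitative. In one standard form, the Cotlar-Stein lemma bounds the norm of a sum of operators by the supremum over rows of $\sum \|E_a^{*}E_b\|^{1/2}$ (together with the same quantity for $E_a E_b^{*}$). If these norms are at most $C\,2^{-|j_1-j_2|-|k_1-k_2|}$, each row sum is $C^{1/2}\left(\sum_{n\in\mathbb{Z}} 2^{-|n|/2}\right)^2 = C^{1/2}(3+2\sqrt{2})^2$, roughly $33.97\,C^{1/2}$. The following Python sketch checks this arithmetic with a truncated sum; the constant `C`, the truncation `cutoff`, and the function names are assumptions made for illustration and are not taken from the paper.

```python
import math


def one_dimensional_sum(cutoff):
    # Truncation of sum over n in Z of 2^{-|n|/2}; the tail beyond the cutoff is negligible.
    return sum(2.0 ** (-abs(n) / 2.0) for n in range(-cutoff, cutoff + 1))


def cotlar_stein_row_sum(C=1.0, cutoff=200):
    # Row sum of sqrt(C * 2^{-|m| - |n|}) over (m, n) in Z^2.
    # The summand factors, so the double sum is the square of the one-dimensional sum.
    return math.sqrt(C) * one_dimensional_sum(cutoff) ** 2


# Closed form: sum over n of 2^{-|n|/2} equals 1 + 2 / (sqrt(2) - 1) = 3 + 2 * sqrt(2).
closed_form = (3.0 + 2.0 * math.sqrt(2.0)) ** 2
row_sum = cotlar_stein_row_sum()
```

The bound is independent of the row $(j_1, k_1)$, which is what the lemma needs; only the uniform constant $C$ (the uniform $L^2$ bound on the operators $\widetilde{E}$) enters.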

Equivalence of A and A ′ In this section, we show that A and A ′ are the same spaces. To begin with, wewill need a better understanding of the function B deﬁned in Section 2 by: B ( r L , r R , N L , N R , m, x, y )= r N L L r N R R (cid:18) r L ρ L rLrR ( x, y ) (cid:19) − m V L rLrR (cid:18) x, r L + ρ L rLrR ( x, y ) (cid:19) Lemma 7.1. If N L , N R ≥ Q + m + 1 , we have, X j ≤ j k ≤ k B (cid:0) j , k , N L , N R , m, x, y (cid:1) ≈ B (cid:0) j , k , N L , N R , m, x, y (cid:1) (30) and as a simple corollary: X j ≤ j B (cid:0) j , k , N L , N R , m, x, y (cid:1) ≈ B (cid:0) j , k , N L , N R , m, x, y (cid:1) Proof. & is clear, and so we focus on . . Without loss of generality, we mayassume j ≤ k . We separate the sum (30) into the sum when j ≤ k and thesum when k ≤ j . We consider, ﬁrst, the easier case when j ≤ k : X j ≤ kj ≤ j k ≤ k B ( j, k, N L , N R , m, x, y ) = X N L j + N R k (cid:0) j ρ L j − k ( x, y ) (cid:1) − m V L j − k (cid:0) x, − j + ρ L j − k ( x, y ) (cid:1) = X N L j + N R k (cid:0) j − j j ρ L j − k k − k + j − j ( x, y ) (cid:1) − m V L j − k (cid:0) x, j − j − j + ρ L j − k k − k + j − j ( x, y ) (cid:1) we now use the elementary fact that ρ Lǫ ≤ ρ L ǫ ≤ ρ Lǫ , to see: . X N L j + N R k (cid:0) j − j + k − k j ρ L j − k ( x, y ) (cid:1) − m V L j − k (cid:0) x, k − k (cid:0) − j + ρ L j − k ( x, y ) (cid:1)(cid:1) Applying Corollary 5.8, we have: . P N L j + N R k +( j − j + k − k ) Q “ j − j k − k j ρ L j − k ( x,y ) ” − m V L j − k “ x, k − k “ − j + ρ L j − k ( x,y ) ”” if j − k ≥ j − k P N L j + N R k +( j − j + k − k ) Q “ j − j k − k j ρ L j − k ( x,y ) ” − m V L j − k “ x, j − j “ − j + ρ L j − k ( x,y ) ”” if j − k ≤ j − k But for δ < V L j − k ( x, δr ) & δ Q V L j − k ( x, r ) (this is a consequence of Theorem4.9), and so we have, . X N L j + N R k +( j − j + k − k ) Q (cid:0) j − j + k − k j ρ L j − k ( x, y ) (cid:1) − m V L j − k (cid:0) x, − j + ρ L j − k ( x, y ) (cid:1) . 
B (cid:0) j , k , N L , N R , m, x, y (cid:1) N L , N R ≥ Q + m + 1.We now turn to the sum when k ≤ j : X k ≤ jj ≤ j k ≤ k B ( j, k, N L , N R , m, x, y ) = X N L j + N R k (cid:0) k ρ R k − j ( x, y ) (cid:1) − m V R k − j (cid:0) x, − k + ρ R k − j ( x, y ) (cid:1) (31)Using the fact that ρ Rǫ = ǫρ L ǫ (here we have removed the restriction ǫ ≤

1) wesee: ρ R k − j = ρ R k − j j − j + k − k ≥ j − j ρ R k − j = 2 k − j ρ L j − k Plugging this into (31), we see that (31) is . X N L j + N R k (cid:0) k − j + k − j j ρ L j − k ( x, y ) (cid:1) − m V R k − j (cid:0) x, − k + 2 k − j ρ L j − k ( x, y ) (cid:1) using that j ≤ k , . X N L j + N R k (cid:0) j − j j ρ L j − k ( x, y ) (cid:1) − m V R k − j (cid:0) x, − k + 2 k − j ρ L j − k ( x, y ) (cid:1) Applying Corollary 5.8 and using the fact that the indicies we are summing oversatisfy k − k ≥ j − j and so we are in the lower case of (23), and thus . X N L j + N R k +( k − k + j − j ) Q (cid:0) j − j j ρ L j − k ( x, y ) (cid:1) − m V L j − k (cid:0) x, k − k − j + 2 k − j + k − j ρ L j − k ( x, y ) (cid:1) using that j ≤ j ≤ k , we see: . X N L j + N R k +( k − k ) Q (cid:0) j − j j ρ L j − k ( x, y ) (cid:1) − m V L j − k (cid:0) x, − j + ρ L j − k ( x, y ) (cid:1) . B ( j , k , N L , N R , m, x, y )provided N L , N R ≥ Q + m + 1, completing the proof of the ﬁrst estimate. Thesecond estimate follows as a simple corollary. Remark . In our proof that A ′ ⊆ A , we will see that the only reason weneed to take N large in Deﬁnition 2.2 is so that we may apply Lemma 7.1.Because of this, once we show that A = A ′ , we will see that we may replace N in Deﬁnition 2.2 by Q + m + 1. Lemma 7.3.

Suppose φ x j , k is a normalized j , k bump function of order centered at x , φ z j , k is a normalized j , k bump function of order centeredat z , and E j , k is a j , k elementary operator. Then, (cid:12)(cid:12)(cid:12)D φ x j , k , E j , k φ z j , k E L (cid:12)(cid:12)(cid:12) . B (cid:0) j ∧ j ∧ j , k ∧ k ∧ k , , , m, x, z (cid:1) with constants uniform in all the relevant parameters. roof. Let j = j ∧ j , and k = k ∧ k . We prove the result in the case when k − k ≥ j − j , the other case being similar. We also proceed in the case when k ≥ j , though the proof is essentially independent of this choice. Consider,letting χ be as in Remark 5.4, (cid:12)(cid:12)(cid:12)D φ x j , k , E j , k φ z j , k E L (cid:12)(cid:12)(cid:12) . X l ≥ Z ρ L j − k ( y ,y ) ≈ l − j φ x j , k ( y ) 2 − lN V L j − k ( y , l − j ) φ z j , k ( y ) dy dy . X l ≥ − lN Ker (cid:18) Op L (cid:16) χ ( j ) (cid:17) Op R (cid:16) χ ( k ) (cid:17) Op L (cid:16) χ ( j − l ) (cid:17) Op R (cid:16) χ ( k − l ) (cid:17) × Op L (cid:16) χ ( j ) (cid:17) Op R (cid:16) χ ( k ) (cid:17) (cid:19) ( x, z ) (32)at this point we may drop the assumption k ≥ j , and note that we could havejust as easily shown (32) in the case k ≤ j . In the above N is any ﬁxed integerwe choose, obtained from the rapid decrease of Ker (cid:0) E j , k (cid:1) . Rearrangingterms, and using that, for instance, χ ( j ) ∗ χ ( j − l ) ∗ χ ( j ) = e χ ( j ∧ ( j − l ) )where e χ is a bounded function of bounded support, with bounds independentof all the relevant parameters above, we see that the left hand side of (32) is . X l ≥ − lN Ker (cid:18) Op L (cid:16) e χ ( j ∧ ( j − l ) ) (cid:17) Op R (cid:18)ee χ ( k ∧ ( k − l ) ) (cid:19)(cid:19) ( x, z ) . X l ≥ − lN B (cid:16) j ∧ ( j − l ) , k ∧ ( k − l ) , , , m, x, z (cid:17) where we have applied (21) and Theorem 5.2. We separate this sum into threesums. The ﬁrst: X ≤ l ≤ k − k − lN B (cid:0) j , k , , , m, x, z (cid:1) . 
B (cid:0) j , k , , , m, x, z (cid:1) with this sum = 0 if k > k . The second: X [( k − k ) ∨ ≤ l ≤ j − j − lN B (cid:0) j , k − l , , , m, x, z (cid:1) = X [( k − k ) ∨ ≤ l ≤ j − j − k N B (cid:0) j , k − l , , N, m, x, z (cid:1) ≈ − k N B (cid:0) j , k ∧ k , , N, m, x, z (cid:1) . B (cid:0) j , k ∧ k , , , m, x, z (cid:1) j > j , and we have applied Lemma 7.1 and we have usedthat we may take N large. Finally, X [( j − j ) ∨ ≤ l − lN B (cid:0) j − l , k − l , m, , , x, y (cid:1) = X [( j − j ) ∨ ≤ l − j N/ − k N/ B (cid:18) j − l , k − l , m, N , N , m, x, y (cid:19) ≤ X [( j − j ) ∨ ≤ l [( k − k ) ∨ ≤ l − j N/ − k N/ B (cid:18) j − l , k − l , N , N , m, x, y (cid:19) . − j N/ − k N/ B (cid:18) j ∧ j , k ∧ k , N , N , m, x, y (cid:19) . B (cid:0) j ∧ j , k ∧ k , , , m, x, y (cid:1) where again we have taken N large and applied Lemma 7.1. Corollary 7.4.

Suppose φ x j , k is a normalized j , k bump function centeredat x , φ z j , k is a normalized j , k bump function centered at z (each of somelarge order, how large will be implicit in the proof ), and E j , k is a j , k elementary operator. Then, (cid:12)(cid:12)(cid:12)D φ x j , k , ▽ α L ▽ β R E j , k ▽ α L ▽ β R φ z j , k E L (cid:12)(cid:12)(cid:12) . (( j ∧ j ) − j ) ∧ k ∧ k ) − k ) ∧ × B (cid:0) j ∧ j ∧ j , k ∧ k ∧ k , | α | + | α | , | β | + | β | , m, x, z (cid:1) with constants uniform in all the relevant parameters.Proof. Let j = j ∧ j and k = k ∧ k . We ﬁrst prove the result without thefactor of: 2 ( j − j ) ∧ k − k ) ∧ Suppose that j = j ∧ j . Then, we have: D φ x j , k , ▽ α L ▽ β R E j , k ▽ α L ▽ β R φ z j , k E L = X | α | = | α | D φ x j , k , ▽ β R e E j , k ,α ▽ α L ▽ α L ▽ β R φ z j , k E L = 2 j ( | α | + | α | ) X | α | = | α | D φ x j , k , ▽ β R e E j , k ,α ▽ β R e φ z j , k ,α E L a ﬁnite sum of terms of the same form but with | α | = 0 = | α | , times2 j ( | α | + | α | ) . We get a similar result when j = j ∧ j . Finally, when j = j ∧ j ,we merely let all the ▽ L derivatives land on the E j , k , D φ x j , k , ▽ α L ▽ β R E j , k ▽ α L ▽ β R φ z j , k E L = 2 j ( | α | + | α | ) D φ x j , k , ▽ β R e E j , k ▽ β R φ z j , k E L k s, we see: (cid:12)(cid:12)(cid:12)D φ x j , k , ▽ α L ▽ β R E j , k ▽ α L ▽ β R φ z j , k E L (cid:12)(cid:12)(cid:12) ≤ j ∧ j ( | α | + | α | ) k ∧ k ( | β | + | β | ) X De φ x j , k , e E j , k e φ z j , k E L where the sum denotes a ﬁnite sum of such terms. Applying Lemma 7.3, wesee: . j ∧ j ( | α | + | α | ) k ∧ k ( | β | + | β | ) B (cid:0) j ∧ j , k ∧ k , , , m, x, z (cid:1) = B (cid:0) j ∧ j , k ∧ k , | α | + | α | , | β | + | β | , m, x, z (cid:1) Which completes the proof, without the factor of 2 ( j − j ) ∧ k − k ) ∧ . To seehow to obtain that factor, suppose we are in the case when j = j ≤ j . 
Thenwe “pull derivatives out” of E j , k and let them land on φ x j , k ; indeed, D φ x j , k , ▽ α L ▽ β R E j , k ▽ α L ▽ β R φ z j , k E L = X | α | =1 − j D φ x j , k , ▽ α L ▽ αL ▽ β R E j , k ▽ α L ▽ β R φ z j , k E L = X | α | = | α | j − j D e φ x j , k ,α , ▽ α L ▽ β R E j , k ▽ α L ▽ β R φ z j , k E L which is 2 j − j = 2 j − j times a ﬁnite sum of terms of the original form. Asimilar proof works for when j = j and for the k s. Theorem 7.5.

Suppose $T \in \mathcal{A}'$, then $T \in \mathcal{A}$.

Proof. We apply Theorem 6.6 to decompose $T$:
\[ T = \sum_{j,k \in \mathbb{Z}} E_{2^j,2^k} \]
where $E_{2^j,2^k}$ are uniformly $2^j, 2^k$ elementary operators, and this sum converges in the strong operator topology as operators $L^2 \to L^2$. Fix $m$ and fix $N_L, N_R \ge Q + m + 1$. Suppose $\phi^x_{2^{j_1},2^{k_1}}$ is a normalized $2^{j_1}, 2^{k_1}$ bump function centered at $x$, $\phi^z_{2^{j_2},2^{k_2}}$ is a normalized $2^{j_2}, 2^{k_2}$ bump function centered at $z$ (each of some large order), and suppose that $|\alpha_1| + |\alpha_2| = N_L$, $|\beta_1| + |\beta_2| = N_R$. Then, letting $j_0 = j_1 \wedge j_2$, $k_0 = k_1 \wedge k_2$, we see:
\begin{align*}
\left|\left\langle \phi^x_{2^{j_1},2^{k_1}}, \bigtriangledown^{\alpha_1}_L\bigtriangledown^{\beta_1}_R T \bigtriangledown^{\alpha_2}_L\bigtriangledown^{\beta_2}_R \phi^z_{2^{j_2},2^{k_2}} \right\rangle_{L^2}\right|
&\le \sum_{j,k}\left|\left\langle \phi^x_{2^{j_1},2^{k_1}}, \bigtriangledown^{\alpha_1}_L\bigtriangledown^{\beta_1}_R E_{2^j,2^k} \bigtriangledown^{\alpha_2}_L\bigtriangledown^{\beta_2}_R \phi^z_{2^{j_2},2^{k_2}} \right\rangle_{L^2}\right| \\
&\lesssim \sum_{j,k} 2^{((j_0-j)\wedge 0) + ((k_0-k)\wedge 0)} B\left(2^{j \wedge j_0}, 2^{k \wedge k_0}, N_L, N_R, m, x, z\right) \\
&\lesssim B\left(2^{j_0}, 2^{k_0}, N_L, N_R, m, x, z\right)
\end{align*}
where we have applied Lemma 7.1 to get the last line, completing the proof.

We now turn to showing that $\mathcal{A} \subset \mathcal{A}'$. Fix $T \in \mathcal{A}$. Then, we wish to show that $T E_{r_L,r_R} = \widetilde{E}_{r_L,r_R}$. As we have seen before, it will suffice to prove this result for $r_L = 2^j$, $r_R = 2^k$. This follows by choosing $j, k$ to minimize $\left|r_L - 2^j\right| + \left|r_R - 2^k\right|$.

Lemma 7.6.

Given $r_L, r_R > 0$, $y \in G$, $m \ge 0$, there exists an $N$ such that if $N_L, N_R \ge N$, and $|\alpha_1| + |\alpha_2| = N_L$, $|\beta_1| + |\beta_2| = N_R$, and $\phi^y_{r_L,r_R}$ is an $r_L, r_R$ bump function centered at $y$,
\[ \left|\bigtriangledown^{\alpha_1}_L\bigtriangledown^{\beta_1}_R T \bigtriangledown^{\alpha_2}_L\bigtriangledown^{\beta_2}_R \phi^y_{r_L,r_R}(x)\right| \lesssim B\left(r_L, r_R, N_L, N_R, m, x, y\right) \]

Proof.

This follows directly from Definition 2.2, by taking $\phi_{r_L^{(1)},r_R^{(1)}} \to \delta_x$, by taking $r_L^{(1)}, r_R^{(1)} \to \infty$.

To see that we can do this, merely take $\phi$ supported in the unit ball $B$ such that $\int \phi = 1$. Then, $\mathrm{Op}_L\left(\phi^{(2^j)}\right) \to I$ as $j \to \infty$; similarly for $\mathrm{Op}_R\left(\phi^{(2^k)}\right)$. Thus if we set:
\[ \phi^x_{2^j,2^k}(z) = \mathrm{Ker}\left(\mathrm{Op}_L\left(\phi^{(2^j)}\right)\mathrm{Op}_R\left(\phi^{(2^k)}\right)\right)(x,z) \]
we see that $\phi^x_{2^j,2^k} \to \delta_x$. Since we saw in Section 6 that $\phi^x_{2^j,2^k}$ is essentially a normalized bump function (it may really have support in a ball whose radius is a constant factor times that of the ball it is supposed to be supported in, and it may need to be multiplied by a constant, but these only affect the answer by a constant), we are done.

Proposition 7.7.

$\mathrm{Ker}\left(T E_{2^j,2^k}\right)$ satisfies the estimates (3) with $r_L = 2^j$, $r_R = 2^k$, uniformly in the relevant parameters.

Proof. We proceed in the case when $j \le k$, the other case being similar. Consider,
\begin{align*}
\bigtriangledown^{\alpha_1}_{L,x}\bigtriangledown^{\beta_1}_{R,x}\bigtriangledown^{\alpha_2}_{L,z}\bigtriangledown^{\beta_2}_{R,z}\,\mathrm{Ker}\left(T E_{2^j,2^k}\right)(x,z)
&= \bigtriangledown^{\alpha_1}_{L,x}\bigtriangledown^{\beta_1}_{R,x}(-1)^{|\alpha_2|+|\beta_2|}\,\mathrm{Ker}\left(T E_{2^j,2^k}\bigtriangledown^{\alpha_2}_L\bigtriangledown^{\beta_2}_R\right)(x,z) \\
&= 2^{j|\alpha_2|+k|\beta_2|}\bigtriangledown^{\alpha_1}_{L,x}\bigtriangledown^{\beta_1}_{R,x}\,\mathrm{Ker}\left(T\widetilde{E}_{2^j,2^k}\right)(x,z)
\end{align*}
which is $2^{j|\alpha_2|+k|\beta_2|}$ times a term of the original form. Thus, it will suffice to prove the result when $|\alpha_2| = 0 = |\beta_2|$.

Fix $z \in G$. Let us consider the function of $x$ given by:
\[ \bigtriangledown^{\alpha_1}_{L,x}\bigtriangledown^{\beta_1}_{R,x}\,\mathrm{Ker}\left(T E_{2^j,2^k}\right)(x,z) = 2^{-j|\alpha_2|-k|\beta_2|}\,\mathrm{Ker}\left(\bigtriangledown^{\alpha_1}_L\bigtriangledown^{\beta_1}_R T \bigtriangledown^{\alpha_2}_L\bigtriangledown^{\beta_2}_R\widetilde{E}_{2^j,2^k}\right)(x,z) \]
Here $\alpha_2$ and $\beta_2$ are not the same ordered multi-indices as before; rather, we have applied Definition 2.4, and the term on the right hand side of the above equation really denotes a finite sum of such terms. Letting $\phi_{2^j,2^k}(x) = \mathrm{Ker}\left(\widetilde{E}_{2^j,2^k}\right)(x,z)$, we are considering the function given by:
\[ 2^{-j|\alpha_2|-k|\beta_2|}\bigtriangledown^{\alpha_1}_L\bigtriangledown^{\beta_1}_R T \bigtriangledown^{\alpha_2}_L\bigtriangledown^{\beta_2}_R\phi_{2^j,2^k} \]
where $\phi_{2^j,2^k}$ is a $2^j, 2^k$ elementary kernel, and $|\alpha_2|, |\beta_2|$ can be as large as we like.

Theorem 4.11 allows us to create a partition of unity $\psi_l(x)$ ($l \ge 0$) such that:

1. $0 \le \psi_l \le 1$, for every $l$.
2. $\psi_0$ is supported where $\rho^L_{2^{j-k}}(x,z) \lesssim 2^{-j}$.
3. $\psi_l$ is supported where $\rho^L_{2^{j-k}}(x,z) \approx 2^{-j+l}$, $l \ne 0$.
4. $\left|\bigtriangledown^{\alpha}_L\bigtriangledown^{\beta}_R\psi_l\right| \lesssim 2^{|\alpha|(j-l)+|\beta|(k-l)}$ and, if $|\alpha| + |\beta| > 0$, this derivative is supported where $\rho^L_{2^{j-k}}(x,z) \approx 2^{l-j}$, even if $l = 0$.

This follows from Theorem 4.11 directly for $z$ in the closed unit ball, and for $2^{-j+l}$ small. However, Theorem 4.11 really holds for all points in $G$ and all distances. Creating a bump function of radius $r$ centered at $z$ is equivalent to creating a bump function of radius $\delta r$ centered at $\delta z$. Thus Theorem 4.11 extends to all points and all radii, by homogeneity (just take $\delta$ small enough), giving us the above partition of unity.

Define $\phi_l(x) = \psi_l(x)\phi_{2^j,2^k}(x,z)$ (thinking of $z$ as fixed), so that $\sum_{l \ge 0}\phi_l = \phi_{2^j,2^k}$. We wish to show that $\phi_l$ is $2^{-lN}$ times a $2^{j-l}, 2^{k-l}$ normalized bump function, where $N$ is any integer we choose, and we really mean a constant times a normalized bump function, with support in, perhaps, a constant times the radius of the support it is supposed to have.

We already know that the support of $\phi_l$ is correct, by the properties of $\psi_l$, so we turn to estimating derivatives of $\phi_l$. When $l > 0$, we have:
\begin{align*}
\left|\bigtriangledown^{\alpha}_L\bigtriangledown^{\beta}_R\phi_l\right|
&\le \sum_{\substack{a_1+a_2=|\alpha| \\ b_1+b_2=|\beta|}} \frac{2^{a_1(j-l)+b_1(k-l)}\,2^{a_2 j+b_2 k}}{2^{lN}\,V^L_{2^{j-k}}\left(z, 2^{l-j}\right)} \\
&\lesssim \frac{2^{|\alpha|(j-l)+|\beta|(k-l)}}{2^{lN'}\,V^L_{2^{j-k}}\left(z, 2^{l-j}\right)}
\end{align*}
Here, $a_1$ represents the number of $\bigtriangledown_L$ derivatives that land on $\psi_l$, and $a_2$ represents the number that land on $\phi_{2^j,2^k}$ (and similarly $b_1$, $b_2$ for the $\bigtriangledown_R$ derivatives). We have used in the last line that we may take $N$ large. This establishes that $\phi_l$ ($l > 0$) is $2^{-lN}$ times a $2^{j-l}, 2^{k-l}$ bump function. When $l = 0$, a nearly identical proof establishes the result.

Thus, by Lemma 7.6, and using the fact that we may take $|\alpha_2|, |\beta_2|$ as large as we like, we have:
\begin{align*}
2^{-j|\alpha_2|-k|\beta_2|}\left|\bigtriangledown^{\alpha_1}_L\bigtriangledown^{\beta_1}_R T \bigtriangledown^{\alpha_2}_L\bigtriangledown^{\beta_2}_R\phi_l\right|
&\lesssim 2^{-j|\alpha_2|-k|\beta_2|-lN} B\left(2^{j-l}, 2^{k-l}, |\alpha_1|+|\alpha_2|, |\beta_1|+|\beta_2|, m, x, z\right) \\
&\le 2^{-lN} B\left(2^{j-l}, 2^{k-l}, |\alpha_1|, |\beta_1|, m, x, z\right) \\
&\le 2^{-lN'} B\left(2^{j}, 2^{k}, |\alpha_1|, |\beta_1|, m, x, z\right)
\end{align*}
where we have used in the last step that the function
\[ 2^{-l(m+1)} B\left(2^{j-l}, 2^{k-l}, |\alpha_1|, |\beta_1|, m, x, z\right) \]
decreases as $l$ increases. Hence,
\[ 2^{-j|\alpha_2|-k|\beta_2|}\left|\bigtriangledown^{\alpha_1}_L\bigtriangledown^{\beta_1}_R T \bigtriangledown^{\alpha_2}_L\bigtriangledown^{\beta_2}_R\phi_{2^j,2^k}\right| \lesssim B\left(2^{j}, 2^{k}, |\alpha_1|, |\beta_1|, m, x, z\right) \]
completing the proof.

Proposition 7.7 shows that $\mathrm{Ker}\left(T E_{2^j,2^k}\right)$ satisfies the growth estimates of a $2^j, 2^k$ elementary kernel. Thus, to show that $T E_{2^j,2^k}$ is a $2^j, 2^k$ elementary operator, it now remains to show that we may "pull out" derivatives, as in Definition 2.4. To do this, it will suffice to show that the class of operators $\mathcal{A}$ commutes with $\bigtriangledown_L$ and $\bigtriangledown_R$ derivatives. By this, we mean:

Theorem 7.8.

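Suppose $T \in \mathcal{A}$. Then,
\[ T\bigtriangledown^{\alpha}_L\bigtriangledown^{\beta}_R = \sum_{\substack{|\alpha_1|=|\alpha| \\ |\beta_1|=|\beta|}} \bigtriangledown^{\alpha_1}_L\bigtriangledown^{\beta_1}_R T_{\alpha_1,\beta_1} \]
where $T_{\alpha_1,\beta_1} \in \mathcal{A}$.

To see why Theorem 7.8 completes the proof that $\mathcal{A} \subseteq \mathcal{A}'$, consider:
\begin{align*}
T E_{2^j,2^k} &= \sum_{\substack{|\alpha_1|=N_1,\ |\alpha_2|=N_2 \\ |\beta_1|=N_3,\ |\beta_2|=N_4}} 2^{-j(N_1+N_2)-k(N_3+N_4)}\, T\bigtriangledown^{\alpha_1}_L\bigtriangledown^{\beta_1}_R\widetilde{E}_{2^j,2^k}\bigtriangledown^{\alpha_2}_L\bigtriangledown^{\beta_2}_R \\
&= \sum 2^{-j(N_1+N_2)-k(N_3+N_4)}\bigtriangledown^{\alpha_3}_L\bigtriangledown^{\beta_3}_R T_{\alpha_3,\beta_3}\widetilde{E}_{2^j,2^k}\bigtriangledown^{\alpha_2}_L\bigtriangledown^{\beta_2}_R
\end{align*}
a finite sum of terms satisfying the proper bounds associated to elementary kernels. Hence, we conclude this section by proving Theorem 7.8.

Proof of Theorem 7.8.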

We will show that if $X_L = \nabla_L^{\alpha}$, $|\alpha| = 1$, then $T X_L = \sum_{|\alpha_1|=1} \nabla_L^{\alpha_1} T_{\alpha_1}$, and the whole result will follow by symmetry and induction. Let $J$ be the homogeneous fundamental solution to the sublaplacian $\sum_{|\alpha|=1} (\nabla_L^{\alpha})^2$, so that $\sum_{|\alpha|=1} (\nabla_L^{\alpha})^2 J$ acts as the identity. Note that:

$$T X_L = \sum_{|\alpha|=1} (\nabla_L^{\alpha})^2 J\, T X_L = \sum_{|\alpha|=1} \nabla_L^{\alpha}\left( (\nabla_L^{\alpha} J)\, T X_L \right)$$

Thus, it suffices to show that $S T X_L \in \mathcal{A}$, where $S$ is a left invariant convolution operator, with kernel of type 1, in the sense of [Fol75]. Hence, we wish to estimate terms like:

$$\left\langle \phi^{x}_{r_L^{(1)},r_R^{(1)}},\ \nabla_L^{\alpha_1}\nabla_R^{\beta_1}\, S T X_L\, \nabla_L^{\alpha_2}\nabla_R^{\beta_2}\, \phi^{y}_{r_L^{(2)},r_R^{(2)}} \right\rangle_{L^2}$$

where everything above is as in Definition 2.2. However, $\nabla_R^{\beta_1} S = S \nabla_R^{\beta_1}$ and $\nabla_L^{\alpha_1} S = \sum_{|\alpha'|=|\alpha_1|} S_{\alpha'} \nabla_L^{\alpha'}$, where $S_{\alpha'}$ is a left invariant operator with convolution kernel of type 1. Thus, it suffices to bound terms of the form:

$$\left\langle \phi^{x}_{r_L^{(1)},r_R^{(1)}},\ S\, \nabla_L^{\alpha_1}\nabla_R^{\beta_1}\, T X_L\, \nabla_L^{\alpha_2}\nabla_R^{\beta_2}\, \phi^{y}_{r_L^{(2)},r_R^{(2)}} \right\rangle_{L^2}$$

where $S$ is an operator with convolution kernel of type 1. (It is easy to see that all the integrals involved converge absolutely, by Lemma 7.6.)

Let $\psi$ be a $C^\infty$ bump function supported on the unit ball, which is $1$ on the ball of radius $3/4$, and $0 \le \psi \le 1$. Let $K(x)$ be the convolution kernel of $S$, and define $\phi(x) = (\psi(x) - \psi(2x)) K(x)$. Then,

$$S = \sum_{j \in \mathbb{Z}} 2^{-j}\, \mathrm{Op}_L\left( \phi^{(2^j)} \right)$$

Applying Lemma 6.1, we see:

$$\left| \left\langle \phi^{x}_{2^{j_1},2^{k_1}},\ S\, \nabla_L^{\alpha_1}\nabla_R^{\beta_1}\, T X_L\, \nabla_L^{\alpha_2}\nabla_R^{\beta_2}\, \phi^{y}_{2^{j_2},2^{k_2}} \right\rangle_{L^2} \right|$$
$$\le \sum_{j \in \mathbb{Z}} 2^{-j} \left| \left\langle \mathrm{Op}_L\left( \phi^{(2^j)} \right) \phi^{x}_{2^{j_1},2^{k_1}},\ \nabla_L^{\alpha_1}\nabla_R^{\beta_1}\, T X_L\, \nabla_L^{\alpha_2}\nabla_R^{\beta_2}\, \phi^{y}_{2^{j_2},2^{k_2}} \right\rangle_{L^2} \right|$$
$$\lesssim \sum_{j \in \mathbb{Z}} 2^{-j} \left| \left\langle \widetilde{\phi}^{x}_{2^{j \wedge j_1},2^{k_1}},\ \nabla_L^{\alpha_1}\nabla_R^{\beta_1}\, T X_L\, \nabla_L^{\alpha_2}\nabla_R^{\beta_2}\, \phi^{y}_{2^{j_2},2^{k_2}} \right\rangle_{L^2} \right|$$
$$\lesssim \sum_{j \in \mathbb{Z}} 2^{-j} B\left( 2^{j \wedge j_1 \wedge j_2}, 2^{k_1 \wedge k_2}, |\alpha_1|+|\alpha_2|+1, |\beta_1|+|\beta_2|, m, x, y \right) \tag{33}$$

Let $j_0 = j_1 \wedge j_2$, $k_0 = k_1 \wedge k_2$, $a = |\alpha_1| + |\alpha_2|$, $b = |\beta_1| + |\beta_2|$. Then, we separate the sum on the right hand side of (33) into two sums:

$$\sum_{j \ge j_0} 2^{-j} B\left( 2^{j_0}, 2^{k_0}, a+1, b, m, x, y \right) \approx 2^{-j_0} B\left( 2^{j_0}, 2^{k_0}, a+1, b, m, x, y \right) = B\left( 2^{j_0}, 2^{k_0}, a, b, m, x, y \right)$$
$$\sum_{j \le j_0} 2^{-j} B\left( 2^{j}, 2^{k_0}, a+1, b, m, x, y \right) = \sum_{j \le j_0} B\left( 2^{j}, 2^{k_0}, a, b, m, x, y \right) \approx B\left( 2^{j_0}, 2^{k_0}, a, b, m, x, y \right)$$

where we have used that we may take $a$ and $b$ large, and we have applied Lemma 7.1. This completes the proof.

Remark 7.9. Theorem 7.8 is the only place where we use the crucial hypothesis that we have a cancellation condition that happens on both sides of $T$ at once.

8 $L^p$ Boundedness

In this section, we show that operators in A ′ extend to bounded operatorson L p , 1 < p < ∞ . To do this, we will need a relevant Littlewood-Paleysquare function, and a relevant maximal function. Fortunately, we will be ableconstruct both out of the building blocks of the analogous operators for left andright convolution operators.We begin with the maximal functions. Deﬁne:( M L ( f )) ( x ) = sup R> R Q Z k y − x k Op L (cid:18) χ ( R ) B (cid:19) | f | and similarly, M R ( f ) = sup R> Op R (cid:18) χ ( R ) B (cid:19) | f | It follows from the results in [Ste93] that kM L f k L p ( G ) . k f k L p ( G ) for 1 < p < ∞ , and similarly for M R . For us, the relevant maximal functionwill be: M f = sup R > ,R > Op L χ “ R ” B ! Op R χ “ R ” B ! | f | It is easy to see that M f ≤ M L M R f (34)and therefore, kM f k L p ( G ) . k f k L p ( G ) Corresponding to each 1 ≥ ǫ >

0, we get a maximal function for ρ Lǫ (and onefor ρ Rǫ , but let’s focus on ρ Lǫ ), deﬁned by: M ǫ f ( x ) = sup R> V Lǫ ( x, R ) Z ρ Lǫ ( x,y ) ≤ R | f ( y ) | dy χ as in Remark 5.4, we see: M ǫ f . sup R> Op L (cid:16) χ ( R ) (cid:17) Op R (cid:16) χ ( ǫR ) (cid:17) | f | . M f so we see that M uniformly bounds the maximal functions corresponding to allof the geometries we are considering. Lemma 8.1.

Suppose

Ker ( E r L ,r R ) satisﬁes the bounds (3) of Deﬁnition 2.4without any derivatives (ie it is like an elementary kernel, but we need not be ableto “pull out” derivatives or take derivatives). Then, for f ∈ S , | E r L ,r R f | . M f ,with constants uniform in all the relevant parameters.Proof. We prove this in the case r L = 2 j , r R = 2 k , the more general casefollowing from this one. We assume k ≥ j , the other case following in the samemanner. Consider, (cid:12)(cid:12) E j , k f ( x ) (cid:12)(cid:12) . Z ρ L j − k ( x,y ) ≤ − j | f ( y ) | V L k − j ( x, − j ) dy + X l ≥ Z ρ L j − k ( x,y ) ≈ l − j | f ( y ) | lN V L k − j ( x, l − j ) dy ≤ X l ≥ − lN M k − j f . M f completing the proof.Recall the deﬁnition S ( m ) . φ ∈ S ( m ) if and only if φ = P | α | = m ▽ αL ψ α , where ψ ∈ S . We get essentially the same space if we replace ▽ L by ▽ R (by that wemean S R ( m ) ⊂ S L ( m ) ⊂ S R ( m ) where m , m → ∞ as m → ∞ ). In short, being ahigh order of ▽ L derivatives is the same as being a high order of ▽ R derivatives,which is the same as moments up to a high order vanishing. We have: Lemma 8.2.

For any $N \in \mathbb{N}$ there exist functions $\phi_1, \ldots, \phi_M, \psi_1, \ldots, \psi_M \in \mathcal{S}_{(N)}$ (here $M$ depends on $N$) such that:

$$\sum_{l=1}^{M} \sum_{j \in \mathbb{Z}} \phi_l^{(2^j)} * \psi_l^{(2^j)} = \delta_0$$

Proof. This follows directly from Theorem 1.61 in [FS82].

We will be able to use the $\psi_l$ and $\phi_l$ from Lemma 8.2 to construct a relevant Littlewood-Paley square function. Henceforth, we fix such $\psi_l$ and $\phi_l$, thinking of $N$ as large (how large $N$ will have to be will be implicit in our proof). We define

$$\Lambda^{l_1,l_2}_{j,k} = \mathrm{Op}_L\left( \phi_{l_1}^{(2^j)} \right) \mathrm{Op}_R\left( \psi_{l_2}^{(2^k)} \right), \qquad P^{l_1,l_2}_{j,k} = \mathrm{Op}_L\left( \psi_{l_1}^{(2^j)} \right) \mathrm{Op}_R\left( \phi_{l_2}^{(2^k)} \right)$$

so that

$$\sum_{l_1,l_2} \sum_{j,k \in \mathbb{Z}} P^{l_1,l_2}_{j,k} \Lambda^{l_1,l_2}_{j,k} = I$$

and we define our square function:

$$\Lambda(f) = \left( \sum_{l_1,l_2} \sum_{j,k \in \mathbb{Z}} \left| \Lambda^{l_1,l_2}_{j,k} f \right|^2 \right)^{1/2}$$

Theorem 8.3. For $1 < p < \infty$, $\| f \|_{L^p(G)} \approx \| \Lambda(f) \|_{L^p(G)}$.

Proof.

Fix $l_1, l_2$ (recall, $l_1$ and $l_2$ just range over a finite set). The theorem will follow if we can show that for any sequence $\epsilon_{j,k}$ of $1$s and $-1$s, we have that

$$\sum_{j,k} \epsilon_{j,k} \Lambda^{l_1,l_2}_{j,k}$$

is bounded on $L^p$, uniformly in the choice of the sequence $\epsilon_{j,k}$ (and with a similar result for the $P^{l_1,l_2}_{j,k}$, which will follow in the same way). To see why this is enough, see p. 267 of [Ste93] and Chapter 4, Section 5 of [Ste70]. However,

$$\sum_{j,k} \epsilon_{j,k} \Lambda^{l_1,l_2}_{j,k} = \sum_{j,k} \mathrm{Op}_T\left( \epsilon_{j,k}\, \phi_{l_1}^{(2^j)}(x)\, \psi_{l_2}^{(2^k)}(y) \right) = \mathrm{Op}_T\left( \sum_{j,k} \epsilon_{j,k}\, \phi_{l_1}^{(2^j)}(x)\, \psi_{l_2}^{(2^k)}(y) \right)$$

and

$$\sum_{j,k} \epsilon_{j,k}\, \phi_{l_1}^{(2^j)}(x)\, \psi_{l_2}^{(2^k)}(y)$$

converges to a product kernel, uniformly in the choice of $\epsilon_{j,k}$ (see [NRS01], Theorem 2.2.1). Hence, Corollary 3.14 shows us that $\sum_{j,k} \epsilon_{j,k} \Lambda^{l_1,l_2}_{j,k}$ is uniformly bounded on $L^p$.

Theorem 8.4. Suppose $T \in \mathcal{A}'$. Then, $T$ extends to a bounded operator $L^p \to L^p$, $1 < p < \infty$.

Proof. We first prove the result imagining that $\phi_l, \psi_l \in \mathcal{S}_0$. At the end, we will explain why it is enough to have them in $\mathcal{S}_{(N)}$ for some large $N$. This proof is more or less standard; however, we include it to help make clear where we are using $\phi_l, \psi_l \in \mathcal{S}_{(N)}$.

Since $\phi_l, \psi_l \in \mathcal{S}_0$, we have (by Theorem 6.2) that $\Lambda^{l_1,l_2}_{j,k}$, $P^{l_1,l_2}_{j,k}$ are $2^j, 2^k$ elementary kernels (uniformly in $j, k$). Hence, we have (for $f \in \mathcal{S}$):

$$\Lambda^{l_1,l_2}_{j_1,k_1} T P^{l_1',l_2'}_{j_2,k_2} f = \Lambda^{l_1,l_2}_{j_1,k_1} E_{2^{j_2},2^{k_2}} f = 2^{-|j_1-j_2|-|k_1-k_2|} E_{2^{j_2},2^{k_2}} f \lesssim 2^{-|j_1-j_2|-|k_1-k_2|} \mathcal{M} f$$

where we have used the definition of $\mathcal{A}'$, Lemma 6.3, and Lemma 8.1, and $E_{2^{j_2},2^{k_2}}$ just represents some $2^{j_2}, 2^{k_2}$ elementary kernel that may change from line to line. Define $F^{l_1,l_2}_{j,k} = \Lambda^{l_1,l_2}_{j,k} T f$. Then,

$$\left| F^{l_1,l_2}_{j,k} \right| = \left| \sum_{l_1',l_2'} \sum_{j_1,k_1} \Lambda^{l_1,l_2}_{j,k} T P^{l_1',l_2'}_{j_1,k_1} \Lambda^{l_1',l_2'}_{j_1,k_1} f \right| \lesssim \sum_{l_1',l_2'} \sum_{j_1,k_1} 2^{-|j-j_1|-|k-k_1|} \mathcal{M}\left( \Lambda^{l_1',l_2'}_{j_1,k_1} f \right)$$

and hence, by the Cauchy-Schwarz inequality,

$$\left| F^{l_1,l_2}_{j,k} \right|^2 \lesssim \left[ \sum_{l_1',l_2'} \sum_{j_1,k_1} 2^{-|j-j_1|-|k-k_1|} \left( \mathcal{M}\left( \Lambda^{l_1',l_2'}_{j_1,k_1} f \right) \right)^2 \right] \left[ \sum_{l_1',l_2'} \sum_{j_1,k_1} 2^{-|j-j_1|-|k-k_1|} \right] \approx \sum_{l_1',l_2'} \sum_{j_1,k_1} 2^{-|j-j_1|-|k-k_1|} \left( \mathcal{M}\left( \Lambda^{l_1',l_2'}_{j_1,k_1} f \right) \right)^2$$

and so, we have

$$\sum_{l_1,l_2} \sum_{j,k} \left| F^{l_1,l_2}_{j,k} \right|^2 \lesssim \sum_{l_1,l_2} \sum_{j,k} \left( \mathcal{M}\left( \Lambda^{l_1,l_2}_{j,k} f \right) \right)^2$$

$$\| T f \|_{L^p(G)} \approx \left\| \left( \sum_{l_1,l_2} \sum_{j,k} \left| F^{l_1,l_2}_{j,k} \right|^2 \right)^{1/2} \right\|_{L^p(G)} \lesssim \left\| \left( \sum_{l_1,l_2} \sum_{j,k} \left( \mathcal{M}\left( \Lambda^{l_1,l_2}_{j,k} f \right) \right)^2 \right)^{1/2} \right\|_{L^p(G)} \lesssim \| \Lambda(f) \|_{L^p(G)} \approx \| f \|_{L^p(G)}$$

where we have used the vector valued maximal function, see [Ste93], Chapter 2, Section 1. The vector valued inequality comes from (34), and the corresponding inequalities for $\mathcal{M}_L$, $\mathcal{M}_R$ as shown in [Ste93].

Now we turn to explaining why we only need $\phi_l, \psi_l \in \mathcal{S}_{(N)}$ for some fixed large $N$. Indeed, this proof used only a finite number of the semi-norms that define the elementary operators. This follows from the fact that every proof we have done about elementary operators was continuous. For example, Lemma 6.3 showed:

$$E_{2^{j_1},2^{k_1}} E_{2^{j_2},2^{k_2}} = 2^{-|j_1-j_2|-|k_1-k_2|} \widetilde{E}_{2^{j_2},2^{k_2}}$$

where each semi-norm of $\widetilde{E}$ was bounded in terms of a finite number of semi-norms of the terms on the left hand side. Thus, this proof only required a finite number of semi-norms, which we may control by taking $N$ large. The only potential worry is the line where we used the definition of $\mathcal{A}'$ (i.e., $T$ takes elementary operators to elementary operators), since this was a definition, and we do not have a priori continuity in the above sense. However, this continuity follows from a combination of Theorem 6.6 and Lemma 6.3.
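The step from the pointwise bound on $F^{l_1,l_2}_{j,k}$ to the square-function bound in the proof of Theorem 8.4 uses only the Cauchy-Schwarz inequality and the summability of the weights $2^{-|j-j_1|-|k-k_1|}$. The following is a minimal numerical sketch of the one-parameter analogue, not taken from the paper: for a finitely supported sequence $a$, it checks that $\sum_j \big(\sum_{j_1} 2^{-|j-j_1|} a_{j_1}\big)^2 \le 9 \sum_j a_j^2$, where $9 = 3 \cdot 3$ and $3 = \sum_{n \in \mathbb{Z}} 2^{-|n|}$. The function names `smoothed`, `square_sum` and `check_almost_orthogonality` are hypothetical, and the outer sum is restricted to the support window of $a$, which is an assumption that only makes the left hand side smaller.

```python
import random


def smoothed(a):
    """Return b[j] = sum_{j1} 2^{-|j - j1|} * a[j1] for j in the index window of a.

    This mimics, in one parameter, the bound
    |F_j| <~ sum_{j1} 2^{-|j - j1|} M(Lambda_{j1} f).
    """
    n = len(a)
    return [sum(2.0 ** (-abs(j - j1)) * a[j1] for j1 in range(n)) for j in range(n)]


def square_sum(seq):
    """Return the sum of squares of a finite sequence."""
    return sum(x * x for x in seq)


def check_almost_orthogonality(a):
    """Compare sum_j b[j]^2 with 9 * sum_j a[j]^2.

    Cauchy-Schwarz gives b[j]^2 <= 3 * sum_{j1} 2^{-|j - j1|} a[j1]^2,
    and summing over j costs another factor of at most 3.
    """
    lhs = square_sum(smoothed(a))
    rhs = 9.0 * square_sum(a)
    return lhs, rhs, lhs <= rhs + 1e-12


if __name__ == "__main__":
    random.seed(0)
    sample = [random.uniform(0.0, 1.0) for _ in range(40)]
    print(check_almost_orthogonality(sample))
```

In the two-parameter setting of Theorem 8.4 the same argument applies with the weight $2^{-|j-j_1|-|k-k_1|}$, at the cost of a larger constant.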
In this section, we show that the operators in $\mathcal{A}'$ are pseudolocal, and calculate bounds for derivatives of the kernel away from the diagonal, although we will not put these bounds in a closed form. In Section 9.1, however, we will derive a closed form for the growth of the kernel off the diagonal in the case that $G$ is the three dimensional Heisenberg group.

Fix $T \in \mathcal{A}'$. Decompose $T$ as in Theorem 6.6:

$$T = \sum_{j,k \in \mathbb{Z}} E_{2^j,2^k}$$

We will imagine this is a finite sum, and show that the result is uniformly $C^\infty$ off of the diagonal. It will then follow that $T$ is pseudolocal. In fact, we will prove the bounds separately for

$$T_L = \sum_{k \ge j} E_{2^j,2^k}, \qquad T_R = \sum_{j \ge k} E_{2^j,2^k}$$

We focus on $T_L$, the bounds for $T_R$ being the same, with the roles of right and left reversed. Set $K(x,z) = \mathrm{Ker}(T_L)$.

Let us return to the notation from Section 2:

$$\mathfrak{g} = V_1 \oplus \cdots \oplus V_n$$

It is easy to see that each vector field $\nabla_R^{\alpha}$ with $|\alpha| = 1$ can be written in the form:

$$\sum_{j=1}^{n} q_{j,\alpha}(x) X_{L,\alpha,j} \tag{35}$$

where $X_{L,\alpha,j} \in V_j$ (when thought of as left invariant vector fields), and $q_{j,\alpha}$ is a homogeneous polynomial of degree $j - 1$. For example, on the Heisenberg group, $X_R = X_L - 4y\partial_t$ (see Section 9.1 for this notation).

Lemma 9.1.

Let φ ( x, z ) be a j , k elementary kernel (we are still assuming k ≥ j ). Then, if | α | + | α | = a , | β | + | β | = b , (cid:12)(cid:12)(cid:12) ▽ α L,x ▽ β R,x ▽ α L,z ▽ β R,z φ ( x, z ) (cid:12)(cid:12)(cid:12) . ja + kb (cid:16) ∧ (cid:16)P | α | =1 P ns =1 | q s,α ( x ) | sj − k (cid:17)(cid:17)(cid:0) j ρ L j − k ( x, z ) (cid:1) N V L j − k (cid:0) x, − j + ρ L j − k ( x, z ) (cid:1) where N ≥ is ﬁxed and as large as we like.Proof. It is easy reduce to the case when a = 0 = b , just from the deﬁnitionof an elementary kernel. The case when 1 = 1 ∧ (cid:16)P | α | =1 P ns =1 | q s,α ( x ) | sj − k (cid:17) follows directly from the deﬁnition of an elementary kernel. For the other case,consider (in what follows, ψ with any subscript will denote a 2 j , k elementaryoperator): φ ( x, z ) = X | β | =1 − k ▽ βR,x ψ β ( x, z )= X | β | =1 n X s =1 − k q s,β ( x ) X L,β,s ψ β ( x, z )= X | β | =1 n X s =1 sj − k q s,β ( x ) ψ s,β ( x, z )now the claim follows by taking absolute values and applying the deﬁnition ofelementary kernels. 59 heorem 9.2. Let a = | α | + | α | . Then, for x = z , we have (cid:12)(cid:12)(cid:12) ▽ α L,x ▽ α L,z K ( x, z ) (cid:12)(cid:12)(cid:12) . ∞ X l =0 ρ L − l ( x, z ) − a V L − l (cid:0) x, ρ L − l ( x, z ) (cid:1) ∧ P | α | =1 P ns =1 | q s,α ( x ) | ρ L − l ( x, z ) − s +1 l V L − l (cid:0) x, ρ L − l ( x, z ) (cid:1) ! and therefore T L is pseudolocal. A similar result holds for T R , thereby showingthat T is pseudolocal.Proof. First let’s see why this shows that T L is pseudolocal. We claim thatthe above sum converges absolutely. This can be seen by using the facts that ρ Lǫ ≥ ρ L and V Lǫ ≥ V L (for all ǫ ∈ [0 , ∧ we get a geometric series. Hence, the whole series converges absolutely,showing that T L is pseudolocal.Let φ j , k = Ker (cid:0) E j , k (cid:1) . Let 0 ≤ l = k − j , and to save space, deﬁne δ l = ρ L − l ( x, z ). We think of l ≥ k , j suchthat k − j = l . 
Using Lemma 9.1, we see: (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ▽ α L,x ▽ α L,z X k − j = l φ j , k ( x, z ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . X k − j = l ja ∧ (cid:0) − l P | q s,α | j ( s − (cid:1) (1 + 2 j δ l ) N V L − l ( x, − j + δ l )= X j ∈ Z ja ∧ (cid:0) − l P | q s,α | j ( s − (cid:1) (1 + 2 j δ l ) N V L − l ( x, − j + δ l )where, in the numerator, P = P | α | =1 P ns =1 , and we have suppressed the x in q s,α ( x ). We separate the above sum into two sums: when 2 j ≥ δ l and when2 j ≤ δ l . Now, X j ≥ δl ≈ X j ≥ δl ja ∧ (cid:0) − l P | q s,α | j ( s − (cid:1) (2 j δ l ) N V L − l ( x, δ l ) ≈ δ − al ∧ (cid:16) − l P | q s,α | δ (1 − s ) l (cid:17) V L − l ( x, δ l )since the second term is a geometric sum (when N is suﬃciently large), andtherefore bounded by its ﬁrst term. This is precisely the bound we were strivingfor. We now turn to the sum when 2 j ≤ δ j : X j ≤ δj ≈ X j ≤ δj ja ∧ (cid:0) − l P | q s,α | j ( s − (cid:1) V L − l ( x, − j ) ≈ δ − al ∧ (cid:16) − l P | q s,α | δ (1 − s ) l (cid:17) V L − l ( x, δ l )60here this follows since the above sum is geometric, and therefore bounded byits ﬁrst term, completing the proof. Remark . It seems likely that Theorem 9.2 is the best we can do (at leastwhen a = 0). Indeed, if K and K are Calder´on-Zygmund kernels, we candecompose: K = X j ▽ L · φ ( j ) j K = X k ▽ R · ψ ( k ) k where φ j , ψ k are d-tuples of C ∞ functions supported on the unit ball B (see[NRS01]). Then we consider:Op L ( K ) Op R ( K ) = X j,k Op L (cid:18) ▽ L · φ ( j ) j (cid:19) Op R (cid:18) ▽ R · ψ ( k ) k (cid:19) One wishes to use the fact that ▽ L · φ j and ▽ R · φ k are derivatives of functionsto yield a gain over the estimate given in Theorem 5.2. The standard way ofdoing this, when k ≥ j , is to integrate the ▽ R by parts over to φ j . 
However, this process is exactly the one we used in Lemma 9.1.

The observant reader will note, however, that the bound in Theorem 9.2 is not actually symmetric in $x$ and $z$, as the optimal bound should be, and that, moreover, we could use the same proof to prove a seemingly better symmetric bound. This turns out to not be an essential point, and indeed the bound is essentially symmetric in $x$ and $z$. This is exemplified in Section 9.1 in the case of the Heisenberg group. Thus, without some new idea, one is unable to do better than Theorem 9.2.

In this section, we derive a closed form for the bound in Theorem 9.2, in the case of the three dimensional Heisenberg group, $\mathbb{H}^1$. As a manifold $\mathbb{H}^1 = \mathbb{C} \times \mathbb{R}$, and we give it coordinates $(z,t) = (x,y,t)$. The multiplication is given by $(z,t)(w,s) = (z + w, t + s + 2\,\mathrm{Im}(z\bar{w}))$. The dilation is given by $r(z,t) = (rz, r^2 t)$. The left invariant vector fields of order 1 are spanned by $X_L = \partial_x + 2y\partial_t$, $Y_L = \partial_y - 2x\partial_t$, while the right invariant vector fields of order 1 are spanned by $X_R = \partial_x - 2y\partial_t$, $Y_R = \partial_y + 2x\partial_t$. We also have:

$$[X_L, Y_L] = -4\partial_t = -[X_R, Y_R]$$

and so $\partial_t$ spans the left (and right) invariant vector fields of order 2.

We fix $(z,t), (w,s) \in \mathbb{H}^1$, $(z,t) \ne (w,s)$, and we again define (for $l \ge 0$) $\delta_l = \rho^L_{2^{-l}}((z,t),(w,s))$. Note that $\delta_\infty = \rho^L_0$, $\delta_0 = \rho^L_1$. Fix $\alpha \in \mathbb{N}$, $\alpha \ge 0$ ($\alpha$ will play the role of $a$ in Theorem 9.2), and let $\zeta = (z,t)$. We will show:

Theorem 9.4.

$$\sum_{l=0}^{\infty} \delta_l^{-\alpha} \left( \frac{1}{V^L_{2^{-l}}(\zeta,\delta_l)} \wedge \frac{|z|\,\delta_l^{-1}}{2^l\, V^L_{2^{-l}}(\zeta,\delta_l)} \right) \approx \frac{1}{\delta_\infty^{2}\, \delta_0^{2+\alpha}} \tag{36}$$

Note that this sum is exactly the one that appears in Theorem 9.2, in the case of $\mathbb{H}^1$.

The first question to address is: when do we use each side of the $\wedge$? Namely, we are interested in the question, when is $\rho^L_{2^{-l}} = \delta_l \gtrsim 2^{-l}|z|$? So let us investigate the question: when is $\epsilon|z| \lesssim \rho^L_\epsilon$? The answer is that it is true precisely when $\rho^L_\epsilon \approx \rho^L_0$.
Indeed, suppose we are on the scale $\rho^L_\epsilon \approx \delta$; here $\delta$ is just some number $> 0$, not to be confused with $\delta_l$. (We assume $\delta < 1$, and that we are working very close to $0$, and then extend the results by homogeneity.) Then, by Theorem 4.8, we wish to find the maximal determinant among the $3 \times 3$ submatrices of the matrix

$$\begin{pmatrix} \delta & 0 & -2y\delta \\ 0 & \delta & 2x\delta \\ 0 & 0 & \delta^2 \\ \delta\epsilon & 0 & 2y\delta\epsilon \\ 0 & \delta\epsilon & -2x\delta\epsilon \\ 0 & 0 & \delta^2\epsilon^2 \end{pmatrix}$$

Now the largest three determinants are given by rows $(1,2,3) = \delta^4$, $(1,2,4) = 4\delta^3 y \epsilon$, and $(1,2,5) = -4\delta^3 x \epsilon$. Thus, when $\delta \ge |z|\epsilon$, $(1,2,3)$ is the largest determinant (up to a constant), and therefore on this scale $\rho^L_\epsilon \approx \rho^L_0$ (since the first 3 rows corresponded to the left invariant vector fields).

Remark 9.5. Actually, this proof extends to an arbitrary stratified group. That is, the right part of the $\wedge$ in Theorem 9.2 is less than the left part precisely when $\rho^L_{2^{-l}} \approx \rho^L_0$. We leave the details to the interested reader.

Proposition 9.6.

Suppose ρ Lǫ (( z, t ) , ( w, s )) . | z | ǫ . Then, ρ Lǫ (( z, t ) , ( w, s )) ≈ | z − w | + 1 ǫ | z | | t − s + 2Im ( zw ) | Proof.

When ǫ = 1, the result follows easily. Moreover, if 1 ≥ ǫ ≥ , we have ρ Lǫ ≈ ρ L , and the result follows from the case when ǫ = 1. Henceforth, werestrict our attention to the case ǫ < . Fix δ . | z | ǫ . We will show that thefollowing conditions are equivalent:1. ρ Lǫ (( z, t ) , ( w, s )) . δ .2. ∃ w , s such that | w − w | . δ , | s − s + 2Im ( w w ) | . δ , | z − w | . ǫδ , | t − s | . ǫ | z | δ .3. ∃ s such that | z − w | . δ , | s − s + 2Im ( zw ) | . δ , | t − s | . δ | z | ǫ .4. ∃ s such that | z − w | . δ , | s − s + 2Im ( zw ) | . δ | z | ǫ , | t − s | . δ | z | ǫ .62. | z − w | . δ , | t − s + 2Im ( zw ) | . δ | z | ǫ .6. | z − w | + ǫ | z | | t − s + 2Im ( zw ) | . δ .this will complete the proof, since the statement of the proposition is 1 ⇔ ⇔ , ⇒ ⇒ ⇒ , ⇔ , ⇔ ⇔

2, we apply Theorem 4.13 to see: ρ Lǫ = (cid:0) ǫρ R (cid:1) ◦ (cid:0) ǫρ L (cid:1) ◦ (cid:0) (1 − ǫ ) ρ L (cid:1) = (cid:0) ǫρ L (cid:1) ◦ (cid:0) (1 − ǫ ) ρ L (cid:1) ≈ (cid:0) ǫρ L (cid:1) ◦ ρ L where we have used that ǫ < , and the obvious fact that (cid:0) ǫρ L (cid:1) ◦ (cid:0) (1 − ǫ ) ρ L (cid:1) = ρ L . The statement 1 ⇔ (cid:0) ǫρ L (cid:1) ◦ ρ L .For 3 ⇒ w = z . Suppose we have 2. Consider, | z − w | ≤ | z − w | + | w − w | . δ + ǫδ . δ | s − s + 2Im ( zw ) | ≤ | z − w ) w ) | + | s − s + 2Im ( w w ) | . δ + | Im (( z − w ) w ) | = δ + | Im (( z − w ) ( w − w + z )) |≤ δ + | z − w | | w − w | + | z − w | | z | . δ + ǫδ + | z | ǫδ . | z | ǫδ and we therefore have 2 ⇒

4. Suppose we have 4. Deﬁne s = s − zw ).We will show that 3 holds with s in place of s (the notation s has alreadybeen used in the deﬁnition of 4). Indeed, | s − s + 2Im ( zw ) | = 0 . δ | t − s | = | t − s + 2Im ( zw ) | ≤ | s − s + 2Im ( zw ) | + | s − t | . | z | δǫ and so 3 holds.5 ⇒ s = t . Suppose we have 4, consider: | t − s + 2Im ( zw ) | ≤ | s − s + 2Im ( zw ) | + | t − s | . | z | δǫ and so 5 holds. Finally, 6 ⇔ ǫ | z | . ρ Lǫ is the same as the condition ǫ | z | . ρ L . Indeed, ρ Lǫ ≤ ρ L , and so one direction is clear. For the otherdirection, ﬁx η > ǫ such that ǫ | z | ≥ ηρ Lǫ . The left handside decreases to 0 as ǫ decreases, while the right hand side increases. Thusthere is a least ǫ (call it ǫ ) for which it holds. For this ǫ , we have ǫ | z | = ηρ Lǫ ,so by the above remarks, we have ρ Lǫ ≈ ρ L , and thus, ǫ | z | ≥ Cρ L . Hence forall ǫ such that ǫ | z | ≥ ηρ Lǫ (namely all ǫ ≥ ǫ ), we have ǫ | z | ≥ Cρ L .63e separate the sum (36) into two parts: when 2 l δ ∞ ≤ | z | and when 2 l δ ∞ ≥| z | . We ﬁrst look at the case when 2 l δ ∞ ≤ | z | , so that by Proposition 9.6, wehave δ l ≈ | z − w | + 2 l | t − s + 2Im ( zw ) || z | and we see by Theorem 4.9 and the remarks above that V L − l ( ζ, δ l ) ≈ − l | z | δ l to save space, we denote b = | z − w | and a = | t − s +2Im( zw ) || z | , so that δ l ≈ b + 2 l a Also, let c = | z | δ ∞ . Thus, we consider: X l δ ∞ ≤| z | δ − αl (cid:18) V L − l ( ζ, δ l ) ∧ | z | δ − l l V L − l ( ζ, δ l ) (cid:19) ≈ X l δ ∞ ≤| z | δ − αl V L − l ( ζ, δ l ) ≈ X l ≤ c l | z | ( b + 2 l a ) α ≈ | z | a α X l ≤ c l (cid:0) ba + 2 l (cid:1) α ≈ | z | a α Z log ( c )0 t (cid:0) ba + 2 t (cid:1) α dt ≈ | z | a α Z c (cid:0) ba + u (cid:1) α du ≈ | z | a α (cid:0) c + ba (cid:1) α − (cid:0) ba (cid:1) α (cid:0) c + ba (cid:1) α (cid:0) ba (cid:1) α Recall, l ≥ c ≥ l . In fact, let us ignorethe possibility that this sum is nonzero when c ≤ c ≤ l ). 
Thus, we have: ≈ | z | a α c (cid:16)P αj =1 (cid:0) ba (cid:1) α − j c j − (cid:17)(cid:0) c + ba (cid:1) α (cid:0) ba (cid:1) α ≈ | z | a α c (cid:0) c + ba (cid:1) α − (cid:0) c + ba (cid:1) α (cid:0) ba (cid:1) α | z | a α c (cid:0) c + ba (cid:1) (cid:0) ba (cid:1) α = 1 δ ∞ (cid:16) | z | δ ∞ a + b (cid:17) ( a + b ) α ≈ δ ∞ (cid:16) | z | δ ∞ a + b (cid:17) δ α where, in the last line, we have used that ( a + b ) ≈ δ , by Proposition 9.6.Finally, we will be done with this sum provided we can show | z | δ ∞ a + b ≈ δ ∞ To see this, consider ǫ such that ǫ | z | = ρ Lǫ (as before). Then, for this ǫ , wehave ǫ | z | = ρ Lǫ ≈ ρ L . However, we also have ǫ | z | & ρ Lǫ , and therefore, δ ∞ = ρ L ≈ ρ Lǫ ≈ (cid:18) b + aǫ (cid:19) ≈ (cid:18) b + | z | ρ L a (cid:19) = (cid:18) b + | z | δ ∞ a (cid:19) completing the proof for this sum.We now turn to the sum when 2 l δ ∞ ≥ | z | . In this case, δ l ≈ δ ∞ , and V L − l ( ζ, δ l ) ≈ V L ( ζ, δ l ), by the remarks at the beginning of the proof. Andthus, we are considering: X l δ ∞ ≥| z | δ − αl (cid:18) V L − l ( ζ, δ l ) ∧ | z | δ − l l V L − l ( ζ, δ l ) (cid:19) ≈ X l δ ∞ ≥| z | δ − α ∞ (cid:20) | z | l δ ∞ V L ( ζ, δ ∞ ) + 12 l V L ( ζ, δ ∞ ) (cid:21) ≈ X l δ ∞ ≥| z | | z | l δ α ∞ + 12 l δ α ∞ the ﬁrst sum is geometric, and therefore bounded by its ﬁrst term, and we have: ≈ δ α ∞ + X l δ ∞ ≥| z | l δ α ∞ The second term above is bounded by: X l ≥ l δ α ∞ ≈ δ α ∞ Hence, the whole sum is: ≈ δ α ∞ Since δ ∞ ≥ δ , this completes the proof.65 Deﬁnition 2.2 only tested high derivatives of the operator T . One can replaceDeﬁnition 2.2 with an equivalent deﬁnition that works for derivatives of allorders, but with the price that the B must be quite a bit more complicated.Let q Rj,α be the functions from (35) and q Lj,α be the corresponding ones with theroles of left and right reverse. 
We then define:

$$\widetilde{B}\left( 2^{j_0}, 2^{k_0}, x, y, N_L, N_R \right) = \sum_{j \le j_0,\ k \le k_0} 2^{jN_L + kN_R} \left[ 1 \wedge \left( \sum_{|\alpha|=1} \sum_{s=1}^{n} \left| q^R_{s,\alpha}(x) \right| 2^{sj-k} \right) \wedge \left( \sum_{|\alpha|=1} \sum_{s=1}^{n} \left| q^L_{s,\alpha}(x) \right| 2^{sk-j} \right) \right] \times K_{2^j,2^k}(x,y)$$

where $K_{2^j,2^k}$ is the function from Section 5.1. Note that $\widetilde{B}$ does not involve $m$. Then Definition 2.2 with $B$ replaced by $\widetilde{B}$ now works for all $N_L, N_R \ge 0$, and defines the same algebra. That these algebras are the same follows in a manner similar to the bounds in the rest of this paper.

One may think of the operators in $\mathcal{A}$ and $\mathcal{A}'$ as "smoothing of order 0." To come up with an analogous definition for operators which are "smoothing of order $s_L$" in the left invariant vector fields and "smoothing of order $s_R$" in the right invariant vector fields, it suffices to modify Definitions 2.2 and 2.5 only slightly.

Indeed, we say an operator $T : \mathcal{S}_{(N)} \to \mathcal{S}'$ (for some large $N$) is in $\mathcal{A}_{s_L,s_R}$ if it satisfies the conditions of Definition 2.2 with $B(\cdot,\cdot,N_L,N_R,\cdot,\cdot,\cdot)$ replaced by $B(\cdot,\cdot,N_L - s_L,N_R - s_R,\cdot,\cdot,\cdot)$.

We say an operator $T : \mathcal{S} \to \mathcal{S}$ is in $\mathcal{A}'_{s_L,s_R}$ if $r_L^{s_L} r_R^{s_R} T E_{r_L,r_R}$ is an $r_L, r_R$ elementary operator for every $r_L, r_R$ elementary operator $E_{r_L,r_R}$, uniformly in the relevant parameters. Then, $\mathcal{A}_{s_L,s_R} = \mathcal{A}'_{s_L,s_R}$ (under the obvious identification). It is clear that if $T_1 \in \mathcal{A}'_{s^1_L,s^1_R}$, $T_2 \in \mathcal{A}'_{s^2_L,s^2_R}$, then we have $T_1 T_2 \in \mathcal{A}'_{s^1_L+s^2_L,\,s^1_R+s^2_R}$, and therefore we have a similar result for $\mathcal{A}_{s_L,s_R}$ (remember, we are just thinking of these operators on $\mathcal{S}_{(N)}$ for $N$ large). Many of the results of this paper extend to these operators in the obvious way.

Finally, let us consider the question of whether or not it is really necessary to have a cancellation condition on both sides simultaneously as in Definition 2.2, as opposed to something more along the lines of the standard definitions of Calderón-Zygmund operators. One could think about this in two ways. One could try to use a one sided cancellation condition along the lines of Lemma 7.6 (or something slightly stronger, in terms of the $\widetilde{B}$ above), along with a growth condition of the kernel of $T$ off of the diagonal.
However, in light of Theorem 9.4, any condition along these lines seems likely to be necessarily weaker than our Definition 2.2.

Alternatively, let us go back to considering the composition $\mathrm{Op}_L(K_1)\,\mathrm{Op}_R(K_2)$, where $K_1, K_2$ are Calderón-Zygmund kernels. We decompose

$$K_1 = \sum_{j \in \mathbb{Z}} \phi_j^{(2^j)}, \qquad K_2 = \sum_{k \in \mathbb{Z}} \psi_k^{(2^k)}$$

where $\phi_j, \psi_k$ form a bounded subset of $C^\infty$, are supported in the unit ball, and have mean 0 (see [NRS01], Theorem 2.2.1). This cancellation condition on the $\phi_j$ essentially tells us that if $\eta$ is another $C^\infty$ function supported on the unit ball, we have:

$$\mathrm{Op}_L(K_1)\, \mathrm{Op}_L\left( \eta^{(2^{j_1})} \right) = \sum_{j \le j_1} \mathrm{Op}_L\left( \widetilde{\phi}_j^{(2^j)} \right)$$

where the $\widetilde{\phi}_j$ are essentially of the same form as the $\phi_j$, and with a similar result for $\mathrm{Op}_L\left( \eta^{(2^{j_1})} \right) \mathrm{Op}_L(K_1)$. Thus, if $\eta_1, \eta_2$ are of the same form as $\eta$, we have:

$$\mathrm{Op}_L\left( \eta_1^{(2^{j_1})} \right) \mathrm{Op}_L(K_1)\, \mathrm{Op}_L\left( \eta_2^{(2^{j_2})} \right) = \sum_{j \le j_1 \wedge j_2} \mathrm{Op}_L\left( \widetilde{\phi}_j^{(2^j)} \right)$$

Hence, for the composition $\mathrm{Op}_L(K_1)\,\mathrm{Op}_R(K_2)$ we want:

$$\mathrm{Op}_L\left( \eta_1^{(2^{j_1})} \right) \mathrm{Op}_R\left( \eta_2^{(2^{k_1})} \right) \mathrm{Op}_L(K_1)\, \mathrm{Op}_R(K_2)\, \mathrm{Op}_L\left( \eta_3^{(2^{j_2})} \right) \mathrm{Op}_R\left( \eta_4^{(2^{k_2})} \right) = \sum_{\substack{j \le j_1 \wedge j_2 \\ k \le k_1 \wedge k_2}} \mathrm{Op}_L\left( \widetilde{\phi}_j^{(2^j)} \right) \mathrm{Op}_R\left( \widetilde{\psi}_k^{(2^k)} \right)$$

and so a cancellation condition on one side alone will be fine in the case when $k_1 \le k_2$ and $j_1 \le j_2$ (or the reverse situation), but seems like it will not be able to yield the desired estimate when $k_1 < k_2$ and $j_2 < j_1$ (or the reverse situation).

References

[DF79] John D. Dollard and Charles N. Friedman,

Product integration with applications to differential equations, Encyclopedia of Mathematics and its Applications, vol. 10, Addison-Wesley Publishing Co., Reading, Mass., 1979, With a foreword by Felix E. Browder, With an appendix by P. R. Masani.

[DJS85] G. David, J.-L. Journé, and S. Semmes, Opérateurs de Calderón-Zygmund, fonctions para-accrétives et interpolation, Rev. Mat. Iberoamericana (1985), no. 4, 1–56.

[Fol75] G. B. Folland, Subelliptic estimates and function spaces on nilpotent Lie groups, Ark. Mat. (1975), no. 2, 161–207.

[FS82] G. B. Folland and Elias M. Stein, Hardy spaces on homogeneous groups, Mathematical Notes, vol. 28, Princeton University Press, Princeton, N.J., 1982.

[GAV89] R. V. Gamkrelidze, A. A. Agrachëv, and S. A. Vakhrameev, Ordinary differential equations on vector bundles, and chronological calculus, Current problems in mathematics. Newest results, Vol. 35 (Russian), Itogi Nauki i Tekhniki, Akad. Nauk SSSR Vsesoyuz. Inst. Nauchn. i Tekhn. Inform., Moscow, 1989, Translated in J. Soviet Math. (1991), no. 4, 1777–1848, pp. 3–107.

[Kis95] Vladimir V. Kisil, Connection between two-sided and one-sided convolution type operators on non-commutative groups, Integral Equations Operator Theory (1995), no. 3, 317–332.

[Koe02] Kenneth D. Koenig, On maximal Sobolev and Hölder estimates for the tangential Cauchy-Riemann operator and boundary Laplacian, Amer. J. Math. (2002), no. 1, 129–197.

[Koh05] J. J. Kohn, Hypoellipticity and loss of derivatives, Ann. of Math. (2) (2005), no. 2, 943–986, With an appendix by Makhlouf Derridj and David S. Tartakoff.

[NRS01] Alexander Nagel, Fulvio Ricci, and Elias M. Stein, Singular integrals with flag kernels and analysis on quadratic CR manifolds, J. Funct. Anal. (2001), no. 1, 29–118.

[NS01] Alexander Nagel and Elias M. Stein, Differentiable control metrics and scaled bump functions, J. Differential Geom. (2001), no. 3, 465–492.

[NS06] Alexander Nagel and Elias M. Stein, The $\bar{\partial}_b$-complex on decoupled boundaries in $\mathbb{C}^n$, Ann. of Math. (2) (2006), no. 2, 649–713.

[NSW85] Alexander Nagel, Elias M. Stein, and Stephen Wainger, Balls and metrics defined by vector fields. I. Basic properties, Acta Math. (1985), no. 1-2, 103–147.

[Ste70] Elias M. Stein, Singular integrals and differentiability properties of functions, Princeton Mathematical Series, No. 30, Princeton University Press, Princeton, N.J., 1970.

[Ste93] Elias M. Stein, Harmonic analysis: real-variable methods, orthogonality, and oscillatory integrals, Princeton Mathematical Series, vol. 43, Princeton University Press, Princeton, NJ, 1993, With the assistance of Timothy S. Murphy, Monographs in Harmonic Analysis, III.

[Str07] Brian Street, A parametrix for Kohn's operator, Ph.D. thesis, Princeton University, 2007.

[Trè67] François Trèves,