Monotone expansion

Abstract

This work, following the outline set in [B2], presents an explicit construction of a family of monotone expanders. The family is essentially defined by the Möbius action of SL_2(R) on the real line. For the proof, we show a product-growth theorem for SL_2(R).


Jean Bourgain ∗ Amir Yehudayoff †

Expanders are sparse graphs with “strong connectivity” properties. Such graphs are extremely useful in various applications (see the survey [HLW]). Most sparse graphs are expanders, but for applications explicit constructions are needed. Indeed, explicit constructions of expander graphs are known, e.g. [LPS, RVW]. Here we describe an explicit construction of monotone expanders (for more on such expanders see [DW]).

The construction of monotone expanders we present first builds a “continuous” expander, which in turn can be discretized to the required size. A continuous expander is a finite family of maps $\Psi$ for which there exists a constant $c > 0$ such that every $\psi \in \Psi$ is a smooth map from the interval $[0,1]$ to itself, and for all measurable $A \subset [0,1]$ with $|A| \le 1/2$,

$$|\Psi(A)| \ge (1 + c)|A|, \qquad \text{where } \Psi(A) = \bigcup_{\psi \in \Psi} \psi(A).$$

We say that $\Psi$ is a continuous monotone expander if in addition every $\psi \in \Psi$ is monotone, i.e., $\psi(x) > \psi(y)$ for $x > y$.

Theorem 0.

There exists an explicit continuous monotone expander.

The word explicit in the theorem can be interpreted as follows. The family $\Psi$ can be (uniformly) described by a constant number of bits, and given a rational $x \in [0,1]$ that can be described by $b$ bits, $\psi(x)$ is rational and can be computed in time polynomial in $b$, for all $\psi \in \Psi$. Moreover, $\|\psi - \mathrm{id}\|_\infty, \|\psi' - 1\|_\infty \le c$, for every $\psi \in \Psi$, for a small constant $c > 0$, where $\mathrm{id}$ is the identity map.

∗ Institute for Advanced Study, Princeton NJ. † Technion–IIT, Haifa, Israel.

The proof of the theorem follows the outline described in [B2], which in turn uses ideas from recent works on growth and expansion in matrix groups. Most relevant is the work of Bourgain and Gamburd [BG1] showing expansion in $\mathrm{SU}(2)$. Also related are the work of Bourgain and Gamburd [BG2] proving expansion in $\mathrm{SL}_2(\mathbb{F}_p)$, and the work of Helfgott [H] showing growth in $\mathrm{SL}_2(\mathbb{F}_p)$.

The theorem describes the existence of a continuous monotone expander. By partitioning $[0,1]$ into $n$ equal-length intervals, $\Psi$ naturally defines a discrete bi-partite monotone expander on $2n$ vertices. Namely, a bi-partite graph $G$ with two color classes $L, R$ of size $n$ each so that (i) for every $A \subset L$ of size $|A| \le n/2$, the size of $B = \{b \in R : (a,b) \in E(G) \text{ for some } a \in A\}$ is at least $(1+c)|A|$, with $c > 0$ independent of $n$, and (ii) the edges $E(G)$ can be partitioned into finitely many sets $E_1, \ldots, E_k$, with $k$ independent of $n$, so that in each $E_i$ edges do not “cross” each other (viz., $E_i$ defines a partial monotone map). Since $\Psi$ is explicit, the graph $G$ is explicit as well. (If $\Psi$ were continuous but not monotone, the same reduction would yield a family of discrete bi-partite expanders.)

No other proof of existence of discrete monotone expanders is known, not even using the probabilistic method. A partial explanation for that is the following. Natural probability distributions on partial monotone functions give, w.h.p., functions that are “close” to affine. Klawe, however, showed in [K] that if one tries to construct expanders using affine transformations, then the minimal number of generators required is super-constant (in the number of vertices), and so no construction “that is close to affine” can work. Two more related comments: (i) The construction in this text uses “generators” that are defined as the ratio of two affine transformations. (ii) Dvir and Wigderson [DW] showed that any proof of existence of a family of monotone expanders yields an explicit construction of monotone expanders.

Implicit in the work of Dvir and Shpilka [DS] is that an explicit discrete monotone expander easily yields an explicit dimension expander. Specifically, this means the existence of a constant number of $n \times n$ zero-one matrices $M_1, \ldots, M_k$ so that for every field $\mathbb{F}$ and for every subspace $V$ of $\mathbb{F}^n$ of dimension $D \le n/2$, the dimension of the span of $M_1(V) \cup \ldots \cup M_k(V)$ is at least $(1+c)D$. The work of Lubotzky and Zelmanov [LZ] shows that over the real numbers any explicit (perhaps non-monotone) expander yields an explicit dimension expander.

Here is an outline of the proof. To present the main ideas, we ignore many of the problematic issues.

Defining maps.

Every matrix $g \in \mathrm{SL}_2(\mathbb{R})$ acts on $\mathbb{R}$ in a monotone way via the Möbius action. The maps $\Psi$ will be defined by the actions of a set of matrices $\mathcal{G} \subset \mathrm{SL}_2(\mathbb{R})$. This ensures that the maps in $\Psi$ are monotone. Choose $\mathcal{G}$ as a family of matrices that freely generate a group (with some extra properties, see Lemma 1.1 for the exact statement). To find $\mathcal{G}$, use the strong Tits alternative of Breuillard [Br], which roughly states that in a ball of constant radius in $\mathrm{SL}_2(\mathbb{R})$ there are elements that freely generate a group.

Proving expansion.

As in many expander constructions, the expansion follows by proving that the operator $T$ defined by $\Psi$ has a (restricted) spectral gap. As in recent works, the spectral gap is established as follows. Let $\nu$ be the probability distribution defined by $\Psi$. Then, the $\ell$-fold convolution of $\nu$ with itself, $\nu^{(\ell)}$, is flat, even for $\ell$ relatively small. This statement implies the rapid mixing of the random walk defined by $\nu$, and hence implies expansion. The proof consists of three steps.

(i) Small $\ell$. To show that $\nu^{(\ell)}$ is “somewhat” flat for small $\ell$, use the fact that the group generated by $\mathcal{G}$ is free, and Kesten’s estimates for the behavior of random walks on free groups. Roughly, as $\mathcal{G}$ freely generates a group, the convolution “grows along a tree” and is hence flat. Here we also need to use a “diophantine” property of $\mathcal{G}$, i.e., that elements of $\mathcal{G}$ have constant rational entries.

(ii) Intermediate-size $\ell$. This is the main part of the argument. We prove a product-growth theorem for $\mathrm{SL}_2(\mathbb{R})$: if $S$ is a subset of $\mathrm{SL}_2(\mathbb{R})$ with certain properties, then the size of $S^{(3)} = \{s_1 s_2 s_3 : s_i \in S\}$ is much larger than the size of $S$. (The outline of the proof of the product theorem appears in Section 5.) Such a product theorem implies that $\nu^{(2\ell)}$ is much flatter than $\nu^{(\ell)}$, unless it is already pretty flat.

(iii) Large $\ell$. By steps (i) and (ii), we can conclude that $\nu^{(\ell)}$ is pretty flat, even for $\ell$ relatively small. It remains to show that $\nu^{(C\ell)}$ is very flat, for a suitable constant $C$. Since $\mathrm{SL}_2(\mathbb{R})$ is not compact, the argument that is standard in the compact setting cannot be applied here. Instead, use the subgroup structure of $\mathrm{SL}_2(\mathbb{R})$, or in other words the two-transitivity of the Möbius action. To do so, also use knowledge of the Fourier spectrum of the set $A$. We are able to obtain knowledge on the spectrum of $A$ by adding to $\Psi$ the translate map. The translate map implies that, w.l.o.g., we can assume that the spectrum of $A$ does not have low frequencies.

In essence, the maps defining the monotone expander are induced by the action of $\mathrm{SL}_2(\mathbb{R})$ on $\mathbb{R}$.
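Step (i) of the outline can be illustrated numerically. The following is a minimal sketch, not the paper's argument: it runs the random walk on an abstract free group with $k$ free generators (reduced words stored as tuples) rather than on the matrices of $\mathcal{G}$, and compares the largest point mass of the $t$-fold convolution with the $t$-th power of Kesten's value $\sqrt{2k-1}/k$. All function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def convolve_step(dist, k):
    """One convolution with the uniform measure on k free generators and their inverses.

    Group elements are reduced words, stored as tuples of nonzero ints:
    +i stands for generator i and -i for its inverse.
    """
    letters = [sign * i for i in range(1, k + 1) for sign in (1, -1)]
    new = defaultdict(float)
    for word, p in dist.items():
        for a in letters:
            if word and word[-1] == -a:
                nxt = word[:-1]          # the new letter cancels the last one
            else:
                nxt = word + (a,)        # the walk moves one level down the tree
            new[nxt] += p / (2 * k)
    return dict(new)

def flatness_profile(k, steps):
    """Largest point mass of the t-fold convolution, for t = 1, ..., steps."""
    dist = {(): 1.0}                     # start at the identity (empty word)
    profile = []
    for _ in range(steps):
        dist = convolve_step(dist, k)
        profile.append(max(dist.values()))
    return profile

def kesten_radius(k):
    """Kesten's value sqrt(2k - 1) / k for k free generators."""
    return sqrt(2 * k - 1) / k

profile = flatness_profile(2, 8)
rho = kesten_radius(2)
for t, p in enumerate(profile, start=1):
    print(t, p, rho ** t)
```

The walk operator is self-adjoint with norm equal to Kesten's value, so every point mass after $t$ steps is at most that value to the power $t$; the printed columns show this decay along the tree.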
To find the relevant elements of $\mathrm{SL}_2(\mathbb{R})$, use the following lemma. The proof of the lemma is given in Section 2.

Lemma 1.1.

There is a constant $C > 0$ so that the following holds. For $\varepsilon > 0$ small, there is a positive integer $Q$ and a subset $\mathcal{G}$ of $\mathrm{SL}_2(\mathbb{R})$ so that

1. $(1/\varepsilon)^{1/C} < Q < (1/\varepsilon)^{C}$,
2. $Q < |\mathcal{G}|^{C}$,
3. elements of $\mathcal{G}$ freely generate a group,
4. elements of $\mathcal{G}$ have entries of the form $\mathbb{Z}/Q$, and
5. every $g \in \mathcal{G}$ admits $\|g - 1\|_2 \le \varepsilon$, where $\|g - 1\|_2^2 = (g_{1,1} - 1)^2 + (g_{1,2})^2 + (g_{2,1})^2 + (g_{2,2} - 1)^2$.

The lemma summarizes all the properties $\mathcal{G}$ should satisfy in order to yield a monotone expander. When applying the lemma, $\varepsilon$ is a small universal constant. An important (and useful) property of the lemma is that both $|\mathcal{G}|$ and $Q$ are polynomially comparable to $1/\varepsilon$. Without this property, the lemma immediately follows from the strong Tits alternative [Br]. Property 4 yields the non-commutative diophantine property of $\mathcal{G}$, roughly, that for every $w \ne w'$ that are words of length $k$ in the elements of $\mathcal{G}$, the distance between $w$ and $w'$ is at least $(1/Q)^k$. This property is defined and used in [BG1]. Property 5 is crucial for handling the non-compactness of $\mathrm{SL}_2(\mathbb{R})$.

Consider the Möbius action: Given $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $\mathrm{SL}_2(\mathbb{R})$, denote by $g$ the map defined by

$$g(x) = \frac{ax + b}{cx + d}.$$

For all $g$ in $\mathrm{SL}_2(\mathbb{R})$, the derivative of the map $g$ is

$$g'(x) = \frac{1}{(cx + d)^2}.$$

So $g$ is monotone in any interval not containing $-d/c$.

Construction.

Let $\Psi$ be the family of monotone smooth maps $\psi$ from sub-intervals of $[0,1]$ to $[0,1]$ defined as follows. Let $\varepsilon > 0$ be a small universal constant (to be determined). Let $\mathcal{G}$ be the family of matrices given by Lemma 1.1. Define

$$\Psi_{\mathcal{G}} = \{ g : g \in \mathcal{G} \cup \mathcal{G}^{-1} \}.$$

Here we restrict $g$ to output values in $[0,1]$, i.e., if $\psi \in \Psi_{\mathcal{G}}$ is defined by $g$, then $\psi$ is a map from the interval $g^{-1}([0,1]) \cap [0,1]$ to $[0,1]$. Let $K = K(\varepsilon)$ be a large integer (to be determined). Define the map $\psi_+ : [0, 1 - 1/K] \to [1/K, 1]$ by $\psi_+(x) = x + 1/K$, and the map $\psi_- : [1/K, 1] \to [0, 1 - 1/K]$ by $\psi_-(x) = x - 1/K$. Finally,

$$\Psi = \Psi_{\mathcal{G}} \cup \{\psi_+, \psi_-, \mathrm{id}\},$$

where $\mathrm{id}$ is the identity map.

Theorem 1.2. There is a constant $c > 0$ so that the following holds. Let $A$ be a measurable subset of $[0,1]$ with $|A| \le 1/2$. Then, $|\Psi(A)| \ge (1 + c)|A|$.

Theorem 1.2 implies Theorem 0, and follows from the following “restricted spectral gap” theorem. (To see that $\Psi$ is explicit, add to $\Psi$ all maps from “the large ball” in the proof of Lemma 1.1.)

The Möbius action induces a unitary representation of $\mathrm{SL}_2(\mathbb{R})$ on $L^2(\mathbb{R})$ defined by

$$T_{g^{-1}} f(x) = \sqrt{g'(x)}\, f(g(x)).$$

For a positive integer $K$, denote by $\mathcal{F}_K$ the family of maps $f \in L^2(\mathbb{R})$ with $\mathrm{supp}(f) \subset [0,1]$ and $\|f\|_2 = 1$ so that for all $k \in \{1, 2, \ldots, K\}$,

$$\int_{I(k)} f(x)\, dx = 0, \qquad \text{where } I(k) = [(k-1)/K, k/K].$$

Theorem 1.3.

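Let $\varepsilon > 0$ be a small enough constant. Let $\mathcal{G}$ be the set given by Lemma 1.1. If $K = K(\varepsilon)$ is a large enough positive integer, then for all $f \in \mathcal{F}_K$,

$$\Big\langle \sum_{g} \nu(g)\, T_g f, f \Big\rangle < 1/2, \tag{1.1}$$

with the probability measure

$$\nu = (2|\mathcal{G}|)^{-1} \sum_{g \in \mathcal{G}} \big( \mathbf{1}_g + \mathbf{1}_{g^{-1}} \big),$$

where $\mathbf{1}_g$ is the delta function at $g$.

The “restricted spectral gap” theorem is proved in Section 3.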
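To make the construction concrete, here is a small sketch of the Möbius action, the restriction to $[0,1]$, and the two translation maps, using exact rational arithmetic in the spirit of the "explicit" requirement. The matrix `g` is an illustrative choice with determinant 1 and entries in $\mathbb{Z}/100$; it is an assumption and is not produced by Lemma 1.1. The function `discretize` is likewise a hypothetical midpoint rule for passing to $n$ cells, since the text does not fix one.

```python
from fractions import Fraction as Fr

def mobius(g, x):
    """Mobius action of g = ((a, b), (c, d)) on x: (a*x + b) / (c*x + d)."""
    (a, b), (c, d) = g
    return (a * x + b) / (c * x + d)

def restrict_to_unit_interval(g):
    """Partial map psi on [0, 1]: psi(x) = g(x) when g(x) lies in [0, 1], else None."""
    def psi(x):
        (a, b), (c, d) = g
        if c * x + d == 0:               # pole of the Mobius map
            return None
        y = mobius(g, x)
        return y if 0 <= y <= 1 else None
    return psi

def translations(K):
    """psi_plus(x) = x + 1/K and psi_minus(x) = x - 1/K as partial maps on [0, 1]."""
    step = Fr(1, K)
    plus = lambda x: x + step if 0 <= x <= 1 - step else None
    minus = lambda x: x - step if step <= x <= 1 else None
    return plus, minus

def discretize(psi, n):
    """Hypothetical rule: send cell i to the cell containing psi(midpoint of cell i)."""
    edges = []
    for i in range(n):
        y = psi(Fr(2 * i + 1, 2 * n))
        if y is not None:
            edges.append((i, min(int(y * n), n - 1)))
    return edges

# Illustrative matrix (an assumption): determinant 1, entries of the form Z/100.
g = ((Fr(1), Fr(1, 10)), (Fr(1, 10), Fr(101, 100)))
psi = restrict_to_unit_interval(g)
plus, minus = translations(8)
edges = discretize(psi, 16)
print(mobius(g, Fr(0)), mobius(g, Fr(1)))
print(edges)
```

Because $ad - bc = 1$, one has $g(x) - g(y) = (x - y)/((cx+d)(cy+d))$, which is the difference-quotient form of the derivative formula above and explains why the target cells in `edges` never decrease.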

Proof of Theorem 1.2.

We ﬁrst reduce the general case to the “restricted spectral gap” case.Let σ > k ∈ { , . . . , K − } so that (cid:12)(cid:12) | A ∩ I ( k + 1) | − | A ∩ I ( k ) | (cid:12)(cid:12) ≥ σ | A | , then, using the maps ψ + , ψ − and id , | Ψ( A ) | ≥ (1 + σ ) | A | . It thus remains to consider the case that (cid:12)(cid:12) | A ∩ I ( k + 1) | − | A ∩ I ( k ) | (cid:12)(cid:12) < σ | A | for all k . Thus, forall k , (cid:12)(cid:12) K | A ∩ I ( k ) | − | A | (cid:12)(cid:12) < σK | A | . (1.2)5ssume towards a contradiction that the theorem does not hold.Since k g − k ≤ ε , for all x ∈ [0 , ε ) < g ′ ( x ) < − ε ) . Thus, for every x ∈ [0 , ≤ g ( x ) − x < ε. We need to ensure that even after applying the maps in Ψ G we remain in [0 , A ′ = A ∩ [ k ′ /K, − k ′ /K ]with k ′ the smallest integer so that k ′ ≥ εK . By (1.2),0 . | A | ≤ | A ′ | ≤ | A | , as long as σ, ε are small.Denote f = A ′ − | A ′ | . For all g ∈ G ∪ G − , h T g − f, f i ≥ − ε Z ( A ′ ( g ( x )) − | A ′ | )( A ′ ( x ) − | A ′ | ) dx ≥ . | A ′ | (1 − | A ′ | ) ≥ . k F k , as long as σ, ε, c are small.Project A ′ on F K . Deﬁne F as follows: for all x ∈ [0 , x ∈ I ( k ), then F ( x ) = A ′ ( x ) − K | A ′ ∩ I ( k ) | . Hence, F/ k F k ∈ F K . In addition, for σ small, using (1.2), k f − F k = K − k ′ X k = k ′ Z I ( k ) ( | A ′ | − K | A ′ ∩ I ( k ) | ) dx ≤ σ K | A ′ | ≤ . k F k . Therefore,0 . k F k ≤ *X g ν ( g ) T g ( f − F + F ) , f − F + F + ≤ . k F k + *X g ν ( g ) T g F, F + , which contradicts Theorem 1.3. 6 Finding set of generators

Notation.

For convenience, we use the following notation throughout the text. For a constant $c \in \mathbb{R}$, we denote by $c+$ a constant slightly larger than $c$, and by $c-$ a constant slightly smaller than $c$. Typically, the meaning of “slightly” depends on other parameters that are clear from the context. We also use the following asymptotic notation. Write $a \lesssim b$ if $a \le Cb$ with $C$ a universal constant. Write $a \gtrsim b$ if $b \lesssim a$, and $a \sim b$ if $a \lesssim b \lesssim a$.

For $\delta > 0$, denote by $B_\delta(x)$ the ball of radius $\delta$ around $x$ and by $\Gamma_\delta(A)$ the $\delta$-neighborhood of the set $A$. We consider the $L_2$-metric on $\mathrm{SL}_2(\mathbb{R})$.

Proof of Lemma 1.1.

Breuillard [Br] proved a strong Tits alternative: there is a constant r ∈ Z so that if S is a ﬁnite symmetric subset of SL ( R ), which generates a non-amenable subgroup,then S ( r ) = { s s · · · s r : s i ∈ S } contains two elements, which freely generate a group.Let h = (cid:18) /q (cid:19) and h = (cid:18) /q (cid:19) . Observe h q = (cid:18) (cid:19) and h q = (cid:18) (cid:19) . Hence, h , h generate a non-amenable group. Apply the strong Tits alternative on the set S = { h , h , h − , h − } . There are thus g , g ∈ S ( r ) that freely generate a group.It remains to convert g , g to many elements that are close to identity and freely generate agroup. Let ℓ ∼ log(1 /ε ) so that the following holds. Consider W = (cid:8) w : w = s · · · s ℓ , s = g , s ℓ = g , s i ∈ { g , g , g − , g − } , s i +1 = s − i (cid:9) . Say that a word σ σ · · · σ k in an alphabet Σ ∪ Σ − is h Σ i -reduced if σ i +1 = σ − i for all i ∈{ , . . . , k − } . The size of W is order 3 ℓ and W consists of words of h g , g i -reduced-lengthexactly 2 ℓ . Claim 2.1.

The elements of W freely generate a group.Proof. Let w = w − in W ∪ W − . Write w = ( g a s g b ) and w = ( g a s g b ) with s , s reduced words in h g , g i , and g a , g b , g a , g b in { g , g , g − , g − } . If either w , w ∈ W or w , w ∈ W − , then g a = g − b and so w w = g a s g b g a s g b g a s g b g a s g b h g , g i -reduced form. If either w ∈ W, w ∈ W − or w ∈ W − , w ∈ W , then, since s = s − and the reduced-length of both s , s is ℓ − w w = g a s g b g a s ′ g b g a s g b in h g , g i -reduced form, with s ′ non-trivial.Any non-trivial h W i -reduced word is not the identity of h g , g i : For w = g a szsg b in W ∪ W − ,where z is a product of two elements of { g , g , g − , g − } , call z the center of w . The aboveimplies that if w = w − then the centers of both w , w are not reduced in the h g , g i -reducedform of w w .Hence, if w = w w · · · w k is a non-trivial h W i -reduced word, then even in its h g , g i -reducedform w is not the identity (as all centers are not reduced).Observe that for every w ∈ W , k w k , (cid:13)(cid:13) w − (cid:13)(cid:13) ≤ (1 + 1 /q ) rℓ := N. Cover the ball B N (1) in SL ( R ) with balls of radius ε/N . There exists w ∈ W so that (cid:12)(cid:12) B ε/N ( w ) ∩ W (cid:12)(cid:12) & | W | ( ε/N ) & ε ℓ (1 + 1 /q ) − rℓ . Deﬁne G = (cid:0) w − (cid:0) B ε/N ( w ) ∩ W (cid:1)(cid:1) \ { } . Choose q as a universal constant so that (1 + 1 /q ) r < .

01. Hence, |G| = | W | − & ℓ . In addition, for g ∈ G , k − g k ≤ N k w − w g k ≤ ε, and the entries of g are of the form Z /Q with Q = q rℓ and log Q ∼ log(1 /ε ). Finally, as G isof the form w − W \ { } with W freely generating a group, the elements of G freely generate agroup as well. To prove the “restricted spectral gap” property, we prove the following theorem that roughlystates that after enough iterations ν becomes very ﬂat. Denote by P δ the approximate identity on SL ( R ), namely, the density of the uniform distribution on the ball of radius δ around 1 in SL ( R ), P δ = B δ (1) | B δ (1) | . µ, µ ′ on SL ( R ) denote by µ ∗ µ ′ the convolution of µ and µ ′ . Denote by µ ( ℓ ) the ℓ -fold convolution of µ with itself. Theorem 3.1.

Let $\gamma > 0$. Assume that $\varepsilon > 0$, the parameter from 5 in Lemma 1.1, and $\delta > 0$ are small enough as a function of $\gamma$. If

$$\ell > C\, \frac{\log(1/\delta)}{\log(1/\varepsilon)}$$

with $C = C(\gamma) > 0$, then

$$\big\| \nu^{(\ell)} * P_\delta \big\|_\infty < \delta^{-\gamma}.$$

The proof of the theorem is given in Section 4. (When applying the theorem, $\gamma$ is a universal constant.)

Proof of Theorem 1.3.

Let f ∈ F K . Assume that (1.1) does not hold, i.e., *X g ν ( g ) T g f, f + ≥ / . (3.1)We start by ﬁnding a level set of the Fourier transform that “violates (1.1) as well.” TheLittlewood-Paley decomposition of f is f = ∞ X k =0 ∆ k f, where for every k and for every λ ∈ supp d ∆ k f , | λ | ∼ k . We are interested in the Hecke operator T = X g ν ( g ) T g . As f ∈ F K , we can consider the part of f with high frequencies. Claim 3.2.

For k ≥ , deﬁne f = X k ≥ k ∆ k f. If K is large enough, depending on k , then h T f , f i > / . f , using the following claim. Claim 3.3.

There is k ≥ k so that k T ∆ k f k ≥ c k ∆ k f k with c > a universal constant.Proof. Bound k T f k ≤ X k,k ′ |h T ∆ k f , T ∆ k ′ f i| = X | k − k ′ |≤ C |h T ∆ k f , T ∆ k ′ f i| + X | k − k ′ | >C |h T ∆ k f , T ∆ k ′ f i| with C > X | k − k ′ |≤ C |h T ∆ k f , T ∆ k ′ f i| ≤ X | k − k ′ |≤ C k T ∆ k f k k T ∆ k ′ f k . C X k k T ∆ k f k Secondly, consider k > k ′ + C . Recall that (the absolute value of) the spectrum of ∆ k f is of order2 k . Similarly, the spectrum of ∆ k ′ f is of order 2 k ′ , which, since T g for g ∈ ( G ∪ G − )( G ∪ G − )is a smooth L ∞ -perturbation of identity, implies that the norm of the derivative of T g ∆ k ′ f isat most order 2 k ′ . Hence, |h T ∆ k f , T ∆ k ′ f i| . − k k ∆ k f k k ′ k ∆ k ′ f k . Thus, X k>k ′ + C |h T ∆ k f , T ∆ k ′ f i| . X k>k ′ + C k ′ − k k ∆ k f k k ∆ k ′ f k . − C k f k , and so, for appropriate C , X | k − k ′ | >C |h T ∆ k f , T ∆ k ′ f i| < / . Concluding, using Claim 3.2, X k ≥ k k ∆ k f k . k f k . / − / < k T f k − / . C X k ≥ k k T ∆ k f k . Set F = ∆ k f k ∆ k f k k from Claim 3.3. Thus, h T F, T F i ≥ c and so (cid:13)(cid:13) T F (cid:13)(cid:13) ≥ c . Iterating, for all ℓ > (cid:13)(cid:13)(cid:13) T ℓ F (cid:13)(cid:13)(cid:13) ≥ c ℓ . (3.2)To prove the theorem, argue that the norm of T ℓ F is actually small, thus obtaining the requiredcontradiction: Let γ > ℓ be the smallestpower of two so that ℓ > C ( γ ) k/ log(1 /ε )and by Theorem 3.1, (cid:13)(cid:13)(cid:13) ν ( ℓ ) ∗ P δ (cid:13)(cid:13)(cid:13) ∞ < δ − γ , with ε > δ = 4 − k . As δ is small and the spectrum of F is controlled, the following claim holds. Claim 3.4. (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)Z SL ( R ) ( T g F )(( ν ( ℓ ) ∗ P δ )( g )) dg (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) & c ℓ . Proof. If g = (cid:18) a bc d (cid:19) satisﬁes k g − k ≤ η ≤ /

20, then for all x ∈ R so that | x | ≤ | x − gx | = (cid:12)(cid:12)(cid:12)(cid:12) cx + dx − ax − bcx + d (cid:12)(cid:12)(cid:12)(cid:12) . η. In addition, if h ∈ B δ ( g ) for g ∈ supp ( ν ( ℓ ) ), then (cid:13)(cid:13) h − g − (cid:13)(cid:13) ≤ δ (1 + ε ) ℓ . Recall, 2 k δ (1 + ε ) ℓ is much smaller than c ℓ . Hence, since the norm of the derivative of F is atmost order 2 k , k T g F − T h F k = (cid:13)(cid:13) F − T h − g F (cid:13)(cid:13) . k δ (1 + ε ) ℓ . So, (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) T ℓ F − Z SL ( R ) ( T h F )(( ν ( ℓ ) ∗ P δ )( h )) dh (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) . k (1 + ε ) ℓ δ ≤ c ℓ / . The claim follows by (3.2). 11he claim above contradicts the following proposition, as shown below. In short, the propositionfollows by the ﬂatness lemma and the subgroup structure of SL ( R ). Proposition 3.5.

There exists universal constants σ , C > so that (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)Z SL ( R ) ( T g F )(( ν ( ℓ ) ∗ P δ )( g )) dg (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) . δ − γ (1 + ε ) Cℓ − σ k . Proof.

Bound, using Theorem 3.1 and unitarity of T h , since the support of ν ( ℓ ) ∗ P δ is containedin B ε ) ℓ (1), (cid:13)(cid:13)(cid:13)(cid:13)Z ( T g F )(( ν ( ℓ ) ∗ P δ )( g )) dg (cid:13)(cid:13)(cid:13)(cid:13) = Z Z h T g F, T h F i (( ν ( ℓ ) ∗ P δ )( g ))(( ν ( ℓ ) ∗ P δ )( h )) dgdh . δ − γ (1 + ε ) ℓ Z B ε )2 ℓ (1) |h T g F, F i| dg. (3.3)Approximate B ε ) ℓ (1) by a smooth function: let κ : SL ( R ) → R ≥ be a smooth function sothat k κ k ∞ = 1, and so that κ ( g ) = 1 if k g − k ≤ ε ) ℓ and κ ( g ) = 0 if k g k > ε ) ℓ .Using Cauchy-Schwarz inequality, | (3.3) | . δ − γ (1 + ε ) ℓ (cid:18)Z |h T g F, F i| κ ( g ) dg (cid:19) / . (3.4)Write Z |h T g F, F i| κ ( g ) dg ≤ Z Z | F ( x ) || F ( y ) | (cid:12)(cid:12)(cid:12)(cid:12)Z T g F ( x ) T g F ( y ) κ ( g ) dg (cid:12)(cid:12)(cid:12)(cid:12) dxdy. Separate to two cases, according to the distance between x and y . Choose η > SL ( R ): g = (cid:18) a bc d (cid:19) = (cid:18) u cos θ v cos φu sin θ v sin φ (cid:19) with uv sin( φ − θ ) = 1 . On the chart a = 0, we have dg = dadbdc | a | = dudθdφ | u | sin ( θ − φ ) . Case one.

The ﬁrst case is when x, y are close: Bound

Z Z | x − y | <η | F ( x ) || F ( y ) | Z | T g F ( x ) || T g F ( y ) | κ ( g ) dgdxdy. (3.5)12rite F = F + F ∞ with k F k ≤ − σk and k F ∞ k ∞ ≤ σk for a universal constant σ > F , F ∞ replacing F ). Consider, e.g.,substituting F instead of the leftmost F in (3.5), Z Z | x − y | <η | F ( x ) || F ( y ) | Z | T g F ( x ) || T g F ( y ) | κ ( g ) dgdxdy ≤ Z | F ( x ) | Z | T g F ( x ) | κ ( g ) dgdx. (3.6)Fix x , and denote M = ( x + 1) − / (cid:18) − x (cid:19) ∈ SL ( R ) , so that M ( x ) = 0. (The matrix M shows two-transitivity of the M¨obius action: M maps x tozero and − x, − Z | T g F ( x ) | κ ( g ) dg = Z | T M − g − F ( x ) | κ ( M − g − ) dg (3.7) . Z Z Z | F (cot φ ) | κ ( M − g − ) 1 | sin φ || sin( θ − φ ) | dudθdφ. If κ ( M − g − ) = 0, then k g k . (1 + ε ) ℓ , and so in the integral above | sin( θ − φ ) | & (1 + ε ) − ℓ .Change variables again, | (3.7) | . (1 + ε ) ℓ Z Z Z | F ( ξ ) | κ ( M − g − ) 1 | ξ + 1 | / dudθdξ . (1 + ε ) ℓ . Hence, | (3.6) | . (1 + ε ) ℓ k F k ≤ (1 + ε ) ℓ − σk . The same bound holds also if we replace each of the other three F ’s by F in (3.5). It thusremains to trivially bound Z Z | x − y | <η | F ∞ ( x ) || F ∞ ( y ) | Z | T g F ∞ ( x ) || T g F ∞ ( y ) | κ ( g ) dgdxdy . η (1 + ε ) ℓ σk , and conclude | (3.5) | . (1 + ε ) ℓ (cid:16) η σk + 2 − σk (cid:17) . (3.8)13 ase two. Next, understand what happens for far x and y . The argument in this case is moreelaborate and uses knowledge of the spectrum of F . Start by Z Z | x − y |≥ η | F ( x ) || F ( y ) | (cid:12)(cid:12)(cid:12)(cid:12)Z T g F ( x ) T g F ( y ) κ ( g ) dg (cid:12)(cid:12)(cid:12)(cid:12) dxdy (3.9) ≤ Z Z | x − y |≥ η (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)Z SL ( R ) T g F ( x ) T g F ( y ) κ ( g ) dg (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) dxdy  / . In this case, argue for ﬁxed x and y in [0 ,

1] so that x ≥ y + η . Denote M = ( x − y ) − / (cid:18) − x − y (cid:19) ∈ SL ( R ) , so that M ( x ) = 0 and M ( y ) = ∞ . Change variables, (cid:12)(cid:12)(cid:12)(cid:12)Z T g F ( x ) T g F ( y ) κ ( g ) dg (cid:12)(cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)Z T M − g − F ( x ) T M − g − F ( y ) κ ( M − g − ) dg (cid:12)(cid:12)(cid:12)(cid:12) = ( x − y ) − (cid:12)(cid:12)(cid:12)(cid:12)Z F (cot φ ) F (cot θ ) | sin φ · sin θ | κ ( M − g − ) dudθdφ | u || sin( θ − φ ) | (cid:12)(cid:12)(cid:12)(cid:12) . Change variables, Z F (cot φ ) F (cot θ ) | sin φ · sin θ | κ ( M − g − ) dudθdφ | u || sin( θ − φ ) | = Z Z F ( ξ ) F ( ζ ) E ( ξ, ζ ) dξdζ, with E ( ξ, ζ ) = p (1 + ξ )(1 + ζ ) | sin(cot − ζ − cot − ξ ) | Z κ ( M − g − ) du | u | . Continue by using that Fourier basis diagonalize ∇ . Start by bounding the norms of E and ∇ E .First, if κ ( M − g − ) = 0, then k g k . (1 + ε ) ℓ η − / . Hence, in the deﬁnition of E we can assume(1 + ε ) − ℓ η / . | u | . (1 + ε ) ℓ η − / , and 1 | sin(cot − ζ − cot − ξ ) | & (1 + ε ) − ℓ η. Therefore, there is a universal constant

C > k E k ∞ , kk∇ E k k ∞ . (1 + ε ) Cℓ η − C . F is of absolute value at least order 2 k , bound (cid:12)(cid:12)(cid:12)(cid:12)Z Z F ( z ) F ( w ) E ( z, w ) dzdw (cid:12)(cid:12)(cid:12)(cid:12) . − k (1 + ε ) Cℓ η − C . Thus, | (3.9) | ≤ − k (1 + ε ) Cℓ η − C − . (3.10) Concluding.

By (3.8) and (3.10), p | (3.3) | . δ − γ (1 + ε ) Cℓ (cid:16) η σk + 2 − σk + 2 − k η − C (cid:17) / ≤ δ − γ (1 + ε ) Cℓ − σk/ for appropriate choice of η , and with σ > c ℓ . | (3.4) | . δ − γ (1 + ε ) Cℓ − σ k , (3.11)which is a contradiction for γ = σ / k large and ε small. Theorem 3.1 follows from the following ﬂattening lemma, which roughly states that if µ = ν ( ℓ ) ∗ P δ is a little ﬂat then µ ∗ µ is much ﬂatter (unless µ is already very ﬂat). The proof of the lemmais given in Section 4.1. Lemma 4.1.

Let $0 < \gamma < 3/4$. With the notation above, assume that

$$\delta^{-\gamma} < \|\mu\|_2 < \delta^{-3/2+\gamma}$$

and

$$\ell > C\, \frac{\log(1/\delta)}{\log(1/\varepsilon)}$$

with $C = C(\gamma) > 0$. Also assume that $\varepsilon > 0$, the parameter from 5 in Lemma 1.1, and $\delta > 0$ are small enough as a function of $\gamma$. Then, there exists $\sigma = \sigma(\gamma) > 0$ so that

$$\|\mu * \mu\|_2 < \delta^{\sigma} \|\mu\|_2.$$
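As a rough guide to how a lemma of this shape is used, the following back-of-the-envelope computation counts how many self-convolutions are needed. It is a simplified sketch under two assumptions that are not part of the statement: the starting bound is written as $\|\mu_0\|_2 \le \delta^{-a}$ for a hypothetical exponent $a > \gamma$, and the lemma is taken to apply at every step as long as the norm stays above $\delta^{-\gamma}$ (the passage between $\mu * \mu$ and $\nu^{(2\ell)} * P_\delta$ is ignored).

```latex
\[
  \mu_m = \mu_{m-1} * \mu_{m-1}, \qquad
  \|\mu_m\|_2 < \delta^{\sigma}\,\|\mu_{m-1}\|_2 < \dots < \delta^{m\sigma}\,\|\mu_0\|_2 \le \delta^{m\sigma - a},
\]
\[
  \text{so that } \|\mu_m\|_2 \le \delta^{-\gamma}
  \quad \text{as soon as} \quad m \ge \frac{a - \gamma}{\sigma}.
\]
```

Since $\sigma = \sigma(\gamma)$, the number of steps $m$ depends only on $\gamma$ and $a$, so the number of convolutions grows by a factor $2^m$ that is independent of $\delta$.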

We apply the flattening lemma iteratively. To start iterating, we need to show that $\mu$ is “a little flat” to begin with.

Proposition 4.2. If $\ell \ge \log_Q(1/\delta)$ with $Q$ from Lemma 1.1, then

$$\|\mu\|_2 \le \delta^{-3/2+\gamma}$$

with $\gamma > 0$ a universal constant.

This follows from Kesten’s bound, the following proposition about random walks on free groups.

Proposition 4.3.

Assume $H$ is a finite set freely generating a group. Denote

$$\pi = (2|H|)^{-1} \sum_{h \in H} \big( \mathbf{1}_h + \mathbf{1}_{h^{-1}} \big).$$

Denote by $p^{(t)}(x,x)$ the probability of being at $x$ after $t$ steps in a random walk according to $\pi$ started at $x$. Then,

$$\limsup_{t \to \infty} \big( p^{(t)}(x,x) \big)^{1/t} = \frac{\sqrt{2k-1}}{k},$$

where $k = |H|$.

Denote by $W_k(\mathcal{G})$ the set of words of length at most $k$ in $\mathcal{G} \cup \mathcal{G}^{-1}$.

Proof of Proposition 4.2.

Let k be the maximal integer so that1 /Q k ≥ δ / . For every y ∈ supp ( ν ( k ) ), k y k ≤ (1 + ε ) k ≤ δ ε , for ε small. By Lemma 1.1, the entries of elements in W k ( G ) are in Z /Q k . So, for all y = y ′ in W k ( G ), (cid:13)(cid:13) y − y ′ (cid:13)(cid:13) ≥ δ / , which implies ( yB δ (1)) ∩ ( y ′ B δ (1)) = ∅ , for ε small. Hence, (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)X y ν ( k ) ( y ) P δ ( y − · ) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) ≤ X y ( ν ( k ) ( y )) (cid:13)(cid:13) P δ ( y − · ) (cid:13)(cid:13) ! / ≤ (cid:13)(cid:13)(cid:13) ν ( k ) (cid:13)(cid:13)(cid:13) / ∞ k P δ k . Finally, by Propositions 4.3 and Lemma 1.1, since convolution does not increase norms, k µ k . (cid:18) |G| − |G| (cid:19) k/ δ − / < δ − / γ . roof of Theorem 3.1. By Proposition 4.2, and Lemmas 4.1 and 1.1, (cid:13)(cid:13)(cid:13) µ ( k ) (cid:13)(cid:13)(cid:13) = (cid:13)(cid:13)(cid:13) ( ν ( ℓ ) ∗ P δ ) ( k ) (cid:13)(cid:13)(cid:13) ≤ δ − γ/ (4.1)with k = k ( γ ) > ℓ ≤ C log(1 /δ )log(1 /ε ) , with C > g , (cid:12)(cid:12)(cid:12) µ (2 k ) ( g ) (cid:12)(cid:12)(cid:12) = (cid:12)(cid:12)(cid:12)(cid:12)Z h µ ( k ) ( h ) µ ( k ) ( h − g ) dh (cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:13)(cid:13)(cid:13) µ ( k ) (cid:13)(cid:13)(cid:13) ≤ δ − γ/ . Lemma 2.5 in [BG1] states cP δ ≤ P δ ∗ P δ ≤ c P δ with c > (cid:13)(cid:13)(cid:13) ν ( ℓ ) ∗ P δ (cid:13)(cid:13)(cid:13) ∞ ≤ C (1 + ε ) C ℓ (cid:13)(cid:13)(cid:13) µ (2 k ) (cid:13)(cid:13)(cid:13) ∞ ≤ C (1 + ε ) C ℓ δ − γ/ ≤ δ − γ with C = C ( γ ) > ℓ ≤ C ℓ , for ε, δ small. The ﬂattening lemma follows from the following product theorem. (The proof of the producttheorem is deferred to Section 5.) We need to use metric entropy : for a subset S of a metricspace denote by N δ ( S ) the least number of balls of radius δ needed to cover S . Theorem 4.4.

For all σ , τ > , there is ε > so that the following holds. Let δ > be smallenough. Let A ⊂ SL ( R ) ∩ B α (1) , α > a small universal constant, be so that1. A = A − ,2. N δ ( A ) = δ − σ ,σ ≤ σ ≤ − σ ,3. for every δ < ρ < δ ε , there is a ﬁnite set X ⊂ A so that | X | ≥ ρ − τ and for every x = x ′ in X we have k x − x ′ k ≥ ρ , and4. w.r.t. every complex basis change diagonalizing some matrix in SL ( R ) ∩ B (1) , there is g ∈ A (4) so that | g , g , | ≥ δ ε . hen, N δ ( AAA ) > δ − ε N δ ( A ) . The condition that A is contained in a small ball is not necessary, but simpliﬁes the statementand the proof. The condition A = A − is, of course, not necessary as well, but simpliﬁesnotation. Condition 4 above implies that A is far from strict subgroups. Proof of Lemma 4.1.

We prove the lemma for ℓ ∼ C ( γ ) log(1 /δ )log(1 /ε ) . The proof for larger ℓ follows, as convolution does not increase the norm.Assume towards a contradiction that k µ ∗ µ k > δ σ k µ k . To prove the theorem, we shall ﬁnd a set A that violates the product theorem. The set A willbe one of the level sets of µ in the following decomposition. Decompose µ as µ ∼ X j j χ j , where the sum is over O (log(1 /δ )) values of j (recall that µ is point-wise bounded by O (1 /δ )and we can ignore points with too small µ -measure), and where χ j is the characteristic functionof a set A j ⊂ SL ( R ) so that A j = A − j . (4.2)Choose j < j so that2 j + j k χ j ∗ χ j k & k µ ∗ µ k / log (1 /δ ) ≥ δ k µ k . (4.3)Using Young’s inequality, bound2 j + j k χ j k k χ j k ≥ δ k µ k ≥ δ j k χ j k . So, since 2 j | A j | ≤ j / | A j | / ≥ j − j / | A j | / ≥ j | A j | / | A j | / ≥ δ . (4.4)Similarly, 2 j / − j / ≥ j / | A j | / ≥ δ , j < j ≤ δ − j . Since 2 j | A j | ≤

1, using Young’s inequality and (4.2), we thus have δ − j | A j | ≤ h χ j ∗ χ j , χ j ∗ χ j i ≤ k χ j k k χ j ∗ χ j ∗ χ j k ≤ k χ j k k χ j k k χ j ∗ χ j k ≤ − j / k χ j ∗ χ j k . Hence, k χ j ∗ χ j k ≥ δ − j | A j | ≥ δ − j | A j | ≥ δ | A j | . (4.5)Use a version of Balog-Szemeredi-Gowers theorem proved in [T]. Denote K = B r (1) with r = δ − C ( γ ) ε = δ − , a compact subset of SL ( R ), with C ( γ ) ∼ C ( γ ) to be determined. Speciﬁcally, if ε is smallenough, then A j ⊂ K . The multiplicative energy of A j is k χ j ∗ χ j k . Equation (4.5) implies that A j has high energy.Theorem 5.4 (or, more precisely, its proof) in [T] implies that, for the appropriate C ( γ ), thereexists H ⊂ K which is an approximate group, namely, H = H − and there exists a ﬁnite set Y ⊂ K of size | Y | ≤ δ − (4.6)satisfying HH ⊂ Y H (4.7)so that δ | A j | ≤ | H | ≤ δ − | A j | . (4.8)In addition, there is y ∈ K such that | A | ≥ δ | A j | , (4.9)where A = A j ∩ yH. A = (cid:0) ( A − A ) ∪ ( A A − ) (cid:1) ∩ B α (1) , for α > | A | ≥ δ | A | ≥ δ | A j | . (4.10)We now prove that A violates the product theorem. We ﬁrst show that it violates the conclusionof the product theorem and then show that it satisﬁes the assumptions of the product theorem.Using (4.3) and Young’s inequality,2 j + j | A j | / | A j | = 2 j + j k χ j k k χ j k ≥ δ k µ k ≥ δ j | A j | / . Hence, using (4.9), µ ( yH ) ≥ µ ( A ) ≥ δ j | A j | ≥ δ . (4.11)On the other hand, µ ( yH ) . δ − max z ∈ supp ( ν ( ℓ ) | yH ∩ B δ − ( z ) | . So, there is z ∈ K so that | H ∩ S | ≥ δ , with S = B δ − ( z ) . Let Z be a maximal set of points in H so that for all z = z ′ in Z , zS ∩ z ′ S = ∅ . Bound, δ − | H | ≥ | HH | ≥ | Z | | H ∩ S | ≥ δ N δ ( H ) . Hence, N δ ( H ) ≤ δ − − | H | . (4.12)Finally, N δ ( AAA ) . N δ ( H (6) ) ≤ δ − − | H | ≤ δ − − | A | ≤ δ − N δ ( A ) . So, indeed, the conclusion of the product theorem does not hold. It remains to prove that A satisﬁes the assumptions of the product theorem.First, A = A − . 
A is not too small or too large. Equation (4.3) implies δ k µ k ≤ j + j k χ j ∗ χ j k ≤ j k χ j k j k χ j k ≤ j | A j | / , which implies δ − γ + ≤ j | A j | / . k µ k ≤ δ − / γ . Thus, δ − γ + | A j | ≤ (2 j | A j | ) ≤ δ ≤ (2 j | A j | ) . δ − γ | A j | . Therefore, δ − γ + ≤ | A j | ≤ δ γ − , which implies, using (4.8), δ − γ + ≤ | H | ≤ δ γ − . Therefore, using (4.10) and (4.6), (4.7), (4.12), δ − γ + ≤ δ − | A j | ≤ δ − | A | ≤ N δ ( A ) ≤ δ − − | H | ≤ δ − γ − , or N δ ( A ) = δ − σ , with σ < σ < − σ and σ = 2 γ − .Thirdly, we prove that A is well-distributed: Let ε = ε ( σ , τ ) > τ > δ < ρ < δ ε . We prove that there is aﬁnite set X ⊂ A so that | X | ≥ ρ − τ and for every x = x ′ in X we have k x − x ′ k ≥ ρ . Equation(4.11) says µ ( A ) ≥ δ . Write ν ( ℓ ) = ν ( ℓ ) ∗ ν ( ℓ − ℓ ) , for ℓ < ℓ the largest integer so that Q − ℓ > ρ. There thus exists z ∈ K so that ν ( ℓ ) ( A z ) ≥ δ . By Lemma 1.1, for every x = x ′ in supp ( ν ( ℓ ) ) ⊆ W ℓ ( G ), (cid:13)(cid:13) x − x ′ (cid:13)(cid:13) ≥ Q − ℓ > ρ. By Proposition 4.3, ν ( ℓ ) ( A z ) ≤ |W ℓ ( G ) ∩ A z | (cid:18) |G| − |G| (cid:19) ℓ/ . Thus, using Lemma 1.1 again, N ρ ( A ) ≥ δ N ρ ( A z ) ≥ δ |W ℓ ( G ) ∩ A z | ≥ δ (cid:18) |G| |G| − (cid:19) ℓ/ ≥ ρ − τ , τ ∼ A contains matrices with certain properties. That is, w.r.t. every basisin a bounded domain, there is g ∈ A (4) so that | g , g , | ≥ δ ε . Fix a basis diagonalizing somematrix in SL ( R ) ∩ B (1). Choose ℓ large, to be determined. By Proposition 8 from [BG2],since the elements of G freely generate a group, if S ⊂ W ℓ ( G ) is so that for all g , g , g , g ∈ S ,the bi-commutator [[ g , g ] , [ g , g ]] is 1, then | S | ≤ ℓ . As above, there is z ∈ K so that |W ℓ ( G ) ∩ A z | ≥ δ (cid:18) |G| |G| − (cid:19) ℓ / . The set A z is contained in a ball of radius r ′ = δ − around 1. Cover the ball of radius r ′ around 1 by balls of radius β = α/ ( r ′ + 1) ≥ δ . 
There thus exists z ∈ W ℓ ( G ) ∩ A z so that |W ℓ ( G ) ∩ A z ∩ B β ( z ) | ≥ δ (cid:18) |G| |G| − (cid:19) ℓ / > ℓ (the last inequality is the ﬁrst property ℓ should satisfy). Hence, there are g , g , g , g ∈ ( W ℓ ( G ) ∩ A z ∩ B β ( z )) z − ⊂ A A − with non-trivial bi-commutator. For every g ′ ∈ { g , g , g , g } , (cid:13)(cid:13) g ′ − (cid:13)(cid:13) ≤ (cid:13)(cid:13) g ′ z − z (cid:13)(cid:13) ( r ′ + 1) ≤ β ( r ′ + 1) = α, which implies g ′ ∈ A. If g ′ ∈ { g , g , g , g } is so that | ( g ′ ) , ( g ′ ) , | 6 = 0, then | ( g ′ ) , ( g ′ ) , | ≥ Q − ℓ ≥ δ ε (this is the second property ℓ should satisfy). In this case, we are done. Otherwise, recall that iffour 2 × g is lower triangular and g is upper triangular, which implies that g g has the required property. In this section we prove the product theorem, Theorem 4.4. The proof consists of several partsgiven in the following sub-sections. (The outline of the proof follows [BG1], but the proof inour case is more elaborate.) The theorem is ﬁnally proved in Section 5.5. We start this sectionwith a brief outline of the proof of the product theorem. We note that not only ﬁeld properties22re used but also metric properties, the argument is a multi-scale one. Here are the steps of theproof (ignoring many technicalities).We wish to prove that a set A with certain properties becomes larger when multiplied by itself.(i) Assume toward a contradiction that A (3) is not larger than A .(ii) Assuming (i), ﬁnd a set V of commuting matrices which is not too small and is close to A (2) .To do so, use a version of the Balog-Szemeredi-Gowers theorem.(iii) If V is concentrated in a small ball, then AV will “move V around” and hence AV will bemuch bigger than A . This is a contradiction, as AV is close to A (3) .(iv) Otherwise, V is not concentrated on any ball, which means that it is well-distributed. 
In this case, use the discretized ring conjecture, which roughly states that a well-distributed set in $\mathbb{R}$ becomes larger under sums and products. To move from $SL_2(\mathbb{R})$ to $\mathbb{R}$, use the matrix trace, which translates matrix products to sums and products in the field.

In fact, the size of $V$ obtained is roughly $|A|^{1/3}$. To get back to the “correct” order of magnitude, we use that $A$ is far from strict subgroups in that it contains a matrix $g$ so that $g_{1,2}\, g_{2,1}$ is far from zero (w.r.t. any basis change). In rough terms, this property of $A$ is used to show that the size of $VgVgV$ is $|V|^3 \sim |A|$.

In this sub-section we show that, under some non-degeneracy conditions, a set of matrices induces a not-too-small set of commuting matrices. To prove this, we also show that a set of matrices induces a not-too-small trace-set. We start by stating the results. The proofs follow.

The trace of a matrix $g$ is $\mathrm{Tr}\, g = g_{1,1} + g_{2,2}$. Every $g$ in $SL_2(\mathbb{C})$ with $|\mathrm{Tr}\, g| \neq 2$ can be diagonalized. (Elements $g$ in $SL_2(\mathbb{R})$ with $|\mathrm{Tr}\, g| < 2$ are diagonalizable only over $\mathbb{C}$, that is, in $SL_2(\mathbb{C})$.) Define Diag to be the set of diagonal matrices $v$ in $SL_2(\mathbb{C})$ so that $\mathrm{Tr}\, v \in \mathbb{R}$.

The following lemma shows that, at least in one “direction,” the trace-set of a set is not too small.

Lemma 5.1.

Think of SL ( R ) as a subset of R , and let g , g , g , g ∈ SL ( R ) ∩ B / (1) be sothat | det( g , g , g , g ) | ≥ δ , (5.1) and let A ⊂ SL ( R ) ∩ B / (1) . Then, there is I ⊂ { , , , } of size | I | = 3 so that Y i ∈ I N δ ( Tr g − i A ) ≥ δ N δ ( A ) . Lemma 5.2.

Let A ⊂ SL ( C ) ∩ B α (1) , α > a small constant, be so that dist ( A, ± ≥ δ .Then, there exists a set V ⊂ SL ( C ) of commuting matrices so that N δ ( V ) ≥ δ N δ ( Tr A ) N δ ( A ) N δ ( A A − ) , and every v ∈ V satisﬁes dist ( v, A − A ) ≤ δ − . We shall also need the following corollary of the two lemmas.

Corollary 5.3.

Let A ⊂ SL ( R ) ∩ B α (1) , α > a small constant. Let g , g , g ∈ SL ( R ) ∩ B α (1) be so that | det(1 , g , g , g ) | ≥ δ . Then, there is a set of commuting matrices V ⊂ SL ( C ) sothat there is g ∈ { , g , g , g } so that N δ ( V ) ≥ δ N δ ( A ) / N δ ( Ag − AA − ) , and every v ∈ V satisﬁes dist ( v, A − A ) ≤ δ − .Proof of Lemma 5.1. For i ∈ { , , , } , denote g ′ i = (cid:18) d i − c i − b i a i (cid:19) , where g i = (cid:18) a i b i c i d i (cid:19) . By (5.1), | det( g ′ , g ′ , g ′ , g ′ ) | = | det( g , g , g , g ) | ≥ δ . Hence, let A ′ ⊂ A be contained in a ball of radius δ so that N δ ( A ) ≤ δ − N δ ( A ′ ) , and so that there is a set I ⊂ { , , , } of size | I | = 3 so that N δ ( A ′ ) ≤ δ − N δ ( P A ′ ) , where P is the projection to the sub-space span { g ′ i : i ∈ I } . (The map g P g restricted to asmall ball is a diﬀeomorphism with bounded distortion.) For every g = (cid:18) a bc d (cid:19) in SL ( R ), Tr g − i g = d i a − b i c − c i b + a i d = (cid:10) g, g ′ i (cid:11) , R . Thus, N δ ( P A ′ ) ≤ δ − Y i ∈ I N δ ( Tr g − i A ′ ) ≤ δ − Y i ∈ I N δ ( Tr g − i A ) . Proof of Lemma 5.2.

Choose T ⊂ Tr A so that | T | ∼ N δ ( Tr A ) , (5.2)and so that for all t = t ′ in T , | t − t ′ | , | t − | , | t + 2 | > δ. (If N δ ( Tr A ) is small, the lemma trivially holds.) Since trace is continuous, X t ∈ T N δ (cid:0)(cid:8) g ∈ A A − : | Tr g − t | < δ/ (cid:9)(cid:1) . N δ ( A A − ) . There thus exists t ∈ T so that the set A = { g ∈ A A − : | Tr g − t | < δ/ } satisﬁes N δ ( A ) . N δ ( A A − ) | T | . Choose g ∈ A so that Tr g = t .Choose A ⊂ A so that | A | = N δ ( A )and A ⊂ [ g ∈ A B δ ( g ) . (5.3)For g ∈ A , deﬁne (with a slight abuse of notation) A g = { x ∈ A : xg x − ∈ B δ ( g ) } . Since for every x we have Tr xg x − = Tr g = t , for every x ∈ A we have xg x − ∈ A . Equation(5.3) thus implies A = [ g ∈ A A g . g ∈ A so that N δ ( A g ) ≥ N δ ( A ) | A | = N δ ( A ) N δ ( A ) & N δ ( A ) N δ ( A A − ) | T | . (5.4)Fix x ∈ A g . By deﬁnition, for every x ∈ A g , (cid:13)(cid:13) xg x − − x g x − (cid:13)(cid:13) ≤ δ. Since A is bounded, k yg − g y k . δ, where y = x − x ∈ x − A g . Since g ∈ A is far from ±

1, conclude that diagonalizing g makes x − A close to diagonal: Since | Tr g | 6 = 2, there exists a matrix u so that v = ug u − is diagonal. By assumption on A , dist ( v , ± ∼ dist ( g , ± ≥ δ . So, | ( v ) , − ( v ) , | ≥ δ . In addition, (cid:13)(cid:13) uyu − v − v uyu − (cid:13)(cid:13) . δ. Hence, | ( uyu − ) , | , | ( uyu − ) , | . δ − . Since | det( uyu − ) | = 1, there is thus a diagonal v ∈ SL ( C ) so that (cid:13)(cid:13) uyu − − v (cid:13)(cid:13) . δ − . We can thus conclude that x − A g ⊂ A − A is in a ( δ − )-neighborhood of a set V ⊂ SL ( C ) ofcommuting matrices. In particular, N δ ( V ) ≥ δ N δ ( A g ) . Equations (5.4) and (5.2) imply the claimed lower bound on N δ ( V ). Proof of Corollary 5.3.

Since | det(1 , g , g , g ) | ≥ δ , the pairwise distances between ± ± g , ± g , ± g are at least δ . Thus, there exists a subset A ′ of A so that N δ ( A ′ ) ≥ δ N δ ( A )26nd dist ( A ′ , {± , ± g , ± g , ± g } ) ≥ δ . By Lemma 5.1, there exists g ∈ { , g , g , g } so that N δ ( Tr g − A ′ ) ≥ δ N δ ( A ′ ) / . Now, apply Lemma 5.2 on the set g − A ′ to complete the proof. The following lemma is the main result of this section. The lemma roughly tells us that if a set V of commuting matrices is well-distributed then adding a non-commuting element to V makesits trace-set grow under products. Lemma 5.4.

For every < σ < and < κ < , there is ε > so that the following holds.Let V ⊂ SL ( C ) ∩ B α (1) , α > a small constant, be so that V = V − , so that dist ( v, Diag ) ≤ δ − for all v in V , so that N δ ( V ) = δ − σ , and so that for all δ < ρ < δ ε , max a N δ ( V ∩ B ρ ( a )) < ρ κ δ − σ . (5.5) Let g = (cid:18) a bc d (cid:19) ∈ SL ( C ) ∩ B α (1) be so that Tr g ∈ R and | bc | ≥ δ ε . Then, N δ ( Tr W gW g ) ≥ δ − σ − ε , where W = V (8) . The starting point here is the discretized ring conjecture. This conjecture was ﬁrst prove in [B1]and later strengthened in [BG1], see Proposition 3.2 in [BG1].
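As a purely illustrative, finite analogue of the sum-product phenomenon behind the conjecture (it is not the discretized statement itself, which concerns unions of δ-intervals), the following sketch compares the sizes of A + A and A · A for an arithmetic and a geometric progression. The sets and the helper names `sumset` and `productset` are assumptions made for this example only.

```python
def sumset(a):
    """All pairwise sums x + y with x, y in a."""
    return {x + y for x in a for y in a}


def productset(a):
    """All pairwise products x * y with x, y in a."""
    return {x * y for x in a for y in a}


n = 20
arithmetic = set(range(1, n + 1))        # 1, 2, ..., n
geometric = {2 ** k for k in range(n)}   # 1, 2, 4, ..., 2^(n-1)

# Each progression is small under one operation and large under the other.
for name, a in (("arithmetic", arithmetic), ("geometric", geometric)):
    print(name, len(a), len(sumset(a)), len(productset(a)))
```

The arithmetic progression has a sumset of size 2n − 1 but a much larger product set, while the geometric progression behaves the other way around. Neither set is small under both operations, which is the behaviour the conjecture quantifies at scale δ.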

Lemma 5.5.

For all < σ, κ < , there is ε > so that for all δ > small, the followingholds. Let A ⊂ [ − , be a union of δ -intervals so that | A | = δ − σ and for all δ < ρ < δ ε , max a | A ∩ B ρ ( a ) | < ρ κ | A | . Then, | A + A | + | AA | > δ − σ − ε . Proposition 5.6.

For all < σ, κ < , there is ε > so that the following holds. Let S ⊂ C be a subset of the complex unit circle, so that S is a union of δ -arcs, δ > small enough, so that S = S − , so that | S | = δ − σ (size is measured in the unit circle), and so that for all δ < ρ < δ ε , max a | S ∩ B ρ ( a ) | < ρ κ | S | . (5.6) If γ, λ ∈ R are so that γ > , | λ | ≥ δ ε , then the set D = { xy + γ/ ( xy ) + λ ( x/y + y/x ) : x, y ∈ S (4) } satisﬁes N δ ( D ) ≥ δ − ε − σ . We also need and prove the following variant of scalar ampliﬁcation.

Proposition 5.7.

For all < σ, κ < , there is ε > so that the following holds. Let S ⊂ [1 / , be a union of δ -intervals, δ > small enough, so that S = S − , so that | S | = δ − σ , and so that for all δ < ρ < δ ε , max a | S ∩ B ρ ( a ) | < ρ κ | S | . (5.7) If γ, λ ∈ R are so that γ > , | λ | ≥ δ ε , then the set D = { xy + γ/ ( xy ) + λ ( x/y + y/x ) : x, y ∈ S (4) } satisﬁes N δ ( D ) ≥ δ − ε − σ . Lemma 5.4 follows from scalar ampliﬁcation.
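To see why sets of the form D are the relevant ones, recall that the proof of Lemma 5.4 rests on the trace identity (5.9). The sketch below checks that identity in exact rational arithmetic for one sample matrix g = (a b; c d) of determinant 1, with γ = (d/a)² and λ = bc/a². The sample values and the helper names (`diag`, `trace_side`, `d_side`) are assumptions chosen for illustration.

```python
from fractions import Fraction as F


def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def diag(x):
    """The diagonal matrix with entries x and 1/x."""
    return [[x, 0], [0, 1 / x]]


def trace_side(g, x, y):
    """Tr( diag(x) g diag(y) g )."""
    m = matmul(matmul(diag(x), g), matmul(diag(y), g))
    return m[0][0] + m[1][1]


def d_side(g, x, y):
    """a^2 * ( xy + gamma/(xy) + lambda * (x/y + y/x) )."""
    (a, b), (c, d) = g
    gamma = (d / a) ** 2
    lam = b * c / a ** 2
    return a ** 2 * (x * y + gamma / (x * y) + lam * (x / y + y / x))


g = [[F(2), F(3)], [F(1), F(2)]]  # det = 2*2 - 3*1 = 1
samples = [(F(3, 2), F(5, 7)), (F(2), F(2)), (F(-4, 3), F(1, 9))]
checks = [trace_side(g, x, y) == d_side(g, x, y) for x, y in samples]
print(checks)  # [True, True, True]
```

The identity itself only needs a ≠ 0; the conditions det g = 1 and Tr g ∈ ℝ are what make γ and λ real in the setting of the lemma.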

Proof of Lemma 5.4.

Let V ⊂ Diag be so that dist ( v, V ) ≤ δ = δ − for all v in V and dist ( v , V ) ≤ δ for all v in V . Speciﬁcally, for all δ < ρ < δ ε ,max a N δ ( V ∩ B ρ ( a )) ≤ δ − max a N δ ( V ∩ B ρ ( a )) ≤ δ − ρ κ δ − σ . (5.8)28bserve Tr (cid:18) x /x (cid:19) g (cid:18) y /y (cid:19) g = a xy + d / ( xy ) + bc ( x/y + y/x ) . (5.9)Write V = (cid:26)(cid:18) x /x (cid:19) : x ∈ T (cid:27) . The set T is contained in the real numbers union the complex unit circle. Denote by T = T ∩ R ,and T = T \ T . First, assume N δ ( T ) ∼ N δ ( V ) . (5.10)Deﬁne S to be a δ -neighborhood of T . Thus, | S | = δ − σ with σ ≥ σ/

2. Equation (5.8) implies that S satisﬁes (5.7) with κ = κ/

2. As in Proposi-tions 5.7, denote D = a { xy + γ/ ( xy ) + λ ( x/y + y/x ) : x, y ∈ ( S ) (4) } . with γ = ( d/a ) and λ = bc/a . Observe, ad − bc = 1 and a + d ∈ R imply d/a ∈ R and bc/a ∈ R . In addition, | λ | ≥ δ . The proposition thus implies N δ ( D ) ≥ δ − ε − | S | ≥ δ − ε − σ + . Using (5.9), conclude N δ ( Tr W gW g ) ≥ δ − σ − ε + . When (5.10) does not hold, consider T and use Proposition 5.6 instead of Proposition 5.7. Proof of Proposition 5.7.

Assume towards a contradiction that the proposition does not hold.W.l.o.g., for every s in S , dist ( s, { γ / , } ) ≥ δ . (5.11)We ﬁrst ﬁnd a set A so that A + A is not much larger than A . If s, s ′ ∈ S , then x = s ′ /s ∈ S (2) and y = ss ′ ∈ S (2) satisfy xy = s ′ and y/x = s . By assumption, we can thus conclude (cid:12)(cid:12)(cid:12)n ( s ′ + γ/s ′ ) + λ ( s + 1 /s ) : s ′ , s ∈ S (2) o(cid:12)(cid:12)(cid:12) . δ − ε | S | . Denote A = { λ ( s + 1 /s ) : s ∈ S (2) } A ′ = { s ′ + γ/s ′ : s ′ ∈ S (2) } . Since | λ | ≥ δ , | A | ≥ δ | S | . By (5.11), the derivative of the map s ′ s ′ + γ/s ′ is bounded away from zero in the relevantrange. Thus, | A ′ | ≥ δ | S | . Ruzsa’s inequality in measure version for open sets

A, A ′ ⊂ R states | A + A | ≤ | A + A ′ | / | A ′ | (see, e.g., Lemma 3.2 in [T]). Therefore, | A + A | ≤ δ − | S | . (5.12)We now ﬁnd a set that does not signiﬁcantly increase its size under sums and products. Deﬁne A = { s + 1 /s : s ∈ S } . By (5.11), | A | ≥ δ | S | . Hence, by (5.12), since | λ | ≥ δ , | A + A | ≤ δ − | A + A | ≤ δ − | A | . Observe ( s + 1 /s )( s + 1 /s ) = (( s s ) + 1 / ( s s ) ) + (( s /s ) + 1 / ( s /s ) ) . Hence, using (5.12), since | λ | ≥ δ , | A A | ≤ δ − | A + A | ≤ δ − | A | . So, | A + A | + | A A | ≤ δ − | A | . If ε > < σ ′ < | A | = δ − σ ′ . Choose κ ′ = κ/

2. Set ε = ε ( σ ′ , κ ′ ) > ε > δ < ρ < δ ε ,max a | A ∩ B ρ ( a ) | ≤ δ − max a | S ∩ B ρ ( a ) | < δ − ρ κ | S | ≤ δ − ρ κ | A | ≤ ρ κ ′ | A | . This contradicts Lemma 5.5. 30 .3 Expansion using a non-commuting element

We shall use the following variant of a lemma from [BG1]; see also [H]. Roughly, the lemma states that adding a non-commuting element to a commuting set of matrices makes it grow under products.
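A small exact computation illustrates this growth in a toy, finite setting (distinct matrices are counted instead of metric entropy at scale δ). We take three diagonal matrices diag(z, 1/z) and compare the number of distinct triple products with and without a fixed non-diagonal g in between. The matrix g, the set of values z, and all helper names are assumptions made for this example.

```python
from fractions import Fraction as F
from itertools import product


def matmul(p, q):
    """Product of two 2x2 matrices given as tuples of tuples (hashable)."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def diag(z):
    """The diagonal matrix with entries z and 1/z."""
    return ((z, F(0)), (F(0), 1 / z))


def chain(*ms):
    """Left-to-right product of the given matrices."""
    out = ms[0]
    for m in ms[1:]:
        out = matmul(out, m)
    return out


g = ((F(2), F(1)), (F(1), F(1)))  # det = 1, off-diagonal product bc = 1
zs = [F(2), F(3), F(5)]

# Products v1 v2 v3 of commuting (diagonal) matrices only.
commuting = {chain(diag(z1), diag(z2), diag(z3))
             for z1, z2, z3 in product(zs, repeat=3)}
# Products v1 g v2 g v3 with the non-commuting element g inserted.
twisted = {chain(diag(z1), g, diag(z2), g, diag(z3))
           for z1, z2, z3 in product(zs, repeat=3)}

print(len(zs), len(commuting), len(twisted))  # 3 10 27
```

Here the commuting products collapse to 10 distinct matrices, whereas every one of the 3³ = 27 triples gives a different matrix once g is inserted, matching the heuristic that VgVgV has size about |V|³.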

Lemma 5.8.

Let V ⊂ SL ( C ) ∩ B α (1) , α a small constant, be so that dist ( v, Diag ) ≤ δ − for all v in V . Let g = (cid:18) a bc d (cid:19) ∈ Diag ∩ B α (1) be so that | bc | ≥ δ . Then, N δ ( V gV gV ) ≥ δ N δ ( V ) . Proof.

Assume N δ ( V ) > δ − (5.13)(otherwise, the lemma trivially holds). There are several cases to consider. Denote by

Diag R the set of matrices in Diag with entries in R . Consider the case that thereis a subset of Diag R with comparable metric entropy to that of V : Assume that there is Z ⊂ R so that | Z | ≥ δ N δ ( V ), so that for all z ∈ Z , dist (cid:18)(cid:18) z /z (cid:19) , V (cid:19) ≤ δ − , and so that for all z = z ′ in Z , | z − z ′ | > δ. W.l.o.g., assume that z ≥ p d/a (the proof in the other case is similar). Furthermore, by (5.13),we can assume w.l.o.g. that z − p d/a, | z − | ≥ δ . For z = ( z , z , z ) in Z , denote M z = (cid:18) z /z (cid:19) g (cid:18) z /z (cid:19) g (cid:18) z /z (cid:19) . To prove the lemma, we will show that for all z = z ′ in Z , k M z − M z ′ k ≥ δ . Observe M z = (cid:18) z z ( a z + bc/z ) ( z /z ) b ( az + d/z )( z /z ) c ( az + d/z ) (1 /z z )( bcz + d /z ) (cid:19) . The ﬁrst case is when z > z ′ . We have two sub-cases to consider. The ﬁrst sub-case is | z /z − z ′ /z ′ | ≥ δ . Bound (cid:12)(cid:12) ( M z ) , / ( M z ) , − ( M z ′ ) , / ( M z ′ ) , (cid:12)(cid:12) = | b/c | · (cid:12)(cid:12) ( z /z ) − ( z ′ /z ′ ) (cid:12)(cid:12) ≥ δ . Thus, δ ≤ (cid:12)(cid:12) ( M z ) , ( M z ′ ) , − ( M z ′ ) , ( M z ) , (cid:12)(cid:12) = (cid:12)(cid:12)(cid:0) ( M z ) , − ( M z ′ ) , (cid:1) ( M z ′ ) , + ( M z ′ ) , (cid:0) ( M z ′ ) , − ( M z ) , (cid:1)(cid:12)(cid:12) . So, k M z − M z ′ k ≥ δ . The second sub-case is | z /z − z ′ /z ′ | < δ . Bound | ( M z ) , − ( M z ′ ) , | = | ba | (cid:12)(cid:12) ( z /z )( z + ( d/a ) /z ) − ( z ′ /z ′ )( z ′ + ( d/a ) /z ′ ) (cid:12)(cid:12) & | ba | (cid:12)(cid:12) z + ( d/a ) /z − z ′ + ( d/a ) /z ′ (cid:12)(cid:12) − δ . The map z z + ( d/a ) /z has derivative at least δ for z ≥ p d/a + δ . So, | ( M z ) , − ( M z ′ ) , | ≥ δ . The second case is z = z ′ and ( z , z ) = ( z ′ , z ′ ). Assume w.l.o.g. z = z ′ (the argumentin the other case is similar). 
Since the entries of g (cid:18) z /z (cid:19) g are bounded away from 0 and V is close to 1, k M z − M z ′ k ≥ δ (cid:13)(cid:13) ( z z − z ′ z ′ , z z ′ − z ′ z ) (cid:13)(cid:13) . Since k ( z , z ′ ) k & (cid:12)(cid:12)(cid:12)(cid:12) det (cid:18) z − z ′ − z ′ z (cid:19)(cid:12)(cid:12)(cid:12)(cid:12) & δ , (cid:13)(cid:13) ( z z − z ′ z ′ , z z ′ − z ′ z ) (cid:13)(cid:13) & δ. Otherwise, there is a subset of

Diag \ Diag R with comparable metric entropy to that of V :There is a subset of the complex unit circle Z so that | Z | ≥ δ N δ ( V ), so that for all z ∈ Z , dist (cid:18)(cid:18) z /z (cid:19) , V (cid:19) ≤ δ − , and so that for all z = z ′ in Z , | z − z ′ | > δ. dist ( Z, ≥ δ . Also assume w.l.o.g. that every element of Z has positiveimaginary part (the other case is similar). When z = z ′ , bound (cid:12)(cid:12) | ( M z ) , | − | ( M z ′ ) , | (cid:12)(cid:12) = | ba | (cid:12)(cid:12) | z + ( d/a ) /z | − | z ′ + ( d/a ) /z ′ | (cid:12)(cid:12) . If we denote, z = e iθ and z ′ = e iθ ′ , then (cid:12)(cid:12) | z + ( d/a ) /z | − | z ′ + ( d/a ) /z ′ | (cid:12)(cid:12) = 2( d/a ) (cid:12)(cid:12) cos(2 θ ) − cos(2 θ ′ ) (cid:12)(cid:12) ≥ δ | z − z ′ | > δ . Hence, k M z − M z ′ k ≥ δ . When z = z ′ , the argument is similar to the one in case 1.2. above. Roughly, we now show that two non-commuting matrices induce four “independent directions.”
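As an informal sanity check, one can view 2 × 2 matrices as vectors in four-dimensional space and evaluate det(1, g₁, g₂, g₁g₂) for a diagonal g₁ = diag(λ, 1/λ). A direct expansion gives |det| = |λ − 1/λ|² · |(g₂)₁,₂ (g₂)₂,₁|, so the four vectors are independent exactly when g₁ ≠ ±1 and g₂ has non-zero off-diagonal product. The sketch below confirms this for one sample in exact arithmetic; the sample matrices and helper names are assumptions made for illustration.

```python
from fractions import Fraction as F


def det(m):
    """Determinant of a square matrix by Laplace expansion along row 0."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def flatten(g):
    """View a 2x2 matrix as a vector of its four entries."""
    return [g[0][0], g[0][1], g[1][0], g[1][1]]


lam = F(3, 2)
one = [[F(1), F(0)], [F(0), F(1)]]
g1 = [[lam, F(0)], [F(0), 1 / lam]]   # diagonal, not equal to +-1
g2 = [[F(2), F(3)], [F(1), F(2)]]     # det = 1, off-diagonal product 3

# Rows are the four matrices written as vectors; det is transpose-invariant.
rows = [flatten(m) for m in (one, g1, g2, matmul(g1, g2))]
lhs = abs(det(rows))
rhs = (lam - 1 / lam) ** 2 * abs(g2[0][1] * g2[1][0])
print(lhs, rhs)  # 25/12 25/12
```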

Claim 5.9.

Let g ∈ SL ( C ) ∩ B (1) be so that dist ( g , ± ≥ δ and Tr g = 2 . Let g ∈ SL ( C ) be so that w.r.t. the basis that makes g diagonal | ( g ) , ( g ) , | ≥ δ . Then, | det(1 , g , g , g g ) | ≥ δ . Proof.

Choose a basis so that g is diagonal (this is a linear transformation on the g i ’s withbounded away from zero determinant). Denote λ = ( g ) , . In the new basis, | det(1 , g , g , g g ) | = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) λ ( g ) , ( g g ) , ( g ) , ( g g ) , ( g ) , ( g g ) , /λ ( g ) , ( g g ) , (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = | ( λ − /λ )(( g g ) , ( g ) , − ( g g ) , ( g ) , ) | . By choice, | λ − /λ | ≥ δ . and | ( g ) , ( g ) , | ≥ δ . Hence, | (( g g ) , ( g ) , − ( g g ) , ( g ) , ) | = | ( λ − /λ )( g ) , ( g ) , | ≥ δ . .5 Proof of product theorem Proof of Theorem 4.4.

Assume towards a contradiction that N δ ( AAA ) ≤ δ − N δ ( A ) . By [T], for every ﬁnite k , N δ ( A ( k ) ) ≤ δ − N δ ( A ) (5.14)as well.The ﬁrst step is to ﬁnd a large, commuting set of matrices. By assumption on A and usingClaim 5.9, choose g , g , g in A (8) with | det(1 , g , g , g ) | ≥ δ . Equation (5.14) and Corol-lary 5.3 imply that there is a set of commuting matrices V ⊂ SL ( C ) so that N δ ( V ) ≥ δ N δ ( A ) / = δ − σ / (5.15)and so that V ⊂ Γ δ − ( A (2) ) . Assume (by perhaps allowing V ⊂ Γ δ − ( A (4) )) that V = V − and V ⊂ B δ ε (1) . (5.16)Proceed according to two cases.The ﬁrst case is when V is well-spread, i.e., the conditions for using the discretized ring conjectureare held. Deﬁne σ = 1 − σ / − and κ = τ / N δ ( V ) = δ − σ . Assume that for all δ < ρ < δ ε with ε = ε ( σ, κ ) from Lemma 5.4,max a N δ ( V ∩ B ρ ( a )) < ρ κ δ − σ . By assumption on A , there is g ∈ A (4) so that (w.r.t. the basis that makes V diagonal) thedistance between g and 1 is at most a small constant, and | ( g ) , ( g ) , | ≥ δ ε . Even after thebasis change Tr g ∈ R . Thus, Lemma 5.4 implies N δ ( Tr W ) ≥ δ − σ − ε , where W = W g W g W and W = V (8) . C > dist ( g , ± & δ ε . Thus, using (5.16), dist ( W , ± & δ ε . We can hence apply Lemma 5.2 with W to obtain a set W ⊂ Γ δ − ( W − W )of commuting matrices so that N δ ( W ) ≥ δ N δ ( Tr W ) N δ ( W ) N δ ( W W − ) ≥ δ δ − σ − ε N δ ( V g V g V ) N δ ( W W − ) . By (5.14) and Lemma 5.8, we thus have N δ ( W ) ≥ δ δ − σ − ε N δ ( V ) N δ ( A ) . So, by (5.15), N δ ( W ) ≥ δ − σ − ε / . Again, we can ﬁnd g ∈ A (4) so that (w.r.t. the basis that makes W diagonal) dist ( g ,

1) is atmost a small constant, Tr g ∈ R , and | ( g ) , ( g ) , | ≥ δ . So, we can apply Lemma 5.8 againand get N δ ( A ) ≥ δ N δ ( W g W g W ) ≥ δ N δ ( W ) ≥ δ − σ − ε / = δ − σ − ε / = δ − ε / N δ ( A ) . This contradicts (5.14), and the proof is complete in this case.The proof in the second case, when V is not well-spread, is simpler. Indeed, we have N δ ( V ) ≥ ρ κ δ − σ with V = V ∩ B ρ ( a )(reusing notation). So, by Lemma 5.8, N δ ( V ) ≥ δ N δ ( V ) ≥ ρ κ δ − σ + , where V = V g V g V ⊂ Γ δ − ( A ( C ) )with g from above. By assumption on A , there is a ﬁnite X ⊂ A so that | X | ≥ ρ − τ x = x ′ in X , (cid:13)(cid:13) x − x ′ (cid:13)(cid:13) ≥ Cρ.

Denote V = [ x ∈ X xV . Therefore, N δ ( V ) ≥ | X |N δ ( V ) ≥ ρ − τ ρ κ δ − σ + ≥ ρ − τ/ δ − σ + ≥ δ − σ − ε τ/ = δ − N δ ( A ) . Since V ⊂ Γ δ − ( A ( C ) ), we obtained a contradiction to (5.14), and the proof is complete. References [B1] J. Bourgain. On the Erdos-Volkmann and Katz-Tao ring conjectures. Geom. Funct. Anal.13, pages 334–365, 2003.[B2] J. Bourgain. Expanders and dimensional expansion. Comptes Rendus Mathematique 347(7), pages 357–362, 2009.[BG1] J. Bourgain and A. Gamburd. On the spectral gap for ﬁnitely-generated subgroups ofSU(2). Inventiones Mathematicae 171 (1), pages 83–121, 2007.[BG2] J. Bourgain and A. Gamburd. Uniform expansion bounds for Cayley graphs of SL (F p ).Annals of Mathematics 167 (2), pages 625–642, 2008.[Br] E. Breuillard. A strong Tits alternative. Preprint, 2008.[DS] Z. Dvir and A. Shpilka. Towards dimension expanders over ﬁnite ﬁelds. In Proceedings ofthe IEEE 23rd Annual Conference on Computational Complexity, pages 304–310, 2008.[DW] Z. Dvir and A. Wigderson. Monotone expanders: constructions and applications. Theoryof Computing, 6 (1), pages 291–308, 2010.[H] H. A. Helfgott. Growth and generation in SL2